14 changes: 14 additions & 0 deletions README.md
@@ -211,6 +211,20 @@ format you want. Refer to Nvidia's GPU support matrix for more details
conda install -c conda-forge "torchcodec=*=*cuda*"
```

### Specify FFmpeg Path Manually
If torchcodec cannot detect the FFmpeg installation correctly, you can set the `TORCHCODEC_FFMPEG_DIR` environment variable to the directory containing the FFmpeg shared libraries.

For example, in conda environments this is typically:

```bash
# Conda on Linux/macOS
export TORCHCODEC_FFMPEG_DIR="$CONDA_PREFIX/lib"
```
```powershell
# Conda on Windows
$env:TORCHCODEC_FFMPEG_DIR = "$env:CONDA_PREFIX\Library\bin"
```

## Benchmark Results

The following was generated by running [our benchmark script](./benchmarks/decoders/generate_readme_data.py) on a lightly loaded 22-core machine with an Nvidia A100 with
23 changes: 21 additions & 2 deletions src/torchcodec/_core/ops.py
@@ -5,6 +5,8 @@
 # LICENSE file in the root directory of this source tree.


+import ctypes
+import glob
 import io
 import json
 import os
@@ -22,13 +24,30 @@


 expose_ffmpeg_dlls = nullcontext
-if sys.platform == "win32" and hasattr(os, "add_dll_directory"):
+if ffmpeg_dir := os.getenv("TORCHCODEC_FFMPEG_DIR"):
+    if hasattr(os, "add_dll_directory"):
+
+        def expose_ffmpeg_dlls():  # type: ignore[no-redef] # noqa: F811
+            return os.add_dll_directory(str(ffmpeg_dir))
+
+    else:
+        ffmpeg_library_glob = "*.dylib" if sys.platform == "darwin" else "*.so*"
+        ffmpeg_library_paths = glob.glob(os.path.join(ffmpeg_dir, ffmpeg_library_glob))
+        if not ffmpeg_library_paths:
+            raise RuntimeError(
+                "TORCHCODEC_FFMPEG_DIR is set, but no FFmpeg shared libraries "
+                f"were found in {repr(ffmpeg_dir)}."

Contributor

repr seems overkill

+            )
+        for ffmpeg_library_path in ffmpeg_library_paths:
+            ctypes.CDLL(ffmpeg_library_path)
Comment on lines +41 to +42

@NicolasHug NicolasHug Apr 13, 2026

Contributor

This is not ideal: this loads every single .so file that is present in TORCHCODEC_FFMPEG_DIR.

If, for example, the user specifies TORCHCODEC_FFMPEG_DIR=$CONDA_PREFIX/lib/ then we're loading hundreds and hundreds of .so files that we'll never use.

We shouldn't load the .so files ourselves; we should continue to leave it up to load_torchcodec_shared_libraries() to load all the transitive .so files it needs. What we want is to tell load_torchcodec_shared_libraries() where to find some extra .so files. That is, I think we just want to update LD_LIBRARY_PATH on Linux and its equivalent on macOS.

Something along these lines:

    custom_path = os.environ.get('TORCHCODEC_FFMPEG_LIBRARIES')
    if custom_path:
        # Prepend so it takes priority, just like LD_LIBRARY_PATH
        ld_path = os.environ.get('LD_LIBRARY_PATH', '')
        os.environ['LD_LIBRARY_PATH'] = f"{custom_path}:{ld_path}" if ld_path else custom_path
    # now call load_torchcodec_shared_libraries() as usual
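
One caveat, offered as an assumption about the platform loader rather than something established in this thread: on Linux with glibc, the dynamic loader generally reads LD_LIBRARY_PATH when the process starts, so assigning to os.environ inside an already running interpreter may not change where later library loads look. A hedged sketch of the alternative is to build the prepended value and hand it to a fresh process. `prepend_path` and `child_env` are hypothetical helper names, not torchcodec APIs, and `/opt/ffmpeg/lib` is a placeholder directory.

```python
import os
import subprocess
import sys


def prepend_path(custom_path, existing, sep=os.pathsep):
    """Return custom_path placed in front of an existing search-path string."""
    return f"{custom_path}{sep}{existing}" if existing else custom_path


def child_env(custom_path, var="LD_LIBRARY_PATH"):
    """Copy the current environment with custom_path prepended to var."""
    env = dict(os.environ)
    env[var] = prepend_path(custom_path, env.get(var, ""))
    return env


if __name__ == "__main__":
    # Placeholder directory; the child process sees the variable from startup.
    env = child_env("/opt/ffmpeg/lib")
    subprocess.run(
        [sys.executable, "-c", "import os; print(os.environ['LD_LIBRARY_PATH'])"],
        env=env,
        check=True,
    )
```

Setting the variable in the shell before launching Python has the same effect as the subprocess call here.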

@bcw222 bcw222 Apr 15, 2026

Author

That seems reasonable on Linux. However, as far as I know, on macOS, DYLD_LIBRARY_PATH is ignored when using the system Python because of System Integrity Protection (SIP). Should we simply document this limitation and recommend that users use an external Python installation, or should we keep the logic to load the shared libraries manually with ctypes?

@Dan-Flores Dan-Flores Apr 15, 2026

Contributor

Thanks for pointing this out @bcw222! Unfortunately, this behavior means that on Linux and macOS we would need to know the names of all the required .so files to avoid loading many unnecessary ones (as @NicolasHug points out), and we would have to load them before load_torchcodec_shared_libraries() runs.

Both requirements introduce complexity and potential for unexpected side effects. Based on our discussion, what we actually want is for TORCHCODEC_FFMPEG_DIR to act as an alias for LD_LIBRARY_PATH.

So, reconsidering this approach, the FFmpeg libs should be detected correctly once LD_LIBRARY_PATH / DYLD_LIBRARY_PATH / PATH is set.

We can update this PR to document this in README.md:

# Conda on Linux
export LD_LIBRARY_PATH="$CONDA_PREFIX/lib:$LD_LIBRARY_PATH"

# Conda on macOS
export DYLD_LIBRARY_PATH="$CONDA_PREFIX/lib:$DYLD_LIBRARY_PATH"

# Conda on Windows PowerShell
$env:PATH = "$env:CONDA_PREFIX\Library\bin;" + "$env:PATH"
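
The per-platform variables listed above can be summarized in a small lookup. This is an illustrative sketch, not part of torchcodec: `library_search_variable` is a hypothetical name, and the mapping simply restates the variables named in this thread, treating any platform other than Windows or macOS as Linux-like.

```python
import sys


def library_search_variable(platform=None):
    """Return (variable name, separator) for the platform's library search path."""
    platform = sys.platform if platform is None else platform
    if platform == "win32":
        return "PATH", ";"
    if platform == "darwin":
        return "DYLD_LIBRARY_PATH", ":"
    # Everything else is treated as Linux-like; an assumption for this sketch.
    return "LD_LIBRARY_PATH", ":"


name, sep = library_search_variable()
print(f"Prepend the FFmpeg library directory to {name}, separated by {sep!r}")
```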

@bcw222 bcw222 Apr 16, 2026

Author

Wait, are you sure? On Linux it should be LD_LIBRARY_PATH while on Windows no environment variable equivalent exists. That's why we need TORCHCODEC_FFMPEG_DIR on Windows.

@Dan-Flores Dan-Flores Apr 16, 2026

Contributor

My apologies, those examples were very incorrect because I sent my message while still drafting. I'll fix my previous comment.

My understanding is that the PATH variable is the Windows equivalent to make DLLs discoverable, and the syntax to use it would be very similar to LD_LIBRARY_PATH or DYLD_LIBRARY_PATH.

Contributor

> On Linux it should be LD_LIBRARY_PATH while on Windows no environment variable equivalent exists. That's why we need TORCHCODEC_FFMPEG_DIR on Windows.

@bcw222 our understanding is that on Windows, the LD_LIBRARY_PATH equivalent is actually the PATH env var. Is that not the case?

Author

Indeed, but modifying PATH globally would clutter the environment. It essentially gives up the advantages of shims such as the ones Chocolatey provides, for example in Tab auto-completion (although Windows is already quite messy in this regard). Manually passing it every time, on the other hand, is cumbersome. Especially considering the differences across platforms, I think a dedicated environment variable (TORCHCODEC_FFMPEG_DIR) would be a better way to configure this.
However, given that this is essentially just an alias for existing environment variables (PATH/LD_LIBRARY_PATH/DYLD_LIBRARY_PATH), perhaps documenting these considerations for users would be sufficient.
So which solution do you all think is more appropriate?

@Dan-Flores Dan-Flores Apr 20, 2026

Contributor

Given that the OS differences require us to:

  1. Know the names of all .so files we want to load on Linux/macOS (we could grep on libav*, libsw*, but that's not guaranteed to always work).
  2. Load the libraries using ctypes on Linux/macOS, which diverges from the Windows logic and introduces library-loading logic of our own, adding complexity and potential side effects.
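
To make the first point concrete, here is a minimal sketch of what name-based matching might look like. It is not code from this PR: `find_ffmpeg_like_libraries` is a hypothetical helper, and the pattern tuple is an assumption taken from the libav*/libsw* prefixes mentioned above, which, as noted, is not guaranteed to cover every library FFmpeg needs.

```python
import fnmatch
import os

# Assumed name prefixes only; matching on them may miss required libraries.
FFMPEG_PATTERNS = ("libav*", "libsw*")


def find_ffmpeg_like_libraries(directory):
    """List file names in directory that match the assumed FFmpeg patterns."""
    return sorted(
        name
        for name in os.listdir(directory)
        if any(fnmatch.fnmatch(name, pattern) for pattern in FFMPEG_PATTERNS)
    )
```

Any library outside these prefixes would be skipped silently, which is the fragility the list above describes.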

Let's stick to the standard environment variables, and add a section documenting how to use them! It can be similar to my edited comment above and list the various PATH variables.

> manually passing it every time is cumbersome

Depending on your setup, could you call os.add_dll_directory() before import torchcodec?
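
A minimal sketch of that suggestion, under a few assumptions: `add_ffmpeg_dll_directory` is a hypothetical wrapper rather than a torchcodec API, and the `Library/bin` location is the Conda-on-Windows layout used in this PR's README change. `os.add_dll_directory()` only exists on Windows, so the wrapper does nothing elsewhere.

```python
import os
import sys


def add_ffmpeg_dll_directory(ffmpeg_dir):
    """Register ffmpeg_dir for DLL lookup on Windows; do nothing elsewhere.

    Returns the handle from os.add_dll_directory(), or None when unavailable.
    """
    if sys.platform == "win32" and hasattr(os, "add_dll_directory"):
        return os.add_dll_directory(ffmpeg_dir)
    return None


# Assumed Conda layout; adjust to wherever the FFmpeg DLLs actually live.
conda_prefix = os.environ.get("CONDA_PREFIX")
if conda_prefix:
    handle = add_ffmpeg_dll_directory(os.path.join(conda_prefix, "Library", "bin"))
    # import torchcodec  # import only after the directory has been registered
```

Keeping this in user code leaves torchcodec's own loading logic untouched, which matches the direction the thread settles on.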

Author

You're right. So we have two solutions: with the original logic untouched, on Windows we could either modify PATH or directly call os.add_dll_directory() before import torchcodec.


+elif sys.platform == "win32" and hasattr(os, "add_dll_directory"):
     # On windows we try to locate the FFmpeg DLLs and temporarily add them to
     # the DLL search path. This seems to be needed on some users machine, but
     # not on our CI. We don't know why.
     if ffmpeg_path := shutil.which("ffmpeg"):

-        def expose_ffmpeg_dlls():  # noqa: F811
+        def expose_ffmpeg_dlls():  # type: ignore[no-redef] # noqa: F811
             ffmpeg_dir = Path(ffmpeg_path).parent.absolute()
             return os.add_dll_directory(str(ffmpeg_dir))  # that's the actual CM
