Discussion (10 Comments), originally on HackerNews
I've been working in this area recently and am very keen to use this given the claimed performance benefits, but I tried all your links and didn't see any actual performance numbers. Do you have any to share?
IMO a fair performance benchmark for those not tied to the full PyTorch stack would have both ffmpeg and the WAV already loaded into memory before execution. Given that torchcodec relies on a user-supplied ffmpeg installation, I suspect ffmpeg may not already be in memory, at least not by default.
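To make the "already in memory" setup concrete, here is a minimal timing sketch. It assumes the standard-library `wave` module as a stand-in decoder; the helper names (`make_wav_bytes`, `decode`, `time_decode`) are hypothetical, and nothing here measures TorchCodec or ffmpeg. The only point is that the WAV bytes are built before the timed region and a warm-up call runs first, so neither disk I/O nor one-time setup is counted.

```python
import io
import time
import wave


def make_wav_bytes(seconds: float = 1.0, sample_rate: int = 16_000) -> bytes:
    """Build a silent mono 16-bit PCM WAV entirely in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def decode(wav_bytes: bytes) -> bytes:
    """Stand-in decoder (assumption): replace with the decoder under test."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as r:
        return r.readframes(r.getnframes())


def time_decode(wav_bytes: bytes, repeats: int = 100) -> float:
    """Return the best per-call wall time in seconds, excluding file I/O."""
    decode(wav_bytes)  # warm-up so one-time setup is not measured
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decode(wav_bytes)
        best = min(best, time.perf_counter() - start)
    return best
```

Swapping the body of `decode` for another decoder keeps the comparison on equal footing, since every candidate receives the same in-memory bytes.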
I understand why Meta wouldn't want to do this (then you are inevitably distributing exploitable security vulnerabilities in PyTorch, because ffmpeg will probably always have them), but I've been statically linking ffmpeg and keeping the binary in memory while still using separate processes for different batches of audio, with I/O through UDS (Unix domain sockets) between the parent and ffmpeg; the parent then does VAD on the PCM on CPU before any further inference. My implementation for static linking is similar to the pattern in https://github.com/amenzhinsky/go-memexec#static-binary. It would be interesting to see whether this is possible in the PyTorch/Python ecosystem, or maybe it's already been done.
Note that TorchCodec relies on FFmpeg libraries, not the FFmpeg binary itself. The new WavDecoder is faster because it bypasses the FFmpeg libraries code, not because it bypasses loading the FFmpeg binary in memory.
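As a rough illustration of why a dedicated WAV path can skip a general-purpose decoding library: an uncompressed PCM WAV file is a short RIFF header followed by the raw samples, so reading it is mostly a matter of locating the `data` chunk. The sketch below is not TorchCodec's WavDecoder, whose internals are not shown in this thread; `parse_pcm_wav` and `build_wav` are hypothetical helpers, and only plain PCM (format tag 1) is handled.

```python
import io
import struct
import wave


def parse_pcm_wav(data: bytes) -> tuple[int, int, int, bytes]:
    """Return (channels, sample_rate, bits_per_sample, raw_pcm) for a PCM WAV."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    fmt = None
    pos = 12  # first chunk starts right after the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"fmt ":
            # format tag, channels, sample rate, byte rate, block align, bits
            tag, channels, rate, _, _, bits = struct.unpack_from("<HHIIHH", data, body)
            if tag != 1:
                raise ValueError("only uncompressed PCM is handled in this sketch")
            fmt = (channels, rate, bits)
        elif chunk_id == b"data":
            if fmt is None:
                raise ValueError("data chunk appeared before fmt chunk")
            return (*fmt, data[body:body + size])
        pos = body + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no data chunk found")


def build_wav(frames: bytes, channels: int, sample_rate: int) -> bytes:
    """Write 16-bit PCM frames into an in-memory WAV (used only for the demo)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(frames)
    return buf.getvalue()
```

For example, `parse_pcm_wav(build_wav(frames, channels=2, sample_rate=8000))` hands back the same `frames` bytes along with the format fields, with no codec involved.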
Regarding static linking: we stick to dynamic linking to honor the LGPL license of the FFmpeg libraries. TorchCodec is BSD-licensed, and statically linking against the LGPL FFmpeg libs would not be compliant. Some libraries dynamically link against FFmpeg while still bundling the FFmpeg libraries as .so files in the Python wheel. Whether that's still compliant is honestly unclear to me, so we prefer leaving it up to the user to supply their own FFmpeg via pure dynamic linking.
Thanks!
2. Ease of going back-and-forth between CPU and GPU; in our experience, there are still a lot of scenarios where CPU decoding makes sense.
3. Audio decoding support.
Please take a look at our tutorials to get a feel for what TorchCodec can do: https://meta-pytorch.org/torchcodec/stable/generated_example...
Until recently, each TorchCodec release worked with exactly one version of PyTorch: PyTorch did not have a stable ABI, so we needed to pin TorchCodec releases to PyTorch releases. But! PyTorch now has an excellent Stable ABI (https://github.com/meta-pytorch/torchcodec#compatibility-wit..., https://www.youtube.com/watch?v=HNdEmnvMvGE&t=1s) and TorchCodec has been taking advantage of that since version 0.12.