feat: video decoder in the frontend #4719

milesial · 2025-12-03T07:36:01Z

Video decoding section of this diagram:

507753170-3ea77800-5a60-4d3a-8ec4-dc8603ff90fe

Gated under a feature flag, default off as discussed with @grahamking , since it relies on ffmpeg libraries being installed on the system. We enable the flag in framework containers and CI/CD where we have ffmpeg.

Rust dependency chain:

dynamo-llm → video-rs → ffmpeg-next → ffmpeg-sys-next
ffmpeg-sys-next builds a FFI with bindgen and needs ffmpeg headers and libs to be present

This PR adds video decoder configuration that can be passed via python bindings and the MDC the same way as the existing image decoder. Similar to image decoding we also have some output metadata sent to the backend. See the README changes for some more details on the configuration.

The demuxing and decoding itself happens on CPU via video-rs and ffmpeg. We first try a "seek-forward" approach where we place the decoder cursor close the the frames we want to decode, and try to reach that frame. If the seek fails for any reason, we fall back to a less efficient "full decode" of all the frames sequentially.

For some context, not all frames in a video can be decoded independently, there are only a few key frames scattered in the video, and other frames are compressed with a dependency on these key frames. So if a frame we wish to sample is not a key frame. We must start decoding from the closest key frame until we reach the frame of interest.

Summary by CodeRabbit

New Features
- Introduced video decoding functionality with customizable sampling options including frame count, fps, maximum frames, and pixel constraints.
- Video decoder supports frame selection and uniform sampling with metadata tracking (source fps and frame count).
- Added strict mode validation for video decoding operations.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-03T07:40:03Z

Walkthrough

This pull request adds comprehensive video decoding support to the media preprocessing system. It introduces a new VideoDecoder implementation with frame sampling, a new video-rs dependency with ffmpeg-next for video processing, updates the Python bindings API, and removes the placeholder "not supported yet" error in favor of actual video decoding.

Changes

Cohort / File(s)	Summary
Dependency Management `lib/llm/Cargo.toml`	Adds video-rs (0.10.5 with ndarray feature) and ffmpeg-next (7.1) as public dependencies; introduces media-ffmpeg-build feature flag.
Video Decoder Implementation `lib/llm/src/preprocessor/media/decoders/video.rs`	Introduces VideoDecoder and VideoMetadata types with Decoder trait implementation; handles frame sampling via fps or frame count, temporary file writing, pixel constraint enforcement, and uniform frame selection.
Core Decoder Extension `lib/llm/src/preprocessor/media/decoders.rs`	Adds pub mod video with re-exports; extends MediaDecoder struct with video_decoder field; adds Video variant to DecodedMediaMetadata enum.
Media Loader Integration `lib/llm/src/preprocessor/media/loader.rs`	Replaces VideoUrl error placeholder with actual video_decoder invocation for asynchronous decoding.
Python Bindings `lib/bindings/python/rust/llm/preprocessor.rs`	Adds video_decoder(&mut self, video_decoder: &Bound<'_, PyDict>) -> PyResult<()> method to MediaDecoder, deserializing Python dict to Rust VideoDecoder.
Module & Documentation Updates `lib/llm/src/preprocessor.rs`, `lib/llm/src/preprocessor/media/README.md`	Removes MediaDecoder/MediaFetcher re-exports; documents new video decoding API with configuration options (num_frames, fps, max_frames, strict, max_pixels).

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Video decoding logic density: Frame sampling algorithm, temporary file handling, uniform frame index selection, and 4D tensor assembly in video.rs require careful validation
Edge case handling: Validate mutual exclusivity constraints (fps vs. num_frames, max_frames vs. num_frames), strict mode enforcement, and pixel limit boundaries
Error propagation: Verify FFmpeg error handling and graceful degradation for malformed video inputs
File system interactions: Confirm temporary file creation, cleanup, and potential race conditions in concurrent scenarios

Poem

🐰 Hoppy hops through frames so bright,
Video streams now decode right!
From pixel streams to tensor dreams,
A rabbit's joy in video schemes! 🎬✨

Pre-merge checks

❌ Failed checks (1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Description check	❓ Inconclusive	PR description lacks required template structure; missing explicit 'Details' section and 'Where should reviewer start?' guidance despite comprehensive technical context.	Reorganize description to follow template sections (Overview, Details, Where should reviewer start, Related Issues) for consistency and clarity.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat: video decoder in the frontend' clearly describes the main change—adding video decoder functionality to the frontend, which aligns with the substantial additions to video decoding logic across multiple files.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (2)

lib/llm/src/preprocessor/media/README.md (1)
49-49: Fix markdown heading formatting.

There's an extra space after the hash in the heading, which violates markdown lint rules (MD019).
-###  Sampling
+### Sampling
lib/llm/src/preprocessor/media/decoders/video.rs (1)
128-134: Consider logging decode errors for debugging.

Silently continuing past decode errors (line 131-133) may hide video corruption or decoder issues that are difficult to diagnose. A trace-level log would help with troubleshooting without impacting normal operation.
                 Err(video_rs::Error::ReadExhausted | video_rs::Error::DecodeExhausted) => {
                     break;
                 }
-                Err(_) => {
+                Err(e) => {
+                    tracing::trace!("Skipping frame {current_frame_idx}: {e}");
                     continue;
                 }

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f2a13a and 05b8f94.

⛔ Files ignored due to path filters (2)

Cargo.lock is excluded by !**/*.lock
lib/bindings/python/Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (7)

lib/bindings/python/rust/llm/preprocessor.rs (1 hunks)
lib/llm/Cargo.toml (2 hunks)
lib/llm/src/preprocessor.rs (1 hunks)
lib/llm/src/preprocessor/media/README.md (2 hunks)
lib/llm/src/preprocessor/media/decoders.rs (2 hunks)
lib/llm/src/preprocessor/media/decoders/video.rs (1 hunks)
lib/llm/src/preprocessor/media/loader.rs (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-08-18T20:51:51.324Z

Learnt from: PeaBrane
Repo: ai-dynamo/dynamo PR: 2465
File: lib/runtime/src/pipeline/network/egress/push_router.rs:0-0
Timestamp: 2025-08-18T20:51:51.324Z
Learning: The runtime crate cannot depend on the llm crate due to architectural dependency constraints, preventing imports from lib/llm into lib/runtime.

Applied to files:

lib/llm/src/preprocessor.rs

🧬 Code graph analysis (2)

lib/llm/src/preprocessor/media/loader.rs (1)

lib/llm/src/preprocessor/media/common.rs (1)

from_url (16-42)

lib/llm/src/preprocessor/media/decoders/video.rs (1)

lib/llm/src/preprocessor/media/decoders.rs (1)

decode (17-17)

🪛 GitHub Actions: Copyright Checks

lib/llm/src/preprocessor/media/decoders/video.rs

[error] 1-1: Copyright header check failed: invalid/missing header in lib/llm/src/preprocessor/media/decoders/video.rs

🪛 LanguageTool

lib/llm/src/preprocessor/media/README.md

[style] ~46-~46: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...eds this value, abort the decoding. - max_alloc (uint64, > 0): Maximum allowed ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.18.1)

lib/llm/src/preprocessor/media/README.md

49-49: Multiple spaces after hash on atx style heading

(MD019, no-multiple-space-atx)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)

GitHub Check: sglang (amd64)
GitHub Check: operator (amd64)
GitHub Check: trtllm (amd64)
GitHub Check: trtllm (arm64)
GitHub Check: sglang (arm64)
GitHub Check: operator (arm64)
GitHub Check: vllm (amd64)
GitHub Check: Build and Test - dynamo
GitHub Check: tests (launch/dynamo-run)
GitHub Check: tests (lib/runtime/examples)
GitHub Check: tests (.)
GitHub Check: tests (lib/bindings/python)

🔇 Additional comments (7)

lib/llm/src/preprocessor/media/README.md (1)

42-59: Documentation looks good.

The video decoding options are well documented with clear explanations of the two sampling modes (fixed num_frames vs fps-based), strict mode behavior, and the max_pixels safety limit. This provides good guidance for users configuring video decoding.

lib/llm/src/preprocessor.rs (1)

30-31: LGTM!

The import simplification to only bring in MediaLoader aligns with how the preprocessor uses the media module—MediaLoader is constructed from mdc.media_decoder and mdc.media_fetcher rather than directly using the types here.

lib/bindings/python/rust/llm/preprocessor.rs (1)

102-109: LGTM!

The video_decoder method implementation correctly mirrors the image_decoder pattern, using pythonize::depythonize for Python dict to Rust struct conversion with appropriate error mapping. The API is consistent and straightforward.

lib/llm/src/preprocessor/media/loader.rs (1)

124-129: Video decoding implementation is correct.

The implementation correctly mirrors the image decoding pattern: validate URL, fetch encoded data, then delegate to the video decoder. Error propagation via ? is consistent with the rest of the codebase.

Video processing is already tested in the test_media_url_passthrough integration test, which includes parametrized cases for single_video and mixed video scenarios, exercising the video decoding path through the real VideoDecoder implementation.

lib/llm/src/preprocessor/media/decoders.rs (1)

10-13: LGTM!

The video module integration follows the established pattern for ImageDecoder. The #[serde(default)] attribute ensures backward-compatible deserialization, and the Video variant cleanly extends the metadata enum.

Also applies to: 32-34, 40-40

lib/llm/src/preprocessor/media/decoders/video.rs (2)

15-33: LGTM on struct design.

The configuration struct is well-designed with sensible optional fields and deny_unknown_fields for strict config validation.

98-109: LGTM on frame sampling and tensor construction.

The uniform sampling logic using linspace formula is mathematically sound, and the 4D tensor construction (N, H, W, C) follows standard conventions for video data.

Also applies to: 149-161

lib/llm/Cargo.toml

lib/llm/src/preprocessor/media/decoders/video.rs

KrishnanPrash

Could you add a description to make it easier on the reviewers of this PR? Some things that aren't immediately obvious to me:

To my understanding,ffmpeg-next is a rust-wrapper of ffmpeg and the binary still needs to installed separately.
Build-wise what additional dependencies would audio support require?

milesial · 2025-12-10T01:01:21Z

Could you add a description to make it easier on the reviewers of this PR? Some things that aren't immediately obvious to me:
* To my understanding,`ffmpeg-next` is a rust-wrapper of `ffmpeg` and the binary still needs to installed separately.

* Build-wise what additional dependencies would audio support require?

Added some details. For audio decoding this is not in scope of this PR but I'm assuming that if ffmpeg is installed on the system, we shouldn't need much else.

Signed-off-by: Alexandre Milesi <[email protected]>

container/Dockerfile.trtllm

Signed-off-by: Alexandre Milesi <[email protected]>

milesial requested a review from a team as a code owner December 3, 2025 07:36

pull-request-size bot added the size/L label Dec 3, 2025

github-actions bot added the feat label Dec 3, 2025

coderabbitai bot reviewed Dec 3, 2025

View reviewed changes

lib/llm/Cargo.toml Outdated Show resolved Hide resolved

lib/llm/src/preprocessor/media/decoders/video.rs Show resolved Hide resolved

lib/llm/src/preprocessor/media/decoders/video.rs Outdated Show resolved Hide resolved

milesial force-pushed the alexandrem/video-decoding branch from 05b8f94 to 2cab45b Compare December 8, 2025 17:02

milesial requested review from a team as code owners December 8, 2025 17:02

pull-request-size bot added size/XL and removed size/L labels Dec 8, 2025

copy-pr-bot bot temporarily deployed to GITLAB December 8, 2025 17:03 Inactive

copy-pr-bot bot temporarily deployed to GITLAB December 8, 2025 17:05 Inactive

milesial force-pushed the alexandrem/video-decoding branch from 2cab45b to 6e47c66 Compare December 9, 2025 22:01

copy-pr-bot bot had a problem deploying to GITLAB December 9, 2025 22:01 Failure

milesial force-pushed the alexandrem/video-decoding branch from 6e47c66 to 738998c Compare December 9, 2025 23:19

copy-pr-bot bot had a problem deploying to GITLAB December 9, 2025 23:19 Failure

milesial force-pushed the alexandrem/video-decoding branch from 738998c to 3f818ac Compare December 9, 2025 23:25

copy-pr-bot bot had a problem deploying to GITLAB December 9, 2025 23:25 Failure

milesial requested review from KrishnanPrash, ayushag-nv, grahamking, indrajit96, krishung5 and rmccorm4 December 9, 2025 23:27

copy-pr-bot bot had a problem deploying to GITLAB December 10, 2025 00:26 Failure

copy-pr-bot bot had a problem deploying to GITLAB December 10, 2025 00:47 Failure

KrishnanPrash reviewed Dec 10, 2025

View reviewed changes

copy-pr-bot bot had a problem deploying to GITLAB December 10, 2025 01:50 Failure

copy-pr-bot bot temporarily deployed to GITLAB December 16, 2025 02:27 Inactive

milesial force-pushed the alexandrem/video-decoding branch from 69635b6 to 2c7cb81 Compare December 16, 2025 05:35

copy-pr-bot bot temporarily deployed to GITLAB December 16, 2025 05:35 Inactive

copy-pr-bot bot temporarily deployed to GITLAB December 16, 2025 05:39 Inactive

milesial force-pushed the alexandrem/video-decoding branch from 2c7cb81 to 2ea53b8 Compare December 16, 2025 06:44

copy-pr-bot bot temporarily deployed to GITLAB December 16, 2025 06:44 Inactive

copy-pr-bot bot temporarily deployed to GITLAB December 16, 2025 06:45 Inactive

milesial added 11 commits December 16, 2025 10:56

feat: video decoder in the frontend

932d969

Signed-off-by: Alexandre Milesi <[email protected]>

Sparse sampling, cleanups and tests

40b0f20

Signed-off-by: Alexandre Milesi <[email protected]>

fix: docker

aa43ebe

Signed-off-by: Alexandre Milesi <[email protected]>

fix: ci build

f2dc302

Signed-off-by: Alexandre Milesi <[email protected]>

fix: exclude .so from wheel

6d159e1

Signed-off-by: Alexandre Milesi <[email protected]>

fix: build light ffmpeg from source

9630fa7

Signed-off-by: Alexandre Milesi <[email protected]>

fix: Remove logs

2f88f1b

Signed-off-by: Alexandre Milesi <[email protected]>

feat: update FW dockers

eaa4a9b

Signed-off-by: Alexandre Milesi <[email protected]>

fix: duplicate ffmpeg build

a3b7043

Signed-off-by: Alexandre Milesi <[email protected]>

fix: pkg-config

6a20ee7

Signed-off-by: Alexandre Milesi <[email protected]>

fix: Copy ffmpeg into dev

218ab0e

Signed-off-by: Alexandre Milesi <[email protected]>

milesial force-pushed the alexandrem/video-decoding branch from 2ea53b8 to 218ab0e Compare December 16, 2025 18:56

copy-pr-bot bot had a problem deploying to GITLAB December 16, 2025 18:56 Failure

dillon-cullinan reviewed Dec 16, 2025

View reviewed changes

container/Dockerfile.trtllm Outdated Show resolved Hide resolved

container/Dockerfile.trtllm Outdated Show resolved Hide resolved

copy-pr-bot bot had a problem deploying to GITLAB December 16, 2025 21:22 Failure

fix: sccache and better cross-stage copies

387aa86

Signed-off-by: Alexandre Milesi <[email protected]>

milesial force-pushed the alexandrem/video-decoding branch from 95b96f3 to 387aa86 Compare December 16, 2025 21:26

copy-pr-bot bot had a problem deploying to GITLAB December 16, 2025 21:26 Failure

dillon-cullinan approved these changes Dec 16, 2025

View reviewed changes

copy-pr-bot bot had a problem deploying to GITLAB December 16, 2025 22:16 Failure

milesial merged commit 74fcd4a into main Dec 16, 2025
38 of 40 checks passed

milesial deleted the alexandrem/video-decoding branch December 16, 2025 22:28

smatta-star pushed a commit to smatta-star/dynamo that referenced this pull request Dec 19, 2025

feat: video decoder in the frontend (ai-dynamo#4719)

61135a1

Signed-off-by: Alexandre Milesi <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: video decoder in the frontend #4719

feat: video decoder in the frontend #4719

Uh oh!

milesial commented Dec 3, 2025 •

edited

Loading

Uh oh!

coderabbitai bot commented Dec 3, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

KrishnanPrash left a comment

Uh oh!

milesial commented Dec 10, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

feat: video decoder in the frontend #4719

feat: video decoder in the frontend #4719

Uh oh!

Conversation

milesial commented Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

KrishnanPrash left a comment

Choose a reason for hiding this comment

Uh oh!

milesial commented Dec 10, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

milesial commented Dec 3, 2025 •

edited

Loading

coderabbitai bot commented Dec 3, 2025 •

edited

Loading