Skip to content

Conversation

@nv-nmailhot
Copy link
Contributor

@nv-nmailhot nv-nmailhot commented Dec 1, 2025

Overview:

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features

    • Build metrics now capture and include sccache statistics, providing enhanced visibility into compilation caching performance during the build process.
  • Chores

    • Integrated sccache data extraction and JSON export capabilities into the build metrics pipeline for improved build telemetry.

✏️ Tip: You can customize this high-level summary in your review settings.

@nv-nmailhot nv-nmailhot requested review from a team as code owners December 1, 2025 19:56
@github-actions github-actions bot added the feat label Dec 1, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 1, 2025

Walkthrough

This PR extends the Docker build system to capture, export, and parse sccache (shared compilation cache) statistics. It adds an extraction step in the GitHub Actions workflow that copies sccache stats from built images, integrates sccache JSON parsing into the build metrics parser, and introduces a new export-stats-json command in the build utility script to write cache statistics in JSON format.

Changes

Cohort / File(s) Summary
GitHub Actions Workflow
.github/actions/docker-build/action.yml
Adds a step to extract sccache statistics from built images into build-metrics/sccache-stats.json, passing the stats file path to the build metrics parser via a new --sccache argument. Handles cases where the file is unavailable.
Build Metrics Parser
.github/scripts/parse_buildkit_output.py
Adds support for optional --sccache=<file> CLI argument to read sccache JSON data, merge it into container metadata, and include a sccache summary section in the final output. Includes error handling for missing or invalid sccache files.
Container Build Scripts
container/Dockerfile, container/use-sccache.sh
Dockerfile chains a new export-stats-json command to write sccache statistics to JSON after showing stats. use-sccache.sh adds a new export_stats_json() function that parses sccache --show-stats output and exports it as JSON with snake_case keys and unit suffixes, with CLI routing for the export-stats-json command.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas requiring extra attention:
    • The export_stats_json() function logic in use-sccache.sh, particularly the awk parsing and JSON formatting with key normalization
    • Error handling pathways in parse_buildkit_output.py for sccache file reads and data validation
    • GitHub Actions workflow orchestration ensuring stats file propagation between steps

Poem

🐰 A rabbit crafts caches so fine,
Measuring stats in JSON lines,
From sccache logs to parsed delight,
Build metrics gleam, data shining bright,
Every build tracked with metrics' might! ✨

Pre-merge checks

❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Description check ⚠️ Warning The pull request description is entirely empty with only placeholder template sections and no actual content filled in by the author. Fill in the Overview, Details, Where should the reviewer start?, and Related Issues sections with relevant information about the sccache metrics feature being added.
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding sccache metrics functionality to the database through integration with build metrics workflow.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
.github/scripts/parse_buildkit_output.py (1)

280-297: Consider replacing Unicode character and simplifying the data check.

Two minor suggestions:

  1. The Unicode character ℹ️ on line 293 may not display correctly in all terminal environments. Consider using a plain ASCII alternative like "INFO:" or "[i]".

  2. The check on line 286 is redundant: if sccache_data and len(sccache_data) > 0: can be simplified to just if sccache_data: since an empty dict is falsy and a non-empty dict is already truthy.

Apply this diff:

             if sccache_data and len(sccache_data) > 0:
                 build_data["container"]["sccache"] = sccache_data
                 print(
                     f"✅ sccache metrics added to container: {len(sccache_data)} metrics",
                     file=sys.stderr,
                 )
             else:
-                print("ℹ️  No sccache metrics found", file=sys.stderr)
+                print("INFO: No sccache metrics found", file=sys.stderr)

Or to also simplify the check:

-            if sccache_data and len(sccache_data) > 0:
+            if sccache_data:
                 build_data["container"]["sccache"] = sccache_data
                 print(
                     f"✅ sccache metrics added to container: {len(sccache_data)} metrics",
                     file=sys.stderr,
                 )
             else:
-                print("ℹ️  No sccache metrics found", file=sys.stderr)
+                print("INFO: No sccache metrics found", file=sys.stderr)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5d9fc85 and e12f16f.

📒 Files selected for processing (4)
  • .github/actions/docker-build/action.yml (2 hunks)
  • .github/scripts/parse_buildkit_output.py (4 hunks)
  • container/Dockerfile (1 hunks)
  • container/use-sccache.sh (4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-30T20:43:49.632Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2797
File: container/Dockerfile:437-449
Timestamp: 2025-08-30T20:43:49.632Z
Learning: In the dynamo project's devcontainer setup, the team prioritizes consistency across framework-specific Dockerfiles (like container/Dockerfile, container/Dockerfile.vllm, etc.) by mirroring their structure, even when individual optimizations might be possible, to maintain uniformity in the development environment setup.

Applied to files:

  • container/Dockerfile
📚 Learning: 2025-09-16T17:16:07.820Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 3051
File: container/templates/Dockerfile.vllm.j2:221-233
Timestamp: 2025-09-16T17:16:07.820Z
Learning: In the dynamo project, when converting Dockerfiles to Jinja2 templates, the primary goal is backward compatibility - generated Dockerfiles must be identical to the originals. Security improvements or other enhancements that would change the generated output are out of scope for templating PRs and should be addressed separately.

Applied to files:

  • container/Dockerfile
📚 Learning: 2025-08-30T20:43:10.091Z
Learnt from: keivenchang
Repo: ai-dynamo/dynamo PR: 2797
File: .devcontainer/devcontainer.json:12-12
Timestamp: 2025-08-30T20:43:10.091Z
Learning: In the dynamo project, devcontainer.json files use templated container names (like "dynamo-vllm-devcontainer") that are automatically processed by the copy_devcontainer.sh script to generate framework-specific configurations with unique names, preventing container name collisions.

Applied to files:

  • container/Dockerfile
🪛 Ruff (0.14.6)
.github/scripts/parse_buildkit_output.py

293-293: String contains ambiguous (INFORMATION SOURCE). Did you mean i (LATIN SMALL LETTER I)?

(RUF001)


296-296: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: trtllm (amd64)
  • GitHub Check: trtllm (arm64)
  • GitHub Check: sglang (arm64)
  • GitHub Check: operator (amd64)
  • GitHub Check: operator (arm64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (11)
container/use-sccache.sh (4)

18-21: LGTM! Clear documentation for the new command.

The help text clearly describes the new export-stats-json command and provides a helpful usage example.

Also applies to: 35-36


69-72: LGTM! Proper error handling for missing sccache.

The function correctly checks for sccache availability and writes a well-formed error JSON when it's not available.


157-160: LGTM! Properly integrated into the command dispatcher.

The new command follows the existing pattern for argument handling.


65-142: Verify or add tests for sccache output format compatibility.

The AWK parser assumes a specific sccache --show-stats output format (lines starting with letters, numeric values matching /^[0-9.]+$/, and unit suffixes % or s). With no test fixtures, documentation of expected format, or error handling, the parser could silently produce incomplete JSON if the output format differs from expectations. Consider:

  • Adding test cases with sample sccache output to validate parsing logic
  • Documenting the expected output format in code comments or README
  • Adding validation/logging to detect malformed parsing results
container/Dockerfile (1)

326-327: Consider whether sccache export failure should fail the build.

The use of && means that if export-stats-json fails, the entire Docker build will fail. This could be appropriate if sccache metrics are critical, but if they're optional/best-effort, you might want to use || true to allow the build to continue even if export fails.

Current behavior (fails build on export error):

/tmp/use-sccache.sh show-stats "Dynamo" && \
/tmp/use-sccache.sh export-stats-json "/tmp/sccache-stats.json" "Dynamo"

Alternative (continues build on export error):

/tmp/use-sccache.sh show-stats "Dynamo" && \
/tmp/use-sccache.sh export-stats-json "/tmp/sccache-stats.json" "Dynamo" || true

Which behavior is preferred for your use case?

.github/scripts/parse_buildkit_output.py (3)

176-176: LGTM! Clear documentation of the new --sccache option.

The usage text clearly documents the new option and provides a helpful example.

Also applies to: 180-180


190-190: LGTM! Consistent argument parsing pattern.

The sccache argument parsing follows the same pattern as the existing metadata argument.

Also applies to: 195-196


323-340: LGTM! Defensive key checking is good.

The sccache summary printing correctly checks for key existence before accessing values, preventing potential KeyError exceptions.

.github/actions/docker-build/action.yml (3)

205-236: LGTM! Robust extraction with good error handling.

The sccache statistics extraction step is well-implemented with:

  • Defensive checks for image availability
  • Graceful fallback to empty JSON when stats aren't available
  • Proper cleanup with || true to prevent failure
  • Appropriate use of if: always() to run even after build failures

295-295: LGTM! Good use of variables for clarity.

Defining SCCACHE_STATS as a variable and logging it improves maintainability and debuggability.

Also applies to: 298-298


303-303: LGTM! Correct parser invocation with sccache metrics.

The parser is correctly invoked with both container metadata and sccache statistics files.

Also applies to: 308-309

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants