Distributed cost reporting#967
Conversation
…to prevent collisions fix(log_dispatch): append traceback to console message when include_exception is True test(bundle_validator): ensure consecutive sweeps use distinct registry IDs and do not collide test(bundle_validator): verify registry is opened once and closed in finally for successful runs test(log_dispatch): add tests for exception logging behavior in console messages docs(init): add non-interactive `pipelex init --yes` mode for CI compatibility docs(registry-leak): document `UsageRegistry` leak on success path of full pipeline run
- Introduced `--costs` option in CLI commands (`run_bundle_cmd`, `run_method_cmd`, `run_pipe_cmd`) to enable or disable cost reporting. - Updated execution configuration to include `is_generate_costs` flag for controlling cost reporting behavior. - Enhanced `GraphContext` to manage emission of usage events independently from graph events. - Modified pipeline run setup to initialize tracing context based on graph and cost reporting flags. - Implemented tests to validate the behavior of cost reporting and event emission gates in various scenarios. - Updated configuration files to reflect new cost reporting settings.
…nsure correctness across contexts docs: update references to with_execution_overrides in execution graph tracing documentation chore(TODOS): document implementation plan and review findings for Phase 1 emit decoupling
- Replace `act_assemble_graph` with `act_assemble_tracing` to unify tracing assembly logic. - Remove the `act_assemble_graph` activity and its associated tests. - Introduce `act_assemble_tracing` to handle both graph specification and usage aggregation from trace events. - Update `pipeline_run_setup` to correctly propagate emit flags to the new tracing activity. - Modify `WfPipeRun` and `WfPipeRouter` to accommodate changes in tracing assembly logic. - Add integration tests for tracing assembly scenarios, ensuring correct population of `tokens_usages` and `graph_spec`. - Implement unit tests for `TracingAssembly` to validate behavior under various conditions, including error handling.
…tests for WfPipeRun behavior
… commands - Introduced cost reporting in the agent CLI, allowing for structured cost summaries in JSON output. - Removed deprecated cost report flags from CLI commands and replaced them with a unified cost reporting mechanism. - Updated the CostRegistry to aggregate costs by model and compute total costs from token usages. - Implemented a new cost report renderer to handle console and CSV outputs based on configuration settings. - Added integration tests to ensure correct cost report generation for both non-zero and zero-cost runs. - Updated configuration files to enable cost logging to console by default. - Refactored existing code to streamline cost reporting logic and improve maintainability.
- Updated the cost reporting logic to ensure that runs using free models with real tokens still generate a cost report, reflecting a total cost of 0. - Enhanced the `CostRegistry` to aggregate costs and determine reportability based on token usage and cost, ensuring that dry runs (zero tokens and zero cost) are the only cases that suppress reports. - Modified the `render_run_cost_report` function to check for reportable usage before generating output. - Added tests to verify that free models are reported correctly and that dry runs do not produce reports. - Documented the changes in the relevant markdown files to clarify the behavior of cost reporting in different scenarios.
- Removed unnecessary assertions and code related to orphan registries in ReportingManager tests. - Updated tests to reflect the removal of per-run usage registries, ensuring that reporting functions correctly without accumulating orphan registries. - Simplified test setup for JobMetadata and workflow run IDs, eliminating the need for explicit registry opening and closing. - Adjusted assertions in tests to verify correct behavior of reporting events and usage without relying on registry state. - Enhanced clarity in test descriptions and streamlined the overall test structure for better maintainability.
…inference` CLI option for enhanced cost reporting
…reporting - Introduced dry_mock.py to handle synthetic inference jobs for both --dry-run and --mock-inference modes. - Added mock_llm_gen_text, mock_llm_gen_object, and mock_llm_gen_object_list functions to generate mock outputs and report usage. - Updated llm_generate.py to route calls to mock functions based on job_metadata.is_mock_inference flag. - Enhanced JobMetadata to include is_mock_inference attribute for tracking mock inference state. - Modified pipeline execution setup to propagate is_mock_inference flag through job metadata. - Created integration tests for --mock-inference in both direct and temporal contexts to ensure correct behavior and reporting. - Added unit tests for dry_mock functions to validate synthetic job reporting and usage aggregation. - Updated CLI commands to declare --mock-inference option and enforce mutual exclusivity with --dry-run.
…tions - Introduced `MockInferenceUnsupportedError` to prevent real provider calls under `--mock-inference` for image generation, document extraction, and web search. - Updated relevant leaf functions (`img_gen_single_image`, `img_gen_image_list`, `extract_gen_pages`, and `PipeSearch`) to raise the new error when `is_mock_inference` is true. - Added integration and unit tests to ensure the guard functions correctly and prevents unnecessary spending. - Documented the new error in the errors index and created a dedicated page for `MockInferenceUnsupportedError`. - Updated TODOs and follow-up documentation to reflect the changes and future work on full leaf coverage.
…rence fidelity - Added `validate_run_flag_combination` function to enforce legal combinations of `--dry-run`, `--mock-inference`, and `--mock-inputs` flags across `pipe`, `method`, and `bundle` commands. - Refactored command implementations in `bundle_cmd.py`, `method_cmd.py`, and `pipe_cmd.py` to utilize the new validation function, removing redundant checks. - Introduced `_revalidate_against_object_class` function in `content_generator.py` to ensure fidelity of mock inference objects against their original classes, raising `MockInferenceObjectFidelityError` for validation failures. - Updated `ContentGeneratorInWorkflow` to include fidelity checks for mock inference objects. - Created integration tests for mock inference fidelity and unit tests for run flag combination validation. - Added detailed error handling and messaging for mock inference fidelity issues.
… docs Phase 6 — temporal-e2e-validate skill: add Tier 8b (cross-worker cost report assembly). Uses split router+runner workers + --mock-inference so it is free and deterministic — asserts the submitter renders a single non-suppressed cost report from runner-side usage, with a --no-costs negative arm and an opt-in live arm. Updates the skill description/triggers, the master results table, and reconciles the Tier 8 note to point at --mock-inference as the cheap deterministic path. Phase 7 — docs/changelog/wrap: - CHANGELOG [Unreleased]: --costs/--no-costs (default on), --cost-report removal, distributed cost reporting, tokens_usages on PipeOutput, --mock-inference, agent-CLI cost_report JSON, is_log_costs_to_console default flip, dry-run suppression, act_assemble_tracing rename, and the UsageRegistry success-path leak fix. - wip docs flipped to as-built: distributed-execution T2 -> FIXED, P0.1 + P1 -> DONE; registry README -> IMPLEMENTED. New wip/registry/deferred-followups.md consolidates the deferred non-goals. - TODOS.md now reads as the as-built PR-review guide to the branch. Markdown-only; make agent-check green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Greptile SummaryThis PR moves cost reporting onto the run result and makes it work across distributed execution. It changes:
Confidence Score: 5/5This looks safe to merge.
Reviews (11): Last reviewed commit: "refactor(tracing): rename GraphContext →..." | Re-trigger Greptile |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 51c5c85606
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
4 issues found across 124 files
Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
Re-trigger cubic
Three real findings fixed; two design tradeoffs documented as deferred. - act_assemble_tracing no longer swallows unexpected errors (greptile + cubic P1). Expected best-effort failures are still caught inside assemble_tracing and returned on the *_assembly_error fields; a genuinely unexpected error now propagates so the activity fails and WfPipeRun's existing `except ActivityError` branch records it on usage_assembly_error / graph_assembly_error — symmetric with DIRECT mode, instead of the workflow silently succeeding with no cost report and no diagnostic. Pinned by a new unit test. - Agent CLI local cost-summary block is now wrapped to mirror the main CLI's render_run_cost_report guard (cubic P2): CostRegistry.build_cost_summary can raise CostRegistryError (a PipelexError); catch + log so a reporting-side error never fails an otherwise-successful run. - execution_seams.prepare_pipe_job docstring corrected (cubic P3): --mock-inference fakes the LLM leaf only; non-LLM leaves raise MockInferenceUnsupportedError. Deferred (documented, not patched): - Agent `--costs` is a silent no-op in API mode → #6b in wip/registry/cost-report-deferred-decisions.md (facet of #6; needs the tri-state param first). - Activity usage write can mutate a workflow-registered buffer in a single-process Temporal deployment → pre-existing, T3 territory; documented in wip/registry/deferred-followups.md (split router+runner topology doesn't hit it). make agent-check + full make agent-test green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review round 1 addressed (commit 51b7e3f)Thanks for the reviews. Triage: Fixed
Deferred (documented design decisions, not silent bugs)
@greptileai @cubic-dev-ai please re-review. |
There was a problem hiding this comment.
2 issues found across 125 files
Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
Re-trigger cubic
…nd 2) The agent CLI's CLAUDE.md overstated when `cost_report` appears — round-1's build_cost_summary guard means it can now be absent on a real run too. Doc now states it is best-effort and absent for: dry runs, --no-costs, the API-runner path, or when cost-summary aggregation fails (caught + skipped). (cubic P3) The other round-2 finding (UnicodeEncodeError in CSV write, cubic P2) is a false positive: save_to_csv writes encoding="utf-8", which encodes every valid Python str, so UnicodeEncodeError cannot occur — adding it would be a speculative catch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review round 2 addressed (commit 73ac388)
CI green on the prior commit; this change is doc-only. @greptileai @cubic-dev-ai please re-review. |
@lchoquel I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
1 issue found across 125 files
Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
Re-trigger cubic
…ound 3) CHANGELOG's agent cost_report entry read as "always attached"; it is best-effort/optional (absent for dry runs, --no-costs, API-runner path, or an aggregation failure). Matches the agent_cli CLAUDE.md correction from round 2. (cubic P2) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review round 3 addressed (commit 72692b0)
@greptileai @cubic-dev-ai please re-review. |
- Cost-report guard now also catches UnicodeEncodeError (cubic P2). The guard's contract is "cost reporting never fails a successful run"; it already caught OSError (a CSV-write failure). The CSV is written UTF-8, which can raise UnicodeEncodeError on a lone-surrogate string — same write-failure category, so it belongs in the same catch. (Reverses the round-2 decline: the earlier "impossible with UTF-8" rationale was imprecise.) Pinned by a parametrized test (UnicodeEncodeError / CostRegistryError / OSError all degrade to a warning). - execution_seams docstring: "LLM cogt leaf" -> "LLM inference-leaf" (cubic P3). `cogt` is the package name, not a typo, but the jargon read as one; reworded for clarity. - reporting-config.md: console cost-table is printed only when the run had reportable usage — suppressed for dry runs and on an assembly/aggregation failure (cubic P3). Deferred (documented): --no-costs no-op in agent API mode (cubic P2) is #6b in wip/registry/cost-report-deferred-decisions.md. make agent-check + targeted reporting/cost tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review round 4 addressed (commit 0ddd65c)
@greptileai @cubic-dev-ai please re-review. |
@lchoquel I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
3 issues found across 125 files
Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
You've manually re-run cubic several times on this PR. Each manual re-review checks the full PR again and counts toward your usage quota. To preserve your usage limits, we recommend letting cubic automatically review new commits.
Re-trigger cubic
- PipelexRunner.__init__ is now keyword-only (cubic P2). is_mock_inference was added mid-signature; making the constructor keyword-only (all call sites already pass keywords) removes the positional-shift footgun for good and aligns with the codebase's keyword-only-args direction. - cost_report_renderer module docstring corrected (cubic P3): only the main CLI calls render_run_cost_report; the agent CLI builds a machine-readable cost_report via CostRegistry.build_cost_summary instead (no Rich table). Deferred (documented): --no-costs no-op in agent API mode, re-flagged on method_cmd — #6b in wip/registry/cost-report-deferred-decisions.md. make agent-check + pipeline integration tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review round 5 addressed (commit 0f92c37)
@greptileai @cubic-dev-ai please re-review. |
@lchoquel I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
1 issue found across 125 files
Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
You've manually re-run cubic several times on this PR. Each manual re-review checks the full PR again and counts toward your usage quota. To preserve your usage limits, we recommend letting cubic automatically review new commits.
Re-trigger cubic
Review converged (commit 0f92c37)Round 6 surfaced only a re-flag of the already-documented-deferred #6b ( Summary of what was addressed across rounds:
Proceeding to final pre-merge review. |
…/review) Final pre-merge /review (Claude + Codex adversarial passes) found one real best-effort violation: assemble_tracing's read-error catch omitted botocore ClientError, so in DIRECT mode with the DynamoDB backend a transient DDB error (throttle/auth/network) during tracing assembly would fail an otherwise-successful run — the sibling reporting_manager fallback already handles it. Added the conditional botocore catch (mirrors reporting_manager) so the read degrades to an *_assembly_error note. Pinned by test_read_botocore_clienterror_is_caught (pytest.importorskip-guarded). Flagged + deferred (pre-existing, separate concerns; documented in wip/registry/): - Cost report ignores reasoning/audio/prediction token costs (under-reports total) → deferred-followups.md (cost-model change, not this PR's scope). - DIRECT vs TEMPORAL object re-validation use different model_dump modes → deferred-followups.md. - Agent API-mode flag handling is inconsistent (--graph/--costs silently ignored while --dry-run/--mock-inputs are rejected) → broadened #6b in cost-report-deferred-decisions.md (holistic fix, not a partial --no-costs patch). make agent-check + targeted pipe_run/reporting/pipeline tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final pre-merge review (gstack /review) — commit 474b88eRan independent Claude + Codex adversarial passes over the full diff. Fixed (1 real best-effort violation):
Flagged + deferred (pre-existing, separate concerns — documented in
Confirmed fine: exactly-once usage emit (fast-path XOR fallback),
|
|
Final commit |
@lchoquel I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
1 issue found across 125 files
Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
You've manually re-run cubic several times on this PR. Each manual re-review checks the full PR again and counts toward your usage quota. To preserve your usage limits, we recommend letting cubic automatically review new commits.
Re-trigger cubic
… (PR #967 review round 7) Greptile + cubic correctly flagged that the round-6 /review botocore fix was incomplete: it caught ClientError (service-side throttle/auth) but not BotoCoreError (transport/credential/timeout — EndpointConnectionError, ReadTimeoutError, NoCredentialsError, ...), which is a SEPARATE base class, not a ClientError subclass. A DynamoDB transport failure would still escape the best-effort read/emit catch and fail an otherwise-successful run. Now both sibling classes are caught in both assemble_tracing (read) and reporting_manager's runner fallback (emit) — matching s3_storage_provider, which already catches the pair. botocore is untyped so the imported classes are Unknown; cast to type[BaseException] keeps the except-tuple (and the bound exception var) cleanly typed. Test extended to cover both ClientError and a BotoCoreError subclass (EndpointConnectionError). make agent-check (pyright 0/0, mypy 2026 files) + tracing/reporting tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Review round 7 addressed (commit 5372b23)Both bots correctly caught that my round-6 /review fix was incomplete: it caught botocore
@greptileai @cubic-dev-ai final re-review on this commit, please. |
@lchoquel I have started the AI code review. It will take a few minutes to complete. |
There was a problem hiding this comment.
No issues found across 125 files
Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.
On a pro plan you can use ultrareview for larger PRs.
You've manually re-run cubic several times on this PR. Each manual re-review checks the full PR again and counts toward your usage quota. To preserve your usage limits, we recommend letting cubic automatically review new commits.
Re-trigger cubic
✅ Review complete — PR readyRound 8 (commit Across 8 review rounds + the gstack /review pass, every real finding was fixed:
Design tradeoffs were deferred with documentation rather than reflexively patched (in
|
…_costs → is_generate_usage (#968) The per-execution tracing context was born graph-only, but distributed cost reporting now rides usage/cost events over the same object — so the name under-described it. Rename it to the shared trace transport it actually is, and align the internal usage gate to the stream it gates. - `GraphContext` → `TraceContext`; file `graph_context.py` → `trace_context.py`; attribute `graph_context` → `trace_context` everywhere (incl. `JobMetadata`). Re-document the object as one node tree with two streams (graph + usage). - Gate flag `is_generate_costs` → `is_generate_usage` on `PipelineExecutionConfig` (+ all three `pipelex.toml` files) and the override param `with_execution_overrides(generate_costs=…)` → `generate_usage=…`. - Cost-domain renderer left cost-named on purpose: `render_run_cost_report` keeps `is_generate_costs`, fed the usage gate at the seam ("if usage was generated, render the cost view"). Public `--costs` CLI flag unchanged. Deferred: the optional `graph_id` → `run_id` nit (ripples into the deliberately graph-named `GraphTracerManager` plumbing). No behavior change. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Deleted the `registry-lifecycle-synthesis.md`, `registry-success-path-leak-assessment.md`, `registry-success-path-leak-brief.md`, `registry-success-path-leak-execution-contexts.md`, and `usage-reporting-without-cost.md` files as they are no longer relevant to the current implementation and design direction. - The `UsageRegistry` leak was confirmed, where successful runs left registries open, leading to memory leaks in long-lived servers. The design now ensures that the registry is closed properly on success paths. - The synthesis document now encapsulates the lifecycle of the `UsageRegistry`, addressing both the leak and the cost reporting mechanisms in a unified manner. - The capability of usage reporting has been clarified, distinguishing between usage data and cost projections, with a focus on making usage a first-class output.
What & why
This branch turns cost reporting from a leaky per-run in-process buffer into an event-sourced artifact on the run result, and in doing so fixes the
UsageRegistrysuccess-path leak and wires up distributed (cross-worker) cost reporting — they were two facets of one lifecycle (the leak is the missing close; distributed reporting is the missing replay-populate of the same per-run buffer).The full as-built record, phase by phase with cold-start notes, is in
TODOS.md— read it as the guide to this branch. The locked design iswip/registry/registry-lifecycle-synthesis.md.The five load-bearing changes
Leak fixed by removal. The
UsageRegistry(and itsopen_registry/close_registrylifecycle,inject_tokens_usages,generate_report, the live-add trio) is gone. It was opened inpipeline_run_setupbut closed only on the failure path, so every successful run leaked its registry — and in a long-lived process (the Pipelex API) reusing apipeline_run_idcould collide on the orphan. With nothing to open, the leak is structurally impossible.Usage rides on
PipeOutput. Newtokens_usages: list[AnyTokensUsage] | None+usage_assembly_error: str | None, mirroringgraph_spec/graph_assembly_error. Usage is assembled from the trace-event stream at end of run, in both DIRECT and TEMPORAL modes — so a cross-worker run carries its true total cost back on the result, exposed automatically wherever aPipeOutputis returned (including the Pipelex API response).--costs/is_generate_costs(default on) decouples cost collection from--graph. Both ride the shared event-log transport:--graphgates graph events +GraphSpecAssembler;--costsgatesUsageReportEvents +UsageAggregator.--cost-reportis removed (folded into--costs; no back-compat, per repo policy).Event-sourced renderer. The only cost renderer now reads
pipe_output.tokens_usagesviaCostRegistry.generate_report(tokens_usages=...)(main CLI throughreporting/cost_report_renderer.py). The Temporalact_assemble_graphactivity is renamedact_assemble_tracingand assembles both graph and usage from a single event read.is_log_costs_to_consolenow defaultstrue(CLI shows the table by default when--costsis on); per-inference-job console logging is removed (report renders once at end of run). Dry runs are suppressed (zero tokens + zero cost); free/zero-price model runs (e.g. Ollama) still report token usage with a zero total.--mock-inference. A LIVE run that fakes AI calls at the inference leaf with reportable non-zero synthetic usage — operators dispatch real activities (exercising the true distributed path) but no tokens are billed. It's the cheap, deterministic way to validate distributed cost reporting (and thetemporal-e2e-validateskill's new Tier 8b consumes it). Mutually exclusive with--dry-run. Covers the LLM leaf; image-gen/extract/search under--mock-inferencefail loud withMockInferenceUnsupportedErrorrather than silently spending.Breaking changes (no back-compat, per repo policy)
--cost-report/--no-cost-reportremoved fromrun pipe|method|bundle→ use--costs(default on).ReportingProtocol/ReportingNoOpno longer exposeopen_registry/close_registry/generate_report/inject_tokens_usages; theUsageRegistrymodel is gone. Embedders that rendered viaget_report_delegate().generate_report()must render from the returnedpipe_output.tokens_usages.is_log_costs_to_consoledefault flippedfalse → true(inpipelex.toml,.pipelex/pipelex.toml, and thekit/configsclient template).Testing
make agent-check(ruff/plxt clean, pyright 0/0, mypy clean) and fullmake agent-testgreen. Distributed cost reporting is validated without spend:test_split_worker_usage.py(read-back aggregation),test_mock_inference_temporal.py(--mock-inferencecross-process),test_direct_tracing_assembly.py/test_cost_report_rendering.py(DIRECT). Thetemporal-e2e-validateskill's Tier 8b is the manual 3-process counterpart.Cross-repo (deferred to release pin-bump — not in this PR)
cocode and pipelex-api pin a released pipelex (registry still present there). They don't break on this branch; their migration (
swe_cmd.py→CostRegistry.generate_report(tokens_usages=...), plus API leak verification) lands when the pin bumps to a build carrying these changes. See TODOS Phase 3/4 cross-repo boxes.Deferred non-goals
Consolidated in
wip/registry/deferred-followups.md(D5 costs-only tracer skip, cost-per-node correlation, theGraphContext.from_execution_configfactory, T5, T3 request-scoped tracing,--mock-inferenceper-operator coverage) and the two open design decisions inwip/registry/cost-report-deferred-decisions.md.🤖 Generated with Claude Code
Summary by cubic
Rebuilt cost reporting as an event‑sourced artifact on
PipeOutput, fixing the success‑pathUsageRegistryleak and enabling cross‑worker cost aggregation. Hardening: trace reads now catchbotocore.ClientErrorandbotocore.BotoCoreError, and both event‑log emission and cost‑report rendering failures are best‑effort so successful runs don’t fail; obsolete registry docs were removed and the leak fix is documented.New Features
pipe_output.tokens_usages(+usage_assembly_error), assembled from trace events in DIRECT and TEMPORAL; multi‑worker runs aggregate into one report.--costs/--no-costs(default on) decouples usage from--graph; costs‑only skips graph assembly and heavy payload serialization.tokens_usages; console prints once at end when enabled, suppressed for dry runs (0 tokens) and when usage assembly/aggregation fails. CSV/summary errors are logged and skipped.cost_reportJSON (omitted for dry runs,--no-costs, API‑runner path, or if local summary aggregation fails). Free/zero‑price models report tokens with a 0 total.act_assemble_tracingto assemble graph and costs from a single read; unexpected errors propagate soWfPipeRunrecords them on*_assembly_error.--mock-inference: LLM leaf returns synthetic, non‑zero usage without provider calls; non‑LLM leaves raiseMockInferenceUnsupportedError. Mocked objects are re‑validated; failures raiseMockInferenceObjectFidelityError.Migration
--cost-reportwith--costs(default on). Config addsis_generate_usage(wasis_generate_costs). Console defaultis_log_costs_to_console = true.GraphContextrenamed toTraceContext;JobMetadata.graph_context→JobMetadata.trace_context. The shared context carries both graph and usage streams; costs‑only still mints node IDs but does not assemble aGraphSpec.get_report_delegate().open/close_registryand.generate_report(). Render frompipe_output.tokens_usagesviaCostRegistry.generate_report(...)orrender_run_cost_report(...).PipelineExecutionConfig.with_execution_overrides(...)(replaceswith_graph_config_overrides(...)).act_assemble_graphrenamed toact_assemble_tracing.PipelexRunner.__init__is now keyword‑only; pass all constructor args by name.Written for commit 67fcf0f. Summary will update on new commits.