fix(composio/triage): demote provider-config-rejection rollups (Sentry TAURI-RUST-1V) #2689
Walkthrough
Adds "may not be available on your provider" to the provider config-rejection matcher with a unit test; routes Composio triage errors through crate::core::observability::report_error_or_expected instead of direct tracing::error!.
Changes
Error Classification and Observability
LGTM — clean, surgical fix with good test coverage.
Nice work routing the composio triage error path through report_error_or_expected instead of the raw tracing::error!. The new phrase anchor in config_rejection.rs is well-scoped (sole producer at reliable.rs:332) and the cross-referencing comments will help catch drift.
Both changes follow the established pattern used across ~10 other call sites in the codebase. Tests cover the multi-line rollup and bare phrase cases. The {e:#} alternate Display for the error chain in the detail string is correct.
CI note: all failures are Docker login (GHCR auth) issues for the contributor's fork — unrelated to these changes. The Tauri build, E2E suites, and core image build all pass.
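As a rough illustration of the `{e:#}` point above: anyhow's alternate Display prints an error followed by its causes, joined with `": "`. The std-only sketch below mimics that shape by walking `source()`. `ProviderError`, `TriageError`, and `format_chain` are hypothetical names for this illustration and are not taken from the codebase.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical leaf error standing in for a provider failure.
#[derive(Debug)]
struct ProviderError(String);

impl fmt::Display for ProviderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl Error for ProviderError {}

/// Hypothetical wrapper that adds context and exposes its cause via `source()`.
#[derive(Debug)]
struct TriageError {
    context: String,
    cause: ProviderError,
}

impl fmt::Display for TriageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Plain Display shows only the outermost context.
        write!(f, "{}", self.context)
    }
}

impl Error for TriageError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

/// Join an error and all of its causes with ": ", similar in shape to
/// anyhow's alternate `{:#}` output.
fn format_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut current = err.source();
    while let Some(cause) = current {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        current = cause.source();
    }
    out
}
```

The practical consequence is that a classifier which matches on message text needs the full chain: the plain Display of the outer error would hide the inner provider message that carries the remediation phrase.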
5fa4ca8 to e8397ab
…r provider" (Sentry TAURI-RUST-1V)
Add the canonical phrase from `reliable.rs:332` to the ProviderConfigRejection classifier. `reliable.rs` rolls every exhausted fallback into `All providers/models failed. Attempts:\n…\nThe model `<id>` may not be available on your provider. Configure a fallback chain via `reliability.model_fallbacks` in …`, which the composio triage subscriber re-reports to Sentry. The remediation lives entirely in the user's `reliability.model_fallbacks` config; Sentry has no remediation path.
Drops the bulk of self-hosted Sentry TAURI-RUST-1V (10,692 events / 14d on `tauri-rust` project, dominated by `gemini-3-flash-preview`-style ProviderConfigRejection rollups).
Sentry-Issue: TAURI-RUST-1V
…sifier (Sentry TAURI-RUST-1V)
Swap the raw `tracing::error!` at memory_sync/composio/bus.rs:354 with `crate::core::observability::report_error_or_expected` so user-config / budget-exhausted rollups from the upstream provider chain get demoted to info-level breadcrumbs instead of surfacing as Sentry errors.
Pairs with the new `may not be available on your provider` anchor in `is_provider_config_rejection_message`; together they neutralise the self-hosted Sentry TAURI-RUST-1V noise (10,692 events / 14d) whose inner attempts the provider layer already correctly demoted but whose outer rollup escaped via this raw error emit. Genuine triage runtime bugs that don't classify still reach Sentry unchanged.
Sentry-Issue: TAURI-RUST-1V
e8397ab to 1733f06
M3gA-Mind left a comment
PR #2689 — fix(composio/triage): demote provider-config-rejection rollups (Sentry TAURI-RUST-1V)
Walkthrough
Two surgical, additive changes that close a Sentry noise loop around [composio][triage] run_triage failed (10.7 k events / 14 d). The inner provider layer (reliable.rs + compatible.rs) already correctly classifies and demotes budget-exhausted / provider-config-rejection errors; the outer composio bus re-emitted the final aggregated error via a raw tracing::error! that bypassed the central classifier entirely. This PR routes that re-emit through report_error_or_expected and adds the canonical reliable-chain remediation phrase to the classifier so the full rollup message classifies on its way through. Both changes are consistent with the existing pattern for this class of Sentry noise (see #2481, #2612). No user-visible behaviour changes.
Changes
| File | Summary |
|---|---|
| `src/openhuman/inference/provider/config_rejection.rs` | Adds `"may not be available on your provider"` to `PHRASES`; new test covers the full rollup body and the bare remediation phrase |
| `src/openhuman/memory_sync/composio/bus.rs` | Replaces raw `tracing::error!` at triage-failure site with `report_error_or_expected`; formats detail string to preserve label + anyhow error chain |
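A minimal sketch of the phrase-table matcher described in the first row, assuming a plain substring check. The names `PHRASES` and `is_provider_config_rejection_message` come from this PR; the body, the single-entry table, and the case-insensitive comparison are assumptions made for illustration, not the actual implementation.

```rust
/// Phrase table for the sketch. Only the phrase added by this PR is listed;
/// the real table's other entries are not shown in this conversation.
const PHRASES: &[&str] = &["may not be available on your provider"];

/// Returns true when any known phrase occurs anywhere in the message.
/// Lowercasing before the comparison is an assumption of this sketch.
fn is_provider_config_rejection_message(message: &str) -> bool {
    let lowered = message.to_lowercase();
    PHRASES.iter().any(|phrase| lowered.contains(*phrase))
}
```

Because the check is a substring match over the whole message, it would classify both the multi-line rollup (where the phrase sits on the last line) and the bare single-line remediation phrase, which are the two cases the new test is said to cover.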
Actionable comments (0)
None. The implementation is correct.
Verified / looks good
- Sole-producer claim confirmed. `"may not be available on your provider"` appears only in `reliable.rs:324` (the producer) plus the classifier and its tests. The phrase is tight enough that no unrelated error would match it.
- `report_error_or_expected` call site is correct. Signature is `(err: &E, domain, operation, extra: &[Tag<'_>])` where `Tag<'a> = (&'a str, &'a str)`. The call passes `detail.as_str()` (full error chain via `{e:#}`), `"composio"`, `"trigger_triage"`, and `&[("label", ...)]`. The `label` field is preserved both inline in the detail string (visible in classified/demoted logs) and as a structured tag (used by Sentry for genuine errors). This mirrors how other bus handlers in the codebase call the classifier.
- `apply_decision` error path intentionally left as `tracing::error!`. That path fires when triage succeeds but the follow-on action fails: a genuine runtime error, not a provider-config rollup. Leaving it is correct.
- Tests. `detects_reliable_chain_exhaustion_rollup` covers the full multi-line rollup body (the TAURI-RUST-1V wire shape) and the bare single-line remediation phrase. Existing polarity tests (`ignores_transient_and_server_and_unrelated`, `does_not_classify_unrelated_tools_phrases_as_config_rejection`) continue to run unchanged.
- All 25 CI checks pass (Rust core tests, coverage ≥ 80%, E2E, fmt/clippy).
- No merge conflicts with current main.
- Project conventions followed: no new files at `src/openhuman/` root, no dynamic imports, no secret logging, debug logging preserves the `[composio][triage]` grep-friendly prefix.
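The call shape verified above can be sketched with a stand-in function. The parameter order and the `Tag` alias follow the signature quoted in the review; everything else is hypothetical: the `Outcome` return type exists only to make the sketch observable, the inline phrase check stands in for the real classifier, and the `println!`/`eprintln!` calls stand in for the breadcrumb and Sentry paths.

```rust
use std::fmt::Display;

/// Tag shape quoted in the review: a (key, value) pair of string slices.
type Tag<'a> = (&'a str, &'a str);

/// Hypothetical return type; the real function's return type is not shown here.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Classified as expected and demoted to an info-level breadcrumb.
    Expected,
    /// Not classified, so it is reported as an error.
    Reported,
}

/// Stand-in with the parameter order quoted in the review.
fn report_error_or_expected<E: Display + ?Sized>(
    err: &E,
    domain: &str,
    operation: &str,
    extra: &[Tag<'_>],
) -> Outcome {
    let message = err.to_string();
    let tags: Vec<String> = extra.iter().map(|(k, v)| format!("{k}={v}")).collect();
    // Placeholder for the central classifier: a single phrase check.
    if message.contains("may not be available on your provider") {
        println!("INFO [{domain}][{operation}] expected: {message} {tags:?}");
        Outcome::Expected
    } else {
        eprintln!("ERROR [{domain}][{operation}] {message} {tags:?}");
        Outcome::Reported
    }
}
```

A call in the shape described by the review would look like `report_error_or_expected(detail.as_str(), "composio", "trigger_triage", &[("label", label)])`, where `detail` is a `String` holding the label plus the full error chain and `label` is a `&str`.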
Summary
- Route `[composio][triage] run_triage failed` through the central observability classifier instead of raw `tracing::error!`, so user-config / budget-exhausted rollups from the upstream provider chain get demoted to info-level breadcrumbs.
- Add `"may not be available on your provider"` to `is_provider_config_rejection_message`, the canonical phrase emitted by `reliable.rs:332` when the user's `reliability.model_fallbacks` chain is misconfigured.
Problem
Self-hosted Sentry `tauri-rust` project's #4 unresolved by event count (14d, sort=freq) is `[composio][triage] run_triage failed` at 10,692 events. Breadcrumbs show every event boils down to:
The reliable-provider stack and the inference observability layer already classify these as user-config, but the bus-level re-emit at `memory_sync/composio/bus.rs:354` used a raw `tracing::error!` that bypassed the classifier entirely.
Root cause is user-side: the user's `reliability.model_fallbacks` config doesn't list a model the provider actually serves. Remediation = fix that config. Sentry has no remediation path.
Solution
Two surgical changes, kept as separate micro-commits for review:
- `src/openhuman/inference/provider/config_rejection.rs`: append `"may not be available on your provider"` to the `PHRASES` table feeding `is_provider_config_rejection_message`. Canonical phrase, anchored to `reliable.rs:332` (the sole producer in-tree). Comment cross-links the call site so future drift of that wording is caught at review.
- `src/openhuman/memory_sync/composio/bus.rs`: swap the raw `tracing::error!` at L354 with `crate::core::observability::report_error_or_expected(..., "composio", "trigger_triage", &[("label", …)])`. Same structured fields as before, but now the classifier runs. Genuine runtime bugs that don't classify still surface as full Sentry errors.
The two changes are additive; either landed alone would partially help, but only the pair fully closes the loop.
Submission Checklist
- …`diff-cover`) meet the gate enforced by `.github/workflows/coverage.yml`. Run `pnpm test:coverage` and `pnpm test:rust` locally; PRs below 80% on changed lines will not merge.
- …`## Related`
- …`Sentry-Issue: TAURI-RUST-1V` is in `## Related` instead of `Closes #NNN`.
Impact
- …`info!` breadcrumbs in local trace, so support can still inspect them. Sustained outages, were they ever to indicate a real triage bug, would surface via separate health/escalation paths.
Related
- …`tauri-rust` project, 10,692 events / 14d)
- …`OPENHUMAN-(TAURI|REACT|CORE)-…` and `BACKEND-ALPHAHUMAN-…` shapes; self-hosted prefix is `TAURI-RUST-…`. Widen the regex (separate workflow-tooling PR) so the post-merge Sentry resolve sweep finds these IDs.
AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
Validation Run
- `pnpm --filter openhuman-app format:check` not needed.
- `pnpm typecheck` not needed.
Validation Blocked
Behavior Changes
Parity Contract
Duplicate / Superseded PR Handling
Summary by CodeRabbit
Bug Fixes
Chores