Skip to content

fix(triage): defer malformed cloud replies #2415

Open
aqilaziz wants to merge 1 commit into tinyhumansai:main from aqilaziz:codex/2322-composio-triage-guard

@aqilaziz aqilaziz commented May 21, 2026

Summary

  • Treat malformed classifier replies from the cloud retry arm as retryable-exhausted instead of fatal.
  • Let two malformed cloud replies fall through to local fallback, or return TriageOutcome::Deferred when no local arm is available.
  • Add regression coverage for both paths so Composio background triage does not bubble this case into [composio][triage] run_triage failed error events.

Problem

Composio trigger triage runs in a background task. The first malformed cloud classifier reply was retried, but a malformed reply on CloudAfterRetry became ArmError::Fatal. That returned Err from run_triage, causing the Composio subscriber to emit the Sentry-grouping error log from #2322.
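To make the failure mode concrete, here is a minimal sketch of how a background subscriber might turn the result of run_triage into a log line. Everything in it is an assumption made for illustration: the `Level` enum, the `log_for` helper, the simplified `TriageOutcome`, and the non-error messages are hypothetical, and only the `[composio][triage] run_triage failed` prefix comes from the issue.

```rust
// Hypothetical stand-ins; the real types in the triage evaluator and the
// Composio subscriber may look different.
#[derive(Debug, PartialEq)]
enum TriageOutcome {
    Classified,
    Deferred { reason: String },
}

#[derive(Debug, PartialEq)]
enum Level {
    Info,
    Error,
}

/// Decide what the background task logs for a finished triage run.
fn log_for(result: &Result<TriageOutcome, String>) -> (Level, String) {
    match result {
        // Only an `Err` produces the error-level line that gets grouped
        // as an error event.
        Err(e) => (
            Level::Error,
            format!("[composio][triage] run_triage failed: {}", e),
        ),
        // A deferred outcome is still `Ok`, so it stays below error level.
        Ok(TriageOutcome::Deferred { reason }) => (
            Level::Info,
            format!("[composio][triage] deferred: {}", reason),
        ),
        Ok(TriageOutcome::Classified) => {
            (Level::Info, "[composio][triage] classified".to_string())
        }
    }
}

fn main() {
    // Before the fix: a second malformed cloud reply surfaced as `Err`.
    let before: Result<TriageOutcome, String> =
        Err("malformed classifier reply".to_string());
    // After the fix: the same situation resolves to a deferred outcome.
    let after: Result<TriageOutcome, String> = Ok(TriageOutcome::Deferred {
        reason: "cloud arm exhausted".to_string(),
    });
    let normal: Result<TriageOutcome, String> = Ok(TriageOutcome::Classified);

    for result in [&before, &after, &normal] {
        println!("{:?}", log_for(result));
    }
}
```

Under these assumptions, the change only has to keep run_triage on the `Ok` side for malformed cloud replies; the subscriber itself needs no modification.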

Solution

Cloud parse failures are now handled like other retryable cloud failures: retry once, then exhaust the cloud arm and continue to local/Deferred. Local parse failure remains fatal inside the local arm, which the outer chain already converts into a deferred outcome.
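The arm chain described above can be sketched roughly as follows. This is not the evaluator's actual code: `run_triage_sketch`, `double_cloud_parse_failure`, the closure-based `try_arm`, and the `Decided { via }` variant are hypothetical names chosen for the example, and the real `ArmError` and `TriageOutcome` types may carry different data.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Arm {
    Cloud,
    CloudAfterRetry,
    Local,
}

#[derive(Debug, PartialEq)]
enum ArmError {
    Retryable(String),
    Fatal(String),
}

#[derive(Debug, PartialEq)]
enum TriageOutcome {
    Decided { via: Arm },
    Deferred { reason: String },
}

/// Walk the chain: cloud, cloud retry, then local (if present) or deferred.
fn run_triage_sketch<F>(has_local: bool, mut try_arm: F) -> Result<TriageOutcome, String>
where
    F: FnMut(Arm) -> Result<(), ArmError>,
{
    for arm in [Arm::Cloud, Arm::CloudAfterRetry] {
        match try_arm(arm) {
            Ok(()) => return Ok(TriageOutcome::Decided { via: arm }),
            // Retryable failures (now including parse failures) move on.
            Err(ArmError::Retryable(_)) => continue,
            // Fatal cloud errors (e.g. auth/model/config) still bubble up.
            Err(ArmError::Fatal(e)) => return Err(e),
        }
    }
    if !has_local {
        return Ok(TriageOutcome::Deferred {
            reason: "cloud exhausted, no local arm".to_string(),
        });
    }
    match try_arm(Arm::Local) {
        Ok(()) => Ok(TriageOutcome::Decided { via: Arm::Local }),
        // Any local failure is converted into a deferred outcome.
        Err(ArmError::Retryable(e)) | Err(ArmError::Fatal(e)) => {
            Ok(TriageOutcome::Deferred { reason: e })
        }
    }
}

/// Scripted scenario: both cloud arms return a malformed reply, local succeeds.
/// Returns the outcome together with the number of dispatches.
fn double_cloud_parse_failure(has_local: bool) -> (Result<TriageOutcome, String>, u32) {
    let mut calls = 0;
    let result = run_triage_sketch(has_local, |arm| {
        calls += 1;
        match arm {
            Arm::Cloud | Arm::CloudAfterRetry => {
                Err(ArmError::Retryable("malformed reply".to_string()))
            }
            Arm::Local => Ok(()),
        }
    });
    (result, calls)
}

fn main() {
    let (with_local, calls) = double_cloud_parse_failure(true);
    assert_eq!(with_local, Ok(TriageOutcome::Decided { via: Arm::Local }));
    assert_eq!(calls, 3);

    let (no_local, calls) = double_cloud_parse_failure(false);
    assert!(matches!(no_local, Ok(TriageOutcome::Deferred { .. })));
    assert_eq!(calls, 2);

    // A fatal cloud error is unaffected by the change and still returns Err.
    let fatal = run_triage_sketch(true, |_| Err(ArmError::Fatal("auth".to_string())));
    assert_eq!(fatal, Err("auth".to_string()));
}
```

The dispatch counts in `main` mirror the two regression scenarios named in the PR: three calls when a local arm exists, two when it does not.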

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) — added two triage evaluator regression tests for malformed cloud retry fallback and no-local deferral.
  • Diff coverage ≥ 80% — new branch behavior is covered by focused unit tests.
  • Coverage matrix updated — N/A: no feature matrix row changed.
  • All affected feature IDs from the matrix are listed in ## Related — N/A: no matrix feature ID touched.
  • No new external network dependencies introduced.
  • Manual smoke checklist updated if this touches release-cut surfaces — N/A: Rust triage error handling only.
  • Linked issue closed via Closes #NNN in ## Related.

Impact

  • Runtime/platform: Rust core triage evaluator, especially background Composio trigger triage.
  • User-visible: malformed classifier output now degrades to local fallback or deferred retry behavior instead of producing a Sentry error event.
  • Security/performance: no new IO or dependencies; one existing retry/fallback path is reused.

Tests

  • cargo fmt -- --check — passed.
  • git diff --check — passed.
  • cargo test --lib double_cloud_parse_failure -- --nocapture — attempted; blocked locally by missing libclang for whisper-rs-sys before tests could run.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

  • Branch: codex/2322-composio-triage-guard
  • Commit SHA: 9caae7a9baf55abcb4e3f6d0033a93b25ac1efed

Validation Run

  • pnpm --filter openhuman-app format:check — N/A: no frontend changes.
  • pnpm typecheck — N/A: no TypeScript changes.
  • Focused tests: cargo test --lib double_cloud_parse_failure -- --nocapture — attempted, blocked by local toolchain dependency below.
  • Rust fmt/check (if changed): cargo fmt -- --check passed; git diff --check passed.
  • Tauri fmt/check (if changed): N/A: no Tauri shell changes.

Validation Blocked

  • command: cargo test --lib double_cloud_parse_failure -- --nocapture
  • error: Unable to find libclang: couldn't find any valid shared libraries matching: ['clang.dll', 'libclang.dll']; set LIBCLANG_PATH
  • impact: Local Windows environment cannot build whisper-rs-sys; CI should run the focused Rust tests in an environment with libclang available.

Behavior Changes

  • Intended behavior change: malformed cloud classifier replies no longer make background Composio triage return Err after the retry; they continue to local fallback or deferred outcome.
  • User-visible effect: fewer non-actionable Sentry error events from background trigger triage; trigger processing still records failure/defer telemetry.

Parity Contract

  • Legacy behavior preserved: auth/model/config fatal errors still bubble as fatal; local-arm parse failure still becomes a deferred outcome through the existing local failure branch.
  • Guard/fallback/dispatch parity checks: new tests lock the dispatch count and resolution path for cloud parse retry exhaustion.

Duplicate / Superseded PR Handling

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced retry logic for cloud classifier responses; malformed or non-parseable replies now trigger a single automatic retry before the system falls back to local processing or defers the triage decision.
  • Tests

    • Added comprehensive test coverage for triage routing behavior when cloud returns invalid classifier responses, including scenarios with and without local fallback availability.

@aqilaziz aqilaziz requested a review from a team May 21, 2026 06:06

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between c204a53 and 9caae7a.

📒 Files selected for processing (2)
  • src/openhuman/agent/triage/evaluator.rs
  • src/openhuman/agent/triage/evaluator_tests.rs

Walkthrough

Triage evaluator now treats malformed classifier JSON replies as retryable cloud failures. Parse errors on both initial and retry cloud arms return ArmError::Retryable, enabling fallthrough to local arm or deferred outcome instead of fatal failure. Two end-to-end tests validate local fallback and deferred scenarios.
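As an illustration of the per-arm mapping, the following sketch shows one way a parse failure could be classified depending on which arm produced it. It is a guess at the shape, not the code in evaluator.rs: `try_arm_sketch` and `parse_reply` are hypothetical, and the toy parser accepts two made-up bare labels instead of the real classifier JSON.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Arm {
    Cloud,
    CloudAfterRetry,
    Local,
}

#[derive(Debug, PartialEq)]
enum ArmError {
    Retryable(String),
    Fatal(String),
}

/// Stand-in parser. The real classifier reply is JSON; this toy version
/// only accepts two hypothetical bare labels.
fn parse_reply(raw: &str) -> Result<&'static str, String> {
    match raw.trim() {
        "act" => Ok("act"),
        "ignore" => Ok("ignore"),
        other => Err(format!("unparseable classifier reply: {:?}", other)),
    }
}

/// Map a parse failure to an arm error based on the arm that produced it.
fn try_arm_sketch(arm: Arm, raw_reply: &str) -> Result<&'static str, ArmError> {
    parse_reply(raw_reply).map_err(|e| match arm {
        // Both cloud arms report parse failures as retryable, so the chain
        // can retry once and then fall through.
        Arm::Cloud | Arm::CloudAfterRetry => ArmError::Retryable(e),
        // The local arm keeps treating a parse failure as fatal for that arm.
        Arm::Local => ArmError::Fatal(e),
    })
}

fn main() {
    for arm in [Arm::Cloud, Arm::CloudAfterRetry, Arm::Local] {
        println!("{:?} -> {:?}", arm, try_arm_sketch(arm, "<html>oops</html>"));
    }
    println!("{:?}", try_arm_sketch(Arm::Cloud, " act "));
}
```

Keeping the mapping in one place means the outer chain needs no special case for malformed replies; it only sees Retryable or Fatal.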

Changes

Cloud Parse Failure Retry Handling

  • Retry logic and documentation (src/openhuman/agent/triage/evaluator.rs): Module documentation clarifies that malformed classifier replies are retryable cloud failures. try_arm error handling returns ArmError::Retryable for parse failures on both the Cloud and CloudAfterRetry arms, enabling retry and fallthrough to local/deferred instead of fatal errors.
  • End-to-end retry scenario tests (src/openhuman/agent/triage/evaluator_tests.rs): Two async tests verify retry behavior. The first confirms that the cloud arm is attempted twice (the initial call plus one retry) before falling through to the local arm (3 total calls, LocalFallback resolution). The second confirms a deferred outcome when the local arm is unavailable after the cloud retry (2 calls, TriageOutcome::Deferred with reason).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

A malformed reply once made the flow crash,
Now retry once more, then dash!
To local arms or deferred dreams,
The triage flows smoothly—nothing seems
Broken by bad JSON schemes. 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title 'fix(triage): defer malformed cloud replies' directly addresses the main change: handling malformed cloud classifier replies as retryable/deferrable rather than fatal.
  • Linked Issues check: ✅ Passed. The PR successfully addresses issue #2322 by treating malformed cloud parse failures as retryable exhausted errors that fall through to local/deferred outcomes instead of bubbling up as fatal errors.
  • Out of Scope Changes check: ✅ Passed. All changes are scoped to the triage evaluator module and directly address the linked issue objective of preventing malformed cloud replies from causing fatal errors.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Development

Successfully merging this pull request may close these issues.

[composio][triage] run_triage failed
