fix: improve error handling and response in Walled AI input guardrail #244

Merged
amithad merged 1 commit into develop from Feature/Walled-AI
Mar 10, 2026

JohnPraveenYL commented Mar 9, 2026

Description

This PR improves the WalledAI guardrail flow for correctness, safety, and consistency with other guardrail providers.
It fixes request handling edge cases, improves failure behavior, and preserves tracing context in output unmasking.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update
  • CI/CD update
  • Other (please describe):

Related Issues

Fixes #
Relates to #

Changes Made

  • Improved error handling for WalledAI redaction failures to return a safe AgentReplyText (instead of propagating exceptions), aligning behavior with OpenAI/Bedrock guardrails.
  • Preserved the prompt when returning the unmasked AgentReplyText, so tracing/instrumentation keeps the input-output context.
  • Updated relevant WalledAI documentation to reflect per-request processing and non-text pass-through behavior.
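
The first change above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `AgentReplyText` dataclass is a hypothetical stand-in for the project's class, and `redact_or_safe_reply`, `SAFE_FALLBACK_TEXT`, and the `redact` callable are names assumed here for illustration only.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional, Union

logger = logging.getLogger(__name__)


# Hypothetical stand-in for the project's AgentReplyText; the real class
# is not shown in this PR and may have different fields.
@dataclass
class AgentReplyText:
    text: str
    prompt: Optional[str] = None


# Assumed wording; the actual safe message is not given in the PR.
SAFE_FALLBACK_TEXT = "Your request could not be processed."


def redact_or_safe_reply(
    prompt: str, redact: Callable[[str], str]
) -> Union[str, AgentReplyText]:
    """Return the redacted prompt, or a safe reply if redaction fails."""
    try:
        return redact(prompt)
    except Exception:
        # Log the failure and return a safe reply instead of re-raising,
        # so the caller never sees the raw exception.
        logger.exception("Redaction failed; returning a safe reply")
        return AgentReplyText(text=SAFE_FALLBACK_TEXT)
```

In this sketch the fallback reply deliberately omits the original prompt, since a prompt that failed redaction may still contain sensitive text; whether the real implementation does the same is not stated in the PR.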

Testing

  • Unit tests pass locally
  • Integration tests pass locally
  • Manual testing completed
  • New tests added for changes

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Screenshots (if applicable)

N/A

Additional Notes

JohnPraveenYL requested a review from amithad March 9, 2026 14:13
amithad marked this pull request as ready for review March 10, 2026 08:28
Copilot AI review requested due to automatic review settings March 10, 2026 08:28
amithad merged commit 8049be5 into develop Mar 10, 2026
57 of 58 checks passed
amithad deleted the Feature/Walled-AI branch March 10, 2026 08:28

Copilot AI left a comment

Pull request overview

This PR improves error handling and response consistency in the WalledAI input/output guardrail, aligning behavior with the OpenAI and Bedrock guardrail providers already present in the codebase.

Changes:

  • Redaction failures now return a safe AgentReplyText instead of re-raising the exception, making error behavior consistent with other guardrail providers.
  • The output unmasking path now preserves the prompt from agent_reply when constructing the unmasked AgentReplyText, improving tracing context.
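
The prompt-preservation behavior in the second bullet might look something like the sketch below. It assumes a simple placeholder-to-original mapping for unmasking; `AgentReplyText` is again a hypothetical stand-in, and `unmask_reply` and `mapping` are illustrative names that do not come from the PR.

```python
from dataclasses import dataclass
from typing import Dict, Optional


# Hypothetical stand-in for the project's AgentReplyText.
@dataclass
class AgentReplyText:
    text: str
    prompt: Optional[str] = None


def unmask_reply(
    agent_reply: AgentReplyText, mapping: Dict[str, str]
) -> AgentReplyText:
    """Restore masked values in the reply text and keep the original prompt."""
    unmasked = agent_reply.text
    for placeholder, original in mapping.items():
        unmasked = unmasked.replace(placeholder, original)
    # Carry the prompt over so tracing can still pair input with output.
    return AgentReplyText(text=unmasked, prompt=agent_reply.prompt)


reply = AgentReplyText(text="Hello [NAME_1]", prompt="Greet [NAME_1]")
restored = unmask_reply(reply, {"[NAME_1]": "Ada"})
```

Building a new object rather than mutating `agent_reply` keeps the masked reply available to any caller that still holds a reference to it.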

3 participants