task helper #366

Open
lorenss-m wants to merge 4 commits into main from l/task-adjustments

lorenss-m commented Mar 11, 2026

Note

Medium Risk
Introduces a new Task.run() execution path that wraps run_eval() and is now used by Chat.send(), so any behavior differences could affect evaluation tracing/reward propagation. Also changes file/path handling for coding tools on Windows, which may have platform-specific edge cases.

Overview
Adds Task.run() as a one-liner helper to execute a single task (accepting a model string or agent instance), internally creating the eval context via run_eval() and propagating ctx.reward onto the returned Trace.

Refactors Chat.send() and multiple docs/examples to use task.run(...)/env(...).run(...) instead of manually creating an eval context and running the agent.
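As a rough illustration of the flow described above, here is a minimal, self-contained sketch. Every class and function in it (Trace, EvalContext, EchoAgent, and this toy run_eval) is a hypothetical stand-in written for this example, not the actual hud implementation; the real signatures, and the point at which the reward is set, may differ.

```python
import asyncio
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Trace:
    """Hypothetical stand-in for the trace returned by an agent run."""
    content: str
    reward: Optional[float] = None


@dataclass
class EvalContext:
    """Hypothetical stand-in for the eval context produced by run_eval()."""
    reward: Optional[float] = None


class EchoAgent:
    """Toy agent used only for this sketch."""

    def __init__(self, model: str) -> None:
        self.model = model

    async def run(self, ctx: EvalContext, prompt: str) -> Trace:
        ctx.reward = 1.0  # assumption: the context ends up holding the reward
        return Trace(content=f"{self.model}: {prompt}")


@asynccontextmanager
async def run_eval(task: "Task"):
    """Toy replacement for run_eval(): yields a fresh context for one task."""
    yield EvalContext()


@dataclass
class Task:
    prompt: str

    async def run(self, agent: Union[str, EchoAgent]) -> Trace:
        # Accept either a model string or an agent instance.
        if isinstance(agent, str):
            agent = EchoAgent(agent)
        # Create the eval context internally so callers do not have to.
        async with run_eval(self) as ctx:
            trace = await agent.run(ctx, self.prompt)
        # Propagate the context's reward onto the returned trace.
        trace.reward = ctx.reward
        return trace


trace = asyncio.run(Task(prompt="hi").run("toy-model"))
```

With a helper shaped like this, a caller such as Chat.send() only needs a single awaited call instead of opening the context and driving the agent itself.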

Improves Windows compatibility in coding tools by giving a Windows-specific absolute-path error in EditTool.validate_path() and falling back to direct file I/O in read_file_async/write_file_async when shell utilities/heredocs aren’t available.
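The fallback idea can be sketched as follows. This is an illustration under assumptions, not the code in this PR: the function names match the ones mentioned above, but the use_shell parameter, the helper functions, and the availability check are hypothetical and exist only to make the example runnable.

```python
import asyncio
import os
import shutil
from typing import Optional


def _shell_available() -> bool:
    # Assumption: treat Windows, or a missing sh/cat, as "no shell utilities".
    return (
        os.name != "nt"
        and shutil.which("sh") is not None
        and shutil.which("cat") is not None
    )


def _read_direct(path: str) -> str:
    # newline="" keeps line endings exactly as stored on disk.
    with open(path, "r", encoding="utf-8", newline="") as f:
        return f.read()


def _write_direct(path: str, content: str) -> None:
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(content)


async def read_file_async(path: str, use_shell: Optional[bool] = None) -> str:
    if use_shell is None:
        use_shell = _shell_available()
    if use_shell:
        proc = await asyncio.create_subprocess_exec(
            "cat", path,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        out, err = await proc.communicate()
        if proc.returncode != 0:
            raise OSError(err.decode("utf-8", "replace"))
        return out.decode("utf-8")
    # Fallback: plain file I/O in a worker thread (Python 3.9+).
    return await asyncio.to_thread(_read_direct, path)


async def write_file_async(
    path: str, content: str, use_shell: Optional[bool] = None
) -> None:
    if use_shell is None:
        use_shell = _shell_available()
    if use_shell:
        # Pipe the content through stdin instead of building a heredoc.
        proc = await asyncio.create_subprocess_exec(
            "sh", "-c", 'cat > "$1"', "sh", path,
            stdin=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
        )
        _, err = await proc.communicate(content.encode("utf-8"))
        if proc.returncode != 0:
            raise OSError(err.decode("utf-8", "replace"))
        return
    await asyncio.to_thread(_write_direct, path, content)
```

Passing use_shell=False forces the direct path, which makes the fallback testable on any platform.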

Written by Cursor Bugbot for commit e27a71d. This will update automatically on new commits.

cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

mock_result.reward = 1.0

with patch("hud.eval.task.Task.run", new_callable=AsyncMock, return_value=mock_result):
    await chat.send("hello")

Test patches class method but instance is MagicMock

Medium Severity

The patch targets hud.eval.task.Task.run on the Task class, but dummy_task is a MagicMock, not a Task instance. When Chat.send() calls task.run(...), task is the MagicMock returned by model_copy, so task.run(...) invokes MagicMock's auto-generated run attribute — not the patched Task.run. This returns a non-awaitable MagicMock, causing await task.run(...) to raise TypeError. The patch has no effect on the actual code path being tested.
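The behavior described above can be reproduced with the standard library alone. In this sketch, Task and send() are hypothetical stand-ins for hud.eval.task.Task and Chat.send(), reduced to the one call that matters (model_copy() followed by an awaited run()); the argument passed to run() is made up. The second helper shows one possible fix, attaching an AsyncMock to the mock that model_copy() returns.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock, patch


class Task:
    """Hypothetical stand-in for hud.eval.task.Task."""

    async def run(self, model: str) -> str:
        return "real"


async def send(task_template) -> object:
    """Hypothetical stand-in for Chat.send(): copies the task, then runs it."""
    task = task_template.model_copy()
    return await task.run("some-model")


async def patch_class_only() -> str:
    dummy_task = MagicMock()  # a plain MagicMock, not a Task instance
    with patch.object(Task, "run", new_callable=AsyncMock, return_value="patched"):
        try:
            # task.run(...) resolves to MagicMock's auto-created attribute,
            # which returns a non-awaitable MagicMock.
            await send(dummy_task)
        except TypeError:
            return "TypeError"
    return "no error"


async def mock_on_instance() -> object:
    dummy_task = MagicMock()
    # Put the AsyncMock on the object that send() actually calls.
    dummy_task.model_copy.return_value.run = AsyncMock(return_value="patched")
    return await send(dummy_task)


class_patch_outcome = asyncio.run(patch_class_only())
instance_mock_outcome = asyncio.run(mock_on_instance())
```

Patching the class attribute only changes lookups that go through Task or its instances. A bare MagicMock never consults the class, so the mock has to be configured on the instance (or built with a spec that routes through the patched class).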
