Conversation
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
|
Updated the implementation to make the rule stricter: test output alone no longer counts as PR evidence. I also ran the changed code path directly and captured runtime evidence for this PR. Command run: uv run --with openhands-sdk --with openhands-tools --with lmnr python - <<'PY'
import importlib.util
import os
import sys
from pathlib import Path
repo = Path.cwd()
scripts_dir = repo / 'plugins' / 'pr-review' / 'scripts'
sys.path.insert(0, str(scripts_dir))
spec = importlib.util.spec_from_file_location('pr_review_agent_script', scripts_dir / 'agent_script.py')
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
os.environ.update({
'LLM_API_KEY': 'demo-key',
'GITHUB_TOKEN': 'demo-token',
'PR_NUMBER': '102',
'PR_TITLE': 'Require evidence in roasted PR reviews',
'PR_BODY': '## Summary\nAdds an evidence requirement.',
'PR_BASE_BRANCH': 'main',
'PR_HEAD_BRANCH': 'openhands/issue-102-require-evidence',
'REPO_NAME': 'OpenHands/extensions',
'REVIEW_STYLE': 'roasted',
'REQUIRE_EVIDENCE': 'true',
})
config = module.validate_environment()
prompt = module.format_prompt(
skill_trigger='/codereview-roasted',
title=config['pr_info']['title'],
body=config['pr_info']['body'],
repo_name=config['pr_info']['repo_name'],
base_branch=config['pr_info']['base_branch'],
head_branch=config['pr_info']['head_branch'],
pr_number=config['pr_info']['number'],
commit_id='abc123',
diff='diff --git a/plugins/pr-review/scripts/prompt.py b/plugins/pr-review/scripts/prompt.py',
review_context='',
require_evidence=config['require_evidence'],
)
print('require_evidence:', config['require_evidence'])
print('review_style:', config['review_style'])
print('contains Evidence section:', '## PR Description Evidence Requirement' in prompt)
print('contains test-output rejection:', 'Tests alone do **not** count as evidence.' in prompt)
print('contains end-to-end wording:', 'real code path end-to-end' in prompt)
print('contains conversation link guidance:', 'https://app.all-hands.dev/conversations/{conversation_id}' in prompt)
PYObserved output: This exercises the actual setting flow through repository code:
|
all-hands-bot
left a comment
There was a problem hiding this comment.
🟢 Good taste - Clean, pragmatic implementation that solves a real problem.
VERDICT: ✅ Merge it
KEY INSIGHT: This is exactly the kind of straightforward, non-breaking feature addition that demonstrates good engineering judgment - optional flag, sensible default, clear data flow from env → config → prompt, and thorough evidence in the PR description proving it actually works.
| - For frontend/UI changes: a screenshot or video demonstrating the implemented behavior in the real product | ||
| - For backend, API, CLI, or script changes: the exact command(s) used to run the real code path end-to-end and the resulting output | ||
| - Tests alone do not count as evidence; reject `pytest`, unit test output, or similar test runs when they are the only proof provided | ||
| - For agent-generated work when available: a link back to the originating conversation, e.g. `https://app.all-hands.dev/conversations/{conversation_id}` |
There was a problem hiding this comment.
Can we maybe leave this out? I'm not sure the agent knows where it originated from 😆
On a side note, maybe it would be (really) nice to give the link of the public share, now that the cloud has public share. But I don't know how the agent finds that out or if we can create public share without the user intently creating it... 🤔
e.g. https://app.all-hands.dev/shared/conversations/bfca083e57ce4d748a489ff4fb703637
enyst
left a comment
There was a problem hiding this comment.
Thank you for this! I always do this manually, asking the agent to "live test" or run examples itself, and sometimes upload its results in .pr/ directory for human inspection.
(by the way, maybe we could mention examples/ directory... or not in the main extensions content, in the repo instructions I assume)
Co-authored-by: Engel Nyst <engel.nyst@gmail.com>
|
@OpenHands check and fix failing tests (see the conversation history on this issue for reference as necessary) |
|
I'm on it! neubig can track my progress at all-hands.dev |
Co-authored-by: openhands <openhands@all-hands.dev>
|
Final update:
Checklist:
Git:
So the request has been fully addressed, the change was minimal, and the branch has already been updated remotely. |
Co-authored-by: openhands <openhands@all-hands.dev>
Summary
require-evidenceinput to the PR review actionEvidencesection in PR descriptionscodereview-roastedguidance and prompt tests to match the stricter ruleFixes #102.
Testing
uv run --group test pytestEvidence
agent_script.pyreadingREQUIRE_EVIDENCE=trueand passing it intoformat_prompt()):