Hi team! Thank you for the great project.
I found that `CORRECTNESS_PROMPT` in `tests/prompts.py` does not use `user_question`, `report`, or `answer`:
open_deep_research/tests/prompts.py, lines 95 to 104 at b419df8:

```python
CORRECTNESS_PROMPT = """You are evaluating the correctness of a research report that was generated by a research agent.
You will be provided with the question, the report, and the answer from an independent authority.
Score the report from 1-5 on how well it mirrors the answer from the authority.
We expect the report to contain more information that is not in the answer, that's perfectly okay.
They likely won't be perfectly the same, but they should have the same themes and ideas to get a high score.
Use your best judgement when comparing the answer to the report!
"""
```
However, the `eval_correctness` function in `tests/evaluators.py` formats the prompt with these fields:
open_deep_research/tests/evaluators.py, line 112 at b419df8:

```python
user_input_content = CORRECTNESS_PROMPT.format(user_question=query, report=final_report, answer=answer, today=get_today_str())
```
As a result, the `format()` call has no effect, and the evaluator receives a prompt that lacks the actual question, the generated report, and the authoritative answer.
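(For context: `str.format()` silently ignores keyword arguments that have no matching placeholder, so the call succeeds but returns the template unchanged. A quick illustration:)

```python
>>> "Static prompt text with no placeholders.".format(
...     user_question="q", report="r", answer="a", today="2024-01-01"
... )
'Static prompt text with no placeholders.'
```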
Could you please confirm whether this is a bug?
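If it is, a minimal fix might be to add the missing placeholders to the prompt so they match the keyword arguments passed in `eval_correctness`. The sketch below is only a suggestion; the exact wording and placement of the fields is up to you:

```python
# Sketch: same instructions as today, plus placeholders for the fields
# that eval_correctness already passes (user_question, report, answer, today).
CORRECTNESS_PROMPT = """You are evaluating the correctness of a research report that was generated by a research agent.
You will be provided with the question, the report, and the answer from an independent authority.
Score the report from 1-5 on how well it mirrors the answer from the authority.
We expect the report to contain more information that is not in the answer, that's perfectly okay.
They likely won't be perfectly the same, but they should have the same themes and ideas to get a high score.
Use your best judgement when comparing the answer to the report!

Today's date: {today}

<question>
{user_question}
</question>

<report>
{report}
</report>

<authoritative_answer>
{answer}
</authoritative_answer>
"""
```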