Test: Caret moves to incorrect thought from note #3973
Conversation
Co-authored-by: raineorshine <750276+raineorshine@users.noreply.github.com>
@copilot Run the puppeteer caret tests with
The test was failing in CI with a 20s timeout because after paste the cursor wasn't on
@fbmcipher Here are some questions for investigation. You can view the agent's two sessions by clicking the "View Session" buttons above.
Thank you! With the new GitHub Copilot seat I am now able to review the session and let you know what went wrong here.
The solution: by writing a skill or using a custom agent, I am able to get Copilot to follow instructions more reliably. I've only run this a couple of times in my fork, but so far it looks more reliable. I think this technique works because it breaks a single monolithic instructions file out into smaller, focused pieces. By that same logic, breaking out our single instructions file should help as well. (It's worth noting that GitHub's official docs state that Copilot can't consistently follow links in custom instructions. Currently, the background & contributing guidelines are linked as URLs. We would probably need to break those out as well.)
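As a sketch, a broken-out instructions file might look like the following. The path convention `.github/instructions/*.instructions.md` with an `applyTo` frontmatter comes from GitHub's custom-instructions docs, but this particular filename and its contents are hypothetical:

```markdown
---
applyTo: "src/e2e/**"
---

# E2E test instructions (hypothetical example)

- Run the puppeteer suite after every change to e2e helpers or tests.
- Do not mark the task complete until the suite passes locally or in CI.
```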
The agent spent 2–3 minutes figuring out how to run the tests. However, upon applying a fix, it did not re-run them. Testing showed Copilot can run the tests when explicitly instructed. The solution to this problem: add a custom instruction or agent behavior that pushes a commit, then waits for CI runs to complete. Copilot can then read the CI run and continue to fix the submission if there's work left to do. GitHub Copilot simply relies on the existing CI pipelines to validate whether a commit is good or bad. This worked well in my testing.
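The push-then-wait loop could be sketched like this. The Checks API endpoint ("list check runs for a Git reference") is real GitHub REST API, but `waitForChecks`, the token handling, and the 30-second polling interval are placeholders of my own, not anything Copilot actually ships:

```typescript
// Sketch of the "push, then wait for CI" workflow described above.
// The Checks API endpoint is GitHub's; owner/repo/ref, the token env var,
// and the polling interval are placeholder assumptions.

type CheckRun = { status: string; conclusion: string | null }

// Pure predicates so the decision logic is testable without the network.
const allCompleted = (runs: CheckRun[]): boolean =>
  runs.length > 0 && runs.every(r => r.status === 'completed')

const allPassed = (runs: CheckRun[]): boolean => runs.every(r => r.conclusion === 'success')

// Poll the commit's check runs until every one has finished,
// then report whether they all passed.
const waitForChecks = async (owner: string, repo: string, ref: string): Promise<boolean> => {
  for (;;) {
    const res = await fetch(`https://api.github.com/repos/${owner}/${repo}/commits/${ref}/check-runs`, {
      headers: {
        Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
        Accept: 'application/vnd.github+json',
      },
    })
    const { check_runs } = (await res.json()) as { check_runs: CheckRun[] }
    if (allCompleted(check_runs)) return allPassed(check_runs)
    await new Promise(resolve => setTimeout(resolve, 30_000))
  }
}
```

The agent would call `waitForChecks(owner, repo, sha)` after each push and only continue once it returns, using the boolean to decide whether more fixes are needed.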
Claude Code was able to successfully add the test and ensure it passed in a single prompt. In general, Claude Code is much more disciplined about running tests, even without explicit instruction. It's interesting to see wildly different performance despite running the same model. This is the power of the harness: the system prompts, tools, and decision trees that guide the agent toward completion. Despite both using Claude Opus 4.6, Copilot and Claude Code perform differently because their harnesses prompt them differently. I'll post more general suggestions in our AI discussions thread, but the main takeaways:
Sounds good to me.
I was hoping to avoid duplication, but perhaps that was wishful thinking.
Good to know. It would be nice to get Copilot to behave correctly, as running cloud agents is going to be a lot more flexible and parallelizable.
When the caret is in a note and the user clicks a different thought, the caret incorrectly lands on the note's parent thought instead of the clicked thought.
Changes
- `src/e2e/puppeteer/helpers/clickNote.ts` — New `clickNote` helper, counterpart to `clickThought`, that waits for and clicks a `[aria-label="note-editable"]` element matching the given value.
- `src/e2e/puppeteer/__tests__/caret.ts` — New regression test covering "Caret moves to incorrect Thought from Note" (#3956): clicks into a note on `Two`, then clicks `One`, and asserts the caret lands on `One`.

The test fails before `ac7f9d2` (which simplified `Note.onBlur` to use `selection.isThought()`) and passes after.
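For illustration, the `clickNote` helper could look roughly like this. `PageLike` stands in for puppeteer's `Page` so the sketch is self-contained, and the actual selector and signature in the repo may differ:

```typescript
// A sketch of the clickNote helper described above, assuming the repo's
// e2e conventions. PageLike is a minimal stand-in for puppeteer's Page.
type PageLike = {
  waitForSelector: (selector: string) => Promise<{ click: () => Promise<void> } | null>
}

// Selector for a note editable whose text contains the given value.
// (The real helper may match elements differently.)
const noteSelector = (value: string): string =>
  `xpath///div[@aria-label="note-editable"][contains(., "${value}")]`

// Wait for the matching note element, then click it.
const clickNote = async (page: PageLike, value: string): Promise<void> => {
  const el = await page.waitForSelector(noteSelector(value))
  if (!el) throw new Error(`Note "${value}" not found`)
  await el.click()
}
```

The test then simply calls `clickNote(page, 'Two')` before clicking `One` and asserting on the caret position.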