
DRAFT: Add nightly integration test workflow #385

Draft
xingyaoww wants to merge 1 commit into main from nightly-integration-tests

Conversation

@xingyaoww
Contributor

  • A human has tested these changes.

Why

We need a nightly integration test to catch regressions in the live Agent Server ↔ frontend stack, similar to the integration-runner in software-agent-sdk.

Summary

Adds .github/workflows/nightly-integration.yml — a GitHub Actions workflow that runs the Playwright live E2E suite on a recurring schedule and on demand.

Triggers

| Trigger | When |
| --- | --- |
| Nightly schedule | 10:30pm UTC every day (main repo only) |
| workflow_dispatch | Manual trigger with optional model, pr_number, and reason inputs |
| integration-test label | When added to a same-repo PR (via pull_request_target) |
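
As a rough illustration of how these three triggers could be declared, here is a minimal sketch of an `on:` block. It is not copied from the PR's workflow file: the workflow name and the input descriptions and types are assumptions, and only the cron time, the input names, and the event names come from the description above.

```yaml
# Hypothetical sketch of the trigger block; not the PR's actual workflow file.
name: Nightly integration tests

on:
  schedule:
    # 22:30 UTC every day (fields: minute hour day-of-month month day-of-week)
    - cron: "30 22 * * *"
  workflow_dispatch:
    inputs:
      model:
        description: "LLM model to test against (optional)"
        required: false
        type: string
      pr_number:
        description: "PR number to post results to (optional)"
        required: false
        type: string
      reason:
        description: "Why this run was triggered (optional)"
        required: false
        type: string
  pull_request_target:
    # Fires when any label is added; the job still has to check the label name.
    types: [labeled]
```

The "main repo only" restriction on the schedule is not expressible in `on:` itself; it would typically be a job-level `if:` comparing `github.repository` against the upstream repository name.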

What it does

  1. Starts a real local Agent Server + UI stack via npm run test:e2e:live
  2. Runs the Playwright live E2E suite against a real LLM (default: openhands/claude-haiku-4-5-20251001 via the OpenHands LLM proxy)
  3. Uploads Playwright reports and test results as artifacts (30-day retention)
  4. Posts results as PR comments when triggered from a PR
  5. Logs results to stdout for scheduled runs (posting to a tracker issue is a placeholder; update the issue number when one is created)
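
The steps above could be laid out roughly as follows. This is a sketch under stated assumptions, not the workflow added by the PR: the job name, Node version, `LLM_MODEL` variable name, artifact name, and report paths (`playwright-report/` and `test-results/` are Playwright's defaults) are all illustrative.

```yaml
# Hypothetical job sketch; step names, env var names, and paths are assumptions.
jobs:
  live-e2e:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write   # needed to comment on the PR
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx playwright install --with-deps

      - name: Run live E2E suite
        run: npm run test:e2e:live
        env:
          # LLM_MODEL is an assumed variable name; falls back to the default model.
          LLM_MODEL: ${{ inputs.model || 'openhands/claude-haiku-4-5-20251001' }}

      - name: Upload Playwright report and test results
        if: always()   # upload even when tests fail
        uses: actions/upload-artifact@v4
        with:
          name: playwright-live-e2e
          path: |
            playwright-report/
            test-results/
          retention-days: 30

      - name: Comment on PR
        if: always() && (github.event.pull_request.number || inputs.pr_number)
        env:
          GH_TOKEN: ${{ github.token }}
          PR_NUMBER: ${{ github.event.pull_request.number || inputs.pr_number }}
        run: |
          gh pr comment "$PR_NUMBER" --repo "$GITHUB_REPOSITORY" \
            --body "Live E2E run finished with status: ${{ job.status }}"
```

Using `if: always()` on the upload and comment steps keeps reports and PR feedback available for failing runs, which is when they matter most.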

Security

  • Fork PRs are excluded from pull_request_target via explicit same-repo check
  • LLM credentials are only injected into the test step, not at job level
  • persist-credentials: false prevents checkout from leaking tokens
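
A minimal sketch of how these three safeguards might look in a workflow, assuming a single job. The job name and the `LLM_API_KEY` secret name are hypothetical; the contexts (`github.event.label.name`, `github.event.pull_request.head.repo.full_name`, `github.repository`) and the `persist-credentials` checkout option are standard GitHub Actions features.

```yaml
# Hypothetical sketch of the three safeguards; secret and variable names are assumptions.
jobs:
  live-e2e:
    runs-on: ubuntu-latest
    # For label-triggered runs, require the expected label and a head branch
    # that lives in this repository (fork PRs have a different head repo).
    if: >-
      github.event_name != 'pull_request_target' ||
      (github.event.label.name == 'integration-test' &&
       github.event.pull_request.head.repo.full_name == github.repository)
    steps:
      - uses: actions/checkout@v4
        with:
          # On PR runs, test the PR head; otherwise the triggering commit.
          ref: ${{ github.event.pull_request.head.sha || github.sha }}
          # Do not leave the token in the local git config.
          persist-credentials: false

      - name: Run live E2E suite
        run: npm run test:e2e:live
        env:
          # Scoped to this step only; LLM_API_KEY is an assumed secret name.
          LLM_API_KEY: ${{ secrets.LLM_API_KEY }}
```

The same-repo check matters because `pull_request_target` runs with access to repository secrets, so checking out and executing code from a fork's head would expose them to untrusted changes.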

TODO (follow-ups)

  • Create a tracking issue for nightly results and wire up the gh issue comment call
  • Add multi-model matrix support (like software-agent-sdk's resolve_model_config.py)
  • Add consolidated report generation across models
  • Consider adding more live E2E test scenarios beyond the current terminal conversation smoke test

Type

  • Bug fix
  • Feature
  • Refactor
  • Breaking change
  • Docs / chore

Notes

This PR was created by an AI agent (OpenHands) on behalf of the user.

@xingyaoww can click here to continue refining the PR

Adds a scheduled GitHub Actions workflow that runs the live Agent Server
E2E tests nightly, similar to the integration-runner in software-agent-sdk.

Triggers:
- Nightly schedule (10:30pm UTC)
- Manual workflow_dispatch with optional model/PR number overrides
- PR label 'integration-test' (same-repo PRs only, via pull_request_target)

The workflow:
- Starts a real local Agent Server + UI stack via npm run test:e2e:live
- Runs the Playwright live E2E suite against the configured LLM
- Uploads Playwright reports and test results as artifacts
- Posts results as PR comments when triggered from a PR
- Posts results to a tracking issue for scheduled runs (placeholder)

Default model: openhands/claude-haiku-4-5-20251001 via the OpenHands LLM proxy.

Co-authored-by: openhands <openhands@all-hands.dev>
@vercel

vercel Bot commented May 12, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agent-canvas Error Error May 12, 2026 6:24pm


@github-actions

PR Artifacts Notice

This PR contains a .pr/ directory with PR-specific artifacts. This directory will be automatically removed when the PR is approved.

Fork PRs require manual cleanup before merging.
