Vibetest

Vibetest takes a chat transcript from a vibe-coding app, extracts formal functional requirements from it, tests them using Browser Use, and reports the results. You can then pass this report back to your vibe-coding agent to create a meta-agentic loop.

video-demo.mp4

Structure

Q-NCLC/
├── data/
│   ├── test-cases/    # Input test cases (URL + chat transcripts)
│   ├── results/       # Output logs from test runs
│   └── legacy/        # Old experiment files
└── packages/
    ├── agent1/        # UX task extraction from conversations
    ├── agent2/        # Browser-based UX testing
    ├── agent3/        # Test results grouping and analysis
    ├── vibetester/    # Pipeline orchestrating Agent 1 + Agent 2 + Agent 3
    └── shared/        # Shared utilities (logging, LLM providers)

System Overview

The system consists of three agents working in a pipeline:

Agent 1 (Requirement Extractor): Analyzes natural language conversations between a user and a coding assistant to extract formal testing requirements. It produces a structured list of atomic test steps and uses DSPy for prompt optimization with few-shot learning.
Agent 2 (Browser Tester): Takes the test steps from Agent 1 and executes them in a real browser environment to verify the application's behavior. It acts as an automated user and generates individual test results for each atomic step.
Agent 3 (Test Grouper): Groups related atomic test steps from Agent 1 and their results from Agent 2 into meaningful, cohesive test scenarios. It identifies test types (validation, functional, integration, workflow) and provides aggregated reporting.

Together, they form the Vibetester pipeline, testing the application's behavior based on the user's intent and providing structured test reports.
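To make the data flow between the three agents concrete, here is a minimal sketch of the pipeline shape. It is not the project's implementation: the `AtomicStep` and `StepResult` types, all function names, and the `checker` and `group_of` callbacks are hypothetical stand-ins for the LLM-based extraction, the Browser Use session, and the grouping logic.

```python
from dataclasses import dataclass


@dataclass
class AtomicStep:
    """One atomic test step, as Agent 1 might emit it (hypothetical shape)."""
    step_id: int
    description: str


@dataclass
class StepResult:
    """Outcome of one step, as Agent 2 might report it (hypothetical shape)."""
    step_id: int
    passed: bool


def extract_steps(transcript):
    # Stand-in for Agent 1: treat every transcript message as one requirement.
    return [AtomicStep(i, msg) for i, msg in enumerate(transcript, start=1)]


def run_steps(steps, checker):
    # Stand-in for Agent 2: `checker` replaces the real browser session.
    return [StepResult(s.step_id, checker(s)) for s in steps]


def group_results(steps, results, group_of):
    # Stand-in for Agent 3: bucket steps into scenarios and aggregate pass/fail.
    passed = {r.step_id: r.passed for r in results}
    scenarios = {}
    for s in steps:
        entry = scenarios.setdefault(group_of(s), {"steps": [], "passed": True})
        entry["steps"].append(s.step_id)
        # A scenario passes only if every step in it passed.
        entry["passed"] = entry["passed"] and passed[s.step_id]
    return scenarios


def run_pipeline(transcript, checker, group_of):
    steps = extract_steps(transcript)
    results = run_steps(steps, checker)
    return group_results(steps, results, group_of)
```

The point of the sketch is the hand-off: Agent 3 needs both the steps from Agent 1 and the per-step results from Agent 2, which is why `group_results` takes both.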

Prerequisites

Python 3.12 or higher
uv

Setup

Clone the repo
Sync dependencies:
```
uv sync
```
> [!NOTE]
> uv sync automatically creates a virtual environment in .venv. If you use VS Code and Pylance cannot resolve packages, open the command palette, run "Python: Select Interpreter", and choose the environment in .venv.

Copy .env.example to .env and add your API keys (e.g., BROWSER_USE_API_KEY, OPENAI_API_KEY).
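If a run fails with authentication errors, it can help to confirm that the keys are actually visible to the process. The helper below is a small sketch, not part of the repository; `missing_keys` is a hypothetical name, and it assumes the variables have already been loaded into the environment by whatever mechanism the project uses.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # The two key names come from the setup step above; adjust to your providers.
    missing = missing_keys(["BROWSER_USE_API_KEY", "OPENAI_API_KEY"])
    print("Missing keys:", ", ".join(missing) if missing else "none")
```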

Running the Agents

Vibetester (full pipeline)

Runs the full pipeline: extracts UX requirements from a test case, tests them in a browser, and groups the results into scenarios.

# Recommended: Use unified test case file (contains URL and transcript)
uv run vibetester -tc pitch-humanity-simple.json

# Legacy: Separate transcript and URL
uv run vibetester -t my_transcript.json -u https://myapp.example.com

This will:

Load the test case from ./data/test-cases/pitch-humanity-simple.json
Extract UX requirements using Agent 1
Test them in a browser using Agent 2
Group atomic test results into meaningful scenarios using Agent 3
Save results to ./data/results/ (when --logging is enabled)
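The first step above, loading a unified test case, might look roughly like the sketch below. The directory comes from this README, but the JSON key names `url` and `transcript` are assumptions made for illustration; check a file in data/test-cases/ for the actual schema. The function names are hypothetical as well.

```python
import json
from pathlib import Path

TEST_CASE_DIR = Path("data/test-cases")  # directory named in this README


def parse_test_case(raw):
    """Parse a unified test case and check that it carries a URL and a transcript."""
    data = json.loads(raw)
    # "url" and "transcript" are assumed key names, not the confirmed schema.
    missing = [key for key in ("url", "transcript") if key not in data]
    if missing:
        raise ValueError(f"test case is missing keys: {missing}")
    return data


def load_test_case(name, base=TEST_CASE_DIR):
    """Resolve a bare file name against the test-case directory and parse it."""
    return parse_test_case((base / name).read_text(encoding="utf-8"))
```

Validating the file up front means a malformed test case fails before any LLM call or browser session is started.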

Run uv run vibetester --help for all options, or refer to the dedicated README.

Agent 1 (Standalone)

uv run agent1

Run uv run agent1 --help for all options, or refer to the dedicated README.

Recompiling Agent 1 (DSPy Optimization)

Agent 1 uses a pre-compiled DSPy model for fast inference. To retrain/optimize with new examples:

uv run agent1-compile

See the Agent 1 README for details on the compilation workflow.

Agent 2 (Standalone)

uv run agent2

Run uv run agent2 --help for all options, or refer to the dedicated README.

Note

Do not interact with the browser window spawned by Agent 2 while a test is running; doing so can disrupt the agent.

Logging

Logging is disabled by default. Enable it via:

CLI flag: --logging
Environment variable: LOGGING=true

Logs are saved to ./data/results/ as JSON files with timestamps.
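The two switches and the timestamped output could be combined along the following lines. This is a sketch under stated assumptions, not the logic in the shared package: the function names, the rule that either switch is enough to enable logging, and the `results_<timestamp>.json` naming scheme are all hypothetical.

```python
import json
import os
from datetime import datetime, timezone
from pathlib import Path


def logging_enabled(cli_flag, env=None):
    """Logging is on if --logging was passed or LOGGING=true is set (assumed rule)."""
    env = os.environ if env is None else env
    return cli_flag or env.get("LOGGING", "").strip().lower() == "true"


def log_path(results_dir, now):
    # Hypothetical naming scheme: results_<timestamp>.json
    return results_dir / f"results_{now:%Y%m%d_%H%M%S}.json"


def save_results(results, results_dir=Path("data/results"), now=None):
    """Write `results` as a timestamped JSON file and return its path."""
    now = now or datetime.now(timezone.utc)
    results_dir.mkdir(parents=True, exist_ok=True)
    path = log_path(results_dir, now)
    path.write_text(json.dumps(results, indent=2), encoding="utf-8")
    return path
```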

Development

Add a dependency to a single workspace package (e.g., agent1):
```
uv add --package agent1 <package-name>
```
