Agents.Code — Autonomous 3-Agent Coding Harness

A GAN-inspired autonomous coding harness built with the Claude Agent SDK. It takes a short prompt and builds a full-stack application (Next.js + .NET) using three specialized agents: a planner, a generator, and an evaluator.

Architecture

User Prompt → [Planner] → spec.md → [Generator] → app → [Evaluator/QA] → feedback
                                          ↑                                    │
                                          └────────── fix round ◄──────────────┘

Agent       Role                                                   Tools
Planner     Expands a 1-4 sentence prompt into a full product spec  File I/O
Generator   Builds the Next.js + .NET app from the spec             File, Bash, Git
Evaluator   QA-tests the running app, grades against criteria       Playwright MCP
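
The agent table can also be read as a small data structure. The sketch below is only an illustration: the names AgentName, AgentSpec, AGENTS, PIPELINE, and describePipeline are hypothetical and are not taken from the harness source.

```typescript
// Hypothetical types for illustration; the real harness may model agents differently.
type AgentName = "planner" | "generator" | "evaluator";

interface AgentSpec {
  role: string;
  tools: readonly string[];
}

// Roles and tools copied from the table above.
const AGENTS: Record<AgentName, AgentSpec> = {
  planner: {
    role: "Expands a 1-4 sentence prompt into a full product spec",
    tools: ["File I/O"],
  },
  generator: {
    role: "Builds the Next.js + .NET app from the spec",
    tools: ["File", "Bash", "Git"],
  },
  evaluator: {
    role: "QA-tests the running app and grades it against criteria",
    tools: ["Playwright MCP"],
  },
};

// Order in which the agents run, matching the architecture diagram.
const PIPELINE: readonly AgentName[] = ["planner", "generator", "evaluator"];

// Renders the pipeline as a single line, listing each agent's tools.
function describePipeline(): string {
  return PIPELINE
    .map((name) => `${name} [${AGENTS[name].tools.join(", ")}]`)
    .join(" -> ");
}

console.log(describePipeline());
```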

Quick Start

# Install dependencies
npm install

# Set your API key
cp .env.example .env
# Edit .env with your ANTHROPIC_API_KEY

# Install Playwright (for evaluator agent)
npx playwright install chromium

# Run the harness
npx tsx src/index.ts "Build a task management app with kanban boards"

Options

--output-dir <path>     Output directory (default: ./output)
--artifacts-dir <path>  Artifacts directory (default: ./artifacts)
--model <model>         Claude model (default: claude-opus-4-5-20250918)
--max-rounds <n>        Max QA rounds (default: 3)
--max-budget <usd>      Max budget in USD (default: 50)
--api-key <key>         API key (or set ANTHROPIC_API_KEY)
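
As a rough illustration of how these flags and defaults could be read, here is a minimal sketch using Node's built-in util.parseArgs. The HarnessOptions shape, the parseHarnessArgs function, and the validation rules are assumptions made for this example; the actual parsing in src/index.ts may differ.

```typescript
import { parseArgs } from "node:util";

// Hypothetical option shape; field names are illustrative only.
interface HarnessOptions {
  prompt: string;
  outputDir: string;
  artifactsDir: string;
  model: string;
  maxRounds: number;
  maxBudgetUsd: number;
  apiKey: string | undefined;
}

function parseHarnessArgs(
  argv: string[],
  env: Record<string, string | undefined> = process.env,
): HarnessOptions {
  const { values, positionals } = parseArgs({
    args: argv,
    allowPositionals: true, // the prompt is passed as a positional argument
    options: {
      "output-dir": { type: "string" },
      "artifacts-dir": { type: "string" },
      model: { type: "string" },
      "max-rounds": { type: "string" },
      "max-budget": { type: "string" },
      "api-key": { type: "string" },
    },
  });

  if (positionals.length === 0) {
    throw new Error("A prompt is required");
  }

  // Defaults below mirror the option list above.
  const maxRounds = Number(values["max-rounds"] ?? "3");
  const maxBudgetUsd = Number(values["max-budget"] ?? "50");
  if (!Number.isInteger(maxRounds) || maxRounds < 1) {
    throw new Error("--max-rounds must be a positive integer");
  }
  if (!Number.isFinite(maxBudgetUsd) || maxBudgetUsd <= 0) {
    throw new Error("--max-budget must be a positive number");
  }

  return {
    prompt: positionals[0],
    outputDir: values["output-dir"] ?? "./output",
    artifactsDir: values["artifacts-dir"] ?? "./artifacts",
    model: values.model ?? "claude-opus-4-5-20250918",
    maxRounds,
    maxBudgetUsd,
    // The flag takes precedence; otherwise fall back to the environment variable.
    apiKey: values["api-key"] ?? env.ANTHROPIC_API_KEY,
  };
}

// Example call (not executed here):
// const options = parseHarnessArgs(process.argv.slice(2));
```

Numeric flags are parsed as strings and converted afterwards because util.parseArgs only supports string and boolean option types.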

How It Works

Planning: The planner agent expands your short prompt into an ambitious product spec covering features, user stories, design direction, and AI-powered functionality.
Building: The generator agent reads the spec and builds the complete application (Next.js frontend plus .NET backend), committing to git at milestones.
QA: The evaluator agent uses Playwright to interact with the running app like a real user. It grades the app against four criteria (Product Depth, Functionality, Visual Design, Code Quality) and files specific bugs with file/line references.
Iteration: If QA fails, the generator receives the evaluator's feedback and fixes the reported issues. This loop repeats up to --max-rounds times.
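
The four steps above can be sketched as a plan, build, evaluate, fix loop. This is a minimal sketch under stated assumptions, not the harness's actual code: the Agents interface, the QaReport shape, the 0-10 score scale, and the PASS_THRESHOLD value are all hypothetical, and the sketch treats --max-rounds as the maximum number of QA evaluations.

```typescript
// The four grading criteria named above.
const CRITERIA = ["Product Depth", "Functionality", "Visual Design", "Code Quality"] as const;
type Criterion = (typeof CRITERIA)[number];

// Hypothetical report shape; scores are assumed to be on a 0-10 scale.
interface QaReport {
  scores: Record<Criterion, number>;
  bugs: string[]; // e.g. "src/app/page.tsx:42 button does nothing"
}

// Hypothetical agent interface standing in for the three SDK-backed agents.
interface Agents {
  plan(prompt: string): Promise<string>; // returns the spec text
  build(spec: string, feedback: QaReport | null): Promise<void>;
  evaluate(spec: string): Promise<QaReport>;
}

// Assumed pass mark: every criterion must reach this score.
const PASS_THRESHOLD = 7;

function passes(report: QaReport): boolean {
  return CRITERIA.every((criterion) => report.scores[criterion] >= PASS_THRESHOLD);
}

async function runHarness(
  prompt: string,
  agents: Agents,
  maxRounds = 3,
): Promise<{ passed: boolean; rounds: number }> {
  const spec = await agents.plan(prompt); // Planning
  await agents.build(spec, null); // Building (first pass, no feedback yet)

  for (let round = 1; round <= maxRounds; round++) {
    const report = await agents.evaluate(spec); // QA
    if (passes(report)) {
      return { passed: true, rounds: round };
    }
    // Iteration: hand the feedback back to the generator unless rounds are exhausted.
    if (round < maxRounds) {
      await agents.build(spec, report);
    }
  }
  return { passed: false, rounds: maxRounds };
}
```

Passing the agents in as an interface keeps the loop testable with stubs, without running a model or a browser.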

Target Stack

Frontend: Next.js 14+ (App Router), TypeScript, Tailwind CSS
Backend: .NET 10, ASP.NET Core Web API, Entity Framework Core
Database: SQLite (dev) / PostgreSQL (prod)

Skills & Plugins

# Frontend design skill
npx skills add vercel-labs/agent-skills

# Next.js best practices
npx skills add vercel-labs/next-skills --skill next-best-practices

# Chromium browser for Playwright MCP (used by the evaluator)
npx playwright install chromium

References

Harness Design for Long-Running Apps — Anthropic Engineering
Claude Agent SDK — Official Docs
DotNet Skills — .NET agent skills
