RFC: AgentFlow 2.0 — Zero-config native plugin for agent task management

## RFC: AgentFlow 2.0 — Zero-config Native Plugin for Agent Task Management

> This is a Request for Comments. Feedback welcome.

---

## Motivation

AgentFlow solves a real problem: **agents need structured task management and humans need visibility into what agents are doing.** The current implementation proves the concept works, but the deployment model creates friction that limits adoption:

- **5-step setup** (pip install → init → create project → serve → configure)
- **Separate Python process** to manage alongside the agent runtime
- **CLI bridge** for plugin operations (every tool call spawns a subprocess)
- **Text-based tool output** that agents must parse instead of structured JSON

Meanwhile, the valuable parts — task lifecycle state machine, scoring formula, audit trail, stage-first workflow — are buried in implementation details.

**Goal:** Make AgentFlow a zero-config, install-and-use OpenClaw native plugin while preserving its core design as a reusable task governance library.

---

## Current State Analysis

### What Works Well

1. **Task lifecycle model** — The status machine (`pending → approved → in_progress → pr_ready → pr_open → merged`) with validation is solid
2. **Scoring formula** — `(priority * 2 + impact * 3) / max(1, effort)` is simple and effective for queue prioritization
3. **Claim/lease/heartbeat** — The distributed-lock pattern for multi-agent coordination is well-designed
4. **Audit trail** — `status_history`, `runs`, `run_steps`, `triggers` provide full traceability
5. **Gate system** — Configurable quality gates with command execution is a good governance primitive
6. **Web Console** — Stage Board UI with drag-and-drop is a compelling human interface
7. **SSE streaming** — Real-time event delivery to the dashboard is well-implemented
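
For readers unfamiliar with the claim/lease/heartbeat pattern in item 3, here is a minimal in-memory sketch. It is not AgentFlow's actual implementation, which is backed by SQLite. `LeaseTable`, `claim`, `heartbeat`, the TTL value, and the injectable clock are all hypothetical names chosen for illustration.

```typescript
// Hypothetical sketch: an agent claims a task, then must heartbeat before the lease expires.
interface Lease {
  agentId: string;
  expiresAt: number; // epoch milliseconds
}

class LeaseTable {
  private leases = new Map<string, Lease>();
  private ttlMs: number;
  private now: () => number;

  // The clock is injectable so that expiry can be tested deterministically.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Succeeds if the task is unclaimed, the lease has expired, or the caller already holds it.
  claim(taskId: string, agentId: string): boolean {
    const current = this.leases.get(taskId);
    const t = this.now();
    if (current && current.expiresAt > t && current.agentId !== agentId) {
      return false; // another agent holds a live lease
    }
    this.leases.set(taskId, { agentId, expiresAt: t + this.ttlMs });
    return true;
  }

  // Extends the lease; fails if the caller no longer holds a live lease.
  heartbeat(taskId: string, agentId: string): boolean {
    const current = this.leases.get(taskId);
    const t = this.now();
    if (!current || current.agentId !== agentId || current.expiresAt <= t) {
      return false;
    }
    current.expiresAt = t + this.ttlMs;
    return true;
  }
}
```

An agent that stops heartbeating loses the task once the TTL elapses, at which point another agent's `claim` succeeds. That is what makes the pattern safe against crashed workers.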

### What Creates Friction

1. **Deployment weight** — Python environment + separate server process + manual init
2. **Plugin bridge fragility** — `execFile("python3", ...)` for every tool call is slow and unreliable
3. **Unstructured tool output** — CLI text tables force the agent to parse output instead of reasoning over it
4. **No auto-init** — Users must manually create the DB and projects before first use
5. **Missing API endpoints** — `POST /api/tasks` for task creation is still missing (#5)
6. **Synchronous execution** — Adapter calls block the server thread (#14)

---

## Proposal

### Vision

> AgentFlow becomes an OpenClaw plugin that you install once and forget. The agent uses it for task tracking, the human opens a dashboard to see what is happening. No separate processes, no manual setup.

### Architecture

```
┌─────────────────────────────────────────────────────┐
│  OpenClaw Gateway (single process)                  │
│                                                     │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────┐  │
│  │ Agent       │  │ AgentFlow    │  │ Dashboard  │  │
│  │ (LLM loop)  │──│ Plugin       │──│ (static)   │  │
│  │             │  │              │  │            │  │
│  │  Tools:     │  │  ┌─────────┐ │  │ /agentflow/│  │
│  │  - create   │  │  │ SQLite  │ │  │ /agentflow/│  │
│  │  - move     │  │  │ (embed) │ │  │   board    │  │
│  │  - board    │  │  └─────────┘ │  │ /agentflow/│  │
│  │  - detail   │  │  ┌─────────┐ │  │   tasks    │  │
│  │  - audit    │  │  │ Event   │ │  │            │  │
│  │  - progress │  │  │ Broker  │ │  └────────────┘  │
│  │             │  │  └─────────┘ │                  │
│  └─────────────┘  └──────────────┘                  │
└─────────────────────────────────────────────────────┘
```

### Phase 1: Zero-config Python plugin (minimal changes)

Keep Python, but eliminate setup friction:

- **Auto-init**: DB auto-creates on first tool call, default path `~/.agentflow/agentflow.db`
- **Auto-project**: First use auto-creates a `default` project
- **Structured tool output**: All tools return JSON, not CLI text
- **HTTP API**: Embed the server in the plugin, serve on gateway port
- **Complete REST API**: Add missing `POST /api/tasks` endpoint

**Effort**: ~2-3 days
**Issues**: #5, #11, #12, #13, #15, #18, #22, #23
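
To illustrate the structured-output item, a tool result envelope might look something like the sketch below. The RFC does not define the JSON schema, so the `ToolResult` shape, its `ok`/`error` fields, `TaskSummary`, and `boardTool` are assumptions for illustration only.

```typescript
// Hypothetical result envelope: every tool returns JSON the agent can read directly.
type ToolResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string } };

interface TaskSummary {
  id: number;
  title: string;
  status: string;
}

// Instead of rendering a text table, return tasks grouped by status.
function boardTool(tasks: TaskSummary[]): ToolResult<Record<string, TaskSummary[]>> {
  const columns: Record<string, TaskSummary[]> = {};
  for (const task of tasks) {
    if (!columns[task.status]) {
      columns[task.status] = [];
    }
    columns[task.status].push(task);
  }
  return { ok: true, data: columns };
}

const result = boardTool([
  { id: 1, title: "Add POST /api/tasks", status: "pending" },
  { id: 2, title: "Fix gate cwd", status: "in_progress" },
  { id: 3, title: "Structured output", status: "pending" },
]);
console.log(JSON.stringify(result, null, 2));
```

A discriminated union on `ok` lets the agent (or a typed client) branch on success versus failure without string matching on error text.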

### Phase 2: TypeScript native plugin (rewrite core)

Reimplement as a pure TypeScript OpenClaw plugin:

- **Store**: `better-sqlite3` (same schema, same queries; a native Node.js addon, so the plugin is not strictly pure TS)
- **Tools**: Direct function calls, no subprocess bridge
- **API**: Registered as gateway HTTP routes
- **Dashboard**: Served as static asset from plugin directory
- **Install**: `openclaw plugin install agentflow`

```
plugins/agentflow/
├── openclaw.plugin.json
├── package.json              # better-sqlite3 only
├── src/
│   ├── index.ts              # Plugin entry
│   ├── store.ts              # SQLite wrapper
│   ├── broker.ts             # Event broker + SSE
│   ├── models.ts             # Task, Run, etc.
│   ├── lifecycle.ts          # Status machine, scoring
│   └── api.ts                # REST handlers
├── console/
│   └── index.html            # Dashboard (same UI)
└── tests/
```

**Effort**: ~1-2 weeks
**Issues**: #20, #21

### Phase 3: Agent execution integration

- **Async execution**: Background task execution with SSE progress updates
- **OpenClaw adapter**: Real `sessions_spawn` integration for sub-agent dispatch
- **Progress heartbeat**: Sub-agents report progress during execution
- **Result collection**: Automatic status update when sub-agent completes

**Effort**: ~1 week
**Issues**: #1, #8, #9, #10, #14, #16
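
As a rough sketch of how background progress could reach the dashboard, the block below pairs a tiny in-process publish/subscribe broker with Server-Sent Events framing. It is not AgentFlow's actual broker: `EventBroker`, `ProgressEvent`, `toSseFrame`, and the `task.progress` event name are hypothetical. Only the `event:`/`data:` frame layout comes from the SSE format itself.

```typescript
// Hypothetical in-process broker; sub-agents publish, SSE connections subscribe.
interface ProgressEvent {
  type: string; // illustrative event name, e.g. "task.progress"
  taskId: number;
  payload: unknown;
}

type Listener = (event: ProgressEvent) => void;

class EventBroker {
  private listeners = new Set<Listener>();

  // Returns an unsubscribe function so a closed SSE connection can detach.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  publish(event: ProgressEvent): void {
    this.listeners.forEach((listener) => listener(event));
  }
}

// Serialize one event as a Server-Sent Events frame (terminated by a blank line).
function toSseFrame(event: ProgressEvent): string {
  const data = JSON.stringify({ taskId: event.taskId, payload: event.payload });
  return `event: ${event.type}\ndata: ${data}\n\n`;
}

const broker = new EventBroker();
const frames: string[] = [];
const unsubscribe = broker.subscribe((e) => {
  frames.push(toSseFrame(e));
});

broker.publish({ type: "task.progress", taskId: 7, payload: { percent: 50 } });
unsubscribe();
broker.publish({ type: "task.progress", taskId: 7, payload: { percent: 100 } }); // not delivered
```

In a real handler the frame would be written to the HTTP response instead of pushed onto an array, and `unsubscribe` would run when the client disconnects.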

### Phase 4: Ecosystem

- **Codex adapter**: Standalone adapter for users without OpenClaw
- **Claude Code adapter**: Standalone adapter for Claude Code users
- **GitHub Actions integration**: Periodic issue discovery via CI
- **Multi-project dashboard**: Cross-project view for users managing multiple repos

**Effort**: Ongoing

---

## Design Decisions

### D1: Keep SQLite, but embed it

SQLite is the right choice for a single-user task management tool:
- Zero config (no separate DB server)
- Fast for the expected scale (hundreds of tasks, not millions)
- Single-file deployment (easy to backup, move, delete)
- `better-sqlite3` in TypeScript is battle-tested and synchronous (simpler code)

### D2: Status machine is the core contract

The task lifecycle is AgentFlow's most valuable IP. It should be:
- Documented as a stable contract
- Implemented as a pure function (no I/O)
- Portable across languages (Python, TypeScript, potentially others)

```typescript
// The entire status machine in ~20 lines
const TRANSITIONS: Record<string, Set<string>> = {
  pending:      new Set(["approved", "blocked", "skipped"]),
  approved:     new Set(["pending", "in_progress", "blocked", "skipped"]),
  in_progress:  new Set(["approved", "pr_ready", "blocked"]),
  pr_ready:     new Set(["approved", "pr_open", "blocked"]),
  pr_open:      new Set(["approved", "merged", "blocked"]),
  blocked:      new Set(["pending", "approved", "skipped"]),
  merged:       new Set(),
  skipped:      new Set(),
};

function score(priority: number, impact: number, effort: number): number {
  return (priority * 2 + impact * 3) / Math.max(1, effort);
}
```
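
A validator over this table could look roughly like the sketch below. `Status`, `canTransition`, and `applyTransition` are hypothetical names. The transition table and scoring formula are repeated from the snippet above only so that the block runs on its own.

```typescript
type Status =
  | "pending" | "approved" | "in_progress" | "pr_ready"
  | "pr_open" | "blocked" | "merged" | "skipped";

const TRANSITIONS: Record<Status, ReadonlySet<Status>> = {
  pending:     new Set<Status>(["approved", "blocked", "skipped"]),
  approved:    new Set<Status>(["pending", "in_progress", "blocked", "skipped"]),
  in_progress: new Set<Status>(["approved", "pr_ready", "blocked"]),
  pr_ready:    new Set<Status>(["approved", "pr_open", "blocked"]),
  pr_open:     new Set<Status>(["approved", "merged", "blocked"]),
  blocked:     new Set<Status>(["pending", "approved", "skipped"]),
  merged:      new Set<Status>(), // terminal
  skipped:     new Set<Status>(), // terminal
};

// Pure check: no I/O, so it can be ported to other languages as-is.
function canTransition(from: Status, to: Status): boolean {
  return TRANSITIONS[from].has(to);
}

// Returns the new status, or throws if the move is not allowed.
function applyTransition(from: Status, to: Status): Status {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid transition: ${from} -> ${to}`);
  }
  return to;
}

function score(priority: number, impact: number, effort: number): number {
  return (priority * 2 + impact * 3) / Math.max(1, effort);
}

console.log(canTransition("pending", "approved")); // true
console.log(canTransition("merged", "pending"));   // false (merged is terminal)
console.log(score(3, 2, 4)); // (3*2 + 2*3) / 4 = 3
console.log(score(1, 1, 0)); // effort is clamped to 1, so (2 + 3) / 1 = 5
```

Typing the keys as a union rather than `string` lets the compiler reject unknown statuses at the call site, which helps when the same contract is maintained in both Python and TypeScript.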

### D3: Python CLI remains for standalone use

The TypeScript plugin is the primary path for OpenClaw users. The Python CLI remains for:
- Standalone use (no OpenClaw installed)
- Scripting and automation
- Development and testing

The two share the same schema and can coexist (both read/write the same SQLite DB).

### D4: Dashboard is a static file, not a framework

The current embedded HTML approach works. Moving to React/Svelte adds build tooling complexity without proportional benefit for a dashboard this size. Keep it as a single HTML file with vanilla JS.

### D5: Auto-init is mandatory

A plugin that requires manual setup will not be adopted. The first tool call must work without any prior configuration:

```
User: "help me track these 3 tasks"
Agent: [calls agentflow_create_task] → DB auto-created → task created → done
```
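
One way to make that first call safe is an idempotent initializer that every tool invokes before touching the store. The sketch below uses an in-memory stand-in for the database to show the control flow. `MemoryStore`, `ensureInitialized`, `createTask`, and `resolveDbPath` are hypothetical. Only the default path `~/.agentflow/agentflow.db` and the `default` project name come from this RFC.

```typescript
// Hypothetical in-memory stand-in for the SQLite store, used only to show the flow.
class MemoryStore {
  initialized = false;
  projects = new Set<string>();
  tasks: { id: number; project: string; title: string }[] = [];
}

// Default DB location from Phase 1. The home directory is passed in to keep this pure;
// a real implementation would use a platform-aware path join.
function resolveDbPath(homeDir: string): string {
  return `${homeDir}/.agentflow/agentflow.db`;
}

// Idempotent: safe to call at the top of every tool.
function ensureInitialized(store: MemoryStore): void {
  if (store.initialized) {
    return;
  }
  // A real plugin would create the DB file and schema here.
  store.projects.add("default");
  store.initialized = true;
}

function createTask(store: MemoryStore, title: string) {
  ensureInitialized(store);
  const task = { id: store.tasks.length + 1, project: "default", title };
  store.tasks.push(task);
  return task;
}

const store = new MemoryStore();
createTask(store, "first task");  // triggers initialization
createTask(store, "second task"); // initialization is skipped
console.log(store.projects.size, store.tasks.length);
```

Because the check lives inside the tool path rather than in a separate `init` command, the user never sees a "not initialized" error.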

---

## Success Metrics

| Metric | Current | Phase 1 Target | Phase 2 Target |
|--------|---------|----------------|----------------|
| Setup steps | 5 | 1 (install) | 1 (install) |
| Time to first task | ~5 min | ~30 sec | ~10 sec |
| Tool call latency | ~200ms (subprocess) | ~200ms (subprocess) | ~5ms (in-process) |
| Tool output format | Text | JSON | JSON |
| Separate processes | 2 (gateway + AF server) | 1 (gateway only) | 1 (gateway only) |
| External dependencies | Python 3.10+ | Python 3.10+ | None (bundled) |

---

## Related Issues

### Bugs & Security
- #11 — Runner dead code in failure branch
- #12 — GateEvaluator missing cwd
- #13 — Shell injection risk in gate commands
- #15 — Webhook handler silent error swallowing
- #19 — Store connection per call, no pooling

### Features (current)
- #1 — Real agent adapter
- #2 — Auto issue discovery
- #3 — Extract HTML console
- #4 — Richer adapter context
- #5 — POST /api/tasks endpoint
- #6 — OpenClaw tracking skill
- #7 — Plugin full CRUD tools
- #8 — Progress heartbeat API
- #9 — Dashboard auto-refresh/SSE
- #10 — OpenClaw adapter execution

### Proposals (this RFC)
- #20 — Pure TypeScript native plugin
- #21 — Separate data model library
- #22 — Auto-init and embedded DB
- #23 — Agent-oriented structured API

---

## Call for Feedback

Specifically looking for input on:

1. **TS rewrite vs Python optimization** — Is the TS plugin path the right direction, or should we double down on Python?
2. **better-sqlite3 vs sql.js** — Native dependency vs WASM? better-sqlite3 is faster but requires compilation.
3. **Dashboard scope** — Keep as single HTML file, or invest in a proper frontend framework?
4. **Backward compatibility** — Should the Python CLI remain feature-complete, or become a thin wrapper?
5. **Multi-user** — Is multi-user/multi-agent on the same instance a near-term requirement, or is single-user sufficient?

---

*This RFC is a living document. Please comment with feedback, concerns, or alternative proposals.*



RFC: AgentFlow 2.0 — Zero-config native plugin for agent task management #24
