RFC: AgentFlow 2.0 — Zero-config native plugin for agent task management #24

@Tweakzx

RFC: AgentFlow 2.0 — Zero-config Native Plugin for Agent Task Management

This is a Request for Comments. Feedback welcome.


Motivation

AgentFlow solves a real problem: agents need structured task management and humans need visibility into what agents are doing. The current implementation proves the concept works, but the deployment model creates friction that limits adoption:

  • 5-step setup (pip install → init → create project → serve → configure)
  • Separate Python process to manage alongside the agent runtime
  • CLI bridge for plugin operations (every tool call spawns a subprocess)
  • Text-based tool output that agents must parse instead of structured JSON

Meanwhile, the valuable parts — task lifecycle state machine, scoring formula, audit trail, stage-first workflow — are buried in implementation details.

Goal: Make AgentFlow a zero-config, install-and-use OpenClaw native plugin while preserving its core design as a reusable task governance library.


Current State Analysis

What Works Well

  1. Task lifecycle model — The status machine (pending → approved → in_progress → pr_ready → pr_open → merged) with validation is solid
  2. Scoring formula — (priority * 2 + impact * 3) / max(1, effort) is simple and effective for queue prioritization
  3. Claim/lease/heartbeat — The distributed-lock pattern for multi-agent coordination is well-designed
  4. Audit trail — status_history, runs, run_steps, triggers provide full traceability
  5. Gate system — Configurable quality gates with command execution is a good governance primitive
  6. Web Console — Stage Board UI with drag-and-drop is a compelling human interface
  7. SSE streaming — Real-time event delivery to the dashboard is well-implemented
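
To make the scoring formula concrete, here is a small TypeScript sketch that ranks a queue by score. The QueuedTask shape and the prioritize helper are hypothetical names used only for illustration; the formula itself is the one quoted in the list above.

```typescript
// Hypothetical task shape; field names are assumptions for illustration.
interface QueuedTask {
  id: string;
  priority: number;
  impact: number;
  effort: number;
}

// Scoring formula as stated in this RFC. Effort below 1 is clamped to 1.
function score(priority: number, impact: number, effort: number): number {
  return (priority * 2 + impact * 3) / Math.max(1, effort);
}

// Return a new array ordered by descending score (highest first).
function prioritize(tasks: QueuedTask[]): QueuedTask[] {
  const byScore = (t: QueuedTask) => score(t.priority, t.impact, t.effort);
  return [...tasks].sort((a, b) => byScore(b) - byScore(a));
}

const queue = prioritize([
  { id: "docs", priority: 1, impact: 1, effort: 1 },     // (2 + 3) / 1 = 5
  { id: "refactor", priority: 3, impact: 2, effort: 4 }, // (6 + 6) / 4 = 3
  { id: "hotfix", priority: 5, impact: 4, effort: 2 },   // (10 + 12) / 2 = 11
]);

console.log(queue.map((t) => t.id)); // hotfix, docs, refactor
```

Note that a cheap, low-impact task ("docs") outranks an expensive, moderately important one ("refactor"), which is the intended effect of dividing by effort.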

What Creates Friction

  1. Deployment weight — Python environment + separate server process + manual init
  2. Plugin bridge fragility — execFile("python3", ...) for every tool call is slow and unreliable
  3. Unstructured tool output — CLI text tables force the agent to parse instead of reason
  4. No auto-init — Users must manually create the DB and projects before first use
  5. Missing API endpoints — POST /api/tasks for task creation is still missing (see #5, "Feature: Add POST /api/tasks endpoint for programmatic task creation")
  6. Synchronous execution — Adapter calls block the server thread (see #14, "Enhancement: OpenClaw adapter execute() is synchronous and blocks the server")

Proposal

Vision

AgentFlow becomes an OpenClaw plugin that you install once and forget. The agent uses it for task tracking, the human opens a dashboard to see what is happening. No separate processes, no manual setup.

Architecture

┌────────────────────────────────────────────────────┐
│  OpenClaw Gateway (single process)                 │
│                                                    │
│  ┌─────────────┐  ┌──────────────┐  ┌────────────┐ │
│  │ Agent       │  │ AgentFlow    │  │ Dashboard  │ │
│  │ (LLM loop)  │──│ Plugin       │──│ (static)   │ │
│  │             │  │              │  │            │ │
│  │  Tools:     │  │  ┌─────────┐ │  │ /agentflow/│ │
│  │  - create   │  │  │ SQLite  │ │  │ /agentflow/│ │
│  │  - move     │  │  │ (embed) │ │  │   board    │ │
│  │  - board    │  │  └─────────┘ │  │ /agentflow/│ │
│  │  - detail   │  │  ┌─────────┐ │  │   tasks    │ │
│  │  - audit    │  │  │ Event   │ │  │            │ │
│  │  - progress │  │  │ Broker  │ │  └────────────┘ │
│  │             │  │  └─────────┘ │                 │
│  └─────────────┘  └──────────────┘                 │
└────────────────────────────────────────────────────┘

Phase 1: Zero-config Python plugin (minimal changes)

Keep Python, but eliminate setup friction:

  • Auto-init: The DB is created automatically on the first tool call (default path: ~/.agentflow/agentflow.db)
  • Auto-project: First use auto-creates a default project
  • Structured tool output: All tools return JSON, not CLI text
  • HTTP API: Embed the server in the plugin and serve it on the gateway port
  • Complete REST API: Add missing POST /api/tasks endpoint

Effort: ~2-3 days
Issues: #5, #11, #12, #13, #15, #18, #22, #23
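
As an illustration of what "structured tool output" could look like, the sketch below wraps every tool result in a small JSON envelope. The envelope shape (ok, data, error.code, error.message), the TaskSummary fields, and the boardTool and notFound helpers are all assumptions made for this example; they are not an existing AgentFlow schema. The JSON shape is language-agnostic, so the Phase 1 Python tools could emit the same structure.

```typescript
// Hypothetical summary of a task as returned to the agent.
interface TaskSummary {
  id: number;
  title: string;
  status: string;
}

// Hypothetical result envelope: either data or a machine-readable error.
type ToolResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string } };

// Group tasks by status so the agent can reason over columns, not text tables.
function boardTool(tasks: TaskSummary[]): ToolResult<Record<string, TaskSummary[]>> {
  const columns: Record<string, TaskSummary[]> = {};
  for (const task of tasks) {
    const column = columns[task.status] ?? [];
    column.push(task);
    columns[task.status] = column;
  }
  return { ok: true, data: columns };
}

// Errors use the same envelope, with a stable code the agent can branch on.
function notFound(id: number): ToolResult<never> {
  return { ok: false, error: { code: "task_not_found", message: `No task with id ${id}` } };
}

const payload = JSON.stringify(
  boardTool([
    { id: 1, title: "Add POST /api/tasks", status: "pending" },
    { id: 2, title: "Auto-init DB", status: "pending" },
    { id: 3, title: "JSON tool output", status: "in_progress" },
  ]),
);
console.log(payload);
```

A single envelope for success and failure means the agent never has to guess whether a string is a result or an error message.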

Phase 2: TypeScript native plugin (rewrite core)

Reimplement as a pure TypeScript OpenClaw plugin:

  • Store: better-sqlite3 (same schema, same queries; a native Node.js addon called directly from TypeScript)
  • Tools: Direct function calls, no subprocess bridge
  • API: Registered as gateway HTTP routes
  • Dashboard: Served as static asset from plugin directory
  • Install: openclaw plugin install agentflow

plugins/agentflow/
├── openclaw.plugin.json
├── package.json              # better-sqlite3 only
├── src/
│   ├── index.ts              # Plugin entry
│   ├── store.ts              # SQLite wrapper
│   ├── broker.ts             # Event broker + SSE
│   ├── models.ts             # Task, Run, etc.
│   ├── lifecycle.ts          # Status machine, scoring
│   └── api.ts                # REST handlers
├── console/
│   └── index.html            # Dashboard (same UI)
└── tests/

Effort: ~1-2 weeks
Issues: #20, #21
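
For the broker.ts piece of the layout above, a minimal in-process event broker might look like the following. This is a sketch under stated assumptions: the EventBroker class, the sseFrame helper, and the "task.moved" event name are hypothetical, and wiring the frames into an actual gateway HTTP response is left out because the OpenClaw route API is not described in this RFC. The frame layout (an "event:" line, a "data:" line, then a blank line) follows the standard Server-Sent Events format.

```typescript
type Listener = (frame: string) => void;

// Format one Server-Sent Events frame: "event:" line, "data:" line, blank line.
// JSON.stringify without indentation yields a single line, so one data line suffices.
function sseFrame(event: string, payload: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Hypothetical in-memory pub/sub; each dashboard connection would be one listener.
class EventBroker {
  private listeners = new Set<Listener>();

  // Register a listener and return a function that removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  // Deliver the same pre-formatted frame to every current listener.
  publish(event: string, payload: unknown): void {
    const frame = sseFrame(event, payload);
    this.listeners.forEach((listener) => listener(frame));
  }
}

const broker = new EventBroker();
const received: string[] = [];
const unsubscribe = broker.subscribe((frame) => received.push(frame));

broker.publish("task.moved", { id: 1, to: "approved" });
unsubscribe();
broker.publish("task.moved", { id: 2, to: "blocked" }); // no listener left, not delivered
```

Because tools and API handlers run in the same process as the broker, publishing an event is a plain function call rather than a cross-process message.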

Phase 3: Agent execution integration

  • Async execution: Background task execution with SSE progress updates
  • OpenClaw adapter: Real sessions_spawn integration for sub-agent dispatch
  • Progress heartbeat: Sub-agents report progress during execution
  • Result collection: Automatic status update when sub-agent completes

Effort: ~1 week
Issues: #1, #8, #9, #10, #14, #16
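
One way the progress heartbeat could tie into the existing claim/lease pattern is sketched below: a sub-agent claims a task, and each heartbeat extends its lease so that a crashed sub-agent eventually loses the claim. The LeaseTable class, its method names, and the in-memory Map are assumptions for illustration; the real store would persist leases in SQLite. The current time is passed in explicitly to keep the logic pure and easy to test.

```typescript
interface Lease {
  taskId: number;
  owner: string;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical in-memory lease table.
class LeaseTable {
  private leases = new Map<number, Lease>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // A claim succeeds if the task is unclaimed, the lease has expired,
  // or the same owner claims again.
  claim(taskId: number, owner: string, now: number): boolean {
    const current = this.leases.get(taskId);
    if (current && current.owner !== owner && current.expiresAt > now) {
      return false;
    }
    this.leases.set(taskId, { taskId, owner, expiresAt: now + this.ttlMs });
    return true;
  }

  // A heartbeat extends the lease only for the current, unexpired owner.
  heartbeat(taskId: number, owner: string, now: number): boolean {
    const current = this.leases.get(taskId);
    if (!current || current.owner !== owner || current.expiresAt <= now) {
      return false;
    }
    current.expiresAt = now + this.ttlMs;
    return true;
  }
}

const table = new LeaseTable(1000);
const firstClaim = table.claim(1, "agent-a", 0);        // agent-a holds the lease until t=1000
const rivalClaim = table.claim(1, "agent-b", 500);      // rejected, lease still live
const extended = table.heartbeat(1, "agent-a", 900);    // lease now runs until t=1900
const takeover = table.claim(1, "agent-b", 2000);       // lease expired, agent-b takes over
const staleBeat = table.heartbeat(1, "agent-a", 2100);  // rejected, agent-a no longer owns it
```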

Phase 4: Ecosystem

  • Codex adapter: Standalone adapter for users without OpenClaw
  • Claude Code adapter: Standalone adapter for Claude Code users
  • GitHub Actions integration: Periodic issue discovery via CI
  • Multi-project dashboard: Cross-project view for users managing multiple repos

Effort: Ongoing


Design Decisions

D1: Keep SQLite, but embed it

SQLite is the right choice for a single-user task management tool:

  • Zero config (no separate DB server)
  • Fast for the expected scale (hundreds of tasks, not millions)
  • Single-file deployment (easy to backup, move, delete)
  • better-sqlite3 in TypeScript is battle-tested and synchronous (simpler code)

D2: Status machine is the core contract

The task lifecycle is AgentFlow's most valuable IP. It should be:

  • Documented as a stable contract
  • Implemented as a pure function (no I/O)
  • Portable across languages (Python, TypeScript, potentially others)

// The entire status machine in ~20 lines
const TRANSITIONS: Record<string, Set<string>> = {
  pending:      new Set(["approved", "blocked", "skipped"]),
  approved:     new Set(["pending", "in_progress", "blocked", "skipped"]),
  in_progress:  new Set(["approved", "pr_ready", "blocked"]),
  pr_ready:     new Set(["approved", "pr_open", "blocked"]),
  pr_open:      new Set(["approved", "merged", "blocked"]),
  blocked:      new Set(["pending", "approved", "skipped"]),
  merged:       new Set(),
  skipped:      new Set(),
};

function score(priority: number, impact: number, effort: number): number {
  return (priority * 2 + impact * 3) / Math.max(1, effort);
}
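
The table above only becomes a contract once something checks it. The following sketch shows one possible pure validation layer on top of the same transitions. The Status union type and the canTransition and isTerminal helpers are hypothetical additions for illustration; the transition sets are copied from the table above.

```typescript
type Status =
  | "pending" | "approved" | "in_progress" | "pr_ready"
  | "pr_open" | "blocked" | "merged" | "skipped";

// Same transitions as in the RFC, typed with a string union so that
// a misspelled status fails at compile time.
const TRANSITIONS: Record<Status, ReadonlySet<Status>> = {
  pending:     new Set<Status>(["approved", "blocked", "skipped"]),
  approved:    new Set<Status>(["pending", "in_progress", "blocked", "skipped"]),
  in_progress: new Set<Status>(["approved", "pr_ready", "blocked"]),
  pr_ready:    new Set<Status>(["approved", "pr_open", "blocked"]),
  pr_open:     new Set<Status>(["approved", "merged", "blocked"]),
  blocked:     new Set<Status>(["pending", "approved", "skipped"]),
  merged:      new Set<Status>(),
  skipped:     new Set<Status>(),
};

// Pure check with no I/O: is moving from one status to another allowed?
function canTransition(from: Status, to: Status): boolean {
  return TRANSITIONS[from].has(to);
}

// A status with no outgoing transitions is terminal.
function isTerminal(status: Status): boolean {
  return TRANSITIONS[status].size === 0;
}

// Walk the happy path and confirm every step is legal.
const happyPath: Status[] = ["pending", "approved", "in_progress", "pr_ready", "pr_open", "merged"];
const happyPathValid = happyPath.every(
  (status, i) => i === 0 || canTransition(happyPath[i - 1], status),
);
console.log(happyPathValid, canTransition("pending", "merged"), isTerminal("merged"));
```

Keeping these helpers free of database access is what would let the Python and TypeScript implementations share one set of test cases.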

D3: Python CLI remains for standalone use

The TypeScript plugin is the primary path for OpenClaw users. The Python CLI remains for:

  • Standalone use (no OpenClaw installed)
  • Scripting and automation
  • Development and testing

The two share the same schema and can coexist (both read/write the same SQLite DB).

D4: Dashboard is a static file, not a framework

The current embedded HTML approach works. Moving to React/Svelte adds build tooling complexity without proportional benefit for a dashboard this size. Keep it as a single HTML file with vanilla JS.

D5: Auto-init is mandatory

A plugin that requires manual setup will not be adopted. First tool call must work without any prior configuration:

User: "help me track these 3 tasks"
Agent: [calls agentflow_create_task] → DB auto-created → task created → done
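
The flow above can be sketched as lazy initialization: nothing is set up at install time, and the first tool call triggers setup exactly once. Everything named here (the Store interface, the lazy helper, agentflowCreateTask, and the in-memory array standing in for the database) is an assumption for illustration; a real implementation would create the ~/.agentflow directory, open the SQLite file, apply the schema, and create the default project inside the initializer.

```typescript
// Hypothetical store interface; real code would wrap the SQLite handle.
interface Store {
  createTask(title: string): number;
}

// Run the initializer on first access only, then reuse the result.
function lazy<T>(init: () => T): () => T {
  let value: T | undefined;
  let ready = false;
  return () => {
    if (!ready) {
      value = init();
      ready = true;
    }
    return value as T;
  };
}

let initCount = 0;

const getStore = lazy<Store>(() => {
  // Stand-in for: create directory, open DB, apply schema, create default project.
  initCount += 1;
  const titles: string[] = [];
  // Array.prototype.push returns the new length, used here as a 1-based task id.
  return { createTask: (title) => titles.push(title) };
});

// Hypothetical tool handler: no setup step is required before calling it.
function agentflowCreateTask(title: string): { ok: true; id: number } {
  return { ok: true, id: getStore().createTask(title) };
}

const ids = ["write RFC", "collect feedback", "start Phase 1"].map(
  (title) => agentflowCreateTask(title).id,
);
console.log(ids, initCount);
```

The three calls share one initialization, so the cost of auto-init is paid once per process rather than once per tool call.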

Success Metrics

| Metric | Current | Phase 1 Target | Phase 2 Target |
|---|---|---|---|
| Setup steps | 5 | 1 (install) | 1 (install) |
| Time to first task | ~5 min | ~30 sec | ~10 sec |
| Tool call latency | ~200ms (subprocess) | ~200ms (subprocess) | ~5ms (in-process) |
| Tool output format | Text | JSON | JSON |
| Separate processes | 2 (gateway + AF server) | 1 (gateway only) | 1 (gateway only) |
| External dependencies | Python 3.10+ | Python 3.10+ | None (bundled) |

Related Issues

Bugs & Security

Features (current)

Proposals (this RFC)


Call for Feedback

Specifically looking for input on:

  1. TS rewrite vs Python optimization — Is the TS plugin path the right direction, or should we double down on Python?
  2. better-sqlite3 vs sql.js — Native dependency vs WASM? better-sqlite3 is faster but requires compilation.
  3. Dashboard scope — Keep as single HTML file, or invest in a proper frontend framework?
  4. Backward compatibility — Should the Python CLI remain feature-complete, or become a thin wrapper?
  5. Multi-user — Is multi-user/multi-agent on the same instance a near-term requirement, or single-user sufficient?

This RFC is a living document. Please comment with feedback, concerns, or alternative proposals.
