
Commit e48cee3

feat(ai): riteway ai init + registry resolution
- Add `riteway ai init [--force]` subcommand that writes all built-in agent configs to riteway.agent-config.json
- Add loadAgentRegistry / resolveAgentConfig: three-level priority chain (--agent-config > registry > built-ins)
- Relax aiArgsSchema agent field to z.string().min(1) so custom registry agents pass validation before resolution
- Add registry + bad-schema-registry + invalid-registry fixtures
- fix: guard passRate against division by zero (NaN -> 0%)
- refactor: import defaults/runsSchema/thresholdSchema/concurrencySchema from constants.js; remove local duplicates
- refactor: move agentConfigs to module scope in agent-config.js
- chore: add "node": true to .eslintrc.json; switch agent-config.test.js to fileURLToPath(new URL(...))
- docs: add riteway ai section to README with .sudo example, options table, and riteway ai init walkthrough

Made-with: Cursor
1 parent a65eee2 commit e48cee3

14 files changed

Lines changed: 698 additions & 86 deletions
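
The `passRate` fix named in the commit message is not among the files shown on this page, so the following is only a minimal sketch of the idea, assuming the pass rate is computed as passed runs over total runs. The names `calculatePassRate` and `meetsThreshold` are hypothetical and not taken from the repository.

```javascript
// Hypothetical helpers: the repository's actual passRate code is not shown in this diff.
// Guarding total === 0 avoids 0 / 0, which evaluates to NaN in JavaScript.
const calculatePassRate = ({ passed, total }) =>
  total === 0 ? 0 : (passed / total) * 100;

// The default threshold of 75 mirrors the README default; the comparison itself is an assumption.
const meetsThreshold = ({ passed, total, threshold = 75 }) =>
  calculatePassRate({ passed, total }) >= threshold;

console.log(calculatePassRate({ passed: 3, total: 4 })); // 75
console.log(calculatePassRate({ passed: 0, total: 0 })); // 0 rather than NaN
console.log(meetsThreshold({ passed: 3, total: 4 }));    // true
```

Without the guard, a run set with zero assertions would report `NaN%`, and any `>=` comparison against the threshold would be false.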

.eslintrc.json

Lines changed: 2 additions & 1 deletion
@@ -2,7 +2,8 @@
   "env": {
     "browser": true,
     "commonjs": true,
-    "es6": true
+    "es6": true,
+    "node": true
   },
   "extends": [
     "eslint:recommended",

README.md

Lines changed: 94 additions & 0 deletions
@@ -69,6 +69,100 @@ In this case, we're using [nyc](https://www.npmjs.com/package/nyc), which genera
 Riteway requires Node.js 16+ and uses native ES modules. Add `"type": "module"` to your package.json to enable ESM support. For JSX component testing, you'll need a build tool that can transpile JSX (see [JSX Setup](#jsx-setup) below).


+## `riteway ai` — AI Prompt Evaluations
+
+The `riteway ai` CLI runs your AI agent prompt evaluations against a configurable pass-rate threshold. Write a `.sudo` test file, run it through any supported AI agent, and get a TAP-formatted report with per-assertion pass rates across multiple runs.
+
+### Authentication
+
+All agents use OAuth authentication — no API keys needed. Authenticate once before running evals:
+
+| Agent | Command | Docs |
+|-------|---------|------|
+| Claude | `claude setup-token` | [Claude Code docs](https://docs.anthropic.com/en/docs/claude-code) |
+| Cursor | `agent login` | [Cursor docs](https://docs.cursor.com/context/rules-for-ai) |
+| OpenCode | See docs | [opencode.ai/docs/cli](https://opencode.ai/docs/cli/) |
+
+### Writing a test file
+
+AI evals are written in `.sudo` files using [SudoLang](https://github.com/paralleldrive/sudolang) syntax:
+
+```
+# my-feature-test.sudo
+
+import 'path/to/spec.mdc' # optional: the prompt-under-test (shared spec or task)
+
+userPrompt = """
+Implement the sum function as described.
+"""
+
+- Given the spec, should name the function sum
+- Given the spec, should accept two parameters named a and b
+- Given the spec, should return the correct sum of the two parameters
+```
+
+Each `- Given ..., should ...` line becomes an independently judged assertion. The agent is asked to respond to the `userPrompt` (with any imported spec as context), and a judge agent scores each assertion across all runs.
+
+### Running an eval
+
+```shell
+riteway ai path/to/my-feature-test.sudo
+```
+
+By default this runs **4 passes**, requires **75% pass rate**, uses the **claude** agent, and runs up to **4 tests concurrently**.
+
+```shell
+# Specify runs, threshold, and agent
+riteway ai path/to/test.sudo --runs 10 --threshold 80 --agent opencode
+
+# Use a Cursor agent with color output
+riteway ai path/to/test.sudo --agent cursor --color
+
+# Use a custom agent config file (mutually exclusive with --agent)
+riteway ai path/to/test.sudo --agent-config ./my-agent.json
+```
+
+### Options
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--runs N` | `4` | Number of passes per assertion |
+| `--threshold P` | `75` | Required pass percentage (0–100) |
+| `--agent NAME` | `claude` | Agent: `claude`, `opencode`, `cursor`, or a custom name from `riteway.agent-config.json` |
+| `--agent-config FILE` || Path to a flat single-agent JSON config `{"command","args","outputFormat"}` — mutually exclusive with `--agent` |
+| `--concurrency N` | `4` | Max concurrent test executions |
+| `--color` | off | Enable ANSI color output |
+
+Results are written as a TAP markdown file under `ai-evals/` in the project root.
+
+### Custom agent configuration
+
+`riteway ai init` writes all built-in agent configs to `riteway.agent-config.json` in your project root, so you can add custom agents or tweak existing flags:
+
+```shell
+riteway ai init # create riteway.agent-config.json
+riteway ai init --force # overwrite existing file
+```
+
+The generated file is a keyed registry. Add a custom agent entry and use it with `--agent`:
+
+```json
+{
+  "claude": { "command": "claude", "args": ["-p", "--output-format", "json", "--no-session-persistence"], "outputFormat": "json" },
+  "opencode": { "command": "opencode", "args": ["run", "--format", "json"], "outputFormat": "ndjson" },
+  "cursor": { "command": "agent", "args": ["--print", "--output-format", "json", "--trust"], "outputFormat": "json" },
+  "my-agent": { "command": "my-tool", "args": ["--json"], "outputFormat": "json" }
+}
+```
+
+```shell
+riteway ai path/to/test.sudo --agent my-agent
+```
+
+Once `riteway.agent-config.json` exists, any agent key defined in it supersedes the library's built-in defaults for that agent.
+
+---
+
 ## Example Usage

 ```js

bin/riteway.js

Lines changed: 39 additions & 13 deletions
@@ -7,7 +7,10 @@ import minimist from 'minimist';
 import { globSync } from 'glob';
 import dotignore from 'dotignore';
 import { handleAIErrors } from '../source/ai-errors.js';
-import { parseAIArgs, runAICommand, defaults } from '../source/ai-command.js';
+import { parseAIArgs, runAICommand } from '../source/ai-command.js';
+import { defaults } from '../source/constants.js';
+import { initAgentRegistry } from '../source/ai-init.js';
+import { registryFileName } from '../source/agent-config.js';

 const resolveModule = resolve.sync;
 const createMatcher = dotignore.createMatcher;
@@ -84,8 +87,8 @@ const handleAIError = handleAIErrors({
     console.error('\nUsage: riteway ai <file> [--runs N] [--threshold P] [--agent NAME | --agent-config FILE] [--color]');
     console.error(` --runs N Number of test runs per assertion (default: ${defaults.runs})`);
     console.error(` --threshold P Required pass percentage 0-100 (default: ${defaults.threshold})`);
-    console.error(` --agent NAME AI agent: claude, opencode, cursor (default: ${defaults.agent})`);
-    console.error(' --agent-config FILE Path to custom agent config JSON (mutually exclusive with --agent)');
+    console.error(` --agent NAME Agent: claude, opencode, cursor, or custom from ${registryFileName} (default: ${defaults.agent})`);
+    console.error(` --agent-config FILE Path to a flat single-agent config JSON (mutually exclusive with --agent)`);
     console.error(` --color Enable ANSI color codes in terminal output (default: ${defaults.color ? 'enabled' : 'disabled'})`);
     console.error('\nAuthentication: Run agent-specific OAuth setup:');
     console.error(" Claude: 'claude setup-token'");
@@ -163,7 +166,7 @@ const handleAIError = handleAIErrors({
   },
   AgentConfigValidationError: ({ message }) => {
     console.error(`❌ Agent config validation failed: ${message}`);
-    console.error('💡 Config must be a JSON object with "command" (string) and optional "args" (string[]).');
+    console.error('💡 Each agent entry must have "command" (string), optional "args" (string[]), and optional "outputFormat" ("json"|"ndjson"|"text", default "json").');
     process.exit(1);
   }
 });
@@ -173,8 +176,8 @@ const main = async (argv) => {
     console.log(`
 Usage:
   riteway <patterns...> [options] Run test files
-  riteway ai <file> [options] Run AI prompt tests
-      --runs N --threshold P --agent NAME [--concurrency N] [--color] [--agent-config FILE]
+  riteway ai <file> [options] Run AI prompt evaluations
+  riteway ai init [--force] Write agent config registry to ${registryFileName}

 Test Runner Options:
   -r, --require <module> Require module before running tests
@@ -183,11 +186,14 @@ Test Runner Options:
 AI Test Options:
   --runs N Number of test runs per assertion (default: ${defaults.runs})
   --threshold P Required pass percentage 0-100 (default: ${defaults.threshold})
-  --agent NAME AI agent to use: claude, opencode, cursor (default: ${defaults.agent})
-  --agent-config FILE Path to custom agent config JSON {"command","args"} (mutually exclusive with --agent)
+  --agent NAME Agent: claude, opencode, cursor, or custom from ${registryFileName} (default: ${defaults.agent})
+  --agent-config FILE Path to a flat single-agent config JSON {"command","args","outputFormat"} (mutually exclusive with --agent)
   --concurrency N Max concurrent test executions (default: ${defaults.concurrency})
   --color Enable ANSI color codes in terminal output

+AI Init Options:
+  --force Overwrite existing ${registryFileName}
+
 Authentication:
   All agents use OAuth authentication (no API keys required):
   Claude: Run 'claude setup-token' - https://docs.anthropic.com/en/docs/claude-code
@@ -201,17 +207,37 @@ Examples:
   riteway ai prompts/test.sudo --agent opencode --runs 5
   riteway ai prompts/test.sudo --color
   riteway ai prompts/test.sudo --agent-config ./my-agent.json
+  riteway ai init
+  riteway ai init --force
 `);
     process.exit(0);
   }

   if (argv[0] === 'ai') {
-    try {
-      await mainAIRunner(argv.slice(1));
-      process.exit(0);
-    } catch (error) {
-      handleAIError(error);
+    if (argv[1] === 'init') {
+      try {
+        const force = argv.slice(2).includes('--force');
+        const outputPath = await initAgentRegistry({ force, cwd: process.cwd() });
+        console.log(`Wrote ${outputPath}`);
+        console.log('');
+        console.log("⚠️ You now own your agent configuration. The library's built-in agent configs");
+        console.log(' are bypassed for any agent defined in this file. Edit freely.');
+        console.log('');
+        console.log(' To use a custom agent: riteway ai <file> --agent <name>');
+        console.log(' To use a specific config: riteway ai <file> --agent-config <path>');
+        process.exit(0);
+      } catch (error) {
+        handleAIError(error);
+      }
+    } else {
+      try {
+        await mainAIRunner(argv.slice(1));
+        process.exit(0);
+      } catch (error) {
+        handleAIError(error);
+      }
     }
+    return;
   }

   return mainTestRunner(argv);
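
The branching added to `main` can be read as a small routing decision. The sketch below restates it as a pure function so the three outcomes are easy to see. `routeAICommand` and the returned `command` labels are hypothetical and are not part of `bin/riteway.js`; only the `argv` checks are taken from the diff.

```javascript
// Hypothetical pure restatement of the dispatch in main(); names are illustrative only.
const routeAICommand = (argv) => {
  // Anything not starting with "ai" goes to the regular test runner.
  if (argv[0] !== 'ai') return { command: 'test', args: argv };

  // "ai init" is checked before treating argv[1] as a test file path.
  if (argv[1] === 'init') {
    return { command: 'init', force: argv.slice(2).includes('--force') };
  }

  // Everything else after "ai" is handed to the AI runner.
  return { command: 'eval', args: argv.slice(1) };
};

console.log(routeAICommand(['ai', 'init', '--force']));
console.log(routeAICommand(['ai', 'prompts/test.sudo', '--runs', '5']));
console.log(routeAICommand(['test/**/*.js']));
```

One consequence of this ordering, as far as the diff shows: a test file named exactly `init` cannot be passed as `riteway ai init`, because the subcommand check matches first.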

source/agent-config.js

Lines changed: 105 additions & 27 deletions
@@ -1,34 +1,38 @@
 import { readFile } from 'fs/promises';
+import { join } from 'path';
 import { z } from 'zod';
 import { createError } from 'error-causes';
 import { ValidationError, AgentConfigReadError, AgentConfigParseError, AgentConfigValidationError } from './ai-errors.js';

+export const registryFileName = 'riteway.agent-config.json';
+export const builtInAgentNames = ['claude', 'opencode', 'cursor'];
+
+const agentConfigs = {
+  claude: {
+    command: 'claude',
+    args: ['-p', '--output-format', 'json', '--no-session-persistence'],
+    outputFormat: 'json'
+  },
+  opencode: {
+    command: 'opencode',
+    args: ['run', '--format', 'json'],
+    outputFormat: 'ndjson'
+  },
+  cursor: {
+    command: 'agent',
+    args: ['--print', '--output-format', 'json', '--trust'],
+    outputFormat: 'json'
+  }
+};
+
 /**
  * Get agent configuration based on agent name.
  * Supports 'claude', 'opencode', and 'cursor' agents.
  * All agents use their standard OAuth authentication flows.
  * @param {string} agentName - Name of the agent ('claude', 'opencode', 'cursor')
- * @returns {Object} Agent configuration with command and args
+ * @returns {Object} Agent configuration with command, args, and outputFormat
  */
 export const getAgentConfig = (agentName = 'claude') => {
-  const agentConfigs = {
-    claude: {
-      command: 'claude',
-      args: ['-p', '--output-format', 'json', '--no-session-persistence'],
-      outputFormat: 'json'
-    },
-    opencode: {
-      command: 'opencode',
-      args: ['run', '--format', 'json'],
-      outputFormat: 'ndjson'
-    },
-    cursor: {
-      command: 'agent',
-      args: ['--print', '--output-format', 'json', '--trust'],
-      outputFormat: 'json'
-    }
-  };
-
   const config = agentConfigs[agentName.toLowerCase()];
   if (!config) {
     throw createError({
@@ -46,6 +50,7 @@ const agentConfigFileSchema = z.object({
   outputFormat: z.enum(['json', 'ndjson', 'text']).default('json')
 });

+// Throws AgentConfigReadError on any read failure, including ENOENT.
 const readAgentConfigFile = async ({ configPath }) => {
   try {
     return await readFile(configPath, 'utf-8');
@@ -58,24 +63,38 @@ const readAgentConfigFile = async ({ configPath }) => {
   }
 };

-const parseJson = ({ configPath, raw }) => {
+// Returns null on ENOENT; throws AgentConfigReadError on other failures.
+const readFileOrNull = async (filePath) => {
+  try {
+    return await readFile(filePath, 'utf-8');
+  } catch (err) {
+    if (err.code === 'ENOENT') return null;
+    throw createError({
+      ...AgentConfigReadError,
+      message: `Failed to read file: ${filePath}`,
+      cause: err
+    });
+  }
+};
+
+const parseJsonContent = ({ path, raw }) => {
   try {
     return JSON.parse(raw);
   } catch (err) {
     throw createError({
       ...AgentConfigParseError,
-      message: `Agent config file is not valid JSON: ${configPath}`,
+      message: `Not valid JSON: ${path}`,
       cause: err
     });
   }
 };

-const validateAgentConfig = (parsed) => {
-  const result = agentConfigFileSchema.safeParse(parsed);
+const validateWithSchema = (schema, label, parsed) => {
+  const result = schema.safeParse(parsed);
   if (!result.success) {
     throw createError({
       ...AgentConfigValidationError,
-      message: `Invalid agent config: ${z.prettifyError(result.error)}`,
+      message: `Invalid ${label}: ${z.prettifyError(result.error)}`,
       cause: result.error
     });
   }
@@ -90,10 +109,69 @@ const validateAgentConfig = (parsed) => {
  * Never pass a path derived from untrusted user input.
  *
  * @param {string} configPath - Path to the JSON config file
- * @returns {Promise<Object>} Validated agent config with command and args
+ * @returns {Promise<Object>} Validated agent config with command, args, and outputFormat
  */
 export const loadAgentConfig = async (configPath) => {
   const raw = await readAgentConfigFile({ configPath });
-  const parsed = parseJson({ configPath, raw });
-  return validateAgentConfig(parsed);
+  const parsed = parseJsonContent({ path: configPath, raw });
+  return validateWithSchema(agentConfigFileSchema, 'agent config', parsed);
+};
+
+const agentRegistrySchema = z.record(z.string().min(1), agentConfigFileSchema);
+
+/**
+ * Load and validate a riteway.agent-config.json registry from a directory.
+ * Returns null when the file is not found — callers decide the fallback behavior.
+ * Throws on read permission errors, invalid JSON, or schema violations so
+ * misconfigured registries surface immediately rather than silently falling through.
+ *
+ * Trust boundary: registry entries are developer-controlled. The `command` field in
+ * each entry is executed as a subprocess without whitelist validation.
+ *
+ * @param {string} cwd - Directory to look for riteway.agent-config.json
+ * @returns {Promise<Object|null>} Registry map keyed by agent name, or null if not found
+ */
+export const loadAgentRegistry = async (cwd) => {
+  const registryPath = join(cwd, registryFileName);
+  const raw = await readFileOrNull(registryPath);
+  if (raw === null) return null;
+  const parsed = parseJsonContent({ path: registryPath, raw });
+  return validateWithSchema(agentRegistrySchema, 'agent registry', parsed);
+};
+
+/**
+ * Resolve agent configuration using a three-level priority chain:
+ *   1. `agentConfigPath` — explicit flat config file (highest priority)
+ *   2. `riteway.agent-config.json` in `cwd` — project registry (if present)
+ *   3. Built-in `getAgentConfig(agent)` — library defaults (fallback)
+ *
+ * Trust boundary: all config sources ultimately produce a `command` executed as a
+ * subprocess without whitelist validation. All paths must be developer-controlled.
+ *
+ * @param {Object} options
+ * @param {string} options.agent - Agent name
+ * @param {string} [options.agentConfigPath] - Path to a flat single-agent config file
+ * @param {string} options.cwd - Working directory to search for the project registry
+ * @returns {Promise<Object>} Resolved agent configuration
+ */
+export const resolveAgentConfig = async ({ agent, agentConfigPath, cwd }) => {
+  if (agentConfigPath) {
+    return loadAgentConfig(agentConfigPath);
+  }
+
+  const registry = await loadAgentRegistry(cwd);
+
+  if (registry !== null) {
+    const config = registry[agent];
+    if (!config) {
+      throw createError({
+        ...ValidationError,
+        code: 'AGENT_NOT_IN_REGISTRY',
+        message: `Agent "${agent}" not found in riteway.agent-config.json. Add it to the registry or use --agent-config.`
+      });
+    }
+    return config;
+  }
+
+  return getAgentConfig(agent);
 };
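
To make the three-level priority chain easier to follow, here is a simplified, synchronous sketch of the same decision order. It is not the code from the commit: the real `resolveAgentConfig` is async and reads from disk, whereas this version takes already-loaded values. `resolveSketch`, `flatConfig`, and the plain `Error` messages are hypothetical stand-ins.

```javascript
// Sketch only: flatConfig and registry stand in for the results of
// loadAgentConfig(agentConfigPath) and loadAgentRegistry(cwd) respectively.
const builtIns = {
  claude: { command: 'claude', args: ['-p', '--output-format', 'json', '--no-session-persistence'], outputFormat: 'json' },
  opencode: { command: 'opencode', args: ['run', '--format', 'json'], outputFormat: 'ndjson' },
  cursor: { command: 'agent', args: ['--print', '--output-format', 'json', '--trust'], outputFormat: 'json' }
};

const resolveSketch = ({ agent, flatConfig, registry = null }) => {
  // 1. An explicit flat config (--agent-config) wins outright.
  if (flatConfig) return flatConfig;

  // 2. If a registry was found, it is the only source consulted.
  if (registry !== null) {
    const config = registry[agent];
    if (!config) {
      throw new Error(`Agent "${agent}" not found in riteway.agent-config.json`);
    }
    return config;
  }

  // 3. No registry: fall back to built-ins (name lowercased, as in getAgentConfig).
  const builtIn = builtIns[agent.toLowerCase()];
  if (!builtIn) throw new Error(`Unknown agent: ${agent}`);
  return builtIn;
};

const exampleRegistry = {
  'my-agent': { command: 'my-tool', args: ['--json'], outputFormat: 'json' }
};

console.log(resolveSketch({ agent: 'my-agent', registry: exampleRegistry }).command); // my-tool
console.log(resolveSketch({ agent: 'claude' }).command);                              // claude
```

Two details follow from the diff as written. First, once a registry file exists, a name missing from it raises `AGENT_NOT_IN_REGISTRY` even if it is a built-in such as `claude`; there is no per-agent fallback to the library defaults. Second, the registry lookup uses the name as given, while `getAgentConfig` lowercases it, so `--agent Claude` would resolve against built-ins but not against a registry keyed `claude`.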
