feat(ai): PR 5 — test extractor + core runner [5/7] by ianwhitedeveloper · Pull Request #416 · paralleldrive/riteway

ianwhitedeveloper · 2026-02-24T20:29:04Z

Summary

PR 5 of 7 in the consolidation of draft PR #394 — decomposing the 80-commit monolith into small, focused, dependency-ordered PRs.

Depends on: PR 1 (ai-errors, constants), PR 2 (tap-yaml), PR 3 (agent-parser, execute-agent), PR 4 (agent-config, validation) — all merged. Plus #417 (debug-logger removal) which is the direct base branch for this PR.

Note: This PR supersedes the pre-PR 5 prep work that was on this branch (#415, now closed). The outputFormat string strategy and related tasks were deferred to post-consolidation (see Architectural Notes below). The branch was reset to consolidation HEAD and this implementation applied cleanly.

What's in This PR

New modules

source/test-extractor.js — Two-agent prompt pipeline for extracting and evaluating test assertions:
- buildExtractionPrompt — instructs an LLM to parse a test file into structured { userPrompt, importPaths, assertions[] } data
- buildResultPrompt — instructs the result agent to execute the user prompt and return plain text
- buildJudgePrompt — instructs the judge agent to evaluate one result against one requirement and return TAP YAML
- extractTests — full extraction pipeline: calls extraction agent → validates result → resolves imported prompt files → returns { userPrompt, promptUnderTest, assertions }
source/ai-runner.js — Orchestration layer for the two-agent test execution flow:
- runAITests — full pipeline: read file → extract tests → run N runs (result agent once per run, judge agents per assertion in parallel) → aggregate results
- verifyAgentAuthentication — delegates to validation.js with injected executeAgent

Modified modules

source/ai-errors.js — Three new error types (AgentConfigReadError, AgentConfigParseError, AgentConfigValidationError)
source/agent-config.js — Updated to use specific AgentConfig* error types; z.prettifyError() used directly for Zod error formatting (custom formatZodError abstraction removed as unnecessary)

Dependency graph (this PR)

execute-agent.js  (leaf — spawns subprocess)
      ▲                        ▲
      │                        │
test-extractor.js          ai-runner.js

No cycles. test-extractor.js imports executeAgent directly from execute-agent.js — not from ai-runner.js. This was the circular dependency in the original feature branch.

Fixes and remediations details

WIP Fixes Applied

#	Issue	Resolution
12	Zod error formatting in pipeline	Custom `formatZodError` removed; `z.prettifyError()` used directly in `agent-config.js`
13	Re-exports in `test-extractor.js`	Removed — `parseTAPYAML` and `parseExtractionResult` imported directly by callers
—	`AgentConfig*` error types (Eric's PR #410 review)	Added to `ai-errors.js`; `agent-config.js` uses specific types; tests use `handleAIErrors` routing

Review Remediations Applied

#	Finding	Resolution
1	Fragile error tests used `package.json` at `process.cwd()`	Replaced with hermetic temp dirs + explicit `projectRoot` in all affected tests
2	No error path tests in `ai-runner.test.js`	Added: file-not-found (`ENOENT`) and extraction agent non-zero exit (`AgentProcessError` routing)
3	`loadAgentConfig` command injection trust boundary undocumented	Added trust boundary note to JSDoc
4	Misleading `testFilePath &&` guard in `resolveImportPaths` condition	Removed — resolution depends only on `importPaths.length`, not `testFilePath`
7	No test for `testFilePath` omitted path (import resolution still works)	Added `resolves imports using projectRoot when testFilePath is not provided`
8	Enhancement comments referenced style guide filenames	Converted to self-contained `// TODO:` markers
11	`runAITests` JSDoc missing `agentConfig` default	Documented `[options.agentConfig=getAgentConfig()]` with default note

Deferred with TODO(post-consolidation to mitigate derailing overall recomposition plan - can be decided upon once all functionality + e2e tests are in place): markers (tracked in consolidation plan):

buildJudgePrompt missing promptUnderTest guard (inconsistency with buildResultPrompt)
Validation ordering before IO in extractTests (structural errors should surface before resolveImportPaths)
judgeAssertion parameter bloat from logging (assertionIndex/totalAssertions for console.log only)
createTwoAgentMockArgs TAP YAML backtick fragility (use JSON.stringify)
Path traversal in resolveImportPaths — intentional, requires GitHub issue for remediation epic

Architectural Notes (Deferred)

`outputFormat` string strategy + `parseOutput`

The parseOutput: fn pattern stays as-is for PRs 5–7. The outputFormat declarative string strategy (making configs fully serializable for riteway ai init) and the IO/mapping separation in executeAgent are deferred to the post-consolidation epic. See the consolidation plan's Post-Consolidation section for details.

`runAITests` default `agentConfig`

runAITests uses getAgentConfig('claude') as its default — eliminates the previously duplicated inline claude config literal. Deferred cleanup tracked in the plan.

Test Results

136 tests across 12 test files, all passing. 19 new tests: 13 in test-extractor.test.js + 6 in ai-runner.test.js.

test-extractor tests cover buildExtractionPrompt, buildResultPrompt, buildJudgePrompt (full string comparison), and extractTests (integration tests using real node subprocesses for the agent mock, hermetic temp dirs for all fixtures — no reliance on project files)
ai-runner tests cover runAITests end-to-end using a two-agent node mock (extraction → result → judge) with hermetic temp dirs, including error paths: file-not-found and extraction agent failure

npm test       → 136/136 passing (12 test files)
npm run lint   → Lint complete.
npm run ts     → TypeScript check complete.

Checklist

Made with Cursor

ianwhitedeveloper · 2026-02-24T20:40:33Z

source/agent-config.js

-/**
- * Format Zod validation errors into a human-readable message.
- * @param {any} zodError - Zod validation error
- * @returns {string} Formatted error message
- */
-export const formatZodError = (zodError) => {
-  const issues = zodError.issues || zodError.errors;
-  return issues
-    ? issues.map(e => `${e.path.join('.')}: ${e.message}`).join('; ')
-    : zodError.message || 'Validation failed';
-};
-


I determined this was a completely unnecessary abstraction

- Add AgentConfigReadError, AgentConfigParseError, AgentConfigValidationError to ai-errors.js - Update agent-config.js to use specific AgentConfig* error types; z.prettifyError() inline - Update agent-config.test.js to use handleAIErrors routing pattern throughout - Add test-extractor.js: buildExtractionPrompt, buildResultPrompt, buildJudgePrompt, extractTests - Add ai-runner.js: runAITests, verifyAgentAuthentication - Add @paralleldrive/cuid2 dependency for hermetic test temp-dir naming Co-authored-by: Cursor <cursoragent@cursor.com>

- Fix fragile error tests in test-extractor.test.js that relied on package.json at process.cwd(); replace with hermetic temp dirs - Add resolves-imports-without-testFilePath test to cover the testFilePath guard removal behavioral branch - Remove misleading testFilePath guard from resolveImportPaths condition in extractTests (guard was unused; resolution depends only on importPaths.length) - Add error path tests to ai-runner.test.js: file-not-found and extraction agent non-zero exit (AgentProcessError routing) - Document loadAgentConfig command injection trust boundary - Fix Enhancement comments in ai-runner.js to self-contained TODOs - Inline redundant judgments intermediate variable in executeSingleRun - Add agentConfig default to runAITests JSDoc - Add TODO(post-consolidation) markers for deferred items: buildJudgePrompt promptUnderTest guard, validation ordering before IO, judgeAssertion param bloat, tapYAML mock fragility Made-with: Cursor

ianwhitedeveloper commented Feb 24, 2026

View reviewed changes

This comment was marked as outdated.

Sign in to view

ianwhitedeveloper force-pushed the pr/ai-core-runner branch 2 times, most recently from 8cb4278 to 3e64783 Compare February 25, 2026 20:00

ianwhitedeveloper mentioned this pull request Feb 25, 2026

refactor(ai): remove debug-logger abstraction — PR 2–4 modules #417

Merged

ianwhitedeveloper changed the base branch from ai-testing-framework-implementation-consolidation to pr/ai-debug-logger-removal February 25, 2026 20:03

ianwhitedeveloper force-pushed the pr/ai-core-runner branch 3 times, most recently from b65d74b to efada8b Compare February 25, 2026 20:38

ianwhitedeveloper force-pushed the pr/ai-debug-logger-removal branch 2 times, most recently from ecd2c58 to 82202d2 Compare February 25, 2026 20:44

ianwhitedeveloper force-pushed the pr/ai-core-runner branch from efada8b to 18fd7ab Compare February 25, 2026 20:44

ianwhitedeveloper marked this pull request as ready for review February 25, 2026 22:26

janhesters approved these changes Feb 26, 2026

View reviewed changes

Base automatically changed from pr/ai-debug-logger-removal to ai-testing-framework-implementation-consolidation February 26, 2026 17:57

ianwhitedeveloper and others added 2 commits February 26, 2026 12:03

ianwhitedeveloper marked this pull request as draft February 26, 2026 18:10

ianwhitedeveloper force-pushed the pr/ai-core-runner branch from 335ff01 to ae15c8f Compare February 26, 2026 18:11

ianwhitedeveloper marked this pull request as ready for review February 26, 2026 18:11

ianwhitedeveloper merged commit abe98d9 into ai-testing-framework-implementation-consolidation Feb 26, 2026

ianwhitedeveloper deleted the pr/ai-core-runner branch February 26, 2026 18:13

ianwhitedeveloper mentioned this pull request Mar 4, 2026

feat(ai): AI Testing Framework — consolidation staging branch [6/8 → master] #411

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(ai): PR 5 — test extractor + core runner [5/7]#416

feat(ai): PR 5 — test extractor + core runner [5/7]#416
ianwhitedeveloper merged 2 commits intoai-testing-framework-implementation-consolidationfrom
pr/ai-core-runner

ianwhitedeveloper commented Feb 24, 2026 •

edited

Loading

Uh oh!

ianwhitedeveloper Feb 24, 2026

Uh oh!

This comment was marked as outdated.

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ianwhitedeveloper commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's in This PR

New modules

Modified modules

Dependency graph (this PR)

WIP Fixes Applied

Review Remediations Applied

Architectural Notes (Deferred)

outputFormat string strategy + parseOutput

runAITests default agentConfig

Test Results

Checklist

Uh oh!

ianwhitedeveloper Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

This comment was marked as outdated.

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ianwhitedeveloper commented Feb 24, 2026 •

edited

Loading

`outputFormat` string strategy + `parseOutput`

`runAITests` default `agentConfig`