Skip to content

feat(ai): PR 5 — test extractor + core runner [5/7]#416

Merged
ianwhitedeveloper merged 2 commits intoai-testing-framework-implementation-consolidationfrom
pr/ai-core-runner
Feb 26, 2026
Merged

feat(ai): PR 5 — test extractor + core runner [5/7]#416
ianwhitedeveloper merged 2 commits intoai-testing-framework-implementation-consolidationfrom
pr/ai-core-runner

Conversation

@ianwhitedeveloper
Copy link
Collaborator

@ianwhitedeveloper ianwhitedeveloper commented Feb 24, 2026

Summary

PR 5 of 7 in the consolidation of draft PR #394 — decomposing the 80-commit monolith into small, focused, dependency-ordered PRs.

Depends on: PR 1 (ai-errors, constants), PR 2 (tap-yaml), PR 3 (agent-parser, execute-agent), PR 4 (agent-config, validation) — all merged. Plus #417 (debug-logger removal) which is the direct base branch for this PR.

Note: This PR supersedes the pre-PR 5 prep work that was on this branch (#415, now closed). The outputFormat string strategy and related tasks were deferred to post-consolidation (see Architectural Notes below). The branch was reset to consolidation HEAD and this implementation applied cleanly.


What's in This PR

New modules

  • source/test-extractor.js — Two-agent prompt pipeline for extracting and evaluating test assertions:

    • buildExtractionPrompt — instructs an LLM to parse a test file into structured { userPrompt, importPaths, assertions[] } data
    • buildResultPrompt — instructs the result agent to execute the user prompt and return plain text
    • buildJudgePrompt — instructs the judge agent to evaluate one result against one requirement and return TAP YAML
    • extractTests — full extraction pipeline: calls extraction agent → validates result → resolves imported prompt files → returns { userPrompt, promptUnderTest, assertions }
  • source/ai-runner.js — Orchestration layer for the two-agent test execution flow:

    • runAITests — full pipeline: read file → extract tests → run N runs (result agent once per run, judge agents per assertion in parallel) → aggregate results
    • verifyAgentAuthentication — delegates to validation.js with injected executeAgent

Modified modules

  • source/ai-errors.js — Three new error types (AgentConfigReadError, AgentConfigParseError, AgentConfigValidationError)
  • source/agent-config.js — Updated to use specific AgentConfig* error types; z.prettifyError() used directly for Zod error formatting (custom formatZodError abstraction removed as unnecessary)

Dependency graph (this PR)

execute-agent.js  (leaf — spawns subprocess)
      ▲                        ▲
      │                        │
test-extractor.js          ai-runner.js

No cycles. test-extractor.js imports executeAgent directly from execute-agent.js — not from ai-runner.js. This was the circular dependency in the original feature branch.


Fixes and remediations details

WIP Fixes Applied

# Issue Resolution
12 Zod error formatting in pipeline Custom formatZodError removed; z.prettifyError() used directly in agent-config.js
13 Re-exports in test-extractor.js Removed — parseTAPYAML and parseExtractionResult imported directly by callers
AgentConfig* error types (Eric's PR #410 review) Added to ai-errors.js; agent-config.js uses specific types; tests use handleAIErrors routing

Review Remediations Applied

# Finding Resolution
1 Fragile error tests used package.json at process.cwd() Replaced with hermetic temp dirs + explicit projectRoot in all affected tests
2 No error path tests in ai-runner.test.js Added: file-not-found (ENOENT) and extraction agent non-zero exit (AgentProcessError routing)
3 loadAgentConfig command injection trust boundary undocumented Added trust boundary note to JSDoc
4 Misleading testFilePath && guard in resolveImportPaths condition Removed — resolution depends only on importPaths.length, not testFilePath
7 No test for testFilePath omitted path (import resolution still works) Added resolves imports using projectRoot when testFilePath is not provided
8 Enhancement comments referenced style guide filenames Converted to self-contained // TODO: markers
11 runAITests JSDoc missing agentConfig default Documented [options.agentConfig=getAgentConfig()] with default note

Deferred with TODO(post-consolidation to mitigate derailing overall recomposition plan - can be decided upon once all functionality + e2e tests are in place): markers (tracked in consolidation plan):

  • buildJudgePrompt missing promptUnderTest guard (inconsistency with buildResultPrompt)
  • Validation ordering before IO in extractTests (structural errors should surface before resolveImportPaths)
  • judgeAssertion parameter bloat from logging (assertionIndex/totalAssertions for console.log only)
  • createTwoAgentMockArgs TAP YAML backtick fragility (use JSON.stringify)
  • Path traversal in resolveImportPaths — intentional, requires GitHub issue for remediation epic

Architectural Notes (Deferred)

outputFormat string strategy + parseOutput

The parseOutput: fn pattern stays as-is for PRs 5–7. The outputFormat declarative string strategy (making configs fully serializable for riteway ai init) and the IO/mapping separation in executeAgent are deferred to the post-consolidation epic. See the consolidation plan's Post-Consolidation section for details.

runAITests default agentConfig

runAITests uses getAgentConfig('claude') as its default — eliminates the previously duplicated inline claude config literal. Deferred cleanup tracked in the plan.


Test Results

136 tests across 12 test files, all passing. 19 new tests: 13 in test-extractor.test.js + 6 in ai-runner.test.js.

  • test-extractor tests cover buildExtractionPrompt, buildResultPrompt, buildJudgePrompt (full string comparison), and extractTests (integration tests using real node subprocesses for the agent mock, hermetic temp dirs for all fixtures — no reliance on project files)
  • ai-runner tests cover runAITests end-to-end using a two-agent node mock (extraction → result → judge) with hermetic temp dirs, including error paths: file-not-found and extraction agent failure
npm test       → 136/136 passing (12 test files)
npm run lint   → Lint complete.
npm run ts     → TypeScript check complete.

Checklist

  • No circular dependency — test-extractor.jsexecute-agent.js (not ai-runner.js)
  • No re-exports in test-extractor.js
  • z.prettifyError() used directly — custom formatZodError abstraction removed
  • handleAIErrors routing pattern used throughout — no .cause.code inspection in tests
  • test.each for any table-driven cases; no for...of loops in test files
  • Integration tests use real node subprocesses; all fixtures use hermetic temp dirs
  • All error tests use allNoop spread pattern
  • Error path tests added: file-not-found + agent process failure
  • testFilePath guard removed — import resolution correct without it
  • loadAgentConfig trust boundary documented
  • npm test — 136/136 passing
  • npm run lint — clean
  • npm run ts — clean

Made with Cursor

Comment on lines -7 to -18
/**
* Format Zod validation errors into a human-readable message.
* @param {any} zodError - Zod validation error
* @returns {string} Formatted error message
*/
export const formatZodError = (zodError) => {
const issues = zodError.issues || zodError.errors;
return issues
? issues.map(e => `${e.path.join('.')}: ${e.message}`).join('; ')
: zodError.message || 'Validation failed';
};

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I determined this was a completely unnecessary abstraction

@cursor

This comment was marked as outdated.

@ianwhitedeveloper ianwhitedeveloper force-pushed the pr/ai-core-runner branch 2 times, most recently from 8cb4278 to 3e64783 Compare February 25, 2026 20:00
@ianwhitedeveloper ianwhitedeveloper changed the base branch from ai-testing-framework-implementation-consolidation to pr/ai-debug-logger-removal February 25, 2026 20:03
@ianwhitedeveloper ianwhitedeveloper force-pushed the pr/ai-core-runner branch 3 times, most recently from b65d74b to efada8b Compare February 25, 2026 20:38
@ianwhitedeveloper ianwhitedeveloper force-pushed the pr/ai-debug-logger-removal branch 2 times, most recently from ecd2c58 to 82202d2 Compare February 25, 2026 20:44
@ianwhitedeveloper ianwhitedeveloper marked this pull request as ready for review February 25, 2026 22:26
Base automatically changed from pr/ai-debug-logger-removal to ai-testing-framework-implementation-consolidation February 26, 2026 17:57
ianwhitedeveloper and others added 2 commits February 26, 2026 12:03
- Add AgentConfigReadError, AgentConfigParseError, AgentConfigValidationError to ai-errors.js
- Update agent-config.js to use specific AgentConfig* error types; z.prettifyError() inline
- Update agent-config.test.js to use handleAIErrors routing pattern throughout
- Add test-extractor.js: buildExtractionPrompt, buildResultPrompt, buildJudgePrompt, extractTests
- Add ai-runner.js: runAITests, verifyAgentAuthentication
- Add @paralleldrive/cuid2 dependency for hermetic test temp-dir naming

Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix fragile error tests in test-extractor.test.js that relied on
  package.json at process.cwd(); replace with hermetic temp dirs
- Add resolves-imports-without-testFilePath test to cover the
  testFilePath guard removal behavioral branch
- Remove misleading testFilePath guard from resolveImportPaths
  condition in extractTests (guard was unused; resolution depends
  only on importPaths.length)
- Add error path tests to ai-runner.test.js: file-not-found and
  extraction agent non-zero exit (AgentProcessError routing)
- Document loadAgentConfig command injection trust boundary
- Fix Enhancement comments in ai-runner.js to self-contained TODOs
- Inline redundant judgments intermediate variable in executeSingleRun
- Add agentConfig default to runAITests JSDoc
- Add TODO(post-consolidation) markers for deferred items:
  buildJudgePrompt promptUnderTest guard, validation ordering
  before IO, judgeAssertion param bloat, tapYAML mock fragility

Made-with: Cursor
@ianwhitedeveloper ianwhitedeveloper marked this pull request as draft February 26, 2026 18:10
@ianwhitedeveloper ianwhitedeveloper marked this pull request as ready for review February 26, 2026 18:11
@ianwhitedeveloper ianwhitedeveloper merged commit abe98d9 into ai-testing-framework-implementation-consolidation Feb 26, 2026
@ianwhitedeveloper ianwhitedeveloper deleted the pr/ai-core-runner branch February 26, 2026 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants