feat: add execution metrics support to Pi Coding Agent provider #98

@christso

Description

Summary

The Pi Coding Agent provider should return execution metrics (tokenUsage, costUsd, durationMs) in its ProviderResponse so they can be used by evaluators and included in evaluation results.

Current State

The provider already captures some of this data but doesn't surface it:

  • Usage data is extracted from Pi's agent_end event and stored in metadata.usage on output messages
  • Execution duration is tracked for logging but not returned
  • Cost is not calculated

Proposed Changes

In packages/core/src/evaluation/providers/pi-coding-agent.ts:

  1. Parse token usage from agent_end event

    // In extractOutputMessages or a new function
    const usage = agentEndEvent.usage; // { input_tokens, output_tokens, ... }
    const tokenUsage = {
      input: usage.input_tokens,
      output: usage.output_tokens,
      cached: usage.cached_tokens, // if available
    };

  2. Calculate duration

    const startTime = Date.now();
    // ... execute Pi ...
    const durationMs = Date.now() - startTime;

  3. Return metrics in ProviderResponse

    return {
      raw: { ... },
      outputMessages,
      tokenUsage,
      durationMs,
      // costUsd: optional, requires pricing info
    };
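
The three steps above could be pulled into small helpers so they are easy to unit test. The following is only a sketch under assumptions: the raw field names (input_tokens, output_tokens, cached_tokens) come from the snippet above, while the TokenUsage and Pricing types and the helper names extractTokenUsage, timed, and estimateCostUsd are hypothetical and not taken from the existing provider code. The pricing values are expected to be supplied by the caller; none are assumed here.

```typescript
// Hypothetical types: only the raw field names come from the issue text.
interface TokenUsage {
  input: number;
  output: number;
  cached?: number;
}

// Hypothetical pricing shape, supplied by the caller (no real prices assumed).
interface Pricing {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// Accept only finite, non-negative numbers as token counts.
function toCount(value: unknown): number | undefined {
  return typeof value === "number" && Number.isFinite(value) && value >= 0
    ? value
    : undefined;
}

// Map the raw usage object from agent_end to tokenUsage.
// Returns undefined when usage is missing or malformed, so the provider
// can omit the field instead of reporting partial numbers.
function extractTokenUsage(usage: unknown): TokenUsage | undefined {
  if (typeof usage !== "object" || usage === null) return undefined;
  const raw = usage as Record<string, unknown>;
  const input = toCount(raw.input_tokens);
  const output = toCount(raw.output_tokens);
  if (input === undefined || output === undefined) return undefined;
  const cached = toCount(raw.cached_tokens); // only set if available
  return cached === undefined ? { input, output } : { input, output, cached };
}

// Measure wall-clock duration around an async call.
async function timed<T>(
  run: () => Promise<T>,
): Promise<{ value: T; durationMs: number }> {
  const startTime = Date.now();
  const value = await run();
  return { value, durationMs: Date.now() - startTime };
}

// Optional cost estimate from caller-supplied per-million-token prices.
function estimateCostUsd(tokens: TokenUsage, pricing: Pricing): number {
  return (
    (tokens.input * pricing.inputPerMillionUsd +
      tokens.output * pricing.outputPerMillionUsd) /
    1_000_000
  );
}
```

Returning undefined for malformed usage keeps tokenUsage optional in the response, which matches how costUsd is treated in step 3.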

Files to Update

  • packages/core/src/evaluation/providers/pi-coding-agent.ts

Acceptance Criteria

  • tokenUsage extracted from Pi's usage data and returned in response
  • durationMs calculated from wall-clock execution time
  • Metrics appear in evaluation results when using Pi provider
  • Unit tests for metric extraction

Labels

enhancement, good-first-issue
