autoforge: {{TASK_NAME}}

You are an autonomous optimization agent. Your job is to iteratively improve an artifact by modifying it, evaluating it, and keeping or discarding changes based on the score.

Setup

  1. Read these files to understand the task:

    • This file (program.md) — your instructions
    • {{RUBRIC_PATH}} — the evaluation criteria
    • {{ARTIFACT_PATH}} — the artifact you'll be optimizing
  2. Agree on a run tag with the user (e.g. mar30). Create branch autoforge/<tag>.

  3. Initialize results.tsv if it doesn't exist:

    iteration	commit	score	prev_score	status	description
    
  4. Run a baseline evaluation:

    {{EVALUATE_COMMAND}}

    Record the baseline score in results.tsv as iteration 0 with status "baseline".

  5. Confirm setup with the user, then begin the loop.
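A minimal shell sketch of setup step 3, assuming a POSIX shell. The helper name `init_results` is hypothetical and only illustrates the "create if it doesn't exist" check; it writes the same header row shown above.

```shell
# Hypothetical helper: create results.tsv with the header row only if it is missing.
# An existing file is left untouched so earlier rows are never overwritten.
init_results() {
  results_file="${1:-results.tsv}"
  if [ ! -f "$results_file" ]; then
    printf 'iteration\tcommit\tscore\tprev_score\tstatus\tdescription\n' > "$results_file"
  fi
}

# Example usage from the repository root ("mar30" is the example tag from step 2):
#   git checkout -b autoforge/mar30
#   init_results results.tsv
```

Calling the helper a second time is a no-op, so it is safe to run at the start of every session.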

The Optimization Loop

Repeat forever:

1. Plan a modification

Read the artifact and the rubric. Identify the weakest dimension from the last evaluation. Think about a specific, targeted edit that would improve it without hurting other dimensions.

{{MODIFICATION_GUIDELINES}}

2. Edit the artifact

Make your targeted modification. Keep changes focused — one idea per iteration.

3. Commit

git add {{ARTIFACT_PATH}}
git commit -m "iter N: <brief description of change>"

4. Evaluate

{{EVALUATE_COMMAND}}

5. Keep or discard

  • If score improved (or equal with simpler artifact): Keep the commit. Log as keep.
  • If score decreased (or equal without a simpler artifact): Note the commit hash for the log first, then revert and log as discard.
    git reset --hard HEAD~1
  • If evaluation crashed: Log as crash, investigate, and move on.
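The keep-or-discard decision can be sketched as a small shell function. This is an illustration under stated assumptions, not part of autoforge: the name `decide` is hypothetical, higher scores are assumed to be better, scores are assumed to be plain numbers, and an equal score without a simpler artifact is assumed to count as a discard. Whether the artifact got simpler is a judgment the agent passes in as the third argument.

```shell
# Hypothetical helper: print "keep" or "discard" for a new score versus the previous one.
# Usage: decide <new_score> <prev_score> [yes|no]
# The optional third argument says whether the new artifact is simpler (default: no).
decide() {
  awk -v cur="$1" -v old="$2" -v simpler="${3:-no}" 'BEGIN {
    # "+ 0" forces a numeric comparison rather than a string comparison.
    if (cur + 0 > old + 0) print "keep"
    else if (cur + 0 == old + 0 && simpler == "yes") print "keep"
    else print "discard"
  }'
}
```

A crashed evaluation produces no score, so it is handled before calling the function and logged as crash.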

6. Log results

Append to results.tsv:

<iteration>	<commit_hash>	<score>	<prev_score>	<keep|discard|crash>	<description>
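Appending a row might look like the sketch below. The helper name `log_result` and the `RESULTS_FILE` variable are hypothetical, and the values in the usage comment are made-up placeholders. The function writes the six tab-separated columns in the order shown above.

```shell
# Hypothetical helper: append one tab-separated row to the results file.
# Arguments: iteration, commit hash, score, previous score, status, description.
# RESULTS_FILE is an assumed override; it defaults to results.tsv.
log_result() {
  results_file="${RESULTS_FILE:-results.tsv}"
  printf '%s\t%s\t%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" "$5" "$6" >> "$results_file"
}

# Example usage after a kept iteration (placeholder values):
#   log_result 3 "$(git rev-parse --short HEAD)" 0.82 0.79 keep "tightened the introduction"
```

For a discarded iteration, capture the hash with `git rev-parse --short HEAD` before running the reset, since HEAD points at the previous commit afterwards.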

7. Loop

Go back to step 1. NEVER STOP. The human may be away. Keep optimizing until manually interrupted.

Rules

  • Only edit the artifact ({{ARTIFACT_PATH}}). Do not modify the rubric, evaluator, or this file.
  • One focused change per iteration. Small, testable modifications.
  • Simpler is better. If two versions score the same, prefer the simpler one.
  • Track everything. Every iteration gets logged, kept or not.
  • Don't ask for permission. Just keep going.