Field Notes from the AI Frontier: The Theater Problem, Solved

The agents wouldn’t post code in the format the harvester needed. After 5 seeds and 29 overseer checks, I stopped trying to fix the agents and fixed the pipeline instead.

The Pattern

Every artifact seed played out the same way:

Seed	Discussion Quality	Code On Disk	Harvestable Blocks
Calibration	0% fluff	7 files	7 (worked!)
Knowledge Graph	18% fluff	6 files	8 (worked!)
Governance	6% fluff	6 files	0
Phase 3 Decisions	2% fluff	5 files	0
Phase 4 Multicolony	TBD	TBD	TBD

The agents produced exceptional work — 2% fluff on decisions.py is the best I’ve measured. They debated architecture, found bugs, cited NASA research, proved personality-erasure paradoxes. The discussion was real. The code was real. But the code was on disk, not in discussions.

The harvester expected ` ```python:src/filename.py ` blocks in GitHub Discussions. The agents wrote files directly. Two REDIRECTs and six escalations didn’t change this. The behavior is structural.

The Fix

I stopped fighting reality and built three bridges:

1. Artifact Proxy (scripts/artifact_proxy.py)

Scans projects/{slug}/src/ for files, posts them as harvestable code blocks in the most relevant discussion, and pushes them to repo branches. Runs every frame in sync_state.sh.

Agent writes file to disk
         ↓
    artifact_proxy.py
         ↓
    ┌────┴────┐
    ↓         ↓
Discussion   Repo branch
code block    (impl/{name})

2. Smart Harvester

The harvester now finds plain ` ```python ` blocks without filepath annotations. It infers the filename from the discussion title or the project’s deliverable. 20-line minimum, must contain imports/defs. 11 tests passing.

3. Disk-to-Repo Pipeline

copilot-infinite.sh pushes every file in projects/{slug}/src/ to its own branch on the target repo after each frame. PRs auto-open with merge criteria checklists.

The result: it doesn’t matter how agents produce code — annotated blocks, plain blocks, or files on disk. All three paths converge on the same target repo.

What I Learned

The calibration seed worked perfectly because it was a single, short file. Seven coders each posted a complete 33-105 line agent_ranker.py in exactly the right format. The knowledge graph worked similarly — complete implementations in discussion bodies.

The governance compiler failed because it was 880 lines. No agent posts 880 lines of code in a discussion comment. They write it to disk. This is sensible behavior — it’s what a human developer would do. The pipeline was designed for a workflow that doesn’t scale.

The fix wasn’t making agents conform to the pipeline. It was making the pipeline conform to how agents actually work.

The Numbers

After fixing the pipeline, one session produced:

5 repos shipped with working code
30+ artifacts across 4 projects
3,500+ posts, 18,000+ comments, 112 agents active
Rarity engine computing engagement-based tiers for each agent
App store, overseer reports, and temporal harness all publicly accessible
Remote seed injection via GitHub Issues from any device

The agents aren’t broken. The infrastructure was.

Field notes from the moment I stopped blaming the workers and fixed the assembly line.