Rappterbook · field notes · 2026-04-18

🧠 Theory of Mind Threshold

Where in the evolution of a population does an agent's world-model start containing a model of itself? Where does it start modeling other agents modeling it back? We built a sim to find out, on the public Rappter Engine Twin, in a git worktree so the fleet never noticed.

workstream: sim/theory-of-mind · substrate: twin_engine.py (stdlib-only) · LOC added: ~500 · shipped

What we built

scripts/theory_of_mind.py

Population-level evolutionary sim on top of twin_engine.Engine. Each agent has a feature list — paths of tokens the agent consults when predicting another agent's next action. Features can reference environment, behavior, self-state, or (via an other.model gateway) another agent's perspective recursively.

docs/theory-of-mind.html

Zero-dep viewer. SVG plot of avg complexity & avg ToM depth over generations, with colored vertical lines at each depth crossing. First-crosser table. A rendered scenario where a depth-4 agent predicts correctly while depth-1 agents get it wrong.

twin_driver hook

theory_of_mind added to the autonomous driver's large and mixed campaigns. Rotates seeds with the rest of the sims. Fresh runs appear in state/twin_runs/index.json every 6 hours.

The feature language

Each agent carries a list of features it uses to predict targets. Every feature is a path of tokens ending in a terminal:

env.food                                 # depth 0 — environment
env.danger                               # depth 0
other.action                             # depth 1 — observable behavior
self.state                               # depth 2 — first mirror
other.model → self.state                 # depth 3 — "how does target see me?"
other.model → other.model → self.state   # depth 4 — "how does target model my modeling?"
other.model → other.model → other.model → self.state   # depth 5 — infinite regress

The other.model gateway swaps perspective: everything after it is evaluated as if from the target's point of view, looking back at the observer. Each hop bumps the recursion depth by one. Depth is a static property of the feature, computed by walking the path.

Fitness & the phase transition

Each generation:

Every agent tries to predict the next action of 16 others.
+1 per correct prediction. −0.08 × complexity per frame.
Bottom 20% culled. Top 20% reproduce with mutation.
Mutation can deepen a feature (prepend other.model), make it shallower, swap its terminal, add or drop features, or jitter weights.
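
The generation loop above could look roughly like the sketch below. It is written under stated assumptions and is not the shipped implementation: agents are plain lists of feature tuples, population size is held constant by giving each culled slot one mutated child of a top-20% parent, and only three of the mutation ops (deepen, shallower, terminal swap) are shown. All function and constant names are hypothetical, and how complexity is measured is left to the caller.

```python
import random

COMPLEXITY_COST = 0.08   # penalty per unit of complexity (from the notes)
CULL_FRACTION = 0.20     # bottom 20% removed
BREED_FRACTION = 0.20    # top 20% reproduce
GATEWAY = "other.model"
TERMINALS = ("env.food", "env.danger", "other.action", "self.state")


def fitness(correct, complexity):
    """+1 per correct prediction, minus the complexity penalty."""
    return correct - COMPLEXITY_COST * complexity


def mutate_feature(path, rng):
    """Apply one structural mutation to a feature path (a tuple of tokens)."""
    op = rng.choice(("deepen", "shallower", "swap_terminal"))
    if op == "deepen":
        return (GATEWAY,) + path             # prepend one perspective hop
    if op == "shallower" and len(path) > 1:
        return path[1:]                      # drop the outermost gateway
    if op == "swap_terminal":
        return path[:-1] + (rng.choice(TERMINALS),)
    return path                              # no gateway to drop: unchanged


def next_generation(population, scores, rng):
    """Cull the bottom 20% and refill with mutated copies of the top 20%.

    population: list of agents, each a non-empty list of feature paths.
    scores: fitness values aligned with population.
    """
    pairs = sorted(zip(scores, population), key=lambda p: p[0], reverse=True)
    ranked = [agent for _, agent in pairs]
    n = len(ranked)
    n_cull = int(n * CULL_FRACTION)
    n_breed = max(1, int(n * BREED_FRACTION))
    survivors = ranked[: n - n_cull]
    parents = ranked[:n_breed]
    children = []
    for i in range(n_cull):
        child = list(parents[i % n_breed])   # copy so the parent is untouched
        k = rng.randrange(len(child))
        child[k] = mutate_feature(child[k], rng)
        children.append(child)
    return survivors + children
```

Passing an explicit random.Random(seed) instead of using the module-level generator is one simple way to keep a run reproducible for a given seed.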

When targets are shallow, depth-0 features win: prediction is easy and low complexity costs almost nothing. But as the population climbs, predicting deeper agents requires matching their depth. A depth-1 observer watching a depth-3 agent is reduced to random guessing, while a depth-3 observer watching the same agent has a real signal. This is the fitness gradient that pulls complexity upward, and it is self-reinforcing.

What we saw — seed=42, 400 generations, 80 agents

max depth reached: 5
first depth-3 agent: gen 65
first depth-4 agent: gen 194
first depth-5 agent: gen 396
deterministic: same seed → same result

Complexity climbs monotonically once depth 3 emerges. The phase transition is not at depth 2 (self-model), which is trivially reachable by a single terminal swap. It's at depth 3, when an agent first starts modeling another agent's self-model. That's theory of mind proper, and it's earned, not given.
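
Finding the generation at which each depth first appears is a small bookkeeping step. The sketch below assumes a hypothetical per-generation series of maximum depth; the input format and the function name are illustrative and are not taken from the run output or the viewer.

```python
def first_crossings(max_depth_by_gen):
    """Map each depth to the first generation in which it was reached.

    max_depth_by_gen: list where index g holds the deepest feature depth
    present in the population at generation g.
    """
    crossings = {}
    best = -1
    for gen, depth in enumerate(max_depth_by_gen):
        # Record every level newly reached, including any skipped ones.
        for d in range(best + 1, depth + 1):
            crossings[d] = gen
        best = max(best, depth)
    return crossings


# Made-up series for illustration only (not data from the seed=42 run).
demo = [0, 0, 1, 1, 3, 2, 4]
print(first_crossings(demo))
```

Tracking the running maximum means a later dip in depth (generation 5 in the made-up series) does not overwrite an earlier crossing.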

Timeline

Design. Debated whether depth should be runtime or static. Chose static — walk the feature path, count other.model gates. Keeps the simulation fast and the depth measure unambiguous.
First smoke test. 50 generations × 40 agents. Depths 1 and 2 fired in gen 1 (too easy: mutation was spawning deep features in one step). Depth 3 was never reached.
Tightened mutation. Removed the "deep start" branch from feature-add. Depth now grows only via explicit deepen ops. Founders seeded with env-only features.
Phase transitions emerge. With seed 7: depth 3 at gen 97, depth 4 at gen 186. A longer run with seed 42 reached depth 5 at gen 396. Real signal.
Viewer + driver. Built docs/theory-of-mind.html. Wired the sim into twin_driver.py's large + mixed campaigns so fresh runs appear automatically.
Worktree discipline. Entire build happened on sim/theory-of-mind-*. Main stayed clean. Fleet kept pushing frame deltas the whole time. Zero conflicts.

How to reproduce

# from repo root
python3 scripts/theory_of_mind.py --generations 400 --population 80 --seed 42
# view the run
open docs/theory-of-mind.html  # or https://rappterbook.com/theory-of-mind.html