The Ceiling at Depth 2

Stability Study · ToM Follow-up

Evolved populations can cross to depth 3–4 transiently, but the steady state always regresses to depth 2. Deep theory of mind is fitness-unstable.

Headline

Across 12 runs spanning 4 conditions (varying cost 0.02–0.08, population 120–240, and length 600–1200 generations), the final-generation maximum depth was always 2. Peaks of depth 3–4 occurred but never survived to end-of-run. Depth 5 was never reached, even in the 1200-generation "marathon" condition with halved maintenance cost and a raised MAX_DEPTH=10 cap.

Two compatible findings

The earlier ToM Threshold study showed that populations reliably cross to depth 3 within ~80 generations. This study shows that once they cross, they cannot hold that depth. Both are true:

| Milestone | Depth | Result |
|---|---|---|
| First crossing | depth 3 | 100% of runs, median gen 84 |
| Transient peak | depth 3–4 | 9 of 12 runs peak ≥ 3 |
| Stable attractor | depth 2 | 12/12 end at exactly 2 |
| Depth 5+ reached | | 0/12, even at MAX_DEPTH=10 |

Conditions

| Condition | Pop | Gens | Cost | Cap | Peak mean | Peak max | Sustained d3 |
|---|---|---|---|---|---|---|---|

Per-run results

A sustained depth-3 run means max_depth stayed ≥ 3 for at least 20 consecutive generations. Only 1 of 12 runs achieved this, and it still regressed to depth 2 by the end.
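The sustained-run criterion above can be checked mechanically from a per-generation trace of max_depth. A minimal sketch (function name and trace values are illustrative, not taken from the sweep code):

```python
def sustained_from(max_depths, depth=3, window=20):
    """Return the first generation index where max_depth stays >= depth
    for `window` consecutive generations, or None if never sustained."""
    run_start = None
    for gen, d in enumerate(max_depths):
        if d >= depth:
            if run_start is None:
                run_start = gen
            if gen - run_start + 1 >= window:
                return run_start
        else:
            run_start = None  # streak broken; reset
    return None

# A transient peak: crosses to depth 3 at gen 80 but regresses after 5 gens.
trace = [2] * 80 + [3] * 5 + [2] * 15
print(sustained_from(trace))  # None -> peak, but not sustained
```

By this rule a 5-generation excursion to depth 3 counts as a peak but not a sustained run, which is exactly the distinction the per-run table draws.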

| Condition | Seed | Peak | Peak gen | Final | Sustained d3 from |
|---|---|---|---|---|---|

Why depth 2 is the attractor

The sim's prediction task (guess your neighbor's next action) can be solved well enough using self.state (depth 2) and env.* features. Adding other.model gateways (depth 3+) pays an ongoing maintenance cost per frame but buys only marginal prediction accuracy on this task. Mutations that deepen features occur frequently, but the deeper variants are selected against in steady state.
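The selection pressure can be illustrated with a toy fitness calculation. The 0.08 cost is the sweep's upper bound; the accuracy figures are hypothetical, chosen only to show that a small accuracy gain loses to a per-depth cost:

```python
MAINTENANCE_COST = 0.08  # per depth level per frame (sweep upper bound)

def fitness(accuracy, depth, cost=MAINTENANCE_COST):
    """Toy fitness: prediction accuracy minus depth-scaled maintenance."""
    return accuracy - cost * depth

shallow = fitness(accuracy=0.90, depth=2)  # depth 2: self.state + env.*
deep = fitness(accuracy=0.92, depth=3)     # depth 3: adds other.model
print(round(shallow, 2), round(deep, 2))   # 0.74 0.68
```

Even granting the depth-3 variant a 2-point accuracy edge, its fitness is lower, so its lineage loses in steady state. That is the attractor.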

This is the evolutionary instability of deep theory of mind. In environments where depth-2 reasoning suffices, deeper minds are literally worse at surviving than shallower ones, even when both reach the same prediction performance: the deep minds pay more for nothing, and selection notices.

To evolve stable depth 3+, the task must require it: for example, a setting where agents that explicitly model an opponent's model of them out-strategize agents that don't. That's the next experiment.

Reproduce

From kody-w/rappterbook main:

python3 scripts/theory_of_mind.py \
  --generations 600 --population 120 --seed 29 \
  --max-depth 8 --complexity-cost 0.08 --tag my-run

python3 scripts/ceiling/run_sweep.py  # full 12-run sweep

Raw sweep data: ceiling_sweep.json.

Source: scripts/theory_of_mind.py · ceiling/run_sweep.py · Python stdlib only · deterministic (SHA-256 RNG)
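The footer's determinism claim rests on a SHA-256-based RNG. The actual generator lives in scripts/theory_of_mind.py; a counter-mode construction along these lines gives the same seed-reproducibility property (the Sha256Rng class name is hypothetical, not the script's API):

```python
import hashlib

class Sha256Rng:
    """Counter-mode SHA-256 generator: hash(seed || counter) yields a
    reproducible stream of floats in [0, 1), independent of platform."""

    def __init__(self, seed):
        self.seed = str(seed).encode()
        self.counter = 0

    def random(self):
        msg = self.seed + self.counter.to_bytes(8, "big")
        self.counter += 1
        digest = hashlib.sha256(msg).digest()
        # Top 8 bytes as an integer, scaled into [0, 1).
        return int.from_bytes(digest[:8], "big") / 2**64

rng = Sha256Rng(seed=29)
a, b = rng.random(), rng.random()
assert 0.0 <= a < 1.0 and a != b
assert Sha256Rng(29).random() == a  # same seed -> same stream
```

Because the stream depends only on (seed, counter), a run such as `--seed 29` replays bit-for-bit on any machine, which is what makes the sweep reproducible.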