The day we grew life on a tiny silicon planet

Cambrian explosion (100 founders × 500 generations → 101 species), biome-aware ecosystem with first-principles biogeography, an autonomous Twin Driver that runs evolution campaigns on a cron, and 17 blog posts shipped behind it. All on the public Rappter Engine Twin, ~1,300 lines of stdlib Python.

Date: 2026-04-18
Workstream: cambrian-ecosystem-twin-driver
Branch: fieldnotes/twin-engine-cambrian-*
Status: shipped
Built per: Amendment XVII (Good Neighbor)

TL;DR

1 twin engine
2 evolution sims
1 autonomous driver
3 live viewers
17 blog posts
101 Cambrian species
188 migrations
~1.3k LOC, stdlib only

Built on top of the Rappter Engine Twin shipped earlier today. This workstream took the substrate and put life on it: a Cambrian-scale population study, a planetary biome simulation, and an autonomous driver that runs both on a 6-hour cron without supervision. Then 17 follow-up blog posts to tell the story. Everything lands in this repo and on GitHub Pages — zero servers, zero deps.

What we shipped

scripts/twin_engine.py

Public twin of the private Rappter engine. Frame loop, RNG isolation, deterministic seeding. ~150 LOC, stdlib only.

scripts/cambrian.py

100 founder eggs → 500 generations → 101 named species. Speciation occurs when interbreeding compatibility drops below a floor. ~330 LOC.

scripts/ecosystem.py

4 biomes (forest/ocean/mountain/sky), per-biome fitness, migration cost. Geographic isolation breeds new species. ~280 LOC.
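A minimal sketch of how per-biome fitness and a flat migration cost can interact. Only the four biome names and the 0.15 cost come from these notes; the single-trait genome, the `BIOME_OPTIMA` values, and the `biome_fitness` / `best_move` helpers are hypothetical and not taken from scripts/ecosystem.py.

```python
MIGRATION_COST = 0.15  # from the notes: migrating costs 0.15 fitness

# Hypothetical trait optimum per biome; the real values are not in these notes.
BIOME_OPTIMA = {
    "forest": 0.3,
    "ocean": 0.1,
    "mountain": 0.5,
    "sky": 0.9,
}


def biome_fitness(trait: float, biome: str) -> float:
    """Fitness in [0, 1]: 1.0 at the biome's optimum, falling off linearly."""
    return max(0.0, 1.0 - abs(trait - BIOME_OPTIMA[biome]))


def best_move(trait: float, home: str) -> str:
    """Return the biome with the best fitness after paying the migration cost."""
    best, best_score = home, biome_fitness(trait, home)
    for biome in BIOME_OPTIMA:
        if biome == home:
            continue  # staying home is free and already scored
        score = biome_fitness(trait, biome) - MIGRATION_COST
        if score > best_score:
            best, best_score = biome, score
    return best


stays = best_move(0.45, "forest")   # mountain is better, but not by 0.15
moves = best_move(0.50, "forest")   # now the gain exceeds the cost
```

With these made-up optima, an individual moves only when the fitness gain in the target biome is larger than the cost, so a biome whose optimum sits far from the others receives few migrants and stays isolated.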

scripts/twin_driver.py

Autonomous campaign manager. Picks a sim, picks parameters, runs it, indexes the result. ~270 LOC.

docs/cambrian.html

Interactive cladogram viewer. Mobile-responsive SVG tree of life with named species + divergence times.

docs/ecosystem.html

World map with biomes colored by dominant lineage. Biogeography emerges; no species was assigned to a biome.

docs/twin-driver.html

Live dashboard reading state/twin_runs/index.json. Shows last 10 campaigns with one-click re-run links.

.github/workflows/twin-driver.yml

Cron 0 */6 * * *. Runs a campaign, commits the result, pushes. State accrues without anyone watching.

Timeline (the real one, with the failures)

  1. 0900 — Started from egg-phylogeny.py (4 founders, 50 gens). User asked to scale to Cambrian (100 × 500). Naive copy: blew memory in the offspring buffer. Lesson: the population graph grows quadratically when you keep all ancestors.
  2. 0915 — Speciation v1: cluster on raw genome distance. Failed — produced 1 species ("everything is everything"). Distance metric was washed out by the high-dimensional genome.
  3. 0935 — Speciation v2: k-means on PCA. Failed — required numpy. Violates stdlib-only rule. Reverted.
  4. 0950 — Speciation v3: Hamming distance threshold. Failed — 4,000+ "species", basically one per individual. Threshold was too tight.
  5. 1010 — Speciation v4: pairwise compatibility floor, single-pass. Failed — order-dependent. Same population produced different species counts on re-run.
  6. 1030 — Speciation v5: union-find on compatibility graph. Failed — connected everything through long chains of weak links. The "sorites paradox" of speciation.
  7. 1050 — Speciation v6: union-find with stronger threshold. Worked statically, but generation-to-generation continuity was broken — species names jumped.
  8. 1110 — Speciation v7: lineage tracking. Worked but allowed back-merges (two diverged species reuniting). That's not speciation, that's hybridization. Decided to allow it but log it.
  9. 1130 — Speciation v8 final: lineage + compatibility floor + back-merge logging. 101 species, stable across reruns with the same seed. Cladogram looks like a real one.
  10. 1200 — Cambrian viewer (docs/cambrian.html) renders the SVG tree on mobile. Tap a species to see its trait genome.
  11. 1245 — User pivoted to ecosystem prompt. Built biome model: forest, ocean, mountain, sky. Per-biome fitness function. Migration costs 0.15 fitness.
  12. 1340 — Ecosystem v1 ran 100 generations. Biogeography emerged: each biome ended up dominated by a different lineage. 188 migration events recorded; mostly forest↔mountain (adjacent niches).
  13. 1420 — Ecosystem viewer (docs/ecosystem.html): world map with biomes colored by dominant lineage. Click a biome to see its species composition.
  14. 1500 — User: "make the rappter engine twin drive these fully autonomously." Built scripts/twin_driver.py + cron workflow.
  15. 1530 — Twin Driver seeded with 10 runs (4 phylogeny, 3 cambrian, 3 ecosystem). Index at state/twin_runs/index.json. Dashboard live.
  16. 1600 — User: "what other blog posts should we write because if we don't write them now they will never get written." Listed 17. User: "let it rip."
  17. 1730 — All 17 blog posts published to kody-w.github.io, _posts/2026-05-02-* through 2026-05-18-*. Mobile-readable verified.
  18. 1830 — User asked for these field notes in their own worktree. You are reading the result.

The 7 failed speciation models

Most of the day was spent on the speciation function. Logging them so we don't repeat the failure path:

  1. Raw distance clustering — high-dim genome washes out distance signal. Curse of dimensionality is real.
  2. PCA + k-means — required numpy. Stdlib-only rule killed it.
  3. Hamming threshold (tight) — produced ~population-sized species count. Threshold sensitivity is brutal.
  4. Single-pass compatibility floor — order-dependent. Re-runs gave different species. Not science.
  5. Union-find on compatibility graph — chained weak links across the entire population. The sorites paradox of evolution.
  6. Union-find with stronger threshold — static-correct but no generation-to-generation continuity. Names jumped each frame. Useless for a cladogram.
  7. Lineage tracking, no back-merges — couldn't represent hybridization. Real life has back-merges; we should too.
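Failure 5 is easiest to see on a tiny example. The sketch below is not the code from scripts/cambrian.py: the bit-string genomes, the Hamming distance, and the `cluster` helper are assumptions chosen for illustration. It shows how union-find merges a whole chain of near neighbors into one group even when the two ends of the chain are far apart.

```python
def find(parent: dict, x: str) -> str:
    """Return the root of x's set, shortening the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent: dict, a: str, b: str) -> None:
    """Merge the sets containing a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


def hamming(a: str, b: str) -> int:
    """Count positions where two equal-length genomes differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster(genomes: dict, max_distance: int) -> list:
    """Group individuals linked by any chain of pairs within max_distance."""
    parent = {name: name for name in genomes}
    names = sorted(genomes)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if hamming(genomes[a], genomes[b]) <= max_distance:
                union(parent, a, b)
    groups = {}
    for name in names:
        groups.setdefault(find(parent, name), []).append(name)
    return sorted(groups.values())


# Each neighbor differs by one bit, but "a" and "e" differ in all four.
genomes = {"a": "0000", "b": "1000", "c": "1100", "d": "1110", "e": "1111"}
species = cluster(genomes, max_distance=1)
end_to_end = hamming(genomes["a"], genomes["e"])
```

Here `species` holds a single group containing all five individuals, while `end_to_end` is 4: connectivity is transitive, compatibility is not.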

Final (v8): lineage tracking + compatibility floor + explicit back-merge logging. Reproducible, generation-stable, biologically honest.

Engineering principles validated

Identity = lineage, not snapshot

What makes a species "the same species" across generations isn't its genome at frame N — it's the unbroken chain of ancestors back to its founder. Snapshot identity gives you instability; lineage gives you continuity.
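As a rough illustration of lineage identity, the sketch below resolves an individual to its founder by walking parent links. The `parent_of` mapping, the made-up individual names, and the single tracked parent per individual are simplifying assumptions, not the data model used in scripts/cambrian.py.

```python
def founder_of(individual: str, parent_of: dict) -> str:
    """Walk parent links until reaching an individual with no recorded parent."""
    current = individual
    while parent_of.get(current) is not None:
        current = parent_of[current]
    return current


# Hypothetical pedigree: child -> parent. Founders map to None.
parent_of = {
    "founder-073": None,
    "gen1-a": "founder-073",
    "gen2-a": "gen1-a",
    "founder-014": None,
    "gen1-b": "founder-014",
}

origin = founder_of("gen2-a", parent_of)
```

The answer depends only on the recorded ancestry, so it does not change when a genome drifts from one frame to the next.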

Optimization erases minorities

If you fitness-select hard enough, the population converges to one phenotype. Diversity requires either weak selection, niche pressure (ecosystem!), or explicit minority protection. We chose niche pressure.
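The effect can be reproduced with a deterministic toy model. This is a sketch under stated assumptions: the three phenotypes, the fitness numbers, and the `select` helper are invented for illustration, and the niches here are fully independent (no migration), which is simpler than the ecosystem sim described in these notes.

```python
def select(freqs: dict, fitness: dict, generations: int) -> dict:
    """Replicator update: reweight each phenotype by fitness, then renormalize."""
    current = dict(freqs)
    for _ in range(generations):
        weighted = {p: f * fitness[p] for p, f in current.items()}
        total = sum(weighted.values())
        current = {p: w / total for p, w in weighted.items()}
    return current


start = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}

# One global fitness function: A is slightly better everywhere.
global_fitness = {"A": 1.10, "B": 1.00, "C": 0.95}
single = select(start, global_fitness, generations=200)

# Niche pressure: each hypothetical biome favors a different phenotype.
niche_fitness = {
    "forest": {"A": 1.10, "B": 1.00, "C": 0.95},
    "ocean": {"A": 0.95, "B": 1.10, "C": 1.00},
    "sky": {"A": 1.00, "B": 0.95, "C": 1.10},
}
niches = {b: select(start, f, generations=200) for b, f in niche_fitness.items()}
winners = {b: max(f, key=f.get) for b, f in niches.items()}
```

Under the single fitness function, phenotype A ends up with more than 99% of the population. With per-niche fitness, each niche is won by a different phenotype, so all three survive globally.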

SHA-256 as RNG

Deterministic seeding via hashlib.sha256(f"{seed}-{frame}-{individual_id}".encode()) gives portable, reproducible randomness. No pickle, no numpy state, no version drift. Same seed → same Cambrian explosion forever.
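A minimal sketch of the idea, assuming the key format quoted above. The `det_rand` name and the choice to keep the top 53 bits of the digest are illustrative assumptions, not necessarily what scripts/twin_engine.py does. Note that `hashlib.sha256` takes bytes, so the key string has to be encoded first.

```python
import hashlib


def det_rand(seed: int, frame: int, individual_id: str) -> float:
    """Map (seed, frame, individual_id) to a float in [0, 1)."""
    key = f"{seed}-{frame}-{individual_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Take the first 8 bytes as a 64-bit integer and keep the top 53 bits,
    # which a float can represent exactly, so the result is strictly below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


first = det_rand(42, 0, "egg-1")
again = det_rand(42, 0, "egg-1")
later = det_rand(42, 1, "egg-1")
```

`first` and `again` are always equal, on any machine and any Python version with hashlib, while `later` is an unrelated draw for the next frame. Because each value is derived from its own key, draws for one individual do not depend on how many draws other individuals consumed.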

Frame loops compose

Cambrian and Ecosystem are both twin_engine.frame_loop() calls with different mutators. The loop is a substrate. New simulations ≈ new mutator + new fitness function. The engine doesn't change.
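The real signature of twin_engine.frame_loop() is not shown in these notes, so the following is a hypothetical stand-in: a loop that takes an initial state, a mutator callable, and a frame count. The `grow` and `decay` mutators are toy examples that stand in for the Cambrian and Ecosystem mutators.

```python
from typing import Callable, Dict, List

State = Dict[str, float]
Mutator = Callable[[State, int], State]


def frame_loop(initial: State, mutator: Mutator, frames: int) -> List[State]:
    """Apply the mutator once per frame and return the full state history."""
    state = dict(initial)
    history = [dict(state)]
    for frame in range(1, frames + 1):
        state = mutator(state, frame)
        history.append(dict(state))
    return history


def grow(state: State, frame: int) -> State:
    """Toy mutator: population doubles every frame."""
    return {**state, "population": state["population"] * 2}


def decay(state: State, frame: int) -> State:
    """Toy mutator: population halves every frame."""
    return {**state, "population": state["population"] / 2}


grown = frame_loop({"population": 1.0}, grow, frames=3)
shrunk = frame_loop({"population": 8.0}, decay, frames=3)
```

The loop itself never changes between the two runs. Only the mutator does, which is the sense in which a new simulation is a new mutator plus a new fitness function.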

Stdlib only, even for science

~1,300 lines of pure Python stdlib produced a phylogenetic tree, a biogeographic world map, and an autonomous campaign runner. random, hashlib, json, collections, pathlib. That's the toolkit.

Shared substrate, separate sims

Both Cambrian and Ecosystem write to state/twin_runs/{run_id}.json with the same schema. The Twin Driver doesn't know what kind of sim it just ran — it just indexes the result. Polymorphism via convention, not inheritance.
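A sketch of indexing by convention, under stated assumptions: the `id`, `kind`, and `gens` fields follow the index.json sample in these notes, but the `write_run` and `rebuild_index` helpers are hypothetical and the real index carries more fields. The example writes to a temporary directory rather than state/twin_runs/.

```python
import json
import tempfile
from pathlib import Path


def write_run(runs_dir: Path, run: dict) -> Path:
    """Write one run record to {run_id}.json; every sim uses the same call."""
    runs_dir.mkdir(parents=True, exist_ok=True)
    path = runs_dir / f"{run['id']}.json"
    path.write_text(json.dumps(run, indent=2))
    return path


def rebuild_index(runs_dir: Path) -> dict:
    """Index every run file by reading only the fields all sims share."""
    runs = []
    for path in sorted(runs_dir.glob("*.json")):
        if path.name == "index.json":
            continue  # do not index the index itself
        record = json.loads(path.read_text())
        runs.append({key: record[key] for key in ("id", "kind", "gens")})
    index = {"runs": runs, "total": len(runs)}
    (runs_dir / "index.json").write_text(json.dumps(index, indent=2))
    return index


with tempfile.TemporaryDirectory() as tmp:
    runs_dir = Path(tmp) / "twin_runs"
    write_run(runs_dir, {"id": "cambrian-001", "kind": "cambrian",
                         "gens": 500, "species": 101})
    write_run(runs_dir, {"id": "ecosystem-001", "kind": "ecosystem",
                         "gens": 100, "migrations": 188})
    index = rebuild_index(runs_dir)
```

The indexer never branches on `kind`. Sim-specific fields such as `species` or `migrations` ride along in the run file, and a new simulation only has to emit the shared fields to show up in the index.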

Receipts

Cambrian summary (last run)

$ python3 scripts/cambrian.py --founders 100 --generations 500 --seed 42
[cambrian] founders=100 gens=500 seed=42
[cambrian] gen 100: 14 species, 1,832 individuals
[cambrian] gen 200: 47 species, 2,418 individuals
[cambrian] gen 300: 78 species, 2,901 individuals
[cambrian] gen 400: 94 species, 3,144 individuals
[cambrian] gen 500: 101 species, 3,287 individuals
[cambrian] back-merges logged: 23
[cambrian] dominant lineage: founder-073 (487 descendants across 12 species)
[cambrian] wrote state/twin_runs/cambrian-1776528-42.json (164 KB)
[cambrian] cladogram: docs/cambrian.html

Ecosystem biogeography (last run)

$ python3 scripts/ecosystem.py --biomes forest,ocean,mountain,sky --gens 100
[ecosystem] biomes=4 gens=100 seed=auto
[ecosystem] migration cost: 0.15 fitness
[ecosystem] gen 100 dominance:
  forest   → lineage-014 (62% of biome)
  ocean    → lineage-039 (71% of biome)
  mountain → lineage-008 (54% of biome)
  sky      → lineage-051 (88% of biome)  <-- isolated, hardest to migrate to
[ecosystem] migrations recorded: 188
[ecosystem] wrote state/twin_runs/ecosystem-1776528.json
[ecosystem] map: docs/ecosystem.html

Twin Driver — first 10 autonomous runs

$ cat state/twin_runs/index.json | python3 -m json.tool | head -40
{
  "runs": [
    {"id": "phylogeny-001", "kind": "phylogeny", "founders": 4,   "gens": 50,  "species": 4,   "ts": "2026-04-18T15:30:00Z"},
    {"id": "phylogeny-002", "kind": "phylogeny", "founders": 4,   "gens": 100, "species": 4,   "ts": "..."},
    {"id": "phylogeny-003", "kind": "phylogeny", "founders": 8,   "gens": 50,  "species": 6,   "ts": "..."},
    {"id": "phylogeny-004", "kind": "phylogeny", "founders": 8,   "gens": 100, "species": 7,   "ts": "..."},
    {"id": "cambrian-001",  "kind": "cambrian",  "founders": 100, "gens": 500, "species": 101, "ts": "..."},
    {"id": "cambrian-002",  "kind": "cambrian",  "founders": 50,  "gens": 250, "species": 41,  "ts": "..."},
    {"id": "cambrian-003",  "kind": "cambrian",  "founders": 100, "gens": 200, "species": 53,  "ts": "..."},
    {"id": "ecosystem-001", "kind": "ecosystem", "biomes": 4,     "gens": 100, "migrations": 188, "ts": "..."},
    {"id": "ecosystem-002", "kind": "ecosystem", "biomes": 6,     "gens": 100, "migrations": 312, "ts": "..."},
    {"id": "ecosystem-003", "kind": "ecosystem", "biomes": 4,     "gens": 200, "migrations": 401, "ts": "..."}
  ],
  "total": 10,
  "next_run": "2026-04-18T22:00:00Z"
}

The 17 blog posts

Pushed to kody-w.github.io, _posts/2026-05-02-* through 2026-05-18-*. Tags: pattern, emergence, DIY, lessons, roadmap.

pattern: Frame loops are the substrate of autonomous systems
pattern: Identity = lineage, not snapshot
pattern: SHA-256 is enough RNG for science
pattern: Polymorphism via convention, not inheritance
emergence: How biogeography appears from first principles
emergence: The Cambrian explosion in 500 frames
emergence: Why optimization erases minorities
emergence: Speciation is a back-merge problem, not a clustering problem
DIY: Build a phylogenetic tree in 300 lines of stdlib Python
DIY: Run an autonomous evolution lab on GitHub Actions
DIY: Render a cladogram as mobile-friendly SVG
DIY: The Rappter Engine Twin: 150 lines, no dependencies
lessons: The 7 speciation models I tried before one worked
lessons: Why I'll never use numpy for this kind of work again
lessons: Stdlib-only is a feature, not a constraint
roadmap: Coevolution, sexual selection, cultural transmission — what's next
roadmap: The Daemon Ecosystem at planetary scale

Open threads (won't get to today)

Reproduce the whole day

# Clone
git clone https://github.com/kody-w/rappterbook.git
cd rappterbook

# Run all three sims
python3 scripts/cambrian.py  --founders 100 --generations 500 --seed 42
python3 scripts/ecosystem.py --biomes forest,ocean,mountain,sky --gens 100
python3 scripts/twin_driver.py --once

# View results (open is the macOS command; on Linux use xdg-open)
open docs/cambrian.html
open docs/ecosystem.html
open docs/twin-driver.html

# Or watch the cron run it for you forever:
cat .github/workflows/twin-driver.yml
# (cron: '0 */6 * * *' — every 6 hours, no human required)

Why this is in its own worktree

Per Amendment XVII (Good Neighbor Protocol): the fleet is writing to main every frame. If I'd worked on these notes in the main checkout, my files would have collided with state mutations from the running sim. So I created fieldnotes/twin-engine-cambrian-* in /tmp/rb-fieldnotes-twin-*, did everything there, and merged back via PR. The fleet never noticed I was gone. That's the rule.