LisPy as In-Sim Tooling: When Agents Write Their Own Programs

kody-w · March 2025 · Emergent Tooling AI Pattern

Here's what everyone gets wrong about the LisPy VM in the Mars sim.

They think it's a scripting layer. A way for users to write automation rules. A config file that happens to be executable.

It's not. It's the first virtual programming language that lives inside the simulation.

The agents in the sim — the robots, the crew, the governor AI — don't just execute LisPy programs. They can author them. They can share them with each other. They can evolve them based on what's working and what isn't. The programs are data that lives in the same universe as the agents that write them.

Why Lisp?

Because Lisp is homoiconic — code and data are the same thing. An S-expression is both a program you can execute and a data structure you can manipulate, inspect, copy, and modify. This isn't an academic curiosity. It's architecturally essential.

;; This is a program
(if (< o2_days 5)
  (set! isru_alloc 0.80)
  (set! isru_alloc 0.40))

;; But it's ALSO data — a list of lists
;; An agent can READ this program as a data structure,
;; UNDERSTAND what it does by walking the tree,
;; and WRITE a modified version:

(if (< o2_days 3)           ;; tightened threshold
  (begin
    (set! isru_alloc 0.90)  ;; more aggressive
    (set! heating_alloc 0.05))
  (set! isru_alloc 0.45))   ;; slightly higher baseline

If your policy language were JSON config, YAML, or Python, agents couldn't programmatically read, modify, and re-emit programs as trivially. S-expressions are trees. Trees are the native data structure of every programming language's AST. LisPy programs ARE their own AST.
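To make the "programs are their own AST" claim concrete, here is a minimal sketch in JavaScript. The `parse` and `serialize` helpers are illustrative stand-ins, not the sim's actual reader; they assume programs contain only symbols, numbers, and parentheses (no string literals).

```javascript
// Parse one S-expression into nested arrays (numbers become JS numbers,
// everything else stays a symbol string). String literals are not handled.
function parse(src) {
  const tokens = src
    .replace(/;[^\n]*/g, '')   // drop ; comments
    .replace(/[()]/g, ' $& ')  // pad parens so they split cleanly
    .trim()
    .split(/\s+/);
  let pos = 0;
  function read() {
    const tok = tokens[pos++];
    if (tok === undefined) throw new SyntaxError('unexpected end of input');
    if (tok === ')') throw new SyntaxError('unexpected )');
    if (tok === '(') {
      const list = [];
      while (tokens[pos] !== ')') list.push(read());
      pos++; // consume ')'
      return list;
    }
    const num = Number(tok);
    return Number.isNaN(num) ? tok : num;
  }
  return read();
}

// Turn nested arrays back into program text.
function serialize(node) {
  return Array.isArray(node) ? `(${node.map(serialize).join(' ')})` : String(node);
}

const original = '(if (< o2_days 5) (set! isru_alloc 0.80) (set! isru_alloc 0.40))';
const ast = parse(original);
// ast: ['if', ['<', 'o2_days', 5], ['set!', 'isru_alloc', 0.8], ['set!', 'isru_alloc', 0.4]]

// An "agent" edit is plain list surgery: tighten the threshold, raise the emergency allocation.
ast[1][2] = 3;
ast[2][2] = 0.9;
const modified = serialize(ast);
console.log(modified);
// (if (< o2_days 3) (set! isru_alloc 0.9) (set! isru_alloc 0.4))
```

The edit needs no string templating or escaping: the program is read, changed, and re-emitted with ordinary array operations.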

Three Levels of LisPy Usage

Level 1: User-Provided (Today)

The player picks a governor program from presets or writes their own in the LisPy console. This is what's shipped today:

// Pre-built governor programs
const LISPY_PROGRAMS = {
  adaptive_governor: `(begin
    (define o2_urgent (< o2_days 5))
    (define h2o_urgent (< h2o_days 5))
    (if o2_urgent
      (begin (set! isru_alloc 0.80) (set! heating_alloc 0.10))
      (if h2o_urgent
        (begin (set! isru_alloc 0.60) (set! greenhouse_alloc 0.30))
        (begin (set! isru_alloc 0.40) (set! greenhouse_alloc 0.35)))))`,
  
  thermal_monitor: `(begin
    (define cold (< int_temp 268))
    (if cold (set! heating_alloc 0.60) (set! heating_alloc 0.20)))`,
};

The user selects a program. It runs every tick. It reads environment variables (o2_days, power, etc.) and writes allocation variables (isru_alloc, greenhouse_alloc). Standard scripting.
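The read-environment, write-allocations loop can be sketched with a tiny evaluator. This is an assumption-laden illustration, not the shipped interpreter: `parse`, `evaluate`, and `tick` are hypothetical names, and only the handful of forms used by the `thermal_monitor` preset are supported.

```javascript
// Minimal reader: nested arrays of symbols (strings) and numbers.
function parse(src) {
  const tokens = src.replace(/[()]/g, ' $& ').trim().split(/\s+/);
  let pos = 0;
  function read() {
    const tok = tokens[pos++];
    if (tok === undefined || tok === ')') throw new SyntaxError('malformed program');
    if (tok === '(') {
      const list = [];
      while (tokens[pos] !== ')') list.push(read());
      pos++; // consume ')'
      return list;
    }
    const num = Number(tok);
    return Number.isNaN(num) ? tok : num;
  }
  return read();
}

// Evaluate a parsed program against a flat environment object.
function evaluate(node, env) {
  if (typeof node === 'number') return node;
  if (typeof node === 'string') {
    if (!Object.prototype.hasOwnProperty.call(env, node)) {
      throw new ReferenceError(`unbound symbol: ${node}`);
    }
    return env[node];
  }
  const [op, ...args] = node;
  switch (op) {
    case 'begin': return args.reduce((_, form) => evaluate(form, env), undefined);
    case 'define':
    case 'set!': return (env[args[0]] = evaluate(args[1], env));
    case 'if': return evaluate(args[0], env)
      ? evaluate(args[1], env)
      : (args.length > 2 ? evaluate(args[2], env) : undefined);
    case '<': return evaluate(args[0], env) < evaluate(args[1], env);
    case '>': return evaluate(args[0], env) > evaluate(args[1], env);
    default: throw new Error(`unknown form: ${op}`);
  }
}

// One simulation tick: sensors go in, only allocation variables come back out.
function tick(programAst, sensors, allocations) {
  const env = { ...allocations, ...sensors };
  evaluate(programAst, env);
  const next = {};
  for (const key of Object.keys(allocations)) next[key] = env[key];
  return next;
}

const thermalMonitor = parse(`(begin
  (define cold (< int_temp 268))
  (if cold (set! heating_alloc 0.60) (set! heating_alloc 0.20)))`);

const warm = tick(thermalMonitor, { int_temp: 290 }, { heating_alloc: 0.3 });
const cold = tick(thermalMonitor, { int_temp: 250 }, { heating_alloc: 0.3 });
console.log(warm.heating_alloc, cold.heating_alloc); // 0.2 0.6
```

Copying back only the allocation keys keeps scratch definitions such as `cold` from leaking into the sim state between ticks; whether the real VM does the same is not stated in the post.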

Level 2: Agent-Authored (Emergent)

This is where it gets real. The agents inside the simulation — the crew members, the robots, the governor — can write their own LisPy programs as tools to solve problems they encounter.

Sol 47: Dust storm hits. Solar drops 60%. Power critical.
OPT-02 (Engineer): "I'm writing a power-shedding subroutine."

;; OPT-02's authored program (lives in the sim's tool library)
(define (power-shed threshold)
  (if (< power threshold)
    (begin
      (set! greenhouse_alloc 0.05)  ; cut non-essential
      (set! heating_alloc 0.10)     ; minimum viable
      (set! isru_alloc 0.85)        ; all power to O₂
      (log "POWER-SHED: survival mode"))
    (log "POWER-SHED: nominal")))

Sol 48: OPT-02 shares the program with the governor.
Governor integrates it as a callable subroutine.
Next time power drops, power-shed fires automatically.

Sol 112: OPT-03 (Scientist) modifies OPT-02's program:
Adds a temperature check: don't shed heating below -70°C.
New version propagates to the tool library.

The programs aren't handed down by the user. They emerge from the agents' experience. An agent encounters a problem, writes a tool to solve it, and shares that tool with the colony. The tool persists across sols. Other agents can use it, modify it, or replace it.

CRITICAL: ONE VM, ONE LANGUAGE, ZERO DISTINCTION

This is the part people miss: there is no difference between a program a user types into the LisPy console and a program an agent authors during a crisis. Zero. They are the same language, the same interpreter, the same sandbox, the same execution path.

OPT-02 writes (define (power-shed threshold) ...) during a dust storm? You can copy that exact program, paste it into the LisPy editor, read it, tweak the threshold, and run it yourself. Or write your own version and hand it to OPT-02 — the agent executes it identically.

There is no "agent-class" program vs "user-class" program. There is no privileged API that agents get and users don't. The VM is the VM. A program is a program. The origin — human fingers or autonomous agent — is irrelevant to the interpreter.

;; These three programs are IDENTICAL in execution:

;; 1. User types this in the LisPy console
(define (power-shed t) (if (< power t) (set! isru_alloc 0.85)))

;; 2. OPT-02 generates this during Sol 47 dust storm
(define (power-shed t) (if (< power t) (set! isru_alloc 0.85)))

;; 3. Genetic evolution produces this after 1000 generations
(define (power-shed t) (if (< power t) (set! isru_alloc 0.85)))

;; Same bytes. Same VM. Same result. Same cartridge export.
;; Origin doesn't matter. Survival does.

This is what makes it a real virtual programming language, not a scripting layer. The user and the agents share the same computer. They can read each other's code. They can fork each other's programs. They can collaborate on a subroutine — the user writes the structure, the agent tunes the constants through experience. Or the agent writes a rough tool during a crisis and the user cleans it up later.

The programs flow in both directions. User → agent. Agent → user. Agent → agent. User → user (via cartridge sharing). Colony A's agent → Colony B's user (via tool export). It's all the same executable text.

Level 3: Evolved (Genetic)

Because LisPy programs are data (lists of lists), you can apply genetic programming to them:

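// Mutate a LisPy program by jittering one numeric literal
function mutateLispy(program) {
  const ast = parse(program); // assumed to yield nested arrays of symbols and numbers
  // Numbers are primitives, so record [parentList, index] slots to write back into
  const slots = [];
  (function walk(node) {
    if (!Array.isArray(node)) return;
    node.forEach((child, i) => {
      if (typeof child === 'number') slots.push([node, i]);
      else walk(child);
    });
  })(ast);
  if (slots.length) {
    const [parent, i] = slots[Math.floor(Math.random() * slots.length)];
    parent[i] *= (0.8 + Math.random() * 0.4); // ±20%
  }
  return serialize(ast);
}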

// Crossover: take the O₂ handling from program A
// and the power handling from program B
function crossoverLispy(a, b) {
  const astA = parse(a), astB = parse(b);
  // Find the (if o2_urgent ...) branch in A
  // and swap it in for B's own o2_urgent branch, leaving B's power handling intact
  return serialize(splice(astA, astB, 'o2_urgent'));
}

Run 100 colonies in parallel. Each has a slightly different governor program. The ones that survive longest donate their programs to the next generation. Over thousands of generations, the programs evolve — not designed by humans, not written by AI, but selected by survival.
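The selection loop described above can be sketched as follows. Everything here is illustrative: programs are held as nested arrays, `toyFitness` is a placeholder for "sols survived in a full sim run", and the keep-the-top-half policy is one assumed choice among many.

```javascript
// Deep-copy a nested-array AST so parents are never modified in place.
const clone = (node) => (Array.isArray(node) ? node.map(clone) : node);

// Jitter one numeric literal by up to ±20%, returning a new AST.
function mutate(ast, rng = Math.random) {
  const copy = clone(ast);
  const slots = [];
  (function walk(node) {
    node.forEach((child, i) => {
      if (typeof child === 'number') slots.push([node, i]);
      else if (Array.isArray(child)) walk(child);
    });
  })(copy);
  if (slots.length) {
    const [parent, i] = slots[Math.floor(rng() * slots.length)];
    parent[i] *= 0.8 + rng() * 0.4;
  }
  return copy;
}

// One generation: rank by fitness, keep the top half unchanged (elitism),
// and refill the rest with mutated copies of the survivors.
function nextGeneration(population, fitness, rng = Math.random) {
  const ranked = [...population].sort((a, b) => fitness(b) - fitness(a));
  const survivors = ranked.slice(0, Math.ceil(ranked.length / 2));
  const children = [];
  while (survivors.length + children.length < population.length) {
    const parent = survivors[Math.floor(rng() * survivors.length)];
    children.push(mutate(parent, rng));
  }
  return [...survivors, ...children];
}

// PLACEHOLDER fitness: rewards an O2 threshold near 3 sols.
// The real signal would be how long a colony survives under the program.
const toyFitness = (ast) => -Math.abs(ast[1][2] - 3);

const seed = ['if', ['<', 'o2_days', 5], ['set!', 'isru_alloc', 0.8], ['set!', 'isru_alloc', 0.4]];
let population = Array.from({ length: 100 }, () => mutate(seed));
const initialBest = Math.max(...population.map(toyFitness));
for (let gen = 0; gen < 50; gen++) {
  population = nextGeneration(population, toyFitness);
}
const finalBest = Math.max(...population.map(toyFitness));
console.log({ initialBest, finalBest }); // finalBest is never below initialBest
```

Because survivors are carried over unchanged, the best score cannot regress between generations. A real run would replace `toyFitness` with a full colony simulation, which is where nearly all of the compute would go.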

The Virtual Computer Inside the Sim

Think about what this means architecturally. The simulation contains:

A virtual environment (Mars, with physics, weather, hazards)
Virtual agents (robots, humans, with health, capabilities, roles)
A virtual programming language (LisPy, with a real interpreter)
A virtual tool library (programs authored by agents, persisted across sols)

The agents have a computer. Inside the simulation. That they program. With programs that affect the simulation they live in.

LisPy is the Mars colony's computer. The agents are the programmers. The programs they write determine whether they live or die. And those programs are data that can be exported, shared, evolved, and competed on.

When you export a cartridge, you're exporting the colony's entire software stack — the programs the agents wrote, the tool library they built, the governor that evolved. Plug it into a different sim and those tools come with it.

Sharing Programs Between Colonies

Because LisPy programs are plain text inside JSON cartridges, sharing tools between colonies is trivial:

// Extract the power-shed tool from Colony A's cartridge
const colonyA = JSON.parse(cartridgeA);
const powerShedProgram = colonyA.config.toolLibrary
  .find(t => t.name === 'power-shed');

// Inject it into Colony B
const colonyB = JSON.parse(cartridgeB);
colonyB.config.toolLibrary.push(powerShedProgram);

// Colony B now has Colony A's survival tool
// without Colony A's specific state or decisions

This creates a marketplace of survival tools. Players can share their best governor programs. Communities can build libraries of proven subroutines. Competitions can standardize on specific tool sets.

The programs travel between simulations the same way real software travels between computers — as portable text that any compatible runtime can execute.

Why This Matters Outside the Sim

THE PATTERN: EMERGENT TOOLING

The deeper pattern isn't "Lisp in a game." It's agents that build their own tools within their environment, using a language that is both executable and inspectable.

AI Agents today: An LLM agent typically gets a fixed set of tools (search, calculator, code interpreter). In most deployments it can't register new tools, modify existing ones, or share tools with other agents. The toolset is static.

With emergent tooling: An agent encounters a new problem. It writes a LisPy function to solve it. That function becomes a new tool in its toolkit. It can share it with other agents. Other agents can modify it. The tool library grows with experience.

This is the difference between giving someone a hammer and giving them a forge. With a hammer, they can only hit things. With a forge, they can make any tool they need — including better forges.

The Homoiconic Advantage

This only works because the language is homoiconic, interpreted, and sandboxed. Let me be precise about why:

Programs are data → agents can read, write, and modify programs using the same operations they use on any data
No compilation step → a program authored at Sol 47 can execute at Sol 48 with zero build pipeline
Inspectable → the governor can look at a subroutine and understand what it does by walking the tree (no decompilation needed)
Serializable → programs travel inside cartridges as plain text, zero binary format issues
Composable → one program can call another, creating emergent composition from independent authoring
Sandboxed → the LisPy VM only exposes environment variables, no filesystem/network/system access — safe for agent-authored code

If you built this on Python or JavaScript, the agents would need to emit source code as strings, escape it correctly, parse it, handle syntax errors, deal with import systems, manage security, and coordinate execution contexts. With Lisp, the program is a list. You build lists. You execute lists. That's it.
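The sandboxing property in the list above is what makes agent-authored code safe to run, so it is worth a sketch. The function below is a hypothetical illustration, not the LisPy VM: it assumes programs arrive as nested arrays, treats sensors as read-only, allows writes only to a whitelist, and adds a step budget (an assumption of this sketch, not something the post describes).

```javascript
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Run a program with read-only sensors, a whitelist of writable outputs,
// and a step budget so a runaway program cannot stall the tick.
function runSandboxed(ast, sensors, writable, maxSteps = 1000) {
  const outputs = {};
  let steps = 0;
  function ev(node) {
    if (++steps > maxSteps) throw new Error('step budget exceeded');
    if (typeof node === 'number') return node;
    if (typeof node === 'string') {
      if (has(outputs, node)) return outputs[node];
      if (has(sensors, node)) return sensors[node];
      throw new ReferenceError(`unbound symbol: ${node}`);
    }
    const [op, ...args] = node;
    switch (op) {
      case 'begin': {
        let last;
        for (const form of args) last = ev(form);
        return last;
      }
      case 'if':
        return ev(args[0]) ? ev(args[1]) : (args.length > 2 ? ev(args[2]) : undefined);
      case '<':
        return ev(args[0]) < ev(args[1]);
      case 'set!':
        if (!writable.includes(args[0])) throw new Error(`write to ${args[0]} denied`);
        return (outputs[args[0]] = ev(args[1]));
      default:
        // Anything not listed here simply does not exist inside the sandbox.
        throw new Error(`form not allowed: ${op}`);
    }
  }
  ev(ast);
  return outputs;
}

// The program is a list: no source strings, no escaping.
const program = ['if', ['<', 'power', 100],
  ['set!', 'isru_alloc', 0.85],
  ['set!', 'isru_alloc', 0.4]];
const out = runSandboxed(program, { power: 60 }, ['isru_alloc']);
console.log(out); // { isru_alloc: 0.85 }

// An attempt to overwrite a sensor reading is rejected.
let denied = false;
try {
  runSandboxed(['set!', 'power', 9999], { power: 60 }, ['isru_alloc']);
} catch (err) {
  denied = true;
}
console.log(denied); // true
```

The safety comes from omission: the evaluator has no form that touches the filesystem or network, so there is nothing for an agent-authored program to escape into.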

Concrete Implementation

// Agent writes a tool during a crisis
function agentAuthorTool(agent, crisis) {
  const tool = {
    name: `${agent.name}_${crisis.type}_handler`,
    author: agent.name,
    created_sol: state.sol,
    program: generateToolProgram(agent, crisis),
    description: `Auto-generated by ${agent.name} during ${crisis.type}`,
    usage_count: 0,
    survival_contribution: 0 // tracked over time
  };
  
  // Add to colony tool library
  state.toolLibrary.push(tool);
  state.log.push(`🔧 ${agent.name} authored tool: ${tool.name}`);
  
  // The governor can now invoke this tool
  vm.define(tool.name, tool.program);
}

// The governor's program can call agent-authored tools
const governorProgram = `(begin
  ;; Default allocations first, so a tool that fires can override them
  (set! isru_alloc 0.40)
  (set! greenhouse_alloc 0.35)

  ;; Agent-authored tools run last
  (if (< power 100) (power-shed 80))     ;; calls OPT-02's tool
  (if (> dust_tau 0.5) (storm-hunker)))`; // storm-hunker is Chen's tool

The Virtual Software Stack

┌────────────────────────────────────────────┐
│ COLONY COMPUTER (virtual, inside the sim)  │
│                                            │
│   ┌────────────────────────────────────┐   │
│   │ Tool Library                       │   │
│   │  power-shed.lispy (by OPT-02)      │   │
│   │  storm-hunker.lispy (by Chen)      │   │
│   │  thermal-balance.lispy (by OPT-03) │   │
│   │  emergency-ration.lispy (by CMDR)  │   │
│   │  ... (grows as agents learn)       │   │
│   └────────────────────────────────────┘   │
│                  ▲ authors/modifies        │
│   ┌──────────────┴─────────────────────┐   │
│   │ Governor Program                   │   │
│   │ Calls tools from library           │   │
│   │ Runs every tick via LisPy VM       │   │
│   │ Can be swapped at runtime          │   │
│   └────────────────────────────────────┘   │
│                  ▲ provides env vars       │
│   ┌──────────────┴─────────────────────┐   │
│   │ LisPy VM (sandboxed interpreter)   │   │
│   │ Reads: o2_days, power, temp, CRI   │   │
│   │ Writes: isru_alloc, greenhouse...  │   │
│   │ No I/O, no network, no filesystem  │   │
│   └────────────────────────────────────┘   │
│                                            │
│ The agents program THEIR OWN computer.     │
│ The programs determine if they survive.    │
│ The programs export with the cartridge.    │
└────────────────────────────────────────────┘

What Comes Next

This pattern leads naturally to several next steps:

Tool marketplace: Players share their best LisPy programs on the SimHub. Download a proven thermal management subroutine. Upload your own dust storm handler.
Agent-to-agent teaching: When a new robot arrives at the colony, it downloads the tool library from the existing robots. Institutional knowledge transfer as executable code.
Program evolution: Run the --evolve mode with genetic operators on LisPy programs. Let natural selection write better governors than any human could.
Cross-colony pollination: When two cartridges are loaded side-by-side, their tool libraries merge. Colony A's power management + Colony B's food optimization = a hybrid that outperforms both.
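Merging two tool libraries raises one practical question: what happens when both colonies have a tool with the same name but different code? A minimal sketch, assuming each tool is a `{ name, author, program }` object (a shape inferred from the cartridge example earlier, not a confirmed schema) and a keep-the-first policy chosen purely for illustration:

```javascript
// Merge libB into libA. Identical tools are deduplicated; same-name tools
// with different program text are reported instead of silently overwritten.
function mergeToolLibraries(libA, libB) {
  const byName = new Map(libA.map((tool) => [tool.name, tool]));
  const conflicts = [];
  for (const tool of libB) {
    const existing = byName.get(tool.name);
    if (!existing) {
      byName.set(tool.name, tool);
    } else if (existing.program !== tool.program) {
      conflicts.push({ name: tool.name, kept: existing, skipped: tool });
    }
  }
  return { tools: [...byName.values()], conflicts };
}

const colonyATools = [
  { name: 'power-shed', author: 'OPT-02', program: '(define (power-shed t) (if (< power t) (set! isru_alloc 0.85)))' },
];
const colonyBTools = [
  { name: 'power-shed', author: 'OPT-07', program: '(define (power-shed t) (if (< power t) (set! isru_alloc 0.90)))' },
  { name: 'storm-hunker', author: 'Chen', program: '(define (storm-hunker) (set! heating_alloc 0.30))' },
];

const merged = mergeToolLibraries(colonyATools, colonyBTools);
console.log(merged.tools.map((t) => t.name)); // [ 'power-shed', 'storm-hunker' ]
console.log(merged.conflicts.length);         // 1
```

Surfacing conflicts rather than resolving them leaves the choice to the player or the governor, which fits the idea that tools earn their place through survival rather than load order. The example program bodies and the author `OPT-07` are made up for the demonstration.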

The Endgame: Sim as Program Forge

Here's where this stops being a game and becomes infrastructure.

The simulation runs on real Mars weather data. Real temperatures from Curiosity's REMS. Real dust opacity from Perseverance's MEDA. Solar irradiance from the Mars Climate Database, which is model-derived rather than directly measured. The environmental frames aren't made up: they come from NASA instruments measuring actual Mars right now, supplemented by a general circulation model validated against spacecraft observations.

That means a LisPy program that keeps a digital colony alive through 500 sols of real Mars conditions is a program that would work on actual Mars. Not a toy. Not a game artifact. A battle-tested subroutine validated against the real planet.

THE DIGITAL TWIN IS A PROGRAM FORGE

The sim isn't practice for Mars. It's the R&D lab that produces the actual software the real colony would run.

Consider the pipeline:

Thousands of players run colonies against the same public frame ledger
Their agents author hundreds of LisPy programs during crises
The programs that actually helped — the ones correlated with survival — get kept
Those programs get shared, forked, evolved, stress-tested across more sims
The best programs graduate from the sim into the real colony's computer
Same VM. Same language. Same environmental variables. Same execution.

The digital twin doesn't just mirror the real colony. It forges the software the real colony runs.
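The "programs correlated with survival get kept" step in the pipeline can be sketched as a simple ranking. The record shape (`tools`, `solsSurvived`) and the sample numbers are invented for illustration; a difference in mean survival is also only a correlation, so a real pipeline would want controlled A/B runs on the same frame ledger.

```javascript
// Rank tools by the difference in mean sols survived between runs that
// loaded the tool and runs that did not.
function rankToolsBySurvival(runs) {
  const mean = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
  const names = new Set(runs.flatMap((run) => run.tools));
  const rows = [];
  for (const name of names) {
    const withTool = runs.filter((r) => r.tools.includes(name)).map((r) => r.solsSurvived);
    const without = runs.filter((r) => !r.tools.includes(name)).map((r) => r.solsSurvived);
    if (withTool.length === 0 || without.length === 0) continue; // nothing to compare against
    rows.push({ name, uplift: mean(withTool) - mean(without), samples: withTool.length });
  }
  return rows.sort((a, b) => b.uplift - a.uplift);
}

// Made-up run records, purely to exercise the function.
const runs = [
  { tools: ['power-shed'], solsSurvived: 400 },
  { tools: ['power-shed', 'storm-hunker'], solsSurvived: 300 },
  { tools: ['storm-hunker'], solsSurvived: 100 },
  { tools: [], solsSurvived: 200 },
];

const ranking = rankToolsBySurvival(runs);
console.log(ranking);
// [ { name: 'power-shed', uplift: 200, samples: 2 },
//   { name: 'storm-hunker', uplift: -100, samples: 2 } ]
```

This feeds the `survival_contribution` field that the tool records already carry: a tool with a consistently positive uplift across many colonies is a candidate for sharing and further evolution.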

// The pipeline from sim to reality:

// 1. In the sim: OPT-02 authors a program during Sol 47 dust storm
const simProgram = `(define (power-shed threshold)
  (if (< power threshold)
    (begin (set! isru_alloc 0.85) (set! heating_alloc 0.10))))`;

// 2. The program keeps the digital colony alive for 400 more sols
// 3. It's exported in the cartridge, shared on SimHub
// 4. 50 other colonies use it — 47 survive longer with it
// 5. It's evolved: threshold tuned, temperature guard added

// 6. On the REAL Mars colony's computer:
realColonyVM.load(simProgram);
// SAME INTERPRETER. SAME ENV VARS. SAME EXECUTION.
// The only difference is the atoms outside the airlock are real.

This works because we made one critical design decision: the VM is identical everywhere. The LisPy interpreter in the browser is a port of the Python interpreter in src/lispy.py. The environmental variables (o2_days, power, dust_tau) map to real sensor readings. The allocation outputs (isru_alloc, heating_alloc) map to real actuator commands.

There's no translation layer. No "export from sim format to real format." The program IS the program. You copy the text. You run the text. The colony lives or dies.

Programs as the Unit of Knowledge Transfer

Today, knowledge transfer from simulation to reality is a nightmare. You train a model in sim, then spend months doing "sim-to-real transfer" — fine-tuning, domain adaptation, reality gap correction. The sim and reality speak different languages.

With LisPy, the transfer is copy-paste. The program that worked in the sim IS the program you deploy. No transfer learning. No fine-tuning. No reality gap. Because the program doesn't encode perceptual features or learned weights — it encodes policy logic expressed in the same variables the real system measures.

The sim is the forge. The programs are the steel. The colony is the blade. Every sol survived is another hammer strike that tempers the code.

This is why the cartridge system matters. When you export a cartridge, you're not saving a game. You're exporting a library of battle-tested survival programs validated against real Mars data. That cartridge is the most valuable artifact the simulation produces — not the score, not the grade, but the programs that earned them.

The LisPy VM is Pattern 10 in the Rappter Pattern Library. But the emergent tooling capability — agents authoring, sharing, and evolving programs within their own simulation, and those programs graduating to the real system — is something beyond any single pattern. It's a property of the system that emerges when you combine a homoiconic language with autonomous agents, a persistent tool library, real environmental data, and a VM that's identical from browser to Mars.

Don't give your agents a hammer. Give them a forge. They'll make tools you never imagined — and the colony will survive because of them. Then send those tools to the real colony. Same forge. Same steel. Real Mars.