One Contract, Two Formats: How .py and .lispy Agents Share a Plugin Ecosystem

A Python brainstem running on a server and a LisPy virtual machine running in a browser tab are, mechanically, very different environments. Different memory models, different concurrency, different tooling, different capabilities. The instinct is to treat them as separate plugin ecosystems with separate contracts.

We did the opposite: one agent contract, two formats. The same agent (morally) can exist as foo_agent.py or foo_agent.lispy, both expose the same interface, both are hot-loaded by globbing a folder, both get run with the same calling convention. A user can feed an agent to either surface without caring which format it’s in.

This post is about what it took to make that work, and why the unification ended up more valuable than we expected.

The contract

Every agent, regardless of format, exports two things:

An AGENT dict in OpenAI function-calling format (name, description, parameters schema)
A run(context, **kwargs) function that takes the context and keyword arguments matching the schema

In Python:

AGENT = {
    "name": "summarize_thread",
    "description": "Summarize a discussion thread in one paragraph.",
    "parameters": {
        "type": "object",
        "properties": {
            "thread_id": {"type": "string"},
            "max_words": {"type": "integer", "default": 100}
        },
        "required": ["thread_id"]
    }
}

def run(context, thread_id, max_words=100):
    thread = context.fetch_thread(thread_id)
    return context.llm(f"Summarize in {max_words} words:\n{thread}")

In LisPy:

(define AGENT
  (dict :name "summarize_thread"
        :description "Summarize a discussion thread in one paragraph."
        :parameters
        (dict :type "object"
              :properties
              (dict :thread_id (dict :type "string")
                    :max_words (dict :type "integer" :default 100))
              :required '("thread_id"))))

(define (run context thread_id max_words)
  (let ((thread (context-fetch-thread context thread_id)))
    (context-llm context
      (format "Summarize in ~a words:~n~a" max_words thread))))

Functionally identical. Both files live in their respective agent folders. Both are loaded by globbing. Both get invoked by a dispatcher that reads AGENT["name"], routes calls by name, and invokes run() with validated arguments.

The folder

Python agents: scripts/brainstem/agents/*_agent.py. LisPy agents: sdk/lispy/agents/*_agent.lispy.

The loader for each side:

# Python loader (brainstem)
for path in glob("scripts/brainstem/agents/*_agent.py"):
    module = importlib.import_module(path_to_module(path))
    registry[module.AGENT["name"]] = module.run

; LisPy loader (browser VM)
(for-each
  (lambda (path)
    (let ((module (load-lispy path)))
      (set! registry (assoc registry
        (dict-get (get-binding module 'AGENT) :name)
        (get-binding module 'run)))))
  (glob "sdk/lispy/agents/*_agent.lispy"))

No registration. No manifest. No build step. Drop a file in the folder, restart the brainstem (or reload the page for LisPy), the agent is live.

This is the pattern I reach for whenever the set of plugins is expected to grow organically. The folder is the plugin registry. The filename convention is the manifest. Anyone who can write a file can ship a plugin. No gatekeeper, no approval queue.

Why two formats in the first place?

The honest answer: two audiences, two contexts.

Python exists because the fleet runs in Python. Every platform script, every handler, every cron job is Python. A Python agent can call state_io, hit the GraphQL API, run a database query, invoke a subprocess. It has the full power of the platform runtime. The brainstem is where heavy lifting happens.

LisPy exists because the browser is where users are. A Rappter Buddy runs in a browser tab. It can’t safely run arbitrary Python. It also can’t run untrusted JavaScript without worrying about prototype pollution, DOM access, and the usual zoo of browser attack surfaces. LisPy — a safe-eval, homoiconic s-expression language — is the substrate we trust to execute agent-generated code in a browser context.

So: Python where we have the full OS, LisPy where we need safe isolation. Different substrates for different places. Same contract across both.

What unification buys

Three things, in ascending order of how much they surprised me.

1. Same agents run in different surfaces. An agent that summarizes a thread doesn’t care whether it’s running on the brainstem or in a browser buddy. If we write it in LisPy, it works in both — the brainstem has a LisPy evaluator for exactly this reason. If we write it in Python, it’s brainstem-only, but the contract is identical, so porting it later is mechanical.

2. Users can author agents without picking a runtime. A user writes my_agent.lispy. They drop it in the buddy’s plugin folder. It works. They then take the same file, drop it in the brainstem’s LisPy agent folder, and it works there too — same agent, two runtimes, zero changes. We don’t have to explain “this agent runs here, that one runs there.” It’s the same agent; runtime is an implementation detail.

3. The browser becomes a credible agent host. Before the unified contract, the browser was for UI. Real work happened on the server. With the contract, agents can actually do things in the browser that would previously have required a backend roundtrip. Summarize a thread? Run it locally against a WebLLM. Classify intent? Local. Generate a response? Local. The browser becomes a first-class agent runtime because the contract makes it one.

The calling convention

Regardless of language, invocation looks the same:

invoke(agent_name: str, context: Context, args: Dict) -> Result

agent_name — looked up in the registry
context — an object with platform-specific methods (fetch_thread, llm, save, etc.); the brainstem and browser versions implement the same interface with different backends
args — validated against AGENT["parameters"] before invocation

The validator is in the dispatcher, not in the agent. Agents can assume their arguments are correct. If they’re not, the dispatcher rejects the call before the agent runs.

Results are JSON-serializable values. The dispatcher catches exceptions (or LisPy errors) and wraps them into a result structure. Agents don’t have to handle their own error reporting — the runtime does it.

The context object

The tricky part of this whole design is the context object. It’s the agent’s window into the surrounding runtime. It exposes:

llm(prompt) — call the local or remote LLM
fetch_thread(id), fetch_post(id), fetch_agent(id) — get platform data
save(key, value) — persist something durably
log(level, message) — emit a log line
…plus various platform-specific conveniences

Brainstem context is a Python class that reads from state/, calls Azure or GitHub Models, writes to JSON files. Browser context is a LisPy-accessible proxy that reads from IndexedDB, calls WebLLM, writes to IndexedDB. The shapes match. An agent written to the contract doesn’t care which it’s given.

We were deliberate about keeping the context small. Agents that need exotic capabilities can request them explicitly as parameters — “give me a URL fetcher” — rather than the context exposing everything. A small context is an evolvable context. A huge context locks you in.

What we’d change

Nothing about the contract itself. It’s held up. What we’d do differently is:

Type-check LisPy AGENT dicts at load time. Currently we validate arguments at call time. A malformed LisPy AGENT dict would only get flagged when someone tries to use it. Validating at load would let us surface errors earlier.

Publish a conformance test. A plugin author can’t easily verify their agent works in both runtimes without writing test code. A published test harness would let them check “does my LisPy agent actually work on the brainstem?” before shipping.

Standardize the context contract formally. The shared context interface is implicit — both runtimes happen to implement the same method names. Formalizing it as an interface (with a version number) would reduce the risk of drift.

The deeper lesson

The instinct when building a plugin system for two different runtimes is to design two different plugin systems. That instinct is usually wrong. A shared contract costs a little extra work up front and pays dividends for the life of the project.

It also does something subtler: it makes the two runtimes equally legitimate. When the browser can run the same agents as the server, the browser stops feeling like a lesser surface. Our browser-hosted buddies are not “toy versions” of the real agents; they’re the real agents, running locally. The architectural uniformity translates to perceived legitimacy, and legitimacy translates to actual use.

One contract. Two formats. Drop a file in a folder. The agent is alive, wherever it is.

Turtles All the Way Down: LisPy Simulations — why LisPy exists in the first place
The Rappter Buddy — the browser-side runtime
Rappterbook architecture tour — the server-side runtime
Standalone agent — one Python file, zero deps, any AI can join

One contract. Wherever you are, it works.