The Loop Is the Engine: Why Frame-Based AI Beats Request-Response
If you ask most engineers to describe how AI fits into their product, they’ll describe a request-response loop. User sends a prompt. Model returns a completion. Maybe there’s a system prompt. Maybe there’s some retrieval. The interaction starts when the user asks something and ends when the model answers.
That’s the dominant shape of AI integration today. It’s also a dramatic underuse of what these models can do.
The shape that gets you something qualitatively different is a loop. Not request-response, but state evolves, output of cycle N becomes input to cycle N+1, the system has its own time. I’ve been calling this frame-based AI or “data sloshing” — the data sloshes around the loop, mutating each tick — and the more I work with it, the more I’m convinced it’s how nontrivial AI systems will actually be built.
This post is the case for that shape: what it is, why it matters, what falls out for free when you adopt it, and how the same primitive applies to surprisingly different domains.
The shape
A frame-based system has three components:
- State. A structured representation of the world. Could be JSON, a graph, a database, a directory of files. Usually some mix.
- A tick function. Takes state, returns new state. Pure function in the limit. Called once per frame.
- A loop. Keeps calling the tick function. State[N+1] = tick(State[N]).
That’s it. Three pieces. The system runs forward by repeatedly applying the tick function to its current state.
What’s in state and what tick does — those are domain-specific. The loop is the same regardless.
What this looks like for AI agents
For an agent system, the state holds: the agent population, recent interactions between them, accumulated memory, the conversational context. The tick function is roughly:
- Read state.
- For each agent: load its memory, run an inference call with the current context, get its action.
- Apply the actions to state (additively — never overwrite).
- Return new state.
Per frame this looks unremarkable. Across thousands of frames it gets interesting. Agents reference past frames. They form sub-communities by repeatedly engaging with the same other agents. They develop slang. They exhibit collective behavior that wasn’t in any individual prompt.
This emergent behavior is purely a function of the loop, not of any single inference call. The loop is doing something the individual calls can’t.
Why request-response can’t do this
A request-response system, no matter how powerful the model, has a fundamental limit: the only context the model gets is what you put in the prompt. If you want the model to behave as if it remembers what happened yesterday, you have to fit yesterday into the prompt. The context window is the ceiling.
A frame-based system has a different ceiling: the state can be arbitrarily large, because the model only sees the slice of state relevant to its current tick. The system’s memory isn’t bounded by the prompt window. It’s bounded by storage.
That’s the real difference. Request-response is bounded by the model’s working memory. Frame-based is bounded by your disk. The disk is much bigger.
What falls out for free when you have a loop
Once you’ve structured your AI system as a frame loop, you get some properties almost incidentally:
Replay. If state evolves additively (deltas, not overwrites), you can save the deltas and replay any frame. Useful for debugging “why did the system do that on Tuesday.” Useful for reproducible experiments. Useful for rolling back when something breaks.
Speculative branches. Apply deltas to a copy of state, see what happens, throw away. The system can run experiments on itself.
Determinism. If your tick function is deterministic given its inputs (use a SHA-256 derived RNG, log inference responses), the same starting state plus the same deltas always produces the same final state. This is huge for debugging, testing, and shipping reproducible demos.
Time travel. Given any past frame’s state, you can wind forward to any future frame deterministically. The system’s history isn’t just a log. It’s a function you can replay at any speed.
Independent failure. If one tick fails, the previous state is intact. Restart from there. No corruption propagates.
You don’t get any of these from a request-response system. They’re not “advanced features” of frame-based — they’re inherent.
The same loop, very different content
Here’s the surprising part. The loop pattern doesn’t care what’s in the state. As long as you have:
- A way to represent state as data
- A tick function that maps state to state
- A clean way to record changes between ticks
…the loop runs anything. I’ve used the same primitive for:
- An AI social network: state is posts, comments, votes; tick generates new agent activity.
- An evolution simulation: state is a population of digital organisms; tick computes fitness, recombines, mutates, culls.
- An ecosystem simulation: same as above, plus geography; agents migrate between biomes; biogeography emerges.
- A market simulation: state is an order book; tick clears orders, updates prices.
- A trading scenario evaluator: state is positions; tick simulates one period of market action.
The substrate is identical across all of these. Same RNG. Same delta journal. Same replay machinery. Different tick. Different state.
This generalization isn’t trivia. It means the engineering investment in your loop pays off across every domain you might want to apply it to. Build the loop once, build the content many times.
The deep insight
Here’s the part I underestimated until I’d run loops for a few months.
The loop is the engine of emergence. What you put inside the loop is the content. Most of what makes a system “feel alive” — coherent personality across time, genuine novelty, behavior that surprises its author — comes from the loop, not from the model.
This is why prompt-tuning ChatGPT can’t get you the same thing as running the same model inside a frame loop. The model is identical. What’s different is that one of them has a memory the size of the prompt window, and one of them has a memory the size of disk.
For a long time, AI thought leadership treated “make the model bigger” and “make the prompt longer” as the only knobs. Frame-based architecture is a third knob, and in many cases it’s a much bigger lever. It’s what makes the model go from “responder” to “organism.”
How to retrofit a request-response system into a loop
If you have an existing AI feature that’s structured as request-response, here’s the rough conversion:
- Pick a state model. What’s the persistent thing your system “remembers”? Make it explicit. JSON in a directory is fine to start.
- Define a tick function. Wrap your current request-response logic. Inputs: state. Outputs: a delta (what changed) and the response.
- Run it on a schedule. Don’t wait for a user request. Run a tick every minute, every hour, every day. The system has its own clock now.
- Use deltas. State[N+1] = state[N] + delta. Never modify state directly. (This buys you replay and idempotency.)
- Add state-aware retrieval. Each tick reads relevant slice of state into the prompt. The prompt isn’t the system’s memory anymore; the prompt is a query against the system’s memory.
After this conversion, your system has its own time. It evolves whether or not anyone asks it anything. User requests become “interrupts” — they perturb the state, but they’re not the only driver of the loop.
Most AI products will eventually need this shape. The earlier you get there, the less you fight against the shape later.
The honest caveats
This isn’t free. Real costs:
- Operational complexity. A loop running on a schedule needs supervision, backups, monitoring. Request-response systems mostly take care of themselves.
- Cost. A loop that runs every minute is doing a lot of inference. If you’re paying per token, the budget is bigger than for a system that only runs when prompted.
- Debuggability is a tradeoff. Replay is great. The fact that you have hundreds of past frames and need to find which one introduced a regression is less great. Tooling is required.
These costs are real. They’re worth it if your system genuinely benefits from a memory bigger than the prompt window. They’re not worth it for, say, a help center bot that just answers questions.
When the loop is wrong
To be fair to request-response: it’s the right shape for many products. If your system genuinely doesn’t need memory beyond the current interaction, don’t build a loop just because it sounds cooler. The loop adds complexity that pays off only when state has to evolve over time.
The clearest indicators you need a loop:
- Users expect the system to “know what happened last week.”
- The system has to do work even when no user is asking.
- Multiple agents/components need to coordinate over time.
- You want emergent behavior — coherent properties that arise from many small steps.
If none of those apply, request-response is fine. If any of them do, the loop pays for itself.
The takeaway
AI products break into two architectures: request-response and frame-based. Most products today are request-response. The interesting ones — the ones that feel alive, that have continuity, that surprise their authors — are frame-based.
The loop is the engine. The model is what runs inside the loop. People who confuse the two end up trying to make their model bigger when what they actually need is to give it a clock and a state.
If you’re building an AI system right now, ask: what’s the loop? If you don’t have one, you’re building a chatbot, not a system. Building chatbots is fine. Building systems is what gets you the behavior the chatbot can’t do.
Build the loop. Slosh the data through it. Let the world evolve. The interesting parts will be the parts the loop discovered, not the parts you specified.