How to Stop Prompting Agents and Start Designing Loops
The mental shift from typing prompts to engineering the loop that types them for you — plus your first runnable example.
There is a line going around that sums up where agentic coding is heading:
“Here’s your monthly reminder that you shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.”
— Peter Steinberger
It sounds like a slogan, but it describes a real change in how the most productive people use tools like Claude Code and Codex. Boris Cherny, who leads Claude Code at Anthropic, put it even more bluntly: “I don’t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops.”
This article explains the idea and walks you through the simplest possible working example. A companion article then takes you up two more rungs: an intermediate checklist-driven loop and a complex multi-agent one.
The shift: take yourself out of the loop
When you chat with a coding agent, you are the loop. You type a prompt, read the result, notice it is 80% right, type a correction, read again, correct again. The agent does the work; you provide the judgement, the memory of what was asked, and the decision about when it is done. You are the slow, expensive, easily-distracted part.
As Simon Willison frames it, an LLM agent is really just something that runs tools in a loop to achieve a goal. The skill that matters now is not crafting the perfect single prompt — it is carefully designing the tools and the loop so the agent can drive itself toward the goal and know when it has arrived.
Designing the loop means answering four questions once, in writing, instead of answering them over and over in chat:
- Goal — what does “done” look like, stated so a machine can check it?
- State — where does the agent read what has happened so far?
- Feedback — what objective signal tells the agent whether the last step worked?
- Stop condition — what ends the loop: success, or a safety limit?
The anatomy of a loop
Every agentic loop, however fancy, is the same three-beat cycle that Addy Osmani calls plan–act–observe:
- Plan — the agent reads the goal and the current state, and decides the next step.
- Act — it runs a tool: edits a file, runs the test suite, greps the codebase.
- Observe — it reads the result of that action and feeds it back in.
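The three beats above can be sketched in a few lines of JavaScript. This is a minimal illustration under stated assumptions, not any particular tool's implementation: `runLoop` and its `plan`, `act`, `observe`, `isDone`, and `maxIterations` parameters are hypothetical names, and the "agent" is a toy counter standing in for a real model call.

```javascript
// Generic plan-act-observe skeleton. Every name here is illustrative.
function runLoop({ plan, act, observe, isDone, maxIterations }) {
  let state = observe(null); // initial observation, before any action
  for (let i = 1; i <= maxIterations; i++) {
    if (isDone(state)) return { done: true, iterations: i - 1, state };
    const step = plan(state);   // Plan: pick the next step from the current state
    const result = act(step);   // Act: run a tool
    state = observe(result);    // Observe: fold the result back into the state
  }
  // Check once more so a success on the last pass is not reported as a failure.
  return { done: isDone(state), iterations: maxIterations, state };
}

// Toy usage: the "agent" increments a counter until it reaches 3.
let counter = 0;
const outcome = runLoop({
  plan: () => ({ incrementBy: 1 }),
  act: (step) => { counter += step.incrementBy; return counter; },
  observe: () => ({ counter }),
  isDone: (state) => state.counter >= 3,
  maxIterations: 10,
});
console.log(outcome);
```

The `isDone` check and the `maxIterations` cap are kept separate on purpose: one is the goal, the other is the safety limit.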
Repeat until the goal is met or a stop condition fires. Two details make the difference between a toy and something that actually ships work overnight:
State lives on disk, not in the context window
The model is amnesiac — each run starts fresh, and long conversations “rot” as the context fills up. The filesystem and git are not amnesiac. So a well-designed loop keeps its memory in files: a task list, a progress log, the code itself, the git history. Each iteration reads enough state from disk to know what to do next, then writes its progress back. This is why the loops below lean on plain text files instead of one giant prompt.
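As a rough sketch of what "memory in files" can look like, the snippet below keeps a small JSON progress file and has every iteration read it, update it, and write it back. The file name `progress.json`, the state shape, and the helpers `loadState`, `saveState`, and `runIteration` are assumptions made for illustration; a real loop might use a Markdown task list or git history instead.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Read loop state from disk; start fresh if the file does not exist yet.
function loadState(file) {
  if (!fs.existsSync(file)) return { iteration: 0, notes: [] };
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

// Write to a temp file and rename it, so a crash mid-write
// cannot leave a truncated state file behind.
function saveState(file, state) {
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(state, null, 2));
  fs.renameSync(tmp, file);
}

// One iteration: it knows nothing except what it reads from disk.
function runIteration(file, note) {
  const state = loadState(file);
  state.iteration += 1;
  state.notes.push(note);
  saveState(file, state);
  return state;
}

// Demo in a throwaway directory: two independent iterations share memory via the file.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "loop-state-"));
const stateFile = path.join(dir, "progress.json");
runIteration(stateFile, "implemented encode");
const latest = runIteration(stateFile, "fixed the zero edge case");
console.log(latest.iteration, latest.notes);
```

Because nothing is carried in variables between calls, the second iteration could just as well run in a new process, or a new agent session, and still pick up where the first left off.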
The stop condition must be objective
If you let the agent decide when it is done, it will declare victory early — models are eager to please. The trick, central to Geoffrey Huntley’s “Ralph” pattern, is to remove the agent’s ability to grade its own homework. Use a signal that does not care about the agent’s feelings: tests passing, the type-checker going quiet, the linter returning clean, a build succeeding. Re-running the same instruction until that signal flips is what turns “generate some code” into reliable engineering.
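One way to make the signal objective is to ignore what the agent says and look only at a command's exit status. The sketch below assumes a hypothetical helper named `checkPasses`; it uses Node's `child_process.spawnSync`, and the two demo calls use tiny Node one-liners as stand-ins for a passing and a failing test suite.

```javascript
const { spawnSync } = require("node:child_process");

// True only when the command runs and exits with status 0.
// A missing command or a crash leaves status null or non-zero, so it counts as "not done".
function checkPasses(command, args = []) {
  const result = spawnSync(command, args, { stdio: "ignore" });
  return result.status === 0;
}

// Stand-ins for a green and a red test run.
const green = checkPasses(process.execPath, ["-e", "process.exit(0)"]);
const red = checkPasses(process.execPath, ["-e", "process.exit(1)"]);
console.log({ green, red });
```

In a real project the call would be something like `checkPasses("npm", ["test"])` (on Windows the executable is `npm.cmd`). The agent's own claim of success never enters the decision.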
Safety first: sandbox before you loop
A loop runs the agent unattended, approving its own actions. That is
the whole point — and the whole danger. To run without stopping
for permission, Claude Code needs
--dangerously-skip-permissions and Codex needs an
auto-approving sandbox. The name is a warning. Run an unsupervised
agent on your laptop with your real credentials and you are exposed to
three things at once:
- A bad shell command deleting or mangling files you care about.
- Exfiltration — a stray instruction (even one hidden in a file the agent reads) that ships your source or secrets somewhere.
- Your machine being used as a proxy to attack something else.
Everything below assumes you are running inside an isolated sandbox, such as a container or VM that holds none of your real credentials. The one-time cost is a few minutes; the payoff is being able to walk away from the loop and trust it.
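A cheap extra tripwire is to have the harness refuse to start unless it can see a marker that only your sandbox sets. The sketch below is an assumption-laden illustration: the `AGENT_SANDBOX` variable and the `assertSandboxed` helper are hypothetical conventions, not features of Claude Code or Codex, and an environment variable provides no isolation by itself. It only guards against launching the loop on the host by accident.

```javascript
// Hypothetical convention: the sandbox image sets AGENT_SANDBOX=1; the host never does.
function assertSandboxed(env = process.env) {
  if (env.AGENT_SANDBOX !== "1") {
    throw new Error(
      "Refusing to start the loop: AGENT_SANDBOX=1 is not set. " +
      "Run this inside your sandbox, not on the host."
    );
  }
}

// Demo with explicit environments instead of the real process.env.
try {
  assertSandboxed({});
} catch (err) {
  console.log(err.message);
}
assertSandboxed({ AGENT_SANDBOX: "1" }); // passes silently
```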
The simplest loop that works: Ralph
The canonical first loop is so simple it became a meme. Geoffrey
Huntley named it after Ralph Wiggum from The Simpsons,
because it is just a while loop that keeps feeding the
agent the same prompt until the work is actually done. In its barest
form it is one line:
while :; do cat PROMPT.md | claude -p --dangerously-skip-permissions ; done
That is the whole idea. But a one-liner that never stops will burn tokens forever, so let us build the smallest useful version: a loop that builds a tiny URL shortener and stops the moment its tests pass. We will reuse this same little project as it grows in the next article.
1. Scaffold the project and write the failing tests
The objective signal is the test suite, so it has to exist before the loop starts. Create a tiny Node project with one test file that does not pass yet:
mkdir shortener && cd shortener
npm init -y
npm install --save-dev vitest
mkdir src test
Put a spec in test/shorten.test.js that describes what “done” means:
import { describe, it, expect } from "vitest";
import { encode, decode } from "../src/shorten.js";

describe("short codes", () => {
  it("round-trips an id to a code and back", () => {
    const code = encode(12345);
    expect(typeof code).toBe("string");
    expect(decode(code)).toBe(12345);
  });

  it("uses url-safe base62 characters only", () => {
    expect(encode(99999)).toMatch(/^[0-9a-zA-Z]+$/);
  });

  it("maps 0 to a single character", () => {
    expect(encode(0).length).toBe(1);
  });
});
Wire up the test script in package.json:
npm pkg set scripts.test="vitest run"
There is no src/shorten.js yet, so npm test
fails. That failing red is the loop’s starting line.
2. Write the prompt the loop will repeat
The prompt is the agent’s standing orders. Keep it small and
point it at the objective signal. Save this as PROMPT.md:
You are working in a Node project that implements a URL shortener.
Your goal: make `npm test` pass.
Steps:
1. Run `npm test` and read the failures.
2. Implement or fix `src/shorten.js` so the tests in `test/`
pass. Export `encode(id)` and `decode(code)` functions.
3. Run `npm test` again to confirm.
Do not edit the test files. Do not weaken the tests.
When `npm test` exits cleanly, stop and say "ALL TESTS PASS".
3. Write the loop
Now the harness. This is a real, runnable Ralph loop with the two
guardrails the one-liner lacks: an iteration cap so
it cannot run forever, and an objective exit —
it stops itself the instant npm test succeeds. Save it as
loop.sh:
#!/usr/bin/env bash
set -euo pipefail

MAX_ITERATIONS=10

for i in $(seq 1 "$MAX_ITERATIONS"); do
  echo "──────────── iteration $i / $MAX_ITERATIONS ────────────"

  # Objective stop condition: are we already green?
  if npm test >/dev/null 2>&1; then
    echo "✅ tests pass, stopping after $((i - 1)) iterations"
    exit 0
  fi

  # Hand the standing orders to the agent for one pass.
  # `|| true` keeps a single failed agent run from killing the loop under `set -e`.
  claude -p "$(cat PROMPT.md)" \
    --dangerously-skip-permissions \
    --output-format stream-json --verbose || true
done

# Check once more, so a fix made on the final pass still counts as success.
if npm test >/dev/null 2>&1; then
  echo "✅ tests pass, stopping after $MAX_ITERATIONS iterations"
  exit 0
fi

echo "❌ hit the $MAX_ITERATIONS-iteration cap without passing, go look"
exit 1
Run it inside your sandbox:
chmod +x loop.sh
./loop.sh
Watch what happens. The first pass writes a plausible
encode/decode. Maybe it fails the
“0 maps to a single character” edge case. The loop does not
argue or congratulate — it just runs the tests, sees red, and
hands the agent the same prompt again. This time the agent sees its
own failure in npm test output and fixes the edge case.
Green. The loop notices and exits. You did not type a single
correction.
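For reference, here is one implementation that would satisfy the three tests, including the zero edge case. It is a hand-written sketch, not output captured from an agent run, and the alphabet order and error handling are choices the tests do not dictate. To use it as `src/shorten.js`, put `export` in front of each function; it is written without module syntax here so it can be pasted into a plain Node script.

```javascript
// 10 digits + 26 lowercase + 26 uppercase = 62 URL-safe characters.
const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
const BASE = ALPHABET.length;

// Convert a non-negative integer id to a base62 string.
function encode(id) {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError("id must be a non-negative safe integer");
  }
  // Edge case: the while loop below never runs for 0, which would yield "".
  if (id === 0) return ALPHABET[0];
  let code = "";
  let n = id;
  while (n > 0) {
    code = ALPHABET[n % BASE] + code; // prepend the least significant digit
    n = Math.floor(n / BASE);
  }
  return code;
}

// Convert a base62 string back to the integer id.
function decode(code) {
  let id = 0;
  for (const ch of code) {
    const digit = ALPHABET.indexOf(ch);
    if (digit === -1) throw new RangeError(`invalid character: ${ch}`);
    id = id * BASE + digit;
  }
  return id;
}

console.log(encode(12345), decode(encode(12345)), encode(0));
```

A first attempt that omits the `id === 0` branch returns an empty string for zero, which is exactly the kind of failure the loop catches on its next test run.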
The same loop with Codex
The pattern is tool-agnostic; only the command changes. Codex’s
non-interactive mode is codex exec, and instead of
--dangerously-skip-permissions you grant it an
auto-approving sandbox with --sandbox workspace-write
(which lets it read, edit within the workspace, and run local
commands without stopping to ask). Swap one line in
loop.sh:
# Codex equivalent of the `claude -p` line above:
codex exec "$(cat PROMPT.md)" --sandbox workspace-write
You may still see codex exec --full-auto in older scripts; that flag still works but now just prints a deprecation warning and forces --sandbox workspace-write, so prefer the explicit form in new scripts.
The official shortcut
Once you understand what the loop is doing, you do not have to hand-roll the bash every time. Anthropic ships a Ralph loop as a plugin. Inside a Claude Code session:
/plugin install ralph-wiggum@claude-plugins-official
It does the same thing — re-prompts, refuses to let the agent quit early, stops on an objective signal — with the iteration cap and progress tracking handled for you. Reach for it once the shape above feels obvious.
When not to reach for a loop
A loop is only as good as its stop condition, and the stop condition has to be something a machine can check. That makes loops superb for work with an objective oracle — passing tests, a clean type-check, a green build, a successful deploy — and a waste of tokens for work without one. “Make this copy more persuasive” has no test that goes green, so a loop will just spin (or worse, satisfy a proxy you did not mean). For those tasks, stay in the chat. Loop the things you can measure.
What you end up with
A working mental model — goal, state, feedback, stop condition — and a runnable Ralph loop that drives a coding agent to green tests with no babysitting. You have stopped being the loop. The companion article takes this same project up two rungs: a checklist-driven loop that builds many features in sequence, and a multi-agent loop that runs work in parallel and grades itself.