How to Stop Prompting Agents and Start Designing Loops

The mental shift from typing prompts to engineering the loop that types them for you — plus your first runnable example.

Read time: About 12 minutes
What you need: A coding agent (Claude Code or Codex) and a sandbox
The shift: You design the loop, not the prompt

There is a line going around that sums up where agentic coding is heading:

“Here’s your monthly reminder that you shouldn’t be prompting coding agents anymore. You should be designing loops that prompt your agents.”
Peter Steinberger

It sounds like a slogan, but it describes a real change in how the most productive people use tools like Claude Code and Codex. Boris Cherny, who leads Claude Code at Anthropic, put it even more bluntly: “I don’t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops.”

This article explains the idea and walks you through the simplest possible working example. A companion article then takes you up two more rungs: an intermediate checklist-driven loop and a complex multi-agent one.

The shift: take yourself out of the loop

When you chat with a coding agent, you are the loop. You type a prompt, read the result, notice it is 80% right, type a correction, read again, correct again. The agent does the work; you provide the judgement, the memory of what was asked, and the decision about when it is done. You are the slow, expensive, easily distracted part.

As Simon Willison frames it, an LLM agent is really just something that runs tools in a loop to achieve a goal. The skill that matters now is not crafting the perfect single prompt — it is carefully designing the tools and the loop so the agent can drive itself toward the goal and know when it has arrived.

Designing the loop means answering four questions once, in writing, instead of answering them over and over in chat: what the goal is, where the state lives, what feedback the agent gets, and what tells it to stop.

The anatomy of a loop

Every agentic loop, however fancy, is the same three-beat cycle that Addy Osmani calls plan–act–observe:

  1. Plan — the agent reads the goal and the current state, and decides the next step.
  2. Act — it runs a tool: edits a file, runs the test suite, greps the codebase.
  3. Observe — it reads the result of that action and feeds it back in.

Repeat until the goal is met or a stop condition fires. Two details make the difference between a toy and something that actually ships work overnight:

State lives on disk, not in the context window

The model is amnesiac — each run starts fresh, and long conversations “rot” as the context fills up. The filesystem and git are not amnesiac. So a well-designed loop keeps its memory in files: a task list, a progress log, the code itself, the git history. Each iteration reads enough state from disk to know what to do next, then writes its progress back. This is why the loops below lean on plain text files instead of one giant prompt.
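As a minimal sketch of what "memory in files" can look like, the helpers below read the next open item from a checklist and append a line to a progress log. The file names `TODO.md` and `PROGRESS.log`, the `- [ ]` checkbox convention, and both function names are assumptions made for illustration; neither agent requires them.

```bash
#!/usr/bin/env bash
# Sketch only: TODO.md, PROGRESS.log and both function names are hypothetical.

# Print the first unchecked "- [ ] ..." item from the task list, if any.
next_task() {
  local file="${1:-TODO.md}"
  awk '/^- \[ \] /{ sub(/^- \[ \] /, ""); print; exit }' "$file"
}

# Append one line per iteration so the next fresh run can see what happened.
record_progress() {
  local iteration="$1" note="$2" file="${3:-PROGRESS.log}"
  printf 'iteration=%s %s\n' "$iteration" "$note" >> "$file"
}
```

A loop could call `next_task` at the start of a pass, hand that single item to the agent, and call `record_progress` afterwards, so a fresh context can pick up where the last one stopped.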

The stop condition must be objective

If you let the agent decide when it is done, it will declare victory early — models are eager to please. The trick, central to Geoffrey Huntley’s “Ralph” pattern, is to remove the agent’s ability to grade its own homework. Use a signal that does not care about the agent’s feelings: tests passing, the type-checker going quiet, the linter returning clean, a build succeeding. Re-running the same instruction until that signal flips is what turns “generate some code” into reliable engineering.
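One way to express such a signal in a script is a helper that succeeds only when every check command exits with status 0. This is a sketch under assumptions: `is_done` is a hypothetical name, and the commands in the example call are placeholders for whatever checks your project actually has.

```bash
#!/usr/bin/env bash
# Sketch only: is_done is a hypothetical helper, not part of any tool.

# Succeeds only if every check command passed as an argument exits 0.
is_done() {
  local check
  for check in "$@"; do
    if ! bash -c "$check" >/dev/null 2>&1; then
      echo "still failing: $check" >&2
      return 1
    fi
  done
  return 0
}

# Example call (placeholder commands; substitute your project's own checks):
#   is_done "npm test" "npx tsc --noEmit"
```

The agent's own claim of success never enters the decision; only exit codes do.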

Safety first: sandbox before you loop

A loop runs the agent unattended, approving its own actions. That is the whole point — and the whole danger. To run without stopping for permission, Claude Code needs --dangerously-skip-permissions and Codex needs an auto-approving sandbox. The name is a warning. Run an unsupervised agent on your laptop with your real credentials and you are exposed to three things at once:

Rule: never point an auto-approving loop at your real machine. Run it inside a disposable container with no network access and no real credentials, with only the project folder mounted in. Anthropic’s own devcontainer and Codex’s sandbox modes exist for exactly this. If you need a Linux box to run loops in, our isolated toolbox guide sets one up in a minute.

Everything below assumes you are inside such a sandbox. The one-time cost is a few minutes; the payoff is being able to walk away from the loop and trust it.
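If you want the harness itself to enforce that assumption, a small guard can refuse to start outside a container. This is a rough sketch: `IN_SANDBOX` is a hypothetical marker you would set yourself in the container image, and the `/.dockerenv` check is only a heuristic that applies to Docker.

```bash
#!/usr/bin/env bash
# Sketch only: IN_SANDBOX is a hypothetical opt-in marker you set yourself
# inside the container; /.dockerenv is a Docker-specific heuristic.

in_sandbox() {
  [ "${IN_SANDBOX:-}" = "1" ] || [ -f /.dockerenv ]
}

# Call this at the top of a loop script; it fails loudly outside a sandbox.
require_sandbox() {
  if ! in_sandbox; then
    echo "refusing to run: not inside a sandbox" >&2
    return 1
  fi
}
```

A guard like this does not make anything safe on its own; it only catches the mistake of launching the loop in the wrong terminal.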

The simplest loop that works: Ralph

The canonical first loop is so simple it became a meme. Geoffrey Huntley named it after Ralph Wiggum from The Simpsons, because it is just a while loop that keeps feeding the agent the same prompt until the work is actually done. In its barest form it is one line:

while :; do cat PROMPT.md | claude -p --dangerously-skip-permissions ; done

That is the whole idea. But a one-liner that never stops will burn tokens forever, so let us build the smallest useful version: a loop that builds a tiny URL shortener and stops the moment its tests pass. We will reuse this same little project as it grows in the next article.

1. Scaffold the project and write the failing tests

The objective signal is the test suite, so it has to exist before the loop starts. Create a tiny Node project with one test file that does not pass yet:

mkdir shortener && cd shortener
npm init -y
npm install --save-dev vitest
mkdir src test

Put a spec in test/shorten.test.js that describes what “done” means:

import { describe, it, expect } from "vitest";
import { encode, decode } from "../src/shorten.js";

describe("short codes", () => {
  it("round-trips an id to a code and back", () => {
    const code = encode(12345);
    expect(typeof code).toBe("string");
    expect(decode(code)).toBe(12345);
  });

  it("uses url-safe base62 characters only", () => {
    expect(encode(99999)).toMatch(/^[0-9a-zA-Z]+$/);
  });

  it("maps 0 to a single character", () => {
    expect(encode(0).length).toBe(1);
  });
});

Wire up the test script in package.json:

npm pkg set scripts.test="vitest run"

There is no src/shorten.js yet, so npm test fails. That failing red is the loop’s starting line.

2. Write the prompt the loop will repeat

The prompt is the agent’s standing orders. Keep it small and point it at the objective signal. Save this as PROMPT.md:

You are working in a Node project that implements a URL shortener.

Your goal: make `npm test` pass.

Steps:
1. Run `npm test` and read the failures.
2. Implement or fix `src/shorten.js` so the tests in `test/`
   pass. Export `encode(id)` and `decode(code)` functions.
3. Run `npm test` again to confirm.

Do not edit the test files. Do not weaken the tests.
When `npm test` exits cleanly, stop and say "ALL TESTS PASS".

Why “do not edit the tests”: the fastest way for an agent to make tests pass is to delete them. Spelling out that the tests are the oracle, not an obstacle, is the single most important line in any loop prompt.
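The prompt asks politely; a harness can also verify. The sketch below assumes the project is a git repository with `test/` already committed, which the steps above do not set up, and the two function names are hypothetical. It detects any change under `test/` and can put the tracked files back.

```bash
#!/usr/bin/env bash
# Sketch only: assumes a git repository in which test/ has been committed.

# True when nothing under test/ differs from the last commit
# and no new untracked files have appeared there.
tests_untouched() {
  git diff --quiet HEAD -- test/ &&
    [ -z "$(git ls-files --others --exclude-standard -- test/)" ]
}

# Put tracked test files back exactly as they were in the last commit.
restore_tests() {
  git checkout --quiet HEAD -- test/
}
```

A loop could call `tests_untouched` after each agent pass and run `restore_tests` when it fails, so a weakened test never counts as green.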

3. Write the loop

Now the harness. This is a real, runnable Ralph loop with the two guardrails the one-liner lacks: an iteration cap so it cannot run forever, and an objective exit — it stops itself the instant npm test succeeds. Save it as loop.sh:

#!/usr/bin/env bash
set -euo pipefail

MAX_ITERATIONS=10

for i in $(seq 1 "$MAX_ITERATIONS"); do
  echo "──────────── iteration $i / $MAX_ITERATIONS ────────────"

  # Objective stop condition: are we already green?
  if npm test >/dev/null 2>&1; then
    echo "✅ tests pass: stopping after $((i - 1)) iterations"
    exit 0
  fi

  # Hand the standing orders to the agent for one pass.
  # `|| true` keeps `set -e` from killing the loop if the agent exits non-zero.
  claude -p "$(cat PROMPT.md)" \
    --dangerously-skip-permissions \
    --output-format stream-json --verbose || true
done

# The final pass has not been checked yet, so test once more before giving up.
if npm test >/dev/null 2>&1; then
  echo "✅ tests pass: stopping after $MAX_ITERATIONS iterations"
  exit 0
fi

echo "❌ hit the $MAX_ITERATIONS-iteration cap without passing: go look"
exit 1

Run it inside your sandbox:

chmod +x loop.sh
./loop.sh

Watch what happens. The first pass writes a plausible encode/decode. Maybe it fails the “0 maps to a single character” edge case. The loop does not argue or congratulate — it just runs the tests, sees red, and hands the agent the same prompt again. This time the agent sees its own failure in npm test output and fixes the edge case. Green. The loop notices and exits. You did not type a single correction.

The same loop with Codex

The pattern is tool-agnostic; only the command changes. Codex’s non-interactive mode is codex exec, and instead of --dangerously-skip-permissions you grant it an auto-approving sandbox with --sandbox workspace-write (which lets it read, edit within the workspace, and run local commands without stopping to ask). Swap one line in loop.sh:

  # Codex equivalent of the `claude -p` line above:
  codex exec "$(cat PROMPT.md)" --sandbox workspace-write

Note: older guides use codex exec --full-auto; that flag still works but now just prints a deprecation warning and forces --sandbox workspace-write, so prefer the explicit form in new scripts.
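If you switch between the two tools often, the choice can live in one function instead of a hand-edited line. In this sketch `AGENT` is a hypothetical environment variable and `run_agent` a hypothetical name; the commands and flags are only the ones already shown in this article.

```bash
#!/usr/bin/env bash
# Sketch only: AGENT and run_agent are hypothetical names for illustration.

# Run one pass of whichever agent AGENT selects (defaults to claude).
run_agent() {
  local prompt
  prompt="$(cat PROMPT.md)"
  case "${AGENT:-claude}" in
    claude) claude -p "$prompt" --dangerously-skip-permissions ;;
    codex)  codex exec "$prompt" --sandbox workspace-write ;;
    *)      echo "unknown AGENT: ${AGENT}" >&2; return 2 ;;
  esac
}
```

Inside a loop script the agent line would become a call to `run_agent`, and `AGENT=codex ./loop.sh` would select Codex for that run.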

The official shortcut

Once you understand what the loop is doing, you do not have to hand-roll the bash every time. Anthropic ships a Ralph loop as a plugin. Inside a Claude Code session:

/plugin install ralph-wiggum@claude-plugins-official

It does the same thing — re-prompts, refuses to let the agent quit early, stops on an objective signal — with the iteration cap and progress tracking handled for you. Reach for it once the shape above feels obvious.

When not to reach for a loop

A loop is only as good as its stop condition, and the stop condition has to be something a machine can check. That makes loops superb for work with an objective oracle — passing tests, a clean type-check, a green build, a successful deploy — and a waste of tokens for work without one. “Make this copy more persuasive” has no test that goes green, so a loop will just spin (or worse, satisfy a proxy you did not mean). For those tasks, stay in the chat. Loop the things you can measure.

What you end up with

A working mental model — goal, state, feedback, stop condition — and a runnable Ralph loop that drives a coding agent to green tests with no babysitting. You have stopped being the loop. The companion article takes this same project up two rungs: a checklist-driven loop that builds many features in sequence, and a multi-agent loop that runs work in parallel and grades itself.

Further reading