What if iteration is all we need?

What if every ceremony in software development was a workaround for expensive human iteration? What happens when juniors learn architecture on day one and spec-driven development turns out to be waterfall in disguise? Are we in the best restaurant in the world, but there’s no menu?

What if iteration is all we need?

Anthropic published their AI Fluency Index last week. It studies how people collaborate with AI—which behaviors they exhibit, how they iterate, where they push back.

85.7% of productive conversations with Claude exhibit iteration and refinement. Those conversations consistently show twice as many productive behaviors as non-iterative ones—questioning assumptions, identifying gaps, pushing back on weak reasoning. Users who iterate are 5.6 times more likely to challenge the model’s output. Four times more likely to spot what’s missing.

Iteration is the practice that produces all the others. The loop itself generates the competence!

I’ve been thinking about feedback loops in agentic engineering for a while now. The elastic loop—the spectrum from tight, synchronous co-driving to loose, asynchronous delegation—has been the central concept in my talks and writing on how teams actually work with AI agents. ThoughtWorks recently published a retreat report arriving at a similar structural idea with their “middle loop”—supervisory engineering work between inner-loop coding and outer-loop delivery.

But reading the Anthropic data, I think the insight goes further than either framing. What if iteration is the only thing that ever mattered—and we spent decades building an elaborate scaffolding of workarounds because human iteration was too expensive to be the default?

The ceremony graveyard

Every process artifact in modern software development is a response to a human limitation.

We invented sprint planning, standups, and user stories to manage human cognitive limits—you can only hold so many tasks in your head, you need regular sync points, work has to be broken into digestible chunks. Estimation exists because you don’t know if a feature takes two days or two weeks until someone sits down and tries. And code review exists because humans make mistakes that other humans catch. Interestingly, that last one applies to machines too—agents turn out to be excellent reviewers of each other’s work, precisely because they bring overlapping but different context. Codex, for instance, does sharp security reviews. So code review won’t die. It’ll compress from days to seconds, like time dilation for quality assurance.

All of these practices were reasonable—because human iteration cycles were measured in days and weeks. When a single iteration is expensive, planning prevents wasted cycles, estimation predicts how many you can afford, and ceremonies keep everyone synchronized. The entire apparatus of Scrum, SAFe, Kanban boards, XP rituals—at its core, a risk management framework for expensive iteration.

As Nate B. Jones recently put it, analyzing the state of agentic engineering:

“Every one of these structures is a response to a human limitation. And when the human is no longer the one writing the code, those structures aren’t optional—they’re friction.”

StrongDM’s software factory—three people, no sprints, no standups, no Jira—is the logical endpoint. They write specifications and evaluate outcomes. That’s it. The entire coordination layer that consumes most of an engineering manager’s week simply doesn’t exist—because it no longer serves a purpose.

What replaced it isn’t nothing. As Simon Willison documented, StrongDM built scenarios instead of tests—behavioral specifications living outside the codebase, functioning as “holdout sets” the agent never sees during development. They built a digital twin universe—simulated clones of every external service for full integration testing without touching production. I’m not yet convinced this is always a good idea. But at least they experiment and try. They have to, because they want the dark factory. They replaced human ceremonies with machine ceremonies; the friction around the loop disappeared.

On the elastic loop spectrum, the dark factory sits at the loosest possible end—maximum delegation, minimum human involvement in implementation. But inside that loose outer loop, StrongDM built tight inner loops: scenarios that evaluate every change, digital twins that catch integration failures instantly. Loops within loops—loosening the outer constraints while tightening the inner feedback cycles.
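The holdout idea is easier to see in code than in prose. Here is a minimal sketch, not StrongDM’s actual format: behavioral checks defined as data, kept outside the codebase the agent works in, and run against the agent’s output after the fact. All names (`Scenario`, `grant_access`, the scenario list) are illustrative.

```python
# Sketch of "scenarios as holdout sets": behavioral specifications the
# agent never sees during development, evaluated against its output.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Scenario:
    name: str
    check: Callable[[], bool]  # evaluates observable behavior, not code

# Hypothetical system under test: whatever the agent just produced.
def grant_access(user_role: str, resource: str) -> bool:
    return user_role == "admin" or resource == "public-docs"

# Lives outside the repo the agent touches, like a holdout set in ML.
SCENARIOS = [
    Scenario("admins reach everything",
             lambda: grant_access("admin", "billing")),
    Scenario("guests reach only public docs",
             lambda: not grant_access("guest", "billing")),
]

def evaluate(scenarios: list[Scenario]) -> list[str]:
    # Returns the names of failed scenarios; empty means the change passes.
    return [s.name for s in scenarios if not s.check()]

print(evaluate(SCENARIOS))  # → []
```

Because the agent optimizes against its own context, anything inside that context can be gamed; keeping the evaluation outside is what makes the feedback honest.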

Should you fear the dark factory?

Dan Shapiro (CEO of Glowforge) published a framework that maps where the industry stands. Five levels of AI-assisted development, from Level 0 (spicy autocomplete—you type, the AI suggests the next line) to Level 5 (the dark factory—specification in, working software out, no human writes or reviews code). StrongDM operates at Level 5. Most of the industry is between Level 1 and 3, treating AI like a junior developer.

Level 5 sounds frightening—no human writes or reviews code, the lights are off. But let’s take a look at what’s actually happening inside the factory: someone writes a specification. The agent implements it. Scenarios evaluate whether it works. The human reviews the outcome and iterates on the spec. That’s a loop. A very loose one—but it’s the same elastic loop, stretched to its widest. The dark factory looks like the widest possible stretch of the loop we can imagine at this point in time. Maybe there’ll be a Level 6?

The dark factory is what happens when the loop migrates all the way up—from iterating on code to iterating on intent. The work moves to where it always mattered most: deciding what should exist and describing it precisely enough that machines can build it.

This distinction matters because there’s a growing trend that looks like the dark factory in miniature but misses the point entirely. Spec-driven development—write a perfect specification, hand it to an agent, expect working software in one shot—is becoming the new developer sport of choice. Who can write the spec that zero-shots the feature? I think this is just waterfall with better tooling. A specification that has to be perfect before the agent touches it is a requirements document by another name. Teams are reintroducing the very pattern that agile was supposed to kill, and they’re not noticing because the agent delivers fast enough to mask the underlying problem.

StrongDM’s factory works precisely because it doesn’t zero-shot. The specification goes in, the agent builds, the scenarios evaluate, the human reviews, the spec gets refined. That’s a loop—maybe three or four iterations deep before something ships. The dark factory without the loop is just an automated waterfall. Fast, yes. But brittle in exactly the same ways waterfall always was: assumptions baked into the spec that nobody questions until production.

So, should you fear that? Uwe Friedrichsen argues that much of the current AI discourse (especially on X, if I may note) is fear-mongering—and he’s right. The tools are immature, and nobody needs to panic about falling behind on the latest agent harness. Every early technology goes through this phase. The arcane incantations of today—the exact prompt format, the right model settings, which agentic harness works best—will be forgotten in a year. Remember prompt engineering?

But I think the fear is pointed at the wrong thing. The question was never whether you can use the tools—the tools will get easier. The question is whether you know what and how to build with them. It’s a thinking problem. And one that gets more exposed with every generation of better tools.

So: it depends on which question scares you more. “Can we build it?” or “Should we build it?” The first one is getting answered by machines—and the tools to direct them will only get more accessible. The second one never will be. If your value was in the coding—the act of translating intent into software—yes, the assumptions underneath are changing. If your value was in the building, the real kind, knowing what should exist and why, you just got unlimited capacity. Though “unlimited” comes with its own weight: directing agents, evaluating output, holding context across parallel workstreams. The cognitive load simply shifts elsewhere. We did a podcast in June 2025 (!) on “AI burnout” (in German). We must have been early to the table.

The people inside the loop

A colleague of mine who works as a product owner in a customer project has stopped using Figma. She does everything with Claude Code now—prototypes for customers as actual software, with the actual UI. She shows it to the customer, gets feedback, iterates. I wrote about her in the elastic loop post as an example of a tight loop pattern. But I undersold what’s actually happening. She’s learning (by practicing) software development—without meaning to. Because the loop between her intent and the running software is so tight that she immediately sees what works and what doesn’t. What matters here is the silo that just broke. She’s a product person working directly with code, getting real feedback from real software. And that raises the interesting question: does she now get the same agentic development environment the engineering team uses? Can her prototypes flow into the actual codebase? Because if the loop stops at “impressive demo” and never connects to production, the silo just got a nicer window.

The Anthropic data indicates why this works. People who iterate grow—they question more, identify gaps more, evaluate more critically. The loop has always been the teacher. And now, as the cost of iteration drops, it opens up to people who were never allowed inside: designers, product thinkers, juniors, anyone whose role kept them one handoff removed from the real feedback. The loop builds the skill.

Now think about what that means for the overqualified Scrum Master.

I know this person. You probably do too. Someone with deep systems understanding, architectural intuition, the kind of person who could be doing engineering or product work—sitting full-time in ceremonies, managing a process that exists because humans in teams need coordination. ThoughtWorks described the skills needed for their middle loop: thinking in delegation and orchestration, strong mental models of system architecture, rapidly assessing output quality. That’s exactly what a good Scrum Master already does—just applied differently. The work changes; the person was always ready. For the first time, they get to do what they’re actually qualified for.

Or the product unicorn I’ve worked with for a long time (actually one of my former co-founders of Reisekosten-Gorilla)—someone who can do product, development, and design. Rare and extremely impressive. And perpetually trapped in the ceremonies of product development, where teams iterate over meta-artifacts instead of practicing actual iteration: personas, jobs-to-be-done, user stories, epics, sprint rituals. The moments where we locked ourselves in a room for three days with a whiteboard and a problem—those were when we actually solved the hard problems. Even before AI. Uninterrupted iteration, no ceremonies, just the loop. Now imagine giving them an agent: they iterate in an afternoon through what used to take a sprint (or several). But the organizational ceremonies still expect two-week cycles, handoff documents, and approval gates. The system rewards conformity with the process, not speed of insight.

Or the designer who’s only ever been allowed to “paint” Figma mockups in their silo, with artificial click paths that feed into an implementation pipeline measured in sprints. Designers who could think in systems, who understand user behavior deeply—constrained to producing static artifacts that someone else translates into software, slowly, with loss at every handoff.

The junior who learns architecture on day one

This one’s controversial, and I think it’s the most important.

The traditional career ladder in software engineering works like an apprenticeship wearing enterprise clothing. Juniors write simple features and fix small bugs in their first years. Seniors review the work and mentor them. Over time, a junior becomes a senior through accumulated experience. The entry point is always the bottom: implementation work, small scope, supervised execution—at best wrapped in feedback loops.

AI breaks this model at the bottom. If agents handle simple features and bug fixes, where do juniors learn? The career ladder, as Nate B. Jones puts it, “is getting hollowed out from underneath. Seniors at the top, AI at the bottom, and a thinning middle where learning used to happen.”

Was the apprenticeship model actually good, or was it just the only one we could afford?

When a junior today builds a service from scratch with an agent—write a spec, let the agent implement, evaluate the output, iterate—they’re doing on day one what seniors do: making architecture decisions, evaluating tradeoffs, reasoning about system design. Yes, their specs will be terrible at first. So will their architecture designs. But the loop corrects (if it is in place!). The agent builds exactly what the spec says—including every gap, every ambiguity, every unconsidered edge case. The junior sees immediately where their understanding was wrong. That’s a more brutal feedback loop than five years of bug fixing. It’s also faster. The challenge now is finding ways to bring juniors into architecture work as peers—people who’ve already felt the consequences of bad specs in their own loops. A different starting point than we’ve ever had.

The Harvard/P&G “Cybernetic Teammate” study found that a single person with AI outperforms a two-person human team without AI—in solution quality. And it found that AI blends siloed expertise: salespeople suddenly had technical context, marketing people had sales context, engineers had product intuition. No human gatekeeping of expertise anymore.

Evidence that silos were actively worse ways of distributing knowledge than we assumed. We treated specialization as the path to depth. It turns out that rapid iteration across boundaries builds depth faster than years of narrowly scoped work—if the feedback is honest.

The restaurant with no menu

What’s making me nervous: We’re sitting in the best restaurant in the world. Every imaginable dish is suddenly affordable. Ingredients that used to be scarce are abundant. The kitchen can execute anything.

But there’s no menu.

And out of safety, risk aversion, and a simple lack of imagination, most people keep ordering what they know. The schnitzel was fine last time. Let’s do the schnitzel again.

Aaron Boodman (Replicache) observed this directly:

“Many people don’t know what to build despite having powerful tools. Even with AI tools that can build almost anything, many employees face an ‘idea crisis’—they simply don’t know what to create.”

This is the uncomfortable twin of the identity crisis that ThoughtWorks flagged—the “genuine identity crisis for developers who fell in love with programming” when the work shifts from implementation to supervision. But the identity crisis (“who am I if I don’t write code?”) has a companion: the idea crisis (“what should I even build?”).

They reinforce each other! The person in the silo always had both questions answered from outside. You knew who you were (I’m the backend developer) and you knew what to do (the ticket in the sprint). Both were surrogates for your own thinking. When the execution barrier falls, you see for the first time who can actually think about what should exist—and who was only ever executing someone else’s vision.

The people who’ll thrive were always the hardest to replace: people who understand customers, think in systems, hold ambiguity, and make decisions under uncertainty: all facets of high agency. People who can articulate what needs to exist before it exists. The factory—dark or otherwise—amplifies them. If you will, it turns a great product thinker with five engineers into a great product thinker with unlimited engineering capacity.

Iteration with resistance

The Anthropic study itself flags a warning: when AI produces polished artifacts—code, documents, interactive tools—users become more directive but less evaluative. They’re 3.1 percentage points less likely to question reasoning. 5.2 points less likely to identify missing context. The better the output looks, the less people interrogate it.

A junior who builds a service with an agent and sees “it works” might learn less than one who struggles manually—because the failure becomes invisible. The code runs, the tests pass, but the architecture is fragile in ways the junior can’t see. StrongDM solved this with scenarios as holdout sets: evaluation criteria the agent never sees. But who writes the scenarios for junior development?

The loop alone isn’t enough. You need iteration with resistance (Armin Ronacher calls it back pressure)—something in the loop that pushes back honestly. TDD as prompt engineering, as ThoughtWorks calls it: tests that exist before the code, preventing the agent from verifying its own broken behavior. Specifications precise enough that ambiguity can’t hide. Constraints that make incorrect code unrepresentable.
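Back pressure is mechanical, not rhetorical: the tests exist before any implementation, so the agent iterates against fixed expectations instead of grading its own work. A hedged sketch, with an invented `slugify` feature standing in for whatever the agent is asked to build—this is not ThoughtWorks’ or StrongDM’s actual setup:

```python
# "TDD as prompt engineering": the spec (tests) is written first and stays
# fixed; the agent's output must pass it before the change is accepted.
import unittest

def slugify(title: str) -> str:
    # Stand-in for the agent-produced implementation on this iteration.
    return "-".join(title.lower().split())

class SlugSpec(unittest.TestCase):
    # Written before the code existed—the agent cannot rewrite these to
    # make its own broken behavior pass.
    def test_lowercases_and_joins(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_whitespace(self):
        self.assertEqual(slugify("a   b"), "a-b")

# The back pressure step: run the fixed spec against the new output.
suite = unittest.TestLoader().loadTestsFromTestCase(SlugSpec)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # → True
```

The design choice that matters is the same as with the scenarios: the evaluation is outside the agent’s reach, so a green run means something.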

A lot of human ceremonies will need to die, but the need for honest feedback in the loop will not. Can we build machine equivalents fast enough? And are we honest enough to admit when the output looks good but isn’t?

What if iteration is all we need?

I started with a data point: iteration is the single strongest correlate of AI fluency. I’ve been modeling iteration as the elastic loop—the spectrum of collaboration rhythms with agents. ThoughtWorks found the same structural insight independently and called it the middle loop. But I think both framings were too modest.

Iteration is the unit of all productive work. It always was! The agile manifesto said exactly this—and then we buried it under Scrum, SAFe, and two decades of ceremony. We built an entire industry of workarounds because human iteration was expensive and slow. AI has driven the cost of iteration toward zero. And now we can see what was always true: the loop is the thing. Everything else was scaffolding.

The scaffolding was holding people back—but it was also answering two questions that most people never had to answer for themselves: who am I, and what should I build?

Nobody has this figured out yet. I’ve been iterating on the elastic loop for months and each conversation reshapes it. The Anthropic data reframed it again. ThoughtWorks found it, too. That’s the whole point! In a shift this large, we have to think in the open—share half-formed ideas, let others poke holes, iterate, and share again. The same loop that makes agentic engineering work—tight feedback, honest resistance, closed loops—is the one we need to apply to understanding the shift itself.

And I won’t pretend this is comfortable. If you’re running an engineering organization, what I’m describing requires nerve. Dismantling ceremonies that your teams rely on for structure. Letting people out of roles that defined their careers. Accepting that the junior you hired last year should be learning architecture, not fixing bugs. Watching your org chart flatten in ways that make middle managers wonder where they fit. There will be lots of pain. I keep thinking of the engineering organization as an espresso machine: you need pressure on the coffee bed—resistance matters. Too much, and the gasket blows. Too little, and the output tastes sour.

I think of this post as just another loop iteration. I’m putting it out there because I think the opportunity is enormous—for engineers, for product people, for designers, for anyone who was always more capable than their silo allowed. But we won’t figure out how to seize it by waiting until someone publishes the definitive guide. We’ll figure it out the way we always do with things that matter: by iterating.

Curious what the menu looks like when we stop ordering the schnitzel?