Figure: value vs. % AI-written lines, peaking in the 30–40% sweet spot, dipping through the 50–90% valley, and rising again at 100% (vibe coding).

I’ve been leaning into LLMs, and especially agentic programming, over the past few months. One pattern keeps showing up: value does not scale linearly with the share of code written by AI.

By value I mean a blend of time-to-feature, error rate, and the sense that we’re in control while building. The % of AI-written lines is the rough share of code introduced or rewritten by an agent or AI assistant.
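Purely as an illustration (the terms and weights here are my own shorthand, not measured quantities), one could write that blend as

$$\text{value}(p) \;\approx\; \alpha \cdot \frac{1}{\text{time-to-feature}(p)} \;+\; \beta \cdot \bigl(1 - \text{error rate}(p)\bigr) \;+\; \gamma \cdot \text{control}(p)$$

where $p$ is the share of AI-written lines and $\alpha, \beta, \gamma$ are whatever weights fit your team. The curve discussed below is this value plotted against $p$.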

0%: Pre-LLM

This is how our lives were before LLMs: classic software engineering. Predictable, sometimes slow, and often heavy on boilerplate. But don’t forget copy-pasting error-prone code from Stack Overflow…

1–20%: Assisted, not agentic

Up to about 20%, we ask for code reviews and small optimizations, and we copy-paste snippets from a chat window. It’s great for getting started with a new API or for quick fixes. The value rises quickly here, with almost no coordination cost.

30–40%: My sweet spot

When the AI writes around a third of the lines, we can plan and implement encapsulated features, tune performance, and refactor individual modules. We still understand the codebase end-to-end, and reviewing AI-generated changes stays easy. This is where productivity peaks.

50–90%: The valley

Past roughly 50%, the valley begins: value drops below the no-AI baseline. Most lines are AI-authored, but we still need a robust mental model to review changes, keep boundaries intact, and prevent our edits from being overwritten. Coordination costs climb faster than throughput, and it starts to feel as if we’re running behind the agent rather than working with it.

100%: Vibe coding

At 100%, we’re vibe coding. We treat the codebase like a black box and interact with it almost exclusively through the agent. The How becomes opaque; the What begins to shine. It feels liberating to realize projects we never had time to build before. But there are disadvantages: large diffs are common, libraries get swapped out casually, and targeted manual edits require care so the agent doesn’t undo them. This mode works best when we accept the abstraction, explore ideas, and interact mostly through the agent.

Finding your spot

In the end, everybody has to find their own sweet spot. Here are my rules of thumb: when specifications are tight, tests are in place, and context is reliable, we can safely push the share of AI-written code upward; when our goals are fuzzy, the 30–40% region is the better place. Before we try to scale, it helps to freeze the edges (files, module boundaries, and API shapes) so our later edits don’t get overwritten. Establishing guardrails also helps: formatting, linting, tests, and a CI that refuses sweeping agent diffs without test coverage; a sketch of such a gate follows below. And it’s worth checking in regularly: if our review time starts to exceed our build time, that’s the cue to dial the percentage back.
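Here is a minimal sketch of that last guardrail. It assumes a flow where origin/main is the base branch and tests live under tests/ (both are assumptions, as is the line threshold); it fails the build when a sweeping diff arrives without touching a single test:

```python
"""CI gate: refuse sweeping diffs that come without test changes.

A minimal sketch; the threshold, base ref, and test-path convention
are illustrative assumptions, not a standard.
"""
import subprocess
import sys

MAX_CHANGED_LINES = 500          # illustrative threshold
TEST_PREFIXES = ("tests/",)      # assumed test layout
BASE_REF = "origin/main"         # assumed base branch

def changed_files() -> list[tuple[int, str]]:
    """Return (changed_lines, path) pairs for the diff against BASE_REF."""
    out = subprocess.run(
        ["git", "diff", "--numstat", BASE_REF],
        capture_output=True, text=True, check=True,
    ).stdout
    rows = []
    for line in out.splitlines():
        added, deleted, path = line.split("\t")
        # Binary files report "-"; count them as zero changed lines.
        changed = (int(added) if added != "-" else 0) + \
                  (int(deleted) if deleted != "-" else 0)
        rows.append((changed, path))
    return rows

def main() -> int:
    rows = changed_files()
    total = sum(n for n, _ in rows)
    touches_tests = any(p.startswith(TEST_PREFIXES) for _, p in rows)
    if total > MAX_CHANGED_LINES and not touches_tests:
        print(f"Refusing diff: {total} changed lines, no test changes.")
        return 1
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

The gate is dumb on purpose: it doesn’t judge the code, it only forces large changes to arrive with evidence.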

What’s next

With vibe coding, we’ve reached the end of the x-axis, but the interesting growth may be vertical. The value isn’t only in how many lines the agent writes; it’s in how faithfully our intent is realized. If we can keep the intent intact—from a specification we can read and execute, expressed in a natural-to-write, formally verifiable DSL, through an interactive compiler, into code the agent can shape and verify—we shorten the distance between idea and realization. That path depends on better ways to interact with agents, tighter links from specs to implementation, traceable refinement loops, and tooling that keeps agents’ contexts synchronized with our edits. Once those pieces are in place, the doors open to language features that used to feel too costly—richer types, effect systems, even dependent types—because the agent handles the heavy, mechanical write-up while adhering to the verifiable spec. As these pieces mature, the valley narrows and the sweet spot widens.
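To make that last idea concrete, here is a toy sketch in Lean 4 (the names Spec and impl are hypothetical, and a real spec language would be far richer): the specification is a proposition, and the implementation type bundles the result with a machine-checked proof, so an agent edit that drifts from the intent simply fails to compile.

```lean
-- Toy verifiable spec: "the output is the reversed input".
def Spec (xs ys : List Nat) : Prop :=
  ys = xs.reverse

-- The subtype { ys // Spec xs ys } couples the value to its proof.
-- An agent may rewrite the body freely; if the proof no longer
-- holds, the compiler rejects the change.
def impl (xs : List Nat) : { ys : List Nat // Spec xs ys } :=
  ⟨xs.reverse, rfl⟩
```

The encoding itself isn’t the point; the point is that a checkable spec lets the agent own the mechanical writing while the compiler polices intent.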