Code w/ Claude 2026 · Spotify first-party case

Coding is no longer the constraint. But what makes Spotify fast isn't Claude

Everyone has access to the same Claude, so why does the leverage differ so much? Chief Architect Niklas Gustavsson's answer is counterintuitive: the secret isn't the model, it's the platform foundation Spotify built for humans years ago.

weeks-to-months → 3 days
a recent backend Java migration, then vs now
Fleet Management · Honk · Backstage
A reversal

The results first — then the turn

Spotify's numbers are striking. As of the talk (mid-2026), 96% of engineers code with AI every week, PR frequency is up 60%, and the vast majority of PRs are authored by a developer working alongside an AI agent. The adoption curve steepened sharply after the Opus 4.5 release in late 2025. As Niklas put it: "We roll out tools internally all the time to make our developers more productive, but we have never seen the rate of adoption that we've seen rolling out AI coding tools."

Chart: weekly AI coding adoption from mid-2025 to mid-2026 (0 to 96%), with a holiday dip and a steep rise after the Opus 4.5 release. Series: typical internal-tool adoption vs Claude Code.

So far it reads like another "we adopted AI and got faster" story. But Niklas turned it on its head: what makes Spotify this fast is not Claude itself. Follow that through and a corollary appears: buying the same tools doesn't equal reproducing Spotify's results — the difference lies outside the tool.

What actually makes them fast is the platform foundation they built years ago — one built entirely for humans. And the standardization done to make humans effective turns out to help agents even more.

Many assume "adopt the AI tool" is the starting line, and leverage follows. Spotify's experience is the opposite: the foundation sets the ceiling. Without that layer of standardization, even a great agent dropped into a fragmented codebase won't run well.

"If Claude has a lot of other code to look at, and that code looks roughly consistent, Claude will do a better job. That's what we're seeing."

They measured it: in more fragmented codebases, agent performance is measurably worse · Niklas Gustavsson

The foundation

So what does it actually look like?

The origin predates agents. Years ago Spotify hit a problem: the production codebase was growing seven times faster than the number of engineers. Developers spent more and more time on maintenance — upgrading dependencies, migrating APIs, patching vulnerabilities — and less on features. Migrations were the number one source of developer frustration.

Chart: engineers vs production codebase, with the codebase growing 7× faster
Headcount barely moved while the codebase multiplied — maintenance had to go to automation

To absorb that, they built two foundations. Both were built for humans — only later did they turn out to be exactly what agents needed too.

Foundation 1
Fleet Management
Instead of asking hundreds of teams to edit components by hand, engineers write small snippets that modify source code, and the system (called Fleetshift) runs them across thousands of components and opens PRs automatically.
2.5M+ automated PRs · ~half of all PRs since mid-2024
Foundation 2
Backstage
Their open-source internal developer portal. Before it, ~100 internal tools each did their own thing — fragmented and confusing. Backstage consolidated them into a single pane of glass around a component catalog.
~100 tools → one portal
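
The Fleet Management idea described above, a small code-modifying snippet run across many components with a PR opened wherever something changed, can be sketched roughly as follows. This is a minimal illustration and not Fleetshift's actual API: `runFleetShift`, the component objects, the service names, and the PR shape are all hypothetical, and components are in-memory objects rather than real repositories.

```javascript
// Hypothetical sketch: none of these names come from Spotify's tooling.
// A "shift" is a small function that rewrites one file's source text.
const replaceDeprecatedCall = (source) =>
  source.split("oldClient.fetch(").join("newClient.fetch(");

// Apply a shift to every file of every component and describe one PR per
// component that actually changed. Unchanged components produce no PR.
function runFleetShift(components, shift, title) {
  const pullRequests = [];
  for (const component of components) {
    const changedFiles = {};
    for (const [path, source] of Object.entries(component.files)) {
      const updated = shift(source);
      if (updated !== source) changedFiles[path] = updated;
    }
    if (Object.keys(changedFiles).length > 0) {
      pullRequests.push({ component: component.name, title, changedFiles });
    }
  }
  return pullRequests;
}

// Invented example components; only the first one uses the deprecated call.
const components = [
  { name: "playlist-service", files: { "Api.java": "oldClient.fetch(id);" } },
  { name: "search-service", files: { "Search.java": "index.query(term);" } },
];

const prs = runFleetShift(components, replaceDeprecatedCall, "Migrate to newClient");
console.log(prs.map((pr) => pr.component));
```

In this sketch, targeting and PR creation stay outside the shift function itself, which loosely mirrors the split the article describes between the snippet and the system that runs it.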

"The fewer technologies we are world-leading in, the faster we go."

One of Spotify's oldest engineering principles — predating AI by years, yet it paved the way for agents

Into leverage

How the foundation became a runway for agents

Start with the ceiling of deterministic scripts. Early Fleet Management ran on "write a script to change code" — great for simple, repeatable tasks, hard for complex ones, because defining transformations by manipulating an abstract syntax tree or writing regexes demands a lot of specialized expertise. The clearest example is the Maven dependency updater: its core job is just to find pom.xml and update Java dependencies, but to handle every corner case it grew to over 20,000 lines. Complex changes were beyond what anyone could write.

maven-bump.transform · deterministic script
// handle every corner case in pom.xml
if (node.type === 'dependencyManagement') {
  if (hasProperty(version)) resolveProp(...)
  else if (isRange(version)) parseRange(...)
  else if (isBOM(parent)) walkImportScope(...)
  // ...also profile / classifier
  // ...exclusion / relativePath / inheritance
}
if (node.type === 'plugin') { /* another pile */ }
// ...every corner hand-written, one by one
20,000+ lines of code
honk.prompt · hand off to agent
"Upgrade the Java dependencies in these repos to the new version. Handle BOMs, profiles, and inheritance chains as needed. Make the build pass, then open a PR."
one sentence → the agent handles the corner cases itself
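
To see why a deterministic updater accumulates corner cases, consider a deliberately naive sketch. The function `bumpVersion` below is hypothetical and far simpler than the updater the talk describes: it rewrites a literal `<version>` that directly follows a given `artifactId`, using a regular expression. It handles the plain case and silently does nothing when the version is indirected through a property, which is one of the cases a real updater has to cover. The artifact name and version numbers are invented.

```javascript
// Hypothetical, deliberately naive updater: a regex over pom.xml text.
// Assumes artifactId contains no regex metacharacters.
function bumpVersion(pomXml, artifactId, newVersion) {
  const pattern = new RegExp(
    "(<artifactId>" + artifactId + "</artifactId>\\s*<version>)[^<$]+(</version>)"
  );
  // A replacer function avoids "$1" being misread next to a digit.
  return pomXml.replace(pattern, (match, before, after) => before + newVersion + after);
}

// Plain case: the version is written out literally.
const literalPom =
  "<dependency><artifactId>example-lib</artifactId><version>1.0.0</version></dependency>";

// Corner case: the version is a property reference defined elsewhere.
const propertyPom =
  "<properties><example.version>1.0.0</example.version></properties>" +
  "<dependency><artifactId>example-lib</artifactId>" +
  "<version>${example.version}</version></dependency>";

const bumpedLiteral = bumpVersion(literalPom, "example-lib", "2.0.0");
const bumpedProperty = bumpVersion(propertyPom, "example-lib", "2.0.0");

console.log(bumpedLiteral.includes("<version>2.0.0</version>")); // true
console.log(bumpedProperty === propertyPom); // true: the property case was missed
```

Each missed case like this one needs its own hand-written branch, which is the growth pattern the article attributes to the 20,000-line updater.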

In February 2025, Spotify began using AI agents inside Fleet Management. After many iterations came Honk. As Niklas put it: it has a silly name and a silly icon, but it turns out to be very useful.

Here's the crux: Honk's power doesn't come out of thin air. It works because it stands on that foundation, and four pillars map directly onto it:

Fleetshift orchestrates it
Targeting, scheduling, and progress tracking stay with Fleet Management; Honk only does the code edits in the middle
Standardization lets it self-verify
Honk runs Claude (via the Agent SDK) with Spotify's own harness on K8s and runs CI builds across OSes; if a build fails, it fixes the problem and retries
Consistent code makes it sharper
The measurable gap noted earlier: with consistent templates everywhere to mirror, the agent does noticeably better
Active guardrails self-correct it
Backstage is exposed as MCP; golden state and lint checks flag a wrong pattern the moment it appears, and the agent corrects itself
Flow: Input → Orchestrate → Edit → Verify → Ship
Input: one prompt, the change in plain English
Orchestrate: Fleetshift (target · schedule · track)
Edit: Honk (Claude Agent SDK · K8s)
Verify: CI self-verify (builds across OSes); if it fails, Honk fixes and retries on its own
Ship: PR
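
The edit, verify, retry flow described above can be sketched as a small loop. This is an illustration under stated assumptions rather than Honk's implementation: `runWithVerification`, `editCode`, and `runBuild` are hypothetical stand-ins, the agent and CI are faked with plain functions, and the retry limit of three is invented for the example.

```javascript
// Hypothetical sketch of an edit -> verify -> retry loop.
// editCode(prompt, feedback) stands in for the agent; runBuild(code) for CI.
function runWithVerification(prompt, editCode, runBuild, maxAttempts = 3) {
  let feedback = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const code = editCode(prompt, feedback);
    const result = runBuild(code);
    if (result.ok) return { status: "open-pr", attempts: attempt, code };
    feedback = result.error; // feed the failure back into the next attempt
  }
  return { status: "needs-human", attempts: maxAttempts, feedback };
}

// Fake agent: the first attempt forgets an import and the next one adds it
// once it has seen a build error.
const fakeAgent = (prompt, feedback) =>
  feedback === null ? "use(Record)" : "import Record; use(Record)";

// Fake CI: passes only when the import is present.
const fakeBuild = (code) =>
  code.includes("import Record")
    ? { ok: true }
    : { ok: false, error: "missing import: Record" };

const outcome = runWithVerification("Adopt Record", fakeAgent, fakeBuild);
console.log(outcome.status, outcome.attempts); // open-pr 2

// A build that never passes ends in a handoff instead of a PR.
const stuck = runWithVerification("Adopt Record", fakeAgent, () => ({ ok: false, error: "flaky" }));
console.log(stuck.status); // needs-human
```

The point of the sketch is that the loop only works when a trustworthy build signal exists to verify against, which is what the article credits standardization with providing.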

The results are concrete. By November 2025, Honk had generated more than 1,500 PRs that were merged into production, and not trivial ones: replacing Java value types with records, migrating data pipelines to a new version of Scio with breaking changes, and moving to the new frontend system in Backstage. These migrations took 60–90% less time than doing them by hand. Among all agents, Claude Code is Spotify's top performer, applied to about 50 migrations and accounting for the majority of merged agent PRs. That 3-day Java migration from the opening is exactly this at work.

1,500+ PRs
generated by Honk and merged to production (by Nov 2025)
60–90%
time saved on complex migrations vs by hand
3 days
most recent backend Java migration, once weeks-to-months

Developers found new uses on their own. Honk lives in Slack, where engineers mention it mid-conversation — a natural source of context — and it flies off, works on the problem, and returns with a PR. Their internal real-time dashboard is called Goose Farm, where each goose is an active Honk session. Honk v2 added multiplayer collaboration, so agents work with multiple developers and teams, not just one person at a terminal.

🪿🪿🪿🪿🪿🪿🪿🪿🪿🪿🪿🪿🪿🪿🪿🪿
Goose Farm · each goose = one running background coding session
"Firmer guardrails are accelerators, not constraints."

Niklas Gustavsson · Chief Architect, Spotify · Code w/ Claude 2026

Bottleneck shift

Coding is no longer the bottleneck — humans are

As coding velocity rises, the constraint shifts to human decisions. Spotify has always had more ideas than capacity to build them, but now anyone can open Claude in the client monorepo and prototype a feature idea in minutes instead of days. Even the CEO is building prototypes this way.

The flip side: there are now 60% more PRs to review. Spotify is learning where to apply human judgment, auto-merging what's safe and focusing review where it matters most. As the bottleneck moves from coding to decision-making, the bets made years ago on Fleet Management, Backstage, and standardization are what let Spotify absorb the shift.

The old bottleneck
Writing code
prototyping went from days to minutes; anyone can open Claude and try
The new bottleneck
Review & decisions
60% more PRs to review; judgment is the new scarce resource
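
One way to picture "auto-merge what's safe, review what matters" is a simple routing rule. The sketch below is purely illustrative: `routePullRequest`, its fields, and the 50-line threshold are invented for the example, and the talk does not describe the criteria Spotify actually uses.

```javascript
// Hypothetical triage rule. Every field and threshold here is invented for
// illustration; the source does not describe Spotify's actual criteria.
function routePullRequest(pr) {
  const lowRisk =
    pr.ciPassed && pr.author === "automation" && pr.changedLines <= 50;
  return lowRisk ? "auto-merge" : "human-review";
}

const queue = [
  { id: 1, ciPassed: true, author: "automation", changedLines: 12 },
  { id: 2, ciPassed: true, author: "developer", changedLines: 12 },
  { id: 3, ciPassed: false, author: "automation", changedLines: 4 },
];

const decisions = queue.map((pr) => routePullRequest(pr));
console.log(decisions); // [ 'auto-merge', 'human-review', 'human-review' ]
```

Whatever the real criteria are, a rule of this kind concentrates reviewer attention on the PRs that fall outside the low-risk path.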
Primary sources
Coding is no longer the constraint: Scaling devex to teams and agents at Spotify

Niklas Gustavsson (Chief Architect & VP of Engineering, Spotify) @ Code w/ Claude 2026 · official recording

Spotify Engineering blog · the Honk series

Figures follow the official talk (96% / +60% PRs). Faithful to the talk and blogs; key numbers verified item by item.