Spotify Built an AI Coding Agent (Honk) That Engineers Control From Their Phones

Joe Seifi
1h·Founder at EveryDev.ai

Spotify's Q4 2025 earnings call introduced Honk, an internal tool powered by Claude Code where engineers fix bugs and deploy features from Slack on their phones. Most coverage overlooks the decade of infrastructure that made it work.

"Our most senior engineers say they haven't written a single line of code since December."

That was Spotify co-CEO Gustav Söderström on the company's Q4 2025 earnings call this Tuesday, speaking not to developers at a conference but to Wall Street analysts with the stock price in play. The claim hit hard in developer circles. But the real story goes beyond the headline. It's about what Spotify built to get there, and why it matters more than the soundbite suggests.

What Honk Does (and What It Sits On Top Of)

Söderström described the following scenario: a Spotify engineer is on the bus, commuting to work. They open Slack on their phone, tell Claude to fix a bug or add a feature to the iOS app. Claude does the work. The engineer receives a new build pushed back through Slack, tests it on their phone, and merges to production before arriving at the office.

Spotify calls this system Honk.

It sounds like science fiction, but it relies on something more ordinary than most realize: plumbing. Solid, traditional infrastructure plumbing that Spotify has been building for years.

Honk did not emerge from an AI model alone. It sits on top of a system Spotify has been constructing since 2022 called Fleet Management, a framework for applying code changes across hundreds or thousands of repositories at once. Before AI entered the picture, about half of Spotify's pull requests were already flowing through this system. But the changes were limited to straightforward tasks: bumping dependencies, updating config files, swapping out deprecated method calls. Anything that required understanding the meaning of code, not only its structure, still needed a human.
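To make "structure, not meaning" concrete, here is a minimal sketch of the kind of change that needs no understanding of the code: a textual swap of a deprecated call, applied to many checkouts at once. This is not Spotify's Fleet Management code. The function names old_fetch and fetch_v2, the restriction to .py files, and the helper names are all hypothetical, chosen only for illustration.

```python
import re
from pathlib import Path

# Hypothetical rename: every call to old_fetch( becomes fetch_v2(.
# The \b keeps names like my_old_fetch( untouched.
DEPRECATED_CALL = re.compile(r"\bold_fetch\(")
REPLACEMENT = "fetch_v2("


def rewrite_source(text: str) -> tuple[str, int]:
    """Return the rewritten text and the number of call sites changed."""
    return DEPRECATED_CALL.subn(REPLACEMENT, text)


def apply_to_repo(repo_root: Path) -> int:
    """Rewrite every .py file under repo_root in place; return total edits."""
    total = 0
    for path in sorted(repo_root.rglob("*.py")):
        new_text, count = rewrite_source(path.read_text(encoding="utf-8"))
        if count:
            path.write_text(new_text, encoding="utf-8")
            total += count
    return total


def apply_to_fleet(repo_roots: list[Path]) -> dict[str, int]:
    """Run the same change over many checkouts; report edits per repo."""
    return {root.name: apply_to_repo(root) for root in repo_roots}
```

A pattern match like this works only while the change is mechanical. Once the right edit depends on what the surrounding code is trying to do, a regex has nothing to go on, which is the gap the article says was left to humans.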

That's where Claude Code came in. In July 2025, Spotify integrated the Claude Agent SDK into their Fleet Management infrastructure. The agent reads natural language prompts, navigates the codebase, makes changes, then runs formatters, linting, builds, and tests before opening a pull request. This entire process operates in a sandboxed container with limited permissions and virtually no access to surrounding systems. Engineers interact with it through an internal Slack bot.
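The sequence described above (format, lint, build, test, and only then a pull request) can be sketched as a simple gate. This is an illustration under stated assumptions, not Spotify's pipeline: CheckResult, run_check, verify_change, and ready_for_pull_request are hypothetical names, and the commands passed in are whatever a given repository uses.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_check(name: str, command: list[str], cwd: str = ".") -> CheckResult:
    """Run one verification command and capture its combined output."""
    proc = subprocess.run(command, cwd=cwd, capture_output=True, text=True)
    return CheckResult(name, proc.returncode == 0, proc.stdout + proc.stderr)


def verify_change(
    checks: list[tuple[str, list[str]]], cwd: str = "."
) -> list[CheckResult]:
    """Run checks in order, stopping at the first failure."""
    results: list[CheckResult] = []
    for name, command in checks:
        result = run_check(name, command, cwd)
        results.append(result)
        if not result.passed:
            break  # later checks are pointless once one has failed
    return results


def ready_for_pull_request(results: list[CheckResult], expected: int) -> bool:
    """Open a PR only if every expected check ran and passed."""
    return len(results) == expected and all(r.passed for r in results)
```

The captured output matters as much as the pass/fail flag: it is what an agent would read to work out why its change was rejected.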

According to Anthropic's published customer story, this system merges over 650 agent-generated pull requests into production every month, saving up to 90% of engineering time on complex code migrations. Those are real numbers from a real production codebase, not a controlled demo or a cherry-picked benchmark.

Why This Is a Backstage Story, Not a Claude Code Story

Most coverage treats Honk as a Claude Code story. It is really a Backstage story.

Backstage is Spotify's open-source internal developer portal, the system that catalogs every component, tracks ownership, and standardizes how software gets built across the company. It's what makes Fleet Management possible. You cannot safely turn an AI agent loose on thousands of repos if you don't know who owns what, how things are built, or what the dependency graph looks like. Spotify's engineering team emphasizes this directly: "You can't safely automate what you don't understand."
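A rough sketch of why ownership and dependency data come first: before an automated change goes out, you want to know which components have no owner to review it and in what order to roll it out. The Component class below is a hypothetical in-memory simplification, not Backstage's actual schema or API, and the component and team names are made up.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Optional


@dataclass
class Component:
    name: str
    owner: Optional[str] = None  # team that owns the component, if known
    depends_on: list[str] = field(default_factory=list)


def unowned(components: list[Component]) -> list[str]:
    """Components nobody owns: an automated change here has no reviewer."""
    return sorted(c.name for c in components if not c.owner)


def rollout_order(components: list[Component]) -> list[str]:
    """Order components so each one comes after everything it depends on."""
    graph = {c.name: set(c.depends_on) for c in components}
    return list(TopologicalSorter(graph).static_order())


catalog = [
    Component("app", "team-mobile", ["service"]),
    Component("service", "team-backend", ["lib"]),
    Component("lib"),  # no owner recorded
]
print(unowned(catalog))        # ['lib']
print(rollout_order(catalog))  # ['lib', 'service', 'app']
```

In this toy catalog the library at the bottom of the graph is the one without an owner, which is exactly the situation where an unattended agent change would be riskiest.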


[Infographic: Spotify's AI coding architecture as a layered stack. Backstage (2020) sits at the foundation, providing component ownership and dependency tracking. Fleet Management (2022) applies bulk code changes across thousands of repos. Claude Code with the Agent SDK (2025) edits code inside sandboxed containers, with a feedback loop through linting, testing, and a Judge verifier. Honk sits at the top, where engineers interact via Slack on their phones. A timeline runs along the left side, with stats showing 650+ PRs merged per month and 90% time savings.]

This is the part that doesn't generate attention-grabbing coverage but matters enormously. Spotify spent years building the organizational infrastructure that makes autonomous coding agents viable: clear component ownership, standardized build systems, and comprehensive test suites. The three-part engineering blog series they published in late 2025 candidly outlines the constraints. Without strong feedback loops, verifiers, and what they call a "Judge" to guide the agent, the code it produces "simply doesn't work."
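The article does not describe how Spotify's verifiers or Judge are implemented, so the following is only one plausible reading of the idea: objective checks run first, a separate judge then asks whether the patch does what was requested, and any rejection is fed back to the agent for another attempt. The agent, verifier, and judge callables are hypothetical stand-ins.

```python
from typing import Callable, Optional

# Hypothetical stand-ins, supplied by the caller:
#   agent(prompt, feedback) -> candidate patch
#   verifier(patch)         -> error message, or None if checks pass
#   judge(prompt, patch)    -> True if the patch does what was asked
Agent = Callable[[str, Optional[str]], str]
Verifier = Callable[[str], Optional[str]]
Judge = Callable[[str, str], bool]


def run_with_feedback(
    prompt: str,
    agent: Agent,
    verifier: Verifier,
    judge: Judge,
    max_attempts: int = 3,
) -> Optional[str]:
    """Return the first patch that passes both gates, or None if none does."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        patch = agent(prompt, feedback)
        error = verifier(patch)
        if error is not None:
            # Objective failure: hand the error text back to the agent.
            feedback = f"Verification failed: {error}"
            continue
        if not judge(prompt, patch):
            # Checks pass, but the patch misses the point of the request.
            feedback = "Judge rejected the patch: it does not match the request."
            continue
        return patch
    return None
```

The two gates catch different failures. A verifier can tell you the build is green; it cannot tell you the agent quietly solved a different problem, which is the gap a judge step is meant to cover.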

Niklas Gustavsson, Spotify's Chief Architect and VP of Engineering, confirms this wasn't a random model selection: "Claude has consistently delivered the strongest performance for large-scale code transformation work, which is why it has become our model of choice."

The takeaway for anyone watching this space is that the AI model is necessary but not sufficient. The real moat is infrastructure maturity.

"Haven't Written a Single Line of Code Since December"

Let's tackle the main claim directly.

Söderström's full quote, from the Q&A portion of the call: "Over Christmas, Christmas this year was an event, a singular event in terms of AI productivity. Certainly, I spent my entire vacation coding rather than being on holiday. And I think most people in tech did. A lot of things happened in December, including Opus 4.5 coming out to Claude Code. And we crossed the threshold where things started working."

He continued: "When I speak to my most senior engineers, the best developers we had, they say that they haven't written a single line of code since December. They only generate code and supervise it."

Two things can be true simultaneously. First, this is an earnings call statement from a co-CEO positioning his company as an AI leader; it's crafted for maximum investor impact. Second, it likely reflects something real about how top-tier engineers at a well-instrumented company are working right now, especially on the types of tasks that Fleet Management targets.

There's a meaningful distinction between "senior engineers aren't writing code" and "no code is being written." What Söderström describes sounds a lot like what the best senior engineers have always done: they architect, review, direct, and decide. The difference now is that the entity executing on their direction is an AI agent rather than a more junior team member.

Spotify's own engineering blog is more measured than the earnings call rhetoric. Max Charas, a Senior Staff Engineer, describes it as engineers "executing fleet-wide migrations at a pace that simply wasn't possible before." That's a productivity claim, not a replacement claim.

When Execution Gets Cheap, Strategy Gets Expensive

The most interesting thing Söderström said on the call wasn't about Honk at all. It was this:

"People feel like when you have AI, you don't need to plan anymore. I think it's going to be the opposite. When you have productivity on tap, what you need to have are very good plans so that these agents are highly utilized and stay busy."

This cuts against the prevailing narrative that you describe what you want and the AI handles the rest. Spotify is arguing the opposite: when execution becomes cheap, strategy becomes the bottleneck. The question shifts from "can we build it?" to "should we build it, and in what order?"

Co-CEO Alex Norström reinforced this: "This shift began more than two years ago. It was carefully planned... we now not only synchronize across the company with all of the different teams and their leaders, but we also set targets and we land planes that are important."

This is the part most companies trying to replicate Spotify's results will miss entirely. Having access to Claude Code is a basic requirement now. Having the organizational discipline to know what to point it at, and the infrastructure to make it work safely at scale, is where competitive advantage lives.

Why Most Companies Can't Replicate This Tomorrow

Deloitte's 2025 Emerging Technology Trends study found that while 30% of organizations are exploring agentic AI and 38% are piloting solutions, only 11% are actively using these systems in production. Gartner predicts over 40% of agentic AI projects will fail by 2027 because legacy systems can't support modern AI execution demands.

Spotify is in that tiny 11%. What separates it from the rest isn't the AI model; it's a decade of investment in the infrastructure that made the model useful.

Consider what Spotify had in place before Honk was even conceivable. Backstage has been deployed since 2020 for component cataloging and ownership tracking, and it's now open-source with broad enterprise adoption. Fleet Management has handled large-scale automated code changes since 2022. Their Kubernetes infrastructure has enough spare capacity to run hundreds of concurrent transformation jobs. Their CI/CD pipelines include automated testing that gives the agent reliable feedback on whether its changes work.

Most companies don't have any of this. They have massive codebases with unclear ownership, inconsistent build systems, and test coverage that varies wildly across teams. Introducing an AI coding agent into that environment doesn't improve things; it accelerates existing chaos.

The Question Worth Asking

The developer community tends to focus on whether the "not writing code" claim holds up, whether this is sustainable, and whether code quality suffers. Those are fair questions. But there's a more fundamental one.

Spotify deployed 50 new features in 2025. Their Q4 earnings showed $4.5 billion in revenue and 13% growth, and they're projecting 15% growth next quarter. They paid out $11 billion to music rights holders last year. Monthly streaming hours per user are up 20% over five years. By every meaningful business metric, the machine is working.

So if Honk and the broader AI toolchain let Spotify move even faster in 2026, does it make Spotify better for the 751 million people who use it? Or does it make Spotify more efficient at deploying things that needed more thought?

Söderström seems to have considered this. His insistence that planning "becomes more important, not less" suggests he understands the risk. And that may be the most underreported insight from the entire earnings call. The most senior engineers at Spotify may not be writing code anymore, but someone still has to decide what gets built.

About the Author

Joe Seifi

Founder at EveryDev.ai

Apple, Disney, Adobe, Eventbrite, Zillow, Affirm. I've shipped frontend at all of them. Now I build and write about AI dev tools: what works, what's hype, and what's worth your time.
