AI Engineer · May 19, 2026 · 20m

Personalization in the Era of LLMs - Shivam Verma, Spotify

TL;DR

Spotify is rebuilding personalization around a unified generative model — Shivam Verma says the company is moving away from siloed candidate-generation-plus-ranking systems toward a single transformer-style backbone that can power recommendations, search, and steerable experiences across products.
User modeling is the foundation, and Spotify computes it at huge scale — his team generates daily embeddings for more than 1 billion users, turning years of listening behavior into vectors that downstream systems use to recommend tracks, podcasts, and other content.
Spotify now puts users, songs, and podcast episodes in the same embedding space — Verma shows a visualization where his own user embedding sits near the Big Technology podcast, illustrating how cross-content modeling lets the system reason across music and spoken audio together.
Catalog understanding comes from teaching LLMs Spotify’s inventory with semantic IDs — following ideas popularized in work from Google/YouTube, Spotify compresses item embeddings into 4-6 tokens so models can autoregressively predict the next song or episode like they would generate text.
Open-weight LLMs bring world knowledge, while Spotify fine-tuning adds platform knowledge — the company adapts models like Llama and Qwen to blend general understanding with Spotify-specific content, gaining steerability and explainability while managing tradeoffs like catastrophic forgetting.
The endgame is editable, user-controlled personalization — products like AI DJ, prompted playlists, and the new Taste Profile let users talk to Spotify in natural language, inspect what Spotify thinks their taste is, and even tell it what to keep or forget.

Summary

From Discover Weekly to AI DJ: why Spotify is changing the stack

Shivam Verma opens by framing Spotify’s next chapter in personalization as less about classic agent workflows and more about “context engineering on the modeling side.” He grounds it in Spotify’s scale (750 million users, 100 million-plus tracks, roughly 400,000 audiobooks, millions of podcasts, 184 markets) and points to AI DJ, prompted playlists, and the newer podcast-capable prompted playlists as signs that recommendations are becoming conversational and steerable.

The old recommender pipeline is breaking into silos

He gives the quick recommender-systems primer: traditional stacks shrink a huge catalog through candidate generation, then rankers, sometimes multiple ones, until you get a final list. The problem is organizational as much as technical — different surfaces like home, playlists, search, podcasts, and ads often end up with separate teams and separate models, which means uneven quality and fragmented features.
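The two-stage shape described above can be sketched in a few lines. This is a minimal illustration, not Spotify's pipeline: the toy catalog, the two-dimensional vectors, the dot-product retrieval, and the `freshness` bonus used by the ranker are all assumptions made up for the example.

```python
from typing import Callable

Vector = list[float]

def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def generate_candidates(user: Vector, catalog: dict[str, Vector], k: int) -> list[str]:
    """Stage 1: cheap retrieval that keeps the k items with the highest dot product."""
    by_score = sorted(catalog, key=lambda item: dot(user, catalog[item]), reverse=True)
    return by_score[:k]

def rank(candidates: list[str], score: Callable[[str], float], n: int) -> list[str]:
    """Stage 2: a costlier scorer reorders the short list and keeps the top n."""
    return sorted(candidates, key=score, reverse=True)[:n]

# Toy catalog and user vector (hypothetical values).
catalog = {
    "track_a": [0.9, 0.1],
    "track_b": [0.8, 0.3],
    "episode_c": [0.1, 0.9],
    "track_d": [-0.5, 0.2],
}
user = [1.0, 0.2]

# Hypothetical ranker-only feature: a freshness bonus added to the retrieval score.
freshness = {"track_a": 0.0, "track_b": 0.3, "episode_c": 0.1, "track_d": 0.0}

candidates = generate_candidates(user, catalog, k=3)
final = rank(candidates, lambda item: dot(user, catalog[item]) + freshness[item], n=2)
print(candidates, final)
```

Each surface that owns its own `generate_candidates` and `rank` ends up with its own features and quality bar, which is the fragmentation the talk describes.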

User embeddings: compressing your listening life into one vector

Verma’s team, the user representations group in Spotify’s AI Foundation org, builds the embeddings that tell Spotify who you are. He describes older approaches like autoencoders that compress a user’s features into a compact vector and reconstruct them, but says the company is moving toward sequential transformer-based modeling that treats user interactions as context, much closer to how LLMs work.
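To make the autoencoder idea concrete, here is a minimal sketch of the compress-then-reconstruct step. The weights are set by hand rather than learned, and the four-feature user vectors are invented for the example; a real autoencoder would learn both matrices by minimizing the reconstruction error.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Hand-set weights (an assumption; a trained model would learn these).
ENCODER = [[0.5, 0.5, 0.0, 0.0],   # averages features 0 and 1
           [0.0, 0.0, 0.5, 0.5]]   # averages features 2 and 3
DECODER = [[1.0, 0.0],
           [1.0, 0.0],
           [0.0, 1.0],
           [0.0, 1.0]]

def encode(features):
    """Compress 4 user features into a 2-dimensional embedding."""
    return matvec(ENCODER, features)

def decode(embedding):
    """Reconstruct the 4 features from the embedding."""
    return matvec(DECODER, embedding)

def reconstruction_error(features):
    """Squared error between the input and its reconstruction."""
    rebuilt = decode(encode(features))
    return sum((a - b) ** 2 for a, b in zip(features, rebuilt))

user_a = [3.0, 3.0, 1.0, 1.0]  # fits the structure the encoder assumes
user_b = [4.0, 2.0, 0.0, 2.0]  # same embedding as user_a, but detail is lost

print(encode(user_a), reconstruction_error(user_a))
print(encode(user_b), reconstruction_error(user_b))
```

Both users map to the same embedding, but only the first is reconstructed exactly. The sketch also compresses a fixed feature vector and ignores the order of interactions, which is the information the sequential, transformer-based approach is meant to use.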

A map where users, tracks, and podcast episodes all live together

One of the coolest moments in the talk is a visualization of Spotify’s newer model: tracks in blue, podcast episodes in pink, users in green, all embedded in the same space. Verma points out his own embedding landing near the Big Technology podcast, making the point viscerally: once the model sees enough user context, it can place people and content on the same “hypersphere” and reason across modalities.
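A shared space means one similarity function works across every entity type. The sketch below assumes cosine similarity on unit-length vectors as the notion of "near"; the talk does not specify the metric, and the three-dimensional embeddings and entity names are invented for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length so it sits on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity: the dot product of the two unit-length vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

# Toy embeddings keyed by (entity type, name); all values are made up.
space = {
    ("user", "listener_1"): [0.2, 0.9, 0.1],
    ("track", "pop_song"): [0.9, 0.1, 0.0],
    ("episode", "tech_podcast"): [0.1, 0.8, 0.3],
    ("episode", "sports_podcast"): [0.0, 0.1, 0.9],
}

def neighbors(query, space):
    """Every other entity, of any type, sorted from most to least similar."""
    others = [key for key in space if key != query]
    return sorted(others, key=lambda key: cosine(space[query], space[key]), reverse=True)

ranked = neighbors(("user", "listener_1"), space)
print(ranked)
```

Because users, tracks, and episodes share one space, the same `neighbors` call can answer "which episode is closest to this user" and "which track is closest to this episode" without a separate model per content type.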

Teaching LLMs Spotify’s catalog with semantic IDs

Once you understand users, the next challenge is teaching an LLM the catalog itself. Spotify combines its own item embeddings with open-weight models like Llama and Qwen, then uses semantic IDs — compressed tokenized versions of item vectors — so the model can generate the “next item” autoregressively, not as plain text but as the next song or episode.

Why Ariana Grande and Bruno Mars start with the same tokens

Verma uses a concrete example: Spotify might represent artists with six semantic-ID tokens, and Ariana Grande and Bruno Mars share the first two because both live in a broad pop neighborhood. Later tokens diverge to capture finer-grained differences, creating a hierarchical structure that lets the model learn both broad similarity and niche distinctions.
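One common way to get this coarse-to-fine token structure is residual quantization, where each level encodes what the previous levels left over. The talk does not say which quantizer Spotify uses, so treat this as a sketch under that assumption. The codebooks are hand-made, there are two levels instead of six, and the artist vectors are invented.

```python
def nearest_code(vector, codebook):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    def distance(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda i: distance(codebook[i]))

def semantic_id(vector, codebooks):
    """Residual quantization: one token per level, coarse first, finer later."""
    residual = list(vector)
    tokens = []
    for codebook in codebooks:
        index = nearest_code(residual, codebook)
        tokens.append(index)
        # Subtract the chosen code so the next level only sees what is left.
        residual = [r - c for r, c in zip(residual, codebook[index])]
    return tokens

# Hand-made codebooks (an assumption; real ones would be learned from item embeddings).
codebooks = [
    [[1.0, 0.0], [0.0, 1.0]],                             # level 0: broad neighborhood
    [[0.2, 0.0], [-0.2, 0.0], [0.0, 0.2], [0.0, -0.2]],   # level 1: fine adjustment
]

artist_a = [1.2, 0.05]   # two artists in the same broad neighborhood
artist_b = [0.8, -0.05]
artist_c = [0.1, 1.15]   # an artist in a different neighborhood

for name, vec in [("artist_a", artist_a), ("artist_b", artist_b), ("artist_c", artist_c)]:
    print(name, semantic_id(vec, codebooks))
```

Here `artist_a` and `artist_b` share their first token and differ on the second, mirroring the shared-prefix behavior in the Ariana Grande and Bruno Mars example, while `artist_c` diverges from the first token onward.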

Taste Profile and the final personalization trick: a user as a soft token

The last piece is user control. Taste Profile exposes a text summary of what Spotify thinks you like, then lets you edit it — maybe ask for more Justin Bieber, or tell Spotify to stop leaning on a certain podcast — and feed that signal back into the system.

Projecting the user into the LLM itself

Because you can’t train directly on every one of Spotify’s 750 million-plus users, Spotify injects personalization by projecting a user embedding into the LLM’s space as a soft token. Verma says this is already showing positive internal results, and if you’re getting next-episode recommendations in Spotify podcasts today, something like this is already in production.
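The soft-token idea can be sketched as a projection followed by a prepend. The talk says only that the user embedding is projected into the LLM's space as a soft token. The single linear layer, the random untrained weights, the toy dimensions, and placing the token at the front of the sequence are all assumptions for illustration.

```python
import random

USER_DIM = 3   # size of the user embedding (toy value)
MODEL_DIM = 4  # size of the LLM's token embeddings (toy value)

def project(user_embedding, weights, bias):
    """Linear map from user-embedding space into the LLM's token-embedding space."""
    return [
        sum(w * x for w, x in zip(row, user_embedding)) + b
        for row, b in zip(weights, bias)
    ]

def prepend_soft_token(user_embedding, token_embeddings, weights, bias):
    """Return the input sequence with the projected user as the first 'token'."""
    soft_token = project(user_embedding, weights, bias)
    return [soft_token] + token_embeddings

# Untrained placeholder weights; in practice the projection would be learned.
rng = random.Random(0)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(USER_DIM)] for _ in range(MODEL_DIM)]
bias = [0.0] * MODEL_DIM

user_embedding = [0.2, 0.9, 0.1]
# Stand-in for the embeddings of 5 ordinary prompt tokens.
prompt_embeddings = [[0.0] * MODEL_DIM for _ in range(5)]

model_input = prepend_soft_token(user_embedding, prompt_embeddings, weights, bias)
print(len(model_input), len(model_input[0]))
```

Under this setup only the small projection depends on the user vector, so one set of model weights can serve every user while the soft token carries the per-user signal.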
