Under 5 minutes to a deployed LLM endpoint — Audry Hsu, RunPod
TL;DR
RunPod's pitch is simple: bring your code or a Hugging Face model, and RunPod handles the GPUs, containers, and deployment plumbing so developers can focus on building instead of infrastructure.
The company started with basement GPUs: founders Zenon and Pardeep turned failed crypto-mining rigs into the first version of RunPod in 2022, posted on Reddit offering free GPU access for feedback, and have been revenue-generating since.
The scale has grown well beyond the basement: Audrey says RunPod now serves more than 500,000 developers across 30-plus data centers and has reached $120 million in annual recurring revenue.
Serverless is the main product for inference: teams can set max workers, spending caps, and always-on workers, paying only while requests are actually being processed instead of keeping containers running all the time.
The live demo shows the tradeoff clearly: deployment from a Hub listing was fast, but the first request sat in queue for about 41 seconds because workers were initializing and downloading the model, while actual execution took only about 1.5 seconds.
The Hub is the shortcut: pre-vetted AI repos with preconfigured Dockerfiles and defaults let users fork, tweak environment variables, and deploy popular open-source models like vLLM-backed LLM endpoints with just a few clicks.
The Breakdown
RunPod claims you can go from zero to a deployed LLM endpoint in under five minutes, and Audrey Hsu demonstrates it live by spinning up a vLLM-based serverless endpoint from the Hub. The first response arrives after a cold start of about 41 seconds, nearly all of it spent waiting for workers to initialize and download the model; execution itself takes about 1.5 seconds. Along the way, she positions RunPod as the abstraction layer for GPU chaos: 500,000 developers, 30-plus data centers, and $120 million ARR built from a couple of failed crypto-mining rigs in a basement.
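The cold-start tradeoff from the demo can be made concrete with some simple arithmetic. The sketch below is illustrative only: `RequestTiming` and `average_latency` are hypothetical helpers, not part of any RunPod SDK. The only inputs taken from the talk are the roughly 41 seconds of queue time and 1.5 seconds of execution. The averaging step assumes, as a simplification, that only the first request pays the cold start and every later request hits a warm worker.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Timing for one request (hypothetical helper, not a RunPod API type)."""
    queue_seconds: float      # time spent waiting for a worker to become ready
    execution_seconds: float  # time the worker spent processing the request

    @property
    def total_seconds(self) -> float:
        return self.queue_seconds + self.execution_seconds

    @property
    def queue_share(self) -> float:
        """Fraction of end-to-end latency spent waiting in the queue."""
        return self.queue_seconds / self.total_seconds


def average_latency(cold: RequestTiming, n_requests: int) -> float:
    """Mean latency over n requests, ASSUMING only the first one is cold.

    This is a simplification for illustration: real traffic may trigger
    several cold starts as workers scale up and down.
    """
    if n_requests < 1:
        raise ValueError("n_requests must be at least 1")
    total = cold.queue_seconds + n_requests * cold.execution_seconds
    return total / n_requests


# Figures reported in the live demo: ~41 s in queue, ~1.5 s of execution.
demo = RequestTiming(queue_seconds=41.0, execution_seconds=1.5)

print(f"first request total: {demo.total_seconds:.1f} s")
print(f"share spent queued:  {demo.queue_share:.1%}")
print(f"avg over 100 calls:  {average_latency(demo, 100):.2f} s")
```

Under these assumptions the cold start dominates a single request but is amortized quickly across sustained traffic, which is the case for keeping an always-on worker when low latency on the first request matters.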