dotnet · 56m

On .NET Live: Shaving the outbox pattern yak

TL;DR

  • The outbox pattern exists because databases and brokers fail independently — João Antunes walks through the classic bug: a .NET API writes to PostgreSQL successfully, then fails publishing to RabbitMQ or Azure Service Bus, leaving the system inconsistent.

  • 'The broker is never down' is fantasy, and the cloud docs say so: he cites Azure SQL transient error guidance and SLAs from major clouds and Hetzner to show that even an SLA as high as 99.9995% uptime still leaves room for outages and failovers, so retry logic is part of the job.

  • His chaos demo made the failure painfully visible — with a 'container disruptor' randomly killing Docker containers, 2,000 requests produced only 1,009 successful client responses, while the producer stored 1,509 records and the consumer only 1,009, meaning 500 messages were lost.

  • The transactional outbox fixes consistency by moving the message into the same DB transaction as the business write — instead of publishing directly, the service writes both the entity and an outbox record to the same database, then a background publisher drains that table and sends messages later.

  • Outbox improves consistency but does not make delivery perfect: João is explicit that it gives at-least-once delivery, not exactly-once, so consumers still need idempotency or an inbox-style dedupe strategy because retries can re-send the same message.

  • Outbox Kit is intentionally not the whole stack — his open-source project supports MySQL, PostgreSQL, and MongoDB, but he repeatedly says most teams should probably use mature .NET messaging frameworks like MassTransit, NServiceBus, Wolverine, or Brighter unless they need something highly custom.

The Breakdown

Why the Outbox Pattern Exists at All

João Antunes opens by framing the real problem, not the buzzword: a producer service has to write to a database and notify downstream consumers via a broker, but those are two separate systems with two separate failure modes. If the DB write succeeds and the message publish fails, the data becomes a fact in one place and invisible everywhere else.
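The failure described above can be sketched in a few lines. This is a minimal illustration in Python rather than the C# used in the episode, and every name in it (`FakeBroker`, `handle_request`, the order ids) is hypothetical: an in-memory list stands in for the database and a fake broker fails for chosen ids.

```python
class BrokerDown(Exception):
    """Raised by the fake broker to simulate an outage."""


class FakeBroker:
    def __init__(self, fail_on):
        self.fail_on = set(fail_on)  # order ids for which publishing fails
        self.published = []

    def publish(self, order_id):
        if order_id in self.fail_on:
            raise BrokerDown(f"cannot publish order {order_id}")
        self.published.append(order_id)


def handle_request(order_id, database, broker):
    """Naive dual write: persist first, then publish, with no shared transaction."""
    database.append(order_id)  # step 1: the database write succeeds
    broker.publish(order_id)   # step 2: may fail after step 1 is already durable


database = []  # stands in for the producer's table
broker = FakeBroker(fail_on={2, 4})
failed_requests = []

for order_id in range(1, 6):
    try:
        handle_request(order_id, database, broker)
    except BrokerDown:
        failed_requests.append(order_id)  # the client sees an error

# Rows that exist in the database but were never announced to consumers.
lost = [order_id for order_id in database if order_id not in broker.published]
print("stored:", database)
print("published:", broker.published)
print("stored but never announced:", lost)
```

The client receives an error for orders 2 and 4, yet both rows are already stored, so the database and the downstream consumers now disagree.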

'The Broker Is Never Down' Meets Reality

He tells a story from a previous company where an architect said messaging outages weren't worth worrying about because 'the broker is never down,' and João's reaction is essentially: really? He backs up his skepticism with cloud-provider receipts (Azure SQL transient error docs, SLA math, and even Hetzner's 99.9% guarantee) to hammer home that outages, failovers, maintenance windows, and retries are not edge cases.
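To make the SLA math concrete, here is a small worked calculation for the 99.9% figure mentioned above. The helper name and the choice of a 365-day year and a 30-day month are assumptions for illustration; real SLAs define their own measurement windows and exclusions.

```python
def allowed_downtime_minutes(availability_percent, period_hours):
    """Minutes of downtime an availability target permits over a period."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * period_hours * 60


HOURS_PER_YEAR = 365 * 24
HOURS_PER_30_DAYS = 30 * 24

per_year = allowed_downtime_minutes(99.9, HOURS_PER_YEAR)
per_month = allowed_downtime_minutes(99.9, HOURS_PER_30_DAYS)

print(f"99.9% allows about {per_year / 60:.2f} hours of downtime per year")
print(f"99.9% allows about {per_month:.1f} minutes of downtime per 30 days")
```

Under these assumptions a 99.9% target permits roughly 8.76 hours of downtime per year, or about 43 minutes per 30 days, which is plenty of time for a publish call to fail.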

A Tour Through the Thousand Ways Systems Break

This is the fun overkill section: João zooms from a .NET API process to a container, a VM, a physical machine, network cards, switches, and software-defined networking, joking about 'VM inception' along the way. The point lands: with that many layers, something will eventually fail — code bugs, OOMs, graceful shutdowns during CI/CD, spot instance eviction, DNS, hardware, or plain bad luck.

The Chaos Demo: 500 Messages Gone

To make the problem concrete, he built a tiny system: a Spectre.Console CLI fires requests at a producer API, which writes to PostgreSQL and publishes to RabbitMQ, while a consumer writes its own DB record for each message it receives. Then he unleashes a 'container disruptor' that randomly kills services, like a tiny chaos monkey yanking plugs, and after 2,000 requests the numbers are ugly: 1,509 producer records vs. 1,009 consumer records, which means 500 lost messages.

Publishing First Is Also Wrong, Just in a Worse Way

He also tests the inverse idea — send the event first, then persist — and says it's still broken, only now you're announcing a fact that never actually happened. Katie and Myra help translate it into a purchase analogy: telling a customer they got the item before the purchase is secured is worse than securing it and forgetting to notify them, though both states are bad.

The Transactional Outbox, Explained Like a Mailbox

The fix is simple in concept: write the business entity and the outgoing message into the same database transaction, usually into an outbox table. João leans into the metaphor — it's literally a place where you drop letters for later pickup — and explains that a separate background process polls that table, publishes messages to the broker, and only then marks them processed.
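The write side of the pattern can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not the schema from the episode: the `orders` and `outbox` tables, their columns, and `place_order` are all hypothetical, and SQLite stands in for PostgreSQL.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        payload TEXT NOT NULL,
        processed INTEGER NOT NULL DEFAULT 0
    );
""")


def place_order(conn, order_id, item, payload):
    """Write the entity and its outbox message in one transaction."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO orders (id, item) VALUES (?, ?)", (order_id, item))
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))


# Happy path: both rows are committed together.
place_order(conn, 1, "keyboard", json.dumps({"type": "OrderPlaced", "order_id": 1}))

# Failure path: a NULL payload violates NOT NULL on the outbox insert,
# so the order insert that preceded it is rolled back as well.
try:
    place_order(conn, 2, "mouse", None)
except sqlite3.IntegrityError:
    pass

orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
pending = conn.execute("SELECT COUNT(*) FROM outbox WHERE processed = 0").fetchone()[0]
print("orders:", orders, "pending outbox messages:", pending)
```

Because both inserts share one transaction, there is never an order without its message or a message without its order; the broker is not involved in the request path at all.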

What the Code Actually Changes

In the code path with Outbox Kit, the API no longer publishes directly to RabbitMQ; it adds both the entity and the outbox message to Entity Framework's change tracker and calls SaveChanges once. EF wraps that single SaveChanges call in a transaction by default, so the two rows commit or roll back together, while Outbox Kit handles polling PostgreSQL and handing batches to a custom producer that publishes and marks successful messages complete.
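The polling side can be sketched in the same spirit. This is not Outbox Kit's API: `drain_once`, `flaky_publish`, and the table layout are hypothetical, SQLite stands in for PostgreSQL, and marking rows one at a time is a simplification of the batch handling described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE outbox ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " payload TEXT NOT NULL,"
    " processed INTEGER NOT NULL DEFAULT 0)"
)
with conn:
    conn.executemany(
        "INSERT INTO outbox (payload) VALUES (?)",
        [(f"event-{n}",) for n in range(1, 6)],
    )


def drain_once(conn, publish, batch_size=10):
    """Publish one batch of pending rows in id order; return how many succeeded."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE processed = 0 ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()
    sent = 0
    for row_id, payload in rows:
        try:
            publish(payload)
        except ConnectionError:
            break  # stop here so later rows are not sent out of order
        with conn:  # mark the row only after the broker accepted it
            conn.execute("UPDATE outbox SET processed = 1 WHERE id = ?", (row_id,))
        sent += 1
    return sent


delivered = []
attempts = {"count": 0}


def flaky_publish(payload):
    """Hypothetical broker client that fails on its third call."""
    attempts["count"] += 1
    if attempts["count"] == 3:
        raise ConnectionError("broker unavailable")
    delivered.append(payload)


first = drain_once(conn, flaky_publish)   # sends two rows, then hits the outage
second = drain_once(conn, flaky_publish)  # the next poll picks up the rest
remaining = conn.execute("SELECT COUNT(*) FROM outbox WHERE processed = 0").fetchone()[0]
print(first, second, remaining, delivered)
```

A broker outage only delays delivery here: unsent rows stay pending and the next poll retries them. If the process crashed after publishing but before the update, that row would be sent again on the next poll, which is where at-least-once delivery comes from.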

What Outbox Kit Does, What It Doesn't, and When Not to Use It

João is refreshingly blunt: Outbox Kit is a toolkit, not a full messaging framework, and most teams should look first at MassTransit, NServiceBus, Wolverine, or Brighter. He built it because he had dozens of existing microservices and needed a minimally invasive way to add reliability. It currently supports MySQL, PostgreSQL, and MongoDB, and it preserves ordering by using a single publisher instance. Consumers still have to handle duplicates, because the outbox guarantees at-least-once delivery, not exactly-once.
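Duplicate handling on the consumer side can be as simple as remembering which message ids have already been applied. The sketch below is an assumption-laden illustration, not code from the episode: `IdempotentConsumer` and the message shape are hypothetical, and an in-memory set stands in for an inbox table with a unique key.

```python
class IdempotentConsumer:
    """Applies each message at most once by remembering processed message ids."""

    def __init__(self):
        self.seen_ids = set()  # stands in for an inbox table with a unique key
        self.applied = []      # stands in for the consumer's own side effects

    def handle(self, message):
        """Return True if the message was applied, False if it was a duplicate."""
        message_id = message["id"]
        if message_id in self.seen_ids:
            return False  # duplicate from a retry: skip the side effect
        self.applied.append(message["body"])
        self.seen_ids.add(message_id)
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"id": "m1", "body": "order 1 placed"},
    {"id": "m2", "body": "order 2 placed"},
    {"id": "m1", "body": "order 1 placed"},  # redelivered after a publisher retry
]
results = [consumer.handle(message) for message in deliveries]
print(results, consumer.applied)
```

In a real service the dedupe record and the side effect would typically be written in the same database transaction, so that a crash between the two cannot reintroduce the inconsistency the outbox was meant to remove.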
