Message queues, pub/sub, delivery guarantees, backpressure, dead letter queues, and event streaming.
A user places an order on your e-commerce site. Your order service needs to: (1) charge the payment, (2) update inventory, (3) send a confirmation email, (4) notify the warehouse, (5) update the analytics dashboard. In a synchronous architecture, the order service calls each of these services in sequence. The user waits for all five to complete before seeing "Order Confirmed."
This is fragile. If the email service is slow (3 seconds), the user waits 3 extra seconds. If the analytics service is down, the entire order fails — even though analytics is not critical to order processing. If the warehouse notification times out, the payment has already been charged but the warehouse does not know about the order. You have a partial failure with no easy recovery.
The fundamental issue: synchronous communication couples services in time. The sender must wait for the receiver. If the receiver is slow or down, the sender is affected. You need a way to decouple services so they do not need to be available at the same time.
Watch an order flow through 5 services. Synchronous: the user waits for each. Asynchronous: the order service publishes a message and returns immediately.
With a message queue, the order service publishes an "order created" event and immediately returns success to the user. The payment, inventory, email, warehouse, and analytics services each consume this event at their own pace. If the email service is slow, it processes messages at its own speed — the user does not wait. If analytics is down, messages queue up and are processed when it recovers. The services are decoupled in time.
The simplest messaging pattern is point-to-point (also called a work queue or task queue). One or more producers put messages into a queue. One or more consumers take messages out. Each message is delivered to exactly one consumer. Once consumed, the message is removed from the queue.
Think of a queue at a deli counter. Customers (producers) take a number. When a worker (consumer) is free, they call the next number. Each number is served by exactly one worker. If there are more workers, numbers get served faster. If workers are busy, numbers queue up.
When you have more messages than one consumer can handle, add more consumers. They compete for messages — each message goes to one consumer. This is horizontal scaling for message processing.
A consumer pulls a message and starts processing. What if it crashes midway? The message is lost. To prevent this, queues use acknowledgments (ACKs). The flow:
1. Consumer pulls message. Queue marks it as "in flight" (invisible to other consumers). 2. Consumer processes the message. 3. Consumer sends ACK to the queue. Queue permanently removes the message. 4. If no ACK within the visibility timeout, the queue assumes the consumer died and makes the message visible again for another consumer.
A producer sends messages. 3 consumers compete to process them. Watch the queue grow and shrink as consumers process at different speeds.
| System | Model | Notable feature |
|---|---|---|
| RabbitMQ | Point-to-point + pub/sub | Flexible routing via exchanges, AMQP protocol |
| Amazon SQS | Point-to-point | Managed, infinite scale, at-least-once delivery |
| Redis Streams | Log-based queue | In-memory, sub-ms latency, consumer groups |
| Celery | Task queue (Python) | Wraps RabbitMQ/Redis, adds task scheduling |
Point-to-point delivers each message to one consumer. But what if multiple systems need to react to the same event? When an order is created, the payment service, inventory service, email service, and analytics service all need to know. Sending separate messages to 4 queues is wasteful and creates tight coupling — the producer must know about every consumer.
Publish/Subscribe (pub/sub) solves this. The producer publishes a message to a topic (or exchange). Consumers subscribe to the topic. Every subscriber gets a copy of every message. The producer does not know or care who the subscribers are.
This is called fan-out: one message fans out to N subscribers. Each subscriber processes the message independently. If the email subscriber is slow, it does not affect the payment subscriber. If analytics is down, the other subscribers still receive and process messages.
Pub/sub gives every subscriber a copy. But what if the email service has 3 instances for horizontal scaling? You do not want all 3 instances to send the same confirmation email. You want ONE of them to process each message.
The solution: consumer groups. Within a group, messages are distributed (point-to-point style). Across groups, messages are broadcast (pub/sub style). Each consumer group gets one copy of each message. Within the group, the message goes to one member.
A producer publishes to a topic. 3 consumer groups subscribe. Each group gets every message; within each group, messages are distributed to members.
When a producer sends a message and a consumer processes it, three things can go wrong: the message might not arrive (lost), it might arrive more than once (duplicated), or it might arrive but out of order. Delivery guarantees define what the messaging system promises about these failure modes.
The message is sent once. If it is lost (network failure, broker crash), it is not retried. Each message is delivered zero or one times. This is the simplest and fastest: fire and forget.
Use case: metrics, logs, non-critical notifications. If you lose a single CPU metric sample, the dashboard still works. The system is simpler and faster because there are no retries or deduplication.
The message is retried until the broker confirms receipt. If the confirmation is lost (network blip), the producer retries — and the broker might receive the message twice. Each message is delivered one or more times.
Use case: most production systems. Order processing, payment initiation, inventory updates. Duplicates are handled by making the consumer idempotent — processing the same message twice has the same effect as processing it once. For example, "set inventory to 10" is idempotent; "decrement inventory by 1" is not.
Exactly-once delivery means each message is processed exactly one time. No losses, no duplicates. This is what everyone wants, but it is extremely hard to achieve in a distributed system — some argue it is impossible in the general case.
What systems like Kafka actually provide is exactly-once semantics (EOS): a combination of idempotent producers (deduplicate at the broker), transactional writes (atomic batch commits), and consumer offset management. The message might physically be delivered twice, but the side effects are applied exactly once.
Watch messages flow from producer to consumer under each guarantee. Click "Inject Network Error" to see how each handles failures.
| Guarantee | Duplicates? | Loss? | Consumer requirement | Performance |
|---|---|---|---|---|
| At-most-once | No | Possible | None | Fastest |
| At-least-once | Possible | No | Idempotent processing | Fast |
| Exactly-once | No | No | Transactional processing | Slowest |
Messages arrive in order: "create order", "charge payment", "ship order". If they are processed out of order — "ship order" before "charge payment" — you ship before being paid. Order matters.
A single queue with a single consumer guarantees FIFO (First In, First Out) ordering. Messages are processed in exactly the order they were sent. But a single consumer is a throughput bottleneck. Multiple consumers process messages in parallel, and parallel processing breaks ordering.
The solution: partition the topic by key. All messages with the same key go to the same partition. Each partition has one consumer. Messages within a partition are ordered. Messages across partitions are not.
For our order system: use order_id as the partition key. All events for order #42 (created, paid, shipped) go to the same partition and are processed by the same consumer in order. Events for order #43 go to a different partition and are processed independently.
Messages for 3 orders flow into partitions by key. Each partition maintains order for its key. Watch how events for the same order are always in sequence.
A producer sends 10,000 messages per second. The consumer can process 1,000. The queue grows by 9,000 messages per second. In 10 minutes, there are 5.4 million messages in the queue. In an hour, 32.4 million. Eventually, the broker runs out of disk. New messages are rejected. The producer fails. The entire system is down.
This is the backlog problem, and its solution is backpressure — a mechanism that slows down the producer when the consumer cannot keep up. Instead of letting the queue grow unboundedly, the system signals "slow down" upstream.
A producer sends faster than the consumer can process. Watch the queue grow. Enable backpressure to see the producer slow down when the queue fills.
A consumer processes a message and fails. The message is retried. It fails again. And again. And again. This is a poison pill — a message that can never be successfully processed. Maybe the payload is malformed. Maybe it references a deleted record. Maybe there is a bug in the consumer that always crashes for this specific input.
Without safeguards, the poison pill blocks the queue forever. The consumer keeps retrying, the message keeps failing, and all messages behind it wait. This is the head-of-line blocking problem. One bad message takes down the entire pipeline.
A Dead Letter Queue (DLQ) is the solution. After N failed processing attempts, the message is moved to a separate queue — the DLQ. The original queue continues processing. Engineers can inspect the DLQ, fix the bug, and replay the messages.
Messages flow through the queue. One is a poison pill (red). Watch it fail repeatedly, then get moved to the DLQ while normal messages continue processing.
| Operation | When to use |
|---|---|
| Inspect | Look at DLQ messages to diagnose failures. Check the error, payload, metadata. |
| Replay | After fixing the bug, send DLQ messages back to the main queue for reprocessing. |
| Purge | If the messages are truly unrecoverable (corrupt data, obsolete events), delete them. |
| Alert | Set up monitoring: if the DLQ has more than N messages, alert the on-call engineer. |
Apache Kafka is the most widely deployed distributed messaging system. It is not a traditional message queue — it is a distributed commit log. Understanding its architecture explains why it is so fast and why it works the way it does.
Kafka stores messages in an append-only, ordered, immutable log. Each message is assigned a sequential offset (like a line number). Messages are never deleted from the log — they expire after a configurable retention period (e.g., 7 days). Consumers read the log by tracking their current offset.
A topic is a category of messages (e.g., "order-events"). A topic is split into partitions for parallelism. Each partition is a separate log, stored on a separate broker (server). Partitions are replicated across brokers for fault tolerance.
| Optimization | How it helps |
|---|---|
| Sequential I/O | Append-only writes to disk. Sequential disk writes are faster than random RAM access. |
| Zero-copy | Data goes directly from disk to network socket via sendfile(). No user-space copying. |
| Batching | Producer batches messages. Fewer network requests, better compression. |
| Page cache | Kafka relies on the OS page cache. Recently written data is read from RAM, not disk. |
| Partition parallelism | Each partition is an independent I/O stream. More partitions = more parallelism. |
A topic with 3 partitions across 3 brokers. Watch messages arrive and get distributed by partition key. Consumers track their offsets independently.
Time to put everything together. Below is a full message queue simulator with a producer, broker, and consumers. Control the production rate, inject failures, watch delivery guarantees in action, and observe backpressure and dead letter queues.
Producer sends messages to a broker. 3 consumers process them. Inject failures and poison pills to see retries and DLQ behavior.
Experiment 1: Normal operation. Set rate to 5/s and press Play. Watch messages flow from producer → broker → consumers. All healthy.
Experiment 2: Kill a consumer. One consumer dies. Its messages are redistributed to the remaining two. Throughput drops but the queue stays alive.
Experiment 3: Inject a poison pill. Watch the red message fail repeatedly, then get moved to the DLQ (shown separately). Normal messages continue processing.
Experiment 4: Overload. Set rate to 15/s with only 1 consumer alive. Watch the queue depth grow. This is what backpressure should prevent.
Messaging is the nervous system of distributed architectures. Here is how it connects to everything else.
| Concept | Key takeaway |
|---|---|
| Point-to-point | One message → one consumer. Work distribution. Competing consumers for throughput. |
| Pub/sub | One message → all subscribers. Event broadcasting. Consumer groups for per-group distribution. |
| Delivery guarantees | At-most-once (lossy, fast), at-least-once (safe, needs idempotency), exactly-once (hard). |
| Ordering | Total ordering limits throughput. Partition by key for per-entity ordering with parallelism. |
| Backpressure | Slow the producer or scale consumers. Unbounded queues eventually crash. |
| Dead letter queues | Quarantine poison pills. Protect the pipeline. Alert, inspect, replay. |
| Kafka | Distributed commit log. Append-only, offset-based, high throughput via sequential I/O. |
| Topic | Connection |
|---|---|
| Load Balancing | Consumer groups are a form of load balancing for message processing |
| Data Storage | Event-driven cache invalidation: publish change events, subscribers invalidate caches |
| Service Architecture | Messaging replaces synchronous HTTP calls between services for async workflows |
| Consensus | Kafka uses ZooKeeper/KRaft for leader election and partition assignment |