ACID, isolation levels, two-phase commit, sagas, and the outbox pattern — when you must coordinate.
You're building a payment system. Alice wants to send $100 to Bob. The operation requires two steps: debit Alice's account by $100 and credit Bob's account by $100. On a single database, this is straightforward — wrap both operations in a transaction, and the database guarantees they either both happen or neither does.
Now scale up. Alice's account lives in a database in US-East. Bob's account lives in a database in EU-West. There is no single database that owns both accounts. You need both databases to agree on the outcome of this transfer: either both commit (Alice debited AND Bob credited) or both abort (nothing changes). If Alice is debited but Bob isn't credited, $100 vanishes from the economy. If Bob is credited but Alice isn't debited, $100 appears from nowhere.
This is the atomic commit problem: getting multiple independent systems to agree on a single yes/no decision. It sounds simple until you add the reality of distributed systems: networks drop messages, servers crash mid-operation, and processes stall unpredictably.
| Failure mode | What happens | Business impact |
|---|---|---|
| Partial write | Alice debited, Bob not credited (crash between ops) | Money disappears — customer files dispute |
| Dirty read | Another transaction reads Alice's debited balance before transfer completes, then transfer aborts | Decisions based on phantom data |
| Double spend | Two concurrent transfers both check Alice's balance ($200), both debit $150 | Account goes to -$100, bank loses money |
| Phantom | A report sums all accounts while a transfer is in progress — total is wrong | Audit failure, regulatory violation |
Watch a money transfer between two databases. Without a transaction, a crash mid-operation causes money to vanish. With a transaction, the crash triggers a rollback.
The word "transaction" gets thrown around loosely. Let's be precise. A database transaction provides four guarantees, collectively known as ACID. But these guarantees are not equally useful, and not all databases implement them equally.
Atomicity means "all or nothing." Either every operation in the transaction completes, or none of them do. If the system crashes midway through, it rolls back any partial changes. Think of it as an undo button: the transaction is the unit of "undo." You can't undo half a transaction.
Important clarification: atomicity is NOT about concurrency (that's Isolation). Atomicity is about fault tolerance — what happens when things crash partway through. The name is misleading; "abortability" would be more accurate.
Consistency means the transaction preserves application invariants. If your invariant is "account balances never go negative," then a transaction that would cause a negative balance must be aborted. But here's the critical insight: the database can't know your application's invariants. YOU are responsible for writing transactions that maintain consistency. The database just provides the tools (constraints, triggers, atomicity). In practice, "C" in ACID is the application's responsibility, not the database's.
Isolation means concurrent transactions don't interfere with each other. Each transaction behaves as if it were the only transaction running. The strongest form is serializability: the result of running transactions concurrently is the same as if they ran one at a time in some serial order.
Durability means once a transaction commits, the data survives crashes. Typically implemented via write-ahead logging (WAL): changes are written to a durable log before the commit acknowledgment. Even if the server crashes, it replays the log on recovery. In distributed systems, durability often means "replicated to N nodes."
Each quadrant shows one ACID property. Click to simulate a violation and see what breaks.
True serializable isolation is expensive — it often requires locking or aborting transactions that conflict. In practice, databases offer a menu of isolation levels, each trading correctness for performance. Understanding these is essential: most production databases default to a weaker level, and bugs from insufficient isolation are among the hardest to diagnose.
| Level | Dirty Reads | Non-repeatable Reads | Phantoms | Write Skew | Performance |
|---|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | Possible | Fastest |
| Read Committed | Prevented | Possible | Possible | Possible | Fast |
| Repeatable Read / Snapshot | Prevented | Prevented | Possible | Possible | Medium |
| Serializable | Prevented | Prevented | Prevented | Prevented | Slowest |
Let's demystify each level with concrete examples.
The most common default (PostgreSQL, Oracle, SQL Server). Two guarantees: (1) you only read data that has been committed (no dirty reads), (2) you only overwrite data that has been committed (no dirty writes). Implemented with row-level locks for writes and snapshot reads for reads.
Snapshot Isolation gives each transaction a consistent snapshot of the database at the moment it started. All reads within the transaction see that snapshot, even if other transactions commit changes in the meantime. This is how PostgreSQL's "REPEATABLE READ" and MySQL's "REPEATABLE READ" work (via Multi-Version Concurrency Control — MVCC).
Two concurrent transactions access the same account. Select an isolation level to see which anomalies can occur.
To understand isolation levels, you need to know the specific anomalies they prevent. Each "read phenomenon" is a concrete scenario where concurrent transactions produce results that couldn't happen in any serial execution.
A dirty read happens when transaction T1 reads data written by transaction T2 that hasn't committed yet. If T2 later aborts, T1 has read data that never officially existed.
A non-repeatable read happens when T1 reads the same row twice and gets different values because T2 committed a change between the two reads.
A phantom is like a non-repeatable read, but for sets of rows. T1 runs a query that returns a set of rows, T2 inserts or deletes a row that matches the query, and T1 re-runs the query — the set has changed.
Write skew is the subtlest and most dangerous anomaly. Two transactions each read the same data, make a decision, and write to different rows. Neither violates a constraint individually, but the combined result does.
Watch each read phenomenon play out step by step. Two transactions interleave their operations on a shared database.
How do you get two independent databases to agree on committing a transaction? You can't just send "COMMIT" to both — what if one succeeds and the other crashes before receiving the message? You need a protocol that ensures either ALL participants commit or ALL abort, even in the face of failures.
Two-Phase Commit (2PC) is the classic answer. It introduces a coordinator (also called the transaction manager) that orchestrates the commit across all participants (the databases).
Watch the 2PC protocol execute step by step. The coordinator orchestrates a commit across two database participants.
Two-Phase Commit has a fatal weakness: the coordinator is a single point of failure. If the coordinator crashes at exactly the wrong moment, participants are stuck in a state called in-doubt (or uncertain), holding locks indefinitely.
| Mitigation | How it helps | Limitation |
|---|---|---|
| Coordinator HA | Run coordinator on a replicated cluster (Raft/Paxos) | Still has failover delay; doesn't eliminate the problem |
| Timeout + presumed abort | If coordinator doesn't respond in T seconds, participants abort | Can cause inconsistency if coordinator actually committed |
| Three-Phase Commit (3PC) | Adds a "pre-commit" phase to reduce blocking window | Doesn't work under network partitions; rarely used in practice |
| Cooperative termination | In-doubt participants ask each other for the coordinator's decision | If ALL participants are in-doubt, nobody knows the answer |
Watch what happens when the coordinator crashes at different points in the 2PC protocol. The "danger zone" is after participants vote YES but before they receive the decision.
2PC is correct but blocking. What if you need distributed atomicity but can't afford to hold locks across services? The Saga pattern provides an alternative: instead of one distributed transaction, execute a sequence of local transactions, each in its own service. If any step fails, execute compensating transactions to undo the previous steps.
Think of it like booking a trip. You book a flight, then a hotel, then a rental car. If the car rental fails, you cancel the hotel, then cancel the flight. Each cancellation is a separate operation that undoes the corresponding booking.
| Property | 2PC | Saga |
|---|---|---|
| Atomicity | Full (all-or-nothing) | Eventual (via compensation) |
| Isolation | Yes (locks held during protocol) | No (intermediate states are visible) |
| Lock duration | Until commit/abort (could be seconds to hours) | Per-step only (milliseconds) |
| Blocking | Yes (coordinator crash blocks all) | No (each step is independent) |
| Complexity | Protocol is simple; failure handling is hard | Protocol is complex; each step needs a compensating action |
Choreography: Each service listens for events and decides its own next step. No central coordinator. Services communicate through events (e.g., Kafka topics).
Orchestration: A central saga orchestrator tells each service what to do and tracks the overall state. The orchestrator is like a state machine.
The hard part of sagas is writing correct compensating transactions. A compensation must undo the effect of a step, but it can't simply "rollback" — the original transaction already committed. Compensation is a new, forward-moving operation.
Compare the two saga styles. In choreography, services communicate through events. In orchestration, a central coordinator sends commands.
Sagas rely on services publishing events. But there's a subtle reliability problem: how do you atomically update the database AND publish an event? If you update the database first and then crash before publishing, the event is lost. If you publish first and then crash before updating, the event describes something that didn't happen.
This is the dual write problem: you need to write to two systems (database + message broker) atomically, but they don't share a transaction.
Instead of writing to both systems, write ONLY to the database — but include the event in an outbox table within the same database transaction. A separate process (the "relay" or "publisher") reads the outbox table and publishes events to the message broker.
Change Data Capture (CDC) eliminates polling by tailing the database's write-ahead log (WAL). Tools like Debezium, Maxwell, or native PostgreSQL logical replication capture every committed change and stream it to Kafka automatically. The outbox table still exists (for structure), but the relay is replaced by the CDC connector.
python # Outbox pattern: atomic write + deferred publish def create_order(order): with db.transaction(): db.execute("INSERT INTO orders ...", order) db.execute("INSERT INTO outbox (event_type, payload, published) VALUES (%s, %s, false)", 'OrderCreated', json.dumps(order)) # No Kafka publish here! The relay handles it. # Relay process (runs continuously) def relay(): while True: events = db.execute("SELECT * FROM outbox WHERE published = false ORDER BY id LIMIT 100") for e in events: kafka.publish(topic=e.event_type, value=e.payload, key=e.id) db.execute("UPDATE outbox SET published = true WHERE id = %s", e.id)
Watch the outbox pattern in action. A service writes to its database + outbox atomically. The relay then publishes events to Kafka. Crash it to see at-least-once delivery.
This is the showcase simulation. You are the saga orchestrator for an e-commerce order. The saga has four steps: create order, charge payment, reserve inventory, and schedule shipping. Each step can succeed or fail (you control which ones fail). When a step fails, the orchestrator runs compensating transactions for all previously completed steps in reverse order.
An order saga with 4 steps. Toggle which steps will fail, then run the saga. Watch the orchestrator execute steps, detect failure, and run compensations in reverse.
Observe the state machine transitions:
| State | On Success | On Failure |
|---|---|---|
| Create Order | → Charge Payment | → Saga Failed (no compensation needed) |
| Charge Payment | → Reserve Inventory | → Cancel Order (compensate step 0) |
| Reserve Inventory | → Schedule Shipping | → Refund Payment → Cancel Order |
| Schedule Shipping | → Saga Complete | → Release Inventory → Refund Payment → Cancel Order |
Distributed transactions are the price you pay for splitting data across multiple services. Every microservice architecture eventually faces this problem, and the solution is always a trade-off between consistency, availability, and complexity.
| If you need... | Use this | Trade-off |
|---|---|---|
| Strong atomicity + isolation across DBs | 2PC (XA transactions) | Blocking; coordinator is SPOF; high latency |
| Eventual consistency across services | Saga (orchestration) | No isolation; requires compensations; complex |
| Reliable event publishing | Outbox + CDC | At-least-once (idempotency required); added infrastructure |
| Simple read-your-writes on a single DB | Local transactions | Doesn't help with cross-service consistency |
| Serializable cross-shard transactions | Spanner-style (TrueTime) | Requires specialized hardware (atomic clocks); expensive |
A visual guide for choosing the right distributed transaction approach.
Related lessons:
"A distributed system is one where I can't get my work done because a computer I've never heard of has crashed." — Leslie Lamport