Saga Pattern

A sequence of local transactions that together implement a business transaction spanning multiple services, using compensating transactions to undo work when a step fails.

Problem

In a microservices architecture each service owns its own database (the Database per Service pattern). A business operation that spans multiple services, such as placing an order that must be checked against a customer’s credit limit, cannot use a single ACID database transaction. Two-phase commit (2PC) is technically possible but creates tight coupling, reduces availability, and is not supported by many NoSQL stores and modern cloud services.

Solution

A saga breaks the cross-service transaction into a chain of local transactions. Each step:

  1. Completes its work atomically within its own service database.
  2. Publishes a message or event to trigger the next step.

If a step fails, for example because it violates a business rule, the saga runs a series of compensating transactions that undo the changes committed by the preceding steps. There is no global rollback: each earlier step has already committed locally, so every step that may need to be undone must have a compensating action.
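The two-part step described above (commit local work, then publish) can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a prescribed implementation: the `orders` and `outbox` tables, the function names, and the idea of recording the outgoing event in the same local transaction are all assumptions of this sketch, and `publish` stands in for a real broker client.

```python
import json
import sqlite3


def create_pending_order(conn: sqlite3.Connection, order_id: str, amount: int) -> None:
    """One saga step: change local state and record the outgoing event atomically."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO orders (id, amount, status) VALUES (?, ?, 'PENDING')",
            (order_id, amount),
        )
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderCreated", json.dumps({"order_id": order_id, "amount": amount})),
        )


def drain_outbox(conn: sqlite3.Connection, publish) -> int:
    """Relay recorded events. Publishing before deleting gives at-least-once delivery."""
    rows = conn.execute(
        "SELECT seq, event_type, payload FROM outbox ORDER BY seq"
    ).fetchall()
    for seq, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("DELETE FROM outbox WHERE seq = ?", (seq,))
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount INTEGER, status TEXT)")
conn.execute(
    "CREATE TABLE outbox (seq INTEGER PRIMARY KEY AUTOINCREMENT, event_type TEXT, payload TEXT)"
)

published = []  # hypothetical stand-in for the message broker
create_pending_order(conn, "o-1", 120)
drain_outbox(conn, lambda kind, body: published.append((kind, body)))
```

Because the order row and the event row are written in one local transaction, the step cannot commit its work without also recording the message that triggers the next step.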

Key Transaction Concepts (Microsoft)

Concept | Meaning
Compensable transactions | Steps that can be reversed by a corresponding compensating action if a later step fails
Pivot transaction | The point of no return: once it succeeds, the saga must complete all remaining steps, and compensation no longer applies
Retryable transactions | Idempotent steps after the pivot; may be retried until they succeed to reach the final consistent state
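The three transaction types can be made concrete with a small runner. This is a sketch under stated assumptions: the `Step` class, the `kind` labels, and the bounded `max_retries` loop are hypothetical names chosen for illustration, not part of any framework.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    name: str
    action: Callable[[], None]
    kind: str  # "compensable", "pivot", or "retryable"
    compensate: Optional[Callable[[], None]] = None


def run_saga(steps: List[Step], max_retries: int = 5) -> str:
    done: List[Step] = []
    for step in steps:
        if step.kind == "retryable":
            # Past the pivot: retry until the step succeeds (bounded for this sketch).
            for _ in range(max_retries):
                try:
                    step.action()
                    break
                except Exception:
                    pass
            else:
                raise RuntimeError(f"{step.name} still failing after {max_retries} attempts")
            continue
        try:
            step.action()
        except Exception:
            # Failure at or before the pivot: undo completed steps, newest first.
            for prior in reversed(done):
                if prior.compensate:
                    prior.compensate()
            return "compensated"
        done.append(step)
    return "completed"


log: List[str] = []
attempts = {"n": 0}


def flaky_notify() -> None:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("transient failure")
    log.append("notified")


def declined() -> None:
    raise ValueError("card declined")


happy = run_saga([
    Step("reserve_stock", lambda: log.append("reserved"), "compensable",
         lambda: log.append("released")),
    Step("charge_card", lambda: log.append("charged"), "pivot"),
    Step("notify", flaky_notify, "retryable"),
])

log2: List[str] = []
failed = run_saga([
    Step("reserve_stock", lambda: log2.append("reserved"), "compensable",
         lambda: log2.append("released")),
    Step("charge_card", declined, "pivot"),
    Step("notify", lambda: log2.append("notified"), "retryable"),
])
```

In the first run the retryable step fails twice and is simply retried; in the second run the pivot fails, so only the earlier compensable step is undone and the retryable step never starts.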

Key Components

Choreography-Based Saga

Services communicate purely through domain events — no central coordinator.

  1. Order Service creates a pending order and emits OrderCreated.
  2. Customer Service handles the event, reserves credit, and emits CreditReserved or CreditLimitExceeded.
  3. Order Service handles the outcome event and approves or cancels the order.
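The three steps above can be sketched with an in-memory event bus. The `EventBus` class is a hypothetical stand-in for a real message broker, and the payload fields (`order_id`, `customer_id`, `total`) are assumptions made for this example.

```python
from collections import defaultdict, deque


class EventBus:
    """In-memory stand-in for a message broker (illustration only)."""

    def __init__(self):
        self.handlers = defaultdict(list)
        self.queue = deque()

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self):
        # Deliver queued events until no service has anything left to react to.
        while self.queue:
            event_type, payload = self.queue.popleft()
            for handler in self.handlers[event_type]:
                handler(payload)


class OrderService:
    def __init__(self, bus):
        self.bus = bus
        self.orders = {}
        bus.subscribe("CreditReserved", self.on_credit_reserved)
        bus.subscribe("CreditLimitExceeded", self.on_credit_limit_exceeded)

    def create_order(self, order_id, customer_id, total):
        self.orders[order_id] = "PENDING"
        self.bus.publish(
            "OrderCreated",
            {"order_id": order_id, "customer_id": customer_id, "total": total},
        )

    def on_credit_reserved(self, event):
        self.orders[event["order_id"]] = "APPROVED"

    def on_credit_limit_exceeded(self, event):
        self.orders[event["order_id"]] = "CANCELLED"


class CustomerService:
    def __init__(self, bus, credit_limits):
        self.bus = bus
        self.available = dict(credit_limits)
        bus.subscribe("OrderCreated", self.on_order_created)

    def on_order_created(self, event):
        customer_id, total = event["customer_id"], event["total"]
        if self.available.get(customer_id, 0) >= total:
            self.available[customer_id] -= total
            self.bus.publish("CreditReserved", {"order_id": event["order_id"]})
        else:
            self.bus.publish("CreditLimitExceeded", {"order_id": event["order_id"]})


bus = EventBus()
order_service = OrderService(bus)
customer_service = CustomerService(bus, {"c-1": 100})
order_service.create_order("o-1", "c-1", 80)
order_service.create_order("o-2", "c-1", 50)
bus.run()
```

Neither service calls the other directly; the flow exists only as the set of subscriptions, which is what makes it loosely coupled and also harder to follow.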

Pros: Loose coupling, simple for short sagas, no single point of failure.
Cons: Hard to follow the overall flow; risk of cyclic dependencies; integration testing requires all services.

Orchestration-Based Saga

A dedicated saga orchestrator sends commands to participant services and waits for replies.

  1. Order Service creates an orchestrator.
  2. Orchestrator sends ReserveCredit to Customer Service.
  3. Customer Service replies with success or failure.
  4. Orchestrator approves or cancels the order.
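The same flow with an orchestrator might look like the sketch below. The class names `CreateOrderSaga`, `ReserveCredit`, and `Reply` are hypothetical, and a direct method call stands in for sending a command message and waiting for the reply.

```python
from dataclasses import dataclass


@dataclass
class ReserveCredit:
    """Command sent by the orchestrator to the Customer Service."""
    customer_id: str
    amount: int


@dataclass
class Reply:
    success: bool
    reason: str = ""


class CustomerService:
    def __init__(self, credit_limits):
        self.available = dict(credit_limits)

    def handle(self, command: ReserveCredit) -> Reply:
        if self.available.get(command.customer_id, 0) < command.amount:
            return Reply(False, "credit limit exceeded")
        self.available[command.customer_id] -= command.amount
        return Reply(True)


class CreateOrderSaga:
    """Orchestrator: owns the step sequence and decides the outcome."""

    def __init__(self, orders: dict, customer_service: CustomerService):
        self.orders = orders
        self.customer_service = customer_service

    def run(self, order_id: str, customer_id: str, total: int) -> str:
        self.orders[order_id] = "PENDING"
        # A synchronous call stands in for command/reply messaging.
        reply = self.customer_service.handle(ReserveCredit(customer_id, total))
        self.orders[order_id] = "APPROVED" if reply.success else "CANCELLED"
        return self.orders[order_id]


class OrderService:
    def __init__(self, customer_service: CustomerService):
        self.orders = {}
        self.customer_service = customer_service

    def create_order(self, order_id: str, customer_id: str, total: int) -> str:
        # Step 1: the Order Service creates an orchestrator for this order.
        saga = CreateOrderSaga(self.orders, self.customer_service)
        return saga.run(order_id, customer_id, total)


order_service = OrderService(CustomerService({"c-1": 100}))
first = order_service.create_order("o-1", "c-1", 80)
second = order_service.create_order("o-2", "c-1", 50)
```

The whole flow is readable in `CreateOrderSaga.run`, which is the main appeal of orchestration; the cost is that this one class now knows about every participant.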

Pros: Explicit flow, easier to understand and monitor, avoids cyclic dependencies, better for complex workflows.
Cons: Orchestrator is a single point of failure; adds coordination logic complexity.

Compensating Transactions

Each step that can be undone must have a corresponding compensating action (e.g., CancelCreditReservation). Compensating transactions must be idempotent.
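One way to make a compensating action idempotent is to check the recorded state before undoing anything. The sketch below assumes a hypothetical `CreditLedger` with a per-reservation status; the names are illustrative only.

```python
class CreditLedger:
    def __init__(self, limit: int):
        self.available = limit
        self.reservations = {}  # reservation_id -> {"amount": int, "status": str}

    def reserve(self, reservation_id: str, amount: int) -> bool:
        if amount > self.available:
            return False
        self.available -= amount
        self.reservations[reservation_id] = {"amount": amount, "status": "RESERVED"}
        return True

    def cancel_reservation(self, reservation_id: str) -> None:
        """Compensating action: safe to call any number of times."""
        reservation = self.reservations.get(reservation_id)
        if reservation is None or reservation["status"] == "CANCELLED":
            return  # nothing to undo, or already undone
        self.available += reservation["amount"]
        reservation["status"] = "CANCELLED"


ledger = CreditLedger(limit=100)
ledger.reserve("r-1", 40)
ledger.cancel_reservation("r-1")
ledger.cancel_reservation("r-1")  # duplicate delivery of CancelCreditReservation
ledger.cancel_reservation("r-unknown")  # reservation that never happened
```

Without the status check, the second call would credit the customer twice and leave the ledger above its original limit.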

When to Use

  • You have adopted the Database per Service pattern.
  • A business operation spans two or more services.
  • You need eventual consistency rather than strong ACID guarantees.
  • You want to avoid the availability and coupling costs of 2PC.
  • You need to roll back or compensate if one of the operations in the sequence fails.

Not suitable when:

  • Transactions are tightly coupled and require strong isolation.
  • Cyclic dependencies between services exist.
  • Compensating transactions for earlier participants cannot be reliably designed.

Trade-offs

Benefit | Drawback
Maintains consistency without distributed locking | No automatic rollback: compensating transactions must be coded manually
Works with any persistence store | No isolation between concurrent sagas (dirty reads possible)
Services remain loosely coupled | Debugging and tracing cross-service flows is harder
Scales well | Requires reliable messaging (at-least-once delivery + idempotency)
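The last drawback, at-least-once delivery combined with idempotency, is often handled by having each consumer remember which message IDs it has already processed. The following is a minimal sketch; `IdempotentConsumer` and the message shape are assumptions, and the in-memory set would be a durable table in a real service.

```python
class IdempotentConsumer:
    def __init__(self, handler):
        self.handler = handler
        self.processed_ids = set()  # assumption: durable storage in production

    def receive(self, message_id: str, payload: dict) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        if message_id in self.processed_ids:
            return False
        self.handler(payload)
        self.processed_ids.add(message_id)
        return True


balance = {"c-1": 100}


def reserve_credit(payload: dict) -> None:
    balance[payload["customer_id"]] -= payload["amount"]


consumer = IdempotentConsumer(reserve_credit)
message = {"customer_id": "c-1", "amount": 40}
first = consumer.receive("m-1", message)
second = consumer.receive("m-1", message)  # broker redelivers the same message
```

The redelivered message is acknowledged but has no effect, so the credit is reserved exactly once even though the message arrived twice.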

Potential Data Anomalies (Microsoft)

  • Lost updates: One saga overwrites another’s changes
  • Dirty reads: Reading uncommitted changes from a concurrent saga
  • Fuzzy reads: Inconsistent data seen between saga steps due to updates in between

Mitigation strategies: semantic locks, commutative updates, pessimistic view (reordering the saga so that data updates happen in retryable transactions), and version files.
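As an illustration of the first strategy, a semantic lock is an application-level flag that marks a record as having an uncommitted saga change, so other sagas can detect it and back off. The sketch below is one possible shape; the `Order` fields, the `REVISION_PENDING` status, and the function names are hypothetical.

```python
class SemanticLockError(Exception):
    """Raised when another saga still has an uncommitted change on the record."""


class Order:
    def __init__(self, order_id: str, total: int):
        self.order_id = order_id
        self.total = total
        self.pending_total = None
        self.status = "APPROVED"
        self.locked_by = None  # id of the saga holding the semantic lock, if any


def begin_revision(order: Order, saga_id: str, new_total: int) -> None:
    """Compensable step: flag the record as in flux before changing it."""
    if order.locked_by is not None and order.locked_by != saga_id:
        raise SemanticLockError(f"order {order.order_id} is locked by {order.locked_by}")
    order.locked_by = saga_id
    order.status = "REVISION_PENDING"
    order.pending_total = new_total


def end_revision(order: Order, saga_id: str, commit: bool) -> None:
    """Final step or compensation: apply or discard the change, then release the lock."""
    if order.locked_by != saga_id:
        raise SemanticLockError("saga does not hold the lock")
    if commit:
        order.total = order.pending_total
    order.pending_total = None
    order.status = "APPROVED"
    order.locked_by = None


order = Order("o-1", 100)
begin_revision(order, "saga-A", 150)
try:
    begin_revision(order, "saga-B", 90)  # concurrent saga sees the flag
    blocked = False
except SemanticLockError:
    blocked = True
end_revision(order, "saga-A", commit=True)
begin_revision(order, "saga-B", 90)
end_revision(order, "saga-B", commit=False)  # saga B was compensated
```

Saga B can neither read nor overwrite saga A's uncommitted total; it has to wait or fail, which is the trade-off a semantic lock makes to avoid lost updates and dirty reads.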

Real-World Usage

  • E-commerce order placement: Reserve inventory → charge payment → schedule delivery. If payment fails, compensate by releasing the inventory reservation.
  • Travel booking: Book flight + hotel + car rental; if car rental fails, cancel flight and hotel via compensating transactions.
  • Saga orchestration frameworks exist across ecosystems — consult the sources section for specific library examples.

Implementation Patterns

Orchestration Saga vs. Choreography Saga

The saga’s control flow topology is a separate design decision from whether to use a saga at all:

  • Orchestration Saga uses a dedicated saga manager / process manager object. It sends commands, awaits replies, and drives the step sequence explicitly. Easier to observe, test in isolation, and reason about complex flows. Risk: the orchestrator is a coupling point.
  • Choreography Saga has no coordinator — each service reacts to domain events and publishes its own events. Better decoupling and autonomy. Risk: the overall flow is emergent and harder to trace; cyclic event dependencies can arise.

See Choreography vs Orchestration for the full trade-off analysis. For complex workflows with many steps and rollback conditions, orchestration is generally preferred; for short two- or three-step flows between autonomous services, choreography is simpler.

Compensation Transaction Pattern (Python-CQRS)

In Python-CQRS-style implementations, each saga step is a pair:

  • Forward command — the step’s normal action (e.g., ReserveInventoryCommand)
  • Compensating command — the rollback action (e.g., ReleaseInventoryCommand), registered at step definition time

The saga engine executes forward commands in sequence; on failure at step N, it executes compensating commands for steps N−1 down to 1 in reverse order. Compensating commands must be idempotent — they may be retried if delivery is uncertain.
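The forward/compensating pairing and the reverse-order rollback can be sketched as follows. This is not the API of any particular Python CQRS library: `SagaStep`, `SagaFailed`, and `execute` are hypothetical names, and plain functions stand in for command objects such as ReserveInventoryCommand and ReleaseInventoryCommand.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    forward: Callable[[dict], None]     # e.g. handles ReserveInventoryCommand
    compensate: Callable[[dict], None]  # e.g. handles ReleaseInventoryCommand


class SagaFailed(Exception):
    def __init__(self, failed_step: str, cause: Exception):
        super().__init__(f"failed at {failed_step}: {cause}")
        self.failed_step = failed_step


def execute(steps: List[SagaStep], ctx: dict) -> None:
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.forward(ctx)
        except Exception as exc:
            # Step N failed: compensate steps N-1 down to 1.
            for done in reversed(completed):
                done.compensate(ctx)
            raise SagaFailed(step.name, exc) from exc
        completed.append(step)


trace: List[str] = []


def schedule_delivery(ctx: dict) -> None:
    raise RuntimeError("no courier available")


steps = [
    SagaStep("reserve_inventory",
             lambda ctx: trace.append("reserve_inventory"),
             lambda ctx: trace.append("release_inventory")),
    SagaStep("charge_payment",
             lambda ctx: trace.append("charge_payment"),
             lambda ctx: trace.append("refund_payment")),
    SagaStep("schedule_delivery",
             schedule_delivery,
             lambda ctx: trace.append("cancel_delivery")),
]

try:
    execute(steps, {"order_id": "o-1"})
    outcome = "completed"
except SagaFailed as err:
    outcome = f"failed at {err.failed_step}"
```

The failed step itself is not compensated here, on the assumption that a failed local transaction left no changes behind; an engine that cannot guarantee this would also need to run the failed step's own (idempotent) compensation.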