Choreography vs Orchestration

Orchestration and choreography are two fundamentally different approaches to coordinating multi-step workflows across distributed services. Orchestration uses a central coordinator that directs each service; choreography has services react to events and emit new events, with no central controller.

Problem

When a business process spans multiple services (e.g., an order placement that triggers inventory reservation, payment processing, and shipping scheduling), how should the coordination work? Who drives the process?

Solution / Explanation

Orchestration

A dedicated orchestrator (a service or process manager) explicitly controls the flow:

Orchestrator ──► calls Inventory Service ──► success
             ──► calls Payment Service ──► success
             ──► calls Shipping Service ──► success
             ──► updates Order status

The orchestrator knows the entire workflow; it tells each service what to do and handles failures centrally.
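
The flow above can be sketched in a few lines of Python. This is a minimal illustration, not how any of the listed tools work internally: the three service functions are hypothetical in-process stand-ins for remote calls, and `Order`, `OrderOrchestrator`, and `place_order` are names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Order:
    order_id: str
    status: str = "PENDING"
    log: list[str] = field(default_factory=list)


# Hypothetical stand-ins for remote service calls; each returns True on success.
def reserve_inventory(order: Order) -> bool:
    order.log.append("inventory reserved")
    return True


def process_payment(order: Order) -> bool:
    order.log.append("payment processed")
    return True


def schedule_shipping(order: Order) -> bool:
    order.log.append("shipping scheduled")
    return True


class OrderOrchestrator:
    """Owns the workflow definition and calls each service in turn."""

    def __init__(self, steps: list[tuple[str, Callable[[Order], bool]]]) -> None:
        self.steps = steps

    def run(self, order: Order) -> Order:
        for name, call in self.steps:
            if not call(order):
                # Failure is detected and handled in one place.
                order.status = f"FAILED_AT_{name.upper()}"
                return order
        order.status = "COMPLETED"
        return order


def place_order(order_id: str) -> Order:
    orchestrator = OrderOrchestrator([
        ("inventory", reserve_inventory),
        ("payment", process_payment),
        ("shipping", schedule_shipping),
    ])
    return orchestrator.run(Order(order_id))
```

Note that the services never reference each other or the workflow; only the orchestrator holds the step order, which is why changing the flow means changing the orchestrator.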

Characteristics:

  • Flow is explicit and centralized — easy to understand and debug
  • Orchestrator holds a single logical view of the process state
  • Services are called, not reactive — they don’t need to know about the workflow
  • Central process manager is a potential single point of failure

Tools: AWS Step Functions, Temporal, Cadence, Dapr Workflow, Conductor

Choreography

Services react to domain events and emit new events; there is no central coordinator:

OrderPlaced event published
     ↓
Inventory Service reacts ──► publishes InventoryReserved
     ↓
Payment Service reacts ──► publishes PaymentProcessed
     ↓
Shipping Service reacts ──► publishes ShipmentScheduled

The overall flow emerges from the individual reactions; no service knows the “whole picture.”
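
A minimal sketch of the same chain, assuming a toy synchronous in-process bus in place of a real broker. `EventBus`, `wire_services`, and `run_demo` are hypothetical names for this example, and each "service" is reduced to a single handler.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Toy synchronous, in-process bus standing in for a real broker."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)
        self.history: list[str] = []  # event types in the order they were published

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        self.history.append(event_type)
        for handler in list(self._handlers[event_type]):
            handler(payload)


def wire_services(bus: EventBus) -> None:
    # Each service knows only the event it consumes and the event it emits.
    bus.subscribe("OrderPlaced", lambda e: bus.publish("InventoryReserved", e))
    bus.subscribe("InventoryReserved", lambda e: bus.publish("PaymentProcessed", e))
    bus.subscribe("PaymentProcessed", lambda e: bus.publish("ShipmentScheduled", e))


def run_demo() -> list[str]:
    bus = EventBus()
    wire_services(bus)
    bus.publish("OrderPlaced", {"order_id": "o-1"})
    return bus.history
```

No function in this sketch lists the four steps in order; the sequence exists only as the combination of the three subscriptions.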

Characteristics:

  • Maximum service autonomy and decoupling
  • Flow is implicit — emerges from event reactions
  • Debugging requires following event chains across services
  • No coordinator acts as a single point of failure, but failures are harder to detect and handle
  • New services can join the workflow by subscribing to events
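
One common way to make event chains followable is to stamp every event in a workflow with the same correlation ID. The sketch below assumes that convention (the `correlation_id` field, `Bus`, `react`, and `AuditTrail` are hypothetical names, not part of any listed tool) and also shows a new subscriber joining without changes to existing services.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Event:
    name: str
    order_id: str
    correlation_id: str  # assumed field: identical for every event in one workflow


class Bus:
    """Toy in-process bus that delivers events in publish order (FIFO)."""

    def __init__(self) -> None:
        self._subs: dict[str, list[Callable[[Event], None]]] = defaultdict(list)
        self._queue: deque = deque()
        self._draining = False

    def subscribe(self, event_name: str, handler: Callable[[Event], None]) -> None:
        self._subs[event_name].append(handler)

    def publish(self, event: Event) -> None:
        self._queue.append(event)
        if self._draining:  # published from inside a handler; the outer loop delivers it
            return
        self._draining = True
        try:
            while self._queue:
                current = self._queue.popleft()
                for handler in list(self._subs[current.name]):
                    handler(current)
        finally:
            self._draining = False


def react(bus: Bus, consumes: str, emits: str) -> None:
    """Register a service that answers `consumes` with `emits`, copying the correlation ID."""
    bus.subscribe(
        consumes,
        lambda e: bus.publish(Event(emits, e.order_id, e.correlation_id)),
    )


class AuditTrail:
    """Late-joining subscriber that groups observed event names by correlation ID."""

    def __init__(self, bus: Bus, event_names: list[str]) -> None:
        self.chains: dict[str, list[str]] = defaultdict(list)
        for event_name in event_names:
            bus.subscribe(event_name, self._record)

    def _record(self, event: Event) -> None:
        self.chains[event.correlation_id].append(event.name)


def trace_two_orders() -> dict[str, list[str]]:
    bus = Bus()
    react(bus, "OrderPlaced", "InventoryReserved")
    react(bus, "InventoryReserved", "PaymentProcessed")
    react(bus, "PaymentProcessed", "ShipmentScheduled")
    # The audit service joins last; no existing service is modified.
    audit = AuditTrail(
        bus,
        ["OrderPlaced", "InventoryReserved", "PaymentProcessed", "ShipmentScheduled"],
    )
    bus.publish(Event("OrderPlaced", "o-1", "corr-1"))
    bus.publish(Event("OrderPlaced", "o-2", "corr-2"))
    return dict(audit.chains)
```

In a real deployment the chain would be reassembled from logs or traces collected across separate processes; the in-memory dictionary here only illustrates the grouping step.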

Tools: Apache Kafka, RabbitMQ, AWS SNS/SQS, Watermill

Saga Pattern

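The Saga pattern splits a distributed transaction into a sequence of local steps, each paired with a compensating action that undoes it. It applies to both approaches: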

  • Orchestration Saga: The orchestrator calls services and handles compensation on failure
  • Choreography Saga: Services listen for failure events and compensate themselves
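
The orchestration variant can be sketched as follows. This is an illustrative in-memory version under stated assumptions: `run_saga`, `StepFailed`, `record`, `fail`, and `order_saga` are hypothetical names, the journal list stands in for real side effects, and compensations are assumed to always succeed.

```python
from typing import Callable

Action = Callable[[list[str]], None]
Step = tuple[str, Action, Action]  # (name, action, compensation)


class StepFailed(Exception):
    """Raised by a step's action to signal that the saga must roll back."""


def run_saga(steps: list[Step], journal: list[str]) -> bool:
    completed: list[Action] = []  # compensations for steps that succeeded
    for _name, action, compensate in steps:
        try:
            action(journal)
            completed.append(compensate)
        except StepFailed:
            # Undo the completed steps in reverse order.
            for undo in reversed(completed):
                undo(journal)
            return False
    return True


def record(message: str) -> Action:
    def _step(journal: list[str]) -> None:
        journal.append(message)
    return _step


def fail(reason: str) -> Action:
    def _step(journal: list[str]) -> None:
        raise StepFailed(reason)
    return _step


def order_saga(payment_succeeds: bool) -> tuple[bool, list[str]]:
    journal: list[str] = []
    charge = record("charge payment") if payment_succeeds else fail("card declined")
    steps: list[Step] = [
        ("inventory", record("reserve inventory"), record("release inventory")),
        ("payment", charge, record("refund payment")),
        ("shipping", record("schedule shipping"), record("cancel shipping")),
    ]
    ok = run_saga(steps, journal)
    return ok, journal
```

When the payment step fails, only the inventory reservation has completed, so only its compensation runs. In a choreography saga the same undo logic would instead live in each service, triggered by a failure event.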

Decision Framework

| Factor | Use Orchestration | Use Choreography |
|---|---|---|
| Flow visibility | Need to see the whole workflow | Emergent flow is acceptable |
| Centralized error handling | Yes: compensations need coordination | No: services self-compensate |
| Service coupling | Services can be called | Services must be autonomous |
| Debugging | Easy: one log to follow | Hard: distributed event tracing needed |
| Adding steps | Requires orchestrator change | Just add a new subscriber |
| Long-running workflows | Orchestration (state management built in) | Choreography (but needs careful design) |
| Agent systems | Orchestration (predictable AI tool calls) | Choreography (reactive agent protocols) |

Trade-offs

| Aspect | Orchestration | Choreography |
|---|---|---|
| Coupling | Services decoupled from each other; all coupled to orchestrator | Services fully decoupled |
| Visibility | High: central log | Low: distributed tracing required |
| Failure handling | Centralized compensations | Distributed compensations |
| Scalability | Orchestrator can bottleneck | Scales with event bus |
| Testability | Orchestrator testable in isolation | Integration tests required |
| Complexity | Simpler for complex flows | Complex for error handling |