Vol. I · Research Dispatch · SOSP 2007

Dynamo: Amazon's Highly Available Key-Value Store

Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, Werner Vogels

2007·SOSP 2007·8,500 citations·14 pages
Abstract

Reliability at massive scale is one of the biggest challenges we face at Amazon.com, one of the largest e-commerce operations in the world. Even the slightest outage has significant financial consequences and impacts customer trust. This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon's core services use to…

Reading Posture
From the Field
The paper that gave the industry vocabulary for hard trade-offs — and proved that 'always wrong sometimes' beats 'sometimes unavailable'.
Verdict:Reach for it
Reach for it when

Designing any distributed data store where availability must survive node failures and network partitions — shopping carts, session state, user preferences, any write-heavy workload that can tolerate eventual consistency.

Look elsewhere when

Your use case requires strong consistency: financial transactions, inventory deduction, leader election, anything where stale reads cause correctness bugs. Dynamo's model is deliberately inconsistent under certain conditions.

In context

Traditional RDBMS systems with strong consistency. Dynamo explicitly chose availability over consistency in the CAP trade-off, enabling a class of systems — Cassandra, Riak, Voldemort — that shaped distributed systems for a decade.

Complexity●●●Heavy
Read time~60 minutes

What It Claims

As told for the curious

Amazon needed a database that would keep working even when servers failed, networks split, or whole data centres went dark — because being down for even a minute meant lost sales at massive scale. Traditional databases solved uncertainty by refusing to answer: if they were not sure of the state, they would return an error or block. Dynamo did the opposite: always accept writes, always answer reads, and sort out any contradictions later. The technique that made this work — tracking which version of data is newer using vector clocks — became a foundational idea in distributed systems. Dynamo itself never became a public product, but its ideas live on in Cassandra, Amazon's public DynamoDB, Riak, and in every engineering team that has argued about consistency models.

Key Ideas

6 contributions · The core concepts in plain terms

Read Next

Papers and articles that extend or critique these ideas

Related Work

Papers and codebases in the same intellectual neighbourhood

Related Expeditions
Dynamo: Amazon's Hi…🔀Attention Is All …
 

Export & Share

Take the field notes with you

Terminology

Hover the dotted terms above for definitions in context

consistent hashing

concept

A partitioning scheme mapping keys and nodes to a ring, minimising redistribution when cluster membership changes.

eventual consistency

concept

A consistency model guaranteeing that if no new updates are made, all replicas will converge to the same value — but makes no guarantee about timing.

hinted handoff

concept

A mechanism where a surrogate node stores writes for a failed node with a hint to forward them when the target recovers.

merkle tree

concept

A hash tree used to efficiently detect divergence between replicas — comparing root hashes narrows to diverging keys in O(log n) steps.

quorum

concept

The minimum number of nodes that must acknowledge a read or write for the operation to succeed.

sloppy quorum

concept

A quorum satisfied by any available nodes rather than designated replicas, accepting temporary inconsistency for availability during failures.

vector clock

concept

A list of (node, counter) pairs tracking write causality — allows detection of concurrent conflicting versions.