Data Model

Bigtable • Data Model
Bigtable

Data Model

Bigtable models data as a sparse, distributed, multidimensional sorted map with keys (row, column, timestamp) → value. Columns are grouped into families with storage policies per family, and each cell can hold multiple timestamped versions.

Sparse map Column families Versioned cells Row-ordered tablets

1) Definition & Semantics

Formal definition:

(row: string, column: string, time: int64) → bytes

Row keys define lexicographic order and tablet boundaries (key ranges). Column keys are written as family:qualifier. Values are arbitrary byte strings.

  • Atomicity: mutations and reads are atomic per row.
  • Ordering: data is sorted lexicographically by row key.
  • Sparsity: nonexistent columns or cells take no storage.
  • Versions: multiple versions per cell, identified by timestamps.

2) Rows & Row Key Design

Row keys are arbitrary byte strings (typically 10–100 bytes) that define locality and load distribution.

  • Group related data: reversed-domain for URLs (e.g., com.google.maps/index.html).
  • Avoid hotspots: use salting/hash prefixes for high-throughput IDs.
  • Efficient scans: time bucketing (e.g., YYYYMMDD#user) for range queries.
Row-level atomicity Lexicographic ordering Tablet splits

3) Column Families & Qualifiers

Columns belong to families, which are the basic units for access control and storage tuning. Families are few and fixed; qualifiers are dynamic.

family:qualifier anchor:cnnsi.com → “Sports anchor text” contents: → “<html>…</html>” @ ti language:id → “en”
  • Per-family policies: compression, caching, Bloom filters, max_versions, TTL.
  • Locality: design schema so frequently co-accessed data is colocated.

4) Versions, Timestamps & Retention

Each cell may have multiple versions stored in descending timestamp order. Reads return the newest version unless a timestamp or time range is specified.

  • Timestamps: 64-bit integers, assigned by the system or client.
  • Retention: per-family limits such as keep last N or keep newer than X days.
  • Compaction: obsolete versions are pruned during background merges.
// Example (pseudo API) put(row=”com.cnn.www”, mutations=[ set(“anchor:www.c-span.org”,”CNN”, ts=1692), delete(“anchor:www.abc.com”, ts=1692) ]) get(row=”com.cnn.www”, columns=[“anchor:*”], return_all_versions=true)

5) Mapping to Physical Storage

The logical map is persisted via an in-memory memtable and on-disk immutable SSTables stored in GFS. Each tablet (a row-key range) is the unit of load balancing and parallelism.

  • Write path: commit log → memtable → flush → SSTables → compactions.
  • Read path: merge memtable + SSTables with block cache & Bloom filters.
  • Locality groups: group families for co-access and selective caching.

Design insight: row-key, family, and locality-group choices translate access patterns directly into performance and storage efficiency.

“Data Model” page — synthesized from the original Bigtable: A Distributed Storage System for Structured Data (OSDI 2006) paper.