Data Model
Bigtable models data as a sparse, distributed, multidimensional sorted map with keys (row, column, timestamp) → value. Columns are grouped into families with storage policies per family, and each cell can hold multiple timestamped versions.
1) Definition & Semantics
Formal definition:
Row keys define lexicographic order and tablet boundaries (key ranges).
Column keys are written as family:qualifier. Values are arbitrary byte strings.
- Atomicity: mutations and reads are atomic per row.
- Ordering: data is sorted lexicographically by row key.
- Sparsity: nonexistent columns or cells take no storage.
- Versions: multiple versions per cell, identified by timestamps.
2) Rows & Row Key Design
Row keys are arbitrary byte strings (typically 10–100 bytes) that define locality and load distribution.
- Group related data: reversed-domain for URLs (e.g.,
com.google.maps/index.html). - Avoid hotspots: use salting/hash prefixes for high-throughput IDs.
- Efficient scans: time bucketing (e.g.,
YYYYMMDD#user) for range queries.
3) Column Families & Qualifiers
Columns belong to families, which are the basic units for access control and storage tuning. Families are few and fixed; qualifiers are dynamic.
- Per-family policies: compression, caching, Bloom filters, max_versions, TTL.
- Locality: design schema so frequently co-accessed data is colocated.
4) Versions, Timestamps & Retention
Each cell may have multiple versions stored in descending timestamp order. Reads return the newest version unless a timestamp or time range is specified.
- Timestamps: 64-bit integers, assigned by the system or client.
- Retention: per-family limits such as keep last N or keep newer than X days.
- Compaction: obsolete versions are pruned during background merges.
5) Mapping to Physical Storage
The logical map is persisted via an in-memory memtable and on-disk immutable SSTables stored in GFS. Each tablet (a row-key range) is the unit of load balancing and parallelism.
- Write path: commit log → memtable → flush → SSTables → compactions.
- Read path: merge memtable + SSTables with block cache & Bloom filters.
- Locality groups: group families for co-access and selective caching.
Design insight: row-key, family, and locality-group choices translate access patterns directly into performance and storage efficiency.
