Architecture

1. Tablet Location

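Clients find tablets through a three-level hierarchy: a Chubby file stores the location of the root tablet, the root tablet stores the locations of the METADATA tablets, and each METADATA tablet stores the locations of a set of user tablets. Clients cache these locations. Each tablet server holds an exclusive Chubby lock while it is serving; when a server loses its lock or becomes unreachable, the master reassigns its tablets.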
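
The lookup chain can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: LocationTablet, locate, the "root_tablet" key, and the server names are all hypothetical, Chubby is modeled as a plain dict, and every level is keyed by the end row of the range it covers.

```python
import bisect

class LocationTablet:
    """Sorted map from a range's end row key to the location of that range."""

    def __init__(self, entries):
        # entries: (end_row_key, location) pairs that together cover the key space
        self.entries = sorted(entries, key=lambda e: e[0])
        self.end_keys = [k for k, _ in self.entries]

    def lookup(self, row_key):
        # The first range whose end key is >= row_key contains the row.
        i = bisect.bisect_left(self.end_keys, row_key)
        if i == len(self.entries):
            raise KeyError(row_key)
        return self.entries[i][1]

def locate(chubby, row_key):
    root = chubby["root_tablet"]      # level 1: Chubby points to the root tablet
    metadata = root.lookup(row_key)   # level 2: root tablet points to a METADATA tablet
    return metadata.lookup(row_key)   # level 3: METADATA tablet points to the user tablet

# Hypothetical layout: two METADATA tablets, four user tablets.
meta_a = LocationTablet([("g", "server-1"), ("m", "server-2")])
meta_b = LocationTablet([("t", "server-3"), ("\uffff", "server-4")])
root = LocationTablet([("m", meta_a), ("\uffff", meta_b)])
chubby = {"root_tablet": root}

print(locate(chubby, "apple"))
print(locate(chubby, "zebra"))
```

A real client would cache the result of locate and fall back to the hierarchy only when the cached location turns out to be stale.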

2. Serving Data

Each tablet keeps recent updates in an in-memory memtable and older data in a set of immutable SSTables stored in GFS. An update is first appended to the commit log and then applied to the memtable.
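
A minimal sketch of this write path, assuming a toy Tablet class: the class, its method names, and the memtable_limit parameter are made up for illustration, the commit log is a Python list rather than a file, and the flush threshold counts entries rather than bytes.

```python
from types import MappingProxyType

class Tablet:
    def __init__(self, memtable_limit=2):
        self.log = []        # commit log (stands in for a log file)
        self.memtable = {}   # recent updates, in memory
        self.sstables = []   # immutable sorted maps, oldest first
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.log.append((key, value))   # 1. record the update in the log
        self.memtable[key] = value      # 2. apply it to the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                # 3. freeze the memtable into an SSTable

    def flush(self):
        # Sort by key and wrap in a read-only view so the SSTable cannot change.
        frozen = MappingProxyType(dict(sorted(self.memtable.items())))
        self.sstables.append(frozen)
        self.memtable = {}

    def read(self, key):
        # Newest data wins: check the memtable, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

tablet = Tablet(memtable_limit=2)
tablet.write("a", 1)
tablet.write("b", 2)   # second entry reaches the limit and triggers a flush
tablet.write("a", 3)   # newer value for "a" lives in the fresh memtable
print(tablet.read("a"), tablet.read("b"), tablet.read("z"))
```

The read method is where the merged view comes from: the newest source that knows a key decides its value.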

Writes: Log → Memtable → SSTable (the memtable is flushed to a new SSTable when it reaches a size threshold)
Reads: Merge the memtable and SSTables, using Bloom filters and caches to avoid unnecessary disk reads.
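
To show how a Bloom filter lets a read skip SSTables, here is a small sketch. It is an assumption-laden toy, not Bigtable's implementation: BloomFilter, candidate_sstables, the bit-array size, and the SHA-256 based hashing are all choices made for this example, and keys are plain strings.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)   # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))

def candidate_sstables(sstables, key):
    """Return the names of SSTables whose filter cannot rule the key out."""
    return [name for name, bloom in sstables if bloom.might_contain(key)]

with_row = BloomFilter()
with_row.add("row1")
empty = BloomFilter()   # filter of an SSTable that holds no keys
sstables = [("sst-1", with_row), ("sst-2", empty)]

print(candidate_sstables(sstables, "row1"))
```

A filter never reports a stored key as absent, so skipping an SSTable on a negative answer is always safe; a positive answer may be a false positive and only costs an extra read.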

3. Compactions & Optimization

Compactions merge SSTables to bound the number of files a read must consult. Bloom filters let a read skip SSTables that cannot contain the requested row/column pair.
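
A compaction can be sketched as a merge in which newer entries replace older ones. The names merge_sstables, drop_deletes, and TOMBSTONE are hypothetical, and the sketch builds the result in a dict for brevity, whereas a real system would more likely stream a k-way merge over sorted files.

```python
TOMBSTONE = object()   # marker for a deleted key (hypothetical representation)

def merge_sstables(sstables, drop_deletes=False):
    """Merge SSTables (given oldest first) into one sorted run; newer values win."""
    merged = {}
    for sstable in sstables:     # later (newer) tables overwrite earlier ones
        merged.update(sstable)
    if drop_deletes:
        # Only safe when every SSTable of the tablet takes part in the merge,
        # because then no older value can resurface.
        merged = {k: v for k, v in merged.items() if v is not TOMBSTONE}
    return sorted(merged.items(), key=lambda kv: kv[0])

old = [("a", 1), ("b", 2)]
new = [("b", TOMBSTONE), ("c", 3)]

partial = merge_sstables([old, new])                  # keeps the deletion marker
full = merge_sstables([old, new], drop_deletes=True)  # deleted data is gone
print([k for k, _ in partial], full)
```

Keeping the deletion marker in a partial merge matters: an older SSTable outside the merge might still hold the key, and the marker is what hides it.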

Locality groups: column families that are read together are stored together, in their own SSTables
Compression: per-block compression saves space at low CPU cost
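
As a rough illustration of locality groups, the sketch below partitions one row's cells by group so that a read touching only some column families needs only some of the files. The schema, group names, and both helper functions are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical schema: column family -> locality group
LOCALITY_GROUPS = {"contents": "page", "anchor": "meta", "language": "meta"}

def split_by_locality_group(row):
    """Partition one row's {family: value} cells by locality group."""
    groups = defaultdict(dict)
    for family, value in row.items():
        groups[LOCALITY_GROUPS[family]][family] = value
    return dict(groups)

def groups_needed(families):
    """Locality groups (and hence SSTable sets) a read of these families touches."""
    return {LOCALITY_GROUPS[f] for f in families}

row = {"contents": "<html>page body</html>", "anchor": "CNN", "language": "EN"}
stored = split_by_locality_group(row)

print(sorted(stored))
print(groups_needed(["language", "anchor"]))
```

Here a scan over language and anchor touches only the "meta" group and never reads the much larger page contents.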

4. Recovery

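Each tablet server appends the updates for all of its tablets to a single shared commit log. When a tablet is recovered, its memtable is rebuilt by replaying the log entries for that tablet that are not yet reflected in its SSTables. SSTables are immutable, which simplifies concurrency control and garbage collection.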
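
Replay from a shared log can be sketched as a filter over log entries. The entry layout (sequence number, tablet id, key, value), the function name, and the redo_point convention (entries at or below it are assumed to be in SSTables already) are assumptions for this example only.

```python
def recover_memtable(shared_log, tablet_id, redo_point):
    """Rebuild one tablet's memtable from a commit log shared by many tablets."""
    memtable = {}
    # Replay in sequence order so that later updates overwrite earlier ones.
    for seq, tid, key, value in sorted(shared_log, key=lambda e: e[0]):
        # Skip other tablets' entries and updates already persisted in SSTables.
        if tid == tablet_id and seq > redo_point:
            memtable[key] = value
    return memtable

shared_log = [
    (1, "t1", "a", "x"),
    (2, "t2", "k", "y"),
    (3, "t1", "a", "z"),
    (4, "t1", "b", "w"),
]

print(recover_memtable(shared_log, "t1", redo_point=1))
print(recover_memtable(shared_log, "t2", redo_point=0))
```

Because the log interleaves entries from many tablets, every recovery has to filter by tablet; this is the price of writing one log file per server instead of one per tablet.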

5. Summary

Architecture: Master + Tablet Servers + GFS
Data path: Log → Memtable → SSTables
Speed: Caches + Bloom filters + Compaction
Scalability: Add tablet servers to grow; a single, lightly loaded master handles control

Essence: Simplicity and direct data paths make Bigtable fast and scalable.
