Branimir Lambov from IBM on Cassandra


TLDR

  • Cassandra PMC member Branimir Lambov traces a decade of infrastructure work: trie-based SSTable and memtable formats, Unified Compaction, and forthcoming ACID transactions via Accord.

Key Takeaways

  • The BTI (trie-indexed bigtable) format swaps Cassandra’s primary SSTable index for a byte-ordered trie, improving query performance and wide-partition handling. It shipped in DSE 6 in 2017 and was contributed to Apache Cassandra 5 in 2024.
  • The Trie memtable in Cassandra 5 is the first output of CEP-57, a broader effort to replace Cassandra’s core storage and retrieval structures with trie-based alternatives.
  • Unified Compaction Strategy (also Cassandra 5) grew out of academic research on compaction algorithms. In DataStax’s internal branch it handled data densities an order of magnitude higher than the legacy strategies.
  • A compaction parallelization bug caused data loss because the assertion that would have caught it was disabled in a release build. Replication saved the data, but the lesson is that no code path is unimportant, and assertions that can be toggled off are an unreliable safety net.
  • Accord will bring cross-partition ACID transactions to Cassandra with performance intended to scale similarly to Cassandra’s existing eventually-consistent operations.
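
To make the "byte-ordered trie" idea from the first takeaway concrete, here is a minimal in-memory sketch. It is not Cassandra's on-disk BTI format or its actual code: the class names `TrieNode` and `ByteTrieIndex`, the integer "offset" values, and the sample keys are all hypothetical, chosen only to illustrate how a trie keyed on raw bytes shares common prefixes and still returns keys in sorted byte order.

```python
from typing import Dict, Iterator, Optional, Tuple


class TrieNode:
    """One node per byte of key prefix; `value` is set only where a key ends."""
    __slots__ = ("children", "value")

    def __init__(self) -> None:
        self.children: Dict[int, "TrieNode"] = {}
        self.value: Optional[int] = None


class ByteTrieIndex:
    """Toy index from byte-string keys to integer offsets (illustrative only)."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def put(self, key: bytes, offset: int) -> None:
        node = self.root
        for b in key:  # iterating bytes yields ints in 0..255
            node = node.children.setdefault(b, TrieNode())
        node.value = offset

    def get(self, key: bytes) -> Optional[int]:
        node = self.root
        for b in key:
            node = node.children.get(b)
            if node is None:
                return None  # no stored key has this prefix
        return node.value  # None if `key` is only a prefix of stored keys

    def items(self) -> Iterator[Tuple[bytes, int]]:
        """Yield (key, offset) pairs in ascending byte order."""
        stack = [(b"", self.root)]
        while stack:
            prefix, node = stack.pop()
            if node.value is not None:
                yield prefix, node.value
            # Push children in descending order so the smallest byte pops first.
            for b in sorted(node.children, reverse=True):
                stack.append((prefix + bytes([b]), node.children[b]))


if __name__ == "__main__":
    index = ByteTrieIndex()
    index.put(b"apple", 10)
    index.put(b"app", 5)
    index.put(b"banana", 20)
    for key, offset in index.items():
        print(key, offset)
```

In this sketch a lookup costs one step per key byte regardless of how many keys are stored, and keys with a shared prefix (`app`, `apple`) share nodes. A real on-disk format would also need a compact node encoding and page-aware layout, which this sketch leaves out.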

Hacker News Comment Review

  • No substantive HN discussion yet.
