Architecture Overview
ScramDB is an adaptive HTAP SQL database written in Rust.
How a Query Executes
SQL Query
→ PostgreSQL Wire Protocol
→ SQL Parser
→ Cost-Based Optimizer
→ Bytecode Compiler
→ JIT Compiler (background)
→ Parallel Execution (core-pinned workers)
→ Results (streamed via PgWire)
Key Design Decisions
JIT Compilation
ScramDB compiles SQL queries to native machine code instead of interpreting them. Compilation happens transparently in the background, with no configuration needed. Compiled code is cached on disk and reused across restarts.
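The pipeline above lists a bytecode compiler followed by a background JIT stage, which suggests a tiered scheme: run the bytecode right away and switch to compiled code once it is ready. The sketch below illustrates that idea only. The names `Op`, `Query`, `interpret`, and `compile` are hypothetical, the "compiled" form is a chain of Rust closures standing in for machine code, and the on-disk cache is left out.

```rust
use std::sync::{Arc, OnceLock};
use std::thread;

/// Hypothetical two-instruction bytecode acting on a single accumulator.
#[derive(Clone, Copy)]
pub enum Op {
    Add(i64),
    Mul(i64),
}

/// Tier 0: interpret the bytecode one instruction at a time.
pub fn interpret(ops: &[Op], input: i64) -> i64 {
    ops.iter().fold(input, |acc, op| match op {
        Op::Add(n) => acc + n,
        Op::Mul(n) => acc * n,
    })
}

type Compiled = Box<dyn Fn(i64) -> i64 + Send + Sync>;

/// Stand-in for a JIT: fuses the instructions into one callable.
/// A real JIT would emit machine code here (assumption: not shown).
pub fn compile(ops: &[Op]) -> Compiled {
    let mut f: Compiled = Box::new(|x: i64| x);
    for op in ops.iter().copied() {
        let prev = f;
        f = match op {
            Op::Add(n) => Box::new(move |x: i64| prev(x) + n) as Compiled,
            Op::Mul(n) => Box::new(move |x: i64| prev(x) * n) as Compiled,
        };
    }
    f
}

pub struct Query {
    ops: Vec<Op>,
    compiled: Arc<OnceLock<Compiled>>,
}

impl Query {
    /// Starts compilation on a background thread and returns immediately.
    /// The join handle is returned only so callers can wait in tests.
    pub fn new(ops: Vec<Op>) -> (Self, thread::JoinHandle<()>) {
        let compiled: Arc<OnceLock<Compiled>> = Arc::new(OnceLock::new());
        let slot = Arc::clone(&compiled);
        let bg_ops = ops.clone();
        let handle = thread::spawn(move || {
            // Publish the compiled form; ignore the error if already set.
            let _ = slot.set(compile(&bg_ops));
        });
        (Query { ops, compiled }, handle)
    }

    /// Uses the compiled form when available, the interpreter otherwise.
    pub fn run(&self, input: i64) -> i64 {
        match self.compiled.get() {
            Some(f) => f(input),
            None => interpret(&self.ops, input),
        }
    }

    pub fn is_compiled(&self) -> bool {
        self.compiled.get().is_some()
    }
}
```

Calling `run` gives the same answer before and after the background thread finishes; only the execution path changes, which is what makes the switch transparent to the client.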
Morsel-Driven Parallelism
Each query is split into pipelines that process data in parallel "morsels" (small, fixed-size chunks of rows). Workers are pinned to CPU cores to keep scheduling overhead low and to let throughput scale with the number of cores.
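As a rough illustration of morsel-driven execution, the sketch below has worker threads repeatedly claim the next morsel from a shared atomic cursor until the input is exhausted. The function `parallel_sum` and its parameters are hypothetical, the dispatch mechanism is an assumption rather than ScramDB's actual scheduler, and core pinning is omitted because the standard library has no affinity API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Sums `data` by letting `workers` threads pull morsels of
/// `morsel_size` rows from a shared cursor (hypothetical helper).
pub fn parallel_sum(data: &[i64], morsel_size: usize, workers: usize) -> i64 {
    assert!(morsel_size > 0 && workers > 0);
    // Index of the next unclaimed row; each fetch_add claims one morsel.
    let next = AtomicUsize::new(0);

    thread::scope(|s| {
        let mut handles = Vec::with_capacity(workers);
        for _ in 0..workers {
            handles.push(s.spawn(|| {
                let mut local = 0i64;
                loop {
                    let start = next.fetch_add(morsel_size, Ordering::Relaxed);
                    if start >= data.len() {
                        break; // no morsels left
                    }
                    let end = (start + morsel_size).min(data.len());
                    local += data[start..end].iter().sum::<i64>();
                }
                local // per-worker partial result
            }));
        }
        // Merge the partial results once every worker has finished.
        handles
            .into_iter()
            .map(|h| h.join().unwrap())
            .sum::<i64>()
    })
}
```

Because workers claim morsels dynamically instead of receiving a fixed slice up front, a slow worker simply claims fewer morsels, which keeps the load balanced without a central coordinator.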
Tundra Columnar Storage
Custom columnar storage engine with zone maps for automatic predicate pushdown, buffer pool caching, CRC32C integrity, and WAL-based crash recovery.
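To show how zone maps help with predicate pushdown, here is a minimal sketch in which each block of a column stores its minimum and maximum value, and a range scan skips any block whose bounds cannot overlap the predicate. `Block` and `scan_range` are hypothetical names; Tundra's real block layout is not described here.

```rust
/// One column block plus its zone map (min/max of the stored values).
pub struct Block {
    pub min: i64,
    pub max: i64,
    pub values: Vec<i64>,
}

impl Block {
    /// Builds a block and its zone map; returns None for an empty block.
    pub fn new(values: Vec<i64>) -> Option<Self> {
        let min = *values.iter().min()?;
        let max = *values.iter().max()?;
        Some(Block { min, max, values })
    }
}

/// Evaluates `lo <= v <= hi` over all blocks.
/// Returns the matching values and the number of blocks actually read.
pub fn scan_range(blocks: &[Block], lo: i64, hi: i64) -> (Vec<i64>, usize) {
    let mut matches = Vec::new();
    let mut blocks_read = 0;
    for block in blocks {
        // Zone-map check: skip blocks that cannot contain a match.
        if block.max < lo || block.min > hi {
            continue;
        }
        blocks_read += 1;
        matches.extend(
            block
                .values
                .iter()
                .copied()
                .filter(|v| *v >= lo && *v <= hi),
        );
    }
    (matches, blocks_read)
}
```

The check is only as useful as the data is clustered: on sorted or time-ordered columns most blocks are skipped, while on randomly ordered data nearly every block's range overlaps the predicate.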
Transactions
Full MVCC with PostgreSQL-compatible isolation levels: Read Committed, Repeatable Read, and Serializable.
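The core of MVCC is a visibility rule: a reader sees a row version only if that version was committed before the reader's snapshot and had not yet been superseded at that point. The sketch below assumes a simple commit-timestamp scheme with hypothetical names (`Version`, `visible`, `read`); ScramDB's actual version format and snapshot mechanism may differ.

```rust
/// One version of a row, valid from `begin_ts` until `end_ts`.
/// `end_ts == None` means the version has not been superseded.
pub struct Version {
    pub value: &'static str,
    pub begin_ts: u64,
    pub end_ts: Option<u64>,
}

/// A version is visible if it was committed at or before the snapshot
/// and was not replaced or deleted at or before the snapshot.
pub fn visible(v: &Version, snapshot_ts: u64) -> bool {
    v.begin_ts <= snapshot_ts && v.end_ts.map_or(true, |end| end > snapshot_ts)
}

/// Returns the value a reader with the given snapshot would see.
pub fn read(chain: &[Version], snapshot_ts: u64) -> Option<&'static str> {
    chain
        .iter()
        .find(|v| visible(v, snapshot_ts))
        .map(|v| v.value)
}
```

Under this model, the isolation levels differ mainly in when the snapshot is taken: Read Committed would take a fresh `snapshot_ts` for every statement, while Repeatable Read would reuse one snapshot for the whole transaction. Serializable additionally needs conflict detection, which this sketch does not cover.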
Indexing
Three index types: ART for point lookups, B+Tree for range queries, and Hash for equality checks. The optimizer automatically selects the best access path.
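The reason for several index types is that ordered and unordered structures suit different predicates. The sketch below uses the Rust standard library's `BTreeMap` and `HashMap` purely as stand-ins (they are not ScramDB's B+Tree or Hash index, and an ART has no standard-library equivalent); `range_lookup` and `point_lookup` are hypothetical helpers mapping keys to row ids.

```rust
use std::collections::{BTreeMap, HashMap};

/// Range predicate (`lo <= key <= hi`): an ordered index can seek to `lo`
/// and walk forward, returning row ids in key order.
pub fn range_lookup(index: &BTreeMap<i64, u64>, lo: i64, hi: i64) -> Vec<u64> {
    if lo > hi {
        return Vec::new(); // empty range; BTreeMap::range would panic
    }
    index.range(lo..=hi).map(|(_, row_id)| *row_id).collect()
}

/// Equality predicate (`key = k`): a hash index answers with a single
/// probe but cannot serve range predicates at all.
pub fn point_lookup(index: &HashMap<i64, u64>, key: i64) -> Option<u64> {
    index.get(&key).copied()
}
```

An optimizer choosing an access path would weigh this trade-off per predicate: an equality filter can use either structure, while a `BETWEEN` or inequality filter needs the ordered one.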
PostgreSQL Compatibility
Full PostgreSQL wire protocol - connect with psql, JDBC, psycopg2, or any PG driver. Standard SQL syntax, not a custom query language.