| name | sql-db-architecture |
| description | Understand the toy relational database architecture, crate organization, data flow, and implementation patterns. Use when exploring the database codebase, understanding query execution, debugging storage/indexing, or planning new features. Keywords: sql-database, database, crate, storage, executor, planner, parser, catalog, WAL, buffer pool, index, architecture |
sql-database Architecture Expert
Provides deep understanding of the toy relational database implementation, crate boundaries, and system design.
When to Activate
Use this skill when:
- Exploring how SQL queries flow through the system
- Understanding storage layer (heap files, slotted pages, WAL)
- Debugging parser โ planner โ executor pipeline
- Working with indexes (B+Tree, Hash, Bitmap, Trie)
- Navigating crate dependencies and boundaries
- Planning new features that span multiple crates
System Overview
sql-database is a minimal RDBMS in Rust implementing:
- SQL subset: CREATE/DROP TABLE/INDEX, INSERT, SELECT, UPDATE, DELETE
- Storage: slotted pages + buffer pool + WAL (redo-only)
- Execution: Volcano model with SeqScan/IndexScan operators
- Indexes: B+Tree (sled), Hash, Bitmap (roaring), Trie
Instructions
1. Identify the Query
When user asks about functionality, map to crates:
"How does SELECT work?"
โ parser โ planner โ executor โ storage/index โ buffer
"Where are tables created?"
โ catalog (metadata) โ storage (heap allocation) โ WAL (logging)
"How are indexes used?"
โ planner (index selection) โ executor (IndexScan) โ index crate
2. Crate Responsibility Map
Use this mapping to locate relevant code:
| Crate | Responsibility | Key Types |
|---|
common | Shared types, errors | Row, RecordBatch, DbError, ColumnId |
types | SQL types & values | SqlType, Value |
expr | Expression AST + eval | Expr, BinaryOp, UnaryOp |
parser | SQL โ AST | sqlparser-rs adapter |
planner | AST โ logical โ physical | Plan nodes, optimization rules |
executor | Volcano operators | Exec trait, SeqScan, Filter, Project |
storage | Heap table, tuples | HeapTable, RecordId, slotted pages |
buffer | Page cache (LRU) | Pager, PageId, TableId |
wal | Write-ahead log | WalRecord, append/replay |
catalog | Schema metadata | TableMeta, IndexMeta, TableSchema |
index | All index types | Index trait, BTree/Hash/Bitmap/Trie |
repl | CLI shell | rustyline, tabled rendering |
testsupport | Test fixtures | run_sql_script, snapshots |
3. Data Flow Patterns
Query Execution (SELECT):
SQL string
โ parser (sqlparser-rs โ AST)
โ planner (logical plan โ physical plan with index selection)
โ executor (Volcano operators: Scan โ Filter โ Project)
โ storage/index (fetch rows via RecordId)
โ buffer (page cache lookup/fetch)
โ REPL (format RecordBatch with tabled)
Write Path (INSERT/UPDATE/DELETE):
SQL string
โ parser โ planner โ executor
โ WAL (append WalRecord, fsync)
โ storage (heap table modification)
โ index (maintain all indexes on affected columns)
โ buffer (mark pages dirty)
Recovery:
Startup
โ WAL replay (read WalRecord log)
โ storage (re-apply operations)
โ catalog (restore table/index metadata)
4. Code Navigation Strategy
Finding implementations:
- Trait definitions: Search crate root (e.g.,
executor/lib.rs for Exec)
- Concrete operators: Look in
<crate>/src/<operator>.rs
- Tests: Adjacent
.rs files or tests/ directory
- Integration:
testsupport/ for end-to-end scripts
Dependency order (from CLAUDE.md):
common โ types โ expr
โ
parser
โ
catalog
โ
buffer โ storage โ wal
โ
planner
โ
index
โ
executor
โ
repl
5. Key Design Principles
From the design doc:
- Tiny interfaces - Prefer small traits (
Exec, Index, HeapTable, Pager)
- No implicit coercion - Same-type comparisons only (v1)
- WAL-first writes - (1) append WAL (2) fsync (3) apply to storage
- Rule-based planning - Predicate pushdown + index selection
- Single writer - No transactions/MVCC in v1
- Snapshot tests - Use
insta for query output verification
6. Response Format
When explaining architecture:
Component Overview:
- Purpose in 1 sentence
- Key types/traits
- Dependencies (what it calls)
- Dependents (what calls it)
Code References:
- Use
crate/path/file.rs:line format
- Link related components
- Show trait โ impl relationships
Examples:
- Provide concrete SQL queries
- Show expected output format
- Reference existing tests when available
Project Context
- Language: Rust (workspace with 13 crates)
- Build:
cargo check/test/fmt/clippy
- Coverage:
scripts/coverage.sh (llvm-cov)
- Testing: Unit tests inline, integration in
tests/, snapshots with insta
- Dependencies: Pinned in workspace
Cargo.toml with workspace = true
Common Questions
"How does index selection work?"
โ Check planner/ for rules matching WHERE predicates to available indexes
"Where is Row serialized?"
โ storage/ uses bincode for tuple layout in slotted pages
"How does the buffer pool work?"
โ buffer/ implements LRU cache over PageId, backed by file segments
"What SQL is supported?"
โ See SKILL.md header or parser/ for DDL/DML subset
"How to add a new operator?"
โ Implement Exec trait in executor/, wire into planner physical node generation
Tool Usage
- Grep: Find trait implementations, error types, specific SQL keywords
- Read: Examine crate root
lib.rs, trait definitions, test fixtures
- Glob: Locate all operators (
executor/**/*.rs), test files (**/tests/*.rs)
For detailed API signatures and implementation notes, see REFERENCE.md.