transactional.blog
2025-10-13
Copy-and-Patch
A Tutorial
•
How It Works
•
Control Flow
[draft]
•
Register Allocation
[draft]
•
Stencil Library Generation
[draft]
•
Stencil Selection
[draft]
•
Benchmarking with WasmNow
[draft]
2025-08-11
NULL BITMAP on SIMD
A NULL BITMAP guest post on loop parallelism transformations.
2025-08-01
A Failed Experiment with Siso
Benchmarking
[draft]
•
Encryption Survey
[draft]
2025-04-19
SIGMOD Programming Contest Archive
SIGMOD hosts a yearly competition to design a system which performs a set of queries as quickly as possible. The contests provide a starting framework and test harness. They make great intermediate database projects for learning.
In-Memory Join Pipeline (2025)
•
Hybrid Vector Search (2024)
•
Approximate K-nearest-neighbor Graph Construction (2023)
•
Blocking System for Entity Resolution (2022)
•
Entity Resolution (2021)
•
Entity Resolution (2020)
•
Sorting (2019)
•
Join Processing (2018)
•
Streaming N-Gram Filter (2017)
•
Shortest Path (2016)
•
Transaction Processing (2015)
•
Social Network Graph Processing (2014)
•
Streaming Full Text Search (2013)
•
Multi-dimensional Indexing (2012)
•
Durable Main-Memory Index Using Flash (2011)
•
Distributed Query Engine (2010)
•
Main Memory Transactional Index (2009)
Requirements and design constraints for implementing SQL on a (distributed) key-value store, with commentary on the tradeoffs therein.
2025-04-17
Decomposing Transactional Systems
Every transactional system must execute, order, validate, and persist transactions.
2025-04-12
Torn Write Detection and Protection
2025-02-27
Talks: Enough With All The Raft
There are better ways to replicate data than Raft.
2024-12-28
Personal: Time Tracking in Obsidian
2024-12-05
Notes On: Disaggregated OLTP Systems
Aurora, Socrates, PolarDB, and Taurus.
2024-11-19
Modern Hardware for Future Databases
2024-11-06
How to Learn
Suggested reading material for various topics.
Philosophy of How to Learn
•
Storage
[draft]
•
SSDs
[draft]
•
Userland Disk I/O
•
Key-Value Storage Engines
[draft]
•
Consensus
2024-08-26
Erasure Coding for Distributed Systems
An overview of erasure coding, its trade-offs, and applications in distributed storage systems.
2024-08-13
Database Startups
2024-07-31
Data Replication Design Spectrum
Consistent replication algorithms can be placed on a sliding scale based on how they handle replica failures. Across the three common points on this spectrum, the resource efficiency, availability, and latency are compared, providing guidance for how to choose an appropriate replication algorithm for a use case.
2024-06-05
Building BerkeleyDB
A B-Tree tutorial series implementing an ABI-compatible BerkeleyDB clone.
Introduction
•
BerkeleyDB Autograder
[draft]
•
Page Format
•
Entry Format
•
API Basics
•
Point Reads
•
Range Reads
[draft]
2024-04-15
Calling OCaml from C
2023-12-13
S3-Compatible Cloud Storage Cost Calculator
2023-05-06
RDMA: Soft-RoCE Requires a Specific IPv6 Address
2022-12-16
Concurrent Operation Diagram Generator
FoundationDB’s use of SQLite’s btree as its own storage engine should not be used to support claims that SQLite has a high quality and performant storage implementation.
2022-06-05
Darwin’s Deceptive Durability
A reminder that macOS does not respect the usual ways of making data durable on disk.
2022-03-21
A Survey of Database TLS Libraries
2022-03-01
Deterministic Simulation Testing
A walkthrough of how and why complex infrastructure should be built with deterministic simulation, and how to make such tests as productive as possible for developers.
Motivation
[draft]
•
Scheduling
[draft]
•
BUGGIFY
•
Workloads
[draft]
Databases run on real machines, and aspects of those machines will affect the operation of the database.
If a database has a better understanding of the machine it is being run on, it can both leverage that for increased performance and caution operators about potential dangers.
An inventory of the frameworks and libraries to use when writing a network service in C++.
The Network Time Protocol is the most widely available and widely used time synchronization protocol, but it is claimed to offer the worst maximum clock error.