
This is almost what we do for TigerBeetle, a new distributed database being written in Zig. All memory is statically allocated at startup [1]. Thereafter, there are zero calls to malloc() or free(). We run a single-threaded control plane for a simple concurrency model, and because we use io_uring, multithreaded I/O is less of a necessary evil than it used to be.
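The "allocate everything at startup, then only recycle" pattern can be sketched roughly like this (a minimal illustration in Python, not TigerBeetle's actual Zig code; the pool name and sizes are made up):

```python
class MessagePool:
    """All buffers are created once, up front; none after startup."""

    def __init__(self, count: int, size: int) -> None:
        # The only allocation happens here, at startup.
        self.free = [bytearray(size) for _ in range(count)]

    def acquire(self) -> bytearray:
        # After startup we only recycle buffers, never allocate new ones.
        # An empty pool means the caller must wait or shed load, which is
        # where the natural backpressure comes from.
        if not self.free:
            raise RuntimeError("pool exhausted: apply backpressure")
        return self.free.pop()

    def release(self, buf: bytearray) -> None:
        self.free.append(buf)


pool = MessagePool(count=4, size=256)
buf = pool.acquire()
# ... fill buf, submit I/O, handle completion ...
pool.release(buf)
```

Because the pool is sized once, peak memory is known before the process accepts its first request, which is what makes the latency profile so predictable.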

I find that the design is more memory efficient because of these constraints, for example, our new storage engine can address 100 TiB of storage using only 1 GiB of RAM. Latency is predictable and gloriously smooth, and the system overall is much simpler and fun to program.

[1] “Let's Remix Distributed Database Design” https://www.youtube.com/channel/UC3TlyQ3h6lC_jSWust2leGg



> Latency is predictable and gloriously smooth, and the system overall is much simpler and fun to program.

This has also been my experience building a database in Zig. It's such a joy.


> for example, our new storage engine can address 100 TiB of storage using only 1 GiB of RAM.

I’m a little confused by this statement. I assume by “address” you mean indexing, and the size of an index is related to the number of entries, not the amount of data being indexed. (For example, you could trivially address 100 TiB using a single address width of memory if all 100 TiB belongs to the same key.)


> I’m a little confused by this statement. I assume by “address” you mean indexing, and the size of an index is related to the number of entries, not the amount of data being indexed.

Thanks for the question!

What's in view here is an LSM-tree database storage engine. These typically store keys between 8 and 32 bytes and values up to a few MiB.

In our case, the question is how much memory is required for the LSM-tree to be able to index 100 TiB worth of key/value storage, where:

  * keys are between 8 and 32 bytes,
  * values are between 8 and 128 bytes,
  * keys and values are stored in tables up to 64 MiB,
  * each table requires between 128 and 256 bytes of metadata to be kept in memory,
  * auxiliary data structures such as a mutable/immutable table must be kept in memory, and where
  * all memory required by the engine must be statically allocated.
That's a lot of small keys and values!
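A back-of-the-envelope check from the figures above (my arithmetic, using the upper-bound 256 bytes of metadata per table) shows why the in-memory table metadata fits comfortably inside 1 GiB:

```python
TiB = 2**40
MiB = 2**20

storage = 100 * TiB
table_size = 64 * MiB
metadata_per_table = 256  # upper bound from the list above

tables = storage // table_size          # tables needed to cover 100 TiB
metadata = tables * metadata_per_table  # resident metadata for all tables

print(tables)           # 1638400 tables
print(metadata // MiB)  # 400 MiB of metadata
```

That leaves the rest of the 1 GiB budget for the mutable/immutable tables and other auxiliary structures, all statically allocated.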

Typically, a storage system might require at least an order of magnitude more than 1 GiB of memory to keep track of that many keys using an LSM-tree as an index, even with dynamic allocation, which allocates only as needed.

Another way to think of this is as a filesystem, since it's a very similar problem. Imagine you stored 100 TiB worth of 4096 byte files in ZFS. How much RAM would that require for ZFS to be able to keep track of everything?
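To put a number on that analogy (my arithmetic, not a claim about ZFS internals): 100 TiB of 4096-byte files is an enormous number of objects for any filesystem to track.

```python
TiB = 2**40

files = (100 * TiB) // 4096  # 4096 bytes per file
print(files)  # 26843545600, i.e. ~26.8 billion files
```

Keeping per-file metadata in RAM for tens of billions of files is exactly the scale of indexing problem the engine has to solve within its 1 GiB budget.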


Thanks for taking the time to reply in detail! The metric makes much more sense when put into context.


It's a pleasure. Let me know if you have any more questions about TigerBeetle. Our design doc is also here: https://github.com/coilhq/tigerbeetle/blob/main/docs/DESIGN....



