Roadmap

Increase Runtime Data Capacity

Summary

There are three main phases to increasing the amount of data the runtime can manage:

  • use pointer compression to max out the 32-bit allocator
  • improve auxiliary tooling to handle more data than fits in RAM
  • 64-bit or other more experimental increases in data storage

Pointer compression, detailed in its own sections, can increase the amount of data stored in Vere by roughly 12x. It can bring us from a 2GB "loom" (memory arena) to a 16GB loom, an 8x increase, whose cells are two-thirds of their current size, so about 1.5x as many cells fit in the same space. This is large enough for effectively unlimited amounts of chat and notebook data, and it should also be enough for most nodes involved in the Uqbar network, with the possible exception of archive nodes.
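The "roughly 12x" figure is consistent with simple arithmetic on the numbers given here. The sketch below is only an illustration: `capacity_gain` is a name made up for this example, and it assumes the data is dominated by cells, so that the whole arena benefits from the smaller cell size.

```python
from fractions import Fraction

def capacity_gain(old_loom_gb, new_loom_gb, cell_size_ratio):
    """Return how many times more cells fit after the change.

    cell_size_ratio is (new cell size / old cell size). Hypothetical helper;
    assumes the loom is filled almost entirely with cells.
    """
    arena_growth = Fraction(new_loom_gb, old_loom_gb)  # 16GB / 2GB = 8
    density_gain = 1 / Fraction(cell_size_ratio)       # 1 / (2/3) = 3/2
    return arena_growth * density_gain

gain = capacity_gain(2, 16, Fraction(2, 3))
print(gain)  # 12
```

Atom-heavy data would see less than 12x under this model, since only the arena growth would apply to it.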

The first stage of pointer compression is being worked on currently. An 8GB loom is in testing and will likely be released within the next couple of runtime releases.

Once the loom holds 8GB or 16GB, it will be common for Urbit to hold more data than fits in RAM. This presents several new problems to solve:

  • demand paging (only load nouns into RAM on an as-needed basis, leaving the rest on disk)
  • snapshot management (taking an incremental snapshot is more complex)
  • tooling (|meld, |pack, garbage collection, and deserialization of large portable snapshots will need to be rewritten to do their bookkeeping differently)

Only once these tools can scale up to larger-than-RAM data can the system effectively use more than 16GB of memory, so that is when switching to a 64-bit interpreter might make sense. Managing a large Arvo snapshot involves a lot more than just addressing the memory. By the time the first two phases are complete, New Mars might be ready. Since New Mars is a 64-bit interpreter, it might then make sense not to build a 64-bit version of Vere at all; if New Mars is not yet ready, a 64-bit Vere would make more sense.

An intermediate approach would be to use a 32-bit arena for cells, direct atoms, and double-pointers to indirect atoms, but store large (indirect) atoms in a separate 64-bit arena. Combined with pointer compression, this might provide enough capacity to kick the can down the road for many more years, depending on the common use cases of Urbit. If Uqbar has millions of accounts in its Merkle tree, this might not be sufficient, but if people just want to store their director's cut edition of Shrek 2, that could be a large atom stored outside the loom, and the loom itself could remain 32-bit.
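The split-arena idea can be pictured with a toy model. This is a sketch under stated assumptions, not Vere's design: `SplitAtomStore`, its methods, and the tagged-tuple references are all invented for illustration, and the 31-bit limit for direct atoms is assumed. Small atoms are stored inline, while large atoms live in a separate byte arena and are reached through a small handle table that stands in for the double-pointers kept in the 32-bit loom.

```python
class SplitAtomStore:
    """Toy model: small references in the loom, big atoms stored elsewhere."""

    DIRECT_MAX = (1 << 31) - 1  # assumption: direct atoms fit in 31 bits

    def __init__(self):
        # Handle table: would live in the 32-bit loom. Each entry records
        # where the atom's bytes sit in the separate large arena.
        self.handles = []
        # Stand-in for the separate 64-bit arena that holds large atoms.
        self.big_arena = bytearray()

    def put(self, value):
        """Store a natural number and return a small reference to it."""
        if value < 0:
            raise ValueError("atoms are natural numbers")
        if value <= self.DIRECT_MAX:
            return ("direct", value)
        data = value.to_bytes((value.bit_length() + 7) // 8, "little")
        offset = len(self.big_arena)
        self.big_arena += data
        self.handles.append((offset, len(data)))
        return ("indirect", len(self.handles) - 1)

    def get(self, ref):
        """Resolve a reference back to the number it denotes."""
        kind, payload = ref
        if kind == "direct":
            return payload
        offset, length = self.handles[payload]
        return int.from_bytes(self.big_arena[offset:offset + length], "little")

store = SplitAtomStore()
small = store.put(42)
large = store.put(1 << 100)
print(small[0], large[0])  # direct indirect
```

In this model the reference to a large atom stays small no matter how big the atom is, which is why the loom itself could remain 32-bit.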

Building a 64-bit Vere is not an insurmountable project, but the result would almost certainly be significantly slower than the current interpreter. Cache locality is the limiting factor in much modern software, and switching from 32-bit to 64-bit would roughly halve the amount of noun data that fits in cache.

Projects

Completed

Demand Paging

duration

1-3 Months

Demand paging refers to the ability to load only needed pages of memory into RAM, leaving other pages on disk, to reduce memory use. Operating systems almost always include this feature. Urbit does not include it yet, but it will need to, since Urbit is a "single-level store".
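As an illustration of the operating-system feature described above, the following sketch maps a file into memory and touches a single page. It does not show how Vere implements demand paging; the file name and the helper names `make_snapshot` and `peek` are hypothetical. With a memory-mapped file, the OS typically reads a page from disk only when it is first accessed.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE

def make_snapshot(path, pages):
    """Write a toy 'snapshot' in which every byte of page i equals i % 256."""
    with open(path, "wb") as f:
        for i in range(pages):
            f.write(bytes([i % 256]) * PAGE)

def peek(path, page_index):
    """Map the whole file, but touch only the first byte of one page."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as arena:
            return arena[page_index * PAGE]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy-snapshot.bin")
    make_snapshot(path, pages=64)
    value = peek(path, 5)

print(value)  # 5
```

Mapping the file does not by itself copy all 64 pages into the process; which pages become resident is left to the operating system.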

Completed

Pointer Compression: 8GB Loom

start date

2021-01-01

end date

2023-03-01

This project raises the amount of data the runtime can manage from 2GB to 8GB. It is the first phase of the Increase Runtime Data Capacity arc.

Current

New Mars Snapshot Design

duration

2 Months

start date

2023-01-01

end date

2023-04-30

New Mars has a novel scheme for ensuring copy-on-write semantics for memory pages in a way that minimizes disk write amplification when taking an incremental snapshot.
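The New Mars scheme itself is not described here, so the sketch below shows only the generic idea that incremental snapshots build on: track which pages have been written since the last snapshot and persist only those. All names (`PagedArena`, `write`, `snapshot`) and the 4096-byte page size are assumptions made for illustration, and a plain dictionary stands in for the disk.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model

class PagedArena:
    """Toy arena that remembers which pages changed since the last snapshot."""

    def __init__(self, pages):
        self.mem = bytearray(pages * PAGE_SIZE)
        self.dirty = set()

    def write(self, offset, data):
        """Write bytes into the arena and mark every touched page dirty."""
        if not data:
            return
        if offset < 0 or offset + len(data) > len(self.mem):
            raise ValueError("write outside the arena")
        self.mem[offset:offset + len(data)] = data
        first = offset // PAGE_SIZE
        last = (offset + len(data) - 1) // PAGE_SIZE
        self.dirty.update(range(first, last + 1))

    def snapshot(self, disk):
        """Copy only dirty pages into `disk` and return how many were written."""
        for page in self.dirty:
            start = page * PAGE_SIZE
            disk[page] = bytes(self.mem[start:start + PAGE_SIZE])
        written = len(self.dirty)
        self.dirty.clear()
        return written

arena = PagedArena(pages=1024)
disk = {}
arena.write(10, b"hello")          # touches page 0 only
arena.write(4090, b"0123456789")   # straddles pages 0 and 1
first_pass = arena.snapshot(disk)
second_pass = arena.snapshot(disk)
print(first_pass, second_pass)  # 2 0
```

Only 2 of the 1024 pages reach the simulated disk, and a second snapshot with no intervening writes stores nothing.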

Current

New Mars

start date

2022-01-01

end date

2024

New Mars is an experimental rewrite of the Nock interpreter intended for much higher performance.

Next Up

Pointer Compression: Cell Compression

duration

2-4 Months

owner

TBD

An optimization would be to ensure that any given page of memory inside the "loom" (noun memory arena) stores either only cells (pairs) or only atoms (numbers), never a mix of the two.
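A minimal sketch of page segregation follows, assuming a simple bump allocator. The class `SegregatedLoom`, the page size in words, and the string tags are all hypothetical and are not taken from Vere. Each page is tagged with one kind when it is opened, so the kind of any allocation can be looked up from its address alone.

```python
PAGE_WORDS = 1024  # assumed page size, in words, for this toy model

class SegregatedLoom:
    """Toy allocator in which each page holds only cells or only atoms."""

    def __init__(self):
        self.page_kind = []   # page index -> "cell" or "atom"
        self.open_page = {}   # kind -> (page index, words already used)

    def alloc(self, kind, words):
        """Reserve `words` words on a page of the given kind; return the address."""
        if kind not in ("cell", "atom"):
            raise ValueError("unknown kind")
        if not 1 <= words <= PAGE_WORDS:
            raise ValueError("allocation must fit in one page")
        # With no open page of this kind, pretend the current one is full.
        page, used = self.open_page.get(kind, (None, PAGE_WORDS))
        if used + words > PAGE_WORDS:
            page, used = len(self.page_kind), 0
            self.page_kind.append(kind)
        self.open_page[kind] = (page, used + words)
        return page * PAGE_WORDS + used

    def kind_of(self, addr):
        """Look up whether an address belongs to a cell page or an atom page."""
        return self.page_kind[addr // PAGE_WORDS]

loom = SegregatedLoom()
a = loom.alloc("cell", 2)
b = loom.alloc("atom", 3)
c = loom.alloc("cell", 2)
print(a, b, c)  # 0 1024 2
```

The two cells land next to each other on page 0, while the atom is placed on its own page.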

Next Up

Memory-Efficient Meld

duration

1-2 Months

owner

TBD

A more memory-efficient implementation of `|meld` should reduce dangerous memory pressure, which in turn would reduce maintenance burden.

Future

Loom Tool Scaling

duration

2-4 Months

owner

TBD

Once the loom holds 8GB or 16GB, it will be common for Urbit to hold more data than fits in RAM. This presents several new problems for tools such as `|meld`, `|pack`, and garbage collection, which will need to do their bookkeeping differently.

Future

Pointer Compression: 16GB Loom

duration

1 Month

owner

TBD

Once cells have all been reduced to two 64-bit words, Vere could align all allocations at 128 bits instead of 64. This would double the amount of addressable memory to 16GB.
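The effect of alignment on addressable memory can be checked with a short calculation. The sketch assumes 30 usable pointer bits; that number is not stated here and is inferred from the 8GB figure at 64-bit alignment. The helper name `addressable_bytes` is made up for this example.

```python
GIB = 1 << 30

def addressable_bytes(pointer_bits, alignment_bits):
    """Bytes reachable when each pointer value names one aligned slot.

    pointer_bits is an assumed count of usable pointer bits;
    alignment_bits is the allocation alignment from the text.
    """
    return (1 << pointer_bits) * (alignment_bits // 8)

at_64 = addressable_bytes(30, 64) // GIB
at_128 = addressable_bytes(30, 128) // GIB
print(at_64, at_128)  # 8 16
```

Doubling the alignment doubles the reach of the same pointer width, at the cost of rounding every allocation up to the larger slot size.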