An ordinary Urbit ship should be able to host a large chatroom, among other scaling goals. At the moment, the performance of Urbit's networking is the bottleneck that limits this kind of scaling.
A number of incremental improvements need to be made to the implementation of Ames, Urbit's networking protocol, including fixes for flaky connections and more efficient use of timers. A second protocol called Fine will also be added for scalable content distribution. A multi-part project called "subscription reform" will allow apps to use this protocol effectively.
For scalability, basic solid-state publications are being prototyped. This is the next step after basic remote scry toward scalable data publishing on Urbit. These prototypes live in userspace; once we've proven the model with some real-world examples, we can build kernel support into Gall.
Tune %clog Backpressure Time and Space Constants
Tuning the constants in Ames's `%clog` system should reduce network flakiness, reduce the number of Arvo events, and improve publication bandwidth, especially until chat uses remote scry and solid-state subscriptions.
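To make "time and space constants" concrete, here is a minimal sketch of a backpressure check with one threshold of each kind. Everything in it is an assumption for illustration: the names `ClogPolicy`, `max_unacked`, `max_age_ms`, and `is_clogged` are hypothetical, the numeric values are made up, and combining the two thresholds with "or" is a guess rather than a description of how Ames actually decides to signal `%clog`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClogPolicy:
    """Hypothetical tunables: one 'space' constant and one 'time' constant."""
    max_unacked: int   # space: how many unacknowledged messages are tolerated
    max_age_ms: int    # time: how old the oldest unacknowledged message may be


def is_clogged(policy, unacked_sent_at_ms, now_ms):
    """Return True if the flow should signal backpressure.

    `unacked_sent_at_ms` holds the send time of every message that has not
    been acknowledged yet. Assumption: either threshold alone is enough.
    """
    if len(unacked_sent_at_ms) > policy.max_unacked:
        return True
    if unacked_sent_at_ms and now_ms - min(unacked_sent_at_ms) > policy.max_age_ms:
        return True
    return False


# Illustrative values only; these are not Ames's real constants.
policy = ClogPolicy(max_unacked=5, max_age_ms=10_000)
print(is_clogged(policy, [0, 100, 200], now_ms=5_000))   # few and recent
print(is_clogged(policy, [0, 100, 200], now_ms=20_000))  # oldest is too old
```

Thresholds that are too tight would signal backpressure on connections that are merely slow, while thresholds that are too loose would let queues grow on connections that are truly stuck. That trade-off is what tuning the constants adjusts.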
HTTP Mutable URL Caching
Urbit should be able to serve websites at custom URLs efficiently, so that a single ship can serve sites to the old web without needing a caching reverse proxy such as nginx or Varnish in front of it.
Basic Remote Scry Protocol
The "Fine" remote scry protocol will form the foundation of scalable content distribution in Urbit, by allowing many subscriber ships to read data efficiently from a publisher ship without incurring excessive load on the publisher.
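The reason many readers need not mean much publisher load can be sketched with a toy model. It assumes that a value bound to a scry path never changes once bound, so any layer between reader and publisher may cache it indefinitely. The classes `Publisher` and `CachingReader`, the path `/chat/1`, and the `lookups` counter are all hypothetical and do not reflect Fine's wire format or Vere's actual cache.

```python
class Publisher:
    """Toy publisher: binds values to paths, and each path is bound at most once."""

    def __init__(self):
        self._bindings = {}
        self.lookups = 0  # counts how often the publisher had to do real work

    def bind(self, path, value):
        if path in self._bindings:
            raise ValueError(f"{path} is already bound and cannot change")
        self._bindings[path] = value

    def scry(self, path):
        self.lookups += 1
        return self._bindings.get(path)  # None means "not bound (yet)"


class CachingReader:
    """Toy cache in front of a publisher (could be a runtime or a relay)."""

    def __init__(self, publisher):
        self._publisher = publisher
        self._cache = {}

    def scry(self, path):
        if path not in self._cache:
            value = self._publisher.scry(path)
            if value is None:
                return None  # do not cache misses: the path may be bound later
            self._cache[path] = value
        return self._cache[path]


pub = Publisher()
pub.bind("/chat/1", "hello")
reader = CachingReader(pub)
for _ in range(1000):
    reader.scry("/chat/1")
print(pub.lookups)  # the publisher was consulted once for 1000 reads
```

Because bindings are immutable in this model, the cache never needs invalidation, which is what keeps the design simple.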
Typed paths should improve performance and developer experience, and they could unblock a typed interface to publications.
Commit Before Compute
Current Vere needs to run Nock on an event before it can write it to disk. This places a lower bound on event latency, defined as time between receiving an event and performing its effects, at `D + N`, i.e. disk write latency (`D`) plus nock execution time (`N`). Commit-before-compute has amortized latency `max(D, N)`, which is usually significantly better.
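The two latency bounds can be compared with a few lines of arithmetic. This is a simplified per-event model, not a measurement: the function names and the millisecond figures are illustrative assumptions, and it ignores batching and queueing effects.

```python
def compute_then_commit_latency(disk_ms, nock_ms):
    """Current model: the two steps run one after the other, so costs add up."""
    return disk_ms + nock_ms


def commit_before_compute_latency(disk_ms, nock_ms):
    """Overlapped model: the disk write and the Nock run proceed together,
    and effects are released once the slower of the two has finished."""
    return max(disk_ms, nock_ms)


# Hypothetical (D, N) pairs in milliseconds.
for disk_ms, nock_ms in [(5, 2), (2, 5), (4, 4)]:
    serial = compute_then_commit_latency(disk_ms, nock_ms)
    overlapped = commit_before_compute_latency(disk_ms, nock_ms)
    print(f"D={disk_ms} N={nock_ms}: D+N={serial} max(D,N)={overlapped}")
```

The saving is `min(D, N)`, so it is largest when the two costs are similar (up to half the latency when `D = N`) and small when one cost dominates the other.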
Consolidate Packet Re-Send Timers
Ames sets a lot of timers to remind it to send packets again if they don't get acknowledged fast enough. Reducing the number of timers lowers disk write usage, improves quiescence (which should eventually let hosting providers use less RAM and thereby lower costs), and should improve overall performance on publisher ships.
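One way to picture consolidation is a queue that tracks every pending re-send deadline but only ever asks for a single wake-up, at the earliest deadline. The sketch below is an assumption about the general technique, not Ames's implementation: `ConsolidatedResender` and its methods are hypothetical, and times are plain integer milliseconds.

```python
import heapq


class ConsolidatedResender:
    """Tracks many re-send deadlines while needing only one timer."""

    def __init__(self):
        self._deadlines = []  # min-heap of (deadline_ms, packet_id)
        self._acked = set()   # packet ids that no longer need re-sending

    def sent(self, packet_id, now_ms, timeout_ms):
        """Record that a packet was sent and when it should be retried."""
        heapq.heappush(self._deadlines, (now_ms + timeout_ms, packet_id))

    def ack(self, packet_id):
        """Mark a packet as acknowledged; its heap entry is dropped lazily."""
        self._acked.add(packet_id)

    def next_wakeup(self):
        """The single time at which a timer needs to fire, or None."""
        while self._deadlines and self._deadlines[0][1] in self._acked:
            heapq.heappop(self._deadlines)
        return self._deadlines[0][0] if self._deadlines else None

    def due(self, now_ms):
        """Pop and return every unacknowledged packet whose deadline has passed."""
        out = []
        while self._deadlines and self._deadlines[0][0] <= now_ms:
            _, packet_id = heapq.heappop(self._deadlines)
            if packet_id not in self._acked:
                out.append(packet_id)
        return out


r = ConsolidatedResender()
r.sent(1, now_ms=0, timeout_ms=1000)
r.sent(2, now_ms=500, timeout_ms=1000)
r.sent(3, now_ms=200, timeout_ms=1000)
r.ack(1)
print(r.next_wakeup())  # one timer covers all three packets
```

With per-packet timers, three sends would mean three timer events. Here an acknowledgement that arrives in time costs no timer event at all, and the caller only resets one timer to `next_wakeup()` after each change.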
Encrypted Remote Scry
Arvo needs to encrypt scry paths and the values bound to them in order to use the remote scry protocol for private data such as group chats. This requires changes to the kernel to distribute encryption keys and let applications determine which other ships should have access to data in which publications.
Generalized Deferral Mechanism
Arvo's Behn vane (kernel module) currently serves two purposes: setting timers, and deferring tasks to later events. Deferral could be split out into a separate feature, which would help in refactoring Behn to be easier to both verify and optimize.
The scry namespace should be made available over HTTP, to improve developer experience and performance for Urbit clients ("airlocks").
A far-out proposal is to have the runtime perform serialization, packetization, encryption, and congestion control, instead of Arvo.
Add %pine Query-At-Latest Protocol
Remote scry on its own doesn't let one ship determine the latest state of a publication on another ship. This is solved by adding another protocol alongside the remote scry network protocol to implement `%pine` query-at-latest requests over the network, as pure reads.
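The gap that `%pine` fills can be shown with a toy publication. The sketch assumes that a plain remote scry read must name an exact revision, so a reader who does not know the current revision number has nothing to ask for. The `Publication` class and its `publish`, `read_at`, and `latest` methods are hypothetical and say nothing about the real protocol's paths or packet formats.

```python
class Publication:
    """Toy append-only publication: each revision is immutable once written."""

    def __init__(self):
        self._revisions = []

    def publish(self, value):
        """Append a new revision and return its number."""
        self._revisions.append(value)
        return len(self._revisions) - 1

    def read_at(self, revision):
        """Scry-style read: the caller must already know the revision number."""
        if 0 <= revision < len(self._revisions):
            return self._revisions[revision]
        return None

    def latest(self):
        """Query-at-latest: a pure read that reports the newest revision number."""
        if not self._revisions:
            return None
        return len(self._revisions) - 1


pub = Publication()
pub.publish("first post")
pub.publish("second post")

newest = pub.latest()              # ask which revision is current
print(newest, pub.read_at(newest)) # then read it at that exact revision
```

Neither call changes publisher state, which matches the "pure reads" framing above. Note that the answer to `latest()` can go stale as soon as a new revision is published, unlike the answer to `read_at()`.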
Refactor Ames Vane
The Ames vane could be shorter, easier to read, more performant, and easier to prove correct.
Shared Memory IPC
Vere's two processes, Urth (the I/O process) and Mars (the Nock worker process), communicate using a custom noun-based interprocess communication (IPC) protocol. This currently uses the Unix stdin and stdout, but using shared memory instead would make IPC significantly faster, reducing event processing latency and improving overall data throughput.
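As a rough illustration of the idea, the sketch below passes a length-prefixed message through a shared memory segment using Python's standard `multiprocessing.shared_memory` module. It is not Vere's design: Vere is not written in Python, opaque bytes stand in for serialized nouns, the 4-byte length header and the helper names are assumptions, and a real implementation would also need synchronization between writer and reader, which is omitted here. For simplicity both ends run in one process.

```python
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct(">I")  # hypothetical framing: 4-byte big-endian length


def write_message(shm, payload):
    """Copy a length-prefixed payload into the shared segment."""
    if HEADER.size + len(payload) > shm.size:
        raise ValueError("payload does not fit in the shared segment")
    HEADER.pack_into(shm.buf, 0, len(payload))
    shm.buf[HEADER.size:HEADER.size + len(payload)] = payload


def read_message(shm):
    """Read the length prefix, then copy out exactly that many bytes."""
    (length,) = HEADER.unpack_from(shm.buf, 0)
    return bytes(shm.buf[HEADER.size:HEADER.size + length])


def roundtrip(payload, size=4096):
    """Write through one handle and read through a second handle that is
    attached by name, the way a separate process would attach."""
    writer = shared_memory.SharedMemory(create=True, size=size)
    try:
        write_message(writer, payload)
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return read_message(reader)
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # remove the segment once both sides are done


print(roundtrip(b"event data"))
```

The length prefix is needed because the operating system may round the segment up to a page boundary, so the segment size alone does not tell the reader where the message ends. The potential speedup over stdin and stdout comes from skipping the copies through kernel pipe buffers.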
The way ships ping their sponsors doesn't always work well behind home routers or other NATs, causing flaky connections.
Symmetric routing should improve several aspects of Ames networking: it enables star packet forwarding, reduces flakiness behind firewalls, and improves peer discovery.
Urbit's timer system could be better in several ways.
Typed Interface to Solid-State Publications
A typed interface to solid-state publications should improve developer experience and performance (by avoiding runtime typechecks and type coercions).
Add Urth-to-Urth Network Push Sessions
Solid-state publications that need low latency, such as chat, can't use the remote scry protocol without adding a new protocol to "push" updates to subscribers as soon as they are created. This protocol is Urth-to-Urth, and opt-in, to ensure naive runtimes still work without it (both as publisher and subscriber).