This seems to be a holy grail, to be honest! Super-simple database representatio...

josephg · on Sept 27, 2024

Author here. Thanks! Yeah this is my hope too.

Egwalker has one other advantage here: the data format will be stable and consistent. With CRDTs, every different CRDT algorithm (Yjs, Automerge/RGA, Fugue, etc.) actually stores different fields on disk. So if someone figures out a new way to make text editing work better, we need to rip up our file formats and network protocols.

Egwalker just stores the editing events in their original form (e.g. insert “a” at position 100). It uses a CRDT implementation in memory to merge concurrent changes (and everyone needs to use the same CRDT algorithm for convergence). But the network protocol and file format are stable no matter what algorithm you use.
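
To make "events in their original form" concrete, here is a minimal sketch of what such a log could look like. It is not Egwalker's actual format: the `Event` fields and the `apply_event`/`replay` helpers are made up for illustration, and it only handles a purely sequential history. A real log would also have to record which events each edit came after, so that concurrent edits can be detected and merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One editing event, stored exactly as the user performed it (hypothetical shape)."""
    kind: str        # "ins" or "del"
    pos: int         # index in the document at the time of the edit
    text: str = ""   # inserted text (used by "ins")
    length: int = 0  # number of characters removed (used by "del")


def apply_event(doc: str, ev: Event) -> str:
    """Apply a single event to a plain string."""
    if ev.kind == "ins":
        return doc[:ev.pos] + ev.text + doc[ev.pos:]
    if ev.kind == "del":
        return doc[:ev.pos] + doc[ev.pos + ev.length:]
    raise ValueError(f"unknown event kind: {ev.kind}")


def replay(events: list[Event]) -> str:
    """Rebuild the text from a sequential (non-concurrent) event log."""
    doc = ""
    for ev in events:
        doc = apply_event(doc, ev)
    return doc


log = [
    Event("ins", 0, text="helo"),
    Event("ins", 3, text="l"),
    Event("del", 0, length=1),
    Event("ins", 0, text="H"),
]
print(replay(log))
```

Nothing in the log refers to algorithm-specific metadata, which is why the stored form would not need to change if the in-memory merge algorithm did.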

vlovich123 · on Sept 28, 2024

I’ve loved learning all your detailed info on CRDT work. Thank you for progressing the field!

Since it stores all the editing events, does this mean that the complexity of opening a document is at least O(N) in terms of number of edits? Or are there interim snapshots, merging, or intelligent range computations to reduce the number of edits that need to be processed?

josephg · on Sept 28, 2024

You can just store a snapshot on disk (i.e., the raw text) and load that directly. You only ever need to look at historical edits when merging concurrent changes into the local document state. (And that's pretty rare in practice.)
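
As a rough illustration of the snapshot idea (the file names and both helper functions are assumptions for this sketch, not part of any real tool), the text and the event log can live in separate files so that opening a document never reads the log:

```python
from pathlib import Path


def save_document(folder: Path, text: str, event_lines: list[str]) -> None:
    """Write the current text and the event log side by side (hypothetical layout)."""
    (folder / "snapshot.txt").write_text(text, encoding="utf-8")
    (folder / "events.log").write_text("\n".join(event_lines), encoding="utf-8")


def open_document(folder: Path) -> str:
    """Opening only reads the snapshot; the cost does not depend on the edit count."""
    return (folder / "snapshot.txt").read_text(encoding="utf-8")
```

The log would only be read later, if a concurrent change has to be merged.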

Even when that happens, the algorithm only needs to look at operations as far back as the most recent "fork point" between the two branches in order to merge. (And we can compute that fork point in O(n) time - where n is the number of events that have happened since then). It's usually very fast.
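
One way to picture the fork-point search is sketched below. It assumes events are numbered in an order where parents always come before children, and that `parents[i]` lists the events that event `i` directly follows; the function name and this representation are illustrative assumptions, not the real implementation. It walks backwards from both branch heads, newest event first, and stops as soon as everything left to visit is common history. This unoptimized version uses a heap plus a linear scan per step, so it does not match the O(n) bound quoted above, but it has the same property of never walking past the fork.

```python
import heapq

A, B, SHARED = 1, 2, 3  # bit flags: A | B == SHARED


def diff_since_fork(parents, head_a, head_b):
    """Return (events only in branch A, events only in branch B).

    parents[i] holds the parent indices of event i; every parent index
    is smaller than i. head_a and head_b are lists of head event indices.
    """
    heap = []  # max-heap on event index, via negated keys
    for i in head_a:
        heapq.heappush(heap, (-i, A))
    for i in head_b:
        heapq.heappush(heap, (-i, B))

    only_a, only_b = [], []
    # Once every queued event is shared, all remaining history is shared too.
    while any(flag != SHARED for _, flag in heap):
        neg, flag = heapq.heappop(heap)
        # The same event may have been reached along several paths.
        while heap and heap[0][0] == neg:
            flag |= heapq.heappop(heap)[1]
        idx = -neg
        if flag == A:
            only_a.append(idx)
        elif flag == B:
            only_b.append(idx)
        for p in parents[idx]:
            heapq.heappush(heap, (-p, flag))
    return sorted(only_a), sorted(only_b)


# History: 0 <- 1, then two branches fork from 1: event 2 and event 3.
parents = [[], [0], [1], [1]]
print(diff_since_fork(parents, [2], [3]))
```

In this example event 0 is queued but never expanded, because by then everything left in the queue is known to be shared.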

In an application like google docs or a wiki, the browser will usually never need to look at any historical changes at all in order to edit a document.

vlovich123 · on Sept 28, 2024

Very clever idea. Thanks for explaining