A JavaScript implementation of the Logoot CRDT. There is a JavaScript companion library to this one at usecanvas/logoot-js.
This is a work-in-progress and is not battle-tested.
If available in Hex, the package can be installed as:
-
Add
logoot
to your list of dependencies inmix.exs
:def deps do [{:logoot, "~> 0.1.0"}] end
-
Ensure
logoot
is started before your application:def application do [applications: [:logoot]] end
Logoot is a conflict-free replicated data type that can be used to represent a sequence of atoms. Atoms in a Logoot sequence might be chunks of content or even characters in a string of text.
The key to Logoot is the way in which it generates identifiers for atoms. To understand this, here are a few definitions to start with:
identifier
A pair<p, s>
wherep
is an integer ands
is a globally unique site identifier. A "site" represents any independent copy of a given Logoot CRDT. One web client with independent editable views of the same sequence would be two separate sites on that client.position
A list of identifiers.atom identifier
A pair<pos, c>
wherepos
is a position, andc
is the value of a vector clock at a given site. The site maintains a vector clock that increases incrementally with each no atom identifier generated.sequence atom
A pair<ident, v>
whereident
is an atom identifier, andv
is any arbitrary value. This could be a single text character or a block in a text editor.sequence
A list of sequence atoms. This might represent a document or a list of blocks in an editor. Every sequence implicitly has a minimum sequence atom and a maximum sequence atom. All other atoms in the sequence are created somewhere between these two.
Because of how these atom identifiers are structured, they are totally ordered, as opposed to causally ordered. No identifier cares about the identifier before it once it's been created, and so tombstones are not necessary.
For the purposes of readability, I'm going to represent sequences in this document in Elixir terms.
Let's start with the empty sequence before any user has made edits to it:
[
{{[{0, 0}], 0}, nil}, # Minimum sequence atom
{{[{32767, 0}], 1}, nil} # Maximum sequence atom
]
Aside: Note that the integer in the maximum sequence atom's value is
32767
. This is chosen somewhat arbitrarily, but it is common for
implementations to use the maximum unsigned 16-bit integer (and the original
paper recommends it). One wouldn't want to choose an integer greater than any
implementation's maximum safe integer value, and all implementations that
communicate with one another must share this same maximum.
Now, a user at site 1
inserts the first line into their local copy of the
document:
[
{{[{0, 0}], 0}, nil}, # Minimum sequence atom
{{[{6589, 1}], 0}, "Hello, world!"},
{{[{32767, 0}], 1}, nil} # Maximum sequence atom
]
Because there is free space between the integer of the minimum sequence atom and the integer of the maximum sequence atom, Logoot chooses a random integer between the two (how this is chosen is somewhat arbitrary—it just must be between them) and ends up with the sequence identifier:
{
[
{
6589, # Number between min/max
1 # Site identifier
}
],
0 # Next value of site's vector clock
}
As a result, the document is properly sequenced. Ordering of sequence atoms is done by iterating over their position list and comparing first the integer, and then the site identifier if the integer is equal.
Note that vector clock values are not compared. Vector clock values are used to ensure unique atom identifiers, not for ordering.
Let's look at a more complex example. Start with a document that looks like this:
[
{{[{0, 0}]}, nil}, # Minimum sequence atom
{{[{1, 1}, {3, 2}], 5}, "Hello, world from site 2!"},
{{[{1, 1}, {5, 4}], 1}, "I came from site 4!"},
{{[{32767, 0}]}, nil} # Maximum sequence atom
]
Now, at site 3
, the user wants to insert a line between the two user-created
lines in the above sequence. Logoot iterates over the pairs of identifiers in
the "before" and "after" positions. Because the first identifier of each
position is {1, 1}
, Logoot can not insert an identifier directly between them,
so it moves on to the next pair, {3, 2}
and {5, 4}
. Because site 3's
site identifier is greater than site 2's, it can insert the identifier {3, 3}
here and preserve ordering, since {3, 2} < {3, 3} < {5, 4}
.
The resulting sequence would be (assuming 3's vector clock is at 1
):
[
{{[{0, 0}]}, nil}, # Minimum sequence atom
{{[{1, 1}, {3, 2}], 5}, "Hello, world from site 2!"},
{{[{1, 1}, {3, 3}], 1}, "Hello from site 3!"},
{{[{1, 1}, {5, 4}], 1}, "I came from site 4!"},
{{[{32767, 0}]}, nil} # Maximum sequence atom
]
Note that if this were actually site 1
, things would be different, because
{3, 2}
is not less than {3, 1}
. Instead, Logoot generates a random integer
between 3 and 5 (which is of course 4
), and our resulting identifier would be:
{[{1, 1}, {4, 1}], 1}
Hopefully this provides a good enough explanation of what Logoot is and why it may be an excellent option for a sequence CRDT. The paper presenting it is a relatively easy read, and you may also want to look at this project's Logoot.Sequence module and its tests.
- Make min and max implicit, do not force user to provide them.
- Prevent deleting min and max atoms.
- Make idempotent insert atom function.
- Make idempotent delete atom function.