diff --git a/pages/blog/bring-your-own-crdt.mdx b/pages/blog/bring-your-own-crdt.mdx index 6dba85e..563e0fa 100644 --- a/pages/blog/bring-your-own-crdt.mdx +++ b/pages/blog/bring-your-own-crdt.mdx @@ -38,11 +38,11 @@ But it got me thinking, can we bring the Replicache model to CRDTs? Can we give ## Bring your own CRDT -Yes we can! For lack of a better term, I'm calling it "CRDT substrate". It is a re-think on my earlier work ([LWW vs DAG](./lww-vs-dag)) to incoroprate named mutations, inspired by Replicache and [another earlier work](https://aphrodite.sh/docs/mutations-and-transactions). +Yes we can! For lack of a better term, I'm calling it "CRDT substrate". It is a re-think on my earlier work ([LWW vs DAG](./lww-vs-dag)) to incorporate named mutations, inspired by Replicache and [another earlier work](https://aphrodite.sh/docs/mutations-and-transactions). ## Building the Event Log -The basis of the model is an event log. Just like Replicache, we'll model the event log as `[mutation_name, ...args]` tuples. One thing we'll add to each event, however, is a pointer back to the event which preceeded it. This pointer will allow us to build a tree of events where forks represent concurrent edits. A traversal of this tree will create a total ordering of events and allow us to converge on a single state. +The basis of the model is an event log. Just like Replicache, we'll model the event log as `[mutation_name, ...args]` tuples. One thing we'll add to each event, however, is a pointer back to the event which preceded it. This pointer will allow us to build a tree of events where forks represent concurrent edits. A traversal of this tree will create a total ordering of events and allow us to converge on a single state. The tree is a CRDT given: - The tree is inflationary with a partial ordering, forming a semi-lattice @@ -102,7 +102,7 @@ const bareMutations = { async completeAllTodos(tx) { await tx.exec(`UPDATE todo SET completed = 1 WHERE completed = 0`); }, - async uncompleteAllTodos(tx) { + async incompleteAllTodos(tx) { await tx.exec(`UPDATE todo SET completed = 0 WHERE completed = 1`); }, async clearCompletedTodos(tx) { diff --git a/pages/blog/differential-dataflow.mdx b/pages/blog/differential-dataflow.mdx index c0c614f..6318bb4 100644 --- a/pages/blog/differential-dataflow.mdx +++ b/pages/blog/differential-dataflow.mdx @@ -8,7 +8,7 @@ import { Callout } from 'nextra/components' # Differential Dataflow For Mere Mortals -## Differental Dataflow +## Differential Dataflow Differential Dataflow is a technique for incrementally maintaining views and query results. diff --git a/pages/blog/distributed-recursive-ordering.mdx b/pages/blog/distributed-recursive-ordering.mdx index ad6cfe3..5340899 100644 --- a/pages/blog/distributed-recursive-ordering.mdx +++ b/pages/blog/distributed-recursive-ordering.mdx @@ -13,7 +13,7 @@ But how can we updating our recursive ordering to be a [CRDT](https://crdt.tech/ The problem with the current solution is that two peers could create a cycle in the ordering. -Take the follwing list: +Take the following list: ``` A -> B -> C ``` diff --git a/pages/blog/intro-to-crdts.mdx b/pages/blog/intro-to-crdts.mdx index 3653d5f..e175c12 100644 --- a/pages/blog/intro-to-crdts.mdx +++ b/pages/blog/intro-to-crdts.mdx @@ -71,7 +71,7 @@ A CRDT is a data type that can be: Going back to git, it's the same as giving every developer a copy of the repository and allowing each developer to make changes locally. 
At the end of the day, every developer in an organization can merge changes with every other developer however they like: pair-wise, round-robin, or through a central repository. Once all merges are complete, every developer will have the same state (assuming merge conflicts were deterministically resolved). Unlike git, a CRDT wouldn't hit merge conflicts and can merge out of order changes. -Let's see how this actually works and how you can implement it in the large (full blown applications) and in the small (individiual fields in a record). +Let's see how this actually works and how you can implement it in the large (full blown applications) and in the small (individual fields in a record). ## A Simple CRDT @@ -96,7 +96,7 @@ A table structured in this way can always be written to by separate nodes and me -A grow only table is intereseting but you'll eventually want to do one or more of the following: +A grow only table is interesting but you'll eventually want to do one or more of the following: 1. Modify a row 2. Run a reduce job over a set of rows to come up with a "view" of some object state @@ -112,7 +112,7 @@ Looking into the ways it can be incorrectly implemented will help us understand We'll take that "last write wins" is acceptable for the use case as a given. Now let's take some stabs at implementing a grow only table that supports last write replacements of rows. -The first solution would be to timestamp rows with the current system time of the node that did the write. Remember we need nodes to be able to work without internet connectivitiy so we only have access to that node's local clock. +The first solution would be to timestamp rows with the current system time of the node that did the write. Remember we need nodes to be able to work without internet connectivity so we only have access to that node's local clock. From simplest to most complex, the four common mistakes in a last write wins setup are: @@ -145,7 +145,7 @@ A\<-\>B merge and say B does not update the time for it's row to match the max b | --- | -------- | -------------------------------------------- | | x | **this** | 12:00:00 | -Now if Node C comes along with a modificaiton to the same row, but at time 12:30, it'll overwrite the current value. This, however, isn't the last write since A's write was later. We now have an inconsistent state. +Now if Node C comes along with a modification to the same row, but at time 12:30, it'll overwrite the current value. This, however, isn't the last write since A's write was later. We now have an inconsistent state. This is a trivial bug but covering it will help to understand logical clocks. To reiterate, the correct solution is to take the whole row including timestamp. @@ -252,7 +252,7 @@ In our proxy example: 1. When B receives new rows from C, it tags all those rows' `local_row_time` column with current time. 2. When A wants changes from B, it tracks `changes_since` by recording the largest `local_row_time` it has seen from B -3. Merging is done via `row_time` so last write semantics are still corretly respected. +3. Merging is done via `row_time` so last write semantics are still correctly respected. You can play with the idea below. Note that each node's local time is just a simple integer counter in this case. Works just as well as (actually better than) system time for our purposes and is a gentle introduction to the next error. @@ -271,7 +271,7 @@ System time moves with its own rules and breaks many assumptions we have about i ## About Time -So how do we handle time? 
CRDTs do not require a notion of time -- it isn't part of the strict mathematical definiton. [Mike Toomim](https://braid.org) has a great quote about how CRDTs collapse time, they remove time from the equation. +So how do we handle time? CRDTs do not require a notion of time -- it isn't part of the strict mathematical definition. [Mike Toomim](https://braid.org) has a great quote about how CRDTs collapse time, they remove time from the equation. Take a grow only set as an example. Time doesn't matter -- the state can always merge and merge at any point even without knowing about time. diff --git a/pages/blog/lww-vs-dag.mdx b/pages/blog/lww-vs-dag.mdx index c5d5c0d..1cc808c 100644 --- a/pages/blog/lww-vs-dag.mdx +++ b/pages/blog/lww-vs-dag.mdx @@ -1,6 +1,6 @@ --- date: February 17, 2023 -description: "Event sourcing is a fairly common design pattern for deriving application state from a history of events. But is it possible to turn an event log into a CRDT? To allow nodes to append events to the log, without coordinating with other nodes? Can we merge all these copies of the log, proces them, and arrive at the same application state across all nodes?" +description: "Event sourcing is a fairly common design pattern for deriving application state from a history of events. But is it possible to turn an event log into a CRDT? To allow nodes to append events to the log, without coordinating with other nodes? Can we merge all these copies of the log, process them, and arrive at the same application state across all nodes?" --- import Lww from '/components/lww-vs-dag/Lww'; @@ -9,7 +9,7 @@ import I from '/components/lww-vs-dag/Initialize'; # LWW vs P2P Event Log -Event sourcing is a fairly common design pattern for deriving application state from a history of events. But is it possible to turn an event log into a CRDT? To allow nodes to append events to the log, without coordinating with other nodes? Can we merge all these copies of the log, proces them, and arrive at the same application state across all nodes? +Event sourcing is a fairly common design pattern for deriving application state from a history of events. But is it possible to turn an event log into a CRDT? To allow nodes to append events to the log, without coordinating with other nodes? Can we merge all these copies of the log, process them, and arrive at the same application state across all nodes? This is possible. What's even better is that it is more powerful than other approaches, such as last write wins, to solving the problem of distributing application state. We'll do a brief review of how we can distribute state with last write wins registers, cover how that is a less powerful version of an event log and then build an distributed event log. diff --git a/pages/blog/sqlite-isnt-it.mdx b/pages/blog/sqlite-isnt-it.mdx index b79f3c2..4ecce44 100644 --- a/pages/blog/sqlite-isnt-it.mdx +++ b/pages/blog/sqlite-isnt-it.mdx @@ -75,7 +75,7 @@ These benefits come with huge downsides as in-memory model is effectively buildi Downsides: -1. Increased complexity. Must deal with the DB idiosynchracies as well as the ORM / in-memory model's. The in-memory model adds a whole new layer that requires mapping results to/from. +1. Increased complexity. Must deal with the DB idiosyncrasies as well as the ORM / in-memory model's. The in-memory model adds a whole new layer that requires mapping results to/from. 2. Transactions I. 
Many of these solutions do not provide an ability to update the in-memory model transactionally, leading to observation of stale state. 3. Transactions II. If many writes are made to the in-memory model then the lazy persist fails, what can be done? Can the in-memory model be rolled back? 4. Invariants. The DB will have foreign key constraints, check constraints, and unique indices. For the in-memory model to always succeed in persisting to the DB, these constraints must be replicated and checked there too. diff --git a/pages/docs/cr-sqlite/api-methods/crsql_changes.mdx b/pages/docs/cr-sqlite/api-methods/crsql_changes.mdx index edb63a5..b6b7b30 100644 --- a/pages/docs/cr-sqlite/api-methods/crsql_changes.mdx +++ b/pages/docs/cr-sqlite/api-methods/crsql_changes.mdx @@ -52,7 +52,7 @@ SELECT "table", "pk", "cid", "val", "col_version", "db_version", "site_id", "cl" AND "site_id" IS NOT ? ``` -This query will return all changes from the databse after a provided version and not applied by a given peer. +This query will return all changes from the database after a provided version and not applied by a given peer. - The `site_id IS NOT ?` is included as an optimization that reduces total sync roundtrips by one by preventing sites from receiving their own changes back. - Passing `site_id IS NULL` will only fetch changes that were made locally and not received through data sync (assuming the network layer is writing site ids which is optional). diff --git a/pages/docs/cr-sqlite/api-methods/crsql_tracked_peers.mdx b/pages/docs/cr-sqlite/api-methods/crsql_tracked_peers.mdx index 8b019f3..168dd4a 100644 --- a/pages/docs/cr-sqlite/api-methods/crsql_tracked_peers.mdx +++ b/pages/docs/cr-sqlite/api-methods/crsql_tracked_peers.mdx @@ -20,7 +20,7 @@ CREATE TABLE crsql_tracked_peers( 2. **version** - falls into two categories: For receive events: The clock value we last _received_ from that peer For send events: The clock value we last _sent_ that peer. I.e., For sends, this is _our_ db's clock not the peer's. -3. **tag** - used to differentiate sync sets. Tag is set to `0` for whole database syncs. Inserts into `crsql_changes` where tag is set to `-1` will not create tracking events. If you're only syncing subsets of the databse then this tag represents the subset you're syncing, such as an id of a "root node" of some sub-graph. Coming soon -- see [[docs/replicated-subgraphs]] +3. **tag** - used to differentiate sync sets. Tag is set to `0` for whole database syncs. Inserts into `crsql_changes` where tag is set to `-1` will not create tracking events. If you're only syncing subsets of the database then this tag represents the subset you're syncing, such as an id of a "root node" of some sub-graph. Coming soon -- see [[docs/replicated-subgraphs]] 4. **event** - 0 for receive events, 1 for send events. Values below 1000 are reserved for `crsqlite` ## Tag Values diff --git a/pages/docs/cr-sqlite/constraints.mdx b/pages/docs/cr-sqlite/constraints.mdx index 323762b..899d79d 100644 --- a/pages/docs/cr-sqlite/constraints.mdx +++ b/pages/docs/cr-sqlite/constraints.mdx @@ -6,5 +6,5 @@ Tables that have been upgraded to `crrs` may not have: 2. Unique constraints that are not the primary key 3. Check constraints that depend on columns other than the defined column -While there are techniques that allow preserving checked foreign keys, even under conditions created by eventual consistency, they all have nuanced tradeoffs. 
Throwing out these constraints for `crrs` is something that is clear and keeps the model easy to undestand. +While there are techniques that allow preserving checked foreign keys, even under conditions created by eventual consistency, they all have nuanced tradeoffs. Throwing out these constraints for `crrs` is something that is clear and keeps the model easy to understand. diff --git a/pages/docs/cr-sqlite/crdts/about.mdx b/pages/docs/cr-sqlite/crdts/about.mdx index 2ababc7..0d80f11 100644 --- a/pages/docs/cr-sqlite/crdts/about.mdx +++ b/pages/docs/cr-sqlite/crdts/about.mdx @@ -1,6 +1,6 @@ # About -`cr-sqlite` allows merging of databases by using conflit-free replicated data types. +`cr-sqlite` allows merging of databases by using conflict-free replicated data types. The big idea is that you define a table then say which CRDTs should be used to model it. diff --git a/pages/docs/cr-sqlite/crdts/sequence-crdts.mdx b/pages/docs/cr-sqlite/crdts/sequence-crdts.mdx index 940b23b..5e12aea 100644 --- a/pages/docs/cr-sqlite/crdts/sequence-crdts.mdx +++ b/pages/docs/cr-sqlite/crdts/sequence-crdts.mdx @@ -1,6 +1,6 @@ # Sequence CRDTs -Sequence CRDTs are primarily for modeling collaborative text documents. We're in the process of implemeting sequence crdt support. +Sequence CRDTs are primarily for modeling collaborative text documents. We're in the process of implementing sequence crdt support. Follow along [here](https://github.com/vlcn-io/cr-sqlite/issues/65). diff --git a/pages/docs/cr-sqlite/js/wasm.mdx b/pages/docs/cr-sqlite/js/wasm.mdx index e4b741f..15dacaa 100644 --- a/pages/docs/cr-sqlite/js/wasm.mdx +++ b/pages/docs/cr-sqlite/js/wasm.mdx @@ -77,7 +77,7 @@ const db = await sqlite.open("my-database.db"); Executes a SQL statement. Returns a promise that resolves when the statement has been executed. Returns no value. See execO and execA to receive values back. Rejects the promise on failure. -For better perfromance, preparing your queries is recommended. +For better performance, preparing your queries is recommended. ```ts function exec(sql: string, bind?: SQLiteCompatibleType[]): Promise; @@ -228,7 +228,7 @@ function onUpdate(fn: (type: UpdateType, dbName: string, tblName: string, rowid: ``` - `onUpdate` will eventually be superseeded by a native reactive query layer that allows you to subscribe to specific queries. A proof of concept implementation of this is available in the `react` integration via the `useQuery` hook. + `onUpdate` will eventually be superseded by a native reactive query layer that allows you to subscribe to specific queries. A proof of concept implementation of this is available in the `react` integration via the `useQuery` hook. **Params:** @@ -348,4 +348,4 @@ await stmt.finalize(null); ### TXAsync -All methods available on `DB` are also avaialble on `TXAsync`. The only difference is that `TXAsync` is intended to be used within a transaction. +All methods available on `DB` are also available on `TXAsync`. The only difference is that `TXAsync` is intended to be used within a transaction. diff --git a/pages/docs/cr-sqlite/networking/background.mdx b/pages/docs/cr-sqlite/networking/background.mdx index 72a9a31..246a7d9 100644 --- a/pages/docs/cr-sqlite/networking/background.mdx +++ b/pages/docs/cr-sqlite/networking/background.mdx @@ -18,7 +18,7 @@ You can see this idea in an [example implementation of whole crr sync](./whole-c ## Multi-Tenancy (Partial CRR Sync) -Most applications have a monotlithic database on the backend with data from many users in a sigle table or crr. 
In this model, you want to sync only the data that is relevant to a particular user. +Most applications have a monolithic database on the backend with data from many users in a single table or crr. In this model, you want to sync only the data that is relevant to a particular user. This can be achieved through a combination of specifying what queries to sync and row level security policies. diff --git a/pages/docs/cr-sqlite/networking/partial-crr-sync.mdx b/pages/docs/cr-sqlite/networking/partial-crr-sync.mdx index 432e20f..e044f23 100644 --- a/pages/docs/cr-sqlite/networking/partial-crr-sync.mdx +++ b/pages/docs/cr-sqlite/networking/partial-crr-sync.mdx @@ -3,7 +3,7 @@ Partial CRR sync is useful for cases where: 1. You need row level security -2. You have mutli-tenant database where data from many users exists in the same tables / crrs. +2. You have a multi-tenant database where data from many users exists in the same tables / crrs. 3. You want to sync only a subset of the data from the backend to the frontend due to dataset size. While it is possible to implement partial sync with the primitives available to you today, **it is not advised and not supported**. In **Q3/Q4 2023** we will be releasing primitives specifically intended to support partial sync, row level security, and large scale multi-tenant databases.