IntersectMBO
diff --git a/‎.github/workflows/ci.yml
Lines changed: 5 additions & 0 deletions b/‎.github/workflows/ci.yml
diff --git a/‎CONTRIBUTING.md
Lines changed: 35 additions & 0 deletions b/‎CONTRIBUTING.md
diff --git a/‎docs/tech-reports/report/chapters/storage/ledgerdb.tex
Lines changed: 1 addition & 91 deletions b/‎docs/tech-reports/report/chapters/storage/ledgerdb.tex
diff --git a/‎docs/website/contents/about-ouroboros/utxo-hd.md
Lines changed: 33 additions & 0 deletions b/‎docs/website/contents/about-ouroboros/utxo-hd.md
diff --git a/‎docs/website/contents/for-developers/utxo-hd/Overview.md
Lines changed: 3 additions & 0 deletions b/‎docs/website/contents/for-developers/utxo-hd/Overview.md
@@ -76,6 +76,11 @@ jobs:
         cabal clean
         cabal update
 
+    - name: Install lmdb
+      run: |
+        sudo apt update
+        sudo apt install -y liblmdb-dev
+
     # We create a `dependencies.txt` file that can be used to index the cabal
     # store cache.
     #
 
@@ -132,6 +132,41 @@ cabal test ouroboros-consensus:test:consensus-test --test-show-details=direct
 Note the second one cannot be used when we want to provide CLI arguments to the
 test-suite.
 
+# Generating documentation and setting up hoogle
+
+The documentation contains some [tikz](https://tikz.net) figures that require
+some preprocessing for them to be displayed. To do this, use the documentation
+script:
+
+```bash
+./scripts/docs/haddocks.sh
+```
+
+Unless [`cabal-docspec`](https://github.com/phadej/cabal-extras/tree/master/cabal-docspec)
+is already in your `PATH` (e.g. when in a Nix shell), the script installs it
+from a binary, and then builds the haddocks for the project.
+
+It is often useful to have a
+[`hoogle`](https://github.com/ndmitchell/hoogle) server at hand, with the
+project's packages and their dependencies. We suggest installing
+[`cabal-hoogle`](https://github.com/kokobd/cabal-hoogle) from GitHub:
+
+```bash
+git clone git@github.com:kokobd/cabal-hoogle
+cd cabal-hoogle
+cabal install exe:cabal-hoogle
+```
+
+and then run `cabal-hoogle`:
+
+```bash
+cabal-hoogle generate
+cabal-hoogle run -- server --local
+```
+
+This will start a `hoogle` server at http://localhost:8080/ with the local
+packages and their dependencies.
+
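Before running the steps above, it can help to see which of the tools are already on your `PATH`. The following is a minimal sketch only: the helper name `have_tool` is hypothetical and is not part of `haddocks.sh` or of this repository.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the repository): succeeds if the
# given command can be found on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Report which of the tools mentioned in this section are available.
for tool in cabal-docspec cabal-hoogle; do
  if have_tool "$tool"; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: not on PATH"
  fi
done
```
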
 # Contributing to the code
 
 The following sections contain some guidelines that should be followed when
 
@@ -1,98 +1,8 @@
 \chapter{Ledger Database}
 \label{ledgerdb}
 
-The Ledger DB is responsible for the following tasks:
 
-\begin{enumerate}
-\item \textbf{Maintaining the ledger state at the tip}: Maintaining the ledger
-  state corresponding to the current tip in memory. When we try to extend our
-  chain with a new block fitting onto our tip, the block must first be validated
-  using the right ledger state, i.e., the ledger state corresponding to the tip.
-  The current ledger state is needed for various other purposes.
-
-\item \textbf{Maintaining the past $k$ ledger states}: As discussed in
-  \cref{consensus:overview:k}, we might roll back up to $k$ blocks when
-  switching to a more preferable fork. Consider the example below:
-  %
-  \begin{center}
-  \begin{tikzpicture}
-  \draw (0, 0) -- (50pt, 0) coordinate (I);
-  \draw (I) -- ++(20pt,  20pt) coordinate (C1) -- ++(20pt, 0) coordinate (C2);
-  \draw (I) -- ++(20pt, -20pt) coordinate (F1) -- ++(20pt, 0) coordinate (F2) -- ++(20pt, 0) coordinate (F3);
-  \node at (I)  {$\bullet$};
-  \node at (C1) {$\bullet$};
-  \node at (C2) {$\bullet$};
-  \node at (F1) {$\bullet$};
-  \node at (F2) {$\bullet$};
-  \node at (F3) {$\bullet$};
-  \node at (I) [above left] {$I$};
-  \node at (C1) [above] {$C_1$};
-  \node at (C2) [above] {$C_2$};
-  \node at (F1) [below] {$F_1$};
-  \node at (F2) [below] {$F_2$};
-  \node at (F3) [below] {$F_3$};
-  \draw (60pt, 50pt) node {$\overbrace{\hspace{60pt}}$};
-  \draw (60pt, 60pt) node[fill=white] {$k$};
-  \draw [dashed] (30pt, -40pt) -- (30pt, 45pt);
-  \end{tikzpicture}
-  \end{center}
-  %
-  Our current chain's tip is $C_2$, but the fork containing blocks $F_1$, $F_2$,
-  and $F_3$ is more preferable. We roll back our chain to the intersection point
-  of the two chains, $I$, which must be not more than $k$ blocks back from our
-  current tip. Next, we must validate block $F_1$ using the ledger state at
-  block $I$, after which we can validate $F_2$ using the resulting ledger state,
-  and so on.
-
-  This means that we need access to all ledger states of the past $k$ blocks,
-  i.e., the ledger states corresponding to the volatile part of the current
-  chain.\footnote{Applying a block to a ledger state is not an invertible
-  operation, so it is not possible to simply ``unapply'' $C_1$ and $C_2$ to
-  obtain $I$.}
-
-  Access to the last $k$ ledger states is not only needed for validating candidate
-  chains, but also by the:
-  \begin{itemize}
-  \item \textbf{Local state query server}: To query any of the past $k$ ledger
-    states (\cref{servers:lsq}).
-  \item \textbf{Chain sync client}: To validate headers of a chain that
-    intersects with any of the past $k$ blocks
-    (\cref{chainsyncclient:validation}).
-  \end{itemize}
-
-\item \textbf{Storing on disk}: To obtain a ledger state for the current tip of
-  the chain, one has to apply \emph{all blocks in the chain} one-by-one to the
-  initial ledger state. When starting up the system with an on-disk chain
-  containing millions of blocks, all of them would have to be read from disk and
-  applied. This process can take tens of minutes, depending on the storage and
-  CPU speed, and is thus too costly to perform on each startup.
-
-  For this reason, a recent snapshot of the ledger state should be periodically
-  written to disk. Upon the next startup, that snapshot can be read and used to
-  restore the current ledger state, as well as the past $k$ ledger states.
-\end{enumerate}
-
-Note that whenever we say ``ledger state'', we mean the
-\lstinline!ExtLedgerState blk! type described in \cref{storage:extledgerstate}.
-
-The above duties are divided across the following modules:
-
-\begin{itemize}
-\item \lstinline!LedgerDB.InMemory!: this module defines a pure data structure,
-  named \lstinline!LedgerDB!, to represent the last $k$ ledger states in memory.
-  Operations to validate and append blocks, to switch to forks, to look up
-  ledger states, \ldots{} are provided.
-\item \lstinline!LedgerDB.OnDisk!: this module contains the functionality to
-  write a snapshot of the \lstinline!LedgerDB! to disk and how to restore a
-  \lstinline!LedgerDB! from a snapshot.
-\item \lstinline!LedgerDB.DiskPolicy!: this module contains the policy that
-  determines when a snapshot of the \lstinline!LedgerDB! is written to disk.
-\item \lstinline!ChainDB.Impl.LgrDB!: this module is part of the Chain DB, and
-  is responsible for maintaining the pure \lstinline!LedgerDB! in a
-  \lstinline!StrictTVar!.
-\end{itemize}
-
-We will now discuss the modules listed above.
+THIS PART WAS PORTED TO THE HADDOCKS
 
 \section{In-memory representation}
 \label{ledgerdb:in-memory}
 
@@ -0,0 +1,33 @@
+# UTxO HD
+
+This document describes the design followed to move the ledger state
+from memory to disk.
+
+## Expected performance
+
+On a machine with 64 GB of memory and an AMD Ryzen 9 5900X processor, we
+obtained the following results when replaying and syncing from scratch up to
+slot 75M:
+
+
+|                  | Replay max mem | Replay time | Sync max mem | Sync time |
+|------------------|----------------|-------------|--------------|-----------|
+| Baseline         | 13 GB          | 1:51 h      | 15 GB        | 20:46 h   |
+| UTxO HD (in-mem) | 13 GB          | 2:50 h      | 16 GB        | 25:04 h   |
+| UTxO HD (LMDB)   | 8 GB           | 3:15 h      | 11.4 GB      | 25:50 h   |
+
+It is worth noting that these are single measurements, and they are only
+intended to provide an indication of the expected performance.
+
+These results were obtained around 18 January 2023.
+
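To put the table in relative terms, the slowdown of each UTxO HD run with respect to the baseline can be computed from the reported times. The sketch below is only an illustration: the function names `to_minutes` and `slowdown_pct` are hypothetical, the percentages are rounded down, and because the inputs are single measurements the results are indicative at best.

```bash
#!/usr/bin/env bash

# Convert an "H:MM" duration, as written in the table, into minutes.
# The 10# prefix forces base 10 so that values like "08" are not read as octal.
to_minutes() {
  local hours=${1%%:*} minutes=${1##*:}
  echo $(( 10#$hours * 60 + 10#$minutes ))
}

# Slowdown of a run relative to the baseline, as a percentage rounded down.
slowdown_pct() {
  local base run
  base=$(to_minutes "$1")
  run=$(to_minutes "$2")
  echo $(( (run - base) * 100 / base ))
}

slowdown_pct 1:51 2:50     # replay, UTxO HD (in-mem): prints 53
slowdown_pct 1:51 3:15     # replay, UTxO HD (LMDB):   prints 75
slowdown_pct 20:46 25:04   # sync,   UTxO HD (in-mem): prints 20
slowdown_pct 20:46 25:50   # sync,   UTxO HD (LMDB):   prints 24
```
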
+The plots below show how replaying and syncing a node from scratch progress
+over time, and how memory usage evolves.
+
+![replay times](/img/utxo-hd/utxo-hd-replay-01-19-23.png)
+
+![sync times](/img/utxo-hd/utxo-hd-sync-01-19-23.png)
+
+## References
+
+* [Storing the Cardano ledger state on disk: analysis and design options (An IOHK technical report)](/pdfs/utxo-db.pdf)
+* [Storing the Cardano ledger state on disk: API design concepts (An IOHK technical report)](/pdfs/utxo-db-api.pdf)
@@ -0,0 +1,3 @@
+# Overview
+
+TODO