Commit a851cd7

UTXO-HD
1 parent decf7f5 commit a851cd7

322 files changed

+26464
-7661
lines changed

.github/workflows/ci.yml

Lines changed: 5 additions & 0 deletions
@@ -76,6 +76,11 @@ jobs:
7676
cabal clean
7777
cabal update
7878
79+
- name: Install lmdb
80+
run: |
81+
sudo apt update
82+
sudo apt install liblmdb-dev
83+
7984
# We create a `dependencies.txt` file that can be used to index the cabal
8085
# store cache.
8186
#

CONTRIBUTING.md

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -132,6 +132,41 @@ cabal test ouroboros-consensus:test:consensus-test --test-show-details=direct
132132
Note the second one cannot be used when we want to provide CLI arguments to the
133133
test-suite.
134134

135+
# Generating documentation and setting up hoogle
136+
137+
The documentation contains some [tikz](https://tikz.net) figures that require
138+
some preprocessing for them to be displayed. To do this, use the documentation
139+
script:
140+
141+
```bash
142+
./scripts/docs/haddocks.sh
143+
```
144+
145+
Unless it is already in your `PATH` (e.g. when in a Nix shell), this will install
146+
[`cabal-docspec`](https://github.com/phadej/cabal-extras/tree/master/cabal-docspec)
147+
from a binary, and then build the haddocks for the project.
148+
149+
It is often useful to have a
150+
[`hoogle`](https://github.com/ndmitchell/hoogle) server at hand, with the
151+
packages and their dependencies. We suggest installing
152+
[`cabal-hoogle`](https://github.com/kokobd/cabal-hoogle) from GitHub:
153+
154+
```bash
155+
git clone git@github.com:kokobd/cabal-hoogle
156+
cd cabal-hoogle
157+
cabal install exe:cabal-hoogle
158+
```
159+
160+
and then run `cabal-hoogle`:
161+
162+
```bash
163+
cabal-hoogle generate
164+
cabal-hoogle run -- server --local
165+
```
166+
167+
This will start a `hoogle` server at http://localhost:8080/ with the local
168+
packages and their dependencies.
169+
135170
# Contributing to the code
136171

137172
The following sections contain some guidelines that should be followed when

docs/tech-reports/report/chapters/storage/ledgerdb.tex

Lines changed: 1 addition & 91 deletions
Original file line numberDiff line numberDiff line change
@@ -1,98 +1,8 @@
11
\chapter{Ledger Database}
22
\label{ledgerdb}
33

4-
The Ledger DB is responsible for the following tasks:
54

6-
\begin{enumerate}
7-
\item \textbf{Maintaining the ledger state at the tip}: Maintaining the ledger
8-
state corresponding to the current tip in memory. When we try to extend our
9-
chain with a new block fitting onto our tip, the block must first be validated
10-
using the right ledger state, i.e., the ledger state corresponding to the tip.
11-
The current ledger state is needed for various other purposes.
12-
13-
\item \textbf{Maintaining the past $k$ ledger states}: As discussed in
14-
\cref{consensus:overview:k}, we might roll back up to $k$ blocks when
15-
switching to a more preferable fork. Consider the example below:
16-
%
17-
\begin{center}
18-
\begin{tikzpicture}
19-
\draw (0, 0) -- (50pt, 0) coordinate (I);
20-
\draw (I) -- ++(20pt, 20pt) coordinate (C1) -- ++(20pt, 0) coordinate (C2);
21-
\draw (I) -- ++(20pt, -20pt) coordinate (F1) -- ++(20pt, 0) coordinate (F2) -- ++(20pt, 0) coordinate (F3);
22-
\node at (I) {$\bullet$};
23-
\node at (C1) {$\bullet$};
24-
\node at (C2) {$\bullet$};
25-
\node at (F1) {$\bullet$};
26-
\node at (F2) {$\bullet$};
27-
\node at (F3) {$\bullet$};
28-
\node at (I) [above left] {$I$};
29-
\node at (C1) [above] {$C_1$};
30-
\node at (C2) [above] {$C_2$};
31-
\node at (F1) [below] {$F_1$};
32-
\node at (F2) [below] {$F_2$};
33-
\node at (F3) [below] {$F_3$};
34-
\draw (60pt, 50pt) node {$\overbrace{\hspace{60pt}}$};
35-
\draw (60pt, 60pt) node[fill=white] {$k$};
36-
\draw [dashed] (30pt, -40pt) -- (30pt, 45pt);
37-
\end{tikzpicture}
38-
\end{center}
39-
%
40-
Our current chain's tip is $C_2$, but the fork containing blocks $F_1$, $F_2$,
41-
and $F_3$ is more preferable. We roll back our chain to the intersection point
42-
of the two chains, $I$, which must be not more than $k$ blocks back from our
43-
current tip. Next, we must validate block $F_1$ using the ledger state at
44-
block $I$, after which we can validate $F_2$ using the resulting ledger state,
45-
and so on.
46-
47-
This means that we need access to all ledger states of the past $k$ blocks,
48-
i.e., the ledger states corresponding to the volatile part of the current
49-
chain.\footnote{Applying a block to a ledger state is not an invertible
50-
operation, so it is not possible to simply ``unapply'' $C_1$ and $C_2$ to
51-
obtain $I$.}
52-
53-
Access to the last $k$ ledger states is not only needed for validating candidate
54-
chains, but also by the:
55-
\begin{itemize}
56-
\item \textbf{Local state query server}: To query any of the past $k$ ledger
57-
states (\cref{servers:lsq}).
58-
\item \textbf{Chain sync client}: To validate headers of a chain that
59-
intersects with any of the past $k$ blocks
60-
(\cref{chainsyncclient:validation}).
61-
\end{itemize}
62-
63-
\item \textbf{Storing on disk}: To obtain a ledger state for the current tip of
64-
the chain, one has to apply \emph{all blocks in the chain} one-by-one to the
65-
initial ledger state. When starting up the system with an on-disk chain
66-
containing millions of blocks, all of them would have to be read from disk and
67-
applied. This process can take tens of minutes, depending on the storage and
68-
CPU speed, and is thus too costly to perform on each startup.
69-
70-
For this reason, a recent snapshot of the ledger state should be periodically
71-
written to disk. Upon the next startup, that snapshot can be read and used to
72-
restore the current ledger state, as well as the past $k$ ledger states.
73-
\end{enumerate}
74-
75-
Note that whenever we say ``ledger state'', we mean the
76-
\lstinline!ExtLedgerState blk! type described in \cref{storage:extledgerstate}.
77-
78-
The above duties are divided across the following modules:
79-
80-
\begin{itemize}
81-
\item \lstinline!LedgerDB.InMemory!: this module defines a pure data structure,
82-
named \lstinline!LedgerDB!, to represent the last $k$ ledger states in memory.
83-
Operations to validate and append blocks, to switch to forks, to look up
84-
ledger states, \ldots{} are provided.
85-
\item \lstinline!LedgerDB.OnDisk!: this module contains the functionality to
86-
write a snapshot of the \lstinline!LedgerDB! to disk and how to restore a
87-
\lstinline!LedgerDB! from a snapshot.
88-
\item \lstinline!LedgerDB.DiskPolicy!: this module contains the policy that
89-
determines when a snapshot of the \lstinline!LedgerDB! is written to disk.
90-
\item \lstinline!ChainDB.Impl.LgrDB!: this module is part of the Chain DB, and
91-
is responsible for maintaining the pure \lstinline!LedgerDB! in a
92-
\lstinline!StrictTVar!.
93-
\end{itemize}
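
The in-memory structure described above (the last $k$ ledger states, with operations to append blocks and switch to forks) can be illustrated with a minimal sketch. This is not the Haskell `LedgerDB` implementation: the class `LedgerHistory`, its methods, and the `apply_block` callback are hypothetical names chosen for illustration, Python is used only for brevity, and block validation errors are not modelled.

```python
from collections import deque
from typing import Callable, Deque, Generic, Iterable, TypeVar

State = TypeVar("State")
Block = TypeVar("Block")


class LedgerHistory(Generic[State, Block]):
    """Hypothetical sketch: an anchor state plus at most k newer states."""

    def __init__(self, k: int, anchor: State,
                 apply_block: Callable[[State, Block], State]) -> None:
        self.k = k
        self.apply_block = apply_block
        self.anchor = anchor                    # oldest state we can roll back to
        self.states: Deque[State] = deque()     # states after the anchor, oldest first

    def tip(self) -> State:
        """Ledger state corresponding to the current tip."""
        return self.states[-1] if self.states else self.anchor

    def push(self, block: Block) -> None:
        """Apply a block to the tip state; forget states older than k blocks."""
        self.states.append(self.apply_block(self.tip(), block))
        if len(self.states) > self.k:
            self.anchor = self.states.popleft()

    def rollback(self, n: int) -> bool:
        """Drop the newest n states; refuse if that exceeds the stored history."""
        if n > len(self.states):
            return False
        for _ in range(n):
            self.states.pop()
        return True

    def switch(self, n: int, new_blocks: Iterable[Block]) -> bool:
        """Roll back n blocks to the intersection, then apply the fork's blocks."""
        if not self.rollback(n):
            return False
        for block in new_blocks:
            self.push(block)
        return True


# The fork example from the text: the tip is C2, the preferred fork is F1, F2, F3.
# Here a "ledger state" is simply the tuple of blocks applied so far.
db = LedgerHistory(k=2, anchor=("I",), apply_block=lambda s, b: s + (b,))
db.push("C1")
db.push("C2")
db.switch(2, ["F1", "F2", "F3"])
print(db.tip())
```

Because applying a block is not invertible, the sketch stores every intermediate state instead of trying to "unapply" blocks. Rolling back is then just discarding the newest entries.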
94-
95-
We will now discuss the modules listed above.
5+
THIS PART WAS PORTED TO THE HADDOCKS
966

977
\section{In-memory representation}
988
\label{ledgerdb:in-memory}
Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
1+
# UTxO HD
2+
3+
This document describes the design followed to move the ledger state
4+
from memory to disk.
5+
6+
## Expected performance
7+
8+
On a 64 GB machine with an AMD Ryzen 9 5900X processor, we obtained the following
9+
results when replaying and syncing from scratch up to slot 75M:
10+
11+
12+
| | Replay max mem | Replay time | Sync max mem | Sync time |
13+
|------------------|----------------|-------------|--------------|-----------|
14+
| Baseline | 13 GB | 1:51 h | 15 GB | 20:46 h |
15+
| UTxO HD (in-mem) | 13 GB | 2:50 h | 16 GB | 25:04 h |
16+
| UTxO HD (LMDB) | 8 GB | 3:15 h | 11.4 GB | 25:50 h |
17+
18+
It is worth noting that these are single measurements, and they are only
19+
intended to provide an indication of the expected performance.
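
To put the timings in the table in relative terms, the following sketch computes each configuration's slowdown against the baseline. It only restates the figures from the table; the helper names `to_minutes` and `slowdown` are hypothetical, and since these are single measurements the ratios are indicative at best.

```python
def to_minutes(hhmm: str) -> int:
    """Convert an 'H:MM' duration, as written in the table, to minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def slowdown(baseline: str, measured: str) -> float:
    """Ratio of the measured duration to the baseline duration."""
    return to_minutes(measured) / to_minutes(baseline)


# Durations copied from the table above (hours:minutes).
BASELINE = {"replay": "1:51", "sync": "20:46"}
RESULTS = {
    "UTxO HD (in-mem)": {"replay": "2:50", "sync": "25:04"},
    "UTxO HD (LMDB)": {"replay": "3:15", "sync": "25:50"},
}

for name, times in RESULTS.items():
    for phase in ("replay", "sync"):
        ratio = slowdown(BASELINE[phase], times[phase])
        print(f"{name:17} {phase:6} {ratio:.2f}x baseline")
```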
20+
21+
These results were obtained around 18 January 2023.
22+
23+
The plots below show how replaying and syncing a node from scratch progress over
24+
time, and how the memory usage evolves.
25+
26+
![replay times](/img/utxo-hd/utxo-hd-replay-01-19-23.png)
27+
28+
![sync times](/img/utxo-hd/utxo-hd-sync-01-19-23.png)
29+
30+
## References
31+
32+
* [Storing the Cardano ledger state on disk: analysis and design options (An IOHK technical report)](/pdfs/utxo-db.pdf)
33+
* [Storing the Cardano ledger state on disk: API design concepts (An IOHK technical report)](/pdfs/utxo-db-api.pdf)
Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,3 @@
1+
# Overview
2+
3+
TODO
