The one block header field that we didn’t investigate much in [chapter_blocks] was the Merkle root. To understand what makes the Merkle root useful, we first have to learn about Merkle trees and what properties they have. In this chapter, we’re going to learn exactly what a Merkle root is. This will be motivated by something called a proof of inclusion.
For a device that doesn’t have much disk space, bandwidth, or computing power, it’s expensive to store, receive, and validate the entire blockchain. As of this writing, the entire Bitcoin blockchain is around 200 GB, which is more than many phones can store; it can be very difficult to download efficiently and will certainly tax the CPU. If the entire blockchain cannot be put on the phone, what else can we do? Is it possible to create a Bitcoin wallet on a phone without having all the data?
For any wallet, there are two scenarios that we’re concerned with:
-
Paying someone
-
Getting paid by someone
If you are paying someone with your Bitcoin wallet, it is up to the person receiving your bitcoins to verify that they’ve been paid. Once they’ve verified that the transaction has been included in a block sufficiently deep, the other side of the trade, or the good or service, will be given to you. Once you’ve sent the transaction to the other party, there really isn’t anything for you to do other than wait until you receive whatever it is you’re exchanging the bitcoins for.
When getting paid bitcoins, however, we have a dilemma. If we are connected and have the full blockchain, we can easily see when the transaction is in a sufficiently deep block, at which point we give the other party our goods or services. But if we don’t have the full blockchain, as with a phone, what can we do?
The answer lies in the Merkle root field from the block header that we saw in [chapter_blocks]. As we saw in the last chapter, we can download the block headers and verify that they meet the Bitcoin consensus rules. In this chapter we’re going to work toward getting proof that a particular transaction is in a block that we know about. Since the block header is secured by proof-of-work, a transaction with a proof of inclusion in that block means, at a minimum, there was a good deal of energy spent to produce that block. This means that the cost to deceive you would be at least the cost of the proof-of-work for the block. The rest of this chapter goes into what the proof of inclusion is and how to verify it.
A Merkle tree is a computer science structure designed for efficient proofs of inclusion. The prerequisites are an ordered list of items and a cryptographic hash function. In our case, the items in the ordered list are transactions in a block and the hash function is hash256. To construct the Merkle tree, we follow this algorithm:
-
Hash all the items of the ordered list with the provided hash function.
-
If there is exactly 1 hash, we are done.
-
Otherwise, if there is an odd number of hashes, we duplicate the last hash in the list and add it to the end so that we have an even number of hashes.
-
We pair the hashes in order and hash the concatenation to get the parent level, which should have half the number of hashes.
-
Go to #2.
The idea is to come to a single hash that "represents" the entire ordered list. Visually, a Merkle tree looks like Merkle tree.
The bottom row is what we call the leaves of the tree. All other nodes besides the leaves are called internal nodes. The leaves get combined to produce a parent level (HAB and HCD), and when we calculate the parent level of that, we get the Merkle root.
We’ll go through each part of this process in the following sections.
Warning
|
Be Careful with Merkle Trees!
There was a vulnerability in Bitcoin 0.4–0.6 related to the Merkle root, which is detailed in CVE-2012-2459. There was a denial-of-service vector due to the duplication of the last item in Merkle trees, which caused some nodes to invalidate blocks even if they were valid. |
Given two hashes, we produce another hash that represents both of them. As they are ordered, we will call the two hashes the left hash and the right hash. The hash of the left and right hashes is what we call the parent hash. To clarify, here’s the formula for the parent hash:
-
H = Hashing function
-
P = Parent hash
-
L = Left hash
-
R = Right hash
- P=H(L||R)
Note that the || symbol denotes concatenation.
Here’s how we can code this process in Python:
link:code-ch11/examples.py[role=include]
The reason why we hash the concatenation to get the parent is because we can provide a proof of inclusion. Specifically, we can show that L is represented in the parent, P, by revealing R. That is, if we want proof L is represented in P, the producer of P can show us R and let us know that L is the left child of P. We can then combine L and R to produce P and have proof that L was used to produce P. If L is not represented in P, being able to provide R would be the equivalent to providing a hash pre-image, which we know is very difficult. This is what we mean by a proof of inclusion.
Given an ordered list of more than two hashes, we can calculate the parents of each pair, or what we call the Merkle parent level. If we have an even number of hashes, this is straightforward, as we can simply pair them up in order. If we have an odd number of hashes, then we need to do something, as we have a lone hash at the end. We can solve this by duplicating the last item.
That is, for a list like [A, B, C] we can add C again to get [A, B, C, C]. Now we can calculate the Merkle parent of A and B and calculate the Merkle parent of C and C to get:
- [H(A||B), H(C||C)]
Since the Merkle parent always consists of two hashes, the Merkle parent level always has exactly half the number of hashes, rounded up. Here is how we calculate a Merkle parent level:
link:code-ch11/examples.py[role=include]
-
We add the last hash on the list,
hashes[-1]
, to the end ofhashes
to make the length ofhashes
even. -
This is how we skip by two in Python.
i
will be 0 the first time through the loop, 2 the second, 4 the third, and so on.
This code results in a new list of hashes that correspond to the Merkle parent level.
To get the Merkle root we calculate successive Merkle parent levels until we get a single hash. If, for example, we have items A through G (7 items), we calculate the Merkle parent level first as follows:
- [H(A||B), H(C||D), H(E||F), H(G||G)]
Then we calculate the Merkle parent level again:
- [H(H(A||B)||H(C||D)), H(H(E||F)||H(G||G))]
We are left with just two items, so we calculate the Merkle parent level one more time:
- H(H(H(A||B)||H(C||D))||H(H(E||F)||H(G||G)))
Since we are left with exactly one hash, we are done. Each level will halve the number of hashes, so doing this process over and over will eventually result in a final single item called the Merkle root:
link:code-ch11/examples.py[role=include]
-
We loop until there’s one hash left.
-
We’ve exited the loop so there should only be one item.
Calculating the Merkle root in blocks should be pretty straightforward, but due to endianness issues, it turns out to be tricky. Specifically, we use little-endian ordering for the leaves of the Merkle tree. After we calculate the Merkle root, we use little-endian ordering again.
In practice, this means reversing the leaves before we start and reversing the root at the end:
link:code-ch11/examples.py[role=include]
-
We reverse each hash before we begin using a Python list comprehension.
-
We reverse the root at the end.
We want to calculate Merkle roots for a Block
, so we add a tx_hashes
parameter:
link:code-ch11/block.py[role=include]
-
We now allow the transaction hashes to be set as part of the initialization of the block. The transaction hashes have to be ordered.
As a full node, if we are given all of the transactions, we can calculate the Merkle root and check that the Merkle root is what we expect.
Now that we know how a Merkle tree is constructed, we can create and verify proofs of inclusion. Light nodes can get proofs that transactions of interest were included in a block without having to know all the transactions of a block (Merkle proof).
Say that a light client has two transactions that are of interest, which would be the hashes represented by the green boxes, HK and HN in Merkle proof. A full node can construct a proof of inclusion by sending us all of the hashes marked by blue boxes: HABCDEFGH, HIJ, HL, HM, and HOP. The light client would then perform these calculations:
-
HKL = merkle_parent(HK, HL)
-
HMN = merkle_parent(HM, HN)
-
HIJKL = merkle_parent(HIJ, HKL)
-
HMNOP = merkle_parent(HMN, HOP)
-
HIJKLMNOP = merkle_parent(HIJKL, HMNOP)
-
HABCDEFGHIJKLMNOP = merkle_parent(HABCDEFGH, HIJKLMNOP)
You can see that in Merkle proof, the dotted boxes correspond to the hashes that the light client calculates. In particular, the Merkle root is HABCDEFGHIJKLMNOP, which can then be checked against the block header whose proof-of-work has been validated.
The full node can send a limited amount of information about the block and the light client can recalculate the Merkle root, which can then be verified against the Merkle root in the block header. This does not guarantee that the transaction is in the longest blockchain, but it does assure the light client that the full node would have had to spend a lot of hashing power or energy creating a valid proof-of-work. As long as the reward for creating such a proof-of-work is greater than the amounts in the transactions, the light client can at least know that the full node has no clear economic incentive to lie.
Since block headers can be requested from multiple nodes, light clients have a way to verify if one node is trying to show them block headers that are not the longest. It only takes a single honest node to invalidate a hundred dishonest ones, since proof-of-work is objective. Therefore, isolation of a light client (that is, control of who the light client is connected to) is required to deceive in this way. The security of SPV requires that there be lots of honest nodes on the network.
In other words, light client security is based on a robust network of nodes and the economic cost of producing proof-of-work. For small transactions relative to the block subsidy (currently 12.5 BTC), there’s probably little to worry about. For large transactions (say, 100 BTC), the full nodes may have an economic incentive to deceive you. Transactions that large should generally be validated using a full node.
When a full node sends a proof of inclusion, there are two pieces of information that need to be included. First, the light client needs the Merkle tree structure, and second, the light client needs to know which hash is at which position in the Merkle tree. Once both pieces of information are given, the light client can reconstruct the partial Merkle tree to reconstruct the Merkle root and validate the proof of inclusion. A full node communicates these two pieces of information to a light client using a Merkle block.
To understand what’s in a Merkle block, we need to understand a bit about how a Merkle tree, or more generally, binary trees, can be traversed. In a binary tree, nodes can be traversed breadth-first or depth-first. Breadth-first traversal would go level by level like in Breadth-first ordering.
The breadth-first ordering starts at the root and goes from root to leaves, level by level, left to right.
Depth-first ordering is a bit different and looks like Depth-first ordering.
The depth-first ordering starts at the root and traverses the left side at each node before the right side.
In a proof of inclusion (see Merkle proof), the full node sends the green boxes, HK and HN, along with the blue boxes, HABCDEFGH, HIJ, HL, HM and HOP. The location of each hash is reconstructed using depth-first ordering from some flags. The process of reconstructing the tree, namely the dotted-edged boxes in Merkle proof, is described next.
The first thing a light client does is create the general structure of the Merkle tree. Because Merkle trees are built from the leaves upward, the only thing a light client needs is the number of leaves that exist to know the structure. The tree from Merkle proof has 16 leaves. A light client can create the empty Merkle tree like so:
link:code-ch11/examples.py[role=include]
-
Since we halve at every level, log2 of the number of leaves is how many levels there are in the Merkle tree. Note we round up using
math.ceil
as we round up for halving at each level. We could also be clever and uselen(bin(total))-2
. -
The Merkle tree will hold the root level at index 0, the level below at index 1, and so on. In other words, the index is the "depth" from the top.
-
There are levels 0 to
max_depth
in this Merkle tree. -
At any particular level, the number of nodes is the number of total leaves divided by 2 for every level above the leaf level.
-
We don’t know yet what any of the hashes are, so we set them to
None
. -
Note
merkle_tree
is a list of lists of hashes, or a two-dimensional array.
We can now create a MerkleTree
class:
link:code-ch11/merkleblock.py[role=include]
-
We keep a pointer to a particular node in the tree, which will come in handy later.
-
We print a representation of the tree.
Now that we have an empty tree, we can go about filling it to calculate the Merkle root. If we had every leaf hash, getting the Merkle root would look like this:
link:code-ch11/examples.py[role=include]
This fills the tree and gets us the Merkle root. However, the message from the network may not be giving us all of the leaves. The message might contain some internal nodes as well. We need a cleverer way to fill the tree.
Tree traversal is going to be the way we do this.
We can do a depth-first traversal and only fill in the nodes that we can calculate.
To traverse, we need to keep track of where exactly in the tree we are.
The properties self.current_depth
and self.current_index
do this.
We need methods to traverse the Merkle tree. We’ll also include other useful methods:
class MerkleTree:
...
link:code-ch11/merkleblock.py[role=include]
-
We want the ability to set the current node in the tree to some value.
-
We want to know if we are a leaf node.
-
In certain situations, we won’t have a right child because we may be at the furthest-right node of a level whose child level has an odd number of items.
We have Merkle tree traversal methods left
, right
, and up
.
Let’s use these methods to populate the tree via depth-first traversal:
link:code-ch11/examples.py[role=include]
-
We traverse until we calculate the Merkle root. Each time through the loop, we are at a particular node.
-
If we are at a leaf node, we already have that hash, so we don’t need to do anything but go back up.
-
If we don’t have the left hash, then we calculate the value first before calculating the current node’s hash.
-
If we don’t have the right hash, we calculate the value before calculating the current node’s hash. Note that we already have the left one due to the depth-first traversal.
-
We have both the left and the right hash, so we calculate the Merkle parent value and set that to the current node. Once set, we can go back up.
This code will only work when the number of leaves is a power of two, as edge cases where there are an odd number of nodes on a level are not handled.
We handle the case where the parent is the parent of the rightmost node on a level with an odd number of nodes:
link:code-ch11/examples.py[role=include]
-
If we don’t have the left node’s value, we traverse to the left node, since all internal nodes are guaranteed a left child.
-
We check first if this node has a right child. This is true unless this node happens to be the rightmost node and the child level has an odd number of nodes.
-
If we don’t have the right node’s value, we traverse to that node.
-
If we have both the left and the right node’s values, we calculate the current node’s value using
merkle_parent
. -
We have the left node’s value, but the right child doesn’t exist. This is the rightmost node of this level, so we combine the left value twice.
We can now traverse the tree for the number of leaves that aren’t powers of two.
The full node communicating a Merkle block sends all the information needed to verify that the interesting transaction is in the Merkle tree.
The merkleblock
network command is what communicates this information; it looks like Parsed merkleblock.
The first six fields are exactly the same as the block header from [chapter_blocks]. The last four fields are the proof of inclusion.
The number of transactions field indicates how many leaves this particular Merkle tree will have.
This allows a light client to construct an empty Merkle tree.
The hashes field holds the blue and green boxes from Merkle proof.
Since the number of hashes in the hashes field is not fixed, it’s prefixed with how many there are.
Last, the flags field gives information about where the hashes go within the Merkle tree.
The flags are parsed using bytes_to_bits_field
to convert them to a list of bits (1’s and 0’s):
link:code-ch11/helper.py[role=include]
The ordering for the bytes is a bit strange, but it’s meant to be easy to convert into the flag bits needed to reconstruct the Merkle root.
The flag bits inform where the hashes go using depth-first ordering.
The rules for the flag bits are:
-
If the node’s value is given in the hashes field (blue box in Processing a Merkle block), the flag bit is 0.
-
If the node is an internal node and the value is to be calculated by the light client (dotted outline in Processing a Merkle block), the flag bit is 1.
-
If the node is a leaf node and is a transaction of interest (green box in Processing a Merkle block), the flag is 1 and the node’s value is also given in the hashes field. These are the items proven to be included in the Merkle tree.
Given the tree from Processing a Merkle block:
-
The flag bit is 1 for the root node (1), since that hash is calculated by the light client.
-
The left child, HABCDEFGH (2), is included in the hashes field, so the flag is 0.
-
From here, we traverse to HIJKLMNOP (3) instead of HABCD or HEFGH since HABCDEFGH represents both those nodes and we don’t need them.
-
The right child, HIJKLMNOP, is also calculated, so it has a flag bit of 1.
-
To calculate HIJKLMNOP, we need the values for HIJKL and HMNOP (9). The next node in depth-first order is the left child, HIJKL (4), which is where we traverse to next.
-
HIJKL is an internal node that’s calculated, so the flag bit is 1.
-
From here, we traverse to its left child, HIJ (5). We will be traversing to HKL (6) when we come back to this node.
-
HIJ is next in depth-first ordering; its hash is included in the hashes list and the flag is 0.
-
HKL is an internal, calculated node, so the flag is 1.
-
HK (7) is a leaf node whose presence in the block is being proved, so the flag is 1.
-
HL (8) is a node whose value is included in the hashes field, so the flag is 0.
-
We traverse up to HKL, whose value can now be calculated since HK and HL are known.
-
We traverse up to HIJKL, whose value can now be calculated since HIJ and HKL are known.
-
We traverse up to HIJKLMNOP, whose value we can’t calculate yet since we haven’t been to HMNOP.
-
We traverse to HMNOP, which is another internal node, so the flag is 1.
-
HMN (10) is another internal node that’s calculated, so the flag is 1.
-
HM (11) is a node whose value is included in the hashes field, so the flag is 0.
-
HN (12) is of interest, so the flag is 1 and its value is in the hashes field.
-
We traverse up to HMN, whose value can now be calculated.
-
We traverse up again to HMNOP, whose value cannot be calculated because we haven’t been to HOP yet.
-
HOP (13) is given, so the flag is 1 and its hash is the final hash in the hashes field.
-
We traverse to HMNOP, which can now be calculated.
-
We traverse to HIJKLMNOP, which can now be calculated.
-
Finally, we traverse to HABCDEFGHIJKLMNOP, which is the Merkle root, and calculate it!
The flag bits for nodes (1) through (13) are:
1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0
There should be seven hashes in the hashes field, in this order:
-
HABCDEFGH
-
HIJ
-
HK
-
HL
-
HM
-
HN
-
HOP
Notice that every letter is represented in the hashes, A to P. This information is sufficient to prove that HK and HN (the green boxes in Processing a Merkle block) are included in the block.
As you can see from Processing a Merkle block, the flag bits are given in depth-first order. Anytime we’re given a hash, as with HABCDEFGH, we skip its children and continue. In the case of HABCDEFGH, we traverse to HIJKLMNOP instead of HABCD. Flag bits are a clever mechanism to encode which nodes have which hash value.
We can now populate the Merkle tree and calculate the root, given the appropriate flag bits and hashes:
class MerkleTree:
...
link:code-ch11/merkleblock.py[role=include]
-
The point of populating this Merkle tree is to calculate the root. Each loop iteration processes one node until the root is calculated.
-
For leaf nodes, we are always given the hash.
-
flag_bits.pop(0)
is a way in Python to dequeue the next flag bit. We may want to keep track of which hashes are of interest to us by looking at the flag bit, but for now, we don’t do this. -
hashes.pop(0)
is how we get the next hash from the hashes field. We need to set the current node to that hash. -
If we don’t have the left child value, there are two possibilities. This node’s value may be in the hashes field, or it might need calculation.
-
The next flag bit tells us whether we need to calculate this node or not. If the flag bit is 0, the next hash in the hashes field is this node’s value. If the flag bit is 1, we need to calculate the left (and possibly the right) node’s value.
-
We are guaranteed that there’s a left child, so we traverse to that node and get its value.
-
We check that the right node exists.
-
We have the left hash, but not the right. We traverse to the right node to get its value.
-
We have both the left and the right node’s values, so we calculate their Merkle parent to get the current node’s value.
-
We have the left node’s value, but the right does not exist. In this case, according to Merkle tree rules, we calculate the Merkle parent of the left node twice.
-
All hashes must be consumed or we got bad data.
-
All flag bits must be consumed or we got bad data.
Simplified payment verification is useful but not without some significant downsides. The full details are outside the scope of this book, but despite the programming being pretty straightforward, most light wallets do not use SPV and instead trust data from the wallet vendor servers. The main drawback of SPV is that the nodes you are connecting to know something about the transactions you are interested in. That is, you lose some privacy by using SPV. This will be covered in more detail in [chapter_bloom_filters] as we make Bloom filters to tell nodes what transactions we are interested in.