Patricia Tries

If you’re not familiar with Merkle Trees, read the previous article to give you a basic idea of what Merkle Trees are, why they’re useful, and give you a glimpse of the significant advantages of Merkle Patricia Trees over standard Merkle Trees.

Merkle trees are a fundamental part of what makes blockchains tick. Although it is definitely theoretically possible to make a blockchain without Merkle trees, simply by creating giant block headers that directly contain every transaction, doing so poses large scalability challenges that arguably puts the ability to trustlessly use blockchains out of the reach of all but the most powerful computers in the long term. Thanks to Merkle trees, it is possible to build Ethereum nodes that run on all computers and laptops large and small, smart phones, and even internet of things devices.

Unlike transaction history, however, the state needs to be frequently updated: the balance and nonce of accounts is often changed, and what’s more, new accounts are frequently inserted, and keys in storage are frequently inserted and deleted. What is thus desired is a data structure where we can quickly calculate the new tree root after an insert, update edit or delete operation, without recomputing the entire tree. There are also two highly desirable secondary properties:

  • The depth of the tree is bounded, even given an attacker that is deliberately crafting transactions to make the tree as deep as possible. Otherwise, an attacker could perform a denial of service attack by manipulating the tree to be so deep that each individual update becomes extremely slow.
  • The root of the tree depends only on the data, not on the order in which updates are made. Making updates in a different order and even recomputing the tree from scratch should not change the root.

Thus making Merkle Patricia Tree is the combination of a:

  • Patricia Trie: An efficient Radix Trie, a data structure in which “keys” represent the path one has to take to reach a node
  • Merkle Tree: A hash tree in which each node’s hash is computed from its child nodes hashes. We’ll begin by exploring the “Patricia Trie” part of Merkle Patricia Trees, and then integrate their “Merkle Tree” part.

Merkle Patricia Trie is one of the key data structures for the Ethereum’s storage layer.

What is Radix Trie?

Trie

A trie, or prefix tree, is a type of search tree, a tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters. In order to access a key (to recover its value, change it, or remove it), the trie is traversed depth-first, following the links between nodes, which represent each character in the key.

Unlike a binary search tree, nodes in the trie do not store their associated key. Instead, a node's position in the trie defines the key with which it is associated. This distributes the value of each key across the data structure, and means that not every node necessarily has an associated value.

Optimised version

Radix trie is a space-optimized trie, in which each node that is the only child is merged with its parent. The result is that the number of children of every internal node is at most the radix r of the radix tree, where r is a positive integer and a power x of 2, having x ≥ 1. Unlike regular trees, edges can be labeled with sequences of elements as well as single elements. This makes radix trees much more efficient for small sets (especially if the strings are long) and for sets of strings that share long prefixes.

Applications

Radix trees are useful for constructing associative arrays with keys that can be expressed as strings. They find particular application in the area of IP routing, where the ability to contain large ranges of values with a few exceptions is particularly suited to the hierarchical organization of IP addresses. They are also used for inverted indexes of text documents in information retrieval.

Merkle proof

A Merkle proof consists of a chunk, the root hash of the tree, and the “branch” consisting of all of the hashes going up along the path from the chunk to the root. Someone reading the proof can verify that the hashing, at least for that branch, is consistent going all the way up the tree, and therefore that the given chunk actually is at that position in the tree. The application is simple: suppose that there is a large database, and that the entire contents of the database are stored in a Merkle tree where the root of the Merkle tree is publicly known and trusted (eg. it was digitally signed by enough trusted parties, or there is a lot of proof of work on it). Then, a user who wants to do a key-value lookup on the database (eg. “tell me the object in position 85273”) can ask for a Merkle proof, and upon receiving the proof verify that it is correct, and therefore that the value received actually is at position 85273 in the database with that particular root. It allows a mechanism for authenticating a small amount of data, like a hash, to be extended to also authenticate large databases of potentially unbounded size. The original application of Merkle proofs was in Bitcoin, as described and created by Satoshi Nakamoto in 2009. The Bitcoin blockchain uses Merkle proofs in order to store the transactions in every block:

Patricia Trie

In a standard “trie”, the key is a path we follow step-by-step (i.e. one hexadecimal value at a time) to reach a destination: our value. Now, every time we take a step along that trie, we step on what’s called a “node”. In Patricia Tries, there are different kinds of nodes:

  • null: A non-existent node.
  • branch: A node that links (“branches out”) to up to 16 distinct child notes. A branch node can also itself have a value.
  • leaf: An “end-node” that contains the final part of the path and a value.
  • extension: A “shortcut” node that provides a partial path and a destination. Extension nodes are used to “bypass” unnecessary nodes when only one valid branch exists for a sequence of nodes. To improve the efficiency of standard tries, Patricia tries provide a new type of node: extension nodes. An extension node can replace useless sequences of branches that contain only one valid index.
console.log(Buffer.from('testKey'))
console.log(Buffer.from('testKey0'))
console.log(Buffer.from('testKeyA'))

// RESULT (BYTES)
Buffer 74 65 73 74 4b 65 79
Buffer 74 65 73 74 4b 65 79 30
Buffer 74 65 73 74 4b 65 79 41

We can see that the bytes representations of our keys branch off at byte 79. This makes sense: 79 stands for the letter y. And if you're to add these keys to the Patricia Trie, 'testKey' would be a branch node. Note that it’s possible for a branch node to not have a value itself, but only branches. This would simply mean that the keys (i.e. the paths) branch off at a certain point, but that there is no end-value assigned at that specific point. See for example:

await trie.put(Buffer.from('testKey0'), Buffer.from('testValue0'))
await trie.put(Buffer.from('testKeyA'), Buffer.from('testValueA'))

The key 'testKey' would still be a branch node, just without any value. What type of node do you think exists at path "testKe"? After all, this is also a common path, the first part of our 3 keys. Let’s try it:

var node2 = await trie.findPath(Buffer.from('testKe')) // We retrieve the node at "testKe"
console.log('Node 2: ', node2.node)

// RESULT
Node 2:  null

A null node! This reveals a key difference between standard (“Radix”) tries and Patricia tries. In standards tries, each step of the “path” would be a branch itself (i.e. branch node at 7, 74, 74 6, 74 65, and so on). However, creating a branch node for every step of the path is inefficient when there is only one valid “branch” for many of those steps. This is inefficient because each branch is a 17-item array, so using them to store only one non-null value is wasteful. Patricia tries avoid these unnecessary branch nodes by creating extension nodes at the beginning of the common path to act as “shortcuts”. This explains why there is no branch node at key "testKe": it’s bypassed by an extension node that begins higher-up (i.e. earlier in the path) in the trie.

If we take a look at the individual branches, as you recall, they should be pointing to leaf nodes:

// --path--  ------------- value -----------------
[ Buffer 30, Buffer 74 65 73 74 56 61 6c 75 65 30 ] // leaf node down the path of "testKey0"
[ Buffer 31, Buffer 74 65 73 74 56 61 6c 75 65 41 ] // leaf node down the path of "testKeyA"

As we said above, leaf nodes are arrays that contain two items: [ remainingPath, value ]. We can see in the code above that the “remainingPath” of our first leaf node is 30, while the second is 31. What does this mean?

The first hex character 3 indicates whether the node is a leaf node, or an extension node (this is to prevent ambiguity, as both nodes have a similar 2-item structure). This first hex character also indicates whether or not the remaining path is of “odd” or “even” length (required to write complete bytes: this indicates if the first hex character is appended with a 0 ).

Extension nodes

To create an extension node, we need to slightly change our keys. We’ll keep our branch node at path "testKey", but we’ll change the two other keys so that they lead down a lengthy common path.

console.log(Buffer.from('testKey'))
console.log(Buffer.from('testKey0001'))
console.log(Buffer.from('testKey000A'))

// RESULT
Buffer 74 65 73 74 4b 65 79
Buffer 74 65 73 74 4b 65 79 30 30 30 31
Buffer 74 65 73 74 4b 65 79 30 30 30 41

As you can see, the bytes 30 30 30 (standing for 000 are common to both keys. We therefore should assume an extension that begins at index 3 of the branch node at "testKey", as these indexes are not determined at random: they correspond to the next hex-value of the path of our keys into the child array (hex has 16 values and there are 16 children).

Generating and Verifying Hashes

As you may know, Merkle trees are hash trees that allow us to efficiently verify information. In the context of blockchains, Merkle Trees allow us to answer questions such as “Has this transaction been confirmed?”.

In Ethereum’s Merkle Patricia Trees, each node is referenced to by its hash. Note that this hash also can be referred to as a “key”, which can be confusing. The hash is not the same as the path we take when going down the trie.

You can think of paths as a sequence of instructions for a given input, something like “go down branch #3 → go down this extension → go down branch #8 → you have arrived at your destination: a leaf“. Hashes, on the other hand, act as unique identifiers for each node and are generated in a way that allows the verification of data.

So, how are hashes calculated in Ethereum?

  • First, all values from the node are serialized using the Recursive Length Prefix encoding function.
  • Then, a hash function (keccak256) is applied to the serialized data. This outputs a 32-bytes hash.

Let's try to verify the hash by ourselves:

var node1 = await trie.findPath(Buffer.from('testKey'))
console.log('Extension node hash: ', node1.node._branches[3])

var node2 = await trie._lookupNode(Buffer.from(node1.node._branches[3]))
console.log('Value of extension node: ', node2)

// RESULT
Extension node hash:  Buffer 39 1d 48 30 de 00 87 57 46 98 4c 4f d3 ef 5a 0c f7 ca 7b 40 f9 c2 8a ce e2 ba 22 49 84 41 8f 6b

Value of extension node:  ExtensionNode {
  _nibbles: [ 0, 3, 0, 3, 0 ],
  _value: Buffer 70 b3 d0 20 ad 85 8f d6 00 28 a4 23 e9 8f 1d 99 c5 37 cd b9 1f 27 49 16 40 06 6f ea c7 9c 2f 72
}

Similarly to leaf nodes, extension nodes are two-item arrays: [ encodedPath, hash ] . The encodedPath (denoted above as _nibbles") stands for the “remaining path”. In the case of extension nodes this is the path that we “shortcut”.

We should first use the Recursive Length Prefix encoding function on the node to serialize the values of the extension node. RLP-encoding the “raw” version (as an array of bytes) and to take the hash of this RLP output:

console.log('Our computed hash:       ', keccak256(rlp.encode(node2.raw())))
console.log('The extension node hash: ', node1.node._branches[3])

// RESULT
Our computed hash:        Buffer 39 1d 48 30 de 00 87 57 46 98 4c 4f d3 ef 5a 0c f7 ca 7b 40 f9 c2 8a ce e2 ba 22 49 84 41 8f 6b
The extension node hash:  Buffer 39 1d 48 30 de 00 87 57 46 98 4c 4f d3 ef 5a 0c f7 ca 7b 40 f9 c2 8a ce e2 ba 22 49 84 41 8f 6b

Merkle trees allow us to verify the integrity of their data contained without requiring us to download the full tree. Now let’s suppose that I’m provided with various pieces of information, some that I trust, and some that I don’t. Here’s what I know from a trustworthy source:

  • The tree contains key-value pair "testKey": "testValue".
  • There also exists a branch node with hash: Buffer 70 b3 d0 20 ad 85 8f d6 00 28 a4 23 e9 8f 1d 99 c5 37 cd b9 1f 27 49 16 40 06 6f ea c7 9c 2f 72.
  • The path corresponding to this branch node is "testKey000".
  • This branch node contains at least one valid branch. This branch is located at index 4, with value [ Buffer 31, Buffer 74 65 73 74 56 61 6c 75 65 31 ] (equivalent to key-value pair "testKey0001": "testValue1")

Now, I’m getting conflicting information from two other sources:

  • Bob claims that: That branch node contains another branch, representing key-value pair "testKey000A": "testValueA"
  • Alice claims that: That branch node contains another branch, representing key-value pairs "testKey000z": "testValuez".

How can I determine who’s telling the truth? We do this by computing the branch node hash for each possibility, and comparing them with our trusted branch node hash!

await trie1.put(Buffer.from('testKey'), Buffer.from('testValue'))
await trie1.put(Buffer.from('testKey0001'), Buffer.from('testValue1'))
await trie1.put(Buffer.from('testKey000A'), Buffer.from('testValueA')) // What Bob claims

await trie2.put(Buffer.from('testKey'), Buffer.from('testValue'))
await trie2.put(Buffer.from('testKey0001'), Buffer.from('testValue1'))
await trie2.put(Buffer.from('testKey000z'), Buffer.from('testValuez')) // What Alice claims

var temp1 = await trie1.findPath(Buffer.from('testKey'))
var temp2 = await trie2.findPath(Buffer.from('testKey'))

var node1 = await trie1._lookupNode(Buffer.from(temp1.node._branches[3]))
var node2 = await trie2._lookupNode(Buffer.from(temp2.node._branches[3]))

console.log('Branch node 1 hash: ', node1._value)
console.log('Branch node 2 hash: ', node2._value)

// Result
Branch node 1 hash:  Buffer 70 b3 d0 20 ad 85 8f d6 00 28 a4 23 e9 8f 1d 99 c5 37 cd b9 1f 27 49 16 40 06 6f ea c7 9c 2f 72
Branch node 2 hash:  Buffer 72 2a cc 1b 73 61 09 1c 9a 5e 33 15 4b e4 ac 9e d8 a8 b9 33 72 06 9b 09 53 da 40 9d a1 5a 20 c9

Using our already trusted information, we can be confident that Bob (node 1) is telling the truth, since the hash computed from his information is valid. Note that we also could have compared the root hashes of the trees and compared them to a trusted tree root. This is possible because each distinct tree will produce a distinct hash!

root1 = trie1.root
root2 = trie2.root
console.log(root1)
console.log(root2)

Buffer 46 7a 64 dd e5 de 0a 37 0a 35 75 03 42 a5 2d 80 1f 0b 9b 22 59 03 10 b4 1d 0b ab 7d 3d e0 a3 5e
Buffer 38 b0 a5 74 1a f3 61 14 1e 7a 29 b6 f2 9d ab 22 75 7b 7d 64 73 90 f6 e3 55 e5 2e 2b ea c1 e3 04

Merkle Patricia Trees in Ethereum

So, how does that actually work in Ethereum? Every block header in Ethereum contains not just one Merkle tree, but three trees for three kinds of objects:

  • The Transactions Tree
  • The Receipts Tree
  • The State Tree

This was intoriduced to overcome one particular limitation in Bitcoin, that is while you can prove the inclusion of transactions, you cannot prove anything about the current state (eg. digital asset holdings, name registrations, the status of financial contracts, etc). How many bitcoins do you have right now? A Bitcoin light client can use a protocol involving querying multiple nodes and trusting that at least one of them will notify you of any particular transaction spending from your addresses, and this will get you quite far for that use case, but for other more complex applications it isn’t nearly enough; the precise nature of the effect of a transaction can depend on the effect of several previous transactions, which themselves depend on previous transactions, and so ultimately you would have to authenticate every single transaction in the entire chain. To get around this, Ethereum takes the Merkle tree concept one step further.

This allows for a highly advanced light client protocol that allows light clients to easily make and get verifiable answers to many kinds of queries:
1. Has this transaction been included in a particular block?
2. Tell me all instances of an event of type X (eg. a crowdfunding contract reaching its goal) emitted by this address in the past 30 days
3. What is the current balance of my account?
4. Does this account exist?

The first is handled by the transaction tree; the third and fourth are handled by the state tree, and the second by the receipt tree. The first four are fairly straightforward to compute; the server simply finds the object, fetches the Merkle branch (the list of hashes going up from the object to the tree root) and replies back to the light client with the branch.

So, what does our destination look like? It’s a key-value pair (remember not to confuse this key with the path!) where: The key is the hash of the value. This is the “key” that we use when uniquely referencing a node (again, not to be confused with the “path” that we used to navigate to that key-value pair in the first place).

The value is the Recursive Layer Protocol encoding of the node itself (which, if the transaction exists, will most likely be a leaf node). Remember that the leaf node is a 2-item array: [ remainingPath, leafValue ]. In the context of a transaction, “leafValue” will contain information relevant to our transaction (like its value, and who sent it).

The purpose of the transactions tree is to record transaction requests. It can answer questions like: “What is the value of this transaction?” or “Who sent this transaction?”. In the Ethereum blockchain, each block has its own transactions tree. Just like in our previous examples, we need a path to “navigate” the tree and access a particular transaction. In the transactions tree, this path is given by the Recursive Layer Protocol encoding of the transaction’s index in the block. So, for example, if a transaction’s index is 127:

const rlp = require('rlp')
console.log('RLP encoding of 127: ', rlp.encode('127'))

// RESULT
RLP encoding of 127:  Buffer 83 31 32 37

We would, therefore, follow the path 8 → 3 → 3 → 1 → 3 → 2 → 3 → 7 and reach our destination. The transaction output contains all the information from the transaction node (plus additional information automatically provided by web3, like the blockHash and the blockNumber). Now let's extract this transaction data ([nonce, gasPrice, gasLimit, to, value, input, v, r, s]) from some transaction output to re-create a valid Ethereum transaction:

const INFURIA_ENDPOINT = require('./infura_endpoint')
const Web3 = require('web3')
const web3 = new Web3(new Web3.providers.HttpProvider(INFURIA_ENDPOINT))

const hash = '0x2f81c59fb4f0c3146483e72c1315833af79b6ea9323b647101645dc7ebe04074'
let transaction = await web3.eth.getTransaction(hash)
transactionData = [
    web3.utils.toHex(transaction.nonce),
    web3.utils.toHex(transaction.gasPrice),
    web3.utils.toHex(transaction.gas),
    transaction.to,
    web3.utils.toHex(transaction.value),
    transaction.input,
    transaction.v,
    transaction.r,
    transaction.s,
]

console.log('Transaction hash: ', keccak256(rlp.encode(transactionData)))
console.log('Original hash: ', transaction.hash);
console.log('Our hash: ', hash);

// RESULT
Generated hash:  Buffer 2f 81 c5 9f b4 f0 c3 14 64 83 e7 2c 13 15 83 3a f7 9b 6e a9 32 3b 64 71 01 64 5d c7 eb e0 40 74
Transaction hash: '0x2f81c59fb4f0c3146483e72c1315833af79b6ea9323b647101645dc7ebe04074'
Our hash: '0x2f81c59fb4f0c3146483e72c1315833af79b6ea9323b647101645dc7ebe04074'

Comments

Popular posts from this blog

CAP Theorem and blockchain

Length extension attack

Contract upgrade anti-patterns