Leaf levels
Consider the below figure for an SMT storing seven key-value pairs, built by following the principles explained in the foregoing subsection; where the keys are, The leaf levels are as follows; , , , , , and .
Remaining keys
In a general Sparse Merkle tree (SMT), values are stored at their respective leaf-nodes. But a leaf node not only stores a value, , but also the key-bits that are left unused in the navigation from the root to . These unused key-bits are called the remaining key, and are denoted by for the leaf node . Consider again the SMT of the 7 key-value pairs depicted in the figure above. The remaining keys of each of the 7 leaves in the SMT are as follows; , , , , , and .Fake-leaf attack
Note that the above simplified design of binary SMTs, based on key-value pairs, presents some problems. The characteristic of binary SMTs having leaves at different levels can be problematic to verifiers, especially when carrying out a simple Merkle proof.Scenario: Fake SMT leaf
What if the verifier is presented with a fake leaf? Consider the below figure, showing a binary SMT with a branch and its children and hidden from the verifier’s sight. That is, suppose the verifier is provided with the following information;- The key-value , where and .
- The root , the number of levels to root, and the siblings and .

- He uses the key to navigate the tree until locating the supposed leaf .
- He computes and sets it as .
- Then takes the sibling and calculates
. - And then, uses to compute the root, .
Solution to the Fake-leaf attack
In order to circumvent the Fake-Leaf Attack we modify how the binary SMTs are built. Here’s the trick: When building binary SMTs, differentiate between how leaves are hashed and how branches are hashed. That is, use two different hash functions; one hash function to hash leaves, denote it by , and the other function for hashing non-leaf nodes, denote it by .How does this prevent the Fake-leaf attack?
Reconsider now, the Scenario A, given above. Recall that the Attacker provides the following;- The key-value , where and .
- The root , the number of levels to root, and the siblings and .
- He then sets .
- And further computes .
- Again, calculates the root, .
Non-binding key-value pairs
Whenever the verifier needs to check inclusion of the given key-value pair in a binary SMT identified by the , he first navigates the SMT in order to locate the leaf storing , and thereafter carries out two computations. Both computations involve climbing the tree from the located leaf back to the root, . And the two computations are;-
Checking correctness of the key .
That is, verifier takes the Remaining Key, , and reconstructs the key by concatenating the key bits used to navigate to from , in the reverse order.
Suppose the number of levels to root is 3, and the least-significant bits used for navigation are , and .
In order to check key-correctness, verifier takes the remaining key and,
- Concatenates and gets ,
- Concatenates then gets ,
- Concatenates and gets .
- The Merkle proof: That is, checking whether the value stored at the located leaf was indeed included in computing the root, . This computation was illustrated several times in the above discussions. Note that the key-correctness and the Merkle proof are simultaneously carried out.
Example: Indistinguishable leaves
Suppose a binary SMT contains a key-value pair at the leaf , where . That is, . Note that, when building binary SMTs, it is permissible to have another key-value pair in the same tree with . An Attacker can pick the key-value pair such that and . And, with the above design, it means . Consider the below figure and suppose the Attacker provides the following data;
- The key-value , where and .
- The root, , the number of levels to root = 3, and the siblings , and .
- Concatenates , and gets ,
- Concatenates to get ,
- Concatenates , yielding .
- He computes the hash of and sets it as, .
- Then he uses to compute, .
- He also uses to compute, .
- He further calculates, .
- ,
- ,
- ,
- .
Why is this attack successful?
Note that equality of values, , associated with two distinct keys, has nothing to do with the efficacy of this attack. In fact, for all practical purposes, it should be permissible for distinct leaves to store any value, irrespective of whether other leaves store an equivalent value or not. The downfall of our binary SMTs design, thus far, is that it does not give the verifier any equation that relates or ties the keys to their associated values. In other words, the attack succeeds simply because the key-value pairs (as ‘committed’ values) are not binding.Solution to the non-binding key-value problem
The solution to this problem is straightforward, and it is to build the binary SMTs in such a way that the key-value pairs are binding. This means, create a relationship between the keys and their associated values, so that the verifier can simply check if this relationship holds true. In order to ensure that checking such a relationship blends with the usual proof machinery, one has two options. The naïve solution, which involves the keys, is one option.The Naïve solution
The naïve solution is to simply include keys in the argument of the hash function, when forming leaves. That is, when building a binary SMT, one includes a key-value pair by setting the leaf to be the hash of both the value and the key; Does this change remedy the non-binding problem? Suppose and are two key-value pairs such that , while and differ only in one of the most-significant bits. Since the hash functions used are collision-resistant, it follows that Consequently, although the key-value pairs and might falsely pass the key-correctness check, they do not pass the Merkle proof test. And this is because, collision-resistance also guarantees that the following series of inequalities hold true; where; is a sibling to and is a sibling to , making and branches traversed while climbing the tree from to root; Similarly, is a sibling to , while is a sibling to , also making and branches traversed while climbing the tree from to root. The only chance for the Merkle proof to pass is if the key-value pairs and are distinct and are individually on the same SMT. The inclusion of keys, in the argument of the hash functions, therefore ensures that leaves and are distinguishable. And most importantly, that key-value pairs in our SMTs are now binding.A better solution
The other solution, which is much more apt than the Naïve option, utilises the remaining keys when forming leaves. Since levels to root is related to the Remaining Key () notion, a much more apt solution is to rather include the remaining key, , as the argument to the hash function, instead of the whole key . That is, for a key-value pair , one sets the leaf to be the hash of both the value and the remaining key; With this strategy, the verifier needs the remaining key , instead of the whole key, in order to carry out a Merkle proof. So he adjusts the Merkle proof by;- Firstly, picking the correct hash function for leaves,
- Secondly, concatenating the value stored at the leaf and the remaining key , instead of the whole key ,
- Thirdly, hashing the concatenation .
Hiding values
It is often necessary to ensure that a commitment scheme has the hiding property. And this can be achieved by storing hashes of values, as opposed to plainly storing the values. A leaf therefore is henceforth constructed in two steps;- Firstly, for a key-value pair , compute the hash the value ,
- Secondly, form the leaf containing , as follows,
- Firstly, picking the correct hash function for leaf nodes,
- Secondly, concatenating the hashed-value and the remaining key ,
- Thirdly, hashing the concatenation in order to form the leaf, .
Example. (Merkle proof on hidden values)
The following example illustrates a Merkle proof when the above strategy is applied. Consider an SMT where the keys are 8-bit long, and the prover commits to the key-value with . See figure below.
- He computes,
- Then, he uses the sibling to compute, .
- Next, he computes, .
- Now, verifier uses to compute the supposed root, .
- Checks if equals .