← Take me home

A somewhat gentle introduction to Kademlia

July 29, 2021

Edited August 04, 2021

We describe a peer-to-peer system which has provable consistency and performance in a fault-prone environment. Our system routes queries and locates nodes using a novel XOR-based metric topology that simplifies the algorithm and facilitates our proof. The topology has the property that every message exchanged conveys or reinforces useful contact information. The system exploits this information to send parallel, asynchronous query messages that tolerate node failures without imposing timeout delays on users.

Even the abstract from the Kademlia paper is a lot to digest and I think we can say there's not going to be any hand-holding. Accordingly, this blog will serve as a collection of my thoughts.

An overview of the hash table

To begin, we'll go over a few definitions. Firstly, a hash function is a map hh such that

h:{0,1,...,u1}{0,1,...,m1}=K.h : \{0, 1, ..., u - 1\} \to \{0, 1, ..., m -1\} = K.

for u<mu < m. In words, hh reduces a universe U\mathcal{U} with uu-many keys to mm-many buckets.

Next, a collision occurs whenever h(ki)=h(kj)h(k_i) = h(k_j) for two keys kikjUk_i \neq k_j \in \mathcal{U}. A well-behaved hash function avoids collisions. Specifically, the authors of Kademlia propose SHA-1 with a keyspace (U\mathcal{U} in our model) of 160 bits.

Abstractly, a hashmap HH supports put(key,val)\verb|put|(key, val) and get(key)\verb|get|(key). The former takes a key kk and stores value vv in bucket H[h(k)]H[h(k)] and the latter returns H[h(k)]H[h(k)].

The upshot of this datastructure is fantastic lookup times (constant on average), but sometimes we need to store large values. Accordingly, we can store parts of the map on separate computers. But who coordinates all the computers? A centralized computer, much like the director of a play, decides which keys of the hashmap go to which computer. But say the director decides to quit the play, we've lost access to the values in our hashmap and this map is not particularly resilient.

What is a distributed hash table anyways?

Enter the distributed hash table. Much like the traditional hashmap, DHT also supports put(key,val)\verb|put|(key, val) and get(key)\verb|get|(key). But it is fault-tolerant because if a computer leaves the network, we can still access the map.

Now, it would be wonderful if we had a protocol that partitions our map to many computers without a director. This is precisely what Kademlia proposes.

In my next blog posts, I'll go over the how and why of Kademlia.

Disclaimer

If something is off or you need some clarification, please email me at by clicking to copy to clipboard.