kronosapiens.github.io - Trie, Merkle, Patricia: A Blockchain Story










Jul 4, 2018

In which we tell the story of the Patricia Tree.

I. Introduction

Spend a few days around blockchain engineers and certain words will start to sound familiar. “Merkle Tree” and “Patricia Tree” in particular will start to seem… important somehow. You’ll eventually gather that these are quite essential parts of this whole blockchain thing… but why? What problems, exactly, do they solve?

You might do a quick search and stumble upon more than a few pieces of #content which explain these things, but retreat upon seeing the complicated-looking diagrams. Fear not, dear reader. Here we will explain these things, not with graphs, but with story.

Where to begin? The beginning, I suppose.

II. The Hash Table

In the beginning there was the computer, stretching infinitely in all directions. In fact, it’s hard to say that there even was the computer, since existence implies absence, and there was nothing that wasn’t the computer.

So there was the computer, but the computer was inert. Nothing was happening. Boring. So the computer decided to create a programmer. Pop.

At first the programmer wasn’t very good, but over time she got better. There wasn’t much else going on at this time, so the programmer kept going, programming more and more things into the world. Animals and the like. After a while there were a lot of animals, which meant a lot of names to keep track of. This was a problem.

The programmer thought – “how can I keep track of all of the names of these animals? I want to be able to easily look up the name I gave to each species of animal. I could write all the names down in a big list, but eventually looking up the names will get really slow. If only I had the right data structure”.

And so the programmer created the hash table.

What is a hash table?
For starters, it’s the foundation of everything else that’s going to happen, so we’re going to talk about it for a minute. Essentially, a hash table is a type of “key-value store”. This means that for a given “key” (i.e. an animal species) you can save the “value” (i.e. the name of the animal). The main property of the hash table is that when you have a key, you can find the value fast, regardless of how many other items are in the hash table. In computer science terms, this is known as “constant-time lookup” and is very useful, which is why hash tables are “arguably the single most important data structure known to mankind”. Here’s an example:

    >>> hashtable.set("dog", "fido")
    >>> hashtable.get("dog")
    "fido"

How do they work? To understand the hash table, we have to digress for a moment and talk about the hash function. Hash functions are a magical secret sauce which make some wondrous things possible. Hash functions are the “cryptography” people talk about when they talk about blockchains. Hash functions are legit.

What is a hash function?

Fortunately, hash functions are simple to understand. They are essentially tiny machines which take in some value, shake it around for a while (imagine a bartender shaking a cocktail), and output some other crazy-looking value (a big number). Their essential properties are:

- For a given input (like “cat”), you will always get the same output (like “0x52763589”).
- Two similar inputs (like “cat” and “car”) should not have similar outputs. Put another way, given an output, you should not be able to guess the input.

This makes hash functions extremely useful because they let us handle sensitive information safely. Have you ever wondered how responsible websites keep your passwords safe? They don’t store your password, they store a hash of your password. When you type in your password to log in, they take the hash of your password and compare that against what they have in their database.

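The two properties above, and the password check, can be tried out directly. What follows is only a minimal sketch: it assumes SHA-256 from Python's standard `hashlib` as the hash function, and the helper names `hash_text` and `check_password` (and the example password) are made up for illustration. Real login systems typically also add a per-user salt and use a deliberately slow password-hashing function, which is left out here.

```python
import hashlib
import hmac


def hash_text(text: str) -> str:
    """Return the SHA-256 hash of a string, written out as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def check_password(stored_hash: str, attempt: str) -> bool:
    """Hash the login attempt and compare it with the hash kept in the database."""
    return hmac.compare_digest(stored_hash, hash_text(attempt))


# Property 1: the same input always gives the same output.
print(hash_text("cat") == hash_text("cat"))

# Property 2: similar inputs give unrelated-looking outputs.
print(hash_text("cat"))
print(hash_text("car"))

# The "database" keeps only the hash, never the password itself.
stored = hash_text("hunter2")  # hypothetical example password
print(check_password(stored, "hunter2"))
print(check_password(stored, "hunter3"))
```

Running it prints two long hexadecimal strings for “cat” and “car” that have nothing visibly in common, even though the inputs differ by one letter.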

But if a hacker ever gets in, all they’ll know is the gibberish hash of your password – useless since they have no way of figuring out what your actual password was.

The other thing they’re useful for is making hash tables. Why? Remember that the output of the hash function is a number. So when you hash the key, you essentially get a number telling you where to find the value. Imagine the hash table as a cabinet with 100 drawers. You hash("dog") and get 34 – you go to drawer 34 and get the name out. You hash("cat") and get 89 – you go to drawer 89. No need to look through a whole list – you skip directly to the finish line. Pretty cool right? Yes it is.

And so the programmer had the hash table, and for a while things were good. Great, even! But it couldn’t last.

Eventually, the brogrammer appeared. At first things were good between them: they shared ideas, they shared code, they shared space. But eventually dark clouds appeared on the horizon. They wanted different things. The programmer was fine with a little randomness thrown into things, but the brogrammer wanted certainty, and he wasn’t happy with hash tables anymore. They’re “not deterministic”, he said.

What did he mean? To understand this point, we’ll have to talk a little more about hash functions and hash tables.

The first thing to note is that the “range” of the hash function (the possible values the output can take) is very large – depending on the computer, it can take as much as 2^256, but more typically 2^32 or 2^64 possible values. 2^32 is 4,294,967,296 – and the others are much, much larger. Hash tables have to support this whole range, but we can’t make cabinets with that many drawers – there wouldn’t be room for anything else! So behind the scenes, we do a little trick: we take the hash value modulo the size of the cabinet. The modulo operation (%) is essentially division’s sidekick: it gives you the remainder.
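As a rough illustration of the trick, here is a small sketch of shrinking a huge hash value down to a drawer number. It assumes SHA-256 (a 2^256-range hash) as a stand-in for whatever hash function the table really uses, and the names `big_hash` and `drawer_for` are invented for this example.

```python
import hashlib

NUM_DRAWERS = 100  # the size of the cabinet


def big_hash(key: str) -> int:
    """Hash a key to a very large integer (somewhere below 2**256)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def drawer_for(key: str, num_drawers: int = NUM_DRAWERS) -> int:
    """Pick a drawer by taking the hash value modulo the size of the cabinet."""
    return big_hash(key) % num_drawers


# Modulo on a plain number: the remainder after dividing by 100.
print(123456789 % 100)  # 89

# Modulo on a real hash value: a huge number goes in, a drawer number comes out.
print(big_hash("dog"))
print(drawer_for("dog"))
```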
The nice thing about modulo is that the output (the remainder) is always at least 0 and less than the divisor – so no matter how big the input, the output can only be so big. So behind the scenes, we make a cabinet with 100 drawers, and when deciding where to put the name of "dog", we look in drawer hash("dog") % 100. Because the hash value is random, the remainder will still be random, just smaller.

This works great, but there’s a big downside: two animals might end up in the same drawer! Let’s say that hash("dog") is 1,000,034 and hash("shark") is 200,034. Different values, but both will be 34 after the modulo. So we put them in the same drawer, and we have to look through the drawer to find the dog’s name. It’s still fast, since there’s usually only one or two names in the drawer.

So it’s fine in practice, but the brogrammer’s point is that the spot you put the name in is not 100% determined by the hash function you’re using. Two more factors come in: the size of the cabinet, and the other animals! The size of the cabinet matters because a cabinet with 10 drawers will put both 72 and 182 in the same place (2), but a cabinet with 100 drawers will put them in different places (72 and 82). Also, you can’t tell in advance if a name will be alone in a drawer, or if it will have to share with other names.

The brogrammer wasn’t happy about this, but dealt with his feelings in a healthy way and went off into the mountains for a few weeks to think about alternatives. “A place for everything, and everything in its place,” he kept repeating in his head. When he eventually came down, he had a new idea.

III. The Trie

The problem, the brogrammer had realized, was that we were trying to put everything into a single huge cabinet, which could never be big enough. The solution, the brogrammer said, was to use a sequence of smaller cabinets.
The first cabinet would give you the address of the second cabinet, the second cabinet the third, and eventually you would get to the cabinet which had the name you were looking for. You would need more cabinets (but not that many, as it turns out), and each cabinet could be quite small (maybe 16 drawers, or even 2!). Here’s an example, using an 8-drawer, 3-cabinet system (which gives us 8^3 = 512 drawers total):

    >>> hash3("dog")
    0x237
    >>> firstCabinet = trie.find(firstCabinetLocation)
    >>> secondCabinetLocation = firstCabinet.drawer(7).contents
    >>> secondCabinet = trie.find(secondCabinetLocation)
    >>> thirdCabinetLocation = secondCabinet.drawer(3).contents
    >>> thirdCabinet = trie.find(thirdCabinetLocation)
    >>> thirdCabinet.drawer(2).contents
    "fido"

Note that each digit of the hash tells us which drawer to open (read here from right to left: 7, then 3, then 2), and each digit means one more cabinet. The brogrammer called this system a “Trie” (as in retrieve), and said that the beauty of it was that you didn’t need to build all the cabinets at once – you could start out with just one cabinet, and only build new cabinets the first time you needed them, wherever there was room. And while it means a little more work (opening more drawers), every name will have a dedicated drawer, always in the same place. And the brogrammer knew that no one would ever need all the drawers, and so most of the cabinets would never need to be built (although you can’t rule it out).

The programmer looked at the Trie and agreed it was a clever idea (although it involved quite a bit more walking), and there was harmony between them.

Years passed, and a new people started to appear in the nearby valley. Curious, the programmer and the brogrammer journeyed over to see these people and learn about their culture. They found the people intriguing, with a curious religion revolving around the worship of a particular array of carved granite blocks.


The people were quite friendly, and after meeting with some of their priests, the programmer and brogrammer learned that these people had once been warlike, but after years of conflict developed a new system of “trust” which allowed them to co-exist in remarkable peace and prosperity. The computer, they said, was only as good as the programmer, and humans could not be trusted to program alone.

These people knew of the hash table and the trie, but they had found that people would cheat: sometimes people would come in the night and change the names in the drawers; there was no way to prove that the names in the drawers were the right names. For a while these people had a warrior class who guarded the cabinets, but found that this only led to more conflict. Eventually a number of their most skilled artisans developed the technique of carving blocks of granite; these blocks, they realized, were very difficult to carve, and so things carved into these blocks could be trusted in a way that the names in the cabinets could not.

It was infeasible to carve every name into the block, however, and to carve new blocks when the names changed. What they needed, they said, was some way to carve a signature of the names onto the block, such that if any one name changes, the signature would change; but if the names were the same, the signature would always be the same.

Eventually, one of their scientists, Ralph, developed a solution: the Merkle tree.

IV. The Merkle Tree

The Merkle tree behaves much like a Trie, but with a new rule: the drawers of each cabinet will not contain the location of the next cabinet, but rather the hash of all of the contents of the next cabinet.
Separately, we keep track of the location of each cabinet (using, of all things, a simple hash table):

    >>> hash3("dog")
    0x237
    >>> firstCabinetLocation = hashtable.get(firstCabinetHash)
    >>> firstCabinet = trie.find(firstCabinetLocation)
    >>> secondCabinetHash = firstCabinet.drawer(7).contents
    >>> secondCabinetLocation = hashtable.get(secondCabinetHash)
    >>> secondCabinet = trie.find(secondCabinetLocation)
    >>> thirdCabinetHash = secondCabinet.drawer(3).contents
    >>> thirdCabinetLocation = hashtable.get(thirdCabinetHash)
    >>> thirdCabinet = trie.find(thirdCabinetLocation)
    >>> thirdCabinet.drawer(2).contents
    "fido"

Remember our hash function? Earlier we talked about hashing simple values like “dog” and “cat”, but in truth you can hash anything, including other hashes or sets of hashes. What Ralph realized was that by keeping the hashes in the cabinets, you can create a “hash trail” which will change whenever any value changes (remember how websites store your passwords? Same idea). Here is how you update a value:

    >>> hash3("dog")
    0x237
    ### Find cabinet same as before
    >>> thirdCabinet.drawer(2).contents = "rover"
    ### But then you start working backwards...
    >>> thirdCabinetHash = hash3(thirdCabinet.drawers)
    >>> hashtable.set(thirdCabinetHash, thirdCabinetLocation)
    >>> secondCabinet.drawer(3).contents = thirdCabinetHash
    >>> secondCabinetHash = hash3(secondCabinet.drawers)
    >>> hashtable.set(secondCabinetHash, secondCabinetLocation)
    >>> firstCabinet.drawer(7).contents = secondCabinetHash
    >>> firstCabinetHash = hash3(firstCabinet.drawers)
    >>> hashtable.set(firstCabinetHash, firstCabinetLocation)
    >>> firstCabinetHash
    0x375

Now the final value, 0x375, is a “fingerprint” of the entire Merkle tree. You can save this fingerprint (or engrave it into a granite block), and know that if anyone changes any of the names in the drawers, the process of making the hashes will give a different result – you’ll know something has changed.

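For readers who want something they can run, the lookup and update walks above can be sketched in a few lines of Python. This is only one possible reading of the story, not Ralph's actual ritual: it assumes SHA-256 over a JSON dump of a cabinet's drawers as the cabinet hash, it files each cabinet directly under its hash in a plain dictionary (rather than keeping a separate location table), and every name in it (`store`, `put_cabinet`, `path_for`, `set_value`, `get_value`) is hypothetical. With only 512 leaf drawers, two different keys can land in the same drawer; the sketch ignores that.

```python
import hashlib
import json

DRAWERS = 8  # drawers per cabinet, as in the story
DEPTH = 3    # cabinets opened on the way to a name (8**3 = 512 leaf drawers)

# Hypothetical store: cabinet hash -> list of that cabinet's drawer contents.
store = {}


def put_cabinet(drawers):
    """Hash a cabinet's drawers, file the cabinet under that hash, return the hash."""
    digest = hashlib.sha256(json.dumps(drawers).encode("utf-8")).hexdigest()
    store[digest] = drawers
    return digest


def path_for(key):
    """Turn a key into DEPTH drawer numbers (a stand-in for the story's hash3)."""
    n = int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")
    return [(n // DRAWERS ** i) % DRAWERS for i in range(DEPTH)]


def _set(cabinet_hash, path, value):
    """Rebuild the cabinets along one path, bottom-up, and return the new hash."""
    drawers = list(store[cabinet_hash]) if cabinet_hash else [None] * DRAWERS
    head, rest = path[0], path[1:]
    # Inner drawers hold the hash of the next cabinet; the last drawer holds the name.
    drawers[head] = _set(drawers[head], rest, value) if rest else value
    return put_cabinet(drawers)


def set_value(root_hash, key, value):
    """Return the fingerprint (root hash) of the tree after setting key to value."""
    return _set(root_hash, path_for(key), value)


def get_value(root_hash, key):
    """Follow the hashes from the root cabinet down to the name, if there is one."""
    node = root_hash
    for drawer in path_for(key):
        if node is None:
            return None
        node = store[node][drawer]
    return node


root1 = set_value(None, "dog", "fido")
root2 = set_value(root1, "dog", "rover")
print(get_value(root2, "dog"))  # rover
print(root1 != root2)           # True: changing a name changes the fingerprint
```

Because every cabinet is filed under the hash of its contents, setting the dog's name back to "fido" reproduces the first fingerprint, which is the "same names, same signature" property the valley people asked for.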

Notice that this adds more steps compared to a simple Trie: you need to have a separate hashtable to keep track of locations. But what you get is security.

The programmer and the brogrammer walked up to get a closer look at the granite blocks, and to their surprise, on them they saw engraved a series of hashes! 0x736, 0x264, 0x123, and so on, with 0x542 being the most recent. They were amazed!

Nearby, they noticed some activity: one of this peculiar tribe wanted to prove that he had purchased a horse from another. He brought forward the name of the horse and his own name, set trie.set(horse, name) and through an elaborate ritual showed that his name, hashed with certain other names, with certain other names… voila! He arrived at 0x542, and thus all agreed that the horse was his. What a remarkable society, the programmer and brogrammer agreed.

There was something nagging at the programmer, though. This was a small tribe – only 512 members. As they grow, they will need a new hash function with a larger range – thousands, millions, billions. And so updating and verifying the values in the Merkle tree will become more and more expensive – from three cabinets to five, to ten, to sixty and beyond! And for what? Most of these drawers will be empty. It seems like an expensive system, slow and costly. Surely there must be a better way? If only there was a Practical Algorithm To Retrieve Information Coded In Alphanumeric…

V. The Patricia Tree

To gather their thoughts, the programmer and the brogrammer took a walk into the hills above the valley. “There must be some way to optimize this tree!” they thought to themselves. The brogrammer suggested they look at a few random hashes, to build some intuition:

    >>> hash8("cat")
    0x14350235
    >>> hash8("dog")
    0x14350762

Then the brogrammer got excited – he noticed that both of these hashes happened to start with the same numbers: 14350.
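The brogrammer's observation is easy to check mechanically. The helpers below are a small sketch with made-up names (`common_prefix`, `split_prefix`), working on the hash digits as plain strings:

```python
def common_prefix(a: str, b: str) -> str:
    """Return the longest run of leading characters that a and b share."""
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return a[:i]


def split_prefix(digits: str, prefix: str):
    """Split digits into (prefix, remainder); digits must start with prefix."""
    if not digits.startswith(prefix):
        raise ValueError("digits do not start with the given prefix")
    return prefix, digits[len(prefix):]


cat_hash = "14350235"
dog_hash = "14350762"

shared = common_prefix(cat_hash, dog_hash)
print(shared)                          # 14350
print(split_prefix(cat_hash, shared))  # ('14350', '235')
print(split_prefix(dog_hash, shared))  # ('14350', '762')
```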
With just these two entries, getting to the final drawer should only need two cabinets: one for 14350, and one for whatever was left: 235 or 762. This would be much faster than using eight cabinets. You could always add more cabinets later, but why make more than you need? On each drawer we tape a little slip of paper, where we write down the common prefix for that drawer. Finally, the first cabinet is actually just a single drawer. Looking up values would go like this:

    >>> hash8("dog")
    0x14350762
    >>> firstDrawerLocation = hashtable.get(firstDrawerHash)
    >>> firstDrawer = trie.find(firstDrawerLocation)
    >>> split(14350762, firstDrawer.commonPrefix)
    (14350, 762)
    >>> secondCabinetHash = firstDrawer.contents
    >>> secondCabinetLocation = hashtable.get(secondCabinetHash)
    >>> secondCabinet = trie.find(secondCabinetLocation)
    >>> secondDrawer = secondCabinet.drawer(7)
    >>> split(62, secondDrawer.commonPrefix)
    (62,)
    >>> secondDrawer.contents
    "fido"

The programmer got excited – she felt pretty good about this. It would make the algorithm a little trickier, to make sure that cabinets were created accordingly and that common prefixes are kept up-to-date, but nothing they couldn’t figure out. A little more work at the beginning to set this all up would save the valley people a lot of time over the long run.

The pair sat down and worked out the details of this new system, which they called the “Patricia Tree”. Satisfied, they descended to the valley and presented their work to the people there. The people were joyous; the slow Merkle tree had been a drag on their society. With the Patricia tree, they hoped, they would be able to advance their arts, sciences, and industry faster.

Satisfied, the programmer and the brogrammer left the valley. As they crested the ridge and began to make their way through the surrounding grassland, they heard a soft humming sound. Looking up, they saw a flying car sailing off into the horizon.

VI. Summary

What did we learn from this completely stylistically original story?

First, that hash tables, tries, Merkle trees, and Patricia trees all do essentially the same thing: they let you map keys to values. While there are differences between them, this is essentially what they do.

Second, in computer science, nothing is free (but some things are cheap). Everything has a trade-off. Hash tables are fast, but have some randomness. Tries are fully deterministic, but slower. Merkle trees have nice security properties, but use a more complicated algorithm and are slower to update. Finally, Patricia trees are faster than Tries and Merkle trees, but require an even more complicated algorithm.

Third, Patricia trees are useful for blockchains because they let you “prove” a potentially large amount of data is correct, without having to store all of that data. This is very convenient: you can have a big tree with a lot of data (such as all of the transactions in the last 24 hours), but you only have to store a few numbers (like 0x323757382) on the actual blockchain. You can keep the rest of the data on a regular database somewhere and know that no one will be able to tamper with it and get away with it. Note that here the blockchain is only part of the system: it is co-dependent on other data stores to function.

Fourth, the hash function is the magical machine that makes all of this possible. The design and implementation of hash functions has been the ongoing work of computer scientists for decades, and they are very hard to get right. You should take a moment and appreciate the years of work that made this magical technology possible.

kronosapiens · kronovet@gmail.com
I'm Daniel Kronovet, a data scientist living in Tel Aviv.