Computer Literacy: 4
In the second part of this series we explored the binary search algorithm as a method for quickly finding a card in a drawer. If you’ve forgotten it, go have a look and then come back. Don’t worry, we’ll wait for you.
The aforementioned solution is a lovely thing, but it has one major drawback: it requires that the drawer be in order before the search begins. This doesn’t sound so bad, but it’s actually quite difficult to keep the drawers in order because the numbered slots in which the cards are stored — unlike in a typical filing cabinet — are inflexible and immovable11. The reasons for this awful design have to do with the physical structure of computer memory. This is one of those cases where we must overcome an unfortunate but unavoidable limitation to make things work.. That is to say that when we place a new card in the middle of a drawer, we cannot squeeze it between two others; instead, we must move all the cards higher than it backward by one slot, one at a time.

Pierre’s cousin, François.
Helmut’s strategy for dealing with this problem was to break up the cards into subsets kept in different drawers, one for A-L and one for M-Z. This reduced the difficulty of re-ordering them by reducing the number of cards that had to be handled while adding a new one. It worked well enough while there were about as many cards in each range, but it went badly when a large number of refugees from Alphastan, all of them with A- names, moved to Pierre’s district.
Around the Clock Filing
In order to explain the technique Adalina used to overcome this difficulty, I must first explain modulus arithmetic. There’s no need for alarm. We all use modulus arithmetic22. Modular arithmetic was rigorously codified by Carl Friedrich Gauss in his 1801 book Disquisitiones Arithmeticae, but it’s actually quite old., whether or not we know that’s what it’s called. It’s the kind of “wrap-around” arithmetic we use to figure out what time to drop round for tea.
As an example of “clock modulus,” on an old twelve-hour clock it will be four o’clock in the afternoon33. Tea time! Did you bring any cakes? nine hours after it’s seven o’clock in the morning. We know this because we understand that there are only twelve numbers on the clock’s face, and that we should start again at the beginning after we’ve “wrapped around” the end44. This is also how we figure out the “remainder” after we divide one number by another. Division can be performed by counting up to the dividend around a special clock that’s numbered from zero to the divisor (i.e. 500/9 can be computed by counting around a nine-digit clock face 500 times. The quotient is the number of times around the clock, the remainder is the last number reached on the clock’s face)..
A mathematician would call the numbers on the face of the clock a “finite set55. The geekiest among you are muttering something about Georg Cantor right now. I can hear you.,” which is what we’re going to do from here on out.
The reason we need to be clear about this before going on is that just about all the techniques used to optimize the storage and retrieval of information are really just clever metaphors for modulus arithmetic.
Making a Hash of the Lot
One of the best algorithms for keeping data evenly distributed between a set of drawers is called a hash table66. This, surprisingly, has nothing at all to do with a hash pipe.. The purpose of a hash table is to create a fairly small number of drawers into which we can evenly distribute a much larger number of cards without stopping to look in the drawers to figure out which ones are more or less full. This is kind of tricky, so we do what engineers have always done under such circumstances: we cheat.
A hash table uses an algorithm called a hash function to map each possible card to a particular drawer, and then relies on probability to ensure that the actual cards that get filed are well distributed by that function. Hash functions can be complicated, but we’re going to demonstrate the idea with a couple of very simple ones.
The first time Adalina used a hash table, she instructed the carpentry staff to build seven drawers for Pierre, then had him file each card based on its first letter77. Remember: the alphabet is a set of 26 letters. using the above-mentioned clock trick to calculate the modulus of that letter’s position in the alphabet against the number of drawers. This allowed Pierre to quickly choose a drawer when a new card came in, and mostly guaranteed that the cards would end up spread evenly between the drawers. If this sounds confusing just have a look at the following figure:
I hear you saying, “Gosh, that’s pretty cool, but a sudden influx of Alphastanis would still cause havoc!” You’re right, of course, which is why actual hash functions are somewhat more complicated than what’s described above. The simplest trick used to improve this kind of algorithm is a variation on an ancient arithmetic toy of mystics and astrologers that has been repurposed by the Abstractionists.
Like Numerology, Only Useful
The Greeks learned and practiced arithmetic with small stones called “psephos88. The Romans started with the Greek system, which is why the word “calculate” is based on the Latin word for small stones, “calculi.”.” Among the things they did with these pebbles was add a bunch of them together and then use the clock trick99. From now on “applying the clock trick” will be called “calculating the modulus” (which sounds a bit like a euphemism for masturbation) or abbreviated to “something modulo some number,” where “some number” stands for how many numerals are on the clock’s face. (before there were clocks; they were clever, these Greeks) to figure the modulus of the sum against some other number. They called this process isopsephy1010. (iso, “equal,” plus psephos, “pebble.”). It’s also the heart of an old occult practice that is now called numerology or arithmancy1111. From the Greek αριθμομαντεια (“arithmomanteia”), literally “divination by numbers,” which would be a great name for a prog rock record.,1212. The Hebrew term for this, Gematria, is a deformation of the Greek term geometry..
For those you who’ve never seen this kind of thing, it involves assigning a number from, say, 0-9 to each letter in the alphabet using modulus arithmetic, then adding together all the numbers for all the letters in the name, also modulo 9. The result is a single number that is thought to be symbolic of something or other1313. This sort of thing has been going on since the earliest forms of writing were invented. Many ancient writing systems had a numerological advantage because the symbols of the alphabet were also used as numbers. In the old Phoenician system, for instance, the letter aleph (which is the ancestor of the letter “A”) was also the number one..
Adalina’s way of dealing with the Alphastani exodus was to use a something very similar to numerology. Pierre decides which of the seven drawers should receive which card by assigning a number to each letter of the name on the card based on that letter’s position in the alphabet, adding all those numbers together, and then calculating the modulus of the resulting number against the number of drawers. This result is a surprisingly good distribution between the drawers, even if every name starts with an A1414. I encourage you to play around with this idea on a piece of paper to get a sense of how it works in practice.:
Al’Bob
A (1) + L (12) + B (2) + O (15) + B (2) = 32
32 modulo 7 = drawer 4
Al’Joe
A (1) + L (12) + J (10) + O (15) + E (5) = 43
43 modulo 7 = drawer 1
Al’Tom
A (1) + L (12) + T (20) + O (15) + M (13) = 61
61 modulo 7 = drawer 5
There are more — and way more complicated — hash functions, but they all come down to variations on this theme.
In the next installment we’ll take apart another common algorithm, the binary tree.