This is Part 3 of a five part series:
- Introduction
- Encryption and Hashes
- Simple Hashes and Collisions (you are here)
- Reduction Functions
- Rainbow Tables and Chains
Geoff Kuenning, a computer science professor at Harvey Mudd College, has a great web page about hashes as a part of one of his classes. Let’s look at two of the three simple hash functions he presents (we’ll use one of these for our rainbow table).
If you haven’t looked at the spreadsheet yet, now is probably a good time to do so. It is broken into tabbed pages. The first four tabs are dedicated to Kuenning’s three hash functions.
When we look at the hash functions, one of the things we want to consider is the number of collisions that occur. Part of the nature of hash functions is that multiple input values can produce the same output result. When this happens, it is known as a collision. (If you remember high-school trigonometry, this is like the sin() and cos() functions. sin(0) and sin(180 degrees) both equal zero.)
Note: My nomenclature differs slightly from Kuenning’s. I use H() to indicate the hash function, h to indicate the hashed value, and p to indicated the unencrypted password (aka plaintext). On this page only, I’ll make p and h orange so you remember that we’re starting with one and calculating the other.
Hash Function 1: The Division Method
The division method simply uses the modulo function:
h = p mod m
where m is a prime number. (m should be far from a power of 2, but for our purposes it doesn’t matter.)
here is a table of hash values for plaintext values from 0-19, with a m value of 13:
p h p h p h p h 0 0 5 5 10 10 15 2 1 1 6 6 11 11 16 3 2 2 7 7 12 12 17 4 3 3 8 8 13 0 18 5 4 4 9 9 14 1 19 6
Lousy for encryption, don’t you agree? But, it is a hash – you can’t tell what the plaintext value is if you only know the result. And, there are collisions.
I’m going to skip the next function that Kuenning presents, the Knuth variant on division. It is easy to understand, and is included in the spreadsheet if you want to see how it compares with the other two.
Hash Function 2: The Multiplication Method
This hash function is a little more involved, but is still simple enough to implement easily in a spreadsheet. Kuenning conveniently breaks it into three calculations. This makes it easier to understand (and easier to program).
s = p*A x = fractional part of s h = floor(m*x)
Below is a table of the results of this hash applied to the numbers 0-19, with A set to 6.213335, m set to Kuenning’s recommended value of (SQRT(5) – 1)/2, and x truncated to 4 digits (if s = 1.234567, x is then 2345)
p: s: x: Hash: p: s: x: Hash: 0 0 0 0000 10 62.13335 1333 0823 1 6.213335 2133 1318 11 68.346685 3466 2142 2 12.42667 4266 2636 12 74.56002 5600 3460 3 18.640005 6400 3955 13 80.773355 7733 4779 4 24.85334 8533 5273 14 86.98669 9866 6097 5 31.066675 666 0411 15 93.200025 2000 1236 6 37.28001 2800 1730 16 99.41336 4133 2554 7 43.493345 4933 3048 17 105.626695 6266 3872 8 49.70668 7066 4367 18 111.84003 8400 5191 9 55.920015 9200 5685 19 118.053365 533 0329
A few points are worth noting:
- We don’t see any collisions here. However, the spreadsheet shows that there are collisions.
- No matter what integer we use as the source, the resulting hash will always be a fixed length: 4 digits. In other words, the hash will always be less than or equal to 9999. (Actually, it will be less than 6180).
- Because the hash is always an integer between 0 and 6180, we know that if we have a set of over 6180 numbers, we are guaranteed to have collisions.
- Zero does not hash particularly well with this algorithm.
This is a great hash for creating a simple rainbow table to learn how they work. Before we do so, however, let’s look at the notion of reduction functions in Part 4.
3 thoughts on “Rainbow Tables – Part 3 (Simple Hashes and Collisions)”