(written by lawrence krubner, however indented passages are often quotes). You can contact lawrence at: lawrence@krubner.com, or follow me on Twitter.

I did not know this:

Modular arithmetic keeps our numbers bounded, which solves our first problem, while also making our attacker’s life more difficult: finding a preimage for one of our hashes now requires solving the discrete log problem, a major unsolved problem in mathematics, and the foundation for several cryptosystems.

Apparently it is difficult to reverse-engineer stuff stored modulo, unless you know the modulo.

Off topic but I find myself wondering if DNA and RNA use a finite field system of math? Clearly RNA sometimes needs to add stuff up (how many proteins to build) but I find it hard to believe that RNA can handle stuff like integers and floating point math. Living organisms, arising from a million accidents, must use some simple counting system, yes?

This is a very cool way to do something like a checksum on a message:

Fortunately, there’s a way we can fix both problems in one shot: by using modular arithmetic. Modular arithmetic keeps our numbers bounded, which solves our first problem, while also making our attacker’s life more difficult: finding a preimage for one of our hashes now requires solving the discrete log problem, a major unsolved problem in mathematics, and the foundation for several cryptosystems.

Here, unfortunately, is where the theory starts to get a little more complicated – and I start to get a little more vague. Bear with me.

First, we need to pick a modulus for adding blocks together – we’ll call it q. For the purposes of this example, let’s say we want to add numbers between 0 and 255, so let’s pick the smallest prime greater than 255 – which is 257.
We’ll also need another modulus under which to perform exponentiation and multiplication. We’ll call this p. For reasons relating to Fermat’s Little Theorem, this also needs to be a prime, and further, needs to be chosen such that p – 1 is a multiple of q (written q | (p – 1), or equivalently, p % q == 1). For the purposes of this example, we’ll choose 1543, which is 257 * 6 + 1.

Using a finite field also puts some constraints on the number, g, that we use for the base of the exponent. Briefly, it has to be ‘of order q’, meaning that gq mod p must equal 1. For our example, we’ll use 47, since 47257 % 1543 == 1.

So now we can reformulate our hash to work like this: To hash a message block, we compute gb mod p – in our example, 47b mod 1543 – where b is the message block. To combine hashes, we simply multiply them mod p, and to combine message blocks, we add them mod q.
Let’s try it out. Suppose our message is the sequence [72, 101, 108, 108, 111] – that’s “Hello” in ASCII. We can compute the hash of the first number as 4772 mod 1543, which is 883. Following the same procedure for the other elements gives us our list of hashes: [883, 958, 81, 81, 313].

We can now see how the properties of the hash play out. The sum of all the elements of the message is 500, which is 243 mod 257. The hash of 243 is 47243 mod 1543, or 376. And the product of our hashes is 883 * 958 * 81 * 81 * 313 mod 1543 – also 376! Feel free to try this for yourself with other messages and other subsets – they’ll always match, as you would expect.

Post external references

  1. 1
    http://blog.notdot.net/2012/08/Damn-Cool-Algorithms-Homomorphic-Hashing
Source