## Hamming Codes

### May 22, 2012

Hamming codes, which were invented by Richard Hamming in 1950, are a method of transmitting data over a noisy channel that give the recipient the ability to correct simple errors. The sender adds parity bits to the transmission stream so that, when the data bits and parity bits are combined, any single-bit error, in either the data bits or parity bits, can be identified and corrected. The number of parity bits that are required is given by the Hamming rule d + p + 1 ≤ 2p where d is the number of data bits and p is the number of parity bits. The length of the code word c, which combines the data bits and parity bits, is d + p, and a Hamming code is described by (c,d). We will illustrate using a 4-bit data word, which requires 3 parity bits to satisfy the Hamming rule (2 is insufficient because 4+2+1>4, but 3 is sufficient because 4+3+1≤8) and is described by (7,4).

A particular instance of a Hamming code uses two matrices, G the generator matrix and H the syndrome matrix. Here are sample generator (left) and syndrome (right) matrices for a (7,4) Hamming code:

```1 0 0 0 1 1 1    1 0 1 1 1 0 0 0 1 0 0 0 1 1    1 1 0 1 0 1 0 0 0 1 0 1 0 1    1 1 1 0 0 0 1 0 0 0 1 1 1 0```

The generator matrix is denoted [ I : A ] and consists of an identity matrix I in the left-most four columns (the number of data bits) and a parity coding matrix A in the right-most three columns (the number of parity bits). There is no formula for the A matrix; it must be constructed so that each data bit is checked by two or more parity bits, in such a way that no combination of parity bits overlaps a data bit. The syndrome matrix is denoted [ AT : I ] and consists of the transpose of the parity coding matrix A in the left-most four columns and the identity matrix in the right-most four columns.

A data word is encoded by multiplying it by the generator matrix, with all arithmetic done modulo 2; we gave an algorithm for matrix multiplication in a previous exercise. For instance, the data word [1 0 0 1] is encoded as [1 0 0 1 0 0 1] like this:

```                                  | 1 |                                   | 0 |               | 1 0 0 0 1 1 1 |   | 0 | | 1 0 0 1 | * | 0 1 0 0 0 1 1 | = | 1 |               | 0 0 1 0 1 0 1 |   | 0 |               | 0 0 0 1 1 1 0 |   | 0 |                                   | 1 |```

Decoding is the inverse operation, with the syndrome matrix multiplied by the encoded data:

```                    | 1 |                     | 0 | | 1 0 1 1 1 0 0 |   | 0 |   | 0 | | 1 1 0 1 0 1 0 | * | 1 | = | 0 | | 1 1 1 0 0 0 1 |   | 0 |   | 0 |                     | 0 |                     | 1 |```

If all of the result bits are zero, then the transmission succeeded without errors, and the input code is the first four bits of the encoding [1 0 0 1]. But if there was a transmission error, the syndrome points to it. For instance, if the recipient received the tranmission as [1 0 1 1 0 0 1], then the syndrome computes as [1 0 1], which matches the third column of the H matrix, indicating that the third bit of the transmission was in error, so instead of [1 0 1 1] the message is [1 0 0 1], which is correct.

Your task is to write functions that encode and decode a message using Hamming codes as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.