We studied one-dimensional cellular automata in a previous exercise. In today’s exercise, we will implement the famous two-dimensional cellular automaton by the British mathematician John Horton Conway, the Game of Life. Life was introduced by Martin Gardner in the October 1970 issue of Scientific American; I can remember coming home from the library and tracing a glider as it moved across my checkerboard.

The automaton exists on an infinite two-dimensional grid of cells, each either alive or dead. Each cell has a neighborhood of the eight cells horizontally, vertically or diagonally adjacent to it. A cell that is alive in one generation dies if it has less than two live neighbors or more than three live neighbors, otherwise is survives to the next generation; a cell that is dead becomes alive at the next generation if it has exactly three living neighbors. All births and deaths occur simultaneously from one generation to the next.

Mathematicians have shown that these simple rules evoke immense variety, and have built general-purpose computing structures from particular configurations of living cells. Forty-two years on, there are articles in scholarly journals, doctoral dissertations, and jillions of web sites that serve Life enthusiasts.

Your task is to write a program that computes and displays the history of a Life population. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Twin Primes

April 17, 2012

Pairs of prime numbers that differ by two are known as twin primes: (3,5), (5,7), (11,13), (17,19), (29,31), (41,43), (59,61), (71,73), …. Sometimes the list is given by the lowest number in the pair (A001359: 3, 5, 11, 17, 29, 41, 59, 71, …), sometimes by the highest number in the pair (A006512: 5, 7, 13, 19, 31, 43, 61, 73, …), sometimes by the number in between (A014574: 4, 6, 12, 18, 30, 42, 60, 72, …), and sometimes just as a list of primes (A001097: 3, 5, 7, 11, 13, 17, 19, 29, 31, 41, 43, 59, 61, 71, 73, …). All twin primes have the form 6k±1.

It is simple and fast to compute the twin primes less than n by a variant of the Sieve of Eratosthenes. The sieving primes are the primes less than the square root of n, excluding 2 and 3 which are not of the form 6k±1. In the first step, the primes of the form 6k−1 are sieved; first make a bitarray for the numbers 5, 11, 17, 23, 29, …, then, for each sieving prime p, find the smallest multiple of the prime in the list, strike it off the list, and strike each pth number off the list as well. In the second step, the primes of the form 6k+1 are sieved; first make a bitarray for the numbers 7, 13, 19, 25, 31, …, then, for each sieving prime p, find the smallest multiple of the prime in the list, strike it off the list, and strike each pth number off the list as well. In the third step, sweep through the pair of bitarrays and collect twin primes for each remaining pair of corresponding bitarray elements.

The smallest multiple of a prime p of the form 6k−1 is the modular inverse of 6 mod p; the smallest multiple of a prime p of the form 6k+1 is the modular inverse of −6 mod p. Be sure not to strike the prime p. We discussed the modular inverse in a previous exercise.

An alternative method to compute twin primes is to use the normal Sieve of Eratosthenes to compute the primes less than n, then sweep through the list comparing each prime to its successor. But the algorithm given above is faster, about twice as fast, because the two bitarrays are shorter than the single bitarray of the normal Sieve of Eratosthenes.

Your task is to write a function that finds the twin primes less than a given input n. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

We studied McNugget Numbers in a previous exercise. You may recall that McDonald’s Restaurants sell Chicken McNuggets in meal sizes of 6, 9 or 20 nuggets, and a McNugget Number is a number that can be formed by some combination of 6, 9 and 20. For instance, 100 is a McNugget Number because 100 = 5 ·20.

The previous exercise asked you to find the list of numbers that cannot be formed by some combination of 6, 9 and 20; those numbers, which are the Non-McNugget Numbers, are 1, 2, 3, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 22, 23, 25, 28, 31, 34, 37 and 43. Today’s exercise asks you to count the ways in which a McNugget number can be formed; for instance, you can form the number 100 in 5 different ways: 100 = 0 · 6 + 0 · 9 + 5 · 20 = 1 · 6 + 6 · 9 + 2 · 20 = 4 · 6 + 4 · 9 + 2 · 20 = 7 · 6 + 2 · 9 + 2 · 20 = 10 · 6 + 0 · 9 + 2 · 20.

Your task is to write a program that calculates the number of ways a number can be expressed as a McNugget Number, and to use your program to calculate the number of ways to express one million as a McNugget Number. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Galton

April 10, 2012

In the previous exercise we had one of the simulations from A. K. Dewdney’s Five Easy Pieces, and today we have another. You’ve all seen a Galton board, invented by the Victorian statistician Sir Francis Galton; a series of pegs is arranged in diagonal columns at the top, a marble is dropped into the center of the top row, and it falls through the rows, moving left or right at each peg, until it drops into a bin at the bottom of the board, one bin under each gap in the final row of pegs. Each individual marble can go left or right at each peg with equal probability, but the overall effect after several marbles are dropped is to form a bell-shaped curve in the bins at the bottom.

Your task is to write a program to simulate a Galton board, showing the bell-shaped distribution in a histogram. When you are finished you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Voters

April 6, 2012

It’s election year in the United States, and in today’s exercise we have an election simulation. It comes to us from chapter “Five Easy Pieces” in A. K. Dewdney’s book The Armchair Universe, which is a compilation of some of Dewdney’s “Computer Recreations” columns in Scientific American. Dewdney calls his simulation voters.

The simulation is simple. A rectangular x-by-y grid contains cells that represent voters. Each cell has eight horizontally, vertically or diagonally adjacent neighbors, with the grid wrapping around on all sides to form a torus. Each cell can contain one of two values, which we call 0 and 1, not necessarily in order, instead of Democrat or Republican. At each time step, one cell, chosen at random, changes its allegiance to that of one of its eight neighbors, chosen at random; in some cases that random choice leaves the allegiance unchanged.

The current state of the simulation is displayed on the screen, with easily distinguishable symbols for 0 and 1. The simulation continues until it reaches a steady state — totalitarianism — in which all cells have the same value. It is mesmerizing to watch the screen as voting blocks form and changing allegiances sweep across the grid.

Your task is to write a program that displays the simulation described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Cornacchia’s Algorithm

April 3, 2012

We have a fun little problem from number theory today. Take a minute and try to find x and y so that x2 + 4 y2 = 1733. If that’s too easy for you, try x2 + 58 y2 = 3031201.

Equations of the form x2 + d y2 = p, with p prime, can be solved by an algorithm developed by Giuseppe Cornacchia in 1908 (actually, a fellow named Smith developed the same algorithm in 1885, but his work seems to be forgotten). Here’s the algorithm, which assumes that p is prime and 0 < d < p:

1. If the jacobi symbol (−d/p) is negative, there is no solution; stop.

2. Compute x such that x2 ≡ −d (mod p). If there are two such values, choose the larger. Then set ap and bx.

3. While b2 > p, set ab and b = a mod b. The two assignments are done simultaneously, so the values on the right-hand sides of the two assignments are the old values of the variables. (You may notice that this is Euclid’s algorithm.)

4. After the loop of Step 3 is complete, if d does not evenly divide pb2 or if c = (pb2) / d is not a perfect square, there is no solution; stop. Otherwise, x = b and y = √c.

We can use Cornacchia’s algorithm with d = 1 to verify Fermat’s Christmas Theorem (because it was announced in a letter to Marin Mersenne on December 25, 1640) that all primes of the form 4k+1 can be represented as the sum of two squares; as usual, Fermat didn’t give the proof, which was finally published by Leonhard Euler in a letter to Goldbach in 1749 after seven years of effort.

Your task is to implement Cornacchia’s algorithm and use it to demonstrate that all primes of the form 4k+1 and less than 1000 can be written as the sum of two squares. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

We looked at the brute-force algorithm for the subset sum problem in the previous exercise. Today we look at a different algorithm that solves the same problem; the new algorithm is more efficient, but still exponential, with a time complexity of 0(n2n/2).

The new algorithm, called the meet in the middle algorithm, splits the input list into equal-size halves. The first half is subject to the same algorithm as the previous exercise, in which all subsets are generated, their sums computed, and the sums compared to the target. It is possible but unlikely that the target will be found in the first half. If not, the algorithm generates all subsets of the second half and checks each sum to see if the difference between target and sum was a sum in the first half, in which case the required subset has been found.

Your task is to implement the meet in the middle algorithm for solving the subset sum problem. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Subset Sum

March 27, 2012

The subset-sum problem asks you to find the subset of a set of integers that sums to a given target; for instance, given the set {267 439 869 961 1000 1153 1246 1598 1766 1922} and target 5842, the subset {869 961 1000 1246 1766} sums to 5842. This is a well-known NP problem, and the standard solution via dynamic programming takes time O(n2n). The basic idea is straight forward: generate a list of subsets, compute the sum of each, and return the one that sums to the target.

The dynamic programming solution has the same O(n2n) time complexity, but builds up the solution in stages. A matrix has n rows and 2n columns; the rows are marked with the n input values, the columns are marked with the various subset-sums, and each cell contains a list of items from the n values up to the current row that sums to the column total. We’ll look at an example: find the subset of items from {1 -3 4 2} that sums to 0. After the first element of the list is considered, the first row has two columns, the null column that we ignore and the column 1:

  1
1 1

With the second row, there are four columns: the null column that we ignore, and columns -3, 1, and 2:

  -3 1 2
1   1  
-3 -3 1 -3 1

When we add the third row, there are eight columns: the null column that we ignore, the six columns -3, -2, 1, 2, 4, and 5, and a hidden column for 1, which can be formed in two different ways, as 1 by itself and as the sum of -3 and 4:

  -3 -2 1 2 4 5
1     1      
-3 -3 -3 1 1      
4 -3 -3 1 1 -3 1 4 4 1 4

When we add the fourth row, there are sixteen columns, but only eleven appear in the table: -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, and 7. We ignore the null column, and there are four hidden columns corresponding to the following sum-pairs, only one of which appears in the table (it doesn’t matter which): 1 = -3 + 4, 2 = -3 + 1 + 4, 1 + 2 = -3 + 2 + 4, and 4 = -3 + 1 + 2 + 4:

  -3 -2 -1 0 1 2 3 4 5 6 7
1         1            
-3 -3 -3 1     1            
4 -3 -3 1     1 -3 1 4   4 1 4    
2 -3 -3 1 -3 2 -3 1 2 1 -3 1 4 1 2 4 1 4 2 4 1 2 4

The solution is the subset {-3 1 2}. I apologize for my lack of table-fu.

Your task is to write a function that solves the subset-sum problem. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Base-26 Arithmetic

March 23, 2012

A reader named Prashant recently wrote to suggest an exercise in base-26 arithmetic:

Write a function that takes two base-26 numbers in which digits are represented by letters with A=0, B=1, … Z=25 and returns their product using the same notation. As an example, CSGHJ × CBA = FNEUZJA.

Prashant was worried that the problem was specific to C/C++, but that’s not an issue.

Your task is to write the base-26 multiplication function. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

On February 15th, the New York Times published an article that described an attack on the RSA key-generation algorithm by Arjen Lenstra and others; Nadia Heninger and her friends did the same study, but have not yet published their work. The Times missed the essence of the Lenstra paper and reported that two out of every thousand internet passwords are insecure, which is incorrect; Dan Kaminsky gives a better description of the vulnerability, and explains why there is little to worry about.

The problem was not in the RSA algorithm itself, but in generating the keys that are used by the RSA algorithm. We studied the key-generation algorithm in a previous exercise. Briefly, the key-generation algorithm selects two large primes, usually called p and q, and uses them to create a public key and a private key; the public key is the product of p and q. The security of the RSA algorithm relies on the fact that it is difficult to factor the public key into its two components p and q. However, if you do, it is easy to determine the corresponding private key, and anyone who does so can decrypt any message thus encoded.

What Lenstra/Heninger found was that, due to a flaw in some implementations of the key-generation algorithm, multiple private keys used the same p. Here’s what Lenstra/Heninger and their colleagues did: First, they collected a large corpus of RSA public keys by scanning every IPv4 address on the internet using nmap; both teams collected somewhere around twelve million keys. Then, they computed the gcd of each key paired with every other key; where the gcd was not 1, it was one of the factors of the key, and the other factor could easily be found by division, and from the two factors the private key could be determined.

Today’s exercise challenges you to recreate the computation that Lenstra/Heninger used to find the factors. We’ll use this set of 21 public keys; these keys are small enough to be easily factored, but real keys are much bigger:

708894553 704488409 705674273
707478271 710167019 708093251
702013379 704030867 708691187
708374743 712748719 713581951
704387447 707015741 704308279
710872423 707947453 704996429
706323757 705031051 714623803

We’ll use two different methods to find the factorable keys. The first method, mentioned above, is to pair each key with every other key and compute the gcd. The time complexity of that algorithm is O(n 2), which is fine if n=21 but not so fine if n=12,000,000; 144trillion of anything will take a while. A second algorithm first multiplies all the keys, then for each key k computes gcd(k, kprod (mod k 2) / k). The second algorithm takes time O(n log log n); the product takes linear time, the gcd with each key takes linear time, and there is an additional factor of log log n because of the big-integer arithmetic on the product of the keys. The algorithm that Lenstra/Heninger actually used was neither of these; a paper by Daniel Bernstein gives the algorithm, but it’s rather more complicated than we care to deal with today, and the linear algorithm described above isn’t too bad, as long as n isn’t too large and as long as you have a good big-integer library.

Your task is to write functions that implement the two algorithms given above and use them to factor as many of the keys as you can. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2