One Million Hits

March 29, 2013

Programming Praxis passed one-million all-time hits yesterday at 9:22pm GMT (that’s 4:22pm CDT where I live). My thanks to all of you for making this happen.

Your task is to write a program that outputs the number 1000000; be as fun and creative as you can. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

We looked at this problem, from the Google Code Jam Qualification Round Africa 2010, in a previous exercise:

Store Credit: You receive a credit C at a local store and would like to buy two items. You first walk through the store and create a list L of all available items. From this list you would like to buy two items that add up to the entire value of the credit. The solution you provide will consist of the two integers indicating the positions of the items in your list (smaller number first). For instance, with C=100 and L={5,75,25} the solution is 2,3; with C=200 and L={150,24,79,50,88,345,3} the solution is 1,4; and with C=8 and L={2,1,9,4,4,56,90,3} the solution is 4,5.

The solution we gave there used two nested loops to look at all combinations and return the first that solved the problems. That takes O(n2) time and O(1) space. But there are other solutions. You could insert each item into some flavor of balanced binary search tree in a first pass, then check the “conjugate” of each item in a second pass; that takes O(n log n) space for the tree plus O(n log n) for each of the two passes. Using a hash table instead of a balanced binary search tree reduces the time and space requirement to O(n), assuming that each hash table lookup is O(1), which is not necessarily true. A fourth option is to sort value/position pairs by increasing value, then use binary search to find matches; that takes O(n log n) for the sort plus O(n log n) for the binary searches, but requires only O(1) space beyond the space used for the input.

Your task is to write the four programs described above; bonus points for finding a solution with some other space/time complexity (is there an exponential algorithm that solves the problem?). When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Jumping Jack

March 22, 2013

We have a fun little programming puzzle today; I found it here, but don’t know the original source of the puzzle:

Jack starts at 0 on the number line and jumps one unit in either direction. Each successive jump he makes is longer than the previous by 1 unit, and can be made in either direction. Write a program that takes a number and returns the minimum number of jumps Jack makes to reach that number.

For example, it takes seven jumps to reach 12, first a jump left to -1, then a jump right to 1, a jump right to 4, a jump right to 8, a jump right to 13, a jump right to 19, and a jump left to 12. To go to -12, simply reverse the direction of each jump.

Your task is to write programs that compute the minimum number of steps Jack must take to reach his target and that compute the actual steps. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Quadratic Sieve

March 19, 2013

One of the most powerful factoring methods is the quadratic sieve, invented as a “thought experiment” by Carl Pomerance in the early 1980s; it is routinely used to factor semi-primes up to a hundred digits, sometimes more. The quadratic sieve was famously used to factor RSA-129, a challenge number given by Martin Gardner in the August 1977 edition of Scientific American; using the methods available at the time, it was estimated that factorization would take forty quadrillion years, but in fact the number was factored in April 1994, revealing the encoded message “The magic words are squeamish ossifrage.”

Today’s exercise presents a basic version of the quadratic sieve; we will present more powerful (and more complicated) variants in future exercises. We will be following the thesis Factoring Integers with the Self-Initializing Quadratic Sieve by Scott Contini, which is something of a bible for implementers of the quadratic sieve; Contini’s thesis is available at http://www.crypto-world.com/documents/contini_siqs.pdf. Today’s exercise is drawn from Figure 2.1 of the thesis.

The quadratic sieve shares its basic structure with other factoring methods such as Dixon’s method and the continued fraction method that we have previously studied (today’s exercise will probably make more sense if you read those exercises first). The idea is that if x2y2 (mod n) with x ≢ ± y (mod n), then gcd(x±y, n) are two factors of n; about half the time, the two factors will be 1 and n, but the other half of the time the two factors will be non-trivial. All three methods compute a series of relations of the form x2y (mod n) where y is smooth over a relatively small factor base; the factored y are then combined using linear algebra to find x2y2 (mod n) and thus perform the factorization. The three methods differ in the way they compute the relations: Dixon’s method computes them at random, the continued fraction method finds them in the convergents of the square root of n, and the quadratic sieve finds them by sieving a polynomial of the form g(x) = (x + b)2n, where b is the smallest integer greater than the square root of n. The exponent 2 in function g is the “quadratic” in the name of the method.

Contini gives a description of the basic quadratic sieve algorithm that we adapt here:

Compute startup data: Choose f, the upper bound for factor base primes, and m, which is half the size of the sieve. Let b = ⌈√n⌉ and g(x) = (x + b)2n. Determine the factor base primes p < f such that the jacobi symbol (n/p) is 1 indicating that there is a solution to t2n (mod p); also include 2 in the factor base. For each factor base prime p, compute t, a modular square root of n (mod p). Then compute and store soln1p = tb (mod p), soln2p = −tb (mod p) and lp = ⌊log p⌉ (rounding off) for each factor base prime p.

Perform sieving: Initialize a sieve array to 0’s. For each odd prime p in the factor base, add lp to the locations soln1p + ip and soln2p + ip of the sieve array for i = 0, 1, 2, …. For the prime p = 2, sieve only with soln1p.

Trial division: Scan sieve array for locations x that have accumulated a value of a least log 2 xn minus a small error term. Trial divide g(x) by the primes in the factor base. If g(x) factors into primes less than f, then save smooth relation. After scanning entire sieve array, if there are more smooth relations than primes in the factor base, then go to linear algebra step. Otherwise, increase f or m or both and return to sieving step.

Linear algebra: Solve the matrix as in Dixon’s method and the continued fraction method. For each null vector, construct relation of form x2y2 (mod n) and attempt to factor n by computing gcd(xy, n). If all null vectors fail to give a non-trivial factorization, increase f or m or both and return to sieving step.

Let’s step back and look at the algorithm from a higher perspective. We are looking for relations of the form y2g(x) (mod n) where g(x) is smooth so we can combine the g(x) using linear algebra to find a set of relations for which the g(x) form a perfect square, which will give us an x and y that we can use to compute a factor of n. Sieving the polynomial g(x) is done in the same way as sieving for primes in the Sieve of Eratosthenes; in fact, in an optimized Sieve of Eratosthenes over the odd numbers starting from 3, the sieving is performed over the polynomial 2x+1.

The use of logarithms in the sieving is an optimization; we could equally sieve by the primes themselves if we chose. But if we did that, we would have to sieve by all the prime powers as well as the primes, and we would have to keep track of which primes and prime powers appeared at each item of the sieve, which takes lots of memory and limits the size of the sieve. Using logarithms keeps memory consumption down, and the small error factor allows us to ignore prime powers. Note that the logarithms are rounded to integers, so we don’t need the space to store floats or doubles.

The sieving range is b ± m. Any relations found in the left half of the sieve must have a factor of −1 added to their factorizations, which is then treated as any other prime in the factor base during the linear algebra step.

The only other thing to note is that, if you are clever, when increasing f or m after a failure, you only need to compute the new pieces, assuming you saved the old pieces from sieving step.

A complete, worked example is shown on the next page.

Your task is to write a program to factor integers using the quadratic sieve as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2 3

Buffon’s Needle

March 15, 2013

Yesterday was pi day, so today we will estimate pi by a method that dates to 1777: if you randomly drop needles onto a flat surface with lines separated by the length of a needle, then the number of needles dropped, divided by the number of needles that intersect a line, will equal pi. You can determine if a needle intersects a line with a little bit of trigonometry. The method is called Buffon’s needle because it was discovered by the French naturalist Georges-Louis Leclerc, the Comte de Buffon. We did a similar exercise to this one a few years ago.

Your task is to write a program that calculates an approximate value of pi by simulating the dropping of a million needles. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

An Array Of Two Symbols

March 12, 2013

I found today’s exercise on a discussion forum for learning programmers. It looks to me like homework, but the posting was old enough that I don’t think we have to worry about doing someone’s homework for them — unless the teacher reuses the same problems from one year to the next.

You are given an array of length n that is filled with two symbols; all m copies of one symbol appear first, at the beginning of the array, followed by all nm copies of the other symbol. You are to find the index of the first copy of the second symbol, m, in time O(log m).

The obvious solution is binary search. Pick the midpoint of the array; if the item there is the first symbol, search recursively in the right half of the array, otherwise search recursively in the left half of the array, stopping when the sub-array has been reduced to a single item. But that takes time O(log n), which violates the conditions of the exercise.

Your task is to write the O(log m) solution. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Knight Moves

March 8, 2013

[ Today’s exercise comes from “Learn You a Haskell for Great Good” written by Miran Lipovača; the solution was written by Kyle Ilantzis. Guest authors are always welcome; if you are interested, contact me at the email address on the menu bar. By the way, that’s a fine place to start if you want to learn about Haskell. ]

This exercise comes from “Learn You a Haskell for Great Good!” by Miran Lipovača. Lipovača demonstrates a solution to find out if a single knight piece on a chessboard can reach a position in three moves. As an exercise he suggests that we try to find out if that knight can reach a position on the board in n moves and if so, what are the moves that the knight should make.

Your task is to write a function that given a starting position, ending position, and n number of moves to take, returns a list of all possible paths that that knight piece could take to reach the goal. If some input has no solution then simply return an empty list. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Dutch National Flag

March 5, 2013

A famous problem given by Edsgar Dijkstra is to sort an array of red, white and blue symbols so that all reds come together, followed by all whites, followed finally by all blues; it’s called the Dutch National Flag problem because the Dutch flag consists of three stripes with red at the top, blue at the bottom, and white in the middle. You are allowed to scan through the array only once, and the only operations permitted are to examine the color of the symbol at a given array location and to swap the symbols at two locations.

Your task is to write a program to perform the sort given above. When you are finished, you are welcome to read or run a suggested solution, or to post your solution or discuss the exercise in the comments below.

Pages: 1 2

We examined the Baillie-Wagstaff probabilistic primality checker based on Lucas pseudoprimes in a previous exercise, where we got it wrong; among other things it incorrectly reported that 3825123056546413051 = 149491 × 747451 × 34233211 was prime. We’ll fix that today. Robert Baillie and Samuel Wagstaff Jr. describe the test in their article “Lucas Pseudoprimes” in Mathematics of Computation, Volume 35, Number 152, pages 1391-1417, October 1980; see also Thomas Nicely’s web page devoted to the test. This is the same algorithm used in the PrimeQ function in Mathematica. If at any point the math gets daunting, feel free to skip ahead to the code, which is short and simple.

Lucas numbers are defined by a recurrence formula similar to Fibonacci numbers, where Ln = Ln−1 + Ln−2 with L1 = 1 and L2 = 3; the Lucas numbers are 1, 3, 4, 7, 11, 18, 29, 47, 76, 123, … (Sloane’s A000204). Lucas numbers have the rather startling property that, if n is prime, Ln ≡ 1 (mod n). But the converse is not true, and composite numbers n such that Ln ≡ 1 (mod n) are known as Lucas pseudoprimes; the first few Lucas pseudoprimes are 705, 2465, 2737, 3745, 4171, … (Sloane’s A005845).

Lucas numbers are a special case of Lucas sequences. If P and Q are integers such that the discriminant D = P2 − 4Q, then the roots of x2Px + Q = 0 are a = (P + √D) / 2 and b = (P − √D) / 2 (that’s the quadratic equation you learned in middle school). There are two Lucas sequences Un(P,Q) = (anbn) / (ab) and Vn(P,Q) = an + an for n ≥ 0, which can be computed by the recurrence equations Um(P,Q) = PUm−1(P,Q) − QUm−2(P,Q) and Vm(P,Q) = PVm−1(P,Q) − QVm−2(P,Q). The Lucas numbers are given by the sequence V(1, -1) and the Fibonacci numbers are given by the sequence U(1, -1).

It is possible to compute the nth element of a Lucas sequence in logarithmic time rather than linear time using a double-and-add algorithm similar to the peasant algorithm for multiplication; such a computation is known as a Lucas chain. The rules are U2n = Un Vn, U2n+1 = Un+1 VnQn, V2n = Vn2 − 2 Qn, and V2n+1 = Vn+1 VnP Qn. For our purposes, we will be making all computations mod n.

Thus we have a method to determine if a number n is a Lucas pseudoprime. Choose a suitable P and Q, use a Lucas chain to compute Un(P,Q), and declare n either prime or a Lucas pseudoprime if Un(P,Q) ≡ 0 (mod n). There are several methods to compute suitable P and Q, of which the most commonly-used is due to John Selfridge: choose D as the smallest number in the sequence 5, -7, 9, -11, 13, -15, 17, … for which the Jacobi symbol (D / n) = -1, then set P = 1 and Q = (1 – D) / 4.

There is a strong version of the Lucas pseudoprimality test that bears the same relationship to the standard version of the Lucas pseudoprimality test described above as Gary Miller’s strong Fermat pseudoprimality test bears to the standard Fermat pseudoprimality test. Represent n as d · 2s + 1, then compute Ud(P,Q) and Vd(P,Q). Then n is either prime or a Lucas pseudoprime if Ud(P,Q) ≡ 0 (mod n) or if Vd 2r(P,Q) ≡ 0 (mod n) for any r with 0 ≤ r < s.

Finally, the Baillie-Wagstaff test of the primality of n consists of the following steps: 1) if n is not an integer, report a domain error; 2) if n < 2, report n is not prime; 3) if n is a square, report n is not prime; 4) if n is divisible by any primes less than some convenient limit (e.g. 100 or 1000), report n is not prime; 5) if n is not a strong Fermat pseudoprime to base 2, report n is not prime; 6) (optional, added by Mathematica) if n is not a strong Fermat pseudoprime to base 3, report n is not prime; 7) if n is not a Lucas pseudoprime (standard or strong), report n is not prime; 8) otherwise, report n is prime.

The Baillie-Wagstaff pseudoprimality test, whether or not the base-3 test is included and whether the standard or strong Lucas test is chosen, is extremely strong: there are no known errors. Exhaustive checks have been performed to 1017, and the test has been used for years in several high-quality primality proving programs that would have found an error if one occurred. There is a $620 reward for a counter-example to the test ($500 from Selfridge, $100 from Wagstaff, and $20 from Carl Pomerance), plus the certain fame that will accrue to the finder.

Your task is to write a Baillie-Wagstaff primality checker as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2