Programming Praxis

Back Home

May 12, 2010

Wow. Now I know the secret to getting lots of page hits and reader comments.

Thanks to all who sent their best wishes in comments and private emails. It’s comforting to know that so many of you care. Thanks also to Remco for posting exercises in my absence. Still, though I appreciate your words, what would really make me feel good is for you to solve an exercise and post the solution.

I’ve been home for a day now, and I’m fine. The blood clots are completely dissolved, and my blood chemistry is nearly correct to prevent future clots — another few days of medication and blood tests should get it exactly right. I’ll be returning to work next Monday, and I’ll likely return to blogging the week after that.

Thanks again,

Phil

Posted by programmingpraxis

Filed in Administrivia

3 Comments »

Taking A Break

May 9, 2010

I write this from my hospital bed, where I am recovering from blood clots in both lungs; my right lung was 100% blocked, my left lung 60%. I am doing well, and expect a complete recovery.

However, I will not be blogging for a while, probably for two to four weeks. Until I resume, your task is to visit Programming Praxis every Tuesday and Friday morning and do any exercise you have not yet done; I want to see those comment queues filling up.

Posted by programmingpraxis

Filed in Administrivia

21 Comments »

Integer Logarithms

May 7, 2010

The integer logarithm of a number n to a base b is the number of times b can be multiplied by itself without exceeding n.

The implementation of the ilog function in the Standard Prelude recently changed. The old version looked like this:

(define (ilog b n) (if (zero? n) -1 (+ (ilog b (quotient n b)) 1)))

That takes time O(n) as it repeatedly divides by b. A better version takes time O(log n) as it brackets the logarithm between two possible values then uses bisection to calculate the solution.

Your task is to write a program that calculates the integer logarithm of a number to a given base in time O(log n). When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

18 Comments »

Spectacular Seven

May 4, 2010

Steven Skiena has written a book, The Algorithm Design Manual, that is justly a favorite of Programming Praxis; it is an encyclopedia of common algorithms and data structures, with many pointers to original sources. Recently, I have been reading Skiena’s book, Calculated Bets, which describes a computer program, developed by Skiena and his students, for betting on the game jai alai.

Jai alai is similar to handball, played by teams of one or two players who alternately catch the ball in a basket worn on their wrist and throw it back to the front wall; a point is won when one team is unable to catch the ball and throw it back before it bounces twice, or for various other technical infractions. Although only two teams compete for each point, there are eight teams playing in a game; the first two teams start the game, with the remaining six teams forming a queue, and after each point the winner of the point scores, the loser goes to the back of the queue, and the first team in the queue competes against the previous winner. Each point has a value of 1 until each team has played once (that is, for the first eight points of a game), when the value of winning a point increases to 2. The team that first reaches seven points is the winner.

The purpose of the rule that increases the value of a point from 1 to 2 is to reduce the bias in favor of teams that start early in the queue (have a low “post position”). But, as Skiena points out, the rule isn’t perfect.

Your task is to write a program that simulates a large number of jai alai games and calculates the average winning percentage for each post position. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

5 Comments »

Integer Factorization

April 30, 2010

We have now discussed several different factorization algorithms: trial division, wheel factorization, Fermat’s method, Pollard’s rho method, Pollard’s p-1 method in both one-stage and two-stage variants, and the elliptic curve method in both one–stage and two–stage variants. So, if you must factor a large integer, where do you start?

The basic idea is to perform several methods, starting with simple methods and working up to more complicated ones until the factorization is determined. The exact choice of the methods to be used and the decision on when to abandon one method in favor of another is idiosyncratic at best. There are many possibilities, and many degrees of freedom, that suggest one alternative or another.

Your task is to write a program that factors large integers using some mix of the factorization methods we have studied. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

Modern Elliptic Curve Factorization, Part 2

April 27, 2010

The elliptic curve factorization algorithm is remarkably similar to Pollard’s p-1 algorithm:

define `p-1`(n, B₁, B₂)	1	define `ecf`(n, B₁, B₂, C)
q = 2	2	Q_x,z ∈ C // random point
for p ∈ π(2 … B₁)	3	for p ∈ π(2 … B₁)
a = ⌊`log`_p B₁⌋	4	a = ⌊`log`_p B₁⌋
q = q^{p^a} (`mod` n)	5	Q = p^a ⊗_C Q
g = `gcd`(q, n)	6	g = `gcd`(Q_z, n)
if (1 < g < n) return g	7	if (1 < g < n) return g
for p ∈ π(B₁ … B₂)	8	for p ∈ π(B₁ … B₂)
q = q^p (`mod` n)	9	Q = p ⊗_C Q
	10	g = g × Q_z (`mod` n)
g = `gcd`(q, n)	11	g = `gcd`(g, n)
if (1 < g < n) return g	12	if (1 < g < n) return g
return failure	13	return failure

Lines 3 and 8 loop over the same lists of primes. Line 4 computes the same maximum exponent. Lines 6 and 7, and lines 11 and 12, compute the same greatest common divisor. And line 13 returns the same failure. The structures of the two algorithms are identical. Also, note that both algorithms use variables that retain their meaning from the first stage (lines 2 to 7) to the second stage (lines 8 to 12).

Lines 2, 5, 9 and 10 differ because p-1 is based on modular arithmetic and ecf is based on elliptic arithmetic (note that the ⊗_C operator, otimes-sub-C, indicates elliptic multiplication on curve C). The most important difference is on the first line, where ecf has an extra argument C indicating the choice of elliptic curve. If the p-1 method fails, there is nothing to do, but if the ecf method fails, you can try again with a different curve.

Use of Montgomery’s elliptic curve parameterization from the previous exercise is a huge benefit. Lenstra’s original algorithm, which is the first stage of the modern algorithm, required that we check the greatest common divisor at each step, looking for a failure of the elliptic arithmetic. With the modern algorithm, we compute the greatest common divisor only once, at the end, because Montgomery’s parameterization has the property that a failure of the elliptic arithmetic propagates through the calculation. And without inversions, Montgomery’s parameterization is also much faster to calculate. Brent’s parameterization of the randomly-selected curve is also important, as it guarantees that the order of the curve (the number of points) will be a multiple of 12, which has various good mathematical properties that we won’t enumerate here.

There is a simple coding trick that can speed up the second stage. Instead of multiplying each prime times Q, we iterate over i from B₁ + 1 to B₂, adding 2Q at each step; when i is prime, the current Q can be accumulated into the running solution. Again, we defer the calculation of the greatest common divisor until the end of the iteration.

Your task is to write a function that performs elliptic curve factorization. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

9 Comments »

Modern Elliptic Curve Factorization, Part 1

April 23, 2010

We discussed Hendrik Lenstra’s algorithm for factoring integers using elliptic curves in three previous exercises. After Lenstra published his algorithm in 1987, mathematicians studied the algorithm extensively and made many improvements. In this exercise, and the next, we will study a two-stage version of elliptic curve factorization that features improved elliptic arithmetic and is much faster than Lenstra’s original algorithm. Today’s exercise looks at the improved elliptic arithmetic; we will look at the two-stage algorithm in the next exercise.

If you look at Lenstra’s original algorithm, much of the time is spent calculating modular inverses. Peter Montgomery worked out a way to eliminate the modular inverses by changing the elliptic arithmetic; instead of using the curve y² = x³ + ax + b, Montgomery uses the curve y² = x³ + ax² + x. Additionally, he does a co-ordinate transform to eliminate the y variable, so we don’t need to figure out which of the quadratic roots to follow. Finally, Montgomery stores the numerator and denominator of the a parameter separately, eliminating the modular inverse. We won’t give more of the math; for those who are interested, a good reference is Prime Numbers: A Computational Perspective by Richard Crandall and Carl Pomerance.

Here are the elliptic addition, doubling and multiplication rules for Montgomery’s elliptic curves:

A curve is given by the formula y² = x³ + (A_n / A_d – 2)x² + x (mod n).

A point is stored as a two-tuple containing its x and z co-ordinates.

To add two points P₁(x₁,z₁) and P₂(x₂,z₂), given the point P₀(x₀,z0) = P₁ – P₂ (notice that the curve is not used in the calculation):

x = ((x₁–z₁) × (x₂+z₂) + (x₁+z₁) × (x₂–z₂))² × z₀ mod n
z = ((x₁–z₁) × (x₂+z₂) – (x₁+z₁) × (x₂–z₂))² × x₀ mod n

To double the point P:

x = (x+z)² × (x–z)² × 4 × Ad mod n
z = (4 × Ad × (x–z)² + t × An) × t mod n where t = (x+z)² – (x–z)²

Multiplication is done by an adding/doubling ladder, similar to the method used in the Lucas chain in the exercise on the Baillie-Wagstaff primality checker; the Lucas chain is needed so that we always have the point P₁–P₂. For instance, the product 13P is done by the following ladder:

2P = double(P)
3P = add(2P, P, P)
4P = double(2P)
6P = double(3P)
7P = add(4P, 3P, P)
13P = add(7P, 6P, P)

Richard Brent gave a method to find a suitable elliptic curve and a point on the curve given an initial seed s on the half-open range [6, n), with all operations done modulo n:

u = s² – 5
v = 4s

A_n = (v–u)³ (3u+v)
A_d = 4u³v

P = (u³, v³)

Your task today is to write the three functions that add, double, and multiply points on an elliptic curve and the function that finds a curve and a point on the curve; the payoff will be in the next exercise that presents elliptic curve factorization. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

5 Comments »

145 Puzzle

April 20, 2010

Consider this math puzzle:

Form all the arithmetic expressions that consist of the digits one through nine, in order, with a plus-sign, a times-sign, or a null character interpolated between each pair of digits; for instance, 12+34*56+7*89 is a valid expression that evaluates to 2539. What number is the most frequent result of evaluating all possible arithmetic expressions formed as described above? How many times does it occur? What are the expressions that evaluate to that result?

Your task is to answer the three questions. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

11 Comments »

Expression Evaluation

April 16, 2010

The very first exercise at Programming Praxis evaluated expressions given in reverse-polish notation. In today’s exercise, we write a function to evaluate expressions given as strings in the normal infix notation; for instance, given the string “12 * (34 + 56)”, the function returns the number 1080. We do this by writing a parser for the following grammar:

expr	→	term
		expr + term
		expr – term

term	→	fact
		term * fact
		term / fact

fact	→	numb
		( expr )

The grammar specifies the syntax of the language; it is up to the program to provide the semantics. Note that the grammar handles the precedences and associativities of the operators. For example, since expressions are made up from terms, multiplication and division are processed before addition and subtraction.

The normal algorithm used to parse expressions is called recursive-descent parsing because a series of mutually-recursive functions, one per grammar element, call each other, starting from the top-level grammar element (in our case, expr) and descending until the beginning of the string being parsed can be identified, whereupon the recursion goes back up the chain of mutually-recursive functions until the next part of the string can be parsed, and so on, up and down the chain until the end of the string is reached, all the time accumulating the value of the expression.

Your task is to write a function that evaluates expressions according to the grammar given above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

3 Comments »

Traveling Salesman: Minimum Spanning Tree

April 13, 2010

We looked at the traveling salesman problem in two previous exercises. We also studied two algorithms for computing the minimum spanning tree. In today’s exercise, we will use a minimum spanning tree as the basis for computing a traveling salesman tour.

It is easy to convert a minimum spanning tree to a traveling salesman tour. Pick a starting vertex on the minimum spanning tree; any one will do. Then perform depth-first search on the tree rooted at the selected vertex, adding each vertex to the traveling salesman tour as it is selected. The resulting tour is guaranteed to be no longer than twice the optimal tour. Considering the speed of the algorithm, that’s a pretty good result.

Your task is to write a program that finds traveling salesman solutions using a minimum spanning tree as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

Programming Praxis

Back Home

May 12, 2010

Taking A Break

May 9, 2010

Integer Logarithms

May 7, 2010

Spectacular Seven

May 4, 2010

Integer Factorization

April 30, 2010

Modern Elliptic Curve Factorization, Part 2

April 27, 2010

Modern Elliptic Curve Factorization, Part 1

April 23, 2010

145 Puzzle

April 20, 2010

Expression Evaluation

April 16, 2010

Traveling Salesman: Minimum Spanning Tree

April 13, 2010

Categories

Archives

Archives