Lenstra’s Algorithm
August 4, 2009
Hendrik Lenstra devised the elliptic curve factorization algorithm in 1987, an algorithm that is simultaneously elegant and of immense practical importance. This exercise describes his original algorithm. With some algorithmic tweaks that we won’t discuss here, Lenstra’s algorithm is generally quick to find factors in the 20- to 25-digit range, slower to find factors in the 40-digit range, and ineffective if the factors are much larger than about 50 digits; at this writing, the largest factor found by the elliptic curve algorithm has 67 digits.
Lenstra’s elliptic curve factorization algorithm works by searching for a point on the elliptic pseudo-curve Ea,b (mod n), where n =pq, that exists modulo p but not modulo q, or vice versa. In the previous exercise, we searched for that point by picking an initial point and repeatedly adding it to multiples of itself until the elliptic algebra failed.
Instead of repeated addition, Lenstra speeds the search by borrowing from Pollard’s p-1 method the idea of searching over a large product of small integers. Pollard’s method uses the least common multiple of numerous small integers, but Lenstra instead uses a large factorial. For instance, if you multiply 10000! by a point on an elliptic pseudo-curve, you are likely to find a factor of a 50-digit number somewhere in the middle of the computation. But Lenstra’s method admits a continuation that Pollard’s lacks; if one elliptic curve fails, you can choose another and try again. Or you can also increase the multiplier; if you don’t find a factor with a limit of 10000! and 1000 curves, you can try again with a limit of 1000000! and 1000000 curves.
Of course, there are some details to fill in, and instead of computing 10000! and then performing the elliptic multiplication, the work is done in small steps, checking the elliptic algebra after each. Here is Lenstra’s original algorithm, searching a single curve to a given limit:
To get started, choose a parameter for the limit, choose a random pseudo-elliptic curve Ea,b (mod n) and choose a random point (x,y) on the curve. Calculate the discriminant d = 4a3 + 27b2 of the curve; if d = 0, the curve has a cusp or is self-intersecting, so choose a different curve. Then you can check if you were lucky; if gcd(d,n) > 1, d is a factor, so report it and quit. But most often d and n are co-prime, and you must continue.
The main body of Lenstra’s algorithm is a double-nested loop starting with random point (x,y):
for each prime p less than the limit
for as many times as the base-p integer logarithm of the limit
calculate a new (x,y) on Ea,b (mod n) as p times the old (x,y)
if the calculation fails, report the finding of a factorIf you find a factor, the loop exits early; if the loop finishes, you can either choose a different curve, or increase the limit, or quit and report failure.
The elliptic addition, doubling and multiplication functions from the prior exercise need to be modified to recognize an error of the elliptic algebra before it occurs. For instance, when adding two points, first compute the denominator of the slope x1 − x2, and check that it has an inverse by calculating gcd(x1 − x2,n) before computing the slope and determining the sum of the two points; if the gcd is 1, you can proceed to calculate the inverse, if it is n you have been horribly unlucky (the point you are calculating isn’t on either of the two true sub-curves of the pseudo-curve) and you must exit the loop with failure, but if the gcd is strictly between 1 and n, it is a factor of n, so you can report it and quit. You may wish to calculate the integer logarithm in the inner loop by multiplying an accumulating product times the base p during the loop, stopping the loop when the accumulator exceeds the limit. The calculation of the large factorial is implicit in the two loops.
It is hard to choose the limit and the number of curves. In his original paper, Lenstra gives a formula for calculating exactly the limit and the number of curves, but unfortunately, the formula relies on the factorization, which of course is not yet known; his formula does give him a good way to prove the time-complexity bound, however. There are several papers that examine the question, but specific advice must be based on the specific implementation. The only general advice we can give is that the limit and number of curves must increase as the number to be factored increases, using too big or too small a limit and number of curves will cost you time rather than save it, and you will have to experiment with your implementation to know what works for you. Be aware that there can be considerable variance in the time it takes to perform a factorization, depending on how the random number generator chooses the parameters of the search.
Lenstra’s algorithm, as given above, searches a single curve to a given limit. To make a complete factorization, you must arrange to call it repeatedly in the event of failure, and then check the result, which may itself be composite, calling Lenstra’s algorithm recursively until the factorization is complete. And any real implementation of Lenstra’s algorithm first uses some light-weight method, such as trial division by small primes or Pollard’s rho algorithm, to quickly reduce the scope of a factorization.
Your task is to write a function that factors integers using Lenstra’s algorithm. Use your function to find the factors of (1081-1)/9; you may wish to do a timing comparison with Pollard’s rho factorization function. When you are finished, you are welcome to read or run a suggested solution, or to post your solution or discuss the exercise in the comments below.