Pollard’s P-1 Factorization Algorithm, Revisited

September 16, 2011

We’ll solve this exercise twice, the first version with the improved second stage and adding the periodic calculation of the gcd at the second version. Beware that this gets messy, even though the concept is fairly simple. Here’s the first version:

(define pminus1 ; no backtracking (letrec ((diffs (lambda (xs) (let loop ((x (car xs)) (xs (cdr xs)) (zs (list))) (if (null? (cdr xs)) (reverse (cons (- (car xs) x) zs)) (loop (car xs) (cdr xs) (cons (- (car xs) x) zs)))))) (last-pair (lambda (xs) (if (null? (cdr xs)) xs (last-pair (cdr xs))))) (cycle (lambda xs (set-cdr! (last-pair xs) xs) xs))) (let* ((b1 #e16e4) (b2 #e32e5) (ps (primes b2)) (as (let loop ((ps ps) (as (list))) (let ((a (ilog (car ps) b1))) (if (= a 1) (append (reverse as) (cycle 1)) (loop (cdr ps) (cons a as)))))) (ds (map (lambda (x) (/ x 2)) (diffs ps)))) (lambda (n) (let loop1 ((ps ps) (as as) (ds ds) (q 2)) ; (display "loop1: ") (display q) (newline) (let ((g (gcd (- q 1) n))) (if (< 1 g n) g (if (< (car ps) b1) (loop1 (cdr ps) (cdr as) (cdr ds) (expm q (expt (car ps) (car as)) n)) (let* ((len (apply max ds)) (dv (make-vector (+ len 1) 0))) (do ((i 1 (+ i 1))) ((< len i)) (vector-set! dv i (expm q (+ i i) n))) (let loop2 ((ds ds) (q (expm q (car ps) n)) (p 1)) ; (display "loop2: ") (display q) ; (display " ") (display p) (newline) (let ((g (gcd p n))) (if (< 1 g n) g (if (null? ds) #f (let ((q (modulo (* q (vector-ref dv (car ds))) n))) (loop2 (cdr ds) q (modulo (* p (- q 1)) n))))))))))))))))

This is long but not hard. The calculations of ps, as and ds are done inside the closure but outside the function, so they are computed only once, at the time the function is defined, and reused at each call to the function; a consequence of this organization is that it is difficult to change B₁ and B₂. Vector dv holds the modular exponentiations of the differences.

The second version is messier, adding a counter j to keep track of when to compute the gcd and variables ps-saved, as-saved, ds-saved and q-saved to hold the breakpoints for backtracking; there’s also more code because now, for each stage, there are two different steps, the primary step that calculates successive qs and the secondary step that calculates both qs and also the gcd.

(define pminus1 ; with backtracking (letrec ((diffs (lambda (xs) (let loop ((x (car xs)) (xs (cdr xs)) (zs (list))) (if (null? (cdr xs)) (reverse (cons (- (car xs) x) zs)) (loop (car xs) (cdr xs) (cons (- (car xs) x) zs)))))) (last-pair (lambda (xs) (if (null? (cdr xs)) xs (last-pair (cdr xs))))) (cycle (lambda xs (set-cdr! (last-pair xs) xs) xs))) (let* ((b1 #e16e4) (b2 #e32e5) (ps (primes b2)) (as (let loop ((ps ps) (as (list))) (let ((a (ilog (car ps) b1))) (if (= a 1) (append (reverse as) (cycle 1)) (loop (cdr ps) (cons a as)))))) (ds (map (lambda (x) (/ x 2)) (diffs ps)))) (lambda (n) (let loop1 ((ps ps) (as as) (ds ds) (j 1) (q 2) (p 1) (ps-saved ps) (as-saved as) (q-saved 2)) ; (display "loop1: ") (display q) (newline) (if (and (positive? (modulo j 100)) (< (car ps) b1)) (let ((q (expm q (expt (car ps) (car as)) n))) (loop1 (cdr ps) (cdr as) (cdr ds) (+ j 1) q (modulo (* p (- q 1)) n) ps-saved as-saved q-saved)) (let ((g (gcd p n))) (cond ((< 1 g n) g) ((= g n) (let loop ((ps ps-saved) (as as-saved) (q q-saved)) (let ((g (gcd (- q 1) n))) (if (= g n) #f (if (< 1 g) g (loop (cdr ps) (cdr as) (expm q (expt (car ps) (car as)) n))))))) ((< (car ps) b1) (loop1 ps as ds 1 q 1 ps as q)) (else (let* ((len (apply max ds)) (dv (make-vector (+ len 1) 0))) (do ((i 1 (+ i 1))) ((< len i)) (vector-set! dv i (expm q (+ i i) n))) (let loop2 ((ds ds) (j 1) (q (expm q (car ps) n)) (p 1) (ds-saved ds) (q-saved (expm q (car ps) n))) ; (display "loop2: ") (display q) (display " ") (display p) (newline) (if (and (positive? (modulo j 100)) (pair? ds)) (let ((q (modulo (* q (vector-ref dv (car ds))) n))) (loop2 (cdr ds) (+ j 1) q (modulo (* p (- q 1)) n) ds-saved q-saved)) (let ((g (gcd p n))) (cond ((< 1 g n) g) ((= g n) (let loop ((ds ds-saved) (q q-saved)) (let ((g (gcd (- q 1) n))) (if (= g n) #f (if (< 1 g) g (loop (cdr ds) (modulo (* q (vector-ref dv (car ds))) n))))))) ((pair? ds) (loop2 ds 1 q 1 ds q)) (else #f)))))))))))))))

We used bounds B₁ = 160000 and B₂ = 3200000, which requires space to store about a quarter of a million primes and takes less than a second to run through the entire list and fail to split a fifty-digit composite. Generally speaking, bounds up to B₁ = 2000000 and B₂ = 100000000 are realistic; after that the elliptic curve method is faster. Also, it is usually best to use trial division or Pollard’s rho method to extract small factors before calling the p−1 method, since they will be quicker.

We used the primes function from an earlier exercise, and the expm and ilog functions from the Standard Prelude. You can run the program at http://programmingpraxis.codepad.org/qFzbCxzp, where you will also see an example factorization with timings.

Posted by programmingpraxis

Filed in Exercises

S	M	T	W	T	F	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30

Programming Praxis