Pollard’s P-1 Factorization Algorithm, Revisited

September 16, 2011

We’ll solve this exercise twice, the first version with the improved second stage and adding the periodic calculation of the gcd at the second version. Beware that this gets messy, even though the concept is fairly simple. Here’s the first version:

(define pminus1 ; no backtracking
  (letrec ((diffs (lambda (xs)
             (let loop ((x (car xs)) (xs (cdr xs)) (zs (list)))
               (if (null? (cdr xs))
                   (reverse (cons (- (car xs) x) zs))
                   (loop (car xs) (cdr xs) (cons (- (car xs) x) zs))))))
           (last-pair (lambda (xs)
             (if (null? (cdr xs)) xs (last-pair (cdr xs)))))
           (cycle (lambda xs (set-cdr! (last-pair xs) xs) xs)))
    (let* ((b1 #e16e4) (b2 #e32e5) (ps (primes b2))
           (as (let loop ((ps ps) (as (list)))
                 (let ((a (ilog (car ps) b1)))
                   (if (= a 1) (append (reverse as) (cycle 1))
                     (loop (cdr ps) (cons a as))))))
           (ds (map (lambda (x) (/ x 2)) (diffs ps))))
      (lambda (n)
        (let loop1 ((ps ps) (as as) (ds ds) (q 2))
          ; (display "loop1: ") (display q) (newline)
          (let ((g (gcd (- q 1) n)))
            (if (< 1 g n) g
              (if (< (car ps) b1)
                  (loop1 (cdr ps) (cdr as) (cdr ds)
                         (expm q (expt (car ps) (car as)) n))
                  (let* ((len (apply max ds))
                         (dv (make-vector (+ len 1) 0)))
                    (do ((i 1 (+ i 1))) ((< len i))
                      (vector-set! dv i (expm q (+ i i) n)))
                    (let loop2 ((ds ds) (q (expm q (car ps) n)) (p 1))
                      ; (display "loop2: ") (display q)
                      ; (display " ") (display p) (newline)
                      (let ((g (gcd p n)))
                        (if (< 1 g n) g
                          (if (null? ds) #f
                            (let ((q (modulo (* q (vector-ref dv (car ds))) n)))
                              (loop2 (cdr ds) q (modulo (* p (- q 1)) n))))))))))))))))

This is long but not hard. The calculations of ps, as and ds are done inside the closure but outside the function, so they are computed only once, at the time the function is defined, and reused at each call to the function; a consequence of this organization is that it is difficult to change B1 and B2. Vector dv holds the modular exponentiations of the differences.

The second version is messier, adding a counter j to keep track of when to compute the gcd and variables ps-saved, as-saved, ds-saved and q-saved to hold the breakpoints for backtracking; there’s also more code because now, for each stage, there are two different steps, the primary step that calculates successive qs and the secondary step that calculates both qs and also the gcd.

(define pminus1 ; with backtracking
  (letrec ((diffs (lambda (xs)
             (let loop ((x (car xs)) (xs (cdr xs)) (zs (list)))
               (if (null? (cdr xs))
                   (reverse (cons (- (car xs) x) zs))
                   (loop (car xs) (cdr xs) (cons (- (car xs) x) zs))))))
           (last-pair (lambda (xs)
             (if (null? (cdr xs)) xs (last-pair (cdr xs)))))
           (cycle (lambda xs (set-cdr! (last-pair xs) xs) xs)))
    (let* ((b1 #e16e4) (b2 #e32e5) (ps (primes b2))
           (as (let loop ((ps ps) (as (list)))
                 (let ((a (ilog (car ps) b1)))
                   (if (= a 1) (append (reverse as) (cycle 1))
                     (loop (cdr ps) (cons a as))))))
           (ds (map (lambda (x) (/ x 2)) (diffs ps))))
      (lambda (n)
        (let loop1 ((ps ps) (as as) (ds ds) (j 1) (q 2) (p 1) (ps-saved ps) (as-saved as) (q-saved 2))
          ; (display "loop1: ") (display q) (newline)
          (if (and (positive? (modulo j 100)) (< (car ps) b1))
              (let ((q (expm q (expt (car ps) (car as)) n)))
                (loop1 (cdr ps) (cdr as) (cdr ds) (+ j 1) q (modulo (* p (- q 1)) n) ps-saved as-saved q-saved))
              (let ((g (gcd p n)))
                (cond ((< 1 g n) g)
                      ((= g n) (let loop ((ps ps-saved) (as as-saved) (q q-saved))
                                 (let ((g (gcd (- q 1) n)))
                                   (if (= g n) #f (if (< 1 g) g
                                     (loop (cdr ps) (cdr as) (expm q (expt (car ps) (car as)) n)))))))
                      ((< (car ps) b1) (loop1 ps as ds 1 q 1 ps as q))
                      (else (let* ((len (apply max ds)) (dv (make-vector (+ len 1) 0)))
                              (do ((i 1 (+ i 1))) ((< len i)) (vector-set! dv i (expm q (+ i i) n)))
                              (let loop2 ((ds ds) (j 1) (q (expm q (car ps) n)) (p 1) (ds-saved ds) (q-saved (expm q (car ps) n)))
                                ; (display "loop2: ") (display q) (display " ") (display p) (newline)
                                (if (and (positive? (modulo j 100)) (pair? ds))
                                    (let ((q (modulo (* q (vector-ref dv (car ds))) n)))
                                      (loop2 (cdr ds) (+ j 1) q (modulo (* p (- q 1)) n) ds-saved q-saved))
                                    (let ((g (gcd p n)))
                                      (cond ((< 1 g n) g)
                                            ((= g n) (let loop ((ds ds-saved) (q q-saved))
                                                       (let ((g (gcd (- q 1) n)))
                                                         (if (= g n) #f (if (< 1 g) g
                                                           (loop (cdr ds) (modulo (* q (vector-ref dv (car ds))) n)))))))
                                            ((pair? ds) (loop2 ds 1 q 1 ds q))
                                            (else #f)))))))))))))))

We used bounds B1 = 160000 and B2 = 3200000, which requires space to store about a quarter of a million primes and takes less than a second to run through the entire list and fail to split a fifty-digit composite. Generally speaking, bounds up to B1 = 2000000 and B2 = 100000000 are realistic; after that the elliptic curve method is faster. Also, it is usually best to use trial division or Pollard’s rho method to extract small factors before calling the p−1 method, since they will be quicker.

We used the primes function from an earlier exercise, and the expm and ilog functions from the Standard Prelude. You can run the program at http://programmingpraxis.codepad.org/qFzbCxzp, where you will also see an example factorization with timings.

About these ads

Pages: 1 2

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 609 other followers

%d bloggers like this: