Sieving A Polynomial

May 30, 2017

We begin by writing a simple function that factors integers:

(define (factors n)
  (let loop ((n n) (f 2) (fs (list)))
    (cond ((< n (* f f)) (reverse (cons n fs)))
          ((positive? (modulo n f)) (loop n (+ f 1) fs))
          (else (loop (/ n f) f (cons f fs))))))

It doesn’t get any simpler than that. Then we can solve the problem by enumerating the numbers that are 1 modulo 4, by starting with 5 and increasing in step of 4 until we reach the limit, factoring each of them:

> (time
    (for-each
      (lambda (n)
        (display n) (display #\tab)
        (display (factors n)) (newline))
      (range 5 (+ 300001 1) 4)))
5	(5)
9	(3 3)
13	(13)
17	(17)
21	(3 7)
25	(5 5)
29	(29)
...
299977	(299977)
299981	(11 27271)
299985	(3 5 7 2857)
299989	(23 13043)
299993	(299993)
299997	(3 3 3 41 271)
300001	(13 47 491)
(time (for-each (lambda (...) ...) ...))
    3 collections
    0.914591752s elapsed cpu time, including 0.002393507s collecting
    2.947164401s elapsed real time, including 0.002407402s collecting
    27820480 bytes allocated, including 29532000 bytes reclaimed

Over half the time is spent computing the factorizations; the rest is spent printing them:

> (time (length (map factors (range 5 (+ 300001 1) 4))))
(time (length (map factors ...)))
    1 collection
    0.497749733s elapsed cpu time, including 0.004209815s collecting
    0.497980266s elapsed real time, including 0.004216131s collecting
    11065296 bytes allocated, including 4766688 bytes reclaimed
75000

If you’re working with lots of primes, it is almost always better to find a way to use a sieve to make the desired calculation. (The previous sentence is true; it is even more true if you remove the word almost.) Here is our sieving solution, which sieves over the polynomial 4k + 1 for 1 ≤ k ≤ ⌊n / 4⌋:

(define (factors-1mod4 n)

    (let* ((len (quotient (- n 1) 4))
           (factors (make-vector (+ len 1) (list))))

      (define (sieve p pp) ; p = prime, pp = prime-power
        (let* ((k (if (= (modulo pp 4) 1) pp (* 3 pp)))
               (k (quotient (- k 1) 4)))
          (do ((i k (+ i pp))) ((< len i))
            (vector-set! factors i
              (cons p (vector-ref factors i))))))

      (do ((ps (cdr (primes (isqrt n))) (cdr ps))) ((null? ps))
        (do ((pp (car ps) (* pp (car ps)))) ((< n pp))
          (sieve (car ps) pp)))

      (do ((k 0 (+ k 1))) ((< len k) factors)
        (let* ((n (+ (* 4 k) 1))
               (fs (vector-ref factors k))
               (f (/ n (apply * fs))))
          (vector-set! factors k
            (cond ((null? fs) (list n))
                  ((= f 1) (reverse fs))
                  (else (reverse (cons f fs)))))))))

We’ll describe that function in a moment; first, here are the timing comparisons:

> (time (let ((fs (factors-1mod4 300001)))
          (do ((k 1 (+ k 1))) ((= (vector-length fs) k))
            (display (+ (* 4 k) 1)) (display #\tab)
            (display (vector-ref fs k)) (newline))))
5	(5)
9	(3 3)
13	(13)
17	(17)
21	(3 7)
25	(5 5)
29	(29)
...
299977	(299977)
299981	(11 27271)
299985	(3 5 7 2857)
299989	(23 13043)
299993	(299993)
299997	(3 3 3 41 271)
300001	(13 47 491)
(time (for-each (lambda (...) ...) ...))
    3 collections
    0.914591752s elapsed cpu time, including 0.002393507s collecting
    2.947164401s elapsed real time, including 0.002407402s collecting
    27820480 bytes allocated, including 29532000 bytes reclaimed
(time (let ((...)) ...))
    3 collections
    0.375754800s elapsed cpu time, including 0.011405073s collecting
    3.366967942s elapsed real time, including 0.011437653s collecting
    26286032 bytes allocated, including 23110192 bytes reclaimed

Almost all of the time is spent printing; nearly no time is spent computing the factorizations:

> (time (vector-length (factors-1mod4 300001)))
(time (vector-length (factors-1mod4 300001)))
    1 collection
    0.019011232s elapsed cpu time, including 0.002646539s collecting
    0.019020886s elapsed real time, including 0.002651997s collecting
    7084752 bytes allocated, including 6562944 bytes reclaimed
75001

That’s a reduction from half a second to a fiftieth of a second, which pretty clearly shows the advantage of sieving.

The key is the vector factors that holds ⌊n / 4⌋ sets of factors; the element at index k corresponds to the number 4k + 1. We can sieve over the polynomial because it is a linear sequence; when we advance to the next index, we increase the number it represents by 4, so when we are sieving on 7, say, we skip 7 vector elements representing 28 on the number line. Sieving is performed by the local function sieve; the calculation of k in the function computes the index of the smallest multiple of pp that is 1 modulo 4, which is pp itself if pp is equivalent to 1 modulo 4, or 3 × pp if pp is equivalent to 3 modulo 4.

The heart of the function is the two nested do loops, the outer loop on p and the inner loop on pp. Variable p, which is implicit in (car ps), runs over the primes from 3 to the square root of the parameter n (we don’t use 2 because all of the numbers of interest are odd), computed using a standard Sieve of Eratosthenes. Variable pp runs over the powers of the primes, so we can capture all of the factors in their multiplicity. Function sieve skips pp vector elements at a time, marking each list that it visits with factor p.

The final loop on k cleans up the vector of factors. If an element of the vector contains the null list, then the corresponding n is prime. If the product of the factors is equal to n, then the list is complete, and needs only to be reversed so the factors appear in increasing order. If the product of the factors is less than n, then n has one additional factor greater than its square root, which must be added to the list.

For sake of completeness, here is the standard Sieve of Eratosthenes used to compute the sieving primes:

(define (primes n) ; list of primes not exceeding n
  (let* ((len (quotient (- n 1) 2)) (bits (make-vector len #t)))
    (let loop ((i 0) (p 3) (ps (list 2)))
      (cond ((< n (* p p))
              (do ((i i (+ i 1)) (p p (+ p 2))
                   (ps ps (if (vector-ref bits i) (cons p ps) ps)))
                  ((= i len) (reverse ps))))
            ((vector-ref bits i)
              (do ((j (+ (* 2 i i) (* 6 i) 3) (+ j p)))
                  ((<= len j) (loop (+ i 1) (+ p 2) (cons p ps)))
                (vector-set! bits j #f)))
            (else (loop (+ i 1) (+ p 2) ps))))))

You can run the program at http://ideone.com/VaC9RY.

Advertisements

Pages: 1 2

16 Responses to “Sieving A Polynomial”

  1. Milbrae said

    Simple Python3 with lots of space for optimization, possibly

    from __future__ import division
    
    N = 50
    
    def Sieve(m):
        tmp = [False] * (m + 1)
        for k in range(3, m + 1, 2): tmp[k] = True
    
        i = 3
        while i * i <= m:
            if tmp[i]:
                j = i * i
                while j <= m:
                    tmp[j] = False
                    j += i + i
            i += 2
    
        primes = [k for k in range(3, m + 1, 2) if tmp[k] == True]
        return primes
    
    #
    if __name__ == "__main__":
    
        primes = Sieve(N)
    
        for k in range(1, N + 1, 4):
            f = []
            x = k
            for p in primes:
                while (x % p == 0):
                    f.append(p)
                    x //= p
                if x == 1: break
            print ("%6d   %s" % (k, f))
    

    Output

         1   []
         5   [5]
         9   [3, 3]
        13   [13]
        17   [17]
        21   [3, 7]
        25   [5, 5]
        29   [29]
        33   [3, 11]
        37   [37]
        41   [41]
        45   [3, 3, 5]
        49   [7, 7]
    
  2. Rutger said

    In python:

    def factorize(number):
        # gives a list of all prime factors of number
        factors = []
        while not (number % 2):
            number = number // 2
            factors.append(2)
        i = 3
        while i <= number**0.5 and number-1:
            if not (number % i):
                factors.append(i)
                number = number // i
            else:
                i += 2
        if number != 1 and i >= number**0.5:
            factors.append(number)
        return factors
    
    
    for i in range(1,50,4):
        print(i, factorize(i))
    
    # outputs
    # 1 []
    # 5 [5]
    # 9 [3, 3]
    # 13 [13]
    # 17 [17]
    # 21 [3, 7]
    # 25 [5, 5]
    # 29 [29]
    # 33 [3, 11]
    # 37 [37]
    # 41 [41]
    # 45 [3, 3, 5]
    # 49 [7, 7]
    
  3. Rutger said

    Above solution takes 0.42 seconds for all numbers up to 75000 on my machine. Below solution uses a bigger wheel and goes for Brent algorithm trial division for larger numbers. Takes 0.08 seconds for first 75000.

    import time
    from fractions import gcd
    from random import randint
    from functools import reduce
    
    
    def brent(N):
      # brent returns a divisor not guaranteed to be prime, returns n if n prime
      if N % 2 == 0:
        return 2
      y, c, m = randint(1, N-1), randint(1, N-1), randint(1, N-1)
      g, r, q = 1, 1, 1
      while g == 1:             
        x = y
        for i in range(r):
          y = ((y*y) % N + c) % N
        k = 0
        while (k < r and g == 1):
          ys = y
          for i in range(min(m,r-k)):
            y = ((y*y) % N + c) % N
            q = q*(abs(x-y)) % N
          g = gcd(q,N)
          k = k + m
        r = r*2
      if g == N:
          while True:
            ys = ((ys*ys) % N + c) % N
            g = gcd(abs(x-ys), N)
            if g > 1:
              break
      return g
    
    
    def factorize(n, len_wheel, trial_division_bound = 1000000):
      result = []
      if n <= 0:
        raise
      if n == 1:
        return [1]
    
      for start_prime in first_primes:
        while n % start_prime == 0:
          result.append(start_prime)
          n //= start_prime
          if n == 1:
            return result
      idx = 0
      i = first_primes[-1]
      wheel[idx]
      #use trial division for factors below bound
      while i <= trial_division_bound:
        inc = wheel[idx]
        i += inc
        idx = (idx + 1) % len_wheel
        if i > n**0.5:
          result.append(n)
          return result
        while n % i == 0:
          result.append(i)
          n //= i
          if n == 1:
            return result
    
      #use brent for factors > bound   
      p = 0
      while n > trial_division_bound:
        p1 = n
        #iterate until n=brent(n) => n is prime
        while p1 != p:
          p = p1
          p1 = brent(p)
        result.append(p1)
        n //= p1 
      if n != 1:
        result.append(n)
      return result
    
    
    first_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181, 191, 193, 197, 199]
    wheel = [2, 10, 2, 4, 2, 4, 6, 2, 6, 4, 2, 4, 6, 6, 2, 6, 4, 2, 6, 4, 6, 8, 4, 2, 4, 2, 4, 8, 6, 4, 6, 2, 4, 6, 2, 6, 6, 4, 2, 4, 6, 2, 6, 4, 2, 4, 2, 10]
    start_time = time.time()
    for i in range(1,75000,4):
      factorize(i, len(wheel))
    
    print(time.time() - start_time)
    
    
  4. bavier said

    Good ‘ol shell:

    for i in `seq 5 4 300001`; do
      factor $i
    done
    
  5. Jussi Piitulainen said

    There’s factor in coreutils? That’s cool.

    seq 1 4 300001 | xargs factor
    

    (And surely the spec includes 1.)

  6. programmingpraxis said

    @Rutger: Please time the following function using the same parameters as your other timings:

    def prod(xs):
        p = 1
        for n in xs:
            p *= n
        return p
    def primes(n): # sieve of eratosthenes
        i, p, ps, m = 0, 3, [2], n // 2
        sieve = [True] * m
        while p <= n:
            if sieve[i]:
                ps.append(p)
                for j in range((p*p-3)/2, m, p):
                    sieve[j] = False
            i, p = i+1, p+2
        return ps
    def factors(n): # naive trial division
        f, fs = 2, []
        while f * f <= n:
            if n % f == 0:
                n /= f
                fs.append(f)
            else: f += 1
        fs.append(n)
        return fs
    def factors_1mod4(n):
        from math import sqrt
        len = (n - 1) // 4
        factors = [[] for i in range(len + 1)]
        def sieve(p, pp):
            k = pp if pp % 4 == 1 else 3 * pp
            k = (k - 1) // 4
            for i in range(k, len+1, pp):
                factors[i].append(p)
        for p in primes(int(sqrt(n)))[1:]:
            pp = p
            while pp <= n:
                sieve(p, pp)
                pp *= p
        for k in range(1, len+1):
            m = 4 * k + 1
            f = m / prod(factors[k])
            if factors[k] == []: factors[k] = [m]
            elif f != 1: factors[k].append(f)
        return factors[1:]
  7. programmingpraxis said

    @Jussi: Unix has had factor forever. It uses naive trial division.

  8. Jussi Piitulainen said

    @Praxis, I never knew.

    The info page for the GNU coreutils factor says it uses “the Pollard Rho algorithm” which is “particularly effective for numbers with relatively small factors” (when it’s compiled with the GNU MP library). It recommends other methods for factoring products of large primes.

  9. programmingpraxis said

    @Jussi: The V7 source code is in assembler, HERE; the man page is HERE.

    I’m not an assembler expert, but I see things like

    xt:
    	movf	fr0,fr2
    	divf	fr1,fr2

    that looks an awful lot like trial division (there is no division in the Pollard Rho algorithm), and something like

    kazoo	=.+2

    that looks an awful lot like trial division by odd numbers.

    I also note that the man page mentions the time complexity as O(sqrt n), which is characteristic of trial division, not Pollard Rho.

    I’ve always been under the impression that Unix factor used trial division. I’m not surprised that the GNU folks changed the algorithm.

  10. Steve said
    
    MUMPS V1
    SIEVING   ;New routine
             new ans,flg,i,i2,j,k,max
             set max=50
             for i=5:4:max d
             . new ans
             . set i2=i,j=i**.5
             . set:j#2=0 j=j-1
             . for k=j:-2:3 d
             . . if i2#k=0,$$isprime(k) do
             . . . do add(.ans,k)
             . . . set i2=i2/k
             . . . for  quit:i2#k>0  do add(.ans,k) set i2=i2/k
             . if i2>1,$$isprime(k) do add(.ans,i2)
             . write !,i,?9,"("
             . set flg=0,j=""
             . for  set j=$o(ans(j)) quit:j=""  do
             . . for k=1:1:ans(j) write $select('flg:j,1:" "_j) set:'flg flg=1
             . write ")"
             quit
             ;
    add      (ans,n) ; Add n to ans()
             set ans(n)=$get(ans(n))+1
             quit
             ;
    isprime  (k) ; Is it a prime?
             new flg,i,sqr
             if $data(primes(k)) set flg=1
             else  do
             . set flg=1,sqr=k**.5
             . for i=sqr:-1:3 if k#i=0 set flg=0 q
             quit flg
    
    MCL> d ^SIEVING
    
    5        (5)
    9        (3 3)
    13       (13)
    17       (17)
    21       (3 7)
    25       (5 5)
    29       (29)
    33       (3 11)
    37       (37)
    41       (41)
    45       (3 3 5)
    49       (7 7)
    MCL> 
    
    
  11. Jussi Piitulainen said

    @Praxis, thanks. I see also an interfance change between V7 and GNU factor: V7 takes one argument (or none), GNU takes any number of arguments and reports on all of them.

    To understate the effect, there’s a noticable difference between the running times of these (on a Red Hat server with recent coreutils, but the point should be robust):

    seq 1 4 300001 | xargs factor | tail
    
    seq 1 4 300001 | xargs -n 1 factor | tail
    
  12. Rutger said

    @programmingpraxis: tried timing your reply, but it uses an undefined primes() function.

  13. programmingpraxis said

    @Rutger: Sorry. I added primes and prod, both of which are called by the function.

  14. Rutger said

    Had a little more modifications to the code, but here is the result: 0.0156 seconds.

    
    from math import sqrt
    from functools import reduce
    from operator import mul
    
    def primes(n): # sieve of eratosthenes
        i, p, ps, m = 0, 3, [2], n // 2
        sieve = [True] * m
        while p <= n:
            if sieve[i]:
                ps.append(p)
                for j in range((p*p-3)//2, m, p):
                    sieve[j] = False
            i, p = i+1, p+2
        return ps
    
    def factors(n): # naive trial division
        f, fs = 2, []
        while f * f <= n:
            if n % f == 0:
                n /= f
                fs.append(f)
            else: f += 1
        fs.append(n)
        return fs
    
    def factors_1mod4(n):
        len = (n - 1) // 4
        factors = [[] for i in range(len + 1)]
    
        def sieve(p, pp):
            k = pp if pp % 4 == 1 else 3 * pp
            k = (k - 1) // 4
            for i in range(k, len+1, pp):
                factors[i].append(p)
    
        for p in primes(int(sqrt(n)))[1:]:
            pp = p
            while pp <= n:
                sieve(p, pp)
                pp *= p
        for k in range(1, len+1):
            m = 4 * k + 1
            f = m / reduce(mul, factors[k], 1)
            if factors[k] == []: factors[k] = [m]
            elif f != 1: factors[k].append(f)
        return factors[1:]
    
    import time
    start_time = time.time()
    n = 75000
    x = factors_1mod4(n)
    print(time.time() - start_time)
    
  15. programmingpraxis said

    @Rutger: So to process n = 75,000, it’s 0.42 seconds for trial division by 2 and odds, a five-times improvement to 0.08 seconds for a 2,3,5,7-wheel (since n = 75,000 you never call Brent-rho), and another five-times improvement to 0.0156 for sieving. And sieving has a slower order of growth than the other two, O(log n log log n) compared to O(sqrt n). Sieving always wins!

    Thank you for doing the timing comparison.

  16. sealfin said

    May 30th, 2017.c:

    #include "seal_bool.h" /* <http://GitHub.com/sealfin/C-and-C-Plus-Plus/blob/master/seal_bool.h> */
    #include "printDateAndTime.h" /* <http://Gist.GitHub.com/sealfin/6d35f3a3958bd6797a0f> */
    #include <stdio.h>
    #include <stdlib.h>
    
    #define N 300002L
    
    #if N <= 0
    #error "N ≤ 0."
    #endif
    
    bool g_isPrime[ N ];
    long g_numberOfPrimes = 0, *g_primes;
    
    bool f_IsPrime( const long p )
    {
      if( p % 2 == 0 )
        return p == 2;
      else
        return g_isPrime[ p ];
    }
    
    void p_RecursivelyPrintPrimeFactorsOfNumber( const long p_number, const bool p_firstPrimeFactor, FILE *p_fileToPrintTo )
    {
      if( !p_firstPrimeFactor )
        fprintf( p_fileToPrintTo, " " );
      if( f_IsPrime( p_number ))
        fprintf( p_fileToPrintTo, "%ld", p_number );
      else
      {
        long i = 0;
        for( ; i < g_numberOfPrimes; i ++ )
          if( p_number % g_primes[ i ] == 0 )
          {
            fprintf( p_fileToPrintTo, "%ld", g_primes[ i ] );
            p_RecursivelyPrintPrimeFactorsOfNumber( p_number / g_primes[ i ], false, p_fileToPrintTo );
            break;
          }
      }
    }
    
    #define p_PrintPrimeFactorsOfNumber( p_number, p_fileToPrintTo ) p_RecursivelyPrintPrimeFactorsOfNumber( p_number, true, p_fileToPrintTo )
    
    void main( void )
    {
      p_PrintDateAndTime();
      {
        long i = 0, k;
        FILE *f;
    
        /* Firstly, let's determine the prime numbers in the range [ 0, N ) so that later we can avoid trying to determine the prime factors of a prime number. */
        for( ; i < N; i ++ )
          g_isPrime[ i ] = true;
        if( N >= 1 )
          g_isPrime[ 0 ] = false;
        if( N >= 2 )
          g_isPrime[ 1 ] = false;
        if( N >= 3 )
          g_numberOfPrimes ++;
        for( i = 3; i < N; i += 2 )
          if( g_isPrime[ i ] )
          {
            g_numberOfPrimes ++;
            for( k = i + i; k < N; k += i )
              g_isPrime[ k ] = false;
          }
    
        /* Secondly,  let's create a list of the prime numbers in the range [ 0, N ) so that later we can test if a prime number is a factor of a number. */
        g_primes = ( long* )malloc( sizeof( long ) * g_numberOfPrimes );
        k = 0;
        if( N >= 3 )
          g_primes[ k ++ ] = 2;
        for( i = 3; k < g_numberOfPrimes; i += 2 )
          if( f_IsPrime( i ))
            g_primes[ k ++ ] = i;
    
        /* Now, for every number i in the range [ 2, N ), let's determine if i % 4 == 1, and print the number and its prime factors if it is. */
        f = fopen( "May 30th, 2017.out", "w" );
        for( i = 2; i < N; i ++ )
          if( i % 4 == 1 )
          {
            int n;
            fprintf( f, "%ld%n", i, &n );
            for( ; n < 6; n ++ )
              fprintf( f, " " );
            fprintf( f, "\t(" );
            p_PrintPrimeFactorsOfNumber( i, f );
            fprintf( f, ")\n" );
          }
        fclose( f );
    
        free( g_primes );
      }
      p_PrintDateAndTime();
    }

    May 30th, 2017.out:

    5    	(5)
    9    	(3 3)
    13    	(13)
    17    	(17)
    21    	(3 7)
    25    	(5 5)
    29    	(29)
    33    	(3 11)
    37    	(37)
    41    	(41)
    45    	(3 3 5)
    49    	(7 7)
    [74,976 lines omitted]
    299957	(7 73 587)
    299961	(3 3 33329)
    299965	(5 17 3529)
    299969	(299969)
    299973	(3 99991)
    299977	(299977)
    299981	(11 27271)
    299985	(3 5 7 2857)
    299989	(23 13043)
    299993	(299993)
    299997	(3 3 3 41 271)
    300001	(13 47 491)

    On an Apple Power Mac G4 (AGP Graphics) (450MHz processor, 1GB memory) to run the solution took:

    approximately forty-six seconds on Mac OS 9.2.2 (International English) (the solution interpreted using Leonardo IDE 3.4.1);
    approximately one second on Mac OS X 10.4.11 (the solution compiled using Xcode 2.2.1).

    (I’m just trying to solve the problems posed by this ‘site whilst I try to get a job; I’m well aware that my solutions are far from the best – but, in my defence, I don’t have any traditional qualifications in computer science :/ )

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: