Sieve Of Atkin, Improved

February 19, 2010

Our version, translated from a Python version by Steve Krenzel, is shown below:

(define (atkin limit)
  (define (exact x) (inexact->exact (floor x)))
  (let ((sieve (make-vector (+ (quotient limit 2) (modulo limit 2)) #f))
        (primes (list 3 2)))
    (define (flip! m) (vector-set! sieve m (not (vector-ref sieve m))))

    (let ((x-max (exact (sqrt (/ (- limit 1) 4)))) (x2 0))
      (do ((xd 4 (+ xd 8))) ((<= (+ (* x-max 8) 2) xd))
        (set! x2 (+ x2 xd))
        (let* ((y-max (exact (sqrt (- limit x2))))
               (n (+ x2 (* y-max y-max)))
               (n-diff (+ y-max y-max -1)))
          (when (even? n) (set! n (- n n-diff)) (set! n-diff (- n-diff 2)))
          (do ((d (* (- n-diff 1) 2) (- d 8))) ((<= d -1))
            (when (member (modulo n 12) (list 1 5)) (flip! (quotient n 2)))
            (set! n (- n d))))))

    (let ((x-max (exact (sqrt (/ (- limit 1) 3)))) (x2 0))
      (do ((xd 3 (+ xd 6))) ((<= (+ (* x-max 6) 2) xd))
        (set! x2 (+ x2 xd))
        (let* ((y-max (exact (sqrt (- limit x2))))
               (n (+ x2 (* y-max y-max)))
               (n-diff (+ y-max y-max -1)))
          (when (even? n) (set! n (- n n-diff)) (set! n-diff (- n-diff 2)))
          (do ((d (* (- n-diff 1) 2) (- d 8))) ((<= d -1))
            (when (= (modulo n 12) 7) (flip! (quotient n 2)))
            (set! n (- n d))))))

    (let ((x-max (exact (/ (+ (sqrt (- 4 (* (- 1 limit) 8))) 2) 4)))
          (y-min -1) (x2 0) (xd 3))
      (do ((x 1 (+ x 1))) ((<= (+ x-max 1) x))
        (set! x2 (+ x2 xd)) (set! xd (+ xd 6))
        (when (<= limit x2)
          (set! y-min (* (- (* (- (inexact->exact (ceiling (sqrt (- x2 limit)))) 1) 2) 2) 2)))
        (let ((n (- (* (+ (* x x) x) 2) 1))
              (n-diff (* (- (* (- x 1) 2) 2) 2)))
          (do ((d n-diff (- d 8))) ((<= d y-min))
            (when (= (modulo n 12) 11) (flip! (quotient n 2)))
            (set! n (+ n d))))))

    (do ((n 2 (+ n 1))) ((<= (quotient (+ (exact (sqrt limit)) 1) 2) n))
      (when (vector-ref sieve n)
        (let* ((p (+ n n 1)) (p2 (* p p)))
          (set! primes (cons p primes))
          (do ((k p2 (+ k (+ p2 p2)))) ((<= limit k))
            (vector-set! sieve (quotient k 2) #f)))))

    (do ((p (+ (exact (sqrt limit)) 1) (+ p 2))) ((<= limit p))
      (when (vector-ref sieve (quotient p 2))
        (set! primes (cons p primes))))

    (reverse primes)))

The program consists of a header, three blocks that each consider one of the three cases of the basic algorithm, a block that performs the actual sieving of squares (the p2 variable), a block that collects the remaining primes beyond the square root, and a final (reverse primes) that returns the result.

This version is much faster than the naive version, taking 2.563 seconds to enumerate the first million primes, about double the Sieve of Eratosthenes. Daniel Bernstein, co-author with Arthur Atkin of the original paper describing the Sieve of Atkin, has an insanely fast version, faster than the Sieve of Eratosthenes; it is instructive to read the code.

You can run this program at http://programmingpraxis.codepad.org/xyQbTn0p.

About these ads

Pages: 1 2

4 Responses to “Sieve Of Atkin, Improved”

  1. Mike said

    Here is my python version of an improved Sieve of Atkins. On my laptop it calculates the primes up to 1 million in .16 seconds, and primes to 10 million in about 1.5 seconds. That is 3.5 times faster than a basic Sieve of Eratosthenes and about 5.6 times faster than a naive Sieve of Atkins.

    To imptove the naive sieve, I first calculated appropriate loop limits. Second I replaced squaring of x and y with additions using the well known formula:(x+1)^2 = x^2 + (2x + 1), and changed the loop counters to return the difference between the squares (i.e., the 2x+1). Obviously, the factor on the x^2 term could also be included in the loop counter. For example, for the 4x^2 term goes 4 16 36 … , so the loop was set up to return the differences 4, 12, 20, ….

    Because of the way the loops counters incremented, I suspected there would be a pattern to the results of the n%12 calculations in the y^2 loops. After confirming the pattern, I changed the loop variables and split some loops in two, so that the loops only iterate over values of x and y for which the n%12 test would pass. This greatly reduced the number of loop iterations and eliminated the n%12 test. The start, stop, and step values for the loops were tricky to figure out.

    Although I plan further optimization of the 3x^2 – y^2 section when I have time, I thought I’d post what I have now. Also, I didn’t optimize memory usage at all, so I’m sure that can be reduced.

    def atkins13(limit=1000000):
        '''use sieve of atkins to find primes <= limit.'''
        primes = [0] * limit
    
        # n = 3x^2 + y^2 section
        xx3 = 3
        for dxx in xrange( 0, 12*int( sqrt( ( limit - 1 ) / 3 ) ), 24 ):
            xx3 += dxx
            y_limit = int(12*sqrt(limit - xx3) - 36)
            n = xx3 + 16
            for dn in xrange( -12, y_limit + 1, 72 ):
                n += dn
                primes[n] = not primes[n]
    
            n = xx3 + 4
            for dn in xrange( 12, y_limit + 1, 72 ):
                n += dn
                primes[n] = not primes[n]
    
        # n = 4x^2 + y^2 section
        xx4 = 0
        for dxx4 in xrange( 4, 8*int(sqrt((limit - 1 )/4)) + 4, 8 ):
            xx4 += dxx4
            n = xx4+1
    
            if xx4%3:
                for dn in xrange( 0, 4*int( sqrt( limit - xx4 ) ) - 3, 8 ):
                    n += dn
                    primes[n] = not primes[n]
    
            else:
                y_limit = 12 * int( sqrt( limit - xx4 ) ) - 36
    
                n = xx4 + 25
                for dn in xrange( -24, y_limit + 1, 72 ):
                    n += dn
                    primes[n] = not primes[n]
    
                n = xx4+1
                for dn in xrange( 24, y_limit + 1, 72 ):
                    n += dn
                    primes[n] = not primes[n]
    
        # n = 3x^2 - y^2 section
        xx = 1
        for x in xrange( 3, int( sqrt( limit / 2 ) ) + 1, 2 ):
            xx += 4*x - 4
            n = 3*xx
    
            if n > limit:
                min_y = ((int(sqrt(n - limit))>>2)<<2)
                yy = min_y*min_y
                n -= yy
                s = 4*min_y + 4
    
            else:
                s = 4
    
            for dn in xrange( s, 4*x, 8 ):
                n -= dn
                if n <= limit and n%12 == 11:
                        primes[n] = not primes[n]
    
        xx = 0
        for x in xrange( 2, int( sqrt( limit / 2 ) ) + 1, 2 ):
            xx += 4*x - 4
            n = 3*xx
    
            if n > limit:
                min_y = ((int(sqrt(n - limit))>>2)<<2)-1
                yy = min_y*min_y
                n -= yy
                s = 4*min_y + 4
    
            else:
                n -= 1
                s = 0
    
            for dn in xrange( s, 4*x, 8 ):
                n -= dn
                if n <= limit and n%12 == 11:
                        primes[n] = not primes[n]
    
        # eliminate squares        
        for n in xrange(5,int(sqrt(limit))+1,2):
            if primes[n]:
                for k in range(n*n, limit, n*n):
                    primes[k] = False
    
        return [2,3] + filter(primes.__getitem__, xrange(5,limit,2))
    
  2. Garrett said

    I made the above code MUCH faster by simply replacing the line:

    primes = [0] * limit

    with:

    primes = numpy.zeros(limit, dtype = ‘bool’)

    Obviously this requires the module numpy. This also helps considerably with the “memory management” he was talking about I would think!

    I’m still trying to wrap my head around the rest of of the code (really trying to wrap my head around the sieve in genera). Will post when I find out more!

  3. geeksumm said

    It’s simple. We can cut the loop by using these condition:
    if (xx + yy >= limit) {
    break;
    }

    Using JavaScript running on Chrome’s V8, it took about 2.8 seconds to generate primes with limit of 10 million. Here’s the code:

    function sieveOfAtkin(limit){
    limitSqrt = Math.sqrt(limit);
    sieve = [];

    //prime start from 2, and 3
    sieve[2] = true;
    sieve[3] = true;

    for (x = 1; x <= limitSqrt; x++) {
    xx = x*x;
    for (y = 1; y = limit) {
    break;
    }
    // first quadratic using m = 12 and r in R1 = {r : 1, 5}
    n = (4 * xx) + (yy);
    if (n <= limit && (n % 12 == 1 || n % 12 == 5)) {
    sieve[n] = !sieve[n];
    }
    // second quadratic using m = 12 and r in R2 = {r : 7}
    n = (3 * xx) + (yy);
    if (n y && n <= limit && (n % 12 == 11)) {
    sieve[n] = !sieve[n];
    }
    }
    }

    for (n = 5; n <= limitSqrt; n++) {
    if (sieve[n]) {
    x = n * n;
    for (i = x; i <= limit; i += x) {
    sieve[i] = false;
    }
    }
    }

    // primes are the one which sieve[n] returns true
    return sieve;
    }

  4. Harsh said

    Could you please explain how the you got the limits?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 634 other followers

%d bloggers like this: