Two Sieving Problems

August 27, 2013

Whenever you need to enumerate a large number of primes, it is best to find a way to use a sieve to find them; generating candidates and testing them with a primality checker will always be slower. We use two different sieves to solve the two problems.

For the first problem, it is not possible to sieve by all the primes less than the square root of 1050 + 106; there are so many sieving primes that no one knows how many. What we do instead is sieve to some convenient limit, then apply a primality checker to those numbers that survive the sieve:

(define (big-primes lo hi delta limit)
  (let* ((output (list))
         (sieve (make-vector delta #t))
         (ps (cdr (primes limit)))
         (qs (map (lambda (p) (modulo (* -1/2 (+ lo p 1)) p)) ps)))
    (let loop ((lo lo) (qs qs))
      (if (not (< lo hi)) (reverse output)
        (begin
          (do ((i 0 (+ i 1))) ((= i delta)) (vector-set! sieve i #t))
          (do ((ps ps (cdr ps)) (qs qs (cdr qs))) ((null? ps))
            (do ((j (car qs) (+ j (car ps)))) ((<= delta j))
              (vector-set! sieve j #f)))
          (do ((i 0 (+ i 1)) (t (+ lo 1) (+ t 2)))
              ((or (<= delta i) (<= hi t)))
            (if (and (vector-ref sieve i) (prime? t))
              (set! output (cons t output))))
          (loop (+ lo (* 2 delta))
                (map (lambda (p q) (modulo (- q delta) p)) ps qs)))))))

This is just a segmented sieve, as in a prior exercise, but with the sieving primes in ps limited. It is called like this:

> (length (big-primes #e1e50 (+ #e1e50 #e1e6) 25000 1000000))
8737

That took a little bit less than a minute on my old slow machine at home. Only about one second of that minute was spent sieving, the rest of the time was devoted to checking the 40808 candidates that survived the sieve. Increasing limit to 2000000 reduced the candidate count to 38888, and increasing limit to 5000000 reduced the candidate count to 36551. Changing from a Miller-Rabin test with k = 10 to a single pseudoprimality test to base 2 left the result unchanged and saved only a little bit of time, proving Henri Cohen’s comment about “industrial-grade primes.”

The second problem uses a segmented sieve that contains only the numbers congruent to 1 (mod 4); thus, instead of sieving 2, 3, 4, 5, … or 3, 5, 7, 9, …, we sieve 5, 9, 13, 17, …. Here delta is one-quarter the segment length, instead of one-half, and the definition of the qs changes to reflect the numbers that are in the sieve:

(define (primes1mod4 lo hi delta)
  (let* ((output (list))
         (sieve (make-vector delta #t))
         (ps (cdr (primes (isqrt hi))))
         (qs (map (lambda (p) (modulo (* -1 (inverse 4 p) (+ lo p 1)) p)) ps)))
    (let loop ((lo lo) (qs qs))
      (if (not (< lo hi)) (reverse output)
        (begin
          (do ((i 0 (+ i 1))) ((= i delta)) (vector-set! sieve i #t))
          (do ((ps ps (cdr ps)) (qs qs (cdr qs))) ((null? ps))
            (do ((j (car qs) (+ j (car ps)))) ((<= delta j))
              (vector-set! sieve j #f)))
          (do ((i 0 (+ i 1)) (t (+ lo 1) (+ t 4)))
              ((or (<= delta i) (<= hi t)))
            (if (vector-ref sieve i) (set! output (cons t output))))
          (loop (+ lo (* 4 delta))
                (map (lambda (p q) (modulo (- q delta) p)) ps qs)))))))

It takes only 70ms to solve the problem:

> (length (primes1mod4 1000000 2000000 25000))
35241

We have a bonus exercise today; it’s not about enumerating primes, but the solution does involve a sieve, so we include it here. The task is to make a list of all numbers less than or equal to n that have k distinct prime factors. For instance, with n = 25 and k = 1, the solution includes both prime numbers (their only factor is themselves) and prime powers (their only factor is the base prime): 2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 17, 19, 23, 25. The solution is on the next page.

About these ads

Pages: 1 2 3

2 Responses to “Two Sieving Problems”

  1. Graham said

    My first answers are in J; it’s definitely cheating when your language has a
    “prime factors of” function.
    Problem 1:

    +/@(1=#@q:)(10x^50)+1+i.1e6%2
    

    This takes a very long time to get anywhere; the second is much faster:

    +/@(1=#@q:)1e6+1+4*1e6%4
    

    I originally had simpler answers, but decided that only working over odd
    numbers (first question) or numbers equivalent to 1 mod 4 (second question) was
    worth the uglification of my code.

    My second answers are in Ruby; I can’t imagine trying to do these in a language
    like C or C++, even with the help of multiprecision libraries like GMP or NTL.
    I went with the Miller-Rabin primality test.

    require 'mathn'
    
    def decomp n
      s, d = 0, n
      while d.even?
        s, d = s + 1, d >> 1
      end
      return s, d
    end
    
    
    def modpow n, e, m
      r = 1
      while e > 0
        if e.odd?
          r = (r * n) % m
        end
        e >>= 1
        n = (n * n) % m
      end
      return r
    end
    
    def miller_rabin n, k=42
      s, d= decomp(n - 1)
      k.times do
        a = 2 + rand(n - 4)
        x = modpow a, d, n
        next if [1, n - 1].include? x
        flag = (s - 1).times do
          x = (x * x) % n
          return false if x == 1
          break n - 1 if x == n - 1
        end
        next if flag == n - 1
        return false
      end
      return true
    end
    
    class Integer
      def prime?
        return false if self < 2
        return true if self == 2
        return false if self.even?
        return miller_rabin self
      end
    end
    
    def problem1
      (10**50 + 1 .. 10**50 + 10**6).step(2).select(&:prime?).length
    end
    
    def problem2
      (10**6 + 1 .. 2 * 10**6).step(4).select(&:prime?).length
    end
    
  2. Paul said

    A Python version with segments.

    import time
    from math import ceil, sqrt
    
    from gmpy2 import is_prime
    
    from ma.primegmp import sieve
    
    def erase(flags, primes, q, n):
        for pk, qk in zip(primes, q):
            for i in xrange(qk, n, pk):
                flags[i] = 0
    
    def gen_sieve_seg(l, r, number_miller_rabin=25):
        """segmented sieve
           uses prime numbes <= 1000000
           checks candidates with miller rabin
        """
        primes = list(sieve(1000000))
        primes.remove(2)
        q = [(-(l + 1 + pk) // 2) % pk for pk in primes]
        n = ((r - l + 1) // 2)
        flags = [1] * n
        erase(flags, primes,q, n)
        for i, f in enumerate(flags):
            if f:
                cand = l + i * 2 + 1
                if is_prime(cand, number_miller_rabin):
                    yield cand
                    
    def gen_sieve_seg_4np1(l, r):
        """segmented sieve
           ony numbers 4*n+1 are sieved
        """
        primes = list(sieve(int(ceil(sqrt(r)))))
        primes.remove(2)
        assert not l % 4
        q = [(-(l + 1 - (pk % 4) * pk) // 4) % pk for pk in primes]
        n = (r - l) // 4
        flags = [1] * n
        erase(flags, primes,q, n)
        for i, f in enumerate(flags):
            if f:
                yield l + i * 4 + 1
                
    low = 10 ** 50
    high = low + 10 ** 6
    
    t0 = time.clock()
    print sum(1 for p in gen_sieve_seg(low, high, 5)),
    print "{:6.1f} sec".format(time.clock() - t0)
    
    t0 = time.clock()
    print sum(1 for p in gen_sieve_seg_4np1(1000000, 2000000)),
    print "{:6.3f} sec".format(time.clock() - t0)
    
    """
    8737    4.3 sec
    35241  0.044 sec
    """
    

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 615 other followers

%d bloggers like this: