## Two Sieving Problems

### August 27, 2013

Whenever you need to enumerate a large number of primes, it is best to find a way to use a sieve to find them; generating candidates and testing them with a primality checker will always be slower. We use two different sieves to solve the two problems.

For the first problem, it is not possible to sieve by all the primes less than the square root of 1050 + 106; there are so many sieving primes that no one knows how many. What we do instead is sieve to some convenient `limit`, then apply a primality checker to those numbers that survive the sieve:

```(define (big-primes lo hi delta limit)   (let* ((output (list))          (sieve (make-vector delta #t))          (ps (cdr (primes limit)))          (qs (map (lambda (p) (modulo (* -1/2 (+ lo p 1)) p)) ps)))     (let loop ((lo lo) (qs qs))       (if (not (< lo hi)) (reverse output)         (begin           (do ((i 0 (+ i 1))) ((= i delta)) (vector-set! sieve i #t))           (do ((ps ps (cdr ps)) (qs qs (cdr qs))) ((null? ps))             (do ((j (car qs) (+ j (car ps)))) ((<= delta j))               (vector-set! sieve j #f)))           (do ((i 0 (+ i 1)) (t (+ lo 1) (+ t 2)))               ((or (<= delta i) (<= hi t)))             (if (and (vector-ref sieve i) (prime? t))               (set! output (cons t output))))           (loop (+ lo (* 2 delta))                 (map (lambda (p q) (modulo (- q delta) p)) ps qs)))))))```

This is just a segmented sieve, as in a prior exercise, but with the sieving primes in `ps` limited. It is called like this:

```> (length (big-primes #e1e50 (+ #e1e50 #e1e6) 25000 1000000)) 8737```

That took a little bit less than a minute on my old slow machine at home. Only about one second of that minute was spent sieving, the rest of the time was devoted to checking the 40808 candidates that survived the sieve. Increasing `limit` to 2000000 reduced the candidate count to 38888, and increasing `limit` to 5000000 reduced the candidate count to 36551. Changing from a Miller-Rabin test with k = 10 to a single pseudoprimality test to base 2 left the result unchanged and saved only a little bit of time, proving Henri Cohen’s comment about “industrial-grade primes.”

The second problem uses a segmented sieve that contains only the numbers congruent to 1 (mod 4); thus, instead of sieving 2, 3, 4, 5, … or 3, 5, 7, 9, …, we sieve 5, 9, 13, 17, …. Here `delta` is one-quarter the segment length, instead of one-half, and the definition of the `qs` changes to reflect the numbers that are in the sieve:

```(define (primes1mod4 lo hi delta)   (let* ((output (list))          (sieve (make-vector delta #t))          (ps (cdr (primes (isqrt hi))))          (qs (map (lambda (p) (modulo (* -1 (inverse 4 p) (+ lo p 1)) p)) ps)))     (let loop ((lo lo) (qs qs))       (if (not (< lo hi)) (reverse output)         (begin           (do ((i 0 (+ i 1))) ((= i delta)) (vector-set! sieve i #t))           (do ((ps ps (cdr ps)) (qs qs (cdr qs))) ((null? ps))             (do ((j (car qs) (+ j (car ps)))) ((<= delta j))               (vector-set! sieve j #f)))           (do ((i 0 (+ i 1)) (t (+ lo 1) (+ t 4)))               ((or (<= delta i) (<= hi t)))             (if (vector-ref sieve i) (set! output (cons t output))))           (loop (+ lo (* 4 delta))                 (map (lambda (p q) (modulo (- q delta) p)) ps qs)))))))```

It takes only 70ms to solve the problem:

```> (length (primes1mod4 1000000 2000000 25000)) 35241```

We have a bonus exercise today; it’s not about enumerating primes, but the solution does involve a sieve, so we include it here. The task is to make a list of all numbers less than or equal to n that have k distinct prime factors. For instance, with n = 25 and k = 1, the solution includes both prime numbers (their only factor is themselves) and prime powers (their only factor is the base prime): 2, 3, 4, 5, 7, 8, 9, 11, 13, 16, 17, 19, 23, 25. The solution is on the next page.

Pages: 1 2 3

### 2 Responses to “Two Sieving Problems”

1. Graham said

My first answers are in J; it’s definitely cheating when your language has a
“prime factors of” function.
Problem 1:

```+/@(1=#@q:)(10x^50)+1+i.1e6%2
```

This takes a very long time to get anywhere; the second is much faster:

```+/@(1=#@q:)1e6+1+4*1e6%4
```

I originally had simpler answers, but decided that only working over odd
numbers (first question) or numbers equivalent to 1 mod 4 (second question) was
worth the uglification of my code.

My second answers are in Ruby; I can’t imagine trying to do these in a language
like C or C++, even with the help of multiprecision libraries like GMP or NTL.
I went with the Miller-Rabin primality test.

```require 'mathn'

def decomp n
s, d = 0, n
while d.even?
s, d = s + 1, d >> 1
end
return s, d
end

def modpow n, e, m
r = 1
while e > 0
if e.odd?
r = (r * n) % m
end
e >>= 1
n = (n * n) % m
end
return r
end

def miller_rabin n, k=42
s, d= decomp(n - 1)
k.times do
a = 2 + rand(n - 4)
x = modpow a, d, n
next if [1, n - 1].include? x
flag = (s - 1).times do
x = (x * x) % n
return false if x == 1
break n - 1 if x == n - 1
end
next if flag == n - 1
return false
end
return true
end

class Integer
def prime?
return false if self < 2
return true if self == 2
return false if self.even?
return miller_rabin self
end
end

def problem1
(10**50 + 1 .. 10**50 + 10**6).step(2).select(&:prime?).length
end

def problem2
(10**6 + 1 .. 2 * 10**6).step(4).select(&:prime?).length
end
```
2. Paul said

A Python version with segments.

```import time
from math import ceil, sqrt

from gmpy2 import is_prime

from ma.primegmp import sieve

def erase(flags, primes, q, n):
for pk, qk in zip(primes, q):
for i in xrange(qk, n, pk):
flags[i] = 0

def gen_sieve_seg(l, r, number_miller_rabin=25):
"""segmented sieve
uses prime numbes <= 1000000
checks candidates with miller rabin
"""
primes = list(sieve(1000000))
primes.remove(2)
q = [(-(l + 1 + pk) // 2) % pk for pk in primes]
n = ((r - l + 1) // 2)
flags =  * n
erase(flags, primes,q, n)
for i, f in enumerate(flags):
if f:
cand = l + i * 2 + 1
if is_prime(cand, number_miller_rabin):
yield cand

def gen_sieve_seg_4np1(l, r):
"""segmented sieve
ony numbers 4*n+1 are sieved
"""
primes = list(sieve(int(ceil(sqrt(r)))))
primes.remove(2)
assert not l % 4
q = [(-(l + 1 - (pk % 4) * pk) // 4) % pk for pk in primes]
n = (r - l) // 4
flags =  * n
erase(flags, primes,q, n)
for i, f in enumerate(flags):
if f:
yield l + i * 4 + 1

low = 10 ** 50
high = low + 10 ** 6

t0 = time.clock()
print sum(1 for p in gen_sieve_seg(low, high, 5)),
print "{:6.1f} sec".format(time.clock() - t0)

t0 = time.clock()
print sum(1 for p in gen_sieve_seg_4np1(1000000, 2000000)),
print "{:6.3f} sec".format(time.clock() - t0)

"""
8737    4.3 sec
35241  0.044 sec
"""
```