## The Next Prime

### March 26, 2010

In two previous exercises, we had to iterate through the prime numbers. In one case, we generated a large number of primes using the Sieve of Eratosthenes, but without knowing in advance how large the sieve needed to be, and in the other case we iterated through the odd integers, checking the primality of each. Both solutions were less than attractive. In consideration of the old rule that if you do something twice you ought to build it into an abstraction, we will today write a function that, given a positive integer n, returns the smallest prime number greater than n.

Our method is to pre-compute a large number of primes and store them on disk. If n is within the bounds of the pre-computed list, it is easy to find the next prime. But if n is too large, we revert to checking individual candidates for primality. For our example we will pre-compute the primes to a million, but depending on your aspirations and your memory budget, you could adjust that number as desired.
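The fallback for large n can be sketched in a few lines. This is only a minimal illustration, assuming plain trial division as the primality test; the names `prime?` and `next-prime-slow` are hypothetical and are not taken from the suggested solution.

```clojure
;; Hypothetical sketch: find the next prime by testing successive candidates.
;; Trial division is used for simplicity; it is far too slow for large n.
(defn prime?
  "True if n is prime, checked by trial division with odd divisors up to sqrt(n)."
  [n]
  (and (> n 1)
       (or (= n 2)
           (and (odd? n)
                (every? #(pos? (rem n %))
                        (take-while #(<= (* % %) n)
                                    (iterate #(+ % 2) 3)))))))

(defn next-prime-slow
  "Smallest prime greater than n, found by testing n+1, n+2, ... in turn."
  [n]
  (first (filter prime? (iterate inc (inc n)))))
```

For example, `(next-prime-slow 13)` walks through 14, 15 and 16 before stopping at 17. A stronger primality test would take the place of `prime?` in any serious use.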

To save memory, we will store the pre-computed primes in a compressed data structure. Every prime number greater than 5 can be expressed as 30k±1, 30k±7, 30k±11, or 30k±13 for some k. That means we can use eight bits per thirty numbers to store all the primes; the primes less than a million can be compressed to 33,334 bytes, plus a small program to load the compressed primes from disk and to manipulate the compressed data structure.
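One way to picture the layout: each byte covers a block of thirty consecutive integers, and each of its eight bits stands for one of the residues 1, 7, 11, 13, 17, 19, 23 and 29 mod 30. The sketch below is an assumption about how such a mapping might look; the names `residues`, `bit-of`, `position` and `table-bytes`, and the choice of which bit goes with which residue, are illustrative only.

```clojure
;; Hypothetical sketch of the eight-bits-per-thirty-numbers layout.
(def residues [1 7 11 13 17 19 23 29])

(def bit-of
  "Map from residue mod 30 to its bit mask within a byte (1, 2, 4, ... 128)."
  (zipmap residues (iterate #(* 2 %) 1)))

(defn position
  "Return [byte-index bit-mask] for n, or nil if n shares a factor with 30."
  [n]
  (when-let [mask (bit-of (mod n 30))]
    [(quot n 30) mask]))

(defn table-bytes
  "Number of bytes needed to cover every integer below limit."
  [limit]
  (quot (+ limit 29) 30))
```

Here `(position 97)` is `[3 2]`, since 97 = 30·3 + 7, and `(table-bytes 1000000)` is 33334, which matches the size quoted above.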

Your task is to write a function that builds the compressed data structure described above, a second function that loads it from disk to memory, and a third function that uses the compressed data structure to calculate the next prime. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

### 4 Responses to “The Next Prime”

1. programmingpraxis said

Another illustration of `next-prime` is this alternate implementation of Pollard’s p-1 factorization algorithm, which we have studied in two previous exercises:

```
(define (pollard n b1 b2)
  (let stage1 ((a 2) (p 2))
    (if (< p b1)
        ; raise a to the largest power of p that does not exceed b1
        (stage1 (expm a (expt p (ilog p b1)) n) (next-prime p))
        (let ((d (gcd (- a 1) n)))
          (if (< 1 d n) (list 'stage1 d)
            (let stage2 ((p (next-prime b1)))
              (if (< b2 p) #f
                (let ((d (gcd (- (expm a p n) 1) n)))
                  (if (< 1 d n) (list 'stage2 d)
                    (stage2 (next-prime p)))))))))))
```

In addition to `next-prime`, we use `expm` and `ilog` from the Standard Prelude. For example:

```
> (pollard 15770708441 150 180)
(stage2 135979)
```

A more serious effort is this factorization of the sixty-ninth repunit R69 = (10^69 – 1) / 9. The small factors 3, 37 and 277 are quickly found by trial division. Then Pollard’s p-1 method with bounds B1 = 30000 and B2 = 600000 finds the fifteen-digit factor 203864078068831. Applying the p-1 method again with the same bounds finds the twenty-three-digit factor 11111111111111111111111, another repunit. The remaining twenty-eight-digit factor 1595352086329224644348978893 is prime.
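The arithmetic is easy to check by multiplying the factors back together. The following is just a sanity check of the product, not a primality proof of any factor; the names `r69` and `factor-product` are made up for the illustration.

```clojure
;; Hypothetical check: the six factors quoted above should multiply to R69.
(def r69
  "The repunit with sixty-nine ones, (10^69 - 1) / 9."
  (quot (dec (apply * (repeat 69 10N))) 9))

(def factor-product
  (* 3N 37N 277N
     203864078068831N
     11111111111111111111111N
     1595352086329224644348978893N))

(println (= r69 factor-product))
```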

If you are interested in the factorization of repunits, see Makoto Kamada’s web page.

2. pv2b said

A million primes will not compress down to 33334 bytes. However, all primes up to one million compress down to this file size.

3. [...] : well, you might find this link of interest … it speaks to timing farther down the narrative: http://programmingpraxis.com/2010/03/26/the-next-prime/ Re: original ideas, the only thing that comes to mind is publishing via arxiv at some point if [...]

4. David said

A Clojure solution that includes the Miller-Rabin primality test, modified to use known deterministic tests when possible. To access the table, we use the fact that the expression `x & -x` clears every bit in x except the lowest set bit, so isolating the smallest bit is a constant-time operation. We then index into a table by that value mod 11 to convert the low bit (which is a power of two) to the appropriate offset (1, 7, 11, 13, 17, 19, 23, or 29). I use an array of masks, chosen by the input mod 30, to mask out the bits we will not consider before calculating `x & -x`. For primes > 1,000,000, I pretty much follow the reference solution.
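To see why a table indexed mod 11 is enough, note that the eight possible single-bit values leave eight different remainders when divided by 11. A small sketch, separate from the solution itself (the names `lowest-set-bit` and `residues-mod-11` are illustrative only):

```clojure
;; Hypothetical demonstration of the x & -x trick and the mod 11 lookup.
(defn lowest-set-bit
  "Isolate the lowest set bit of x using two's-complement negation."
  [x]
  (bit-and x (- x)))

(def residues-mod-11
  "Remainders mod 11 of the eight single-bit byte values."
  (map #(mod % 11) [1 2 4 8 16 32 64 128]))

(println (lowest-set-bit 2r10110000))   ; the lowest set bit of 176 is 16
(println residues-mod-11)
(println (apply distinct? residues-mod-11))
```

Because the remainders are all different and none is zero, an eleven-entry table can map each isolated bit straight to its offset, with index 0 left over for the case where no bit is set.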

```
(load "miller-rabin")

(defn load-primes
  "Load a file of compressed primes"
  [filename]
  (let [input (java.io.DataInputStream.
                (java.io.BufferedInputStream.
                  (java.io.FileInputStream. filename))),
        next-byte (fn []
                    (try
                      (.readByte input)
                      (catch java.io.EOFException e
                        nil)))]

    (loop [data (vector-of :byte)]
      (let [x (next-byte)]
        (if (nil? x)
          data
          (recur (conj data x)))))))

(defn lowbit
  "Get low order 1 bit in 1 byte value in constant time and convert to
   the prime mod 30"
  [x]
  (let [mod_30 (vector-of :int -1 1 7 -1 11 17 -1 29 13 23 19)]
    (mod_30 (mod (bit-and x (- x)) 11))))

(def masks
  (reduce into [
    (vector-of :int)
    (repeat 1 2r11111111)    ; 0
    (repeat 6 2r11111110)    ; 1 - 6
    (repeat 4 2r11111100)    ; 7 - 10
    (repeat 2 2r11111000)    ; 11 - 12
    (repeat 4 2r11110000)    ; 13 - 16
    (repeat 2 2r11100000)    ; 17 - 18
    (repeat 4 2r11000000)    ; 19 - 22
    (repeat 6 2r10000000)    ; 23 - 28
    (repeat 1 2r00000000)])) ; 29

(def wheel235 (cycle [6 4 2 4 2 4 6 2]))
(def start-wheel   ; [offset of next possible prime & # of times to spin the wheel]
  (reduce into [
    []
    (repeat 1 [1  0])    ; 0
    (repeat 6 [7  1])    ; 1 - 6
    (repeat 4 [11 2])    ; 7 - 10
    (repeat 2 [13 3])    ; 11 - 12
    (repeat 4 [17 4])    ; 13 - 16
    (repeat 2 [19 5])    ; 17 - 18
    (repeat 4 [23 6])    ; 19 - 22
    (repeat 6 [29 7])    ; 23 - 28
    (repeat 1 [31 0])])) ; 29

(def prime-table  (load-primes "prime1e6.dat"))

(defn next-prime
  "Given n, return next prime > n
   For large n (> 3.4e14) tests are probabilistic"
  [n]
  (let [q (quot n 30), r (rem n 30)]
    (cond
      (< n 2)  2
      (< n 3)  3
      (< n 5)  5
      (<= n 1000000)
        (loop [index q, mask (masks r)]
          (let [offset (lowbit (bit-and (prime-table index) mask))]
            (if (> offset 0)
              (+ (* index 30) offset)
              (recur (inc index) 16rFF))))
      :otherwise
        (let [[offset, spins]  (start-wheel r),
              base  (* 30 q)]
          (loop [n (+ base offset), wheel (drop spins wheel235)]
            (if (prime? n)
              n
              (recur (+ (first wheel) n)  (rest wheel))))))))

(def primes (iterate next-prime 2))
(defn primes-after [n]  (drop 1 (iterate next-prime n)))
```

To generate the file, there is another Clojure program that uses the (slow) O’Neill sieve to generate the primes below 1,000,020 (one million rounded up to a multiple of 30) and writes them to a binary file.

```
(load "lazy-sieve")

(defn bit_pos
  "Return byte # and bit position of a prime number"
  [n]
  (let [bit_offset {1 1, 7 2, 11 4, 13 8, 17 16, 19 32, 23 64, 29 128}]
    [(quot n 30), (bit_offset (mod n 30))]))

(defn adjust_max [n]   ; table size should be multiple of 30
  (let [r (mod n 30)]
    (if (> r 0)
      (+ n (- 30 r))
      n)))

(defn to_signed_byte [b]  ; java work-around, byte must be in range -128..127
  (if (< b 128)
    b
    (- b 256)))

(defn make-table
  "Create a compressed table of prime numbers with given size.
   (excluding 2,3,5)"
  [size]
  (let [max_prime (adjust_max size)
        table (into (vector-of :byte) (repeat (quot max_prime 30) 0))
        primes (take-while #(< %1 max_prime) (drop 3 lazy-primes))]
    (loop [l primes, t table]
      (if (empty? l)
        t
        (let [[index, offset] (bit_pos (first l))]
          (recur (rest l) (assoc t index (to_signed_byte (bit-or (t index) offset)))))))))

(def primes (make-table (int 1e6)))
(def output (java.io.DataOutputStream.
              (java.io.BufferedOutputStream.
                (java.io.FileOutputStream. "prime1e6.dat"))))

(doseq [p primes] (.writeByte output (int p)))

(.close output)
```

And some testing…

```
user=> (take 25 primes)
(2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97)
user=> (take 20 (primes-after 1000000))
(1000003 1000033 1000037 1000039 1000081 1000099 1000117 1000121 1000133 1000151 1000159 1000171 1000183 1000187 1000193 1000199 1000211 1000213 1000231 1000249)
user=> (count (take-while #(< %1 1000) primes))
168
user=> (count (take-while #(< %1 1000000) primes))
78498
```