Incremental Sieve Of Eratosthenes
July 31, 2015
The standard Sieve of Eratosthenes requires that you specify an upper bound before you start. But often, programs that use prime numbers don’t know the upper bound in advance, or don’t want to pre-compute and store all of the primes up to their bound. In those cases, an incremental Sieve of Eratosthenes can be used:
Algorithm PRIMEGEN: Create a generating function that returns the next prime number each time it is called, starting from zero.
1. [Prime the pump] Yield 2, then yield 3.
2. [Initialization] Set ps ← PRIMEGEN, D ← {}, p ← next(ps), p ← next(ps) again, q ← p × p, and c ← p.
3. [Next candidate] Set c ← c + 2.
4. [Composite candidate] If c ∉ D, skip to Step 5. Otherwise, set s ← D{c}, m ← c + s. Remove c from D. While m ∈ D, set m ← m + s. Set D{m} ← s. Go to Step 3.
5. [Prime candidate] If c = q, skip to Step 6. Otherwise, yield c and go to Step 3.
6. [Next sieving prime] Set s ← p + p, m ← c + s. While m ∈ D, set m ← m + s. Set D{m} ← s. Set p ← next(ps), q ← p × p. Go to Step 3.
The PRIMEGEN algorithm creates the sequence of prime numbers, returning the next prime in the sequence each time it is called; the exact mechanism to do that depends on the implementation language. The first step primes the pump (sorry!) by issuing the first two primes: 2 is a special case because it is even, and 3 is a special case because the pump only considers odd numbers as prime candidates.
The second step initializes two important data structures: ps is the list of sieving primes, which is determined by calling PRIMEGEN recursively, and D is a dictionary of composite/stride pairs, one for each sieving prime less than the square root of the current prime; the dictionary will most likely be implemented as a hash table, but other data structures such as a balanced binary tree could also be used. The other initializations are the current sieving prime p, its square q, and the initial prime candidate c.
The third step begins the main loop of the function by calculating the next prime candidate; eventually, all odd numbers starting from 5 will be prime candidates. Then there are three possibilities, each handled by a separate step.
The fourth step handles the case that the candidate c is composite, resetting the dictionary entry for the appropriate sieving prime. The fifth step handles the case that the candidate c is both composite and less than the square q of the current sieving prime, which indicates that the candidate is prime. The sixth step occurs when the candidate is composite but not in the dictionary, having reached the square of the current sieving prime, when a new sieving prime is added to the dictionary and the current sieving prime is updated in variables p and q. After the appropriate option has been processed, the algorithm returns to the top of the main loop to obtain the next prime.
In the fourth and sixth steps, the while loop calculates the new dictionary entry for the current sieving prime: the stride s = 2p is the distance between odd multiples of the sieving prime, and m is the smallest multiple of the sieving prime p greater than the current candidate c that is not already in the dictionary. The dictionary is keyed by m, which is a multiple of a sieving prime, with the corresponding stride s as its value.
There are several points to note about this algorithm. First, it is recursive, and there will eventually be a stack of sieving prime sequences, which sounds bizarre but actually makes sense. Second, by postponing the addition of new sieving primes to the dictionary until reaching their squares, only primes less than the square root of the current candidate need to be stored. And third, eliminating duplicate dictionary entries with the while loop of steps 4 and 6 keeps the size of the dictionary at exactly the number of sieving primes already processed. The whole algorithm is very efficient, making it useful whenever processing primes incrementally.
Time complexity of the algorithm is O(n log log n), the same as any other implementation of the Sieve of Eratosthenes, and space complexity is O(sqrt n) to store the dictionary of sieving primes.
Your task is to implement the incremental Sieve of Eratosthenes. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.