The standard Sieve of Eratosthenes requires that you specify an upper bound before you start. But often, programs that use prime numbers don’t know the upper bound in advance, or don’t want to pre-compute and store all of the primes up to their bound. In those cases, an incremental Sieve of Eratosthenes can be used:

Algorithm PRIMEGEN: Create a generating function that returns the next prime number each time it is called, starting from zero.
1. [Prime the pump] Yield 2, then yield 3.
2. [Initialization] Set ps ← PRIMEGEN, D ← {}, p ← next(ps), p ← next(ps) again, qp × p, and cp.
3. [Next candidate] Set cc + 2.
4. [Composite candidate] If cD, skip to Step 5. Otherwise, set sD{c}, mc + s. Remove c from D. While mD, set mm + s. Set D{m} ← s. Go to Step 3.
5. [Prime candidate] If c = q, skip to Step 6. Otherwise, yield c and go to Step 3.
6. [Next sieving prime] Set sp + p, mc + s. While mD, set mm + s. Set D{m} ← s. Set p ← next(ps), qp × p. Go to Step 3.

The PRIMEGEN algorithm creates the sequence of prime numbers, returning the next prime in the sequence each time it is called; the exact mechanism to do that depends on the implementation language. The first step primes the pump (sorry!) by issuing the first two primes: 2 is a special case because it is even, and 3 is a special case because the pump only considers odd numbers as prime candidates.

The second step initializes two important data structures: ps is the list of sieving primes, which is determined by calling PRIMEGEN recursively, and D is a dictionary of composite/stride pairs, one for each sieving prime less than the square root of the current prime; the dictionary will most likely be implemented as a hash table, but other data structures such as a balanced binary tree could also be used. The other initializations are the current sieving prime p, its square q, and the initial prime candidate c.

The third step begins the main loop of the function by calculating the next prime candidate; eventually, all odd numbers starting from 5 will be prime candidates. Then there are three possibilities, each handled by a separate step.

The fourth step handles the case that the candidate c is composite, resetting the dictionary entry for the appropriate sieving prime. The fifth step handles the case that the candidate c is both composite and less than the square q of the current sieving prime, which indicates that the candidate is prime. The sixth step occurs when the candidate is composite but not in the dictionary, having reached the square of the current sieving prime, when a new sieving prime is added to the dictionary and the current sieving prime is updated in variables p and q. After the appropriate option has been processed, the algorithm returns to the top of the main loop to obtain the next prime.

In the fourth and sixth steps, the while loop calculates the new dictionary entry for the current sieving prime: the stride s = 2p is the distance between odd multiples of the sieving prime, and m is the smallest multiple of the sieving prime p greater than the current candidate c that is not already in the dictionary. The dictionary is keyed by m, which is a multiple of a sieving prime, with the corresponding stride s as its value.

There are several points to note about this algorithm. First, it is recursive, and there will eventually be a stack of sieving prime sequences, which sounds bizarre but actually makes sense. Second, by postponing the addition of new sieving primes to the dictionary until reaching their squares, only primes less than the square root of the current candidate need to be stored. And third, eliminating duplicate dictionary entries with the while loop of steps 4 and 6 keeps the size of the dictionary at exactly the number of sieving primes already processed. The whole algorithm is very efficient, making it useful whenever processing primes incrementally.

Time complexity of the algorithm is O(n log log n), the same as any other implementation of the Sieve of Eratosthenes, and space complexity is O(sqrt n) to store the dictionary of sieving primes.

Your task is to implement the incremental Sieve of Eratosthenes. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Today’s exercise is a reminder about the importance of writing good test code. We have two tasks. The first is to extract the next-to-last item from a linked list; for instance, given the list (1 2 3 4 5) the next-to-last item is 4. The second task is to extract the nth-from-last item from a linked last; for instance, given the same list, the second-from-last item is 4. In addition to writing the two functions, you should write test code that exercises the functions thoroughly.

Your task is to write the two functions and test code. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

One-Swappable Array

July 24, 2015

Today’s exercise helps somebody with their homework:

Given an array of unique integers, determine if it is possible to sort the array by swapping two elements of the array. For instance, the array [1,2,6,4,5,3,7] can be sorted by swapping 3 and 6, but there is no way to sort the array [5,4,3,2,1] by swapping two of its elements. You may use O(n) time, where the array has n integers, and constant additional space.

Your task is to write a program that determines if an array can be sorted with a single swap. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

A Number Puzzle

July 21, 2015

In last week’s exercise we had a word puzzle, so today’s exercise will be a number puzzle:

Find a 10-digit number, with all digits unique, such that the first n digits of the number are divisible by n. For instance, in the 3-digit number 345, the 1-digit prefix, 3, is divisible by 1, the 2-digit prefix, 34, is divisible by 2, and the 3-digit prefix, 345, is divisible by 3.

Your task is to write a program that finds a 10-digit number as defined above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

The Gas Station Problem

July 17, 2015

Today’s exercise is a classic problem in computer science courses:

There is a truck driving clockwise on a circular route; the truck has a gas tank with infinite capacity, initially empty. Along the route are gas stations with varying amounts of gas available: you are given a list of gas stations with the amounts of gas they have available and the amounts of gas required to drive to the next gas station. You must find a gas station that, for a trip starting from that gas station, will be able to return to that gas station.

For instance, consider a route with eight gas stations having 15, 8, 2, 6, 18, 9, 21, and 30 gallons of gas; from each of those gas stations to the next requires 8, 6, 30, 9, 15, 21, 2, and 18 gallons of gas. Obviously, you can’t start your trip at the third gas station, with 2 gallons of gas, because getting to the next gas station requires 30 gallons of gas and you would run out of gas reaching it.

Your task is to write a program that determines a suitable starting point for the truck; your algorithm should be linear in the number of gas stations and require constant space. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Ordered Words

July 14, 2015

An ordered word is one in which the letters in the word appear in alphabetic order; for instance, dirt and knotty are ordered words, praxis is not.

Your task is to write a program to find the longest ordered word in a dictionary. When you are finished, you are welcome to read a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

The city where I live used to publish a telephone directory; the “white pages” were distributed to all telephone customers once a year. Now my city no longer prints the directory; it is available on the internet, or you can pay an operator to look up a telephone number for you.

But there are still cities that print telephone directories, and some of those cities are big enough that the directory must be partitioned into multiple volumes. Consider this distribution of the first letters of customer’s last names:

A  B C  D  E  F G H I J  K L  M  N 0 P Q R  S  T U V W X Y Z
16 4 17 10 15 4 4 6 7 14 9 17 27 6 1 9 0 12 20 8 0 3 4 0 3 4

I’m not sure what the units are (probably tens of thousands of telephone customers), but the total is 220, and the telephone company has decided to print 4 volumes, so each should be about 55 units. One possible partitioning is A-D, E-J, K-O, P-Z, with counts of 47, 50, 60 and 63 units, differences of 8, 5, 5 and 8 from the ideal of 55, and a total difference of 26. Another partitioning is A-E, F-K, L-O, P-Z, with counts of 62, 44, 51, and 63 units, differences of 7, 11, 4 and 8, and a total difference of 30, which is worse. Before continuing, you might want to work out the optimal solution by hand, finding a minimum score of 18.

Your task is to write a program that determines the partitioning of the telephone book that minimizes the total difference. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Powerset

July 7, 2015

Sets are ubiquitous in programming; we have discussed sets in several previous exercises. One of the operations that can be performed on a set is to compute its powerset, which is a set of all the subsets of the original set. For instance, the powerset of (1 2 3) is the set (() (1) (2) (3) (1 2) (1 3) (2 3) (1 2 3)) where neither the order of the subsets nor the order of the elements of the individual subsets matters.

Your task is to write a program that computes the powerset of a set. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Assign Y

July 3, 2015

I’m sorry I missed Tuesday’s exercise; I’ve been very busy at work. Today’s exercise is an interview question of the kind I don’t like: it’s tricky, you either know the answer or you don’t, and it’s unlikely to be useful in any real programming situation:

You are give four integers x (which must be either 0 or 1), y, a and b. Your first task is to assign a to y if x = 0, or assign b to y if x = 1, without using any conditional operator, including the ternary operator. Your second task is to perform the same assignment without using any arithmetic operators.

Your task is to complete the two-part puzzle given above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2