## Two Random Selections

### December 10, 2010

In today’s exercise, we look at two programs that make random selections.

The first program picks an item from a list whose size is not known in advance, making a single pass through the list. The algorithm selects the first item in the list with probability ^{1}/_{1}, then replaces that item with the second item in the list with probability ^{1}/_{2}, then replaces whichever item is currently selected with the third item in the list with probability ^{1}/_{3}, and so on; when the end of the list is reached, each item will be selected with probability ^{1}/_{n}, where *n* is the number of items in the list. This algorithm is used in the unix `fortune`

program, which randomly selects a line from a file of pithy sayings, and appears in the Standard Prelude under the name `fortune`

.

The second program is used by auditors, pollsters, and others who need to randomly select *m* items from a population of *n* items. The program outputs a list of *m* numbers in the range 1 to *n*, in order; the user then selects the sampled items from a numbered list of the items in the population. The algorithm runs through the integers from 1 to *n* and selects each item with probability ^{m}/_{n}, where at each step *m* is the number remaining to be selected and *n* is the number remaining in the population.

Your task is to write the two programs described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

[...] today’s Programming Praxis exercisewe have to implement two algorithms that select random items from a list [...]

My Haskell solution (see http://bonsaicode.wordpress.com/2010/12/10/programming-praxis-two-random-selections/ for a version with comments):

Interesting. I did not post any code but tried to code it (I did have issues with the specification :D).

Python solution:

Consider “k_of_many”, which takes a subset of k elements uniformly at random from a list of unknown size. (Or all the elements, if the list is shorter than k.) Then “fortune” and “sample” are special cases.