## Profiling

### February 28, 2014

I was reading Jon Bentley’s book *More Programming Pearls* the other day; it’s a superb book, and I re-read it every few years. In the first chapter Bentley uses a profiler to improve a small function to identify prime numbers by trial division; in today’s exercise, we will do the same.

Your task is to use a profiler to improve a program you have written; depending on the facilities provided by your language, you may have to write your own profiler. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Crossing Hands

### February 25, 2014

The hands of an analog clock occasionally cross as they revolve around the dial.

Your task is to write a progam that determines how many times the hands cross in one twelve-hour period, and compute a list of those times. When you are finisehd, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Anagrams Within Words

### February 21, 2014

We have a little programming puzzle today:

Given two words, determine if the first word, or any anagram of it, appears in consecutive characters of the second word. For instance,

catappears as an anagram in the first three letters ofactor, butcardoes not appear as an anagram inactoreven though all the letters ofcarappear inactor.

Your task is to write a function to determine if an anagram is present in a word. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Two Interview Questions

### February 18, 2014

It’s been a while since we had any interview questions. We have two today, since they are so simple:

1) Given a number

n, find the smallest 3-digit number such that the product of its digits is equal ton. For example, givenn= 100, the solution is 455.

2) Given two arrays, one with

nitems and one withn+2 items including the same items as the array withnitems plus two additional items, find the two additional items. Assume none of the items are duplicates.

Your task is to write solutions to the two interview questions given above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Reservoir Sampling, Improved

### February 14, 2014

[ Today’s exercise was written by guest author Paul Hofstra, who holds a PhD in Theoretical Nuclear Physics from the Free University in Amsterdam and is now retired from a major oil company. Guest authors are always welcome; contact me at the link in the menu bar if you are interested in writing for Programming Praxis. ]

Reservoir sampling is a method to get a random sample of size *n* from a stream of *a priori* unknown (and possibly very large) size. Jeffrey Scott Vitter gives several interesting algorithms for solving this problem; the algorithm that we used in a recent exercise is his Algorithm R, and he also gives Algorithm X and Algorithm Z. Algorithm R begins by placing the first *n* input items in an array called the reservoir, then for each succeeding input item chooses a random number that determines if the new item should replace one of the items in the reservoir. Algorithm X and Algorithm Z are faster because, instead of generating a random number for each input item, they use the random numbers to determine how many input items to skip; in the algorithms stated below, *s* is the number of items to skip and *t* is the number of items already processed. The expected number of skipped items is *t* / (*n* − 1) − 1. The probability function for the number of skipped records is *f*(*s*) = (*n* / (*t* − *n*)) · ((*t*−*n*)^{s+1}) / (*t* + 1)^{s+1}) and the distribution function for the probability *S* ≤ *s* is *F*(*s*) = 1 − ((*t* + 1 − *n*) ^{s+1}) / (*t* + 1)^{s+1}), where *a*^{b} = *a* (*a* + 1) … (*a* + *b* − 1).

For Algorithm X, generate a uniform random number *r* between 0 and 1 and determine *S* as the lowest value for which *F*(*S*) ≥ *r*. Then skip *S* records, swap the next record with a record in the reservoir, and continue until the input is exhausted.

Algorithm Z improves Algorithm X by rejection sampling. Two new functions *g* and *h* and a constant *c* are defined so that *h*(*s*) ≤ *f*(*s*) ≤ *c* *g*(*s*), where *c* does not depend on *s*. The functions are as close to *f* as possible, and the integral of *g*(*G*) has a simple inverse, making *h*(*s*) much faster to calculate than *f*(*s*). Now calculate X = *G*^{−1}(*random*) and *S* = ⌊*X*⌋. The rejection method rejects *s* if a random number *r* > *f*(*S*) / (*c* *g*(*X*)). To avoid the expensive calculation of *f*(*S*), *r* is first compared to *h*(*S*) / (*c* *g*(*X*)), and *S* is accepted if *r* is smaller; otherwise, *f*(*S*) must be calculated and compared. The calculations of *h*, *g* and *c* are: *h*(s) = *n* / (*t* + 1) · ((*t* − *n* + 1) / (*t* + *s* − *n* + 1))^{n+1}, *g*(*s*) = *n* / (*t* + *s*) · (*t* / (*t* + *s*))^{n}, *G*^{−1}(*y*) = *t*((1 − *y*)^{−1/n} − 1), and *c* = (*t* + 1) / (*t* − *n* + 1). Since this is fastest when t is large, Vitter recommends using Algorithm X when *t* ≤ *T* · *n* with *T* = 22.

Your task is to write functions to implement Algorithm X and Algorithm Z. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Years, Months, Days

### February 11, 2014

I was watching the Olympic Games this weekend, and one of the commentators remarked that one of the athletes was the youngest ever to accomplish some goal, aged so many years, so many months, and so many days, which was a few days younger than the previous youngest athlete. That didn’t matter so much to me, but it did suggest today’s exercise.

Your task is to write a function that, given two dates, returns the difference between them in years, months and days. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## N+1 Primality Prover

### February 7, 2014

In the previous exercise we used a Lucas sequence to determine if a number is probably prime. In today’s exercise we use a Lucas sequence to prove the primality of a number *N*, given the factors of *N* + 1. Today’s algorithm is similar to a previous exercise in which we proved the primality of a number *N* using the factors of *N* − 1.

Choose

PandQsuch thatD=P^{2}− 4Qis not a square moduloN. LetN+ 1 =F·RwithF>R, whereRis odd and the prime factorization ofFis known. If there exists a Lucas sequence of discriminantDwithU(N+1) ≡ 0 (modN) and gcd(U((N+1)/q),N) = 1 for each primeqdividingF, thenNis prime; if no such sequence exists for a givenPandQ, a newP‘ andQ‘ with the sameDcan be computed asP‘ =P+ 2 andQ‘ =P+Q+ 1 (the sameDmust be used for all the factorsq).

Your task is to write a program that proves the primality of a number. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Lucas Pseudoprimes

### February 4, 2014

Since we now have a function to compute Lucas sequences, we can take another look at testing primality using Lucas pseudoprimes.

The standard version of the Lucas pseudoprime test chooses a suitable*P* and *Q*, computes the Lucas sequence *U*_{n+1}(*P*,*Q*), and declares *n* either prime or a Lucas pseudoprime if *U*_{n+1}(*P*,*Q*) ≡ 0 (mod *n*). The strong version of the Lucas pseudoprime test calculates *n* = *d* · 2^{s} + 1 with *d* odd, and declares *n* either prime or a Lucas pseudoprime if *U _{d}*(

*P*,

*Q*) ≡ 0 (mod

*n*) or if

*V*

_{d 2r }(

*P*,

*Q*) ≡ 0 (mod

*n*) for any

*r*with 0 ≤

*r*<

*s*. John Selfridge suggested a suitable method for choosing

*P*and

*Q*: choose

*D*as the smallest number in the sequence 5, -7, 9, -11, 13, -15, 17, … for which the Jacobi symbol (

*D*/

*n*) = -1, then set

*P*= 1 and

*Q*= (1 –

*D*) / 4.

Robert Baillie and Sam Wagstaff developed a pseudoprimality test based on Lucas pseudoprimes. Their procedure has four steps: 1) If *n* is divisible by any prime less than 1000, then *n* is composite; 2) if *n* is not a strong probable prime to base 2, then *n* is composite; 3) choose parameters *P* and *Q* and declare *n* to be composite if it is a square; 4) If *n* is not a Lucas probable prime with parameters *P* and *Q*, using either the standard or strong version of the Lucas test, then *n* is composite. Otherwise, *n* is almost certainly prime. The procedure can fail only by asserting that a composite number is prime; it will never report that a prime number is composite.

Your task is to write functions to determine if *n* is a probable prime using the standard and strong Lucas pseudoprime test. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.