Profiling

February 28, 2014

I was reading Jon Bentley’s book More Programming Pearls the other day; it’s a superb book, and I re-read it every few years. In the first chapter Bentley uses a profiler to improve a small function to identify prime numbers by trial division; in today’s exercise, we will do the same.

Your task is to use a profiler to improve a program you have written; depending on the facilities provided by your language, you may have to write your own profiler. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Crossing Hands

February 25, 2014

The hands of an analog clock occasionally cross as they revolve around the dial.

Your task is to write a progam that determines how many times the hands cross in one twelve-hour period, and compute a list of those times. When you are finisehd, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Anagrams Within Words

February 21, 2014

We have a little programming puzzle today:

Given two words, determine if the first word, or any anagram of it, appears in consecutive characters of the second word. For instance, cat appears as an anagram in the first three letters of actor, but car does not appear as an anagram in actor even though all the letters of car appear in actor.

Your task is to write a function to determine if an anagram is present in a word. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Two Interview Questions

February 18, 2014

It’s been a while since we had any interview questions. We have two today, since they are so simple:

1) Given a number n, find the smallest 3-digit number such that the product of its digits is equal to n. For example, given n = 100, the solution is 455.

2) Given two arrays, one with n items and one with n+2 items including the same items as the array with n items plus two additional items, find the two additional items. Assume none of the items are duplicates.

Your task is to write solutions to the two interview questions given above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Reservoir Sampling, Improved

February 14, 2014

[ Today's exercise was written by guest author Paul Hofstra, who holds a PhD in Theoretical Nuclear Physics from the Free University in Amsterdam and is now retired from a major oil company. Guest authors are always welcome; contact me at the link in the menu bar if you are interested in writing for Programming Praxis. ]

Reservoir sampling is a method to get a random sample of size n from a stream of a priori unknown (and possibly very large) size. Jeffrey Scott Vitter gives several interesting algorithms for solving this problem; the algorithm that we used in a recent exercise is his Algorithm R, and he also gives Algorithm X and Algorithm Z. Algorithm R begins by placing the first n input items in an array called the reservoir, then for each succeeding input item chooses a random number that determines if the new item should replace one of the items in the reservoir. Algorithm X and Algorithm Z are faster because, instead of generating a random number for each input item, they use the random numbers to determine how many input items to skip; in the algorithms stated below, s is the number of items to skip and t is the number of items already processed. The expected number of skipped items is t / (n − 1) − 1. The probability function for the number of skipped records is f(s) = (n / (tn)) · ((tn)s+1) / (t + 1)s+1) and the distribution function for the probability Ss is F(s) = 1 − ((t + 1 − n) s+1) / (t + 1)s+1), where ab = a (a + 1) … (a + b − 1).

For Algorithm X, generate a uniform random number r between 0 and 1 and determine S as the lowest value for which F(S) ≥ r. Then skip S records, swap the next record with a record in the reservoir, and continue until the input is exhausted.

Algorithm Z improves Algorithm X by rejection sampling. Two new functions g and h and a constant c are defined so that h(s) ≤ f(s) ≤ c g(s), where c does not depend on s. The functions are as close to f as possible, and the integral of g(G) has a simple inverse, making h(s) much faster to calculate than f(s). Now calculate X = G−1(random) and S = ⌊X⌋. The rejection method rejects s if a random number r > f(S) / (c g(X)). To avoid the expensive calculation of f(S), r is first compared to h(S) / (c g(X)), and S is accepted if r is smaller; otherwise, f(S) must be calculated and compared. The calculations of h, g and c are: h(s) = n / (t + 1) · ((tn + 1) / (t + sn + 1))n+1, g(s) = n / (t + s) · (t / (t + s))n, G−1(y) = t((1 − y)−1/n − 1), and c = (t + 1) / (tn + 1). Since this is fastest when t is large, Vitter recommends using Algorithm X when tT · n with T = 22.

Your task is to write functions to implement Algorithm X and Algorithm Z. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Years, Months, Days

February 11, 2014

I was watching the Olympic Games this weekend, and one of the commentators remarked that one of the athletes was the youngest ever to accomplish some goal, aged so many years, so many months, and so many days, which was a few days younger than the previous youngest athlete. That didn’t matter so much to me, but it did suggest today’s exercise.

Your task is to write a function that, given two dates, returns the difference between them in years, months and days. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

N+1 Primality Prover

February 7, 2014

In the previous exercise we used a Lucas sequence to determine if a number is probably prime. In today’s exercise we use a Lucas sequence to prove the primality of a number N, given the factors of N + 1. Today’s algorithm is similar to a previous exercise in which we proved the primality of a number N using the factors of N − 1.

Choose P and Q such that D = P2 − 4 Q is not a square modulo N. Let N + 1 = F · R with F > R, where R is odd and the prime factorization of F is known. If there exists a Lucas sequence of discriminant D with U(N+1) ≡ 0 (mod N) and gcd(U((N+1)/q), N) = 1 for each prime q dividing F, then N is prime; if no such sequence exists for a given P and Q, a new P‘ and Q‘ with the same D can be computed as P‘ = P + 2 and Q‘ = P + Q + 1 (the same D must be used for all the factors q).

Your task is to write a program that proves the primality of a number. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Lucas Pseudoprimes

February 4, 2014

Since we now have a function to compute Lucas sequences, we can take another look at testing primality using Lucas pseudoprimes.

The standard version of the Lucas pseudoprime test chooses a suitableP and Q, computes the Lucas sequence Un+1(P,Q), and declares n either prime or a Lucas pseudoprime if Un+1(P,Q) ≡ 0 (mod n). The strong version of the Lucas pseudoprime test calculates n = d · 2s + 1 with d odd, and declares n either prime or a Lucas pseudoprime if Ud(P,Q) ≡ 0 (mod n) or if Vd 2r (P,Q) ≡ 0 (mod n) for any r with 0 ≤ r < s. John Selfridge suggested a suitable method for choosing P and Q: choose D as the smallest number in the sequence 5, -7, 9, -11, 13, -15, 17, … for which the Jacobi symbol (D / n) = -1, then set P = 1 and Q = (1 – D) / 4.

Robert Baillie and Sam Wagstaff developed a pseudoprimality test based on Lucas pseudoprimes. Their procedure has four steps: 1) If n is divisible by any prime less than 1000, then n is composite; 2) if n is not a strong probable prime to base 2, then n is composite; 3) choose parameters P and Q and declare n to be composite if it is a square; 4) If n is not a Lucas probable prime with parameters P and Q, using either the standard or strong version of the Lucas test, then n is composite. Otherwise, n is almost certainly prime. The procedure can fail only by asserting that a composite number is prime; it will never report that a prime number is composite.

Your task is to write functions to determine if n is a probable prime using the standard and strong Lucas pseudoprime test. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Reservoir Sampling

January 31, 2014

If you want to choose a sample of k out of n items, and the n items are stored in an array, choosing the sample is easy: just generate k random numbers from 0 to n−1, without duplicates, and look up the associated array items. But that’s harder to do if the items are stored in a list, because indexing into a list is O(n) instead of O(1), and it’s impossible if the list is too large to fit in memory. The Standard Prelude includes a function that works when k = 1, but in the general case we must use a streaming algorithm called reservoir sampling.

The algorithm creates an array of length k, the reservoir, and initializes it with the first k elements of the input list. Then the remaining nk elements of list are each considered, one by one, in order. When the ith element of the input list is reached, a random integer j less than i is generated; if j < k, the jth element of the reservoir array is replaced by the ith element of the list. When all elements of the list have been considered, the items that remain in the reservoir array are returned as the k randomly-selected elements of the input list.

Your task is to write a program that performs reservoir sampling. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

William Hart wrote, in one line of PARI, a function that factors integers. His method, which he gives in the form of a program, is a variant of Fermat’s method:

ONELINEFACTOR(n,iter)
1    for i ← 1 … iter do
2        s ← ⌈√ni
3        ms2 (mod n)
4        if ISSQUARE(m) then
5            t ← √m
6            return GCD(n, st)
7        endif
8    endfor

Hart’s paper, which you should read, gives several optimizations. The algorithm works best when the two factors are near each other, so you should first perform trial division to n1/3, thus ensuring that there is a factor between n1/3 and n1/2, unless n is prime. The algorithm works well for semi-primes up to 1016 or 1018, and Hart claims it is faster than SQUFOF in that range.

Your task is to write a function that factors integers by Hart’s method. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Follow

Get every new post delivered to your Inbox.

Join 612 other followers