July 30, 2013
There was a big buzz on Stack Overflow recently when a question about buying croissants was asked:
At my office someone buys croissants for everybody every Friday. Sometimes we have problems arranging a suitable schedule because of absences — illness, vacation, training, customer meetings — so we’re looking for an algorithm that randomly chooses the person who should bring croissants so that everyone buys and consumes about as many croissants as everyone else.
The question caused a big ruckus at Stack Overflow, and is currently locked while the community decides if the question is appropriate; I personally think that it is, and further, that the question is both fun and practical.
Your task is to devise a suitable algorithm for deciding who should buy croissants, and demonstrate by simulation that your algorithms is fair. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
July 26, 2013
We have today another of our unlimited supply of interview questions: this one supposedly comes from Amazon:
Given an array X[0..n-1] of integers sorted into ascending order with no duplicates, find an array item that is also its index, so that X[i] = i.
For example, X = 3 in the array shown below:
i 0 1 2 3 4 5
x[i] -3 0 1 3 5 7
July 23, 2013
In the early days of Unix, Mike Lesk wrote a directory-assistance program that allowed users to input a name using the numeric keypad on a telephone and get a telephone number in return. The program worked by “signing” each name with its telephonic equivalent, then storing the signatures in a database. This scheme makes collisions possible, but in practice collisions are rare, and when they do happen the user is prompted for additional information.
Your task is to write a program that takes a name and returns a telephone number; a database is provided on the next page. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
July 19, 2013
It has been widely reported recently, including articles in the New York Times and the London Sunday Times, that J K Rowling wrote a book under a pseudonym that was discovered by a forensic linguist. Time magazine explains how the discovery was made:
As one part of his work, Juola uses a program to pull out the hundred most frequent words across an author’s vocabulary. This step eliminates rare words, character names and plot points, leaving him with words like of and but, ranked by usage. Those words might seem inconsequential, but they leave an authorial fingerprint on any work.
“Propositions and articles and similar little function words are actually very individual,” Juola says. “It’s actually very, very hard to change them because they’re so subconscious.”
The Time article gives a link to the program Juola used, but that site gives very little information about how the program works.
Your task is to write a program that compares the similarity of two texts to determine authorship; this task is purposely vague so you can make your own decisions about how to proceed. When you are finished, you are welcome to read or run a suggested solution or to post your own solution or discuss the exercise in the comments below.
July 16, 2013
We have today a problem from recreational mathematics.
A vampire number is a number v = x × y such that x and y are not both divisible by 10 and the digits of v consist of the digits of x and the digits of y. The number of digits n in the vampire number must be even, and the number of digits in each of the two fangs x and y must be half the number of digits in v. For instance 1260 = 21 × 60 and 1395 = 15 × 93 are both vampire numbers.
Your task is to write a program that calculates the vampire numbers of n digits. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
July 12, 2013
Our Standard Prelude has included a simple implementation of hash tables since the very beginning. That implementation is probably twenty years old, or more; it’s one of the first pieces of code that I wrote when I was first learning Scheme, and even though it has had a few improvements since then, it’s time for a better implementation of hash tables. We’ll do that today.
Let’s review. A hash table provides storage for key/value pairs, where the only operation permitted is to compare two keys for equality; unlike a binary search tree, there is no ordering relationship between two keys. The hash table stores key/value pairs in an array of lists; a hash function converts a key to an address in the array, then a linear search through the list at that array location identifies the key/value pair. The size of the array is fixed in advance; if it is too small, then the chains at each array location grow too long, which slows down the operation of the hash table, but if it is too large, space is wasted. Assuming that the array isn’t too very small, so that the chains don’t grow too long, and assuming a fair hash function, hash tables provide O(1) access to any key/value pair.
A dynamic hash table adjusts automatically to the number of key/value pairs that it stores. That implies some kind of compromise in the O(1) behavior of the hash table. One possibility is to store the chains in some kind of tree structure, but that means each access will cost O(log n). Another possibility is to double the size of the array whenever the load factor becomes too high, but that means pauses whenever the array is resized, which can be annoying (or worse) for some applications, and still leaves the asymptotic behavior at something greater than O(1).
So we cheat. We will store the chains in a two-stage data structure that consists of arrays of width w stored in the growable arrays of a previous exercise, and systematically increase the size of the array at each insertion instead of stopping from time to time for a big resizing. The arrays of width w mean that the base of the logarithm is w instead of 2, so for instance to store a million key/value pairs with w = 256 and a load factor of 5 we need 200,000 chains which will be stored in 800 elements of the growable array at a maximum depth from the root of 9, and we’ll say that 9 is close enough to 1 that O(log n) ≈ O(1). It’s fun to cheat!
To control the doubling of the array we maintain three variables: u is the current number of chains, m is the current number of available chains, and p is the index number of the next chain to be split. Variables u and m are initialized to w; p runs from 0 to m, which doubles when p reaches it, resetting p to 0 for the next doubling. When the average load factor is exceeded, p increases, the keys at bucket p are rehashed and split into two buckets at p and p + m, and p and u are increased by one. The table shrinks in a similar way during deletions, except that m and u never fall below w.
Your task is to write a program that maintains dynamic hash tables. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
July 9, 2013
There is no exercise today.
My mother, who is eighty-eight years old, was the bottom of three people who fell down some stairs a few days ago, and sustained a cut on her head that required five stitches, four broken ribs in her left back, and one bruise that runs from her left forearm, up to her left shoulder, and down the left side of her back, all the way to her left calf. Her injuries aren’t serious, but at her age neither are they simple, and it’s been difficult.
I’ve spent two recent overnights in hospital emergency rooms, and have been taking care of my mother during the day, so I haven’t had time to write an exercise for today. I do hope to have one ready for Friday.
In the meantime, none of my readers has done all the exercises. So pick an exercise you haven’t done and post your solution in the normal way.
July 5, 2013
Many programming languages provide a library function that calculates a serial number for each day, making it easy to calculate the number of days between two dates; we provide such a function in the Standard Prelude. Sometimes, though, the need is to calculate weekdays rather than total days.
Your task is to write a program that calculates the number of weekdays between two dates. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
July 2, 2013
Sm ppl cmprs txt msgs by rtnng only ths vwls tht bgn a wrd and by rplcng dbld ltrs wth sngl ltrs.
With a proper dictionary, it is possible to expand all the possibilities for a word. For instance, the “Sm” that starts the sentence above is properly translated “Some” but these other words are possible: same, sam, sum, seem, seam, sumo, and others.
Your task is to write a program that, given a sentence in text-speak, returns a list of all possibilities for each word. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.