Fibonacci Primes
October 29, 2010
A recent programming challenge asked you to solve the following problem:
Find the smallest prime fibonacci number greater than a given input number. Add one to that fibonacci prime, find the factors of the result, and return the sum of the factors. For instance, if the input number is 10, the smallest fibonacci prime greater than 10 is 13, the factors of 13 + 1 = 14 are 2 and 7, and their sum is 9.
Your task is to write a function to solve that problem, then apply your function to the input number 227000. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Benford’s Law
October 26, 2010
Benford’s Law, which was discovered by Simon Newcomb in 1881 and rediscovered by Frank Benford in 1938, states that, for many sets of numbers that arise in a scale-invariant manner, the first digit is distributed logarithmically, with the first digit 1 about 30% of the time, decreasing digit by digit until the first digit is 9 about 5% of the time. Stated mathematically, the leading digit d ∈ {1 … b-1} for a number written in base b will occur with probability Pd = logb(1 + 1/d). Thus, in base 10, the probabilities of the first digit being the number 1, 2, … 9 are 30.1%, 17.6%, 12.5%, 9.7%, 7.9%, 6.7%, 5.8%, 5.1% and 4.6%.
Benford’s Law is counter-intuitive but arises frequently in nature. It is also frequently used in auditing. Make a list of the amounts of the checks that a bookkeeper has written in the past year; if more that 5% begin with the digit 8 or 9, the bookkeeper is likely an embezzler. More important, precinct results of the disputed Iranian elections a year ago displayed anomalous first-digit counts, suggesting vote fraud.
Recently, Shriram Krishnamurthy issued a programming challenge on the Racket mailing list, asking for smallest/tightest/cleanest/best code to calculate the first-digit percentages of a list of numbers; he also challenged readers to apply the function to data in a comma-separated values file. He didn’t give a source, but did mention that he was interested in the littoral area (in acres) of the lakes of Missesota; sample data appears on the next page.
Your first task is to write a function that calculates the first-digit percentages of a list of numbers. Your second task is to calculate the first-digit percentages of the data on the next page. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Text File Databases: Part 2
October 22, 2010
In the previous exercise we developed four functions for reading records from various type of text-file databases. In today’s exercise we develop four functions for processing those records, including map, fold, filter, and for-each. We also wrote map-reduce in a previous exercise, which gives us a fifth processing function.
All four functions take as their first argument a reader function that fetches the next record from the input; we wrote some reader functions in the previous exercise, and it is common to write others for specific input formats. Map takes both a reader function and a transformer function and applies the transformer to each input record, returning a list of the transformed values in the same order as the input. Fold takes a reader function, a combining function and a base value and applies the combiner successively to each base value and the next record from the input, returning the final base value when the input is exhausted. Filter is a combinator; it takes a reader function and a predicate and returns a new reader function that passes only those input records for which the predicate is true. For-each takes a reader function and another procedure and applies the procedure to each input record in turn, only for its side-effects, until the input is exhausted; it returns nothing.
Your task is to write the four functions for processing text-file database records. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Text File Databases: Part 1
October 19, 2010
There is a lot of data stored in plain-ascii text files consisting of records separated by newlines, each record consisting of multiple fields, and it is useful to have a function library for dealing with them. This exercise looks at some functions for reading the data; the next exercise will look at some functions for processing the data.
We will consider four common types of text file databases. A file with fixed-length data fields has records of a fixed number of characters, each record containing fields that are similarly in fixed positions; the data may be preceded by a fixed-length header. A file with character-delimited fields has variable-length records, each with fields separated by a single-character delimiter; the delimiter is often a tab or vertical bar. A particular type of variable-length delimited text database is known as comma-separated values, where the delimiter is a comma and fields may be surrounded by double-quote characters so that a comma within a quoted field loses its meaning as a field separator; in that case, a literal double-quote character may appear within a quoted field as two double-quote characters in succession. The fourth type that we will consider is a name-value record, where each record consists of multiple fields, one field per line, separated by blank lines, each field consisting of a type-name and a value separated by a delimiter; this format is often used for databases that have many optional fields, such as bibliographic databases.
We want reader functions for each of these file formats that all return a single record each time they are called, or an end-of-file marker when the input is exhausted, and advance the file pointer to the beginning of the next record. The return value should be a list or array, whichever is convenient, containing the value of one field in each element, except for the name-value record, which should return a list of name/value pairs.
Different operating systems have different methods of signalling the end of a line. For maximum portability, your functions should accept the end of a line indicated by a carriage return, a line feed, or both characters in either order. You should be prepared to accept any type of line marker because the data may come from any source; for instance, your computer running MS Windows with a CRLF line marker may fetch data from a Linux computer with a bare LF for the line marker. You should also accept the final line in the file whether or not it has a trailing line marker.
Your task is to write functions to read one record from each of the four file types described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Find The Longest Palindrome In A String
October 15, 2010
Greplin issued a programming challenge recently that required programmers to solve three problems; when completed, Greplin issued an invitation to send them a resume. The first problem required the programmer to find the longest palindrome in the following 1169-character string:
Fourscoreandsevenyearsagoourfaathersbroughtforthonthisconta
inentanewnationconceivedinzLibertyanddedicatedtotheproposit
ionthatallmenarecreatedequalNowweareengagedinagreahtcivilwa
rtestingwhetherthatnaptionoranynartionsoconceivedandsodedic
atedcanlongendureWeareqmetonagreatbattlefiemldoftzhatwarWeh
avecometodedicpateaportionofthatfieldasafinalrestingplacefo
rthosewhoheregavetheirlivesthatthatnationmightliveItisaltog
etherfangandproperthatweshoulddothisButinalargersensewecann
otdedicatewecannotconsecratewecannothallowthisgroundThebrav
elmenlivinganddeadwhostruggledherehaveconsecrateditfarabove
ourpoorponwertoaddordetractTgheworldadswfilllittlenotlenorl
ongrememberwhatwesayherebutitcanneverforgetwhattheydidhereI
tisforusthelivingrathertobededicatedheretotheulnfinishedwor
kwhichtheywhofoughtherehavethusfarsonoblyadvancedItisrather
forustobeherededicatedtothegreattdafskremainingbeforeusthat
fromthesehonoreddeadwetakeincreaseddevotiontothatcauseforwh
ichtheygavethelastpfullmeasureofdevotionthatweherehighlyres
olvethatthesedeadshallnothavediedinvainthatthisnationunsder
Godshallhaveanewbirthoffreedomandthatgovernmentofthepeopleb
ythepeopleforthepeopleshallnotperishfromtheearth
Your task is to write a function that finds the longest palindrome in a string and apply it to the string given above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Rotate An Array
October 12, 2010
Today’s exercise is simple but tricky: write a function to rotate the elements of an array. The function takes two arguments: the array to be rotated and the number of elements to rotate, where a positive count rotates to the left and a negative count rotates to the right. For instance, given the array [1 2 3 4 5 6], a rotation of 2 to the left gives [3 4 5 6 1 2], and a further rotation of 2 to the right restores the original array.
You should be sure to handle the edge cases properly. If the count is zero or the length of the array, the array should remain unchanged. If the count is greater than the length of the array, you should still do the right thing; for instance, a rotation of 8 on the array given above gives [3 4 5 6 1 2], the same as a rotation of 2.
Your task is to write the rotate function described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Zeller’s Congruence
October 8, 2010
Zeller’s Congruence is a simple mathematical method for determining the day of the week for a given date.
In the 1880s, Christian Zeller, a German mathematician, noticed that, if you run a year from March through February, the cumulative number of days in each month forms a nearly straight line; that works because February, which would normally perturb the straight line, is moved to the end. He worked out the formula ⌊(13m−1)/5⌋ to give the number of weekdays that the start of the month moves each month, where m is the month number.
Then it is easy to calculate the day of the week for any given day: add the day of the month, the offset for the number of months since March, an offset for each year, and additional offsets for leap years and leap centuries (remembering to subtract one year for dates in January and February), taking the whole thing mod 7. It’s fun to work out the arithmetic yourself, but if you don’t want to take the time, the whole formula is shown in the solution.
Your task is to write a function that uses Zeller’s Congruence to calculate the day of the week for any given date. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
George Marsaglia’s Random Number Generators
October 5, 2010
In two posts on Usenet, George Marsaglia, a mathematician, statistician, computer scientist, and professor at Washington State and Florida State Universities, described a series of nine high-quality 32-bit random number generators; the references pages describe the various generators and their tasteful use in a variety of situations. Here they are in C:
#define znew (z=36969*(z&65535)+(z>>16))
#define wnew (w=18000*(w&65535)+(w>>16))
#define MWC ((znew<<16)+wnew)
#define SHR3 (jsr^=(jsr<<17), jsr^=(jsr>>13), jsr^=(jsr<<5))
#define CONG (jcong=69069*jcong+1234567)
#define FIB ((b=a+b),(a=b-a))
#define KISS ((MWC^CONG)+SHR3)
#define LFIB4 (c++,t[c]=t[c]+t[UC(c+58)]+t[UC(c+119)]+t[UC(c+178)])
#define SWB (c++,bro=(x<y),t[c]=(x=t[UC(c+34)])-(y=t[UC(c+19)]+bro))
#define UNI (KISS*2.328306e-10)
#define VNI ((long) KISS)*4.656613e-10
#define UC (unsigned char) /*a cast operation*/
typedef unsigned long UL;
/* Global static variables: */
static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;
static UL a=224466889, b=7584631, t[256], x=0, y=0, bro;
static unsigned char c=0;
/* Example procedure to set the table, using KISS: */
void settable(UL i1,UL i2,UL i3,UL i4,UL i5, UL i6)
{ int i; z=i1; w=i2; jsr=i3; jcong=i4; a=i5; b=i6;
for(i=0;i<256;i=i+1) t[i]=KISS; }
Personally, I think that code is a bit ugly on the outside but absolutely beautiful and exceedingly elegant on the inside. Marsaglia also provided a test suite, which should print seven zeros on successive lines:
#include <stdio.h>
int main(void){
int i; UL k;
settable(12345,65435,34221,12345,9983651,95746118);
for(i=1;i<1000001;i++){k=LFIB4;} printf("%u\n", k-1064612766U);
for(i=1;i<1000001;i++){k=SWB ;} printf("%u\n", k- 627749721U);
for(i=1;i<1000001;i++){k=KISS ;} printf("%u\n", k-1372460312U);
for(i=1;i<1000001;i++){k=CONG ;} printf("%u\n", k-1529210297U);
for(i=1;i<1000001;i++){k=SHR3 ;} printf("%u\n", k-2642725982U);
for(i=1;i<1000001;i++){k=MWC ;} printf("%u\n", k- 904977562U);
for(i=1;i<1000001;i++){k=FIB ;} printf("%u\n", k-3519793928U); }
Your task is to translate those nine random number generators into your favorite language. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Oban Numbers
October 1, 2010
Oban is a small town on the west coast of Scotland, home to a rocky port, a small distillery, and a half-built folly started by a banker with not-enough money and less sense. I fondly remember a visit to Oban in 1987. My mother bought a scarf, a tin of shortbread, and a wee dram of the local whiskey at a small shop on the main street (it may have been one of the shops in the picture, I’m not sure) and was scolded by the waitress for using the wrong fork at dinner. Oban numbers have nothing to do with Scotland, except that the name was the occasion of pleasant reminiscing.
MathWorld describes an oban number as a number from which the letter “o” is missing when it is spelled out in words (the letter “o” is “banned”).
Your task is to make a complete list of all the oban numbers. If your language provides a function that spells out a number in words, don’t use it, but write your own instead. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.