Dodgson’s Doublets
March 20, 2009
Charles Dodgson was an English author, mathematician and logician of the nineteenth century; better known by his pseudonym Lewis Carroll, his most famous works are Alice’s Adventures in Wonderland and the sequel Through the Looking Glass. In 1879, Carroll published the Doublets word game in the Vanity Fair magazine:
The rules of the Puzzle are simple enough. Two words are proposed, of the same length; and the Puzzle consists in linking these together by interposing other words, each of which shall differ from the next word in one letter only. That is to say, one letter may be changed in one of the given words, then one letter in the word so obtained, and so on, till we arrive at the other given word. The letters must not be interchanged among themselves, but each must keep to its own place. As an example, the word ‘head’ may be changed into ‘tail’ by interposing the words ‘heal, teal, tell, tall’. I call the given words ‘a Doublet’, the interposed words ‘Links’, and the entire series ‘a Chain’, of which I here append an example:
H E A D
h e a l
t e a l
t e l l
t a l l
T A I LIt is, perhaps, needless to state that it is de rigueur that the links should be English words, such as might be used in good society.
Write a program that takes two words and finds a chain between them. What is the chain that connects black to white?
Comma-Separated Values
March 17, 2009
The comma-separated values file format is commonly used to exchange tabular information, such as spreadsheets, between different programs and operating systems. Though common, the csv format has never been formally defined, and there are variations. One definition is at http://www.rfc-editor.org/rfc/rfc4180.txt.
We will define a csv file as containing ascii text, with records terminated by a line-termination sequence and fields containing zero or more characters separated by commas; a line-termination sequence is a carriage-return character, or a line-feed character, or both characters in either order. Leading and trailing space in unquoted fields is preserved. Fields may be surrounded by double-quote characters (ascii \042) may contain any characters; such fields may contain newlines, literal commas (ascii \054), and double-quote characters represented as two successive double-quotes. The field separator is normally a comma, but may be changed to an arbitrary character (often a semi-colon) in those European countries that use a comma instead of a decimal point.
The examples below show some of the kinky situations that may arise; for readability, the three-character sequence TAB represents a single tab character (ascii \011):
1,abc,def ghi,jkl,unquoted character strings
2,"abc","def ghi","jkl",quoted character strings
3,123,456,789,numbers
4, abc,def , ghi ,strings with whitespace
5, "abc","def" , "ghi" ,quoted strings with whitespace
6, 123,456 , 789 ,numbers with whitespace
7,TAB123,456TAB,TAB789TAB,numbers with tabs for whitespace
8, -123, +456, 1E3,more numbers with whitespace
9,123 456,123"456, 123 456 ,strange numbers
10,abc",de"f,g"hi,embedded quotes
11,"abc""","de""f","g""hi",quoted embedded quotes
12,"","" "",""x"",doubled quotes
13,"abc"def,abc"def","abc" "def",strange quotes
14,,"", ,empty fields
15,abc,"def
ghi",jkl,embedded newline
16,abc,"def",789,multiple types of fields
Each of those records has five fields, shown below with the vertical bar character as a field separator:
1|abc|def ghi|jkl|unquoted character strings
2|abc|def ghi|jkl|quoted character strings
3|123|456|789|numbers
4| abc|def | ghi |strings with whitespace
5| "abc"|def | "ghi" |quoted strings with whitespace
6| 123|456 | 789 |numbers with whitespace
7|TAB123|456TAB|TAB789TAB|numbers with tabs for whitespace
8| -123| +456| 1E3|more numbers with whitespace
9|123 456|123"456| 123 456 |strange numbers
10|abc"|de"f|g"hi|embedded quotes
11|abc"|de"f|g"hi|quoted embedded quotes
12|| ""|x""|doubled quotes
13|abcdef|abc"def"|abc "def"|strange quotes
14||| |empty fields
15|abc|def
ghi|jkl|embedded newline
16|abc|def|789|multiple types of fields
Write a function that retrieves records from a csv file. Be sure to provide a way to handle user-specified field-separators.
Friday, the Thirteenth
March 13, 2009
Today is Friday, March 13, 2009. How many Fridays will occur on the thirteenth of the month in the next ten years, not counting today?
Word Frequencies
March 10, 2009
Write a program that takes a filename and a parameter n and prints the n most common words in the file, and the count of their occurrences, in descending order.
Roman Numerals
March 6, 2009
Roman numerals are a number-notation system developed in classical Rome, chiefly used today to indicate the year in which a motion picture was made, or the sequence number of a Super Bowl.
Roman numerals use letters of the alphabet to indicate numerical value, according to the following code:
I 1 V 5 X 10 L 50 C 100 D 500 M 1000
For example, the number 1732 is represented by Roman numerals as MDCCXXXII, and the number 1956 is represented by Roman numerals as MDCCCCLVI. Letter symbols are normally written from the largest symbol to the smallest, left to right, so the numeric values are additive. However, in order to conserve space, it is permissible to replace four of the same symbol written all in a row in a subtractive manner to the left of a higher-value symbol, so that 1956 may also be represented as MCMLVI, where the CM symbol, with C before M, indicates that C is subtracted from M, and thus indicates the numeric value 900. Wikipedia and MathWorld explain the common usage of Roman numerals.
Your task is to write a function that takes two roman numerals (character strings) as input and returns their sum as a roman numeral as output. Be sure that input can be given in either the additive or subtractive forms of Roman numerals; give output using the subtractive form. What is add-roman("CCCLXIX", "CDXLVIII")?
Creation
March 3, 2009
A coded message appears on the next page. The cipher-text was created by XORing the characters of a password, repeated as necessary, with the characters of the plain-text. The numbers are the ascii ordinals of the cipher-text characters; the cipher-text is given in that form in order to bypass the limitations of some browsers. The password consists of less than twenty alphanumeric characters (upper-case letters, lower-case letters, and digits). The plain-text consists of ascii text (printable characters, plus spaces and newlines) in English.
Determine the password and read the coded message.
Mark V. Shaney
February 27, 2009
In the mid-1980’s, Mark V. Shaney posted messages such as this to the Usenet group net.singles:
I seem to be important. For me, it would have agreed with the
technical insight that is dear to me. Because of this, I have no
advice for someone in that situation!Joining Mensa was something I did him one better. I wore a dress skirt
a day for one week. I did him one better. I wore a dress skirt a day
for a 2 year relationship. I’m wondering if anyone else out there has
ever experienced this phenomena, whether it was actually your
contention that this is true for me.I suppose it depends how you felt about someone before you became
emotionally attached and therefore “safer” – not to sporting events,
but to opera.I lost 90 lbs a few months during my “flower child” days in high school
where, due to her high academic standings, was shunned by many of the
tube. The experience really screwed them up — if not their heads,
their knees. Why does one have to be the prime measurement of
manhood. No?He was a scrawny, spastic nerd in high school, and I fantasized about
such a thing. It all depends on the sidelines, listening to what makes
the rest of the guys around her – suddenly finds herself in a situation
where guys are asking them out!? But this can result in members of
either the person of your dreams (in a larger number of males to
females studying the field of engineering), the ratio of males to
females is somewhere in the past. And, per the other person.I find it hard to reconcile the notion that something or someone isn’t
theirs anymore. I have a date with the woman. Subjectively, I have
also acted in this weekend.
However, Shaney wasn’t a person. Shaney was a bot created by three Bell Labs researchers — Bruce Ellis, Rob Pike and Don Mitchell — that analyzed Usenet postings and then created its own posting. Shaney’s writings were quirky, non-sensical, and beloved by many.
Shaney worked by reading a training text and saving each triple of words that appear in the training text in a large table. Then it generates text using a markov chain, starting with two words that appear in the training text and repeatedly writing the third word of the triple, sliding the output window from word to word. The genius of the method is that any two words may appear in the training text with multiple following words, and the generator is free to choose any of them; thus, short fragments of text make sense, but the text as a whole frequently veers from one train of thought to another. A word includes its surrounding punctuation, so that sentence structure and, indirectly, grammar, are built in to the output.
Write a program that implements Shaney.
Mardi Gras
February 24, 2009
Easter occurs each year on the first Sunday after the first full moon after the vernal equinox. Mardi Gras occurs on the Tuesday of the seventh week before Easter. Mardi Gras falls on February 24th in 2009, with Easter forty-seven days later on April 12th. On what date did Mardi Gras fall in 1989? On what date will Mardi Gras fall in 2049?
A Self-Reproducing Program
February 20, 2009
Write a program that takes no input and generates a copy of its own source text as its complete output.
The Digits of Pi
February 20, 2009
The ratio of the circumference of a circle to its diameter is given by the constant known by the Greek letter pi, and is an irrational number (its representation is non-terminating and non-repeating) with a value slightly larger than 3.14159. What is the one-thousandth digit of pi? (Counting begins at zero, so the zeroth digit of pi is 3 and the fourth digit of pi is 5.)