May 29, 2009
Transposition ciphers work by rearranging the letters of a plaintext to form a ciphertext; both the length of the text and the frequency distribution of the letters are identical between plaintext and ciphertext. The rail-fence cipher is a well-known transposition cipher; another is the columnar transposition that is the subject of today’s exercise.
To illustrate columnar transposition, we will encipher the plaintext PROGRAMMINGPRAXIS with the keyword COACH. First, COACH is converted to the numeric key 25134 by noting the alphabetic ordering of its letters; the two Cs have the same rank, so they are ordered left to right. Then, the plaintext is written in rows under the key, each row the same length as the key:
C O A C H
2 5 1 3 4
P R O G R
A M M I N
G P R A X
I S
The ciphertext is read off by columns, taking the columns in numeric order; first OMR, then PAGI, and so on, ending with RMPS:
O M R P A G I G I A R N X R M P S
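The encryption step can be sketched in a few lines of Python. This is only an illustrative sketch, not the suggested solution; the names `key_order` and `encrypt` are made up for the example, and the plaintext is assumed to contain no spaces.

```python
def key_order(keyword):
    """Column indices in reading order: by letter, ties broken left to right."""
    return sorted(range(len(keyword)), key=lambda i: (keyword[i], i))

def encrypt(plaintext, keyword):
    """Single columnar transposition of plaintext under keyword."""
    width = len(keyword)
    # Column i of the grid holds every width-th letter, starting at index i.
    return "".join(plaintext[i::width] for i in key_order(keyword))

print(encrypt("PROGRAMMINGPRAXIS", "COACH"))  # OMRPAGIGIARNXRMPS
```

For COACH, `key_order` returns `[2, 0, 3, 4, 1]`, which is the numeric key 25134 seen from the other side: the column numbered 1 is at index 2, the column numbered 2 is at index 0, and so on.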
Decryption is the opposite operation. The shape of the grid is determined by the number of letters in the ciphertext and the number of letters in the key:
C O A C H
2 5 1 3 4
_ _ _ _ _
_ _ _ _ _
_ _ _ _ _
_ _
Then the letters of the ciphertext are filled into the grid starting with the lowest-numbered column, then the second column, and so on, respecting the given column lengths; here is the grid with the first two columns filled:
C O A C H
2 5 1 3 4
P _ O _ _
A _ M _ _
G _ R _ _
I _
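Decryption can be sketched the same way. The function below is a hypothetical illustration (the name `decrypt` is not from the exercise): it works out each column's length from the ciphertext length and the key length, slices the ciphertext into columns in key order, then reads the grid row by row.

```python
def decrypt(ciphertext, keyword):
    """Undo a single columnar transposition."""
    width = len(keyword)
    full_rows, extra = divmod(len(ciphertext), width)
    order = sorted(range(width), key=lambda i: (keyword[i], i))
    columns = [""] * width
    pos = 0
    for i in order:
        # The leftmost `extra` columns are one letter longer than the rest.
        length = full_rows + (1 if i < extra else 0)
        columns[i] = ciphertext[pos:pos + length]
        pos += length
    rows = full_rows + (1 if extra else 0)
    # Read the grid row by row, skipping the cells missing from the last row.
    return "".join(col[r] for r in range(rows) for col in columns if r < len(col))

print(decrypt("OMRPAGIGIARNXRMPS", "COACH"))  # PROGRAMMINGPRAXIS
```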
Columnar transposition is fairly easy to break; guess a key length, write out the columns, then look for common letter-sequences like QU or THE or ING. Columnar transposition can be made much more secure by double-encrypting the input with two keys of differing length: encrypt the plaintext with the first key, then encrypt the intermediate ciphertext with the second key.
Double transposition ciphers were routinely used for military field-grade encryption through the Second World War, because they are reasonably secure but manageable by hand in less-than-ideal circumstances. It is said that the French could normally read German ciphers due to poor radio discipline by German cipher clerks; search for Übchi to learn more of this fascinating story. In recent times, double transposition ciphers have been cryptanalyzed and attacks are known.
Your task is to write functions to perform double-transposition encryption and decryption. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
May 26, 2009
Word search puzzles are a popular time-wasting pastime. To redeem some of that lost time, we will write a program to search puzzles. For instance, given a problem like
F Y Y H N R D
R L J C I N U
A A W A A H R
N T K L P N E
C I L F S A P
E O G O T P N
H P O L A N D
and the list of words ITALY, HOLLAND, POLAND, SPAIN, FRANCE, JAPAN, TOGO, and PERU, you should find words at the following locations:
ITALY row 5 column 2 up
HOLLAND row 7 column 1 up right
POLAND row 7 column 2 right
SPAIN row 5 column 5 up
FRANCE row 1 column 1 down
JAPAN row 2 column 3 down right
TOGO row 6 column 5 left
PERU row 5 column 7 up
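One straightforward way to produce that list is to try every cell as a starting point and every one of the eight directions. The sketch below assumes the puzzle is stored as a list of strings; `GRID`, `DIRECTIONS`, and `find_word` are names invented for the example.

```python
GRID = [
    "FYYHNRD",
    "RLJCINU",
    "AAWAAHR",
    "NTKLPNE",
    "CILFSAP",
    "EOGOTPN",
    "HPOLAND",
]

# (row step, column step) for each of the eight directions.
DIRECTIONS = {
    "up": (-1, 0), "up right": (-1, 1), "right": (0, 1), "down right": (1, 1),
    "down": (1, 0), "down left": (1, -1), "left": (0, -1), "up left": (-1, -1),
}

def find_word(grid, word):
    """Return (row, column, direction) with 1-based coordinates, or None."""
    rows, cols = len(grid), len(grid[0])
    last = len(word) - 1
    for r in range(rows):
        for c in range(cols):
            for name, (dr, dc) in DIRECTIONS.items():
                # Skip directions in which the word would run off the grid.
                if not (0 <= r + dr * last < rows and 0 <= c + dc * last < cols):
                    continue
                if all(grid[r + dr * k][c + dc * k] == ch
                       for k, ch in enumerate(word)):
                    return r + 1, c + 1, name
    return None

for w in ["ITALY", "HOLLAND", "POLAND", "SPAIN", "FRANCE", "JAPAN", "TOGO", "PERU"]:
    print(w, find_word(GRID, w))
```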
May 22, 2009
This exercise comes originally from the Sphere Online Judge; I read it on Proggit based on a blog posting by Chethan T. So many of the answers were wrong that I decided it would make a good exercise for Programming Praxis. Here is SPOJ’s statement of the exercise:
A positive integer is called a palindrome if its representation in the decimal system is the same when read from left to right and from right to left. For a given positive integer K of not more than 1000000 digits, write the value of the smallest palindrome larger than K to output. Numbers are always displayed without leading zeros.
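To make the statement concrete, here is one possible approach, sketched under assumptions of my own rather than taken from SPOJ or the suggested solution: mirror the left half of the number onto the right half, and if the result is not larger than K, add one to the left half and mirror again. The name `next_palindrome` is hypothetical. The sketch converts between integers and strings, and recent Python versions cap such conversions by default, so inputs approaching a million digits would need a version that works on digit strings throughout.

```python
def next_palindrome(k):
    """Smallest palindrome strictly greater than the positive integer k."""
    digits = str(k)
    n = len(digits)
    if all(d == "9" for d in digits):
        # 9, 99, 999, ... roll over to 11, 101, 1001, ...
        return int("1" + "0" * (n - 1) + "1")

    def mirror(half):
        # Reflect the left half; for odd n the middle digit is not repeated.
        return int(half + half[: n // 2][::-1])

    half = digits[: (n + 1) // 2]
    candidate = mirror(half)
    if candidate > k:
        return candidate
    # Otherwise bump the left half (middle digit included) and mirror again.
    return mirror(str(int(half) + 1))

print(next_palindrome(808))   # 818
print(next_palindrome(1999))  # 2002
```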
Your task is to write a function that calculates the next palindrome larger than its input. When you are finished, you are welcome to read or run a suggested solution, or post your own solution or discuss the exercise in the comments below.
May 19, 2009
We have previously studied integer factorization by trial division and wheel factorization. In this exercise we will implement integer factorization by Fermat’s method. Pierre de Fermat was a French lawyer and amateur mathematician of the seventeenth century, a contemporary of René Descartes and Blaise Pascal, who did early work in calculus, number theory, analytic geometry and probability.
Fermat’s method works by noticing that if n, the odd number to be factored, can be written as the difference of two squares $n = x^2 - y^2$, then it can be factored as $(x - y) \times (x + y)$. Thus, to find the factors of n, we start with y = 0 and x as the smallest integer greater than or equal to the square root of n, and increase y until $x^2 - y^2$ either equals n, in which case x and y reveal the factors of n, or falls below n, in which case we increase x by one and iterate. This process must terminate; if n is prime, it will stop with the factors 1 and n.
Though this method works, the repeated squarings are relatively expensive, so it is normal to work with $u = 2x + 1$ and $v = 2y + 1$ and to keep track of $r = x^2 - y^2 - n$; when r = 0, the algorithm terminates. The variable u keeps track of the amount r increases when $x^2$ is replaced by $(x + 1)^2$, and the variable v keeps track of the amount r decreases when $y^2$ is replaced by $(y + 1)^2$; as x and y increase by 1, u and v increase by 2.
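The u, v, r bookkeeping can be sketched as follows. This is a minimal illustration, not the suggested solution; the function name `fermat` is made up, and it returns a single pair of factors rather than a full prime factorization.

```python
from math import isqrt

def fermat(n):
    """Split an odd n > 0 into factors (a, b) with a <= b and a * b == n."""
    x = isqrt(n)
    if x * x < n:
        x += 1                 # smallest x with x*x >= n
    u, v = 2 * x + 1, 1        # u = 2x + 1, v = 2y + 1 with y = 0
    r = x * x - n              # r = x^2 - y^2 - n
    while r != 0:
        if r > 0:
            r -= v             # y becomes y + 1
            v += 2
        else:
            r += u             # x becomes x + 1
            u += 2
    # x = (u - 1) / 2 and y = (v - 1) / 2, so x - y and x + y are:
    return (u - v) // 2, (u + v - 2) // 2

print(fermat(5959))  # (59, 101)
print(fermat(7))     # (1, 7)
```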
Your task is to implement integer factorization by Fermat’s method. When you are finished, you are welcome to read or run the suggested solution, or post your own solution or discuss the exercise in the comments below.
May 15, 2009
A cellular automaton is a collection of cells arranged in an infinite grid, controlled by a clock, with each cell coloring itself at each tick of the clock based on the state of its neighboring cells. A linear cellular automaton has all the cells of the grid arranged in a single line. A time-lapse picture of a linear cellular automaton shows a two-dimensional grid with the automaton as a horizontal line, the state of the automaton at each clock tick flowing down the page.
For the elementary cellular automata that we will study, each cell may be either of two colors, black or white, and a cell’s neighborhood consists of itself and the two cells on either side. With two colors and three cells in a neighborhood, there are $2^3 = 8$ states and $2^8 = 256$ possible rules for advancing from one state to the next.
Rules are specified using a kind of binary notation. The eight states are listed in the order 111, 110, 101, 100, 011, 010, 001, and 000, where 1 is black and 0 is white and the neighbors are arranged left-to-right on the line. The rule is then the binary number formed by the successor states of those eight neighborhoods, so for instance rule 30 = $00011110_2$ maps 111 to 0, 110 to 0, 101 to 0, 100 to 1, 011 to 1, 010 to 1, 001 to 1, and 000 to 0.
For example, the first 12 rows of the rule 158 automaton applied to a row containing a single black cell are shown below:
                      X X X
                    X X X   X
                  X X X     X X
                X X X   X X X   X
              X X X     X X     X X
            X X X   X X X   X X X   X
          X X X     X X     X X     X X
        X X X   X X X   X X X   X X X   X
      X X X     X X     X X     X X     X X
    X X X   X X X   X X X   X X X   X X X   X
  X X X     X X     X X     X X     X X     X X
X X X   X X X   X X X   X X X   X X X   X X X   X
You can see more examples at http://mathworld.wolfram.com/ElementaryCellularAutomaton.html.
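The rule numbering translates directly into code: read a neighborhood as a three-bit number and pick that bit out of the rule. The sketch below is one way to do it; `step` and `draw` are names invented here, and the picture it prints includes the initial single-cell row as its first line.

```python
def step(cells, rule):
    """Advance one tick; the row grows by one cell on each side."""
    # Assumption: everything outside `cells` is white (0), which only stays
    # true for even rule numbers (those that map 000 to 0).
    padded = [0, 0] + cells + [0, 0]
    return [
        (rule >> (padded[i] << 2 | padded[i + 1] << 1 | padded[i + 2])) & 1
        for i in range(len(cells) + 2)
    ]

def draw(rule, rows):
    """Return `rows` text lines, starting from a single black cell."""
    cells, lines = [1], []
    for k in range(rows):
        picture = "".join("X" if c else " " for c in cells)
        lines.append(" " * (rows - 1 - k) + picture)
        cells = step(cells, rule)
    return lines

print("\n".join(draw(158, 13)))
```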
Your task is to write a function that draws the result of applying a given rule to a given number of rows, starting from a row with a single black cell. When you are finished, you are welcome to read or run a suggested solution, or post your own solution or discuss the exercise in the comments below.
May 12, 2009
My daughter, a freshman in high school, is just completing her first programming class, using Java. Her final assignment was to write a program to print a loan amortization table, given an initial balance, annual interest rate, term in months, and monthly payment.
Your task is to write that program (you are not restricted to Java), then print the amortization table for a three-year car loan of $10,000 at 7% (it’s an old textbook). When you are finished, you are welcome to read a suggested solution, or to post your solution or discuss the exercise in the comments below.
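A sketch of the calculation follows. The exercise does not state the monthly payment, so the example derives it from the standard level-payment annuity formula, which is an assumption on my part; the function names are hypothetical, and amounts are kept as unrounded floats, whereas a real lender would round each line to the cent.

```python
def monthly_payment(balance, annual_rate, months):
    """Level payment that retires the loan in `months` payments (rate > 0)."""
    r = annual_rate / 12
    return balance * r / (1 - (1 + r) ** -months)

def amortize(balance, annual_rate, months, payment):
    """Yield (month, interest, principal, remaining balance) per payment."""
    r = annual_rate / 12
    for month in range(1, months + 1):
        interest = balance * r          # interest accrued this month
        principal = payment - interest  # the rest reduces the balance
        balance -= principal
        yield month, interest, principal, balance

payment = monthly_payment(10_000, 0.07, 36)
print(f"payment {payment:.2f}")
for month, interest, principal, balance in amortize(10_000, 0.07, 36, payment):
    print(f"{month:2d} {interest:8.2f} {principal:8.2f} {balance:10.2f}")
```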
May 8, 2009
Having discussed prime numbers in several previous exercises, we are now interested in the problem of factoring an integer n; for instance, the prime factors of 42 are 2, 3, and 7. A simple factoring method is to perform trial division by all the integers counting from 2 to the square root of n. Your first task is to write that function.
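A minimal sketch of naive trial division, with the hypothetical name `factors`: divide out each divisor as often as it goes, and stop once the divisor exceeds the square root of what remains.

```python
def factors(n):
    """Prime factors of n >= 2 in ascending order, by trial division."""
    result = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            result.append(d)
            n //= d
        d += 1
    if n > 1:
        result.append(n)  # whatever remains is itself prime
    return result

print(factors(42))  # [2, 3, 7]
```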
An easy optimization is to divide only by 2 and then by odd integers greater than 2, which saves half the work. A better optimization is to divide by 2, then 3, then 5, and thereafter to alternately add 2 and 4 to the trial divisors — 7, 11, 13, 17, 19, 23, and so on — since all prime numbers greater than 3 are of the form 6k±1 for some integer k.
It turns out that both those optimizations are special cases of a technique called wheel factorization. Consider a 2-wheel of circumference 2 rolling along a number line with a “spoke” at the number 1; if you start with the spoke at 3 on the number line, it will strike the number line at 5, then 7, and then every odd number after that. Or consider a 2,3-wheel of circumference 2×3=6 with spokes at the numbers 1 and 5; if you start with the 5-spoke at 5 on the number line, it will strike the number line at 7, 11, 13, 17, 19, 23, and so on. Or consider a 2,3,5-wheel of circumference 2×3×5=30 with spokes at 1, 7, 11, 13, 17, 19, 23 and 29, starting with the 7-spoke at 7. And so on: next is a 2,3,5,7-wheel, then a 2,3,5,7,11-wheel, and the sequence continues infinitely.
Wheel factorization works by performing trial division at each place where a spoke touches the number line. As the wheels grow larger, more and more of the trial divisors are prime, so less and less unnecessary work is done. Of course, there is a point of diminishing returns; when the wheel gets too large, it is just as much work to compute the wheel as to compute the list of primes, and costs just as much to store. But a small wheel is easy to compute, and not too big, and provides a simple optimization over naive trial division.
The spokes of the wheel are computed by looking for co-primes, which are those numbers for which the spoke has no factors in common with the circumference of the wheel; in other words, where the greatest common divisor of the spoke and the circumference is 1. For instance, a 2,3,5-wheel has a spoke at 17 because the greatest common divisor of 17 and 30 is 1, but no spoke at 18 because the greatest common divisor of 18 and 30 is 6. These numbers are called totatives; if you’re curious about the math behind them, ask your favorite search engine for information about Euler’s totient function.
It is easy to see this visually. Here is a list of the positive integers to 42, with primes marked by an asterisk:
 1   2*  3*  4   5*  6
 7*  8   9  10  11* 12
13* 14  15  16  17* 18
19* 20  21  22  23* 24
25  26  27  28  29* 30
31* 32  33  34  35  36
37* 38  39  40  41* 42
After the first row, all the primes are in two columns, which correspond to the two spokes of a 2,3-wheel. If 853 were input to the 2,3-wheel factorization function, we would trial divide by 2, 3, 5, 7, 11, 13, 17, 19, 23, 25, and 29 before concluding that 853 was prime; note that 25 is not prime, but is relatively prime to the circumference of the wheel.
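The wheel idea can be sketched as below. This is an illustration under my own naming (`wheel_factors`, `basis`), not the suggested solution: it first divides by the primes that define the wheel, computes the spokes with a greatest-common-divisor test, then rolls the wheel along the number line.

```python
from math import gcd, prod

def wheel_factors(n, basis=(2, 3, 5, 7)):
    """Prime factors of n >= 1, trial dividing only where a spoke lands."""
    result = []
    for p in basis:
        while n % p == 0:
            result.append(p)
            n //= p
    circumference = prod(basis)
    # Spokes are the numbers coprime to the circumference (its totatives).
    spokes = [s for s in range(1, circumference + 1)
              if gcd(s, circumference) == 1]
    base = 0
    while True:
        for s in spokes:
            d = base + s
            if d == 1:
                continue          # 1 is a spoke but not a useful divisor
            if d * d > n:
                if n > 1:
                    result.append(n)
                return result
            while n % d == 0:
                result.append(d)
                n //= d
        base += circumference

print(wheel_factors(600851475143))  # [71, 839, 1471, 6857]
```

Passing `basis=(2, 3)` gives the 2,3-wheel, whose trial divisors after 2 and 3 are 5, 7, 11, 13, 17, 19, 23, 25, 29, and so on.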
Your second task is to write a function that finds the factors of a given number using wheel factorization; you should compute and use a 2,3,5,7-wheel. What are the factors of 600851475143? When you are finished, you are welcome to read or run a suggested solution, or post your own solution or discuss the exercise in the comments below.
May 5, 2009
A priority queue is a data structure in which items arrive randomly and depart according to an ordering predicate. It is similar to a normal queue, in which items depart in the order they arrive (first-in, first-out), and a stack, in which items depart in the opposite of the order in which they arrive (last-in, first-out). The operations on priority queues include insert to add a new item to the queue, find-first to return the first item in the queue, delete-first to return the remaining items after the first, and merge to merge two queues. Priority queues are used in simulations, where keys correspond to “event times” which must be processed in order, in job scheduling for computer systems, where more-important jobs must be performed before less-important jobs, and in many other applications.
There are many ways to implement priority queues. An unordered list makes it easy to insert new items, but each time an item is extracted the entire list must be scanned. An ordered list makes extraction quick but requires a scan of half the list, on average, each time an item is inserted. Binary trees give a total ordering of all the items in a priority queue, but we only need to be able to identify the first item, so they do more work than we need. We will implement priority queues using leftist heaps.
A heap is a binary tree in which each node precedes its two children in a total ordering; the ordering predicate may be less-than or greater-than, as appropriate for the particular heap. A leftist heap satisfies the additional criterion that the rank of each left node is greater than or equal to the rank of its right sibling, where the rank of a node is the length of its right spine. As a result, the right spine of any node is always the shortest path to an empty node. The name leftist heap derives from the fact that the left subtree is usually taller than the right subtree, so a drawing of a leftist heap tends to “lean” left.
The fundamental operation on leftist heaps is the merge of two leftist heaps. This is accomplished by merging their right spines in the same manner as merging two sorted lists; this preserves the heap-order property. Then the children of the nodes along that new path are swapped as necessary to preserve the leftist property.
Given merge, the remaining operations are trivial. Insert builds a singleton priority queue, then merges it to the existing priority queue. Find-first simply returns the item at the root of the tree. Delete-first merges the two children of the root.
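The four operations can be sketched as follows. This is a persistent (non-mutating) illustration with a fixed less-than ordering and names of my own choosing; an empty heap is represented by `None`.

```python
class Node:
    def __init__(self, item, left=None, right=None):
        self.item, self.left, self.right = item, left, right
        self.rank = 1 + rank(right)   # length of the right spine

def rank(node):
    return node.rank if node else 0

def merge(a, b):
    """Merge two leftist heaps along their right spines."""
    if a is None:
        return b
    if b is None:
        return a
    if b.item < a.item:
        a, b = b, a                   # keep the smaller root on top
    left, right = a.left, merge(a.right, b)
    if rank(left) < rank(right):
        left, right = right, left     # restore the leftist property
    return Node(a.item, left, right)

def insert(item, heap):
    return merge(Node(item), heap)

def find_first(heap):
    return heap.item

def delete_first(heap):
    return merge(heap.left, heap.right)

heap = None
for x in [5, 3, 8, 1, 9, 2]:
    heap = insert(x, heap)
while heap:
    print(find_first(heap), end=" ")
    heap = delete_first(heap)
```

Because the right spine has logarithmic length, the recursion in `merge` stays shallow even for large heaps.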
Leftist heaps were invented by Clark Crane in 1972 and popularized by Donald Knuth in 1973.
Your task is to implement the priority queue data structure using leftist heaps. When you are finished, you are welcome to read or run a suggested solution, or to post your solution or discuss the exercise in the comments below.
May 1, 2009
In a previous exercise you wrote a function that returned a list of prime numbers, and in another exercise you used that function to find a particular prime number. This exercise looks at prime numbers from a different perspective by considering a function that takes a number and determines if it is prime.
The algorithm that we will consider was developed initially by Gary Miller and refined by Michael Rabin, and is probabilistic in nature. It works like this: Express the odd number n to be tested as $n = 2^r s + 1$ with s odd. Then choose a random integer a with $1 \le a \le n - 1$ and check if $a^s \equiv 1 \pmod{n}$ or $a^{2^j s} \equiv -1 \pmod{n}$ for some $0 \le j \le r - 1$. (Some browsers render that last equation poorly; it’s a raised to the power 2 to the j times s.) A prime number will pass the check for all a. A composite number will pass the check for at most 1/4 of the possible values of a and fail the check for the remaining 3/4 or more. Thus, to determine if a number n is prime, check multiple values of a; if k values are checked, this algorithm will err less than one time in $4^k$. Most primality checkers set k to somewhere between 25 and 50, making the chance of error very small.
Your task is to write a function that determines if an input number n is prime, then to determine if $2^{89} - 1$ is prime. When you are finished, you are welcome to read or run a suggested solution, or post your solution or discuss the exercise in the comments below.
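The check described above can be sketched like this. It is an illustrative version, not the suggested solution: the name `is_prime` and the default of 25 rounds are choices made for the example, a few small primes are divided out first for convenience, and witnesses are drawn from 2 to n-2 because 1 and n-1 pass the check for every odd n.

```python
import random

def is_prime(n, k=25):
    """Miller-Rabin test: False means composite, True means probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2^r * s with s odd.
    r, s = 0, n - 1
    while s % 2 == 0:
        r += 1
        s //= 2
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, s, n)
        if x == 1 or x == n - 1:
            continue                  # passes for j = 0
        for _ in range(r - 1):
            x = x * x % n             # a^(2^j * s) for j = 1 .. r-1
            if x == n - 1:
                break
        else:
            return False              # a is a witness: n is composite
    return True

print(is_prime(2**89 - 1))  # True
```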