Form Letters

November 30, 2010

Welcome back, Jane!
We hope that you and all the members
of the Public family are constantly
reminding your neighbors there
on Maple Street to shop with us.
As usual, we will ship your order to
    Ms. Jane Q. Public
    600 Maple Street
    Your Town, Iowa 12345

Everybody hates form letters. But they are part of the computing universe, and today’s exercise asks you to print them. Input to the form letter generator comes in two parts. First, there is a schema that defines the letter to be written. Here is the schema for the letter shown above:

Welcome back, $1!
We hope that you and all the members
of the $0 family are constantly
reminding your neighbors there
on $5 to shop with us.
As usual, we will ship your order to
    $3 $1 $2. $0
    $4 $5
    $6, $7 $8

Variable text is identified as $n, where n is the field number from a database; although it’s not shown above, n can be larger than 9, extending right-ward until a non-digit is encountered. Also not shown above is the construct $$, which prints a literal dollar sign.

The data comes from a comma-separated values file, of the type we have previously encountered. In this case, records have nine fields: last name, first name, middle initial, title, street number, street name, city, state, and zip code. Here is a sample two-record data file:

Public,Jane,Q,Ms.,600,Maple Street,Your Town, Iowa,12345
Smith,John,Z,Dr.,1234,Main Street,Anytown,Missouri,63011

Your task is to write a program that takes a schema and a data file and writes a series of form letters. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Divisors And Totatives

November 26, 2010

We have examined functions to compute the factors of a number in several previous exercises. In today’s exercise we examine functions to compute divisors and totients, which are concepts of number theory closely related to factors.

The divisors of a number are those numbers that divide it evenly; for example, the divisors of 60 are 1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, and 60. The sum of the divisors of 60 is 168, and the number of divisors of 60 is 12.

The totatives of a number are those numbers less than the given number and coprime to it; two numbers are coprime if they have no common factors other than 1. The number of totatives of a given number is called its totient. For example, the totatives of 30 are 1, 7, 11, 13, 17, 19, 23, and 29, and the totient of 30 is 8. The totient function was discovered by Leonhard Euler, and it pops up in all kinds of strange places in number theory; for instance, the totatives of 30 are the spokes on a 2,3,5 factorization wheel.

Your task is to write a small library of five functions that compute the divisors of a number, the sum and number of its divisors, the totatives of a number, and its totient. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

String Subsets

November 23, 2010

This exercise is part of our on-going series of interview questions:

Given two strings, determine if all the characters in the second string appear in the first string; thus, DA is a subset of ABCD. Counts matter, so DAD is not a subset of ABCD, since there are two D in the second string but only one D in the first string. You may assume that the second string is no longer than the first string.

Your task is to write a function to determine if one string is a subset of another string. You should work as you would in a programming interview; if you find one solution, search for a better solution. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Topological Sort

November 19, 2010

A graph G is a collection of its vertices V and the edges E between them: G(V, E); the interstate highway system is an example of a graph, with cities as vertices and highways as edges. A directed graph is a graph in which each edge is identified with a from vertex and a to vertex; the roads in some city centers can be considered a directed graph, because one-way roads only allow traffic in a single direction (Venice has one-way canals, which blew my mind the first time I saw a sensico unico sign). A directed acyclic graph, sometimes called a DAG, is a directed graph in which there are no cycles, that is, by following the successors of a vertex you can never return to that vertex; the tasks involved in building a house form a DAG, in which the framing must be done before drywall can be installed, and the modules of a program library form a DAG, in which some modules must be compiled before others that depend on them.

A topological sort is an ordering of the vertices of a DAG in which each vertex appears before any of the vertices that depend on it. Topological sorts are typically messy, with multiple right answers; a fireman can spray water on a burning building even while his colleagues are searching for anyone still inside. There are many possible topological sorts of the sample graph; one of them is 7 5 11 2 3 8 9 10.

There are many ways to perform a topological sort. Perhaps the simplest is to pick a vertex that has no incoming edges; put it at the head of the sorted output, delete it and all the edges that come from it, and recur until no vertices remain. If there is more than one vertex that has no incoming edges, any of them may be chosen.

Your task is to write functions to identify cyclic graphs and perform topological sort of a graph. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

RSA Cryptography

November 16, 2010

In 1973, Clifford Cocks invented a method of public-key cryptography in which the recipient of a message holds a private key, known only to himself, and publishes a public key, which may be used by anyone to send him a message that only he will be able to read. Cocks worked for the British security service GCHQ, so he was unable to publish his work. In 1978, three MIT professors, Ronald L. Rivest, Leonard M. Adelman and Adi Shamir invented the same algorithm, independently, which they patented in 1983. The algorithm is now known as RSA, after their initials.

The basic idea of RSA starts with two large prime numbers of equal bit-length, p and q; their product n becomes the modulus of the cryptosystem. The totient of n is computed as φ(pq) = (p−1) × (q−1). Then two keys are chosen, the encryption key e and the decryption key d, such that de ≡ 1 (mod φ(pq)) and gcd(e, φ(pq)) = 1. Then, given a message m, an integer on the range 0 < m <n, the message is encrypted by computing me (mod n) and the resulting cipher-text c is decrypted by computing cd (mod n).

In practice, the keys are generated by selecting a bit-length k and and arbitrary encryption key e. A longer bit-length provides more security than a shorter bit-length; a 768-bit key has recently been factored, albeit with extreme effort (about 2000 PC-years), most commercial users of RSA are probably using 1024- or 2048-bit keys these days, and high-security users (banks, military, government, criminals) are probably using 4096- or 8192-bit keys. E must be odd, is frequently chosen to be prime to force the condition that it is co-prime to the totient, and is generally fairly small; e = 216+1 = 65537 is common. Then the key generator chooses p and q at random, computes d as the modular inverse of e, and reports n and d. In that way, nobody, not even the person generating the keys, knows p and q.

Here is a simple example from Wikipedia: Choose p = 61 and q = 53. Then n = p×q = 3233 and the totient φ(pq) = 60×52 = 3120. Choose e=17 which is co-prime to 3120 since 17 is prime and 17∤3120; the corresponding d is the inverse of 17 with respect to the modulus 3120, so d = 2753. Then the message m = 65 is encrypted as c = me (mod n) = 6517 (mod 3233) = 2790, and the cipher-text 2790 is decrypted as m = cd (mod n) = 27902753 (mod 3233) = 65.

The standard definition of RSA cryptography is known as PKCS #1. It provides a method for converting a text message to a number m suitable for encryption, and converting it back to the original text message, but we won’t examine that algorithm today. It is also possible to use RSA to provide non-forgeable signatures; the basic idea is that the sender encrypts a message hash with his decryption key, so the receiver can decrypt the message hash with the sender’s public key, which works because only the sender knows his private decryption key.

Your task is to write an RSA key generator and procedures to encrypt and decrypt messages using the RSA algorithm as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Regular readers of Programming Praxis know of my affinity for prime numbers. Today’s exercise is based on a prime-generating sequence described by Eric Rowland. Consider the sequence a1 = 7, an = an-1 + gcd(n, an-1): 7, 8, 9, 10, 15, 18, 19, 20, 21, 22, 33, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 69, 72, 73. Taking the differences between successive elements of the sequence gives a second sequence 1, 1, 1, 5, 3, 1, 1, 1, 1, 11, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 23, 3, 1. All the elements of that sequence are either 1 or prime. Eliminating the 1s gives a third sequence of primes that begins 5, 3, 11, 3, 23, 3, 47, 3, 5, 3, 101, 3, 7, 11, 3, 13, 233, 3, 467, 3, 5, 3, 941, 3. Rowland has proved that all elements of the sequence are either 1 or prime; it is conjectured but not proven that all the odd primes appear in the sequence.

It is possible to shortcut the sequence by omitting the 1s, since the number of 1s at any point can be pre-computed. If we have a least-prime-divisor function lpd, then Vladimir Shevelev describes the sequence as a1 = lpd(6-1) = 5, a2 = lpd(6-2+5) = 3, a3 = lpd(6-3+5+3) = 11, a4 = lpd(6-4+5+3+11) = 3, a5 = lpd(6-5+5+3+11+3) = 23, …, and an = lpd(6 – n + sum(a1an-1).

Your task is to write functions that generate the three sequences, including the shortcut. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Subset Sums

November 9, 2010

We have previously solved Part 1 and Part 2 of the Greplin Programming Challenge. In today’s exercise we will solve the third and final part:

Find all the subsets of a set of non-negative integers where the largest number is the sum of the remaining numbers, and return a count of the number of them. For instance, for the set { 1, 2, 3, 4, 6 } the subsets are 1+2=3, 1+3=4, 2+4=6, and 1+2+3=6, so the result is 4 subsets. Apply your program to the set { 3, 4, 9, 14, 15, 19, 28, 37, 47, 50, 54, 56, 59, 61, 70, 73, 78, 81, 92, 95, 97, 99 }.

Your task is to write a program to solve the challenge. In addition to solving the challenge, you might like to apply your solution to the set of prime numbers less than 210. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Weather Forecast

November 5, 2010

The internet provides a rich source of data for those who are interested: stock prices, baseball statistics, the littoral area in acres of lakes in Minnesota, and much more. In the United States, the National Oceanic and Atmospheric Administration publishes daily weather forecasts at http://weather.noaa.gov/pub/data/forecasts/.

Your task is to write a program that retrieves the current weather forecast for a requested city. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Emirps

November 2, 2010

An emirp is a prime number that is also prime when its digits are reversed, and that is not also a palindrome. For instance, 13 is an emirp because its reversal, 31, is also prime; 23 is not an emirp, even though it is prime, because its reversal, 32, is not prime; and 101 is not an emirp, even though it is prime, because it is a palindrome.

Your task is to enumerate the emirps below a million; you should strive for maximum speed. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2