ABC Conjecture

September 18, 2012

The ABC Conjecture has recently been in the news on math blogs because of the claim that it has been proved by Shinichi Mochizuki. Though the proof is being taken seriously, due to Mochizuki’s reputation, it is five hundred pages long, and confirmation will take several months. The conjecture states that given two positive integers a and b and their sum c with no common factors, the product of the distinct prime factors of abc is rarely much smaller than c.

The radical of a number n is the product of the distinct prime factors of n; for instance, rad(18) = 6 because 18 = 2 × 3 × 3 and, eliminating the duplicate occurrence of 3, 2 × 3 = 6. The quality of an (a,b,c) triple is given by q(a,b,c) = log(c) / log(rad(abc)). The precise statement of the ABC conjecture is that for every ε > 0 there are only finitely many triples (a,b,c) with a and b coprime positive integers and a + b = c such that rad(abc)1+ε < c, or equivalently, such that q(a,b,c) > 1.

Your task is to write the functions rad and q and find all the triples with c < 1000 for which q(a,b,c) > 1. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Tribonacci Numbers

September 14, 2012

You will recall that fibonacci numbers are formed by a sequence starting with 0 and 1 where each succeeding number is the sum of the two preceeding numbers; that is, F[n] = F[n-1] + F[n-2] with F[0] = 0 and F[1] = 1. We studied fibonacci numbers in a previous exercise.

Tribonacci numbers are like fibonacci numbers except that the starting sequence is 0, 0 and 1 and each succeeding number is the sum of the three preceeding numbers; that is, T[n] = T[n-1] + T[n-2] + T[n-3] with T[-1] = 0, T[0] = 0 and T[1] = 1. The powering matrix for tribonacci numbers, used similarly to the powering matrix for fibonacci numbers, is:

\begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}

The first ten terms of the tribonacci sequence, ignoring the two leading zeros, are 1, 1, 2, 4, 7, 13, 24, 44, 81 and 149 (A000073).

Your task is to write two functions that calculate the first n terms of the tribonacci sequence by iteration and the nth term by matrix powering; you should also calculate the tribonacci constant, which is the limit of the ratio between successive tribonacci numbers as n tends to infinity. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

The first ten prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29. Their sum is 129.

Your task is to write a program that calculates the sum of the first billion primes. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

The First Two Programs

September 7, 2012

In their book The C Programming Language, Brian Kernigan and Dennis Ritchie say that the first program every programmer should write is a program that writes “Hello, world!” to the console. Then they give the second program that produces a fahrenheit/celsius temperature conversion table, with fahrenheit temperatures every 20 degrees from 0 to 300 and the corresponding celsius temperature to the nearest degree, each pair written on its own line with the two temperatures separated by tabs.

Your task is to write the first two programs. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Fountain Codes

September 4, 2012

Fountain codes provide an effective method of file transmission over lossy channels. Invented by Michael Luby in 1998, but not published unti 2002, they are used to send bit torrents and streaming media, transmit map and traffic data to in-car receivers, and probably for the recent “brain transplant” to the Curiosity rover on Mars (I’m speculating, but wouldn’t you expect to lose a few packets between here and Mars?). The idea is to send many chunks of a file, larger than the original file, like a fountain spraying droplets of water, and for the receiver to collect drops in a bucket, which can be reassembled into the original file without collecting all the drops sprayed by the fountain.

The encoding process is simple. Pick a random number d between 1 and n, the number of blocks in the file. Pick d blocks from the file at random (no duplicates) and xor them together. Send the xor of the combined blocks, along with the list of blocks used to make the xor. The distribution of d is important; it is not just a uniform random number. We choose d with probability d / (1/2)n(n+1), so if the file has 20 blocks, the denominator is 210 (the sum of the integers from 1 to 20), the degree d is 1 with probability 20/210, 2 with probability 19/210, and so on, until it is 20 with probability 1/210.

The decoding process isn’t much harder. Start with an array of blocks of length n, with an indication that each block is not currently decoded. Then each packet is analyzed as it is received. If the packet has a single block, it can be added immediately to the array of blocks; if not, any known blocks in the input list can be xor’ed into the packet, reducing its degree. A packet that is reduced to a single block is added to the output, others are added to a hold list. Any time a new block is discovered, the entire hold list is scanned for any further reductions in any of its packets.

Let’s illustrate with a fountain that sends blocks of a single byte from the “Programming Praxis” input string. We begin with an empty array and hold list and fetch the first packet from the fountain (22 15 9). That packet has two blocks, so we put it in the hold and go on. The next packet is (80 0), so we put a “P” in block 0 and go on. The next packet is (55 12 10), which we add to the hold. Next we get a packet (14 10 8), which doesn’t help immediately, but it shares a block with (55 12 10), which will be useful later; we add it to the hold, which now contains (22 15 9), (55 12 10) and (14 10 8). The next packet is (103 10). We can immediately add a “g” at block 10, then xor 55 with 103 to get the “P” at block 12 and xor 14 with 103 to get the “i” at block 8; our string is now “P…….i.g.P…..” and our hold is now (22 15 9). We next receive packets (19 5 1), (88 11 15), (45 1 9 14 12), (105 16) and (114 1), so we can add an “i” at block 16, an “r” at block 1, and, by virtue of the (19 5 1) packet in the hold, an “a” at block 5; now the string is “Pr…a..i.g.P…..” and the hold contains (22 15 9), (88 11 15), and (45 1 9 14 12). The next packet is (97 14), which gives us another block, the “a” of “Praxis,” and also lets us reduce the packet (45 1 9 14 12) in the hold to (76 1 9 12). The next packet is (99 16 4 10 1 7), which we can reduce to (118 16 4 7) by applying blocks 10 and 1. Next we receive (120 15), which gives us the “x” in block 15 directly, and also gives us the “n” in block 9 and the space character in block 11 by applying block 15 to the (22 15 9) and (88 11 15) items in the hold; we now have “Pr…a..ing P..x..” and packets (76 1 9 12) and (118 16 4 7) in the hold, with nine blocks left to be decoded. And so the process continues. Eventually the entire string is revealed, on average after O(n loge n) packets have been received, which is about 52 for our 18-block string.

In practice, it is common to do things a little bit differently. Certainly it is normal to use blocks larger than a single byte. Instead of sending the list of blocks, it is normal to send the seed to a simple random number generator; if both ends of the transmission use the same random number generator, both can create the list of blocks from the same seed. And it is common to send a checksum with each packet; if the packet fails the checksum, it is simply ignored, as if it had never been received.

It is also common to use a different degree-distribution than the one we used here. It turns out that a bimodal degree-distribution with lots of 1-, 2- or 3-block packets, combined with an n-block packet and a few n-minus-two-or-three block packets, with very few packets in between, makes the whole process go much quicker, an average O(n) with just a small overhead of 5% or 10% over the size of the file being transmitted. Use of this degree-distribution is called a raptor code.

Your task is to write a program that creates a fountain and decoder. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Once In A Blue Moon

August 31, 2012

Sometimes the phases of the moon fall on the calendar in such a way that one calendar month has two full moons; the second full moon of the month is called a blue moon, for reasons that have been lost in antiquity. Poets and songwriters sometimes use the phrase “once in a blue moon” to indicate an event that occurs infrequently, but in fact blue moons occur every two or three years, on average. Today’s full moon is the blue moon of August 2012.

We looked at the phases of the moon in a previous exercise. There we learned that new moons occur every 29.530588853 days, that a new moon occurred on January 6, 2000 (julian date 2451550.1), and that full moons occur halfway between two new moons.

Your task is to write a program that calculates all the blue moons of the twenty-first century. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Random Access Lists

August 28, 2012

Lists are ubiquitous in programming, especially in languages like Scheme, but elsewhere as well. Their primary advantage over arrays is that they can easily grow as needed, but that brings a corresponding disadvantage: it takes O(n) time to access the nth item in a list, but only O(1) time to access the nth item an an array.

Chris Okasaki has invented a remarkably clever data structure that provides the normal O(1) time complexity for the cons, head and tail operators of lists but reduces random access to the nth item in a list to O(log n), which means lists can sometimes be used in place of arrays, especially when it is inconvenient to determine the size of the array in advance or when the items of the array are normally accessed in sequence.

I won’t try to explain Okasaki’s data structure here; you can look at http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.55.5156 for the details, or look at Figure 9.7 in Okasaki’s book Purely Functional Data Structures, as I did.

Your task is to implement a library for random access lists, including the functions cons, head, tail, lookup and update. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

One of the classic data structures of computer science is the hash table, which provides constant-time access to key/value data in the average case. The idea is to store a key/value pair in a location that is computed based on the value of the key, which works fine when all the hash values are unique; the problem arises when two keys hash to the same value. The hash tables of the Standard Prelude use a method called chaining to resolve collisions; today’s exercise uses a different method called open addressing.

All hash tables work by computing an address at which to store an item based on its key. Sometimes two keys hash to the same value, which causes a collision. Chaining, as used in the Standard Prelude, resolves collisions by storing multiple items at the same location, forming a list of items. Open addressing instead computes a second address, or if necessary a third, or fourth, or …, continuing until it finds an empty spot. It is necessary that the computation of secondary addresses eventually visits every possible storage location; a simple approach, which we will adopt, is called linear probing, in which storage locations are accessed in increasing order until an empty location is found. It is possible, of course, for the hash table to become completely filled, in which case an error is reported when a new item cannot be inserted.

The tricky part of hashing with open addressing is deletions, because it’s not possible to simply delete an item because some other item may rely on the the fact that its storage location was filled when the item was inserted. The solution is to have three types of items in a storage location: nil, which indicates that the storage location has never been used; deleted, which indicates that the storage location is currently empty but has been used in the past, and in use, for those storage locations that are currently occupied.

Your task is to write functions that maintain a hash table with open addressing. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Two More Random Exercises

August 21, 2012

We did two tasks related to random numbers in the most recent exercise, and we have looked at high-quality random number generators in several previous exercises. In today’s exercise we look at two very low-quality random number generators, which should not be used for any production application.

The first, invented by John von Neumann in 1949, occasioned his famous quip “Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin.” The middle-square method takes a number with an even number of digits, squares it, and extracts the middle digits for the next iteration; for instance, if the seed is 675248, the square is 455959861504, and the middle digits are 959861.

The second, invented by IBM in the early 1960s, caused Donald Knuth to claim “its very name RANDU is enough to bring dismay into the eyes and stomachs of many computer scientists!”. RANDU is based on the recursion xn+1 = 65539 · xn (mod 231), with x0 odd.

Your task is to write functions that generate random numbers by the middle-square and RANDU methods. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Two Random Exercises

August 17, 2012

We have two exercises related to random numbers today. I’m not sure of the source, but they look to me like homework exercises.

First, given a function rand3 that returns a number from 1 to 3 inclusive chosen at random, write a function that returns a number from 1 to 9, inclusive.

Second, given a function rand5 that returns a number from 1 to 5 inclusive chosen at random, write a function that returns a number from 1 to 7, inclusive.

In both cases all possible output numbers should be generated with equal frequency. You should demonstrate that your functions behave properly.

Your task is to write the rand9 and rand7 functions and demonstrate that they work properly. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2