## Finding God

### February 16, 2016

The first three verses of the Bible are shown below. Pick any word in the first verse, count its letters, and move forward that many words in the text. Then repeat until you reach the third verse. You will always end with *God*.

1 In the beginning God created the heaven and the earth.

2 And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.

3 And God said, Let there be light: and there was light.

For instance, if you start at *beginning*, count nine letters and move forward to the first *the* in the second verse. Then count forward three words to *without*, count forward seven words to *upon*, then count forward three words to *the*, count forward three words to another *the*, count forward three words to *God*, count forward three words to *the*, count forward three words to yet another *the*, and finally count forward three words to *God*. This trick was discovered by Martin Gardner.

Your task is to write a program that reads a text and checks if the trick works; you might look at the first paragraph of *Alice in Wonderland*, where starting within the first ten words always leads to *sister*, or the words of *Twinkle, Twinkle Little Star*, where starting anywhere within the first two lines always leads to *you* in the last line. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Three String Exercises

### February 12, 2016

We have three simple exercises on strings today:

- Write a function that determines the length of a string.
- Write a function that finds the index of the first occurrence of a character in a string, optionally starting at a given index.
- Write a function that creates a new string as a substring of an input string, from a given starting index to a given ending index.

Your task is to write the three exercises described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Counting Primes

### February 9, 2016

We have studied functions for counting the prime numbers less than a given number in two previous exercises. All of them were based on Legendre’s phi function that counts the numbers less than *x* not stricken by sieving with the first *a* primes.

Today we look at a rather different method of counting the primes that is due to G. H. Hardy and Edward M. Wright. Their method is based on factorials, and their formula is

for *n* > 3, where ⌊*n*⌋ is the greatest integer less than *n*. The expression inside the big parentheses is 1 when *n* is prime and 0 when *n* is composite, by Wilson’s Theorem, which states that an integer *n* > 1 is prime if and only if (*n* − 1)! ≡ −1 (mod *n*); the theorem was first stated by Ibn al-Haytham (c. 1000AD), but is named for John Wilson, who first published it in 1770, and it was first proved by Lagrange in 1771.

Your task is to write a program that counts primes by the Hardy-Wright method. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Freq

### February 5, 2016

Every so often I get it in mind to start solving the daily cryptogram in the newspaper. For instance:

[Category: Literature] V MVXEGC NK V RGIIZH HYZ IGXDK PZQ YNK QLMCGIIV HYGX AYG KQX NK KYNXNXO VXD HZXAK NA MVUE AYG LNXQAG NA MGONXK AZ CVNX. — LVCE AHVNX

The first word is either A or I, and the repeated two-letter words NK, NK, NA and NA suggest words like IF, IS, IT or OF, ON, OR. Further, the digraph NK also appears as YNK. So we guess that NK is IS, YNK is HIS, and V is A. Now the author LVCE AHVNK is _A__ __AI_, which rather strongly suggests MARK TWAIN. Now we have the phrase WH_N TH_ S_N IS SHININ_, which is almost certainly WHEN THE SUN IS SHINING, and the rest of the cryptogram is easy: A BANKER IS A FELLOW WHO LENDS YOU HIS UMBRELLA WHEN THE SUN IS SHINING AND WANTS IT BACK THE MINUTE IT BEGINS TO RAIN. — MARK TWAIN.

We’ve written a program to solve cryptograms, a long time ago, using a dictionary attack, but sometimes it’s still fun to solve them by hand. One way a computer can help is by creating frequency tables of single letters, digraphs, and trigraphs; in the puzzle shown above, a table of digraphs leads quickly to NX, which appears six times, along with NK (three times) and NA (two times), which quickly limits the possibilities for the letter N. We didn’t use it in the analysis above, but AYG is the most common trigram, and both times it appears it is a word by itself, so it is almost certainly THE or AND. Thus, frequency tables quickly lead to a solution.

Your task is to write a program that generates frequency tables, including single characters, digraphs, trigraphs, and other lengths. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Prime Gaps

### February 2, 2016

We studied maximal prime gaps in a recent exercise. In today’s exercise we will look at minimal prime gaps — the smallest gap of a given size. For instance, the minimal prime gap of 2 is the gap from 3 to 5, the minimal prime gap of 4 is the gap from 7 to 11, the minimal prime gap of 6 is the gap from 13 to 19, and so on.

Your task is to write a program that finds minimal prime gaps. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Compare Strings With One Error

### January 29, 2016

I don’t know if today’s exercise comes from an interview question or from somebody’s homework, but it’s a good exercise:

Given two strings, determine if they are the same except for exactly one difference. Two identical strings fail to match. Two strings that differ in two or more characters fail to match. Two strings with lengths that differ by one match if and only if they are identical except for one additional character. Indicate the index where the difference occurs, or report failure if the two input strings do not differ in exactly one character.

Your task is to write a program that compares strings with one error. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Higher-Order String Functions

### January 26, 2016

Scheme provides the higher-order functions `map`

, `for-each`

and `fold`

that operate on lists. In today’s exercise we will write versions of those functions that operate on strings:

`(string-map proc str)`

applies`proc`

to each character in`str`

, replacing the character with the output of`proc`

.`proc`

takes a single character and returns a single character.

`(string-for-each proc str)`

applies`proc`

to each character in`str`

; the function is executed only for its side-effects.`proc`

takes a single character and returns nothing.

`(string-fold proc base str)`

calls function`(proc base c)`

on each character of`str`

, working left to right, accumulating the result in`base`

as it goes.`proc`

takes a`base`

of any type, and a character, and returns a value of the same type as`base`

.`(string-fold-right proc base str)`

is the same, except that it works from right to left.

Your task is to implement these four functions in the language of your choice. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Entropy

### January 22, 2016

The Shannon entropy of a file is a measure of the information content in the file; higher entropy implies more information. Shannon entropy is computed as *H* = -1 * sum(*p _{i}* * log

_{2}(

*p*)) where

_{i}*p*is the frequency of each symbol

_{i}*i*in the input (frequency is the percentage of the total number of symbols). Shannon entropy is just one facet of the study of information theory, which is fundamental to computer science.

Your task is to compute the Shannon entropy of a file. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## RSA Encryption Backdoor

### January 19, 2016

There has been much ado recently about a law-enforcement backdoor to encryption that would enable authorized access to private encrypted communications: FBI director James Comey and British Prime Minister David Cameron have both come out strongly in favor of an encryption backdoor as a tool in the fight against terrorists, while the EFF and cryptography experts are uniformly against the proposal. Today’s exercise looks at a way that an encryption backdoor can be added to the RSA cryptosystem.

Let’s start by reviewing how RSA works, as we have studied it in a previous exercise. Two large primes *p* and *q* are chosen at random; their product *n* is the modulus of the cryptosystem, and the totient φ(*pq*) of *n* is (*p*−1) × (*q*−1). Then two keys are chosen, the encryption key *e* and the decryption key *d* such that *de* ≡ 1 (mod φ(*pq*)) and gcd(*e*, φ(*pq*)) = 1. Then a message *m*, encoded as an integer on the range 0 < *m* < *n*, is encrypted as *m ^{e}* (mod

*n*) and the resulting ciphertext

*c*is decrypted as

*c*(mod

^{d}*n*). In practice,

*e*is usually 2

^{16}+1 = 65537 (that choice simplifies the arithmetic involved in encryption and decryption),

*p*and

*q*are

*t*-bit integers with

*t*∈ {512, 1024} bits, and there are algorithms that allow the message to be any length (not limited to

*n*), though we won’t look at them here because they are arcane and not germane to the RSA backdoor.

The encryption backdoor is inserted in the key generation process. The backdoor key *a* is chosen as a random prime of length 3*t*/4 bits; also chosen are two random primes *p*‘ and *q*‘ of length *t*/4 bits. Then *p* is computed based on *p*‘ by the following algorithm:

1 [Initialize]: Set

k←p‘.

2 [Computep]: Setp← a × k + p’.

3 [Check primality]: Ifpis prime, returnp.

4 [Try again]: Setk←k+ 1 and go to Step 2.

And likewise for *q*, which is computed based on *q*‘. This process must terminate by Dirichlet’s Theorem, which says that for integers *a*, *b* and *x* with *a* and *b* coprime, the arithmetic progression *ax* + *b* contains infinitely many prime numbers. Once you’ve computed *p* and *q*, the key generation and ciphering processes of the normal RSA algorithm are unchanged.

An attacker who doesn’t know the backdoor key *a* can read an encrypted message *m* only by factoring *n*, which is difficult; that’s the basis of the secrecy provided by the RSA system. But an attacker who knows the backdoor key *a* can factor *n* (mod *a*) as *p*‘ × *q*‘, recover *p* and *q* and thus the decryption key *d*, and thus read the message *m*. Factoring *n* (mod *a*) is very much easier than factoring *n*; for instance, if *n* is a 2048-bit semiprime, *n* (mod *a*) is only 512 bits, which can be factored by a single personal computer in a few months, or by an AWS cluster in a few days, or by the supercomputers in the basement of the NSA one morning before you finish your first donut.

In practice, the RSA backdoor would work like this: Each device that performs encryption, say an iPhone or Android tablet, has a unique MAC address or KEK key or serial number that is somehow converted to a backdoor key *a* that is also unique to the device. (If a single backdoor key *a* was used for all devices, it could be recovered by a method similar to this exercise.) The device generates backdoored keys on request, then sends encrypted messages in the normal way. When some police agency wants to read an encrypted message, they request warrants from a judge, seize the device, determine the backdoor key *a*, and perform the procedure described above.

Your task is to write programs that demonstrate the RSA backdoor. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

## Dobble

### January 15, 2016

Dobble is a French card game that uses 55 cards, each with 8 symbols; any pair of cards have exactly one symbol in common. The game is played by dealing one card to each player and placing the remaining cards face-up in the center of the table; the first player to spot the symbol that matches the top card on his pile to the top card on the central pile calls out the name of the symbol, then moves the top card from the central pile to the top of his own pile, and the game continues until all cards in the central pile have been claimed, with the winner being the player with the most cards.

Your task is to write a program that creates a suitable deck of cards, so that every pair of cards has exactly one symbol in common. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.