I Think I’m Crazy!

February 23, 2018

In a recent exercise I wrote about a shell script I am writing at work. Development of the shell script continues, as I work my way through the various requirements of the script. At one point, I decided to use Awk to process a piece of the task, and I had to look something up in the Awk manual, and while I was there I noticed that Gawk has a -M option to use gmp for the arithmetic, giving Awk a big-integer capability. So it occurred to me: awk big integers … prime numbers … awk big integers … prime numbers … and so a project was born.

I decided to write a prime number generator in Awk, based on this previous exercise. Never mind that Awk doesn’t provide iterators, I ought to be able to figure this out. And I did; the result is on the next page. For the record, it works. But it’s horribly slow; generating the 78498 primes less than a million takes about four-and-a-half seconds, which is at least four seconds too long. Am I crazy?

Your task is to tell us about something crazy you have done, writing a program in a language that doesn’t provide what you need, either because you were forced to do so by circumstances or because you wanted to be crazy, as I did. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisements

Pages: 1 2

N-Gram Frequency

February 20, 2018

Given a string of characters and an n-gram size n, prepare a frequency table of all combinations of size n blocks of characters. For instance, with the string “Programming Praxis” and an n-gram size of 2, the n-grams are “Pr”, “ro”, “og”, “gr”, “ra”, “am”, “mm”, “mi”, “in”, “ng”, “g “, ” P”, “Pr”, “ra”, “ax”, “xi”, and “is”; the two n-grams “Pr” and “ra” appear twice, and all the others appear once.

Your task is to write a program that computes a frequency table of n-grams. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Perfect Power Sequence

February 16, 2018

I discovered the Numberphile channel on YouTube a few months ago, and have greatly enjoyed its videos at the intersection of mathematics and programming. Their recent video discusses the Catalan Conjecture:

In the infinite sequence of perfect powers 1, 4, 8, 9, 16, 25, 32, 36, 49, 64, 81, 100, …, the only two adjacent numbers are 2³ = 8 and 3² = 9.

The conjecture has recently been proved, as discussed at Numberphile.

Your task is to write a program that generates the perfect power sequence. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Today we have another homework problem:

What is the earliest time of day (using a 24-hour clock) that can be written with four given digits? For instance, given the digits 1, 2, 3, and 4, times like 14:32 and 23:14 can be written, but the earliest time is 12:34. Your program should report that no time can be formed with digits 6, 7, 8 and 9.

Your task is to find the earliest time of day that can be written by four digits. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Partial List Reversal

February 9, 2018

We’re a few weeks into the beginning of a new school term, and the beginning-programmer message boards are swelling with questions from new students. This question caught my eye; I imagine it’s from a second-semester programming student who learned C in the first semester and is now taking a class in data structures:

Write a program that, given a linked list and two integer indices, reverses the portion of the list between the two indices. For instance, reversing the list [0,1,2,3,4,5,6,7,8,9] between indices 3 (inclusive) and 6 (exclusive) yields the list [0,1,2,5,4,3,6,7,8,9].

Your task is to help the student by writing a program to reverse a portion of a list. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

I have mentioned previously that I work for a community college, part of a team maintaining an enterprise-wide database system on Oracle and Unix. My current assignment involves a file-transfer program between us and the Department of Education (they don’t use sftp like the rest of the world), and I am writing in the Posix shell language because that is the only language that can both be called from our user interface and run the program that the Department of Education requires. It’s not a big program, under five hundred lines, but there’s lots going on. Here are some examples:

  • Text file database: The Department of Education sends us a file of fixed-length records terminated by carriage-returns and newlines containing ascii text that acts as a database for holding configuration metadata. I read the database by selecting the desired record with grep and extracting the needed field with cut -c.
  • Strip file headers/trailers: Files received from the Department of Education have header and trailer lines added to the data. I strip those lines with an ed script:
    ed $FILENAME <
    
  • Arithmetic: At one point during the development of the program I needed to do some arithmetic, nothing fancy. That requirement has now gone away, but at the time I used bc to do the arithmetic, passing input using shell variables and returning output to a shell variable. And I couldn’t resist; the solutions page has a factoring program written in bc.
  • Oracle database: I use SQL*Plus to insert and query records in the Oracle database.
  • Shell built-ins: I use many of the built-in shell commands. If and test allow me to execute commands conditionally. While and do let me do loops. Cp and mv let me get things where they belong. Chown and chmod let me control the security of my data. Read lets me index through a file line-by-line. Shell variables let me parameterize the code. Shell functions let me modularize the code.

I’m not the first person to remark that having a single unifying datatype — ascii text in delimited files — and an assortment of programs to operate on that datatype makes a wonderfully useful system environment.

Your task is to tell us about your use of unix command-line utilities and shell scripts; hopefully other readers will be inspired by something interesting that you have done. When you are finished, you are welcome to read a suggested solution, or to discuss the exercise in the comments below.

Pages: 1 2

Folds

February 2, 2018

Folds, in the parlance of functional programming, are a way to convert lists to a value of some other type; a fold applies a function pair-wise to each element of a list and an accumulator, then returns the accumulator when the list is exhausted. The fundamental fold is foldr:

foldr f a [x1, x2, ..., xn] = f x1 (f x2 (... (f xn a)...))

Here, f is a function with two arguments, a is an initial value, and [x1, x2, ..., xn] is the input list. The name foldr stands for “fold right”, because the parentheses stack on the right side of the expansion, the items in the list are processed right-to-left, and the accumulator is on the right side of the binary function. Foldl is similar:

foldl f a [x1, x2, ..., xn] = (...((f a x1) x2) ... xn)

The arguments have the same meaning, with “fold left” referring to the fact that the parentheses stack on the left, the items in the list are processed left-to-right, and the accumulator is on the left side of the binary function. Note that foldl and foldr have different types, because the binary function takes its arguments in opposite order. In some cases, that makes a difference; for instance, when f is cons, you must use foldr. But when the function is associative, such as +, you can use either foldl or foldr. Here are some examples:

foldr + 0 [1,2,3,4] → 10
foldl + 0 [1,2,3,4] → 10
foldr cons [] [1,2,3,4] → [1,2,3,4]
foldl cons [] [1,2,3,4] → [[[[[],1],2],3],4]
foldr plusone 0 [1,2,3,4] → 4
foldl snoc [] [1,2,3,4] → [4,3,2,1]

Sometimes there is no obvious starting value. For instance, if you want to find the maximum item in a list, there is no “guaranteed to be less than anything else” value to use for a. In that case you can use the foldl1 and foldr1 variants that take the first item in the list as the initial value. Here, max is a binary function that takes two numbers and returns the larger; it is applied pair-wise at each item in the list (we ignore the fact that the built-in max can take more than two arguments):

foldr1 max [1,2,3,4] → 4
foldl1 min [1,2,3,4] → 1

Related to foldl is scan, which applies foldl to every initial segment of a list:

scan f a [x1, x2, ..., xn] = [a, f(a, x1), f(f(a, x1), x2), ..., f(f(f(a, x1), x2), x3)]

For instance:

scan + 0 [1,2,3,4] → [0,1,3,6,10]
scan snoc [] [1,2,3,4] → [[], [1], [2,1],[3,2,1],[4,3,2,1]]

Your task is to implement all the various folds shown above; if your language provides them natively, you should re-implement them from first principles. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Tri-48

January 30, 2018

Today’s exercise comes from Albert H. Beiler’s book Recreations in the Theory of Numbers Chapter 1, Problem 2):

One side of a right-angled triangle is 48. Find ten pairs of whole numbers which may represent the other two sides.

Your task is to find the solution to Beiler’s problem. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

In 1976, Volker Strassen, working with John Pollard, developed a factoring algorithm that is still the fastest proven deterministic factoring algorithm, with time complexity O(n1/4 log n); unfortunately, it is generally slower than other similar algorithms, even Pollard’s rho algorithm. The basic idea is that if you have the product fi of a consecutive set of integers modulo the number to be factored n, and that set of integers contains one of the factors of n, then gcd(fi, n) will be greater than 1. Here’s the algorithm:

  1. Compute c = n1/4 and define an array f containing c integers, each initially 1.
  2. In each array element fi, compute the product modulo n of the integers from i c + 1 to c(i+1).
  3. Compute the greatest common divisor of each of the fi with n, stopping when you find a (possibly-composite) factor.

The second loop, in Step 3, takes time O(n1/4 log n), so the trick is to compute the products of Step 2 in less time than O(n²). That can be done using product trees, as Berstein suggests, but that’s messy and a little bit complicated, and even if you manage to implement it, Strassen’s algorithm still isn’t competitive, so we won’t bother.

Your task is to implement Strassen’s factoring algorithm as described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

Split An Array

January 23, 2018

We have a simple task today, intended for those students just starting a new semester who are learning programming for the first time:

Given an array of positive integers, determine if the array can be split into two pieces, without re-ordering the elements of the array, so that both pieces have the same sum. If it is possible, return the two pieces. If not, you should so indicate. For instance, the array [1, 2, 3, 4, 5, 15] can be split into two pieces, [1, 2, 3, 4, 5] and [15], each with sum 15.

Your task is to write the program to split an array into two pieces with equal sums. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2