July 13, 2012
[ Today's guest author, Sam Kennedy, is a hobbyist programmer just finishing his A-levels in Newcastle, England. ]
Genetic programming is a branch of artificial intelligence in which programmable machines are evolved to perform a specific task; among other tasks, genetic programming is used in the design of electrical circuits, development of chemical reactions, and solving instances of the travelling salesman problem. In today’s exercise we will look at artificial organisms called flibs (“finite living blobs”) invented by A. K Dewdney in his book The Armchair Universe (which we’ve seen in two previous exercises) which learn to recognize and predict the elements of a sequence.
The DNA of a flib is a finite state machine like that shown above right; the sample machine has three states A, B and C, and two input/output symbols 0 and 1, and can be written in its alternate form 1B1C0C0B1A0A, which we call a chromosome. A gene is a symbol in the chromosome: A, B, C, 0, or 1. An allele is a gene at a particular locus: The allele 0 at locus 7 of chromosome 1B1C0C0B1A0A controls the flib’s output symbol when it is in state B and receives an input symbol of 1. The task performed by a flib is to recognize and predict a sequence of input/output states; for instance, the sample flib responds to the input sequence 0111000010110 with the output sequence 100001100100. By shifting the output bits one place to the right, we see only 6 of the 12 predictions are correct, a score of 50% which is no better than random chance.
The purpose of genetic programming is to evolve a flib that perfectly predicts a random sequence. For instance, the sequence 010011010011010011… (an infinite sequence that repeats 010011 forever) can be predicted perfectly by the sample flib that failed on the earlier input sequence; the input sequence must cycle, as no flib can perfectly predict a random (non-repeating) sequence.
Genetic programming uses two operators. Mutation occurs when an allele is changed at random (think of a gene being struck by a cosmic ray); for instance, the chromosome 0D1C0D0B1A0C1B1A might be changed to 0D0C0D0B1A0C1B1A by mutation. Crossover occurs when two chromosomes breed by exchanging portions of their gene sequence. For instance, the chromosomes 0D1C0D0B1A0C1B1A and 1A1B0D1A0C1D1B0C might breed by replacing the fourth through eighth genes of the second chromosome with those of the first chromosome, forming the offspring chromosome 1A1C0D0B0C1D1B0C; the two crossover points are selected at random.
Given all that, the genetic algorithm takes a set of randomly-generated chromosomes (the gene pool), say a dozen of them, and runs the following loop:
1) calculate the score of each flib on identical input sequences
2) find the flibs with the highest and lowest scores
3) replace the lowest-scoring flib with its crossover with the highest-scoring flib, sometimes
4) mutate a single gene on a single flib
Each time a flib beats the current high score, its score and chromosome are printed; the loop continues until a flib reaches a perfect score or your patience is exhausted. In Step 3, the crossover is only performed sometimes, say 3 times out of 10, because if the flibs breed too often the gene pool becomes very similar to that of the highest-scoring flib and loses diversity; on the other hand, if they breed infrequently the simulation will take longer.
Your task is to write a genetic program that breeds perfect flibs. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.