Stepwise Program Development: A Heuristic Algorithm

December 11, 2012

I recently found a book by Niklaus Wirth in a used book store; it’s a fascinating book and has been the source of a few recent exercises:

Niklaus Wirth, Eidgenössische Technische Hochschule, Zürich, Switzerland. Systematic Programming: An Introduction. 1973: Prentice-Hall, Inc. Englewood Cliffs, New Jersey. ISBN 0-13-880369-2.

Most of the book is concerned with teaching the syntax and semantics of Wirth’s language Pascal, but the final chapter is devoted to stepwise program development, where Wirth demonstrates by example four different problems which he solves by developing programs through a succession of stepwise refinements. The last section is a kind of “final exam” for readers; it’s the longest section of the book, by far, and gives a complete, detailed exposition of the solution to the following problem:

Generate a sequence of N characters, chosen from an alphabet of three elements (e.g., 1, 2, 3), such that no two immediately adjacent subsequences are equal.

For instance, the sequence of length N = 5 with the characters “12321” is acceptable, but neither “12323” nor “12123” are.

Your task is to write a program that not only solves the stated problem but also gives all possible solutions, not just a single solution, for any given N. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

28 Comments »

28 Responses to “Stepwise Program Development: A Heuristic Algorithm”

Paul said

December 11, 2012 at 10:36 AM

A Python version. I did not look at Wirth’s best solution. I tried to solve it without looking at his method. This was my first attempt and it worked immediately.

import itertools
CHOICES = [1, 2, 3]

def check_sub_seq(seq):
    n = len(seq)
    halfn = n // 2 if n & 1 == 0 else n // 2 + 1
    for nss, start in enumerate(range(n-1, halfn - 1, -1), 1):
        if seq[start:] == seq[start - nss:n - nss]:
            return False
    return True

def brute_search(n):
    """use a queue for all possiblilities"""
    Q  = [CHOICES[:-1]]
    while Q:
        seq = Q.pop()
        for c in CHOICES:
            newseq = seq + [c]
            if check_sub_seq(newseq):
                if len(newseq) == n:
                    yield newseq
                else:
                    Q.append(newseq)

for i, seq in zip(itertools.count(1), brute_search(12)):
    print i, "".join(str(i) for i in seq)

Paul said

December 11, 2012 at 10:46 AM

A more elegant version, that uses recursion.

CHOICES = [1, 2, 3]

def check_sub_seq(seq):
    n = len(seq)
    halfn = n // 2 if n & 1 == 0 else n // 2 + 1
    for nss, start in enumerate(range(n-1, halfn - 1, -1), 1):
        if seq[start:] == seq[start - nss:n - nss]:
            return False
    return True

def search(seq, n):
    for c in CHOICES:
        newseq = seq + [c]
        if check_sub_seq(newseq):
            if len(newseq) == n:
                print "".join(str(i) for i in newseq)
            else:
                search(newseq, n)

search([1,2], 12)

Stepwise Program Development: A Heuristic Algorithm | My Blog said
December 11, 2012 at 10:53 AM
[…] Pages: 1 2 […]
Programming Praxis – Stepwise Program Development: A Heuristic Algorithm « Bonsai Code said
December 11, 2012 at 1:17 PM
[…] today’s exercise, our goal is to write an algorithm that, given an alphabet and a length, generates all […]

Remco Niemeijer said

December 11, 2012 at 1:18 PM

My Haskell solution (see http://bonsaicode.wordpress.com/2012/12/11/programming-praxis-stepwise-program-development-a-heuristic-algorithm/ for a version with comments):

import Control.Monad
import Data.List

nonAdjacent :: Ord a => [a] -> Int -> [[a]]
nonAdjacent xs = sort . filter (and . zipWith (==) xs . nub) . f where
    f 0 = [[]]
    f n = filter (\x -> not . or . tail $ zipWith isPrefixOf (inits x) (tails x)) $
          liftM2 (:) xs (f $ n - 1)

main :: IO ()
main = do mapM_ putStrLn $ nonAdjacent "123" 5
          mapM_ putStrLn $ nonAdjacent "123" 12
          mapM_ putStrLn $ nonAdjacent "123" 20

Tim Slatcher said

December 11, 2012 at 3:09 PM

My haskell solution. I’m not sure if it has optimum running time. Running “generate n” generates all possible sequences for a given N. If you run ‘take 1 (generate n)’, the lazyness of haskell should allow a fairly fast generation of a new solution.

split xs n	= (take n xs, take n (drop n xs))
--given ys is legal, checks that (y:ys) is also legal
check xs	= foldl (\a (ys, zs) -> if ys == zs then True else a) False (map (split xs) [1..((length xs) `div` 2)])

alphabet = [1,2,3]
generate 0 = [[]]
generate n = foldl (\ys zs -> if check zs then ys else zs:ys) [] (concat ( map (\x -> map (\y -> y:x) alphabet) (generate (n-1))))

Peter Klausler said
December 11, 2012 at 6:37 PM
lists _ 0 = [[]]
lists set n = [ l’ | l <- lists set (n-1), x <- set, let l' = l ++ [x], ok l' ]
where
ok l' = and [ a /= b | j <- [1 .. n `quot` 2],
let (a,b) = splitAt j $ drop (n-j-j) l' ]

main = print $ lists "123" 10
Peter Sergeant said
December 11, 2012 at 8:05 PM
Let the regular expression engine do the hard work…

#!perl use strict; use warnings;
# $> perl solve.pl 5 3 # Found 198 solutions my @asolutions = generate( $ARGV[0], # N (first cmd line argument) [1 .. $ARGV[1]] # Alphabet (number generated from second cmd line argument) ); printf("Found %d solutions\n", scalar @asolutions); sub generate { my ( $target_length, $alphabet, $head ) = @_; # Base Case our head is the right length return $head if (length ($head//'')) == $target_length;
# General Case return map { generate( $target_length, $alphabet, $_ ) } grep { ! /(.{2})\1/ } map { ($head // "") . $_ } @$alphabet; }
peteatclueballdotcom said
December 11, 2012 at 8:08 PM
If you want it to exclude single-item adjacent sequences, you can change the {2} to a {1}
Mika Moilanen said
December 11, 2012 at 8:21 PM
If you harden the problem a bit and try to avoid repeated permuations you get to abelian square free problem:
An abelian square means a non-empty word uv, where u and v are permutations of each other For example, abccba contains repeate permuation of a,b, and c.
Nat Gaertner said
December 11, 2012 at 8:35 PM
Another Python solution. Uses that you should only have to check that you’re not creating an adjacent sequence pair with each new addition.

import sys
N = int(sys.argv[1])
def next(seq=(),seqs=set(),pair_idxs = {(1,2):None,(1,3):None,(2,1):None,(2,3):None,(3,1):None,(3,2):None}):
if len(seq) == N:
seqs.add(seq)
return seqs
elif len(seq) == 0:
for i in range(1,4):
seqs = next(seq + (i,),seqs,pair_idxs)
else:
last = seq[-1]
possibles = set([1,2,3])
possibles.remove(last)
for i in possibles:
prev_pos = pair_idxs[(last,i)]
if prev_pos == None or test(seq,i,prev_pos):
new_pi = dict(pair_idxs)
new_pi[(last,i)] = len(seq)-1
seqs = next(seq + (i,),seqs,new_pi)
return seqs

def test(seq, i, prev_pos):
sub_new = seq[prev_pos+2:] + (i,)
sub_prev = seq[:prev_pos+2][-len(sub_new):]
return sub_new != sub_prev

print next()
Ben said
December 11, 2012 at 10:22 PM
how come some solutions don’t have the 3x and 2x sequences while some of the other solutions do?

if you include the 3x and 2x sequences it looks like the length of the solution is generated by this sequence which is kinda cool: http://oeis.org/A006156

Mike said

December 12, 2012 at 1:13 AM

Simple recursive solution in Python 3.

def gen(n, t=''):
    """yields sequences in lexicographic order."""

    if all(t[2*i:i] != t[i:] for i in range(-(len(t)//2), 0)):
        if n:
            for c in '123':
                yield from gen(n - 1, t + c)
        else:
            yield t

def wirth(n):
    """generates unique patterns as in the example solution."""

    return gen(n-2, '12'[:n])

aaronpeschel said
December 12, 2012 at 5:11 AM
Here is a quick solution in Python 3

#!/usr/bin/env python import re import itertools
VALID_CHARS = ['1', '2', '3'] def no_repetition(s): return re.search(r'(.+)\1', s) == None def valid_strings(n): dictionary = (''.join(x) for x in itertools.product(VALID_CHARS, repeat=n)) dictionary = filter(no_repetition, dictionary) for word in dictionary: print(word) def main(): valid_strings(3) valid_strings(5) valid_strings(10)
if __name__ == "__main__": main()
jpverkamp said
December 12, 2012 at 2:14 PM
Here are up my usual solutions in Python and Racket: Generating non-repeating strings

It’s just the brute force solutions for now, I think I’ll try to write up a smarter solution that bails out early tomorrow.

@Ben:
That actually makes perfect sense. If you parse the jargon, ternary is base three (thus the alphabet A = {a, b, c}) and a squarefree word is one without any adjacent subwords. So exactly what the problem asks for.

@Mike:
I think that yield from alone is just about enough for me to give Python 3.3+ another try over Python 2.7. That’s really helpful.

Eric said

December 12, 2012 at 6:05 PM

My C recursive solution.

#define FIRST_ELEMENT   '1'
#define LAST_ELEMENT    '3'

#define N               4

void addElement(char element, int index, char *sequence)
{
    sequence[index] = element;
    if (++index == N)
    {
        printf("%.*s\n", N, sequence);
    }
    else
    {
        char nextElement;
        for (nextElement = FIRST_ELEMENT; nextElement <= LAST_ELEMENT; nextElement++)
        {
            if (nextElement != element)
            {
                addElement(nextElement, index, sequence);
            }
        }
    }
}

int main()
{
    char element;
    char sequence[N];
    int index = 0;
    for (element = FIRST_ELEMENT; element <= LAST_ELEMENT; element++)
    {
        addElement(element, index, sequence);
    }
}

Eric said

December 12, 2012 at 6:55 PM

Sorry, but I don’t think I got the problem definition right.

Do the following also belong to a N=5 sequence?
13121
13123
13212
13213
13231
and also sequences starting with 2 and 3…

My C recursive solution.

#define FIRST_ELEMENT   '1'
#define LAST_ELEMENT    '3'

#define N               5

void addElement(char element, int index, char *sequence)
{
    sequence[index] = element;
    if ((index < 3 || strncmp(sequence + index - 1, sequence + index - 3, 2))
            && (index < 5 || strncmp(sequence + index - 2, sequence + index - 5, 3)))
    {
        if (++index == N)
            printf("%.*s\n", N, sequence);
        else
        {
            char nextElement;
            for (nextElement = FIRST_ELEMENT; nextElement <= LAST_ELEMENT; nextElement++)
                if (nextElement != element)
                    addElement(nextElement, index, sequence);
        }
    }
}

int main()
{
    char element;
    char sequence[N];
    int index = 0;
    for (element = FIRST_ELEMENT; element <= LAST_ELEMENT; element++)
        addElement(element, index, sequence);
}

cosmin said

December 12, 2012 at 9:51 PM

A divide and conquer approach in Pyhton:

def good_seq(a, b):
	m, n = len(a), len(b)
	c = a + b
	for i in xrange(m):
		for k in xrange(1 + (m-i)/2, 1 + (m+n-i)/2):
			if c[i:i+k] == c[i+k:i+2*k]: return False
	return True

def wirth_seq(n):
	if n == 1: return [[1], [2], [3]]
	if n%2 == 0:
		L = wirth_seq(n/2)
		return [a+b for a in L for b in L if good_seq(a, b)]
	else:
		L = wirth_seq(n-1)
		return [a+[c] for a in L for c in [1, 2, 3] if good_seq(a, [c])]

print len(wirth_seq(20))

ardnew said
December 13, 2012 at 6:59 PM
not entirely sure what you mean by “statement separator” vs. “statement terminator”, but semicolons in pascal are used as statement delimiters in (almost) exactly the same way as C. the difference in pascal is that the final statement of a lexical block does not need to be terminated with a semicolon. the end of the block terminates the expression.

also, similar to C, pascal does not require you use block delimiters “begin” and “end” (curly braces in C) for single-expression blocks.

a good example demonstrating all of these behaviors is found in his print procedure:
```
if m = N then begin print; change end
else extend
```
the call to “change” is not delimited with a semicolon because it is the last statement of the if-block. similarly, “extend” is not delimited with a semicolon because it is the last statement of the else-block. also, the else-block contains only a single expression, so the block delimiters “begin” and “end” are not used at all.

for clarity, the above could be rewritten with the following equivalent:
```
if m = N then 
begin 
   print; 
   change; 
end
else 
begin
   extend;
end;
```
I personally find the lack of semicolons and block delimiters to be slightly confusing and more difficult to maintain. so, unlike Wirth, I choose to use them for all blocks and all expressions.
sam said
December 13, 2012 at 8:10 PM
new to python/programming. First I created all the possible combinations. Then I test each one and return the valid ones. My results are in lists, but I’m sure could easily be changed to strings if that is ‘required’.
sam said
December 13, 2012 at 8:14 PM
sam said
December 13, 2012 at 8:16 PM
def wirth(n):

a=1
b=2
c=3
chars = [a,b,c]
validseq = []
allseq = [[a],[b],[c]]

i=0
while i < n-1: ##build all combinations of the 3 characters
newseq = []
for e in allseq:
x = e+[a]
newseq.append(x)
y = e+[b]
newseq.append(y)
z = e+[c]
newseq.append(z)
allseq = newseq
i+=1

for e in allseq:
testresult = testvalidity(e,n)
if testresult:
validseq.append(testresult)

print validseq

def testvalidity(e, n):

i=1
while i < n:
j=0
while j < min(i, n-i):
if e[(i-1)-j:i] == e[i:i+j+1]:
return None

j += 1
i += 1
return e

hamidj said

December 14, 2012 at 7:35 PM

My java solution here: (hamidj.wordpress.com)

import java.math.BigInteger;

public class Narcissistic_Number {
	public static void main(String[] args) {
		long begin = System.currentTimeMillis();
		//will generate first n Narcissistic Numbers
		generate(24);
		long end = System.currentTimeMillis();
		System.out.println("\nTotal execution time: "
                    + (end-begin)/1000 + "s");
	}

	public static void generate(int n){
        //n is number of Narcissistic to be generated
		BigInteger i = BigInteger.valueOf(0);
		int count = 0;
        //loop going from zero and checks if number is Narcissistic
		while(count < n) {
			if(isNarcissistic(i)  == true){
				System.out.print(i + " ");
				count++;
			}
			i = i.add(BigInteger.valueOf(1));
		}
	}

	public static boolean isNarcissistic(BigInteger n){
		String s = n.toString();
		int len = s.length();
		BigInteger sum = BigInteger.valueOf(0);
		for(int i = 0; i < s.length(); i++){
			sum = sum.add(BigInteger.valueOf((long) Math.pow(Character.getNumericValue(s.charAt(i)), len)));
		}
		return sum.equals(n);
	}
}

Matías Giovannini (@cordementeque) said

December 19, 2012 at 3:19 PM

My Mathematica translation of Wirth’s algorithm:


cubefree[n_Integer] := 
 Module[{print, extend, change, check, res = {}, s = Range[n], m = 2, 
    good = True},
   print[] := AppendTo[res, Take[s, m]];
   extend[] := s[[++m]] = 1;
   change[] := Module[{}, While[s[[m]] == 3, --m]; s[[m]]++];
   check[] := Module[{i, l, h = Floor[m/2]},
     good = True;
     For[l = 1, good && l <= h, l++,
      For[i = 0, i < l, i++, 
       If[s[[m - l - i]] != s[[m - i]], Break[]]];
      good = i < l]];
   While[True,
    Which[! good, change[], m < n, extend[], True, print[]; change[]];
    check[];
    If[m == 2, Break[]]
    ]; res] /; n > 2

beardedone85 said
January 23, 2013 at 12:35 AM
Just wondering if anyone has tested for a max N? My program generates quote a few solutions (non unique) with N=21, but zero at 22. Has anyone else noticed this or is there a problem with my program? Thanks.
programmingpraxis said
January 23, 2013 at 3:29 AM
If you follow the link from the exercise to codepad, and modify the script given there, you will find 691 solutions with n=22. Something is wrong with your program.
X Era Xcorp said
February 20, 2016 at 7:05 PM
wow no one came up with a C++ version ? ;(
X Era Xcorp said
February 20, 2016 at 7:07 PM
or even a C# version ?> im looking for a heuristic-algorithm to add its source to my code so that the heuristic function can be used

Programming Praxis