## Stepwise Program Development: A Heuristic Algorithm

### December 11, 2012

Here is Wirth’s solution, the final result of several rewrites; it gives only *unique* solutions, excluding those that are identical except for substituting one symbol of the alphabet for another:

varS: array[1 ..N]ofchar;

m:integer;good:Boolean;

procedureextend;

beginm:=m+ 1;S[m] := ‘1’;end;

procedurechange;

beginwhileS[m] = ‘3’dom:=m– 1;

S[m] := succ(S[m]);

end;

procedurecheck;

vari,L,mhalf:integer;

begingood:=true;L:= 0;mhalf:=mdiv2;

whilegood∧ (L<mhalf)do

beginL:=L+ 1;i:= 0;

repeatgood:=S[m–i] ≠S[m–L–i];i:=i+ 1

untilgood∨ (i=L)

end;

end;

procedure

vari:integer;

beginfori:= 1toNdowrite(S[i]);

writeln

end;

beginm:= 2;S[1] := ‘1’;S[2] := ‘2’;good=true;

repeatifgoodthen

ifm=Nthenbeginchangeend

elseextend

elsechange;

check

untilm= 2

end.

Wirth’s typography is beautiful, but gave me a jolly time trying to recreate it. Somebody will have to explain to me how a semicolon is a statement separator rather than a statement terminator in Pascal. I won’t try to explain the whole program (Wirth takes eight dense pages to do so), but the basic idea is that a starting sequence is *extend*ed if possible, *change*d if not, and backtracks only when necessary. If you can, get the book, either from a used book store or from a library, as Wirth’s explanation is highly rewarding. Here is my direct translation to Scheme:

`(define (wirth n)`

```
``` (let ((s (make-vector (+ n 1) #f))

(m 0) (good #f))

(define (extend)

(set! m (+ m 1))

(vector-set! s m #\1))

(define (change)

(while (char=? (vector-ref s m) #\3)

(set! m (- m 1)))

(vector-set! s m (succ (vector-ref s m))))

(define (check)

(let ((i 0) (l 0) (mhalf 0))

(set! good #t)

(set! l 0)

(set! mhalf (quotient m 2))

(while (and good (< l mhalf))

(set! l (+ l 1))

(set! i 0)

(repeat

(set! good

(not (char=?

(vector-ref s (- m i))

(vector-ref s (- m l i)))))

(set! i (+ i 1))

(until (or good (= i l)))))))

(define (print)

(do ((i 1 (+ i 1))) ((< n i))

(display (vector-ref s i)))

(newline))

` (set! m 2)`

(vector-set! s 1 #\1)

(vector-set! s 2 #\2)

(set! good #t)

(repeat

(if good

(if (= m n)

(begin (print) (change))

(extend))

(change))

(check)

(until (= m 2)))))

A sample run is shown below:

`> (wirth 5)`

12131

12132

12312

12313

12321

The code for `while`

, `repeat`

and `succ`

is shown at http://programmingpraxis.codepad.org/qM69avYY, where you can run the program.

A Python version. I did not look at Wirth’s best solution. I tried to solve it without looking at his method. This was my first attempt and it worked immediately.

A more elegant version, that uses recursion.

[…] Pages: 1 2 […]

[…] today’s exercise, our goal is to write an algorithm that, given an alphabet and a length, generates all […]

My Haskell solution (see http://bonsaicode.wordpress.com/2012/12/11/programming-praxis-stepwise-program-development-a-heuristic-algorithm/ for a version with comments):

My haskell solution. I’m not sure if it has optimum running time. Running “generate n” generates all possible sequences for a given N. If you run ‘take 1 (generate n)’, the lazyness of haskell should allow a fairly fast generation of a new solution.

split xs n = (take n xs, take n (drop n xs))

–given ys is legal, checks that (y:ys) is also legal

check xs = foldl (\a (ys, zs) -> if ys == zs then True else a) False (map (split xs) [1..((length xs) `div` 2)])

alphabet = [1,2,3]

generate 0 = [[]]

generate n = foldl (\ys zs -> if check zs then ys else zs:ys) [] (concat ( map (\x -> map (\y -> y:x) alphabet) (generate (n-1))))

lists _ 0 = [[]]

lists set n = [ l’ | l <- lists set (n-1), x <- set, let l' = l ++ [x], ok l' ]

where

ok l' = and [ a /= b | j <- [1 .. n `quot` 2],

let (a,b) = splitAt j $ drop (n-j-j) l' ]

main = print $ lists "123" 10

Let the regular expression engine do the hard work…

#!perl

use strict;

use warnings;

# $> perl solve.pl 5 3

# Found 198 solutions

my @asolutions = generate(

$ARGV[0], # N (first cmd line argument)

[1 .. $ARGV[1]] # Alphabet (number generated from second cmd line argument)

);

printf("Found %d solutions\n", scalar @asolutions);

sub generate {

my ( $target_length, $alphabet, $head ) = @_;

# Base Case our head is the right length

return $head if (length ($head//'')) == $target_length;

`# General Case`

return

map { generate( $target_length, $alphabet, $_ ) }

grep { ! /(.{2})\1/ }

map { ($head // "") . $_ }

@$alphabet;

}

If you want it to exclude single-item adjacent sequences, you can change the {2} to a {1}

If you harden the problem a bit and try to avoid repeated permuations you get to abelian square free problem:

An abelian square means a non-empty word uv, where u and v are permutations of each other For example, abccba contains repeate permuation of a,b, and c.

Another Python solution. Uses that you should only have to check that you’re not creating an adjacent sequence pair with each new addition.

import sys

N = int(sys.argv[1])

def next(seq=(),seqs=set(),pair_idxs = {(1,2):None,(1,3):None,(2,1):None,(2,3):None,(3,1):None,(3,2):None}):

if len(seq) == N:

seqs.add(seq)

return seqs

elif len(seq) == 0:

for i in range(1,4):

seqs = next(seq + (i,),seqs,pair_idxs)

else:

last = seq[-1]

possibles = set([1,2,3])

possibles.remove(last)

for i in possibles:

prev_pos = pair_idxs[(last,i)]

if prev_pos == None or test(seq,i,prev_pos):

new_pi = dict(pair_idxs)

new_pi[(last,i)] = len(seq)-1

seqs = next(seq + (i,),seqs,new_pi)

return seqs

def test(seq, i, prev_pos):

sub_new = seq[prev_pos+2:] + (i,)

sub_prev = seq[:prev_pos+2][-len(sub_new):]

return sub_new != sub_prev

print next()

how come some solutions don’t have the 3x and 2x sequences while some of the other solutions do?

if you include the 3x and 2x sequences it looks like the length of the solution is generated by this sequence which is kinda cool: http://oeis.org/A006156

Simple recursive solution in Python 3.

Here is a quick solution in Python 3

#!/usr/bin/env python

import re

import itertools

VALID_CHARS = ['1', '2', '3']

def no_repetition(s):

return re.search(r'(.+)\1', s) == None

def valid_strings(n):

dictionary = (''.join(x) for x in itertools.product(VALID_CHARS, repeat=n))

dictionary = filter(no_repetition, dictionary)

for word in dictionary:

print(word)

def main():

valid_strings(3)

valid_strings(5)

valid_strings(10)

`if __name__ == "__main__":`

main()

Here are up my usual solutions in Python and Racket: Generating non-repeating strings

It’s just the brute force solutions for now, I think I’ll try to write up a smarter solution that bails out early tomorrow.

@Ben:

That actually makes perfect sense. If you parse the jargon, ternary is base three (thus the alphabet

A = {a, b, c}) and a squarefree word is one without any adjacent subwords. So exactly what the problem asks for.@Mike:

I think that

yield fromalone is just about enough for me to give Python 3.3+ another try over Python 2.7. That’s really helpful.My C recursive solution.

Sorry, but I don’t think I got the problem definition right.

Do the following also belong to a N=5 sequence?

13121

13123

13212

13213

13231

and also sequences starting with 2 and 3…

My C recursive solution.

A divide and conquer approach in Pyhton:

not entirely sure what you mean by “statement separator” vs. “statement terminator”, but semicolons in pascal are used as statement delimiters in (almost) exactly the same way as C. the difference in pascal is that the final statement of a lexical block does not need to be terminated with a semicolon. the end of the block terminates the expression.

also, similar to C, pascal does not require you use block delimiters “begin” and “end” (curly braces in C) for single-expression blocks.

a good example demonstrating all of these behaviors is found in his print procedure:

the call to “change” is not delimited with a semicolon because it is the last statement of the if-block. similarly, “extend” is not delimited with a semicolon because it is the last statement of the else-block. also, the else-block contains only a single expression, so the block delimiters “begin” and “end” are not used at all.

for clarity, the above could be rewritten with the following equivalent:

I personally find the lack of semicolons and block delimiters to be slightly confusing and more difficult to maintain. so, unlike Wirth, I choose to use them for all blocks and all expressions.

new to python/programming. First I created all the possible combinations. Then I test each one and return the valid ones. My results are in lists, but I’m sure could easily be changed to strings if that is ‘required’.

def wirth(n):

a=1

b=2

c=3

chars = [a,b,c]

validseq = []

allseq = [[a],[b],[c]]

i=0

while i < n-1: ##build all combinations of the 3 characters

newseq = []

for e in allseq:

x = e+[a]

newseq.append(x)

y = e+[b]

newseq.append(y)

z = e+[c]

newseq.append(z)

allseq = newseq

i+=1

for e in allseq:

testresult = testvalidity(e,n)

if testresult:

validseq.append(testresult)

print validseq

def testvalidity(e, n):

i=1

while i < n:

j=0

while j < min(i, n-i):

if e[(i-1)-j:i] == e[i:i+j+1]:

return None

j += 1

i += 1

return e

My java solution here: (hamidj.wordpress.com)

My Mathematica translation of Wirth’s algorithm:

Just wondering if anyone has tested for a max N? My program generates quote a few solutions (non unique) with N=21, but zero at 22. Has anyone else noticed this or is there a problem with my program? Thanks.

If you follow the link from the exercise to codepad, and modify the script given there, you will find 691 solutions with n=22. Something is wrong with your program.

wow no one came up with a C++ version ? ;(

or even a C# version ?> im looking for a heuristic-algorithm to add its source to my code so that the heuristic function can be used