February 16, 2010
This task is harder than it looks; I will never admit how many wrong tries I made before I got a working result. Here’s my version:
(define (soundex str)
(define (code c)
(- (char->integer (char-upcase c)) 65)))
(define (finish zs)
(substring (list->string (reverse
(append (list #\0 #\0 #\0) zs))) 0 4))
(let* ((cs (cdr (map char-upcase (string->list str))))
(f (string-ref str 0)) (fx (code f))
(prev (if (char=? fx #\0) #\0 fx)))
(let loop ((cs cs) (zs (list f)) (prev prev))
(if (null? cs) (finish zs)
(let ((z (code (car cs))))
(cond ((char=? z prev) (loop (cdr cs) zs prev))
((char=? z #\0) (loop (cdr cs) zs #\0))
(else (loop (cdr cs) (cons z zs) z))))))))
code function maps letters according to the soundex rules, with zero representing an unused character. The
finish function extends or truncates the final result to four characters. The heart of the algorithm is in the last three lines, where the current character is skipped if it maps to the same soundex code as the previous character, or if it maps to zero, or otherwise is added to the accumulating output, and the previous character is updated.
This test is due to Knuth; calling
(test) should produce no output:
(let ((names1 (list "Euler" "Gauss" "Hilbert"
"Knuth" "Lloyd" "Lukasiewicz"))
(names2 (list "Ellery" "Ghosh" "Heilbronn"
"Kant" "Ladd" "Lissajous"))
(result (list "E460" "G200" "H416"
"K530" "L300" "L222")))
(assert (map soundex names1) result)
(assert (map soundex names2) result)))
Pages: 1 2