String Prefixes
March 8, 2016
The exercise talks about strings, which can be considered as lists of characters, but it is easier to solve the more general problem of finding the longest prefix of any list of lists of any kind, then specialize to strings. Here is a function that computes the common prefix of two lists, using a user-defined equality function:
(define (prefix-two eql? x1 x2)
(let loop ((x1 x1) (x2 x2) (zs (list)))
(cond ((or (null? x1) (null? x2))
(reverse zs))
((eql? (car x1) (car x2))
(loop (cdr x1) (cdr x2)
(cons (car x1) zs)))
(else (reverse zs)))))
Finding the prefix of several items is just a fold on the prefix-two function shown above:
(define (prefix eql? xss)
(if (null? xss)
(error 'prefix "no input")
(fold-left (lambda (x1 x2)
(prefix-two eql? x1 x2))
(car xss) (cdr xss))))
The type of xss is a list of lists. We throw an error if the input is empty. Finally we can write the requested program:
(define (string-prefix xs)
(list->string
(prefix char=?
(map string->list xs))))
Here’s an example:
> (string-prefix '(
"I love cats"
"I love dogs"
"I love my daughters"))
"I love "
We used fold-left from the Standard Prelude. You can run the program at http://ideone.com/Oj8ezv.
In Python.
def string_prefix(seq): longest_prefix = [] for e in zip(*seq): a = e[0] if any(a != ei for ei in e[1:]): break longest_prefix.append(a) return "'" + "".join(longest_prefix) + "'"import itertools def longest_common_prefix(a,b): return "".join([x[0] for x in itertools.takewhile(lambda x: x[0]==x[1], zip(list(a), list(b)))]) print longest_common_prefix("123", "125") print longest_common_prefix("123", "1234") print longest_common_prefix("i love coding", "i love tesiut")In Python’s standard library.
@Jussi: Cool solution. I did not know os.path.commonprefix.
Except of course that strings aren’t lists in Scheme, but more like specialized vectors, so you have to iterate on indexes rather than using car-cdr deconstruction. (Though there was a Lisp at one time that provided string functions “char” and “chdr”.) Fortunately, SRFI 13 has string-prefix-length, so we can write (substring s1 0 (string-prefix-length s1 s2)). String-fold is also available in SRFI 13 and R7RS.
#include <stdio.h> #include <string.h> template<typename T> void f(const T *a, int &n, const T **p, const T **q) { if (q-p == 1) { for (int i = 0; i < n; i++) { if (a[i] != (*p)[i]) { n = i; break; } } } else { T **r = p+(q-p)/2; f(a,n,p,r); f(a,n,r,q); } } int main(int argc, const char *argv[]) { int n = strlen(argv[1]); f(argv[1],n,argv+2,argv+argc); printf("%.*s\n",n,n,argv[1]); }Oops, that version isn’t quite right, it should be:
#include <stdio.h> #include <string.h> template<typename T> void f(const T *a, int &n, const T **p, const T **q) { if (q-p == 1) { for (int i = 0; i < n; i++) { if (a[i] != (*p)[i]) { n = i; break; } } } else { const T **r = p+(q-p)/2; f(a,n,p,r); f(a,n,r,q); } } int main(int argc, const char *argv[]) { int n = strlen(argv[1]); f(argv[1],n,argv+2,argv+argc); printf("%.*s\n",n,argv[1]); }def str_prefix(s1, s2): i = 0 k = min(len(s1), len(s2)) while i < k and s1[i] == s2[i]: i += 1 return s1[0:i] def seq_prefix(seq): p = None for s in seq: if p is None: p = s else: p = str_prefix(p, s) return p or '' if __name__ == '__main__': print(seq_prefix(['I love cats', 'I love dogs', 'I love pasta']))