String Prefixes
March 8, 2016
Two strings have a common prefix that consists of the longest prefix of the strings that is the same. For instance, the strings “I love cats” and “I love dogs” have the common prefix “I love ” (including a trailing space at the end of love, which doesn’t appear properly in some browsers).
Your task is to write a program that finds the common prefix of a list of strings (possibly more than two strings). When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
In Python.
def string_prefix(seq): longest_prefix = [] for e in zip(*seq): a = e[0] if any(a != ei for ei in e[1:]): break longest_prefix.append(a) return "'" + "".join(longest_prefix) + "'"import itertools def longest_common_prefix(a,b): return "".join([x[0] for x in itertools.takewhile(lambda x: x[0]==x[1], zip(list(a), list(b)))]) print longest_common_prefix("123", "125") print longest_common_prefix("123", "1234") print longest_common_prefix("i love coding", "i love tesiut")In Python’s standard library.
@Jussi: Cool solution. I did not know os.path.commonprefix.
Except of course that strings aren’t lists in Scheme, but more like specialized vectors, so you have to iterate on indexes rather than using car-cdr deconstruction. (Though there was a Lisp at one time that provided string functions “char” and “chdr”.) Fortunately, SRFI 13 has string-prefix-length, so we can write (substring s1 0 (string-prefix-length s1 s2)). String-fold is also available in SRFI 13 and R7RS.
#include <stdio.h> #include <string.h> template<typename T> void f(const T *a, int &n, const T **p, const T **q) { if (q-p == 1) { for (int i = 0; i < n; i++) { if (a[i] != (*p)[i]) { n = i; break; } } } else { T **r = p+(q-p)/2; f(a,n,p,r); f(a,n,r,q); } } int main(int argc, const char *argv[]) { int n = strlen(argv[1]); f(argv[1],n,argv+2,argv+argc); printf("%.*s\n",n,n,argv[1]); }Oops, that version isn’t quite right, it should be:
#include <stdio.h> #include <string.h> template<typename T> void f(const T *a, int &n, const T **p, const T **q) { if (q-p == 1) { for (int i = 0; i < n; i++) { if (a[i] != (*p)[i]) { n = i; break; } } } else { const T **r = p+(q-p)/2; f(a,n,p,r); f(a,n,r,q); } } int main(int argc, const char *argv[]) { int n = strlen(argv[1]); f(argv[1],n,argv+2,argv+argc); printf("%.*s\n",n,argv[1]); }def str_prefix(s1, s2): i = 0 k = min(len(s1), len(s2)) while i < k and s1[i] == s2[i]: i += 1 return s1[0:i] def seq_prefix(seq): p = None for s in seq: if p is None: p = s else: p = str_prefix(p, s) return p or '' if __name__ == '__main__': print(seq_prefix(['I love cats', 'I love dogs', 'I love pasta']))