Compare Strings With One Error
January 29, 2016
At first I thought this was easy, but then I had some trouble writing it. After a couple of false starts, I settled on the following algorithm: First, if the lengths differ by more than one, I immediately report failure. Then, I extract the common prefixes and suffixes of the two strings. If the sum of their lengths is one less than the maximum length of the two strings, then they match with one error, and the length of the matching prefix is reported. Otherwise, either the two strings match or there is more than one mis-match, and failure is reported:
(define (match1 str1 str2) (let* ((cs1 (string->list str1)) (cs1-len (length cs1)) (cs2 (string->list str2)) (cs2-len (length cs2)) (min-len (min cs1-len cs2-len)) (max-len (max cs1-len cs2-len))) (if (< 1 (- max-len min-len)) #f (let* ((prefix (common char=? cs1 cs2)) (suffix (reverse (common char=? (reverse cs1) (reverse cs2)))) (pref-len (length prefix)) (suff-len (length suffix))) (if (= (- max-len pref-len suff-len) 1) pref-len #f)))))
We first convert the strings to lists of characters, because it is easier to handle lists than strings in Scheme, then use the common
function shown below to compute the common prefix and suffix of the two lists:
(define (common eql? xs ys) (let loop ((xs xs) (ys ys) (zs (list))) (cond ((or (null? xs) (null? ys)) (reverse zs)) ((not (eql? (car xs) (car ys))) (reverse zs)) (else (loop (cdr xs) (cdr ys) (cons (car xs) zs))))))
If the strings are very long, it would be better to work directly with the strings and avoid all those list-reversals, but reverse
is very fast in most Scheme implementations, so we won’t. Here are some examples:
> (match1 "ABCDE" "BCDE") 0 > (match1 "BCDE" "ABCDE") 0 > (match1 "ABCD" "ABCDE") 4 > (match1 "ABCDE" "ABDE") 2 > (match1 "ABDE" "ACD") #f > (match1 "ABCDE" "ADCBE") #f > (match1 "ABCDE" "ABCDE") #f > (match1 "ABCDE" "ABC") #f
The algorithm is O(n). You can run the program at http://ideone.com/rh78RN.
In Common Lisp, we can use MISMATCH to find where two sequences differ.
However, we have several cases to consider:
same length:
“abcde”
“abxde”
different lengths:
“abc”
“ab”
“abcde”
“abde”
“abc”
“bc”
Python version
#include
#include
#include
#include
using namespace std;
//Function incorporating Solution.
void Solution(void);
void main(void)
{
Solution();
}
void Solution(void)
{
int DiffCount = 0;
int LenA, LenB;
string A,B;
cin>>A>>B;
LenA = A.length(), LenB = B.length();
if ((LenA == LenB))
{
for (int X = 0; X = 0 && DiffCount!= 1) cout << "Error";
else cout << "No Error";
}
else if (abs(LenA – LenB) == 1)
{
for (int X = 0; X 0) cout << "Error";
else cout << "No Error";
}
else cout << "Error";
}
Python solution. Basic idea is to find where the strings don’t match. Then there are two cases:
1) s and t are the same length, so the mismatch must be a character substitution and the rest of the strings must match after skipping the mismatch (s[i+1:] == t[i+1:]); and
2) s is shorter than t, so the rest of s must match the rest of t after skipping a character (s[i:] == t[i+1:])