String-Replace
May 22, 2015
Let’s first examine the claim that the function from the previous exercise takes quadratic time. Here’s the function:
(define (string-replace str pat rep) ; quadratic
  (let ((x (string-find pat str)))
    (if (not x) str
        (string-append (substring str 0 x) rep
          (string-replace
            (substring str (+ x (string-length pat)) (string-length str))
            pat rep)))))
It’s quadratic because each recursive call copies the entire remaining string to build the substring it recurses on, and string-append copies the whole accumulated result again as the recursion unwinds; with a match at every position of a string of length n, the copying alone amounts to roughly n + (n-1) + ... + 1 ≈ n²/2 character operations. Here we demonstrate the quadratic time complexity:
> (time (begin (string-replace (make-string 2000 #\a) "a" "b") 'done))
(time (begin (string-replace (...) ...) ...))
2 collections
16 ms elapsed cpu time, including 0 ms collecting
22 ms elapsed real time, including 6 ms collecting
17764912 bytes allocated, including 15202400 bytes reclaimed
done
> (time (begin (string-replace (make-string 4000 #\a) "a" "b") 'done))
(time (begin (string-replace (...) ...) ...))
8 collections
78 ms elapsed cpu time, including 47 ms collecting
85 ms elapsed real time, including 31 ms collecting
67530192 bytes allocated, including 84461056 bytes reclaimed
done
> (time (begin (string-replace (make-string 8000 #\a) "a" "b") 'done))
(time (begin (string-replace (...) ...) ...))
27 collections
405 ms elapsed cpu time, including 188 ms collecting
421 ms elapsed real time, including 209 ms collecting
263062416 bytes allocated, including 133705904 bytes reclaimed
done
When the length of the string doubles from 2000 to 4000, execution time increases by a factor of 78 / 16 ≈ 4.9, and when it doubles again from 4000 to 8000, by a factor of 405 / 78 ≈ 5.2. A factor of about 4 per doubling is the signature of quadratic growth; the excess here is mostly garbage collection. And look at all of the garbage that is generated!
A better approach is to scan the string from left to right looking for matches, collecting the pieces of the output in a list and assembling the new string with a single string-append at the end:
(define (string-replace str pat rep) ; linear
  (let ((str-len (string-length str))
        (pat-len (string-length pat)))
    (let loop ((pos 0) (xs (list)))     ; xs collects output pieces in reverse
      (if (<= str-len pos)
          (apply string-append (reverse xs))
          (let ((x (string-find pat str pos)))
            (if x
                (loop (+ x pat-len)     ; continue just past the match
                      (cons rep (cons (substring str pos x) xs)))
                (loop str-len (cons (substring str pos str-len) xs))))))))
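As a quick sanity check on ordinary input (not just the all-"a" strings used for timing below), and assuming string-find returns the index of the first match at or after the given start position, or #f when there is none, we would expect interactions like these:

> (string-replace "banana" "an" "AN")
"bANANa"
> (string-replace "mississippi" "ss" "S")
"miSiSippi"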
And here are some timings:
> (time (begin (string-replace (make-string 25000 #\a) "a" "b") 'done))
(time (begin (string-replace (...) ...) ...))
3 collections
46 ms elapsed cpu time, including 0 ms collecting
42 ms elapsed real time, including 6 ms collecting
22201088 bytes allocated, including 24727008 bytes reclaimed
done
> (time (begin (string-replace (make-string 50000 #\a) "a" "b") 'done))
(time (begin (string-replace (...) ...) ...))
5 collections
78 ms elapsed cpu time, including 0 ms collecting
83 ms elapsed real time, including 8 ms collecting
45713280 bytes allocated, including 174417216 bytes reclaimed
done
> (time (begin (string-replace (make-string 100000 #\a) "a" "b") 'done))
(time (begin (string-replace (...) ...) ...))
11 collections
172 ms elapsed cpu time, including 46 ms collecting
174 ms elapsed real time, including 22 ms collecting
93394016 bytes allocated, including 90342368 bytes reclaimed
done
This is clearly linear. At the first doubling of the string length, execution time increased by 78 / 46 ≈ 1.7 times, and at the second doubling by 172 / 78 ≈ 2.2 times, both close to the factor of 2 expected of linear growth. And the execution times are very much faster; when I tried the timing test on the quadratic version with a 40,000-character string, the machine crunched for several minutes, then died.
We used string-find from the Standard Prelude. You can run the program at http://ideone.com/tRPxfW.
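If you don’t have the Standard Prelude handy, here is a minimal sketch of a string-find with the interface the code above assumes: it takes the pattern, the string, and an optional start position, and returns the index of the first match or #f if there is none. The Standard Prelude’s version is more sophisticated; this naive scan is only for illustration.

(define (string-find pat str . s) ; naive sketch of the assumed interface
  (let ((pat-len (string-length pat))
        (str-len (string-length str))
        (start (if (null? s) 0 (car s))))
    (let loop ((i start))
      (cond ((< (- str-len i) pat-len) #f)  ; not enough characters left
            ((string=? pat (substring str i (+ i pat-len))) i)
            (else (loop (+ i 1)))))))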
C++ implementation for Python.
[sourcecode lang="C++"]
/* len(self)>=1, len(from)==len(to)>=2, maxcount>=1 */
Py_LOCAL(PyStringObject *)
replace_substring_in_place(PyStringObject *self,
                           const char *from_s, Py_ssize_t from_len,
                           const char *to_s, Py_ssize_t to_len,
                           Py_ssize_t maxcount)
{
    char *result_s, *start, *end;
    char *self_s;
    Py_ssize_t self_len, offset;
    PyStringObject *result;

    /* The result string will be the same size */
    self_s = PyString_AS_STRING(self);
    self_len = PyString_GET_SIZE(self);

    offset = stringlib_find(self_s, self_len,
                            from_s, from_len,
                            0);
    if (offset == -1) {
        /* No matches; return the original string */
        return return_self(self);
    }

    /* Need to make a new string */
    result = (PyStringObject *) PyString_FromStringAndSize(NULL, self_len);
    if (result == NULL)
        return NULL;
    result_s = PyString_AS_STRING(result);
    Py_MEMCPY(result_s, self_s, self_len);

    /* change everything in-place, starting with this one */
    start = result_s + offset;
    Py_MEMCPY(start, to_s, from_len);
    start += from_len;
    end = result_s + self_len;

    while (--maxcount > 0) {
        offset = stringlib_find(start, end-start,
                                from_s, from_len,
                                0);
        if (offset == -1)
            break;
        Py_MEMCPY(start+offset, to_s, from_len);
        start += offset+from_len;
    }

    return result;
}
[/sourcecode]
This looked so suitable for C that I couldn’t resist trying it. The result is at http://codepad.org/glHPVdoL.
On a first run, the times were about ten times longer than those above, not what I expected. It turned out that my library’s strstr function was quadratic in the string length. Replacing strstr with local code produced times about 20 times faster than those above, using about 200 times less memory.
So even when coding close to the machine, you can’t take the performance of a simple library function for granted.