String Re-ordering
March 27, 2015
The obvious answer uses a decorate-sort-undecorate algorithm; here we add an order to each letter, forming an order/letter pair, sort by order, then extract the letters from the pairs:
(define (order1 t o)
(let loop ((ts (string->list t)) (ps (list)))
(if (pair? ts)
(loop (cdr ts) (cons (cons (index (car ts) o) (car ts)) ps))
(list->string
(map cdr (sort (lambda (a b) (< (car a) (car b))) ps))))))
The index
function is a variant of the string-index
function from the Standard Prelude that returns -1 instead of #f
(define (index c str)
(let ((x (string-index c str)))
(if x x -1)))
Here are some samples:
> (order1 "hello" "eloh")
"elloh"
> (order1 "help" "eloh")
"pelh"
That solution operates in time O(n log n) due to the sort, where n is the length of t. A better solution uses bucket sort to count characters in the ordering string, passing the unmatched characters directly to the output, then emitting the characters of each bucket in order; that will operate in O(n):
(define (order2 t o)
(define c->i char->integer)
(let ((buckets (make-vector 256 #f)))
(do ((os (string->list o) (cdr os))) ((null? os))
(vector-set! buckets (c->i (car os)) 0))
(let loop ((ts (string->list t)) (out (list)))
(cond ((null? ts) (build o buckets out))
((vector-ref buckets (c->i (car ts)))
(vector-set! buckets (c->i (car ts))
(+ (vector-ref buckets (c->i (car ts))) 1))
(loop (cdr ts) out))
(else (loop (cdr ts) (cons (car ts) out)))))))
(define (build o buckets out)
(list->string (apply append (reverse out)
(map (lambda (c)
(make-list (vector-ref buckets (char->integer c)) c))
(string->list o)))))
There’s a lot going on there, so let’s break it down. The do
loop initializes the bucket counts for the characters in the order string. The named let
scans through the target string and, for each character in the order string, increments the count. Then the auxiliary function build
appends the unordered characters in out
to the appropriate number of copies of each character in the order string, in order. Here are some samples:
> (order2 "hello" "eloh")
"elloh"
> (order2 "help" "eloh")
"pelh"
Although it’s not obvious here, the unordered characters are in the output in the same order they appeared in the input.
We used make-list
and string-index
from the Standard Prelude. You can run the program at http://ideone.com/WalkkG.
A solution using PHP >= 5.3: http://3v4l.org/vnoC1
In Python.
Haskell:
In Python the string.find() method returns the index of a character or -1 if a letter is not found in the string.
def reorder(t, o):
return ”.join(sorted(t, key=o.find))
This time with formatting:
An alternative Python solution.
The OrderedCounter counts the number of each character seen in the input, and remembers the order in which they are first seen.
Then each character in the ‘order’ string is moved to the end of the counter.
OrderedCounter.elements() returns each character repeated the number of times it was seen.
Bucket sort a nice idea. Here’s an in-place solution, scan the first string recording which characters are present, then scan the second string, moving uncounted characters to the front, then scan the first again, copying the right number of characters into the second. I’ve ignored problems with using a char to index an array, (don’t do this at work):
With Python list comprehensions, first gather the characters not in the ordering string and then the ones from the ordering string in their order.
def substr(ostr,tstr):
wstr = [x for x in tstr if x not in ostr]
for x in ostr:
wstr += [y for y in tstr if y == x]
return ”.join(wstr)
if __name__ == ‘__main__’:
ostr = “eloh”
tstr = “hello”
print substr(ostr,tstr)
A perl solution:
use Test::More;
is reorder(“hello”, “eloh”), “elloh”;
is reorder(“help”, “eloh”), “pelh”;
sub reorder {
my ($target, $order) = @_;
return join ”, sort { index($order,$a) index($order,$b) } split //, $target;
}
done_testing;
I have ported the given scheme solution using decorate-sort-undecorate algorithm to java:
This is a modified version of the solution posted by Krishnan R.
Bucketsort version:
A not particularly efficient but straightforward C++(11) version:
Java8: