Run Length Encoding
February 9, 2021
We will solve the problem three times in less than twenty-five minutes. Here is the first solution, which uses uniq-c
from the Standard Prelude:
(define (f str) (uniq-c char=? (string->list str))) > (f "aaaabbbcca") ((#\a . 4) (#\b . 3) (#\c . 2) (#\a . 1))
The linked article that was the source of the problem gives this solution, which uses group-by
from a previous exercise:
(define (chop-chop str) (let ((xs (group-by char=? (string->list str)))) (map list (map (compose string car) xs) (map length xs))))
> (chop-chop "aaaabbbcca") (("a" 4) ("b" 3) ("c" 2) ("a" 1))
This is the third exercise where we used group-by
, so I’ll add it to the Standard Prelude.
The third method scans the string from left to right, accumulating output every time the current character differs from its predecessor; this is probably the solution Alexey expects candidates to offer:
(define (rle str) (let loop ((cs (string->list str)) (prev #f) (count 0) (output (list))) (cond ((not prev) ; start input (if (null? cs) output (loop (cdr cs) (car cs) 1 output))) ((null? cs) ; end of input (reverse (cons (cons prev count) output))) ((char=? (car cs) prev) ; continue current run (loop (cdr cs) prev (+ count 1) output)) (else ; end current run, start another (loop (cdr cs) (car cs) 1 (cons (cons prev count) output))))))
> (rle "aaaabbbcca") ((#\a . 4) (#\b . 3) (#\c . 2) (#\a . 1))
You can run all three programs at https://ideone.com/gpOaZy.
First thing that came to mind. In Ruby.
Output:
Here is a short and simple solution in standard R7RS Scheme using fold
from SRFI 1.
Output:
Racket has splitf-at the same as split-while in your Prelude and I like :
Interesting drill. Although I don’t find RLE a serious contender when it comes to compression algos, it’s a fun exercise. Here is my take on it in Julia 1.5: https://pastebin.com/3hiUsc4v
Cheers
Here’s a solution in Python.
Output:
Here’s a Haskell version.
A solution in Racket:
Example:
Here’s are two solutions in C. The first streams the output while looping over the input string. The second creates an intermediate array with the character counts.
Example usage (same for both programs):