Sorting Without Duplicates, Revisited

February 10, 2017

This is a fun exercise because it makes us think about programs in a different way and because it provides two functions that can profitably be re-used in other programs. We start at the end, so you can see where we are going:

(define (sort-no-dups xs)
  (apply append (transpose (group-by = (sort < xs)))))

Sort is provided by most Scheme systems; if yours doesn’t have it, there is one in the Standard Prelude. Group-by repeatedly strips the common prefix from a list, producing a list of like sub-lists; it calls split-while from the Standard Prelude:

(define (group-by pred? xs)
  (let loop ((xs xs) (zss (list)))
    (if (null? xs) (reverse zss)
      (call-with-values
        (lambda () (split-while (lambda (x) (pred? x (car xs))) xs))
        (lambda (head tail) (loop tail (cons head zss)))))))

Transpose repeatedly strips the car from each sub-list, being careful to remove empty sub-lists at each step:

(define (transpose xss)
  (if (null? xss) (list)
    (cons (map car xss) (transpose (filter pair? (map cdr xss))))))

And that’s it. Here’s an example:

> (sort-no-dups '(2 9 1 5 1 4 9 7 2 1 4))
(1 2 4 5 7 9 1 2 4 9 1)

Group-by and transpose are useful tools to keep in your toolbox. You can run the program at http://ideone.com/BtPE5F.

Posted by programmingpraxis

Filed in Exercises

5 Comments »

5 Responses to “Sorting Without Duplicates, Revisited”

Gambiteer said
February 11, 2017 at 1:43 AM
Don’t you want
“`
(define (transpose l) (apply map list l))
“`
?
programmingpraxis said
February 11, 2017 at 2:21 AM
@Gambiteer: Your suggestion works if the list of lists is rectangular. For instance, (apply map list ‘((1 2 3) (4 5 6))) gives ((1 4) (2 5) (3 6)). But the result of the group-by is not necessarily rectangular. For instance, in the sample problem, the result of (group-by = (sort < '(2 9 1 5 1 4 9 7 2 1 4))) is ((1 1 1) (2 2) (4 4) (5) (7) (9 9)), which is not rectangular. The function you suggest is in the Standard Prelude with the name zip, where it would be called as (zip '(1 2 3) '(4 5 6)).
Globules said
February 12, 2017 at 4:19 AM
A Haskell version. The sorting and grouping are the same as in the previous problem. The foldr builds up two lists: the first consists of the head of each sublist produced by group; the second is the remaining elements (if any) of the sublists. Finally, we append the two lists from foldr.

With warnings enabled the compiler will complain that step doesn’t include a pattern match on [] (the empty list), but it can never be called that way since the sublists produced by group are always non-empty.
```
import Data.List (group, sort)

sortAppendDups :: Ord a => [a] -> [a]
sortAppendDups = uncurry (++) . foldr step ([],[]) . group . sort
  where step (x:xs) (ys, zs) = (x:ys, xs ++ zs)

main :: IO ()
main = print $ sortAppendDups [2, 9, 1, 5, 1, 4, 9, 7, 2, 1, 4 :: Int]
```
```
$ ./sortAppendDups 
[1,2,4,5,7,9,1,1,2,4,9]
```
Mike said
February 17, 2017 at 11:23 PM
Equivalent Python version.

Uses the ’roundrobin’ recipe from the itertools documentation (https://docs.python.org/3.5/library/itertools.html#itertools-recipes)
```
from itertools import groupby
from itertools_recipes import roundrobin

def sortwithoutdups2(lst):
    return list(roundrobin(*(list(v) for _,v in groupby(sorted(d)))))
```

Mike said

February 17, 2017 at 11:26 PM

Fixed a typo


from itertools import groupby
from itertools_recipes import roundrobin
 
def sortwithoutdups2(lst):
    return list(roundrobin(*(list(v) for _,v in groupby(sorted(lst)))))

S	M	T	W	T	F	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28

Programming Praxis