Sorting Without Duplicates, Revisited

February 10, 2017

In the comments to the previous exercise, reader Kevin gave a beautiful solution in Haskell; it sorts the input items, groups them into sub-lists of like items, transposes the rows and columns of the sub-lists, and concatenates the sub-lists back into a single list:

import Data.List
sortwodup :: (Ord a) => [a] -> [a]
sortwodup = concat . transpose . group . sort

Here’s the pipeline of functions applied to the sample input (2 9 1 5 1 4 9 7 2 1 4):

Sort converts the input list to (1 1 1 2 2 4 4 5 7 9 9).

Group converts the output of sort to a list of sub-lists: ((1 1 1) (2 2) (4 4) (5) (7) (9 9)).

Transpose swaps rows and columns, ignoring missing items, like this:

    1 1 1        1 2 4 5 7 9
    2 2          1 2 4 9
    4 4          1
    5
    7
    9 9

Thus, the output from transpose is ((1 2 4 5 7 9) (1 2 4 9) (1)). Finally, concatenate removes the sub-list structure, producing (1 2 4 5 7 9 1 2 4 9 1).

Your task is to write a Haskell-style program to solve the problem of sorting a list, moving duplicates to the end. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisement

Pages: 1 2

5 Responses to “Sorting Without Duplicates, Revisited”

  1. Gambiteer said

    Don’t you want
    “`
    (define (transpose l) (apply map list l))
    “`
    ?

  2. programmingpraxis said

    @Gambiteer: Your suggestion works if the list of lists is rectangular. For instance, (apply map list ‘((1 2 3) (4 5 6))) gives ((1 4) (2 5) (3 6)). But the result of the group-by is not necessarily rectangular. For instance, in the sample problem, the result of (group-by = (sort < '(2 9 1 5 1 4 9 7 2 1 4))) is ((1 1 1) (2 2) (4 4) (5) (7) (9 9)), which is not rectangular. The function you suggest is in the Standard Prelude with the name zip, where it would be called as (zip '(1 2 3) '(4 5 6)).

  3. Globules said

    A Haskell version. The sorting and grouping are the same as in the previous problem. The foldr builds up two lists: the first consists of the head of each sublist produced by group; the second is the remaining elements (if any) of the sublists. Finally, we append the two lists from foldr.

    With warnings enabled the compiler will complain that step doesn’t include a pattern match on [] (the empty list), but it can never be called that way since the sublists produced by group are always non-empty.

    import Data.List (group, sort)
    
    sortAppendDups :: Ord a => [a] -> [a]
    sortAppendDups = uncurry (++) . foldr step ([],[]) . group . sort
      where step (x:xs) (ys, zs) = (x:ys, xs ++ zs)
    
    main :: IO ()
    main = print $ sortAppendDups [2, 9, 1, 5, 1, 4, 9, 7, 2, 1, 4 :: Int]
    
    $ ./sortAppendDups 
    [1,2,4,5,7,9,1,1,2,4,9]
    
  4. Mike said

    Equivalent Python version.

    Uses the ’roundrobin’ recipe from the itertools documentation (https://docs.python.org/3.5/library/itertools.html#itertools-recipes)

    
    from itertools import groupby
    from itertools_recipes import roundrobin
    
    def sortwithoutdups2(lst):
        return list(roundrobin(*(list(v) for _,v in groupby(sorted(d)))))
    
    
  5. Mike said

    Fixed a typo

    
    from itertools import groupby
    from itertools_recipes import roundrobin
     
    def sortwithoutdups2(lst):
        return list(roundrobin(*(list(v) for _,v in groupby(sorted(lst)))))
    
    

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: