Sorting By Frequency

January 5, 2018

We have another homework question today, as I continue clearing out my backlog of people who sent me email asking for homework help last fall:

You are given an array with duplicates, and are to sort the array by decreasing frequency of elements. If two elements have the frequency, sort them in increasing order of value. For instance, the input (2 3 5 3 7 9 5 3 7) is sorted as (3 3 3 5 5 7 7 2 9).

Your task is to write a program that sorts by frequency. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


Pages: 1 2

7 Responses to “Sorting By Frequency”

  1. chaw said

    Here is an O(n log n) solution (with the usual hash-table caveat)
    using R7RS Scheme and a couple of popular SRFIs (for hash-tables and

    (import (scheme base)
            (scheme write)
            (srfi 69)
            (srfi 95))
    (define input-101 '(2 3 5 3 7 9 5 3 7))
    (define output-101 '(3 3 3 5 5 7 7 2 9))
    (define (counts-ht nums)
      (let ((ht (make-hash-table)))
        (for-each (lambda (n)
                    (hash-table-update!/default ht n (lambda (v) (+ v 1)) 0))
    (define (make-lt? nums)
      (let ((ht (counts-ht nums)))
        (lambda (a b)
          (let ((ca (hash-table-ref ht a))
                (cb (hash-table-ref ht b)))
            (or (> ca cb)
                (and (= ca cb)
                     (< a b)))))))
    (display (equal? (sort input-101 (make-lt? input-101))

    ;; The example in the specification calls for a stable sort, therefore
    ;; we will have to keep the index of the first representant and use it
    ;; as secondary key in the sort operation.
    (defun stable-sort-by-frequency (sequence lessp &key (key (function identity)))
    Sorts the elements in the SEQUENCE based on the frequency of their KEY
    and the position of their first representant in the SEQUENCE.
    LESSP shall be an order function on REAL.
      (flet ((compute-frequencies (sequence key)
               (let ((table (make-hash-table))
                     (index -1))
                 (map nil (lambda (item)
                            (incf index)
                            (let ((item-key (funcal key item)))
                              (unless (gethash item-key table)
                                (setf (gethash item-key table) (list index)))
                              (push item (gethash item-key table))))
             (wrap (table)
               (let ((frequencies (make-array (hash-table-count table)))
                     (i -1))
                 (maphash (lambda (k v)
                            (declare (ignore k))
                            (let ((r (nreverse v)))
                              (setf (aref frequencies (incf i))
                                    (cons (length (cdr r)) r))))
             (sort-frequencies (frequencies lessp)
               (sort frequencies
                     (lambda (a b)
                         ((funcall lessp (first a) (first b)) t)
                         ((funcall lessp (first b) (first a)) nil)
                         (t                                   (< (second a) (second b)))))))
             (unwrap (result frequencies)
               (if (listp result)
                     :with current := result
                     :for (count nil . elements) :across frequencies
                     :do (replace current elements)
                         (setf current (nthcdr count current)))
                     :with start := 0
                     :for (count nil . elements) :across frequencies
                     :do (replace result elements :start1 start)
                         (incf start count)))
        (unwrap sequence
                (sort-frequencies (wrap (compute-frequencies sequence key)) lessp))))
    (assert (equalp (stable-sort-by-frequency (list 1 2 3 4 5 6 7)       '< :key (function oddp)) '(2 4 6 1 3 5 7)))
    (assert (equalp (stable-sort-by-frequency (list 0 1 2 3 4 5 6 7 8)   '< :key (function oddp)) '(1 3 5 7 0 2 4 6 8)))
    (assert (equalp (stable-sort-by-frequency (vector 2 3 5 3 7 9 5 3 7) '>) #(3 3 3 5 5 7 7 2 9)))
  3. matthew said

    I don’t see that append or mappend should be quadratic. Classic implementation: scan forwards down list, but build up result in reverse.

  4. Globules said

    Here’s a Haskell version. First we create a mapping from each list element to a count of how many time it occurs. We convert the map to its corresponding list of key/value pairs, which is sorted using a comparison function that ensures the order we want. Finally, each pair (x, n) in the sorted list is replaced by a list of n copies of x, then the result is concatenated.

    import Data.Monoid ((<>))
    import Data.Map.Strict (assocs, fromListWith)
    import Data.List (sortBy)
    -- Sort a list first in order of decreasing frequency of the elements, with
    -- equal elements ordered by decreasing value.
    freqSort :: Ord a => [a] -> [a]
    freqSort = concatMap (\(x, n) -> replicate n x) .
               sortBy (\(a, b) (c, d) -> compare d b <> compare a c) .
               assocs .
               fromListWith (+) .
               flip zip (repeat 1)
    main :: IO ()
    main = print $ freqSort  [2, 3, 5, 3, 7, 9, 5, 3, 7]
    $ ./freqsort 
  5. Daniel said

    Here’s a solution in Python.

    from collections import Counter
    def sort_by_frequency(l):
        counts = Counter(l)
        key = lambda x: (-counts.get(x), x)
        return sorted(l, key=key)
    l = [2, 3, 5, 3, 7, 9, 5, 3, 7]
    print sort_by_frequency(l)


    [3, 3, 3, 5, 5, 7, 7, 2, 9]
  6. Paul said

    Another solution in Python. This one sorts the items of the Counter, not all items from the original array.

    def sort_by_frequency(arr):
        c = sorted(Counter(arr).items(), key=lambda vn: (-vn[1], vn[0]))
        result = []
        for v, n in c:
            result += [v] * n
        return result
  7. matthew said

    Here’s another Haskell solution then. Sort the list, group into sublists of identical element. Concatenate while wrapping each element in a tuple with the count of identical elements (this is negated so we can use the standard tuple ordering for sorting). Sort and drop the count. Seems to run in about the same time as Globules’ solution above, depending on the amount of duplication.

    {-# LANGUAGE TupleSections #-}
    import Data.List (sort,group)
    f :: Ord a => [a] -> [a]
    f = map snd . sort . concatMap h . group . sort where h s = map (-length s,) s
    main = print $ f [2,3,5,3,7,9,5,3,7]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: