## Sorting By Frequency

### January 5, 2018

We have another homework question today, as I continue clearing out my backlog of people who sent me email asking for homework help last fall:

You are given an array with duplicates, and are to sort the array by decreasing frequency of elements. If two elements have the frequency, sort them in increasing order of value. For instance, the input (2 3 5 3 7 9 5 3 7) is sorted as (3 3 3 5 5 7 7 2 9).

Your task is to write a program that sorts by frequency. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

### 8 Responses to “Sorting By Frequency”

1. chaw said

Here is an O(n log n) solution (with the usual hash-table caveat)
using R7RS Scheme and a couple of popular SRFIs (for hash-tables and
sorting).

``` (import (scheme base) (scheme write) (srfi 69) (srfi 95)) (define input-101 '(2 3 5 3 7 9 5 3 7)) (define output-101 '(3 3 3 5 5 7 7 2 9)) (define (counts-ht nums) (let ((ht (make-hash-table))) (for-each (lambda (n) (hash-table-update!/default ht n (lambda (v) (+ v 1)) 0)) nums) ht)) (define (make-lt? nums) (let ((ht (counts-ht nums))) (lambda (a b) (let ((ca (hash-table-ref ht a)) (cb (hash-table-ref ht b))) (or (> ca cb) (and (= ca cb) (< a b))))))) (display (equal? (sort input-101 (make-lt? input-101)) output-101)) (newline) ```

2. ```
;; The example in the specification calls for a stable sort, therefore
;; we will have to keep the index of the first representant and use it
;; as secondary key in the sort operation.

(defun stable-sort-by-frequency (sequence lessp &key (key (function identity)))
"
Sorts the elements in the SEQUENCE based on the frequency of their KEY
and the position of their first representant in the SEQUENCE.
LESSP shall be an order function on REAL.
"
(flet ((compute-frequencies (sequence key)
(let ((table (make-hash-table))
(index -1))
(map nil (lambda (item)
(incf index)
(let ((item-key (funcal key item)))
(unless (gethash item-key table)
(setf (gethash item-key table) (list index)))
(push item (gethash item-key table))))
sequence)
table))
(wrap (table)
(let ((frequencies (make-array (hash-table-count table)))
(i -1))
(maphash (lambda (k v)
(declare (ignore k))
(let ((r (nreverse v)))
(setf (aref frequencies (incf i))
(cons (length (cdr r)) r))))
table)
frequencies))
(sort-frequencies (frequencies lessp)
(sort frequencies
(lambda (a b)
(cond
((funcall lessp (first a) (first b)) t)
((funcall lessp (first b) (first a)) nil)
(t                                   (< (second a) (second b)))))))
(unwrap (result frequencies)
(if (listp result)
(loop
:with current := result
:for (count nil . elements) :across frequencies
:do (replace current elements)
(setf current (nthcdr count current)))
(loop
:with start := 0
:for (count nil . elements) :across frequencies
:do (replace result elements :start1 start)
(incf start count)))
result))
(unwrap sequence
(sort-frequencies (wrap (compute-frequencies sequence key)) lessp))))

(assert (equalp (stable-sort-by-frequency (list 1 2 3 4 5 6 7)       '< :key (function oddp)) '(2 4 6 1 3 5 7)))
(assert (equalp (stable-sort-by-frequency (list 0 1 2 3 4 5 6 7 8)   '< :key (function oddp)) '(1 3 5 7 0 2 4 6 8)))
(assert (equalp (stable-sort-by-frequency (vector 2 3 5 3 7 9 5 3 7) '>) #(3 3 3 5 5 7 7 2 9)))

```
3. matthew said

I don’t see that append or mappend should be quadratic. Classic implementation: scan forwards down list, but build up result in reverse.

4. Globules said

Here’s a Haskell version. First we create a mapping from each list element to a count of how many time it occurs. We convert the map to its corresponding list of key/value pairs, which is sorted using a comparison function that ensures the order we want. Finally, each pair (x, n) in the sorted list is replaced by a list of n copies of x, then the result is concatenated.

```import Data.Monoid ((<>))
import Data.Map.Strict (assocs, fromListWith)
import Data.List (sortBy)

-- Sort a list first in order of decreasing frequency of the elements, with
-- equal elements ordered by decreasing value.
freqSort :: Ord a => [a] -> [a]
freqSort = concatMap (\(x, n) -> replicate n x) .
sortBy (\(a, b) (c, d) -> compare d b <> compare a c) .
assocs .
fromListWith (+) .
flip zip (repeat 1)

main :: IO ()
main = print \$ freqSort  [2, 3, 5, 3, 7, 9, 5, 3, 7]
```
```\$ ./freqsort
[3,3,3,5,5,7,7,2,9]
```
5. Daniel said

Here’s a solution in Python.

```from collections import Counter

def sort_by_frequency(l):
counts = Counter(l)
key = lambda x: (-counts.get(x), x)
return sorted(l, key=key)

l = [2, 3, 5, 3, 7, 9, 5, 3, 7]
print sort_by_frequency(l)
```

Output:

```[3, 3, 3, 5, 5, 7, 7, 2, 9]
```
6. Paul said

Another solution in Python. This one sorts the items of the Counter, not all items from the original array.

```def sort_by_frequency(arr):
c = sorted(Counter(arr).items(), key=lambda vn: (-vn, vn))
result = []
for v, n in c:
result += [v] * n
return result

```
7. matthew said

Here’s another Haskell solution then. Sort the list, group into sublists of identical element. Concatenate while wrapping each element in a tuple with the count of identical elements (this is negated so we can use the standard tuple ordering for sorting). Sort and drop the count. Seems to run in about the same time as Globules’ solution above, depending on the amount of duplication.

```{-# LANGUAGE TupleSections #-}
import Data.List (sort,group)
f :: Ord a => [a] -> [a]
f = map snd . sort . concatMap h . group . sort where h s = map (-length s,) s
main = print \$ f [2,3,5,3,7,9,5,3,7]
```
8. Eric said

Here are two Python approaches with and without Counter respectively. Although less elegant, the latter is more fun trying to figure out.

Sol. 1 (with Counter)

```from collections import Counter

def sort_by_freq(list_in):
list = []
for x,y in Counter(sorted(list_in)).most_common():
list += [x]*y

return list

l = [2, 3, 5, 3, 7, 9, 5, 3, 7]
print(sort_by_freq(l))
```

Sol.2 (without Counter)

```def sort_by_freq(list_in):
ll = len(list_in)
freq = [list_in.count(i) for i in range(10)]
numbf = [[i for i,x in enumerate(freq) if x == ll-j] for j in range(ll)]
list = []
for i in range(ll):
list += sorted(numbf[-(ll-i)]*(ll-i))

return list

l = [2, 3, 5, 3, 7, 9, 5, 3, 7]
print(sort_by_freq(l))
```