Inversions
October 31, 2017
In an array A, an inversion is a pair of items that are out of order: each pair of indices i < j for which A[j] < A[i] counts as a single inversion. For instance, the array [1 4 3 2 5] has three inversions: (4,3), (4,2) and (3,2). Inversions are a measure of the disorder of an array: an ordered array has no inversions and a reversed array has the maximum possible n(n−1)/2 inversions.
Your task is to write programs that count the number of inversions in an array and make an enumeration of them. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
In Python using a modified mergesort.
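A minimal sketch of such a merge sort modified to count inversions (illustrative code, not the exact solution posted; function names are my own). Following the discussion further down, the merge uses pop(0):

```python
def count_inversions(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) < 2:
        return a, 0
    mid = len(a) // 2
    left, linv = count_inversions(a[:mid])
    right, rinv = count_inversions(a[mid:])
    merged, inv = [], linv + rinv
    while left and right:
        if left[0] <= right[0]:
            merged.append(left.pop(0))
        else:
            # right[0] is smaller than every remaining left element,
            # so it forms one inversion with each of them
            inv += len(left)
            merged.append(right.pop(0))
    merged.extend(left or right)
    return merged, inv
```

For the example array above, count_inversions([1, 4, 3, 2, 5]) returns ([1, 2, 3, 4, 5], 3).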
I do not believe an O(n log n) solution is possible for the task of enumerating the inversions, since an array can contain up to n(n−1)/2 of them and merely listing them is then Ω(n^2) in the worst case. For that task I’ve implemented an O(n^2) solution in Python.
I was originally using itertools.combinations, but I wasn’t sure if that function is guaranteed to preserve the ordering of the original array (although it appears to).
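A minimal sketch of such an O(n^2) enumeration (illustrative, not the posted solution), using explicit index loops so the pair order does not depend on itertools at all:

```python
def enumerate_inversions(a):
    """Return every inverted pair (a[i], a[j]) with i < j and a[j] < a[i]."""
    return [(a[i], a[j])
            for i in range(len(a))
            for j in range(i + 1, len(a))
            if a[j] < a[i]]
```

On the example array [1 4 3 2 5] this yields [(4, 3), (4, 2), (3, 2)]. (For what it's worth, the itertools documentation does guarantee that combinations emits tuples in the order of the input iterable.)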
For the task of counting the inversions, I’ve implemented an O(n log n) solution based on merge sort in C99.
@Paul, I believe that a core property of merge sort is its O(n log n) computational complexity. However, I think that by using pop(0) in your loop, you’re losing that property, since pop(0) is an O(n) operation, which would seemingly make your implementation O(n^2 log n).
@Daniel: I think my original solution is still O(n log n), but you are right: it is better to use pop() instead of pop(0). See the code below. It is very fast compared to the O(n^2) solution.
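A sketch of that faster variant, reconstructed from the description above (not Paul's exact code): reverse each half once, so that pop(), an O(1) operation, always removes the logically smallest remaining element.

```python
def count_inversions2(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) < 2:
        return a, 0
    mid = len(a) // 2
    left, linv = count_inversions2(a[:mid])
    right, rinv = count_inversions2(a[mid:])
    left.reverse()           # smallest element of each half is now at the end,
    right.reverse()          # so pop() removes the logical front in O(1)
    merged, inv = [], linv + rinv
    while left and right:
        if left[-1] <= right[-1]:
            merged.append(left.pop())
        else:
            inv += len(left)             # right's front jumps over all of left
            merged.append(right.pop())
    merged.extend(reversed(left or right))
    return merged, inv
```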
If we want to enumerate the inversions, then adapting insertion sort is nice: each time we swap adjacent elements, this removes exactly one inversion:
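A minimal sketch of that adaptation (illustrative code): insertion-sort a copy of the array and record one pair per adjacent swap.

```python
def inversions_by_insertion(a):
    """Enumerate all inversions of a by recording each adjacent swap."""
    a = list(a)              # work on a copy
    found = []
    for i in range(1, len(a)):
        j = i
        while j > 0 and a[j] < a[j - 1]:
            found.append((a[j - 1], a[j]))    # one inversion per swap
            a[j - 1], a[j] = a[j], a[j - 1]
            j -= 1
    return found
```

On [1 4 3 2 5] this reports (4,3), (4,2) and (3,2), matching the example in the exercise. The number of swaps insertion sort performs is exactly the inversion count, which is why this enumeration is complete.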
Hi @Paul, regarding your comment quoted above, I think that solution was O(n^2 log n). The complexity from the recursive calls is O(log n) from the divide-and-conquer approach. Each recursive call has a while loop, with a nested O(n) operation. If the while loop looped over all elements, but had a constant operation for its body, it would be O(n). The nested O(n) operation in the while loop seemingly makes the loop O(n^2). Multiplying this by the O(log n) from the recursive calls results in O(n^2 log n).
I’m not particularly confident in my analysis of algorithms. If possible, can you please elaborate on why you think that approach is O(n log n)?
@Daniel: The recurrence here is T(1) = 1, T(n) = 2T(n/2) + n^2 and https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms) (see “Case 3 example”) suggests this is actually just n^2 (with an exact solution T(1) = 1, T(n) = n^2 – n otherwise).
Sorry, what I wrote there is wrong: the exact solution at https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)#Case_3_example is T(1) = 1, T(n) = 2n^2 − n for n > 1 (so T(2) = 2·2^2 − 2 = 6 = 2T(2/2) + 2^2 = 2 + 4).
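A quick numerical check of that closed form against the recurrence (powers of two only, as in the master-theorem setting; the function name T is just the symbol from the recurrence):

```python
# Check T(1) = 1, T(n) = 2*T(n/2) + n^2 against the closed form 2*n^2 - n
def T(n):
    return 1 if n == 1 else 2 * T(n // 2) + n * n

for k in range(7):
    n = 2 ** k
    assert T(n) == 2 * n * n - n   # holds for n = 1, 2, 4, ..., 64
```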
Hi @Daniel. I did some timings (again) and see now that my 1st method gets really slow for n=10^6. It looks like O(n^2) from timings.
The timings also confirm that the 2nd method is O(n log n).
When I posted my 1st method I was aware of the fact that pop(0) is not optimal, but I did not think it was that bad. Immediately after my post I created the 2nd method, but I did not post it, as it is almost the same and I did not think that the time difference would be that big.
Hi @Daniel. I wrote down why I think my 1st method is O(n^2) and the second O(n log n). You can find it here.
@Paul and @matthew: Thanks for your replies! I’ll incorporate this approach into my future analyses of recursive algorithms (that way I hopefully won’t arrive at the wrong answer, as I did above).