## Inversions

### October 31, 2017

We use nested loops to create a higher-order procedure that processes inversions:

(define (inversion lt? base proc xs) (do ((xs xs (cdr xs))) ((null? (cdr xs)) base) (do ((ys (cdr xs) (cdr ys))) ((null? ys)) (when (lt? (car ys) (car xs)) (set! base (proc (car xs) (car ys) base))))))

Here `lt?`

is a predicate that compares two items, `base`

accumulates the result, `proc`

adds the next inversion to the accumulator, and `xs`

is the input array, represented as a list. The two needed functions are then:

(define (inversion-count xs) (inversion < 0 (lambda (x y base) (+ base 1)) xs)) (define (inversion-enumerate xs) (inversion < (list) (lambda (x y base) (cons (list x y) base)) xs))

Here are two examples:

> (define xs '(1 4 3 2 5)) > (inversion-count xs) 3 > (inversion-enumerate xs) ((3 2) (4 2) (4 3))

Our program takes O(*n*^{2}) time and O(1) space. There exists an O(*n* log *n*) algorithm based on merge sorting, but I’ve never got it to work.

You can run the program at https://ideone.com/AyJvvf.

Advertisements

In Python using a modified mergesort.

I do not believe an O(n log n) solution is possible for the task of

enumeratingthe inversions. For that task I’ve implemented an O(n^2) solution in Python.I was originally using itertools.combinations, but I wasn’t sure if that function is guaranteed to preserve the ordering of the original array (although it appears to).

For the task of

countingthe inversions, I’ve implemented an O(n log n) solution based on merge sort in C99.Output:

@Paul, I believe that a core property of merge sort is its O(n log n) computational complexity. However, I think that by using pop(0) in your loop, you’re losing that property, since pop(0) is an O(n) operation, which would seemingly make your implementation O(n^2 log n).

@Daniel: I think my original solution is still O(n log n), but you are right: it better to use pop() i.s.o. pop(0). See the code below. It is very fast, compared to the (n2) solution.

If we want to enumerate the inversions, then adapting insertion sort is nice: each time we swap adjacent elements, this removes exactly one inversion:

Hi @Paul, regarding your comment quoted above, I think that solution was O(n^2 log n). The complexity from the recursive calls is O(log n) from the divide-and-conquer approach. Each recursive call has a while loop, with a nested O(n) operation. If the while loop looped over all elements, but had a constant operation for its body, it would be O(n). The nested O(n) operation in the while loop seemingly makes the loop O(n^2). Multiplying this by the O(log n) from the recursive calls results in O(n^2 log n).

I’m not particularly confident in my analysis of algorithms. If possible, can you please elaborate on why you think that approach is O(n log n)?

@Daniel: The recurrence here is T(1) = 1, T(n) = 2T(n/2) + n^2 and https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms) (see “Case 3 example”) suggests this is actually just n^2 (with an exact solution T(1) = 1, T(n) = n^2 – n otherwise).

Sorry, what I wrote there is wrong: the exact solution at https://en.wikipedia.org/wiki/Master_theorem_(analysis_of_algorithms)#Case_3_example is T(1) = 1, T(n) = 2

n^2 – n (so T(2) = 22^2-2 = 6 = 2T(2/2)+2^2 = 2+4)Hmm, there were some asterisks for multiplication there that have disappeared and turned the text italic. Just read the Wikipedia entry and ignore me.

Hi @Daniel. I did some timings (again) and see now that my 1st method gets really slow for n=10^6. It looks like O(n^2) from timings.

Timings confirm also that the 2nd method is O(n log n).

When I posted my 1st method I was aware of the fact that pop(0) is not optimal, but I did not think it was that bad. Immediately after my post I created the 2nd method, but I did not post it, as it is almost the same and I did not think that the time difference would be that big.

Hi @Daniel. I wrote down why I think my 1st method is O(n^2) and the second O(n log n). You can find it here.

@Paul and @matthew. Thanks for your replies! I’ll have to switch to incorporating this approach into my future analysis of recursive algorithms (that way hopefully I won’t arrive at the wrong solution, like I did above).