Conditional Heap Insertion
May 17, 2016
This is a trick question, and I can’t decide if it’s good that Amazon asks it, or if it’s underhanded.
It is not possible to perform the task in O(n) time. The heap is a binary tree, but it’s not necessarily ordered so that all less-than nodes are on one side of the tree and all greater-than nodes are on the other side of the tree, so there is no way to search the heap for a specific item in anything less than linear time. Once you know the element is not in the heap, it can be inserted in O(n) time, but checking that the element is not already in the heap requires linear time. Someone with a shaky knowledge of data structures might easily miss that in the stress of an interview (which might be what Amazon intends).
So we will write a linear-time solution, based on the priority queues of a previous exercise. The queue is stored as nodes in a 4-slot vector, with the item in slot 1, the left child in slot 2, and the right child in slot 3. We begin with a function that determines in an item is in the priority queue:
(define (pq-contains? lt? x pq) (if (pq-empty? pq) #f (if (lt? x (pq-item pq)) #f (if (not (lt? (pq-item pq) x)) #t (or (pq-contains? lt? x (pq-lkid pq)) (pq-contains? lt? x (pq-rkid pq)))))))
Now it’s easy. If the element is already present, we simply return the priority queue unchanged, otherwise we insert the element and return the new priority queue:
(define (pq-conditional-insert lt? x pq) (if (pq-contains? lt? x pq) pq (pq-insert lt? x pq)))
Here’s an example:
> (define pq (pq-insert < 3 (pq-insert < 7 (pq-insert < 8 (pq-insert < 1 (pq-insert < 2 (pq-insert < 9 (pq-insert < 6 (pq-insert < 4 (pq-insert < 5 pq-empty)))))))))) > pq #(2 1 #(2 2 #(2 4 #(1 5 #(1 9 #(0 pq-empty pq-empty pq-empty) #(0 pq-empty pq-empty pq-empty)) #(0 pq-empty pq-empty pq-empty)) #(1 6 #(0 pq-empty pq-empty pq-empty) #(0 pq-empty pq-empty pq-empty))) #(1 7 #(0 pq-empty pq-empty pq-empty) #(0 pq-empty pq-empty pq-empty))) #(1 3 #(1 8 #(0 pq-empty pq-empty pq-empty) #(0 pq-empty pq-empty pq-empty)) #(0 pq-empty pq-empty pq-empty))) > (pq-contains? < 3 pq) #t > (pq-contains? < 0 pq) #f > (pq-conditional-insert < 0 pq) #(1 0 #(2 1 #(2 2 #(2 4 #(1 5 #(1 9 #(0 pq-empty pq-empty pq-empty) #(0 pq-empty pq-empty pq-empty)) #(0 pq-empty pq-empty pq-empty)) #(1 6 #(0 pq-empty pq-empty pq-empty) #(0 pq-empty pq-empty pq-empty))) #(1 7 #(0 pq-empty pq-empty pq-empty) #(0 pq-empty pq-empty pq-empty))) #(1 3 #(1 8 #(0 pq-empty pq-empty pq-empty) #(0 pq-empty pq-empty pq-empty)) #(0 pq-empty pq-empty pq-empty))) #(0 pq-empty pq-empty pq-empty))
You can run the program at http://ideone.com/Mz798f.
In Python this is easy. Heaps are implemented as lists, so it is easy to see if the new item is in the heap. If not, use heappush to insert the item in the heap.
Perl…. a simple OO version
The question as posed by Amazon is bizarre. If you are willing to pay the price of O(n) insertion time, you might as well use a linked list instead of a heap.
I would use an additional set to store elements that are in the heap. This requires extra space, but doesn’t alter Big Oh times.
Here’s a solution in C. I used an array-backed heap, where the array represents a binary tree (index 0 is the root, 1 and 2 are the children of the root, and e.g., 9 and 10 are the children of node 4).
I did not implement reheapification downward, which wasn’t necessary for this problem (since there is no removal from the heap).
Also note, the amazon question, as written on this page, says O(n), but the following text after the quote says “in logarithmic time”. The solution page implies that the amazon question is a trick question, but it’s seemingly the programming praxis portion that is a trick question (assuming the amazon quote is correct).
Also, I believe there are some issues on the solution page:
> “It is not possible to perform the task in O(n) time.”
should seemingly say “O(log n) time”
> “Once you know the element is not in the heap, it can be inserted in O(n) time”
should seemingly say “inserted in O(log n) time”
I forgot to include the program output. Here it is. Also, HeapInsertNew is the function corresponding to the question.