Conditional Heap Insertion
May 17, 2016
This is an Amazon interview question:
Given a heap (priority queue), insert an element into the heap if the element is not already present in the heap. Your solution must work in O(n) time, where n is the number of items in the heap.
Your task is to write a program to insert an element into a heap if the element is not already present in the heap, in logarithmic time. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
In Python this is easy. Heaps are implemented as lists, so it is easy to see if the new item is in the heap. If not, use heappush to insert the item in the heap.
from heapq import heappush def cond_heappush(heap, item): if item not in heap: heappush(heap, item)Perl…. a simple OO version
my @ids = qw(12 5 13 3 5 9 6 4 9 3 1 44); my $x = Heap->new; $x->push( $_ ) foreach @ids; print join ", ", $x->dump,"\n"; package Heap; sub new { my $self = []; bless $self; return $self; } sub push { my ( $self,$v ) = @_; unless( @{$self} ) { push @{$self}, $v,Heap->new,Heap->new unless @{$self}; return $self; } return $self if $self->[0] == $v; $self->[ $v < $self->[0] ? 1 : 2 ]->push( $v ); return $self; } sub dump { my $self = shift; return unless @{$self}; return ($self->[1]->dump,$self->[0],$self->[2]->dump); } 1;The question as posed by Amazon is bizarre. If you are willing to pay the price of O(n) insertion time, you might as well use a linked list instead of a heap.
I would use an additional set to store elements that are in the heap. This requires extra space, but doesn’t alter Big Oh times.
Here’s a solution in C. I used an array-backed heap, where the array represents a binary tree (index 0 is the root, 1 and 2 are the children of the root, and e.g., 9 and 10 are the children of node 4).
I did not implement reheapification downward, which wasn’t necessary for this problem (since there is no removal from the heap).
Also note, the amazon question, as written on this page, says O(n), but the following text after the quote says “in logarithmic time”. The solution page implies that the amazon question is a trick question, but it’s seemingly the programming praxis portion that is a trick question (assuming the amazon quote is correct).
Also, I believe there are some issues on the solution page:
> “It is not possible to perform the task in O(n) time.”
should seemingly say “O(log n) time”
> “Once you know the element is not in the heap, it can be inserted in O(n) time”
should seemingly say “inserted in O(log n) time”
#include <assert.h> #include <stdio.h> #include <stdlib.h> /* * Array-Backed Heap */ // TODO: A function for removing the root (reheapification downward), // int HeapRemove(Heap* h); // It wasn't necessary for the problem you were working on. // TODO: A function for building a heap from an arbitrary list of elements, // void HeapBuild(Heap* h, int* elements, int count); // There is an O(n) way to do this, versus calling HeapInsert n times, // which is O(n log n). // Helpers typedef int bool; #define true 1 #define false 0 // Given the index of a node in a binary tree, return the parent index. // Returns -1 if node_index is the root node. int parent(int node_index) { assert(node_index >= 0); // 0 is the root return (node_index-1) >> 1; // floor((node_index-1)/2) } void terpri(int count) { for (int i = 0; i < count; i++) { printf("\n"); } } // Data structures typedef struct { int* array; int count; int capacity; } Heap; // Initialize an empty heap. void HeapInit(Heap* h, int capacity) { assert(capacity > 0); h->count = 0; h->capacity = capacity; h->array = malloc(sizeof(int) * capacity); } void HeapCleanup(Heap* h) { free(h->array); } // O(log n) void HeapInsert(Heap* h, int element) { if (h->capacity <= h->count) { // Alternatively could realloc fprintf(stderr, "Heap full (%d elements)\n", h->capacity); abort(); } int* array = h->array; array[h->count] = element; int index = h->count; int parent_index = parent(index); while (parent_index >= 0 && array[index] > array[parent_index]) { // swap int tmp = array[index]; array[index] = array[parent_index]; array[parent_index] = tmp; index = parent_index; parent_index = parent(index); } h->count++; } bool HeapContains(Heap* h, int element) { // TODO: Can return early if an entire level of the corresponding binary tree // is less than element. for (int i = 0; i < h->count; i++) { if (h->array[i] == element) { return true; } } return false; } // Inserts an element if it's not present (hence "New"). // Returns false iff item was already in the heap. bool HeapInsertNew(Heap* h, int element) { bool contained = HeapContains(h, element); if (!contained) { HeapInsert(h, element); } return !contained; } void HeapPrint(Heap* h) { for (int i = 0; i < h->count; i++) { printf("%d ", h->array[i]); } } int main(int argc, char* argv[]) { Heap h; HeapInit(&h, 100); int elements[] = {77, 5, 33, 29, 16, 9, 82}; int n = sizeof(elements) / sizeof(elements[0]); for (int i = 0; i < n; i++) { HeapInsert(&h, elements[i]); } printf("Original Heap\n"); HeapPrint(&h); terpri(2); printf("Insert 29 (an existing element) with HeapInsertNew\n"); HeapInsertNew(&h, 29); HeapPrint(&h); terpri(2); printf("Insert 30 (a new element) with HeapInsertNew\n"); HeapInsertNew(&h, 30); HeapPrint(&h); terpri(1); HeapCleanup(&h); return 0; }I forgot to include the program output. Here it is. Also, HeapInsertNew is the function corresponding to the question.