Ordered Hash Tables
August 2, 2013
In a normal hash table, a key is hashed then linear search is performed in the bucket where the key will be found if it exists in the hash table. A successful search terminates as soon as the key is found, on average half-way through the bucket, but an unsuccessful search requires that every item in the bucket be examined.
Donald Knuth proposed that instead of keeping keys in a bucket in random order they be kept in increasing order, so that a search can stop as soon as the search passes the value of the key. That means inserts take longer; you have to find the right place in the bucket to put the key instead of just putting it at the beginning of the bucket. But lookups are quicker, as an unsuccessful search can terminate half-way through the bucket, on average. This change also means that the data type of the hash table changes, as now the comparison function is less-than rather than equal-to.
Your task is to write a small library of ordered hash tables; you should provide lookup, insert and delete functions, at least. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Since you have a comparison function, you could go further and have each bucket be an auto-growing array; this would allow lookup to use a binary search within the bucket. But at that point it might be better to re-evaluate the hash function.
A simple Haskell solution: http://hamberg.no/erlend/posts/2013-08-02-ordered-hash-table.html
Can we implement the solution in Java?
Here’s a solution in Common Lisp. The operations on the sorted alist could be improved by using tail recursion.
An argument can be used to specify the number of buckets. Dynamic resizing is not implemented.
This solution uses separate chaining with linked lists, although a search tree would seemingly improve performance.