Ordered Hash Tables

August 2, 2013

In a normal hash table, a key is hashed then linear search is performed in the bucket where the key will be found if it exists in the hash table. A successful search terminates as soon as the key is found, on average half-way through the bucket, but an unsuccessful search requires that every item in the bucket be examined.

Donald Knuth proposed that instead of keeping keys in a bucket in random order they be kept in increasing order, so that a search can stop as soon as the search passes the value of the key. That means inserts take longer; you have to find the right place in the bucket to put the key instead of just putting it at the beginning of the bucket. But lookups are quicker, as an unsuccessful search can terminate half-way through the bucket, on average. This change also means that the data type of the hash table changes, as now the comparison function is less-than rather than equal-to.

Your task is to write a small library of ordered hash tables; you should provide lookup, insert and delete functions, at least. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

4 Comments »

4 Responses to “Ordered Hash Tables”

mvaneerde said
August 2, 2013 at 10:46 AM
Since you have a comparison function, you could go further and have each bucket be an auto-growing array; this would allow lookup to use a binary search within the bucket. But at that point it might be better to re-evaluate the hash function.
Erlend said
August 2, 2013 at 2:24 PM
A simple Haskell solution: http://hamberg.no/erlend/posts/2013-08-02-ordered-hash-table.html
Ram said
August 5, 2013 at 6:31 AM
Can we implement the solution in Java?

Daniel said

August 7, 2013 at 3:51 AM

Here’s a solution in Common Lisp. The operations on the sorted alist could be improved by using tail recursion.

An argument can be used to specify the number of buckets. Dynamic resizing is not implemented.

This solution uses separate chaining with linked lists, although a search tree would seemingly improve performance.

;;; Sorted alist operations
;;; These could be improved with tail recursion

(defun sorted-acons (key datum alist &key (comp #'<))
  "Inserts (key . datum) pair into a sorted alist"
  (if (null alist)
      (list (cons key datum))
      (let ((head (car alist))
            (tail (cdr alist)))
        (let ((head-key (car head)))
          (cond ((funcall comp key head-key)
                 (acons key datum alist))
                ((funcall comp head-key key)
                 (cons head (sorted-acons key datum tail :comp comp)))
                (t (acons key datum tail)))))))

(defun sorted-assoc (key alist &key (comp #'<))
  "Return the cons in a sorted alist with key matching argument"
  (if (null alist)
      nil
      (let ((head (car alist))
            (tail (cdr alist)))
        (let ((head-key (car head)))
          (cond ((funcall comp key head-key)
                 nil)
                ((funcall comp head-key key)
                 (sorted-assoc key tail :comp comp))
                (t head))))))

(defun sorted-alist-remove (key alist &key (comp #'<))
  "Remove pair (key . datum) from a sorted alist if key matches argument"
  (if (null alist)
      nil
      (let ((head (car alist))
            (tail (cdr alist)))
        (let ((head-key (car head)))
          (cond ((funcall comp key head-key)
                 alist)
                ((funcall comp head-key key)
                 (cons head (sorted-alist-remove key tail :comp comp)))
                (t tail))))))

;;; Ordered hash table structure and operations

(defun make-ordered-hash (&key
                          (hash-function #'sxhash)
                          (comp #'<)
                          (buckets 16))
  (let ((array (make-array buckets :initial-element nil)))
    (macrolet ((bucket (key)
                 `(aref array (rem (funcall hash-function ,key) buckets))))
      (vector
       #'(lambda (key datum)
           (setf (bucket key)
                 (sorted-acons key datum (bucket key) :comp comp))
           datum)
       #'(lambda (key)
           (let ((pair (sorted-assoc key (bucket key) :comp comp)))
             (if pair
                 (values (cdr pair) t)
                 (values nil nil))))
       #'(lambda (key)
           (setf (bucket key)
                 (sorted-alist-remove key (bucket key) :comp comp)))))))

(defun insert (key value ordered-hash)
  (funcall (aref ordered-hash 0) key value))

(defun lookup (key ordered-hash)
  (funcall (aref ordered-hash 1) key))

(defun del (key ordered-hash)
  (funcall (aref ordered-hash 2) key))

;;; test

(defun test ()
  (let ((table (make-ordered-hash)))
    (insert 25 78 table)
    (insert 95 18 table)
    (insert 10 2 table)
    (assert (= (lookup 25 table) 78))
    (del 95 table)
    (assert (null (lookup 95 table)))))

S	M	T	W	T	F	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

Programming Praxis