Universal Hash Function

October 23, 2012

This exercise can be easy, impossible, or somewhere in between, depending on your computing environment. I had no trouble writing a universal hash function in Scheme, which has a limited number of types and predicates to recognize them. C gives you access to the internal bit-image of any object in the language, so it shouldn’t be hard to write a universal hash function there, either. But a straight-jacketed language like Java may make a universal hash function impossible. Here’s my version in Scheme:

(define (hash x)
  (define (mod n) (modulo n 4294967296))
  (cond ((boolean? x) (if x 1 0))
        ((symbol? x) (hash (symbol->string x)))
        ((char? x) (char->integer x))
        ((integer? x) (mod x))
        ((real? x)
          (let* ((r (inexact->exact x))
                 (n (numerator r))
                 (d (denominator r)))
            (mod (+ n (* 37 d)))))
        ((rational? x) (mod (+ (numerator x) (* 37 (denominator x)))))
        ((complex? x)
          (mod (+ (hash (real-part x))
                  (* 37 (hash (imag-part x))))))
        ((null? x) 4294967295)
        ((pair? x)
          (let loop ((x x) (s 0))
            (if (null? x) s
              (loop (cdr x) (mod (+ (* 31 s) (hash (car x))))))))
        ((vector? x)
          (let loop ((i (- (vector-length x) 1)) (s 0))
            (if (negative? i) s
                (loop (- i 1) (mod (+ (* 31 s) (hash (vector-ref x i))))))))
        ((string? x)
          (let loop ((i (- (string-length x) 1)) (s 0))
            (if (negative? i) s
              (loop (- i 1) (mod (+ (* 31 s) (hash (string-ref x i))))))))
        ((procedure? x) (error 'hash "can't hash procedure"))
        ((port? x) (error 'hash "can't hash port"))
        (else (error 'hash "don't know how to hash object"))))

The basic types are handled first, then the collection types list and vector just loop over their members, handling each recursively. You can see examples at http://programmingpraxis.codepad.org/uBMG1PWc.

About these ads

Pages: 1 2

6 Responses to “Universal Hash Function”

  1. burr said

    >>> ht = {“Test”: “String”, 1: “Integer”}
    >>> ht[1]
    ‘Integer’
    >>> ht["Test"]
    ‘String’
    >>>

  2. Probably cheating in Python: use the id() function. From calling help(id)
    in my Python environment:

    id(…)
    id(object) -> integer

    Return the identity of an object. This is guaranteed to be unique among
    simultaneously existing objects. (Hint: it’s the object’s memory address.)

  3. Paul said

    In Python it is trivial, as you can use hash(object), which gives a hash for many data types. Hash refuses to give a value for mutable objects like a list, which makes sense.

  4. Using Shen (which I’m definitely intrigued by):

    (tc +)
    
    (define praxis-hash
      {A --> number --> number}
      X N -> (hash X N))
    
    (praxis-hash praxising 1729)
    
  5. Sorry, link above is wrong; here’s the correct link.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 630 other followers

%d bloggers like this: