A Dozen Lines Of Code

June 3, 2016

I decided to implement hash tables, which are one of the most useful data structures in all of computer science. Here’s my version, which fits in nine lines of 63 characters:

(define (make-hash hash eql? size)  (let ((ht (make-vector size
(list)))) (case-lambda ((lookup k) (let loop ((b (vector-ref ht
(modulo (hash k) size)))) (if (null? b) b (if (eql? (caar b) k)
(car b)  (loop (cdr b))))))  ((install k v)  (let ((idx (modulo
(hash k) size))) (let loop ((b (vector-ref ht idx)) (x (list)))
(cond  ((null? b)  (vector-set! ht idx  (cons  (cons k v)  x)))
((eql? (caar b) k) (vector-set! ht idx (append (cons (cons k v)
(cdr b)) x)))  (else (loop (cdr b) (cons (car b) x))))))) (else
(error 'hash-table "unrecognized command"))))))

That might be a little bit hard to read. Here’s a better version, with proper indentation and sensible variable names, that unfortunately no longer fits in twelve lines:

(define (make-hash hash eql? size)
  (let ((hash-table (make-vector size (list))))
      ((lookup key)
        (let ((idx (modulo (hash key) size)))
          (let loop ((bucket (vector-ref hash-table idx)))
            (cond ((null? bucket) bucket)
                  ((eql? (caar bucket) key) (car bucket))
                  (else (loop (cdr bucket)))))))
      ((install key value)
        (let ((idx (modulo (hash key) size)))
          (let loop ((bucket (vector-ref hash-table idx)) (new-bucket (list)))
            (cond ((null? bucket)
                    (vector-set! hash-table idx (cons (cons key value) new-bucket)))
                  ((eql? (caar bucket) key)
                    (vector-set! hash-table idx
                      (append (cons (cons key value) (cdr bucket)) new-bucket)))
                  (else (loop (cdr bucket) (cons (car bucket) new-bucket)))))))
      (else (error 'hash-table "unrecognized command")))))

If you’re not familiar with Scheme, there are a couple of interesting things going on here. The lambda forms a closure, so hash-table, hash, eql? and size are all available within the body of the function, and persist from one call to the next, but aren’t visible outside the function. The case-lambda dispatches on arity, not on the message in the first field, so lookup and install are just unused variables, and the function can be called with other messages such as fetch and store. Here you can see it in action, where we use get and put to emphasize the point about arity:

> (define (string-hash str)
    (do ((i 0 (+ i 1))
         (h 0 (+ (* h 31) (char->integer (string-ref str i)))))
      ((= (string-length str) i) h)))
> (define h (make-hash string-hash string=? 17))
> (h 'put "alfa" 1)
> (h 'put "bravo" 2)
> (h 'put "charlie" 3)
> (h 'get "alfa")
("alfa" . 1)
> (h 'put "alfa" 13)
> (h 'get "alfa")
("alfa" . 13)

Notice that the install function replaces existing values. You can run the program at http://ideone.com/Q1GNqD.

Pages: 1 2

6 Responses to “A Dozen Lines Of Code”

  1. mcmillhj said

    Sorting IPv4 addresses numerically, code with whitespace and shebang line minus the test data is 12 lines:

    use strict;
    use warnings; 
    my @sorted_ip_addrs = 
       map  { join '.', @$_ } 
       sort { $a->[0] <=> $b->[0] || $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] || $a->[3] <=> $b->[3] }
       map  { chomp; [split /\./] } readline(*DATA); 
    print join("\n", @sorted_ip_addrs), "\n";


  2. Zack said

    function fne(x::Union{Array{Int8,1}, BitArray{1}}) # Feature Normalized Entropy
    n = length(x) # number of unique values of feature
    ln = log(2, n)
    z = 0.0
    for c in collect(values(counter(x))); z += c*(ln – log(2, c)) / n; end
    if typeof(x) <: BitArray
    return z
    me = log(2, length(unique(x))) # maximum entropy
    return z / me

    where counter() is a function that provides the frequencies of the elements of the (nominal) feature x.

    The actual function fne() is much more comprehensive and makes use of the fe() auxiliary, that calculates the feature entropy. The normalized version of the feature entropy takes values in [0, 1] and is a much more intuitive metric for assessing a given (nominal) feature.

  3. Globules said

    A Haskell program.

    -- This program uses the Karplus-Strong algorithm to synthesize the sound of a
    -- plucked string.  Excluding comments and blank lines, it consists of exactly
    -- 12 lines.  After compiling it, run it like this:
    --   ./karstr | play --type raw --rate 44100 --bits 64 \
    --                   --encoding floating-point --channels 1 \
    --                   --volume 0.5 --endian little -
    -- where the `play' program is from the SOX package.  It takes the sound wave
    -- data produced by karstr and plays it on the audio device.
    -- The entire algorithm is implemented by the `karstr' function, which
    -- highlights the use of laziness in Haskell.  In particular, the value `note'
    -- is defined in terms of itself.  It's an infinite list of sound samples, but
    -- only as many as are required by karstr's caller will actually be generated.
    -- The function simulates a plucked string as a "recirculating delay line" whose
    -- output is passed through a simple low-pass filter.  Its argument is a burst
    -- of white noise, representing the superposition of many sound frequencies
    -- immediately after the string is plucked.  The repeated application of the
    -- low-pass filter simulates energy loss in the string, causing higher
    -- frequencies to be dampened more quickly than lower frequencies.  This is what
    -- leads to the initial "twang" sound, followed by a decay towards a more pure
    -- tone.
    -- The length of the white noise burst is what determines the frequency of the
    -- note.
    import Data.ByteString.Builder
    import System.IO
    import System.Random
    karstr nz = note where note = nz ++ map (0.4995*) (zipWith (+) note (tail note))
    main = let pluck = take (secs 10) . karstr
               secs tm = round (44100 * tm)
               freq hz = round (44100 / hz)
               noise n = take n . randomRs (-1, 1)
           in do nz <- fmap (noise (freq 130.81)) getStdGen
                 hSetBinaryMode stdout True
                 hSetBuffering  stdout (BlockBuffering Nothing)
                 hPutBuilder stdout . mconcat . map (doubleLE . (*0.2)) $ pluck nz
  4. Globules said

    Definitely not a 12-line program, but a nice follow-up to the previous one. It uses the same Karplus-Strong algorithm, but mixes individual notes into a sequence of arpeggios.

    -- Compile, then run as:
    --   ./arpeggios | play --type raw --rate 44100 --bits 64 \
    --                      --encoding floating-point --channels 1 \
    --                      --volume 0.5 --endian little -
    import Data.ByteString.Builder
    import Data.List (transpose)
    import System.IO
    import System.Random
    -- The sampling rate (samples/second).
    rate :: Double
    rate = 44100
    -- A list of samples simulating a plucked string.  Its frequency is determined
    -- by the length of the white noise argument.
    karstr :: [Double] -> [Double]
    karstr nz = note
      where note = nz ++ map (0.4995 *) (zipWith (+) note (tail note))
    -- A sequence of notes each of whose start is delayed by the given number of
    -- samples.
    arpeggio :: Int -> [[Double]] -> [Double]
    arpeggio n = mix . zipWith (++) pauses . map karstr
      where mix = map sum . transpose
            pauses = map (`replicate` 0) [0, n ..]
    -- A chromatic scale starting at C3 (130.81 Hz).
    scale :: RandomGen g => g -> [[Double]]
    scale g = [noise (freq hz) g | i <- [0..11 :: Int], let hz = semitone i]
      where noise n = take n . randomRs (-1, 1)
            freq hz = round $ rate / hz
            semitone i = 130.81 * 2.0 ** (fromIntegral i / 12.0)
    -- The number of samples for the given number of seconds.
    secs :: Double -> Int
    secs tm = round $ rate * tm
    -- Write a series of 64-bit, little-ending floating point samples to the handle.
    putSamples :: Handle -> [Double] -> IO ()
    putSamples h samps = do
      hSetBinaryMode h True
      hSetBuffering  h (BlockBuffering Nothing)
      hPutBuilder stdout . mconcat . map (doubleLE . (* 0.2)) $ samps
    main :: IO ()
    main = do
      -- Convenient names for the notes we'll use.
      [c, _, d, _, e, f, _, g, _, a, _, b] <- fmap scale getStdGen
      -- A sequence of three note arpeggios.  The first three have a delay of 0.08
      -- seconds between the start of each note, and each arpeggio is played for 1.5
      -- seconds before beginning the next one.  The final arpeggio is played more
      -- slowly and lasts longer in order to better hear the notes decay.
      let arpeggios = concat [ take (secs 1.5) $ arpeggio (secs 0.08) [a, c, e]
                             , take (secs 1.5) $ arpeggio (secs 0.08) [f, a, c]
                             , take (secs 1.5) $ arpeggio (secs 0.08) [d, f, a]
                             , take (secs 9.0) $ arpeggio (secs 0.30) [e, g, b] ]
      putSamples stdout arpeggios
  5. Zack said

    Awesome applications! How does Haskell compare with C in terms of efficiency and resource management?

  6. Globules said

    @Zack I’m sure there’s a lot to be said on that subject, but I’m far from the best person to say it. :-) I only play around with Haskell; I don’t use it in my day job. With that being said… Haskell has automatic memory management (i.e. garbage collection), so you may not want to use it where periodic short pauses can’t be tolerated (e.g. game programming where you want to maintain a high frame rate). Also, you have to be careful of “space leaks”, due to laziness, which is the term Haskellers use to describe memory that is unintentionally consumed by unevaluated functions and data. There are techniques and libraries for dealing with this, just as there are different ways of avoiding memory leaks, wild pointers, etc. in C/C++. In general, the freedom from having to manage your own memory is very liberating. With respect to the speed of the resulting code I’ve seen small programs equal that of C/C++. For more realistic programs I wouldn’t be surprised if Haskell was slower by a small constant factor.

    For an overview of the areas in which Haskell does well (or not so well) I suggest checking out State of the Haskell ecosystem. (This is from the point-of-view of the libraries that are available. The compiler itself is solid.)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: