House Of Representatives
April 12, 2011
The population data comes from the U. S. Census Bureau at http://2010.census.gov/2010census/data/pop_change.csv. The data is not particularly convenient for our purpose, since it has totals by various regions and also includes Puerto Rico and the District of Columbia, which have no representatives. The total population of the fifty States is 308,143,815:
(define state-pop-data '(("Alabama" 4779736) ("Alaska" 710231)
("Arizona" 6392017) ("Arkansas" 2915918) ("California" 37253956)
("Colorado" 5029196) ("Connecticut" 3574097) ("Delaware" 897934)
("Florida" 18801310) ("Georgia" 9687653) ("Hawaii" 1360301)
("Idaho" 1567582) ("Illinois" 12830632) ("Indiana" 6483802)
("Iowa" 3046355) ("Kansas" 2853118) ("Kentucky" 4339367)
("Louisiana" 4533372) ("Maine" 1328361) ("Maryland" 5773552)
("Massachusetts" 6547629) ("Michigan" 9883640) ("Minnesota" 5303925)
("Mississippi" 2967297) ("Missouri" 5988927) ("Montana" 989415)
("Nebraska" 1826341) ("Nevada" 2700551) ("New Hampshire" 1316470)
("New Jersey" 8791894) ("New Mexico" 2059179) ("New York" 19378102)
("North Carolina" 9535483) ("North Dakota" 672591) ("Ohio" 11536504)
("Oklahoma" 3751351) ("Oregon" 3831074) ("Pennsylvania" 12702379)
("Rhode Island" 1052567) ("South Carolina" 4625364) ("South Dakota" 814180)
("Tennessee" 6346105) ("Texas" 25145561) ("Utah" 2763885)
("Vermont" 625741) ("Virginia" 8001024) ("Washington" 6724540)
("West Virginia" 1852994) ("Wisconsin" 5686986)("Wyoming" 563626)))
We represent each State as a structure containing state
, pop
, reps
and gmean
fields, using the structure datatype that has recently been added to the Standard Prelude; that will give us a function (make-sp
state
pop
reps
gmean)
to create sp
objects, (sp?
obj)
to recognize such objects, and accessors like (sp-reps
sp)
and setters like (set-sp-reps!
sp
value)
for each of the four fields:
(define-structure sp state pop reps gmean)
Our solution uses priority queues to manage the apportionment process. First, each State is assigned one representative in loop1
, its geometric mean g(n, p) is calculated, and the State is entered in a priority queue. Then, the rounds are performed in loop2
by extracting the State with the maximum geometric mean g(n, p) from the priority queue, increasing its number of representatives by 1 and recomputing its geometric mean g(n, p), and re-inserting it into the priority queue:
(define (huntington-hill n xs)
(when (< n (length xs)) (error 'huntington-hill "not enough"))
(let loop1 ((xs xs) (k 1) (pq pq-empty))
(if (pair? xs)
(let* ((s (caar xs)) (p (cadar xs)) (g (gmean 1 p)))
(loop1 (cdr xs) (+ k 1)
(pq-insert lt? (make-sp s p 1 g) pq)))
(let loop2 ((k k) (pq pq))
(if (not (< n k))
(let* ((p (pq-first pq)) (reps (+ (sp-reps p) 1)))
(set-sp-reps! p reps)
(set-sp-gmean! p (gmean reps (sp-pop p)))
(loop2 (+ k 1) (pq-insert lt? p (pq-rest lt? pq))))
(sort (lambda (a b) (< (cadr b) (cadr a)))
(map (lambda (sp) (list (sp-state sp) (sp-reps sp)))
(pq->list lt? pq))))))))
We use two helper functions. Lt?
compares two States on their geometric mean g(n, p), using not
because the priority queue returns the smallest element rather than the largest, and gmean
computes the geometric mean g(n, p):
(define (lt? a b) (not (< (sp-gmean a) (sp-gmean b))))
(define (gmean n p) (/ p (sqrt (* n (+ n 1)))))
Then we reapportion the representatives like this:
> (huntington-hill 435 state-pop-data)
(("California" 53) ("Texas" 36) ("New York" 27)
("Florida" 27) ("Illinois" 18) ("Pennsylvania" 18)
("Ohio" 16) ("Michigan" 14) ("Georgia" 14)
("North Carolina" 13) ("New Jersey" 12) ("Virginia" 11)
("Washington" 10) ("Massachusetts" 9) "Indiana" 9)
("Arizona" 9) ("Tennessee" 9) ("Missouri" 8) ("Maryland" 8)
("Wisconsin" 8) ("Minnesota" 8) ("Colorado" 7) ("Alabama" 7)
("South Carolina" 7) ("Louisiana" 6) ("Kentucky" 6)
("Oregon" 5) ("Oklahoma" 5) ("Connecticut" 5) ("Iowa" 4)
("Mississippi" 4) ("Arkansas" 4) ("Kansas" 4) ("Utah" 4)
("Nevada" 4) ("New Mexico" 3) ("West Virginia" 3)
("Nebraska" 3) ("Idaho" 2) ("Hawaii" 2) ("Maine" 2)
("New Hampshire" 2) ("Rhode Island" 2) ("Montana" 1)
("Delaware" 1) ("South Dakota" 1) ("Alaska" 1)
("North Dakota" 1) ("Vermont" 1) ("Wyoming" 1))
You can see the code assembled at http://programmingpraxis.codepad.org/NwMn9iLn. It is striking that in 1940 the Congress understood the relatively advanced mathematical concept of the geometric mean, but in 2011 the Congress doesn’t understand the basic economic concept that you can’t spend money you don’t have.
[…] today’s Programming Praxis exercise, our goal is to calculate the amount of seats each state gets in the […]
My Haskell solution (see http://bonsaicode.wordpress.com/2011/04/12/programming-praxis-house-of-representatives/ for a version with comments):
A Python solution:
My solution and first attempt to play with upcoming C++0x standard (compiles nicely with GCC version 4.5): github
Thought I should admit that before posting I have read the solution and understood that using sorted vector and insertion via lower_bound function is not as nice as using priority queue :)
For this one I reformatted the source file as:
Alabama: 4779736
Alaska: 710231
Arizona: 6392017
…
My Python solution.
My original solution involved a
State class, but pylint objected that it didn't have enough public methods to warrant its own class.
Apologies for the incorrectly clsoed “code” tag above; only
State
should be tagged.Note that in python’s heapq module, heappop returns the smallest item. Here we want the state with the largest geomean. So we use -geomean to get the desired ordering.
Interstingly, Montana has the lowest representation, with 1 rep per 989k people, and Rhode Island has the highest, with 1 representative per 526k people.
Not very ruby like in some ways (use of arrays) but pretty darn small …
I set the number of representatives to 1 in the initial data. The tricky line is line 23, so let’s step through it. We know we have 385 representatives left so we’ll go through each of them. The next piece is the collect which will generate an array of the g(n, p) values. We then do the each_with_index.max which returns an array of the maximum and the index of the maximum and we grab the index out of the [1] value. We then add 1 to the [2] value (representatives) of the state_pop_data array.
My try in REXX
Factor language code. Kind of clunky since heaps are not treated as sequences, so result needs to be converted to an array for printing. Also more stack shuffling than usual is used here.
( scratchpad ) house
California 53
Texas 36
New York 27
Florida 27
Illinois 18
Pennsylvania 18
Ohio 16
Michigan 14
Georgia 14
North Carolina 13
New Jersey 12
Virginia 11
Washington 10
….
In F#,
In scala. It could be a bit shorter if I made seats mutable in the State class. Now I have to remove the old State from the List and add the one with the added seat. Not very efficient :)