Partitioning The Telephone Book
July 10, 2015
The city where I live used to publish a telephone directory; the “white pages” were distributed to all telephone customers once a year. Now my city no longer prints the directory; it is available on the internet, or you can pay an operator to look up a telephone number for you.
But there are still cities that print telephone directories, and some of those cities are big enough that the directory must be partitioned into multiple volumes. Consider this distribution of the first letters of customer’s last names:
A B C D E F G H I J K L M N 0 P Q R S T U V W X Y Z 16 4 17 10 15 4 4 6 7 14 9 17 27 6 1 9 0 12 20 8 0 3 4 0 3 4
I’m not sure what the units are (probably tens of thousands of telephone customers), but the total is 220, and the telephone company has decided to print 4 volumes, so each should be about 55 units. One possible partitioning is A-D, E-J, K-O, P-Z, with counts of 47, 50, 60 and 63 units, differences of 8, 5, 5 and 8 from the ideal of 55, and a total difference of 26. Another partitioning is A-E, F-K, L-O, P-Z, with counts of 62, 44, 51, and 63 units, differences of 7, 11, 4 and 8, and a total difference of 30, which is worse. Before continuing, you might want to work out the optimal solution by hand, finding a minimum score of 18.
Your task is to write a program that determines the partitioning of the telephone book that minimizes the total difference. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.