Permuted Index

December 22, 2009

In 1972, David Parnas proposed the construction of a permuted index, also known as a keyword-in-context or kwic index, as an exercise in program design. For example, the three sentences

    All's well that ends well.
    Nature abhors a vacuum.
    Every man has a price.

produce the permuted index:

                          Nature     abhors a vacuum.                
                                     All's well that ends well.      
                 All's well that     ends well.                      
                                     Every man has a price.          
                       Every man     has a price.                    
                           Every     man has a price.                
                                     Nature abhors a vacuum.         
                 Every man has a     price.                          
                      All's well     that ends well.                 
                 Nature abhors a     vacuum.                         
                           All's     well that ends well.            
            All's well that ends     well.

Parnas proposed a three-step algorithm: rotate, sort, unrotate. The rotate step takes a sentence and produces all its rotations. The sort step sorts the rotations by their back half. The unrotate step produces neatly-formatted output. Somewhere, in one of the three steps, rotations that produce back halves starting with words on a stop list are discarded; note that there is no output for “a vacuum” or “a price.”

Your task is to implement a program that produces permuted indexes using Parnas’ three-step algorithm. When you are finished, you are welcome to read or run a suggested solution, or to post your solution or discuss the exercise in the comments below.

Pages: 1 2

Follow

Get every new post delivered to your Inbox.

Join 574 other followers