## Permuted Index

### December 22, 2009

In 1972, David Parnas proposed the construction of a permuted index, also known as a keyword-in-context or kwic index, as an exercise in program design. For example, the three sentences

```    All's well that ends well.     Nature abhors a vacuum.     Every man has a price.```

produce the permuted index:

```                          Nature     abhors a vacuum.
All's well that ends well.
All's well that     ends well.
Every man has a price.
Every man     has a price.
Every     man has a price.
Nature abhors a vacuum.
Every man has a     price.
All's well     that ends well.
Nature abhors a     vacuum.
All's     well that ends well.
All's well that ends     well.```

Parnas proposed a three-step algorithm: rotate, sort, unrotate. The rotate step takes a sentence and produces all its rotations. The sort step sorts the rotations by their back half. The unrotate step produces neatly-formatted output. Somewhere, in one of the three steps, rotations that produce back halves starting with words on a stop list are discarded; note that there is no output for “a vacuum” or “a price.”

Your task is to implement a program that produces permuted indexes using Parnas’ three-step algorithm. When you are finished, you are welcome to read or run a suggested solution, or to post your solution or discuss the exercise in the comments below.

Pages: 1 2

### 6 Responses to “Permuted Index”

1. [...] Praxis – Permuted Index By Remco Niemeijer In today’s Programming Praxis exercise we have to implement David Parnas’ permuted index system. [...]

2. Remco Niemeijer said

```import Data.Char
import Data.List
import qualified Data.List.Key as K
import Text.Printf

stopList :: [String]
stopList = words "a an and by for if in is of on the to"

rot :: [String] -> [(String, String)]
rot xs = [(unwords a, unwords b) | (a, b) <- init \$
zip (inits xs) (tails xs), notElem (head b) stopList]

prettyPrint :: [(String, String)] -> IO ()
prettyPrint xs = mapM_ (\(a, b) -> printf "%*s   %-*s\n" l1 a l2 b) xs
where l1 = maximum \$ map (length . fst) xs
l2 = maximum \$ map (length . snd) xs

permuteIndex :: String -> IO ()
permuteIndex = prettyPrint . K.sort (\(_, x) -> (map toLower x, x)) .
concatMap (rot . words) . lines
```
3. Geo Marchin said

My python solution.

def rotate(phrase):
“”"
Takes a given phrase and performs a keyword rotation on it, returning a
list of (part 1, part 2) tuples.
“”"
# I don’t want to figure out the sorting of capital letters.
phrase = phrase.lower()
# split the phrase into words.
split_phrase = phrase.split(‘ ‘)
rotation_list = []
# Perform the actual rotation and return the splits.
for i in range(len(split_phrase)):
non_words =\
["a","an","and","by","for","if","in","is","of","on","the","to"]
# If the split is before a non word skip the split, unless it’s the first
# word in the phrase.
if (split_phrase[i] in non_words) and (i is not 0):
continue
# Create the left and right portions of the split, and insert into the
# return tuple list.
left = “”
right = “”
for n in range(len(split_phrase[:i])):
if n == 0:
left = split_phrase[n]
else:
left = left + ‘ ‘ + split_phrase[n]
for n in range(len(split_phrase[i:])):
if n == 0:
right = split_phrase[i+n]
else:
right = right + ‘ ‘ + split_phrase[i+n]
rotation_list.append((left, right))
return rotation_list

def sort(rotation_list):
“”"
Given a rotation list, bubble sort by the right hand side because I’m lazy.
“”"
for i in range(len(rotation_list)):
for n in range(len(rotation_list)-1):
if (rotation_list[n][1] > rotation_list[n+1][1]):
rotation_list[n], rotation_list[n+1] = \
rotation_list[n+1], rotation_list[n]

def unrotate(rotation_list):
“”"
Neatly output the given rotation list to the user.
“”"
# Find the length of the longest string in the rotation list.
max_size = 0
for i in range(len(rotation_list)):
for j in range(2):
if (len(rotation_list[i][j]) > max_size):
max_size = len(rotation_list[i][j])

# Output the properly formatted index table.
for i in range(len(rotation_list)):
print ‘ ‘ * (max_size – len(rotation_list[i][0])) \
+ rotation_list[i][0] \
+ ” ” \
+ rotation_list[i][1]

4. Geo Marchin said

Apologies for failing at code posting.
Here you go:

5. I’ve written my OCaml solution (heavily “inspired” by Remco’s) here.

6. richard mullins said

Parnas’s paper on kwic index looked like he had spent a week on very intensive design. In my opinion, parnas’s ideas are roughly the same as Niklaus Wirth’s “refinement” method – or at least compatible with it.
I think Parnas was on the right track with his ideas.

I was astounded a couple of years ago to see that someone had written a kwic index progam in 3 lines of Python. I did not keep a note of the Author. I still think however, that Parnas’s design method is very good.