Animal.txt
January 12, 2021
It’s easiest to approach a task like this in pieces, using a pipeline to gently massage the data to its final form:
awk '
/once was/ { animal = $5 }
/he ate/ { print $4, animal } ' animal.txt |
sort -u |
awk '
NR == 1 { prev = $1; printf "%s %s", $1, $2 }
NR > 1 { if (prev == $1) { printf " %s", $2 }
else { prev = $1; printf "\n%s %s", $1, $2 } }
END { printf "\n" } ' |
awk '
{ if (NR > 1) { print "" }
print "Food:", $1, "Animals who ate it:", NF-1
print "========"
for (N=2; N<=NF; N++) { print $N } } '
The first awk command extracts food/animal pairs from the input. After that command, the pipeline consists of seven lines:
Apples Dog
Apples Dog
Apples Dog
Carrots Dog
Carrots Bear
Carrots Bear
Chicken Bear
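For readers who think more easily in Python, the extraction step can be sketched as below. The sample lines are an assumption: the post does not reproduce animal.txt, so they are invented to fit the field positions the awk patterns rely on (the animal in field 5 of a "once was" line, the food in field 4 of a "he ate" line). The name extract_pairs is likewise hypothetical.

```python
def extract_pairs(lines):
    """Yield (food, animal) pairs, mirroring the first awk command."""
    animal = None
    for line in lines:
        fields = line.split()
        # awk's /once was/ { animal = $5 }
        if "once was" in line and len(fields) >= 5:
            animal = fields[4]
        # awk's /he ate/ { print $4, animal }
        if "he ate" in line and len(fields) >= 4:
            yield fields[3], animal


# Invented sample input; the real animal.txt may be worded differently.
sample = [
    "There once was a Dog",
    "Monday he ate Apples",
    "Tuesday he ate Apples",
    "There once was a Bear",
    "Monday he ate Carrots",
]

pairs = list(extract_pairs(sample))
for food, animal in pairs:
    print(food, animal)
```

As in the awk version, the current animal is simply remembered until the next "once was" line replaces it.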
The sort command with the -u flag sorts the lines and removes duplicates. Now there are four lines remaining in the pipeline; note that Carrots Bear precedes Carrots Dog, which is the correct order for output:
Apples Dog
Carrots Bear
Carrots Dog
Chicken Bear
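The effect of sort -u on these seven lines can be approximated in Python by building a set and sorting it. This is only a sketch: sort compares whole lines using the current locale's collation rules, while Python compares tuples by code point, so the two can disagree on other inputs even though they agree here.

```python
# The seven food/animal pairs shown after the first awk command.
pairs = [
    ("Apples", "Dog"),
    ("Apples", "Dog"),
    ("Apples", "Dog"),
    ("Carrots", "Dog"),
    ("Carrots", "Bear"),
    ("Carrots", "Bear"),
    ("Chicken", "Bear"),
]

# set() drops duplicates; sorted() orders by food, then by animal.
unique_sorted = sorted(set(pairs))

for food, animal in unique_sorted:
    print(food, animal)
```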
The second awk command groups like foods on a single line, using the principle that “output is a weak form of concatenation.” Now three lines remain in the pipeline:
Apples Dog
Carrots Bear Dog
Chicken Bear
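The grouping step depends on the input already being sorted, so that equal foods sit on adjacent lines. A rough Python equivalent, using itertools.groupby (which likewise only merges adjacent items), might look like this; the function name group_by_first is hypothetical.

```python
from itertools import groupby


def group_by_first(pairs):
    """Collapse adjacent pairs sharing a first element into one row each."""
    rows = []
    for key, group in groupby(pairs, key=lambda pair: pair[0]):
        # Each row is the key followed by every second element in the group.
        rows.append([key] + [value for _, value in group])
    return rows


# The four sorted, de-duplicated pairs from the previous stage.
sorted_pairs = [
    ("Apples", "Dog"),
    ("Carrots", "Bear"),
    ("Carrots", "Dog"),
    ("Chicken", "Bear"),
]

rows = group_by_first(sorted_pairs)
for row in rows:
    print(" ".join(row))
```

If the input were not sorted, the same food could show up in more than one row, which is exactly how the awk version would behave too.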
The third awk command produces the final output. Note the attention to the blank line between output groups, which appears before all groups except the first. Here is the final output:
Food: Apples Animals who ate it: 1
========
Dog

Food: Carrots Animals who ate it: 2
========
Bear
Dog

Food: Chicken Animals who ate it: 1
========
Bear
Yes, this is wildly inefficient in terms of computer usage; it spawns four processes and passes lots of data through a pipeline, where it would clearly be possible to do the whole thing in a single awk program. But each piece is tiny, does exactly one thing, is easy to write, and is a candidate for reuse; the grouping program in the second invocation of awk would clearly be useful in other programs, and the first awk program might be useful if other programs read the same input.
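To make the trade-off concrete, here is a sketch of the single-program alternative, written in Python rather than awk. It is not the author's code; the function name report and the sample input are assumptions, with the sample invented to fit the field positions the awk patterns use.

```python
from collections import defaultdict


def report(lines):
    """Build the whole report in one pass over the input lines."""
    eaters = defaultdict(set)   # food -> set of animals that ate it
    animal = None
    for line in lines:
        fields = line.split()
        if "once was" in line and len(fields) >= 5:
            animal = fields[4]
        if "he ate" in line and len(fields) >= 4:
            eaters[fields[3]].add(animal)

    blocks = []
    for food in sorted(eaters):
        animals = sorted(eaters[food])
        header = "Food: {} Animals who ate it: {}".format(food, len(animals))
        blocks.append("\n".join([header, "========"] + animals))
    # Joining with a blank line puts the separator between groups only.
    return "\n\n".join(blocks)


# Invented sample input; the real animal.txt may be worded differently.
sample = [
    "There once was a Dog",
    "Monday he ate Apples",
    "Tuesday he ate Carrots",
    "There once was a Bear",
    "Monday he ate Carrots",
    "Tuesday he ate Chicken",
]

print(report(sample))
```

Everything now lives in one function, which saves processes but also means the extraction, de-duplication, grouping, and formatting can no longer be reused separately.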
You can run the program at https://ideone.com/enCtmf.
A cute little exercise. I hope it’s not counted as cheating if I tackle it using Julia (1.5.2): https://pastebin.com/UDbyNBhQ
I really had no idea that bears ate carrots!
Here’s a solution in Python.
Example usage:
Python
import re

inFile = open('animal.txt')
dt = dict()   # animal -> set of foods it ate
l = set()     # foods eaten by the current animal
f = ''        # name of the current animal
for line in inFile:
    if re.match("There once was a", line):
        # A new animal starts: store the previous one first.
        if f:
            dt[f] = l
        f = line.split()[-1].capitalize()
        l = set()
    elif re.search("he ate", line):
        s = line.split()[-1].capitalize()
        l.add(s)
dt[f] = l     # store the last animal
inFile.close()

dtr = dict()  # reverse dict: food -> set of animals
for k, v in dt.items():
    for iv in v:
        if iv in dtr:
            dtr[iv].add(k)
        else:
            dtr[iv] = set([k])

for val in dtr:
    print('\nFood: {} Animals who ate it: {}'.format(val, len(dtr[val])))
    print('=======')
    for i in dtr[val]:
        print(i)
ouch (
https://pastebin.com/yh2iQweK
Here’s an AWK version. The food names are capitalized, and the output sorted, to match the example.
Here’s a solution in Haskell using the cool Parsec module in the standard library. I noticed the capitalisation of animals and foods was inconsistent in the example input but I’m just ignoring that. Also it’s not quite obvious how to format code in these comments, so this might look terrible.
(and then I saw the “HOWTO: Posting Source Code” link right at the top of the page. On the plus side, it seems that the “code” tag suffices in place of “sourcecode” and the “lang” property can be omitted for unsupported languages like Haskell)
Here is a solution in Java:
package codekatas;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.TreeSet;
public class AnimalFood {
}