Animal.txt
January 12, 2021
It’s easiest to approach a task like this in pieces, using a pipeline to gently massage the data to its final form:
awk '
/once was/ { animal = $5 }
/he ate/ { print $4, animal } ' animal.txt |
sort -u |
awk '
NR == 1 { prev = $1; printf "%s %s", $1, $2 }
NR > 1 { if (prev == $1) { printf " %s", $2 }
else { prev = $1; printf "\n%s %s", $1, $2 } }
END { printf "\n" } ' |
awk '
{ if (NR > 1) { print "" }
print "Food:", $1, "Animals who ate it:", NF-1
print "========"
for (N=2; N<=NF; N++) { print $N } } '
The first awk command extracts food/animal pairs from the input. After that command, the pipeline consists of seven lines:
Apples Dog
Apples Dog
Apples Dog
Carrots Dog
Carrots Bear
Carrots Bear
Chicken Bear
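For readers less used to awk's pattern-action style, here is a rough Python sketch of that first stage. The sample text is an assumption, reconstructed from the patterns in the awk program and the seven pairs listed above; the real animal.txt may differ in day names and capitalization. The name extract_pairs is made up for illustration.

```python
# Hypothetical sample input, reconstructed from the awk patterns and the
# seven pairs listed above; the real animal.txt may differ.
SAMPLE = """\
There once was a Dog
Wednesday he ate Apples
Thursday he ate Apples
Friday he ate Apples
Saturday he ate Carrots
There once was a Bear
Tuesday he ate Carrots
Wednesday he ate Carrots
Thursday he ate Chicken
"""


def extract_pairs(text):
    """Mimic the first awk stage: return (food, animal) pairs in input order."""
    pairs = []
    animal = None
    for line in text.splitlines():
        fields = line.split()              # awk-style whitespace splitting
        if "once was" in line:
            animal = fields[4]             # awk's $5
        if "he ate" in line:
            pairs.append((fields[3], animal))  # awk's $4
    return pairs


pairs = extract_pairs(SAMPLE)
for food, animal in pairs:
    print(food, animal)
```

As in the awk version, the current animal is simply remembered until the next "once was" line replaces it.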
The sort command with the -u flag sorts the lines and removes duplicates. Now there are four lines remaining in the pipeline; note that Carrots Bear precedes Carrots Dog, which is the correct order for output:
Apples Dog
Carrots Bear
Carrots Dog
Chicken Bear
The second awk command groups like foods on a single line, using the principle that “output is a weak form of concatenation.” Now three lines remain in the pipeline:
Apples Dog
Carrots Bear Dog
Chicken Bear
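The same two steps, sort -u followed by grouping, can be sketched in Python for comparison. This is only an illustration of the idea, not a translation of the awk code: it uses itertools.groupby, which (like the awk program) relies on the input already being sorted so that equal foods are adjacent.

```python
from itertools import groupby

# The seven food/animal pairs produced by the first stage, as listed above.
pairs = [
    ("Apples", "Dog"), ("Apples", "Dog"), ("Apples", "Dog"),
    ("Carrots", "Dog"), ("Carrots", "Bear"), ("Carrots", "Bear"),
    ("Chicken", "Bear"),
]

# Equivalent of sort -u: sort the pairs and drop duplicates.
unique_sorted = sorted(set(pairs))

# Equivalent of the second awk stage: collapse consecutive pairs that
# share a food into one (food, [animals]) entry.
grouped = [
    (food, [animal for _, animal in group])
    for food, group in groupby(unique_sorted, key=lambda pair: pair[0])
]

for food, animals in grouped:
    print(food, *animals)
```

If the pairs were not sorted first, groupby would start a new group every time the food changed, which is exactly the failure mode the awk grouping stage would have as well.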
The third awk command produces the final output. Note the attention to the blank line between output groups, which appears before all groups except the first. Here is the final output:
Food: Apples Animals who ate it: 1
========
Dog

Food: Carrots Animals who ate it: 2
========
Bear
Dog

Food: Chicken Animals who ate it: 1
========
Bear
Yes, this is wildly inefficient in terms of computer usage; it spawns four processes and passes lots of data through a pipeline, where it would clearly be possible to do the whole thing in a single awk program. But each piece is tiny, does exactly one thing, is easy to write, and is a candidate for reuse; the grouping program in the second invocation of awk would clearly be useful in other programs, and the first awk program might be useful if other programs read the same input.
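For contrast, here is roughly what the single-program alternative might look like. It is a sketch in Python rather than awk, the function name report and the input lines are hypothetical, and it assumes the same line shapes the awk patterns match.

```python
from collections import defaultdict


def report(lines):
    """Build the whole report in one pass plus a final formatting loop."""
    eaters = defaultdict(set)      # food -> set of animals that ate it
    animal = None
    for line in lines:
        fields = line.split()
        if "once was" in line:
            animal = fields[4]
        elif "he ate" in line:
            eaters[fields[3]].add(animal)

    blocks = []
    for food in sorted(eaters):
        animals = sorted(eaters[food])
        header = f"Food: {food} Animals who ate it: {len(animals)}"
        blocks.append("\n".join([header, "========", *animals]))
    return "\n\n".join(blocks)     # blank line between groups only


# Hypothetical input lines; the real animal.txt may differ.
lines = [
    "There once was a Dog",
    "Wednesday he ate Apples",
    "Saturday he ate Carrots",
    "There once was a Bear",
    "Tuesday he ate Carrots",
    "Thursday he ate Chicken",
]
output = report(lines)
print(output)
```

It runs in one process, but extraction, deduplication, grouping and formatting are now tangled together in one function, so none of the pieces can be lifted out and reused the way the pipeline stages can.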
You can run the program at https://ideone.com/enCtmf.
A cute little exercise. I hope it’s not counted as cheating if I tackle it using Julia (1.5.2): https://pastebin.com/UDbyNBhQ
I really had no idea that bears ate carrots!
Here’s a solution in Python.
import sys
from collections import defaultdict

lookup = defaultdict(set)  # maps foods to a set of animals
with open(sys.argv[1]) as f:
    for line in f.readlines():
        line = line.strip()
        if line.startswith('The'):  # 'The' identifies 'There once ...'
            animal = line[17:].capitalize()  # len('There once was a ') == 17
        elif line:
            food = line[line.index(' ') + 8:].capitalize()  # len(' he ate ') == 8
            lookup[food].add(animal)

for idx, (food, animals) in enumerate(lookup.items()):
    print(f'Food: {food} Animals who ate it: {len(animals)}')
    print('=======')
    for animal in animals:
        print(animal)
    if idx + 1 < len(lookup):
        print()
Python
import re
inFile = open('animal.txt')
dt = dict()
l = set()
f = ''
for line in inFile:
    if re.match("There once was a", line):
        if f:
            dt[f] = l
        f = line.split()[-1].capitalize()
        l = set()
    elif re.search("he ate", line):
        s = line.split()[-1].capitalize()
        l.add(s)
dt[f] = l
dtr = dict() #reverse dict
for k, v in dt.items():
    for iv in v:
        if iv in dtr:
            dtr[iv].add(k)
        else:
            dtr[iv] = set([k])
for val in dtr:
    print('\nFood: {} Animals who ate it: {}'.format(val, len(dtr[val])))
    print('=======')
    for i in dtr[val]:
        print(i)
ouch (
https://pastebin.com/yh2iQweK
Here’s an AWK version. The food names are capitalized, and the output sorted, to match the example.
/There once was a/ { animal = $5 }
/he ate/ { foods[$4][animal] = 1 }
END {
    PROCINFO["sorted_in"] = "@val_str_asc"
    for (f in foods) {
        printf "Food: %s Animals who ate it: %d\n", raiseCase(f), length(foods[f])
        printf "========\n"
        for (a in foods[f]) {
            printf "%s\n", a
        }
        printf "\n"
    }
}
function raiseCase(s) {
    return toupper(substr(s, 1, 1)) substr(s, 2)
}
Here’s a solution in Haskell using the cool Parsec module in the standard library. I noticed the capitalisation of animals and foods was inconsistent in the example input but I’m just ignoring that. Also it’s not quite obvious how to format code in these comments, so this might look terrible.
module Main where

import Text.Printf (printf)
import Data.List (intercalate, union)
import Data.Map (fromListWith, toList)
import Text.Parsec

main = do
  (Right foods) <- parse parser "" <$> getContents
  let s = map (\(food,animals) ->
                 unlines $ [printf "Food: %s Animals who ate it: %d" food (length animals)
                           ,"======="] ++ animals) $ toList foods
  putStrLn $ intercalate "\n" s

parser = fromListWith union . concat <$> foodAnimals `sepBy` newline

foodAnimals = do
  string "There once was a "
  animal <- many1 letter
  many space
  (do many1 letter
      string " he ate "
      food <- many1 letter
      return (food, [animal])) `endBy` newline
(and then I saw the “HOWTO: Posting Source Code” link right at the top of the page. On the plus side, it seems that the “code” tag suffices in place of “sourcecode” and the “lang” property can be omitted for unsupported languages like Haskell)
Here is a solution in Java:
package codekatas;

import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.TreeSet;

public class AnimalFood {

    class Animal implements Comparable<Animal> {
        public Animal() {
            food = new TreeSet<String>();
        }

        String name;
        TreeSet<String> food;

        public String getName() {
            return name;
        }

        public void setName(String name) {
            this.name = name;
        }

        public TreeSet<String> getFood() {
            return food;
        }

        public void setFood(TreeSet<String> food) {
            this.food = food;
        }

        @Override
        public String toString() {
            return "Animal [name=" + name + ", food=" + food + "]";
        }

        public int compareTo(Animal o) {
            return this.name.compareTo(o.getName());
        }
    }

    public static void main(String[] args) {
        try {
            AnimalFood outerObject = new AnimalFood();
            TreeSet<Animal> animals = new TreeSet<Animal>();
            AnimalFood.Animal innerObject = null;
            FileReader fw = new FileReader(
                    "D:\\WORK\\workspace\\code_kata\\codekatas\\src\\main\\resources\\input\\Animal.txt");
            int data = fw.read();
            StringBuilder readed = new StringBuilder();
            String checkAnimal = "There once was a ";
            String ate = " he ate ";
            // Read the whole file one character at a time.
            while (data != -1) {
                readed.append((char) data);
                data = fw.read();
            }
            fw.close();
            String[] lines = readed.toString().split("\\r?\\n");
            for (String line : lines) {
                if (line.toString().contains(checkAnimal)) {
                    String[] split = line.split(checkAnimal);
                    innerObject = outerObject.new Animal();
                    innerObject.setName(split[1].trim());
                    animals.add(innerObject);
                }
                if (line.contains(ate)) {
                    String food = line.split(ate)[1];
                    innerObject.food.add(food.trim());
                }
            }
            // Collect every distinct food, sorted.
            TreeSet<String> fruits = new TreeSet<String>();
            for (Animal row : animals) {
                fruits.addAll(row.getFood());
            }
            for (String fruit : fruits) {
                System.out.println("Food: " + fruit + " Animals who ate it: " + count(animals, fruit));
                System.out.println("========");
                printAnimal(animals, fruit);
                System.out.println("");
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static int count(TreeSet<Animal> animals, String fruit) {
        int result = 0;
        for (Animal animal : animals) {
            if (animal.getFood().contains(fruit)) {
                result++;
            }
        }
        return result;
    }

    private static int printAnimal(TreeSet<Animal> animals, String fruit) {
        int result = 0;
        for (Animal animal : animals) {
            if (animal.getFood().contains(fruit)) {
                System.out.println(animal.getName());
            }
        }
        return result;
    }
}