Favorite Color

November 11, 2014

Little database problems like this can be solved by big Scheme programs or little Awk programs; despite my fondness for Scheme, I solve problems like this using Awk:

awk ' /^favoritecolor: / { color[$2]++ }
      END { for (c in color) print color[c], c } ' database |
sort -rn |
sed '1q' |
awk ' { print $2 } '

The first line finds all favoritecolor database fields and counts the occurrences of each color. The second line prints each color/count combination on a separate line. The third line sorts in reverse numeric order. The fourth line selects the first (maximal) color. The fifth line strips the count and prints the color. Easy to do, and written as fast as I can type.
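For readers who want to trace the five stages outside the shell, here is a rough Python sketch of the same idea. The function name favorite_color and the sample records are assumptions made for illustration; they are not part of the exercise or of the original database. Unlike the Awk version, this sketch skips a favoritecolor field that has no value.

```python
from collections import Counter

def favorite_color(lines):
    """Return the most common favoritecolor value, or None if there are none."""
    counts = Counter()
    for line in lines:
        # Same test as the Awk pattern /^favoritecolor: /
        if line.startswith("favoritecolor: "):
            fields = line.split()
            # fields[1] corresponds to $2 in Awk; skip a field with no value
            if len(fields) > 1:
                counts[fields[1]] += 1
    if not counts:
        return None
    # most_common(1) stands in for the sort -rn | sed 1q stages,
    # and [0][0] strips the count, like the final awk
    return counts.most_common(1)[0][0]

# Hypothetical sample records, one field per line
sample = [
    "name: alice", "favoritecolor: blue",
    "name: bob", "favoritecolor: red",
    "name: carol", "favoritecolor: blue",
]
print(favorite_color(sample))  # prints blue
```

Like the pipeline, this reports a single winner and says nothing about ties.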

I wrote this little exercise because I had a problem at work today that could be solved in a manner similar to this. It’s a reminder that we don’t always need big programs; sometimes, a little program will do the job just as well.

You can see the program at http://programmingpraxis.codepad.org/cFVZnPVP. If you have some other little program, you might want to share it with the rest of us.


3 Responses to “Favorite Color”

  1. Or you can do it in Perl…

    perl -e '$x{$_}++ foreach map {m{favoritecolor: (.*)} ? $1 : ()} <>; print [sort {$x{$a} <=> $x{$b}} keys %x ]->[-1],"\n";' file.txt
  2. Jussi Piitulainen said

    Nah. I thought of Awk, and I thought of Python’s Counter objects, but this is what I would actually do, including the output of more than one line, to see if there is a tie or a near tie.

    $ grep -E '^favoritecolor:' particular.txt | cut -d ' ' -f 2 | sort | uniq -c | sort -nr | head

  3. Mike said

    Unfortunately, my UNIX command line is so rusty that I’d use Python.

    Assumes one name/value pair per line in the file (problem doesn’t specify). Didn’t bother splitting the name/value pair, because it doesn’t change the counts. Outputs a sorted list of all the colors and their counts.

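    from collections import Counter
    import re
    # match() anchors at the start of the line, so only favoritecolor fields pass
    match = re.compile(r'favoritecolor: .+').match
    with open('./testdb.txt', 'rt') as f:
        # Count whole lines (name and value together) and print them, most common first
        print(Counter(filter(match, f)).most_common())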
