Form Letters
November 30, 2010
Welcome back, Jane!
We hope that you and all the members
of the Public family are constantly
reminding your neighbors there
on Maple Street to shop with us.
As usual, we will ship your order to
Ms. Jane Q. Public
600 Maple Street
Your Town, Iowa 12345
Everybody hates form letters. But they are part of the computing universe, and today’s exercise asks you to print them. Input to the form letter generator comes in two parts. First, there is a schema that defines the letter to be written. Here is the schema for the letter shown above:
Welcome back, $1!
We hope that you and all the members
of the $0 family are constantly
reminding your neighbors there
on $5 to shop with us.
As usual, we will ship your order to
$3 $1 $2. $0
$4 $5
$6, $7 $8
Variable text is identified as $n, where n is the field number from a database; although it’s not shown above, n can be larger than 9, extending rightward until a non-digit is encountered. Also not shown above is the construct $$, which prints a literal dollar sign.
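One way to read these rules is as a single left-to-right scan for two kinds of tokens, `$$` and `$` followed by digits. The Python 3 sketch below is only an illustration of that reading, not the suggested solution; the names `TOKEN` and `fill` are made up for this example, and a lone `$` that is followed by neither a digit nor another `$` is simply left alone, which is an assumption the exercise does not settle.

```python
import re

# Matches "$$" (a literal dollar sign) or "$" followed by one or more digits.
# Group 1 holds the digits; it is None when the token is "$$".
TOKEN = re.compile(r"\$(?:\$|(\d+))")

def fill(schema, fields):
    """Expand $n and $$ in schema, taking values from the list fields."""
    def expand(match):
        digits = match.group(1)
        if digits is None:
            return "$"              # "$$" prints one dollar sign
        return fields[int(digits)]  # "$n" prints field number n
    return TOKEN.sub(expand, schema)

record = ["Public", "Jane", "Q", "Ms.", "600",
          "Maple Street", "Your Town", "Iowa", "12345"]
print(fill("$3 $1 $2. $0\n$4 $5\n$6, $7 $8", record))
```

Because the digits are matched greedily, `$10` is read as field 10 rather than field 1 followed by a zero, and `$$1` comes out as the text `$1`.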
The data comes from a comma-separated values file, of the type we have previously encountered. In this case, records have nine fields: last name, first name, middle initial, title, street number, street name, city, state, and zip code. Here is a sample two-record data file:
Public,Jane,Q,Ms.,600,Maple Street,Your Town,Iowa,12345
Smith,John,Z,Dr.,1234,Main Street,Anytown,Missouri,63011
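For readers who want to check the field numbering, here is a small Python 3 sketch that parses the two sample records with the standard `csv` module. The sample text is inlined as a string so the example runs on its own; `SAMPLE` and `read_records` are names invented for this illustration. Passing `skipinitialspace=True` is a defensive choice, so that a stray space after a comma does not end up inside a field.

```python
import csv
import io

SAMPLE = (
    "Public,Jane,Q,Ms.,600,Maple Street,Your Town,Iowa,12345\n"
    "Smith,John,Z,Dr.,1234,Main Street,Anytown,Missouri,63011\n"
)

def read_records(source):
    """Return every non-empty CSV record as a list of field strings."""
    reader = csv.reader(source, skipinitialspace=True)
    return [row for row in reader if row]

records = read_records(io.StringIO(SAMPLE))
print(len(records[0]))    # 9 fields per record
print(records[0][6:9])    # city, state, zip code
```

Field 0 is the last name and field 8 is the zip code, which is why the schema's last line reads `$6, $7 $8`.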
Your task is to write a program that takes a schema and a data file and writes a series of form letters. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
My Haskell solution (see http://bonsaicode.wordpress.com/2010/11/30/programming-praxis-form-letters/ for a version with comments):
import Control.Applicative ((*>), (<*>), (<$>))
import Text.CSV
import Text.Parsec

fillWith :: String -> [String] -> String
fillWith text vars = either show concat $ parse form "" text where
    form = many $ escape <|> count 1 anyChar
    escape = char '$' *> (string "$" <|> ((vars !!) . read <$>
                 option "0" (many1 digit)))

formLetters :: FilePath -> FilePath -> IO [String]
formLetters schema vars = either (return . show) . map . fillWith <$>
    readFile schema <*> parseCSVFromFile vars

My (slightly golfed up) Ruby version:
d = File.read('input.csv').split("\n").map {|line| line.split(",")}
s = File.read('letter.schema')
d.each {|r| puts s.scan(/\$\d+/).inject(s) {|a,m| a = a.gsub(m,r[m[/\d+/].to_i])}}
A ruby version …
require 'csv'

form = File.read(ARGV[0])
CSV.foreach(ARGV[1]) do |row|
  form_filled = form
  row.each_with_index do |v, i|
    form_filled = form_filled.gsub("$#{i}", v)
  end
  puts "#{form_filled}"
end

If you’re using Ruby 1.8, then require fastercsv.
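One thing to watch with field-by-field substitution like the loop above: the text `$1` is also the start of `$10`, so the order of replacement matters once a record has more than ten fields. The Python 3 sketch below shows the effect on a made-up template and a made-up eleven-field record; neither comes from the exercise.

```python
template = "field one: $1, field ten: $10"
row = list("abcdefghijk")   # hypothetical record: field 0 is "a", field 10 is "k"

# Ascending order: "$1" is replaced first and also rewrites the front of "$10".
ascending = template
for i, value in enumerate(row):
    ascending = ascending.replace(f"${i}", value)

# Descending order: "$10" is replaced before "$1" gets a chance to touch it.
descending = template
for i in range(len(row) - 1, -1, -1):
    descending = descending.replace(f"${i}", row[i])

print(ascending)    # field one: b, field ten: b0
print(descending)   # field one: b, field ten: k
```

Replacing from the highest field number down avoids the clash, though it still does nothing about `$$`.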
OK, both mine and Chris’ have the same two issues in that they don’t work with > 10 elements and the $$ doesn’t work. This one should work (it’s uglier, but works as is often the case). It works backwards down the list so that $10 will be subbed out before $1 and it will leave a “$$” alone. At the end it changes the “$$” to a single “$”.
require 'csv'

form = File.read(ARGV[0])
CSV.foreach(ARGV[1]) do |row|
  form_filled = form
  (row.size-1).downto(0) do |i|
    # Block form, so that $1 refers to the current match
    form_filled = form_filled.gsub(/([^$])\$#{i}/) { "#{$1}#{row[i]}" }
  end
  form_filled.gsub!(/\$\$/, "$")
  puts "#{form_filled}"
end

<?php
function formLetters($schema_file,$csv_inputfile) {
    $s = file_get_contents($schema_file);
    $dcount = 0;
    if (($handle = fopen($csv_inputfile, "r")) !== FALSE) {
        while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
            if ($dcount == 0) $dcount = count($data);
            echo preg_replace(array('/\$(\d+)/e','/\$\$/'), array('$data[\\1]','$'), $s) . "\n";
        }
    }
}
formLetters('/dev/form1.schema','/dev/data.csv');
?>

Left the $dcount variable in mine by accident; removing it would clean it up by two lines.
Works with “$$” and more than 10 fields.
#!/usr/bin/env perl
($a,$b)=@ARGV;open($S,$a);$s=join/\n/,<$S>;open
($D,$b);while(chomp($_=<$D>)){@d=split/,/;$_=$s
;s/\$(\d+)/$d[$1]/g;s/\$\$/\$/g;print}
A bit longer than everyone else’s. My answer can deal with arbitrarily many elements, but handles only the subcase of the $$ problem where no other $n remains after a $$ in a line of the schema.
I’ve included the imports, hashbang line, and the test at the end (copy-pasted schema and data from first page):
#!/usr/bin/env python2.6
import csv
from string import digits

def form_letter(schema, data):
    reader = csv.reader(open(data))
    s = open(schema)
    lines = s.readlines()
    s.close()
    for row in reader:
        output = ''
        for line in lines:
            x = 0
            while x != -1:
                x = line.find("$", x)
                i = x+1
                while line[i] in digits:
                    i += 1
                if i != x+1:
                    # swap out $n with data's row[n]
                    n = int(line[x+1:i])
                    line = line[:x] + row[n] + line[i:]
                else:
                    break
                    # handles $$ case, as long as no other numbers are
                    # after it
            output = output + line
        print output + "\n"
    return

if __name__ == "__main__":
    form_letter("schema.txt", "data.txt")

Graham: Read the input a character at a time instead of a line at a time, and you won’t have a problem with $n following $$ on the same schema line.
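The character-at-a-time idea might look something like the following Python 3 sketch. It is one possible reading of the advice, not code from the suggested solution; the function name `expand` is invented here, and keeping a lone `$` as a literal character is an assumption, since the exercise does not say what should happen in that case.

```python
DIGITS = "0123456789"

def expand(schema, row):
    """Scan schema one character at a time, expanding $n and $$."""
    out = []
    i = 0
    while i < len(schema):
        ch = schema[i]
        if ch != "$":
            out.append(ch)              # ordinary character: copy it
            i += 1
        elif schema[i + 1:i + 2] == "$":
            out.append("$")             # "$$" becomes one dollar sign
            i += 2
        else:
            j = i + 1
            while j < len(schema) and schema[j] in DIGITS:
                j += 1                  # extend rightward over the digits
            if j == i + 1:
                out.append("$")         # assumption: lone "$" stays as-is
            else:
                out.append(row[int(schema[i + 1:j])])
            i = j
    return "".join(out)

row = list("abcdefghijk")               # hypothetical eleven-field record
print(expand("$$ $1 $$1 $10x", row))    # $ b $1 kx
```

Each `$` is consumed exactly once as the scan moves forward, so a `$n` that follows a `$$` on the same line is still expanded.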
I came up with the following version, which can handle many elements but still fails when the schema contains “invalid” markup such as $$$$1, where the number of $s is unbalanced.
I also used some map/lambda foo to make it more interesting :)
#!/usr/bin/env python2.6
import csv, re

def form_letter(schema, data):
    tmpl = open(schema).read()
    # $n that is not preceded by another $
    v_re = re.compile(r'(?<!\$)\$\d+')
    order = map(lambda key: int(key[1:]), v_re.findall(tmpl))
    tmpl = v_re.sub('%s', tmpl).replace('$$', '$')
    reader = csv.reader(open(data))
    for row in reader:
        print tmpl % tuple(map(lambda key: row[key], order))

if __name__ == '__main__':
    form_letter('schema.txt', 'data.csv')

import string, csv, re

def form_letter(letter, data):
    input = csv.reader(open(data))
    # rename $n to $_n so that string.Template accepts it as an identifier
    template = string.Template(re.sub(r"\$([0-9]+)", r"$_\1", open(letter).read()))
    for row in input:
        context = dict([("_%s" % j, k) for j, k in enumerate(row)])
        print template.substitute(**context)