Comma Quibbling
September 22, 2017
Eric Lippert, at his blog Fabulous Adventures in Coding, discusses the problem of comma quibbling, which turns a list of words like (“ABC” “DEF” “G” “H”) into the string “ABC, DEF, G and H” with commas between each pair of words except the last pair, which gets the word “and” (without a comma). Here are the rules:
- If the input is empty, so is the output.
- If the input has a single word, so does the output.
- If the input has two words, the output is the two words separated by “and”.
- If the input has more than two words, the output is all the words, separated by commas, but with the last comma replaced by “and”.
A word is a maximal sequence of characters not containing a comma or a space.
Your task is to write a function that adds commas to a list of words, using the rules described above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Quick one in perl using regexs….
sub quibble { return "@_" =~ s{\s*,?\s+}{, }rg =~ s{(.*), }{\1 and }r,"\n"; } print quibble('ABC'),"\n"; print quibble('ABC','DEF'),"\n"; print quibble('ABC','DEF','G','HI'),"\n"; print quibble('ABC DEF G HI'),"\n"; print quibble('ABC , DEF G HI'),"\n"; print quibble('ABC, DEF G HI'),"\n";MUMPS/Cache
zjsg ; q ; quibble(str) ; n len s len=$l(str," ") q $s(len=1:str,len=2:$p(str," ")_" and "_$p(str," ",2),1:$$quibble2(str,len)) ; quibble2(str,len) ; n i,str2 s str2=$p(str," ") f i=2:1:len-2 s str2=str2_","_$p(str," ",i) q str2_$p(str," ",len-1)_" and "_$p(str," ",len)f i=”a”,”a b”,”a b cd e f g” w !,i,?15,$$quibble^zjsg(i)
a a
a b a and b
a b cd e f g a,b,cd,ef and g
Here’s a solution in Python.
def quibble(words): return ", ".join(words) if len(words) != 2 else "{} and {}".format(*words) print quibble([""]) print quibble(["ABC"]) print quibble(["ABC", "DEF"]) print quibble(["ABC", "DEF", "G"]) print quibble(["ABC", "DEF", "G", "H"])Output:
I didn’t read rule 4 carefully. With more than two words, I was outputting a comma between the last words, as opposed to “and”. Here’s an updated solution.
def quibble(words): return ", ".join(words[:-2] + [" and ".join(words[-2:])]) print quibble([""]) print quibble(["ABC"]) print quibble(["ABC", "DEF"]) print quibble(["ABC", "DEF", "G"]) print quibble(["ABC", "DEF", "G", "H"])Here’s a Python solution that follows the requirement in the original blog, to only use the ability to enumerate the input (in Python terms, the function allows any iterator).
def quibble(strings): # Get an iterator, so we can use next() strings = iter(strings) # Prime the pump - we keep the "head" (which is the comma-separated # elements we know we will be outputting), and the "tail" (which is # the last element we've seen, which might need an "and"). # Use a list for the head so we avoid quadratic runtime due to repeated # string concatenation. head = [next(strings, '')] try: tail = next(strings) except StopIteration: # Less than 2 elements, so we just return what we got return head[0] # Add elements one by one, putting the previous tail onto the head # as we now know there's more. for s in strings: head.append(tail) tail = s # Return the expected result return ', '.join(head) + ' and ' + tail if __name__ == '__main__': assert quibble([]) == '' assert quibble(['ABC']) == 'ABC' assert quibble(['ABC', 'DEF']) == 'ABC and DEF' assert quibble(['ABC', 'DEF', 'G', 'H']) == 'ABC, DEF, G and H'In modern versions of Python, it’s possible to use extended tuple unpacking to do this much more simply:
def quibble(strings): *head, tail = strings return ', '.join(head) + ' and ' + tailThis needs some special casing for the 0 and 1 element sequence cases, which I’ve omitted (trap ValueError for the 0-element case, and check for an empty head for the 1-element case).
(defun commated-list (words) (format nil "~[~;~*~A~;~{~A~} and ~A~:;~{~A~^, ~} and ~A~]" (length words) (butlast words) (first (last words)))) (mapcar (function commated-list) '(() ("Apple") ("Apple" "Banana") ("Apple" "Banana" "Cherry") ("Apple" "Banana" "Cherry" "Date"))) ;; --> ("" "Apple" "Apple and Banana" "Apple, Banana and Cherry" "Apple, Banana, Cherry and Date")A direct translation of the rules into Haskell.
Can someone explain why in C# (1) is O(N^2) but (2) is O(N)?
string[] x = an array of length N
string result = “”;
for(int i = 0; i < N; i++)
result = result + “, ” + x[i];
2.
string[] x = an array of length N
string result = String.Join(“, “, x);
Basically, with (1), every time you do result = result + “, ” + x[i] you are reallocating result, which means a copy of everything you’ve created so far. So, you copy the whole string N times, and the string is length N (technically, you’re copying items of length 1, 2, … N, so the average length is N/2, but a constant factor doesn’t affect the result).
With (2) you allocate the string once, at the end, avoiding the recopying.
def add_commas_noserial(seq): seq = iter(seq) for a in seq: break else: return for b in seq: break else: yield a return for c in seq: yield a yield ", " a, b = b, c yield a yield " and " yield b