Double Double Words

October 13, 2015

Today’s task is to write a program that reads a file and reports any doubled words, that is, the same word appearing twice in a row. It is a useful tool for anyone who does a lot of writing, as I do on this blog.

Your task is to write a program to find doubled words. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
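If you want a starting point, here is a minimal sketch of one possible approach in Python. It is not the suggested solution. The function name `find_doubled_words` and the `WORD` pattern are my own assumptions for illustration: a "word" is taken to be a run of letters and apostrophes, comparison ignores case, and the previous word is remembered across line breaks so that a double split over two lines is still caught.

```python
import re
import sys

# Assumed definition of a "word": letters and apostrophes only.
WORD = re.compile(r"[A-Za-z']+")


def find_doubled_words(lines):
    """Yield (line_number, word) for each word that repeats the word before it.

    The previous word is carried across line breaks, so a double that is
    split over two lines is reported on the second line.
    """
    prev = None
    for line_no, line in enumerate(lines, start=1):
        for match in WORD.finditer(line):
            word = match.group().lower()  # compare case-insensitively
            if word == prev:
                yield line_no, word
            prev = word


# Run as a script with a file name argument to scan that file.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as f:
        for line_no, word in find_doubled_words(f):
            print(f"{line_no:4d} {word}")
```

Because punctuation is ignored, this sketch also flags a repeat across a sentence boundary such as "end. End", which may or may not be what you want.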



6 Responses to “Double Double Words”

  1. Perfect example where perl rocks!

    while(<>){$l++;foreach(split/\W+/,lc$_){printf "%4d %s\n",$l,$_ if$x eq$_;$x=$_;}}
  2. Rutger said


    from collections import Counter
    import re
    text = """   Assassin beef noodles savant human chrome order-flow 
    lights neural physical render-farm post-stimulate fluidity skyscraper 
    8-bit. Free-market physical vinyl towards nano-Tokyo sign render-farm. 
    Decay digital katana disposable apophenia modem dissident narrative. 
    Soul-delay euro-pop vinyl pre-ablative market bridge sunglasses dead 
    youtube hotdog rebar claymore mine. """
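    # Count every word in the text; this reports words that occur more than
    # once anywhere in the text, not only adjacent repeats.
    c = Counter(split for line in text.splitlines() for split in re.sub(r"\W", " ", line).split())
    print([word for word in c if c[word] > 1])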
  3. mcmillhj said

    Alternate Perl solution:

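    use strict;
    use warnings;

    # Slurp the whole input into one string. The read itself is missing from
    # the comment as posted; reading from <> is an assumption.
    my $text = do {
       local $/ = undef;
       <>;
    };
    my $line_no = 1;
    # Repeatedly match the first two words, then strip the first one.
    while ( my ($w1,$sep,$w2) = $text =~ m/(\w+)(\W+)(\w+)/ ) {
       $text =~ s/$w1\W+//;
       $line_no += ( $sep =~ tr/\n// );   # count every newline in the separator
       printf "%04d %s\n", $line_no, $w1 if $w1 eq $w2;
    }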
  4. Mike said

    Here’s my Python version:

    It uses fileinput from the standard library to handle opening and closing the files given on the command line; fileinput also keeps track of the file name and line number. A regex finds the words in each line.

    If a repeated word is found, the program prints the word, the line number(s), and a portion of the line(s) surrounding the repeated word for context.

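    import fileinput
    import re

    # Assumed word pattern; the original definition of `pat` is missing from the post.
    pat = re.compile(r"\w+")

    with fileinput.input() as f:
        for line in f:
            line = line.rstrip()
            if fileinput.isfirstline():
                # Reset state at the start of each file.
                prevline = ''
                prevword = None
            firstword = True
            for match in pat.finditer(line):
                # Lowercasing is inferred from the example output ('if' matches "If if").
                word = match.group().lower()
                if word == prevword:
                    b, e = match.span()
                    lineno = fileinput.filelineno()
                    fmt = "\t'{}' at {}: ...{}..."
                    if firstword:
                        # The repeat straddles a line break: show the end of the previous line too.
                        context = prevline[-15:] + ' ' + line[:e+10]
                        where = "lines {}-{}".format(lineno-1, lineno)
                    else:
                        # max() keeps the slice start from going negative near the line start.
                        context = line[max(0, b-15):e+10]
                        where = "line {}".format(lineno)
                    print(fmt.format(word, where, context))
                prevword = word
                firstword = False
            prevline = line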

    Example output:

    	'a' at lines 2-3: ...upon a a time. The...
    	'of' at lines 4-5: ...was a test of of the emerg...
    	'if' at lines 6-7: ...cast system. If if there had...
    	'been' at line 7: ...there had been been...
  5. maroonedsia said

    string text = File.ReadAllText("file.txt");
    string[] words = text.Split(' ', '\t', '\n');
    string output = "";
    for (int i = 0; i < words.Length - 1; i++)
        if (words[i] == words[i + 1])
            output += "Word Index: " + i.ToString() + ", " +
                "Word: " + words[i] + "\n";

  6. maroonedsia said

    1. I don’t know how to format the text as code.
    2. For some reason, the “+” was removed from some lines; for example, the correct code is: if (words[i] == words[i+1])
    3. Why can’t I edit my comment to correct it?! :D
