Double Double Words

October 13, 2015

Today’s task is to write a program that reads a file and reports any doubled words, a useful tool for anyone who does a lot of writing, as I do on this blog.

Your task is to write a program to find doubled words. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
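To make the task concrete, here is a minimal sketch in Python. It is not the suggested solution, just one reading of the exercise: the function name `doubled_words` and the `WORD` pattern are my own assumptions, words are compared case-insensitively, and a word that ends one line and starts the next counts as doubled.

```python
import re

# Assumption: a "word" is a run of letters and apostrophes.
WORD = re.compile(r"[A-Za-z']+")

def doubled_words(lines):
    """Yield (line_number, word) for each word equal to the word before it."""
    prev = None
    for lineno, line in enumerate(lines, start=1):
        for match in WORD.finditer(line):
            word = match.group().lower()
            if word == prev:
                yield lineno, word
            # Carry the last word forward so repeats across a line break are seen.
            prev = word

sample = ["Once upon a", "a time there was was a test."]
for lineno, word in doubled_words(sample):
    print(f"{lineno:4d} {word}")
```

Because the function accepts any iterable of lines, an open file object can be passed in place of the sample list.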


6 Responses to “Double Double Words”

  1. Perfect example where perl rocks!

    while(<>){$l++;foreach(split/\W+/,lc$_){printf "%4d %s\n",$l,$_ if$x eq$_;$x=$_;}}
    
  2. Rutger said

    Python

    from collections import Counter
    import re
    
    text = """   Assassin beef noodles savant human chrome order-flow 
    lights neural physical render-farm post-stimulate fluidity skyscraper 
    8-bit. Free-market physical vinyl towards nano-Tokyo sign render-farm. 
    Decay digital katana disposable apophenia modem dissident narrative. 
    Soul-delay euro-pop vinyl pre-ablative market bridge sunglasses dead 
    youtube hotdog rebar claymore mine. """
    
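    # Note: this reports words that occur more than once anywhere in the text,
    # not only words that are immediately repeated.
    c = Counter(split for line in text.splitlines() for split in re.sub(r"[^\w]", " ",  line).split())
    print([word for word in c if c[word] > 1])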
    
  3. mcmillhj said

    Alternate Perl solution:

    use strict; 
    use warnings; 
    
    my $text = do {
       local $/ = undef;
       <>;
    };
    
    my $line_no = 1;
    while ( my ($w1,$sep,$w2) = $text =~ m/(\w+)(\W+)(\w+)/ ) {
       $text =~ s/$w1\W+//;
       $line_no += ($sep =~ tr/\n//);  # count every newline in the separator, not only a bare "\n"
       printf "%04d %s\n", $line_no, $w1 if $w1 eq $w2;
    }
    
  4. Mike said

    Here’s my Python version:

    Uses fileinput from the standard library to handle opening and closing the files given on the command line; it also keeps track of the file name and line number. A regex finds the words in each line.

    If a repeated word is found, the program prints the word, the line number(s), and a portion of the line(s) surrounding the repeated word for context.

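    import fileinput
    import re
    
    # Assumption: the comment does not show how `pat` is defined;
    # here a word is taken to be a run of letters and apostrophes.
    pat = re.compile(r"[A-Za-z']+")
    
    with fileinput.input() as f: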
        for line in f:
            line = line.rstrip()
    
            if fileinput.isfirstline():
                print(fileinput.filename())
                prevline = ''
                prevword = None
    
            firstword = True
            for match in pat.finditer(line):
                word = match.group().lower()
                if word == prevword:
                    b, e = match.span()
                    lineno = fileinput.filelineno()
                    fmt = "\t'{}' at {}: ...{}..."
                    if firstword:
                        context = prevline[-15:] + ' ' + line[:e+10]
                        where = "lines {}-{}".format(lineno-1, lineno)
                    else:
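                        # Clamp the start so a match near the start of the line
                        # does not produce a negative slice index.
                        context = line[max(b-15, 0):e+10]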
                        where = "line {}".format(lineno)
    
                    print(fmt.format(word, where, context))
    
                prevword = word
                firstword = False
    
            prevline = line
    

    Example output:

    C:/projects/testdata.txt
    	'a' at lines 2-3: ...upon a a time. The...
    	'of' at lines 4-5: ...was a test of of the emerg...
    	'if' at lines 6-7: ...cast system. If if there had...
    	'been' at line 7: ...there had been been...
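For readers who have not used fileinput, the bookkeeping described above can be sketched as follows. This is an illustration only, not part of the comment's program: the helper name `trace_lines` and the temporary files are assumptions made for the demo.

```python
import fileinput
import os
import tempfile

def trace_lines(paths):
    """Hypothetical helper: record what fileinput reports for every line read."""
    records = []
    with fileinput.input(files=paths) as f:
        for line in f:
            records.append((
                os.path.basename(fileinput.filename()),  # file currently being read
                fileinput.filelineno(),                  # line number within that file
                fileinput.isfirstline(),                 # True on the first line of each file
                line.rstrip("\n"),
            ))
    return records

# Demo on two throwaway files; the names and contents are made up.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.txt", "one\ntwo\n"), ("b.txt", "three\n")]:
        path = os.path.join(tmp, name)
        with open(path, "w", encoding="utf-8") as out:
            out.write(text)
        paths.append(path)
    for record in trace_lines(paths):
        print(record)
```

The line number restarts at 1 for each file and `isfirstline()` is true once per file, which is what lets a program print a per-file header and reset its state between files.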
    
  5. maroonedsia said

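    string text = File.ReadAllText("file.txt");
    // Split on any whitespace and drop empty entries, so that runs of
    // spaces or Windows line endings do not produce false matches.
    string[] words = text.Split(new[] { ' ', '\t', '\r', '\n' },
        StringSplitOptions.RemoveEmptyEntries);
    string output = "";
    
    for (int i = 0; i < words.Length - 1; i++)
    {
        if (words[i] == words[i + 1])
        {
            output += "Word Index: " + i.ToString() + ", " +
                "Word: " + words[i] + "\n";
        }
    }
    
    MessageBox.Show(output);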

  6. maroonedsia said

    1. I don’t know how to format the text as code.
    2. For some reason, the “+” is removed from some lines; for example, the correct code is: if (words[i] == words[i+1])
    3. Why can’t I edit my comment to correct it?! :D
