Remove Singleton

June 6, 2014

I’m not sure if this is a homework exercise or an interview questions, but it’s a fun little exercise to get your creative juices flowing on a Friday morning:

Given a string and a character, remove from the string all single occurrences of the character, while retaining all multiple consecutive instances of the character. For instance, given string “XabXXcdX”, removing singleton Xs leaves the string “abXXcd”.

Your task is to write the program given above; be sure to properly test your work. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Pages: 1 2

17 Responses to “Remove Singleton”

  1. concat . filter (/=[x]) . group

  2. matthew said
    #include <stdio.h>
    #include <string.h>
    
    void f(char *p, char c)
    {
       char *q = p;
     state0:
       if (*p == c) {
          p++;
          goto state1;
       }
       goto state3;
     state1:
       if (*p == c) {
          *q++ = c; *q++ = c;
          p++;
          goto state2;
       }
       goto state3;
     state2:
       if (*p == c) {
          *q++ = *p++;
          goto state2;
       }
     state3:
       if (*p == 0) {
          *q = 0;
          return;
       } else {
          *q++ = *p++;
          goto state0;
       }
    }
    
    int main(int argc, char *argv[])
    {
       char p[64];
       strcpy(p,argv[1]);
       char c = *argv[2];
       f(p,c);
       printf("%s\n", p);
    }
    
  3. Michael D said

    In Java:

    “XabXXcdX”.replaceAll(“(^|[^X])X([^X]|$)”, “$1$2”)

  4. Lev Lozhkin said

    Here’s a regex-based one in Python. It uses negative lookahead and lookbehind assertions rather than set complements like the Java one above.

    import re, sys
    re.sub(r"(?<!X)(X)(?!\1)", "", sys.argv[1])
    

    If we had a non-trivial singleton pattern to find, it would be better to use back-references in the lookbehind assertion, but Python’s RE module (and many other languages’ as well) does not support this. That would look something like this:

    re.sub(r"(X)(?<!\1\1)(?!\1)", "", sys.argv[1])
    

    The state-machine and set complement approaches should be quicker for simple singleton patterns (like a single known character). For non-trivial singleton patterns, this is probably the simplest general solution.

  5. Mike said

    Two python solutions. Both break the input string into runs of the same character and then removes/filters runs that match the character to be deleted.

    import itertools as it
    import re
    
    def func1(s, x):
        return re.sub(x+'+', lambda mo: '' if mo.group() == x else mo.group(),s)
    
    def func2(s, x):
        return ''.join(filter(x.__ne__, (''.join(v) for _,v in it.groupby(s))))
    
  6. Ruby, iterating over strings like it’s 1991:

    inputString = 'XabXXcdX'
    outputString = ''
    singleton = 'X'
    previous = ''
    
    inputString.each_char { |current|     
        if previous == singleton && current == singleton then 
          outputString += previous + current
        elsif previous != singleton then
            outputString += previous unless previous == ''  
        end   
        previous = current    
    }                              
    puts outputString 
    
  7. matthew said

    Here’s a nice little Markov program (see http://matthewarcus.wordpress.com/2014/01/08/markov-computation/):

    $ cat sing.mkv
    0aa => a1a
    0a => 0
    0b => b0
    1a => a1
    1b => b0
    0 =>.
    1 =>.
      => 0
    $ ./markov abaabababaa < sing.mkv
    abaabababaa
    0abaabababaa
    0baabababaa
    b0aabababaa
    ba1abababaa
    baa1bababaa
    baab0ababaa
    baab0babaa
    baabb0abaa
    baabb0baa
    baabbb0aa
    baabbba1a
    baabbbaa1
    baabbbaa
    
  8. matthew said

    That Java regex solution doesn’t quite work since the patterns overlap:

    $ java Foo aXaaXa
    aaaa
    $ java Foo aXaXa
    aaXa
    

    This seems to work (using the idea of Lev’s solution):

    public class Foo {
        public static void main(String[] args) throws Exception {
            String s = args[0];
            System.out.println(s.replaceAll("(?<!X)(X)(?!X)", ""));
        }
    }
    
  9. […] I was going through my RSS feed the other day, I came across this Programming Praxis problem and thought right away that the solution called for a state machine. It’s one of those […]

  10. Here is my solution in Python. Please review my code and let me know if I can improve my codes.

    def removesingleton(strs,chrs):
    ”’
    (str,str)->(str)
    Returns str with removed single chr
    >>>removesingleton(“Welcomee”,”e”)
    Wlcomee
    >>>removesingleton(“XabXXaX”,”X”)
    abXXa
    ”’
    output, inner_index, outer_index = “”, 0,0
    for chars in strs:
    if outer_index <= len(strs)-1:
    if outer_index == strs.index(chars,outer_index):
    if chars == chrs:
    inner_index = outer_index
    while inner_index+1 < len(strs) and strs[inner_index+1] == chrs:
    inner_index += 1
    output += strs[outer_index:inner_index+1]
    outer_index = inner_index
    else:
    output += strs[outer_index]
    outer_index += 1
    return output

  11. def removesingleton(strs,chrs):
    '''
    (str,str)->(str)
    Returns str with removed single chr
    >>>removesingleton("Welcomee","e")
    Wlcomee
    >>>removesingleton("XabXXaX","X")
    abXXa
    '''
    output, inner_index, outer_index = "", 0,0
    for chars in strs:
    if outer_index <= len(strs)-1:
    if outer_index == strs.index(chars,outer_index):
    if chars == chrs:
    inner_index = outer_index
    while inner_index+1 < len(strs) and strs[inner_index+1] == chrs:
    inner_index += 1
    output += strs[outer_index:inner_index+1]
    outer_index = inner_index
    else:
    output += strs[outer_index]
    outer_index += 1
    return output

    print removesingleton("XXXabXXXcdXXX","X")=='XXXabXXXcdXXX'

  12. mcmillhj said

    Simple Perl regex:

    #!/usr/bin/perl
    
    use strict;
    use warnings; 
    
    use feature qw(say);
    
    sub remove_singleton {
       my ($s, $c) = @_;
    
       $s =~ s/(?<!$c)($c)(?!$c)//g;
       return $s; 
    }
    
    say remove_singleton('XabXXcdX','X'); 
    
  13. beoliver said

    import Data.List (group)

    removeSingleton :: Eq a => a -> [a] -> [a]
    removeSingleton m = concat . filter (/= [m]) . group

  14. Peter said

    Here is a possible (partial) picture of the finite state machine referred
    in the official solution.

    DFA

    The signs before the input characters are meant to indicate what
    happens to the character in respect to the output.
    I hope the image shows up, if it doesn’t, that’s the URL for the time being:

    Below is a Haskell rendering of the algorithm:


       1   rmSingle c input =
       2       let zero [] ys = ys 
       3           zero (x:xs) ys
       4               | x == c = one xs ys
       5               | otherwise = zero xs $ x:ys
       6           one [] ys = ys
       7           one (x:xs) ys
       8               | x == c = multiple xs $ x:x:ys
       9               | otherwise = zero xs $ x:ys
      10           multiple [] ys = ys
      11           multiple (x:xs) ys
      12               | x == c = multiple xs $ x:ys
      13               | otherwise = zero xs $ x:ys
      14       in reverse $ zero input []

  15. matthew said

    Peter: Your DFA description is like a Mealy Machine: http://en.wikipedia.org/wiki/Mealy_machine. A fun programming project would be to write a function that inputs a description of such a machine and outputs an efficient implementation (maybe make a nice Lisp macro).

  16. Without using irregular regular expressions…

    sub remove_singletons {
    #@param $str (string)
    #@param $c (string) – singleton to remove
    #@return (string) – string with singletons removed..
    ## Remove singleton instances of given string ($c)
    my($str,$c) = @_;
    my $r = sprintf ‘(%s+)’, quotemeta $c;
    (my $str = shift)=~s{$r}{length $1 > 1 ? $1 : q()}ge;
    return $str;
    }
    [\sourcecode]

  17. […] few weeks ago we had an exercise that solved the problem of removing singleton occurrences of a given character from a string while […]

Leave a comment