FSM Generator

July 11, 2014

[ Our solution today is written by Matthew Arcus a software engineer in Cambridge, UK, mainly using C++ for networking at work, but who in the past has been paid to write Lisp and ML. Matthew wrote a comment on the previous exercise suggesting a program to generate finite state machines, and I contacted him to ask him to write that program. He graciously "volunteered," and today's solution is his response. Guest authors are always welcome at Programming Praxis; use the button on the menu bar above to contact me if you have an idea for an exercise. ]

Simple string processing functions, for example, removing singleton occurrences of a character, can often be implemented by a finite state machine augmented with the ability to output symbols. One such class of machine are Mealy machines, named for George H. Mealy, who defined the concept in 1954. Formally, each transition in a Mealy machine has a single associated output symbol, and the machine outputs this symbol whenever it makes that transition. For practical purposes, we can allow a machine to have N output symbols for each state, where N ≥ 0, and then, for example, we can express our singleton problem by the rules:

State 0:
  X -> 1,
  * -> 0,*
State 1:
  X -> 1,XX
  * -> 0,*

We define a machine with two states, 0 and 1. In state 0, it will accept input character X and go to state 1 with no output, or accept any other input (represented by *), output that input (another *) and stay in state 0. In state 1, it will accept a second X, stay in state 1 and output two X’s, or any other character which is output, then the machine goes back to state 0. It should be easy to see that this machine will remove singleton occurrences of X.

This exercise is to implement a program that takes a description of a Mealy machine in some suitable format, and generates a function that takes a string parameter and outputs the result of running that machine with the given string as input.

This makes a nice exercise in defining Lisp macros; modern Scheme uses “hygienic macros” that remove a lot of the problems with traditional macros, particularly to do with the avoidance of accidental variable capture, but it’s nice to do things the old-fashioned way sometimes, if only to see what we are missing, so this solution is in Common Lisp and also makes use of that language’s loop facility.

We will use a simple S-expression format for the machine rules, for example:

(mealy s
        ((0
          (#\X 1)
          (nil 0 nil))
         (1
          (#\X 1 #\X #\X)
          (nil 0 nil)))))

This represents our singleton machine with nil used as a wildcard.

A traditional Lisp macro is just a function that returns a piece of Lisp syntax that replaces the macro expression in the program. Conceptually, it’s just like a C macro, with the important difference that we have the full power of the Lisp language at our disposal. We need to be careful to ensure that any local variables used in the macro don’t clash with any variables used in the macro parameters — in our case, we can use variables to represent symbols or characters. This is what modern hygienic macros make straightforward, but here we will take the traditional approach of generating new, unique, symbols using the (gensym) function.

Writing macros is made much easier using the backquote feature: a backquoted expression is a literal with embedded expressions that are evaluated and inserted (with ,) or spliced (with ,@) into the literal, so (let ((x 1)) `(+ ,x ,x)) and (let ((x '(1 1))) `(+ ,@x)) are both just '(+ 1 1).

We coerce the input string to a list before iterating, and coerce the result back to a string at the end, though it would be straightforward to extend the macro to work with other sequence types.

(defmacro mealy (input actions)
  (let ((c (gensym)) ;; Avoid variable capture
        (state (gensym))
        (out (gensym))) ;; Put output here
    `(loop for ,c in (coerce ,input 'list) ;; Convert input into a list
           with ,state = 0
           while (>= ,state 0)
           append
           (cond
            ,@(mapcar
               #'(lambda (statespec)
                   (let ((thisstate (car statespec)))
                     `((= ,state ,thisstate)
                       (cond
                        ,@(mapcar
                           #'(lambda (actionspec)
                               (let ((match (car actionspec))
                                     (newstate (cadr actionspec))
                                     (output (cddr actionspec)))
                                 `(,(if (eq match nil) t `(eq ,c ,match))
                                   ;; Reset the state if necessary
                                   ,@(if (not (= thisstate newstate)) `((setq ,state ,newstate)))
                                   (list
                                    ,@(mapcar
                                       ;; If output char is nil, just echo the input.
                                       #'(lambda(outchar) (if (eq outchar nil) c outchar))
                                       (cddr actionspec))))))
                           (cdr statespec))))))
               actions)) into ,out
           finally (return (coerce ,out 'string)))))

We can get an idea of what our macro expands to using macro-expand-1:

(setq machine
      '(mealy s
              ((0
                (c 1)
                (nil 0 nil))
               (1
                (c 1 c c)
                (nil 0 nil)))))

(pprint (macroexpand-1 machine))

(LOOP FOR #:G3210 IN (COERCE S 'LIST)
      WITH #:G3211 = 0
      WHILE (>= #:G3211 0)
      APPEND
      (COND ((= #:G3211 0)
             (COND ((EQ #:G3210 C)
                    (SETQ #:G3211 1) (LIST))
                   (T (LIST #:G3210))))
            ((= #:G3211 1)
             (COND ((EQ #:G3210 C)
                    (LIST C C))
                   (T (SETQ #:G3211 0) (LIST #:G3210)))))
      INTO #:G3212
      FINALLY (RETURN (COERCE #:G3212 'STRING)))

We can use character constants in our macro:

(defun singleton1 (s)
  (mealy s
         ((0
           (#\X 1)
           (nil 0 nil))
          (1
           (#\X 1 #\X #\X)
           (nil 0 nil)))))
         
(print (singleton1 "XbbXXbX"))

Or pass in a variable:

(defun singleton2 (s c)
  (mealy s
         ((0
           (c 1)
           (nil 0 nil))
          (1
           (c 1 c c)
           (nil 0 nil)))))

(print (singleton2 "XbbXXbX" #\X))

Or just use the macro inline; this doubles all characters except #\a:

(print (mealy "abcacb" ((0 (#\a 0 nil) (nil 0 nil nil)))))

You can see the program in action at http://ideone.com/Ta3PPM.

About these ads

Pages: 1 2

4 Responses to “FSM Generator”

  1. Paul said

    As there is already a very good FSM for Python written by Noah Spurrier, I did not try to make another one. I give here how the FSM can be used for the remove singleton problem.

    """
    FSM is from Noah Spurrier
    (http://opensource.apple.com/source/lldb/lldb-159/test/pexpect-2.4/FSM.py)
    Also supports FSM memory
    """
    from __future__ import print_function
    
    from prpr.FSM import FSM
    
    def emit(fsm):
        print(fsm.input_symbol, end="")
    
    def secondX(fsm):
        emit(fsm); emit(fsm)
        
    def single(txt, x):
        f = FSM(0)
        f.add_transition     (x,      0,  None    ,       1)
        f.add_transition     (x,      1,  secondX ,       2)
        f.add_transition     (x,      2,  emit    ,       2)
        f.set_default_transition         (emit,           0)
        f.process_list(txt)
    
    single("abcccabcabccabcc", "c")
    # -> abcccababccabcc
    
  2. Mirko said

    In the common-lisp code above, the “output” variable is defined in the inner “let”, but left unused (courtesy of my CLISP). It can be safely commented out.

  3. Mike said

    I don’t think the two-state Mealy machine in the proposed solution will correctly handle 3 or more ‘X’s in a row.
    For example, an input of “XbbXXXbX” yields and output of “bbXXXXb”. I think a third state is needed:
    [pre]
    State 0:
    X -> 1,
    * -> 0,*
    State 1:
    X -> 2,XX
    * -> 0,*
    State 2:
    X -> 2,X
    * -> 0,*
    [/pre]

  4. matthew said

    Mirko: good catch, thanks, I think the program should say “output” instead of the later occurrence of “(cddr actionspec)”

    Mike: also well spotted (and particularly stupid on my part as I seem to have got it right in my solution to the original problem).

    This should be the right thing:

    (defun singleton1 (s)
      (mealy s
             ((0 
               (#\X 1) 
               (nil 0 nil))
              (1 
               (#\X 2 #\X #\X)
               (nil 0 nil))
              (2
               (#\X 2 #\X)
               (nil 0 nil)))))
    

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 630 other followers

%d bloggers like this: