Entab And Detab

May 6, 2011

In ancient times, say the 1970s, religious wars were fought about whether to indent blocks of code using tabs or spaces, and how wide the indents should be. Then editing environments got better and programmers stopped arguing about tabs and started arguing about other things, like where the braces go. Now, with the advent of the internet, the problem of tab/space indentation seems to be returning, as copy/paste operations between web browsers and text editors seem to get the indentation wrong (especially when the source is the evil PDF format).

In ancient times, the normal solution to the tab/space indentation problem was a pair of programs, entab and detab, that could easily convert between the two formats. Now, with the internet, it behooves us to resurrect those old programs.

Your task is to write programs that convert files using tabs for indentation to files using spaces for indentation, and vice versa; be sure to permit an argument specifying the width of the tab. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


10 Responses to “Entab And Detab”

  1. My Haskell solution (see http://bonsaicode.wordpress.com/2011/05/06/programming-praxis-entab-and-detab/ for a version with comments):

    import Text.Regex
    
    detab :: Int -> String -> String
    detab w s = subRegex (mkRegex "\t") s (replicate w ' ')
    
    entab :: Int -> String -> String
    entab w = unlines . map f . lines where
        f s = replicate tabs '\t' ++ replicate spaces ' ' ++ line where
            (indent, line) = span (`elem` " \t") s
            (tabs, spaces) = divMod (sum $ map width indent) w
        width c = if c == '\t' then w else 1
    
  2. Axio said

    Remko: you change all tabs/spaces. You should only consider the ones at the head of a line. And they may be mixed.

  3. Axio said

    “Remco”, sorry for the spelling error.

  4. Axio: Correct, detab changes all tabs, since this is the behaviour of the provided solution. entab only processes spaces at the start of the line. Mixed spaces and tabs are already handled correctly.
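
    For comparison, here is a minimal Python sketch of the "head of the line only, possibly mixed" behaviour discussed above. It is a variation, not any poster's method: the name entab_leading and the default width of 4 are assumptions made for illustration, and unlike the Haskell version (which counts every tab as a full width) it advances each tab to the next tab stop.

```python
def entab_leading(line, width=4):
    """Re-encode only the leading run of spaces and tabs in `line`.

    Hypothetical helper: `width` is the assumed distance between tab stops.
    """
    col = 0  # visual column reached by the leading whitespace
    i = 0
    while i < len(line) and line[i] in " \t":
        if line[i] == "\t":
            # A tab jumps to the next multiple of `width`.
            col = (col // width + 1) * width
        else:
            col += 1
        i += 1
    # Emit as many tabs as fit, then pad the remainder with spaces.
    tabs, spaces = divmod(col, width)
    return "\t" * tabs + " " * spaces + line[i:]
```

    Whitespace after the first non-blank character is left alone, so a line such as "a  b" comes back unchanged.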

  5. Axio said

    Gambit-C Scheme, and some macros inspired by Common Lisp…
    Not the most beautiful code, and no magic involved.
    Will handle mixed tabs and spaces on same line, and stop at the first non-space-nor-tab character.
    Procedures to apply to each line of a loaded file.

    I think that’s pretty much it…


    (define *tab-width* 4)

    (define (flush seen)
    (unless (zero? seen)
    (for-each (lambda (x) (display " ")) (iota 1 seen))))

    (define (entab-line line #!optional (tab-width *tab-width*))
    (let ((sl (string-length line)))
    (let loop ((pos 0)
    (seen 0))
    (if (= pos sl)
    (flush seen)
    (case (string-ref line pos)
    ((#\space)
    (if (= seen (- tab-width 1))
    (begin
    (display #\tab)
    (loop (1+ pos)
    0))
    (loop (1+ pos)
    (1+ seen))))
    ((#\tab)
    (flush seen)
    (display #\tab)
    (loop (1+ pos)
    0))
    (else
    (flush seen)
    (display (substring line pos sl))))))))

    (define (detab-line line #!optional (tab-width *tab-width*))
    (let ((sl (string-length line)))
    (let loop ((pos 0))
    (unless (= pos sl)
    (case (string-ref line pos)
    ((#\space)
    (display " ")
    (loop (1+ pos)))
    ((#\tab)
    (flush tab-width)
    (loop (1+ pos)))
    (else
    (display (substring line pos sl))))))))

  6. Axio said

    With better indentation, hopefully.

    (define *tab-width* 4)
    ;
    (define (flush seen)
      (unless (zero? seen)
        (for-each (lambda (x) (display " ")) (iota 1 seen))))
    ;
    (define (entab-line line #!optional (tab-width *tab-width*))
      (let ((sl (string-length line)))
        (let loop ((pos 0)
                   (seen 0))
          (if (= pos sl)
            (flush seen)
            (case (string-ref line pos)
              ((#\space)
               (if (= seen (- tab-width 1))
                 (begin
                   (display #\tab)
                   (loop (1+ pos)
                         0))
                 (loop (1+ pos)
                       (1+ seen))))
              ((#\tab)
               (flush seen)
               (display #\tab)
               (loop (1+ pos)
                     0))
              (else
                (flush seen)
                (display (substring line pos sl))))))))
    ;
    (define (detab-line line #!optional (tab-width *tab-width*))
      (let ((sl (string-length line)))
        (let loop ((pos 0))
          (unless (= pos sl)
            (case (string-ref line pos)
              ((#\space)
               (display " ")
               (loop (1+ pos)))
              ((#\tab)
               (flush tab-width)
               (loop (1+ pos)))
              (else
                (display (substring line pos sl))))))))

  7. Lautaro Pecile said

    class Start(object):
        def __init__(self, machine):
            self.machine = machine
        def __call__(self, c):
            if c == '\t':
                # A tab absorbs any pending spaces before the next tab stop.
                self.machine.spaces = 0
                return c
            elif c == ' ':
                self.machine.spaces += 1
                if self.machine.spaces % self.machine.tablen == 0:
                    self.machine.spaces = 0
                    return '\t'
                return ''
            else:
                self.machine.state = End()
                return ' ' * self.machine.spaces + c
    
            
    class End(object):
        def __call__(self, c):
            return c
    
    
    class Machine(object):
        """
        A state machine. Transforms spaces into tabs until it finds 
            the first non whitespace character.
        There are two states in this machine, represented by the
            classes Start and End.
        """
        def __init__(self, tablen=8):
            self.state = Start(self)
            self.spaces = 0
            self.tablen = tablen
    
        def __call__(self, c):
            return self.state(c)
    
    
    def entab(s, tablen=8):
        m = Machine(tablen)
        return ''.join(map(m, s))
    
    
    def detab(s, tablen=8):
        return s.expandtabs(tablen)
    
  8. arturasl said

    My solution in C:

    #include      <string.h>
    #include      <stdio.h>
    
    int main(int argc, char **argv) {
    	int bEntab = 0, nSize = 4;
    	int chCur, bStartOfLine = 1, i, nSpaces = 0;
    
    	while (--argc) {
    		++argv;
    		bEntab = (strcmp(*argv, "e") == 0);
    	}
    
    	while ((chCur = getchar()) != EOF) {
    		if (bEntab == 1 && chCur == ' ' && bStartOfLine == 1) {
    			++nSpaces;
    
    			if (nSpaces == nSize) {
    				putchar('\t');
    				nSpaces = 0;
    			}
    		} else if (bEntab == 0 && chCur == '\t' && bStartOfLine == 1) {
    			for (i = 0; i < nSize; ++i) {
    				putchar(' ');
    			}
    		} else if (chCur == '\n') {
    			putchar(chCur);
    			bStartOfLine = 1;
    			nSpaces = 0;
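    		} else {
    			/* Flush leftover spaces that did not fill a whole tab. */
    			for (i = 0; i < nSpaces; ++i) {
    				putchar(' ');
    			}
    			nSpaces = 0;
    			putchar(chCur);
    			bStartOfLine = 0;
    		}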
    	}
    
    	return 0;
    }
    
  9. [...] – entab and detab are used to handle problems on copy-and-paste from text files (ref) [...]

  10. siri said

    Write a program detab that replaces tabs in the input with the proper number of blanks to space to the next tab stop. Assume a fixed set of tab stops, say every n columns. Should n be a variable or a symbolic parameter?
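
    A minimal Python sketch of the tab-stop behaviour described above, assuming fixed stops every n columns. The function name detab and the choice to pass n as an ordinary parameter with a default of 8 are assumptions made for illustration; that is only one possible answer to the variable-versus-symbolic-parameter question.

```python
def detab(text, n=8):
    """Replace each tab with enough blanks to reach the next tab stop.

    Tab stops are assumed to sit at every `n` columns (0, n, 2n, ...).
    """
    out = []
    col = 0  # current column on the current line
    for ch in text:
        if ch == "\t":
            pad = n - col % n  # blanks needed to reach the next stop
            out.append(" " * pad)
            col += pad
        elif ch == "\n":
            out.append(ch)
            col = 0  # a newline resets the column count
        else:
            out.append(ch)
            col += 1
    return "".join(out)
```

    For input without carriage returns this should agree with Python's built-in str.expandtabs, which the Python solution above already uses.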
