Programming Praxis


Entab And Detab

May 6, 2011 9:00 AM

In ancient times, say the 1970s, religious wars were fought about whether to indent blocks of code using tabs or spaces, and how wide the indents should be. Then editing environments got better, and programmers stopped arguing about tabs and started arguing about other things, like where the braces go. Now, with the advent of the internet, the problem of tab/space indentation is returning, because copy/paste operations between web browsers and text editors often get the indentation wrong (especially when the source is the evil PDF format).

In ancient times, the normal solution to the tab/space indentation problem was a pair of programs, entab and detab, that could easily convert between the two formats. Now, with the internet, it behooves us to resurrect those old programs.

Your task is to write programs that convert files using tabs for indentation to files using spaces for indentation, and vice versa; be sure to permit an argument specifying the width of the tab. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
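
As a starting point, here is a minimal Python sketch of the two conversions. It is not the suggested solution: it touches only the leading indentation of each line, assumes tab stops every `width` columns with `width` positive, and the names `entab`, `detab`, and `_split_indent` are illustrative choices rather than anything the exercise prescribes.

```python
def _split_indent(line):
    """Split a line into its leading run of spaces/tabs and the rest."""
    body = line.lstrip(" \t")
    return line[:len(line) - len(body)], body


def detab(text, width=8):
    """Expand tabs in each line's leading indentation to spaces."""
    out = []
    for line in text.splitlines(keepends=True):
        indent, body = _split_indent(line)
        # The indent starts at column 0, so expandtabs honours tab stops.
        out.append(indent.expandtabs(width) + body)
    return "".join(out)


def entab(text, width=8):
    """Rewrite each line's leading indentation as tabs plus leftover spaces."""
    out = []
    for line in text.splitlines(keepends=True):
        indent, body = _split_indent(line)
        # Measure the indent in columns, so mixed tabs and spaces work too.
        columns = len(indent.expandtabs(width))
        tabs, spaces = divmod(columns, width)
        out.append("\t" * tabs + " " * spaces + body)
    return "".join(out)
```

Tabs and spaces after the first non-blank character are left alone, which keeps string literals and aligned comments intact; whether that is the right choice depends on how the files are used.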

Posted by programmingpraxis

Categories: Exercises

10 Responses to “Entab And Detab”

  1. My Haskell solution (see http://bonsaicode.wordpress.com/2011/05/06/programming-praxis-entab-and-detab/ for a version with comments):

    import Text.Regex
    
    detab :: Int -> String -> String
    detab w s = subRegex (mkRegex "\t") s (replicate w ' ')
    
    entab :: Int -> String -> String
    entab w = unlines . map f . lines where
        f s = replicate tabs '\t' ++ replicate spaces ' ' ++ line where
            (indent, line) = span (`elem` " \t") s
            (tabs, spaces) = divMod (sum $ map width indent) w
        width c = if c == '\t' then w else 1
    

    By Remco Niemeijer on May 6, 2011 at 10:55 AM

  2. Remko: you change all tabs/spaces. You should only consider the ones at the head of a line. And they may be mixed.

    By Axio on May 6, 2011 at 1:14 PM

  3. “Remco”, sorry for the spelling error.

    By Axio on May 6, 2011 at 1:15 PM

  4. Axio: Correct, detab changes all tabs, since this is the behaviour of the provided solution. entab only processes spaces at the start of the line. Mixed spaces and tabs are already handled correctly.

    By Remco Niemeijer on May 6, 2011 at 3:10 PM

  5. Gambit-C Scheme, and some macros inspired by Common Lisp…
    Not the most beautiful code, and no magic involved.
    Will handle mixed tabs and spaces on same line, and stop at the first non-space-nor-tab character.
    Procedures to apply to each line of a loaded file.

    I think that’s pretty much it…


    (define *tab-width* 4)

    (define (flush seen)
    (unless (zero? seen)
    (for-each (lambda (x) (display " ")) (iota 1 seen))))

    (define (entab-line line #!optional (tab-width *tab-width*))
    (let ((sl (string-length line)))
    (let loop ((pos 0)
    (seen 0))
    (if (= pos sl)
    (flush seen)
    (case (string-ref line pos)
    ((#\space)
    (if (= seen (- tab-width 1))
    (begin
    (display #\tab)
    (loop (1+ pos)
    0))
    (loop (1+ pos)
    (1+ seen))))
    ((#\tab)
    (flush seen)
    (display #\tab)
    (loop (1+ pos)
    0))
    (else
    (flush seen)
    (display (substring line pos sl))))))))

    (define (detab-line line #!optional (tab-width *tab-width*))
    (let ((sl (string-length line)))
    (let loop ((pos 0))
    (unless (= pos sl)
    (case (string-ref line pos)
    ((#\space)
    (display " ")
    (loop (1+ pos)))
    ((#\tab)
    (flush tab-width)
    (loop (1+ pos)))
    (else
    (display (substring line pos sl))))))))

    By Axio on May 6, 2011 at 3:35 PM

  6. With better indentation, hopefully.

    (define *tab-width* 4)
    ;
    (define (flush seen)
      (unless (zero? seen)
        (for-each (lambda (x) (display " ")) (iota 1 seen))))
    ;
    (define (entab-line line #!optional (tab-width *tab-width*))
      (let ((sl (string-length line)))
        (let loop ((pos 0)
                   (seen 0))
          (if (= pos sl)
            (flush seen)
            (case (string-ref line pos)
              ((#\space)
               (if (= seen (- tab-width 1))
                 (begin
                   (display #\tab)
                   (loop (1+ pos)
                         0))
                 (loop (1+ pos)
                       (1+ seen))))
              ((#\tab)
               (flush seen)
               (display #\tab)
               (loop (1+ pos)
                     0))
              (else
                (flush seen)
                (display (substring line pos sl))))))))
    ;
    (define (detab-line line #!optional (tab-width *tab-width*))
      (let ((sl (string-length line)))
        (let loop ((pos 0))
          (unless (= pos sl)
            (case (string-ref line pos)
              ((#\space)
               (display " ")
               (loop (1+ pos)))
              ((#\tab)
               (flush tab-width)
               (loop (1+ pos)))
              (else
                (display (substring line pos sl))))))))

    By Axio on May 6, 2011 at 3:38 PM

  7.
    class Start(object):
        def __init__(self, machine):
            self.machine = machine
        def __call__(self, c):
            if c == '\t':
                # A tab absorbs any pending spaces before the next tab stop.
                self.machine.spaces = 0
                return c
            elif c == ' ':
                self.machine.spaces += 1
                if self.machine.spaces % self.machine.tablen == 0:
                    self.machine.spaces = 0
                    return '\t'
                return ''
            else:
                self.machine.state = End()
                return ' ' * self.machine.spaces + c
    
            
    class End(object):
        def __call__(self, c):
            return c
    
    
    class Machine(object):
        """
        A state machine. Transforms spaces into tabs until it finds 
            the first non whitespace character.
        There are two states in this machine, represented by the
            classes Start and End.
        """
        def __init__(self, tablen=8):
            self.state = Start(self)
            self.spaces = 0
            self.tablen = tablen
    
        def __call__(self, c):
            return self.state(c)
    
    
    def entab(s, tablen=8):
        m = Machine(tablen)
        return ''.join(map(m, s))
    
    
    def detab(s, tablen=8):
        return s.expandtabs(tablen)
    

    By Lautaro Pecile on May 6, 2011 at 7:42 PM

  8. My solution in C:

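    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    int main(int argc, char **argv) {
    	int bEntab = 0, nSize = 4;
    	int chCur, bStartOfLine = 1, i, nSpaces = 0;
    
    	/* "e" selects entab; a positive number sets the tab width. */
    	while (--argc) {
    		++argv;
    		if (strcmp(*argv, "e") == 0) {
    			bEntab = 1;
    		} else if (atoi(*argv) > 0) {
    			nSize = atoi(*argv);
    		}
    	}
    
    	while ((chCur = getchar()) != EOF) {
    		if (bEntab == 1 && chCur == ' ' && bStartOfLine == 1) {
    			++nSpaces;
    
    			if (nSpaces == nSize) {
    				putchar('\t');
    				nSpaces = 0;
    			}
    		} else if (bEntab == 0 && chCur == '\t' && bStartOfLine == 1) {
    			for (i = 0; i < nSize; ++i) {
    				putchar(' ');
    			}
    		} else {
    			/* Flush leading spaces that did not fill a whole tab. */
    			for (; nSpaces > 0; --nSpaces) {
    				putchar(' ');
    			}
    			putchar(chCur);
    			if (chCur == '\n') {
    				bStartOfLine = 1;
    			} else if (chCur != ' ' && chCur != '\t') {
    				/* Only a non-blank character ends the indentation. */
    				bStartOfLine = 0;
    			}
    		}
    	}
    
    	/* Flush spaces left over if the input ends inside the indentation. */
    	for (; nSpaces > 0; --nSpaces) {
    		putchar(' ');
    	}
    
    	return 0;
    }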
    

    By arturasl on May 8, 2011 at 10:10 AM

  9. […] – entab and detab are used to handle problems on copy-and-paste from text files (ref) […]

    By The C Programming Language (K&R) Chapter 5 Ex 10-20 « 盲頭烏蠅 on August 5, 2011 at 5:02 PM

  10. Write a program detab that replaces tabs in the input with the proper number of blanks to space to the next tab stop. Assume a fixed set of tab stops, say every n columns. Should n be a variable or a symbolic parameter?

    By siri on June 1, 2012 at 10:39 AM
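
One way to read the exercise quoted in the last comment is to track the output column and pad each tab out to the next multiple of n. The following is a minimal Python sketch under that reading; `detab_line` and its parameter `n` are hypothetical names, and passing n as an ordinary argument is only one possible answer to the variable-versus-symbolic-parameter question.

```python
def detab_line(line, n=8):
    """Replace every tab with blanks up to the next tab stop (every n columns)."""
    out = []
    col = 0  # current output column, counted from 0
    for ch in line:
        if ch == "\t":
            pad = n - col % n  # blanks needed to reach the next stop
            out.append(" " * pad)
            col += pad
        else:
            out.append(ch)
            # A newline starts a fresh line, so the column resets.
            col = 0 if ch == "\n" else col + 1
    return "".join(out)
```

For input without carriage returns this should agree with Python's built-in `str.expandtabs`, which comment 7 uses directly; the explicit loop is here only to show the column arithmetic.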
