Abbreviated Sentences

April 28, 2017

The task doesn’t specify what to do with words that are one or two letters long; we arbitrarily decide to pass them through unchanged. Our program considers the input a character at a time, writing output every time it sees a non-letter:

(define (abbrev sentence)
  (with-output-to-string (lambda ()
    (define (word head len prev)
      (display head)
      (when (positive? len) (display (number->string len)))
      (when prev (display prev)))
    (let loop ((cs (string->list sentence))
               (head #f) (len -1) (prev #f))
      (cond ((null? cs) ; end of sentence
              (when head (word head len prev)))
            ((char-alphabetic? (car cs)) ; in a word
              (if head
                  (loop (cdr cs) head (+ len 1) (car cs))
                  (loop (cdr cs) (car cs) -1 #f)))
            (else ; not in a word
              (when head (word head len prev))
              (display (car cs))
              (loop (cdr cs) #f 0 #f)))))))

Here’s an example, with one-letter and two-letter words and a word with an embedded non-letter:

> (abbrev "A is one; Programming Praxis hello-goodbye.")
"A is o1e; P9g P4s h3o-g5e."

You can run the program at http://ideone.com/P6UgbQ.
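
For readers who don't use Scheme, the same character-at-a-time idea can be sketched in Python. This is only an illustration, not a translation of the program above: the names `abbrev`, `head`, `middle`, `last` and `flush` are chosen for this sketch, and it assumes that Python's `str.isalpha` is an acceptable definition of "letter".

```python
def abbrev(sentence: str) -> str:
    """Abbreviate each maximal run of letters as first letter, count, last letter."""
    out = []
    head = None   # first letter of the current word, or None outside a word
    middle = 0    # letters strictly between the head and the latest letter
    last = None   # most recent letter after the head, if any

    def flush():
        # Emit the pending word; the count is written only when positive,
        # so one-letter and two-letter words pass through unchanged.
        if head is None:
            return
        out.append(head)
        if middle > 0:
            out.append(str(middle))
        if last is not None:
            out.append(last)

    for c in sentence:
        if c.isalpha():              # in a word (assumption: isalpha == letter)
            if head is None:
                head, middle, last = c, 0, None
            else:
                if last is not None:
                    middle += 1      # the previous "last" becomes a middle letter
                last = c
        else:                        # not in a word
            flush()
            out.append(c)
            head, middle, last = None, 0, None
    flush()                          # end of sentence
    return "".join(out)

print(abbrev("A is one; Programming Praxis hello-goodbye."))
```

Run on the example sentence above, the sketch should give the same abbreviation as the Scheme version.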


9 Responses to “Abbreviated Sentences”

  1. This is the sort of thing Perl is great for: use a regular expression with the substitution operator (s///).

    print s{\b([[:alpha:]])([[:alpha:]]+)([[:alpha:]])\b}{$1.length($2).$3}ger while <>
    
  2. Jussi Piitulainen said

    Python’s got a more expressive regex library that can be installed and used instead of Python’s standard re. One thing it has is the named character classes. (Except I couldn’t see them in the interactive help text. But they seem to be there.)

    import regex as re
    import sys
    
    l4s = re.compile('([[:alpha:]])([[:alpha:]]+)([[:alpha:]])')
    
    def abbr(match):
        begin, middle, end = match.groups()
        return '{}{}{}'.format(begin, len(middle), end)
    
    def program(sentence):
        return re.sub(l4s, abbr, sentence)
    
    for sentence in sys.stdin:
        print(program(sentence), end = '')
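
If installing the third-party regex module is not an option, a similar effect can probably be had with the standard `re` module alone. The sketch below is an assumption-laden variant, not the commenter's code: it approximates "letter" with the class `[^\W\d_]`, and the names `LETTER`, `WORD`, `abbreviate` and `shorten` are invented for the example.

```python
import re

# [^\W\d_] reads as "a word character that is neither a decimal digit nor an
# underscore". For str patterns this is approximately "a Unicode letter"; it
# may also admit a few non-decimal numeric characters, so treat it as a
# stand-in for [[:alpha:]] rather than an exact equivalent.
LETTER = r"[^\W\d_]"
WORD = re.compile(rf"({LETTER})({LETTER}+)({LETTER})")

def abbreviate(sentence: str) -> str:
    def shorten(match):
        first, middle, last = match.groups()
        return f"{first}{len(middle)}{last}"
    # Words of one or two letters never match, so they pass through unchanged.
    return WORD.sub(shorten, sentence)

print(abbreviate("A is one; Programming Praxis hello-goodbye."))
```

Because digits are excluded from the class, a digit ends a word here just as punctuation does.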
    
  3. Can you give an example of when there are digits in the word? And one where there is a digit at the beginning and the end?

  4. programmingpraxis said

    @bookofstevegraham: Words are maximal sequences of letters. Digits are not part of words. So a word like 1Texas2Step3 is abbreviated as 1, followed by T3s for Texas, followed by 2, followed by S2p for Step, followed by 3, so the full abbreviation is 1T3s2S2p3.

    If you have questions about how a program works in a particular situation, you can always go to ideone.com, fork the recommended solution, and plug in your own data, like this: http://ideone.com/QETUMI.
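
The rule that words are maximal sequences of letters can also be checked without regular expressions. The following is a minimal sketch, assuming Python's `str.isalpha` as the test for a letter; the function name `abbreviate_runs` is made up for this illustration.

```python
from itertools import groupby

def abbreviate_runs(text: str) -> str:
    pieces = []
    # groupby splits the text into maximal runs of letters and of non-letters.
    for is_word, run in groupby(text, key=str.isalpha):
        chunk = "".join(run)
        if is_word and len(chunk) > 2:
            # Keep the first and last letter; replace the middle by its length.
            chunk = f"{chunk[0]}{len(chunk) - 2}{chunk[-1]}"
        pieces.append(chunk)   # digits and punctuation are copied verbatim
    return "".join(pieces)

print(abbreviate_runs("1Texas2Step3"))
```

Each digit forms its own non-letter run, so Texas and Step are abbreviated separately and the digits between them are left alone.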

  5. Just what I needed. Thanks.

  6. Caché (a version of MUMPS)

    abbrsent(str) ;New routine
    ;
    n buffer,char,i,status
    s (buffer,status)=""
    w !!
    f i=1:1:$l(str) d
    . s char=$e(str,i)
    . i char?1a d
    . . i status="char" d
    . . . s buffer=buffer_char
    . . e d
    . . . s status="char",buffer=char
    . e d
    . . i status="char" d
    . . . d abbr(buffer)
    . . . s (buffer,status)=""
    . . w char
    i buffer]"" d abbr(buffer)
    q
    ;
    abbr(str) ;
    i $l(str)<3 w str
    e w $e(str)_($l(str)-2)_$e($reverse(str))
    q

    ===

    d ^abbrsent("hello")

    h3o

    d ^abbrsent("Programming Praxis")

    P9g P4s

    d ^abbrsent("12Progra3m4ming5 6Praxi7s89")

    12P4a3m4m2g5 6P3i7s89

    d ^abbrsent("12Progra3m4mi5 6Praxi7s89")

    12P4a3m4mi5 6P3i7s89

  7. fisherro said

    We now have regex in the C++ standard library, but it still seems much harder than it should be to do some simple things.

    #include <clocale>
    #include <iostream>
    #include <iterator>
    #include <regex>
    #include <string>
    
    int main()
    {
        //"" selects the environment's locale; NULL would only query the current one
        std::setlocale(LC_ALL, "");
        std::string input;
        while (std::getline(std::cin, input)) {
            //Using [[:alpha:]] in the hopes of UTF-8 support by the locale
            //and the implementation
            std::regex regex(R"(([[:alpha:]])([[:alpha:]]+)([[:alpha:]]))");
            std::smatch match;
            while (std::regex_search(input, match, regex)) {
                std::cout << match.prefix()
                    << match[1]
                    << match[2].length()
                    << match[3];
                input = match.suffix();
            }
            std::cout << input << '\n';
        }
    }
    
  8. fisherro said

    Refactored it to create a reusable gsub function.

    #include <clocale>
    #include <iostream>
    #include <iterator>
    #include <regex>
    #include <sstream>
    #include <string>
    
    template<typename F>
    std::string gsub(std::string in, const std::regex& rx, F f)
    {
        std::string out;
        std::smatch m;
        while (std::regex_search(in, m, rx)) {
            out += m.prefix();
            out += f(m);
            in = m.suffix();
        }
        return out + in;
    }
    
    int main()
    {
        //"" selects the environment's locale; NULL would only query the current one
        std::setlocale(LC_ALL, "");
        std::string line;
        while (std::getline(std::cin, line)) {
            //Using [[:alpha:]] in the hopes of UTF-8 support by the locale
            //and the implementation
            std::regex regex(R"(([[:alpha:]])([[:alpha:]]+)([[:alpha:]]))");
            std::cout << gsub(line, regex, [](const std::smatch& m) {
                return m[1].str() + std::to_string(m[2].length()) + m[3].str();
            });
            std::cout << '\n';
        }
    }
    
  9. john said

    Using C11:


    #include <ctype.h>
    #include <iso646.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv) {
      size_t word_len = 0;

      int current = getchar();
      int next = getchar();

      while (current != EOF) {
        if (isalpha(current)) {
          ++word_len;

          bool word_begin =
            word_len == 1;
          bool word_end =
            not isalpha(next);
          
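          if (word_begin) {
            putchar(current);
          } else if (word_end) {
            // word_len is unsigned, so compare before subtracting
            if (word_len > 2) {
              printf("%zu", word_len - 2);
            }
            putchar(current);
          }
          // Reset after every word, including one-letter words
          if (word_end) {
            word_len = 0;
          }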
        } else {
          putchar(current);
        }

        current = next;
        next = getchar();
      }

      if (ferror(stdin)) {
        fprintf(stderr, "fatal error while reading stdin.\n");
        exit(1);
      }
      
      exit(0);
    }
