Abbreviated Sentences

April 28, 2017

Sometimes people send me their homework problems and expect me to write programs for them. I ignore such people, but I do collect the tasks and use them in the blog from time to time, always waiting several months until the term has ended. Today’s exercise comes by that route:

Write a program that takes as input a sentence (a sequence of characters) and abbreviates it by replacing each word (a maximal sequence of letters) with the first letter of the word, followed by the number of letters in the middle of the word, followed by the last letter of the word. For instance, Programming Praxis is abbreviated P9g P4s. Any non-letter characters in the input should be retained in their original position in the output.
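
As a rough illustration of the task statement, here is a minimal Python sketch. It assumes that a "letter" means an ASCII letter, and that words of one or two letters, which have no middle, are left unchanged; the statement does not spell out either point. The name `abbreviate` is made up for this example.

```python
import re

# Assumption: a "letter" is an ASCII letter. A word needs at least three
# letters to have a middle, so shorter words do not match and stay as-is.
WORD = re.compile(r"([A-Za-z])([A-Za-z]+)([A-Za-z])")

def abbreviate(sentence):
    """Replace each word of 3+ letters with first letter, middle count, last letter."""
    def shorten(match):
        first, middle, last = match.groups()
        return f"{first}{len(middle)}{last}"
    return WORD.sub(shorten, sentence)

print(abbreviate("Programming Praxis"))  # P9g P4s
```

Because the middle group is greedy, each match covers a whole run of letters, so punctuation and spacing pass through untouched.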

Your task is to write a program that abbreviates sentences. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

9 Responses to “Abbreviated Sentences”

James Curtis-Smith said
April 28, 2017 at 11:38 AM
This is the sort of thing Perl is great for: use a regular expression with a replace function.
```
print s{\b([[:alpha:]])([[:alpha:]]+)([[:alpha:]])\b}{$1.length($2).$3}ger while <>
```

Jussi Piitulainen said

April 28, 2017 at 12:08 PM

Python’s got a more expressive regex library that can be installed and used instead of Python’s standard re. One thing it has is the named character classes. (Except I couldn’t see them in the interactive help text. But they seem to be there.)

import regex as re
import sys

l4s = re.compile('([[:alpha:]])([[:alpha:]]+)([[:alpha:]])')

def abbr(match):
    begin, middle, end = match.groups()
    return '{}{}{}'.format(begin, len(middle), end)

def program(sentence):
    return re.sub(l4s, abbr, sentence)

for sentence in sys.stdin:
    print(program(sentence), end = '')

bookofstevegraham said
April 28, 2017 at 5:30 PM
Can you give an example of when there are digits in the word? And one where there is a digit at the beginning and the end?

programmingpraxis said
April 28, 2017 at 6:29 PM
@bookofstevegraham: Words are maximal sequences of letters, and digits are not part of words. So a string like 1Texas2Step3 is abbreviated as 1, then T3s for Texas, then 2, then S2p for Step, then 3, giving the full abbreviation 1T3s2S2p3.

If you have questions about how a program works in a particular situation, you can always go to ideone.com, fork the recommended solution, and plug in your own data, like this: http://ideone.com/QETUMI.
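
The rule that digits split a string into separate words can also be checked with a short Python sketch that avoids regular expressions. It is only an illustration: it uses `str.isalpha` as the definition of a letter (an assumption, since that also accepts non-ASCII letters), and the name `abbreviate` is hypothetical.

```python
from itertools import groupby

def abbreviate(text):
    """Abbreviate each maximal run of letters; copy everything else through."""
    pieces = []
    # groupby splits the text into alternating runs of letters and non-letters.
    for is_word, run in groupby(text, key=str.isalpha):
        chunk = "".join(run)
        if is_word and len(chunk) > 2:
            # First letter, count of middle letters, last letter.
            chunk = f"{chunk[0]}{len(chunk) - 2}{chunk[-1]}"
        pieces.append(chunk)
    return "".join(pieces)

print(abbreviate("1Texas2Step3"))  # 1T3s2S2p3
```
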

bookofstevegraham said
April 28, 2017 at 6:59 PM
Just what I needed. Thanks.

bookofstevegraham said
April 28, 2017 at 8:48 PM
Caché (a version of MUMPS)

abbrsent(str) ;New routine
 ;
 n buffer,char,i,status
 s (buffer,status)=""
 w !!
 f i=1:1:$l(str) d
 . s char=$e(str,i)
 . i char?1a d
 . . i status="char" d
 . . . s buffer=buffer_char
 . . e  d
 . . . s status="char",buffer=char
 . e  d
 . . i status="char" d
 . . . d abbr(buffer)
 . . . s (buffer,status)=""
 . . w char
 i buffer]"" d abbr(buffer)
 q
 ;
abbr(str) ;
 i $l(str)<3 w str
 e  w $e(str)_($l(str)-2)_$e($reverse(str))
 q

===

d ^abbrsent("hello")

h3o

—

d ^abbrsent("Programming Praxis")

P9g P4s

—

d ^abbrsent("12Progra3m4ming5 6Praxi7s89")

12P4a3m4m2g5 6P3i7s89

—

d ^abbrsent("12Progra3m4mi5 6Praxi7s89")

12P4a3m4mi5 6P3i7s89

fisherro said

May 1, 2017 at 4:11 AM

We now have regex in the C++ standard library, but it still seems much harder than it should be to do some simple things.

#include <clocale>
#include <iostream>
#include <iterator>
#include <regex>
#include <string>

int main()
{
    //An empty string selects the environment's locale; NULL only queries the current one
    std::setlocale(LC_ALL, "");
    std::string input;
    while (std::getline(std::cin, input)) {
        //Using [[:alpha:]] in the hopes of UTF-8 support by the locale
        //and the implementation
        std::regex regex(R"(([[:alpha:]])([[:alpha:]]+)([[:alpha:]]))");
        std::smatch match;
        while (std::regex_search(input, match, regex)) {
            std::cout << match.prefix()
                << match[1]
                << match[2].length()
                << match[3];
            input = match.suffix();
        }
        std::cout << input << '\n';
    }
}

fisherro said

May 1, 2017 at 4:28 AM

Refactored it to create a reusable gsub function.

#include <clocale>
#include <iostream>
#include <iterator>
#include <regex>
#include <sstream>
#include <string>

template<typename F>
std::string gsub(std::string in, const std::regex& rx, F f)
{
    std::string out;
    std::smatch m;
    while (std::regex_search(in, m, rx)) {
        out += m.prefix();
        out += f(m);
        in = m.suffix();
    }
    return out + in;
}

int main()
{
    //An empty string selects the environment's locale; NULL only queries the current one
    std::setlocale(LC_ALL, "");
    std::string line;
    while (std::getline(std::cin, line)) {
        //Using [[:alpha:]] in the hopes of UTF-8 support by the locale
        //and the implementation
        std::regex regex(R"(([[:alpha:]])([[:alpha:]]+)([[:alpha:]]))");
        std::cout << gsub(line, regex, [](const std::smatch& m) {
            return m[1].str() + std::to_string(m[2].length()) + m[3].str();
        });
        std::cout << '\n';
    }
}
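
For comparison, the prefix/replacement/suffix loop in `gsub` can be sketched in Python. The standard `re.sub` already accepts a function as the replacement, so this hand-written version is only meant to make the loop explicit. The name `gsub` is borrowed from the C++ code, and the ASCII-only pattern is an assumption made for the example.

```python
import re

def gsub(text, pattern, func):
    """Rebuild text, replacing each match of pattern with func(match)."""
    out = []
    pos = 0  # end of the previous match
    for m in pattern.finditer(text):
        out.append(text[pos:m.start()])  # unmatched text before this match
        out.append(func(m))              # replacement for this match
        pos = m.end()
    out.append(text[pos:])               # unmatched text after the last match
    return "".join(out)

# Assumption: ASCII letters only.
word = re.compile(r"([A-Za-z])([A-Za-z]+)([A-Za-z])")
print(gsub("Programming Praxis", word,
           lambda m: f"{m[1]}{len(m[2])}{m[3]}"))  # P9g P4s
```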

john said
May 2, 2017 at 2:29 PM
Using C11:

#include <ctype.h>
#include <iso646.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    size_t word_len = 0;
    int current = getchar();
    int next = getchar();
    while (current != EOF) {
        if (isalpha(current)) {
            ++word_len;
            bool word_begin = word_len == 1;
            bool word_end = not isalpha(next);
            if (word_begin) {
                putchar(current);
            } else if (word_end) {
                /* Print the middle-letter count; two-letter words have none. */
                if (word_len > 2) {
                    printf("%zu", word_len - 2);
                }
                putchar(current);
            }
            /* Reset at the end of every word, including one-letter words. */
            if (word_end) {
                word_len = 0;
            }
        } else {
            putchar(current);
        }
        current = next;
        next = getchar();
    }
    if (ferror(stdin)) {
        fprintf(stderr, "fatal error while reading stdin.\n");
        exit(1);
    }
    exit(0);
}
