First Word

January 25, 2019

We have a simple exercise today, inspired by a co-worker. Where I work, we have a reporting tool that permits a “hook” to the underlying SQL in some places. My co-worker asked me how to write an SQL statement that extracts the first word (a maximal sequence of non-spaces) from the beginning of a string (assume there are no leading spaces). For instance, given the string “abcdefg hijklmnop qrs tuv wxyz” the first word is “abcdefg”. Here’s the SQL expression, wrapped in a select statement, with &&STR representing the string:

select substr('&&STR', 1, instr('&&STR', ' ') - 1) from dual
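
For comparison, the same find-the-space-and-slice idea can be sketched in Python. The function name first_word is only illustrative and is not part of the original exercise. One deliberate difference: when the string contains no space, instr returns 0 in the SQL above, so the requested length becomes -1 (which, as far as I know, makes Oracle return NULL), whereas this sketch returns the whole string in that case.

```python
def first_word(s: str) -> str:
    """Return the text before the first space, or all of s if it has no space."""
    end = s.find(" ")  # str.find returns -1 when there is no space
    return s if end < 0 else s[:end]


print(first_word("abcdefg hijklmnop qrs tuv wxyz"))  # abcdefg
print(first_word("abcdefg"))                         # abcdefg
```

Like the SQL, this assumes there are no leading spaces; a string that starts with a space yields an empty result.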

Your task is to write a program to extract the first word from a string. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

6 Responses to “First Word”

matthew said
January 25, 2019 at 10:46 AM
Extra points for implementing the Unicode default word breaking algorithm: https://unicode.org/reports/tr29/#Word_Boundaries

V said

January 25, 2019 at 12:14 PM

Two solutions in golang.


package main

import (
	"fmt"
	"regexp"
)

func main() {
	s1 := "abcdefg hijklmnop qrs tuv wxyz"
	s2 := "~hola   caracola  de piazzolla"
	fmt.Println(firstWordWithLoop(s1))
	fmt.Println(firstWordWithRegexp(s1))
	fmt.Println()
	fmt.Println(firstWordWithLoop(s2))
	fmt.Println(firstWordWithRegexp(s2))
}

func firstWordWithLoop(str string) string {
	word := ""
	for _, char := range str {
		if char == ' ' {
			break
		}
		word += string(char)
	}
	return word
}

func firstWordWithRegexp(str string) string {
	return regexp.MustCompile(`[^ ]+`).FindString(str)
}

Daniel said

January 25, 2019 at 4:46 PM

Here’s a solution in C.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[]) {
  if (argc != 2) {
    fprintf(stderr, "Usage: %s STR\n", argv[0]);
    return EXIT_FAILURE;
  }
  char* str = argv[1];
  while (1) {
    char c = *(str++);
    if (c == ' ' || c == '\0') break;
    printf("%c", c);
  }
  printf("\n");
  return EXIT_SUCCESS;
}

Example Usage:

$ ./a.out "abcdefg hijklmnop qrs tuv wxyz"
abcdefg

Steve said
January 25, 2019 at 6:08 PM
AWK version

$ echo "abcdefg hijklmnop qrs tuv wxyz" | awk '{ print $1 }'
abcdefg

—

Klong version

((a?" ")@0)#a::"abcdefg hijklmnop qrs tuv wxyz"
"abcdefg"
a
"abcdefg hijklmnop qrs tuv wxyz"
a?" "
[7 17 21 25]

—

MUMPS version

YDB>w $p("abcdefg hijklmnop qrs tuv wxyz"," ")
abcdefg

matthew said

January 25, 2019 at 11:41 PM

That Unicode algorithm looks a bit complicated, so here’s a simple Unicode-friendly solution using Python str.isspace():

def firstword(s):
    start = -1
    for i,c in enumerate(s):
        if start < 0:
            if not c.isspace(): start = i
        elif c.isspace():
            return s[start:i]
    return None if start < 0 else s[start:]

assert(firstword("") is None)
assert(firstword("  ") is None)
assert(firstword("foo") == "foo")
assert(firstword(" foo") == "foo")
assert(firstword("foo ") == "foo")

Andrey Sidorenko said
February 8, 2019 at 1:32 PM
Python solution:

firstWord = lambda x: x.lstrip(" ").split(" ")[0]
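
The lambda above only treats the space character as a separator, so a tab or newline stays inside the word (for example, "foo\tbar" comes back unchanged). If any whitespace should end the word, one possible variant relies on str.split() with no separator argument, which skips leading whitespace and splits on runs of whitespace. The name first_word_any_whitespace is made up for this sketch, and returning an empty string for blank input is an assumption, not something the exercise specifies.

```python
def first_word_any_whitespace(s: str) -> str:
    """Return the first whitespace-delimited word of s, or "" if there is none."""
    # With no separator, split() ignores leading whitespace and splits on any
    # run of whitespace; maxsplit=1 avoids splitting the rest of the string.
    parts = s.split(maxsplit=1)
    return parts[0] if parts else ""


print(first_word_any_whitespace("abcdefg hijklmnop qrs tuv wxyz"))  # abcdefg
print(first_word_any_whitespace("\tfoo\nbar"))                      # foo
```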
