Number Words

July 25, 2014

Today’s exercise is an interview question from Career Cup. When I first saw the exercise, I thought it would be easy, but I had trouble writing a solution, so maybe it will a good exercise for all you readers, as well:

Given a positive integer, return all the ways that the integer can be represented by letters using the mapping 1 -> A, 2 -> B, …, 26 -> Z. For instance, the number 1234 can be represented by the words ABCD, AWD and LCD.

Your task is to write the program to generate words from numbers. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

13 Comments »

13 Responses to “Number Words”

Paul said

July 25, 2014 at 10:30 AM

In Python. words form a number and a number from a word.

import string

D = dict(zip(string.ascii_uppercase, [str(s) for s in range(1, 27)]))
ID = dict(zip([str(s) for s in range(1, 27)], string.ascii_uppercase))
KEYS = set(ID.keys())

def words(number):
    res = []
    Q = [(str(number), "")]
    while Q:
        sn, w = Q.pop()
        if not sn:
            res.append(w)
        else:
            Q.append((sn[1:], w + ID[sn[0]]))
            if len(sn) > 1 and sn[:2] in KEYS:
                Q.append((sn[2:], w + ID[sn[:2]]))
    return res
    
def number(word):
    return int("".join([D[c] for c in word]))

n = number("HELLO")        
print n
print words(n)
print "PROGRAMMING" in words(161815718113139147)

James Curtis-Smith said

July 25, 2014 at 10:57 AM

In perl this returns a list of the words!

use strict;

use warnings;
use feature qw(say);
use utf8;

foreach( @ARGV ) {
  my @q = num_words( $_, q() );
  say join ", ", @q;
}

sub num_words {
  my $n = shift;
  my @words = @_;
  my ( $f, $q, $rest ) = split m{}, $n, 3;
  return map { $_.chr(64+$f) } @words unless defined $q && $q ne q();
  my @ret = num_words( "$q$rest", map { $_.chr(64+$f) } @words );

  if( $f eq '1' ) {
    push @ret, num_words( $rest, map { $_.chr(74+$q) } @words );
  } elsif( $f eq '2' && $q ge '0' && $q le '6' ) {
    push @ret, num_words( $rest, map { $_.chr(84+$q) } @words );
  }
  return @ret;
}

Ye said
July 25, 2014 at 4:59 PM
How would a number like 30 be represented? Should that just be a null list?
programmingpraxis said
July 25, 2014 at 5:24 PM
My function ignores the zero and returns “C” but there is no proper answer since it is always possible to add a string of zeros in the middle of the input. Feel free to make your own rule about the proper treatement of zero.

Mike said

July 25, 2014 at 9:02 PM

Recursive version in Python. I started the mapping at 0 -> ‘A’, so that numbers with zeros in them could be converted.


mapping = {str(ord(c)-ord('A')):c for c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'}

def numberword(n):
	s = str(n)
	if len(s) == 0:
		yield ''

	else:
		head = mapping[s[0]]
		for tail in numberword(s[1:]):
			yield head + tail

		if '10' <= s[:2] <= '25':
			head = mapping[s[:2]]
			for tail in numberword(s[2:]):
				yield head + tail

				
list(numberword(1234))  -> ['BCDE', 'BXE', 'MDE']

Mike said

July 26, 2014 at 8:48 PM

My solution above has a bug that causes duplicates to be generated when the number ends in a ‘2’.

Here’s a revised solution:

alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
mapping = {str(ord(c)-ord('A')):c for c in alphabet}

def numberword(n):
	s = str(n)
	if len(s) == 1:
		yield mapping[s[0]]

	elif len(s) > 1:
		head = mapping[s[0]]
		yield from (head + tail for tail in numberword(s[1:]))

		if '10' <= s[:2] <= '25':
			head = mapping[s[:2]]
			yield from (head + tail for tail in numberword(s[2:]))

matthew said

July 28, 2014 at 10:12 PM

Nice problem. I ended up with a sort of dynamic programming solution, C++11:

#include <vector>
#include <string>
#include <iostream>

using std::cout;
using std::vector;
using std::string;

struct label {
  label(const string &str_, const string &tag_) : str(str_),tag(tag_) {}
  string str;
  string tag;
};

// Consider the problem as a special case of "labelling" a string of
// characters - each label is a string with an associated tag and a
// labelling is a sequence of tags where the concatenation of the
// corresponding label strings matches the entire string.
//
// The idea here is that to find a labelling of the first n characters,
// we find an m-character label that matches the last m of those n
// characters, then append the tag of that label to all labellings of the
// first n-m characters. We progressively build up the labellings for the
// first 0, 1, 2 etc. characters. Actually, for this particular problem we
// only need to keep hold of the last 2 rows of the table, but in general,
// where labels can be any length, we need all the rows.

// We use a linear scan to find matching labels - a trie or FSA would be
// a better way of doing this in practise.

int main(int argc, char *argv[])
{
  string s = argv[1];
  int slen = s.size();
  // Construct our set of labels.
  vector<label> labels;
  for (int i = 1; i <= 26; i++) {
    labels.push_back(label(std::to_string(i), string(1,'A'+i-1)));
  }
  // Build up labelled prefixes here
  vector<vector<string>> table(slen+1);
  table[0].push_back("");
  for (int i = 1; i <= slen; i++) {
    for (auto label : labels) {
      int llen = label.str.size();
      if (llen <= i && s.compare(i-llen, llen, label.str) == 0) {
        for (auto prefix : table[i-llen]) {
          table[i].push_back(prefix+label.tag);
        }
      }
    }
  }
  for (auto row : table) {
    for (auto s : row) {
      cout << "\"" << s << "\" ";
    }
    cout << "\n";
  }
}

$ ./numwords 1234
"" 
"A" 
"AB" "L" 
"ABC" "LC" "AW" 
"ABCD" "LCD" "AWD" 
$ ./numwords 12131415
"" 
"A" 
"AB" "L" 
"ABA" "LA" "AU" 
"ABAC" "LAC" "AUC" "ABM" "LM" 
"ABACA" "LACA" "AUCA" "ABMA" "LMA" 
"ABACAD" "LACAD" "AUCAD" "ABMAD" "LMAD" "ABACN" "LACN" "AUCN" "ABMN" "LMN" 
"ABACADA" "LACADA" "AUCADA" "ABMADA" "LMADA" "ABACNA" "LACNA" "AUCNA" "ABMNA" "LMNA" 
"ABACADAE" "LACADAE" "AUCADAE" "ABMADAE" "LMADAE" "ABACNAE" "LACNAE" "AUCNAE" "ABMNAE" "LMNAE" "ABACADO" "LACADO" "AUCADO" "ABMADO" "LMADO" "ABACNO" "LACNO" "AUCNO" "ABMNO" "LMNO"

Jean P. Gadoury said
July 29, 2014 at 1:40 AM
The following is a C solution. Without the willingness to write a FIFO data set, I had to rely on recursion to get the problem completed. It took about 20 minutes and runs pretty fast. With the queue library, I could implement an iterative approach that would likely use less memory for larger numbers. Without further ado:

C:\Users\Dragonborn\Documents\CodeBlocks\numberwords\main.c

1 #include <stdio.h> 2 #include <stdlib.h> 3 #include <string.h> 4 5 typedef unsigned long long uint; 6 7 void printNumberWords(uint number); 8 void printNumberWords_aux(char* number_str, char* path); 9 char getLetterFromPrefix(char* str, unsigned int n); 10 char* appendCString(const char* str, const char c); 11 12 unsigned int countDigits(uint number); 13 char* uint2string(uint number); 14 15 int main() 16 { 17 printNumberWords(121223213123); 18 return 0; 19 } 20 21 void printNumberWords(uint number) 22 { 23 char* number_str = uint2string(number); 24 const unsigned int LEN = strlen(number_str); 25 26 char option1 = (LEN >= 1) ? getLetterFromPrefix(number_str, 1) : (''); 27 char option2 = (LEN >= 2) ? getLetterFromPrefix(number_str, 2) : (''); 28 29 if(option1 != '') 30 { 31 if(LEN == 1) 32 printf("%c\n", option1); 33 else 34 { 35 char* path = appendCString("", option1); 36 printNumberWords_aux(number_str + 1, path); 37 free(path); 38 } 39 } 40 if(option2 != '') 41 { 42 if(LEN == 2) 43 printf("%c\n", option2); 44 else 45 { 46 char* path = appendCString("", option2); 47 printNumberWords_aux(number_str + 2, path); 48 free(path); 49 } 50 } 51 free(number_str); 52 } 53 54 void printNumberWords_aux(char* number_str, char* path) 55 { 56 const unsigned int LEN = strlen(number_str); 57 58 char option1 = (LEN >= 1) ? getLetterFromPrefix(number_str, 1) : (''); 59 char option2 = (LEN >= 2) ? getLetterFromPrefix(number_str, 2) : (''); 60 61 if(option1 != '') 62 { 63 if(LEN == 1) 64 printf("%s%c\n", path, option1); 65 else 66 { 67 char* new_path = appendCString(path, option1); 68 printNumberWords_aux(number_str + 1, new_path); 69 free(new_path); 70 } 71 } 72 if(option2 != '') 73 { 74 if(LEN == 2) 75 printf("%s%c\n", path, option2); 76 else 77 { 78 char* new_path = appendCString(path, option2); 79 printNumberWords_aux(number_str + 2, new_path); 80 free(new_path); 81 } 82 } 83 } 84 85 char getLetterFromPrefix(char* str, unsigned int n) 86 { 87 static const char* LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; 88 static const unsigned int LETTERS_LEN = 26; 89 90 char* nstr = malloc(sizeof(char) * (n+1)); 91 nstr[n] = ''; 92 memcpy(nstr, str, n * sizeof(char)); 93 94 int number = atoi(nstr); 95 free(nstr); 96 97 if(number <= LETTERS_LEN && number > 0) 98 return LETTERS[number - 1]; 99 return ''; 100 } 101 102 char* appendCString(const char* str, const char c) 103 { 104 const unsigned int LEN = strlen(str); 105 char* rValue = malloc(sizeof(char) * (LEN + 2)); 106 memcpy(rValue, str, LEN); 107 rValue[LEN + 0] = c; 108 rValue[LEN + 1] = ''; 109 return rValue; 110 } 111 112 113 char* uint2string(uint number) 114 { 115 const unsigned int NUMBER_LEN = countDigits(number); 116 unsigned int pos = NUMBER_LEN - 1; 117 char* rValue = malloc(sizeof(char) * NUMBER_LEN + 1); 118 rValue[NUMBER_LEN] = ''; 119 while(number > 0) 120 { 121 rValue[pos] = (number % 10) + '0'; 122 number /= 10; 123 pos --; 124 } 125 return rValue; 126 } 127 128 unsigned int countDigits(uint number) 129 { 130 unsigned int rValue = 0; 131 while(number > 0) 132 { 133 rValue++; 134 number /= 10; 135 } 136 return rValue; 137 }

Jean P. Gadoury said

July 29, 2014 at 1:44 AM

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long long uint;

void printNumberWords(uint number);
void printNumberWords_aux(char* number_str, char* path);
char getLetterFromPrefix(char* str, unsigned int n);
char* appendCString(const char* str, const char c);

unsigned int countDigits(uint number);
char* uint2string(uint number);

int main()
{
    printNumberWords(121223213123);
    return 0;
}

void printNumberWords(uint number)
{
    char* number_str = uint2string(number);
    const unsigned int LEN = strlen(number_str);

    char option1 = (LEN >= 1) ? getLetterFromPrefix(number_str, 1) : ('\0');
    char option2 = (LEN >= 2) ? getLetterFromPrefix(number_str, 2) : ('\0');

    if(option1 != '\0')
    {
        if(LEN == 1)
            printf("%c\n", option1);
        else
        {
            char* path = appendCString("", option1);
            printNumberWords_aux(number_str + 1, path);
            free(path);
        }
    }
    if(option2 != '\0')
    {
        if(LEN == 2)
            printf("%c\n", option2);
        else
        {
            char* path = appendCString("", option2);
            printNumberWords_aux(number_str + 2, path);
            free(path);
        }
    }
    free(number_str);
}

void printNumberWords_aux(char* number_str, char* path)
{
    const unsigned int LEN = strlen(number_str);

    char option1 = (LEN >= 1) ? getLetterFromPrefix(number_str, 1) : ('\0');
    char option2 = (LEN >= 2) ? getLetterFromPrefix(number_str, 2) : ('\0');

    if(option1 != '\0')
    {
        if(LEN == 1)
            printf("%s%c\n", path, option1);
        else
        {
            char* new_path = appendCString(path, option1);
            printNumberWords_aux(number_str + 1, new_path);
            free(new_path);
        }
    }
    if(option2 != '\0')
    {
        if(LEN == 2)
            printf("%s%c\n", path, option2);
        else
        {
            char* new_path = appendCString(path, option2);
            printNumberWords_aux(number_str + 2, new_path);
            free(new_path);
        }
    }
}

char getLetterFromPrefix(char* str, unsigned int n)
{
    static const char* LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const unsigned int LETTERS_LEN = 26;

    char* nstr = malloc(sizeof(char) * (n+1));
    nstr[n] = '\0';
    memcpy(nstr, str, n * sizeof(char));

    int number = atoi(nstr);
    free(nstr);

    if(number <= LETTERS_LEN && number > 0)
        return LETTERS[number - 1];
    return '\0';
}

char* appendCString(const char* str, const char c)
{
    const unsigned int LEN = strlen(str);
    char* rValue = malloc(sizeof(char) * (LEN + 2));
    memcpy(rValue, str, LEN);
    rValue[LEN + 0] = c;
    rValue[LEN + 1] = '\0';
    return rValue;
}


char* uint2string(uint number)
{
    const unsigned int NUMBER_LEN = countDigits(number);
    unsigned int pos = NUMBER_LEN - 1;
    char* rValue = malloc(sizeof(char) * NUMBER_LEN + 1);
    rValue[NUMBER_LEN] = '\0';
    while(number > 0)
    {
        rValue[pos] = (number % 10) + '0';
        number /= 10;
        pos --;
    }
    return rValue;
}

unsigned int countDigits(uint number)
{
    unsigned int rValue = 0;
    while(number > 0)
    {
        rValue++;
        number /= 10;
    }
    return rValue;
}

Peter said

July 31, 2014 at 10:09 PM

That is a tricky interview question considering the ambiguity with zero.
It is interesting to see how the different solutions react to
hostile inputs containing zeros. There might be some surprises.
My Haskell solution maps 0 to @.

codes :: String
codes = '@' : ['A' .. 'Z']

cwords :: String -> String -> [String]
cwords []     cs = [reverse cs]
cwords (x:xs) cs
    | null xs    = cwords xs (codes !! headval : cs)
    | otherwise  = if pairval < length codes && x /= '0'
                      then cwords xs (codes !! headval : cs) ++
                           cwords (tail xs) (codes !! pairval : cs)
                      else cwords xs (codes !! headval : cs)
    where headval = read [x]
          pairval = read [x, head xs]

main :: IO()
main = do
    print $ cwords "1234" []
    print $ cwords "317" []
    print $ cwords "204" []
    print $ cwords "1020" []

Output:

["ABCD","AWD","LCD"]
["CAG","CQ"]
["B@D","TD"]
["A@B@","A@T","JB@","JT"]

matthew said

August 1, 2014 at 12:09 AM

Good point about zeroes – I’m inclined to just say there aren’t any matches for eg. “30”.

Here’s a nice and simple recursive solution in C (my other solution avoids some duplicated effort in the recursive calls, but the space requirements get quite large). This generates the solutions in lexicographic order too.

There are 259584 solutions for the string “1234567891011121314151617181920212223242526” incidentally (including “ABCDEFGHIJKLMNOPQRSTUVWXYZ” of course).

#include <ctype.h>
#include <string.h>
#include <stdio.h>

void scan(const char *s, int ipos, char *outstr, int opos)
{
  if (s[ipos] == 0) {
    outstr[opos] = 0;
    printf("%s\n", outstr);
  } else if (isdigit(s[ipos])) {
    int n = s[ipos]-'0';
    if (n >= 1) {
      outstr[opos] = 'A'+n-1;
      scan(s,ipos+1,outstr,opos+1);
    }
    if (isdigit(s[ipos+1])) {
      n = 10*n+s[ipos+1]-'0';
      if (n >= 10 && n <= 26) {
        outstr[opos] = 'A'+n-1;
        scan(s,ipos+2,outstr,opos+1);
      }
    }
  }
}

int main(int argc, char *argv[])
{
  char outstr[strlen(argv[1])+1];
  scan(argv[1],0,outstr,0);
}

JP said

August 13, 2014 at 11:01 PM

Whee Rackety goodness!

Full writeup: Number words

; Given 1-26 mapping to A-Z, determine all possible words represented by a number
; Correctly resolve ambiguities where 1234 -> 1 2 3 4 = ABCD / 1 23 4 -> AWD / 12 3 4 -> LCD
(define (number->words str)
  ; Convert a number 1-26 to a letter A-Z
  (define (n->char n) (integer->char (+ 64 (string->number n))))
  
  ; Make an optional parser 
  ; If the regex matches, add it to each possible next parse
  ; If it does not, return an empty list (to be appendable)
  (define (make-parser re)
    (λ (str)
      (match str
        [(regexp re (list _ n rest))
         (map (curry ~a (n->char n)) (number->words rest))]
        [any 
         '()])))
  
  ; Create parsers for valid 1 digit and 2 digit letter numbers
  (define parse-1 (make-parser #px"([1-9])(.*)"))
  (define parse-2 (make-parser #px"(1[0-9]|2[0-6])(.*)"))
  
  ; Base case, so we can stop eventually
  (if (equal? str "")
      '("")
      (append (parse-1 str) (parse-2 str))))

matthew said

August 20, 2014 at 8:14 PM

I’ve been meaning to do this for a while – here’s an improved version of my “dynamic programming” solution; this one uses a tree structure to store the labellings rather than an explicit list (which gets rather large). Now, we can calculate that eg. ‘12345678910111213141516171819202122232425261234567891011121314151617181920212223242526’ has 67383853056 labellings, which would take a long time to generate explicitly. It also fixes a bug which mishandled the case where no labelling was possible.

#include <string.h>
#include <vector>
#include <string>
#include <iostream>
 
struct label {
  label(const std::string &s, char t) : str(s),tag(t) {}
  std::string str;
  char tag;
};
 
struct Node 
{
  Node(Node *p, Node *n, char t)
    : prefix(p), next(n), tag(t),
    count((p ? p->count : 1) + (n ? n->count : 0)) {}
  Node *prefix;
  Node *next;
  char tag;
  long count;
};

void scan(Node *node, char *buff)
{
  if (node == NULL) {
    std::cout << buff << "\n";
  } else {
    do {
      *(buff-1) = node->tag;
      scan(node->prefix,buff-1);
      node = node->next;
    } while (node != NULL);
  }
}

int main(int argc, char *argv[])
{
  bool doprint = false;
  if (strcmp(argv[1],"-p") == 0) {
    doprint = true;
    argc--; argv++;
  }
  std::string s = argv[1];
  int slen = s.size();
  std::vector<label> labels;
  for (int i = 1; i <= 26; i++) {
    labels.push_back(label(std::to_string(i), 'A'+i-1));
  }
  std::vector<Node *> table(slen+1);
  for (int i = 1; i <= slen; i++) {
    for (auto label : labels) {
      int llen = label.str.size();
      if (llen <= i && s.compare(i-llen, llen, label.str) == 0) {
        // Check there is a solution to extend
        if (i == llen || table[i-llen] != NULL) {
          table[i] = new Node(table[i-llen],table[i],label.tag);
        }
      }
    }
  }
  if (table[slen] == NULL) {
    std::cerr << "No labellings\n";
  } else {
    std::cerr << table[slen]->count << " labellings\n";
    if (doprint) {
      char *buff = new char[slen+1]();
      scan(table[slen],buff+slen);
    }
  }
}

Programming Praxis

Number Words

July 25, 2014

13 Responses to “Number Words”

Leave a comment

Categories

Archives

Archives

Programming Praxis

Number Words

July 25, 2014

Share this:

Related

13 Responses to “Number Words”

Leave a comment

Categories

Archives

Archives