Number Words
July 25, 2014
The heart of the solution is the recursive word function that builds a list of words from numbers; the critical case is the last two lines, where the accumulating output is split into two pieces when both the first digit and the first two digits are in the alphabet:
(define (word cs ds)
(if (null? ds) (list cs)
(if (null? (cdr ds)) (list (cons (letter (car ds)) cs))
(if (zero? (car ds)) (word cs (cdr ds))
(let ((n (+ (* (car ds) 10) (cadr ds))))
(if (< 26 n)
(word (cons (letter (car ds)) cs) (cdr ds))
(append (word (cons (letter (car ds)) cs) (cdr ds))
(word (cons (letter n) cs) (cddr ds)))))))))
Then the words function handles initialization and output structuring:
(define (words n)
(map list->string (map reverse (word (list) (digits n)))))
We used the digits function from the Standard Prelude, and an auxiliary function letter that maps numbers to letters. You can run the program at http://programmingpraxis.codepad.org/DOBTZzzd.
In Python. words form a number and a number from a word.
import string D = dict(zip(string.ascii_uppercase, [str(s) for s in range(1, 27)])) ID = dict(zip([str(s) for s in range(1, 27)], string.ascii_uppercase)) KEYS = set(ID.keys()) def words(number): res = [] Q = [(str(number), "")] while Q: sn, w = Q.pop() if not sn: res.append(w) else: Q.append((sn[1:], w + ID[sn[0]])) if len(sn) > 1 and sn[:2] in KEYS: Q.append((sn[2:], w + ID[sn[:2]])) return res def number(word): return int("".join([D[c] for c in word])) n = number("HELLO") print n print words(n) print "PROGRAMMING" in words(161815718113139147)In perl this returns a list of the words!
use strict; use warnings; use feature qw(say); use utf8; foreach( @ARGV ) { my @q = num_words( $_, q() ); say join ", ", @q; } sub num_words { my $n = shift; my @words = @_; my ( $f, $q, $rest ) = split m{}, $n, 3; return map { $_.chr(64+$f) } @words unless defined $q && $q ne q(); my @ret = num_words( "$q$rest", map { $_.chr(64+$f) } @words ); if( $f eq '1' ) { push @ret, num_words( $rest, map { $_.chr(74+$q) } @words ); } elsif( $f eq '2' && $q ge '0' && $q le '6' ) { push @ret, num_words( $rest, map { $_.chr(84+$q) } @words ); } return @ret; }How would a number like 30 be represented? Should that just be a null list?
My function ignores the zero and returns “C” but there is no proper answer since it is always possible to add a string of zeros in the middle of the input. Feel free to make your own rule about the proper treatement of zero.
Recursive version in Python. I started the mapping at 0 -> ‘A’, so that numbers with zeros in them could be converted.
mapping = {str(ord(c)-ord('A')):c for c in 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'} def numberword(n): s = str(n) if len(s) == 0: yield '' else: head = mapping[s[0]] for tail in numberword(s[1:]): yield head + tail if '10' <= s[:2] <= '25': head = mapping[s[:2]] for tail in numberword(s[2:]): yield head + tail list(numberword(1234)) -> ['BCDE', 'BXE', 'MDE']My solution above has a bug that causes duplicates to be generated when the number ends in a ‘2’.
Here’s a revised solution:
alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ' mapping = {str(ord(c)-ord('A')):c for c in alphabet} def numberword(n): s = str(n) if len(s) == 1: yield mapping[s[0]] elif len(s) > 1: head = mapping[s[0]] yield from (head + tail for tail in numberword(s[1:])) if '10' <= s[:2] <= '25': head = mapping[s[:2]] yield from (head + tail for tail in numberword(s[2:]))Nice problem. I ended up with a sort of dynamic programming solution, C++11:
#include <vector> #include <string> #include <iostream> using std::cout; using std::vector; using std::string; struct label { label(const string &str_, const string &tag_) : str(str_),tag(tag_) {} string str; string tag; }; // Consider the problem as a special case of "labelling" a string of // characters - each label is a string with an associated tag and a // labelling is a sequence of tags where the concatenation of the // corresponding label strings matches the entire string. // // The idea here is that to find a labelling of the first n characters, // we find an m-character label that matches the last m of those n // characters, then append the tag of that label to all labellings of the // first n-m characters. We progressively build up the labellings for the // first 0, 1, 2 etc. characters. Actually, for this particular problem we // only need to keep hold of the last 2 rows of the table, but in general, // where labels can be any length, we need all the rows. // We use a linear scan to find matching labels - a trie or FSA would be // a better way of doing this in practise. int main(int argc, char *argv[]) { string s = argv[1]; int slen = s.size(); // Construct our set of labels. vector<label> labels; for (int i = 1; i <= 26; i++) { labels.push_back(label(std::to_string(i), string(1,'A'+i-1))); } // Build up labelled prefixes here vector<vector<string>> table(slen+1); table[0].push_back(""); for (int i = 1; i <= slen; i++) { for (auto label : labels) { int llen = label.str.size(); if (llen <= i && s.compare(i-llen, llen, label.str) == 0) { for (auto prefix : table[i-llen]) { table[i].push_back(prefix+label.tag); } } } } for (auto row : table) { for (auto s : row) { cout << "\"" << s << "\" "; } cout << "\n"; } }The following is a C solution. Without the willingness to write a FIFO data set, I had to rely on recursion to get the problem completed. It took about 20 minutes and runs pretty fast. With the queue library, I could implement an iterative approach that would likely use less memory for larger numbers. Without further ado:
C:\Users\Dragonborn\Documents\CodeBlocks\numberwords\main.c
1 #include <stdio.h>
2 #include <stdlib.h>
3 #include <string.h>
4
5 typedef unsigned long long uint;
6
7 void printNumberWords(uint number);
8 void printNumberWords_aux(char* number_str, char* path);
9 char getLetterFromPrefix(char* str, unsigned int n);
10 char* appendCString(const char* str, const char c);
11
12 unsigned int countDigits(uint number);
13 char* uint2string(uint number);
14
15 int main()
16 {
17 printNumberWords(121223213123);
18 return 0;
19 }
20
21 void printNumberWords(uint number)
22 {
23 char* number_str = uint2string(number);
24 const unsigned int LEN = strlen(number_str);
25
26 char option1 = (LEN >= 1) ? getLetterFromPrefix(number_str, 1) : ('');
27 char option2 = (LEN >= 2) ? getLetterFromPrefix(number_str, 2) : ('');
28
29 if(option1 != '')
30 {
31 if(LEN == 1)
32 printf("%c\n", option1);
33 else
34 {
35 char* path = appendCString("", option1);
36 printNumberWords_aux(number_str + 1, path);
37 free(path);
38 }
39 }
40 if(option2 != '')
41 {
42 if(LEN == 2)
43 printf("%c\n", option2);
44 else
45 {
46 char* path = appendCString("", option2);
47 printNumberWords_aux(number_str + 2, path);
48 free(path);
49 }
50 }
51 free(number_str);
52 }
53
54 void printNumberWords_aux(char* number_str, char* path)
55 {
56 const unsigned int LEN = strlen(number_str);
57
58 char option1 = (LEN >= 1) ? getLetterFromPrefix(number_str, 1) : ('');
59 char option2 = (LEN >= 2) ? getLetterFromPrefix(number_str, 2) : ('');
60
61 if(option1 != '')
62 {
63 if(LEN == 1)
64 printf("%s%c\n", path, option1);
65 else
66 {
67 char* new_path = appendCString(path, option1);
68 printNumberWords_aux(number_str + 1, new_path);
69 free(new_path);
70 }
71 }
72 if(option2 != '')
73 {
74 if(LEN == 2)
75 printf("%s%c\n", path, option2);
76 else
77 {
78 char* new_path = appendCString(path, option2);
79 printNumberWords_aux(number_str + 2, new_path);
80 free(new_path);
81 }
82 }
83 }
84
85 char getLetterFromPrefix(char* str, unsigned int n)
86 {
87 static const char* LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
88 static const unsigned int LETTERS_LEN = 26;
89
90 char* nstr = malloc(sizeof(char) * (n+1));
91 nstr[n] = '';
92 memcpy(nstr, str, n * sizeof(char));
93
94 int number = atoi(nstr);
95 free(nstr);
96
97 if(number <= LETTERS_LEN && number > 0)
98 return LETTERS[number - 1];
99 return '';
100 }
101
102 char* appendCString(const char* str, const char c)
103 {
104 const unsigned int LEN = strlen(str);
105 char* rValue = malloc(sizeof(char) * (LEN + 2));
106 memcpy(rValue, str, LEN);
107 rValue[LEN + 0] = c;
108 rValue[LEN + 1] = '';
109 return rValue;
110 }
111
112
113 char* uint2string(uint number)
114 {
115 const unsigned int NUMBER_LEN = countDigits(number);
116 unsigned int pos = NUMBER_LEN - 1;
117 char* rValue = malloc(sizeof(char) * NUMBER_LEN + 1);
118 rValue[NUMBER_LEN] = '';
119 while(number > 0)
120 {
121 rValue[pos] = (number % 10) + '0';
122 number /= 10;
123 pos --;
124 }
125 return rValue;
126 }
127
128 unsigned int countDigits(uint number)
129 {
130 unsigned int rValue = 0;
131 while(number > 0)
132 {
133 rValue++;
134 number /= 10;
135 }
136 return rValue;
137 }
#include <stdio.h> #include <stdlib.h> #include <string.h> typedef unsigned long long uint; void printNumberWords(uint number); void printNumberWords_aux(char* number_str, char* path); char getLetterFromPrefix(char* str, unsigned int n); char* appendCString(const char* str, const char c); unsigned int countDigits(uint number); char* uint2string(uint number); int main() { printNumberWords(121223213123); return 0; } void printNumberWords(uint number) { char* number_str = uint2string(number); const unsigned int LEN = strlen(number_str); char option1 = (LEN >= 1) ? getLetterFromPrefix(number_str, 1) : ('\0'); char option2 = (LEN >= 2) ? getLetterFromPrefix(number_str, 2) : ('\0'); if(option1 != '\0') { if(LEN == 1) printf("%c\n", option1); else { char* path = appendCString("", option1); printNumberWords_aux(number_str + 1, path); free(path); } } if(option2 != '\0') { if(LEN == 2) printf("%c\n", option2); else { char* path = appendCString("", option2); printNumberWords_aux(number_str + 2, path); free(path); } } free(number_str); } void printNumberWords_aux(char* number_str, char* path) { const unsigned int LEN = strlen(number_str); char option1 = (LEN >= 1) ? getLetterFromPrefix(number_str, 1) : ('\0'); char option2 = (LEN >= 2) ? getLetterFromPrefix(number_str, 2) : ('\0'); if(option1 != '\0') { if(LEN == 1) printf("%s%c\n", path, option1); else { char* new_path = appendCString(path, option1); printNumberWords_aux(number_str + 1, new_path); free(new_path); } } if(option2 != '\0') { if(LEN == 2) printf("%s%c\n", path, option2); else { char* new_path = appendCString(path, option2); printNumberWords_aux(number_str + 2, new_path); free(new_path); } } } char getLetterFromPrefix(char* str, unsigned int n) { static const char* LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; static const unsigned int LETTERS_LEN = 26; char* nstr = malloc(sizeof(char) * (n+1)); nstr[n] = '\0'; memcpy(nstr, str, n * sizeof(char)); int number = atoi(nstr); free(nstr); if(number <= LETTERS_LEN && number > 0) return LETTERS[number - 1]; return '\0'; } char* appendCString(const char* str, const char c) { const unsigned int LEN = strlen(str); char* rValue = malloc(sizeof(char) * (LEN + 2)); memcpy(rValue, str, LEN); rValue[LEN + 0] = c; rValue[LEN + 1] = '\0'; return rValue; } char* uint2string(uint number) { const unsigned int NUMBER_LEN = countDigits(number); unsigned int pos = NUMBER_LEN - 1; char* rValue = malloc(sizeof(char) * NUMBER_LEN + 1); rValue[NUMBER_LEN] = '\0'; while(number > 0) { rValue[pos] = (number % 10) + '0'; number /= 10; pos --; } return rValue; } unsigned int countDigits(uint number) { unsigned int rValue = 0; while(number > 0) { rValue++; number /= 10; } return rValue; }That is a tricky interview question considering the ambiguity with zero.
It is interesting to see how the different solutions react to
hostile inputs containing zeros. There might be some surprises.
My Haskell solution maps 0 to @.
codes :: String codes = '@' : ['A' .. 'Z'] cwords :: String -> String -> [String] cwords [] cs = [reverse cs] cwords (x:xs) cs | null xs = cwords xs (codes !! headval : cs) | otherwise = if pairval < length codes && x /= '0' then cwords xs (codes !! headval : cs) ++ cwords (tail xs) (codes !! pairval : cs) else cwords xs (codes !! headval : cs) where headval = read [x] pairval = read [x, head xs] main :: IO() main = do print $ cwords "1234" [] print $ cwords "317" [] print $ cwords "204" [] print $ cwords "1020" []Output:
Good point about zeroes – I’m inclined to just say there aren’t any matches for eg. “30”.
Here’s a nice and simple recursive solution in C (my other solution avoids some duplicated effort in the recursive calls, but the space requirements get quite large). This generates the solutions in lexicographic order too.
There are 259584 solutions for the string “1234567891011121314151617181920212223242526” incidentally (including “ABCDEFGHIJKLMNOPQRSTUVWXYZ” of course).
#include <ctype.h> #include <string.h> #include <stdio.h> void scan(const char *s, int ipos, char *outstr, int opos) { if (s[ipos] == 0) { outstr[opos] = 0; printf("%s\n", outstr); } else if (isdigit(s[ipos])) { int n = s[ipos]-'0'; if (n >= 1) { outstr[opos] = 'A'+n-1; scan(s,ipos+1,outstr,opos+1); } if (isdigit(s[ipos+1])) { n = 10*n+s[ipos+1]-'0'; if (n >= 10 && n <= 26) { outstr[opos] = 'A'+n-1; scan(s,ipos+2,outstr,opos+1); } } } } int main(int argc, char *argv[]) { char outstr[strlen(argv[1])+1]; scan(argv[1],0,outstr,0); }Whee Rackety goodness!
Full writeup: Number words
; Given 1-26 mapping to A-Z, determine all possible words represented by a number ; Correctly resolve ambiguities where 1234 -> 1 2 3 4 = ABCD / 1 23 4 -> AWD / 12 3 4 -> LCD (define (number->words str) ; Convert a number 1-26 to a letter A-Z (define (n->char n) (integer->char (+ 64 (string->number n)))) ; Make an optional parser ; If the regex matches, add it to each possible next parse ; If it does not, return an empty list (to be appendable) (define (make-parser re) (λ (str) (match str [(regexp re (list _ n rest)) (map (curry ~a (n->char n)) (number->words rest))] [any '()]))) ; Create parsers for valid 1 digit and 2 digit letter numbers (define parse-1 (make-parser #px"([1-9])(.*)")) (define parse-2 (make-parser #px"(1[0-9]|2[0-6])(.*)")) ; Base case, so we can stop eventually (if (equal? str "") '("") (append (parse-1 str) (parse-2 str))))I’ve been meaning to do this for a while – here’s an improved version of my “dynamic programming” solution; this one uses a tree structure to store the labellings rather than an explicit list (which gets rather large). Now, we can calculate that eg. ‘12345678910111213141516171819202122232425261234567891011121314151617181920212223242526’ has 67383853056 labellings, which would take a long time to generate explicitly. It also fixes a bug which mishandled the case where no labelling was possible.
#include <string.h> #include <vector> #include <string> #include <iostream> struct label { label(const std::string &s, char t) : str(s),tag(t) {} std::string str; char tag; }; struct Node { Node(Node *p, Node *n, char t) : prefix(p), next(n), tag(t), count((p ? p->count : 1) + (n ? n->count : 0)) {} Node *prefix; Node *next; char tag; long count; }; void scan(Node *node, char *buff) { if (node == NULL) { std::cout << buff << "\n"; } else { do { *(buff-1) = node->tag; scan(node->prefix,buff-1); node = node->next; } while (node != NULL); } } int main(int argc, char *argv[]) { bool doprint = false; if (strcmp(argv[1],"-p") == 0) { doprint = true; argc--; argv++; } std::string s = argv[1]; int slen = s.size(); std::vector<label> labels; for (int i = 1; i <= 26; i++) { labels.push_back(label(std::to_string(i), 'A'+i-1)); } std::vector<Node *> table(slen+1); for (int i = 1; i <= slen; i++) { for (auto label : labels) { int llen = label.str.size(); if (llen <= i && s.compare(i-llen, llen, label.str) == 0) { // Check there is a solution to extend if (i == llen || table[i-llen] != NULL) { table[i] = new Node(table[i-llen],table[i],label.tag); } } } } if (table[slen] == NULL) { std::cerr << "No labellings\n"; } else { std::cerr << table[slen]->count << " labellings\n"; if (doprint) { char *buff = new char[slen+1](); scan(table[slen],buff+slen); } } }