Distinct Characters
May 9, 2017
We’ve done similar exercises in the past:
Write a program to determine if all the characters in a string are distinct. For instance, the word “Programming” has two m and two g, so all the characters are not distinct, but the word “Praxis” has six distinct characters.
Your task is to write a program to determine if all the characters in a string are distinct; you should provide three solutions, one that takes time O(n²), one that takes time O(n log n), and one that takes time O(n). When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
In Python.
Added an additional method to Paul’s solution in Python. Uses pruning, combined with dictionary add/lookup is O(1) average. Has better performance, depending on distribution of frequently occuring characters amongst all different words (i.e. character ‘e’ reoccurs frequently in English).
Rutger: I expect the difference between the set object and your emulation of it is just noise.
John: order_n_prune is rougly 10-20% faster than order_n for the words in /usr/share/dict/words (linux). For other datasets this may not hold, i.e. a dataset with words that only have unique characters.
Klong: Not sure how to provide the 3 different solutions, so here’s the quickest one
str::”Programming Praxis”
“Programming Praxis”
str=?str
0
str::”Praxis”
“Praxis”
str=?str
1
:” ?a returns a list containing unique elements from ‘a’ in order of appearance.”
[sourcecode lang="css"]
Not sure how to do the 3 solutions, so here’s just one:
DISTCHAR (str) ; Check for distinct characters
n arr,flg,i
s flg=1
f i=1:1:$l(str) d q:’flg
. s char=$e(str,i)
. i $d(arr(char)) s flg=0
. e s arr(char)=””
q flg
MCL> W $$DISTCHAR^DISTCHAR(“Programming Praxis”)
0
MCL> W $$DISTCHAR^DISTCHAR(“Praxis”)
1
Just the fastest solution (O[n] time) in C11:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
bool
distinct (char *str);
int
main (int argc, char **argv)
{
char *strings[] = {"Programming", "Praxis", "Cat", "Dog", "Dogg", NULL};
for (register size_t i = 0; strings[i] != NULL; ++i)
{
if (distinct(strings[i]))
printf("'%s' is distinct\n", strings[i]);
else
printf("'%s' is not distinct\n", strings[i]);
}
exit(0);
}
// O(n) solution
bool
distinct (char *str)
{
size_t arr[127] = {0}; // ASCII
const size_t n = strlen(str);
for (register size_t i = 0; i < n; ++i)
if (++arr[str[i]] > 1)
return false;
return true;
}