First Non-Repeating Character
August 19, 2011
This question appears on several lists of interview questions:
Write a function that takes an input string and returns the first character from the string that is not repeated later in the string. For instance, the two letters “d” and “f” in the input string “aabcbcdeef” are non-repeating, and the function should return “d” since “f” appears later in the string. The function should return some indication if there are no non-repeating characters in the string.
Your task is to write the indicated function. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
Here’s a Clojure solution. Please note that, due to laziness, (first (filter… only evaluates up to the first non-repeating element.
A Haskell solution:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
firstNonRep.hs
hosted with ❤ by GitHub
In C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace PP_20110819 {
class Program {
static void Main(string[] args) {
Dictionary lookupTable = new Dictionary();
for (int i = 0; i < args[0].Length; i++) {
if (!lookupTable.ContainsKey(args[0][i])) {
lookupTable[args[0][i]] = 1;
} else {
lookupTable[args[0][i]]++;
}
}
for (int i = 0; i < args[0].Length; i++) {
if (lookupTable[args[0][i]] == 1) {
Console.WriteLine(string.Format("First non-repeating letter is {0}", args[0][i]));
return;
}
}
Console.WriteLine("There are no non-repeating letters.");
}
}
}
namespace PP_20110819 {
class Program {
static void Main(string[] args) {
Dictionary<char, int> lookupTable = new Dictionary<char, int>();
for (int i = 0; i < args[0].Length; i++) {
if (!lookupTable.ContainsKey(args[0][i])) {
lookupTable[args[0][i]] = 1;
} else {
lookupTable[args[0][i]]++;
}
}
for (int i = 0; i < args[0].Length; i++) {
if (lookupTable[args[0][i]] == 1) {
Console.WriteLine(string.Format("First non-repeating letter is {0}", args[0][i]));
return;
}
}
Console.WriteLine("There are no non-repeating letters.");
}
}
}
Oh, Lord. Please let me delete misformatted comments…
Erlend, your code fails this test: firstNonRep “faabcbcdeef” == ‘f’. (‘f’ moved to front of the input.) You rely on the order of the map.
This, however, works: https://gist.github.com/1156886
I shouldn’t have left the trailing ‘f’ in the test string above, of course.
Here’s my clojure solution. More detail at my blog.
#include
#include
#include
using namespace std;
int FirstNonRepeatingCharacterIndex(string str)
{
int result = -1;
set chars;
for(int i = str.size() – 1; i >= 0; –i)
{
char c = str[i];
if(chars.find(c) == chars.end())
{
result = i;
}
chars.insert(c);
}
return result;
}
int main()
{
string s(“aabcbcdeef”);
int res = FirstNonRepeatingCharacterIndex(s);
cout << "index:" <= 0)
{
cout << "\tcharacter: '" << s[res] << '\'';
}
cin.get();
return 0;
}
returns:
index:1 character: 'a'
[…] today’s Programming Praxis exercise, our goal is to find the first character in a string that does not […]
My Haskell solution (see http://bonsaicode.wordpress.com/2011/08/19/programming-praxis-first-non-repeating-character/ for a version with comments):
Python 3 version
My Python solution:
Rather than make two passes through the whole string, I make one pass through
the string and then one pass through all the unique characters (by calling min on
the “position” dictionary). If there are few unique characters in the string, this may
speed up the work significantly.
@Mike: super slick answer, but I think you might run into trouble with strings
in which all characters repeat—e.g. ‘aabbcc’—where it “should return some
indication if there are no non-repeating characters in the string,” or with
empty strings (perhaps a useless edgecase).
Ignore my previous post; it doesn’t work right if there are no unique letters.
Here is a correct solution.
Here’s a Scala solution…
@tailrec final def findNonRepeat(s:String):String = if(s.count(_ == s.first) > 1) findNonRepeat(s.tail.filterNot(_ == s.first)) else String valueOf (s.firstOption getOrElse “No non-repeating characters”)
I’ve updated my answer based on some comments left on my blog. Here’s my new solution:
Why work when the library functions will work for you?
C#:
using System;
using System.Linq;
class Program
{
static void Main(string[] args)
{
string s = Console.In.ReadLine();
try
{
char first = s.First(c => s.Count(x => x == c) == 1);
Console.Out.WriteLine(“First nonrepeating letter: ” + first);
}
catch
{
Console.Out.WriteLine(“No nonrepeating letters.”);
}
}
}
Clojure:
(defn freq-map [string]
(reduce #(assoc %1 %2 (+ (get %1 %2 0) 1)) (hash-map) string))
(defn first-unique-char [string]
(let [f-map (freq-map string)]
(first (filter #(= 1 (get f-map %)) string))))
Here’s an even better Scala solution found with the help of some friends.
final def findNonRepeat(s:String):String = String valueOf (s.drop(s.takeWhile(c => s.count(_ == c) > 1).size).firstOption getOrElse “No non-repeating characters” )
@Graham. Just after I posted my first solution, I realized it didn’t work if there were no unique characters. The idea for the Ordered Counter came straight from the Python 3 documentation for OrderedDict, so I can’t take credit for that.
I first tried a solution similar to yours. However, when a repeated letter is seen it is popped from the position dictionary–effectively “forgetting” that the letter had been seen. If a letter appears a third time, it will seem to be unique. Ex: fst_non_rep(‘ababac’) returns ‘a’ instead of ‘c’.
C# Linq solution. Minus all of the console crud and the all important answer.First();
var answer = from c in input
where input.IndexOf(c) == input.LastIndexOf(c)
orderby input.IndexOf(c)
select c;
clojure version using frequences:
@Mike: fair enough, thanks for pointing out my mistake! I was trying to minimize the work on the second run, but didn’t think hard enough, apparently. I think your Ordered Counter takes the cake :-)
In Clojure, also with frequencies, and golfing a bit:
(defn first-unique [s] (ffirst (filter #(= 1 (second %)) (frequencies s))))
(defun first-non-repeated (str)
(let ((freq (map ‘list (lambda (c) (cons (count c str) c)) str)))
(cdar (member 1 freq :key #’car))))
Three Haskell solutions: https://gist.github.com/1159127
there is a natural unfair advantage to perl programmers
In perl I can write the entire logic in 3 lines
perfectly readable code … Assume $str is the string input
foreach my $c ($str =~/(.)/g){
@uniq = ($seen{$c}++) ? grep{!/$c/} @uniq : (@uniq,$c);
}
print $uniq[0] . “\n”,
take the foreach also inline and implement the logic in 1 line .. But i guess that makes code unreadable
#include
#include
#include
int main () {
char str[80];
char str_a[80];
int l, i;
char c;
printf(“Please enter the string to find non-repeating char\n”);
scanf(“%s”,str);
l = strlen(str);
for (i = 0; i < l ; i++) {
c = str[i];
//copy rest of the string except char 'c' into another
memcpy(&str_a[0] , &str[0], i);
memcpy(&str_a[i] , &str[i+1], (l-i-1));
if (!strchr(str_a, c)) {
printf("First non repeating char in string : \"%c\"\n",c);
break;
}
}
if (i == l)
printf("Non repeating char not found in given string\n");
}
In ruby …
If a character is not on either the repeat or nonrepeat lists, add it to the nonrepeat. If it’s on the nonrepeat, remove it and add it to the repeat list, otherwise we’ve already seen it and it’s on the repeat list already. Return either the first item on the nonrepeat list if it’s not empty or the empty string if it is. I think this should work in all cases. We’re letting the ruby array class do all the hard work here.
My C solution. Cost of the algorithm: O(2*stringlength)
@crkirkendall @Alan @vukung those don’t work:
user> (first-unique-char “aabbcddeeffc”)
nil
user> (first-unique “aabbcddeeffc”)
nil
My verbosey clojure solution (new to clojure, coming from them common lisp):
(defn first-non-repeating [str]
(loop [c (first str)
cnt 1
[n & str] (rest str)]
(cond
(nil? c) nil
(and (not= c n) (= cnt 1)) c
(not= c n) (recur n 1 str)
true (recur c (inc cnt) str))))
More idiomatic clojure:
module Solution where
import List
solution :: String -> IO ()
solution a =
case reduce a of
[] -> putStrLn “No repeating characters!”
(x:_) -> putStrLn $ “Solution: ” ++ x
where
reduce = filter (\x -> length x == 1) . group . sort
Curious if this works…
@Clint Moore: Your code returns the alphabetically first non-repeating character, not the first according to the input string.
Example: For “fa” your code will give ‘a’, but the solution should give ‘f’.
Basically Gambit-C with extensions…
Recursively construct an occurrences table, and when returning, check if head of intermediate list as only one occurrence (guaranteeing the “first” unique element will be returned). Returns “#f” if no unique char.
A bit verbose, but pretty direct.
Basically, O(n).
Chicken scheme: this seems to work…
(define (non-rep-char str)
(let lp ((ls (string->list str)) (chrs ‘()))
(cond
((null? ls)
(print “No non-repeating chars.”)
done:)
((member (car ls) chrs)
(lp (cdr ls) chrs))
((member (car ls) (cdr ls))
(lp (cdr ls) (cons (car ls) chrs)))
(else
(car ls)))))
Hey!
Just found this site through sixrevisions.com and thought I might have a try on one of the puzzles! So I’m a student in java-programming among other stuff, so that’s the language I used for my solution. Here goes:
package nonRepeat;
import java.util.Scanner;
public class NonRepeatMain
{
public static void main(String[]args)
{
System.out.print(“Input string to check: “);
String result=new NonRepeatMain().checkString(new Scanner(System.in).nextLine().toCharArray());
if(result==null)
{System.out.print(“No unique character in the string.”);}
else
{System.out.println(“‘”+result+”‘ is the first unique character.”);}
}
private String checkString(char[] text)
{
boolean unique;
for(int i=0;i<text.length;i++)
{
unique=true;
for(int j=0;j<text.length;j++)
{
if(j!=i)
{
if(text[i]==text[j])
{
unique=false;
break;
}
}
}
if(unique==true)
{return Character.toString(text[i]);}
}
return null;
}
}
Pretty straightforward (I think?), don't know if I could have made this more effective in some way, as I've just been studying programming for less than a year.
Here is my Racket solution. I’ve used two tables to map each character to the number of occurrences (and viceversa); this way, it is possible to solve this with no conditionals (with the exception of the implicit one in the “for” construct) and traversing the string only once.
#lang racket
(define max-word-length 20)
(define (chars-repeating-times str n)
(let ((freq-table (make-vector max-word-length ‘()))
(char-table (make-hash)))
(for ((c (in-string str)))
(let* ((previous-count (hash-ref char-table c 0))
(next-count (add1 previous-count)))
(vector-set! freq-table previous-count
(remove c (vector-ref freq-table previous-count)))
(vector-set! freq-table next-count
(cons c (vector-ref freq-table next-count)))
(hash-set! char-table c next-count)))
(reverse (vector-ref freq-table n))))
(define (first-non-repeating-char str)
(first (chars-repeating-times str 1)))
(require rackunit)
(check-equal? (chars-repeating-times “” 1) ‘())
(check-equal? (chars-repeating-times “a” 1) ‘(#\a))
(check-equal? (chars-repeating-times “aabccd” 1) ‘(#\b #\d))
(check-equal? (chars-repeating-times “aabbcc” 1) ‘())
// This is the javascript version
function getFirstUniqueChar(s) {
var map = (function(map, len, i, c) {
while (i < len) map[c = s.charAt(i++)] = map[c] ? (map[c] + 1) : 1;
return map;
}({}, s.length, 0));
for (key in map) {
if (map[key] == 1) return key;
}
return -1;
}
A version in prolog that operates on lists:
non_repeating(Atom,List) :-
select(Atom,List,Rest), not(member(Atom,Rest)).
first_non_repeating(Atom,List) :-
non_repeating(Atom,List), !.
Java:
public static char firstNonRepeatingCharacter(String s) {
for (int i = 0; i < s.length() – 1; i++) {
if (s.lastIndexOf(s.charAt(i)) == i) {
return s.charAt(i);
}
}
return '';
}
@Karl Nasser:
what does firstNonRepeatingCharacter(“aa”) return?
#include
#include
#define MAX 10
void check(char *k)
{
int i,j,l=0,d,it=0;
int index[20];
int s=0,len=0,p=0,st=0;
for(d=0;d<strlen(k);d++)
{
for(i=d,j=i+1;j<strlen(k);j++)
{
if(k[i]==k[j])
{
index[s]=i;
s=s+1;
index[s]=j;
len=len+2;
s=s+1;
}
}
index[s]=-1;
len++;
s++;
}
for(st=0;st<len;st++)
{
if(index[st]!=-1)
{
p=index[st];
k[p]='0';
}
}
for(it=0;it<strlen(k);it++)
{
if(k[it]!='0')
{
printf("%c",k[it]);
break;
}
}
if(it==strlen(k))
printf("every charater in di string is repeated:");
}
void main()
{
char ch[11]="aabcbcdeef";
check(ch);
}
Python code:
def first(str):
for i in xrange(0, len(str)):
if str[i] not in str[:i] + str[i+1:]:
return str[i]
return “No element unique!”
#test code
#print first(‘kkkojlooouoojlv’)
#prints ‘u’
A different approach is to split the parsed characters into two sets, RC – containing repeating characters and NRC – containing not repeating characters.
First look at RC, if the character is there, go next. Otherwise, if the character is in NRC, move it to RC and go next. Otherwise put the character in NRC.. and go next.
NRC can be ordered by combining an hashset with a double linked list, so the first not repeating character is the first one left in NRC.
The resulting function parses the string only once at the cost of a bit more complex data structure.
FirstNonRepeatingCharacter.java
/home/fabio/Desktop/Prog/programming praxis/first non repeating character/FirstNonRepeatingCharacter.java
In Java.
public static int firstNonRepeatingCharacter(char[] chars) {
boolean found;
for (int i = 0; i < chars.length; i++) {
found = false;
for (int j = 0; j < chars.length; j++) {
found = (i != j) && (chars[i] == chars[j]);
if (found) {
break;
}
}
if (!found) {
return chars[i];
}
}
return -1;
}
All posted solutions I saw are incorrect, because they only work with a small number of characters and don’t work with diacritical marks.
C’mon, it isn’t 1970 anymore, use Unicode!
Hi Steve,
I’ve run my solution with unicode characters and it works correctly. But perhaps I am overlooking something. Can you provide me with some sample data that will show me what I might be missing?
public class NonRepeating {
public static void main(String[] args) {
printFirstNonRepeatingCharacter(“test1”, “aabbccdeeff”);
printFirstNonRepeatingCharacter(“test2”, “xdtàxadt㊨”);
printFirstNonRepeatingCharacter(“testUnicode”,
“\uDFFF\u0025\u0029\u0041\uDFFF\u0025”);
}
private static void printFirstNonRepeatingCharacter(String name, String str) {
int c = getFirstNonRepeatingCharacter(str.toCharArray());
if (c > 0) {
System.out.printf(“%s: %c%n”, name, c);
}
else {
System.out.printf(“%s: no non-repeating character found!%n”, name);
}
}
public static int getFirstNonRepeatingCharacter(char[] chars) {
boolean found;
for (int i = 0; i < chars.length; i++) {
found = false;
for (int j = 0; j < chars.length; j++) {
found = (i != j) && (chars[i] == chars[j]);
if (found)
break;
}
if (!found)
return chars[i];
}
return -1;
}
}
@steve
1. for example:
2. it doesn’t matter..
Using your sample data my solution does return the correct data, based upon what you have shown above.
Q&D PLT Racket version. Though, except for the first two Racket-specific lines, I think it’d work in just about any scheme that has SRFI 13.
How about some Factor!
And some tests to show it works:
#include
#include
void main()
{
int i=0,f=0,g,h,j=0,k=0,l=0,len;
char str[100],ch;
printf(“\n\n enter the string\n”);
gets(str);
len=strlen(str);
while(i<len)
{ g=0;h=0;
while(j<len)
{
if(str[i]=='*')
{
i++;
j=i;
continue;
}
if(str[i]==str[j+1])
{
str[j+1]='*';
j++;
h=1;
continue;
}
else
{
j++;
if(h==0&&j==len)
{
printf("\nthis is first character in string which is nonrepeating\n");
printf("%c",str[i]);
g=1;f=1 ;
break;
}
}
}
i++;
j=i;
if(g==1)
break;
}
if(f==0)
printf("\nthere is no nonreapeating character\n");
}
This PLT Scheme / Racket solution won’t be notably fast, but I think it’s worth sharing for its brevity.
(But here’s a GitHub gist for nicer presentation.)
(define (first-uniq chars)
(when (empty? chars)
(raise "There are no unique letters"))
(let ([c (first chars)])
(if (memq c (rest chars)) ; MEMQ is like MEMBER but uses EQ?
(first-uniq (remq* (list c) chars))
c)))
; test it
(if (equal? #\d (first-uniq (string->list "aabcbcdeef")))
(display "SUCCESS!!!")
(display "aw, darn."))
Here’s a functional solution in Racket that runs in linear time:
b<-c(1,2,3,4,5,6,1,2,3,4,5,6,7) #base string, contains integers 1 and greater
n<-length(b)
bprime<-b
for (i in 1:(n-1)) {
for (j in (i+1):n) {
if (b[i]==b[j]) {
for (a in 1:n) {
if (b[a]==b[i]) {
bprime[a]<-0
}
}
}
}
}
count<-0
for (e in 1:n) {
if (bprime[e]==0) {
count<-count+1
}
}
if (count==n) {
print("ERROR: no non-repeating characters")
}
#Now print the first non-repeating character from the input string
for (p in 1:n) {
if ((bprime[p])!=0) {
print(bprime[p])
break
}
}
A linear solution with O(1) extra-space avoiding the second iteration.
Oops. Ignore my first solution from two years ago :) A readable Linq solution, first getting a list of unique characters, then returning the first one, if any.
Thanks for this code. I’m gonna take a look at it and try some of these examples for myself. However, I am a noob at coding and these do seem a tadge complex. Keep up the good work. Cheers :-)
These days there are over a million possible characters and over a hundred thousand defined characters (in Unicode). It wasn’t clear to me what is supposed
to happen for an empty string or for say ‘ee’. (Arguably, when you look at the
*last* $e it is not followed by another copy, so the answer then is $e.)