Anagrams, Again
October 18, 2019
We first check the two strings are the same length. Then we count characters in 256 buckets, adding 1 for each character in the first string and adding -1 for each character in the second string. Finally, we scan the 256 buckets and report an anagram if and only if all buckets have a 0 value:
(define (anagram? str1 str2)
(if (not (= (string-length str1) (string-length str2))) #f
(let ((buckets (make-vector 256 0)))
(do ((cs (string->list str1) (cdr cs))) ((null? cs))
(let ((i (char->integer (car cs))))
(vector-set! buckets i (+ (vector-ref buckets i) 1))))
(do ((cs (string->list str2) (cdr cs))) ((null? cs))
(let ((i (char->integer (car cs))))
(vector-set! buckets i (- (vector-ref buckets i) 1))))
(let loop ((i 0))
(if (= i 256) #t
(if (not (zero? (vector-ref buckets i))) #f
(loop (+ i 1))))))))
Here’s an example:
> (anagram? "time" "emit") #t
You can run the program at https://ideone.com/qTqySy.
In Python. These functions work for ascii and for unicode.
from collections import Counter def is_anagram(a, b): return sorted(a) == sorted(b) def is_anagram(a, b): return Counter(a) == Counter(b)Here is my take on this, using Julia 1.1.1, without any dependencies: https://pastebin.com/9LgUHJr8
Have a nice weekend!
Here’s a solution in C.
#include <stdbool.h> #include <stdio.h> #include <stdlib.h> bool check_anagrams(char* s1, char* s2) { int diff[256] = {0}; for (char* s = s1; *s != '\0'; ++s) { ++(diff[(unsigned char)(*s)]); } for (char* s = s2; *s != '\0'; ++s) { --(diff[(unsigned char)(*s)]); } for (int i = 0; i < 256; ++i) { if (diff[i]) return false; } return true; } int main(int argc, char* argv[]) { if (argc != 3) return EXIT_FAILURE; bool anagram = check_anagrams(argv[1], argv[2]); printf("%d\n", anagram); return EXIT_SUCCESS; }Example:
Here’s a sorting-based solution in C.
#include <stdbool.h> #include <stdio.h> #include <stdlib.h> #include <string.h> int compare(const void* a, const void* b) { char a_ = *((char*)a); char b_ = *((char*)b); return (a_ > b_) - (a_ < b_); } // destructive bool check_anagrams(char* s1, char* s2) { qsort(s1, strlen(s1), 1, compare); qsort(s2, strlen(s2), 1, compare); return strcmp(s1, s2) == 0; } int main(int argc, char* argv[]) { if (argc != 3) return EXIT_FAILURE; bool anagram = check_anagrams(argv[1], argv[2]); printf("%d\n", anagram); return EXIT_SUCCESS; }Example:
@Paul: you need to be a bit careful with this approach with Unicode. Here’s an attempt to deal with combining characters, equivalences & casing:
import unicodedata def signature(s): # Apply compatibility decomposition s = unicodedata.normalize('NFKD',s) # Include: Letter, Number, Symbol. # Exclude: Mark, Punctuation, Separator, Other s = [c.lower() for c in s if unicodedata.category(c)[0] in "LNS"] s = "".join(sorted(s)) return(s) print(sorted("Ångström") == sorted("Ångström")) # => False print(signature("Ångström") == signature("Ångström")) # => True print(signature("Ångström")) # => agmnorst print(signature("Ἀχιλλεύς")) # => αειλλςυχ print(signature("H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ")) # => ceehmosI’m a bit disappointed that Chrome on my Android phone can’t cope with the combining characters in that last post at all, FF on Linux seems to manage though.
Just trying something: “Ångström”, “Ἀχιλλεύς”, “H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ”
OK, Chrome on Android just seems to have problems when displaying with its code font.