Changing Gender
July 26, 2016
The basic idea is to create a list of gender words, store them and their opposites in a lookup table, then scan the input string and change any gender word that appears. This is easy in concept but hard to do in full generality, because a complete list of gender words is impossible to specify, and matters get more complicated once you consider things like capitalization and punctuation. Here’s our list of gender words; you may want to build your own:
(define gender-words '(
("boy" "girl") ("girl" "boy")
("boyfriend" "girlfriend") ("girlfriend" "boyfriend")
("father" "mother") ("mother" "father")
("husband" "wife") ("wife" "husband")
("brother" "sister") ("sister" "brother")
("he" "she") ("she" "he")
("his" "her") ("her" "his")
("male" "female") ("female" "male")
("man" "woman") ("woman" "man")
("mr" "ms") ("ms" "mr")
("sir" "madam") ("madam" "sir")
("son" "daughter") ("daughter" "son")
("uncle" "aunt") ("aunt" "uncle")))
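Writing every pair in both directions by hand invites mismatches between the two halves of an entry. As a minimal sketch, the two-way table could instead be generated from one-way pairs. The names make-two-way, gender-pairs and two-way-words are hypothetical and are not part of the original program, and only three pairs are shown:

```scheme
; Hypothetical helper, not part of the original program:
; build a two-way association list from a list of one-way pairs.
(define (make-two-way pairs)
  (let loop ((ps pairs) (table '()))
    (if (null? ps)
        (reverse table)
        (let ((a (car (car ps))) (b (cadr (car ps))))
          ; add both (a b) and (b a); reverse restores the order at the end
          (loop (cdr ps)
                (cons (list b a) (cons (list a b) table)))))))

; A shortened list of pairs, for illustration only.
(define gender-pairs
  '(("boy" "girl") ("mr" "ms") ("uncle" "aunt")))

(define two-way-words (make-two-way gender-pairs))
; two-way-words is now
; (("boy" "girl") ("girl" "boy") ("mr" "ms") ("ms" "mr")
;  ("uncle" "aunt") ("aunt" "uncle"))
```

The result has the same shape as gender-words, so it can be searched with assoc in exactly the same way.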
We’ll take this in pieces. First we split a string into words and everything else:
(define (split str)
  (let loop ((cs (string->list str)) (word (list)) (words (list)))
    (cond ((null? cs)
            (reverse (if (null? word) words
                       (cons (list->string (reverse word)) words))))
          ((char-alphabetic? (car cs))
            (loop (cdr cs) (cons (car cs) word) words))
          ((pair? word)
            (loop (cdr cs) (list) (cons (string (car cs))
              (cons (list->string (reverse word)) words))))
          (else (loop (cdr cs) word (cons (string (car cs)) words))))))
The result is a little bit ugly, but it works:
> (split "My Brother's girlfriend is taking HER sister to the movies.")
("My" " " "Brother" "'" "s" " " "girlfriend" " " "is" " "
"taking" " " "HER" " " "sister" " " "to" " " "the" " "
"movies" ".")
Next we look up each word in the list of gender words. If it’s not in the list, it is copied unchanged. Otherwise, we make the replacement, being careful to preserve the cases of all-caps, initial capital, and everything else:
(define (replace word)
  (let ((entry (assoc (string-downcase word) gender-words)))
    (if (not entry)
        word ; not a gender word: copy it unchanged, whatever its case
        (let ((w (string->list (cadr entry))))
          (cond ((all? char-upper-case? (string->list word))
                  (list->string (map char-upcase w)))
                ((char-upper-case? (car (string->list word)))
                  (list->string (cons (char-upcase (car w)) (cdr w))))
                (else (list->string w)))))))
This doesn’t handle odd capitalization properly, but that’s a low-frequency event:
> (replace "brother")
"sister"
> (replace "Brother")
"Sister"
> (replace "BROTHER")
"SISTER"
> (replace "broTHER")
"sister"
> (replace "hello")
"hello"
Now it’s easy to put everything together:
(define (change-gender str) (apply string-append (map replace (split str))))
And here’s an example:
> (change-gender "My Brother's girlfriend is taking HER sister to the movies.")
"My Sister's boyfriend is taking HIS brother to the movies."
Not perfect, but not bad, either. We used an association list, but a hash table would be a better choice if the list of gender-words is long.
We used all? from the Standard Prelude. You can run the program at http://ideone.com/ADiZZf.
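The Standard Prelude is not reproduced here, so the program will not run as shown without a definition of all?. A minimal stand-in follows, assuming that all? takes a predicate and a list and returns #t only when every element satisfies the predicate; the Prelude’s actual definition may differ:

```scheme
; Stand-in for the Standard Prelude's all?, written under the assumption
; that it takes a predicate and a list, in that order.
(define (all? pred? xs)
  (cond ((null? xs) #t)                            ; nothing left to fail
        ((pred? (car xs)) (all? pred? (cdr xs)))   ; keep checking the rest
        (else #f)))                                ; first failure ends it
```

With this definition, (all? char-upper-case? (string->list "HER")) is #t and (all? char-upper-case? (string->list "Her")) is #f, which is the distinction replace relies on to detect all-caps words.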
Similar solution to yours but in Perl… I create the map in three parts – for those words for which you can only map one way (actually they have an alternative mapping elsewhere) – those words for which I can pluralise with “s” and those I can’t…
By extending the map with the ucfirst and uc versions of the strings – I can then just use a “join map split” to do the translation.
my %map = qw( aunty uncle aunties uncles miss mr mrs mr dame sir lord lady lords ladies);

## Entries for which we can just add an "s" to pluralize
my @map_plural_s = qw( boy girl boyfriend girlfriend father mother husband wife son daughter brother sister sir madam uncle aunt male female widower widow );

## Make all 4 mappings...
while( my($a,$b) = splice @map_plural_s, 0, 2 ) {
  $map{$a}=$b;
  $map{$b}=$a;
  $map{$a.'s'}=$b.'s';
  $map{$b.'s'}=$a.'s';
}

## Entries for which we can't just add an S to make them plural
my @map_no_plural = qw( man woman men women gentleman lady gentlemen ladies his her );

## Make both mappings...
while( my($a,$b) = splice @map_no_plural, 0, 2 ) {
  $map{$a}=$b;
  $map{$b}=$a;
}

## Now add the ucfirst and uc versions of the words...
$map{ucfirst $_} = ucfirst $map{$_} foreach keys %map;
$map{uc $_ } = uc $map{$_} foreach keys %map;

sub r {
  return join q(), map { $map{$_}||$_ } split m{\b}, shift;
}

print r( "My Brother's girlfriend is taking HER sister to the movies." );