Finding God
February 16, 2016
The program is simple enough:
(define (chain words) (let loop ((words words) (chain (list (car words)))) (let* ((count (string-length (symbol->string (car words)))) (text (drop count words))) (if (null? text) (reverse chain) (loop text (cons (car text) chain)))))) > (chain bible) (In beginning the without upon the the God the the God there was)
Function chain
starts with the first word in a list of words and returns the chain formed according to the rule. Function chains
returns the chains for each word in a list of words:
(define (chains words) (let loop ((words words) (chains (list))) (if (null? words) (reverse chains) (loop (cdr words) (cons (chain words) chains))))) > (chains alice) ((Alice very by sister having twice the sister) (was get of by sister having twice the sister) (beginning sister having twice the sister) (to very by sister having twice the sister) (get of by sister having twice the sister) (very by sister having twice the sister) (tired sister having twice the sister) (of by sister having twice the sister) (sitting and nothing had the sister) (by sister having twice the sister) (her the of nothing had the sister) (sister having twice the sister) (on bank nothing had the sister) (the of nothing had the sister) (bank nothing had the sister) (and nothing had the sister) (of nothing had the sister) (having twice the sister) (nothing had the sister) (to once had the sister) (do or she into sister) (once had the sister) (or she into sister) (twice the sister) (she into sister) (had the sister) (peeped was) (into sister) (the sister) (book reading) (her reading) (sister) (was) (reading))
The first 26 of the 34 chains in alice
end with sister, ending with the chain starting with peeped. It’s easy to count the number of successive chains, starting from the first, that end with the same word:
(define (count chains) (let ((prev (car (reverse (car chains))))) (let loop ((chains chains) (count 0)) (cond ((null? chains) count) ((not (equal? (car (reverse (car chains))) prev)) count) (else (loop (cdr chains) (+ count 1))))))) > (count (chains alice)) 26
We will arbitrarily say that any text for which the count exceeds a third the length of the text obeys the rule:
(define (obey? words) (< 1/3 (/ (count (chains words)) (length words)))) > (obey? bible) #t > (obey? alice) #t > (obey? scheme) #f
We used drop
from the Standard Prelude. You can run the program at http://ideone.com/g2N2vr. The trick was first described by Martin Kruskal, and is analyzed here by Lagarias, Rains and Vanderbei. Several texts are shown on the next page; all the texts except the last obey the rule.
one
(same before but not linked)
Here’s a simulation: assume words have a uniformly random number of letters from 1 to n. We take two starting points within n words of each other (it doesn’t really matter what the exact distance is, we use -1 and 1 so the average starting value is 0), we then generate two series of random words until the two series coincide (after which they are identical). We can do that by always extending the sequence with the lowest current value. Each time we extend a sequence the probability that we hit the other value is exactly 1/n (since the two sequence values are never more that n away from each other) so the expected number of times we extend a sequence is exactly n, with expected final value n(n+1)/4 and this is the result of the simulation. Of course, real words don’t have such uniformly distributed lengths, but this seems to give a good insight into why the trick (usually) works:
import random
trials = 100000; n = 10; total = 0; final = 0
for i in range(trials):
count = 0; a = -1; b = 1
while True:
assert(abs(a-b) 0)
count += 1
if a < b: a += 1 + random.randrange(n)
else: b += 1 + random.randrange(n)
if a == b: break
total += count
final += a
print(1.0*total/trials,1.0*final/trials) # eg. (10.01902, 27.55609)
[/code]
Not sure what happened there. Let’s try again:
import random
trials = 100000; n = 10; total = 0; final = 0
for i in range(trials):
count = 0; a = -1; b = 1
while True:
assert(abs(a-b) 0)
count += 1
if a < b: a += 1 + random.randrange(n)
else: b += 1 + random.randrange(n)
if a == b: break
total += count
final += a
print(1.0*total/trials,1.0*final/trials)
[/code]
I’m mystified.
import random
trials = 100000; n = 10; total = 0; final = 0
for i in range(trials):
count = 0; a = -1; b = 1
while True:
assert(abs(a-b) 0)
count += 1
if a < b: a += 1 + random.randrange(n)
else: b += 1 + random.randrange(n)
if a == b: break
total += count
final += a
print(1.0*total/trials,1.0*final/trials)
[/code]
The code formatter didn’t like the “a < b” – not normally a problem.
Sorry about the spam, feel free to delete.