Finding God
February 16, 2016
The program is simple enough:
(define (chain words)
(let loop ((words words) (chain (list (car words))))
(let* ((count (string-length (symbol->string (car words))))
(text (drop count words)))
(if (null? text) (reverse chain)
(loop text (cons (car text) chain))))))
> (chain bible)
(In beginning the without upon the the God the the God there was)
Function chain starts with the first word in a list of words and returns the chain formed according to the rule. Function chains returns the chains for each word in a list of words:
(define (chains words)
(let loop ((words words) (chains (list)))
(if (null? words) (reverse chains)
(loop (cdr words) (cons (chain words) chains)))))
> (chains alice)
((Alice very by sister having twice the sister)
(was get of by sister having twice the sister)
(beginning sister having twice the sister)
(to very by sister having twice the sister)
(get of by sister having twice the sister)
(very by sister having twice the sister)
(tired sister having twice the sister)
(of by sister having twice the sister)
(sitting and nothing had the sister)
(by sister having twice the sister)
(her the of nothing had the sister)
(sister having twice the sister)
(on bank nothing had the sister)
(the of nothing had the sister)
(bank nothing had the sister)
(and nothing had the sister)
(of nothing had the sister)
(having twice the sister)
(nothing had the sister)
(to once had the sister)
(do or she into sister)
(once had the sister)
(or she into sister)
(twice the sister)
(she into sister)
(had the sister)
(peeped was)
(into sister)
(the sister)
(book reading)
(her reading)
(sister)
(was)
(reading))
The first 26 of the 34 chains in alice end with sister, ending with the chain starting with peeped. It’s easy to count the number of successive chains, starting from the first, that end with the same word:
(define (count chains)
(let ((prev (car (reverse (car chains)))))
(let loop ((chains chains) (count 0))
(cond ((null? chains) count)
((not (equal? (car (reverse (car chains))) prev))
count)
(else (loop (cdr chains) (+ count 1)))))))
> (count (chains alice))
26
We will arbitrarily say that any text for which the count exceeds a third the length of the text obeys the rule:
(define (obey? words) (< 1/3 (/ (count (chains words)) (length words)))) > (obey? bible) #t > (obey? alice) #t > (obey? scheme) #f
We used drop from the Standard Prelude. You can run the program at http://ideone.com/g2N2vr. The trick was first described by Martin Kruskal, and is analyzed here by Lagarias, Rains and Vanderbei. Several texts are shown on the next page; all the texts except the last obey the rule.
one
(same before but not linked)
import Data.Char -- Convert pure text to word lists removing symbols asWords :: [String] -> [[String]] asWords = map $ words . map r where r c | isAlpha c = c | otherwise = ' ' -- Lookup last row indexed word jump :: [[String]] -> Int -> Maybe String jump [] _ = Nothing jump [(x:_)] 0 = Just x jump ((x:xs):xss) 0 = jump (xs:xss) (length x) jump ([]:xss) n = jump xss n jump ((x:xs):xss) n = jump (xs:xss) (n - 1) -- With some word check if obey the law obey :: String -> [[String]] -> Bool obey _ [] = False obey w xs@(x:_) = all (Just w ==) $ (jump xs) <$> [1..length x] -- Example text = "In the beginning God created the heaven and the earth.\n\ \And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.\n\ \And God said, Let there be light: and there was light" main = print $ obey "God" $ asWords $ lines textHere’s a simulation: assume words have a uniformly random number of letters from 1 to n. We take two starting points within n words of each other (it doesn’t really matter what the exact distance is, we use -1 and 1 so the average starting value is 0), we then generate two series of random words until the two series coincide (after which they are identical). We can do that by always extending the sequence with the lowest current value. Each time we extend a sequence the probability that we hit the other value is exactly 1/n (since the two sequence values are never more that n away from each other) so the expected number of times we extend a sequence is exactly n, with expected final value n(n+1)/4 and this is the result of the simulation. Of course, real words don’t have such uniformly distributed lengths, but this seems to give a good insight into why the trick (usually) works:
import random
trials = 100000; n = 10; total = 0; final = 0
for i in range(trials):
count = 0; a = -1; b = 1
while True:
assert(abs(a-b) 0)
count += 1
if a < b: a += 1 + random.randrange(n)
else: b += 1 + random.randrange(n)
if a == b: break
total += count
final += a
print(1.0*total/trials,1.0*final/trials) # eg. (10.01902, 27.55609)
[/code]
Not sure what happened there. Let’s try again:
import random
trials = 100000; n = 10; total = 0; final = 0
for i in range(trials):
count = 0; a = -1; b = 1
while True:
assert(abs(a-b) 0)
count += 1
if a < b: a += 1 + random.randrange(n)
else: b += 1 + random.randrange(n)
if a == b: break
total += count
final += a
print(1.0*total/trials,1.0*final/trials)
[/code]
I’m mystified.
import random
trials = 100000; n = 10; total = 0; final = 0
for i in range(trials):
count = 0; a = -1; b = 1
while True:
assert(abs(a-b) 0)
count += 1
if a < b: a += 1 + random.randrange(n)
else: b += 1 + random.randrange(n)
if a == b: break
total += count
final += a
print(1.0*total/trials,1.0*final/trials)
[/code]
import random trials = 100000; n = 10; total = 0; final = 0 for i in range(trials): count = 0; a = -1; b = 1 while True: assert(abs(a-b) <= n) assert(abs(a-b) > 0) count += 1 if b > a: a += 1 + random.randrange(n) else: b += 1 + random.randrange(n) if a == b: break total += count final += a print(1.0*total/trials,1.0*final/trials)The code formatter didn’t like the “a < b” – not normally a problem.
Sorry about the spam, feel free to delete.