The Hot Hand
October 2, 2015
My program is very simple. It flips four coins (1 for heads, 0 for tails), then tallies the outcome of each flip that follows a heads, repeating a million times:
(define (hot-hand n)
  (let ((streak 0) (break 0))
    (do ((i 0 (+ i 1)))
        ((= n i) (values streak break))
      (let ((one (randint 2))
            (two (randint 2))
            (three (randint 2))
            (four (randint 2)))
        (when (and (= one 1) (= two 1)) (set! streak (+ streak 1)))
        (when (and (= one 1) (= two 0)) (set! break (+ break 1)))
        (when (and (= two 1) (= three 1)) (set! streak (+ streak 1)))
        (when (and (= two 1) (= three 0)) (set! break (+ break 1)))
        (when (and (= three 1) (= four 1)) (set! streak (+ streak 1)))
        (when (and (= three 1) (= four 0)) (set! break (+ break 1)))))))
> (hot-hand 1000000)
749018
750222
That looks an awful lot like 50% to me. I obviously don’t understand the article. Perhaps one of my readers can explain it to me.
We used the random number generator from the Standard Prelude. You can run the program at http://ideone.com/5kAvio.
In Python. It is a strange problem; at first I felt I had made a mistake somewhere. There is no need to do one million random trials, as there are only 16 different possible outcomes for the 4 flips. I simply enumerate these 16 cases, which are equally likely. Two cases (TTTT and TTTH) do not score, as there are no heads in the first three flips. As you can see, the answer is 0.4047…
from itertools import product
from statistics import mean

HEAD, TAILS = 1, 0
NOHEADS = -1

def score(flips):
    heads = sum(flips[:-1])  # last flip does not count
    if heads == 0:
        return NOHEADS
    head_after_head = sum(f1 & f2 for f1, f2 in zip(flips, flips[1:]))
    return head_after_head / heads

def hothands():
    scores = [score(flips) for flips in product((HEAD, TAILS), repeat=4)]
    return mean(s for s in scores if s != NOHEADS)

print(hothands())  # 0.4047619047619047

It’s slightly more complex than the article leads one to believe, if I understand correctly. Gelman’s handling of it in the blog post linked in the article, and later in a follow-up post, is pretty decent.
The combinatorics can be done essentially by hand. For the sixteen types of equiprobable games (which will occur about equally often in a simulation), count the heads that are followed by something (“c” for “cases”) and the heads that follow heads (“s” for “successes”), skip the types with no cases (as irrelevant or something), and report the mean proportion of success.
;; 1  h h h h h h h h t t t t t t t t
;; 2  h h h h t t t t h h h h t t t t
;; 3  h h t t h h t t h h t t h h t t
;; 4  h t h t h t h t h t h t h t h t
(let ((c '(3 3 2 2 2 2 1 1 2 2 1 1 1 1))
      (s '(3 2 1 1 1 0 0 0 2 1 0 0 1 0)))
  (let* ((proportions (map / s c))
         (mean (/ (apply + proportions) (length proportions))))
    (display `(fraction ,mean)) (newline)
    (display `(per-cent ,(round (* 100 mean)))) (newline)))
;; displays:
;; (fraction 17/42)
;; (per-cent 40)

I see now how I misunderstood the question. Thanks to Paul and Jussi. And yes, I knew that I only needed to count the sixteen possibilities, not perform a million trials, even though that’s what the article challenged me to do. Harumph. On reflection, I’m not sure I like this exercise.
A million-round simulation is big overkill. A thousand rounds seems good already, and with ten thousand rounds the result varies within one percentage point of 50% (whether a non-final head is followed by a head) and of about 40% (the average, over 4-games that have non-final heads, of the proportion of heads after non-final heads), respectively.
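To make that concrete, here is a rough Python sketch of my own (not from the thread; the function name and the round counts are arbitrary choices) that reruns the 4-flip experiment at a few different round counts:

import random

def average_proportion(rounds):
    # Mean, over 4-flip games that contain at least one non-final head, of
    # the proportion of those heads that are immediately followed by a head.
    total, games = 0.0, 0
    for _ in range(rounds):
        flips = [random.randint(0, 1) for _ in range(4)]
        heads = sum(flips[:3])          # non-final heads only
        if heads == 0:
            continue                    # TTTT and TTTH don't score
        hh = sum(flips[i] & flips[i + 1] for i in range(3))
        total += hh / heads
        games += 1
    return total / games

for rounds in (1000, 10000, 1000000):
    print(rounds, round(average_proportion(rounds), 3))

With a thousand rounds the estimate should already land within a couple of percentage points of 0.405; the larger counts mostly just tighten the trailing digits.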
I’ve now browsed through the discussion in the first blog post by Andrew Gelman that graham cited above. There were some comments about the equal weighting of games with different numbers of non-final heads in the averaging. That may have something to do with something. Or not :) The real debate seems to be over what is the best way to understand the concepts of “hot hands” and streaks, and the real application is to much more complex data. I must remain confused on that, but we are at least able to replicate a laboratory experiment. And clearly the 40% figure is fully compatible with the 50% figure.
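The weighting remark can be checked directly with a small sketch of my own (not from the thread), using exact fractions: giving every scoring game equal weight reproduces 17/42, while weighting each game by its number of cases, which is the same as pooling all the flips together, gives exactly 1/2. That is one concrete way to see how the 40% and 50% figures are compatible.

from fractions import Fraction
from itertools import product

# The 14 equiprobable 4-flip games that have at least one non-final head.
games = [g for g in product((1, 0), repeat=4) if sum(g[:3]) > 0]
cases = [sum(g[:3]) for g in games]                           # non-final heads
succ = [sum(a & b for a, b in zip(g, g[1:])) for g in games]  # heads after heads

# Equal weight per game: the article's 40% figure.
unweighted = sum(Fraction(s, c) for s, c in zip(succ, cases)) / len(games)
# Weight each game by its cases: identical to pooling all flips.
weighted = Fraction(sum(succ), sum(cases))

print(unweighted)  # 17/42
print(weighted)    # 1/2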
I don’t care, I’m doing a million trials! :-) A Haskell version with variable trial size and number of trials. We treat False as tails and True as heads. (No real attempt at optimization except for the bang patterns in “mean”.)
{-# LANGUAGE BangPatterns #-}

import Control.Monad (liftM)
import Data.List (foldl')
import Data.List.Split (chunksOf)
import Data.Ratio
import System.Environment (getArgs)
import System.Random.Mersenne.Pure64
import System.Random
import Text.Printf (printf)

-- Run one trial for a list of coin flips.  Return the pair (s, t), where s is
-- the number of successful outcomes (two heads in a row) and t is the number
-- of valid coin flip pairs (a head followed by another flip).
trial :: [Bool] -> (Integer, Integer)
trial bs = foldl' step (0, 0) $ zip bs (tail bs)
  where step (s, t) (True, True ) = (s+1, t+1)
        step (s, t) (True, False) = (s  , t+1)
        step (s, t) (   _,     _) = (s  , t  )

-- The result of a list of trials.
trials :: [[Bool]] -> [(Integer, Integer)]
trials = filter ((/= 0) . snd) . map trial

-- The average of the ratios of a list of trials.
avgTrials :: Fractional a => [[Bool]] -> a
avgTrials = mean . map (uncurry (%)) . trials

-- The mean of a list of numbers.
mean :: (Real a, Fractional b) => [a] -> b
mean [] = 0
mean xs = let (tot, cnt) = foldl' (\(!s, !n) x -> (s+x, n+1)) (0, 0 :: Int) xs
          in realToFrac tot / realToFrac cnt

main :: IO ()
main = do
  -- Who needs argument validation?
  [c, n] <- liftM (map read) getArgs :: IO [Int]
  flips <- liftM (randomRs (False, True)) newPureMT
  let avg = avgTrials . take n . chunksOf c $ flips :: Double
  printf "%7d %7d %0.1f%%\n" c n (100 * avg)

We run the program with different “trial” sizes, from one million runs of four flips per trial up to one run consisting of four million flips. As the number of flips per trial increases, the percentage approaches 50%, as expected. The first line corresponds to the problem as described.
$ echo 4 1000000  40 100000  400 10000  4000 1000  40000 100  400000 10  4000000 1 | xargs -n2 ./hothand
      4 1000000 40.4%
     40  100000 48.7%
    400   10000 49.9%
   4000    1000 50.0%
  40000     100 50.0%
 400000      10 50.0%
4000000       1 49.9%

I am getting 35%. What’s wrong with my thinking?
Can anyone recommend a book to strengthen my ability to formulate algorithms in either Python or Java? I find that when I attempt these problems I take way too long to break them down into their fundamental components.
python:
@Jens The key is in the phrase “the percentage of heads on the flips **coming immediately after heads**”. It’s telling us to restrict our calculation of a percentage to those lists that have a heads with (at least) one more flip after it. For example, if we generate TTTT or TTTH then we don’t calculate a percentage. It’s not that the percentage is zero, but rather we don’t even consider those cases (i.e. we skip them).
What this means for your code is that, instead of dividing by num at the end, you should divide by the count of “valid” lists. Something like this:
import random

def headInFour(num):  # number of trials
    Count = 0
    Sum = 0
    for k in range(num):
        coin4 = []
        consecH = 0
        headTotal = 0
        for i in range(4):  # generate coin flips
            coin4.append(random.randint(0, 1))
        for i in range(3):  # sample size, since we can't compare the last index to anything
            if coin4[i] == 1:  # total number of heads
                headTotal += 1
            if coin4[i] == 1 and coin4[i+1] == 1:  # total number of consecutive heads
                consecH += 1
        if headTotal != 0:  # calc average
            Count += 1
            Sum += consecH / headTotal
    return Sum / Count

print("result ", headInFour(10000))

@Jens Another thing (that I just noticed) is that the outermost loop condition should now be “while Count < num:”, although for large values of num this should make almost no difference.
@Globules, ahh, thank you, I see where I made my mistake. Cheers!