## The Hot Hand

### October 2, 2015

My program is very simple. It flips four coins (1 heads, 0 tails), then counts the outcomes of each flip following a heads, repeating that a million times:

```(define (hot-hand n)
(let ((streak 0) (break 0))
(do ((i 0 (+ i 1)))
((= n i) (values streak break))
(let ((one (randint 2))
(two (randint 2))
(three (randint 2))
(four (randint 2)))
(when (and (= one 1) (= two 1)) (set! streak (+ streak 1)))
(when (and (= one 1) (= two 0)) (set! break (+ break 1)))
(when (and (= two 1) (= three 1)) (set! streak (+ streak 1)))
(when (and (= two 1) (= three 0)) (set! break (+ break 1)))
(when (and (= three 1) (= four 1)) (set! streak (+ streak 1)))
(when (and (= three 1) (= four 0)) (set! break (+ break 1)))))))

> (hot-hand 1000000)
749018
750222```

That looks an awful lot like 50% to me. I obviously don’t understand the article. Perhaps one of my readers can explain it to me.

We used the random number generator from the Standard Prelude. You can run the program at http://ideone.com/5kAvio.

Pages: 1 2

### 10 Responses to “The Hot Hand”

1. Paul said

In Python. As you can see the answer is 0.4047… I feel I made a mistake somewhere. It is a strange problem. There is no need to do one million random trials, as there are only 16 different outcomes for the 4 flips possible. I simple enumerate these 16 cases, which are equally likely. Two cases (TTTT and TTTH) do not score, as there are no heads for the first three flips.

```from itertools import product
from statistics import mean

def score(flips):
heads = sum(flips[:-1]) # last flip does not count

def hothands():
scores = [score(flips) for flips in product((HEAD, TAILS), repeat=4)]
return mean(s for s in scores if s != NOHEADS)

print(hothands()) # 0.4047619047619047
```
2. graham said

It’s a slightly more complex than the article leads one to believe, if I understand correctly. Gelman’s handling of it in the blog post linked in the article, and later in a follow up post is pretty decent.

3. Jussi Piitulainen said

The combinatorics can be done essentially by hand. For the sixteen types of equiprobable games (that will occur about equally often in a simulation), count the heads that are followed by something (“c” for “cases”) and the heads that follow heads (“s” for “successes”), skip the types with no cases (as irrelevant or something), report the mean proportion of success. (I meant those columns to be aligned.)

```;; 1       h h h h h h h h t t t t t t t t
;; 2       h h h h t t t t h h h h t t t t
;; 3       h h t t h h t t h h t t h h t t
;; 4       h t h t h t h t h t h t h t h t
(let ((c '(3 3 2 2 2 2 1 1 2 2 1 1 1 1    ))
(s '(3 2 1 1 1 0 0 0 2 1 0 0 1 0    )))
(let* ((proportions (map / s c))
(mean (/ (apply + proportions)
(length proportions))))
(display `(fraction ,mean)) (newline)
(display `(per-cent ,(round (* 100 mean)))) (newline)))
;; displays:
;; (fraction 17/42)
;; (per-cent 40)
```
4. programmingpraxis said

I see now how I misunderstood the question. Thanks to Paul and Jussi. And yes, I knew that I only needed to count the sixteen possibilities, not perform a million trials, even though that’s what the article challenged me to do. Harumph. On reflection, I’m not sure I like this exercise.

5. Jussi Piitulainen said

A million-round simulation is a big overkill. A thousand rounds seems good already, and with ten thousand rounds the result varies within one percent unit of 50% (whether non-final heads is followed by heads) and about 40% (the average proportion of heads after non-final heads in 4-games with non-final heads), respectively.

I’ve now browsed through the discussion in the first blog post by Andrew Gelman that graham cited above. There were some comments about the equal weighting of games with different numbers of non-final heads in the averaging. That may have something to do with something. Or not :) The real debate seems to be over what is the best way to understand the concepts of “hot hands” and streaks, and the real application is to much more complex data. I must remain confused on that but we’re at least able to replicate a laboratory experiment. And clearly the 40% figure is fully compatible with the 50% figure.

6. Globules said

I don’t care, I’m doing a million trials! :-) A Haskell version with variable trial size and number of trials. We treat False as tails and True as heads. (No real attempt at optimization except for the bang patterns in “mean”.)

```{-# LANGUAGE BangPatterns #-}

import Data.List (foldl')
import Data.List.Split (chunksOf)
import Data.Ratio
import System.Environment (getArgs)
import System.Random.Mersenne.Pure64
import System.Random
import Text.Printf (printf)

-- Run one trial for a list of coin flips.  Return the pair (s, t), where s is
-- the number of successful outcomes (two heads in a row) and t is the number
-- of valid coin flip pairs (a head followed by another flip).
trial :: [Bool] -> (Integer, Integer)
trial bs = foldl' step (0, 0) \$ zip bs (tail bs)
where step (s, t) (True, True ) = (s+1, t+1)
step (s, t) (True, False) = (s  , t+1)
step (s, t) (   _,     _) = (s  , t  )

-- The result of a list of trials.
trials :: [[Bool]] -> [(Integer, Integer)]
trials = filter ((/= 0) . snd) . map trial

-- The average of the ratios of a list of trials.
avgTrials :: Fractional a => [[Bool]] -> a
avgTrials = mean . map (uncurry (%)) . trials

-- The mean of a list of numbers.
mean :: (Real a, Fractional b) => [a] -> b
mean [] = 0
mean xs = let (tot, cnt) = foldl' (\(!s, !n) x -> (s+x, n+1)) (0, 0 :: Int) xs
in realToFrac tot / realToFrac cnt

main :: IO ()
main = do
-- Who needs argument validation?
[c, n] <- liftM (map read) getArgs :: IO [Int]
flips <- liftM (randomRs (False, True)) newPureMT
let avg = avgTrials . take n . chunksOf c \$ flips :: Double
printf "%7d %7d  %0.1f%%\n" c n (100 * avg)
```

We run the program with different “trial” sizes, from one million runs of four flips per trial up to one run consisting of four million flips. As the number of flips per trial increases the percentage approaches 50% as expected. The first line corresponds to the problem as described.

```\$ echo 4 1000000 40 100000 400 10000 4000 1000 40000 100 400000 10 4000000 1 | xargs -n2 ./hothand
4 1000000  40.4%
40  100000  48.7%
400   10000  49.9%
4000    1000  50.0%
40000     100  50.0%
400000      10  50.0%
4000000       1  49.9%
```
7. I am getting 35% whats wrong with my thinking:
Can anyone recommend me a book to strengthen my ability to formulate algorithms for either python or java ? I find when I attempt these problems I take way to long to break it down into its fundamental components.
python:

```import random

Sum=0
for k in range(num):
coin4=[]
consecH=0
for i in range(4): #generate coin flips
coin4.append(random.randint(0,1))
print(coin4)
for i in range(3): #sample size, since cant compare last index to anything
if(coin4[i]==1 and coin4[i+1]==1):#total number of consecutive heads
consecH+=1
return Sum/num

```
8. Globules said

@Jens The key is in the phrase “the percentage of heads on the flips **coming immediately after heads**”. It’s telling us to restrict our calculation of a percentage to those lists that have a heads with (at least) one more flip after it. For example, if we generate TTTT or TTTH then we don’t calculate a percentage. It’s not that the percentage is zero, but rather we don’t even consider those cases (i.e. we skip them).

What this means for your code is that instead of dividing by num at the end, you should instead divide by the count of “valid” lists. Something like this:

```import random

Count=0
Sum=0
for k in range(num):
coin4=[]
consecH=0
for i in range(4): #generate coin flips
coin4.append(random.randint(0,1))
for i in range(3): #sample size, since cant compare last index to anything
if(coin4[i]==1 and coin4[i+1]==1):#total number of consecutive heads
consecH+=1