The Hot Hand
October 2, 2015
My program is very simple. It flips four coins (1 heads, 0 tails), then counts the outcomes of each flip following a heads, repeating that a million times:
(define (hot-hand n) (let ((streak 0) (break 0)) (do ((i 0 (+ i 1))) ((= n i) (values streak break)) (let ((one (randint 2)) (two (randint 2)) (three (randint 2)) (four (randint 2))) (when (and (= one 1) (= two 1)) (set! streak (+ streak 1))) (when (and (= one 1) (= two 0)) (set! break (+ break 1))) (when (and (= two 1) (= three 1)) (set! streak (+ streak 1))) (when (and (= two 1) (= three 0)) (set! break (+ break 1))) (when (and (= three 1) (= four 1)) (set! streak (+ streak 1))) (when (and (= three 1) (= four 0)) (set! break (+ break 1))))))) > (hot-hand 1000000) 749018 750222
That looks an awful lot like 50% to me. I obviously don’t understand the article. Perhaps one of my readers can explain it to me.
We used the random number generator from the Standard Prelude. You can run the program at http://ideone.com/5kAvio.
In Python. As you can see the answer is 0.4047… I feel I made a mistake somewhere. It is a strange problem. There is no need to do one million random trials, as there are only 16 different outcomes for the 4 flips possible. I simple enumerate these 16 cases, which are equally likely. Two cases (TTTT and TTTH) do not score, as there are no heads for the first three flips.
It’s a slightly more complex than the article leads one to believe, if I understand correctly. Gelman’s handling of it in the blog post linked in the article, and later in a follow up post is pretty decent.
The combinatorics can be done essentially by hand. For the sixteen types of equiprobable games (that will occur about equally often in a simulation), count the heads that are followed by something (“c” for “cases”) and the heads that follow heads (“s” for “successes”), skip the types with no cases (as irrelevant or something), report the mean proportion of success. (I meant those columns to be aligned.)
I see now how I misunderstood the question. Thanks to Paul and Jussi. And yes, I knew that I only needed to count the sixteen possibilities, not perform a million trials, even though that’s what the article challenged me to do. Harumph. On reflection, I’m not sure I like this exercise.
A million-round simulation is a big overkill. A thousand rounds seems good already, and with ten thousand rounds the result varies within one percent unit of 50% (whether non-final heads is followed by heads) and about 40% (the average proportion of heads after non-final heads in 4-games with non-final heads), respectively.
I’ve now browsed through the discussion in the first blog post by Andrew Gelman that graham cited above. There were some comments about the equal weighting of games with different numbers of non-final heads in the averaging. That may have something to do with something. Or not :) The real debate seems to be over what is the best way to understand the concepts of “hot hands” and streaks, and the real application is to much more complex data. I must remain confused on that but we’re at least able to replicate a laboratory experiment. And clearly the 40% figure is fully compatible with the 50% figure.
I don’t care, I’m doing a million trials! :-) A Haskell version with variable trial size and number of trials. We treat False as tails and True as heads. (No real attempt at optimization except for the bang patterns in “mean”.)
We run the program with different “trial” sizes, from one million runs of four flips per trial up to one run consisting of four million flips. As the number of flips per trial increases the percentage approaches 50% as expected. The first line corresponds to the problem as described.
I am getting 35% whats wrong with my thinking:
Can anyone recommend me a book to strengthen my ability to formulate algorithms for either python or java ? I find when I attempt these problems I take way to long to break it down into its fundamental components.
python:
@Jens The key is in the phrase “the percentage of heads on the flips **coming immediately after heads**”. It’s telling us to restrict our calculation of a percentage to those lists that have a heads with (at least) one more flip after it. For example, if we generate TTTT or TTTH then we don’t calculate a percentage. It’s not that the percentage is zero, but rather we don’t even consider those cases (i.e. we skip them).
What this means for your code is that instead of dividing by num at the end, you should instead divide by the count of “valid” lists. Something like this:
@Jens Another thing (that I just noticed) is that the outermost loop condition should now be “while Count < num:", although for large values of num this should make almost no difference.
@Globules, ahh thank you I see where I made a mistake. cheers