Counterintuitive Results in Flipping Coins | /en/2009/08/counterintuitive-results-in-flipping-coins/

yihui 2022-12-16 17:17:18

https://yihui.org/en/2009/08/counterintuitive-results-in-flipping-coins/

2 Comments

giscus-bot 2022-12-16 17:17:19

Guest *Tel* @ 2009-10-17 14:32:32 originally posted:

At first I couldn't believe it, then I wrote my own code (in C, not R) to run it and I got very similar results to yours. Thinking that maybe the random generator was broken I even tried a different random generator, but kept getting similar results (HTT always gets substantially lower mean than HTH).

It only dawned on me what was happening when I looked closely at the histogram of probabilities for each "n" value:

N, P(HTT), P(HTH)
3, 12.5%, 12.5%
4, 12.5%, 12.5%
5, 12.5%, 9.3%
6, 11.0%, 7.8%
7, 9.4%, 7.1%
8, 7.8%, 6.3%
9, 6.5%, 5.5%
10, 5.2%, 4.8%
11, 4.3%, 4.2%
12, 3.5%, 3.7%

HTH has a much longer tail (as your plot suggests). Both sequences start with a probability of 1 in 8 (because there are 8 equally likely outcomes for the first three tosses), and the difference only becomes visible after several state transitions. If you draw the state transition table between the 8 possible states of the system, it becomes obvious that the HTT state sits on the confluence of many loop paths, while the HTH state is tucked in the middle with fewer loops passing through it.
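The simulated table can be cross-checked without random numbers. The following is only a sketch under my own assumptions: the function name `first.hit` and its arguments are hypothetical, and it tracks how many leading elements of the pattern the latest flips match (a smaller state space than the 8-state table described above), pushing probability mass through those states one flip at a time.

```r
# Exact P(pattern first completes on flip n), for n = 1..nmax.
# 1 = H, 0 = T. first.hit() is a hypothetical helper, not from the post.
first.hit = function(pattern, nmax = 12) {
    m = length(pattern)
    # state s = number of leading pattern elements matched by the latest flips;
    # step() gives the state reached after seeing flip b in state s
    step = function(s, b) {
        h = c(pattern[seq_len(s)], b)  # matched prefix plus the new flip
        for (k in min(m, length(h)):1) {
            # longest prefix of the pattern that is also a suffix of h
            if (all(tail(h, k) == pattern[1:k])) return(k)
        }
        0
    }
    p = c(1, rep(0, m - 1))  # probabilities of states 0..(m - 1); start in 0
    out = numeric(nmax)
    for (n in 1:nmax) {
        q = numeric(m)
        for (s in 0:(m - 1)) for (b in 0:1) {
            s2 = step(s, b)
            if (s2 == m) {
                out[n] = out[n] + p[s + 1] / 2     # pattern completed on flip n
            } else {
                q[s2 + 1] = q[s2 + 1] + p[s + 1] / 2
            }
        }
        p = q
    }
    out
}

# percentages for n = 1..12, comparable to the table above
round(100 * cbind(HTT = first.hit(c(1, 0, 0)), HTH = first.hit(c(1, 0, 1))), 1)
```

Both columns should be 0 for n = 1, 2 and 12.5 for n = 3, 4, and they separate from n = 5 on, which is where the simulated values start to differ as well.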

Consider it differently: suppose we keep flipping the coin until either the HTH sequence comes up or the HTT sequence comes up. Now which one is more likely to come up?

yihui 2022-12-16 17:17:21

For your different question, I think the expected number of flips is the same whether HTH or HTT comes up first; reasoning from the perspective of probability is much easier: both sequences begin with HT, so we simply wait for HT and the next flip decides which one we get. And the simulation tells us the expected number of flips is 5:

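coin.seq2 = function(v = list(c(1, 0, 0), c(1, 0, 1))) {
    # 1 = H, 0 = T, so the defaults are HTT and HTH
    x = NULL  # the most recent (up to three) flips
    n = 0     # number of flips so far
    # compare by value: rbinom() returns integers in current versions of R,
    # and identical() would never match them against the doubles in v
    hit = function() sapply(v, function(y) length(x) == length(y) && all(x == y))
    while (!any(hit())) {
        # keep the last two flips and append a new one
        x = append(x[length(x) - 1:0], rbinom(1, 1, 0.5))
        n = n + 1
    }
    # number of flips, and the index in v of the sequence that came up
    c(n, which(hit()))
}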
res = replicate(5000, coin.seq2())
tapply(res[1, ], res[2, ], mean)

For my original question, we may explain the result as follows: after we get the HT sequence, consider HTH and HTT respectively:

HTH: if we fail to get H next (i.e. get T instead), we have to start all over again and wait for another initial H (then T, then H);

HTT: if we fail to get T next (i.e. get H instead), we have just got the initial element of HTT and can go on to wait for the next T; that means we do not need to start all over again;

Thus, HTH takes longer to appear in the sequence.
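The argument above can be turned into a small linear system. This is a minimal sketch, assuming three "progress" states (nothing useful yet, H, and HT); the function name `wait.time` and its `back` argument are hypothetical and only encode where a wrong flip after HT sends us.

```r
# Expected number of flips until a pattern first appears.
# Assumption: `back` is the state we fall back to when the flip after HT is
# the wrong one: "start" for HTH (got T), "H" for HTT (got H).
wait.time = function(back = c("start", "H")) {
    back = match.arg(back)
    s = c("start", "H", "HT")
    # P[i, j] = probability of moving from state i to state j in one flip;
    # the missing 0.5 in row "HT" is the flip that completes the pattern
    P = matrix(0, 3, 3, dimnames = list(s, s))
    P["start", c("start", "H")] = 0.5  # T keeps us at start, H moves on
    P["H", c("H", "HT")] = 0.5         # H keeps us at H, T moves on
    P["HT", back] = 0.5                # wrong flip after HT
    # E = 1 + P %*% E, i.e. (I - P) E = 1; E[1] is the wait from the start
    E = solve(diag(3) - P, rep(1, 3))
    unname(E[1])
}

wait.time("start")  # HTH: 10
wait.time("H")      # HTT: 8
```

The only difference between the two calls is the fallback state, and that alone accounts for the two extra flips HTH needs on average.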

Originally posted on 2009-10-18 04:37:08

giscus-bot 2022-12-16 17:17:20

Guest *Joshua* @ 2009-12-29 01:01:58 originally posted:

Very useful post! (and code :P) Thanks.

http://www.ted.com/talks/lang/eng/peter_donnelly_shows_how_stats_fool_juries.html