The Monty Hall Problem

· antonio's blog


Table of Contents

Background #

The Monty Hall problem is a puzzle that's rooted in probability and is a bit of a brain teaser. A friend recently described the problem and that got me thinking about how to simulate the puzzle and some of the common gotchas that trip people up.

The problem #

The problem described is one of a game. You have 3 doors to choose from, 2 of which have goats behind them and 1 has a car behind it. You win whatever is behind the door that you choose. Once you choose a door, the host will tell you about one door that has a goat behind it. Once the host tells you this information, you have the choice to stay at the door you've already chosen or change doors. Which is better?

Assumptions #

According to Wikipedia, there are a few assumptions:

Monty Hall problem > Standard Assumptions

  1. The host must always open a door that was not selected by the contestant.
  2. The host must always open a door to reveal a goat and never the car.
  3. The host must always offer the chance to switch between the door chosen originally and the closed door remaining.

Simulation #

Therefore, we can come up with a simple simulation in Python:

 1import random
 2
 3
 4def main():
 5    """
 6    Run the main simulation 100000 times and print the number of times
 7    switching the door was better and the number of times switching the door was worse
 8    """
 9
10    # Run the simulation 100000 times and save the results of each iteration
11    res = []
12    for _ in range(0, 100000):
13        res.append(better())
14
15    # Count the number of times switching doors was better
16    good = res.count(True)
17
18    # Count the number of times switching doors was worse
19    bad = res.count(False)
20
21    print(f"Number of times switching was good: {good} ({(good/len(res))*100})")
22    print(f"Number of times switching was bad: {bad} ({(bad/len(res))*100})")
23
24
25def better():
26    """
27    Test the Monty Hall problem to decide which option is better (switching doors or not)
28    """
29
30    # Choose where the car is
31    car_pos = random.randint(0, 2)
32
33    # Set a list where we know the goats are now
34    goat_pos = [x for x in range(0, 3) if x != car_pos]
35
36    # Randomly guess where the car might be
37    guess_pos = random.randint(0, 2)
38
39    # Choose one of the goat options to tell the player about
40    # *IMPORTANT* It CANNOT be the door a player picked
41    while True:
42        told_goat_pos = random.choice(goat_pos)
43        if told_goat_pos != guess_pos:
44            break
45
46    # Figure out the other option that we switch to
47    other_opt = [
48        x for x in range(0, 3) if x != guess_pos and x != told_goat_pos
49    ][0]
50
51    # Check if the other option is the car position. If it is, return True
52    # meaning that the switch was better
53    if other_opt == car_pos:
54        return True
55    return False
56
57
58if __name__ == "__main__":
59    main()

The result of running this simulation a few different times is as follows:

 1$ python3 /tmp/test.py
 2Number of times switching was good: 66642 (66.642)
 3Number of times switching was bad: 33358 (33.358)
 4
 5$ python3 /tmp/test.py
 6Number of times switching was good: 66721 (66.721)
 7Number of times switching was bad: 33279 (33.278999999999996)
 8
 9$ python3 /tmp/test.py
10Number of times switching was good: 66713 (66.713)
11Number of times switching was bad: 33287 (33.287)

Which is pretty much what we expect. You have a 2/3 (~66%) chance of winning the car if you change doors. If you don't change doors, you have a 1/3 (~33%) chance of winning the car.

A common misconception of people is this would just result in a 1/2 (50%) chance of getting the car. This is because once a goat is revealed, you have two options left (one of which a goat, the other the car) which results in a 1/2 (50%) chance.

The one important thing that's missing from making the above true is that the host never reveals to the player what is behind the player's chosen door (based on our assumptions). Therefore, you can't operate on the assumption that our chances are 50/50. Here's the same simulation code, but we randomly tell the player about which door a goat is behind:

 1import random
 2
 3
 4def main():
 5    """
 6    Run the main simulation 100000 times and print the number of times
 7    switching the door was better and the number of times switching the door was worse
 8    """
 9
10    # Run the simulation 100000 times and save the results of each iteration
11    res = []
12    for _ in range(0, 100000):
13        res.append(better())
14
15    # Count the number of times switching doors was better
16    good = res.count(True)
17
18    # Count the number of times switching doors was worse
19    bad = res.count(False)
20
21    print(f"Number of times switching was good: {good} ({(good/len(res))*100})")
22    print(f"Number of times switching was bad: {bad} ({(bad/len(res))*100})")
23
24
25def better():
26    """
27    Test the Monty Hall problem to decide which option is better (switching doors or not)
28    """
29
30    # Choose where the car is
31    car_pos = random.randint(0, 2)
32
33    # Set a list where we know the goats are now
34    goat_pos = [x for x in range(0, 3) if x != car_pos]
35
36    # Randomly guess where the car might be
37    guess_pos = random.randint(0, 2)
38
39    # Randomly tell the player about what door has a goat behind it
40    told_goat_pos = random.choice(goat_pos)
41
42    # Figure out the other option that we switch to
43    other_opt = [
44        x for x in range(0, 3) if x != guess_pos and x != told_goat_pos
45    ][0]
46
47    # Check if the other option is the car position. If it is, return True
48    # meaning that the switch was better
49    if other_opt == car_pos:
50        return True
51    return False
52
53
54if __name__ == "__main__":
55    main()

The result of running this modified simulation a few different times is as follows:

 1$ python3 /tmp/test.py
 2Number of times switching was good: 50115 (50.114999999999995)
 3Number of times switching was bad: 49885 (49.885000000000005)
 4
 5$ python3 /tmp/test.py
 6Number of times switching was good: 50234 (50.234)
 7Number of times switching was bad: 49766 (49.766)
 8
 9$ python3 /tmp/test.py
10Number of times switching was good: 49879 (49.879)
11Number of times switching was bad: 50121 (50.121)

Which results in the 50/50 probability we expected! Pretty neat!