Statistics: Probabilities and Distributions - Lesson 7

Simulating Experiments

Lesson Overview

Finding the correct probability of an event can be difficult, and at times the correct result may even seem to be wrong. Simulation can be of great benefit, saving both time and money over the alternative of solving the problem by applying only the abstract principles of probability theory. We define a simulation as follows.

A simulation of an experiment is a process that behaves the same way as the experiment,
so that similar results are produced.

Let's Play Risk.

Some of you may be familiar with Risk, a classic board game of world dominance through aggression, now available online. Within the game, battles are fought with the outcome determined by the roll of dice. The attacking army typically rolls three red dice against the defender's two white dice. The two highest red dice are compared with the white dice: highest against highest, then second highest against second highest. Red wins a comparison only if its die is bigger, so ties go to white. This could be analyzed either exhaustively, by trying all 6^5 = 7776 combinations, or by simulation. To do it exhaustively one might write a program to analyze each result. The results would be 2890 red wins, 2611 splits, and 2275 white wins. Expressed as probabilities: 0.372 red, 0.336 split, and 0.293 white. (We classify as atypical the times when the defender can roll only one die, as when he has but one army, or when the attacker can roll, or chooses to roll, only two dice.)
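
The exhaustive approach might be sketched in Python as follows. This is only an illustration, not the C program mentioned in the References; the function names red_wins and enumerate_risk are made up for this sketch. It walks through every one of the 7776 rolls and tallies who wins.

```python
from itertools import product

def red_wins(red, white):
    """Return how many of the two comparisons the attacker (red) wins."""
    top_red = sorted(red, reverse=True)[:2]    # two highest red dice
    top_white = sorted(white, reverse=True)    # both white dice, high first
    # Red wins a comparison only when strictly larger; ties go to white.
    return sum(r > w for r, w in zip(top_red, top_white))

def enumerate_risk():
    """Tally all 6**5 rolls of three red dice and two white dice."""
    counts = {"red": 0, "split": 0, "white": 0}
    labels = {2: "red", 1: "split", 0: "white"}
    for roll in product(range(1, 7), repeat=5):
        # First three values are the red dice, last two are the white dice.
        counts[labels[red_wins(roll[:3], roll[3:])]] += 1
    return counts

if __name__ == "__main__":
    counts = enumerate_risk()
    total = sum(counts.values())
    for name, n in counts.items():
        print(f"{name}: {n} of {total} = {n / total:.3f}")
```

Run as a script, this should reproduce the counts quoted above (2890 red, 2611 split, 2275 white).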

Alternatively, one could toss dice repeatedly and tally the results; program the TI-83+ graphing calculator to toss dice repeatedly, using int(1+6rand) for one die or randInt(1,6,5) to roll five dice; or download a 30-day evaluation copy of MINITAB and simulate it. One might be tempted to use a standard spreadsheet program, but since these were not written by statisticians and are not endorsed by them, one should be very wary of the statistical results they produce. Some have been shown to be just plain wrong, but the software giant(s?) refuse to correct these errors.
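
For readers without a TI-83+ or MINITAB at hand, the same dice-tossing simulation could be sketched in Python. This is a hedged illustration, not part of the original lesson: the names roll and simulate_risk, the seed parameter, and the choice of 100,000 battles are all assumptions made here.

```python
import random

def roll(n, rng):
    """Roll n fair six-sided dice, much like randInt(1,6,n) on the calculator."""
    return [rng.randint(1, 6) for _ in range(n)]

def simulate_risk(battles, seed=None):
    """Simulate battles of three red dice against two white dice.

    Returns the observed proportion of red wins, splits, and white wins.
    """
    rng = random.Random(seed)   # a seed makes the run repeatable
    tally = {"red": 0, "split": 0, "white": 0}
    labels = {2: "red", 1: "split", 0: "white"}
    for _ in range(battles):
        red = sorted(roll(3, rng), reverse=True)[:2]   # two highest red dice
        white = sorted(roll(2, rng), reverse=True)
        # Red wins a comparison only when strictly larger; ties go to white.
        wins = sum(r > w for r, w in zip(red, white))
        tally[labels[wins]] += 1
    return {name: count / battles for name, count in tally.items()}

if __name__ == "__main__":
    for name, p in simulate_risk(100_000).items():
        print(f"{name}: {p:.3f}")
```

With this many battles the proportions should land near 0.372, 0.336, and 0.293, though each run will differ slightly in the third decimal place.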

A few other common teaching packages include Fathom (much like The Geometer's Sketchpad and also by Key Curriculum) and Data Desk (packaged with ActivStats, a multimedia statistical education package). Some professional packages include SAS (Statistical Analysis System), BMDP, and SPSS (Statistical Package for the Social Sciences). All these packages are produced to provide statistically valid results (unlike many spreadsheets).

Let's Make Babies (not!).

Let's try another example.

Example: What is the expected average family size if a couple plans to stop having children after having one child of each gender? (No processes, such as timing, acidity, deposition depth, etc., are used to enhance gender selection.)

Solution: Instead of using expensive and time-consuming methods such as conducting a controlled experiment or surveying a large number of families, we will toss a coin. Both sides (heads and tails) are equally likely. Heads can represent girls and tails can represent boys. For each "family", toss until you get one head and one tail: (H,H,H,T), (T,T,T,H), (H,H,H,H,H,H,T), (H,T), etc. After a dozen "families" or so, the average number of tosses should be reasonably close to 3, the theoretical result (one toss for the first child, plus an average of two more until the other side appears).
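
The coin-tossing procedure can also be handed to a computer. The following Python sketch assumes a fair coin and independent tosses, just as the paragraph above does; the names family_size and average_family_size are invented for this illustration.

```python
import random

def family_size(rng):
    """Toss a fair coin until both a head and a tail have appeared."""
    seen = set()
    tosses = 0
    while len(seen) < 2:
        seen.add(rng.choice("HT"))   # H = girl, T = boy
        tosses += 1
    return tosses

def average_family_size(families, seed=None):
    """Average number of children over the given number of simulated families."""
    rng = random.Random(seed)
    return sum(family_size(rng) for _ in range(families)) / families

if __name__ == "__main__":
    print("12 families:     ", average_family_size(12))
    print("100,000 families:", average_family_size(100_000))
```

A dozen families can easily give an average a few tenths away from 3, while a large number of families should settle much closer to it.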

Historically, random number tables were commonly used as a source of random numbers. Here we will use the digits of pi, which are commonly available. We will let boys be represented by the digits 0, 1, 2, 3, and 4 and girls by the digits 5, 6, 7, 8, and 9. (Since the digits of pi appear to be both uniformly and randomly distributed, we could just as well have used even vs. odd, or perhaps used just 1's and 0's, ignoring the rest.) Starting with 14159 26535 89793 ... we have the following families: (BBBG), (GB), (GGB), (GGGGGB).
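
The digit-to-family bookkeeping is easy to check mechanically. This short Python sketch uses only the fifteen decimals of pi quoted above; the names PI_DIGITS, gender, and families_from_digits are assumptions of this illustration.

```python
PI_DIGITS = "141592653589793"   # the decimals 14159 26535 89793 quoted in the text

def gender(digit):
    """Digits 0-4 stand for a boy (B), digits 5-9 for a girl (G)."""
    return "B" if digit in "01234" else "G"

def families_from_digits(digits):
    """Split a digit string into families that stop once both genders appear."""
    families = []
    current = ""
    for d in digits:
        current += gender(d)
        if "B" in current and "G" in current:
            families.append(current)
            current = ""
    return families   # a trailing, incomplete family is dropped

if __name__ == "__main__":
    families = families_from_digits(PI_DIGITS)
    print(families)
    print("average size:", sum(len(f) for f in families) / len(families))
```

These fifteen digits give the four families listed above, with an average size of 15/4 = 3.75; more digits would be needed before the average could be expected to settle near 3.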

This simple simulation will give us good results without much effort. Whenever a simulation is developed, we must be careful to ensure that it imitates the actual process very well. One could criticize this example by noting that boys (0.513) and girls (0.487) are not equally likely, nor are sibling genders necessarily independent.

Let's Make a Deal.

Let's at least set up another example. An old television game show called "Let's Make a Deal", hosted by Monty Hall, generated what is known as the Monty Hall problem. There are three doors with a prize (a red Corvette, for example) behind one. You select one door. The host opens one of the remaining doors, revealing that it is empty. He then offers you the choice of keeping your door or switching to the other unopened door. The fact that you should switch, because your probability of winning then becomes 2/3, is far from obvious.

A possible scenario for simulating this would be to have the digits 1, 2, and 3 represent door number 1; the digits 4, 5, and 6 represent door number 2; and the digits 7, 8, and 9 represent door number 3; the digit 0 is ignored. Before each round you would pick: 1) which door has the prize; 2) which door you pick first. This could be done by assigning doors as above and selecting digits in pairs. (We will use the digits of pi again.) Thus 14 would mean that door 1 has the prize, but you picked door 2. The host would thus show door 3 as empty. The digits 15 would repeat that scenario. The digits 92 would mean that door 3 has the prize, but you picked door 1, so he shows door 2. You would then track how often you win by switching and also by not switching.
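
The setup above might be carried out in Python along the following lines. This is a sketch under stated assumptions rather than a definitive implementation: the function names are invented here, random numbers stand in for the digits of pi in the long run, and when your first pick happens to be the prize door the host is assumed to open either of the other two doors at random (the text above does not say what he does in that case).

```python
import random

def door_for_digit(digit):
    """Map digits 1-3, 4-6, 7-9 to doors 1, 2, 3; the digit 0 is ignored (None)."""
    d = int(digit)
    return None if d == 0 else (d - 1) // 3 + 1

def host_opens(prize, pick, rng):
    """The host opens a door that is neither your pick nor the prize door."""
    options = [door for door in (1, 2, 3) if door not in (prize, pick)]
    return rng.choice(options)   # assumption: random choice when two doors qualify

def play(prize, pick, rng):
    """Return (win_by_staying, win_by_switching) for one round."""
    opened = host_opens(prize, pick, rng)
    switched = next(d for d in (1, 2, 3) if d not in (pick, opened))
    return pick == prize, switched == prize

def simulate(rounds, seed=None):
    """Play many rounds and return the win rates for staying and for switching."""
    rng = random.Random(seed)
    stay_wins = switch_wins = 0
    for _ in range(rounds):
        prize, pick = rng.randint(1, 3), rng.randint(1, 3)
        stay, switch = play(prize, pick, rng)
        stay_wins += stay
        switch_wins += switch
    return stay_wins / rounds, switch_wins / rounds

if __name__ == "__main__":
    rng = random.Random()
    for pair in ("14", "15", "92"):   # the digit pairs worked through in the text
        prize, pick = door_for_digit(pair[0]), door_for_digit(pair[1])
        print(pair, "-> prize", prize, "pick", pick,
              "host opens", host_opens(prize, pick, rng))
    stay, switch = simulate(100_000)
    print(f"stay wins {stay:.3f}, switch wins {switch:.3f}")
```

Over many rounds the switching strategy should win about two times in three and staying about one time in three.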

One of the first applications of simulation here at Andrews University was in the mid-1970s, to analyze computer terminal usage. The results were published in the infamous 1976 self-study. The director of the computing center, LeRoy Botten, had Bruce Ferris, a 14-year-old high school dropout known as The Kid or TK, do the analysis. Simulations are commonly used to forecast weather, "play" war games, analyze nuclear power plants, and handle other applications where conducting experiments is challenging.

References

John Burnette e-mailed the AP Statistics list server on Sunday, March 26, 2000 with the Risk probabilities and a C program which calculated them.
