Name(s):
Random Variables, Their Distributions and Means
Math 203, Introduction to Statistics, Fall 1999, Tom Linton, Central College
The random digits in Table B of the text can be used to simulate values
of an experiment. For example, if we want to simulate samples of size 15
from a population where 60%, or p = 0.6 of the population has a property
(like believes milk costs too much), we can declare that the digits
0 to 5 (6 digits total) stand for people who believe milk is too expensive,
and digits 6 to 9 (4 digits total) stand for people who do not believe
milk is too expensive. By selecting various starting rows and columns in
the table, we can read off 15 digits and translate each into an agree
or disagree (by the scheme 0 to 5 is agree and 6 to 9 is disagree),
thus obtaining as many samples of size 15 as we require. The main points
are to ensure that all digits in the table are usable (so we don't skip
any), and that we assign digits (or pairs of digits, or triples of digits)
in such a way that the percentage of digits (or pairs etc.) matches the
percentage of the population with a given property. If instead of p = 0.6,
our population proportion was p = 0.43, we would use pairs of digits, and
we might declare that the pairs 00 to 42 (43 digit pairs in all) have the
property, while pairs 43 to 99 (57 digit pairs in all) do not. Calculators
can also be used (more quickly) to generate random numbers, or events which
correspond to random variables. The TI-83 has several nice commands along
these lines, while the TI-82 and TI-85 have at least one such command.
The purpose of today's activity is to explore these random number commands,
while investigating probabilities of certain random events and their means
(average or expected values).
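
The digit-coding scheme described above can be sketched in Python. This is only an illustration and not part of the original activity: the function name `simulate_sample` is hypothetical, and Python's `random` module stands in for reading digits from Table B.

```python
import random

def simulate_sample(size, num_yes_codes, num_digits=1, rng=random):
    """Simulate one sample by drawing random digit groups, as with Table B.

    Each individual is a random code with `num_digits` digits
    (0-9 for single digits, 00-99 for pairs). Codes below
    `num_yes_codes` stand for "has the property"; all other codes
    stand for "does not". Returns a list of booleans.
    """
    total_codes = 10 ** num_digits
    return [rng.randrange(total_codes) < num_yes_codes for _ in range(size)]

# p = 0.6: digits 0-5 mean "agree", digits 6-9 mean "disagree".
sample = simulate_sample(15, 6)
print(sum(sample), "of", len(sample), "agree")

# p = 0.43: pairs 00-42 have the property, pairs 43-99 do not.
sample = simulate_sample(15, 43, num_digits=2)
print(sum(sample), "of", len(sample), "have the property")
```

Every code is usable and the share of "yes" codes equals p, which are the two main points of the scheme.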
Ideally, each group should have at least one TI-83 calculator. If there
are not enough TI-83s in the class, a group with an 82 or an 85 will work.
If there are still not enough calculators, some groups will have to use
the random digits table. The first two parts of this activity give a brief
introduction to generating random numbers on the TI-83 and TI-82 or 85
calculators. You only need to read these sections if you have such a calculator,
and you only need to read the section about your calculator.
For all calculators, you should realize that the
calculators essentially have a long list of random numbers in them. Most
calculators will generate the same list of random numbers, each time you
turn them on. You can reset the calculator to give different random
numbers by changing the seed. On all of the TI calculators, this
is done by storing a value to the variable named rand. Change the
seed by selecting some HUGE number (of your choice) and storing the value
in rand. To accomplish this, type out the huge number, press [STO>],
then open the [MATH] menu and its [PROB] sub menu. Select rand and press
[ENTER]. This is necessary to prevent all groups from generating the same
data, so be sure to do this now.
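
The effect of a seed can be seen in a short Python sketch. Python's `random` module is used here only as an analogy to storing a value in rand on the calculator; the helper name `draws_after_seed` and the seed values are made up for illustration.

```python
import random

def draws_after_seed(seed, count=3):
    """Seed a fresh generator and return its first `count` values in [0, 1)."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]

# Two groups that use the same seed get identical "random" data.
print(draws_after_seed(12345) == draws_after_seed(12345))  # True

# Groups that pick different seeds start at different places in the list.
print(draws_after_seed(12345))
print(draws_after_seed(987654321))
```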
TI-83 Random number generators
The TI-83 has many useful random number commands. We'll look at these three:
-
rand Located on the [MATH][PROB] sub menu, this is used
to change the seed (see above) and generate real numbers
from 0 to 1. If you simply execute the rand command, it will return
one number between 0 and 1. If you want 7 random numbers from 0 to 1, enter
the command rand(7). This command is handy for uses similar to
Table B, where you declare certain numbers to represent individuals with
a property, and the remaining numbers to represent individuals without
that property. The rand command gives decimals while the table
has digits. Often, decimals are better. If you want samples of size 15
from a population where p = 0.47 have a property, assign values less than
or equal to 0.47 as having the property, values larger than 0.47 as not
having the property, and issue the command rand(15). Each of the
15 generated numbers which is less than or equal to 0.47 counts as someone
who has the property, and those numbers larger than 0.47 indicate an individual
without the property. The 15 numbers will appear in a single line. Use
the arrow keys (left and right) to scroll through them all. Fortunately,
this style of coding, or assigning values to decimals to indicate which
do or do not have a property, is rarely needed for those of you with a
TI-83. The commands below will, most often, remove the need for such assigning
of values.
-
randInt the 5th command on the [MATH][PROB] sub
menu is used to generate any number of integers (whole numbers) from any
starting value to any ending value (as in whole numbers from 1 to 24 or
something). A single roll of a die produces a value from 1 to 6 (all values
are equally likely). To simulate the rolling of 4 dice, just enter the
command randInt(1,6,4) (see the left screen below). The screen
indicates that you rolled three 3s and a 1. If you've labeled your population
with numbers from 1 to 36, and you need a random sample of size 5, try
the command randInt(1,36,5) (see the right screen below).
Interpret the numbers given as the random sample of size 5. The screen
shows the sample 23, 12, 35, 27 and 22, so those individuals will comprise
your random sample of size 5. If some value is repeated, issue the command
randInt(1,36) (several times if needed) until a new person is
generated.

The general syntax for this command is
randInt(start, end, num)
This will give num integers, each with a value from start to end.
Repeating the command (normally you can just type [ENTER] to repeat the
last command) will give another sample of size num from the integers from
start to end!
-
randBin the 7th command on the [MATH][PROB] sub menu can
be used for many simulations. Of particular interest for us is selecting
a sample of size N from a population where p (a decimal like 0.05 for 5%,
or a fraction like 1/6 for 1 in 6) represents the proportion of people
who have a given property. If you simply want to know how many (out of
N) have the property, issue the command randBin(N,p) and the calculator
returns the total number of people who have the property. If you want a
sample of size 100 from a population where 40% have a property, the command
randBin(100, 0.4) will give the count (out of an SRS of size 100) which have this
property. The left screen below indicates that 47 out of 100 have the property,
so 53 out of 100 did not have the property.

On the other hand, there are situations where you need to know
which individuals had the property and which did not. For example, if team
A is better than team B, say team A wins 60% of all matches played against
team B, and you are interested in knowing how long a best of seven series
lasts (how many matches are required before one team wins 4 matches), you
need to know more than who won the series. The command randBin(7,0.6)
will indicate the number of games won by team A. If the result is 4 or
more, team A won the series. If the result is 3 or less, team B won the
series (so A lost). If the result is 5, you cannot tell how long the series
lasted. Perhaps team A's record was WWWWWLL (5 wins then 2 losses), but
it may also have been LLWWWWW or LWWWWWL (or many other possible win-loss
records with 5 wins and 2 losses). If you want to know not only who won
the series, but who the winner of each match was, the randBin
command can be used to tell you this as well. The command randBin(1,p,N)
will return N values, each is a 0 or a 1. Interpret the value 0 as a loss
and 1 as a win, in a series of games where wins occur with proportion p.
The command randBin(1,0.6,7) will not only indicate if team A
won the series (meaning there are 4 or more 1s returned), but which team
won each game (a 1 means A won, a 0 means B won), so you can look at the
result and decide how long the series lasted. The right screen above indicates
that team A won the series in 5 games (team A won matches 1, 2, 4, 5, and
7, while team B won matches 3 and 6). Matches 6 and 7 never happened as
the series was over by that time. When you need to know not only the total
number in a sample who have a property (like A won a match), but which
individuals possess the property and which do not, use the randBin(1,p,N)
version of the command. Finally, I should indicate that you can simulate
several samples of size N from a population with proportion p having a
property, using the randBin command as well. Suppose you have
a large supply of ball bearings and 10% (p = 0.1) are defective (too big,
too small or not round enough). If you want 5 samples (lots) of size 200
from this collection of ball bearings, and only need to know the total
number, in each lot of size 200, which are defective, the command
randBin(200, 0.1, 5) will return the 5 counts. The screen below indicates that
20 in the first lot were defective, 27 in the second lot, 23, 21 and 24
in the 3rd, 4th and 5th lots were defective. The first number in the command
is the lot size, the second is the proportion p and the third is the number
of lots you'd like to draw.
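
For readers without a TI-83, the three commands above can be imitated in Python. This is a rough sketch, not the calculator's actual implementation: the function names `rand_list`, `rand_int`, `rand_bin` and `series_result` are hypothetical, and the last one simply automates reading a list of 0s and 1s to see when a best of seven series ended.

```python
import random

def rand_list(n):
    """Imitates rand(n): n real numbers between 0 and 1."""
    return [random.random() for _ in range(n)]

def rand_int(start, end, num=1):
    """Imitates randInt(start, end, num): num integers from start to end."""
    return [random.randint(start, end) for _ in range(num)]

def rand_bin(n, p, num=1):
    """Imitates randBin(n, p, num): num counts of successes in n trials,
    where each trial succeeds with probability p."""
    return [sum(random.random() < p for _ in range(n)) for _ in range(num)]

def series_result(games, wins_needed=4):
    """Read a list of game outcomes (1 = team A won, 0 = team B won) and
    return (winner, number of games actually played)."""
    a_wins = 0
    b_wins = 0
    for played, outcome in enumerate(games, start=1):
        if outcome == 1:
            a_wins += 1
        else:
            b_wins += 1
        if a_wins == wins_needed:
            return ("A", played)
        if b_wins == wins_needed:
            return ("B", played)
    raise ValueError("not enough games to decide the series")

print(rand_int(1, 6, 4))        # four dice
print(rand_bin(100, 0.4))       # count with the property in one sample of 100
print(rand_bin(200, 0.1, 5))    # defective counts in 5 lots of 200

# The series described in the text: A won matches 1, 2, 4, 5 and 7.
print(series_result([1, 1, 0, 1, 1, 0, 1]))  # ('A', 5)

# A freshly simulated series, like randBin(1, 0.6, 7).
print(series_result(rand_bin(1, 0.6, 7)))
```

Seven simulated games always decide a best of seven series, since one of the two teams must reach 4 wins.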
TI-82 or 85 random number generators
These calculators have a single random number command and require a fair
bit of "assigning" of values to certain meanings. Nonetheless, this single
command is usually faster than the random digits in Table B. By multiplying
the command by a fixed value (like 6), you can simulate random integers
(whole numbers) rather quickly. The command is found on the [MATH][PROB]
sub menu and is called rand.
-
rand Located on the [MATH][PROB] sub menu, this is used
to change the seed (see above) and generate real numbers
from 0 to 1. If you simply execute the rand command, it will return
one number between 0 and 1. This command is handy for uses similar to Table
B, where you declare certain numbers to represent individuals with a property,
and the remaining numbers to represent individuals without that property.
If you want samples of size 15 from a population where p = 0.47 have a
property, assign values less than or equal to 0.47 as having the property,
values larger than 0.47 as not having the property, and issue the rand
command 15 times (after the first time, simply press [ENTER] to re-issue
the command). Each of the 15 generated numbers which is less than or equal
to 0.47 counts as someone who has the property, and those numbers larger
than 0.47 indicate an individual without the property.
The 15 numbers can all be generated at once using the sequence command.
The command seq, located on the [OPS] sub menu of the
[LISTS] menu, is used to make a sequence or list of numbers. The
command seq(rand,X,1,15,1) will generate 15 random numbers between
0 and 1. On the TI-85, you can use x instead of X (just press the variable
graphing key, [X,T,θ,n] on either calculator).
We do not need to know the entire syntax of the sequence command, but for
those who are interested, the command seq(formula, var, start, end,
step) will evaluate the expression formula (usually a formula
with X in it) for each value of var (normally X) from X = start
to X = end, with a step size of step. To square each of the
even numbers, from 0 to 50, you could use seq(X^2,X,0,50,2).
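
As a cross-check on the syntax, here is a small Python analogue of the sequence command. The helper `seq` below is written for this illustration (it is not a built-in Python function) and assumes whole-number values of start, end and step.

```python
import random

def seq(formula, start, end, step=1):
    """Evaluate formula(x) for x = start, start + step, ..., up to end."""
    return [formula(x) for x in range(start, end + 1, step)]

# Like seq(X^2,X,0,50,2): squares of the even numbers from 0 to 50.
squares_of_evens = seq(lambda x: x ** 2, 0, 50, 2)
print(squares_of_evens[:5])   # [0, 4, 16, 36, 64]
print(len(squares_of_evens))  # 26

# Like seq(rand,X,1,15,1): the formula ignores X, so we just get 15 random numbers.
fifteen_randoms = seq(lambda x: random.random(), 1, 15)
print(len(fifteen_randoms))   # 15
```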
If N is any whole number, the command N*rand will generate
a number from 0 to N. This can make the assigning of values much easier.
For example, a single roll of a die results in a 1, 2, 3, 4, 5 or 6. Each
value (1 to 6) is equally likely. You can simulate the roll of a die with
just the command rand, but then you will spend a great deal of
time figuring out what 1/6, 2/6, 3/6, etc. are as decimals (numbers from
0 to 1/6 represent ones, numbers from 1/6 to 2/6 represent twos, numbers
from 2/6 to 3/6 represent threes, etc.). Instead, the command 6*rand
will generate numbers from 0 to 6. The generated numbers between 0 and
1 can correspond to rolling a one, numbers from 1 to 2 correspond to rolling
a two, etc. Deciding if a number is between 1 and 2 is a lot easier than
deciding if a number lies between 1/6 and 2/6. If you want to roll 3 dice,
give the command seq(6*rand, X, 1, 3, 1). This command tells the
calculator to generate a roll of a die for each value of X from 1 to 3
with steps of size 1. The screen below indicates that the first roll was
a 5, the second a 3, and the third roll produced a 6. I reset my calculator
to display fewer digits of these values. You can see the later values by
scrolling with the left and right arrow keys. If you want to decide which
team won each game of a seven game series, where team A wins 60% of matches
against team B, use the command seq(rand,X,1,7,1) and interpret
numbers less than or equal to 0.6 as wins for team A and numbers larger
than 0.6 as wins for team B. The command asks for 7 values between
0 and 1 (the default of rand), since you run X from 1 to 7 with
steps of size 1. The calculator will give you seven numbers and you'll
need to scroll to see them all.
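
The 6*rand scheme amounts to dropping the decimal part and adding 1. A minimal Python sketch of that conversion follows; the function names are hypothetical, and `random.random()` plays the role of rand.

```python
import math
import random

def die_from_rand(u):
    """Turn a number u between 0 and 1 into a die face using 6*u:
    values from 0 to 1 are a one, from 1 to 2 are a two, and so on."""
    return math.floor(6 * u) + 1

def roll_dice(num):
    """Imitates seq(6*rand, X, 1, num, 1), already converted to die faces."""
    return [die_from_rand(random.random()) for _ in range(num)]

def game_winner(u, p=0.6):
    """Numbers less than or equal to p are wins for team A, the rest for team B."""
    return "A" if u <= p else "B"

print(die_from_rand(0.5))   # 6 * 0.5 = 3.0, which lies between 3 and 4, so a four
print(roll_dice(3))
print([game_winner(random.random()) for _ in range(7)])
```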
The assignment
Based on the readings above, use your calculators (or Table B) to simulate
outcomes in the events of each of the two problems below. For both problems,
you will need to utilize the data generated from all groups in the class
to finish each question, so start by generating your data and recording
the results on the board for the other groups to see. Then proceed to estimate
the distributions and means requested in each question.
-
Here is a description of a common game in establishments where small stakes
gambling is allowed (or at least not patrolled real seriously by local
police departments). For $1, you get to select a whole number from 1 to
6 (say you select a 5). The establishment's proprietor (bartender in common
English) then gives you three normal dice (so all values from 1 to 6 appear
equally often) and you roll all 3 at one time. The establishment's proprietor
then pays you $1 for each die which happens to land on the number you selected.
If you roll zero fives, you end up losing $1 (or you profit -1 dollars).
If you roll 1 five, you get your dollar back and end up with a profit of
zero dollars. If you roll 2 fives, you end up with a profit of $1 and if
by luck you happen to roll 3 fives, your profit is $2. Let Y denote your
profit from a typical play of this game. Y can take on the values -1, 0,
1, or 2. For this question you are to estimate the probability that Y =
-1, 0, 1, and 2 and the average value of Y (which is your expected profit
from this game). To this end, select a number from 1 to 6, then use the
calculator or Table B to simulate 3 rolls of a die. Do this 5 times and
record the information below. When you are done with these 5 plays of this
game, record your values on the board to share your information with the
class. Either wait for the entire class to put their data on the board,
or go on to step 1 of the second question.
What value (1 to 6) did you select?
:
| 3 dice roll results | value of Y (profit) |
|                     |                     |
|                     |                     |
|                     |                     |
|                     |                     |
|                     |                     |
-
Each group should produce 5 simulated values of Y = your profit in this
game. What is the total number of Y values for the class?
-
In the table below, record the frequencies (counts or total numbers) for
the entire class's values of Y.
| Value of Y | Class Count |
| -1         |             |
| 0          |             |
| 1          |             |
| 2          |             |
-
Compute the mean of this data set. Note: There are at least 2 shortcuts
to doing this. If you enter the Y values in L1 and the counts in L2 and
issue the command 1-Var Stats L1, L2 (on the TI-83 anyway), the value
of x-bar returned is the correct mean. You can also simply add the products
of each Y value by its count, and divide by the total number of class observations.
For example, if you have the counts 15, 8, 3 and 1 for Y = -1, 0, 1 and
2 respectively, the command (-1*15 + 0*8 + 1*3 + 2*1) / 27 will give the
mean (27 is the total number of class observations, 15 + 8 + 3 + 1 = 27).
-
For each value of Y (-1, 0, 1, and 2) calculate an estimate of the probability
that your profit in this game equals Y, by calculating the class count
divided by the total number of Y values generated by the class. For example,
if the class generated 55 Y values and 32 turned out to be zero, your estimate
for the probability that Y = 0 is 32 / 55. Record your estimates in the
table below.
| Value of Y | Probability Estimate |
| -1         |                      |
| 0          |                      |
| 1          |                      |
| 2          |                      |
-
Let's abbreviate "the probability that Y = i" with the symbol p(i).
The book claims that the mean value of Y is simply the sum of all the numbers
i*p(i). In our case this is -1*p(-1) + 0*p(0) + 1*p(1) + 2*p(2). Calculate
this value from the table above. How does this "formula in the book" compare
to your answer to part (c), the mean computed from the class counts?
-
Using your table above, calculate the probability that Y > 0 (meaning you
win something), by simply adding the probabilities that Y = 1 and Y = 2.
Do you think this is a good game to play very often? Why or why not?
-
For this question we would like to estimate the average length of a best
of 7 series between two teams. Team A is better than team B and team A
wins 60% of all matches it has against team B. Best of 7 series between
the 2 teams split naturally into two collections; those won by team A and
those won by team B. We will estimate the probability that team A wins
such a series, and two averages, the average number of games in a series
where team A wins the series, and the average number of games in a series
where team B wins. Using your calculator or Table B, simulate 10 best of
seven series between team A and B. Note that the length of a series is
the number of games it takes for one team to accumulate 4 wins. Record
your series in the table below and then record your values on the board
to share with the class.
| Series Number | Winner (A or B) | Number of Games |
| 1             |                 |                 |
| 2             |                 |                 |
| 3             |                 |                 |
| 4             |                 |                 |
| 5             |                 |                 |
| 6             |                 |                 |
| 7             |                 |                 |
| 8             |                 |                 |
| 9             |                 |                 |
| 10            |                 |                 |
-
Using all of the data from the class, what is your estimate of the proportion
of series between these two teams which are won by team A? Explain how
you estimated this value.
-
Using all of the data from the class, what is your estimate of the average
length of a 7 game series between these two teams which is won by team
A? Explain how you estimated this value.
-
Using all of the data from the class, what is your estimate of the average
length of a 7 game series between these two teams which is won by team
B? Explain how you estimated this value.
-
Why is it reasonable for the last average (length of series won by team
B) to be larger than the second to last average (length of series won by
team A)?