Name(s)                                                     :
Random Variables, Their Distributions and Means
Math 203, Introduction to Statistics, Fall 1999, Tom Linton, Central College
The random digits in Table B of the text can be used to simulate values of an experiment. For example, suppose we want to simulate samples of size 15 from a population where 60% (p = 0.6) of the population has some property (such as believing that milk costs too much). We can declare that the digits 0 to 5 (6 digits total) stand for people who believe milk is too expensive, and the digits 6 to 9 (4 digits total) stand for people who do not. By selecting various starting rows and columns in the table, we can read off 15 digits and translate each into an agree or disagree (by the scheme 0 to 5 is agree and 6 to 9 is disagree), thus obtaining as many samples of size 15 as we require.

The main points are to ensure that all digits in the table are usable (so we don't skip any), and that we assign digits (or pairs of digits, or triples of digits) in such a way that the percentage of digits (or pairs, etc.) assigned to a property matches the percentage of the population with that property. If instead of p = 0.6 our population proportion were p = 0.43, we would use pairs of digits, and we might declare that the pairs 00 to 42 (43 digit pairs in all) have the property, while the pairs 43 to 99 (57 digit pairs in all) do not.

Calculators can also be used (more quickly) to generate random numbers, or events which correspond to random variables. The TI-83 has several nice commands along these lines, while the TI-82 and TI-85 have at least one such command. The purpose of today's activity is to explore these random number commands, while investigating probabilities of certain random events and their means (average or expected values).
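
If you would like to try the digit-assignment idea on a computer rather than with Table B, here is a minimal sketch in Python (not one of the tools used in this activity). The function name simulate_sample, its parameters, and the seed 1999 are made up for this illustration. The computer's random whole numbers stand in for digits (or digit pairs) read from the table.

```python
import random

def simulate_sample(size, cutoff, n_values, rng):
    """Simulate one sample of `size` people.

    Each person is a random whole number from 0 to n_values - 1
    (one digit if n_values is 10, a pair of digits if it is 100).
    Numbers below `cutoff` count as having the property.
    """
    draws = [rng.randrange(n_values) for _ in range(size)]
    has_property = [d < cutoff for d in draws]
    return draws, has_property

rng = random.Random(1999)  # arbitrary seed, used only so a run can be repeated

# p = 0.6: single digits, where 0 to 5 mean "agree"
digits, agree = simulate_sample(15, 6, 10, rng)
print(digits, sum(agree), "agree")

# p = 0.43: digit pairs, where 00 to 42 have the property
pairs, flags = simulate_sample(15, 43, 100, rng)
print(pairs, sum(flags), "have the property")
```

In both cases the share of usable values assigned to the property (6 of 10, or 43 of 100) matches the population proportion, and no values are skipped.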

Ideally, each group should have at least one TI-83 calculator. If there are not enough TI-83s in the class, a group with an 82 or an 85 will work. If there are still not enough calculators, some groups will have to use the random digits table. The first two parts of this activity give a brief introduction to generating random numbers on the TI-83 and TI-82 or 85 calculators. You only need to read these sections if you have such a calculator, and you only need to read the section about your calculator.

For all calculators, you should realize that the calculator essentially has a long list of random numbers stored in it. Most calculators will generate the same list of random numbers each time you turn them on. You can reset the calculator to give different random numbers by changing the seed. On all of the TI calculators, this is done by storing a value in the variable named rand. To change the seed, pick some HUGE number (of your choice): type out the number, press [STO>], then open the [MATH] menu and its [PROB] submenu, select rand and press [ENTER]. This is necessary to prevent all groups from generating the same data, so be sure to do this now.
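
The effect of a seed is easy to see on a computer as well. The sketch below uses Python's random module as a stand-in for the calculator (an analogy only; the calculator's internal method is not described here), and the two seed values are arbitrary choices.

```python
import random

random.seed(987654321)                 # plays the role of 987654321 [STO>] rand
first = [random.random() for _ in range(3)]

random.seed(987654321)                 # same seed again
second = [random.random() for _ in range(3)]

random.seed(123456789)                 # a different seed
third = [random.random() for _ in range(3)]

print(first == second)                 # True: same seed, same "random" list
print(first == third)                  # a different seed is expected to give a different list
```

This is why two groups that forget to change the seed may end up recording identical data.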

TI-83 Random number generators

The TI-83 has many useful random number commands. We'll look at these three:
The general syntax for the randInt command is
randInt(start, end, num)
This returns num integers, each with a value from start to end (inclusive). Repeating the command (normally you can just press [ENTER] to repeat the last command) gives another sample of size num from the integers start to end.
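
For comparison, here is a rough Python analogue of randInt. The name rand_int is made up for this sketch; it simply wraps Python's random.randint, which, like randInt, includes both endpoints.

```python
import random

def rand_int(start, end, num, rng=random):
    """Return `num` integers, each equally likely to be any whole
    number from `start` to `end` (both endpoints included)."""
    return [rng.randint(start, end) for _ in range(num)]

rolls = rand_int(1, 6, 3)   # like randInt(1, 6, 3): three rolls of a die
print(rolls)
print(rand_int(1, 6, 3))    # running it again gives another sample of size 3
```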
On the other hand, there are situations where you need to know which individuals had the property and which did not. For example, suppose team A is better than team B, say team A wins 60% of all matches played against team B, and you are interested in how long a best of seven series lasts (how many matches are required before one team wins 4 matches). Then you need to know more than who won the series. The command randBin(7,0.6) will indicate the number of games won by team A. If the result is 4 or more, team A won the series. If the result is 3 or less, team B won the series (so A lost). But if the result is, say, 5, you cannot tell how long the series lasted. Perhaps team A's record was WWWWWLL (5 wins then 2 losses), but it may also have been LLWWWWW or LWWWWWL (or any of many other possible win-loss records with 5 wins and 2 losses).

If you want to know not only who won the series, but who won each match, the randBin command can tell you this as well. The command randBin(1,p,N) will return N values, each a 0 or a 1. Interpret the value 0 as a loss and 1 as a win, in a series of games where wins occur with proportion p. The command randBin(1,0.6,7) will not only indicate whether team A won the series (meaning there are 4 or more 1s returned), but also which team won each game (a 1 means A won, a 0 means B won), so you can look at the result and decide how long the series lasted. The right screen above indicates that team A won the series in 5 games (team A won matches 1, 2, 4, 5, and 7, while team B won matches 3 and 6). Matches 6 and 7 never happened, as the series was over by that time. When you need to know not only the total number in a sample who have a property (like A won a match), but which individuals possess the property and which do not, use the randBin(1,p,N) version of the command.

Finally, I should indicate that you can also simulate several samples of size N from a population with proportion p having a property, using the randBin command.
Suppose you have a large supply of ball bearings and 10% (p = 0.1) are defective (too big, too small or not round enough). If you want 5 samples (lots) of size 200 from this collection of ball bearings, and only need to know the total number of defective bearings in each lot of 200, the command randBin(200, 0.1, 5) will return the 5 counts. The screen below indicates that 20 bearings in the first lot were defective, 27 in the second lot, and 23, 21 and 24 in the 3rd, 4th and 5th lots. The first number in the command is the lot size, the second is the proportion p and the third is the number of lots you'd like to draw.
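
The three uses of randBin described above can be imitated in Python as a check on your understanding. This is only a sketch: rand_bin and series_result are names invented here, and rand_bin always returns a list (even for a single count), which is a choice made for this example.

```python
import random

def rand_bin(n, p, num_samples=1, rng=random):
    """Rough analogue of randBin(n, p, num_samples).

    Returns a list of num_samples counts; each count is the number of
    successes in n independent trials with success probability p.
    """
    return [sum(1 for _ in range(n) if rng.random() < p)
            for _ in range(num_samples)]

def series_result(games, wins_needed=4):
    """Given 0/1 results in playing order (1 = team A won that match),
    return (winner, number of matches actually needed)."""
    a_wins = b_wins = 0
    for game_number, result in enumerate(games, start=1):
        if result == 1:
            a_wins += 1
        else:
            b_wins += 1
        if a_wins == wins_needed:
            return "A", game_number
        if b_wins == wins_needed:
            return "B", game_number
    raise ValueError("not enough games to decide the series")

print(rand_bin(7, 0.6))          # one count: matches A would win out of 7
games = rand_bin(1, 0.6, 7)      # seven 0/1 values, one per match
print(games, series_result(games))
print(rand_bin(200, 0.1, 5))     # defective counts in 5 lots of 200

# Records with 5 wins and 2 losses can end after different numbers of matches.
print(series_result([1, 1, 1, 1, 1, 0, 0]))   # ('A', 4)
print(series_result([0, 0, 1, 1, 1, 1, 1]))   # ('A', 6)
print(series_result([0, 1, 1, 1, 1, 1, 0]))   # ('A', 5)
print(series_result([1, 1, 0, 1, 1, 0, 1]))   # ('A', 5)
```

The last four lines show why a count of 5 wins alone cannot tell you how long the series lasted, while the list of 0s and 1s can.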

TI-82 or 85 random number generators

These calculators have a single random number command and require a fair bit of "assigning" of values to certain meanings. Nonetheless, this single command is usually faster than using the random digits in Table B. By multiplying the value of the command by a fixed number (like 6) and keeping only the whole number part of the result, you can simulate random integers (whole numbers) rather quickly. With 6 this gives the values 0 to 5, so add 1 to get die rolls from 1 to 6. The command is found on the [MATH][PROB] submenu and is called rand.
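
The arithmetic behind turning a rand value into a die roll can be sketched as follows, in Python rather than on the calculator. The function name die_roll_from_rand is invented for this illustration, and it assumes the random value u is at least 0 and less than 1.

```python
import math
import random

def die_roll_from_rand(u):
    """Turn a value u with 0 <= u < 1 into a whole number from 1 to 6:
    multiply by 6, keep the whole number part (0 to 5), then add 1."""
    return math.floor(6 * u) + 1

print(die_roll_from_rand(0.0))     # 1
print(die_roll_from_rand(0.5))     # 4
print(die_roll_from_rand(0.999))   # 6

# Three simulated dice, using the computer's version of rand.
rolls = [die_roll_from_rand(random.random()) for _ in range(3)]
print(rolls)
```

Each of the six outcomes corresponds to an interval of the same width (one sixth), which is why the six values are equally likely.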

The assignment

Based on the readings above, use your calculators (or Table B) to simulate outcomes for each of the two problems below. For both problems, you will need the data generated by all groups in the class to finish each question, so start by generating your data and recording the results on the board for the other groups to see. Then proceed to estimate the distributions and means requested in each question.
  1. Here is a description of a common game in establishments where small stakes gambling is allowed (or at least not patrolled very seriously by local police departments). For $1, you get to select a whole number from 1 to 6 (say you select a 5). The establishment's proprietor (bartender in common English) then gives you three normal dice (so all values from 1 to 6 appear equally often) and you roll all 3 at one time. The proprietor then pays you $1 for each die which lands on the number you selected. If you roll zero fives, you end up losing $1 (a profit of -1 dollars). If you roll 1 five, you get your dollar back and end up with a profit of zero dollars. If you roll 2 fives, you end up with a profit of $1, and if by luck you happen to roll 3 fives, your profit is $2. Let Y denote your profit from a typical play of this game. Y can take on the values -1, 0, 1, or 2. For this question you are to estimate the probability that Y = -1, 0, 1, and 2, and the average value of Y (which is your expected profit from this game). To this end, select a number from 1 to 6, then use the calculator or Table B to simulate 3 rolls of a die. Do this 5 times and record the information below. When you are done with these 5 plays of the game, record your values on the board to share your information with the class. Either wait for the entire class to put their data on the board, or go on to step 1 of the second question.

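    What value (1 to 6) did you select?              :

        3 dice roll results    Value of Y (profit)
        ___________________    ___________________
        ___________________    ___________________
        ___________________    ___________________
        ___________________    ___________________
        ___________________    ___________________
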
    (a) Each group should produce 5 simulated values of Y = your profit in this game. What is the total number of Y values for the class?



    (b) In the table below, record the frequencies (counts or total numbers) for the entire class's values of Y.

        Value of Y    Class Count
            -1        ___________
             0        ___________
             1        ___________
             2        ___________

    (c) Compute the mean of this data set. Note: There are at least 2 shortcuts to doing this. If you enter the Y values in L1 and the counts in L2 and issue the command 1-VarStats L1, L2 (on the TI-83 anyway), the value of x-bar returned is the correct mean. You can also simply add the products of each Y value by its count, and divide by the total number of class observations. For example, if you have the counts 15, 8, 3 and 1 for Y = -1, 0, 1 and 2 respectively, the command (-1*15 + 0*8 + 1*3 + 2*1) / 27 will give the mean (27 is the total number of class observations, 15 + 8 + 3 + 1 = 27).

    (d) For each value of Y (-1, 0, 1, and 2) calculate an estimate of the probability that your profit in this game equals Y, by calculating the class count divided by the total number of Y values generated by the class. For example, if the class generated 55 Y values and 32 turned out to be zero, your estimate for the probability that Y = 0 is 32 / 55. Record your estimates in the table below.

        Value of Y    Probability Estimate
            -1        ____________________
             0        ____________________
             1        ____________________
             2        ____________________

    (e) Let's abbreviate "the probability that Y = i" with the symbols p(i). The book claims that the mean value of Y is simply the sum of all the numbers i*p(i). In our case this is -1*p(-1) + 0*p(0) + 1*p(1) + 2*p(2). Calculate this value from the table above. How does this "formula in the book" compare to your answer to part (c)?





    (f) Using your table above, calculate the probability that Y > 0 (meaning you win something), by simply adding the probabilities that Y = 1 and Y = 2. Do you think this is a good game to play very often? Why or why not?



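
If you want to double check your arithmetic for this question on a computer, the calculations can be sketched in Python as below. The counts 15, 8, 3 and 1 are the made-up example counts from the mean calculation above, not real class data, and the names profit, mean_from_counts, mean_from_probs and p_win are invented for this sketch. Replace the counts with your class's counts and compare the printed values yourself.

```python
from fractions import Fraction

def profit(rolls, pick):
    """Profit in dollars: $1 back per matching die, minus the $1 paid to play."""
    return sum(1 for r in rolls if r == pick) - 1

values = [-1, 0, 1, 2]
counts = [15, 8, 3, 1]   # example counts only; substitute the class counts
total = sum(counts)

# Shortcut for the mean: add value * count, then divide by the total.
mean_from_counts = Fraction(sum(v * c for v, c in zip(values, counts)), total)

# Estimate p(i) = count / total, then add up i * p(i).
p = {v: Fraction(c, total) for v, c in zip(values, counts)}
mean_from_probs = sum(v * p[v] for v in values)

# Estimated probability of winning something (Y > 0).
p_win = p[1] + p[2]

print(profit([5, 2, 5], 5))                # 1 (two fives: paid $2, minus the $1 stake)
print(total)                               # 27
print(mean_from_counts, mean_from_probs)
print(p_win)                               # 4/27
```

Fractions are used instead of decimals so that no rounding gets in the way when you compare the two ways of computing the mean.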
  2. For this question we would like to estimate the average length of a best of 7 series between two teams. Team A is better than team B, and team A wins 60% of all matches it plays against team B. Best of 7 series between the 2 teams split naturally into two collections: those won by team A and those won by team B. We will estimate the probability that team A wins such a series, and two averages: the average number of games in a series that team A wins, and the average number of games in a series that team B wins. Using your calculator or Table B, simulate 10 best of seven series between teams A and B. Note that the length of a series is the number of games it takes for one team to accumulate 4 wins. Record your series in the table below and then record your values on the board to share with the class.
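
        Series Number    Winner (A or B)    Number of Games
              1          _______________    _______________
              2          _______________    _______________
              3          _______________    _______________
              4          _______________    _______________
              5          _______________    _______________
              6          _______________    _______________
              7          _______________    _______________
              8          _______________    _______________
              9          _______________    _______________
             10          _______________    _______________
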
    (a) Using all of the data from the class, what is your estimate of the proportion of series between these two teams which are won by team A? Explain how you estimated this value.





    (b) Using all of the data from the class, what is your estimate of the average length of a best of 7 series between these two teams which is won by team A? Explain how you estimated this value.







    (c) Using all of the data from the class, what is your estimate of the average length of a best of 7 series between these two teams which is won by team B? Explain how you estimated this value.






    (d) Why is it reasonable for the last average (length of series won by team B) to be larger than the second to last average (length of series won by team A)?
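
For groups who would like to see many more series than the class can record by hand, here is a minimal Python sketch of the same simulation. It is not part of the calculator instructions; the names simulate_series and summarize are invented here, and a series is played one match at a time until one team has 4 wins. The results change from run to run, so no particular output is shown.

```python
import random

def simulate_series(p_a=0.6, wins_needed=4, rng=random):
    """Play matches until one team has wins_needed wins.
    Team A wins each match with probability p_a.
    Returns (winner, number of matches played)."""
    a_wins = b_wins = 0
    while a_wins < wins_needed and b_wins < wins_needed:
        if rng.random() < p_a:
            a_wins += 1
        else:
            b_wins += 1
    winner = "A" if a_wins == wins_needed else "B"
    return winner, a_wins + b_wins

def summarize(results):
    """Return (proportion of series won by A,
               average length of series won by A,
               average length of series won by B).
    An average is None if that team won no series."""
    a_lengths = [n for w, n in results if w == "A"]
    b_lengths = [n for w, n in results if w == "B"]
    prop_a = len(a_lengths) / len(results)
    avg_a = sum(a_lengths) / len(a_lengths) if a_lengths else None
    avg_b = sum(b_lengths) / len(b_lengths) if b_lengths else None
    return prop_a, avg_a, avg_b

results = [simulate_series() for _ in range(10)]   # one group's 10 series
print(results)
print(summarize(results))
```

Changing the 10 to a much larger number shows how the three estimates settle down as more series are simulated.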