NAMES:                                                                                  :
Confidence Intervals
by Tom Linton
Work in groups of 2 or 3 and turn in one paper per group.

The goal of this activity is to explore notions related to confidence intervals and learn how to quickly calculate confidence intervals with the TI-83.

On the last page of this handout are 100 rectangles of various sizes. We will draw samples of size 5 from this population of rectangles and record values of x̄ = the average area of the 5 rectangles in our sample. We will use these values of x̄ to define confidence intervals for the population mean, μ = the average area of all 100 rectangles. The population standard deviation (for the areas of all the rectangles) is σ = 5.20 (squares).

To select a sample from our population, we will use randInt(1,100,5) and then replace any duplicates with repeated calls to randInt(1,100). Do NOT seed your calculator (so different groups will get different samples). The numbers returned by the randInt command identify the rectangles that you use to calculate an average area. For example, if the randInt command gives {69, 1, 98, 34, 12}, then you would use the rectangles with those numbers in your sample. Those sample rectangles have areas equal to 4, 1, 6, 12 and 8. Their average area is therefore x̄ = (4 + 1 + 6 + 12 + 8) / 5 = 6.2. To calculate an 80% confidence interval based on this sample, we would use z* = 1.282 and a standard deviation (for x̄) of σ/√n = 5.20/√5 = 2.326. The margin of error is 1.282 × 2.326 = 2.98, so our 80% confidence interval would be 6.2 ± 2.98, or (3.22, 9.18).
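
The arithmetic in this example can be double-checked away from the calculator. The Python sketch below is only an illustration and is not part of the original activity: random.sample stands in for randInt with the duplicates replaced, and all variable names are made up for this example.

```python
import math
import random
from statistics import NormalDist

SIGMA = 5.20   # population standard deviation given in the handout
N = 5          # sample size

# Stand-in for randInt(1,100,5) with duplicates replaced:
# five distinct rectangle numbers between 1 and 100.
rect_numbers = random.sample(range(1, 101), N)

# Areas from the handout's worked example (rectangles 69, 1, 98, 34, 12).
areas = [4, 1, 6, 12, 8]
x_bar = sum(areas) / N

# An 80% interval leaves 10% in each tail, so z* is the 90th percentile.
z_star = NormalDist().inv_cdf(0.90)
sd_x_bar = SIGMA / math.sqrt(N)      # standard deviation of the sample mean
margin = z_star * sd_x_bar           # margin of error

interval = (round(x_bar - margin, 2), round(x_bar + margin, 2))
print("rectangles drawn:", rect_numbers)
print("x-bar:", x_bar, "z*:", round(z_star, 3), "sd of x-bar:", round(sd_x_bar, 3))
print("80% confidence interval:", interval)
```

The rounded results should agree with the values quoted above (z* = 1.282, standard deviation 2.326, interval (3.22, 9.18)).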

As it turns out, the distribution of the areas of the 100 individual rectangles is not symmetric (it is slightly skewed), but the distribution of the average of 5 such areas should look approximately normal. Using software, I selected 400 samples of size 5 from this population and made a histogram of the values of x̄. Here is what I got:

[Figure: histogram of the 400 values of x̄]

You can see that the distribution is roughly normal (normal enough to proceed with confidence intervals based on normal probabilities).
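
A similar experiment can be sketched in Python. This is an illustration only: the population below is HYPOTHETICAL (100 made-up, right-skewed areas), because the actual rectangle areas are not listed in this text, so the output will not match the histogram described above.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable (unlike the unseeded TI-83)

# HYPOTHETICAL population: 100 made-up, right-skewed areas.
# These are NOT the rectangle areas from the handout.
population = [random.choice([1, 2, 3, 4, 4, 5, 6, 8, 9, 10, 12, 16, 18])
              for _ in range(100)]
mu = statistics.mean(population)

# 400 samples of size 5, drawn without replacement within each sample.
sample_means = [statistics.mean(random.sample(population, 5)) for _ in range(400)]

# Crude text histogram of the sample means, one row per unit-wide bin.
for lo in range(0, 19):
    count = sum(lo <= m < lo + 1 for m in sample_means)
    if count:
        print(f"{lo:2d}-{lo + 1:2d} | {'*' * (count // 4)}")

print("mean of the sample means:", round(statistics.mean(sample_means), 2))
print("population mean:", round(mu, 2))
```

Even with a skewed population, the rows of stars should pile up around the population mean and thin out on both sides.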

Each group will create three 80%-confidence intervals, using 3 different samples and 3 different calculation techniques.

  1. The first confidence interval will be done "by hand", using the formula x̄ ± z*·σ/√n. Using the randInt command, select your sample of size five. Record the rectangles in your sample and their areas below.
Sample 1
Rect Number | ____ | ____ | ____ | ____ | ____
Area        | ____ | ____ | ____ | ____ | ____
    Calculate the value of x̄ for this sample, record the value below, and add the value to the class data set on the board.
    x̄ =                       .
     
     
     
     

    Calculate the value for the margin of error, m = z*·σ/√n, and record it here:  m =                             .
     
     
     

    Record your 80% confidence interval below and add your interval to the class data set on the board.

    confidence interval 1: (             ,               )
     
     
     
     
     
     
     

  2. For your second sample and second confidence interval, we will use the "stats" feature of the Z-Interval command. This command automates the process of calculating a confidence interval. To use the command, you must provide the TI-83 with the value of x̄, the population standard deviation σ, the sample size n, and the confidence level (as in 0.95 for a 95% confidence interval, etc.).


     
     
     
     
     

    Using the randInt command, select your sample of size five. Record the rectangles in your sample and their areas below.

Sample 2
Rect Number | ____ | ____ | ____ | ____ | ____
Area        | ____ | ____ | ____ | ____ | ____
     
    To calculate the value of x̄ for this sample, enter the areas into L1 and then run the 1-Var Stats command: press [STAT][right-arrow] (to select the [CALC] sub-menu) and then [ENTER]. Record the value of x̄ below and add the value to the class data set on the board.

    x̄ =                       .
     

    Now, press [STAT][left-arrow] to select the [TESTS] sub-menu, and then choose [7:Z-Interval]. This should bring up the Z-Interval screen (confidence intervals for normal distributions). Select Stats for the input (by scrolling to Stats and pressing [ENTER]). Enter 5.2 for the population standard deviation σ. (Note that the calculator automatically divides σ by √n, so you should enter the population's standard deviation, NOT the standard deviation of x̄.) Next, enter your value of x̄, 5 for n, and .80 for the confidence level. Finally, move the cursor to the bottom line and press [ENTER] once Calculate is selected. Record the confidence interval reported and add your interval to the class data set on the board.

    confidence interval 2: (            ,              )
     
     

  3. For your third and final confidence interval, we will again use the Z-Interval command, but this time with the "data" option.


     
     
     
     
     

    Using the randInt command, select your sample of size five. Record the rectangles in your sample and their areas below.
     

    Sample 3
    Rect Number | ____ | ____ | ____ | ____ | ____
    Area        | ____ | ____ | ____ | ____ | ____

    Using the statistical editor, enter the 5 areas into the list L1 and then open the Z-Interval screen by pressing [STAT][left-arrow][7:Z-Interval]. On the top line, select Data and then enter 5.2 for the standard deviation σ. (Notice that the calculator automatically divides σ by √n, so you should enter the population's standard deviation, NOT the standard deviation of x̄.) Enter the name of the list that holds your data (L1), 1 for the frequency, and .8 for the confidence level. Move the cursor to the Calculate line and press [ENTER]. Record the value of x̄ and the confidence interval reported below and add your results to the class data set on the board.

    x̄ =                      .
    confidence interval 3: (            ,              )
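
    What the Data option computes from a list can be sketched in Python. This is an illustration under stated assumptions, not TI-83 code: z_interval is a hypothetical helper written for this example, and the list of areas is the one from the worked example at the start of the handout.

```python
import math
from statistics import NormalDist, mean

def z_interval(data, sigma, level):
    """Confidence interval for a mean when the population sigma is known.

    Hypothetical helper that mimics the inputs of the Data option:
    a list of values, the POPULATION standard deviation, and the level.
    """
    n = len(data)
    x_bar = mean(data)
    z_star = NormalDist().inv_cdf((1 + level) / 2)
    # sigma is divided by sqrt(n) here, which is why the population
    # standard deviation (not the standard deviation of x-bar) is passed in.
    margin = z_star * sigma / math.sqrt(n)
    return x_bar, (x_bar - margin, x_bar + margin)

# Areas from the worked example earlier in the handout.
x_bar, (low, high) = z_interval([4, 1, 6, 12, 8], sigma=5.2, level=0.80)
print("x-bar:", round(x_bar, 2))
print("interval:", (round(low, 2), round(high, 2)))
```

    Passing 2.3255 instead of 5.2 for sigma would divide by √n twice and give an interval that is far too narrow.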
     
     
     

  4. Copy the class stem and leaf plot of the x̄ values below. Does the plot look approximately normal? What is the approximate center of the distribution?


     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     
     

  5. If the values of x̄ were perfectly normal and had a mean exactly equal to the population mean and a standard deviation exactly equal to σ/√n = 2.3255, then 33% of the values of x̄ should lie between 6 and 8 (including the endpoint values 6 and 8). What percentage of the class values of x̄ actually do lie between 6 and 8 (inclusive)?
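
    The 33% figure can be reproduced with a short Python sketch (an illustration, not part of the original activity). It assumes the idealized normal model described in this question and uses the population mean 7.42 that the handout states in a later question.

```python
from statistics import NormalDist

# Idealized distribution of the sample mean:
# mean 7.42, standard deviation 5.2/sqrt(5) = 2.3255.
x_bar_dist = NormalDist(mu=7.42, sigma=2.3255)

# Same calculation as normalcdf(6, 8, 7.42, 2.3255) on the TI-83.
share = x_bar_dist.cdf(8) - x_bar_dist.cdf(6)
print(f"{share:.1%} of x-bar values fall between 6 and 8")
```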


     
     
     
     
     
     
     
     
     
     
     

  6. Theory predicts that 80% of the class's confidence intervals should be "good", meaning that they contain the true value of the population mean μ. Normally you do not know the value of μ, so you can't tell for sure whether a given confidence interval is good (contains the population mean) or bad (does not contain the population mean). Our population is small enough that we can calculate the true value of μ. It is 7.42. What percentage of the class's confidence intervals actually do contain the population mean?


     
     
     
     
     
     
     
     
     
     
     
     
     

  7. Despite the fact that we expect 80% of our intervals to be good ones, and 20% to be bad ones, if you pick any one specific confidence interval, the probability that it contains the population mean 7.42 is either zero or one. For example, the confidence interval (4.02, 9.98) contains 7.42 with probability 1 and the interval (9.02, 14.98) contains the value 7.42 with probability zero. For your group's three confidence intervals, give the probability that the population mean 7.42 lies in each specific interval.


     
     
     
     
     

    Interval 1 probability:

    Interval 2 probability:

    Interval 3 probability:
     

    The last question presents a problem. We know that each and every confidence interval was constructed via a process that yields a good interval 80% of the time and a bad interval 20% of the time, but once we write down a specific confidence interval, the probability that it contains the population mean is no longer 80%, but is either zero (if the interval is bad) or one (if the interval is good). How do we capture the fact that each interval comes from a process that creates good intervals 80% of the time and bad intervals 20% of the time, without making the false statement that "this interval has an 80% chance of containing the population mean"? Statisticians summarize this situation by saying "we are 80% confident that this interval contains the population mean". This means that the interval came from a process that produces intervals that contain the mean 80% of the time (in the long run), or that if we created many, many of these confidence intervals, then roughly 80% would be good and 20% would be bad.
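
    The long-run reading of "80% confident" can be checked by simulation. The Python sketch below is an illustration only, and its population is HYPOTHETICAL (100 made-up, right-skewed areas, not the handout's rectangles). It builds many 80% intervals and counts how many are good.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(2)  # fixed seed so the sketch is repeatable

# HYPOTHETICAL right-skewed population of 100 areas
# (NOT the rectangle areas from the handout).
population = [random.choice([1, 2, 3, 4, 4, 5, 6, 8, 9, 10, 12, 16, 18])
              for _ in range(100)]
mu = statistics.mean(population)        # true population mean
sigma = statistics.pstdev(population)   # true population standard deviation

z_star = NormalDist().inv_cdf(0.90)     # z* for an 80% interval
margin = z_star * sigma / math.sqrt(5)

trials = 10_000
good = 0
for _ in range(trials):
    x_bar = statistics.mean(random.sample(population, 5))
    # Each finished interval either contains mu or it does not.
    if x_bar - margin <= mu <= x_bar + margin:
        good += 1

coverage = good / trials
print(f"{coverage:.1%} of {trials} intervals contain the population mean")
```

    Each individual interval in the loop is simply good or bad; the figure near 80% describes the process that produced them.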
     
     

  8. One common misconception about 80% confidence intervals is that an 80% confidence interval contains 80% of "the data". This is only true if the center of your interval is exactly equal to the true population mean (i.e. your value of x̄ is exactly equal to μ, and this will almost never be the case). In fact, nearly all 80% confidence intervals contain less than 80% of the data. Values of x̄ are distributed in an approximately normal manner, with a mean of 7.42 and a standard deviation of 2.3255. If a confidence interval is equal to (a, b), then the command normalcdf(a,b,7.42,2.3255) will give the proportion of x̄ values that are in the confidence interval (multiply by 100 for a percentage). Record below the percentage of x̄ values that are in each of your three confidence intervals.


     
     
     
     
     

    Percentage in confidence interval 1:

    Percentage in confidence interval 2:

    Percentage in confidence interval 3:
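
    The same normalcdf calculation can be sketched in Python as a check. This is an illustration, not part of the original activity: share_inside is a hypothetical helper, and the intervals are the ones quoted earlier in this handout plus one centered exactly on 7.42 for comparison.

```python
from statistics import NormalDist

# Approximate distribution of x-bar stated in the handout.
x_bar_dist = NormalDist(mu=7.42, sigma=2.3255)

def share_inside(a, b):
    """Same calculation as normalcdf(a, b, 7.42, 2.3255) on the TI-83."""
    return x_bar_dist.cdf(b) - x_bar_dist.cdf(a)

# Three intervals quoted in the handout, then one centered exactly on 7.42.
intervals = [(3.22, 9.18), (4.02, 9.98), (9.02, 14.98), (7.42 - 2.98, 7.42 + 2.98)]
for a, b in intervals:
    print(f"({a:.2f}, {b:.2f}) contains {share_inside(a, b):.1%} of the x-bar values")
```

    Only the interval centered on the population mean reaches about 80%; the off-center intervals all contain less.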