NAMES:
Significance or Hypothesis Tests
by Tom Linton
Work in groups of 2 or 3 and turn in one paper per group.

The goal of this activity is to explain the reasoning behind significance tests (also called hypothesis tests).
 

The Reasoning Behind Significance Tests

One version of the scientific method can be summarized as follows:

In statistics, we often implement this procedure as follows:

Here μ is the hypothesized population mean and x̄ is the mean of our sample. When x̄ is NOT close to 15, we must decide whether our value of x̄ was due to the typical variation of the statistic x̄, or to the fact that our assumption that μ = 15 was a bad one. To help decide, we rely on a probability calculation to answer the question:

If μ = 15, how likely is a value of x̄ like the one we obtained?

If values like the one we obtained occur frequently, we take μ = 15 to be a reasonable hypothesis. If the value we obtained is highly unlikely, then we conclude that our assumption that μ = 15 is wrong.

To help explain this reasoning, consider the following scenario:

A trusted friend offers you an investment where you give them $1000 and, at the end of each month, your friend pays you your earnings in cash (your $1000 remains in the investment). After each 3-month period, you can continue the investment for another 3 months, or withdraw your $1000. The monthly earnings are claimed to vary in a normal fashion with a mean return of $15.00 per month and a standard deviation of $4.00 per month. Recognizing that your average annual return will be $180 ($15 * 12 = $180 per year, for an average profit of 18% per year), you consider this a good deal and decide to invest $1000.

Your first three monthly returns, $13.39, $13.29 and $13.64 (so x̄ = $13.44), are all noticeably below the claimed average.

  1. Which (you may select several) of the following alternatives do you think are possible?
    1. Your friend was truthful, the average monthly return is $15, but you just had three below-average months.



    2. Your friend was untruthful, the true average monthly return is less than $15, and the past three months indicate this fact.



    3. Your friend was untruthful, the average monthly return is more than $15, and you just experienced three months that were well below average.




  2. Which of the three statements above do you feel is most likely to be true? Why?




  3. Which of the three statements above do you feel is least likely to be true? Why?





The performance in the first three months should cause some concern. Let's say you start to doubt the claim that the average monthly return is actually $15.00, thinking it is likely a smaller amount. However, the friend is a good one and 18% is a solid profit margin. Before pulling your money out of the investment, you'd like to be pretty sure that your 3-month average return wasn't just due to bad luck, or the typical variation in the average return over a three-month period.

In accordance with the scientific method outlined above, we have our original hypothesis, namely that X = your monthly earnings is normally distributed with a mean of 15 and a standard deviation of 4 (briefly, we say that X is N(15, 4)). We also have a single value of x̄, from a sample of size n = 3, namely x̄ = 13.44.

The question we're interested in answering is

how likely is it that x̄ = 13.44, if X is actually normal
with a mean of 15 and a standard deviation of 4?

If the value 13.44 is unlikely, then our friend probably lied and the actual mean monthly earnings are less than $15 (i.e., closer to the value of x̄ that we observed). On the other hand, if values like x̄ = 13.44 happen all the time when μ = 15, then we just had a typical below-average span of time. In the future things will hopefully get better.

We need to figure out how likely values like x̄ = 13.44 are when X is N(15, 4). The problem is that since X (and hence x̄) is normal, and therefore continuous, the probability that x̄ = 13.44 exactly is zero (as is the probability that x̄ equals any other single value). We can only calculate the probability that x̄ falls in an interval. We need an interval for our probability calculation, but we only have a single point from which to determine this interval.

There are only two reasonable intervals that we could consider, namely x̄ < 13.44 and x̄ > 13.44. These two intervals have probabilities that sum to 1 (so they are essentially equivalent: if you know one, the other is easy to find). Since we think our x̄ value is low, let's calculate P(x̄ < 13.44), that is, the probability that we get an x̄ value equal to ours, or smaller.

  1. If X is normal with a mean of 15 and a standard deviation of 4, what is the probability that x̄ < 13.44? (Use normalcdf.)








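Your answer should be roughly 0.25, so about 25% of the time, just by the typical variation in x̄, we should expect a value of x̄ that is 13.44 or less. (Remember that x̄ varies less than X does: for n = 3, the standard deviation of x̄ is 4/√3 ≈ 2.31, not 4.)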
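For readers working without a calculator, the same left-tail probability can be sketched in Python. This is only a minimal sketch, assuming Python 3.8 or later (for `statistics.NormalDist`); the variable names are illustrative and not part of the original activity. It relies on the fact that the mean of n independent N(15, 4) values is itself normal with mean 15 and standard deviation 4/√n. On a TI-style calculator the corresponding entry would be something like normalcdf(-1E99, 13.44, 15, 4/√3).

```python
from math import sqrt
from statistics import NormalDist

mu = 15.0      # hypothesized mean monthly return
sigma = 4.0    # standard deviation of a single monthly return
n = 3          # number of months averaged

# Sampling distribution of the sample mean: N(mu, sigma / sqrt(n))
xbar_dist = NormalDist(mu, sigma / sqrt(n))

# Left-tail probability P(xbar < 13.44)
p_left = xbar_dist.cdf(13.44)
print(f"P(xbar < 13.44) = {p_left:.3f}")
```

Changing `n` and the cutoff value is enough to reuse the sketch for longer investment periods.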

Let's summarize our results and define some terminology to help describe this situation. If we assume that our friend's hypothesis is true, namely that monthly returns are normally distributed with μ = $15 per month and σ = $4 per month, then 3-month returns averaging $13.44 (or less) should happen in about one out of every four 3-month periods.

We are testing the original assumption (μ = 15) against the alternative that μ < 15. Because of our alternative, we calculated the probability that x̄ < 13.44; this is called a left-tailed test (we calculate the area under the density curve to the left of our observed value of x̄). The probability we calculated, namely P(x̄ < 13.44), is called the p-value of our test. If our original assumption is true, then values of x̄ like ours (or lower) will occur roughly once in every four trials. This means that we have little or no evidence that our original assumption is false (our value is quite typical if we assume that μ = 15). Based on this, we decide to leave our $1000 invested for another 3-month period.
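The "one out of every four 3-month periods" interpretation of the p-value can also be checked by simulation. The sketch below is an illustration only, not part of the original activity: it assumes the friend's claim is true, draws many 3-month periods of N(15, 4) returns, and counts how often the 3-month average is $13.44 or less. The function name, the seed, and the number of trials are arbitrary choices made for this example.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 15.0, 4.0   # the friend's claimed mean and standard deviation
N_MONTHS = 3            # months per investment period
TRIALS = 20_000         # number of simulated periods (arbitrary choice)

def simulated_left_tail(observed, trials=TRIALS):
    """Fraction of simulated 3-month averages at or below `observed`."""
    hits = 0
    for _ in range(trials):
        returns = [random.gauss(MU, SIGMA) for _ in range(N_MONTHS)]
        if sum(returns) / N_MONTHS <= observed:
            hits += 1
    return hits / trials

fraction = simulated_left_tail(13.44)
print(f"Simulated share of periods averaging $13.44 or less: {fraction:.3f}")
```

The simulated share should land close to the p-value computed from the normal curve, which is the sense in which an average like ours is "quite typical" when μ = 15.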

After 6 months in the investment scheme, our monthly returns have averaged $12.91 (note, n = 6 now). Again, our returns are well below the stated value of $15 per month, but a 6-month average of $12.91 might be just due to the usual variation based on chance.

  1. If X is normally distributed with a mean of 15 and a standard deviation of 4, how likely is it that an average of 6 values of X has x̄ < 12.91?






  2. Which of the following alternatives (you may select several) do you now think are possible?
    1. Your friend was truthful. The average monthly return is $15 and we just saw a six-month period that was below average.




    2. Your friend was untruthful. The average monthly return is less than $15. The past 6 months give evidence to support this.




    3. Your friend was untruthful. The average monthly return is actually more than $15, and the past 6-month period was just below average.




  3. Which of the three alternatives do you now think is the most likely? Why?




  4. Which of the alternatives is the most unlikely? Why?




Your friend is a good one, and you'd like to be quite sure that μ < 15 before you pull your money out of the investment. The p-value you just calculated isn't that small. It says that there is some evidence against the assumption that μ = 15, but not a huge amount (x̄ values like ours, or worse, don't happen all that often, but they are not that rare either). Suppose you wanted to be 95% confident that μ < 15 before you pulled your money out of the investment. This translates to saying that you will pull your money out if your x̄ value is in the lowest 5% of all x̄ values that would occur if μ = 15 (equivalently, if your p-value, or probability calculation, is less than 0.05). You want to find the value of x̄ that has 5% of all values to its left (or 95% to its right). That is, you are looking for the number C such that P(x̄ < C) = 0.05.

  5. Use your invNorm command to find this number C. Record the actual command you used and the output below. Assume here that we are talking about a six-month sample mean (so n = 6).







  6. Do you leave your money in the investment, or do you pull your money out? Use the criteria above to decide.





  7. Suppose that you get busy and forget about your investment for a while. After 15 months you have average monthly earnings of x̄ = $12.34, well below the $15 per month that your friend promised. How likely is a value this small (or smaller) if μ = 15? Would you still believe that μ = 15 in this case? Explain.
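
The 5% decision rule described above (pull the money out when x̄ falls below the cutoff C) can be sketched in Python as a way to check calculator work. This is a hedged sketch, assuming Python 3.8 or later; the function names `critical_value`, `left_tail_p_value`, and `withdraw` are hypothetical helpers invented for this example and do not come from the original activity. `NormalDist.inv_cdf` plays the role of the calculator's invNorm command, and `NormalDist.cdf` the role of normalcdf with a very small lower bound.

```python
from math import sqrt
from statistics import NormalDist

def critical_value(mu0, sigma, n, alpha=0.05):
    """Value C with P(xbar < C) = alpha when the true mean is mu0."""
    return NormalDist(mu0, sigma / sqrt(n)).inv_cdf(alpha)

def left_tail_p_value(xbar, mu0, sigma, n):
    """P(sample mean <= xbar) when the true mean is mu0."""
    return NormalDist(mu0, sigma / sqrt(n)).cdf(xbar)

def withdraw(xbar, mu0, sigma, n, alpha=0.05):
    """True if xbar lies in the lowest alpha share of sample means."""
    return xbar < critical_value(mu0, sigma, n, alpha)

# The three sample means that appear in this activity
for xbar, n in [(13.44, 3), (12.91, 6), (12.34, 15)]:
    c = critical_value(15, 4, n)
    p = left_tail_p_value(xbar, 15, 4, n)
    decision = withdraw(xbar, 15, 4, n)
    print(f"n={n}: C={c:.2f}, p-value={p:.4f}, withdraw={decision}")
```

Comparing x̄ with C and comparing the p-value with 0.05 always lead to the same decision, so either check can be used.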