Math 105 A, Introduction to Statistics,
Central College,
Fall 2006
Exam 2 Review Sheet by Tom Linton
Practice Problems
If you can answer the following questions rather quickly, you should be
well prepared for the exam. These problems cover most of the topics
which will appear on exam 2 (chapters 5, 8, 9, 10, 11, and 12 in the
Moore text).
- As people age they begin to experience hearing loss. A study was
done to determine the "comfort level" of sound for people of different
ages (i.e. the level of noise, measured in decibels, that people could
listen to comfortably). The data are given in the table below.
| Age (years) | 15 | 25 | 35 | 45 | 55 | 65 | 75 | 85 |
| Sound level (decibels) | 56 | 57 | 64 | 64 | 68 | 74 | 78 | 85 |
- If we want to predict a person's age from the level of sound
for their "comfort level", which variable would be X and which would be
Y?
- If we want to predict a person's sound level from their age,
which variable would be X and which would be
Y?
- Make a scatter plot that would be appropriate for predicting a
person's sound level from their age. Does the scatter plot suggest
that a linear model is appropriate?
- Find the equation of the least-squares regression line for
the scatter plot in part (c). Add the regression line to your
scatter plot.
- What is the slope of your regression equation? In terms of age
and sound level, what does the slope of this
regression equation tell you?
- What is the y-intercept for this regression equation? For this
data set, is the y-intercept meaningful? Why or why not?
- Find one of the data pairs above that has a positive residual.
- Estimate the sound level for someone who is 60 years old.
- Why would it be inappropriate to use the regression equation to
predict the sound level of an infant?
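One way to check hand or calculator work on the regression parts of this problem is a short script. The sketch below is in Python and uses the usual least-squares formulas (slope = Sxy / Sxx, intercept = mean of Y minus slope times mean of X). The variable and function names are illustrative assumptions, not part of the review sheet.

```python
# Data from the hearing-loss table above.
ages = [15, 25, 35, 45, 55, 65, 75, 85]       # X: age in years
levels = [56, 57, 64, 64, 68, 74, 78, 85]     # Y: comfortable sound level (decibels)

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(levels) / n

# Sum of squares of X and sum of cross-products, both about the means.
sxx = sum((x - mean_x) ** 2 for x in ages)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, levels))

slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict(age):
    """Predicted sound level (decibels) for a given age, from the fitted line."""
    return intercept + slope * age


# Residual = observed y minus predicted y; positive means the point lies above the line.
residuals = [y - predict(x) for x, y in zip(ages, levels)]

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"predicted level at age 60 = {predict(60):.2f}")
for x, r in zip(ages, residuals):
    print(f"age {x}: residual {r:+.2f}")
```

The script only reports numbers. Interpreting the slope and intercept, and judging whether a prediction is an extrapolation, is still up to you.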
- Decide if each of the quantities below is most likely a parameter
or a statistic.
- The mean time spent sleeping last night by 4 students in this
Intro Stats class.
- The mean number of items purchased by an SRS of 8 customers at
a local HyVee store.
- The median age of all pennies currently in circulation in the
United States.
- The standard deviation of the sample means, from all samples of
size 8, from a given population.
- People who eat lots of fruits and vegetables have lower rates of
colon cancer than those who eat little of these foods. Fruits and
vegetables are rich in "antioxidants" such as vitamins A, C, and
E. Will taking antioxidants help prevent colon cancer? A
clinical trial studied this question with 864 people who were at risk
of colon cancer. The subjects were divided into four groups:
daily beta carotene, daily vitamins C and E, all three vitamins,
and daily placebo. After four years, the researchers were surprised
to find no statistically significant difference in colon cancer among
the groups.
- Explain what the last sentence above means.
- Explain why this clinical trial is an experiment.
- What are the explanatory and response variables?
- Explain how you would use your calculator or Table B to
assign subjects to the treatment groups.
- The study was double-blind. What does this mean?
- Suggest some lurking variables that could explain why people who
eat lots of fruits and vegetables have lower rates of colon cancer.
- Repetition, or using large enough groups for your treatments, is
one of three key ingredients of a well-designed experiment. What are
the other two key ingredients?
- Give an example where using a stratified random sample is
appropriate.
- Name two types of sampling methods that can give unreliable
results.
- In each of their games in the 1999 Major League Baseball
season, the Minnesota Twins committed X = 0,1,2,3, or 4 errors. The
distribution of X = number of errors per game is skewed to the right,
with X = 0 the most common value, X = 1 the second most common value,
and so on down to X = 4 the least common value. Their average for the
season was 0.84 errors per game with a standard deviation of 0.97
errors per game. Suppose we decide to select an SRS of n games (for the
Twins in the 1999 season) and calculate the sample mean x̄
for the number of errors in those games. Consider the collection of all
possible x̄ values (that is, the sampling distribution of x̄).
- If n = 9, what is the mean of x̄? What is the standard deviation of x̄?
- How large would n have to be before it was safe to assume that
the distribution of x̄ values was approximately Normal?
- If we select an SRS of size 30, what is the approximate
probability that x̄ < 0.8?
- If we consider the games played before the all-star break to be
an SRS (there were 80 of these games), how likely is it that the Twins
committed a total of 56 or fewer errors in these games?
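The sampling-distribution calculations in this problem can be checked numerically. The sketch below is in Python and assumes the standard results: the mean of x̄ equals the population mean, its standard deviation is sigma / sqrt(n), and for large n the central limit theorem gives an approximately Normal shape. It also treats the games as independent draws and ignores the fact that a season has only finitely many games. The names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 0.84, 0.97   # season mean and SD of errors per game (from the problem)


def xbar_sd(n):
    """Standard deviation of the sample mean for an SRS of n games."""
    return sigma / sqrt(n)


# n = 9: the mean of x-bar is mu; its standard deviation is sigma / 3.
sd_9 = xbar_sd(9)

# n = 30: Normal approximation for P(x-bar < 0.8).
p_below_08 = NormalDist(mu, xbar_sd(30)).cdf(0.8)

# n = 80: a total of 56 or fewer errors is the same event as x-bar <= 56/80 = 0.7.
p_total_56 = NormalDist(mu, xbar_sd(80)).cdf(56 / 80)

print(f"n = 9: mean = {mu}, sd = {sd_9:.4f}")
print(f"P(x-bar < 0.8), n = 30: {p_below_08:.4f}")
print(f"P(total <= 56), n = 80: {p_total_56:.4f}")
```

Rewriting the question about total errors as a question about the sample mean is the key step. The rest is the same Normal calculation as for n = 30.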
- The number of copies of the magazine Cosmopolitan that are sold
daily at a convenience store is a random variable X which takes on the
values 0, 1, 2, 3, 4, and 5. The distribution of X is mostly given in the
table below.
| X | 0 | 1 | 2 | 3 | 4 | 5 |
| P(X) | 0.10 | 0.12 | 0.25 | 0.30 | 0.20 | 0.03 |
- What is the probability that X = 1?
- What is the probability that X is greater than or equal to 4?
- What is the probability that the store sells one or more copies
of Cosmopolitan magazine on a randomly chosen day?
- What does it mean to say that daily sales of Cosmopolitan
magazine at this store are independent?
- Assuming daily sales are independent at this store, what is the
probability that this store sells 1 or more copies of Cosmopolitan
magazine for three days in a row?
- For total sales in a 2-day period, there are four ways that
this store can sell exactly 3 copies of the magazine. One of them is to
sell 0 copies on day 1 and 3 copies on day 2. Find the other 3 ways to sell
a total of 3 copies over a 2-day period, and calculate the probability
that this store sells a total of exactly 3 copies over a 2-day period.
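The discrete-probability parts of this problem come down to adding and multiplying table entries, so they are easy to check with a few lines of Python. The sketch below uses the probabilities exactly as they appear in the table above and assumes, as the later parts do, that sales on different days are independent. The dictionary name `pmf` is an illustrative choice.

```python
# Probability distribution of X = daily copies sold, taken from the table above.
pmf = {0: 0.10, 1: 0.12, 2: 0.25, 3: 0.30, 4: 0.20, 5: 0.03}

# A legitimate distribution must sum to 1 (allowing for rounding error).
total = sum(pmf.values())

p_at_least_4 = pmf[4] + pmf[5]
p_at_least_1 = 1 - pmf[0]            # complement rule

# Independence lets us multiply: one or more copies on each of three days.
p_three_days = p_at_least_1 ** 3

# Exactly 3 copies over two days: (0,3), (1,2), (2,1), (3,0).
p_total_3 = sum(pmf[a] * pmf[3 - a] for a in range(4))

print(f"sum of probabilities = {total:.2f}")
print(f"P(X >= 4) = {p_at_least_4:.2f}")
print(f"P(X >= 1) = {p_at_least_1:.2f}")
print(f"P(X >= 1 three days in a row) = {p_three_days:.3f}")
print(f"P(two-day total = 3) = {p_total_3:.2f}")
```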
- You are the marketing director for a mail order plant and seed
company that has produced two catalogs, one for spring orders and one
for fall orders. For each catalog sent to a potential customer,
the customer's entry in a data file is Y if they ordered something, and
N if they did not (Y = yes, N = no). After mailing the spring and fall
catalogs to a large collection of potential customers, you determine
the probabilities of the buying patterns to be:
| Outcome (spring, fall): | YY | YN | NY | NN |
| Probability: | 0.30 | 0.10 | 0.05 | 0.55 |
- Let S denote buying from the spring catalog, and F denote
buying from the fall catalog. Calculate P(S) and P(F). You might
consider making a Venn diagram to help answer these questions.
- Explain what the event "S and F" represents, and calculate P(S
and F).
- In words, what does P(F | S) represent? What is the value of
P(F | S)?
- Are F and S independent? Explain.
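The catalog questions can be checked by summing the outcome probabilities that belong to each event. The sketch below is in Python and uses the definition P(F | S) = P(S and F) / P(S), with the product rule P(S and F) = P(S) P(F) as the test for independence. The names are illustrative.

```python
# Outcome probabilities from the table above; first letter = spring, second = fall.
outcomes = {"YY": 0.30, "YN": 0.10, "NY": 0.05, "NN": 0.55}

# P(S): bought from the spring catalog (first letter is Y).
p_s = sum(p for o, p in outcomes.items() if o[0] == "Y")
# P(F): bought from the fall catalog (second letter is Y).
p_f = sum(p for o, p in outcomes.items() if o[1] == "Y")

p_s_and_f = outcomes["YY"]

# Conditional probability of a fall purchase, given a spring purchase.
p_f_given_s = p_s_and_f / p_s

# S and F are independent exactly when P(S and F) equals P(S) * P(F).
independent = abs(p_s_and_f - p_s * p_f) < 1e-9

print(f"P(S) = {p_s:.2f}, P(F) = {p_f:.2f}")
print(f"P(S and F) = {p_s_and_f:.2f}, P(F | S) = {p_f_given_s:.2f}")
print(f"P(S) * P(F) = {p_s * p_f:.2f}, independent: {independent}")
```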
- Toot-Toots all-you-care-to-eat restaurant charges $8.95 per
customer to eat at the restaurant. They find that their expense per
customer (including the amount of food eaten and their expenses for
labor) has a distribution that is noticeably skewed to the right with a
mean of $8.20 and a standard deviation of $3.
- Explain what the law of large numbers says about Toot-Toots
customers and profits.
- If a couple (2 people entering the restaurant together) can be
viewed as an SRS of size 2 from Toot-Toot's customer base, what are the
mean and standard deviation of the sampling distribution of a couple's
mean expense (that is, the average expense per customer, to
Toot-Toot's, based on a sample of size 2)? Would it be safe to assume
that the sampling distribution for a couple's mean expense has a Normal
distribution? Explain.
- Assume that on a given day, 100 customers eat at Toot-Toots. If
we view these 100 customers as an SRS from the customer base, what is
the probability that Toot-Toots earns a profit on this day, i.e. what
is P(x̄ ≤ 8.95)? What is the probability that Toot-Toots
averages at least $0.50 profit per customer on this day, i.e. what is
P(x̄ ≤ 8.45)?
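The numerical parts of this problem can be checked with the same tools as the Twins problem. The sketch below is in Python and assumes that the standard deviation of the mean expense is sigma / sqrt(n) and that, for n = 100, the central limit theorem justifies a Normal approximation even though individual expenses are skewed. The names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

price = 8.95               # charge per customer (from the problem)
mu, sigma = 8.20, 3.00     # mean and SD of expense per customer (from the problem)

# Couple (n = 2): the mean of x-bar is still mu, but n is far too small for the
# central limit theorem to justify a Normal shape when the population is skewed.
sd_couple = sigma / sqrt(2)

# 100 customers: x-bar is approximately Normal with mean mu and SD sigma / 10.
sd_100 = sigma / sqrt(100)
xbar_100 = NormalDist(mu, sd_100)

p_profit = xbar_100.cdf(price)             # P(x-bar <= 8.95)
p_profit_50c = xbar_100.cdf(price - 0.50)  # P(x-bar <= 8.45)

print(f"couple: mean = {mu:.2f}, sd = {sd_couple:.4f}")
print(f"100 customers: sd = {sd_100:.2f}")
print(f"P(x-bar <= 8.95) = {p_profit:.4f}")
print(f"P(x-bar <= 8.45) = {p_profit_50c:.4f}")
```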
- It is estimated that 50% of all computer chips manufactured are
defective. Fortunately, inspections and other forms of quality control
guarantee that only 5% of all legally marketed computer chips are
defective. Unfortunately, some chips are stolen immediately after being
produced (before inspections and other forms of quality control have
been used). It is estimated that 1% of all computer chips on the market
are stolen. Make a tree diagram to help analyze this situation. Your
first branch should consider whether a chip is stolen or legally
marketed. The second branch should correspond to whether the chip is
defective or good. Find the probability that a randomly purchased chip
is stolen, given that it is defective.
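The tree diagram for this problem translates directly into a short calculation. The sketch below is in Python and multiplies along each branch, then applies the definition of conditional probability. It assumes that stolen chips are defective at the 50% manufacturing rate, since they skip inspection. That reading of the problem, and the variable names, are assumptions.

```python
# First branch: stolen versus legally marketed (from the problem).
p_stolen = 0.01
p_legal = 1 - p_stolen

# Second branch: defective, given the first branch.
p_def_given_stolen = 0.50   # assumption: stolen chips skip inspection, so 50% are defective
p_def_given_legal = 0.05    # from the problem

# Multiply along each branch that ends in "defective".
p_stolen_and_def = p_stolen * p_def_given_stolen
p_legal_and_def = p_legal * p_def_given_legal

# Total probability that a randomly purchased chip is defective.
p_def = p_stolen_and_def + p_legal_and_def

# Conditional probability: stolen, given defective.
p_stolen_given_def = p_stolen_and_def / p_def

print(f"P(defective) = {p_def:.4f}")
print(f"P(stolen | defective) = {p_stolen_given_def:.4f}")
```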