AP Statistics Practice Semester 1 Final (chapters 1–8)
Name: _______________________
1. Suppose that a county official sent out a survey to 1000 randomly selected households in the county. The
survey was about increasing taxes to pay for road improvements. However, only 184 of the surveys were
returned. Of the households that returned the survey, only 12% wanted to increase taxes. Which of the
following best describes this study?
a. The results should not be trusted because of non-response bias. The people who didn’t return the
surveys may feel differently than those who did respond.
b. The results should not be trusted because of non-response bias. Only 1000 households were
selected, leaving out many other households.
c. The results should not be trusted since different people have different opinions about taxes.
d. The results should be trusted since the 1000 households were randomly selected.
e. The results should be trusted since 184/1000 is greater than 12%.
2. Suppose that a machine that cuts corks for wine bottles operates in such a way that the diameters of the
corks follow a normal distribution with a mean of 3 cm and a standard deviation of 0.1 cm. A cork is
considered acceptable if its diameter is between 2.85 and 3.15 cm. What percentage of corks produced by
this machine will be considered acceptable?
a. 87%
b. 30%
c. 7%
d. 95%
e. 68%
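One way to check an answer to question 2 is to compute the area under the Normal curve between the two cutoffs. The sketch below is only an illustration using Python's standard library; the variable names are assumptions and not part of the exam.

```python
from statistics import NormalDist

# Cork diameters from question 2: mean 3 cm, standard deviation 0.1 cm.
diameters = NormalDist(mu=3.0, sigma=0.1)

# Proportion of corks with a diameter between 2.85 cm and 3.15 cm.
acceptable = diameters.cdf(3.15) - diameters.cdf(2.85)
print(f"Proportion acceptable: {acceptable:.3f}")
```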
3. The article “Study links depression in seniors to body fat” (AZDS 12-2-08) describes that 2088 people over
age 70 were recruited from Memphis, TN and Pittsburgh, PA and followed for 5 years. Eighty-four of the
subjects were initially diagnosed with depression symptoms and had an average increase of 9 sq. cm. of
visceral fat during the 5 years. The other 2004 subjects lost 7 sq. cm. of visceral fat, on average. Which of
the following would be the most appropriate conclusion from this study?
a. Depression causes an increase in body fat for all people.
b. Depression causes an increase in body fat for all people over age 70.
c. Depression causes an increase in body fat for all people over age 70 in the Memphis and Pittsburgh
areas.
d. Depression may not cause an increase in body fat, but there is an association between depression and
body fat for all people.
e. None of these conclusions are appropriate.
Use the following stemplot of ages for questions 4 and 5.
AGES
0 | 147899
1 | 2245778
2 | 359
3 | 045
4 |
5 | 9
Key: 2 | 3 = 23 years old
4. What is the median age in the distribution shown in the stemplot?
a. 1
b. 2.5
c. 6
d. 16
e. 18
5. What is the upper boundary for outliers in the distribution shown in the stemplot?
In other words, outliers are ages above what value?
a. 27
b. 35
c. 49
d. 54
e. 59
6. Estimate the standard deviation of the distribution shown in the histogram below:
a. 2
b. 6
c. 10
d. 40
e. 75
7. Suppose that 40% of men in their 30’s have some gray hair and 30% of men in their 30’s are balding. What
percentage of men in their 30’s have some gray hair or are balding?
a. 12%
b. 30%
c. 40%
d. 70%
e. Cannot be determined since having gray hair and balding are not mutually exclusive events.
8. Suppose that the time it takes students to complete a standardized exam is approximately normally
distributed with a mean of 45 minutes and a standard deviation of 10 minutes. Approximately how much
time should be allowed so that 90% of the students have enough time to finish?
a. 32 minutes
b. 41 minutes
c. 58 minutes
d. 65 minutes
e. 68 minutes
9. In a study on solar panels, the outdoor temperature and current produced were recorded at 200 different
times during the year. The regression equation for this relationship is: ŷ = −18 + 0.72x where y = current
and x = temperature. One observation was x = 67 degrees and y = 5.5 amps. What is the approximate
residual for this observation?
a. 25
b. -25
c. 30
d. -30
e. Cannot be determined without all of the observations
10. An automobile factory received a large shipment of 10,000 car stereos. The shipment was divided into 500
boxes with 20 stereos in each box. To inspect the shipment, 10 boxes were randomly selected and each
stereo in those 10 boxes was tested thoroughly. Which of the following sampling plans correctly describes
the procedure used at the factory?
a. Simple random sample
b. Stratified random sample
c. Cluster sample
d. Systematic sample
e. Census
11. In a particular high school, the number of courses a senior is taking was recorded and summarized in the
table below. Find the mean and standard deviation of this distribution.
Number of Courses    4     5     6     7
Proportion           0.2   0.5   0.2   0.1
a. mean = 5.5, SD = 1.1
b. mean = 5.5, SD = 1.3
c. mean = 5.2, SD = 0.9
d. mean = 5.2, SD = 1.1
e. Cannot be determined without the sample size
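The mean and standard deviation of a discrete distribution like the one in question 11 come from weighting each value by its proportion. The following is a minimal sketch of that calculation; the variable names are illustrative assumptions.

```python
from math import sqrt

# Values and proportions from the table in question 11.
courses = [4, 5, 6, 7]
proportions = [0.2, 0.5, 0.2, 0.1]

# Mean: sum of value times proportion.
mean = sum(x * p for x, p in zip(courses, proportions))

# Variance: sum of squared deviation from the mean times proportion.
variance = sum((x - mean) ** 2 * p for x, p in zip(courses, proportions))
sd = sqrt(variance)

print(f"mean = {mean:.2f}, SD = {sd:.2f}")
```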
12. On a final exam in Algebra, the mean score was 23 with a standard deviation of 9. What is the z-score for a
student who scored 18 on the exam?
a. 28
b. 9
c. -5
d. 0.56
e. -0.56
13. Suppose that when traveling northbound on a local street, a particular stop light is red 70% of the time.
Which of the following simulation methods would be best to estimate the probability that the light is red on
5 consecutive trips on this street?
a. Let Heads = Red and Tails = Green. Flip the coin five times and count the number of heads.
b. Let 0-4 = Red and 5-9 = Green. Generate 5 random integers from 0-9 and count the number that are
between 0 and 4.
c. Let 0-4 = Red and 5-9 = Green. Generate 5 random integers (without replacement) from 0-9 and
count the number that are between 0 and 4.
d. Let 0-6 = Red and 7-9 = Green. Generate 5 random integers from 0-9 and count the number that are
between 0 and 6.
e. Let 0-6 = Red and 7-9 = Green. Generate 5 random integers (without replacement) from 0-9 and
count the number that are between 0 and 6.
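A simulation of the kind described in question 13 can be sketched in a few lines. This version assumes the trips are independent, so digits are drawn with replacement, and it maps the digits 0-6 to Red so that Red has probability 0.7. The function name, the number of repetitions, and the seed are assumptions chosen for illustration.

```python
import random

def estimate_all_red(trials=100_000, seed=1):
    """Estimate the chance that the light is red on 5 consecutive trips."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # One random digit 0-9 per trip; 0-6 means Red, 7-9 means Green.
        digits = [rng.randint(0, 9) for _ in range(5)]
        if all(d <= 6 for d in digits):
            hits += 1
    return hits / trials

estimate = estimate_all_red()
print(f"Estimated P(red on all 5 trips): {estimate:.3f}")
```

Under the independence assumption, the estimate should land close to the exact value 0.7 ** 5.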
14. A random sample of bears was captured and the lengths (in inches) and weights (in pounds) of these bears
were measured. The goal was to create a model for predicting weight based on length (since it is much
easier to estimate the length of a bear than its weight). A linear regression model was computed and r = .88.
Based on this value, which of the following statements may NOT be true?
a. There is a linear relationship between length and weight.
b. There is a positive relationship between length and weight.
c. There is a relatively strong relationship between length and weight.
d. If weight was measured in kilograms instead of pounds, r would still be .88.
e. If weight was used to predict length, r would still be .88.
15. On a recent statistics test, the raw scores had a median of 15 with an interquartile range of 7. The teacher
scaled these scores by multiplying each raw score by 3 and then adding 35. What are the median and
interquartile range of the scaled scores?
a. median = 80, IQR = 56
b. median = 80, IQR = 21
c. median = 80, IQR = 7
d. median = 45, IQR = 21
e. median = 45, IQR = 7
Problems 16 and 17 concern the 2008 Presidential Election. Given below is a joint-probability table for the
results of the Election. The results are based on Election-Day polls for the National Election Pool and show the
candidate voted for and ethnicity of the voter.
                   White    Black    Hispanic   Asian    Other
                   voter    voter    voter      voter    voter
Obama              0.318    0.126    0.053      0.012    0.016
McCain             0.407    0.005    0.025      0.007    0.018
Other Candidates   0.011    0.001    0.000+     0.000+   0.001
16. If you were to randomly select one white voter, what is the probability he or she voted for Obama?
a. 0.32
b. 0.43
c. 0.61
d. 0.74
e. 0.52
17. According to the data, do candidate preference and ethnicity of voters seem to be independent?
a. No, since some people prefer Obama while others prefer McCain
b. No, since some ethnicities favor Obama while other ethnicities favor McCain
c. No, since there are more white voters than any other ethnicity
d. Yes, since each person gets the choice of who to vote for
e. Yes, since all of the probabilities add up to 1
18. Suppose that boxes of candy bars contain 36 candy bars. The weights of individual candy bars have a mean
of 6 ounces and a standard deviation of 0.1 ounce. Also, the empty boxes have a mean weight of 2 ounces
with a standard deviation of 0.2 ounces. Assuming all the weights are independent, what are the mean and
standard deviation of the weight of a box with 36 candy bars?
a. mean = 218, SD = 3.80
b. mean = 218, SD = 1.95
c. mean = 218, SD = 0.40
d. mean = 218, SD = 0.63
e. mean = 218, SD = 0.22
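For question 18, means of independent weights add directly, but standard deviations do not: the variances add, and the standard deviation is the square root of the total variance. A short sketch of that arithmetic, with illustrative variable names, is:

```python
from math import sqrt

n_bars = 36
bar_mean, bar_sd = 6.0, 0.1   # one candy bar (ounces)
box_mean, box_sd = 2.0, 0.2   # the empty box (ounces)

# Means add across all 36 bars plus the box.
total_mean = n_bars * bar_mean + box_mean

# Variances (not standard deviations) add for independent weights.
total_variance = n_bars * bar_sd ** 2 + box_sd ** 2
total_sd = sqrt(total_variance)

print(f"mean = {total_mean:.0f}, SD = {total_sd:.2f}")
```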
19. Suppose that a student wanted to make a graph showing the relationship between gender and hair color.
Which of the following would be most appropriate?
a. a scatterplot
b. parallel boxplots
c. back to back stemplots
d. a cumulative frequency plot
e. a comparative bar chart
20. In the boxplots shown below, only the whiskers overlap. If
you were to randomly select one observation from distribution
A and one observation from distribution B, then the probability
that the A value is bigger than the B value is at most:
a. 1/2
b. 1/4
c. 1/8
d. 1/16
e. 1/32
[Figure: boxplots of distributions a and b, horizontal axis from 0 to 8]
21. Lamb’s-quarter is a common weed that interferes with the growth of corn. An agriculture researcher
planted the same number of corn seeds in each of 16 small plots and then weeded the plots by hand to allow a
fixed number of lamb’s-quarter plants to grow in each meter of corn row. The decision of how many of these
plants to leave in each plot was made at random. No other weeds were allowed to grow. Here are the yields of
corn (bushels per acre) in each of the 16 plots:
Weeds per meter    Corn Yield
0                  166.7   172.2   165     176.9
1                  166.2   157.3   166.7   161.1
3                  158.6   176.4   153.1   156
9                  162.8   142.4   162.8   162.4
a. What is the explanatory variable in this experiment?
b. What is the response variable?
c. List the treatments.
d. What are the experimental units?
e. Why was it important that the researcher planted the same number of seeds in each of the 16 plots?
f. Why was it important that the researcher used random assignment to determine the number of weeds in each
plot?
g. Did the researcher use blocking in this experiment? Explain.
h. Explain why this was an experiment and not an observational study.
22. The researcher performed a regression analysis on the data and got the results below.
[Figure: plot of Yield (140 to 180) versus Weeds (0 to 9)]
Predictor    Coef       SE Coef    T        P
Constant     166.483    2.725      61.11    0.000
Weeds        -1.0987    0.5712     -1.92    0.075
S = 7.97665    R-Sq = 20.9%    R-Sq(adj) = 15.3%
i. Interpret the slope and y intercept of the regression line in the context of this experiment.
j. Calculate the residual for the plot that had 9 weeds/meter and a yield of 142.4 bushels/acre.
k. Interpret the value of s in the context of this experiment.
l. How does the point identified in part (j) affect the value of s? Explain.
23. Suppose that a citrus farmer grows two types of oranges:
Navel and Mandarin. One particularly beautiful day, the farmer
took a random sample of 20 oranges of each type from his
grove. The distributions of weights (in grams) of these oranges
are summarized in the boxplots below.
[Figure: boxplots of Navel and Mandarin orange weights, horizontal axis Weight (grams) from 0 to 220]
a) Briefly compare these distributions.
b) The standard deviation for this sample of Mandarin oranges is 23. Interpret this value in context.
c) If the population distribution of weights of the Mandarin oranges from this grove was approximately
Normal, what proportion of the weights should be within one standard deviation of the mean?
d) In reality, only 11 of the 20 Mandarin oranges in the sample (11/20 = 55%) were within one standard
deviation of the mean. If the population distribution of weights is approximately Normal, find the
probability of getting 11 or fewer within one standard deviation of the mean.
e) Based on your answer to part (d), do you have convincing evidence that the population distribution of
Mandarin orange weights is not approximately Normal?
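Part (d) of question 23 can be approached as a binomial calculation. The sketch below assumes the 20 oranges are independent and that each has probability 0.68 of falling within one standard deviation of the mean (the empirical-rule value for a Normal distribution). Both the 0.68 figure and the function name are assumptions made for illustration.

```python
from math import comb

def binomial_cdf(k_max, n, p):
    """P(X <= k_max) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(k_max + 1))

# 20 oranges, assumed chance 0.68 of being within one SD of the mean.
prob = binomial_cdf(11, 20, 0.68)
print(f"P(11 or fewer within one SD) = {prob:.3f}")
```

Using a more precise Normal proportion, such as 0.6827, changes the result only slightly.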