AP Statistics Semester I Final Exam Review

1. Jerry and George are playing golf. Their scores on the first 9 holes are shown in the table
below. For each player, find the mean, median and mode score. Use your results to explain
which measure of central tendency gives the best comparison of the abilities of the two
players.
Player/Hole   1   2   3   4   5   6   7   8   9    Total   Mean   Median   Mode
Jerry         4   7   5   2   4   7   3   6   7    45      ____   ____     ____
George        3   5   4   2   3   7   3   5   16   48      ____   ____     ____
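One way to check the hand calculations for question 1 is a short script. This is a minimal sketch, assuming Python 3.8 or later and only the standard library; the helper name `summarize` is illustrative and not part of the worksheet.

```python
from statistics import mean, median, multimode

# Scores copied from the table in question 1.
scores = {
    "Jerry": [4, 7, 5, 2, 4, 7, 3, 6, 7],
    "George": [3, 5, 4, 2, 3, 7, 3, 5, 16],
}

def summarize(values):
    """Return (mean, median, list of modes) for a list of scores."""
    return mean(values), median(values), multimode(values)

for player, values in scores.items():
    avg, mid, modes = summarize(values)
    print(f"{player}: mean={avg:.2f}, median={mid}, mode(s)={modes}")
```

Comparing the three measures side by side makes it easier to see how George's score of 16 on hole 9 affects each one.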
2. The histograms below represent the distributions of the batting averages of the top National
League and American League batters for the years shown.
Examine the histograms and write a one-paragraph observation about the data. Be sure to discuss
measures of center and spread. Also, compare the three distributions, noting any apparent
trends.
3. Below are the winning men’s long jump distances for the first 22 Olympic Games.
Year   Winner               Meters
1896   Ellery Clark         6.35
1900   Alvin Kraenzlin      7.18
1904   Myer Prinstein       7.34
1908   Francis Irons        7.48
1912   Albert Gutterson     7.60
1920   William Pettersson   7.15
1924   De Hart Hubbard      7.44
1928   Edward Hamm          7.73
1932   Edward Gordon        7.63
1936   Jesse Owens          8.06
1948   William Steele       7.82
1952   Jerome Biffle        7.57
1956   Gregory Bell         7.83
1960   Ralph Boston         8.12
1964   Lynn Davies          8.07
1968   Robert Beamon        8.90
1972   Randy Williams       8.24
1976   Arnie Robinson       8.35
1980   Lutz Dombrowski      8.54
1984   Carl Lewis           8.54
1988   Carl Lewis           8.72
1992   Carl Lewis           8.67
a) Create a stem-and-leaf plot of the data. Be sure to add a legend or key.
b) Now plot the same data with a time plot from 1896 to 1992. Label appropriately.
c) Write a one-paragraph conclusion about these plots. Compare/contrast the two plots, noting
any advantages one plot has over the other toward analyzing the data.
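A text stem-and-leaf plot for part (a) can be sketched in code. The choice of stems here (meters and tenths, with hundredths as leaves) is an assumption; the worksheet does not prescribe one, and the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Winning distances in meters, copied from the table in question 3.
distances = [6.35, 7.18, 7.34, 7.48, 7.60, 7.15, 7.44, 7.73, 7.63, 8.06, 7.82,
             7.57, 7.83, 8.12, 8.07, 8.90, 8.24, 8.35, 8.54, 8.54, 8.72, 8.67]

def stem_and_leaf(values):
    """Group values by stem (meters and tenths) with hundredths as leaves."""
    groups = defaultdict(list)
    for value in values:
        hundredths = round(value * 100)   # 8.54 -> 854, avoids float error
        groups[hundredths // 10].append(hundredths % 10)
    low, high = min(groups), max(groups)
    # Include empty stems so gaps in the data stay visible.
    return {stem: sorted(groups.get(stem, [])) for stem in range(low, high + 1)}

plot = stem_and_leaf(distances)
for stem, leaves in plot.items():
    print(f"{stem / 10:.1f} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 8.5 | 4 means 8.54 meters")
```

Keeping the empty stems preserves the shape of the distribution, which matters when comparing this plot with the time plot in part (b).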
4. The scores of the winning teams in the Rose Bowl games played from 1931 to 1980 are
arranged in ascending order in the list below.
 7   14   17   24   35
 7   14   18   25   35
 7   14   20   27   38
 7   14   20   27   40
 9   14   20   27   42
10   17   21   28   42
10   17   21   29   42
13   17   21   29   44
13   17   21   34   45
14   17   23   34   49
a) Create a histogram of the data, with bars of width 5 (1-5, 6-10, 11-15, 16-20, etc.).
b) Find the 5-number summary of the data and use it to construct a box plot.
MIN ______   Q1 ______   MEDIAN ______   Q3 ______   MAX ______
c) Describe the shape of the data from the histogram. Explain how it is usually possible to know
the shape of the histogram simply by looking at the box plot.
d) Is the highest value in this data an outlier? Use a formula to justify your response.
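Parts (b) and (d) can be checked with a short script. This is a sketch under two assumptions: quartiles are taken as the medians of the lower and upper halves of the sorted data (the convention most graphing calculators use), and outliers are flagged with the 1.5 x IQR rule. The function name `five_number_summary` is illustrative.

```python
from statistics import median

# The 50 winning scores from question 4, in ascending order.
scores = [7, 7, 7, 7, 9, 10, 10, 13, 13, 14,
          14, 14, 14, 14, 14, 17, 17, 17, 17, 17,
          17, 18, 20, 20, 20, 21, 21, 21, 21, 23,
          24, 25, 27, 27, 27, 28, 29, 29, 34, 34,
          35, 35, 38, 40, 42, 42, 42, 44, 45, 49]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[-half:]   # middle value excluded if n is odd
    return data[0], median(lower), median(data), median(upper), data[-1]

minimum, q1, med, q3, maximum = five_number_summary(scores)
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr   # values above this count as high outliers
max_is_outlier = maximum > upper_fence

print("Five-number summary:", minimum, q1, med, q3, maximum)
print("Upper fence:", upper_fence, "| Is the max an outlier?", max_is_outlier)
```

Other quartile conventions (for example, interpolated percentiles) can give slightly different values of Q1 and Q3, so the fence may shift a little depending on the tool.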
5. Read the problem description below left:
a) Which curve do you think shows the weights of newly minted quarters, which curve the coins
after five years, and which curve the coins after ten years?
b) What happens to the average weight of the coins as time passes?
c) What happens to the standard deviation of the weight of the coins as time passes?
6. Lengths of pregnancy of women having children are normally distributed, with a mean of 266
days and a standard deviation of 16 days.
a) Sketch a normal curve from this information, labeling the mean and 3 standard deviations in
both directions.
Use your sketch to answer questions b-d.
b) What percentage of pregnancies last between 250 and 282 days?
__________
c) What percentage of pregnancies last between 250 and 298 days?
__________
d) What percentage of pregnancies last between 234 and 266 days?
__________
e) A letter once appeared in Dear Abby’s newspaper column from a woman who said she had
been pregnant for 310 days before giving birth to her baby. This is considerably longer than the
typical pregnancy. What percentage of pregnancies last 310 or more days? Show a sketch and all
necessary work.
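The percentages in parts (b) through (e) can be checked numerically. This is a minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; the helper name `percent_between` is illustrative. The exact areas differ slightly from the rounded values given by the 68-95-99.7 rule.

```python
from statistics import NormalDist

# Model from question 6: mean 266 days, standard deviation 16 days.
pregnancy = NormalDist(mu=266, sigma=16)

def percent_between(low, high):
    """Percent of pregnancies lasting between low and high days."""
    return 100 * (pregnancy.cdf(high) - pregnancy.cdf(low))

# Part (e): standardize 310 days, then take the upper-tail area.
z_310 = (310 - 266) / 16
percent_310_or_more = 100 * (1 - pregnancy.cdf(310))

for low, high in [(250, 282), (250, 298), (234, 266)]:
    print(f"{low} to {high} days: {percent_between(low, high):.2f}%")
print(f"z = {z_310}, 310 or more days: {percent_310_or_more:.3f}%")
```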
7. Consider the following data, which give the weight (in thousands of pounds) x and gasoline
mileage (miles per gallon) y for ten different automobiles.
x     y
2.5   40
3.0   43
4.0   30
3.5   35
2.7   42
4.5   19
3.8   32
2.9   39
5.0   15
2.2   14
a) Use your calculator to find the least-squares regression equation, r, and r² for this data. Also
use your calculator to make a scatter plot of the data.
Reg. Eq.: ___________________________________ r: _____________
b) Using your results from part (a), describe the strength of the linear relationship between these
two variables.
c) Use the results from part (a) to complete the table below (round to nearest tenth).
Weight (x)   Actual MPG (y)   Predicted MPG (ŷ)   Residual
3.0          43               ______              ______
4.0          30               ______              ______
5.0          15               ______              ______
d) Use your calculator to create a residual plot.
e) What does this residual plot tell us? Explain in a few sentences.
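Calculator output for question 7 can be cross-checked against the textbook least-squares formulas. The sketch below assumes the fitted line has the form ŷ = a + bx and defines a residual as actual minus predicted; the function name `least_squares` is illustrative.

```python
from math import sqrt

# Data from question 7: weight in thousands of pounds, mileage in mpg.
weights = [2.5, 3.0, 4.0, 3.5, 2.7, 4.5, 3.8, 2.9, 5.0, 2.2]
mpg = [40, 43, 30, 35, 42, 19, 32, 39, 15, 14]

def least_squares(x, y):
    """Return (intercept, slope, r) for the least-squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    return intercept, slope, r

intercept, slope, r = least_squares(weights, mpg)
predicted = [intercept + slope * xi for xi in weights]
residuals = [yi - pi for yi, pi in zip(mpg, predicted)]   # actual - predicted

print(f"y-hat = {intercept:.2f} + ({slope:.2f})x, r = {r:.3f}, r^2 = {r * r:.3f}")
for xi, yi, pi, ei in zip(weights, mpg, predicted, residuals):
    print(f"x={xi}: actual={yi}, predicted={pi:.1f}, residual={ei:.1f}")
```

Plotting `residuals` against `weights` gives the residual plot in part (d); a single point far from the others is worth noting in part (e).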
8. Answer TRUE if the statement is always true. If the statement is not always true, replace
the word(s) in bold with words that make the statement always true.
a) Correlation analysis is a method of obtaining the equation that represents the relationship
between two variables.
b) The linear correlation coefficient is used to determine the equation that represents the
relationship between two variables.
c) A correlation coefficient of zero means that the two variables are perfectly correlated.
d) Whenever the slope of the regression line is zero, the correlation coefficient will also be
zero.
e) When r is positive, the slope will always be negative.
f) The slope of the regression line represents the amount of change expected to take place in y
when x increases by one unit.
g) The calculated value of r² represents the fraction of variation in the y variable which can be
explained by the linear model.
h) Correlation coefficients range between 0 and 1.
i) The y variable is called the explanatory variable.
j) The line of best fit is used to predict the average value of y that can be expected to occur for
a given value of x.
k) If a point has a residual of zero, then it lies on the line of best fit.
l) If a point has a negative residual, then it lies above the line of best fit.
9. An issue of the school newspaper at a university reported the results of a survey conducted by
the office of student development on the percent of students who use their student
government representative to convey their feelings on university issues.
a) What is the population of interest in this survey?
b) In addition to recording whether the respondent uses the representative, what other variables
do you think should be measured?
c) Explain in some detail how you would take a sample for this survey.
Identify the sampling method used in the following:
d) Personal interviews of students leaving the cafeteria after lunch.
e) Sending a survey to the first name on each page of the student directory.
f) Asking students to respond to a survey in an issue of the school newspaper.
10. A clothing manufacturer would like to compare the durability of a newly designed line of
children’s clothes with that of its existing line of clothing. To do so, the company will
conduct an experiment using sets of identical twins. One of each set of twins will wear the old
clothing and the other one will wear the new clothing. The children will then be allowed to
play for a period of time, after which the clothing will be evaluated for durability.
a) Why is this an experiment?
b) What type of experiment is this?
c) Describe in detail how you would use the three principles of experimental design to ensure
that the results of this experiment are trustworthy.
CONTROL
RANDOMIZATION
REPLICATION
11. You roll two 6-sided dice. The first die has two 1s, two 2s, and two 3s on its faces. The other
has three 2s and three 3s. You will roll these dice and find the sum.
a) Show all possible outcomes for these two dice when rolled together.
b) Make a probability distribution table for the sum of these two dice.
c) What is the probability that the sum is odd?
d) What is the probability that the sum is a multiple of 3?
e) What is the mean value of this discrete random variable?
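The outcome list and probability table for question 11 can be verified by enumerating all 36 equally likely face pairs. This is a sketch using exact fractions; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Faces of the two dice described in question 11.
die_one = [1, 1, 2, 2, 3, 3]
die_two = [2, 2, 2, 3, 3, 3]

outcomes = list(product(die_one, die_two))   # 36 equally likely face pairs
counts = Counter(a + b for a, b in outcomes)
distribution = {total: Fraction(n, len(outcomes))
                for total, n in sorted(counts.items())}

p_odd = sum(p for total, p in distribution.items() if total % 2 == 1)
p_multiple_of_3 = sum(p for total, p in distribution.items() if total % 3 == 0)
expected_sum = sum(total * p for total, p in distribution.items())

for total, p in distribution.items():
    print(f"P(sum = {total}) = {p}")
print("P(odd) =", p_odd)
print("P(multiple of 3) =", p_multiple_of_3)
print("Mean =", expected_sum)
```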
12. A bag contains 5 marbles (3 are red and 2 are blue). You draw a marble out of the bag at
random and replace it. Let X = the number of marbles drawn until you get a red marble.
a) Explain why X has a Geometric Distribution.
Find the probability that...
b) It takes 4 draws to get the first red marble.
c) It takes 4 or fewer draws to get the first red marble.
d) It takes more than 3 draws to get the first red marble.
e) On average, how many draws would we expect to make to get the first red?
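The geometric probabilities in question 12 follow from P(X = k) = (1 - p)^(k-1) p with p = 3/5. A minimal sketch for checking parts (b) through (e), with illustrative function names:

```python
from fractions import Fraction

p_red = Fraction(3, 5)   # chance of red on any single draw (with replacement)
q = 1 - p_red            # chance of blue on any single draw

def geometric_pmf(k):
    """P(first red occurs on draw k): k - 1 blues, then one red."""
    return q ** (k - 1) * p_red

def geometric_cdf(k):
    """P(first red occurs on or before draw k): not all k draws are blue."""
    return 1 - q ** k

p_exactly_4 = geometric_pmf(4)
p_at_most_4 = geometric_cdf(4)
p_more_than_3 = 1 - geometric_cdf(3)
expected_draws = 1 / p_red

print("P(X = 4)  =", p_exactly_4, "=", float(p_exactly_4))
print("P(X <= 4) =", p_at_most_4, "=", float(p_at_most_4))
print("P(X > 3)  =", p_more_than_3, "=", float(p_more_than_3))
print("E(X)      =", expected_draws)
```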
13. A new television show has a 20% chance of being successful. Assume that NBC will
introduce eight new shows this spring. Let X = the number of those shows that will succeed.
a) Explain why X has a Binomial Distribution.
Find the probability that...
b) Half of the shows succeed.
c) Two or fewer of the shows succeed.
d) More than half of the shows succeed.
e) On average, how many out of eight shows would be expected to succeed?
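The binomial probabilities in question 13 use P(X = k) = C(n, k) p^k (1 - p)^(n-k) with n = 8 and p = 0.2. A minimal sketch for checking parts (b) through (e), assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` is illustrative.

```python
from math import comb

n_shows = 8       # number of new shows
p_success = 0.2   # chance that any one show succeeds

def binomial_pmf(k, n=n_shows, p=p_success):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_half = binomial_pmf(4)                                    # exactly 4 of 8
p_two_or_fewer = sum(binomial_pmf(k) for k in range(0, 3))
p_more_than_half = sum(binomial_pmf(k) for k in range(5, 9))
expected_successes = n_shows * p_success

print(f"P(X = 4)  = {p_half:.4f}")
print(f"P(X <= 2) = {p_two_or_fewer:.4f}")
print(f"P(X > 4)  = {p_more_than_half:.4f}")
print(f"E(X)      = {expected_successes}")
```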