AP Statistics – 1st Semester Final Exam Study Guide
1. You measure the age (years), weight (pounds), and marital status (single, married,
divorced, or widowed) of 1400 women. How many variables did you measure?
A) 1
B) 2
C) 3
D) 1400
E) 1403
TOP: Individuals and variables
2. X and Y are two categorical variables. The best way to determine if there is a relation between them
is to
A) construct parallel box plots of the X and Y values.
B) draw dot plots of the X and Y values.
C) make a two-way table of the X and Y values.
D) compare medians and interquartile ranges of the X and Y values.
E) compare means and standard deviations of the X and Y values.
TOP: When to use two-way tables
Scenario 1-1
A review of voter registration records in a small town yielded the following table of the number of
males and females registered as Democrat, Republican, or some other affiliation.
            Democrat   Republican   Other
Male          300         500        200
Female        600         300        100
3. Use Scenario 1-1. The proportion of registered Democrats that are male is
A) 300
B) 33
C) 0.33
D) 0.30
E) 0.15
TOP: Conditional distribution--calculation
4. Use Scenario 1-1. Your answer to the previous question is part of
A) The marginal distribution of political party registration.
B) The marginal distribution of gender.
C) The conditional distribution of gender among Democrats.
D) The conditional distribution of political party registration among males.
E) The conditional distribution of males within gender.
TOP: Conditional distribution--identification
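A study aid for Scenario 1-1: the sketch below contrasts a conditional distribution (gender among Democrats only) with a marginal distribution (gender over the whole table). The counts come from the scenario; the function names are illustrative and not part of the exam.

```python
# Counts from Scenario 1-1 (rows: gender, columns: party registration).
counts = {
    "Male":   {"Democrat": 300, "Republican": 500, "Other": 200},
    "Female": {"Democrat": 600, "Republican": 300, "Other": 100},
}

def conditional_on_column(table, column):
    """Distribution of the row variable among individuals in one column."""
    column_total = sum(row[column] for row in table.values())
    return {label: row[column] / column_total for label, row in table.items()}

def marginal_of_rows(table):
    """Distribution of the row variable over the entire table."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {label: sum(row.values()) / grand_total for label, row in table.items()}

gender_given_democrat = conditional_on_column(counts, "Democrat")
gender_marginal = marginal_of_rows(counts)

print(gender_given_democrat)  # shares of males and females among Democrats
print(gender_marginal)        # shares of males and females among all voters
```

The conditional distribution divides by the Democrat column total (900), while the marginal distribution divides by the grand total (2000), which is why the two give different proportions for the same row.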
Scenario 1-2
Below is a two-way table summarizing the number of cylinders in selected car models manufactured
in six different countries in the 1990’s.
Number of cylinders   France   Germany   Italy   Japan   Sweden   U.S.A.   Total
4                       0         4        1       6       1        7       19
5                       0         1        0       0       0        0        1
6                       1         0        0       1       1        7       10
8                       0         0        0       0       0        8        8
Total                   1         5        1       7       2       22       38
5. Use Scenario 1-2. From this table, we might conclude that
A) there is a strong association between country of origin and number of cylinders.
B) about 18% of the cars sold in the United States were manufactured in Japan.
C) these data could be more effectively presented with a box plot.
D) the only eight-cylinder cars in this data set were manufactured in Germany.
E) all the cars on Italian roads have four cylinders.
TOP: Interpret two-way table
6. The table below shows the results of the New Hampshire Democratic Presidential Primary
on January 8, 2008.
Candidate         Percentage of votes
Hillary Clinton   39
Barack Obama      37
John Edwards      17
Bill Richardson   5
Other             2
Which of the following lists of graphs are all appropriate ways of presenting these
data?
A) Bar graph, pie chart, box plot
B) Bar graph, box plot
C) Bar graph, pie chart
D) Bar graph only
E) Pie chart only
TOP: When to use bar graphs and pie charts
Scenario 1-4
For a physics course containing 10 students, the maximum point total for the quarter was 200. The point
totals for the 10 students are given in the stemplot below.
11 | 6 8
12 | 1 4 8
13 | 3 7
14 | 2 6
15 |
16 |
17 | 9
7. Use Scenario 1-4. This stemplot is most similar to
A) a histogram with class intervals 110 ≤ score < 120, 120 ≤ score < 130, etc.
B) a time plot of the data with the observations taken in increasing order.
C) a boxplot of the data.
D) reporting the 5 number summary for the data, with the mean.
E) a dot plot of the data.
TOP: Interpret stem plot
Scenario 1-7
The following is a boxplot of the birth weights (in ounces) of a sample of 160 infants born
in a local hospital.
8. Use Scenario 1-7. About 40 of the birthweights were below
A) 92 ounces.
B) 102 ounces.
C) 112 ounces.
D) 122 ounces.
E) 132 ounces.
TOP: Interpret boxplot
9. Use Scenario 1-7. The number of children with birthweights between 102 and 122
ounces is approximately:
A) 20.
B) 40.
C) 50.
D) 80.
E) 100.
TOP: Interpret boxplot
10. The first sentence in Henry James’s novel The Turn of the Screw has 62 words. The five number
summary for the lengths of those words is 1, 2, 3.5, 6, 12. According to the 1.5 x IQR rule for
identifying outliers, does this distribution have any outliers?
A) No, there are no outliers.
B) Yes, there is at least one high outlier but no low outliers.
C) Yes, there is at least one low outlier, but no high outliers.
D) Yes, there is at least one high and one low outlier.
E) There is not enough information given to determine if there are any outliers.
TOP: 1.5 x IQR rule (from 5-number summary)
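To check a 1.5 x IQR question from a five-number summary, it can help to compute the two fences explicitly. The sketch below uses the summary quoted in question 10; the function name is illustrative, and it assumes the usual convention that a value must lie strictly beyond a fence to count as an outlier.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences of the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Five-number summary from question 10: min, Q1, median, Q3, max.
minimum, q1, median, q3, maximum = 1, 2, 3.5, 6, 12

lower, upper = outlier_fences(q1, q3)

# Assumption: only values strictly outside the fences are flagged.
has_low_outlier = minimum < lower
has_high_outlier = maximum > upper

print(lower, upper, has_low_outlier, has_high_outlier)
```

Only the minimum and maximum need checking: if the most extreme values sit inside the fences, every other value does too.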
11. You can roughly locate the mean of a density curve by eye because it is
A) the point at which the curve would balance if made of solid material.
B) the point that divides the area under the curve into two equal parts.
C) the point at which the curve reaches its peak.
D) the point where the curvature changes direction.
E) the point at which the height of the graph is equal to 1.
TOP: Mean of density curve
12. You are told that your score on an exam is at the 85th percentile of the distribution of scores. This
means that
A) Your score was lower than approximately 85% of the people who took this exam.
B) Your score was higher than approximately 85% of the people who took this exam.
C) You answered 85% of the questions correctly.
D) If you took this test (or one like it) again, you would score as well as you did this time 85% of
the time.
E) 85% of the people who took this test earned the same score you did.
TOP: Percentiles
13. Jack and Jill are both enthusiastic players of a certain computer game. Over the past year, Jack’s
mean score when playing the game is 12,400 with a standard deviation of 1500. During the same
period, Jill’s mean score is 14,200, with a standard deviation of 2000. They devise a fair contest: each
one will play the game once, and they will compare z-scores. Jack gets a score of 14,000, and Jill gets a
score of 16,000. Who won the contest, and what were each of their z-scores?
A) Jack’s z = 1.07; Jill’s z = 1.11; Jill wins the contest
B) Jack’s z = 1.07; Jill’s z = 0.90; Jack wins the contest
C) Jack’s z = 0.94; Jill’s z = 1.11; Jill wins the contest
D) Jack’s z = 0.94; Jill’s z = 0.90; Jack wins the contest
E) Jack’s z = 0.81; Jill’s z = 0.99; Jill wins the contest
TOP: compare relative standing
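Comparing relative standing comes down to standardizing each score against that player's own mean and standard deviation. A minimal sketch with the numbers from question 13 (the helper name is illustrative):

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

jack_z = z_score(14_000, mean=12_400, sd=1_500)
jill_z = z_score(16_000, mean=14_200, sd=2_000)

print(round(jack_z, 2), round(jill_z, 2))
```

The raw scores alone would be misleading here, because Jill's scores are both higher on average and more variable than Jack's.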
Scenario 2-1
A sample was taken of the salaries of 20 employees of a large company. The following are the
salaries (in thousands of dollars) for this year. For convenience, the data are ordered.
28  31  34  35  37  41  42  42  42  47
49  51  52  52  60  61  67  72  75  77
Suppose each employee in the company receives a $3,000 raise for next year (each employee's salary
is increased by $3,000).
14. Use Scenario 2-1. The mean salary for the employees will
A) be unchanged.
B) increase by $3,000.
C) be multiplied by $3,000.
D) increase by $3, 000
E) increase by $150.
TOP: Impact of transformation on numerical summaries
15. Use Scenario 2-1. The standard deviation of the salaries for the employees will
A) be unchanged.
B) increase by $3,000.
C) be multiplied by $3,000.
D) increase by $3, 000
E) decrease by $3,000.
TOP: Impact of transformation on numerical summaries
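The effect of adding a constant to every observation can be checked numerically. The sketch below applies the raise to the 20 salaries of Scenario 2-1 (in thousands of dollars, so the raise is +3) using Python's standard statistics module; the variable names are illustrative.

```python
from statistics import mean, stdev

# Salaries from Scenario 2-1, in thousands of dollars.
salaries = [28, 31, 34, 35, 37, 41, 42, 42, 42, 47,
            49, 51, 52, 52, 60, 61, 67, 72, 75, 77]

# A $3,000 raise for everyone is +3 in these units.
raised = [s + 3 for s in salaries]

mean_change = mean(raised) - mean(salaries)
sd_change = stdev(raised) - stdev(salaries)

print(mean_change, sd_change)
```

Adding a constant slides the whole distribution along the axis, so measures of center move with it while measures of spread stay put.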
16. Birthweights at a local hospital have a Normal distribution with a mean of 110 oz. and a
standard deviation of 15 oz. The proportion of infants with birthweights between 125 oz. and 140
oz. is about
A) 0.136.
B) 0.270.
C) 0.477.
D) 0.636.
E) 0.819.
TOP: 68-95-99.7 rule
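For question 16, 125 oz. and 140 oz. sit one and two standard deviations above the mean, so the 68-95-99.7 rule applies directly. The sketch below compares the rule-of-thumb value with the area computed from the Normal model, using statistics.NormalDist from the Python standard library.

```python
from statistics import NormalDist

birthweight = NormalDist(mu=110, sigma=15)

# Area between +1 and +2 standard deviations from the Normal model.
model_area = birthweight.cdf(140) - birthweight.cdf(125)

# Same region from the 68-95-99.7 rule: half of what lies between the
# middle 68% and the middle 95%.
rule_of_thumb = (0.95 - 0.68) / 2

print(round(model_area, 4), round(rule_of_thumb, 4))
```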
17. Using the standard Normal distribution tables, the area under the standard Normal
curve corresponding to Z < 1.1 is
A) 0.1357.
B) 0.2704.
C) 0.8413.
D) 0.8438.
E) 0.8643.
TOP: Standard Normal Calculations
18. A soft-drink machine can be regulated so that it discharges an average of μ oz. per cup. If the
ounces of fill are Normally distributed with a standard deviation of 0.4 oz., what value should μ be set
at so that 98% of 6-oz. cups will not overflow?
A) 5.18
B) 6.00
C) 6.18
D) 6.60
E) 6.82
TOP: Inverse Normal Calculations
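An inverse Normal calculation like question 18 works backward from an area to a value: find the z-score with 98% of the area below it, then solve for the mean setting so that the 6-oz. cup size sits at that z-score. A minimal sketch using statistics.NormalDist (variable names are illustrative):

```python
from statistics import NormalDist

sigma = 0.4      # standard deviation of the fill, in ounces
cup_size = 6.0   # fills above this amount overflow

# z-score with 98% of the standard Normal area below it.
z_98 = NormalDist().inv_cdf(0.98)

# Solve cup_size = mean_setting + z_98 * sigma for the mean setting.
mean_setting = cup_size - z_98 * sigma

# Sanity check: with this setting, the chance a fill stays within the cup.
p_no_overflow = NormalDist(mean_setting, sigma).cdf(cup_size)

print(round(mean_setting, 2), round(p_no_overflow, 2))
```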
19. A study is conducted to determine if one can predict the yield of a crop based on the
amount of fertilizer applied to the soil. The response variable in this study is
A) yield of the crop.
B) amount of fertilizer applied to the soil.
C) the experimenter.
D) amount of rainfall.
E) the soil.
TOP: Explanatory/response
20. The correlation coefficient measures
A) whether there is a relationship between two variables.
B) the strength of the relationship between two quantitative variables.
C) whether or not a scatterplot shows an interesting pattern.
D) whether a cause and effect relation exists between two variables.
E) the strength of the linear relationship between two quantitative variables.
TOP: Interpreting correlation
Scenario 3-6
A researcher wishes to study how the average weight Y (in kilograms) of children changes during the
first year of life. He plots these averages versus the age X (in months) and decides to fit a least-squares
regression line to the data with X as the explanatory variable and Y as the response variable. He
computes the following quantities.
r = correlation between X and Y = 0.9
x̄ = mean of the values of X = 6.5
ȳ = mean of the values of Y = 6.6
Sx = standard deviation of the values of X = 3.6
Sy = standard deviation of the values of Y = 1.2
21. Use Scenario 3-6. The slope of the least-squares line is
A) 0.30.
B) 0.88.
C) 1.01.
D) 3.0.
E) 2.7.
TOP: Regression slope from formula
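The least-squares slope can be recovered from summary statistics alone, since slope = r × (Sy / Sx) and the line passes through the point of means. A short sketch with the Scenario 3-6 quantities (the intercept is an extra step not asked for in question 21):

```python
r = 0.9                  # correlation between X and Y
x_bar, y_bar = 6.5, 6.6  # means of X (months) and Y (kilograms)
s_x, s_y = 3.6, 1.2      # standard deviations of X and Y

slope = r * s_y / s_x

# The least-squares line passes through (x_bar, y_bar).
intercept = y_bar - slope * x_bar

print(round(slope, 2), round(intercept, 2))
```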
22. The fraction of the variation in the values of a response y that is explained by the least-squares
regression of y on x is the
A) correlation coefficient.
B) slope of the least-squares regression line.
C) square of the correlation coefficient.
D) intercept of the least-squares regression line.
E) sum of the squared residuals.
TOP: Interpret r-sq
23. Which of the following statements about influential points and outliers are true?
I. An influential point always has a high residual.
II. Outliers are always influential points.
III. Removing an influential point always causes a marked change in either the correlation,
the regression equation, or both.
A) I only.
B) II only.
C) III only.
D) II and III only.
E) I, II, and III are all true.
TOP: Influential points and outliers
Scenario 3-7
Below is a scatter plot (with the least squares regression line) for calories and protein (in grams) in
one cup of 11 varieties of dried beans. The computer output for this regression is below the plot.
Predictor   Coef      SE Coef   T      P
Constant    2.08      15.93     0.13   0.899
Calories    0.06297   0.02409   2.61   0.028
S = 3.37648   R-Sq = 43.2%   R-Sq(adj) = 36.9%
24. Use Scenario 3-7. Which of the following best describes what the number S = 3.37648 represents?
A) The slope of the regression line is 3.37648.
B) The standard deviation of the explanatory variable, calories, is 3.37648.
C) The standard deviation of the response variable, protein content, is 3.37648.
D) The standard deviation of the residuals is 3.37648.
E) The ratio of the standard deviation of protein to the standard deviation of calories is 3.37648.
TOP: Interpret s from computer output
Scenario 4-1
A sportswriter wants to know how strongly Lafayette residents support the local minor league
baseball team, the Lafayette Leopards. She stands outside the stadium before a game and interviews
the first 20 people who enter the stadium.
25. Use Scenario 4-1. The sample for the survey is
A) all residents of Lafayette.
B) all Leopard fans.
C) all people attending the game the day the survey was conducted.
D) the 20 people who gave the sportswriter their opinion.
E) the sportswriter.
TOP: Identify sample
26. A television station is interested in predicting whether voters in its viewing area are in favor of
offshore drilling. It asks its viewers to phone in and indicate whether they are in favor of or
opposed to this practice. Of the 2241 viewers who phoned in, 1574 (70%) were opposed to offshore
drilling. The viewers who phoned in are
A) a voluntary response sample.
B) a convenience sample.
C) a probability sample.
D) a population.
E) a simple random sample.
TOP: You figure it out!
27. A simple random sample of size n is defined to be
A) a sample of size n chosen in such a way that every unit in the population has the same chance
of being selected.
B) a sample of size n chosen in such a way that every unit in the population has a known
nonzero chance of being selected.
C) a sample of size n chosen in such a way that every set of n units in the population has an
equal chance to be the sample actually selected.
D) a sample of size n chosen in such a way that each selection is made independent of every
other selection.
E) all of the above. They are essentially identical definitions.
TOP: SRS definition
Scenario 4-3
We wish to choose a simple random sample of size three from the following employees of a
small company. To do this, we will use the numerical labels attached to the names below.
1. Bechhofer   4. Kesten    7. Taylor
2. Brown       5. Kiefer    8. Wald
3. Ito         6. Spitzer   9. Weiss
We will also use the following list of random digits, reading the list from left to right, starting at
the beginning of the list.
11793 20495 05907 11384 44982 20751 27498 12009 45287 71753 98236 66419 84533
28. Use Scenario 4-3. Which of these statements about the table of random digits is true?
A) Every row must have exactly the same number of 0's and 1's.
B) In the entire table, there are exactly the same number of 0's and 1's.
C) If you look at 100 consecutive pairs of digits anywhere in the table, exactly 1 pair is 00.
D) All of these are true.
E) None of these is true.
TOP: Idea of random digits table
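The selection procedure described in Scenario 4-3 can be sketched in code. This version assumes the usual textbook convention: read one digit at a time, skip any digit that is not a valid label (here, 0), and skip labels already chosen. The function name is illustrative.

```python
names = {1: "Bechhofer", 2: "Brown", 3: "Ito", 4: "Kesten", 5: "Kiefer",
         6: "Spitzer", 7: "Taylor", 8: "Wald", 9: "Weiss"}

random_digits = ("11793 20495 05907 11384 44982 20751 27498 "
                 "12009 45287 71753 98236 66419 84533")

def choose_srs(digit_string, labels, size):
    """Read digits left to right and return the first `size` distinct labels."""
    chosen = []
    for ch in digit_string:
        if not ch.isdigit():
            continue  # ignore the spaces between groups of five
        label = int(ch)
        # Assumption: skip digits that are not labels, and skip repeats.
        if label in labels and label not in chosen:
            chosen.append(label)
        if len(chosen) == size:
            break
    return chosen

sample_labels = choose_srs(random_digits, names, size=3)
sample_names = [names[label] for label in sample_labels]
print(sample_labels, sample_names)
```

Single digits suffice here because there are only nine employees; a larger list would need two-digit labels.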
29. To determine the proportion of each color of Peanut Butter M&M, you buy ten 1.69-ounce
packages and count how many there are of each color. This is an example of
A) simple random sampling
B) cluster sampling
C) multistage sampling
D) stratified random sampling
E) systematic random sampling
TOP: What kind of sampling?
30. A 1992 Roper poll found that 22% of Americans say that the Holocaust may not have happened.
The actual question asked in the poll was
“Does it seem possible or impossible to you that the Nazi extermination of the Jews never
happened?”
and 22% responded possible. The results of this poll cannot be trusted because
A) undercoverage is present. Obviously, those people who did not survive the Holocaust could not
be in the poll.
B) the question is worded in a confusing manner.
C) we do not know who conducted the poll or who paid for the results.
D) nonresponse is present. Many people will refuse to participate, and those who do will be biased
in their opinions.
E) the question is clearly biased in the direction of a "possible" answer.
TOP: No grasshopper, you tell me!
31. The essential difference between an experiment and an observational study is that
A) observational studies may have confounded variables, but experiments never do.
B) in an experiment, people must give their informed consent before being allowed to participate.
C) observational studies are always biased.
D) observational studies cannot have response variables.
E) an experiment imposes treatments on the subjects, but an observational study does not.
32. A study of elementary school children, ages 6 to 11, finds a high positive correlation between
shoe size x and score y on a test of reading comprehension. The observed correlation is most likely
due to
A) the effect of a lurking variable, such as age.
B) a mistake, since the correlation must be negative.
C) cause and effect (larger shoe size causes higher reading comprehension).
D) "reverse" cause and effect (higher reading comprehension causes larger shoe size).
E) several outliers in the data set.
TOP: This looks very difficult
Scenario 5-1
To simulate a toss of a coin we let the digits 0, 1, 2, 3, and 4 correspond to a head and the digits 5, 6, 7,
8, and 9 correspond to a tail. Consider the following game: We are going to toss the coin until we
either get a head or we get two tails in a row, whichever comes first. If it takes us one toss to get the
head we win $2, if it takes us two tosses we win $1, and if we get two tails in a row we win nothing. Use the
following sequence of random digits to simulate this game as many times as possible:
12975 13258 45144
33. Use Scenario 5-1. Based on your simulation, the estimated probability of winning $2 in this
game is
A) 1/4.
B) 5/15.
C) 7/15.
D) 9/15.
E) 7/11.
TOP: Simulation to estimate probability
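The bookkeeping in the Scenario 5-1 simulation is easy to get wrong by hand, so here is one way to step through the digits mechanically. The function name is illustrative; the rules coded are the ones stated in the scenario (digits 0 to 4 are heads, 5 to 9 are tails, and a game ends at the first head or the second tail).

```python
from fractions import Fraction

def play_games(digits):
    """Play the game repeatedly and return the winnings of each completed game."""
    winnings = []
    tosses = 0  # tosses used so far in the current game
    for d in digits:
        tosses += 1
        if d <= 4:                 # head: the game ends with a win
            winnings.append(2 if tosses == 1 else 1)
            tosses = 0
        elif tosses == 2:          # second tail in a row: the game ends with $0
            winnings.append(0)
            tosses = 0
    return winnings

digits = [int(ch) for ch in "12975 13258 45144" if ch.isdigit()]
winnings = play_games(digits)

# Estimated probability of winning $2 = (games won on the first toss) / (games played).
estimate = Fraction(winnings.count(2), len(winnings))
print(winnings, estimate)
```

Note that the denominator is the number of games played, not the number of digits used, since some games consume two digits.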
34. I select two cards from a deck of 52 cards and observe the color of each (26 cards in the deck are
red and 26 are black). Which of the following is an appropriate sample space S for the possible
outcomes?
A) S = {red, black}
B) S = {(red, red), (red, black), (black, red), (black, black)}, where, for example, (red, red) stands
for the event "the first card is red and the second card is red."
C) S = {(red, red), (red, black), (black, black)}, where, for example, (red, red) stands for the
event "the first card is red and the second card is red."
D) S = {0, 1, 2}.
E) All of the above.
TOP: Sample space
35. A stack of four cards contains two red cards and two black cards. I select two cards, one at a
time, and do not replace the first card selected before selecting the second card. Consider the events
A = the first card selected is red
B = the second card selected is red
The events A and B are
A) independent and disjoint.
B) not independent, but disjoint.
C) independent, not disjoint
D) not independent, not disjoint.
E) independent, but we can’t tell whether they are disjoint without further information.
TOP: Independent and mutually exclusive events
36. Event A occurs with probability 0.3, and event B occurs with probability 0.4. If A and B
are independent, we may conclude that
A) P(A and B) = 0.12.
B) P(A|B) = 0.3.
C) P(B|A) = 0.4.
D) all of the above.
E) none of the above.
TOP: ?? formula
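For independent events, the multiplication rule and the conditional probability formula fit together in a way that is quick to verify numerically. A small sketch with the probabilities from question 36 (variable names are illustrative):

```python
p_a, p_b = 0.3, 0.4

# Multiplication rule, valid here because A and B are stated to be independent.
p_a_and_b = p_a * p_b

# Conditional probability formula: P(A|B) = P(A and B) / P(B).
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

print(round(p_a_and_b, 2), round(p_a_given_b, 2), round(p_b_given_a, 2))
```

Under independence, conditioning on one event leaves the probability of the other unchanged, which is what the last two values show.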
Scenario 5-8
A student is chosen at random from the River City High School student body, and the following
events are recorded:
M = The student is male
F = The student is female
B = The student ate breakfast that morning.
N = The student did not eat breakfast that morning.
The following tree diagram gives probabilities associated with these events.
37. Use Scenario 5-8. What is the probability that the selected student is a male and ate
breakfast?
A) 0.32
B) 0.40
C) 0.50
D) 0.64
E) 0.80
TOP: Probabilities from tree diagram
Scenario 5-10
The Venn diagram below describes the proportion of students who take chemistry and
Spanish at Jefferson High School, where A = Student takes chemistry and B = Student takes
Spanish.
Suppose one student is chosen at random.
38. Use Scenario 5-10. Find the value of P A  B  and describe it in words.
A) 0.1; The probability that the student takes both chemistry and Spanish.
B) 0.1; The probability that the student takes either chemistry or Spanish, but not both.
C) 0.5; The probability that the student takes either chemistry or Spanish, but not both.
D) 0.6; The probability that the student takes either chemistry or Spanish, or both.
E) 0.6; The probability that the student takes both chemistry and Spanish.
TOP: Venn diagrams
Scenario 5-11
The following table compares the hand dominance of 200 Canadian high-school students and
what methods they prefer using to communicate with their friends.
               Cell phone/Text   In person   Online   Total
Left-handed          12              13         9       34
Right-handed         43              72        51      166
Total                55              85        60      200
Suppose one student is chosen randomly from this group of 200.
39. Use Scenario 5-11. If you know the person that has been randomly selected is left-handed, what is
the probability that they prefer to communicate with friends in person?
A) 0.065
B) 0.153
C) 0.17
D) 0.382
E) 0.53
TOP: ?? probability from 2-way table
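A conditional probability from a two-way table restricts attention to one row (or column) and divides by that row's total rather than the grand total. A minimal sketch for question 39, with the Scenario 5-11 counts (variable names are illustrative):

```python
# Counts from Scenario 5-11 for the left-handed row only.
left_handed = {"Cell phone/Text": 12, "In person": 13, "Online": 9}

left_total = sum(left_handed.values())

# P(in person | left-handed): the denominator is the left-handed row total.
p_in_person_given_left = left_handed["In person"] / left_total

# For contrast, P(left-handed and in person) uses all 200 students.
p_left_and_in_person = left_handed["In person"] / 200

print(round(p_in_person_given_left, 3), round(p_left_and_in_person, 3))
```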
40. An ecologist studying starfish populations collects the following data on randomly-selected 1-meter
by 1-meter plots on a rocky coastline.
--The number of starfish in the plot.
--The total weight of starfish in the plot.
--The percentage of area in the plot that is covered by barnacles (a popular food for starfish).
--Whether or not the plot is underwater midway between high and low tide.
How many of these measurements can be treated as continuous random variables and how many
as discrete random variables?
A) Three continuous, one discrete.
B) Two continuous, two discrete.
C) One continuous, three discrete.
D) Two continuous, one discrete, and a fourth that cannot be treated as a random variable.
E) One continuous, two discrete, and a fourth that cannot be treated as a random variable.
TOP: Continuous vs. Discrete random variables
Scenario 6-5
A small store keeps track of the number X of customers that make a purchase during the first hour that
the store is open each day. Based on the records, X has the following probability distribution.
X      0     1     2     3     4
P(X)   0.1   0.1   0.1   0.1   0.6
41. Use Scenario 6-5. The mean number of customers that make a purchase during the first hour that
the store is open is
A) 2.0.
B) 2.5.
C) 2.9.
D) 3.0.
E) 4.0.
TOP: Mean of Discrete Random Variable
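The mean of a discrete random variable is the probability-weighted sum of its values. The sketch below applies that definition to the Scenario 6-5 distribution and first confirms the probabilities form a legitimate distribution (variable names are illustrative).

```python
import math

# Probability distribution of X from Scenario 6-5.
distribution = {0: 0.1, 1: 0.1, 2: 0.1, 3: 0.1, 4: 0.6}

# A valid distribution must have probabilities that sum to 1.
total_probability = sum(distribution.values())
is_valid = math.isclose(total_probability, 1.0)

# Mean = sum of (value x probability) over all values.
mean_x = sum(x * p for x, p in distribution.items())

print(is_valid, round(mean_x, 2))
```

Because most of the probability sits at X = 4, the mean is pulled well above the midpoint of the possible values.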
Scenario 6-8
Let the random variable X represent the profit made on a randomly selected day by a certain
store. Assume X is Normal with a mean of $360 and standard deviation $50.
42. Use Scenario 6-8. The value of P(X > 400) is
A) 0.2881.
B) 0.8450.
C) 0.7881.
D) 0.2119.
E) 0.1600.
TOP: Normal random variable probability
43. An airplane has a front and a rear door that are both opened to allow passengers to exit when the
plane lands. The plane has 100 passengers seated. The number of passengers exiting through the front
door should have
A) a binomial distribution with mean 50.
B) a binomial distribution with 100 trials but success probability not equal to 0.5.
C) a geometric distribution with p = 0.5.
D) a normal distribution with a standard deviation of 5.
E) none of the above.
TOP: Binomial/Geometric setting
Scenario 6-14
A worn out bottling machine does not properly apply caps to 5% of the bottles it fills.
44. Use Scenario 6-14. If you randomly select 20 bottles from those produced by this machine, what is
the approximate probability that exactly 2 caps have been improperly applied?
A) 0.0002
B) 0.19
C) 0.74
D) 0.81
E) 0.92
TOP: Which probability?
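Question 44 fits a binomial setting if we assume the 20 bottles are capped independently, each with a 5% chance of a bad cap. Under that assumption, the probability of exactly k bad caps follows the binomial formula, sketched here with math.comb from the standard library (the function name is illustrative).

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumption: bottles are independent, each with a 0.05 chance of a bad cap.
p_exactly_two = binomial_pmf(2, n=20, p=0.05)

print(round(p_exactly_two, 2))
```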
Scenario 6-16
A poll shows that 60% of the adults in a large town are registered Democrats. A newspaper reporter
wants to interview a local Democrat regarding a recent decision by the City Council.
45. Use Scenario 6-16. If the reporter asks adults on the street at random, what is the probability that
he will find a Democrat by the time he has stopped three people?
A) 0.936
B) 0.216
C) 0.144
D) 0.096
E) 0.064
TOP: Which probability?
46. Use Scenario 6-16. On average, how many people will the reporter have to stop before he finds
his first Democrat?
A) 1
B) 1.33
C) 1.67
D) 2
E) 2.33
TOP: Who knows? Right?
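Both Scenario 6-16 questions describe a geometric setting: independent stops, each with probability 0.6 of finding a Democrat, counted until the first success. A short sketch of the two standard results, assuming the people stopped are independent of one another (variable names are illustrative):

```python
p = 0.6  # probability that a randomly stopped adult is a registered Democrat

# Success within the first three stops = 1 - P(all three are failures).
p_within_three = 1 - (1 - p) ** 3

# Mean of a geometric random variable (number of stops up to and
# including the first success) is 1 / p.
expected_stops = 1 / p

print(round(p_within_three, 3), round(expected_stops, 2))
```

Working through the complement avoids adding up the separate probabilities for success on the first, second, and third stop.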
47. At a school with 600 students, 25% of them walk to school each day. If we choose a random sample
of 40 students from the school, is it appropriate to model the number of students in our sample who walk
to school with a binomial distribution where n = 40 and p = 0.25?
A) No, the appropriate model is a geometric distribution with n = 40 and p = 0.25.
B) No, it is never appropriate to use a binomial setting when we are sampling
without replacement.
C) Yes, because the sample size is less than 10% of the population size.
D) Yes, because 0.2 ≤ p ≤ 0.8 and n < 30.
E) We can’t determine whether a binomial distribution is appropriate unless the number of
trials is known.
TOP: Binomial setting and sampling