Prob Stat Midterm review

Name:
Standard 1 - Variables

Identify observational units (cases) and variables, and distinguish types of variables as
categorical or quantitative
On July 15, 2004, the Harris Poll released the results of a study asking whether people favored or
opposed abolishing the penny. Of a national sample of 2136 adults, 59% opposed abolishing the
penny.
1.
Which of the following is a categorical variable in the Harris Poll?
(a) the 2004 participants
(b) whether each person favors or opposes abolishing the penny.
(c) whether or not a person responded to the poll
(d) the percent of people who oppose abolishing the penny
2.
What are the observational units in the study?
(a) the number of people who would abolish the penny in the entire population
(b) the number of people who would abolish the penny in the sample
(c) the people who responded to the poll
(d) the percent of people who oppose abolishing the penny
3.
Suppose that the observational units in a study are Pennsylvania high schools. Which of
the following is not a valid variable?
(a.) whether or not the school has an animal for its mascot
(b.) proportion of students scoring proficient or better on the PSSA at each school
(c.) total number of students at each school
(d.) number of high schools in Pennsylvania which have indoor pools
Standard 2 - Sampling

Distinguish between populations and samples and between parameters and statistics
4.
Suppose that 80% of all American college students send a card to their mother on Mother's Day
and that you selected a simple random sample of 400 American college students to
determine the proportion of them who send a card to their mother on Mother's Day.
Suppose further that in the random sample of 400 students, 300 or 75% of them send a
card to their mothers.
(a.) 80% is a parameter, 75% is a statistic.
(b.) 75% is a parameter, 80% is a statistic.
(c.) Both 75% and 80% represent statistics.
(d.) Both 75% and 80% represent parameters.
5.
A sample is:
(a.) a number resulting from the manipulation of raw data according to specified rules.
(b.) a subset of a population.
(c.) a characteristic of a population which is measurable.
(d.) a complete set of individuals, objects, or measurements having some common
observable characteristic.
For questions 6 and 7 use the following situation:
Suppose that 80% of all American college students send a card to their mother on
Mother's Day and that you selected a simple random sample of 400 American college
students to determine the proportion of them who send a card to their mother on
Mother's Day. Suppose further that in the random sample of 400 students, 300 or 75%
of them send a card to their mothers.
6.
Which of the following is true?
(a.) the 400 college students are the population
(b.) the sample size is 400
(c.) all college students in the world are the sample
(d.) the sample is the 300 students who send a card
7.
Which of the following is the parameter of interest?
(a.) the proportion of American college students who send a card on Mother’s Day
(b.) the proportion of the 400 students in the sample who send a card
(c.) the proportion of all college students in the world who send a card
(d.) the 300 students in the sample who sent a card
8.
Suppose we are interested in the average reading achievement test
score of the currently enrolled students in Edison Elementary School.
The tests would be the observational units. The test scores would be ________ while the
average score of all students in one teacher’s class is __________.
(a.) an observational unit, a sample
(b.) a sample, a parameter
(c.) a variable, a statistic
(d.) a population, a variable
Standard 3 - SRS

Carry out a simple random sample
9.
Which one of the following would not be an appropriate method to take a random sample
of 10 people from a numbered list of 200 names?
(a.) Use a random number table.
(b.) Write the names on slips of papers, mix well, and select 10 slips of paper.
(c.) Pick 5 male and 5 female names without really thinking about it.
(d.) Use a computer to generate 10 random numbers between 1 and 200.
Below is a list of names numbered 1 to 20. Use the random number table to randomly select 5
names from the list by starting at the beginning of the table and taking pairs of digits.
 1  Sofia        11  Dara
 2  Eassa        12  Jay
 3  Jeffrey      13  Nicole
 4  Shakoya      14  Francis
 5  John         15  Audrey
 6  Rebecca      16  Anthoula
 7  William      17  Hiep
 8  Johanna      18  Sean
 9  Allyson      19  Shanira
10  Brandon      20  Alexis
Table of random digits
11035 61298 32134 10012 99091
67743 11123 45672 04567 00998
10.
What is the second name selected?
(a.) Dara
(b.) Jeffrey
(c.) Jay
(d.) Allyson
11.
What is the fifth name selected?
(a.) Dara
(b.) Jeffrey
(c.) Jay
(d.) Allyson
12.
We wish to draw a sample of size 5 without replacement from a population of 50
households. Suppose the households are numbered 01, 02, . . . , 50, and suppose that the
relevant line of the random number table is:
11362 35692 96237 90842 46843 62719 64049 17823.
Then the households selected are:
(a) households 11 13 36 62 73
(b) households 11 36 23 08 42
(c) households 11 36 23 23 08
(d) households 11 36 23 56 92
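The table-reading procedure used in questions 10 to 12 can be sketched in Python. This is only an illustration under stated assumptions: labels are read left to right as two-digit pairs, and any pair that is out of range or already chosen is skipped. The function name and its parameters are hypothetical and not part of the course materials.

```python
def sample_from_digit_table(digits, population_size, sample_size, width=2):
    """Read fixed-width labels from a random digit table, left to right.

    Labels outside 1..population_size and labels already chosen are skipped,
    which gives a sample drawn without replacement.
    """
    # Drop the spaces that separate the groups of five digits.
    stream = "".join(ch for ch in digits if ch.isdigit())
    chosen = []
    for start in range(0, len(stream) - width + 1, width):
        label = int(stream[start:start + width])
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
            if len(chosen) == sample_size:
                break
    return chosen


# The line of random digits given in question 12 (households numbered 01 to 50).
households = sample_from_digit_table(
    "11362 35692 96237 90842 46843 62719 64049 17823",
    population_size=50,
    sample_size=5,
)
print(households)
```

The same function can be pointed at the table of random digits for the list of 20 names by passing `population_size=20`.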
Standard 4 - Sampling Errors

Critique sampling done by others.

Recognize bias in poor sampling methods and that random sampling is an unbiased
method
13.
Which best describes an SRS?
(a.) Gives every member of the population an equal chance of being selected.
(b.) Gives every member of the sample an equal chance of being selected.
(c.) Gives every different sample size an equal chance of being selected.
(d.) Gives every different population an equal chance of being selected.
14.
A grocer receives a shipment of apples in a large crate. In order to determine the
proportion of apples that are bruised in the large crate of apples, she examines a sample
of 20 apples taken from the top of the crate and notes that 5% are bruised. Her sampling
method could be said to be biased because…
(a.) the apples in this crate may have been damaged during shipment
(b.) this crate of apples may not be representative of all such crates
(c.) she only looked at 20 apples, her sample is too small
(d.) the proportion of apples that are bruised in the sample using only top apples is likely
to be smaller than the proportion of apples that are bruised in the entire
crate.
15.
A simple random sample is a good way to get a useful sample because
(a) the sample is always a good representation of the whole population
(b) this method of sampling will result in unbiased estimates
(c) it’s very simple to just pick whatever subset of the population you want
(d) all of the above are true.
(e) none of the above are true.
16.
A survey was conducted by visiting the Haverford High School parking lot to estimate
the proportion of cars that were red. The spaces were numbered and a random sample of
spaces was selected. Which of the following is NOT correct?
(a) If the sampled stall was empty, we can simply choose another stall, at random, to take
its place because it is not likely that the stall being vacant is related to a car being red.
(b) The sample would produce an unbiased estimate of the true proportion of red cars
(c) Even though a random sample was taken from cars in the parking lot, the sample may
not be representative of the cars parked at Haverford High School because proportions
calculated through random samples can be biased
(d) If another sample of cars was chosen, it is likely that a different proportion of cars
that are red would be obtained.
17.
A properly conducted random survey selected 1000 Canadians (from a total population of
about 30 million) and 1000 Americans (from a total population of about 300 million).
Which of the following is FALSE?
(a) Randomization ensures that both samples will result in unbiased estimates of
characteristics of their respective populations.
(b) The precision is determined by the ratio of the sample size to the total population size.
(c) A smaller proportion of the American population has been chosen. Therefore, a
particular person has a smaller chance of being selected in America than in Canada.
(d) Random digit dialing to select people for the survey by telephone could induce biases
in the results if the characteristic of interest for the survey is related to income.
Standard 5 - Studies



Distinguish between observational studies and experimental studies
Understand the different types of conclusions that can be drawn from each
Recognize that in an observational study (e.g. a survey), an observation may be best
explained by random sampling variability
18.
You are concerned that your employees have little saved for retirement. You conduct a
survey of your 100,000 employees using a simple random sample of size 47. You find
that the mean of the savings of this sample of employees is $40,000 with standard
deviation of $3,000.
This is an example of an…
(a) observational study since subjects are randomly assigned to groups
(b) observational study since it is based on taking a sample of a population without
intervening
(c) experiment since subjects are randomly assigned to groups
(d) experiment since it is based on taking a sample of a population without
intervening
19.
Researchers concerned about the effects of excessive television viewing on students'
school performance are planning to conduct a study. Which of the following is true?
(a.) This study would be considered an observational study if they assigned one group of
students to watch 4 hours of television each day and another group to not watch any
television.
(b.) This study would be considered biased if they assigned one group of students to
watch 4 hours of television each day and another group to not watch any TV.
(c.) This study would be considered an experiment if they assigned one group of students
to watch 4 hours of television each day and another group to not watch any TV.
(d.) This study would be considered an experiment if they took a random sample of
students, recorded their grades and the number of hours of television they watched.
20.
Suppose that 80% of all American college students send a card to their mother on Mother's Day
and that you selected a simple random sample of 400 American college students to
determine the proportion of them who send a card to their mother on Mother's Day.
Suppose further that in the random sample of 400 students, 300 or 75% of them send a
card to their mothers. This is an example of an…
(a) observational study since subjects are randomly assigned to groups
(b) observational study since it is based on taking a sample of a population without
intervening
(c) experiment since subjects are randomly assigned to groups
(d) experiment since it is based on taking a sample of a population without
intervening
21.
A survey is to be undertaken of recent nursing graduates in order to compare the starting
salaries of women and men. For each graduate, three variables are to be recorded (among
others): sex, starting salary, and area of specialization.
(a) Sex and starting salary are explanatory variables; area of specialization is a response
variable
(b) Sex is an explanatory variable; starting salary and area of specialization are response
variables.
(c) Sex is an explanatory variable; starting salary is a response variable; area of
specialization is a possible confounding variable
(d) Sex is a response variable; starting salary is an explanatory variable; area of
specialization is a possible confounding variable
(e) Sex and area of specialization are response variables; starting salary is an explanatory
variable.
Standard 6 - Experimental Design


Understand principles of control including comparison, replication, randomization,
blindness, and blocking
Carry out a well-controlled experiment
22.
A study was conducted to see if Smartfood Popcorn makes people smarter. A group of 50
participants in the study were divided into two groups. One group received Smartfood
Popcorn before taking a spelling test, and the other took the test without first getting
popcorn. The control group in an experiment should be designed to receive:
(a.) the opposite of the experiences afforded the experimental group.
(b.) the experiences afforded the experimental group except for the treatment under
examination.
(c.) the experiences afforded the experimental group except for receiving the treatment at
random.
(d.) the experiences which constitute an absence of the experiences received by the
experimental group.
23.
The Smartfood experiment would be said to take into account the principle of blindness
if ___________, and it could be said to be double-blind if _____________.
(a.) the subjects are randomly assigned to either eat Smartfood or not;
those evaluating the subjects are blindfolded
(b.) the subjects are not aware of which treatment group they are in;
those evaluating the subjects are not aware of which treatment group received Smartfood
(c.) the subjects are selected at random from the population;
those evaluating the subjects are not aware of which treatment group the subjects are in
(d.) the subjects are not aware of which treatment group they are in;
the two treatment groups never come in contact
24.
An experiment is conducted to determine if the use of certain specified amounts of a drug
will increase the IQ scores for students in the fifth grade.
In this experiment, IQ serves as:
(a.) a response variable
(b.) an explanatory variable
(c.) a placebo variable
(d.) a control variable
25.
What is a placebo?
(a.) an experimental treatment
(b.) a control treatment
(c.) a parameter
(d.) a statistic
26.
A new headache remedy was given to a group of 25 subjects who had headaches. Four
hours after taking the new remedy, 20 of the subjects reported that their headaches had
disappeared. From this information you conclude:
(a) that the remedy is effective for the treatment of headaches.
(b) nothing, because the sample size is too small.
(c) nothing, because there is no control group for comparison.
(d) that the new treatment is better than aspirin.
27.
Which of the following is NOT a reason that subjects should be assigned to treatments at
random?
(a) to get a random sample of the population of interest
(b) to eliminate the potential for researchers to influence the results
(c) to create experimental groups that are similar
(d) so that the effects of variables that were not measured will likely be balanced out
between the experimental groups
Standard 7 - Confounding

Critique experiments performed by others by suggesting possible confounding (lurking)
variables not accounted for in a study

Understand the difference between an association and a cause and effect relationship.
Recognize that in a well-controlled experiment, an observed difference may be due to
the pre-existing differences in the groups created by random assignment of subjects to
treatments.

28.
Researchers have observed that drinking red wine seems to lead to fewer men having
heart attacks. More recently, others have noted that drinking red wine leads to
headaches and people with headaches tend to take aspirin. Furthermore, aspirin is
known to reduce the chances of having heart attacks. Given these facts, the relationship
between drinking red wine and having heart attacks would be best described as being due
to:
(a.) cause-and-effect.
(b.) strong correlation.
(c.) a lurking variable.
(d.) placebo effect.
29.
There is a relationship between the number of drownings and ice
cream sales. This is an example of an association likely caused by:
(a) coincidence
(b) the fact that ice cream causes drownings
(c) confounding or lurking variable
(d) the fact that drowning causes people to eat ice cream
30.
An experiment was designed to investigate the effect of the amount of water and seed
variety upon subsequent growth of plants. Each plant was potted in a clay pot, and a
measured amount of water was given weekly. The plants that were assigned to receive
more water had all been placed closer to a window than the ones that received less water.
The height of the plant at the end of the experiment was measured. Which of the
following is not correct?
(a) The response variable is the plant height.
(b) The explanatory variables are the amount of water and seed variety.
(c) The seeds should be randomly selected from the population of all seed varieties rather
than randomly assigned to receive more or less water.
(d) The effect of the amount of water was confounded by the effect of being near the
window
Standard 8 - Interpreting displays of distributions

Determine the overall pattern of variability (distribution) for a variable by describing the
center (tendency), spread (consistency), and shape of a distribution verbally in context

Identify departures from the overall pattern of a distribution (outliers)
Understand the relationship between the shape of a distribution and the relative position
of the mean and median
Be able to interpret various displays of a distribution or distributions (pie chart, bar
graph, dotplot, stemplot, histogram, boxplots)
31.
Which of the following best describes an outlier?
(a) The largest or smallest number in a distribution
(b) An observation that doesn’t fit in with the overall pattern of variability
(c) Any really big number is an outlier
(d) An unusually tall peak in the distribution of a variable
32.
The dotplots display the February temperatures for three cities. Because of the shapes of
their distributions we should expect the mean of the February temperatures for…
(a) Lincoln to be about the SAME as the median for Lincoln while the mean of the
temperatures for Sedona will be LOWER than the median for Sedona
(b) Lincoln to be about the SAME as the median for Lincoln while the mean of the
temperatures for Sedona will be HIGHER than the median for Sedona
(c) Lincoln to be HIGHER than the median for Lincoln while the mean of the
temperatures for Sedona will be LOWER than the median for Sedona
(d) Lincoln to be LOWER than the median for Lincoln while the mean of the
temperatures for Sedona will be HIGHER than the median for Sedona
33.
The histogram displays the percent of overweight adults in each state. Which of the
following is NOT true?
(a) The distribution is symmetric
(b) In a typical state about 37% of the people are overweight
(c) 14 states have less than 36% overweight
(d) 2 states have 11% overweight
Standard 10 - Comparing Distributions


Be able to create displays for comparing distributions of two or more variables (back to
back stemplots, stacked dotplots, and modified boxplots)
Compare and contrast distributions
The following side-by-side boxplots represent the
rushing yards gained by the starting running backs in the
opening game. Compare and contrast their performance.
34.
Carson runs further than 5 yards about
what percent of the time?
(a) 15%
(b) 25%
(c) 50%
(d) 75%
35.
_____ tends to run further. _____ is more consistent.
(a) Asika; Asika
(b) Asika; Carson
(c) Carson; Asika
(d) Carson; Carson
36.
Based on the dotplots of February temperatures for three cities, which of the following
is NOT true?
(a) San Luis Obispo experienced generally higher temperatures than Sedona and Lincoln
(b) The distribution of temperatures for Sedona is skewed toward lower values
(c) The city with the most consistent temperatures was Sedona
(d) The temperatures in Lincoln tend to be higher than those in Sedona
Standard 11 - Measures of center



Compute and interpret numerical measures of the center of a distribution including mean,
median, and mode.
Apply the concept of resistance to measures of center
Determine which measures of center apply for different situations
37.
The measure of spread which is resistant to extreme scores on the higher or lower end of
a distribution is the:
(a) median.
(b) mean.
(c) standard deviation.
(d) IQR
38.
Making the largest number in a data set much larger will increase the ______ but not
change the _________.
(a) median; mean
(b) mean; standard deviation
(c) standard deviation; IQR
(d) IQR; median
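Resistance can be checked numerically. The sketch below uses a small made-up data set (an assumption, not data from this review) and replaces only its largest value with a much larger one. The IQR here is computed with the median-of-halves convention; other software conventions can give slightly different quartiles.

```python
import statistics


def iqr(values):
    """IQR by median-of-halves (the middle value is left out when n is odd)."""
    data = sorted(values)
    half = len(data) // 2
    lower, upper = data[:half], data[len(data) - half:]
    return statistics.median(upper) - statistics.median(lower)


def summary(values):
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "iqr": iqr(values),
    }


original = [2, 4, 5, 6, 7, 9, 10]      # hypothetical scores
stretched = original[:-1] + [100]      # only the largest value changes

before, after = summary(original), summary(stretched)
for name in before:
    print(f"{name:>6}: {before[name]:8.2f} -> {after[name]:8.2f}")
```

In this example the mean and standard deviation move while the median and IQR stay put, which is what "resistant" means.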
39.
Which of the following is not a measure of center?
(a.) mean
(b.) median
(c.) mode
(d.) standard deviation
40.
What is the sample mean of the following sample?
X    Frequency of X
2    1
3    2
4    3
(a.) 3
(b.) 2
(c.) 20
(d.) 3.33
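A mean from a frequency table weights each value by how often it occurs. The sketch below assumes the table in question 40 pairs X = 2, 3, 4 with frequencies 1, 2, 3; the function name is illustrative.

```python
def mean_from_frequencies(table):
    """Mean of a sample given as {value: frequency}."""
    total = sum(value * count for value, count in table.items())
    n = sum(table.values())          # sample size is the sum of the frequencies
    return total / n


frequencies = {2: 1, 3: 2, 4: 3}     # assumed reading of the flattened table
print(round(mean_from_frequencies(frequencies), 2))
```

Note that the divisor is the number of observations (6), not the number of distinct values (3).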
Standard 12 - Statistical Tendency

Understand the concept of statistical tendency
Summary statistics:
Column          Median   Min   Max   Q1   Q3
Husband's Age   30.5     19    71    25   44.5
Wife's Age      29       16    73    24   41.5
41.
Based on the 5-number summaries for the ages of a group of husbands and wives at
marriage, which of the following is NOT true?
(a.) the husbands tend to be older than the wives
(b.) every husband is older than his wife
(c.) the youngest person was a wife
(d.) the oldest person was a wife
42.
To say that 5th graders tend to be taller than 4th graders is to say that…
(a.) every child in 5th grade is taller than every child in 4th grade.
(b.) almost every child in 5th grade is taller than almost every child in 4th grade.
(c.) if you select a 4th grader and a 5th grader at random, more often than not, the 5th
grader will be taller
(d.) most 4th graders are not very tall.
43.
The dotplot to the right compares some systolic and diastolic blood pressure
measurements. Which of the following is NOT true?
(a.) systolic blood pressure tends to be higher than diastolic blood pressure
(b.) systolic blood pressure readings tend to be above 100
(c.) every systolic reading is higher than every diastolic reading
(d.) diastolic blood pressure readings tend to be below 100
Standard 13 - Measures of spread



Compute and interpret measures of spread for a distribution including IQR, standard
deviation, and range
Apply the concept of resistance to measures of spread
Determine which measures of spread apply for different types of situations
44.
If you are told a population has a mean of 25 and a standard deviation of 0, what must
you conclude?
(a.) Someone has made a mistake.
(b.) There is only one element in the population.
(c.) There are no elements in the population.
(d.) All the elements in the population are 25.
45.
The measure of spread which is sensitive (not resistant) to extreme scores on the higher
or lower end of a distribution is the:
(a.) median.
(b.) mean.
(c.) standard deviation.
(d.) IQR
46.
Calculate the range and the IQR for the data set: {1,2,3,4,5,6,7,8}
(a.) 7 and 4
(b.) 8 and 3
(c.) 8 and 1
(d.) 6 and 5
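The range and IQR for question 46 can be checked with a short sketch. It assumes the median-of-halves convention for quartiles (Q1 is the median of the lower half, Q3 the median of the upper half); software that interpolates quartiles may report a slightly different IQR.

```python
import statistics


def data_range(values):
    return max(values) - min(values)


def iqr(values):
    """IQR by median-of-halves (the middle value is left out when n is odd)."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])
    q3 = statistics.median(data[len(data) - half:])
    return q3 - q1


scores = [1, 2, 3, 4, 5, 6, 7, 8]
print(data_range(scores), iqr(scores))
```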
Standard 14 - Empirical Rule

Apply the empirical rule for interpreting the standard deviation for mound-shaped
distributions
47.
According to the empirical rule, for any mound-shaped distribution, about 95% of the
data will be…
(a) within one standard deviation of the mean
(b) within two standard deviations of the mean
(c) within three standard deviations of the mean
(d) within four standard deviations of the mean
48.
If the mean of a distribution of test scores is 70 and the standard deviation is 10, the
empirical rule predicts that about 68% of the students scored between…
(a) 60 and 80
(b) 70 and 80
(c) 50 and 90
(d) 60 and 90
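The empirical rule turns a mean and standard deviation into three intervals. The sketch below simply computes mean ± 1, 2, and 3 standard deviations; the function name is illustrative, and the percentages are the usual approximations for mound-shaped distributions.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals holding roughly 68%, 95%, and 99.7% of a mound-shaped distribution."""
    coverage = {1: "about 68%", 2: "about 95%", 3: "about 99.7%"}
    return {label: (mean - k * sd, mean + k * sd) for k, label in coverage.items()}


# Test scores with mean 70 and standard deviation 10, as in question 48.
for label, (low, high) in empirical_rule_intervals(70, 10).items():
    print(f"{label}: {low} to {high}")
```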
Standard 15 - Linear Transformations

Understand the effects of linear transformations of the data on the summary statistics
49.
A distribution of 6 scores has an IQR of 21. If the highest score increases 3 points, the
new IQR will be:
(a.) 21
(b.) 21.5
(c.) 24
(d.) cannot be determined
50.
Adding 10 to each item in a set of data will result in which of the following:
(a.) increase the standard deviation by 10
(b.) multiply the IQR by 10
(c.) increase the mean by 10
(d.) increase the median by 2
51.
If every number in a data set is multiplied by 2 then …
(a.) the range will be doubled but the mean will stay the same
(b.) the mean will be doubled but the IQR will remain the same
(c.) the mean will be doubled and the standard deviation will also be doubled
(d.) the median will be doubled and the standard deviation will remain the same
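The effect of adding a constant or multiplying by a constant can be seen on any small data set. The values below are made up for illustration (an assumption, not data from this review).

```python
import statistics


def spread_and_center(values):
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.pstdev(values),
        "range": max(values) - min(values),
    }


data = [3, 5, 8, 10, 14]               # hypothetical scores
shifted = [x + 10 for x in data]       # add 10 to every value
scaled = [x * 2 for x in data]         # multiply every value by 2

base = spread_and_center(data)
print("original", base)
print("add 10  ", spread_and_center(shifted))
print("times 2 ", spread_and_center(scaled))
```

Adding a constant shifts the measures of center and leaves the measures of spread alone; multiplying by a constant rescales both.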
Standard 17 - Categorical variable distributions

Compute and interpret conditional and marginal distributions
52.
The marginal distribution of a categorical variable is
(a.) the total amount of things in one category
(b.) the grand total
(c.) the proportional breakdown of the categories for that variable calculated using the
totals
(d.) the proportional breakdown of the categories for one variable considering only the
outcomes for a particular category of another variable.
53.
A conditional distribution of a categorical variable is
(a.) the total amount of things in one category
(b.) the grand total
(c.) the proportional breakdown of the categories for that variable calculated using the
totals
(d.) the proportional breakdown of the categories for one variable considering only the
outcomes for a particular category of another variable.
Consider the following two-way table showing the favorite leisure activities for 50 adults.
         Dance   Sports   TV   Total
Men        2       10      8     20
Women     16        6      8     30
Total     18       16     16     50
54.
What proportion of people preferring Dance are women and what proportion are men?
(a.) 89%, 11%
(b.) 40%, 60%
(c.) 36%, 64%
(d.) 50%, 50%
55.
What is the conditional distribution of the variable Gender for people preferring TV?
(a.) 89%, 11%
(b.) 40%, 60%
(c.) 36%, 64%
(d.) 50%, 50%
56.
What is the marginal distribution of the variable Gender?
(a.) 89%, 11%
(b.) 40%, 60%
(c.) 36%, 64%
(d.) 50%, 50%
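Marginal and conditional distributions come from the same counts but use different totals. The sketch below assumes the leisure-activity table has columns Dance, Sports, TV (in that order) with rows Men 2, 10, 8 and Women 16, 6, 8; the function names are illustrative.

```python
counts = {
    "Men":   {"Dance": 2,  "Sports": 10, "TV": 8},
    "Women": {"Dance": 16, "Sports": 6,  "TV": 8},
}


def marginal_by_row(table):
    """Share of the grand total in each row (marginal distribution of the row variable)."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {name: sum(row.values()) / grand_total for name, row in table.items()}


def conditional_on_column(table, column):
    """Distribution of the row variable among only the cases in one column."""
    column_total = sum(row[column] for row in table.values())
    return {name: row[column] / column_total for name, row in table.items()}


print(marginal_by_row(counts))                 # divides by the grand total of 50
print(conditional_on_column(counts, "TV"))     # divides by the TV column total
print(conditional_on_column(counts, "Dance"))  # divides by the Dance column total
```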
Standard 18 - Segmented Bar Graphs

Create and interpret segmented bar graphs
Answer 57 and 58 based on the segmented bar graph
57. The proportion of Men who prefer Dance is about…
(a.) 11%
(b.) 20%
(c.) 55%
(d.) 64%
58. The proportion of Women who prefer Sports is about…
(a.) 11%
(b.) 20%
(c.) 55%
(d.) 64%
Standard 19 - Categorical Variable Relationships


Describe the relationship between two variables (categorical v categorical)
Determine whether 2 categorical variables are independent
59.
Based on the segmented bar graph, are sex and favorite leisure activity independent?
(a.) Yes
(b.) No
(c.) Cannot be determined
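Two categorical variables are independent when the conditional distribution of one is the same for every category of the other. The sketch below compares the conditional distributions of activity for men and for women, assuming the counts from the two-way table of leisure activities (columns taken to be Dance, Sports, TV). The 0.05 tolerance is an arbitrary cutoff chosen for illustration, not a formal test.

```python
counts = {
    "Men":   {"Dance": 2,  "Sports": 10, "TV": 8},
    "Women": {"Dance": 16, "Sports": 6,  "TV": 8},
}


def row_distribution(row):
    """Conditional distribution of the column variable within one row."""
    total = sum(row.values())
    return {category: count / total for category, count in row.items()}


def looks_independent(table, tolerance=0.05):
    """True when every row's conditional distribution is within `tolerance` of the first row's."""
    distributions = [row_distribution(row) for row in table.values()]
    first = distributions[0]
    return all(
        abs(dist[category] - first[category]) <= tolerance
        for dist in distributions[1:]
        for category in first
    )


for sex, row in counts.items():
    print(sex, {k: round(v, 2) for k, v in row_distribution(row).items()})
print(looks_independent(counts))
```

On a segmented bar graph this is the same as asking whether the bars for the two groups are split in the same proportions.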
Consider the two-way table below based on a class of 30 students and their responses as to
whether or not they own a cat and whether or not they own a dog:
            Has a Cat   No Cat   Total
Has a Dog       8          4       12
No Dog          2         16       18
Total          10         20       30
60.
Describe the relationship between having a cat and having a dog in this class.
(a.) whether a student has a dog is not related to whether the student has a cat
(b.) Students with a cat are more likely to also have a dog than students without a cat
(c.) Students with a dog are less likely to have a cat than students who do not have a dog
(d.) Most students have either a cat or a dog or both
1. B
2. C
3. D
4. A
5. B
6. B
7. A
8. C
9. C
10. B
11. D
12. B
13. A
14. D
15. B
16. C
17. B
18. B
19. C
20. B
21. C
22. B
23. B
24. A
25. B
26. C
27. A
28. C
29. C
30. C
31. B
32. A
33. D
34. C
35. B
36. D
37. D
38. C
39. D
40. D
41. B
42. C
43. C
44. D
45. C
46. A
47. B
48. A
49. A
50. C
51. C
52. C
53. D
54. A
55. D
56. B
57. A
58. B
59. B
60. B