5. Practice problems - 1.

University of California, Los Angeles
Department of Statistics
Statistics 10
Instructor: Nicolas Christou
Practice questions
Questions 1-4:
The Survey of Study Habits and Attitudes (SSHA) is a psychological test that evaluates college students'
motivation, study habits, and attitude towards school. This test was given to all incoming freshmen at a
small private college. A sample of 18 female students and 20 male students revealed the following side-by-side
boxplots for their test scores.
1. About 75% of the female students have test scores higher than (approximately):
a. 125
b. 150
c. 100
d. 180
2. About 50% of the male students have test scores higher than (approximately):
a. 150
b. 110
c. 180
d. 100
3. The largest non-outlier in the female sample is around:
a. 200
b. 180
c. 100
d. 150
4. The distribution of the scores in the male sample most likely is:
a. Skewed to the right
b. Skewed to the left
c. Symmetrical
Questions 5-6:
A study on college students found that the men had an average weight of 66 kg with standard deviation 9
kg. The women had an average weight of about 55 kg with standard deviation 9 kg.
5. Find the average and variance of the women’s weight in pounds (1 kg = 2.2 lb).
a. \(\bar{x} = 121\), \(s^2 = 43.56\)
b. \(\bar{x} = 121\), \(s^2 = 178.2\)
c. \(\bar{x} = 121\), \(s^2 = 392.04\)
d. \(\bar{x} = 266.2\), \(s^2 = 392.04\)
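The rule behind Question 5 is that multiplying every observation by a constant c multiplies the mean by c and the variance by c squared. The following is a minimal Python sketch of that rule, applied to the women's summary statistics given above. The helper name `rescale_mean_variance` is hypothetical and is not part of the course material.

```python
def rescale_mean_variance(mean, sd, factor):
    """Return (mean, variance) after multiplying every value by `factor`."""
    new_mean = factor * mean
    # The standard deviation scales by |factor|, so the variance scales by factor**2.
    new_variance = (factor * sd) ** 2
    return new_mean, new_variance


KG_TO_LB = 2.2  # conversion factor stated in the question

# Women's weights: mean 55 kg, standard deviation 9 kg.
mean_lb, var_lb = rescale_mean_variance(mean=55, sd=9, factor=KG_TO_LB)
print(round(mean_lb, 2), round(var_lb, 2))
```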
6. If you took the men and women together, would the standard deviation of their weights be bigger
than 9 kg, equal to 9 kg, or smaller than 9 kg?
a. > 9
b. = 9
c. < 9
Question 7:
Which sample has the smaller variation:
List P contains the 4000 integers
1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, ..., 999, 999, 999, 999, 1000, 1000, 1000, 1000.
List Q contains the 3000 integers
1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, ..., 999, 999, 999, 1000, 1000, 1000.
a. P
b. Q
c. Exactly equal variation
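One way to explore Question 7 is to build both lists and compute their variances directly. The sketch below uses Python's standard `statistics` module and reports both the population variance (divisor n) and the sample variance (divisor n - 1). Which of the two the question intends by "variation" is an assumption left to the reader.

```python
import statistics

# Each integer from 1 to 1000 appears 4 times in P and 3 times in Q.
list_p = [k for k in range(1, 1001) for _ in range(4)]
list_q = [k for k in range(1, 1001) for _ in range(3)]

# Population variance divides by n; sample variance divides by n - 1.
pop_var_p = statistics.pvariance(list_p)
pop_var_q = statistics.pvariance(list_q)
samp_var_p = statistics.variance(list_p)
samp_var_q = statistics.variance(list_q)

print("population variance:", pop_var_p, pop_var_q)
print("sample variance:    ", samp_var_p, samp_var_q)
```

Comparing the two printed rows shows how the choice of divisor affects the comparison between P and Q.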
Question 8:
The following table lists the hourly wage rate of a sample of 1000 industrial workers.
Wage category      Frequency
$5.00-$5.99          232
$6.00-$6.99          220
$7.00-$7.99          200
$8.00-$8.99          182
$9.00-$9.99           85
$10.00-$10.99         61
$11.00-$11.99         20
Total               1000
a. Construct the relative frequency histogram of these data.
b. Describe the shape of this histogram.
c. Which category contains the median? The exact median of these 1000 values cannot be determined
from the table. Make a reasonable guess for the median. Will the mean of these data be larger or
smaller than the median?
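For parts a and c of Question 8, the relative frequencies and the category holding the median can be computed from the table. The sketch below is one possible approach in Python. The variable names are illustrative, and the category labels are copied from the table.

```python
categories = [
    "$5.00-$5.99", "$6.00-$6.99", "$7.00-$7.99", "$8.00-$8.99",
    "$9.00-$9.99", "$10.00-$10.99", "$11.00-$11.99",
]
frequencies = [232, 220, 200, 182, 85, 61, 20]

total = sum(frequencies)
# Relative frequency = category count divided by the sample size.
relative = [f / total for f in frequencies]

# With 1000 values the median lies between the 500th and 501st ordered values.
# Walk the cumulative counts until at least half of the sample is covered.
cumulative = 0
median_category = None
for category, freq in zip(categories, frequencies):
    cumulative += freq
    if cumulative >= total / 2:
        median_category = category
        break

for category, rel in zip(categories, relative):
    print(category, rel)
print("median falls in:", median_category)
```

The cumulative count jumps from 452 to 652 across the third category, so both the 500th and the 501st values fall in the same category.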
Question 9:
Consider the simple linear regression on a random sample of size n = 10, which gave

\[
\sum_{i=1}^{10} y_i = 63.68, \quad
\sum_{i=1}^{10} x_i = 190, \quad
\sum_{i=1}^{10} y_i^2 = 446.33, \quad
\sum_{i=1}^{10} x_i^2 = 3940, \quad
\sum_{i=1}^{10} x_i y_i = 1100.
\]

Note: An alternative formula for SSR is \( SSR = \hat{\beta}_1^2 \sum_{i=1}^{10} (x_i - \bar{x})^2 \).
a. Find the least squares estimates \(\hat{\beta}_0\) and \(\hat{\beta}_1\).
b. Find \(s_e^2\).
c. Find \(R^2\).
d. Suppose \(x_1 = 25\). Find its leverage value. Is it a high leverage point? Please explain.
e. Show that \(\hat{y}_i = \bar{y} + \hat{\beta}_1 (x_i - \bar{x})\).
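The quantities in parts a through d of Question 9 can all be obtained from the five summary sums. The following Python sketch reads the sums as: sum of y = 63.68, sum of x = 190, sum of y squared = 446.33, sum of x squared = 3940, and sum of xy = 1100. It assumes the usual n - 2 divisor for the residual variance and the standard simple-regression leverage formula 1/n + (x - x̄)²/Sxx. Whether a given leverage counts as "high" depends on the cutoff used in the course, which is not stated here.

```python
n = 10
sum_y, sum_x = 63.68, 190
sum_y2, sum_x2, sum_xy = 446.33, 3940, 1100

x_bar = sum_x / n
y_bar = sum_y / n

sxx = sum_x2 - n * x_bar ** 2       # sum of (x_i - x_bar)^2
sxy = sum_xy - n * x_bar * y_bar    # sum of (x_i - x_bar)(y_i - y_bar)
sst = sum_y2 - n * y_bar ** 2       # total sum of squares

beta1 = sxy / sxx                   # least squares slope
beta0 = y_bar - beta1 * x_bar       # least squares intercept

ssr = beta1 ** 2 * sxx              # alternative SSR formula from the note
sse = sst - ssr
s2_e = sse / (n - 2)                # assumption: residual variance uses n - 2
r_squared = ssr / sst

# Leverage of the point with x = 25 (assumed simple-regression formula).
leverage_x1 = 1 / n + (25 - x_bar) ** 2 / sxx

print(beta0, beta1, s2_e, r_squared, leverage_x1)
```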
Question 10:
You are given the variance-covariance matrix of the returns of three stocks (AAPL, IBM, XOM) and the
market (^GSPC).

         AAPL          IBM           XOM           ^GSPC
AAPL     0.0051576327  0.0009561605  0.0008924827  0.0012088348
IBM      0.0009561605  0.0020179051  0.0010469402  0.0008613309
XOM      0.0008924827  0.0010469402  0.0020493003  0.0012078645
^GSPC    0.0012088348  0.0008613309  0.0012078645  0.0013817892
Find the correlation coefficient between AAPL and GSPC.
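A correlation coefficient is the covariance divided by the product of the two standard deviations, and the variances sit on the diagonal of the matrix. A short Python sketch of that calculation for Question 10, using the entries given above (variable names are illustrative):

```python
import math

# Entries read from the variance-covariance matrix.
var_aapl = 0.0051576327        # variance of AAPL returns (diagonal entry)
var_gspc = 0.0013817892        # variance of ^GSPC returns (diagonal entry)
cov_aapl_gspc = 0.0012088348   # covariance between AAPL and ^GSPC

# correlation = covariance / (sd_1 * sd_2)
correlation = cov_aapl_gspc / math.sqrt(var_aapl * var_gspc)
print(round(correlation, 4))
```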
Question 11:
The following ecdf was constructed using 1000 observations of a certain variable. Will the corresponding
histogram be symmetrical, skewed to the left, or skewed to the right? Please explain.
[Figure: Empirical cumulative distribution function of x. Horizontal axis: x, from 0 to 80. Vertical axis: F(x), from 0.0 to 1.0.]
Question 12:
Consider the simple regression model \(y_i = \beta_0 + \beta_1 x_i + \epsilon_i\). Explain what the permutation test is for testing
\(H_0: \beta_1 = 0\) against \(H_a: \beta_1 \neq 0\). Suppose that for a certain problem the permutation test gave a p-value of 0.001.
Using significance level \(\alpha = 0.05\), what do you conclude?
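The idea of the permutation test in Question 12 can be sketched in code. Under the null hypothesis the y values are unrelated to x, so shuffling y should produce slopes as extreme as the observed one fairly often. The Python sketch below is one common version of the procedure, not necessarily the exact one used in the course. The data, the seeds, the number of permutations, and the function names are all hypothetical choices made for illustration.

```python
import random


def slope(x, y):
    """Least squares slope of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def permutation_p_value(x, y, n_permutations=999, seed=0):
    """Two-sided permutation p-value for H0: slope = 0."""
    rng = random.Random(seed)
    observed = abs(slope(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffling y breaks any association with x, mimicking H0.
        rng.shuffle(shuffled)
        if abs(slope(x, shuffled)) >= observed:
            extreme += 1
    # Count the observed arrangement itself so the p-value is never exactly 0.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical data: a clear linear trend plus a little noise.
data_rng = random.Random(1)
x = list(range(1, 21))
y = [2.0 + 0.5 * xi + data_rng.gauss(0, 1) for xi in x]

p_value = permutation_p_value(x, y)
print(p_value)
```

The p-value is the proportion of arrangements whose slope is at least as extreme as the observed slope. A p-value below the chosen significance level leads to rejecting the null hypothesis.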
Question 13:
The following were obtained from two sets of data:
\(n_1 = 20\), \(\bar{x} = 25\), \(s_x^2 = 5\) and
\(n_2 = 30\), \(\bar{y} = 20\), \(s_y^2 = 4\).
Find the sample mean and sample variance of the combined sample.
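For Question 13, the combined mean is a weighted average of the two means. The combined sum of squares splits into a within-group part and a between-group part. The Python sketch below implements that decomposition, assuming both variances use the n - 1 divisor. The helper name `combine_samples` is hypothetical.

```python
def combine_samples(n1, mean1, var1, n2, mean2, var2):
    """Combine two samples given sizes, means, and sample variances (n - 1 divisor)."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    # Sum of squares about the combined mean:
    # within-group part (n_k - 1) * var_k plus between-group part n_k * (mean_k - mean)^2.
    ss = ((n1 - 1) * var1 + n1 * (mean1 - mean) ** 2
          + (n2 - 1) * var2 + n2 * (mean2 - mean) ** 2)
    return mean, ss / (n - 1)


combined_mean, combined_var = combine_samples(20, 25, 5, 30, 20, 4)
print(combined_mean, combined_var)
```

The same helper covers the case of a single added observation by passing `n2 = 1` and `var2 = 0`.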
Question 14:
A list of 20 values \(x_1, x_2, \ldots, x_{20}\) has an average \(\bar{x} = 135\) and a sample standard deviation \(s = 14\). It is
discovered that there is a 21st data value (it was missing when the list was formed). This value is \(x_{21} = 135\).
Find the sample mean and the sample standard deviation of the list of 21 values.
Question 15:
The average inventory of flour, determined at 25 inventory audits, at a certain bakery was 4650 pounds.
What is the average in kilograms? Note: 2.2 pounds = 1 kilogram.
Question 16:
The sample mean of 15 newly-appointed partners in various Los Angeles law firms is $150000. The sample
mean of 15 New York law firms for newly-appointed partners is $180000. What is the sample mean of all
30 firms?
Question 17:
Write a set of numbers in which the sample mean exceeds the median.
Write a set of numbers in which the sample mean equals the median.
Write a set of numbers in which the sample mean is less than the median.
Question 18:
You are given the following data set:
   x       y       cadmium  copper  lead  zinc
1  181072  333611  11.7     85      299   1022
2  181025  333558   8.6     81      277   1141
3  181165  333537   6.5     68      199    640
4  181298  333484   2.6     81      116    257
5  181307  333330   2.8     48      117    269
6  181390  333260   3.0     61      137    281
Construct the empirical cumulative distribution function (ecdf) of cadmium.
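The ecdf at a point t is the proportion of observations less than or equal to t, so it is a step function that rises by 1/n at each distinct data value. A minimal Python sketch for the six cadmium values in Question 18 (the function name `ecdf` is illustrative):

```python
cadmium = [11.7, 8.6, 6.5, 2.6, 2.8, 3.0]


def ecdf(data, t):
    """Proportion of observations less than or equal to t."""
    return sum(1 for value in data if value <= t) / len(data)


# The ecdf jumps at each sorted data value; print the height after each jump.
for value in sorted(cadmium):
    print(value, ecdf(cadmium, value))
```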
Question 19:
True or False: If the median salary of all male employees in a firm exceeds $56000, and if the median salary
of all female employees in the firm exceeds $55000, then it is certain that the median salary of all employees
exceeds $55000.
Question 20:
There are two groups of salespeople at a certain company. Group A consists of salespeople with no previous
experience in sales, and group B consists of those who do have sales experience. There are 14 people in
group A. During January, these group A people had average sales of $170000. There are 20 people in group
B. During January, the group B people had average sales of $240000. Find the average sales during January
for all 34 salespersons.
Question 21:
Refer to question 20. The 14 people in group A have median age 28.4 years. The 20 people in group B have
median age 36.2 years. What can you say about the median of the combined group of 34?