Name:
Math 17 Section 02 / Enst 24: Introduction to Statistics
PRACTICE 2
Third Midterm Exam
Instructions:
1. Show all work. You may receive partial credit for partially completed problems.
2. You may use calculators and a one-sided sheet of reference notes, as well as the provided tables (t, chi-square). You may not use any other references or any texts.
3. You may not discuss the exam with anyone but me.
4. Suggestion: Read all questions before beginning and complete the ones you know best first. Point values per problem are displayed below if that helps you allocate your time among problems.
5. Use 4 decimal places for calculations involving proportions.
6. You MAY NOT use a calculator to do more than the standard arithmetic functions, exponents, and square roots; i.e., you may not use t-test functions, regression functions, and the like.
7. Good luck!
Problem    1    2    3    4    Total
Points Earned
Possible Points
50
1. According to Ward’s Communication, 19% of sports car enthusiasts prefer a red color, 16.2% silver, 14.7% black, 14.1% green, 14% white, and 22% other colors. A random sample of 250 cars at a NASCAR raceway revealed 45 red cars, 42 silver cars, 34 black cars, 40 green cars, 39 white cars, and 50 other color cars. We want to know if the NASCAR color preferences match the sports car enthusiast color distribution stated by Ward’s Communication.
a. What procedure should you perform to address this particular question? (Be specific.)
b. What hypotheses should you test? Define your parameter(s).
Null:
Alternative:
where
c. Find the appropriate expected counts for your test procedure, and fill in the table.
Color      Red     Silver  Black   Green   White   Other
Observed   45      42      34      40      39      50
Expected   47.50   40.50   36.75   35.25   35.00   55.00
d. The conditions for the test procedure check out. Compute your test statistic.
e. What is the distribution of the test statistic assuming the null hypothesis is true? Be specific.
Chi-square with df = k - 1 = 5
f. What is the p-value for your test?
g. State your conclusion using a .10 significance level.
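The arithmetic behind parts (c) through (g) can be checked with a short Python sketch. It assumes the test is a chi-square goodness-of-fit test, and it hard-codes 9.236 as the upper-tail .10 critical value for 5 degrees of freedom from a standard chi-square table. The variable names are illustrative and not part of the exam.

```python
# Observed counts from the NASCAR sample and the proportions claimed by Ward's Communication.
observed = {"Red": 45, "Silver": 42, "Black": 34, "Green": 40, "White": 39, "Other": 50}
claimed = {"Red": 0.19, "Silver": 0.162, "Black": 0.147,
           "Green": 0.141, "White": 0.14, "Other": 0.22}

n = sum(observed.values())  # 250 cars in the sample

# Expected count under the null hypothesis: sample size times claimed proportion.
expected = {color: n * p for color, p in claimed.items()}

# Chi-square statistic: sum of (observed - expected)^2 / expected over the six colors.
chi_sq = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)
df = len(observed) - 1

# Assumption: table value for df = 5 with upper-tail area .10.
CRITICAL_10 = 9.236
reject = chi_sq > CRITICAL_10

for color in observed:
    print(f"{color:7s} observed={observed[color]:3d} expected={expected[color]:.2f}")
print(f"chi-square = {chi_sq:.4f}, df = {df}, reject at .10: {reject}")
```

Under these assumptions the statistic comes out near 1.94, well below the critical value, so the sample gives no evidence at the .10 level that the NASCAR color distribution differs from the stated one.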
2. In a memory experiment, three groups of subjects were given a list of words to try to remember. The
length of the list for the first group was 10 words (short), whereas for the second group it was 20 words
(medium), and for the third group it was 40 words (long). The percentage of words recalled for each
subject was recorded. We are interested in knowing if the average percentage of words recalled
depends on the length of the list.
Source        SS       df   MS       F         p-value
List Length   2668.8   2    1334.4   15.7730   .0003
Residuals     1184.1   14   84.6
Total         3852.9   16
a. Some values in the ANOVA table are missing. Complete the table above.
b. State the null and alternative hypotheses appropriate for this ANOVA.
c. What conditions for the ANOVA could you check using a side-by-side boxplot? How should the boxplot
look if the conditions ARE satisfied?
d. What is the distribution of the test statistic assuming the null hypothesis is true?
F(2, 14)
e. The following pairwise confidence intervals were generated using Tukey’s multiple comparisons methods. If appropriate, use the intervals to summarize the differences. If not appropriate, explain why not.
             Estimate   Lwr      Upper
Med-Short    -20.33     -39.61   -1.06
Long-Short   -29.17     -47.54   -10.79
Long-Med     -8.83      -28.11   10.44
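For parts (a) and (e), the bookkeeping can be sketched in Python. The sketch assumes the printed entries of the ANOVA table are the List Length SS and df, the Residuals MS, the Total SS and df, and the p-value; everything else is derived from them. The variable names are illustrative.

```python
# Entries read from the ANOVA table for the list-length experiment.
ss_group, df_group = 2668.8, 2
ss_total, df_total = 3852.9, 16
ms_resid_given = 84.6  # residual mean square as printed (rounded)

# Sums of squares and degrees of freedom add up to the totals.
ss_resid = ss_total - ss_group
df_resid = df_total - df_group

# Mean squares are SS divided by df; F is the ratio of the two mean squares.
ms_group = ss_group / df_group
ms_resid = ss_resid / df_resid            # should round to the printed 84.6
f_stat = ms_group / ms_resid_given        # F computed from the rounded table entry

# Tukey intervals (lower, upper) for each pairwise difference in means.
tukey = {
    "Med-Short": (-39.61, -1.06),
    "Long-Short": (-47.54, -10.79),
    "Long-Med": (-28.11, 10.44),
}
# A pair of groups differs significantly when its interval excludes 0.
differs = {pair: not (lo <= 0 <= hi) for pair, (lo, hi) in tukey.items()}

print(f"Residuals: SS = {ss_resid:.1f}, df = {df_resid}, MS = {ms_resid:.1f}")
print(f"List Length: MS = {ms_group:.1f}, F = {f_stat:.4f}")
print(differs)
```

Read this way, the short list differs from both the medium and long lists, while the Long-Med interval contains 0, so those two groups are not distinguishable.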
3. An article in the Journal of Statistics Education reported the prices of diamonds of different sizes in Singapore dollars. The prices were converted to US dollars in 2004. A student wants to know how carat (diamond size) is related to price (USD). A graph of the data is at right.
a. Does linear regression appear to be appropriate?
Explain in one sentence.
Selected regression output is shown. Use the output to address the questions that follow.
Coefficients:
             Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)   -558.52       57.88   -9.649  7.98e-08
Carat         8225.06      239.11   34.399  1.10e-15

Residual standard error: 64.94 on 15 degrees of freedom
Multiple R-squared: 0.9875, Adjusted R-squared: 0.9866
F-statistic: 1183 on 1 and 15 DF, p-value: 1.098e-15
b. Report the regression line fit by the student.
Predicted price (USD) = -558.52 + 8225.06 (Carat)
c. What is the average size of a residual from this regression?
d. Report the value of R-squared, and interpret this value.
e. One data point is a .25 carat diamond which cost $1508.88. What is the residual for this data point?
1508.88 - (-558.52 + 8225.06(.25)) = 1508.88 - 1497.745 = 11.135
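The residual in part (e) is the observed price minus the fitted price. A minimal Python sketch, using the coefficient estimates from the regression output (the variable names are illustrative):

```python
# Coefficient estimates taken from the regression output.
intercept = -558.52
slope = 8225.06

# The data point in question: a .25 carat diamond that cost $1508.88.
carat = 0.25
price = 1508.88

# Fitted value from the regression line, then residual = observed - fitted.
fitted = intercept + slope * carat
residual = price - fitted

print(f"fitted = {fitted:.3f}, residual = {residual:.3f}")
```

A positive residual means this diamond cost slightly more than the line predicts for its size.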
f. Would it be appropriate to use this regression to predict the price of a .50 carat diamond? Explain
why or why not in one sentence.
g. Does there appear to be a significant linear relationship between diamond size and price? Perform an appropriate test at a .05 significance level, reporting your hypotheses, test statistic, p-value, and conclusion in context. (Assumptions will be checked below.)
Null:
Alternative:
Test statistic: t = 34.399
p-value: 1.10e-15
Conclusion:
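The test statistic in part (g) can be reproduced from the Carat row of the regression output. The sketch below assumes a two-sided t test of whether the slope is zero, and hard-codes 2.131 as the two-sided .05 critical value for 15 degrees of freedom from a standard t table. The variable names are illustrative.

```python
# Slope estimate and its standard error from the Carat row of the output.
slope = 8225.06
se_slope = 239.11
df = 15  # residual degrees of freedom reported in the output

# t statistic for testing slope = 0: estimate divided by standard error.
t_stat = slope / se_slope

# Assumption: table value for df = 15 with .025 in each tail.
T_CRITICAL_05 = 2.131
reject = abs(t_stat) > T_CRITICAL_05

# In simple linear regression the F statistic equals the squared t statistic.
f_stat = t_stat ** 2

print(f"t = {t_stat:.3f} on {df} df, reject at .05: {reject}")
print(f"t squared = {f_stat:.0f}")
```

The statistic is far beyond the critical value, which agrees with the tiny p-value in the output and supports a significant linear relationship between carat and price.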
h. The student makes some additional plots to finish off the regression. Using the plots, discuss whether or not the regression assumptions appear to be satisfied in the space below.
[Figure: two diagnostic plots, "Residuals vs Fitted" (x-axis: Fitted values) and "Normal Q-Q" (x-axis: Theoretical Quantiles).]
4. In a study of how the burden of poverty varies among the U.S. regions, a random sample of 4000 individuals from certain U.S. regions recently yielded some information on the distribution of poverty. The data are summarized in the table. We want to know if the distribution of poverty is the same for these four regions of the U.S.
                 Northwest   Midwest     South       West        Total
In Poverty       112 (121)   105 (121)   154 (121)   113 (121)   484
Not in Poverty   888 (879)   895 (879)   846 (879)   887 (879)   3516
Total            1000        1000        1000        1000        4000
a. What is the appropriate test procedure for this question? (Be specific.)
b. What are the appropriate hypotheses for your test?
Null:
Alternative:
c. Compute and fill in the appropriate expected counts for the cells with missing expected counts.
d. What is the name of the rule used to compute the expected counts?
e. The test statistic value was 14.01. What is the distribution of the test statistic assuming the null hypothesis is true?
Chi-square with df = (r - 1)(c - 1) = (2 - 1)(4 - 1) = 3
f. What can you say about the numeric value of the p-value for your test?
g. Give your conclusion in context of the problem using a .05 significance level.
h. Explain what your p-value means in context of the problem.
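The numbers in parts (c) through (f) can be checked with a short Python sketch. It computes each expected count as row total times column total divided by the grand total, recomputes the chi-square statistic, and brackets the p-value with critical values for 3 degrees of freedom. Those critical values are hard-coded from a standard chi-square table, which is an assumption about the table provided with the exam. The variable names are illustrative.

```python
# Observed counts by region, in the order the table lists them.
regions = ["Northwest", "Midwest", "South", "West"]
observed = {
    "In Poverty": [112, 105, 154, 113],
    "Not in Poverty": [888, 895, 846, 887],
}

col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(col_totals)

# Expected count = (row total * column total) / grand total.
expected = {
    row: [sum(counts) * ct / grand_total for ct in col_totals]
    for row, counts in observed.items()
}

# Chi-square statistic summed over all eight cells.
chi_sq = sum(
    (o - e) ** 2 / e
    for row in observed
    for o, e in zip(observed[row], expected[row])
)
df = (len(observed) - 1) * (len(regions) - 1)

# Assumption: upper-tail critical values for df = 3 from a standard table.
critical = {0.05: 7.815, 0.01: 11.345, 0.005: 12.838, 0.001: 16.266}
exceeded = [alpha for alpha, cv in critical.items() if chi_sq > cv]

print(expected)
print(f"chi-square = {chi_sq:.2f}, df = {df}")
print(f"statistic exceeds the critical values for tail areas: {exceeded}")
```

With these table values the statistic exceeds the .005 critical value but not the .001 one, so the p-value lies between .001 and .005, and the null hypothesis is rejected at the .05 level.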