Prob-Stats Test 2, Fall 2014
Solutions
1. A study compares the “mean length of utterances” (MLUs) from age-matched Down
Syndrome children (n=22) and typically developing children (n=20). The mean MLU
for the Down Syndrome (DS) group is 3.1 with a standard deviation of .55. The mean
MLU for the Typically Developing (TD) group is 3.9 with a standard deviation of .65.
Test whether there is a significant difference in MLUs between the DS and TD groups
at the .1 level of significance.
Independent Samples t-test. Let D refer to the Down Syndrome group and T
refer to the Typically Developing group.
H0 : µD = µT
Ha : µD ≠ µT
Verification. N = nT + nD = 42 ≥ 40 so no need for histogram or stem plot. Sample
size is adequate to ensure accuracy of p-values due to the robustness of t. OK to
proceed.
Run Test, α = .1. Since (t = ±4.28) p = .0001 < .1 → Reject the null. To conduct
the test by-hand, we first find t∗ based upon df = min(nD − 1, nT − 1) = min(21, 19) = 19.
Looking up t∗ using the two-tailed portion of the chart and df = 19, we find t∗ = 1.729. To
calculate the test statistic:
\[ t = \frac{3.9 - 3.1}{\sqrt{\frac{.55^2}{22} + \frac{.65^2}{20}}} = \frac{.8}{.1868} = 4.28 \]
Since t = 4.28 > 1.729 = t∗ we reject the null.
Conclusion. Evidence suggests there is a significant difference in the MLUs between
the Down Syndrome group and the Typically Developing group.
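As a cross-check of the by-hand step in Problem 1, here is a minimal Python sketch that recomputes the unpooled standard error and the t statistic from the summary statistics in the problem. The variable names are illustrative assumptions, and the critical value is simply copied from the chart lookup (df = 19, two-tailed α = .1) rather than computed.

```python
import math

# Summary statistics quoted in the problem (DS = Down Syndrome, TD = Typically Developing).
n_ds, mean_ds, sd_ds = 22, 3.1, 0.55
n_td, mean_td, sd_td = 20, 3.9, 0.65

# Standard error of the difference in means, using unpooled variances.
se = math.sqrt(sd_ds**2 / n_ds + sd_td**2 / n_td)
t_stat = (mean_td - mean_ds) / se

# Critical value taken from the t chart (df = 19, two-tailed alpha = .10); not computed here.
t_crit = 1.729
reject_null = abs(t_stat) > t_crit

print(f"SE = {se:.4f}, t = {t_stat:.2f}, reject H0: {reject_null}")
```

The result agrees with the calculator value t = 4.28 quoted in the Run Test step.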
2. A poker player competing in heads up match play (1-on-1 poker tournaments) must
win at least 55% of her matches to show a profit after paying casino rake (e.g. entry
fees that go to the casino and not into the prize pool). Our shero has won 23 matches
and lost 13. Conduct a hypothesis test that her win rate exceeds the 55% mark (using
the plus 4 method). Test at the α = .05 level of significance.
1 Sample z-Proportion Test.
H0 : prop = .55
Ha : prop > .55
Verification. n · p0 = 36(.55) = 19.8 ≥ 10 as required. Also, n(1 − p0) = 36(.45) =
16.2 ≥ 10. Hence, the sample size is adequate for z-procedures.
Run Test, α = .05. We have p̂ = 23/36 which, after the Plus 4 Method conversion,
becomes p̂ = 25/40 = .625. Since (z = .953) p = .1701 > .05 → Fail to reject the null. To
conduct the test by-hand, we first find z∗ = 1.64. To calculate the test statistic:
\[ z = \frac{.625 - .55}{\sqrt{\frac{.55(.45)}{40}}} = .95 \]
Since z = .95 < 1.64 = z∗ we fail to reject the null.
Conclusion. There is no evidence to suggest our shero wins at a rate higher than 55%.
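The Plus 4 adjustment and the one-sided z test in Problem 2 can be sketched in a few lines of Python using only the standard library. This is an illustrative check, not the calculator procedure used in the solution; the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

wins, losses = 23, 13
p0 = 0.55  # break-even win rate under H0

# Plus 4 adjustment as applied in the solution: add 2 wins and 2 losses.
n_adj = wins + losses + 4
p_hat = (wins + 2) / n_adj

# z statistic uses the null proportion p0 in the standard error.
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n_adj)

# One-sided p-value for Ha: prop > .55 (upper tail of the standard normal).
p_value = 1 - NormalDist().cdf(z)

print(f"p_hat = {p_hat}, z = {z:.3f}, p = {p_value:.4f}")
```

The values agree with the calculator output z = .953 and p = .1701 quoted in the Run Test step.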
3. A group of at-risk students volunteer to participate in a new study program. To assess
if the program is effective, the organizer gives the students a test before the program
starts and a similar exam at the end of the program. Here are the scores for each of
the students. Test whether the program is helping students increase their exam scores
at the .05 level of significance.
Pre 54 55 62 73 56 46 68 70 66 57 58 72 65 74 75
Post 50 55 76 84 62 59 65 76 70 54 59 76 60 76 74
Dependent Samples t-test.
H0 : µg = 0
Ha : µg > 0
Verification. n = 15, which means we must check a boxplot (n < 40) but not the
histogram (n ≮ 15). The boxplot shows no outliers. OK to proceed.
Run Test, α = .05. Since (t = 1.91) p = .0387 < .05 → Reject the null. To conduct
the test by-hand, we first find t∗ = 1.761 based on df = n − 1 = 14. To calculate the
test statistic:
\[ t = \frac{3 - 0}{6.1 / \sqrt{15}} = 1.91 \]
Since t = 1.91 > 1.761 = t∗ we reject the null.
Conclusion. Evidence suggests the study program helps the at-risk kids do better.
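The mean gain of 3 and standard deviation of about 6.1 used in Problem 3 come from the paired differences (Post minus Pre). A minimal Python sketch, assuming that direction for the differences, reproduces them from the score table; the critical value is copied from the solution rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

pre  = [54, 55, 62, 73, 56, 46, 68, 70, 66, 57, 58, 72, 65, 74, 75]
post = [50, 55, 76, 84, 62, 59, 65, 76, 70, 54, 59, 76, 60, 76, 74]

# Gain for each student: positive means the score went up.
gains = [after - before for before, after in zip(pre, post)]

n = len(gains)
mean_gain = mean(gains)
sd_gain = stdev(gains)  # sample standard deviation (n - 1 in the denominator)

t_stat = (mean_gain - 0) / (sd_gain / sqrt(n))

# Critical value from the solution (df = 14, one-tailed alpha = .05); not computed here.
t_crit = 1.761
reject_null = t_stat > t_crit

print(f"mean gain = {mean_gain}, sd = {sd_gain:.2f}, t = {t_stat:.2f}, reject H0: {reject_null}")
```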
4. Is there a relationship between the number of times you miss class and the grade you
make in the class? The scatterplot below shows the numeric average and the number
of absences during the semester for each student in two different NGCSU sections
of Math 2400 (same professor). A regression line is also computed and shown for
these data (figure omitted). Perform the following tasks: (a) analyze the correlation
coefficient, (b) analyze the coefficient of determination, (c) predict the semester average
of a student with 6 absences, and (d) analyze the slope coefficient.
(a) r = −.55 → moderate negative relationship
(b) r² = .31 → 31% of the variance in AVG is accounted for by ABSENCES.
(c) y = 73.0604
(d) m = ∆y/∆x = −2.42/1 → for each additional absence, we expect AVG to decrease by
2.42 points.
5. Dobermans can be bred in 4 colors: black, red, blue and fawn, all of which have rust
colored highlights. If a black male Doberman (hetero dominant allele) and a fawn
female Doberman (homo recessive allele) were bred, on average half their pups would
be black, a quarter blue and the remaining quarter fawn. Given these parameters,
suppose over the course of several years, 28 pups were born: 11 black, 11 blue and 6
fawn. Test the hypothesis that the predicted proportions are accurate at the .1 level
of significance.
χ² Goodness of Fit. Let K indicate black, U indicate blue and F indicate fawn.
Then:
H0 : pK = 1/2, pU = 1/4, pF = 1/4
Ha : at least one probability not as expected
       Black   Blue   Fawn
OBS     11      11      6
EXP     14       7      7
Verification. None of the 3 EXP cells have low counts (less than 5). OK to proceed.
We find the critical value by using df = 2 and α = .1 and see that χ²∗ = 4.605.
We compute the test statistic:
\[ \chi^2 = \frac{(11-14)^2}{14} + \frac{(11-7)^2}{7} + \frac{(6-7)^2}{7} = \frac{9}{14} + \frac{16}{7} + \frac{1}{7} = \frac{43}{14} \approx 3.07 \]
Since χ² = 3.07 < 4.605 = χ²∗ we fail to reject the null and thus have no evidence that
the probability model for these pups is not as hypothesized.
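The goodness-of-fit arithmetic in Problem 5 can be checked with a short Python sketch. It relies on one fact that holds only for df = 2: the chi-square upper-tail probability is exactly exp(−x/2), so the p-value and the critical value follow without a table. The dictionary layout and names are illustrative assumptions.

```python
from math import exp, log

observed = {"black": 11, "blue": 11, "fawn": 6}
hypothesized = {"black": 0.5, "blue": 0.25, "fawn": 0.25}

# Expected counts under H0: total pups times each hypothesized proportion.
total = sum(observed.values())
expected = {color: total * p for color, p in hypothesized.items()}

# Chi-square statistic: sum of (OBS - EXP)^2 / EXP over the three colors.
chi2 = sum((observed[c] - expected[c]) ** 2 / expected[c] for c in observed)

# Special case df = 2: P(X > x) = exp(-x / 2), so both values below are exact.
alpha = 0.10
p_value = exp(-chi2 / 2)
chi2_crit = -2 * log(alpha)

print(f"chi2 = {chi2:.2f}, critical = {chi2_crit:.3f}, p = {p_value:.4f}")
```

Under these assumptions the critical value at the .1 level comes out to 4.605 and the p-value is well above .1, in line with the decision not to reject the null.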
6. A landscaper provides plants and landscaping for shopping centers, office parks, and
apartment complexes. The three types of flowering bushes ordered in the last month
were placed as follows: (figure omitted). Conduct a hypothesis test at the 0.1 level
of significance to determine if the different types of buildings utilize different types of
flowering bushes in their landscaping.
χ² Test of Independence.
H0 : Bush type and building type are independent
Ha : Bush type and building type are dependent
Verification. No low cell counts in EXP matrix (B). OK to proceed.
Since (χ² = 16.7) p = .0022 < .1 = α, we reject the null. Evidence suggests that
flowering plant choice depends upon the type of commercial building being landscaped.
7. Four groups of fifth graders are randomly selected and each is assigned to a different
educational training program for computer literacy. At the end of the training programs,
each student is given a standardized test. Twenty-eight students participated, but only
21 finished the course. Test at the .10 level whether there was a significant difference
in skill levels produced by the different training programs (chart omitted).
ANOVA.
H0 : µA = µB = µC = µD
Ha : ∃x, y ∈ {A, B, C, D} such that µx ≠ µy
or
Ha : At least one group is significantly different
Verification. Sample size n = 21 ≥ 20. Also, the ratio of largest to smallest sample
size is 6 : 4 = 3 : 2 < 2 : 1. OK to proceed.
Since (F = .06) p = .9012 > .10 we fail to reject the null. There is absolutely no evidence
at all that these group means are different.
8. Tukey HSD post hoc. Note the question asks for the level of significance from the
original ANOVA. We see from the output that p = .024 < .05, which causes us to reject
the null and, therefore, know that a post hoc test is required.
Note that with 3 groups and dfW = 120 ≈ 100, we find q∗ = 3.36. To calculate our HSDs,
we need two harmonic means. The harmonic mean of any number with itself is just the
number, so n01 = 39. For the others:
\[ n_{02} = n_{12} = \frac{2(39)(45)}{39 + 45} = 41.8 \]
Also note that MSW ≈ 53. Next, we calculate the two different HSD’s.
\[ HSD_{01} = 3.36\sqrt{\frac{53}{39}} = 3.92 \]
\[ HSD_{02} = HSD_{12} = 3.36\sqrt{\frac{53}{41.8}} = 3.78 \]
We can now compare the group mean differences with their respective HSD’s:
|x̄0 − x̄1| = 2.99 < HSD01 = 3.92
|x̄0 − x̄2| = 4.54 > HSD02 = 3.78 ∗
|x̄1 − x̄2| = 1.55 < HSD12 = 3.78
This indicates that the group mean difference between the Low and High 3-betting
groups is the only one that is significantly different at the .05 level.
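The HSD comparisons in Problem 8 follow a fixed recipe: take the harmonic mean of the two group sizes, scale q∗ by the square root of MSW over that mean, and compare with the absolute difference in group means. The Python sketch below walks through it. The group sizes (39, 39, 45) are an assumption inferred from the harmonic means quoted above, and the group labels 0, 1, 2 simply mirror the subscripts in the solution.

```python
from math import sqrt

q_star = 3.36  # studentized range value used in the solution
msw = 53       # approximate mean square within, from the ANOVA output

# Assumed group sizes, inferred from n01 = 39 and n02 = n12 = 41.8.
sizes = {0: 39, 1: 39, 2: 45}

# Absolute differences in group means, as listed in the solution.
abs_diffs = {(0, 1): 2.99, (0, 2): 4.54, (1, 2): 1.55}


def harmonic_mean(a, b):
    """Harmonic mean of two sample sizes."""
    return 2 * a * b / (a + b)


hsd = {}
significant = []
for (i, j), diff in abs_diffs.items():
    n_ij = harmonic_mean(sizes[i], sizes[j])
    hsd[(i, j)] = q_star * sqrt(msw / n_ij)
    if diff > hsd[(i, j)]:
        significant.append((i, j))

for pair, value in hsd.items():
    print(f"groups {pair}: HSD = {value:.2f}, |diff| = {abs_diffs[pair]}")
print("significant pairs:", significant)
```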
9. Who invented the t-test?
William Sealy Gosset
10. In a data set with sample size n = 40, we find there are two significant outliers to the
right, but none to the left. Which of the following statements are likely to be true?
Check all that apply.
The mean is greater than the median. YES
The distribution is skewed left. NO
The distribution is skewed right. YES
11. Perfectionism scores at UNG have the N(82,18) distribution. Find the 80th percentile
score.
Using the function invNorm(.8,82,18) we find X = 97.15.
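For readers without a TI calculator, the same lookup can be sketched in Python with the standard library's NormalDist, whose inv_cdf method plays the role of invNorm here. This is an equivalent check, not the method used in the solution.

```python
from statistics import NormalDist

# Perfectionism scores: mean 82, standard deviation 18.
scores = NormalDist(mu=82, sigma=18)

# Score with 80% of the distribution below it.
x_80 = scores.inv_cdf(0.80)

print(f"80th percentile score = {x_80:.2f}")
```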
12. Perfectionism scores at UNG have the N(82,18) distribution. Kayla's perfectionism
score is 111. Find Kayla's percentile.
Using the z-test with µ0 = 82, σ = 18, x̄ = 111 and n = 1, we use the < inequality and
find that z = 1.611 and p = .9464. We do not round the p-value to get the percentile;
we truncate, so the answer is the 94th percentile.
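The percentile lookup and the truncation rule can be sketched in Python as follows. NormalDist.cdf gives the area to the left of the score, and floor (rather than round) implements the truncation described above. The variable names are illustrative.

```python
from math import floor
from statistics import NormalDist

# Perfectionism scores: mean 82, standard deviation 18.
scores = NormalDist(mu=82, sigma=18)
kayla = 111

z = (kayla - scores.mean) / scores.stdev
area_below = scores.cdf(kayla)  # proportion of scores below Kayla's

# Truncate, do not round, to get the percentile.
percentile = floor(area_below * 100)

print(f"z = {z:.3f}, area below = {area_below:.4f}, percentile = {percentile}")
```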
13. For the following research scenario, (1) set up the null and alternative hypotheses, (2)
state both what Type I and Type II error are and what the real-world implications
of each would be, and then (3) set alpha stating your reasons why. Environmental
engineering students at Georgia Tech have found a new way to cure concrete. The old
method (developed in Athens, Georgia) generated concrete with an average strength
of 5000 kg/cm². They cure and test 42 samples. Statistically test the hypothesis that
the new curing method generates stronger concrete.
Let T denote the Georgia Tech (new method) sample and G denote the Georgia (old method) sample.
H0 : µG = µT
Ha : µG < µT
Type I error is falsely claiming the GT concrete is stronger, with the implication that the
new concrete is sold as a better product when it’s actually no better.
Type II error is falsely claiming that the GT concrete is NOT stronger, with the implication
that they are unable to start selling this better product and have to do more R&D.
I personally would set α high (say, α = .1) to control Type II. I can see an argument
in the other direction.