Section 20: Goodness of Fit Tests
1) Under the terms negotiated by the Supernatural Species Treaty of 2014, U.S. counties should
consist of 85% human residents, 5% werewolves, 5% vampires, and 5% misunderstood
teenagers with supernatural abilities. In Angst County, a sample has 180 humans, 10
werewolves, 6 vampires, and 4 misunderstood teenagers with supernatural abilities. Do we have
evidence at the 1%, 5%, 10% levels that the county is in violation of the treaty?
Category                 Observed  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
Humans                        180       170         10            100          0.588235294
Werewolves                     10        10          0              0                    0
Vampires                        6        10         -4             16                  1.6
Misunderstood teenagers         4        10         -6             36                  3.6
N = 200
sum of (obs - exp)^2 / exp = 5.7882

DF  Chi-Square  P-value
3   5.7882353   0.1224
No, no, no, since 0.1224 isn’t less than 0.01, 0.05, 0.10.
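The arithmetic in Problem 1 can be reproduced with a short Python sketch. The function name `chi_square_gof` and the variable names are illustrative choices, not part of the worksheet; the counts and treaty percentages are taken from the problem.

```python
def chi_square_gof(observed, proportions):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    n = sum(observed)                          # sample size N
    expected = [n * p for p in proportions]    # expected = N * hypothesized proportion
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                     # number of categories minus one
    return expected, stat, df


# Problem 1: humans, werewolves, vampires, misunderstood teenagers
observed = [180, 10, 6, 4]
proportions = [0.85, 0.05, 0.05, 0.05]

expected, stat, df = chi_square_gof(observed, proportions)
print(expected)
print(round(stat, 4), df)
```

The same helper applies to the other problems in this section by swapping in their counts and hypothesized proportions.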
2) An eight-sided die is rolled repeatedly. We get 230 ones, 260 twos, 255 threes, 290 fours, 220
fives, 245 sixes, 268 sevens, and 232 eights. Do we have evidence at the 1%, 5%, 10% levels
that the die is unfair?
Face  Observed  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
1          230       250        -20            400                  1.6
2          260       250         10            100                  0.4
3          255       250          5             25                  0.1
4          290       250         40           1600                  6.4
5          220       250        -30            900                  3.6
6          245       250         -5             25                  0.1
7          268       250         18            324                1.296
8          232       250        -18            324                1.296
N = 2000
sum of (obs - exp)^2 / exp = 14.7920

DF  Chi-Square  P-value
7   14.792      0.0388
No, yes, yes, since 0.0388 is not less than 0.01 but is less than 0.05 and 0.10.
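The p-values in these solutions are upper-tail probabilities of the chi-square distribution. The worksheet does not say how they were obtained, so the following is only a minimal standard-library sketch for whole-number degrees of freedom. The function name `chi2_sf` is hypothetical, and a statistics package would normally be used instead. It is checked here against Problem 2 (statistic 14.792, DF 7).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum exp(-y) * sum over i < df/2 of y^i / i!
        return math.exp(-y) * sum(y ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: the tail for df = 1 is erfc(sqrt(y)); each further pair of
    # degrees of freedom adds one term of the incomplete-gamma recurrence.
    tail = math.erfc(math.sqrt(y))
    for i in range((df - 1) // 2):
        tail += math.exp(-y) * y ** (i + 0.5) / math.gamma(i + 1.5)
    return tail


p_value = chi2_sf(14.792, 7)
print(round(p_value, 4))

# Decision at each significance level: there is evidence when p-value < alpha
for alpha in (0.01, 0.05, 0.10):
    print(alpha, "evidence" if p_value < alpha else "no evidence")
```

Comparing the p-value with each level in turn gives the three-part answers ("No, yes, yes" and so on) used throughout this section.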
3) The statistics for Scotland show that 40% of Scots have medium-brown hair, 35% have
dark-brown hair, 20% have blond hair, 4% have red hair, and 1% have black hair. A simple random
sample of Scots has 3190 with medium-brown hair, 2900 with dark-brown hair, 1500 with blond
hair, 330 with red hair, and 80 with black hair. Do we have evidence at the 1%, 5%, 10% levels
that the statistics are mistaken?
Hair color    Observed  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
Medium-brown      3190      3200        -10            100              0.03125
Dark-brown        2900      2800        100          10000          3.571428571
Blond             1500      1600       -100          10000                 6.25
Red                330       320         10            100               0.3125
Black               80        80          0              0                    0
N = 8000
sum of (obs - exp)^2 / exp = 10.1652

DF  Chi-Square  P-value
4   10.165179   0.0377
No, yes, yes, since 0.0377 is not less than 0.01 but is less than 0.05 and 0.10.
4) The official census figures for Bullfrog, ND for household types say that 26% are married
with children, 29% are married with no children, 9% are single parent, 25% are one person, and
11% are "other". A simple random sample from Bullfrog, ND gives 106, 125, 25, 90, and 54
respectively. Do we have evidence at the 1%, 5%, 10% levels that the official census figures are
inaccurate?
Household type         Observed  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
Married with children       106       104          2              4          0.038461538
Married, no children        125       116          9             81          0.698275862
Single parent                25        36        -11            121          3.361111111
One person                   90       100        -10            100                    1
Other                        54        44         10            100          2.272727273
N = 400
sum of (obs - exp)^2 / exp = 7.3706

DF  Chi-Square  P-value
4   7.3705758   0.1176
No, no, no, since 0.1176 isn’t less than 0.01, 0.05, 0.10.
5) The Fish and Game Department stocked Lake Yawannafish with fish in the following
proportions: 30% catfish, 15% bass, 40% bluegill, and 15% pike. Five years later, a sample of
fish yielded 125 catfish, 80 bass, 220 bluegill, and 75 pike. Do we have evidence at the 1%, 5%,
10% levels that the proportions of the various types of fish in the lake have changed?
Fish      Observed  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
Catfish        125       150        -25            625          4.166666667
Bass            80        75          5             25          0.333333333
Bluegill       220       200         20            400                    2
Pike            75        75          0              0                    0
N = 500
sum of (obs - exp)^2 / exp = 6.5000

DF  Chi-Square  P-value
3   6.5         0.0897
No, no, yes, since 0.0897 is not less than 0.01 or 0.05 but is less than 0.10.
6) Benford’s Law says that in large sets of financial data, the leading digit of a money
amount should be a one 30.1% of the time, a two 17.6% of the time, and so forth:
Leading digit  Proportion
1              0.301
2              0.176
3              0.125
4              0.097
5              0.079
6              0.067
7              0.058
8              0.051
9              0.046
Here are the deficit figures for a certain country of the European Union (2009).
Leading digit  Count
1              41
2              36
3              28
4              14
5              3
6              6
7              7
8              4
9              1
Do we have evidence at the 1%, 5%, 10% levels that the country’s deficit figures are not
genuine?
Expected = rel freq * 140
Digit  Observed  Rel freq  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
1            41     0.301     42.14      -1.14         1.2996          0.030840057
2            36     0.176     24.64      11.36       129.0496          5.237402597
3            28     0.125      17.5       10.5         110.25                  6.3
4            14     0.097     13.58       0.42         0.1764          0.012989691
5             3     0.079     11.06      -8.06        64.9636          5.873743219
6             6     0.067      9.38      -3.38        11.4244          1.217953092
7             7     0.058      8.12      -1.12         1.2544          0.154482759
8             4     0.051      7.14      -3.14         9.8596          1.380896359
9             1     0.046      6.44      -5.44        29.5936          4.595279503
N = 140
sum of (obs - exp)^2 / exp = 24.803587

DF  Chi-Square  P-value
8   24.803587   0.0017
And how! (0.0017 is less than 0.01, 0.05, 0.10).
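The three-decimal proportions in the Benford table agree with the closed form log10(1 + 1/d) for leading digit d. That is a standard statement of Benford's Law which the worksheet itself does not spell out. The sketch below, with illustrative variable names, checks that agreement and rebuilds the expected counts and the chi-square statistic for Problem 6 from the rounded table values.

```python
import math

# Leading-digit counts from Problem 6 (digits 1 through 9)
observed = [41, 36, 28, 14, 3, 6, 7, 4, 1]
# Rounded Benford proportions as printed in the worksheet
table_props = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

# Closed-form Benford proportions, log10(1 + 1/d), compared with the table
exact_props = [math.log10(1 + 1 / d) for d in range(1, 10)]
matches_table = all(abs(e - t) < 0.0005 for e, t in zip(exact_props, table_props))

n = sum(observed)                            # total number of figures
expected = [n * p for p in table_props]      # expected = rel freq * N
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(matches_table)
print(n, round(stat, 4))
```

Using the rounded table values rather than the exact logarithms keeps the result in line with the worksheet's spreadsheet.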
7) We will regard a package of M&M’s as a simple random sample. In it are 212 blue, 147
orange, 103 green, 50 red, 46 yellow, 42 brown. Do we have evidence at the 1%, 5%, 10%
levels that the various colors of M&M’s are not made in equal proportion?
Color   Observed  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
Blue         212       100        112          12544               125.44
Orange       147       100         47           2209                22.09
Green        103       100          3              9                 0.09
Red           50       100        -50           2500                   25
Yellow        46       100        -54           2916                29.16
Brown         42       100        -58           3364                33.64
N = 600
sum of (obs - exp)^2 / exp = 235.4200

DF  Chi-Square  P-value
5   235.42      <0.0001
Yes, yes, yes, since our p-value is much smaller than 0.01, 0.05, 0.10.
8) Random digits generated by a random number generator are supposed to be equally
likely. We run the generator several times and receive:
Digit  Count
0      11
1      12
2      8
3      14
4      7
5      9
6      9
7      8
8      14
9      8
Do we have evidence at the 1%, 5%, 10% levels that the random number generator is not
working as it’s supposed to work?
Digit  Observed  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
0            11        10          1              1                  0.1
1            12        10          2              4                  0.4
2             8        10         -2              4                  0.4
3            14        10          4             16                  1.6
4             7        10         -3              9                  0.9
5             9        10         -1              1                  0.1
6             9        10         -1              1                  0.1
7             8        10         -2              4                  0.4
8            14        10          4             16                  1.6
9             8        10         -2              4                  0.4
N = 100
sum of (obs - exp)^2 / exp = 6.0000

DF  Chi-Square  P-value
9   6           0.7399
No, no, no, since 0.7399 isn’t less than 0.01, 0.05, 0.10.
9) The blood types in a certain country are, according to the official figures, distributed as 45%
O, 30% A, 20% B, and 5% AB. A simple random sample contains 134 O’s, 60 A’s, 53 B’s, and
13 AB’s. Do we have evidence at the 1%, 5%, 10% levels that the official figures are mistaken?
Blood type  Observed  Expected  obs - exp  (obs - exp)^2  (obs - exp)^2 / exp
O                134       117         17            289           2.47008547
A                 60        78        -18            324          4.153846154
B                 53        52          1              1          0.019230769
AB                13        13          0              0                    0
N = 260
sum of (obs - exp)^2 / exp = 6.6432

DF  Chi-Square  P-value
3   6.6431624   0.0842
No, no, yes, since 0.0842 is not less than 0.01 or 0.05 but is less than 0.10.