PSTAT 120C Probability and Statistics - Week 9

Fang-I Chu, Varvara Kulikova
University of California, Santa Barbara
May 30, 2012
Topics for review
More on contingency tables
Higher-order tables
Forms
Remark
Hints for #2, #3, #4 in hw7
Bayesian Inference
Overview and rules
Examples for illustration
Contingency table
More on contingency tables
Three categories of hypothesis for three factors A, B, and C:
(1) Conditionally independent: A and B are independent given C.
(2) Marginally independent: the combination of A and C (A ∩ C) is
independent of B, or A is independent of the combination B ∩ C.
(3) Fully independent: A, B, and C are all independent.
Remark: when dealing with multiple comparisons, we may
want to include a lurking variable.
Hypothesis for multiple comparisons: (1) Conditionally independent
(a) Condition on the new variable (the lurking variable).
(b) Examine whether the other two categories are independent within
each level of the lurking variable.
(c) Construct a separate contingency table for each level of the
lurking variable.
(d) The sum of independent $\chi^2$ statistics is again $\chi^2$, and the
degrees of freedom add.
(e) If the lurking variable has $t$ levels and each table is $r \times c$,
the degrees of freedom are $t(r-1)(c-1)$.
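Steps (c) to (e) can be sketched in a few lines of Python. This is only an illustration: the function names and the counts are hypothetical, chosen so that the two categories are exactly independent within each level of the lurking variable while the pooled table is not.

```python
def pearson_x2(table):
    """Pearson X^2 for one r x c table of observed counts (a list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (expected - observed) ** 2 / expected
    return x2


def conditional_independence(strata):
    """One table per level of the lurking variable; add the X^2 values."""
    t = len(strata)                           # levels of the lurking variable
    r, c = len(strata[0]), len(strata[0][0])  # shape of each table
    x2 = sum(pearson_x2(table) for table in strata)
    return x2, t * (r - 1) * (c - 1)


# Hypothetical counts: rows are proportional within each level.
level_1 = [[10, 20], [20, 40]]
level_2 = [[5, 5], [15, 15]]
x2, df = conditional_independence([level_1, level_2])

# Pooling the two levels hides the lurking variable.
pooled = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(level_1, level_2)]
pooled_x2 = pearson_x2(pooled)
print(x2, df, pooled_x2)
```

The stratified statistic is zero here while the pooled one is not, which is why the tables are built separately for each level before adding.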
Hypothesis for multiple comparisons: (2) Marginally independent
(a) Two of the categories are independent of the third one, while
they may be related to each other.
(b) Merge every combination of levels of these two categories into
a single category (as in product form).
(c) Notice that this is again a two-way contingency table.
(d) If the first category has $t$ levels and the second has $r$ levels,
the merged table has $t \times r$ rows.
(e) With $c$ columns for the third category, degrees of freedom $= (rt-1)(c-1)$.
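The merging in steps (b) to (e) might look like the following sketch. The layout `counts[k][i][j]` (level k of the first category, level i of the second, column j of the third) and the numbers are assumptions made for illustration only.

```python
def collapse(counts):
    """Turn t tables of shape r x c into one (t*r) x c two-way table."""
    return [row for table in counts for row in table]


def pearson_x2(table):
    """Pearson X^2 for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (row_totals[i] * col_totals[j] / n - o) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, o in enumerate(row)
    )


# Hypothetical data with t = 2, r = 2, c = 3.
counts = [
    [[12, 18, 30], [20, 25, 35]],
    [[8, 22, 30], [15, 30, 45]],
]
flat = collapse(counts)            # 4 rows, 3 columns
t, r, c = len(counts), len(counts[0]), len(counts[0][0])
df = (r * t - 1) * (c - 1)         # (rt - 1)(c - 1)
x2 = pearson_x2(flat)
print(len(flat), df, x2)
```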
Hypothesis for multiple comparisons: (3) Fully independent
(a) All three categories are independent of each other.
(b) Estimate the row, column, and table marginal totals separately.
(c) The estimate of the expectation is
$$E_{ijk} = \frac{r_i c_j t_k}{n^2},$$
where $r_i$, $c_j$, $t_k$ are the row, column, and table marginal totals and $n$ is the grand total.
(d) Degrees of freedom $= rtc - r - c - t + 2$.
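The expectation and the degrees of freedom can be checked with a short derivation. It is a sketch using the standard parameter-counting argument, with $p_{i\cdot\cdot}$, $p_{\cdot j\cdot}$, $p_{\cdot\cdot k}$ as assumed notation for the three marginal probabilities.

```latex
% Under full independence the cell probability factors:
p_{ijk} = p_{i\cdot\cdot}\, p_{\cdot j\cdot}\, p_{\cdot\cdot k}
% Plugging in the estimated marginal proportions r_i/n, c_j/n, t_k/n:
E_{ijk} = n \cdot \frac{r_i}{n} \cdot \frac{c_j}{n} \cdot \frac{t_k}{n}
        = \frac{r_i c_j t_k}{n^2}
% Free cells minus estimated parameters:
\text{df} = (rct - 1) - \big[(r-1) + (c-1) + (t-1)\big]
          = rct - r - c - t + 2
```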
#2 in hw7
A Gallup survey asked each respondent whether they thought
abortion laws should be stricter. The sociologists are interested in
whether there is a generational difference in attitudes about
abortion. The survey respondents were classified by generation
("18-49 years old" and "more than 50 years old") and sex.
(a) The cross tabulation for generation and response is

         want stricter laws   don't want stricter laws
18-49           188                     328
50+             217                     282

Test whether or not opinions about abortion laws are independent
of generation. Use $\alpha = 0.01$.
#2(a) continued...
Hints: table of observed counts with expected values (E)

         want stricter laws   don't want stricter laws   total
18-49           188                     328                516
  E             205.9                   310.1
50+             217                     282                499
  E             199.1                   299.9
total           405                     610               1015

Use the formula
$$X^2 = \sum_{i=1}^{k} \frac{(E_i - O_i)^2}{E_i}$$
to compute the test statistic.
We obtain $X^2 = 5.26$. To draw a conclusion we need to compare
$X^2$ with $\chi^2_{1,0.01}$.
Note: degrees of freedom $= (2-1)(2-1) = 1$.
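The hint for #2(a) can be reproduced numerically. The sketch below uses the counts from the problem; the variable names are illustrative, and the critical value 6.635 is the usual table entry for $\chi^2_{1,0.01}$ rather than something stated on the slides.

```python
observed = [
    [188, 328],  # 18-49: want stricter laws, don't want stricter laws
    [217, 282],  # 50+
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total)(column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

x2 = sum(
    (e - o) ** 2 / e
    for e_row, o_row in zip(expected, observed)
    for e, o in zip(e_row, o_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_DF1_ALPHA_001 = 6.635  # chi-square table value, df = 1, alpha = 0.01
reject = x2 > CRITICAL_DF1_ALPHA_001
print([[round(e, 1) for e in row] for row in expected], round(x2, 2), df, reject)
```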
#2 continued...
(b) We can disaggregate the data to see the effect of sex on the
opinion:

                want stricter laws   don't want stricter laws
women   18-49          79                     134
        50+           128                     137
men     18-49         109                     194
        50+            89                     145

Test whether or not age is conditionally independent of opinion
when we condition on the sex of the respondent. Use $\alpha = 0.01$.
#2(b) continued...
Hint: table of expected values (women)

                     want stricter laws   don't want stricter laws   total
18-49   O                   79                     134                213
        E                   92.2                   120.8
        $(E-O)^2/E$          1.90                    1.45
50+     O                  128                     137                265
        E                  114.8                   150.2
        $(E-O)^2/E$          1.53                    1.17
total                      207                     271                478
#2(b) continued...
Table of expected values (men)

                     want stricter laws   don't want stricter laws   total
18-49   O                  109                     194                303
        E                  111.7                   191.3
        $(E-O)^2/E$          0.066                   0.039
50+     O                   89                     145                234
        E                   86.3                   147.7
        $(E-O)^2/E$          0.086                   0.05
total                      198                     339                537

Compute $X^2$ for women and for men separately; adding them gives the total
$X^2 = 6.29$.
Degrees of freedom $= 2(2-1)(2-1) = 2$.
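The total for #2(b) can be checked by running the same computation once per sex and adding the results. This is a sketch with illustrative names; the critical value 9.210 is the usual table entry for $\chi^2_{2,0.01}$ and is not given on the slides.

```python
def pearson_x2(table):
    """Pearson X^2 for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (expected - observed) ** 2 / expected
    return x2


# Rows: 18-49, 50+. Columns: want stricter laws, don't want stricter laws.
women = [[79, 134], [128, 137]]
men = [[109, 194], [89, 145]]

x2_women = pearson_x2(women)
x2_men = pearson_x2(men)
x2_total = x2_women + x2_men
df = 2 * (2 - 1) * (2 - 1)       # t(r - 1)(c - 1) with t = 2 sexes

CRITICAL_DF2_ALPHA_001 = 9.210   # chi-square table value, df = 2, alpha = 0.01
reject = x2_total > CRITICAL_DF2_ALPHA_001
print(round(x2_women, 2), round(x2_men, 2), round(x2_total, 2), df, reject)
```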
#2 continued...
(c) Write up an explanation of the results that the sociologists
would understand.
Hint:
If the conclusion is to fail to reject the null: the data are not
significantly different from what would be expected if there were no
generational difference in attitudes about abortion laws.
If the conclusion is to reject the null: the data are significantly
different from what would be expected if there were no
generational difference in attitudes about abortion laws.
Be cautious about whether the result obtained using the pooled
survey data (all the people) differs from the results obtained
using the surveys of men and women separately.
#3
A survey of some students in a class found the following
distribution of eye color, hair color, and gender. Perform the
appropriate test of whether these three characteristics are
completely independent or not.

Men
Hair / Eye   Brown   Blue   Hazel   Green   Total
Black           32     11      10       3      56
Brown           38     50      25      15     128
Red             10     10       7       7      34
Blond            3     30       5       8      46
Total           83    101      47      33     264

Women
Hair / Eye   Brown   Blue   Hazel   Green   Total
Black           36      9       5       2      52
Brown           81     34      29      14     158
Red             16      7       7       7      37
Blond            4     64       5       8      81
Total          137    114      46      31     328

Eye totals     220    215      93      64     592

You may have to combine two rows or columns together to ensure
that the expected values are large enough to make the $\chi^2$
approximation appropriate.
#3 continued...
Examine whether the $\chi^2$ approximation is appropriate: the
smallest set of marginal totals is for red-haired (71),
green-eyed (64) men (264), giving
$$E = \frac{(64)(71)(264)}{592^2} \approx 3.42.$$
Note that the values of all other expectations are greater than 4.
To make the $\chi^2$ approximation appropriate, we can
combine the columns of people with green eyes and hazel eyes.
#3 continued...
Combined data:

Men
Hair / Eye   Brown   Blue   Hazel & Green   Total
Black           32     11         13           56
Brown           38     50         40          128
Red             10     10         14           34
Blond            3     30         13           46
Total           83    101         80          264

Women
Hair / Eye   Brown   Blue   Hazel & Green   Total
Black           36      9          7           52
Brown           81     34         43          158
Red             16      7         14           37
Blond            4     64         13           81
Total          137    114         77          328
#3 continued...
Using the expected value formula, the product of the three marginal
totals divided by $n^2$, we obtain the table of expected values:

Men
Hair / Eye   Brown    Blue    Hazel & Green
Black        17.90   17.49       12.77
Brown        47.40   46.32       33.82
Red          11.77   11.50        8.40
Blond        21.05   20.57       15.02

Women
Hair / Eye   Brown    Blue    Hazel & Green
Black        22.24   21.73       15.87
Brown        58.89   57.55       42.02
Red          14.62   14.29       10.43
Blond        26.15   25.55       18.66
#3 continued...
Use the formula
$$X^2 = \sum_{i=1}^{k} \frac{(E_i - O_i)^2}{E_i}$$
to compute the test statistic.
We obtain $X^2 = 163.4$. To draw a conclusion we need to
compare $X^2$ with $\chi^2_{17,0.05}$.
Note: degrees of freedom $= rtc - r - c - t + 2 = 2(3)(4) - 2 - 3 - 4 + 2 = 17$.
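The full-independence calculation for #3 can be sketched as follows, using the combined (Hazel & Green) counts. The array layout `counts[k][i][j]` (sex k, hair i, eye j) and the names are illustrative assumptions, and 27.587 is the usual table entry for $\chi^2_{17,0.05}$.

```python
# Columns: Brown, Blue, Hazel & Green. Rows: Black, Brown, Red, Blond hair.
men = [[32, 11, 13], [38, 50, 40], [10, 10, 14], [3, 30, 13]]
women = [[36, 9, 7], [81, 34, 43], [16, 7, 14], [4, 64, 13]]
counts = [men, women]

T, R, C = len(counts), len(men), len(men[0])
sex_totals = [sum(sum(row) for row in table) for table in counts]
hair_totals = [sum(counts[k][i][j] for k in range(T) for j in range(C))
               for i in range(R)]
eye_totals = [sum(counts[k][i][j] for k in range(T) for i in range(R))
              for j in range(C)]
n = sum(sex_totals)

# E_ijk = (product of the three marginal totals) / n^2.
expected = [[[sex_totals[k] * hair_totals[i] * eye_totals[j] / n ** 2
              for j in range(C)] for i in range(R)] for k in range(T)]

x2 = sum((expected[k][i][j] - counts[k][i][j]) ** 2 / expected[k][i][j]
         for k in range(T) for i in range(R) for j in range(C))
df = T * R * C - T - R - C + 2

smallest_expected = min(e for table in expected for row in table for e in row)
CRITICAL_DF17_ALPHA_005 = 27.587  # chi-square table value, df = 17, alpha = 0.05
reject = x2 > CRITICAL_DF17_ALPHA_005
print(round(x2, 1), df, round(smallest_expected, 2), reject)
```

After combining the two eye-color columns, the smallest expected count is the one for red-haired men with hazel or green eyes, so the check on small cells can be repeated on the merged table.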
#4
The following data document the fate of a sample of the people
aboard the Titanic by their gender and which class of ticket they
were given, or whether they were part of the crew.

Survivors
class   male   female
1st       62     141
2nd       25      93
3rd       88      90
crew     192      20

Lost
class   male   female
1st      118       4
2nd      154      13
3rd      422     106
crew     670       3

We are interested in testing whether there was a relationship
between gender and survival.
#4
(a) What sort of null hypothesis is appropriate?
We are interested in the relationship between gender and survival.
We want to remove the effect of belonging to different classes (class is
apparently related to survival).
So we test whether gender and survival are conditionally
independent given class.
#4
(b) Calculate the appropriate $\chi^2$ statistic and interpret the result.
Reconstruct the data into four 2 × 2 tables:

1st Class   Male   Female   Total
Survived      62     141      203
Lost         118       4      122
Total        180     145      325

2nd Class   Male   Female   Total
Survived      25      93      118
Lost         154      13      167
Total        179     106      285

3rd Class   Male   Female   Total
Survived      88      90      178
Lost         422     106      528
Total        510     196      706

Crew        Male   Female   Total
Survived     192      20      212
Lost         670       3      673
Total        862      23      885
#4 continued...
Table of expected values:

1st Class     Male    Female   Total
Survived    112.43     90.57     203
Lost         67.57     54.43     122
Total       180       145        325

2nd Class     Male    Female   Total
Survived     74.11     43.89     118
Lost        104.89     62.11     167
Total       179       106        285

3rd Class     Male    Female   Total
Survived    128.58     49.42     178
Lost        381.42    146.58     528
Total       510       196        706

Crew          Male    Female   Total
Survived    206.49      5.51     212
Lost        655.51     17.49     673
Total       862        23        885
#4 continued...
Use the formula
$$X^2 = \sum_{i=1}^{k} \frac{(E_i - O_i)^2}{E_i}$$
to compute the test statistic.
We obtain $X^2 = 397.54$. To draw a conclusion we need to
compare $X^2$ with $\chi^2_{4,0.05}$.
Note: degrees of freedom $= 4(2-1)(2-1) = 4$.
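The statistic for #4(b) is the sum of four 2 × 2 statistics, one per class. The sketch below recomputes it from the observed tables; the dictionary layout and names are illustrative, and 9.488 is the usual table entry for $\chi^2_{4,0.05}$.

```python
def pearson_x2(table):
    """Pearson X^2 for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            x2 += (expected - observed) ** 2 / expected
    return x2


# Rows: survived, lost. Columns: male, female.
tables = {
    "1st": [[62, 141], [118, 4]],
    "2nd": [[25, 93], [154, 13]],
    "3rd": [[88, 90], [422, 106]],
    "crew": [[192, 20], [670, 3]],
}

per_class = {name: pearson_x2(table) for name, table in tables.items()}
x2_total = sum(per_class.values())
df = len(tables) * (2 - 1) * (2 - 1)   # t(r - 1)(c - 1) with t = 4

CRITICAL_DF4_ALPHA_005 = 9.488         # chi-square table value, df = 4, alpha = 0.05
reject = x2_total > CRITICAL_DF4_ALPHA_005
print({k: round(v, 2) for k, v in per_class.items()}, round(x2_total, 2), df, reject)
```

Printing the per-class contributions also shows which strata drive the total, which can help when interpreting the result.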
Bayesian Inference
Definition
Prior distribution: prior information about the parameter,
given by a density function.
Posterior distribution: the information about the parameter
after observing the data $X$, given by $f(p|X)$.
Rules
Law of total probability: for a partition $B_1, \dots, B_k$,
$$P(A) = \sum_{j=1}^{k} P(A|B_j)P(B_j)$$
Bayes Rule, discrete case:
$$P(B_j|A) = \frac{P(A|B_j)P(B_j)}{P(A)} = \frac{P(A|B_j)P(B_j)}{\sum_{i=1}^{k} P(A|B_i)P(B_i)}$$
Bayes Rule, continuous case:
$$f(y|x) = \frac{f(x|y)f(y)}{\int f(x|y)f(y)\,dy}$$
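A small numerical sketch of the discrete Bayes rule may help. The function name and the probabilities below are hypothetical and serve only to show how the law of total probability supplies the denominator.

```python
def posterior(priors, likelihoods):
    """Return P(B_j | A) for a partition B_1, ..., B_k.

    priors[j] is P(B_j) and likelihoods[j] is P(A | B_j).
    """
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    p_a = sum(joint)                  # law of total probability
    return [value / p_a for value in joint]


# Hypothetical three-event partition.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.5]
post = posterior(priors, likelihoods)
print([round(p, 3) for p in post])
```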
Beta-Binomial model
Likelihood: $Y \sim \mathrm{Bin}(n, p)$
Prior (Beta): $p \sim \mathrm{Be}(a, b)$
Posterior: $p|y \sim \mathrm{Be}(a + y, n - y + b)$
Bayes point estimator:
$$E[p|y] = \frac{y+a}{n+a+b}$$
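The posterior for the Beta-Binomial model follows from the continuous Bayes rule by keeping only the factors that depend on $p$. The derivation below is a sketch of that standard argument.

```latex
f(p \mid y) \;\propto\; f(y \mid p)\, f(p)
  \;\propto\; p^{y}(1-p)^{n-y} \cdot p^{a-1}(1-p)^{b-1}
  \;=\; p^{(a+y)-1}(1-p)^{(b+n-y)-1}
% This is the kernel of a Be(a + y, b + n - y) density.
% A Be(\alpha, \beta) distribution has mean \alpha / (\alpha + \beta), so
E[p \mid y] = \frac{a+y}{(a+y) + (b+n-y)} = \frac{y+a}{n+a+b}
```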
Example: Beta-Binomial
An NBA rookie misses his first 10 free throws. What is the Bayes
estimate of this player's free-throw success probability $p$?
Consider the following $\mathrm{Be}(a, b)$ priors:
Non-informative: $p \sim \mathrm{Be}(1, 1)$ or $p \sim \mathrm{Be}(1/2, 1/2)$
NBA players: 95% of NBA players shoot 75% ± 2 × 10%, which implies
$p \sim \mathrm{Be}(11, 3)$ and $\Pr[0.55 < p < 0.95] \approx 0.95$
College: college stats are 660/724 free throws, which implies that the prior
is $p \sim \mathrm{Be}(660, 64)$
With $y = 0$ makes in $n = 10$ attempts, the posterior distribution is
$\mathrm{Be}(a + 0, b + 10)$. Its mean is the Bayes estimator,
$$E[p|y] = \frac{a+0}{10+a+b},$$
i.e.
Non-informative, $\mathrm{Be}(1, 1)$: $E[p|y] = \frac{1+0}{10+1+1} = 1/12 \approx 0.08$
NBA players: $E[p|y] = \frac{11+0}{10+11+3} = 11/24 \approx 0.46$
College: $E[p|y] = \frac{660+0}{10+660+64} = 660/734 \approx 0.90$
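The three posterior means can be checked with exact fractions. This is a minimal sketch: the function name is illustrative, and the Be(1/2, 1/2) case is included only because that prior is listed as a non-informative alternative.

```python
from fractions import Fraction


def posterior_mean(a, b, y, n):
    """Mean of the Be(a + y, b + n - y) posterior: (y + a) / (n + a + b)."""
    return Fraction(a + y) / (n + a + b)


y, n = 0, 10  # the rookie made 0 of his first 10 free throws

priors = {
    "Non-informative Be(1, 1)": (1, 1),
    "Non-informative Be(1/2, 1/2)": (Fraction(1, 2), Fraction(1, 2)),
    "NBA players Be(11, 3)": (11, 3),
    "College Be(660, 64)": (660, 64),
}
estimates = {name: posterior_mean(a, b, y, n) for name, (a, b) in priors.items()}
for name, value in estimates.items():
    print(name, value, round(float(value), 3))
```

The stronger the prior (larger a + b), the less ten missed shots move the estimate away from the prior mean.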
[Figure: posterior densities f(p|y) plotted against p on [0, 1] for the three priors (Non-informative, NBA players, College), with the three posterior means marked.]