PSTAT 120C Probability and Statistics - Week 9
Fang-I Chu, Varvara Kulikova
University of California, Santa Barbara
May 30, 2012

Topics for review
More on contingency tables
  Higher-order tables
  Forms
  Remark
  Hints for #2, #3, #4 in hw7
Bayesian inference
  Overview and rules
  Examples for illustration

Contingency table: more on contingency tables
Three categories of hypothesis for a three-way table with factors A, B, and C:
(1) Conditionally independent: A|C and B|C are independent.
(2) Marginally independent: A ∩ C is independent of B, or A is independent of B ∩ C.
(3) Fully independent: A, B, and C are all independent.
Remark: when dealing with multiple comparisons, we may want to include a lurking variable.

Hypotheses for multiple comparisons: (1) Conditionally independent
(a) Condition on the new variable (the lurking variable).
(b) Examine whether the other categories are independent.
(c) Construct a separate contingency table for each level of the lurking variable.
(d) A sum of independent χ² statistics is still χ².
(e) Suppose the lurking variable has t levels; then the degrees of freedom are t(r − 1)(c − 1).

Hypotheses for multiple comparisons: (2) Marginally independent
(a) Two of the categories are independent of the third one, while they are related to each other.
(b) Combine every combination of these two categories into one category (as in product form).
(c) Notice that this is again a two-way contingency table.
(d) Suppose the first category has t levels and the second category has r levels; now we have t × r rows.
(e) Degrees of freedom = (rt − 1)(c − 1).

Hypotheses for multiple comparisons: (3) Fully independent
(a) All three categories are independent of each other.
(b) Estimate the row, column, and table marginal totals separately.
(c) The estimate of the expectation is $E_{ijk} = \frac{r_i c_j t_k}{n^2}$.
(d) Degrees of freedom = rtc − r − c − t + 2.
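
The three degrees-of-freedom rules and the expected count under full independence can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical (they do not come from the course), and r, c, t denote the numbers of row, column, and table (lurking variable) levels as in the slides.

```python
def df_conditional(r, c, t):
    """One r x c table per level of the lurking variable: t(r - 1)(c - 1)."""
    return t * (r - 1) * (c - 1)


def df_marginal(r, c, t):
    """Two factors merged into r*t rows, tested against c columns: (rt - 1)(c - 1)."""
    return (r * t - 1) * (c - 1)


def df_full(r, c, t):
    """All three factors mutually independent: rtc - r - c - t + 2."""
    return r * t * c - r - c - t + 2


def expected_full(row_total, col_total, table_total, n):
    """Expected cell count E_ijk = r_i * c_j * t_k / n^2 under full independence."""
    return row_total * col_total * table_total / n ** 2


# Example with r = 4, c = 3, t = 2 levels.
print(df_conditional(4, 3, 2), df_marginal(4, 3, 2), df_full(4, 3, 2))
```

Keeping the three rules side by side makes it easy to see that they differ only in how many marginal parameters are estimated from the data.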

#2 in hw7
A Gallup survey asked each respondent whether they thought abortion laws should be stricter. The sociologists are interested in whether there is a generational difference in attitudes about abortion. The survey respondents were classified by generation ("18-49 years old" and "more than 50 years old") and sex.
(a) The cross tabulation of generation and response is

          want stricter laws   don't want stricter laws
18-49            188                     328
50+              217                     282

Test whether or not opinions about abortion laws are independent of generation. Use α = 0.01.

#2 (a) continued
Hints: table of observed counts with expected values (E)

                   want stricter laws   don't want stricter laws   total
18-49  observed          188                     328                 516
       E                 205.9                   310.1
50+    observed          217                     282                 499
       E                 199.1                   299.9
total                    405                     610                1015

Use the formula $X^2 = \sum_{i=1}^{k} \frac{(E_i - O_i)^2}{E_i}$ to compute the test statistic.
Obtained $X^2 = 5.26$. To draw a conclusion we need to compare $X^2$ with $\chi^2_{1,0.01}$.
Note: degrees of freedom = (2 − 1)(2 − 1) = 1.

#2 (b)
We can disaggregate the data to see the effect of sex on the opinion:

                 want stricter laws   don't want stricter laws
women  18-49            79                     134
       50+             128                     137
men    18-49           109                     194
       50+              89                     145

Test whether or not age is conditionally independent of opinion when we condition on the sex of the respondent. Use α = 0.01.

#2 (b) continued
Hint: tables of expected values

Women
                   want stricter laws   don't want stricter laws   total
18-49  observed           79                     134                 213
       E                  92.2                   120.8
       (E−O)²/E           1.90                   1.45
50+    observed          128                     137                 265
       E                 114.8                   150.2
       (E−O)²/E           1.53                   1.17
total                    207                     271                 478

Men
                   want stricter laws   don't want stricter laws   total
18-49  observed          109                     194                 303
       E                 111.7                   191.3
       (E−O)²/E           0.066                  0.039
50+    observed           89                     145                 234
       E                  86.3                   147.7
       (E−O)²/E           0.086                  0.05
total                    198                     339                 537

Compute $X^2$ for women and for men separately and add them; we obtain total $X^2 = 6.29$.
Degrees of freedom = 2(2 − 1)(2 − 1) = 2.

#2 (c)
Write up an explanation of the results that the sociologists would understand.
Hint:
If the conclusion is to fail to reject the null: the data are not significantly different from what would be expected if there were no generational difference in attitudes about abortion laws.
If the conclusion is to reject the null: the data are significantly different from what would be expected if there were no generational difference in attitudes about abortion laws.
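
The arithmetic behind the hints for #2 (a) and (b) can be checked with a short Python sketch. The helper name `chi_square_2way` is hypothetical and not part of the course material; the counts are the ones given in the problem, and only the standard library is used.

```python
def chi_square_2way(table):
    """Pearson X^2 for a two-way table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / n.
            expected = row_totals[i] * col_totals[j] / n
            x2 += (expected - observed) ** 2 / expected
    return x2


# Rows: 18-49, 50+. Columns: want stricter laws, don't want stricter laws.
pooled = [[188, 328], [217, 282]]
women = [[79, 134], [128, 137]]
men = [[109, 194], [89, 145]]

# (a) pooled test of independence, df = (2 - 1)(2 - 1).
x2_pooled = chi_square_2way(pooled)

# (b) conditional independence given sex: add the per-table statistics,
# df = t(r - 1)(c - 1) with t = 2 levels of sex.
x2_conditional = chi_square_2way(women) + chi_square_2way(men)
df_conditional = 2 * (2 - 1) * (2 - 1)

print(round(x2_pooled, 2), round(x2_conditional, 2), df_conditional)
```

The same helper applies to any stratified set of two-way tables: compute one statistic per stratum and sum them, adding the degrees of freedom as well.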
Be cautious about whether the result obtained using the pooled survey data (all the people) differs from the results obtained using the men's and women's data separately.

#3
A survey of some students in a class found the following distribution of eye color, hair color, and gender. Perform the appropriate test of whether these three characteristics are completely independent or not.

Men
Hair / Eye   Brown   Blue   Hazel   Green   Total
Black          32     11     10       3      56
Brown          38     50     25      15     128
Red            10     10      7       7      34
Blond           3     30      5       8      46
Total          83    101     47      33     264

Women
Hair / Eye   Brown   Blue   Hazel   Green   Total
Black          36      9      5       2      52
Brown          81     34     29      14     158
Red            16      7      7       7      37
Blond           4     64      5       8      81
Total         137    114     46      31     328

Eye totals    220    215     93      64     592

You may have to combine two rows or columns to ensure that the expected values are large enough to make the χ² approximation appropriate.

#3 continued
Examine whether the χ² approximation is appropriate: the smallest set of marginal totals is for red-haired (71), green-eyed (64) men (264), giving $E = \frac{(64)(71)(264)}{592^2} \approx 3.42$.
Note that the values of all other expectations are greater than 4.
In order to make the χ² approximation appropriate, we can combine the columns of people with green eyes and hazel eyes.

#3 continued
Combined data:

Men
Hair / Eye   Brown   Blue   Hazel & Green   Total
Black          32     11         13           56
Brown          38     50         40          128
Red            10     10         14           34
Blond           3     30         13           46
Total          83    101         80          264

Women
Hair / Eye   Brown   Blue   Hazel & Green   Total
Black          36      9          7           52
Brown          81     34         43          158
Red            16      7         14           37
Blond           4     64         13           81
Total         137    114         77          328

#3 continued
Using the expected value formula (the product of the three marginal totals divided by $n^2$), we obtain the table of expected values:

Men
Hair / Eye   Brown    Blue    Hazel & Green
Black        17.90   17.49       12.77
Brown        47.40   46.32       33.82
Red          11.77   11.50        8.40
Blond        21.05   20.57       15.02

Women
Hair / Eye   Brown    Blue    Hazel & Green
Black        22.24   21.73       15.87
Brown        58.89   57.55       42.02
Red          14.62   14.29       10.43
Blond        26.15   25.55       18.66

#3 continued
Use the formula $X^2 = \sum_{i=1}^{k} \frac{(E_i - O_i)^2}{E_i}$ to compute the test statistic.
Obtained $X^2 = 163.4$. To draw a conclusion we need to compare $X^2$ with $\chi^2_{17,0.05}$.
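
The expected values and the statistic for #3 can be reproduced with the sketch below. It assumes the combined hazel-and-green column from the tables above; the variable names are illustrative only, and nothing beyond the Python standard library is used.

```python
# Observed counts after combining hazel and green eyes.
# Each inner list is ordered [brown, blue, hazel & green].
observed = {
    "men": {"Black": [32, 11, 13], "Brown": [38, 50, 40],
            "Red": [10, 10, 14], "Blond": [3, 30, 13]},
    "women": {"Black": [36, 9, 7], "Brown": [81, 34, 43],
              "Red": [16, 7, 14], "Blond": [4, 64, 13]},
}
genders = list(observed)
hairs = ["Black", "Brown", "Red", "Blond"]
eyes = range(3)

# One-way marginal totals for each of the three factors.
gender_total = {g: sum(sum(observed[g][h]) for h in hairs) for g in genders}
hair_total = {h: sum(sum(observed[g][h]) for g in genders) for h in hairs}
eye_total = [sum(observed[g][h][e] for g in genders for h in hairs) for e in eyes]
n = sum(gender_total.values())

# Expected count under full independence: product of the three marginal
# totals divided by n^2.
expected = {
    g: {h: [gender_total[g] * hair_total[h] * eye_total[e] / n ** 2 for e in eyes]
        for h in hairs}
    for g in genders
}

x2 = sum(
    (expected[g][h][e] - observed[g][h][e]) ** 2 / expected[g][h][e]
    for g in genders for h in hairs for e in eyes
)
df = 2 * 4 * 3 - 2 - 4 - 3 + 2  # rtc - r - c - t + 2

print(round(x2, 1), df)
```

Computing the marginal totals from the cell counts, rather than typing them in, guards against transcription slips when columns are merged.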
Note: degrees of freedom = rtc − r − c − t + 2 = 2(3)(4) − 2 − 3 − 4 + 2 = 17.

#4
The following data document the fate of a sample of the people aboard the Titanic by their gender and by which class of ticket they held or whether they were part of the crew.

Survivors
class   male   female
1st       62     141
2nd       25      93
3rd       88      90
crew     192      20

Lost
class   male   female
1st      118       4
2nd      154      13
3rd      422     106
crew     670       3

We are interested in testing whether there was a relationship between gender and survival.

#4 (a) What sort of null hypothesis is appropriate?
We are interested in the relationship between gender and survival.
We want to set aside the effect of being in different classes (class is apparently related to survival).
So we want to test whether gender and survival are conditionally independent given class.

#4 (b) Calculate the appropriate χ² statistic and interpret the result.
Reconstruct the data into four 2 × 2 tables:

1st Class   Male   Female   Total
Survived      62     141     203
Lost         118       4     122
Total        180     145     325

2nd Class   Male   Female   Total
Survived      25      93     118
Lost         154      13     167
Total        179     106     285

3rd Class   Male   Female   Total
Survived      88      90     178
Lost         422     106     528
Total        510     196     706

Crew        Male   Female   Total
Survived     192      20     212
Lost         670       3     673
Total        862      23     885

#4 continued
Expected values:

1st Class    Male     Female   Total
Survived    112.43    90.57     203
Lost         67.57    54.43     122
Total       180      145        325

2nd Class    Male     Female   Total
Survived     74.11    43.89     118
Lost        104.89    62.11     167
Total       179      106        285

3rd Class    Male     Female   Total
Survived    128.58    49.42     178
Lost        381.42   146.58     528
Total       510      196        706

Crew         Male     Female   Total
Survived    206.49     5.51     212
Lost        655.51    17.49     673
Total       862       23        885

#4 continued
Use the formula $X^2 = \sum_{i=1}^{k} \frac{(E_i - O_i)^2}{E_i}$ to compute the test statistic.
Obtained $X^2 = 397.54$. To draw a conclusion we need to compare $X^2$ with $\chi^2_{4,0.05}$.
Note: degrees of freedom = 4(2 − 1)(2 − 1) = 4.

Bayesian Inference
Definition
Prior distribution: prior information about the parameter is given by a density function.
Posterior distribution: the information about the parameter after observing the data X is given by the density f(p|X).

Rules
Law of total probability: for a partition $B_1, \dots, B_k$,
$P(A) = \sum_{j=1}^{k} P(A|B_j) P(B_j)$
Bayes rule, discrete case:
$P(B_j|A) = \frac{P(A|B_j) P(B_j)}{P(A)} = \frac{P(A|B_j) P(B_j)}{\sum_{i=1}^{k} P(A|B_i) P(B_i)}$
Bayes rule, continuous case:
$f(y|x) = \frac{f(x|y) f(y)}{\int f(x|y) f(y)\,dy}$

Beta-Binomial model
Y ∼ Bin(n, p)
Prior: p ∼ Be(a, b)
Posterior: p|y ∼ Be(a + y, n − y + b)
Bayes point estimator: $E[p|y] = \frac{y + a}{n + a + b}$

Example: Beta-Binomial
An NBA rookie misses his first 10 free throws. What is the Bayes estimate of this player's free-throw success probability p? Consider the following Be(a, b) priors:
Non-informative: p ∼ Be(1, 1) or p ∼ Be(1/2, 1/2)
95% of NBA players are at 75% ± 2*10%, which implies p ∼ Be(11, 3) and Pr[0.55 < p < 0.95] ≈ 0.95
College stats are 660/724 free throws, which implies that the prior is p ∼ Be(660, 64)
Then the posterior distribution is Be(a + 0, b + 10), and its mean is the Bayes estimator: $E[p|y] = \frac{a + 0}{10 + a + b}$.
Non-informative, Be(1, 1): $E[p|y] = \frac{1 + 0}{10 + 1 + 1} = 1/12 \approx 0.08$
NBA players, Be(11, 3): $E[p|y] = \frac{11 + 0}{10 + 11 + 3} = 11/24 \approx 0.46$
College, Be(660, 64): $E[p|y] = \frac{660 + 0}{10 + 660 + 64} = 660/734 \approx 0.90$

[Figure: posterior densities f(p|y) plotted against p for the three priors (non-informative, NBA players, college), with the posterior means marked.]
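
The Beta-Binomial update in the free-throw example can be written out as a small Python sketch. The function names are hypothetical and chosen for illustration; the priors and the data (y = 0 makes in n = 10 attempts) are those of the example.

```python
def beta_binomial_posterior(a, b, y, n):
    """Posterior Be(a + y, b + n - y) after y successes in n Bin(n, p) trials."""
    return a + y, b + n - y


def posterior_mean(a, b, y, n):
    """Bayes point estimator E[p | y] = (y + a) / (n + a + b)."""
    post_a, post_b = beta_binomial_posterior(a, b, y, n)
    return post_a / (post_a + post_b)


# Priors from the example; the rookie makes y = 0 of his first n = 10 attempts.
priors = {
    "non-informative": (1, 1),
    "NBA players": (11, 3),
    "college": (660, 64),
}
for name, (a, b) in priors.items():
    print(name, round(posterior_mean(a, b, y=0, n=10), 3))
```

The comparison shows how the prior's weight a + b acts like a prior sample size: ten misses dominate the Be(1, 1) prior but barely move the Be(660, 64) prior.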