Applied Data Analysis, Spring 2017
Karen Albert ([email protected])
Office hours: Thursdays, 4-5 PM (Hark 302)
[Title slide image: Ukraine, 2016]

Lecture outline
1. Confidence intervals
2. Intro to hypothesis testing

Confidence intervals
Remember: sample percentage = population percentage ± chance error.
In a previous example, 49% = population percentage ± 2.7%.

Provided the normal approximation is appropriate:
• How likely is it that the parameter is within 1 standard error of the sample percentage?
• How likely is it that the parameter is within 2 standard errors?
• How likely is it that the parameter is within 3 standard errors?

What we just calculated were confidence intervals for the population percentage:
[43.6%, 54.4%] is a 95% confidence interval.
Want more "confidence"? [40.9%, 57.1%] is a 99.7% confidence interval.
Note the tradeoff between confidence and precision.

Interpretation
Chances are you want to interpret the interval thusly: there is a 95% chance that the population percentage is between 43.6% and 54.4%.
But that would be completely wrong.

The correct interpretation
Out there in the world, there exists a true percentage. Consider our interval [43.6%, 54.4%]. The true percentage is either in there, or it is not. There is no probability about it. Or, more accurately, the probability is 0 or 1.
What does the 95% refer to? It refers to the procedure.

Interpretation: take 2
If we repeated the procedure over and over again many times, we would expect 95% of the confidence intervals generated to contain the true percentage.

Example
An SRS of 1000 persons is taken to estimate the percentage of Democrats in a large population. It turns out that 543 of the people in the sample are Democrats.
First, find the sample percentage:
543 / 1000 = 0.543

Next, find the standard error:
s.e. = √(0.543 × (1 − 0.543) / 1000) ≈ 0.0158

The interval: 54.3% ± 3.2%

Which is correct?
1. There is about a 95% chance for the percentage of Democrats in the population to be in the range of 54.3% plus or minus 3.2%.
2. 54.3% plus or minus 3.2% is a 95% confidence interval for the sample percentage.
3. 54.3% plus or minus 3.2% is a 95% confidence interval for the population percentage.

CI for averages
The standard error for the average:
s.e. = σ / √n
The CI is then
x̄ ± z(1−α/2) · σ/√n
where z(1−α/2) is the corresponding normal quantile.

Where the equation comes from
1 − α = Pr(−z ≤ (x̄ − µ) / (σ/√n) ≤ z)
      = Pr(−z · σ/√n ≤ x̄ − µ ≤ z · σ/√n)
      = Pr(−x̄ − z · σ/√n ≤ −µ ≤ −x̄ + z · σ/√n)
      = Pr(x̄ − z · σ/√n ≤ µ ≤ x̄ + z · σ/√n)

Question time
If I gave you a coin and asked you whether it is fair, how many times would you have to flip it before you got a result that is impossible given a fair coin?
Answer: an infinite number of times.
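The standard-error arithmetic for the Democrats example above can be spelled out in R (a sketch; the variable names are our own):

```r
n     <- 1000
p_hat <- 543 / n                      # sample percentage: 0.543

se <- sqrt(p_hat * (1 - p_hat) / n)   # standard error of the sample percentage
se                                    # about 0.0158

# 95% confidence interval: sample percentage +/- 2 standard errors
c(p_hat - 2 * se, p_hat + 2 * se)     # about 0.511 to 0.575, i.e. 54.3% +/- 3.2%
```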
There is no way to disprove a probabilistic claim. Instead, we will look for a result that is unlikely given a particular assumption about the data.

Let's start with an example...
A senator introduces a bill to simplify the tax code. She claims that the bill is revenue-neutral: it won't cost either the government or the taxpayer money. Is this the case?
The Treasury Department uses a computer file of 100,000 representative tax returns and gauges the total tax payable under the old rules and under the new rules. Then, they look at the change:
change = tax under the new rules − tax under the old rules
Is the change positive or negative? The senator says that the change is zero. Is she right?

Continuing the example
Let's say that the work has been done on a pilot sample of 100 forms chosen at random. The sample average is x̄ = −$219.
Why doesn't this end the discussion?
We could have arrived at this answer by chance. We need to know how likely it was to get an answer such as this one. We need the standard error...
Let's say the standard deviation is $725. How do we get the standard error? We divide the standard deviation by the square root of the sample size:
$725 / √100 = $72.50

So given that the senator believes that her tax bill is revenue-neutral, how likely is it that we would observe a result of −$219 by chance?

What are the chances?
Remember that the senator says that the change is 0. So,
z = (−$219 − $0) / $72.50 ≈ −3
That is, −$219 is about 3 standard errors below what we expected given revenue-neutrality.

The chances
Using the normal approximation, the area to the left of −3 is about 0.1%. So there is 1 chance in 1000 of getting a value as extreme as, or more extreme than, −$219 if the true mean were 0. Conclusion? Either the senator is wrong, or something very rare has occurred.

Significance tests
What we just did is known as a significance test or a hypothesis test. These tests tell us whether a statistic we observe is due to chance or something else. In other words, significance tests can tell us whether an effect is real or simply due to chance. Let's consider the pieces of a significance test.

Five pieces
• Assumptions
• Hypotheses
• Test statistic
• P-value
• Conclusion

Assumptions
All hypothesis tests make assumptions, and a test is only valid when its assumptions are met.
• Most tests assume that the data are from a random sample. That is, the tests assume that the box model is a good analogy.
• Most tests assume that the underlying population has a particular distribution, usually a normal distribution.
• Each test applies to either quantitative data or categorical data.
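The z-statistic and tail area for the tax-bill example above can be computed in R (a sketch using the numbers from the example):

```r
x_bar <- -219            # average change in the pilot sample, in dollars
s     <- 725             # standard deviation, in dollars
n     <- 100             # pilot sample size

se <- s / sqrt(n)        # standard error: 72.5
z  <- (x_bar - 0) / se   # test statistic under the null of revenue-neutrality
z                        # about -3

pnorm(z)                 # area to the left of z: about 0.001, i.e. 1 in 1000
```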
Hypotheses
The null hypothesis expresses the idea that an observed difference is due to chance. In the last example, the null hypothesis is that the average of "the box" equals $0.
Notation: H0 : µd = $0
The null hypothesis above is that the average difference is equal to $0.

The alternative hypothesis
The alternative hypothesis expresses the idea that an observed difference is not due to chance; i.e., it is "real." In the example we just did, the alternative hypothesis is that the average of "the box" is less than $0.
Notation: H1 : µd < $0
The alternative hypothesis here is that the average difference is less than $0.

Test statistic
We temporarily assume that the null hypothesis is true in order to test it. We assume the null is true, and then calculate the test statistic. In our example, the statistic is
(−$219 − $0) / $72.50 ≈ −3
A test statistic is a measure of the difference between the data we observed and the data that we expected to see assuming that the null is true.

More about test statistics
The test statistic in our case is a z, of course:
z = (observed − expected under the null) / standard error
Tests using the z-statistic are called... wait for it... z-tests!

P-values
The probability of seeing a z-statistic of −3 or smaller given the null hypothesis was 1 in 1000. This probability is known as an observed significance level, or often as a p-value.
• The observed significance level is the chance of getting a test statistic as extreme as, or more extreme than, the observed one.
• The chance is computed on the assumption that the null hypothesis is true.
• The smaller this chance is, the stronger the evidence against the null.

Hmm...
Why does the p-value take into account the area to the left of −3, as opposed to just z = −3?
There are two reasons:
1. What is the probability that z equals exactly −3?
2. It is a question of evidence.

The area to the left...
The data could have turned out differently. Consider if the observed values had turned out to be:
z = (−$239 − $0) / $59 ≈ −4.1  or  z = (−$162 − $0) / $63 ≈ −2.6
The value on the left is stronger evidence against the null, and the value on the right is weaker evidence against the null. But why? Because we are arguing by contradiction.

The logic
• We want to show that the null hypothesis leads to an absurd conclusion and must be rejected.
• Say you get a p-value of 1 in 1000, as we did.
• Imagine many other investigators repeating the experiment. Only 1 in 1000 would get a test statistic as extreme or more extreme than we did, given the null hypothesis is true.
• So the null hypothesis has created an absurdity and should be rejected.
• The smaller the p-value, the more you want to reject the null.

Warning
Be careful... the p-value is not the probability that the null hypothesis is true. Just as with confidence intervals, the null hypothesis is either true, or it is not. The p-value is essentially the probability of seeing a large test statistic... assuming that the null is true.

Conclusion
Small p-values are evidence against the null.
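The claim that smaller p-values mean stronger evidence can be made concrete with pnorm(): the two hypothetical outcomes from the "area to the left" slide give very different tail areas.

```r
# The two hypothetical outcomes from the "area to the left" slide
z_strong <- (-239 - 0) / 59    # about -4.1
z_weak   <- (-162 - 0) / 63    # about -2.6

pnorm(z_strong)                # tail area about 0.000026: very strong evidence
pnorm(z_weak)                  # tail area about 0.005: weaker, but still small
```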
A small p-value indicates an observed value far from what was expected under the null hypothesis. Too far, and we make a decision to reject the null hypothesis. We'll discuss in the future what it means to be small.

Reviewing the pieces
• The null hypothesis expresses the idea that an observed result is due to chance.
• The alternative hypothesis states that the observed result is not due to chance.
• The test statistic measures the distance between the result we observed and what we expected under the null hypothesis.
• The observed significance level, or p-value, tells us the chance of getting a test statistic as extreme or more extreme than the observed test statistic.

Another example
In one investigator's model, the data are like 400 draws made at random from a large box. The null hypothesis says that the average of the box equals 50; the alternative says that the average of the box is more than 50. In fact, the data averaged out to 52.7 and the standard deviation was 25. What do you conclude?

Example: the hypotheses
The null is H0 : µ = 50. The alternative is H1 : µ > 50.

Example: the standard error
S.E. of x̄ = s/√n = 25/√400 = 1.25

se <- 25/sqrt(400)
se
## [1] 1.25

Example: the test statistic
z = (x̄ − µ0) / s.e.
z = (52.7 − 50) / 1.25 = 2.16

z <- (52.7-50)/se
z
## [1] 2.16

Example: the p-value and conclusion

pnorm(z,0,1,lower.tail=FALSE)
## [1] 0.01538633

The probability of seeing a result as extreme or more extreme than 2.16, given that the true mean is 50, is roughly 1.5%. This is a small probability, so we must conclude that either the null hypothesis is false or something rare has occurred. Which is it, by the way?

The procedure: short version
1. Set up the null and alternative hypotheses.
2. Pick a test statistic to measure the difference between the data and what is expected under the null.
3. Compute the observed significance level.

Why "pick a test statistic"?
The choice of the test statistic depends on the model and the hypothesis being considered. The test we just did is a "one-sample z-test." There are also:
• two-sample z-tests
• t-tests
• χ²-tests
And there are many, many other test statistics as well.

Another example
Someone approached the first 100 students he saw one day at Berkeley and asked in which school or college they were enrolled. The sample included 53 men and 47 women. From the Registrar's data, 25,000 students were registered at Berkeley, and 67% of them were male. Was his sample procedure like taking a random sample?

Answer
H0 : p = 0.67
H1 : p < 0.67

se <- sqrt((0.67*0.33)/100)
se
## [1] 0.04702127

z <- (0.53-0.67)/se
z
## [1] -2.977376

pnorm(z,0,1)
## [1] 0.001453637

pnorm(0.53,0.67,.047)
## [1] 0.00144726

Conclusion
The p-value is therefore about 1 in 1000. Was it like taking a simple random sample?
The p-value is small, so the result cannot be explained by chance. No: the investigator had too many women (and too few men) in his sample, so the procedure was not like taking a simple random sample.

What did we learn?
• Confidence intervals
• Hypothesis testing
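Finally, the "take 2" interpretation of confidence intervals from earlier in the lecture can be checked by simulation: draw many samples from a population with a known percentage, build an interval of ±2 standard errors around each sample percentage, and count how often the intervals cover the truth. A sketch in R (the true percentage of 50%, the sample size, and the number of repetitions are arbitrary choices for illustration):

```r
set.seed(1)                                # for reproducibility
p_true <- 0.50                             # true population percentage (arbitrary)
n      <- 1000                             # sample size
reps   <- 10000                            # number of repeated samples

covered <- replicate(reps, {
  p_hat <- mean(rbinom(n, 1, p_true))      # sample percentage
  se    <- sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
  (p_hat - 2 * se <= p_true) && (p_true <= p_hat + 2 * se)
})

mean(covered)                              # close to 0.95
```

About 95% of the intervals contain the true percentage, but any single interval either does or does not: the 95% describes the procedure, not the interval.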