Prediction Intervals for random variable $X$ when $\mu$ is unknown and $\sigma^2$ is known

We know how to give a probabilistic interval for $X$ when $X \sim N(\mu, \sigma^2)$ and both $\mu$ and $\sigma^2$ are known. For example, if $X$ = weight of a chocolate bar in ounces, and we know that $X \sim N(8.1, .25)$, so $\sigma = .5$, then we could say that the probability is .95 that $X$ falls in the interval $8.1 \pm 1.96(.5)$, or, in general, $\mu \pm z_{\alpha/2}\,\sigma$.

Source of uncertainty about the value of $X$ when $\mu$ known: the fact that $X$ is a random variable.

We just learned how to use data to find a confidence interval when we don't know $\mu$, so that we must estimate $\mu$ using $\bar{X}_n$, and when $\sigma^2$ is known, and

1. $X_i$, $i = 1, \ldots, n$ represent a random sample from the population of interest (iid, remember?)
2. Either
   2.1. $X_i \sim N(\mu, \sigma^2)$, $n$ any size, or
   2.2. $X_i \sim \text{Other}(\mu, \sigma^2)$, $n$ large enough to apply CLT
3. Confidence level is $C\% = 100(1 - \alpha)\%$, where $0 < \alpha < 1$.

Suppose we took a sample of 10 chocolate bars and got a sample mean of $\bar{X}_n = 8.2$. A confidence interval for $\mu$ is $8.2 \pm 1.96\,\frac{.5}{\sqrt{10}}$, or, in general, $\bar{x} \pm z_{\alpha/2}\,\frac{\sigma_X}{\sqrt{n}}$.

Source of uncertainty about the value of $\mu$: the fact that we estimated it using $\bar{X}_n$.

But suppose we still want a prediction interval for $X$, the random variable, not just a confidence interval for $\mu$, the population mean.

Sources of uncertainty about the value of $X$ when $\mu$ unknown:

1. The fact that $X$ is a random variable.
2. The fact that we must estimate the mean $\mu$ using $\bar{X}_n$.

To find a prediction interval for $X$ when $\mu$ unknown, notice that

$$\mathrm{Var}(X - \bar{X}_n) = \mathrm{Var}(X) + \mathrm{Var}(\bar{X}_n) = \sigma^2 + \frac{\sigma^2}{n} = \sigma^2\left(1 + \frac{1}{n}\right).$$

Wait! That formula assumes the two RV's being added are independent! Are they? Yes. Remember that the $X_1, \ldots, X_n$ represent a random sample from the population of interest (independent and identically distributed). Think of the plain-old $X$ as $X_{n+1}$. It's independent of the other $X_i$ and identically distributed.

So for a symmetric prediction interval for $X$, with a confidence level of $C$, we use:

$$\bar{x} \pm z_{\alpha/2}\,\sigma_X\sqrt{1 + \frac{1}{n}}.$$

We still say: We are $C\%$ confident that $X$ will fall between $\bar{x} - z_{\alpha/2}\,\sigma_X\sqrt{1 + \frac{1}{n}}$ and $\bar{x} + z_{\alpha/2}\,\sigma_X\sqrt{1 + \frac{1}{n}}$. Why?
Probabilistic statements about $X$ do make sense, because $X$ is a random variable. But we can't make a probabilistic statement about a random variable unless we know its distribution:

   o The name or pdf/pmf of the distribution, in this case, Normal.
   o The value of $\mu$.
   o The value of $\sigma^2$.

We don't know all three of these things for $X$, so we can only make a confidence statement.

Finishing the chocolate bar example: To calculate a 95% prediction interval for $X$, we calculate:

$$8.2 \pm 1.96(.5)\sqrt{1 + \frac{1}{10}} = (7.17,\ 9.23).$$

We say: We are 95% confident that the weight of a randomly selected candy bar will fall between 7.17 and 9.23 ounces.

If we want more precision, we can reduce the width of this interval somewhat by increasing the sample size, but increasing the sample size isn't as effective in reducing the width of a prediction interval as it is for a confidence interval, because we still have the variability of the random variable itself, which we can't divide by $n$.
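The chocolate bar calculation can be checked numerically. The following is a minimal Python sketch, not part of the original handout; the function names `z_critical`, `confidence_interval_mean`, and `prediction_interval` are hypothetical names chosen for illustration. It assumes the setting above: $\sigma$ known, a Normal population (or $n$ large enough for the CLT), and a symmetric interval. It uses the exact Normal quantile rather than the rounded 1.96, which gives the same endpoints to two decimal places.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(conf_level):
    """Upper alpha/2 quantile of N(0, 1) for a two-sided interval."""
    alpha = 1.0 - conf_level
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def confidence_interval_mean(xbar, sigma, n, conf_level=0.95):
    """Interval for the population mean: xbar +/- z * sigma / sqrt(n)."""
    half_width = z_critical(conf_level) * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def prediction_interval(xbar, sigma, n, conf_level=0.95):
    """Interval for a new observation X: xbar +/- z * sigma * sqrt(1 + 1/n)."""
    half_width = z_critical(conf_level) * sigma * sqrt(1.0 + 1.0 / n)
    return xbar - half_width, xbar + half_width


# Chocolate bar example: n = 10 bars, sample mean 8.2 oz, known sigma = 0.5 oz.
ci = confidence_interval_mean(8.2, 0.5, 10)
pi = prediction_interval(8.2, 0.5, 10)
print("95%% CI for the mean: (%.2f, %.2f)" % ci)
print("95%% PI for X:        (%.2f, %.2f)" % pi)

# Effect of sample size: the CI half-width shrinks toward 0, while the
# PI half-width levels off near z * sigma.
for n in (10, 100, 10000):
    ci_lo, ci_hi = confidence_interval_mean(8.2, 0.5, n)
    pi_lo, pi_hi = prediction_interval(8.2, 0.5, n)
    print(n, round((ci_hi - ci_lo) / 2, 4), round((pi_hi - pi_lo) / 2, 4))
```

The loop at the end illustrates the closing remark: as $n$ grows, the confidence interval half-width $z_{\alpha/2}\,\sigma/\sqrt{n}$ keeps shrinking, while the prediction interval half-width can never drop below $z_{\alpha/2}\,\sigma$ (about 0.98 ounces here).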