The binomial distribution gives exact probabilities for the number of successes in a sample from a dichotomous (S-F) population when sampling is done with replacement. When sampling is done without replacement, the binomial distribution gives only approximate probabilities, and the approximation is good provided that the sample size n is small relative to the population size N. The hypergeometric distribution gives exact probabilities when sampling without replacement.

Assumptions:
1. The population or set to be sampled consists of N (finite) individuals, objects, etc.
2. Each individual can be classified as S or F, and there are M successes in the population.
3. A sample of n individuals is selected without replacement in such a way that each subset of size n is equally likely to be chosen.

The random variable X of interest is the number of successes in the sample. Its distribution is denoted by P(X = x) = h(x; n, M, N).

If X is the number of S's in a random sample of size n drawn from a population consisting of M S's and (N - M) F's, then

$$P(X = x) = h(x; n, M, N) = \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}$$

for x an integer satisfying

$$\max(0,\; n + M - N) \le x \le \min(n,\; M).$$

The mean and variance of a hypergeometric random variable X having pmf h(x; n, M, N) are

$$E(X) = n \cdot \frac{M}{N}, \qquad V(X) = \left(\frac{N-n}{N-1}\right) \cdot n \cdot \frac{M}{N}\left(1 - \frac{M}{N}\right).$$

The ratio M/N is the proportion of S's in the population. If we replace M/N by p, we see that the hypergeometric mean np is the same as the binomial mean. The hypergeometric variance is the binomial variance np(1 - p) multiplied by the factor (N - n)/(N - 1), called the finite population correction factor. For n > 1 the correction factor is less than 1 (the hypergeometric has a smaller variance than the binomial), and it is close to 1 when n is small relative to N.

Example: Five individuals from an animal population of 25 are caught, tagged, and released. After they have mixed with the general population, a sample of 10 animals is selected. Let X be the number of tagged animals in this sample of size 10. Compute $P(X = 2)$ and $P(X \le 2)$, the expected number of tagged animals, and the variance of the number of tagged animals.
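The hypergeometric pmf, mean, and variance can be evaluated directly from their definitions. The following is a minimal Python sketch, not part of the original slides: the function names `hypergeom_pmf`, `hypergeom_mean`, and `hypergeom_var` are illustrative choices, and only the standard-library `math.comb` is assumed. It is applied to the tagged-animal example with N = 25, M = 5, and n = 10.

```python
from math import comb


def hypergeom_pmf(x, n, M, N):
    """h(x; n, M, N): probability of x successes in a sample of size n
    drawn without replacement from N items, M of which are successes."""
    lo, hi = max(0, n + M - N), min(n, M)
    if not lo <= x <= hi:
        return 0.0  # x lies outside the support
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)


def hypergeom_mean(n, M, N):
    """E(X) = n * M / N."""
    return n * M / N


def hypergeom_var(n, M, N):
    """V(X) = ((N - n) / (N - 1)) * n * (M / N) * (1 - M / N)."""
    p = M / N
    return (N - n) / (N - 1) * n * p * (1 - p)


# Tagged-animal example (illustrative variable names).
n, M, N = 10, 5, 25
p_eq_2 = hypergeom_pmf(2, n, M, N)                         # P(X = 2)
p_le_2 = sum(hypergeom_pmf(x, n, M, N) for x in range(3))  # P(X <= 2)

print(f"P(X = 2)  = {p_eq_2:.3f}")
print(f"P(X <= 2) = {p_le_2:.3f}")
print(f"E(X) = {hypergeom_mean(n, M, N)}, V(X) = {hypergeom_var(n, M, N):.3f}")
```

Summing `hypergeom_pmf` over the whole support should give 1, which is a quick sanity check on any implementation of this formula.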
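Solution:

$$P(X = 2) = h(2; 10, 5, 25) = \frac{\binom{5}{2}\binom{20}{8}}{\binom{25}{10}} = .385$$

$$P(X \le 2) = \sum_{x=0}^{2} h(x; 10, 5, 25) = .057 + .257 + .385 = .699$$

$$E(X) = \frac{nM}{N} = \frac{(10)(5)}{25} = 2$$

$$V(X) = \left(\frac{N-n}{N-1}\right) \cdot n \cdot \frac{M}{N}\left(1 - \frac{M}{N}\right) = \frac{15}{24}\,(10)\,\frac{5}{25} \cdot \frac{20}{25} = 1$$

If the population size N is actually unknown, it makes sense to equate the observed sample proportion of tagged animals x/n with the population proportion M/N. The estimate of N if x = 2 would then be

$$\hat{N} = \frac{M \cdot n}{x} = \frac{(5)(10)}{2} = 25.$$

The negative binomial random variable and distribution are based on an experiment that satisfies the conditions for a binomial experiment (independent trials, each resulting in S or F, with constant success probability p), except that the number of trials is not fixed in advance. Instead, the experiment continues until a total of r successes have been observed, where r is a positive integer. The random variable of interest is X = the number of failures that precede the r-th success. The distribution is denoted by nb(x; r, p) in the text.

The pmf of a negative binomial random variable X with parameters r and p is

$$nb(x; r, p) = \binom{x + r - 1}{r - 1}\, p^{r} (1 - p)^{x}, \qquad x = 0, 1, 2, \ldots$$

Its mean and variance are

$$E(X) = \frac{r(1 - p)}{p}, \qquad V(X) = \frac{r(1 - p)}{p^{2}}.$$

Example: Suppose that p = P(female birth) = .5. A couple wishes to have exactly two female children in their family. They will have children until this condition is fulfilled. What is the probability that the family has x male children? What is the probability that the family has four children? How many male children would you expect this family to have? How many children would you expect this family to have?

Solution: Here X, the number of male children born before the second female, is negative binomial with r = 2 and p = .5, so

$$P(x \text{ male children}) = nb(x; 2, .5) = (x + 1)(.5)^{x+2}.$$

$$P(\text{four children}) = P(X = 2) = (3)(.5)^{4} = .1875$$

The expected number of male children before the second female is

$$E(X) = \frac{2(1 - .5)}{.5} = 2.$$

The expected number of children is then E(X) + 2 = 4.

When r = 1, X is the number of failures before the first success. The random variable is called geometric in this case (though Y = X + 1, the number of trials up to and including the first success, is also called geometric).

$$P(X = x) = (1 - p)^{x}\, p, \qquad x = 0, 1, 2, \ldots$$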
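The negative binomial results for the family example can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names `nb_pmf`, `nb_mean`, and `nb_var` are illustrative, and only the standard-library `math.comb` is assumed. It uses the same convention as the slides, in which X counts failures before the r-th success.

```python
from math import comb


def nb_pmf(x, r, p):
    """nb(x; r, p): probability of exactly x failures before the r-th success."""
    if x < 0:
        return 0.0
    return comb(x + r - 1, r - 1) * p**r * (1 - p) ** x


def nb_mean(r, p):
    """E(X) = r(1 - p) / p."""
    return r * (1 - p) / p


def nb_var(r, p):
    """V(X) = r(1 - p) / p^2."""
    return r * (1 - p) / p**2


# Family example: stop at the second female birth (r = 2, p = .5).
r, p = 2, 0.5
p_four_children = nb_pmf(2, r, p)      # two males before the second female
expected_males = nb_mean(r, p)
expected_children = expected_males + r  # males plus the two females

print(f"P(four children)  = {p_four_children}")
print(f"E(male children)  = {expected_males}")
print(f"E(total children) = {expected_children}")

# Geometric special case: with r = 1 the pmf reduces to (1 - p)^x * p.
# The value q = 0.3 is an arbitrary illustrative choice.
q = 0.3
geometric_matches = all(
    abs(nb_pmf(x, 1, q) - (1 - q) ** x * q) < 1e-12 for x in range(10)
)
print(f"r = 1 matches the geometric pmf: {geometric_matches}")
```

Note that some libraries parameterize the negative binomial by the total number of trials rather than the number of failures, so the convention should be confirmed before comparing results against this sketch.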