Probability Distributions (where statistics meets probabilities)

Recall: The histogram

- Histograms are graphs that are created when measured data are sorted into groups, also called 'class intervals' or 'bins'.
- The width of each bar is known as the 'class width'. A class width that is too small may result in a histogram that does not effectively summarise the distribution; at least 5 bars are needed so that a representative display is achieved.
- Histograms can come in many shapes:
  (i) bimodal: a distribution that has two peaks
  (ii) uniform distribution: the height of each bar is roughly equal
  (iii) skewed to the left or right: the direction of the skew is determined by the direction in which the mean has shifted
  (iv) mound-shaped distribution: symmetrical about a line passing through the interval with the greatest frequency

As you can see, histograms come in many shapes and sizes, but a histogram that is symmetrical and bell-shaped is commonly known as a normal distribution (well… actually it is a discrete approximation of a continuous curve).

In statistics we are often interested in situations that result from elements of chance (such as rolling two dice). We look at random variables. The probabilities assigned to the possible values of a random variable are called its probability distribution. A few are of particular interest to us: the binomial probability distribution (discrete), the Poisson distribution (discrete) and the normal distribution (continuous). Sadly, the hypergeometric distribution (discrete) is not in the curriculum but I might mention it if I have time.

Exploration: Fill in the table below for the sum of two dice.

| Sum | 1 | 2 | 3 | 4 | 5 | 6 |
|-----|---|---|---|---|---|---|
| 1   |   |   |   |   |   |   |
| 2   |   |   |   |   |   |   |
| 3   |   |   |   |   |   |   |
| 4   |   |   |   |   |   |   |
| 5   |   |   |   |   |   |   |
| 6   |   |   |   |   |   |   |

Plot the relative frequency* on the graph below.

* The frequency of a value or group of values expressed as a fraction or percent of the whole data set.

[Blank graph: vertical axis "Relative Frequency", horizontal axis "Observations (Sum)"]

What is the mode? What is the median? What is the mean? Add all the relative frequencies: what was the result?
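The exploration above can also be checked by brute force. What follows is a minimal Python sketch, not part of the original notes; it assumes two fair six-sided dice, and the names `sum_counts` and `relative_frequency` are purely illustrative. It lists all 36 equally likely outcomes, tallies each sum, and adds up the relative frequencies.

```python
from collections import Counter
from fractions import Fraction

# Enumerate all 36 equally likely outcomes of two six-sided dice
# (assumption: both dice are fair) and count how often each sum occurs.
sum_counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))

# Relative frequency = count of a sum / total number of outcomes.
total = sum(sum_counts.values())
relative_frequency = {s: Fraction(c, total) for s, c in sorted(sum_counts.items())}

for s, rf in relative_frequency.items():
    print(f"sum {s:2d}: {rf} (about {float(rf):.3f})")

# Adding the relative frequencies of every possible sum.
print("total of relative frequencies:", sum(relative_frequency.values()))
```

Exact fractions are used instead of floats so that the total of the relative frequencies can be compared with 1 without rounding error.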
Discrete Probability Distribution

If you add up the probabilities of all possible outcomes for a discrete random variable, X, you must get a total of 1. The assignment of a probability to each possible value is called a discrete probability distribution. If the discrete random variable can assume any of the values \(x_1, x_2, x_3, \ldots, x_n\), then

\[ \sum_{i=1}^{n} P(X = x_i) = 1 \]

In the case above (the sum of two dice), you get:

\[
\begin{aligned}
\sum_{i=1}^{n} P(X = x_i) &= P(2) + P(3) + P(4) + P(5) + P(6) + P(7) + P(8) + P(9) + P(10) + P(11) + P(12) \\
&= \frac{1}{36} + \frac{2}{36} + \frac{3}{36} + \frac{4}{36} + \frac{5}{36} + \frac{6}{36} + \frac{5}{36} + \frac{4}{36} + \frac{3}{36} + \frac{2}{36} + \frac{1}{36} \\
&= 1
\end{aligned}
\]

Another way to look at it is to make a table:

| x        | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11   | 12   | Sum   |
|----------|------|------|------|------|------|------|------|------|------|------|------|-------|
| P(X = x) | 1/36 | 2/36 | 3/36 | 4/36 | 5/36 | 6/36 | 5/36 | 4/36 | 3/36 | 2/36 | 1/36 | 36/36 |

Example: A discrete random variable X has the following probability distribution:

| x        | 0 | 1  | 2             | 3              |
|----------|---|----|---------------|----------------|
| P(X = x) | p | 2p | \(1 - 2p^2\)  | \(2p - 3p^2\)  |

Find (a) the value of p; (b) P( X 2) ; (c) P( X 2) ; (d) P( X 2 X 2) .

Example: Urn A contains 5 red and 3 blue marbles; urn B contains 2 red and 4 blue marbles. A marble is selected from each urn and the colour noted. Let X represent the number of red marbles selected. Tabulate the probability distribution for X.

The Expected Value (mean for probabilities)

The expected value of the random variable is a measure of the central tendency of X. It is a weighted average or a long-term average. It can be written as

\[ E(X) = \sum_{i=1}^{n} x_i \, P(X = x_i) \]

These are the kind of 'averages' gamblers, stock analysts, brokers, and actuaries look at.

Example: Consider the probability distribution of the number of heads that appear in 3 tosses of a coin.

| x | P(X = x) | xP(X = x) |
|---|----------|-----------|
| 0 | 0.125    | 0.000     |
| 1 | 0.375    | 0.375     |
| 2 | 0.375    | 0.750     |
| 3 | 0.125    | 0.375     |

\[ \sum_{x=0}^{3} x \, P(X = x) = 1.5 \]

The expected number of heads when 3 coins are tossed is 1.5.

Let's play a game with 2 dice. First, let's look at the mean of the distribution.

| x  | P(X = x) | xP(X = x) |
|----|----------|-----------|
| 2  | 1/36     | 2/36      |
| 3  | 2/36     | 6/36      |
| 4  | 3/36     | 12/36     |
| 5  | 4/36     | 20/36     |
| 6  | 5/36     | 30/36     |
| 7  | 6/36     | 42/36     |
| 8  | 5/36     | 40/36     |
| 9  | 4/36     | 36/36     |
| 10 | 3/36     | 30/36     |
| 11 | 2/36     | 22/36     |
| 12 | 1/36     | 12/36     |

\[ \sum_{x=2}^{12} x \, P(X = x) = \frac{252}{36} = 7 \]

Nothing special there since it is what I expected.
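The two expected values above can be reproduced in a few lines. This is a minimal Python sketch under the assumption of a fair coin and fair dice; `expected_value`, `heads` and `two_dice` are illustrative names, not anything defined in these notes.

```python
from fractions import Fraction

def expected_value(dist):
    """Return the sum of x * P(X = x) for a dict mapping value -> probability."""
    return sum(x * p for x, p in dist.items())

# Number of heads in 3 tosses of a fair coin (probabilities from the table above).
heads = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Sum of two fair dice: P(X = x) = (6 - |x - 7|) / 36 for x = 2, ..., 12,
# which reproduces the 1/36, 2/36, ..., 6/36, ..., 1/36 pattern.
two_dice = {x: Fraction(6 - abs(x - 7), 36) for x in range(2, 13)}

mean_heads = expected_value(heads)
mean_dice = expected_value(two_dice)
print("expected number of heads:", mean_heads)
print("expected sum of two dice:", mean_dice)
```

The same helper works for any finite table of values and probabilities, so it can be reused for a game of your own design.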
Now let's make things a little more interesting: You bet $1.00 to play. If you get a seven or eleven, you win $3. Otherwise I keep the $1. What is the mean return of this game?

| x         | 3     | -1     |
|-----------|-------|--------|
| P(X = x)  | 8/36  | 28/36  |
| xP(X = x) | 24/36 | -28/36 |

\[ \sum_{x} x \, P(X = x) = \frac{24}{36} - \frac{28}{36} = -\frac{4}{36} \approx -0.11 \]

This means that in the long run, you would lose about $0.11 for every dollar you put into the game.

Explore: Discuss how this is used in various fields.

Explore: Try making up your own game and find the long-term value (mean).

Explore: Look at the long-term value for a lottery ticket.

The Measure of Spread (standard deviation)

The variance provides a measure of the spread, or variability, of the results. It can be written as

\[ Var(X) = \sigma^2 = E\big((X - \mu)^2\big) = \sum_{i=1}^{n} (x_i - \mu)^2 \, P(X = x_i) \]

or

\[ Var(X) = \sigma^2 = E(X^2) - \mu^2 \]

A clearer measure of spread is the standard deviation, which has the same units as the values. It is simply the square root of the variance and can be written as

\[ Sd(X) = \sigma = \sqrt{E(X^2) - \mu^2} \]

Example: Consider the following probability distribution.

| x        | 2   | 3   | 4   |
|----------|-----|-----|-----|
| P(X = x) | 0.3 | 0.5 | 0.2 |

Find (i) the mean, (ii) the variance and (iii) the standard deviation.

Binomial Probability Distribution

This is a special type of discrete probability distribution. Many of the probabilities that we will look at have to do with a 'success' or 'failure' (either-or) situation which is repeated many times, with each trial independent of the others. These are called Bernoulli trials.

Let's say that the probability of success is p and that of failure is q = 1 - p. We are interested in looking at r successes in n trials (which gives us n - r failures). This yields a binomial probability distribution!!!

Binomial? You mean… Yup, the values of the probabilities are successive terms of the binomial expansion of \((p + q)^n\). Since \(p + q = 1\), the expansion must be equal to 1 (of course, since it contains all possible outcomes).

The probability of obtaining r successes in n independent trials is

\[ P(n, r, p) = \binom{n}{r} p^r (1 - p)^{n - r} \quad \text{for } 0 \le r \le n \]

Look at the tree diagram to convince yourself.
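The spread formulas and the binomial formula above can both be tried out numerically. The following is a minimal Python sketch, not part of the original notes; `mean_var_sd` and `binomial_pmf` are illustrative names, and the values n = 5 and p = 0.3 used for the binomial check are arbitrary assumptions chosen only for demonstration.

```python
from math import comb, sqrt

def mean_var_sd(dist):
    """Mean, variance and standard deviation of a dict value -> probability,
    using Var(X) = E(X^2) - mu^2."""
    mu = sum(x * p for x, p in dist.items())
    ex2 = sum(x * x * p for x, p in dist.items())
    var = ex2 - mu ** 2
    return mu, var, sqrt(var)

def binomial_pmf(n, r, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# The three-value example distribution from the text.
example = {2: 0.3, 3: 0.5, 4: 0.2}
mu, var, sd = mean_var_sd(example)
print(f"mean = {mu:.2f}, variance = {var:.2f}, sd = {sd:.2f}")

# Binomial probabilities for r = 0..n are the terms of (p + q)^n,
# so together they should add up to 1 (n and p here are arbitrary).
n, p = 5, 0.3
pmf = [binomial_pmf(n, r, p) for r in range(n + 1)]
total = sum(pmf)
print("sum of binomial probabilities:", round(total, 12))
```

Floating-point arithmetic is used here, so the printed values are rounded; exact fractions could be substituted if exact answers are wanted.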
We can write B(n, p) as shorthand to say that the distribution is binomial with n trials and probability of success p.

The Expected Value (mean), the Variance and the Standard Deviation of a Binomial Distribution

It can be easily argued that the expected value of a binomial distribution can be found using \(E(X) = np\). Really? Well, think about it: the expected value of one trial is given by \(E(X) = 1 \cdot p + 0 \cdot (1 - p) = p\). The mean of n independent trials is hence \(E(X) = np\), but let's prove it.

\[
\begin{aligned}
E(X) &= \sum_{x=0}^{n} x \, p(x) \\
&= \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x}
\end{aligned}
\]

When \(x = 0\) the term is 0, therefore

\[
\begin{aligned}
E(X) &= \sum_{x=1}^{n} x \, \frac{n!}{x!\,(n-x)!} \, p^x q^{n-x} \\
&= \sum_{x=1}^{n} x \, \frac{n\,(n-1)!}{x\,(x-1)!\,(n-x)!} \, p^x q^{n-x} \\
&= n \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!} \, p^x q^{n-x} \\
&= n \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!} \, p \cdot p^{x-1} q^{n-x} \\
&= np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,\big((n-1)-(x-1)\big)!} \, p^{x-1} q^{n-x} \\
&= np \sum_{x=1}^{n} \binom{n-1}{x-1} p^{x-1} q^{n-x} \\
&= np \sum_{x-1=0}^{n-1} \binom{n-1}{x-1} p^{x-1} q^{(n-1)-(x-1)}
\end{aligned}
\]

The sum describes the terms of another binomial whose number of trials is n - 1 and whose value is x - 1. Therefore:

\[ \sum_{x-1=0}^{n-1} \binom{n-1}{x-1} p^{x-1} q^{(n-1)-(x-1)} = (p + q)^{n-1} = 1 \]

Hence

\[ E(X) = np \sum_{x-1=0}^{n-1} \binom{n-1}{x-1} p^{x-1} q^{(n-1)-(x-1)} = np(1) \]

\[ E(X) = np \]

The variance and standard deviation of a binomial distribution are less obvious but also yield straightforward formulas: \(Var(X) = \sigma^2 = np(1 - p)\) and \(Sd(X) = \sqrt{np(1 - p)}\).

The variance of one trial is given by

\[ \sigma^2 = (1 - \mu)^2 p + (0 - \mu)^2 (1 - p) = (1 - p)^2 p + p^2 (1 - p) = p(1 - p) \]

The variance of n independent trials is hence \(\sigma^2 = np(1 - p)\). Let's prove that:

\[
\begin{aligned}
Var(X) &= \sum_{x=0}^{n} (x - \mu)^2 \, P(X = x) \\
&= \sum_{x=0}^{n} (x - np)^2 \binom{n}{x} p^x q^{n-x} \\
&= \sum_{x=0}^{n} \big(x^2 - 2npx + (np)^2\big) \binom{n}{x} p^x q^{n-x} \\
&= \sum_{x=0}^{n} x^2 \binom{n}{x} p^x q^{n-x} - 2np \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x} + n^2 p^2 \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} \\
&= \sum_{x=0}^{n} (x^2 - x + x) \binom{n}{x} p^x q^{n-x} - 2np \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x} + n^2 p^2 \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} \\
&= \sum_{x=0}^{n} x(x-1) \binom{n}{x} p^x q^{n-x} + \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x} - 2np \sum_{x=0}^{n} x \binom{n}{x} p^x q^{n-x} + n^2 p^2 \sum_{x=0}^{n} \binom{n}{x} p^x q^{n-x} \\
&= \sum_{x=0}^{n} x(x-1) \binom{n}{x} p^x q^{n-x} + np - 2np(np) + n^2 p^2 (1)
\end{aligned}
\]

When \(x = 0\) and \(x = 1\) each respective term is 0, therefore

\[
\begin{aligned}
Var(X) &= \sum_{x=2}^{n} x(x-1) \binom{n}{x} p^x q^{n-x} + np - 2n^2 p^2 + n^2 p^2 \\
&= \sum_{x=2}^{n} x(x-1) \, \frac{n!}{x!\,(n-x)!} \, p^x q^{n-x} + np - n^2 p^2 \\
&= \sum_{x=2}^{n} x(x-1) \, \frac{n(n-1)(n-2)!}{x(x-1)(x-2)!\,(n-x)!} \, p^x q^{n-x} + np - n^2 p^2 \\
&= n(n-1) \sum_{x=2}^{n} \frac{(n-2)!}{(x-2)!\,(n-x)!} \, p^x q^{n-x} + np - n^2 p^2 \\
&= n(n-1) \sum_{x=2}^{n} \binom{n-2}{x-2} p^x q^{n-x} + np - n^2 p^2 \\
&= n(n-1) p^2 \sum_{x=2}^{n} \binom{n-2}{x-2} p^{x-2} q^{n-x} + np - n^2 p^2 \\
&= n(n-1) p^2 \sum_{x-2=0}^{n-2} \binom{n-2}{x-2} p^{x-2} q^{(n-2)-(x-2)} + np - n^2 p^2
\end{aligned}
\]

The sum describes the terms of another binomial whose number of trials is n - 2 and whose value is x - 2. Therefore:

\[ \sum_{x-2=0}^{n-2} \binom{n-2}{x-2} p^{x-2} q^{(n-2)-(x-2)} = (p + q)^{n-2} = 1 \]

Hence

\[
\begin{aligned}
Var(X) &= n(n-1) p^2 (1) + np - n^2 p^2 \\
&= n^2 p^2 - np^2 + np - n^2 p^2 \\
&= np(1 - p)
\end{aligned}
\]

Example: A die is tossed 180 times. Find the mean and standard deviation of the random variable representing the total number of sixes obtained.

Example: How many throws of two dice are required to ensure that the probability of obtaining at least one "double six" is greater than 0.95? (Note: there are two ways to do this. Using the equation above will yield a situation that requires you to guess and check. You can also try 1 - P(no double six in n trials).)

Example: Five percent of a large consignment of fruit is inedible. Find the probability that in a random selection of 10 pieces of fruit from this consignment, exactly two pieces are inedible. (Note: these are not really independent events, but the difference will be negligible for very large consignments.) What are the mean and standard deviation in this case?

Example: A manufacturer finds that 30% of the items produced from one of the assembly lines are defective. During a floor inspection, the manufacturer selects 6 items from this assembly line. Find the probability that the manufacturer finds (a) two defectives (b) at least two defectives.

The Poisson Distribution

The Poisson distribution (not named for the fish) models situations where there is a minimum number of 'successes' (zero) but no maximum number of 'successes'. The probability of getting a large number of successes does, however, fall off rapidly to a very small number. It is used to determine the probability of obtaining a certain number of successes in a certain time (or space) interval.
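Before going further with the Poisson distribution, the four binomial examples above can be checked numerically. This is a minimal Python sketch and only one possible way to set the problems up; `binomial_pmf` and the variable names are illustrative, and the double-six question is answered by stepping through n one throw at a time, using 1 - P(no double six in n trials).

```python
from math import comb, sqrt

def binomial_pmf(n, r, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p ** r * (1 - p) ** (n - r)

# Die tossed 180 times: number of sixes modelled as B(180, 1/6).
n_tosses, p_six = 180, 1 / 6
mean_sixes = n_tosses * p_six                        # np
sd_sixes = sqrt(n_tosses * p_six * (1 - p_six))      # sqrt(np(1 - p))

# Smallest number of throws of two dice with
# P(at least one double six) = 1 - (35/36)^n > 0.95.
p_double_six = 1 / 36
throws = 1
while 1 - (1 - p_double_six) ** throws <= 0.95:
    throws += 1

# Fruit: exactly 2 inedible pieces in a sample of 10, modelled as B(10, 0.05).
p_two_inedible = binomial_pmf(10, 2, 0.05)
mean_inedible = 10 * 0.05
sd_inedible = sqrt(10 * 0.05 * 0.95)

# Assembly line: 6 items, 30% defective, modelled as B(6, 0.3).
p_two_defective = binomial_pmf(6, 2, 0.3)
p_at_least_two = 1 - binomial_pmf(6, 0, 0.3) - binomial_pmf(6, 1, 0.3)

print(mean_sixes, sd_sixes, throws)
print(round(p_two_inedible, 4), mean_inedible, round(sd_inedible, 4))
print(round(p_two_defective, 4), round(p_at_least_two, 4))
```

The loop for the double-six problem is the "guess and check" approach done by machine; solving \(1 - (35/36)^n > 0.95\) with logarithms gives the same threshold.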
Examples of situations where the Poisson distribution is applicable:
- the number of phone calls per hour
- the number of misprints on a page of a book
- the number of fish caught in a day
- the number of car accidents on a given road per month

It is defined by the equation

\[ P(X = x) = \frac{m^x e^{-m}}{x!}, \quad x = 0, 1, 2, 3, 4, \ldots \]

where m is the parameter.

Conditions for a Poisson distribution:
- The average number of occurrences (m) is constant for each time interval.
- The possibility of more than one occurrence in a given time interval is very small (the number of occurrences in a given interval is small… about 10% or less).
- The numbers of occurrences within the intervals are independent of each other.

Exploration

Use a graphing calculator to do the following exploration.

- Find the mean and the standard deviation of the table, which represents the number of errors, X, and their frequency.

| X  | Frequency |
|----|-----------|
| 0  | 3         |
| 1  | 11        |
| 2  | 16        |
| 3  | 18        |
| 4  | 15        |
| 5  | 9         |
| 6  | 5         |
| 7  | 1         |
| 8  | 1         |
| 9  | 0         |
| 10 | 1         |

- Make a table of X and probability P(X = x), using the mean as m in the Poisson model, for x = 0 to 10.

| X  | P(X = x) |
|----|----------|
| 0  |          |
| 1  |          |
| 2  |          |
| 3  |          |
| 4  |          |
| 5  |          |
| 6  |          |
| 7  |          |
| 8  |          |
| 9  |          |
| 10 |          |

- Find the mean and standard deviation of this model. What do you notice? If the sum continued to infinity, what would the mean and standard deviation be?

The Expected Value (mean) and Standard Deviation of a Poisson Distribution

These are quite easy: the mean is m and the variance is m, hence the standard deviation is \(\sqrt{m}\). We can write Po(m) to represent a Poisson distribution of mean m.

Proof

\[
\begin{aligned}
E(X) &= \sum_{x=0}^{\infty} x \, p(x) \\
&= \sum_{x=0}^{\infty} x \, \frac{m^x e^{-m}}{x!} \\
&= \sum_{x=1}^{\infty} x \, \frac{m^x e^{-m}}{x!} \quad \text{since } x \, \frac{m^x e^{-m}}{x!} = 0 \text{ when } x = 0 \\
&= \sum_{x=1}^{\infty} x \, \frac{m \cdot m^{x-1} e^{-m}}{x\,(x-1)!} \\
&= m \sum_{x=1}^{\infty} \frac{m^{x-1} e^{-m}}{(x-1)!} \\
&= m \sum_{x-1=0}^{\infty} \frac{m^{x-1} e^{-m}}{(x-1)!}
\end{aligned}
\]

The sum describes the terms of another Poisson distribution whose value is x - 1, so

\[ \sum_{x-1=0}^{\infty} \frac{m^{x-1} e^{-m}}{(x-1)!} = 1 \]

Therefore:

\[ E(X) = m \]

For the variance,

\[ Var(X) = E(X^2) - E(X)^2 = E(X^2) - m^2 \]

so we need \(E(X^2)\):

\[
\begin{aligned}
E(X^2) &= \sum_{x=0}^{\infty} x^2 \, P(X = x) \\
&= \sum_{x=0}^{\infty} x(x - 1 + 1) \, \frac{m^x e^{-m}}{x!} \\
&= \sum_{x=0}^{\infty} x(x-1) \, \frac{m^x e^{-m}}{x!} + \sum_{x=0}^{\infty} x \, \frac{m^x e^{-m}}{x!} \\
&= \sum_{x=0}^{\infty} x(x-1) \, \frac{m^x e^{-m}}{x!} + m \\
&= \sum_{x=2}^{\infty} x(x-1) \, \frac{m^x e^{-m}}{x!} + m \quad \text{since } x(x-1) \, \frac{m^x e^{-m}}{x!} = 0 \text{ for } x = 0 \text{ and } x = 1 \\
&= \sum_{x=2}^{\infty} \frac{m^2 \cdot m^{x-2} e^{-m}}{(x-2)!} + m \\
&= m^2 \sum_{x=2}^{\infty} \frac{m^{x-2} e^{-m}}{(x-2)!} + m
\end{aligned}
\]

The sum describes the terms of another Poisson distribution whose value is x - 2, so

\[ \sum_{x=2}^{\infty} \frac{m^{x-2} e^{-m}}{(x-2)!} = 1 \]

Therefore:

\[ E(X^2) = m^2 + m \]

\[ Var(X) = E(X^2) - E(X)^2 = m^2 + m - m^2 = m \]

\[ Var(X) = m \]

Example: One gram of a radioactive substance is positioned so that each emission of an alpha particle will flash on a screen. The emissions over 500 periods of 10 seconds' duration are given in the following table:

| Number/Period | 0  | 1   | 2   | 3  | 4  | 5 | 6 | 7 |
|---------------|----|-----|-----|----|----|---|---|---|
| Frequency     | 91 | 156 | 132 | 75 | 33 | 9 | 3 | 1 |

(a) Find the mean of the distribution.
(b) Fit a Poisson model to the data and compare the actual data to that of this model.
(c) Find the standard deviation of the distribution. How close is the variance (the square of the standard deviation) to the m found in (a)?

Example: Top Cars rents cars to tourists. They have four cars, which are hired out on a daily basis. The number of requests per day is distributed according to the Poisson model with a mean of 3. Determine the probability that:
(a) none of the cars are rented;
(b) at least 3 of the cars are rented;
(c) some requests will be refused;
(d) all are hired out given that at least two are.

EXTRA: Hypergeometric Distribution

The hypergeometric distribution is another special type of discrete distribution. We must select a sample from a population without replacement. Each selection must have only a success or failure outcome.

For a population of size N, which is known to contain D 'defective items', if we select a random sample of size n from this population (and do so without replacement), and we define the random variable X to be the number of defectives observed in the sample of size n, then we say that X has a hypergeometric distribution. We have

\[ P(X = x) = \frac{\binom{D}{x} \binom{N - D}{n - x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \ldots, n \]

Example: Of the 15 light bulbs in a box, 5 are defective. Four bulbs are selected at random from the box. Let the random variable X represent the number of defective bulbs selected. Construct a table to represent this distribution and show that the sum of the probabilities is 1.
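Both the Poisson formula and the hypergeometric formula above are easy to evaluate directly. The following is a minimal Python sketch, not part of the original notes, applied to the Top Cars and light bulb examples. `poisson_pmf` and `hypergeom_pmf` are illustrative names. The reading of the Top Cars questions is an assumption: "at least 3 cars rented" is taken to mean at least 3 requests, "requests refused" to mean more than 4 requests, and "all hired out" to mean at least 4 requests.

```python
from math import comb, exp, factorial

def poisson_pmf(x, m):
    """P(X = x) for a Poisson distribution with parameter (mean) m."""
    return m ** x * exp(-m) / factorial(x)

def hypergeom_pmf(x, N, D, n):
    """P(X = x) defectives in a sample of n drawn without replacement
    from a population of N items containing D defectives."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)

def poisson_at_least(k, m):
    """P(X >= k), computed as 1 - P(X <= k - 1)."""
    return 1 - sum(poisson_pmf(x, m) for x in range(k))

# Top Cars: daily requests ~ Po(3), four cars available (assumed reading above).
m = 3
p_none = poisson_pmf(0, m)                       # (a) no requests
p_at_least_3 = poisson_at_least(3, m)            # (b) at least 3 requests
p_refused = poisson_at_least(5, m)               # (c) more than 4 requests
p_all_given_two = poisson_at_least(4, m) / poisson_at_least(2, m)  # (d)

# Light bulbs: N = 15 bulbs, D = 5 defective, sample of n = 4.
bulbs = {x: hypergeom_pmf(x, 15, 5, 4) for x in range(5)}
bulbs_total = sum(bulbs.values())

print(round(p_none, 4), round(p_at_least_3, 4),
      round(p_refused, 4), round(p_all_given_two, 4))
for x, prob in bulbs.items():
    print(x, round(prob, 4))
print("sum of hypergeometric probabilities:", round(bulbs_total, 12))
```

Part (d) uses the conditional probability rule: since "at least 4" is contained in "at least 2", the joint probability is just P(X ≥ 4).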
Example: A sports committee at the local hospital consists of 5 members. A new committee is to be elected from 5 women and 4 men. What is the probability that the committee will consist of 3 women?

Binomial or Hypergeometric?

An accounting population consists of 2000 line items, of which 10% are incorrectly stated. Find the probability that no more than 2 incorrectly stated accounts will be found in a sample of size 10.

Let X = number of incorrectly stated accounts.

P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2)

With Replacement Sampling (binomial)

\[
\begin{aligned}
P(X = 0) &= (0.9)^{10} = .3487 \\
P(X = 1) &= \binom{10}{1} (0.1)^1 (0.9)^9 = .3874 \\
P(X = 2) &= \binom{10}{2} (0.1)^2 (0.9)^8 = .1937
\end{aligned}
\]

P(X ≤ 2) = .9298

Without Replacement Sampling (hypergeometric)

\[
\begin{aligned}
P(X = 0) &= \frac{\binom{1800}{10}}{\binom{2000}{10}} = .3478 \\
P(X = 1) &= \frac{\binom{200}{1} \binom{1800}{9}}{\binom{2000}{10}} = .3884 \\
P(X = 2) &= \frac{\binom{200}{2} \binom{1800}{8}}{\binom{2000}{10}} = .1941
\end{aligned}
\]

P(X ≤ 2) = .9303

Binomial or Hypergeometric?

What is the probability of getting no more than 2 misstated accounts in a simple random sample of size 10 drawn without replacement from an accounting population which has a 10% misstatement rate?

| Pop Size | 20     | 200   | 2000  | Bin Approx |
|----------|--------|-------|-------|------------|
| P(X = 0) | .2368  | .3398 | .3478 | .3487      |
| P(X = 1) | .5263  | .3974 | .3884 | .3874      |
| P(X = 2) | .2368  | .1975 | .1941 | .1937      |
| P(X ≤ 2) | 1.0000 | .9347 | .9303 | .9298      |
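The comparison above can be recomputed for any population size. This is a minimal Python sketch, not part of the original notes; `binomial_cdf` and `hypergeom_cdf` are illustrative names, and the number of misstated items is assumed to be exactly 10% of the population (N // 10).

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p): sampling with replacement."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

def hypergeom_cdf(k, N, D, n):
    """P(X <= k) when n items are drawn without replacement
    from N items of which D are misstated."""
    return sum(comb(D, x) * comb(N - D, n - x) for x in range(k + 1)) / comb(N, n)

sample_size, max_errors = 10, 2

# Binomial approximation with a 10% misstatement rate.
binom = binomial_cdf(max_errors, sample_size, 0.10)

# Exact hypergeometric values for three population sizes,
# assuming exactly 10% of each population is misstated.
hyper = {N: hypergeom_cdf(max_errors, N, N // 10, sample_size)
         for N in (20, 200, 2000)}

for N, prob in hyper.items():
    print(f"population {N:4d}: P(X <= 2) = {prob:.4f}")
print(f"binomial approx: P(X <= 2) = {binom:.4f}")
```

As the population grows relative to the sample, removing an item changes the remaining proportions less and less, which is why the hypergeometric values move toward the binomial one.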