Chapter 6 Section 1 Day 1 2016s Notes.notebook April 28, 2016

Honors Statistics

3. Notes Quiz Chapter 6 Section 1

Introduction to Chapter 6 Section 1

6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 6.10 6.11 6.12

Length of a rope
Songs on the CD

This distribution is not symmetric; it is skewed to the right. The five-number summary should be used to analyze the data.

This distribution is symmetric and bell-shaped. The mean and standard deviation can be used to analyze the data.

Introductory Example
Suppose the following numbers are placed on ping pong balls and the balls are put in an air mixer (like the lottery). One ball is selected at random each week, the amount is recorded, and the ball is replaced so that it can be used again the next week. Over a 10-week period, what is the average amount of money you could expect to win?
Let x = the number on a randomly selected ping pong ball.

50 50 50 50 50 50 50 50 50 50

Is this a legitimate probability distribution?
1. All individual probabilities are between 0 and 1. √
2. The sum of all probabilities is 1. √

1. Find the probability that the ball is worth more than $4.
2. Find the probability that the ball is worth $4 or more.
3. Find the expected value of the distribution. Interpret this value in context.
4. Find the standard deviation of the distribution. Interpret this value in context.

If many, many students were randomly selected, their GPAs would average 2.78 points.
If a student is randomly selected, their GPA will typically differ from the mean of 2.78 by approximately 0.867 points.

Value of letter grade x:  0     1     2     3     4
Probability:              0.02  0.10  0.20  0.42  0.26

1. Write in words what the meaning of P(X ≥ 3) is. What is this probability?
What is the probability that a randomly selected student in online statistics 101 earned a grade of B or higher?
P(X ≥ 3) = 0.42 + 0.26 = 0.68

2.
Write the event “the student got a grade worse than C” in terms of values of the random variable X. What is the probability of this event?
P(X < 2) = 0.02 + 0.10 = 0.12

3. Sketch a graph of the probability distribution. Describe what you see.
[Histogram: frequency of letter grade earned (vertical axis, 0.1 to 0.5) vs. value of letter grade (horizontal axis)]

4. Find the expected value of the distribution. Interpret this value in context.
µx = 0(0.02) + 1(0.10) + 2(0.20) + 3(0.42) + 4(0.26) = 2.8
If many, many STAT 101 students were randomly selected, their grades would average 2.8 points.

5. Find the standard deviation of the distribution. Interpret this value in context.
σx² = (0 - 2.8)²(0.02) + (1 - 2.8)²(0.10) + (2 - 2.8)²(0.20) + (3 - 2.8)²(0.42) + (4 - 2.8)²(0.26) = 1
σx = √1 = 1
The standard deviation of X is σx = 1. A randomly selected STAT 101 grade will typically differ from the mean (2.8) by about 1 point.

Pair-a-dice
Suppose you roll a pair of fair, six-sided dice. Let T = the sum of the spots showing on the up-faces.

(a) Find the probability distribution of T.
Probability model for the sum of two fair dice
Sum t:        2     3     4     5     6     7     8     9     10    11    12
Probability:  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

(b) Make a histogram of the probability distribution. Describe what you see.
[Histogram: frequency of sum (vertical axis, 0.04 to 0.16) vs. sum of dice (horizontal axis, 2 to 12)]

(c) Find P(T ≥ 5) and interpret the result.
P(T ≥ 5) = 1 - P(T < 5) = 1 - (1/36 + 2/36 + 3/36) = 1 - 6/36 = 30/36 = 0.8333
The probability that a random roll of a pair of dice will add to 5 or more is 0.8333, or 83.3%.

(d) Find the expected value of the distribution. Interpret this value in context.
µT = 2(1/36) + 3(2/36) + 4(3/36) + 5(4/36) + 6(5/36) + 7(6/36) + 8(5/36) + 9(4/36) + 10(3/36) + 11(2/36) + 12(1/36) = 252/36 = 7
If a pair of fair dice is rolled many, many times and the sum of the dots is calculated each time, the sums will average 7.
The expected sum of a pair of randomly rolled dice is 7.
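The pair-of-dice answers can also be checked by listing all 36 equally likely outcomes. The following is a minimal Python sketch, not part of the original notes; the names `dist`, `p_at_least_5`, and `mean_t` are illustrative choices, and exact fractions are used so no rounding is involved.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice
# and accumulate the probability of each sum T.
dist = {}
for a, b in product(range(1, 7), repeat=2):
    dist[a + b] = dist.get(a + b, Fraction(0)) + Fraction(1, 36)

# P(T >= 5) by the complement rule: 1 - P(T < 5)
p_at_least_5 = 1 - sum(p for t, p in dist.items() if t < 5)

# Expected value: sum of t * P(T = t) over all possible sums
mean_t = sum(t * p for t, p in dist.items())

print(p_at_least_5, mean_t)
```

The complement probability comes out as the exact fraction 5/6 (the same as 30/36) and the expected value as 7, agreeing with parts (c) and (d).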
Ask about 1 and 5 A
Skip 4, 12, 16

“Write a program to generate random numbers.”
“I’ve decided to give them free will.”

Toss 4 times
Suppose you toss a fair coin 4 times. Let X = the number of heads you get.
First, list the sample space:
HHHH  THHH
HHHT  THHT
HHTH  THTH
HHTT  THTT
HTHH  TTHH
HTHT  TTHT
HTTH  TTTH
HTTT  TTTT

(a) Find the probability distribution of X.
Number of heads x:  0     1     2     3     4
Probability:        1/16  4/16  6/16  4/16  1/16

(b) Make a histogram of the probability distribution. Describe what you see.
[Histogram: frequency (vertical axis, 0.1 to 0.5) vs. number of heads (horizontal axis)]

(c) Find P(X ≤ 3) and interpret the result.
P(X ≤ 3) = 1 - P(X = 4) = 1 - 1/16 = 15/16 = 0.9375
The probability that 4 tosses of a fair coin result in 3 or fewer heads is 0.9375.

Kids and toys
In an experiment on the behavior of young children, each subject is placed in an area with five toys. Past experiments have shown that the probability distribution of the number X of toys played with by a randomly selected subject is as follows:
Number of toys x:  0     1     2     3     4     5
Probability:       0.03  0.16  0.30  0.23  0.17  0.11

(a) Write the event “plays with at most two toys” in terms of X. What is the probability of this event?
P(X ≤ 2) = 0.03 + 0.16 + 0.30 = 0.49

(b) Describe the event X > 3 in words. What is its probability? What is the probability that X ≥ 3?
The event X > 3 is that the young child plays with more than 3 toys.
P(X > 3) = 0.17 + 0.11 = 0.28
P(X ≥ 3) = 0.28 + 0.23 = 0.51

Benford’s law
Faked numbers in tax returns, invoices, or expense account claims often display patterns that aren’t present in legitimate records. Some patterns, like too many round numbers, are obvious and easily avoided by a clever crook. Others are more subtle. It is a striking fact that the first digits of numbers in legitimate records often follow a model known as Benford’s law. Call the first digit of a randomly chosen record X for short.
Benford’s law gives this probability model for X (note that a first digit can’t be 0):
First digit x:  1      2      3      4      5      6      7      8      9
Probability:    0.301  0.176  0.125  0.097  0.079  0.067  0.058  0.051  0.046

(a) Show that this is a legitimate probability distribution.
All individual probabilities are between 0 and 1, and
0.301 + 0.176 + 0.125 + 0.097 + 0.079 + 0.067 + 0.058 + 0.051 + 0.046 = 1

(b) Make a histogram of the probability distribution. Describe what you see.
See histogram above. The distribution is NOT symmetric. It is skewed to the right. The data should be analyzed using the five-number summary.

(c) Describe the event X ≥ 6 in words. What is P(X ≥ 6)?
The event X ≥ 6 is that the first digit of a randomly chosen legitimate record is 6 or greater.
P(X ≥ 6) = 0.067 + 0.058 + 0.051 + 0.046 = 0.222

(d) Express the event “first digit is at most 5” in terms of X. What is the probability of this event?
P(X ≤ 5) = 1 - P(X ≥ 6) = 1 - 0.222 = 0.778

Benford’s law
Refer to Exercise 5. The first digit of a randomly chosen expense account claim follows Benford’s law. Consider the events A = first digit is 7 or greater and B = first digit is odd.

(a) What outcomes make up the event A? What is P(A)?
A = {7, 8, 9}
P(A) = P(X ≥ 7) = 0.058 + 0.051 + 0.046 = 0.155

(b) What outcomes make up the event B? What is P(B)?
B = {1, 3, 5, 7, 9}
P(B) = P(X is odd) = 0.301 + 0.125 + 0.079 + 0.058 + 0.046 = 0.609

(c) What outcomes make up the event “A or B”? What is P(A or B)? Why is this probability not equal to P(A) + P(B)?
“A or B” = {1, 3, 5, 7, 8, 9}
P(A or B) = P(A) + P(B) - P(A and B) = 0.155 + 0.609 - (0.058 + 0.046) = 0.66
The outcomes 7 and 9 belong to both events, so their probabilities are counted twice in P(A) + P(B) and must be subtracted once (the general addition rule).

Kids and toys
Refer to Exercise 4. Calculate the mean of the random variable X and interpret this result in context.
µx = 0(0.03) + 1(0.16) + 2(0.30) + 3(0.23) + 4(0.17) + 5(0.11) = 2.68
If many, many children participated in this experiment, the number of toys they played with would average about 2.68 toys.
(The expected number of toys a randomly selected young child will play with is 2.68.) This statement is optional.

Benford’s law and fraud
A not-so-clever employee decided to fake his monthly expense report. He believed that the first digits of his expense amounts should be equally likely to be any of the numbers from 1 to 9. In that case, the first digit Y of a randomly selected expense amount would have the probability distribution shown in the histogram.

(a) Explain why the mean of the random variable Y is located at the solid red line in the figure.
The mean is the balance point of the distribution, so for a uniform (symmetric) distribution it is located at the center of the histogram, in this case at 5.

(b) The first digits of randomly selected expense amounts actually follow Benford’s law (Exercise 5). According to Benford’s law, what’s the expected value of the first digit? Explain how this information could be used to detect a fake expense report.
µx = 1(0.301) + 2(0.176) + 3(0.125) + 4(0.097) + 5(0.079) + 6(0.067) + 7(0.058) + 8(0.051) + 9(0.046) = 3.441
To detect a fake expense report, compute the sample mean of the first digits of the numbers on the report. A value closer to 3.441 suggests a truthful report, but a value closer to 5 (the uniform distribution) suggests a false report.

(c) What’s P(Y > 6) in the above distribution? According to Benford’s law, what proportion of first digits in the employee’s expense amounts should be greater than 6? How could this information be used to detect a fake expense report?
For the uniform distribution, each digit from 1 to 9 has probability 1/9, so P(Y > 6) = 3/9 ≈ 0.333.
Under Benford’s law, P(X > 6) = 0.058 + 0.051 + 0.046 = 0.155.
To detect a fake expense report, compute the proportion of first digits on the report that are greater than 6. A value closer to 0.155 suggests a truthful report, but a value closer to 0.333 (the uniform distribution) suggests a false report.
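The comparison between the two models can be reproduced numerically. This is a minimal Python sketch, not part of the original notes: the helper names `mean` and `prob_greater` are made up for illustration, and the faked report is assumed to use probability 1/9 for each digit from 1 to 9.

```python
# Benford's law probabilities for first digits 1..9, as given in the notes
benford = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}

# Assumed model for the faked report: each digit 1..9 equally likely
uniform = {d: 1 / 9 for d in range(1, 10)}

def mean(dist):
    """Expected value of a discrete distribution: sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())

def prob_greater(dist, cutoff):
    """P(X > cutoff): add the probabilities of all values above the cutoff."""
    return sum(p for x, p in dist.items() if x > cutoff)

for name, dist in [("Benford", benford), ("uniform", uniform)]:
    print(name, round(mean(dist), 3), round(prob_greater(dist, 6), 3))
```

Under these assumptions the Benford model gives a mean of about 3.441 and a tail probability of about 0.155, while the uniform model gives a mean of 5 and a tail probability of about 0.333.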
Kids and toys
Refer to Exercise 4. Calculate and interpret the standard deviation of the random variable X. Show your work.
σx² = (0 - 2.68)²(0.03) + (1 - 2.68)²(0.16) + (2 - 2.68)²(0.30) + (3 - 2.68)²(0.23) + (4 - 2.68)²(0.17) + (5 - 2.68)²(0.11) = 1.7176
σx = √1.7176 = 1.31057
The standard deviation of X is σx = 1.31057. The number of toys a randomly selected young child will play with will typically differ from the mean (2.68) by about 1.31 toys.

Benford’s law and fraud
Refer to Exercise 13. It might also be possible to detect an employee’s fake expense records by looking at the variability in the first digits of those expense amounts.

(a) Calculate the standard deviation σY. This gives us an idea of how much variation we’d expect in the employee’s expense records if he assumed that first digits from 1 to 9 were equally likely.
Each digit from 1 to 9 has probability 1/9, and the mean is 5.
σY² = (1 - 5)²(1/9) + (2 - 5)²(1/9) + (3 - 5)²(1/9) + (4 - 5)²(1/9) + (5 - 5)²(1/9) + (6 - 5)²(1/9) + (7 - 5)²(1/9) + (8 - 5)²(1/9) + (9 - 5)²(1/9) = 60/9 ≈ 6.667
σY = √6.667 ≈ 2.58

(b) Now calculate the standard deviation of first digits that follow Benford’s law (Exercise 5). Would using standard deviations be a good way to detect fraud? Explain.
σx² = (1 - 3.441)²(0.301) + (2 - 3.441)²(0.176) + (3 - 3.441)²(0.125) + (4 - 3.441)²(0.097) + (5 - 3.441)²(0.079) + (6 - 3.441)²(0.067) + (7 - 3.441)²(0.058) + (8 - 3.441)²(0.051) + (9 - 3.441)²(0.046) ≈ 6.0605
σx = √6.0605 ≈ 2.46
Because the standard deviations are so close (2.58 and 2.46), it would be difficult to tell fake reports from legitimate reports using the standard deviation.
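The standard deviation calculations above follow one recipe: find the mean, average the squared deviations weighted by probability, then take the square root. This is a minimal Python sketch of that recipe, not part of the original notes; the function name `sd` is illustrative, and the uniform model is assumed to put probability 1/9 on each digit.

```python
from math import sqrt

def sd(dist):
    """Standard deviation of a discrete distribution {value: probability}."""
    mu = sum(x * p for x, p in dist.items())               # mean
    var = sum((x - mu) ** 2 * p for x, p in dist.items())  # variance
    return sqrt(var)

# Distributions taken from the exercises in these notes
toys = {0: 0.03, 1: 0.16, 2: 0.30, 3: 0.23, 4: 0.17, 5: 0.11}
benford = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}
# Assumed model for the faked report: each digit 1..9 equally likely
uniform = {d: 1 / 9 for d in range(1, 10)}

for name, dist in [("toys", toys), ("Benford", benford), ("uniform", uniform)]:
    print(name, round(sd(dist), 2))
```

With these inputs the sketch gives roughly 1.31 for the toys, 2.46 for Benford’s law, and 2.58 for the uniform model, so the two first-digit models differ much more in their means than in their spreads.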