MAS113 Introduction to Probability and Statistics

Exercises

You will be told which problems to work on each week at the lecture each Monday, and you should try them before your class each Wednesday. Every two weeks, starting in week 3, you will also be asked to hand in solutions to two questions as homework. You will be told what the homework questions are after the tutorials, and you should hand them in at the lecture on the following Monday.

Hints for questions marked with a * are given at the end of this booklet.

1. In each of the following cases, describe a suitable sample space for the experiment and identify the event indicated as a subset of this sample space. [Do not try to assign probabilities.]
(a) Experiment: toss a coin three times. Event: the number of heads is even.
(b) Experiment: count the number r of red tomatoes and the number y of yellow tomatoes grown by a gardener. Event: there are more yellow tomatoes than red ones.
(c) Experiment: measure the quantities a, b, c (in cm) of rainfall in a day at each of three weather stations A, B and C. Event: it is driest at station A.

2. For S = ℕ and a subset A of S, consider the set function g(A) = min(A), the smallest element of the set A, with g(∅) defined to be 0. Give a counter-example to show that g is not a measure.

3. * For any three subsets A, B, C of a sample space S (not necessarily disjoint), prove that for any measure m,
m(A ∪ B ∪ C) = m(A) + m(B) + m(C) − m(A ∩ B) − m(A ∩ C) − m(B ∩ C) + m(A ∩ B ∩ C).
You may quote the result that A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).

4. * Show that if m and n are two measures on a sample space S, and c ≥ 0, then
(a) m + n is a measure on S, where (m + n)(A) = m(A) + n(A),
(b) cm is a measure on S, where (cm)(A) = c × m(A).

5. * Given a sample space S with A ⊆ S, prove that P(Ā) = 1 − P(A). Justify each step in your proof carefully.

6. In the Champions League semi-finals, teams are chosen at random to play each other.
Two teams out of the four are selected at random for the first semi-final, and the remaining two teams play each other in the second semi-final. Assuming that all random selections are equally likely, if two of the four teams are English, what is the probability that they are not drawn to play each other? Show clearly how you have derived your probability.

7. A bookmaker is offering odds on the outcome of the 2017/18 Ashes series between England and Australia. They offer odds of 4/7 against Australia winning the series, 5/2 against England winning the series and 5/1 against a drawn series. By converting these odds to probabilities, show that these odds do not correspond to a valid probability measure (i.e. they do not satisfy the rules of probability). If the odds for England and Australia winning the series are unchanged, what should the odds against a drawn series be to give a valid probability measure?

8. In the game show “Who wants to be a Millionaire?”, a contestant must answer 12 questions correctly to win £1,000,000. Each question is multiple choice, with four possible answers. For each question, explain your reasoning carefully:
(a) If a contestant guesses each time, picking one of the four answers at random, what is the probability of winning the million pound prize?
(b) Now suppose the contestant uses the three “lifelines”: for one question, two false answers are eliminated; for another question, the audience votes for the correct answer; for a third question, the contestant may phone a friend for advice. Suppose that both the audience and the friend give the correct answer on their respective questions. The contestant uses the answers from the audience and the friend, but otherwise picks an answer at random each time. What is the probability of winning the million pound prize?

9.
An online banking website requires its account holders to choose a four-digit pin number, and a case-sensitive password made up of digits and letters (the letters can be upper or lower case). Suppose you choose your pin number and a six-character password at random (for the password, each character has the same chance of being any digit or upper or lower case letter). When logging in, the website asks you to specify three digits from your pin number, and three characters from your password, all chosen randomly, and will give you three attempts if you make a mistake. If someone tries to log into your account, what is the probability that they would gain entry within three attempts? Justify your answer.

10. In the card game of Blackjack, players attempt to get a total of 21 points in their hand. An ace may be counted as 1 point or 11 points, and face cards (kings, queens, and jacks) are counted as 10 points. For each question, explain your reasoning carefully:
(a) A player is dealt two cards from a standard deck of 52. What is the probability that these two cards sum to 21 points?
(b) A player is dealt a 10 and a 6 for his hand, again from a standard deck of 52. His opponent (the dealer) has drawn a 4. If the player draws one more card, what is the probability that the player’s total will exceed 21 points? (You should assume that an ace will be counted as 1 point in this context.)
(c) Suppose instead two standard decks of cards are shuffled together, so that there are 104 cards in total. A player is again dealt two cards. What is the probability that these two cards sum to 21 points?

11. In the game show “Deal or no Deal?”, a contestant must first choose 5 boxes out of 22 to be opened. Each box contains a different amount of money. Once a box has been opened, its contents can no longer be won. For each question, explain your reasoning carefully:
(a) What is the probability that the contestant chooses the most valuable box?
(b) What is the probability that the contestant chooses at least one of the top 5 most valuable boxes?

12. Two events A and B have P(A) = 0.4, P(B) = 0.3 and P(A ∪ B) = 0.58. Are A and B independent?

13. In a group of 100 men, 55 are clean-shaven, 30 have beards and moustaches, 10 have moustaches only, and 5 have beards only. One man is selected at random. For each question, explain your reasoning carefully:
(a) What is the probability that the man has a beard?
(b) If it is known that the selected man has a beard, what is the probability that he has a moustache?
(c) In this group of 100 men, are the events of having a beard and having a moustache independent for a randomly selected man?

14. Suppose England are going to play Spain at football in one month’s time. You judge that if England’s star player is fit, your probability that England win is 0.3, but if he is injured, your probability that England win is 0.2. If your probability that England’s star player will be injured at the time of the game is 0.05, what is your probability that England will win? If, after the match, you found out only that England won, what would your probability be that the star player was injured? Justify your answers.

15. In a production line, it is estimated that 1 in every 200 items will be faulty. At the quality control stage, each item is visually inspected, and it is believed that there is a 90% chance of detecting a faulty item, and a 5% chance of mistakenly declaring a non-faulty item as being faulty. If, at the inspection, an item is declared as faulty, what is the probability that it genuinely is faulty? Define any notation that you introduce, and justify your answer.

16. * An art dealer receives a shipment of five old paintings from abroad, and, on the basis of past experience, she feels that the probabilities are, respectively, 0.76, 0.09, 0.02, 0.01, 0.02 and 0.10 that 0, 1, 2, 3, 4 or all 5 of them are forgeries.
Since the cost of authentication is fairly high, she decides to select one of the five paintings at random and send it away for authentication. If it turns out that this painting is a forgery, what probability should she now assign to the possibility that all the other paintings are also forgeries? Note that the answer is not 0.1.

17. * Three prisoners, A, B and C, have been told by their jailer that one of them, chosen at random, will be executed, and the other two will be freed. Prisoner A says to the jailer, “I know that one of the other prisoners must be freed, so why don’t you tell me which of the two it will be, or tell me one name at random if they are both to be freed? It can’t change my chances of being executed.” The jailer replies, “Once I’ve told you a name, that only leaves two people, so your chances of being executed will have gone up to 1 in 2.” Who is right?

18. A man is on trial for the murder of his wife. The defendant is known to have been violent towards his wife on at least one occasion. The defence argue that this evidence is of little relevance, since fewer than 1 in 2500 men who are violent to their partners go on to murder them. Assuming the figure is right, is the defence’s argument valid?

19. An email spam filter applies the following test to each incoming email: if all of the following conditions are true, the email is declared spam:
• the email has not been sent from a university account;
• the email has not been sent from a distribution list to which the user is subscribed;
• the receiver has never sent an email to the sender;
• the email text does not contain the receiver’s name.
Suppose 80% of all emails are spam, 95% of spam emails are declared spam by the filter test, and 5% of genuine emails are declared spam by the filter test. Calculate the probability that an incoming email will be declared spam, and the probability of a ‘false positive’: the probability that an email that is declared to be spam is in fact genuine.
Define any notation that you introduce, and justify your answer.

20. Form a list of 9 digits using the 9 digits in your student registration number and add the digit 5 to the list. Let X be the number obtained by selecting at random one of the 10 digits from the list, such that each of the 10 digits has the same probability of being selected. Tabulate the probability mass function and cumulative distribution function of X, for integer values of X from 0 to 10 inclusive. Sketch a plot of the cumulative distribution function of X, with the x-axis ranging from −1 to 10. Calculate the standard deviation of X.

21. Suppose, for a lottery scratchcard, the probabilities of different prizes are as follows:

prize    probability
£0       0.689
£2       0.300
£10      0.010
£100     0.001

If a scratchcard costs £1, and you buy one card, tabulate the probability mass function and cumulative distribution function of your profit. Calculate the expectation and standard deviation of your profit. Define any notation that you introduce.

22. A discrete random variable X has expectation 3 and variance 10. Let $Y = (X + 1)^2$. Explain what is wrong with the following, and derive the correct expectation of Y.

$$E(Y) = E\{(X + 1)^2\}$$
$$= E(X^2 + 2X + 1) \qquad (1)$$
$$= E(X)^2 + 2E(X) + 1 \qquad (2)$$
$$= 3^2 + 2 \times 3 + 1 \qquad (3)$$
$$= 16. \qquad (4)$$

23. On a European roulette wheel, the ball can land on one of the integers 0 to 36. Assume that the ball is equally likely to land anywhere. Let X be a random variable that takes the value 1 if the ball lands on an odd number, and 0 otherwise. Calculate the expectation and variance of 5X. Now suppose that the ball is spun 5 times, with Y the total number of times that the ball lands on any odd number from 1 to 35. Does Y have the same expectation and variance as 5X? Justify your answer.

24. * You are asked to provide your subjective probability that it will rain tomorrow. If you provide a value q, you will be paid £q if it rains tomorrow, and £(1 − q) if it does not rain tomorrow.
If your subjective probability of rain tomorrow is p, and you want to maximise your expected earnings, should you be truthful and state q = p, or should you state some other value for q? Your value of q must lie in the interval [0, 1].

25. In a multiple choice test, there are 10 questions, with four answers per question. For each question, only one out of the four answers is correct. If you were to pick one answer at random for each question, calculate the probability of getting exactly 6 out of 10 answers correct. Define any notation that you introduce, and justify your answer.

26. A TV chef claims his free-range chickens taste better than battery-farmed chickens. Ten people are given a sample of each to taste, without knowing which is which, and are asked to state which sample they prefer. It is suggested that the participants cannot actually taste the difference, and are effectively choosing which sample they prefer at random. If this is true,
(a) what probability distribution would you use to describe the number of people who say they prefer the free-range chicken?
(b) Using your distribution in (a), calculate the probability that the number of people who say they prefer the free-range chicken is
i. not 5;
ii. no more than 8.
Define any notation that you introduce, and justify your answers.

27. * Let $X_1 \sim \mathrm{Bin}(n_1, p_1)$ and $X_2 \sim \mathrm{Bin}(n_2, p_2)$, and assume that $X_1$ and $X_2$ are independent.
(a) Find the mean and variance of $X_1 + X_2$.
(b) If $p_1 = p_2$, what distribution would you expect $X_1 + X_2$ to have? Show that this works in the case where $n_1 = n_2 = 2$ and $p_1 = p_2 = 1/2$. If $n_1 = n_2 = 2$, $p_1 = 1/2$ and $p_2 = 1/8$, show that $X_1 + X_2$ does not have a binomial distribution.

28. In a production line, it is estimated that 1 in every 200 items will be faulty. At the quality control stage, each item is visually inspected, and it is believed that there is a 90% chance of detecting a faulty item, and a 5% chance of mistakenly declaring a non-faulty item as being faulty.
In a batch of 10 items, what is the probability of two items being declared faulty at the quality control stage? Define any notation that you introduce, and justify your answer.

29. At a road junction, it is estimated that the mean number of car accidents is 5 per year. Suggest a suitable probability distribution for the number of accidents at the junction next year, and using your distribution, calculate the probability that there will be more than 1 accident at the junction next year.

30. According to data collected by the British Geological Survey, earthquakes of magnitude between 3 and 3.9 on the Richter scale occur in the UK on average (mean) 3 times per year. Suppose we choose to model the number of such earthquakes occurring next year with a Poisson distribution. Suggest a suitable rate parameter for the Poisson distribution. Using this parameter value,
(a) what is the probability that there will be precisely 3 such earthquakes next year?
(b) What is the probability that there will be at least 2 such earthquakes next year?

31. Let X ∼ Poisson(5). If it is known that 4 ≤ X ≤ 6, what is the probability that X = 5? Justify your answer.

32. On a roulette wheel, 18 numbers are coloured red, 18 numbers are coloured black, and 1 number is coloured green. A bet of £x on red will return £x plus the original stake of £x (so that the profit is £x) if the ball lands on a red number. Consider the following betting strategy: bet £y on red, and if you win, stop betting. If you lose, bet £2y on red. Then, if you win, stop betting, but if you lose, keep doubling your stake until you win. From the table below, we see that if you win, your profit will always be £y:

Win on attempt    money spent (£)               payout (£)               profit
1                 y                             2y                       y
2                 y + 2y                        4y                       y
3                 y + 2y + 4y                   8y                       y
4                 y + 2y + 4y + 8y              16y                      y
...               ...                           ...                      ...
n                 $\sum_{a=0}^{n-1} 2^a y$      $2 \times 2^{n-1} y$     y

(Recall that for a geometric series, $\sum_{j=0}^{x-1} k r^j = k(1 - r^x)/(1 - r)$.)
(a) Calculate the expectation and standard deviation of the number of times that you will bet on red.
(b) What is the probability that you will win within your first 20 attempts? In practice, could you use this strategy to guarantee yourself a profit of £y?

33. Suppose you buy one National Lottery ticket each week, until you win a prize. Let Z be the number of tickets you buy until you first win a prize, so that Z ∼ Geometric(p). Clearly, losing one week doesn’t make it more likely that you will win the next week, so which of the two statements below is true? Justify your answer with a proof.
(a) P(Z > z + n | Z > n) = P(Z > z + n),
(b) P(Z > z + n | Z > n) = P(Z > z).
(p is approximately 0.02, but you can leave your proof in terms of p.)

34. * If Y ∼ Poisson(λ), prove that $E(Y^2) = \lambda^2 + \lambda$. In your proof, you should not quote the result that Var(Y) = λ, but you may quote other results from the lecture notes.

35. * Let X ∼ Poisson(µ) and Y ∼ Poisson(µ), with X and Y independent. Derive the probability mass function of Z = X + Y, and state the distribution of Z. You may quote the result that
$$(1 + 1)^n = \sum_{a=0}^{n} \frac{n!}{a!(n - a)!},$$
(which follows from the binomial theorem).

36. Humans each carry 0, 1 or 2 copies of a particular gene. A woman who carries 1 copy of the gene has a child, whose paternity is disputed. Let X be the number of copies of the gene carried by the child, and Y the number of copies carried by their father. Standard genetic theory, plus background information about this particular gene, suggest the following joint distribution:

pX,Y(x, y)    y = 0     y = 1     y = 2
x = 0         16/50     4/50      0
x = 1         16/50     8/50      1/50
x = 2         0         4/50      1/50

(a) Calculate the marginal probability mass functions of X and Y.
(b) If it is known that the child carries only 1 copy of the gene, what would the probability be that the father carries 2 copies? Justify your answer.
(c) Calculate the covariance between X and Y. Are X and Y independent? Justify your answer.

37.
Prove that Cov(X, Y) = E(XY) − E(X)E(Y), and hence prove that Cov(X, Y + Z) = Cov(X, Y) + Cov(X, Z).

38. Let X and Y be the heights (in cm) of two adult sisters.
• Do you think X and Y should be independent?
• Do you think X − Y and X + Y should be independent?
If Var(X) = Var(Y), find Cov(X + Y, X − Y). Comment on your result.

39. Thirty students are enrolled on an MSc course. The possible grades that can be awarded at the end of the course are fail, pass, merit and distinction. At the start of the course, uncertainty about how many students will achieve each grade is to be modelled with a multinomial distribution, and the expected numbers of students achieving each grade are as follows.

grade              fail    pass    merit    distinction
expected number    2       8       15       5

For each question, define any notation that you introduce, and justify your answer.
(a) Using the expected numbers of each grade, state suitable parameter values for the multinomial distribution. How likely is it that the observed numbers of each grade are all equal to their expected values?
(b) What is the probability that more than one student fails?
(c) If four students fail, what is the probability that there will be 5 distinctions?

40. Three players play a game 4 times. Only one player can win each game, and there are no draws. Let X denote the number of times player 1 wins, Y the number of times player 2 wins and Z the number of times player 3 wins. Suppose that, in any single game, player 1 wins with probability 0.5, player 2 wins with probability 0.2 and player 3 wins with probability 0.3.
(a) What is the joint probability distribution of X, Y, Z?
(b) Calculate the probability P(X = 2, Y = 1, Z = 1).
(c) Calculate the probability P(X > 2).
(d) Calculate the probability P(X > 2 | Y = 1).

41. Humans each carry 0, 1 or 2 copies of a particular gene. Let X be the number of copies of the gene carried by a child, and Y the number of copies carried by their father.
Standard genetic theory, plus background information about this particular gene, suggest the following joint distribution:

pX,Y(x, y)    y = 0     y = 1     y = 2
x = 0         16/50     4/50      0
x = 1         16/50     8/50      1/50
x = 2         0         4/50      1/50

In a sample of 10 randomly selected children, all with different fathers, what is the probability that three children carry no copies, five children carry one copy, and two children carry two copies of the gene? Justify your answer.

42. Let Z be a random variable with probability density function
$$f_Z(z) = -\frac{3}{2} z^2 + \frac{3}{2},$$
for z ∈ [0, 1] and 0 otherwise.
(a) Tabulate the cumulative distribution function of Z.
(b) Give a check to show that your cumulative distribution function in part (a) is correct.
(c) Calculate P(Z > 0.5).
(d) Find the expectation and standard deviation of Z.

43. A random variable X has cumulative distribution function given by
$$F_X(x) = \begin{cases} 0 & x \le 0, \\ \dfrac{x^2}{2} & 0 \le x \le 1, \\ 2x - \dfrac{x^2}{2} - 1 & 1 \le x \le 2, \\ 1 & x \ge 2. \end{cases}$$
Explain how to obtain the probability density function of X, and then draw it. Include a check to show that your probability density function is correct. Calculate the expectation and standard deviation of X.

44. Let Y be a random variable with probability density function $f_Y(y) = \exp(-2|y|)$, with $R_Y = \mathbb{R}$.
(a) Calculate P(−1 < Y < 2).
(b) If it is known that Y > 0, calculate P(Y < 1 | Y > 0). (Note that this is not the same as P(0 < Y < 1).)
(c) Find the 75th percentile of the distribution of Y. (Hint: sketch the pdf of Y. If $y_{0.75}$ is the 75th percentile, what must the area under the pdf be between −∞ and $y_{0.75}$?)
(d) Find the cumulative distribution function of Y.

45. Suppose X ∼ Exp(1). Without using a calculator, find the value of P(ln 1 ≤ X ≤ ln 2).

46. Radioactive decay can be modelled using an exponential distribution. It is estimated that the half life of carbon-14 (used for radiocarbon dating) is 5730 years. This means that the probability of one carbon-14 atom decaying into nitrogen-14 within 5730 years is estimated to be 0.5.
Find the expected time taken for one carbon-14 atom to decay into nitrogen-14. Define any notation that you introduce, and justify your answer.

47. A patient with gastroesophageal reflux disease is treated with a new drug to relieve pain from heartburn. Following treatment, the time until the patient next experiences the symptoms is recorded. The doctor treating the patient thinks there is a 50% chance that the patient will stay symptom-free for at least 30 days. For each question, define any notation that you introduce, and justify your answer:
(a) If the time until recurrence of symptoms is to be modelled using an exponential distribution, find the rate parameter of this distribution based on the doctor’s judgement.
(b) What is the probability the patient will remain symptom-free for at least 60 days?
(c) If, after 30 days, the patient has remained symptom-free, what is the probability the patient will be symptom-free for at least another 30 days? (Note that the answer to part (b) is different.)

48. A cyclist leaves a bicycle chained to some railings, and returns five hours later to find that the bike has been stolen. Define T to be the time at which the bike was stolen, counting in hours from when the bike was left by the owner. Assuming that T has a uniform distribution, calculate
(a) the mean of T and $E(T^2)$;
(b) the probability that T lies between 3 and 4 hours;
(c) the 95th percentile of the distribution of T;
(d) the probability that T = 2.

49. If W ∼ Exp(φ), prove that $\mathrm{Var}(W) = 1/\phi^2$. You should not state that $E(W^2) = 2/\phi^2$ without proof. (Hint: you may use the result that if φ > 0, then $x^2 e^{-\phi x} \to 0$ as x → ∞.)

50. (a) In R, generate one uniform U[0, 1] random variable using the command runif(1,0,1). Denoting your observed value of the random variable by u, find the (u × 100)th percentile of the Exp(0.1) distribution. Explain your method carefully. Check your calculation using the qexp command.
(b) In R, repeat the process in part (a) 100,000 times:
• generate 100,000 uniform U[0, 1] random variables, storing the result in a vector u.
• For each element in u, find the corresponding percentile of the Exp(0.1) distribution, using the command qexp(u,0.1), storing the results in a vector y.
Plot a histogram of the 100,000 values in y using the command hist(y,prob=T) (the argument prob=T scales the histogram so that the total area of all the histogram bars is 1). Draw the density function of the Exp(0.1) distribution on top of your histogram, using the command
curve(dexp(x,0.1),from=0,to=100,add=T)
Comment on your result.

51. (a) Let U ∼ U[0, 1] and let $Y = U^2$. Calculate P(Y ≤ 0.09).
(b) Now suppose U ∼ U[0, 1] and let $Y = -\frac{1}{\lambda} \ln(1 - U)$, where λ > 0. Derive an expression for P(Y ≤ y). Comment on your result in relation to your results in Q50.

52. Let X ∼ N(0.5, 0.25). Using the output from an R session below, calculate
(a) P(X > 1.5);
(b) P(0 < X < 1.5).
Note that two of the following four R commands are not relevant!

> pnorm(2,0,1)
[1] 0.9772499
> pnorm(1.5,0,1)
[1] 0.9331928
> pnorm(0,0,1)
[1] 0.5
> pnorm(1,0,1)
[1] 0.8413447

What commands would you use in R to calculate (a) and (b) directly?

53. The total rainfall in Sheffield in January next year is to be modelled using a normal distribution with mean 90mm and standard deviation 30mm. Using the result that P(−1.645 < Z < 1.645) = 0.9, where Z ∼ N(0, 1), calculate an interval $(a_1, b_1)$ such that total rainfall in Sheffield in January has a 90% chance of lying in the interval. State a second interval $(a_2, b_2)$ that has (approximately) a 95% chance of containing the total rainfall in Sheffield in January.

54. For this question, you will need the result that a linear combination of a set of independent normal random variables is another normal random variable (so Z defined below has a normal distribution). Suppose $X_1 \sim N(10, 18)$ and $X_2 \sim N(10, 18)$, with $X_1$ and $X_2$ independent.
Calculate an interval $(a_1, b_1)$ such that $P(Z \in (a_1, b_1)) \approx 0.95$, with
$$Z = \frac{X_1 + X_2}{2}.$$

55. * (a) Let Z ∼ N(0, 1). Find $E(e^Z)$.
(b) Following your result in part (a), if $\ln Y \sim N(\mu, \sigma^2)$, is it true that $E(Y) = e^{\mu}$?

56. In a political opinion poll, 50 voters are asked whether they are in favour of a particular policy, and 37 of them say that they are.
(a) What is the proportion of the people who are against the policy?
(b) If X is the random variable denoting the number of voters consulted in the opinion poll that are in favour of the policy, give the distribution of X. Is this distribution relevant only for this particular experiment?

57. The independent random variables X, Y, Z are normally distributed with E(X) = E(Y) = 1, E(Z) = 1/2, Var(X) = Var(Y) = 1 and Var(Z) = 1/8. Furthermore, two new random variables U, W are defined as U = 2X + 3Y − 8Z and W = 2X − 2Y.
(a) Find the mean and variance of U and W.
(b) Find the distribution of U and W.

58. 50 patients with gastroesophageal reflux disease are treated with a new drug to relieve pain from heartburn. Following treatment, the time $T_i$ until patient i next experiences the symptoms is recorded. The doctor treating the patients thinks there is a 50% chance that patient i will stay symptom-free for at least 30 days. Assuming the times are independent and identically distributed, each with an exponential distribution, find the expectation and variance of
$$\bar{T}^{(50)} = \frac{1}{50} \sum_{i=1}^{50} T_i.$$

59. A random variable X has mean 2 and standard deviation 0.5. What can you say about P((X < −2) ∪ (X > 6))? What can you say about this probability if you learn that the standard deviation is not 0.5, but is in fact 0.25?

60. * Prove Markov’s inequality, that for a continuous random variable X with range [0, ∞),
$$P(X > c) \le \frac{E(X)}{c},$$
for any c > 0.

61. Use the moment generating function for $X \sim N(\mu, \sigma^2)$ to check that E(X) = µ, and $\mathrm{Var}(X) = \sigma^2$. Find $E(X^3)$ and $E(X^4)$ using this technique.

62.
Suppose that the random variable X has an exponential distribution with rate λ. Write down the moment generating function for the random variable −X. Compute $E(X^n)$, and check earlier calculations of the mean and variance of X. The reason for working with −X rather than X is that if t ≥ 0 (and this is in fact sufficient) then all the integrals are finite.

63. Let X ∼ Bin(n, p). Starting from the probability function, find the moment generating function of X and use this technique to confirm that X is the sum of n i.i.d. random variables each of which has a Bernoulli(p) distribution.

64. (a) A continuous random variable X having pdf f has moment generating function $M_X$. If a and b are arbitrary real numbers, show that the moment generating function of the random variable aX + b is given for each t ∈ ℝ by
$$M_{aX+b}(t) = e^{tb} M_X(at).$$
(b) If X is a random variable (discrete or continuous), what can you say about $M_X(0)$?

65. Suppose that each week, the number of calls to a minicab firm has a Poisson distribution with rate parameter 100. Use the central limit theorem to estimate the probability that the total number of calls in one year is no more than 5000.

66. (a) Suppose that the random variable X ∼ Poisson(λ). Show that $M_X(t) = \exp\{\lambda(e^t - 1)\}$.
(b) If $S_n$ is the sum of n i.i.d. random variables, each of which is Poisson(λ), show that $S_n \sim \mathrm{Poisson}(n\lambda)$.
(c) Using the result of (b), revisit question 65 and find a suitable R command to calculate the probability directly, without using the CLT.

67. Using a particular instrument, a measurement X of the mass of an object (in grams) is expressed as $X = \mu + 2(U - \frac{1}{2})$, where U follows the uniform distribution U(0, 1) and µ is the object’s unknown true mass. Calculate the mean and variance of X. You can assume that µ will be much greater than zero, so you need not worry about measurements being less than zero.
(a) Five measurements $X_1, \ldots, X_5$ are taken of the mass of the same object, each with the same distribution as X above.
Find the approximate distribution of $\bar{X} = \frac{1}{5}(X_1 + X_2 + X_3 + X_4 + X_5)$. Do you expect this approximation to work well? Justify your answer.
(b) Carry out some simulation experiments in R to investigate the histogram of $\bar{X}$ in (a). Experiment with different sample sizes n (instead of 5, try 10, 100, etc). For what sample size does the distribution of $\bar{X}$ start to look different from the distribution of the $X_i$ themselves? What size sample is ‘large’, in the sense of practical use of the central limit theorem? You will need to use the function runif in R, and fix some arbitrary value for µ.

Hints to selected questions

3. Try writing D = B ∪ C, then consider how to expand m(A ∪ D) using the results in your notes and in the question.

4. For a function g to be a measure, what requirements must it satisfy? We are told that m and n are measures, so for any A, B ⊆ S with A ∩ B = ∅, what do we know about m(A ∪ B) and n(A ∪ B)?

5. If P is a probability measure, what is the value of P(S)? What is the relationship between S, A and Ā?

16. Good notation is important; using poor notation makes this problem much harder to solve! Examples of poor notation are as follows.
1) The question is asking us for
$$P(5|1) = \frac{P(5 \cap 1)}{P(1)}.$$
This notation is very confusing. What is 5 ∩ 1? If it’s the empty set, aren’t we going to end up with P(5|1) = 0? Is P(1) the (prior) probability that there is exactly one forgery (0.09), or the probability of randomly selecting one painting and discovering that it is a forgery?
2) Let A be the event of 5 forgeries, and B be the event that the selected painting is a forgery. We want
$$P(A|B) = \frac{P(A)P(B|A)}{P(B)}.$$
This is better, but the notation masks the fact that there could be other numbers of forgeries, which is important for calculating P(B).
An example of better notation is: let $E_i$ be the event that there are exactly i forgeries, and F be the event that the randomly selected painting is a forgery. The question is asking for $P(E_5|F)$.
From the definitions of $E_i$ and F, what is $P(F|E_i)$? Now consider how to use Bayes’ theorem to get $P(E_5|F)$.

17. As with Q16, good notation helps. Let $E_A$, $E_B$ and $E_C$ be the events that prisoners A, B and C are to be executed, respectively. Suppose the jailer names prisoner B as being set free. Let F be the event that the jailer chooses to name B as being set free. This event is not the same as $E_B$. We know $P(E_A) = P(E_B) = P(E_C) = 1/3$. Consider the values of $P(F|E_A)$, $P(F|E_B)$ and $P(F|E_C)$, then try using Bayes’ theorem.

24. First suppose that p = 0.6. Do you maximise your expected earnings by choosing q = 0.6? Now suppose p = 0.4. Do you maximise your expected earnings by choosing q = 0.4? Now consider how you should choose q, dependent on whether p > 0.5.

27. When $n_1 = n_2 = 2$, $X_1 + X_2$ can take integer values from 0 to 4. Consider the different ways it can take each of these values in terms of the values of $X_1$ and $X_2$, and calculate their probabilities.

34. The idea is similar to that used to obtain the mean of a Poisson random variable: change variables in the sum to make it look like something we know the answer to.

35. Consider, for example, $p_Z(1) = P(Z = 1)$. If Z = 1, what values could X and Y take, such that Z = X + Y = 1? The only possibilities are (X = 0, Y = 1) and (X = 1, Y = 0), so that
$$p_Z(1) = P(X = 0, Y = 1) + P(X = 1, Y = 0).$$
As X and Y are independent, we have
$$p_Z(1) = P(X = 0)P(Y = 1) + P(X = 1)P(Y = 0) = p_X(0)p_Y(1) + p_X(1)p_Y(0).$$
Now consider the general case, $p_Z(z)$. If Z = z, what possible values could X and Y take? Try writing an expression for $p_Z(z)$ in terms of the probability mass functions of X and Y, using Σ notation, and then see how you can simplify your expression.

55. Start with the definition of E(g(X)), tidy up the integrand as far as you can, then consider the following:
(a) We can write $x^2 - 2x$ as $(x - 1)^2 - 1$.
(b) What is the density function of a N(1, 1) random variable?
What do we get when we integrate this function between −∞ and ∞?

60. Study the proof of Chebyshev’s inequality, and try something similar using the mean of X rather than the variance.
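
As a numerical companion to the hint for Q60, the sketch below checks Markov’s inequality by simulation in R. It is an illustration only, not a proof, and everything specific in it is an assumption rather than part of the question: the choice of an Exp(1) distribution, the sample size, the seed, and the thresholds stored in the hypothetical vector `cs`.

```r
# Illustrative check of Markov's inequality P(X > c) <= E(X)/c.
# Assumptions (not from the exercise): X ~ Exp(1), 100,000 draws, seed 1.
set.seed(1)
x <- rexp(100000, rate = 1)      # sample of a non-negative random variable

cs <- c(0.5, 1, 2, 5)            # thresholds c > 0 to try

# Proportion of simulated values exceeding each threshold: estimates P(X > c)
empirical <- sapply(cs, function(thr) mean(x > thr))

# Sample mean divided by each threshold: estimates E(X)/c
bound <- mean(x) / cs

print(data.frame(c = cs, empirical = empirical, bound = bound))
```

For thresholds below the mean the bound is greater than 1 and so carries no information; for larger thresholds it holds but is typically far from tight, which is worth keeping in mind when comparing it with Chebyshev’s inequality.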