Probability

The probability Pr(A) of an event A is a number in [0, 1] that represents how likely A is to occur. The larger the value of Pr(A), the more likely the event is to occur.

• Pr(A) = 1 means the event must occur.
• Pr(A) = 0 means the event cannot occur.
• Pr(A) > Pr(B) means A is more likely to occur than B.

Events A and B are called disjoint (mutually exclusive) if they cannot both occur simultaneously, that is, if Pr(A and B) = 0. Equivalently, saying A and B are disjoint means Pr(A or B) = Pr(A) + Pr(B).

Events A and B are called complementary if they are disjoint, but one of them must occur. Equivalently, Pr(A) + Pr(B) = Pr(A or B) = 1.

A random variable is a quantity that assumes different values with certain probabilities. In other words, X is a random variable if we can assign values to Pr(X = x), Pr(X ≤ x), Pr(X < x), Pr(X ≥ x), Pr(X > x) for every real number x.

• The events X = x and X ≠ x are complementary: Pr(X = x) = 1 − Pr(X ≠ x).
• The events X ≤ x and X > x are complementary: Pr(X ≤ x) = 1 − Pr(X > x).
• The events X ≥ x and X < x are complementary: Pr(X ≥ x) = 1 − Pr(X < x).

Example. If we toss two fair coins, there are four possible outcomes:

HH, HT, TH, TT,

where H is heads and T is tails. Since the coins are fair and the tosses are independent (the outcome of one toss doesn't affect the outcome of the other), each of the four outcomes has probability (1/2) · (1/2) = 1/4. Let Y be the random variable defined by

Y = 0 if the outcome is HH,
    1 if the outcome is HT,
    2 if the outcome is TH,
    3 if the outcome is TT.

Then

Pr(Y = k) = 1/4 for k = 0, 1, 2, 3,
Pr(Y = −1) = 0,
Pr(Y = π) = 0,
Pr(Y > 2) = Pr(Y = 3) = 1/4,
Pr(Y ≤ 2) = Pr(Y = 0 or Y = 1 or Y = 2)
          = Pr(Y = 0) + Pr(Y = 1) + Pr(Y = 2) = 3/4,
Pr(Y ≤ 2) = 1 − Pr(Y > 2) = 1 − 1/4 = 3/4.

The random variable Y is an example of a discrete random variable.
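Since the sample space here is tiny, the probabilities above can be verified by brute-force enumeration. The sketch below (not part of the notes; the helper `pr` and the value table are our own naming) lists the four equally likely outcomes and sums exact probabilities with `fractions.Fraction`.

```python
from itertools import product
from fractions import Fraction

# Enumerate the four equally likely outcomes of two fair coin tosses.
outcomes = list(product("HT", repeat=2))   # HH, HT, TH, TT
p = Fraction(1, len(outcomes))             # each outcome has probability 1/4

# Y maps HH -> 0, HT -> 1, TH -> 2, TT -> 3, matching the example above.
value = {("H", "H"): 0, ("H", "T"): 1, ("T", "H"): 2, ("T", "T"): 3}

def pr(event):
    """Probability of the event, summed over equally likely outcomes."""
    return sum(p for o in outcomes if event(value[o]))

print(pr(lambda y: y == 3))   # Pr(Y = 3)  = 1/4
print(pr(lambda y: y <= 2))   # Pr(Y <= 2) = 3/4
print(pr(lambda y: y > 2))    # Pr(Y > 2)  = 1/4
```

Using exact rational arithmetic rather than floats means the complement identity Pr(Y ≤ 2) = 1 − Pr(Y > 2) can be checked with equality rather than a tolerance.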
A random variable is called a discrete random variable if it assumes only finitely many or countably many values, that is, if we can list the values it assumes as x_1, x_2, . . . . Roughly, a continuous random variable is one that can assume a continuum of values. We will give a precise definition below after stating some terminology, but here are some things that can be modelled by a continuous random variable:

• the amount of rainfall in Vancouver next week
• the lifespan of a lightbulb
• the height of a randomly selected person in Canada

The cumulative distribution function (CDF) of a random variable X is the function

F(x) = Pr(X ≤ x).

A function F is a CDF of some random variable if and only if the following properties hold.

(1) F is right-continuous: lim_{x→c+} F(x) = F(c) for all real numbers c.
(2) F is non-decreasing: F(x) ≤ F(y) whenever x ≤ y.
(3) lim_{x→∞} F(x) = 1.
(4) lim_{x→−∞} F(x) = 0.

Properties (2), (3), and (4) imply 0 ≤ F(x) ≤ 1 for all real numbers x.

Example. Let a > 0 be a constant. Show that F(x) = 1/(1 + e^{−ax}) is a cumulative distribution function.

(1) Since 1 + e^{−ax} is continuous and never 0, F(x) is continuous and, therefore, right-continuous everywhere.

(2) Since

d/dx F(x) = d/dx (1 + e^{−ax})^{−1} = −(1 + e^{−ax})^{−2} (−a e^{−ax}) = a e^{−ax} / (1 + e^{−ax})² ≥ 0

for all x, F is non-decreasing.

(3) lim_{x→∞} F(x) = lim_{x→∞} 1/(1 + e^{−ax}) = 1/(1 + 0) = 1.

(4) lim_{x→−∞} F(x) = lim_{x→−∞} 1/(1 + e^{−ax}) = 0.

Therefore F(x) = 1/(1 + e^{−ax}) is a CDF.

Extra Example. The function F(x) = k arctan(x) + 1/2 is a cumulative distribution function. Find the value of k.

We must have lim_{x→∞} F(x) = 1. So

1 = lim_{x→∞} F(x) = lim_{x→∞} (k arctan(x) + 1/2) = k(π/2) + 1/2.

Solving for k yields k = 1/π.

A random variable is called a continuous random variable if its CDF is continuous. If X is a continuous random variable, then

Pr(X = x) = 0,
Pr(X ≤ x) = Pr(X < x),
Pr(X ≥ x) = Pr(X > x),
Pr(a ≤ X ≤ b) = Pr(a < X ≤ b) = Pr(a ≤ X < b) = Pr(a < X < b)

for all real numbers x, a, b. We will prove Pr(X = x) = 0 below; all the other formulas follow easily from it.

For a continuous random variable X whose cumulative distribution function F is differentiable, the probability density function (PDF) of X is defined to be

f(x) = d/dx F(x),

and, moreover,

Pr(X ≤ x) = F(x) = ∫_{−∞}^{x} f(t) dt,
Pr(a ≤ X ≤ b) = F(b) − F(a) = ∫_{a}^{b} f(x) dx

for all real numbers x, a, b with a ≤ b. A function f is a PDF of some random variable if and only if the following properties hold.

(1) f is non-negative: f(x) ≥ 0 for all real numbers x.
(2) ∫_{−∞}^{∞} f(x) dx = 1.

Example. Let

f(x) = kx(1 − x)²  if 0 ≤ x ≤ 1,
       0           otherwise

be the probability density function of a random variable X.

(a) Find the value of the constant k.
(b) Find the probability that 1/2 ≤ X < 1.
(c) Find the cumulative distribution function of X.

Part (a): We must have ∫_{−∞}^{∞} f(x) dx = 1. Therefore

1 = ∫_{−∞}^{∞} f(x) dx = ∫_0^1 kx(1 − x)² dx = k ∫_0^1 (x − 2x² + x³) dx
  = k [ (1/2)x² − (2/3)x³ + (1/4)x⁴ ]_0^1 = k (1/2 − 2/3 + 1/4) = k/12.

So k = 12.

Part (b):

Pr(1/2 ≤ X < 1) = ∫_{1/2}^{1} f(x) dx = 12 ∫_{1/2}^{1} x(1 − x)² dx
  = 12 ∫_{1/2}^{1} (x − 2x² + x³) dx = 12 [ (1/2)x² − (2/3)x³ + (1/4)x⁴ ]_{1/2}^{1}
  = 12 [ ( (1/2)(1)² − (2/3)(1)³ + (1/4)(1)⁴ ) − ( (1/2)(1/2)² − (2/3)(1/2)³ + (1/4)(1/2)⁴ ) ]
  = 12 ( 1/12 − 1/8 + 1/12 − 1/64 ) = 5/16.

Part (c). For x < 0,

F(x) = Pr(X ≤ x) = ∫_{−∞}^{x} f(t) dt = ∫_{−∞}^{x} 0 dt = 0.

For 0 ≤ x ≤ 1,

F(x) = Pr(X ≤ x) = ∫_{−∞}^{x} f(t) dt = 12 ∫_0^x t(1 − t)² dt
  = 12 ∫_0^x (t − 2t² + t³) dt = 12 [ (1/2)t² − (2/3)t³ + (1/4)t⁴ ]_0^x
  = 12 ( (1/2)x² − (2/3)x³ + (1/4)x⁴ ).

For x > 1,

F(x) = Pr(X ≤ x) = ∫_{−∞}^{x} f(t) dt
  = ∫_{−∞}^{0} f(t) dt + ∫_0^1 f(t) dt + ∫_1^x f(t) dt
  = ∫_{−∞}^{0} 0 dt + 12 ∫_0^1 t(1 − t)² dt + ∫_1^x 0 dt
  = 12 ∫_0^1 (t − 2t² + t³) dt = 12 [ (1/2)t² − (2/3)t³ + (1/4)t⁴ ]_0^1
  = 12 ( 1/2 − 2/3 + 1/4 ) = 1.

Part (b) again: Remember that

Pr(a ≤ X ≤ b) = F(b) − F(a) = ∫_a^b f(x) dx,

where F is the CDF of X and f is the PDF of X.
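As a cross-check on parts (a) and (b), the sketch below redoes the computation with exact rational arithmetic. The helper `antideriv`, which hard-codes the antiderivative of x(1 − x)² worked out above, is our own naming and not part of the notes.

```python
from fractions import Fraction

# Antiderivative of x(1 - x)^2 = x - 2x^2 + x^3, evaluated exactly.
def antideriv(x):
    x = Fraction(x)
    return x**2 / 2 - Fraction(2, 3) * x**3 + x**4 / 4

# Part (a): k * integral_0^1 x(1 - x)^2 dx = 1, so k = 1 / (1/12) = 12.
area = antideriv(1) - antideriv(0)
k = 1 / area
print(k)  # 12

# Part (b): Pr(1/2 <= X < 1) = k * integral_{1/2}^1 x(1 - x)^2 dx.
prob = k * (antideriv(1) - antideriv(Fraction(1, 2)))
print(prob)  # 5/16
```

Because the PDF is a polynomial on [0, 1], every quantity here is a rational number, so `Fraction` reproduces the hand computation exactly rather than approximately.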
So if we had done part (c) before part (b), we could have used

F(x) = 0                                   if x < 0,
       12 ( (1/2)x² − (2/3)x³ + (1/4)x⁴ )  if 0 ≤ x ≤ 1,
       1                                   if x > 1

to compute

Pr(1/2 ≤ X < 1) = F(1) − F(1/2)
  = 12 ( (1/2)(1)² − (2/3)(1)³ + (1/4)(1)⁴ ) − 12 ( (1/2)(1/2)² − (2/3)(1/2)³ + (1/4)(1/2)⁴ )
  = 5/16.

Example. Find the probability density function f(x) for a random variable with cumulative distribution function

F(x) = 1 − e^{−x/2}  if x ≥ 0,
       0             if x < 0.

For x > 0,

f(x) = d/dx F(x) = d/dx ( 1 − e^{−x/2} ) = (1/2) e^{−x/2}.

For x < 0,

f(x) = d/dx F(x) = d/dx 0 = 0.

F(x) is not differentiable at 0, but it's okay to have f(x) undefined at finitely many points because this won't affect the integral of f. In conclusion,

f(x) = (1/2) e^{−x/2}  if x > 0,
       0               if x < 0.

Let X be a continuous random variable with probability density function f. The expected value (or expectation or mean or average) of X is

E(X) = ∫_{−∞}^{∞} x f(x) dx.

The variance of X is

Var(X) = ∫_{−∞}^{∞} (x − E(X))² f(x) dx = E(X²) − (E(X))².

The standard deviation of X is

σ(X) = √Var(X).

Here is a useful fact (see below for a proof):

E(X²) = ∫_{−∞}^{∞} x² f(x) dx.

The expected value of a random variable is a measure of the center of its distribution. The variance and standard deviation of a random variable are measures of the dispersion (or horizontal spread) of its distribution.

Example. Find the standard deviation of the random variable X with probability density function

f(x) = 3/(x + 1)⁴  if x ≥ 0,
       0           if x < 0.

First we find the expected value:

E(X) = ∫_{−∞}^{∞} x f(x) dx
  = 3 ∫_0^∞ x (x + 1)^{−4} dx
  = 3 lim_{t→∞} ∫_0^t x (x + 1)^{−4} dx
  = 3 lim_{t→∞} ∫_1^{t+1} (u − 1) u^{−4} du        (u = x + 1)
  = 3 lim_{t→∞} ∫_1^{t+1} ( u^{−3} − u^{−4} ) du
  = 3 lim_{t→∞} [ −(1/2)u^{−2} + (1/3)u^{−3} ]_1^{t+1}
  = 3 lim_{t→∞} ( −(1/2)(t + 1)^{−2} + (1/3)(t + 1)^{−3} − ( −1/2 + 1/3 ) )
  = 3 ( −0 + 0 + 1/2 − 1/3 )
  = 1/2.

Then we compute

E(X²) = ∫_{−∞}^{∞} x² f(x) dx
  = 3 ∫_0^∞ x² (x + 1)^{−4} dx
  = 3 lim_{t→∞} ∫_0^t x² (x + 1)^{−4} dx
  = 3 lim_{t→∞} ∫_1^{t+1} (u − 1)² u^{−4} du        (u = x + 1)
  = 3 lim_{t→∞} ∫_1^{t+1} ( u^{−2} − 2u^{−3} + u^{−4} ) du
  = 3 lim_{t→∞} [ −u^{−1} + u^{−2} − (1/3)u^{−3} ]_1^{t+1}
  = 3 lim_{t→∞} ( −(t + 1)^{−1} + (t + 1)^{−2} − (1/3)(t + 1)^{−3} − ( −1 + 1 − 1/3 ) )
  = 3 ( −0 + 0 − 0 + 1/3 )
  = 1.

Therefore

Var(X) = E(X²) − (E(X))² = 1 − (1/2)² = 3/4

and

σ(X) = √Var(X) = √3 / 2.

Proofs of Two Facts

Fact 1. If X is a continuous random variable, then Pr(X = c) = 0 for all real numbers c.

If X has a probability density function f, the proof is easy:

Pr(X = c) = Pr(c ≤ X ≤ c) = ∫_c^c f(x) dx = 0.

If X is a continuous random variable without a probability density function, we have to work harder. We will need to recall the definition of a continuous random variable and a property of limits.

Recall: X is a continuous random variable if its CDF, F(x) = Pr(X ≤ x), is continuous.

Recall: If m ≤ g(x) for all x < c, then m ≤ lim_{x→c−} g(x).

Proof of Fact 1.

Step 1. For any x < c,

Pr(X ≤ c) = Pr(X ≤ x or x < X ≤ c) = Pr(X ≤ x) + Pr(x < X ≤ c),

and therefore

Pr(x < X ≤ c) = Pr(X ≤ c) − Pr(X ≤ x) = F(c) − F(x).

Step 2. For any x < c, if X = c, then x < X ≤ c. So X = c cannot be more likely than x < X ≤ c. Therefore

Pr(X = c) ≤ Pr(x < X ≤ c).

Step 3.

Pr(X = c) = lim_{x→c−} Pr(X = c)
  ≤ lim_{x→c−} Pr(x < X ≤ c)
  = lim_{x→c−} ( F(c) − F(x) )
  = F(c) − lim_{x→c−} F(x)
  = F(c) − F(c)
  = 0.

Fact 2. If X is a continuous random variable with probability density function f, then

E(X²) = ∫_{−∞}^{∞} x² f(x) dx.

Remark. Fact 2 is a special case of a more general formula called the law of the unconscious statistician,

E(g(X)) = ∫_{−∞}^{∞} g(x) f(x) dx,

which is valid whenever the integral on the right converges.

Proof of Fact 2. Let F and f be, respectively, the CDF and PDF of X. Let G and g be, respectively, the CDF and PDF of the random variable X². If x < 0,

G(x) = Pr(X² ≤ x) = 0.
If x ≥ 0,

G(x) = Pr(X² ≤ x) = Pr(|X| ≤ √x) = Pr(−√x ≤ X ≤ √x) = F(√x) − F(−√x).

Now we compute g(x). If x < 0,

g(x) = d/dx G(x) = d/dx 0 = 0.

If x > 0,

g(x) = d/dx G(x) = d/dx ( F(√x) − F(−√x) )
  = F′(√x) d/dx(√x) − F′(−√x) d/dx(−√x)
  = F′(√x)/(2√x) + F′(−√x)/(2√x)
  = (1/(2√x)) ( f(√x) + f(−√x) ).

We don't need to worry about g(x) at x = 0, since the value of g at one point won't affect the integral of g. We can now compute

E(X²) = ∫_{−∞}^{∞} x g(x) dx
  = ∫_0^∞ ( x/(2√x) ) ( f(√x) + f(−√x) ) dx
  = (1/2) ∫_0^∞ √x ( f(√x) + f(−√x) ) dx
  = lim_{t→∞} (1/2) ∫_0^t √x ( f(√x) + f(−√x) ) dx.

Making the substitution

u = √x,  u² = x,  2u du = dx,  x = 0 ⇒ u = 0,  x = t ⇒ u = √t

gives

E(X²) = lim_{t→∞} ∫_0^{√t} u² ( f(u) + f(−u) ) du
  = lim_{t→∞} ( ∫_0^{√t} u² f(u) du + ∫_0^{√t} u² f(−u) du ).

In the second integral we make the substitution v = −u to get

E(X²) = lim_{t→∞} ( ∫_0^{√t} u² f(u) du − ∫_0^{−√t} v² f(v) dv )
  = lim_{t→∞} ( ∫_0^{√t} u² f(u) du + ∫_{−√t}^0 v² f(v) dv )
  = ∫_0^∞ u² f(u) du + ∫_{−∞}^0 v² f(v) dv
  = ∫_0^∞ x² f(x) dx + ∫_{−∞}^0 x² f(x) dx
  = ∫_{−∞}^{∞} x² f(x) dx.
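To close, here is a rough numerical sanity check (a sketch, not part of the notes) of the standard-deviation example and of Fact 2. The trapezoidal helper `integrate` and the cutoff `B` are our own choices; truncating [0, ∞) at B neglects a tail of roughly 3/B in the E(X²) integral, which is well below the tolerance we check against.

```python
import math

# PDF from the worked example above: f(x) = 3/(x+1)^4 for x >= 0, else 0.
def f(x):
    return 3.0 / (x + 1.0) ** 4 if x >= 0 else 0.0

def integrate(g, a, b, n=200_000):
    """Composite trapezoidal rule on [a, b] -- a rough numerical check."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

# Truncate [0, infinity) at a large cutoff B; x^2 f(x) decays like 3/x^2,
# so the neglected tail is about 3/B.
B = 10_000.0
mass = integrate(f, 0.0, B)                       # total probability, near 1
ex   = integrate(lambda x: x * f(x), 0.0, B)      # near E(X)   = 1/2
ex2  = integrate(lambda x: x * x * f(x), 0.0, B)  # near E(X^2) = 1 (Fact 2)
var  = ex2 - ex ** 2                              # near Var(X) = 3/4
sd   = math.sqrt(var)                             # near sigma(X) = sqrt(3)/2
print(round(ex, 3), round(var, 3), round(sd, 3))
```

Monte Carlo simulation would be a poor check here, since X² has infinite variance under this density (∫ x⁴ f(x) dx diverges); deterministic quadrature avoids that problem.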