MTH 202 : Probability and Statistics
Lecture S2 : 11. Estimation of parameters

11.1 : Method of moments

Recall that the k-th moment $\mu_k$ of an RV $X$ is defined by $\mu_k = E(X^k)$. If $X_1, X_2, \ldots, X_n$ are independent identically distributed (i.i.d.) RVs from the distribution of $X$, the k-th sample moment is defined by
$$\hat{\mu}_k := \frac{1}{n} \sum_{i=1}^{n} X_i^k.$$
We can view $\hat{\mu}_k$ as an estimate of $\mu_k = E(X^k)$. Suppose we wish to estimate $\theta_1, \theta_2$, where $\theta_1 = f_1(\mu_1, \mu_2)$ and $\theta_2 = f_2(\mu_1, \mu_2)$; then the method of moments estimates are
$$\hat{\theta}_1 = f_1(\hat{\mu}_1, \hat{\mu}_2), \qquad \hat{\theta}_2 = f_2(\hat{\mu}_1, \hat{\mu}_2).$$

Example 11.1.1 : (Gamma distribution with parameters $\alpha, \lambda$) We have seen that
$$\mu_1 = \frac{\alpha}{\lambda}, \qquad \mu_2 = \frac{\alpha(\alpha+1)}{\lambda^2}.$$
Also we know that $\sigma^2 = \mu_2 - \mu_1^2$. From these equations the parameters $\alpha, \lambda$ are expressed in terms of $\mu_1, \mu_2$ as
$$\alpha = \frac{\mu_1^2}{\mu_2 - \mu_1^2}, \qquad \lambda = \frac{\mu_1}{\mu_2 - \mu_1^2}.$$
Hence the method of moments estimates of the parameters $\sigma^2, \lambda, \alpha$ are
$$\hat{\sigma}^2 = \hat{\mu}_2 - \hat{\mu}_1^2, \qquad \hat{\lambda} = \frac{\hat{\mu}_1}{\hat{\mu}_2 - \hat{\mu}_1^2}, \qquad \hat{\alpha} = \frac{\hat{\mu}_1^2}{\hat{\mu}_2 - \hat{\mu}_1^2}.$$
(A numerical sketch of these estimates appears at the end of this lecture.)

See pages 261-263, Sec-8.4, Chapter-8 (Rice) for more examples.

11.2 : Method of maximum likelihood (mle)

Suppose $X_1, X_2, \ldots, X_n$ have a joint density or frequency function $f(x_1, x_2, \ldots, x_n \mid \theta)$, where $\theta$ is a parameter. Given observed values $X_i = x_i$ $(i = 1, \ldots, n)$, the likelihood of $\theta$ as a function of $x_1, x_2, \ldots, x_n$ is defined as
$$\mathrm{lik}(\theta) = f(x_1, x_2, \ldots, x_n \mid \theta).$$
The maximum likelihood estimate (mle) of $\theta$ is the value $\theta_0$ of $\theta$ that maximizes the likelihood $\mathrm{lik}(\theta)$. If the $X_i$ are assumed to be i.i.d., we have
$$\mathrm{lik}(\theta) = \prod_{i=1}^{n} f(X_i \mid \theta).$$
It is often convenient to maximize instead the natural logarithm of the above product (the log likelihood); since $\log$ is strictly increasing, the two maximization problems are equivalent:
$$l(\theta) = \sum_{i=1}^{n} \log f(X_i \mid \theta).$$

Example 11.2.1 : Assume that $X_1, \ldots, X_n$ are i.i.d. with the normal distribution $N(\mu, \sigma^2)$. Then
$$f(x_1, x_2, \ldots, x_n \mid \mu, \sigma) = \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x_i - \mu}{\sigma}\right)^2\right],$$
hence
$$l(\mu, \sigma) = \sum_{i=1}^{n} \log\left(\frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{X_i - \mu}{\sigma}\right)^2\right]\right) = -n\log\sigma - \frac{n}{2}\log(2\pi) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(X_i - \mu)^2.$$
To find the mle, we need to solve
$$\frac{\partial l}{\partial \mu} = \frac{1}{\sigma^2}\sum_{i=1}^{n}(X_i - \mu) = 0$$
and
$$\frac{\partial l}{\partial \sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma^3}\sum_{i=1}^{n}(X_i - \mu)^2 = 0.$$
Solving these, we get
$$\mu = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2.$$
It can be checked from the higher order derivatives that these values indeed give the maximum. Hence the mles of $\mu$ and $\sigma^2$ are given by
$$\hat{\mu} = \overline{X}, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \overline{X})^2.$$
(A numerical sketch appears at the end of this lecture.)

Also see pages 268-272, Sec-8.5, Chapter-8 (Rice) for more examples.

Example 11.2.2 : (mle of multinomial cell probabilities) Recall that the joint PMF of $X_1, X_2, \ldots, X_m$ is given by
$$f(x_1, x_2, \ldots, x_m \mid p_1, p_2, \ldots, p_m) = n! \, \frac{p_1^{x_1} p_2^{x_2} \cdots p_m^{x_m}}{x_1! \, x_2! \cdots x_m!}.$$
The log likelihood is given by
$$l(p_1, \ldots, p_m) = \log n! - \sum_{i=1}^{m} \log x_i! + \sum_{i=1}^{m} x_i \log p_i.$$
Maximizing $l(p_1, \ldots, p_m)$ subject to the constraint $\sum_{i=1}^{m} p_i = 1$ by the method of Lagrange multipliers (as in page-272, Sec-8.5.1, Chapter-8, Rice), we obtain the estimates
$$\hat{p}_1 = \frac{X_1}{n}, \quad \hat{p}_2 = \frac{X_2}{n}, \quad \ldots, \quad \hat{p}_m = \frac{X_m}{n},$$
where each $X_i$, the i-th marginal of the joint PMF above, is binomial with parameters $(n, p_i)$. (A numerical sketch appears at the end of this lecture.)

11.3 : Large sample theory for mles

Definition 11.3.1 : Let $\hat{\theta}_n$ be an estimate of a parameter $\theta$ based on a sample of size $n$. Then $\hat{\theta}_n$ is said to be consistent in probability if for any $\epsilon > 0$,
$$\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| > \epsilon) = 0,$$
which is written symbolically as $\hat{\theta}_n \to \theta$ in probability as $n \to \infty$. (This definition is illustrated by the last sketch at the end of this lecture.)

Theorem 11.3.2 : Under appropriate smoothness conditions on $f$, the mle from an i.i.d. sample is consistent.

Proof : See pages 275-276, Sec-8.5.2, Chapter-8, Rice.

TO BE UPDATED
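The method of moments computation of Example 11.1.1 can be checked numerically. Below is a minimal sketch in Python, assuming NumPy is available; the true parameter values and the sample size are arbitrary illustration choices, and note that NumPy parameterizes the Gamma distribution by shape and scale $= 1/\lambda$ rather than by rate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated i.i.d. sample from Gamma(alpha, lambda); the true values
# and sample size are arbitrary choices for illustration. NumPy uses
# shape and *scale* = 1/lambda.
alpha_true, lam_true = 3.0, 2.0
x = rng.gamma(shape=alpha_true, scale=1.0 / lam_true, size=10_000)

# Sample moments: mu_hat_k = (1/n) * sum(X_i^k)
mu1_hat = np.mean(x)
mu2_hat = np.mean(x**2)

# Method of moments estimates from the formulas in Example 11.1.1
sigma2_hat = mu2_hat - mu1_hat**2
alpha_hat = mu1_hat**2 / sigma2_hat
lam_hat = mu1_hat / sigma2_hat

print(f"alpha_hat  = {alpha_hat:.3f} (true {alpha_true})")
print(f"lambda_hat = {lam_hat:.3f} (true {lam_true})")
```

With samples of this size the estimates typically land close to the true values; they fluctuate more as the sample size shrinks.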
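Similarly, the closed-form mles of Example 11.2.1 can be verified on simulated data. A sketch under the same assumptions (NumPy, arbitrary true values); the helper `loglik` implements $l(\mu, \sigma)$ from above and is used only to spot-check that the closed-form solution is a local maximum.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated i.i.d. sample from N(mu, sigma^2); true values are
# arbitrary illustration choices.
mu_true, sigma_true = 5.0, 2.0
x = rng.normal(loc=mu_true, scale=sigma_true, size=10_000)
n = len(x)

# Closed-form mles derived above: the sample mean and the 1/n
# (not 1/(n-1)) sample variance.
mu_hat = x.mean()
sigma_hat = np.sqrt(np.mean((x - mu_hat) ** 2))

def loglik(mu, sigma):
    # l(mu, sigma) = -n log(sigma) - (n/2) log(2 pi)
    #                - sum((x_i - mu)^2) / (2 sigma^2)
    return (-n * np.log(sigma) - n / 2 * np.log(2 * np.pi)
            - np.sum((x - mu) ** 2) / (2 * sigma**2))

print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")

# Spot-check: perturbing either parameter should not increase l.
assert loglik(mu_hat, sigma_hat) >= loglik(mu_hat + 0.05, sigma_hat)
assert loglik(mu_hat, sigma_hat) >= loglik(mu_hat, sigma_hat + 0.05)
```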
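For Example 11.2.2, the mle is simply the vector of observed cell proportions. A one-draw sketch under the same assumptions, with the cell probabilities and trial count chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(2)

# One multinomial sample: n trials over m = 3 cells; p_true is an
# arbitrary illustration choice.
p_true = np.array([0.2, 0.3, 0.5])
n = 1_000
counts = rng.multinomial(n, p_true)  # (X_1, ..., X_m), sums to n

# mle of the cell probabilities: p_hat_i = X_i / n
p_hat = counts / n
print("counts:", counts, " p_hat:", p_hat)
```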
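Finally, Definition 11.3.1 can be illustrated by Monte Carlo. For the mle $\hat{\mu}_n = \overline{X}$ of a normal mean, $P(|\hat{\mu}_n - \mu| > \epsilon)$ should shrink toward 0 as $n$ grows. A sketch, with $\epsilon$ and the replication count chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(3)
mu_true, eps, reps = 0.0, 0.1, 1_000

# For each n, simulate `reps` samples of size n from N(mu, 1) and
# record how often the mle mu_hat = sample mean misses mu by > eps.
for n in (10, 100, 1_000, 10_000):
    mu_hats = rng.normal(loc=mu_true, scale=1.0, size=(reps, n)).mean(axis=1)
    miss = np.mean(np.abs(mu_hats - mu_true) > eps)
    print(f"n = {n:6d}: P(|mu_hat - mu| > {eps}) ~= {miss:.3f}")
```

The estimated probability drops rapidly with $n$, consistent with Theorem 11.3.2.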
References :
[RS] An Introduction to Probability and Statistics, V.K. Rohatgi and A.K. Saleh, Second Edition, Wiley Student Edition.
[RI] Mathematical Statistics and Data Analysis, John A. Rice, Cengage Learning, 2013.