4. Bayesian Inference, Part IV: Least Mean Squares (LMS) Estimation
ECE 302 Fall 2009, TR 3‐4:15pm
Purdue University, School of ECE
Prof. Ilya Pollak

An Estimate vs. an Estimator
• Suppose X and Y are random variables.
• Y is observed, X is not observed.
• An estimator of X based on Y is any function g(Y).
• Since g(Y) is a function of a r.v., it is a r.v. itself.
• An estimate of X based on Y=y is a number: the value of an estimator as determined by the observed value y of Y.
• E.g., if g(Y) is an estimator of X, then g(y) is an estimate of X for any specific observation y.

Pros and Cons of MAP Estimators
• Minimize the error probability.
• Not appropriate when different errors have different costs (as in, e.g., the burglar alarm and spam filtering examples).
• For some posterior distributions, may result in large probabilities of large errors.

An example when the MAP estimate is bad
Suppose fX|Y(x|0) consists of two triangles: a tall, narrow one of height 10 supported on [0, 0.01], and a wide one of height 5 supported on [1, 1.38]. [Figure: plot of fX|Y(x|0) vs. x, with peaks at x = 0.005 and x = 1.19.]
• MAP: the max of the tall left triangle, x* = 0.005.
• Left triangle: 0.05 of the conditional probability mass.
• MAP makes errors of ≥ 0.995 with cond. prob. 0.95.
• Another estimate: conditional mean E[X|Y=0] = 0.05·0.005 + 0.95·1.19 ≈ 1.13.
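A quick numerical illustration (not from the slides; a minimal NumPy sketch, with the density reconstructed from the figure) comparing the MAP and conditional-mean estimates for this posterior:

```python
import numpy as np

# Sketch (illustrative): the two-triangle posterior fX|Y(x|0) from the
# example. A tall triangle of height 10 on [0, 0.01] (mass 0.05) and a
# wide triangle of height 5 on [1, 1.38] (mass 0.95), each symmetric
# about its midpoint, consistent with the peak locations in the slide.
def posterior(x):
    left = np.where((x >= 0) & (x <= 0.01),
                    10 * (1 - np.abs(x - 0.005) / 0.005), 0.0)
    right = np.where((x >= 1) & (x <= 1.38),
                     5 * (1 - np.abs(x - 1.19) / 0.19), 0.0)
    return left + right

x = np.linspace(0.0, 1.5, 1_500_001)
f = posterior(x)

x_map = x[np.argmax(f)]       # MAP estimate: peak of the density
x_lms = np.trapz(x * f, x)    # conditional-mean (LMS) estimate
print(x_map)                  # ≈ 0.005
print(x_lms)                  # ≈ 1.13
```

The MAP estimate sits on the sliver of mass near 0, while the conditional mean lands near the bulk of the probability mass.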
Mean Squared Error (MSE) for an Estimator g(Y) of X
MSE = E[(X − g(Y))²]
• The expectation is taken with respect to the joint distribution of X and Y.
• This is the expectation of the squared difference between the unobserved random variable X and its estimator based on the observed random variable Y.
• Makes large errors more costly than small ones.

Conditional Mean Squared Error for an Estimate g(y) of X, given Y=y
Conditional MSE = E[(X − g(y))² | Y=y]
• The expectation is taken with respect to the conditional distribution of X given Y=y.
• This is the conditional expectation, given Y=y, of the squared difference between the unobserved random variable X and its estimate based on the specific observation y of the random variable Y.

Least Mean Squares (LMS) Estimator
• The LMS estimator for X based on Y is the estimator that minimizes the MSE among all estimators of X based on Y.
• If g(Y) is the LMS estimator, then, for any observation y of Y, the number g(y) minimizes the conditional MSE among all estimates of X.
• The LMS estimator of X based on Y is E[X|Y].
• For any observation y of Y, the LMS estimate of X is the mean of the posterior density, i.e., E[X|Y=y].

LMS estimate with no data
• Suppose X is a r.v. with a known mean.
• We observe nothing.
• We would like an estimate x* of X which minimizes the MSE, E[(X−x*)²].
• E[(X−x*)²] = var(X−x*) + (E[X−x*])² = var(X) + (E[X] − x*)²
• This is a quadratic function of x*, minimized by x* = E[X]. [Figure: the MSE as a function of x* is a parabola with minimum value var(X) at x* = E[X].]
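A minimal Monte Carlo sketch (assumed Python/NumPy; the exponential distribution is an arbitrary choice for illustration) of this fact: the empirical MSE, as a function of x*, is minimized near E[X].

```python
import numpy as np

# Sketch (illustrative): with no data, E[(X - x*)^2] = var(X) + (E[X] - x*)^2,
# a parabola in x* minimized at x* = E[X].
rng = np.random.default_rng(0)
x_samples = rng.exponential(scale=2.0, size=100_000)  # E[X] = 2

candidates = np.linspace(0.0, 4.0, 9)
mse = [np.mean((x_samples - c) ** 2) for c in candidates]

print(candidates[np.argmin(mse)])  # the minimizing x* is near E[X] = 2
print(x_samples.mean())
```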
LMS estimate from one observation
• Suppose X, Y are r.v.'s, and E[X|Y=y] is known.
• We observe Y=y.
• We would like an estimate x* of X which minimizes the posterior MSE, E[(X−x*)²|Y=y].
• E[(X−x*)²|Y=y] = var(X−x*|Y=y) + (E[X−x*|Y=y])² = var(X|Y=y) + (E[X|Y=y] − x*)²
• Quadratic function of x*, minimized by x* = E[X|Y=y]. [Figure: the posterior MSE as a function of x* is a parabola with minimum value var(X|Y=y) at x* = E[X|Y=y].]

LMS estimator
• Suppose X, Y are r.v.'s, and E[X|Y=y] is known for all y.
• Let g(y) = E[X|Y=y] be the LMS estimate of X based on data Y=y.
• Then g(Y) = E[X|Y] is an estimator of X based on Y.
• Let h(Y) be any other estimator of X based on Y.
• We just showed that, for every value of y, we have: E[(X−g(y))²|Y=y] ≤ E[(X−h(y))²|Y=y]
• Therefore, E[(X−g(Y))²|Y] ≤ E[(X−h(Y))²|Y]
• Take expectations of both sides and use the iterated expectation formula: E[(X−g(Y))²] ≤ E[(X−h(Y))²]
• Therefore, the conditional expectation estimator achieves an MSE that is less than or equal to the MSE of any other estimator.
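A Monte Carlo sketch (illustrative; the jointly Gaussian setup and the competitor h(Y) = Y are my choices, not from the slides) of this optimality claim:

```python
import numpy as np

# Sketch: with X ~ N(0,1) and Y = X + W, W ~ N(0,1) independent, the
# conditional mean is E[X|Y] = Y/2. Compare its MSE to that of h(Y) = Y.
rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)

g = y / 2   # conditional-mean estimator E[X|Y]
h = y       # a competing estimator: the raw observation

print(np.mean((x - g) ** 2))  # ≈ 0.5, the conditional variance var(X|Y)
print(np.mean((x - h) ** 2))  # ≈ 1.0, strictly larger
```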
Other names that the LMS estimate goes by in the literature
• Least‐squares estimate (LSE)
• Bayes least‐squares estimate (BLSE)
• Minimum mean square error estimate (MMSE)

Ex. 8.3: estimating the mean of a Gaussian r.v.
• Observe Yi = X + Wi for i = 1, …, n
• X, W1, …, Wn independent Gaussian, with known means and variances
• Wi has mean 0 and variance σi²
• X has mean x0 and variance σ0²
• Previously, we found the MAP estimate of X based on observing Y = (Y1, …, Yn)
• Now let's find the LMS estimate of X based on observing Y = (Y1, …, Yn)

Prior model and measurement model

Prior model:
$$ f_X(x) = \frac{1}{\sigma_0\sqrt{2\pi}}\,\exp\!\left(-\frac{1}{2}\left(\frac{x - x_0}{\sigma_0}\right)^{2}\right) $$

Measurement model:
$$ f_{Y|X}(\mathbf{y} \mid x) = \prod_{i=1}^{n} \frac{1}{\sigma_i\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{y_i - x}{\sigma_i}\right)^{2}} = (2\pi)^{-n/2}\left(\prod_{i=1}^{n}\frac{1}{\sigma_i}\right) \exp\!\left(-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{y_i - x}{\sigma_i}\right)^{2}\right) $$

Posterior PDF
$$ f_{X|Y}(x \mid \mathbf{y}) = \frac{f_{Y|X}(\mathbf{y} \mid x)\, f_X(x)}{f_{\mathbf{Y}}(\mathbf{y})} $$
$$ = \frac{1}{f_{\mathbf{Y}}(\mathbf{y})}\,(2\pi)^{-n/2}\left(\prod_{i=1}^{n}\frac{1}{\sigma_i}\right) \exp\!\left(-\frac{1}{2}\sum_{i=1}^{n}\left(\frac{y_i - x}{\sigma_i}\right)^{2}\right) \frac{1}{\sigma_0\sqrt{2\pi}}\,\exp\!\left(-\frac{1}{2}\left(\frac{x - x_0}{\sigma_0}\right)^{2}\right) $$
$$ = C \exp\!\left(-\frac{1}{2}\sum_{i=0}^{n}\left(\frac{y_i - x}{\sigma_i}\right)^{2}\right) $$
(Denoting y0 = x0 and C = the factor that's not a function of x.)

Further massaging the posterior PDF…
$$ f_{X|Y}(x \mid \mathbf{y}) = C \exp\!\left(-\frac{1}{2}\sum_{i=0}^{n}\left(\frac{y_i - x}{\sigma_i}\right)^{2}\right) = C \exp\!\left(-\frac{1}{2}\left[x^{2}\sum_{i=0}^{n}\frac{1}{\sigma_i^{2}} - 2x\sum_{i=0}^{n}\frac{y_i}{\sigma_i^{2}} + \sum_{i=0}^{n}\frac{y_i^{2}}{\sigma_i^{2}}\right]\right) $$
Denoting
$$ v = \frac{1}{\displaystyle\sum_{i=0}^{n}\frac{1}{\sigma_i^{2}}} \qquad\text{and}\qquad m = \frac{\displaystyle\sum_{i=0}^{n}\frac{y_i}{\sigma_i^{2}}}{\displaystyle\sum_{i=0}^{n}\frac{1}{\sigma_i^{2}}}, $$
we have:
$$ f_{X|Y}(x \mid \mathbf{y}) = C \exp\!\left(-\frac{1}{2v}\left[x^{2} - 2xm + m^{2}\right] + \frac{m^{2}}{2v} - \frac{1}{2}\sum_{i=0}^{n}\frac{y_i^{2}}{\sigma_i^{2}}\right) = C_1 \exp\!\left(-\frac{(x-m)^{2}}{2v}\right) $$
This is a Gaussian PDF with mean m and variance v, and therefore we immediately have:
$$ C_1 = \frac{1}{\sqrt{2\pi v}} $$
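As a sanity check of this completing-the-square step (a sketch with made-up data values, not from the slides), one can verify numerically that the product of the prior and the likelihood factors is proportional, as a function of x, to a Gaussian density with the derived m and v:

```python
import numpy as np

# Sketch (illustrative): prior times likelihoods is proportional in x to
# a Gaussian with the precision-weighted mean m and variance v.
def gauss(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

x = np.linspace(0.0, 10.0, 1001)
x0, sigma0 = 4.0, 2.0                    # prior mean and std (made up)
y = np.array([4.8, 5.3, 5.1])            # observations (made up)
sigma = np.array([1.0, 0.5, 2.0])        # noise std devs (made up)

unnorm = gauss(x, x0, sigma0)            # prior f_X(x)
for yi, si in zip(y, sigma):
    unnorm *= gauss(yi, x, si)           # likelihood factors

prec = np.concatenate(([1 / sigma0**2], 1 / sigma**2))  # 1/sigma_i^2, i = 0..n
v = 1 / prec.sum()
m = (prec * np.concatenate(([x0], y))).sum() * v

ratio = unnorm / gauss(x, m, np.sqrt(v))
print(ratio.min(), ratio.max())  # nearly equal: the ratio is constant in x
```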
Posterior PDF and LMS estimate
$$ f_{X|Y}(x \mid \mathbf{y}) = \frac{1}{\sqrt{2\pi v}}\,\exp\!\left(-\frac{(x-m)^{2}}{2v}\right), \qquad\text{where}\quad v = \frac{1}{\displaystyle\sum_{i=0}^{n}\frac{1}{\sigma_i^{2}}} \quad\text{and}\quad m = \frac{\displaystyle\sum_{i=0}^{n}\frac{y_i}{\sigma_i^{2}}}{\displaystyle\sum_{i=0}^{n}\frac{1}{\sigma_i^{2}}} $$
Therefore, the LMS estimate is
$$ E[X \mid \mathbf{Y} = \mathbf{y}] = m = \frac{\displaystyle\sum_{i=0}^{n}\frac{y_i}{\sigma_i^{2}}}{\displaystyle\sum_{i=0}^{n}\frac{1}{\sigma_i^{2}}}, $$
and the LMS estimator is
$$ E[X \mid \mathbf{Y}] = \frac{\displaystyle\sum_{i=0}^{n}\frac{Y_i}{\sigma_i^{2}}}{\displaystyle\sum_{i=0}^{n}\frac{1}{\sigma_i^{2}}} \qquad\text{(where } Y_0 = x_0\text{).} $$
Note that, in this example, LMS = MAP, because:
• the posterior density is Gaussian
• for a Gaussian density, the maximum is at the mean
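A short sketch (assumed NumPy implementation; function name and data values are my own) of this precision-weighted estimate, treating the prior as a zeroth "observation" y0 = x0 with variance σ0²:

```python
import numpy as np

# Sketch (illustrative): LMS estimate and posterior variance for Ex. 8.3.
def gaussian_lms(y, sigma, x0, sigma0):
    """LMS estimate of X from observations y with noise std devs sigma."""
    y_all = np.concatenate(([x0], y))                       # y0 = x0
    prec = 1.0 / np.concatenate(([sigma0], sigma)) ** 2     # 1/sigma_i^2
    v = 1.0 / prec.sum()                                    # posterior variance
    m = (prec * y_all).sum() * v                            # posterior mean
    return m, v

m, v = gaussian_lms(y=np.array([4.8, 5.3, 5.1]),
                    sigma=np.array([1.0, 0.5, 2.0]),
                    x0=4.0, sigma0=2.0)
print(m, v)
```

Low-noise observations (small σi) get large weights, and a vague prior (large σ0) contributes little to the estimate.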
Ex. 8.11
• X = continuous uniform over [4, 10].
• W = continuous uniform over [−1, 1].
• X and W are independent.
• Y = X + W.
• Calculate the LMS estimate of X based on Y=y.
• Calculate the associated conditional MSE.

Ex. 8.11: the joint PDF
fY|X(y|x) is uniform over [x−1, x+1], so
$$ f_{X,Y}(x, y) = f_X(x)\, f_{Y|X}(y \mid x) = \begin{cases} 1/12, & \text{if } 4 \le x \le 10 \text{ and } x-1 \le y \le x+1, \\ 0, & \text{otherwise.} \end{cases} $$

Ex. 8.11: support of the joint PDF of X and Y
The joint PDF is supported (i.e., is nonzero) inside the parallelogram in the (y, x) plane bounded by the lines x = 4, x = 10, x − 1 = y, and x + 1 = y; y ranges over [3, 11]. [Figure: the parallelogram, with x on the vertical axis from 4 to 10 and y on the horizontal axis from 3 to 11.]

Ex. 8.11: the posterior
$$ f_{X|Y}(x \mid y) = \frac{f_{X,Y}(x, y)}{f_Y(y)} $$
fX,Y is uniform over the parallelogram. Therefore, for any fixed y=y0, the posterior is uniform over the intersection of the line y=y0 with the parallelogram. Thus fX|Y(x|y) is uniform over:
• [4, y+1] for 3 ≤ y ≤ 5
• [y−1, y+1] for 5 ≤ y ≤ 9
• [y−1, 10] for 9 ≤ y ≤ 11

Ex. 8.11: the LMS estimate
The LMS estimate is the midpoint of the posterior interval. Therefore, E[X|Y=y] =
• (y+5)/2 for 3 ≤ y ≤ 5
• y for 5 ≤ y ≤ 9
• (y+9)/2 for 9 ≤ y ≤ 11

Ex. 8.11: the conditional MSE
Since the posterior is uniform, and the variance of a uniform distribution over an interval of length L is L²/12, the conditional MSE is E[(X−E[X|Y=y])² | Y=y] = var(X|Y=y) =
• (y−3)²/12 for 3 ≤ y ≤ 5
• 1/3 for 5 ≤ y ≤ 9
• (11−y)²/12 for 9 ≤ y ≤ 11
[Figure: the conditional MSE as a function of y. It rises from 0 at y=3 to 1/3 at y=5, stays at 1/3 for 5 ≤ y ≤ 9, and falls back to 0 at y=11.]
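A sketch (illustrative; the function names are my own) of the piecewise LMS estimate and conditional MSE, with a Monte Carlo spot check at one value of y:

```python
import numpy as np

# Sketch (illustrative): piecewise formulas for Ex. 8.11.
def lms_estimate(y):
    """E[X|Y=y] for X ~ U[4,10], W ~ U[-1,1], Y = X + W."""
    if 3 <= y <= 5:
        return (y + 5) / 2       # posterior uniform on [4, y+1]
    if 5 <= y <= 9:
        return y                 # posterior uniform on [y-1, y+1]
    if 9 <= y <= 11:
        return (y + 9) / 2       # posterior uniform on [y-1, 10]
    raise ValueError("y outside the support [3, 11]")

def conditional_mse(y):
    """var(X|Y=y): (length of the posterior interval)^2 / 12."""
    if 3 <= y <= 5:
        return (y - 3) ** 2 / 12
    if 5 <= y <= 9:
        return 1 / 3
    if 9 <= y <= 11:
        return (11 - y) ** 2 / 12
    raise ValueError("y outside the support [3, 11]")

# Spot check at y = 4: keep samples whose Y lands near 4.
rng = np.random.default_rng(0)
x = rng.uniform(4, 10, size=2_000_000)
yv = x + rng.uniform(-1, 1, size=x.size)
near = np.abs(yv - 4) < 0.01
print(x[near].mean(), lms_estimate(4))      # both ≈ 4.5
print(x[near].var(), conditional_mse(4))    # both ≈ 1/12 ≈ 0.083
```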
Estimation error for LMS
We denote the LMS estimator of X based on Y by
$$ \hat{X}_{\mathrm{LMS}} = E[X \mid Y]. $$
The associated estimation error is defined as:
$$ \tilde{X}_{\mathrm{LMS}} = \hat{X}_{\mathrm{LMS}} - X = E[X \mid Y] - X $$

Properties of the estimation error for LMS
The estimation error has zero mean:
$$ E\big[\tilde{X}_{\mathrm{LMS}}\big] = E\big[E[X \mid Y] - X\big] = E\big[E[X \mid Y]\big] - E[X] = 0 $$
The estimation error is uncorrelated with the estimator:
$$ E\big[\hat{X}_{\mathrm{LMS}}\,\tilde{X}_{\mathrm{LMS}}\big] = 0 $$
Therefore,
$$ \operatorname{var}(X) = \operatorname{var}\big(\hat{X}_{\mathrm{LMS}} - (\hat{X}_{\mathrm{LMS}} - X)\big) = \operatorname{var}\big(\hat{X}_{\mathrm{LMS}} - \tilde{X}_{\mathrm{LMS}}\big) = \operatorname{var}\big(\hat{X}_{\mathrm{LMS}}\big) + \operatorname{var}\big(-\tilde{X}_{\mathrm{LMS}}\big) = \operatorname{var}\big(\hat{X}_{\mathrm{LMS}}\big) + \operatorname{var}\big(\tilde{X}_{\mathrm{LMS}}\big) $$
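A Monte Carlo sketch (illustrative; reusing the Gaussian one-observation setup from earlier, where E[X|Y] = Y/2) of these three properties:

```python
import numpy as np

# Sketch: check the estimation-error properties with X ~ N(0,1),
# Y = X + W, W ~ N(0,1) independent, so that E[X|Y] = Y/2.
rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(size=n)
y = x + rng.normal(size=n)

x_hat = y / 2            # LMS estimator E[X|Y]
x_tilde = x_hat - x      # estimation error

print(x_tilde.mean())                          # ≈ 0: zero-mean error
print(np.mean(x_hat * x_tilde))                # ≈ 0: uncorrelated with estimator
print(x.var(), x_hat.var() + x_tilde.var())    # ≈ equal: var(X) = var + var
```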