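Sampling Models for the Population Mean
Ed Stanek, UMASS Amherst

Basic Problem (Population Mean)

Population data:

Listing    Latent Value
Rose       $y_{Rose}$
Lily       $y_{Lily}$
Daisy      $y_{Daisy}$

What is $\mu$?

\[ \mu = \frac{y_{Rose} + y_{Lily} + y_{Daisy}}{3} \]

Basic Problem (Population Mean): Some Notation

Label j    Listing    Latent Value
1          Rose       $y_1 = y_{Rose}$
2          Lily       $y_2 = y_{Lily}$
3          Daisy      $y_3 = y_{Daisy}$

• Label: $j = 1, \dots, N$
• Population listing $L$: the set of subjects in the population, listed in the order $\lambda_0$ = (Rose, Lily, Daisy)
• Latent values: $\mathbf{y}_0 = ((y_j)) = (y_1, y_2, y_3)' = (y_{Rose}, y_{Lily}, y_{Daisy})'$

Using set notation:

\[ \mu = \frac{1}{N} \sum_{j \in L} y_j = \frac{y_{Lily} + y_{Rose} + y_{Daisy}}{N} \]

Using vector notation:

\[ \mu = \frac{1}{N} \sum_{j=1}^{N} y_j \]

Assumption: Response is equal to the latent value for the subject. There is no measurement error.

Sampling Model

• Select a simple random sample without replacement of size n
  – Define an estimator that is a linear function of the sample data
  – Require the estimator to be unbiased
  – Determine coefficients that minimize the variance (over all possible samples)
• Best Linear Unbiased Estimator (BLUE)

Sampling Model: Select a simple random sample without replacement

All possible permutations of subjects (R = Rose, L = Lily, D = Daisy):

Order i   Potential Response   p*=1   p*=2   p*=3   p*=4   p*=5   p*=6
i = 1     $Y_1$                R      R      L      L      D      D
i = 2     $Y_2$                L      D      R      D      R      L
i = 3     $Y_3$                D      L      D      R      L      R

Probability of a permutation:

\[ I_{p^*} = \begin{cases} 1 & \text{if } \mathbf{Y} = \mathbf{u}_{p^*}\mathbf{y}_0 \\ 0 & \text{otherwise} \end{cases} \qquad E(I_{p^*}) = p_{p^*} = \frac{1}{N!} \quad \text{for all } p^* = 1, 2, \dots, N! \]

Sampling Model: Select a simple random sample without replacement

All possible permutations of latent values:

Potential Response   p*=1          p*=2          p*=3          p*=4          p*=5          p*=6
$Y_1$                $y_{Rose}$    $y_{Rose}$    $y_{Lily}$    $y_{Lily}$    $y_{Daisy}$   $y_{Daisy}$
$Y_2$                $y_{Lily}$    $y_{Daisy}$   $y_{Rose}$    $y_{Daisy}$   $y_{Rose}$    $y_{Lily}$
$Y_3$                $y_{Daisy}$   $y_{Lily}$    $y_{Daisy}$   $y_{Rose}$    $y_{Lily}$    $y_{Rose}$

\[ \mathbf{Y} = \begin{pmatrix} Y_1 \\ Y_2 \\ Y_3 \end{pmatrix} = ((Y_i)) = \sum_{p^*=1}^{N!} I_{p^*}\, \mathbf{u}_{p^*} \mathbf{y}_0, \qquad \mathbf{y}_0 = ((y_j)) = \begin{pmatrix} y_{Rose} \\ y_{Lily} \\ y_{Daisy} \end{pmatrix} \]

Each $\mathbf{u}_{p^*}$ is a permutation matrix, for example:

\[ \mathbf{u}_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad \mathbf{u}_1\mathbf{y}_0 = \begin{pmatrix} y_{Rose} \\ y_{Lily} \\ y_{Daisy} \end{pmatrix}; \qquad \mathbf{u}_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad \mathbf{u}_2\mathbf{y}_0 = \begin{pmatrix} y_{Rose} \\ y_{Daisy} \\ y_{Lily} \end{pmatrix}; \qquad \mathbf{u}_6 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad \mathbf{u}_6\mathbf{y}_0 = \begin{pmatrix} y_{Daisy} \\ y_{Lily} \\ y_{Rose} \end{pmatrix} \]

Sampling Model: Select a simple random sample without replacement

All possible permutations. Each potential response is written as its expected value plus a deviation:

\[ Y_i = E(Y_i) + E_i, \qquad \mathbf{Y} = \sum_{p^*=1}^{N!} I_{p^*}\, \mathbf{u}_{p^*} \mathbf{y}_0 \]

The permuted vector $\mathbf{Y} = (Y_1, Y_2, Y_3)'$ (positions $i = 1, 2, 3$) is partitioned into the data and the remainder.

Sampling Model: Select a simple random sample without replacement

• Represent the population as a vector of random variables
• The random variables are indexed by their position, not by the label of the subject in that position
• The subject corresponding to a random variable cannot be identified

Sample size n = 1: the data are $Y_1$ (position $i = 1$) and the remainder is $(Y_2, Y_3)'$.
Sample size n = 2: the data are $(Y_1, Y_2)'$ and the remainder is $Y_3$.

Sampling Model: Define the Target

The target is a linear combination of the population random variables:

\[ P = \mathbf{g}'\mathbf{Y} = \sum_{i=1}^{N} g_i Y_i, \qquad \mathbf{g}' = (g_1 \;\; g_2 \;\; \cdots \;\; g_N) \]

• May be a parameter
• May be a random variable

Special case, the mean (a parameter):

\[ P = \frac{1}{N}\mathbf{1}_N'\mathbf{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i = \mu, \qquad g_i = \frac{1}{N} \text{ for all } i = 1, \dots, N \]

Special case, the latent value for a randomly selected subject (a random variable):

\[ P = Y_i, \qquad g_i = 1 \text{ for } i, \quad g_{i^*} = 0 \text{ for all } i^* \neq i \]

Sampling Model: Expected Value

\[ Y_i = E(Y_i) + E_i, \qquad \mathbf{Y} = \begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix}, \qquad \mathbf{Y} = E(\mathbf{Y}) + \mathbf{E} = \mathbf{X}\beta + \mathbf{E} \]

Under simple random sampling without replacement, $E(Y_i) = \mu$, so $E(\mathbf{Y}) = \mathbf{1}_N \mu$.

Linear link function: $E(\mathbf{Y}) = \mathbf{X}\beta$ with $\mathbf{X} = \mathbf{1}_N$ and $\beta = \mu$.

Expected value of the data and the remainder:

\[ E\begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix} = \begin{pmatrix} \mathbf{X}_I \\ \mathbf{X}_{II} \end{pmatrix}\mu = \begin{pmatrix} \mathbf{1}_n \\ \mathbf{1}_{N-n} \end{pmatrix}\mu \]

Sampling Model: Variance

\[ \mathrm{var}(\mathbf{Y}) = \sigma^2\left(\mathbf{I}_N - \frac{1}{N}\mathbf{J}_N\right) = \sigma^2 \mathbf{P}_N, \qquad \mathbf{P}_N = \mathbf{I}_N - \frac{1}{N}\mathbf{J}_N, \qquad \sigma^2 = \frac{1}{N-1}\sum_{s=1}^{N}(y_s - \mu)^2 \]

The term $-\frac{1}{N}\mathbf{J}_N$ is due to the finite population correction factor.

Variance of the data and the remainder:

\[ \mathrm{var}\begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix} = \sigma^2 \begin{pmatrix} \mathbf{I}_n - \frac{1}{N}\mathbf{J}_n & -\frac{1}{N}\mathbf{1}_n\mathbf{1}_{N-n}' \\ -\frac{1}{N}\mathbf{1}_{N-n}\mathbf{1}_n' & \mathbf{I}_{N-n} - \frac{1}{N}\mathbf{J}_{N-n} \end{pmatrix} = \begin{pmatrix} \mathbf{V}_I & \mathbf{V}_{I,II} \\ \mathbf{V}_{II,I} & \mathbf{V}_{II} \end{pmatrix} \]

Sampling Model: Expected Value and Variance: Reference Sets

Expectation is evaluated over a reference set.

Reference set: the set of possible values that the sample random variables can have with positive probability.

Example: if $n = 1$, then $\mathbf{Y}_I = Y_1$ and the reference set for $\mathbf{Y}_I = Y_1$ is

\[ \{ y_{Lily},\; y_{Rose},\; y_{Daisy} \} \]

Sampling Model: Expected Value and Variance: Reference Sets

With $n = 1$ and $\mathbf{Y}_I = Y_1$, the expected value is a probability-weighted sum over the elements of the reference set:

\[ E(\mathbf{Y}_I) = E(Y_1) = \sum_{\text{reference elements}} P(\text{reference element})\; y_{\text{reference element}}, \qquad P(\text{reference element}) = \frac{1}{3} \]

\[ E(Y_1) = \frac{1}{3} y_{Lily} + \frac{1}{3} y_{Rose} + \frac{1}{3} y_{Daisy} \]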
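The expected value and variance stated for the permutation model can be checked by brute force. The sketch below enumerates all $N!$ equally likely permutations for $N = 3$ and compares the resulting moments with $E(Y_i) = \mu$ and $\mathrm{var}(\mathbf{Y}) = \sigma^2(\mathbf{I}_N - \frac{1}{N}\mathbf{J}_N)$. The latent values 6, 10 and 8 and all function names are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import permutations

# Illustrative latent values (assumed for this sketch).
latent = {"Rose": Fraction(6), "Lily": Fraction(10), "Daisy": Fraction(8)}
y0 = list(latent.values())
N = len(y0)

mu = sum(y0) / N
sigma2 = sum((y - mu) ** 2 for y in y0) / (N - 1)

# Every permutation of the population has probability 1/N!.
perms = list(permutations(y0))
prob = Fraction(1, len(perms))


def expectation(func):
    """Expected value of func(Y) over all equally likely permutations."""
    return sum(prob * func(p) for p in perms)


def covariance(i, k):
    """Covariance between the responses in positions i and k."""
    return expectation(lambda p: (p[i] - mu) * (p[k] - mu))


mean_by_position = [expectation(lambda p, i=i: p[i]) for i in range(N)]
cov = [[covariance(i, k) for k in range(N)] for i in range(N)]

# Covariance matrix claimed by the model: sigma^2 * (I_N - J_N / N).
model_cov = [
    [sigma2 * ((1 if i == k else 0) - Fraction(1, N)) for k in range(N)]
    for i in range(N)
]

print("E(Y_i) by position:", mean_by_position)
print("covariance matches sigma^2 (I - J/N):", cov == model_cov)
```

Exact fractions are used instead of floats so that the comparison with the model covariance is an equality rather than a tolerance check.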
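Sampling Model: Expected Value and Variance: Reference Sets

Example when $n = 2$: the data are $\mathbf{Y}_I = (Y_1, Y_2)'$ and the remainder is $\mathbf{Y}_{II} = Y_3$.

The reference set for $\mathbf{Y}_I$ is the collection of sets of possible latent values:

\[ \{\, \{y_{Lily}, y_{Rose}\},\; \{y_{Lily}, y_{Daisy}\},\; \{y_{Rose}, y_{Daisy}\} \,\} \]

If $y_{Lily} = 10$, $y_{Daisy} = 8$ and $y_{Rose} = 6$, the reference set for $\mathbf{Y}_I$ is

\[ \{\, \{10, 6\},\; \{10, 8\},\; \{6, 8\} \,\} \]

Sampling Model: Expected Value and Variance: Reference Sets vs Sequence

Example when $n = 2$, with $\mathbf{Y}_I = (Y_1, Y_2)'$.

Permutations (sequences), with L = Lily, R = Rose, D = Daisy:

        p*=1   p*=2   p*=3   p*=4   p*=5   p*=6
$Y_1$   L      L      R      R      D      D
$Y_2$   R      D      L      D      L      R
$Y_3$   D      R      D      L      R      L

Reference set for $\mathbf{Y}_I$:

\[ \{\, \{y_{Lily}, y_{Rose}\},\; \{y_{Lily}, y_{Daisy}\},\; \{y_{Rose}, y_{Daisy}\} \,\} \]

Reference sequence for $\mathbf{Y}_I$:

\[ \begin{pmatrix} y_{Lily} \\ y_{Rose} \end{pmatrix}, \begin{pmatrix} y_{Lily} \\ y_{Daisy} \end{pmatrix}, \begin{pmatrix} y_{Rose} \\ y_{Lily} \end{pmatrix}, \begin{pmatrix} y_{Rose} \\ y_{Daisy} \end{pmatrix}, \begin{pmatrix} y_{Daisy} \\ y_{Lily} \end{pmatrix}, \begin{pmatrix} y_{Daisy} \\ y_{Rose} \end{pmatrix} \]

Sampling Model: Expected Value and Variance: Reference Sets vs Sequence

Example when $n = 2$, with $\mathbf{Y}_I = (Y_1, Y_2)'$.

Reference sequence (used in the random permutation model):

\[ \begin{pmatrix} y_{Lily} \\ y_{Rose} \end{pmatrix}, \begin{pmatrix} y_{Rose} \\ y_{Lily} \end{pmatrix}, \begin{pmatrix} y_{Lily} \\ y_{Daisy} \end{pmatrix}, \begin{pmatrix} y_{Daisy} \\ y_{Lily} \end{pmatrix}, \begin{pmatrix} y_{Rose} \\ y_{Daisy} \end{pmatrix}, \begin{pmatrix} y_{Daisy} \\ y_{Rose} \end{pmatrix} \]

Reference set (sufficient, assuming order doesn't matter):

\[ \{\, \{y_{Lily}, y_{Rose}\},\; \{y_{Lily}, y_{Daisy}\},\; \{y_{Rose}, y_{Daisy}\} \,\} \]

Sampling Model: Determining the BLUE for $\mu$

Target:

\[ P = \mathbf{g}'\mathbf{Y} = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{g}_{II}'\mathbf{Y}_{II}, \qquad \text{where } \mathbf{g}_I = \frac{1}{N}\mathbf{1}_n, \quad \mathbf{g}_{II} = \frac{1}{N}\mathbf{1}_{N-n} \]

Only $\mathbf{Y}_I$ is observed (the data).

Linear estimator:

\[ \hat{P} = (\mathbf{g}_I + \mathbf{a})'\mathbf{Y}_I = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{a}'\mathbf{Y}_I, \qquad \mathbf{a}' = (a_1 \;\; a_2 \;\; \cdots \;\; a_n) \]

Question: What should $\mathbf{a}$ be so that the estimator is unbiased and has minimum variance?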
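The question can be explored numerically before it is answered analytically. The sketch below enumerates the reference sequence for $n = 2$ with the example values $y_{Lily} = 10$, $y_{Daisy} = 8$, $y_{Rose} = 6$, and evaluates the bias and variance of $\hat{P} = (\mathbf{g}_I + \mathbf{a})'\mathbf{Y}_I$ for a few candidate vectors $\mathbf{a}$. The three candidates and the helper name `bias_and_variance` are arbitrary choices made for illustration.

```python
from fractions import Fraction
from itertools import permutations

latent = {"Lily": 10, "Daisy": 8, "Rose": 6}  # values from the n = 2 example
N, n = len(latent), 2
mu = Fraction(sum(latent.values()), N)  # the target P = mu

# Reference sequence: every ordered sample (Y_1, Y_2), each equally likely.
reference_sequence = list(permutations(latent.values(), n))
prob = Fraction(1, len(reference_sequence))


def bias_and_variance(weights):
    """Bias and variance of sum_i weights[i] * Y_i as an estimator of mu."""
    estimates = [
        sum(w * y for w, y in zip(weights, sample))
        for sample in reference_sequence
    ]
    mean = sum(prob * e for e in estimates)
    variance = sum(prob * (e - mean) ** 2 for e in estimates)
    return mean - mu, variance


g_I = [Fraction(1, N)] * n

# Hypothetical candidate values of a (illustration only).
candidates = {
    "a = (0, 0)": [Fraction(0), Fraction(0)],
    "a = (1/6, 1/6)": [Fraction(1, 6), Fraction(1, 6)],
    "a = (5/12, -1/12)": [Fraction(5, 12), Fraction(-1, 12)],
}

results = {}
for label, a in candidates.items():
    weights = [g + a_i for g, a_i in zip(g_I, a)]
    results[label] = bias_and_variance(weights)
    print(label, "bias:", results[label][0], "variance:", results[label][1])
```

With these values, $\mathbf{a} = \mathbf{0}$ gives a biased estimator, while the other two candidates are unbiased; of those two, the equal-weight choice has the smaller variance (2/3 against 7/6).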
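Sampling Model: Determining the BLUE for $\mu$: Unbiased Constraint

Unbiased requirement: $E(\hat{P} - P) = 0$.

\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{a}'\mathbf{Y}_I, \qquad P = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{g}_{II}'\mathbf{Y}_{II}, \qquad \hat{P} - P = \mathbf{a}'\mathbf{Y}_I - \mathbf{g}_{II}'\mathbf{Y}_{II} \]

Since

\[ E(\mathbf{Y}) = \mathbf{1}_N\mu, \qquad E\begin{pmatrix} \mathbf{Y}_I \\ \mathbf{Y}_{II} \end{pmatrix} = \begin{pmatrix} \mathbf{1}_n \\ \mathbf{1}_{N-n} \end{pmatrix}\mu = \begin{pmatrix} \mathbf{X}_I \\ \mathbf{X}_{II} \end{pmatrix}\mu, \]

the expected difference is

\[ E(\hat{P} - P) = \begin{pmatrix} \mathbf{a}' & -\mathbf{g}_{II}' \end{pmatrix} \begin{pmatrix} \mathbf{X}_I \\ \mathbf{X}_{II} \end{pmatrix}\mu, \]

which implies that

\[ \mathbf{a}'\mathbf{X}_I - \mathbf{g}_{II}'\mathbf{X}_{II} = 0. \]

Sampling Model: Determining the BLUE for $\mu$: Minimizing the Variance

\[ \hat{P} - P = \mathbf{a}'\mathbf{Y}_I - \mathbf{g}_{II}'\mathbf{Y}_{II} \]

Variance:

\[ \mathrm{var}_R(\hat{P} - P) = \begin{pmatrix} \mathbf{a}' & -\mathbf{g}_{II}' \end{pmatrix} \begin{pmatrix} \mathbf{V}_I & \mathbf{V}_{I,II} \\ \mathbf{V}_{II,I} & \mathbf{V}_{II} \end{pmatrix} \begin{pmatrix} \mathbf{a} \\ -\mathbf{g}_{II} \end{pmatrix} \]

Unbiased constraint:

\[ \mathbf{a}'\mathbf{X}_I - \mathbf{g}_{II}'\mathbf{X}_{II} = 0 \]

Lagrangian function to minimize with respect to $\mathbf{a}$:

\[ f(\mathbf{a}, \lambda) = \mathbf{a}'\mathbf{V}_I\mathbf{a} - 2\mathbf{g}_{II}'\mathbf{V}_{II,I}\mathbf{a} + \mathbf{g}_{II}'\mathbf{V}_{II}\mathbf{g}_{II} + 2\left(\mathbf{a}'\mathbf{X}_I - \mathbf{g}_{II}'\mathbf{X}_{II}\right)\lambda \]

\[ \frac{\partial f(\mathbf{a}, \lambda)}{\partial \mathbf{a}} = 2\mathbf{V}_I\mathbf{a} - 2\mathbf{V}_{I,II}\mathbf{g}_{II} + 2\mathbf{X}_I\lambda, \qquad \frac{\partial f(\mathbf{a}, \lambda)}{\partial \lambda} = 2\left(\mathbf{X}_I'\mathbf{a} - \mathbf{X}_{II}'\mathbf{g}_{II}\right) \]

Setting both derivatives to zero at $(\hat{\mathbf{a}}, \hat{\lambda})$:

\[ \frac{1}{2}\begin{pmatrix} \partial f(\hat{\mathbf{a}}, \hat{\lambda}) / \partial \mathbf{a} \\ \partial f(\hat{\mathbf{a}}, \hat{\lambda}) / \partial \lambda \end{pmatrix} = \begin{pmatrix} \mathbf{V}_I & \mathbf{X}_I \\ \mathbf{X}_I' & 0 \end{pmatrix} \begin{pmatrix} \hat{\mathbf{a}} \\ \hat{\lambda} \end{pmatrix} - \begin{pmatrix} \mathbf{V}_{I,II}\mathbf{g}_{II} \\ \mathbf{X}_{II}'\mathbf{g}_{II} \end{pmatrix} = \begin{pmatrix} \mathbf{0}_n \\ 0 \end{pmatrix} \]

so that

\[ \begin{pmatrix} \mathbf{V}_I & \mathbf{X}_I \\ \mathbf{X}_I' & 0 \end{pmatrix} \begin{pmatrix} \hat{\mathbf{a}} \\ \hat{\lambda} \end{pmatrix} = \begin{pmatrix} \mathbf{V}_{I,II}\mathbf{g}_{II} \\ \mathbf{X}_{II}'\mathbf{g}_{II} \end{pmatrix}. \]

Sampling Model: Determining the BLUE for $\mu$: Minimizing the Variance

Solving the estimating equations

\[ \begin{pmatrix} \mathbf{V}_I & \mathbf{X}_I \\ \mathbf{X}_I' & 0 \end{pmatrix} \begin{pmatrix} \hat{\mathbf{a}} \\ \hat{\lambda} \end{pmatrix} = \begin{pmatrix} \mathbf{V}_{I,II}\mathbf{g}_{II} \\ \mathbf{X}_{II}'\mathbf{g}_{II} \end{pmatrix} \]

uses the inverse of a partitioned matrix:

\[ \mathbf{M} = \begin{pmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{pmatrix}, \qquad \mathbf{M}^{-1} = \begin{pmatrix} \mathbf{A}^{-1} + \mathbf{A}^{-1}\mathbf{B}\mathbf{Q}^{-1}\mathbf{C}\mathbf{A}^{-1} & -\mathbf{A}^{-1}\mathbf{B}\mathbf{Q}^{-1} \\ -\mathbf{Q}^{-1}\mathbf{C}\mathbf{A}^{-1} & \mathbf{Q}^{-1} \end{pmatrix}, \qquad \text{where } \mathbf{Q} = \mathbf{D} - \mathbf{C}\mathbf{A}^{-1}\mathbf{B}. \]

With $\mathbf{A} = \mathbf{V}_I$, $\mathbf{B} = \mathbf{X}_I$, $\mathbf{C} = \mathbf{X}_I'$ and $\mathbf{D} = 0$, we have $\mathbf{Q} = -\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I$ and

\[ \begin{pmatrix} \mathbf{V}_I & \mathbf{X}_I \\ \mathbf{X}_I' & 0 \end{pmatrix}^{-1} = \begin{pmatrix} \mathbf{V}_I^{-1} - \mathbf{V}_I^{-1}\mathbf{X}_I\left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1} & \mathbf{V}_I^{-1}\mathbf{X}_I\left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1} \\ \left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1} & -\left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1} \end{pmatrix} \]

Therefore

\[ \hat{\mathbf{a}} = \left[\mathbf{V}_I^{-1} - \mathbf{V}_I^{-1}\mathbf{X}_I\left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\right]\mathbf{V}_{I,II}\mathbf{g}_{II} + \mathbf{V}_I^{-1}\mathbf{X}_I\left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1}\mathbf{X}_{II}'\mathbf{g}_{II}. \]

Sampling Model: Determining the BLUE for $\mu$: Minimizing the Variance

Solving the estimating equations (continued). Transposing the solution for $\hat{\mathbf{a}}$:

\[ \hat{\mathbf{a}}' = \mathbf{g}_{II}'\mathbf{V}_{II,I}\left[\mathbf{V}_I^{-1} - \mathbf{V}_I^{-1}\mathbf{X}_I\left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\right] + \mathbf{g}_{II}'\mathbf{X}_{II}\left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1} \]

\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \hat{\mathbf{a}}'\mathbf{Y}_I \]

Let $\hat{\mu} = \left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{Y}_I$. Then

\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{g}_{II}'\left[\mathbf{X}_{II}\hat{\mu} + \mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{Y}_I - \mathbf{X}_I\hat{\mu}\right)\right] \]

\[ \mathrm{var}(\hat{P}) = \mathrm{var}\left[(\mathbf{g}_I + \hat{\mathbf{a}})'\mathbf{Y}_I\right] = (\mathbf{g}_I + \hat{\mathbf{a}})'\mathbf{V}_I(\mathbf{g}_I + \hat{\mathbf{a}}) \]

Sampling Model: Determining the BLUE of $\mu$

Using

\[ \mathbf{g}_I = \frac{1}{N}\mathbf{1}_n, \quad \mathbf{g}_{II} = \frac{1}{N}\mathbf{1}_{N-n}, \quad \mathbf{X}_I = \mathbf{1}_n, \quad \mathbf{X}_{II} = \mathbf{1}_{N-n}, \]

\[ \mathbf{V}_I = \sigma^2\left(\mathbf{I}_n - \frac{1}{N}\mathbf{J}_n\right), \qquad \mathbf{V}_I^{-1} = \frac{1}{\sigma^2}\left(\mathbf{I}_n + \frac{1}{N-n}\mathbf{J}_n\right), \]

we have

\[ \mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I = \frac{1}{\sigma^2}\,\frac{nN}{N-n}, \qquad \mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{Y}_I = \frac{1}{\sigma^2}\,\frac{N}{N-n}\,\mathbf{1}_n'\mathbf{Y}_I, \]

so that

\[ \hat{\mu} = \left(\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{X}_I\right)^{-1}\mathbf{X}_I'\mathbf{V}_I^{-1}\mathbf{Y}_I = \frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I. \]

Substituting into $\hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \mathbf{g}_{II}'\left[\mathbf{X}_{II}\hat{\mu} + \mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{Y}_I - \mathbf{X}_I\hat{\mu}\right)\right]$:

\[ \hat{P} = f\,\frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I + \frac{1}{N}\mathbf{1}_{N-n}'\left[\mathbf{1}_{N-n}\frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I + \mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{I}_n - \mathbf{1}_n\frac{1}{n}\mathbf{1}_n'\right)\mathbf{Y}_I\right] \]

\[ \phantom{\hat{P}} = f\,\frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I + \frac{N-n}{N}\,\frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I + \frac{1}{N}\mathbf{1}_{N-n}'\mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\right)\mathbf{Y}_I \]

\[ \phantom{\hat{P}} = f\bar{Y} + (1 - f)\bar{Y} + \frac{1}{N}\mathbf{1}_{N-n}'\mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\right)\mathbf{Y}_I \]

where $f = \dfrac{n}{N}$ and $\bar{Y} = \dfrac{1}{n}\mathbf{1}_n'\mathbf{Y}_I$.

Sampling Model: Determining the BLUE of $\mu$

\[ \hat{P} = f\bar{Y} + (1 - f)\bar{Y} + \frac{1}{N}\mathbf{1}_{N-n}'\mathbf{V}_{II,I}\mathbf{V}_I^{-1}\left(\mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\right)\mathbf{Y}_I \]

Now

\[ \mathbf{V}_{II,I}\mathbf{V}_I^{-1} = -\frac{1}{N}\mathbf{1}_{N-n}\mathbf{1}_n'\left(\mathbf{I}_n + \frac{1}{N-n}\mathbf{J}_n\right) = -\frac{1}{N}\left(1 + \frac{n}{N-n}\right)\mathbf{1}_{N-n}\mathbf{1}_n' = -\frac{1}{N-n}\mathbf{1}_{N-n}\mathbf{1}_n' \]

and

\[ \mathbf{1}_n'\left(\mathbf{I}_n - \frac{1}{n}\mathbf{J}_n\right) = \mathbf{0}_n'. \]

As a result, the last term vanishes and

\[ \hat{P} = f\bar{Y} + (1 - f)\bar{Y} = \bar{Y}, \qquad \text{where } \bar{Y} = \frac{1}{n}\mathbf{1}_n'\mathbf{Y}_I. \]

Sampling Model: Determining the BLUE of $\mu$

\[ \hat{P} = \mathbf{g}_I'\mathbf{Y}_I + \hat{\mathbf{a}}'\mathbf{Y}_I, \qquad \text{where } \mathbf{g}_I'\mathbf{Y}_I = f\bar{Y} \text{ and } \hat{\mathbf{a}}'\mathbf{Y}_I = (1 - f)\bar{Y}. \]

Now

\[ \mathrm{var}(\hat{P}) = \mathrm{var}\left[(\mathbf{g}_I + \hat{\mathbf{a}})'\mathbf{Y}_I\right] = \mathrm{var}(\bar{Y}). \]

Since $\bar{Y} = \dfrac{1}{n}\mathbf{1}_n'\mathbf{Y}_I$ and $\mathrm{var}(\mathbf{Y}_I) = \mathbf{V}_I = \sigma^2\left(\mathbf{I}_n - \dfrac{1}{N}\mathbf{J}_n\right)$,

\[ \mathrm{var}(\bar{Y}) = \frac{1}{n^2}\mathbf{1}_n'\mathbf{V}_I\mathbf{1}_n = \frac{\sigma^2}{n^2}\left(n - \frac{n^2}{N}\right), \qquad \mathrm{var}(\hat{P}) = (1 - f)\frac{\sigma^2}{n}. \]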
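The closed-form result can be cross-checked by solving the estimating equations numerically. The sketch below builds the bordered system with blocks $\mathbf{V}_I$, $\mathbf{X}_I$, $\mathbf{X}_I'$ and $0$ for the mean, solves it exactly with rational arithmetic, and compares the solution with $\hat{\mathbf{a}} = \frac{1-f}{n}\mathbf{1}_n$ and $\mathrm{var}(\hat{P}) = (1-f)\sigma^2/n$. The function names `solve`, `blue_coefficients` and `estimator_variance`, and the values $N = 3$, $n = 2$, $\sigma^2 = 4$, are assumptions made for this illustration.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve matrix @ x = rhs exactly by Gauss-Jordan elimination."""
    size = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        # Find a row with a nonzero pivot and move it into place.
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def blue_coefficients(N, n, sigma2):
    """Solve the estimating equations for the mean; return (a_hat, lambda_hat)."""
    sigma2 = Fraction(sigma2)
    V_I = [
        [sigma2 * ((1 if i == k else 0) - Fraction(1, N)) for k in range(n)]
        for i in range(n)
    ]
    # Bordered matrix [[V_I, X_I], [X_I', 0]] with X_I = 1_n.
    M = [V_I[i] + [Fraction(1)] for i in range(n)]
    M.append([Fraction(1)] * n + [Fraction(0)])
    # Right-hand side: V_{I,II} g_II = -(sigma2 (N - n) / N^2) 1_n,
    # and X_II' g_II = (N - n) / N.
    rhs = [-sigma2 * Fraction(N - n, N * N)] * n + [Fraction(N - n, N)]
    solution = solve(M, rhs)
    return solution[:n], solution[n]


def estimator_variance(weights, N, sigma2):
    """w' V_I w with V_I = sigma2 (I_n - J_n / N)."""
    total = sum(weights)
    return Fraction(sigma2) * (sum(w * w for w in weights) - total * total / N)


N, n, sigma2 = 3, 2, 4
a_hat, lambda_hat = blue_coefficients(N, n, sigma2)
weights = [Fraction(1, N) + a for a in a_hat]  # g_I + a_hat
f = Fraction(n, N)

print("a_hat:", a_hat)
print("g_I + a_hat:", weights)
print("var(P_hat):", estimator_variance(weights, N, sigma2))
print("(1 - f) sigma^2 / n:", (1 - f) * sigma2 / n)
```

The combined weights $\mathbf{g}_I + \hat{\mathbf{a}}$ come out as $1/n$ in every position, which is the sample mean, and the two variance lines agree.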