SAMPLE DESIGN: WHO WILL BE IN THE SAMPLE?
Lu Ann Aday, Ph.D.
The University of Texas School of Public Health

SAMPLE DESIGN: Key Components
• Target population or universe: group about which information is desired
• Sampling frame: operational definition of the target population which directly matches the target population, e.g., existing or constructed list of individuals from which the sample would actually be drawn
• Sample elements: types of individuals or units that will be drawn, i.e., the ultimate sampling unit is the final sampling unit that is usually the focus of the analysis, e.g., individuals

SAMPLE DESIGN: Types of Designs
• Probability sample: relies on laws of chance to pick the sample, where the probability of selection is known, i.e., based on the sampling fraction: n/N
• Nonprobability sample: relies on human judgment to pick the sample

SAMPLE DESIGN: Types of Nonprobability Designs
• Purposive: pick people for a certain purpose, e.g., focus groups
• Quota: pick a target number of people in certain categories, e.g., women 18-35
• Chunk: pick a convenient "chunk" of people, e.g., church attendees
• Volunteer: ask for volunteers, e.g., healthy male medical students
• Snowball: identify a small number of individuals representative of the population of interest, who then identify others that meet the same inclusion criteria, e.g., drug users

SAMPLE DESIGN: Types of Probability Designs
• Simple random sample
• Systematic random sample
• Stratified sample
• Cluster sample

SAMPLE DESIGN: Simple Random Sample
Definition: every unit in the population has a known, nonzero, and equal chance of being selected through a lottery-type procedure

SAMPLE DESIGN: Simple Random Sample Procedures
• Draw the sample randomly from numbers assigned to sampling elements placed in a sampling "urn", OR
• Use a random numbers table to identify the sampling elements to be included, OR
• Use computer software to randomly select the sample from a computerized sampling frame

RANDOM NUMBERS TABLE: Example
1-Select a random starting point "X"
2-Look at the first two digits of the random numbers
3-Proceed from left to right through the table to identify elements from the sampling frame (numbered 1-50) until the target sample size (n), e.g., 10, has been reached

    91567  42595  X  27958  30134  04024
    17955  56349     90999  49127  20044
    46503  18584     18845  49618  02304
    92157  89634     94824  78171  84610
    14577  62765     35065  81263  39667

SAMPLE DESIGN: Systematic Random Sample
Definition: variation of the simple random sample, selected by randomly choosing a starting point and then taking every k'th unit thereafter, where the sampling interval k is the inverse of the sampling fraction (k = N/n)

SAMPLE DESIGN: Systematic Random Sample Procedures
1-Determine the sampling interval required to sample the required number of cases, based on the sampling fraction: n/N, e.g., 10/50 = 1/5, so the interval is 5
2-Select a random starting point "X" within the first sampling interval, e.g., elements 1-5
3-Starting at "X", sample every (N/n)th case, e.g., every 5th case, from the sampling frame until the target sample size (n), e.g., 10, has been reached

SYSTEMATIC RANDOM SAMPLE: Example, e.g., n/N = 10/50 = 1/5 (20%)

     1      11      21      31      41
     2      12      22      32      42
     3 X    13 X    23 X    33 X    43 X
     4      14      24      34      44
     5      15      25      35      45
     6      16      26      36      46
     7      17      27      37      47
     8 X    18 X    28 X    38 X    48 X
     9      19      29      39      49
    10      20      30      40      50

SAMPLE DESIGN: Stratified Sample
Definition: sample based on dividing the population into homogeneous strata and drawing a random-type sample separately from all the strata
• Proportionate: use the same sampling fraction in each stratum
• Disproportionate: use a different sampling fraction in each (or selected) stratum

SAMPLE DESIGN: Stratified Sample Procedures
1-Order or group the sampling frame by relevant strata
2-Determine the sampling interval required to sample the required number of cases, based on the sampling fraction
3-Select a random starting point "X" within the first sampling interval
4-Starting at "X", sample every (N/n)th case from the sampling frame until the target sample size (n) has been reached

STRATIFIED SAMPLE: Example, Proportionate, e.g., n/N = 1/20 (5%) in all strata

    STRATA      N (%)          n/N     n (%)
    A             500 (5%)     1/20     25 (5%)
    B            3000 (30%)    1/20    150 (30%)
    C            2000 (20%)    1/20    100 (20%)
    D             500 (5%)     1/20     25 (5%)
    E             700 (7%)     1/20     35 (7%)
    F            1600 (16%)    1/20     80 (16%)
    G             700 (7%)     1/20     35 (7%)
    H            1000 (10%)    1/20     50 (10%)
    Total       10000                  500

STRATIFIED SAMPLE: Example, Disproportionate, e.g., n/N = 1/20 (5%) in strata B, C, F, H and 1/10 (10%) in strata A, D, E, G

    STRATA      N (%)          n/N     n (%)
    A             500 (5%)     1/10     50 (8.1%)
    B            3000 (30%)    1/20    150 (24.2%)
    C            2000 (20%)    1/20    100 (16.1%)
    D             500 (5%)     1/10     50 (8.1%)
    E             700 (7%)     1/10     70 (11.3%)
    F            1600 (16%)    1/20     80 (12.9%)
    G             700 (7%)     1/10     70 (11.3%)
    H            1000 (10%)    1/20     50 (8.1%)
    Total       10000                  620

SAMPLE DESIGN: Cluster Sample
Definition: sample based on dividing the population into heterogeneous clusters and drawing a random-type sample separately from a sample of clusters

CLUSTER SAMPLE: Example, Probability Proportionate to Size (PPS) (Aday & Cornelius, 2006, Table 6.2) (continued in next lecture)

    Block A: 100 HUs*    Block F: 250 HUs*    Block K: 200 HUs*
    Block B:  50 HUs     Block G: 125 HUs*    Block L: 300 HUs*
    Block C:  75 HUs     Block H:  50 HUs     Block M: 125 HUs
    Block D: 150 HUs*    Block I: 100 HUs*    Block N: 150 HUs*
    Block E: 200 HUs*    Block J:  50 HUs     Block O: 275 HUs*

CRITERIA FOR EVALUATING SAMPLE DESIGNS
• Precision: how close the estimates derived from the sample are to the true population value as a function of variable sampling error
• Accuracy: how close the estimates derived from the sample are to the true population value as a function of systematic sampling error (bias)
• Complexity: number of stages and steps required to implement the sample design
• Efficiency: obtaining the most accurate and precise estimates at the lowest possible cost

ADVANTAGES & DISADVANTAGES: Simple Random Sample
ADVANTAGES
• Requires little knowledge of population in advance
DISADVANTAGES
• May not capture certain groups of interest
• May not be very efficient

ADVANTAGES & DISADVANTAGES: Systematic Random Sample
ADVANTAGES
• Easy to analyze and compute sampling (standard) errors
• High precision
DISADVANTAGES
• Periodic ordering of elements in the sampling frame may create biases in the data
• May not capture certain groups of interest
• May not be very efficient

ADVANTAGES & DISADVANTAGES: Stratified Sample
ADVANTAGES
• Enables certain groups of interest to be captured
• Enables disproportionate sampling within strata
• Highest precision
DISADVANTAGES
• Requires knowledge of population in advance
• May introduce more complexity in analyzing data and computing sampling (standard) errors

ADVANTAGES & DISADVANTAGES: Cluster Sample
ADVANTAGES
• Lowers field costs
• Enables sampling of groups of individuals for which detail on individuals themselves may not be available
DISADVANTAGES
• Introduces more complexity in analyzing data and computing sampling (standard) errors
• Lowest precision
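
WORKED SKETCH: Systematic and Stratified Examples in Code
The systematic and stratified examples earlier in the lecture can be checked with a short Python sketch. This is an illustration only, not part of the original slides: the function names (systematic_sample, stratified_allocation) and the start and seed parameters are hypothetical, and the code assumes that the frame size is an exact multiple of the sample size and that each stratum size divides evenly by its sampling fraction, as in the slide examples.

```python
import random


def systematic_sample(frame, n, start=None, seed=None):
    """Systematic random sample: random start in the first interval, then every k-th element.

    Assumes len(frame) is an exact multiple of n (as in the 10/50 example).
    `start` is a 1-based position within the first interval; if omitted it is drawn at random.
    """
    k = len(frame) // n  # sampling interval, e.g., 50 // 10 = 5
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start among elements 1..k
    return [frame[start - 1 + i * k] for i in range(n)]


def stratified_allocation(strata_sizes, fractions):
    """Per-stratum sample size and its share of the total sample.

    `fractions` maps each stratum to (numerator, denominator), e.g., (1, 20).
    Assumes each stratum size divides evenly by its fraction (true for the slide tables).
    Returns {stratum: (n_h, percent of total sample rounded to one decimal)}.
    """
    n_h = {s: strata_sizes[s] * num // den for s, (num, den) in fractions.items()}
    total = sum(n_h.values())
    return {s: (n, round(100 * n / total, 1)) for s, n in n_h.items()}


# Sampling frame numbered 1-50, target sample size 10, starting point X = 3
frame = list(range(1, 51))
picked = systematic_sample(frame, 10, start=3)
print(picked)

# Strata sizes from the stratified sample tables
sizes = {"A": 500, "B": 3000, "C": 2000, "D": 500,
         "E": 700, "F": 1600, "G": 700, "H": 1000}

# Proportionate: 1/20 in every stratum
proportionate = stratified_allocation(sizes, {s: (1, 20) for s in sizes})

# Disproportionate: 1/10 in strata A, D, E, G and 1/20 in the rest
disproportionate = stratified_allocation(
    sizes, {s: (1, 10) if s in "ADEG" else (1, 20) for s in sizes}
)

for s in sizes:
    print(s, proportionate[s], disproportionate[s])
```

With start=3 the first call selects elements 3, 8, 13, ..., 48, matching the marked cells in the systematic example. The disproportionate allocation shows why oversampled strata need weighting at analysis time: stratum A is 5% of the population but about 8.1% of the sample.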