Africa Impact Evaluation Program on AIDS (AIM-AIDS)
Cape Town, South Africa, March 8 – 13, 2009

Quasi-Experimental Methods
Jean-Louis Arcand, The Graduate Institute | Geneva
[email protected]

Objective
• Find a plausible counterfactual

Reality check
• Every method is associated with an assumption
• The stronger the assumption, the more we need to worry about the causal effect
  » Question your assumptions

Program to evaluate
• Hopetown HIV/AIDS Program (2008-2012)
• Objective: reduce HIV transmission
• Intervention: peer education
• Target group: youth 15-24
• Indicator: pregnancy rate (proxy for unprotected sex)

I. Before-after identification strategy (aka reflexive comparison)
Counterfactual: the rate of pregnancy observed before the program started
EFFECT = After minus Before

Year         Number of areas   Teen pregnancy rate (per 1000)
2008         70                62.90
2012         70                66.37
Difference                     +3.47

Counterfactual assumption: no change over time
[Chart: teen pregnancy (per 1000) rises from 62.9 in 2008 to 66.37 in 2012 with the intervention; effect = +3.47]
Question: what else might have happened in 2008-2012 to affect teen pregnancy?

Examine the assumption with prior data

Number of areas   Teen pregnancy (per 1000)
                  2004      2008      2012
70                54.96     62.90     66.37

The assumption of no change over time looks a bit shaky.

II. Non-participant identification strategy
Counterfactual: the rate of pregnancy among non-participants

Teen pregnancy rate (per 1000) in 2012
Participants       66.37
Non-participants   57.50
Difference         +8.87

Counterfactual assumption: without the intervention, participants would have the same pregnancy rate as non-participants
[Chart: teen pregnancy (per 1000) in 2012: participants 66.4 vs. non-participants 57.5; effect = +8.87]
Question: how might participants differ from non-participants?

Test the assumption with pre-program data
[Chart: teen pregnancy (per 1000): participants 62.9 (2008) to 66.4 (2012); non-participants 46.37 (2008) to 57.5 (2012)]
REJECT the counterfactual hypothesis of same pregnancy rates.

III. Difference-in-difference identification strategy
Counterfactual:
1. The non-participant rate of pregnancy, purging pre-program differences between participants and non-participants
2. The "before" rate of pregnancy, purging the before-after change for non-participants
1 and 2 are equivalent.

Average rate of teen pregnancy   2008    2012    Difference (2008-2012)
Participants (P)                 62.90   66.37    3.47
Non-participants (NP)            46.37   57.50   11.13
Difference (P-NP)                16.53    8.87   -7.66

Effect = 3.47 – 11.13 = -7.66
[Chart: participants rise by 66.37 – 62.90 = 3.47; non-participants rise by 57.50 – 46.37 = 11.13]

Effect = 8.87 – 16.53 = -7.66
[Chart: after, 66.37 – 57.50 = 8.87; before, 62.90 – 46.37 = 16.53]

Counterfactual assumption: without the intervention, participants' and non-participants' pregnancy rates follow the same trends
[Chart: under the same-trends counterfactual, participants would have reached about 74.0 in 2012 (keeping the 16.5 gap); the observed 66.4 implies an effect of -7.6]

Questioning the assumption
• Why might participants' trends differ from those of non-participants?

Examine the assumption with pre-program data

Average rate of teen pregnancy   2004    2008    Difference (2004-2008)
Participants (P)                 54.96   62.90    7.94
Non-participants (NP)            39.96   46.37    6.41
Difference (P-NP)                15.00   16.53   +1.53

The counterfactual hypothesis of same trends doesn't look so believable.
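The double difference above can be reproduced either directly from the four cell means or as the interaction coefficient in a simple regression. Below is a minimal sketch in Python using only the cell means reported on the slides; the data frame and variable names are illustrative, not taken from the original dataset.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Cell means from the slides: teen pregnancy per 1000
means = {
    ("participant", 2008): 62.90, ("participant", 2012): 66.37,
    ("nonparticipant", 2008): 46.37, ("nonparticipant", 2012): 57.50,
}

# Double difference: (P after - P before) - (NP after - NP before)
did = (means[("participant", 2012)] - means[("participant", 2008)]) - (
    means[("nonparticipant", 2012)] - means[("nonparticipant", 2008)]
)
print(f"Diff-in-diff effect: {did:+.2f}")  # -7.66

# Equivalent regression form: the coefficient on participant x post is the
# diff-in-diff estimate. In practice this would be run on the area-level
# panel, with standard errors clustered by area.
df = pd.DataFrame(
    [(g, y, r) for (g, y), r in means.items()], columns=["group", "year", "rate"]
)
df["participant"] = (df["group"] == "participant").astype(int)
df["post"] = (df["year"] == 2012).astype(int)
fit = smf.ols("rate ~ participant * post", data=df).fit()
print(fit.params["participant:post"])  # -7.66
```

Fitting the saturated model on just the four means reproduces the arithmetic exactly; on the underlying area-level data the same formula would also deliver a standard error for the estimate.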
IV. Matching with difference-in-difference identification strategy
Counterfactual: the comparison group is constructed by pairing each program participant with a "similar" non-participant using a larger dataset – creating a control group from non-participants who are similar in observable ways

Counterfactual assumption: unobserved characteristics do not affect the outcomes of interest
Unobserved = things we cannot measure (e.g. ability) or things we left out of the dataset
Question: how might participants differ from matched non-participants?

[Chart: teen pregnancy rate (per 1000) in 2012: matched non-participants 73.36 vs. participants 66.37; effect = -7.01]

Can only test the assumption with experimental data
Studies that compare both methods (because they have experimental data) find that:
• unobservables often matter!
• the direction of the bias is unpredictable!
Apply with care – think very hard about unobservables

V. Regression discontinuity identification strategy
Applicability: when strict quantitative criteria determine eligibility
Counterfactual: non-participants just below the eligibility cutoff are the comparison for participants just above the eligibility cutoff

Counterfactual assumption: non-participants just below the eligibility cutoff are the same (in observable and unobservable ways) as participants just above the eligibility cutoff
Question: is the distribution around the cutoff smooth? Then the assumption might be reasonable.
Question: are unobservables likely to be important (e.g. correlated with the cutoff criteria)? Then the assumption might not be reasonable.
However, we can only estimate the impact around the cutoff, not for the whole program.

Example: effect of school inputs on test scores
• Target the transfer to the poorest schools
• Construct a poverty index from 1 to 100
• Schools with a score <= 50 are in
• Schools with a score > 50 are out
• Transfer inputs to the poor schools
• Measure outcomes (i.e. test scores) before and after the transfer

[Figure: Regression discontinuity design – baseline: test scores (roughly 60-80) plotted against the poverty score; poor schools (score <= 50) lie on one side of the 50-point cutoff, non-poor schools on the other]
[Figure: Regression discontinuity design – post-intervention: the vertical jump in test scores at the 50-point cutoff is the treatment effect]

Applying RDD in practice: lessons from an HIV-nutrition program
• Lesson 1: the criteria were not applied well
  – Multiple criteria: household size, income level, months on ART
  – The nutritionist helps her friends fill out the form with the "right" answers
  – Now unobservables separate treatment from control…
• Lesson 2: watch out for criteria that can be altered (e.g. land holding size)
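To make the school-inputs example concrete, here is a rough sketch of how the jump at the cutoff could be estimated. The data are simulated purely for illustration (the slides do not provide school-level microdata), and the local linear specification with a 10-point bandwidth is one common choice among several, not the method used in the original program.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated post-intervention data: poverty score 1-100, transfer if score <= 50
n = 2000
score = rng.uniform(1, 100, n)
treated = (score <= 50).astype(int)
# Hypothetical outcome: test scores rise with the index plus a 4-point jump
# for schools that received the input transfer.
test = 60 + 0.15 * score + 4.0 * treated + rng.normal(0, 3, n)
df = pd.DataFrame({"score": score, "treated": treated, "test": test})

# Local linear regression near the cutoff: centre the running variable at the
# cutoff and allow separate slopes on each side.
cutoff, bandwidth = 50, 10
local = df[(df["score"] - cutoff).abs() <= bandwidth].copy()
local["dist"] = local["score"] - cutoff
rd = smf.ols("test ~ treated + dist + treated:dist", data=local).fit(cov_type="HC1")
print(rd.params["treated"])  # estimated jump at the cutoff (about 4 here)
```

As the slides caution, an estimate like this speaks only to schools near the 50-point cutoff, not to the program as a whole.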
Summary
• The gold standard is randomization – minimal assumptions needed, intuitive estimates
• Nonexperimental methods require assumptions – can you defend them?

Different assumptions will give you different results
• The program: ART treatment for adult patients
• Impact of interest: the effect of ART on the children of patients (are there spillover and intergenerational effects of treatment?)
  – Child education (attendance)
  – Child nutrition
• Data: 250 patient households and a 500-household random sample
  – Before and after treatment
• We can't randomize ART, so what is the counterfactual?

Possible counterfactual candidates
• Random sample, difference in difference
  – Are they on the same trajectory?
• Orphans (parents died – what would have happened in the absence of treatment)
  – But when did they die, which orphans do you observe, and which do you not observe?
• Parents who self-report a moderate to high risk of HIV
  – Self-report!
• Propensity score matching
  – Unobservables (so why do people get HIV?)

Estimates of treatment effects using alternative comparison groups
Comparison groups: orphans in the random sample, and high/moderate HIV-risk households; children aged 8-18

                             Orphans in random sample                                    High/Mod. HIV risk households
                             All kids            Boys                Girls               All kids            Boys                Girls
ARV hh (<100 days) * Rd. 2   10.675 (3.262)***   15.686 (4.877)***   10.805 (4.676)**    10.787 (2.720)***   14.561 (3.832)***   10.397 (3.979)**
ARV hh (>100 days) * Rd. 2   5.808 (3.133)*      10.930 (4.467)**    2.503 (4.566)       5.316 (2.638)**     9.302 (3.513)***    1.652 (4.036)
Constant                     14.723 (5.583)***   13.073 (6.510)**    17.526 (10.406)*    15.836 (4.753)***   8.307 (5.693)       23.553 (7.712)***
Observations                 334                 164                 170                 424                 210                 214
R-squared                    0.86                0.84                0.90                0.85                0.87                0.86

• Compare to around 6.4 if we use the simple difference in difference with the random sample.
Standard errors clustered at the household level in each round. Includes child fixed effects, a round 2 indicator and month-of-interview indicators.

Estimating the ATT (average treatment effect on the treated) using propensity score matching
• Allows us to define the comparison group using more than one characteristic of children and their households
• Propensity scores are defined at the household level, with the most significant variables being single-headed household and HIV risk

Probit regression results
Dependent variable: household has an adult ARV recipient

                                                             Coefficient   z-value
Single-headed household                                      0.8917932     3.06
Amount of land owned (acres)                                 -0.0153242    -0.83
Household size                                               0.0060359     0.12
Value of livestock owned (shillings)                         9.36E-07      0.4
Travel time to main road (mins.)                             0.0034674     1.4
Value of durables owned (shillings)                          -9.35E-08     -0.01
House with tin roof                                          0.2535599     0.58
House with non-mud roof                                      0.2180698     0.7
Respondent reported high/moderate risk of having HIV/AIDS    2.76405       6.88
Constant                                                     -3.250733     -4.87
Observations                                                 225
Pseudo R-squared                                             0.5151

ATT using propensity score matching
Mean change in hours of school attendance between rounds 1 and 2

                                           Random sample   ARV households   Difference   t-stat
Nearest neighbor matching (2 neighbors)    -10.97          -3.69            7.28         1.94
Kernel matching (bandwidth = 0.06)         -7.82           -3.69            4.12         1.65
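The matching step itself can be written in a few lines. The function below is only a sketch: the column names are hypothetical, the probit mirrors the household-level specification on the slide above, and the nearest-neighbour step corresponds to the two-neighbour estimate in the table (kernel matching would replace it with a weighted average over all controls, and a real application would also enforce common support and bootstrap the standard errors).

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

def psm_att(df, change_col, treat_col, covariates, n_neighbors=2):
    """ATT on a round-1-to-round-2 change via propensity score matching.

    df         : household-level DataFrame (hypothetical column names)
    change_col : change in the outcome between rounds (e.g. school hours)
    treat_col  : 1 = ARV household, 0 = random-sample household
    covariates : household characteristics entering the probit
    """
    # 1. Probit for the probability of being an ARV household
    X = sm.add_constant(df[covariates])
    res = sm.Probit(df[treat_col], X).fit(disp=0)
    pscore = pd.Series(np.asarray(res.predict(X)), index=df.index)

    treated = df[df[treat_col] == 1]
    control = df[df[treat_col] == 0]
    p_t = pscore.loc[treated.index].to_numpy()
    p_c = pscore.loc[control.index].to_numpy()
    y_c = control[change_col].to_numpy()

    # 2. For each treated household, average the outcome changes of its
    #    nearest control households in propensity-score distance.
    matched = [y_c[np.argsort(np.abs(p_c - p))[:n_neighbors]].mean() for p in p_t]

    # 3. ATT = mean treated change minus mean matched control change
    return treated[change_col].mean() - float(np.mean(matched))
```

A call such as psm_att(hh, "change_school_hours", "arv_household", ["single_headed", "hiv_risk", "land_acres", "hh_size"]) would mimic the nearest-neighbour row of the table above; all of these names are placeholders for whatever the actual dataset uses.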
Nutritional impacts of ARV treatment
Dependent variables: WHZ (weight-for-height z-score) in columns 1-2 and WHZ <= -2 in columns 3-4; sample: all children 0-5 in round 1

                                               (1) WHZ          (2) WHZ           (3) WHZ <= -2      (4) WHZ <= -2
ARV household * Round 2                        0.315 (0.202)                      -0.098 (0.043)**
ARV household (<100 days in rd 1) * Round 2                     0.570 (0.277)**                      -0.071 (0.058)
ARV household (>100 days in rd 1) * Round 2                     -0.003 (0.252)                       -0.111 (0.053)**
Constant                                       -0.498 (0.386)   -0.481 (0.386)    0.076 (0.082)      0.077 (0.082)
Observations                                   772              772               772                772
R-squared                                      0.87             0.87              0.70               0.70

Includes child fixed effects, age controls, a round 2 indicator, interviewer fixed effects, and month-of-interview indicators.

Nutrition with alternative comparison groups
Dependent variable: WHZ

                                               (1) RS Orphans   (2) RS Orphans    (3) RS Mod/High Risk   (4) RS Mod/High Risk
ARV household * Round 2                        1.038 (0.733)                      0.521 (0.327)
ARV household (<100 days in rd 1) * Round 2                     1.195 (0.785)                            0.768 (0.392)*
ARV household (>100 days in rd 1) * Round 2                     0.773 (0.859)                            0.220 (0.419)
Constant                                       0.864 (1.567)    0.904 (1.588)     -0.339 (0.819)         -0.314 (0.818)
Observations                                   96               96                250                    250
R-squared                                      0.92             0.92              0.88                   0.88

Includes child fixed effects, age controls, a round 2 indicator, interviewer fixed effects, and month-of-interview indicators. (A rough code sketch of this specification follows the closing slide.)

Summary: choosing among nonexperimental methods
• At the end of the day, they can give us quite different estimates (or not, in some rare cases)
• Which assumption can we live with?

Thank You
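As the table notes indicate, the case-study regressions are child fixed-effects difference-in-difference specifications; the attendance results also cluster standard errors at the household level. A rough sketch of that specification follows; the data frame and column names are hypothetical, since the slides do not include code.

```python
import pandas as pd
import statsmodels.formula.api as smf

def arv_did(panel: pd.DataFrame, outcome: str = "whz") -> float:
    """Child fixed-effects diff-in-diff, as described in the table notes above.

    `panel` is a hypothetical child-round DataFrame with columns: the outcome
    (e.g. whz), arv_household, round2, age, child_id, household_id,
    interviewer_id and interview_month.
    """
    spec = (
        f"{outcome} ~ arv_household:round2 + round2 + age"
        " + C(child_id) + C(interviewer_id) + C(interview_month)"
    )
    fit = smf.ols(spec, data=panel).fit(
        cov_type="cluster", cov_kwds={"groups": panel["household_id"]}
    )
    # The ARV household x Round 2 interaction is the reported treatment effect;
    # the ARV main effect is absorbed by the child fixed effects.
    return fit.params["arv_household:round2"]
```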