The Choice Between Fixed and Random Effects Models: Some Considerations For Educational Research
Clarke, Crawford, Steele and Vignoles
Funding from ESRC ALSPAC Large Grant

Motivation
• Need evidence from different disciplines to answer the research question: how can we improve pupil achievement?
• Contribute to multi-disciplinary understanding by comparing common alternative models used by different disciplines

Introduction
• Pupils clustered within schools → hierarchical models
• Two popular choices: fixed and random effects
• Choice of model:
  – Often driven by discipline tradition (economists use fixed effects, for example)
  – May depend on whether primary interest is pupil or school characteristics

Illustrations
• What is the impact of SEN status on pupil achievement?
• What is the impact of FSM status on pupil achievement?

Why adjust for school effects?
• Want to estimate the causal effect of SEN on pupil attainment no matter what school the pupil attends
• Need to adjust for school differences in SEN labelling
  – e.g. children with moderate difficulties more likely to be labelled SEN in a high achieving school than in a low achieving school (Keslair et al, 2008; Ofsted, 2004)
  – May also be differences due to unobserved factors
• Hierarchical models can account for such differences
  – Fixed or random school effects?

Basic model
$y_{is} = \beta_0 + \beta_1 X_{is} + u_s + e_{is}$
• FE: $u_s$ is school dummy variable coefficient
• RE: $u_s$ is school level residual
  – Additional assumption required: $E[u_s \mid X_{is}] = 0$
  – That is, no correlation between unobserved school characteristics and observed pupil characteristics
• Both models assume: $E[e_{is} \mid X_{is}] = 0$
  – That is, no correlation between unobserved pupil characteristics and observed pupil characteristics

Relationship between FE, RE and OLS
$y_{is} = \beta_0 + \beta_1 X_{is} + u_s + e_{is}$
FE: $y_{is} - \bar{y}_s = \beta_1 (X_{is} - \bar{X}_s) + (e_{is} - \bar{e}_s)$
RE: $y_{is} - \theta \bar{y}_s = \beta_1 (X_{is} - \theta \bar{X}_s) + (e_{is} - \theta \bar{e}_s)$
Where: $\theta = 1 - \sqrt{\dfrac{1}{1 + S \sigma_u^2 / \sigma_e^2}}$
(bars denote school means)

How to choose between FE and RE
• Very important to consider sources of bias:
  – Is the RE assumption (i.e. $E[u_s \mid X_{is}] = 0$) likely to hold?
• Other issues:
  – Number of clusters
  – Sample size within clusters
  – Rich vs. sparse covariates
  – Whether variation is within or between clusters
• What is the real world consequence of choosing the wrong model?

SEN: Sources of selection
• Probability of being SEN may depend on:
  – Observed school characteristics
    • e.g. ability distribution, FSM distribution
  – Unobserved school characteristics
    • e.g. values/motivation of SEN coordinator
  – Observed pupil characteristics
    • e.g. prior ability, FSM status
  – Unobserved pupil characteristics
    • e.g. education values and/or motivation of parents

Intuition 1
• If probability of being labelled SEN depends ONLY on observed school characteristics:
  – e.g. schools with high FSM/low achieving intake are more or less likely to label a child SEN
• Random effects appropriate as RE assumption holds (i.e. unobserved school effects are not correlated with probability of being SEN)

Intuition 2
• If probability of being labelled SEN also depends on unobserved school characteristics:
  – e.g. SEN coordinator tries to label as many kids SEN as possible, because they attract additional resources
• Random effects inappropriate as RE assumption fails (i.e. unobserved school effects are correlated with probability of being SEN)
• FE accounts for these unobserved school characteristics, so is more appropriate
  – Identifies impact of SEN on attainment within schools rather than between schools

Intuition 3
• If probability of being labelled SEN depends on unobserved pupil/parent characteristics:
  – e.g. some parents may push harder for the label and accompanying additional resources;
  – alternatively, some parents may not countenance the idea of their kid being labelled SEN
• Neither FE nor RE will address the endogeneity problem:
  – Need to resort to other methods, e.g. IV

Data
• Avon Longitudinal Study of Parents and Children (ALSPAC)
  – Children born in Avon between April 1991 and December 1992
  – Rich data
    • Family background (including education, income, etc)
    • Medical and genetic information
    • Clinic testing of cognitive and non-cognitive skills
    • Linked to National Pupil Database

SEN
• One in four pupils in England has SEN at age 10
• Just under 4% have a statement
• In 2003-04, the period relevant to our data, approximately £1.3 billion spent on primary school SEN (excluding special schools)
  – £1,600 per pupil with SEN

SEN
• Substantial variation in % SEN across schools
• Quarter of schools have fewer than 15% SEN
• Quarter with more than 24% SEN
• Key question is whether the factors driving differences in % SEN between schools are correlated with unmeasured school-level influences on academic progress

Estimated effects of SEN status on progress between KS1 and KS2

                           Fixed effects             Random effects (i)
Model                      $\hat{\beta}_{FE}$ (se)   $\hat{\beta}_{RE}$ (se)   $\hat{?}_{RE}$   % diff
KS1 test score             -0.335 (0.025)            -0.330 (0.025)            0.175            1.5
Plus administrative data   -0.347 (0.025)            -0.342 (0.025)            0.161            1.4
Plus typical survey data   -0.355 (0.025)            -0.349 (0.024)            0.086            1.7
Plus rich cohort data      -0.321 (0.024)            -0.314 (0.024)            0.076            2.2
Plus school-level data     -0.321 (0.024)            -0.319 (0.024)            0.064            0.6

Results from this analysis
• SEN negatively correlated with progress between KS1 and KS2
• Choice of model does not seem to matter here
  – OLS, FE and RE give qualitatively similar results
  – Correlation between being SEN and unobserved school characteristics not important
• Regression and random effects assumptions are likely to hold in this example, so prefer the random effects model

Conclusions
• Fixed effects approach is often used because the RE assumption is a strong one
• Efficiency advantages to the RE approach
• Failure of the regression assumption is a major issue
• Approach each problem with an agnostic view on the model; the choice may not make a difference
  – Should be determined by theory and data, not tradition
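
Illustrative sketch: OLS, RE and FE as one family of estimators

The slide "Relationship between FE, RE and OLS" and the three "Intuition" slides can be illustrated with a small simulation. The sketch below is not the authors' analysis and does not use ALSPAC data. It assumes a single binary regressor (a stand-in for SEN status), balanced schools, and variance components that are known rather than estimated; the weight is written `theta` here and computed as 1 - sqrt(1 / (1 + S * sigma_u^2 / sigma_e^2)), with S taken to be the number of pupils per school. All parameter values and function names are hypothetical. Regressing the quasi-demeaned outcome on the quasi-demeaned regressor (with an intercept) gives pooled OLS when theta = 0 and the within (FE) estimator when theta = 1; the RE-style estimate lies in between.

```python
import math
import random

# All values below are illustrative assumptions, not estimates from the slides.
BETA1 = -0.3    # hypothetical "true" effect of the binary regressor
SIGMA_U = 0.5   # sd of the school effect u_s
SIGMA_E = 1.0   # sd of the pupil residual e_is
S = 5           # pupils per school (balanced clusters for simplicity)


def theta(s, sigma_u, sigma_e):
    """Quasi-demeaning weight: 1 - sqrt(1 / (1 + S * sigma_u^2 / sigma_e^2))."""
    return 1.0 - math.sqrt(1.0 / (1.0 + s * sigma_u ** 2 / sigma_e ** 2))


def ols_slope(x, y):
    """OLS slope of y on x (intercept included) for a single regressor."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def quasi_demeaned_slope(schools, th):
    """Regress (y - th * school mean of y) on (x - th * school mean of x).

    th = 0 is pooled OLS, th = 1 is the within (FE) estimator, and
    th = theta(...) is GLS with known variance components (RE-style).
    """
    xs, ys = [], []
    for x, y in schools:
        xbar = sum(x) / len(x)
        ybar = sum(y) / len(y)
        xs.extend(a - th * xbar for a in x)
        ys.extend(b - th * ybar for b in y)
    return ols_slope(xs, ys)


def simulate(n_schools, correlated, seed):
    """Draw schools; if `correlated`, P(x = 1) depends on the unobserved u_s."""
    rng = random.Random(seed)
    schools = []
    for _ in range(n_schools):
        u = rng.gauss(0.0, SIGMA_U)
        if correlated:
            # RE assumption fails: high-u schools label more pupils
            p = 0.4 if u > 0 else 0.1
        else:
            # RE assumption holds: labelling is unrelated to u_s
            p = 0.25
        x = [1.0 if rng.random() < p else 0.0 for _ in range(S)]
        y = [1.0 + BETA1 * xi + u + rng.gauss(0.0, SIGMA_E) for xi in x]
        schools.append((x, y))
    return schools


def estimates(schools):
    """Return the three slope estimates for one simulated data set."""
    th = theta(S, SIGMA_U, SIGMA_E)
    return {
        "OLS": quasi_demeaned_slope(schools, 0.0),
        "RE": quasi_demeaned_slope(schools, th),
        "FE": quasi_demeaned_slope(schools, 1.0),
    }


if __name__ == "__main__":
    for correlated in (False, True):
        est = estimates(simulate(4000, correlated, seed=1))
        label = "RE assumption fails" if correlated else "RE assumption holds"
        print(label, {k: round(v, 3) for k, v in est.items()})
```

Under these assumptions, all three estimates should sit close to the hypothetical true value when labelling is unrelated to the school effect, which mirrors the SEN results reported above. When labelling depends on the unobserved school effect, only the within estimator stays close, and the RE-style estimate is pulled part of the way towards pooled OLS. A feasible RE estimator would estimate the variance components from the data rather than plugging in known values.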