PowerPoint Presentation - London Chess Conference

Linking administrative data to RCTs
John Jerrim
(UCL Institute of Education)
What do we mean by administrative data?
• Central government records
• Typically available for every person in the population
• Not typically collected for research purposes…
…rather for ‘record keeping’ / registration purposes
Examples include:
- Health
- Education
- Finance (tax)
- Criminal records
Example: The National Pupil Database (NPD)
• One of the most widely used administrative datasets in England!
• Data for all state school children in England…..
• …..excludes (or missing a lot of information for) private school kids
• Test scores at age 5, 7, 11, 14, 16 and 18
• Demographic information (e.g. FSM, ethnicity, EAL)
• Available from around the mid 1990s
• Now routinely linked to education RCTs in England (via EEF)
England is lucky to have this data! Most countries don’t have it!
Example. Cluster (school) level data…
• ‘Admin’ data doesn’t have to be at individual level….
• Can have information on an administrative unit that a person attends…
E.g. A school, hospital, police station, prison.
• Often more easily accessible than pupil level data
• Particularly useful in cluster RCTs (when you are randomising the
cluster itself).
Education example
School inspection (OFSTED) ratings…..
School level demographics (e.g. % children eligible for FSM)
School level prior achievement
What are the benefits of admin data?
• Low cost…..
• Not intrusive to collect from participants……
• Regularly updated with new information…..
• Often collected in a consistent way across individuals…..
• Low levels of missing data….
• Low levels of measurement error…..
Together, this makes administrative data very attractive to include in our
analysis of RCTs!
What are some of the main challenges we face in
RCTs?.....
(….and how can admin data be used to try and resolve them?)
1. Boost statistical power
A lack of statistical power…
• In education: mostly cluster RCTs
• Rather than randomise individuals… randomise whole schools
• Issue = intra-cluster correlation (ICC, ρ) → low power…
EXAMPLE
Secondary schools (clusters) = 100
200 children per school
ρ = 0.20
20,000 pupils in trial
Minimum detectable effect = 0.25 standard deviations
95% CI = 0 to 0.50 standard deviations
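These numbers can be reproduced with the standard minimum detectable effect size (MDES) formula for a two-arm cluster RCT. A minimal sketch, assuming 80% power, a two-sided 5% test and equal allocation (assumptions not stated on the slide):

```python
from statistics import NormalDist

def mdes_cluster(J, n, rho, power=0.80, alpha=0.05, P=0.5):
    """MDES (in SD units) for a two-arm cluster RCT with J clusters
    of n individuals each, ICC rho, and treatment share P."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    var = rho / (P * (1 - P) * J) + (1 - rho) / (P * (1 - P) * J * n)
    return z * var ** 0.5

mdes = mdes_cluster(J=100, n=200, rho=0.20)
print(round(mdes, 2))  # 0.25 -> an estimate of 0.25 has a 95% CI of roughly 0 to 0.50
```

Note how little the 200 pupils per school help: with ρ = 0.20 the cluster term dominates, which is exactly the "ICC problem" the slide describes.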
Example admin data to ↑ power….
• One way to ↑ power is to control for stuff that is linked to the outcome…
• …use NPD for this purpose
EXAMPLE
Maths Mastery
Year 7 kids
New way of teaching them maths
Test end of year 7
CONTROL for KS2 MATH scores from NPD
Detectable effect = 0.36 without control (CI = 0 to 0.72)
= 0.22 with NPD controls (CI = 0 to 0.44)
MASSIVE BOOST TO POWER
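The gain from controlling for KS2 maths can be sketched by adding an explained-variance term to the MDES formula. The slide does not state the R² of the NPD control, so the value R² = 0.62 below is back-solved purely for illustration (it reproduces the 0.36 → 0.22 drop for 50 schools of 200 pupils with ρ = 0.20):

```python
from statistics import NormalDist

def mdes_cluster(J, n, rho, r2=0.0, power=0.80, alpha=0.05, P=0.5):
    """MDES for a cluster RCT; r2 = share of outcome variance explained
    by baseline covariates (assumed equal at both levels)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    var = (1 - r2) * (rho / (P * (1 - P) * J)
                      + (1 - rho) / (P * (1 - P) * J * n))
    return z * var ** 0.5

print(round(mdes_cluster(J=50, n=200, rho=0.20), 2))           # 0.36 without controls
print(round(mdes_cluster(J=50, n=200, rho=0.20, r2=0.62), 2))  # 0.22 with NPD controls
```

The MDES scales with sqrt(1 - R²), so a strong prior-attainment covariate buys a lot of power for free.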
2. Reduce evaluation cost.
RCTs are costly (including testing)…
• Imagine it costs £5 to test each child in this trial…
• …you have spent £100,000 just on a post-test!
• Got to deliver the intervention in 50 schools (expensive…)
• Many EEF secondary school RCTs cost > £500,000…
• …average detectable effect across trials = 0.25
• Big ££ for quite wide confidence intervals…
Example: administrative data to reduce cost…..
• In the previous example, we could have conducted a pre-test rather than use the NPD.
• Maths Mastery in 50 schools of 200 children = 10,000 kids
• £5 per test. Hence a pre-test would have cost a minimum of £50,000
ADMINISTRATIVE DATA SAVED THIS MONEY…
• NPD data is there, ready to use.
- LET’S USE IT!
- Doing a separate pre-test here would have had almost no benefit
3. Minimise attrition
Attrition…
• Schools (and pupils within schools) drop out of the trial…
• …particularly when assigned to the control group!
Problems
- Breaks randomisation. Loses key advantage of the RCT
- Lose power
Example (my trial)
- 50 schools. 25 Treatment and 25 control
- Treatment follow-up = 23 / 25 schools
- Control follow-up = 9 / 25 schools
Worst of all worlds:
- Bias (selection effects)
- Low power
- High cost
Example: NPD to reduce attrition
• Schools would have had to take time out of maths lessons to conduct this pre-test…
• …there would be a significant administrative burden on them to conduct the test
• This burden is a major reason for control schools dropping out
Administrative data has…
(i) massively reduced the burden on schools
(ii) improved the validity of the trial
4. Allow long-run follow-up
Administrative data for long-run follow-up
• Testing / follow-up is often done immediately at the end of the trial…
…often when the intervention is most effective
• BUT we are really interested in long-run, lasting effects
• I.e. Is there much point ↑ age 11 test scores if kids don’t do any better at age 16?
• Ideally we want short-, medium- and long-term follow-up…
• …but this again ↑ £££
• However, administrative data may include long-run follow-up information about individuals…
5. Insight into external validity
External validity
• Most RCTs recruit participants via convenience sampling…
…not from a well-defined population
• How “weird” is our sample of trial participants?
Have mainly rich pupils?
Have only high-performing schools?
• How far can we generalise results?
• BIG ISSUE:
- Will we still get an effect when we scale up / roll out?
BUT, FRANKLY, OFTEN IGNORED IN RCTs
NPD for external validity / generalisability
• Most RCTs are based upon non-random samples of willing participants.
• Big issue. But often glossed over!
• Without random samples, how do we know if study results generalise to a wider (target) population?
• Admin data give us some handle on this…
• As we have data for (almost) every child/person in the country…
• …we can examine how similar trial participants are to the target population in terms of observable characteristics
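Because admin data covers (almost) everyone, the check is simply a comparison of category shares between the trial sample and the target population. A toy sketch; the FSM proportions below are hypothetical, not taken from any real dataset:

```python
from collections import Counter

def category_shares(values):
    """Percentage share of each category in a list of labels."""
    n = len(values)
    return {k: round(100 * v / n) for k, v in Counter(values).items()}

# Hypothetical FSM flags pulled from linked admin records
trial_fsm      = ["yes"] * 35 + ["no"] * 65   # trial participants
population_fsm = ["yes"] * 18 + ["no"] * 82   # everyone in the country

print(category_shares(trial_fsm))       # {'yes': 35, 'no': 65}
print(category_shares(population_fsm))  # {'yes': 18, 'no': 82}
```

If the trial shares sit far from the population shares on many observables, generalisation claims need extra caution (and reweighting may help).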
6. Additional characteristics in dataset
Additional characteristics
• Administrative records may include information we did not collect as part of our RCT…
…because it was too difficult to
…because it was too costly
…because we forgot!?
• These are additional variables we can use in our analysis of our trial.
E.g. Additional variables we can perform ‘balance checks’ with….
E.g. Additional variables to examine heterogeneous effects…..
Example: Maths Mastery heterogeneous effects….
Linked in cluster (school) level administrative data on OFSTED (inspection) ratings…
Found big heterogeneity by OFSTED rating!
ONLY POSSIBLE AFTER WE LINKED TO ADMIN DATA!!!
7. Potential for clever designs….
See this paper: Improving recruitment of older people to clinical trials: use of the cohort multiple randomised
controlled trial design. Age Ageing 2015 doi:10.1093/ageing/afv044
Step 1: Admin data on population
Step 2: Randomly ask some people if they want to receive treatment; the remainder form the control group (individuals not approached)
Step 3: Follow up both groups in admin data
Points to note
1. You never make any contact with control group!
2. If everyone you ask says yes – then you have a perfect RCT! (Both internal & external validity)
3. Statistical power very high….
4. ‘Business as usual control’ (by necessity)…
5. Non-compliance = People saying no when you approach them = the issue (ITT vs CA-ITT analysis)
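The non-compliance point can be sketched numerically. With one-sided non-compliance (controls are never approached, so never treated), the effect among those who would accept the offer is the intention-to-treat difference scaled by the acceptance rate (a standard instrumental-variables result). The numbers below are made up for illustration:

```python
def complier_adjusted_effect(mean_offered, mean_control, acceptance_rate):
    """Under one-sided non-compliance, scale the ITT difference by the
    share of approached people who accepted the offer."""
    itt = mean_offered - mean_control
    return itt / acceptance_rate

# Hypothetical: outcome mean 0.55 in the offered arm, 0.50 among the
# unapproached controls, and 50% of approached people said yes.
effect = complier_adjusted_effect(0.55, 0.50, 0.5)
print(round(effect, 2))  # 0.10
```

So a low acceptance rate does not bias the ITT estimate, but it dilutes it, which is why recruitment ("people saying no") is the key weakness of this design.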
Issues with linking to administrative data….
Sensitive data = high levels of data security…
• Most administrative data is potentially identifiable… you know who the person is!!
• Some data probably won’t be given to you (e.g. names)…
• You may not be the one doing the linking…
…it may be left up to others (who may not do this correctly!)
• When you have access to linked data, you need to store it securely.
E.g. UCL Safe Data Haven.
https://www.ucl.ac.uk/isd/itforslms/services/handling-sens-data/tech-soln
• Potential for big penalties if you don’t abide by the rules…
£500,000 fine…
Jail…
Ethics and consent….
• Participants usually need to give you consent to link their admin data to the RCT…
Opt-in consent = they need to tick the box saying that you can link
Opt-out consent = they only need to contact you if they don’t consent
• Sometimes the person giving consent is not the person themselves…
Example (education)
Opt-in consent from schools needed to access children’s NPD data…
Parents typically asked for opt-out consent…
• Ethical issue with long-term linking?
What happens if your school and parents give consent to link when you are 10…
…but then you decide you don’t want this at age 18?
…should we have to re-ask for consent once children become adults?
Practicalities. How do you link?
1. Unique ID
• Variable that uniquely identifies an individual in both datafiles to be merged
• E.g. UPN in NPD; national insurance number in tax records.
2. By name
• Individuals named in both datafiles….
• Not as straightforward as it may sound!
• Names spelt wrong/differently across files…..
• Maiden vs married names…..
• Individuals with same name (e.g. NPD and children called Mohammed in London)
3. By individual characteristics
• AKA: ‘fuzzy matching’
• Need enough characteristics so you can identify individuals…
• E.g. Gender, Date of Birth, FSM etc. The more, the better!
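The three linking routes above can be combined: block on exact characteristics first, then fall back to name similarity. A minimal sketch using only the standard library; all names, field names and the 0.85 threshold are hypothetical choices, not part of any real NPD linkage procedure:

```python
from difflib import SequenceMatcher

def link_record(target, candidates, threshold=0.85):
    """Two-stage link: exact match on a blocking key (date of birth +
    gender), then pick the best fuzzy name match above a threshold."""
    block = [c for c in candidates
             if c["dob"] == target["dob"] and c["gender"] == target["gender"]]
    best, best_score = None, 0.0
    for c in block:
        score = SequenceMatcher(None, target["name"].lower(),
                                c["name"].lower()).ratio()
        if score > best_score:
            best, best_score = c, score
    return best if best_score >= threshold else None

npd = [{"name": "John Smith", "dob": "2003-10-01", "gender": "M", "upn": "A1"},
       {"name": "Jon Smyth",  "dob": "2001-02-14", "gender": "M", "upn": "B2"}]
trial_pupil = {"name": "Jon Smith", "dob": "2003-10-01", "gender": "M"}

match = link_record(trial_pupil, npd)
print(match["upn"])  # prints A1 despite the misspelt first name
```

The blocking step is what rescues common names: two "Mohammed"s with different dates of birth never reach the fuzzy comparison.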
Case study. Chess in Schools and Communities.
www.bbc.co.uk/news/education-13343943
The intervention
→ Children to receive 30 hours of chess lessons during one academic year (year 5)
→ Follows a fully developed curriculum by the Chess in Schools and Communities (CSC) team
→ Chess lessons likely to be accompanied by an after school chess club
RQ. Does teaching primary school children how to play chess lead to an improvement in
their educational attainment?
Why is this of interest?
• In 30 countries (e.g. Russia) chess is part of the national curriculum
• ‘Well-known’ that influences maths test scores (at least within the chess world!)
“we have scientific support for what we have known all along--chess makes kids
smarter!”
(Chess Life, November, p. 16 / Johan Christiaen)
• Reasonably strong previous evidence
A cluster RCT in Italy produced effect size 0.35
Though caution – external validity!
Big previous effect sizes….but poor research designs
[Figure: effect sizes (x-axis from -0.2 to 1) from previous studies: Sala and Trinchero; Eberhard; Gilga and Flesner; Aclego et al (2012); Fried and Ginsburg; Yap (2006); Krame and Flipp; Margulles (1992); Sala et al; DuCette (2009); Sala et al (2015); Kazemi et al (2012); Sgirtmac (2012); Aydin (2015). Average effect size = 0.34]
Why is this of interest?
• Intervention is VERY cheap to implement
- If +ive impact, then also likely cost effective!
• Fairly serious money invested in the project: £700K ($1m) for this RCT alone
• Putting men into primary schools
More information see:
http://www.psmcd.net/otherfiles/BenefitsOfChessInEdScreen2.pdf
An interesting feature of this particular RCT is that it
used administrative data only!!
Step 1. Defined the population using administrative data…..
→ 11 LEAs (geographic areas) in England purposefully selected
→ Year 5 (age 9/10) children in 2013/14 academic year (born Sep 2003 – Aug 2004)
→ Disadvantaged schools: >37% of KS2 pupils eligible for FSM in the last six years
→ Total of 450 schools on the population list (sampling frame)
Step 2. Pre-specified use of administrative data in study protocol…
→ Primary outcome = Key Stage 2 maths test score
- National examination in England
- Children will sit 1 year after end of intervention
- Due to sit tests in June 2015 (children age 11)
- ‘Intention to treat’ (ITT) analysis
- Information from NPD (administrative data)
- Should get 100% follow-up (very rare for RCT!)
→ Secondary outcome
- Maths sub-domains (e.g. mental arithmetic)
- English & science test scores
Step 3: Power calculation
Assumptions
Between school ICC = 0.15
60 children per school on average
Correlation pre / post test (Key Stage 1 and Key Stage 2 test scores) = 0.65
80% power for 95% CI
NOTE: We can base these assumptions on analysis of admin data from previous years! Strong basis!
With 100 schools, we can detect an effect size of 0.20.
Hence recruit 100 schools…
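These assumptions can be run through the same MDES formula used earlier. The slide does not say exactly how the pre/post correlation of 0.65 enters the calculation (or the allocation ratio, or any gain from stratification), so this sketch brackets the protocol's 0.20 rather than reproducing it exactly:

```python
from statistics import NormalDist

def mdes_cluster(J, n, rho, r2=0.0, power=0.80, alpha=0.05, P=0.5):
    """MDES for a cluster RCT; r2 = variance explained by the pre-test."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    var = (1 - r2) * (rho / (P * (1 - P) * J)
                      + (1 - rho) / (P * (1 - P) * J * n))
    return z * var ** 0.5

# Slide assumptions: ICC = 0.15, 60 pupils per school, pre/post r = 0.65
no_pretest   = mdes_cluster(J=100, n=60, rho=0.15)              # ~0.23
with_pretest = mdes_cluster(J=100, n=60, rho=0.15, r2=0.65**2)  # ~0.17
```

The stated 0.20 sits between the no-covariate and full-covariate cases, consistent with the KS1 pre-test doing part (but not all) of the variance-reduction work.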
Step 4: Selection of the ‘sample’ (and external validity)
→ Chess in Schools given list of all 450 schools
→ Asked to recruit 100 from this list
→ Sampling fraction of around 22%
→ How does our sample of children from the 100 recruited schools…..
……compare to the ‘population’ of children from the 450 schools?
→ USE ADMINISTRATIVE DATA TO FIND OUT!!!!
Example: Using the NPD to investigate external validity..
Variable           | Trial participants | All eligible pupils | England
Key Stage 1 maths  |                    |                     |
  Level 1          | 12%                | 12%                 | 8%
  Level 2A         | 24%                | 24%                 | 27%
  Level 2B         | 31%                | 30%                 | 27%
  Level 2C         | 19%                | 20%                 | 15%
  Level 3          | 12%                | 11%                 | 20%
  Missing          | 2%                 | 3%                  | 2%
Eligible for FSM   |                    |                     |
  No               | 66%                | 65%                 | 82%
  Yes              | 35%                | 35%                 | 18%
Gender             |                    |                     |
  Female           | 50%                | 50%                 | 49%
  Male             | 50%                | 51%                 | 51%
Language Group     |                    |                     |
  English          | 65%                | 63%                 | 82%
  Other            | 34%                | 37%                 | 18%
School n           | 100                | 442                 | 0
Pupil n            | 3,775              | 16,397              | 570,344

Chess in Schools: able to show participants very similar to the population of interest (in terms of observables…)
…but very different to the population of England as a whole!
Step 5: Random assignment
→ Stratify schools into 9 groups
- 3*3 matrix of %FSM and KS2 test scores at school level
→ Randomly assign schools to treatment / control within each of these strata
→ 50 treatment schools (children taught chess)
→ 50 control schools (business as usual)
→ All children within these schools take part in the trial.
→ Q. WAS BALANCE ACHIEVED?
A. USE ADMINISTRATIVE DATA TO FIND OUT!!!!
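One common way to answer this with linked data is to compute a standardised difference for each covariate: the gap in arm means divided by the pooled SD, with values near zero indicating balance. A small sketch; the KS1 scores below are made up:

```python
from statistics import mean, stdev

def standardised_difference(treatment, control):
    """Difference in group means divided by the pooled SD; values close
    to zero suggest the randomisation balanced this covariate."""
    pooled_sd = ((stdev(treatment) ** 2 + stdev(control) ** 2) / 2) ** 0.5
    return (mean(treatment) - mean(control)) / pooled_sd

# Hypothetical school-level KS1 average point scores from the NPD
treat_ks1   = [15.2, 16.1, 14.8, 15.9, 15.5, 16.0]
control_ks1 = [15.4, 15.8, 15.1, 16.2, 15.3, 15.7]

d = standardised_difference(treat_ks1, control_ks1)
print(abs(d) < 0.1)  # True -> arms look balanced on this covariate
```

A rule of thumb sometimes used is |d| < 0.1 for acceptable balance; the point here is that the whole check costs nothing once the NPD link exists.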
Balance on prior achievement using admin data…
Balance upon KS1 average points scores…
These are tests children took at age 7… Two years before the intervention took place! (But that’s ok!)
[Figure: overlaid kernel density plots (kdensity Stand_ks1_aps) of standardised KS1 average point scores for the treatment and control groups, x-axis from -2 to 2]
Balance on other characteristics…
[Figure: bar chart comparing control and treatment schools (percent, 0–80) on ethnicity (% Asian, % Black, % White), % male, pre-test (% Level 1, 2 and 3 maths) and SES (% FSM)]
By using NPD, we almost eliminated attrition….
• Clever design with NPD data means we can (almost) eliminate drop-out
EXAMPLE: Chess in Schools
- Year 5 children learn how to play chess during one school year
- 50 treatment schools receive chess
- 50 control schools = ‘business as usual’
- Use age 7 (Key Stage 1) as the pre-test scores
- Use age 11 (Key Stage 2) as the post-test scores
• Almost no burden on schools (no testing to be done)
• Key Stage 2 results available for all children
• Have test scores even if they move schools…
…should have very little attrition
Allocation
- Randomised: school n = 100, pupil n = 4,009
- Intervention: school n = 50, pupil n = 2,055
- Control: school n = 50, pupil n = 1,954
Analysis (almost zero attrition!)
- Analysed (intervention): school n = 50, pupil n = 1,965
- Analysed (control): school n = 50, pupil n = 1,900
Did it work? Outcomes 1 year post-intervention
Outcome           | Effect size | P-value
Mathematics       | +0.01       | 0.93
Reading           | -0.06       | 0.44
Science           | -0.01       | 0.82
Mental arithmetic | +0.00       | 0.94

Answer: NO!

Sub-groups (mathematics)
Boys              | -0.02       | 0.77
Girls             | +0.03       | 0.73
FSM children      | +0.01       | 0.95
Note: able to look at heterogeneous effects by FSM due to the admin data link…
Planned long-run follow-up (using admin data)
• Trial conducted in Year 5 (age 9/10). First follow-up at end of Year 6 (age 10/11).
• Treatment and control children then move on to secondary school.
• Will be able to track these children via their unique pupil number. Hence long-run follow-up:
Do treatment children do better in maths GCSE? (Age 16)
Are they more likely to study maths post-16?
Are they more likely to enter a high-status university?
• Administrative data means we can answer these questions at little extra cost.
• Can answer the question: is there a lasting impact of the treatment?
Limitations
Exclusive use of administrative data meant
1. Could only look at educational attainment measures…….
…..and not look at impact upon ‘non-cognitive’ skills.
2. Outcome measured one year after intervention…….
…..might there have been an immediate effect?
3. Statistical power would have probably been higher with a specific pre-test….
…..but also would have been costly!
4. Balance checks and heterogeneous effects limited to characteristics observable
in administrative data only.