Trends in Inequality of Educational Opportunity in the Netherlands

Trends in Inequality of Educational
Opportunity in the Netherlands 1900-1980:
The Effect of Missing Data
Maarten L. Buis & Harry B.G. Ganzeboom
Department of Social Research Methodology
Vrije Universiteit Amsterdam
RC28, Oslo, May 6-8 2005
Conclusions (1)
• A steady trend towards less IEO in the
Netherlands remains visible throughout the
20th century.
• However, on closer scrutiny there appears to
be evidence of a slower trend or even
stability for the earlier and most recent
cohorts.
• Spline analysis of trends confirms this.
Buis & Ganzeboom, Oslo 2005
Conclusions (2)
• Missing data in father’s occupation vary by
education of respondent: MV are about 3 times
more prevalent among the lowest educated than
among the highest educated.
• One would hypothesize that this mitigates
measures of IEO and the historical trend therein.
• Multiply imputed data for FISEI:
– Level of IEO increases
– (Linear) trend in IEO becomes steeper
Previous research
• IEO = Inequality of Educational Opportunity =
association between father’s occupation and
respondent’s education.
• Previous research: long-term linear trend towards
less IEO:
– Cohorts 1900-1960: Ganzeboom & De Graaf, 1989, De
Graaf & Ganzeboom, 1990a, 1990b.
– Cohorts 1900-1980: Ganzeboom & Luijkx, 2004.
• This holds for both linear regression models and
sequential logits (first two transitions).
ISMF
• Now 51 studies on the Netherlands, collected between
1958 and 2004, N > 104,000 men and women aged 25+.
• Recent additions (since 2002 and Breen 2004): 16
studies, approx. 30% of the N.
• Father: FISEI – International Socio-economic Index of
Occupational Status.
• Education: level of education scaled relative to
benchmarks: primary = 6, highest secondary = 12,
university complete = 17.
Estimated IEO for each birth year separately
(based on complete case analysis)
[Figure: point estimates and 95% CI of IEO (vertical axis, 2 to 12) by birth year (horizontal axis, 1900 to 1980).]
Research questions
• How do trend and level estimates of IEO
depend upon data qualities:
– Measures used
– Quality and nature of the sample
– Non-response
– Missing values
Missing values
• MCAR = Missing Completely at Random
• MAR = Missing at Random: missingness is
random given the values of control (X) variables.
• NMAR = Not Missing at Random: missingness
depends upon values of Y-variable.
• Rubin 1987, Little & Rubin 2002, Allison 2002.
• Multiple hotdeck imputation in STATA.
Complete case analysis
(listwise deletion)
• OK, if MCAR.
• Biased if MAR.
• Inefficient (too large standard errors); this
can be quite dramatic.
• Linear trend:
– EDU = 8.4 + 6.4*FIS - 5.3*FIS*COH etc. (Men)
  (.17) (.08) (.45)
– EDU = 6.5 + 5.6*FIS - 3.3*FIS*COH etc. (Women)
  (.17) (.08) (.44)
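The first two bullets above (OK if MCAR, biased if MAR) can be illustrated with a small simulation. This is a minimal Python sketch, not the authors' Stata analysis. All numbers are illustrative assumptions: a standardized FIS, a true slope of 1, and FIS missing for 75% of cases below the education midpoint against 25% above it, echoing the roughly 3:1 ratio reported for the lowest versus the highest educated.

```python
import random


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=50000, true_slope=1.0, seed=1):
    """Return complete-case slopes under MCAR and under education-dependent missingness."""
    rng = random.Random(seed)
    fis = [rng.gauss(0, 1) for _ in range(n)]              # hypothetical standardized FIS
    edu = [true_slope * x + rng.gauss(0, 1) for x in fis]  # hypothetical EDU

    # MCAR: every case keeps its FIS with the same probability (50%).
    mcar = [(x, y) for x, y in zip(fis, edu) if rng.random() < 0.5]

    # Missingness depends on EDU: low-educated cases keep FIS with
    # probability 0.25, high-educated cases with probability 0.75.
    mar = [(x, y) for x, y in zip(fis, edu)
           if rng.random() < (0.25 if y < 0 else 0.75)]

    return ols_slope(*zip(*mcar)), ols_slope(*zip(*mar))


slope_mcar, slope_mar = simulate()
print(f"complete cases, MCAR: {slope_mcar:.3f}")
print(f"complete cases, missingness depends on EDU: {slope_mar:.3f}")
```

Under these assumptions the MCAR slope stays close to the true value of 1, while the slope from cases selected on education comes out lower, which is the direction of the attenuation hypothesized in Conclusions (2).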
Hot deck imputation
• Classify all cases by combinations of
predictor variables (COH, FED, MED,
ISEI).
• Stratify the cases by these combinations.
• Substitute the missing FISEI by valid FISEI
of random (nearest) neighbor.
• Key idea: do not only borrow the systematic
(predicted) part, but also the error term.
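The steps above can be sketched in Python as follows. This is a simplified illustration, not the Stata routine used for the analysis: the function name `hot_deck` and the dictionary layout of a case are assumptions, donors are drawn at random from the exact same cell, and the nearest-neighbor fallback for cells without donors is left out.

```python
import random
from collections import defaultdict


def hot_deck(cases, target, predictors, rng):
    """Fill missing `target` values with the value of a random donor
    that shares the same combination of predictor values."""
    # Stratify observed target values by predictor combination.
    donors = defaultdict(list)
    for case in cases:
        if case[target] is not None:
            cell = tuple(case[p] for p in predictors)
            donors[cell].append(case[target])

    imputed = []
    for case in cases:
        filled = dict(case)  # copy, so the input data stay untouched
        if filled[target] is None:
            pool = donors.get(tuple(case[p] for p in predictors))
            if pool:  # stays missing when the cell has no donor
                # The donor's actual value is borrowed, so its error term
                # comes along with the systematic part.
                filled[target] = rng.choice(pool)
        imputed.append(filled)
    return imputed


# Hypothetical toy data with already categorized predictors.
cases = [
    {"COH": 1, "FED": 2, "MED": 2, "ISEI": 3, "FISEI": 40},
    {"COH": 1, "FED": 2, "MED": 2, "ISEI": 3, "FISEI": 55},
    {"COH": 1, "FED": 2, "MED": 2, "ISEI": 3, "FISEI": None},
]
completed = hot_deck(cases, "FISEI", ["COH", "FED", "MED", "ISEI"], random.Random(0))
print(completed[2]["FISEI"])  # one of the two donor values
```

Because the imputed value is an observed FISEI rather than a cell mean, the imputed data keep the within-cell spread of the donors.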
Multiple hot deck imputation
• Do hot deck imputation several times (10-20).
• Bootstrap from each stratum a sample (with
replacement) of stratum size.
• Random selection of neighbor varies by
imputation cycle.
• Key idea (Rubin 1987, pp. 122-124): get
the variance-covariance estimation right.
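A minimal Python sketch of two ingredients named above: the bootstrap of a stratum's donor pool, and the standard combining rules for multiply imputed estimates (pooled estimate = mean over imputations; total variance = within-imputation variance + (1 + 1/m) times between-imputation variance). The function names and the example numbers are hypothetical and are not taken from the analysis.

```python
import random
from statistics import mean, variance


def bootstrap_pool(pool, rng):
    """Resample a stratum's donor values with replacement, keeping stratum size."""
    return rng.choices(pool, k=len(pool))


def rubin_combine(estimates, variances):
    """Pool m point estimates and their squared standard errors.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)       # average sampling variance
    between = variance(estimates)  # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical slopes and squared standard errors from m = 3 imputed data sets.
estimate, se = rubin_combine([6.8, 6.9, 7.0], [0.01, 0.01, 0.01])
print(estimate, se)
```

The between-imputation term is what carries the uncertainty about the imputed values into the pooled standard error; an analysis of a single imputed data set would omit it.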
Key results
• FISEI predicted by COH (4), FED (7),
MED (7), ISEI (8).
• 10 imputations
• Linear trend result:
– EDU = 8.6 + 6.9*FIS - 6.1*FIS*COH etc. (Men)
  (.33) (.12) (.64)
– EDU = 6.7 + 5.9*FIS - 3.7*FIS*COH etc. (Women)
  (.46) (.12) (.64)
Non-linearities
• Linear splines
• Estimates with 1, 2, 3, 4 etc. knots (and a
uniform distribution). We were happy with
the result with 3 knots.
• Test of equality of slopes:
– Between trajectories
– Between men and women
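A linear spline lets the trend change slope at each knot while remaining continuous. The Python sketch below shows one common way to build the regressors; the knot positions 1920, 1940 and 1960 are an assumption (three knots spread evenly over 1900-1980), and the coefficient values in the usage example are made up.

```python
def linear_spline_basis(x, knots):
    """Regressors for a continuous piecewise-linear function of x.

    Returns [x, max(0, x - k1), max(0, x - k2), ...]. The coefficient on
    each max(0, x - k) term is the CHANGE in slope at knot k.
    """
    return [x] + [max(0.0, x - k) for k in knots]


def segment_slopes(coefs):
    """Turn [initial slope, change at k1, change at k2, ...] into the slope
    of each segment by accumulating the changes."""
    slopes, running = [], 0.0
    for c in coefs:
        running += c
        slopes.append(running)
    return slopes


knots = [1920, 1940, 1960]                # assumed knot positions
print(linear_spline_basis(1950, knots))   # regressors for birth year 1950
# Made-up coefficients: flat, then declining from 1920, flat again after 1960.
print(segment_slopes([0.0, -0.1, 0.0, 0.1]))
```

In this parameterization, testing whether two adjacent trajectories have equal slopes amounts to testing whether the coefficient on the corresponding knot term is zero.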
Results
• Complete case analysis finds:
– Decline in IEO occurs between cohorts 1920
and 1960. Before 1920 and after 1960, the trend
can be assumed to be flat.
– There is a constant difference in IEO between
men and women: women’s educational
attainment is approx. 10% less dependent on FIS
[Figure: two panels plotting IEO (difference in years of education between highest and lowest status) against birth year, 1900 to 1980. Labels in the figure: Male, Female, Complete Cases, Hotdeck.]
Multiple hot deck imputed data
• Finds pattern very similar to complete case
analysis.
• But decline of IEO between 1920 and 1960
is steeper!
• However, standard errors of effects have
increased (despite inclusion of more
information).