More Schooling Is Not Always Better:
Evidence from an Instrumental Variables
Approach to Educational Reform in Vietnam
Phu Viet Le
Fulbright Economics Teaching Program
August 19, 2015
Table of contents
1. History of education reforms in Vietnam and estimates of
returns to education
2. Estimating the return to education
3. Using instrumental variables to estimate the causal effect of
education on earnings in Vietnam
4. Results and robustness checks
5. Conclusions and policy implications
History of education reforms in Vietnam and estimates of
returns to education
In the 20th century alone, Vietnam undertook three education
reforms, in 1950, in 1956, and in the 1980s, each following a major
political upheaval:
- The 1950 education reform reduced schooling from 11 years under
  the French colonial system to 9 years, in three levels of 4, 3,
  and 2 years.
- The 1956 education reform shifted from the 9-year and 12-year
  systems in the North and the temporarily liberated region to a
  10-year system like that of the Soviet Union, with levels of
  4, 3, and 3 years.
- The third education reform, in the 1980s, changed the 10-year
  system to a 12-year system like that in North America, which
  was already in place in the South.
- Assessing quality is difficult because criteria and/or data are
  lacking.
The third education reform in the 1980s
- Context: after unification in 1975, two education systems
  coexisted north and south of the 17th parallel. The North
  followed a communist-oriented 10-year education system, while
  the South kept its North American 12-year system.
- The third education reform unified these two systems under a
  newly formed general education system for the whole country,
  which takes 12 years to complete.
- In the new education system, primary, lower secondary, and
  upper secondary school take 5, 4, and 3 years, respectively.
Timeline of the third education reform
Breakdown of grades (left) and transition time (right).
Source: VHLSS
Characteristics of the third education reform
- The third education reform comprised a series of small
  educational renovations rather than the quick adoption of a new
  system, unlike Doi Moi, the most dramatic economic reform, in
  1986. It took many years, from 1979 to 1996. Reforms of the
  curriculum and textbooks were even more cautious.
- Primary education remained free, as the aim was universal
  coverage. However, tuition was charged for secondary schools.
  Growing private contributions and semi-public schools were
  allowed. Irregular expenses and supplementary teaching were
  widespread.
- Lower secondary schools required 33 weeks, fewer than in other
  comparable countries.
- The teaching method was teacher-centered and not very
  interactive, which encouraged rote learning.
Data sources and samples
- Four rounds of the Vietnam Household Living Standards Survey
  (VHLSS): 2004, 2006, 2008, and 2010.
- The focus is on people whose highest education level does not
  exceed a high-school diploma. This group is expected to be
  affected more by the additional schooling year than college or
  university graduates, and may also have the highest return to
  schooling.
- The sample is restricted to non-farm wage earners who were at
  least 20 years old and not currently enrolled in any school.
- There are other important reasons for excluding those with
  higher educational levels.
Descriptive summaries of the data
North of the 17th parallel
Means, with standard deviations (SD) in parentheses

Variable               2004         2006          2008          2010
Age                   35.05        35.07         35.71         35.95
                     (10.98)      (11.43)       (11.13)       (11.24)
Annual wage (x1000) 7,840.39    10,230.59     14,512.24     22,366.23
                   (6,173.53)   (7,742.54)   (11,392.26)   (14,105.42)
EDUC                   9.82         9.93          9.96          9.76
                      (2.07)       (2.02)        (2.05)        (2.42)
EXP                    7.67         7.68          7.50           n/a
                      (7.56)       (7.79)        (7.18)
Observations          2,074        2,191         2,261         2,307

Notes:
- EXP is the actual number of years of experience at the time of
  the survey, unlike the standard Mincer approach, which assumes
  EXP = Age − EDUC − b, where b is the age at which compulsory
  education begins. Why? Job switching is prevalent, and
  experience may be unrelated to current work.
- The official exchange rate rose from 15,746 VND per USD in 2004
  to 18,613 VND per USD in 2010.
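The two notes under the summary table can be made concrete with a small calculation. The sketch below assumes that "Annual wage (x1000)" is measured in thousand VND and that b = 6; neither is stated on the slide, and the function names are illustrative only.

```python
# Mean annual wages from the summary table (ASSUMED to be in thousand VND).
MEAN_WAGE_THOUSAND_VND = {2004: 7840.39, 2010: 22366.23}
# Official exchange rates quoted in the table note (VND per USD).
VND_PER_USD = {2004: 15746.0, 2010: 18613.0}


def wage_in_usd(year):
    """Convert a mean annual wage to USD at that year's official rate."""
    return MEAN_WAGE_THOUSAND_VND[year] * 1000.0 / VND_PER_USD[year]


def potential_experience(age, educ, b=6):
    """Standard Mincer proxy EXP = Age - EDUC - b.

    b = 6 is an ASSUMED school starting age, not a value from the slides.
    """
    return age - educ - b


usd_2004 = wage_in_usd(2004)
usd_2010 = wage_in_usd(2010)
# Proxy evaluated at the 2004 sample means of Age and EDUC.
proxy_exp_2004 = potential_experience(35.05, 9.82)

print(f"2004 mean wage: {usd_2004:.0f} USD; 2010 mean wage: {usd_2010:.0f} USD")
print(f"Mincer proxy at 2004 means: {proxy_exp_2004:.2f} years (reported EXP: 7.67)")
```

Under these assumptions the Mincer proxy at the 2004 sample means is 19.23 years, far above the reported mean of 7.67 years of actual experience, which illustrates why the two measures are not interchangeable here.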
Impact of year of birth on the average number of years in school∗
[Figure: four panels, one per survey round: 2004, 2006, 2008, 2010]
∗ Point estimates and 95% confidence intervals, from regressions
on year dummies.
Rate of return to an additional schooling year in various
countries∗
Estimating the return to education
Following Mincer (1974):
\[ \log Y_i = \log Y_0 + \beta_1 EDUC_i + \beta_2 EXP_i + \beta_3 EXP_i^2 + \varepsilon_i \]
- Dependent variable: $\log Y_i$ is the logarithm of income.
- Explanatory variables:
  - $EDUC_i$ is the total number of years in school. It can be
    recoded as the degree obtained: for example, completing 12
    years in the new education system corresponds to a
    high-school diploma.
  - $EXP_i$ is the number of years of experience.
  - Other explanatory variables representing demographic and
    sectoral factors may be included.
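As a rough illustration of how the Mincer equation is estimated, the sketch below fits it by ordinary least squares on simulated data. The data, the sample size, and every coefficient value are assumptions chosen for illustration; none of them comes from the VHLSS.

```python
import numpy as np


def fit_mincer(log_y, educ, exper):
    """OLS of log earnings on schooling, experience, and experience squared.

    Returns the coefficients in the order [log Y0, beta1, beta2, beta3].
    """
    X = np.column_stack([np.ones(len(educ)), educ, exper, exper ** 2])
    coef, *_ = np.linalg.lstsq(X, log_y, rcond=None)
    return coef


# Simulated sample; all parameter values are hypothetical.
rng = np.random.default_rng(0)
n = 5_000
educ = rng.integers(5, 13, size=n).astype(float)   # 5 to 12 years of school
exper = rng.uniform(0.0, 40.0, size=n)             # years of experience
true_coef = np.array([8.0, 0.10, 0.052, -0.001])
log_y = (true_coef[0] + true_coef[1] * educ + true_coef[2] * exper
         + true_coef[3] * exper ** 2 + rng.normal(0.0, 0.3, size=n))

coef = fit_mincer(log_y, educ, exper)
print("estimated return to a year of schooling:", round(float(coef[1]), 3))
```

Because the simulated error term is independent of schooling, OLS recovers the assumed return here; the slides on omitted variables and measurement error explain why that need not hold in real data.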
Estimating the return to education (2)
\[ \log Y_i = \log Y_0 + \beta_1 EDUC_i + \beta_2 EXP_i + \beta_3 EXP_i^2 + \varepsilon_i \]
Meaning of the coefficients:
- $\beta_1$: the average return to an extra year of schooling. For
  example, if $\beta_1 = 0.1$, each additional year of schooling
  raises income by about 10%.
- Experience has a nonlinear effect on income, represented by an
  inverted U-shaped function ($\beta_2 > 0$ and $\beta_3 < 0$).
  Experience may matter more for young workers and less for more
  senior ones. The marginal return to experience is
  $\beta_2 + 2\beta_3 EXP_i$. The optimal number of years of
  experience is $EXP^* = -\beta_2 / (2\beta_3)$, the peak of the
  experience-income parabola. This was about 26 years in a study
  of Eastern European countries, and in Vietnam as well.
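The marginal return and the peak of the experience profile follow directly from the two experience coefficients. A minimal sketch, using hypothetical coefficient values picked so that the peak lands near the 26 years quoted on the slide (they are not estimates from the study):

```python
def marginal_return_to_experience(beta2, beta3, exper):
    """Derivative of the log-earnings profile with respect to experience."""
    return beta2 + 2.0 * beta3 * exper


def peak_experience(beta2, beta3):
    """Experience level at which the inverted-U profile peaks: -beta2 / (2 * beta3)."""
    if beta3 >= 0:
        raise ValueError("an inverted-U profile requires beta3 < 0")
    return -beta2 / (2.0 * beta3)


# Hypothetical coefficients, chosen only for illustration.
beta2, beta3 = 0.052, -0.001
peak = peak_experience(beta2, beta3)
return_at_5 = marginal_return_to_experience(beta2, beta3, 5.0)
return_at_peak = marginal_return_to_experience(beta2, beta3, peak)

print(f"peak experience: {peak:.1f} years")
print(f"marginal return at 5 years: {return_at_5:.3f}; at the peak: {return_at_peak:.3f}")
```

The marginal return is positive before the peak, zero at it, and negative beyond it, which is the inverted-U shape described above.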
Problems in estimating the return to education
- Two major problems: omitted variables bias and measurement
  errors.
  - Omitted variables: personal ability affects both the number
    of years in school and earnings. If ability is not accounted
    for, ordinary least squares estimates may be biased because
    the explanatory variable (education) is correlated with the
    residuals.
  - Measurement errors: education time is difficult to measure
    because criteria for what counts as education differ.
    Reported estimates are about 10-15% less than the actual
    numbers (Angrist and Krueger; Card), so OLS estimates are
    again biased.
  - Other problems include the endogeneity of education and the
    functional form of the return to education.
Omitted variables bias problem
\[ \log Y_i = \log Y_0 + \beta_1 EDUC_i + \beta_2 EXP_i + \beta_3 EXP_i^2 + \gamma \, Ability_i + \varepsilon_i \]
- Ability represents unobserved individual characteristics such
  as intelligence. Then, following Griliches (1977):
\[
\begin{aligned}
E[b] &= [X'X]^{-1} X'Y = [X'X]^{-1} X' [X\beta + \gamma \, Ability + \varepsilon] \\
     &= \underbrace{[X'X]^{-1} X'X\beta}_{=\beta}
      + \underbrace{\gamma [X'X]^{-1} X' Ability}_{=\gamma \frac{Cov[Ability,\, EDUC]}{Var[EDUC]}}
      + \underbrace{[X'X]^{-1} X'\varepsilon}_{=0}
\end{aligned}
\]
\[ \Rightarrow E[b_1] = \beta_1 + \gamma \frac{Cov[Ability, EDUC]}{Var[EDUC]} \]
- Because individual ability and education are expected to be
  positively correlated (barring special cases such as Bill Gates
  or Steve Jobs!), the estimated return to education may be
  exaggerated, that is, it suffers from an upward bias.
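The bias formula can be checked numerically. The sketch below simulates earnings that depend on schooling and on an unobserved ability term, then regresses log earnings on schooling alone. Experience terms are left out to keep the example short, and all parameter values are assumptions.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    c = np.cov(x, y)            # 2x2 sample covariance matrix
    return c[0, 1] / c[0, 0]


# Hypothetical data-generating process.
rng = np.random.default_rng(1)
n = 200_000
beta1, gamma = 0.08, 0.05
ability = rng.normal(0.0, 1.0, size=n)
educ = 9.0 + 1.5 * ability + rng.normal(0.0, 2.0, size=n)   # ability raises schooling
log_y = 8.0 + beta1 * educ + gamma * ability + rng.normal(0.0, 0.3, size=n)

# Regression that omits ability.
b1_short = ols_slope(educ, log_y)
# Bias predicted by the formula: gamma * Cov[Ability, EDUC] / Var[EDUC].
predicted_bias = gamma * np.cov(ability, educ)[0, 1] / np.var(educ, ddof=1)

print(f"true beta1 = {beta1}, OLS without ability = {b1_short:.4f}")
print(f"actual bias = {b1_short - beta1:.4f}, predicted bias = {predicted_bias:.4f}")
```

With a positive covariance between ability and schooling, the short regression overstates the return by almost exactly the amount the formula predicts.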
Measurement errors problem
- Incorrect measures of education may also skew the least-squares
  estimates. In this case, we estimate the return to the reported
  level $EDUC$ instead of the actual level of education $EDUC^*$:
\[ EDUC = EDUC^* + errors \]
\[ \log Y_i = \log Y_0 + \beta_1 EDUC_i + \beta_2 EXP_i + \beta_3 EXP_i^2 + \underbrace{\varepsilon_i - \beta_1 \, errors_i}_{\text{composite residuals}} \]
- This violates the Gauss-Markov assumption of no correlation
  between the explanatory variable $EDUC$ and the residual:
\[ Cov[EDUC^* + errors,\ \varepsilon - \beta_1 \, errors] = -\beta_1 \sigma_e^2 \neq 0 \]
  ⇒ biased coefficient.
- In the presence of measurement errors, the estimator of the
  return to education is:
\[ E[b_1] = \beta_1 \frac{Var[EDUC^*]}{Var[EDUC^*] + Var[errors]} \]
  ⇒ Downward bias (attenuation bias) relative to the actual value.
  Its magnitude depends on the signal-to-noise ratio.
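Attenuation bias is easy to reproduce in a simulation. In the sketch below, earnings depend on actual schooling, but the regression uses a noisy report of it. The variances are assumptions chosen so that the signal share is 4 / (4 + 1) = 0.8.

```python
import numpy as np


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]


# Hypothetical data-generating process.
rng = np.random.default_rng(2)
n = 200_000
beta1 = 0.10
educ_actual = rng.normal(9.0, 2.0, size=n)       # variance 4
errors = rng.normal(0.0, 1.0, size=n)            # variance 1, independent noise
educ_reported = educ_actual + errors
log_y = 8.0 + beta1 * educ_actual + rng.normal(0.0, 0.3, size=n)

b1_reported = ols_slope(educ_reported, log_y)
attenuation = np.var(educ_actual) / (np.var(educ_actual) + np.var(errors))

print(f"true beta1 = {beta1}, OLS on reported schooling = {b1_reported:.4f}")
print(f"beta1 x attenuation factor = {beta1 * attenuation:.4f}")
```

The estimate shrinks toward zero by roughly the signal share, so a noisier schooling measure means a smaller estimated return.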
Functional form of the return to education: linear or
nonlinear?
\[ \log Y_i = \log Y_0 + \beta_1 EDUC_i + \beta_2 EXP_i + \beta_3 EXP_i^2 + \varepsilon_i \]
- If the above function is correct (and the conditions for
  least-squares estimation hold), then the OLS estimator of the
  return to education $\beta_1$ is BLUE.
- Some studies use a second-order polynomial in education,
  $\beta_1 EDUC_i + \beta_2 EDUC_i^2$, in the same way as for the
  experience variable.
- How do we know which functional form is correct?
  ⇒ Nonparametric local regressions may help, since they impose
  no condition on the functional form.
Using nonparametric local regression to approximate the
return to education
- First stage: remove from the earnings equation the variation
  due to factors other than education, leaving the residual
  variation $\tilde{Y}$:
\[ \log Y_i = \log Y_0 + \beta_1 EDUC_i + \beta_2 EXP_i + \beta_3 EXP_i^2 + \tilde{Y}_i \]
  $\tilde{Y}$ then supposedly contains only variation due to
  education (and other uncontrolled factors).
- Second stage: estimate the residual earnings function on the
  education level:
\[ \tilde{Y}_i = g(EDUC_i) + \varepsilon_i \]
  $g(\cdot)$ is a polynomial of arbitrary order.
- Because the method is nonparametric, the estimated result is
  presented graphically. This method is also called LOESS or
  LOWESS (locally weighted scatter-plot smoothing).
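The two-stage idea can be sketched with a hand-rolled local linear smoother. This is a simplified stand-in for LOWESS (tricube weights, no robustness iterations), not the routine used in the study. The first stage here partials out experience only, which is one reading of the slide and should be treated as an assumption, as should the simulated data and the shape of the hypothetical g.

```python
import numpy as np


def local_linear_smooth(x, y, x_eval, frac=0.5):
    """Tricube-weighted local linear fit of y on x, evaluated at x_eval."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    k = max(2, int(np.ceil(frac * len(x))))        # points per neighborhood
    fitted = np.empty(len(x_eval))
    for j, x0 in enumerate(x_eval):
        dist = np.abs(x - x0)
        h = max(np.sort(dist)[k - 1], 1e-12)       # bandwidth: k-th nearest distance
        w = np.clip(1.0 - (dist / h) ** 3, 0.0, None) ** 3   # tricube weights
        sw = np.sqrt(w)
        A = np.column_stack([np.ones_like(x), x - x0])
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        fitted[j] = coef[0]                        # local intercept = fit at x0
    return fitted


# Simulated sample with a hypothetical nonlinear schooling profile.
rng = np.random.default_rng(3)
n = 3_000
educ = rng.integers(1, 13, size=n).astype(float)
exper = rng.uniform(0.0, 40.0, size=n)
g_true = np.where(educ <= 5, 0.10 * educ, 0.50 + 0.02 * (educ - 5))
log_y = 8.0 + g_true + 0.052 * exper - 0.001 * exper ** 2 + rng.normal(0.0, 0.3, size=n)

# First stage (assumed form): remove variation explained by experience.
C = np.column_stack([np.ones(n), exper, exper ** 2])
first_stage_coef, *_ = np.linalg.lstsq(C, log_y, rcond=None)
residual_earnings = log_y - C @ first_stage_coef

# Second stage: smooth residual earnings on years of schooling.
grades = np.arange(1.0, 13.0)
g_hat = local_linear_smooth(educ, residual_earnings, grades, frac=0.3)
for grade, value in zip(grades, g_hat):
    print(f"grade {int(grade):2d}: fitted residual log earnings {value:+.3f}")
```

Plotting `g_hat` against `grades` gives the kind of graphical summary the slide refers to; flat segments in the curve correspond to grades with little or no return.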
Evidence of non-linear return to education∗
How to estimate the return to education in the presence of
omitted variables bias and measurement errors?
If these problems are not accounted for, OLS estimates are biased
in an unknown direction.
- To account for individual ability, include an IQ score or test
  scores as a proxy for unobserved individual characteristics.
  However, most datasets do not have this information.
- Diff-in-diff using repeated observations in a panel dataset:
  factors that do not change over time, such as individual
  ability, are differenced out. However, this method is
  restricted to individuals who work and study at the same time,
  often groups with low educational attainment and a high return
  to education (selection bias).
- Another approach is to use data on twins. Twins are assumed
  identical in every respect: family, genetics, and ability.
  Differences in their earnings may then be due to education.
How to estimate the return to education in the presence of
omitted variables bias and measurement errors? (2)
Use instrumental variables that are correlated with education
levels yet have no influence on earnings other than through
education.
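A short derivation shows why these two conditions are enough in the simplest case. The notation is an assumption introduced here for illustration: $Z_i$ is the instrument and $u_i$ collects everything in earnings other than education (ability, measurement error, the usual residual).

```latex
\begin{aligned}
\log Y_i &= \log Y_0 + \beta_1 EDUC_i + u_i,
  \qquad Cov[Z_i, u_i] = 0, \quad Cov[Z_i, EDUC_i] \neq 0 \\
Cov[Z_i, \log Y_i] &= \beta_1 \, Cov[Z_i, EDUC_i] + Cov[Z_i, u_i]
  = \beta_1 \, Cov[Z_i, EDUC_i] \\
\Rightarrow \quad \beta_1 &= \frac{Cov[Z_i, \log Y_i]}{Cov[Z_i, EDUC_i]}
\end{aligned}
```

The first condition (no link between the instrument and $u_i$) removes the bias term; the second (correlation with education) keeps the denominator away from zero.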
How to estimate the return to education in the presence of
omitted variables bias and measurement errors? (3)
Some well-known instruments include:
- The month or quarter of birth may be a good instrument for
  education level. In the US, students are required to attend
  school until a legally required age, which depends on the date
  of birth. Those born in the early quarters may therefore be
  allowed to leave school earlier and so spend less time in
  school. The date of birth is unlikely to affect individual
  income. Vietnam does not have a similar regulation on school
  attendance, yet those born in the very first days of a calendar
  year (say January 1st) may be allowed to start school with
  those born in the previous year. Date of birth (say, being born
  in the first week of January) can then be used as an instrument
  for education level.
- The education level of a spouse can be an instrument because
  marriages often occur between people of similar education
  levels. The education level of parents can also serve as an
  instrument because well-educated parents can often afford more
  education for their children. The education level of relatives
  is not expected to affect a person's earnings.
- The distance to school, or an education policy that randomly
  affects education in a natural-experiment setting without a
  direct impact on earnings.
Summary of the first part
- The average return to education in Vietnam is comparable to
  that in other developing countries.
- The return to education varies nonlinearly: it is highest for
  primary and upper secondary school and lowest for lower
  secondary school.
  - Within the same level, the length of time in school does not
    seem to affect income. For example, for those who completed a
    high-school diploma, it does not matter whether they did so
    in 10 or 12 years. Similarly, for those who completed lower
    secondary school, whether it took 8 or 9 years does not
    matter for earnings.
    ⇒ The value of the degree is very evident.
- Next, we quantify the effect of staying in school for one more
  year on earnings.
Using instrumental variables to estimate the causal effect
of studying the 9th grade on earnings in Vietnam
- A common question: why do we have to use IV? To keep things
  simple, why not use a dummy variable for those with and without
  the 9th grade in an ordinary least squares regression?
- The problem is that we do not know who took the 9th grade. From
  the questionnaire, we only know who completed all three
  educational levels, whether in 10 or 12 years. However, some
  had to repeat a class, adding one more year of school. Some
  enrolled early and so avoided the 9th grade, while others
  enrolled late and had to study one more year.
- Even if we knew, taking the 9th grade is endogenous and thus
  cannot be used as an explanatory variable.
Two proposed instruments for taking the 9th grade
- The first instrument is the year of birth, for the subsample
  north of the 17th parallel:
\[ D_1 = \begin{cases} 1 & \text{if } 1972 \le \text{year of birth} \le 1978 \\ 0 & \text{otherwise} \end{cases} \]
- The second instrument is the interaction between location
  relative to the 17th parallel and the year of birth, for the
  full-country dataset:
\[ N = \begin{cases} 1 & \text{for provinces north of the 17th parallel} \\ 0 & \text{otherwise} \end{cases} \]
\[ D_2 = D_1 \times N \]
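The instrument definitions translate directly into code. A minimal sketch, where the function and argument names are illustrative and the survey's actual variable names are not known here:

```python
def reform_cohort(year_of_birth):
    """D1: 1 for cohorts born 1972-1978 (inclusive), 0 otherwise."""
    return 1 if 1972 <= year_of_birth <= 1978 else 0


def build_instruments(year_of_birth, north_of_17th_parallel):
    """Return D1, N, and D2 = D1 x N for one respondent.

    `north_of_17th_parallel` is a hypothetical boolean flag for the
    respondent's province.
    """
    d1 = reform_cohort(year_of_birth)
    n = 1 if north_of_17th_parallel else 0
    return {"D1": d1, "N": n, "D2": d1 * n}


examples = [
    build_instruments(1975, True),    # reform cohort, North
    build_instruments(1975, False),   # reform cohort, South
    build_instruments(1982, True),    # later cohort, North
]
for row in examples:
    print(row)
```

Only respondents who are both in the reform cohort and in the North get D2 = 1, which is what lets the full-country sample use the South as a comparison group.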
Assessing the condition for valid instrumental variables
- Random assignment of the instruments:
  - No one knew the exact timing of the reform in advance, so no
    one could manipulate his or her time in school or migrate
    across the 17th parallel in order to avoid or take the 9th
    grade.
  - There is also no reason for anyone to move from one region to
    another just to avoid or take the 9th grade.
Assessing the condition for valid instrumental variables
- Exclusion restriction: going through either of the two
  education systems (before or after the reform) has no direct
  influence on earnings.
  - This may be acceptable when the curriculum had not changed.
    That is the case for those born in the years when the two
    education systems overlapped, before new textbooks were
    introduced. The only change is the extra year in school.
    The people affected are those born in 1972-78, who started
    the 9th grade in 1990.
  - The condition may not hold for those born well after the
    reform years, when there had been enough time to change the
    academic program.
Two-stage least squares (2SLS) with instrumental
variables
\[ EDUC_i = \alpha_0 + \alpha_1 D + \alpha_2 Exp_i + \alpha_3 Exp_i^2 + X_i \alpha_4 + \eta_i \]
\[ \log Y_i = \log Y_0 + \beta_1 \widehat{EDUC}_i + \beta_2 Exp_i + \beta_3 Exp_i^2 + X_i \beta_4 + \varepsilon_i \]
- First stage: estimate the impact of the reform on the number of
  years in school. $D$ is the instrument, $X$ is a vector of
  explanatory variables, and $\eta_i$ is the residual, supposedly
  iid normally distributed.
- Second stage: estimate the impact of education on earnings
  using the first-stage fitted value $\widehat{EDUC}_i$ in a
  conventional Mincerian equation.
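The two stages can be sketched with plain least squares on simulated data. Everything below is an assumption made for illustration: the data-generating process, the size of the first-stage effect, and the omission of the extra controls $X_i$. Running the second stage by hand gives the right point estimate but not valid standard errors, so a dedicated IV routine would be needed for inference.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def two_stage_least_squares(log_y, educ, d, exper):
    """Return the 2SLS estimate of beta1, instrumenting EDUC with D."""
    controls = np.column_stack([np.ones(len(exper)), exper, exper ** 2])
    # First stage: EDUC on the instrument and the exogenous controls.
    Z = np.column_stack([d, controls])
    educ_hat = Z @ ols(Z, educ)
    # Second stage: log earnings on fitted EDUC and the same controls.
    X_hat = np.column_stack([educ_hat, controls])
    return ols(X_hat, log_y)[0]


# Hypothetical data: ability confounds schooling and earnings; D shifts schooling only.
rng = np.random.default_rng(4)
n = 200_000
beta1 = 0.08
ability = rng.normal(0.0, 1.0, size=n)
d = rng.integers(0, 2, size=n).astype(float)
exper = rng.uniform(0.0, 40.0, size=n)
educ = 9.0 - 0.5 * d + 1.5 * ability + rng.normal(0.0, 2.0, size=n)
log_y = (8.0 + beta1 * educ + 0.052 * exper - 0.001 * exper ** 2
         + 0.05 * ability + rng.normal(0.0, 0.3, size=n))

b_2sls = two_stage_least_squares(log_y, educ, d, exper)
controls = np.column_stack([np.ones(n), exper, exper ** 2])
b_ols = ols(np.column_stack([educ, controls]), log_y)[0]

print(f"true beta1 = {beta1}, OLS = {b_ols:.4f}, 2SLS = {b_2sls:.4f}")
```

In this simulation OLS is pushed upward by the omitted ability term, while 2SLS uses only the variation in schooling induced by D and lands close to the assumed return.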
Main estimation result
First-stage regression of education levels on the instruments
- Column [1] corresponds to the subsample north of the 17th
  parallel. Column [2] corresponds to the full-country dataset.
- [*], [**], and [***] denote coefficients that are statistically
  significant at the 90%, 95%, and 99% confidence levels.
- The sample includes wage earners aged 20 to 70.
Discussions
The first-stage regression is as expected:
- People born during the reform years in the North spent about
  0.35-0.75 fewer years in school; the effect is statistically
  significant in three out of four samples (2006, 2008, and
  2010).
- The difference between the North and the South is stark:
  average schooling in the North is significantly longer, by
  about 1-2 years, than in the South, despite the North's shorter
  education system. The difference increased once the North
  moved to a 12-year education system.
  - This reflects Northern people's attitude towards education.
Discussions
Second-stage regression of earnings on education
attainments
- Column [1] corresponds to the subsample north of the 17th
  parallel. Column [2] corresponds to the full-country dataset.
- [*], [**], and [***] denote coefficients that are statistically
  significant at the 90%, 95%, and 99% confidence levels.
- The sample includes wage earners aged 20 to 70.
Discussions
The second-stage regression is surprising!
- The coefficient on years of schooling is statistically
  insignificant, or even negative and insignificant ⇒ more
  schooling does not raise earnings, or worse yet, reduces them.
- This result contradicts least-squares estimates that assume a
  constant (linear) return to schooling, but agrees fully with
  the nonparametric local regressions. Why?
  - A linear model assumes a constant return to an extra year
    (the average return), irrespective of the grades completed.
  - Nonparametric regressions show that studying the 9th grade
    does not raise earnings.
  - The IV estimate of the impact of the 9th grade on earnings is
    a local average treatment effect (LATE): it applies only to
    those affected by the reform, not to all observations in the
    data. Further, the IV-LATE is not the same quantity as the
    average return.
Robustness checks
- Restrict the sample to central provinces within 400 kilometers
  of the 17th parallel.
- These provinces are expected to be similar in most
  socio-economic conditions.
- This avoids the influence of the two major economic engines,
  Hanoi and HCMC.
Robustness checks
First-stage regression of education levels on the instruments
- The sample includes provinces in central Vietnam: Thanh Hóa,
  Nghệ An, Hà Tĩnh, and Quảng Bình north of the 17th parallel,
  and Quảng Trị, Thừa Thiên Huế, Đà Nẵng, Quảng Nam, Quảng Ngãi,
  Bình Định, Phú Yên, and Khánh Hòa south of the 17th parallel.
  The Central Highland provinces Gia Lai and Kon Tum were not
  used.
- Column [1] corresponds to the subsample north of the 17th
  parallel. Column [2] corresponds to all central provinces.
Robustness checks
Second-stage regression of earnings on education
attainments
- The sample includes provinces in central Vietnam: Thanh Hóa,
  Nghệ An, Hà Tĩnh, and Quảng Bình north of the 17th parallel,
  and Quảng Trị, Thừa Thiên Huế, Đà Nẵng, Quảng Nam, Quảng Ngãi,
  Bình Định, Phú Yên, and Khánh Hòa south of the 17th parallel.
  The Central Highland provinces Gia Lai and Kon Tum were not
  used.
- Column [1] corresponds to the subsample north of the 17th
  parallel. Column [2] corresponds to all central provinces.
Validity of the instruments
Tests of validity of the instruments
Tests of weak instruments and under-identification all reject the
null of no correlation between the instruments and the endogenous
variable in all samples except the 2004 data.
The nature of IV-LATE estimates of the return to
education∗
Instrumental Variables - Local Average Treatment Effects:
\[ \operatorname{plim} \beta_{IV} = \frac{E[\beta_i \, \Delta S_i]}{E[\Delta S_i]} \]
$\beta_i$ is the return to education and $\Delta S_i$ is the
extra schooling obtained by person $i$ as a result of the reform.
- The estimate is relevant only to those affected by the reform.
  People who dropped out before the reform began were not
  affected ⇒ high internal validity but low external validity.
- Those who completed the 9th grade were affected the most and
  thus have the largest influence on the estimate of $\beta_{IV}$
  through the weight $\Delta S_i$. Those who dropped out during
  the 9th grade affect the estimate to a lesser degree.
- This estimate is not the population's average return to
  education as derived from the original Mincer study.
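The weighting in the LATE formula can be seen in a toy calculation. The three individuals below, their returns, and their schooling responses are entirely hypothetical.

```python
def iv_late(returns, extra_schooling):
    """Weighted average E[beta_i * dS_i] / E[dS_i] over a list of individuals."""
    numerator = sum(b * ds for b, ds in zip(returns, extra_schooling))
    denominator = sum(extra_schooling)
    if denominator == 0:
        raise ValueError("the reform must change schooling for someone")
    return numerator / denominator


# Hypothetical individuals:
#   person 0 completed the extra year (dS = 1.0) and has a low return,
#   person 1 dropped out partway through it (dS = 0.5),
#   person 2 left school before the reform and is unaffected (dS = 0.0).
returns = [0.02, 0.06, 0.15]
extra_schooling = [1.0, 0.5, 0.0]

late = iv_late(returns, extra_schooling)
simple_average = sum(returns) / len(returns)

print(f"IV-LATE = {late:.4f}, unweighted average return = {simple_average:.4f}")
```

The unaffected person has the highest return but zero weight, so the IV-LATE sits well below the unweighted average. This is the sense in which the estimate speaks only for those whose schooling the reform changed.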
Why Zero Returns to 9th Grade in Vietnam?
- The IV-LATE estimate applies only to those affected by the
  reform, that is, those who had completed at least 8 years of
  schooling. These people would have a much lower return to an
  extra year in school than those with only 3-4 years of
  schooling.
- IV estimates for other countries that show a large positive
  return to education (sometimes up to 20%) are due to the
  purposive placement of education programs in poor and less
  educated regions. In contrast, the last education reform in
  Vietnam applied to those who already had a high level of
  education.
Why Zero Returns to 9th Grade in Vietnam? (2)
Other reasons:
- A rigid wage structure (wage grids) for government employees.
- A slow pace of education reform, for political reasons, left
  the curriculum virtually unchanged in the first few years of
  implementation.
- The signaling value of the degree (the "sheepskin effect") is
  more important than the quality of education.
- Pischke and von Wachter (2005): in the case of Germany,
  apprenticeship training could make up for a loss of schooling
  time.
Conclusions
- Earnings increase nonlinearly with years in school.
- The value of the degree is very evident. Within each education
  level obtained, the length of time in school does not affect
  earnings.
- Three reasons explain why earnings do not increase as the
  classical Mincer equation predicts: a rigid wage structure;
  ineffective schooling that leads to low productivity; and the
  lack of a mechanism to signal schooling quality.
- As a consequence, fake degrees and fake learning cannot be
  avoided!
Policy implications
- Reduce schooling time or increase the integration between
  subjects.
- Move to performance-based pay instead of a rigid wage
  structure.
- Remove the completion exam at the end of an academic year. The
  lower secondary school completion exam was removed recently.
  Would the next target be the upper secondary school completion
  exam?
- The optimal education policy would target the poor and
  disadvantaged, who likely have the highest marginal return. In
  Vietnam, that means towns and communes far from big cities,
  mountainous areas, and ethnic minorities.
Cautions and limitations
- The return to education in this study concerns only personal
  income, not the rate of return for society as a whole. The
  personal return to the 9th grade might be zero while the return
  for the whole country is not. The latter could be either
  positive or negative; more work is needed.
- No general equilibrium effects are considered: increasing time
  in school reduces labor supply, thus raising equilibrium wages
  (if the labor market is competitive).
- Other effects, such as those on health, longevity, and crime,
  are not considered.