Screening Technology and Moral Hazard in Disability
Insurance: Identifying misclassification
Helge Liebert∗
December 13, 2014
Center for Disability and Integration, School of Economics and Political Science, University of
St. Gallen, Rosenbergstrasse 51, CH-9000 St. Gallen, Switzerland
Abstract
This paper evaluates the presence of moral hazard in disability insurance by exploiting
differences in screening quality across Swiss regions. A reform in 2004 introduced
medical screening institutions as an additional gatekeeper to the public disability
insurance system. Sequential introduction of the reform as a pilot project in a
subset of Swiss cantons is used for identification. I find that better medical screening
has reduced inflow by about 23%. This reduction only occurs for hard-to-diagnose
conditions which are more likely to be subject to moral hazard. Since screening
stringency and implicit eligibility requirements remained unchanged, the results
suggest that a substantial share of DI awards are afflicted by moral hazard.
Keywords:
JEL classification: H53, I18, J14
∗
University of St. Gallen, Center for Disability and Integration, Rosenbergstr. 51, 9000 St. Gallen,
Switzerland. Email: [email protected]
1 Introduction
Targeted welfare payments are the most common form of social insurance in developed
countries. Disability insurance is among the largest targeted social insurances, and its
costs have steadily increased over time in most countries. The SSDI program in the US
distributed $130 billion in benefits to over 10.6 million people, more than a 50% increase
compared to 2005.
The efficiency of any targeting system crucially relies on how accurate it is in identifying
deserving beneficiaries. The increasing number of people on disability rolls has often been
tied to moral hazard (e.g. Autor and Duggan 2003). Moral hazard in disability insurance
arises from the information asymmetry regarding claimants’ true health status. Applicants
may overstate health limitations if the disutility of working is large and benefit receipt is
an attractive alternative (e.g. Kreider 1998). In addition, the insurance office can only
imperfectly assess claimants’ health. The problem is exacerbated by the fact that DI
benefits are often granted for health deficiencies which are generally difficult to diagnose,
e.g. psychological problems or soft tissue pain. As a result, benefits may be granted to
individuals who are actually undeserving.
The most prominent solution to reduce inflow into DI is to adjust the screening
mechanism. Better screening mechanisms can effectively decrease insurance uptake, lead
to higher employment and reduce social costs, provided the extent of moral hazard is non-negligible. A popular policy response to reduce insurance inflow is to instruct caseworkers
to screen applicants more strictly (e.g. de Jong et al. 2011). However, such adjustments always
imply an implicit or explicit change in eligibility criteria. Changes in the requirements may
cause a reduction in insurance inflow, but do not necessarily improve targeting efficiency.
Rejected applications are a mixture of wrongful rejections (increase in false negatives),
identified moral hazard (decrease in false positives) and rejections due to the change in
requirements. For this reason, changes in screening stringency cannot provide convincing
evidence of moral hazard, as a share of rejected applicants may have been deserving
under the previous regime. To reduce and identify the extent of moral hazard, screening
adjustments need to target the information asymmetry between claimants and the DI office
such that award decisions are made on more qualified grounds. If insurance caseworkers
have better information about claimants’ true work capability, they are more likely to
deny benefits to those who are ineligible.
This paper is the first to focus on screening quality as opposed to screening stringency.
Unlike previous literature, I do not rely on ad-hoc differences in screening, e.g. caseworkers
being advised to apply eligibility rules more strictly, ruling out effects due to implicit
changes in award requirements. The identified effect can be traced to a reduction in the
information asymmetry between caseworker and applicant, thus providing more decisive
evidence on the incidence of moral hazard in public disability insurance.
Identification relies on quasi-experimental policy variation generated by the introduction
of a medical gatekeeper institution in Switzerland. Generous benefits and previously absent
medical screening render the Swiss system especially liable to moral hazard. Replacement
rates of up to 95% render benefit receipt an attractive alternative to work. The reform
established medical screening offices mandated to review all DI applications in a subset
of Swiss regions in 2002. The screening offices were introduced to improve the quality of
screening and provide better information to the responsible DI caseworker within the same
regulatory environment. Observed reductions in inflow provide a lower bound estimate for
the share of applications induced by moral hazard.
The sequential spatial implementation of the reform is exploited in a local difference-in-differences design. A duration framework using age at DI entry as the outcome is
used for estimation. I utilize the spatial treatment assignment information to compare
individuals within the same local labor markets. My results suggest that a substantial
share of DI applications is likely to be affected by moral hazard. I estimate that better
medical screening has effectively reduced insurance inflow by about 23%. Reductions only
occur for psychological and musculoskeletal conditions which are difficult to diagnose and
most likely to be affected by moral hazard. Results are stable across a variety of robustness
checks.
A considerable body of literature has investigated the incidence of moral hazard in
public disability insurance. Autor and Duggan (2003) have related the strong increases in
the number of people on the disability insurance rolls and the associated expenditures in
the US during the 1990s and 2000s to lenient medical screening. A number of studies have
shown that higher replacement rates and easier access to benefits reduce the propensity to
work (e.g. Gruber and Kubik 1997, Gruber 2000, Autor and Duggan 2003, Autor et al.
2012), and that individuals out of the labor market tend to overstate work limitations
(Kreider 1999). However, empirical evidence with regard to the effectiveness of improved
screening measures is mixed, and most studies focus on labor supply effects (e.g. Gruber
and Kubik 1997, Karlström et al. 2008, Mitra 2009).
To date, there is relatively little empirical evidence on the effects of screening processes
on insurance inflow. De Jong et al. (2011) focus on the effects of stricter disability screening
on DI applications. In their setting, 2 out of 26 regions in the Netherlands were instructed to
screen applications more strictly. They find that DI applications and sickness absenteeism
declined, possibly due to self-screening (cf. Parsons 1991). Spillovers into unemployment
insurance are absent. They argue that stricter screening has reduced the extent of moral
hazard by reducing applications of those able to work and thereby improved targeting
efficiency. Another approach to analyze the effectiveness of different screening procedures
is taken by Campolieti (2006). He uses aggregated data from different regions in Canada
to study the effect of more rigorous medical adjudication requirements on self-reported
occurrences of hard-to-diagnose medical conditions. His findings suggest that increased rigor
of medical screening has reduced the prevalence of musculoskeletal conditions, suggesting
that screening may have reduced the extent of moral hazard. Both these studies identify
effects which include implicit eligibility requirement changes. Furthermore, screening
measures are primarily implemented to reduce costs and are targeted at DI inflow. It
is thus necessary to evaluate these policies particularly with respect to actual insurance
inflow.
The paper proceeds as follows: The next section covers the institutional setting,
section 3 introduces the data, section 4 discusses identification and estimation methods,
section 5 presents the results and section 6 concludes.
2 Institutional Background
Disability benefits in Switzerland are very generous: Individuals can expect to receive
between 60% and 95% of their final wage in benefits from the main insurance schemes.
Depending on the prior level of income, minimum benefits for a full pension amount
to 1,160 CHF, maximum benefits to 2,320 CHF per month before taxes from the main
public scheme alone. On top of this, people receive substantial additional payouts from
occupational pension plans, family-contingent benefits for spouses and children or means-tested supplementary benefits.
The main reason for these generous payments is that the Swiss disability insurance was
originally designed to cover earnings losses and not intended as a welfare transfer. The
official disability definition implies a causal connection between earnings and disability: A
person is considered disabled (i.e. eligible for benefits) if he suffers a disability-induced
earnings loss of at least 40%. Payouts are increasing in the individual loss of earnings
potential. Unlike in the US, the Swiss system allows for partial disability benefits.1
Exact replacement rates are determined based on an individual’s disability degree, a
measure of work incapacity calculated as one minus the ratio of potential labor market
income with disability to the potential income without disability (typically prior earnings).
Eligibility for all payouts is determined by the local disability office and binding for all
other insurance providers.
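The disability degree and the 40% eligibility threshold described above can be illustrated with a minimal sketch. Only the formula and the threshold come from the text; the function names and the example incomes are hypothetical.

```python
def disability_degree(income_with_disability: float,
                      income_without_disability: float) -> float:
    """One minus the ratio of potential income with disability to
    potential income without disability (typically prior earnings)."""
    if income_without_disability <= 0:
        raise ValueError("income without disability must be positive")
    return 1.0 - income_with_disability / income_without_disability


def is_eligible(degree: float, threshold: float = 0.40) -> bool:
    """A person counts as disabled if the earnings loss is at least 40%."""
    return degree >= threshold


# Hypothetical case: prior earnings of 60,000 CHF, 33,000 CHF still attainable.
degree = disability_degree(33_000, 60_000)
print(f"disability degree: {degree:.2f}, eligible: {is_eligible(degree)}")
```

In this hypothetical case the earnings loss is 45%, so the person would pass the eligibility threshold; how the degree maps into a partial pension level is not modeled here.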
To apply for benefits, an applicant has to submit the medical documentation of his
condition and his previous earnings records. The disability insurance offices then have to
assess the individual earnings loss based on the severity of the condition and its impact
on working hours. Based on this, the caseworker decides whether the person
qualifies for benefits. Until 2004, however, the insurance office could only assess applicants’ eligibility
from the medical certificates issued by the applicant’s chosen doctor. They were
not allowed to examine the applicant, even when in doubt about the credibility or severity
of the impediment. The caseworkers deciding on the application also have no medical
training, although they sometimes consult public health officers.
1 During the period under study, these were set to 25%, 50%, 75% and 100%.
Figure 1: Cantons with medical audit services
Note: Pilot cantons shaded gray.
Legend: ZH: Zürich, BE: Bern, LU: Lucerne, UR: Uri, SZ: Schwyz, OW: Obwalden, NW: Nidwalden, GL: Glarus, ZG: Zug, FR: Fribourg, SO: Solothurn, BS: Basel-Stadt, BL: Basel-Landschaft, SH: Schaffhausen, AR: Appenzell A.-Rh., AI: Appenzell I.-Rh., SG: St. Gallen, GR: Graubünden, AG: Aargau, TG: Thurgau, TI: Ticino, VD: Vaud, VS: Valais, NE: Neuchâtel, GE: Geneva, JU: Jura.
The near complete absence of medical checks by public officials and the generous
benefits have rendered the Swiss system especially liable to moral hazard. Both the
number of disability pension recipients and the budget deficit strongly increased during the
1990s. Mounting financial pressure then led to a sequence of revisions of the Swiss disability
insurance system after 2000. The Swiss parliament passed the first major reform in 2003.
The reform created several supra-regional medical audit institutions tasked to conduct
(re-)appraisals of benefit claims and authorized to carry out medical examinations.2
The personnel at the offices is trained in both medicine and actuarial regulations. The
additional checks were introduced to improve the quality of screening in two ways. First,
the offices have the authority to screen people in person and order further examinations.
Under the new system, the responsible audit office always receives a complete copy of
an individual’s insurance application, including the medical documentation of potential
limitations. The office then provides an evaluation of the applicant’s eligibility for the
local DI office. If they notice inconsistencies in the application or deem the application to
be invalid, they have the authority to conduct further examinations or order specialist
consultations.
2 The 4th revision of the Federal Act on Disability Insurance (4. Revision des Bundesgesetzes über die Invalidenversicherung). Prior revisions of the DI law
occurred in 1968, 1987 and 1992. Other relevant reform measures of the 4th revision included the
introduction of a three-quarter pension, the abolition of additional pensions for spouses and increased
efforts to place applicants in employment. However, these changes were adopted nationwide and only
became effective in 2004. There were no confounding reforms to other social insurances during the pilot
period.
Second, the audit offices are supposed to help public officials to better assess the
implications of health problems and their impact on an individual’s ability to work
in relation to the insurance requirements. They are expected to reduce asymmetric
information by offering a more qualified judgement about eligibility, both through a
more competent medical assessment and by reducing communication deficiencies, i.e. framing
the evaluation in terms the local insurance office can better comprehend.
The final decision on whether benefits are granted remains with the responsible
insurance caseworker. This is a crucial issue: The regulatory setting remains unchanged,
only the provision of information about the subject’s eligibility regarding health limitations
is affected by the reform.
To test the institutional changes, audit offices were already implemented in 11 out of
26 Swiss regions as part of a pilot project in 2002. In the remaining regions, operation
began in 2005 as scheduled by the reform proposal. This quasi-experimental variation
is exploited in the remainder of the paper to assess the extent of moral hazard in the
disability insurance. The cantons which introduced screening institutions in 2002 are
shown in Figure 1.
Selective reform changes may induce some individuals to change their behavior in
anticipation of prospective losses. However, the pilot period was introduced shortly
after reform changes were publicly announced, and the screening changes implied by
the reform proposal received little public attention. The first draft of the reform which
included the medical screening institutions was proposed in February 2001, and underwent
some revisions until being approved in March 2003. The pilot project began in January
2002. The rapid introduction of the pilot project within ten months alleviates concerns
regarding anticipatory behavior. The operation of the screening institutions began almost
immediately after the reform was announced and publicly available information about the
pilot program at the time was very limited. Nevertheless, this issue is addressed in more
detail in section 4.
Wapf and Peters (2007) provide a qualitative evaluation of the audit offices’ work.
They describe that the institution has improved internal communication and reduced
knowledge disparities between physicians and insurance offices. Due to data limitations,
they are not able to make a decisive causal statement about whether reduced inflow rates
can be attributed to the introduction of the audit offices.
3 Data
The main estimations are based on the SESAM (Syntheserhebung soziale Sicherheit und
Arbeitsmarkt) data set provided by the Swiss Federal Statistical Office. The SESAM data
link the official Swiss labor force survey to administrative public insurance records. The
sample period ranges from 1999–2011. The data provides a rich set of information about
income, labor market history, past benefit receipt, education, family background and a
wealth of other socio-economic characteristics. Specifically, it also includes information
about individuals’ municipality of residence. Given the survey weights, the data is
representative of the Swiss population.
SESAM is a rotating panel which tracks individuals for five years until they drop out
and each year 20% of individuals are resampled. Due to the small incidence of disability
insurance in the population (around 0.5% per year) and the limited number of individuals
that can be tracked over several years, the longitudinal dimension cannot be used for
the analysis. However, the age of the first disability insurance claim is observed in the
data. This allows me to reconstruct when an individual entered disability insurance. I
measure DI inflow using the age of first disability receipt as the main outcome in a duration
framework.
An issue that could potentially affect the analysis is measurement error in the disability
incidence indicator. The age at first claim is calculated by the Federal Statistical Office
based on information about how long a person has been paying insurance premiums, assuming
that everybody started contributing at age 21. This is true for anybody who has held a
job (including student or part-time jobs) for any period of time until age 21, entered civil
or military service (mandatory for men), or received some form of social assistance. This
constitutes the majority of the population. Even if an individual does not work until age
21, most people opt to pay the premium privately to avoid a gap in their basic pension
contributions.3 However, I cannot fully dismiss the possibility of measurement error. For
some people pursuing a higher education track, the entry date may be underestimated, i.e.
they are observed to claim benefits earlier than they actually do. This issue is addressed
in the robustness checks.
The treatment region is defined as the cantons participating in the pilot project, the
treatment period comprises the years 2002–2004. For each individual, the municipality of
residence is observed. The spatial treatment assignment is exploited in the estimations.
Two different samples are used when evaluating the impact of the reform on the treated
population. The unrestricted sample comprises 259,599 individuals in all Swiss regions.
A second, local sample is restricted to individuals in local labor markets near border
regions between treated and control cantons and comprises 56,640 individuals. Descriptive
statistics for both estimation samples are given in Table A1 in the appendix.
3 Contributions are modest and amount to 480 CHF per year. Local insurance offices even send
payment reminders to students.
To identify individuals in the vicinity of administrative borders, spatial information
is required. Data about distances between different municipalities is obtained from
www.search.ch. For each municipality, I compute the distance to the nearest treated/non-treated counterpart that was sampled in the same year. Weights are used to estimate
pairwise differences. The weights for treated municipalities are set to unity, the weights
for control municipalities are set to the inverse of the number of times a municipality is
used as a control. Distance information is available as both actual travel distance and
travel time by car. I choose a travel distance of 20 kilometers between municipalities as
the threshold for the estimation sample. Microcensus data on mobility show that 80%
of commuters stay within this distance limit, and it corresponds approximately to the
average commuting time in Switzerland of about 25 minutes (BSV 2012, Eugster and
Parchet 2011). Results are robust to the choice of distance measure, variations in the
threshold level and whether weights are applied.
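The pairing and weighting step can be sketched as follows. This is a simplified illustration rather than the implementation used for the estimates: it pairs each treated municipality with its nearest control only (ignoring the reverse direction and the same-year restriction), and the municipality labels, distances and the function `match_and_weight` are hypothetical.

```python
from collections import Counter


def match_and_weight(distances_km, threshold_km=20.0):
    """distances_km maps each treated municipality to {control: distance in km}.

    Returns (pairs, weights). Each treated municipality is paired with its
    nearest control if that control lies within the threshold. Treated units
    get weight 1, controls get 1 / (number of times used as a control).
    """
    pairs = {}
    for treated, candidates in distances_km.items():
        if not candidates:
            continue
        control, dist = min(candidates.items(), key=lambda kv: kv[1])
        if dist <= threshold_km:
            pairs[treated] = control
    uses = Counter(pairs.values())
    weights = {treated: 1.0 for treated in pairs}
    weights.update({control: 1.0 / n for control, n in uses.items()})
    return pairs, weights


# Hypothetical municipalities and travel distances, for illustration only.
distances = {
    "T1": {"C1": 5.0, "C2": 12.0},
    "T2": {"C1": 8.0, "C2": 9.5},
    "T3": {"C2": 31.0},  # nearest control is beyond 20 km, so T3 is dropped
}
pairs, weights = match_and_weight(distances)
print(pairs)
print(weights)
```

Here the control municipality C1 serves two treated municipalities and therefore receives half the weight of each treated unit.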
4 Empirical Strategy
A difference-in-differences identification approach is used to evaluate the impact of the
medical screening institutions. The quantity of interest is the incidence of disability insurance
take-up, i.e. the rate of newly awarded benefits among previously non-receiving working
age individuals. However, due to an opaque political decision process, selection into
treatment cannot be assumed to be random. Since cantons differ in many respects, simply
comparing individuals from different cantons may lead to biased estimates of treatment
effects. Differencing removes time-invariant influences on potential outcomes. Identification
still requires a common development of DI incidence in the absence of the new screening
institutions. This assumption raises some concerns.
As Autor and Duggan (2003) illustrate, people rarely transition directly from employment into DI, but typically apply conditional on job loss. The main concern is that labor
markets may be less resilient in some regions, or that regions with strong industrial and
commercial hubs are more affected by common economic shocks. If screening is imperfect
and disability insurance is used as a partial substitute for unemployment insurance or an
early retirement vehicle in case of job loss, differential labor market trends can confound
the results. Since Switzerland is a country with historically tight labor markets, such
concerns are alleviated to some degree.
To address this issue, I pursue a twofold approach. A first set of results is based on the
full sample of individuals across all regions. A narrower identification approach focuses
on individuals within the same local labor markets in border regions between treated and
control areas. Similar strategies are used by Frölich and Lechner (2010) and Campolieti
and Riddell (2012).
For estimation, I exploit the natural spell format of the data and model insurance take-up as a duration problem. The main specification uses a stratified Cox (1972) proportional
hazard model to estimate the impact of the reform on DI incidence. The hazard rate
is modeled as $h(t, P, D \mid X < \bar{x}) = h_{0g}(t) \exp\{\beta_0 P + \beta_1 D + \beta_2 P D\}$, where $h_{0g}(t)$ is the
non-parametric baseline hazard within stratum $g$, $t$ denotes time in years, $D \in \{0, 1\}$ is a
binary treatment group indicator and $P \in \{0, 1\}$ is a binary time-varying indicator for
the pilot period during 2002–2004. Samples are restricted to local labor markets in border
municipalities between treated and control regions within an absolute distance threshold
$\bar{x}$, where individuals are similar in observables and remaining differences can credibly be
assumed to be time-constant.
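To make the role of the interaction coefficient concrete, the following sketch evaluates the exponential part of the hazard for the four combinations of P and D. The baseline hazard cancels in any ratio taken at the same age within a stratum, so only the multiplier is needed. The coefficient values are purely illustrative: they are the logs of the hazard ratios reported in column (5) of Table 1, and the function name `relative_hazard` is hypothetical.

```python
import math


def relative_hazard(P: int, D: int, b0: float, b1: float, b2: float) -> float:
    """exp(b0*P + b1*D + b2*P*D): hazard relative to the stratum baseline."""
    return math.exp(b0 * P + b1 * D + b2 * P * D)


# Illustrative coefficients (logs of reported hazard ratios):
# pilot time, treatment group, and their interaction.
b0, b1, b2 = math.log(1.269), math.log(1.151), math.log(0.771)

# Relative change in the hazard from the pre-pilot to the pilot period.
treated_change = relative_hazard(1, 1, b0, b1, b2) / relative_hazard(0, 1, b0, b1, b2)
control_change = relative_hazard(1, 0, b0, b1, b2) / relative_hazard(0, 0, b0, b1, b2)

# The ratio of the two changes equals exp(b2) regardless of b0 and b1.
ratio_of_ratios = treated_change / control_change
print(round(treated_change, 3), round(control_change, 3), round(ratio_of_ratios, 3))
```

The level difference between groups (b1) and the common period effect (b0) drop out of the final ratio, which is why the interaction term carries the difference-in-differences comparison.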
The model is specified using age as the time scale. This is preferable to using time-on-study as analysis time due to the age-dependent nature of the disability hazard, the rich
cohort data available and the interest in the effect of a time-varying covariate (Korn et al.
1997, Thiébaut and Bénichou 2004). As recommended by Korn et al. (1997), all models
are stratified by five-year birth cohorts. Individuals become at risk when they enter the
eligibility age at 18. Censoring occurs at the sampling date or when individuals reach the
retirement age, whichever occurs first. Disability benefit receipt constitutes failure. Due
to data limitations, the analysis is restricted to single spells and disability insurance is
assumed to be an absorbing state.4
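The spell definition above can be sketched in code. This is a minimal illustration under several assumptions that are not taken from the text: ages are measured in whole years, age a is reached in calendar year birth year + a, the retirement age is passed in as a parameter, and all names (`Episode`, `build_episodes`) are hypothetical. Each spell starts at age 18, ends at DI entry or censoring, and is split at the ages where the time-varying pilot indicator P changes.

```python
from dataclasses import dataclass
from typing import List, Optional

ENTRY_AGE = 18            # individuals become at risk at age 18
PILOT_FIRST_YEAR = 2002   # pilot period covers 2002-2004
PILOT_LAST_YEAR = 2004


@dataclass
class Episode:
    start_age: int  # episode covers ages (start_age, stop_age]
    stop_age: int
    pilot: int      # P = 1 if the episode falls in the pilot period
    event: int      # 1 if the episode ends with DI entry (failure)


def build_episodes(birth_year: int, sample_year: int, retirement_age: int,
                   di_age: Optional[int] = None) -> List[Episode]:
    """Build one person's spell on the age scale, split where P changes."""
    # Censoring at the sampling date or retirement age, whichever comes first.
    censor_age = min(sample_year - birth_year, retirement_age)
    failed = di_age is not None and di_age <= censor_age
    exit_age = di_age if failed else censor_age
    if exit_age <= ENTRY_AGE:
        return []  # never observed at risk
    # Ages at which the pilot indicator switches on and off for this person.
    switch_ages = [PILOT_FIRST_YEAR - 1 - birth_year, PILOT_LAST_YEAR - birth_year]
    inner_cuts = [a for a in switch_ages if ENTRY_AGE < a < exit_age]
    cuts = sorted({ENTRY_AGE, exit_age, *inner_cuts})
    episodes = []
    for start, stop in zip(cuts, cuts[1:]):
        in_pilot = PILOT_FIRST_YEAR <= birth_year + stop <= PILOT_LAST_YEAR
        is_failure = failed and stop == exit_age
        episodes.append(Episode(start, stop, int(in_pilot), int(is_failure)))
    return episodes


# Hypothetical person born in 1960, sampled in 2008, entering DI at age 42 (in 2002).
for episode in build_episodes(1960, 2008, retirement_age=65, di_age=42):
    print(episode)
```

Records in this start/stop format are what a Cox model with time-varying covariates typically expects, with the five-year birth cohort entering as the stratification variable.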
The standard assumptions for difference-in-differences estimation have to be restated
for proportional hazard models. The exponentiated coefficient on the interaction between
treatment time and region represents a ratio of hazard ratios
$$\exp\beta_2 = \frac{h(t, D=1, P=1)\,/\,h(t, D=1, P=0)}{h(t, D=0, P=1)\,/\,h(t, D=0, P=0)}. \tag{1}$$
The distance condition has been dropped to ease notation. The effect of interest is the
relative change in the hazard for the treated, i.e. a relative average treatment effect on the
treated,
$$\mathrm{rATT} = \frac{h^{1}(t, D=1, P=1)}{h^{0}(t, D=1, P=1)}, \tag{2}$$
where $h^{D}$ denotes potential hazard rates. I assume the observation rule (Rubin 1977)
holds, i.e. either of the two potential treatment states is observed. Identification requires
the two usual conditions in restated form
$$h^{1}(t, D=1, P=0) = h^{0}(t, D=1, P=0) \quad \text{(no anticipation)}, \tag{3}$$
$$\frac{h^{0}(t, D=1, P=1)}{h^{0}(t, D=1, P=0)} = \frac{h^{0}(t, D=0, P=1)}{h^{0}(t, D=0, P=0)} \quad \text{(common trend)}. \tag{4}$$
4 Actual outflow rates due to reasons other than death or moving to the old-age pension system
amount to less than 1% of the stock per year (BSV 2012).
The main identifying assumption is that in the absence of stricter screening measures,
incidence for individuals in both pilot and non-pilot (border) regions would have changed
proportionally. The common trend assumption is not invariant to the scaling of the
dependent variable (Lechner 2010) and is modified accordingly. Instead of assuming a
common trend between regions over time in differences, I am assuming a constant hazard
ratio, i.e. a common relative change or a common absolute change in logs.
This strategy removes unobservable factors which have a time-invariant effect on log
potential outcomes. I assume that conditional on being in the same local labor market,
there are no trend-confounding factors. Since people living on different sides of the border
are subject to different policies, the causal effect of the reform on DI uptake can be
identified by comparing hazard rates close to the border across time. To the best of my
knowledge, there are no other canton-specific reforms during the relevant time period.
As disability insurance applicants are a small fraction of the population, it is credible that
general equilibrium effects are absent and the observation rule is satisfied. Given these
assumptions, the coefficient of the interaction identifies the hazard ratio of interest.
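The identification argument can be written out in a few lines. The following is a sketch under the stated assumptions, with $t$ suppressed and hazards written as $h(D, P)$. It assumes that observed hazards correspond to the potential hazard $h^{1}$ in the treated group and to $h^{0}$ in the control group (the observation rule), and then applies the no-anticipation and common-trend conditions in turn.

```latex
\begin{align*}
\exp\beta_2
  &= \frac{h(1,1)\,/\,h(1,0)}{h(0,1)\,/\,h(0,0)}
   = \frac{h^{1}(1,1)\,/\,h^{1}(1,0)}{h^{0}(0,1)\,/\,h^{0}(0,0)}
   && \text{(observation rule)} \\
  &= \frac{h^{1}(1,1)\,/\,h^{0}(1,0)}{h^{0}(0,1)\,/\,h^{0}(0,0)}
   && \text{(no anticipation)} \\
  &= \frac{h^{1}(1,1)\,/\,h^{0}(1,0)}{h^{0}(1,1)\,/\,h^{0}(1,0)}
   && \text{(common trend)} \\
  &= \frac{h^{1}(1,1)}{h^{0}(1,1)} = \mathrm{rATT}.
\end{align*}
```

The common-trend step replaces the observed relative change in the control group with the unobserved counterfactual relative change in the treated group, after which the pre-period hazard cancels.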
A duration approach is preferred to a standard linear difference-in-differences framework
for several reasons. It corresponds naturally to the spell format of the data and the
fact that DI entry is essentially a survival outcome.5 For the standard difference-in-differences approach, only data sampled after 2004 can be used for estimation, resulting
in a substantial loss of information. It also requires creating a pseudo-panel structure,
inferring past incidence figures from one or multiple chosen cross-sections and adjusting
them for past eligibility. Since disability incidence in the total population is comparatively
low, these issues are non-negligible and lead to unstable estimates.
Furthermore, estimation of the incidence impact in a standard difference-in-differences
framework would require modifying the standard common-trend assumption in a way
which prohibits a more detailed analysis. Since incidence is defined as new benefit awards
among previously non-receiving working-age individuals, it is necessary to condition on
the absence of benefit receipt in the previous period when calculating the incidence rate
for each period. Since the pilot program spans three years, only incidence rates within
this time frame can effectively be compared without biasing results by conditioning on an
outcome.
5 Another possibility would be to analyze prevalence, i.e. the effect of the screening reform on the
stock of disability insurance beneficiaries as in Staubli (2011). This is unappealing for two reasons. In a
standard difference-in-differences framework, the common trend assumption implies that first differences
between periods are equal for treated and control regions. Using prevalence as an outcome, this would
imply equal incidence rates across regions, an assumption which seems unlikely to be fulfilled in the present
context. Furthermore, the reform is much more likely to affect the rates of newly awarded benefits before
effects on the stock of recipients eventually materialize.
Table A2 shows differences in selected covariates between treatment and control regions,
separately for both the local and the unrestricted sample for a representative subset of
data sampled prior to the pilot period. In the full sample there are significant differences
with regard to age, the share of foreigners, education, marriage status and family size,
characteristics which influence the propensity to receive DI. Among DI beneficiaries,
musculoskeletal conditions are more prevalent in treated regions. In the local sample,
balance improves considerably. Differences are small in magnitude and mostly insignificant.
People in treated regions are on average more likely to be from a foreign country. There are
about 2% more people with primary education in treatment regions, and correspondingly
fewer with secondary and university-level education. There is also a small difference in the
unemployment rate of about 0.8%. These remaining differences in observables are small
in economic terms and will not affect the estimates unless trends between treatment and
control regions differ.6
A potential concern is that individuals considering applying for disability benefits
anticipate the reform and move to regions where the pilot project is not implemented,
resulting in higher inflow in control regions and biased results. This can be dismissed for
several reasons. The pilot project was not announced publicly at the time. In addition,
the number of people moving to another region who can be identified by tracking the
panel cases is negligible. Between 1999 and 2011, about 3.1% of the people for whom
some panel information is available move to another canton, and less than 0.8%
move from a non-treated to a treated region. About 0.5% of those sampled during the
pilot period do so. These numbers suggest that mobility in Switzerland is relatively low.
Anecdotal evidence also shows that people hesitate to move across cantonal borders, so it is
unlikely that results are driven by strategic behavior.
Table 1: Disability incidence

                          Full sample                        Local sample (within 20 km)
                          (1)        (2)        (3)          (4)        (5)        (6)
Treat                     1.322***   1.322***   1.236***     1.151*     1.151*     1.148**
                          (0.088)    (0.088)    (0.075)      (0.086)    (0.086)    (0.077)
Pilot time                1.085      1.090      1.112        1.259*     1.269**    1.302**
                          (0.091)    (0.091)    (0.092)      (0.155)    (0.154)    (0.159)
Treat x pilot             0.856*     0.855*     0.860*       0.770**    0.771**    0.766**
                          (0.076)    (0.076)    (0.076)      (0.098)    (0.098)    (0.098)
Post time                            0.692***   0.733***                0.870      0.921
                                     (0.075)    (0.080)                 (0.160)    (0.171)
Treat x post                         0.971      0.970                   0.841      0.829
                                     (0.092)    (0.091)                 (0.123)    (0.122)
Other controls                                  X                                  X
N municipalities          2,337      2,338      2,338        1,086      1,087      1,087
N individuals             5,653      5,994      5,994        48,995     51,069     51,069
N failures                197        232        232          1,576      1,863      1,863
N failures during pilot   1,713      1,713      1,713        885        885        885
Note: Cox Proportional Hazard estimates for individuals in treated and control regions based
on SESAM individual-level survey and administrative data sampled during 1999–2011. Baseline hazard for all regressions stratified by 5-year birth cohorts. Survey weights applied for the
full sample. Observations in the local sample are weighted for pairwise estimation. Results are
reported in exponentiated form as hazard ratios. Standard errors clustered at the municipality
level in parentheses, number of observations given below. *, ** and *** denote significance at
the 10%, 5% and 1% level respectively.
5 Results
The main results are presented in Table 1, separately for the unrestricted and the local
sample. The first column for each sample considers only spells which are censored or result
in failure before the end of the pilot period in 2005, the remaining columns use all recorded
spells. The last column adds individual control variables, among them gender, education,
marital status, number of children and citizenship. All specifications stratify the baseline
hazard by five-year birth cohort intervals to account for cohort-specific differences in
health environment. Survey weights are applied in the full sample such that estimates are
representative of the Swiss population. Observations in the local sample are weighted for
pairwise estimation. All tables report hazard ratios, i.e. exponentiated coefficients and
corresponding standard errors.
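Because the tables report hazard ratios, an estimate translates into a percentage change in the hazard as (hazard ratio − 1) × 100. A minimal sketch of this conversion, applied to the Treat x pilot estimates in columns (1) and (4) of Table 1 (the function name is hypothetical):

```python
def percent_change(hazard_ratio: float) -> float:
    """Relative change in the hazard implied by a hazard ratio, in percent."""
    return (hazard_ratio - 1.0) * 100.0


# Treat x pilot hazard ratios from Table 1, columns (1) and (4).
for label, hr in [("full sample", 0.856), ("local sample", 0.770)]:
    print(f"{label}: {percent_change(hr):.1f}%")
```

A hazard ratio below one therefore corresponds to a reduction in DI inflow, and a ratio above one to an increase.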
The results indicate that the reform significantly reduced insurance inflow. The estimate
for the full sample implies a 15% reduction. The magnitude for the local sample is slightly
6
The spatial treatment assignment and sharp differences in treatment intensity at the district borders
would also allow for a discontinuity-based evaluation strategy. However, such an approach is not feasible in
this setting, as Swiss regions differ considerably in other measures, causing discontinuities in characteristics
such as taxes and regulations which would compromise identification. Since such differences are most
likely time-invariant, a difference-in-differences strategy is preferable. Additionally, spatial regression
discontinuity designs using only distance as a one-dimensional forcing variable suffer from the strong
implicit assumption of spatially constant potential outcomes (Keele and Titiunik 2013). In a strongly
decentralized, federal country such as Switzerland the assumption that potential outcomes are constant
along the same distance to different regional borders is not credible. Conversely, using a multi-dimensional
vector of coordinates as forcing variables (e.g. Papay et al. 2011) would potentially introduce measurement
error since topography and infrastructure can influence travel distance measures substantially.
Table 2: Disability types
                      All       Illness   Illness:  Illness:  Illness:  Accident  Congenital/
                                          Psych.    Nerve     MSK                 Other
                      (1)       (2)       (3)       (4)       (5)       (6)       (7)
Treatment region      1.151*    1.229**   1.185     1.100     1.245**   0.843     1.293**
                      (0.086)   (0.109)   (0.180)   (0.197)   (0.130)   (0.131)   (0.161)
Pilot time            1.269**   1.384**   1.450*    2.373*    1.412     0.900     0.794
                      (0.154)   (0.193)   (0.279)   (1.145)   (0.329)   (0.401)   (0.165)
Treat x pilot         0.771**   0.683***  0.700*    0.377**   0.633**   1.729     1.151
                      (0.098)   (0.101)   (0.146)   (0.160)   (0.146)   (0.712)   (0.247)
Post time             0.870     0.975     0.667     1.739     1.285     0.175***  1.223
                      (0.160)   (0.201)   (0.204)   (1.224)   (0.448)   (0.103)   (0.397)
Treat x post          0.841     0.733*    0.897     0.607     0.596**   6.436***  0.748
                      (0.123)   (0.126)   (0.203)   (0.261)   (0.157)   (2.654)   (0.172)
N municipalities      1,087     1,087     1,087     1,087     1,087     1,087     1,087
N individuals         51,069    51,069    51,069    51,069    51,069    51,069    51,069
N failures            1,863     1,524     650       119       451       161       353
N fail during pilot   885       753       352       61        210       59        149
Note: Cox Proportional Hazard estimates for individuals in treated and control regions based on
SESAM individual-level survey and administrative data sampled during 1999–2011. Baseline hazard
for all regressions stratified by 5-year birth cohorts. Observations are weighted for pairwise estimation. Results are reported in exponentiated form as hazard ratios. Standard errors clustered at the
municipality level in parentheses, number of observations given below. *, ** and *** denote significance at the 10%, 5% and 1% level respectively.
higher and corresponds to a 23% lower inflow rate. Both estimates are stable in magnitude
across specifications. Other results also reflect aggregate historic trends. The coefficient of
the pilot period indicator is significantly positive and indicates a general increasing trend
in inflow during that time, whereas the indicator for the post-reform period suggests a
decrease in incidence after 2004. These estimates are also reflected in aggregate data on
benefit receipt by the Federal Social Insurance Office, which show a historical peak in
2004 and decreasing inflow afterwards, possibly induced by follow-up reforms and court
verdicts on tightened eligibility criteria. The preferred specification for the remainder of
the paper is given in column (5), since adding covariates does not affect the results in a
notable way. The remaining analysis focuses on the local sample. Results for the full
sample are qualitatively similar.
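Because the tables report exponentiated coefficients, significance has to be judged on the log scale. The sketch below shows one way to do this under a stated assumption: that the reported standard errors of the hazard ratios were obtained by the delta method, so that the standard error of the underlying coefficient is approximately SE(HR) / HR. The paper does not state this explicitly, and the function name `hr_inference` is hypothetical.

```python
import math


def hr_inference(hr: float, se_hr: float, z_crit: float = 1.96) -> dict:
    """Recover log-scale inference from a reported hazard ratio.

    Assumes se_hr is a delta-method standard error, i.e. se_hr = hr * se_beta.
    """
    beta = math.log(hr)              # coefficient on the log-hazard scale
    se_beta = se_hr / hr             # delta-method back-transformation (assumption)
    z = beta / se_beta               # Wald statistic for H0: beta = 0 (HR = 1)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    lower = math.exp(beta - z_crit * se_beta)     # confidence bounds on the HR scale
    upper = math.exp(beta + z_crit * se_beta)
    return {"beta": beta, "se_beta": se_beta, "z": z,
            "p_value": p_value, "ci": (lower, upper)}


# Treat x pilot, local sample, Table 1 column (5): 0.771 (0.098).
main = hr_inference(0.771, 0.098)
print(f"z = {main['z']:.2f}, p = {main['p_value']:.3f}, "
      f"95% CI = ({main['ci'][0]:.3f}, {main['ci'][1]:.3f})")
```

Under this assumption the Treat x pilot estimate is significant at the 5% level but not at the 1% level, which matches the two stars reported in Table 1. The interval is built on the log scale and then exponentiated, so it is asymmetric around the hazard ratio.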
The duration analysis indicates that the reform was effective in reducing the number of
new beneficiaries. Since the regulatory conditions and the benefit requirements remained
unchanged, the results suggest that a sizeable number of DI applications may be affected by
moral hazard. If this is indeed the case, we would expect the effect to be most pronounced
for those types of illnesses which are difficult to diagnose and verify.
Table 2 investigates this by differentiating between different reasons for benefit award.
Results confirm that reductions in benefit awards occur only for hard-to-diagnose conditions.
Looking at columns (3) and (4), the effect is pronounced for psychological diseases and
illnesses related to nerve problems. Column (5) looks at the incidence of musculoskeletal
and bone diseases. This category includes a variety of conditions which are difficult to
verify (e.g. whiplash, back pain). The hazard ratio suggests a substantial reduction in
incidence as well, comparable in magnitude to the main effect. The specification in column
(6) looks at disability benefit awards due to handicaps incurred in accidents. The last
column considers disabilities due to congenital defects and other diseases. These conditions
are unlikely to be subject to moral hazard. Indeed, the estimates show no significant
reduction for these conditions, whose diagnosis improved screening is unlikely to change.
As previously noted, for some people pursuing a higher education track, the contributing
age may be mismeasured. Stratification by education level as reported in Table A3 also
serves as a robustness test if there are imbalances in educational attainment between
regions and measurement error is prevalent. Column (1) restricts the sample to individuals
who do not pursue higher education and therefore are likely to work and contribute at
age 21. Considering the direction of a possible bias due to measurement error, for those
pursuing tertiary education, the disability insurance entry age will be underestimated.
If the increase in tertiary education is lower in treated regions, this will bias estimates
upwards, as they are observed to enter DI earlier than they actually do (and vice versa).
However, when conditioning on lower tiers of education, the coefficient retains its magnitude
compared to the full sample, indicating that measurement error is unlikely to be a severe
issue. Individuals typically start their pension contributions on time, and work-induced
disability prevalence is especially low among the higher education tiers. Stratifying by
other variables reveals few notable effect heterogeneities.7
To assess the validity of the identifying assumption, I test the effect of a placebo
reform prior to the treatment period and assume a pseudo-treatment to be effective during
1999–2001. Results are shown in Table 3. Estimates across all specifications are close to
one and insignificant at conventional levels, supporting the identification strategy.
Another potential concern is that the results are sensitive to the choice of distance
window. Figure 2 addresses this issue by plotting treatment effect estimates across a large
set of bandwidths, using both actual travel distance and travel time as distance measures.
The coefficient of interest remains fairly stable in size and significant across a large set of
distances. The estimates consistently suggest about a 23% reduction in incidence in the
treatment group during the pilot program, although the results using travel time as the
distance measure are insignificant for larger thresholds. More detailed estimates are given
in Table A4 in the appendix.
Estimates for both samples are robust to model changes and very stable in magnitude.
About a 7% difference in incidence reduction between the local and the full sample persists
7
Table A3 looks into effect heterogeneity. Samples are stratified by education, resident status and
gender. Since the number of individuals and failures during the relevant period is smaller in the subsamples,
estimates are often imprecise. Since disability onset is often tied to a strenuous work history, incidence is
more pronounced among the lower tiers of education. The low number of failures for individuals with
university-level education reflects the fact that this group is generally less likely to claim DI. I also observe
that reductions in incidence occur more for Swiss residents compared to foreign citizens, and also for
females. However, these results should be taken with caution as estimates are all similar in magnitude
and precision is likely to be an issue.
Table 3: Placebo reform

Full sample
                          (1)        (2)        (3)        (4)
Treatment region          1.337***   1.337***   1.337***   1.248***
                          (0.097)    (0.097)    (0.097)    (0.085)
Prepilot time             1.235***   1.241***   1.241***   1.274***
                          (0.080)    (0.081)    (0.081)    (0.083)
Treat x prepilot          0.970      0.970      0.970      0.975
                          (0.084)    (0.083)    (0.083)    (0.084)
Pilot time                           1.320***   1.326***   1.390***
                                     (0.131)    (0.132)    (0.139)
Treat x pilot                        0.847*     0.846*     0.852*
                                     (0.082)    (0.081)    (0.082)
Post time                                       0.842      0.917
                                                (0.102)    (0.113)
Treat x post                                    0.960      0.961
                                                (0.098)    (0.098)
Other controls                                             X
N municipalities          2,336      2,337      2,338      2,338
N individuals             5,450      5,653      5,994      5,994
N failures                149        197        232        232
N fail during pilot       0          1,713      1,713      1,713
N fail during prepilot    1,950      1,950      1,950      1,950

Local sample (within 15 min.)
                          (5)        (6)        (7)        (8)
Treatment region          1.150      1.150      1.150      1.148
                          (0.109)    (0.109)    (0.109)    (0.101)
Prepilot time             1.204      1.213      1.213      1.253*
                          (0.157)    (0.158)    (0.158)    (0.163)
Treat x prepilot          0.999      0.999      0.999      0.996
                          (0.127)    (0.128)    (0.128)    (0.127)
Pilot time                           1.514***   1.525***   1.612***
                                     (0.218)    (0.219)    (0.236)
Treat x pilot                        0.770*     0.771*     0.765*
                                     (0.107)    (0.107)    (0.107)
Post time                                       1.046      1.142
                                                (0.213)    (0.237)
Treat x post                                    0.841      0.829
                                                (0.124)    (0.123)
Other controls                                             X
N municipalities          1,086      1,086      1,087      1,087
N individuals             47,508     49,044     51,119     51,119
N failures                1,229      1,576      1,863      1,863
N fail during pilot       0          885        885        885
N fail during prepilot    989        989        989        989
Note: Cox Proportional Hazard estimates for individuals in treated and control regions based on SESAM
individual-level survey and administrative data sampled during 1999–2011. Baseline hazard for all regressions
stratified by 5-year birth cohorts. Survey weights applied for the full sample. Observations in the local sample
are weighted for pairwise estimation. Results are reported in exponentiated form as hazard ratios. Standard errors clustered at the municipality level in parentheses, number of observations given below. *, ** and *** denote
significance at the 10%, 5% and 1% level respectively.
through all variations. To shed light on the differences between the local and the full
sample, I estimate a Probit model for the probability of being included in the local sample,
separately for treated and control regions. Table A5 presents the results. The local treated
sample closely resembles the rest of the treated region. However, the local control sample
is quite different from the rest of the control population. It has a higher share of foreigners
(about 10% at the mean), more women and more well-educated individuals. All of these factors
contribute to a lower overall incidence and are likely to drive the difference in results.
6
Conclusion
The results indicate that the reform was effective in decreasing insurance inflow. Effect
magnitudes are substantial and indicate a reduction of the incidence rate of up to 23%. Since
the reform only affected screening quality but did not change the eligibility requirements,
two different mechanisms can be driving this result.
In theory, it is possible that screening institutions commit a large number of type-II
false negative errors, i.e. reject applicants that are actually deserving. Due to the highly
specialized training of the responsible medical staff and their protective public mandate,
it is unlikely that this is driving the entire effect. More likely, a substantial share of
applications is actually rejected because the applicants are truly undeserving. Since only
the provision of information available to the case workers deciding on the application is
Figure 2: Distance windows
[Figure: two panels, (a) Travel distance (km) and (b) Travel time (min). Each panel plots the hazard ratio estimate with 95% confidence bounds against distance windows from 10 to 120.]
Note: Treatment effect estimates and 95% confidence bounds for different distance windows measured
using actual travel distance and travel time.
improved, this implies that a non-negligible fraction of applications can be attributed to
moral hazard, given people are not fully myopic regarding their eligibility status when
applying for insurance benefits. The relevance of moral hazard is also confirmed by the
fact that incidence is only reduced for those conditions which are difficult to verify and
thus most likely to be affected, corroborating the results of Campolieti (2006). Separating
the two effects and tracing the exact mechanisms remains a promising pursuit for further
research.
Given these results, medical screening services appear to be an effective tool in curbing
inflow rates into disability insurance by reducing the extent of moral hazard. Targeting
is improved substantially, as benefits are now awarded at a higher rate to those who
are strictly deserving. Since external institutions that review applications appear to be
effective in the Swiss setting, they might provide a viable policy option for other countries
burdened by high disability insurance costs. In the long run, screening will help reduce
costs and stabilize the financial situation of public insurance systems, which face
increasing deficits in many countries.
References
Autor, D., Duggan, M. and Gruber, J. (2012). Moral hazard and claims deterrence
in private disability insurance. Working Paper 18172, National Bureau of Economic
Research.
Autor, D. H. and Duggan, M. G. (2003). The rise in the disability rolls and the decline in
unemployment. The Quarterly Journal of Economics 118(1), 157–205.
BSV (2012). Statistiken zur sozialen Sicherheit – IV-Statistik 2011. Bundesamt für
Sozialversicherungen.
Campolieti, M. (2006). Disability insurance adjudication criteria and the incidence of
hard-to-diagnose medical conditions. Contributions to Economic Analysis & Policy 5(1),
Article 15.
Campolieti, M. and Riddell, C. (2012). Disability policy and the labor market: Evidence
from a natural experiment in Canada, 1998–2006. Journal of Public Economics 96(3–4),
306–316.
Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical
Society. Series B (Methodological) 34(2), 187–220.
Eugster, B. and Parchet, R. (2011). Culture and taxes: Towards identifying tax competition.
Cahiers de Recherches Economiques du Département d’Econométrie et d’Economie
politique (DEEP) 11.05, Université de Lausanne, Faculté des HEC, DEEP.
Frölich, M. and Lechner, M. (2010). Exploiting regional treatment intensity for the
evaluation of labor market policies. Journal of the American Statistical Association
105(491), 1014–1029.
Gruber, J. (2000). Disability insurance benefits and labor supply. Journal of Political
Economy 108(6), 1162–1183.
Gruber, J. and Kubik, J. D. (1997). Disability insurance rejection rates and the labor
supply of older workers. Journal of Public Economics 64(1), 1–23.
de Jong, P., Lindeboom, M. and van der Klaauw, B. (2011). Screening disability insurance
applications. Journal of the European Economic Association 9(1), 106–129.
Karlström, A., Palme, M. and Svensson, I. (2008). The employment effect of stricter rules
for eligibility for DI: Evidence from a natural experiment in Sweden. Journal of Public
Economics 92(10–11), 2071–2082.
Keele, L. and Titiunik, R. (2013). Geographic boundaries as regression discontinuities.
Working paper, University of Michigan.
Korn, E. L., Graubard, B. I. and Midthune, D. (1997). Time-to-event analysis of longitudinal
follow-up of a survey: Choice of the time-scale. American Journal of Epidemiology
145(1), 72–80.
Kreider, B. (1998). Workers’ applications to social insurance programs when earnings and
eligibility are uncertain. Journal of Labor Economics 16(4), 848–877.
Kreider, B. (1999). Latent work disability and reporting bias. Journal of Human Resources
34(4), 734–769.
Lechner, M. (2010). The estimation of causal effects by difference-in-difference methods.
Foundations and Trends in Econometrics 4(3), 165–224.
Mitra, S. (2009). Disability screening and labor supply: Evidence from South Africa.
American Economic Review 99(2), 512–516.
Papay, J. P., Willett, J. B. and Murnane, R. J. (2011). Extending the regression-discontinuity
approach to multiple assignment variables. Journal of Econometrics
161(2), 203–207.
Parsons, D. O. (1991). Self-screening in targeted public transfer programs. Journal of
Political Economy 99(4), 859–876.
Rubin, D. B. (1977). Assignment to treatment group on the basis of a covariate. Journal
of Educational and Behavioral Statistics 2(1), 1–26.
Staubli, S. (2011). The impact of stricter criteria for disability insurance on labor force
participation. Journal of Public Economics 95(9-10), 1223–1235.
Thiébaut, A. C. M. and Bénichou, J. (2004). Choice of time-scale in Cox’s model analysis of
epidemiologic cohort data: a simulation study. Statistics in Medicine 23(24), 3803–3820.
Wapf, B. and Peters, M. (2007). Evaluation der regionalen ärztlichen Dienste. Beiträge
zur Sozialen Sicherheit, Bericht im Rahmen des mehrjährigen Forschungsprogramms zu
Invalidität und Behinderung, Forschungsbericht Nr. 13/07.
Appendix: Tables
Table A1: Descriptive statistics

Full sample
                                    Mean      SD        Min     Max        N
All individuals
Age                                 50.308    18.033    18.0    104.0      259,599
Female                              0.539     0.498     0.0     1.0        259,599
Married                             0.552     0.497     0.0     1.0        259,599
Foreign                             0.322     0.467     0.0     1.0        259,599
Nr. of children                     0.582     0.973     0.0     7.0        259,599
Education: Primary                  0.234     0.423     0.0     1.0        259,599
Education: Secondary                0.510     0.500     0.0     1.0        259,599
Education: Tertiary                 0.255     0.436     0.0     1.0        259,599
Gross annual earnings               41.428    107.204   0.0     42,317.4   259,599
Travel distance (km)                34.318    31.841    0.2     194.1      259,599
Travel time (min)                   31.426    23.177    0.6     169.5      259,599
Unemployed                          0.027     0.163     0.0     1.0        259,599
Receives DI                         0.035     0.185     0.0     1.0        259,599
Region
Léman                               0.192     0.394     0.0     1.0        259,599
Mittelland                          0.194     0.396     0.0     1.0        259,599
Nordwestschweiz                     0.136     0.343     0.0     1.0        259,599
Zürich                              0.166     0.372     0.0     1.0        259,599
Ostschweiz                          0.122     0.328     0.0     1.0        259,599
Zentralschweiz                      0.107     0.310     0.0     1.0        259,599
Tessin                              0.083     0.275     0.0     1.0        259,599
DI recipients
Years in DI                         9.416     6.846     0.0     48.0       9,208
Disability: Psych. problems         0.341     0.474     0.0     1.0        9,208
Disability: Nerve                   0.072     0.259     0.0     1.0        9,208
Disability: Musculoskeletal cond.   0.235     0.424     0.0     1.0        9,208
Disability: Other                   0.160     0.367     0.0     1.0        9,208
Disability: Accident                0.092     0.289     0.0     1.0        9,208

Local sample (within 20 km)
                                    Mean      SD        Min     Max        N
All individuals
Age                                 49.505    17.778    18.0    102.0      56,609
Female                              0.533     0.499     0.0     1.0        56,609
Married                             0.581     0.493     0.0     1.0        56,609
Foreign                             0.308     0.462     0.0     1.0        56,609
Nr. of children                     0.638     1.014     0.0     7.0        56,609
Education: Primary                  0.238     0.426     0.0     1.0        56,609
Education: Secondary                0.530     0.499     0.0     1.0        56,609
Education: Tertiary                 0.232     0.422     0.0     1.0        56,609
Gross annual earnings               42.964    188.397   0.0     42,317.4   56,609
Travel distance (km)                7.260     2.930     0.2     17.3       56,609
Travel time (min)                   9.915     3.101     0.6     15.0       56,609
Unemployed                          0.026     0.160     0.0     1.0        56,609
Receives DI                         0.034     0.182     0.0     1.0        56,609
Region
Léman                               0.091     0.288     0.0     1.0        56,609
Mittelland                          0.208     0.406     0.0     1.0        56,609
Nordwestschweiz                     0.266     0.442     0.0     1.0        56,609
Zürich                              0.164     0.370     0.0     1.0        56,609
Ostschweiz                          0.114     0.318     0.0     1.0        56,609
Zentralschweiz                      0.157     0.364     0.0     1.0        56,609
Tessin                              0.000     0.004     0.0     1.0        56,609
DI recipients
Years in DI                         9.086     6.547     0.0     46.0       1,948
Disability: Psych. problems         0.325     0.468     0.0     1.0        1,948
Disability: Nerve                   0.069     0.254     0.0     1.0        1,948
Disability: Musculoskeletal cond.   0.249     0.433     0.0     1.0        1,948
Disability: Other                   0.163     0.369     0.0     1.0        1,948
Disability: Accident                0.098     0.297     0.0     1.0        1,948
Note: Descriptive statistics for the unrestricted and the local estimation sample. Based
on the 1999–2011 SESAM data, restricted to eligible individuals (age 18–65).
Table A2: Pre-treatment covariate balance

Full sample
                          Total            Treated          Control          Difference
All individuals
Age                       48.34 (18.28)    47.74 (18.83)    48.66 (17.95)    -0.926*** (0.309)
Female                    0.54 (0.50)      0.55 (0.52)      0.54 (0.49)      0.009 (0.009)
Married                   0.52 (0.50)      0.58 (0.52)      0.50 (0.49)      0.078*** (0.009)
Foreign                   0.09 (0.29)      0.12 (0.34)      0.08 (0.26)      0.043*** (0.005)
Nr. of children           0.56 (0.98)      0.66 (1.08)      0.51 (0.91)      0.142*** (0.018)
Education: Primary        0.21 (0.41)      0.23 (0.44)      0.20 (0.39)      0.028*** (0.007)
Education: Secondary      0.59 (0.49)      0.59 (0.52)      0.60 (0.48)      -0.010 (0.009)
Education: Tertiary       0.20 (0.40)      0.19 (0.41)      0.21 (0.40)      -0.019*** (0.007)
Gross annual earnings     36.09 (48.35)    35.36 (50.57)    36.49 (47.10)    -1.135 (0.877)
Travel distance (km)      28.69 (27.22)    43.02 (37.62)    20.90 (15.95)    22.125*** (0.506)
Travel time (min)         27.80 (20.27)    37.15 (27.79)    22.72 (12.98)    14.434*** (0.378)
Unemployed                0.02 (0.12)      0.02 (0.14)      0.01 (0.11)      0.005 (0.002)
Receives DI in 2001       0.04 (0.20)      0.04 (0.20)      0.04 (0.20)      -0.005 (0.004)
DI recipients
Years in DI               7.90 (6.94)      7.64 (7.48)      8.03 (6.62)      -0.391 (0.646)
Entry age                 43.11 (11.69)    44.20 (13.25)    42.55 (10.84)    1.654 (1.142)
DI: Psych. problems       0.29 (0.46)      0.27 (0.49)      0.30 (0.43)      -0.028 (0.043)
DI: Nerve                 0.11 (0.31)      0.09 (0.31)      0.12 (0.31)      -0.033 (0.029)
DI: MSK                   0.21 (0.41)      0.27 (0.49)      0.18 (0.37)      0.089** (0.041)
DI: Other illness         0.21 (0.41)      0.21 (0.45)      0.21 (0.38)      -0.002 (0.039)
DI: Accident              0.10 (0.30)      0.09 (0.31)      0.11 (0.30)      -0.025 (0.029)
N
All individuals           15,522           5,983            9,539
DI recipients             506              207              299

Local sample (within 15 min.)
                          Total            Treated          Control          Difference
All individuals
Age                       48.55 (18.56)    48.53 (10.61)    48.68 (40.06)    -0.153 (0.605)
Female                    0.55 (0.50)      0.55 (0.29)      0.54 (1.08)      0.009 (0.016)
Married                   0.52 (0.50)      0.53 (0.29)      0.51 (1.08)      0.021 (0.016)
Foreign                   0.13 (0.34)      0.14 (0.20)      0.11 (0.67)      0.027*** (0.010)
Nr. of children           0.57 (0.98)      0.57 (0.56)      0.59 (2.15)      -0.023 (0.035)
Education: Primary        0.24 (0.43)      0.24 (0.25)      0.22 (0.89)      0.024* (0.014)
Education: Secondary      0.58 (0.49)      0.58 (0.28)      0.60 (1.06)      -0.021 (0.016)
Education: Tertiary       0.18 (0.39)      0.18 (0.22)      0.18 (0.84)      -0.004 (0.012)
Gross annual earnings     34.19 (45.81)    33.93 (26.26)    35.59 (97.48)    -1.658 (1.444)
Travel distance (km)      10.28 (4.80)     10.26 (2.74)     10.42 (10.35)    -0.158 (0.150)
Travel time (min)         13.25 (5.24)     13.22 (3.00)     13.46 (11.23)    -0.240 (0.165)
Unemployed                0.02 (0.14)      0.02 (0.08)      0.01 (0.25)      0.008** (0.004)
Receives DI in 2001       0.04 (0.19)      0.04 (0.11)      0.04 (0.43)      -0.004 (0.008)
DI recipients
Years in DI               7.41 (6.68)      7.67 (3.71)      6.09 (12.70)     1.582 (0.967)
Entry age                 45.05 (11.51)    45.26 (6.23)     43.99 (24.99)    1.270 (2.271)
DI: Psych. problems       0.29 (0.45)      0.27 (0.24)      0.35 (1.01)      -0.081 (0.091)
DI: Nerve                 0.11 (0.32)      0.11 (0.18)      0.11 (0.66)      0.004 (0.051)
DI: MSK                   0.23 (0.42)      0.26 (0.24)      0.12 (0.69)      0.136** (0.064)
DI: Other illness         0.19 (0.40)      0.20 (0.22)      0.17 (0.79)      0.034 (0.063)
DI: Accident              0.08 (0.27)      0.06 (0.13)      0.19 (0.83)      -0.129 (0.090)
N
All individuals           8,570            2,367            6,203
DI recipients             280              70               210
Note: Means of selected covariates for individuals in treated and control regions sampled between 1999–2001, prior to the
pilot period. Separate statistics for all individuals and those within a distance of 20 kilometers in border regions. Standard
deviation in parentheses. The last column in each block shows the difference between treated and control individuals for
each variable, standard error in parentheses. Survey weights applied for the full sample. Observations weighted for pairwise
differences in the local sample. *, ** and *** denote significance at the 10%, 5% and 1% level respectively.
Table A3: Effect heterogeneity
                      Education             Resident status       Gender
                      Prim./Sec. Tertiary   Foreign    Local      Male       Female
                      (1)        (2)        (3)        (4)        (5)        (6)
Treat                 1.125      1.438**    1.144      1.143      1.071      1.238**
                      (0.091)    (0.204)    (0.119)    (0.104)    (0.096)    (0.119)
Pilot time            1.316**    1.141      1.343*     1.209      1.335      1.198
                      (0.173)    (0.387)    (0.230)    (0.197)    (0.245)    (0.182)
Treat x pilot         0.779*     0.670      0.803      0.721*     0.797      0.752*
                      (0.105)    (0.200)    (0.132)    (0.122)    (0.137)    (0.118)
Post time             0.978      0.515      0.681      1.010      0.593**    1.257
                      (0.187)    (0.274)    (0.204)    (0.238)    (0.138)    (0.333)
Treat x post          0.849      0.736      0.943      0.787      1.262      0.565***
                      (0.133)    (0.239)    (0.205)    (0.139)    (0.210)    (0.123)
N municipalities      1,085      1,031      981        1,087      1,075      1,076
N individuals         38,340     12,729     17,331     33,738     23,704     27,365
N failures            1,649      214        722        1,140      936        927
N fail during pilot   772        113        409        476        461        424
Note: Cox Proportional Hazard estimates for individuals in treated and control regions
based on SESAM individual-level survey and administrative data sampled during 1999–
2011. Baseline hazard for all regressions stratified by 5-year birth cohorts. Observations
are weighted for pairwise estimation. Results are reported in exponentiated form as hazard ratios. Standard errors clustered at the municipality level in parentheses, number of
observations given below. *, ** and *** denote significance at the 10%, 5% and 1% level
respectively.
Table A4: Distance windows

Travel distance (km)
                      10 km      15 km      20 km      25 km      30 km
Treatment region      1.13       1.20**     1.15*      1.16**     1.20***
                      (0.12)     (0.11)     (0.09)     (0.08)     (0.08)
Pilot time            1.29       1.38**     1.27**     1.25**     1.25**
                      (0.23)     (0.19)     (0.15)     (0.14)     (0.14)
Treat x pilot         0.75*      0.71**     0.77**     0.78**     0.78**
                      (0.13)     (0.10)     (0.10)     (0.09)     (0.09)
Post time             0.92       0.91       0.87       0.82       0.84
                      (0.26)     (0.20)     (0.16)     (0.14)     (0.14)
Treat x post          0.79       0.83       0.84       0.85       0.85
                      (0.17)     (0.15)     (0.12)     (0.12)     (0.11)
N municipalities      549        825        1,087      1,286      1,414
N individuals         24,266     40,914     51,069     56,259     61,743
N failures            861        1,520      1,863      2,043      2,292
N fail during pilot   332        612        885        980        1,063

Travel time (min)
                      10 min     15 min     20 min     25 min     30 min
Treatment region      1.040      1.13       1.18*      1.16*      1.09
                      (0.152)    (0.13)     (0.11)     (0.10)     (0.10)
Pilot time            1.468      1.43**     1.30*      1.32**     1.20
                      (0.343)    (0.24)     (0.18)     (0.17)     (0.16)
Treat x pilot         0.741      0.66***    0.73**     0.76**     0.81
                      (0.153)    (0.10)     (0.10)     (0.10)     (0.12)
Post time             1.088      0.87       0.79       0.80       0.80
                      (0.344)    (0.22)     (0.15)     (0.14)     (0.12)
Treat x post          0.996      0.85       0.86       0.90       0.94
                      (0.255)    (0.19)     (0.14)     (0.14)     (0.12)
N municipalities      372        649        922        1,159      1,371
N individuals         16,209     29,617     47,876     55,155     62,658
N failures            576        1,066      1,782      2,044      2,312
N fail during pilot   180        379        811        961        1,087
Note: Cox Proportional Hazard estimates for individuals in treated and control regions across various distance windows
from the border. Based on SESAM individual-level survey and administrative data sampled during 1999–2011. Observations are weighted for pairwise estimation. Results are reported in exponentiated form as hazard ratios. Standard errors
clustered at the municipality level in parentheses, number of observations given below. *, ** and *** denote significance at
the 10%, 5% and 1% level respectively.
Table A5: Determinants of local sample
                        Full sample   Treated       Control
                        (1)           (2)           (3)
Age                     −0.0004*      −0.0008***    0.0002
                        (0.0002)      (0.0002)      (0.0003)
Female                  0.0040        −0.0080       0.0159***
                        (0.0054)      (0.0063)      (0.0057)
Married                 −0.0115       0.0093        −0.0320*
                        (0.0165)      (0.0181)      (0.0181)
Foreign                 0.0175        −0.0360       0.1117***
                        (0.0258)      (0.0270)      (0.0210)
Nr. of children         −0.0030       0.0037        −0.0050
                        (0.0041)      (0.0033)      (0.0049)
Education: Secondary    0.0195***     0.0041        0.0189**
                        (0.0068)      (0.0066)      (0.0083)
Education: Tertiary     0.0373        0.0008        0.0490**
                        (0.0228)      (0.0240)      (0.0239)
N                       259,323       117,701       141,622
Note: Probit estimates for the probability to be included in the local
sample separately for treated and control regions. Marginal effects
at the mean reported. Standard errors clustered at the municipality
level in parentheses, number of observations given below. *, ** and
*** denote significance at the 10%, 5% and 1% level respectively.
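As a reminder of what a marginal effect at the mean is, the following minimal Python sketch computes it for a probit index. All coefficients and covariate means below are hypothetical placeholders, not the estimates behind Table A5, and the function names are illustrative. The formula shown applies to continuous regressors; for binary regressors, software often reports a discrete change instead, and the paper does not say which convention was used.

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def marginal_effects_at_mean(intercept: float, coefs: dict, means: dict) -> dict:
    """Probit marginal effects evaluated at the covariate means.

    For P(y = 1 | x) = Phi(intercept + x'b), the derivative with respect to
    x_j at the mean is phi(intercept + xbar'b) * b_j.
    """
    index = intercept + sum(coefs[name] * means[name] for name in coefs)
    scale = normal_pdf(index)
    return {name: scale * b for name, b in coefs.items()}


# Hypothetical probit coefficients and sample means, for illustration only.
coefs = {"age": -0.002, "foreign": 0.40}
means = {"age": 48.0, "foreign": 0.10}

effects = marginal_effects_at_mean(intercept=-0.5, coefs=coefs, means=means)
for name, value in effects.items():
    print(f"{name}: {value:+.4f}")
```

Each marginal effect is the probit coefficient scaled by the normal density at the mean index, so all effects share the sign of their coefficient and are bounded in absolute value by about 0.4 times the coefficient.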