The Trade Effects of Skilled versus Unskilled Migration

The Trade Effects of Skilled versus Unskilled
Migration∗
Peter H. Egger
ETH Zurich
Maximilian von Ehrlich
Douglas R. Nelson
University of Bern
Tulane University
September 29, 2014
Abstract
In this paper, we assess the role of skilled versus unskilled migration for bilateral trade. Using a large data-set on bilateral skill-specific migration and
a flexible novel identification strategy, the impact of different levels of skilled
and unskilled immigration on the volume and structure of bilateral imports
is identified in a quasi-experimental design. We find evidence of a polarized
impact of skill-specific immigration on imports: highly concentrated skilled or
unskilled immigrants induce higher import volumes than a balanced composition of the immigrant base. This effect turns out particularly important when
institutions are weak. Regarding the structure of imports, we observe that
skilled immigrants specifically add to imports in differentiated goods. Both
bits of evidence are consistent with a segregation of skill-specific immigrant
networks and corresponding trade patterns.
Keywords: Skilled vs. unskilled immigration; Migrant networks; Bilateral trade; Quasi-randomized experiments; Generalized propensity score estimation
JEL classification: C14; C21; F14; F22
∗
We thank the participants of the CEPR Workshop on the Economics and Politics of Immigration in Turino as well as the participants of the 2nd TEMPO Conference on International Migration
in Vienna for numerous helpful comments.
Contact information: Egger: [email protected]. Egger is also co-affiliated with CEPR, CESifo,
Ifo, and WIFO; von Ehrlich: [email protected]; Nelson: [email protected];
Nelson and von Ehrlich are co-affiliated with CESifo
1
Introduction
Empirical work on the role of bilateral migration for bilateral trade typically relies
on three paradigms: (i) that the skill structure of migration is irrelevant for the
impact on trade;1 (ii) that the functional form of the impact of migration on trade
is linear or log-linear;2 and (iii) that migration is exogenous conditional on a linear
function of other determinants of bilateral trade.3 See Gaston and Nelson (2011) for
a survey of the literature on the nexus between migration and trade. A violation of
(i) would result in a heterogeneous impact of migration on trade, depending on the
skill-specific composition of migration, and this impact heterogeneity is masked by
the approach usually adopted. A violation of (ii) would mean that the estimates of
the trade-response to migration might be meaningless to the extent that marginal
changes in migration might have largely different effects on trade, depending on the
level of migration in the outset. A violation of (iii) would entail that bilateral migration is not exogenous conditional on a linear function of control variables, leading
to inconsistent estimates of the relationship between migration and trade. Inter alia
but not only, the latter results from a non-log-linear impact of the determinants of
trade on trade in general (see Anderson and van Wincoop, 2003; Santos Silva and
Tenreyro, 2006) and a non-log-linear impact of migration on trade in particular.
This paper aims at relaxing the three aforementioned paradigms by employing
a flexible econometric approach which permits skill-specific migration to be an endogenous determinant of trade while allowing for a flexible, nonparametric functional
form of the relationship. This is done on the basis of a so-called generalized propensity score framework for continuous endogenous variables (treatments; skill-specific
bilateral immigration in our case) and continuous outcome (bilateral imports).
Exploiting this flexibility and using two large cross-sections of bilateral stocks
of skilled and unskilled immigrants and bilateral import flows for 1990 and 2000,
respectively, we are able to provide novel evidence on the effect of skill-specific immigration on bilateral imports. In particular, the findings suggest that a concentration
on skilled or unskilled immigrants leads to a bigger response of bilateral imports
1
A number of the papers on trade and migration have considered different levels of skill and
found that skilled immigrants are strongly associated with trade creation, though intermediate and
low levels of skill seem to have no such relationship (see Felbermayr and Jung, 2009; Hatzigeorgiou,
2010; Javorcik et al., 2011; Felbermayr and Toubal, 2012). This work tended to assume a (log)linear relationship between trade and skill-specific immigrantion and a random assignment (i.e.,
exogeneity) of immigration.
2
A handful of papers considered a non-linear functional form. See Head and Ries (1998),
Wagner et al. (2002); Bryant et al. (2004) for parametric evidence and Egger et al. (2012a) for
non-parametric evidence, focusing on total rather than skill-specific immigration.
3
Egger et al. (2012a) permit total immigration to be endogenous in a nonparametric framework.
2
than a mixed composition of the same level of total immigration. Moreover, an
improvement of the institutional quality in a migrant’s origin country displays the
biggest positive effect on bilateral imports if, at the same time, the mix of immigrants is skill-wise relatively balanced. We rationalize these findings in a number of
ways, but they are consistent, in particular, with the presence of (at least partially)
segmented skilled and unskilled migration networks and their effectiveness for trade
in different domains of goods. Finally, the nonlinear impact of the two types of
immigration on import volume and specific categories thereof suggests that the insights gained in this paper could not easily be derived in the framework of log-linear
gravity equations as often used to analyze bilateral migration stocks or flows.
The remainder of the paper is organized as follows. Section 2 introduces the
concept of generalized propensity score estimation with multiple (or multivariate)
treatments for inference of the impact of skilled versus unskilled immigration on
bilateral imports. Section 3 introduces the data used for inferences and associated
descriptive statistics. Section 4 summarizes the results regarding the causal impact
of bilateral skilled versus unskilled immigration on bilateral international imports,
including an analysis on the structure of imports and Section 5 provides a rationalization of those findings against the background of economic theory and earlier
research on the nexus between migration and trade. The last section concludes with
a summary of the most important findings.
2
2.1
Econometric approach
General outline
For our purposes, we need a methodology that is able to handle two determinants
of bilateral trade (i.e. skilled and unskilled immigration) which are (i) continuous
and endogenous, and (ii) exhibit an impact on outcome (bilateral imports) whose
functional form is unknown. In many earlier empirical studies, bilateral migration
may be an endogenous regressor in the model of bilateral trade for two main reasons.
First, the set of determinants of bilateral trade flows is specified in a relatively
parsimonious way so that an exclusion of joint determinants of bilateral migration
and trade flows is likely, whereby bilateral migration is correlated with the residual.
This entails a classical omitted variables bias. Second, the functional form of the
relationship between bilateral migration and trade is not necessarily log-linear as is
typically assumed. With a log-nonlinear functional form, log bilateral migration is
again correlated with the residual in the trade model which generates endogeneity.
Econometric theory offers a range of remedies to the problem of endogeneity.
Of those, instrumental variable and control function estimation rely on the idea
3
of splitting endogenous explanatory variables into the endogenous part and the
exogenous part, the former being correlated and the latter being uncorrelated with
unobservable determinants of outcome (the residual; see Wooldridge, 2010). In linear
models, instrumental variable and control function regressions are two sides of the
same coin: while the former replace endogenous variables with their fitted values
from a first-stage regression in the outcome model so that the residual term there
includes the unobservable determinants of outcome as well as the endogenous part of
the explanatory variable(s), the latter adds the first-stage residual(s) as explanatory
variable(s) in the outcome regression in addition to the original endogenous variables.
In the present context, suppose that a researcher wanted to run the regression
M = W β + u, where M and W are vectors of log bilateral imports and log bilateral
immigration, respectively, and u is a vector of disturbances that contains all the
other determinants of bilateral imports. Obviously, β is the unknown parameter of
interest. With instrumental variables and control function estimation, one assumes
that immigration can be linearly decomposed in two components: W = [Wu , Wc ]
where Wu and Wc are the components which are uncorrelated and correlated with
u, respectively. The task is to estimate Wu and Wc and to either estimate the
regression by plug-in instrumental variable estimation, M = Ŵu βIV + uIV , or by
control function estimation, M = W βCF + Ŵc γCF + uCF . Notice that Ŵu is the
prediction of the first-stage model for W , while Ŵc is the residual from the same
regression. The term Ŵc γCF is a control function that absorbs the asymptotic bias
of the parameter of interest such that βCF is consistent. The same idea prevails for
nonlinear models.
An important difference between instrumental variables and control function estimation pertains to the set of instruments. Instrumental variable estimation requires
at least as many instruments as there are endogenous and exogenous regressors in
the outcome equation. These instruments need be exogenous (uncorrelated with all
possible unobservable determinants of treatment and outcome) and they need to be
relevant (be correlated with treatment) while not directly affecting outcome. With
suitable instruments, the parameters of interest in the outcome equation can be
estimated consistently. Control function estimation requires to condition on all relevant factors (instruments) that determine outcome and treatment simultaneously
so that outcome is independent of treatment status after conditioning on the instruments. With nonlinear models, the plug-in instrumental variable estimator is not
generally feasible while the control function approach is, since nonlinearities in the
model do not permit simply replacing endogenous variables by their fitted values
from first-stage regressions (see Terza et al., 2008). Notice that also nonparametric
instrumental variables estimation would build on a control function approach (see
Newey et al., 1999). The approach employed here only differs from nonparametric
4
instrumental variable estimation with regard to the underlying assumptions about
the instruments.
2.2
Notation and formal model outline
Let us denote the cross-sectional units of observation (here, country-pairs) by i =
1, ..., N and economic outcome (in the present context, the value of bilateral goods
imports) by Mi . The goal of this paper is to determine the causal impact of skilled
and unskilled immigration and their interaction on Mi .
For econometric identification, let us think of the levels of skilled and unskilled
immigration as potentially endogenous treatments. Yet, unlike in most of the evaluation literature in econometrics, these treatments are not binary but continuous.4
Let us denote specific potential treatment levels of skilled and unskilled migrants
by s, u. Moreover, denote the lower and upper bounds of these potential treatment
levels as {s0 , u0 } and {s1 , u1 }, respectively. The potential treatment levels are then
associated with sets of potential treatment levels of S ∈ [s0 , s1 ], U ∈ [u0 , u1 ], respectively.5 While potential treatment levels of migrants are denoted by lower-case
letters, corresponding realized treatment levels of country-pair i will be denoted by
upper-case letters, Si , Ui .
For each country-pair i, we may now define the set of potential outcomes in
terms of a unit-level dose-response function for ` = s, u as Mi (`) ≡ Mi (s, u) for
s ∈ S, u ∈ U and the corresponding average dose-response function as µ(`) ≡
µ(s, u) = E[Mi (s, u)].
We adopt a reduced-form specification to model immigration of any kind as a
function of a vector of covariates, Zi , determining Si and Ui . Suppose Si and Ui
were determined in a structural model by the respective vectors of covariates XSi
and XU i such as
Si = f (XSi , δS ) + εSi ;
Ui = f (XU i , δU ) + εU i ,
4
See Lee (2005) or Wooldridge (2010) for an extensive discussion of models with binary endogenous treatments. Also, see Lechner (2001) for econometric models with multiple binary endogenous
treatments. Finally, see Hirano and Imbens (2005) or Imai and van Dyk (2004) for a generalization
of propensity score estimation with univariate continuous treatments. The novelty of the approach
adopted in this paper is the generalization of the concept of generalized propensity scores for
multiple continuous treatments.
5
Obviously, the proposed generalized propensity score approach is in no way restricted to two
treatments only. Egger and von Ehrlich (2013) provide a proof and a Monte Carlo study to illustrate
identification issues with one, two, and more endogenous continuous treatments. The question at
stake calls naturally for a bivariate approach, which is also better amenable to graphical illustration
than higher-dimensional models.
5
where f (·) are flexible functions, δS and δU are unknown parameters, and εSi and εU i
are disturbances. Then, we could think of Zi as the joint set of exogenous regressors
or independent columns in [XSi , XU i ]. The reduced-form representation of the model
may then be written as
Si = f (Zi , γS ) + νSi ;
Ui = f (Zi , γU ) + νU i
(1)
with γS and γU being unknown vectors of reduced-form parameters and νSi and νU i
the corresponding disturbances.
We observe Zi , the continuous treatments Si and Ui , and outcome Mi (Si , Ui ) and
assume that they are suitably measurable. Denote any possible vector of exogenous
covariates determining treatment by z and define the bivariate conditional joint
density of s, u given z as
g(s, u, z) = fSi ,Ui |Zi (s, u|z).
Then, the generalized propensity score (GPS) is defined as
Gi = g(Si , Ui , Zi )
with the property that
Zi ⊥ 1{Si = s, Ui = u} |g(s, u, Zi ) .
The latter states that the probability of the observed treatments being equal to some
potential treatment combination {s, u} is independent of the covariates in Zi once
we condition on the GPS. Accordingly, the treatment status is independent of the
outcomes conditional on the GPS under the assumptions stated in Subsection 2.3.
For our identification strategy, this implies that we need to condition only on one
scalar, namely Gi for unit i, in order to remove the selection bias in the unconditional
impact of skilled and unskilled immigration on imports instead of all covariates in
the vector Zi . This allows a maximum of flexibility regarding the functional form of
the import response to skilled and unskilled immigration.
2.3
Assumptions
For identification, we have to assume unconfoundedness (or conditional mean independence) as stated in Rosenbaum and Rubin (1983) for the binary propensity score
and, more specifically, weak unconfoundedness as in Hirano and Imbens (2005) and
Imai and van Dijk (2004) for the generalized propensity score with a univariate, continuous treatment. The latter means that conditional independence holds at each
value of treatment without requiring joint independence of all potential outcomes.
6
Assumption 1. Weak unconfoundedness
Mi (s, u) ⊥ Si , Ui |Zi ∀s ∈ S, u ∈ U.
In our setup, the untestable assumption of weak unconfoundedness means that,
conditional on the vector of covariates Zi , the potential outcome Mi is independent
of the treatment status in the two treatment dimensions Si , Ui .
Assumption 2. Balancing of the covariates
The balancing of covariates is a testable assumption, and it means that the
functional form of the density of the residual treatment levels is appropriately chosen,
in the sense that there is sufficient overlap between observations with different levels
of treatment Si , Ui but similar levels of both the covariates in Zi and the GPS, Gi .
Hence, for all strata of the GPS the probability of the treatments (i.e. levels of
skilled and unskilled immigration) does not depend on the values of Zi . We will
illustrate in Section 4.1 that the data comply with this assumption.
2.4
Implementation and parametrization
The implementation of the approach proceeds in five steps. In Step 1, we estimate
reduced-form parametric first-stage models determining Si and Ui . The parametrization of these models is a polynomial form which is linear in parameters such that
f (Zi , γS ) = Zi γS and f (Zi , γU ) = Zi γU in (1). Accordingly, (1) is estimated by
ordinary least squares. The polynomial order of the terms included in Zi is chosen
according to an information criterion (the Akaike information criterion). In Step 2,
we utilize the estimated residuals from Step 1, (νSi , νU i ), assume a functional form
for the density function, namely bivariate normality, and compute the GPS as
bi =
G
1
p
exp
2πb
σSS σ
bU U 1 − ρb2
n
−
h νb
1
νbU i
2b
νSi νbU i ρbio
Si
+
−
,
2 (1 − ρb2 ) σ
bSS σ
bU U
σ
bSS σ
bU U
(2)
where σ
bSS and σ
bU U are the variances of the disturbances in the equation for Si
and Ui , respectively and ρb is the estimated covariance between those disturbances
(see, e.g., Greene, 2011, for a general treatment of bivariate normals). With the
treatments Si and Ui measured in logarithmic terms, the normality assumption is
approximately met.
In Step 3, we ensure common support, which means comparing only observations
with similar levels of predicted skill-specific immigration but different realizations
of the treatments. The GPS represents an objective criterion to select comparable
units. Hence, we restrict our analysis to the common support of Ĝi . In the binary
7
treatment case this is straightforward as there are only two types of observations
– treated and untreated – while with continuous treatments we have to discretize
the sample into treatment groups. As a benchmark, we split the data into four
groups in Si -Ui space where we denote the group an observation i belongs to by
Qi ∈ {1, 2, 3, 4}. Following Hirano and Imbens (2005), Kluve et al. (2012), and,
in particular, Flores et al. (2012) who all study problems with a single treatment
variable, we evaluate the GPS at the median treatment level of each treatment group
q, independently of whether the observation belongs to the group or not. This yields
for each observation four scalar values Ĝqi , where q = 1, ..., 4. Finally, we compare
the distribution of Ĝqi for observations that belong to the respective group, i.e., with
Qi = q and observations that do not belong to that group Qi 6= q. An observation
i satisfies the common support criterion for group q, if
h
i
q
q
q
q
q
Ĝi ∈ max{min{j:Qj =q} Ĝj , min{j:Qj 6=q} Ĝj }, min{max{j:Qj =q} Ĝj , max{j:Qj 6=q} Ĝj } .
(3)
Only those observations that are comparable across all groups are kept for the
analysis such that each observation has to fulfill the common support criterion (3)
for all groups q simultaneously. Naturally, the common-support sample depends on
the number of groups we choose for the discretization. We report the analysis with
four and nine groups.6
Step 4 is concerned with assessing the validity of the balancing assumption. As
argued by Imai at al. (2008), individual tests of the balancing assumption may be
misleading such that it is important to pursue different testing strategies. Hirano
and Imbens (2005) suggest splitting the sample along two lines. First, use again
the groups in Si -Ui space as defined in Step 3. Second, split the sample of observations within each group in Ĝi -space into blocs. These blocs are determined by the
GPS evaluated at the median treatment level of the respective group Ĝqi . In the
benchmark specification with four groups, we use sixteen blocs such that observations in bloc 1 of Ĝqi are those with a very high propensity for receiving the median
treatment intensity of group q. With an identical number of blocs per group, there
is approximately the same number of observations per bloc and group. We then
conduct t-tests about the equivalence of the averages in the data of each linear term
of the covariates. For instance, we take observations with a similar level of GPS
evaluated at the median treatment level of group q, Ĝqi , and compare the covariates
between observations Qi = q and Qi 6= q. This is done for all groups and blocs
and provides information on whether the covariates differ significantly across groups
once we condition on the GPS. Hence, in the conditional comparison, we estimate
6
In the working paper version of this paper we report also results for a 16 group common-support
sample (see Egger et al., 2012b).
8
16 t-values per group and covariate and then calculate the average over blocs. Of
course, the sample sizes differ slightly across groups and blocs so that test statistics
should be adjusted. This is done by weighting the data properly by the number of
observations used in each bloc. A sufficient balancing of the covariates is reached,
if these t-tests turn out insignificant. Our second test of the balancing assumption
follows Imai and van Dyk (2004) (this test is implemented in a similar fashion by
Kluve et al., 2012). For this, we regress each covariate on the treatment variables
with and without conditioning on the distribution of the GPS. In the conditional
models we evaluate the GPS at the quartiles of Si and Ui . The test essentially
represents a more flexible version of the aforementioned approach by Hirano and
Imbens (2005). Ideally, if the GPS absorbs all information relevant for treatment
assignment, the covariates Zi are uncorrelated with Si and Ui , once we condition on
the distribution of the GPS.
The last step pertains to estimating parametric and nonparametric regressions of
the second-stage, outcome equation. The parametric model specifies E(Mi |Si , Ui , Gi ) =
Hi λ as a function which is linear in parameters. The vector Hi depends on a polynomial function of the terms {Si , Ui , Gi } only (apart from a constant). As with
the first-stage models in Step 1, the polynomial order is determined by the Akaike
information criterion and the chosen regression approach is ordinary least squares.
The parameter estimates λ̂ from this regression are neither interpretable nor of interest themselves. The reason is that E(Mi |Si , Ui , Gi ) represents only a so-called
unit-level dose-response function, where the marginal effect of {Si , Ui } depends on
the level of Zi and Gi . What one would like to know is an average dose-response
function, where the marginal effect of treatment is independent of Zi . The latter can
be obtained as follows. First, one considers the range of treatment levels of interest,
[s0 , s1 ] and [u0 , u1 ]. Second, one discretizes these intervals into bins, calculates all
possible values Si −s and Ui −u and employs these instead of ν̂Si and ν̂U i in the GPS
in Step 2.7 Suppose one distinguishes 40 bins of equal size in [s0 , s1 ] and [u0 , u1 ]
each. This leads to 402 = 1, 600 combinations of {s, u} and, hence, as many different
values of Ĝi (s, u) as well as of E(Mi |Si − s, Ui − u, Ĝi (s, u)). One then computes
the average of E(Mi |Si − s, Ui − u, Ĝi (s, u)) for each of the 1, 600 combinations of
{s, u} across all observations N . Each one of the averages is now independent of the
country-pair-specific level of Zi . The average dose-response function in this example
is the relationship between all 1, 600 levels {s, u} and average associated (potential)
outcome (log bilateral imports), E[M (s, u)].
Alternatively, one may model the unit-level dose-response function by a nonparametric estimator. A flexible approach to estimate such a model is to employ a
7
Note that one keeps σ̂SS , σ̂U U , and ρ̂ constant at their originally estimated levels for this
exercise.
9
multivariate local linear estimator based on an Gaussian product kernel with bandwidth h (see Fan and Gijbels, 1996). We obtain predictions for E[M (s, u)] from the
estimate α̂ of the local linear estimator:
N
X
arg min
[Mi − α − λ1 (Si − s) − λ2 (Ui − u) − λ3 (Ĝi − Ĝi (s, u))]2 K h ,
α,λ
(4)
i=1
where E[M (u, s)] = α̂ and K h = K h (Ui − u)K h (Si − s)K h (Ĝi − Ĝi (s, u)).
Obviously, the critical parameter for the nonparametric estimation of the unit-level
dose-response function is the bandwidth. Conventional procedures for rule-of-thumb
bandwidth selection are constrained to univariate local polynomial regression (see
Fan and Gijbels, 1996). Therefore, we implement a leave-one-out cross-validation
procedure of optimal bandwidth selection.8
Both the parametric polynomial and even more so the nonparametric versions
of the avg. dose-response function are relatively flexible and they can best be visualized graphically by way of three-dimensional plots. The confidence bounds of
the parametric and nonparametric estimators can be estimated by bootstrapping
all estimation steps together, including the common support restriction, the firststage estimation of Ĝi (s, u), and the second-stage regressions (according to Efron
and Tibshirani, 1993, 200 replications should be sufficient for this).
Apart from the avg. dose-response function, the avg. treatment-effect function
may be of interest. With two continuous, endogenous potential treatment levels s
and u, there are two such avg. treatment-effect functions, and they correspond to
dE[M (s, u)]/ds and dE[M (s, u)]/du. They can be estimated by taking the respective
derivatives of the unit dose-response function and subsequent averaging over all
observations. Again, the confidence intervals of the avg. treatment-effect functions
have to be bootstrapped.
3
Data and descriptive statistics
The variables entering our analysis encompass direct drivers of bilateral imports
as well as direct drivers of skilled and unskilled bilateral immigration. All variables entering the analysis may be broadly grouped into dependent variables and
independent variables.
8
Due to the immense computational intensity of a three dimensional cross-validation we apply
the same optimal bandwidth in all three dimensions. This comes only at a small loss of precision
since we standardize all three variables before implementing the cross-validation and estimating
the local linear regressions by their respective standard deviation.
10
3.1
Dependent variables
The dependent variables are bilateral import flows (which we also refer to as outcome) and bilateral stocks of skilled and unskilled immigrants (which we refer to
as treatments). The goal will be to determine the impact of (endogenous) bilateral
skilled and unskilled immigration treatments on bilateral import outcome. Bilateral
skill-specific immigration data are available for the years 1990 and 2000 from Docquier et al. (2009) for OECD destination countries and from Artuc et al. (2014)
for non-OECD destination countries. This dictates the sample coverage of the data
in the empirical analysis. In the benchmark, we define skilled immigrants as those
with at least a secondary level of education. Note that our results are invariant
to an alternative aggregation with skilled immigrants characterized by at least tertiary education. As the outcome, we use data on bilateral imports from the United
Nations’ Comtrade Database for the average year within the period 1991-95 and
2001-05, respectively. Accordingly, we consider immigration and imports for two
time periods. In order to determine the effect of immigration on the structure of
imports we distinguish between homogeneous and differentiated goods according to
the classification by Rauch (1999). We assign sectors to the respective classifications
using the Standard International Trade Classification (SITC) three-digit data. This
obtains an additional dependent variable measuring the ratio between differentiated
and homogeneous goods imports denoted by MiD /MiH .
– Table 1 here –
Altogether, we cover 98 countries of origin and 29 (65) OECD (Non-OECD)
countries of residence of migrants and trade between those countries in our analysis
(see the Appendix for a list). Table 1 provides some descriptive features of the
data on the dependent variables covered in our analysis. We measure immigration
as well as imports in logarithmic terms such that our empirical analysis captures
only country pairs characterized by positive values of both bilateral imports and
immigration of skilled and unskilled.
3.2
Independent variables
As described in the previous section, we have to employ independent variables –
determinants of bilateral import flows as well as skilled and unskilled immigration
stocks as elements in the vector Zi – in order to remove the selection bias in an
assessment of the impact of the two types of immigration Si and Ui on bilateral
imports Mi of country-pair i (see Felbermayr and Jung, 2009; Mayda, 2010; Felbermayr et al., 2010; Felbermayr and Toubal, 2012; for important determinants of
bilateral migration).
11
Generally, the vector of observables Zi includes both continuous and multi-valued
discrete variables for both exporters/countries of origin and importers/countries of
residence in a flexible 3rd-order polynomial functional form and binary variables in a
linear functional form.9 Specifically, we include a parametric (polynomial) function
of exporter/origin- and importer/residence-specific log GDP, log GDP per capita,
and log population to account for effects of economic market size, per-capita income,
and population size in a fairly flexible way. These variables are taken from the World
Bank’s World Development Indicators 2009. Moreover, we control for origin- and
residence-country GINI coefficients, unemployment rates, life expectancies, fertility
rates, literacy rates, and real exchange rates between residence and origin countries as measures of unemployment risk, inequality, and economic well-being beyond
per-capita income. These variables come from the World Bank’s World Development Indicators 2009, the CIA World Factbook, and United Nations’ Educational,
Scientific, and Cultural Organization (UNESCO).
Furthermore, we control for bilateral distance between residence and origin countries as a continuous pair-i-specific geographical determinant of immigration and
imports as well as for common language, colonial relationship, common religion,
goods-trade-agreement membership, services-trade-agreement membership, bilateral migration impediments, OECD membership, and membership in the Warsaw
Pact as pair-specific binary geographical, cultural, political, and economic control variables. All geographical variables are based on information from the Centre d’Études Prospectives et d’Informations Internationales’ (CEPII) geographical
database. Trade agreement indicator variables as well as data on migration impediments are based on information from the World Trade Organization (WTO).
Also, we include a number of other control variables for both the country of
origin and residence, capturing political and institutional factors that may be relevant for both migration and trade. First of all, we include measures of the originand residence-specific number of armed conflicts from the Peace Research Institute
at Oslo, the corruption perception index from Transparency International, political
freedom (the POLITY-IV index) from the Center for Systemic Peace (see Marshall
et al., 2011), and a number of measures determining labor market features from the
International Labor Organization (ILO): the bargaining power index, the working
condition index, the worker discrimination index, and the child labor index. Second, we capture several dimensions of institutional quality from the World Bank’s
Governance Indicators.10 All covariates in Zi are measured prior to the endogenous
9
Binary variables enter linearly in order to keep the degree of multicollinearity of the model
reasonably low. In general, we will use acronyms of the variables for the sake of brevity in tables.
These acronyms are defined in Table 2.
10
While we only include country-pair-specific variables, it should be noted that the results are
12
variables Mi , Si , and Ui in each of the two periods.11
– Table 2 here –
Table 2 provides information on moments of the data for all (first-order) independent variables akin to both blocs of Table 1.
4
4.1
Results
Multivariate GPS estimation, common support, and the
balancing property
We include all main effects of the covariates listed in Table 2 together with quadratic
and cubic terms of all non-binary regressors in the regressions explaining Si and Ui .12
Altogether, there are 104 explanatory variables in the two equations.
We report parameter estimates and standard errors for the 3rd-order polynomial
reduced-form model specifications for Si and Ui based on 104 regressors. The coefficient signs are impossible to interpret due to the nonlinear form of the first-stage
regressions. However, it is generally the case with propensity score estimation that
the specific parameters and marginal effects of covariates in the treatment equation(s) are of subordinate interest. For completeness, the parameters and standard
errors of the regressors are summarized in Table 3. What matters is that the joint
contribution of the covariates to the variances of Si and Ui is decently large, and
that they are balanced (for given estimated values of the GPS).
– Table 3 here –
robust to the inclusion of third-country effects with regard to all of the origin- and residencecountry-specific covariates on treatment (see the working paper version of this manuscript: Egger
et al., 2012b). The inclusion of the latter would serve the purpose of accounting for interdependence
of origin and residence countries in supplying and attracting migrants (see Anderson, 2011).
11
Trade agreement membership, corruption perception, and ILO’s indices on labor market conditions are measured in 1990 and 2000. OECD membership, geographical, cultural, historical
variables, and bilateral migration impediments are time-invariant. All other elements of Zi are
measured as averages over 1986-90 and 1996-00 for the two periods covered, respectively.
12
The 3rd-order polynomial model specification is not arbitrary. We did a selection of the optimal
order of the polynomial on grounds of the Akaike information criterion searching across models
which involve 1st-order up to 5th-order polynomials. Based on this search, we selected the 3rdorder polynomial version as the preferred one, since no substantial decrease in the Akaike criterion
could be achieved when choosing a higher-order polynomial.
13
Indeed, the regressions do feature a decent predictive power with adjusted R2 s
of about 0.72 and 0.80 for Si and Ui , respectively. Hence, the covariates are jointly
highly relevant. Models based on 1st-order or 2nd-order would have achieved lower
tuples of adjusted R2 s of (0.64, 0.72) and (0.70, 0.76), respectively. Based on the
estimates in Table 3, one may compute the GPS explicitly by assuming bivariate
normality of the disturbances as in (2). With the treatments measured in logarithmic
terms, the normality assumption is approximately met. What remains to be enforced
is common support of the GPS as in Step 3 in Section 2.4, and what ought to be
checked is whether the common support assumption is tenable as outlined in Step 4
in Section 2.4. We do this for four and nine groups (called Qi above), respectively, in
Si -Ui -space and with 16 and 8 blocs, respectively. Enforcing common support with
four and nine groups leads to samples of 2,525 and 1,352 observations, respectively.
We illustrate the distribution of t-statistics about the equivalence of the averages of
each linear term of the covariates in Table 3 across groups and illustrate them in
Figures 1 and 2 for four and nine groups, respectively, where Panels A refer to the
unconditional covariate comparison and Panels B to the comparison conditioning
on the GPS.
– Figures 1 and 2 and Table 4 here –
Two insights can be gained from an inspection of the two figures. First, the
histogram plots illustrate that a large mass (namely 60% and 47% for the four and
nine group common support samples) of the t-values of unconditional comparisons
lie outside of the [−2.576, +2.576] interval which (approximately) indicates significance levels of less than one percent. When taking t-statistics in absolute terms, the
interquartile ranges amount to [1.68, 6.46] and [1.18, 4.35] with four and nine groups,
respectively. By way of contrast, only a very small mass of the distribution of tvalues of conditional comparisons (on blocs of the GPS) lie outside of that interval
(0.5 percent of the t-values are bigger than 2.576 in absolute terms). The interquartile range of absolute t-values for the conditional comparisons amount to [0.26, 0.80]
and [0.22, 0.71]. This is evident from the much more narrow-waisted distribution
of conditional-comparison t-statistics around zero relative to the unconditionalcomparison t-statistics in Panels A and B of Figures 1 and 2. Second, the mean and
median values of the absolute t-statistics – which we report at the bottom of the
two figures – are much smaller once we condition on the GPS. For instance, among
the unconditional-comparison t-statistics, the average absolute values are 4.86 and
3.32 and the median values are 3.30 and 2.45. Among the conditional-comparison
t-statistics, the average absolute values are 0.60 and 0.54 and the median values are
0.48 and 0.44. This illustrates that conditioning on the GPS is extremely powerful in
the data, independent on how stringent a common support we require. Conditioning
14
on the GPS and enforcing common support substantially raises the comparability of
country pairs in the dimensions of interest. Hence, we hypothesize that there is little
chance that the observable variables included in the empirical model explaining Si
and Ui confound the impact of Si and Ui on bilateral imports, Mi .
The alternative balancing test based on Imai and van Dyk (2004) and Kluve et
al. (2012) supports this conclusion, as can be seen from Table 4. The unconditional
models feature significant correlations between treatments and covariates where the
mean t-statistic ranges from 3.09 to 5.10 for the two common support samples and
treatments S and U . Hence, ex ante there is a severe selection issue. In contrast,
after conditioning on the GPS, the reduction of the t-values is remarkable as the
maximum t-statistic across all conditional models is 2.14 and the mean over all
covariates ranges from 0.15 to 0.60 for the conditional models. These results confirm
that the control function performs well in solving the selection issue with regard
to observable information. Hence, we may proceed to estimate the dose-response
function by means of parametric and nonparametric estimators as outlined in Step
5 of Section 2.4.
4.2
Parametric estimates of the multivariate dose-response
and treatment-effect functions
Utilizing the GPS as a compact (scalar-function) balancing score to reduce dramatically the endogeneity bias of Si and Ui in determining Mi by invoking the underlying
assumptions, we propose the parametric second-stage regression
E(Mi |Si , Ui , Gi ) = Hi λ
Hi = [1, Si , Ui , Si2 , Ui2 , Si Ui , Ĝi , Ĝ2i , Si Ĝi , Ui Ĝi , Si2 Ĝ2i , Ui2 Ĝ2i ]. (5)
This regression serves to predict the unit dose-response function. The parameters
are estimated by ordinary least squares and the standard errors are estimated by
a bloc-bootstrap procedure (with 200 replications) in order to respect two features:
first, that each unit i is observed in two years (1990 and 2000) so that the variancecovariance matrix may have i-specific blocs and, second, that (5) involves estimates
Ĝi rather than true GPS scores Gi . We ensure the common support criterion within
each bootstrap replication such that the sample size may vary slightly across bloc
bootstrap draws. The regression results corresponding to (5) are summarized in
Table 5.
– Table 5 here –
15
The point estimates of the second-stage regression are not immediately informative with regard to the causal effect of immigration on imports. However, the
significance of the terms provides information on the severeness of the endogeneity
issue and the performance of the propensity score in absorbing information that determines treatment and outcome. The main effects and interactive terms involving
Ĝi are jointly highly significant (the F-statistic on all terms involving Ĝi is 14.98),
which is a strong indication of selectivity across different levels of Si and Ui . Hence,
the GPS is relevant and helps reducing the bias of the estimated response of (log)
bilateral imports (Mi ) to changes in (log) bilateral immigration of the skilled (Si )
and the unskilled (Ui ). Controlling for observable information affecting selection
into treatment, we still observe that skilled and unskilled immigration have a substantial impact on imports as is evident from the joint significance of the S and U
terms. In order to judge the direction and the magnitude of the effects, we compute
the avg. dose-response function for a 40 · 40 grid in s − u-space which is governed
by the observed range [min(Ui ); max(Ui )], [min(Si ); max(Si )]. We utilize the parameters summarized in Table 5 to compute
N
1 X
(hi λ̂)
E[M (s, u)] =
N i=1
(6)
hi = [1, s, u, s2 , u2 , su, Ĝi (u, s), Ĝ2i (u, s), sĜi (u, s), uĜi (u, s), s2 Ĝ2i (u, s), u2 Ĝ2i (u, s)].
In Figure 3 we illustrate the avg. dose-response function as computed from (6)
based on four groups and note that the result for nine groups is very similar. The
plot contains three areas which indicate significant positive (in blue), insignificant
positive (light blue), and insignificant negative (yellow) responses. The significance
statements in the figure are based on 90% confidence intervals.
– Figures 3 and 4 here –
A key insight from Figure 3 is that import volume is not maximized at the diagonal where skilled and unskilled immigration reach similar levels but at the edges
where the immigration stock is either dominated by skilled or by unskilled individuals. Hence, our results suggest that bilateral import flows are stimulated mostly
by homogeneous immigrant communities while a heterogeneous mix between skilled
and unskilled immigrants yields ceteris paribus a lower import volume. According
to the results, there is a statistically significant (at 5%) positive level of log imports for almost any form of bilateral immigration. However, Figure 3 suggests that
16
the import-maximizing immigration treatment corresponds to a polarization of immigrant types, irrespective of whether unskilled or skilled immigration dominates.
Note that the range of observed skilled-unskilled immigration combinations does not
support all of the cells in the figures. However, the polarization result is found also
in the Si -Ui -subspace that is supported by the data.
The polarization result becomes even more evident from inspecting the so-called
avg. treatment-effect functions plotted in Figure 4. Dark blue areas correspond
to significant positive and red to significant negative predictions while light blue
and yellow mark insignificant positive and negative predictions, respectively. The
treatment-effect functions represent the partial derivatives of the dose-response function with respect to the two types of treatment. Panels A and B in Figure 4 correspond to dE[M (s, u)]/du and dE[M (s, u)]/ds, respectively. Consider the avg.
treatment-effect function of unskilled immigrants: starting from approximately the
diagonal in Panel A of the figure, a marginal increase in u yields a significantly positive increase in the treatment effect on import volume. In contrast, the marginal
effect of an increase in unskilled immigration u is insignificant or even negative if the
country pair’s point of origin is skewed towards relatively more skilled than unskilled
immigrants. The reverse holds true if we consider the treatment effect for skilled
immigration dE[M (s, u)]/ds in Panel B: a marginal skilled immigrant induces imports only if a country pair has a Ui -Si combination with relatively more skilled
than unskilled immigrants.
Hence, we find evidence for a polarized impact of skill-specific immigration on
imports at the diagonal of the skilled-unskilled immigration space. Moreover, we
can reject the null hypothesis of a positive marginal effect of skilled immigration
in areas where unskilled immigration dominates while we can reject it for unskilled
immigration in areas with predominantly skilled immigration levels.
4.3
Nonparametric estimates of the multivariate dose-response
and treatment-effect functions
The above results may depend to some extend on the functional form assumptions
in (5). We performed sensitivity checks with a number of higher order polynomials
which confirm our polarization result. Yet, an ultimate test requires a fully nonparametric specification of the dose-response and treatment-effect functions as described
in (4) and in the implementation Step 5 in Section 2.4. The nonparameric estimates
of the avg. dose-response and the avg. treatment-effect functions are illustrated
in Figures 5 and 6, again using the common support sample with four groups as a
reference point and noting that results are robust to using nine groups.
17
– Figures 5 and 6 here –
In contrast to the parametric results in Section 4.2 the estimated surfaces display a higher degree of nonlinearity but the general v-shape around the diagonal is
found again and is clearly visible. Bilateral imports are maximized at either a high
concentration of skilled or unskilled immigrants while a balanced mix of the two
groups yields a lower level of bilateral imports. A noticeable difference to Figure 3
is the asymmetry of the nonparametric surfaces at the edges with full polarization.
According to the nonparametric avg. dose-response functions, a concentration of
high-skilled immigrants leads to relatively more imports than a concentration of
unskilled immigrants, but we estimate for either of them a higher level of imports
than at the diagonal.
For the nonparametric avg. treatment-effect functions illustrated in Figure 6
we employ a local quadratic estimator as it is generally advisable to choose an odd
difference between the polynomial order of the local regressions and the degree of the
derivative (this is preferable in terms of bias reduction). It is evident that positive
(blue) treatment effects with regard to unskilled immigration dE[M (s, u)]/du are
confined to the areas with relatively more unskilled than skilled immigrants. On the
other side of the diagonal, we estimate mainly negative (yellow or red) values for
dE[M (s, u)]/du. A similar pattern evolves for the treatment effect with regard to
skilled immigration dE[M (s, u)]/ds: positive (blue) treatment effects are confined
to s − u combinations with a dominating stock of skilled immigrants. As a general
feature, the nonparametric estimates come at a lower degree of precision in terms
of their confidence intervals. However, the nonparametric avg. dose-response and
treatment-effect functions confirm the parametric findings: an immigrant mix which
is skewed towards either the skilled or the unskilled maximizes bilateral imports as
compared to a balanced mix of skilled and unskilled immigrants.
4.4
Immigration effects on the structure of imports
A question of interest to the matter is whether the composition of immigrants has
consequences for the composition of bilateral imports. Are the effects of skilled and
unskilled immigrants driven by specific product groups? Since an investigation at
the very disaggregated product level is not feasible for reasons of presentation, we
resort to an analysis at the level of aggregates of product classes. A widely accepted
way of grouping products is the one proposed by Rauch (1999; the so-called Rauch
classification), which distinguishes between differentiated products, homogeneous
products, and an intermediate category. Hence, each single observation on aggregate bilateral import flows may be split into the corresponding three sub-aggregates.
18
Rauch (1999) offers two classification schemes, one dubbed liberal and one conservative. Since the results turn out virtually identical for the two schemes, let us focus
on the liberal classification, here. Specifically, we consider the ratio between differentiated and homogeneous bilateral imports MiD /MiH as an alternative outcome to
total import volume. Since the nonparametric estimation strategy described above
is preferable in terms of flexibility, we focus the analysis in this section to nonparametric estimates based on the four-group common-support sample.13 We illustrate
the corresponding finding in Figure 7.
– Figure 7 here –
The figure suggests that the ratio between differentiated and homogeneous goods
imports, MiD /MiH , is unambiguously increasing in the stock of skilled immigrants,
while for unskilled immigrants the opposite is true. Hence, the two types of immigrants seem to stimulate different types of bilateral imports in terms of Rauch’s
categories. While the aggregate level of imports ceteris paribus rises with a high
concentration of either immigrant type, a high level of bilateral imports in differentiated goods seems to be facilitated by a high concentration of skilled immigrants.
Hence, the pattern identified in Figure 7 is consistent with the results in Rauch
(1999), Rauch and Trindade (2002), Briant et al. (2009), Tai (2009), Felbermayr et
al. (2010), Peri and Requena-Silvente (2010), Felbermayr and Toubal (2012), and
Genc et al. (2012), who found that the share of high skilled migrants is strongly associated with imports of differentiated goods and goods traded on organized markets,
but less so with goods associated with reference prices.
4.5
Effects of migration to OECD versus non-OECD resident countries on imports
Our benchmark estimates focus on migration from OECD and non-OECD countries
of origin to OECD destination countries. Recent research by Artuc et al. (2014)
made bilateral information on migration among non-OECD countries as well as migration from OECD to non-OECD countries available. While South-South migration
is as common as South-North migration, it is largely driven by other factors. As
pointed out by Ratha and Shaw (2007) South-South migration tends to take place
predominantly between countries with contiguous borders and with relatively small
differences in income. In many instances it occurs not for economic but for political reasons, especially due to conflicts and wars. The latter shows in the fact that
13
Note that parametric estimates along the lines described in Section 4.2 yield very similar
results. The findings are also robust to using the nine-group common-support sample.
19
most refugees and asylum seekers migrate among developing countries. Accordingly,
the first-stage coefficients on the observable determinants of bilateral immigration
in our estimation approach should be allowed to differ for South-South and other
migration instances. Moreover, we expect the measurement error to be more pronounced with data reported from developing (southern) countries of residence, no
matter of whether North-South or South-South migration is concerned. Therefore,
we decided to estimate the effects separately for the sample of non-OECD residence
countries. We again follow the approach outlined in Section 2 in estimating the avg.
dose-response and treatment-effect functions. For the sake of brevity, we only report
the avg. dose-response function, though, and provide it in Figure 8.
– Figure 8 here –
It is apparent from the figure that the predicted import response to combinations
of skilled and unskilled immigrants for non-OECD residence countries is very similar
to the benchmark sample based on OECD residence countries. Hence, the polarized
effect of immigration by skill-type on imports appears to be generic and independent
of the sample.
4.6
Effects of immigration as well as of emigration on imports
A further extension concerns the role of bilateral emigration. Suppose that emigrants
from one country to another one engage in business activity with their country of
origin. This may give rise to an additional import response which could bias the
insights gained in the benchmark analysis which focused on immigration. In order
to address this point, we specify total emigration as third endogenous treatment
beyond Si and Ui as a function of the same determinants as in Table 3, estimate the
GPS, Gi , based on a trivariate normal, using the residuals of the three first-stage
models, impose the common support condition as before, and estimate the avg. doseresponse function from a second stage model which is based on three treatments:
Si , Ui , and log bilateral emigration (apart from the GPS). Notice that this avg.
dose-response function adjusts for potential differences of emigration across country
pairs, while the one in the outset did not. However, since we are not primarily
interested in the (in comparison to immigration relatively more indirect) effect of
emigration on imports per se, integrate its effect out (i.e., we average over the level
of emigration) and display the results for the avg. dose-response function in Si -Ui space as before. Clearly, while such an analysis is not informative about the impact
of emigration as such, the avg. dose-response function for skill-specific immigration
20
may not be biased due to an omission of emigration. We display the corresponding
avg. dose-response function in Figure 9.
– Figure 9 here –
Two observations stand out. First, the shape of the avg. dose-response function
and, accordingly, the polarization result remains unaffected by the consideration
of bilateral emigration. Second, computing the difference of the predicted import
responses with and without controlling for emigration reveals that the effect increases
slightly on average. This suggests that the positive correlation between bilateral
immigration and emigration leads to some mis-attribution of import responses to
immigration, where emigration is the cause. However, the magnitude of the potential
mis-attribution does not vary with skilled relative to unskilled immigration so that
the qualitative pattern of effects is the same as in the outset.
5
An interpretation of the polarization result of
immigration on imports and some further evidence on the role of institutions
This section’s aim is to provide one interpretation of the paper’s main finding of
a by-skill-type polarized effect of immigration on imports based on the literature
on the the structure and the functioning of social networks. The argument that
immigration affects trade through network effects as such is not new (Rauch, 2001).
Distinct aspects of those networks discussed in that literature are two: one is the
role of networks in mediating the economic relationship between two dense networks
(representing the populations of residence and source countries); and the other one
is the internal structure of the networks that do the spanning. Following Burt (2005,
2009), network research often refers to these two dimensions as brokerage and closure. Brokerage involves building connections across groups to increase exposure to
diverse opinion and practice. Brokerage is associated with growth and innovation.
Closure involves strengthening connections within a group to focus the group on a
limited set of opinions and practice. Closure is associated with trust and alignment,
ultimately enhancing efficiency. Closure plays a particularly central role in dealing
with institutional failures and asymmetric information problems. Work on closure
starts from the problems of contracting in certain types of goods or environments.
The idea is that, in the absence of relatively complete contracts and/or effective legal
environments, the risk of loss due to opportunistic behavior is sufficiently high that
21
many mutually beneficial contracts would not be made in the absence of some alternative source of assurance. These factors have been emphasized in explaining the
role of ethnic networks and diasporas in the organization of trade across political
jurisdictions or, more generally, in the absence of effective protection of contractual/property rights (Polanyi, 1957, 1968; Geertz, 1963, 1978; Cohen, 1969, 1971;
Bonacich, 1973; Curtin, 1984). Accordingly, disasporas have been found to play
a particularly strong role in mediating trade links in the context of institutional
problems and asymmetric information problems. Moreover, earlier work suggests
that the internal structure (ethnic homogeneity or, more broadly, social proximity)
of groups plays an important role in dealing with contracting problems (see Blalock,
1967; Bonacich, 1973; Landa, 1981, 1994; Greif, 1989, 1991, 2006; Iyer and Jon,
1999; Burt, 2005, 2009).
We argue that the pronounced effect of polarized bilateral immigration on bilateral imports may be suggestive of a decisive role played by skill-specific networks
and, in particular, the formation of stronger networks among immigrants with more
homogeneous skills than heterogeneous ones. Immigrants are naturally brokers between their source country and the host country in which they settle. That is, they
carry information about commodities available in the source country to consumers
in the host country, and vice versa. This will tend to increase demand for products of the immigrant home market in their new host country (see Gould, 1994, for
an early argument along those lines; the results in Head and Ries, 1998; Wagner
et al., 2002; Bryant et al., 2004; and Egger et al., 2012a; are consistent with this
view). In particular, existing evidence on a diminishing import-fostering effect of rising immigration and the larger role of immigrants for differentiated (arguably more
information-intensive) goods trade is consistent with the information link effect of
immigrant networks.
Our results suggest that, by comparison to migrant communities with a wide
range of skills, migrations characterized by a strong concentration of a given skill
group will form more effective networks, generate “better” bridges, and thus produce
a stronger link between migration and trade. Networks play an essential role in the
location choices of emigrants. Because networks reduce the costs of migration, it
is well-known that immigrants drawn from well-defined sending regions tend to go
to equally well-defined locations in the receiving country (see, e.g., Taylor, 1986;
Massey et al., 1987; Winters et al., 2001; McKenzie and Rapoport, 2007). Thus,
when we consider the pre-existing social bonds between any group of migrants, the
claim that similarity of education creates closer bonds gains considerable plausibility.
The finding that the effect of immigration on imports is stronger for immigration
flows made up of people with relatively homogeneous skills is consistent with the
hypothesis that the degree of closure within an immigrant community drives the
22
network effects of trade.
The role of closure i.e. reputation-building and punishment via the exclusion
from group benefits should be particularly pronounced for countries characterized
by poor enforcement of contract and property rights. A high degree of closure within
an immigrant community permits information exchange and bonding that thereby
substitutes to some extend for the low quality of institutions. The homogeneity
of observable traits such as education represents a proxy for the degree of closure.
Accordingly, we should expect that homogeneous migrant groups exert a stronger effect on imports from countries with low quality of institutions compared to countries
where the enforcement of contracts is guaranteed by an efficient legal system.
In order to assess this hypothesis, we analyze whether the polarization result
varies with the quality of institutions as measured by indicators on the control
of corruption, government effectiveness, political stability, rule of law, regulatory
quality, and absence of violence which are contained in the World Bank’s World
Development Indicators database. Specifically, we proceed as follows to shed light on
the role of institutions in conjunction with the avg. dose-response function in Si -Ui space. We raise each one of the just-mentioned six measures of institutional quality
simultaneously. Using the estimated first-stage-regression parameters, we change the
residuals accordingly and give the change in institutional quality the interpretation
of an exogenous, random shock. This results in an alternative level of the GPS. The
latter and also all potential skilled and unskilled immigration levels, {s, u}, together
with these random shocks from the first stage are used to compute predictions
based on the original second-stage regression parameters but using counterfactual
values associated with raised institutional quality. Based on these, we compute the
predictions for an alternative counterfactual avg. dose-response function, associated
with the improved institutional quality. We display the difference between this
counterfactual avg. dose-response function and the original one in Figure 10.
– Figure 10 here –
The figure is capable of informing us about where – i.e., for which combination
of potential skilled and unskilled immigration – the impact of higher institutional
quality on the role of skill-specific immigration for imports is biggest and where it is
smallest. In the figure, blue color indicates s − u combinations where the predicted
import response is increased whereas red color indicates s − u combinations where
the predicted import response is reduced due to the higher level of institutional quality. The figure suggests that institutional quality has the biggest beneficial impact
for country pairs with a relatively balanced composition of skilled and unskilled immigrants, while it tends to have a negligible and sometimes negative impact where
the composition is polarized. This suggests that homogeneous groups of immigrants
23
stimulate imports particularly in cases where institutions are weak in their origin
countries. This is consistent with the aforementioned closure-effect hypothesis of
immigrant networks.
6
Conclusions
This paper assesses the role of skilled versus unskilled immigration for bilateral
imports in a large data-set of country pairs. Flexible parametric and nonparametric reduced-form models are postulated, where the stocks of skilled and unskilled
migrants at the country-pair level are determined as endogenous continuous treatments. By invoking conditional mean independence and weak unconfoundedness,
the impact of different levels of skilled and unskilled immigration on the volume
and structure of bilateral imports is assessed in a quasi-experimental design. This
is accomplished through a generalized estimation procedure for an assessment of
causal effects of univariate continuous treatments on outcome.
Two sets of results from this analysis stand out. First, we find evidence of
a polarized impact of skill-specific immigration on imports: highly concentrated
skilled or unskilled migrants induce higher import volumes than a balanced, more
homogeneous composition of the immigrant base. Second, while a polarization of
skilled migrants seems to foster primarily differentiated goods trade, a polarization
of unskilled migrants mainly stimulates homogeneous goods trade. Either piece of
evidence is consistent with a segregation of skill-specific immigrant networks and
corresponding trade patterns. XXX THIS PART READS A BIT STRANGE That
a polarization of migrants in the skill dimension stimulates trade is in line with
this argument per se. Moreover, our results are consistent with the hypothesis that
social ties generated in homogenous immigrant communities facilitate contracting
and thereby can substitute for weak institutions in the migrants’ countries of origin.
XXX That polarizations towards skilled versus unskilled migrants lead to a polarization of trade towards differentiated versus homogeneous goods suggests that the
nature of networks and of information in different migrant groups are skill-specific.
XXX SHALL WE INCLUDE A SENTENCE ON skill specific migration policy
and trade liberalization ???....XXX
References
Anderson, J.E., 2011. The gravity model. Annual Review of Economics 3, 133-160.
Anderson, J.E., van Wincoop, E., 2003. Gravity with gravitas: A Solution to the border
puzzle. American Economic Review 93, 170-192.
24
Artuc, E., Docquier, F., Ozden, C., Parsons, C. 2014. A global asessment of human capital
mobility: The role of non-OECD destinations. World Development, forthcoming.
Blalock, H.M., 1967. Toward a theory of minority-group relations. Wiley, New York.
Bonacich, E., 1973. Theory of middleman minorities. American Sociological Review 38,
583-594.
Briant, A., Combes, P.-P., Lafourcade, M., 2009. Product complexity, quality of institutions
and the pro-trade effect of immigrants. Paris School of Economics Working Paper #2009 06.
Bryant, J., Genç, M., Law, D., 2004. Trade and migration to new zealand. New Zealand
Treasury, Working Paper Series #04/18.
Burt, R.S., 2005. Brokerage and closure: An introduction to social capital. Oxford University Press, Oxford.
Burt, R.S., 2009. Network duality of social capital, in: Bartkus, V.O., Davis, J.H. (Eds.),
Social capital: Reaching out, reaching in. Edward Elgar, Cheltenham, UK, 39-65.
Cohen, A., 1969. Custom and politics in urban africa: A study of Hausa migrants in Yoruba
towns, new ed. Routledge & Kegan Paul, London.
Cohen, A., 1971. Cultural strategies in the organization of trading diasporas, in: Meillassoux, C. (Ed.), The development of indigenous trade and markets in West Africa. Oxford
University Press, Oxford, 266-278.
Curtin, P.D., 1984. Cross-cultural trade in world history. Cambridge University Press,
Cambridge Cambridgeshire; New York.
Docquier, F., Lowell, B.L., Marfouk, A. 2009. A gendered assessment of highly skilled
emigration. Population and Development Review 35, 297-322.
Efron, B., Tibshirani, R.J., 1993. An introduction to the bootstrap. Chapman & Hall/CRC
Monographs on Statistics & Applied Probability; London.
Egger, P.H., Ehrlich, M. v., 2013. Generalized propensity scores for multiple continuous
treatment variables. Economics Letters 119, 32-34.
Egger, P.H., Ehrlich, M. v., Nelson, D.R., 2012a. Migration and trade. The World Economy
35, 216-241.
Egger, P.H., Ehrlich, M. v., Nelson, D.R., 2012b. The trade effects of skilled versus unskilled
Migration. CEPR Discussion Paper 9053.
Fan, J., Gijbels, I., 1996., Local polynomial modelling and its applications. Chapman and
Hall; London.
25
Felbermayr, G.J., Jung, B., 2009. The pro-trade effect of the brain drain: Sorting out
confounding factors. Economics Letters 104, 72-75.
Felbermayr, G., Jung, B., Toubal, F., 2010. Ethnic networks, information, and international
trade: Revisiting the evidence. Annales d’Economie et de Statistique, 41-70.
Felbermayr, G.J., Toubal, F., 2012. Revisiting the trade-migration nexus: Evidence from
new OECD data. World Development 40, 928-937.
Flores, C..A, Gonzalez, A., and Neumann, T.C., 2012. Estimating the effects of length of
exposure to instruction in a training program: The case of job corps. Review of Economics
and Statistics 94, 153-171.
Gaston, N., Nelson, D.R., 2011. International migration, in: Bernhofen, D.M., Falvey,
R., Greenaway, D., Kreickemeier, U. (Eds.), Handbook of international trade. Palgrave
Macmillan, Basingstoke, 660-697.
Geertz, C., 1963. Peddlers and princes; social change and economic modernization in two
indonesian towns. University of Chicago Press, Chicago,.
Geertz, C., 1978. Bazaar economy: Information and search in peasant marketing. American
Economic Review 68, 28-32.
Genc, M., Gheasi, M., Nijkamp, P., Poot, J., 2012. The impact of immigration on international trade: A meta-analysis, in: Nijkamp, P., Poot, J., Sahin, M. (Eds.), Migration impact
assessment. Edward Elgar, Cheltenham, 301-337.
Gould, D.M., 1994. Immigrant links to the home country: Empirical implications for united
states bilateral trade flows. Review of Economics and Statistics 76, 302-316.
Green, W.H., 2011. Econometric analysis. Prentice Hall, Upper Saddle River; NJ.
Greif, A., 1989. Reputation and coalitions in medieval trade: Evidence on the Maghribi
traders. Journal of Economic History 49, 857-882.
Greif, A., 1991. The organization of long-distance trade: Reputation and coalitions in
the Geniza documents and Genoa during the 11th-century and 12th-century. Journal of
Economic History 51, 459-462.
Greif, A., 2006. Institutions and the path to the modern economy: Lessons from medieval
trade. Cambridge University Press, Cambridge.
Hatzigeorgiou, A., 2010. Migration as trade facilitation: Assessing the links between international trade and migration. The B.E. Journal of Economic Analysis & Policy 10, Article
24.
Head, K., Ries, J., 1998. Immigration and trade creation: Econometric evidence from
canada. Canadian Journal of Economics 31, 47-62.
26
Hirano, K., Imbens, G.W., 2005. The propensity score with continuous treatments, Applied
Bayesian modeling and causal inference from incomplete-data perspectives. John Wiley &
Sons, Ltd, 73-84.
Imai, K., King, G., Stuart, E.A., 2008. Misunderstandings between experimentalists and
observationalists about causal inference. of the Royal Statistical Society: Series A (Statistics
in Society) 171, 481–502.
Imai, K., van Dyk, D.A., 2004. Causal inference with general treatment regimes: Generalizing the propensity score. Journal of the American Statistical Association 99, 854-866.
Iyer, G.R., Jon, M.S., 1999. Ethnic entrepreneurial and marketing systems: Implications
for the global economy. Journal of International Marketing 7, 83-110.
Javorcik, B.S., Ozden, C., Spatareanu, M., Neagu, C., 2011. Migrant networks and foreign
direct investment. Journal of Development Economics 94, 231-241.
Kluve, J., Schneider, H., Uhlendorff, A. Zhao, Z., 2012. Evaluating continuous training
programmes by using the generalized propensity score. Journal of the Royal Statistical
Society: Series A (Statistics in Society) 175, 587-617.
Landa, J.T., 1981. A theory of the ethnically homogeneous middleman group: An institutional alternative to contract law. Journal of Legal Studies 10, 348-362.
Landa, J.T., 1994. Trust, ethnicity, and identity: Beyond the new institutional economics
of ethnic trading networks, contract law, and gift-exchange. University of Michigan Press,
Ann Arbor.
Lechner, M., 2001. Identification and estimation of causal effects of multiple treatments
under the conditional independence assumption, in: Lechner, M., Pfeiffer, F. (Eds.), Econometric evaluation of labour market policies, vol. 13. Physica-Verlag HD, 43-58.
Marshall, M. G., Jaggers, K., Gurr, T.R, 2011. Polity IV Project: Dataset Users’ Manual.
Center for Systemic Peace: Polity IV Project.
Massey, D.S., Alarcón, R., Durand, J., Gonzalez, H., 1987. Return to Aztlan: The social
process of international migration from western Mexico. University of California Press,
Berkeley.
Mayda, A.M., 2010. International migration: A panel data analysis of the determinants of
bilateral flows. Journal of Population Economics 23, 1249-1274.
McKenzie, D., Rapoport, H., 2007. Network effects and the dynamics of migration and
inequality: Theory and evidence from Mexico. Journal of Development Economics 84, 1-24.
Newey, W.K., Powell, J.L., Vella, F., 1999. Nonparametric estimation of triangular simultaneous equations models. Econometrica 67, 565-603.
Peri, G., Requena-Silvente, F., 2010. The trade creation effect of immigrants: Evidence
from the remarkable case of Spain. Canadian Journal of Economics 43, 1433-1459.
27
Polanyi, K., 1957. Trade and market in the early empires; economies in history and theory.
Free Press, Glencoe, Ill.
Polanyi, K., 1968. Primitive, archaic, and modern economies; essays of Karl Polanyi, [1st
ed. Anchor Books, Garden City, N.Y.,.
Ratha, D, Shaw, W., 2007. South-South Migration and Remittance. World Bank Working
Paper, No. 102.
Rauch, J.E., 1999. Networks versus markets in international trade. Journal of International
Economics 48, 7-35.
Rauch, J.E., 2001. Business and social networks in international trade. Journal of Economic
Literature 39, 1177-1203.
Rauch, J.E., Trindade, V., 2002. Ethnic Chinese networks in international trade. Review
of Economics and Statistics 84, 116-130.
Rosenbaum, P.R., Rubin, D.B., 1983. The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41-55.
Santos Silva, J, Tenreyro, S., 2006. The log of gravity. Review of Economics and Statistics
88, 641-658.
Tai, S.H.T., 2009. Market structure and the link between migration and trade. Review of
World Economics 145, 225-249.
Taylor, J.E., 1986. Differential migration, networks, information, and risk, in: Stark, O.
(Ed.), Migration, human capital and development, research in human capital and development, vol. 4. JAI Press, New York, 141-173.
Terza, J.V., Basu, A., Rathouz, P.J., 2008. Two-stage residual inclusion estimation: Addressing endogeneity in health econometric modeling. Journal of Health Economics 27,
531-543.
Wagner, D., Head, K., Ries, J., 2002. Immigration and the trade of provinces. Scottish
Journal of Political Economy 49, 507-525.
Winters, P., de Janvry, A., Sadoulet, E., 2001. Family and community networks in MexicoUS migration. Journal of Human Resources 36, 159-184.
Wooldridge, J.M., 2010. Econometric analysis of cross section and panel data. MIT Press,
Cambridge, Mass.
Tables and Figures
Table 1: Descriptive Statistics Dependent Variables
Mean
OECD residence countries
ln(Ui )
5.977
ln(Si )
6.725
ln(Mi )
4.647
MiD /MiH
5.274
Non-OECD residence countries
ln(Ui )
5.292
ln(Si )
4.575
ln(Mi )
3.324
Std. dev.
Min
Max
Obs.
2.594
2.471
3.056
59.683
0.000
0.000
-8.517
0.000
15.310
14.468
12.346
2908.723
3,178
3,178
3,178
2,669
2.798
2.408
2.988
0.000
0.000
-8.517
14.619
11.451
11.023
1,939
1,939
1,939
Notes: We have observations for 98 countries of origin and 29 OECD countries of residence of migrants as well as
65 Non-OECD countries of residence. Countries of residence are importers of goods while countries of origin are
the exporters. Ui and Si refer to skilled and unskilled immigrants where we define skilled immigrants as those with
at least secondary level of education. Mi denotes total bilateral imports while MiD and MiH refer to differentiated
and homogeneous goods imports. The classification of goods follows Rauch (1999). For most pairs with non-OECD
residence countries we lack data on imports by goods classification such that we refrain from distinguishing between
differentiated and homogeneous goods trade for these pairs.
Table 2: Descriptive Statistics Independent Variables
Acronyms
Description
GDP R i
GDP O i
GDP P CR i
GDP P CO i
P OP R i
P OP O i
GIN IR i
GIN IO i
U N EM P R i
U N EM P O i
REALEXCH i
CP IR i
CP IO i
ILOBARGAIN R i
ILOBARGAIN O i
ILOLABORR i
ILOLABORO i
ILODISCRR i
ILODISCRO i
ILOCHILDR i
ILOCHILDO i
P OLIT Y 2R i
P OLIT Y 2O i
LIF EEXP R i
LIF EEXP O i
F ERT ILR i
F ERT ILO i
DIST i
LIT R i
LIT O i
W ARSAW i
CON F LICT R i
CON F LICT O i
COM LAN G i
COLON Y i
GT A i
ST A i
OECDO i
RELIGION i
M IGIM P ED1 i
M IGIM P ED2 i
M IGIM P ED3 i
M IGIM P ED4 i
M IGIM P ED5 i
GOV CCO i
GOV GEO i
GOV P SO i
GOV RLO i
GOV RQO i
GOV AV O i
Observations
log GDP residence country
log GDP origin country
log GDP per capita residence country
log GDP per capita origin country
log population residence country
log population origin country
GINI coefficient residence country
GINI coefficient origin country
unemployment rate residence country
unemployment rate origin country
real exchange rate btw, residence and origin country
corruption perception index residence country
corruption perception index origin country
ILO bargaining power index residence country
ILO bargaining power index origin country
ILO working condition index residence country
ILO working condition index origin country
ILO worker discrimination index residence country
ILO worker discrimination index origin country
ILO child labor index residence country
ILO child labor index origin country
Polity IV index residence country
Polity IV index origin country
life expectancy residence country
life expectancy origin country
fertility residence country
fertility origin country
log distance btw. residence and origin country
literacy rate residence country
literacy rate origin country
member of Warsaw Pact
number of armed conflicts residence country
number of armed conflicts origin country
common language in residence and origin country
colonial relationship btw. residence and origin country
goods trade agreement
service trade agreement
OECD member
common religion in residence and origin country
migration impediments: re-admission agreements
migration impediments: social security system
migration impediments: exchange of information
migration impediments: labor market regulation
migration impediments: respect for human rights
institutions origin country: control of corruption
institutions origin country: government effectiveness
institutions origin country: political stability
institutions origin country: rule of law
institutions origin country: regulatory quality
institutions origin country: absence of violence
Mean
Std. dev.
Min
Max
25.504
24.770
9.344
8.646
16.451
16.657
35.865
38.631
7.824
8.366
-0.359
6.235
4.918
27.440
25.429
34.596
29.149
24.521
22.339
3.359
2.793
7.445
5.026
73.966
69.956
2.130
2.752
8.397
82.300
78.545
0.029
0.094
0.180
0.142
0.041
0.157
0.103
0.378
0.271
0.022
0.214
0.176
0.235
0.160
0.396
0.443
0.109
0.296
0.465
0.330
5,117
1.804
2.018
0.825
1.142
1.347
1.443
8.815
9.494
4.552
5.151
3.716
2.341
2.320
16.144
16.179
16.235
16.297
13.128
13.617
4.283
4.048
4.979
6.113
5.026
7.743
1.020
1.496
1.013
11.702
17.410
0.168
0.257
0.331
0.349
0.199
0.364
0.304
0.485
0.445
0.146
0.410
0.381
0.424
0.367
1.136
1.019
0.947
1.038
0.882
0.945
20.789
19.859
5.950
5.950
13.400
13.400
23.005
23.310
0.030
0.030
-21.227
1.600
1.600
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
-10.000
-10.000
35.607
35.607
1.154
1.154
4.088
13.478
13.478
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
0.000
-1.374
-1.461
-2.510
-1.728
-2.098
-1.883
29.742
29.742
10.317
10.317
20.930
20.930
60.060
67.000
27.726
27.726
21.538
9.300
9.300
50.500
50.500
56.000
56.000
43.500
43.500
12.500
12.500
10.000
10.000
80.247
80.247
7.720
7.720
9.891
99.790
99.790
1.000
1.000
1.000
1.000
1.000
1.000
1.000
1.000
1.000
1.000
1.000
1.000
1.000
1.000
2.586
2.171
1.668
1.938
2.247
1.765
Notes: We summarize information for those observations that have non-missing, positive levels of Mi , Ui , and Si as
well as for all covariates. When estimating the effects for subgroups of bilateral imports the respective dependent
variable determines the number of observations (see Table 1).
Table 3: First-Stage Estimation of GPS
Const.
GDP R i
GDP R i2
GDP R i3
GDP O i
GDP O i2
GDP O i3
GDP P CR i
GDP P CR i2
GDP P CR i3
GDP P CO i
GDP P CO i2
GDP P CO i3
P OP R i
P OP R i2
P OP R i3
P OP O i
P OP O i2
P OP O i3
GIN IR i
GIN IR i2
GIN IR i3
GIN IO i
GIN IO i2
GIN IO i3
U N EM P R i
U N EM P R i2
U N EM P R i3
U N EM P O i
U N EM P O i2
U N EM P O i3
REALEXCH i
REALEXCH i2
REALEXCH i3
CP IR i
CP IR i2
CP IR i2
CP IO i
CP IO i2
CP IO i3
ILOBARGAIN R i
ILOBARGAIN R i2
ILOBARGAIN R i3
ILOBARGAIN O i
ILOBARGAIN O i2
ILOBARGAIN O i3
ILOLABORR i
ILOLABORR i2
ILOLABORR i3
ILOLABORO i
ILOLABORO i2
ILOLABORO i3
ILODISCRR i
ILODISCRR i2
Continued on next page
Si
Coef.
-1974.364∗
83.514∗
-3.187∗
0.039∗
-7.301
0.335
-0.005∗
274.215
-28.840
1.032
17.468∗∗∗
-2.388∗∗∗
0.104∗∗∗
15.188
-0.405
0.003
-20.784∗∗∗
1.289∗∗∗
-0.025∗∗∗
-4.761∗∗
0.142∗∗
-0.001∗∗
-0.111
0.003
-0.000
0.558∗∗∗
-0.056∗∗∗
0.002∗∗∗
0.239∗∗∗
-0.018∗∗∗
0.000∗∗∗
0.037∗∗
-0.003
0.000
-12.943∗∗∗
2.049∗∗∗
-0.101∗∗∗
2.026∗∗∗
-0.330∗∗∗
0.018∗∗∗
-0.100∗∗
0.002
-0.000
-0.034∗
0.002∗∗
-0.000∗∗
-0.164∗∗
0.008∗∗∗
-0.000∗∗∗
0.024
0.000
-0.000
0.182∗
-0.005
Std.err.
(1050.203)
(45.854)
(1.728)
(0.022)
(5.610)
(0.223)
(0.003)
(385.051)
(40.189)
(1.399)
(5.621)
(0.689)
(0.028)
(45.652)
(2.742)
(0.055)
(5.276)
(0.309)
(0.006)
(2.221)
(0.066)
(0.001)
(0.136)
(0.003)
(0.000)
(0.097)
(0.012)
(0.000)
(0.048)
(0.004)
(0.000)
(0.014)
(0.002)
(0.000)
(2.375)
(0.374)
(0.019)
(0.321)
(0.061)
(0.004)
(0.046)
(0.002)
(0.000)
(0.018)
(0.001)
(0.000)
(0.075)
(0.002)
(0.000)
(0.019)
(0.001)
(0.000)
(0.097)
(0.004)
Ui
Coef.
-8468.345∗∗∗
249.994∗∗∗
-9.478∗∗∗
0.117∗∗∗
1.427
-0.025
-0.000
245.495
-25.588
0.922
14.717∗∗∗
-1.955∗∗∗
0.084∗∗∗
-109.636∗∗∗
7.022∗∗∗
-0.140∗∗∗
-24.284∗∗∗
1.508∗∗∗
-0.030∗∗∗
-19.650∗∗∗
0.594∗∗∗
-0.006∗∗∗
-0.288∗∗
0.006∗∗
-0.000∗∗
0.789∗∗∗
-0.092∗∗∗
0.003∗∗∗
0.194∗∗∗
-0.013∗∗∗
0.000∗∗∗
0.026∗∗
-0.004∗∗
0.000
-27.485∗∗∗
4.507∗∗∗
-0.230∗∗∗
0.893∗∗∗
-0.133∗∗∗
0.008∗∗∗
-0.031
-0.003∗
0.000∗∗∗
-0.021
0.002∗∗
-0.000∗∗
-0.267∗∗∗
0.010∗∗∗
-0.000∗∗∗
-0.024
0.002∗∗
-0.000∗∗∗
0.163∗∗
-0.002
Std.err.
(889.349)
(37.962)
(1.428)
(0.018)
(4.815)
(0.191)
(0.003)
(308.797)
(32.193)
(1.120)
(4.755)
(0.582)
(0.023)
(39.917)
(2.400)
(0.048)
(4.159)
(0.244)
(0.005)
(1.908)
(0.056)
(0.001)
(0.115)
(0.003)
(0.000)
(0.079)
(0.010)
(0.000)
(0.038)
(0.003)
(0.000)
(0.012)
(0.002)
(0.000)
(2.102)
(0.330)
(0.017)
(0.257)
(0.049)
(0.003)
(0.038)
(0.002)
(0.000)
(0.016)
(0.001)
(0.000)
(0.062)
(0.002)
(0.000)
(0.017)
(0.001)
(0.000)
(0.081)
(0.003)
Table 3: First-Stage Estimation of GPS
ILODISCRR i3
ILODISCRO i
ILODISCRO i2
ILODISCRO i3
ILOCHILDR i
ILOCHILDR i2
ILOCHILDR i3
ILOCHILDO i
ILOCHILDO i2
ILOCHILDO i3
P OLIT Y 2R i
P OLIT Y 2R i2
P OLIT Y 2R i3
P OLIT Y 2O i
P OLIT Y 2O i2
P OLIT Y 2O i3
LIF EEXP R i
LIF EEXP R i2
LIF EEXP R i3
LIF EEXP O i
LIF EEXP O i2
LIF EEXP O i3
F ERT ILR i
F ERT ILR i2
F ERT ILR i3
F ERT ILO i
F ERT ILO i2
F ERT ILO i3
DIST i
DIST i2
DIST i3
LIT R i
LIT R i2
LIT R i3
LIT O i
LIT O i2
LIT O i3
W ARSAW i
CON F LICT R i
CON F LICT O i
COM LAN G i
COLON Y i
GT A i
ST A i
OECDO i
RELIGION i
M IGIM P ED1 i
M IGIM P ED2 i
M IGIM P ED3 i
M IGIM P ED4 i
M IGIM P ED5 i
adj. R2
AIC
Obs.
Notes:
∗∗∗ , ∗∗ , ∗
Si
Coef.
0.000
-0.039∗
0.001
-0.000
0.800∗∗∗
-0.189∗∗∗
0.011∗∗∗
0.125∗
-0.035∗∗
0.002∗∗
-0.688
0.007
0.004
-0.019
0.000
0.001∗∗
18.677
-0.275
0.001
-1.747∗∗∗
0.031∗∗∗
-0.000∗∗∗
17.685∗
-7.478
1.080
-1.038∗∗∗
0.283∗∗∗
-0.024∗∗∗
3.747
-0.668
0.031∗
1.884
-0.034
0.000
0.005
-0.000
0.000
1.457∗∗∗
-0.516
0.025
0.806∗∗∗
1.984∗∗∗
0.345∗∗∗
-0.392∗∗
0.280∗
0.359∗∗∗
0.766∗∗
0.749
0.396
-0.856∗
-0.547∗∗
0.72
11192
3,178
Std.err.
(0.000)
(0.021)
(0.001)
(0.000)
(0.153)
(0.035)
(0.002)
(0.068)
(0.016)
(0.001)
(0.816)
(0.170)
(0.009)
(0.017)
(0.002)
(0.000)
(34.237)
(0.451)
(0.002)
(0.367)
(0.006)
(0.000)
(9.869)
(5.385)
(0.980)
(0.389)
(0.102)
(0.008)
(3.148)
(0.415)
(0.018)
(5.974)
(0.071)
(0.000)
(0.047)
(0.001)
(0.000)
(0.400)
(0.369)
(0.096)
(0.108)
(0.166)
(0.130)
(0.153)
(0.165)
(0.066)
(0.389)
(0.498)
(0.245)
(0.479)
(0.221)
Ui
Coef.
-0.000
0.017
-0.002∗
0.000
1.047∗∗∗
-0.251∗∗∗
0.015∗∗∗
0.031
-0.015
0.001
1.503∗∗
-0.361∗∗
0.019∗∗
-0.006
-0.001
0.000∗
253.781∗∗∗
-3.376∗∗∗
0.015∗∗∗
-1.528∗∗∗
0.026∗∗∗
-0.000∗∗∗
48.446∗∗∗
-27.849∗∗∗
5.580∗∗∗
-0.733∗∗
0.183∗∗
-0.015∗∗
0.381
-0.194
0.010
2.280
-0.044
0.000
-0.033
0.001
-0.000
1.602∗∗∗
-1.117∗∗∗
0.032
1.138∗∗∗
1.589∗∗∗
0.366∗∗∗
-0.379∗∗∗
-0.133
0.264∗∗∗
0.365
0.547
0.336∗
-0.503
-0.338∗
0.80
9815
3,178
Std.err.
(0.000)
(0.017)
(0.001)
(0.000)
(0.132)
(0.031)
(0.002)
(0.053)
(0.012)
(0.001)
(0.708)
(0.146)
(0.008)
(0.014)
(0.001)
(0.000)
(29.310)
(0.386)
(0.002)
(0.290)
(0.005)
(0.000)
(8.319)
(4.531)
(0.825)
(0.308)
(0.080)
(0.006)
(3.148)
(0.407)
(0.017)
(4.997)
(0.060)
(0.000)
(0.036)
(0.001)
(0.000)
(0.321)
(0.309)
(0.077)
(0.084)
(0.134)
(0.102)
(0.121)
(0.128)
(0.054)
(0.277)
(0.382)
(0.200)
(0.363)
(0.184)
denote significance levels at 1, 5, and 10%, respectively.
Table 4: Balancing Test - T-statistics
GDP Ri
GDP Oi
GDP P CRi
GDP P COi
P OP Ri
P OP Oi
GIN IRi
GIN IOi
U N EM P Ri
U N EM P Oi
REALEXCHi
CP IRi
CP IOi
ILOBARGAIN Ri
ILOBARGAIN Oi
ILOLABORRi
ILOLABOROi
ILODISCRRi
ILODISCROi
ILOCHILDRi
ILOCHILDOi
P OLIT Y 2Ri
P OLIT Y 2Oi
LIF EEXP Ri
LIF EEXP Oi
F ERT ILRi
F ERT ILOi
DISTi
LIT Ri
LIT Oi
W ARSAWi
CON F LICT Ri
CON F LICT Oi
COM LAN Gi
COLON Yi
GT Ai
ST Ai
OECDOi
RELIGIONi
M IGIM P ED1 i
M IGIM P ED2 i
M IGIM P ED3 i
M IGIM P ED4 i
M IGIM P ED5 i
Mean
Obs. (common support)
4 Groups
Unconditional
Conditional
Ui
Si
Ui
Si
6.140
1.952
0.215
1.616
6.495
12.411
0.858
0.318
3.859
1.930
0.289
0.974
10.151
13.001
0.277
0.459
1.736
5.125
0.001
1.211
2.528
2.292
1.337
0.800
8.539
8.672
0.568
1.109
2.522
5.182
0.555
0.354
2.917
1.270
0.763
0.408
0.434
0.514
0.068
0.189
0.210
1.178
0.069
0.383
5.603
4.015
0.687
0.159
9.429
11.595
0.207
0.403
14.454
13.917
1.855
1.160
1.810
3.480
0.300
0.156
8.834
10.216
2.145
1.661
4.333
5.454
0.017
0.017
12.027
11.655
1.394
0.781
2.870
3.825
0.074
0.042
8.974
9.024
1.099
1.393
2.738
3.749
0.096
0.063
3.995
3.677
0.182
0.147
4.777
7.319
0.363
0.225
1.176
0.581
0.334
0.383
7.097
10.187
0.062
0.207
5.939
4.962
0.315
0.332
5.532
9.572
0.502
0.043
7.948
3.096
1.554
0.835
5.674
3.925
0.010
0.164
4.506
5.799
0.303
0.413
2.078
2.315
0.223
0.217
1.207
0.917
0.983
0.691
4.837
4.536
0.283
0.158
5.545
8.000
0.390
0.864
2.677
1.387
2.001
0.775
1.932
4.788
1.269
1.026
2.469
4.881
0.741
0.550
6.831
10.775
0.457
0.128
0.262
2.512
0.311
0.560
2.259
1.938
0.029
0.623
2.770
0.622
0.653
0.671
3.359
0.711
0.825
0.535
3.213
1.258
0.602
0.569
2.764
0.107
1.013
0.629
4.669
5.098
0.597
0.554
2,525
9 Groups
Unconditional
Conditional
Ui
Si
Ui
Si
5.175
2.475
0.005
0.076
4.345
6.517
0.044
0.031
2.938
2.082
0.017
0.140
8.617
8.491
0.007
0.104
2.007
0.280
0.132
0.116
3.385
0.472
0.048
0.052
5.987
4.704
0.180
0.032
2.392
3.396
0.232
0.197
1.143
0.538
0.034
0.191
0.519
0.014
0.077
0.015
0.205
0.601
0.044
0.039
3.479
2.618
0.566
0.340
7.440
7.405
0.407
0.179
7.823
6.258
0.133
0.472
1.978
2.859
0.115
0.113
3.904
3.312
0.080
0.402
3.428
3.725
0.190
0.185
6.418
4.474
0.230
0.594
4.444
4.441
0.003
0.039
4.577
3.939
0.104
0.026
3.923
3.455
0.175
0.155
3.214
2.453
0.090
0.072
4.827
5.281
0.056
0.027
1.440
0.279
0.001
0.131
6.166
6.622
0.085
0.168
5.365
3.581
0.006
0.031
5.073
5.405
0.290
0.274
3.875
1.856
0.117
0.039
4.493
3.899
0.396
0.410
3.601
3.188
0.734
0.747
2.996
1.814
0.830
0.069
0.853
1.368
0.346
0.173
3.998
3.010
0.053
0.045
3.190
4.009
0.495
0.248
2.679
1.315
0.003
0.043
4.639
4.669
0.241
0.019
3.189
3.700
0.180
0.026
6.932
7.400
0.112
0.096
0.316
1.956
0.147
0.171
0.437
0.155
0.418
0.125
0.571
0.440
0.265
0.077
1.240
0.227
0.118
0.085
0.551
0.436
0.258
0.094
0.642
0.804
0.030
0.043
3.509
3.089
0.184
0.152
1,352
Notes: We regress each covariate on the treatments Ui and Si whereby the conditional model controls for the
GPS evaluated at the quartiles of the treatments (see also Imai and van Dyk, 2004 for an analogous test of the
balancing property). For each of these regressions we report the t-statistics on whether the effect of Ui and Si on the
respective covariates is significantly different from zero. The unconditional and conditional models are estimated
for the common support samples based alternatively on 4 and 9 groups. Note that restricting the sample to the
common support of the GPS already improves the balancing compared to the corresponding unconditional estimates
based on the total sample.
Table 5: Second-Stage Estimation of the Unit Dose-Response Function
4 Groups
Ui
Si
Ui2
Si2
Ui × Ĝi
Si × Ĝi
Ui × Si
Ĝi
Ĝ2i
Constant
Observations
adj. R2
F-statistic Ĝi terms
F-statistic Ui terms
F-statistic Si terms
Coef.
0.568∗
0.274
0.157∗∗∗
0.155∗∗
0.417
2.171∗∗
-0.338∗∗∗
-2.779
-40.468∗∗
-0.576
2,525
0.311
14.98
12.11
32.64
Std. err.
(0.228)
(0.323)
(0.036)
(0.054)
(0.693)
(0.814)
(0.084)
(4.434)
(12.606)
(0.688)
Notes: ∗∗∗ , ∗∗ , ∗ denote significance levels at 1, 5, and 10%, respectively. Standard errors in parentheses. Ui and
Si refer to the logarithm of the stock of unskilled and skilled immigrants, respectively, who reside in the importer
country and originate from the exporter country. Ĝi refers to the generalized propensity score calculated according
to equation (2) using the coefficients from the first-stage regressions in Table 3. We estimate the standard errors
of the avg. dose-response function by bootstrapping with 200 draws that take into account that the second-stage
estimates involve imprecision from first-stage estimates.
Figure 1: Balancing of Covariates - 4 Groups
Mean: 4.86; Median: 3.30; 60% sign. at 1% level
A. Unconditional
Mean: 0.60; Median: 0.48; 0.5% sign. at 1% level
B. Conditional on Bivariate GPS
Note: The histograms contain the t-statistics for all 44 linear terms of the covariates. We test
for the equivalence of the covariates between 4 groups which yields 176 t-statistics. In the
conditional comparison we split the distribution of Ĝqi into 16 blocs and conduct the t-tests for
subsamples belonging to the same bloc. A weighted average over these blocs is computed for
each treatment group and each covariate.
Figure 2: Balancing of Covariates - 9 Groups
Mean: 3.32; Median: 2.45; 47% sign. at 1% level
A. Unconditional
Mean: 0.54; Median: 0.44; 0.5% sign. at 1% level
B. Conditional on Bivariate GPS
Note: The histograms contain the t-statistics for all 44 linear terms of the covariates. We test
for the equivalence of the covariates between 9 groups which yields 396 t-statistics. In the
conditional comparison we split the distribution of Ĝqi into 8 blocs and conduct the t-tests for
subsamples belonging to the same bloc. A weighted average over these blocs is computed for
each treatment group and each covariate.
Figure 3: Avg. Dose-Response – Import Volume
Note: Blue corresponds to significant and positive and light blue to insignificant and positive.
We obtain standard errors from a bloc bootstrapping routine with 200 replications. This routine
includes the common support restriction, the first-stage estimation of the GPS as well the
second-stage estimation.
Figure 4: Avg. Treatment Effect – Import Volume
A. Unskilled Migrants (U)
B. Skilled Migrants (S)
Note: Blue corresponds to significant and positive, light blue to insignificant and positive,
yellow to insignificant and negative, and red to significant and negative predictions. We obtain
standard errors from a bloc bootstrapping routine with 200 replications. This routine includes
the common support restriction, the first-stage estimation of the GPS as well the second-stage
estimation.
Figure 5: Nonparametric Avg. Dose-Response – Import Volume
Note: Blue corresponds to significant and positive, light blue to insignificant and positive, and
yellow to insignificant and negative. The surface is predicted from a multivariate local linear
regression. An optimal bandwidth is obtained by cross validation. Standard errors stem from
a bloc bootstrapping routine with 200 replications. This routine includes the common support
restriction, the first-stage estimation of the GPS as well the second-stage estimation.
Figure 6: Nonparametric Avg. Treatment Effect – Import Volume
A. Unskilled Migrants (U)
B. Skilled Migrants (S)
Note: Blue corresponds to significant and positive, light blue to insignificant and positive, yellow
to insignificant and negative, and red to significant and negative predictions. The surface is
predicted from a multivariate local quadratic regression. An optimal bandwidth is obtained by
cross validation. Standard errors stem from a bloc bootstrapping routine with 200 replications.
This routine includes the common support restriction, the first-stage estimation of the GPS as
well the second-stage estimation.
Figure 7: Nonparametric Avg. Dose-Response – Ratio of Differentiated and Homogeneous Goods
Note: Blue corresponds to significant and positive, light blue to insignificant and positive,
and yellow to insignificant and negative. The outcome refers to the ratio between imports in
differentiated and homogeneous goods MiD /MiH . The surface is predicted from a multivariate
local linear regression. An optimal bandwidth is obtained by cross validation. Standard errors
stem from a bloc bootstrapping routine with 200 replications. This routine includes the common
support restriction, the first-stage estimation of the GPS as well the second-stage estimation.
Figure 8: Avg. Dose-Response – Non-OECD residence countries
Note: In this Figure we limit the sample to pairs with non-OECD country of residence and
compute the avg. dose-response function on the basis of the parametric version of the secondstage. We obtain standard errors from a bloc bootstrapping routine with 200 replications.
This routine includes the common support restriction, the first-stage estimation of the GPS
as well the second-stage estimation. Dark (light) blue color reflects positive and significant
(insignificant) conditional expectations on imports.
Figure 9: Avg. Dose-Response – Controlling for emigration
Note: In this Figure we control for emigration as a third endogenous treatment and compute
the avg. dose-response functions on the basis of the parametric version of the second-stage. We
obtain standard errors from a bloc bootstrapping routine with 200 replications. This routine
includes the common support restriction, the first-stage estimation of the GPS as well the
second-stage estimation. Blue color reflects positive and significant conditional expectations on
imports.
Figure 10: The Role of Institutions
Note: This figure displays the difference between a counterfactual avg. dose-response
E 0 [M (U, S)] function where we have raised the quality of institutions by one standard deviation and the actual avg. dose-response function E[M (U, S)]. See Section 5 for details. Both
avg. dose-response functions are computed on the basis of the parametric version of the secondstage. We use the estimates on the control of corruption, government effectiveness, political
stability and absence of violence/terrorism, rule of law, regulatory quality, and voice and accountability from the World Bank as proxies for institutional quality. We raise each of these
dimensions at the same time. Hence, level of the difference E 0 [M (U, S)] − E[M (U, S)] reflects
the gain in bilateral trade volume due to a one-standard deviation increase of institutional
quality at different combinations of skilled and unskilled migration.
Appendix: Sample composition
29 OECD residence countries: Australia, Austria, Belgium, Canada, Czech Republic, Denmark, Finland, Greece, Hungary, Ireland, Italy, Japan, Korea (Republic
of), Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Slovak Republic,
Spain, Sweden, Turkey, United States, Spain, Switzerland, United Kingdom.
65 non-OECD residence countries: Albania, Algeria, Argentina, Azerbaijan,
Bangladesh, Belarus, Bolivia, Brazil, Bulgaria, Burkina, Cameroon, Chile, China,
Colombia, Costa, Cote, Croatia, Cyprus, Ecuador, El Salvador, Estonia, Georgia, Ghana, Guatemala, Honduras, Indonesia, Israel, Jordan, Kazakhstan, Kenya,
Kuwait, Kyrgyz, Latvia, Lithuania, Malaysia, Mali, Mauritius, Moldova, Mongolia, Morocco, Mozambique, Nepal, Nicaragua, Niger, Nigeria, Pakistan, Panama,
Paraguay, Peru, Philippines, Romania, Rwanda, Saudi Arabia, Singapore, Slovenia,
South Africa, Sri Lanka, Sudan, Thailand, Trinidad, Tunisia, Ukraine, Uruguay,
Uzbekistan, Vietnam.
98 OECD and non-OECD countries of origin: Algeria, Argentina, Armenia,
Australia, Austria, Azerbaijan, Bangladesh, Belgium, Bolivia, Bosnia and Herzegovina, Brazil, Bulgaria, Burkina Faso, Cambodia, Cameroon, Canada, Chile, China,
Colombia, Costa Rica, Cote d’Ivoire, Croatia, Czech Republic, Denmark, Dominican Republic, Ecuador, Egypt, El Salvador, Estonia, Finland, France, Gabon, Georgia, Ghana, Greece, Guatemala, Haiti, Honduras, Hungary, India, Indonesia, Ireland, Israel, Italy, Jamaica, Japan, Jordan, Kazakhstan, Kenya, Korea (Republic of), Latvia, Lithuania, Macedonia (Former Yugoslavian Republic), Madagascar, Malaysia, Mali, Mauritania, Mauritius, Mexico, Moldova, Mongolia, Morocco,
Mozambique, Nepal, Netherlands, New Zealand, Nicaragua, Niger, Norway, Pakistan, Panama, Paraguay, Peru, Philippines, Poland, Portugal, Romania, Russian
Federation, Singapore, Slovak Republic, Slovenia, South Africa, Spain, Sri Lanka,
Sweden, Switzerland, Tanzania, Thailand, Trinidad and Tobago, Tunisia, Turkey,
Ukraine, United Kingdom, United States, Uruguay, Venezuela, Vietnam, Yemen
(Republic of), Zambia.