SELF-SELECTION AND INTERNATIONAL MIGRATION: NEW EVIDENCE FROM MEXICO Robert Kaestner and Ofer Malamud* Abstract—This paper examines the selection of Mexican migrants to the United States using novel data with rich premigration characteristics that include permanent migrants, return migrants, and migrating households. Results indicate that Mexican migrants are more likely to be young, male, and from rural areas compared to nonmigrants, but they are similar to nonmigrants in cognitive ability and health. Migrants are selected from the middle of the education distribution. Male Mexican migrants are negatively selected on earnings, and this result is largely explained by differential returns to labor market skill between the United States and Mexico rather than proxies for differential costs of migration. I. Introduction M EXICAN immigrants account for one-third of all foreign-born residents in the United States and are the fastest-growing immigrant group. It is estimated that in 2009, 11.4 million Mexican immigrants lived in the United States, and this figure represents approximately 4% of the U.S. population and 11% of Mexico’s population.1 The size of the Mexican immigrant population in the United States, and the potential consequences associated with an increase in population of this magnitude, makes it important to identify characteristics of these immigrants that are likely to affect socioeconomic outcomes in both the United States and Mexico. For example, if migrants are relatively less educated, this will tend to reduce the relative scarcity of highly educated labor and reduce earnings disparities by education in Mexico. Positive selection of migrants with respect to education would have opposite effects. The education of Mexican immigrants in the United States also has implications for U.S. labor market and immigration policy. Negative selection of Mexican immigrants will tend to reduce the wages of less educated, native-born persons and exacerbate earnings inequality, although the evidence in support of this hypothesis remains controversial (Borjas, Freeman, & Katz, 1996; Lalonde & Topel, 1997; Borjas, 2003; Card, 2005). Similarly, the selection of migrants with respect to other characteristics such as age, health, cognitive ability, and earnings has social and economic implications for both Mexico and the United States. Received for publication November 29, 2010. Revision accepted for publication October 11, 2012. * Kaestner: University of Illinois, Chicago, and NBER; Malamud: University of Chicago and NBER. We thank the editor, Alberto Abadie, and three anonymous referees for their comments. Kerwin Charles, Jesus Fernández-Huertas Moraga, Jeff Grogger, Bruce Meyer, Stephen Trejo, and Abigail Wozniak, as well as seminar participants at the Georgetown University, University of Chicago, the University of Notre Dame, the University of Virginia, and the Universitat Pompeu Fabra, provided very useful suggestions. All errors and opinions are our own. A supplemental appendix is available online at http://www.mitpress journals.org/doi/suppl/10.1162/REST_a_00375. 1 Pew Hispanic Center, Statistical Profile: Hispanics of Mexican Origin in the United States, 2009, http://www.pewhispanic.org/2011/05/26 /hispanics-of-mexican-origin-in-the-united-states-2009/. This paper uses data from the Mexican Family Life Survey (MxFLS) to accomplish two objectives. First, we compare characteristics such as the age, education, health, and cognitive ability of Mexicans who migrate to the United States with nonmigrants who remain in Mexico. This descriptive analysis of Mexican migrant characteristics is relevant for policy because regardless of the causes of the pattern of selection, many consequences of migration for Mexico and the United States depend on the characteristics of those who choose to migrate. The MxFLS is ideally suited for this purpose because it contains information on individual characteristics prior to migration for a large, geographically diverse sample of Mexicans. These data have not been previously used to examine the selection of Mexican migrants, and they enable us to address several of the data limitations that have plagued earlier studies. The MxFLS avoids the potential undercount of Mexican migrants and possible overstatement of educational attainment in the U.S. Census. Moreover, in contrast to most Mexican surveys, the MxFLS allows us to identify all migrants, including those who have moved permanently to the United States or moved to the United States with their entire households. Finally, the MxFLS provides information on characteristics such as cognitive ability and health that are generally unmeasured in other surveys. Second, we use the MxFLS to describe the pattern of selection of migrants with respect to earnings and examine whether this pattern changes when we adjust for differences in some of the costs and benefits of migration. Earlier studies have often focused on ‘‘skill’’ characteristics, such as education and age (or experience), due to their strong associations with labor market earnings. Using the MxFLS, we examine the pattern of selection with premigration earnings directly. The MxFLS also contains detailed measures of migration networks, credit availability, and household assets, which serve as proxy variables for the costs of migration. In addition, we construct measures of monetary returns to migration using the Mexican and U.S. 2000 censuses. We use these variables to assess possible determinants of selection with respect to earnings by comparing changes in the pattern of migrant selection with and without adjusting for these measures of the costs and benefits of migration. Our descriptive analysis reveals a marked selection of Mexican migrants with respect to age: both male and female migrants are more likely to be younger, and there is a near monotonic, negative relationship between age and the probability of migrating. With respect to education, both male and female migrants tend to be drawn from the middle of the education distribution (four to nine years of schooling). There is little evidence of migrant selection with respect to cognitive ability or health. Finally, rural residents, The Review of Economics and Statistics, March 2014, 96(1): 78–91 Ó 2014 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology SELF-SELECTION AND INTERNATIONAL MIGRATION unmarried persons, and those with a relative living in the United States are much more likely to migrate than urban residents, married persons, and those without a relative living in the United States, respectively. With regard to earnings, which we analyze only for males because of high rates of nonparticipation in the labor market for women, Mexican migrants tend to come from the bottom half of the (unconditional) earnings distributions. Interestingly, this pattern of selection of migrants with respect to earnings is not due to the relationship between earnings and three measures of the cost of migration: whether a person has a relative in United States, has access to credit, or has significant assets. While some of these measures of costs of migration are significantly related to the probability of migration, adding them to the regression has only a small impact on the pattern of migrant selection with respect to earnings. In contrast, a significant part of the pattern of migrant selection with respect to earnings can be explained by differences in labor market returns to observed determinants of earnings such as age and education. We find that Mexican males are more likely to migrate to the United States when the labor market return for people with their observed characteristics is larger in the United States than in Mexico. II. Related Literature Empirical analyses of the selectivity of migrants in the economics literature are motivated by theories of migration in which the decision to migrate depends on the costs and benefits of migration (Sjaastad, 1962). Predictions about the characteristics of migrants versus those of nonmigrants depend on how these characteristics are related to the costs and benefits of migration. For example, Borjas (1987) used a Roy (1951) model to generate predictions about whether migrants will be relatively low skilled or high skilled depending on how skill affects the pecuniary benefits of migration. In the absence of the differential time-equivalent costs of migration by skill, migrants are positively selected on skill when the returns to skill in Mexico are lower than in the United States and negatively selected on skill when the returns to skill are relatively higher in Mexico. Several authors have drawn on the Borjas (1987) model to motivate analyses of migrant selection with respect to education and earnings. Chiquiar and Hanson (2005) used Mexican and U.S. censuses to evaluate the selection of Mexican immigrants in terms of observed skills and found evidence of intermediate selection.2 Chiquiar and Hanson (2005) first compared the educational attainment of nonmigrants in the Mexican Census to migrants in the 2000 U.S. Census and showed that Mexican immigrants in the United States are drawn from the middle of the Mexican education distribution. They then compared the predicted wage distri- bution for residents of Mexico using the 2000 Mexican Census with the predicted wage distribution of Mexican immigrants in the United States using the 2000 U.S. Census had they been paid according to the skills prices in Mexico. Predicted wages were used as a proxy for observed skill, and wages were predicted using age, education, and marital status. As with education, the authors found that migrants are concentrated in the middle of Mexico’s observed skill distribution. Given evidence that the returns to skill are actually higher in Mexico than for migrants in the United States, they conjectured that high-skilled individual may have lower time-equivalent costs of migration (or that any fixed costs of migration are relatively less important for high-skill individuals).3 The use of the U.S. and Mexican censuses presents several empirical problems for Chiquiar and Hanson (2005) besides the fact that there are no measures of past earnings in Mexico for migrants: the likely undercount of immigrants in the U.S. Census, the possibility that Mexican immigrants can obtain additional schooling after arriving in the United States, return migration, and the likelihood that unobserved characteristics are correlated with education. Results remained relatively unchanged after attempts to address these potential problems, but despite their comprehensive assessments, these concerns cannot be completely alleviated. Ibarraran and Lubotsky (2007) used the 2000 Mexican Census to estimate differences in the educational attainment between migrants and nonmigrants. They replicated Chiquiar and Hanson’s (2005) finding that Mexican immigrants in the 2000 U.S. Census have more education (and are older) than nonmigrants in the Mexican Census. However, they further investigated potential overreporting of education by migrants and undercounting of younger migrants in the U.S. Census. They also exploited the fact that in the Mexican Census, heads of households were asked to list current or past household members who had lived abroad during the preceding five years. Unfortunately, the heads of household reported only a limited set of individual characteristics: age, gender, Mexican state of origin, and migration patterns. Since no information about educational attainment of migrants was available, the authors used measured characteristics to predict educational attainment. In contrast to Chiquiar and Hanson (2005), Ibarraran and Lubotsky (2007) found evidence of negative selection on education: less educated Mexicans were more likely than highly educated Mexicans to migrate to the United States, and the degree of negative selection was larger in regions with higher returns to schooling. However, this approach is based on the assumption that there are no differences 3 2 Feliciano (2005) also used the Mexican and U.S. censuses and found evidence of positive selection on education of Mexican migrants in terms of mean differences. 79 Caponi (2010) offers an alternative explantion for the U-shaped relationship between migration and education based on the imperfect transferability and intergenerational transmision of human capital. See also Caponi (2011). 80 THE REVIEW OF ECONOMICS AND STATISTICS between migrants and nonmigrants beyond those characteristics used to predict education. Fernández-Huertas Moraga (2011) used data from the Encuesta Nacional de Empleo Trimestral (ENET) survey to address some of the limitations associated with using census data. The ENET survey follows households for five consecutive quarters and includes reports on whether any household member has left for the United States. Consequently, it is possible to examine the selectivity of recent migrants on their education attainment and actual earnings prior to migration. Fernández-Huertas Moraga reported finding negative selection on wages and ‘‘intermediate to negative selection’’ on education for men aged 16 to 65 (although he reports positive selection for migration out of rural Mexico).4 The ENET is not subject to the problems of undercounting of migrants and overreporting of migrant educational attainment that are present in U.S. Census. However, these data still miss persons in households in which all members have migrated to the United States. Orrenius and Zavodny (2005) used data from the Mexican Migration Project (MMP) to examine how various factors influence the selectivity of undocumented migrants over time.5 They found that migrants were drawn from the middle of the education distribution. They did not examine other measures of skill. They also found that a greater cost of migration, as measured by greater border enforcement, is associated with relatively lower immigration of low-skilled Mexicans. While the MMP includes both migrants and nonmigrants living in Mexico, it does not include Mexicanborn household heads who have migrated to the United States permanently. Furthermore, the data from the MMP are collected retrospectively, which may result in recall bias, and from a limited number of communities. Finally, McKenzie and Rapoport (2010) used data from the 1997 Encuesta Nacional de la Dinámica Demográfica (ENADID) survey to examine the selection of recent Mexican migrants on education.6 This survey has information on migrants in the United States who have a (coresident) family member living in Mexico but not on migrants who have permanently left Mexico and do not have family members who remained in Mexico. McKenzie and Rapoport (2010) also incorporated the migration history of the community, as a measure of the cost of migration, into the analysis. They reported that migrants are negatively selected on education in communities that have a high proportion of persons who migrated, but that selection on education is positive in communities that do not have high rates of migration. 4 See Fernández-Huertas Moraga (2009) for further analysis of differences between urban and rural Mexico. 5 Munshi (2003) used the MMP to show that Mexican migrants are more likely to be employed and hold a higher-paying job when their network in the United States is larger. 6 Durand, Massey, & Zenteno (2001) used the ENADID survey to show that Mexican migrants have become more negatively selected over time after controlling for cohort effects (though still predominantly drawn from the middle of the education distribution). This brief review highlights several gaps in the previous literature examining the selectivity of Mexican migrants. First, previous researchers have had to confront significant data limitations. We avoid most of these issues by using arguably the most appropriate data available to study the selectivity of Mexican migrants.7 For example, studies that used the U.S Census to identify Mexican migrants may miss a significant portion of migrants because of undercounting, and they may overstate the education of migrants if immigrants obtain additional schooling after arrival in the United States. Other studies that have used Mexican surveys generally miss permanent migrants or migrants who move to the United States with their entire households. The MxFLS allows us to identify all migrants, even those who have moved permanently to the United States or moved to the United States with their entire households, and avoids the potential undercount of Mexican migrants in the United States. Census. Furthermore, the MxFLS includes return migrants (those who will eventually return to Mexico) and does not depend on recall to identify migrants and time of migration. Second, most previous research has focused on the selectivity of migrants with respect to education or predicted earnings based on a limited set of observed characteristics such as age and education. Actual earnings, however, differ significantly from predicted earnings, and an important question is where in the actual earnings, or skill, distribution Mexican migrants come from. Insofar as the MxFLS has information on earnings of migrants before they leave Mexico, we can examine the selection of migrants with respect to actual earnings. We also consider the selectivity of migrants with respect to two components of earnings: the part of earnings determined by observed characteristics such as age, education, and marital status and the remaining (orthogonal) component of earnings. This is an interesting comparison because the selectivity of migrants with respect to these components of earnings may differ. Moreover, how the pattern of selection of migrants for these two components of earnings changes after adjusting for measures of the costs and benefits of migration provides additional information about the causes of migrant selection with respect to earnings and characteristics correlated with earnings such as education. Third, the MxFLS enables us to examine the selectivity of migrants with respect to other factors such as cognitive ability and health, as well as potential proxies for the costs of migration such as credit availability, family networks in the United States, and household assets. Considering a broader set of factors is important because of the important social and economic consequences related to the selectivity of migrants with respect to characteristics other than education and earnings. For example, the selectivity of migrants 7 Rubalcava et al. (2008) used these data to examine the health selectivity of migrants to the United States and found relatively little evidence of differential selection on health. SELF-SELECTION AND INTERNATIONAL MIGRATION 81 TABLE 1.—SUMMARY STATISTICS Men Migrated to United States Rural Age Years of schooling Worked during past year Annual earnings Cognitive ability Married In good health Relative in United States Household assets Know person/places to borrow Women Mean SD Observations Mean SD Observations 0.036 0.385 38.9 7.40 0.935 34,128 5.84 0.751 0.539 0.350 211,922 0.401 0.187 0.487 12.222 4.459 0.247 47,337 2.66 0.432 0.437 0.418 426,707 0.464 8,116 8,114 8,116 8,103 8,114 6,681 8,116 8,116 8,116 8,116 8,116 8,116 0.020 0.381 38.6 6.68 0.439 24,539 5.52 0.710 0.464 0.358 206,010 0.317 0.141 0.486 11.999 4.282 0.496 34,010 2.80 0.453 0.470 0.452 406,947 0.445 9,183 9,180 9,183 9,168 9,176 3,388 9,183 9,183 9,183 9,183 9,183 9,183 All summary statistics are calculated for Mexican males aged 21 to 65 and are unweighted. ‘‘Migrated to the United States’’ is an indicator variable for whether the individual moved to the United States between 2002 and 2006. ‘‘Rural’’ is an indicator variable for living in a community with fewer than 2,500 people. ‘‘Age’’ and ‘‘Years of schooling’’ are in years. Annual earnings’’ and ‘‘Household assets’’ are in 2002 pesos. ‘‘Cognitive ability’’ is the number of correct answers out of twelve on the Raven’s Progressive Matrices test. ‘‘Married’’ is an indicator variable for being married. ‘‘In good health’’ is an indicator variable for reporting one’s current health as ‘‘very good’’ or ‘‘good’’ rather than ‘‘regular,’’ ‘‘bad,’’ or ‘‘very bad.’’ ‘‘Relative in United States’’ is an indicator variable for having a relative in the United States. ‘‘Know person/places to borrow’’ is an indicator for reporting that one knows person or places to borrow. See the online data appendix for more details on the construction of these variable. Source: Mexican Family Life Survey. with respect to health has implications for local financing of health care in areas where immigrants are concentrated. Finally, few papers have directly assessed whether differences in earnings between Mexico and the United States are associated with the probability of migration, a direct test of economic models of migration. We examine whether differential labor market returns predict migration. While this analysis is subject to certain caveats, we believe it is useful in helping to identify the causes of selection with respect to earnings. Specifically, it is reasonable to interpret the pattern of migrant selection with respect to earnings that remains after adjusting for the dominant source of monetary benefit of migration as due to the relationship between earnings and the costs of migration. Similarly, we examine whether measures of costs of migration are significant determinants of migration and show how the pattern of migrant selection with respect to earnings changes after adjusting for the differences in the costs of migration. III. Data and Empirical Strategy A. Data The Mexican Family Life Survey is a longitudinal survey of households from geographically diverse communities in Mexico (Rubalcava & Teruel, 2006a, 2006b).8 The first round of the survey was conducted from April to July 2002 and collected information from a sample of approximately 8,400 households or 35,000 individuals across 150 different communities in 16 of Mexico’s 32 states. The baseline survey included both household information and individuallevel information for all members living in the household. The survey covered many topics: educational attainment, 8 Sampling weights are available that can be used to produce representative estimates at the national, urban-rural, and regional level. We do not use sampling weights in the main analyses because we stratify the sample by age and gender. However, the appendix table presents robustness checks that include sampling weights. labor market participation and earnings, networks, assets, credit availability, and cognitive ability. The second round of the survey began in mid-2005 and completed in 2006, with a 90% recontact rate. Those who had migrated to the United States were contacted at a rate of over 91%. Although information from the second-round interviews of migrants to the United States has not been released, an indicator for whether the individual has moved to the United States is available in the public-use data; note that all migrants should be identified even if they could not be found in the second round. In cases where entire households were no longer present at the original address, the interviewers searched for the final destination of that particular household by contacting close relatives, including those living in the United States. Indeed, approximately 13% of migrants in the MxFLS migrate to the United States with their entire households. We restrict the sample to persons aged 21 to 65 in the first round of the MxFLS.9 For analyses that focus on earnings, we restrict the sample to men because of the low labor force participation rates of women and the absence of earnings information for women who do not work. Table 1 presents summary statistics for this sample. The fraction of men who migrated to the United States over the three-year period between the first and second round of the MxFLS is 3.6%, or 295 of 8,116 men. Among women, 2% migrated during the same period. These rates of migration are consistent with other sources. Hanson and McIntosh (2010) used consecutive Mexican censuses to estimate migration rates and reported ten-year migration rates of approximately 12% to 14% for young men in their 20s and 5% to 10% for men in their 30s. In our sample, the three-year migration rates for men in their 20s and 30s are 5.6% and 3.3%, respectively, and are therefore consistent with those reported by Hanson 9 Given our focus on education and earnings, analyses that include persons under age 21 are complicated by the fact that education has not been completed for a nontrivial fraction of young people and that many young people do not work. 82 THE REVIEW OF ECONOMICS AND STATISTICS and McIntosh (2010). Passel and Cohn (2009) used CPS data to estimate that approximately 550,000 Mexican immigrants arrived in the United States each year between 2003 and 2006. This figure represents about 1.1% of the nonelderly adult population of Mexico. In the MxFLS, approximately 2.8% of the baseline adult (male and female) sample migrated between 2002 and 2005, which is roughly equivalent to the annual rate reported by Passel and Cohn. Unfortunately, the actual number of migrants in the MxFLS is relatively small and prevents analysis of subsamples such as by urban and rural status. The remaining variables in table 1 are from the first round of the MxFLS in 2002, and are reported by the individual, the head of household, or proxy through some other knowledgeable person in the household. We combine information from these different sources to fill in missing values and maximize the sample size.10 Notably, there is substantial agreement in information that can be derived from selfreports, proxy reports, and roster reports (those from the head of household). For example, the correlation between self-reported years of schooling, proxy-reported years of schooling, and roster-reported years of schooling is 0.94. Similar agreement is found with respect to whether a person worked in the past year. There is also considerable agreement between self-reported, proxy-reported, and rosterreported earnings, though there is greater scope for measurement error because individual reports of annual earnings include earnings in the main job as well as any earnings received from a secondary job, and several materially important sources of earnings such as profit sharing and Christmas bonuses.11 We assess the sensitivity of our findings with respect to earnings in two ways. First, we drop all persons without self-reported earnings. Second, we predict earnings for those without self-reported or proxyreported earnings using estimates from a regression of selfreported or proxy-reported earnings on roster-reported earnings and other covariates. In both cases, results, which are presented in the appendix table and discussed below, are similar to those reported in the text. The proportion of our sample living in rural areas is approximately 38%, a little lower than the 41% of an analogous sample in the 2000 Mexican Census. The average age of men and women in our sample is about 39 years, a little higher than the 37.5 years of age in the 2000 Mexican Census. Furthermore, the average years of schooling of men and women in our sample is almost identical to the 7.3 and 6.7 years of schooling reported for a similar sample of men and women aged 21 to 65 in the 2000 Mexican Census. The MxFLS also provides information on cognitive ability, a 10 This is also the procedure recommended by the principal investigators of the MxFLS (Rubalcava & Teruel, 2006a). 11 Federal law requires firms to participate in a profit-sharing program in which employees receive 10% of the firm’s annual profits. Firms are also required to pay a year-end Christmas bonus (Aguinaldo) to all employees equivalent to at least two weeks of pay. dimension of skill that is usually not measured in most data sets. Cognitive ability is measured as the number of correct responses to twelve multiple-choice questions of a Raven’s (Raven, Court, & Raven, 1983) Progressive Matrices test, with an average of about five to six correct answers. This test is designed to assess general intelligence by measuring the ability to form perceptual relations and reason by analogy independent of language and formal schooling. Finally, there is information about marital status and a selfreported measure of health that we collapse to a dichotomous variable. In addition, the MxFLS contains unique information on plausible proxies for the costs of migration: presence of relatives in the United States, household assets, and availability of credit. Approximately 35% of individuals have relatives in the United States and about 30% and 40% of men and women, respectively, know persons or places from which to borrow. The presence of a relative in the United States is a direct measure of the potential availability of a network in the United States that will plausibly lower the costs of migration (Massey, 1986; Curran & Rivero-Fuentes, 2003; McKenzie & Rapoport, 2007). This variable is similar in spirit to the use of contemporaneous or historical migration rates in a person’s home community or state that has been used by previous researchers as a measure of migration networks (McKenzie & Rapoport, 2010; Woodruff & Zenteno, 2007). The other two variables are motivated by concerns that liquidity constraints prevent migration. Having access to credit and sufficient household assets, which we measure, likely alleviates those constraints and makes it possible to migrate if desired (Stark & Taylor, 1991; Orrenius & Zavodny, 2005; McKenzie & Rapoport, 2007). In the regression analysis, we collapse the measure of household assets into tertiles: households with low, medium, or high levels of assets. B. Empirical Strategy Our first objective is to describe differences between migrants and nonmigrants with respect to several characteristics related to labor market skill: education, age (as a proxy for experience), and cognitive ability. All of these factors have an independent association with wages in both Mexico and the United States. We highlight the characteristics of migrants most closely related to earnings, often referred to as measures of skill, because of the large role that earnings play in determining the benefits of migration and thus the decision to migrate. In addition, we describe differences between migrants and nonmigrants with respect to several other characteristics, including health, marital status, and whether the person lives in an urban or rural area. The availability of earnings prior to migration also allows us to examine the selectivity of migrants with respect to earnings directly instead of having to predict premigration earnings of migrants using a limited number of covariates. SELF-SELECTION AND INTERNATIONAL MIGRATION Further, we can decompose earnings into two parts: the part correlated with observed measures of skill (age, education, and so on) and the part uncorrelated with these skill measures (residual earnings). While this decomposition is somewhat arbitrary and depends on what observed factors are used to partition earnings, we do it for two reasons. First, it provides a link to past studies that did not have direct measures of earnings and therefore had to use a predicted measure of earnings. Second, the selectivity of migrants with respect to the two components of earnings may differ, and these differences can provide information as to the causes of selection with respect to characteristics used to partition earnings. We also explore how these results are affected by using different sets of observed factors to partition earnings. To decompose earnings into two parts, we estimate earnings regression models using two different specifications. The first specification is a saturated model that includes a complete set of interactions between indicator variables for age (five categories), education (five categories), and marital status (two categories). The second specification, which is also a fully interacted specification, adds indicators for cognitive ability (five categories) and health (two categories). The predicted values from these regressions represents the part of earnings correlated with the measured variables; the residual from these regressions is the remaining part of earnings. We divide each of these components of earnings, as well as total earnings, into quintiles to assess the pattern of migrant selection with respect to these earnings measures. Our second objective is to examine how patterns of selection with respect to earnings change when explicit measures of costs and benefits of migration are included in the regression model. This analysis is intended to shed light on the causes of selection of migrants with respect to earnings. The logic underlying the analysis is straightforward. Migrant selection with respect to any characteristic, for example, earnings, depends on how that characteristic is related to both costs and benefits of migration. Therefore, including measures of one or more of the benefits and costs of migration will affect the pattern of migrant selection with respect to earnings if earnings are correlated with costs and benefits. We acknowledge that this analysis is subject to important caveats because some measures of the costs and benefits of migration may themselves depend on migration. For example, a household’s future plans about migration may influence asset accumulation, education, and other determinants of the costs and benefits of migration. This is a general problem that characterizes most research on migrant selection, and there are few good solutions. Below, we describe how we attempt to address the reverse causality problem when measuring the benefits 12 In the case of education, the potential bias is likely to reinforce the pattern of negative selection because higher returns to schooling in Mexico would lead future migrants to accumulate less education than similar nonmigrants (McKenzie & Rapoport, 2011). 83 of migration by differential labor market returns between the United States and Mexico and when using the presence of a relative (network) in the United States as a measure of the costs of migration.12 We measure the costs of migration using three variables: whether person has a relative in United States, whether a person has access to credit, and whether a person has a significant amount of assets. Earnings are likely to be related to these measures of costs, particularly whether a person has access to credit and the amount of household assets.13 We assess these associations directly and discuss them later in the paper, but we are mainly interested in the mediating effect of these variables on the pattern of migrant selection with respect to earnings. We measure the benefit to migration by calculating differential labor market returns between the United States and Mexico to several characteristics known to influence earnings: age, education, and marital status. We allow the U.S.–Mexico difference in labor market returns to differ by Mexican state and urban or rural status because previous studies have shown that labor market returns to age and education differ by these geographic divisions (Ibarraran & Lubotsky, 2007). IV. Results A. Selection of Migrants with Respect to Age, Education, Cognitive Ability, and Other Characteristics We begin by describing the nature of selection of migrants by education, age, cognitive ability, health, and other characteristics. Table 2 shows the proportion of migrants and nonmigrants by several characteristics. On average, migrants are younger than nonmigrants, and there is a steep gradient in the likelihood of migrating by age, particularly among men. Compared to nonmigrants, migrants are significantly overrepresented among those aged 21 to 29 and significantly underrepresented among those aged 48 to 65. Given that earnings rise with age and that the labor market returns to age are relatively higher in the United States than in Mexico (authors’ calculations), a simple theory of migration would predict that older persons would be more likely to migrate. However, the longer time horizon and the likely lower costs of migration due to community and family attachments may explain much of why younger workers are so much more likely to migrate. Indeed, age is a good example of how the nature of selection on a characteristic depends critically on the correlation between that characteristic and both the benefits and costs of migration. 13 McKenzie and Rapoport (2007) develop a model in which the association between household assets and migration is U-shaped for agricultural (rural) workers. The intuition from this model is that at low levels of wealth (assets), increases in wealth facilitate migration because the family has the capability to finance the move. At higher levels of wealth, increases in wealth reduce migration because of the loss of household production of the migrant. 84 THE REVIEW OF ECONOMICS AND STATISTICS TABLE 2.—SELECTED CHARACTERISTICS BY MIGRATION STATUS MEN Age 21–29 years old 30–38 years old 39–47 years old 48–56 years old 57–65 years old Education 0–3 years of schooling 4–6 years of schooling 7–9 years of schooling 10–12 years schooling More than 12 years of schooling Cognitive ability 5th (bottom) quintile 4th quintile 3rd quintile 2nd quintile 1st (top) quintile Health status Not good Good Urban or rural status Rural Urban Marital status Not married Married Family networks No relative in United States Relative in United States Credit availability No borrowing options Have borrowing options Household assets 3rd (bottom) tertile 2nd (middle) tertile 1st (top) tertile WOMEN Fraction of Migrants Differences from Reference Category Fraction of Migrants Differences from Reference Category 0.0565 0.0335 0.0367 0.0184 0.0148 0.0230*** 0.0199*** 0.0381*** 0.0417*** 0.0294 0.0201 0.0159 0.0146 0.0129 0.0093** 0.0135*** 0.0148*** 0.0165*** 0.0295 0.0413 0.0484 0.0346 0.0173 0.0118** 0.0189*** 0.0050 0.0123** 0.0149 0.0223 0.0224 0.0260 0.0171 0.0074** 0.0075* 0.0111** 0.0022 0.0327 0.0302 0.0391 0.0396 0.0353 0.0025 0.0064 0.0069 0.0026 0.0163 0.0166 0.0215 0.0231 0.0235 0.0003 0.0052 0.0067 0.0072 0.0377 0.0293 0.0084* 0.0202 0.0187 0.0015 0.0560 0.0240 0.0320*** 0.0260 0.0167 0.0093*** 0.0450 0.0335 0.0116** 0.0252 0.0183 0.0070** 0.0228 0.0523 0.0295*** 0.0075 0.0410 0.0335*** 0.0358 0.0401 0.0043 0.0191 0.0237 0.0046 0.0313 0.0449 0.0328 0.0136** 0.0015 0.0140 0.0218 0.0251 0.0078** 0.0111*** ***, ** and * indicate statistically significant differences between migrants and nonmigrants using a t-test at the 1%, 5%, and 10%, level respectively. The sample is Mexican males aged 21 to 65 and unweighted. Source: Mexican Family Life Survey. With respect to education, table 2 indicates that rates of migration are significantly higher among those in the middle of the education distribution (four to nine years of education) than among those with the least amount of education. This is similar to the findings in Chiquiar and Hanson (2005), Orrenius and Zavodny (2005), and Fernández-Huertas Moraga (2011). It is also the case that most migrants come from the middle of the education distribution: 63% of male migrants and 58% of female migrants have between four and nine years of schooling. In contrast to age and education, migrants are relatively similar to nonmigrants with respect to cognitive ability. Rates of migration do not differ significantly by categories of cognitive ability, although there is some evidence of positive selection for women. The absence of any strong evidence for the selection of migrants with respect to cognitive test scores is surprising given the selection on education and the relatively high positive correlation between years of schooling and cognitive test score (r ¼ 0.49). This finding implies that education and cognitive ability do not have similar associations with the costs and benefits of migration even though the two variables are correlated.14 There is also only limited support for selection with respect to health. Among males, those who report better health are less likely to migrate than those who report poorer health, but the difference is only marginally significant. For women, self-reported health status is unrelated to the probability of migration. Moreover, there were no statistically significant correlations between the probability of migrating and several other measures of health, including objective measures of health, such as obesity, high blood pressure, and low hemoglobin. These findings were also reported by Rubalcava et al. (2008), but they are notable because of substantial evidence showing that at the time of 14 Notably, cognitive ability is significantly related to earnings among males. In a simple bivariate regression, each correct answer on the Raven’s test is associated with 7.6% higher earnings. Controlling for education, age, and state of residence, each correct answer is associated with only 1.9% higher earnings, but this estimate remains statistically significant. SELF-SELECTION AND INTERNATIONAL MIGRATION 85 TABLE 3.—ASSOCIATION BETWEEN MIGRATION AND EARNINGS DEPENDENT VARIABLE: MIGRATED TO UNITED STATES 2nd quintile 3rd quintile 4th quintile 5th quintile (1) (2) (3) (4) (5) (6) 0.005 [0.007] 0.003 [0.007] 0.009 [0.007] 0.015** [0.007] 0.006 [0.007] 0.005 [0.007] 0.013* [0.007] 0.018** [0.007] 0.005 [0.007] 0.001 [0.007] 0.005 [0.007] 0.005 [0.007] 0.025*** [0.007] 0.006 [0.007] 0.003 [0.007] 0.008 [0.007] 0.008 [0.007] 0.028*** [0.007] 0.005 [0.007] 0.003 [0.007] 0.009 [0.007] 0.015** [0.007] 0.006 [0.007] 0.005 [0.007] 0.013* [0.007] 0.018** [0.007] Differential returns Differential wage levels 0.000 [0.000] Relative in the United States Credit availability Household assets (medium) Household assets (high) Sample size 6,675 0.026*** [0.006] 0.006 [0.005] 0.008 [0.006] 0.000 [0.006] 6,675 6,675 0.026*** [0.006] 0.008* [0.005] 0.009 [0.006] 0.007 [0.006] 6,675 6,675 0.000 [0.000] 0.026*** [0.006] 0.006 [0.005] 0.008 [0.006] 0.000 [0.006] 6,675 Robust standard errors clustered at the household level are in brackets. ***, **, and * indicate statistical significance at the 1%, 5%, and 10 % level, respectively. The dependent variable is equal to 1 if migrated to the United States between 2002 and 2006, and 0 otherwise. The independent variables are quintiles of log annual earnings. Differential returns and wage levels constructed from Education. Age categories in the 2000 Mexican and U.S. censuses. All regressions are estimated using linear probability models for Mexican males aged 21 to 65, and are unweighted. Source: Mexican Family Life Survey. arrival, Mexican immigrants are healthier than U.S.-born persons of similar age and education. Figures in table 2 indicate significant selection of migrants with respect to several other characteristics. Rural residents have much higher rates of migration than urban residents, particularly among men. Married persons have significantly lower rates of migration than unmarried persons. Persons with a relative in the United States have significantly higher rates of migration than those without a relative in the United States. For men, there is a nonmonotonic relationship between household wealth and migration, with the highest rates of migration for men in the middle of the wealth distribution. This is consistent with the predictions and results in McKenzie and Rapaport (2007). For women, migration rates are highest from the top of the wealth distribution. Finally, there is some weak evidence for a positive association between migration and the availability of credit, but this is not significant. To summarize, the estimates in table 2 show that Mexican migrants differ from nonmigrants along several observable dimensions: migrants tend to be younger and unmarried, overrepresented in the middle of the education distribution, and come from rural areas. Surprisingly, migrants and nonmigrants tend to have similar distributions of cognitive test scores and similar health. We note that the finding of selection from the middle of the education distribution is similar to some previous research and is somewhat of a puzzle given that the returns to education in Mexico are generally higher than in the United States. All else equal, we expect migrants to come from the lowest-educated groups. However, table 2 also shows significant differences between migrants and nonmigrants with respect to a number of other variables that are correlated with education and the probability of migrat- ing, such as age, health, and rural status. Thus, it may be difficult to infer the causes of the association between education and migration from the simple association between education and migration reported in table 2. B. Selection of Migrants with Respect to Earnings One of the advantages of the MxFLS survey is the availability of earnings prior to migration for both migrants and nonmigrants. This information allows us to examine the pattern of migrant selection with respect to actual earnings instead of earnings predicted from a few observed characteristics. Table 3 presents estimates of the selection of migrants with respect to earnings for males. We do not use the sample of females because of high rates of labor market nonparticipation and the absence of earnings for those who do not work. For males, approximately 90% of both migrants and nonmigrants work, and there are no statistically or economically significant differences in the probability of working between male migrants and nonmigrants. We divide persons into earnings quintiles and examine the association between the probability of migration and these quintiles of earnings. Results are not qualitatively different if we use quartiles or tertiles of earnings instead of quintiles (not shown but available on request). Estimates in column 1 of table 3, which are unadjusted for any other variables, indicate a pattern of negative selection with respect to earnings. The probability of migrating is substantially lower among men in the top two quintiles of the earnings distribution (42% and 25% respectively) relative to the bottom quintile, although only the estimate pertaining to the top quintile is statistically significant. This result is similar to the only other study to examine earnings 86 THE REVIEW OF ECONOMICS AND STATISTICS directly; Fernández-Huertas Moraga (2011) reported that migrants were negatively selected on wages with low earners most likely to migrate. This pattern of selection of migrants with respect to earnings suggests that earnings are negatively associated with the benefits of migration or positively associated with the costs of migration. In column 2, we present estimates of the pattern of migrant selection with respect to earnings after adjusting for differences in three measures related to the costs of migration: whether a person has a relative in United States, whether a person has access to credit, and whether a person has significant assets. Estimates in column 2 suggest a slightly stronger pattern of negative selection with respect to earnings than estimates in column 1, with coefficients on the top two quintiles more economically and statistically significant. The change in the pattern of selection between columns 1 and 2 is due to the positive association between earnings and access to credit and household assets; those with higher earnings are more likely to have access to credit and have more household assets and are therefore more likely to migrate. There is little systematic association between earnings and the presence of a relative in the United States, although it is a strong predictor of migration.15 In addition, the slightly stronger pattern of negative selection evident in column 2 than in column 1 implies that the benefits of migration are negatively related to earnings. To assess whether the benefits of migration are a possible explanation for the pattern of migrant selection with respect to earnings, we construct a measure of the benefits of migration based on differences in earnings in Mexico and the United States. We do this by calculating the differential in labor market returns between recent Mexican immigrants in the 2000 U.S. Census and Mexican natives in the 2000 Mexican Census. We focus on labor market returns to determinants that are observed in both data sets: age, education, and marital status. We also allow these labor market returns to vary by urban and rural status and state of residence in Mexico. Specifically, we estimate Mincer-type earnings regressions that include a full set of interactions between age, education, and marital status and allow these effects to differ by urban and rural status and state of residence in Mexico. We then merge the differences in estimated returns between the United States and Mexico by age, education, and marital status. Adjusting for differential labor market returns to skill substantially alters the pattern of migrant selection on earnings. Estimates in column 3 of table 3 indicate little selection of migrants with respect to earnings. If anything, there 15 An important concern is that our proxies for migration costs, such as having a relative in the United States, are correlated with unobserved earnings ability abroad. Previous researchers (starting with Woodruff & Zentano, 2007) have used historical migration rates from geographic areas to proxy for the size and quality of migration networks. We followed this literature and used historical migration rates from 1955 to 1959 to instrument for whether a person has a relative in the United States. These results, not shown, were very similar to those reported in text and are available on request. is some evidence that persons in the lowest quintile (reference category) are more likely to migrate than those in other parts of the earnings distribution. The change in the pattern of selection after this adjustment suggests that differential labor market returns are an important determinant of migration and the primary explanation of the negative selection of migrants with respect to earnings. Differential labor market returns are also significantly correlated with migration; a 1-unit increase (100 percentage point) in the differential return increases the probability of migrating by 2.5 percentage points, or approximately 66% of the mean. Estimates in column 4 of table 3 come from a model that includes measures of the cost of migration and the differential return to labor market characteristics. The pattern of selection revealed by estimates in this column is similar to that shown by estimates in column 3 and indicate a slight degree of negative selection of migrants with respect to earnings mainly due to a 15% higher rate of migration among those in the bottom quintile of the earnings distribution relative to those in other parts of the earnings distribution. It is important to acknowledge the inherent selection bias associated with these simple calculations of the labor market returns to migrants and nonmigrants from Mexico. After all, the empirical evidence presented in this paper demonstrates that Mexican migrants to the United States are selected on a variety of observed characteristics. Consequently, our estimates of the labor market return to skill for Mexican migrants in the United States may not represent the actual returns faced by a random person in Mexico.16 Addressing this selection problem in a compelling manner is extremely challenging, but we do try to assess the severity of the problem with two alternative approaches. First, we obtained selection-corrected estimates of the labor market return to skill in the United States using Heckman’s (1979) two-step estimator.17 Second, we calculated the returns to skill for alternative samples in the U.S. Census: all Mexican immigrants in the United States (as opposed to only recent immigrants who arrived in the previous ten years), other Spanish-speaking immigrants from Central and South America, and U.S. natives who report Mexican ethnicity. In all cases, results from these alternative approaches (not shown but available on request) yielded findings very similar to those presented below. While we cannot rule out the possibility that the selection problem is confounding our estimates, we believe it is unlikely that the conclusions of the 16 However, to the extent that individuals in Mexico are not aware of the selection problem and make their migration decisions on the basis of the observed returns of migrants to the United States, these biased estimates are the appropriate ones to include in the model. 17 In particular, we used a probit regression to predict migration from Mexico based on age and education categories as well as marital status and number of children (assumed to be excluded from the second-stage model). We then used the parameters of this model to construct the inverse Mills ratio associated with the probability of being a migrant for the sample of Mexican migrants drawn from the U.S. Census and reestimated the earnings regressions, including the selection term. SELF-SELECTION AND INTERNATIONAL MIGRATION analysis would change qualitatively even if we could more effectively address the selection bias. It is also important to note that we have followed the model of Borjas (1987) to measure the earnings benefit of migration by using differences in the relative return to skill (log differences in earnings). We consider an alternative characterization of earnings benefits proposed by Grogger and Hanson (2011), for which we use skill-specific difference in the level of earnings between the United States and Mexico. In the last two columns of table 3, we present estimates of the pattern of migrant selection with respect to earnings from regression models that use the level difference in labor market returns of observed characteristics instead of the log difference. In this case, the differential labor market returns are not significantly related to the probability of migration and adding them to the regression has virtually no impact on the pattern of selection. Overall, estimates in table 3 indicate that Mexican migrants are negatively selected with respect to earnings and suggest that a potential cause for this pattern of selection is due to differences in the benefits of migration that are related to earnings. High earners in Mexico have smaller relative benefits than low earners, as measured by differential labor market returns, and this makes them less likely to migrate. Costs of migration, as measured by the presence of family networks in the United States, availability of credit, and household assets, do not have a very strong influence on the pattern of selection with respect to earnings. However, there is some evidence that differential costs of migration may slightly weaken the negative selection which suggests that earnings are negatively related to these costs measures. Finally, in the appendix table, we explore whether our main results are robust to different specifications of earnings. Specifically, we assess the sensitivity of our findings to the following: limiting the sample to those with selfreported earnings; imputing earnings for those without selfreported or proxy earnings by predicting earnings using estimates from a regression of self-reported or proxy earnings on roster-reported earnings and other covariates (education, age, state of residence, marital status, and urban or rural status); using sample weights to estimate weighted least squares; and limiting the sample to young men (using combined information on earnings from self reports and household roster). Overall, estimates in the appendix table are consistent with those in table 3 and indicate negative selection of migrants with respect to earnings. There is somewhat stronger evidence of negative selection among younger men than among the full sample (or older men), and there is somewhat weaker evidence of negative selection when only those with self-reported earnings are used instead of the full sample with earnings information from either self-reports or the household roster. The use of imputed earnings or sample weights has little effect on the estimated pattern of migrant selection with respect to earnings. Our results (not shown but available on request) are 87 also robust to excluding past migrants, return migrants, and individuals who migrated with their entire households. C. Selection of Migrants with Respect to Predicted Earnings and Residual Earnings As noted in our literature review, previous studies examining Mexican migrant selection that lacked information on pre-migration earnings in Mexico have used predicted earnings to assess the selection of migrants with respect to skill, or human capital. To link our study to this prior work, we also construct predicted earnings using labor market returns to several observed components of skill. We construct this index of observed skill by regressing observed earnings on various combinations of age, education, marital status, cognitive test scores, and health to form a predicted measure. However, because we have information on actual earnings, we can also examine migrant selection with respect to residual earnings—the part of earnings uncorrelated with observed components of skill. We assess whether the probability to migrate differs by these measures of predicted earnings and residual earnings. Again, we divide predicted and residual earnings into quintiles, although results are similar when using quartiles or tertiles. Table 4 presents regression estimates of the association between the probability of migrating and predicted and residual earnings. Panel A of table 4 presents estimates from a model that uses age, education, and marital status to predict earnings (and residual earnings). Panel B presents estimates’from a model that uses age, education, marital status, cognitive ability, and health to predict earnings. For each dependent variable, we estimate two models—one without covariates and one that includes the three measures related to the cost of migration: whether person has a relative in United States, whether a person has access to credit, and whether a person has a significant amount of assets. In this analysis, we do not include differential labor market returns as a measure of the benefits of migration because of the high degree of collinearity between predicted earnings and differential labor market returns, which are constructed from the same set of observed characteristics. The estimates in these specifications are all relatively imprecise, in part because we adjust the standard errors to account for the two-step procedure in which predicted and residual earnings are estimated from a separate regression.18 The point estimates in column 1 of the panel A of table 4 provide some suggestive evidence for selection from the middle of the predicted earnings distribution, with individuals in the third quintile about 1 percentage point more likely to migrate to the United States relative to the bottom quintile. However, this estimate is not statistically significant. The probability of migrating is 1.3 percentage points lower among those in the top quintile of the predicted earn18 Specifically, we bootstrap the standard errors by performing 500 replications of the two-step estimation procedure. 88 THE REVIEW OF ECONOMICS AND STATISTICS TABLE 4.—ASSOCIATION BETWEEN MIGRATION AND PREDICTED AND RESIDUAL EARNINGS DEPENDENT VARIABLE: MIGRATED TO THE UNITED STATES Predicted Earnings (1) 2nd quintile 3rd quintile 4th quintile 5th quintile Relative in the United States Credit availability Household assets Sample size 2nd quintile 3rd quintile 4th quintile 5th quintile Relative in the United States Credit availability Household assets Sample size Residual Earnings (2) (3) A. Partitioned by Age, Education, and Marriage 0.007 0.008 0.002 [0.011] [0.011] [0.008] 0.010 0.010 0.002 [0.011] [0.010] [0.008] 0.005 0.004 0.006 [0.009] [0.009] [0.008] 0.013* 0.014* 0.012* [0.008] [0.008] [0.007] X X X 6,678 6,678 6,678 B. Partitioned by Age, Education, Marriage, Cognitive Abiility, Health 0.003 0.003 0.001 [0.009] [0.010] [0.009] 0.009 0.005 0.007 [0.009] [0.008] [0.007] 0.013 0.007 0.008 [0.009] [0.009] [0.008] 0.004 0.013* 0.014** [0.007] [0.007] [0.007] X X X 6,678 6,678 6,678 (4) 0.004 [0.008] 0.000 [0.008] 0.007 [0.008] 0.016** [0.007] X X X 6,678 0.003 [0.008] 0.009 [0.008] 0.008 [0.008] 0.014** [0.007] X X X 6,678 Robust standard errors clustered at the household level are in brackets. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. The dependent variable is equal to 1 if migrated to the United States between 2002 and 2006, and 0 otherwise. The independent variables are quintiles of predicted log earnings and wages, constructed by regressing annual earnings or hourly wages on (five edlucation. (five age, and marital status, fully interacted. All regressions are estimated using linear probability models for Mexican males aged 21 to 65 and are unweighted. Source: Mexican Family Life Survey. ings distribution, and this estimate is marginally significant. including the three measures of migration costs in the model has little effect on the pattern of selection with respect to predicted earnings, as evidenced by the similarity of estimates in columns 1 and 2 of the top panel of table 4. This result suggests that the selection from the middle of the predicted earnings distribution is not caused by the correlation between predicted earnings and these proxies for costs of migration. The point estimates in columns 3 and 4 of the top panel of table 4 do not indicate selection from the middle. Instead, we find evidence of negative selection of migrants with respect to residual earnings, or the part of earnings uncorrelated with age, education, and marital status. Males in the top quintile of residual earnings have a migration rate that is 1.2 percentage points greater than males in the bottom quintile, although the estimate is only marginally significant. The addition to the model of the three measures of the costs of migration slightly increases the degree of negative selection, and the coefficient on the top quintile becomes statistically significant at conventional levels. Thus, migration costs, as measured, are not a particularly important influence underlying the pattern of negative selection on residual earnings. In panel B of table 4, we constructed predicted earnings using cognitive ability and health in addition to age, education, and marital status. According to this alternative specification, selection from the middle of the predicted earnings distribution, is more apparent in the third and fourth quintiles, and there is no evidence of negative selection in the absence of controls for the costs of migration. However, adding these measures in column 2 in the panel B of table 4 leads to substantially less selection from the middle of the predicted earnings distribution and indicates that those in the top quintile of predicted earnings are the least likely to migrate. The pattern of selection by residual earnings is qualitatively similar between the top and bottom panels of table 4, with those in the top quintile of the residual earnings distribution being 1.4 percentage points less likely to migrate as compared to those in the bottom quintile. As estimates in the panels indicate, the pattern of selection with respect to predicted (and residual) earnings changes along with the variables used to construct these measures. However, is an analysis of migrant selection with respect to these measures useful? From a descriptive, policy point of view, these measures are of limited use because predicted and residual earnings are arbitrary distinctions and migrants’ human capital, or skill, is most accurately characterized by actual earnings. Moreover, it is difficult to link this pattern of selection and the underlying cause of it to specific characteristics because predicted earnings are a function of several factors. Those in the middle of the predicted earnings distribution have different ages, levels of education, and marital status. Therefore, it is not possible to link any specific characteristic to the pattern of selection or the potential cause of selection. The same issues apply to the interpretation of SELF-SELECTION AND INTERNATIONAL MIGRATION the pattern of selection with respect to residual earnings. For this measure, estimates suggest that the benefits of migration are largest for those in the lower end of the residual earnings distribution. However, those in the lower part of the residual earnings distribution are also a mix of persons with different observed characteristics, and it is difficult to link these characteristics to potential explanations of the cause of migrant selection. V. Conclusion In this paper, we presented new evidence of the nature of selection of Mexican migrants to the United States during the period 2002 to 2005. Our analysis was based on a previously unused, high-quality data source, the Mexican Family Life Survey (MxFLS), which is ideally suited to study the selection of Mexican migrants. In particular, the MxFLS has information on the education, cognitive ability, health, and prior earnings of Mexican migrants, including those in which the entire household migrated to the United States. These data allow us to examine the nature of selection of recent migrants on a more comprehensive set of characteristics than used in previous studies. Furthermore, the MxFLS contains variables that are good proxies for the costs of migration, and we use this information to help identify the pattern of selection of migrants with respect to earnings. Descriptive analyses found that Mexican migrants, both males and females, were more likely to be young and unmarried, to come from the middle of the education distribution (four to nine years of schooling), and to come from rural areas. Interestingly, there was relatively little selection of migrants with respect to cognitive ability and health, although among women, there was some evidence that migrants were more likely to be positively selected among those with greater cognitive ability. With respect to earnings, which we examined only for males because of low labor force participation rates among females, migrants tend to come from the first through third quintiles of the earnings distribution. Analyses that investigated the causes of migrant selection with respect to earnings revealed that the pattern of migrant selection with respect to earnings may be primarily due to the association between earnings and labor market benefits of migration. Persons in the upper end of the earnings distribution had larger relative returns to labor market skills than persons in the lower end of the earnings distribution. Controlling for these differences greatly diminished the negative pattern of selection with respect to earnings. In addition, differential labor market returns was a large and significant predictor of migration. In contrast, three measures of the costs of migration were not strongly related to earnings, and adjusting for differences in these variables had relatively little effect on the pattern of migrant selection with respect to earnings. Estimates indicated that earnings were negatively related to these costs of migration. Overall, our results suggest that there is negative migrant selection 89 with respect to earnings, or skill, and that this selection is largely driven by differential benefits associated with earnings. The finding of negative selection with respect to earnings that is largely the result of higher benefits of migration for those with relatively low earnings is consistent with the most basic economic model of migrant selection such as that presented by Borjas (1987). We also presented analyses of the pattern of selection with respect to predicted and residual earnings, which is similar to analyses conducted in some previous papers on the selection of Mexican migrants. The primary explanation for the use of predicted earnings is the absence of earnings information in Mexico for both migrants and nonmigrants. These analyses indicated that Mexican migrants are selected from the middle of the predicted earnings distribution and the lower end of the residual earnings distribution. No previous study has analyzed the selection with respect to residual earnings and the difference in the pattern of migrant selection with respect to predicted and residual earnings, is interesting. Notably, and not surprisingly, these findings were somewhat sensitive to the manner in which earnings were partitioned into predicted and residual earnings. The arbitrary nature of the division of earnings into predicted and residual earnings and the difference in the pattern of migrant selection between predicted and actual (residual) earnings raises questions about the efficacy of using predicted earnings as a measure of skill to assess migrant selection with respect to skill. Similar concerns arise with respect to using individual components of skill such as education to assess migrant selection with respect to skill. In sum, we have used arguably the most appropriate data to examine the pattern of Mexican migrant selection with respect to several characteristics of interest to policy and theory. However, our study, like most previous studies, has some limitations. We use a sample of recent migrants, from 2002 to 2006, and this sample may not accurately characterize the full population of migrants, in the United States (although the pattern of selection among recent migrants may be most relevant for informing current and future policy). The sample of migrants is also too small to assess whether patterns of migrant selection differ by urban or rural status, network availability, and other potential causes of heterogeneous associations. In addition, the measures of the costs and benefits of migration we used in our study have certain shortcomings. For example, whether a person has a relative in the United States may be related to unobserved factors that affect the migration of both the person and the relative. While we cannot rule out this possibility, we note that our results (not shown) are similar when we instrumented for this variable using historical migration rates. Finally, we were able to make only limited headway toward identifying the cause of migrant selection with respect to earnings. This is a complicated problem that has been underappreciated in the literature. Migrant selection depends on the costs and benefits of migration, and identi- 90 THE REVIEW OF ECONOMICS AND STATISTICS fying the pattern of selection with respect to any characteristic requires a full specification of the causal links between the variable of interest and costs and benefits, and between the variable of interest and other variables correlated with it and with costs and benefits of migration. Almost no research has addressed this more fundamental problem, which should be the focus of future research in this area. REFERENCES Borjas, George J., ‘‘Self-Selection and the Earnings of Immigrants,’’ American Economic Review 77:4 (1987), 531–553. Borjas, George J., ‘‘The Labor Demand Curve Is Downward Sloping: Reexamining the Impact of Immigration on the Labor Market,’’ Quarterly Journal of Economics 118:4 (2003), 1335–1374. Borjas, George J., Richard B. Freeman, and Lawrence F. Katz, ‘‘Searching for the Effect of Immigration on the Labor Market,’’ American Economic Review 86:2 (1996), 246–251. Caponi, Vincenzo, ‘‘Heterogeneous Human Capital and Migration: Who Migrates from Mexico to the US?’’ Annals of Economics and Statistics 97/98 (2010), 207–234. ——— ‘‘Intergenerational Transmission of Abilities and Self-Selection of Mexican Immigrants,’’ International Economic Review 58:2 (2011), 523–547. Card, David, ‘‘Is the New Immigration Really So Bad?’’ Economic Journal 115:207 (2005), 300–323. Chiquiar, Daniel, and Gordon H. Hanson, ‘‘International Migration, SelfSelection, and the Distribution of Wages: Evidence from Mexico and the United States,’’ Journal of Political Economy 113:2 (2005), 239–281. Curran, Sara R., and Estela Rivero-Fuentes, ‘‘Engendering Migrant Networks: The Case of Mexican Migration,’’ Demography 40:2 (2003), 289–307. Durand, Jorge, Douglas S. Massey, and Rene M. Zenteno, ‘‘Mexican Immigration to the United States: Continuities and Changes,’’ Latin American Research Review 36:1 (2001), 107–127. Feliciano, Cynthia, ‘‘Educational Selectivity in United States. Immigration: How do Immigrants Compare to Those Left Behind?’’ Demography 42:1 (2005), 131–152. Fernández-Huertas Moraga, Jesus, ‘‘Wealth Constraints, Skill Prices or Networks: What Determines Emigrant Selection?’’ mimeograph (2009). ——— ‘‘New Evidence on Emigrant Selection,’’ this REVIEW 93:1 (2011), 72–96. Grogger, Jeffrey, and Gordon H. Hanson, ‘‘Income Maximization and the Selection and Sorting of International Migrants,’’ Journal of Development Economics 95:1 (2011), 42–57. Hanson, Gordon H., and Craig McIntosh ‘‘The Great Mexican Emigration,’’ this REVIEW 92:4 (2010), 798–810. Heckman, James J., ‘‘Sample Selection Bias as a Specification Error,’’ Econometrica 47:1 (1979), 153–161. Ibarraran, Pablo, and Darren Lubotsky, ‘‘Mexican Immigration and SelfSelection: New Evidence from the 2000 Mexican Census,’’ in G. J. Boras (ed.), Mexican Immigration to the United States (Chicago: University of Chicago Press, 2007). LaLonde, Robert J., and Robert H. Topel, ‘‘Economic Impact of International Migration and the Economic Performance of Migrants,’’ in M. R. Rosenzweig and O. Stark (eds.), Handbook of Population and Family Economics, vol. 1B (Amsterdam: Elsevier Science, 1997). Massey, Douglas S., ‘‘The Settlement Process among Mexican Migrants to the United States,’’ American Sociological Review 51:5 (1986), 670–685. McKenzie, David, and Hillel Rapoport, ‘‘Network Effects and the Dynamics of Migration and Inequality: Theory and Evidence from Mexico,’’ Journal of Development Economics 84:1 (2007), 1–24. ——— ‘‘Self-Selection Patterns in Mexico-U.S. migration: The role of migration Networks,’’ this REVIEW 92:4 (2010), 811–821. ——— ‘‘Can Migration Reduce Educational Attainment? Evidence from Mexico,’’ Journal of Population Economics 24:4 (2011), 1331– 1358. Munshi, Kaivan, ‘‘Networks in the Modern Economy: Mexican Migrants in the US Labor Market,’’ Quarterly Journal of Economics 118:2 (2003), 549–599. Orrenius, Pia M., and Madeline Zavodny, ‘‘Self-Selection among Undocumented Immigrants from Mexico,’’ Journal of Development Economic 78:1 (2005), 215–240. Passel Jeffrey, and D’Vera Cohn, ‘‘Mexican Immigrants: How Many Come? How Many Leave?’’ Pew Hispanic Center report (July 22, 2009). Raven, John C., John H. Court, and John E. Raven, Manual for Raven’s Progressive Matrices and Vocabulary Scales: Standard Progressive Matrices (London: Lewis, 1983). Roy, A. D., ‘‘Some Thoughts on the Distribution of Earings,’’ Oxford Economics Papers, n.5. 3 (1951), 135–146. Rubalcava, Luis N., and Graciela M. Teruel, ‘‘User’s Guide for the Mexican Family Life Survey First Wave,’’ (2006a), www.mxfls.uia.mx. ——— ‘‘Mexican Family Life Survey, Second Round,’’ working paper (2006b), www.ennvih-mxfls.org. Rubalcava, Luis N., Graciela M. Teruel, Duncan Thomas, and Noreen Goldman, ‘‘The Healthy Migrant Effect: New Findings From the Mexican Family Life Survey,’’ American Journal of Public Health 98:1 (2008), 78–84. Sjaastad, Larry A., ‘‘The Costs and Returns of Human Migration,’’ Journal of Political Economy 70:5, pt. 2 (1962), 80–93. Stark, Oded, and J. Edward Taylor, ‘‘Migration Incentives, Migration Types: The Role of Relative Deprivation,’’ Economic Journal 101:408 (1991), 1163–1178. Woodruff, Christopher M., and Rene M. Zenteno, ‘‘Remittances and Micro-Enterprises in Mexico,’’ Journal of Development Economics 82:2 (2007), 509–528. SELF-SELECTION AND INTERNATIONAL MIGRATION 91 TABLE APPENDIX—ROBUSTNESS CHECKS FOR THE ASSOCIATION BETWEEN MIGRATION AND EARNINGS DEPENDENT VARIABLE: MIGRATED TO THE UNITED STATES Imputed Earnings (1) 2nd quintile 3rd quintile 4th quintile 5th quintile Relative in the United States Credit availability Household assets Differential returns Sample size 0.001 [0.013] 0.001 [0.010] 0.006 [0.009] 0.017** [0.008] 6,539 (1) 2nd quintile 3rd quintile 4th quintile 5th quintile Relative in the United States Credit availability Household assets Differential returns Sample size 0.004 [0.009] 0.009 [0.009] 0.001 [0.009] 0.012 [0.007] 6,675 (2) Self-Reported Earnings Only (3) A. Alternative Earnings Specifications 0.001 0.001 [0.012] [0.012] 0.001 0.002 [0.010] [0.010] 0.008 0.003 [0.009] [0.010] 0.021** 0.013 [0.010] [0.009] X X X X X X X 6,539 6,539 B. Alternative Sample Specifications Weighted Sample (4) (5) 0.010 [0.009] 0.007 [0.009] 0.002 [0.008] 0.009 [0.007] 3,665 0.010 [0.009] 0.008 [0.009] 0.002 [0.009] 0.009 [0.008] X X X 3,665 (6) 0.013 [0.009] 0.010 [0.009] 0.002 [0.007] 0.000 [0.008] X X X X 3,665 Young Men (under 40 Years) Sample (2) (3) (4) (5) (6) 0.003 [0.009] 0.008 [0.009] 0.003 [0.009] 0.015** [0.007] X X X 0.003 [0.009] 0.010 [0.009] 0.001 [0.009] 0.007 [0.007] X X X X 6,675 0.007 [0.010] 0.014 [0.010] 0.016 [0.010] 0.018* [0.010] 0.008 [0.010] 0.015 [0.010] 0.019* [0.010] 0.023** [0.010] X X X 3,829 3,829 0.008 [0.010] 0.013 [0.010] 0.012 [0.010] 0.012 [0.010] X X X X 3,829 6,675 Robust standard errors clustered at the household level are in brackets. ***, **, and * indicate statistical significance at the 1%, 5%, and 10% level, respectively. The dependent variable is equal to 1 if migrated to the United States between 2002 and 2006, and 0 otherwise. The independent variables are quintiles of predicted and residual log earnings and wages. Columns 2, 4, and 6 include controls for marital status, relative in the United.State, visited the United States, having U.S. documentation, and thought about moving to the United States. All regressions are for Mexican males aged 21 to 65, and are unweighted. Source: Mexican Family Life Survey.
© Copyright 2026 Paperzz