Conflict, Road Insecurity and Self-Employed Shopkeepers in Rural Peru Julia Seiermann ∗ November 8, 2012 Abstract This paper examines how violent conflict affects the decision to become a self-employed shopkeeper or vendor, proposing road insecurity as a new channel of conflict transmission. In a conflict situation, the destruction of roads and the risk of assaults increase the cost of transporting goods. The impact of distance to the next market on the probability of opening a shop is thus expected to differ between conflict and non-conflict districts. I test this prediction in the context of the Peruvian armed internal conflict. Using a logit model, I find that the probability of opening a shop decreases with distance to the next market in conflict districts. In a 2SLS specification, distance to Ayacucho and temperature are used as instruments for conflict and its interaction with distance to market. The coefficient of interest becomes insignificant, but exogeneity cannot be rejected. Several robustness checks provide further support to the logit result. ∗ Graduate Institute for International and Development Studies, Geneva 1 “. . . y regresamos a nuestra casa de miedo; de cierto ya no salı́amos ya a la calle o no habı́a negocio. Ya no habı́a ni qué hacer comer a nuestros hijos, ya no entraba ni a la tienda nadie, se ha cerrado no más ya, adentro estábamos con miedo.“ “. . . frightened, we returned to our house; we did not go out in the street anymore, neither was there any business. There was not even anything left to feed to our children, nobody entered the shop anymore, it was simply closed, and we were inside, frightened.“ Julia Castillo Jopa Shopkeeper from Puquio district, Ayacucho Public Hearing of the Truth and Reconciliation Commission Lima, June 21, 2002 1 Contents 1 Introduction 3 2 Literature Review 4 3 The Peruvian Armed Internal Conflict 5 4 Data 4.1 Data Sources . . . . . . . . . . . . . . . . . . 4.2 Descriptive Statistics . . . . . . . . . . . . . . 4.2.1 Shopkeepers and Street Vendors . . . 4.2.2 Conflict versus Non-Conflict Districts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 6 6 6 7 5 Theory and Empirical Strategy 5.1 Conflict, Transport Cost and the Occupation Decision 5.2 The Baseline Equation . . . . . . . . . . . . . . . . . . 5.3 Control variables . . . . . . . . . . . . . . . . . . . . . 5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 7 8 9 11 6 Instrumenting for Conflict 6.1 Is There a Risk of Omitted Variable Bias? 6.2 Identification Strategy . . . . . . . . . . . 6.2.1 Distance to Ayacucho . . . . . . . 6.2.2 Suitability for Coca Cultivation . . 6.3 Technique . . . . . . . . . . . . . . . . . . 6.4 Results: 2SLS Model . . . . . . . . . . . . 6.4.1 First Stage . . . . . . . . . . . . . 6.4.2 Second Stage . . . . . . . . . . . . 6.4.3 Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 12 14 14 14 15 16 16 16 16 7 Robustness Checks 7.1 Different Conflict Periods . . . . . . . . . 7.2 Conflict Intensity . . . . . . . . . . . . . . 7.3 Other Non-Agricultural Self-Employment 7.4 Outliers and Market Villages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 19 20 21 21 . . . . . . . . . . . . . . . . 8 Conclusion 22 References 23 A Appendix 26 2 1 Introduction This paper assesses the impact of violent conflict on self-employment in rural Peru of the 1990s. While the conflict literature has so far emphasized the creation or destruction of employment opportunities (Justino [2010]) and migration (Bozzoli et al. [2011]), I examine a new channel through which conflict can affect the self-employment decision, namely road destruction and insecurity. In rural Peru, over 50% of the non-agricultural self-employed work as shopkeepers or street vendors. Many of the goods they sell are not produced in the same village, but need to be transported there. In a peaceful context, distance to the next market might increase the profitability of a shop, since distance makes it more efficient to bundle transportation. During the armed internal conflict, however, the insurgent group1 Sendero Luminoso (Shining Path) destroyed roads and other infrastructure. Trying to impose a subsistence economy and “starve out the cities“, the insurgents prohibited villagers to participate in markets, persecuted those who did not comply, and assaulted vehicles transporting goods (Del Pino [1996]). Thinking of road insecurity as an increase in transportation cost, it should lead to a reduction in vendors’ profits. Ceteris paribus, road insecurity should thus deter people from becoming a self-employed shopkeeper or vendor. Yet, conflict also affects the employment decision through other channels. People may be pushed into self-employment following the destruction of alternative employment opportunities. A reduction of the local labor force through migration or deaths may enable people to switch from selfemployment to wage employment. In order to test the road security channel, I therefore focus on distance to the next market. I expect its impact on self-employment to differ between conflict and non-conflict districts. Using a logit model, I regress self-employment as a shopkeeper or street vendor on a district-level conflict dummy, the distance between the village and the next market, and an interaction term between conflict and distance to market. The results suggest that distance to market has a negative impact impact on the probability of working as a self-employed shopkeeper in conflict districts. This finding is in line with my hypothesis, according to which conflict affects the occupation decision via the channel of road insecurity. It is robust to a variety of specifications, in which I control for other factors influencing the occupation decision. The concern that conflict and the self-employment decision might both be endogenous to local economic conditions is addressed by using an instrumental variables (IV) procedure. Given that my regression equation includes an interaction term, I need at least two instruments for conflict. First, I use distance to Ayacucho, a department capital in the central Andes. It was at the local university that philosophy professor Abimael Guzmán split from the Maoist wing of the Communist Party of Peru (PCP) to found Partido Comunista del Perú - Sendero Luminoso (PCP-SL), which would later initiate the armed internal conflict. Second, I use temperature, as a proxy for the suitability of a region to cultivate coca. Given fragile state control, insurgents of various groups were able to establish a stronghold in coca-producing areas. In the 2SLS model, the coefficient on the interaction term between conflict and distance to market is not significant. Yet, the hypothesis that conflict 1 In its decree to establish the Truth and Reconciliation Commission (CVR), the Peruvian government qualified Sendero Luminoso as a terrorist organization. The CVR itself, however, distinguishes between terrorist and nonterrorist actions in its analysis. As this paper does address the legal dimension of the conflict, I use the term insurgency when referring to Sendero Luminoso. 3 and its interaction with distance to market are exogenous cannot be rejected, which suggests that the logit results are unbiased. Several robustness checks provide support for the road insecurity hypothesis, and raise interesting further questions on the underlying mechanisms. The negative effect of distance to market on the decision to work as a vendor is neither driven by unobserved common characteristics of the districts around Ayacucho (the epicenter of the conflict) nor by a general effect of distance to market on non-agricultural self-employment. Conflict intensity, measured as the number of fatal victims in a district, increases the negative impact of distance to market. Significant results on past conflict raise the question of the persistence of the effect, but cannot be disentangled from conflict intensity in the present analysis. 2 Literature Review The self-employment literature has traditionally focused on the importance of credit constraints, such as the classic papers by Evans and Jovanovich [1989] and Blanchflower and Oswald [1998]. Other determinants of self-employment which have been highlighted include age, household characteristics, parents’ employment status, education, experience, and alternative labor market options (Georgellis et al. [2005], Le [1999]). Most of the literature is concerned with industrialized countries, but a number of recent studies have analyzed the determinants of self-employment in developing country contexts. Paulson and Townsend [2004] find that financial constraints restrict entrepreneurial activity in Thailand. The importance of infrastructure for the household decision to establish a nonfarm enterprise is emphasized in Deininger et al. [2007] for Sri Lanka, and in Jin and Deininger [2008] for Tanzania. Both studies report significant positive effects of different types of infrastructure, such as electrification, distance to a bank, credit access, public transport, and road quality. Escobal [2001] studies the determinants of income shares by activities in Peruvian households using the 1997 Living Standards Measurement Survey (LSMS). He finds a positive impact of education, electricity and credit access and a negative impact of distance to market on the share of income stemming from non-agricultural self-employment activities. As to my knowledge, only two studies have so far addressed the impact of conflict on selfemployment. Using data from Uganda, Deininger [2003] finds that civil strife reduces non-agricultural enterprise start-ups at the household level. Bozzoli et al. [2011] focus on the impact of conflict on self-employment in Colombia. They find a negative impact of homicide rates on men’s, but not women’s self-employment. Considering the channel of displacement, they find that individuals are less likely to be self-employed in sending communities and more likely to be self-employed in host communities. This paper ties in with the existing literature by suggesting a new channel through which violent conflict affects the self-employment decision, namely road insecurity. Since this channel is of particular importance for rural shopkeepers and vendors, who bring in goods in order to sell them 4 in their village, my analysis concentrates on these professions. While highlighting road insecurity, I control for other determinants of self-employment, such as education, credit access and agricultural wages. 3 The Peruvian Armed Internal Conflict Peru was just undergoing a transition from a military dictatorship to democracy and had elected a new government when the Maoist insurgent group Sendero Luminoso declared a “people’s war“ and initiated violence in 1980 in the Central Andean province of Ayacucho. Initially, the conflict did not figure among the government’s main preoccupations, but was perceived as a local delinquency problem. But when Sendero Luminoso enlarged the scope of its activities, and the police was not able to contain them, the government decided to deploy the military to the conflict zones in 1982. The conflict spread from Ayacucho to other regions of the country, mainly neighboring Andean provinces, Lima and the central jungle region. Parties to the conflict included Sendero Luminoso and Movimiento Revolucionario Túpac Amaru (MRTA), another left-wing insurgent group, as well as police, armed forces and rural self-defense committees (rondas campesinas). Violence escalated, and human rights violations by both insurgents and state forces were numerous. The total number of dead and disappeared during the conflict is estimated 70,000 by the Truth and Reconciliation Commission (Comisión de la Verdad y Reconciliación, CVR). The main violent phase of the conflict ended in 1992 with the capture of Abimael Guzmán, the head of Sendero Luminoso, in Lima. However, violence continued on a lower scale during the autocratic government of Alberto Fujimori, which lasted until 2000 (Comisión de la Verdad y Reconciliación [2003]). Until today, isolated attacks take place in the Andean and jungle regions of the country, where Sendero Luminoso is still cultivating links to drug trafficking (Huerto Amado [2012]). According to the CVR, the majority of victims were among the rural, historically excluded population, which was symptomatic of the divisions within the Peruvian society. The extensive research of the CVR focused on the numerous human rights violations committed in the conflict, paying close attention to the juridical dimension of the conflict, but also the context which enabled its emergence and duration. While these atrocities were certainly its most abominable aspect, the conflict also caused profound disruptions of public life and social and economic interactions (Theidon [2001]). The destruction of roads, the assaults, and the ideologically motivated effort of Sendero Luminoso to suppress market activity have affected the local economy, especially in rural areas. The aim of this paper is to contribute to a better understanding of their consequences, which could have persistent implications on poverty. 5 4 4.1 Data Data Sources I use data from the 1994 Living Standards Measurements Survey (conducted by the Instituto Nacional de Estadı́stica e Informática [1994]) for employment status, distance to market and individual, household and village-level controls. The LSMS comprises individual and household modules on a variety of topics, as well as a community questionnaire. For my analysis, I restrict the sample to adults (15 years and older) from rural areas who are active in the labor market and for whom all necessary information is available. 96% of the individuals in my initial sample live in villages which are at most 26 km away from the next market. The remaining 4% live in those 5 villages (out of 143) which lie between 48 and 84 km away from the next market. I exclude these observations, since they considerably alter the magnitude and significance of the regression coefficients (see robustness check 4) and might be caused by errors. My final sample comprises 2,122 observations. For conflict intensity, I use district-level data from the Peruvian Truth and Reconciliation Commission on deaths and disappearances during the armed internal conflict. Distance to Ayacucho is measured with respect to each district capital. I constructed the data using the online tool “Distance Calculator Peru“, which measures the air distance between locations based on their latitude and longitude. Altitude is obtained for each district capital from Google Earth. Temperature, humidity and precipitation are measured at department level, by constructing the average annual value between 1997 and 2006. The climate variables are obtained from the Compendio Estadistico 2007, an extensive statistical publication of the Instituto Nacional de Estadı́stica e Informática [2007]. District-level data on education is constructed from the 1993 national census conducted by the Instituto Nacional de Estadı́stica e Informática [1993]. 4.2 4.2.1 Descriptive Statistics Shopkeepers and Street Vendors 19% of the people in my sample list non-agricultural self-employment as their primary or secondary activity, with a significantly higher proportion in conflict (22%) than in non-conflict districts (17%). More than half of them work as shopkeepers or street vendors. (See appendix table 6.) A high percentage of shopkeepers and vendors are female (65%, compared to 42% of women in the entire sample). Shopkeepers and vendors are more likely to have completed primary education than people working in other occupations. Their households have higher than average consumption expenditures, credit access and electricity. The shopkeepers in my sample tend to live in villages which are closer to markets: 5.6 km on average, compared to a sample mean of 6.3 km. Their villages are relatively poor (indicated by a higher likelihood of investment by the Peruvian Social Fund, FONCODES), and labor market conditions are less likely to have improved over the previous years. (See appendix table 7.) 6 4.2.2 Conflict versus Non-Conflict Districts In conflict districts, 12% of individuals work as shopkeepers or vendors, compared to 9% in nonconflict districts. Individuals in conflict districts are more likely to have completed primary education, and the general education level obtained from the 1993 national census is higher in conflict than in non-conflict districts. In my sample, villages in conflict districts are further away from the next market (7.2 km, compared to 5.7 km for those in non-conflict districts) and have had a relatively favorable labor market evolution. It has become easier to find a job in more conflict than non-conflict villages, while it has become more difficult to find a job in more non-conflict than conflict villages. (See appendix table 8.) 5 5.1 Theory and Empirical Strategy Conflict, Transport Cost and the Occupation Decision As discussed above, the literature on self-employment identifies a number of variables which influence the occupation decision. Conceptually, they can be divided into two categories: individual and outside factors. Individual factors include personal characteristics such as age, education and physical and intellectual capabilities, but also family characteristics such as household composition, wealth, and access to credit. Outside factors comprise labor market conditions (i.e., alternatives to self-employment) and the profitability of different professional activities given local conditions. This paper emphasizes one particular outside factor, the profitability of working as a selfemployed vendor. I assume that profits from vending activities are altered by conflict via the channel of road insecurity. I do not directly estimate profits, but I assume that, ceteris paribus, a person is more likely to choose an occupation if expected profits are high. Under the assumption that the other important factors are controlled for, the decision to work as a self-employed vendor can serve as an indicator of profitability of this activity. Distance to market and transport costs are assumed to play a significant role in the decision to become a self-employed vendor. The importance of market access for agricultural producers in developing countries has been highlighted by numerous studies, for example Jacoby [2000]. To my knowledge, the impact of distance to market on the profitability of vending activities has not yet been investigated. In peaceful times, distance to market may even raise the profitability of vending activities, since it increases the efficiency of bundling the transportation of those goods which are not produced and sold in the same village. Conflict-induced road insecurity increases transport cost and thus the cost of supplying products. Note that the increase in transport costs, as an abstract concept, includes not only higher prices for public transport or the risk of losing one’s goods, but also the risk of being injured or killed in an assault. Anecdotal evidence shows that fear from the insurgents (or the armed forces) caused people to substantially reduce economic and social interaction. (Comisión de la Verdad y Reconciliación [2003] But conflict also affects profitability through other channels, such as clients’ income, and influences other factors of the self-employment decision, such as individual and labor 7 market characteristics. The aim of my estimation strategy is thus to isolate the effect of road insecurity on the occupation decision. 5.2 The Baseline Equation I use a logit model to regress a binary variable for working as a self-employed shopkeeper or vendor on a conflict dummy, distance to market and the interaction between these two. Standard errors are clustered at the village level, and different sets of control variables are included. As the coefficients from binary choice models cannot be interpreted directly, I report OLS results to provide additional intuition. Finally, I estimate a complementary log-log (cloglog) model, since the dependent variable only takes the value 1 for 10 % of the observations.2 vendori = β1 · distmktv + β2 · conf lictd + β3 · conf lictd ∗ distmktv + β4 · controls (1) The subscript i indicates that the variable is measured at the individual level. v denotes village, and d district level. The different sets of controls vary at individual, household, village, district and department level (for a complete list, see appendix table 5). The dependent variable, vendor, takes the value 1 if a person is working as a self-employed shopkeeper or street vendor as her primary or secondary occupation. conflict is a binary variable which takes the value 1 when the number of dead and disappeared at district level between 1990 and 1994 is positive, and 0 when no fatal victims were registered in this time period. Although the conflict started in 1980, I use only the years immediately preceding the LSMS survey, assuming that these are most relevant for the occupation decision. The role of past conflict is investigated in detail in robustness check 1. Distance to market (distmkt) is measured at village level.3 Control variables include several individual, household and village-level variables. They are discussed in detail in section 5.3., and a list of all variables and their definitions is included in the appendix. As conflict is measured at district level, the conflict variable is subsumed by the district fixed effects. However, the interaction term between conflict and distance to market can still be estimated, as there is variation in distance to market between the villages within a same district. The interpretation of the interaction coefficient is the same as in the model without fixed effects. In 19 districts, there are no observations of self-employed shopkeeper and vendors in the dataset. The district fixed effect would predict the outcome perfectly in these cases, which is why they are excluded from the logit and cloglog analyses. The sample size is reduced to 1,735 observations. Self-employed shopkeepers and vendors are now overrepresented with respect to the original sample; and other systematic differences between included and excluded observations might cause the results to differ substantially. A table with summary statistics of all explanatory variables in the included and excluded sample is provided in in the appendix. 2 The complementary log-log model is asymmetric around zero, which is why it is used when the outcome of interest is rare. Conditional probability in this model is given by the cdf of the extreme value distribution: C(x0 β) = 1 − exp(−exp(x0 β)) (Cameron and Trivedi [2005]). 3 To capture the time-dimension of transport costs, it would be ideal to construct a measure of “effective“ distance to market by combining geographic distance with measures of road access and the availability of public transport. Unfortunately, the data is not detailed enough for this endeavor. 8 I expect the impact of distance to market to differ between conflict and non-conflict districts. If β3 , the coefficient on the interaction term between conflict and distance, is negative and significant, this supports my hypothesis that conflict-induced road insecurity deters people from becoming a self-employed vendor. The interpretation of interaction terms in binary choice models is, however, far from straightforward and will be discussed in further detail in section 5.4. 5.3 Control variables I control for basic characteristics at the individual and household level, namely age, gender, and household composition (number of children (0-14) and adults in the household). Furthermore, I introduce a number of control variables aimed to capture other important factors in the occupation decision: education, access to capital, infrastructure, migration, labor market conditions and agricultural opportunities. A person’s education is an important determinant of her range of employment options. In particular, skills such as literacy and numeracy may give her a comparative advantage in vending occupations. A higher education level may make these occupations less desirable in comparison to alternative opportunities. To control for these effects, I introduce a dummy variable for complete primary education. Additional controls for complete secondary and tertiary education were insignificant, which is why I do not include them in the final specification anymore. Access to capital has been highlighted by the literature as one of the most decisive determinants of self-employment. Individual ability to raise capital is determined by a person’s (and her household’s) wealth, and by access to credit. I approximate household wealth by including annual consumption expenditures in the regression equation. This measure is not completely exogenous to the occupation decision and the income generated in each individual member’s activity. However, alternative measures, such as farm size, durable asset ownership or size of the household dwelling are not completely exogenous either, and consumption expenditures present the advantage of being a more contemporaneous measure of the household’s financial situation. Furthermore, the magnitude and significance level of the coefficients of interest remain the same when using the number of rooms in the household dwelling instead of consumption expenditures. To control for credit constraints, I include a self-reported binary measure of credit access. Only 15% of the individuals in my sample live in households which reported access to some form of credit. This low percentage may be explained by the fact that the Peruvian microcredit sector only took off in the early to mid-1990s, and might not have reached many rural areas in 1994. However, credit access might still be underreported, which is why this measure needs to be interpreted with caution. Since the development literature highlights the importance of infrastructure for establishing a nonfarm household enterprise, I include a binary variable indicating whether the household dwelling has electricity or not. Individuals with entrepreneurial talent or motivation might migrate to villages in which vending activities are particularly profitable, but also be more likely to migrate to escape violence. Selective migration could bias the coefficients of interest. Thus, I controlled for individual migration using a binary variables which takes the value 1 if a person was born in the current place of residence. This variable was insignificant in all specifications tested, which is why I exclude it from the final 9 specification. Bozzoli et al. [2011] highlight that conflict may affect local labor markets through displacement. As a proxy for migratory movements at a more aggregate level, I include dummy variables indicating whether the number of dwellings in the village has increased or decreased since 1991. The labor market situation in a village is probably one of the most important factors in the individual occupation decision. One potential measure of alternative employment opportunities in rural areas are agricultural wages. However, their impact on the occupation decision may go into both directions. High agricultural wages may prevent people from becoming vendors, but they may also make vending activities more profitable by raising the income of potential clients. Since my analysis is not directly interested in the impact on wages, I use a more direct measure of alternative opportunities. I combine an indicator of how the labor market in the village has evolved since 1991 with a proxy for the village’s economic situation in the beginning of the 1990s. The LSMS community questionnaire contains a question whether it was easier or more difficult to find a job in 1994 than in 1991. This variable the perceived evolution of the labor market situation during those years, but does not provide a level-level comparison between villages. In order to control for the relative economic situation, I use a binary variable indicating whether the Peruvian Social Fund (FONCODES) has invested in a village until the time of the survey. FONCODES was created in 1991 and started funding a variety of community-based projects in 1992. Resources were allocated using a district-level poverty index and informal on-site assessments. Looking at school infrastructure investments, Paxson and Schady [2002] find that FONCODES effectively targeted poor districts, which is why its activities can be used as a proxy for low economic development. Together, the FONCODES dummy and the recent labor market evolution should therefore capture a village’s economic and labor market situation. Education, an important factor in the occupation decision, played a crucial role in the advent of the Peruvian conflict. The country experienced an unprecedented education expansion between the 1940s and 1960s. With the increased access to education, rural and lower-class Peruvians were hoping for progress and social mobility. The frustration of these expectations made the population more receptive to violent, radical proposals such as those of Sendero Luminoso. Degregori describes the Ayacuchan “Movimiento del 1969“, a protest against the abolition of the gratuity of education by the military government, as a “clarion call, which announced the possibility of the apparition of a phenomenon like Sendero Luminoso, and of its expansion to other places“ (Degregori [2007], p. 6). Furthermore, given their distinguished position in the villages, school teachers (many of them educated at Universidad Nacional San Cristóbal de Huamanga, where the leader of Sendero Luminoso was a professor in the department of education) were instrumental in diffusing the movement’s ideology and gaining support among the rural population (Comisión de la Verdad y Reconciliación [2003]). Education could thus be an important source of omitted variables bias, which is why I introduce a measure for the education level in each district, namely the percentage of persons 15 and older with at least one year of secondary education. Finally, agriculture is the most important source of (self-)employment in rural areas, and might alter the relative attractiveness of non-agricultural self-employment. I therefore include log(altitude) of the district capital and average annual precipitation at the department level as proxies for agricultural opportunities. 10 5.4 Results The results from the different models are in line with my hypothesis. The coefficient on the interaction term between conflict and distance to market is negative and significant at the 5% level in all cases. The inclusion of district fixed effects substantially increases the coefficient of interest. As discussed above, this effect could be caused by systematic differences between excluded and included districts in the logit and cloglog cases. A comparison between the OLS, logit and cloglog fixed effect results reveals that all but one coefficient (More difficult to find a job) are of the same sign and statistical significance, which alleviates this concern about potential bias.The main result is robust to a variety of specifications, in which I include different sets of the control variables described above (see appendix table 6 for a detailed list, and tables 10-15 for reports of all coefficients). It indicates that the likelihood of working as a self-employed shopkeeper decreases with distance to market in conflict districts. The coefficient on distance to market is insignificant for the baseline specifications (without fixed effects). These results suggest that distance to market does not influence the decision to work as a shopkeeper in non-conflict districts, but deters individuals from choosing this occupation in conflict districts. With the inclusion of district fixed effects, however, the coefficient on distance to market turns positive and strongly significant. This suggests that, within a given district, distance to market makes individuals more prone to work as a shopkeeper in non-conflict settings. This positive effect of distance is offset in the presence of conflict, in which case the net effect is negative. Conflict per se appears to increase the likelihood of being a vendor, as the coefficient is positive and significant. The reason for this effect cannot be easily determined. While it is, for example, possible that the conflict curtailed other employment opportunities, note that there is no such positive effect for other non-agricultural self-employment activities (see robustness check 3). As mentioned above, the interpretation of coefficients on interaction terms in binary choice models is difficult. Norton et al. [2004] highlight that the marginal effect depends on the other covariates, and may even switch signs. Furthermore, its statistical significance also varies between observations. The authors suggest reporting the interaction effects (i.e., the cross partial derivative of the probability) for each observation and their significance plotted against the predicted probability of observing the outcome of interest for each observation. Kolasinski and Siegel [2010] criticize the use of the cross partial derivative and show that the switch in signs can be a mechanical effect driven by the restriction of the outcome variable between zero and one. Greene [2010] questions the usefulness of testing the statistical significance of the interaction effect for each individual, and suggests the use of graphics for a more meaningful analysis. In line with this suggestion, I construct graphs to illustrate the relation between increasing distance to market and the likelihood of working as a vendor in a non-conflict and conflict district respectively. The graphs depict the predictions from the logit regression (with district fixed effects) for a “representative“ individual who is likely to become a vendor: a 38-year old woman with completed primary education, living in a household with 2 children and 2 other adults, with average consumption expenditures and neither credit access nor electricity. All village, district and geographical controls are set at the sample mean. The solid line represents the predicted likelihood of working as a vendor from the logit estimation, and the dashed lines represent the 95% confidence interval. 11 Table 1: Results of different specifications, all control variables included OLS VARIABLES Logit Cloglog (1) (2) (3) (4) (5) (6) 0.087*** (0.032) 0.003 (0.003) -0.008** (0.004) - 0.117*** (0.030) -0.166** (0.077) 0.918*** (0.283) 0.037 (0.030) -0.090** (0.040) - 0.011*** (0.004) -0.013*** (0.004) 1.011*** (0.313) 0.041 (0.033) -0.100** (0.043) 0.101*** (0.025) -0.141** (0.071) Individual controls Household controls Village controls District controls X X X X X X X - X X X X X X X - X X X X X X X - District fixed effects - X - X - X Conflict Distance to market Conflict*Distance to market Observations 2,122 2,122 2,122 1,735 2,122 1,735 (Pseudo) R-squared 0.075 0.158 0.115 0.205 . . Log-pseudolikelihood -411.4 -311.4 -627.3 -525.7 -628.3 -524.1 Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 12 6 6.1 Instrumenting for Conflict Is There a Risk of Omitted Variable Bias? The validity of the regressions presented in the previous section relies on the assumption that the incidence of district-level conflict between 1990 and 1994 is exogenous to the occupation decision. A potential concern is that difficult economic or labor market circumstances may at the same time affect the occupation decision and induce people to support insurgent groups. A careful discussion of the circumstances of the conflict is thus necessary to evaluate the concern of omitted variables and devise a valid identification strategy. Violence was initiated in Ayacucho in 1980, and spread to other parts of the central Andes, to the jungle regions and to coastal cities over time. To understand the causal mechanisms behind the geographic dispersion of the conflict, it is helpful to distinguish two questions: Why did the conflict start in Ayacucho, and why did it spread to other parts of the country? The second question is less controversial: As observed by León [2012], Sendero Luminoso extended the domain of its activities according to a politico-military strategy, aiming to access political influence (Lima and other coastal cities) and resources (the coca regions in the jungle). The first question - why did Sendero Luminoso emerge in Ayacucho? - has motivated academic research from different disciplines, such as history, sociology and anthropology. The ideological radicalism of Sendero Luminoso, and its apparent disdain for human life, sets it apart from other insurgent groups in Latin America.4 Therefore, Sendero Luminoso has regularly been depicted as alien to the Peruvian society. Degregori [2007], for example, describes its evolution as an “ideological mutation“ (p. 182) towards “a kind of minuscule star, those in which matter agglutinates almost without interatomic spaces, thereby attaining great weight, disproportionate for its size“ (p. 180).5 Given these descriptions, one is tempted to regard Sendero Luminoso as exogenous to social and economic conditions, a strange historical accident. But while Sendero Luminoso remained a relatively small, autarkic movement, it is unlikely that the conflict could have reached a similar magnitude without any support from the local population. Degregori [2007] argues that the combination of the low profitability of latifundios (large agricultural estates) and the expansion of education can explain the success of Guzmán’s ideology in Ayacucho. As agricultural profits were meager and other professional alternatives lacking, education was seen as the only path to progress, and the frustration of the high expectations was particularly strong. Local economic conditions which influence the occupation decision may thus be directly related to the conflict. I attempt to account for these factors using village and district control variables for education and the economic and labor market situation, as described in section 4.3. However, some of these measures are rather coarse, and might not be sufficient to prevent omitted variables bias. To address this concern, I implement an instrumental variable strategy, using distance to Ayacucho and average annual temperature as instruments for conflict and the interaction term 4 Sendero Luminoso was responsible for the majority (54%) of victims of the conflict. This contrasts with other Latin American conflicts, in which state and paramilitary forces caused more victims than the insurgency. Unlike other groups, including MRTA, the other main subversive movement in the Peruvian conflict, Sendero Luminoso also attacked the defenseless population.(Comisión de la Verdad y Reconciliación [2003]) 5 Translation by the author. The original wording in Spanish is: “una especie de estrella enana, esas donde la materia se apelmaza casi sin espacios interatómicos, alcanzando ası́ un gran peso, desproporcionado para su tamaño“. 13 between conflict and distance to market. 6.2 6.2.1 Identification Strategy Distance to Ayacucho The armed internal conflict was initiated by the Maoist insurgent group Sendero Luminoso in 1980 in the province of Ayacucho, where their leader Abimael Guzmán had been a philosophy professor at Universidad Nacional San Cristóbal de Huamanga (UNSCH).6 While the combination of local economic conditions and education explain the movement’s success, these conditions were not unique to Ayacucho, as emphasized by Degregori [2007]. My identification strategy relies on the assumption that Guzmán’s appointment as a professor in Ayacucho, and thus the fact that the conflict started there, was a historical coincidence. UNSCH was founded by the Spanish colonizers as the second university of the country in 1680, when Huamanga was a prosperous mining city on the commercial route between Lima and Cuzco. Given its relatively low number of students and professors and the lack of government resources following the War of the Pacific, UNSCH was closed in 1876. It was only reopened in 1959, at the very beginning of an expansionary wave of public tertiary education in Peru (Luja Peralta and Zapata Tejerina [1988]). UNSCH distinguished itself from the existing “traditional“ universities by emphasizing its mission to serve the development of the region and maintaining a certain distance from the local landowning elites. It attracted renowned intellectuals from the entire country (Comisión de la Verdad y Reconciliación [2003]). Given the left-wing orientation of UNSCH, it is probably not a coincidence that Guzmán, an active member of the Communist Party of Peru (PCP), was appointed there in 1962. However, one of the reasons why UNSCH became such a distinguished project is that it was the first university to be opened in the context of a general movement towards the modernization and democratization of tertiary education. Its early opening, in turn, was possible because the university had already existed, such that its buildings were still in place. Since UNSCH was founded in the colonial era, its geographic location is arguably exogenous to the social and economic circumstances of the region in the middle of the 20th century. Therefore, distance to Ayacucho, where Guzmán was able to gain ideological support among university students and build the core of what would later become Sendero Luminoso, can be considered a valid instrumental variable for conflict. 6.2.2 Suitability for Coca Cultivation One element of Sendero Luminoso’s expansion strategy was the aim to extract resources from the coca business. In the coca producing regions of Peru, the weak presence of the state, its failure to effectively combat drug trade and the corruption of police and bureaucracy facilitated the success of both insurgent organizations, Sendero Luminoso and MRTA. Their links with the drug traffickers enabled them to obtain weapons from Colombia. In exchange, the insurgents provided “order“ and protection to coca cultivators and traffickers. The police and armed forces also cultivated 6 Until 1825, the city of Ayacucho was called San Juan de la Frontera de Huamanga, or, in its short form, Huamanga. 14 links with the coca business, for example by providing runways for planes operating between Peru and Colombia. However, they seem to have exercised less control over the entire supply chain than the insurgents. This situation resulted in numerous violent confrontations between insurgents, police and the military (which was deployed to the region to combat both, drugs and insurgency) (Comisión de la Verdad y Reconciliación [2003]). Therefore, the geographical and climatic suitability for coca cultivation can predict conflict in the Peruvian context. According to the (relatively scarce) literature on coca cultivation, Erythroxylum coca, the most common species in Peru, grows best at an altitude of 500-2000m. Furthermore, it requires relatively high temperatures, precipitation and humidity (Morello and Matteucci [2001], Ismatullayev [2011]). These conditions are met in the tropical parts of the Peruvian Andes. Like distance to Ayacucho, altitude of the district capital and department-level aggregates of temperature, precipitation and humidity over 1997-2006 predict conflict well in the Peruvian context. But the necessity of instrumenting the interaction term between conflict and distance to market as well complicates the identification strategy. A standard approach would involve using an instrument for conflict and the interaction of this instrument with distance to market. However, this strategy leads to under- and weak identification in the present setup, which is why I require at least two different instruments. When including a combination of three or more instruments, the Kleibergen-Paap statistic for weak identification is very low. Therefore, I use only two instruments, distance to Ayacucho and temperature, in my final specification. In order to be a valid IV, temperature should not directly influence the dependent variable, the choice to become a self-employed shopkeeper or vendor. Two concerns might arise with respect to this condition: First, substantial differences in agricultural employment opportunities with respect to other regions might alter the attractiveness of self-employment, and second, the presence of coca cultivation and trade might offer new employment opportunities or make vending activities more profitable. Angrist and Kugler [2008] analyze the consequences of the disruption of the so-called “air bridge“ for transporting coca paste in 1994, following which the production of coca leaves for cocaine production shifted from Peru and Bolivia to Colombia. Focusing on coca-producing rural areas of Colombia, they find that the income from self-employment increased, but employment status, wage and salary earnings and hours worked did not change significantly. These results suggest that even important changes in the price of coca do not affect the employment decision, which is my dependent variable. General growing conditions, which are presumably more or less stable across time, might still affect the occupation choice. However, other geographical control variables capturing agricultural opportunities (log(altitude and precipitation) are not significant in the logit, cloglog and 2SLS specification, which alleviates this concern. 6.3 Technique I employ a 2SLS procedure, with standard errors clustered at the village level. I use the two different instruments described above, distance of the district capital to Ayacucho (distaya) and average annual temperature in a department (temperature). I cannot estimate the 2SLS model including district fixed effects, as the second instrument is at department level, whereas both endogenous variables vary at district or village level. 15 I estimate a 2SLS model with two simultaneous first-stages for the endogenous regressors, conflict and its interaction with distance to market: conf lictd = α1 · distayad + α2 · temperaturedpt + β1 · distmktv + β2 · controlsihvd (2) conf lictd ∗ dmktv = α1 · distayad + α2 · temperaturedpt + β1 · distmktv + β2 · controlsihvd (3) The second stage is equivalent to the baseline specification, in which conflict and the interaction term are replaced by the predicted values from the first stages: \ vendori = β1 · distmktv + β2 · conf lictd + β3 · conf lict\ d ∗ distmktv + β4 · controlsivhd 6.4 6.4.1 (4) Results: 2SLS Model First Stage The instruments predict conflict and its interaction with distance to market reasonably well. As expected, the coefficient on distance to Ayacucho is negative, and the coefficient on temperature is positive. The R2 lies between 0.18 and 0.27 in the different specifications, which is not very high. The problem of weak identification which may result from the low explanatory power of the first stage is discussed more in detail in section 6.4.3 (Tests). In the first stage predicting conflict, the coefficient on temperature however turns insignificant when including all sets of control variables. In turn, the control variable of precipitation, while insignificant in all other specifications of all models, turn positive and significant at the 10% level in this particular case. See appendix table 16) Precipitation is positively correlated with temperature, and may thus capture a common effect of the two, a problem which may be aggravated by the high aggregation level of the climatic variables. While precipitation is also expected to predict the suitability of a region for coca cultivation, it is not a suitable instrument for conflict and its interaction to market due to under- and weak identification problems. 6.4.2 Second Stage The main result from the logit regression is not robust to the IV specification. The coefficient on the interaction term between conflict and distance to market is now insignificant in all specifications. The same is true for the coefficient on conflict. (The magnitude of the coefficients cannot be compared between logit and 2SLS). This result indicates that distance to market has no impact on the likelihood of working as a shopkeeper or vendor. As I will discuss more in detail in the next section, the insignificance of the coefficients is however likely to be caused by the relative imprecision of the 2SLS estimator, as opposed to the simple logit. 16 Table 2: 2SLS: First stage on conflict (only main coefficients) VARIABLES Distance to Ayacucho in 100km Temperature Distance to market Individual controls Household controls Village and district controls Geographical controls Dependent variable: Conflict (3) (4) (1) (2) -0.079*** (0.017) 0.021** (0.009) 0.011* (0.006) -0.082*** (0.017) 0.021** (0.008) 0.011* (0.006) -0.068*** (0.018) 0.016* (0.009) 0.009 (0.007) X - X X - X X - (5) (6) -0.082*** (0.016) 0.020** (0.010) 0.007 (0.007) -0.070*** (0.018) 0.016* (0.009) 0.009 (0.007) -0.070*** (0.018) 0.016 (0.010) 0.005 (0.008) X X X X X X - X X X X R2 0.184 0.202 0.224 0.229 0.241 0.266 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. Table 3: 2SLS: First stage on conflict*distance to market (only main coefficients) VARIABLES Distance to Ayacucho in 100km Temperature Distance to market Individual controls Household controls Village and district controls Geographical controls (1) Dependent variable: Conflict*Distance to market (2) (3) (4) (5) (6) -0.596*** (0.172) 0.343*** (0.101) 0.524*** (0.114) -0.610*** (0.178) 0.347*** (0.101) 0.517*** (0.116) -0.451*** (0.175) 0.274*** (0.092) 0.501*** (0.112) -0.608*** (0.176) 0.359*** (0.103) 0.499*** (0.123) -0.450** (0.175) 0.273*** (0.090) 0.495*** (0.112) -0.429** (0.176) 0.304*** (0.093) 0.484*** (0.120) X - X X - X X - X X X X X X - X X X X R2 0.184 0.202 0.224 0.229 0.241 0.266 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 17 Table 4: 2SLS: Second stage (only main coefficients) VARIABLES Conflict Conflict*Distance to market Distance to market Individual controls Household controls Village and district controls Geographical controls Dependent variable: Vendor (3) (4) (5) (1) (2) 0.104 (0.111) -0.005 (0.013) 0.000 (0.007) 0.073 (0.110) -0.005 (0.012) 0.001 (0.007) 0.160 (0.121) -0.005 (0.014) 0.001 (0.007) 0.103 (0.121) -0.009 (0.014) 0.003 (0.007) 0.119 (0.118) -0.003 (0.013) 0.001 (0.007) 0.181 (0.119) -0.011 (0.014) 0.004 (0.007) X - X X - X X - X X X X X X - X X X X 0.059 0.056 0.053 0.0009 0.0025 0.0005 7.23 5.26 8.59 0.92 0.46 0.46 Adjusted R2 0.040 0.058 0.030 Underidentification P-value 0.0002 0.0002 0.0026 Weak identification Kleibergen-Paap rk Wald F-stat 7.81 8.49 5.05 Endogeneity test P-value 0.56 0.85 0.28 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. Stock-Yogo critical values: 7.03 (10%), 4.58 (15%), 3.95 (20%). 6.4.3 (6) *** p<0.01, ** p<0.05, * p<0.1. Tests The underidentification test is based on the Kleibergen-Paap rk LM statistic. The null hypothesis that the regression is underidentified can be rejected in all specifications of the model, in which I include different sets of control variables. The weak identification measure reported is the Kleibergen-Paap rk Wald F statistic, since the Cragg-Donald Wald F statistic is not valid with clustered standard errors. The Stock-Yogo critical values for weak identification, however, apply to the Cragg-Donald case, and can thus not be directly compared to the Kleibergen-Paap statistic. There are no corresponding critical values for weak identification in the presence of clustered standard errors yet (Schaffer [2012]), which is why I report the Stock-Yogo values for orientation. In the present specification, I cannot test for overidentification, since the model is exactly identified. When trying sets of three or more instruments, namely by including more geographical variables as proxies for coca cultivation, problems of weak identification occurred, but the null hypothesis that the instruments are valid could not be rejected. This may be interpreted as cautious support for the assumption that distance to Ayacucho and temperature are valid instruments in the present specification as well. 18 As suggested by Cameron and Trivedi [2005], I report the cluster-robust variant of the DurbinWu-Hausman test provided by the endog option of the ivreg2 command in Stata to test the endogeneity of conflict and its interaction with distance to market. The null hypothesis that these regressors can be treated as exogenous cannot be rejected. The control variables for education and the economic situation may be sufficient to avoid omitted variable bias in the simple logit. In this case, it is neither necessary nor optimal to use an IV strategy in the present analysis. The non-significance of the 2SLS results is likely to be caused by the loss of precision inherent to the instrumental variables procedure, which is aggravated in the case of weak instruments.(Cameron and Trivedi [2005]) Again, these results need to be interpreted with caution. The Durbin-Wu-Hausman test compares the coefficients resulting from OLS and 2SLS. Thus, the null hypothesis which is not rejected is that the coefficients are equal. However, in the presence of weak instruments, IV estimates are biased in the same direction as OLS in finite samples, as highlighted by Bound et al. [1995]. The result of the Durbin-Wu-Hausman test could also be driven by weak instruments, and allow no definite conclusion about the exogeneity of the explanatory variables. 7 Robustness Checks The present section discusses the results of a variety of robustness checks, which I conduct to corroborate my results and to gain more detailed insights in the causal mechanisms underlying the relation between distance to market, conflict, and the occupation decision. As the necessity of an IV procedure is not evident, I use the baseline logit model including all sets of control variables for the robustness checks. Tables reporting the respective results are included in the appendix. 7.1 Different Conflict Periods In order to verify whether contemporaneous conflict is indeed more relevant for the occupation decision than past conflict, I replace the dummy for conflict between 1990 and 1994 with dummies for earlier stages of the conflict: 1980-84 and 1985-89. The results are similar for all periods, but the coefficient on the interaction term between conflict and distance to market is larger for the earlier periods. The negative impact of distance to market seems to be larger in those districts which were affected by the conflict in its early stages. It is important to note that most of these districts still experienced violence in the 1990-94 period, since the conflict expanded gradually and attained its largest geographical coverage in the early 1990s. Of the 80 districts in my dataset, 49 never experienced conflict. Only 9 districts experienced conflict in 1980-84, compared to 23 in 1985-89 and 29 in 1990-94. Only 2 districts which were affected in the 1980s did not have any fatal victim in 1990-94. The significance of the results on the earlier stages of the conflict can thus partly be explained by the considerable overlap between districts affected in the different periods, which makes it difficult to conclude on the relative importance of past and contemporaneous conflict in the occupation decision. Nonetheless, the unexpected larger magnitude of the coefficients on earlier periods raises the question whether the negative effect of distance to market observed in conflict districts could be 19 caused by other factors than the violence. The districts around the epicenter of the conflict, which were affected in its early stages, are culturally and geographically similar. With the expansion of the conflict, the characteristics of the conflict districts became more diverse. To verify whether the coefficient of interest is not merely driven by regional specificities of the core conflict districts, which could explain why its magnitude declines with the geographic expansion, I implement a further robustness check. I run the baseline logit regression with all control variables, however replacing the conflict dummy with distance to Ayacucho, and the interaction term with distance to Ayacucho multiplied by distance to market. In a first specification, I use the subsample of districts never affected by the conflict, in a second one the subsample of districts not affected in 1990-94, and in a third one the full sample. If the coefficient on the interaction term between distance to Ayacucho and distance to market were positive and significant, this would entail that the negative effect of distance to market on the likelihood of being a vendor is a general characteristic of the region around Ayacucho, independently from the conflict. However, the coefficient is negative and significant in the first two specifications (subsamples of non-conflict districts). This means that the effect of distance to market, which on its own is positive in the non-conflict samples, decreases with distance to Ayacucho. When including both conflict and non-conflict districts, the coefficient on the interaction between distance to Ayacucho and distance to market is insignificant. Therefore, the negative effect of distance to market in conflict districts does not seem to be driven by unobserved regional specificities of the districts at the epicenter of the conflict. The larger magnitude of the coefficient on the interaction of distance to market with dummies of conflict in 1980-84 and 1985-89 raises the question whether the effects of the disruption of social and economic life due to the conflict are more persistent than initially expected. This question cannot be finally answered in the scope of this paper, but points towards an interesting direction for future research. 7.2 Conflict Intensity A competing explanation for the larger coefficient on distance to market in districts with conflict in the 1980s is that conflict intensity in 1990-94 was considerably higher in districts affected over the entire conflict period than in those affected only in the early 1990s. In the districts affected in both the early and late stages, the mean number of victims over the 1990-94 period is 27, compared to 10 in the districts only affected in the 1990s. To explore the effect of conflict intensity, I replace the conflict dummy in the baseline logit regression by the number of victims per district between 1990 and 1994. In further specifications, I use dummies for different conflict intensities, namely for more than 10, 20 and 50 victims per district between 1990 and 1994. As pointed out by León [2012], the conflict data collected by the CVR is likely to suffer from measurement error, given that it is self-reported and collected in public hearings. The share of victims caused by different perpetrators (insurgent groups, state agents and others) differs considerably between the CVR database and those of other institutions, such as human rights organizations and ministries (Comisión de la Verdad y Reconciliación [2003]). This might be caused by selective reporting of cases according to the mandate of each organization or its perception by the public. Therefore, the results on conflict intensity need to be interpreted with caution. 20 The coefficient on the interaction term between conflict intensity and distance to market is negative and significant. The same is true for the interactions between different conflict intensity dummies and distance to market. For these, the magnitude of the coefficient increases with the lower bound of the number of victims. Thus, the negative impact of distance to market in conflict districts seems to increase with conflict intensity. As the districts which experienced higher levels of violence are also those which were affected in the early stages of the conflict, it is difficult to disentangle the aspects of persistence and intensity. 7.3 Other Non-Agricultural Self-Employment Non-agricultural self-employment is generally more prevalent in the conflict districts, as shown in the summary statistics. If the differential impact of distance to market applied not only to shopkeepers and vendors, but also to other self-employment activities, this would cast doubt on the hypothesis that the transmission channel is road insecurity and the increase in transport cost. To verify this, I run the baseline logit regression on a dependent variable which takes the value 1 if a person is self-employed in a non-agricultural activity, but not a shopkeeper or vendor. In a second specification, I use a binary dependent variable for all types of non-agricultural self-employment (i.e., including shopkeepers and vendors). In both cases, the coefficient on the interaction term between conflict and distance to market is insignificant. This result indicates that the negative impact of distance to market is indeed proper to vendors and shopkeepers. It can be interpreted as support for the road insecurity hypothesis, since transport costs are likely to affect these professions in particular. 7.4 Outliers and Market Villages As mentioned in section 4, 5 out of the 143 villages in my sample lie very far away from the next market (between 48 and 84 km, compared to a mean of 6.3 km in the rest of the sample). I suspect that these values might be caused by errors, for example misunderstanding concerning the definition of a “market“ or “fair“. Unfortunately, I cannot verify their accuracy, since the LSMS data does not provide village names. While I consider it justifiable to exclude the outliers from my analysis, I report the results of the baseline logit on the full sample for academic transparency. The main result is not robust to the inclusion of the outliers. The coefficient on the interaction term between conflict and distance to market decreases in magnitude and becomes insignificant. The outliers exert considerable leverage on the results. Selective migration is an important concern when studying the impact of conflict on the occupation choice, as mentioned in section 5.3. If persons desiring to work as vendors migrate selectively to villages close to markets, and the same “type“ of person is also more likely to migrate to escape conflict, this can bias the result of interest. I conduct robustness checks using a sample excluding market villages (i.e., for which distance to the next market equals zero), and a sample excluding markets which are at most 2 km from the next market. The interaction term remains negative, significant, and of similar magnitude as with the usual sample, which corroborates my main result. 21 8 Conclusion In sum, this paper examined road insecurity as a transmission channel of conflict, and illustrated how it alters individual occupation decisions at the example of self-employed shopkeepers and vendors in rural areas of Peru. The initial hypothesis was that conflict-induced road insecurity raises the cost of transporting goods to a village, and thus prevents individuals from working as shopkeepers or vendors if they live relatively far from the next market. In line with this hypothesis, the results of the logit regressions display a negative relation between distance to market and the decision to become a self-employed shopkeeper or vendor, but only in districts affected by conflict. This result is however not robust to the 2SLS specification, which was proposed to alleviate concerns about omitted variable bias. The coefficient on the interaction term between conflict and distance to market becomes insignificant, potentially due to the relative inefficiency of the estimator. The endogeneity test results, however, suggest that conflict and its interaction with distance to market can be treated as exogenous in the present specification, in which case the results of the logit model are unbiased. Together with the robustness checks, which show, among others, that the result is neither driven by the geographic and cultural similarity of the districts close to the conflict’s epicenter nor by general effects on self-employment, these results can be viewed as evidence for the existence of a road insecurity channel through which conflict affects rural livelihoods. There seems to be a “transport cost“ imposed on shopkeepers by conflict-induced road insecurity, although, in the absence of data on assaults, it is not possible to say whether the fear of loosing one’s life or the risk of loosing one’s goods in an assault was more important in the Peruvian context. Given that non-agricultural income often represents more than 50% of rural households’ budgets (Davis et al. [2010]), the impact of conflict on the occupation decision can have important implications for poverty. Understanding its underlying mechanisms is thus relevant for development policy. As revealed by the robustness checks, further research is necessary to investigate the potential persistence of the effects of conflict-induced road insecurity. If there is such a thing as “learning by market participation“, and if market participation is inhibited due to road insecurity, conflict regions may suffer a permanent fall-back in terms of productivity. Policies to specifically tackle this problem could constitute a useful component of post-conflict reconstruction efforts. In order to provide insights for policy-makers, further research could focus on the relevance of the road insecurity channel for other activities (such as agriculture or manufacturing in rural areas) and the persistence of its effects. From a broader perspective, this paper highlights one particular way in which conflict can disrupt local economic interaction and affect the daily life of rural populations. By focusing on one aspect of the micro-economic cost of conflict, it shows how conflict restrains individuals’ economic opportunities, and may thereby cause persistent adverse effects on their well-being. 22 References Joshua D. Angrist and Adriana D. Kugler. Rural windfall or a new resource curse? Coca, income and civil conflict in Colombia. The Review of Economics and Statistics, 90(2):191–215, 2008. David G. Blanchflower and Andrew J. Oswald. What Makes an Entrepreneur. Journal of Labor Economics, 16(1):26–60, 1998. John Bound, David A. Jaeger, and Regina M. Baker. Problems With Instrumental Variable Estimation When the Correlation Between the Instruments and the Endogenous Explanatory Variable is Weak. Journal of the American Statistical Association, 90(430):443–450, 1995. Carlos Bozzoli, Tilman Brueck, and Nina Wald. Self-Employment and Conflict in Colombia. DIW Discussion Papers, 1098, 2011. A. Colin Cameron and Pravin K. Trivedi. Microeconometrics: Methods and Applications. Cambridge University Press, New York, 2005. Comisión de la Verdad y Reconciliación. Informe Final. Comisión de la Verdad y Reconciliación, Lima, 2003. Benjamin Davis, Paul Winters, and Gero Carletto. A Cross-Country Comparison of Rural Income Generating Activities. World Development, 38(1):48–63, 2010. Carlos Iván Degregori. ¿Por qué apareció Sendero Luminoso en Ayacucho? El desarrollo de la educación y la generación del 69 en Ayacucho y Huanta. In Anne Pérotin-Dumon, editor, Historizar el pasado vivo en América Latina. 2007. URL http://www.historizarelpasadovivo.cl/. Klaus Deininger. Causes and Consequences of Civil Strife. Micro-Level Evidence from Uganda. World Bank Policy Research Working Paper, 3045, 2003. Klaus Deininger, Songqing Jin, and Mona Sur. Sri Lanka’s Rural Non-Farm Economy: Removing Constraints to Pro-Poor Growth. World Development, 35(12):2056–2078, 2007. Ponciano Del Pino. Tiempos de guerra y de dioses. Ronderos, evangélicos y senderistas en el valle del rı́o Apurı́mac. In Carlos Iván Degregori, José Coronel, Ponciano Del Pino, and Orin Starn, editors, Las rondas campesinas y la derrota de Sendero Luminoso, pages 114–185. Instituto de Estudios Peruanos, Lima, 1996. Distance Calculator Peru. URL http://distancecalculator.globefeed.com/Peru_Distance_ Calculator.asp. Javier Escobal. The Determinants of Nonfarm Income Diversification in Rural Peru. World Development, 29(3):497–508, 2001. David S. Evans and Boyan Jovanovich. An Estimated Model of Entrepreneurial Choice under Liquidity Constraints. Journal of Political Economy, 97(4):808–827, 1989. Yannis Georgellis, John G. Sessions, and Nikolaos Tsitsianis. Self-Employment Longitudinal Dynamics: A Review of the Literature. Economic Issues, 10:291–296, 2005. 23 William Greene. Testing hypotheses about interaction terms in nonlinear models. Economics Letters, 107, 2010. Hans Huerto Amado. Combatir narcotráfico es clave para erradicar a Sendero del VRAE, según especialistas. El Comercio, February 12, 2012. Instituto Nacional de Estadı́stica e Informática. Censos Nacionales IX de Población y IV de Vivienda, 1993. URL http://iinei.inei.gob.pe/iinei/RedatamCpv1993.asp?id= ResultadosCensales?ori=C. Instituto Nacional de Estadı́stica e Informática. Encuesta Nacional de Hogares sobre Medición de Niveles de Vida (Living Standards Measurement Survey), 1994. URL http://microdata. worldbank.org/index.php/catalog/617. Instituto Nacional de Estadı́stica e Informática. Perú. Compendio Estadı́stico 2007. INEI, Lima, 2007. Otabek Ismatullayev. Erythroxylum coca, 2011. URL http://bioweb.uwlax.edu/bio203/2011/ ismatull_otab/index.htm. Hanan G. Jacoby. Markets and the Benefits of Rural Roads. The Economic Journal, 110(465): 713–737, 2000. Songqing Jin and Klaus Deininger. Key Constraints for Rural Non-Farm Activity in Tanzania: Combining Investment Climate and Household Surveys. Journal of African Economies, 18(2): 319–361, 2008. Patricia Justino. War and Poverty. Microcon Research Working Paper, 32, 2010. Adam C Kolasinski and Andrew F. Siegel. On the economic meaning of interaction term coefficients in non-linear binary response regression models, 2010. URL http://ssrn.com/abstract= 1668750orhttp://dx.doi.org/10.2139/ssrn.1668750. Anh T. Le. Empirical Studies of Self-Employment. Journal of Economic Surveys, 13(4):381–416, 1999. Gianmarco León. Civil Conflict and Human Capital Accumulation: Long Term Consequences of Political Violence in Peru. Journal of Human Resources (forthcoming), 2012. Héctor Luja Peralta and Mario Zapata Tejerina. La educación superior en Perú. CRESALC UNESCO, 1988. Jorge Morello and Silvia Diana Matteucci. Aspectos ecológicos del cultivo de coca. Encrucijadas (UBA), 1(8):82–91, 2001. Edward C. Norton, Hua Wang, and Chunrong Ai. Computing interaction effects and standard errors in logit and probit models. The Stata Journal, 4(2):154–167, 2004. Anna L. Paulson and Robert Townsend. Entrepreneurship and financial constraints in Thailand. Journal of Corporate Finance, 10:229–262, 2004. 24 Christina Paxson and Norbert R. Schady. The Allocation and Impact of Social Funds: Spending on School Infratstructure in Peru. The World Bank Economic Review, 16(2):297–319, 2002. Mark E. Schaffer. Statalist entry: st: RE: Interpreting Kleibergen Paap weak instrument statistic, 2012. URL http://www.stata.com/statalist/archive/2012-06/msg01166.html. Kimberly Theidon. Terror’s Talk: Fieldwork and War. Dialectical Anthropology, 26:19–35, 2001. 25 A Appendix 26 Table 5: List of Variables VARIABLE DEFINITION LEVEL Dependent variable Vendor Dummy: Self-employed shopkeeper or vendor Individual Independent variables Conflict Distance to market Conflict*Distance to market Dummy: No. of victims between 1990-94 > 1 Distance in km Interaction term District Village Individual controls Female Age Primary education Dummy Age in years Dummy: Completed primary school Individual Individual Individual Household controls Children Adults Consumption expenditures Credit access Electricity Numbe of children (0-14) in household Other adults in household Yearly, in 100 Peruvian Soles Dummy: Self reported access to credit Dummy: Dwelling has electricity Household Household Household Household Household Village and district controls More households in village Less households in village FONCODES Easier to find job More difficult to find job Education level Dummy: No. of households increased since 1991 Dummy No. of households decreased since 1991 Dummy: FONCODES invested in village Dummy: Easier to find a job than in 1991 Dummy: More difficult to find a job than in 1991 % of adults (≥15) with secondary education Village Village Village Village Village District Geographical controls log(altitude) Precipitation log(altitude of district capital in m) Average yearly precipitation in mm, 1997-2006 District Department Instruments for conflict Distance to Ayacucho Distance of district capital to Ayacucho, in km District Temperature Average temperature in C◦ , 1997-2006 Department Sources: Conflict: Comisión de la Verdad y Reconciliación (CVR) Individual, household and village-level variables: Living Standards Measurement Survey 1994 (LSMS) Education level: National Census 1993, INEI Altitude: Google Earth Temperature, precipitation: Instituto Nacional de Estadı́stica e Informatı́ca 2007 (INEI) Distance to Ayacucho: Peru Distance Calculator (PDC) 27 Table 6: Summary statistics - Occupations VARIABLES Full sample Non-agricultural self-employment Observations Only self-employed Shopkeeper or vendor Tailor, sewer, or other textile professional Services Artisan or craftsman Food production Technical professional (1) Full sample mean (sd) (2) No conflict mean (sd) (3) Conflict mean (sd) (4) Difference (2)-(3) 0.19 (0.39) 0.17 (0.37) 0.22 (0.42) -0.06*** 2122 1297 825 - 0.54 (0.50) 0.14 (0.35) 0.11 (0.31) 0.11 (0.31) 0.06 (0.22) 0.05 (0.22) 0.54 (0.50) 0.10 (0.31) 0.10 (0.33) 0.11 (0.32) 0.08 (0.27) 0.05 (0.22) 0.54 (0.50) 0.19 (0.39) 0.12 (0.31) 0.09 (0.29) 0.02 (0.13) 0.05 (0.22) 0.01 -0.08** -0.02 0.03 0.06*** 0.00 Observations 412 229 183 Asterisks denote the significance level of the t-test on the difference between conflict and non-conflict districts: *** p<0.01, ** p<0.05, * p<0.1. 28 Table 7: Summary statistics - All regression variables, by vendor VARIABLES Conflict Distance to market Female Age Primary Children Adults Consumption expenditures Credit access Electricity More households in village Less households in village FONCODES Easier to find job More difficult to find job Education level Altitude Precipitation Distance to Ayacucho Temperature (1) Full sample mean (sd) (2) Non-vendors mean (sd) (3) Vendors mean (sd) (4) Difference (2)-(3) 0.39 (0.49) 6.28 (6.26) 0.42 (0.49) 36.58 (15.85) 0.47 (0.50) 2.34 (1.81) 2.56 (1.76) 47.38 (43.22) 0.15 (0.36) 0.23 (0.42) 0.66 (0.47) 0.06 (0.24) 0.47 (0.50) 0.17 (0.37) 0.59 (0.49) 45.54 (15.15) 1718 (1436) 758.65 (681.38) 591 (293) 16.8 (5.8) 0.38 (0.49) 6.35 (6.30) 0.39 (0.49) 36.47 (16.00) 0.46 (0.50) 2.35 (1.84) 2.60 (1.78) 46.06 (42.94) 0.14 (0.35) 0.22 (0.41) 0.66 (0.47) 0.07 (0.25) 0.46 (0.50) 0.18 (0.38) 0.59 (0.49) 45.44 (15.14) 1738 (1444) 762.57 (677.05) 597 (292) 16.8 (5.7) 0.45 (0.50) 5.60 (5.93) 0.65 (0.48) 37.57 (14.50) 0.63 (0.48) 2.28 (1.61) 2.17 (1.50) 58.71 (44.04) 0.23 (0.42) 0.33 (0.47) 0.67 (0.47) 0.03 (0.16) 0.56 (0.50) 0.11 (0.32) 0.60 (0.49) 46.38 (15.21) 1543 (1350) 724.90 (718.29) 538 (300) 16.4 (6.11) -0.7** .75* -.26*** -1.10 -.18*** .07 .43*** -12.65*** -.09*** -.11*** -.01 .04** -.10*** .06** -.01 -.94 195* 37.67 59*** 0.4 Observations 2,122 1,901 221 Asterisks denote the significance level of the t-test on the difference between conflict and non-conflict districts: *** p<0.01, ** p<0.05, * p<0.1. 29 Table 8: Summary statistics - All regression variables, by conflict VARIABLES Vendor Distance to market Female Age Primary education Children Adults Consumption expenditures Credit access Electricity More households in village Less households in village FONCODES Easier to find job More difficult to find job Education level Altitude Precipitation Distance to Ayacucho Temperature Observations (1) Full sample mean (sd) (2) No conflict mean (sd) (3) Conflict mean (sd) (4) Difference (2)-(3) 0.10 (0.31) 6.28 (6.26) 0.42 (0.49) 36.6 (15.9) 0.47 (0.50) 2.34 (1.81) 2.56 (1.76) 47.38 (43.22) 0.15 (0.36) 0.23 (0.42) 0.66 (0.47) 0.06 (0.24) 0.47 (0.50) 0.17 (0.37) 0.59 (0.49) 45.54 (15.15) 1718 (1436) 758.65 (681.38) 591 (293) 16.8 (5.8) 0.09 (0.29) 5.67 (5.74) 0.40 (0.49) 37.5 (16.2) 0.45 (0.50) 2.16 (1.79) 2.60 (1.79) 47.37 (44.75) 0.15 (0.36) 0.23 (0.42) 0.65 (0.48) 0.06 (0.25) 0.48 (0.50) 0.08 (0.27) 0.67 (0.47) 43.39 (14.41) 1745 (1447) 647.42 (669.27) 667 (254) 16.6 (5.5) 0.12 (0.33) 7.22 (6.90) 0.44 (0.50) 35.2 (15.2) 0.51 (0.50) 2.63 (1.82) 2.49 (1.71) 47.39 (40.73) 0.16 (0.36) 0.22 (0.41) 0.68 (0.46) 0.06 (0.23) 0.46 (0.50) 0.31 (0.46) 0.48 (0.50) 48.91 (15.66) 1677 (1417) 933.52 (663.70) 472 (310) 17 (6.1) -.03** 2,122 1,297 825 -1.55*** -.04* 2.3*** -.05** -.47*** .11 -.02 -.01 .01 -.03 .01 .02 -.23*** .19*** -5.52*** 68 -286.01*** 195*** -0.4 - Asterisks denote the significance level of the t-test on the difference between conflict and non-conflict districts: *** p<0.01, ** p<0.05, * p<0.1. 30 Table 9: Summary statistics - All regression variables, by inclusion in fixed effects sample VARIABLES Vendor Conflict Distance to market Female Age Primary education Children Adults Consumption expenditures Credit access Electricity More households in village Less households in village FONCODES Easier to find job More difficult to find job Education level Altitude Precipitation Distance to Ayacucho Temperature Observations (1) Full sample mean (sd) (2) Included mean (sd) (3) Excluded mean (sd) (4) Difference (2)-(3) 0.10 (0.31) 0.39 (0.49) 6.28 (6.26) 0.42 (0.49) 36.6 (15.9) 0.47 (0.50) 2.34 (1.81) 2.56 (1.76) 47.38 (43.22) 0.15 (0.36) 0.23 (0.42) 0.66 (0.47) 0.06 (0.24) 0.47 (0.50) 0.17 (0.37) 0.59 (0.49) 45.54 (15.15) 1718 (1436) 758.65 (681.38) 591 (293) 16.8 (5.8) 0.13 (0.33) 0.42 (0.49) 5.99 (5.79) 0.41 (0.49) 36.1 (15.5) 0.48 (0.50) 2.38 (1.74) 2.58 (1.77) 49.64 (46.37) 0.16 (0.37) 0.26 (0.44) 0.69 (0.46) 0.07 (0.26) 0.51 (0.50) 0.16 (0.36) 0.61 (0.49) 45.86 (15.52) 1520 (1391) 760.36 (737.07) 590 (299) 17.18 (5.97) 0 (0) 0.24 (0.43) 7.58 (7.92) 0.44 (0.50) 38.9 (17.1) 0.43 (0.50) 2.18 (2.10) 2.48 (1.73) 37.24 (21.84) 0.10 (0.30) 0.08 (0.27) 0.57 (0.50) 0.02 (0.14) 0.28 (0.45) 0.22 (0.41) 0.51 (0.50) 45.54 (13.25) 2608 (1288) 750.98 (717.75) 594 (265) 15.00 (4.23) 0.13*** 2,122 1735 387 0.18*** -1.59*** -0.03 -2.8*** 0.06** 0.20* 0.10 12.40*** 0.07*** 0.18*** 0.12*** 0.05*** 0.23*** -0.06*** 0.10*** 1.75** -1089*** 9.38 -3.38 2.18*** - Asterisks denote the significance level of the t-test on the difference between conflict and non-conflict districts: *** p<0.01, ** p<0.05, * p<0.1. 31 Table 10: OLS with interaction terms - all coefficients reported VARIABLES Conflict (1) (2) 0.078** (0.034) 0.002 (0.003) -0.008** (0.004) 0.097*** (0.015) 0.004** (0.002) -0.000* (0.000) 0.061*** (0.016) - Dependent variable: Vendor (3) (4) (5) Adults - Consumption expenditures - Credit access - Electricity - More households in village - 0.077** (0.034) 0.003 (0.003) -0.008** (0.004) 0.095*** (0.015) 0.004** (0.002) -0.000** (0.000) 0.043** (0.017) -0.003 (0.004) -0.011** (0.005) 0.001* (0.000) 0.036 (0.029) 0.044* (0.025) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Education level - - Log(altitude) - - -0.009 (0.026) -0.067* (0.034) 0.032 (0.022) -0.056* (0.034) -0.027 (0.025) 0.000 (0.001) - Precipitation - - - -0.077** (0.035) -0.073* (0.037) -0.074* (0.044) Distance to market Conflict*Distance to market Female Age Age2 Primary education Children Constant 0.083** (0.033) 0.003 (0.003) -0.008** (0.004) 0.098*** (0.015) 0.004** (0.002) -0.000* (0.000) 0.056*** (0.015) - 0.082** (0.033) 0.003 (0.003) -0.009** (0.004) 0.096*** (0.015) 0.004** (0.002) -0.000** (0.000) 0.042** (0.017) -0.003 (0.004) -0.011** (0.005) 0.001 (0.000) 0.035 (0.030) 0.044* (0.025) -0.005 (0.006) -0.000 (0.000) -0.038 (0.056) 0.082** (0.032) 0.003 (0.003) -0.008** (0.004) 0.096*** (0.015) 0.004** (0.002) -0.000** (0.000) 0.038** (0.017) -0.005 (0.004) -0.012** (0.005) 0.001* (0.000) 0.035 (0.029) 0.054** (0.026) -0.018 (0.025) -0.079** (0.037) 0.027 (0.022) -0.060* (0.031) -0.037 (0.025) -0.000 (0.001) -0.029 (0.052) (6) 0.087*** (0.032) 0.003 (0.003) -0.008** (0.004) 0.097*** (0.015) 0.004** (0.002) -0.000** (0.000) 0.038** (0.016) -0.005 (0.004) -0.012** (0.005) 0.001 (0.000) 0.033 (0.029) 0.059** (0.025) -0.021 (0.025) -0.080** (0.035) 0.027 (0.022) -0.056* (0.031) -0.038 (0.025) -0.001 (0.001) -0.006 (0.006) -0.000 (0.000) 0.030 (0.070) R-squared 0.049 0.065 0.057 0.066 0.074 0.075 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 32 Table 11: OLS with district fixed effects - all coefficients reported VARIABLES Distance to market Dependent variable: Vendor (2) (3) (1) Adults - Consumption expenditures - Credit access - Electricity - More households in village - 0.011** (0.004) -0.014*** (0.005) 0.103*** (0.016) 0.005*** (0.002) -0.000** (0.000) 0.050*** (0.017) -0.008 (0.005) -0.008* (0.005) 0.001** (0.000) 0.041 (0.029) 0.029 (0.032) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Yes -0.018 (0.073) Conflict*Distance to market Female Age Age2 Primary education Children District fixed effects Constant 0.010** (0.004) -0.013** (0.005) 0.106*** (0.016) 0.004*** (0.002) -0.000** (0.000) 0.062*** (0.016) - 0.011*** (0.004) -0.012*** (0.005) 0.106*** (0.016) 0.004*** (0.002) -0.000** (0.000) 0.060*** (0.015) - 0.015 (0.027) -0.095 (0.067) 0.036 (0.023) -0.003 (0.026) -0.014 (0.025) 0.011*** (0.004) -0.013*** (0.004) 0.103*** (0.016) 0.005*** (0.002) -0.000** (0.000) 0.048*** (0.017) -0.008* (0.005) -0.008* (0.005) 0.001** (0.000) 0.045 (0.030) 0.027 (0.030) 0.013 (0.026) -0.098 (0.065) 0.038 (0.025) -0.018 (0.027) -0.023 (0.026) Yes Yes Yes -0.045 (0.093) -0.012 (0.072) -0.023 (0.084) - R-squared 0.141 0.153 0.146 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 33 (4) 0.158 Table 12: Logit with interaction terms - all coefficients reported VARIABLES Conflict (1) (2) 0.854** (0.342) 0.024 (0.033) -0.100** (0.049) 1.081*** (0.153) 0.058** (0.026) -0.001* (0.000) 0.706*** (0.175) - Dependent variable: Vendor (3) (4) Adults - Consumption expenditures - Credit access - Electricity - More households in village - 0.898*** (0.339) 0.037 (0.033) -0.102** (0.048) 1.071*** (0.157) 0.061** (0.026) -0.001** (0.000) 0.501*** (0.194) -0.035 (0.055) -0.152** (0.067) 0.004* (0.002) 0.357 (0.252) 0.474* (0.245) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Education level - - Log(altitude) - - -0.105 (0.278) -1.005** (0.486) 0.352 (0.244) -0.590 (0.448) -0.268 (0.263) 0.002 (0.007) - Precipitation - - - -4.574*** (0.497) -4.486*** (0.518) -4.508*** (0.560) Distance to market Conflict*Distance to market Female Age Age2 Primary education Children Constant 0.877*** (0.329) 0.031 (0.031) -0.094** (0.046) 1.091*** (0.154) 0.057** (0.026) -0.001* (0.000) 0.655*** (0.172) - 0.958*** (0.336) 0.040 (0.034) -0.108** (0.048) 1.083*** (0.158) 0.059** (0.025) -0.001* (0.000) 0.489** (0.190) -0.032 (0.056) -0.156** (0.065) 0.004 (0.002) 0.328 (0.261) 0.503** (0.246) -0.062 (0.061) -0.000 (0.000) -4.020*** (0.662) (5) (6) 0.948*** (0.316) 0.041 (0.031) -0.093** (0.043) 1.087*** (0.158) 0.062** (0.026) -0.001** (0.000) 0.465** (0.192) -0.051 (0.057) -0.156** (0.070) 0.004** (0.002) 0.346 (0.243) 0.646*** (0.250) -0.208 (0.270) -1.267** (0.534) 0.331 (0.248) -0.660* (0.394) -0.404 (0.262) -0.004 (0.007) - 1.011*** (0.313) 0.041 (0.033) -0.100** (0.043) 1.101*** (0.158) 0.061** (0.025) -0.001** (0.000) 0.450** (0.187) -0.053 (0.056) -0.162** (0.068) 0.004* (0.002) 0.310 (0.248) 0.743*** (0.252) -0.240 (0.271) -1.271** (0.520) 0.320 (0.257) -0.622 (0.382) -0.436 (0.266) -0.008 (0.007) -0.084 (0.068) 0.000 (0.000) -3.238*** (0.814) -4.025*** (0.630) Pseudo-R2 0.076 0.097 0.088 0.099 0.113 0.115 Log-pseudolikelihood -655.0 -640.2 -646.4 -639.2 -628.7 -627.3 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 34 Table 13: Logit with district fixed effects - all coefficients reported VARIABLES Distance to market Dependent variable: Vendor (2) (3) (1) Adults - Consumption expenditures - Credit access - Electricity - More households in village - 0.114*** (0.028) -0.211*** (0.077) 1.340*** (0.179) 0.071*** (0.023) -0.001*** (0.000) 0.645*** (0.213) -0.118* (0.063) -0.131* (0.067) 0.007*** (0.003) 0.454 (0.284) 0.243 (0.345) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Yes -4.094*** (0.747) Conflict*Distance to market Female Age Age2 Primary education Children District fixed effects Constant 0.104*** (0.031) -0.185** (0.076) 1.331*** (0.176) 0.064*** (0.022) -0.001** (0.000) 0.811*** (0.193) - 0.107*** (0.033) -0.154* (0.082) 1.331*** (0.176) 0.064*** (0.022) -0.001** (0.000) 0.798*** (0.195) - (4) 0.255 (0.288) -1.335 (0.828) 0.430 (0.281) 0.039 (0.474) -0.408 (0.260) 0.117*** (0.030) -0.166** (0.077) 1.342*** (0.179) 0.071*** (0.024) -0.001** (0.000) 0.611*** (0.215) -0.121* (0.066) -0.139** (0.068) 0.008*** (0.003) 0.501* (0.295) 0.288 (0.335) 0.249 (0.284) -1.539* (0.886) 0.475 (0.320) -0.283 (0.486) -0.548* (0.281) Yes Yes Yes -4.377*** (1.068) -4.221*** (0.843) -4.121*** (0.989) - Pseudo-R2 0.174 0.195 0.182 0.205 Log-pseudolikelihood -546.8 -533.0 -541.0 -525.7 Number of observations: 1,735. 19 districts were omitted, as there were no vendors among the observations in these, such that the district dummy predicted failure perfectly. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 35 Table 14: Cloglog with interaction terms - all coefficients reported VARIABLES Conflict (1) (2) 0.778** (0.315) 0.022 (0.031) -0.092** (0.047) 1.006*** (0.146) 0.052** (0.026) -0.001 (0.000) 0.658*** (0.164) - Dependent variable: Vendor (3) (4) Adults - Consumption expenditures - Credit access - Electricity - More households in village - 0.827*** (0.312) 0.034 (0.030) -0.094** (0.046) 0.984*** (0.147) 0.055** (0.025) -0.001* (0.000) 0.465** (0.183) -0.028 (0.049) -0.137** (0.063) 0.003** (0.001) 0.339 (0.229) 0.431* (0.225) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Education level - - Log(altitude) - - -0.090 (0.257) -0.945** (0.457) 0.331 (0.229) -0.532 (0.428) -0.234 (0.240) 0.002 (0.006) - Precipitation - - - -4.436*** (0.477) -4.312*** (0.491) -4.393*** (0.529) Distance to market Conflict*Distance to market Female Age Age2 Primary education Children Constant 0.787*** (0.301) 0.029 (0.029) -0.086** (0.044) 1.010*** (0.147) 0.051** (0.025) -0.001 (0.000) 0.608*** (0.161) - 0.870*** (0.306) 0.036 (0.031) -0.099** (0.045) 0.994*** (0.148) 0.053** (0.025) -0.001* (0.000) 0.451** (0.179) -0.026 (0.051) -0.142** (0.062) 0.002* (0.001) 0.308 (0.238) 0.463** (0.227) -0.054 (0.055) -0.000 (0.000) -3.906*** (0.613) (5) (6) 0.870*** (0.289) 0.037 (0.029) -0.084** (0.041) 0.992*** (0.148) 0.055** (0.025) -0.001* (0.000) 0.430** (0.178) -0.045 (0.052) -0.143** (0.066) 0.003** (0.001) 0.325 (0.218) 0.598*** (0.230) -0.197 (0.253) -1.198** (0.496) 0.325 (0.229) -0.581 (0.382) -0.367 (0.240) -0.004 (0.006) - 0.918*** (0.283) 0.037 (0.030) -0.090** (0.040) 1.003*** (0.148) 0.054** (0.024) -0.001* (0.000) 0.413** (0.174) -0.046 (0.051) -0.149** (0.065) 0.003** (0.001) 0.292 (0.224) 0.686*** (0.234) -0.221 (0.254) -1.198** (0.488) 0.311 (0.237) -0.550 (0.370) -0.399 (0.245) -0.007 (0.007) -0.071 (0.061) 0.000 (0.000) -3.217*** (0.762) -3.892*** (0.589) Log-pseudolikelihood -655.4 -641.3 -646.8 -640.4 -629.6 -628.3 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 36 Table 15: Cloglog with district fixed effects - all coefficients reported VARIABLES Distance to market Dependent variable: Vendor (2) (3) (1) Adults - Consumption expenditures - Credit access - Electricity - More households in village - 0.105*** (0.021) -0.195*** (0.072) 1.184*** (0.164) 0.056*** (0.021) -0.001** (0.000) 0.576*** (0.194) -0.106* (0.056) -0.124** (0.060) 0.006*** (0.002) 0.434* (0.238) 0.251 (0.311) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Yes -3.838*** (0.676) Conflict*Distance to market Female Age Age2 Primary education Children District fixed effects Constant 0.095*** (0.025) -0.168** (0.069) 1.165*** (0.163) 0.051** (0.021) -0.000* (0.000) 0.721*** (0.178) - 0.091*** (0.029) -0.134* (0.075) 1.162*** (0.162) 0.051** (0.021) -0.000* (0.000) 0.719*** (0.181) - (4) 0.214 (0.263) -1.288* (0.760) 0.428* (0.247) 0.093 (0.421) -0.339 (0.231) 0.101*** (0.025) -0.141** (0.071) 1.185*** (0.164) 0.056*** (0.021) -0.001** (0.000) 0.549*** (0.195) -0.110* (0.057) -0.136** (0.061) 0.007*** (0.002) 0.478* (0.251) 0.289 (0.300) 0.228 (0.256) -1.552* (0.839) 0.489* (0.287) -0.276 (0.455) -0.472* (0.251) Yes Yes Yes -4.232*** (0.982) -4.012*** (0.748) -3.875*** (0.891) - Log-Pseudolikelihood -548.1 -532.7 -541.7 -524.1 Number of observations: 1,735. 19 districts were omitted, as there were no vendors among the observations in these, such that the district dummy predicted failure perfectly. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 37 Table 16: 2SLS: First stage on conflict (all coefficients) VARIABLES Distance to Ayacucho in 100km (1) (2) -0.079*** (0.017) 0.021** (0.009) 0.011* (0.006) 0.019 (0.015) -0.002 (0.003) -0.000 (0.000) 0.014 (0.037) - Dependent variable: Conflict (3) (4) Adults - Consumption expenditures - Credit access - Electricity - More households in village - -0.082*** (0.017) 0.021** (0.008) 0.011* (0.006) 0.017 (0.015) -0.004 (0.003) 0.000 (0.000) 0.026 (0.036) 0.029*** (0.010) 0.006 (0.014) -0.000 (0.000) 0.045 (0.067) -0.074 (0.098) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Education level - - Log(altitude) - - 0.071 (0.100) -0.069 (0.166) 0.017 (0.078) 0.214* (0.123) -0.048 (0.096) 0.001 (0.003) - Precipitation in 100mm - - - 0.477*** (0.147) 0.456*** (0.147) 0.398** (0.190) Temperature Distance to market Female Age Age2 Primary education Children Constant -0.068*** (0.018) 0.016* (0.009) 0.009 (0.007) 0.017 (0.013) -0.002 (0.003) 0.000 (0.000) 0.026 (0.034) - -0.082*** (0.016) 0.020** (0.010) 0.007 (0.007) 0.005 (0.014) -0.004 (0.003) 0.000 (0.000) 0.016 (0.037) 0.021** (0.010) 0.004 (0.014) -0.000 (0.000) 0.045 (0.064) -0.063 (0.094) 0.010 (0.030) 0.013* (0.008) 0.347 (0.362) (5) (6) -0.070*** (0.018) 0.016* (0.009) 0.009 (0.007) 0.013 (0.013) -0.004* (0.003) 0.000 (0.000) 0.032 (0.035) 0.030*** (0.010) 0.002 (0.013) -0.000 (0.000) 0.045 (0.063) -0.065 (0.096) 0.087 (0.096) -0.045 (0.165) 0.016 (0.078) 0.210* (0.123) -0.034 (0.096) 0.002 (0.003) - -0.070*** (0.018) 0.016 (0.010) 0.005 (0.008) 0.002 (0.012) -0.004 (0.002) 0.000 (0.000) 0.022 (0.035) 0.022** (0.010) -0.001 (0.013) -0.000 (0.000) 0.049 (0.061) -0.052 (0.091) 0.081 (0.093) -0.071 (0.167) -0.001 (0.077) 0.168 (0.126) -0.062 (0.094) 0.002 (0.003) 0.015 (0.033) 0.012* (0.007) 0.199 (0.436) 0.335* (0.188) R2 0.184 0.202 0.224 0.229 0.241 0.266 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 38 Table 17: 2SLS: First stage on conflict*distance to market (all coefficients) VARIABLES Distance to Ayacucho in 100km (1) Adults - Consumption expenditures - Credit access - Electricity - More households in village - -0.610*** (0.178) 0.347*** (0.101) 0.517*** (0.116) 0.226 (0.146) -0.018 (0.025) 0.000 (0.000) 0.012 (0.355) 0.151 (0.097) -0.012 (0.100) -0.004 (0.003) 0.016 (0.444) -0.292 (0.707) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Education level - - Log(altitude) - - 1.042 (0.848) -1.008 (1.508) 0.091 (0.634) 2.540* (1.325) -0.270 (0.883) 0.025 (0.017) - Precipitation in 100mm - - - -2.512* (1.423) -2.351 (1.459) -4.246** (1.783) Temperature Distance to market Female Age Age2 Primary education Children Constant -0.596*** (0.172) 0.343*** (0.101) 0.524*** (0.114) 0.246 (0.153) -0.003 (0.026) -0.000 (0.000) -0.028 (0.353) - Dependent variable: Conflict*Distance to market (2) (3) (4) (5) -0.451*** (0.175) 0.274*** (0.092) 0.501*** (0.112) 0.217 (0.143) -0.002 (0.024) -0.000 (0.000) 0.082 (0.328) - -0.608*** (0.176) 0.359*** (0.103) 0.499*** (0.123) 0.162 (0.131) -0.014 (0.025) 0.000 (0.000) -0.022 (0.371) 0.115 (0.099) -0.022 (0.101) -0.002 (0.003) 0.057 (0.441) -0.234 (0.688) 0.142 (0.193) 0.056 (0.085) -3.828 (2.472) -0.450** (0.175) 0.273*** (0.090) 0.495*** (0.112) 0.182 (0.140) -0.021 (0.023) 0.000 (0.000) 0.069 (0.335) 0.173* (0.099) -0.067 (0.086) -0.005* (0.003) -0.006 (0.433) -0.265 (0.702) 1.141 (0.810) -0.958 (1.507) 0.059 (0.640) 2.560** (1.293) -0.189 (0.911) 0.030* (0.017) -4.160** (1.847) (6) -0.429** (0.176) 0.304*** (0.093) 0.484*** (0.120) 0.113 (0.129) -0.016 (0.023) 0.000 (0.000) 0.036 (0.343) 0.148 (0.094) -0.083 (0.085) -0.003 (0.003) 0.113 (0.436) -0.295 (0.678) 1.166 (0.803) -1.045 (1.498) -0.017 (0.617) 2.352* (1.259) -0.229 (0.862) 0.040** (0.018) 0.283 (0.218) 0.043 (0.079) -7.365** (3.137) R2 0.184 0.202 0.224 0.229 0.241 0.266 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 39 Table 18: 2SLS: Second stage (all coefficients) VARIABLES Conflict (1) (2) 0.10 (0.111) -0.01 (0.013) 0.00 (0.007) 0.10*** (0.015) 0.00** (0.002) -0.00* (0.000) 0.06*** (0.016) - Dependent variable: Vendor (3) (4) (5) Adults - Consumption expenditures - Credit access - Electricity - More households in village - 0.07 (0.110) -0.01 (0.012) 0.00 (0.007) 0.09*** (0.015) 0.00** (0.002) -0.00** (0.000) 0.04** (0.017) -0.00 (0.005) -0.01** (0.005) 0.00* (0.000) 0.04 (0.030) 0.04* (0.026) - Less households in village - - FONCODES - - Easier to find job - - More difficult to find job - - Education level - - Log(altitude) - - -0.02 (0.030) -0.07 (0.047) 0.03 (0.024) -0.09** (0.042) -0.02 (0.025) -0.00 (0.001) - Precipitation - - - -0.08 (0.063) -0.07 (0.059) -0.06 (0.083) Conflict*Distance to market Distance to market Female Age Age2 Primary education Children Constant 0.16 (0.121) -0.00 (0.014) 0.00 (0.007) 0.09*** (0.015) 0.00** (0.002) -0.00* (0.000) 0.05*** (0.016) - 0.10 (0.121) -0.01 (0.014) 0.00 (0.007) 0.10*** (0.015) 0.00** (0.002) -0.00** (0.000) 0.04** (0.016) -0.00 (0.005) -0.01** (0.005) 0.00 (0.000) 0.03 (0.031) 0.04* (0.026) -0.01 (0.007) -0.00 (0.000) -0.04 (0.058) 0.12 (0.118) -0.00 (0.013) 0.00 (0.007) 0.09*** (0.015) 0.00** (0.002) -0.00** (0.000) 0.04** (0.017) -0.01 (0.005) -0.01** (0.005) 0.00* (0.000) 0.03 (0.029) 0.06** (0.026) -0.03 (0.028) -0.08* (0.046) 0.03 (0.023) -0.08** (0.038) -0.03 (0.024) -0.00 (0.001) -0.01 (0.077) Adjusted R2 0.040 0.058 0.030 0.059 0.056 Underidentification P-value 0.0002 0.0002 0.0026 0.0009 0.0025 Weak identification Kleibergen-Paap rk Wald F-stat 7.81 8.49 5.05 7.23 5.26 Endogeneity test P-value 0.56 0.85 0.28 0.92 0.46 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, Stock-Yogo critical values: 7.03 (10%), 4.58 (15%), 3.95 (20%). (6) 0.18 (0.119) -0.01 (0.014) 0.00 (0.007) 0.10*** (0.015) 0.00*** (0.002) -0.00** (0.000) 0.04** (0.016) -0.01 (0.005) -0.01** (0.005) 0.00* (0.000) 0.03 (0.030) 0.06** (0.026) -0.03 (0.029) -0.08* (0.043) 0.03 (0.024) -0.07* (0.038) -0.03 (0.025) -0.00 (0.001) -0.01 (0.008) -0.00 (0.000) 0.05 (0.083) 0.053 0.0005 8.59 0.46 * p<0.1. Table 19: Robustness check 1a: Different Conflict periods Dependent variable: Vendor VARIABLES Distance to market Conflict 1980-84 Conflict 80-84*Dist. market Conflict 1985-89 (1) (2) (3) (4) (5) 0.018 (0.027) 1.800*** (0.338) -0.173*** (0.062) - 0.034 (0.027) - 0.041 (0.033) - 0.044 (0.033) - - - 0.046 (0.033) 1.159*** (0.428) -0.094 (0.065) 1.012** (0.422) -0.097* (0.052) -0.092 (0.396) -0.022 (0.042) - - Conflict 85-89*Dist. market - Conflict 1990-94 - 1.420*** (0.332) -0.147*** (0.052) - Conflict 90-94 *Distance to market - - Conflict 1980-94 - - 1.011*** (0.313) -0.100** (0.043) - Conflict 80-94 *Distance to market - - - - Individual controls Household controls Village and district controls Geographical controls X X X X X X X X X X X X X X X X X X X X 0.134 -614.3 0.118 -625.2 - Pseud-R2 0.124 0.125 0.115 Log pseudo-likelihood -620.9 -620.5 -627.3 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 41 1.119*** (0.302) -0.108** (0.042) Table 20: Robustness check 1b: Distance to Ayacucho Sample: VARIABLES Distance to market Distance to Ayacucho Distance to Ayacucho*Distance to market Individual controls Household controls Village and district controls Geographical controls No conflict 1980-94 (1) No conflict 1990-94 (2) Usual sample (3) 0.275*** (0.062) 0.002* (0.001) -0.000*** (0.000) 0.247*** (0.056) 0.001* (0.001) -0.000*** (0.000) 0.007 (0.063) -0.001 (0.001) -0.000 (0.000) X X X X X X X X X X X X Observations 1,244 1,297 Pseudo-R2 0.161 0.135 Lof pseudo-likelihood -311.8 -348.0 Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1 42 2,122 0.106 -633.6 Table 21: Robustness check 2: Different Conflict Intensities VARIABLES Distance to market Conflict Conflict*Distance to market Conflict intensity Dependent variable: Vendor (2) (3) (4) (1) 0.041 (0.033) 1.011*** (0.313) -0.100** (0.043) - (5) 0.019 (0.028) - 0.015 (0.027) - 0.015 (0.027) - 0.014 (0.026) - - - - - - - - - - - - - - - Conflict intensity*Distance to market - Conflict>10 - 0.033*** (0.006) -0.003** (0.001) - Conflict>10*Dist. market - - Conflict>20 - - 1.171*** (0.369) -0.095* (0.049) - Conflict>20*Dist. market - - - Conflict>50 - - - 2.114*** (0.441) -0.141** (0.064) - Conflict>50*Dist. market - - - - Individual controls Household controls Village and district controls Geographical controls X X X X X X X X X X X X X X X X X X X X 0.127 -619.3 0.130 -616.6 Pseudo-R2 0.115 0.131 0.115 Log pseudo-likelihood -627.3 -616.0 -627.7 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 43 2.437*** (0.391) -0.193** (0.083) Table 22: Robustness check 3: Other Non-Agricultural Self-Employment Dependent variable: VARIABLES Conflict Distance to market Conflict*Distance to market Individual controls Household controls Village and district controls Geographical controls Pseudo-R2 Log Pseudo-likelihood Vendor (1) Other non-agricultural self-employment (2) All non-agricultural self-employment (3) 1.011*** (0.313) 0.041 (0.033) -0.100** (0.043) -0.193 (0.377) 0.010 (0.030) 0.033 (0.035) 0.513* (0.289) 0.030 (0.024) -0.036 (0.030) X X X X X X X X X X X X 0.115 -627.3 0.073 -571.2 0.090 -936.3 Number of observations: 2,122. Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1. 44 Table 23: Robustness check 4: Outliers and market villages Sample: VARIABLES Conflict Distance to market Conflict*Distance to market Individual controls Household controls Village and district controls Geographical Usual sample (1) Including outliers (2) Excluding market villages (3) Excluding villages less than 2 km from market (4) 1.011*** (0.313) 0.041 (0.033) -0.100** (0.043) 0.767*** (0.283) -0.008 (0.010) -0.055 (0.035) 1.146*** (0.354) 0.039 (0.035) -0.113** (0.047) 1.005** (0.471) 0.050 (0.038) -0.102** (0.050) X X X X X X X X X X X X X X X X Observations 2,122 2,212 1,951 Pseudo-R2 0.115 0.111 0.110 Log pseudo-likelihood -627.3 -645.0 -570.2 Robust standard errors in parentheses, clustered at village-level. *** p<0.01, ** p<0.05, * p<0.1 45 1,551 0.109 -428.9
© Copyright 2026 Paperzz