Selection and Comparative Advantage in Technology Adoption Tavneet Suri1 Abstract I investigate an empirical puzzle in technology adoption: despite the potential of technologies like hybrid maize to increase returns dramatically, a signi…cant fraction of households do not use them. As an explanation, I examine the role of comparative advantage and heterogeneity in returns to the technology in a generalized Roy model, where high average returns are accompanied by low returns for marginal farmers. I derive a novel method that estimates both the importance of comparative advantage and the associated distribution of returns to the technology. Looking at this distribution, I …nd that farmers with the highest estimated (gross) returns do not use hybrid, but this is explained by high unobservable costs (from poor supply/infrastructure). Another group of farmers has lower returns and adopts, while the marginal farmers have zero returns and switch in and out of use. Overall, adoption decisions appear to be rational and well explained by (observed and unobserved) variation in heterogeneous net bene…ts to the technology. Keywords: Technology, Heterogeneity, Comparative Advantage 1 Sloan School of Management, Massachusetts Institute of Technology. Email: [email protected]. I am extremely grateful to Michael Boozer for hours of discussion on this project. I would also like to thank Gustav Ranis, Paul Schultz and Christopher Udry, as well as Ken Chay, Robert Evenson, Penny Goldberg, Koichi Hamada, Thom Jayne, Fabian Lange, Ashley Lester, Anandi Mani, Sharun Mukand, Ben Polak, Steve Pischke, Roberto Rigobon, James Scott, T.N. Srinivasan, and seminar audiences at Berkeley, Haas School of Business, Harvard, Hunter College, London School of Economics, MIT, NBER Productivity Lunch, NEUDC, Oxford, Princeton, Sloan School of Management, Stanford, Stanford Graduate School of Business, University of California at San Diego, University of Virginia and Yale for all their comments. The data come from the Tegemeo Agricultural Monitoring and Policy Analysis (TAMPA) Project, between Tegemeo Institute at Egerton University, Kenya and Michigan State University, funded by USAID. A special thanks to Thom Jayne and Margaret Beaver at Michigan State University for help with the 1997-2002 rounds of the data. I am sincerely grateful to the Tegemeo Institute and Thom Jayne for including me in the 2004 survey and especially to Tegemeo for their hospitality during the …eld work in Kenya. I would like to thank James Nyoro, Director of Tegemeo, as well as Tegemeo Research Fellows Miltone Ayieko, Joshua Ariga, Paul Gamba and Milu Muyanga, Senior Research Assistant Frances Karin, and Research Assistants Bridget Ochieng, Mary Bundi, Raphael Gitau, Sam Mburu, Mercy Mutua and Daniel Kariuki and all the …eld teams for their impressive work in all the rounds of the survey. An additional thanks to Margaret Beaver and Daniel Kariuki for spending immense energy ensuring the data was squeaky clean. I would like to acknowledge the …nancial support of the Yale Center for International and Area Studies Dissertation Fellowship, the Lindsay Fellowship and the Agrarian Studies Program at Yale. The rainfall data come from the Climate Prediction Center, part of the USAID/FEWS project - thanks to Tim Love for his help with these data. 1 Introduction Food security is currently at the forefront of the policy agenda across many Sub-Saharan African economies. Agricultural yields (per acre) have actually fallen in the last decade across many of these economies, despite the widespread availability of technologies that increase yields. Governments and policy makers need to understand why yields of agricultural staples are low across parts of Africa in order to …nd ways to enhance food production and incomes. Table 1 shows the falling yields of staple crops over the last decade in Kenya, compared with increasing yields in India and Mexico. These are worrisome trends: Kenya has not been able to take the same advantage of improvements in agricultural technologies as have India and Mexico during their Green Revolutions. In this paper, I examine one aspect of this policy issue that poses an empirical puzzle. Technologies that have extremely high average returns are available, yet households do not use them. For example, …eld trials at experiment stations across Kenya show that hybrid maize and fertilizers can increase yields of maize signi…cantly, increases ranging from 40% to 100% (see Gerhart (1975), Kenya Agricultural Research Institute, KARI (1993), Karanja (1996)). The Fertilizer Use Recommendations Project (FURP) conducted in Kenya in the early 1990’s also shows high returns to these technologies (see Hassan et al (1998)). These results are con…rmed for fertilizer in the small sample of Du‡o, Kremer and Robinson (2006). However, despite such high average returns, aggregate adoption rates of hybrid maize and fertilizer remain far below 100% across Kenya, with no increases in adoption over a long period. This poses the question I am interested in: why are aggregate adoption rates low and, more so, stagnating when returns to the technologies are high? Why are some farmers not adopting, even though others in the same community adopt and seem to receive signi…cant yield increases? Understanding how and why households decide what technologies to adopt is crucial for understanding the trends in agricultural yields in Sub-Saharan Africa, where risk and incomplete factor markets are potentially important. The literature has put forth a number of explanations to answer these questions, ranging from models of slow learning and informational barriers, credit constraints, taste preferences, di¤erences in agroecological conditions and local costs and bene…ts, to the lack of e¤ective commitment devices when farmers are time inconsistent (see Du‡o, Kremer and Robinson (2006)). My approach to these questions is in stark contrast to the standard literature. I examine whether the yield returns to adopting hybrid maize vary across a random sample of maize farmers, such that high average returns to the technologies are accompanied by low returns for the marginal farmer. I ask a simple question: is it the case that the returns to these technologies vary across space such that the farmers who do not use them are simply those that do not bene…t. The dominant approach to technology adoption is to nest it in a learning environment. I deliberately take the opposite approach and assume the net bene…ts 1 are known to each household ex ante, but these bene…ts vary across space. I am therefore interested in the spatial distribution of returns and how these returns correlate with both adoption decisions as well as other observables. Allowing for such heterogeneity in returns to play a role in the adoption decision implies that the knowledge of average returns is not enough to understand adoption decisions. I use the most general model of technology adoption that amounts to a generalized Roy model where the expected bene…ts to the technology (that are potentially heterogeneous) drive adoption decisions in each period. In modeling adoption this way, I allow for two forms of household speci…c heterogeneity: absolute advantage (independent of the technology used) and what I call comparative advantage. The main novel conceptual and econometric contribution of this paper is to analyze the role of comparative advantage in adoption decisions and to empirically estimate its importance. This also enables me to estimate the distribution of returns to the technology for my sample of farmers. I use an extensive panel dataset representative of maize-growing Kenya, covering the period from 1996 to 2004. I …rst document the adoption patterns of households over the period 1996 to 2004, for both hybrid maize and fertilizer. Aggregate adoption is stable over the period, even when I look across regions or asset/wealth quintiles. Strikingly, I observe adoption rates varying dramatically across regions and asset quintiles, but remaining stable over time. Even more surprisingly, at least 30% of my households switch into and out of the use of hybrid seed from period to period. These trends are seen in other parts of Africa Dercon and Christiaensen (2005) …nd identical patterns for fertilizer use in Ethiopia. The hypotheses that could generate such patterns in the data are rather limited. For example, a model where households learn about a technology with positive returns would imply that aggregate adoption rates increase over time. Given these trends, I specify a generalized Roy model of adoption decisions that leads me to empirically estimate a model of heterogeneous returns. In estimating this, I make an econometric contribution. I generalize Chamberlain’s (1982, 1984) approach to the …xed e¤ects model to accommodate not just heterogeneous intercepts across households (i.e. …xed e¤ects), but also heterogeneous slopes (here, household speci…c returns to hybrid). Most important of all, this generalization allows me to estimate the distribution of returns to the technology for my sample of farmers. I compare these results to a variety of baseline models (such as those with homogeneous returns and household …xed e¤ects), which I can reject. I …nd strong evidence of heterogeneity in returns to hybrid maize. The estimated returns control for input use, but do not account for unobservable costs and are therefore gross, rather than net, returns. I show that the joint distribution of returns and adoption decisions displays some extremely interesting features. In particular, there are three interesting groups of farmers in my sample. I observe a small group of farmers with extremely high counterfactual (gross) returns to hybrid seed, yet they choose not to adopt. This is rather striking and, at 2 …rst glance, seems to deepen the initial puzzle, at least for this group of farmers. However, the lack of adoption for these farmers appears to stem from supply and infrastructure constraints, such as distance to seed and fertilizer distributors. These farmers that have the highest estimated (counterfactual) returns and do no adopt have higher unobservables costs to using hybrid maize as they have poor access to input suppliers and distributors. Their overall net returns may indeed be rather low. On the ‡ip side, I …nd a comparatively large group of farmers for whom the (gross) returns are comparatively lower (though still reasonably high) and they adopt hybrid in every period. Finally, a third group of farmers has essentially zero returns and these farmers switch in and out of use of hybrid. These are the marginal farmers who switch easily in and out of adoption when subject to year to year ‡uctuations. The heterogeneity in returns to technologies has important implications for policy. For one, encouraging 100% adoption of a technology that has a large average return is seen to be a bad idea. In addition, knowing the distribution of returns to a technology allows for focused policy interventions which can be cost e¤ective. For example, for farmers who would have high returns but are constrained on the supply side, alleviating their constraints by targeted distribution of inputs and infrastructure improvements would improve yields dramatically. Similarly, the unconstrained farmers would bene…t greatly from the development of new hybrid strains. This research illustrates the importance of the distributional consequences of policy actions in developing countries and the importance of accounting for these. It also sheds light on the ‘scaling up’ of policy (see Attanasio et al (2003)) in contexts of policy evaluation2 . The rest of this paper is structured as follows. In Section 2, I outline the relevant literature, focusing on empirical studies. Section 3 describes the data and the background on maize cultivation in Kenya. Section 4 discusses a model of adoption decisions, focusing on the economics motivating my model of heterogeneous returns. In Section 5, I discuss baseline results and provide some preliminary evidence for selection and heterogeneity in returns. Section 6 describes estimation of the heterogeneous returns model and the associated distribution of returns. In Section 7, I cover robustness checks, in particular the Heckman two-step estimator and the treatment e¤ects under non-random assignment, i.e. the average treatment e¤ect (ATE), the treatment on the treated (TT), the marginal treatment e¤ect (MTE) and the local average treatment e¤ect (LATE) or IV estimator. Section 8 discusses the implications of my results for households and policy makers in Kenya. Here, I also discuss alternative models, such as learning, credit constraints and taste preferences. Section 9 concludes. 2 The questions posed here cannot be answered with experimental data without making speci…c assumptions. An experiment would randomize the use of a technology across farmers. I am interested in the choice (or selection) process of how farmers decide what technologies to use, which is actually randomized out in an experiment. In addition, with experimental data, one can test for the presence of heterogeneous returns, but estimating the distribution of returns to the technology requires assumptions about the underlying selection process (see Heckman, Smith and Clements (1997)). 3 2 Literature Review I start with a brief summary of some of the empirical literature on technology adoption3 in developing countries, with a focus on the studies that have been conducted in Sub-Saharan Africa. I begin with Gerhart (1975) who tracks the adoption of hybrid maize in Western Kenya in the early 1970’s. He highlights the fast di¤usion of hybrid maize upto the 1970’s and identi…es the following constraints to the use of hybrid: risk (measured by the diversi…cation in crops and o¤ farm income) education of the farmers, credit availability, extension services and the use of other improved farming techniques (the use of fertilizer and manure). I split the rest of this review into four main areas, …rst discussing studies that relate to heterogeneity, then looking at those that focus on learning externalities, then credit constraints and ending with the more recent experimental research on Kenya, which is the most relevant. The seminal empirical paper on technology adoption is Griliches (1957)4 who looks at heterogeneity across local conditions in the adoption speeds of hybrid corn in Midwestern US, emphasizing the role of economic factors like expected pro…ts and scale. He notes how the speed of adoption across geographical space depends on the suppliers of the technology and when the seed was adapted to local conditions. Other researchers have focused on describing what forms of heterogeneity drive the decisions of households to adopt new technologies. This covers quite a range of papers: from Schultz (1963) and Weir and Knight (2000), who emphasize the role of education, to various CIMMYT (The International Wheat and Maize Improvement Center) studies5 that collect data on what underlies adoption decisions across parts of Kenya. For example, Makokha et al (2001) look at the determinants of fertilizer and manure use in Kiambu district. The main (self-reported) constraints to use were high labor costs, high prices of inputs, unavailability and untimely delivery. Ouma et al (2002) …nd that, in Embu district, gender, agroclimatic zone, manure use, hiring of labor and extension services are signi…cant determinants of the adoption of improved seed and fertilizer. Wekesa et al (2003) look at the adoption of the Coast Composite, Pwani 1 and Pwani 4 hybrids and fertilizer in the Coastal Lowlands where the non-availability and high cost of seed, unfavorable climatic conditions, perceptions of su¢ cient soil fertility, and lack of money are cited as reasons for low use. Much of the literature on technology adoption has focused on the learning externality, described by Besley and Case (1993). Foster and Rosenzweig (1995) look at the adoption of high yielding varieties in post Green Revolution India, allowing for learning by doing, learn3 The literature on technology adoption is too vast to review here, excellent reviews can be found in Sunding and Zilberman (2001), Sanders, Shapiro and Ramaswamy (1996), Rogers (1995), Feder, Just and Zilberman (1985) and Hall (2004). For the theoretical literature, see Besley and Case (1993), Banerjee (1992) and Just and Zilberman (1983). And, for studies of land management practices, see Mugo et al (2000), for agricultural extension see Evenson and Mwabu (1998) and for property rights see Place and Swallow (2000). 4 Also see David (2003). 5 See Doss (2003) and De Groote et al (2002) for a review of all the CIMMYT micro surveys in Kenya. 4 ing from others, and costly experimentation. They …nd that farmers with more experienced neighbors are more pro…table than those without. Munshi (2003) looks at the same question and …nds that the impacts are heterogeneous: wheat growers respond strongly to their neighbors’ experiences while rice farmers experiment. He notes that rice high yielding varieties, unlike those for wheat, are sensitive to hard to observe soil characteristics and managerial inputs. Conley and Udry (2003) study the adoption of fertilizer in the small-scale pineapple industry in Ghana. They collect information on farmers’sources of information and …nd evidence of learning, not only from own experiences, but also within information neighborhoods. Bandiera and Rasul (2003) look at decisions to plant sun‡ower in the Zambezia province of Mozambique. They …nd that adoption decisions are correlated within networks of family and friends and that this e¤ect is stronger for disadvantaged farmers. Moser and Barrett (2003) analyze decisions to adopt, expand and disadopt a rice production method in Madagascar and …nd seasonal liquidity constraints and learning e¤ects from extension agents and other households to be important. A hypothesis that is often raised in the literature is that credit constraints explain the low rates of adoption. For example, Croppenstedt, Demeke and Meschi (2003) use self-reported information on why farmers did not purchase fertilizer to estimate a double hurdle fertilizer adoption model for Ethiopia. They …nd that credit is a major supply side constraint. Some of the CIMMYT studies also cover self-reported credit constraints; for example, Salasya et al (1998) look at the role of credit in adoption decisions in Western Kenya. More recently, Dercon and Christiaensen (2005) show that the possibility of a poor harvest (and hence very low consumption for that period) can account for the low use of fertilizer in Ethiopia. The …nal strand of literature on Kenya I describe is experimental. There are several impact assessment studies6 and …eld trials at experiment stations which show large increases in yields from using hybrid maize and fertilizer. Other than the work done by the Kenya Agricultural Research Institute (KARI), one of the early experimental studies was the Fertilizer Use Recommendations Project (FURP), conducted across 70 sites in the early 1990’s in conjunction with the Kenya Maize Database Project (MDBP). FURP recorded yields about half of those on experimental stations (KARI (1993)). The focus was to understand optimal rates of fertilizer use in comparison to survey data on actual use from the MDBP (see Corbett (1998)). Hassan et al (1998) use these data and …nd that adoption of hybrid seed and varietal turnover rates are higher (and di¤usion faster) in high potential areas. They blame poor extension services, bad infrastructure and poor seed distribution for low adoption in marginal areas. Hassan, Murithi and Kamau (1998) …nd that farmers in KMDP apply less 6 Studies of the welfare impacts of improved technology use include Karanja, Renkow and Crawford (2003) who …nd that adoption of technologies in high potential areas (relative to those in marginal areas) have large aggregate gains at the expense of poor distributional e¤ects, and Sserunkuuma (2002) who estimates large consumer gains and large producer losses of a shift in supply due to the adoption of hybrid maize in Uganda. 5 fertilizer than optimal, leading to an estimated 30% gap between current and optimal yields. There has been some more recent experimental work. For example, De Groote et al (2003) look at an ex-ante impact assessment of the Insect Resistant Maize for Africa (IRMA) project that develops GM maize varieties that are more resistant to stem borers. They estimate the surplus from a shift in supply due to the decreased crop loss (measured experimentally). Estimated gross crop losses amounted to 13.5% with an estimated value of $80 million7 . Closest to this paper and most recently is the work of Du‡o, Kremer and Robinson (2006) who run experiments to understand the low use of fertilizer in Western Kenya. They measure the returns from fertilizer use experimentally and …nd that the average net rate of return for investing in top-dressing fertilizer is between 100% and 162% and that the combined package of fertilizer and hybrid in this area is not pro…table. In addition, they analyze a variety of learning mechanisms, such as learning by doing (via starter kits) and learning from others (from agricultural contacts as well as neighbors). They …nd small learning e¤ects, if at all and some of these e¤ects die out over time rather quickly. The most signi…cant contribution they make is to understand whether farmers may be time inconsistent in this environment. They implement the Savings and Fertilizer Initiative (SAFI) as a commitment device for farmers by o¤ering them fertilizer at harvest time (as opposed to later during planting when it is used). They …nd that more farmers take up this program when it is o¤ered at harvest time vs. later. They show that o¤ering SAFI at harvest time increases adoption by 17% and that this e¤ect is about equivalent to a 50% subsidy to the fertilizer price. Overall, the literature on adoption has had little to say about the consistently low rates of adoption seen across a number of African countries. Other than correlations with observables, there is still a rather large part of adoption behavior that remains unexplained. 3 Survey Data and Maize Cultivation in Kenya I now go on to describe some of the relevant institutional detail and the data that help me motivate the model and approach I use. Maize is the main staple in Kenya8 , with about 90% of the country’s population depending on maize for income generation (see Nyameino, Kagira and Njukia (2003)). It accounts for approximately 3.7 million acres of cropped land. Maize policy in Kenya has an interesting history, Smale and Jayne (2003) provide an excellent review. Both hybrid seed and fertilizer have been available since the 1960’s with more than twenty modern varieties of seed released since 1955 (Ouma et al. (2002)). Hybrid maize holds its yield properties for one season so has to be repurchased every period. The commonly held 7 See http://www.cimmyt.org/ABC/InvestIn-InsectResist/htm/InvestIn-InsectResist.htm and Smale and De Groote (2003) for more information on the CIMMYT IRMA and GM projects in Kenya. 8 McCann (2005) describes the fascinating history of maize in Africa pointing out how maize not only gives more food per unit of land and labor, but also has the largest set of alternative uses compared to other grains. 6 wisdom is that later releases in hybrid technology in Kenya have not shown big improvements in yields9 though government recommendations for the types and quantities of hybrid seed and fertilizer to be applied vary by region10 . The agriculture is rain fed, with large variations in rainfall that make input use risky (see Hassan, Onyango and Rutto (1998)). From 1965 to 1980, hybrid variety 611 di¤used in Western Kenya “at rates as fast as or faster than among farmers in the US corn belt during the 1930’s-1940’s” (Gerhart (1975)). Smale and Jayne (2003) and Karanja (1996) attribute this to good germplasm, e¤ective research, good distribution and coordinated marketing of inputs and outputs. This changed in the 1990’s as the earlier policies (large subsidies, price supports, pan-territorial seed/output pricing, restrictions on cross district trade) resulted in large …scal de…cits11 . Reform of the cereal sector began in 1988, followed by liberalization in 199412 , though the government retained some control policies. The most important of these was pan territorial seed pricing, where seed was predominantly distributed by Kenya Seed Company at a …xed price across the country each year. This pan territorial pricing created poor incentives for suppliers to locate in distant areas or far from markets. The lack of a land market also deserves a mention - there are very few land sales, though there is a small land rental market. I use data from the Tegemeo Agricultural Monitoring and Policy Analysis (TAMPA) Project between Tegemeo Institute at Egerton University, Kenya and Michigan State University. This is a household level panel survey, representative of rural maize-growing areas. Figure 1 shows a map of the survey villages (marked by crosses) and the population density across the country (darker shades represent greater density). I have data for 1997, 1998, 2000, 2002 and 2004. The 1997, 2000 and 2004 surveys are similar, containing detailed agricultural input and output data (plus retrospective data for 1995-1996), consumption, income, demographics, infrastructure, location, and some basic credit information. The panel sample covers around 1300 households, with an additional 800 in the 2004 sample. The 1998 survey is similar, but covers only a sub-sample of about 612 households. In addition, in 1998, Kenya was strongly a¤ected by El Nino so this sample is very di¤erent. The 2002 survey was a very short survey, collecting no output data but with detailed data on hybrid use. I restrict myself mostly to the 1997, 2000 and 2004 data. I was involved in designing and collecting the 2004 survey. 9 Karanja (1996) states that “newly released varieties in 1989 had smaller yield advantages over their predecessors than the previously released ones... research yields were exhibiting a “plateau e¤ect””. Examples he gives are KSII, which was followed in time by H611 (with a 40% yield advantage), then H622 (16%) and then H611C (12%). H626 which had a 1% yield advantage over H625 was released eight years later. 10 See Ouma et al (2002), Hassan (1998), Salasya et al (1998), Wekesa et al (2003), Kamau (2002) and Karanja (1996). 11 Kenya National Cereals and Produce Board, the marketing board supporting these policies, managed to accrue losses on the order of 5% of the country’s GDP in the 1980’s (Smale and Jayne (2003)). 12 For studies of the impacts of the cereal sector reforms, see Jayne et al (2001). Karanja, Jayne and Strasberg (1998), Jayne, Myers and Nyoro (2005), Nyoro, Kiiru and Jayne (1999) and Wanzala et al (2001). 7 To motivate my research questions, I outline some trends in my data over the period 1996-2004 on the use of hybrid maize and fertilizer. Figure 2 shows the trends in adoption of hybrid maize by province over this period, illustrating the stability in aggregate adoption over time and the persistence of cross-sectional di¤erences13 . If I look across wealth/asset or acreage quintiles, the pattern is identical. There is no systematic temporal variation in the adoption of hybrid, presenting a prima facie case against learning playing a role in adoption decisions. If there was learning, we would expect aggregate adoption to increase over time. The use of inorganic fertilizer during the main season follows similar trends. Figure 3a shows the trends across provinces in the fraction of households that use inorganic fertilizer for maize production and Figure 3b shows the total value (in constant Kenyan shillings) of inorganic fertilizer used, by province. There is a lot more variation over time in the total value of inorganic fertilizer used, but both …gures illustrate the familiar persistent cross-sectional di¤erences. Finally, for an idea of the more general trends in my data, Figures 4 and 5 show main season yields of maize and the acreage planted to maize respectively over the same period. Yields are not stable over time, with the sharp drop in yields around 1997/1998 the result of El Nino. However, there are no dynamics in any of the inputs or technologies. Table 2a shows summary statistics for my sample of households for 1997, 2000 and 2004. Of the 26 di¤erent types of fertilizer reported as being used, Table 2a shows the four most popular (di-ammonium phosphate (DAP), calcium ammonium nitrate (CAN), nitrogen phosphorus potassium (NPK) and mono-ammonium phosphate (MAP)). Note the absence of the use of MAP in 2000, illustrating the unavailability of this fertilizer during this year. Table 2b breaks out some of these variables by hybrid/non-hybrid use for 1997 and 2004. In principle, hybrid use could be a continuous variable as farmers plant quantities of hybrid seed. However, in my sample very few farmers plant both hybrid and non-hybrid in a given season. For example, in 1997 only 2% of households planted both14 . I take hybrid use throughout to be binary (it simpli…es the empirical problem dramatically). From Table 2b, we can see that yields are lower across the board in the non-hybrid sector (the p-values on the t-tests are 0.000 in each period). However, a lot of the other variables look quite similar in means (p-values on the t-tests are often above 5%), except for fertilizer use and main season rainfall. Finally, Table 2c shows the transitions of hybrid use in my data, illustrating the fact that 30% of my households switch in and out of use over the three years of data I use. Over more periods, this fraction is a bit higher: 33% if 1996 is included and about 39% if 2002 is included. In the data, the households that always plant hybrid have the highest average 13 The Coast province looks rather di¤erent in 2004. The empirical work on the comparative advantage model reports results for cases without the Coast province. If I include these households (only 43 in number), the results are qualitatively the same. 14 This fact alone already signals the possibility of credit constraints not being a driving force behind the hybrid/non-hybrid decision. If credit constraints were very important, households would prefer to mix hybrid and non-hybrid maize given there are no di¤erential …xed costs in planting either variety. 8 yields, followed by households that switch in and out of use. Those that never use hybrid have the lowest average yields. Given all these trends, I go on to look at a model of how farmers may be making adoption decisions, taking seriously the temporal stability in adoption. 4 Modeling Adoption Decisions Since the variation in adoption decisions is largely cross-sectional or spatial (and not over time), I consider a model of adoption decisions that allows for both absolute and comparative advantage in the selection process determining who uses hybrid technology. For simplicity, I refer to hybrid seed as the relevant technology, but the empirical model will allow for generalizations to this, especially as regards fertilizer. In keeping with the empirical realities, I consider two technology options available to farmers: to plant either hybrid seed or traditional (non-hybrid) seed. Say the production functions at any point in time for a farmer are Cobb-Douglas: YitH YitN = e = e H t N t 0 @ 0 @ k Y j=1 k Y j=1 H j 1 N j 1 H Xijt A euit N Xijt A euit (1) (2) where YitH and YitN are the yields (output per acre) at time t when farmer i uses hybrid or non-hybrid respectively. Throughout, H is used to represent the use of hybrid maize and N the use of non-hybrid. The Xijt ’s represent other inputs (where j indexes the input), such as fertilizer, labor, rainfall, etc. The production functions for hybrid and non-hybrid maize have di¤erent parameters on the inputs, i.e. the Xijt ’s have di¤erent coe¢ cients in the production functions (indicated by H N) and to allow for complementarity between the maize variety used and the inputs. However, the same set of potential inputs are used to N grow both hybrid and non-hybrid maize. Finally, uH it and uit are sector-speci…c errors that may be the composite of time-invariant household characteristics and time-varying shocks to N production. I will consider some speci…c decompositions of uH it and uit below. Taking logs of equations (1) and (2) above, H yit N yit = = H t N t 0 H + uH it (3) 0 N + uN it (4) + Xit + Xit H and y H are the logs of yields, the X ’s have been rede…ned to represent the logs where yit it it 9 of the inputs. The gain in yields from planting hybrid maize is therefore H Bit = yit N yit = H t N t 0 H + Xit ( N ) + uH it uN it (5) Using this framework, we can start to think about what drives adoption decisions. The simplest decision to adopt would be based on a comparison of the yields under hybrid and H > y N and h = 0 if y H non-hybrid maize, such that hit = 1 if yit it it it N , where h = 1 yit it represents the use of hybrid and hit = 0 the use of non-hybrid. This is the basic assumption of the Roy (1951) selection model15 in terms of yields. So, the strict Roy model implies that H yit N yit > 1 for hit = 1 and H yit N yit 1 for hit = 0 (6) So, for any two individuals i and j using hybrid and non-hybrid respectively, H yit > N yit hit =1 H yjt N yjt ! (7) hjt =0 Equation (7) describes sorting based on comparative advantage16 . The yield maximization rule given by the Roy model in equation (6) imposes sorting via comparative advantage. The strict Roy model thus implies that H hit (yit ) = 1 yit N yit h 0 =1 ( H t N t ) 0 + Xit H N + uH it uN it 0 i (8) where 1[:] represents an indicator function of whether the expression in brackets holds true. This selection model can be generalized whereby yield maximization is replaced by a more general selection rule. This allows for both observed and unobserved costs as well as other factors like tastes17 to drive adoption. In my case, two properties of hybrid maize are important: not only does it increase mean yields, but it also reduces the variance in yields (due to better resistance to pests and drought). These characteristics are illustrated in Figures 6a, 6b and 6c, the conditional distributions of yields across hybrid and non-hybrid farmers for each year. The mean of the hybrid yield distribution lies to the right of that of non-hybrid yields. The hybrid yield distribution also has a lower variance than the non-hybrid distribution (the p-values on the null of equal standard 15 The Roy model relies on comparisons of wages, but the framework it provides can be applied to productivities (suggested by Mandelbrot (1962)). Also see Heckman and Honore (1990) and Borjas (1987). 16 The de…nitions and implications of absolute and comparative advantage as sorting mechanisms have been studied well (see Willis (1986), Green (1991), Sattinger (1993) and Carneiro and Lee (2004)). 17 One hypothesis for the lack of adoption and the persistence of non-hybrid maize is di¤ering tastes of hybrid and non-hybrid maize, as in Latin America. This is unlikely in Kenya where the di¤erent varieties have a uniform price and no distinction in the market. Section 8 provides evidence on this. 10 deviations are 0.000 in each period)18 . These …gures highlight an important property of hybrid maize: it eliminates the lower tail of the distribution of outcomes under non-hybrid maize. This is the motivation for thinking of a model of heterogeneous returns. The initial evidence here indicates that the farmers who would bene…t most from hybrid would be those at the lowest end of the non-hybrid yield distribution - bad outcomes would be muted under hybrid. So, it is likely that there are di¤erential returns to hybrid across a sample of farmers. Going back to the Roy model framework, it allows for a selection rule that is far more H ; Z H ) > f N (y N ; Z N ) general than that in equation (8). So, for example, hit = 1 if f H (yit it it it where Zitj (j 2 H; N ) represents costs and any other factors that a¤ect adoption. The simplest version of this bene…t function would of course be a pro…t maximization rule. When output prices are the same for hybrid and non-hybrid maize, yields and pro…ts di¤er only by (real) input costs. So, throughout, I estimate yield functions, controlling for complete input use19 . The advantage of estimating yield functions is that it allows me to estimate the e¤ects of other inputs on yields. Furthermore, it allows me to generalize the model of heterogeneous returns to account for joint decisions over hybrid and fertilizer use, which is important. Examples of other variables that may be in the set of Zit ’s above that are not in the yield equation and that are not pure tastes include measures of the availability of seed, such as the distance from a household to the seed/fertilizer distributor. Such measures of input availability clearly a¤ect the costs of planting hybrid maize (over and above the observed costs in the Xit ’s) and can therefore a¤ect the adoption decision from period to period. In the case of this generalized Roy model, any pattern of sorting is possible and the joint distribution f (y H ; y N ) is required to understand the patterns of sorting and comparative advantage (the standard program evaluation problem)20 . I can write the selection rule for a generalized Roy model for yields in terms of a binary choice based on the latent variable, hit , as hit (yit ) = 1 [hit h 0 0] = 1 Zit + usit i 0 (9) where hit is the binary adoption decision with respect to hybrid maize and hit is the latent 18 The empirical framework I consider below allows some of these aspects of summary statistics to be due to Roy model selection. The contribution of the empirical work is to separate selection from causal e¤ects. 19 The yield equation that I specify is driven by a number of factors. With complete markets, I would estimate pro…t functions, but here markets are incomplete. I need shadow values on non-priced inputs, mostly family labor. The data does not allow this and the data on inputs is not always at the crop level. Calculating maize pro…ts and estimating a pro…t function directly is therefore not straight forward. Instead, I use speci…cations that estimate yield functions controlling for input use throughout, which is close to estimating pro…t functions. Maize prices are captured by province and year dummies included in the speci…cation of my yield function. The log yield regression I estimate thus captures the bene…ts of hybrid on percentage yields and revenues, holding constant my input measures. This is like the gross pro…t function of Olley and Pakes (1996). 20 The generalized Roy model still imposes the restriction that the selection rule can be expressed as a single index function/supply side decision. I consider the appropriateness of this restriction later. 11 net bene…t of planting hybrid maize. usit is the selection error (the unobservable heterogeneity in the adoption selection equation). The higher usit in equation (9), the more (unobservably) likely the farmer is to plant hybrid maize21 . Using a generalized yield function of the form H yit = hit yit + (1 N hit )yit (10) I can substitute in equations (3) and (4) to get yit = N t +( H t N t )hit 0 + Xit N 0 + Xit ( H N H )hit + uN it + (uit uN it )hit (11) Equation (11) illustrates the standard selection problem. If farmers know all or some part N of the errors uH it and uit and act on what they know, then the decision to adopt hybrid, hit , H will generally be correlated with these errors (and hence with the composite error uN it + (uit uN it )hit in equation (11)). The OLS estimate of H t N t (and of the average treatment e¤ect) is therefore biased, with the bias composed of both a selection bias and a sorting bias22 . For example, it may be the case that the farmers who plant hybrid may just have higher soil quality and/or better farm management (all of which are unobservable). Heterogeneity of this …xed unobservables form has been emphasized in this literature, in particular with respect to farm management (see Mundlak (1961)), productivity of farmers, and soil quality (see Conley and Udry (2003) and Tittonell (2003)). In addition, there may be heterogeneity in returns to hybrid maize such that the farmers with the high returns to hybrid maize are the ones who plant it. The average returns to hybrid maize computed by comparing farmers who plant hybrid to those who do not are therefore overstated. My aim is to present a model that can account for both these ideas, which I formalize below. To solve the selection problem, I assume that what the farmers act on with respect to the errors in equation (11) is time invariant (though technology speci…c). This implies the N following decomposition or factor structure23 on the errors, uH it and uit : uH it = H i + H it (12) uN it = N i + N it (13) 21 Compare this to the case of the strict Roy model (without costs) where the adoption decision is based H N on di¤erences in yields so that hit is just yit yit and usit is uH uN it it as in equation (8). 22 Heckman and Li (2003) show that for a single cross section, the OLS estimate of the mean return to hybrid in a model with heterogeneity is biased for the ATE and that plim(b 23 OLS ) = = T T + Selection Bias = AT E + Sorting Bias + Selection Bias h i h i N (X) + E((uH uN E(uN i i )jhi = 1) + E(ui jhi = 1) i jhi = 0) This assumption is similar to the factor assumptions made in Carneiro, Hansen and Heckman (2003). 12 H i Farmers are assumed to know their N i , and but not the H it above highlights the key identi…cation assumption I make: the and H it N it . and The decomposition N it are assumed to be uncorrelated with each other, as well as with the Xit ’s and Zit ’s, unlike the H it and and N i This amounts to the transitory errors H i decision to use hybrid, though the N it H i N i . and not being allowed to a¤ect the farmer’s will. The implication is that the H it and N it components are known to the farmer only after the farmer has made his planting decision. The timing of the production of maize is crucial as well as the variables I observe in my dataset. This identi…cation assumption amounts to shocks to yields (production) being realized after the technology decision is made. Most farmers plant right before the onset of rains, with a reasonably long lag between planting and harvest. The seed type (hybrid or nonj it hybrid) is therefore …xed before the (j 2 H; N ) unobservable shocks to yields are realized (rainfall is in my data and so is not a component of these shocks). Even though the extent of the rainfall is unknown at the time of planting, the fact that the maize is rain fed and that rainfall I observe is important. Since inputs such as labor and fertilizer may be correlated with the rainfall shock, it is of key importance that I observe rainfall. Therefore, conditional on the covariates a¤ecting yields I have in my data and the timing of the production process, the assumptions underlying the H it and N it seem to be reasonable. An additional implication of this identi…cation assumption deals with what drives households to switch between the two technologies. This model assumes that households switch due to factors that do not a¤ect yields directly. Here, it would be factors such as supply constraints24 . If hybrid seed and/or fertilizer is not available easily right at the time of planting, this would manifest itself in a supply constraint or a higher cost of using the technology for the farmer who would then switch to the traditional technology. There is a lot of qualitative evidence for my sample of farmers that this happens reasonably often. Farmers get their seed and fertilizer (this is mostly a joint decision) from small local distributors, who get their supplies from slightly larger distributors, and so on. In the qualitative component of the survey, farmers who use traditional varieties were asked why they do not use improved seed varieties and 14% blamed unavailability of seed as one of top two reasons for this (65% put high costs as one of the top two reasons). In addition, Table 2a showed that in 2000 one of the common fertilizers (MAP) was not used at all due to an unavailability. And, …nally, several of the CIMMYT and FURP studies mentioned earlier cited unavailability of inputs. Substituting the factor structure described above into the yield equations in (3) and (4), H yit N yit = H t = N t + H i + N i + Xit H + Xit N + H it (14) + N it (15) 24 The framework in Dercon and Christiaensen (2005), for example, that shows adoption to be related to households’ability to smooth downside consumption risks would support the assumptions here. 13 Note that these equations cannot be estimated by a household …xed e¤ects model. The H i …xed e¤ects model is not general enough, the and the N i cannot be di¤erenced away as a signi…cant number of households switch between technologies from one period to the next. Following Heckman and Honore (1991), Lemieux (1998) and others, I use the linear H i projections of the N i and the H i on ( H i N i N i ) as follows = bH ( H i N i ) + i (16) = bN ( H i N i ) + i (17) 2 H H N ; i i ), 2 2 2 HN ), bN = ( HN HN )=( H + N H 2 25 V ar( i ), and 2N V ar( N i ) . H where the projection coe¢ cients are bH = ( 2 )=( 2 N H 2 N + I de…ne the 2 i HN ) and cov( HN to be farmer i’s absolute advantage as its e¤ect on yields does not vary H i by the choice of hybrid/non-hybrid. More importantly, I de…ne N i to be the gain from growing hybrid, which is orthogonal to the absolute advantage by construction. I then re-de…ne this gain to be my comparative advantage parameter, bN ( i H i i, as follows: N i ) (18) This is just a rede…nition of the sector speci…c unobservables H i and N i . The new i serves as a measure of comparative advantage in my model. It measures how much more productive a farmer is in the production of hybrid over non-hybrid maize, hence intuitively it is a measure of comparative advantage. It is only possible to identify a relative relationship between the H i and N i unobservables (also see Carneiro, Hansen and Heckman (2003)). Allowing bH bN 1, equations (16) and (17) become H i = ( + 1) N i = i + My de…nition of the comparative advantage gain i + (19) i (20) i i is di¤erent from the concepts of sorting based on comparative advantage described earlier. If individuals in the economy now sort on the basis of comparative advantage, then it actually implies that the coe¢ cient is negative. Substituting these back into equations (14) and (15), 25 that The H i i ’s N i H yit = H t + i + ( + 1) N yit = N t + i + i + Xit i + Xit N + N it H + H it (21) (22) in equations (16) and (17) are the same. Subtracting equation (17) from equation (16) implies N = (bH bN )( H bN must be equal to 1, i i ). For the i ’s to be equal across sectors, bH which is easily shown: bH bN = 2 H HN 2 + 2 H N 2 HN + N 2 HN = 1: 14 The expected gain for a farmer from using hybrid is now a function of both observed and unobserved household characteristics: H Bit = yit H yit =( H t N t ) + i H + Xit0 ( N ) (23) The logarithmic production functions for the case of heterogeneous returns in equations (21) and (22) map back into a generalized version!of the Cobb-Douglas production functions I k H Q +1 H H started with so that YitH = e i e i e t Xijtj e it and similarly for YiN . The motivation j=1 for this structure of production functions comes from the fact that we would expect individuals who are in the lower tail of the non-hybrid distribution to bene…t the most from the use of hybrid maize. The production functions illustrate these properties directly: when farmers with low i will have high rewards from hybrid maize (and low rewards if while farmers with high i will have lower rewards from hybrid maize (vice versa if < 0, > 0), > 0). I substitute equations (19) through (22) into the generalized yield equation (10) to get yit = where hit , i i, i + N t i + i +( H t N t )hit N + Xit0 + i hit + Xit0 ( H N )hit + "it (24) and "it is the unanticipated component of yields. Since, the coe¢ cient on now also depends on the unobserved household-speci…c e¤ect, i, this is a random coe¢ cient model. In fact, it is a correlated random coe¢ cient (CRC) model as the random coe¢ cients ( i ) can be correlated with the adoption decision (hit ). I am interested in two components of this model: the coe¢ cient that tells us how impor- tant di¤erences in comparative advantage are in determining yields (and adoption decisions) and, second, the distribution of comparative advantage, i, in my sample. I derive in the empirical work below how this CRC model is a generalization of Chamberlain’s (1984) correlated random e¤ects (CRE) model. Chamberlain allows for household-speci…c intercepts to be correlated with hit . The CRC model I develop below allows both the household speci…c intercepts as well as household speci…c slopes/returns to be correlated with hit . As I discuss in detail, this model can be estimated via methods similar to those introduced by Chamberlain (1984). In addition, this generalization allows me to estimate the distribution of comparative advantage and hence the distribution of returns to the technology. How are the estimates of and i interpreted? From equation (24), think of i as a household speci…c intercept, i.e. a household speci…c average yield. Similarly, think of as the household speci…c slope or return to hybrid. Since i and i construction, the covariance between household speci…c intercepts ( i ) and slopes ( cov( i ; i) 15 = 2 i are uncorrelated by i) is: (25) The structural coe¢ cient therefore determines whether high intercept households also have the higher returns to hybrid maize. For example, if > 0, then households with high average yields also have the higher returns to hybrid. If I compare this to an economy where households are randomly allocated across technologies, there is greater inequality in yields. Similarly, if < 0, then, compared to an economy with random allocation across technologies, there is less inequality in yields. In addition, < 0 implies that farmers are sorting into sectors on the basis of comparative advantage. Finally, note that equation (24) is a generalization of the household …xed e¤ects model. To illustrate this, let = 0 so that there is now the same …xed unobservable heterogeneity (for example, due to farm management) that a¤ects yields the same way irrespective of the type of maize seed planted. This imposes the following speci…c assumptions on the error structure: uH it = i + H it and uN it = i + N. it In addition, the di¤erence ( H it N) it must be independent of the selection rule. This implies that the timing of the decision to use hybrid maize at any point in time is independent of the yield di¤erence facing a farmer at that time. The yield function that corresponds to this is a …xed e¤ects version of equation (24): yit = where "it = N it +( N t H it + i +( N )h . it it H t N t )hit + Xit0 N H + Xit0 ( N )hit + "it (26) The expected gain to growing hybrid is now: Bit = ( H t N t ) + Xit0 ( H N ) (27) The expected gain from hybrid can therefore vary across households as long as it is a function of the observable Xit ’s. But, the unobservable heterogeneity, i, is restricted to a¤ect yields identically whether or not a farmer uses hybrid and does not appear in bene…t equation (27). The main contribution of this paper is therefore to use the CRC model to relax the assumption that the return/bene…t to hybrid varies along only observable dimensions across households. There have been a number recent studies on heterogeneity in returns. Advances have been made in the context of experimental data (for example, Heckman, Smith and Clements (1997)) and in cross sectional data where the covariate of interest is a stock variable like schooling (see Card (1998), Heckman and Vytlacil (1997), Heckman and Li (2003), etc.) or unionization (see Lemieux (1993, 1998)26 . To estimate the CRC model, I take a novel approach. I use the panel nature of my data to generalize the multivariate regression/minimum distance approach of Chamberlain (1984) to allow for heterogeneous returns as in equation (24). This generalization is extremely useful as not only does it let me estimate , but it in fact allows me to estimate the distribution of returns. 26 For more on the return to unionization, see Vella and Verbeek (1998), Green (1991) and Robinson (1989). 16 5 Baseline Results I begin the empirical work by looking at the OLS and household …xed e¤ects speci…cations as a baseline comparison to the heterogeneous returns model. Table 3 shows these results for speci…cations with and without covariates, i.e. this simpli…ed version of equation (26): yit = + hit + Xit0 + Comparing this to equation (26), allowing And, I have assumed that H = N H t N t i + uit t, (28) I have assumed that t = 8t. = . Later, I will relax the latter of these assumptions27 . The OLS estimates in the …rst column of Table 3 are extremely large and positive: households that plant hybrid maize tend to have much higher maize yields, on the order of 100% higher. In the third column, I add covariates to control for other household variables that could a¤ect yields and that may be correlated with the use of hybrid. They include land acreage, fertilizer use, land preparation costs, seed quantity, variables that measure labor inputs (both hired and family labor where possible), long term mean seasonal rainfall, current seasonal rainfall, year and province dummies. Adding these covariates decreases the OLS coe¢ cient further, though it is still large at 56%. The fourth and …fth columns report the household …xed e¤ects results. The coe¢ cient on hybrid decreases dramatically, though with covariates the di¤erence between the OLS and …xed e¤ects is less substantial. Even within households, there is still a substantial return to hybrid maize, about 15%. This household …xed e¤ects framework is restrictive in the assumptions it imposes on the adoption process and the comparison of the adopters and non-adopters. There is assumed to be no di¤erence between farmers who switch into the use of hybrid and those that switch out. Apart from the permanent component in the outcome equation, i, the adoption decision cannot depend on observed outcomes except under restrictive assumptions on the transitory component (see Ashenfelter and Card (1985)). Such assumptions can be motivated by myopia or ignorance of the potential gains from planting hybrid, but they are unrealistic here28 . Since the …xed e¤ects model may not be appropriate, the rest of this section describes some tests of this model, and some intuitive evidence for selection and heterogeneity in returns. 5.1 The Two Period CRE Model The household …xed e¤ects estimates described above are consistent only under the assumption of strict exogeneity of the errors. Chamberlain’s CRE approach provides a basis for 27 The former can be relaxed but relaxing it is empirically unimportant - the results from the more complicated model that allows for time-varying ’s are identical so I report results for the simpler model throughout. 28 Hybrid maize was introduced in the 1960’s, with widespread use of extension services to promote the technology. See Evenson and Mwabu (1998). Data in the survey instrument also supports this, as most of the farmers in my sample have used hybrid seed at some point in the past. 17 testing this assumption. I brie‡y outline the simple two period, no covariates CRE model where the data generating process is given by yit = + hit + + uit i (29) The assumption of strict exogeneity of the errors is: E(uit jhi1 ; :::; hiT ; i) =0 (30) The CRE approach illustrates how the …xed e¤ects model is overidenti…ed. Linearly project the …xed e¤ect, i, onto the history of the covariates (here just hybrid use): i = 0 + 1 hi1 + 2 hi2 + vi (31) where the projection error vi is uncorrelated with hi1 and hi2 by construction, and the ’s are the projection coe¢ cients. Substituting this linear projection into equation (29) gives yit = + hit + Let it 0 + 1 hi1 + 2 hi2 + vi + uit (32) = vi + uit where E [ it hi1 ] = E [ it hi2 ] = 0. For each time period, therefore: yi1 = ( + 0) +( + yi2 = ( + 0) + 1 )hi1 1 hi1 + +( + 2 hi2 + i1 (33) 2 )hi2 + i2 (34) These are the structural yield equations for each period. I estimate reduced form yield functions for each period of the form yi1 = 1 + 1 hi1 + 2 hi2 + i1 (35) yi2 = 2 + 3 hi1 + 4 hi2 + i2 (36) From the four reduced form coe¢ cients from equations (35) and (36), estimate the three structural parameters, 1; 2 1; 2; 3 and 4, I can and , using minimum distance. The intuition behind the identi…cation of the CRE model comes from the underlying assumption of strict exogeneity of the errors. If the …xed e¤ects model is valid, then the only way the history of hit (both past and future) a¤ects the current outcome is through the household level unobservable, i. Conditional on i and the included regressors, the changes in behavior are taken to be driven by transitory factors uncorrelated directly with yields. The CRE model is therefore overidenti…ed and testable with panel data. The minimum distance estimator of the structural parameters is also the minimum 18 2 estimator if the weight matrix used is the inverse of the variance covariance matrix of the reduced form coe¢ cients. This is called the optimal minimum distance (OMD) estimator. The 2 statistic is just the value of the minimand in the OMD problem. If the identity matrix is used as the weight matrix, the estimates are referred to as equally weighted minimum distance (EWMD) estimates. The OMD estimates are e¢ cient, but may be biased in small samples and can therefore be out-performed by EWMD (see Altonji and Segal (1996)). I also look at the diagonally weighted minimum distance (DWMD) estimates, where the weight matrix used is the same as the OMD case with the o¤-diagonal elements set to zero (see Pischke (1995)). Estimates of the CRE model for three periods, with and without covariates are shown in Table 4. In the CRE model, covariates can be treated as either exogenous or endogenous. Exogenous covariates enter the model in equation (29), but are assumed to be uncorrelated with the …xed e¤ects. Endogenous covariates are correlated with the …xed e¤ects and enter the projection in equation (31). The CRE model hence allows tests of whether covariates are endogenous. I report estimates where all covariates are assumed to be endogenous29 . Table 4 shows both the reduced form and structural estimates for various speci…cations. The reduced forms in the upper panel of Table 4 give nine reduced form parameters (not including the constants or covariates), from which I use minimum distance to estimate the four structural estimates, shown in the lower panel of the table. The CRE estimates of in all cases are very close to the household …xed e¤ects estimates in Table 3. The OMD, EWMD and DWMD estimates of are quite similar, all within sampling error of each other. The last column in the lower panel of Table 4 shows the 2 values on the overidenti…cation test. In all cases, I can reject the null that the minimum distance restrictions hold. This overidenti…cation test is an omnibus test and has low power against any speci…c alternative. It is therefore not surprising that I am able to reject the overidentifying restrictions. 5.2 Preliminary Evidence of Heterogeneity and Selection Previous research has developed tests for heterogeneity in returns for experimental data (see Heckman, Smith and Clements (1997)). Since my data is not experimental, I do not report results from these tests in detail. However, I am able to reject the null of no heterogeneity in returns30 . Such tests assume away selection, which is hardly tenable so, in Table 5, I look for evidence of selection. I split the adoption history of a farmer into a set of dummy variables describing the transition of this farmer across technologies. The idea is to look at whether households with di¤erent transition histories have di¤erent returns in terms of yields (see Card and Sullivan (1988)). I de…ne a “joiner” to be a farmer who does not plant hybrid the 29 If I treat hybrid as endogenous and all the other covariates as exogenous, the results are similar. These tests include looking at the Frechet-Hoe¤ding bounds and bounding the variance in the percentiles of the returns distribution and testing whether this bound is di¤erent from zero. These tests, …gures and tables are available from the author upon request and are reported in Suri (2006). 30 19 …rst period, but does the next, and a “leaver” to be a farmer who plants hybrid the …rst period, but not the next. Similarly, I de…ne “hybrid stayers” and “non-hybrid stayers”. I then look at each pair of periods in my data and compare yields for the hybrid and non-hybrid stayers, the joiners, and the leavers in each of the two periods separately to learn about the extent of selection31 . For example, the …rst two columns in Table 5 look at the transitions of households over 1997-2000. The …rst and second columns compare the yields in 1997 and 2000 for hybrid stayers, joiners and leavers (with the omitted group being the non-hybrid stayers) in 1997 and 2000 separately. If there was no selection at all (not even via a household …xed e¤ect), we would expect the coe¢ cient on the leavers in the yield equation for 1997 to be no di¤erent from the coe¢ cient on the hybrid stayers, and also no di¤erent from the coe¢ cient on the joiners in the yield equation for 2000. Similarly, the coe¢ cient on joiners in the yield equation for 1997 would be no di¤erent from zero, as should the coe¢ cient on the leavers in the yield equation for 2000. The coe¢ cients on the leavers and joiners across the two yield equations illustrate the extent of selection. Table 5 reports similar estimates for 2000-2004 and 1997-2004. The hybrid stayers uniformly get the largest yields and the leavers and joiners get very di¤erent changes in yields when they switch their adoption status. This table helps shed light on a pure liquidity constraints model. Consider an extreme version of this model where variation in adoption decisions is completely explained by liquidity constraints. In this case, if a non-hybrid farmer does better than average in terms of yields, it will lead him to use hybrid in the next period. But, in this period, he should then revert to having the average hybrid yield. So, we would see that non-hybrid households that have higher yields than average in the …rst period should have no di¤erent than average hybrid yields in the second period. This implies that for 1997-2000, the coe¢ cient on joiners would be greater than zero in 1997, and equal to the coe¢ cient on the hybrid stayers in 2000. Similarly, the leavers should do worse than the hybrid stayers in 1997 and then no di¤erent that the non-hybrid stayers in 2000. Throughout Table 5, these restrictions are rejected. This signals that liquidity constraints alone cannot explain the patterns in the data. In Table 6, I look for heterogeneity in returns to hybrid seed along observable dimensions. This relaxes the assumption that H = N = in the OLS and FE estimations reported above. Table 6 reports results for yield functions estimated separately for hybrid and nonhybrid households. I report both the OLS as well as household …xed e¤ects speci…cations. The returns to observables di¤er across the two sectors, especially in the cases of fertilizer and rainfall. This holds for the OLS as well as the household …xed e¤ects speci…cations. The last row of Table 6 then reports estimates of the return to hybrid (evaluated at the mean Xit ’s). There is still a signi…cant return to hybrid for both the OLS and …xed e¤ects speci…cations. 31 Throughout the sample period, hybrid stayers have the highest yields on average, followed by the leavers, then the joiners and …nally the non-hybrid stayers. 20 6 Estimating a Model with Heterogeneous Returns I now go on to describe how I estimate the CRC model. For simplicity, I leave out inputs other than hybrid and discuss extensions later. The model outlined earlier implies the following econometric model in maize yields for the simple two period, no covariates case32 : yit = + hit + i + i hit + (37) it To estimate this, I generalize Chamberlain’s approach. This is my preferred estimation method as it allows a 2 overidenti…cation test of the restrictions, it allows easy generalization of the basic model and, furthermore, it allows me to estimate a distribution of returns. The next sections are devoted to describing my estimation and identi…cation strategy. I then describe generalizations to this basic model and the estimation results. 6.1 Two Period CRC Model The key assumption here is that, conditional on i and i (and covariates in the more general speci…cations), the unanticipated component of yields, "it , is not correlated with the decision to adopt. Since i i + i, where yit = + i and i i are orthogonal, I re-write equation (37) as + hit + i hit + i + "it (38) I …rst derive how to estimate the parameters of this model, most importantly show how to use these estimates to derive a distribution for the predicted To estimate the parameters, I use a linear projection of the i ’s. The . Then, I i ’s. i ’s are projected onto not just the history of the hybrid decisions, but also their interactions so that, in the two period case, the projection error is orthogonal to hi1 and hi2 individually as well as to their product, hi1 hi2 by construction33 . The linear projection I use is therefore i = In addition, I normalize the 0 + i ’s 1 hi1 + so that 2 hi2 P i + 3 hi1 hi2 + vi (39) = 034 . The normalization implies that 0 32 This empirical model is similar to models of individual speci…c heterogeneity in Heckman and Vytlacil (1997), Card (2000, 1998), Deschênes (2001), Carneiro, Hansen and Heckman (2001), Carneiro and Heckman (2002), and Wooldridge (1997). Lemieux (1998) has the same model to look at whether the return to union membership varies along observable and unobservable dimensions - he estimates it using non-linear 2SLS. 33 Say I use the simple CRE projection, i = 1 hi1 + 2 hi2 + vi , and substitute this into the yield function: yit = 0 + 1 hi1 + 2 hi2 + hit + 0 + 1 hi1 hit + 2 hi2 hit + vi + vi hit + uit Even though vi is linearly uncorrelated with hi1 and hi2 individually, it is generally correlated with their product, hi1 hi2 , so that E [vi hi1 hi2 ] 6= 0. The projection for CRC must include interactions of hybrid histories. 34 I use this normalization so that corresponds to the average return to the hybrid seed as in the …xed 21 can be written as a function of 1, 2 and 3 from equation (39). Since hit is a dummy variable, substituting the above projection into the yield equations gives: yit = + 0 + 1 hi1 + 2 hi2 + 3 hi1 hi2 + hit + 0 hit + 1 hi1 hit + 2 hi2 hit + 3 hi1 hi2 hit +vi + vi hit +uit (40) For each of the two time periods, the yield functions are therefore: yi1 = ( + 0 )+[ 1 (1+ )+ + yi2 = ( + 0 )+ 1 hi1 +[ 2 (1+ 0 ]hi1 + 2 hi2 +[ 3 (1+ )+ + 0 ]hi2 +[ 3 (1+ )+ 2 ]hi1 hi2 +(vi + vi hi1 +ui1 ) (41) )+ 1 ]hi1 hi2 +(vi + vi hi2 +ui2 ) (42) The corresponding reduced forms include all the interactions of the hybrid histories: yi1 = 1 + 1 hi1 + 2 hi2 + 3 hi1 hi2 + i1 (43) yi2 = 2 + 4 hi1 + 5 hi2 + 6 hi1 hi2 + i2 (44) 1; 2; Equations (43) and (44) give six reduced forms coe¢ cients ( I can estimate …ve structural parameters ( 1; 2; 3; 3; 4; 5; 6 ), from which ; ) using minimum distance35 . The structural parameters are overidenti…ed, given the minimum distance restrictions: 1 = (1 + ) 1 + + 2 = 3 = (1 + ) 3 + 2 4 = 5 = (1 + ) 2 + + 6 = (1 + ) 3 + 1 From these restrictions, for example, form parameters in two ways, = 0 2 1 0 (45) can be estimated as a combination of the reduced 6 3 4 2 and 1 + = 1 5 4 2 . I now discuss extensions to the basic model and how to recover the distribution of 6.2 i ’s. Extensions I consider the following extensions: e¤ects model. The normalization used is irrelevant and does not change the results. It just …xes the average i in the sample and does not change the ordering of the i ’s. 35 There may seem to be six structural P parameters, given the presence of 0 in equations (39) and (45). However, given the normalization that i = 0, 0 is just a function of the hybrid histories, their interactions and the other 0 s. In particular, 0 = 1 hi1 2 hi2 3 hi1 hi2 where hi1 and hi2 are the averages of the adoption decisions in periods one and two, and hi1 hi2 is the average of their interaction. 22 1. Covariates: all the identi…cation arguments presented above generalize when covariates are included. Covariates can be thought of as either exogenous or endogenous. Exogenous covariates are uncorrelated with the i ’s and therefore enter only the reduced forms in equations (43) and (44), but not the projection. Endogenous covariates are correlated with the i ’s and enter the projection in equation (39). In the case where fertilizer is the only other endogenous covariate, the CRC projection generalizes to i = 0 + + 1 hi1 7 hi1 fi2 + + 2 hi2 + 8 hi2 fi2 3 hi1 hi2 + + 4 hi1 fi1 9 hi1 hi2 fi2 + + 10 fi1 5 hi2 fi1 + + 11 fi2 6 hi1 hi2 fi1 + vi (46) where fit for t = 1; 2 represents the use of fertilizer in each period. Adding more endogenous covariates does complicate the CRC model and it can be cumbersome36 . 2. Three periods: this is a simple extension. For space considerations, I do not show the restrictions, but the problem becomes heavily overidenti…ed. There are 9 structural parameters to estimate from the 21 reduced form coe¢ cients for the basic case. 3. Joint choice variables: the two-sector model presented above (hybrid/non-hybrid) can be extended to multiple sectors. Since in my data the use of hybrid seed and fertilizer are clearly a joint decision, as a further check, I incorporate the use of fertilizer into a multiple sector model. I estimate a model that rede…nes sectors to this joint decision and then looks at the heterogeneity in returns across these two sectors. In this scenario, a farmer is either in the technology sector (he uses both hybrid and fertilizer) or not. 6.3 CRC Estimates This section describes the results for the CRC model. I report estimates for the pure hybrid model described in detail above (with and without covariates) for both the two and three period cases. In addition, I report results for the endogenous covariates and joint hybridfertilizer sector extensions described above. Tables 7, 8a, 8b and 9 present the CRC model reduced form and structural estimates. These tables report the EWMD, DWMD and OMD estimates for cases with and without covariates, as well as the coe¢ cient of interest is components, i. 2 statistics on the overidenti…cation tests for the OMD cases. The , the coe¢ cient on the individual speci…c comparative advantage Table 7 presents the two period reduced forms for the CRC model, both with and without covariates, using data for 1997 and 2004. The reduced forms are estimated via 36 There is some justi…cation for treating only fertilizer as endogenous. First, the summary statistics by sector show that there aren’t immense di¤erences across sectors in the use of other inputs. Second, using the estimates from the pure hybrid problem in Table 8, I correlate the distribution of the predicted i ’s with the covariates. Only the correlations between the i ’s and fertilizer are important in magnitude and signi…cance. 23 seemingly unrelated regressions and the most general variance-covariance estimation. The estimates in Table 7 are for the basic speci…cation described above where hybrid is the only endogenous variable and the projection used is given by equation (39). Table 8a reports 2 the structural EWMD, DWMD and OMD results with covariates. The estimates of statistics, both with and without are consistently negative, though with larger standard errors in EWMD and DWMD cases (OMD is the asymptotically e¢ cient estimator). Table 8b presents two sets of structural estimates of that account for fertilizer entering the household’s decision making process regarding hybrid maize instead of restricting it to be exogenous. In the upper panel of Table 8b, I allow for fertilizer to be endogenous and correlated with the i ’s, using the projection in equation (46). Again, the estimates of in this panel of Table 8b are consistently negative, both with and without covariates as well as across EWMD, DWMD and OMD speci…cations. The lower panel in Table 8b reports results for the case where there is a joint hybrid-fertilizer decision on the part of the farmer so that he is in the technology sector if he uses both hybrid and fertilizer37 , otherwise he is not. The reason for estimating this model in this restrictive way is that in the three period case allowing fertilizer to be directly correlated with the i ’s as in equation (46) makes the model a bit cumbersome38 . For this simpler model, I report results for the two and three period cases for comparison. Results for in the lower panel of Table 8b are consistently negative. Table 9 shows the structural estimates for the three period CRC models for two cases: the pure hybrid decision and the joint hybrid-fertilizer technology sector decision (similar to the results in the lower panel of Table 8b). There are three reduced forms here, one for each period, that include all the possible interactions of the three hybrid histories. This gives a total of 21 reduced form estimates that map onto 9 structural estimates: 7 ’s from the projection of the i ’s, the average return to hybrid, , and the comparative advantage coe¢ cient, . For space reasons and since OMD is e¢ cient, I report only the OMD estimates. The EWMD and DWMD results are not signi…cantly di¤erent from zero. The OMD estimates of are negative across the board and often signi…cant. Overall, to summarize all these results, I can often reject the null hypothesis that is equal to zero, especially across the board in the OMD cases. However, for the two period two sector models, I cannot reject the null that is equal to 1. For the endogenous hybrid- fertilizer model in Table 8b and the three period models in Table 9, I can reject the null that is equal to 1. I explain the signi…cance of this in the discussion below. 37 The fertilizer decision I use is regarding chemical fertilizers. The results are similar if I include the use of manure as part of the fertilizer decision. 38 An additonal complication in the data is that the 2000 survey did not collect data on family labor. So, for the three period case, I opted for the simpler model. The results should be interpreted with caution. 24 6.4 Recovering the Distribution of bi Given the CRC structural estimates of the ’s and the form of the projection given by either equations (39) or (46), I can predict the i ’s for a given history of hybrid use. However, I must assume that the projections describe the true conditional expectation of the i ’s. This is essentially an assumption only for the case of equation (46). For equation (39), since hit is binary and each history is accounted for, the projection is close to saturated. Once I have the predicted distribution of i, I can derive the predicted Figure 7a shows the distribution of the predicted i ’s i ’s via the yield function. in my sample for the two period model with the full set of exogenous covariates (the estimates come from Table 8a). Since the predicted predicted i ’s i ’s are obtained from the projection in equation (39), the distribution of the has just four mass points. There are only four possible hybrid histories: non- hybrid stayers, hybrid stayers, leavers and joiners. So, given the projection, there are only four possible values of bi : 0:3684; 0:0691; 0:2275 and 0:4033 for the four histories respectively. Think of these mass points as averages of the underlying individual hybrid use. The non-hybrid stayers have the lowest predicted returns to hybrid as i ’s i ’s for each history of and hence the highest < 0. The joiners and leavers have close to zero returns (they are the marginal farmers) and the hybrid stayers have low positive returns. Figure 7b shows the distribution of returns across my sample. To get this distribution of returns, I need to use the estimate of (the structural coe¢ cient on hit in equation (24)), which is the average return to hybrid in the sample. The distribution of returns is therefore given by + i. Since < 0, the ordering of returns is the reverse of the ordering of i ’s, i.e. the non-hybrid stayers have the most negative i ’s, but the highest positive returns. Figure 7c shows the distribution of the bi ’s for the case where both hybrid and fertilizer use are endogenous (the estimates come from the upper panel of Table 8b). Now, the distribution of the predicted bi ’s is continuous since the predicted bi ’s come from the projection in equation (46). Figure 7c splits out the distribution of these predicted bi ’s by transition. Clearly, the ordering of average predicted bi ’s (and hence returns) is the same as earlier. The non-hybrid stayers have the highest returns (lowest bi ’s), followed by the hybrid stayers, then the leavers and …nally the joiners. Finally, as a check, Figure 7d shows the corresponding distribution of the predicted 7 i ’s (which were constructed to be orthogonal to hybrid choice). Robustness Checks In this section, I present as robustness checks control function and treatment e¤ect estimates of the returns to hybrid. This includes the Heckman two-step estimator (also see Garen (1984) and Deschênes (2001)) and estimates of treatment e¤ects under non-random assignment, i.e. the ATE, TT, MTE and LATE (see Björklund and Mo¢ tt (1987), Heckman, Tobias and 25 Vytlacil (2001) and Carneiro and Heckman (2002)). Rewriting equation (11) for one period, 0 yi = + Xi ( Estimates of i H N ) + (uH i 0 uN i ) hi + X i N + uN i i hi 0 N + Xi + uN i (47) would illustrate whether the average returns of farmers selecting into using hybrid maize are higher than the returns for the farmers at the margin. With assumptions of normality, it is possible to estimate the standard cross sectional selection parameters. These selection correction procedures are cross sectional in nature and impose distributional assumptions. They do not fully address the issues I am interested in, nor do they fully exploit the panel nature of my data. I present them as robustness checks. The yields in the hybrid and non-hybrid sectors are given by equations (3) and (4), and the selection equation is given by equation (9). Drawing on Heckman, Tobias and Vytlacil (2001), I look at the following treatment e¤ects under the simple assumption of normality: AT E(x) = E( jX = x) = x( T T (x; z; h(Z) = 1) = x( M T E(x; usi ) = x( where = yH H H N N )+( )+( y N is the yield gain from hybrid, H N H H H H 2 H (:) and N N (48) (z ) (z ) N) H (49) s N )ui = V ar(uH i ), is the error in the selection equation given by equation (9), s Corr(uN i ; ui ). ) (50) 2 N s = V ar(uN i ), ui s = Corr(uH i ; ui ) and N = (:) represent the normal probability and cumulative distribution functions respectively. V ar(usi ) is normalized to be 1, so that H H and N N correspond to the coe¢ cients on the selection terms in the sector speci…c selection models. These treatment e¤ects in the case of joint normality are simple to compute using a twostep control function procedure. The …rst stage is a probit describing the hybrid adoption decision, used to compute the selection correction terms that are used as controls in the second stage sector speci…c yield functions. The ATE uses the estimates of the coe¢ cients on the Xi ’s from this second step. The TT adjusts this estimated ATE for the selection of those who actually plant hybrid. The slope of the MTE describes whether people who are more likely to use hybrid for unobservable reasons (higher usi ) have higher or lower returns from planting hybrid. If the coe¢ cient on usi in equation (50) is negative, it implies that farmers with unobservables that make them the least likely to use hybrid have the highest yield return to planting hybrid over non-hybrid. A non-zero slope of the MTE therefore illustrates the role of unobservables and the importance of heterogeneity in returns. These treatment e¤ects are shown in Table 10 for every cross-section of my data. To estimate these parameters, I need an exclusion restriction, i.e. a variable that enters the selection equation (9), but not the yield functions in equations (3) and (4). I use a variable 26 that describes a household’s access to fertilizer and hybrid seed, in particular the distance to the closest stockist of fertilizer (not the distance to where fertilizer is purchased). This distance measure gives an idea of the access and availability of the technologies and the general level of infrastructure for these households. The ATE’s (evaluated at the mean of the Xi ’s) are all extremely large and positive, ranging from 0.768 in 2000 to 2.174 in 1997. The TT estimates are also large and positive and often smaller than the ATE estimates, ranging from 0.898 to 1.138. The MTE slope, meanwhile, is consistently negative and mostly signi…cant: –1.867 (with a standard error of 0.743) in 1997, -0.232 (0.372) in 2000 and -0.885 (0.240) in 2004. The negative MTE slopes imply that farmers with unobservables that make them most likely to use hybrid get the lowest returns from hybrid. Finally, I can estimate the instrumental variable (i.e. LATE) estimate using the same exclusion restriction or instrument. The IV estimates of the return to hybrid, reported in the lower panel of Table 10 are extremely large, on the order of 150%, especially when compared to the earlier OLS and household …xed e¤ects estimates39 . These IV estimates are consistent with my estimates of in Tables 8 and 9 and, given the non-zero MTE slopes, they provide further evidence of the heterogeneity in the e¤ects of hybrid. The next section is devoted to discussing all these estimates and relating them to estimates of to better understand farmers’adoption decisions. 8 Discussion How do all the results from the various models presented above relate to each other and inform the underlying correlations of returns, costs and adoption decisions? In this section, I relate my estimates of estimates of to the estimates from the selection models to understand what the mean and their policy relevance. Recall that the parameter tells us whether the households that do better on average, irrespective of the technology they use, also have the highest returns to hybrid, and it illustrates the relative equality or inequality in yields. I rewrite the factor structure in equations (12) and (13) for the one period/cross-sectional H i case, suppressing the transitory components, and uH i = ( + 1) uN i = i + i N i : + i i To relate this to the selection model, I need expressions for H H and N N in terms of 39 The IV results are similar if I use the interactions of distance with dummies describing which asset quantile a household is in as the exclusion restrictions with the level e¤ects of distance and asset quantiles entering as controls in both stages of the speci…cation. This approach (results not reported here, but available from the author) allows distance and asset quantiles to a¤ect yield outcomes directly, but relies on distance to be more of a constraint for poorer households so that the interaction is useful as an exclusion restriction. 27 the parameters of these factor structures. Recall that s Cov(uN i ; ui ) since V ar(usi ) H H N H H N N s = Cov(uH i ; ui ) and = 1. I can therefore derive an expression for N s = Cov(uH i ; ui ) 2. would simply be H H N N uN i = i. N = N: s s Cov(uN i ; ui ) = Cov( i ; ui ) In the strict Roy model without costs, usi = uH i (51), H H (51) In this case, from equation A negative MTE slope must therefore imply that < 0. However, in a generalized Roy model with costs usi = i Ci where Ci is a measure of relative costs in the hybrid sector (that do not enter the yield equation). This implies H H From the negative slope of the MTE, N N 2 = H H N Cov( i ; Ci ) N < 0. If < 0, then Cov( i ; Ci ) can be either positive and small, or negative. On the other hand, if > 0, it implies that Cov( i ; Ci ) needs to be quite large and positive (enough to outweigh the Roy model, the sign of (52) 2 ). In terms of a generalized tells us something about the covariance of comparative advantage and relative costs of planting hybrid. Given that I …nd that < 0, it must be that households with high return/low comparative advantage have higher relative costs of hybrid. To understand these costs, in Table 11 I look at what observables correlate with the predicted bi ’s. The results reported are for the two period model with endogenous hybrid and fertilizer (using estimates from the upper panel of Table 8b). The observables I consider include the distance from the household to the closest fertilizer seller, distance to the closest tarmac road, distance to the closest matatu (public transport) stop, distance to the closest motorable road, distance to the closest extension services, a set of dummies for education of the household head, a dummy for whether the household tried to get credit, a dummy for whether the household tried to get credit but did not receive any, a dummy for whether the household received any credit, province and year dummies. The infrastructure and education variables correlate strongly with the predicted bi ’s, but the credit variables do not40 . Both from the agronomy of hybrid maize, as well as the kernel density plots in Figure 6, it is clear that hybrid increases productivity, but this increase declines as you move rightward through the distribution, so the expected return from hybrid is almost zero in the right tail. When < 0, the marginal gain is negative, so that the farmers in the left tail have the largest gain. This is seen from the estimates of as well as the negative selection estimates. If selection was only as per the strict Roy Model, we would see the largest adoption rates in the left tail, but this ignores the possibility of larger costs/constraints for these farmers. Table 11 illustrates that the households with the lower predicted bi ’s have much higher costs. Recall that throughout the sample period, Kenya Seed Company was the predominant distributor of 40 These results hold up in the presence of …ner regional dummies, agro ecological zones or districts. 28 hybrid seed (about 1% of hybrid seed purchased in 2004 was not from Kenya Seed Company) and under pan territorial seed pricing, seed prices were the same across the country. This meant that suppliers had no incentives to locate far away as there was no compensating increased price. And, villages are large and heterogeneous and these measures of costs and infrastructure vary even within a village. The IV estimates in Table 10 also suggest that distance and infrastructure are constraining factors. An issue with IV estimates is that the estimated results are often quite di¤erent if di¤erent instruments are used. This is usually attributed to underlying heterogeneity in the population so that each instrument a¤ects a di¤erent subset of the population, hence giving the varied results. However, the instrument itself, per se, does not usually highlight which subset of the population it a¤ects. Appealingly, here the IV estimates are about 150% which are the same magnitude as the returns that the non-hybrid stayers would get from using hybrid. Distances and infrastructure do seem to pose a constraint to households using these technologies. Finally, note that for sector, the yields are given by non-hybrid sector include the + i. i = 1, looking at the outcomes in just the hybrid (plus the constant of course). But, the yields in the This also explains why the IV estimates are close to the estimates of . The mirror image of the result for the non-hybrid stayers is that there may be “over-adoption” for some farmers in the right tail. The fact that some households have lower returns yet adopt may hint at risk being an important factor in the adoption decision, since the mean gross returns to hybrid in the upper quartile are close to zero. To account for risk, it would be necessary to allow the variance of the transitory component in yields to play a role in the adoption decision. I leave this to future work. 8.1 Alternative Models Finally, I discuss some alternative models of adoption behavior that are popular in the literature on developing countries. The …rst is a model of learning, where households are uninformed about the bene…ts of a technology and they trial and experiment with it to learn what the returns are. Learning models give rise to the familiar S-shaped curves of adoption rates over time (see Griliches (1957)): initially, few households adopt, then as more households learn about the productivity of the technology, adoption rates increase until everyone has adopted. In my data, adoption rates show no dynamics - aggregate adoption remains constant over the span of almost ten years. In fact, there are no dynamics in any of the inputs either, even after as large a shock as El Nino in 1998. Any conventional model of learning is not borne out by these aggregates. It may be possible that households are learning about the di¤erent types of hybrid seed and are switching to newer varieties of hybrid. However, even in that case, we should still see aggregate increases in adoption over time. For two periods of the survey (2002 and 2004), the 29 data includes exactly what type of hybrid variety was used. In 2002, 61.5% of maize plots were planted with hybrid 614, 6% each of 625 and 627 and 4% of 511. The release date of these varieties was 1986, 1981, 1996 and the 1960’s respectively. In 2004, 69.5% of plots were planted with 614, 5% with 627, 4% each with 625 and 511. So, households are using rather old varieties and do not switch much between them. The switching between hybrid and non-hybrid from period to period may look like learning or experimenting behavior, but recall that hybrid and fertilizer have been available for a few decades now. The 2004 survey asked households what year they …rst used the technologies this data shows that 99% and 83% of households have used hybrid and fertilizer (respectively) before. Table 2c does not show any systematic persistence in adoption (the cycling behavior is about as prevalent as the non-cycling). In the qualitative parts of the survey, farmers who were using traditional varieties were asked why they were not using hybrid varieties and only about 0.3% cited experimenting or on trial. Learning therefore seems to be unimportant in my data. However, there may have been large learning e¤ects before my sample period - Gerhart (1975) shows S-shaped curves of adoption in the 1970’s. It seems that these curves ‡attened out at low rates of adoption. Finally, on the methodological side, a learning model with heterogeneity in returns would imply a random coe¢ cient model with feedback. Chamberlain (1993) shows that such models su¤er from identi…cation problems. Gibbons et al. (2005) estimate a random coe¢ cient model with learning, but they must rely on the structure of the learning process and a long panel (the NLSY) to identify the model. The second set of alternative models deals with the role of credit constraints. The data on credit show that in 2004, 41% of all households tried to get credit and 83% of these got credit. In 1997, these …gures were collected for agricultural credit, where 33.7% of households tried to get agricultural credit and 90% of these got it. Credit constraints do not seem to be of …rst order importance here, for the following three reasons. First, only 2% of households plant both hybrid and non-hybrid seed though there are no …xed costs to hybrid and both hybrid seed and fertilizer are available in small quantities from suppliers. Second, the patterns in Table 5 did not …t with a pure liquidity constraints story, as described earlier. And, third, none of the credit variables (which admittedly are not perfect) correlate strongly with comparative advantage in Table 11. The …nal hypothesis for the adoption behavior of households is di¤erences in tastes. In Latin America, the hybrid and non-hybrid varieties taste di¤erent and have di¤erent prices in the market. This is not the case in Kenya. Hybrid and non-hybrid maize output are indistinguishable, they are sold at the same price as a single commodity. The survey asked farmers why they chose the variety they plant and an equal fraction of each type of farmer (hybrid and non-hybrid) said it was because they preferred the taste. The di¤erential tastes hypothesis does not seem to …t the case of Kenya. 30 9 Conclusion This paper examines the adoption decisions and bene…ts of hybrid maize in a framework that is in stark contrast to the empirical technology adoption literature of the past two decades. Rather than think of adoption decisions as based on learning and information externalities, I focus on a framework that recognizes the large disparities in farming and input supply characteristics across the maize growing areas of Kenya. Within this framework, I …nd that returns to hybrid maize vary greatly. Furthermore, farmers who are on the margin of adopting and disadopting (and who do so during my sample period) experience little change in yields, a …nding consistent with my framework, yet harder to reconcile with a pure learning model. The experimental evidence points to average high, positive returns to these agricultural technologies. However, these experiments say little else about the returns, sometimes hypothesizing irrationality in decision making to account for patterns of low, non-increasing adoption rates. My framework of heterogeneous returns allows me to estimate not just average returns accounting for selection, but also to have an idea of what the distribution of returns looks like across my sample of farmers. I …nd extremely strong evidence of heterogeneity in returns to hybrid maize, with comparative advantage playing an important role in yield determination. The most important …ndings of my work are the policy implications of heterogeneity in returns. Since I am able to look at a distribution of returns across my sample of farmers, I can separate out farmers with low returns from those with high returns. I …nd that for a small group of farmers in my sample, returns from hybrid maize would be extremely high, yet they do not adopt hybrid maize. I show that these farmers have high unobservable costs that prevent them from adopting hybrid as they have poor access to seed and fertilizer distributors. In terms of policy, alleviating these constraints would greatly increase yields for these farmers. However, this is only a small fraction of my sample. For a large fraction of my sample, the returns to hybrid maize are smaller and these farmers choose to adopt. While I do not build risk into the choice framework used in this paper, farmers that are well o¤ can a¤ord to use hybrid maize, even when the mean returns to hybrid are low, since it helps insure them again bad outcomes. For these farmers, since they do not seem to be constrained, they would bene…t greatly from improved e¤orts in developing new hybrid strains and a biotechnology e¤ort similar to that of South Asia and Latin America, where cultivation of newer hybrids occurs often and leads to continual yield increases. Finally, the heterogeneity in return to hybrid seed, on the whole, suggests quite rational and relatively unconstrained adoption of existing hybrid strains. 31 References Altonji, Joseph, and Lewis Segal, (1996), “Small Sample Bias in GMM Estimation of Covariance Structures”, Journal of Business and Economic Statistics, 14, pp. 353-366. Ashenfelter, Orley, and David Card, (1985), “Using the Longitudinal Structure of Earnings to Estimate the E¤ects of Training Programs”, Review of Economics and Statistics, 67(4). Attanasio, Orazio, Costas Meghir, and Miguel Szekely, (2003), “Using Randomized Experiments and Structural Models for Scaling Up: Evidence from the PROGRESA Evaluation”, Mimeo, University College London. Bandiera, Oriana, and Imran Rasul, (2003), “Complementarities, Social Networks and Technology Adoption in Mozambique”, Working Paper, London School of Economics. Banerjee, Abhijit, (1992), “A Simple Model of Herd Behavior”, Quarterly Journal of Economics, 107(3), pp. 797-817. Besley, Timothy, and Anne Case, (1993), “Modelling Technology Adoption in Developing Countries”, AER Papers and Proceedings, 83(2), pp. 396-402. Björklund, Anders, and Robert Mo¢ tt, (1987), “The Estimation of Wage Gains and Welfare Gains in Self Selection Models”, Review of Economics and Statistics, pp. 42-49. Borjas, George, (1987), “Self-Selection and the Earnings of Immigrants”, American Economic Review, 77 (4), pp. 531-553. Card, David, (2000), “Estimating The Returns to Schooling: Progress of Some Persistent Econometric Problems”, NBER Working Paper No. 7769, Cambridge, MA. Card, David, (1998), “The Causal E¤ect of Education on Earnings”, in the Handbook of Labour Economics, eds. Orley Ashenfelter and David Card. Card, David, (1996), “The E¤ect of Unions on the Structure of Wages: A Longitudinal Analysis”, Econometrica, 64. Card, David, and Daniel Sullivan, (1988), “Measuring the E¤ect of Subsidized Training Programs on Movements In and Out of Employment”, Econometrica, 56. Carneiro, Pedro, Karsten Hansen, and James Heckman, (2003), “2001 Lawrence R. Klein Lecture: Estimating Distributions of Treatment E¤ects with an Application to the Returns to Schooling and Measurement of the E¤ects of Uncertainty on College Choice”, International Economic Review, 44 (2), pp. 361-422. Carneiro, Pedro, and James Heckman, (2002), “The Evidence on Credit Constraints in PostSecondary Schooling”, Economic Journal, 112 (October), pp. 705-734. Carneiro, Pedro, and Sokbae Lee, (2004), “Comparative Advantage and Schooling”, Working Paper, University College London. 32 Chamberlain, Gary, (1993), “Feedback in Panel Data Models”, mimeo, Department of Economics, Harvard University. Chamberlain, Gary, (1984), “Panel Data”, in Zvi Griliches and Michael Intriligator (eds.), Handbook of Econometrics, Amsterdam: North-Holland. Chamberlain, Gary, (1982), “Multivariate Regression Models for Panel Data”, Journal of Econometrics, 18, pp. 5-46. Conley, Timothy, and Christopher Udry, (2003), “Learning About A New Technology: Pineapple in Ghana”, Working Paper, University of Chicago and Yale University. Corbett, John, (1998), “Classifying Maize Production Zones in Kenya Through Multivariate Cluster Analysis”, in Maize Technology Development and Transfer: A GIS Application for Research Planning in Kenya, ed. Rashid Hassan, CAB International, Wallingford, UK. Croppenstedt, Andre, Mulat Demeke, and Meloria Meschi, (2003), “Technology Adoption in the Presence of Constraints: The Case of Fertilizer Demand in Ethiopia”, Review of Development Economics, 7(1), pp. 58-70. David, Paul, (2003), “Zvi Griliches on Di¤usion, Lags and Productivity Growth... Connecting the Dots”, Paper Prepared for the Conference on R&D, Education and Productivity, Held in Memory of Zvi Griliches (1930-1999), August 2003. De Groote, Hugo, William Overholt, James Ouma, and S. Mugo, (2003), “Assessing the Potential Impact of Bt Maize in Kenya Using a GIS Based Model”, Working Paper, CIMMYT (International Maize and Wheat Improvement Center). De Groote, Hugo, Cheryl Doss, Stephen Lyimo, and Wilfred Mwangi (2002), “Adoption of Maize Technology in East Africa: What Happened to Africa’s Emerging Maize Revolution?” Presented at Green Revolution in Asia and its Transferability to Africa, 2002. Dercon, Stefan and Luke Christiaensen, Luc, (2005), “Consumption Risk, Technology Adoption and Poverty Traps: Evidence from Ethiopia”, mimeo. Deschênes, Olivier, (2001), “Unobservable Ability, Comparative Advantage and the Rising Return to Education in the US, 1979-2000”, Working Paper, September 2001, University of California, Santa Barbara. Doss, Cheryl, (2003), “Understanding Farm Level Technology Adoption: Lessons Learned from CIMMYT’s Micro Surveys in Eastern Africa”, Economics Working Paper No. 03-07, CIMMYT (International Maize and Wheat Improvement Center), Kenya. Du‡o, Esther, Michael Kremer, and Jonathan Robinson, (2006), “Why Don’t Farmers Use Fertilizers: Evidence from Field Experiments in Western Kenya”, Working Paper, 2006. Duncan, Gregory, and Duane Leigh, (1985), “The Endogeneity of Union Status: An Empirical Test”, Journal of Labor Economics, 3 (3), pp. 385-402. 33 Evenson, Robert, and Germano Mwabu, (1998), “The E¤ects of Agricultural Extension on Farm Yields in Kenya”, Economic Growth Center Discussion Paper No. 798. Feder, Gershon, Richard Just, and David Zilberman, (1985), “Adoption of Agricultural Innovations in Developing Countries: A Survey”, Economic Development and Cultural Change. Food and Agriculture Organization, FAOSTAT Online Database. Foster, Andrew, and Mark Rosenzweig, (1995), “Learning by Doing and Learning from Others: Human Capital and Technological Change in Agriculture”, Journal of Political Economy, 103, pp. 1176-1209. Garen, John, (1984), “The Returns to Schooling: A Selectivity Bias Approach with a Continuous Choice Variable”, Econometrica, 52 (5), pp. 1199-1218. Gerhart, John Deuel, (1975), “The Di¤usion of Hybrid Maize in Western Kenya,” Ph.D. Dissertation, Princeton University. Gibbons, Robert, Lawrence F. Katz, Thomas Lemieux, and Daniel Parent, (2002), “Comparative Advantage, Learning and Sectoral Wage Determination”, Scienti…c Series, CIRANO. Green, David, (1991), “A Comparison of the Estimation Approaches for the Union-Nonunion Wage Di¤erential”, Discussion Paper No. 91-13, Department of Economics, University of British Columbia, Vancouver, Canada. Griliches, Zvi, (1957), “Hybrid Corn: An Exploration in the Economics of Technological Change”, Econometrica, 25, pp. 501-522. Hall, Bronwyn, (2004), “Innovation and Di¤usion”, NBER Working Paper No. 10212. Hassan, Rashid, Festus Murithi, and Geofry Kamau, (1998), “Determinants of Fertilizer Use and the Gap Between Farmers’ Maize Yields and Potential Yields in Kenya”, in Maize Technology Development and Transfer: A GIS Application for Research Planning in Kenya, ed. Rashid Hassan, CAB International, Wallingford, UK. Hassan, Rashid, Kiairie Njoroge, Mugo Njore, Robin Otsyula, and A. Laboso, (1998), “Adoption Patterns and Performance of Improved Maize in Kenya”, in Maize Technology Development and Transfer: A GIS Application for Research Planning in Kenya, ed. Rashid Hassan, CAB International, Wallingford, UK. Heckman, James, (2001), “Micro Data, Heterogeneity, and the Evaluation of Public Policy: Nobel Lecture”, Journal of Political Economy, 109 (4), pp. 673-748. Heckman, James, and Xuesong Li, (2003), “Selection Bias, Comparative Advantage and Heterogeneous Returns to Education: Evidence from China in 2000”, IFAU - Institute for Labour Market Policy Evaluation, Working Paper 2003:17. 34 Heckman, James, Justin Tobias, and Edward Vytlacil, (2001), “Four Parameters of Interest in the Evaluation of Social Programs”, Southern Economic Journal, 68 (2), pp. 210-223. Heckman, James, Je¤rey Smith, with the assistance of Nancy Clements (1997), “Making the Most out of Programme Evaluations and Social Experiments: Accounting for Heterogeneity in Programme Impacts”, Review of Economic Studies, 64, pp. 487-535. Heckman, James, and Edward Vytlacil, (1997), “Instrumental Variables Methods for the Correlated Random Coe¢ cient Model: Estimating the Average Rate of Return to Schooling When the Return is Correlated with Schooling”, Journal of Human Resources, 33(4). Heckman, James, and Bo Honore, (1990), “The Empirical content of the Roy Model”, Econometrica, 58 (5), pp. 1121-1149. Jakubson, George, (1991), “Estimation and Testing of the Union Wage E¤ect Using Panel Data”, Review of Economic Studies, 58 (5), pp. 971-991. Jayne, Thom S., Robert J. Myers, and James Nyoro, (2005), “E¤ects of Government Maize Marketing and Trade Policies on Maize Market Prices in Kenya”, Working Paper, MSU. Jayne, Thom S., Takashi Yamano, James Nyoro, and Tom Awuor, (2001), “Do Farmers Really Bene…t from High Food Prices? Balancing Rural Interests in Kenya’s Maize Pricing and Marketing Policy”, Tegemeo Draft Working Paper 2B, Tegemeo Institute, Kenya. Karanja, Daniel, (1996), “An Economic and Institutional Analysis of Maize Research in Kenya”, MSU International Development Working Paper No. 57, Department of Agricultural Economics and Department of Economics, Michigan State University, East Lansing. Karanja, Daniel, Thomas Jayne, and Paul Strasberg, (1998), “Maize Productivity and Impact of Market Liberalization in Kenya”, KAMPAP (Kenya Agricultural Marketing and Policy Analysis Project), Department of Agricultural Economics, Egerton University, Kenya. Karanja, Daniel, Mitch Renkow, and Eric Crawford, (2003), “Welfare E¤ects of Maize Technologies in Marginal and High Potential Regions of Kenya”, Agricultural Economics, 29. Lemieux, Thomas, (1993), “Estimating the E¤ects of Unions on Wage Inequality in a TwoSector Model with Comparative Advantage and Non-Random Selection”, Working Paper, CRDE and Departement de Sciences Economiques, Universite de Montreal. Lemieux, Thomas, (1998), “Estimating the E¤ects of Unions on Wage Inequality in a TwoSector Model with Comparative Advantage and Non-Random Selection”, Journal of Labor Economics, 16(2), pp. 261-291. Makokha, Stella, Stephen Kimani, Wilfred Mwangi, Hugo Verkuijl, and Francis Musembi, (2001), “Determinants of Fertilizer and Manure Use for Maize Production in Kiambu District, Kenya”, Mexico, D.F.: CIMMYT. 35 Mandelbrot, Benoit, (1962), “Paretian Distributions and Income Maximization”, Quarterly Journal of Economics, 76, pp.57-85. McCann, James C., (2005), Maize and Grace: Africa’s Encounter with a New World Crop, 1500-2000, Harvard University Press, USA. Moser, Christine, and Christopher Barrett, (20030, “The Complex Dynamics of Smallholder Technology Adoption: The Case of SRI in Madagascar”, Working Paper, Cornell. Mugo, Frida, Frank Place, Brent Swallow, and Maggie Lwayo, (2000), “Improved Land Management Project: Socioeconomic Baseline Study”, Nyando River Basin Report, ICRAF. Munshi, Kaivan, (2003), “Social Learning in a Heterogeneous Population: Technology Di¤usion in the Indian Green Revolution”, Journal of Development Economics, 73, pp. 185-213. Mundlak, Yair, (1961), “Empirical Production Function Free of Management Bias”, Journal of Farm Economics, 43(1) pp. 44-56. Nyameino, David, Bernard Kagira, and Stephen Njukia, (2003), “Maize Market Assessment and Baseline Study for Kenya”, RATES Program, Nairobi, Kenya. Nyoro, James K., M. W. Kiiru, and Thom S. Jayne, (1999), “Evolution of Kenya’s Maize Marketing Systems in the Post-Liberalization Era”, Tegemeo Working Paper, June 1999. Olley, Steven and Ariel Pakes, (1996), “The Dynamics of Productivity in the Telecommunications Equipment Industry”, Econometrica, 64, pp. 1263-1297. Ouma, James, Festus Murithi, Wilfred Mwangi, Hugo Verkuijl, Macharia Gethi, and Hugo De Groote, (2002), “Adoption of Maize Seed and Fertilizer Technologies in Embu District, Kenya”, Mexico, D.F.: CIMMYT (International Maize and Wheat Improvement Center). Pischke, Jörn-Ste¤en, (1995), “Measurement Error and Earnings Dynamics: Some Estimates from the PSID Validation Study”, Journal of Business and Economic Statistics, 13 (3). Place, Frank, and Brent Swallow, (2000), “Assessing the Relationship Between Property Rights and Technology Adoption on Smallholder Agriculture: A Review of Issues and Empirical Methods”, CAPRi Working Paper No. 2, ICRAF, Kenya. Robinson, Chris, (1989), “The Joint Determination of Union Status and Union Wage E¤ects: Some Tests of Alternative Models”, Journal of Political Economy, 97 (3), pp. 639-67. Rogers, Everett, (1995), “Di¤usion of Innovations”, Fourth Edition, Free Press, New York. Roy, A., (1951), “Some Thoughts on the Distribution of Earnings”, Oxford Economic Papers, 3, pp. 135-146. Salasya, M.D.S., W. Mwangi, Hugo Verkuijl, M.A. Odendo, and J.O. Odenya, (1998), “An Assessment of the Adoption of Seed and Fertilizer Packages and the Role of Credit in Smallholder Maize Production in Kakamega and Vihiga Districts, Kenya”, CIMMYT. 36 Sanders, John, Barry Shapiro, and Sunder Ramaswamy, (1996), “The Economics of Agricultural Technology in Semiarid Sub-Saharan Africa”, John Hopkins University Press. Sattinger, Michael, (1993), “Assignment Models of the Distribution of Earnings”, Journal of Economic Literature, 31 (2), pp. 831-880. Schultz, Theodore, (1963), The Economic Value of Education, Columbia University Press. Smale, Melinda, and Thom Jayne, (2003), “Maize in Eastern and Southern Africa: “Seeds” of Success in Retrospect”, EPTD Discussion Paper No. 97. Smale, Melinda, and Hugo De Groote, (2003), “Diagnostic Research to Enable Adoption of Transgenic Crop Varieties by Smallholder Farmers in Sub-Saharan Africa”, African Journal of Biotechnology, 2 (12), pp. 586-595. Sserunkuuma, Dick, (2002 ), “The Adoption and Impact of Improved Maize Varieties in Uganda”, Working Paper, Makerere University, Uganda. Sunding, David, and David Zilberman, (2001), “The Agricultural Innovation Process: Research and Technology Adoption in a Changing Agricultural Sector”, Chapter 4, Handbook of Agricultural Economics, Volume 1, eds. B. Gardner and G. Rausser. Suri, Tavneet, (2006), “Selection and Comparative Advantage in Technology Adoption”, Economic Growth Center Discussion Paper No. 944, Yale University. Tittonell, Pablo, (2003), “Soil Fertility Gradients in Smallholder Farms of Western Kenya, Their Origin, Magnitude and Importance”, Thesis, Wageningen University, Netherlands. Vella, F., and M. Verbeek, (1998), “Whose Wages Do Unions Raise? A Dynamic Model of Unionism and Wage Rate Determination for Young Men”, Journal of Applied Econometrics, 13, pp. 163-168. Wanzala, Maria, Thom Jayne, John Staatz, Amin Mugera, Justus Kirimi, and Joseph Owuor, (2001), “Fertilizer Markets and Agricultural Production Incentives: Insights from Kenya”, Tegemeo Institute Working Paper No. 3, Nairobi, Kenya. Weir, Sharada, and John Knight, (2000), “Adoption and Di¤usion of Agricultural Innovations in Ethiopia: The Role of Education”, Working Paper, CSAE, Oxford University. Wekesa, E., Wilfred Mwangi, Hugo Verkuijl, K. Danda, and Hugo De Groote, (2003), “Adoption of Maize Technologies in the Coastal Lowlands of Kenya”, Mexico, D.F.: CIMMYT. Willis, Robert, (1986), “Wage Determinants: A Survey and Reinterpretation of Human Capital Earnings Functions”, in 0rley Ashenfelter and Richard Layard (eds.), Handbook of Labor Economics, Amsterdam: North-Holland. Wooldridge, Je¤rey, (1997), “On Two Stage Least Squares Estimation of the Average Treatment E¤ect in a Random Coe¢ cient Model”, Economics Letters, 56, pp. 129-133. 37 Figure 1 Population Density and Location of Sample Villages Figure 2 Hybrid Maize Adoption Patterns, by Province 1 0.9 Rift Valley Fraction of Seed that is Hybrid 0.8 0.7 Western Central Sample 0.6 0.5 Eastern 0.4 Nyanza 0.3 0.2 0.1 0 1995 Coast 1996 1997 1998 1999 2000 Year 2001 2002 2003 2004 2005 Figure 3a Fraction of HH's Using Inorganic Fertilizer, by Province 1 Fraction of HHs that Use Inorganic Fertilizers Central 0.9 Western 0.8 Rift Valley 0.7 Eastern Sample 0.6 0.5 Nyanza 0.4 0.3 0.2 Coast 0.1 0 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 Year Figure 3b Real Expenditure on Inorganic Fertilizer, by Province Real Expenditure on Inorganic Fertilizers 6000 5000 Western Rift Valley 4000 3000 Sample 2000 Central Eastern 1000 Nyanza Coast 0 1995 1996 1997 1998 1999 2000 Year 2001 2002 2003 2004 2005 Figure 4 Maize Acreage, by Province 3.5 3 Coast Maize Acreage (Acres) 2.5 Rift Valley Western 2 Eastern 1.5 Nyanza 1 Central 0.5 0 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 Year Figure 5 Maize Harvest, by Province 1400 Rift Valley 1200 Western Maize Harvest (Kg/Acre) 1000 800 Central Eastern 600 Nyanza 400 Coast 200 0 1995 1996 1997 1998 1999 2000 Year 2001 2002 2003 2004 2005 Figure 6a 0 .1 .2 .3 .4 .5 Marginal Distribution of Yields by Sector, 1997 0 2 4 6 Non-Adopters 8 10 Adopters Figure 6b 0 .1 .2 .3 .4 .5 Marginal Distribution of Yields by Sector, 2000 0 2 4 6 Non-Adopters 8 10 Adopters Figure 6c 0 .2 .4 .6 Marginal Distribution of Yields by Sector, 2004 0 2 4 6 Yields (Kg/Acre) Non-Adopters 8 10 Adopters Figure 7b Distribution of Comparative Advantage Distribution of Returns 1 0 .5 Predicted Return .2 0 -.2 -.4 Predicted Theta .4 1.5 Figure 7a Non-Hybrid Hybrid Leavers Joiners Non-Hybrid Hybrid Leavers Figure 7d Endogenous Hybrid and Fertilizer Use Distribution of Tau 0 0 2 .2 .4 4 .6 6 Figure 7c Joiners -.4 -.2 0 .2 .4 .6 .8 Theta N H -3 -2 -1 0 1 Tau L J N, 1997 H, 1997 2 Table 1: Trends in Yields of Staples Average Annual % Changes in Yields (Hg/Ha), by Decade 1961-1970 1971-1980 1981-1990 1991-2004 Kenya Maize Wheat 0.362 5.646 2.373 2.333 1.169 -3.078 -1.198 0.984 India Maize Wheat Rice 1.502 4.876 0.954 0.842 2.514 1.714 1.900 3.343 3.310 2.572 1.235 0.838 Mexico Maize Wheat 2.057 4.586 4.267 3.204 -0.548 -0.255 1.447 1.664 Zambia Maize -0.267 10.403 1.571 -1.707 Notes: Surprisingly, in the 1960’s the levels of yields of maize in Kenya, India and Mexico were comparable and very similar (all ranging between 10,000 and 12,000 Hg/Ha). The levels of yields in Zambia were slightly lower (about 8,000 Hg/Ha) whereas those in Uganda and Malawi were comparable to those in Kenya. Source: FAOSTAT Online Database Table 2a: Summary Statistics, by Sample Year 1997 Sample 2000 Sample 2004 Sample Yield (Log Maize Harvest Per Acre) 5.907 (1.153) 6.125 (1.092) 6.350 (0.977) Acres Planted 1.903 (3.217) 2.313 (3.948) 1.957 (2.685) Total Seed Planted (Kg per Acre) 9.575 (7.801) 9.331 (6.805) 9.072 (6.863) Total Purchased Hybrid Planted (Kg per Acre) 6.273 (6.926) 5.918 (6.636) 5.080 (5.260) Total Local Seed Planted (Kg per Acre) 2.653 (6.326) 2.868 (6.129) 3.120 (7.318) Hybrid (dummy) 0.658 (0.475) 0.676 (0.468) 0.604 (0.489) Fertilizer (Kg DAP (di-ammonium phosphate) per Acre) 20.300 (38.444) 24.577 (79.919) 24.610 (34.001) Fertilizer (Kg MAP (mono-ammonium phosphate) per Acre) 1.566 (10.165) 0 (0) 0.308 (4.538) Fertilizer (Kg CAN (calcium ammonium nitrate) per Acre) 6.473 (24.727) 9.819 (75.081) 8.957 (21.702) Fertilizer (Kg NPK (nitrogen phosphorous potassium) per Acre) 4.256 (20.096) 2.365 (12.172) 1.217 (7.870) Total Fertilizer Expenditure (KShs per Acre) 1361.7 (2246.3) 1346.5 (3347.1) 1354.6 (1831.2) Land Preparation Costs (Kshs per Acre) 2168.1 (4895.7) 813.66 (1222.9) 808.99 (1080.2) Main Season Rainfall (mm) 620.83 (256.43) 599.10 (264.21) 728.11 (293.29) Notes: Standard deviations in parentheses. KShs represents Kenyan shillings (the current exchange rate is on the order of 75 shillings to the US dollar). Henceforth, any variables measured in Kenyan shillings are deflated. The differences in land preparation costs over time are due to a change in how the data was collected. Source: TAMPA Project data. Table 2b: Summary Statistics, by Hybrid/Non-Hybrid Use 1997 Sample 2004 Sample Hybrid Non-Hybrid Hybrid Non-Hybrid 791 411 726 476 Yield (Log Maize Harvest Per Acre) 6.296 (0.934) 5.158 (1.167) 6.751 (0.692) 5.738 (1.030) Total Maize Acres Cultivated 1.982 (3.557) 1.753 (2.428) 2.087 (3.029) 1.758 (2.042) Total Seed Planted (Kg per Acre) 9.669 (6.569) 9.394 (9.750) 8.746 (4.156) 9.569 (9.608) Fertilizer (Kg DAP per Acre) 28.755 (44.115) 4.028 (13.266) 37.148 (37.294) 5.488 (13.909) Fertilizer (Kg CAN per Acre) 9.087 (29.715) 1.442 (7.152) 12.708 (24.961) 3.235 (13.622) Land Preparation Costs (KShs/Acre) 2006.0 (1316.2) 2480.1 (8168.4) 918.44 (1095.2) 642.06 (1036.1) Expenditure on Fertilizer (KShs/Acre) 1922.3 (2542.9) 282.64 (740.53) 1893.3 (1964.7) 533.09 (1211.4) Inorganic Fertilizer Use (Dummy) 0.7421 (0.4378) 0.2311 (0.4221) 0.8994 (0.3009) 0.4055(0.4915) Main Season Rainfall (mm) 651.70 (228.82) 561.44 (293.88) 825.41 (215.20) 579.69 (332.05) Hired Labor (Hours) 109.98 (162.79) 95.118 (271.75) 486.15 (884.21) 225.71 (714.16) Family Labor (Hours) 255.10 (261.69) 353.73 (460.58) 146.66 (163.03) 115.64 (135.72) No. of Households Notes: Standard deviations in parentheses. The mean of yields is higher and the standard deviation of yields is lower in the hybrid sector. The dummy for inorganic fertilizer use just measures what fraction of households in that sector-year uses inorganic fertilizers. The trends in the land preparation costs and the labor variables over time are due to a change in how the data was collected. Source: TAMPA Project data. Table 2c: Transitions Across Hybrid/Non-Hybrid Sectors (Just for the Sample Periods 1997, 2000 and 2004) Transition in Terms of Technology Used (1997 2000 2004) Fraction of Sample (%) (N=1202 Households) NNN 20.38 NNH 2.83 HNH 6.07 NHH 4.91 HNN 5.99 HNH 3.16 HHN 7.15 HHH 49.50 Notes: This table shows all the possible three period transitions in my sample of farmers, and the fraction of my sample that experiences each of these transitions. The three periods correspond to 1997, 2000 and 2004. In the first column, the three letters represent the transition history with respect to technology, where “H” represents the use of hybrid and “N” represents the use of non-hybrid. For example, the transition “N N N” stands for farmers who used non-hybrid maize in all three periods – they make up about 20.4% of my sample. This switching behavior is not just measurement error. It is rather systematic. In addition, the survey instrument asks about hybrid use in multiple sections of the survey (it is a rather large part of household behavior so it is unlikely to be measurement error). Source: TAMPA Project data. Table 3: Basic OLS and Fixed Effects Specifications Dependent Variable is Yields (Log Maize Harvest Per Acre) OLS, Pooled OLS, Pooled OLS, Pooled FE FE 1.024 (0.033) 0.731 (0.034) 0.556 (0.035) 0.114 (0.052) 0.149 (0.049) Acres (x100) - - -0.385 (0.432) - -5.471 (0.839) Seed Kg per Acre (x10) - - 0.290 (0.020) - 0.272 (0.023) Land Preparation Costs per Acre (x10,000) - - 0.216 (0.048) - 0.151 (0.054) Main Season Rainfall (x1000) - - 0.688 (0.118) - 0.913 (0.112) 0.052 (0.006) - 0.031 (0.007) Hybrid Fertilizer per Acre (x1000) Year = 2000 0.199 (0.039) 0.205 (0.037) 0.272 (0.035) 0.216 (0.033) 0.289 (0.032) Year = 2004 0.498 (0.039) 0.482 (0.037) 0.403 (0.037) 0.449 (0.037) 0.377 (0.035) Constant 5.233 (0.035) 4.842 (0.072) 4.195 (0.085) 5.832 (0.042) -2.240 (4.931) No Yes Yes - - 0.228 0.317 0.424 0.071 0.197 Province Dummies R-Squared Notes: Standard errors are in parentheses. All regressions have 3600 observations (three periods). Covariates not reported include labor variables and average long term mean rainfall. Results are almost identical if the sample is restricted to two periods and actual family labor is included as a covariate (for this case the OLS return to hybrid is 0.540 and the FE return is 0.085). The OLS and household fixed effects specifications run are respectively: ′ ′ y it i h it X it it y it h it X it it Source: TAMPA Project data. Table 4: CRE Model Reduced Forms and Structural Estimates Dependent Variable is Yields (Log Maize Harvest Per Acre) Reduced Forms Without Covariates Yields, 1997 Yields, 2000 Yields, 2004 Reduced Forms With Covariates Yields, 1997 Yields, 2000 Yields, 2004 Hybrid, 1997 0.574 (0.075) 0.345 (0.084) 0.467 (0.067) 0.444 (0.066) 0.279 (0.080) 0.354 (0.062) Hybrid, 2000 0.323 (0.081) 0.424 (0.086) 0.230 (0.074) 0.214 (0.070) 0.426 (0.079) 0.171 (0.065) Hybrid, 2004 0.680 (0.079) 0.490 (0.081) 0.632 (0.069) 0.385 (0.071) 0.344 (0.080) 0.480 (0.066) OMD DWMD EWMD 0.078 (0.053) 0.110 (0.054) 0.121 (0.054) Standard CRE Model Structural Estimates λ1 λ2 λ3 Without Covariates: 0.440 (0.055) 0.293 (0.058) 0.572 (0.059) 0.435 (0.055) 0.281 (0.058) 0.563 (0.058) 0.422 (0.056) 0.286 (0.058) 0.560 (0.058) OMD DWMD EWMD 0.144 (0.049) 0.152 (0.050) 0.159 (0.050) With Endogenous Covariates: 0.317 (0.049) 0.219 (0.049) 0.316 (0.049) 0.211 (0.049) 0.306 (0.051) 0.217 (0.049) β 0.348 (0.054) 0.350 (0.053) 0.350 (0.054) χ2 (p-value) 79.33 (0.000) - 4.037 (0.043) - Notes: Standard errors are in parentheses. Reduced forms are estimated with and without covariates (acreage, real fertilizer expenditure, land preparation costs, seed, labor and rainfall variables). Covariates are treated as endogenous. Results are the same if only two periods with family labor are used. The reduced form for each t, the projection used to estimate the structural model by minimum distance and the structural model are respectively: ′ y it t 1t h i1 2t h i2 3t h i3 X i t it y it h it i u it i 0 1 h i1 2 h i2 v i Structural coefficients β and projection coefficients (λ’s) are reported. OMD, EWMD and DWMD are optimal (weighted by inverted reduced form variance-covariance matrix), equally weighted and diagonally weighted (only the diagonal elements from the OMD weight matrix) minimum distance respectively. The χ2 statistic on the overidentification test is the value of the OMD minimand. Source: TAMPA Project data. Table 5: Selection Returns by Hybrid History (Joiners, Leavers and Stayers) Dependent Variable is Yields (Log Maize Harvest Per Acre) Variable 1997-2000 1997 Yield 2000 Yield 2000-2004 2000 Yield 2004 Yield Without Covariates 1997-2004 1997 Yield 2004 Yield Hybrid Stayers 1.411 (0.070) 1.109 (0.070) 1.132 (0.067) 1.172 (0.057) 1.505 (0.066) 1.280 (0.056) Leavers 0.746 (0.112) 0.281 (0.110) 0.426 (0.095) 0.332 (0.081) 0.809 (0.094) 0.648 (0.079) Joiners 0.563 (0.105) 0.430 (0.104) 0.403 (0.127) 0.693 (0.108) 1.007 (0.114) 0.883 (0.096) With Covariates Hybrid Stayers 0.766 (0.073) 0.637 (0.075) 0.719 (0.073) 0.593 (0.061) 0.868 (0.074) 0.652 (0.063) Leavers 0.466 (0.097) 0.044 (0.100) 0.367 (0.086) 0.159 (0.067) 0.509 (0.085) 0.350 (0.069) Joiners 0.316 (0.089) 0.335 (0.094) 0.258 (0.115) 0.339 (0.092) 0.491 (0.101) 0.499 (0.084) Notes: Standard errors are in parentheses. In each year, the following regression is run both with and without the standard set of Xi covariates used previously: ′ y i 1 h i11 2 h i10 3 h i01 X i u i where hi11 is the dummy indicating that farmer i is a hybrid stayer (plants hybrid in both periods), hi10 indicates he is a leaver (plants hybrid the first year, not the second), hi01 indicates he is a joiner (plants hybrid the second year, not the first). The coefficients reported are µ1, µ 2 and µ 3. Source: TAMPA Project data. Table 6: Heterogeneity by Observables Returns in the Hybrid/Non-Hybrid Sector Dependent Variable is Yields (Log Maize Harvest Per Acre) Variable OLS, with Covariates FE, with Covariates Hybrid Non-Hybrid Hybrid Non-Hybrid Acreage (x 100) 0.784 (0.438) -5.427 (1.125) -4.353 (0.917) -8.360 (2.013) Total Seed Planted (Kg per Acre) (x 100) 3.418 (0.279) 2.176 (0.517) 2.854 (0.323) 2.397 (0.401) Land Preparation Costs per Acre (KShs) (x 10,000) 0.487 (0.129) 0.216 (0.031) 0.523 (0.170) 0.151 (0.068) Total Fertilizer Expenditure per Acre (KShs) (x 10,000) 0.430 (0.055) 1.855 (0.305) 0.245 (0.064) 1.176 (0.534) Main Season Rainfall (mm) (x 10,000) 3.879 (1.399) 12.286 (2.261) 9.297 (1.416) 12.706 (2.599) 0.466 (0.043) Average Return (β) when Returns Vary by Observables (evaluated at mean X’s) Number of Observations 2330 0.135 (0.053) 1276 2330 1276 Notes: Standard errors in parentheses. The following specification is estimated separately for the sample of farmers who use hybrid and non-hybrid maize: j j y it j X ti j it j ∈ H, N The covariates included that are not reported are labor variables, province dummies, year dummies and the average main seasonal rainfall. The results are similar if the sample is restricted to only 1997 and 2004 and family labor included as a covariate (average β for the OLS case is 0.473). Source: TAMPA Project data. Table 7: Two Period Basic Comparative Advantage CRC Model Reduced Form Estimates Dependent Variable is Yields (Log Maize Harvest Per Acre) Reduced Forms Without Covariates Reduced Forms With Covariates Yields, 1997 Yields, 2004 Yields, 1997 Yields, 2004 Hybrid, 1997 0.947 (0.113) 0.738 (0.092) 0.854 (0.104) 0.606 (0.087) Hybrid, 2004 1.011 (0.117) 0.832 (0.102) 0.753 (0.109) 0.723 (0.098) Hybrid 1997*Hybrid 2004 -0.450 (0.150) -0.352 (0.126) -0.383 (0.135) -0.356 (0.111) Acres (x10) - - 0.159 (0.060) 0.095 (0.056) Hired Labor per Acre (x 1,000) - - 0.019 (0.015) 0.011 (0.002) Family Labor per Acre (x 10,000) - - 0.108 (0.046) 0.556 (0.117) Seed Kg per Acre (x 10) - - 0.251 (0.052) 0.278 (0.092) Land Preparation Cost per Acre (x 1,000) - - 0.019 (0.004) 0.052 (0.018) Main Season Rainfall per Acre (x 1000) - - 0.834 (0.121) 0.335 (0.102) Fertilizer Expenditure per Acre (x 10,000) - - 0.868 (0.197) 0.666 (0.158) 0.297 0.289 0.419 0.433 R-Squared Notes: Standard errors are in parentheses. Reduced forms are estimated with and without covariates (all covariates reported). Reduced forms are for the case where all covariates are exogenous: only hybrid is correlated with the comparative advantage, θi. See Table 7b for projection and structural estimates. The reduced form equations run are as follows for 1997 and 2004 respectively: y i1 1 1 h i1 2 h i2 3 h i1 h i2 i1 Source: TAMPA Project data. y i2 2 4 h i1 5 h i2 6 h i1 h i2 i2 Table 8a: Two Period Basic Comparative Advantage CRC Model Structural Estimates Dependent Variable is Yields (Log Maize Harvest Per Acre) With Only Hybrid Endogenous EWMD Without Covariates DWMD OMD EWMD With Covariates DWMD OMD λ1 0.786 (0.087) 0.775 (0.088) 0.743 (0.091) 0.648 (0.082) 0.635 (0.083) 0.596 (0.089) λ2 0.963 (0.112) 0.956 (0.111) 1.030 (0.119) 0.711 (0.104) 0.702 (0.103) 0.772 (0.112) λ3 -5.417 (33.41) -4.367 (20.37) -1.622 (0.770) -1.098 (1.323) -1.037 (1.079) -0.930 (0.272) β 3.033 (18.33) 2.450 (11.11) 1.008 (0.295) 0.804 (0.337) 0.765 (0.224) 0.733 (0.062) φ -1.104 (0.766) -1.130 (0.754) -1.558 (0.562) -1.740 (2.299) -1.776 (2.219) -1.966 (0.926) χ21 (p-value) - - 71.62 (0.000) - - 63.75 (0.000) Notes: Standard errors are in parentheses. The reduced forms for these estimates are reported in Table 7 (for the case where only the hybrid decision is endogenous, i.e. correlated with the θi’s). The projection used in this model is: i 0 1 h i1 2 h i2 3 h i1 h i2 v i The minimum distance estimations are run for both cases with and without covariates (all the covariates are reported in Table 7 above). The structural coefficients reported are the average return to hybrid (β), the comparative advantage coefficient (ф), and all the projections coefficients (the λ’s). OMD, DWMD and EWMD are optimal weighted (the weight matrix is the inverted reduced form variance-covariance matrix), diagonally weighted (the weight matrix is the OMD weight matrix with the off diagonal elements set to zero) and equally weighted minimum distance respectively. The χ2 statistic on the overidentification test is the value of the OMD minimand. Source: TAMPA Project data. Table 8b: Two Period Comparative Advantage CRC Model Structural Estimates Dependent Variable is Yields (Log Maize Harvest Per Acre) With Both Fertilizer and Hybrid as Endogenous Projection: i 0 1 h i1 2 h i2 3 h i1 h i2 4 h i1 f i1 5 h i2 f i1 6 h i1 h i2 f i1 7 h i1 f i2 8 h i2 f i2 9 h i1 h i2 f i2 10 f i1 11 f i2 v i EWMD Without Covariates DWMD OMD EWMD With Covariates DWMD OMD β 0.003 (0.544) 0.609 (0.142) -0.701 (1.665) -0.053 (1.890) 0.045 (0.485) -0.717 (0.065) φ -0.468 (2.414) -2.102 (2.078) -0.824 (0.290) -0.783 (1.110) -0.547 (0.830) -1.806 (0.458) χ28 (p-value) - - 64.70 (0.000) - - 110.3 (0.000) With a Joint Fertilizer-Hybrid Decision Projection: EWMD i 0 1 t i1 2 t i2 3 t i1 t i2 v i Without Covariates DWMD OMD EWMD With Covariates DWMD OMD β 1.216 (5.632) 1.026 (2.735) 0.799 (0.059) 0.714 (1.072) 0.645 (0.697) 0.552 (0.052) φ -1.180 (1.509) -1.239 (1.429) -1.707 (0.770) -1.241 (0.889) -1.276 (0.867) -1.566 (0.527) χ21 (p-value) - - 55.73 (0.000) - - 50.41 (0.000) Notes: Standard errors are in parentheses. Reduced forms are not reported. Upper panel reports structural estimates of the average return to hybrid (β) and the comparative advantage coefficient (ф) for the model with endogenous hybrid and fertilizer use. Lower panel shows the results from a model where a farmer is in the technology sector if he is using both hybrid and fertilizer and not otherwise (above, tit represents the technology sector). The reduced forms and structural coefficients are as in Table 8a, with only the average return to hybrid (β) and the comparative advantage coefficient (ф) reported. Covariates include acreage, land preparation, seed, labor, and rainfall variables, and the total real expenditure on fertilizer. Source: TAMPA Project data. Table 9: Three Period CRC Model Structural OMD Estimates Dependent Variable is Yields (Log Maize Harvest Per Acre) i 0 1 h i1 2 h i2 3 h i3 4 h i1 h i2 5 h i1 h i3 6 h i2 h i3 7 h i1 h i2 h i3 v i Projection: Hybrid Decision Joint Hybrid-Fertilizer Sector Decision Without Covariates With Covariates Without Covariates With Covariates λ1 0.661 (0.076) 0.554 (0.072) 0.735 (0.084) 0.625 (0.078) λ2 0.510 (0.084) 0.407 (0.078) 0.540 (0.103) 0.511 (0.104) λ3 0.636 (0.104) 0.587 (0.108) 0.722 (0.098) 0.630 (0.095) λ4 -0.583 (0.108) -0.435 (0.116) -0.382 (0.165) -0.413 (0.149) λ5 -0.417 (0.136) -0.415 (0.134) -0.335 (0.139) -0.372 (0.142) λ6 -0.997 (0.169) -0.905 (0.178) -0.253 (0.152) -0.293 (0.145) λ7 0.340 (0.163) 0.323 (0.167) 0.194 (0.207) 0.325 (0.197) β 0.966 (0.050) 0.857 (0.063) 0.016 (0.056) 0.049 (0.059) φ -2.382 (0.483) -2.087 (0.437) -0.056 (0.155) -0.212 (0.199) Notes: Standard errors in parentheses. All specifications reported above in this table include the following set of covariates: acreage, real fertilizer expenditure, land preparation costs, seed, labor and rainfall variables (all treated as exogenous). In the case of the Hybrid Decision, the analysis is identical to that in Tables 7 and 8a (except for three periods, 1997, 2000 and 2004). In case of the Joint Hybrid-Fertilizer Decision, the analysis is identical to that in the lower panel of Table 8b (the technology sector choice is defined the same way). Only OMD estimates are reported (EWMD and DWMD estimates are not significantly different from zero). These results should be interpreted with caution (covariates do not include family labor). Source: TAMPA Project data. Table 10: Heckit and Treatment Effect Estimates under Non-Random Assignment (ATE, TT, MTE, LATE) Dependent Variable is Yields (Log Maize Harvest Per Acre) Year Heckman Two-Step Estimates Selection Correction λ Selection Correction λ Hybrid Sector Non-Hybrid Sector Implied Treatment Effects ATE TT MTE Slope 1997 -0.833 (0.170) 1.034 (0.724) 2.174 1.138 -1.867 (0.743) 2000 -0.847 (0.118) -0.615 (0.353) 0.768 0.898 -0.232 (0.372) 2004 -0.890 (0.183) -0.005 (0.156) 1.331 1.014 -0.885 (0.240) IV (LATE) Estimates (Conditional on Covariates) First Stage: Effect of Distance on Probability of Using Hybrid Second Stage: Effect of Predicted Hybrid on Yields (coefficient is reported x100) -0.525 (0.097) 1.452 (0.419) Notes: Standard errors are in parentheses. All regressions control for the same set of covariates as previously. The first two upper panel columns show (for each year) the two-step selection corrected estimates of coefficients on the Inverse Mills ratio for hybrid and non-hybrid sectors from: ′ ′ ′ ′ y Hi X i H H Z i /Z i h i Z i u si and ′ ′ ′ y Ni X i N N Z i / 1 − Z i The third column reports the average treatment effect accounting for the selection (it takes the predicted yields from the second step in each sector subtracts out the effect of the selection term and reports the difference in this predicted value across the two sectors). The treatment on the treated is reported next which looks at the difference in the second step predicted yields (including the selection terms) between the two sectors only for the sample of farmers who plant hybrid. Finally, the MTE slope is just the difference in the λ coefficients for the hybrid and non-hybrid selection terms (the difference between coefficients reported in the first two columns). ATE, TT, MTE are all evaluated at the mean Xi’s. The lower panel reports the IV estimates which use distance to closest fertilizer supplier (not where fertilizer is purchased) as an IV for the hybrid decision in the second stage: ′ ′ y it h it X i u it where h it 0 Dist X i IV results are roughly the same if excluded instruments are distance interacted with wealth quantiles, controlling for distance and wealth in both stages. Results are similar for the pooled data and for the period by period data on the IV. Source: TAMPA Project data. Table 11: Correlates of Estimated θi’s Dependent Variable is the Estimated θi’s (1) (2) (3) (4) (5) Distance to Closest Fertilizer Seller (x100) -0.544 (0.081) -0.521 (0.081) -0.511 (0.081) -0.521 (0.081) -0.497 (0.082) Distance to Motorable Road (x100) -0.737 (0.344) -0.742 (0.341) -0.726 (0.342) -0.741 (0.342) -0.724 (0.342) Distance to Tarmac Road (x100) -0.308 (0.069) -0.259 (0.069) -0.272(0.070) -0.259 (0.069) -0.293 (0.071) Distance to Matatu Stop (x100) -0.135 (0.207) -0.158 (0.206) -0.137 (0.206) -0.158 (0.206) -0.142 (0.207) Distance to Extension Services (x100) -0.399 (0.117) -0.358 (0.116) -0.347 (0.117) -0.358 (0.116) -0.341 (0.117) Tried to Get Credit (x10) - - 0.135 (0.104) - - Tried but Didn’t Receive Credit (x10) - - - 0.079 (0.199) - Received Credit (x10) - - - - 0.180 (0.109) Dummies for Household Head Education (P value on Joint Significance) No Yes (0.000) Yes (0.000) Yes (0.000) Yes (0.000) Province Dummies Yes Yes Yes Yes Yes Year Dummies Yes Yes Yes Yes Yes Notes: Standard errors are in parentheses. This table reports the correlations of the predicted θi’s with various observables in my dataset, namely distance from the household to the closest fertilizer seller, fraction of adult household members with no education, distance to the closest tarmac road, distance to the closest matatu (public transport) stop, distance to the closest motorable road, a dummy for whether the household tried to get credit, a dummy for whether the household received any credit, dummies for education of the household head, province and year dummies, depending on the specification. The predicted θi’s used are from the two period case with endogenous hybrid and fertilizer use (last column of the upper panel of Table 8b). Source: TAMPA Project data.
© Copyright 2026 Paperzz