Selection and Comparative Advantage in Technology Adoption

Selection and Comparative Advantage
in Technology Adoption
Tavneet Suri1
Abstract
I investigate an empirical puzzle in technology adoption: despite the potential of technologies like hybrid maize to increase returns dramatically, a signi…cant fraction of households do
not use them. As an explanation, I examine the role of comparative advantage and heterogeneity in returns to the technology in a generalized Roy model, where high average returns
are accompanied by low returns for marginal farmers. I derive a novel method that estimates
both the importance of comparative advantage and the associated distribution of returns to
the technology. Looking at this distribution, I …nd that farmers with the highest estimated
(gross) returns do not use hybrid, but this is explained by high unobservable costs (from
poor supply/infrastructure). Another group of farmers has lower returns and adopts, while
the marginal farmers have zero returns and switch in and out of use. Overall, adoption decisions appear to be rational and well explained by (observed and unobserved) variation in
heterogeneous net bene…ts to the technology.
Keywords: Technology, Heterogeneity, Comparative Advantage
1
Sloan School of Management, Massachusetts Institute of Technology. Email: [email protected]. I am
extremely grateful to Michael Boozer for hours of discussion on this project. I would also like to thank
Gustav Ranis, Paul Schultz and Christopher Udry, as well as Ken Chay, Robert Evenson, Penny Goldberg,
Koichi Hamada, Thom Jayne, Fabian Lange, Ashley Lester, Anandi Mani, Sharun Mukand, Ben Polak, Steve
Pischke, Roberto Rigobon, James Scott, T.N. Srinivasan, and seminar audiences at Berkeley, Haas School of
Business, Harvard, Hunter College, London School of Economics, MIT, NBER Productivity Lunch, NEUDC,
Oxford, Princeton, Sloan School of Management, Stanford, Stanford Graduate School of Business, University
of California at San Diego, University of Virginia and Yale for all their comments.
The data come from the Tegemeo Agricultural Monitoring and Policy Analysis (TAMPA) Project, between
Tegemeo Institute at Egerton University, Kenya and Michigan State University, funded by USAID. A special
thanks to Thom Jayne and Margaret Beaver at Michigan State University for help with the 1997-2002 rounds
of the data. I am sincerely grateful to the Tegemeo Institute and Thom Jayne for including me in the 2004
survey and especially to Tegemeo for their hospitality during the …eld work in Kenya. I would like to thank
James Nyoro, Director of Tegemeo, as well as Tegemeo Research Fellows Miltone Ayieko, Joshua Ariga,
Paul Gamba and Milu Muyanga, Senior Research Assistant Frances Karin, and Research Assistants Bridget
Ochieng, Mary Bundi, Raphael Gitau, Sam Mburu, Mercy Mutua and Daniel Kariuki and all the …eld teams
for their impressive work in all the rounds of the survey. An additional thanks to Margaret Beaver and Daniel
Kariuki for spending immense energy ensuring the data was squeaky clean.
I would like to acknowledge the …nancial support of the Yale Center for International and Area Studies
Dissertation Fellowship, the Lindsay Fellowship and the Agrarian Studies Program at Yale.
The rainfall data come from the Climate Prediction Center, part of the USAID/FEWS project - thanks to
Tim Love for his help with these data.
1
Introduction
Food security is currently at the forefront of the policy agenda across many Sub-Saharan
African economies. Agricultural yields (per acre) have actually fallen in the last decade across
many of these economies, despite the widespread availability of technologies that increase
yields. Governments and policy makers need to understand why yields of agricultural staples
are low across parts of Africa in order to …nd ways to enhance food production and incomes.
Table 1 shows the falling yields of staple crops over the last decade in Kenya, compared with
increasing yields in India and Mexico. These are worrisome trends: Kenya has not been able
to take the same advantage of improvements in agricultural technologies as have India and
Mexico during their Green Revolutions.
In this paper, I examine one aspect of this policy issue that poses an empirical puzzle.
Technologies that have extremely high average returns are available, yet households do not
use them. For example, …eld trials at experiment stations across Kenya show that hybrid
maize and fertilizers can increase yields of maize signi…cantly, increases ranging from 40%
to 100% (see Gerhart (1975), Kenya Agricultural Research Institute, KARI (1993), Karanja
(1996)). The Fertilizer Use Recommendations Project (FURP) conducted in Kenya in the
early 1990’s also shows high returns to these technologies (see Hassan et al (1998)). These
results are con…rmed for fertilizer in the small sample of Du‡o, Kremer and Robinson (2006).
However, despite such high average returns, aggregate adoption rates of hybrid maize and
fertilizer remain far below 100% across Kenya, with no increases in adoption over a long
period. This poses the question I am interested in: why are aggregate adoption rates low
and, more so, stagnating when returns to the technologies are high? Why are some farmers
not adopting, even though others in the same community adopt and seem to receive signi…cant
yield increases? Understanding how and why households decide what technologies to adopt
is crucial for understanding the trends in agricultural yields in Sub-Saharan Africa, where
risk and incomplete factor markets are potentially important.
The literature has put forth a number of explanations to answer these questions, ranging
from models of slow learning and informational barriers, credit constraints, taste preferences,
di¤erences in agroecological conditions and local costs and bene…ts, to the lack of e¤ective
commitment devices when farmers are time inconsistent (see Du‡o, Kremer and Robinson
(2006)). My approach to these questions is in stark contrast to the standard literature. I
examine whether the yield returns to adopting hybrid maize vary across a random sample of
maize farmers, such that high average returns to the technologies are accompanied by low
returns for the marginal farmer. I ask a simple question: is it the case that the returns to
these technologies vary across space such that the farmers who do not use them are simply
those that do not bene…t. The dominant approach to technology adoption is to nest it in a
learning environment. I deliberately take the opposite approach and assume the net bene…ts
1
are known to each household ex ante, but these bene…ts vary across space. I am therefore
interested in the spatial distribution of returns and how these returns correlate with both
adoption decisions as well as other observables.
Allowing for such heterogeneity in returns to play a role in the adoption decision implies
that the knowledge of average returns is not enough to understand adoption decisions. I use
the most general model of technology adoption that amounts to a generalized Roy model
where the expected bene…ts to the technology (that are potentially heterogeneous) drive
adoption decisions in each period. In modeling adoption this way, I allow for two forms of
household speci…c heterogeneity: absolute advantage (independent of the technology used)
and what I call comparative advantage. The main novel conceptual and econometric contribution of this paper is to analyze the role of comparative advantage in adoption decisions
and to empirically estimate its importance. This also enables me to estimate the distribution
of returns to the technology for my sample of farmers.
I use an extensive panel dataset representative of maize-growing Kenya, covering the
period from 1996 to 2004. I …rst document the adoption patterns of households over the
period 1996 to 2004, for both hybrid maize and fertilizer. Aggregate adoption is stable over
the period, even when I look across regions or asset/wealth quintiles. Strikingly, I observe
adoption rates varying dramatically across regions and asset quintiles, but remaining stable
over time. Even more surprisingly, at least 30% of my households switch into and out of the
use of hybrid seed from period to period. These trends are seen in other parts of Africa Dercon and Christiaensen (2005) …nd identical patterns for fertilizer use in Ethiopia. The
hypotheses that could generate such patterns in the data are rather limited. For example,
a model where households learn about a technology with positive returns would imply that
aggregate adoption rates increase over time.
Given these trends, I specify a generalized Roy model of adoption decisions that leads
me to empirically estimate a model of heterogeneous returns. In estimating this, I make
an econometric contribution. I generalize Chamberlain’s (1982, 1984) approach to the …xed
e¤ects model to accommodate not just heterogeneous intercepts across households (i.e. …xed
e¤ects), but also heterogeneous slopes (here, household speci…c returns to hybrid). Most
important of all, this generalization allows me to estimate the distribution of returns to the
technology for my sample of farmers. I compare these results to a variety of baseline models
(such as those with homogeneous returns and household …xed e¤ects), which I can reject.
I …nd strong evidence of heterogeneity in returns to hybrid maize. The estimated returns
control for input use, but do not account for unobservable costs and are therefore gross, rather
than net, returns. I show that the joint distribution of returns and adoption decisions displays
some extremely interesting features. In particular, there are three interesting groups of
farmers in my sample. I observe a small group of farmers with extremely high counterfactual
(gross) returns to hybrid seed, yet they choose not to adopt. This is rather striking and, at
2
…rst glance, seems to deepen the initial puzzle, at least for this group of farmers. However, the
lack of adoption for these farmers appears to stem from supply and infrastructure constraints,
such as distance to seed and fertilizer distributors. These farmers that have the highest
estimated (counterfactual) returns and do no adopt have higher unobservables costs to using
hybrid maize as they have poor access to input suppliers and distributors. Their overall net
returns may indeed be rather low. On the ‡ip side, I …nd a comparatively large group of
farmers for whom the (gross) returns are comparatively lower (though still reasonably high)
and they adopt hybrid in every period. Finally, a third group of farmers has essentially zero
returns and these farmers switch in and out of use of hybrid. These are the marginal farmers
who switch easily in and out of adoption when subject to year to year ‡uctuations.
The heterogeneity in returns to technologies has important implications for policy. For
one, encouraging 100% adoption of a technology that has a large average return is seen to be
a bad idea. In addition, knowing the distribution of returns to a technology allows for focused
policy interventions which can be cost e¤ective. For example, for farmers who would have
high returns but are constrained on the supply side, alleviating their constraints by targeted
distribution of inputs and infrastructure improvements would improve yields dramatically.
Similarly, the unconstrained farmers would bene…t greatly from the development of new
hybrid strains. This research illustrates the importance of the distributional consequences
of policy actions in developing countries and the importance of accounting for these. It also
sheds light on the ‘scaling up’ of policy (see Attanasio et al (2003)) in contexts of policy
evaluation2 .
The rest of this paper is structured as follows. In Section 2, I outline the relevant literature, focusing on empirical studies. Section 3 describes the data and the background on maize
cultivation in Kenya. Section 4 discusses a model of adoption decisions, focusing on the economics motivating my model of heterogeneous returns. In Section 5, I discuss baseline results
and provide some preliminary evidence for selection and heterogeneity in returns. Section 6
describes estimation of the heterogeneous returns model and the associated distribution of
returns. In Section 7, I cover robustness checks, in particular the Heckman two-step estimator and the treatment e¤ects under non-random assignment, i.e. the average treatment e¤ect
(ATE), the treatment on the treated (TT), the marginal treatment e¤ect (MTE) and the
local average treatment e¤ect (LATE) or IV estimator. Section 8 discusses the implications
of my results for households and policy makers in Kenya. Here, I also discuss alternative
models, such as learning, credit constraints and taste preferences. Section 9 concludes.
2
The questions posed here cannot be answered with experimental data without making speci…c assumptions. An experiment would randomize the use of a technology across farmers. I am interested in the choice
(or selection) process of how farmers decide what technologies to use, which is actually randomized out in an
experiment. In addition, with experimental data, one can test for the presence of heterogeneous returns, but
estimating the distribution of returns to the technology requires assumptions about the underlying selection
process (see Heckman, Smith and Clements (1997)).
3
2
Literature Review
I start with a brief summary of some of the empirical literature on technology adoption3 in
developing countries, with a focus on the studies that have been conducted in Sub-Saharan
Africa. I begin with Gerhart (1975) who tracks the adoption of hybrid maize in Western
Kenya in the early 1970’s. He highlights the fast di¤usion of hybrid maize upto the 1970’s and
identi…es the following constraints to the use of hybrid: risk (measured by the diversi…cation
in crops and o¤ farm income) education of the farmers, credit availability, extension services
and the use of other improved farming techniques (the use of fertilizer and manure).
I split the rest of this review into four main areas, …rst discussing studies that relate
to heterogeneity, then looking at those that focus on learning externalities, then credit constraints and ending with the more recent experimental research on Kenya, which is the most
relevant. The seminal empirical paper on technology adoption is Griliches (1957)4 who looks
at heterogeneity across local conditions in the adoption speeds of hybrid corn in Midwestern
US, emphasizing the role of economic factors like expected pro…ts and scale. He notes how
the speed of adoption across geographical space depends on the suppliers of the technology
and when the seed was adapted to local conditions.
Other researchers have focused on describing what forms of heterogeneity drive the decisions of households to adopt new technologies. This covers quite a range of papers: from
Schultz (1963) and Weir and Knight (2000), who emphasize the role of education, to various
CIMMYT (The International Wheat and Maize Improvement Center) studies5 that collect
data on what underlies adoption decisions across parts of Kenya. For example, Makokha et
al (2001) look at the determinants of fertilizer and manure use in Kiambu district. The main
(self-reported) constraints to use were high labor costs, high prices of inputs, unavailability
and untimely delivery. Ouma et al (2002) …nd that, in Embu district, gender, agroclimatic
zone, manure use, hiring of labor and extension services are signi…cant determinants of the
adoption of improved seed and fertilizer. Wekesa et al (2003) look at the adoption of the
Coast Composite, Pwani 1 and Pwani 4 hybrids and fertilizer in the Coastal Lowlands where
the non-availability and high cost of seed, unfavorable climatic conditions, perceptions of
su¢ cient soil fertility, and lack of money are cited as reasons for low use.
Much of the literature on technology adoption has focused on the learning externality,
described by Besley and Case (1993). Foster and Rosenzweig (1995) look at the adoption of
high yielding varieties in post Green Revolution India, allowing for learning by doing, learn3
The literature on technology adoption is too vast to review here, excellent reviews can be found in Sunding
and Zilberman (2001), Sanders, Shapiro and Ramaswamy (1996), Rogers (1995), Feder, Just and Zilberman
(1985) and Hall (2004). For the theoretical literature, see Besley and Case (1993), Banerjee (1992) and Just
and Zilberman (1983). And, for studies of land management practices, see Mugo et al (2000), for agricultural
extension see Evenson and Mwabu (1998) and for property rights see Place and Swallow (2000).
4
Also see David (2003).
5
See Doss (2003) and De Groote et al (2002) for a review of all the CIMMYT micro surveys in Kenya.
4
ing from others, and costly experimentation. They …nd that farmers with more experienced
neighbors are more pro…table than those without. Munshi (2003) looks at the same question
and …nds that the impacts are heterogeneous: wheat growers respond strongly to their neighbors’ experiences while rice farmers experiment. He notes that rice high yielding varieties,
unlike those for wheat, are sensitive to hard to observe soil characteristics and managerial
inputs. Conley and Udry (2003) study the adoption of fertilizer in the small-scale pineapple
industry in Ghana. They collect information on farmers’sources of information and …nd evidence of learning, not only from own experiences, but also within information neighborhoods.
Bandiera and Rasul (2003) look at decisions to plant sun‡ower in the Zambezia province of
Mozambique. They …nd that adoption decisions are correlated within networks of family and
friends and that this e¤ect is stronger for disadvantaged farmers. Moser and Barrett (2003)
analyze decisions to adopt, expand and disadopt a rice production method in Madagascar
and …nd seasonal liquidity constraints and learning e¤ects from extension agents and other
households to be important.
A hypothesis that is often raised in the literature is that credit constraints explain the low
rates of adoption. For example, Croppenstedt, Demeke and Meschi (2003) use self-reported
information on why farmers did not purchase fertilizer to estimate a double hurdle fertilizer
adoption model for Ethiopia. They …nd that credit is a major supply side constraint. Some
of the CIMMYT studies also cover self-reported credit constraints; for example, Salasya et
al (1998) look at the role of credit in adoption decisions in Western Kenya. More recently,
Dercon and Christiaensen (2005) show that the possibility of a poor harvest (and hence very
low consumption for that period) can account for the low use of fertilizer in Ethiopia.
The …nal strand of literature on Kenya I describe is experimental. There are several
impact assessment studies6 and …eld trials at experiment stations which show large increases
in yields from using hybrid maize and fertilizer. Other than the work done by the Kenya
Agricultural Research Institute (KARI), one of the early experimental studies was the Fertilizer Use Recommendations Project (FURP), conducted across 70 sites in the early 1990’s in
conjunction with the Kenya Maize Database Project (MDBP). FURP recorded yields about
half of those on experimental stations (KARI (1993)). The focus was to understand optimal
rates of fertilizer use in comparison to survey data on actual use from the MDBP (see Corbett (1998)). Hassan et al (1998) use these data and …nd that adoption of hybrid seed and
varietal turnover rates are higher (and di¤usion faster) in high potential areas. They blame
poor extension services, bad infrastructure and poor seed distribution for low adoption in
marginal areas. Hassan, Murithi and Kamau (1998) …nd that farmers in KMDP apply less
6
Studies of the welfare impacts of improved technology use include Karanja, Renkow and Crawford (2003)
who …nd that adoption of technologies in high potential areas (relative to those in marginal areas) have large
aggregate gains at the expense of poor distributional e¤ects, and Sserunkuuma (2002) who estimates large
consumer gains and large producer losses of a shift in supply due to the adoption of hybrid maize in Uganda.
5
fertilizer than optimal, leading to an estimated 30% gap between current and optimal yields.
There has been some more recent experimental work. For example, De Groote et al (2003)
look at an ex-ante impact assessment of the Insect Resistant Maize for Africa (IRMA) project
that develops GM maize varieties that are more resistant to stem borers. They estimate the
surplus from a shift in supply due to the decreased crop loss (measured experimentally).
Estimated gross crop losses amounted to 13.5% with an estimated value of $80 million7 .
Closest to this paper and most recently is the work of Du‡o, Kremer and Robinson (2006)
who run experiments to understand the low use of fertilizer in Western Kenya. They measure
the returns from fertilizer use experimentally and …nd that the average net rate of return for
investing in top-dressing fertilizer is between 100% and 162% and that the combined package
of fertilizer and hybrid in this area is not pro…table. In addition, they analyze a variety of
learning mechanisms, such as learning by doing (via starter kits) and learning from others
(from agricultural contacts as well as neighbors). They …nd small learning e¤ects, if at all
and some of these e¤ects die out over time rather quickly. The most signi…cant contribution
they make is to understand whether farmers may be time inconsistent in this environment.
They implement the Savings and Fertilizer Initiative (SAFI) as a commitment device for
farmers by o¤ering them fertilizer at harvest time (as opposed to later during planting when
it is used). They …nd that more farmers take up this program when it is o¤ered at harvest
time vs. later. They show that o¤ering SAFI at harvest time increases adoption by 17% and
that this e¤ect is about equivalent to a 50% subsidy to the fertilizer price.
Overall, the literature on adoption has had little to say about the consistently low rates of
adoption seen across a number of African countries. Other than correlations with observables,
there is still a rather large part of adoption behavior that remains unexplained.
3
Survey Data and Maize Cultivation in Kenya
I now go on to describe some of the relevant institutional detail and the data that help me
motivate the model and approach I use. Maize is the main staple in Kenya8 , with about
90% of the country’s population depending on maize for income generation (see Nyameino,
Kagira and Njukia (2003)). It accounts for approximately 3.7 million acres of cropped land.
Maize policy in Kenya has an interesting history, Smale and Jayne (2003) provide an excellent
review. Both hybrid seed and fertilizer have been available since the 1960’s with more than
twenty modern varieties of seed released since 1955 (Ouma et al. (2002)). Hybrid maize holds
its yield properties for one season so has to be repurchased every period. The commonly held
7
See http://www.cimmyt.org/ABC/InvestIn-InsectResist/htm/InvestIn-InsectResist.htm and Smale and
De Groote (2003) for more information on the CIMMYT IRMA and GM projects in Kenya.
8
McCann (2005) describes the fascinating history of maize in Africa pointing out how maize not only gives
more food per unit of land and labor, but also has the largest set of alternative uses compared to other grains.
6
wisdom is that later releases in hybrid technology in Kenya have not shown big improvements
in yields9 though government recommendations for the types and quantities of hybrid seed
and fertilizer to be applied vary by region10 . The agriculture is rain fed, with large variations
in rainfall that make input use risky (see Hassan, Onyango and Rutto (1998)).
From 1965 to 1980, hybrid variety 611 di¤used in Western Kenya “at rates as fast as or
faster than among farmers in the US corn belt during the 1930’s-1940’s” (Gerhart (1975)).
Smale and Jayne (2003) and Karanja (1996) attribute this to good germplasm, e¤ective
research, good distribution and coordinated marketing of inputs and outputs. This changed
in the 1990’s as the earlier policies (large subsidies, price supports, pan-territorial seed/output
pricing, restrictions on cross district trade) resulted in large …scal de…cits11 . Reform of the
cereal sector began in 1988, followed by liberalization in 199412 , though the government
retained some control policies. The most important of these was pan territorial seed pricing,
where seed was predominantly distributed by Kenya Seed Company at a …xed price across
the country each year. This pan territorial pricing created poor incentives for suppliers to
locate in distant areas or far from markets. The lack of a land market also deserves a mention
- there are very few land sales, though there is a small land rental market.
I use data from the Tegemeo Agricultural Monitoring and Policy Analysis (TAMPA)
Project between Tegemeo Institute at Egerton University, Kenya and Michigan State University. This is a household level panel survey, representative of rural maize-growing areas.
Figure 1 shows a map of the survey villages (marked by crosses) and the population density
across the country (darker shades represent greater density).
I have data for 1997, 1998, 2000, 2002 and 2004. The 1997, 2000 and 2004 surveys are
similar, containing detailed agricultural input and output data (plus retrospective data for
1995-1996), consumption, income, demographics, infrastructure, location, and some basic
credit information. The panel sample covers around 1300 households, with an additional 800
in the 2004 sample. The 1998 survey is similar, but covers only a sub-sample of about 612
households. In addition, in 1998, Kenya was strongly a¤ected by El Nino so this sample is
very di¤erent. The 2002 survey was a very short survey, collecting no output data but with
detailed data on hybrid use. I restrict myself mostly to the 1997, 2000 and 2004 data. I was
involved in designing and collecting the 2004 survey.
9
Karanja (1996) states that “newly released varieties in 1989 had smaller yield advantages over their
predecessors than the previously released ones... research yields were exhibiting a “plateau e¤ect””. Examples
he gives are KSII, which was followed in time by H611 (with a 40% yield advantage), then H622 (16%) and
then H611C (12%). H626 which had a 1% yield advantage over H625 was released eight years later.
10
See Ouma et al (2002), Hassan (1998), Salasya et al (1998), Wekesa et al (2003), Kamau (2002) and
Karanja (1996).
11
Kenya National Cereals and Produce Board, the marketing board supporting these policies, managed to
accrue losses on the order of 5% of the country’s GDP in the 1980’s (Smale and Jayne (2003)).
12
For studies of the impacts of the cereal sector reforms, see Jayne et al (2001). Karanja, Jayne and
Strasberg (1998), Jayne, Myers and Nyoro (2005), Nyoro, Kiiru and Jayne (1999) and Wanzala et al (2001).
7
To motivate my research questions, I outline some trends in my data over the period
1996-2004 on the use of hybrid maize and fertilizer. Figure 2 shows the trends in adoption
of hybrid maize by province over this period, illustrating the stability in aggregate adoption
over time and the persistence of cross-sectional di¤erences13 . If I look across wealth/asset or
acreage quintiles, the pattern is identical. There is no systematic temporal variation in the
adoption of hybrid, presenting a prima facie case against learning playing a role in adoption
decisions. If there was learning, we would expect aggregate adoption to increase over time.
The use of inorganic fertilizer during the main season follows similar trends. Figure 3a
shows the trends across provinces in the fraction of households that use inorganic fertilizer
for maize production and Figure 3b shows the total value (in constant Kenyan shillings) of
inorganic fertilizer used, by province. There is a lot more variation over time in the total value
of inorganic fertilizer used, but both …gures illustrate the familiar persistent cross-sectional
di¤erences. Finally, for an idea of the more general trends in my data, Figures 4 and 5 show
main season yields of maize and the acreage planted to maize respectively over the same
period. Yields are not stable over time, with the sharp drop in yields around 1997/1998 the
result of El Nino. However, there are no dynamics in any of the inputs or technologies.
Table 2a shows summary statistics for my sample of households for 1997, 2000 and 2004.
Of the 26 di¤erent types of fertilizer reported as being used, Table 2a shows the four most
popular (di-ammonium phosphate (DAP), calcium ammonium nitrate (CAN), nitrogen phosphorus potassium (NPK) and mono-ammonium phosphate (MAP)). Note the absence of the
use of MAP in 2000, illustrating the unavailability of this fertilizer during this year. Table
2b breaks out some of these variables by hybrid/non-hybrid use for 1997 and 2004. In principle, hybrid use could be a continuous variable as farmers plant quantities of hybrid seed.
However, in my sample very few farmers plant both hybrid and non-hybrid in a given season.
For example, in 1997 only 2% of households planted both14 . I take hybrid use throughout
to be binary (it simpli…es the empirical problem dramatically). From Table 2b, we can see
that yields are lower across the board in the non-hybrid sector (the p-values on the t-tests
are 0.000 in each period). However, a lot of the other variables look quite similar in means
(p-values on the t-tests are often above 5%), except for fertilizer use and main season rainfall.
Finally, Table 2c shows the transitions of hybrid use in my data, illustrating the fact that
30% of my households switch in and out of use over the three years of data I use. Over
more periods, this fraction is a bit higher: 33% if 1996 is included and about 39% if 2002
is included. In the data, the households that always plant hybrid have the highest average
13
The Coast province looks rather di¤erent in 2004. The empirical work on the comparative advantage
model reports results for cases without the Coast province. If I include these households (only 43 in number),
the results are qualitatively the same.
14
This fact alone already signals the possibility of credit constraints not being a driving force behind the
hybrid/non-hybrid decision. If credit constraints were very important, households would prefer to mix hybrid
and non-hybrid maize given there are no di¤erential …xed costs in planting either variety.
8
yields, followed by households that switch in and out of use. Those that never use hybrid have
the lowest average yields. Given all these trends, I go on to look at a model of how farmers
may be making adoption decisions, taking seriously the temporal stability in adoption.
4
Modeling Adoption Decisions
Since the variation in adoption decisions is largely cross-sectional or spatial (and not over
time), I consider a model of adoption decisions that allows for both absolute and comparative
advantage in the selection process determining who uses hybrid technology. For simplicity,
I refer to hybrid seed as the relevant technology, but the empirical model will allow for
generalizations to this, especially as regards fertilizer.
In keeping with the empirical realities, I consider two technology options available to
farmers: to plant either hybrid seed or traditional (non-hybrid) seed. Say the production
functions at any point in time for a farmer are Cobb-Douglas:
YitH
YitN
= e
= e
H
t
N
t
0
@
0
@
k
Y
j=1
k
Y
j=1
H
j
1
N
j
1
H
Xijt A euit
N
Xijt A euit
(1)
(2)
where YitH and YitN are the yields (output per acre) at time t when farmer i uses hybrid or
non-hybrid respectively. Throughout, H is used to represent the use of hybrid maize and
N the use of non-hybrid. The Xijt ’s represent other inputs (where j indexes the input),
such as fertilizer, labor, rainfall, etc. The production functions for hybrid and non-hybrid
maize have di¤erent parameters on the inputs, i.e. the Xijt ’s have di¤erent coe¢ cients in
the production functions (indicated by
H
N)
and
to allow for complementarity between
the maize variety used and the inputs. However, the same set of potential inputs are used to
N
grow both hybrid and non-hybrid maize. Finally, uH
it and uit are sector-speci…c errors that
may be the composite of time-invariant household characteristics and time-varying shocks to
N
production. I will consider some speci…c decompositions of uH
it and uit below.
Taking logs of equations (1) and (2) above,
H
yit
N
yit
=
=
H
t
N
t
0
H
+ uH
it
(3)
0
N
+ uN
it
(4)
+ Xit
+ Xit
H and y H are the logs of yields, the X ’s have been rede…ned to represent the logs
where yit
it
it
9
of the inputs. The gain in yields from planting hybrid maize is therefore
H
Bit = yit
N
yit
=
H
t
N
t
0
H
+ Xit (
N
) + uH
it
uN
it
(5)
Using this framework, we can start to think about what drives adoption decisions. The
simplest decision to adopt would be based on a comparison of the yields under hybrid and
H > y N and h = 0 if y H
non-hybrid maize, such that hit = 1 if yit
it
it
it
N , where h = 1
yit
it
represents the use of hybrid and hit = 0 the use of non-hybrid. This is the basic assumption
of the Roy (1951) selection model15 in terms of yields. So, the strict Roy model implies that
H
yit
N
yit
> 1 for hit = 1 and
H
yit
N
yit
1 for hit = 0
(6)
So, for any two individuals i and j using hybrid and non-hybrid respectively,
H
yit
>
N
yit
hit =1
H
yjt
N
yjt
!
(7)
hjt =0
Equation (7) describes sorting based on comparative advantage16 . The yield maximization
rule given by the Roy model in equation (6) imposes sorting via comparative advantage. The
strict Roy model thus implies that
H
hit (yit ) = 1 yit
N
yit
h
0 =1 (
H
t
N
t )
0
+ Xit
H
N
+ uH
it
uN
it
0
i
(8)
where 1[:] represents an indicator function of whether the expression in brackets holds true.
This selection model can be generalized whereby yield maximization is replaced by a more
general selection rule. This allows for both observed and unobserved costs as well as other
factors like tastes17 to drive adoption.
In my case, two properties of hybrid maize are important: not only does it increase mean
yields, but it also reduces the variance in yields (due to better resistance to pests and drought).
These characteristics are illustrated in Figures 6a, 6b and 6c, the conditional distributions
of yields across hybrid and non-hybrid farmers for each year. The mean of the hybrid yield
distribution lies to the right of that of non-hybrid yields. The hybrid yield distribution also has
a lower variance than the non-hybrid distribution (the p-values on the null of equal standard
15
The Roy model relies on comparisons of wages, but the framework it provides can be applied to productivities (suggested by Mandelbrot (1962)). Also see Heckman and Honore (1990) and Borjas (1987).
16
The de…nitions and implications of absolute and comparative advantage as sorting mechanisms have been
studied well (see Willis (1986), Green (1991), Sattinger (1993) and Carneiro and Lee (2004)).
17
One hypothesis for the lack of adoption and the persistence of non-hybrid maize is di¤ering tastes of
hybrid and non-hybrid maize, as in Latin America. This is unlikely in Kenya where the di¤erent varieties have
a uniform price and no distinction in the market. Section 8 provides evidence on this.
10
deviations are 0.000 in each period)18 . These …gures highlight an important property of
hybrid maize: it eliminates the lower tail of the distribution of outcomes under non-hybrid
maize. This is the motivation for thinking of a model of heterogeneous returns. The initial
evidence here indicates that the farmers who would bene…t most from hybrid would be those
at the lowest end of the non-hybrid yield distribution - bad outcomes would be muted under
hybrid. So, it is likely that there are di¤erential returns to hybrid across a sample of farmers.
Going back to the Roy model framework, it allows for a selection rule that is far more
H ; Z H ) > f N (y N ; Z N )
general than that in equation (8). So, for example, hit = 1 if f H (yit
it
it
it
where Zitj (j 2 H; N ) represents costs and any other factors that a¤ect adoption. The simplest
version of this bene…t function would of course be a pro…t maximization rule. When output
prices are the same for hybrid and non-hybrid maize, yields and pro…ts di¤er only by (real)
input costs. So, throughout, I estimate yield functions, controlling for complete input use19 .
The advantage of estimating yield functions is that it allows me to estimate the e¤ects of
other inputs on yields. Furthermore, it allows me to generalize the model of heterogeneous
returns to account for joint decisions over hybrid and fertilizer use, which is important.
Examples of other variables that may be in the set of Zit ’s above that are not in the
yield equation and that are not pure tastes include measures of the availability of seed, such
as the distance from a household to the seed/fertilizer distributor. Such measures of input
availability clearly a¤ect the costs of planting hybrid maize (over and above the observed costs
in the Xit ’s) and can therefore a¤ect the adoption decision from period to period. In the case
of this generalized Roy model, any pattern of sorting is possible and the joint distribution
f (y H ; y N ) is required to understand the patterns of sorting and comparative advantage (the
standard program evaluation problem)20 .
I can write the selection rule for a generalized Roy model for yields in terms of a binary
choice based on the latent variable, hit , as
hit (yit ) = 1 [hit
h 0
0] = 1 Zit + usit
i
0
(9)
where hit is the binary adoption decision with respect to hybrid maize and hit is the latent
18
The empirical framework I consider below allows some of these aspects of summary statistics to be due
to Roy model selection. The contribution of the empirical work is to separate selection from causal e¤ects.
19
The yield equation that I specify is driven by a number of factors. With complete markets, I would
estimate pro…t functions, but here markets are incomplete. I need shadow values on non-priced inputs,
mostly family labor. The data does not allow this and the data on inputs is not always at the crop level.
Calculating maize pro…ts and estimating a pro…t function directly is therefore not straight forward. Instead, I
use speci…cations that estimate yield functions controlling for input use throughout, which is close to estimating
pro…t functions. Maize prices are captured by province and year dummies included in the speci…cation of my
yield function. The log yield regression I estimate thus captures the bene…ts of hybrid on percentage yields
and revenues, holding constant my input measures. This is like the gross pro…t function of Olley and Pakes
(1996).
20
The generalized Roy model still imposes the restriction that the selection rule can be expressed as a single
index function/supply side decision. I consider the appropriateness of this restriction later.
11
net bene…t of planting hybrid maize. usit is the selection error (the unobservable heterogeneity
in the adoption selection equation). The higher usit in equation (9), the more (unobservably)
likely the farmer is to plant hybrid maize21 . Using a generalized yield function of the form
H
yit = hit yit
+ (1
N
hit )yit
(10)
I can substitute in equations (3) and (4) to get
yit =
N
t
+(
H
t
N
t )hit
0
+ Xit
N
0
+ Xit (
H
N
H
)hit + uN
it + (uit
uN
it )hit
(11)
Equation (11) illustrates the standard selection problem. If farmers know all or some part
N
of the errors uH
it and uit and act on what they know, then the decision to adopt hybrid, hit ,
H
will generally be correlated with these errors (and hence with the composite error uN
it + (uit
uN
it )hit in equation (11)). The OLS estimate of
H
t
N
t
(and of the average treatment e¤ect)
is therefore biased, with the bias composed of both a selection bias and a sorting bias22 .
For example, it may be the case that the farmers who plant hybrid may just have higher
soil quality and/or better farm management (all of which are unobservable). Heterogeneity
of this …xed unobservables form has been emphasized in this literature, in particular with
respect to farm management (see Mundlak (1961)), productivity of farmers, and soil quality
(see Conley and Udry (2003) and Tittonell (2003)). In addition, there may be heterogeneity
in returns to hybrid maize such that the farmers with the high returns to hybrid maize are
the ones who plant it. The average returns to hybrid maize computed by comparing farmers
who plant hybrid to those who do not are therefore overstated. My aim is to present a model
that can account for both these ideas, which I formalize below.
To solve the selection problem, I assume that what the farmers act on with respect to
the errors in equation (11) is time invariant (though technology speci…c). This implies the
N
following decomposition or factor structure23 on the errors, uH
it and uit :
uH
it
=
H
i
+
H
it
(12)
uN
it
=
N
i
+
N
it
(13)
21
Compare this to the case of the strict Roy model (without costs) where the adoption decision is based
H
N
on di¤erences in yields so that hit is just yit
yit
and usit is uH
uN
it
it as in equation (8).
22
Heckman and Li (2003) show that for a single cross section, the OLS estimate of the mean return to
hybrid in a model with heterogeneity is biased for the ATE and that
plim(b
23
OLS
)
=
=
T T + Selection Bias = AT E + Sorting Bias + Selection Bias
h
i h
i
N
(X) + E((uH
uN
E(uN
i
i )jhi = 1) + E(ui jhi = 1)
i jhi = 0)
This assumption is similar to the factor assumptions made in Carneiro, Hansen and Heckman (2003).
12
H
i
Farmers are assumed to know their
N
i ,
and
but not the
H
it
above highlights the key identi…cation assumption I make: the
and
H
it
N
it .
and
The decomposition
N
it
are assumed to
be uncorrelated with each other, as well as with the Xit ’s and Zit ’s, unlike the
H
it
and
and
N
i
This amounts to the transitory errors
H
i
decision to use hybrid, though the
N
it
H
i
N
i .
and
not being allowed to a¤ect the farmer’s
will. The implication is that the
H
it
and
N
it
components are known to the farmer only after the farmer has made his planting decision.
The timing of the production of maize is crucial as well as the variables I observe in
my dataset. This identi…cation assumption amounts to shocks to yields (production) being
realized after the technology decision is made. Most farmers plant right before the onset of
rains, with a reasonably long lag between planting and harvest. The seed type (hybrid or nonj
it
hybrid) is therefore …xed before the
(j 2 H; N ) unobservable shocks to yields are realized
(rainfall is in my data and so is not a component of these shocks). Even though the extent of
the rainfall is unknown at the time of planting, the fact that the maize is rain fed and that
rainfall I observe is important. Since inputs such as labor and fertilizer may be correlated
with the rainfall shock, it is of key importance that I observe rainfall. Therefore, conditional
on the covariates a¤ecting yields I have in my data and the timing of the production process,
the assumptions underlying the
H
it
and
N
it
seem to be reasonable.
An additional implication of this identi…cation assumption deals with what drives households to switch between the two technologies. This model assumes that households switch
due to factors that do not a¤ect yields directly. Here, it would be factors such as supply constraints24 . If hybrid seed and/or fertilizer is not available easily right at the time of planting,
this would manifest itself in a supply constraint or a higher cost of using the technology for
the farmer who would then switch to the traditional technology. There is a lot of qualitative
evidence for my sample of farmers that this happens reasonably often. Farmers get their
seed and fertilizer (this is mostly a joint decision) from small local distributors, who get their
supplies from slightly larger distributors, and so on. In the qualitative component of the
survey, farmers who use traditional varieties were asked why they do not use improved seed
varieties and 14% blamed unavailability of seed as one of top two reasons for this (65% put
high costs as one of the top two reasons). In addition, Table 2a showed that in 2000 one
of the common fertilizers (MAP) was not used at all due to an unavailability. And, …nally,
several of the CIMMYT and FURP studies mentioned earlier cited unavailability of inputs.
Substituting the factor structure described above into the yield equations in (3) and (4),
H
yit
N
yit
=
H
t
=
N
t
+
H
i
+
N
i
+ Xit
H
+ Xit
N
+
H
it
(14)
+
N
it
(15)
24
The framework in Dercon and Christiaensen (2005), for example, that shows adoption to be related to
households’ability to smooth downside consumption risks would support the assumptions here.
13
Note that these equations cannot be estimated by a household …xed e¤ects model. The
H
i
…xed e¤ects model is not general enough, the
and the
N
i
cannot be di¤erenced away
as a signi…cant number of households switch between technologies from one period to the
next. Following Heckman and Honore (1991), Lemieux (1998) and others, I use the linear
H
i
projections of the
N
i
and the
H
i
on (
H
i
N
i
N
i )
as follows
= bH (
H
i
N
i )
+
i
(16)
= bN (
H
i
N
i )
+
i
(17)
2
H
H N
;
i
i ),
2
2
2 HN ), bN = ( HN
HN )=( H + N
H
2
25
V ar( i ), and 2N V ar( N
i ) .
H
where the projection coe¢ cients are bH = (
2 )=( 2
N
H
2
N
+
I de…ne the
2
i
HN )
and
cov(
HN
to be farmer i’s absolute advantage as its e¤ect on yields does not vary
H
i
by the choice of hybrid/non-hybrid. More importantly, I de…ne
N
i
to be the gain
from growing hybrid, which is orthogonal to the absolute advantage by construction. I then
re-de…ne this gain to be my comparative advantage parameter,
bN (
i
H
i
i,
as follows:
N
i )
(18)
This is just a rede…nition of the sector speci…c unobservables
H
i
and
N
i .
The new
i
serves as
a measure of comparative advantage in my model. It measures how much more productive a
farmer is in the production of hybrid over non-hybrid maize, hence intuitively it is a measure
of comparative advantage. It is only possible to identify a relative relationship between the
H
i
and
N
i
unobservables (also see Carneiro, Hansen and Heckman (2003)).
Allowing
bH
bN
1, equations (16) and (17) become
H
i
= ( + 1)
N
i
=
i
+
My de…nition of the comparative advantage gain
i
+
(19)
i
(20)
i
i
is di¤erent from the concepts of sorting
based on comparative advantage described earlier. If individuals in the economy now sort on
the basis of comparative advantage, then it actually implies that the coe¢ cient
is negative.
Substituting these back into equations (14) and (15),
25
that
The
H
i
i ’s
N
i
H
yit
=
H
t
+
i
+ ( + 1)
N
yit
=
N
t
+
i
+
i
+ Xit
i
+ Xit
N
+
N
it
H
+
H
it
(21)
(22)
in equations (16) and (17) are the same. Subtracting equation (17) from equation (16) implies
N
= (bH bN )( H
bN must be equal to 1,
i
i ). For the i ’s to be equal across sectors, bH
which is easily shown: bH
bN =
2
H
HN
2 + 2
H
N
2
HN + N
2 HN
= 1:
14
The expected gain for a farmer from using hybrid is now a function of both observed and
unobserved household characteristics:
H
Bit = yit
H
yit
=(
H
t
N
t )
+
i
H
+ Xit0 (
N
)
(23)
The logarithmic production functions for the case of heterogeneous returns in equations
(21) and (22) map back into a generalized version!of the Cobb-Douglas production functions I
k
H
Q
+1
H
H
started with so that YitH = e i e i e t
Xijtj e it and similarly for YiN . The motivation
j=1
for this structure of production functions comes from the fact that we would expect individuals
who are in the lower tail of the non-hybrid distribution to bene…t the most from the use of
hybrid maize. The production functions illustrate these properties directly: when
farmers with low
i
will have high rewards from hybrid maize (and low rewards if
while farmers with high
i
will have lower rewards from hybrid maize (vice versa if
< 0,
> 0),
> 0).
I substitute equations (19) through (22) into the generalized yield equation (10) to get
yit =
where
hit ,
i
i,
i
+
N
t
i
+
i
+(
H
t
N
t )hit
N
+ Xit0
+
i hit
+ Xit0 (
H
N
)hit + "it
(24)
and "it is the unanticipated component of yields. Since, the coe¢ cient on
now also depends on the unobserved household-speci…c e¤ect,
i,
this is a random
coe¢ cient model. In fact, it is a correlated random coe¢ cient (CRC) model as the random
coe¢ cients ( i ) can be correlated with the adoption decision (hit ).
I am interested in two components of this model: the coe¢ cient
that tells us how impor-
tant di¤erences in comparative advantage are in determining yields (and adoption decisions)
and, second, the distribution of comparative advantage,
i,
in my sample. I derive in the
empirical work below how this CRC model is a generalization of Chamberlain’s (1984) correlated random e¤ects (CRE) model. Chamberlain allows for household-speci…c intercepts
to be correlated with hit . The CRC model I develop below allows both the household speci…c intercepts as well as household speci…c slopes/returns to be correlated with hit . As I
discuss in detail, this model can be estimated via methods similar to those introduced by
Chamberlain (1984). In addition, this generalization allows me to estimate the distribution
of comparative advantage and hence the distribution of returns to the technology.
How are the estimates of
and
i
interpreted? From equation (24), think of
i
as a
household speci…c intercept, i.e. a household speci…c average yield. Similarly, think of
as the household speci…c slope or return to hybrid. Since
i
and
i
construction, the covariance between household speci…c intercepts ( i ) and slopes (
cov( i ;
i)
15
=
2
i
are uncorrelated by
i)
is:
(25)
The structural coe¢ cient
therefore determines whether high intercept households also have
the higher returns to hybrid maize. For example, if
> 0, then households with high
average yields also have the higher returns to hybrid. If I compare this to an economy
where households are randomly allocated across technologies, there is greater inequality in
yields. Similarly, if
< 0, then, compared to an economy with random allocation across
technologies, there is less inequality in yields. In addition,
< 0 implies that farmers are
sorting into sectors on the basis of comparative advantage.
Finally, note that equation (24) is a generalization of the household …xed e¤ects model.
To illustrate this, let
= 0 so that there is now the same …xed unobservable heterogeneity
(for example, due to farm management) that a¤ects yields the same way irrespective of the
type of maize seed planted. This imposes the following speci…c assumptions on the error
structure: uH
it =
i
+
H
it
and uN
it =
i
+
N.
it
In addition, the di¤erence (
H
it
N)
it
must be
independent of the selection rule. This implies that the timing of the decision to use hybrid
maize at any point in time is independent of the yield di¤erence facing a farmer at that time.
The yield function that corresponds to this is a …xed e¤ects version of equation (24):
yit =
where "it =
N
it
+(
N
t
H
it
+
i
+(
N )h .
it
it
H
t
N
t )hit
+ Xit0
N
H
+ Xit0 (
N
)hit + "it
(26)
The expected gain to growing hybrid is now:
Bit = (
H
t
N
t )
+ Xit0 (
H
N
)
(27)
The expected gain from hybrid can therefore vary across households as long as it is a function
of the observable Xit ’s. But, the unobservable heterogeneity,
i,
is restricted to a¤ect yields
identically whether or not a farmer uses hybrid and does not appear in bene…t equation (27).
The main contribution of this paper is therefore to use the CRC model to relax the assumption that the return/bene…t to hybrid varies along only observable dimensions across
households. There have been a number recent studies on heterogeneity in returns. Advances
have been made in the context of experimental data (for example, Heckman, Smith and
Clements (1997)) and in cross sectional data where the covariate of interest is a stock variable like schooling (see Card (1998), Heckman and Vytlacil (1997), Heckman and Li (2003),
etc.) or unionization (see Lemieux (1993, 1998)26 . To estimate the CRC model, I take a
novel approach. I use the panel nature of my data to generalize the multivariate regression/minimum distance approach of Chamberlain (1984) to allow for heterogeneous returns
as in equation (24). This generalization is extremely useful as not only does it let me estimate
, but it in fact allows me to estimate the distribution of returns.
26
For more on the return to unionization, see Vella and Verbeek (1998), Green (1991) and Robinson (1989).
16
5
Baseline Results
I begin the empirical work by looking at the OLS and household …xed e¤ects speci…cations
as a baseline comparison to the heterogeneous returns model. Table 3 shows these results for
speci…cations with and without covariates, i.e. this simpli…ed version of equation (26):
yit = + hit + Xit0 +
Comparing this to equation (26), allowing
And, I have assumed that
H
=
N
H
t
N
t
i
+ uit
t,
(28)
I have assumed that
t
=
8t.
= . Later, I will relax the latter of these assumptions27 .
The OLS estimates in the …rst column of Table 3 are extremely large and positive: households that plant hybrid maize tend to have much higher maize yields, on the order of 100%
higher. In the third column, I add covariates to control for other household variables that
could a¤ect yields and that may be correlated with the use of hybrid. They include land
acreage, fertilizer use, land preparation costs, seed quantity, variables that measure labor inputs (both hired and family labor where possible), long term mean seasonal rainfall, current
seasonal rainfall, year and province dummies. Adding these covariates decreases the OLS
coe¢ cient further, though it is still large at 56%. The fourth and …fth columns report the
household …xed e¤ects results. The coe¢ cient on hybrid decreases dramatically, though with
covariates the di¤erence between the OLS and …xed e¤ects is less substantial. Even within
households, there is still a substantial return to hybrid maize, about 15%.
This household …xed e¤ects framework is restrictive in the assumptions it imposes on the
adoption process and the comparison of the adopters and non-adopters. There is assumed to
be no di¤erence between farmers who switch into the use of hybrid and those that switch out.
Apart from the permanent component in the outcome equation,
i,
the adoption decision
cannot depend on observed outcomes except under restrictive assumptions on the transitory
component (see Ashenfelter and Card (1985)). Such assumptions can be motivated by myopia
or ignorance of the potential gains from planting hybrid, but they are unrealistic here28 . Since
the …xed e¤ects model may not be appropriate, the rest of this section describes some tests
of this model, and some intuitive evidence for selection and heterogeneity in returns.
5.1
The Two Period CRE Model
The household …xed e¤ects estimates described above are consistent only under the assumption of strict exogeneity of the errors. Chamberlain’s CRE approach provides a basis for
27
The former can be relaxed but relaxing it is empirically unimportant - the results from the more complicated model that allows for time-varying ’s are identical so I report results for the simpler model throughout.
28
Hybrid maize was introduced in the 1960’s, with widespread use of extension services to promote the
technology. See Evenson and Mwabu (1998). Data in the survey instrument also supports this, as most of the
farmers in my sample have used hybrid seed at some point in the past.
17
testing this assumption. I brie‡y outline the simple two period, no covariates CRE model
where the data generating process is given by
yit = + hit +
+ uit
i
(29)
The assumption of strict exogeneity of the errors is:
E(uit jhi1 ; :::; hiT ;
i)
=0
(30)
The CRE approach illustrates how the …xed e¤ects model is overidenti…ed. Linearly project
the …xed e¤ect,
i,
onto the history of the covariates (here just hybrid use):
i
=
0
+
1 hi1
+
2 hi2
+ vi
(31)
where the projection error vi is uncorrelated with hi1 and hi2 by construction, and the ’s
are the projection coe¢ cients. Substituting this linear projection into equation (29) gives
yit = + hit +
Let
it
0
+
1 hi1
+
2 hi2
+ vi + uit
(32)
= vi + uit where E [ it hi1 ] = E [ it hi2 ] = 0. For each time period, therefore:
yi1 = ( +
0)
+( +
yi2 = ( +
0)
+
1 )hi1
1 hi1
+
+( +
2 hi2
+
i1
(33)
2 )hi2
+
i2
(34)
These are the structural yield equations for each period. I estimate reduced form yield
functions for each period of the form
yi1 =
1
+
1 hi1
+
2 hi2
+
i1
(35)
yi2 =
2
+
3 hi1
+
4 hi2
+
i2
(36)
From the four reduced form coe¢ cients from equations (35) and (36),
estimate the three structural parameters,
1;
2
1;
2;
3
and
4,
I can
and , using minimum distance.
The intuition behind the identi…cation of the CRE model comes from the underlying
assumption of strict exogeneity of the errors. If the …xed e¤ects model is valid, then the
only way the history of hit (both past and future) a¤ects the current outcome is through the
household level unobservable,
i.
Conditional on
i
and the included regressors, the changes
in behavior are taken to be driven by transitory factors uncorrelated directly with yields.
The CRE model is therefore overidenti…ed and testable with panel data. The minimum
distance estimator of the structural parameters is also the minimum
18
2
estimator if the
weight matrix used is the inverse of the variance covariance matrix of the reduced form
coe¢ cients. This is called the optimal minimum distance (OMD) estimator. The
2
statistic
is just the value of the minimand in the OMD problem. If the identity matrix is used as the
weight matrix, the estimates are referred to as equally weighted minimum distance (EWMD)
estimates. The OMD estimates are e¢ cient, but may be biased in small samples and can
therefore be out-performed by EWMD (see Altonji and Segal (1996)). I also look at the
diagonally weighted minimum distance (DWMD) estimates, where the weight matrix used is
the same as the OMD case with the o¤-diagonal elements set to zero (see Pischke (1995)).
Estimates of the CRE model for three periods, with and without covariates are shown in
Table 4. In the CRE model, covariates can be treated as either exogenous or endogenous.
Exogenous covariates enter the model in equation (29), but are assumed to be uncorrelated
with the …xed e¤ects. Endogenous covariates are correlated with the …xed e¤ects and enter
the projection in equation (31). The CRE model hence allows tests of whether covariates are
endogenous. I report estimates where all covariates are assumed to be endogenous29 .
Table 4 shows both the reduced form and structural estimates for various speci…cations.
The reduced forms in the upper panel of Table 4 give nine reduced form parameters (not
including the constants or covariates), from which I use minimum distance to estimate the
four structural estimates, shown in the lower panel of the table. The CRE estimates of
in all cases are very close to the household …xed e¤ects estimates in Table 3. The OMD,
EWMD and DWMD estimates of
are quite similar, all within sampling error of each other.
The last column in the lower panel of Table 4 shows the
2
values on the overidenti…cation
test. In all cases, I can reject the null that the minimum distance restrictions hold. This
overidenti…cation test is an omnibus test and has low power against any speci…c alternative.
It is therefore not surprising that I am able to reject the overidentifying restrictions.
5.2
Preliminary Evidence of Heterogeneity and Selection
Previous research has developed tests for heterogeneity in returns for experimental data (see
Heckman, Smith and Clements (1997)). Since my data is not experimental, I do not report
results from these tests in detail. However, I am able to reject the null of no heterogeneity in
returns30 . Such tests assume away selection, which is hardly tenable so, in Table 5, I look for
evidence of selection. I split the adoption history of a farmer into a set of dummy variables
describing the transition of this farmer across technologies. The idea is to look at whether
households with di¤erent transition histories have di¤erent returns in terms of yields (see
Card and Sullivan (1988)). I de…ne a “joiner” to be a farmer who does not plant hybrid the
29
If I treat hybrid as endogenous and all the other covariates as exogenous, the results are similar.
These tests include looking at the Frechet-Hoe¤ding bounds and bounding the variance in the percentiles
of the returns distribution and testing whether this bound is di¤erent from zero. These tests, …gures and
tables are available from the author upon request and are reported in Suri (2006).
30
19
…rst period, but does the next, and a “leaver” to be a farmer who plants hybrid the …rst
period, but not the next. Similarly, I de…ne “hybrid stayers” and “non-hybrid stayers”.
I then look at each pair of periods in my data and compare yields for the hybrid and
non-hybrid stayers, the joiners, and the leavers in each of the two periods separately to learn
about the extent of selection31 . For example, the …rst two columns in Table 5 look at the
transitions of households over 1997-2000. The …rst and second columns compare the yields
in 1997 and 2000 for hybrid stayers, joiners and leavers (with the omitted group being the
non-hybrid stayers) in 1997 and 2000 separately. If there was no selection at all (not even via
a household …xed e¤ect), we would expect the coe¢ cient on the leavers in the yield equation
for 1997 to be no di¤erent from the coe¢ cient on the hybrid stayers, and also no di¤erent
from the coe¢ cient on the joiners in the yield equation for 2000. Similarly, the coe¢ cient on
joiners in the yield equation for 1997 would be no di¤erent from zero, as should the coe¢ cient
on the leavers in the yield equation for 2000. The coe¢ cients on the leavers and joiners across
the two yield equations illustrate the extent of selection. Table 5 reports similar estimates for
2000-2004 and 1997-2004. The hybrid stayers uniformly get the largest yields and the leavers
and joiners get very di¤erent changes in yields when they switch their adoption status.
This table helps shed light on a pure liquidity constraints model. Consider an extreme
version of this model where variation in adoption decisions is completely explained by liquidity
constraints. In this case, if a non-hybrid farmer does better than average in terms of yields,
it will lead him to use hybrid in the next period. But, in this period, he should then revert
to having the average hybrid yield. So, we would see that non-hybrid households that have
higher yields than average in the …rst period should have no di¤erent than average hybrid
yields in the second period. This implies that for 1997-2000, the coe¢ cient on joiners would
be greater than zero in 1997, and equal to the coe¢ cient on the hybrid stayers in 2000.
Similarly, the leavers should do worse than the hybrid stayers in 1997 and then no di¤erent
that the non-hybrid stayers in 2000. Throughout Table 5, these restrictions are rejected.
This signals that liquidity constraints alone cannot explain the patterns in the data.
In Table 6, I look for heterogeneity in returns to hybrid seed along observable dimensions.
This relaxes the assumption that
H
=
N
=
in the OLS and FE estimations reported
above. Table 6 reports results for yield functions estimated separately for hybrid and nonhybrid households. I report both the OLS as well as household …xed e¤ects speci…cations.
The returns to observables di¤er across the two sectors, especially in the cases of fertilizer and
rainfall. This holds for the OLS as well as the household …xed e¤ects speci…cations. The last
row of Table 6 then reports estimates of the return to hybrid (evaluated at the mean Xit ’s).
There is still a signi…cant return to hybrid for both the OLS and …xed e¤ects speci…cations.
31
Throughout the sample period, hybrid stayers have the highest yields on average, followed by the leavers,
then the joiners and …nally the non-hybrid stayers.
20
6
Estimating a Model with Heterogeneous Returns
I now go on to describe how I estimate the CRC model. For simplicity, I leave out inputs other
than hybrid and discuss extensions later. The model outlined earlier implies the following
econometric model in maize yields for the simple two period, no covariates case32 :
yit = + hit +
i
+
i hit
+
(37)
it
To estimate this, I generalize Chamberlain’s approach. This is my preferred estimation
method as it allows a
2
overidenti…cation test of the restrictions, it allows easy generalization
of the basic model and, furthermore, it allows me to estimate a distribution of returns.
The next sections are devoted to describing my estimation and identi…cation strategy. I
then describe generalizations to this basic model and the estimation results.
6.1
Two Period CRC Model
The key assumption here is that, conditional on
i
and
i
(and covariates in the more general
speci…cations), the unanticipated component of yields, "it , is not correlated with the decision
to adopt. Since
i
i
+
i,
where
yit = +
i
and
i
i
are orthogonal, I re-write equation (37) as
+ hit +
i hit
+
i
+ "it
(38)
I …rst derive how to estimate the parameters of this model, most importantly
show how to use these estimates to derive a distribution for the predicted
To estimate the parameters, I use a linear projection of the
i ’s.
The
. Then, I
i ’s.
i ’s
are projected
onto not just the history of the hybrid decisions, but also their interactions so that, in the
two period case, the projection error is orthogonal to hi1 and hi2 individually as well as to
their product, hi1 hi2 by construction33 . The linear projection I use is therefore
i
=
In addition, I normalize the
0
+
i ’s
1 hi1
+
so that
2 hi2
P
i
+
3 hi1 hi2
+ vi
(39)
= 034 . The normalization implies that
0
32
This empirical model is similar to models of individual speci…c heterogeneity in Heckman and Vytlacil
(1997), Card (2000, 1998), Deschênes (2001), Carneiro, Hansen and Heckman (2001), Carneiro and Heckman
(2002), and Wooldridge (1997). Lemieux (1998) has the same model to look at whether the return to union
membership varies along observable and unobservable dimensions - he estimates it using non-linear 2SLS.
33
Say I use the simple CRE projection, i = 1 hi1 + 2 hi2 + vi , and substitute this into the yield function:
yit =
0
+
1 hi1
+
2 hi2
+ hit +
0
+
1 hi1 hit
+
2 hi2 hit
+ vi + vi hit + uit
Even though vi is linearly uncorrelated with hi1 and hi2 individually, it is generally correlated with their
product, hi1 hi2 , so that E [vi hi1 hi2 ] 6= 0. The projection for CRC must include interactions of hybrid histories.
34
I use this normalization so that corresponds to the average return to the hybrid seed as in the …xed
21
can be written as a function of
1,
2
and
3
from equation (39). Since hit is a dummy
variable, substituting the above projection into the yield equations gives:
yit = +
0 + 1 hi1 + 2 hi2 + 3 hi1 hi2 +
hit +
0 hit +
1 hi1 hit +
2 hi2 hit +
3 hi1 hi2 hit +vi +
vi hit +uit
(40)
For each of the two time periods, the yield functions are therefore:
yi1 = ( +
0 )+[ 1 (1+
)+ +
yi2 = ( +
0 )+ 1 hi1 +[ 2 (1+
0 ]hi1 + 2 hi2 +[ 3 (1+
)+ +
0 ]hi2 +[ 3 (1+
)+
2 ]hi1 hi2 +(vi +
vi hi1 +ui1 ) (41)
)+
1 ]hi1 hi2 +(vi +
vi hi2 +ui2 ) (42)
The corresponding reduced forms include all the interactions of the hybrid histories:
yi1 =
1
+
1 hi1
+
2 hi2
+
3 hi1 hi2
+
i1
(43)
yi2 =
2
+
4 hi1
+
5 hi2
+
6 hi1 hi2
+
i2
(44)
1;
2;
Equations (43) and (44) give six reduced forms coe¢ cients (
I can estimate …ve structural parameters (
1;
2;
3;
3;
4;
5;
6 ),
from which
; ) using minimum distance35 . The
structural parameters are overidenti…ed, given the minimum distance restrictions:
1
= (1 + )
1
+
+
2
=
3
= (1 + )
3
+
2
4
=
5
= (1 + )
2
+
+
6
= (1 + )
3
+
1
From these restrictions, for example,
form parameters in two ways,
=
0
2
1
0
(45)
can be estimated as a combination of the reduced
6
3
4
2
and 1 +
=
1
5
4
2
.
I now discuss extensions to the basic model and how to recover the distribution of
6.2
i ’s.
Extensions
I consider the following extensions:
e¤ects model. The normalization used is irrelevant and does not change the results. It just …xes the average
i in the sample and does not change the ordering of the i ’s.
35
There may seem to be six structural
P parameters, given the presence of 0 in equations (39) and (45).
However, given the normalization that
i = 0, 0 is just a function of the hybrid histories, their interactions
and the other 0 s. In particular, 0 =
1 hi1
2 hi2
3 hi1 hi2 where hi1 and hi2 are the averages of the
adoption decisions in periods one and two, and hi1 hi2 is the average of their interaction.
22
1. Covariates: all the identi…cation arguments presented above generalize when covariates
are included. Covariates can be thought of as either exogenous or endogenous. Exogenous covariates are uncorrelated with the
i ’s
and therefore enter only the reduced
forms in equations (43) and (44), but not the projection. Endogenous covariates are
correlated with the
i ’s
and enter the projection in equation (39). In the case where
fertilizer is the only other endogenous covariate, the CRC projection generalizes to
i
=
0
+
+
1 hi1
7 hi1 fi2
+
+
2 hi2
+
8 hi2 fi2
3 hi1 hi2
+
+
4 hi1 fi1
9 hi1 hi2 fi2
+
+
10 fi1
5 hi2 fi1
+
+
11 fi2
6 hi1 hi2 fi1
+ vi
(46)
where fit for t = 1; 2 represents the use of fertilizer in each period. Adding more
endogenous covariates does complicate the CRC model and it can be cumbersome36 .
2. Three periods: this is a simple extension. For space considerations, I do not show the
restrictions, but the problem becomes heavily overidenti…ed. There are 9 structural
parameters to estimate from the 21 reduced form coe¢ cients for the basic case.
3. Joint choice variables: the two-sector model presented above (hybrid/non-hybrid) can
be extended to multiple sectors. Since in my data the use of hybrid seed and fertilizer
are clearly a joint decision, as a further check, I incorporate the use of fertilizer into a
multiple sector model. I estimate a model that rede…nes sectors to this joint decision
and then looks at the heterogeneity in returns across these two sectors. In this scenario,
a farmer is either in the technology sector (he uses both hybrid and fertilizer) or not.
6.3
CRC Estimates
This section describes the results for the CRC model. I report estimates for the pure hybrid
model described in detail above (with and without covariates) for both the two and three
period cases. In addition, I report results for the endogenous covariates and joint hybridfertilizer sector extensions described above.
Tables 7, 8a, 8b and 9 present the CRC model reduced form and structural estimates.
These tables report the EWMD, DWMD and OMD estimates for cases with and without
covariates, as well as the
coe¢ cient of interest is
components,
i.
2
statistics on the overidenti…cation tests for the OMD cases. The
, the coe¢ cient on the individual speci…c comparative advantage
Table 7 presents the two period reduced forms for the CRC model, both with
and without covariates, using data for 1997 and 2004. The reduced forms are estimated via
36
There is some justi…cation for treating only fertilizer as endogenous. First, the summary statistics by
sector show that there aren’t immense di¤erences across sectors in the use of other inputs. Second, using the
estimates from the pure hybrid problem in Table 8, I correlate the distribution of the predicted i ’s with the
covariates. Only the correlations between the i ’s and fertilizer are important in magnitude and signi…cance.
23
seemingly unrelated regressions and the most general variance-covariance estimation. The
estimates in Table 7 are for the basic speci…cation described above where hybrid is the only
endogenous variable and the projection used is given by equation (39). Table 8a reports
2
the structural EWMD, DWMD and OMD results with
covariates. The estimates of
statistics, both with and without
are consistently negative, though with larger standard errors
in EWMD and DWMD cases (OMD is the asymptotically e¢ cient estimator).
Table 8b presents two sets of structural estimates of
that account for fertilizer entering
the household’s decision making process regarding hybrid maize instead of restricting it to
be exogenous. In the upper panel of Table 8b, I allow for fertilizer to be endogenous and
correlated with the
i ’s,
using the projection in equation (46). Again, the estimates of
in
this panel of Table 8b are consistently negative, both with and without covariates as well as
across EWMD, DWMD and OMD speci…cations. The lower panel in Table 8b reports results
for the case where there is a joint hybrid-fertilizer decision on the part of the farmer so that
he is in the technology sector if he uses both hybrid and fertilizer37 , otherwise he is not.
The reason for estimating this model in this restrictive way is that in the three period case
allowing fertilizer to be directly correlated with the
i ’s
as in equation (46) makes the model
a bit cumbersome38 . For this simpler model, I report results for the two and three period
cases for comparison. Results for
in the lower panel of Table 8b are consistently negative.
Table 9 shows the structural estimates for the three period CRC models for two cases:
the pure hybrid decision and the joint hybrid-fertilizer technology sector decision (similar
to the results in the lower panel of Table 8b). There are three reduced forms here, one for
each period, that include all the possible interactions of the three hybrid histories. This
gives a total of 21 reduced form estimates that map onto 9 structural estimates: 7 ’s from
the projection of the
i ’s,
the average return to hybrid,
, and the comparative advantage
coe¢ cient, . For space reasons and since OMD is e¢ cient, I report only the OMD estimates.
The EWMD and DWMD results are not signi…cantly di¤erent from zero. The OMD estimates
of
are negative across the board and often signi…cant.
Overall, to summarize all these results, I can often reject the null hypothesis that
is
equal to zero, especially across the board in the OMD cases. However, for the two period
two sector models, I cannot reject the null that
is equal to
1. For the endogenous hybrid-
fertilizer model in Table 8b and the three period models in Table 9, I can reject the null that
is equal to
1. I explain the signi…cance of this in the discussion below.
37
The fertilizer decision I use is regarding chemical fertilizers. The results are similar if I include the use
of manure as part of the fertilizer decision.
38
An additonal complication in the data is that the 2000 survey did not collect data on family labor. So,
for the three period case, I opted for the simpler model. The results should be interpreted with caution.
24
6.4
Recovering the Distribution of bi
Given the CRC structural estimates of the ’s and the form of the projection given by either
equations (39) or (46), I can predict the
i ’s
for a given history of hybrid use. However, I
must assume that the projections describe the true conditional expectation of the
i ’s.
This
is essentially an assumption only for the case of equation (46). For equation (39), since hit
is binary and each history is accounted for, the projection is close to saturated. Once I have
the predicted distribution of
i,
I can derive the predicted
Figure 7a shows the distribution of the predicted
i ’s
i ’s
via the yield function.
in my sample for the two period
model with the full set of exogenous covariates (the estimates come from Table 8a). Since
the predicted
predicted
i ’s
i ’s
are obtained from the projection in equation (39), the distribution of the
has just four mass points. There are only four possible hybrid histories: non-
hybrid stayers, hybrid stayers, leavers and joiners. So, given the projection, there are only four
possible values of bi : 0:3684; 0:0691; 0:2275 and 0:4033 for the four histories respectively.
Think of these mass points as averages of the underlying individual
hybrid use. The non-hybrid stayers have the lowest predicted
returns to hybrid as
i ’s
i ’s
for each history of
and hence the highest
< 0. The joiners and leavers have close to zero returns (they are the
marginal farmers) and the hybrid stayers have low positive returns.
Figure 7b shows the distribution of returns across my sample. To get this distribution of
returns, I need to use the estimate of
(the structural coe¢ cient on hit in equation (24)),
which is the average return to hybrid in the sample. The distribution of returns is therefore
given by
+
i.
Since
< 0, the ordering of returns is the reverse of the ordering of
i ’s,
i.e. the non-hybrid stayers have the most negative i ’s, but the highest positive returns.
Figure 7c shows the distribution of the bi ’s for the case where both hybrid and fertilizer use
are endogenous (the estimates come from the upper panel of Table 8b). Now, the distribution
of the predicted bi ’s is continuous since the predicted bi ’s come from the projection in equation
(46). Figure 7c splits out the distribution of these predicted bi ’s by transition. Clearly, the
ordering of average predicted bi ’s (and hence returns) is the same as earlier. The non-hybrid
stayers have the highest returns (lowest bi ’s), followed by the hybrid stayers, then the leavers
and …nally the joiners. Finally, as a check, Figure 7d shows the corresponding distribution
of the predicted
7
i ’s
(which were constructed to be orthogonal to hybrid choice).
Robustness Checks
In this section, I present as robustness checks control function and treatment e¤ect estimates
of the returns to hybrid. This includes the Heckman two-step estimator (also see Garen (1984)
and Deschênes (2001)) and estimates of treatment e¤ects under non-random assignment, i.e.
the ATE, TT, MTE and LATE (see Björklund and Mo¢ tt (1987), Heckman, Tobias and
25
Vytlacil (2001) and Carneiro and Heckman (2002)). Rewriting equation (11) for one period,
0
yi =
+ Xi (
Estimates of
i
H
N
) + (uH
i
0
uN
i ) hi + X i
N
+ uN
i
i hi
0
N
+ Xi
+ uN
i
(47)
would illustrate whether the average returns of farmers selecting into using
hybrid maize are higher than the returns for the farmers at the margin. With assumptions
of normality, it is possible to estimate the standard cross sectional selection parameters.
These selection correction procedures are cross sectional in nature and impose distributional
assumptions. They do not fully address the issues I am interested in, nor do they fully exploit
the panel nature of my data. I present them as robustness checks.
The yields in the hybrid and non-hybrid sectors are given by equations (3) and (4), and
the selection equation is given by equation (9). Drawing on Heckman, Tobias and Vytlacil
(2001), I look at the following treatment e¤ects under the simple assumption of normality:
AT E(x) = E( jX = x) = x(
T T (x; z; h(Z) = 1) = x(
M T E(x; usi ) = x(
where
= yH
H
H
N
N
)+(
)+(
y N is the yield gain from hybrid,
H
N
H H
H H
2
H
(:) and
N
N
(48)
(z )
(z )
N)
H
(49)
s
N )ui
= V ar(uH
i ),
is the error in the selection equation given by equation (9),
s
Corr(uN
i ; ui ).
)
(50)
2
N
s
= V ar(uN
i ), ui
s
= Corr(uH
i ; ui ) and
N
=
(:) represent the normal probability and cumulative distribution
functions respectively. V ar(usi ) is normalized to be 1, so that
H H
and
N
N
correspond
to the coe¢ cients on the selection terms in the sector speci…c selection models.
These treatment e¤ects in the case of joint normality are simple to compute using a twostep control function procedure. The …rst stage is a probit describing the hybrid adoption
decision, used to compute the selection correction terms that are used as controls in the
second stage sector speci…c yield functions. The ATE uses the estimates of the coe¢ cients on
the Xi ’s from this second step. The TT adjusts this estimated ATE for the selection of those
who actually plant hybrid. The slope of the MTE describes whether people who are more
likely to use hybrid for unobservable reasons (higher usi ) have higher or lower returns from
planting hybrid. If the coe¢ cient on usi in equation (50) is negative, it implies that farmers
with unobservables that make them the least likely to use hybrid have the highest yield return
to planting hybrid over non-hybrid. A non-zero slope of the MTE therefore illustrates the
role of unobservables and the importance of heterogeneity in returns.
These treatment e¤ects are shown in Table 10 for every cross-section of my data. To
estimate these parameters, I need an exclusion restriction, i.e. a variable that enters the
selection equation (9), but not the yield functions in equations (3) and (4). I use a variable
26
that describes a household’s access to fertilizer and hybrid seed, in particular the distance
to the closest stockist of fertilizer (not the distance to where fertilizer is purchased). This
distance measure gives an idea of the access and availability of the technologies and the
general level of infrastructure for these households.
The ATE’s (evaluated at the mean of the Xi ’s) are all extremely large and positive,
ranging from 0.768 in 2000 to 2.174 in 1997. The TT estimates are also large and positive
and often smaller than the ATE estimates, ranging from 0.898 to 1.138. The MTE slope,
meanwhile, is consistently negative and mostly signi…cant: –1.867 (with a standard error of
0.743) in 1997, -0.232 (0.372) in 2000 and -0.885 (0.240) in 2004. The negative MTE slopes
imply that farmers with unobservables that make them most likely to use hybrid get the
lowest returns from hybrid. Finally, I can estimate the instrumental variable (i.e. LATE)
estimate using the same exclusion restriction or instrument. The IV estimates of the return
to hybrid, reported in the lower panel of Table 10 are extremely large, on the order of 150%,
especially when compared to the earlier OLS and household …xed e¤ects estimates39 . These
IV estimates are consistent with my estimates of
in Tables 8 and 9 and, given the non-zero
MTE slopes, they provide further evidence of the heterogeneity in the e¤ects of hybrid.
The next section is devoted to discussing all these estimates and relating them to estimates
of
to better understand farmers’adoption decisions.
8
Discussion
How do all the results from the various models presented above relate to each other and
inform the underlying correlations of returns, costs and adoption decisions? In this section, I
relate my estimates of
estimates of
to the estimates from the selection models to understand what the
mean and their policy relevance. Recall that the parameter
tells us whether
the households that do better on average, irrespective of the technology they use, also have
the highest returns to hybrid, and it illustrates the relative equality or inequality in yields.
I rewrite the factor structure in equations (12) and (13) for the one period/cross-sectional
H
i
case, suppressing the transitory components,
and
uH
i
= ( + 1)
uN
i
=
i
+
i
N
i :
+
i
i
To relate this to the selection model, I need expressions for
H H
and
N
N
in terms of
39
The IV results are similar if I use the interactions of distance with dummies describing which asset
quantile a household is in as the exclusion restrictions with the level e¤ects of distance and asset quantiles
entering as controls in both stages of the speci…cation. This approach (results not reported here, but available
from the author) allows distance and asset quantiles to a¤ect yield outcomes directly, but relies on distance
to be more of a constraint for poorer households so that the interaction is useful as an exclusion restriction.
27
the parameters of these factor structures. Recall that
s
Cov(uN
i ; ui )
since V
ar(usi )
H H
N
H H
N
N
s
= Cov(uH
i ; ui ) and
= 1. I can therefore derive an expression for
N
s
= Cov(uH
i ; ui )
2.
would simply be
H H
N
N
uN
i =
i.
N
=
N:
s
s
Cov(uN
i ; ui ) = Cov( i ; ui )
In the strict Roy model without costs, usi = uH
i
(51),
H H
(51)
In this case, from equation
A negative MTE slope must therefore imply that
< 0. However, in a generalized Roy model with costs usi =
i
Ci where Ci is a measure
of relative costs in the hybrid sector (that do not enter the yield equation). This implies
H H
From the negative slope of the MTE,
N
N
2
=
H H
N
Cov( i ; Ci )
N
< 0. If
< 0, then Cov( i ; Ci ) can be
either positive and small, or negative. On the other hand, if
> 0, it implies that Cov( i ; Ci )
needs to be quite large and positive (enough to outweigh the
Roy model, the sign of
(52)
2 ).
In terms of a generalized
tells us something about the covariance of comparative advantage
and relative costs of planting hybrid. Given that I …nd that
< 0, it must be that households
with high return/low comparative advantage have higher relative costs of hybrid.
To understand these costs, in Table 11 I look at what observables correlate with the
predicted bi ’s. The results reported are for the two period model with endogenous hybrid
and fertilizer (using estimates from the upper panel of Table 8b). The observables I consider
include the distance from the household to the closest fertilizer seller, distance to the closest
tarmac road, distance to the closest matatu (public transport) stop, distance to the closest
motorable road, distance to the closest extension services, a set of dummies for education of
the household head, a dummy for whether the household tried to get credit, a dummy for
whether the household tried to get credit but did not receive any, a dummy for whether the
household received any credit, province and year dummies. The infrastructure and education
variables correlate strongly with the predicted bi ’s, but the credit variables do not40 .
Both from the agronomy of hybrid maize, as well as the kernel density plots in Figure 6,
it is clear that hybrid increases productivity, but this increase declines as you move rightward
through the distribution, so the expected return from hybrid is almost zero in the right tail.
When
< 0, the marginal gain is negative, so that the farmers in the left tail have the
largest gain. This is seen from the estimates of
as well as the negative selection estimates.
If selection was only as per the strict Roy Model, we would see the largest adoption rates in
the left tail, but this ignores the possibility of larger costs/constraints for these farmers. Table
11 illustrates that the households with the lower predicted bi ’s have much higher costs. Recall
that throughout the sample period, Kenya Seed Company was the predominant distributor of
40
These results hold up in the presence of …ner regional dummies, agro ecological zones or districts.
28
hybrid seed (about 1% of hybrid seed purchased in 2004 was not from Kenya Seed Company)
and under pan territorial seed pricing, seed prices were the same across the country. This
meant that suppliers had no incentives to locate far away as there was no compensating
increased price. And, villages are large and heterogeneous and these measures of costs and
infrastructure vary even within a village.
The IV estimates in Table 10 also suggest that distance and infrastructure are constraining
factors. An issue with IV estimates is that the estimated results are often quite di¤erent if
di¤erent instruments are used. This is usually attributed to underlying heterogeneity in
the population so that each instrument a¤ects a di¤erent subset of the population, hence
giving the varied results. However, the instrument itself, per se, does not usually highlight
which subset of the population it a¤ects. Appealingly, here the IV estimates are about 150%
which are the same magnitude as the returns that the non-hybrid stayers would get from
using hybrid. Distances and infrastructure do seem to pose a constraint to households using
these technologies. Finally, note that for
sector, the yields are given by
non-hybrid sector include the
+
i.
i
=
1, looking at the outcomes in just the hybrid
(plus the constant of course). But, the yields in the
This also explains why the IV estimates are close to the
estimates of . The mirror image of the result for the non-hybrid stayers is that there may
be “over-adoption” for some farmers in the right tail. The fact that some households have
lower returns yet adopt may hint at risk being an important factor in the adoption decision,
since the mean gross returns to hybrid in the upper quartile are close to zero. To account
for risk, it would be necessary to allow the variance of the transitory component in yields to
play a role in the adoption decision. I leave this to future work.
8.1
Alternative Models
Finally, I discuss some alternative models of adoption behavior that are popular in the literature on developing countries. The …rst is a model of learning, where households are
uninformed about the bene…ts of a technology and they trial and experiment with it to learn
what the returns are. Learning models give rise to the familiar S-shaped curves of adoption
rates over time (see Griliches (1957)): initially, few households adopt, then as more households learn about the productivity of the technology, adoption rates increase until everyone
has adopted. In my data, adoption rates show no dynamics - aggregate adoption remains
constant over the span of almost ten years. In fact, there are no dynamics in any of the inputs
either, even after as large a shock as El Nino in 1998. Any conventional model of learning is
not borne out by these aggregates.
It may be possible that households are learning about the di¤erent types of hybrid seed
and are switching to newer varieties of hybrid. However, even in that case, we should still see
aggregate increases in adoption over time. For two periods of the survey (2002 and 2004), the
29
data includes exactly what type of hybrid variety was used. In 2002, 61.5% of maize plots
were planted with hybrid 614, 6% each of 625 and 627 and 4% of 511. The release date of
these varieties was 1986, 1981, 1996 and the 1960’s respectively. In 2004, 69.5% of plots were
planted with 614, 5% with 627, 4% each with 625 and 511. So, households are using rather
old varieties and do not switch much between them.
The switching between hybrid and non-hybrid from period to period may look like learning
or experimenting behavior, but recall that hybrid and fertilizer have been available for a few
decades now. The 2004 survey asked households what year they …rst used the technologies this data shows that 99% and 83% of households have used hybrid and fertilizer (respectively)
before. Table 2c does not show any systematic persistence in adoption (the cycling behavior
is about as prevalent as the non-cycling). In the qualitative parts of the survey, farmers who
were using traditional varieties were asked why they were not using hybrid varieties and only
about 0.3% cited experimenting or on trial. Learning therefore seems to be unimportant
in my data. However, there may have been large learning e¤ects before my sample period
- Gerhart (1975) shows S-shaped curves of adoption in the 1970’s. It seems that these
curves ‡attened out at low rates of adoption. Finally, on the methodological side, a learning
model with heterogeneity in returns would imply a random coe¢ cient model with feedback.
Chamberlain (1993) shows that such models su¤er from identi…cation problems. Gibbons
et al. (2005) estimate a random coe¢ cient model with learning, but they must rely on the
structure of the learning process and a long panel (the NLSY) to identify the model.
The second set of alternative models deals with the role of credit constraints. The data on
credit show that in 2004, 41% of all households tried to get credit and 83% of these got credit.
In 1997, these …gures were collected for agricultural credit, where 33.7% of households tried
to get agricultural credit and 90% of these got it. Credit constraints do not seem to be of …rst
order importance here, for the following three reasons. First, only 2% of households plant
both hybrid and non-hybrid seed though there are no …xed costs to hybrid and both hybrid
seed and fertilizer are available in small quantities from suppliers. Second, the patterns
in Table 5 did not …t with a pure liquidity constraints story, as described earlier. And,
third, none of the credit variables (which admittedly are not perfect) correlate strongly with
comparative advantage in Table 11.
The …nal hypothesis for the adoption behavior of households is di¤erences in tastes. In
Latin America, the hybrid and non-hybrid varieties taste di¤erent and have di¤erent prices
in the market. This is not the case in Kenya. Hybrid and non-hybrid maize output are
indistinguishable, they are sold at the same price as a single commodity. The survey asked
farmers why they chose the variety they plant and an equal fraction of each type of farmer
(hybrid and non-hybrid) said it was because they preferred the taste. The di¤erential tastes
hypothesis does not seem to …t the case of Kenya.
30
9
Conclusion
This paper examines the adoption decisions and bene…ts of hybrid maize in a framework that
is in stark contrast to the empirical technology adoption literature of the past two decades.
Rather than think of adoption decisions as based on learning and information externalities,
I focus on a framework that recognizes the large disparities in farming and input supply
characteristics across the maize growing areas of Kenya. Within this framework, I …nd that
returns to hybrid maize vary greatly. Furthermore, farmers who are on the margin of adopting
and disadopting (and who do so during my sample period) experience little change in yields,
a …nding consistent with my framework, yet harder to reconcile with a pure learning model.
The experimental evidence points to average high, positive returns to these agricultural
technologies. However, these experiments say little else about the returns, sometimes hypothesizing irrationality in decision making to account for patterns of low, non-increasing adoption
rates. My framework of heterogeneous returns allows me to estimate not just average returns
accounting for selection, but also to have an idea of what the distribution of returns looks like
across my sample of farmers. I …nd extremely strong evidence of heterogeneity in returns to
hybrid maize, with comparative advantage playing an important role in yield determination.
The most important …ndings of my work are the policy implications of heterogeneity in
returns. Since I am able to look at a distribution of returns across my sample of farmers,
I can separate out farmers with low returns from those with high returns. I …nd that for a
small group of farmers in my sample, returns from hybrid maize would be extremely high, yet
they do not adopt hybrid maize. I show that these farmers have high unobservable costs that
prevent them from adopting hybrid as they have poor access to seed and fertilizer distributors.
In terms of policy, alleviating these constraints would greatly increase yields for these farmers.
However, this is only a small fraction of my sample. For a large fraction of my sample, the
returns to hybrid maize are smaller and these farmers choose to adopt. While I do not build
risk into the choice framework used in this paper, farmers that are well o¤ can a¤ord to
use hybrid maize, even when the mean returns to hybrid are low, since it helps insure them
again bad outcomes. For these farmers, since they do not seem to be constrained, they would
bene…t greatly from improved e¤orts in developing new hybrid strains and a biotechnology
e¤ort similar to that of South Asia and Latin America, where cultivation of newer hybrids
occurs often and leads to continual yield increases. Finally, the heterogeneity in return to
hybrid seed, on the whole, suggests quite rational and relatively unconstrained adoption of
existing hybrid strains.
31
References
Altonji, Joseph, and Lewis Segal, (1996), “Small Sample Bias in GMM Estimation of Covariance Structures”, Journal of Business and Economic Statistics, 14, pp. 353-366.
Ashenfelter, Orley, and David Card, (1985), “Using the Longitudinal Structure of Earnings
to Estimate the E¤ects of Training Programs”, Review of Economics and Statistics, 67(4).
Attanasio, Orazio, Costas Meghir, and Miguel Szekely, (2003), “Using Randomized Experiments and Structural Models for Scaling Up: Evidence from the PROGRESA Evaluation”, Mimeo, University College London.
Bandiera, Oriana, and Imran Rasul, (2003), “Complementarities, Social Networks and Technology Adoption in Mozambique”, Working Paper, London School of Economics.
Banerjee, Abhijit, (1992), “A Simple Model of Herd Behavior”, Quarterly Journal of Economics, 107(3), pp. 797-817.
Besley, Timothy, and Anne Case, (1993), “Modelling Technology Adoption in Developing
Countries”, AER Papers and Proceedings, 83(2), pp. 396-402.
Björklund, Anders, and Robert Mo¢ tt, (1987), “The Estimation of Wage Gains and Welfare
Gains in Self Selection Models”, Review of Economics and Statistics, pp. 42-49.
Borjas, George, (1987), “Self-Selection and the Earnings of Immigrants”, American Economic
Review, 77 (4), pp. 531-553.
Card, David, (2000), “Estimating The Returns to Schooling: Progress of Some Persistent
Econometric Problems”, NBER Working Paper No. 7769, Cambridge, MA.
Card, David, (1998), “The Causal E¤ect of Education on Earnings”, in the Handbook of
Labour Economics, eds. Orley Ashenfelter and David Card.
Card, David, (1996), “The E¤ect of Unions on the Structure of Wages: A Longitudinal
Analysis”, Econometrica, 64.
Card, David, and Daniel Sullivan, (1988), “Measuring the E¤ect of Subsidized Training
Programs on Movements In and Out of Employment”, Econometrica, 56.
Carneiro, Pedro, Karsten Hansen, and James Heckman, (2003), “2001 Lawrence R. Klein
Lecture: Estimating Distributions of Treatment E¤ects with an Application to the Returns to Schooling and Measurement of the E¤ects of Uncertainty on College Choice”,
International Economic Review, 44 (2), pp. 361-422.
Carneiro, Pedro, and James Heckman, (2002), “The Evidence on Credit Constraints in PostSecondary Schooling”, Economic Journal, 112 (October), pp. 705-734.
Carneiro, Pedro, and Sokbae Lee, (2004), “Comparative Advantage and Schooling”, Working
Paper, University College London.
32
Chamberlain, Gary, (1993), “Feedback in Panel Data Models”, mimeo, Department of Economics, Harvard University.
Chamberlain, Gary, (1984), “Panel Data”, in Zvi Griliches and Michael Intriligator (eds.),
Handbook of Econometrics, Amsterdam: North-Holland.
Chamberlain, Gary, (1982), “Multivariate Regression Models for Panel Data”, Journal of
Econometrics, 18, pp. 5-46.
Conley, Timothy, and Christopher Udry, (2003), “Learning About A New Technology:
Pineapple in Ghana”, Working Paper, University of Chicago and Yale University.
Corbett, John, (1998), “Classifying Maize Production Zones in Kenya Through Multivariate
Cluster Analysis”, in Maize Technology Development and Transfer: A GIS Application for
Research Planning in Kenya, ed. Rashid Hassan, CAB International, Wallingford, UK.
Croppenstedt, Andre, Mulat Demeke, and Meloria Meschi, (2003), “Technology Adoption
in the Presence of Constraints: The Case of Fertilizer Demand in Ethiopia”, Review of
Development Economics, 7(1), pp. 58-70.
David, Paul, (2003), “Zvi Griliches on Di¤usion, Lags and Productivity Growth... Connecting
the Dots”, Paper Prepared for the Conference on R&D, Education and Productivity, Held
in Memory of Zvi Griliches (1930-1999), August 2003.
De Groote, Hugo, William Overholt, James Ouma, and S. Mugo, (2003), “Assessing the
Potential Impact of Bt Maize in Kenya Using a GIS Based Model”, Working Paper,
CIMMYT (International Maize and Wheat Improvement Center).
De Groote, Hugo, Cheryl Doss, Stephen Lyimo, and Wilfred Mwangi (2002), “Adoption of
Maize Technology in East Africa: What Happened to Africa’s Emerging Maize Revolution?” Presented at Green Revolution in Asia and its Transferability to Africa, 2002.
Dercon, Stefan and Luke Christiaensen, Luc, (2005), “Consumption Risk, Technology Adoption and Poverty Traps: Evidence from Ethiopia”, mimeo.
Deschênes, Olivier, (2001), “Unobservable Ability, Comparative Advantage and the Rising
Return to Education in the US, 1979-2000”, Working Paper, September 2001, University
of California, Santa Barbara.
Doss, Cheryl, (2003), “Understanding Farm Level Technology Adoption: Lessons Learned
from CIMMYT’s Micro Surveys in Eastern Africa”, Economics Working Paper No. 03-07,
CIMMYT (International Maize and Wheat Improvement Center), Kenya.
Du‡o, Esther, Michael Kremer, and Jonathan Robinson, (2006), “Why Don’t Farmers Use
Fertilizers: Evidence from Field Experiments in Western Kenya”, Working Paper, 2006.
Duncan, Gregory, and Duane Leigh, (1985), “The Endogeneity of Union Status: An Empirical
Test”, Journal of Labor Economics, 3 (3), pp. 385-402.
33
Evenson, Robert, and Germano Mwabu, (1998), “The E¤ects of Agricultural Extension on
Farm Yields in Kenya”, Economic Growth Center Discussion Paper No. 798.
Feder, Gershon, Richard Just, and David Zilberman, (1985), “Adoption of Agricultural Innovations in Developing Countries: A Survey”, Economic Development and Cultural Change.
Food and Agriculture Organization, FAOSTAT Online Database.
Foster, Andrew, and Mark Rosenzweig, (1995), “Learning by Doing and Learning from Others: Human Capital and Technological Change in Agriculture”, Journal of Political Economy, 103, pp. 1176-1209.
Garen, John, (1984), “The Returns to Schooling: A Selectivity Bias Approach with a Continuous Choice Variable”, Econometrica, 52 (5), pp. 1199-1218.
Gerhart, John Deuel, (1975), “The Di¤usion of Hybrid Maize in Western Kenya,” Ph.D.
Dissertation, Princeton University.
Gibbons, Robert, Lawrence F. Katz, Thomas Lemieux, and Daniel Parent, (2002), “Comparative Advantage, Learning and Sectoral Wage Determination”, Scienti…c Series, CIRANO.
Green, David, (1991), “A Comparison of the Estimation Approaches for the Union-Nonunion
Wage Di¤erential”, Discussion Paper No. 91-13, Department of Economics, University of
British Columbia, Vancouver, Canada.
Griliches, Zvi, (1957), “Hybrid Corn: An Exploration in the Economics of Technological
Change”, Econometrica, 25, pp. 501-522.
Hall, Bronwyn, (2004), “Innovation and Di¤usion”, NBER Working Paper No. 10212.
Hassan, Rashid, Festus Murithi, and Geofry Kamau, (1998), “Determinants of Fertilizer
Use and the Gap Between Farmers’ Maize Yields and Potential Yields in Kenya”, in
Maize Technology Development and Transfer: A GIS Application for Research Planning
in Kenya, ed. Rashid Hassan, CAB International, Wallingford, UK.
Hassan, Rashid, Kiairie Njoroge, Mugo Njore, Robin Otsyula, and A. Laboso, (1998), “Adoption Patterns and Performance of Improved Maize in Kenya”, in Maize Technology Development and Transfer: A GIS Application for Research Planning in Kenya, ed. Rashid
Hassan, CAB International, Wallingford, UK.
Heckman, James, (2001), “Micro Data, Heterogeneity, and the Evaluation of Public Policy:
Nobel Lecture”, Journal of Political Economy, 109 (4), pp. 673-748.
Heckman, James, and Xuesong Li, (2003), “Selection Bias, Comparative Advantage and
Heterogeneous Returns to Education: Evidence from China in 2000”, IFAU - Institute for
Labour Market Policy Evaluation, Working Paper 2003:17.
34
Heckman, James, Justin Tobias, and Edward Vytlacil, (2001), “Four Parameters of Interest
in the Evaluation of Social Programs”, Southern Economic Journal, 68 (2), pp. 210-223.
Heckman, James, Je¤rey Smith, with the assistance of Nancy Clements (1997), “Making the
Most out of Programme Evaluations and Social Experiments: Accounting for Heterogeneity in Programme Impacts”, Review of Economic Studies, 64, pp. 487-535.
Heckman, James, and Edward Vytlacil, (1997), “Instrumental Variables Methods for the Correlated Random Coe¢ cient Model: Estimating the Average Rate of Return to Schooling
When the Return is Correlated with Schooling”, Journal of Human Resources, 33(4).
Heckman, James, and Bo Honore, (1990), “The Empirical content of the Roy Model”, Econometrica, 58 (5), pp. 1121-1149.
Jakubson, George, (1991), “Estimation and Testing of the Union Wage E¤ect Using Panel
Data”, Review of Economic Studies, 58 (5), pp. 971-991.
Jayne, Thom S., Robert J. Myers, and James Nyoro, (2005), “E¤ects of Government Maize
Marketing and Trade Policies on Maize Market Prices in Kenya”, Working Paper, MSU.
Jayne, Thom S., Takashi Yamano, James Nyoro, and Tom Awuor, (2001), “Do Farmers Really
Bene…t from High Food Prices? Balancing Rural Interests in Kenya’s Maize Pricing and
Marketing Policy”, Tegemeo Draft Working Paper 2B, Tegemeo Institute, Kenya.
Karanja, Daniel, (1996), “An Economic and Institutional Analysis of Maize Research in
Kenya”, MSU International Development Working Paper No. 57, Department of Agricultural Economics and Department of Economics, Michigan State University, East Lansing.
Karanja, Daniel, Thomas Jayne, and Paul Strasberg, (1998), “Maize Productivity and Impact
of Market Liberalization in Kenya”, KAMPAP (Kenya Agricultural Marketing and Policy
Analysis Project), Department of Agricultural Economics, Egerton University, Kenya.
Karanja, Daniel, Mitch Renkow, and Eric Crawford, (2003), “Welfare E¤ects of Maize Technologies in Marginal and High Potential Regions of Kenya”, Agricultural Economics, 29.
Lemieux, Thomas, (1993), “Estimating the E¤ects of Unions on Wage Inequality in a TwoSector Model with Comparative Advantage and Non-Random Selection”, Working Paper,
CRDE and Departement de Sciences Economiques, Universite de Montreal.
Lemieux, Thomas, (1998), “Estimating the E¤ects of Unions on Wage Inequality in a TwoSector Model with Comparative Advantage and Non-Random Selection”, Journal of Labor
Economics, 16(2), pp. 261-291.
Makokha, Stella, Stephen Kimani, Wilfred Mwangi, Hugo Verkuijl, and Francis Musembi,
(2001), “Determinants of Fertilizer and Manure Use for Maize Production in Kiambu
District, Kenya”, Mexico, D.F.: CIMMYT.
35
Mandelbrot, Benoit, (1962), “Paretian Distributions and Income Maximization”, Quarterly
Journal of Economics, 76, pp.57-85.
McCann, James C., (2005), Maize and Grace: Africa’s Encounter with a New World Crop,
1500-2000, Harvard University Press, USA.
Moser, Christine, and Christopher Barrett, (20030, “The Complex Dynamics of Smallholder
Technology Adoption: The Case of SRI in Madagascar”, Working Paper, Cornell.
Mugo, Frida, Frank Place, Brent Swallow, and Maggie Lwayo, (2000), “Improved Land Management Project: Socioeconomic Baseline Study”, Nyando River Basin Report, ICRAF.
Munshi, Kaivan, (2003), “Social Learning in a Heterogeneous Population: Technology Di¤usion in the Indian Green Revolution”, Journal of Development Economics, 73, pp. 185-213.
Mundlak, Yair, (1961), “Empirical Production Function Free of Management Bias”, Journal
of Farm Economics, 43(1) pp. 44-56.
Nyameino, David, Bernard Kagira, and Stephen Njukia, (2003), “Maize Market Assessment
and Baseline Study for Kenya”, RATES Program, Nairobi, Kenya.
Nyoro, James K., M. W. Kiiru, and Thom S. Jayne, (1999), “Evolution of Kenya’s Maize
Marketing Systems in the Post-Liberalization Era”, Tegemeo Working Paper, June 1999.
Olley, Steven and Ariel Pakes, (1996), “The Dynamics of Productivity in the Telecommunications Equipment Industry”, Econometrica, 64, pp. 1263-1297.
Ouma, James, Festus Murithi, Wilfred Mwangi, Hugo Verkuijl, Macharia Gethi, and Hugo
De Groote, (2002), “Adoption of Maize Seed and Fertilizer Technologies in Embu District,
Kenya”, Mexico, D.F.: CIMMYT (International Maize and Wheat Improvement Center).
Pischke, Jörn-Ste¤en, (1995), “Measurement Error and Earnings Dynamics: Some Estimates
from the PSID Validation Study”, Journal of Business and Economic Statistics, 13 (3).
Place, Frank, and Brent Swallow, (2000), “Assessing the Relationship Between Property
Rights and Technology Adoption on Smallholder Agriculture: A Review of Issues and
Empirical Methods”, CAPRi Working Paper No. 2, ICRAF, Kenya.
Robinson, Chris, (1989), “The Joint Determination of Union Status and Union Wage E¤ects:
Some Tests of Alternative Models”, Journal of Political Economy, 97 (3), pp. 639-67.
Rogers, Everett, (1995), “Di¤usion of Innovations”, Fourth Edition, Free Press, New York.
Roy, A., (1951), “Some Thoughts on the Distribution of Earnings”, Oxford Economic Papers,
3, pp. 135-146.
Salasya, M.D.S., W. Mwangi, Hugo Verkuijl, M.A. Odendo, and J.O. Odenya, (1998), “An
Assessment of the Adoption of Seed and Fertilizer Packages and the Role of Credit in
Smallholder Maize Production in Kakamega and Vihiga Districts, Kenya”, CIMMYT.
36
Sanders, John, Barry Shapiro, and Sunder Ramaswamy, (1996), “The Economics of Agricultural Technology in Semiarid Sub-Saharan Africa”, John Hopkins University Press.
Sattinger, Michael, (1993), “Assignment Models of the Distribution of Earnings”, Journal of
Economic Literature, 31 (2), pp. 831-880.
Schultz, Theodore, (1963), The Economic Value of Education, Columbia University Press.
Smale, Melinda, and Thom Jayne, (2003), “Maize in Eastern and Southern Africa: “Seeds”
of Success in Retrospect”, EPTD Discussion Paper No. 97.
Smale, Melinda, and Hugo De Groote, (2003), “Diagnostic Research to Enable Adoption
of Transgenic Crop Varieties by Smallholder Farmers in Sub-Saharan Africa”, African
Journal of Biotechnology, 2 (12), pp. 586-595.
Sserunkuuma, Dick, (2002 ), “The Adoption and Impact of Improved Maize Varieties in
Uganda”, Working Paper, Makerere University, Uganda.
Sunding, David, and David Zilberman, (2001), “The Agricultural Innovation Process: Research and Technology Adoption in a Changing Agricultural Sector”, Chapter 4, Handbook
of Agricultural Economics, Volume 1, eds. B. Gardner and G. Rausser.
Suri, Tavneet, (2006), “Selection and Comparative Advantage in Technology Adoption”,
Economic Growth Center Discussion Paper No. 944, Yale University.
Tittonell, Pablo, (2003), “Soil Fertility Gradients in Smallholder Farms of Western Kenya,
Their Origin, Magnitude and Importance”, Thesis, Wageningen University, Netherlands.
Vella, F., and M. Verbeek, (1998), “Whose Wages Do Unions Raise? A Dynamic Model of
Unionism and Wage Rate Determination for Young Men”, Journal of Applied Econometrics, 13, pp. 163-168.
Wanzala, Maria, Thom Jayne, John Staatz, Amin Mugera, Justus Kirimi, and Joseph Owuor,
(2001), “Fertilizer Markets and Agricultural Production Incentives: Insights from Kenya”,
Tegemeo Institute Working Paper No. 3, Nairobi, Kenya.
Weir, Sharada, and John Knight, (2000), “Adoption and Di¤usion of Agricultural Innovations
in Ethiopia: The Role of Education”, Working Paper, CSAE, Oxford University.
Wekesa, E., Wilfred Mwangi, Hugo Verkuijl, K. Danda, and Hugo De Groote, (2003), “Adoption of Maize Technologies in the Coastal Lowlands of Kenya”, Mexico, D.F.: CIMMYT.
Willis, Robert, (1986), “Wage Determinants: A Survey and Reinterpretation of Human Capital Earnings Functions”, in 0rley Ashenfelter and Richard Layard (eds.), Handbook of
Labor Economics, Amsterdam: North-Holland.
Wooldridge, Je¤rey, (1997), “On Two Stage Least Squares Estimation of the Average Treatment E¤ect in a Random Coe¢ cient Model”, Economics Letters, 56, pp. 129-133.
37
Figure 1
Population Density and Location of Sample Villages
Figure 2
Hybrid Maize Adoption Patterns, by Province
1
0.9
Rift
Valley
Fraction of Seed that is Hybrid
0.8
0.7
Western
Central
Sample
0.6
0.5
Eastern
0.4
Nyanza
0.3
0.2
0.1
0
1995
Coast
1996
1997
1998
1999
2000
Year
2001
2002
2003
2004
2005
Figure 3a
Fraction of HH's Using Inorganic Fertilizer, by Province
1
Fraction of HHs that Use Inorganic Fertilizers
Central
0.9
Western
0.8
Rift Valley
0.7
Eastern
Sample
0.6
0.5
Nyanza
0.4
0.3
0.2
Coast
0.1
0
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
Year
Figure 3b
Real Expenditure on Inorganic Fertilizer, by Province
Real Expenditure on Inorganic Fertilizers
6000
5000
Western
Rift
Valley
4000
3000
Sample
2000
Central
Eastern
1000
Nyanza
Coast
0
1995
1996
1997
1998
1999
2000
Year
2001
2002
2003
2004
2005
Figure 4
Maize Acreage, by Province
3.5
3
Coast
Maize Acreage (Acres)
2.5
Rift Valley
Western
2
Eastern
1.5
Nyanza
1
Central
0.5
0
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
2005
Year
Figure 5
Maize Harvest, by Province
1400
Rift Valley
1200
Western
Maize Harvest (Kg/Acre)
1000
800
Central
Eastern
600
Nyanza
400
Coast
200
0
1995
1996
1997
1998
1999
2000
Year
2001
2002
2003
2004
2005
Figure 6a
0 .1 .2 .3 .4 .5
Marginal Distribution of Yields by Sector, 1997
0
2
4
6
Non-Adopters
8
10
Adopters
Figure 6b
0 .1 .2 .3 .4 .5
Marginal Distribution of Yields by Sector, 2000
0
2
4
6
Non-Adopters
8
10
Adopters
Figure 6c
0
.2
.4
.6
Marginal Distribution of Yields by Sector, 2004
0
2
4
6
Yields (Kg/Acre)
Non-Adopters
8
10
Adopters
Figure 7b
Distribution of Comparative Advantage
Distribution of Returns
1
0
.5
Predicted Return
.2
0
-.2
-.4
Predicted Theta
.4
1.5
Figure 7a
Non-Hybrid Hybrid
Leavers
Joiners
Non-Hybrid Hybrid
Leavers
Figure 7d
Endogenous Hybrid and Fertilizer Use
Distribution of Tau
0
0
2
.2
.4
4
.6
6
Figure 7c
Joiners
-.4
-.2
0
.2
.4
.6
.8
Theta
N
H
-3
-2
-1
0
1
Tau
L
J
N, 1997
H, 1997
2
Table 1: Trends in Yields of Staples
Average Annual % Changes in Yields (Hg/Ha), by Decade
1961-1970
1971-1980
1981-1990
1991-2004
Kenya
Maize
Wheat
0.362
5.646
2.373
2.333
1.169
-3.078
-1.198
0.984
India
Maize
Wheat
Rice
1.502
4.876
0.954
0.842
2.514
1.714
1.900
3.343
3.310
2.572
1.235
0.838
Mexico
Maize
Wheat
2.057
4.586
4.267
3.204
-0.548
-0.255
1.447
1.664
Zambia
Maize
-0.267
10.403
1.571
-1.707
Notes: Surprisingly, in the 1960’s the levels of yields of maize in Kenya, India and Mexico were comparable and very similar (all ranging
between 10,000 and 12,000 Hg/Ha). The levels of yields in Zambia were slightly lower (about 8,000 Hg/Ha) whereas those in Uganda
and Malawi were comparable to those in Kenya.
Source: FAOSTAT Online Database
Table 2a: Summary Statistics, by Sample Year
1997 Sample
2000 Sample
2004 Sample
Yield (Log Maize Harvest Per Acre)
5.907 (1.153)
6.125 (1.092)
6.350 (0.977)
Acres Planted
1.903 (3.217)
2.313 (3.948)
1.957 (2.685)
Total Seed Planted (Kg per Acre)
9.575 (7.801)
9.331 (6.805)
9.072 (6.863)
Total Purchased Hybrid Planted (Kg per Acre)
6.273 (6.926)
5.918 (6.636)
5.080 (5.260)
Total Local Seed Planted (Kg per Acre)
2.653 (6.326)
2.868 (6.129)
3.120 (7.318)
Hybrid (dummy)
0.658 (0.475)
0.676 (0.468)
0.604 (0.489)
Fertilizer (Kg DAP (di-ammonium phosphate) per Acre)
20.300 (38.444)
24.577 (79.919)
24.610 (34.001)
Fertilizer (Kg MAP (mono-ammonium phosphate) per Acre)
1.566 (10.165)
0 (0)
0.308 (4.538)
Fertilizer (Kg CAN (calcium ammonium nitrate) per Acre)
6.473 (24.727)
9.819 (75.081)
8.957 (21.702)
Fertilizer (Kg NPK (nitrogen phosphorous potassium) per Acre)
4.256 (20.096)
2.365 (12.172)
1.217 (7.870)
Total Fertilizer Expenditure (KShs per Acre)
1361.7 (2246.3)
1346.5 (3347.1)
1354.6 (1831.2)
Land Preparation Costs (Kshs per Acre)
2168.1 (4895.7)
813.66 (1222.9)
808.99 (1080.2)
Main Season Rainfall (mm)
620.83 (256.43)
599.10 (264.21)
728.11 (293.29)
Notes: Standard deviations in parentheses. KShs represents Kenyan shillings (the current exchange rate is on the order of 75 shillings to the US dollar).
Henceforth, any variables measured in Kenyan shillings are deflated. The differences in land preparation costs over time are due to a change in how the
data was collected.
Source: TAMPA Project data.
Table 2b: Summary Statistics, by Hybrid/Non-Hybrid Use
1997 Sample
2004 Sample
Hybrid
Non-Hybrid
Hybrid
Non-Hybrid
791
411
726
476
Yield (Log Maize Harvest Per Acre)
6.296 (0.934)
5.158 (1.167)
6.751 (0.692)
5.738 (1.030)
Total Maize Acres Cultivated
1.982 (3.557)
1.753 (2.428)
2.087 (3.029)
1.758 (2.042)
Total Seed Planted (Kg per Acre)
9.669 (6.569)
9.394 (9.750)
8.746 (4.156)
9.569 (9.608)
Fertilizer (Kg DAP per Acre)
28.755 (44.115)
4.028 (13.266)
37.148 (37.294)
5.488 (13.909)
Fertilizer (Kg CAN per Acre)
9.087 (29.715)
1.442 (7.152)
12.708 (24.961)
3.235 (13.622)
Land Preparation Costs (KShs/Acre)
2006.0 (1316.2)
2480.1 (8168.4)
918.44 (1095.2)
642.06 (1036.1)
Expenditure on Fertilizer (KShs/Acre)
1922.3 (2542.9)
282.64 (740.53)
1893.3 (1964.7)
533.09 (1211.4)
Inorganic Fertilizer Use (Dummy)
0.7421 (0.4378)
0.2311 (0.4221)
0.8994 (0.3009)
0.4055(0.4915)
Main Season Rainfall (mm)
651.70 (228.82)
561.44 (293.88)
825.41 (215.20)
579.69 (332.05)
Hired Labor (Hours)
109.98 (162.79)
95.118 (271.75)
486.15 (884.21)
225.71 (714.16)
Family Labor (Hours)
255.10 (261.69)
353.73 (460.58)
146.66 (163.03)
115.64 (135.72)
No. of Households
Notes: Standard deviations in parentheses. The mean of yields is higher and the standard deviation of yields is lower in the hybrid sector. The dummy for
inorganic fertilizer use just measures what fraction of households in that sector-year uses inorganic fertilizers. The trends in the land preparation costs
and the labor variables over time are due to a change in how the data was collected.
Source: TAMPA Project data.
Table 2c: Transitions Across Hybrid/Non-Hybrid Sectors
(Just for the Sample Periods 1997, 2000 and 2004)
Transition in Terms of Technology Used
(1997 2000 2004)
Fraction of Sample (%)
(N=1202 Households)
NNN
20.38
NNH
2.83
HNH
6.07
NHH
4.91
HNN
5.99
HNH
3.16
HHN
7.15
HHH
49.50
Notes: This table shows all the possible three period transitions in my sample of farmers, and the fraction of my sample that experiences each of these
transitions. The three periods correspond to 1997, 2000 and 2004. In the first column, the three letters represent the transition history with respect to
technology, where “H” represents the use of hybrid and “N” represents the use of non-hybrid. For example, the transition “N N N” stands for farmers
who used non-hybrid maize in all three periods – they make up about 20.4% of my sample.
This switching behavior is not just measurement error. It is rather systematic. In addition, the survey instrument asks about hybrid use in multiple
sections of the survey (it is a rather large part of household behavior so it is unlikely to be measurement error).
Source: TAMPA Project data.
Table 3: Basic OLS and Fixed Effects Specifications
Dependent Variable is Yields (Log Maize Harvest Per Acre)
OLS, Pooled
OLS, Pooled
OLS, Pooled
FE
FE
1.024 (0.033)
0.731 (0.034)
0.556 (0.035)
0.114 (0.052)
0.149 (0.049)
Acres (x100)
-
-
-0.385 (0.432)
-
-5.471 (0.839)
Seed Kg per Acre (x10)
-
-
0.290 (0.020)
-
0.272 (0.023)
Land Preparation Costs per
Acre (x10,000)
-
-
0.216 (0.048)
-
0.151 (0.054)
Main Season Rainfall (x1000)
-
-
0.688 (0.118)
-
0.913 (0.112)
0.052 (0.006)
-
0.031 (0.007)
Hybrid
Fertilizer per Acre (x1000)
Year = 2000
0.199 (0.039)
0.205 (0.037)
0.272 (0.035)
0.216 (0.033)
0.289 (0.032)
Year = 2004
0.498 (0.039)
0.482 (0.037)
0.403 (0.037)
0.449 (0.037)
0.377 (0.035)
Constant
5.233 (0.035)
4.842 (0.072)
4.195 (0.085)
5.832 (0.042)
-2.240 (4.931)
No
Yes
Yes
-
-
0.228
0.317
0.424
0.071
0.197
Province Dummies
R-Squared
Notes: Standard errors are in parentheses. All regressions have 3600 observations (three periods). Covariates not reported include labor variables and average
long term mean rainfall. Results are almost identical if the sample is restricted to two periods and actual family labor is included as a covariate (for this
case the OLS return to hybrid is 0.540 and the FE return is 0.085). The OLS and household fixed effects specifications run are respectively:
′
′
y it     i  h it  X it   it
y it    h it  X it   it
Source: TAMPA Project data.
Table 4: CRE Model Reduced Forms and Structural Estimates
Dependent Variable is Yields (Log Maize Harvest Per Acre)
Reduced Forms Without Covariates
Yields, 1997
Yields, 2000
Yields, 2004
Reduced Forms With Covariates
Yields, 1997
Yields, 2000
Yields, 2004
Hybrid, 1997
0.574 (0.075)
0.345 (0.084)
0.467 (0.067)
0.444 (0.066)
0.279 (0.080)
0.354 (0.062)
Hybrid, 2000
0.323 (0.081)
0.424 (0.086)
0.230 (0.074)
0.214 (0.070)
0.426 (0.079)
0.171 (0.065)
Hybrid, 2004
0.680 (0.079)
0.490 (0.081)
0.632 (0.069)
0.385 (0.071)
0.344 (0.080)
0.480 (0.066)
OMD
DWMD
EWMD
0.078 (0.053)
0.110 (0.054)
0.121 (0.054)
Standard CRE Model Structural Estimates
λ1
λ2
λ3
Without Covariates:
0.440 (0.055)
0.293 (0.058)
0.572 (0.059)
0.435 (0.055)
0.281 (0.058)
0.563 (0.058)
0.422 (0.056)
0.286 (0.058)
0.560 (0.058)
OMD
DWMD
EWMD
0.144 (0.049)
0.152 (0.050)
0.159 (0.050)
With Endogenous Covariates:
0.317 (0.049)
0.219 (0.049)
0.316 (0.049)
0.211 (0.049)
0.306 (0.051)
0.217 (0.049)
β
0.348 (0.054)
0.350 (0.053)
0.350 (0.054)
χ2 (p-value)
79.33 (0.000)
-
4.037 (0.043)
-
Notes: Standard errors are in parentheses. Reduced forms are estimated with and without covariates (acreage, real fertilizer expenditure, land preparation costs,
seed, labor and rainfall variables). Covariates are treated as endogenous. Results are the same if only two periods with family labor are used.
The reduced form for each t, the projection used to estimate the structural model by minimum distance and the structural model are respectively:
′
y it   t  1t h i1  2t h i2  3t h i3  X i  t   it
y it    h it   i  u it
 i   0   1 h i1   2 h i2  v i
Structural coefficients β and projection coefficients (λ’s) are reported. OMD, EWMD and DWMD are optimal (weighted by inverted reduced form
variance-covariance matrix), equally weighted and diagonally weighted (only the diagonal elements from the OMD weight matrix) minimum distance
respectively. The χ2 statistic on the overidentification test is the value of the OMD minimand.
Source: TAMPA Project data.
Table 5: Selection
Returns by Hybrid History (Joiners, Leavers and Stayers)
Dependent Variable is Yields (Log Maize Harvest Per Acre)
Variable
1997-2000
1997 Yield
2000 Yield
2000-2004
2000 Yield
2004 Yield
Without Covariates
1997-2004
1997 Yield
2004 Yield
Hybrid Stayers
1.411
(0.070)
1.109
(0.070)
1.132
(0.067)
1.172
(0.057)
1.505
(0.066)
1.280
(0.056)
Leavers
0.746
(0.112)
0.281
(0.110)
0.426
(0.095)
0.332
(0.081)
0.809
(0.094)
0.648
(0.079)
Joiners
0.563
(0.105)
0.430
(0.104)
0.403
(0.127)
0.693
(0.108)
1.007
(0.114)
0.883
(0.096)
With Covariates
Hybrid Stayers
0.766
(0.073)
0.637
(0.075)
0.719
(0.073)
0.593
(0.061)
0.868
(0.074)
0.652
(0.063)
Leavers
0.466
(0.097)
0.044
(0.100)
0.367
(0.086)
0.159
(0.067)
0.509
(0.085)
0.350
(0.069)
Joiners
0.316
(0.089)
0.335
(0.094)
0.258
(0.115)
0.339
(0.092)
0.491
(0.101)
0.499
(0.084)
Notes: Standard errors are in parentheses. In each year, the following regression is run both with and without the standard set of Xi covariates used previously:
′
y i     1 h i11   2 h i10   3 h i01  X i   u i
where hi11 is the dummy indicating that farmer i is a hybrid stayer (plants hybrid in both periods), hi10 indicates he is a leaver (plants hybrid the first
year, not the second), hi01 indicates he is a joiner (plants hybrid the second year, not the first). The coefficients reported are µ1, µ 2 and µ 3.
Source: TAMPA Project data.
Table 6: Heterogeneity by Observables
Returns in the Hybrid/Non-Hybrid Sector
Dependent Variable is Yields (Log Maize Harvest Per Acre)
Variable
OLS, with Covariates
FE, with Covariates
Hybrid
Non-Hybrid
Hybrid
Non-Hybrid
Acreage (x 100)
0.784
(0.438)
-5.427
(1.125)
-4.353
(0.917)
-8.360
(2.013)
Total Seed Planted (Kg per Acre)
(x 100)
3.418
(0.279)
2.176
(0.517)
2.854
(0.323)
2.397
(0.401)
Land Preparation Costs per Acre (KShs)
(x 10,000)
0.487
(0.129)
0.216
(0.031)
0.523
(0.170)
0.151
(0.068)
Total Fertilizer Expenditure per Acre
(KShs) (x 10,000)
0.430
(0.055)
1.855
(0.305)
0.245
(0.064)
1.176
(0.534)
Main Season Rainfall (mm)
(x 10,000)
3.879
(1.399)
12.286
(2.261)
9.297
(1.416)
12.706
(2.599)
0.466
(0.043)
Average Return (β) when Returns Vary by
Observables (evaluated at mean X’s)
Number of Observations
2330
0.135
(0.053)
1276
2330
1276
Notes: Standard errors in parentheses. The following specification is estimated separately for the sample of farmers who use hybrid and non-hybrid maize:
j
j
y it   j  X ti j   it
j ∈ H, N
The covariates included that are not reported are labor variables, province dummies, year dummies and the average main seasonal rainfall.
The results are similar if the sample is restricted to only 1997 and 2004 and family labor included as a covariate (average β for the OLS case is 0.473).
Source: TAMPA Project data.
Table 7: Two Period Basic Comparative Advantage CRC Model Reduced Form Estimates
Dependent Variable is Yields (Log Maize Harvest Per Acre)
Reduced Forms Without Covariates
Reduced Forms With Covariates
Yields, 1997
Yields, 2004
Yields, 1997
Yields, 2004
Hybrid, 1997
0.947 (0.113)
0.738 (0.092)
0.854 (0.104)
0.606 (0.087)
Hybrid, 2004
1.011 (0.117)
0.832 (0.102)
0.753 (0.109)
0.723 (0.098)
Hybrid 1997*Hybrid 2004
-0.450 (0.150)
-0.352 (0.126)
-0.383 (0.135)
-0.356 (0.111)
Acres (x10)
-
-
0.159 (0.060)
0.095 (0.056)
Hired Labor per Acre (x 1,000)
-
-
0.019 (0.015)
0.011 (0.002)
Family Labor per Acre (x 10,000)
-
-
0.108 (0.046)
0.556 (0.117)
Seed Kg per Acre (x 10)
-
-
0.251 (0.052)
0.278 (0.092)
Land Preparation Cost per Acre (x 1,000)
-
-
0.019 (0.004)
0.052 (0.018)
Main Season Rainfall per Acre (x 1000)
-
-
0.834 (0.121)
0.335 (0.102)
Fertilizer Expenditure per Acre (x 10,000)
-
-
0.868 (0.197)
0.666 (0.158)
0.297
0.289
0.419
0.433
R-Squared
Notes: Standard errors are in parentheses. Reduced forms are estimated with and without covariates (all covariates reported). Reduced forms are for the case
where all covariates are exogenous: only hybrid is correlated with the comparative advantage, θi. See Table 7b for projection and structural estimates.
The reduced form equations run are as follows for 1997 and 2004 respectively:
y i1   1  1 h i1  2 h i2  3 h i1 h i2   i1
Source: TAMPA Project data.
y i2   2  4 h i1  5 h i2  6 h i1 h i2   i2
Table 8a: Two Period Basic Comparative Advantage CRC Model Structural Estimates
Dependent Variable is Yields (Log Maize Harvest Per Acre)
With Only Hybrid Endogenous
EWMD
Without Covariates
DWMD
OMD
EWMD
With Covariates
DWMD
OMD
λ1
0.786 (0.087)
0.775 (0.088)
0.743 (0.091)
0.648 (0.082)
0.635 (0.083)
0.596 (0.089)
λ2
0.963 (0.112)
0.956 (0.111)
1.030 (0.119)
0.711 (0.104)
0.702 (0.103)
0.772 (0.112)
λ3
-5.417 (33.41)
-4.367 (20.37)
-1.622 (0.770)
-1.098 (1.323)
-1.037 (1.079)
-0.930 (0.272)
β
3.033 (18.33)
2.450 (11.11)
1.008 (0.295)
0.804 (0.337)
0.765 (0.224)
0.733 (0.062)
φ
-1.104 (0.766)
-1.130 (0.754)
-1.558 (0.562)
-1.740 (2.299)
-1.776 (2.219)
-1.966 (0.926)
χ21 (p-value)
-
-
71.62 (0.000)
-
-
63.75 (0.000)
Notes: Standard errors are in parentheses. The reduced forms for these estimates are reported in Table 7 (for the case where only the hybrid decision is
endogenous, i.e. correlated with the θi’s). The projection used in this model is:
 i   0   1 h i1   2 h i2   3 h i1 h i2  v i
The minimum distance estimations are run for both cases with and without covariates (all the covariates are reported in Table 7 above). The structural
coefficients reported are the average return to hybrid (β), the comparative advantage coefficient (ф), and all the projections coefficients (the λ’s).
OMD, DWMD and EWMD are optimal weighted (the weight matrix is the inverted reduced form variance-covariance matrix), diagonally weighted (the
weight matrix is the OMD weight matrix with the off diagonal elements set to zero) and equally weighted minimum distance respectively.
The χ2 statistic on the overidentification test is the value of the OMD minimand.
Source: TAMPA Project data.
Table 8b: Two Period Comparative Advantage CRC Model Structural Estimates
Dependent Variable is Yields (Log Maize Harvest Per Acre)
With Both Fertilizer and Hybrid as Endogenous
Projection:
 i   0   1 h i1   2 h i2   3 h i1 h i2   4 h i1 f i1   5 h i2 f i1   6 h i1 h i2 f i1   7 h i1 f i2   8 h i2 f i2   9 h i1 h i2 f i2   10 f i1   11 f i2  v i
EWMD
Without Covariates
DWMD
OMD
EWMD
With Covariates
DWMD
OMD
β
0.003 (0.544)
0.609 (0.142)
-0.701 (1.665)
-0.053 (1.890)
0.045 (0.485)
-0.717 (0.065)
φ
-0.468 (2.414)
-2.102 (2.078)
-0.824 (0.290)
-0.783 (1.110)
-0.547 (0.830)
-1.806 (0.458)
χ28 (p-value)
-
-
64.70 (0.000)
-
-
110.3 (0.000)
With a Joint Fertilizer-Hybrid Decision
Projection:
EWMD
 i   0   1 t i1   2 t i2   3 t i1 t i2  v i
Without Covariates
DWMD
OMD
EWMD
With Covariates
DWMD
OMD
β
1.216 (5.632)
1.026 (2.735)
0.799 (0.059)
0.714 (1.072)
0.645 (0.697)
0.552 (0.052)
φ
-1.180 (1.509)
-1.239 (1.429)
-1.707 (0.770)
-1.241 (0.889)
-1.276 (0.867)
-1.566 (0.527)
χ21 (p-value)
-
-
55.73 (0.000)
-
-
50.41 (0.000)
Notes: Standard errors are in parentheses. Reduced forms are not reported. Upper panel reports structural estimates of the average return to hybrid (β) and the
comparative advantage coefficient (ф) for the model with endogenous hybrid and fertilizer use. Lower panel shows the results from a model where a
farmer is in the technology sector if he is using both hybrid and fertilizer and not otherwise (above, tit represents the technology sector). The reduced
forms and structural coefficients are as in Table 8a, with only the average return to hybrid (β) and the comparative advantage coefficient (ф) reported.
Covariates include acreage, land preparation, seed, labor, and rainfall variables, and the total real expenditure on fertilizer.
Source: TAMPA Project data.
Table 9: Three Period CRC Model Structural OMD Estimates
Dependent Variable is Yields (Log Maize Harvest Per Acre)
 i   0   1 h i1   2 h i2   3 h i3   4 h i1 h i2   5 h i1 h i3   6 h i2 h i3   7 h i1 h i2 h i3  v i
Projection:
Hybrid Decision
Joint Hybrid-Fertilizer Sector Decision
Without Covariates
With Covariates
Without Covariates
With Covariates
λ1
0.661 (0.076)
0.554 (0.072)
0.735 (0.084)
0.625 (0.078)
λ2
0.510 (0.084)
0.407 (0.078)
0.540 (0.103)
0.511 (0.104)
λ3
0.636 (0.104)
0.587 (0.108)
0.722 (0.098)
0.630 (0.095)
λ4
-0.583 (0.108)
-0.435 (0.116)
-0.382 (0.165)
-0.413 (0.149)
λ5
-0.417 (0.136)
-0.415 (0.134)
-0.335 (0.139)
-0.372 (0.142)
λ6
-0.997 (0.169)
-0.905 (0.178)
-0.253 (0.152)
-0.293 (0.145)
λ7
0.340 (0.163)
0.323 (0.167)
0.194 (0.207)
0.325 (0.197)
β
0.966 (0.050)
0.857 (0.063)
0.016 (0.056)
0.049 (0.059)
φ
-2.382 (0.483)
-2.087 (0.437)
-0.056 (0.155)
-0.212 (0.199)
Notes: Standard errors in parentheses. All specifications reported above in this table include the following set of covariates: acreage, real fertilizer expenditure,
land preparation costs, seed, labor and rainfall variables (all treated as exogenous). In the case of the Hybrid Decision, the analysis is identical to that in
Tables 7 and 8a (except for three periods, 1997, 2000 and 2004). In case of the Joint Hybrid-Fertilizer Decision, the analysis is identical to that in the
lower panel of Table 8b (the technology sector choice is defined the same way). Only OMD estimates are reported (EWMD and DWMD estimates are
not significantly different from zero). These results should be interpreted with caution (covariates do not include family labor).
Source: TAMPA Project data.
Table 10: Heckit and Treatment Effect Estimates under Non-Random Assignment (ATE, TT, MTE, LATE)
Dependent Variable is Yields (Log Maize Harvest Per Acre)
Year
Heckman Two-Step Estimates
Selection Correction λ
Selection Correction λ
Hybrid Sector
Non-Hybrid Sector
Implied Treatment Effects
ATE
TT
MTE Slope
1997
-0.833 (0.170)
1.034 (0.724)
2.174
1.138
-1.867 (0.743)
2000
-0.847 (0.118)
-0.615 (0.353)
0.768
0.898
-0.232 (0.372)
2004
-0.890 (0.183)
-0.005 (0.156)
1.331
1.014
-0.885 (0.240)
IV (LATE) Estimates (Conditional on Covariates)
First Stage: Effect of Distance on Probability of Using Hybrid
Second Stage: Effect of Predicted Hybrid on Yields
(coefficient is reported x100)
-0.525
(0.097)
1.452
(0.419)
Notes: Standard errors are in parentheses. All regressions control for the same set of covariates as previously. The first two upper panel columns show (for
each year) the two-step selection corrected estimates of coefficients on the Inverse Mills ratio for hybrid and non-hybrid sectors from:
′
′
′
′
y Hi  X i  H   H Z i /Z i 
h i  Z i   u si
and
′
′
′
y Ni  X i  N   N Z i / 1 − Z i 
The third column reports the average treatment effect accounting for the selection (it takes the predicted yields from the second step in each sector
subtracts out the effect of the selection term and reports the difference in this predicted value across the two sectors). The treatment on the treated is
reported next which looks at the difference in the second step predicted yields (including the selection terms) between the two sectors only for the
sample of farmers who plant hybrid. Finally, the MTE slope is just the difference in the λ coefficients for the hybrid and non-hybrid selection terms (the
difference between coefficients reported in the first two columns). ATE, TT, MTE are all evaluated at the mean Xi’s. The lower panel reports the IV
estimates which use distance to closest fertilizer supplier (not where fertilizer is purchased) as an IV for the hybrid decision in the second stage:


′
′

y it    h it  X i   u it where h it   0  Dist  X i 
IV results are roughly the same if excluded instruments are distance interacted with wealth quantiles, controlling for distance and wealth in both stages.
Results are similar for the pooled data and for the period by period data on the IV.
Source: TAMPA Project data.
Table 11: Correlates of Estimated θi’s
Dependent Variable is the Estimated θi’s
(1)
(2)
(3)
(4)
(5)
Distance to Closest Fertilizer Seller (x100)
-0.544 (0.081)
-0.521 (0.081)
-0.511 (0.081)
-0.521 (0.081)
-0.497 (0.082)
Distance to Motorable Road (x100)
-0.737 (0.344)
-0.742 (0.341)
-0.726 (0.342)
-0.741 (0.342)
-0.724 (0.342)
Distance to Tarmac Road (x100)
-0.308 (0.069)
-0.259 (0.069)
-0.272(0.070)
-0.259 (0.069)
-0.293 (0.071)
Distance to Matatu Stop (x100)
-0.135 (0.207)
-0.158 (0.206)
-0.137 (0.206)
-0.158 (0.206)
-0.142 (0.207)
Distance to Extension Services (x100)
-0.399 (0.117)
-0.358 (0.116)
-0.347 (0.117)
-0.358 (0.116)
-0.341 (0.117)
Tried to Get Credit (x10)
-
-
0.135 (0.104)
-
-
Tried but Didn’t Receive Credit (x10)
-
-
-
0.079 (0.199)
-
Received Credit (x10)
-
-
-
-
0.180 (0.109)
Dummies for Household Head Education
(P value on Joint Significance)
No
Yes
(0.000)
Yes
(0.000)
Yes
(0.000)
Yes
(0.000)
Province Dummies
Yes
Yes
Yes
Yes
Yes
Year Dummies
Yes
Yes
Yes
Yes
Yes
Notes: Standard errors are in parentheses. This table reports the correlations of the predicted θi’s with various observables in my dataset, namely distance from
the household to the closest fertilizer seller, fraction of adult household members with no education, distance to the closest tarmac road, distance to the
closest matatu (public transport) stop, distance to the closest motorable road, a dummy for whether the household tried to get credit, a dummy for
whether the household received any credit, dummies for education of the household head, province and year dummies, depending on the specification.
The predicted θi’s used are from the two period case with endogenous hybrid and fertilizer use (last column of the upper panel of Table 8b).
Source: TAMPA Project data.