Forecasting When it Matters: Evidence from Semi-Arid India

Forecasting When it Matters:
Evidence from Semi-Arid India
Xavier Giné∗
World Bank
Robert M. Townsend
MIT
James Vickery
FRB of NY
This draft: October 2014
Abstract
In regions of India where cultivation is rainfed, the optimal time to plant coincides with the onset of the monsoon. We elicit from farmers their prior beliefs
about the timing of the monsoon and assess their accuracy by comparing them to
historical data. We find substantial heterogeneity in beliefs and accuracy. More
importantly, elicited beliefs can explain differences in observed behavior. We then
investigate the sources of inaccuracy and find that they fit well the predictions of
a simple model of costly acquisition of information: poorer farmers whose income
depends more on the monsoon are more accurate. Finally, we show that accuracy
leads to average income gains of 8 to 9 percent of agricultural production.
Keywords: Information Acquisition, Expectations, Forecasting, Climate.
JEL Codes: D81, D84, O13, Q54.
∗
Email addresses: [email protected], [email protected], and [email protected].
Financial support from CRMG, GARP and World Bank is gratefully acknowledged. We want to thank
Kumar Acharya, the survey team at ICRISAT and specially Dr. K.P.C Rao for many discussions about
agricultural practices in semi-arid India and for their efforts in collecting the survey data. Dr. Piara Singh
from ICRISAT provided valuable help in computing rainfall absorption of different soil types. We also
thank Jean Marie Baland, Abhijit Banerjee, Shawn Cole, Adelaine Delavande, Quy-Toan Do, Pascaline
Dupas, Michael Kremer, Lance Lochner, Hanan Jacoby, David McKenzie Mark Rosenzweig, Chris Udry
and participants in several conferences and seminars for valuable comments. Paola de Baldomero Zazo,
Helene Bie Lilleor, Francesca de Nicola and Joan Pina Martı́ provided outstanding research assistance.
The views expressed in this paper are the authors’ and should not be attributed to the World Bank,
Federal Reserve Bank of New York or the Federal Reserve System.
1
Uttara chusi, yattara gampa.
Wait for the Uttara rains; if they don’t come, leave the place.
Telugu Proverb
1
Introduction
Weather risk is a major source of income fluctuations for rural households in developing
countries. Rosenzweig and Binswanger (1993), for example, find that the delay of the
monsoon in semi-arid India can have considerable negative effects on agricultural yield
and profits.
With complete and frictionless financial markets, households would be able to protect
consumption from weather shocks, but because insurance markets in developing countries
are typically limited, households rely on the ex-ante and ex-post risk coping strategies
that typically trade expected profits for lower risk (Walker and Ryan, 1990; Morduch,
1995).
One such strategy, particularly relevant if agriculture is rainfed, is the choice of an
optimal time to plant (Fein and Stephens, 1987; Rao et al., 2000 and Gadgil et al., 2002).
In semi-arid India the main growing season runs from June to November, coinciding with
the monsoon. Because the onset of the monsoon varies from year to year, deciding when to
plant involves significant judgement. If farmers plant after the first rains, but subsequent
rains are scattered and fall several days apart, the seeds may not germinate. Farmers
are then forced to either replant or to abandon the crop altogether, both resulting in
significant losses. On the other hand, being too conservative by postponing planting until
one is certain that the monsoon has arrived is also costly, because yield will typically be
lower (Bliss and Stern, 1982; Fafchamps, 1993; Rao et al., 2000 and Singh et al., 1994).
In short, when the first rains of the season come, farmers must assess whether these
are just early pre-monsoon rains, in which case they should postpone planting, or whether
the rains signal the onset of the monsoon, in which case they should plant immediately.
2
Our data provide direct evidence that mistakes about the precise timing of the monsoon are costly. In 2006, about a quarter of our sample had replanted in the past and a full
73 percent had abandoned the crop at least once in the prior 10 years due to poor rains.
In addition, the extra expenses required to replant are large, equivalent to 20 percent
of total average production expenses. Thus, it seems that farmers would benefit greatly
from having accurate priors of the onset of the monsoon.
In this paper, we elicit the farmers’ priors about the onset of the monsoon and study
whether these beliefs are similar across farmers and accurate. The survey covers 1,054
farming households living in 37 villages of two districts in Andhra Pradesh, India. Experimental methods are used to first elicit the respondent’s subjective definition of the
onset of the monsoon, and then its subjective calendar distribution. To elicit the subjective definition, each farmer reports the minimum depth of soil moisture that he would
require to start planting. This measure is then converted into a quantum of rainfall using
the absorption capacity of the respondent’s specific soil type. The subjective calendar
distribution is obtained by giving each respondent 10 stones and a sheet of paper with
boxes corresponding to 13-14 day calendar periods called “kartis”.1 The respondent is
then instructed to distribute the 10 stones across the different boxes according to the
likelihood that the monsoon would start in the calendar period indicated by each box (see
Delavande et al. 2011a for a review of these methods).
This elicitation method delivers prior subjective expectations that are quite heterogeneous, even within villages. To understand the implications of this heterogeneity, we
build a simple model that combines heterogeneity in beliefs, irreversible investment and
costly information acquisition.
Consistent with the data, this model predicts that in villages where the monsoon onset
is more volatile, farmers will plant later. Intuitively, if initial rains are not very informative
about the onset of the monsoon, the option value of waiting for more information before
1
Kartis or naksatra in Sanskrit are based on the traditional solar calendar (Fein and Stephens, 1987;
Gadgil et al., 2002).
3
planting is higher.
Second, farmers who believe the monsoon will start later are also more likely to plant
later. Likewise, they tend to replant less and to purchase a lower share of total production
inputs before the onset of the monsoon. All of these findings, also consistent with the
model predictions, provide strong evidence that individuals make decisions according to
their prior expectations, even when controlling for self-reported proxies of risk aversion,
discount rates and the actual start of the monsoon.
The model also predicts that the accuracy of farmers’ priors are dictated by a simple
cost-benefit analysis where accuracy depends on the importance of the event. We find
that accurate farmers are poorer, with more rainfall dependent income and more likely to
be credit constrained. Put differently, accurate farmers are less able to cope with weather
risk and have livelihoods that are more dependent on the monsoon. Thus, accuracy is
rationally explained by how relevant the event is to the forecaster (Brunnermeier and
Parker, 2005), rather than by heuristics or “rules of thumb” developed in the psychology
and behavioral economics literature (Kahneman, Slovic and Tversky, 1982; Rabin, 1998
or DellaVigna, 2007 for a review).
Finally, we show that inaccuracy about the onset of the monsoon is costly. In particular, we find that an improvement in our measure of accuracy from the 25th percentile
of the distribution to the 75th percentile would increase gross agricultural income by 8 to
9 percent. Similarly, accurate farmers are less likely to replant, perhaps due to the fewer
planting mistakes they make.
More generally, because differences in behavior can be explained by differences in
expectations independently of differences in preferences, these results warrant the use of
expectations data to predict behavior.
In this sense, the paper contributes to a growing literature that measures expectations
and its impacts on various outcomes (see Manski, 2004 for an excellent review; Norris and
Kramer, 1990 for an early review of elicitation methods applied to agricultural economics
4
or more recently, Attanasio, 2009; Delavande et al. 2011a and Delavande, 2013 that also
review the literature in developing countries and advocate the collection of expectations
data).2
More importantly, we follow the literature in eliciting expectations about an event very
familiar to respondents but we are the first to use historical data to assess heterogeneity
in accuracy and its welfare impacts.
The rest of the paper is organized as follows. The next section presents a simple model
and a set of testable predictions, Section 3 describes the context and the data collected,
Section 4 takes the model predictions to the data and finally Section 5 concludes.
2
A Simple Model of Costly Information Acquisition
Consider a risk-neutral farmer that at the beginning of the growing season receives some
rainfall. Since this rainfall is informative of the onset of the monsoon, the farmer must
decide whether to plant a rainfed crop immediately, or whether to wait instead. Because
all farmers receive the same signal, there is no benefit to a particular farmer of inferring
the signal of others.
The basic tradeoff here is identical to that found in the literature on irreversible
investments. The benefit of waiting is that it allows the farmer to make a more informed
planting decision. This is valuable because if the monsoon has not yet arrived and the
farmer planted, the seeds would not germinate. The cost of waiting, however, is that
planting early increases total agricultural output if the monsoon did arrive.
In what follows, we first lay out the assumptions and then we characterize the opti2
Luseno et al. (2003) and Lybbert et al. (2005), for example, study the extent to which cattle
herders in Kenya update their priors on rainfall expectations in response to new information. They
find significant updating but little change in behavior after the updating, possibly due to the flexible
nature of their income generating activities with respect to weather changes, as herds can be moved to
greener pastures. McKenzie et al. (2007) finds that first time intending migrants tend to under-estimate
the prospects of finding a job abroad and the incomes they could earn, when compared to the actual
experience of as similar group of migrants. [NEED TO FINISH LIT. REVIEW]
5
mal planting strategy as a function of the informativeness of the initial rains. Finally,
we introduce inaccurate priors and the possibility of devoting resources (attention) to
increasing accuracy to minimize planting mistakes.
Timing of events. The timing of events is shown in Figure 1 below. There are two
periods. At the beginning of period 1, the farmer observes an initial amount of rain r,
which can take on three values rH , rM or rL , rH > rM > rL . The farmer then chooses to
either plant the rainfed crop or wait. The monsoon can arrive at the end of period 1 or in
period 2. If the monsoon arrives at the end of period 1, the state s is “Onset” (s = O),
while if it arrives in period 2, the state s is “Delay” (s = D). The rains are informative
of the state in that state s is correlated with initial rains r. We assume that planting in
period 2 is still profitable, and so farmers that did not plant in period 1, will do so in
period 2. At the end of period 2, the farmer receives income which depends on the state
and the timing of planting.
Output. If the farmer decides to plant in period 1, he obtains gross income y if the
monsoon arrives at the end of period 1 (s = O) and 0 if the monsoon is delayed and
arrives in period 2 (s = D). Planting costs are c > 0. If the farmer waits until period 2
to plant, he obtains net income y irrespective of the state. We assume that y − c > y,
reflecting the cost of waiting to plant in period 2 when the state is s = O, or the cost of a
delayed monsoon if s = D. Net income y satisfies y > 0 because by assumption it is still
optimal to plant in period 2.
Correlation between signal r and state s. The probability of observing signal rj ,
j = H, M, L conditional on state s, s = O or D is given by the following table:
Pr(rj |s)
rH
rM
rL
O
3+2δ
9
3+δ
9
1−δ
3
D
1−δ
3
1−δ
3
1+2δ
3
The parameter δ measures the degree to which signal rj is informative.
6
The unconditional probability of receiving signal rH can be written as
1 + 2δ
πH =
µ+
9
1−δ
3
(1 − µ),
where µ denotes the farmer’s prior that the monsoon arrives at the end of period 1 (state
s = O). Using Bayes Law, one can write the posterior probability of the state s = O
conditional on observing signal rH as
Pr(s = O|rH ) =
Pr(rH |s = O)µ
=
πH
1+
1
3(1−δ) 1−µ
3+2δ µ
,
and analogous expressions can be derived for Pr(s = O|rM ) and Pr(s = O|rL ).
2.1
Optimal Planting Decision
The decision to plant in period 1 depends on prior µ and signal informativeness δ. In
particular, the expected utility of planting in period 1 conditional on observing signal rj ,
j = H, M, L can be written as
Ey(Plant|rj ) = Pr(s = O|rj )(y − c) + Pr(s = D|rj )(−c) = Pr(s = O|rj )y − c.
Similarly, the expected utility of waiting in period 1, conditional on observing the same
signal rj is
Ey(Wait|rj ) = y.
Let ∆j be the difference in expected utility between planting and waiting in period 1
when signal rj is received. Thus, if the gains from planting are positive, ∆j > 0, it is
optimal to plant. Using the conditional probabilities defined above, we can write ∆H as
a function of δ as
∆H (δ) =
y
1+
3(1−δ) 1−µ
3+2δ µ
7
− c − y.
Similar expressions obtain for ∆M and ∆L .3 If is easy to show that ∆H (δ) and ∆M (δ)
are increasing in signal informativeness δ while ∆L (δ) is decreasing. In addition, when
the signal is uninformative (δ = 0), then
∆H (0) = ∆M (0) = ∆L (0) = µy − c − y,
and when it is conclusive (δ = 1), then
∆H (1) = ∆M (1) = y − c − y
and ∆L (1) = −c − y.
To make things interesting, we assume that ∆j (0) < 0 for j = H, M, L so that whenever
rL is observed, it is always optimal to wait, irrespective of the signal received or signal
informativeness δ.4
Figure 2 plots the functions ∆j (δ) for j = H, M, L. The optimal planting strategy for
the farmer when the signal is rH (rM ) is to plant, as long as δ ≥ δH (δ ≥ δM ). If rL is
received, the farmer should wait because ∆L (δ) < 0 always. The cutoff values δH and δM
are defined by ∆H (δH ) = 0 and ∆M (δM ) = 0 and can be written as5
δH =
3(c + y − µy)
2µ(y − c − y) + 3(1 − µ)(c + y)
and δM =
3(c + y − µy)
.
µ(y − c − y) + 3(1 − µ)(c + y)
Thus, signal informativeness δ plays an important role in the optimal planting strategy,
as stated in the following proposition:
Proposition 1 The amount of rain required to plant in period 1 is (weakly) decreasing
in signal informativeness δ.
To see this, consider first the case when δH < δM < δ. Since in this case both
3
4
In particular, ∆M (δ) =
y
1+
3(1−δ) 1−µ
3+δ
µ
− c − y and ∆L (δ) =
c+y
y
1+
(1+2δ) 1−µ
1−δ
µ
− c − y.
This assumption is satisfied as long as µ < y .
5
It is easy to show that 0 < δH < δM < 1, because by assumption y > c + y and µy < c + y.
8
∆H (δ) > 0 and ∆M (δ) > 0, the farmer will plant if either rM or rH is received. Then,
when signal informativeness is lower at δH < δ < δM , the farmer will only plant if signal
rH is received. Finally, if signal informativeness is even lower at δ < δH < δM , the farmer
will always wait, even after receiving rH .
In addition, since the cutoff values δH and δM are decreasing in µ, a farmer with higher
prior µ is more likely to plant in period 1 for all levels of signal informativeness δ because
the probability that δ > δj , j = H, M , which guides the decision to plant is higher. We
state this prediction in the next proposition.
Proposition 2 The probability of planting in period 1 is (weakly) increasing in the prior
probability µ of state s = O.
We now use the optimal planting strategy to characterize expected net profits EY .
We know that if δ < δH , then it is never optimal to plant, if δH ≤ δ < δM , then it is only
optimal to plant if rH is observed and if δ ≥ δM , then it is optimal to plant when either
rM or rH is observed. Therefore, expected net profits can be written as



y


EY =
πH Ey(Plant|rH ) + (1 − πH )y = y + πH ∆H



 y+π ∆ +π ∆
H
H
M
M
for δ < δH
for δH ≤ δ < δM
(1)
for δM ≤ δ
Since ∆j > 0, j = H, M are increasing in signal informativeness δ, so are expected
profits EY .
2.2
Accuracy of Priors
Suppose now that priors differ from the objective probability of the state of nature.
Farmers hold priors µ
b which are different from µ and are therefore prone to mistakes,
that is, they may plant when they should have waited and they may wait when they
should have planted. More formally, the cutoffs that farmers use to decide whether to
9
plant δj (b
µ) will be different from δj (µ). For example, the optimal response to signal rL is
always to wait because the cutoff δL , (computed similarly to δH and δM as ∆L (δL ) = 0)
is always negative:
δL (µ) =
µy − c − y
< 0,
µ(y − c − y) + 2(1 − µ)(c + y)
because by assumption µy < c + y. However, if the farmers inaccurate priors µ
b are large
enough that µ
by > c+y, then δL (b
µ) would be positive and thus farmers with δ < δL (b
µ) ≡ δbL
would be induced to plant even when receiving rL .
In this context, expected profits EY from the rainfed crop can be written as
EY
h
i
b
b
= πH Pr(δ ≥ δH )Ey(Plant|rH ) + 1 − Pr(δ ≥ δH ) y
h
i
+ πM Pr(δ ≥ δbM )Ey(Plant|rM ) + 1 − Pr(δ ≥ δbM ) y
h
i
+ πL Pr(δ ≤ δbL )Ey(Plant|rL ) + 1 − Pr(δ ≤ δbL ) y
which simplifies to
EY = y + πH Pr(δ ≥ δbH )∆H + πM Pr(δ ≥ δbM )∆M + πL Pr(δ ≤ δbL )∆L .
(2)
It is informative to compare expected profits EY with accurate priors in (1) with
expected profits with inaccurate priors in (2) above. The main difference is that Pr(δ ≥
δbj ), j = H, M and Pr(δ ≤ δbL ) appear in (2) but not in (1), because with accurate priors,
Pr(δ ≥ δj ) j = H, M is either 0 or 1 depending on the signal accuracy δ where the
farmer lives, and Pr(δ ≤ δL ) = 0 always. These probabilities Pr(δ ≥ δbj ), j = H, M
and Pr(δ ≤ δL ) reflect the possibility of planting mistakes, as farmers that should have
planted with certainty may decide to wait, and farmers that should have waited may
decide to plant. The other terms in the expressions (1) and (2), namely the probabilities
πj and gains to planting ∆j , j = H, M, L are identical because they are all evaluated
10
using accurate priors µ.
To be clear, in this paper we only call a “mistake” the decision to plant before the
state is revealed when the farmer would have waited with accurate priors, or viceversa, the
decision to wait when the farmer with accurate priors would have planted. Other ex-post
errors made by farmers with accurate priors when they planted when the monsoon was
delayed (s = D), or when they waited when the monsoon arrived in period 1 (s = O)
are not considered “mistakes” because farmers made the best decision possible before the
state was revealed with accurate information.
As a result, the only source of mistakes are the probability of planting Pr(δ ≥ δbj )
for j = H, M and Pr(δ ≤ δbL ) that can be written in terms of the inaccurate priors µ
b as
Pr(b
µ ≥ µj ) for j = H, M, L, where
µH =
3(c + y)(1 − δ)
,
3(y − δ(c − y)) + 2δ(y − c − y)
µM =
3(c + y)(1 − δ)
3 y − δ(c − y) + δ(y − c − y)
and
µL =
(c + y)(1 + 2δ)
.
(1 − δ)y + 3δ(c + y)
It is easy to show that 0 ≤ µH ≤ µM ≤ µL ≤ 1 and that µL ∈ [
µL > µ because by assumption µ <
2.3
c+y
, 1]
y
always satisfies
c+y
.
y
Attention
We now put a bit more structure into how priors held by farmers µ
b are formed. To keep
things tractable, we assume that µ
b0 = µ + , where µ
b0 are initial priors and follows a
uniform distribution ∼ U [, ]. Farmers are endowed with one unit of attention a that
can be devoted to improving the accuracy of the prior or to another activity whose return
is uncorrelated with the monsoon, like growing an irrigated crop. In particular, we assume
that
µ
b = aµ + (1 − a)b
µ0
so that µ
b = µ + (1 − a).
11
The bounds on can be computed assuming a = 0 and so µ + = 1 or = 1 − µ and
µ + = 0 or = −µ. With these assumptions, the probability of planting becomes
µj − µ
Pr(b
µ ≥ µj ) = Pr(µ + (1 − a) ≥ µj ) = Pr ≥
1−a
We now define cutoffs ā0j and ā1j as follows. If
and if
µj −µ
1−a
≤ = −µ, then a ≥
µj
µ
µj −µ
1−a
.
≥ = 1 − µ, then a ≥
(3)
1−µj
1−µ
= ā0j
= ā1j . Note that for a given signal informativeness δ,
only ā0j or ā1j is relevant, but not both. Since ā0j < 1 if ā1j > 1 and viceversa, the relevant
cutoff will be ā0j or ā1j depending on which one is less than 1. For the case of j = L, only
ā0L is relevant because ā1L > 1 always.
Therefore the probability in (3) can be written as,


0
µj − µ
Pr(b
µ ≥ µj ) = Pr ≥
=
 1−µ−
1−a
or


1
µj − µ
Pr(b
µ ≥ µj ) = Pr ≥
=
 1−µ−
1−a
if a ≥ ā0j
1−µj
1−µ
otherwise
if a ≥ ā1j
1−µj
1−µ
otherwise
depending on which attention cutoff is relevant.
Expected profits still depend on signal informativeness δ as before but also on attention
a. For example, under accurate priors µ, EY = y for δ < δH from expression (1). In
words, it is optimal to always wait (irrespective of the signal) because initial rains are not
very informative. In the case of inaccurate priors, δ < δH implies that µ < µH < µM < µL
and that the relevant attention cutoffs are ā0L < ā0M < ā0H < 1. Expected profits in this
12
case can be therefore be written as



y + πH Pr(b
µ > µH )∆H + πM Pr(b
µ > µM )∆M + πL Pr(b
µ > µL )∆L




 y + π Pr(b
µ > µH )∆H + πM Pr(b
µ > µM )∆M
H
EY (a) =


µ > µH )∆H
y + πH Pr(b




 y
for a ∈ [0, ā0L )
for a ∈ [ā0L , ā0M )
for a ∈ [ā0M , ā0H )
for a ∈ [ā0H , 1].
Since δ < δH , the gains from planting are always negative, ∆j < 0, j = H, L, M .
Clearly, the lower is attention, the lower are expected profits as more negative terms
appear in the expression above. Take for example the case where attention a ∈ [0, ā0L ).
In this case, the probability of making a mistake, that is, of planting when a farmer with
accurate priors would wait is always positive, irrespective of the signal. When attention is
higher, say, a ∈ [ā0L , ā0M ), the farmer observing signal rL will not plant, but may still plant
when signals rH or rM are received, depending on his prior µ
b. When attention is higher at
a ∈ [ā0H , 1], then the farmer always wait irrespective of the signal, because Pr(b
µ ≥ µj ) = 0
for j = H, M, L and thus the farmer makes no mistakes.6
Figure 3 plots expected profits EY as a function of attention a for different levels of
signal informativeness δ. As we have just shown, expected profits are (weakly) increasing
in attention a. The functions have kinks at the different attention cutoffs, and as attention
a increases, the probability of mistakes decreases and expected profits become flatter.
These flat segments are longer as signal informativeness increases. In the extreme when
the signal is conclusive δ = 1, priors and hence accuracy do not matter and expected profits
are constant (see also Figure 5). We state these results in the form of a proposition:
Proposition 3 When priors are inaccurate, more mistakes are made and thus, on average, the expected profits of rain-sensitive crops are lower.
Figure 4 plots expected profits EY as a function of signal informativeness δ for three
6
The appendix contains the expressions for EY for the cases where δH < δ < δM and δM < δ. [To be
written]
13
levels of attention a, namely a = 1 corresponding to accurate priors or µ
b = µ, a = 0.5
and a = 0 corresponding to inaccurate priors.
The expected profits function EY (δ; a = 1) is a step-wise linear function. When
δ < δH , the optimal strategy is to wait regardless of the signal received, and thus net
income does not depend on δ. When δH < δ < δM , it is optimal to plant only if signal rH
is received. In this case, higher signal informativeness results in higher expected income
due to the lower probability of planting in period 1 when the monsoon is delayed (state
s = D). Finally, when δ > δM , it is optimal to plant when either signal rH or rM
is received. In this case, the slope of the function EY (δ; a = 1) is higher than when
δH < δ < δM because the probability of planting in period 1 when the monsoon is delayed
is higher and thus the payoff to signal informativeness is higher.
When a = 0, the function EY (δ; a = 0) is always increasing in signal informativeness
and coincides with EY (δ; a = 1) when the signal is conclusive δ = 1 since priors do
not matter then. However, when a = 0.5, there are two competing forces that shape
EY (δ; 0.5) into a U function for δ < δM . On the one hand, as δ increases, the probability
of planting mistakes – Pr(δ > δbH ) or Pr(δ > δbM ) depending on the signal received – is
increasing where it is optimal to wait, since δ < δH and δ < δM . On the other hand,
expected gains from planting ∆j , j = H, M are also increasing in signal accuracy δ.
Figure 5 plots expected net profits EY (a, δ) as both a function of attention a and
signal informativeness δ.
2.4
Choice of Accuracy
Now let γ be the share of the farmer’s land with access to irrigation that can be used to
grow irrigated crop with profits YI . The farmer will allocate the unit of attention to solve
the following problem:
14
(1 − γ)EY (a; δ) + γYI (1 − a)
maxa
s.t. 0 ≤ a ≤ 1
Expected net profits are in general piecewise convex functions that become flatter as
attention increases, while net profits to the other crop are linear, so the above maximand
may have an interior although it is not differentiable at the attention cutoffs.
Clearly, the larger the returns to the irrigated crop yI or the larger the fraction of
land with irrigation, the lower will attention a be. More in general, farmers with access
to irrigated crops whose yields do not depend on the accuracy of the monsoon, will tend
to have less accurate priors, as stated in the next proposition.
Proposition 4 The less relevant the monsoon is to overall income, the less accurate will
the prior be.
3
Data and Context
This paper uses household survey data and historical rainfall data to test the predictions
of the model. Survey data were collected using a specialized survey that was also designed
to evaluate a rainfall insurance pilot program described in Giné, Townsend and Vickery
(2008) and Cole et al (2013). The survey took place after the 2004 harvest, and covered
1,051 households in 37 villages from two drought prone and relatively poor districts in
the states of Andhra Pradesh and Telangana, India. We resurveyed the same households
in 2006 and 2010, so a few variables are constructed using these later rounds.7
Survey villages are assigned to the closest mandal (county) rainfall gauge, located less
7
The sampling framework used a stratification based on whether the household attended a marketing
meeting and whether the household purchased a deficit rainfall insurance policy in 2004. Within each
strata, households were randomly selected.
15
than 5 miles away. There are five rainfall gauges in Mahbubnagar district and five in
Anantapur district. Table 1 presents the number of villages and households assigned to
each rainfall gauge, the average distance from each gauge to the villages assigned to it
and the number of years with available daily rainfall data.
Table 2 presents summary statistics of the respondents’ household and land characteristics and the variables later used in the analysis.8 The average household head is 47 years
old and has been farming for 25 years. Thirty percent of households have a member older
than 60 who could assist the head in forming expectations about the onset of the monsoon
given the elder’s farming experience.9 The head of household has an average of 3.3 years
of schooling and a literacy rate (measured by the self-reported ability to read and write)
of only 37 percent. Respondents cultivate on average 6.28 acres valued at Rs. 246,000
(USD 5,234) and the value of the households’ primary dwelling averages Rs 69,000 (USD
1,468), thus confirming that the sample consists mostly of smallholder farmers.
Table 2 also includes the percentage of land with different textures, the prevalence of
intercropping and the slope and depth of the plots.10 The bottom of Table 2 reports basic
cropping patterns followed by our sample of farmers. Interestingly, most farmers plant
all the rain sensitive crops at once, instead of planting different plots at different times
as a way to diversify risk. They do so because of the relatively high fixed costs of land
preparation and the prevalence of intercropping that requires that crops be sown at the
same time.
The main cropping season runs from June to November coinciding with the monsoon.
Farmers grow a variety of cash and subsistence crops that vary in their yield’s sensitivity
8
Appendix Table 1 contains a detailed description of how these variables are constructed. All statistics
in Table 2 are weighted by the sampling weights. See Giné, Townsend and Vickery, 2008 for further details.
9
Rosenzweig and Wolpin (1985) develop a theory that explains the predominance of intergenerational
family extension, kin labor and the scarcity of land sales from the fact that elders have specific experiential
knowledge about farming practices.
10
Intercropping is frequently used in the semi-arid tropics and consists of growing two (or more crops)
in the same plot planted at the same time, alternating one or more rows of the first crop with one or more
rows of the second, possibly in different proportions. One crop will typically be shallow rooted, while the
other will be long-rooted, ensuring that crops do not compete for soil nutrients at the same depth.
16
to drought. The main cash crops grown in the area are castor in Mahbubnagar (mostly
in Mahbubnagar, covering 34 percent of the cultivated area), groundnut (covering half
of the cultivated area in Anantapur) and paddy (grown in both disticts and covering 9
percent of cultivated area in the sample). The main subsistence crops are grams and
sorghum (covering less than 10 percent each). All crops with the exception of paddy are
rainfed (less than 5 percent of plots are irrigated). Paddy is almost exclusively irrigated
(84 percent of all plots use groundwater irrigation only available to wealthier farmers).
Table 2 reports that farmers in Anantapur tend to plant later, closer to kartis Punarvasu (July 6) while farmers in Mahbubnagar plant around June 15th. Thus, farmers in
Anantapur plant much later than the normal onset date of June 3rd according to the
Indian Meteorological Department (IMD), drawn in grey in Figure 6.11
According to our survey data, between 1996 and 2006, twenty-two percent of households have replanted at least once, and almost three quarters have abandoned the crops
at some point over the same period. The main reason for abandoning the crop is failure
of the seed to germinate and either lack of capital to purchase additional seeds, or low
expectations of subsequent rains to warrant replanting. In 2006, roughly 3 percent of the
sample replanted some crop, spending an additional Rs 5,000 (USD 86) accounting for 23
percent of total production costs.
All in all, this evidence suggests that farmers could avoid costly mistakes by having
accurate priors about the start of the monsoon.12
11
Murphy and Winkler (1984) describe the field of meteorology as one in which probability forecasts
are very accurate, possibly due to the wealth of available data and the long tradition in forecasting. In
contrast, this finding suggest that the IMD forecast of the onset of the monsoon is less accurate than the
farmer’s expectations, perhaps because the focus of IMD’s forecast is unrelated to the optimal sowing
time that farmers in a specific region should follow.
12
Although what ultimately matters for a good harvest is total accumulated rainfall at key points of
the cropping season, the onset of the monsoon, as defined in the next subsection is correlated with total
accumulated rainfall (p-value is 0.003). Thus, the later the onset, the lower will be accumulated rainfall
during the cropping season.
17
3.1
Onset of Monsoon
We use two definitions of onset of the monsoon. The first definition is based on the
respondent’s self-reported minimum quantum of rainfall required to start planting. The
survey asked “What is the minimum amount of rainfall required to sow?” and also
“What is the minimum depth of soil moisture required to sow?”. Only 10 percent of
farmers provided an answer when asked in millimeters, but all farmers responded using
the depth of soil moisture, so we used the soil’s absorption capacity to convert soil depth
into millimeters of rainfall.13 We label this quantum of rainfall from each respondent, the
“individual definition” of the onset. The second definition is simply the average of the
individual definitions just described in each district. The average onset of the monsoon
in millimeters is 28.95 mm in Mahbubnagar and 33.05 mm in Anantapur.14
Figure 7 shows that there is volatility in the onset of the monsoon. It plots for a
particular rainfall station in each district, the time that it took daily rainfall to accumulate
to the definitions described above each year, measured in days since June 1st. For example,
the left plot shows that according to the historical daily rainfall for the year 2000, it took
33 days for rainfall as measured in the Hindupur gauge to accumulate to 33.05 mm, the
average of the individual definitions in Anantapur district. In the same year (right graph),
it only took 3 days for rainfall measured in Mahbubnagar to accumulate to its district
average.
To be clear, by the time farmers decide when to plant, they may have received numerous signals about the likelihood that the monsoon will arrive in the coming days. If upon
receiving such a signal, farmers knew the onset with certainty and therefore the optimal
13
The calculations for rainfall penetration for various soil textures use the generic values of field capacities under the following assumptions: (i) The top soil (1-1.5 foot) is completely dry, (ii) no runoff occurs
(iii) moisture is primarily determined by the texture and structure of the soil and (iv) evaporation from
soil surface is ignored.
14
Since a given farmer may have plots of differing soil texture, the quantum of rainfall is computed as
a weighted average of a millimeter amount for each soil texture, weighted by plot size. Eighty percent
of cultivable land in Anantapur is red soil, with texture either loamy sand or loam, and the rest is black
soil, with texture either silty clay or clay loam. Mahbubnagar has more of a balance, with 66 percent of
the soil being red. See “Land Characteristics” in Table 2.
18
time to plant, then priors would not matter, since they would simply wait for the signal.
But a long and rich folk tradition of trying to predict the arrival of the monsoon rains
indicate that these signals are noisy at best and that the prior distribution still matters
as it affects the farmers’ conditional expectation.15
Indeed, Figure 8 shows that the onset of the monsoon conditional on the IMD forecast
still displays substantial volatility.16 For a given year, it plots the difference in days
between the actual onset of the monsoon plotted in Figure 2 and the forecast from the
IMD.
Figures 7 and 8 show that there is no trend in the start of the monsoon over time,
consistent with an article by Goswami et al. (2006) that finds that while there is no trend
in the average accumulated rainfall over the season, the frequency of extreme events
such as heavy rains and prolonged droughts increased over the period 1951–2000. Thus,
global warming may have increased the variance of rainfall during the monsoon, but not
necessarily its average nor its onset.17,18
The lack of a time trend in the timing of the monsoon is relevant because if such
a trend existed, recent observations should carry more weight when constructing the
historical distribution.
In addition, both the onset of the monsoon in Figure 7 and the conditional onset of
Figure 8 are more volatile in Anantapur than in Mahbubnagar [report F-test of variances].
In terms of the model, it appears that signal informativeness in Anantapur is lower than
15
This accumulation of indigenous knowledge over thousands of years is reflected in the literature, folk
songs, and proverbs or sayings. For example, farmers use the color of the sky, the shape and color of the
clouds, the direction of the winds, the appearance of certain insects or migratory birds, and so on, to
update the probability that the monsoon has arrived (Fein and Stephens, 1987).
16
The IMD issues a single forecast of the onset of the monsoon over Kerala, around May 15th. We
extrapolate this forecast to other regions adding the normal onset dates plotted in Figure 1 to the forecast
over Kerala.
17
See also Gadgil and Gadgil (2006) and Mall et al. (2006) for further evidence on all-India climatic
trends and a review of their effect on agriculture.
18
When we regress the onset date (in days since June 1st) against a linear and quadratic trend using
year and rainfall station observations in each district (a total 170 observation in Anantapur and 158
in Mahbubnagar), neither the linear nor the quadratic coefficients are significantly different from zero
(results not reported).
19
in Mahbubnagar. In the next section we will assess how farmers in Anantapur respond
to this more erratic environment.
3.2
Consistency of Subjective Distributions
The subjective distribution of the monsoon arrival is obtained experimentally by giving
the respondent 10 stones and a sheet of paper with boxes corresponding to the different
kartis. The respondent is instructed to place all the stones in the different boxes according
to his estimate of the likelihood that the monsoon would start in each period. The question
does not specify any year in particular and should be interpreted as the respondent’s prior
distribution of the onset of the monsoon.
The simple model of the previous section considers only 2 possible dates of monsoon
arrival and thus the prior distribution boils down to a single number µ
b. In practice, the
monsoon can arrive anytime from early June to mid July and thus a multi-date support
is required. More importantly, with the full distribution we can use moments higher than
the mean to assess behavior.
This method of elicitation has been successfully used elsewhere (see Giné and Jacoby,
2014 for a recent example or Delavande et al. 2011a for a review) and here, because every
respondent provided a subjective distribution and no respondent allocated less than the
total 10 stones given. In addition, no respondent allocated a stone in the first or last kartis
of the predefined support, indicating that it did not constrain the respondents’ answers.19
Table 4 provides further evidence of the consistency of subjective distributions as
compared to the historical ones, computed both under the “individual” and “district”
definitions. The lower and upper bound of both the subjective and historical distribu19
One concern is that respondents are constrained by the limited number of stones given, especially in
Anantapur where there is more uncertainty about the start of the monsoon. If that were true, then the
predictions in Anantapur would be less accurate. However, as we show below, respondents in Anantapur
report prior distributions that match more closely the historical distribution. Delavande et al. (2011)
compare the accuracy of eliciting distributions with 10 or 20 stones and concludes that the number of
stones does improve accuracy but only when the number of support points is large.
20
tions are remarkably similar, except perhaps in Anantapur, where respondents seem to
underestimate the range. In addition, most respondents put stones in 3 or 4 different bins,
and nobody puts all 10 stones in a single bin, indicating uncertainty about the onset.
3.3
Accuracy
We measure the accuracy of the subjective expectations by computing a simple chi-square
statistic for binned data. Let Hk be the number of years of available data when the
monsoon arrived in kartis k as recorded in the rainfall station assigned to the respondent,
and Sk the number of stones (from the available ten stones), that the respondent placed
in the same kartis k. Because the number of years is different from the number of stones,
the chi-square statistic is
2
χ =
X(
k
p
p
S/HHk − H/SSk )2
,
Hk + Sk
(4)
where
S=
X
Sk
and
H=
X
Hk ,
k
k
and k indexes the kartis. This statistic follows a chi-square distribution with degrees
of freedom given by the number of bins (i.e. kartis in the sheet of paper given to the
respondent) minus 1.20
Table 4 reports the percentage (appropriately weighted) of individuals in each district
whose subjective distribution is statistically different from the historical ones at different
significance levels.
Accuracy is lower with the individual definition as compared to its average across respondents in the district, perhaps because inaccurate forecasters may miss both the timing
of the onset of monsoon as well as its definition. With the district average, mistakes across
respondents are compensated in such a way that the respondents’ subjective distribution
20
See Press (1992) for further details and references.
21
of the timing of the onset is closer to the historical distribution, thus appearing to be
more accurate.
In total, 40 percent of households in Mahbubnagar but only 20 percent in Anantapur
report a subjective distribution that is significantly different at the 10% level from the
historical one computed using the individual subjective definition. These numbers are
23% and 14% when the district average of individual definitions is used. It appears that
respondents are more accurate in Anantapur where the onset is more volatile, and thus
where it matters more, but we will explore these differences more formally in the next
section.
4
Empirical Analysis and Results
[To be rewritten] In this section we formally test the propositions derived from the model
of Section 2.
4.1
Timing of Planting under uncertainty
In Figure 8 described in Section 3, we showed that the conditional onset of the monsoon is
more erratic in Anantapur as compared to Mahabubnagar. Proposition 1 in the previous
section stated that in places with more uncertainty, or equivalently, when signals are less
informative, farmers require a higher quantum of accumulated rainfall to start planting.
Similarly, farmers should also purchase a lower fraction of agricultural inputs before the
monsoon.
Both decisions have a degree of irreversibility. As already discussed, the downside of
planting too early is that the farmer may have to either replant or abandon the crop if the
seeds do not germinate. Likewise, preparing the land for a certain crop and purchasing
the seeds before the monsoon arrives may be costly if given the patterns of the rains, it
is best for the farmer to plant some other variety or crop that requires another type of
22
land preparation.
To test this hypothesis of the model, we run the following regression:
0
β + isd ,
Yisd = αd Dd + αs Vs + Xisd
where Yisd is either the minimum amount of rainfall that farmer i in a village mapped to
rainfall station s in district d requires to start planting or the percentage of purchases of
agricultural inputs before the monsoon, Dd is a dummy that takes value 1 if the district is
Anantapur, Vs is the standard deviation of the conditional onset of the monsoon, that is,
the difference in days from the IMD forecast until the onset of monsoon (using the district
average of individual definitions) across years at rainfall station s, and Xisd are household
level characteristics. The coefficient of interest is αs and we expect it to be positive and
significant when the dependent variable is the minimum amount of rainfall and negative
when the dependent variable is the percentage of expenditure before the monsoon. The
coefficient αd picks up variation that is specific to Anantapur, and is not explained by the
proxy of uncertainty Vs .
Table 6 reports the results of the above specification. All regressions include land and
other household controls. Columns 1 and 4 include the district dummy only, columns 2
and 5 include the standard deviation of the conditional onset of monsoon Vs and columns
3 and 6 include both the district dummy and the standard deviation of onset of monsoon.
As predicted by the model, the district dummy and the proxy for uncertainty are
both positive and significant in columns 1 and 2. More importantly, when they appear
together in column 3, only the proxy for uncertainly is significant, suggesting that it is
capturing all the variation in the required amount of rainfall to start planting. Notice also
that the coefficient on risk aversion is positive, as expected, but not precisely estimated.
According to the historical data, the point estimate of uncertainty is equivalent to waiting
on average 3 additional days in Anantapur to plant.
Finally, farmers in Anantapur do not seem to spend anything at all before the monsoon,
23
indicating that they value being able to adapt the cropping patterns to the observed
rainfall. Although the coefficient of the standard deviation of the onset of monsoon in
column 5 is negative and significant, it loses predictive power in column 6. We therefore
conclude that there is something peculiar about expenditures before the monsoon in
Anantapur that cannot be explained by the uncertainty about the start of the monsoon.
4.2
Do farmers behave according to their prior expectations?
We test Proposition 2 by assessing whether respondent’s behavior is consistent with his or
her prior. We use the mean of the subjective distribution as a proxy for the household’s
prior expectation of the monsoon onset. To test the correlation between expectations and
decisions, we run the following set of regressions:
Yi = αµSi + βri + δki + Xi0 γ + i ,
where Yi are one-time decisions taken by individual i, µSi is the mean in kartis of the subjective prior distribution (the expected onset of the monsoon in kartis), ri is the individual
definition of the monsoon in mms (minimum amount of rainfall required to start planting) and ki is the actual onset of the monsoon computed using the “district” definition in
odd-numbered columns or the “individual” definition in even-numbered columns.21 This
variable captures all the information available to the farmer when the planting decision
is made. In addition, the vector Xi includes household characteristics that may influence
decision-making, in particular risk aversion and the discount rate.
The coefficient of interest is again α. We expect it to be significant if prior expectations influence decision-making. Notice that by including the actual start of the monsoon,
we are able to assess the relative importance of the unconditional vis a vis the conditional
expectation. In addition, we control for the individual definition in order to isolate differ21
We only have one rainfall gauge with available data in 2004 in Anantapur and 3 in Mahbubnagar.
Based again on minimum distance, we reassigned villages to the next closest gauge with available data.
24
ences in the expected timing of the monsoon from differences in the individual definition.
We consider the following decisions as dependent variables. First, whether the household bought rainfall insurance in 2004 (in general and conditional on attending a marketing meeting).22 The rainfall insurance paid off if accumulated rainfall until a predetermined date was below a certain trigger. Because the dates of the policy coverage were
determined in advance, households that expected the monsoon to start later would also
find that the insurance policy more attractive, because the elapsed time from the expected
monsoon arrival until the fixed end-of-coverage date would be shorter, thereby increasing
the chances of a payout. In sum, households that expect a later onset, would be more
inclined to purchase the insurance. As expected we find in columns 3 and 4 of Table 7,
that the subjective mean of onset is positive and significant at the 5 percent level, when
the sample is restricted to households that attended the marketing meeting.23
Second, we consider in columns 5-8 of Table 7 the average kartis in which the household
planted in 2004 and 2006, averaging across crops and plots cultivated. In this case,
households that ex-ante believe that the monsoon will arrive later should also plant later.
The coefficient on the subjective mean are positive and significant, as well as those of the
individual definition and the actual onset of the monsoon. These results are significant
because they provide direct evidence that priors do matter, and that there is significant
updating. The coefficients suggests that an increase of one kartis in the prior expectation
would delay planting by roughly 0.4 kartis (approximately one week).
Next we look in columns 9 and 10 at the percentage of expenditures before the actual
onset of the monsoon in 2006. Again, households that believe that the monsoon would
start later will end up making fewer purchases before the monsoon actually sets in, although they would have liked to make such purchases in advance. Again, both coefficient
22
Since weather insurance policies were not advertised outside a marketing meeting, and actual take-up
of the product was very low, most farmers that did not attend the meeting would not have heard about
weather insurance.
23
Notice that risk aversion is significant but negative, suggesting that it is negatively correlated with
the purchase of insurance. This contradictory result is explained in detail in Giné, Townsend and Vickery
(2008).
25
of interest have the expected sign and are significant.
Finally, we look at the decision to replant in the last 10 years, a variable collected in
2006. Here, households that believe that the monsoon will start later should have a lower
probability of replanting, because households that plant early are most likely to replant.
This is exactly what we find.
In sum, prior expectations are a powerful predictor of behavior, even more so than
proxies of risk aversion or discount rates. All in all, these results are remarkable because
they not only validate the elicitation method, but also indicate that heterogeneity in prior
expectations, more so than heterogeneity in preferences, explain actual behavior.
4.3
Who are the good predictors?
We now turn to the test of Proposition 3 and compare the subjective prior distribution
of the onset of the monsoon with the historical one, computed using both definitions for
the onset of the monsoon. According to Proposition 3, accuracy should depend on how
relevant the production of rain-sensitive crops. The specification we run is the following:
0
0
log(χ2ivs ) = Zvs
α + Xivs
β + ivs
where χ2ivs is the chi-square statistic computed according to the formula in (4) for individual i in village v mapped to rainfall station s. The vector Zvs contains village-level
variables such as a district dummy, the proxy for uncertainty Vs described in the previous subsection and the distance from village v to rainfall gauge s. The vector Xivs
includes a set of household level characteristics that may affect accuracy. These variables
are described in Section 3 , and include basic demographics, such as caste, literacy and
education, wealth, ability to cope with income shocks and exposure to weather shocks. A
positive and significant coefficient on the variable Xivs would indicate that Xivs reduces
accuracy, since higher χ2ivs values are associated with larger differences between the prior
26
subjective and historical distribution.
Table 8 reports the results. In columns 1 and 2, the historical distribution is computed
using the district average of the individual definitions of the onset of monsoon while in
columns 3 and 4, the individual definition is used. Columns 1 and 3 include the variables
in Zvs , while columns 2 and 4 include village dummies. The district dummy is significant
in column 1, suggesting that after controlling for household characteristics and other
village controls, respondents in Anantapur are more inaccurate. The variable distance is
significant and positive in column 3, indicating that the possible lower correlation between
the timing of rainfall at the gauge and at the village (basis risk) is partly responsible for
the observed lack of accuracy.
Variables that proxy for parameters in the utility function do not seem to influence
accuracy. The lowest p-value of an F-test that they are jointly insignificant is 0.35.
Other demographic variables such as literacy, age and caste improve accuracy but they
are estimated very imprecisely. Thus, accuracy is not about cognition per se, since literacy
(or education) do not seem to influence it significantly. An F-test that all demographic
variables are jointly insignificant cannot be rejected at the 10 percent (lowest p-value is
0.4).
In addition, the wealthier the household as measured by the log value of the primary
dwelling or the log value of the landholdings, the worse the household is in forecasting
the monsoon. An F-test that both wealth variables are jointly insignificant is always
rejected at the 5 percent. Other variables that proxy for the ability to cope with shocks in
general, such as per capita income, participation in a chit fund or being credit constrained
are mostly significant and of the expected sign. The F-test is rejected in 3 out of 4
specifications. In addition, variables that are correlated with the exposure to weather
shocks, such as the amount of land the household cultivates (positive correlation) and the
percentage of land devoted to paddy (negative correlation) are also of the right sign and
mostly significant. The F-test is rejected in 2 out of the 4 specifications.
27
In sum, confirming the prediction of the model, better forecasters seem to be households whose incomes depend more heavily on the monsoon and households that lack
proper risk-coping mechanisms to smooth weather shocks.
Thus, households for which the monsoon is important are willing to devote resources
(maybe attention?) to acquire more information and thus are on average more accurate
at predicting it.
4.4
Does Accuracy Matter?
Finally, we test Proposition 4 in the model. Inaccuracy increases the scope for mistakes
in the optimal time to plant, so over time, inaccurate farmers should (i) experience lower
yields in rain sensitive crops and (ii) replant fewer times.
We test this proposition by running two regressions. First, the following yield regression
ycidt = Dc + Dt + Di + Dd × Dc + αc log(χ2i ) × Dc + Xc0 β + cidt ,
(5)
where ycidt is the yield per acre of crop c grown by farmer i in district d in year t, Dc is a
crop c dummy, Dt is a year dummy, Di is an individual dummy, log(χ2i ) is the log of the
accuracy proxy χ2 according to the formula in (4), and Xc is a set of crop characteristics,
including soil characteristics and use of intercropping, among others. We use crop yields
for 2004 and 2006, the years for which we have data.
Second, we run the same specification as that of Table 7 including log(χ2i ) as a regressor.
Ri = αχ log(χ2i ) + αµ µSi + βri + δki + Xi0 γ + i ,
where Ri is the decision to replant in the last 10 years and αχ is the variable of interest.
Table 9 reports the main results. In Panel A the results from regression (5) with
rice as the omitted crop category. Indeed, for all crops, lower accuracy results in lower
yields. Because the variable log(χ2i ) is hard to interpret, we conduct the following thought
28
experiment: By how much would income increase for an individual that cultivates rainsensitive crops using the average land devoted to them, if accuracy was increased from
the 25th percentile of the distribution to the 75th percentile? The gains in yields per acre
are converted into Kg for each crop using the average land size devoted to them, then Kg
are multiplied by the respective prices and they are finally aggregated. We end up with
an average gain between Rs 1,500 and Rs 1,800 depending on whether the individual or
district definition is used. These income gains translate into 7.8 percent and 9.1 percent of
the total value of agricultural production or 5.3 percent and 6.1 percent of total household
income.
Panel B shows that accuracy is correlated with a lower probability of having replanted
in the past, although the coefficient is only significant when the individual definition of
start of monsoon is used to compute the χ2 measure and actual onset of monsoon. If
again accuracy was to increase from the 25th percentile of the distribution to the 75th
percentile, the probability of replanting would shrink from XX to YY.
5
Conclusions
Differences in decision-making under uncertainty (when constraints are unimportant)
could be the result of differences in key parameters of the utility function, such as risk aversion or the discount rate; or by differences in the probability distribution of the different
states of nature. The literature has typically assumed that agents share the same probability distribution but have different utility functions. However, if one allows differences
in both probability distributions and utility functions, then a fundamental identification
problem ensues, since behavior can be explained by differences in either the utility function, the prior distribution or both (Manski, 2004). In this paper we elicited directly the
agent’s subjective probability distribution of the monsoon onset to assess the role that
they play in decision-making.
We do so in the context of rain-fed agriculture in semi-arid India, where farmers should
29
benefit greatly from having accurate expectations about the onset of the monsoon, since it
coincides with the optimal time to plant. By combining the subjective expectations data
with historical rainfall data, we were able to compute individual measures of accuracy
and assess which were the most accurate and why. Finally, we quantified the welfare costs
of inaccuracy.
We show that the elicitation method used delivers informative prior subjective expectations and that there is significant heterogeneity in beliefs across households. More
importantly, real outcomes, such as the timing of planting decisions and agricultural profitability, are strongly influenced by households’ beliefs about the monsoon. This suggests
that expectations are a powerful predictor of behavior, even more so than proxies for
parameters in the utility function.
In addition, we study the determinants of accuracy in expectations and find that
individuals with stronger incentives to gather information have more accurate beliefs.
As a result, accuracy seems to be less explained by cognitive biases (Kahneman, Slovic
and Tversky, 1982; Rabin, 1998) than by how relevant the event is to the forecaster as
predicted by a simple model of limited attention.
Finally, we quantify the costs of accuracy and find that it leads to average income
losses of 8 to 9 percent of agricultural production.
In sum, eliciting expectations data is important and should be incorporated in the
analysis of decision-making problems.
References
[1] O. Attanasio. Expectations and Perceptions in Developing Countries: Their Measurement and Their Use. American Economic Review, 99(2):87–92, 2009.
[2] H. P. Binswanger. Attitudes towards risk: experimental measurement in rural India.
American Journal of Agricultural Economics, 62(8):395–407, 1980.
30
[3] H. P. Binswanger. Attitudes towards risk: theoretical implications of an experiment
in rural India. The Economic Journal, 91(9):867–889, 1981.
[4] H.P Binswanger and D. A. Sillers. Risk Aversion and Credit Constraints in Farmers’
Decision Making: A Reinterpretation. Journal of Development Studies, 20:5–21,
1984.
[5] C. J. Bliss and N. H. Stern. Palanpur, the Economy of an Indian Village. Oxford
University, Oxford, England, 1982.
[6] A. Delavande, X. Giné, and D. McKenzie. Eliciting Probabilistic Expectations with
Visual Aids in Developing Countries: How sensitive are answers to variations in
elicitation design? Journal of Applied Econometrics, 26(3):479–497, 2011.
[7] A. Delavande, X. Giné, and D. McKenzie. Measuring Subjective Expectations in
Developing Countries: A Critical Review and New Evidence. Journal of Development
Economics, 94:151–163, 2011.
[8] A. Delavande and H.P. Kohler. Subjective Expectations in the Context of HIV/AIDS
in Malawi. Working Paper, RAND Corporation, 2007.
[9] S. DellaVigna. Psychology and Economics: Evidence from the Field. Journal of
Economic Literature, 47(2):315–372, 2009.
[10] W. DuMouchel and G. Duncan. Using Sample Survey Weights in Multiple Regression
Analyses of Stratified Samples. Journal of the American Statistical Association,
78(383):535–543, 1983.
[11] M. Fafchamps. Sequential Labor Decisions Under Uncertainty: an Estimable Household Model of West African Farmers. Econometrica, 61(5):1173–1197, 1993.
[12] J. Fein and P. Stephens. Monsoons. John Wiley and Sons, New York, NY, 1987.
31
[13] S. Gadgil and S. Gadgil. The Indian Monsoon, GDP and Agriculture. Economic and
Political Weekly, November 25:4887–4895, 2006.
[14] S. Gadgil, P.R. Seshagiri Rao, and K. Narahari Rao. Use of Climate Information
for farm-level decision making: rainfed groundnut in southern India. Agricultural
Systems, 74:431–457, 2002.
[15] X. Giné and H. Jacoby. Markets, Contracts, and Uncertainty: A Structural Model
of a Groundwater Economy. mimeo, 2014.
[16] X. Giné, R. Townsend, and J. Vickery. Patterns of Rainfall Insurance Participation
in Rural India. The World Bank Economic Review, 22(3):539–566, 2008.
[17] C. Gollier. The Economics of Risk and Time. MIT Press, Cambridge, MA, 2001.
[18] B.N. Goswami, V. Venugopal, D. Sengupta, M.S. Madhusoodanan, and Prince K.
Xavier. Increasing Trend of Extreme Rain Events Over India in a Warming Environment. Science, 314(5804):1442–1445, 2003.
[19] J. Hirshleifer and J. Riley. The Analytics of Uncertainty and Information. Cambridge
University Press, Cambridge, UK, 1995.
[20] D. P. Slovic Kahneman and A. Tversky. Judgment under Uncertainty: Heuristics
and Biases. Cambridge University Press, Cambridge, UK, 1982.
[21] S. Klonner. Private Information and Altruism in Bidding Roscas. The Economic
Journal, 118(528):775–800, 2008.
[22] W. Luseno, J. McPeak, C. Barrett, G. Gebru, and P. Little. The Value of Climate
Forecast Information for Pastoralists: Evidence from Southern Ethiopia and Northern Kenya. World Development, 31(9):1477–1494, 2003.
32
[23] T. Lybbert, C. Barrett, J. McPeak, and Winnie Luseno. Bayesian herders: asymmetric updating of rainfall beliefs in response to external forecasts. Working Paper,
Cornell University, 2005.
[24] R. Mall, R. Singh, A. Gupta, G. Srinivasan, and L.S. Rathore. Impact of Climate
Change on Indian Agriculture: A Review. Climatic Change, 78(2-4):445–478, 2006.
[25] C. Manski. Measuring Expectations. Econometrica, 72(5):1329–76, 2004.
[26] D. McKenzie, J. Gibson, and S. Stillman. A land of milk and honey with streets paved
with gold: Do emigrants have over-optimistic expectations about incomes abroad?
Working Paper, World Bank, 2007.
[27] J. Morduch. Income Smoothing and Consumption Smoothing. Journal of Economic
Perspectives, 9(3):103–114, Summer 1995.
[28] A. Murphy and R. Winkler. Probability Forecasting in Meteorology. Journal of the
American Statistical Association, 79(387):489–500, 1984.
[29] K. Narahari Rao, S. Gadgil, P.R. Seshagiri Rao, and K. Savithri. Tailoring strategies to rainfall variability – The choice of the sowing window. Current Science,
78(10):1216–1230, 2000.
[30] P. Norris and R. Kramer. The Elicitation of Subjective Probabilities with Applications in Agricultural Economics. Review of Marketing and Agricultural Economics,
58(2-3):127–147, 1990.
[31] W. H. Press. Numerical Recipes in C. Cambridge University Press, Cambridge, UK,
1992.
[32] M. Rabin. Psychology and Economics. Journal of Economic Literature, 36:11–46,
1998.
33
[33] M. Rosenzweig and H.P Binswanger. Wealth, Weather Risk and the Composition
and Profitability of Agricultural Investments. The Economic Journal, 103(1):56–78,
1993.
[34] M. Rosenzweig and K. Wolpin. Specific Experience, Household Structure, and Intergenerational Transfers: Farm Family Land and Labor Arrangements in Developing
Countries. Quarterly Journal of Economics, 100(Supplement):961–987, 1985.
[35] P. Singh, K.J. Boote, A. Yogeswara Rao, M.R. Iruthayaraj, A.M. Sheik, S.S Hundal,
R.S. Narang, and Phool Singh. . Field Crops Research, 39:147–162, 1994.
[36] T Walker and J. Ryan. Village and Household Economics in Indias Semi-Arid Tropics. John Hopkins University Press, Baltimore, Maryland, 1990.
34
Table 1. Assignment of villages to rainfall stations in each district
Rainfall gauge
Mahabubnagar
Utkor
Narayanpet
Mahbubnagar
Atmakur
CC Kunta
Total
Anantapur
Somandepalli
Madakasira
Hindupur
Parigi
Lepakshi
Total
N. Villages
N. Households
Distance (miles)
N. Years of
rainfall data
6
5
9
1
2
204
142
247
25
70
4.01
2.87
5.33
2.53
3.44
17
25
35
35
17
23
688
4.14
25.8
4
3
1
3
3
111
70
26
80
76
5.22
5.63
0.69
3.58
4.74
14
34
35
16
16
14
363
4.51
20.2
Notes:This table reports the number of villages assigned to each rainfall station and the number of years with
available rainfall data. Each village is assigned to the closest rainfall gauge in the district. "N. Villages" is the
number of villages assigned to the rainfall gauge. "N. Households" is the number of households interviewed in
the villages assigned to the rainfall gauge. "Distance in miles" is the average distance from the rainfall gauge to
the villages assigned to the rainfall gauge. "N. Years" is the average number of years with available rainfall data.
The row "Total" reports the district averages of "Distance" and "N. Years" weighted by number of households
assigned to each rainfall gauge, and district totals of "N. Villages" and "N. Households".
Table 2. Summary Statistics
Mean
Preferences
Risk aversion
Discount Rate
Information / Social Networks
Membership in BUA (1=Yes)
Weather info from informal sources (1=Yes)
Wealth
Log market value of the house
Log value of owned land
Ability to smooth income shocks
Participation in chit fund (1=Yes)
Household is credit constrained (1=Yes)
Log per capita total income
Exposure to Rainfall Shocks
Cultivated land in acres
Pct. land of rainfed crops
Other Household Characteristics
Literacy (1=Yes)
Education
Age of the household's head
Age of the eldest household member > 60 (1=Yes)
Forward caste (1=Yes)
Consumption (Rs.)
Land Characteristics
Herfindahl index soil types
Pct. of land with loamy sand soil
Pct. of land with loam soil
Pct. of land with clay loam
Pct.of land with slope (higher than 1%)
Pct. of land with soil deeper than 80 cm
Onset of Monsoon distribution
Ind. Def. of monsoon (mms.)
Mean of subjective distribution of monsoon onset
Std Dev of subjective distribution of monsoon onset
Mean of Actual Onset of monsoon (District definition)
Mean of Actual onset of monsoon (Individual defination)
Unconditional St. Dev. of actual onset of monsoon (Individual definition)
Conditional St. Dev. of actual onset of monsoon (Individual definition)
Agricultural Practices
Bought rainfall insurance in 2004
Bought insurance in 2004 conditional on attending meeting
Pct. expenses before monsoon
Average planting kartis
Household ever replanted (1=Yes)
Intercropping (1=Yes)
Agricultural Profits (Rs.)
Distance from village to rainfall gauge
Number of Observations
2004
2006
2008
2009
Full Sample
Std. Dev.
Min
Max
Mahbubnagar
Anantapur
Means
p-value
0.37
0.30
0.48
0.28
0.00
0.00
1.00
2.33
0.38
0.26
0.36
0.37
0.756
0.001
0.03
0.19
0.16
0.39
0.00
0.00
1.00
1.00
0.04
0.18
0.00
0.22
0.083
0.337
10.85
11.87
0.77
1.00
8.70
7.15
13.12
16.86
10.93
12.12
10.69
11.40
0.034
0.000
0.24
0.80
7.79
0.43
0.40
0.97
0.00
0.00
5.08
1.00
1.00
13.34
0.19
0.80
7.65
0.33
0.80
8.26
0.188
0.924
0.002
6.28
0.73
6.03
0.29
0.00
0.00
82.00
1.00
6.93
0.70
5.06
0.78
0.003
0.030
0.37
3.31
47.83
0.30
0.15
36,358
0.48
4.36
11.91
0.46
0.36
21,492
0.00
0.00
21.00
0.00
0.00
5,980
1.00
18.00
80.00
1.00
1.00
1,277,640
0.33
2.91
47.22
0.31
0.12
37,168
0.46
4.04
46.08
0.28
0.21
34,855
0.027
0.036
0.131
0.572
0.137
0.250
0.86
0.50
0.21
0.19
0.45
0.12
0.21
0.45
0.38
0.35
0.46
0.27
0.33
0.00
0.00
0.00
0.00
0.00
1.00
1.00
1.00
1.00
1.00
1.00
0.88
0.51
0.16
0.22
0.48
0.13
0.82
0.48
0.31
0.13
0.41
0.09
0.036
0.680
0.064
0.257
0.081
0.303
29.37
13.82
10.37
13.97
13.99
14.56
14.56
10.08
0.96
1.14
1.03
1.08
5.70
5.69
7.92
0.00
0.00
12.86
12.57
5.60
5.60
101.24
22.20
18.74
16.29
16.64
30.87
30.87
27.93
13.47
10.19
13.93
13.96
11.15
11.16
32.07
14.49
10.69
14.05
14.06
20.94
20.94
0.001
0.000
0.007
0.477
0.553
0.000
0.000
0.03
0.37
0.26
14.10
0.22
0.47
-1,508
4.36
0.16
0.48
0.23
1.45
0.41
0.50
4,672
1.75
0.00
0.00
0.00
10.00
0.00
0.00
-67,455
0.62
1.00
1.00
1.00
25.00
1.00
1.00
17,135
8.29
0.03
0.34
0.26
13.35
0.30
0.25
-956
4.08
0.02
0.48
0.26
15.11
0.07
0.88
-2,559
4.87
0.651
0.155
0.695
0.000
0.000
0.000
0.001
0.168
687
629
687
648
363
312
363
345
1050
941
1050
993
Notes: This table reports the summary statistics of all the variables used in the analysis. The statistics reported are computed by weighting observations by the appropriate stratification
(Attending Marketing Meeting and Purchasing Insurance in 2004) weights. Appendix A1 contains a detailed description of the variables reported. Unless otherwise noted, all variables
were collected in 2004. Consumption, Actual onset of the monsoon and the percentage of expenses before the onset were collected in 2004, 2006 and 2009. The average planting kartis
was collected in 2004, 2006, 2008 and 2009. For variables collected in multiple years, the statistics reported use all years available. The variable bought rainfall insurance is conditional
on having attended a marketing meeting. Under "Number of observations" we report the number of observations in each year of data collection.
District:
Distribution:
Table 3: Comparison of Subjective and Historical Distributions
All
Mahbubnagar
Historical
Historical
Subjective
Subjective
Individual District
Individual District
Means
Support of Distributions
Highest bin
15.0
15.3
15.5
Lowest bin
12.3
12.8
12.0
Number of bins with positive mass
3.5
4.2
4.3
Moments and Properties of Distributions
Mean
13.8
13.8
13.4
Standard Deviation
0.9
1.0
1.0
Unimodal (1=Yes)
93.1%
70.4%
77.8%
Distributions is Normal (1=Yes)
95.5%
75.8%
89.3%
Accuracy
Subjective distribution is different from historical distribution at…
0.08
0.09
1 percent
0.20
0.19
5 percent
0.28
0.26
10 percent
Subjective distribution has same mode as historical distribution (1=Yes)
0.43
0.46
Anantapur
Historical
Subjective
Individual District
15.0
12.3
3.7
15.3
12.8
3.6
14.8
12.0
3.8
15.6
13.3
3.2
17.7
12.4
5.5
16.9
12.0
5.2
13.5
0.9
92.1%
98.8%
13.7
0.8
96.2%
82.8%
13.3
0.9
93.1%
100.0%
14.4
0.8
95.1%
89.3%
14.2
1.4
21.9%
62.8%
13.7
1.3
49.1%
69.0%
-
0.11
0.23
0.30
0.09
0.14
0.17
-
-
0.51
0.57
-
0.03
0.15
0.25
0.30
0.08
0.28
0.41
0.26
Notes: This table reports statistics about the support and moments of the subjective and historical distributions. The historical distribution is computed using the
individual definition of monsoon or the district definition which is the average across individuals in the district. The table also reports measures of
similaritybetween the subjective and historical distributions. The support of the distribution is measured in kartis or 10-day periods from April to September (kartis
9 to 20). Appendix Table A2 reports the name and dates of each kartis. "Number of bins with positive mass" corresponds to the number of kartis in the support
with positive probability. The distribution is normal if one cannot reject that the kurtosis and skewness come from a normal distribution. The comparison of
subjective and historical distributions is based on the chi-square statistic described in the text and used in Tables 6 and 7.
Table 4. Behavior under uncertainty
Rainfall required to plant (mms.)
Dependent variable
Individual
(2)
(3)
0.638***
(0.098)
4.652***
(0.319)
-4.562***
(1.266)
0.934***
(0.167)
0.553***
(0.076)
0.549
(0.591)
0.890
(1.015)
0.424
(0.344)
1.443*
(0.851)
0.514
(0.600)
1.592
(0.948)
0.457
(0.589)
0.386
(1.071)
Definition of Onset:
(1)
Variable of interest
Dummy Anantapur
St. Dev. of conditional historical distribution of monsoon onset
Preferences
Risk aversion
Discount rate (*100)
Land characterstics
Pct. of land with loamy sand soil
Pct. of land with loam soil
Pct. of land with clay loam
Pct.of land with slope (higher than 1%)
Pct. of land with soil deeper than 80 cm
Pct. land of rainfed crops
Household characteristics
Forward caste (1=Yes)
Education
Age of the household's head
Cultivated land in acres
Village Fixed Effects
Mean dependent variable
Observations
Individuals
Years
R-squared
District
(4)
-2.629
2.030
-2.106
-3.179
(2.992)
(2.152)
(2.901)
(3.108)
7.631**
6.837***
7.846**
8.439**
(3.282)
(2.488)
(3.146)
(3.412)
15.138*** 11.579*** 14.446*** 16.460***
(2.494)
(1.823)
(2.462)
(2.554)
-1.404**
-1.019*
-1.445**
-1.207*
(0.648)
(0.524)
(0.665)
(0.625)
8.942*** 7.369*** 8.708*** 9.867***
(2.274)
(1.824)
(2.150)
(2.375)
2.204**
0.758
2.988***
1.270
(1.053)
(0.516)
(1.013)
(1.094)
Pct. expenses before monsoon
Individual
(6)
(7)
District
(8)
-0.003***
(0.001)
-0.000
(0.003)
0.049**
(0.020)
-0.006***
(0.002)
-0.002
(0.001)
0.009
(0.010)
-0.012
(0.016)
0.010
(0.010)
-0.020
(0.017)
0.009
(0.010)
-0.019
(0.015)
0.009
(0.011)
-0.011
(0.017)
-0.006
(0.036)
-0.004
(0.036)
-0.015
(0.033)
-0.006
(0.010)
0.002
(0.032)
-0.009
(0.016)
-0.007
(0.037)
-0.001
(0.035)
-0.002
(0.031)
-0.000
(0.009)
0.019
(0.032)
-0.017
(0.015)
-0.013
(0.036)
-0.006
(0.036)
-0.008
(0.033)
-0.006
(0.010)
0.004
(0.032)
-0.018
(0.017)
-0.005
(0.035)
-0.010
(0.035)
-0.021
(0.032)
-0.008
(0.010)
-0.003
(0.031)
-0.006
(0.016)
(5)
1.773*
(0.887)
-0.014
(0.063)
0.045**
0.770
(0.659)
0.021
(0.051)
0.028*
2.495**
(1.055)
0.017
(0.065)
0.051**
1.196
(0.808)
-0.045
(0.058)
0.035*
-0.017
(0.010)
0.001
(0.001)
0.000
-0.019*
(0.010)
0.001
(0.001)
-0.000
-0.025**
(0.011)
0.001
(0.001)
-0.000
-0.016
(0.011)
0.001
(0.001)
0.000
(0.021)
0.062
(0.040)
(0.016)
0.065**
(0.029)
(0.023)
0.035
(0.045)
(0.020)
0.063
(0.040)
(0.000)
-0.002***
(0.000)
(0.000)
-0.001***
(0.000)
(0.000)
-0.002***
(0.000)
(0.000)
-0.002***
(0.000)
No
30.356
1046
Yes
30.356
1046
No
30.356
1046
No
30.356
1046
No
0.248
2,048
Yes
0.248
2,048
No
0.248
2,048
No
0.248
2,048
0.012
1046
2004 and 2009
0.044
0.015
0.007
1046
2004
0.546
0.757
0.546
0.477
Notes: This table tests Proposition 1 of the model. In columns (1)-(2) the dependent variable is the amount of rain in millimeters that respondents require to start
planting. In colums (3)-(4) the dependent variable is the percentage of all agricultural expenditures made before the onset of the monsoon. The expenses include bullock
and manual labor, hiring tractors, manure, irrigation, purchase of seeds and fertilizer. Standard deviation of the conditional historical distribution of the monsoon onset is
the standard deviation of the number of days since 1st June until the onset of monsoon (using the individual definition) minus the IMD forecast of the onset of the
monsoon for that year, across all available years. All regressions control for the variables used in the stratification. Robust standard errors are reported in brackets below
Table 5. Do individuals behave according to their expectations?
Bought insurance
(1)
Variables of interest
Subjective mean of onset
Ind. def of monsoon (mms.)
Preferences
Risk aversion
Discount Rate (*100)
Land characteristics
Pct. of land with rainfed crops
Household characteristics
Credit constrained (1=Yes)
Village Fixed Effects
Other controls?
Mean dependent variable
Observations
Individuals
Years
R-squared
Bought insurance
conditional to have
attended meeting
(2)
Average planting
kartis
(3)
Household ever
replanted
(4)
Pct. expenses before
onset of the monsoon
(5)
-0.002**
(0.001)
0.000**
(0.000)
0.038***
(0.009)
-0.001***
(0.000)
0.093***
(0.029)
0.003
(0.002)
-0.054***
(0.019)
-0.002
(0.001)
0.002
(0.005)
0.000
(0.000)
-0.019***
(0.002)
-0.024***
(0.004)
-0.083***
(0.013)
-0.045**
(0.022)
-0.026
(0.042)
-0.013
(0.099)
-0.010
(0.028)
0.055
(0.050)
0.010
(0.008)
-0.018
(0.016)
0.000
(0.003)
0.053**
(0.023)
-0.114
(0.088)
0.039
(0.052)
-0.020
(0.014)
-0.012***
(0.002)
-0.066***
(0.015)
-0.068
(0.050)
0.081***
(0.031)
-0.002
(0.009)
Yes
Yes
0.29
57,333
739
2004
0.067
Yes
Yes
0.44
5,830
489
2004
0.281
Yes
Yes
14.12
3,293
1032
2004/06/08 and 2009
0.263
Yes
Yes
0.25
935
935
2006
0.228
Yes
Yes
0.25
2,024
1046
2006 and 2009
0.042
Notes: This table tests Proposition 2. Subjective mean of onset and Subjective st.dev of onset are the mean and standard deviation of the subjective distribution elicited from respondents.
Regressions control for variables used in the stratification (attended marketing meeting and insurance purchase). In addition, all regressions control for Forward caste (1=Yes), Literacy
(1=Yes), Age of the household's head, Market value of the house (expressed in Rs.100,000), Value of Owned land (expressed in Rs. 1,000,000), Use of intercropping system (1=Yes),
and village fixed effects. Standard errors are robust in columns 1-2 and 5-6 and clustered in columns 3-4 and 7-8. All standard errors are reported in brackets below the coefficient. The
symbols *, ** and *** represent significance at the 10, 5 and 1 percent, respectively.
Table 6. Differences between subjective and historical distributions
District
Definition of Onset:
(1)
(2)
Village characteristics
Dummy Anantapur
1.281***
(0.180)
St. Dev. of conditional historical distribution of monsoon onset
-0.043***
(0.013)
Distance from village to rainfall gauge
0.014
(0.031)
Preferences
Risk aversion
-0.057
-0.092
(0.062)
(0.071)
Discount Rate
0.061
0.037
(0.108)
(0.108)
Wealth
Log (market value of the house/100,000)
0.127**
0.100**
(0.051)
(0.043)
Log (value of owned land/1.000,000)
0.114*
0.042
(0.066)
(0.052)
Ability to smooth income shocks
Participation in chit fund (1=Yes)
0.191**
0.085
(0.085)
(0.072)
Household is credit constrained (1=Yes)
-0.078
-0.077
(0.062)
(0.060)
Log (per capita total income/10,000)
0.102*
0.047
(0.055)
(0.058)
Exposure to Rainfall Shocks
Logarithm of cultivated land in acres
-0.178*
-0.038
(0.099)
(0.075)
Pct. of land with rainfed crops
-0.026
-0.224*
(0.126)
(0.116)
Information / Social Networks
Weather info from informal sources (1=Yes)
-0.019
-0.005
(0.077)
(0.069)
Other Household Characteristics
Herfindal index soil type
0.206*
0.196*
(0.111)
(0.105)
Literacy (1=Yes)
-0.082
-0.048
(0.064)
(0.060)
Age of the household's head
-0.003
-0.003
(0.003)
(0.003)
Age of the eldest household member > 60 (1=Yes)
-0.070
-0.058
(0.050)
(0.057)
Forward caste (1=Yes)
-0.084
-0.107
(0.093)
(0.092)
Village Fixed Effects
No
Yes
Mean dependent variable
1.53
1.53
Observations
1024
1024
R-squared
0.236
0.336
(3)
Individual
(4)
0.341**
(0.161)
-0.032**
(0.015)
0.040
(0.025)
0.011
(0.021)
0.060
(0.041)
0.038
(0.091)
0.030
(0.040)
0.069
(0.083)
0.108**
(0.042)
0.128**
(0.059)
0.085**
(0.035)
0.097*
(0.050)
0.235***
(0.076)
-0.130**
(0.054)
0.057
(0.036)
0.151**
(0.069)
-0.139***
(0.043)
0.002
(0.033)
-0.260***
(0.067)
-0.035
(0.091)
-0.137**
(0.063)
-0.130*
(0.068)
-0.131
(0.084)
-0.100
(0.080)
0.136
(0.100)
-0.003
(0.052)
0.003
(0.002)
-0.026
(0.057)
0.017
(0.073)
No
1.99
1024
0.134
0.169*
(0.097)
0.034
(0.049)
0.001
(0.002)
-0.035
(0.060)
0.038
(0.057)
Yes
1.99
1024
0.276
Notes: This table tests Proposition 3. The dependent variable is the logarithm of the chi-square computed from comparing the subjective and the historical
distributions. In colums 1-2 the historical distribution is computed using the district definition of onset of monsoon, that is the average in the district of the
minimum quantum of rainfall that respondent requires to plant while columns 3-4 use individual definition of the onset of the monsoon. All regressions control for
the variables used in the stratification. Columns 2 and 4 include village fixed effects. Robust standard errors are reported in brackets below the coefficient. The
symbols *, ** and *** represent significance at the 10, 5 and 1 percent, respectively. See Appendix Table A1 for the definition of variables.
Table 7. Does Accuracy Matter?
Panel A: Yields and Profits
Total yields
Individual
District
(1)
(2)
6.460
43.072*
(37.688)
(23.889)
-51.188*
-71.111***
(30.923)
(19.050)
3,678
3,678
1034
1034
2004 and 2006
0.719
0.718
Definition of Onset:
2
Log (χ )
Log (χ2) * No Paddy
Number of Observations
Number of Individuals
Number of Years
R squared
Total profits (in thousand Rs.)
Individual
District
(5)
(6)
31.651
117.486
(321.581)
(228.640)
202.836
-174.276
(344.429)
(240.647)
2,128
2,128
963
963
2004
0.316
0.316
Panel B: Ever Replant and Consumption
Household ever replanted
Definition of Onset:
Log(χ2)
Subjective mean of onset
Ind. def of monsoon (mms.)
Actual onset of monsoon
Village fixed effects
Other Controls
Observations
Individuals
Years
R-squared
Individual
(1)
(2)
0.055***
0.043**
(0.021)
(0.019)
-0.048***
-0.038**
(0.019)
(0.018)
-0.002
-0.001
(0.003)
(0.002)
0.000
-0.040
(0.130)
(0.052)
Yes
No
Yes
Yes
930
930
930
930
Consumption
District
(3)
(4)
0.046**
0.038**
(0.019)
(0.019)
-0.074***
-0.058***
(0.020)
(0.020)
-0.002
-0.002
(0.001)
(0.001)
-0.029
(0.046)
Yes
No
Yes
Yes
930
930
930
930
2006
0.230
0.150
0.231
0.15
Individual
District
(5)
(6)
(7)
(8)
-68.593
-452.987
-374.670
-769.088
(480.471)
(536.718)
(508.252)
(524.067)
-585.513
-538.692
-410.529
-184.495
(553.082)
(532.176)
(564.564)
(577.386)
-16.890
0.354
-55.664
-20.739
(45.900)
(38.580)
(49.264)
(36.863)
-1890.299*** -1130.817*** -2144.677*** -1316.806***
(606.988)
(404.571)
(579.364)
(401.609)
Yes
No
Yes
No
Yes
Yes
Yes
Yes
3003
3003
3003
3003
1027
1027
1027
1027
2004/06 and 2009
0.120
0.106
0.121
0.107
Notes: This table tests Proposition 4. In the odd columns of Panel A the log(χ2) is computed with the historical distribution that uses the district definition, that is average in
the district of the minimum quantum of rainfall that respondent requires to plant while the even columns use the individual definition of the onset of the monsoon. In Panel A,
the regression includes No paddy dummy, crop dummies, 2006 dummy, No paddy dummy x 2006 dummy, 2006 dummy x log(χ2) and plot level characteristics such as whether
soil is loamy sand (1=Yes), soil is loam (1=Yes), soil is clay loam (1=Yes), soil is deeper than 80cm. (1=Yes) and use of intercropping (1=Yes). Each observation is a crop.
The regression in columns 1 and 2 is estimated with OLS with household fixed effects and clustering at household level. The regression in columns 3 and 4 is estimated OLS
clustering at household level. In Panel B, "Actual Start of Monsoon" in odd columns is the kartis in which accumulated rainfall for the relevant year when the dependent
variable was measured reached the use the "individual" definition. In even column, the "district" average of individual definitions, rather than each individual definition is
used. The regressions control for variables used in the stratification (attended marketing meeting and insurance purchase). In addition, the following controls are included: Risk
aversion, Discount Rate, percentage of land with rain fed crops, credit constraints (1=Yes), forward caste (1=Yes), literacy (1=Yes), Age of the household's head, log market
value of the house, log value of owned land. If village fixed effects are excluded, a dummy for Anantapur is included. For both panels, robust standard errors are reported in
brackets below the coefficient. The symbols *, ** and *** represent significance at the 10, 5 and 1 percent level, respectively.
Appendix Table A1. Construction of Variables
Significance of variable
Dummy variable equal to 1 safe bet (50 Rs. if heads or tails) is chosen.
Variable name
Risk aversion
(100/x-1)*100 where x is the minimum amount they are willing to accept to get a hypothetical lottery price of Rs.1000
immediately instead of waiting for one month.
Discount Rate
Membership in BUA (1=Yes)
Weather info from informal sources (1=Yes)
Dummy variable equal to 1 if any household member belongs to Borewell User Association (BUA) or Water User Group
(WUG)
Dummy variable equal to 1 if the household actively seeks information about weather from trader/middleman, friend/relative,
progressive farmer or neighboring farmer
Market value of the house (Rs.100,000)
Present market value of the household's primary dwelling and the plot located in the house (what would they be able to get if
they sell it)
Value of Owned land (Rs. 1.000,000)
Present market value of the owned plots
Participation in chit fund (1=Yes)
Dummy variable equal to 1 if household participates in at least one chit fund (ROSCA)
Household is credit constrained (1=Yes)
Dummy variable equal to 1 if household was denied credit or cites lack of collateral as a reason for not applying or not having
one more loan
Per capita total income (Rs. 10,000)
Total income per household member
Cultivated land in acres
Total land cultivated by the household
Pct. of land with paddy
Land dedicated to paddy over total household cultivated land
Pct. land of rainfed crops
Pct of total cultivated acres for the following crops: Sorghum, Groundnut, Castor, Red Gram, Maize, Cotton, All Millets,
Sesamamum, Safflower and Greengram.
Literacy (1=Yes)
Dummy variable equal to 1 if the household head is literate
Education
Years of education of household's head
Age of the household's head
Age of household's head
Forward caste (1=Yes)
Dummy variable equal to 1 if the household is of forward caste
Pct. of land with loamy sand soil
Pct of total household cultivated land that has loamy sand soil (loamy sand includes alluvial, sandy and red sandy soils)
Pct. of land with loam soil
Pct of total acres cultivated by the HH that have loam soil (red loam soil)
Pct. of land with clay loam
Pct of total acres cultivated by the HH that have clay loam soil (clay loam includes black soil both very shallow and shallow
as well as saline soil)
Pct.of land with slope (higher than 1%)
Pct of total acres cultivated by the household that have slope greater than 1%
Pct. of land with soil deeper than 80 cm
Pct of total acres cultivated by the household that are deeper than 80cm.
Pct. of land with rain sensitive crops
Pct of land under maize, red gram, green gram, black gram, finger millet, foxtail millet, sesamum, sorghum
Individual definition of start of monsoon (mms)
Amount of rain (mms) required to start sowing. The most of the respondents report the amount of soil moisture that they need
and we used the following conversion to get the amount in mms: 4.318 mms./in. for loamy sand soil, 6.858 mms./in. for
loam, 5.842 mms./in. for silty clay soil and 8.382 mms./in. for clay loam soil.
Pct. of expenditures before onset of the monsoon a Percentage of total inputs' expenditure invested before the onset of the monsoon in 2006
Dummy Anantapur
1 if household is located in the district of Anantapur
Standard deviation of onset of monsoon
Standard deviation of number of days since June 1st until the start of the monsoon for each station across time.
Distance from village to rainfall gauge
Age of the eldest household member > 60
(1=Yes)
Attended meeting and bought insurance in 2004
Distance from village to rainfall gauge in kms.
Bought insurance in 2004
Dummy variable equal to 1 if the household bought rainfall insurance for 2004
Average planting kartisb
HH replanted in last 10 years (1=Yes)
1 if the eldest member of the household is more than 60 years old.
Dummy variable equal to 1 if the household attended the weather insurance meeting and bought insurance in 2004
Average kartis when planting took place across crops in relevant year
a
Dummy variable equal to 1 if the household has ever replanted at least one crop
Subjective mean of onset
Mean kartis of the subjective distribution
Actual onset of monsoonb
Actual onset of the monsoon in relevant year computed using either the individual definition or the average of the individual
definitions in the district.
Note: a Variable is recorded only in 2006; b Variable is recorded in both 2004 and 2006.
Appendix Table A2. Kartis Codes
Code
9
10
11
12
13
14
15
16
17
18
19
20
Name
Ashwini
Bharani
Krittika
Rohini
Mrigashira
Ardra
Punarvasu
Pushya
Aslesha
Makha
Pubbha
Uttara
Dates
Apr 13 – Apr 26
Apr 27 – May 10
May 11 – May 23
May 24 – June 6
June 7 – June 20
June 21 – July 5
July 6 – July 19
July 20 – Aug 2
Aug 3 – Aug 16
Aug 17 – Aug 29
Aug 30 – Sep 12
Sep 13 – Sep 25
Notes: The code is a serial number that takes value 1
with the first kartis of the year.
Figure 1. Timeline of Model
Period 1
r received
Decides
whether
to plant or
wait
Period 2
State s
revealed
Figure 2. Optimal Planting Decision
Plants if
waited in
period 1
Income is
obtained
Figure 3: Expected Net Income from Rain-Sensitive Crop as function of a
Figure 4: Expected Net Income from Rain-Sensitive Crop as function of δ
Figure 5: Expected Net Income from Rain-Sensitive Crop
Figure 6. District Location in Andhra Pradesh, India
JUN 10
JUN 5
JUN 1
JUN 10
JUN 5
JUN 1
Figure 7. Onset of the Monsoon over Time
50
40
30
20
10
0
-10
-10
0
10
20
30
40
Number of days since June 1st
50
60
Mahabubnagar, MBN
60
Hindupur, ANT
1975
1985
years
1995
2005
1975
1985
years
1995
2005
Figure 8. Days until Onset of the Monsoon since IMD Forecast over Time
50
40
30
20
10
0
-10
-10
0
10
20
30
40
Number of days since IMD forecast
50
60
Mahabubnagar, MBN
60
Hindupur, ANT
1975
1985
years
1995
2005
1975
1985
years
1995
2005