AN ERROR COMPONENTS MODEL FOR
PREDICTION OF COUNTY CROP AREAS
USING SURVEY AND SATELLITE DATA

George E. Battese and Wayne A. Fuller

No. 15 - February 1982
ISSN 0157-0188
ISBN 0 85834 419 X

An Error Components Model for Prediction of County
Crop Areas Using Survey and Satellite Data

George E. Battese, University of New England
Wayne A. Fuller, Iowa State University
1. INTRODUCTION
The estimation of parameters for small areas has received considerable attention in recent years. A comprehensive review of research in small-area estimation is given by Purcell and Kish (1979). Agencies of the Federal Government have been significantly involved in this research to obtain estimates of such items as population counts, unemployment rates, per capita income, health needs, etc., for states and local government areas. Acts of the U.S. Congress (e.g., the Local Fiscal Assistance Act of 1972 and the National Health Planning and Resources Development Act of 1974) have created a need for accurate small-area estimates. Research in this area is illustrated in such papers as DiGaetano et al. (1980), Fay and Herriot (1979), Ericksen (1974), Gonzalez (1973) and Gonzalez and Hoza (1978). Fay and Herriot (1979) outline the approach used by the U.S. Bureau of the Census which is based upon James-Stein estimators [see Efron and Morris (1973), James and Stein (1961)].
"
Recently the U.S. Department of Agriculture (U.S.D.A.) has been investigating the use of LANDSAT satellite data to improve its estimates of crop areas for Crop Reporting Districts and to develop estimates for individual counties. The methodology used in some of these studies is presented in Cárdenas, Blanchard and Craig (1978), Hanuschak et al. (1979), and Sigman et al. (1978). In these studies ground observations from the U.S.D.A.'s June Enumerative Survey for sample segments are regressed on the corresponding satellite data for given strata. County estimates obtained by the regression approach generally have smaller estimated variances than those for the traditional "direct expansion" approach using survey data only.
In this paper we consider the prediction of crop areas in counties for which survey and satellite data are available. It is assumed that for sample counties, reported crop areas are obtained for a sample of area segments by interviewing farm operators. We assume that data for more than one sample segment are available for several sample counties. In addition, we assume that for each sample segment and county, satellite data are obtained and the crop cover classified for each pixel. A pixel (an acronym for "picture element") is the unit for which satellite information is recorded and is about 0.45 hectares in area. Predictors for county crop areas are obtained under the assumption that the nested-error regression model defines the relationship between the survey and satellite data.
2. NESTED-ERROR MODEL AND PREDICTORS
Consider the model
$$Y_{ij} = x_{ij}\beta + u_{ij}, \qquad i=1,\dots,t;\ j=1,\dots,n_i; \tag{1}$$
and
$$u_{ij} = v_i + e_{ij}, \tag{2}$$
where $Y_{ij}$ is the reported area in the given crop for the j-th sample segment of the i-th county as recorded in the sample survey involved; $n_i$ is the number of sample segments observed in the i-th sample county; $x_{ij}$ is a (1 x k) vector of values of explanatory variables which are functions of the satellite data; and $\beta$ is a (k x 1) vector of unknown parameters. The random errors, $v_i$, i=1,2,..., t, are assumed to be N.I.D.(0, $\sigma_v^2$), independent of the $e_{ij}$'s, which are assumed to be N.I.D.(0, $\sigma_e^2$). From these assumptions it follows that the covariance structure of the errors of the model is given by
$$E(u_{ij}u_{i'j'}) = \begin{cases} \sigma_v^2 + \sigma_e^2, & \text{if } i=i' \text{ and } j=j' \\ \sigma_v^2, & \text{if } i=i' \text{ and } j\neq j' \\ 0, & \text{if } i\neq i'. \end{cases} \tag{3}$$
This model specifies that the reported crop areas for segments
within a given county are correlated, and that the covariances are
the same for all counties, but that the reported crop areas for
different counties are not correlated. Efficient estimation of the
nested-error model is discussed in Fuller and Battese (1973).
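To make the covariance structure (3) concrete, the following minimal simulation (ours, not from the paper; the values of t, n_i, beta and the variance components are illustrative only) generates data from model (1)-(2) and checks the three cases of (3) empirically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative settings (not from the paper): t counties, n_i segments each.
t, n_i = 500, 3                     # many counties so sample moments settle
beta = np.array([10.0, 0.4])        # beta_0 (intercept) and one slope
sigma_v2, sigma_e2 = 60.0, 290.0    # variance components

# Model (1)-(2): Y_ij = x_ij beta + v_i + e_ij.
x = np.column_stack([np.ones(t * n_i), rng.uniform(100, 400, t * n_i)])
v = np.repeat(rng.normal(0.0, np.sqrt(sigma_v2), t), n_i)   # county effects
e = rng.normal(0.0, np.sqrt(sigma_e2), t * n_i)             # segment errors
u = v + e
y = x @ beta + u

# Covariance structure (3): Var(u_ij) = sigma_v^2 + sigma_e^2;
# Cov(u_ij, u_ij') = sigma_v^2 within a county; 0 across counties.
U = u.reshape(t, n_i)
print("Var(u_ij)     :", U.var(), "expected", sigma_v2 + sigma_e2)
print("within-county :", np.mean(U[:, 0] * U[:, 1]), "expected", sigma_v2)
print("across-county :", np.mean(U[:-1, 0] * U[1:, 0]), "expected 0")
```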
We consider that for each sample county, the mean crop area per segment is to be predicted. These are conditional means that are denoted by $\mu_i$, i=1,2,..., t, where
$$\mu_i = \bar{x}_{i(p)}\beta + v_i, \tag{4}$$
where $\bar{x}_{i(p)} \equiv N_i^{-1}\sum_{j=1}^{N_i} x_{ij}$, the mean of the $x_{ij}$'s for the $N_i$ population segments in the i-th county, is assumed known. Note that $\mu_i$ is the conditional mean of $Y_{ij}$ for the i-th sample county, given the realized county effect and the population mean of the $x_{ij}$-values.
It is noted that the above problem is a special case of the estimation of a linear combination of fixed effects and realized values of random effects [see Harville (1976), (1979) and Henderson (1975)]. Although some of our results are obtained as special cases of the general results given in these papers, we derive the predictors involved by first considering that the elements of $\beta$ (as well as the variance components, $\sigma_v^2 > 0$ and $\sigma_e^2 > 0$) are known. Considering the prediction of the county effects, $v_i$, i=1,2,..., t, provides motivation for the predictors of the county means to be presented later.
(a) Prediction When β is Known
If the parameters of the model (1) are known, then the random errors, $u_{ij}$, are observable. The sample mean of the random errors for county i,
$$\bar u_{i\cdot} \equiv n_i^{-1}\sum_{j=1}^{n_i} u_{ij} = v_i + \bar e_{i\cdot},$$
has unconditional mean 0 and variance $\sigma_v^2 + n_i^{-1}\sigma_e^2$. The conditional mean, $E(\bar u_{i\cdot}\mid v_i)$, is equal to $v_i$ and the conditional variance, $V(\bar u_{i\cdot}\mid v_i)$, is equal to $n_i^{-1}\sigma_e^2$. Thus, the sample mean, $\bar u_{i\cdot}$, is a conditionally unbiased predictor for $v_i$, i=1,2,..., t.
We consider the class of linear predictors for $v_i$ that is defined by
$$\tilde v_i^{(\delta)} \equiv \delta_i\bar u_{i\cdot},$$
where $\delta_i$ is a constant such that $0 \le \delta_i \le 1$. The error in this predictor is given by
$$\tilde v_i^{(\delta)} - v_i = -(1-\delta_i)v_i + \delta_i\bar e_{i\cdot}, \tag{5}$$
and so the mean squared error of the predictor is
$$E[\tilde v_i^{(\delta)} - v_i]^2 = (1-\delta_i)^2\sigma_v^2 + \delta_i^2 n_i^{-1}\sigma_e^2. \tag{6}$$
It is easily verified that (6) is equal to
$$E[\tilde v_i^{(\delta)} - v_i]^2 = (\delta_i-\gamma_i)^2(\sigma_v^2 + n_i^{-1}\sigma_e^2) + (1-\gamma_i)\sigma_v^2,$$
where $\gamma_i$ is defined by
$$\gamma_i = \sigma_v^2(\sigma_v^2 + n_i^{-1}\sigma_e^2)^{-1}. \tag{7}$$
Thus, the best linear predictor of $v_i$ is $\tilde v_i^{(\gamma)} \equiv \gamma_i\bar u_{i\cdot}$, and its mean squared error is
$$E[\tilde v_i^{(\gamma)} - v_i]^2 = (1-\gamma_i)\sigma_v^2 = \gamma_i n_i^{-1}\sigma_e^2. \tag{8}$$
However, under the assumption of normality of $v_i$ and $e_{ij}$, it follows that $E(v_i\mid\bar u_{i\cdot}) = \gamma_i\bar u_{i\cdot}$, and so $\tilde v_i^{(\gamma)}$ is the predictor with minimum mean squared error.
Since the expectation of the prediction error (5), conditional on $v_i$, is $-(1-\delta_i)v_i$, the mean of the squared conditional bias (hereafter referred to as the mean squared bias) of the predictor is
$$E[E(\tilde v_i^{(\delta)}\mid v_i) - v_i]^2 = (1-\delta_i)^2\sigma_v^2. \tag{9}$$
We consider that the mean squared bias (9) may be of basic importance in the consideration of predictors for $v_i$. In fact, we assume that information is available such that the mean squared bias of predictors is required to be no larger than a predetermined constant. The best predictor with constrained mean squared bias is stated in Theorem 1.
Theorem 1. If model (1)-(2) holds and the parameters are known, then, in the class of linear predictors, $\tilde v_i^{(\delta)} \equiv \delta_i\bar u_{i\cdot}$, for which the mean squared bias is constrained by $\Delta_i$, the best predictor is defined by
$$\tilde v_i^{(\gamma^*)} = \begin{cases} \gamma_i\bar u_{i\cdot}, & \text{if } \Delta_i \ge (1-\gamma_i)^2\sigma_v^2 \\ \gamma_i^*\bar u_{i\cdot}, & \text{if } \Delta_i < (1-\gamma_i)^2\sigma_v^2, \end{cases} \tag{10}$$
where $\gamma_i^* = 1 - (\Delta_i/\sigma_v^2)^{1/2}$; and therefore predictors satisfying $0 \le \delta_i < \gamma_i$ are inadmissible.
The mean squared error and mean squared bias functions are presented graphically in Figure 1. The δ-values that define the best linear predictor, $\tilde v_i^{(\gamma)}$, and, if the mean squared bias is to be no larger than $\Delta_i$, the best constrained predictor, $\tilde v_i^{(\gamma^*)}$, are indicated in Figure 1.
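As an illustration of Theorem 1, the following sketch (the function name and example values are ours, not the paper's) returns the weight applied to $\bar u_{i\cdot}$ by the best constrained predictor (10): $\gamma_i$ when the bias bound $\Delta_i$ is slack, and $\gamma_i^* = 1 - (\Delta_i/\sigma_v^2)^{1/2}$ otherwise.

```python
def constrained_weight(sigma_v2, sigma_e2, n_i, delta_i):
    """Weight on u_bar_i for the best predictor of v_i with
    mean squared bias at most delta_i, per equation (10)."""
    gamma_i = sigma_v2 / (sigma_v2 + sigma_e2 / n_i)          # equation (7)
    if delta_i >= (1.0 - gamma_i) ** 2 * sigma_v2:            # constraint slack
        return gamma_i                                        # best predictor
    return 1.0 - (delta_i / sigma_v2) ** 0.5                  # gamma_i*

# Example: a tight bias constraint forces a weight above gamma_i.
print(constrained_weight(60.0, 290.0, n_i=3, delta_i=30.0))   # 0.383 (gamma_i)
print(constrained_weight(60.0, 290.0, n_i=3, delta_i=1.0))    # 0.871 (gamma_i*)
```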
(b) Prediction When β is Unknown
Returning to the prediction of the conditional county means, $\mu_i \equiv \bar{x}_{i(p)}\beta + v_i$, i=1,2,..., t, it is seen that the problem is that of predicting the sum of $v_i$ and a linear function of unknown parameters. We consider the class of predictors defined by
$$\tilde\mu_i^{(\delta)} = \bar{x}_{i(p)}\hat\beta + \delta_i(\bar Y_{i\cdot} - \bar{x}_{i\cdot}\hat\beta), \tag{11}$$
where $\hat\beta$ is the best linear unbiased estimator for $\beta$ (assuming again that $\sigma_v^2 > 0$ and $\sigma_e^2 > 0$ are known); $\bar Y_{i\cdot} \equiv n_i^{-1}\sum_{j=1}^{n_i} Y_{ij}$; $\bar{x}_{i\cdot} \equiv n_i^{-1}\sum_{j=1}^{n_i} x_{ij}$; and $\delta_i$ is a constant such that $0 \le \delta_i \le 1$.
[Figure 1: Mean Squared Bias and Mean Squared Error of the Predictor $\tilde v_i^{(\delta)} \equiv \delta_i\bar u_{i\cdot}$ for $v_i$ appears here, indicating $\gamma_i$, $\gamma_i^*$, and the mean squared error for the predictor with mean squared bias equal to $\Delta_i$.]

For $\delta_i = 0$, the predictor (11) is $\tilde\mu_i^{(0)} = \bar{x}_{i(p)}\hat\beta$, which is referred to as the "regression predictor." For $\delta_i = 1$, the predictor is $\tilde\mu_i^{(1)} = \bar Y_{i\cdot} + (\bar{x}_{i(p)} - \bar{x}_{i\cdot})\hat\beta$, which is referred to as the "adjusted survey predictor." This predictor adjusts the survey sample mean, $\bar Y_{i\cdot}$, to account for the sample mean of the regressors, $\bar{x}_{i\cdot}$, differing from the population mean, $\bar{x}_{i(p)}$.
The error in the predictor (11) is expressed by
$$\tilde\mu_i^{(\delta)} - \mu_i = [-(1-\delta_i)v_i + \delta_i\bar e_{i\cdot}] + (\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})(\hat\beta - \beta), \tag{12}$$
where the first term is the prediction error (5) for the case when $\beta$ is known and the second term arises in the estimation of $\beta$. The mean squared error for the general predictor (11) and the best linear predictor are stated in the next theorem.
Theorem 2. If model (1)-(2) holds, where $\sigma_v^2$ and $\sigma_e^2$ are known positive constants, then the mean squared error for the predictor $\tilde\mu_i^{(\delta)}$ is
$$E[\tilde\mu_i^{(\delta)} - \mu_i]^2 = (1-\delta_i)^2\sigma_v^2 + \delta_i^2 n_i^{-1}\sigma_e^2 + 2(\delta_i - \gamma_i)(\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})V(\hat\beta)\bar{x}_{i\cdot}' + (\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})V(\hat\beta)(\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})', \tag{13}$$
where $V(\hat\beta)$ is the covariance matrix for $\hat\beta$. Furthermore, the mean squared error is a minimum when $\delta_i = \gamma_i \equiv \sigma_v^2(\sigma_v^2 + n_i^{-1}\sigma_e^2)^{-1}$.
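One way to realize the best predictor of Theorem 2 numerically, with $\sigma_v^2$ and $\sigma_e^2$ known, is to compute the GLS estimator $\hat\beta$ from the full covariance matrix $V = \sigma_e^2 I + \sigma_v^2 ZZ'$ and shrink each county's mean survey residual by $\gamma_i$. The sketch below is our own illustration, not code from the paper; the names and array layout are assumed (xbar_pop holds the known population means $\bar x_{i(p)}$, one row per county in np.unique(county) order).

```python
import numpy as np

def best_predictor(y, x, county, xbar_pop, sigma_v2, sigma_e2):
    """Predictor (11) with delta_i = gamma_i: GLS beta-hat plus
    gamma_i times the county mean of the survey residuals."""
    county = np.asarray(county)
    labels = np.unique(county)
    # V = sigma_e^2 I + sigma_v^2 ZZ' (block diagonal over counties).
    Z = (county[:, None] == labels[None, :]).astype(float)
    V = sigma_e2 * np.eye(len(y)) + sigma_v2 * Z @ Z.T
    Vinv = np.linalg.inv(V)
    beta = np.linalg.solve(x.T @ Vinv @ x, x.T @ Vinv @ y)    # GLS estimator
    preds = {}
    for i, lab in enumerate(labels):
        idx = county == lab
        n_i = idx.sum()
        gamma_i = sigma_v2 / (sigma_v2 + sigma_e2 / n_i)      # equation (7)
        resid_mean = (y[idx] - x[idx] @ beta).mean()
        preds[lab] = xbar_pop[i] @ beta + gamma_i * resid_mean
    return preds, beta
```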
It can be shown that the expectation of the prediction error (12), conditional on the realized random effects, $v = (v_1, v_2, \dots, v_t)'$, is
$$E(\tilde\mu_i^{(\delta)} - \mu_i \mid v) = -(1-\delta_i)v_i + (\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})V(\hat\beta)\sum_{j=1}^t \bar{x}_{j\cdot}' v_j(\sigma_v^2 + n_j^{-1}\sigma_e^2)^{-1}. \tag{14}$$
From this result the mean squared bias of $\tilde\mu_i^{(\delta)}$, denoted by $\mathrm{MSB}(\tilde\mu_i^{(\delta)})$, and the best constrained predictor are obtained, as stated in the following theorem.
Theorem 3. If model (1)-(2) holds, where $\sigma_v^2$ and $\sigma_e^2$ are known positive constants, then the mean squared bias of the predictor $\tilde\mu_i^{(\delta)}$ is
$$\mathrm{MSB}(\tilde\mu_i^{(\delta)}) = (1-\delta_i)^2\sigma_v^2 - 2(1-\delta_i)\gamma_i(\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})V(\hat\beta)\bar{x}_{i\cdot}' + \sum_{j=1}^t [(\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})V(\hat\beta)\bar{x}_{j\cdot}'\gamma_j]^2/\sigma_v^2. \tag{15}$$
Furthermore, if the mean squared bias is constrained by $\Delta_i$, then the best constrained predictor is defined by
$$\tilde\mu_i^{(\gamma^*)} = \begin{cases} \bar{x}_{i(p)}\hat\beta + \gamma_i(\bar Y_{i\cdot} - \bar{x}_{i\cdot}\hat\beta), & \text{if } \Delta_i \ge \mathrm{MSB}(\tilde\mu_i^{(\gamma)}) \\ \bar{x}_{i(p)}\hat\beta + \gamma_i^*(\bar Y_{i\cdot} - \bar{x}_{i\cdot}\hat\beta), & \text{if } \Delta_i < \mathrm{MSB}(\tilde\mu_i^{(\gamma)}), \end{cases} \tag{16}$$
where $\gamma_i^*$ is the root of $\Delta_i = E[E(\tilde\mu_i^{(\delta)}\mid v) - \mu_i]^2$ such that the corresponding predictor has the smaller mean squared error.
The mean squared bias (15) has positive second derivative with respect to $\delta_i$ and is generally expected to be monotone decreasing as $\delta_i$ increases from zero to one. It is noted that the mean squared bias is positive at $\delta_i = 1$ unless $\bar{x}_{i(p)} = \bar{x}_{i\cdot}$, for all i=1,2,..., t. In general, the minimum mean squared bias is positive and has value $E[E(\tilde\mu_i^{(1)}\mid v) - \mu_i]^2$ if the function is monotone decreasing, or $E[E(\tilde\mu_i^{(\delta^*)}\mid v) - \mu_i]^2$ otherwise, where $0 \le \delta^* < 1$. Thus the value of $\Delta_i$ cannot be less than this minimum value. Further, the mean squared bias is less than the mean squared error at $\delta_i = 0$. The two functions are depicted graphically in Figure 2. The δ-value for the best predictor with mean squared bias no larger than $\Delta_i$ is also shown.
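Since $\gamma_i^*$ in (16) is defined implicitly as a root of the quadratic (15), it can be located numerically. The following sketch is ours: it takes the county means, $V(\hat\beta)$ and the $\gamma_j$ as given arrays, evaluates $\mathrm{MSB}(\delta_i)$ on a grid, and returns the admissible weight closest to $\gamma_i$, which by Theorem 2 has the smallest mean squared error; it returns None when $\Delta_i$ is below the minimum attainable mean squared bias.

```python
import numpy as np

def msb(delta, i, xbar_pop, xbar_smp, Vbeta, gamma, sigma_v2):
    """Mean squared bias (15) of the predictor with weight delta for
    county i; xbar_pop and xbar_smp are (t, k) arrays of means."""
    d = xbar_pop[i] - delta * xbar_smp[i]
    cross = d @ Vbeta @ xbar_smp[i]
    quad = sum((d @ Vbeta @ xbar_smp[j] * gamma[j]) ** 2
               for j in range(len(gamma)))
    return ((1 - delta) ** 2 * sigma_v2
            - 2 * (1 - delta) * gamma[i] * cross + quad / sigma_v2)

def constrained_gamma(i, bound, xbar_pop, xbar_smp, Vbeta, gamma, sigma_v2,
                      grid=np.linspace(0.0, 1.0, 10001)):
    """Weight for the best constrained predictor (16): gamma_i if the
    bound is slack, else the feasible weight nearest gamma_i."""
    g = gamma[i]
    if msb(g, i, xbar_pop, xbar_smp, Vbeta, gamma, sigma_v2) <= bound:
        return g
    ok = [d for d in grid
          if msb(d, i, xbar_pop, xbar_smp, Vbeta, gamma, sigma_v2) <= bound]
    return min(ok, key=lambda d: abs(d - g)) if ok else None
```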
(c) Adjusted Predictors
In addition to predicting individual county means, there is generally interest in predicting the mean for an aggregate of counties. Suppose that a weighted average of the sample county means is to be predicted, $\bar\mu \equiv \sum_{i=1}^t W_i\mu_i$, where the weights are related to the sizes of the counties (e.g., $W_i = N_i(\sum_{j=1}^t N_j)^{-1}$). Consider the weighted average of the individual county predictors (11),
$$\bar\mu^{(\delta)} = \sum_{i=1}^t W_i\tilde\mu_i^{(\delta)}.$$
It can be shown that the mean squared error for this predictor is
$$E(\bar\mu^{(\delta)} - \bar\mu)^2 = \sum_{i=1}^t W_i^2[(1-\delta_i)^2\sigma_v^2 + \delta_i^2 n_i^{-1}\sigma_e^2] + 2\sum_{i=1}^t W_i(\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})V(\hat\beta)\Big[\sum_{i=1}^t W_i\bar{x}_{i\cdot}'(\delta_i - \gamma_i)\Big] + \Big[\sum_{i=1}^t W_i(\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})\Big]V(\hat\beta)\Big[\sum_{i=1}^t W_i(\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})\Big]'. \tag{17}$$
Furthermore, the conditional bias of $\bar\mu^{(\delta)}$ is
$$E(\bar\mu^{(\delta)} - \bar\mu \mid v) = -\sum_{i=1}^t W_i(1-\delta_i)v_i + \Big[\sum_{i=1}^t W_i(\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})\Big]V(\hat\beta)\sum_{i=1}^t \bar{x}_{i\cdot}'\gamma_i v_i/\sigma_v^2. \tag{18}$$
[Figure 2: Mean Squared Error and Mean Squared Bias of the Predictor $\tilde\mu_i^{(\delta)} \equiv \bar{x}_{i(p)}\hat\beta + \delta_i(\bar Y_{i\cdot} - \bar{x}_{i\cdot}\hat\beta)$ for $\mu_i \equiv \bar{x}_{i(p)}\beta + v_i$ appears here, indicating the mean squared error for the best predictor with mean squared bias no larger than $\Delta_i$.]
When $\delta_i = 1$ for all i=1,2,..., t, it follows that the predictor, $\bar\mu^{(1)} = \sum_{i=1}^t W_i\tilde\mu_i^{(1)}$, is approximately unbiased for $\bar\mu$, because the elements of the vector $\sum_{i=1}^t W_i(\bar{x}_{i(p)} - \bar{x}_{i\cdot})$ are likely to be close to zero. Thus, $\bar\mu^{(1)}$ may be preferred for predicting the weighted sum of the county means.
Given that the approximately unbiased predictor, $\bar\mu^{(1)}$, is desired for the weighted mean, $\bar\mu$, it is possible to adjust the individual predictors, $\tilde\mu_i^{(\delta)}$, i=1,2,..., t, so that the weighted average of the adjusted predictors is identical to the predictor $\bar\mu^{(1)}$. This is accomplished by adding the bias correction
$$\sum_{j=1}^t W_j(1-\delta_j)(\bar Y_{j\cdot} - \bar{x}_{j\cdot}\hat\beta) \tag{19}$$
to each predictor $\tilde\mu_i^{(\delta)}$, i=1,2,..., t. Let these adjusted predictors be represented by $\hat\mu_i^{(\delta)}$, where
$$\hat\mu_i^{(\delta)} = \tilde\mu_i^{(\delta)} + \sum_{j=1}^t W_j(1-\delta_j)(\bar Y_{j\cdot} - \bar{x}_{j\cdot}\hat\beta). \tag{20}$$
It can be shown that the mean squared error for $\hat\mu_i^{(\delta)}$ is
$$E[\hat\mu_i^{(\delta)} - \mu_i]^2 = E[\tilde\mu_i^{(\delta)} - \mu_i]^2 + 2W_i(1-\delta_i)[\delta_i(\sigma_v^2 + n_i^{-1}\sigma_e^2) - \sigma_v^2] - 2(\delta_i - \gamma_i)\Big[\sum_{j=1}^t W_j(1-\delta_j)\bar{x}_{j\cdot}\Big]V(\hat\beta)\bar{x}_{i\cdot}' + \sum_{j=1}^t W_j^2(1-\delta_j)^2(\sigma_v^2 + n_j^{-1}\sigma_e^2) - \Big[\sum_{j=1}^t W_j(1-\delta_j)\bar{x}_{j\cdot}\Big]V(\hat\beta)\Big[\sum_{j=1}^t W_j(1-\delta_j)\bar{x}_{j\cdot}\Big]'. \tag{21}$$
Note that when $\delta_i = \gamma_i$, i=1,2,..., t, the covariance between the error in the predictor (11) and the correction term (19) [the second and third terms of (21)] is zero, and so the mean squared error for the adjusted predictor, $\hat\mu_i^{(\gamma)}$, exceeds that for the best predictor $\tilde\mu_i^{(\gamma)}$.
(d) Prediction for Nonsample Counties

In this section we discuss the efficiency in predicting means for a county containing sample observations relative to a county for which no sample observations are available, hereafter referred to as a "nonsample county". Suppose observations are available on t sample counties and the notation of previous sections applies. Let the mean of a nonsample county be denoted by
$$\mu_* = \bar{x}_{*(p)}\beta + v_*,$$
where $\bar{x}_{*(p)}$ is known and $v_*$ is independent of $v = (v_1, v_2, \dots, v_t)'$ and has variance $\sigma_v^2$. The best predictor for $\mu_*$ is
$$\tilde\mu_*^{(0)} = \bar{x}_{*(p)}\hat\beta, \tag{22}$$
where $\hat\beta$ is the best linear unbiased estimator for $\beta$, based on the sample observations. The mean squared error of this predictor (22) is
$$E[\tilde\mu_*^{(0)} - \mu_*]^2 = \sigma_v^2 + \bar{x}_{*(p)}V(\hat\beta)\bar{x}_{*(p)}'. \tag{23}$$
Suppose that the i-th sample county has mean, $\mu_i = \bar{x}_{i(p)}\beta + v_i$, such that $\bar{x}_{i(p)} = \bar{x}_{*(p)}$. It can be shown that the difference between the mean squared error of $\tilde\mu_*^{(0)}$ and that of the general predictor, $\tilde\mu_i^{(\delta)}$, for a sample county is
$$E[\tilde\mu_*^{(0)} - \mu_*]^2 - E[\tilde\mu_i^{(\delta)} - \mu_i]^2 = \gamma_i^2[(\sigma_v^2 + n_i^{-1}\sigma_e^2) - \bar{x}_{i\cdot}V(\hat\beta)\bar{x}_{i\cdot}'] + 2\gamma_i\bar{x}_{*(p)}V(\hat\beta)\bar{x}_{i\cdot}' - (\delta_i - \gamma_i)^2[(\sigma_v^2 + n_i^{-1}\sigma_e^2) - \bar{x}_{i\cdot}V(\hat\beta)\bar{x}_{i\cdot}'].$$
This expression implies that $E[\tilde\mu_*^{(0)} - \mu_*]^2 - E[\tilde\mu_i^{(\delta)} - \mu_i]^2 > 0$ if and only if
$$(\delta_i - \gamma_i)^2 < \gamma_i^2 + 2\gamma_i\bar{x}_{*(p)}V(\hat\beta)\bar{x}_{i\cdot}'[(\sigma_v^2 + n_i^{-1}\sigma_e^2) - \bar{x}_{i\cdot}V(\hat\beta)\bar{x}_{i\cdot}']^{-1}. \tag{24}$$
This implies that any predictor, $\tilde\mu_i^{(\delta)}$, for a sample county has a smaller mean squared error than the predictor, $\tilde\mu_*^{(0)} = \bar{x}_{*(p)}\hat\beta$, for a nonsample county, provided $\delta_i$ is in a sufficiently small neighborhood of $\gamma_i$. For some values of the parameters the upper bound in the inequality (24) may exceed one, in which case the predictor for the nonsample county has larger mean squared error than any predictor for a sample county with corresponding parameters.
3. ESTIMATION OF VARIANCES
When the variance components, $\sigma_v^2$ and $\sigma_e^2$, are unknown, different estimators can be used. Harville (1977) contains a discussion of estimation methods for components-of-variance models. We use the fitting-of-constants estimators presented in Fuller and Battese (1973) for the nested-error model (1)-(2). The estimators are given here in a form in which estimators for their variances are presented.
The fitting-of-constants estimator for $\sigma_e^2$, denoted by $\hat\sigma_e^2$, is given by the residual mean square for the regression model in which a suitable set of indicator variables is defined to account for the county effects, $v_i$, i=1,2,..., t; that is,
$$\hat\sigma_e^2 = \hat e'\hat e/\mathrm{df}(W),$$
where $\hat e'\hat e$ is the residual sum of squares for the regression of the Y-deviations, $Y_{ij} - \bar Y_{i\cdot}$, on those x-deviations, $x_{ij} - \bar{x}_{i\cdot}$, that are not identically zero; and df(W) is the degrees of freedom for the variation within counties, which is given by
$$\mathrm{df}(W) = \sum_{i=1}^t n_i - k - (t - \lambda),$$
where λ represents the number of the x-variables that are a linear combination of the indicator variables for counties.
The fitting-of-constants estimator for $\sigma_v^2$, denoted by $\hat\sigma_v^2$, can be defined in terms of the mean square for the variation among counties, MS(Among), which is defined by
$$\mathrm{MS(Among)} = (\hat u'\hat u - \hat e'\hat e)/(t-1),$$
where $\hat u'\hat u$ is the residual sum of squares for the ordinary least-squares fit of the model, $Y_{ij} = x_{ij}\beta + u_{ij}$, where $u_{ij} = v_i + e_{ij}$; and $(t-1)$ is the degrees of freedom for the variation among counties. Using this notation it follows that $\hat\sigma_v^2$ is defined by
$$\hat\sigma_v^2 = [\mathrm{MS(Among)} - \hat\sigma_e^2]\,n_*^{-1}, \tag{25}$$
where
$$n_* = \sum_{i=1}^t n_i[1 - n_i\bar{x}_{i\cdot}(X'X)^{-1}\bar{x}_{i\cdot}'](t-1)^{-1}$$
and $X$ is the matrix of values of the k x-variables. This is an unbiased estimator for $\sigma_v^2$ because the expectation of $\hat u'\hat u - \hat e'\hat e$ is given by [cf. Fuller and Battese (1973, p.628)]
$$E(\hat u'\hat u - \hat e'\hat e) = (t-1)n_*\sigma_v^2 + (t-1)\sigma_e^2. \tag{26}$$
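The two regressions that define the fitting-of-constants estimators can be coded directly. The sketch below is our own illustration (not the SUPER CARP implementation referenced in Section 4): it centers the data within counties, drops the λ x-columns that the county indicators absorb, and uses the (t-1) among-counties degrees of freedom adopted in the text. Note that, as with the Missouri data of Section 4, the resulting $\hat\sigma_v^2$ can be negative.

```python
import numpy as np

def fitting_of_constants(y, x, county):
    """Fitting-of-constants estimators of sigma_e^2 and sigma_v^2 for
    model (1)-(2), following the two regressions described in the text."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    county = np.asarray(county)
    labels, n = np.unique(county, return_counts=True)
    t, k = len(labels), x.shape[1]

    def center(a):
        out = a.copy()
        for lab in labels:
            m = county == lab
            out[m] -= out[m].mean(axis=0)
        return out

    # Within regression: deviations from county means; drop the lambda
    # x-columns that become identically zero (e.g., an intercept).
    yd, xd = center(y), center(x)
    keep = ~np.all(np.abs(xd) < 1e-10, axis=0)
    lam = k - int(keep.sum())
    resid_w = yd - xd[:, keep] @ np.linalg.lstsq(xd[:, keep], yd, rcond=None)[0]
    df_w = len(y) - k - (t - lam)
    sig_e2 = resid_w @ resid_w / df_w                 # residual mean square

    # OLS on the full model gives u-hat'u-hat; then MS(Among), n_*, sigma_v^2.
    resid_o = y - x @ np.linalg.lstsq(x, y, rcond=None)[0]
    ms_among = (resid_o @ resid_o - resid_w @ resid_w) / (t - 1)
    xtx_inv = np.linalg.inv(x.T @ x)
    xbar = np.vstack([x[county == lab].mean(axis=0) for lab in labels])
    leverage = np.einsum('ij,jk,ik->i', xbar, xtx_inv, xbar)
    n_star = np.sum(n * (1.0 - n * leverage)) / (t - 1)
    sig_v2 = (ms_among - sig_e2) / n_star             # eq. (25); may be < 0
    return sig_e2, sig_v2
```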
Theorem 4. Given the model (1)-(2), the statistics $\hat\sigma_e^2$ and MS(Among) are independently distributed and have variances
$$V(\hat\sigma_e^2) = 2\sigma_e^4/\mathrm{df}(W) \tag{27}$$
and
$$V[\mathrm{MS(Among)}] = 2[\sigma_e^4 + 2n_*\sigma_e^2\sigma_v^2 + n_{**}\sigma_v^4](t-1)^{-1}, \tag{28}$$
where
$$n_{**} = \sum_{i=1}^t n_i^2[1 - n_i\bar{x}_{i\cdot}(X'X)^{-1}\bar{x}_{i\cdot}'](t-1)^{-1}.$$
Given these results it follows that estimators for the variance of $\hat\sigma_v^2$ and the covariance between $\hat\sigma_v^2$ and $\hat\sigma_e^2$ are defined by
$$\hat V(\hat\sigma_v^2) = \{\hat V[\mathrm{MS(Among)}] + \hat V(\hat\sigma_e^2)\}\,n_*^{-2} \tag{29}$$
and
$$\hat C(\hat\sigma_v^2, \hat\sigma_e^2) = -\hat V(\hat\sigma_e^2)\,n_*^{-1}, \tag{30}$$
where $\hat V(\hat\sigma_e^2)$ and $\hat V[\mathrm{MS(Among)}]$ are estimators for the variances of $\hat\sigma_e^2$ and MS(Among) that are defined by substituting $\hat\sigma_e^2$ and $\hat\sigma_v^2$ for $\sigma_e^2$ and $\sigma_v^2$, respectively, in (27) and (28). These estimators for the variances and covariances of the variance estimators are necessary for inference about $\sigma_v^2$ and $\sigma_e^2$ and for obtaining approximate generalized least-squares estimates for the variances when prior information is available.
The estimated generalized least-squares estimator for $\beta$ that is defined by replacing the variances $\sigma_v^2$ and $\sigma_e^2$ by their corresponding estimators $\hat\sigma_v^2$ and $\hat\sigma_e^2$ is denoted by $\tilde\beta$. Since the predictor (11) with minimum mean squared error is $\tilde\mu_i^{(\gamma)} \equiv \bar{x}_{i(p)}\hat\beta + \gamma_i(\bar Y_{i\cdot} - \bar{x}_{i\cdot}\hat\beta)$, it is approximated by the predictor, $\tilde\mu_i^{(\hat\gamma)}$, defined by
$$\tilde\mu_i^{(\hat\gamma)} = \bar{x}_{i(p)}\tilde\beta + \hat\gamma_i(\bar Y_{i\cdot} - \bar{x}_{i\cdot}\tilde\beta), \tag{31}$$
where $\hat\gamma_i = \hat\sigma_v^2(\hat\sigma_v^2 + n_i^{-1}\hat\sigma_e^2)^{-1}$. The relationship between the mean squared error of this predictor and that which assumes that the variances are known is considered in the next theorem.
Theorem 5. If model (1)-(2) and the regularity conditions of Theorem 3 of Fuller and Battese (1973, p.629) apply, then the mean squared error for the predictor $\tilde\mu_i^{(\hat\gamma)}$ is approximated by
$$E[\tilde\mu_i^{(\hat\gamma)} - \mu_i]^2 = E[\tilde\mu_i^{(\gamma)} - \mu_i]^2 + n_i^{-2}(\sigma_v^2 + n_i^{-1}\sigma_e^2)^{-3}V(\sigma_v^2\hat\sigma_e^2 - \sigma_e^2\hat\sigma_v^2) + O_p\big(\max\{t^{-1}, [\textstyle\sum_{i=1}^t (n_i - 1)]^{-1}\}\big). \tag{32}$$
For the special case in which all sample sizes are equal and only a scalar parameter is being estimated (i.e., $n_i = r$, i=1,2,..., t, and $x_{ij} = 1$ for all i and j), the increase in the mean squared error due to estimation of the variances is given by $2r^{-2}\sigma_e^4(\sigma_v^2 + r^{-1}\sigma_e^2)^{-1}(n-1)[(t-1)(n-t)]^{-1}$, where $n \equiv rt$.
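A quick numerical reading of this special case, with illustrative values that are ours rather than the paper's (r = 4, t = 10, $\sigma_v^2 = \sigma_e^2 = 1$): the increase is about 0.014, against a leading mean squared error term of $\gamma r^{-1}\sigma_e^2 = 0.2$ from (8).

```python
def mse_increase(r, t, sigma_v2, sigma_e2):
    """Increase in MSE from estimating the variances in the equal-n_i,
    scalar-parameter case of Theorem 5:
    2 r^-2 sigma_e^4 (sigma_v^2 + r^-1 sigma_e^2)^-1 (n-1)/((t-1)(n-t)),
    with n = r t."""
    n = r * t
    return (2 * sigma_e2 ** 2 / (r ** 2 * (sigma_v2 + sigma_e2 / r))
            * (n - 1) / ((t - 1) * (n - t)))

print(mse_increase(4, 10, 1.0, 1.0))   # ~0.014
```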
4. EMPIRICAL RESULTS
We consider prediction of areas of corn and soybeans for 12 counties in north-central Iowa, based on data for 1978. The Economics and Statistics Service of the U.S. Department of Agriculture determined the areas under corn and soybeans in 37 area sampling units (segments) in the 12 counties during the June Enumerative Survey in 1978. The segments are about 259 hectares (or one square mile) in area. The numbers of pixels classified as corn and soybeans in these area segments were determined from NASA's LANDSAT satellites during passes over Iowa in August and September 1978. The areas under corn and soybeans, as reported in the June Enumerative Survey, and the satellite data for the sample segments are presented in Table 1. Also presented are the population number of segments and the mean number of pixels of corn and soybeans for the 12 counties involved.
To obtain predictions of crop areas we assume the simple model,
$$Y_{ij} = \beta_0 + \beta_1 X_{ij} + u_{ij}, \qquad i=1,2,\dots,12;\ j=1,2,\dots,n_i,$$
where $Y_{ij}$ is the number of hectares of corn (or soybeans) in the j-th sample segment of the i-th county as recorded in the June Enumerative Survey in 1978; and $X_{ij}$ is the number of pixels of corn (or soybeans) for the j-th sample segment of county i. The parameter estimates are obtained by use of the nested-error software of SUPER CARP [Hidiroglou, Fuller and Hickman (1980)]. The parameter estimates and their estimated standard errors (in parentheses) are given to two significant digits, as follows:
For corn: $\hat Y_{ij} = \underset{(13.5)}{5.5} + \underset{(0.044)}{0.388}\,X_{ij}$, where $\hat\sigma_v^2 = 60\ (75)$ and $\hat\sigma_e^2 = 292\ (84)$.

For soybeans: $\hat Y_{ij} = \underset{(9.3)}{-3.8} + \underset{(0.040)}{0.475}\,X_{ij}$, where $\hat\sigma_v^2 = 250\ (142)$ and $\hat\sigma_e^2 = 184\ (53)$.
In each case, the estimate for the intercept parameter is not significantly different from zero. The among-counties variance estimate, $\hat\sigma_v^2$, is not significant for corn, but is significant at the 5% level for soybeans. This may be due, in part, to the rotation patterns for these crops.
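The weights $\hat\gamma_i$ implied by these variance estimates follow from (7) with $\hat\sigma_v^2$ and $\hat\sigma_e^2$ substituted; the snippet below reproduces the $\hat\gamma_i$ values that appear in Table 2 (the rounding to two decimals is ours).

```python
# gamma_hat_i = sigma_v2_hat / (sigma_v2_hat + sigma_e2_hat / n_i), eq. (7)
for crop, sv2, se2 in [("corn", 60.0, 292.0), ("soybeans", 250.0, 184.0)]:
    weights = {n: round(sv2 / (sv2 + se2 / n), 2) for n in range(1, 7)}
    print(crop, weights)
# corn:     {1: 0.17, 2: 0.29, 3: 0.38, 4: 0.45, 5: 0.51, 6: 0.55}
# soybeans: {1: 0.58, 2: 0.73, 3: 0.80, 4: 0.84, 5: 0.87, 6: 0.89}
```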
Given the preceding results, the predictions for the mean hectares of corn and soybeans per segment in the several counties are listed in Table 2 for the different predictors discussed in the preceding sections.
Table 1: Survey and Satellite Data for Corn and Soybeans in Sample Segments of Selected Iowa Counties

                Segments in       Reported hectares      Pixels in sample      Mean pixels
                the county                               segments              for counties
County          Popn.  Sample     Corn      Soybeans     Corn     Soybeans     Corn      Soybeans
Cerro Gordo      545     1        165.76      8.09        374        55        295.29    189.70
Franklin         564     3        162.08     43.50        361       137        318.21    188.06
                                  152.04     71.43        288       206
                                  161.75     42.49        369       165
Hamilton         566     1         96.32    106.03        209       218        300.40    196.65
Hancock          569     5        114.12     99.15        313       190        314.28    198.66
                                  100.60    124.56        246       270
                                  127.88    110.88        353       172
                                  116.90    109.14        271       228
                                   87.41    143.66        237       297
Hardin           556     6         88.59    102.59        220       262        325.99    177.05
                                   88.59     29.46        340        87
                                  165.35     69.28        355       160
                                  104.00     99.15        261       221
                                   88.63    143.66        187       345
                                  153.70     94.49        350       190
Humboldt         424     2        185.35      6.47        432        96        290.74    220.22
                                  116.43     63.82        367       178
Kossuth                  5         93.48     91.05        221       167        298.65    204.61
                                  121.00    142.33        369       191
                                  109.91    143.14        343       249
                                  122.66    104.13        342       182
                                  104.21    118.57        294       179
Pocahontas       570     3         92.88    105.26        206       218        257.17    247.13
                                  149.94     76.49        316       221
                                   64.75    174.34        145       338
Webster          687     4         99.96    144.15        252       303        262.17    247.09
                                  140.43    103.60        293       221
                                   98.95     88.59        206       222
                                  131.04    115.58        302       274
Winnebago        402     3        127.07     95.67        355       128        291.77    185.37
                                  133.55     76.57        295       147
                                   77.70     93.48        223       204
Worth            394     1         76.08    103.60        253       250        289.60    205.28
Wright           567     3        206.39     37.84        459        77        301.26    221.36
                                  108.33    131.12        290       217
                                  118.17    124.44        307       258
Also presented are the values of the sample mean of the reported crop hectares for the June Enumerative Survey. The square root of the estimated mean squared error is given in parentheses below the corresponding prediction.
It is evident from Table 2 that the use of the satellite data to obtain predictions is much more efficient than using only the reported crop areas from the June Enumerative Survey. The mean squared errors of the sample mean of the reported hectares are relatively large for the individual counties. For predicting soybean areas, the regression predictor is always less efficient than the adjusted survey predictor because the variance among counties is the dominant term in the total variance. The reverse situation applies with corn.
The approximate mean squared error for the estimated best predictor, $\tilde\mu_i^{(\hat\gamma)}$, is less than that for the regression predictor, $\tilde\mu_i^{(0)}$, when at least three sample segments are available for predicting corn. However, for the case of soybeans, in which the among-counties variance is dominant, the mean squared error for the best predictor is always less than that for the regression predictor. The mean squared error for the estimated best predictor for corn exceeds that for the best predictor when the variances are assumed known, such that the percentage difference increases as the number of sample segments increases from one to five. The percentage increases in the mean squared error range from 18% to 30% over this range of observations. However, estimation of the variances for soybeans results in decreasing percentage increases in the mean squared error of the best predictor as the number of sample segments within a county increases.
Table 2: Predicted Mean Hectares of Corn and Soybeans per Segment

[Table 2 appears here. For each of the 12 counties it lists, separately for corn and soybeans, the estimated weight $\hat\gamma_i$, the regression predictor $\tilde\mu_i^{(0)}$, the estimated best predictor $\tilde\mu_i^{(\hat\gamma)}$, the adjusted survey predictor $\tilde\mu_i^{(1)}$, and the sample mean of the reported hectares, with the square roots of the estimated mean squared errors in parentheses.]

The percentage increases in the mean squared error decline from 9% to 4% as the number of sample segments increases from one to six.
The adjusted survey predictor, $\tilde\mu_i^{(1)}$, is less efficient than the regression predictor for corn when the number of sample segments is no larger than four. In the case of soybeans the adjusted survey predictor is always more efficient than the regression predictor.
The values of the adjusted predictors (20), $\hat\mu_i^{(0)}$ and $\hat\mu_i^{(\hat\gamma)}$, are given in Table 3 for both corn and soybeans. It is evident that the mean squared errors for the adjusted best predictor, $\hat\mu_i^{(\hat\gamma)}$, are slightly larger than those for the best predictor $\tilde\mu_i^{(\hat\gamma)}$. However, the adjusted predictor, $\hat\mu_i^{(0)}$, sometimes has smaller mean squared error than the regression predictor, $\tilde\mu_i^{(0)}$. The last row of Table 3 involves the weighted average of the adjusted survey predictors for the sample counties, $\sum_{i=1}^{12} W_i\tilde\mu_i^{(1)}$.
The efficiency of predicting county mean crop areas for sample counties relative to predicting for nonsample counties with corresponding values of the regressors is investigated by calculating the ratio of their mean squared errors, $E[\tilde\mu_i^{(\gamma)} - \mu_i]^2 / E[\tilde\mu_*^{(0)} - \mu_*]^2$. This ratio is a rather complicated function of the variances, the values of the x-variables, and the sample sizes. However, for the corn and soybean data, the values of this ratio for the different counties varied little for particular numbers of sample segments. The average value of the ratio is presented in Table 4 for the different values of the sample sizes. Although these statistics are based on different numbers of observations and should be interpreted with caution, they show an interesting pattern.
Table 3: Adjusted Predictions of Mean Hectares of Corn and Soybeans per Segment

[Table 3 appears here. For each of the 12 counties it lists, separately for corn and soybeans, the adjusted regression predictor $\hat\mu_i^{(0)}$ and the adjusted best predictor $\hat\mu_i^{(\hat\gamma)}$, with the square roots of the estimated mean squared errors in parentheses. The final row gives the weighted mean, 120.4 (3.2) for corn and 96.3 (2.6) for soybeans, which is identical for the two adjusted predictors.]
Table 4: Averages of the Ratio of the Mean Squared Error for the Best Predictor for a Sample County to that for the Regression Predictor for a Nonsample County

Number of sample        Average MSE ratio
segments, n_i           Corn      Soybeans
      1                 0.99        0.45
      2                 0.93        0.30
      3                 0.79        0.20
      4                 0.70        0.15
      5                 0.64        0.13
      6                 0.59        0.11
As the number of sample segments increases, the relative mean squared error decreases, but at a declining rate. This is due to the fact that the mean squared error for the best predictor decreases markedly as the number of sample segments increases, but that for the regression predictor does not. The decreases in the relative mean squared error with increasing numbers of sample segments are quite substantial for soybeans. Furthermore, these data would suggest that obtaining data for a few sample segments in more counties is likely to result in greater efficiency of prediction than obtaining more data for fewer counties.
Empirical analyses were conducted on a smaller data set involving corn, soybeans and winter wheat for twenty segments in five counties in north-central Missouri. The values of the estimator (25) for the among-counties variance were negative for soybeans and winter wheat, and the value for corn was not significantly greater than zero. Thus the best predictions in these cases were equivalent to, or not significantly different from, the ordinary least-squares predictions.
5. CONCLUSIONS

The nested-error regression model with satellite data as the auxiliary variable offers a promising approach to prediction of crop areas for counties, provided the segment data are sufficiently highly correlated. A reasonably large number of counties is required for the among-counties variance to be satisfactorily estimated. The U.S. Department of Agriculture plans to implement the software of the nested-error approach for the prediction of county crop areas in the next crop year.
ACKNOWLEDGEMENTS

This research was partly supported by Research Agreement 58-319T-I-0054X with the Economics and Statistics Service of the U.S. Department of Agriculture. We thank Cheryl Auer for writing the computer programs for the empirical analyses and Rachel Harter for helpful discussions on earlier drafts of the paper. This research was conducted during the time the first author was at Iowa State University under an Outside Studies Program from the University of New England.
REFERENCES

Cárdenas, M., Blanchard, M.M., and Craig, M.E. (1978), On the Development of Small Area Estimators Using LANDSAT Data as Auxiliary Information, Economics, Statistics, and Cooperatives Service, U.S. Department of Agriculture, Washington, D.C.

DiGaetano, R., MacKenzie, E., Waksberg, J., and Yaffe, R. (1980), "Synthetic Estimates for Local Areas from the Health Interview Survey," 1980 Proceedings of the Section on Survey Research Methods, American Statistical Association.

Efron, B., and Morris, C. (1973), "Stein's Estimation Rule and Its Competitors - An Empirical Bayes Approach," Journal of the American Statistical Association, 68, 117-130.

Ericksen, E.P. (1974), "A Regression Method for Estimating Population Changes of Local Areas," Journal of the American Statistical Association, 69, 867-875.

Fay, R.E., and Herriot, R. (1979), "Estimates of Income for Small Places: An Application of James-Stein Procedures to Census Data," Journal of the American Statistical Association, 74, 269-277.

Fuller, W.A., and Battese, G.E. (1973), "Transformations for Estimation of Linear Models with Nested-Error Structure," Journal of the American Statistical Association, 68, 626-632.

Gonzalez, M.E. (1973), "Use and Evaluation of Synthetic Estimates," 1973 Proceedings of the Social Statistics Section, American Statistical Association, 33-36.

Gonzalez, M.E., and Hoza, C. (1978), "Small Area Estimation with Application to Unemployment and Housing Estimates," Journal of the American Statistical Association, 73, 7-15.

Hanuschak, G., Sigman, R., Craig, M., Ozga, M., Luebbe, R., Cook, P., Kleweno, D., and Miller, C. (1979), Obtaining Timely Crop Area Estimates Using Ground-Gathered and LANDSAT Data, Technical Bulletin No. 1609, Economics, Statistics, and Cooperatives Service, U.S. Department of Agriculture, Washington, D.C.

Harville, D.A. (1976), "Extension of the Gauss-Markov Theorem to Include the Estimation of Random Effects," The Annals of Statistics, 4, 384-395.

Harville, D.A. (1977), "Maximum Likelihood Approaches to Variance Component Estimation and Related Problems," Journal of the American Statistical Association, 72, 320-338.

Harville, D.A. (1979), "Some Useful Representations for Constrained Mixed-Model Estimation," Journal of the American Statistical Association, 74, 200-206.

Henderson, C.R. (1975), "Best Linear Unbiased Estimation and Prediction Under a Selection Model," Biometrics, 31, 423-447.

Hidiroglou, M.A., Fuller, W.A., and Hickman, R.D. (1980), SUPER CARP, Sixth Edition, Survey Section, Statistical Laboratory, Iowa State University, Ames.

James, W., and Stein, C. (1961), "Estimation with Quadratic Loss," Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, University of California Press, 361-379.

Purcell, N.J., and Kish, L. (1979), "Estimation for Small Domains," Biometrics, 35, 365-384.

Searle, S.R. (1971), Linear Models, Wiley, New York.

Sigman, R.S., Hanuschak, G.A., Craig, M.E., Cook, P.W., and Cárdenas, M. (1978), "The Use of Regression Estimation with LANDSAT and Probability Ground Sample Data," 1978 Proceedings of the Section on Survey Research Methods, American Statistical Association, 165-168.
APPENDIX
Outlines of the proofs of the basic results in the paper are given
in the Appendix.
Proof of Theorem 1:
If $\Delta_i \ge (1-\gamma_i)^2\sigma_v^2$, then the predictor $\tilde v_i^{(\gamma)}$ has mean squared bias no larger than $\Delta_i$ and therefore is best, since it has the smallest mean squared error. If $\Delta_i < (1-\gamma_i)^2\sigma_v^2$, then a predictor with smaller mean squared bias than that of $\tilde v_i^{(\gamma)}$ is required. Since the mean squared bias is a monotone decreasing function of $\delta_i$, this implies that $\delta_i$ must exceed $\gamma_i$. However, in the range $\gamma_i < \delta_i \le 1$, the mean squared error is a monotone increasing function of $\delta_i$. Thus the smallest $\delta_i$ for which the mean squared bias, $(1-\delta_i)^2\sigma_v^2$, does not exceed $\Delta_i$ defines the best predictor, namely $\gamma_i^* = 1 - (\Delta_i/\sigma_v^2)^{1/2}$. Thus, for no $\Delta_i$ will a predictor $\tilde v_i^{(\delta)} \equiv \delta_i\bar u_{i\cdot}$ for which $0 \le \delta_i < \gamma_i$ be optimal.
Proof of Theorem 2:
The first and third terms of (13) are clearly the variances of the two terms in (12). The second term, the covariance between these two terms, is obtained by use of the results
$$E[(\hat\beta - \beta)v_i] = V(\hat\beta)\bar{x}_{i\cdot}'\gamma_i \tag{A.1}$$
and
$$E[(\hat\beta - \beta)\bar e_{i\cdot}] = V(\hat\beta)\bar{x}_{i\cdot}'(1 - \gamma_i). \tag{A.2}$$
These results are obtained as follows:
$$\hat\beta - \beta = \Big[\sum_{j=1}^t X_j'V_j^{-1}X_j\Big]^{-1}\sum_{j=1}^t X_j'V_j^{-1}u_j,$$
where $X_j \equiv (x_{j1}', x_{j2}', \dots, x_{jn_j}')'$ is the $(n_j \times k)$ matrix of values of the regressor variables for the j-th sample county; and $V_j$ is the covariance matrix for $u_j \equiv (u_{j1}, u_{j2}, \dots, u_{jn_j})'$, which is expressed by
$$V_j = \sigma_e^2 I_j + \sigma_v^2 J_j,$$
where $I_j$ is the identity matrix of order $n_j$ and $J_j$ is the $(n_j \times n_j)$ matrix with all entries equal to one. Note also that
$$V(\hat\beta) = \Big[\sum_{j=1}^t X_j'V_j^{-1}X_j\Big]^{-1}.$$
Thus,
$$E[(\hat\beta - \beta)v_i] = V(\hat\beta)\sum_{j=1}^t X_j'V_j^{-1}E(u_j v_i) = V(\hat\beta)X_i'V_i^{-1}1_i\sigma_v^2,$$
where $1_i$ is a column vector of order $n_i$ having all entries equal to one. By expressing $V_i$ as a sum involving mutually orthogonal matrices, namely
$$V_i = \sigma_e^2(I_i - n_i^{-1}J_i) + (\sigma_e^2 + n_i\sigma_v^2)n_i^{-1}J_i,$$
it follows that
$$V_i^{-1} = \sigma_e^{-2}(I_i - n_i^{-1}J_i) + (\sigma_e^2 + n_i\sigma_v^2)^{-1}n_i^{-1}J_i,$$
and so
$$E[(\hat\beta - \beta)v_i] = V(\hat\beta)X_i'(\sigma_e^2 + n_i\sigma_v^2)^{-1}1_i\sigma_v^2 = V(\hat\beta)n_i\bar{x}_{i\cdot}'(\sigma_e^2 + n_i\sigma_v^2)^{-1}\sigma_v^2,$$
i.e., $E[(\hat\beta - \beta)v_i] = V(\hat\beta)\bar{x}_{i\cdot}'\gamma_i$. This establishes the result (A.1). The proof of the second result (A.2) is similar and is omitted.
The predictor $\tilde\mu_i^{(\gamma)} \equiv \bar{x}_{i(p)}\hat\beta + \gamma_i(\bar Y_{i\cdot} - \bar{x}_{i\cdot}\hat\beta)$ can be shown to have the smallest mean squared error as follows: writing
$$\tilde\mu_i^{(\delta)} - \mu_i = (\tilde\mu_i^{(\gamma)} - \mu_i) + (\tilde\mu_i^{(\delta)} - \tilde\mu_i^{(\gamma)}),$$
by equations (A.1) and (A.2) it follows that $\tilde\mu_i^{(\gamma)} - \mu_i$ and $\tilde\mu_i^{(\delta)} - \tilde\mu_i^{(\gamma)}$ are uncorrelated and
$$E[\tilde\mu_i^{(\gamma)} - \tilde\mu_i^{(\delta)}]^2 = (\gamma_i - \delta_i)^2[(\sigma_v^2 + n_i^{-1}\sigma_e^2) - \bar{x}_{i\cdot}V(\hat\beta)\bar{x}_{i\cdot}'].$$
Proof of Theorem 3:
It follows from (12) that
$$E(\tilde\mu_i^{(\delta)} - \mu_i \mid v) = -(1-\delta_i)v_i + (\bar{x}_{i(p)} - \delta_i\bar{x}_{i\cdot})E(\hat\beta - \beta \mid v),$$
where $v = (v_1, v_2, \dots, v_t)'$. But by methods similar to those in the derivation of (A.1), it follows that
$$E(\hat\beta \mid v) = \beta + V(\hat\beta)\sum_{j=1}^t \bar{x}_{j\cdot}' v_j(\sigma_v^2 + n_j^{-1}\sigma_e^2)^{-1},$$
and so the result of equation (15) follows readily. Further, the second derivative of (15) with respect to $\delta_i$ is positive, because it is twice the expectation of the square of $v_i - \bar{x}_{i\cdot}V(\hat\beta)\sum_{j=1}^t \bar{x}_{j\cdot}'v_j\gamma_j/\sigma_v^2$. Also, the mean squared bias function has a minimum for a positive value of $\delta_i$ provided the coefficient of $\delta_i$ in (15) is negative.
Proof of Theorem 4:
Now $\hat u'\hat u = Y'MY$ and $\hat e'\hat e = Y'M_*Y$, where $M = I - X(X'X)^{-1}X'$, $M_* = I - X_*(X_*'X_*)^- X_*'$, and $X_* = (X : Z)$, where $Z$ is the matrix of indicator variables for the t counties and $(X_*'X_*)^-$ denotes the generalized inverse of $(X_*'X_*)$.
We show that $(M - M_*)VM_* = 0$, and so MS(Among) and $\hat\sigma_e^2$ are independent [see Theorem 4 of Searle (1971)]. Now $V = \sigma_e^2 I + \sigma_v^2 ZZ'$, and so $VM_* = \sigma_e^2 M_* + \sigma_v^2 ZZ'M_* = \sigma_e^2 M_*$, because $Z'M_* = 0$. Thus $(M - M_*)VM_* = 0$ because $MM_* = M_*$.
Further, the variance of $\hat u'\hat u - \hat e'\hat e$ is given by
$$V(\hat u'\hat u - \hat e'\hat e) = 2\,\mathrm{tr}[(M - M_*)V(M - M_*)V]$$
[see Searle (1971), p.451]. Now
$$(M - M_*)V = (M - M_*)\sigma_e^2 + (M - M_*)\sigma_v^2 ZZ' = \sigma_e^2(M - M_*) + \sigma_v^2 MZZ'.$$
Thus
$$\mathrm{tr}\{[(M - M_*)V]^2\} = \sigma_e^4\,\mathrm{tr}[(M - M_*)^2] + 2\sigma_e^2\sigma_v^2\,\mathrm{tr}[MZZ'] + \sigma_v^4\,\mathrm{tr}[(MZZ')^2].$$
But $M - M_*$ is a symmetric idempotent matrix, and so $\mathrm{tr}[(M - M_*)^2] = (t-1)$. Further,
$$\mathrm{tr}(MZZ') = \mathrm{tr}(ZZ') - \mathrm{tr}[X(X'X)^{-1}X'ZZ'] = \mathrm{tr}(Z'Z) - \mathrm{tr}\Big[(X'X)^{-1}\sum_{i=1}^t n_i^2\bar{x}_{i\cdot}'\bar{x}_{i\cdot}\Big] = \sum_{i=1}^t n_i[1 - n_i\bar{x}_{i\cdot}(X'X)^{-1}\bar{x}_{i\cdot}'] \equiv (t-1)n_*,$$
by equation (26). Also, since the column of $MZZ'$ corresponding to the j-th segment of the i-th county is given by $MZ_i$, i=1,2,..., t and j=1,2,..., $n_i$, it follows that
$$\mathrm{tr}[(MZZ')^2] = \sum_{i=1}^t n_i(Z_i'MZ_i) = \sum_{i=1}^t n_i[n_i - n_i^2\bar{x}_{i\cdot}(X'X)^{-1}\bar{x}_{i\cdot}'] = \sum_{i=1}^t n_i^2[1 - n_i\bar{x}_{i\cdot}(X'X)^{-1}\bar{x}_{i\cdot}'] \equiv (t-1)n_{**},$$
by equation (28). Thus
$$V(\hat u'\hat u - \hat e'\hat e) = 2[\sigma_e^4 + 2n_*\sigma_e^2\sigma_v^2 + n_{**}\sigma_v^4](t-1).$$
Proof of Theorem 5:
From equation (12) it readily follows that
$$\tilde\mu_i^{(\hat\gamma)} - \mu_i = (\tilde\mu_i^{(\gamma)} - \mu_i) + (\hat\gamma_i - \gamma_i)(v_i + \bar e_{i\cdot}) + (\bar{x}_{i(p)} - \gamma_i\bar{x}_{i\cdot})(\tilde\beta - \hat\beta) - (\hat\gamma_i - \gamma_i)\bar{x}_{i\cdot}(\tilde\beta - \beta).$$
Now
$$\hat\gamma_i = \gamma_i + [n_i^{-1}\sigma_e'^2(\hat\sigma_v^2 - \sigma_v^2) - n_i^{-1}\sigma_v'^2(\hat\sigma_e^2 - \sigma_e^2)](\sigma_v'^2 + n_i^{-1}\sigma_e'^2)^{-2},$$
where $\sigma_v'^2$ is between $\hat\sigma_v^2$ and $\sigma_v^2$, and $\sigma_e'^2$ is between $\hat\sigma_e^2$ and $\sigma_e^2$. Thus
$$\hat\gamma_i - \gamma_i = O_p\big(\max\{t^{-1/2}, [\textstyle\sum_{i=1}^t(n_i-1)]^{-1/2}\}\big)$$
and
$$E(\hat\gamma_i - \gamma_i)^2 = n_i^{-2}(\sigma_v^2 + n_i^{-1}\sigma_e^2)^{-4}V(\sigma_e^2\hat\sigma_v^2 - \sigma_v^2\hat\sigma_e^2) + o\big(\max\{t^{-1}, [\textstyle\sum_{i=1}^t(n_i-1)]^{-1}\}\big).$$
Also, by assuming that the $n_i$'s are bounded and that the regularity conditions of Theorem 3 of Fuller and Battese (1973) apply, it follows that
$$(\tilde\beta - \hat\beta) = O_p(t^{-\frac12(1+\epsilon)}), \quad \text{where } \epsilon > 0.$$
Also,
$$E[(\hat\gamma_i - \gamma_i)^2(v_i + \bar e_{i\cdot})^2] = E(\hat\gamma_i - \gamma_i)^2(\sigma_v^2 + n_i^{-1}\sigma_e^2) + o\big(\max\{t^{-1}, [\textstyle\sum_{i=1}^t(n_i-1)]^{-1}\}\big).$$
This establishes the result of equation (32).