PRECISION OF SAMPLING BY DOTS
FOR PROPORTIONS OF LAND USE CLASSES
by
JOOP A. J. FABER
Institute of Statistics
Mimeograph Series No. 773
Raleigh, North Carolina
TABLE OF CONTENTS

1. INTRODUCTION
2. THEORY OF SYSTEMATIC SAMPLING
   2.1 Introduction
   2.2 Systematic sampling of populations of N constants
   2.3 Systematic sampling and stochastic processes
   2.4 Systematic sampling of 2-dimensional stochastic processes
3. THE CORRELOGRAM
   3.1 Introduction
   3.2 Estimators of autocorrelations in a 0,1 valued field
   3.3 Variances of correlation estimators
   3.4 Estimation of trend in correlograms of the distribution of forest on aerial photographs
4. EVALUATION OF THE PRECISION OF SOME SAMPLING SCHEMES
   4.1 Introduction
   4.2 Evaluation of the precision of some sampling schemes for populations of constants
   4.3 Expected variances
5. SUMMARY AND CONCLUSIONS
   5.1 Summary
   5.2 Conclusions
6. LIST OF REFERENCES
7. APPENDIX
1. INTRODUCTION
In a variety of activities one is interested in determining the
distribution of land use area among a particular set of land use
categories classified for a definable geographic area such as a
township, county, state, watershed, etc.
When the objective does not
require a mapping of the specific locations of land uses nor a direct
determination of the area occupied by each land use of interest, but
rather requires an estimate of the proportion of the total area in each
category, the investigator will want to resort to a sampling procedure.
The study reported here deals with the precision of sampling for
land use proportions from aerial photographs.
The specific geographical
area used is part of the Lake Michie watershed, the primary water source
for the city of Durham in North Carolina.
Our investigation of the
problem of estimating land use proportions originates in a project of
the School of Forest Resources of North Carolina State University titled:
"Economic evaluation of changes in land use of a municipal watershed as
a guidance to decision making".
For several series of aerial photographs, mainly of scale 1:20,000, estimates were required of the acreage
under pine, hardwood, cultivated land, pasture, ponds, residential area
and roads.
As complete enumeration was not feasible, a sample design
had to be developed.
Although our work is concerned with surveying aerial photographs
for proportions of land use classes, the results bear on any survey
designed to estimate proportions or areas from planar figures.
For
example, similar problems occur in ecology concerning estimation of
plant coverage.
We are concerned with sampling techniques applicable under any
land use classification.
The land use classification used here involves classes summarily defined as follows:
Forest - points in images of crowns of trees;
Ponds - points in images of any visible impoundment of water;
Roads - points in images of the area between ditches of broad
and readily identifiable roads built for frequent use
by automobiles;
Residential area - points in images of buildings and their
immediate surroundings such as yards, gardens, parking
lots, drive-ins, etc.;
Other - points in images not clearly belonging to the classes
defined before.
Given a set of aerial photographs of a certain scale, the methods
available to determine the proportions of land uses are numerous.
Either complete enumeration or sampling procedures are possible.
We
assume that cost considerations make sampling procedures necessary.
Therefore, we do not consider complete enumeration methods like planimetring, weighing cut out pieces of paper, the use of high speed photoelectric planimeters or reading devices feeding information into
computers (for such a device see the Journal of Forestry, 1970, p. 500).
However, a particular sampling method may not be equally good for
estimating proportions of all land use classes.
For instance, an
arbitrary size and shape of a grid of points may be good for estimating
the acreage for some land use classes but may not suffice for other
classes.
Dot counting and transect methods are suitable and common ways to
estimate proportions (Yates, 1960; Moran, 1969; Barrett and Philbrook,
1970) or, if the total area is known, to estimate the area accounted
for by each land use class.
Counting dots or measuring intercepts
along transects do not differ in principle from the electronic
measurement devices mentioned above (Russell, 1956; Moran, 1968), with
the exception that the intensity of sampling can be high and that
measurements can be repeated easily.
Questions that arise when designing a sample by dot grids, concern
configuration of the grid, number of dots per square inch and size of
template.
Certain standard numbers have evolved in forestry. For
estimation of areas of 1,000 acres or less 64 dots per square inch are
recommended in the case of aerial photographs of scale 1:20,000, while
40 dots per square inch are recommended for photographs of scale
1:15,840 (Avery, 1966).
For estimation of crown closure 144 dots per
square inch for color photographs of scale 1:1,584 have been found to
be within planimeter accuracy (Weber, 1965; Aldrich, 1966).
The effort
involved may well come close to that of complete enumeration.
For
large areas the number of sample points needed to obtain a desired
precision is often determined by random sampling formulae as for instance demonstrated in Avery's Forester's Guide to Aerial Photo
Interpretation (1966).
Since a certain amount of guess work with
respect to the population variance is required, the use of random
sampling formulae as a guide to determine the size of a systematic
sample may not be too critical. But the use of such formulae to
evaluate precision of an actually observed systematic sample has
received considerable criticism in forestry (Finney, 1949; Wittgenstein,
1966; Barrett and Philbrook, 1970).
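The random sampling formulae mentioned above can be illustrated with the usual binomial approximation for a proportion. The sketch below is not taken from Avery (1966); the function name `dots_needed`, the guessed proportion `p_guess` and the normal multiplier `z` are assumptions introduced here for illustration, and `p_guess` stands for the guess work about the population variance.

```python
import math

def dots_needed(p_guess, half_width, z=1.96):
    """Number of randomly placed dots so that the estimated proportion has
    roughly the requested confidence half-width (binomial approximation)."""
    # Variance of a single 0/1 observation is p(1-p); solve z*sqrt(p(1-p)/n) = half_width.
    return math.ceil(z ** 2 * p_guess * (1 - p_guess) / half_width ** 2)

# Hypothetical planning case: a class covering about 30% of the area,
# to be estimated within plus or minus 5 percentage points.
print(dots_needed(0.30, 0.05))
```

Because dots on a systematic grid are correlated, such a figure is only a planning guide and says little about the precision actually achieved by the grid.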
Planimetring, dot gridding and transect methods are well known
practices in forestry.
Osborne (1942) suggests the line transect
method by proposing an approximation of the systematic sampling error
based on assumptions about the serial correlations.
His paper is an
early and notable contribution referred to by many later writers.
Similarly, Matern (1947 and 1960) discusses the use of strip sampling
as well as dot sampling to estimate cover type areas.
In his Forester's
Guide Avery states that dot gridding is the preferred method on aerial
photographs.
If only relative proportions of land use categories are needed
bias will not be significant when observations are taken at lattice
points throughout the photograph provided the land is flat or gently
rolling as in the South and Midwest of the United States (Aldrich, 1955).
Even in more rugged country taking observations within the effective
area or close to the center of the photograph does not introduce bias
if the differences in relief range from 500 to 1,000 feet and if the
survey concerns large tracts (Wilson, 1949; Moessner, 1957).
The
Lake Michie watershed in the Piedmont region of North Carolina to which
our experimental material relates, falls in both categories.
Dot counting on lattices and measuring intercepts along equally
spaced transects are two forms of systematic sampling. These are
respectively 2- and 1-dimensional systematic sampling. The theory of
both forms of systematic sampling is much the same.
The basic tool to
compare the performance of systematic sampling with random sampling as
well as one systematic sampling scheme with another systematic sampling
scheme is the correlogram.
A correlogram is a graphical representation
of correlations between quantities at any distances apart plotted
against these distances.
Instead of the correlogram the more preferable
spectral densities can be considered (Matern, 1960, p. 81), but we will
not do this here.
In case of 2-dimensional systematic sampling complications
arise due to complexity of formulae and the possibility of
aligned sampling (Quenouille, 1949), and the fact that direction has
to be taken into account when computing "serial" correlations.
If
isotropy is assumed, evaluation of 2-dimensional sampling is simpler
since correlations between quantities at any distance do not depend on
direction and the correlogram can be represented as in the 1-dimensional
case. Otherwise, 3-dimensional correlograms have to be constructed and
evaluated, an undertaking which is difficult in practice. In such cases
1-dimensional systematic sampling by strips or transects is to be
preferred.
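To make the notion of a correlogram concrete, the following minimal sketch computes correlations at increasing lags along a single 0/1 coded transect. The function name `correlogram` and the artificial transect are assumptions made here, and the estimator (deviations from the overall mean, averaged over the available pairs) is only one of several possible forms.

```python
import numpy as np

def correlogram(values, max_lag):
    """Correlation between values `lag` positions apart, for lag = 1..max_lag."""
    x = np.asarray(values, dtype=float)
    d = x - x.mean()                      # deviations from the overall mean
    variance = np.mean(d ** 2)
    return np.array([np.mean(d[:-lag] * d[lag:]) / variance
                     for lag in range(1, max_lag + 1)])

# Hypothetical transect: forest (1) and other land (0) in alternating runs of 5 points.
transect = np.tile([1] * 5 + [0] * 5, 6)
for lag, r in enumerate(correlogram(transect, 10), start=1):
    print(lag, round(r, 3))
```

For this strictly periodic transect the correlation is -1 at lag 5 and +1 at lag 10; plotting the printed values against lag gives the correlogram.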
We can now state the objectives 1) of this study as follows:
1. To determine the best sampling designs for all land use classes
   and for each class separately and to determine the best shape
   of lattice in case of sampling by dots.
2. To study correlograms of the distribution of land use classes,
   to explain a correlogram, in particular negative correlations,
   in terms of the nature of a class, and to infer from a correlogram
   the best design to estimate the proportion of the
   corresponding land use class.
3. To compare precision of systematic and random sampling schemes
   and in particular the precision of systematic sampling at
   different intensities.
4. To approximate estimates of systematic sampling error.

1) Originally investigation of sampling by transects had been included
but time did not permit pursuit of the relevant objectives.

To meet these objectives we will consider two basic formulations
of the problem of sampling for land uses.
These formulations reflect
the choice of different population models describing the occurrence of
land uses on aerial photographs.
First, consider an aerial photograph where the population values
are the coded land use categories observed at each point of the photograph.
These population values are constants.
Although the aerial
photograph is finite, the number of points is infinite and therefore
the population of indicator values or codes is infinite.
For purposes
of sampling for proportions of land uses on this photograph it may be
practical to divide the area into a large number of cells of equal size
and shape.
The centers of these cells form a lattice over the area.
The population values are now the coded land use categories observed
at each center of a cell.
The population is finite and if the centers
are indexed, the indices function as addresses. A list of all indices
constitutes a frame. Finite sampling theory can be applied where
samples are drawn from the frame using random numbers.
Depending on
the size and the shape of the cells results will vary slightly.
A
different placing of the grid may result in different population values.
This sort of variation is not a consequence of the process which causes
land uses to occur in a certain pattern, but is effected by approximating
the photograph, a continuum, by a lattice which is a discrete representation.
It is conceivable that beyond a certain density of the
lattice the results will be the same for any given photograph, no
matter how the grid is shaped or how it is placed.
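The first formulation can be sketched in a few lines: a lattice of cell centres stands in for the photograph, the list of cell indices is the frame, and a sample is drawn either at random from the frame or as a dot grid with a random start. Everything in the sketch is hypothetical: the 120 x 120 checkered pattern, the names `photo`, `random_sample_estimate` and `dot_grid_estimate`, and the choice of 144 dots.

```python
import numpy as np

# Hypothetical photograph: 120 x 120 cell centres, coded 1 (forest) or 0 (other).
rows, cols = np.indices((120, 120))
photo = (((rows // 15) + (cols // 20)) % 2 == 0).astype(int)

def random_sample_estimate(lattice, n, rng):
    """Proportion estimated from n addresses drawn at random from the frame."""
    frame = np.arange(lattice.size)                    # one index per cell centre
    chosen = rng.choice(frame, size=n, replace=False)  # random numbers pick addresses
    return lattice.ravel()[chosen].mean()

def dot_grid_estimate(lattice, spacing, rng):
    """Proportion estimated from a square dot grid with a random start."""
    r0, c0 = rng.integers(0, spacing, size=2)
    return lattice[r0::spacing, c0::spacing].mean()

rng = np.random.default_rng(0)
print("population proportion:", photo.mean())
print("random sample of 144 :", random_sample_estimate(photo, 144, rng))
print("dot grid, spacing 10 :", dot_grid_estimate(photo, 10, rng))
```

With spacing 1 the dot grid reads every cell centre and reproduces the population proportion exactly, which is the limiting case of a lattice so dense that placement no longer matters.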
The second basic formulation considers an aerial photograph where
the coded land use categories observed at each point of the photograph
are now assumed to be realizations of indicator random variables indexed by coordinates of a coordinate system arbitrarily placed on the
surface of the photograph. The random variables are generally correlated.
In other words, consider the land uses to be random variables
attached to each point.
The random variable at a point is the absence
or presence of a certain land use class. For any particular photograph
and a particular point the observation consists of naming the land use
at that point by inspection. If symmetry and consistency conditions
hold with respect to the joint distribution of all possible sets of
random variables, then any planar configuration of land use is a
realization of a stochastic process.
We will suppose the process to
be such that the mean and covariance functions are constant with respect
to location of random variables, i.e. the process is assumed to be wide
sense stationary over space. In addition we suppose that the covariance
function depends only on the distance but not on the direction between
random variables, i.e. the process is also assumed to be isotropic.
As for the first formulation, a region of the plane can be approximated
by a lattice.
Each time we draw a lattice we obtain a discrete image
of a realization of the entire process if the process is indeed
stationary.
The sample space may represent lattices from all other
photographs, but if stationarity holds in time as well as in space
then it could include those from all other photographs taken in the
next season.
The sampling of the aerial photograph or lattice corresponds
to subsampling a realization of the stochastic process.
We
believe these concepts to be in accord with Matern (1947, p. 122) and
Cochran (see Matern, 1960, p. 69).
Although sampling from a finite population will be seen to be the
only source of variation that enters into the estimates, it has been
found convenient to keep in mind a probabilistic model for that
population.
This model then determines the characteristics, notably
the autocorrelation patterns, that determine how well various sampling
designs perform.
The model entertains the notion that a population of
values is created by assigning land use names to a collection of points
according to the probabilities of each land use to occur at a point
and with the conditional probabilities of land uses to occur at a point
given the land uses already assigned to other points.
The need for
such a model is that it permits one to generalize his experiences with
one aerial photograph to others for which the basic probabilistic
structure is similar.
When the size and curvatures of the outlines of
fields are similar from one photograph to the next as judged by eye
inspection, then it should be expected that the probabilistic structure
is also similar.
Under the rather sweeping supposition that the
process is wide sense stationary and isotropic, the correlogram
or its spectral representation becomes the key feature of the process.
2. THEORY OF SYSTEMATIC SAMPLING
2.1 Introduction
The main part of this chapter consists of a review of the
literature concerning the precision of different sampling designs.
The variance formulae given in the following sections are based on the
population models discussed in the previous chapter.
Section 2 concerns
Madow and Madow's variance formulae for systematic sampling and cluster
sampling of 1-dimensional populations consisting of a finite number of
constants.
Their results are extended to the case of systematic
sampling of 2-dimensional populations.
In Sections 3 and 4 variance
formulae for simple random, stratified random and systematic sampling
of 1- and 2-dimensional populations are discussed but now from the
point of view of stochastic processes.
Extensions, along the lines of
Cochran's and Quenouille's work, are made for random and systematic
cluster sampling as a pendant to Madow's results in the case of finite
populations of N constants.
Some of the 2-dimensional variance formulae
for the different designs are used in a later chapter of the thesis.
2.2 Systematic sampling of populations of N constants
The first major contribution to systematic sampling theory appears
to have been made by Madow and Madow (1944). They are concerned with
a finite population of N elements and define a sampling design as the
combination of a method of classifying N elements into k overlapping or
non-overlapping classes and a procedure of selecting one of these
classes in an objective manner; each class has a designated probability
of being selected. A systematic sampling design, then, is a
classification of N elements into k classes of n elements with a
selection procedure such that each class has a probability of 1/k of
being selected. To avoid minor problems N, n and k are always chosen
such that N = nk where n and k are positive integers.
Consider then a finite population of N elements $X_1, X_2, \ldots, X_N$
and classes $S_i$ consisting of n elements $X_i, X_{i+k}, \ldots, X_{i+(n-1)k}$ where
$i = 1, 2, \ldots, k$. An element is, technically speaking, an address
where the values $X_i$ are located. The sample (class) mean is denoted
by $\bar{x}$, which equals $\bar{x}_i$ if a particular class $S_i$ has been selected. The
arithmetic mean of the population values is denoted by $\bar{X}$. For these
conditions Madow and Madow show the variance of the mean of a systematic
sample to be
$$V_{sy}(\bar{x}) = \frac{\sigma^2}{n}\left(1 + (n-1)\,\bar{r}_k\right), \tag{2.2.1}$$
where the population variance $\sigma^2$ and the intraclass correlation $\bar{r}_k$ are
given by
$$\sigma^2 = \sum_{i=1}^{N} (X_i - \bar{X})^2 / N, \tag{2.2.2}$$
and
$$\bar{r}_k = \frac{2}{n(n-1)} \sum_{\delta=1}^{n-1} (n-\delta)\, r_{k\delta}. \tag{2.2.3}$$
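The serial correlation $r_{k\delta}$ is given by
$$r_{k\delta} = \frac{\sum_{i=1}^{N-k\delta} (X_i - \bar{X})(X_{i+k\delta} - \bar{X})}{(N-k\delta)\,\sigma^2} \tag{2.2.4}$$
or
$$r_{k\delta} = \frac{\sum_{|i-j| = k\delta} (X_i - \bar{X})(X_j - \bar{X})}{2(N-k\delta)\,\sigma^2}. \tag{2.2.5}$$
The variance of the mean of a systematic sample can be rewritten
as
$$V_{sy}(\bar{x}) = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{nN} + \frac{1}{nN} \sum_{\delta=1}^{n-1} \sum_{|i-j| = \delta k} (X_i - \bar{X})(X_j - \bar{X}). \tag{2.2.6}$$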
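The variance formula of Madow and Madow can be checked numerically on a small population. The sketch below assumes that the serial correlation $r_{k\delta}$ is the "whole mean" form, i.e. lag products of deviations from the overall mean, averaged over the $N-k\delta$ available pairs and divided by $\sigma^2$; the function names and the random 0/1 population are hypothetical.

```python
import numpy as np

def variance_by_enumeration(x, k):
    """Variance of the sample mean over the k possible systematic samples."""
    x = np.asarray(x, dtype=float)
    n = x.size // k
    class_means = x.reshape(n, k).mean(axis=0)   # column i holds X_i, X_{i+k}, ...
    return np.mean((class_means - x.mean()) ** 2)

def serial_correlation(x, lag):
    """Whole-mean serial correlation at the given lag (assumed form)."""
    d = np.asarray(x, dtype=float) - np.mean(x)
    pairs = d.size - lag
    return np.sum(d[:pairs] * d[lag:]) / (pairs * np.mean(d ** 2))

def variance_by_formula(x, k):
    """sigma^2 / n * (1 + (n - 1) * r_bar_k), r_bar_k a weighted mean of r_{k delta}."""
    x = np.asarray(x, dtype=float)
    n = x.size // k
    sigma2 = np.mean((x - x.mean()) ** 2)
    r_bar = sum(2 * (n - delta) / (n * (n - 1)) * serial_correlation(x, k * delta)
                for delta in range(1, n))
    return sigma2 / n * (1 + (n - 1) * r_bar)

population = np.random.default_rng(1).integers(0, 2, size=60)  # hypothetical 0/1 codes
print(variance_by_enumeration(population, 5), variance_by_formula(population, 5))
```

Under the stated assumption the two numbers agree to rounding error for any k that divides N with n of at least 2.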
The formula given here for the intraclass correlation coefficient
$\bar{r}_k$ corresponds with that given by Kendall and Stuart (1967, Vol. 2,
p. 303), where the symbol n in our notation is their k. Madow and Madow
call the $r_{k\delta}$ serial correlation coefficients. However, the expression
of $r_{k\delta}$ given by Madow and Madow is elsewhere considered an approximated
form of the serial correlation coefficient of a finite series of
observations (Kendall and Stuart, 1968, Vol. 3, p. 362). The exact
form, denoted here by $r'_{k\delta}$, is defined as (Kendall and Stuart, 1968,
Vol. 3, p. 361; Wittgenstein, 1966, p. 71; Wold, 1956, p. 12)
$$r'_{k\delta} = \frac{\sum_{i=1}^{N-k\delta} (X_i - \bar{X}_1)(X_{i+k\delta} - \bar{X}_2)}{\left[\sum_{i=1}^{N-k\delta} (X_i - \bar{X}_1)^2 \, \sum_{i=1}^{N-k\delta} (X_{i+k\delta} - \bar{X}_2)^2\right]^{1/2}},$$
where $\bar{X}_1 = \sum_{i=1}^{N-k\delta} X_i/(N-k\delta)$ and $\bar{X}_2 = \sum_{i=1}^{N-k\delta} X_{i+k\delta}/(N-k\delta)$. It is sometimes
said that the difference between the two forms is due to end effects
where reference is made to, what we may call, "whole mean" and "partial
mean" forms of serial correlations.
We quote from Kendall and Stuart
(1968, Vol. 3, p. 362):
"... for series of moderate length the difference ... is
negligible. We must be careful not to use [the approximation]
for short series where exactitude in estimation is necessary.
In particular values of r [our $r_{k\delta}$] greater than unity
may arise."
Thus only for large series of data or small lags can we use either
form.
If the population becomes very large to infinite, serial
correlations and their approximations will be the same.
Upon computing all possible serial correlations from any finite
data set, some correlations have to be negative if end effects are
neglected. This can be shown as follows. Let $x_1, x_2, \ldots, x_N$ be the
values of the finite data set after subtracting the mean so that
$\sum_{i=1}^{N} x_i = 0$. Then:
$$\Big(\sum_{i=1}^{N} x_i\Big)^2 = \sum_{i=1}^{N} x_i^2 + 2 \sum_{i=1}^{N} \sum_{\substack{j=1 \\ i<j}}^{N} x_i x_j = \sum_{i=1}^{N} x_i^2 + \sum_{\delta=1}^{N-1} \sum_{i=1}^{N-\delta} 2 x_{i+\delta} x_i = 0.$$
Unless all $x_i$ are zero, the sum $\sum_{i=1}^{N} x_i^2$ is greater than zero, which
implies
$$\sum_{\delta=1}^{N-1} \sum_{i=1}^{N-\delta} 2 x_{i+\delta} x_i < 0.$$
Let $\sum_{i=1}^{N-\delta} 2 x_{i+\delta} x_i = C_\delta$, then $\sum_{\delta=1}^{N-1} C_\delta < 0$, which implies some $C_\delta < 0$.
Since $C_\delta$ is the factor in the numerator of formula (2.2.5), and all
other factors in that formula are positive, it follows that some $r_\delta$
have to be negative. The same can be shown more generally by using
integrals instead of summations. One could have special information
on the overall mean and upon subtracting this, $\sum x_i \neq 0$. Also,
correlations computed by including end effects can be all positive;
consider e.g. points on a straight line.
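The argument can be illustrated with points on a straight line. In the sketch below `lag_cross_products` returns the quantities $C_\delta$ computed from deviations about the overall mean, and `partial_mean_correlation` is an ordinary product-moment correlation between the leading and trailing parts of the series, used here as a stand-in for the form that includes end effects. Both function names are hypothetical.

```python
import numpy as np

def lag_cross_products(values):
    """C_delta = sum_i 2 * x_{i+delta} * x_i for delta = 1..N-1, with x centred."""
    x = np.asarray(values, dtype=float)
    x = x - x.mean()
    N = x.size
    return np.array([2 * np.sum(x[:N - lag] * x[lag:]) for lag in range(1, N)])

def partial_mean_correlation(values, lag):
    """Correlation of (x_1..x_{N-lag}) with (x_{1+lag}..x_N), each about its own mean."""
    x = np.asarray(values, dtype=float)
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

line = np.arange(10.0)                       # points on a straight line
C = lag_cross_products(line)
print("sum of C_delta:", C.sum())            # equals minus the sum of squared deviations
print("negative C_delta at lags:", np.flatnonzero(C < 0) + 1)
print("partial-mean correlations:", [round(partial_mean_correlation(line, lag), 6)
                                     for lag in range(1, 8)])
```

The whole-mean cross products must sum to a negative number, so some are negative, while the partial-mean correlations of the same series are all equal to one.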
If we want to apply Madow and Madow's formula to a 2-dimensional
array, the term serial becomes inappropriate other than indicating a
sequence of correlations at increasing distances. In the 1-dimensional
case $r_{k\delta}$ can be computed from pairs of items taken twice but the second
time in reverse order (see equation (2.2.5)) and the result will not
differ from that based on computations with pairs taken once. But it
is this procedure alone, of taking pairs twice, that makes it possible
to compute quantities like $r_{k\delta}$ for 2-dimensional finite populations
without specifying a particular path through the array. For instance,
if pairs are taken once, correlations computed for some distances may
differ from those computed after a 90° rotation of the array.
For rectangular arrays of size $N_1 \times N_2$ the variance of the mean
of a systematic sample of size $n_1 \times n_2$ equals
$$V_{sy}(\bar{x}) = \frac{\sigma^2}{n_1 n_2}\left(1 + (n_1 n_2 - 1)\,\bar{r}_{k_1,k_2}\right), \tag{2.2.8}$$
where $n_1 k_1 = N_1$ and $n_2 k_2 = N_2$ such that $n_1$, $n_2$, $k_1$ and $k_2$ are integers.
The intraclass correlation coefficient $\bar{r}_{k_1,k_2}$ is a weighted average of
correlation coefficients $r_{k_1\delta_1,k_2\delta_2}$ between pairs of units that are in
the same systematic sample. The intraclass correlation coefficient can
be defined as
$$\bar{r}_{k_1,k_2} = \sum_{\delta_1=0}^{n_1-1} \sum_{\substack{\delta_2=0 \\ (\delta_1,\delta_2) \neq (0,0)}}^{n_2-1} \frac{2\Delta\,(n_1-\delta_1)(n_2-\delta_2)}{n_1 n_2 (n_1 n_2 - 1)}\; r_{k_1\delta_1,k_2\delta_2},$$
where
$$r_{k_1\delta_1,k_2\delta_2} = \frac{\displaystyle \sum_{\substack{|i-i'| = k_1\delta_1 \\ |j-j'| = k_2\delta_2}} (X_{ij} - \bar{X})(X_{i'j'} - \bar{X}) \Big/ 2\Delta\, k_1 k_2 (n_1-\delta_1)(n_2-\delta_2)}{\displaystyle \sum_{i=1}^{N_1} \sum_{j=1}^{N_2} (X_{ij} - \bar{X})^2 \Big/ N_1 N_2}$$
with $\Delta = 1$ if $\delta_1 = 0$ and $\delta_2 = 1, 2, \ldots, N_2$, or $\delta_2 = 0$ and
$\delta_1 = 1, 2, \ldots, N_1$; $\Delta = 2$ otherwise.
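For a 2-dimensional array the variance of a lattice sample can be obtained by enumerating all $k_1 k_2$ possible starts. The sketch below does this and then backs out the intraclass correlation implied by the relation $V = \sigma^2 (1 + (n_1 n_2 - 1)\bar{r}) / (n_1 n_2)$, which is assumed here to be the form of (2.2.8). The function names and the checkerboard field are hypothetical.

```python
import numpy as np

def lattice_sample_variance(field, k1, k2):
    """Variance of the sample mean over all k1 * k2 possible lattice starts."""
    field = np.asarray(field, dtype=float)
    means = np.array([field[i::k1, j::k2].mean()
                      for i in range(k1) for j in range(k2)])
    return np.mean((means - field.mean()) ** 2)

def implied_intraclass_correlation(field, k1, k2):
    """Solve V = sigma^2 / n * (1 + (n - 1) * r_bar) for r_bar, n = n1 * n2 (assumed form)."""
    field = np.asarray(field, dtype=float)
    n = (field.shape[0] // k1) * (field.shape[1] // k2)
    sigma2 = field.var()                       # population variance, divisor N1 * N2
    v = lattice_sample_variance(field, k1, k2)
    return (n * v / sigma2 - 1) / (n - 1)

# Hypothetical 8 x 8 checkerboard of single cells: the worst case for a 2 x 2 interval,
# because every dot of a sample falls on the same colour.
board = np.indices((8, 8)).sum(axis=0) % 2
print(lattice_sample_variance(board, 2, 2), implied_intraclass_correlation(board, 2, 2))
```

The array dimensions must be multiples of the sampling intervals, as in the text. An implied intraclass correlation near 1 signals that the lattice interval resonates with a periodicity of the field.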
Madow and Madow indicate that the estimation of $\bar{r}_k$ from a sample
by use of formula (2.2.3) is biased and even inconsistent. One can
overcome this problem by assuming a particular model for the correlogram
so that the correlations at all possible interpoint distances will be
known. But, quoting Cochran (1946),
"To assume that the $\rho$'s [r's in our notation] are strictly
monotone for an actual finite population of only moderate
size does not seem realistic. While the correlogram may
exhibit a definite downward trend, yet individual
fluctuations about the trend prevent the correlogram from
being monotone."
Only for large populations may we expect a smooth trend in r from the
smallest lag to intermediate lags. Irregularities are a property of
the phenomena observed over a finite range while a smooth correlogram
is strictly a mathematical object. Irregularities of the correlogram
at the large interpoint distances are more pronounced and cause
fluctuations in $\bar{r}_k$ that prevent the systematic sampling variance from
decreasing uniformly with increasing sample size. This phenomenon has
been termed quasi-periodicity (Cochran, 1963, p. 219) and an example
has been given by Madow (1946).
The reason for greater irregularities
in the correlogram at larger distances is that in finite populations
of N elements correlations for pairs of items far apart are obtained
by averaging over relatively few pairs.
The erratic behavior of the correlogram at larger lags and thus
the possibility of irregular decrease in systematic sampling variance
with increasing sample size creates a difficulty when attempting to
determine the efficiency of systematic sampling with respect to random
sampling.
If true trends and periodicity are underlying the population
elements, then Madow and Madow have given conditions under which
systematic sampling will be superior.
In case of a linear trend
stratified random sampling is more efficient than systematic sampling,
but systematic sampling is still more efficient than simple random
sampling.
In case of periodicity systematic sampling is more efficient
than stratified random sampling when the systematic sampling interval
is an odd multiple of the half period.
Madow and Madow consider also
the case that trend and periodicity occur simultaneously.
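These comparisons can be reproduced exactly for small populations of constants. The sketch below computes the variance of the mean for simple random sampling without replacement, for stratified random sampling with one element from each stratum of k contiguous elements, and for systematic sampling with interval k. The function name and the two test populations are hypothetical; the formulas are the standard finite-population ones rather than expressions quoted from Madow and Madow.

```python
import numpy as np

def variances_for_population(x, k):
    """Return (random, stratified, systematic) variances of the sample mean."""
    x = np.asarray(x, dtype=float)
    N = x.size
    n = N // k
    strata = x.reshape(n, k)                   # row h = stratum of k contiguous elements
    v_ran = x.var(ddof=1) / n * (1 - n / N)    # simple random sampling without replacement
    v_st = strata.var(axis=1).sum() / n ** 2   # one random element per stratum
    v_sy = np.mean((strata.mean(axis=0) - x.mean()) ** 2)  # column = systematic sample
    return v_ran, v_st, v_sy

linear = np.arange(1, 41)                            # linear trend, N = 40
periodic = np.sin(2 * np.pi * np.arange(40) / 10)    # period 10
print("linear trend, k = 5   :", variances_for_population(linear, 5))
print("periodic, k = 5 (half):", variances_for_population(periodic, 5))
print("periodic, k = 10 (full):", variances_for_population(periodic, 10))
```

For the linear trend the ordering is stratified, systematic, simple random from best to worst. For the periodic population an interval of half a period makes the systematic variance vanish, whereas an interval of a full period makes it the largest of the three.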
From the analysis of variance (see Appendix) it is clear that an
unbiased estimator of the systematic sampling variance does not exist,
since a systematic sample with one random start can be considered as a
simple random sample of size 1. Several variance estimators have been
suggested based on models appropriate for the population at hand. Some
of these make use of a correlogram. Knowing the correlations will
enable us to compare sampling schemes and to estimate the sampling
error.
However, in case of finite populations of N constants very
detailed knowledge about the correlations $r_{k\delta}$ is needed. It is
unreasonable to speak in this context of smoothly decaying non-negative
correlograms.
Not only do deviations from a generally downward trend
occur as spoken of by Cochran, but also some correlations have to be
negative as shown.
Smoothed correlograms may cause severe bias, e.g.
in the case where a smooth positive correlogram is accepted one should
never expect systematic sampling to be superior to simple random
sampling for any sample size.
In the case where systematic sampling
is in fact better we should expect negative correlations in the
correlogram.
Systematic sampling can be used in combination with other schemes.
For instance Madow (1949) discusses possible combinations with cluster
sampling such as (a) a scheme that selects systematically m out of M
clusters and completely enumerates each cluster sampled and (b) a scheme
that selects systematically m out of M clusters and subsamples
systematically each of the m clusters selected.
Discussing only the first scheme for a 1-dimensional population,
let there be M clusters in the population, of equal size with N elements.
N = nk and M = cm where n, k, c and m are all integers. Then, for the
first scheme, the systematic cluster sampling variance of the mean is
$$V_{clsy}(\bar{x}) = \frac{\sigma_b^2}{m}\left(1 + (m-1)\,\bar{r}^*_c\right), \tag{2.2.10}$$
where the between cluster variance $\sigma_b^2$ is defined as
$$\sigma_b^2 = \frac{\sum_{i=1}^{M} (\bar{X}_i - \bar{X})^2}{M} \quad \text{with } \bar{X}_i \text{ being the mean of the } i\text{th cluster.} \tag{2.2.11}$$
The between cluster variance can also be written as
$$\sigma_b^2 = \frac{\sigma^2}{N}\left(1 + (N-1)\,\bar{r}\right). \tag{2.2.12}$$
The $\bar{r}$ and $\bar{r}^*_c$ are both intraclass correlation coefficients; the first
is the result of the use of clusters and the second is due to the use
of a systematic sampling scheme for selecting clusters.
If the clusters are small and compact and the sampling interval c
is not too small Madow's formula may be written in terms of correlations
between single elements. The resulting variance is then an approximation.
This can be shown as follows. From (2.2.6) and (2.2.11) it can be seen
that
$$V_{clsy}(\bar{x}) = \frac{\sigma^2}{mN}\left[1 + (N-1)\,\bar{r}\right] + \frac{2}{mM} \sum_{\delta=1}^{m-1} \sum_{i=1}^{c(m-\delta)} (\bar{X}_i - \bar{X})(\bar{X}_{i+c\delta} - \bar{X}).$$
The summation over i in the last term on the right can be written as
$$\sum_{i=1}^{c(m-\delta)} (\bar{X}_i - \bar{X})(\bar{X}_{i+c\delta} - \bar{X}) = \frac{1}{N^2} \sum_{i=1}^{c(m-\delta)} \sum_{j=1}^{N} \sum_{u=1}^{N} (X_{ij} - \bar{X})(X_{i+c\delta,u} - \bar{X})$$
and approximated by $c(m-\delta)\, r_{c\delta N}\, \sigma^2$, so that we obtain
$$V_{clsy} \approx \frac{\sigma^2}{mN}\left[1 + (N-1)\,\bar{r} + N(m-1) \sum_{\delta=1}^{m-1} \frac{2(m-\delta)}{m(m-1)}\, r_{c\delta N}\right],$$
or
$$V_{clsy} \approx \frac{\sigma^2}{mN}\left[1 + (N-1)\,\bar{r} + N(m-1)\,\bar{r}_{cN}\right]. \tag{2.2.14}$$
V
If the correlogram is known for items between pairs of units, then all
quantities can be determined and sampling schemes compared since
$$\bar{r} = \sum_{\delta=1}^{N-1} \frac{2(N-\delta)}{N(N-1)}\, r_\delta,$$
where
$$r_\delta = \frac{\displaystyle \sum_{j=1}^{M} \sum_{i=1}^{N-\delta} (X_{i+(j-1)N+\delta} - \bar{X})(X_{i+(j-1)N} - \bar{X}) \Big/ M(N-\delta)}{\displaystyle \sum_{i=1}^{MN} (X_i - \bar{X})^2 \Big/ MN}.$$
However, such detailed knowledge will not be available in actual
practice. If anything of the correlogram is known, it may be a general
trend and in that case the results of the next section, if applicable,
will be more helpful.
2.3 Systematic sampling and stochastic processes
So far the population has been assumed to consist of a finite
number of elements numbered from 1 to N. Although Madow and Madow considered underlying trends or periodicities, the population values were
constants and, as indicated in the Appendix, the between group variance
may not decrease regularly with increasing size of group.
Cochran (1946)
suggested a model and derived variance formulae for populations with
elements drawn from a larger population in which the elements are autocorrelated.
We continue an earlier quote from Cochran (1946):
"It is more reasonable to regard the finite population as
being itself a sample from an infinite population in which
the $\rho$'s are monotone."
and
"Thus, comparisons between the systematic and stratified
random samples will be made not for a single finite
population, but for the average of finite populations drawn
from an infinite population with monotone decreasing $\rho$.
Results for an individual finite population will differ
from the average results because the r's which appear in
the population fluctuate about their expectation $\rho$. As the
finite population becomes larger, its results will tend to
coincide with the average results."
If a lattice is placed on an aerial photograph (or on several
photographs) drawn from an abstract set of infinitely many possible
photographs and if only the occurrence of land uses at the points of
the lattice on the photograph are considered, i.e. the photograph is
thought to be represented by a lattice, a 2-dimensional variant of
Cochran's model applies directly. Each lattice is a realization of
the process responsible for the occurrence of land uses and corresponds
to a finite population that is sampled in its turn. At the same time a
given lattice can be considered as drawn from the set of all possible
lattices on a photograph. A lattice is then also a realization of the
sampling process employed. If it is reasonable to assume that beyond a
certain point no more information is obtained for any denser lattice,
this source of variation is negligible for sufficiently dense lattices.
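Cochran derived expected variances of sampling schemes based on
the following model. The elements $X_i$, $i = 1, 2, \ldots, nk = N$, constitute
a finite population of random variables drawn from an infinite
population in which $E X_i = \mu$, $E(X_i - \mu)^2 = \sigma^2$, $E(X_i - \mu)(X_{i+u} - \mu) = \rho_u \sigma^2$. Let the
finite population be divided into n strata and k classes as in the
Appendix and let the total sample size be n. Then for simple random
sampling the average variance of the mean about the mean of the finite
population (averaged over all finite populations) is
$$\sigma_r^2 = E\,V_{ran}(\bar{x}) = \frac{\sigma^2}{n}\Big(1 - \frac{1}{k}\Big)\Big(1 - \frac{2}{kn(kn-1)} \sum_{u=1}^{kn-1} (kn-u)\,\rho_u\Big) \tag{2.3.1}$$
and for stratified random sampling
$$\sigma_{st}^2 = E\,V_{st}(\bar{x}) = \frac{\sigma^2}{n}\Big(1 - \frac{1}{k}\Big)\Big(1 - \frac{2}{k(k-1)} \sum_{u=1}^{k-1} (k-u)\,\rho_u\Big) \tag{2.3.2}$$
and for systematic sampling
$$\sigma_{sy}^2 = E\,V_{sy}(\bar{x}) = \frac{\sigma^2}{n}\Big(1 - \frac{1}{k}\Big)\Big(1 - \frac{2}{kn(k-1)} \sum_{u=1}^{kn-1} (kn-u)\,\rho_u + \frac{2k}{n(k-1)} \sum_{u=1}^{n-1} (n-u)\,\rho_{ku}\Big)$$
$$= \frac{\sigma^2}{n}\Big(1 - \frac{1}{k}\Big)\Big(1 - \frac{kn-1}{k-1}\,\bar{\rho} + \frac{k(n-1)}{k-1}\,\bar{\rho}_k\Big). \tag{2.3.3}$$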
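The three expected variances can be evaluated for any assumed correlogram. The sketch below codes them in the form in which Cochran's results are usually stated and, as a check on the systematic sampling expression, also computes that expectation directly from the covariance matrix of the model. The function names and the exponential correlogram are assumptions made here for illustration.

```python
import numpy as np

def cochran_expected_variances(rho, n, k, sigma2=1.0):
    """Expected variances (random, stratified, systematic) for N = n * k elements."""
    N = n * k
    scale = sigma2 / n * (1 - 1 / k)
    u_all, u_str, u_sys = np.arange(1, N), np.arange(1, k), np.arange(1, n)
    total = np.sum((N - u_all) * rho(u_all))        # all pairs in the population
    within = np.sum((k - u_str) * rho(u_str))       # pairs inside one stratum
    spaced = np.sum((n - u_sys) * rho(k * u_sys))   # pairs inside one systematic sample
    v_ran = scale * (1 - 2 / (N * (N - 1)) * total)
    v_st = scale * (1 - 2 / (k * (k - 1)) * within)
    v_sy = scale * (1 - 2 / (N * (k - 1)) * total + 2 * k / (n * (k - 1)) * spaced)
    return v_ran, v_st, v_sy

def exact_systematic_variance(rho, n, k, sigma2=1.0):
    """E(sample mean - population mean)^2, averaged over the k random starts."""
    N = n * k
    idx = np.arange(N)
    cov = sigma2 * rho(np.abs(idx[:, None] - idx[None, :]))   # rho(0) must equal 1
    total = 0.0
    for start in range(k):
        w = np.full(N, -1.0 / N)       # minus the population mean ...
        w[start::k] += 1.0 / n         # ... plus the systematic sample mean
        total += w @ cov @ w
    return total / k

rho = lambda u: np.exp(-0.3 * u)       # hypothetical convex, decreasing correlogram
print(cochran_expected_variances(rho, n=4, k=5))
print(exact_systematic_variance(rho, n=4, k=5))
```

For a positive, decreasing and convex correlogram such as this one the systematic value is the smallest and the simple random value the largest, in line with the conditions discussed in this section.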
Quenouille (1949) points out that Cochran's results are theoretical
minima and that addition of the term $\frac{1}{n}\big(1 - \frac{1}{k}\big)\frac{1}{kn} \sum_{i=1}^{kn} \sigma_i^2$ is required to
all three variances if (1) each $X_i$ is a sample from a population with
mean $\mu_i$ and variance $\sigma_i^2$, (2) $\mu_i$ is distributed about mean $\mu$ with
variance $\sigma^2$, (3) $E(\mu_i - \mu)(\mu_j - \mu) = \rho_{ij}\,\sigma^2$, and (4) $\rho_u = \frac{1}{kn-u} \sum_{i=1}^{kn-u} \rho_{i,i+u}$.
Equations (2.3.1), (2.3.2) and (2.3.3) apply without any restrictions
on the $\rho_u$'s.
Quenouille provides also approximated formulae for sampling continuous processes which we have slightly modified as follows
$$\sigma_r^2 = \frac{\sigma^2}{n}\Big(1 - \int_0^l f_l(u)\,\rho(u)\,du\Big), \text{ which is equivalent to } \frac{\sigma^2}{n}(1 - \bar{\rho}),$$
$$\sigma_{st}^2 = \frac{\sigma^2}{n}\Big(1 - \int_0^d f_d(u)\,\rho(u)\,du\Big) \sim \frac{\sigma^2}{n}(1 - \bar{\rho}'),$$
$$\sigma_{sy}^2 = \frac{\sigma^2}{n}\Big(1 - \frac{l}{d} \int_0^l f_l(u)\,\rho(u)\,du + (n-1) \sum_{u=1}^{n-1} \frac{2(n-u)}{n(n-1)}\,\rho_{ud}\Big) \sim \frac{\sigma^2}{n}\Big(1 - \frac{nk-1}{k-1}\,\bar{\rho} + \frac{k(n-1)}{k-1}\,\bar{\rho}_d\Big),$$
where l is the length of the segment over which the continuous process
is observed, d refers to length of the strata, and f(u) is the distribution of the distance between two random points on a segment of a
certain length.
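A numerical sketch of the continuous case follows. It assumes that f(u) is the triangular density $2(l-u)/l^2$ of the distance between two independent, uniformly placed points on a segment of length l, and that the systematic expression is the limiting form of the discrete result for a large number of elements per stratum, with $l/d = n$. The function names and the exponential correlogram are hypothetical.

```python
import numpy as np

def mean_correlation(rho, length, steps=20000):
    """Midpoint-rule integral of f(u) * rho(u) on (0, length), f(u) = 2(length-u)/length^2."""
    du = length / steps
    u = (np.arange(steps) + 0.5) * du
    return np.sum(2 * (length - u) / length ** 2 * rho(u)) * du

def continuous_variances(rho, length, n, sigma2=1.0):
    """Expected variances (random, stratified, systematic) for n points on a segment."""
    d = length / n                                   # stratum length = sampling interval
    rho_bar_l = mean_correlation(rho, length)        # two random points on the segment
    rho_bar_within = mean_correlation(rho, d)        # two random points in one stratum
    lags = np.arange(1, n)
    rho_bar_d = np.sum(2 * (n - lags) / (n * (n - 1)) * rho(lags * d))
    v_ran = sigma2 / n * (1 - rho_bar_l)
    v_st = sigma2 / n * (1 - rho_bar_within)
    v_sy = sigma2 / n * (1 - (length / d) * rho_bar_l + (n - 1) * rho_bar_d)
    return v_ran, v_st, v_sy

rho = lambda u: np.exp(-u)                           # hypothetical correlogram
print(continuous_variances(rho, length=10.0, n=5))
```

For the exponential correlogram the integral has the closed form $2/(\lambda l) - 2(1 - e^{-\lambda l})/(\lambda l)^2$, which gives a simple check on the numerical integration.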
Cochran shows that if the $\rho_u$'s are monotone decreasing and positive
the variance within strata is a monotone increasing function of the
stratum size so that $\sigma_{st}^2 < \sigma_r^2$, but that nothing can be said about the
superiority of systematic sampling with respect to simple random or
stratified random sampling without additional assumptions. If in
addition the correlogram is convex systematic sampling will be superior
to stratified random sampling. Cochran gives an example of a
correlogram to demonstrate the possibility of irregular behavior of
the systematic sampling variance with increasing sample size. In case
of strictly monotone decreasing $\rho_u$'s the systematic sampling variance
decreases steadily with increasing sample size.
In order to avoid the problem of estimating the systematic sampling
error it is sometimes suggested to use multiple random starts or in
other words to obtain an estimate of the error by repeated systematic
sampling.
Gautschi (1957) shows by using Cochran's model that if the correlogram is convex, systematic sampling with multiple random starts is inferior to systematic sampling with one random start with the same sampling effort.
A logical extreme of systematic sampling is centered systematic sampling, which is completely devoid of any randomness but still objective. Madow (1953) shows that under the same conditions as in the preceding paragraph centered systematic sampling is superior to random start systematic sampling.
It is now not difficult to find the variance formulae for random and systematic sampling of clusters of contiguous elements. Since systematic sampling is a form of cluster sampling, the expected cluster sampling variance of the mean of a random sample of size one (i.e., one compact cluster of contiguous elements is randomly selected) can be written in analogy to the systematic sampling variance as
$$\frac{\sigma^2}{n}\Big(1-\frac{1}{k}\Big)\Big(1-\frac{2}{kn(k-1)}\sum_{u=1}^{kn-1}(kn-u)\rho_u+\frac{2k}{(k-1)n}\sum_{u=1}^{n-1}(n-u)\rho_u\Big)$$
and when s compact clusters are randomly selected without replacement the expected random cluster sampling variance becomes
$$\frac{\sigma^2}{ns}\Big(1-\frac{s}{k}\Big)\Big(1-\frac{2}{kn(k-1)}\sum_{u=1}^{kn-1}(kn-u)\rho_u+\frac{2k}{(k-1)n}\sum_{u=1}^{n-1}(n-u)\rho_u\Big). \tag{2.3.8}$$
Equation (2.3.8) can also be proved as follows. Let s out of k clusters be randomly selected without replacement. A cluster contains n contiguous elements. Let a be a member of index set A containing $\binom{k}{s}$ elements. Then
$$\bar X_a = \sum_{j=1}^{s}\bar X_{a_j}/s = (\bar X_{a_1}+\bar X_{a_2}+\dots+\bar X_{a_s})/s$$
and
$$E\,E_A(\bar X_a-\bar X)^2 = E\,E_A(\bar X_a-\mu)^2 - 2\,E\,E_A(\bar X_a-\mu)(\bar X-\mu) + E\,E_A(\bar X-\mu)^2.$$
The first term of the right becomes
$$E\,E_A(\bar X_a-\mu)^2 = E\,E_A\Big[\frac{(X_{a_1 1}-\mu)+\dots+(X_{a_1 n}-\mu)}{ns}+\dots+\frac{(X_{a_s 1}-\mu)+\dots+(X_{a_s n}-\mu)}{ns}\Big]^2$$
$$= E\,E_A\sum_{v=1}^{s}\sum_{i=1}^{n}\frac{(X_{a_v i}-\mu)^2}{n^2s^2} + 2\,E\,E_A\sum_{v=1}^{s}\sum_{i<j}\frac{(X_{a_v i}-\mu)(X_{a_v j}-\mu)}{n^2s^2} + 2\,E\,E_A\sum_{v<u}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{(X_{a_v i}-\mu)(X_{a_u j}-\mu)}{n^2s^2}$$
$$= \frac{\sigma^2}{ns} + \frac{n-1}{ns}\bar\rho'\sigma^2 + E\,\frac{2s(s-1)}{k(k-1)}\sum_{v<u}^{k}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{(X_{vi}-\mu)(X_{uj}-\mu)}{n^2s^2}. \tag{2.3.10}$$
The last term is obtained by arguments of symmetry (Cochran, 1963, p. 21 and 22). Using the following equality
$$2\sum_{v<u}^{k}\sum_{i=1}^{n}\sum_{j=1}^{n}(X_{vi}-\mu)(X_{uj}-\mu) = 2\mathop{\sum_{v=1}^{k}\sum_{u=1}^{k}\sum_{i=1}^{n}\sum_{j=1}^{n}}_{v\le u;\ \text{if } v=u \text{ then } i<j}(X_{vi}-\mu)(X_{uj}-\mu) - 2\sum_{v=1}^{k}\sum_{i<j}(X_{vi}-\mu)(X_{vj}-\mu)$$
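and taking the expectation it can be seen that (2.3.10) equals
$$E\,E_A(\bar X_a-\mu)^2 = \frac{\sigma^2}{ns}+\frac{n-1}{ns}\bar\rho'\sigma^2+\frac{s(s-1)}{k(k-1)}\Big[\frac{1}{n^2s^2}\sum_{\delta=1}^{nk-1}2(nk-\delta)\rho_\delta\sigma^2-\frac{k}{n^2s^2}\sum_{\delta=1}^{n-1}2(n-\delta)\rho_\delta\sigma^2\Big]$$
$$= \frac{\sigma^2}{ns}+\frac{n-1}{ns}\bar\rho'\sigma^2-\frac{(s-1)(n-1)}{ns(k-1)}\bar\rho'\sigma^2+\frac{k(kn-1)(s-1)}{kn(k-1)s}\bar\rho\sigma^2. \tag{2.3.11}$$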
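The second term in equation (2.3.8) is evaluated as follows:
$$-2\,E\,E_A(\bar X_a-\mu)(\bar X-\mu) = -2\,E\,E_A\Big[\frac{(X_{a_1 1}-\mu)+\dots+(X_{a_1 n}-\mu)}{ns}+\dots+\frac{(X_{a_s 1}-\mu)+\dots+(X_{a_s n}-\mu)}{ns}\Big]\Big[\frac{X_1-\mu}{nk}+\dots+\frac{X_{nk}-\mu}{nk}\Big]$$
$$= -2\,E\,\frac{s}{k}\Big[\frac{X_1-\mu}{ns}+\dots+\frac{X_{nk}-\mu}{ns}\Big]\Big[\frac{X_1-\mu}{nk}+\dots+\frac{X_{nk}-\mu}{nk}\Big]$$
$$= -2\,\frac{s}{k}\Big[\frac{\sigma^2}{ns}+\sum_{\delta=1}^{nk-1}\frac{2(nk-\delta)}{n^2sk}\rho_\delta\sigma^2\Big]$$
$$= -\frac{2\sigma^2}{nk}-\frac{2(nk-1)}{nk}\bar\rho\sigma^2. \tag{2.3.12}$$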
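Finally the third term in equation (2.3.8) equals
$$E\,E_A(\bar X-\mu)^2 = E\,E_A\Big[\sum_{i=1}^{nk}\frac{X_i-\mu}{nk}\Big]^2 = E\,E_A\sum_{i=1}^{nk}\frac{(X_i-\mu)^2}{n^2k^2}+2\,E\,E_A\sum_{i<j}\frac{(X_i-\mu)(X_j-\mu)}{n^2k^2}$$
$$= \frac{\sigma^2}{nk}+\sum_{\delta=1}^{nk-1}\frac{2(nk-\delta)}{n^2k^2}\rho_\delta\sigma^2 = \frac{\sigma^2}{nk}+\frac{nk-1}{nk}\bar\rho\sigma^2. \tag{2.3.13}$$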
Adding the results of equations (2.3.11), (2.3.12) and (2.3.13) together yields the expected variance of random cluster sampling given by equation (2.3.8).
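The result of this derivation can be checked by enumeration on a small population. The sketch below is only an illustration under stated assumptions: the correlation between two elements depends only on their lag, the closed form is written as $\frac{\sigma^2}{ns}(1-\frac{s}{k})(1-\frac{nk-1}{k-1}\bar\rho+\frac{k(n-1)}{k-1}\bar\rho')$ with $\bar\rho'$ the average correlation within a cluster, and all function names are hypothetical.

```python
import itertools
import math

def closed_form(rho, n, k, s, sigma2=1.0):
    """Expected variance of the mean of s random clusters of n elements (n, k >= 2)."""
    N = n * k
    rho_bar = 2.0 * sum((N - u) * rho(u) for u in range(1, N)) / (N * (N - 1))
    rho_within = 2.0 * sum((n - u) * rho(u) for u in range(1, n)) / (n * (n - 1))
    return (sigma2 / (n * s)) * (1.0 - s / k) * (
        1.0 - (N - 1) / (k - 1) * rho_bar + k * (n - 1) / (k - 1) * rho_within)

def brute_force(rho, n, k, s, sigma2=1.0):
    """Average E(sample mean - population mean)^2 over every choice of s clusters."""
    N = n * k
    total, count = 0.0, 0
    for chosen in itertools.combinations(range(k), s):
        # Weight of element i in (sample mean - population mean).
        w = [(1.0 / (n * s) if i // n in chosen else 0.0) - 1.0 / N
             for i in range(N)]
        total += sum(w[i] * w[j] * (1.0 if i == j else rho(abs(i - j)))
                     for i in range(N) for j in range(N))
        count += 1
    return sigma2 * total / count

if __name__ == "__main__":
    rho = lambda u: math.exp(-0.3 * u)   # assumed correlogram, for illustration
    print(closed_form(rho, n=3, k=4, s=2), brute_force(rho, n=3, k=4, s=2))
```

The two printed values should agree up to rounding error, because the enumeration evaluates the same expectation term by term.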
If s compact clusters are systematically selected at an interval of c clusters such that cs = k and c, s and k are all integers, then it can be shown similarly that the expected variance of systematic cluster sampling is given by
$$\sigma^2_{clsy} = E\,V_{clsy}(\bar x) = E\,\frac{1}{c}\sum_{i=1}^{c}(\bar X_i-\bar X)^2 = E\,\frac{1}{cn^2s^2}\sum_{i=1}^{c}\Big[\sum_{j=1}^{s}\sum_{u=1}^{n}\big(X_{(i+(j-1)c-1)n+u}-\bar X\big)\Big]^2.$$
If clusters are small and compact and the interval between clusters is not too small, approximation of the resulting algebraic expression yields the variance
$$\sigma^2_{clsy} \sim \frac{\sigma^2}{ns}\,\frac{c-1}{c}\Big(1-\frac{nk-1}{c-1}\bar\rho+\frac{c(n-1)}{c-1}\bar\rho'+\frac{nc(s-1)}{c-1}\bar\rho_{cn}\Big).$$
If we sample from a continuous process, and if c and k are large, then
$$\sigma^2_{clsy} \sim \frac{\sigma^2}{ns}\Big(1-ns\int_0^l f_l(u)\rho(u)\,du+n\int_0^n f_n(u)\rho(u)\,du\Big),$$
where l is the length of the population, n is the length of a cluster, cn is the mean distance between clusters and k is the number of clusters.
2.4 Systematic sampling of 2-dimensional stochastic processes
So far we have mainly discussed variances of samples drawn from 1-dimensional populations. Although adding a dimension does not essentially alter concepts, problems arise with respect to the correlogram and the possible sampling schemes with an increasing complexity of formulae. Recall that we discussed the applicability of Cochran's stochastic model to sampling aerial photographs by means of lattices. Quenouille (1949) and Das (1950) generalized Cochran's result for 2-dimensional sampling schemes and gave conditions for the superiority of systematic and stratified sampling.
Their results hold for finite populations with a 2-dimensional index, (i, j), consisting of elements $X_{ij}$, $i = 1, 2, \dots, n_1k_1$; $j = 1, 2, \dots, n_2k_2$, being random variables drawn from an infinite population in which $E X_{ij} = \mu$; $E(X_{ij}-\mu)^2 = \sigma^2$, $E(X_{ij}-\mu)(X_{i+u,j+v}-\mu) = \rho_{ijuv}\sigma^2$ and $\rho_{uv} = \rho_{-u,-v}$ is defined as
$$\rho_{uv} = \frac{1}{(k_1n_1-|u|)(k_2n_2-|v|)}\sum_i\sum_j\rho_{ijuv}.$$
Since we can use different sampling schemes in two directions (i.e., select the i index in one way and the j in another) many 2-dimensional sampling schemes are possible. For example one could use stratified random sampling in a North-South direction and sample systematically East-West. In addition sample units may or may not be aligned in one or both directions. If we consider unrestricted random sampling, stratified random sampling and systematic sampling, then there are 36 possible sampling schemes. If aligned sampling is denoted by the subscript 1 and unaligned sampling with 0, the following table shows all possibilities.

Sampling scheme            Sampling scheme in horizontal direction
in vertical
direction          r_0         r_1         st_0        st_1        sy_0        sy_1

r_0                r_0r_0      r_0r_1      r_0st_0     r_0st_1     r_0sy_0     r_0sy_1
r_1                r_1r_0      r_1r_1      r_1st_0     r_1st_1     r_1sy_0     r_1sy_1
st_0               st_0r_0     st_0r_1     st_0st_0    st_0st_1    st_0sy_0    st_0sy_1
st_1               st_1r_0     st_1r_1     st_1st_0    st_1st_1    st_1sy_0    st_1sy_1
sy_0               sy_0r_0     sy_0r_1     sy_0st_0    sy_0st_1    sy_0sy_0    sy_0sy_1
sy_1               sy_1r_0     sy_1r_1     sy_1st_0    sy_1st_1    sy_1sy_0    sy_1sy_1
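The count of 36 schemes follows from combining three selection methods and two alignment options in each of two directions. A minimal sketch of that enumeration is given below; the labels and function names are illustrative only.

```python
from itertools import product

# Three selection methods, each unaligned (0) or aligned (1).
METHODS = ("r", "st", "sy")
ALIGNMENT = ("0", "1")

def one_direction():
    """The six one-directional schemes: r0, r1, st0, st1, sy0, sy1."""
    return [m + a for m, a in product(METHODS, ALIGNMENT)]

def all_schemes():
    """All (vertical, horizontal) combinations of one-directional schemes."""
    return list(product(one_direction(), one_direction()))

if __name__ == "__main__":
    schemes = all_schemes()
    print(len(schemes))
    for vertical in one_direction():
        print("  ".join(vertical + h for h in one_direction()))
```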
For our purpose we will deal only with unaligned simple random sampling, unaligned stratified random sampling and aligned systematic sampling, where the kind of sampling is the same for both directions. Quenouille (1949) has given variance formulae for these three schemes. In the case of simple random sampling
$$\sigma_r^2 = \frac{\sigma^2}{n_1n_2}\Big(1-\frac{1}{k_1k_2}\Big)\Big\{1-\sum_{u=-(k_1n_1-1)}^{k_1n_1-1}\ \sum_{v=-(k_2n_2-1)}^{k_2n_2-1}\frac{(k_1n_1-|u|)(k_2n_2-|v|)}{k_1k_2n_1n_2(k_1k_2n_1n_2-1)}\rho_{uv}\Big\}, \tag{2.4.1}$$
in the case of stratified random sampling
(2.4.2)
and in the case of systematic sampling, where $k_1$ and $k_2$ are the systematic sampling intervals in the two directions,
$$\sigma^2_{sy} = E\,V_{sy}(\bar x) = \frac{\sigma^2}{n_1n_2}\Big(1-\frac{1}{k_1k_2}\Big)\Big\{1-\sum\sum\frac{(n_1k_1-|u|)(n_2k_2-|v|)}{n_1n_2k_1k_2(k_1k_2-1)}\rho_{uv}+\sum\sum\frac{k_1k_2(n_1-|u|)(n_2-|v|)}{(k_1k_2-1)n_1n_2}\rho_{uk_1,vk_2}\Big\},$$
where the double summations exclude the point u = v = 0.
Quenouille's paper also provides derivations for the more intricate designs, where it is shown that for positive monotonic decreasing correlation functions, and for the majority of functions realized in practice, alignment will usually increase the variance for simple random and stratified random sampling. For systematic samples it is more difficult to make a general statement whether precision will be improved by alignment. Where systematic sampling is used for its convenience in selection, location and control, we think that non alignment is less useful and that we are in fact faced with the same problems as in case of random sampling.
Principally there is an exact correspondence with previous results for the 1-dimensional case. Instead of $n_1n_2$ and $k_1k_2$ we had before n and k. The coefficients of the $\rho$'s add up to 1 for all three equations and in the case of systematic sampling the coefficients of $\rho_{uv}$ are $\frac{k_1k_2n_1n_2-1}{k_1k_2-1}$ times as large as those of unrestricted random sampling and the correlations in the second term are lagged $(k_1,k_2)$. Noticing that the coefficients are functions of the relative frequencies of distances within a rectangular lattice taking direction into account, the correspondence with the 1-dimensional case is seen.
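The statements about the coefficients can be illustrated numerically. The sketch below assumes that the 2-dimensional formulae carry the leading factor $\frac{\sigma^2}{n_1n_2}(1-\frac{1}{k_1k_2})$ in direct analogy to the 1-dimensional case, and that the correlation depends only on the lag (u, v). The function names and the exponential correlogram are hypothetical choices for illustration.

```python
import math

def lags(m1, m2):
    """All (u, v) with |u| < m1 and |v| < m2, excluding u = v = 0."""
    return [(u, v) for u in range(1 - m1, m1) for v in range(1 - m2, m2)
            if (u, v) != (0, 0)]

def var_random_2d(rho, n1, n2, k1, k2, sigma2=1.0):
    """Unrestricted random sampling of n1*n2 points from an N1 x N2 lattice."""
    N1, N2 = n1 * k1, n2 * k2
    N = N1 * N2
    total = sum((N1 - abs(u)) * (N2 - abs(v)) * rho(u, v) for u, v in lags(N1, N2))
    return sigma2 / (n1 * n2) * (1 - 1 / (k1 * k2)) * (1 - total / (N * (N - 1)))

def var_systematic_2d(rho, n1, n2, k1, k2, sigma2=1.0):
    """Aligned systematic sampling with intervals k1 and k2."""
    N1, N2 = n1 * k1, n2 * k2
    K, m = k1 * k2, n1 * n2
    total = sum((N1 - abs(u)) * (N2 - abs(v)) * rho(u, v) for u, v in lags(N1, N2))
    lagged = sum((n1 - abs(u)) * (n2 - abs(v)) * rho(u * k1, v * k2)
                 for u, v in lags(n1, n2))
    return sigma2 / m * (1 - 1 / K) * (
        1 - total / (m * K * (K - 1)) + K * lagged / ((K - 1) * m))

def brute_force_systematic(rho, n1, n2, k1, k2, sigma2=1.0):
    """Average, over all k1*k2 starts, of E(sample mean - population mean)^2."""
    N1, N2 = n1 * k1, n2 * k2
    pts = [(i, j) for i in range(N1) for j in range(N2)]
    acc = 0.0
    for a in range(k1):
        for b in range(k2):
            # Weight of each lattice point in (sample mean - population mean).
            w = [(1.0 / (n1 * n2) if (i % k1, j % k2) == (a, b) else 0.0)
                 - 1.0 / (N1 * N2) for i, j in pts]
            acc += sum(w[p] * w[q]
                       * rho(pts[p][0] - pts[q][0], pts[p][1] - pts[q][1])
                       for p in range(len(pts)) for q in range(len(pts)))
    return sigma2 * acc / (k1 * k2)

if __name__ == "__main__":
    rho = lambda u, v: math.exp(-0.4 * math.hypot(u, v))  # assumed isotropic correlogram
    print(var_random_2d(rho, 3, 2, 2, 2), var_systematic_2d(rho, 3, 2, 2, 2))
```

Setting every correlation equal to 1 makes both variances vanish, which is another way of seeing that the coefficients of the correlations add up to 1.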
For a rectangular $N_1 \times N_2$ lattice with equidistant points and assuming isotropic processes we can write the above three formulae more simply as
(2.4.4)
(2.4.6)
$$\cdots\ \sum_r P(D_n = r)\,\rho_{rk}\Big),$$
The summation extends over u and v such that $1 \le u^2+v^2 = r^2 \le (N_1-1)^2+(N_2-1)^2$, where $r^2$ is an integer. There is no neat analytic solution to $u^2+v^2 = r^2$. The simplest method is to give v (or u) all integral values less than or equal to r and examine which of these render $r^2-v^2$ (or $r^2-u^2$) a perfect square (Chrystal, 1961, p. 486). If r is the distance between elements in the same row or column, $\theta = 1$; otherwise $\theta = 2$.
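The search for integer pairs described above can be sketched as follows. This is a minimal illustration, not the program used in the study; the function name and the bounds `u_max` and `v_max` are hypothetical.

```python
import math

def lattice_lags(r2, u_max, v_max):
    """Non-negative integer pairs (u, v) with u*u + v*v == r2.

    v runs over all integral values not exceeding r (nor v_max); a pair is
    kept when r2 - v*v is a perfect square and u stays within u_max.
    """
    found = []
    for v in range(0, min(v_max, math.isqrt(r2)) + 1):
        rest = r2 - v * v
        u = math.isqrt(rest)
        if u * u == rest and u <= u_max:
            found.append((u, v))
    return found

if __name__ == "__main__":
    print(lattice_lags(25, 10, 10))
    print(lattice_lags(3, 10, 10))
```

Pairs with u = 0 or v = 0 correspond to distances within a single row or column; the remaining pairs occur in two mirrored directions.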
$P(D_k = r)$ is defined similarly. In case of $P(D_n = r)$ care is necessary if the systematic sampling interval is not the same in both directions. The same difficulty arises if the lattices sampled have more points in one particular direction. We may then want to distinguish between the two directions as follows, still assuming isotropy. Let the rows contain $k_1n_1 = N_1$ points and the columns $k_2n_2 = N_2$ points with $N_1 < N_2$. Let $u = 1, 2, \dots, N_1-1$ and $v = 1, 2, \dots, N_2-1$.
... between points on a column
$$P(D_n = r)\,\rho_{rk} = \sum_u\sum_v\frac{4(n_1-u)(n_2-v)}{n_1n_2(n_1n_2-1)}\,\rho\big((k_1^2u^2+k_2^2v^2)^{1/2}\big),$$
where the double summation is over u and v for all $r_k = (k_1^2u^2+k_2^2v^2)^{1/2}$ either in a diagonal direction, i.e. $u = v \neq 0$, or between points not within the same row or column, but not when u = v.
In case of 2-dimensional compact cluster sampling variance formulae can be written analogous to equations (2.3.8) and (2.4.3). If $s_1s_2$ clusters of size $n_1 \times n_2$ are sampled randomly and without replacement, assuming isotropic conditions the expected variance is given by
$$\sigma^2_{clr} = E\,V_{clr}(\bar x) = \frac{\sigma^2}{n_1n_2s_1s_2}\Big(1-\frac{s_1s_2}{k_1k_2}\Big)\Big(1-\frac{N_1N_2-1}{k_1k_2-1}\sum_r P(D_N=r)\,\rho_r+\frac{k_1k_2(n_1n_2-1)}{k_1k_2-1}\sum_r P(D_n=r)\,\rho_r\Big).$$
If the clusters are small and systematic samples are taken at large intervals $c_1$ (horizontal direction) and $c_2$ (vertical direction) such that $c_1s_1 = k_1$ and $c_2s_2 = k_2$, then the expected approximate variance is given by
(2.4.8)
analogous to formula (2.3.14) and assuming isotropy.
For sampling a continuous process Zubrzycki (1958) obtained results for a finite region D where D consists of k congruent strata or regions $D_1, D_2, \dots, D_k$. If the region is rectangular and of size $l_1l_2$ such that $n_1d_1n_2d_2 = l_1l_2$, where $d_1d_2$ is the size of a stratum or the 2-dimensional sampling interval, then
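$$\cdots\ \rho_{uv}\,du\,dv\Big),$$
$$\sigma^2_{st} = \frac{\sigma^2}{n_1n_2}\Big(1-\frac{1}{d_1^2d_2^2}\int_{-d_1}^{d_1}\int_{-d_2}^{d_2}(d_1-|u|)(d_2-|v|)\,\rho_{uv}\,du\,dv\Big), \tag{2.4.10}$$
$$\sigma^2_{sy} = \frac{\sigma^2}{n_1n_2}\Big(1-\frac{1}{d_1d_2l_1l_2}\int_{-l_1}^{l_1}\int_{-l_2}^{l_2}(l_1-|u|)(l_2-|v|)\,\rho_{uv}\,du\,dv+\cdots\Big). \tag{2.4.11}$$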
Similar variance formulae have been given by Quenouille (1949) and by Matern (1960) for infinitely large regions, where Matern considered in addition isotropic processes.
In case isotropy is assumed and the sample units are all of one geometric shape, then factors of the form under the integral signs can be replaced by the distribution of the distance between two random (equal probability) points within the respective geometrical figure such as a rectangle, triangle or circle. Such distribution functions are given by Ghosh (1943, 1951) and Matern (1947).
Considering the efficiency of the different sampling schemes in the 2-dimensional case Das (1950) gives conditions such that stratified sampling will be better than simple random sampling and systematic sampling will be better than stratified random sampling. If $\rho(u+1,v)+\rho(u-1,v) \ge 2\rho(u,v)$ and $\rho(u,v+1)+\rho(u,v-1) \ge 2\rho(u,v)$ then $\sigma^2_{sy} \le \sigma^2_{st}$ for any size of sample and $\sigma^2_{sy} < \sigma^2_{st}$ if the inequalities are strict. Zubrzycki (1958) shows that if the process is isotropic and if the strata are congruent then $\sigma^2_{st} \le \sigma^2_r$. In particular if the correlogram is of the form $e^{-ar}$, where a is some positive constant and r is distance, then $\sigma^2_{sy} \le \sigma^2_{st}$ when a is sufficiently large. In fact Zubrzycki shows that if the strata are congruent, disjoint circles forming the region D with diameter less than 1/a (which may not be a very realistic example), then, if the process is isotropic, $\sigma^2_{sy} > \sigma^2_{st}$. Examples can be given that under such circumstances the conditions given by Das are not satisfied. Thus results for the 1-dimensional case do not carry over exactly to 2-dimensional situations as has already been suggested earlier by Cochran (see Haynes, 1942, p. 38).
Matern (1960) investigated the precision of systematic sampling schemes consisting of lattices of different shape but with the same number of points per unit area. Assuming isotropy and exponential correlograms, he found that triangular lattices were the most precise, but the difference with square lattices was rather slight. For elongated, narrow strata the systematic sampling variance per point was much larger than that of stratified random and simple random sampling, i.e., rectangular patterns with many more points in one direction than in the other are inefficient with respect to precision. Matern assumed an infinite region and his computations are truncated at a point beyond which contributions as a consequence of the correlogram are negligible.
From the previous it is clear that if we are considering systematic sampling, we should be concerned with estimation of the correlogram. Once the correlogram is known, sampling designs can be compared. Also systematic cluster sampling with systematic sampling at the second stage can be considered. Correlograms are thought of as being convex and positive in case of forest land use (Osborne, 1942; Matern, 1947); however, negative correlations should be possible and expected in some instances in the United States. If the correlogram indicates negative correlations, it is probably a result of trends or periodicity. Only in the case of positive convex correlograms can direct statements be made concerning relative precision of stratified and systematic sampling when considering infinite population models as those of Cochran, Quenouille and Matern. Some empirical evaluations should be made under these conditions for finite regions. When correlograms are slowly falling, but are positive and convex, systematic sampling is obviously advantageous, whereas systematic sampling is about equivalent to simple random sampling if the correlogram is convex but sharply decreasing (Matern, 1947). It depends then on the total size of the area and the number of points to be sampled whether systematic sampling should be used.
3. THE CORRELOGRAM
3.1 Introduction
In this chapter we will discuss methods of estimating the correlogram and some properties of the estimates. No mathematical rigor is claimed all the way through this part of the study, which is based primarily on empirical observations. Workability of any solution to the problem of how to estimate the correlogram reasonably well has always been kept in the foreground, as has its application to an actual finite but continuous population like the occurrence of land uses on aerial photographs of the Lake Michie watershed.
Section 2 deals with the definition of the autocorrelation and its estimators in a 2-valued random field. Section 3 approximates the variance of some estimators under ideal conditions. The result is shown to be identical to Yule's variance formula for a measure of association in a 2 x 2 table. Finally a procedure is discussed for testing whether correlations are real. In Section 4 the results of the previous two sections are applied to actual populations consisting of land use data on aerial photographs. Non linear estimation of the trends in these correlograms, i.e., the autocorrelation function $\rho(d)$, is attempted. This chapter, thus, provides a logical link between the theory in Chapter 2 and its application to the evaluation of some systematic sampling schemes reported in Chapter 4.
3.2 Estimators of autocorrelations in a 0,1 valued field
In the literature on time series and stochastic processes autocorrelations are very generally defined. For continuously distributed random variables particular results concerning expected values, variances and covariances of autocorrelation estimators are derived assuming normality. In this section we will define autocorrelations for 0,1 distributed random variables in a 2-dimensional field assuming stationarity and isotropy. We will then proceed to discuss three possible forms of autocorrelation estimators.
Consider a region in the plane. At each point there is a random variable z attached. The random variable z can take on either of the two values 0 and 1. To identify a point in this region a rectangular coordinate system is superimposed with its origin at an arbitrary point of the region. Let the coordinate axes be denoted by x and y, and let $d = (u^2+v^2)^{1/2}$ be a distance between two points in the region. The quantities u and v may be positive or negative. Assuming stationarity and isotropy it follows that for any d
$$\rho(d) = \frac{E[z(x,y)\,z(x\pm u,y\pm v)]-E[z(x,y)]\,E[z(x\pm u,y\pm v)]}{E[z(x,y)^2]-\{E[z(x,y)]\}^2} = \frac{\pi_{11}-\pi_1^2}{\pi_1\pi_2} \tag{3.2.1}$$
where $P\{z(x,y)=1\} = \pi_1$, $P\{z(x,y)=0\} = \pi_2$, $P\{z(x,y)=1,\ z(x\pm u,y\pm v)=1\} = \pi_{11}$ and $\rho(d)$ denotes the autocorrelation, which we could call the regional autocorrelation, at distance or lag d.
For any two points (x,y) and (x±u, y±v) within the region at distance d, the probabilities of all possible outcomes can be presented in the following table.

Distance d                    Outcome of z(x,y)
                                1         0
Outcome of            1       π_11      π_12      π_1
z(x±u,y±v)            0       π_21      π_22      π_2
                              π_1       π_2       1

Clearly $\pi_{12} = \pi_{21}$, hence
$$\pi_{11}-\pi_{22} = \pi_1-\pi_2.$$
Therefore, it makes no difference which state is identified by a 0 or 1 and we can say that the correlogram of the distribution of the complement of a state is the same as that of the state itself. If there are more than two states, this complementary feature will hold for combinations of states. Theoretically, an additional state could be arranged physically as to make the correlograms of other states zero for all lags. In practice this is an unlikely event, particularly if the additional state or composite of additional states is scarce.
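The complementary feature can be verified with a few lines of arithmetic. The sketch below assumes the form $\rho(d) = (\pi_{11}-\pi_1^2)/(\pi_1\pi_2)$ of equation (3.2.1) and the symmetry $\pi_{12} = \pi_{21}$; the function names and the numerical probabilities are made up for illustration.

```python
def autocorrelation(pi11, pi1):
    """rho(d) = (pi11 - pi1^2) / (pi1 * pi2), with pi2 = 1 - pi1."""
    pi2 = 1.0 - pi1
    return (pi11 - pi1 ** 2) / (pi1 * pi2)

def complement(pi11, pi1):
    """Joint and marginal probabilities after exchanging the labels 0 and 1."""
    pi12 = pi1 - pi11        # P(1 at one point, 0 at the other)
    pi2 = 1.0 - pi1
    pi22 = pi2 - pi12        # relies on pi12 == pi21
    return pi22, pi2

if __name__ == "__main__":
    pi11, pi1 = 0.40, 0.55   # illustrative values only
    print(autocorrelation(pi11, pi1))
    print(autocorrelation(*complement(pi11, pi1)))
```

Both calls return the same value because $\pi_{11}-\pi_1^2$ and $\pi_{22}-\pi_2^2$ are each equal to $\pi_1\pi_2-\pi_{12}$.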
As an example of this feature consider two main states or classes, forest and farm land, and three minor states, roads, residential areas and ponds. Figures 3.1 through 3.5 show estimated correlograms from an aerial photograph on which these five classes were recognized in the following respective percentages: 54.96, 38.56, 2.64, 2.68 and 1.16. The correlograms (Figures 3.1 and 3.2) for the distribution of forest and farm land do not differ markedly, while the correlograms of the distribution of the minor classes are close to zero almost throughout the whole range of lags.
Let an aerial photograph represent a particular realization of a process responsible for the occurrence of forest and non forest where the number 1 is assigned to all points in forest and 0 to all other points. Sampling the photograph by observing pairs of random (equal probability) points at a distance d, the expected results can be represented as follows (Pielou, 1969):

Distance d                    $\tilde z(x,y)$
                                1         0
$\tilde z(x\pm u,y\pm v)$     1       p_11      p_12      p_1
                      0       p_21      p_22      p_2
                              p_1       p_2       1
Figure 3.1  Correlogram of the distribution of forest on photograph 33 (horizontal axis: distance in miles)
Figure 3.2  Correlogram of the distribution of farm land on photograph 33 (horizontal axis: distance in miles)
Figure 3.3  Correlogram of the distribution of roads on photograph 33 (horizontal axis: distance in miles)
Figure 3.4  Correlogram of the distribution of residential area on photograph 33 (horizontal axis: distance in miles)
Figure 3.5  Correlogram of the distribution of ponds on photograph 33 (horizontal axis: distance in miles)
where $\tilde z$ denotes an outcome of the random variable z for this realization. The $p_{ij}$ are fixed quantities for this photograph and are the probabilities that two random points at distance d are both 1's, both 0's, 0 and 1 or 1 and 0. It follows then that the realized autocorrelation, which we could call the photograph autocorrelation, is given by
$$r(d) = \frac{p_{11}-p_1^2}{p_1p_2}.$$
Similarly, we can consider a lattice of points instead of all points in a region. At each point of the lattice a random variable is attached as before. Placing the lattice randomly on an aerial photograph and observing the values at the lattice points we obtain a realization of a random process. Observing all pairs of points at distance d, where d can only take on particular values depending on the shape of the lattice and the distance between adjacent points in either of the two directions, a table can be set up as before with entries denoted by $\dot p_{ij}$. The realized autocorrelation, which we could call the lattice autocorrelation, is now given by
$$\dot r = \frac{\dot p_{11}-\dot p_1^2}{\dot p_1\dot p_2}.$$
The variability of $\dot p_{ij}$ will depend on the density of the lattice, but averaging over all locations of the lattice $E\,\dot p_{ij}$ will be equal to $p_{ij}$. For the applications in which we are interested, we will assume lattices sufficiently dense to say that $\dot r = r$.
Under certain conditions the $p_{ij}$ and $p_i$ may be replaced by $\pi_{ij}$ and $\pi_i$. These conditions relate to the ergodic theorem. For a statement of this theorem see Yaglom (1969). To satisfy the conditions we assume in addition to stationarity and isotropy that $\rho(d)$ approaches zero if d approaches infinity and that the fourth order moments of the process are negligible. Periodicity has as yet not been reported in forestry literature without challenge, but it is intuitively reasonable to assume that the autocorrelations dwindle to zero with increasing distance. In the work to be discussed in a later section we found some evidence of damped oscillations but these are not at all persistent over distance or from photograph to photograph. Assuming the fourth order moments to be zero permits r(d) to be written in terms of $\pi_{ij}$ so that $r(d) = \rho(d)$ for a sufficiently large region. For a justification of this assumption see Jenkins and Watts (1968, page 175).
It should be noted that whatever the size of the region, unless infinitely large, autocorrelation estimates for larger values of d will be less precise. As the distance d becomes larger, relatively fewer pairs of points at distance d apart are available to estimate $\rho(d)$. To get some indication of the distance at which the reliability of the estimates begins to decrease severely, a correlogram has been computed for a square array of 50 by 50 random numbers consisting of zeros and ones produced by a random number generator (see Figure 3.6). If the same lattice were to occur on an aerial photograph of scale 1:20,000 within a region of 5 inches by 5 inches at the center of the photo, the cut off point would occur at about 1.25 miles.
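An experiment of this kind can be sketched as follows. The code is an illustration only, not the program used for Figure 3.6: it assumes the lattice autocorrelation is computed as $(\dot p_{11}-\dot p_1^2)/(\dot p_1\dot p_2)$ from all pairs of lattice points at each distance, with every pair counted in both orders, and the function name and seed are arbitrary.

```python
import math
import random
from collections import defaultdict

def lattice_correlogram(z):
    """Return {distance: (r, number_of_pairs)} for a 0/1 array z (list of rows)."""
    rows, cols = len(z), len(z[0])
    # Per squared distance: [pairs, pairs with both ones, ones among endpoints].
    acc = defaultdict(lambda: [0, 0, 0])
    for dy in range(rows):
        for dx in range(-(cols - 1), cols):
            if dy == 0 and dx <= 0:
                continue  # visit every unordered pair exactly once
            cell = acc[dx * dx + dy * dy]
            for y in range(rows - dy):
                for x in range(max(0, -dx), min(cols, cols - dx)):
                    a, b = z[y][x], z[y + dy][x + dx]
                    cell[0] += 1
                    cell[1] += a * b
                    cell[2] += a + b
    result = {}
    for d2, (pairs, both, ones) in sorted(acc.items()):
        p11 = both / pairs
        p1 = ones / (2 * pairs)
        if 0.0 < p1 < 1.0:  # r is undefined when all endpoints are equal
            result[math.sqrt(d2)] = ((p11 - p1 * p1) / (p1 * (1.0 - p1)), pairs)
    return result

if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed, for repeatability only
    z = [[rng.randint(0, 1) for _ in range(50)] for _ in range(50)]
    table = lattice_correlogram(z)  # pure Python, so this takes a few seconds
    for d in (1.0, 10.0, 25.0, 40.0, 49.0):
        r, pairs = table[d]
        print(f"d={d:5.1f}  r={r:+.3f}  pairs={pairs}")
```

The number of pairs falls steadily as d grows, which is why the estimates scatter more widely at the larger lags.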
Figure 3.6  Correlogram of a 50 x 50 square array of 0,1 random numbers (horizontal axis: distance in miles)
The problem of concern is one of estimating the autocorrelation r(d) in order to represent the correlogram of the distribution of a certain land use in an area, or in a part of the total area of interest, like those of Figures 3.1 through 3.6. The estimators are assembled from the same type of tables as used in the definition of $\rho(d)$ and r(d).
Consider again a map or photograph on which only two states are recognized, forested and non forested areas. Consider that n pairs of points at distance d are selected, e.g., by throwing a needle n times. If a point falls in a forested area a score of 1 will be taken as the observation at the point, otherwise the score will be 0. The variable observed at the first point of a pair is designated as variable x and that at the second point as y, then the following table can be constructed:

                          y = score at second point
                              1         0
x = score at        1       n_11      n_12      n_1.
first point         0       n_21      n_22      n_2.
                            n_.1      n_.2      n

Assuming isotropy the product moment correlation coefficient between x and y may be taken as an estimator of the autocorrelation between points at distance d. This estimator will be denoted as $r_1(d)$ and is calculated by
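$$r_1(d) = \frac{\sum_{i=1}^{n}x_iy_i-n\bar x\bar y}{\Big[\Big(\sum_{i=1}^{n}x_i^2-n\bar x^2\Big)\Big(\sum_{i=1}^{n}y_i^2-n\bar y^2\Big)\Big]^{1/2}}$$
and thus in terms of the $n_{ij}$
$$r_1(d) = \frac{n\,n_{11}-n_{1.}n_{.1}}{(n_{1.}n_{.1}n_{2.}n_{.2})^{1/2}},$$
which can be rewritten as
$$r_1(d) = \frac{n_{11}n_{22}-n_{12}n_{21}}{(n_{1.}n_{.1}n_{2.}n_{.2})^{1/2}}.$$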
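The equivalence of the two ways of computing $r_1(d)$ can be illustrated with a small made-up sample. The sketch below is an assumption-laden example, not the author's program: the function names and the eight pairs of scores are hypothetical.

```python
import math

def r1_from_counts(n11, n12, n21, n22):
    """r1 from the cell counts of the 2 x 2 table."""
    row1, row2 = n11 + n12, n21 + n22   # n_1. and n_2.
    col1, col2 = n11 + n21, n12 + n22   # n_.1 and n_.2
    return (n11 * n22 - n12 * n21) / math.sqrt(row1 * col1 * row2 * col2)

def r1_from_scores(x, y):
    """Product moment correlation of the 0/1 scores at first and second points."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum(a * b for a, b in zip(x, y)) - n * xbar * ybar
    sxx = sum(a * a for a in x) - n * xbar ** 2
    syy = sum(b * b for b in y) - n * ybar ** 2
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    x = [1, 1, 1, 0, 0, 1, 0, 0]   # made-up scores at the first points
    y = [1, 1, 0, 0, 1, 1, 0, 0]   # made-up scores at the second points
    print(r1_from_scores(x, y), r1_from_counts(3, 1, 1, 3))
```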
To avoid the arbitrary labeling of the observations as outcomes of the random variables x and y, each pair of points should be recorded twice but the second time in reverse order. If the number of pairs sampled is large the difference between computations based on points taken once and computations based on points taken twice is negligible. The 2 x 2 table in the last case is given below.

                      y score
x score             1         0
   1              a_11      a_12      a_1
   0              a_21      a_22      a_2
                  a_1       a_2       a

It follows that $a_{11} = 2n_{11}$, $a_{12} = n_{12}+n_{21}$, $a_{12} = a_{21}$, $a_{22} = 2n_{22}$ and $a = 2n$. Using again the product moment correlation coefficient the autocorrelation estimator is denoted by $r_2(d)$ and is calculated by
(3.2.6)
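A minimal sketch of the symmetrized estimator follows. It assumes that $r_2(d)$ is the product moment correlation applied to the doubled table, which with equal row and column margins reduces to $(a_{11}a_{22}-a_{12}^2)/(a_1a_2)$; the function name and the counts are hypothetical.

```python
def r2_from_counts(n11, n12, n21, n22):
    """Product moment correlation after recording every pair in both orders."""
    a11, a22 = 2 * n11, 2 * n22
    a12 = a21 = n12 + n21
    a1 = a11 + a12   # row total, equal to the matching column total
    a2 = a21 + a22
    return (a11 * a22 - a12 * a21) / (a1 * a2)

if __name__ == "__main__":
    # Exchanging n12 and n21 relabels the points of each pair and
    # leaves the estimate unchanged.
    print(r2_from_counts(5, 3, 1, 7), r2_from_counts(5, 1, 3, 7))
```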
To select n pairs of points the sampling process could consist of throwing a needle of length d n times. An alternative method is to approximate the aerial photograph by a sufficiently fine lattice. The observations at the lattice points are then transferred into a computer and using a random number generator n pairs of points at distance d can be selected. One way of doing this is as follows. Program the computer to select an initial point on the lattice out of all lattice points with equal probability. Instruct the computer to search and catalogue all other points on the lattice which are at distance d from the initial point. Program the computer to select another point with equal probability from the points which are catalogued in the previous step, if any. Repeat the process n times. The actual use of needles of different lengths is of little practical value. The computerized method we found time consuming, too, relative to other possibilities of its kind. For instance, a set of q points could be selected randomly and without replacement from the lattice points. For this set of q points all pairs of points at distance d are determined. In this way n is not fixed and varies according to the distribution of the distance between two random points (equal probability) on a lattice. If we allow correlations to be computed for all possible distances occurring in the set of q points, this method is much more efficient. Instead of a set of q random points, a lattice with larger spacing between points could also be taken out of the initial denser lattice.
However, any method which uses the same set of points to estimate
correlations at different lags will give dependent estimates. This may
not be a serious problem if the set of points is large enough to provide
large numbers of pairs at the smaller lags. It is a common practice in
time series analysis of equally spaced observations to compute several or
all possible serial correlations. Usually the mean of the whole series is
used, i.e., neglecting end effects. For instance Jenkins and Watts (1968,
p. 182) do not recommend the use of (2.2.7) with two means for discrete
time series data, but use the following formula for relatively small lags:
\[
r(k) = \frac{\sum_{i=1}^{N-k} (x_i - \bar{x})(x_{i+k} - \bar{x})}
            {\sum_{i=1}^{N} (x_i - \bar{x})^2}
\]
where $x_1, x_2, \ldots, x_N$ are the N observations in the series. Since the
denominator contains N terms and the numerator N-k terms, the factor
N/(N-k) is being neglected for large N and small k.
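As a small illustration of the serial correlation formula and of the neglected factor N/(N-k), the following sketch uses a hypothetical five-term series; the function name is an assumption.

```python
def serial_correlation(x, k):
    """r(k) with the mean of the whole series, i.e. neglecting end effects."""
    n = len(x)
    mean = sum(x) / n
    numerator = sum((x[i] - mean) * (x[i + k] - mean) for i in range(n - k))
    denominator = sum((xi - mean) ** 2 for xi in x)
    return numerator / denominator


series = [1, 2, 3, 4, 5]                # hypothetical observations
r1 = serial_correlation(series, 1)      # N - k = 4 products over N = 5 squares
corrected = r1 * len(series) / (len(series) - 1)
print(r1, corrected)
```

For a short series the factor N/(N-k) is clearly not negligible; here it changes the estimate by a quarter.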
The use of this formula causes more severe problems in our case
where the data are arranged in a two dimensional array of N x N points.
For a given distance in a given direction the number of times the
distance occurs equals (N-h)(N-v), where h equals the number of units in
the horizontal direction and v the number of units in the vertical
direction such that $h^2 + v^2 = k^2$. Because of isotropy a particular
distance occurs at least in two directions perpendicular to each other,
often in four directions and occasionally in more than four. Therefore
the correlation estimate r(k) is often closer to zero than it ought to
be. Also, for many relatively small distances the correlation estimates
tend to zero, providing a smoother correlogram than might actually be
the case, whatever the total size of the lattice. Therefore we suggest
a third correlation estimator $r_3(k)$, which is better adapted to our
situation and is given by
\[
r_3(k) = \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} (x_{ij} - \bar{x})(x_{i+h,\,j+v} - \bar{x})
               \,/\, \delta (N-h)(N-v)}
              {\sum_{i=1}^{N}\sum_{j=1}^{N} (x_{ij} - \bar{x})^2 \,/\, N^2}
\qquad (3.2.8)
\]
The quantity $\delta$ is a factor to be put in the denominator of r(k) as well;
its value equals the number of directions in which a distance can occur.
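A minimal sketch of $r_3(k)$ on a square 0,1 lattice follows. Two assumptions are made that go beyond the text: a "direction" is taken to be an offset (h, v) with h > 0, or h = 0 and v > 0, so that each unordered pair of points is counted once, and the products are divided by the actual number of pairs found rather than by $\delta (N-h)(N-v)$ (the two agree whenever all directions for a distance have the same |h| and |v|). All names are hypothetical.

```python
def directions(k2, size):
    """Offsets (h, v) with h*h + v*v == k2, one per unordered direction."""
    out = []
    for h in range(0, size):
        for v in range(-size + 1, size):
            if h * h + v * v == k2 and (h > 0 or v > 0):
                out.append((h, v))
    return out


def r3(lattice, k2):
    """Correlation at squared lag k2 using the mean of the whole lattice."""
    size = len(lattice)
    values = [x for row in lattice for x in row]
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    total, count = 0.0, 0
    for h, v in directions(k2, size):
        for i in range(size - h):
            for j in range(max(0, -v), min(size, size - v)):
                total += (lattice[i][j] - mean) * (lattice[i + h][j + v] - mean)
                count += 1
    if count == 0 or variance == 0:
        raise ValueError("r3 is undefined for this lattice and lag")
    return (total / count) / variance


# Hypothetical 4 x 4 checkerboard: neighbours always differ.
board = [[(i + j) % 2 for j in range(4)] for i in range(4)]
print(r3(board, 1), r3(board, 2))
print(len(directions(1, 50)), len(directions(5, 50)), len(directions(25, 50)))
```

The last line illustrates the remark about directions: a distance occurs in two, often four, and occasionally more than four directions.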
If $r_3(k)$ is expressed in terms of 0,1 data and pairs are counted twice
as previously indicated, the equation is
\[
r_3(d) = \frac{a_{11} - 2a_{1.}\hat{p}_1 + \hat{p}_1^2\,a}{a\,\hat{p}_1\hat{p}_2}
\]
where $\hat{p}_1$ and $\hat{p}_2$ are distinct from $p_1$ and $p_2$, which are
defined as before. To demonstrate the difference between r(k), corrected by
the quantity $\delta$ for the number of directions in which k can occur, and
$r_3(k)$, both quantities have been computed for a 50 by 50 square array of
0,1 random numbers and for an aerial photograph observed at the points of
a 50 by 50 lattice. Part of the results are presented in Table 3.1.
The table clearly shows that $|r(k)| < |r_3(k)|$ and that r(k) approaches
zero for large lags in both cases.
At this point we should point out that the way the estimators $r_2$
and $r_3$ have been defined does not depend on the way in which the pairs
have been selected. In fact they can be computed from all points of
the lattice. For a given lattice all pairs of points at each possible
distance within the lattice can be collected when a computer is available.
Observations can be summarized in as many two by two tables as
there are different distances. The total number of pairs for a given
distance is determined by the size and shape of the lattice. In the
case of $r_3$, $\hat{p}_1$ is calculated using all points on the lattice.
Similarly, a certain number of single points can be selected
randomly and without replacement, and based on this set alone all
correlations at the occurring distances can be calculated. Although we
have used this random procedure to determine the number of points
by which the trend in the correlogram could still be reasonably estimated,
it is not an answer to the problem associated with systematic sampling.
It could be used in case sufficient supplementary data points can be
obtained or when the same lattice of random points is used systematically
from one aerial photograph to another. In case the number of pairs of
points at a particular distance apart is large, the difference between
Table 3.1  Comparison of r(k) and r_3(k) for a 50 x 50 square array
of 0,1 random numbers and for a 50 x 50 square lattice placed on
the center of aerial photograph 33

 k^2 = lag^2      k = lag       Random lattice        Aerial photograph
 in basic units   in miles      r(k)      r_3(k)      r(k)      r_3(k)

       1           .032        .00179     .00183     .52770     .53848
       2           .045       -.00060    -.00062     .38671     .40266
       4           .063       -.00385    -.00401     .30721     .32001
       5           .071        .00420     .00446     .25469     .27072
       8           .089       -.00663    -.00719     .18080     .19618
       9           .095       -.00147    -.00156     .19595     .20846
      10           .100        .00556     .00604     .16646     .18070
      13           .114        .00354     .00392     .12486     .13836
      16           .126       -.00538    -.00585     .12068     .13118
      17           .130       -.01237    -.01372     .09421     .10449
      40           .200       -.00350    -.00414    -.01605    -.01900
      65           .254        .00310     .00384    -.02775    -.03438
      90           .299        .00322     .00418    -.03114    -.04040
     117           .341        .00440     .00610    -.06591    -.09134
     193           .439        .00006     .00009    -.01267    -.01938
     261           .510        .00580     .00941     .02087     .03387
     338           .580        .00633     .00827     .02969     .05294
     410           .639       -.00064    -.00121     .02227     .03942
     490           .699       -.00154    -.00309     .03171     .06358
     577           .758        .00005     .00009     .00674     .01323
     653           .807       -.00118    -.00286     .00183     .00441
     733           .855        .00196     .00443    -.00480    -.01087
     810           .898        .00899     .02383    -.00796    -.02111
     901           .947        .00554     .01523    -.02897    -.07945
     980           .988        .00666     .02104    -.00828    -.02613
    1066          1.031       -.00267    -.00915    -.00560    -.01917
    1154          1.072       -.00500    -.01852     .00384     .01422
    1233          1.108        .00291     .01125     .01939     .07503
    1314          1.144       -.00654    -.02749     .01213     .05098
    1409          1.185       -.00188    -.00854    -.01921     .08731
    1492          1.219       -.00659    -.03269     .01395     .06920
    1586          1.257        .00131     .00699    -.00599    -.03187
    1666          1.288        .00121     .00694    -.01929    -.11088
    1754          1.322       -.00240    -.01482    -.01594    -.09839
    1849          1.357       -.00511    -.03653    -.00539    -.03851
    1933          1.388       -.00206    -.01741    -.00442    -.03735
    2025          1.420        .00075     .00189    -.00052    -.00587
    2106          1.449        .00028     .00336     .00873     .10642
    2186          1.476       -.00060    -.00524    -.00080    -.00704
    2276          1.506        .00344    -.03586    -.00213    -.02220
    2362          1.534        .00498     .05389     .00407     .04406
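$r_1$, $r_2$ and $r_3$ is negligible. However, it can be shown that
$r_1 \ge r_2$ as follows. From (3.2.4) and (3.2.6) it is seen that respectively
\[
r_1 = \frac{n\,n_{11} - n_{1.}\,n_{.1}}{\left[n_{1.}\,n_{.1}\,n_{.2}\,n_{2.}\right]^{1/2}}
\]
and
\[
r_2 = \frac{n\,n_{11} - \tfrac{1}{4}(n_{1.} + n_{.1})^2}
           {\tfrac{1}{4}(n_{1.} + n_{.1})(n_{2.} + n_{.2})}
\]
Since the geometric mean is not larger than the arithmetic mean, it
follows that
\[
\left[n_{1.}\,n_{.1}\,n_{.2}\,n_{2.}\right]^{1/2} \le \tfrac{1}{4}(n_{1.} + n_{.1})(n_{2.} + n_{.2})
\]
since
\[
(n_{1.}\,n_{.1})^{1/2} \le \tfrac{1}{2}(n_{1.} + n_{.1})
\quad\text{and}\quad
(n_{.2}\,n_{2.})^{1/2} \le \tfrac{1}{2}(n_{.2} + n_{2.}).
\]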
In case a set of points is sampled randomly and without replacement,
and maintaining a lexicographic ordering corresponding to the order in
which the points selected have entered the sample set, $r_1$ will be equal
to $r_2$ on the average, but otherwise utmost care should be taken. If
the pattern of land use is not isotropic, then for a fixed coordinate
system, if $r_1$ is computed from a lattice, $r_1$ will be different when
Table 3.2  Correlation estimates r_1, r_2 and r_3 of the distribution of forest for
different sampling intensities for aerial photograph 33 and for 0,1 random numbers

                          Aerial photograph 33                                          0,1 random numbers
 lag in       200 points              600 points              1000 points               1000 points
 miles     r_1     r_2     r_3     r_1     r_2     r_3     r_1     r_2     r_3       r_1     r_2     r_3

 .032     .4853   .4790   .5135   .5231   .5207   .5288   .5302   .5290   .5303    -.0128  -.0138  -.0133
 .200    -.0502  -.0608  -.0290  -.0529  -.0529  -.0529  -.0639  -.0640  -.0635     .0291   .0291   .0291
 .299    -.0377  -.0380  -.0350  -.0485  -.0519  -.0517  -.0442  -.0446  -.0429     .0084   .0064   .0064
 .381    -.2911  -.3028  -.2286  -.0220  -.0225  -.0201  -.0163  -.0165  -.0155     .0355   .0338   .0343
 .449     .1635   .1628   .1626  -.0416  -.0475  -.0475  -.0881  -.0902  -.0896     .0018   .0018   .0018
 .510    -.1105  -.1212  -.1079  -.0134  -.0137  -.0137   .0432   .0432   .0433     .0162   .0149   .0159
 .568     .0219  -.0026   .0831   .1378   .1378   .1393   .0257   .0254   .0267    -.0154  -.0160  -.0158
 .622     .5594   .5508   .6231   .0160   .0160   .0174   .0317   .0312   .0313     .0550   .0550   .0561
 .669    -.3514  -.3589  -.3454   .0591   .0591   .0599   .1076   .1075   .1073     .0089   .0088   .0089
 .714    -.4167  -.4286  -.4239  -.1268  -.1268  -.1266   .0173   .0170   .0177    -.0035  -.0036  -.0034
 .758     .3320   .2962   .2961   .0749   .0712   .0714   .0068   .0047   .0047     .0074   .0072   .0072
 .797     .1667   .1213   .1284  -.0006  -.0010  -.0005  -.0358  -.0359  -.0352     .0132   .0119   .0120
 .836     .0727   .0278   .0391   .0793   .0774   .0792   .0051   .0051   .0070     .0014   .0013   .0019
 .877    -.0550  -.0572  -.0226   .0237   .0232   .0233  -.0576  -.0576  -.0576     .0219   .0218   .0219
 .910    -.1688  -.1895  -.1252   .0254   .0253   .0262  -.0070  -.0073  -.0072     .0346   .0344   .0358
 .947    -.2035  -.2076  -.2055  -.0957  -.1002  -.0995  -.0677  -.0680  -.0672     .0435   .0424   .0425

 lag in       400 points              800 points              2500 points               2500 points
 miles     r_1     r_2     r_3     r_1     r_2     r_3     r_1     r_2     r_3       r_1     r_2     r_3

 .032     .5204   .5200   .5311   .5371   .5362   .5416   .5379   .5379   .5385     .0018   .0018   .0018
 .200    -.0089  -.0139  -.0139  -.0689  -.0690  -.0679  -.0198  -.0198  -.0190    -.0041  -.0041  -.0041
 .299    -.0925  -.0941  -.0941  -.0348  -.0375  -.0333  -.0412  -.0412  -.0404     .0042   .0041   .0042
 .381    -.0868  -.0872  -.0872  -.0320  -.0325  -.0318  -.0208  -.0208  -.0202     .0188   .0188   .0188
 .449    -.0544  -.0608  -.0606  -.0761  -.0774  -.0772  -.0692  -.0693  -.0690     .0144   .0143   .0144
 .510    -.0337  -.0396  -.0395   .0528   .0528   .0530   .0334   .0334   .0339     .0094   .0094   .0094
 .568     .2385   .2385   .2380   .0427   .0420   .0438   .0154   .0141   .0141    -.0161  -.0162  -.0162
 .622     .1265   .1184   .1286   .0506   .0497   .0497   .0710   .0710   .0712     .0052   .0052   .0052
 .669    -.0646  -.0646  -.0642   .0565   .0565   .0565   .0755   .0755   .0758    -.0034  -.0034  -.0033
 .714     .0561   .0545   .0591  -.0011  -.0027  -.0016   .0081   .0081   .0081    -.0032  -.0032  -.0031
 .758     .2095   .2073   .2077   .0302   .0295   .0300   .0134   .0132   .0132     .0002   .0001   .0001
 .797     .0952   .0861   .0862   .0062   .0058   .0077   .0115   .0114   .0115    -.0079  -.0080  -.0079
 .836     .1574   .1490   .1508   .0802   .0802   .0818  -.0057  -.0059  -.0055     .0009   .0008   .0008
 .877    -.0072  -.0073  -.0072  -.0632  -.0633  -.0632  -.0219  -.0224  -.0223    -.0159  -.0159  -.0158
 .910     .0198   .0190   .0264   .0074   .0074   .0074  -.0392  -.0394  -.0394    -.0081  -.0082  -.0082
 .947    -.0771  -.1053  -.0794  -.0704  -.0723  -.0710  -.0791  -.0794  -.0795     .0153   .0152   .0152
computed from the lattice again after rotation over an angle of 90°.
Correlograms computed from lattices of 50 points by 50 points, placed
different ordering of points within pairs as a consequence of the
rotation.
Photograph 196 shows a rather large contiguous forest area
in its southeast corner and there is not much patchiness of forest and
non forest areas.
Estimates of $r_1$, $r_2$ and $r_3$ are given in Table 3.2 for several
distances and sampling intensities, that is, numbers of points in the set
of random points. The corresponding numbers of pairs from which these
estimates are computed are given in Table 3.3. The estimates are for
an aerial photograph identified as 33 covering a small part of the
Lake Michie watershed near Durham, North Carolina. The large number of
possible correlations prevents a complete tabulation; therefore only
some correlations have been presented, at an interval of twenty lines
of the total computer output. Correlation estimates have also been
computed for 0,1 random numbers for two sampling intensities. Table 3.2
shows that the differences between the three estimators are slight
even for low sampling intensities. Occasionally an estimate will be
much different from the other two. In order to summarize the
differences between estimates in the table, the absolute differences,
weighted by the square root of n, the number of pairs on which the
estimates are based (given in Table 3.3), have been computed and
added. The results are given below for the photograph and the 0,1
random numbers.
 Population        Sampling     Σ|r_1-r_2|/√n    Σ|r_1-r_3|/√n    Σ|r_2-r_3|/√n
                   intensity

 photograph           200         0.046937         0.078608         0.080157
     "                400         0.007115         0.005403         0.004589
     "                600         0.001277         0.001755         0.001122
     "                800         0.000508         0.000653         0.000869
     "               1000         0.000253         0.000402         0.000365
     "               2500         0.000036         0.000095         0.000061
 random numbers      1000         0.000323         0.000312         0.000214
     "               2500         0.000011         0.000012         0.000009
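The summary statistic tabulated above can be sketched as follows, assuming that each absolute difference is divided by the square root of the number of pairs at that lag before the terms are added. The correlation values and pair counts in the example are hypothetical, since Table 3.3 is not reproduced here.

```python
import math


def summed_weighted_difference(r_a, r_b, n_pairs):
    """Sum over lags of |r_a - r_b| / sqrt(n), n = number of pairs at the lag."""
    return sum(abs(a - b) / math.sqrt(n)
               for a, b, n in zip(r_a, r_b, n_pairs))


# Hypothetical estimates at two lags, based on 100 and 400 pairs.
total = summed_weighted_difference([0.5, -0.1], [0.4, -0.2], [100, 400])
print(round(total, 6))
```

With this weighting a given difference counts less when it rests on many pairs, which is one reason the sums fall off so quickly with increasing sampling intensity.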
This summary shows the tendency for $r_1$ and $r_2$ to be close for this
particular photograph.
If the same procedure had been applied to
photograph 196 the results would have shown a much larger difference
between r
and r • If large lags had been
3
included the results might have shown larger differences between r
l
and r
2
than between r
2
3
and the other two, partially due to negativity which has to
necessarily occur for r
because of neglection of end effects.
3
The differences between correlation estimates at different
sampling intensities are relatively large. However, large correlations
like r(0.032) for the aerial photograph remain fairly stable. Also the
first group of negative correlations remains negative for all sampling
intensities except the lowest, which indicates that the negativity of
the estimates is a real feature of the photograph.
Similar observations have been made for other sampling intensities
as well as for some other aerial photographs of the watershed, but are
not presented here.
3.3 Variances of correlation estimators
Since periodicity is a debatable issue in forestry and negative
correlations are associated with periodicity, a means of testing any
occurrence of negative correlations will be of interest. For some
aerial photographs used in the study, negative correlations occurred
in the correlogram. To test in general if correlations are different
from zero some estimator of the variability of the correlation estimator
is needed. In this section we will derive approximate variances and
biases of $r_1$ and $r_2$. From the previous section it is clear that $r_1$ can
be considered a measure of association, corresponding to r in the
notation used by Yule (1912), between variables x and y observed at n
pairs of ordered points at distance d, where the outcomes of the n
pairs of observed variables are summarized in a 2 by 2 table. Finally,
we will evaluate a testing procedure assuming normality and taking
correlations computed from 0,1 random numbers as a standard.
In the 1-dimensional case the variance of estimators of autocovariances
depends on the covariances at all other lags (Jenkins and
Watts, 1968, p. 178). The same is true for the 2-dimensional case.
For purposes of comparing $r_1$ and $r_2$ and for testing whether correlations
are different from zero it will be sufficient to consider a realization
of the process, i.e., a population of values or in our case a photograph.
The approximate variance and bias of $r_1$ and $r_2$ are now derived assuming
that the region sampled is large and that pairs of points are sampled
independently.
The observations $n_{11}$, $n_{12}$, $n_{21}$ and $n_{22}$ can be considered as
realizations of a multinomial distribution with underlying probabilities
$p_{11}$, $p_{12}$, $p_{21}$ and $p_{22}$ as defined above. Thus a result from Fisher
(1967, p. 309; see also Kendall and Stuart, 1969, Vol. 1, p. 242) on
approximate variances can be used, appropriate to the theory of large
samples.
Let T be a function of the frequencies $x_i$ of a multinomially
distributed variable $x = (x_1, x_2, \ldots, x_k)$ with expected frequencies
$np = (np_1, np_2, \ldots, np_k)$ for a given sample of size n. Then using a
Taylor's series expansion
\[
T(x_1, \ldots, x_k) = T(np_1, \ldots, np_k)
  + \sum_{i=1}^{k} T_i'(np_1, \ldots, np_k)(x_i - np_i) + O(n^{-1}),
\]
where
\[
T_i'(np_1, \ldots, np_k) = \frac{\partial T}{\partial x_i}\Big|_{x=np}.
\]
Notice also that
\[
\frac{\partial T}{\partial n}\Big|_{x=np} = \sum_{i=1}^{k} T_i'\big|_{x=np}\; p_i .
\]
Omitting terms of order 1/n, $T(x) - T(np) = \sum_{i=1}^{k} T_i'(np)(x_i - np_i)$
approximately. It then follows that the approximate variance, V(T) say,
of T is by definition
\begin{align*}
V(T) &= E\Big[\sum_{i=1}^{k} T_i'(np)(x_i - np_i)\Big]^2 \\
     &= \sum_{i=1}^{k} T_i'^{\,2}\, V(x_i)
        + \sum_{i=1}^{k}\sum_{j \ne i} T_i'\,T_j'\,\mathrm{Cov}(x_i, x_j) \\
     &= \sum_{i=1}^{k} \Big(\frac{\partial T}{\partial x_i}\Big|_{np}\Big)^2 np_i(1 - p_i)
        - 2\sum_{i<j} \Big(\frac{\partial T}{\partial x_i}\frac{\partial T}{\partial x_j}\Big)\Big|_{np}\, np_i p_j \\
     &= n\Big\{\sum_{i=1}^{k} \Big(\frac{\partial T}{\partial x_i}\Big|_{np}\Big)^2 p_i
        - \Big[\sum_{i=1}^{k} \Big(\frac{\partial T}{\partial x_i}\Big|_{np}\Big)^2 p_i^2
        + 2\sum_{i<j} \Big(\frac{\partial T}{\partial x_i}\frac{\partial T}{\partial x_j}\Big)\Big|_{np}\, p_i p_j\Big]\Big\} \\
     &= n\Big\{\sum_{i=1}^{k} \Big(\frac{\partial T}{\partial x_i}\Big|_{np}\Big)^2 p_i
        - \Big(\frac{\partial T}{\partial n}\Big|_{np}\Big)^2\Big\}.
\end{align*}
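The large-sample variance formula can be checked numerically with a short sketch. It is not part of the original derivation: the partial derivatives are approximated by central differences (step `eps`, an arbitrary choice), and the two test statistics are examples chosen here, a simple proportion and $r_1$ under independence.

```python
import math


def fisher_variance(T, p, n, eps=1e-4):
    """n * { sum_i (dT/dx_i)^2 p_i - (sum_i (dT/dx_i) p_i)^2 } evaluated at x = np."""
    x0 = [n * pi for pi in p]
    grads = []
    for i in range(len(p)):
        up = list(x0)
        down = list(x0)
        up[i] += eps
        down[i] -= eps
        grads.append((T(up) - T(down)) / (2 * eps))   # central difference
    first = sum(g * g * pi for g, pi in zip(grads, p))
    dT_dn = sum(g * pi for g, pi in zip(grads, p))     # dT/dn at x = np
    return n * (first - dT_dn ** 2)


def proportion(x):
    return x[0] / sum(x)


def r1(x):
    """r1 with x = (n11, n12, n21, n22)."""
    n11, n12, n21, n22 = x
    denom = math.sqrt((n11 + n12) * (n21 + n22) * (n11 + n21) * (n12 + n22))
    return (n11 * n22 - n12 * n21) / denom


v_prop = fisher_variance(proportion, [0.3, 0.7], 100)
p1, p2 = 0.6, 0.4                      # hypothetical proportions, r(d) = 0
v_r1 = fisher_variance(r1, [p1 * p1, p1 * p2, p1 * p2, p2 * p2], 1000)
print(v_prop, v_r1)
```

The first value reproduces the familiar binomial variance p(1-p)/n; the second is close to 1/n, the value expected for $r_1$ when the true correlation is zero.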
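Substituting $r_1(d)$ for T and $n_{ij}$ for $x_i$ and writing the marginals
$n_{i.}$, $n_{.j}$ in terms of $n_{ij}$, the variance of $r_1(d)$ becomes
\[
V(r_1(d)) = n\Big\{\sum_{i}\sum_{j}
  \Big(\frac{\partial r_1(d)}{\partial n_{ij}}\Big|_{np}\Big)^2 p_{ij}
  - \Big(\frac{\partial r_1(d)}{\partial n}\Big|_{np}\Big)^2\Big\}
\]
where the partial derivatives are derived from
\[
\frac{\partial r_1(d)}{\partial n_{ij}} = \frac{\partial}{\partial n_{ij}}
\left[\frac{n_{11}n_{22} - n_{12}n_{21}}
{\{(n_{11}+n_{12})(n_{11}+n_{21})(n_{21}+n_{22})(n_{12}+n_{22})\}^{1/2}}\right].
\]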
Evaluating the partials at np, denoting the denominator of $r_1(d)$ by D
and the numerator by N, and using $p_{12} = p_{21}$, we obtain:
\begin{align*}
\Big(\frac{\partial r_1(d)}{\partial n_{11}}\Big)^2\Big|_{np}
 &= \frac{n_{22}^2 D^4 + \tfrac{1}{4}N^2 n_{.2}^2 n_{2.}^2 (n_{.1} + n_{1.})^2
          - n_{22} D^2 N n_{.2} n_{2.} (n_{.1} + n_{1.})}{D^6}\Big|_{np} \\
 &= \frac{p_{22}^2 p_1^4 p_2^4 + (p_{11}p_{22} - p_{12}^2)^2 p_2^4 p_1^2
          - 2 p_{22} p_1^3 p_2^4 (p_{11}p_{22} - p_{12}^2)}{n^2 p_1^6 p_2^6}
\end{align*}
and similarly
and $\left(\partial r_1(d)/\partial n_{11}\right)^2\big|_{np}$ equals
$\left(\partial r_1(d)/\partial n_{22}\right)^2\big|_{np}$ after substitution of $p_{22}$ by $p_{11}$,
\[
\Big(\frac{\partial r_1(d)}{\partial n}\Big)^2\Big|_{np} = 0
\]
since n does not occur explicitly in the formula for $r_1(d)$.
Assembling these parts it follows that
which can be reduced to
$p_{11}$, $p_{22}$ and $p_{12}$ in (3.3.1) the variance equation reduces to
If r(d) = 0, it is necessary that $p_{11} = p_1^2$ and $p_{11}p_{22} = p_{12}^2$ by
definition; hence
\[
V(r_1(d)) = \frac{1}{n}
\]
For large samples $V(r_1(d))$ can be estimated after substitution of the
parameters $p_{ij}$ by $n_{ij}/n$, of $p_1$ by $(n_{.1} + n_{1.})/2n$, of $p_2$ by
$(n_{.2} + n_{2.})/2n$ and of $p_{12}$ by $(n_{12} + n_{21})/2n$. To test whether
correlations are different from zero we could use a $\chi^2$ test, since
$n(r_1(d))^2$ is asymptotically distributed as a $\chi^2$ with one degree of
freedom (Kendall and Stuart, 1967, Vol. 2, p. 549).
We have used this test without a continuity correction factor.
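The test can be sketched in a few lines. The critical value 3.841 is the upper 5% point of the chi-square distribution with one degree of freedom; the function name and the example values are hypothetical, and no continuity correction is applied, in line with the text.

```python
CHI2_1DF_5PCT = 3.841   # upper 5% point of chi-square with 1 degree of freedom


def differs_from_zero(r, n, critical=CHI2_1DF_5PCT):
    """True when n * r^2 exceeds the critical value (no continuity correction)."""
    return n * r * r > critical


# Hypothetical estimates, each based on n = 1000 pairs.
print(differs_from_zero(0.10, 1000))   # n r^2 = 10.0
print(differs_from_zero(0.05, 1000))   # n r^2 = 2.5
```

Equivalently, at the 5% level an estimate based on n pairs is declared different from zero when its absolute value exceeds about 1.96 divided by the square root of n.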
If we consider n pairs of points, chosen randomly, and observe
the presence or absence of forest at each point, then rl(d) is a
measure of association between two points at distance d apart with
respect to the attribute forest.
Using Yule's formula for the sampling
variance (Yule, 1912) we obtain the same result as given by formula
To evaluate the bias of $r_1(d)$ approximately, we use
\[
E\,T(x) = T(np) + \tfrac{1}{2}\sum_{i=1}^{k} T_i''(np)\,\mathrm{var}(x_i) + o(n^{-1}).
\]
with a similar expression for $\partial^2 T(x)/\partial n_{21}^2$. Evaluating the second partial
derivatives at np it follows that the approximated bias equals
which after substitution of $p_{11}$, $p_{22}$ and $p_{12}$ as shown before, reduces to
(3.3.4)
bias r1(a)
= ~~~~2[
r(d)2[ I
+ np P 4
.1 2
If r(a)
= 0,
-
t
3
3
-P.1 - P2 - 2
t
PIP2+5P.12P23J +
3 2 2
r(a)3
PI P2 J n
then the approximated bias is zero.
In order to evaluate the variance and bias of $r_2(d)$ it is necessary
to rewrite $r_2(d)$ in terms of the $n_{ij}$ again. We can then use Fisher's
result. Thus writing
\[
T(x) = r_2(d) = \frac{4n\,n_{11} - (n + n_{11} - n_{22})^2}{n^2 - (n_{11} - n_{22})^2}
\]
and taking partial derivatives with respect to $n_{11}$, $n_{22}$ and n we obtain:
Squaring the partial derivatives evaluated at np, we obtain
which can be reduced to
Since equation (3.3.5) equals equation (3.3.1), it follows that
Deriving the approximated bias of $r_2(d)$ in the same way as we did
before for $r_1(d)$, we obtain:
Multiplying these equations with respectively var($n_{11}$) and var($n_{22}$)
and adding, the equation for the approximated bias reduces to:
which, after substitution of $p_{11}$, $p_{22}$ and $p_{12}$ as before, reduces to
422
4
rl)\ P - P P +p
( )2
I 2
2 .)+ r d (2 2_
+2 2)+
+ ~(l
2n
p p
2n
PI PIP2 P2
I 2
yo (
If r(d)
= 0, then bias r 2 (d) =
- 3PIP2
2n
•
For large samples, to which these results apply, the differences
between variances and bias of $r_1$ and $r_2$ are negligible. Where the
correlations are very small, differences will be within rounding error.
If a 2-dimensional random process is such that $\rho(d) = 0$ for all
distances, and if this process is sampled by a square or rectangular
lattice, then the ratio of the variances of r(u) and r(v) will be on
the average inversely proportional to the ratio of the frequencies of
the distances u and v within the lattice.
A feature of the variance is that negative correlation estimates
have a smaller variance than corresponding positive correlation estimates,
except when $p_1 = p_2$, as can be seen from formula (3.3.2). Since the
minimum of r is not always -1, but its maximum, r = 1, can yet be
attained, the distribution of r in 2 x 2 tables is skew and thus
explains the smaller variance for negative correlations.
In order to test whether correlations $r_1(d)$ differ from zero for a
given realization of a random process we will use the $\chi^2$ test as
indicated before. In the case $\rho(d) = 0$, i.e., r(d) equals zero on the
average, neighboring points on the correlogram are uncorrelated (Jenkins
and Watts, 1968, p. 187). From the previous observations it is not
unreasonable to assume that the $\chi^2$ test applies to $r_2$ and $r_3$ as well.
The number of correlations found significant at the 5% significance
level should be about 5% for all three correlations. If a 2-valued
random lattice can be used as a standard and indeed 5% of the
correlations are significant, it could be taken as an encouraging sign
that correlations computed for aerial photographs could be tested
similarly, where we are particularly interested in real, negative
correlations. In Table 3.4 we have recorded the number of significant
correlations for sets of 20 correlations going from the first lag up
to the lag which is equivalent to a distance of about 1.25 miles in a
50 by 50 lattice on an aerial photograph of scale 1:20,000. The
correlations $r_1$, $r_2$ and $r_3$ have been computed from a set of 1000 points
selected randomly out of a square array of 50 by 50 random numbers
consisting of zeros and ones and from the whole square array. The
distance in mileage is artificial. From the table it is seen that in
case of 1000 points, the test is fairly good for the first 280
correlations but then tapers off. Comparing these results with those of
Table 3.4  Number of correlations found significant at the 5% level
using $\chi^2$ in intervals containing 20 correlations. Correlations have
been computed for a set of 1000 points selected randomly and without
replacement from a 50 x 50 array of 0,1 random numbers as well as for
the whole array.

 Cumulative total    Largest lag in miles      Number of correlations found significant
 of correlations     of interval containing    in an interval of 20 correlations
 at largest lag      20 correlations           Set of 1000 points     Whole array

        20                .199638                     1                    0
        40                .299458                     1                    3
        60                .381409                     0                    2
        80                .448631                     3                    2
       100                .509958                     1                    3
       120                .568181                     0                    0
       140                .621770                     1                    1
       160                .668863                     1                    0
       180                .714248                     1                    0
       200                .758232                     2                    0
       220                .796680                     0                    3
       240                .835744                     1                    0
       260                .877046                     1 1)                 0
       280                .910491                     0                    1
       300                .947494                     0                    1
       320                .980568                     0                    0
       340               1.014528                     0                    3
       360               1.045484                     1 2)                 1
       380               1.078786                     0                    1
       400               1.108847                     0                    1
       420               1.138551                     0                    1
       440               1.169632                     0                    2
       460               1.199492                     0                    1
       480               1.229438                     3                    0
       495               1.249933                     1                    0

 1) In case of r_3 one correlation was found significant
 2) In case of r_2 and r_3 one correlation was found significant
testing correlations calculated from the whole square array, too many
correlations are found significant. This is due to finding too many
correlations significant for the first hundred correlations, since for
260 correlations or more the significance level remains quite stable
at 5+%. The congruence of $r_1$, $r_2$ and $r_3$ of Table 3.4 was to be
expected from the results given in Table 3.2.
3.4 Estimation of trend in correlograms of the
distribution of forest on aerial photographs
The correlogram for any finite data set will show deviations from
a general trend, if such a trend exists at all. Even in the case of a
continuous 1- or 2-dimensional finite record the correlations have in
practice to be calculated from discrete observations, taken at short
intervals. Depending on the size of the interval the continuous record
is more or less well approximated. The longer the record, the larger
will be the lags at which serious deviations will occur (see Section 3).
If the finite record is considered as a realization of a random process,
then as the record becomes larger the correlogram will tend to the
expected correlogram of the process provided ergodicity holds.
In order to determine the trend of a correlogram, we need to be
able to distinguish genuine features such as negativity from sampling
variation. For our own purposes it is also desirable to compute
correlograms taking end effects into consideration. In case covariances
between random variables are known to be zero, the variability of the
correlogram can be assessed at least graphically by simulated sampling
as has been shown in Figure 3.6. Otherwise, it becomes difficult to
estimate the variability about a certain trend. To overcome this
problem a solution might be found in assuming a certain model to hold
for the correlation function or trend based on the estimated correlogram.
A sufficient number of observations over a sufficiently large area is
needed to keep the trend from being blurred too much. The model is
fitted and the sum of the squared deviations will be used as a measure
of variability, i.e., instead of determining the error for an estimate
at a particular lag, the errors are lumped and given for the
correlogram as a whole. Using alternative models depending on the strength
of the occurrence of certain features in the correlogram, such as
negativity of the correlations over some intervals or damped oscillatory
movements, the existence of such features can be assessed. This last
approach we will attempt to apply in case of the correlograms obtained
for aerial photographs.
We first describe now the construction of a correlogram as for
example shown in Figure 3.1. Given an aerial photograph of size 9 inches
by 9 inches and scale 1:20,000, a square lattice is placed on the center
of the photograph with its sides parallel to the sides of the photograph.
The size of the lattice is 5 inches by 5 inches and contains 50 by 50
points systematically arranged in a square pattern, which means that
the distance between adjacent points is 0.1 inch in both directions
and is equivalent to about 0.0315656 miles ground distance. Such a
lattice is fundamental to this study of the feasibility of estimating
the trend of a correlogram of a stationary, isotropic process; but any
other pattern or other dimensions are possible.
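The conversion from lattice units to ground distance can be verified with a few lines. The only facts used are the scale 1:20,000, the spacing of 0.1 inch and the 63,360 inches in a statute mile; the function names are hypothetical.

```python
import math

INCHES_PER_MILE = 63360


def ground_miles(photo_inches, scale_denominator=20000):
    """Ground distance in miles for a distance measured on the photograph."""
    return photo_inches * scale_denominator / INCHES_PER_MILE


def lag_in_miles(k_squared, spacing_inches=0.1, scale_denominator=20000):
    """Ground distance for a lag whose squared length is k_squared lattice units."""
    return math.sqrt(k_squared) * ground_miles(spacing_inches, scale_denominator)


print(ground_miles(0.1))    # spacing between adjacent lattice points
print(lag_in_miles(40))     # lag with k^2 = 40 basic units
```

The second value corresponds to the lag listed as .200 miles in Table 3.1 for $k^2 = 40$.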
Computations for square lattices, however, are most convenient and,
as already mentioned in Chapter 2, the difference in precision is slight
compared with triangular lattices. The size of the square was chosen
such that neither lattices on photographs of adjacent flightlines nor
lattices on alternating photographs of the same flightline would
overlap.
In order to obtain a reasonable approximation of the land use
pattern a rather dense lattice may be required. In our case, where the
aerial photographs covered part of the Lake Michie watershed, this
approximation is certainly not good for roads, residential areas and
ponds; even for forest and farm lands the approximation is only fair.
However, a denser lattice would have created intractability due to the
limitations of the computing facilities. A smaller but denser lattice
would have more severely limited the workable stretch of the correlogram,
which has been taken to be the interval from zero to 1.25 miles for
the 5 inches by 5 inches square lattice. By using corresponding graph
paper the lattice has been physically transferred to the photograph
by pinpricks. Instead of pinpricking, a plastic overlay could have been
used. At each pinprick the land use class was observed. For the
purpose of this study only forested and non forested areas have been
recognized. Correlations were then calculated for each possible
distance occurring on the lattice without taking directions into account.
The computations consist basically of formation of as many 2 x 2 tables
as there are distances as explained in Section 2, no matter whether $r_1$,
$r_2$ or $r_3$ is used. Expected and observed biases and variances were
calculated using the results of Section 2, where some handwaving has
been made with respect to the independent selection of points and pairs
of points.
The correlograms presented here have been mechanically plotted.
The smallest distance resulting in two different points on the abscissa
is 0.0173 mile. Similarly the smallest difference between two
correlation values resulting in two separate points plotted is 0.02. The
result is that points are overprinted or, seemingly, several correlation
values have been computed for the same lag. This is the reason that a
band of points is observed on the correlogram plots instead of a single
string. The zeros printed at each interval on the correlogram are
averages of the correlations occurring in the intervals of 0.0173 mile
length. Since each interval may contain a different number of
correlations, the averages are not based on the same number. The number
of correlations per interval increases with the distance and tapers
off when the distance becomes larger than the side of the square lattice.
The plot of zeros may be helpful in determining the model for the trend
of the correlogram, especially when the number of points in the data
set is much smaller than 2500.
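The interval averages plotted as zeros can be sketched as follows. This is an illustration only: the assumption is that intervals are half-open, start at zero and have a fixed width (0.0173 mile on the plots), and the lags and correlations in the example are hypothetical.

```python
def interval_averages(lags, correlations, width=0.0173):
    """Average the correlations falling in each interval [b*width, (b+1)*width).

    Returns {interval index: (average, number of correlations)}.
    """
    sums = {}
    for lag, r in zip(lags, correlations):
        b = int(lag // width)
        total, count = sums.get(b, (0.0, 0))
        sums[b] = (total + r, count + 1)
    return {b: (total / count, count) for b, (total, count) in sorted(sums.items())}


# Hypothetical lags (miles) and correlations, with a coarse width for clarity.
averages = interval_averages([0.01, 0.05, 0.15], [0.4, 0.2, -0.1], width=0.1)
print(averages)
```

The count returned with each average makes explicit that the averages are not all based on the same number of correlations.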
To investigate the feasibility of estimating the trend of a
correlogram based on a minimum number of points, several aerial
photographs have been observed. The schematic location of these photographs
on their flightlines is as follows,
where theoretically adjacent photographs on the same line have 60%
overlap and adjacent photographs on adjacent lines have 30% overlap.
The photographs are always identified by the numbers without the
flightline code. The correlograms of these photographs have been estimated
as described in the previous paragraphs and are presented in Figures 3.7
through 3.17. The correlations in these figures are based on full
lattices of 2500 points and are computed by using the formula for $r_1$
(see Section 2).
It is not easy to provide a physical background for any model
which is assumed to fit the trend in the correlogram if the process is
not known. However, any model fitted on the basis of sample data should
reveal something about the nature of the process, such as the occurrence
of forest in our case. The model should not just be accepted because
it fits the data well over a given range. Our vision on the occurrence
of forest for this area may be obscured, but it seems reasonable to
assume that correlations are zero for distances larger than about two
miles. If any oscillations occur, they are damped and happen only over
short distances in some parts of the area. The more contiguous the
Figure 3.7  r1-correlogram of the distribution of forest on photograph 85.
The percentage forest is 53%. Horizontal axis: distance in miles.
Figure 3.8  r1-correlogram of the distribution of forest on photograph 86.
The percentage forest is 63%. Horizontal axis: distance in miles.
Figure 3.9  r1-correlogram of the distribution of forest on photograph 87.
The percentage forest is 71%. Horizontal axis: distance in miles.
Figure 3.10  r1-correlogram of the distribution of forest on photograph 32.
The percentage forest is 62%. Horizontal axis: distance in miles.
Figure 3.11  r1-correlogram of the distribution of forest on photograph 33.
Figure 3.12  r1-correlogram of the distribution of forest on photograph 34.
The percentage forest is 55%. Horizontal axis: distance in miles.
Figure 3.13  r1-correlogram of the distribution of forest on photograph 5.
The percentage forest is 56%. Horizontal axis: distance in miles.
Figure 3.14  r1-correlogram of the distribution of forest on photograph 200.
The percentage forest is 56%. Horizontal axis: distance in miles.
Figure 3.15  r1-correlogram of the distribution of forest on photograph 197.
The percentage of forest is 47%. Horizontal axis: distance in miles.
Figure 3.16  r1-correlogram of the distribution of forest on photograph 196.
The percentage forest is 77%. Horizontal axis: distance in miles.
Figure 3.17  r1-correlogram of the distribution of forest on photograph 196
after rotation of the lattices of Figure 3.16 over an angle of 90°.
The percentage forest is 77%. Horizontal axis: distance in miles.
land use the more points will fall in the same land use class and the
higher the correlations will be at the smaller lags. An exponential
trend or combinations of exponentials may be expected. Except for
correlations at the very small lags, the correlograms drop quickly to
levels between -0.2 and +0.2 where it seems that many correlations are
very small.
A model, which could describe the characteristics mentioned in the
previous paragraph is
which we have coded model 3 in Table 3.5.
The first term within brackets takes care of the large attenuation
in the correlogram, whereas the second term is necessary in order to
preserve small values at larger lags. Therefore, h1 is large and h2 is
small, but both should be greater than or equal to zero to keep the
model realistic for large distances. The parameter h3 gives a larger
flexibility in fitting the model. The cosine factor accounts for
periodicity. A flaw in the model is that it does not account for phase
difference in case periodicity occurs, but since the estimation
procedure is non linear, introduction of a fifth parameter may not
improve convergence at reasonable parameter estimates. It also does
not account for combinations of two or more periods. In our case an
obvious difficulty with any estimation procedure will be that the
correlation values are close to zero for almost all distances.
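The verbal description of model 3 can be illustrated with a short Python sketch. The functional form used here is an assumption chosen only to match that description (two exponential terms within brackets, a weight h3, and a cosine factor in h4); the weights (1 - h3, h3), which make the value at d = 0 equal to one, and all function names are hypothetical and are not taken from the text.

```python
import math

def model3(d, h1, h2, h3, h4):
    """ASSUMED form: [(1 - h3) exp(-h1 d) + h3 exp(-h2 d)] cos(h4 d).

    h1 governs the fast initial attenuation, h2 the slow tail,
    h3 the weight of the slow term and h4 the periodicity.
    """
    bracket = (1.0 - h3) * math.exp(-h1 * d) + h3 * math.exp(-h2 * d)
    return bracket * math.cos(h4 * d)

def model2(d, h1, h2, h3):
    # Model 3 without the cosine factor (h4 = 0), as stated in the text.
    return model3(d, h1, h2, h3, 0.0)

def model1(d, h1):
    # Under the assumed form, h3 = 0 leaves a single exponential.
    return model2(d, h1, 0.0, 0.0)

# Illustrative evaluation with made-up parameter values.
example = [model3(d / 10.0, 15.0, 4.0, 0.1, 5.0) for d in range(0, 13)]
```

Whatever the exact form, the nesting is the useful point: setting h4 = 0 removes the periodic component, so the simpler models are special cases of the fuller one.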
Table 3.5  Parameter estimates and sum of squared deviations of
non linear models fitted to r1 correlograms of the distribution of
forest on some aerial photographs of the Lake Michie watershed.
Photograph | Figure | Model fitted 1) | Parameter estimates h1, h2, h3, h4 | φ | Percentage forest
85
3.7
1
2
3
16.29
·12.68 4.08 -0.157
15.00 15.00 0.001
0.666
0.6013 )
4.27 0.664
53
86
3.8
1
2
3
4
16.06
0.610
16.68 0.91 0.011
0.5963 )
16.34 -0.52 -0.018 12.79 0.735
-2)
-
63
1
2
3
16.46
14.20
87
34
33
3.9
3.12
3.11
-
-
-
0.051
7.96 0.546
1
2
3
13.23
11.32 -0.17 -0.290
1.517
1.019
62
1
2
3
19.37
0.50
-
16.80
-
0.841
-
71
-
-
-
-
0.070
- 9.52 0.795
1.034
55
·0.265
1.340
1.154
55
11.12
-
!
32
3.10
!
i
5
200
197
196
L)
2)
3)
3.13
3.14
3.15
3.16
I
I
1
2
3
4
21 5.07
14. 3
26.49
13:59
I-
-
14.90
12.99
2.01
1
2
3
5
16.03
19.15
3.42 10:077
15.35
0:56
-
- ,tO~057
-
6.06 0.850
~
1
2
3
-
-
r O•44O
,
-
i
0.876
0.796
56
0.454
0.408
56
-
-
0:030 b-3:02 0.362
~
15.95
25.89 l1.95 •' 0.546
0.449
0.442
1
2
3
4
16.40
-
-
0.014
5.51 0.396
1
2
9.76
18.07
1.43
0.145
1.658
0.608
-
-
47
-
77
1) For explanation of the codes see text.
2) A dash indicates that the program failed to converge due to either
   blowing up or very slow convergence.
3) This value indicates that a better set of parameter estimates is
   available. These are the estimates given for model 2, which is
   model 3 with h4 = 0. See text also.
This initial model can easily be reduced to simpler models. If, for
instance, periodicity is heavily damped and we are only certain of
negative correlations at the first dip of the correlogram, we can write
for all practical purposes
p(d)
If negative correlations are not real, the process may have an
expected positive convex correlogram, represented by
(3.4.4)
p(d)
which we have coded model 2 in Table 3.5. Or, a simple exponential
model may be used,
p(d)
which we have called model 1 in Table 3.5.
To fit models of the kind described above, a non linear estimation
program has been used. The program is a rewrite and modification of
SHARE SDA 3094.01 (NLIN) by D. W. Marquardt (1966). For a fitted model,
parameter estimates and the sum of the squared deviations of the predicted
value from the observed value, denoted by φ, are presented in Table 3.5
for the correlograms of Figures 3.7 through 3.16. The models are
fitted based on 495 data points covering the range 0.03 miles to 1.25
miles.
The correlation estimates are computed using the formula of r1.
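The fitting step can be illustrated with a minimal Python sketch. It assumes a one-parameter exponential model rho(d) = exp(-h1 d), taken here as a stand-in for the simple exponential of model 1, and uses a damped Gauss-Newton iteration in the spirit of Marquardt's method. It is not the NLIN program; the function name, the starting value and the damping schedule are hypothetical, and the data are synthetic.

```python
import math

def fit_exponential(dists, corrs, h0=10.0, lam=1e-3, max_iter=200, tol=1e-12):
    """Fit rho(d) = exp(-h d) by minimising phi = sum of squared deviations."""
    def sse(h):
        return sum((r - math.exp(-h * d)) ** 2 for d, r in zip(dists, corrs))

    h, phi = h0, sse(h0)
    for _ in range(max_iter):
        jtj = 0.0  # J'J for the single parameter
        jtr = 0.0  # J'(observed - predicted)
        for d, r in zip(dists, corrs):
            f = math.exp(-h * d)
            j = -d * f              # derivative of the model with respect to h
            jtj += j * j
            jtr += j * (r - f)
        if jtj == 0.0:
            break
        step = jtr / (jtj * (1.0 + lam))   # Marquardt-style damped step
        h_new = h + step
        phi_new = sse(h_new)
        if phi_new < phi:
            converged = (phi - phi_new) < tol
            h, phi = h_new, phi_new
            lam = max(lam / 10.0, 1e-12)   # trust the step more next time
            if converged:
                break
        else:
            lam *= 10.0                    # reject the step and damp harder
            if lam > 1e12:
                break
    return h, phi

# Synthetic correlogram: 495 lags between 0.03 and 1.25 miles, true h = 16.
dists = [0.03 + i * (1.22 / 494) for i in range(495)]
corrs = [math.exp(-16.0 * d) for d in dists]
h_hat, phi_hat = fit_exponential(dists, corrs, h0=5.0)
```

With noise-free synthetic data the estimate returns to the value used to generate it; with real correlograms close to zero at most lags the error surface is much flatter, which is the difficulty the text describes.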
It is seen that for the simplest model of one exponential, indicated
by model 1 in Table 3.5, convergence is always attained by the
non linear fitting process and parameter estimates obtained. This is
not always true for the other two models. Where model 2 converged it
showed improvement in φ. However, for the correlogram of photograph 34
the parameter estimates are unrealistic, since for lags outside the
observed range correlations will be less than -1. The performance of
model 3, which is model 2 multiplied by a cosine factor, is disappointing.
Parameter estimates under convergence could be obtained for the
correlograms of photographs 33, 85, 86 and 87, but the fit is only
moderately good for photographs 33 and 87 in that φ is noticeably
reduced. In case of photographs 85 and 86 the parameter estimates
indicate that models given respectively in equations (3.4.3) and (3.4.4)
are more appropriate. These models are in fact model 3 with h4 = 0.
We could not obtain convergence for model 3 for correlograms of
photographs 5, 32, 34, 197 and 200. This model is not at all appropriate
for the correlogram of photograph 196. In search of models which could
fit the data better the following models have been proposed, indicated
in Table 3.5 as model 4 and 5 respectively:
(3.4.6)
p(d)
and
p(d)
The first of these last two models is clearly inadequate in that
the correlations never reduce to zero for large d. However, it fits
correlograms of photographs 32 and 197 with some reduction in φ. The
last model fits only the correlogram of photograph 200; attempts to fit
it to any other correlogram failed.
Fitting non linear models is not always to be trusted since
convergence depends on good initial parameter estimates. When convergence
occurs we may not be certain that the sum of squared deviations is a
global minimum, i.e. the program could converge to a local minimum.
When convergence does not occur, it may be that convergent parameter
estimates could be obtained with other initial parameter sets. However,
time limitations prevent attempting many different initial parameter
guesses, especially when the error surface is ill defined, causing
infinitely slow convergence. Some estimates given in Table 3.5 may be
best in the sense that we have used the right model and, most important
of all, a global minimum has been found. But we will never be sure of
that. We can be more confident, however, if several different initial
guesses provide the same estimates. We have obtained convergence for
several different initial parameter sets which resulted in the same
estimates, but there are also cases where different estimates have been
obtained. For instance, in the cases of photographs 85 and 86 we
obtained convergence for model 3 under two different initial parameter
sets, since model 2 is equal to model 3 when h4 = 0. The results
reported for model 2 are better since φ is smaller than that reported
against model 3. The results reported for model 3 are due to
convergence to a local minimum. The parameter estimates reported for
model 2 for photograph 86, however, are unsatisfactory since they do
not reflect the dip in the correlogram.
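The advice about initial guesses can be sketched as a multi-start procedure: repeat a local fit from several starting values and compare both the estimates and the sums of squared deviations. The Python example below is illustrative only. It assumes a hypothetical one-parameter problem (the frequency h4 of a damped cosine with known decay h1), synthetic data, and a simple step-halving local search rather than the program used in the study.

```python
import math

def phi(h4, dists, corrs, h1=1.0):
    """Sum of squared deviations for the assumed model exp(-h1 d) cos(h4 d)."""
    return sum((r - math.exp(-h1 * d) * math.cos(h4 * d)) ** 2
               for d, r in zip(dists, corrs))

def local_search(f, x0, step=0.5, tol=1e-8, max_iter=10000):
    """Move to a neighbour while it lowers f; otherwise halve the step."""
    x, fx = x0, f(x0)
    for _ in range(max_iter):
        if step <= tol:
            break
        for cand in (x + step, x - step):
            fc = f(cand)
            if fc < fx:
                x, fx = cand, fc
                break
        else:
            step /= 2.0  # no improvement in either direction
    return x, fx

def multistart(f, starts):
    """Run the local search from every start; return the best and all results."""
    results = [local_search(f, x0) for x0 in starts]
    best = min(results, key=lambda res: res[1])
    return best, results

# Synthetic data generated with h1 = 1 and h4 = 8 (made-up values).
dists = [0.03 + 0.0025 * i for i in range(495)]
corrs = [math.exp(-1.0 * d) * math.cos(8.0 * d) for d in dists]
best, results = multistart(lambda h: phi(h, dists, corrs),
                           [1.0, 4.0, 8.5, 12.0, 20.0])
```

Starts that end at different estimates with different values of φ point to local minima; agreement among several starts gives more confidence, though it does not prove that a global minimum has been found.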
The values of φ tend to be about the same order of magnitude for
different models within the same photograph, but there is more
variability from photograph to photograph. The rather slight differences
between φ values indicate that it does not matter much which model is
fitted due to the minute values of the correlations. On the other hand
one can say that an ever so slight undulation is observable in the
trend of the correlograms, at least over part of the area as indicated
by the differences in φ values of the models fitted to the correlograms
of photographs 32, 33, 85, 87, 197 and 200. The percentage forest does
not seem to influence the type of trend. The percentages of forest for
photographs 87 and 196 are both above seventy, but model 2 with h3 < 0
fits the correlogram of photograph 87 best where model 2 with h3 > 0
fits best for photograph 196. A similar pair is found in photographs 5
and 33. If the models are real and different for each photograph like
the percentage forest, then stationarity should not be assumed.
Another disturbing factor is that a model fits one of the correlograms
of two overlapping photographs but not both. This situation occurs in
case of model 3 fitted to correlograms of photographs 32, 33 and 34
and somewhat differently in case of correlograms of photographs 196
and 197.
To find any physical explanation for negative correlations or some
damped oscillatory movement over part of the area we have to go back to
the aerial photographs. Initially photograph 33 has been selected
since it is fairly patchy, containing some important roads, a number of
farm ponds, scattered residential areas and forest intermingled with
farm lands. The percentages of these five classes, estimated by a
5 inches by 5 inches lattice with 100 points per square inch, have been
given in the beginning of this chapter.
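The correlograms in this section are computed from 0,1 observations at lattice points. As a rough Python illustration of what one point of such a correlogram measures, the sketch below computes a generic product-moment correlation between lattice points separated by a fixed lag. It is a hypothetical estimator written for illustration, not necessarily the estimator r1 defined earlier in the chapter; the function name and the rule that every symbol other than the target counts as 0 are assumptions.

```python
def lag_correlation(lattice, dx, dy, target="1"):
    """Correlation of the indicator 'cell == target' at lag (dx, dy).

    lattice is a list of equal-length strings, one per lattice row.
    """
    rows, cols = len(lattice), len(lattice[0])
    values = [[1.0 if ch == target else 0.0 for ch in row] for row in lattice]
    p = sum(sum(row) for row in values) / (rows * cols)  # proportion of target
    variance = p * (1.0 - p)
    if variance == 0.0:
        raise ValueError("lattice is all target or all non-target")
    total, pairs = 0.0, 0
    for i in range(rows):
        for j in range(cols):
            i2, j2 = i + dy, j + dx
            if 0 <= i2 < rows and 0 <= j2 < cols:
                total += (values[i][j] - p) * (values[i2][j2] - p)
                pairs += 1
    if pairs == 0:
        raise ValueError("lag is larger than the lattice")
    return total / pairs / variance

# Tiny made-up lattice: a checkerboard alternates at lag 1 and repeats at lag 2.
board = ["0101", "1010", "0101", "1010"]
r_lag1 = lag_correlation(board, 1, 0)
r_lag2 = lag_correlation(board, 2, 0)
```

A strictly periodic pattern such as the checkerboard gives correlations that alternate in sign with the lag; an irregular patchy pattern gives the damped, much smaller undulations discussed in the text.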
The observations on the lattice are pictured in Figure 3.18 in
which the roads are indicated by the letter r, forest by 1 and everything else by a 0.
The trend of the correlogram with respect to the
attribute forest is rather smooth (see Figure 3.11) and shows small
damped oscillations for the first part of it.
The occurrence of
negative dips, albeit very small, was thought to be interesting in the
light of reports in the literature of only smooth, convex but positive,
correlograms and because of its implication with respect to systematic
sampling.
Although almost all correlograms show some undulation, the
periods within a correlogram as well as between correlograms differ
widely as is reflected in the non linear estimates.
Even correlograms
of the photographs surrounding photograph 33 are different.
Where
non linear confidence intervals proved to be adequate, the estimated
00000000000010111J..001 JJ.lll1111J.l111111111111111100
001100000rOOOOllll01111 0111111111111J1111111111111
00000000000000111111110000011111111111111111111111
00000001000000111111100000001111111111111111111111
00000111110rO00111111000000000011l11l1111111111111
001111111000100101111000000000011011111J1111111111
001111111000rO01000l1000000000l100l1111111J0011110
0l11111000111r1000011000000011JI001111111000111110
OllllllOOOOOOlOOOOOllOOOCl11111l1011J0110000111111
01111110101111rO000010001111l1110011111100000r1111
11111111110011100000101011101101001111000rO0000001
1111111110010000000010100011110l10111100000011000r
1111100110000010rO00111101011111000111r000001l0000
11111001100000100000011100100001111r00100000110000
1111101100011111000011111 00000011rOOOOlOOOOOOOOOO1
1111000100011101100011111 OlOOOOrOOOOOOOOOO1 0000000
111100011001111110001111101rrrOlOOOOl0000l1111001r
111100011111110111000001'110111 OJ. 00001 0011111111111
1110000001111111 Orr00111111111011 010000011111111r1
111000000000rrOlll1l11111J.111l11111000011111100111
1100oooooorOOOOOOl111Jl1111110011100111l1111100r11
1100000rO00000001l101100111111111111J1111111l00l11
1100rr0000000000010001001111111111111111111110r111
Or000000000000000100011111111111111111111111111111
001000000000000001111111111111111111111111111r1111
011100000000000000111111111111111111000111111rlll1
0011000l1001100000111111111000011000000011111rl111
111111100001100000111111111000110000000000011r1110
1111111101111110001111000000l1000000000000111rlIOO
rll111111101111100111100000011000000000000001r1100
1r1l10101101111110100000000010000000l00000000r1000
1111111111111111111100000000111000001iOOOOOlOr0111
l111r1110111111101100000000001100001100l0010001111 I
11111r110l111111101000010000000l1l1100110000000011
111111111111111111.10001111000001111100110000000011
11111111r100101111100111111111l1111111r0000001rl11
1111001101000011COOOOl1l111111111r1111100000111011
110000010000001000000011111100111111110000rrlrrr01
11100001000000100000001111110001111110000000000011
10000l10000000100000001001110rlJ.111110110011000000
1000011000000r10000000100000r10011111111011110Q01r
10000l11100000r001100010000010001111111J.1110100011
100010011100000r1l111011000r00001l11111l11111000l1
11001101100000110011111100010000011111111111100011
11111101101001110r11010000rl0000001111111100001111
111011011000111111 r1 001110111000001111111111001111
1110100011111111111rll110r111100001111111111011111
1110100110000l000111r00101111000011111101100011011
111010010100010001111r1111111110001111111100000011
OOOll10101100l0000011111rl11111100l1111111l1000011
Figure 3.18  Observations at lattice points on photograph 33.
r = roads, 1 = forest and 0 = other.
parameter h4, indicating periodicity, has been found to differ from
zero at the 5% significance level. However, usually an upper limit
could not be found. It seems reasonable that the distance between
roads corresponds with the first period in the correlogram. Since
farmlands are usually associated with roads, it follows that the more
regular the road system, the more patchy we can expect the land use to
be. As the occurrence of roads will be isotropic and still irregular
within one photograph, the effect of periodicity is rather diluted.
The most clear case photograph-wise is that of photograph 86 of which
the lattice is pictured in Figure 3.19 and for which it was difficult
to find any converging parameter set for a model taking account of
periodicity. The fact that partly overlapping photographs do not
present the same type of correlogram shows that the road pattern is
not the same over a larger area, which is indeed the case for the
Piedmont of North Carolina. Keeping in mind that photograph 33 was
not randomly selected, but chosen for its patchiness, the damped
regular periodicity is rather an exception than the rule. This
patchiness may exist locally but is not characteristic for the whole
area. Another consequence is that correlograms should be determined
from observations covering a larger area than that of a photograph.
The same lattice has therefore been placed on an index map of
aerial photographs covering almost all photographs sampled separately.
The correlogram is shown in Figure 3.20 and proves to be quite flat
beyond 1.125 miles. The distance between two adjacent points on the
00000001111111 011 nU10ClOOOOOOOOOlJ 11101011101111
00001001111110011111110001 OOOOJI01J 110000000001111
OOOl11010l00000l1Jl11"UOOOOOOrllll.0111000000011111
OrOOllOl0000000011l1000l1111000000111111l00l111111
OOl10rr1'1'OOOOOrOOllOOOOlOOl00000000111111111111111
11100000000000010000000100110000000011111111111111
111000000000000001rOOrl00011rOOOOOlll1111111111111
11111 0000111110001010001 rOlOOOOOOllLU 000110011111
ill 11000001111000111 0001001'00000001111100110010000
111 11000000110000111.111101110000011111111110010000
1111100000000100l1lJ1111001100000lJ.I011111100lJ.OOO
1111100000000l0CII01IJ.lIIOOOOOlI011001111000111111
111110000001n00110011111 OOOOOOJ] 00111111000J11111
11111000000111011100l11JI000rOl0101111111100U1110
1111111 OOOllJI0} 1"1 0111] 1J1001'0010000111] 1100l00(lOO
11111J 1100111 0011 000111111. 001'001000011111110000000
011111111001J.00110m1l11111J 1"111000111111110000000
0111111111001111100l111DOOOOOOl111111111110000000
11111111111111111011110000000000011111111110000000
111111111111110011J11000000000()000111J111110000000
1111111111J1]1100111000000000flr00011J1111110000011
111111111111111 001111 11nOOOrOOOOOl111111110000001
011J.11OJ111111J.111J11Jllll110000]011l1111 oooooorOo
r01101000111lJ.11111UnOOllrJ00110rlll0000000rOOOO
0000r0000111111111111J.OOOlrOOOOOlllOrlOOOOOr{)00000
OOOOOOOOrlllllOll 1111000011010001111J 0011111111111
11l11001111r1101111000010110000101J.l00011111111000
11111100l111J.lOlI0r100010l1J 1001111110011111110noo
11111111JJ.1111011000111001000001111110000001110000
111111111111111 110000r000100111111J11100000l111111
111111111111111111110r00110000l1000010111111111100
1111111111111111111 orOoon 10001110001 0001111111111
1111111111l01111111r011rI1100011111111111111111111
1111111111100111111'111110111001100011111111111111l
.1111111111 ooon 11111]111 00000000000110111111111111
111111111100000001111J.110r000000001110001111111111
11111111001000000111111111 y{)0000001l10000111111111
1111l1100000000rOOl11111111r1000001100000111111111
11111111111000000001111111111100001100000001111111
1111111111110000001111111111Jr111 01000000001111111
1111111111111000001111111] 11110rl00100001111111111
1111111111110r0001011111111111010r0100001111111111
11111110110rT'l00111001111111111.1001100011111111111
1110111011r11100011001111111111111000000010n0111l
1110111011'1111010111111111111111100110101001011111
11101111rlll1001001111111111 111111111] 001011111111
11011111'11111111001111111111J 111111111111000111111
1001111'1111111010011111111111111111111111000111111
101100111111 110101111111111111111111111 r 1011111111
10000111110101100001111111111111111111111111111111
Figure 3.19  Observations at lattice points on photograph 86.
r = roads, 1 = forest and 0 = other.
Figure 3.20  r1-correlogram of the distribution of forest on index map.
The percentage forest is 57%.
lattice is now 0.128 miles and between two adjacent points in the correlogram is about 0.0702 miles.
Periodicity is almost nonexistent and
exponentials fit easily.
The correlograms and their estimated trends discussed above are based on
one type of correlation estimator. The results of Section 3 are such that
we should not expect any difference in non linearly estimated parameters
whether the correlogram is based on estimator r1, r2 or r3.
Also, the correlograms have been computed from dense lattices with 100
points per square inch. It would be desirable to start with lattices with
fewer points per square inch from which reasonably smooth correlograms
could be computed, so that the trend of such correlograms could be
determined under convergence of the non linear estimation program.
To demonstrate the similarity of different correlation estimators and to
see at the same time if smaller lattices will be satisfactory under our
conditions, smaller lattices have been obtained from the large 50 by 50
lattice by sampling points randomly and without replacement. Such lattices
will be called random lattices, in contrast to lattices where points are
equally spaced. If for a random lattice with a certain average number of
points per square inch the trend in the correlogram can be determined, a
systematic lattice of about the same size will provide the same
information with less effort. Since the same number of pairs will be
available, the correlations will be more precisely estimated, although
fewer lags, spaced farther apart, will occur.
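The drawing of a random lattice described above can be sketched as follows. This is only a minimal illustration and not the program used in the study: the function name random_lattice, the fixed seeds and the synthetic 0/1 field standing in for a photograph are all assumptions made for the example.

```python
import random


def random_lattice(field, n_points, seed=0):
    """Sample n_points lattice points at random, without replacement.

    field is a square list of lists of 0/1 codes (1 = forest, 0 = other).
    Returns a list of (row, col, value) triples.
    """
    size = len(field)
    rng = random.Random(seed)
    # Number the lattice points 0 .. size*size - 1 and draw distinct indices.
    chosen = rng.sample(range(size * size), n_points)
    return [(idx // size, idx % size, field[idx // size][idx % size])
            for idx in chosen]


# Hypothetical 50 x 50 field of 0/1 codes, standing in for a photograph.
demo_rng = random.Random(1)
field = [[demo_rng.randint(0, 1) for _ in range(50)] for _ in range(50)]
sample = random_lattice(field, 400)
```

Correlations at a given lag would then be estimated only from those pairs of sampled points that happen to lie at that distance from each other.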
The correlograms have been computed and the trend estimated for all three
correlation estimators at seven different sampling intensities, resulting
in random lattices of respectively 1,000, 800, 600, 500, 400, 200 and 100
points. The first 1.25 miles of the correlograms of these random lattices
always contain 495 lags, except for the correlogram of the 100 point random
lattice, which provides 491 lags. These seven sampling intensities have
been applied in the cases of photographs 33 and 196 and partly in the case
of photograph 87. For the remainder of the photographs correlation
functions have been estimated based on lattices of 400 random points, where
the correlogram has been computed using the formula for r1. For photographs
34, 86 and 197 computations have also been made from lattices of 1,000
random points but based on r2.
Since the number of pairs varies for different lags, a weighted estimation
procedure of trends in r1 correlograms has been used at all sampling
intensities for photographs 33, 87 and 196. For comparison purposes the
number of pairs at each lag of the full lattice was used as the weight so
that the weighting is independent of the samples. The results of all these
computations are given in Tables 3.6, 3.7, 3.8 and 3.9.
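The weights mentioned above, the number of pairs at each lag of the full lattice, can be counted directly. The sketch below is an illustration under stated assumptions: lags are identified by squared lattice distance, and the weighted criterion is taken to be an ordinary weighted sum of squared deviations, which the text does not spell out. The names pair_counts_by_lag and weighted_sum_of_squares are hypothetical.

```python
from collections import Counter


def pair_counts_by_lag(size):
    """Number of point pairs at each squared distance in a size x size lattice."""
    counts = Counter()
    for di in range(size):
        for dj in range(-(size - 1), size):
            # Visit every unordered displacement exactly once.
            if di == 0 and dj <= 0:
                continue
            counts[di * di + dj * dj] += (size - di) * (size - abs(dj))
    return counts


def weighted_sum_of_squares(observed, fitted, weights):
    """Sum over lags of weight * (observed - fitted)^2; dicts keyed by lag."""
    return sum(weights[lag] * (observed[lag] - fitted[lag]) ** 2
               for lag in observed)


# Weights for the full 50 x 50 lattice; they do not depend on any sample.
weights = pair_counts_by_lag(50)
```

Because the weights come from the full lattice, the same set can be reused for every random lattice drawn from it.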
For each photograph the same initial parameter values have been used for
the non linear estimation. The results show that differences in φ for r1,
r2 and r3 are too small to have any practical meaning, but differences do
increase for smaller sample sizes. The parameter estimates behave likewise
but the differences are larger since the parameter estimates are
correlated. However, we believe that if the correlations were not so small
as they are, the differences might have been more substantial.
The difference in parameter estimates
Table 3.6  Comparison of non linear estimation of the correlation
function, according to model 2, in correlograms of the distribution
of forest on aerial photograph 196 based on r1, r2 and r3 for several
sampling intensities
Correlation
formula
r
1
r2
r
3
r
1
sampled
2500
2500
2500
2500
hI
NW
18.07
27.18
26.81
NW
NW
r1
r2
r
3
r1
500
500
500
500
r1
r2
r
3
r1
400
400
400
400
r1
r2
r
3
r1
200
200
200
200
r1
r2
r
3
r1
100
100
100
NW
100
W
r
r
1
1
NW
NW
NW
W
NW
NW
NW
W
NW
NW
NW
30.45
30.50
31.23
30.15
33.43
33.56
33.72
30.19
'30.20
30.42
31.36
W
NW
NW
NW
NW
NW
28.18
28.28
30.63
NW
NW
W
NW
NW
h
h
3
.145
.379
.365
cp
.608
1.035
1.026
4.76
4.80
4.74
4.60
4.84
4.90
4.80
4.43
4.55
4.63
4.49
.345
.346
.336
.337
.368
.369
.353
.329
.438
.442
.418
1.570
1.588
1.586
11690
2.092
2.119
2.137
15456
3.256
3.325
3.316
4.464.60
4.43
.395
.399
.376
4.030
4.100
4.082
NO CONVERGENCE
26.65
26,.99
29.82
W
NW
2
1.43
4.75
4.70
NO CONVERGENCE
W
NW
h
h
NO CONVERGENCE
W
r2
r
3
r1
r1
r2
r
3
r1
1
r2
r
3
h
Analysis
1000
1000
1000
1000
800
800
800
800
600
600
600
600
r
e
# points
4.46
4.75
4.43
.369
.381
.362
5.063
5.105
5.045
NO CONVERGENCE
33.62
40.75
38.72
.89
4.94
3.40
.094
.190
.175
33.543
17.741
17.604
NO CONVERGENCE
64.45
46.92
51.14
-.71
-.81
3.83
.284
.046
.059
NO CONVERGENCE
147.302
120.36
62.49
Table 3.7  Comparison of non linear estimation of the correlation
function, according to model 3, in correlograms of the distribution of
forest on aerial photograph 33 based on r1, r2 and r3 for several
sampling intensities
Correlation
formula
r1
r2
r
3
r1
r
1
r2
r
3
r1
r
1
r2
r
3
r
1
r1
r2
r
3
r
1
r
1
r2
r
3
r1
r
# points
sampled
2500
2500
2500
2500
Analysis
NW
NW
NW
W
1
r2
r
3
r1
400
400
400
400
r1
r2
r
3
r1
r1
r2
r
3
r .
1
200
200
200
200
100
100
100
100
A
h
1.09
loll
3
.070
.069
.068
.069
17.99
18.07
17.95
18.14
1.29
1.30
1.29
1.44
16.71
16.74
16.53
16.53
A
h
4
lP
9.52
9.52
.795
.798
.802
.067
.068
.067
.072
9.71
9.70
9.71
9.52
1.087
1.088
1.086
8492
1.87
1.82
1.82
1.68
.075
.074
.072
.067
9.49
9.47
9.48
1.361
1.361
1.361
11179
17.16
17.31
17.01
16.95
1.58
1.62
1.61
1.43
.056
.060
16.07
16.41
16.10
2.17
2.30
2.18
.113
.126
10.24
10.19
.119 10.27
NO CONVERGENCE
3.320
3.337
3.331
17.38
18.49
17.42
2.54
.165 10.35
2.85
.193 10.28
.168 10.38
2.55
NO CONVERGENCE
4.107
4.192
NW
20.45
22.62
1.32
1.48
NW
20.43
1.38
NW
NW
500
500
500
500
2
1.12
1.10
5561
1000
600
600
600
600
A
h
1
16.80
16.78
16.69
16.66
9.53
9.47
1000
1000
1000
800
800
800
800
A
h
NW
W
NW
NW
NW
W
NW
NW
NW
W
NW
NW
NW
W
NW
NW
NW
W
NW
.057
.050
.146
.167
.152
9.39
9.75
9.71
9.77
9.71
12.02
12.28
12.11
1.986
1.999
2.007
16852
4.159
19.541
19.180
18.656
W
NW
NW
NW
W
9.71
NO CONVERGENCE
.122 13.48
.63
NO CONVERGENCE
104.94
Table 3.8  Comparison of non linear estimation of the correlation
function, according to model 3, in correlograms of the distribution
of forest on aerial photograph 87 based on r1, r2 and r3 for several
sampling intensities
# points Analysis
sampled
lJ.
r2
r
3
r1
2500
2500
2500
2500
r1
r2
r
3
r1
1000
1000
1000
1000
r1
r2
):'3
r1
800
800
800
800
W
r
1
r1
600
600
W
r1
r1
500
500
W
r1
r1
400
400
r1
r1
r1
r1
Oorre1ation
formula
r;L
A
A
A
A
14.20
14.20
14.07
14.00
h2
.50
.49
.50
.54
h
3
.051
.051
.051
.052
h4
7.96
7.97
7.96
7.91
14.19
14.13
14.34
14.01
.29
.24
.31
.34
.052
.050
.053
.053
8.16
8.17
8.16
8.14
1.433
1.432
1.441
10852
14.86
14.81
15.00
1!t.81
.71
.63
.72
1.00
.064
.061
.065
.077
8.07
8.09
8.07
8.00
2.138
2.139
2.144
15980
12.79
12.36
-.19
--.15
.027
.027
8.53
8.47
2.933
22555
14.14
14.44
-.71
.04
.016
.027
8.73
8.60
3.642
29608
W
14.31
13.76
-.86
-.62
.009
.011
8.38
8.30
5.667
44925
200
200
NW
60.79
6.61
.495
8.71
27.036
W
NO CONVERGENCE
100
100
NW
NO CONVERGENCE
W
NO CONVERGENCE
NW
NW
NW
W
NW
NW
NW
W
NW
NW
NW
NW
NW
NW
If'
.,46
.553
.558
4607
Table 3.9  Comparison of non linear estimation of trend in correlograms
on some aerial photographs of the Lake Michie watershed for different
sampling intensities
Photo
number
32
Correlation
formula
Model l )
fitted
Number
points
sampled
h
l
r
2
2500
26.49
5.09
0.265
1.154
2
1
400
29.42
5.91
0.416
7.193
1.019
r
34
11.32
-0.17
-0.029
10.65
-0.06
-0.035
5.415
2
1000
9.14
0.22
-0.051
1.463
3
2500
15.00
15.00
0.001
4.27
0.664
1
3
400
18.82 5266.32
0.172
-0.00
5.254
1
3
2500
16.36
-0.52
-0.018
12.79
0.735
l
r2
3
400
17.41
-0.20
-0.021
13.05
5.623
3
1000
15.54
-0.27
-0.028
12.37
1.502
12.99
2.01
-0.057
NO CONVERGENCE
r
r
l
2
1
2
2500
2
400
l
5
2500
l
5
400
2
2
2
r
l
r1..
r
r
197
cp
400
1
r
200
h4
2500
r
5
A
h
3
2
2
r
86
l
A
A
h
2
r
r
85
l
A
r
r
r
l
l
2
0.796
13.02
0.3~2
15.35
15.61
0.56
1.00
0.030
0.022
2500
25.89
11.95
0.546
0.442
400
18.25
5.84
0.126
1000
37.26
12.15
0.594
5.736
1.121
13.84
4.126 2)
1) For explanation of the code see text.
2) The program did not converge for a reasonable value for ĥ3. Therefore ĥ2 was
constrained between 0 and 1.
and in φ values for the correlation function fitted to the correlogram
based on the full lattice of 2500 points in the case of photograph 196
is due to the way r1 is computed as discussed at the end of Section 3.2.
Looking at the φ values at different sampling intensities and different
photographs it may be noted that φ does not differ much for lattices of
1,000 or 2500 points. For random lattices of 1,000 points φ is a little
above 1. For random lattices of less than 600 points φ starts to increase
faster and becomes large for lattices with less than 400 points. For random
lattices of 400 points φ varies around 5 for almost all photographs. The
behavior of φ has been plotted in Figure 3.21 for photographs 33, 87 and
196, where the results of the other photographs are indicated with a Δ in
the graph. It appears that lattices of 400 to 600 points would be
satisfactory. If one uses a systematic lattice, square or rectangular, the
estimation of the trend should be equally possible if not better, provided
definite periodicities in either or both directions do not occur. From a
systematic lattice with 25 points per square inch placed on photograph 33 a
correlogram has been computed (see Figure 3.22) and its trend estimated
according to model 3. The value of φ on a per point base is 0.0028 where
for a random lattice of 600 points φ on a per point base is 0.0040. The
parameter estimates are ĥ1 = 15.54, ĥ2 = 1.94, ĥ3 = 0.115 and ĥ4 = 9.52,
which compare well with the estimates obtained in Table 3.7, considering
the high correlation between ĥ2 and ĥ3. Comparisons of the weighted
analysis show only slight differences in the parameter estimates.
Convergence appears more difficult, but is generally slow for photograph
196. The results on the weighted analysis do not warrant its use in
practice.
Figure 3.21  Graph of φ, the sum of the squared deviations, against the
number of points sampled randomly without replacement from a lattice of
2500 points
Figure 3.22  r1-correlogram of the distribution of forest on photograph 33
based on a 25 x 25 lattice (horizontal axis: distance in miles)
Since the correlograms and also the non linearly estimated trends display
dips below the lag axis, negative correlations were tested by the method
described at the end of Section 3. Since we are interested in negative
parts of the correlogram, the total number of significant correlations as
well as the number of significant negative correlations have been counted.
This has been done for photographs 86, 33, 5, 196 and the index map. The
results are presented in Table 3.10. The distance in miles should be
multiplied by the factor 4.065 in case of the index map data.
The results clearly indicate that the correlations are statistically
significant and that negativity is even significant on the index map at a
distance of more than four miles. This last statement, however, should not
be taken too seriously as the correlations are still very small. Also, the
significance level is only approximate.
Some remarks concerning isotropy are appropriate at this point. If
isotropy does really exist, it does not matter which direction one chooses
in computing the correlogram. For two pairs of perpendicular directions
this has been done and the correlograms are displayed in Figure 3.23.
Considering the small values of the correlations, the assumption of
isotropy seems to have some merit for the particular photograph for which
this comparison was made. However, if the process is not stationary or if
a particular trend is not apparent from the photograph, it may be best to
average over some directions. In our case different trends in
correlograms, based on several particular directions within the same
photograph, would have been estimated using the non linear estimation
procedure described above.
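A correlation for one particular direction can be sketched as below. The estimator shown is a plain product-moment correlation over all pairs of lattice points separated by a fixed displacement; it is an assumption made for illustration and is not claimed to coincide with r1, r2 or r3 of Section 3.2. The function name directional_correlation and the striped test field are hypothetical.

```python
def directional_correlation(field, di, dj):
    """Product-moment correlation between values at (i, j) and (i+di, j+dj)."""
    rows, cols = len(field), len(field[0])
    pairs = [(field[i][j], field[i + di][j + dj])
             for i in range(rows) for j in range(cols)
             if 0 <= i + di < rows and 0 <= j + dj < cols]
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    s_aa = sum((a - mean_a) ** 2 for a, _ in pairs)
    s_bb = sum((b - mean_b) ** 2 for _, b in pairs)
    if s_aa == 0 or s_bb == 0:
        return float("nan")  # correlation undefined for a constant set of values
    return s_ab / (s_aa * s_bb) ** 0.5


# Hypothetical anisotropic field: vertical stripes of forest (1) and other (0).
stripes = [[j % 2 for j in range(6)] for _ in range(6)]
across = directional_correlation(stripes, 0, 1)  # perpendicular to the stripes
along = directional_correlation(stripes, 1, 0)   # parallel to the stripes
```

For such a striped field the two directions give opposite correlations at lag one, which is the kind of departure from isotropy that a comparison of perpendicular directions is meant to reveal.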
Figure 3.23  Correlograms of the distribution of forest on photograph 33
for two pairs of two perpendicular directions (horizontal axis: distance
in miles)
4. EVALUATION OF THE PRECISION OF SOME SAMPLING SCHEMES
4.1 Introduction
In this chapter we will evaluate some sampling schemes for the two
population models discussed in Chapter 1. Combining the theory given in
Chapter 2 and the results concerning the estimation of the correlogram
presented in Chapter 3, the behavior of the systematic sampling variance
will be described as well as its relation to variances for other sample
designs. In Section 2 we will study the true variances computed for some
finite populations and in Section 3 we will discuss what should be
expected. In this last section we will also indicate a procedure to
determine the best sample design for a given sample size.
4.2 Evaluation of the precision of some sampling schemes for populations of constants
The purpose of this section is to look at the performance of some sampling
schemes by comparing the calculated variances of their estimates. True
variance formulae of systematic sampling schemes for finite populations of
constants follow from an analysis of variance as given in the Appendix.
In Chapter 1 the possibility of representing a photograph by a square
lattice has been discussed. The populations considered in this section
consist of land use codes observed on aerial photographs at the points of a
lattice of generally 48 x 48 points. The distance between the points is 0.1
inch, which is equivalent to a ground distance of 0.03156 miles. An
additional population considered consists of the land use codes observed at
the points of a square lattice placed on an index map of aerial
photographs. In that case the ground distance between two adjacent points
is 0.128 miles. The land uses concerned are forest and non forest, coded by
1 and 0 respectively. Since the population values are known, actual
sampling experiments have not been carried out.
To evaluate sampling designs for different populations several aerial
photographs with different correlograms have been selected. Referring to
Table 3.5 and the corresponding figures, photographs 85 and 34 have been
selected because of the dip in the correlogram below the horizontal axis.
The correlogram of photograph 34 shows a larger, more negative dip than
that of photograph 85. The correlogram of photograph 33 shows a clear
oscillatory pattern. Photograph 197 is an example of a nearly convex
exponential correlogram but some oscillation is observable. The correlogram
of photograph 200 is unique in that it is fitted by a model which could not
be fitted to the correlograms of other photographs. Finally, to cover a
larger area, computations have included the finite population obtained from
a part of the index map. The correlogram of this population is almost flat.
Variances have been computed for several sampling intensities of simple
random sampling, stratified random sampling, systematic sampling, one stage
random cluster sampling without replacement and systematic cluster
sampling. For variance formulae of the random schemes see Cochran (1963).
Systematic sampling of single points equally spaced over the whole
population will be referred to as systematic sampling, while systematic
sampling of clusters of contiguous points on the lattice will be referred
to as systematic cluster sampling. In the latter case the population or
lattice is first divided into clusters and any subsequent sample consists
of a number of clusters equally spaced over the whole population of
clusters. In the case of systematic sampling n x n points are selected at
an interval k x k such that the distance between two adjacent sample points
in the horizontal and vertical directions is k units with nk = 48. However,
if n = 5, n = 10 or n = 7 the lattices consist of 50 x 50 points or 49 x 49
points. For stratified sampling the strata are of size k x k, while for
simple random sampling any sample of n^2 points out of N x N has an equal
chance. In the case of cluster sampling let the size of the cluster be
n x n, then if s x s clusters are selected systematically the interval in
the horizontal and vertical directions is c, such that n(c x s) = nk. The
effective sample size is n^2 s^2. If clusters are selected randomly, then
any sample of s^2 clusters out of k x k has an equal chance of selection.
The results of the computations are presented in Tables 4.1 and 4.2.
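Because the population values are known, the true variance of the sample mean can be obtained for each design without any sampling experiment. The sketch below shows one way to do this for a small lattice. It is an illustration, not the computation given in the Appendix: the textbook finite population formulae are used for the two random designs (stratified sampling is taken to mean one random point per k x k stratum), the systematic variance is found by enumerating all k x k possible starting points, and the function names and the 4 x 4 checkerboard are assumptions.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def srs_variance(field, n):
    """Variance of the mean of n*n points drawn at random without replacement."""
    flat = [v for row in field for v in row]
    total, m = len(flat), n * n
    mu = mean(flat)
    s2 = sum((v - mu) ** 2 for v in flat) / (total - 1)
    return (1 - m / total) * s2 / m


def stratified_variance(field, k):
    """Variance of the mean when one random point is taken per k x k stratum."""
    size = len(field)
    within = []
    for i0 in range(0, size, k):
        for j0 in range(0, size, k):
            block = [field[i][j]
                     for i in range(i0, i0 + k) for j in range(j0, j0 + k)]
            mu = mean(block)
            within.append(mean((v - mu) ** 2 for v in block))
    return sum(within) / len(within) ** 2


def systematic_variance(field, k):
    """Variance of the mean over all k*k systematic samples with interval k."""
    size = len(field)
    grand = mean(v for row in field for v in row)
    sample_means = [mean(field[i][j]
                         for i in range(i0, size, k)
                         for j in range(j0, size, k))
                    for i0 in range(k) for j0 in range(k)]
    return mean((m - grand) ** 2 for m in sample_means)


# Hypothetical population: a 4 x 4 checkerboard, sampled with n = 2, k = 2.
board = [[(i + j) % 2 for j in range(4)] for i in range(4)]
v_srs = srs_variance(board, 2)
v_st = stratified_variance(board, 2)
v_sy = systematic_variance(board, 2)
```

The checkerboard is a deliberately unfavorable case: the sampling interval coincides with the period of the pattern, so every systematic sample consists of points of one kind only and the systematic variance is far larger than that of either random design.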
From Tables 4.1 and 4.2 it can be seen that systematic sampling has almost
always a lower variance than simple random sampling. For some sampling
intensities the systematic sampling variance is larger and, since
stratified random sampling is always superior to simple random sampling,
the variance of systematic sampling is then decidedly larger than the
variance of stratified random sampling. No
Table 4.1  True variances times sample size for simple random (SRS), stratified random (ST)
and systematic (SY) sampling for some aerial photographs of the Lake Michie watershed
                    Size of sample in terms of a compact cluster
Photograph  Sample
            design  2 x 2    3 x 3    4 x 4    5 x 5    6 x 6    7 x 7    8 x 8    10 x 10  12 x 12  16 x 16  24 x 24
85          SRS     0.2488   0.2483   0.2475   0.2467   0.2453   0.2441   0.2423   0.2392   0.2337   0.2215   0.1869
            ST      0.2465   0.2249   0.2236   0.2221   0.1970   0.1842   0.1778   0.1704   0.1532   0.1287   0.0866
            SY      0.2587   0.2239   0.1979   0.1635   0.1891   0.1308   0.2620   0.1184   0.0629   0.4003   0.0443
33          SRS     0.2481   0.2475   0.2468   0.2452   0.2446   0.2430   0.2416   0.2377   0.2330   0.2209   0.1864
            ST      0.2429   0.2441   0.2373   0.2235   0.2097   0.1937   0.1936   0.1748   0.1535   0.1342   0.0933
            SY      0.2540   0.2001   0.2375   0.1629   0.1587   0.1793   0.1757   0.1668   0.1645   0.0866   0.1102
34          SRS     0.2473   0.2467   0.2460   0.2446   0.2438   0.2423   0.2408   0.2372   0.2322   0.2202   0.1858
            ST      0.2396   0.2190   0.2181   0.2081   0.1977   0.1899   0.1778   0.1598   0.1433   0.1236   0.0880
            SY      0.2243   0.1450   0.1711   0.1477   0.1352   0.1256   0.2406   0.1193   0.1063   0.0591   0.0646
197         SRS     0.2478   0.2473   0.2465   0.2469   0.2444   0.2439   0.2414   0.2395   0.2327   0.2207   0.1862
            ST      0.2416   0.2258   0.2320   0.2034   0.2017   0.1976   0.1890   0.1631   0.1445   0.1239   0.0811
            SY      0.2396   0.2516   0.2296   0.3480   0.2300   0.2427   0.3241   0.1353   0.0722   0.0972   0.0368
200         SRS     0.2446   0.2441   0.2433   0.2444   0.2412   0.2409   0.2382   0.2370   0.2297   0.2178   0.1838
            ST      0.2253   0.2299   0.2187   0.2166   0.2031   0.1838   0.1766   0.1659   0.1519   0.1213   0.0775
            SY      0.2176   0.2027   0.1741   0.1162   0.0971   0.1146   0.1019   0.1364   0.1107   0.0484   0.1623
INDEX       SRS     0.2444   0.2438   0.2431   0.2423   0.2410   0.2395   0.2380   0.2349   0.2295   0.2176   0.1836
            ST      0.2414   0.2365   0.2350   0.2313   0.2229   0.2246   0.2199   0.2123   0.1991   0.1835   0.1474
            SY      0.2461   0.2334   0.2170   0.2512   0.3451   0.2810   0.2378   0.1486   0.2109   0.1753   0.3498
Table 4.2  True variances times effective sample size for random cluster sampling (Vclr)
and systematic cluster sampling (Vclsy) for some aerial photographs of the Lake Michie watershed
Effective  Size of   Size of       Photograph 85      Photograph 33      Photograph 34
sample     cluster   systematic
size                 cluster
                     sample        Vclr     Vclsy     Vclr     Vclsy     Vclr     Vclsy
16         2 x 2     2 x 2         0.6468   0.6770    0.6172   0.6368    0.6352   0.5158
36         2 x 2     3 x 3         0.6412   0.5346    0.6116   0.4131    0.6296   0.2706
64         2 x 2     4 x 4         0.6252   0.4972    0.6040   0.5351    0.6216   0.3274
100        2 x 2     5 x 5         0.6260   0.3648    0.5896   0.1484    0.6120   0.3033
144        2 x 2     6 x 6         0.6104   0.4345    0.5824   0.2383    0.5996   0.2278
256        2 x 2     8 x 8         0.5788   0.6989    0.5524   0.2889    0.5684   0.5774
576        2 x 2     12 x 12       0.4884   0.1146    0.4660   0.1623    0.4796   0.0420
36         3 x 3     2 x 2         1.0710   1.0225    1.0161   1.1119    1.1034   0.7854
144        3 x 3     4 x 4         1.0206   0.3364    0.9675   0.8364    1.0503   0.4804
64         4 x 4     2 x 2         1.5024   1.4729    1.4924   1.7391    1.6336   1.0105
144        4 x 4     3 x 3         1.4496   0.8615    1.4336   0.9605    1.5744   0.4553
256        4 x 4     4 x 4         1.3745   0.6997    1.3584   0.9234    1.4928   0.4680
100        5 x 5     2 x 2         1.9075   1.8456    1.7650   1.8620    2.0325   1.6553
625        5 x 5     5 x 5         1.4900   0.7331    1.3775   0.1044    1.6500   0.8267
144        6 x 6     2 x 2         2.4444   2.2904    1.8756   2.1202    2.3940   1.1983
576        6 x 6     4 x 4         1.9584   0.5391    1.4712   1.2656    0.8512   0.0151
256        8 x 8     2 x 2         3.0528   3.9836    2.2656   3.3227    2.9184   1.4411
576        8 x 8     3 x 3         2.6728   0.1102    1.9136   1.4757    2.4640   0.0298
Table 4.2  (continued)
Effective  Size of   Size of       Photograph 197     Photograph 200     Index
sample     cluster   systematic
size                 cluster
                     sample        Vclr     Vclsy     Vclr     Vclsy     Vclr     Vclsy
16         2 x 2     2 x 2         0.6648   0.5881    0.6664   0.5587    0.3872   0.3863
36         2 x 2     3 x 3         0.6592   0.6544    0.6604   0.4947    0.3840   0.3164
64         2 x 2     4 x 4         0.6508   0.4370    0.6524   0.4509    0.3792   0.2543
100        2 x 2     5 x 5         0.6428   0.9785    0.6372   0.1604    0.3744   0.4502
144        2 x 2     6 x 6         0.6276   0.5132    0.6292   0.1480    0.3676   0.5946
256        2 x 2     8 x 8         0.5952   0.7604    0.5964   0.1934    0.3468   0.1806
576        2 x 2     12 x 12       0.5020   0.0420    0.5032   0.0790    0.2924   0.1797
36         3 x 3     2 x 2         1.0052   0.9852    1.0989   0.7907    0.5436   0.5386
144        3 x 3     4 x 4         1.0521   0.7033    1.0467   0.5395    0.5175   0.3776
64         4 x 4     2 x 2         1.6240   1.2269    1.4576   1.0273    0.7152   0.5781
144        4 x 4     3 x 3         1.5664   1.8535    1.4048   1.1385    0.6896   0.2171
256        4 x 4     4 x 4         1.4848   0.4158    1.3328   0.6786    0.6428   0.3116
100        5 x 5     2 x 2         2.0450   1.6689    1.9600   1.3532    0.7825   0.5526
625        5 x 5     5 x 5         1.6325   4.0516    1.5300   0.1563    0.6125   1.1795
144        6 x 6     2 x 2         2.0304   1.4082    2.3436   1.4483    1.4496   0.7543
576        6 x 6     4 x 4         1.6236   0.1479    1.8756   0.7335    0.6804   0.9549
256        8 x 8     2 x 2         2.7200   2.1753    2.4512   0.3644    1.2736   0.8655
576        8 x 8     3 x 3         2.3376   1.6982    2.0762   2.4002    1.0752   0.0738
pattern is apparent in the data to explain these differences.
Corresponding distances in miles between adjacent points are 0.76, 0.19 and
0.09 for photograph 85, 0.50, 0.32 and 0.19 for photograph 197 and 3.11,
1.28, 1.02, 0.90 and 0.26 for the index map. For other dimensions of the
sample lattices see Table 4.3. In general the variances do not differ
appreciably for the three designs at the same sampling intensity, but
differences become relatively larger for larger samples. The decrease is
due to the finite population correction factor. Variances for both random
sampling schemes decrease with increasing sample size, in contrast with
systematic sampling where the behavior of the variance is erratic (see
Figure 4.1). Excluding the larger sample sizes it becomes difficult to
decide on the best sample size, since differences are small. Except for
photograph 200 the systematic sampling variance is large for samples of
size 8 x 8 points for all photographs. Sampling intensities of 6 x 6, 7 x 7
or 10 x 10 appear to be favorable for most of the photographs. A graphic
presentation of the variances given in Table 4.1 is given in Figure 4.1 for
photograph 85 and for the index map.
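The role of the finite population correction can be checked against the SRS entries of Table 4.1. For simple random sampling without replacement the variance of the mean times the sample size equals S^2 (1 - n/N). The sketch below backs out S^2 from the 2 x 2 entry for photograph 85 and applies the correction for larger samples from the 48 x 48 lattice. The function name is hypothetical, and S^2 is recovered from a rounded table value, so agreement can only be expected to about four decimals.

```python
def srs_variance_times_n(s2, n, population_size):
    """n * Var(sample mean) under simple random sampling without replacement."""
    return s2 * (1 - n / population_size)


POPULATION = 48 * 48

# Recover S^2 from the tabulated 2 x 2 value for photograph 85 (0.2488).
s2_photo_85 = 0.2488 / (1 - 4 / POPULATION)

predicted = {side: srs_variance_times_n(s2_photo_85, side * side, POPULATION)
             for side in (12, 16, 24)}
# Rounded to four decimals these are 0.2337, 0.2215 and 0.1869, the SRS
# entries of Table 4.1 for samples of 12 x 12, 16 x 16 and 24 x 24 points.
```

The population variance itself does not change with the sampling intensity, so for simple random sampling the whole decrease in the tabulated product comes from the factor 1 - n/N.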
The observations concerning Table 4.1 repeat themselves somewhat in Table
4.2. As would be expected, cluster sampling has a larger variance than
simple random sampling of single elements. The variance of systematic
cluster sampling is generally larger than that of systematic sampling of
single elements for the same effective sample size, but in rare cases
systematic cluster sampling has a lower variance, e.g. in the case of
photograph 33 for a sample of 5 x 5 clusters of 2 x 2 points and for the
largest sample sizes in the case of photographs 34 and 200.
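The two cluster designs can be evaluated on a known lattice in the same spirit. The sketch below is an illustration only: random cluster sampling is treated as simple random sampling of cluster means without replacement, systematic cluster sampling is evaluated by enumerating all c x c possible starts, and the function names and the small test field are assumptions.

```python
def cluster_means(field, n):
    """Means of the non-overlapping n x n clusters of a square lattice."""
    k = len(field) // n
    return [[sum(field[i][j]
                 for i in range(a * n, (a + 1) * n)
                 for j in range(b * n, (b + 1) * n)) / (n * n)
             for b in range(k)] for a in range(k)]


def random_cluster_variance(field, n, s):
    """Variance of the mean of s*s clusters drawn at random without replacement."""
    means = [m for row in cluster_means(field, n) for m in row]
    total, m_s = len(means), s * s
    grand = sum(means) / total
    s2_between = sum((m - grand) ** 2 for m in means) / (total - 1)
    return (1 - m_s / total) * s2_between / m_s


def systematic_cluster_variance(field, n, s):
    """Variance of the mean over all systematic samples of s x s clusters."""
    cm = cluster_means(field, n)
    k = len(cm)
    c = k // s  # interval between selected clusters, so that c * s = k
    grand = sum(m for row in cm for m in row) / (k * k)
    sample_means = []
    for a0 in range(c):
        for b0 in range(c):
            vals = [cm[a][b] for a in range(a0, k, c) for b in range(b0, k, c)]
            sample_means.append(sum(vals) / len(vals))
    return sum((m - grand) ** 2 for m in sample_means) / len(sample_means)


# Hypothetical 4 x 4 population: forest (1) in the upper half, other (0) below.
halves = [[1 if i < 2 else 0 for _ in range(4)] for i in range(4)]
v_clr = random_cluster_variance(halves, 2, 1)
v_clsy = systematic_cluster_variance(halves, 2, 1)
```

With a single cluster per sample the two designs coincide, which gives a simple consistency check on the two functions.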
Figure 4.1  True variances times sample size as given in Table 4.1 for
photograph 85 and index map (horizontal axis: number of points horizontally
in square lattice; curves SY and ST)
Table 4.3  Sample lattices and corresponding dimensions in miles
Points in sample   Distance between   Length of side   Length of diagonal
lattice            adjacent points    of lattice       of lattice
2 x 2              0.76               0.76             1.07
3 x 3              0.50               1.01             1.43
4 x 4              0.38               1.14             1.61
5 x 5              0.32               1.26             1.79
6 x 6              0.25               1.26             1.79
7 x 7              0.22               1.33             1.87
8 x 8              0.19               1.33             1.87
10 x 10            0.16               1.42             2.01
12 x 12            0.13               1.39             1.96
16 x 16            0.09               1.42             2.01
24 x 24            0.06               1.45             2.05
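The entries of Table 4.3 follow from the lattice spacing of 0.03156 miles and the sampling interval k. The sketch below recomputes them under the assumption, taken from the description of the designs in this section, that the population lattice is 48 x 48 except for n = 5 and n = 10 (50 x 50) and n = 7 (49 x 49). The names SPACING_MILES, POPULATION_SIDE and lattice_dimensions are hypothetical.

```python
import math

SPACING_MILES = 0.03156                   # ground distance between lattice points
POPULATION_SIDE = {5: 50, 7: 49, 10: 50}  # lattices used when n does not divide 48


def lattice_dimensions(n):
    """Spacing, side length and diagonal (miles) of an n x n sample lattice."""
    side_points = POPULATION_SIDE.get(n, 48)
    k = side_points // n                  # sampling interval in lattice units
    spacing = k * SPACING_MILES
    side = (n - 1) * spacing
    return spacing, side, side * math.sqrt(2)


dimensions = {n: lattice_dimensions(n)
              for n in (2, 3, 4, 5, 6, 7, 8, 10, 12, 16, 24)}
```

The side of the sample lattice is (n - 1) intervals long, which is why lattices of quite different density, such as 10 x 10 and 16 x 16, span nearly the same distance.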
The variance of cluster sampling increases generally with increasing
cluster size for the same effective sample size. The same holds for
systematic cluster sampling, but exceptions occur. Random cluster sampling
has generally a higher variance than systematic cluster sampling. Where
this is not the case, the interval between centers of adjacent clusters in
a row or column in a systematic sample corresponds in general to the
distance between two adjacent single elements in a systematic sample for
which the variance was not smaller than that of simple random sampling. For
instance, for photograph 85 the variance of systematic cluster sampling is
greater than that of random cluster sampling for samples of 2 x 2 and 8 x 8
clusters of 2 x 2 points and for samples of 2 x 2 clusters of 8 x 8 points.
The distances between two adjacent center points are respectively 0.76,
0.19 and 0.76 miles. The same is true for photograph 33 where all
systematic cluster samples of 2 x 2 clusters of any size recorded in Table
4.2 have a larger variance than random cluster sampling. Finally, the
behavior of the systematic cluster sampling variance is erratic in the same
manner as the systematic sampling variance of single elements.
An important question to raise now is: Is it possible to explain the
results described in the previous paragraphs in terms of some definite
characteristics of the corresponding correlograms? In the case of
1-dimensional systematic sampling all of the relevant correlations are at
lags which are integral multiples of the systematic sampling interval. This
is also true for 2-dimensional systematic sampling if the correlations are
computed for lags in each direction separately. However, our correlations
are computed by averaging over all directions within a lattice, thus the
lags are not simply integral multiples of the systematic sampling interval.
The distribution of distances within a square lattice is such that most of
the lags occur between one fifth of the length and the whole length of a
side of the lattice. For lattices denser than 9 x 9 points, the percentage
of frequencies of distances smaller than the length of a side of the
lattice is larger than 90%. The value of the correlations at these lags
should explain the performance of systematic sampling for the several
sampling intensities. From the table of sample lattices and their
dimensions, Table 4.3, we can see the difficulties encountered in
explaining the differences in sampling variance for some lattices for the
correlograms given in Figures 3.7, 3.11, 3.12, 3.14, 3.15 and 3.20.
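The distribution of distances within a square lattice can be examined by counting pairs. The sketch below computes, for an n x n lattice, the fraction of all point pairs whose distance is strictly smaller than the side of the lattice. It is offered as an illustration of the kind of count involved; the function name is hypothetical and the strict inequality is an assumption about how "smaller than" is to be read.

```python
def fraction_shorter_than_side(n):
    """Fraction of point pairs in an n x n lattice closer than the side length."""
    side_sq = (n - 1) ** 2
    total = shorter = 0
    for di in range(n):
        for dj in range(-(n - 1), n):
            # Count each unordered displacement once.
            if di == 0 and dj <= 0:
                continue
            pairs = (n - di) * (n - abs(dj))
            total += pairs
            if di * di + dj * dj < side_sq:
                shorter += pairs
    return shorter / total


fractions = {n: fraction_shorter_than_side(n) for n in (3, 5, 10, 16, 24)}
```

For a 3 x 3 lattice only 20 of the 36 pairs are closer than the side; the fraction rises as the lattice becomes denser.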
Differences between systematic sampling variances of samples of 5 x 5 and
6 x 6 points, 7 x 7 and 8 x 8 points or 10 x 10 and 16 x 16 points may be
due to slightly different population sizes, but sample lattices within a
finite population of 48 x 48 points give results which cannot be explained
easily either. The half periods occurring in the correlograms are generally
too large to be consequential for the larger sample sizes. Studying the
results for each photograph, the following remarks can be made. If
differences in variance for different sample sizes are due to periodicity
then this is not apparent from the correlogram. If the sample points are
spaced in such a manner that they roughly follow certain repeating features
in the land use pattern, the corresponding distances within the lattice
should fall in that part of the correlogram where the correlations take
positive values if the variances are large, and in that part of the
correlogram where the correlations take negative values if the variances
are small. If this is not the case, then some correlations should take
large positive values or large negative values relative to other
correlations in the correlogram. Neither of these features has been
observed for distances smaller than the length of a side of a lattice. For
some cases of small sample sizes, it could be verified that the distances
fell within a sufficiently large range of negative correlations to explain
low variances, e.g. in the case of photograph 34. But very little
explanation can be provided for the peaks in the variances in Table 4.1.
Differences in variances are probably due to small peculiarities in the
correlogram, which are not apparent to the eye. Correlations are small for
the range of lags which are the most frequent. Hidden periodicities in the
data, if any, might have been more clearly demonstrated by spectral
analysis than by means of correlograms, but spectral analysis was not
investigated in this study. Differences may also be due to the correlations
in the beginning and at the end of the correlogram which, although the lags
are less frequent, are much larger relative to the middle range of the
correlogram. Although dimensions of sample lattices may not be very
different, emphasis on different lags due to different distances between
adjacent points in the lattices may cause differences in sampling variances
which are difficult to relate to features in the correlogram. Also, the
results are obtained for particular finite populations and it may well be
that the results vary considerably from what is to be expected based on a
general trend in the correlogram. Deviations from the expected correlation
function could contribute to situations such as those described above.
4.3 Expected variances
Considering the same photographs and sample designs as in Section 2, the
expected variances have been calculated using the formulae of Section 4 of
Chapter 2. The autocorrelations are obtained from the fitted correlation
functions of Section 4 of Chapter 3. Those functions have been used for
which the φ values in Table 3.5 are smallest. The results are presented in
Tables 4.4 and 4.5. The following observations can be made. Except for
photograph 34 the expected variances are within the same range as the true
variances of Tables 4.1 and 4.2. Apparently, the negative tail of the
fitted correlation function causes the expected variances to be larger in
the case of photograph 34, the effect of which decreases for increasing
sample sizes of stratified
Table 4.4  Expected variances times sample size for simple random (SRS), stratified random (ST) and
systematic (SY) sampling based on fitted correlation functions for some aerial photographs of the
Lake Michie watershed
Photograph Sample
design
2 x 2
3 x 3
Size of sample in terms of a compact cluster
4 x 4. 5 x 5
6 x 6
~ x tl
lO x lO
7 x 7
85
SRS
ST
SY
0.2488
0.2445
0.2444
0.2392
0.l606
0.l2l2
33
SRS
ST
SY
0.2468
0.2460
0.25l0
0.2374
O.l773
0.l387
34
SRS
ST
SY
SRS
ST
SY
0.3229
0.3038
0.3033
0.2470
0.2423
0.2406
SRS
ST
SY
SRS
ST
SY
0.2441
0.2383
0.2294
0.3U4
0.1787
0.1387
0.2376
0.1663
0.1339
0.2349
0.1608
0.1244
197
200
INDEX
0.2433
0.2412
0.2406
12 x 12
16 x 16
24 x 24
0.234l
0.2138
3
3
5
35
9
7
0_?071
7
947
,7
5
Table 4.5  Expected variance times effective sample size for random cluster sampling
(σ²clr) and systematic cluster sampling (σ²clsy) based on fitted correlation functions
for some aerial photographs of the Lake Michie watershed
Size of   Systematic   Effective   Photograph 85     Photograph 33     Photograph 34
cluster   cluster      sample      σ²clr   σ²clsy    σ²clr   σ²clsy    σ²clr   σ²clsy
          sample       size
2 x 2     2 x 2           16       0.6936  0.6789    0.6474  0.6662    0.9691  0.8939
          3 x 3           36       0.6875  0.6220    0.6417  0.6607    0.9609  0.7847
          4 x 4           64       0.6790  0.5429    0.6338  0.5535    0.9487  0.6670
          5 x 5          100       0.6706  0.4762    0.6259  0.5060    0.9402  0.5785
          6 x 6          144       0.6548  0.3853    0.6112  0.4302    0.9149  0.4651
          8 x 8          256       0.6208  0.2619    0.5795  0.3033    0.8674  0.3208
          12 x 12        576       0.5238  0.1023    0.4889  0.1172    0.7319  0.1432
3 x 3     2 x 2           36       1.2086  1.1818    1.0669  1.1143    1.7849  1.6255
          4 x 4          144       1.1510  0.8759    1.0161  0.8607    1.6999  1.1151
4 x 4     2 x 2           64       1.7050  1.6684    1.4241  1.5163    2.6520  2.3877
          3 x 3          144       1.6441  1.4406    1.3732  1.4941    2.5572  1.9510
          4 x 4          256       1.5588  1.1244    1.3020  1.0655    2.4246  1.4803
5 x 5     2 x 2          100       2.1442  2.1144    1.6938  1.7912    3.5285  3.1654
          5 x 5          625       1.6752  0.8283    1.3233  0.8409    2.7567  1.1435
6 x 6     2 x 2          144       2.4841  2.4438    1.8549  2.0864    4.2771  3.7684
          4 x 4          576       1.9873  1.2198    1.4839  1.0722    3.4217  1.7269
8 x 8     2 x 2          256       2.9121  2.8926    1.9189  2.3454    5.5521  4.6846
          3 x 3          576       2.4571  1.9814    1.6190  2.2567    4.7882  3.0417
Table 4.5 (continued)
Size of   Systematic   Effective   Photograph 197    Photograph 200    Index
cluster   cluster      sample      σ²clr   σ²clsy    σ²clr   σ²clsy    σ²clr   σ²clsy
          sample       size
2 x 2     2 x 2           16       0.6591  0.6358    0.6687  0.6120    0.3888  0.3787
          3 x 3           36       0.6533  0.5820    0.6629  0.6282    0.3854  0.3619
          4 x 4           64       0.6453  0.5226    0.6547  0.5354    0.3807  0.3443
          5 x 5          100       0.6276  0.4583    0.6470  0.4617    0.3761  0.3304
          6 x 6          144       0.6222  0.3951    0.6313  0.3751    0.3671  0.3102
          8 x 8          256       0.5900  0.2828    0.5986  0.2626    0.3480  0.2751
          12 x 12        576       0.4978  0.1228    0.5050  0.1090    0.2937  0.1952
3 x 3     2 x 2           36       1.1238  1.0771    1.1554  1.0338    0.5138  0.4925
          4 x 4          144       1.0703  0.8224    1.1004  0.8613    0.4893  0.4153
4 x 4     2 x 2           64       1.5679  1.4947    1.6207  1.4148    0.6239  0.5885
          3 x 3          144       1.5119  1.4335    1.5628  1.4794    0.6016  0.5213
          4 x 4          256       1.2794  1.0420    1.4818  1.1082    0.5704  0.4512
5 x 5     2 x 2          100       1.9719  1.4584    2.0305  1.5863    0.7256  0.6771
          5 x 5          625       1.5405  0.8185    1.7360  0.7743    0.5669  0.3684
6 x 6     2 x 2          144       2.3122  1.8580    2.3364  1.9118    0.0877  0.7383
          4 x 4          576       1.8498  0.0 1)    1.8691  1.2220    0.6462  0.4294
8 x 8     2 x 2          256       1.9618  1.8442    2.7034  1.9943    0.9399  0.7931
          3 x 3          576       1.6553  1.3441    2.2810  2.2528    0.8329  0.5641
1) The variance was computed to be negative and has therefore been set equal to zero.
random and systematic sampling.
This result should be a warning against
correlation functions with monotone increasing or decreasing tails.
The
results also indicate that the assumption of zero correlations for large
distances may be appropriate.
The correspondence between true and expected
variances for simple and stratified random sampling is clearly
seen from the tables, but the correspondence is erratic for systematic
sampling.
The behavior of the expected systematic sampling variance is
regular in that the variance decreases with increasing sample size except for sample size 3 x 3 on photograph 200.
The expected systematic
sampling variance is always smaller than random sampling variances
except for the smallest sample sizes on photograph 33.
The results of cluster sampling given in Table 4.5 show somewhat
the same pattern as sampling of single elements.
The expected variances
are again larger than the true variances of Table 4.2 in the case of
photograph 34, but the differences become smaller for the larger systematic
samples.
The behavior of the expected systematic cluster sampling
variance is regular in that the variance decreases with increasing
sample size and increases with cluster size for the same effective
sample size.
Although the expected systematic sampling variance does
not follow the true systematic sampling variance, for many sample sizes
it is reasonably close, even for large clusters, considering that the
formula used (2.4.8) is an approximation based on the assumption of
small clusters at large intervals.
The observations made above should not be surprising since correlations are small.
The results presented in Table 4.4 are disappointing
since no relationship could be established between behavior of the
expected systematic sampling variance and the correlogram.
In spite
of this it is encouraging that the expected variances and the true
variances for most realizations are of the same size, although possible
exceptions such as photograph 34 should be kept in mind.
For the index
map where larger areas than that for the lattices on a photograph are
considered, it is comforting to know that the correlogram may be almost
flat and that an exponential model can be fitted easily.
We would then
come to the same conclusion as Matern (1960) that the correlogram for
land use data is exponential.
In our case the correlogram decreases
much faster than in the examples given by Matern.
There is thus little
advantage in choosing systematic sampling above random sampling when
concerned with precision for sufficiently coarse sample lattices.
On
the other hand, on the average it should be reasonable to evaluate
systematic sampling by means of simple random sampling formulae under
such conditions; i.e., simple random sampling variance formulae can be
used if we are certain that the concerned portion of the correlogram is
indeed flat.
It is reasonable to question the validity of fitting the correlation
functions as described in Chapter 3.
For this reason the calculations
described above have been repeated using correlations averaged over
short intervals instead of correlations obtained from fitted correlation functions.
The results are presented in Tables 4.6 and 4.7.
The results are somewhat in between the true variances of Tables 4.1 and 4.2
and the expected variances of Tables 4.4 and 4.5.
The variances
Table 4.6  Expected variances times sample size for simple random (SRS), stratified random (ST)
and systematic (SY) sampling based on correlations averaged over short intervals for some aerial
photographs of the Lake Michie watershed
Photograph  Sample   Size of sample in terms of a compact cluster
number      design   2 x 2   3 x 3   4 x 4   5 x 5   6 x 6   7 x 7   8 x 8   10 x 10 12 x 12 16 x 16 24 x 24
85          SRS      0.2480  0.2474  0.2467  0.2459  0.2445  0.2433  0.2415  0.2385  0.2329  0.2208  0.1863
            ST       0.2448  0.2332  0.2212  0.2117  0.1981  0.1894  0.1789  0.1655  0.1478  0.1229  0.0865
            SY       0.2400  0.2235  0.1927  0.1893  0.1359  0.1640  0.1737  0.1161  0.1004  0.1193  0.0430
33          SRS      0.2473  0.2468  0.2460  0.2454  0.2439  0.2428  0.2409  0.2380  0.2323  0.2202  0.1858
            ST       0.2450  0.2406  0.2305  0.2211  0.2073  0.1981  0.1868  0.1729  0.1554  0.1313  0.0942
            SY       0.2477  0.2257  0.2245  0.1817  0.1568  0.1470  0.1122  0.1066  0.0594  0.1166  0.0702
34          SRS      0.2481  0.2476  0.2468  0.2460  0.2446  0.2435  0.2416  0.2386  0.2330  0.2209  0.1864
            ST       0.2348  0.2195  0.2077  0.1997  0.1887  0.1815  0.1729  0.1620  0.1471  0.1254  0.0906
            SY       0.2060  0.1777  0.1993  0.1836  0.1573  0.1751  0.1637  0.1792  0.0611  0.1062  0.0185
197         SRS      0.2474  0.2469  0.2461  0.2456  0.2440  0.2429  0.2410  0.2381  0.2324  0.2203  0.1859
            ST       0.2406  0.2298  0.2194  0.2113  0.1994  0.1910  0.1803  0.1666  0.1482  0.1229  0.0848
            SY       0.2319  0.2312  0.1799  0.1993  0.1727  0.1243  0.1534  0.1068  0.1081  0.1250  0.0142
200         SRS      0.2425  0.2420  0.2413  0.2409  0.2392  0.2382  0.2362  0.2336  0.2278  0.2160  0.1822
            ST       0.2336  0.2265  0.2182  0.2099  0.1973  0.1888  0.1782  0.1645  0.1463  0.1212  0.0846
            SY       0.2225  0.2324  0.1957  0.1681  0.1554  0.1629  0.1567  0.1697  0.0701  0.1558  0.0207
INDEX       SRS      0.2439  0.2434  0.2426  0.2420  0.2405  0.2394  0.2375  0.2347  0.2290  0.2172  0.1832
            ST       0.2409  0.2382  0.2351  0.2323  0.2280  0.2248  0.2202  0.2135  0.2032  0.1861  0.1473
            SY       0.2420  0.2307  0.2313  0.2149  0.2342  0.2139  0.2072  0.1693  0.1905  0.1993  0.1029
Table 4.7  Expected variance times effective sample size for random cluster sampling
(σ²clr) and systematic cluster sampling (σ²clsy) based on correlations averaged over
short intervals for some aerial photographs of the Lake Michie watershed
Size of   Systematic   Photograph 85     Photograph 33     Photograph 34
cluster   cluster      σ²clr   σ²clsy    σ²clr   σ²clsy    σ²clr   σ²clsy
          sample
2 x 2     2 x 2        0.6438  0.6138    0.6105  0.6138    0.6278  0.4614
          3 x 3        0.6381  0.5479    0.6051  0.5260    0.6223  0.3484
          4 x 4        0.6303  0.4245    0.5976  0.5209    0.6146  0.4344
          5 x 5        0.6222  0.4109    0.5906  0.3500    0.6068  0.3720
          6 x 6        0.6078  0.1976    0.5763  0.2501    0.5927  0.2667
          8 x 8        0.5762  0.3488    0.5464  0.0720    0.5620  0.2922
          12 x 12      0.4862  0.0556    0.4610  0.0 1)    0.4741  0.0 1)
3 x 3     2 x 2        1.1157  1.0539    1.0350  1.0476    1.0942  0.7254
          4 x 4        1.0625  0.6280    0.9857  0.8387    1.0421  0.6647
4 x 4     2 x 2        1.5741  1.4745    1.4450  1.4764    1.5873  0.9425
          3 x 3        1.5179  1.2107    1.3934  1.1256    1.5306  0.4904
          4 x 4        1.4392  0.7173    1.3212  1.1050    1.4512  0.8346
5 x 5     2 x 2        2.0083  1.8592    1.8154  1.4183    2.0960  1.3606
          5 x 5        1.5690  0.5948    2.0164  0.2210    1.6375  0.5418
6 x 6     2 x 2        2.3805  2.1995    2.0849  2.1901    4.2771  3.4217
          4 x 4        1.9044  0.4959    1.6680  1.3545    3.7684  1.7269
8 x 8     2 x 2        2.9359  2.6778    2.3615  1.9925    3.4968  1.1095
          3 x 3        2.4772  1.6229    2.5846  1.1811    2.9504  0.0 1)
1) The variance was computed to be negative and has therefore been set equal to zero.
Table 4.7 (continued)
Size of   Systematic   Photograph 197    Photograph 200    Index
cluster   cluster      σ²clr   σ²clsy    σ²clr   σ²clsy    σ²clr   σ²clsy
          sample
2 x 2     2 x 2        0.6485  0.5886    0.6298  0.5515    0.3855  0.3788
          3 x 3        0.6429  0.5859    0.6243  0.5912    0.3822  0.3334
          4 x 4        0.6349  0.3806    0.6166  0.4444    0.3775  0.3360
          5 x 5        0.6276  0.4583    0.6103  0.3341    0.3732  0.2704
          6 x 6        0.6123  0.3518    0.5946  0.2833    0.3640  0.3474
          8 x 8        0.5805  0.2746    0.5637  0.2886    0.3451  0.2396
          12 x 12      0.4898  0.0935    0.4756  0.0 1)    0.2912  0.1728
3 x 3     2 x 2        1.1238  1.0771    1.0824  0.9118    0.5170  0.5033
          4 x 4        1.0703  0.8224    1.0309  0.6708    0.4924  0.4072
4 x 4     2 x 2        1.5588  1.3389    1.5122  1.2184    0.6420  0.6206
          3 x 3        1.5031  1.3280    1.4582  1.3772    0.6191  0.4392
          4 x 4        1.4252  0.5069    1.3826  0.7900    0.5870  0.4496
5 x 5     2 x 2        1.9719  1.4584    1.9087  1.4739    0.7482  0.6507
          5 x 5        1.5405  0.8185    1.4912  0.0896    0.5845  0.0358
6 x 6     2 x 2        2.3122  1.8580    2.2173  1.5941    0.8217  0.7835
          4 x 4        1.8498  0.0 1)    1.7738  0.6302    0.6573  0.3989
8 x 8     2 x 2        2.8319  2.0843    2.6663  1.6114    0.9458  0.7981
          3 x 3        2.3894  2.0406    2.2497  2.2465    0.8937  0.1684
1) The variance was computed to be negative and has therefore been set equal to zero.
based on average correlations are thus within the same range as the
true variances.
The systematic sampling variance is almost always
smaller than random sampling variances, but shows some deviations from
the downward trend when the sample size increases.
We may conclude
from these results that the right choice of model and parameters provides
expected variances which will be of about the same order as the true
variances.
The small dips in the trend of the systematic sampling
variance, e.g. sample sizes of 6 x 6 for photographs 85 and 34, and
10 x 10 for the index map, have been observed before, but in the case
of photograph 33 the variance decreases almost regularly with increasing
sample size, which is unexpected in the face of the periodicity in the
correlogram.
It should also be noted that variances turn out to be
negative more often than in the case of the use of fitted correlation
functions.
In such instances the variances have been reported to be
zero in the tables.
In his study of spatial variation Matern (1960, p. 124) reports
on the variance of the mean of observations made along linear sampling
units or tracts placed randomly on an area.
Matern assumes an isotropic
stationary process with an exponential correlogram.
He calculates
variances based on an empirical correlogram for the distribution of
forest land in a Swedish province.
The types of tracts are line, circle, square and rectangle.
Matern demonstrates that variances of these types
are increasing in that order.
He remarks that the variances are directly
applicable for unrestricted random sampling of a large area and that the
variances can be used to discriminate between different types of figures
when employed in other designs.
Instead of a tract we have considered two types of configurations
of n systematically arranged points, i.e. n equally spaced points on a
line or n points in a square lattice where n is a perfect square.
These configurations are to be placed randomly on the photograph.
The variance of the mean of the observations at the points, defined by
$E(\bar{x} - \mu)^2$, is given by
$$\sigma^2_{\bar{x}} = \frac{\sigma^2}{n}\,\bigl(1 + (n - 1)\bar{\rho}\bigr), \qquad (4.3.1)$$
where $\bar{\rho}$ is the average autocorrelation between points in the configuration.
Computations have been made for configurations with increasing distances between adjacent points.
The distances are multiples of 0.1 inch
or 0.03156 mile ground distance.
The correlations have been obtained
from the fitted correlation functions for photographs 33 and 200 given
in Chapter 3.
For large distances and large n the differences in
variance within lines and squares and between lines and squares are
negligible and vary little due to the small correlations inherent to
our data.
Since differences can be best observed for a reasonably large
sequence, results have been plotted for n = 4 and n = 9 in Figures 4.2
and 4.3, respectively.
Instead of variances, coefficients of variation
have been plotted.
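As a rough illustration of equation (4.3.1), the following sketch evaluates the variance of the mean and the corresponding coefficient of variation for n = 4 points on a line and in a 2 x 2 lattice. The exponential correlation function `rho_exp`, its range parameter `a` and the proportion `p` are assumptions made only for this example; they are not the correlation functions fitted in Chapter 3, and the configurations need at least two points.

```python
import math
from itertools import combinations


def average_correlation(points, rho):
    """Average of rho(distance) over all distinct pairs of points."""
    pairs = list(combinations(points, 2))
    return sum(rho(math.dist(a, b)) for a, b in pairs) / len(pairs)


def variance_of_mean(points, rho, sigma2):
    """Equation (4.3.1): sigma2 / n * (1 + (n - 1) * average correlation)."""
    n = len(points)
    return sigma2 / n * (1 + (n - 1) * average_correlation(points, rho))


def line(n, d):
    """n equally spaced points on a line, distance d between neighbours."""
    return [(i * d, 0.0) for i in range(n)]


def square_lattice(k, d):
    """k x k square lattice, distance d between adjacent points."""
    return [(i * d, j * d) for i in range(k) for j in range(k)]


def rho_exp(h, a=0.2):
    """Hypothetical exponential correlation function (assumed, not fitted)."""
    return math.exp(-h / a)


def coefficient_of_variation(points, rho, p):
    """Coefficient of variation (percent) of the mean of a 0,1 variable."""
    return 100 * math.sqrt(variance_of_mean(points, rho, p * (1 - p))) / p


if __name__ == "__main__":
    p = 0.5  # assumed proportion of the land use class
    for tenths in range(1, 22, 4):
        d = tenths / 10  # distance between adjacent points in inches
        print(d,
              round(coefficient_of_variation(line(4, d), rho_exp, p), 2),
              round(coefficient_of_variation(square_lattice(2, d), rho_exp, p), 2))
```

With a purely decreasing correlation function such as the assumed one, the compact lattice gives the larger variance at equal spacing because its points lie closer together on average; a periodic component in the correlation function can change that ordering.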
[Figure 4.2: curves for photographs 33 and 200, n = 2 x 2; horizontal axis: distance between adjacent points in inches (0.1 to 2.1); vertical axis scaled from 42 to 56]
Figure 4.2  The expected coefficient of variation for the mean of observations made at the same number of points equally spaced along a line or in a square lattice if randomly placed on photographs 33 and 200
[Figure 4.3: curves for photographs 33 and 200, n = 3 x 3; horizontal axis: distance between adjacent points in inches (0.1 to 2.1); vertical axis scaled from 26.5 to 40]
Figure 4.3  The expected coefficient of variation for the mean of observations made at the same number of points equally spaced on a line or in a square lattice if randomly placed on photographs 33 and 200
Where earlier it was difficult to associate the behavior of the
systematic sampling variance with characteristics in the correlogram,
more can be said for these new plots.
The curves in Figures 4.2 and
4.3 are smoother for lattices than for points on a line over the range
observed.
The expected coefficient of variation for lines is not always
smaller than that for lattices.
If adjacent points are at a distance
equal to the length of the full period of the correlation function, the
coefficient of variation shows a peak at that distance for points on a
line and at a slightly smaller distance for points in a lattice.
Such
a correspondence is not obtained for half periods where one might
anticipate the expected coefficient of variation should show a minimum.
This does not happen since the correlograms are not perfectly periodic
but are exponential in character for small distances.
The minima in
the plots of expected coefficient of variation for lattices in the case
of photograph 33 are quite conspicuous and occur when the distance between adjacent points is about two thirds of half the period.
Such a
feature cannot be observed in the case of photograph 200.
Globally we may say that for large distances the results for points
on a line and in a lattice show the same trend.
Since the correlograms
are exponentially decreasing for small lags and then extend into rapidly
damped oscillations, it may not be surprising to observe that the expected coefficient of variation decreases sharply in the beginning and
then continues into a quite variable region finally leveling out for
the larger distances.
The above procedure is perhaps helpful in determining
the best sample configuration for a given sample size when the
correlogram shows signs of periodicity or has negative stretches.
In
case of cluster sampling the best arrangement of points in a cluster
and the best distance between adjacent points could be determined this
way.
A necessary condition is that the correlation between clusters
can be approximated by the correlation between cluster centers.
This
assumption was also made in Chapter 2 and is suggested by Matern (1960,
p. 125) as well.
5. SUMMARY AND CONCLUSIONS
5.1 Summary
The area of a certain land use class can be expressed as a proportion
of the total area of a geographical region.
This proportion can be
estimated by observing the presence or absence of the land use at a
number of selected points distributed over one or more aerial photographs.
The distribution of points can be laid down according to schemes such
as random sampling, systematic sampling and cluster sampling.
The precision of these sampling schemes has been considered and in particular
that of systematic sampling schemes.
The need for the correlogram has been demonstrated by variance
formulae for random and systematic sampling given by Cochran (1946),
Quenouille (1949), Das (1950) and Zubrzycki (1958) and by variance
formulae for cluster sampling given here.
To apply these results it
has been assumed that the occurrence of land use on aerial photographs
is a realization of a wide sense stationary, isotropic, stochastic process.
Autocorrelations have been defined for 2-valued, 2-dimensional
stochastic processes and three different correlation estimators, r_1, r_2
and r_3, have been discussed in the case of sample data from a given
realization.
It has been shown that correlation estimates, denoted by r_1,
computed from pairs of points taken once, are greater than or equal
to correlation estimates, denoted by r_2, computed from pairs of points
taken twice, but the second time in reverse order.
It has been shown
that correlation estimates must be negative for some lags if end effects
are neglected, and that, even for small lags, approximations of correlation estimators in the 2-dimensional case lead to correlation
estimates closer to zero than the true value.
Further, it has been
demonstrated that correlation estimates, r_1, computed from full lattices
will differ from correlation estimates, r_1, computed after a 90°
rotation of the lattice if the condition of isotropy does not hold.
This is not the case for the correlation estimator r_2.
This estimator
will not lead to detection of non isotropic conditions.
If sets of
sample points are observed in a random order, r_1 will also fail to
detect non isotropy.
For the smaller lags and under isotropic
conditions the three correlation estimators have been found to differ
very little.
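The difference between the two ways of taking pairs can be illustrated with a small sketch. It follows only one reading of the verbal description in this summary: r_1 is taken as the product-moment correlation over pairs taken once, and r_2 as the same correlation after every pair has been added a second time in reverse order. The exact estimators are those of Chapter 3, r_3 is not sketched, and all function names are hypothetical.

```python
import math


def lag_pairs(grid, dr, dc):
    """All pairs (x(u), x(u + h)) for the lag h = (dr, dc) that fit inside the grid."""
    rows, cols = len(grid), len(grid[0])
    return [(grid[i][j], grid[i + dr][j + dc])
            for i in range(max(0, -dr), min(rows, rows - dr))
            for j in range(max(0, -dc), min(cols, cols - dc))]


def pearson(pairs):
    """Product-moment correlation between first and second members of the pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant member: correlation undefined, treated as 0 here
    return sxy / math.sqrt(sxx * syy)


def r1(grid, dr, dc):
    """Assumed reading of r_1: every pair taken once."""
    return pearson(lag_pairs(grid, dr, dc))


def r2(grid, dr, dc):
    """Assumed reading of r_2: every pair taken twice, the second time reversed."""
    pairs = lag_pairs(grid, dr, dc)
    return pearson(pairs + [(y, x) for x, y in pairs])


if __name__ == "__main__":
    # Small hypothetical 0,1 field whose marginal proportions differ between
    # the first and the second members of the horizontal lag-1 pairs.
    grid = [[1, 1, 1, 0],
            [1, 1, 0, 0],
            [1, 0, 0, 0]]
    print(r1(grid, 0, 1), r2(grid, 0, 1))
```

In this small example r_1 comes out at 0.5 and r_2 at 1/3, in line with the inequality stated above. Under this reading r_2 is unchanged when the direction of the lag is reversed, since the doubled set of pairs is then the same.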
It has been shown that when random samples of pairs of points are
taken at a given distance, the approximated variance of r_1 equals the
approximated variance of r_2 and coincides with the formula given by
Yule (1912).
The derivations have been given using a result from
Fisher (1967) concerning functions of sample data from the multinomial
distribution.
It has been shown that the approximated bias of r_1 equals
zero and is always greater than or equal to the approximated bias of
r_2 if the true correlation is zero.
Under this last condition n r_1^2 will
be asymptotically distributed as χ² with one degree of freedom.
Using
data obtained by a random number generator the 5% significance level
has been checked.
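A check of this kind can be sketched as follows. The sketch is not the procedure used in the study: the number of pairs, the proportion `p`, the number of replications and the seed are arbitrary assumptions, and the correlation is the ordinary product-moment correlation of 0,1 pairs, for which n r^2 equals the uncorrected χ² statistic of the 2 x 2 table. The critical value 3.841 is the upper 5% point of χ² with one degree of freedom.

```python
import math
import random


def pearson_r_binary(pairs):
    """Product-moment correlation between the two members of 0/1 pairs."""
    n = len(pairs)
    n1 = sum(x for x, _ in pairs)        # ones among the first members
    n2 = sum(y for _, y in pairs)        # ones among the second members
    n11 = sum(x * y for x, y in pairs)   # pairs in which both members equal 1
    denom = n1 * (n - n1) * n2 * (n - n2)
    if denom == 0:
        return 0.0  # a constant member: correlation undefined, treated as 0 here
    return (n * n11 - n1 * n2) / math.sqrt(denom)


def rejection_rate(n_pairs, p, replications, rng, critical=3.841):
    """Share of replications in which n * r^2 exceeds the critical value."""
    rejected = 0
    for _ in range(replications):
        # Independent members, so the true correlation is zero.
        pairs = [(int(rng.random() < p), int(rng.random() < p))
                 for _ in range(n_pairs)]
        r = pearson_r_binary(pairs)
        if n_pairs * r * r > critical:
            rejected += 1
    return rejected / replications


if __name__ == "__main__":
    rng = random.Random(773)  # arbitrary seed
    print(rejection_rate(n_pairs=400, p=0.5, replications=1000, rng=rng))
```

If the asymptotic distribution is adequate for the chosen number of pairs, the printed share should lie near 0.05, up to simulation error.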
Correlograms, based on r_1, of the distribution of forest on the
center part of several aerial photographs and on a part of an index map
have been computed.
Only for one photograph have correlograms of the
distribution of other classes been obtained.
Trends in the correlograms
of the distribution of forest have been estimated by a non linear
estimation procedure given by Marquardt (1966).
For some photographs
a non linear model with a periodic component has been fitted, but for
most other photographs only a combination of exponentials has resulted
in convergence.
The correlogram for part of the index map, covering a
much larger area than a single photograph, has been found to be
virtually convex, positive and steeply falling.
For most individual
photographs negative correlations occur in the less variable part of
the correlogram.
The location of the negative correlations has been
found to differ from photograph to photograph.
Using the uncorrected
χ² test it has been shown that more than 5% of the negative correlations
are statistically significant.
For three photographs correlograms have been computed based on
r_1, r_2 and r_3, and trends non linearly estimated for several sets of
points randomly selected without replacement from the full lattice.
It has been determined that lattices consisting of 16 to 25 points per
square inch are sufficient to infer an adequate model and to obtain
parameter estimates under convergence.
Differences in the non linear
estimation results for r_1, r_2 and r_3 have been found negligible for
random sets of points. The parameter estimates have indicated that in
case of one photograph the condition of isotropy does not hold since
deviating results, based on the r_1 correlogram computed from a full
lattice, have been obtained.
True and expected variances have been computed for several sampling
schemes and sampling intensities.
Expressing precision in terms of a
correlogram has shown that deviations from the general trend are responsible for the quasi-periodicity in the behavior of the systematic
sampling variance.
Although correlograms for individual aerial photographs
have little practical meaning, for the purpose of the exercise it
has been attempted to relate sampling results to features in the fitted
correlation function.
It has been investigated whether precision of
2-dimensional systematic sampling depends clearly on periodicity in the
isotropic correlation function if based on an assumed model or based on
average correlations within short intervals.
It has been seen that
realistic correlation functions provide expected variances of the same
order of magnitude as actual, realized variances.
Variances of randomly placed configurations of points have been
evaluated from the correlation functions fitted for some photographs.
A relationship between the variance of such a configuration and the
correlation function has been shown most obviously for configurations
with few points.
It has been demonstrated that if the number of points
is small, lattices are to be preferred above lines if the correlation
function has a periodic component.
5.2 Conclusions
In Chapter 1 specific objectives of this study were detailed.
We now address ourselves to these objectives.
Since the correlogram has
been the key feature, we will base any recommendation with respect to
the Lake Michie watershed on the correlogram of the distribution of
land use classes over the whole watershed.
Since we have found the
correlogram of the distribution of forest on a sizable part of the
watershed we accept it for the total area.
Only in the case of one
photograph have we obtained correlograms of the distribution of roads,
ponds, residential area and other.
The correlograms of the distribution
of the first three classes are mainly at the zero level except for very
small lags.
The correlogram of the distribution of the class identified
as other is practically identical to that of the distribution of forest.
Although the evidence is rather scanty, the shape of these correlograms
is not entirely unexpected and if we accept these to be valid for the
whole watershed, we may make the following statements:
1. To sample for proportions of forest and cultivated land the
systematic sampling design is theoretically the best design.
Since we have not included any relationship between costs of
handling photographs, costs of taking observations on a
photograph and the total costs we cannot really judge the
advantage of cluster sampling.
If costs are not considered
then cluster sampling is of course less precise than sampling
of single elements, but less so in the case of sampling for
proportions of roads, ponds and residential area.
For these
three classes the precision will be about the same for random
and systematic sampling schemes.
We therefore recommend a
systematic sample of single elements over the whole area to
estimate proportions of land use classes under our classification scheme.
If cluster sampling is still to be used a
reduction in sampling error may be obtained by sampling
clusters systematically and by subsampling clusters by square
lattices since isotropic correlograms for single photographs
show periodic components or negative dips.
For other results
on the shape of lattices we have already referred in
Chapter 2 to the work of Matern.
2. Since it has not been possible to investigate correlation
patterns of all land use classes we have limited ourselves to
the study of correlograms of the distribution of forest.
A
satisfactory explanation for the occurrence of negative dips
in the correlograms for individual photographs has not been
found.
The answer may be that patchiness, land uses occupying
small, disconnected areas, will create negativity.
Correlograms
for simulations of such patchiness, which have not been
reported on here, have shown a negative dip.
The simulations
consisted of a field of squares in which smaller squares were
randomly located.
Such a patchiness could be created by road
patterns.
Since the correlograms differ for each photograph
we conclude that one photograph alone is not large enough to
infer the characteristics of the entire process responsible
for the occurrence of forest.
Negative correlations are often
found for small areas but in the case of a large area negative
correlations are weaker due to irregularities in the land use
pattern.
In this sense the process is not stationary from
photograph to photograph.
In this study we have not obtained
information about the size of the area for which the correlogram will be stable.
There is evidence that the assumption of isotropy for the
Lake Michie watershed is in general reasonable.
The use of r_1
is recommended to investigate isotropic conditions in sample
data collected by means of lattices.
The approximated unbiasedness
of r_1 and the possibility of detecting non isotropy
are to be preferred above the invariance of r_2 with respect to
rotation of the lattice.
Further study of sampling for land use may be more
profitable when instead of a correlogram its spectral representation is computed from the data as a tool in determining
possible combinations of periodicities in either all or some
directions.
3. The variance of systematic sampling for proportions of land
use from lattices on individual photographs will show
irregularities if the elements of the whole lattice form a
finite population of constants.
These irregularities, referred
to in the literature as quasi-periodicity, are not very substantial when taken absolutely.
Quasi-periodicity is due to
deviations from the trend in the correlogram.
It has, however,
not been possible to relate our sampling results to definite
characteristics in the correlograms.
There is evidence that
for larger areas than a single photograph these variations in
the decrease of the variance of systematic sampling with increasing sample size will be less.
In the case of expected
variances the fitted correlation function smoothes out the
effect of quasi-periodicity.
4. Since in the case of the Lake Michie watershed we have assumed
a positive, convex correlation function, decreasing rapidly to
zero, of the distribution of any land use class, the
precision of random and systematic sampling schemes will be
about equal.
The evaluation of the precision of systematic
sampling by simple random sampling formulae is on the average
justified provided lattices are not very dense.
That is, if
one would be satisfied with an expected precision of a
systematic sample and if it is certain that the trend in the
correlogram is convex, positive and rapidly falling to zero,
the estimated variance of the sample mean evaluated by means
of simple random sampling formulae is a reasonable estimate
of the expected precision.
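The evaluation mentioned in statement 4 can be sketched as follows for a dot count. The estimator is the usual simple random sampling estimator of the variance of a sample proportion, p(1 - p)/(n - 1), with an optional finite population correction; the dot counts in the example are hypothetical, and the function name is an assumption of this sketch.

```python
import math


def srs_variance_of_proportion(hits, n, population_size=None):
    """Estimated variance of a sample proportion under simple random sampling.

    Uses p(1 - p) / (n - 1), multiplied by the finite population
    correction (1 - n / N) when a population size N is supplied.
    """
    p = hits / n
    variance = p * (1 - p) / (n - 1)
    if population_size is not None:
        variance *= 1 - n / population_size
    return variance


if __name__ == "__main__":
    # Hypothetical systematic dot sample: 61 of 144 dots fall on forest.
    hits, n = 61, 144
    v = srs_variance_of_proportion(hits, n)
    print(hits / n, math.sqrt(v))
```

Under the conditions stated in statement 4 the printed standard error would serve as a reasonable approximation to the expected precision of the systematic sample; for dense lattices or a correlogram that is not flat it may be misleading.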
For further study of sampling for land use the study of patterns
of the distribution of land use area is recommended by means of isotropic
correlograms when size and shape of the cells of the lattice are changed.
Computer simulations of occurrence of land uses under isotropic conditions
are recommended to investigate the possibility of negative correlations
and express these in an analytic form.
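A simulation of the kind recommended here might start from the following sketch. It is not the simulation referred to in the conclusions: the field is divided into square cells, one smaller square patch is placed at random inside every cell, and the correlogram is computed along rows only. Cell size, patch size and seed are arbitrary assumptions, and the patch is assumed to be smaller than the cell.

```python
import random


def patchy_field(cells, cell_size, patch_size, rng):
    """0/1 field with one square patch of ones placed at random in every cell."""
    side = cells * cell_size
    field = [[0] * side for _ in range(side)]
    for ci in range(cells):
        for cj in range(cells):
            top = ci * cell_size + rng.randrange(cell_size - patch_size + 1)
            left = cj * cell_size + rng.randrange(cell_size - patch_size + 1)
            for i in range(top, top + patch_size):
                for j in range(left, left + patch_size):
                    field[i][j] = 1
    return field


def row_correlogram(field, max_lag):
    """Autocorrelations along rows for lags 0..max_lag.

    Products of deviations from the overall mean are averaged over all pairs
    at a given lag and divided by the overall variance of the 0/1 field.
    """
    side = len(field)
    mean = sum(map(sum, field)) / (side * side)
    var = mean * (1 - mean)
    result = []
    for h in range(max_lag + 1):
        s = 0.0
        count = 0
        for row in field:
            for j in range(side - h):
                s += (row[j] - mean) * (row[j + h] - mean)
                count += 1
        result.append(s / count / var)
    return result


if __name__ == "__main__":
    rng = random.Random(1971)  # arbitrary seed
    field = patchy_field(cells=6, cell_size=10, patch_size=5, rng=rng)
    for lag, r in enumerate(row_correlogram(field, 12)):
        print(lag, round(r, 3))
```

Whether a negative dip appears, and at which lag, depends on the assumed cell and patch sizes; varying them is one way to explore the patchiness explanation suggested in statement 2.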
147
6.
LIST OF REFERENCES
Aldrich, R. C. 1955. A method of plotting a dot grid on aerial
photographs of mountainous terrain. Jour. For. 53~910-913.
Aldrich, R. C. 1966. Forestry applications of 70 mm color.
from Photogrammetric Engineering.
Anonymous. 1970.
For. 68:500.
Electronic device "reads" maps and charts.
Reprint
Jour.
Avery, T. E. 1966. Forester's guide to aerial photo interpretation.
U.S.D.A. Forest Service. Agricultural Handbook No. 308. 40 pp.
Barrett, J. P. and J. S. Philbrook. 1970. Dot grid area estimates:
Precision by repeated trials. Jour. For. 68:149-151.
Chrystal, G. 1961. Algebra, an Elementary Text-book, Part 2.
PUblications, Inc., New York.
Dover
Cochran, W. G. 1946. Relative accuracy of systematic and stratified
random samples for a certain class of populations. Ann. Math.
Stat. 17:164-177.
1963.
Cochran, W. G.
New York.
Sampling Techniques.
John Wiley and Sons, Inc.,
Das, A. c. 1950. Two dimensional systematic sampling and the associated
stratified and random sampling. Sankhya 10~95-108.
Finney, D. J. 1949. Random and systematic sampling in timber surveys.
Forestry 22:64-99.
Fisher, R. A. 1969. Statistical Methods for Research Workers, Hafner
Publishing Company, Inc., New York.
Gautschi, W. 1957. Some remarks on systematic sampling.
Stat. 28:385-394.
Ann. Math.
Ghosh, B. 1943. On the distribution of random distances in a rectangle.
Science and Culture 8:388.
Ghosh, B. 1951.
rectangles.
Haynes, J.D.
an area.
Random distances within a rectangle and between two
Bull. Calcutta Math. Soc. 43:17-24.
1948. An empirical investigation of sampling methods for
M.S. Thesis, University of North Carolina. 44 pp.
148
Jenkins, G. M. and D. G. Watts. 1968. Spectral Analysis and its
Applications. Holden-Day, San Francisco.
Kendall, M. G. and A. Stuart. 1969. The Advanced Theory of Statistics.
Vol. 1. Hafner Publishing Company, Inc., New York.
Kendall, M. G. and A. Stuart .1969. The Advanced Theory of Statistics.
Vol. 2. Hafner Publishing Company, Inc., New York.
Kendall, M. G. and A. Stuart. 1969. The Advanced Theory of Statistics.
Vol. 3. Hafner Publishing Company, Inc., New York.
Madow, L. H. 1946. Systematic sampling and its relation to other
sampling designs. Jour. Amer. Stat. Assoc. 41:207-214.
Madow, W. G. and L. H. Madow. 1944. On the theory of systematic
sampling, I. Ann. Math. Stat. 15:1-24.
Madow, W. G. 1949. On the theory of systematic sampling, II. Ann.
Math. Stat. 19:333-353.
Madow, W. G. 1953. On the theory of systematic sampling, III.
Comparison of centered and random start systematic sampling.
Ann. Math. Stat. 23:101-106.
Marquardt, D. W. 1966. Least squares estimation of non linear
parameters, a computer program in Fortran IV language. IBM SHARE
Library, Distribution number 309401.
Matern, B. 1947. Methods of estimating the accuracy of line and
sample plot surveys. Medd. fr. Statens Skogsforsknings Institut
36:1-138.
Matern, B. 1960. Spatial variation. Medd. fr. Statens Skogsforsknings
Institut 49:1-144.
Moessner, K. E. 1957. How important is relief in area estimates from
dot sampling on aerial photos? U.S. Forest Service. Intermountain
Forest and Range Expt. Sta. Res. Paper 42. 16 pp.
Moran, P. A. P. 1968. Statistical theory of a high-speed photoelectric
planimeter. Biometrika 55:419-422.
Moran, P. A. P. 1969. A second note on recent research in geometrical
probability. Adv. Appl. Prob. 1:73-89.
Osborne, J. G. 1942. Sampling errors of systematic and random surveys
of cover-type areas. Jour. Amer. Stat. Assoc. 37:256-264.
Pielou, E. C. 1969. An Introduction to Mathematical Ecology.
Wiley-Interscience, A Division of John Wiley and Sons, New York.
Quenouille, M. H. 1949. Problems in plane sampling. Ann. Math. Stat.
20:355-375.
Russell, A. M. 1956. Statistical approach to spatial measurement.
Amer. Jour. of Physics 24:562-567.
Weber, F. P. 1965. Aerial volume table for estimating cubic foot
losses of white spruce and balsam fir in Minnesota. Jour. For.
63:25-29.
Wilson, R. C. 1949. The relief displacement factor in forest area
estimates by dot templates on aerial photographs. Photogrammetric
Engin. 15:225-236.
Wittgenstein, L. S. 1966. A statistical test of systematic sampling
in forest surveys. Ph.D. Thesis, Yale University, 104 pp.
Wold, H. 1956. A Study in the Analysis of Stationary Time Series.
Almqvist and Wiksell, Stockholm.
Yaglom, A. M. 1965. An Introduction to the Theory of Stationary
Random Functions. Prentice-Hall, Inc., Englewood Cliffs, New Jersey.
Yates, F. 1960. Sampling Methods for Censuses and Surveys. Griffin
and Company, London, England.
Yule, G. U. 1912. On the methods of measuring association between
two attributes. J. R. Statist. Soc. 75:579-633.
Zubrzycki, S. 1958. Remarks on random, stratified and systematic
sampling in a plane. Colloquium Mathematicum 6:251-264.
7.
APPENDIX
Systematic sampling variance may also be obtained from an Analysis
of Variance table.
Suppose the N observations have two subscripts as $X_{ij}$, for which
we can write the following identity:

\[ X_{ij} = \bar{X}_{..} + (\bar{X}_{i.} - \bar{X}_{..}) + (X_{ij} - \bar{X}_{i.}) \]

where $i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n$ and $kn = N$. $X_{ij}$ is
the ith unit in the jth stratum or the jth unit in the ith cluster (sample).
In case of a stratum the population units are successively contiguous.
In case of clusters the successive units are k units apart. All
clusters and all strata are of size n and k respectively and there are
n strata and k clusters.
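Thus putting the $X_{ij}$ in an array we get

\[
\begin{array}{c|cccc}
 & \multicolumn{4}{c}{\text{Systematic sample no.}} \\
\text{Stratum no.} & 1 & 2 & \cdots & k \\
\hline
1 & X_{11} & X_{21} & \cdots & X_{k1} \\
2 & X_{12} & X_{22} & \cdots & X_{k2} \\
\vdots & \vdots & \vdots & & \vdots \\
n & X_{1n} & X_{2n} & \cdots & X_{kn}
\end{array}
\]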
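The array layout can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `systematic_array` and the twelve-unit example population are hypothetical, and indices are 0-based rather than the 1-based subscripts used in the text.

```python
import numpy as np

def systematic_array(population, k):
    """Arrange N = k*n values so that row i is the i-th systematic sample.

    Entry [i, j] is the i-th unit of stratum j (a run of k contiguous units)
    and, equally, the j-th unit of systematic sample i (units i, i+k, i+2k, ...).
    """
    x = np.asarray(population, dtype=float)
    if x.size % k != 0:
        raise ValueError("N must be a multiple of k")
    n = x.size // k
    # reshape(n, k) puts one stratum of k contiguous units in each row;
    # transposing turns each row into one systematic sample of n units.
    return x.reshape(n, k).T

population = np.arange(12)            # hypothetical population, N = 12
X = systematic_array(population, k=3)
print(X)                              # rows: samples; columns: strata
```

Each row then holds units that are k apart in the original ordering, and each column holds k contiguous units, matching the cluster and stratum descriptions.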
The elements in the population can be constants. We need not
assume anything about the population, but we define:

\[ \frac{\sum_{i=1}^{k} (\bar{X}_{i.} - \bar{X}_{..})^2}{k-1} = S_k^2 = \text{the between sample (cluster) variance} \]

\[ \frac{\sum_{i=1}^{k} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{i.})^2}{k(n-1)} = S_n^2 = \text{the within sample variance} \]

\[ \frac{\sum_{i=1}^{k} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{..})^2}{nk-1} = S^2 = \text{the population variance} \]

\[ \frac{\sum_{i=1}^{k} \sum_{t \neq s} (X_{it} - \bar{X}_{..})(X_{is} - \bar{X}_{..})}{k(n-1)n} = \rho_k S^2 (N-1)/N \]

\[ \frac{\sum_{i=1}^{k} \sum_{|t-s|=\delta} (X_{it} - \bar{X}_{..})(X_{is} - \bar{X}_{..})}{k(n-\delta)} = \rho_{\delta k} S^2 (N-1)/N \]

where $\delta = 1, 2, \ldots, n-1$.
The analysis of variance for the whole population is then:

Source            df          SS                                                               MS
Between samples   $k-1$       $\sum_{i=1}^{k} n(\bar{X}_{i.} - \bar{X}_{..})^2$                $nS_k^2$ ($= S_b^2$, p. 243, Cochran (1963))
Within samples    $k(n-1)$    $\sum_{i=1}^{k} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{i.})^2$        $S_n^2$
Total             $kn-1$      $\sum_{i=1}^{k} \sum_{j=1}^{n} (X_{ij} - \bar{X}_{..})^2$        $S^2$
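As a numerical check of the analysis of variance table, the following Python sketch computes the three rows for a k by n array whose rows are the systematic samples. The function name `anova_systematic` and the randomly generated 0/1 data are assumptions made for illustration only; they are not data from this study.

```python
import numpy as np

def anova_systematic(X):
    """Return (df, SS, MS) per source for X of shape (k, n), rows = samples."""
    X = np.asarray(X, dtype=float)
    k, n = X.shape
    grand = X.mean()                      # overall mean
    sample_means = X.mean(axis=1)         # one mean per systematic sample
    ss_between = n * np.sum((sample_means - grand) ** 2)
    ss_within = np.sum((X - sample_means[:, None]) ** 2)
    ss_total = np.sum((X - grand) ** 2)
    return {
        "between": (k - 1, ss_between, ss_between / (k - 1)),           # MS = n * S_k^2
        "within": (k * (n - 1), ss_within, ss_within / (k * (n - 1))),  # MS = S_n^2
        "total": (k * n - 1, ss_total, ss_total / (k * n - 1)),         # MS = S^2
    }

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(4, 5))       # hypothetical 0/1 indicators, k=4, n=5
table = anova_systematic(X)
for source, (df, ss, ms) in table.items():
    print(f"{source:8s} df={df:2d} SS={ss:8.4f} MS={ms:8.4f}")
```

The between and within sums of squares add up to the total sum of squares, as do the degrees of freedom.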
The following three equations and conditions are given by Cochran
(1963, p. 209-11):
\[ nk\,V_{sy}(\bar{x}) = (kn-1)S^2 - k(n-1)S_n^2 \]

\[ V_{sy}(\bar{x}) < V_{ran}(\bar{x}) \text{ if and only if } S_n^2 > S^2 \text{ or } \rho_k < -1/(N-1) \]

\[ V_{sy}(\bar{x}) < V_{st}(\bar{x}) \text{ if and only if } \rho_{wst} < 0 \]

where $V_{sy}$ denotes the systematic sampling variance, $V_{ran}$ that of
simple random sampling and $V_{st}$ that of stratified random sampling and

\[ \rho_{wst} = \frac{\sum_{i=1}^{k} \sum_{j<u} (X_{ij} - \bar{X}_{.j})(X_{iu} - \bar{X}_{.u})}{\sum_{j=1}^{n} \sum_{i=1}^{k} (X_{ij} - \bar{X}_{.j})^2} \qquad (2.11) \]
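The first identity and the comparison with simple random sampling can be checked numerically. The Python sketch below is a minimal illustration: the function name `systematic_summary` and the linear-trend population of 20 units are hypothetical choices, and the stratified comparison is not covered.

```python
import numpy as np

def systematic_summary(X):
    """Variance quantities for X of shape (k, n), rows = systematic samples."""
    X = np.asarray(X, dtype=float)
    k, n = X.shape
    N = k * n
    grand = X.mean()
    means = X.mean(axis=1)
    S2 = np.sum((X - grand) ** 2) / (N - 1)                   # population variance
    Sn2 = np.sum((X - means[:, None]) ** 2) / (k * (n - 1))   # within sample variance
    V_sy = np.mean((means - grand) ** 2)   # each of the k samples is equally likely
    V_ran = S2 * (N - n) / (N * n)         # simple random sampling variance
    # Sum over ordered pairs t != s within each sample:
    # (sum_t d_it)^2 - sum_t d_it^2 = sum_{t != s} d_it * d_is
    d = X - grand
    cross = np.sum(d.sum(axis=1) ** 2 - np.sum(d ** 2, axis=1))
    rho_k = cross / (k * n * (n - 1)) * N / ((N - 1) * S2)
    return dict(k=k, n=n, N=N, S2=S2, Sn2=Sn2,
                V_sy=V_sy, V_ran=V_ran, rho_k=rho_k)

# Hypothetical population with a linear trend: N = 20, k = 4, n = 5.
X = np.arange(20).reshape(5, 4).T
r = systematic_summary(X)
lhs = r["N"] * r["V_sy"]
rhs = (r["N"] - 1) * r["S2"] - r["k"] * (r["n"] - 1) * r["Sn2"]
print(lhs, rhs)                                      # both sides of the identity
print(r["V_sy"] < r["V_ran"], r["Sn2"] > r["S2"],
      r["rho_k"] < -1 / (r["N"] - 1))                # the three conditions agree
```

For this trend population the systematic sample is the more precise of the two, and all three equivalent conditions hold together.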
Systematic sampling implies the formation of blocks (the random
selection of one class out of k classes) where the number of elements
per block is determined by the sample size n.
H. Fairfield Smith (1968) proposed an empirical variance law in
case of autocorrelated populations. This law says that the between
block variance is a function of the block size where successive elements
in the block are contiguous, or

\[ S_k^2 = S^2 / n^g \]

where $g$ is to be determined for each particular population. For the
many cases studied by Fairfield Smith $g$ is between 0 and 1. If the law
would be generally true also for blocking of non-contiguous elements then

\[ V_{sy}(\bar{x}) = \frac{k-1}{k} S_k^2 = \frac{k-1}{k} \, S^2 / n^g . \]

It follows then that if $g = 1$,

\[ V_{sy}(\bar{x}) = S^2 (N-n)/Nn = V_{ran}(\bar{x}), \]

if $g > 1$, $V_{sy}(\bar{x}) < V_{ran}(\bar{x})$ and
if $g < 1$, $V_{sy}(\bar{x}) > V_{ran}(\bar{x})$.
In a finite population $g$ may show the same irregularities as may occur
in the correlogram.
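The three cases for g can be illustrated with a small Python sketch. It assumes, as the text does conditionally, that the variance law also holds for blocks of non-contiguous elements; the function names and the values of S2, k and n are hypothetical.

```python
def v_sy_under_law(S2, k, n, g):
    """V_sy = ((k-1)/k) * S^2 / n**g, assuming the law holds for systematic samples."""
    return (k - 1) / k * S2 / n ** g

def v_ran(S2, k, n):
    """Simple random sampling variance S^2 (N-n)/(N n) with N = k*n."""
    N = k * n
    return S2 * (N - n) / (N * n)

S2, k, n = 0.25, 10, 25        # hypothetical values for illustration only
for g in (0.5, 1.0, 1.5):
    print(g, v_sy_under_law(S2, k, n, g), v_ran(S2, k, n))
```

With g = 1 the two variances coincide; a larger g favours systematic sampling and a smaller g favours simple random sampling.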