Overton, W. S. (1964). "On estimating the mean ordinate of a continuous function over a specified interval."

ON ESTIMATING THE MEAN ORDINATE OF A CONTINUOUS
FUNCTION OVER A SPECIFIED INTERVAL

Walter Scott Overton
Institute of Statistics
Mimeograph Series 390
April 1964
ON ESTIMATING THE MEAN ORDINATE OF A CONTINUOUS
FUNCTION OVER A SPECIFIED INTERVAL

by

WALTER SCOTT OVERTON

A thesis submitted to the Graduate Faculty
of North Carolina State
of the University of North Carolina at Raleigh
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy

DEPARTMENT OF EXPERIMENTAL STATISTICS

RALEIGH

1964

APPROVED BY ADVISORY COMMITTEE

Chairman
ABSTRACT
OVERTON, WALTER SCOTT. On Estimating the Mean Ordinate of a Continuous Function over a Specified Interval. (Under the direction of ALVA LEROY FINKNER.)

In many real sampling problems, estimation of the mean ordinate can be interpreted as the estimation of the area, M, between a continuous function y(τ) and the τ-axis, over some interval [0, w], say.
In the present dissertation, conventional methods of estimating mean ordinate are evaluated under this interpretation by the methods of numerical analysis. Reliance is placed on Weierstrass' Approximation Theorem, as a consequence of which the continuous function y(τ) is treated as a polynomial of arbitrarily high degree. Specific cases are studied as polynomials of prescribed degree. The general approach is shown to have utility, and a number of useful specific results are obtained. In particular, the classical quadrature formulae (and designs based on them) are shown to be very useful. An estimator of M is described, based on the integral of the least squares fitted polynomial over the interval of interest, and quadrature formulae are developed from orthogonal polynomials for the cases in which observations are equally spaced in τ.
Conventional methods of estimating variance of the conventional estimators of systematic sampling are examined by the methods of numerical analysis. Modifications are suggested, and a procedure is proposed for construction of unbiased estimators of variance of the mean of a systematic random sample, given restrictions placed on the degree of the polynomial. In the more general problem of estimating variances of quadrature estimators, supplementary observations appear to be the most useful source of information. Several problems remain unsolved in the use of residuals from the least squares fitted polynomials for this purpose.

In general, quadrature estimators of mean ordinate are considerably more accurate than the conventional estimators (usually the observed mean) of sampling over an interval in time or space. Classical quadrature formulae (e.g., Gaussian and Tchebycheff) are very accurate for a small number of observations, and the least squares quadratures seem ideally suited to larger n and polynomials of uncertain degree. The Newton-Cotes and Centric formulae will be valuable when observations must be equally spaced and when n is small.
BIOGRAPHY
Walter Scott Overton (Jr.) was born in Farmville, Virginia, October 3, 1925, the son of Walter Scott and Alice Mottley Overton. He attended Grammar School and High School in Farmville, and entered The Virginia Polytechnic Institute in 1942. With two years spent in Military Service (European Theater of Operations, 83rd Infantry Division), he completed the B. S. Degree in Forestry and Wildlife Conservation in 1948 and the M. S. Degree in Wildlife Conservation in 1950. From 1950-1958 he was employed by the Florida Game and Fresh Water Fish Commission as a Biologist, Project Leader. The last year (1957-1958) of the tenure was spent at North Carolina State College, during which time graduate work in Experimental Statistics was begun.

In July, 1959, he was employed by the Department of Experimental Statistics, North Carolina State College, in the capacity of Assistant Statistician, and in September, 1963, was appointed Associate Professor of Biometry at Emory University, Atlanta, Georgia.

He is married to the former Joann Frances Price of Jacksonville, Florida, and has two children, Deborah and Michael.
ACKNOWLEDGEMENTS
Acknowledgements are due the members of my Graduate Committee, Drs. A. L. Finkner, R. L. Anderson, F. S. Barkalow, Jr., W. W. Hassler, D. G. Horvitz, and H. L. Lucas. In particular, Dr. Finkner is thanked for his encouragement, his many suggestions, a great deal of his time, and his very great personal interest in the completion of this work. Dr. Horvitz also deserves special mention for his contributions, as do Drs. Anderson and Lucas. These contributions are gratefully acknowledged.

Several other persons contributed materially to the development and completion of this research, including Drs. A. E. Grandage, J. G. Baldwin, Jr., G. C. Caldwell and H. R. van der Vaart. To all of these I extend my sincere appreciation.

The majority of the work was accomplished while I was employed on the Cooperative Fish and Game Statistics Project in the Department of Experimental Statistics, North Carolina State, and work was completed during my employment in the Department of Biometry, Emory University. I am indebted to both of these departments for the considerable amount of time away from regular duties.

Finally, thanks are due my wife, Joann, for much patience and encouragement and for typing the rough draft, and to Miss Ann Smithwick for typing the final copy.
TABLE OF CONTENTS

                                                                      Page

LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . .   vi

CHAPTER 1.  INTRODUCTION . . . . . . . . . . . . . . . . . . . . . .    1
    1.1  Orientation . . . . . . . . . . . . . . . . . . . . . . . .    1
    1.2  The Problem . . . . . . . . . . . . . . . . . . . . . . . .    5
    1.3  Review of Literature  . . . . . . . . . . . . . . . . . . .    7
    1.4  A Summary of the Mathematics  . . . . . . . . . . . . . . .   10
    1.5  Notation and Definitions  . . . . . . . . . . . . . . . . .   20

CHAPTER 2.  CONVENTIONAL ESTIMATORS FOR SAMPLING OVER AN INTERVAL  .   24
    2.1  Introduction  . . . . . . . . . . . . . . . . . . . . . . .   24
    2.2  The Simple Random Sample, n = 1 . . . . . . . . . . . . . .   25
    2.3  The Simple Random Sample, n > 1 . . . . . . . . . . . . . .   28
    2.4  The Stratified Random Sample, One Observation per Stratum .   30
    2.5  Systematic Sample, with Random Start  . . . . . . . . . . .   31
    2.6  The Systematic Centric Sample . . . . . . . . . . . . . . .   32
    2.7  Comparison of the Conventional Sampling Schemes . . . . . .   33

CHAPTER 3.  ESTIMATION OF ERROR OF SYSTEMATIC SAMPLING . . . . . . .   36
    3.1  Introduction  . . . . . . . . . . . . . . . . . . . . . . .   36
    3.2  Quantities to be Estimated  . . . . . . . . . . . . . . . .   36
    3.3  Standard Estimators of V(Msr), with Extensions  . . . . . .   37

CHAPTER 4.  QUADRATURE DESIGNS . . . . . . . . . . . . . . . . . . .   45
    4.1  Introduction  . . . . . . . . . . . . . . . . . . . . . . .   45
    4.2  The Newton-Cotes Quadratures  . . . . . . . . . . . . . . .   47
    4.3  Gaussian Quadratures  . . . . . . . . . . . . . . . . . . .   50
    4.4  Tchebycheff Quadratures . . . . . . . . . . . . . . . . . .   51
    4.5  Centric Quadratures . . . . . . . . . . . . . . . . . . . .   52
    4.6  Quadrature Formulae in Systematic Sampling with Random
         Start . . . . . . . . . . . . . . . . . . . . . . . . . . .   55
    4.7  Remainder Terms in Quadrature Formulae  . . . . . . . . . .   60
    4.8  Derivation of Quadrature Formulae by Least Squares  . . . .   69

CHAPTER 5.  COMPARISON OF QUADRATURE DESIGNS AND THE CONVENTIONAL
            ESTIMATORS OF SYSTEMATIC SAMPLING  . . . . . . . . . . .   74
    5.1  Introduction  . . . . . . . . . . . . . . . . . . . . . . .   74
    5.2  Comparison of Quadrature Designs  . . . . . . . . . . . . .   78
    5.3  Some Specific Comparisons of Mean Square Error  . . . . . .   84
    5.4  Comparisons under Two-Stage Sampling  . . . . . . . . . . .   86

CHAPTER 6.  ESTIMATION OF VARIANCE OF ESTIMATE OF QUADRATURE
            DESIGNS  . . . . . . . . . . . . . . . . . . . . . . . .   94
    6.1  Introduction  . . . . . . . . . . . . . . . . . . . . . . .   94
    6.2  Supplementary Observations  . . . . . . . . . . . . . . . .   97
    6.3  Differences Between Successive Quadratures Along an
         Interval  . . . . . . . . . . . . . . . . . . . . . . . . .  103
    6.4  Least Squares Residuals . . . . . . . . . . . . . . . . . .  104

CHAPTER 7.  SUMMARY AND CONCLUSIONS  . . . . . . . . . . . . . . . .  107
    7.1  Introduction  . . . . . . . . . . . . . . . . . . . . . . .  107
    7.2  Estimators of Mean Ordinate (Area Under the Curve)  . . . .  108
    7.3  Estimators of Variance (Mean Square Error)  . . . . . . . .  111
    7.4  Suggestions for Further Investigation . . . . . . . . . . .  113

LIST OF REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . .  116

APPENDIX . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  118
LIST OF TABLES

                                                                      Page

4.1  Coefficients of y(t_i) in Newton-Cotes Quadrature Formulae
4.2  Coefficients of y(t_i) and values of {t_i}, Gaussian
     Quadrature Formulae, n = 2, ..., 6  . . . . . . . . . . . . . .   51
4.3  Values of {t_i} for Tchebycheff Quadrature Formulae . . . . . .   52
4.4  Coefficients of y(t_i) for Centric Quadrature Formulae  . . . .   55
4.5  First terms in the expansion of remainders of the basic
     Quadrature Formulae . . . . . . . . . . . . . . . . . . . . . .   64
5.1  Components of mean square error, R² + c w² σ²/n, of
     Quadrature Formulae . . . . . . . . . . . . . . . . . . . . . .   93
Chapter 1.
INTRODUCTION
1.1 Orientation
Systematic sampling is an enigma of sampling theory and practice. It is widely recognized as an efficient, and convenient, method of sampling from a sequence or function in time or space, under a wide variety of conditions, yet less efficient or less convenient methods are frequently advocated in its stead. The often-realized gain in precision is through an increase of information (in the sense of variability) within the sample, resulting in a decrease in the variation between possible samples. Yet when such a gain is achieved, variance estimators based on the sample second moment have a positive bias that increases as the precision of systematic sampling increases. An important source of information is the order of the observations, yet this has seldom been considered in estimation of precision, and almost never in estimation of the mean ordinate.
A flurry of interest in the theory of systematic sampling was highlighted by important papers by Madow and Madow (1944), Cochran (1946), Yates (1949) and Quenouille (1949). Subsequently, other problem areas in sampling occupied most of the interest of those who might have worked on the problem, and little of real importance has been done in something like ten years. Even though several theoretical questions have remained unanswered, systematic sampling is widely used among practicing statisticians, particularly when estimates of precision are not required. This use is evidence of faith in the technique, despite the lack of theoretical respectability.
The present stimulus came from problems in Field Ecology. Not only does systematic sampling appeal to the intuition of the Field Ecologist from the standpoint of precision and utility of results, but in many cases there is no other practical sampling scheme. In making airplane counts of breeding waterfowl, or deer track counts, or dove call counts or a vegetation study, to name only a few of many examples, it is clearly impractical to travel long distances to get to a "random" plot. One might as well count while traveling or measure at intervals while traveling. It has occasionally been advocated that a random set of plots be set up, and then a shortest route selected for travel to cover all plots. This has two practical disadvantages in that 1) seldom is it possible to accurately locate random points or plots in a forest or marsh, and 2) one is not getting maximum information from the sampling scheme. Also, many of the measurements are a function of time as well as space, so that spatial randomization is only a partial solution.
The germ of the present approach to the problem was planted when the writer was introduced to Simpson's Rule in a course in basic calculus. Prior to that time, he had used graphical or simple arithmetical methods of computing the area under a curve of observations taken in time or space, and numerical integration was obviously a better way to do the same thing. Since then, a number of simple applications of Simpson's Rule have been made, most of which were in measurement of fishing pressure through periodic counts of fishermen. Field application seems satisfactory.
The use of Simpson's Rule, or other quadrature formulae, involves taking measurements of the system (function) at prescribed points along the abscissa (time or space). These measurements represent the ordinate at the "sample" points. The quadrature formula then computes the area under a polynomial of prescribed degree passing through the observed ordinates. In the simpler formulae, the observations are spaced at equal intervals, so that there is a close relationship with systematic (equal interval) sampling, and so that the formulae can be used with standard systematic samples. The more complex formulae involve non-equidistantly spaced observations, and allow exact fit to polynomials of higher degree from the same number of measurements. The latter can be considered to be systematic samples of a more general kind.
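The mechanics just described can be sketched in a few lines of modern code (an illustration added in editing, not part of the original thesis; the function name is ours). Composite Simpson's Rule computes the area under the quadratics passed through successive triples of equally spaced observed ordinates:

```python
def simpson(y, h):
    """Area under the quadratics fitted through successive triples of
    equally spaced ordinates y[0..n], spacing h (n must be even)."""
    n = len(y) - 1
    if n % 2 != 0:
        raise ValueError("composite Simpson's Rule needs an even number of panels")
    area = y[0] + y[n]
    for i in range(1, n):
        area += (4 if i % 2 else 2) * y[i]   # interior weights alternate 4, 2
    return area * h / 3.0
```

Because the rule integrates the interpolating polynomial exactly, it reproduces the area under any cubic; for example, ordinates of y(t) = t³ taken at spacing 0.5 over [0, 2] give exactly 4.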
In comparing estimates obtained from these quadrature methods with the conventional estimators associated with sampling a function in time or space, it was natural to define the latter in terms of area under the curve. This device was considered briefly by Yates (1949) and also by Aitken (1939) and Moran (1950), though most of the literature is in terms of the mean ordinate. This must be considered a statistical habit, as area under the curve is often of real interest.

Although there is a simple functional relationship between area and mean ordinate, so that there is no mathematical difference in estimating one or the other, there is nevertheless a distinct advantage in framing the problem in terms of area under the curve, even when mean ordinate is of primary interest. "Area under the curve" immediately evokes an image of the curve and the end points of the interval of interest, with the observations superimposed in place along the time or space axis. With this picture in mind, it is natural to estimate area under the curve by fitting a function to the observations, and integrating this function.
However, when one considers estimators of the mean of a universe from which a sample is taken, one almost automatically thinks in terms of the first moment of the sampling distribution, and the structure of the sample in relation to the universe (when such structure exists) is overlooked. The main body of sampling theory is in terms of discrete sampling units, and most sampling thought is oriented in this direction. However, many observations and measurements are "instantaneous", representing a point rather than an interval in the continuous sample space. Many others, properly defined, have the statistical characteristics of an "instantaneous" measurement, and may be interpreted as an ordinate to a curve or surface. While it is true that any real world sampling scheme must employ discrete units of time or space in describing the sample, there is nevertheless distinct utility in the evaluation of the sampling process on the continuous scale.

Further, the main body of sampling theory deals primarily with completely general unbiased estimation, with no assumptions regarding the nature of the systems or functions being sampled. It must be considered another habit of sampling that such assumptions are not commonplace. Statistics has long since made its peace with restrictions of this kind in other areas, e.g., Design of Experiments.
That systematic sampling, as here treated, is a special case of time series is immediately apparent, and it is necessary to justify the fact that a traditional time series approach was not made. The primary reason is that many of the systems of immediate interest are more properly interpreted as functions than as stochastic processes, and these were studied first. A logical next step would be the extension of these results to the stochastic case. In this respect, it can be noted that this extension would add a third objective, that of estimating the mean value of the process over a specified interval, to the two classical objectives of time series analysis, prediction and description of the process.
In the present dissertation, four conventional estimators of mean ordinate are expressed and evaluated as area (M) under a segment of curve. The mean square errors of the estimators, ℰ(M̂ - M)², are developed and compared. Estimators of the squared error of a particular sample are developed in special cases. Several quadrature formulae are summarized, and their mean square errors developed. These formulae are compared with one another, and with the conventional estimators. Comparisons are made in the one stage and the two stage sampling situations.

Due to the importance of small samples in many applications, primary attention is given to characteristics of small samples, particularly with respect to the quadrature methods.
1.2 The Problem

1.2.1 The Problem

The problem considered in this thesis is outlined as follows. From a set of observations of a continuous non-negative function y(τ), defined on the closed interval [0, w], w > 0, it is desired to estimate the area, M, between the function and the τ-axis on the interval [0, w]. That is, the parameter of interest is

    M = ∫_0^w y(τ) dτ .
Such a function can be uniformly approximated to any degree of accuracy by a polynomial of sufficiently high degree, in accordance with Weierstrass' Approximation Theorem (see, for example, Simmons, 1963, p. 154, or Kopal, 1955, p. 19): if y(τ) is a continuous real function defined on the closed interval [0, w], then for every ε > 0 there exists a polynomial of degree m,

    P_m(τ) = Σ_{i=0}^m A_i τ^i ,

such that

    |P_m(τ) - y(τ)| < ε  for all τ in [0, w] .

It follows that

    |M - ∫_0^w P_m(τ) dτ| < wε ,

so that by an appropriate choice of ε, a polynomial P_m' can be chosen such that

    |M - ∫_0^w P_m'(τ) dτ| = |R|

can be made arbitrarily small. On the basis of this result, y(τ) shall be treated as a polynomial of (in general) relatively high degree. However, it should be emphasized that the only general restrictions placed on y(τ) are those imposed by the above theorem.
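The force of the theorem can be seen numerically (an illustration added in editing, not from the thesis; here the Taylor polynomial of e^t about 0 stands in for P_m, and the supremum is approximated over a fine grid):

```python
import math

def p_m(m, t):
    """A concrete P_m(t): the degree-m Taylor polynomial of exp about 0."""
    return sum(t**i / math.factorial(i) for i in range(m + 1))

def sup_error(m, grid=200):
    """Approximate sup |P_m(t) - y(t)| over [0, 1] for y(t) = exp(t)."""
    return max(abs(p_m(m, k / grid) - math.exp(k / grid)) for k in range(grid + 1))
```

Raising the degree m drives the uniform error toward zero, exactly as the theorem asserts.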
Now, if there are n known (observed) values {y(t_i)} of y(τ) at n completely arbitrary distinct values {t_i} of τ, 0 ≤ t_i ≤ w, it is possible to uniquely describe a polynomial P̂ of degree n-1, such that P̂(t_i) = y(t_i) for {t_i}. That is, for any set {t_i} there exists a set of coefficients {W_i}, Σ W_i = w, such that

    M̂ = Σ_{i=1}^n W_i y(t_i) = M - R ,                    (1.1)

and such that R = 0 if y(τ) can be expressed exactly by a power polynomial of degree < n, and where M̂ and R are functions of {t_i}.
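The coefficients {W_i} of (1.1) can be computed for arbitrary distinct nodes by requiring the formula to be exact for 1, t, ..., t^(n-1), which is equivalent to integrating the interpolating polynomial. A sketch in exact rational arithmetic follows (added in editing; the function name is ours):

```python
from fractions import Fraction

def quadrature_weights(nodes, w):
    """Solve sum_i W_i t_i^k = w^(k+1)/(k+1), k = 0..n-1 (a Vandermonde
    system), giving weights with sum_i W_i y(t_i) equal to the integral
    over [0, w] of the degree n-1 polynomial through the ordinates."""
    n = len(nodes)
    t = [Fraction(x) for x in nodes]
    # augmented rows [t_0^k, ..., t_{n-1}^k | w^(k+1)/(k+1)]
    a = [[ti**k for ti in t] + [Fraction(w)**(k + 1) / (k + 1)] for k in range(n)]
    for col in range(n):                      # Gauss-Jordan elimination
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(n):
            if r != col and a[r][col]:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]
```

For equally spaced nodes 0, 1, 2 on [0, 2] this returns (1/3, 4/3, 1/3) — Simpson's Rule — and the weights always satisfy Σ W_i = w.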
1.2.2 Conventional Methods

Conventional methods of estimating mean ordinate

    μ = M/w

are evaluated by use of the above theorem and its mathematical consequences, and by this method, estimators of variance of these conventional estimators are also developed and evaluated.

Further, this method is employed, in the form of the classical ideas of numerical integration, in development of estimators of M, or mean ordinate, and evaluation of these estimators. Specifically, numerical integration formulae are considered when y(τ) is observed with error. To this end, it is specified that

    ℰε_i = 0 ,  ℰε_i² = σ² ,  and  ℰε_i ε_j = 0 , i ≠ j .

This specification will be maintained throughout the dissertation.
1.3 Review of Literature

1.3.1 One Dimension

The key papers in which the theory of Systematic Sampling in one dimension is investigated are Madow and Madow (1944), Cochran (1946), Yates (1949) and Madow (1949). This theory is well summarized by Cochran (1953).

Yates (1949) is the author of the major paper on the one dimensional case, in which he examined systematic sampling theoretically and empirically. He considered a number of variance estimators, several of which are partially evaluated in the present dissertation. In addition, he considered the systematic sample estimator as area under the curve, but did not follow up the idea. Yates departed from the conventional mean observation estimator in his "end correction for linear trend," which is here shown to be a simple quadrature method.
1.3.2 Two Dimensions

Systematic sampling in two dimensions has been theoretically studied by Quenouille (1950) and Das (1950) and examined empirically by Osborne (1942), Milne (1959), Haynes (1948) and others. (Osborne's methods were actually one dimensional, although applied to a two dimensional problem.) The paper by Quenouille (1950) was evidently stimulated by Yates' (1949) paper, and extends the one dimensional ideas of Cochran (1946) and Yates to the two dimensional case. The paper by Das (1950) is very close in scope to that of Quenouille (1950) but not as complete.

Empirical studies by Osborne (1942) and Haynes (1948) indicated that sizeable gains in precision could be obtained from systematic sampling in some natural populations. The study by Milne (1959) gave somewhat opposing results, in that he found the mean square among observations in a centric systematic sample to closely approximate the "random" mean square, implying little gain in precision over a random sample. The crux of the matter, as is known from the theory, is that the error of the systematic sample, in any given population, or function, is dependent on the chosen sampling interval. Thus, for a systematic sample to be self-evaluating, it is necessary that inferences about the function be made from the sample. This is the general approach of Yates (1949), and many of the present results are extensions of his.
1.3.3 Numerical Integration

The connection between systematic sampling and numerical integration was casually mentioned by Yates (1949) and earlier implied by Aitken (1939). Moran (1950) was stimulated by Yates' (1949) paper to write on the subject of "Numerical Integration by Systematic Sampling," but no one seems to have considered the possibility of using the established results of numerical integration to obtain more precise estimators. The lone exception is the forementioned "end-correction" of Yates (1949), which is in reality a first degree quadrature formula (Section 4.6), although he does not identify it as such.

However, in rereading Yates (1949) and Cochran (1953 and 1946) after completing the present thesis, one has the strong feeling that both authors knew many of the present results, or at least would not be very surprised by them.

1.3.4 Numerical Analysis

Several numerical analysis texts were studied in development of the results of this dissertation. The most valuable to this study is considered to be Kopal (1955), although Riordan (1958), Whittaker and Robinson (1944), Kunz (1957), Guest (1961) and Booth (1955) all contributed to the results. Pertinent results in numerical analysis are given in Section 1.4.
1.4 A Summary of the Mathematics

A summary of the mathematics with useful identities and the development of general results follows.

1.4.1 Finite Differences

Define:

    Δ_a = y(a+h) - y(a)
    Δ²_a = Δ_{a+h} - Δ_a = y(a+2h) - 2y(a+h) + y(a)
    Δ^k_a = Δ^{k-1}_{a+h} - Δ^{k-1}_a                        (1.2)

and

    E y(a) = y(a+h)
    E^n y(a) = y(a+nh) .

Then

    Δ_a = (E-1) y(a)
    Δ²_a = (E-1)² y(a) = (E² - 2E + 1) y(a)
    Δ^n_a = (E-1)^n y(a) .                                   (1.3)
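The definitions above are easily mechanized (an illustrative sketch added in editing; names ours). The table of successive differences also exhibits Δ^n = (E-1)^n: the nth differences of a polynomial of degree n-1 vanish identically:

```python
def difference_table(y):
    """rows[k] holds the kth forward differences of the sequence y
    (taken at unit index spacing, i.e. over intervals of h)."""
    rows = [list(y)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return rows
```

For y_i = i² the second differences are the constant 2 and the third differences vanish; the binomial expansion of (E-1)³ gives the same Δ³_0 = y_3 - 3y_2 + 3y_1 - y_0.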
1.4.2 Expansion by Difference Operators

By the operators Δ and E, it is possible to derive many of the useful relationships. For example, if it is desired to expand y(a+th) in differences about y(a),

    E^t y(a) = y(a+th)

    E^t = (1 + Δ)^t = 1 + tΔ + t(t-1)/2! Δ² + ... .

a. If t is an integer, the expansion has t+1 terms,

    y(a+th) = y(a) + tΔ_a + t^[2]/2! Δ²_a + ... + Δ^t_a .    (1.4)

b. If t is a fraction, the expansion is an infinite series,

    y(a+th) = y(a) + tΔ_a + ... + t^[k]/k! Δ^k_a + ... ,     (1.5)

where t^[k] = t!/(t-k)! and where (1.5) is known as the Gregory-Newton Interpolation Formula.
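Formula (1.5) can be evaluated directly from a difference table (a sketch added in editing; names ours). The running coefficient is t^[k]/k!, updated at each term by the factor (t-k)/(k+1):

```python
def gregory_newton(y, t):
    """Evaluate y(a + t h) from equally spaced ordinates y = (y(a), y(a+h), ...)
    by y(a) + t Δ + t(t-1)/2! Δ² + ..., truncated at the table depth."""
    diffs, row = [], list(y)
    while row:                                 # leading entries Δ^k y(a)
        diffs.append(row[0])
        row = [row[i + 1] - row[i] for i in range(len(row) - 1)]
    value, coeff = 0.0, 1.0                    # coeff = t^[k] / k!
    for k, d in enumerate(diffs):
        value += coeff * d
        coeff *= (t - k) / (k + 1)
    return value
```

For ordinates of y(τ) = τ² at τ = 0, 1, 2, 3, 4 the series reproduces the parabola exactly, including at the fractional point t = 2.5.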
1.4.3 The Correspondence of Difference and Derivative Expansions

The analogy between (1.5) and Taylor's expansion is immediately apparent. If y is analytic, then, by Taylor's expansion,

    y(a+th) = y(a) + th y^(1)(a) + (th)²/2! y^(2)(a) + ... . (1.6)

In order to facilitate the manipulation of these infinite series, define

    Δ_a = (Δ_a, 1/2! Δ²_a, 1/3! Δ³_a, ...) ,  where Δ^k_a = Δ^{k-1}_{a+h} - Δ^{k-1}_a ,

    D_a = (D_a, 1/2! D²_a, 1/3! D³_a, ...) ,  where D^k_a = y^(k)(a) ,

    t*_0 = (t, t^[2], t^[3], ...) ,

    t_0 = (t, t², t³, ...) ,

and

    H = diag(h, h², h³, ...) .

Then (1.5) can be expressed as

    y(a+th) = y(a) + Δ_a t*_0' ,                             (1.7)

and (1.6) as

    y(a+th) = y(a) + D_a H t_0' .                            (1.8)

Thus,

    D_a H t_0' = Δ_a t*_0' .                                 (1.9)

Therefore, if there is a transformation matrix, A, such that

    t*_0' = A t_0' ,                                         (1.10)

then

    D_a H = Δ_a A .

Now, from (1.10), the nth row of A is the set of coefficients, s(n,k), such that

    t^[n] = Σ_{k=1}^n s(n,k) t^k ,

where the s(n,k) are known as Stirling Numbers of the First Kind. These numbers are tabled (e.g., Riordan, 1958, p. 48), or may be generated by the recursion formula

    s(n+1,k) = s(n,k-1) - n s(n,k) .                         (1.11)

It follows directly that the nth row of A⁻¹ is the set of coefficients, S(n,k), such that

    t^n = Σ_{k=1}^n S(n,k) t^[k] ,

where the S(n,k) are known as Stirling Numbers of the Second Kind. These are also tabled by Riordan (1958, p. 48).
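The recursion (1.11) and the defining identity for the s(n,k) are easily checked (a sketch added in editing; names ours). Row 4, for example, reproduces the row (-6, 11, -6, 1) of the matrix A tabulated in Section 1.4.7:

```python
def stirling_first(n_max):
    """Signed Stirling numbers of the first kind s(n, k), 1 <= k <= n <= n_max,
    from the recursion s(n+1, k) = s(n, k-1) - n s(n, k)."""
    s = {(1, 1): 1}
    for n in range(1, n_max):
        for k in range(1, n + 2):
            s[(n + 1, k)] = s.get((n, k - 1), 0) - n * s.get((n, k), 0)
    return s

def falling_factorial(t, n):
    """t^[n] = t (t-1) ... (t-n+1)."""
    out = 1
    for i in range(n):
        out *= t - i
    return out
```

The rows satisfy t^[n] = Σ_k s(n,k) t^k for every t, which is the defining relation (1.10).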
1.4.4 Useful Integrals of Functions of t_0

Define

    μ_0 = ∫_0^1 t_0 dt ,  μ*_0 = ∫_0^1 t*_0 dt ,

    T_0 = (t_0 - μ_0)'(t_0 - μ_0) ,

    T*_0 = (t*_0 - μ*_0)'(t*_0 - μ*_0) = A T_0 A' .

Then

    ∫_0^1 T*_0 dt = A {∫_0^1 T_0 dt} A' ,

and the ijth element of ∫_0^1 T_0 dt is

    t_ij = ∫_0^1 (t^i - 1/(i+1))(t^j - 1/(j+1)) dt
         = 1/(i+j+1) - 1/((i+1)(j+1))
         = ij / ((i+1)(j+1)(i+j+1)) .

Then the lkth element of ∫_0^1 T*_0 dt is

    t*_lk = Σ_{j=1}^k Σ_{i=1}^l s(l,i) s(k,j) t_ij .
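The element t_ij can be verified in exact arithmetic (an illustrative check added in editing; names ours): expanding the product and integrating term by term gives 1/(i+j+1) - 1/((i+1)(j+1)), which reduces to the closed form above.

```python
from fractions import Fraction

def t_ij(i, j):
    """Exact value of the ijth element: the integral over [0, 1] of
    (t^i - 1/(i+1)) (t^j - 1/(j+1)), integrated term by term."""
    return Fraction(1, i + j + 1) - Fraction(1, (i + 1) * (j + 1))

def t_ij_closed(i, j):
    """The closed form ij / ((i+1)(j+1)(i+j+1))."""
    return Fraction(i * j, (i + 1) * (j + 1) * (i + j + 1))
```

The values 1/12, 4/45 and 9/112 obtained this way are the diagonal entries of the first constant matrix of Section 1.4.7.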
1.4.5 Useful Integrals of Functions of t_c

Define

    t_c = ((t - 1/2), (t - 1/2)², (t - 1/2)³, ...) ,

    μ_c = ∫_0^1 t_c dt ,

    T_c = (t_c - μ_c)'(t_c - μ_c) .

Then the ijth element of ∫_0^1 T_c dt is

    1/((i+j+1) 2^{i+j}) - 1/((i+1)(j+1) 2^{i+j}) ,  i and j even ,

    1/((i+j+1) 2^{i+j}) ,                           i and j odd ,

    0 ,                                             i+j odd .
1.4.6 Definition of C̃

Define

    C* = T*_0 |_{t=1/2} = A T_0 |_{t=1/2} A' = A C̃ A' ,

so that the ijth element of C̃ is

    c_ij = [(1/2)^i - 1/(i+1)] [(1/2)^j - 1/(j+1)] .
1.4.7 Useful Constant Matrices

Some useful constant matrices can be written from the preceding paragraphs (1.4.4, 1.4.5, 1.4.6).

1.  ∫_0^1 T_0 dt =

        | 1/12   1/12   3/40  ... |
        | 1/12   4/45   1/12  ... |                          (1.12)
        | 3/40   1/12   9/112 ... |

2.  A =

        |  1                      |
        | -1    1                 |
        |  2   -3    1            |                          (1.13)
        | -6   11   -6    1       |
        | 24  -50   35  -10    1  |

3.  ∫_0^1 T*_0 dt = A {∫_0^1 T_0 dt} A' =

        |  1/12     0       -1/120  ... |
        |  0        1/180   -1/120  ... |                    (1.14)
        | -1/120   -1/120   23/1680 ... |

4.  ∫_0^1 T_c dt =

        | 1/12   0      1/80  ... |
        | 0      1/180  0     ... |                          (1.15)
        | 1/80   0      1/448 ... |

5.  C̃ =

        | 0   0        0        0         ... |
        | 0   1/144    1/96     11/960    ... |              (1.16)
        | 0   1/96     1/64     11/640    ... |
        | 0   11/960   11/640   121/6400  ... |

6.  C* =

        | 0   0         0         0           ... |
        | 0   1/144    -1/96      73/2880     ... |          (1.17)
        | 0  -1/96      1/64     -73/1920     ... |
        | 0   73/2880  -73/1920   5329/57600  ... |

where

    C̃ = g' g  and  C* = g*' g* = A C̃ A' ,

and where

    g  = (0, -1/12, -1/8, -11/80, ...) ,
    g* = g A' = (0, -1/12, 1/8, -73/240, ...) .              (1.18)
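The relation ∫ T*_0 dt = A {∫ T_0 dt} A' can be checked on the leading 3×3 blocks of the matrices given in (1.12) and (1.13) (an illustrative check in exact arithmetic, added in editing; names ours):

```python
from fractions import Fraction

M = [[Fraction(1, 12), Fraction(1, 12), Fraction(3, 40)],
     [Fraction(1, 12), Fraction(4, 45), Fraction(1, 12)],
     [Fraction(3, 40), Fraction(1, 12), Fraction(9, 112)]]   # leading block of (1.12)
A = [[1, 0, 0], [-1, 1, 0], [2, -3, 1]]                       # leading block of (1.13)

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

AMAt = matmul(matmul(A, M), [list(c) for c in zip(*A)])       # A M A'
```

The product reproduces the entries 1/12, 1/180, -1/120 and 23/1680 of ∫ T*_0 dt.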
1.4.8 Differences of Lag n

From the definition of Δ_a, it follows that a difference of lag n is defined as

    Δ̇_a = Σ_{i=a}^{n+a-1} Δ_i = (E^n - 1) y(a) .            (1.19)

Thus,

    Δ̇_0 = E^n - 1 = (1 + Δ_0)^n - 1
        = nΔ_0 + n(n-1)/2! Δ²_0 + ... + Δ^n_0 ,

and

    Δ̇²_0 = (E^n - 1)² = E^{2n} - 2E^n + 1                   (1.20)
         = [1 + 2nΔ_0 + 2n(2n-1)/2! Δ²_0 + ... + Δ^{2n}_0]
           - [2 + 2nΔ_0 + 2 n(n-1)/2! Δ²_0 + ... + 2Δ^n_0] + 1

(recalling that the 1 appearing in the last term of the sum is the identity operator), and, in general,

    Δ̇^k_0 = (E^n - 1)^k = E^{nk} - k^[1] E^{n(k-1)} + ... . (1.21)
1.4.9 Distinction Between the Symbolic Squaring and the Arithmetic Squaring Operation

A notational difficulty is encountered in using finite differences in quadratic terms, in that confusion easily occurs between the symbolic squaring operation and the arithmetic squaring operation. To illustrate, the second difference of lag 1 has been defined,

    Δ²_a = y(a) - 2y(a+h) + y(a+2h) ,

whereas the square of the first difference of lag 1 is

    (Δ_a)² = [y(a+h) - y(a)]²
           = [y(a)]² - 2 y(a) y(a+h) + [y(a+h)]² .

This point is made at this time because the following paragraph, 1.4.10, deals with the arithmetic analog of the symbolic operation treated in paragraph 1.4.8. However, this distinction must be maintained throughout, and the arithmetic power of any difference operator will be denoted by brackets, as (Δ_a)^k.
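The distinction can be made concrete (an illustrative sketch added in editing; the cubic and the names are ours):

```python
def y(t):
    return t**3 - 2 * t      # any tabulated function; a cubic for illustration

h, a = 1.0, 0.0
symbolic_square = y(a) - 2 * y(a + h) + y(a + 2 * h)     # Δ²_a, the second difference
arithmetic_square = (y(a + h) - y(a)) ** 2               # (Δ_a)², the squared first difference
```

Here Δ²_a = 6 while (Δ_a)² = 1; the two "squares" are entirely different quantities, which is why the arithmetic power is written with brackets.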
18
1.4.10 Identities

In this paragraph are developed several identities helpful in comparing expressions arising later, particularly in Chapter 3. These identities have not been developed in general, but rather under the restriction that Δ³_i = 0. A more general development would seem desirable.

The identities (1.22) through (1.27) relate the arithmetic squares of differences of lag n to the squares and products of the differences of lag 1. The starting point is the expansion

    (Σ_{i=0}^{n-1} Δ_i)² = Σ_{i=0}^{n-1} (Δ_i)² + 2 Σ_{i<j} Δ_i Δ_j ,    (1.22)

in which the cross products are arranged as the off-diagonal elements of the square array {Δ_i Δ_j}. Under the restriction Δ³_i = 0, so that Δ_j - Δ_i = (j-i) Δ², these cross products reduce to expressions in the (Δ_i)² and (Δ²)², the reductions taking separate forms for n even and for n odd (1.23). In the same manner, reductions are obtained for the mean difference of lag n-1 (1.24) and for sums of squared differences of various lags (1.25)-(1.27). In particular, the reduction of the squared lag-n difference takes the closed form

    (Σ_{i=0}^{n-1} Δ_i)² = n Σ_{i=0}^{n-1} (Δ_i)² - n²(n²-1)/12 (Δ²)² .

Finally, (1.28) records the indicator which equals 1 for i = 0, 2, 4, ... and 0 for i = 1, 3, 5, ... .
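The reduction under the restriction Δ³_i = 0 can be verified numerically (an illustrative check added in editing; the closed form used here, (Σ Δ_i)² = n Σ (Δ_i)² - n²(n²-1)/12 (Δ²)², is derived for this sketch under that restriction):

```python
n = 5
y = [i * i for i in range(n + 1)]              # quadratic ordinates, so Δ³_i = 0
d1 = [y[i + 1] - y[i] for i in range(n)]       # first differences Δ_i
d2 = d1[1] - d1[0]                             # the constant second difference Δ²
lhs = sum(d1) ** 2                             # squared lag-n difference, (y_n - y_0)²
rhs = n * sum(d * d for d in d1) - n**2 * (n**2 - 1) * d2**2 // 12
```

For this quadratic sequence both sides equal 625, confirming the reduction for n = 5.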
1.5 Notation and Definitions

1.  M = ∫_0^w y(τ) dτ = area under the function between 0 and w.

2.  y(τ) is a continuous non-negative function of τ.

3.  μ = M/w = mean ordinate over the interval [0, w].

4.  y_i = y(τ_i) or = y(τ_i) + ε_i, depending on the context.

5.  M̂_r = estimator of M based on a simple random sample of size n over an interval of length w = hn.

6.  M̂_st = estimator of M based on a stratified random sample, one sample at random in each of n strata of length h.

7.  M̂_sr = estimator of M based on a systematic random sample, first observation taken at random between 0 and h, and other n-1 at intervals of h.

8.  M̂_sc = estimator of M based on a centric systematic sample, observations taken at centers of n panels of length h.

9.  V(M̂) = mean square error of the estimator, M̂.

10. s² = mean square successive difference of lag 1.

11. s²_n = mean square successive difference of lag n.

12. s²_a = linear combination of mean square successive differences of different lag.

13. s²(Q̂) = mean square successive difference between successive quadratures along an interval.

14. E = 1. the squared mean alternate difference, 3.2.1, or 2. the symbolic operator, E = Δ + 1, 1.4.1.

15. Δ_a = y(a+h) - y(a), the first difference of y(a). It is usually understood that differences are over intervals of h, so that they are expressed in terms of the index, Δ_i = y_{i+1} - y_i, the first difference of y_i = y(τ_i).

16. Δ^k_i = Δ^{k-1}_{i+1} - Δ^{k-1}_i, the kth difference of y_i.

17. Δ̇_i = y_{i+n} - y_i = y[(i+n)h] - y[ih], the difference of lag n.

18. Δ_a = (Δ_a, 1/2! Δ²_a, 1/3! Δ³_a, ...), 1.4.3.

19. y_i^(k) = y^(k)(τ_i), the kth derivative of y_i = y(τ_i).

20. D_a = (D_a, 1/2! D²_a, 1/3! D³_a, ...), 1.4.3.

21. s(n,k) = Stirling numbers of the first kind, 1.4.3.

22. S(n,k) = Stirling numbers of the second kind, 1.4.3.

23. t^[n] = t!/(t-n)!

24. t = 1. a uniform random variable (0,1), or 2. a sample value of τ.

25. ~ U(0,h) means: distributed as a Uniform random variable on the interval [0, h].

26. f(t) = probability density function of t.

27. ℰ_ε = expectation over ε.

28. ℰ_t = expectation over t.

29. Σ_i indicates summation over i.

30. Σ_{i∈s} indicates summation over all i in the sample.

31. t_0 = (t, t², t³, ...)

32. t_c = ((t - h/2), (t - h/2)², ...)

33. μ_t = ℰ(t_0) = (h/2, h²/3, h³/4, ...)

34. μ_c = ℰ(t_c) = (0, 1/3 (h/2)², 0, 1/5 (h/2)⁴, ...)

35. T_0 = (t_0 - μ_t)'(t_0 - μ_t), equation (1.12).

36. T_c = (t_c - μ_c)'(t_c - μ_c), equation (1.15).

37. t*_0 = (t, t^[2], t^[3], ...), 1.4.3.

38. Q = some constant or coefficient, apparent by context.

39. T*_0 = (t*_0 - μ*_0)'(t*_0 - μ*_0), equation (1.14).

40. μ*_0 = ℰ t*_0, and C* = A C̃ A', equation (1.17).

41. Q̂_n = a quadrature formula of n observations. Superscripts N, C, G, T indicate Newton-Cotes, Centric, Gaussian, and Tchebycheff.

42. Q̂_{k,n} = a quadrature formula formed by successive application of a basic formula requiring k observations to a total of n observations.

43. Q̂_n(t) = a quadrature formula of degree n, in which observations are equally spaced, with a random start.

44. x = a synonym for ht in the sections involving results in terms of orthogonal polynomials; this definition applies to the items immediately following.

49. x'_0 = (1, x, x², ...)

50. y = (y_1, y_2, ..., y_n)'

51. ξ = matrix of orthogonal polynomials, as given, for example, by Anderson and Houseman (1942). See paragraph 4.8.2.

52. Φ is the orthogonalyzing transformation, defined in paragraph 4.8.2.

53. (X'X)⁻¹ X' y, the least squares coefficient vector.

54. k is used in a number of ways as a constant. Definition in any one section will be apparent from context.
Chapter 2. CONVENTIONAL ESTIMATORS FOR SAMPLING OVER AN INTERVAL
2.1 Introduction

The four conventional methods of sampling over an interval (simple random, stratified random, systematic with random start, and centric systematic) are all conventionally used with the same estimator of mean ordinate, a simple mean of the observed ordinates. The different properties of the various schemes are due entirely to the different methods of selecting the abscissae at which the ordinate is to be observed or measured. In the present chapter, the expression for estimated area under the curve is developed for each of the sampling schemes in terms of the generalized function expansion.

The function considered is y(τ), and the observation y_i, made at the ith value of τ, is

    y_i = y(τ_i) + ε_i .    (2.1)

As the contribution of ε to the estimate and to the variance of the estimate is additive, the terms involving ε will be dropped from the development of the general expressions and reintroduced in Section 2.7, when comparisons are made.

In the following section, 2.2, the basic expressions are developed for a single sample point selected at random over the interval [0, h]. In subsequent sections of Chapter 2, the results of Section 2.2 are extended for n > 1 to the various sampling schemes of interest over the interval [0, w], where w = nh. The panels of length h, and the results of Section 2.2 relating to them, are in reality the building blocks for subsequent results. In addition, all results are expressed in terms of differences defined over the panels of length h.
2.2
Ilhe S:f.m.ple Random Sample, n • 1
2.2.1 ~e Dt;f;ference Ex:;pansi08of Sample ResUlts
Insamplin8 from the intervaJ. (0, h) , it is natural to let -
-
t ,." '0(0, h).
However, in using the Gregory-Ne'Wton formula,
~s
well
as in subsequent development, it is advantageous to l~t t - '0(0.,1).
A little "wkwa3:'dness in phrasin8 the problem is more than compensated .
by later s:iJlq>l1:f'ication.
Ilherefore, let' a Value of '" ht, be selected at random b~i;ween 0 and
h, !.~., t,.., '0(0,1), and let y • Y(ht) •
of
h
Define
ll\-
III
0
lry•
.1
Then
I
M • Jy(,.) d,. .. h ! y(th)dt
. 0
and
(
J.
1
e(l1\.)· J(l1r )f(t)dt • J(ll\.)dt
0_
0. .
1
• h !y(ht)dt
o
• M•
• yeO) + t * ',6
-
-0
(2.4)
where
*
:!:.o~. (t,t(t-l), ••• , t [k] ,. •• ),
6. I
•
,6,0
lIIl
-0
and
.
Then
26
•
M .. h
.
(6.,
0
1
i5T
c;.
2
ts.0 , ••••
0),
'.
Y(h) - y(O).
1
Jo y(th)dt •
h Ty(O)
.
* - e~)!~
*
(lMr ... M) .. h(~
,
(1M .:. M)2 • h2,6,tT*6. ,
r
-0 0-0
*
*
= (~* .. e~)
(~
where
'*
To
*
- e~)
I
a.nd
vhMr) • et(lMr _. M)2
,
= h 2bi. t e,T*,6,
-0
2.2.2 Expansion by Central Differences

It is possible to express the results of paragraph 2.2.1 in terms of central differences of interval h, that is, to let

    Δ_c = y(τ + h/2) - y(τ - h/2) .

Some of the results of expansion in central differences are given in Section 1.4, but the expansion of the preceding paragraph is far more useful, and little use will be made of the central difference expansion.
2.2.3 Expansion by Taylor's Formula

If the function y(τ) is analytic, and therefore can be expanded in Taylor's series, it is possible to express the estimator and its variance in terms of derivatives of y(th) in a manner analogous to that of paragraph 2.2.1. In fact, these results can be obtained by means of an identical development from the Taylor expansion, or by means of the transformation relationships developed in Section 1.4. The results are summarized here.

In terms of expansion about the initial ordinate,

    ₁M_r = h [ y(0) + t_0' H D_0 ] ,    (2.7)

where

    t_0' = (t, t², ...) ,
    D_0' = [ y^(1)(0), y^(2)(0), ... ] ,

and

    H = diag( h/1!, h²/2!, h³/3!, ... ) .    (2.8)

Then

    (₁M_r - M)² = h² D_0' H T_0 H D_0 ,

where

    T_0 = (t_0 - ℰ t_0)(t_0 - ℰ t_0)' ,  t ~ U(0,1) ,

and

    V(₁M_r) = ℰ_t(₁M_r - M)² = h² D_0' H (ℰ T_0) H D_0 .

This is also easily expressed in terms of expansion about the central ordinate, y(h/2):

    ₁M_r = h [ y(h/2) + t_c' H D_c ] .    (2.10)

Then

    (₁M_r - M)² = h² D_c' H T_c H D_c ,    (2.11)

where

    T_c = (t_c - ℰ t_c)(t_c - ℰ t_c)' .    (2.12)
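The unbiasedness shown in paragraph 2.2.1 is easy to check numerically. The following sketch is an added illustration, not part of the original development; the cubic test function, interval length, and sample size are arbitrary assumptions. It averages the one-point estimator ₁M_r = h·y(ht), t ~ U(0,1), and compares the result with M = ∫₀ʰ y(τ) dτ.

```python
import random

# Arbitrary cubic test function and its exact integral over [0, h].
def y(tau):
    return 2.0 + 3.0 * tau + tau ** 3

def M_exact(h):
    return 2.0 * h + 1.5 * h ** 2 + h ** 4 / 4.0

h = 2.0
random.seed(1)
N = 200_000
# One-point estimator 1M_r = h * y(h t), t ~ U(0, 1), averaged over N draws.
est = sum(h * y(h * random.random()) for _ in range(N)) / N

assert abs(est - M_exact(h)) / M_exact(h) < 0.01
```

The average settles on M to within Monte Carlo error, as (2.3) requires.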
2.3 The Simple Random Sample, n > 1

2.3.1 Generalization of Section 2.2

Generalization of the results of Section 2.2 to the estimate of area under y from measurements at n random values of t follows from the results of that section. Let n values of t be selected at random, t_i ~ U(0, 1), and let y_i = y(wt_i), w = nh. Then

    M_r = h Σ y_i    (2.13)

is an estimate of

    M = ∫₀ʷ y(τ) dτ = w ∫₀¹ y(wt) dt ,

and

    ℰ M_r = h ∫₀¹ Σ y_i f(t) dt = w ∫₀¹ y(wt) dt = M .

As in the previous section, M_r can be expressed in terms of a series expansion in differences,

    M_r = h Σ_i [ y(0) + t_i Δ_0 + ... + (t_i^[k]/k!) Δ_0^k + ... ] ,    (2.14)

where the differences are now taken over the interval of length w, so that Δ_0 = y(w) - y(0). Then

    (M_r - M)² = [ M_r - w( y(0) + μ_0*' Δ_0 ) ]²
               = h² [ Δ_0' Σ_i (t_0i* - ℰ t_0*) ]²
               = h² Δ_0' [ Σ_i (t_0i* - ℰ t_0*) ] [ Σ_i (t_0i* - ℰ t_0*) ]' Δ_0 .

2.3.2 Central Differences and Derivatives

It is also possible to derive analogous expansions in terms of the central ordinate and in terms of derivatives. However, the only one of interest is

    M_r = w y(0) + h Σ_i [ w t_i D_0 + (1/2!) w² t_i² D_0² + ... ] ,

    (M_r - M)² = h² D_0' H* [ Σ_i t_0i - n ℰ t_0 ] [ Σ_i t_0i - n ℰ t_0 ]' H* D_0 ,

    ℰ_t(M_r - M)² = h² D_0' H* [ Σ_i T_0i + 0 ] H* D_0 ,

and

    V(M_r) = n h² D_0' H* (ℰ T_0) H* D_0 ,    (2.16)

where H* is the matrix H of (2.8) with w in place of h.
2.4 The Stratified Random Sample, One Observation per Stratum

2.4.1 General Results

Let an interval of length w be divided into n sub-intervals of length h, and let n values of t be selected at random, t_i ~ U(0, 1). Define

    M_st = h Σ_{i=0}^{n-1} y(ih + ht_i)    (2.18)
         = h Σ_{i=0}^{n-1} [ y(ih) + t_i Δ_i + ... ]
         = h [ Σ y(ih) + Σ t_0i*' Δ_i ] ,

where Δ_i' = (Δ_i, (1/2!)Δ_i², ...) is the difference vector for the ith panel. Then

    (M_st - M)² = h² [ Σ_i (t_0i* - ℰ t_0*)' Δ_i ]²
                = h² [ Σ_i Δ_i' T_0i* Δ_i + Σ_{i≠j} Δ_i' (t_0i* - ℰ t_0*)(t_0j* - ℰ t_0*)' Δ_j ] ,    (2.20)

and, since the t_i are independent,

    V(M_st) = h² Σ_i Δ_i' (ℰ T_0*) Δ_i + 0 = Σ_{i=0}^{n-1} V(₁M_r)_i .

2.4.2 Derivatives

Expression in terms of derivatives follows directly from earlier results:

    (M_st - M)² = h² [ Σ_i D_i' H (t_0i - ℰ t_0) ]² ,    (2.21)

and

    V(M_st) = h² Σ_i [ D_i' H (ℰ T_0) H D_i ] = Σ_{i=0}^{n-1} V(₁M_r)_i .    (2.22)
2.5 Systematic Sample with Random Start

2.5.1 General Results

Let an interval of length w be divided into n panels of length h, and let a value of t, t ~ U(0, 1), be selected. Then define

    M_sr = h Σ_{i=0}^{n-1} y(ih + ht) .    (2.23)

Now

    M_sr = h Σ_i [ y(ih) + t Δ_i + ... + (t^[k]/k!) Δ_i^k + ... ] ,

so that

    (M_sr - M)² = h² [ (Σ Δ_i)' (t_0* - ℰ t_0*) ]² = h² (Σ Δ_i)' T_0* (Σ Δ_i) ,    (2.24)

and

    V(M_sr) = h² (Σ Δ_i)' (ℰ T_0*) (Σ Δ_i) .    (2.25)

2.5.2 Derivatives

Again, expression in terms of derivatives follows directly from earlier results:

    (M_sr - M)² = h² [ (Σ D_i)' H (t_0 - ℰ t_0) ]² .    (2.26)
2.6 The Systematic Centric Sample

2.6.1 General Results

Let an interval of length w be divided into n panels of length h, and define

    M_sc = h Σ_{i=0}^{n-1} y(ih + h/2) .    (2.28)

M_sc is, then, a special case of M_sr, with t = 1/2, and the results can be written down directly from 2.5:

    (M_sc - M)² = [ M_sr(1/2) - M ]² = h² [ (Σ Δ_i)' (t_0* - ℰ t_0*) ]² |_{t=1/2} ,

and

    V(M_sc) = (M_sc - M)² .

2.6.2 Derivatives and Central Differences

It follows that V(M_sc) can also be written

    V(M_sc) = h² [ (Σ D_i)' H (t_0 - ℰ t_0) ]² |_{t=1/2} .    (2.31)
2.7 Comparison of the Conventional Sampling Schemes

2.7.1 Introduction

A general comparison of the sampling schemes, based on the results of this chapter, is not possible. In fact, direct comparison is possible only for the case Δ² ≡ 0; that is, when the function is a polynomial of first degree. When higher differences are non-zero, it is necessary to define classes of functions for which one or the other sampling scheme is best. This point is not investigated in depth, but rather only illustrated. For purposes of comparison, the contribution of measurement error to the variance of M̂ is re-introduced.

2.7.2 Comparison when Δ² ≡ 0

From the preceding sections and 1.4.7, the variances of the conventional estimators obtained by the four sampling schemes are, with Δ = y(w) - y(0),

    V(M_r | Δ² ≡ 0) = nh² [ (Δ)²/12 + σ² ] ,

    V(M_st | Δ² ≡ 0) = nh² [ (Δ)²/12n² + σ² ] ,

    V(M_sr | Δ² ≡ 0) = nh² [ (Δ)²/12n + σ² ] ,

    V(M_sc | Δ² ≡ 0) = nh² [ 0 + σ² ] .

Thus, it is seen that M_sc is best for all values of σ² and Δ, although the advantage decreases as σ² increasingly dominates (Δ)². It is also seen that, of the three randomized schemes, the order of precision is M_st, M_sr and M_r, and again the difference decreases as σ² increases.
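The four variances above can be checked by simulation for a first-degree function. The sketch below is an added illustration; the linear test function, n = 4, and σ² = 0 are arbitrary assumptions. It estimates the variances of M_r, M_st and M_sr by Monte Carlo and compares them with nh²(Δ)²·(1/12, 1/12n², 1/12n), Δ = y(w) - y(0); M_sc is error-free here.

```python
import random

# Linear test function y(tau) = a + b*tau on [0, w]; sigma^2 = 0.
a, b = 1.0, 2.0
y = lambda tau: a + b * tau

n, h = 4, 1.0
w = n * h
M = a * w + b * w * w / 2.0          # exact area
Delta = y(w) - y(0)                  # difference over the whole interval

random.seed(7)
reps = 100_000

def var(draws):
    m = sum(draws) / len(draws)
    return sum((d - m) ** 2 for d in draws) / len(draws)

Mr  = [h * sum(y(w * random.random()) for _ in range(n)) for _ in range(reps)]
Mst = [h * sum(y((i + random.random()) * h) for i in range(n)) for _ in range(reps)]
Msr = [h * sum(y((i + t) * h) for i in range(n))
       for t in (random.random() for _ in range(reps))]
Msc = h * sum(y((i + 0.5) * h) for i in range(n))

# Theoretical values: n h^2 [ D^2/12, D^2/(12 n^2), D^2/(12 n), 0 ].
t_r  = n * h * h * Delta ** 2 / 12.0
t_st = t_r / (n * n)
t_sr = t_r / n

assert abs(var(Mr) - t_r) / t_r < 0.05
assert abs(var(Mst) - t_st) / t_st < 0.05
assert abs(var(Msr) - t_sr) / t_sr < 0.05
assert abs(Msc - M) < 1e-9
```

The simulated variances reproduce the stated ordering M_sc, M_st, M_sr, M_r.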
2.7.3 Comparison when Δ² ≢ 0

To illustrate the comparison of the sampling schemes when Δ² ≢ 0, consider, for example, M_st and M_sr with Δ³ ≡ 0. Then

    V(M_st | Δ³ ≡ 0) = h² { (1/12) Σ (Δ_i)² + (1/720) Σ (Δ_i²)² + nσ² }
                     = h² { (1/12) Σ (Δ_i)² + (n/720) (Δ²)² + nσ² } ,

and

    V(M_sr | Δ³ ≡ 0) = h² { (1/12) (Σ Δ_i)² + (1/720) (Σ Δ_i²)² + nσ² }
                     = h² { (1/12) (Σ Δ_i)² + (n²/720) (Δ²)² + nσ² } .

From Identity 1, paragraph 1.4.10, this becomes

    V(M_sr | Δ³ ≡ 0) = h² { (n/12) Σ (Δ_i)² - (n²/720)(5n² - 6)(Δ²)² + nσ² } .

Thus,

    V(M_st | Δ³ ≡ 0) - V(M_sr | Δ³ ≡ 0)
        = h² (n - 1) { (n(5n² + 5n - 1)/720)(Δ²)² - (1/12) Σ (Δ_i)² } ,    (2.33)

which is negative (stratified sampling the more precise) when Δ² = 0.

It is seen, therefore, that the advantage of stratified random sampling over systematic random sampling is less when Δ² ≠ 0 than when Δ² = 0. Further, for any given Σ(Δ_i)² and Δ² (i.e., for any given polynomial of degree 2 and a specified interval), it is possible to solve (2.33) for a region, in n, within which systematic sampling is advantageous over stratified random sampling.

It would then seem that the space defined by all possible intervals of length w on power polynomials of arbitrary degree is separated into regions within which any one of the sampling schemes is better than, equal to, or worse than any other of the sampling schemes considered, as regards precision of estimate. For any given polynomial and interval, it would seem possible to select the sampling scheme which is best, but it does not seem possible to satisfactorily describe the regions of superiority in general. However, it may be possible to define regions of superiority for classes of functions.

It will be shown later (Chapter 4) that when one is given a specific polynomial and interval, one is able to improve on any of the sampling plans treated in this chapter, so it seems unnecessary to pursue further the present line of investigation.
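Equation (2.33) can be verified directly for a specific second-degree polynomial. The following sketch is an added check; y(τ) = τ², h = 1, n = 3, and σ² = 0 are arbitrary assumptions. V(M_st) and V(M_sr) are computed by integrating over t on a fine grid and their difference is compared with the right side of (2.33).

```python
# Added check of (2.33) for y(tau) = tau**2, h = 1, sigma^2 = 0.
n = 3
y = lambda tau: tau ** 2

K = 100_000
ts = [(j + 0.5) / K for j in range(K)]          # integration grid for t

# V(M_st): independent strata, each integrated over its own t.
V_st = 0.0
for i in range(n):
    vals = [y(i + t) for t in ts]
    m = sum(vals) / K
    V_st += sum((v - m) ** 2 for v in vals) / K

# V(M_sr): one common t for all panels.
est = [sum(y(i + t) for i in range(n)) for t in ts]
m = sum(est) / K
V_sr = sum((e - m) ** 2 for e in est) / K

# Right side of (2.33).
D = [y(i + 1) - y(i) for i in range(n)]         # first differences 1, 3, 5
D2 = D[1] - D[0]                                # constant second difference 2
rhs = (n - 1) * (n * (5 * n * n + 5 * n - 1) / 720.0 * D2 ** 2
                 - sum(d * d for d in D) / 12.0)

assert abs((V_st - V_sr) - rhs) < 1e-3
```

For this polynomial the right side is negative, so stratified sampling remains the more precise scheme at n = 3.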
Chapter 3. ESTIMATION OF ERROR OF SYSTEMATIC SAMPLING

3.1 Introduction

Several estimators of error of estimate of the systematic sample have been proposed. In the present chapter, these are expanded and evaluated in the notational framework developed in the preceding chapter. In addition, a general method of constructing estimators of the variance is proposed, and several examples given. Attention is confined to the systematic samples, random and centric.

The model considered is, again,

    y_i = y(τ_i) + ε_i ,    (3.1)

where y(τ) is a continuous function of τ and ℰε = 0, ℰε² = σ², ℰε_iε_j = 0, i ≠ j.
3.2 Quantities to be Estimated

3.2.1 General Expressions

    V(M_sr | t) = ℰ_ε(M_sr - M)² = h² (Σ Δ_i)' T_0* (Σ Δ_i) + nh²σ² ,

    V(M_sr) = ℰ_t V(M_sr | t) = ℰ_t ℰ_ε(M_sr - M)² = h² (Σ Δ_i)' (ℰ T_0*) (Σ Δ_i) + nh²σ² ,

    V(M_sc) = ℰ_ε(M_sc - M)² |_{t=1/2} = h² (Σ Δ_i)' T_0* |_{t=1/2} (Σ Δ_i) + nh²σ² .    (3.4)

3.2.2 Special Cases

    V(M_sr | Δ³ ≡ 0; t) = h² [ (Σ Δ_i)(t - 1/2) + (Σ Δ_i²)(1/2!)(t^[2] - ℰ t^[2]) ]² + nh²σ² ,    (3.5)

    V(M_sr | Δ² ≡ 0) = (h²/12)(Σ Δ_i)² + nh²σ² = (n²h²/12)(Δ)² + nh²σ² ,    (3.6)

and

    V(M_sc | Δ² ≡ 0) = nh²σ² .
3.3 Standard Estimators of V(M_sr), with Extensions

3.3.1 Mean Square Successive Difference

Define (von Neumann, 1941)

    δ² = (1/2(n-1)) Σ_{i=0}^{n-2} (y_{i+1+t} - y_{i+t})² .    (3.8)

In terms of the difference expansion,

    δ² = (1/2(n-1)) Σ_{i=0}^{n-2} { Δ_i + t Δ_i² + ... + (t^[k]/k!) Δ_i^{k+1} + ... }² + ... ,

and

    ℰ_t ℰ_ε δ² = (1/2(n-1)) Σ_{i=0}^{n-2} ℰ_t { Δ_i + t Δ_i² + ... }² + σ² .    (3.10)

When Δ² ≡ 0, the differences Δ_i are all equal to Δ, and

    ℰ(δ² | Δ² ≡ 0) = (1/2(n-1)) Σ_{i=0}^{n-2} (Δ_i)² + σ² = (Δ)²/2 + σ² .    (3.13)

From (3.6) and (3.13) it is seen that nh²δ² is an unbiased estimator of V(M_sr | Δ² ≡ 0) only if 1/2 = n/12. This follows since, when Δ² ≡ 0,

    Σ_{i=0}^{n-2} (Δ_i)² = (n-1)(Δ)²  and  ( Σ_{i=0}^{n-1} Δ_i )² = n²(Δ)² .

Thus, nh²δ² is unbiased only when n = 6, and is negatively biased when n > 6, given that Δ² ≡ 0.
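The n = 6 result can be confirmed with a short computation. The sketch below is an added check, with σ² = 0 and an arbitrary linear test function. For a first-degree function δ² does not depend on t, so nh²δ² can be compared with V(M_sr | Δ² ≡ 0) = n²h²(Δ)²/12 exactly.

```python
# Bias of n h^2 delta^2 for a linear function, sigma^2 = 0 (added check).
h = 1.0
slope = 2.0                        # y(tau) = slope * tau, so Delta = slope * h

def nh2_delta2(n):
    # delta^2 = sum of squared successive differences / (2(n-1));
    # for a linear function the differences all equal Delta, for any t.
    ys = [slope * (i + 0.25) * h for i in range(n)]     # start t = 0.25, say
    d2 = sum((ys[i + 1] - ys[i]) ** 2 for i in range(n - 1)) / (2.0 * (n - 1))
    return n * h * h * d2

def true_var(n):
    Delta = slope * h
    return n * n * h * h * Delta ** 2 / 12.0   # V(M_sr | second diffs = 0)

assert abs(nh2_delta2(6) - true_var(6)) < 1e-12      # unbiased at n = 6
assert nh2_delta2(12) < true_var(12)                 # negatively biased, n > 6
```

The estimator matches the true variance exactly at n = 6 and falls short for larger n, as shown above.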
3.3.2 The Mean Square Successive Difference of Lag ℓ

The mean square successive difference of lag ℓ, ℓ an integer, is

    δ_ℓ² = (1/2(n-ℓ)) Σ_{i=0}^{n-ℓ-1} (y_{i+ℓ+t} - y_{i+t})² .    (3.15)

Let Δ² ≡ 0; the lag-ℓ differences are then all equal to ℓΔ, and (3.15) has expectation

    ℰ(δ_ℓ² | Δ² ≡ 0) = (ℓ²/2)(Δ)² + σ² .    (3.16)

Example: Let n = 24. Set ℓ²/2 = n/12 = 2 and solve, obtaining ℓ = 2. Thus nh²δ₂² is an unbiased estimator of V(M_sr | n = 24; Δ² ≡ 0).

It follows that the expectation of the average of k such estimators δ_ℓ², with different ℓ, and given Δ² ≡ 0, will be of the form

    ℰ(δ̄²) = (1/2k)( Σ_{i=1}^{k} ℓ_i² )(Δ)² + σ² .    (3.18)

Thus an entire family of unbiased estimators of V(M_sr | Δ² ≡ 0) is defined by (3.18), subject to the restriction

    n = (6/k) Σ_{i=1}^{k} ℓ_i² .

More generally, consider an estimator of the form

    δ̄² = a₁ δ_{ℓ₁}² + a₂ δ_{ℓ₂}² ,

where a₁ + a₂ = 1, and where

    ℰ(δ_{ℓ₁}² | Δ² ≡ 0) = (ℓ₁²/2)(Δ)² + σ² ,  ℰ(δ_{ℓ₂}² | Δ² ≡ 0) = (ℓ₂²/2)(Δ)² + σ² .

Then it is possible to choose a₁ and a₂ such that

    ℰ(δ̄² | Δ² ≡ 0) = α(Δ)² + σ²

for any desired coefficient α.

Example: Let δ̄² = a₁δ₁² + a₂δ₂². From (3.16), the component expectations carry coefficients 1/2 and 2, and from (3.6) the required coefficient is α = n/12. Thus, choose

    a₁ = (2 - n/12)/(3/2) = (24 - n)/18 ,  a₂ = (n/12 - 1/2)/(3/2) = (n - 6)/18 ,

so that

    δ̄² = ((24 - n)/18) δ₁² + ((n - 6)/18) δ₂²

and

    ℰ(δ̄² | Δ² ≡ 0) = (n/12)(Δ)² + σ² .
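The two-lag combination of the example holds for every n, which can be checked directly. The sketch below is an added illustration, with σ² = 0 and an arbitrary linear function: it forms δ̄² = ((24-n)/18)δ₁² + ((n-6)/18)δ₂² and verifies that nh²δ̄² equals V(M_sr | Δ² ≡ 0) = n²h²(Δ)²/12.

```python
# Check that n h^2 [ (24-n)/18 * delta_1^2 + (n-6)/18 * delta_2^2 ]
# is unbiased for V(M_sr | second differences 0), sigma^2 = 0.
h, slope = 1.0, 3.0                      # y(tau) = slope*tau, Delta = slope*h

def delta_lag2(ys, lag):
    # Mean square successive difference of the given lag, (3.15).
    m = len(ys) - lag
    return sum((ys[i + lag] - ys[i]) ** 2 for i in range(m)) / (2.0 * m)

for n in (7, 10, 24, 50):
    ys = [slope * (i + 0.4) * h for i in range(n)]       # any start t
    dbar = ((24 - n) / 18.0 * delta_lag2(ys, 1)
            + (n - 6) / 18.0 * delta_lag2(ys, 2))
    target = n * n * h * h * (slope * h) ** 2 / 12.0
    assert abs(n * h * h * dbar - target) < 1e-9
```

The identity holds for all n, including n > 24, where the lag-1 coefficient becomes negative.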
3.3.3 The Squared Mean Alternate Difference

The squared mean alternate difference was defined by Yates (1948) as

    E = (1/n) [ Σ_{i=0}^{n-1} x_i y_{i+t} ]² ,  x_i = 1 for i = 0, 2, ... ;  x_i = -1 for i = 1, 3, ... ,

where n is an even integer. When Δ² ≡ 0 the alternate sum equals -(n/2)Δ whatever the value of t, so that

    ℰ(E | Δ² ≡ 0) = (n/4)(Δ)² + σ² .    (3.22)

Hence

    nh² ℰ(E | Δ² ≡ 0) = (n²h²/4)(Δ)² + nh²σ² = V(M_sr | Δ² ≡ 0) + (n²h²/6)(Δ)² ,

and nh²E is positively biased for all n if Δ² ≡ 0.
3.3.4 Variations of E

Two variations of E are suggested. The first is

    Ē_k = (1/k) Σ_{j=1}^{k} E_j ,    (3.24)

where km = n. Thus Ē_k is the average of the k squared quantities defined by application of E to the k successive sets of m = n/k observations. For example, if n = 8 and k = 2,

    Ē₂ = (1/8) { [ y_t - y_{1+t} + y_{2+t} - y_{3+t} ]² + [ y_{4+t} - y_{5+t} + y_{6+t} - y_{7+t} ]² } ,

and, in general,

    ℰ(Ē_k | Δ² ≡ 0) = (m/4)(Δ)² + σ² .

It is seen that the limiting form of Ē_k, with m = 2, is the mean square alternate difference, with

    ℰ( · | Δ² ≡ 0) = (1/2)(Δ)² + σ² .

A second variation of E is the squared mean successive difference,

    E_s = (1/2) ( Σ_{i=0}^{n-2} Δ_{i+t} )² ,

and it is seen that E_s is the limiting form of δ_ℓ², ℓ = n - 1, since

    Σ_{i=0}^{n-2} Δ_{i+t} = y_{n-1+t} - y_t .
3.3.5 Cochran's Quadratic Estimators

Cochran (1953, p. 180) proposes an estimator based on an average of squared second differences. Following is a modification of Cochran's estimator:

    δ_c² = (1/6(n-2)) Σ_{i=0}^{n-3} (Δ²_{i+t})² .

Then

    ℰ(δ_c² | Δ² ≡ 0) = σ² ,

and

    ℰ(δ_c² | Δ³ ≡ 0) = (1/6)(Δ²)² + σ² .    (3.32)

It is immediately apparent that the coefficient of (Δ²)² in (3.32) can be modified by manipulation of the lag, as in paragraph 3.3.2, or by squaring sums of second differences, as in 3.3.4, so that it is possible to define a family of estimators based on squared second differences. For example, let

    E_c = (1/4) ( Σ_{i=0}^{n-3} Δ²_{i+t} )² .

Then

    ℰ(E_c | Δ² ≡ 0) = σ²

and

    ℰ(E_c | Δ³ ≡ 0) = ((n-2)²/4)(Δ²)² + σ² .
All of the conventional estimators are related and can be considered to lie in the same general family of estimators, generated by linear combinations of squares of linear combinations of sample differences. An unbiased estimator of V(M_sr | Δ² ≡ 0) was developed for all n, but a more general unbiased estimator of V(M_sr) was not obtained.

A general comparison of δ² and E is of interest. In particular, it was noted that the bias of nh²E is always greater than 0 if Δ² ≡ 0, so that there exist situations (e.g., n = 6, Δ² ≡ 0) in which δ² is "better" than E. Yates (1948) demonstrated that when applied to the autoregressive function, E was consistently smaller than δ², but larger than V. It is concluded that neither has a consistent superiority. (It should be noted that in Yates' study, E was modified by an end correction.)

However, the question of superiority among the standard variance estimators is of little importance. Direct comparison can be made only in the case of specified polynomials, for many of which (if not all) unbiased estimates of variance can be constructed. Even so, in most such cases it will be better to modify the estimates of M to take advantage of the specified model than to utilize this knowledge in improving the estimates of error. For this reason, this line of investigation will not be continued, although it is of considerable interest.
Chapter 4. QUADRATURE DESIGNS

The term "quadrature designs" is used to designate sampling designs that yield the exact area (definite integral) of polynomials of specified degree, if measurement error is zero. The term is derived from the "quadrature formulae" appropriate to numerical integration of the polynomials, given the observations made in accordance with the designs.
4.1 Introduction

4.1.1 The Form of Quadratures

Application of quadrature, or numerical integration, methods to the general continuous function is dependent on the power polynomial approximation to that function, as were the results of earlier chapters. As before, this approximation is justified by Weierstrass' Theorem. In the present chapter, attention is given to improved estimates of area under the curve. Reference is made to Kunz (1957, Ch. 7), Kopal (1955, Ch. 7), Whittaker and Robinson (1944, Ch. 7) and Daniell (1940).

According to Weierstrass' Theorem (see 1.2), the continuous function y(τ) can be represented by a power polynomial P_m'(τ) in τ of degree m', to within a remainder that can be made arbitrarily small. Thus,

    M = ∫_a^b P_m'(τ) dτ + r ,    (4.2)

where r can be made arbitrarily small by appropriate choice of P_m'. Throughout this chapter, it is assumed that P_m' is so chosen that r can be ignored. That is, y(τ) is treated as a power polynomial of arbitrary degree.

Quadrature formulae are of the general form

    Q_n = Σ_{i=1}^{n} w_i y(t_i) ,    (4.3)

where the w_i and t_i are selected so as to make

    Q_n = ∫_a^b P_m(τ) dτ    (4.4)

for some m ≤ m'.

4.1.2 Types of Quadratures

A great many specific quadrature formulae have been described, of which several are of particular interest and will be considered here.

1. Let all of the 2n constants in (4.3) be utilized in specifying the definite integral of P_m(τ). These are called the Gaussian Formulae.

2. Let the t_i be selected for convenience, and the w_i utilized in specifying the definite integral of P_m(τ). A special class of formulae in which the intervals between the t_i are equal will be considered here. These include the Newton-Cotes Formulae and the Centric Formulae.

3. Let the w_i be selected for convenience, and the t_i utilized in specifying the definite integral of P_m(τ). When the w_i are equal, the class of formulae is named after Tchebycheff.
4.1.3 Some General Results

When the n sample values of τ are arbitrary, say {t₁, t₂, ..., t_n}, then, by Lagrange's interpolation formula (Kunz, 1957, p. 89),

    P_{n-1}(τ) = Σ_{i=1}^{n} a_i(τ) y(t_i) ,    (4.5)

where

    a_i(τ) = Π_{j≠i} (τ - t_j)/(t_i - t_j) .    (4.6)

Then

    Q_n = ∫_a^b P_{n-1}(τ) dτ = Σ_{i=1}^{n} w_i y(t_i) ,    (4.7)

where

    w_i = ∫_a^b a_i(τ) dτ .    (4.8)

Further,

    M = Q + R ,

where

    R = ∫_a^b [ (τ - t₁)(τ - t₂)···(τ - t_n)/n! ] y^(n)(ξ) dτ .
4.2 The Newton-Cotes Quadratures

4.2.1 The Weights and Their Derivation

Let it be desired to integrate P_{n-1}(τ) over the range (0, w), where w/(n-1) = h, and let {t_i} = {0, h, 2h, ..., w}. Numerical integration formulae for this case will be designated Q_n^N.

Then, let

    u = (τ - t₀)/h ,

so that (4.5) can be rewritten

    P_{n-1}(u) = Σ_{i=0}^{n-1} a_i(u) y(t_i) ,    (4.10)

where

    a_s(u) = (-1)^{n-1-s} u^[n] / [ s!(n-1-s)!(u - s) ] .    (4.11)

Thus,

    Q_n^N = Σ_{i=0}^{n-1} w_i y(t_i) ,    (4.12)

where

    w_s = h ∫₀^{n-1} a_s(u) du .    (4.13)

Similarly,

    R_n = h^{n+1} ∫₀^{n-1} (u^[n]/n!) y^(n)(ξ) du .    (4.14)

(For a general treatment of remainder terms, see Section 4.7.)

A number of formulae belonging to this group are in common use, including the Trapezoidal Rule, Simpson's Rule and the Three-Eighths Rule. These are all defined by the basic formula

    Q_n^N = (h/C) Σ_{i=0}^{n-1} A_i y(t_i) ,  where A_i = A_{n-1-i} ,

with the coefficients of the formulae up to n = 7 given in Table 4.1. A more complete table (to n = 21) is given by Kopal (1955, p. 536), in a slightly different form.

Table 4.1 Coefficients of y(t_i) in Newton-Cotes Quadrature Formulae

    n     A₀    A₁    A₂    A₃     C
    2      1                       2
    3      1     4                 3
    4      3     9                 8
    5     14    64    24          45
    6     95   375   250         288
    7     41   216    27   272   140

(The remaining coefficients of each row follow from the symmetry A_i = A_{n-1-i}.)
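The rows of Table 4.1 can be verified mechanically: the weights (h/C)A_i must integrate τ^k exactly up to the degree of each formula. The sketch below is an added check in exact rational arithmetic; the symmetry A_i = A_{n-1-i} is used to complete each row, and h = 1 is an arbitrary choice.

```python
from fractions import Fraction as F

# Rows of Table 4.1: n -> (leading half of the A_i, C); A_i = A_{n-1-i}.
TABLE = {2: ([1], 2), 3: ([1, 4], 3), 4: ([3, 9], 8),
         5: ([14, 64, 24], 45), 6: ([95, 375, 250], 288),
         7: ([41, 216, 27, 272], 140)}

for n, (half, C) in TABLE.items():
    A = half + half[: n - len(half)][::-1]      # complete the row by symmetry
    h = F(1)                                    # take h = 1, so w = n - 1
    w = (n - 1) * h
    degree = n if n % 2 == 1 else n - 1         # the degree is always odd
    for k in range(degree + 1):
        quad = h / C * sum(a * (i * h) ** k for i, a in enumerate(A))
        exact = w ** (k + 1) / (k + 1)
        assert quad == exact, (n, k)
```

Every row integrates τ^k exactly through its stated degree, e.g. the n = 5 formula (2h/45)(7, 32, 12, 32, 7) through degree 5.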
4.2.2 Use of Two or More Conventional Newton-Cotes Formulae in Combination

It is common in practice to construct quadrature formulae by linear combination of other formulae. Of particular interest is the practice of constructing formulae for a large number of ordinates by successive application of simple formulae to sections of the range. For example, if n = 21, it is possible to partition the 20 panels into (i) ten sets of two panels each, to each set of which is applied Simpson's rule, (ii) five sets of four panels each, to each set of which is applied the rule for n = 5, or (iii) four sets of three panels each and four sets of two panels each, to each set of which is applied the appropriate formula.

The remainder term of such a constructed formula is obtained according to Property 7, paragraph 4.7.1. That is, linear combinations of Newton-Cotes formulae of degree m have remainders of degree m, with numerical coefficients which are linear combinations of the numerical coefficients of the remainders of the component formulae.
4.3 Gaussian Quadratures

The Gaussian Formulae are the most general and powerful of the quadrature formulae, as all 2n constants on the right side of (4.3) are selected so as to specify the definite integral of P_m(τ). With 2n constants, it is possible to completely specify the definite integral of a polynomial of degree 2n - 1. A sketch of one approach to the derivation of these formulae follows. (See Kopal, 1955, p. 352, and Whittaker and Robinson, 1944, p. 159.)

Given P_{2n-1}(τ) defined over the interval [0, w] and n distinct arbitrary values {t_i} in the interval, it is possible to uniquely specify a polynomial F_{n-1}(τ) such that F_{n-1}(t_i) = P_{2n-1}(t_i) for all t_i ∈ {t_i}. Further, P_{2n-1}(τ) can be written

    P_{2n-1}(τ) = F_{n-1}(τ) + (τ - t₁)(τ - t₂)···(τ - t_n) G_{n-1}(τ) .

Hence,

    ∫₀ʷ P_{2n-1}(τ) dτ = ∫₀ʷ F_{n-1}(τ) dτ

if

    ∫₀ʷ (τ - t₁)(τ - t₂)···(τ - t_n) G_{n-1}(τ) dτ = 0 .    (4.16)

It turns out that the {t_i} satisfying (4.16) are the roots of the Legendre polynomial of degree n, suitably transformed from the range (-1, 1) into the range (0, w). Given the values of {t_i}, the coefficients w_i are obtained by formula (4.8).

Kopal (1955, p. 523) gives Gaussian Quadrature Formulae for n = 2 to n = 16. The coefficients and appropriate {t_i} for the first few are given below, in somewhat different form.

Table 4.2 Coefficients of y(t_i) and values of {t_i}, Gaussian Quadrature Formulae, n = 2, ..., 6

    n     A₁       A₂       A₃        t₁/w      t₂/w      t₃/w
    2    1/2                         .21132
    3    5/18     4/9                .11270    .50000
    4    .17393   .32607             .06943    .33001
    5    .11846   .23931   .28444    .04691    .23077    .50000
    6    .08566   .18038   .23396    .03376    .16940    .38069

Then

    Q_n^G = w Σ_{i=1}^{n} A_i y(t_i) ,    (4.17)

where A_i = A_{n-i+1} and t_i = w - t_{n-i+1}.
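Table 4.2 consists of the standard Gauss-Legendre abscissae and weights mapped from (-1, 1) onto (0, w). The sketch below is an added check using NumPy's `leggauss` routine (an assumption of this illustration, not a tool of the original work): it reproduces the n = 4 row and confirms exactness through degree 2n - 1.

```python
import numpy as np

n = 4
x, wts = np.polynomial.legendre.leggauss(n)   # nodes/weights on (-1, 1)
t_over_w = (x + 1.0) / 2.0                    # mapped abscissae t_i / w
A = wts / 2.0                                 # so that Q = w * sum(A_i y(t_i))

# n = 4 row of Table 4.2, completed by symmetry.
assert np.allclose(np.sort(A), [0.17393, 0.17393, 0.32607, 0.32607], atol=5e-5)
assert np.allclose(np.sort(t_over_w),
                   [0.06943, 0.33001, 0.66999, 0.93057], atol=5e-5)

# Exact for polynomials up to degree 2n - 1 = 7 over [0, w].
w_len = 3.0
for k in range(2 * n):
    quad = w_len * np.sum(A * (t_over_w * w_len) ** k)
    assert np.isclose(quad, w_len ** (k + 1) / (k + 1))
```

With only four ordinates the formula integrates a seventh-degree polynomial exactly, which is the sense in which the Gaussian formulae are the most powerful.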
4.4 Tchebycheff Quadratures

The third conventional type of quadrature of particular interest is that named after Tchebycheff. In this family of formulae it is specified that all weight coefficients be equal, and that the {t_i} be selected so as to specify the definite integral of a polynomial of the highest possible degree. Reference is made to Kopal (1955, p. 417). Salient features are:

i. Degree of the formulae is n, by definition. (Actually, the degree is (n) or (n+1), whichever is odd.) Only n-1 of the weight coefficients are arbitrary (each equal to the others), leaving n+1 constants on the right side of (4.3) to be utilized in specifying the definite integral of interest.

ii. Tchebycheff Quadratures exist only for n = 2, ..., 7, 9. For other values of n, not all of the roots of the Tchebycheff polynomials are real.

Thus, Tchebycheff Quadratures can be completely summarized in a table of the abscissae {t_i}. Then,

    Q_n^T = (w/n) Σ_{i=1}^{n} y(t_i) .    (4.18)
4.5 Centric Quadratures

In several of the texts studied, e.g., Whittaker and Robinson (1944, p. 155), mention is made of quadrature formulae based on ordinates at the centers of the panels of integration, where the panels are of equal length, as in the Newton-Cotes Quadratures. However, these formulae were not found in the literature, nor was a name associated with them. As the formulae are directly applicable to the observations obtained from a Centric Systematic Sample, Section 2.6, they are of particular interest here, and fortunately can be derived through use of methodology already established. The name used refers to the sampling scheme to which the formulae apply.

    Q_n^C = Σ_{i=1}^{n} w_i y(t_i) ,    (4.19)

where

    t_i = (2i - 1)h/2 ,  h = w/n .

Thus, from 4.1.3, with u = (τ - t₁)/h,

    P_{n-1}(u) = Σ_{s=1}^{n} a_s(u) y(t_s) ,    (4.20)

and

    w_s = h ∫_{-1/2}^{n-1/2} a_s(u) du
        = [ (-1)^{n-s} h / (s-1)!(n-s)! ] ∫_{-1/2}^{n-1/2} [ u^[n] / (u - s + 1) ] du    (4.22)
        = h b'(n,s) ū ,    (4.23)

where

    ū' = ( ∫ du, ∫ u du, ..., ∫ u^{n-1} du ) ,  the integrals taken over (-1/2, n-1/2) ,

so that the rth element of ū is

    (1/r)(n - 1/2)^r - (1/r)(-1/2)^r .    (4.24)

It is seen, then, that b(n,s) is the vector of coefficients obtained by the algebraic expansion of

    (-1)^{n-s} u^[n] / [ (s-1)!(n-s)!(u - s + 1) ]

as a polynomial in u. Further, one can write

    B'(n) = [ b'(n,1), b'(n,2), ..., b'(n,n) ]' ,    (4.26)

so that the vector of coefficients for the Centric Quadrature Formula can be expressed

    w = h B'(n) ū .    (4.27)

Expression (4.27) is particularly helpful, as B'(n) is constant for all formulae of n equally spaced abscissae, and in order to derive constants for a particular range of integration it is necessary only to evaluate ū and the product of B'(n) and ū. A table of Centric Quadrature formulae follows. Numerical values of B'(n), n = 2, 3, ..., 8, are given in the Appendix.

Table 4.4 Coefficients of y(t_i) for Centric Quadrature Formulae

    n      A₁        A₂        A₃        A₄         c
    2        1                                      1
    3        3         2                            3/8
    4       13        11                            1/12
    5      275       100       402                  5/1152
    6     2223      1251      2286                  1/1920
    7    1.25300   0.04466   2.83600  -1.26733      1
    8    1.22200   0.29485   1.95011   0.53304      1

(The n = 7 entries are (1/138,240)(173,215; 6,174; 392,049; -175,196).)

    Q_n^C = h c Σ_{i=1}^{n} A_i y(t_i) ,  A_i = A_{n-i+1} .
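The centric weights can also be recovered directly from moment conditions: with t_i = (2i - 1)h/2, require Σ w_i t_i^k = ∫₀ʷ τ^k dτ for k = 0, ..., n - 1. The sketch below is an added check in exact rational arithmetic (h = 1 is an arbitrary choice); it reproduces the n = 3, 4, 5 and 6 rows of Table 4.4.

```python
from fractions import Fraction as F

def centric_weights(n):
    """Solve the moment equations sum_i w_i u_i**k = n**(k+1)/(k+1),
    k = 0..n-1, at midpoint abscissae u_i = (2i+1)/2, exactly."""
    nodes = [F(2 * i + 1, 2) for i in range(n)]
    aug = [[u ** k for u in nodes] + [F(n ** (k + 1), k + 1)]
           for k in range(n)]
    for c in range(n):                          # Gauss-Jordan elimination
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [aug[r][n] for r in range(n)]

assert centric_weights(3) == [F(3 * a, 8) for a in (3, 2, 3)]
assert centric_weights(4) == [F(a, 12) for a in (13, 11, 11, 13)]
assert centric_weights(5) == [F(5 * a, 1152) for a in (275, 100, 402, 100, 275)]
assert centric_weights(6) == [F(a, 1920)
                              for a in (2223, 1251, 2286, 2286, 1251, 2223)]
```

Because the abscissae are fixed at the panel centers, the moment equations form a nonsingular Vandermonde system, so the weights are unique for each n.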
4.6 Quadrature Formulae in Systematic Sampling with Random Start

4.6.1 General Development

As in Section 2.5, select a value of t at random, i.e., t ~ U(0,1), and let {t_i} = {th, th + h, ..., th + h(n-1)}. Following the results of Section 4.5, it is possible to write

    Q_n(t) = ∫₀ʷ P_{n-1}(τ) dτ = Σ w_i(t) y(t_i) ,    (4.28)

where

    w_s(t) = h ∫_{-t}^{n-t} a_s(u) du .

Hence, from (4.27),

    w(t) = h B'(n) ū(t) ,    (4.29)

where B'(n) is defined as before, (4.26), and

    ū(t) = ∫_{-t}^{n-t} (1, u, u², ..., u^{n-1})' du .    (4.30)

Then let

    R_n(t) = ∫₀ʷ P_n(τ) dτ - Q_n(t) .

Therefore, it is possible to apply numerical integration methods to data collected by systematic sample with random start. Whether this is in general an advantageous scheme cannot be stated here, as an attempt to evaluate ℰ_t R(t) was unsuccessful. It is shown in paragraph 4.6.4 that in a particular case ℰ_t R(t) is not identically zero, so it can be stated that the procedure does not provide unbiased estimators, in general.
4.6.2 The Special Case, n = 2

From (4.29), letting h = 1,

    w(t) = B'(2) ū(t) ,

where

    ū'(t) = ∫_{-t}^{2-t} (1, u) du = (2, 2 - 2t)  and  B'(2) = [ 1  -1 ; 0  1 ] .

Thus,

    w'(t) = (2t, 2 - 2t) .
4.6.3 The Case, n > 2

Let Δ² ≡ 0, h = 1, n > 2, and consider systematic sampling with random start. The integral over the range (0, n) can be separated into three integrals, over (0, t + 1/2), (t + 1/2, n - 3/2 + t), and (n - 3/2 + t, n). The middle of these can, under the assumption Δ² ≡ 0, be represented by the Centric Quadrature formula of degree 1,

    Q^C = Σ_{i=2}^{n-1} y(t_i) ,

which is simply a sum of formulae of the type Q₁^C. Also, a quadrature formula of degree 1 for the sum of the two end integrals, in terms of the two end observations, can be constructed by the method outlined in paragraph 4.1.3, with

    a = t + 1/2 ,  b = n - 3/2 + t ,  t₁ = t ,  t_n = n - 1 + t .

Thus,

    w₁(t) = 1 + n(t - 1/2)/(n - 1) ,  w_n(t) = 1 - n(t - 1/2)/(n - 1) ,  w_i(t) = 1 otherwise,    (4.31)

is a set of weights for a quadrature formula of degree 1 when t is selected at random. This will be recognized as the continuous analogue of Cochran's (1953, p. 172) end corrections. Also, the result of paragraph 4.6.2 is obtained by setting n = 2.

It will be noted that other "end corrections" can be devised as quadrature formulae of degree 1 by the simple expedient of modifying the segments of the integral. Also, it will be noted that one can, in general, construct "end corrections" as quadrature formulae of higher degree, and that the process is a direct application of the general process of construction of quadrature formulae for systematic samples with a random start as developed in paragraph 4.6.1.
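The end-corrected weights (4.31) integrate any first-degree function exactly for every start t, and reduce to the n = 2 weights of paragraph 4.6.2. The sketch below is an added check (h = 1; the linear test function is an arbitrary assumption).

```python
import random

def q1(t, n, y):
    # Q_1(t): end-corrected systematic sample with h = 1, equation (4.31).
    c = n * (t - 0.5) / (n - 1.0)
    ts = [t + i for i in range(n)]
    return ((1.0 + c) * y(ts[0]) + sum(y(u) for u in ts[1:-1])
            + (1.0 - c) * y(ts[-1]))

y_lin = lambda tau: 4.0 - 1.5 * tau             # arbitrary first-degree function

random.seed(3)
for n in (2, 3, 6):
    M = 4.0 * n - 0.75 * n * n                  # exact integral over [0, n]
    for _ in range(100):
        assert abs(q1(random.random(), n, y_lin) - M) < 1e-9

# For n = 2 the end weights reduce to (2t, 2 - 2t), as in paragraph 4.6.2.
t = 0.3
assert abs((1.0 + 2.0 * (t - 0.5)) - 2.0 * t) < 1e-12
```

Exactness of degree 1 holds for every t individually, not merely in expectation, which is what makes (4.31) a quadrature design rather than only an unbiasedness device.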
4.6.4 Properties of Q₁(t)

Some properties of ℰ_t[Q₁(t) - M], with Q₁(t) defined as in paragraph 4.6.3, are of interest. Since the interior weights are constant, the bias arises from the two end terms:

    ℰ_t[Q₁(t) - M] = (n/(n-1)) { ∫₀¹ (t₁ - 1/2) y(t₁) dt₁ - ∫_{n-1}^{n} (t_n - n + 1/2) y(t_n) dt_n } .    (4.32)

An interesting interpretation of (4.32) is

    ℰ_t[Q₁(t) - M] = (n/(n-1)) { M₁ [ μ₁(t) - 1/2 ] - M_n [ μ_n(t) - (n - 1/2) ] } ,    (4.33)

where μ₁(t) is the mean of a random variable t₁, 0 < t₁ < 1, with density function y(t₁)/M₁, and μ_n(t) is the mean of a random variable t_n, n - 1 < t_n < n, with density function y(t_n)/M_n. It is seen by inspection that ℰ_t[Q₁(t) - M] = 0 only for special cases, and that, in general, Q₁(t) is a biased estimator.

Another useful expression for the bias of Q₁(t) follows the methods of Chapter 2, where again t ~ U(0,1); carried through second differences it gives

    ℰ_t[Q₁(t) - M | Δ⁴ ≡ 0] = -(1/12) Σ Δ_i² + ... .    (4.34)

In comparison, using for estimator the quadrature formula of degree 1 originally designated as M_sc, the bias of the centric systematic sample is, from (1.18),

    ℰ(M_sc - M | Δ⁴ ≡ 0) = -(1/24) Σ Δ_i² + ... .

Thus, bias-wise, the advantage of the "end corrected" systematic random sample estimator is open to question, as this bias is twice that of the centric systematic sample estimator when Δ⁴ ≡ 0, and both estimators are quadrature formulae of degree 1. A general comparison of these two biases was not accomplished, and comparison of variances will be deferred to the chapter (5) dealing with general comparison of quadrature designs.
4.7 Remainder Terms in Quadrature Formulae

It is generally possible to evaluate the remainder of a quadrature formula. This subject is treated in some depth by Kunz (1957, Paragraphs 7.10 and 7.11) and by Daniell (1940). Some properties of remainder terms are summarized here, and terms are developed for the formulae of interest.
Define

    R p(τ) = ∫₀^w p(τ) dτ − Q p(τ) ,

where Q is any quadrature formula, and p(τ) is a polynomial of arbitrary degree.
4.7.1 Pertinent Properties of Quadrature Remainders

1. A quadrature formula is said to be of degree m if

       R τ^k = 0 ,  k ≤ m ,
             ≠ 0 ,  k > m .

2. A quadrature formula is said to be simplex, degree m, if, given that D^(m+1) p is continuous in the range of integration, R p = 0 implies D^(m+1) p(ξ) = 0 for some ξ in the range of integration.

3. The remainder term of a simplex formula, Q, of degree m, can be obtained by

       R p = C D^(m+1) p(ξ) .   (4.35)

4. The remainder term of a simplex formula, Q, of degree m, in which the abscissae are equally spaced, can be obtained by

       R p = C′ Δ^(m+1) p .   (4.36)

5. The sign of a quadrature formula is said to be the sign of the coefficient, C, in the remainder term, R p = C D^(m+1) p(ξ).

6. The sum of simplex formulae of degree m, all of which are of the same sign, is also a simplex formula of degree m.

7. The remainder of such a sum of simplex formulae is given by

       R p = Σ_i C_i D^(m+1) p(ξ_i) ,

   where the C_i are the coefficients of the remainder terms of the component formulae, and each ξ_i is in the range of the new formula.
4.7.2 Properties of the Formulae Here Used

i. The Newton-Cotes formulae are all simplex (Kunz, p. 143). Further, the degree of the Newton-Cotes formulae is either n − 1 or n, and is always odd.

ii. At least some of the Gaussian formulae are simplex, Daniell (1940), although proof that all are simplex was not found. Further, Kopal (1955, p. 351) gives the remainder of the Gaussian formula in the form (4.35), which follows from the simplex property. All are of degree 2n − 1.

iii. Nothing was found regarding the simplex nature of Tchebycheff formulae. The remainder term given by Kopal (1955, p. 419) is also of the form (4.35). All are of degree n or n + 1, whichever is odd.

iv. It is proven that the Centric formulae are simplex for n even, but proof for n odd was not found. Degree is always odd, either n or n − 1, as in the Newton-Cotes formulae. It is assumed that all are simplex and that the remainder can be obtained by either (4.35) or (4.36).
4.7.3 Remainders of Interest

Remainders of interest are summarized in Table 4.5. A general comment on the remainder terms is in order. They do not provide comparison for any specific function or interval, but rather for some "average" over all intervals and functions. In a specific case, an "inferior" design, as evaluated by the remainder term, may actually be superior by virtue of the particular value of ξ fulfilling the mean value theorem.
Table 4.5  First terms in the expansion of remainders of the basic quadrature formulae. These can also be interpreted as "average" remainders under the mean value theorem.

(For each of the Newton-Cotes Q^N_n, Gaussian Q^G_n, Tchebycheff Q^T_n, and Centric Q^C_n formulae, n = 2 through 8, the table gives the numerical coefficient of the leading remainder term, in the forms w³D²/2!·10², w⁵D⁴/4!·10², w⁷D⁶/6!·10⁴, and w⁹D⁸/8!, as appropriate to the degree of the formula.)
4.7.4 Relationships for Repeated Sampling

The following relationships are useful in converting the remainders of basic formulae into general remainder terms for repeated application of the same formula along an interval. Let

    R*_k = C₁ h^(m+2) D^(m+1)

be the remainder term for the basic formula of k observations, and let R_n be the combined remainder term. In terms of the length, w*, of the basic interval, R*_k can be rewritten

    R*_k = C₂ w*^(m+2) D^(m+1) .   (4.37)

Then, if h = w*/k,

    C₂ = C₁ / k^(m+2) ,

and if h = w*/(k − 1), as in the Newton-Cotes formulae, then

    C₂ = C₁ / (k − 1)^(m+2) .

It follows that if h = w/n, where w = nw*/k is the length of the new composite interval,

    R_n = (n/k) C₁ h^(m+2) D^(m+1) = (C₁/k) (w^(m+2) / n^(m+1)) D^(m+1) .   (4.38)

Similarly, if h = w/(n − 1), where w = (n − 1)w*/(k − 1), as in the Newton-Cotes formulae, then one need only substitute (n − 1) and (k − 1) for n and k in formula (4.38).
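The composite-remainder scaling above can be checked numerically. For y(t) = t² the second derivative is constant, so the first remainder term is the entire remainder for the two degree-1 formulae; a sketch (illustrative choices of function and interval, not from the thesis):

```python
# Numerical check of R_n = (C1/k) w^(m+2) D^(m+1) / n^(m+1) for two degree-1
# formulae applied to y(t) = t^2 on [0, 1], where D^2 = 2 is constant.
from fractions import Fraction as F

def midpoint(n):          # centric-type: n panels, h = w/n, nodes at (i + 1/2)h
    h = F(1, n)
    return h * sum(((i + F(1, 2)) * h) ** 2 for i in range(n))

def trapezoid(n):         # Newton-Cotes k = 2: n observations, h = w/(n - 1)
    h = F(1, n - 1)
    y = [(i * h) ** 2 for i in range(n)]
    return h * (sum(y) - (y[0] + y[-1]) / 2)

M = F(1, 3)               # exact area under t^2 on [0, 1]
for n in (4, 8, 16):
    # midpoint: C1 = 1/24, k = 1  ->  R_n = (1/24) (2) / n^2 = 1/(12 n^2)
    assert M - midpoint(n) == F(1, 12 * n * n)
    # trapezoid: C1 = -1/12, k = 2, with (n - 1) substituted for n -> -1/(6 (n-1)^2)
    assert M - trapezoid(n + 1) == F(-1, 6 * n * n)
print("composite remainders scale as n^-(m+1), m = 1")
```

Exact rational arithmetic is used so the equalities hold without rounding tolerance.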
4.7.5 Remainders in Terms of Δ

By the general relationship between D^i and Δ^i, one can write the expression for the remainder of a formula of degree m,

    R_m = c_m h^(m+2) D^(m+1)_ξ / (m+1)! ,   (4.40)

where c_m is the constant of the particular formula and D^(m+1) is defined over the standardized interval, i.e., h = 1. Letting Δ^(m+2) = 0,

    R_m = c_m h Δ^(m+1) / (m+1)! .   (4.41)

Note here that it is necessary to require Δ^(m+2) = 0; otherwise ξ, satisfying the mean value for D^(m+1)_ξ, may lie outside the interval of interest.
4.7.6 The Truncated Difference Expansion of R

An identical result is also obtained when one expands R in terms of differences and truncates the expansion. To illustrate, recall that

    M = h { Σ_{i=0}^{n-2} y_i + (1/2) Σ Δ_i − (1/12) Σ Δ²_i + (1/24) Σ Δ³_i − ⋯ } ,

where the kth term is

    (1/k!) Σ_{i=0}^{n-2} Δ^k_i ∫₀¹ t(t − 1)⋯(t − k + 1) dt .

Then, successive application of the prismoidal formula to the n − 1 panels of a sample of n observations yields the quadrature formula

    Q̂^N_{2,n} = h { Σ_{i=0}^{n-2} y_i + (1/2) Σ Δ_i } ,   (4.42)

and

    R(Q̂^N_{2,n}) = h { −(1/12) Σ Δ²_i + (1/24) Σ Δ³_i − ⋯ } .   (4.43)

Then, let Δ³ = 0, so that

    R(Q̂^N_{2,n}) = −(h/12) Σ_{i=0}^{n-2} Δ²_i = −h (n − 1) Δ̄² / 12 .

Recall (Paragraph 4.7.4) that

    R(Q̂^N_{2,n}) = Σ_{i=0}^{n-2} R*_{2,i} = −h³ (n − 1) D² / 12 ,

the identical result, since Δ² = h²D² when Δ³ = 0.
4.7.7 An Example: Successive Application of Simpson's Rule

    Q̂^N_{3,n} = (h/3) { y₀ + 4y₁ + 2y₂ + ⋯ + 4y_{n-2} + y_{n-1} } .   (4.44)

Then, let Δ⁵ = 0, so that

    R(Q̂^N_{3,n}) = −(h/180) (n − 1) Δ̄⁴ ,

which is identical to the expression obtained from Paragraph 4.7.4,

    R(Q̂^N_{3,n}) = Σ_{j=1}^{(n-1)/2} R*_{3,j} = −h⁵ ((n − 1)/2) D⁴ / 90 = −h (n − 1) Δ⁴ / 180 ,

if Δ⁵ = 0.
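The composite Simpson relation can be checked on y(t) = t⁴, for which D⁴ = 24 is constant and Δ⁵ = 0 exactly, so the first remainder term is the whole remainder (a sketch with arbitrary interval [0, 1], not from the thesis):

```python
# Check R(Q^N_{3,n}) = -h^5 (n-1) D^4 / 180 for y(t) = t^4 on [0, 1], D^4 = 24.
from fractions import Fraction as F

def simpson(n):                       # n odd; nodes i*h, h = 1/(n-1); weights (h/3)[1,4,2,...,4,1]
    h = F(1, n - 1)
    y = [(i * h) ** 4 for i in range(n)]
    wts = [4 if i % 2 else 2 for i in range(n)]
    wts[0] = wts[-1] = 1
    return (h / 3) * sum(w * yi for w, yi in zip(wts, y))

M = F(1, 5)                           # exact integral of t^4 on [0, 1]
for n in (5, 9, 17):
    h = F(1, n - 1)
    assert M - simpson(n) == -(h ** 5) * (n - 1) * F(24, 180)
print("composite Simpson remainder matches -h^5 (n-1) D^4 / 180")
```

With exact rationals the identity holds to the last digit; for functions with nonconstant D⁴ it holds only as the leading term.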
4.7.8 Second Example

Consider the difference expansion of R(Q̂^C_{3,n}), the remainder of the third degree Centric formula, applied successively to sets of 3 observations, where n is the total number of observations. Here

    Q̂^C_3 = (h/8) { 9y_{h/2} + 6y_{3h/2} + 9y_{5h/2} } = h { Σ_{i=0}^{2} y_{(i+1/2)h} + (1/8) Δ²_{1/2} } .   (4.46)

Expanding and collecting terms in (4.46), and subtracting from the expansion of M, one obtains

    R(Q̂^C_3) = (h/640) { 21 Δ⁴ + ⋯ } ,   (4.47)

and, for the repeated application,

    R(Q̂^C_{3,n}) = (h/640) Σ_{i=0}^{n-1} { 7 Δ⁴_i + ⋯ } .   (4.48)

It is of interest that, from (4.46),

    Q̂^C_3 = Q̂^C_{1,3} + (h/8) Δ²_{1/2} ,

so that

    R(Q̂^C_3) = R(Q̂^C_{1,3}) − (h/8) Δ²_{1/2} .
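Taking the weights (h/8)(9, 6, 9) at the centric abscissae h/2, 3h/2, 5h/2 — as reconstructed here from the garbled source, so treat them as an assumption — one can verify directly that the formula has degree 3 and that its remainder on y(t) = t⁴ (constant fourth difference) carries the coefficient 21/640:

```python
# Verify Q^C_3 = (h/8)(9 y_{h/2} + 6 y_{3h/2} + 9 y_{5h/2}) over [0, 3h], h = 1:
# exact through t^3, with remainder (21/640) h^5 D^4 on t^4 (D^4 t^4 = 24).
# The weights are an assumption reconstructed from the source, not a certainty.
from fractions import Fraction as F

nodes = [F(1, 2), F(3, 2), F(5, 2)]          # centric abscissae
wts = [F(9, 8), F(6, 8), F(9, 8)]            # (h/8) * (9, 6, 9)

def Q(p):                                    # apply the formula to t^p
    return sum(w * t ** p for w, t in zip(wts, nodes))

for p in range(4):                           # degree 3: exact for t^0 .. t^3
    assert Q(p) == F(3) ** (p + 1) / (p + 1)

R = F(3) ** 5 / 5 - Q(4)                     # remainder on t^4 over [0, 3]
assert R == F(21, 640) * 24                  # = 63/80
print("Q^C_3 has degree 3; remainder coefficient 21/640 confirmed")
```

The check also ties (4.47) to the tabulated (5.12) entry 21/80, which is the same coefficient spread over the three observations and scaled by 4!.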
4.8 Derivation of Quadrature Formulae by Least Squares

It is readily apparent that one can obtain an estimate of the area under a curve of specified degree by integrating the fitted least squares polynomial over the interval of interest. Further, if this procedure is applied to observations taken in accordance with a basic quadrature design, then the result will be identical to the result of the quadrature formula, by the uniqueness of the power series expansion.
4.8.1 General Considerations

Consider the application of the least squares quadrature to n observations, where n is greater than the degree of the quadrature, k. Then, if y(τ) can be represented exactly over the interval by a polynomial of degree k, and there is no measurement error, kth degree quadratures applied to consecutive sets of observations (repeated applications) will yield a result identical to that of a kth degree least squares polynomial fitted over the interval.

If measurement error is present, and the degree of the function is k, then each of the sub-estimates, M̂_i, will be an integral of an individual estimate of the polynomial. If the polynomial is truly constant over the entire interval, then the least squares polynomial fitted over all n observations constitutes a "good" average of the possible estimates.

If the polynomial changes between the sections of the interval to which the basic formulae are applied, then the above result does not hold, in general. In fact, appreciable changes in the polynomial necessitate stratification at the points of change, whether least squares or quadrature methods are used. The replicated applications of quadratures in the next chapter fit this situation.
4.8.2 Orthogonal Polynomials

Use of orthogonal polynomials in deriving least squares quadrature formulae when ordinates are equally spaced allows use of existing tables (e.g., Anderson and Houseman, 1942), with a great saving of labor. Let

    M̂ = ∫ Ŷ(X) dX ,

where

    Y = X′β + ε = ξ′β* + ε ,

where ξ′ = X′P and P is the orthogonalizing transformation. Then,

    β̂* = (ξ ξ′)⁻¹ ξ Y .

These entities can be defined variously, though for the present application use is facilitated by defining the X variable to have mean zero, so that, from Anderson and Houseman (1942), the elements of P (for example, −(3n² − 7)/20 and (15n⁴ − 230n² + 407)/1008) and the necessary λ_i and divisors are tabulated. Then,

    M̂ = [ ∫ X′ dX ] { P (ξ ξ′)⁻¹ ξ } Y .   (4.49)

In (4.49), the term in curly brackets is fixed by n and the degree of the formula, and the term in square brackets is determined by the interval of integration. To illustrate, consider the term in curly brackets when n = 3. With X = −1, 0, 1 and ξ′ = [1, X, 3X² − 2],

    (ξ ξ′)⁻¹ ξ = [  1/3   1/3   1/3 ]
                 [ −1/2    0    1/2 ]
                 [  1/6  −2/6   1/6 ] ,

and, transforming back to the monomial basis,

    { P (ξ ξ′)⁻¹ ξ } = [   0    1    0  ]
                       [ −1/2   0   1/2 ]
                       [  1/2  −1   1/2 ] .

Then, in (4.49), the term in square brackets for the centric sample is

    ∫ from −3/2 to 3/2 of [1, X, X²] dX = [ 3, 0, 9/4 ] ,

whence

    M̂ = [ 3, 0, 9/4 ] { P (ξ ξ′)⁻¹ ξ } Y = (1/8) [ 9, 6, 9 ] Y ,

as before. Similarly, to obtain Q̂^N_3, one integrates

    ∫ from −1 to 1 of [1, X, X²] dX = [ 2, 0, 2/3 ] ,

from which

    M̂ = [ 2, 0, 2/3 ] { P (ξ ξ′)⁻¹ ξ } Y = [ 1/3, 4/3, 1/3 ] Y ,

also as before.
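The machinery of (4.49) — fit a polynomial by least squares, then integrate the fit — induces quadrature weights m′(X′X)⁻¹X′, with m the vector of monomial integrals. A sketch (the interval and node choices are illustrative, not the thesis's):

```python
# Least-squares quadrature: integrate the degree-d least-squares fit over [a, b].
# The induced weights are m' (X'X)^{-1} X', m being the vector of monomial integrals.
import numpy as np

def ls_quadrature_weights(x, d, a, b):
    X = np.vander(x, d + 1, increasing=True)             # rows [1, x, ..., x^d]
    m = np.array([(b ** (j + 1) - a ** (j + 1)) / (j + 1) for j in range(d + 1)])
    return m @ np.linalg.solve(X.T @ X, X.T)

# Three points, degree 2, interval [-1, 1]: recovers Simpson's weights [1/3, 4/3, 1/3].
w3 = ls_quadrature_weights(np.array([-1.0, 0.0, 1.0]), 2, -1, 1)
assert np.allclose(w3, [1 / 3, 4 / 3, 1 / 3])

# Five points, degree 2: no longer interpolatory, but still exact for quadratics.
x5 = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
w5 = ls_quadrature_weights(x5, 2, -2.5, 2.5)
for c in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):              # p(x) = c0 + c1 x + c2 x^2
    p = np.polynomial.Polynomial(c)
    exact = p.integ()(2.5) - p.integ()(-2.5)
    assert np.isclose(w5 @ p(x5), exact)
print("least-squares quadrature weights verified")
```

When n = d + 1 the fit interpolates and the weights coincide with the corresponding interpolatory quadrature formula, as the text asserts.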
Cases of Particular Interest

The orthogonal polynomial method is of particular interest in obtaining quadrature formulae in special cases of equally spaced ordinates but non-symmetrically placed limits of integration. In addition, the procedure is quite useful when one desires a quadrature of, say, kth degree, fit to n equally spaced observations. To illustrate, consider a third degree Centric quadrature fit to 5 observations, so that the term in curly brackets is (since the considered integral is symmetrical, the cubic terms can be omitted)

    { P (ξ ξ′)⁻¹ ξ } = [ −3/35  12/35  17/35  12/35  −3/35 ]
                       [ −1/5   −1/10    0     1/10   1/5  ]
                       [  1/7   −1/14  −1/7   −1/14   1/7  ] ,

and the term in square brackets is

    ∫ from −5/2 to 5/2 of [1, X, X²] dX .

Thus,

    M̂ ∝ [ −44, 169, 240, 169, −44 ] Y .   (4.50)
Chapter 5. COMPARISON OF QUADRATURE DESIGNS AND THE CONVENTIONAL ESTIMATORS OF SYSTEMATIC SAMPLING
5.1 Introduction

Several difficulties are encountered in making a general comparison of variance (mean square error) of the considered formulae. If one compares those basic formulae of fixed number of observations, then the degree varies, and the R² are not directly comparable. If one compares formulae of fixed degree, then the number of observations, and cost, varies. Also, designs requiring unequally spaced abscissae may be considered more costly than those of equal spacing. Lastly, the case of two-stage sampling complicates the comparison of variance of a fixed quadrature design with that of conventional systematic random sampling.
5.1.1 The General Expression for Mean Square Error of a Quadrature Design

As before, the model is

    y_i = y(τ_i) + ε_i ,

where y(τ) is a continuous function of τ and ε(ε) = 0, ε(ε²) = σ², and ε(ε_i ε_j) = 0, all i ≠ j. Then, if Q is any quadrature formula of degree m,

    Q = Σ_{i=1}^n w_i y_i , and Q − M = −R + Σ_{i=1}^n w_i ε_i .

It follows that

    V_Q = ε_ε[Q − M]² = R² + σ² Σ w²_i

in the case of a fixed design, and

    V_Q(t) = ε_t ε_ε[Q(t) − M]² = ε_t R²(t) + σ² ε_t Σ w²_i(t)

in the case of a randomized design. For brevity, the ε's will be dropped from most of the subsequent development, but should be "understood" to be present.
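The fixed-design decomposition into R² plus σ² Σ w²_i can be checked by simulation; the function, noise level, and weights below are arbitrary illustrative choices, not the thesis's:

```python
# Monte Carlo check of V_Q = R^2 + sigma^2 * sum(w_i^2) for a fixed design:
# Simpson weights on n = 5 points, y(t) = t^4 on [0, 1], iid N(0, sigma^2) errors.
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 5, 0.01
h = 1 / (n - 1)
t = h * np.arange(n)
w = (h / 3) * np.array([1, 4, 2, 4, 1])      # fixed quadrature weights
M = 1 / 5                                    # exact integral of t^4
R = M - w @ t ** 4                           # remainder of the noise-free formula

reps = 200_000
y = t ** 4 + sigma * rng.standard_normal((reps, n))
emp_mse = np.mean((y @ w - M) ** 2)
theory = R ** 2 + sigma ** 2 * np.sum(w ** 2)
assert abs(emp_mse / theory - 1) < 0.02
print(f"empirical MSE {emp_mse:.3e} vs R^2 + sigma^2*sum(w^2) = {theory:.3e}")
```

With this σ the variance term dominates the squared remainder, which is the regime in which the comparisons of this chapter are most delicate.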
5.1.2 "Types" of Comparisons

In order to put the first part of the problem in perspective, consider comparisons under the following conditions:

i. D^(m+1) = 0, but D^m ≠ 0, where m is the degree of the formulae being compared.

ii. D^(m+1) ≠ 0, but D^(m+2) = 0. In this case, the bias term (−R) involves the product of a known constant and D^(m+1), which is also a constant over the interval.

iii. D^(m+2) ≠ 0. Here one can make "average" comparisons, in the sense of the mean value, D^(m+1)_ξ, or specific comparisons, but not general ones.

Under (i), the variance of formulae of degree m involves only the term in σ², so that comparisons are readily made among all such formulae. (The additional problem of unequal numbers of observations is discussed later.) Note, however, that all formulae of degree m′ > m are also comparable to those of degree m, as they also involve only the terms in σ². Thus, given D^(m+1) = 0, selection of the appropriate formula should be from all of degree m and greater.
Under (ii), comparison of formulae of degree m″ > m is covered in the preceding paragraph. Consider comparison of two formulae of degree m and m + 1, having V₁ and V₂, respectively. Now, under (ii),

    V₁ ≤ V₂  if  R²_m + a₁σ² ≤ a₂σ² ,

where R_m = c D^(m+1), so that one can make an exact comparison if one can specify the ratio (D^(m+1))²/σ², as follows:

    c² (D^(m+1))²/σ² + a₁ ≤ a₂ .

Such comparisons will be considered in terms of specific formulae in a later section.
Under (iii), it has been shown that one can develop the general series expansion of the remainder terms. Such an expression may be of interest, but its utility may be questioned. To illustrate, let two formulae of degree m be compared, and let

    R₁ = t′_(m+1) a₁  and  R₂ = t′_(m+1) a₂ ,

where a₁ and a₂ are vectors of constants and

    t′_(m+1) = [ D^(m+1), D^(m+2), D^(m+3), ⋯ ] .

Then, for any given t_(m+1), one can determine whether R₁ or R₂ is smaller. However, a general comparison is impossible, since for fixed a₁ and a₂ the choice between R₁ and R₂ will depend on the choice of t_(m+1), i.e., on the specified function and interval of interest.

However, it would seem generally admissible to choose, in the first place, a design of sufficient degree to account for all terms in the expansion known or assumed to be of appreciable magnitude. In general, it will be considered that the primary problem posed by (iii) is determination of the minimum degree which will allow acceptable approximation, so that all comparisons, as such, will be of type (ii).
5.1.3 The Relative Value of Two Formulae of Degree m

The relative "value" of two formulae of degree m can be directly evaluated in terms of mean square error (V) if the two formulae involve the same "cost." A cost differential may occur if one sampling scheme is more difficult to execute than another, or if the numbers of observations are different for the two formulae. It would be impossible to define a generally applicable cost function for, say, Tchebycheff quadratures as compared to Centric, as such a function involves a subjective element peculiar to a particular application. For this reason, no attention is here given to this problem. The problem of unequal numbers of observations is more generally defined, and is considered. Comparisons can be made independent of the cost function in n by the expedient of evaluating composite formulae, such that two composites constructed from basic formulae of unequal numbers of observations have the same total number of observations.
5.2 Comparison of Quadrature Designs

5.2.1 Comparison of R²

In comparison of the R² of two composite formulae of degree m, each of the type w* = kw/n (see Paragraph 4.7.4), where C₁₁ is the appropriate constant defined by C₁ in formula (4.38) for the remainder in the numerator, and C₁₂ is the comparable constant for the remainder in the denominator, and where the k's are also defined as in (4.38),

    R₁²/R₂² = { k₂ C₁₁ / (k₁ C₁₂) }² , constant for all n,

so that one or the other of the composite formulae has a constant advantage so far as the bias term is concerned. This can then be treated as a characteristic advantage for that basic formula.

However, if one of the formulae being compared (say the first) is a Newton-Cotes, having the relationship w* = (k − 1)w/(n − 1), then

    R²_{N-C}/R₂² = { k₂ C₁₁ / ((k₁ − 1) C₁₂) }² ( n/(n − 1) )^{2(m+1)} .   (5.5)
Thus, for two specific formulae, with specified C₁₁, C₁₂, k₁ and k₂, one can solve (5.5) for n′, such that the Newton-Cotes formula is superior for all n > n′, and inferior for all n < n′. It is noted that if the constant factor in (5.5) is not less than 1, then the Newton-Cotes formula is always inferior.

A more satisfactory comparison of Newton-Cotes formulae to other formulae can be made from the coefficients of R² in Table 5.1. For any n, it is possible to directly compare formulae by multiplying the tabled numerical coefficient of the Newton-Cotes formula by (n/(n − 1))^{2(m+1)}, where (m + 1) is the exponent of D in the constant term of R. Note that comparison of two Newton-Cotes R²'s also yields a constant,

    { (k₂ − 1) C₁₁ / ((k₁ − 1) C₁₂) }² .   (5.6)
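The structure of such (5.5)-type comparisons can be illustrated with two degree-1 composites — centric midpoint (k = 1, C₁ = 1/24, h = w/n) against the Newton-Cotes trapezoid (k = 2, C₁ = −1/12, h = w/(n − 1)) — on a function of constant D², for which the first remainder terms are exact; an illustrative sketch, not from the thesis:

```python
# Squared R-ratio of two degree-1 composites on y(t) = t^2 (constant D^2 = 2):
# the ratio is a constant times the Newton-Cotes factor (n/(n-1))^(2(m+1)), m = 1.
from fractions import Fraction as F

def R_mid(n):     # midpoint remainder on [0, 1]: 1/(12 n^2)
    h = F(1, n)
    return F(1, 3) - h * sum(((i + F(1, 2)) * h) ** 2 for i in range(n))

def R_trap(n):    # trapezoid remainder with n observations: -1/(6 (n-1)^2)
    h = F(1, n - 1)
    y = [(i * h) ** 2 for i in range(n)]
    return F(1, 3) - h * (sum(y) - (y[0] + y[-1]) / 2)

for n in (4, 8, 16):
    ratio_sq = (R_trap(n) / R_mid(n)) ** 2
    # constant advantage (ratio of the C1/k-type constants, squared) is 4,
    # times the Newton-Cotes factor (n/(n-1))^4:
    assert ratio_sq == 4 * F(n, n - 1) ** 4
print("trapezoid/midpoint R^2 ratio = 4 (n/(n-1))^4, approaching 4 as n grows")
```

The factor (n/(n − 1))^{2(m+1)} exceeds 1 and decreases toward 1, so here the constant part alone already decides the comparison for every n, as noted in the text.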
5.2.2 Comparison of Coefficients of σ²

In comparison of coefficients of σ², one can again consider the coefficient for composite formulae of equal n. Let C(n) be the coefficient of σ² in a composite formula of n observations, and C(k) that of the basic formula,

    C(k) = Σ_{i=1}^k w²_i .   (5.7)

Thus, if composite formulae are of the type such that any observation (ordinate) is used in only one application of the basic formula,

    C(n) = (n/k) C(k) , and n C(n)/w² = k C(k)/w*² .   (5.8)

That is, the coefficient of w²σ²/n is the same as the coefficient of w*²σ²/k. This relationship (5.8) holds for all formulae considered, except for the Newton-Cotes formulae, for which the ordinates at the panel junctions are shared by adjacent applications, and the following expression is appropriate:

    C(n) = ((n − 1)/(k − 1)) C(k) + (2(n − k)/(k − 1)) w²₀ .   (5.9)

It is seen, therefore, that the coefficient of w²σ²/n is constant in n, for all permissible n, for any given formula except those of the Newton-Cotes type. The reader is referred to Table 5.1 for a summary of these terms.
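The Newton-Cotes exception arises because each junction ordinate receives the sum of two basic end weights. The junction correction C(n) = ((n − 1)/(k − 1))C(k) + (2(n − k)/(k − 1))w₀² — stated here as an assumption consistent with direct counting — can be checked against a directly assembled composite Simpson rule:

```python
# Sum-of-squared-weights coefficient C(n) for composite Simpson (k = 3), h = 1 units:
# direct assembly with overlapping end weights vs the junction-corrected formula.
from fractions import Fraction as F

k = 3
basic = [F(1, 3), F(4, 3), F(1, 3)]              # Simpson weights per application
C_k = sum(w * w for w in basic)                  # = 2
w0 = basic[0]

for n in (5, 7, 11):                             # n odd: (n-1)/(k-1) applications
    wts = [F(0)] * n
    for start in range(0, n - 1, k - 1):         # adjacent applications share a node
        for j, w in enumerate(basic):
            wts[start + j] += w
    C_n = sum(w * w for w in wts)
    assert C_n == F(n - 1, k - 1) * C_k + F(2 * (n - k), k - 1) * w0 ** 2
print("junction-corrected C(n) matches the directly assembled composite Simpson")
```

For n = 7 the direct weights are (1, 4, 2, 4, 2, 4, 1)/3 with Σw² = 58/9, against 3·2 + 2·2·(1/9) from the correction.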
5.2.3 Examples of Comparison of R²

To illustrate the general procedure of comparison of remainders (or squared remainders) in series form, recall, writing Δ_(i) for the vector of differences [Δ_i, Δ²_i, Δ³_i, ⋯]′:

i. from Paragraph 1.4.4, the remainder of the first degree formula, M̂sc = Q̂^C_{1,n},

    R(Q̂^C_{1,n}) = −h Σ_{i=0}^{n-1} δ′ Δ_(i) ,  δ′ = [ 0, 1/12, −1/8, 73/240, ⋯ ] ;   (5.10)

ii. from Paragraph 4.7.6, the remainder of the first degree formula, Q̂^N_{2,n}, with

    δ′ = [ 0, −1/6, 1/4, −19/30, ⋯ ] ;   (5.11)

iii. from Paragraph 4.7.8, the remainder of the third degree formula, Q̂^C_{3,n}, with

    δ′ = [ 0, 0, 0, 21/80, 63/32, ⋯ ] ;   (5.12)

iv. from Paragraph 4.7.7, the remainder of the third degree formula, Q̂^N_{3,n}, with

    δ′ = [ 0, 0, 0, −2/15, 11/12, ⋯ ] .   (5.13)
It would appear superficially that detailed comparison of the two Centric formulae, (5.10) and (5.12), with the Newton-Cotes formulae, (5.11) and (5.13), is readily accomplished, and that one need only specify the Δ^k. However, there are several difficulties, all due to the fact that there are n panels of integration, and n terms in the summation, for the Centric formulae, but only n − 1 for the Newton-Cotes formulae. Thus, h = w/n for Centric, and h = w/(n − 1) for Newton-Cotes. Further, the differences in the formulae are defined over the respective interval, h.

It is possible, by the methods developed earlier (Section 1.4) and variously used, to express Δ^k as defined for, say, Newton-Cotes formulae in terms of Δ^k as defined for Centric formulae, by which devices one could make detailed comparisons of the formulae when modification is made for the number of terms in the summation and the length of the panels of integration. However, such modifications are very involved, at best, and it would seem that the only practical comparison of Newton-Cotes and Centric quadratures of degree greater than 1 is limited to the first term in the remainder. As earlier indicated, this can be interpreted either as an exact comparison under the assumption that higher differences are zero, or as an "average" comparison in the sense of conventional remainder terms evoking the mean value theorem.

It is seen, then, that in comparing formulae of this type, in which the series expansion of the remainder term is relatively easily obtained, it is still expedient to resort to the conventional form of the remainder in order to make a proper comparison. Therefore, development of the series expansion of the remainders will not be necessary, and the method of Paragraph 4.7.4 is used to obtain the coefficients of R² in Table 5.1.
5.2.4 Discussion of Coefficients of σ² in Formulae of Interest

Following the general considerations of 5.1.2 and the developments of 5.1.3, comparisons of the coefficient of w²σ²/n are of interest. Reference is made to Table 5.1 and to equations (5.7) and (5.9). It is seen that many of the formulae have coefficients inflated no more than 10 per cent over the minimum, 1. In particular, one notices that several of the low-order Centric and Gaussian formulae (in addition to the Tchebycheff formulae, which have coefficients of 1 by definition) are good in this respect.

Also, it is noted that the Centric formulae are considerably better than the Newton-Cotes formulae for small n in the 3rd and 5th degree formulae, but the 7th degree Centric formula is appreciably poorer, relatively. Surprisingly, the Gaussian formulae hold up well among those considered.
5.2.5 Discussion of R² in Formulae of Interest

With regard to the remainder terms (Table 5.1), the Gaussian and Tchebycheff formulae hold up well, so that if there is no penalty associated with unequal abscissae, the Gaussian and Tchebycheff formulae are decidedly best, at least for formulae of degree 7 or less. Among the formulae of evenly spaced abscissae, it is seen that no general superiority of one type or another exists.
5.2.6 General Comments on Comparison of Mean Square Error

From the two preceding paragraphs, and Table 5.1, it can be summarized that, for a given ratio (D^(m+1))²/σ² and a given cost function in n and in the configuration of abscissae, one can select the formula of degree m which is "best." However, general superiorities do not exist unless one places restrictions on the cost function.

Further, the irregularity of the coefficients in Table 5.1, i.e., the absence of clear-cut trends, leads one to wonder if there exist certain formulae, higher than those considered, which will have desirable properties. This could be examined by a computer compilation of the pertinent coefficients of formulae to arbitrary degree, and such a compilation would appear desirable.

5.2.7 Placing Restrictions on D^m

It would be most helpful in choosing the "best" formula if it were possible to infer the magnitude of (D^(m+1))²/σ² from a known (or assumed) (D^m)²/σ², or at least to be able to evaluate the relative magnitudes of the successive terms in the expansion of R. However, this does not seem possible, short of specifying the function.
5.3 Some Specific Comparisons of Mean Square Error

Consider the conventional systematic sampling estimators, M̂sc and M̂sr, and also the end-corrected systematic sample with random start, Q̂₁(t).

5.3.1 The General Inferiority of Q̂₁(t)

It was shown earlier that:

1. The bias of M̂sc was half that of Q̂₁(t), when Δ⁴ = 0.

2. The coefficient of σ²/n in M̂sc and M̂sr is 1.

3. The coefficient of σ²/n in the variance of Q̂₁(t) exceeds 1, by a term of order 1/(n − 1).

It is apparent that M̂sc is always better than Q̂₁(t), and that M̂sr is better than Q̂₁(t) for some n and some Δ²/σ². Further, if Δ⁴ = 0, a number of higher degree formulae are better than Q̂₁(t), at least for some n. In short, there does not seem to be much justification for the end-correction formula, Q̂₁(t), although some of the other formulae of this type may have better characteristics.

The case of two stage sampling will be treated later (Section 5.4), and it will be demonstrated that in repeated sampling the randomized designs achieve superiority over the fixed designs when replication is sufficiently great and the Δ² are not all zero. This result may provide a justification for the end-corrected systematic random sample, as the end corrections can be used to advantage if one has systematic random samples for a number of occasions, and then wishes to estimate the values for individual occasions.
5.3.2 The Comparison of M̂sc to M̂sr

Comparison of M̂sr with M̂sc, Q̂₁(t), and other higher order quadrature formulae can be made by use of the results of Sections 2.7 and 4.7 and the earlier part of Chapter 5. To illustrate another aspect of the comparison of expanded error terms, in cases in which the coefficients of σ² are equal, consider V(M̂sr | Δ³ = 0) and V(M̂sc | Δ³ = 0), each consisting of a term in the differences plus the common term w²σ²/n. Then,

    V(M̂sr | Δ³ = 0) < V(M̂sc | Δ³ = 0)  as  ( Δ̄ / Δ² )² < 1/240 ,   (5.18)

and never if Δ² = 0. From (5.18) it is seen that the necessary condition (under Δ³ = 0) for M̂sr to be "better" than M̂sc is for the constant second difference to be about 16 times the absolute average first difference (√240 ≈ 15.5). This implies near symmetry of the function about the mid-point of the interval, and constitutes a potential rule of thumb in application.

Similar comparisons between M̂sr and M̂sc are possible when higher differences are assumed different from zero, but little will be accomplished. The regions of superiority are more complex, and there is little hope of finding working rules. Similarly, comparison of M̂sr with higher degree quadrature formulae would be redundant, since the methods and ideas have been developed previously.
5.4 Comparisons under Two Stage Sampling

When the designs here considered are applied at the second stage of a two stage sampling scheme, the comparisons between randomized designs and fixed designs are affected, although comparisons among fixed designs are not. Since the only randomized design still under consideration is M̂sr, and comparison among fixed designs has already been treated, the present treatment will be restricted to a comparison of M̂sr and M̂sc under two stage sampling.

5.4.1 The Two General Cases

Two "cases" are recognized in the two stage sampling problem.

i. Repeated sampling is over different "realizations" of the same function and interval. The Δ_j are identical for all i; that is, constant for sampled occasions or locations (primary sampling units). Only the "measurement" error, ε, changes from realization to realization, and the ε are assumed independently and identically distributed, with mean zero.

ii. Repeated sampling is over different "realizations" of the same generating process, so that the Δ_j can be different in any distinct occasion or location (primary sampling unit). Measurement error also exists, subject to the restrictions under i.
5.4.2 Case i

Consider a universe of r′ PSU's, of which r are selected at random with equal probability. Then define

    M̂sr = (1/r) Σ_{i=1}^r M̂sr_i ,

where M̂sr_i is the conventional systematic random sample estimator applied to the ith PSU, where t, the random starting point, is selected independently for the sampled PSU's, and where

    n = r n_i , the total number of observations,
    w = r′ w_i , the total "length" of the universe.

Then,

    V(M̂sr) = (1/r)² Σ_{i=1}^r V(M̂sr_i) = (1/r)² Σ_{i=1}^r (w_i/n_i)² { Σ_j T₀ Δ_ij + n_i σ² } ,

and, under the restriction of Case i, the difference terms are the same for every i. Similarly, define

    M̂sc = (1/r) Σ_{i=1}^r M̂sc_i ,

    V(M̂sc_i) = (w_i/n_i)² { Σ_j T* Δ_ij + n_i σ² } .

Then, under the restriction of Case i,

    V(M̂sc) = (w_i/n_i)² { Σ_j T* Δ_ij + n_i σ²/r } ,

since the bias term is common to all PSU's and does not diminish with r. Thus, if one assumes Δ³ = 0, Δ² ≠ 0 (referring to the identities 1.14, 1.17), then V(M̂sr) < V(M̂sc) when

    r > 48 ( Δ̄_i / Δ̄²_i )² + 4/5 .   (5.20)

That is, the superiority of M̂sc is reduced as the number of sampled PSU's increases, and for sufficiently large r, M̂sr becomes superior. Again, however, Δ̄/Δ² must be specified for an explicit evaluation and choice of design.
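The qualitative claim — averaging independent random starts drives the M̂sr error down like 1/r, while the centric bias persists across replications — can be illustrated with a stylized Case i computation (σ = 0, y(τ) = τ² on [0, 1], n = 4 per PSU; all choices illustrative, not the thesis's):

```python
# Case i, sigma = 0: MSE of r-fold averages of random-start vs centric systematic
# samples of y(tau) = tau^2. The centric bias never averages out; the random-start
# variance shrinks like 1/r, so sufficiently large r favors the randomized design.
import numpy as np

n = 4
h = 1.0 / n
M = 1.0 / 3.0                                   # exact area under tau^2 on [0, 1]

def Q(t):                                       # systematic sample with start t in (0, 1)
    return h * np.sum(((np.arange(n) + t) * h) ** 2)

starts = (np.arange(100_000) + 0.5) / 100_000   # fine grid over the random start
nodes = (np.arange(n)[None, :] + starts[:, None]) * h
errs = h * np.sum(nodes ** 2, axis=1) - M

var_sr = np.mean(errs ** 2)                     # MSE of one random-start sample (unbiased)
bias_sc = Q(0.5) - M                            # centric: fixed start t = 1/2

mse_sr = lambda r: var_sr / r                   # r independent starts, averaged
mse_sc = lambda r: bias_sc ** 2                 # bias floor, independent of r

assert mse_sc(1) < mse_sr(1)                    # a single sample: centric wins
assert mse_sr(400) < mse_sc(400)                # heavy replication: random start wins
print(f"crossover near r = {var_sr / bias_sc ** 2:.0f}")
```

The crossover replication depends entirely on the ratio of start-variance to squared bias, which is the role played by Δ̄/Δ² in the text.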
5.4.3 Case ii

Consider realizations of a process such that a polynomial of the same degree is appropriate for the realization characterizing each PSU, but such that the coefficients are subject to change. Then, if one samples each PSU in accordance with schemes yielding, say, M̂sc and M̂sr, the centric systematic sample will have the same over-all variance as if the error were simple measurement error and the function were the sum of the realizations. That is, when r = r′,

    V(M̂sc) = ε { Σ_i Σ_j T* Δ_ij }² + w²σ²/n ,   (5.21)

whereas the error of M̂sr, when an independent random start is chosen for each PSU, is

    V(M̂sr) = (w*²/n²) Σ_i ε { Σ_j T₀ Δ_ij }² + w²σ²/n .   (5.22)

(Expression (5.22) follows directly from (2.20), as the sampling scheme reduces to stratified sampling, with systematic random sampling within strata.)

Formulae (5.21) and (5.22) can be compared, letting Δ³ = 0, Δ² ≠ 0, and specifying that Δ² is not necessarily the same for all realizations. Then V(M̂sr) < V(M̂sc) when the second differences are sufficiently large, relative to the first differences, to make M̂sr superior to M̂sc. As in Case i, it is of greater interest to consider a sample of PSU's than to include all in the sample. The two-stage sample variances of the two schemes under Case ii are developed below.
    V(M̂sr·) = ε { (1/r) Σ_{i∈s} M̂sr_i − M̄ }²
            = (1/r)² [ ε { Σ_{i∈s} (M_i − M̄) }² + ε { Σ_{i∈s} (M̂sr_i − M_i) }² ] ,   (5.24)

since the expectation of cross products is zero. The last term is the within-PSU component, so that

    V(M̂sr·) = ((r′ − r)/(r′ r)) V_r(M) + (1/r) V(M̂*sr) ,   (5.25)

where V(M̂*sr) is defined by (5.22).
However, the comparable expression for V(M̂sc·) involves the cross product term,

    ε [ { Σ_{i∈s} (M_i − M̄) } { Σ_{i∈s} (M̂sc_i − M_i) } ] ,

which may be interpreted as the covariance of the sum of the sample M_i and the bias of the centric systematic estimator over a sample of r realizations. Also, in the case of the centric designs,

    ε { Σ_{i∈s} (M̂sc_i − M_i) }² = Σ_{i∈s} ε (M̂sc_i − M_i)² + Σ_{i≠j} ε [ (M̂sc_i − M_i)(M̂sc_j − M_j) ] ,

the second term of which is the covariance among the realization biases. Thus,

    V(M̂sc·) = ((r′ − r)/(r′ r)) V_r(M) + (1/r) V(M̂*sc)
             + (2/r²) ε [ { Σ_{i∈s} (M_i − M̄) } { Σ_{i∈s} (M̂sc_i − M_i) } ]
             + (1/r²) Σ_{i≠j} ε [ (M̂sc_i − M_i)(M̂sc_j − M_j) ] .   (5.26)

Since both of the covariance terms in (5.26) can generally be expected to be positive, no general effect of sampling of realizations on the relative merits of the random and centric designs can be inferred. Again, all comparisons require specification of the functions.

5.4.4 Generalization of Two-Stage Results

It is apparent that the results of the preceding section can be generalized to cover all quadrature designs under two-stage sampling. In particular, the generalization follows the form of (5.26), with appropriate simplifications in accordance with the nature of the particular design:

    V(Q̂·) = ((r′ − r)/(r′ r)) V_r(M) + (1/r) V(Q̂*)
           + (2/r²) ε [ { Σ_{i∈s} (M_i − M̄) } { Σ_{i∈s} (Q̂_i − M_i) } ] + ⋯ ,   (5.27)

where Q̂* carries the appropriate terms for the square of the difference expansion of (Q − M), and C is a function of the particular design, as described in Table 5.1. It was earlier shown that truncation of this expression to the first non-zero term is equivalent to utilizing the conventional remainder term.

It is anticipated that (5.27) will be of value in designing two stage samples. Modifications in design at the first stage may be justified when restrictions can be placed on the M_i. For example, certain knowledge of the M_i might lead to a systematic selection of PSU's. Such designs are not considered here, though it is apparent that the present results are directly applicable under stratification at the first stage.
Table 5.1  Components of mean square error, R² + C w²σ²/n, of quadrature formulae. The general term refers to repeated application of the basic formulae along an interval. R² for the basic formula is easily obtained from the table of remainders, Table 4.5.

(Columns: formula designation; the coefficient of w²σ²/n for the basic formula and for the general, composite, form; and the coefficient of R² together with its common term in w, D, and n — terms of the form 10⁻²(w³D²/2! n²)², 10⁻²(w⁵D⁴/4! n⁴)², 10⁻³(w⁷D⁶/6! n⁶)², and (w⁹D⁸/8! n⁸)². Rows cover the Newton-Cotes, Centric, Gaussian, and Tchebycheff formulae of degrees 1 through 7, and Q̂₁(t).)

a Formulae designations are of the form N(n), C(n), G(n), T(n), where n refers to the number of observations in the basic formula, with the exception of Q̂₁(t), in which case the subscript refers to the degree.

b Substitute (n − 1) for n in the common term. With the exception of those indicated by b, all R²'s having the same common term can be compared directly, for all n, by comparison of the coefficient of R². When b appears by the coefficient, it must (for purposes of comparison) be multiplied by (n/(n − 1))^{2(m+1)}, where (m + 1) is the power of n in the common term.
94
Chapter 6.
ESTIMATION OF VARIANCE OF ESTIMATE OF QUADRATURE DESIGNS
6.1 Introduction

Estimators of the variance (mean square error) of simple quadrature
designs (M̂_sr, M̂_sc and others) were considered in Chapter 3. In the
present chapter, estimators of variance of higher degree formulae are
investigated. Results are also extended to the case of two-stage sam-
pling.
6.1.1 Orientation
In Chapter 3, it was demonstrated that the difference expansions
of the squared error of the conventional estimators of systematic
sampling could be estimated by linear combinations of squares of linear
combinations of differences. Further, it was pointed out that such a
procedure was of dubious value, since the estimator of mean ordinate
should be modified to account for all terms of known significant
magnitude in the remainder. That is, information about the remainder is
really information about the mean ordinate, and should be used as such
if possible, rather than as information about error of estimate.

It is readily apparent that similar estimators of difference expan-
sions of variance of quadrature formulae can be constructed, if all
of the information has not been used in the quadrature formulae. Sev-
eral cases are considered in the following paragraphs in which such
information remains available. Except where indicated to the contrary,
it is assumed that the basic formulae contain no unused information.
6.1.2 Successive Application of Quadratures Along an Interval

This is the case treated in Chapter 3. Here, interest is in quadrature
formulae of higher degree. As before, the available information
lies in contrasts between the individual quadratures comprising the
whole. As use of these contrasts in estimating variance requires as-
sumptions regarding the "degree" of the underlying polynomial, the use
of supplementary observations will be useful.
6.1.3 Application of Quadratures, Case i, Paragraph 2.4.1

This is one of the two cases of two-stage sampling that were con-
sidered in Chapter 5. Contrasts between PSU's yield information about
σ² only. As will be seen later, this information is quite useful,
but its use is dependent on the assumption that the polynomial and the
interval are the same. Here supplementary observations are necessary
to obtain any information about R.
6.1.4 Application of Quadratures, Case ii, Paragraph 2.4.1

This is a generalization of the preceding paragraph, in which R is
no longer constant over PSU's. Here, contrasts between PSU's yield
information about σ² and the variation in R, but not separately. Again,
it is necessary to use supplementary observations in order to obtain
any information about the magnitude of R.
6.1.5 Application of Systematic Random Sample

The systematic random sample, with conventional estimator, M̂_sr,
provides an exception to the statements of the three preceding para-
graphs regarding limitation of the contrasts in estimation of mean
square error. If the "realizations" (PSU's) sampled are from an in-
finite population, the contrasts provide an unbiased estimator of the
variance of the over-all estimate. However, if this population is
finite, then supplementary observations are necessary in order to con-
struct an appropriate variance estimator.
6.1.6 Application of Least Squares Quadratures

Least squares quadratures also provide an exception to paragraphs
6.1.2, 6.1.3 and 6.1.4, as information about the variance is contained
within the PSU's. As earlier indicated, the least squares quadrature
is also valuable as an "average" of the successive fitted polynomials of
the case in Paragraph 6.1.2 and may generally be better than that case
(although this was not proven), so that estimation of variance of a
single least squares quadrature is of interest.

From the results of Chapter 5, it appears likely that the degree
of the least squares quadrature which provides the minimum mean square
error is lower than the degree of the fitted polynomial which, in the
conventional treatment of polynomials, provides the minimal residual
mean square. Thus, it must be considered that R, and hence r_i, the con-
tribution of bias to the difference between y_i and ŷ_i (Paragraph 6.2.1),
will not be negligible for the optimal degree with regard to quadrature.

A moderate amount of effort, with little success, was expended on
the problem of obtaining an estimator of variance of the least squares
quadrature from the residual mean square. Results are reported in
Section 6.3. It appears that direct use of the residual mean square
is not too damaging.
6.2 Supplementary Observations

Supplementary observations, y_i = y(τ_i) + ε_i, taken at random values
of τ within the interval of interest, provide information about the
error of quadrature in the form of

    e_i = y_i − ŷ_i

where ŷ_i is the value of the fitted polynomial implied by the quadrature
formula at the point τ_i. Now, in the calculus of finite differences,
ŷ_i can be written down in terms of various interpolation formulae.
However, the results of Section 4.8 provide a useful framework in fa-
miliar notation that does not require development of additional for-
mulae. In use of the notation of orthogonal transformations, recall
that the transformation is general, and that the results apply equally
to all quadrature designs.

In the subsequent development, it is specified that y_i = z_i + ε_i.
Restrictions on this general expression will be made in the various
paragraphs.

As in Section 4.8, the variable x is used for the argument of y in
development of general results. The primary justifications for this
change are that re-definition of vectors is simplified, and the range
of x is conveniently manipulated.
6.2.1 Estimation of R

Consider the estimation of R by use of supplementary observations,
when it is assumed that measurement error is absent. Then,

    R = M̂ − M = h ∫₀ⁿ r(x) dx

where r_i = r(x_i). That is, r(x) is the signed difference between the
fitted polynomial and the true function at the point x, 0 ≤ x ≤ n.
Now r(x) is observable, and if values of r(x) are taken at k randomly
chosen values of x, x ~ U(0, n), then

    R̂ = (nh/k) Σ_{i=1}^{k} r(x_i)        (6.1)

is an unbiased estimator of R.
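The estimator (6.1) can be sketched as a small Monte Carlo routine. This is an illustrative reading of the formula, not the thesis's own computation; the functions `y_true` and `y_fit` are hypothetical stand-ins for the true function and the fitted polynomial.

```python
import random

def estimate_R(y_true, y_fit, n, h, k, seed=0):
    """Monte Carlo version of (6.1): with r(x) = y_fit(x) - y_true(x) observed
    at k values of x drawn uniformly on (0, n),
        R-hat = (n*h/k) * sum r(x_i)
    is unbiased for R = h * integral_0^n r(x) dx (no measurement error)."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, n) for _ in range(k)]
    return (n * h / k) * sum(y_fit(x) - y_true(x) for x in xs)
```

As a check on the scaling: if the fitted polynomial exceeds the function by a constant c everywhere, every supplementary point reports r(x) = c and the estimator returns n·h·c exactly, which is indeed h ∫₀ⁿ c dx.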
6.2.2 The Estimation of R²

It will be recognized that

    E(R̂²) = R² + (nh)² σ_r²/k ≥ R²,        (6.2)

and that σ_r² ≠ 0, even though R = 0, unless y(x) is of the same degree
as ŷ(x). Further recall that many formulae are defined so that R = 0
even when y(x) is a polynomial of (specified) higher degree than ŷ(x),
so that the inequality (6.2) may be great indeed.

Thus, it is desirable that an estimator of R² be found which will,
in effect, reflect as much as possible of the "hidden precision":

    (6.3)

    (6.4)

Now the Quenouille estimator is appropriate if the bias of R̂² is
expandable in a power series in 1/k, in which case the adjusted
estimator has a bias of order 1/k², compared to the bias of R̂² of
order 1/k (Quenouille, 1956).
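The Quenouille (1956) adjustment cited here can be sketched generically. The statistic `mean_sq` below is only a hypothetical stand-in with the same structure as R̂² (the square of a sample mean, biased upward by a variance term of order 1/k); it is not the thesis's estimator.

```python
def quenouille(stat, sample):
    """Quenouille (1956) bias reduction: if the bias of stat is expandable in
    powers of 1/k, then k*T - (k-1)*mean(leave-one-out T's) has bias of
    order 1/k^2 instead of 1/k."""
    k = len(sample)
    loo = [stat(sample[:i] + sample[i + 1:]) for i in range(k)]
    return k * stat(sample) - (k - 1) * sum(loo) / k

def mean_sq(xs):
    """(sample mean)^2: biased upward for (population mean)^2 by Var/k."""
    m = sum(xs) / len(xs)
    return m * m
```

For the square of a mean, the adjusted value reproduces the classical correction x̄² − s²/k exactly, which is why the residual bias drops to order 1/k².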
6.2.3 The Estimation of V(M̂)

Let the function be a polynomial of the same degree (n − 1) as the
polynomial specified by the formula, so that r(x) ≡ 0, but specify that
measurement error is present. Recall, in this case,

    V(M̂) = h² σ² Σ_{i=1}^{n} w_i²

where the w_i are the weights of the formula.
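The variance formula above is a one-liner, shown here with Simpson weights as an assumed example (M̂ = h Σ w_i y_i with w = (1/3, 4/3, 1/3) on three equally spaced ordinates):

```python
def quadrature_variance(weights, h, sigma2):
    """V(M-hat) = h^2 * sigma^2 * sum(w_i^2) for M-hat = h * sum(w_i * y_i),
    with independent measurement errors of variance sigma^2."""
    return h * h * sigma2 * sum(w * w for w in weights)
```

For the Simpson weights the sum of squares is 2/9 + 16/9 = 2, so V(M̂) = 2 h² σ².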
Then, the difference between the supplementary observation, y(x), and
the fitted polynomial, at x, is

    e = y − ŷ

where, in the notation of Section 4.8, one can write

    ŷ = x′β̂ = x′(X′X)⁻¹X′Y

where Y is the vector of observations used in the formula and x′ is the
vector [1, x, x², …, x^{n−1}] defined for the value of x chosen for the
supplementary observation. Then E(e) = 0 and

    E(e²) = {1 + x′(X′X)⁻¹x} σ².

Hence

    s² = {1 + x′(X′X)⁻¹x}⁻¹ (y − ŷ)²        (6.5)

is an unbiased estimator of σ².

Further, consider k random values of e. Then

    E Σ e_i² = {k + Σ x_i′(X′X)⁻¹x_i} σ²        (6.6)

and

    E (Σ e_i)² = {k + (Σ x_i)′(X′X)⁻¹(Σ x_i)} σ²        (6.7)

so that unbiased estimators of σ² may be derived by solving (6.6) and
(6.7) in the manner of (6.5). The estimator defined by (6.6) is ap-
propriate in the case considered in the present paragraph, but both
(6.6) and (6.7) will be useful in subsequent development.
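The leverage factor x′(X′X)⁻¹x in (6.5) has a closed form in the simplest case. The sketch below assumes a straight-line fit (rows of X are [1, x_i]); the general polynomial case replaces the 2×2 inverse with a full matrix solve.

```python
def line_leverage(xs, x0):
    """x'(X'X)^{-1}x for a straight-line fit through design abscissae xs,
    evaluated at a supplementary abscissa x0 (closed 2x2 form)."""
    n, sx = len(xs), sum(xs)
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx
    # (X'X)^{-1} = (1/det) [[sxx, -sx], [-sx, n]]
    return (sxx - 2.0 * sx * x0 + n * x0 * x0) / det

def s2_unbiased(e, lev):
    """One supplementary residual e = y - y_hat gives, as in (6.5),
    s^2 = e^2 / (1 + x'(X'X)^{-1}x), unbiased for sigma^2."""
    return e * e / (1.0 + lev)
```

With two design points at 0 and 1, the leverage at the midpoint is 1/2 (the fitted line at the midpoint has variance σ²/2), so a residual e there is divided by 3/2.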
6.2.4 The General Case

Let the fitted polynomial be of lower degree than the function and
measurement error be present. The results of Paragraphs 6.2.2 and
6.2.3 are pertinent. As before, E(e_i) = r_i and

    E(e_i²) = r_i² + {1 + x_i′(X′X)⁻¹x_i} σ².

Then define

    (6.8)

and, after Paragraph 6.2.2,

    (6.10)

Then,

    (6.11)

Then, since the k values of x are selected at random, one can write
the expectation of the estimator over x. Recalling the above, in
expectation over x, the estimator V* would seem satisfactory. However,
the attempt to construct estimators of the form of (6.5), i.e., in
which the correct coefficients are obtained conditional on x, was
unsuccessful. The problem, then, must be considered only partially
solved, since a conditional estimator is patently desirable.
6.2.5 An Alternate Form

The term ((nh/k) Σ_{i=1}^{k} e_i)², employed in the preceding paragraph and
made up of observed deviations from the fitted curve, can be rewritten

    (Σ e_i)² = {Σ y_i − (Σ x_i)′ β̂}²

where Σ y_i is the sum of the supplementary observations, Σ x_i is the
sum of the vectors defined for the chosen supplementary observations,
and β̂ = (X′X)⁻¹X′Y, from the n observations comprising the formula.
6.2.6 Separate Estimation of Components of Mean Square Error

A useful device in problems of this sort is that of estimating the
two components separately, or at least one of them independently of
the other. It is apparent that when it is possible to sample so as to
obtain clean estimates of σ², it is then possible to construct esti-
mators of the form above, with any desired coefficient C, by linear
combination of the estimator of σ² and the estimator of the preceding
paragraphs. Unfortunately, it is seldom possible in the applications
considered here to obtain clean estimates of σ². An exception is the
situation noted in Paragraph 6.1.3.
6.3 Differences Between Successive Quadratures Along an Interval

Although the differences between successive quadratures along an
interval are considered of little general value, except perhaps in the
cases discussed in Chapter 3, it is worthwhile to briefly examine one
example in a quadrature formula of higher degree. Consider, for example,
successive application of Simpson's rule along an interval, where

    X_i = 1 if i = 0, 2, 4, 6, …
        = 0 otherwise.

Thus,

    (6.12)

Let

    θ = C Σ_{i=1}^{n−3} (1 − X_i) [4(Y_{i+2} − Y_i) + (Y_{i+3} − Y_{i−1})]².

Then, choose C such that, when measurement error is present, the
contribution of that error to E(θ) is σ². By inspection, C must be

    C = 1 / [17(n − 3)].

Then, expanding θ in terms of the differences Δ, one obtains

    θ = C Σ (1 − X_i) [Δ²_{i+1} − Δ²_{i−1} + 6(Δ_i + Δ_{i+1})]²        (6.13)

Then (6.13) can be manipulated by the methods of Section 1.4, depend-
ing on the form desired.
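The raw material for contrasts of this kind is the sequence of individual Simpson panel estimates along the interval. A minimal sketch (not the thesis's notation; `y` is just an assumed vector of equally spaced ordinates):

```python
def simpson_panels(y, h):
    """Successive application of Simpson's rule along an interval: y holds
    2m+1 equally spaced ordinates; returns the m individual panel estimates
    (h/3) * (y_{2j} + 4*y_{2j+1} + y_{2j+2})."""
    return [(h / 3.0) * (y[i] + 4.0 * y[i + 1] + y[i + 2])
            for i in range(0, len(y) - 2, 2)]
```

Differences between successive panel estimates are then the contrasts whose expectation, under measurement error, motivates the choice of the constant C above.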
6.4 Least Square Residuals

As indicated earlier, residuals from least squares polynomials were
examined as a possible estimator of mean square error of quadrature.
These residuals can be represented as y_i − ŷ_i, where the y_i are the
observations used in the formula and the ŷ_i the corresponding points
on the fitted polynomial. It is well known that

    (6.14)

where it is recalled that the r*_i are defined for the fixed observation
points of the formula, and may be expressed as the difference between
the fitted polynomial and the function at those points.

The difference in the r*_i and the r_i of the preceding sections is
immediately apparent: the present remainders are fixed, and those of the
supplementary observations are random. As r(x) should be smaller in
magnitude in the neighborhood of the points of fit, it appears that the
average r*_i² is less than E(r_i²). However, it was not determined that

    [1/(n − k)] Σ_{i=1}^{n} r*_i² < E(r_i²),

or even that a weaker form of the inequality holds, although the latter
would seem likely. Moderate effort to solve these questions analytically
was unsuccessful, and it is suggested that empirical examination of the
latter two inequalities would be quite valuable.
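The suggested empirical examination can be set up in a few lines. The example below is an illustration under assumed conditions (a straight-line fit to y = x², a case the thesis does not itself tabulate), comparing the fixed squared residuals at the points of fit with squared deviations at off-design points:

```python
def line_fit(xs, ys):
    """Ordinary least squares line a + b*x through the points of fit."""
    n, sx, sy = len(xs), sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return (sy - b * sx) / n, b

def sq_deviations(f, xs, a, b):
    """Squared differences f(x) - (a + b*x) at any chosen abscissae."""
    return [(f(x) - (a + b * x)) ** 2 for x in xs]
```

Fitting at xs = [0, 1, 2] gives fixed squared residuals (1/9, 4/9, 1/9), while the squared deviations at the midpoints 0.5 and 1.5 are each (5/12)²; which way the inequality goes evidently depends on the function, as the discussion above anticipates.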
In spite of these difficulties, it is quite apparent that the resi-
dual mean square is useful in estimating quadrature variance when the
r*_i² are known to be dominated by σ². In this case, the residual mean
square can be used as an approximation to σ², and in conjunction with
supplementary observations to yield an estimator of V(M̂).

The central theme of the present chapter is that supplementary
observations are necessary in order to estimate quadrature errors in
practical situations. Many questions, such as allocation in two-stage
sampling and the necessary number of supplementary observations, have
not been answered, and are left for future investigations. It can be
remarked that this approach to the problem is of particular interest
because the designer is forced to explicitly allocate his resources to
the levels of inference, estimates and measures of uncertainty. This
allocation is implicit in all sampling schemes, but seldom as well
defined as in the presently considered designs.
Chapter 7.
SUMMARY AND CONCLUSIONS
7.1 Introduction

The general problem of sampling a continuous function over a
specified closed interval is considered. The traditional approaches
are: simple random sampling, stratified random sampling and systematic
random sampling. The last is of particular interest because of prac-
tical considerations in field application. Traditionally, the param-
eter of interest is considered to be the mean ordinate; in the present
treatment, the problem is oriented toward estimation of area under the
curve, and numerical integration methods are investigated.

The current status of systematic sampling is summarized nicely in
the words of Cochran (1953, p. 185),

"Systematic samples are convenient to draw and to execute.
In most of the studies they compared favorably in precision
with stratified random samples. Their disadvantages are
that they may give poor precision when unexpected period-
icity is present and that no trustworthy method for estimating
(variance) from the sample data is known."

That is, the gain in precision from systematic sampling is not
reflected in the conventional variance estimators, and the procedure is
occasionally dangerous if the investigator doesn't know his experi-
mental material. In the present investigation, a moderately successful
attempt was made to:

1) evaluate conventional estimators of mean ordinate and variance
by methods of numerical analysis.

2) evaluate classical quadrature formulae as estimators in
systematic sampling, and further exploit the interpretation
of mean ordinate as area under the curve.

3) develop estimators of variance of quadrature by the methods
of numerical analysis.

The experimental material was assumed to be a continuous function
over a finite interval.
7.2 Estimators of Mean Ordinate (Area Under the Curve)

The estimators considered were the conventional mean observation,
a number of classical quadrature formulae, and the integral of a least-
squares polynomial. Some of the quadrature formulae utilize equally
spaced abscissae, while others require unequal spacing. The quadrature
methods apply also to observations taken at random over the interval,
but this case was not treated except as a simple mean. (However, the
application of least squares quadratures to random observations fol-
lows conventional theory.)

The formulae, M̂, for estimating area under the curve are developed
and evaluated in Chapters 2 and 4, and in Tables 4.1, 4.3 and 4.4 are
presented quadrature formulae for small n. Considerable attention was
given to the problem of choosing a formula (or design) from several
eligible ones. In general, this choice can be made only if the function
and interval are explicitly specified. The attempt to define general
classes of functions such that one or another formula is best for a
given class was unsuccessful. However, when classes of functions were
defined in terms of restrictions on the difference expansion (for
example, letting Δ³ = 0, thereby specifying a quadratic function), use-
ful comparisons of the formulae were possible.
Some useful conclusions regarding the estimators of mean ordinate
(area under the curve) are:

1) The unequally spaced quadrature designs (e.g., Gaussian and
Tchebycheff) are generally superior to the equally spaced
ones, and it is abundantly clear that these should be used
when practicable (Table 5.1).

2) Of the equally spaced designs, the centric designs are gen-
erally best for small n and low degree, with the Newton-Cotes
designs better for high degree and large n. Selection of
the better design in a particular case requires specification
of the relative magnitude of σ².

3) The integral of the least squares polynomial is considered
superior to repeated application of a fixed formula along an
interval of a function, degree of formulae remaining constant.
The least squares quadratures are also quite practical, as
existing tables of orthogonal polynomials are readily accessible
(e.g., Anderson and Houseman, 1942) and the methods are
more familiar to statisticians than are the methods of finite
differences.

4) It is well known (e.g., Cochran, 1953, p. 170) that in sam-
pling a function with a linear trend (i.e., Δ² = 0), the
systematic random sample is inferior to the stratified random
sample (both with conventional estimator). In Paragraph
2.7.3 it is shown that the superiority of the stratified
sample is always reduced when Δ² ≠ 0, Δ³ = 0, and that for
sufficiently large n, systematic sampling is superior under
these conditions.

5) The comparison of the systematic random sample with the
centric systematic sample (both with conventional estimator)
is of interest. Although M̂_sr is unbiased, and M̂_sc is not (in
general), M̂_sc is unbiased for Δ² = 0, and is in this case
superior to M̂_sr. Further, when Δ² ≠ 0 and Δ³ = 0 (i.e., when
the function is quadratic), the constant second difference must
be 16 times the absolute average first difference for M̂_sr to
be superior to M̂_sc (Paragraph 5.3.2). This condition implies
symmetry of the function about the mid-point of the interval
[0, w].

6) When two-stage sampling is employed, with simple random se-
lection of r PSU's (occasions or intervals), and application
of systematic sampling to a continuous function at the second
stage, further comparisons of systematic random and systematic
centric sampling are of interest (Paragraphs 5.4.2 and 5.4.3).
It is shown that the superiority of M̂_sc over M̂_sr is reduced
as the number of sampled PSU's increases, and is increased
with the increased variation among the functions character-
izing the PSU's.

7) The end-correction estimator for systematic random sampling,
Q̂₁(t), is developed as a first degree quadrature formula in
Paragraph 4.6.3. It is shown (Paragraph 4.6.4) that the bias
of Q̂₁(t) is exactly double that of M̂_sc, under the restric-
tion Δ³ = 0. Further, it is apparent from Table 5.1 that,
under this restriction, a number of higher degree formulae
are better than Q̂₁(t), at least for some n. Thus, there seems
little justification for use of Q̂₁(t) in the one-stage sam-
pling case (Paragraph 5.3.1).

However, in the two-stage and stratified cases, when
systematic random sampling is advantageous at the second stage
(or within strata) by virtue of the accuracy of the over-all
estimator, one may still be interested in computing estimates
for individual strata or PSU's. In this case, the formula
Q̂₁(t) will be useful, although it is likely that the more
general least squares estimator will be more accurate.
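Conclusion 1) is easy to illustrate numerically. The sketch below compares a two-point Gauss-Legendre estimate of the mean ordinate with the three-point equally spaced (Simpson) estimate; this is a textbook illustration under an assumed test function, not a computation from the thesis.

```python
def gauss2_mean(f, a, b):
    """Two-point Gauss-Legendre estimate of the mean ordinate of f on [a, b]:
    average of f at the nodes midpoint +/- (b-a)/(2*sqrt(3))."""
    m, r = (a + b) / 2.0, (b - a) / 2.0
    t = 3.0 ** -0.5
    return (f(m - r * t) + f(m + r * t)) / 2.0

def simpson_mean(f, a, b):
    """Three-point equally spaced (Simpson) estimate of the mean ordinate."""
    return (f(a) + 4.0 * f((a + b) / 2.0) + f(b)) / 6.0
```

Both rules are exact through cubics; for y = x⁴ on [0, 1] (true mean 0.2), the two-point Gaussian errs by about 0.0056 against Simpson's 0.0083, with one fewer observation.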
7.3 Estimators of Variance (Mean Square Error)

It is demonstrated in Chapter 3 that unbiased estimators of the
truncated difference expansion of mean square error of quadrature
formulae can be constructed in the case of repeated application of
formulae along an interval. It was also pointed out that such esti-
mators involve information about the mean ordinate. One of the more
important questions for further investigation can then be formulated:
how much of the information about mean ordinate should be used in the
estimator of variance?

However, in those situations in which estimators of variance of
quadrature can be constructed from the differences between successive
quadratures, the least squares quadratures are preferred [see 3),
Paragraph 7.2], so that there would seem to be little opportunity to
employ such estimators.

The more important sources of information about mean square error,
in the one-stage sampling case, seem to be a) residuals from least
squares fit, and b) supplementary observations. The investigation of
these two sources must be considered inadequate for completely satis-
factory use of the information, although usable results were obtained
(Chapter 6). Again, both of these sources provide information about
the mean ordinate, and, again, the question arises regarding the
amount of this information that should go into measurement of accuracy.
Some useful results are:

1. In the case of a systematic random sample, and estimator
M̂_sr, an unbiased estimator of V(M̂_sr) (given Δ² = 0) is given by:

   a. the mean square successive difference, δ², if n = 6
   (Paragraph 3.3.1).

   b. the mean square successive difference of lag L, δ_L²,
   if n = 6L (Paragraph 3.3.2).

   c. [(n − 6) δ₂² + (24 − n) δ₁²] / 18, for general n
   (Paragraph 3.3.2).

2. The squared mean alternate difference (Paragraph 3.3.3)
is shown to be always positively biased.

3. All of the conventional estimators considered are related, and
are generated by linear combinations of squares of linear com-
binations of sample differences.

4. An estimator, V*, of variance of quadrature is developed
(Paragraph 6.2.4) from supplementary observations. This
estimator appears to have useful properties.

5. An attempt to evaluate the use of least squares residuals was
unsuccessful, but these residuals are surely useful when it
is known that the r*_i² are dominated by σ² (Section 6.4).
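The mean square successive difference in result 1 can be sketched in the standard von Neumann et al. (1941) form; the exact divisor used for the thesis's δ² and δ_L² may differ from this convention, so treat the scaling as an assumption.

```python
def msd(y, lag=1):
    """Mean square successive difference of lag L (von Neumann et al., 1941
    convention): sum of (y_{i+L} - y_i)^2 over the n - L available pairs,
    divided by 2(n - L)."""
    n = len(y)
    return sum((y[i + lag] - y[i]) ** 2 for i in range(n - lag)) / (2.0 * (n - lag))
```

For a sequence with a smooth trend the successive differences are small, which is exactly why this statistic reflects the local (error) variation rather than the spread of the whole sample.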
7.4 Suggestions for Further Investigation

7.4.1 Empirical Studies

Several empirical studies are suggested for problems encountered
that do not appear analytically tractable. These are:

1. A study of least square residuals when the degree of the
fitted curve is less than that of the true function (Section 6.4).

2. A study of "characterizing" real populations (functions)
in terms of physical features, such that the characterization
can be translated into statements about the terms in the
power series expansion of the function.

7.4.2 Examination of Quadrature Formulae

It was suggested in Paragraph 5.2.6 that the irregularity of the
patterns among coefficients of σ² and R² (in Table 5.1) suggests that
there may be higher degree quadratures with very desirable properties.
For this reason, a more complete compilation of formulae would appear
warranted.
In addition, a general extension of quadrature formulae in the
random systematic sample would be desirable. In particular, an exten-
sion of the least squares quadratures for equally spaced obser-
vations would be most valuable, as these would provide formulae for the
random systematic sample, as well as for other designs requiring
equally spaced observations.

7.4.3 Decision as to Degree of Formula

Further investigation is needed with regard to the question of
degree of formula required in a particular case. This question is
related to the second problem of Paragraph 7.4.1, and also to the
further development of least squares quadratures.
7.4.4 Characteristics of Variance Estimators

The only characteristic of the variance estimators receiving
attention in this dissertation is bias. Other features of the dis-
tribution of these estimators are of practical interest and should be
investigated.

7.4.5 Allocation of Resources

Attention was called on several occasions to the necessity of
utilizing, in the variance estimator, part of the information regarding
area under the curve. The general problem of deciding how much such
information should be diverted to estimators of precision is worthy
of investigation.
7.4.6 Sampling in Two Dimensions

It is quite desirable that the results of this investigation
be extended to two dimensions. The cubature formulae of numerical
analysis provide a direct extension of the present results to the two-
dimensional problem, in which mean ordinate is best interpreted as
volume under a surface. It is apparent that this is also an extension
of conventional theory of response surfaces, which theory also
utilizes results of classical numerical analysis.

7.4.7 Development as a Problem in Time Series

It was mentioned in Section 1.1 that the present problem could
well be considered a problem in Time Series, although the methods of
conventional Time Series were not employed in this dissertation. How-
ever, it seems likely that an application of the models of Time Series
Analysis to the objective of estimation of area under the curve would
be rewarding.
LIST OF REFERENCES

Aitken, A. C. 1939. Statistical Mathematics. Oliver and Boyd, Edinburgh.

Anderson, R. L. and E. E. Houseman. 1942. Tables of orthogonal polynomial
values extended to N = 104. Res. Bull. 297, Agric. Exp. Sta., Ames, Iowa.

Booth, A. D. 1955. Numerical Methods. Butterworths Scientific Publications,
London.

Cochran, W. G. 1946. Relative accuracy of systematic and stratified random
samples for a certain class of populations. Annals of Math. Stat. 17:164-177.

Cochran, W. G. 1953. Sampling Techniques, 1st edition. John Wiley and Sons,
New York.

Daniell, P. J. 1940. Remainders in interpolation and quadrature formulae.
Math. Gazette 24:238-244.

Das, A. C. 1950. Two dimensional systematic sampling and associated
stratified and random sampling. Sankhya 10:95-108.

Guest, P. G. 1961. Numerical Methods of Curve Fitting. Cambridge Univ. Press.

Haynes, J. D. 1948. An empirical investigation of sampling methods for an
area. M.S. Thesis, North Carolina State College, Raleigh.

Kopal, Z. 1955. Numerical Analysis. John Wiley and Sons, New York.

Kunz, K. S. 1957. Numerical Analysis. McGraw-Hill Book Company, New York.

Madow, W. G. 1949. On the theory of systematic sampling, II. Annals of
Math. Stat. 20:333-354.

Madow, W. G. and L. Madow. 1944. On the theory of systematic sampling, I.
Annals of Math. Stat. 15:1-24.

Milne, A. 1959. The centric systematic area-sample treated as a random
sample. Biometrics 15:270-297.

Moran, P. A. P. 1950. Numerical integration by systematic sampling.
Proc. Camb. Philos. Soc. 46:111-115.

Moran, P. A. P. 1957. Approximate relations between series and integrals.
Math. Tables and Aids in Computing. 34-37.

Von Neumann, J., R. H. Kent, H. R. Bellinson and B. I. Hart. 1941. The mean
square successive difference. Annals of Math. Stat. 12:153-162.

Osborne, J. G. 1942. Sampling errors of systematic and random surveys of
cover-type areas. Jour. Amer. Stat. Assoc. 37:256-264.

Parzen, E. 1961. An approach to time series analysis. Annals of Math.
Stat. 32:951-989.

Quenouille, M. H. 1949. Problems in plane sampling. Annals of Math.
Stat. 20:355-375.

Quenouille, M. H. 1956. Notes on bias in estimation. Biometrika 43:353-360.

Riordan, J. 1958. An Introduction to Combinatorial Analysis. John Wiley
and Sons, Inc., New York.

Simmons, G. F. 1963. Introduction to Topology and Modern Analysis.
McGraw-Hill Book Company, Inc., New York.

Whittaker, E. and G. Robinson. 1944. The Calculus of Observations, 4th Ed.
Blackie and Son, Ltd., London.

Yates, F. 1949. Systematic sampling. Phil. Trans. Roy. Soc. 241(A):345-377.
APPENDIX
Values of B(n): Refer to Section 4.5

1. B(2) =
   [ 1  -1 ]
   [ 0   1 ]

2. B(3) = diag(1/2, -1, 1/2) ·
   [ 2  -3  1 ]
   [ 0  -2  1 ]
   [ 0  -1  1 ]

3. B(4) = diag(-1/6, 1/2, -1/2, 1/6) ·
   [ -6  11  -6  1 ]
   [  0   6  -5  1 ]
   [  0   3  -4  1 ]
   [  0   2  -3  1 ]

   =
   [ 1  -11/6   1    -1/6 ]
   [ 0   3    -5/2    1/2 ]
   [ 0  -3/2   2     -1/2 ]
   [ 0   1/3  -1/2    1/6 ]

4. B(5) = diag(1/24, -1/6, 1/4, -1/6, 1/24) ·
   [ 24  -50  35  -10  1 ]
   [  0  -24  26   -9  1 ]
   [  0  -12  19   -8  1 ]
   [  0   -8  14   -7  1 ]
   [  0   -6  11   -6  1 ]

5. B(6) = diag(-1/120, 1/24, -1/12, 1/12, -1/24, 1/120) ·
   [ -120  274  -225  85  -15  1 ]
   [    0  120  -154  71  -14  1 ]
   [    0   60  -107  59  -13  1 ]
   [    0   40   -78  49  -12  1 ]
   [    0   30   -61  41  -11  1 ]
   [    0   24   -50  35  -10  1 ]

6. B(7) = diag(1/720, -1/120, 1/48, -1/36, 1/48, -1/120, 1/720) · Z(7)

   where

   Z(7) =
   [ 720  -1764  1624  -735  175  -21  1 ]
   [   0   -720  1044  -580  155  -20  1 ]
   [   0   -360   702  -461  137  -19  1 ]
   [   0   -240   508  -372  121  -18  1 ]
   [   0   -180   396  -307  107  -17  1 ]
   [   0   -144   324  -260   95  -16  1 ]
   [   0   -120   274  -225   85  -15  1 ]

7. B(8) = (1/5040) diag(-1, 7, -21, 35, -35, 21, -7, 1) · Z(8)

   where

   Z(8) =
   [ -5040  13068  -13132  6769  -1960  322  -28  1 ]
   [     0   5040   -8028  5104  -1665  295  -27  1 ]
   [     0   2520   -5274  3929  -1420  270  -26  1 ]
   [     0   1680   -3796  3112  -1219  247  -25  1 ]
   [     0   1260   -2952  2545  -1056  226  -24  1 ]
   [     0   1008   -2412  2144   -925  207  -23  1 ]
   [     0    840   -2038  1849   -820  190  -22  1 ]
   [     0    720   -1764  1624   -735  175  -21  1 ]