Simulated Sampling Strategies for Nematodes Distributed
According to a Negative Binomial Model¹
R. McSORLEY²
Abstract: A FORTRAN computer program was developed to simulate nematode soil sampling strategies consisting of various numbers of samples per field, with each sample consisting of various numbers of soil cores. The program assumes that the nematode species involved fit a negative binomial distribution. Required input data are estimates of the mean and k values, the number of samples per field and cores per sample in the strategy to be investigated, and the number of times the simulation is to be replicated. Output consists of simulated values of the relative deviation from the mean and standard error to mean ratio, both averaged over all replications. The program was used to compare 150 simulated sampling strategies for Meloidogyne incognita, involving all combinations of two mean values (2.0 and 10.0 larvae/10 cm³ soil), three k values (1.35, 0.544, and 0.294), five different numbers of samples per field (1, 2, 4, 10, 20), and five different numbers of cores per sample (1, 2, 4, 10, 20). Simulations resulting from different mean values were similar, but best results were obtained with higher k values and 20 cores per sample. Relatively few 20-core samples were needed to obtain average deviations from the mean of 20-25%. Key words: computer simulation, sampling error, spatial distribution, Meloidogyne incognita.
Journal of Nematology 14(4):517-522. 1982.
The accuracy of a sampling plan is critical to the operation of a nematode diagnostic laboratory, yet recommendations on the number of samples and number of soil cores per sample to be collected per field vary greatly (1). Relatively little information is available about how to construct an accurate and efficient sampling plan, but available studies (5,8) suggest that obtaining accurate estimates of field populations requires considerable effort.
A recent study (7) has examined the relative errors involved in estimating soil populations from a single composite soil sample composed of multiple cores from fields of various sizes. In some cases, estimates of field populations within acceptable error limits could not be obtained without collecting very large numbers of cores. However, it may be possible to obtain more accurate estimates by taking several replicated composite samples from a field with a lower number of cores per composite sample.
Goodell and Ferris (5) have developed a method for comparing the relative accuracy of different sampling schemes consisting of different numbers of composite samples and cores per sample. Their method involves searching a large data base of nematode counts from an alfalfa field (4) for simulated nematode counts in various sample and core combinations. The relative accuracy of several sampling strategies was compared by calculating DEV, the deviation of the estimate from the field mean (of the large data base), expressed as a percentage of the field mean. Results were then used to optimize sampling plans to achieve maximum efficiency.
Received for publication 6 April 1982.
¹Florida Agricultural Experiment Stations Journal Series No. 3758.
²University of Florida Agricultural Research and Education Center, 18905 S. W. 280 Street, Homestead, FL 33031.
This paper outlines a FORTRAN computer program developed to extend the work of Goodell and Ferris (5) to the more general situation in which the underlying spatial distribution is the negative binomial. Instead of obtaining simulated counts from a stored data base, this program uses simulation from a negative binomial distribution to obtain counts of nematodes per soil core. Provided that a nematode population fits a negative binomial distribution model, a sampling plan in terms of number of samples and cores per sample can be developed by using the appropriate mean and k values. The relative efficiency of several sampling strategies can be compared in terms of relative deviations from the mean and standard error to mean ratios.
COMPUTER PROGRAM STRUCTURE
Required input variables are the mean (x̄) and k value for the appropriate negative binomial distribution. Using these values, the individual terms of the negative binomial distribution are calculated from generalized formulae developed from Elliott (2). The probability of a zero value (P_0) is calculated from the formula:
$$P_0 = \left(1 + \frac{\bar{x}}{k}\right)^{-k} \qquad (1)$$
while the probability of each successive ith term (P_i) is given by:
$$P_i = \left(\frac{k + i - 1}{i}\right)\left(\frac{\bar{x}}{\bar{x} + k}\right) P_{i-1} \qquad (2)$$
The cumulative probability (CMP_i) is also calculated for each term, and calculations continue until the first 1,000 terms are computed or a cumulative probability of 0.999995 is reached.
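The term-by-term calculation described above can be sketched in Python rather than FORTRAN. This is an illustrative sketch and not the original program: it uses the standard negative binomial recurrence for a distribution with the given mean and k, the function name `negative_binomial_terms` and its parameter names are hypothetical, and P_0 is counted toward the 1,000-term limit, a detail the text does not specify.

```python
def negative_binomial_terms(mean, k, max_terms=1000, cutoff=0.999995):
    """Return (probs, cumulative) for counts 0, 1, 2, ... per core.

    Terms are generated until max_terms have been computed or the
    cumulative probability reaches the cutoff, whichever comes first.
    """
    if mean <= 0 or k <= 0:
        raise ValueError("mean and k must be positive")
    p = (1.0 + mean / k) ** (-k)          # probability of a zero count
    probs = [p]
    cumulative = [p]
    ratio = mean / (mean + k)
    i = 1
    while len(probs) < max_terms and cumulative[-1] < cutoff:
        p *= (k + i - 1) / i * ratio      # P_i obtained from P_(i-1)
        probs.append(p)
        cumulative.append(cumulative[-1] + p)
        i += 1
    return probs, cumulative


# Example with one of the parameter pairs used in the paper.
probs, cumulative = negative_binomial_terms(mean=2.0, k=1.35)
print(len(probs), probs[0], cumulative[-1])
```

Truncating at a cumulative probability of 0.999995 leaves a very small tail unrepresented, so the mean of the truncated table is marginally below the nominal mean.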
Once the terms of the appropriate negative binomial distribution have been calculated, the sampling loops are run for a given sampling strategy. Required input variables for each such problem are the number of samples (IS) and number of cores per sample (IC) to be collected under the given strategy, the number of times the simulated strategy is to be replicated (IR), and a seed for the FORTRAN pseudo-random number generator (S).
A Monte Carlo simulation technique (3) was used for generating simulated values for the nematode counts in each core collected. A pseudo-random number is matched to the corresponding point on the cumulative probability distribution and the corresponding discrete count is used in the simulation.
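Matching a pseudo-random number to the cumulative distribution is the inverse-transform method, and a minimal Python sketch follows. The names `cumulative_probabilities` and `draw_core_count` are hypothetical, Python's `random` module stands in for the FORTRAN generator, and assigning the largest tabulated count to a random number that falls beyond the truncated table is an assumption of this sketch.

```python
import bisect
import random


def cumulative_probabilities(mean, k, max_terms=1000, cutoff=0.999995):
    """Cumulative negative binomial probabilities CMP_0, CMP_1, ..."""
    p = (1.0 + mean / k) ** (-k)
    cumulative = [p]
    i = 1
    while len(cumulative) < max_terms and cumulative[-1] < cutoff:
        p *= (k + i - 1) / i * mean / (mean + k)
        cumulative.append(cumulative[-1] + p)
        i += 1
    return cumulative


def draw_core_count(cumulative, rng):
    """Map one uniform pseudo-random number onto a discrete core count."""
    r = rng.random()
    # Smallest index i whose cumulative probability is >= r. A value of r
    # beyond the truncated table gets the largest tabulated count
    # (assumption of this sketch).
    return min(bisect.bisect_left(cumulative, r), len(cumulative) - 1)


rng = random.Random(12345)  # arbitrary seed chosen for illustration
table = cumulative_probabilities(mean=2.0, k=1.35)
counts = [draw_core_count(table, rng) for _ in range(20)]
print(counts, sum(counts) / len(counts))
```

A binary search is used here for brevity; a linear scan over the terms gives the same counts.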
The interaction of the various sampling loops is illustrated (Fig. 1). Each simulated core count is accumulated until IC is reached, at which time a mean count for the sample is calculated. Means of successive samples are computed in a similar manner until the total number of samples (IS) for the strategy is reached. At this point, a mean count (x̄') for the strategy is computed, and DEV, the relative deviation from the theoretical mean, is calculated, where
$$\mathrm{DEV} = \frac{|\bar{x}' - \bar{x}|}{\bar{x}} \qquad (3)$$
The procedure is repeated for each replication, after which AMDEV, an average value of DEV, is calculated. Standard error to mean ratios are also computed for each replication, and an average value, ASMDEV, is computed for all replications. Thus, for a sampling strategy of four composite samples of 20 cores each, replicated 10 times, simulated values for 800 cores of soil would be generated and included in the computations. A second strategy can be compared with the first by entering a second set of IC, IS, IR, and S values. Output for each problem consists of AMDEV, the average deviation from the field mean, and ASMDEV, the average standard error to mean ratio.
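The nested loops just described (cores within samples, samples within replications) might look like the following in Python. This is a sketch under stated assumptions, not a transcription of the FORTRAN program: the function names are hypothetical, DEV is taken as an absolute relative deviation, the standard error is computed from the IS sample means within a replication, replications whose strategy mean is zero are left out of the ratio average, and ASMDEV is returned as `None` when IS = 1 because a standard error needs at least two samples.

```python
import bisect
import random
import statistics


def cumulative_probabilities(mean, k, max_terms=1000, cutoff=0.999995):
    """Cumulative negative binomial probabilities CMP_0, CMP_1, ..."""
    p = (1.0 + mean / k) ** (-k)
    cumulative = [p]
    i = 1
    while len(cumulative) < max_terms and cumulative[-1] < cutoff:
        p *= (k + i - 1) / i * mean / (mean + k)
        cumulative.append(cumulative[-1] + p)
        i += 1
    return cumulative


def simulate_strategy(mean, k, n_samples, n_cores, n_reps, seed):
    """Return (AMDEV, ASMDEV) for one strategy (IS, IC, IR, S)."""
    rng = random.Random(seed)
    table = cumulative_probabilities(mean, k)
    deviations, ratios = [], []
    for _ in range(n_reps):                      # replication loop (IR)
        sample_means = []
        for _ in range(n_samples):               # sample loop (IS)
            total = 0
            for _ in range(n_cores):             # core loop (IC)
                idx = bisect.bisect_left(table, rng.random())
                total += min(idx, len(table) - 1)
            sample_means.append(total / n_cores)
        strategy_mean = sum(sample_means) / n_samples
        deviations.append(abs(strategy_mean - mean) / mean)   # DEV
        if n_samples > 1 and strategy_mean > 0:
            std_error = statistics.stdev(sample_means) / n_samples ** 0.5
            ratios.append(std_error / strategy_mean)
    amdev = sum(deviations) / len(deviations)
    asmdev = sum(ratios) / len(ratios) if ratios else None
    return amdev, asmdev


# Four composite samples of 20 cores each, replicated 10 times (800 cores).
print(simulate_strategy(mean=2.0, k=1.35, n_samples=4, n_cores=20,
                        n_reps=10, seed=1))
```

Because the estimates are stochastic, repeated runs with different seeds give somewhat different AMDEV and ASMDEV values, and the spread shrinks as the number of replications grows.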
SIMULATION RESULTS
Simulations were run using data from a previous study (7) on counts of Meloidogyne incognita (Kofoid & White) Chitwood larvae. Appropriate k values were 1.35 for fields of ca. 0.5 ha in size, 0.544 for ca. 1.0 ha, and 0.294 for fields of ca. 1.5 ha (7). With each k value, mean counts of 2.0 and 10.0 larvae per 10 cm³ of soil were used. Thus, simulations were performed for a total of six different combinations of x̄ and k values. For each combination, simulations of 25 different sampling strategies were run. These strategies consisted of all combinations of five different numbers of composite samples per scheme (1, 2, 4, 10, 20) and five numbers of cores per composite sample (1, 2, 4, 10, 20). All combinations were replicated 20 times. Results of the simulations are shown for the x̄ = 2 cases for the 20 cores per sample (Fig. 2A) and the 10 cores per sample (Fig. 2B) strategies.
Theoretical curves of the form y = ax^b were fit to the simulated points for each k value. For a given k value, it is apparent that more composite samples of 10 cores/sample (Fig. 2B) are needed to maintain the same relative deviation (AMDEV) than in the corresponding 20 cores/sample case (Fig. 2A). In general, the smaller the field, the greater the k value will be (7). For a given number of cores per sample, fewer samples per field are needed to maintain a given level of error with a larger k value compared to a smaller one.
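A curve of the form y = ax^b can be fit by ordinary least squares on log-transformed values, as in the Python sketch below. The paper does not say which fitting method was used, so the log-log regression is an assumption, the function name `fit_power_curve` is hypothetical, and the example points are generated from an arbitrary illustrative power law rather than taken from the simulation results.

```python
import math


def fit_power_curve(xs, ys):
    """Least-squares fit of log(y) = log(a) + b*log(x); returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Illustrative points only: an exact power law with a = 0.5, b = -0.5,
# evaluated at the numbers of samples used in the paper.
samples = [1, 2, 4, 10, 20]
deviations = [0.5 * x ** -0.5 for x in samples]
a, b = fit_power_curve(samples, deviations)
print(round(a, 3), round(b, 3))
```

With simulated AMDEV values in place of the illustrative points, x would be the number of samples per field and y the corresponding AMDEV for one k value and one number of cores per sample.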
Simulated values for the average relative deviation from the mean (AMDEV) and the average of the standard error to mean ratios (ASMDEV) were similar for a given case; thus, only AMDEV values are shown (Fig. 2).
Fig. 1. Flow chart of program for generating terms of a negative binomial distribution and simulating corresponding nematode counts per core, sample, and replication. IS = number of samples; IC = number of cores; IR = number of replications; P_i = probability of the ith term; CMP_i = cumulative probability including the ith term.
Fig. 2. Relationship between average of percent deviation from field mean (AMDEV) and number of samples per field for mean = 2.0 and various k values. Simulated values indicated by points; curves represent best fit to simulated points. A) 20 cores per sample. B) 10 cores per sample.
In comparing the x̄ = 2 cases (Fig. 2)
to the x̄ = 10 cases (not shown) in the simulation runs, the mean deviations (AMDEV) obtained for the x̄ = 10 case were only slightly lower than those obtained for the x̄ = 2 case. Largest deviations are generally obtained for the smallest mean values. Thus, in developing a sampling plan for a field where no previous estimate of the mean has been made, the most conservative sampling strategy is to assume a very low mean value for the computer simulation. One must also assume that the spatial distribution of the nematode fits a negative binomial model, and an appropriate k value for the nematode species and field size involved should be selected (7). Transformation of mean values depends on the fitted negative binomial distributions. Actual raw counts are normally used in the goodness of fit tests (4,7), and the mean values used here also represent actual counts per 10 cm³ of soil.
DISCUSSION
It is impossible to develop meaningful error estimates for nematological sampling plans without knowledge of the spatial distribution of the nematode species involved. Much additional research into the statistical distribution of plant-parasitic nematodes is needed to determine if there are consistent patterns for a given species, crop, or geographical region. Use of this particular computer program assumes knowledge of the mean and k parameters of the appropriate negative binomial distribution for the nematode to be studied.
The program also assumes a random sampling scheme in the field. Similar results would be anticipated for other patterns, and so the division of the field into strips for sampling is recommended (5). In this study, multiple samples of multiple cores refer to repeated samples from the same field. Dividing a field into portions and taking a separate sample from each portion can give different results and transposes to the single sample/multiple cores case (7). In general, subdividing a field into smaller units will require more sampling effort, but can provide more detailed information if spot treatment of only a portion of the field is feasible. The advantage of computer simulation in developing sampling plans is apparent. For example, to test a strategy of 20 samples of 20 cores per sample, replicated 20 times, count data on 8,000 cores of soil would be needed.
Various statistics have been used to evaluate the magnitude of error terms in sampling studies. Goodell and Ferris (5) and the present study used DEV, as defined by equation 3. This is a particularly useful term when comparing the deviations obtained by two simulated sampling strategies, since it can compare the single sample/multiple core case to the multiple sample/multiple core case. The standard error to mean ratio (E) is widely used in entomology, where it is called relative variation when expressed as a percent (9). Sampling error can also be expressed in terms of percentage confidence limits of the mean, and various formulae are available for computing such terms (2,6,10).
Because it is a measure of precision and therefore requires at least two samples for its computation, the standard error to mean ratio, E, is not as versatile as DEV in simulation studies such as the present study or that of Goodell and Ferris (5). The limitation is that it is difficult to compare the single sample/multiple core case to the multiple sample/multiple core case using E. Nevertheless, E is the more meaningful term, statistically. In the present study, ASMDEV, an estimate of E obtained for multiple sample/multiple core cases, was relatively close to the corresponding values of AMDEV.
For the single sample/multiple core case, E must be calculated over the cores involved in the single sample by the formula used elsewhere (7,10):
$$n = \frac{1}{E^2}\left(\frac{1}{\bar{x}} + \frac{1}{k}\right) \qquad (4)$$
For n = 20 cores per sample, x̄ = 2, and k = 1.35, equation 4 simplifies to E = 0.249. In the simulation of the one sample of 20 cores case for these mean and k values, AMDEV = 0.213, relatively close to the calculated value of E.
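The arithmetic behind E = 0.249 can be checked with a few lines of Python. The sketch assumes equation 4 in the form n = (1/E²)(1/x̄ + 1/k), which reproduces the value quoted in the text; the function names `relative_error` and `cores_needed` are hypothetical.

```python
import math


def relative_error(n_cores, mean, k):
    """Standard error to mean ratio E for one sample of n_cores cores."""
    return math.sqrt((1.0 / mean + 1.0 / k) / n_cores)


def cores_needed(error, mean, k):
    """Smallest whole number of cores giving a ratio no larger than error."""
    return math.ceil((1.0 / mean + 1.0 / k) / error ** 2)


print(round(relative_error(20, 2.0, 1.35), 3))   # 0.249
print(cores_needed(0.25, 2.0, 1.35))             # 20
```

Solving for n instead of E gives the number of cores required for a chosen level of error, which is how the same relationship is used when planning a single composite sample.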
If information on the spatial distribution of a given nematode is available, then the procedures described here and elsewhere (5) can be used to develop sampling schemes. Because these methods involve simulation, error estimates obtained by them are stochastic and therefore a number of replications should be run to assure that the estimates are reasonable. Calculation of actual error terms is possible only for the single sample/multiple core case using equation 4. This calculation can be compared with the simulation results to assure that sufficient replications are being used in the simulation.
Before a sampling program is established, it is apparent that consideration should be given to the amount of error that can be tolerated. The method by which error terms are to be calculated should be understood, since discrepancies or misunderstandings can lead to differences in the numbers of samples to be collected.
LITERATURE CITED
1. Barker, K. R., and C. J. Nusbaum. 1971. Diagnostic and advisory programs. Pp. 231-301 in B. M. Zuckerman, W. F. Mai, and R. A. Rohde, eds. Plant parasitic nematodes. Vol. 1. Morphology, anatomy, taxonomy, and ecology. New York: Academic Press.
2. Elliott, J. M. 1979. Some methods for the statistical analysis of samples of benthic invertebrates. Freshwater Biological Association Scientific Publication No. 25. Windermere, Eng.
3. Giffin, W. C. 1971. Introduction to operations engineering. Homewood, Illinois: Richard D. Irwin, Inc.
4. Goodell, P., and H. Ferris. 1980. Plant parasitic nematode distribution in an alfalfa field. J. Nematol. 12:136-141.
5. Goodell, P. B., and H. Ferris. 1981. Sample optimization for five plant-parasitic nematodes in an alfalfa field. J. Nematol. 13:304-313.
6. Karandinos, M. G. 1976. Optimum sample size and comments on some published formulae. Bull. Entomol. Soc. Amer. 22:417-421.
7. McSorley, R., and J. L. Parrado. 1982. Estimating relative error in nematode numbers from single soil samples composed of multiple cores. J. Nematol. 14:522-529.
8. Proctor, J. R., and C. F. Marks. 1975. The determination of normalizing transformations for nematode count data from soil samples and of efficient sampling schemes. Nematologica 20:395-406.
9. Ruesink, W. G. 1980. Introduction to sampling theory. Pp. 61-78 in M. Kogan and D. C. Herzog, eds. Sampling methods in soybean entomology. New York: Springer-Verlag.
10. Southwood, T. R. E. 1978. Ecological methods with particular reference to the study of insect populations. New York: Halsted Press.
Estimating Relative Error in Nematode Numbers from
Single Soil Samples Composed of Multiple Cores¹
R. McSORLEY AND J. L. PARRADO²
Abstract: Spatial distributions of several species of plant-parasitic nematodes were determined in each of three fallow vegetable fields and in smaller subunits of those fields. Goodness of fit to each of several theoretical distributions was tested by means of a χ² test. Distributions for most species showed good agreement with a negative binomial model. An exception occurred with Criconemella sp., which showed a better fit to the Neyman Type A distribution. For nematodes distributed according to the negative binomial model, the number of cores per composite sample needed to achieve specified relative errors was calculated. For a given nematode species, such as Quinisulcius acutus (Allen) Siddiqi or Meloidogyne incognita (Kofoid & White) Chitwood, the k values for the negative binomial distribution increased as field size decreased, with the result that fewer cores were needed to achieve the same level of precision in a smaller field. Best results were achieved when the single sample was used to estimate populations in fields of 0.25-0.45 ha in size. When using only a single composite sample to estimate mixed populations of the nematodes studied here in a field of that size, approximately 22 cores per composite sample would be needed to estimate all population means within a standard error to mean ratio of 25%. Considerably more cores were needed to maintain a given level of precision in fields of 1.0 ha or greater, and it may be necessary to subdivide larger units (ca. 1.5 ha and up) for accurate sampling. Key words: spatial distribution, negative binomial distribution, Neyman Type A distribution, Criconemella sp., Helicotylenchus dihystera, Meloidogyne incognita, Quinisulcius acutus, Rotylenchulus reniformis.
Journal of Nematology 14(4):522-529. 1982.
The need for accurate sampling plans to estimate soil populations of plant-parasitic nematodes has become apparent with the greater emphasis by diagnostic services on nematode numbers and economic thresholds. Few plans are available for sampling agronomic and vegetable crops for nematodes other than Heterodera spp. Twenty cores of soil for a 1.6-ha field have given adequate results in North Carolina (1), but Proctor and Marks (11) found that precise data on Pratylenchus penetrans (Cobb) Filipjev & Schuurmans-Stekhoven in small plots could not be obtained without considerable effort and would be impractical in most cases. Goodell and Ferris (5) found that different combinations of sample and core numbers were needed to estimate populations of different plant-parasitic nematodes in a 7-ha alfalfa field. In most cases, five hours of collecting and laboratory work were needed to estimate populations within acceptable limits of error.
Received for publication 6 April 1982.
¹Florida Agricultural Experiment Stations Journal Series No. 3760.
²University of Florida, IFAS, Agricultural Research and Education Center, 18905 S. W. 280 Street, Homestead, FL 33031.
Because of the wide variety of crops, nematodes, and nematode distributions that may occur in any one geographical area, it is unlikely that any one sampling plan will suffice in all situations. It is desirable to demonstrate a methodology by which a sampling plan can be developed for a particular situation. The present study examines the feasibility of estimating mean nematode populations from a single composite sample consisting of multiple cores from fallow fields of various sizes. The single sample per field case is considered first because 1) the mathematics of the single sample case are more straightforward than the multiple samples per field case, 2) diagnostic laboratories may be required at times to make diagnoses from a single sample, and 3) it is desirable to demonstrate the smallest field unit that can be accurately