Analyses of Time Series

Determining and increasing the sensitivity of existing environmental
surveillance monitoring networks to detect unanticipated effects that
may occur in the environment in response to the cultivation of
genetically modified crops.
CB0304: Final Report
Executive Summary
Environmental Surveillance Networks (ESNs) provide long term (>20 years) time series of
counts at multiple sites for many taxa in the UK, and estimates of changing abundance of
individual species can act as indicators for changes in biodiversity and ecosystems more
broadly. Such changes have acquired significant policy-making influence. Counts are usually
undertaken by volunteer surveyors, under the leadership of NGOs specialising in the various
groups. Differences in available resources and species’ ecology mean that field protocols vary,
but a common underlying, broad design is a site x year ‘matrix’ of annual counts (or series of
counts within each year) obtained at a large number of sites. This consistency in design has
also led to a degree of convergence in analytical techniques, namely Poisson-based
Generalized Linear Models with multi-level factors representing spatial and temporal variation
between the counts. In addition to the taxa specific ESNs that concentrate on species counts,
there are ESNs that collect data on multiple biophysical measures. For schemes collecting such
varied data, common models and protocols are difficult to devise; however, with a suitably flexible
modelling framework, such as Generalized Linear Mixed Models, comparisons can be made.
Such datasets, often more spatially intensive than temporally intensive, offer an opportunity to
examine patterns of stock and change across multiple metrics. The potential of existing
schemes to assess the ecological impact of changing agricultural regimes or management
practices is therefore considerable.
We investigate the statistical power of some of the most frequently adopted models to detect
changes over time, and/or between sites differing in some respect, using data from ESNs.
Power can depend upon many factors, such as the scale and duration of the survey, the
abundance of the organism and magnitude of its population change (and spatial variation in
these), the influence of stochastic variation in the data available and the inevitable ‘turnover’
rate of sites. We develop a single, simple linear model to estimate the power to detect change
over time inherent in a commonly-used Poisson model as a function of these factors. A similar
model was examined to look specifically at the power to detect a difference in two spatially
explicit regimes, which is more appropriate for ESNs focused on spatial rather than temporal
intensity. The outputs of these models are explored under a range of scenarios. Our examples
here focus on the surveillance of GM crops, and vary the degree of uptake of GM crop varieties
and the type of GM crop; these factors influence the degree to which the agricultural sites in
question are likely to be covered by existing ESNs. We also explore the influence of stochastic
variation, magnitude of change and spatial scale at which the growing of GM crops is reported.
We illustrate how these models can also be used to estimate the number of sites required to
achieve a certain level of power – so providing guidance on how extensions of existing networks
would provide additional power and, in turn, informing judgments as to the cost-effectiveness of
such extensions.
Finally, we explore two alternative approaches. First of all we use the (count) linear model we
have derived above to explore the power of the Breeding Bird Survey (BBS) and the Butterfly
Monitoring Scheme (BMS) to detect change between two classes of site over a number of
years, and a second model to detect spatial change between two regimes within one year (CS).
Secondly we use data from the monitoring of long-term, national scale impacts of management
changes introduced under Environmental Stewardship (ES) on birds monitored by the BBS.
This second approach is an example where a management change (ES) has had an impact on
populations of species of conservation concern. Under the assumption that changes due to GM
crops may be of similar magnitude, we apply the same scenarios to explore how these influence
the power of detecting these changes. Our alternative approaches represent an illustration and
potential validation (through cross-referencing with real data simulations) of the application of
aspects of the generic tool and a proof-of-concept with respect to the feasibility of detection of
real effect sizes.
The conclusion is that the paired design (comparison of GM and conventional fields of each
crop) applied to existing ESN data does provide a means of detecting change within the
agricultural environment. It is also possible to explore the limitations of this approach: if change
is small, indicator species relatively rare (in terms of number of sites at which they are recorded)
or if it is necessary to detect change within a few years, the power of this approach is likely to be
low. However, with well chosen indicator species (widespread and sensitive to the management
of the farmed environment) and an appropriate ESN (with sufficient monitoring sites in arable
land), the probability of detecting change is much higher, especially after five or more years.
EXECUTIVE SUMMARY
BACKGROUND
OBJECTIVES
IDENTIFICATION OF INDICATOR SPECIES, AGRICULTURAL, ECOLOGICAL AND POLICY SCENARIOS TO UNDERLIE SIMULATIONS (OBJ 1)
    BUTTERFLIES AS INDICATORS OF BIODIVERSITY IN AN AGRO-ECOSYSTEM
    COUNTRYSIDE SURVEY INDICATORS
    BBS BIRD INDICATORS
    SCENARIO DEFINITION
ANALYSES OF TIME SERIES
    DERIVATION OF AN EMPIRICAL MODEL FOR POWER CALCULATION IN TIME SERIES MODELS OF ANIMAL COUNTS (OBJS 2 & 3)
EXPLORING THE UTILITY OF EXISTING ESNS USING THE ‘GENERIC EQUATION’ (OBJ 4)
    UK BUTTERFLY MONITORING SCHEME
    BREEDING BIRD SURVEY (BBS)
ANALYSES OF SPATIAL DATA WITHIN YEARS USING COUNTRYSIDE SURVEY
    METHODS
    UNIFIED MODEL
IMPACTS OF ENVIRONMENTAL STEWARDSHIP MEASURES AS A PROXY FOR GM CROP IMPACTS (OBJ 4)
    METHODS
    RESULTS
IMPLICATIONS OF INCREASING POWER THROUGH INCREASING SAMPLE SIZE
    COSTS FOR INCREASED BUTTERFLY SAMPLING UNDER WCBS
    INCREASING THE COUNTRYSIDE SURVEY SAMPLE
    COSTS FOR INCREASED BIRD SAMPLING UNDER BBS
CONCLUSIONS
    THE GENERIC EQUATION FOR COUNT DATA
    ANALYSIS OF SPATIAL DATA WITHIN YEARS
    USING ENVIRONMENTAL STEWARDSHIP MEASURES AS A PROXY FOR GM CROPS
    LIMITATIONS OF GENERAL SURVEILLANCE
REFERENCES
Background
Legislation requires that, before a genetically modified organism (GMO) is marketed commercially
in Europe, it must undergo an environmental risk assessment and, if it is authorized, a
post-market environmental monitoring (PMEM) plan must be put in place. Part of this PMEM is General
Surveillance (GS) to detect any unintended effects of the GMO. Pre-existing Environmental
Surveillance Networks (ESNs) are expected to play a key role in GS.
This has brought to the fore the need for policy makers to understand the sensitivity of existing
ESNs, and therefore their potential for detecting changes in the environment that may be
correlated with the cultivation of GM crops. It has also highlighted the broader applicability of
this approach to detecting unintended impacts of changes in farm management practices more
generally. If greater power to detect change is required, policy makers would need to understand
the degree to which extending existing ESNs would increase their power, the assumptions upon
which any such predicted increase would be based, and the uncertainties in those predictions.
There is also a need to provide clear guidance to applicants for licenses for GM crops as they
prepare their PMEM plans, to enable them to identify which data they will use from ESNs, how
they will analyse these data, and the strengths and limitations of such an analysis.
Power analyses, using Monte Carlo simulation techniques, estimate the power to detect change
of specified magnitude in given circumstances. This practice is relatively straightforward
theoretically, but computationally demanding, and moreover the results are applicable only
under the assumptions made in simulating the artificial data used. These assumptions are
important because the data need to be produced to mimic real survey results from many
hundreds of sites, based on plausible GM crop uptake for each site and plausible
species-specific responses, both of which may vary from case to case. Power can also depend upon
other factors specific to the organization of the survey scheme, such as the scale and duration
of the survey, the abundance of the organism and magnitude of its population change (and
spatial variation in these) and the inevitable ‘turnover’ rate of sites. The “true” values of such
factors in future surveys clearly cannot be predicted accurately in advance. It is therefore
essential either to consider the power across a range of such assumptions rather than at a single
value (the results are often plotted as a ‘power curve’), or to explore existing or previous
survey data to predict (ranges of) plausible values.
We develop here a linear equation to predict the power of detecting change between two
treatments by monitoring an indicator species or environmental metric. We explore how power
is influenced by a range of GM crop uptake scenarios, how existing ESNs may perform in
detecting change under these scenarios, and how effective enhancing these networks would be
in improving the power to detect change.
Objectives
1) To identify policy scenarios, indicator species and a plausible range of ecological and
agricultural scenarios (extent of ‘take-up’) to underlie the simulations upon which a model is
to be based.
2) To perform a large number of analyses of data simulated within the range ascertained
above.
3) To use the simulation results above to produce and test a simple approximation of ‘power’,
in terms of the various factors by which it is determined, in the form of a linear model that
can be used to estimate power in a range of circumstances without the need for additional
extensive simulation work.
4) To illustrate its use via examples from current ESN bird and butterfly data.
5) To prepare a report/manuscript with a view to publication in the scientific literature.
Identification of indicator species, agricultural, ecological and policy
scenarios to underlie simulations (Obj 1)
Butterflies as indicators of biodiversity in an agro-ecosystem
While arable land is in general a species-poor environment for butterflies, there are species
which can be found there in good numbers. There are no ‘specialist’ farmland species
analogous to those in the Farmland Bird Index (see below), but generalist species such as the
Whites, some common Brown species and the Small Tortoiseshell utilise such land, not
because it is a preferred habitat but because they alone are able to survive there, possibly in
rough patches or amongst nettles where hedges have been removed. As power to detect
change depends upon species’ abundance, useful indicator species are those present in some
number, and likely to react most to any change in habitat or land-use. Of these, different species
may therefore function as optimal ‘indicator’ species, according to the nature of agricultural
change: where management results in a loss of rough ground the consequences are most likely
to be quickly detected in species like Small Tortoiseshell whose larvae are dependent upon
nettles. On the other hand, changes in growing practice of, for example, oilseed rape may more
quickly manifest themselves through changes in Large and Small White, whose larvae feed,
amongst other things, on this crop.
For the purposes of this report, the Small Tortoiseshell (Aglais urticae) and the Large White
(Pieris brassicae) were chosen as indicator species, due to their presence and abundance on
agricultural land.
Countryside Survey Indicators
Countryside Survey (CS) records data on many metrics that are potential indicators of
environmental health. In particular, CS data on habitat connectivity and plant richness in the
wider countryside are used in the ‘UK Biodiversity Indicators in your Pocket 2012’. Biodiversity is
a clear indicator of farmland health and, as such, the biodiversity metrics of plant species
richness and nectar plant species richness provide good indicators of the environmental status
of farmland.
Delivery of key ecosystem services provides a suitable criterion by which to measure farmland
health and determine appropriate indicators. As such, cover of arable weeds is one possible
metric as not only do they provide biodiversity (cultural/aesthetic) value in their own right but
they can provide a vital food source for birds and insects. Some common weed species that
meet these criteria and are fairly abundant, and so are potentially good indicators, are listed
in Table 1.
Soils also play a crucial role in delivery of ecosystem services, including carbon sequestration,
water quality and nutrient cycling. Soil chemistry and nutrient status are therefore key metrics to
ascertain the status of any parcel of land. Soils can also determine many above ground
measures and indicators of potential change.
A full list of potential metrics to use as indicators of farmland health is given in Table 1. The
common weeds Cirsium arvense (creeping thistle), Galium aparine (goosegrass) and Poa annua (annual
meadow grass) were chosen as they are considered to be species indicative of farmland
environmental quality and, for the purposes of this report, also have different means and
variances in abundance. We also explored three measures of soil quality (carbon, nitrogen and
pH), and one of water quality. As an indicator of water quality, the Average Score per Taxon
(ASPT), based on Biological Monitoring Working Party (BMWP) scores, was used. This is deemed the
most appropriate and most sensitive single metric for describing water quality.
Table 1: Potential indicators of farmland health. Each metric is marked against two criteria:
‘Related to specific arable fields’ and ‘Measured in CS2007’. The metrics listed are marked √
against both criteria, except soil water-holding capacity and water quality (ASPT), each of
which is marked √ against one criterion and X against the other.
Metric
Total plant species richness
Nectar plant richness
Cover of arable weeds:
    Capsella bursa-pastoris
    Cerastium fontanum
    Cirsium arvense
    Dactylis glomerata
    Elymus repens, Elytrigia repens
    Galium aparine
    Holcus lanatus
    Persicaria maculosa
    Poa annua
    Polygonum aviculare
    Rumex obtusifolius
    Stellaria media
    Taraxacum officinale agg.
    Trifolium repens
    Urtica dioica
Soil invertebrate richness
Ellenberg fertility
Soil chemistry:
    Nitrogen
    Carbon
    Phosphorus
    pH
Soil water-holding capacity
Water quality: Average (Biological Monitoring Working Party) Score per Taxon (ASPT)
BBS bird indicators
Individual bird species use the cropped environment in one or more of three broad ways (i)
nesting in the crop, (ii) feeding in the cropped area in summer and (iii) feeding in the cropped
area in winter (perhaps after the crop has been harvested and/or the field ploughed). The extent
to which each is important to a given species will determine its potential sensitivity to the
changes to the environment likely to be caused by a switch to GM and also, therefore, the value
of its abundance as an indicator of GM crop effects. Note, however, that it would be overly
simplistic to consider how many of the three ways are relevant to a species as an index of its
sensitivity because the latter will be determined by where population-limiting factors are found
and on the size of the biological effect in each case. Some species utilize the cropped
environment in all three ways, but rarely in the same crop type or field. In general, among small
passerines, the evidence suggests that winter foraging is where most species are limited and
they will be most vulnerable, whereas it is factors associated with nesting or summer foraging
that are most relevant to species with precocial young.
The geographical ranges of each species are important for several reasons. First, clearly,
species can only inform about a given crop’s impact on the environment when their distribution
overlaps with that of the crop. Second, as well as this spatial limit, species with restricted ranges
tend not to be found sufficiently frequently on BBS squares for sample sizes to be sufficiently
large to allow statistical detection of relationships with environmental variables (although,
conversely, they may also be the most sensitive to change). Third, species move between
seasons to different extents. Thus, fully migratory species have no relationship with UK
farmland in winter and, in partial migrants (species where a proportion of the UK breeding
population winters overseas), the same is true for some of the population, weakening any
relationships between winter habitat factors and breeding numbers. To a lesser extent, the
same applies to species whose winter populations are augmented by immigrants from further
north, because the UK breeders might conceivably be disproportionately affected or unaffected
by a given habitat change because of factors such as differential habitat selection or relative
social dominance affecting their vulnerability.
Finally, potential indicator species are presented individually in Table 2. Considering the
abundance of individual species is the most sensitive way in which to use the information for
General Surveillance. Data from these species could potentially be combined in many different
ways, such as diversity indices, guild-level abundance or average annual abundance indices (as
in the national Farmland Bird Index). However, while combining data, in an appropriate way, for
species with similar life histories or habitat or food requirements might increase sensitivity, it is
also very likely that some combinations of species will tend to obscure the responses of
particularly vulnerable species because other species do not share the specific factors that
generate the vulnerability. Combining species for surveillance in this context also incorporates
an implicit hypothesis that the factors that the species concerned have in common are those
that define sensitivity and response to GM crops (but see footnote 1).
1 http://www.rivm.nl/dsresource?objectid=rivmp:118436&type=org&disposition=inline
Table 2: Potential indicator species, defined as those that use the cropped areas of arable fields in winter or summer (shown
by ticks). Ticks in parentheses show cases for which in-field habitats are minor for the species, even within arable regions.
Note that some species also have significant proportions of their UK populations in non-arable habitats (e.g. lapwing,
woodpigeon), so to maximize their potential as indicators of the health of the cropped environment, the arable portions of
each species’ population should be extracted from national data sets where possible. Species with “restricted” ranges are
those likely to be too uncommon or range-restricted to be useful indicators, even if their populations are monitored
nationally using the BBS.
Species
Summer
Winter
Breeding range
Wintering range
Feeding
Nesting
Skylark



UK
UK
Dunnock
()
()
()
UK
UK
Yellowhammer


UK
UK
Linnet


UK
Partial/short-distance migrant;
60% winter outside UK
Grey Partridge


UK
UK
Lapwing


UK
UK

UK
UK

UK
UK

UK
UK

UK
Partial/short-distance migrant;
60% winter outside UK

UK
UK

Song Thrush
Reed Bunting
()
()
Chaffinch
Goldfinch
()
Greenfinch
(supplemented
immigrants)
(supplemented
immigrants)
Woodpigeon


UK
UK
Stock Dove


UK
UK
Yellow Wagtail


Restricted
Africa
Corn Bunting


Restricted
Restricted
Whitethroat
()
()
Tree Sparrow
House Sparrow
()

UK
Africa

Restricted
Restricted
()
UK
UK
by
by
Scenario definition
It was decided that the scenarios could be defined fairly simply by a) crop; b) uptake rate; and c)
effect size that General Surveillance would need to detect. Crops were chosen to match those
in the regulatory pipeline which may be grown in the UK, namely potatoes, sugar beet and
maize. Uptake rates were varied between 20% and 80% (20, 40, 60, 80), and effect sizes were
varied between 5% and 50% (5, 10, 25, 50).
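As a minimal illustration, the scenario grid described above can be enumerated directly. The sketch assumes that every crop is crossed with every uptake rate and every effect size, which the text implies but does not state outright; the variable names are ours and purely illustrative.

```python
from itertools import product

# Scenario dimensions as listed in the text.
CROPS = ["potatoes", "sugar beet", "maize"]
UPTAKE_RATES = [0.20, 0.40, 0.60, 0.80]   # proportion of the crop that is GM
EFFECT_SIZES = [0.05, 0.10, 0.25, 0.50]   # proportional change to be detected

# Assumption: a full factorial design, one scenario per combination.
scenarios = [
    {"crop": crop, "uptake": uptake, "effect": effect}
    for crop, uptake, effect in product(CROPS, UPTAKE_RATES, EFFECT_SIZES)
]

print(len(scenarios))  # 3 crops x 4 uptake rates x 4 effect sizes = 48
```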
Analyses of Time Series
Derivation of an empirical model for power calculation in time series models of
animal counts (Objs 2 & 3)
Background
There are now many national wildlife surveys in which the abundances of species in various
taxonomic groups are counted, according to some standardised field protocol, at a number of
different sites. Though the precise detail behind the data collection may vary according to the
ecology of the taxa, we might consider a generic data set as represented by counts Ci,t, taken at
site i in year t of the survey. Traditionally, these have often been modelled as Poisson random
variables with expected value µi,t .
A model for time-series of count data
Freeman and Newson (2008) developed a model in which population growth, defined as the
ratio between two consecutive expected counts at a site, was modelled as a linear function of a
site- and time-specific covariate Pi,t :
(1) $\log\left(\dfrac{\mu_{i,t}}{\mu_{i,t-1}}\right) = R_t + \alpha P_{i,t-1}$
Thus $R_t$ is the rate of growth at a site where $P_{i,t-1} = 0$, and α quantifies the effect of
the covariate upon growth. If $P_{i,t}$ is a simple binary variable (0/1), growth where
$P_{i,t-1} = 1$ is reduced (if α < 0) to a fraction $e^{\alpha}$ of that at a site where
$P_{i,t-1} = 0$; otherwise $e^{\alpha}$ is the effect of a unit change in a continuous variable
$P_{i,t}$.
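The interpretation of α as a multiplicative effect on growth can be checked numerically. This is a small sketch only; the function name `relative_growth` is ours and does not come from Freeman and Newson (2008).

```python
import math


def relative_growth(alpha: float, p: float = 1.0) -> float:
    """Growth at a site with covariate value p, relative to a site with p = 0."""
    return math.exp(alpha * p)


# A binary covariate with alpha = -0.1: growth at a treated site is about
# 90.5% of that at a control site, i.e. an annual reduction of about 9.5%.
print(round(relative_growth(-0.1), 4))  # 0.9048
```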
Freeman and Newson (2008) considered the presence/absence of a competing species as their
covariate of interest, but the model clearly can be applied to a wider range of circumstances.
Here we consider Pi,t, again a binary variable, to distinguish between sites operating one of two
management regimes, and taking the value 1 (‘treated’) or 0 (‘control’).
After some algebra, we have:
(2) $\log(\mu_{i,t}) = \sum_{j=2}^{t} R_j + \alpha \sum_{j=1}^{t-1} P_{i,j} + \log(\mu_{i,1})$
and the model is linear in the unknown parameters $R_j$, α and the $\log(\mu_{i,1})$, which can
therefore easily be estimated using a Generalized Linear Model (Freeman and Newson, 2008).
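The algebra linking equations (1) and (2) is a telescoping sum: adding the yearly log growth terms from year 2 to year t leaves log(μ) in year t minus log(μ) in year 1. The sketch below checks this numerically for one hypothetical site. It labels each growth term by the end year of its interval, as in equation (1), so the t-1 terms accumulated by year t are R_2, ..., R_t; all numerical values are invented for illustration.

```python
import math


def expected_counts_recursive(mu1, R, P, alpha):
    """Apply equation (1) year by year.

    mu1   : expected count in year 1
    R     : dict mapping year t (2..T) to the baseline log growth term R_t
    P     : dict mapping year t (1..T-1) to the covariate value in year t
    alpha : effect of the covariate on log growth
    """
    last_year = len(P) + 1
    mu = {1: mu1}
    for t in range(2, last_year + 1):
        mu[t] = mu[t - 1] * math.exp(R[t] + alpha * P[t - 1])
    return mu


def expected_count_closed_form(mu1, R, P, alpha, t):
    """Equation (2): log(mu_t) is linear in the R terms, alpha and log(mu_1)."""
    growth = sum(R[j] for j in range(2, t + 1))   # the t-1 baseline growth terms
    exposure = sum(P[j] for j in range(1, t))     # number of treated years so far
    return math.exp(growth + alpha * exposure + math.log(mu1))


# Hypothetical five-year series for a site treated from year 3 onwards.
R = {2: 0.02, 3: -0.01, 4: 0.00, 5: 0.03}
P = {1: 0, 2: 0, 3: 1, 4: 1}
mu = expected_counts_recursive(20.0, R, P, alpha=-0.1)
for t in range(1, 6):
    closed = expected_count_closed_form(20.0, R, P, -0.1, t)
    print(t, round(mu[t], 3), round(closed, 3))
```

The two columns printed for each year agree, which is what allows the growth model to be fitted as an ordinary log-linear GLM.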
A power analysis
Our objective is to characterise the power of the model defined in equation (2) to identify a
non-zero value of α (that is, a difference in trend over time between sites operating under the two
regimes). Clearly this power may depend upon any of a number of factors in practice, e.g. the
scale and duration of the survey and the abundance and range of the animals themselves. We
seek here to develop a model to predict power from nine of these key properties, initially as
follows:
(3) $g(\theta) = a_0 + \sum_{i=1}^{9} a_i X_i$
where a0 – a9 are unknown parameters, θ (0<θ<1) is the power (expressed as the probability of
rejecting the hypothesis that α=0) and g is an appropriate link function constraining θ to lie in the
valid range (0,1). This basic model is then readily extended to incorporate interactions between
any of the Xi.
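To make the role of the link function concrete, the sketch below evaluates equation (3) for one set of predictor values. It assumes a logit link for g, which is a common choice for a probability but is not specified in the text, and the coefficients and predictor values are hypothetical placeholders rather than fitted estimates.

```python
import math


def inv_logit(eta: float) -> float:
    """Inverse of the logit link, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def predicted_power(a0, coefs, x):
    """Equation (3) under an assumed logit link: theta = g^-1(a0 + sum a_i X_i)."""
    eta = a0 + sum(a * xi for a, xi in zip(coefs, x))
    return inv_logit(eta)


# Hypothetical coefficients a0, a1..a9 and predictor values X1..X9.
a0 = -4.0
coefs = [2.0, 0.01, 0.5, 0.4, -0.1, -1.0, 0.15, -0.2, -30.0]
x = [-0.02, 100, 0.4, 3.0, 1.0, 0.3, 10, 2.0, -0.05]

power = predicted_power(a0, coefs, x)
print(round(power, 3))
```

Whatever the linear predictor, the inverse link keeps the predicted power inside the valid range.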
An adjustment for lack of fit in the Poisson model
Power calculations could be based upon the assumption, following Freeman and Newson
(2008) that the data are adequately described by a Poisson distribution. In many cases this may
not be so. Model fitting is then generally extended by using the Pearson Chi-squared statistic or
residual deviance to modify standard errors and test statistics, the ‘quasi-Poisson’ option in
many statistical packages. If the fit of a Poisson model is poor, power calculations based upon
such an assumption are likely to be too high. Note therefore that data overdispersed with
respect to a Poisson distribution have been generated for these analyses, and the fit of the
models adjusted accordingly. In summary, our nine predictor variables X1 – X9 are defined as in
Table 3:
Table 3: Definition and range of nine predictor variables
Variable              Range considered   Definition
X1: slope, R          -0.1 to 0.1        The average annual rate of change at an untreated site, Rt (= R, assumed constant here)
X2: Nsites            50 to 200          The number of sites visited (see footnote 2)
X3: treated           0.01 to 0.9        The proportion of sites treated (e.g. the proportion of crops grown that are GM)
X4: abundance_mean    0 to 5             The average log(abundance) at a site in year one of the survey
X5: abundance_var     0.1 to 4           The variance of the log(abundance) measures at each site in year one of the survey; a measure of inter-site variability
X6: missed            0.2 to 0.5         The proportion of survey visits missed
X7: duration          5 to 20            The duration of the survey, in years
X8: q                 1 to 10            Scale parameter: a measure of the excess residual deviance, or overdispersion, in the data
X9: alpha, α          -0.1 to 0          The magnitude of the difference between the two treatments (in this case GM and conventional crops); the range corresponds to annual reductions of up to 9.5%
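The overdispersion adjustment described above, summarised by the scale parameter q (X8), can be sketched as follows. The scale is estimated as the Pearson Chi-squared statistic divided by the residual degrees of freedom, and Poisson standard errors are inflated by its square root. The counts and fitted values are hypothetical, and the function names are ours.

```python
import math


def pearson_scale(counts, fitted, n_params):
    """Pearson Chi-squared statistic divided by the residual degrees of freedom."""
    chi2 = sum((c - m) ** 2 / m for c, m in zip(counts, fitted))
    return chi2 / (len(counts) - n_params)


def adjusted_se(poisson_se, scale):
    """Quasi-Poisson standard error: the Poisson value inflated by sqrt(scale)."""
    return poisson_se * math.sqrt(scale)


# Hypothetical observed counts and fitted Poisson means from a 2-parameter model.
counts = [12, 3, 25, 9, 30, 2]
fitted = [10.0, 5.0, 20.0, 10.0, 25.0, 5.0]

scale = pearson_scale(counts, fitted, n_params=2)
print(round(scale, 4))  # 1.3375
print(round(adjusted_se(0.02, scale), 4))
```

A scale above 1 widens every standard error, so a test of α = 0 that ignores it would reject too often and overstate power.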
To illustrate the simulation method, consider for example a survey of five years’ duration (thus
fixing X7) in which the consequence of adopting some ‘treatment’ at certain random sites is
represented by α = -0.1, that is, growth at a ‘treated’ site is reduced to $e^{-0.1}$ ≈ 90.5% per
year of that at an untreated control (fixing X9). Estimation of θ by Monte Carlo simulation then
proceeds in two stages:
i) Select the dimensions of an artificial survey by generating values of X2, X3 and X6 at
random, from the following uniform distributions: X2~Unif(50,200), X3~Unif(0.1,0.9)
and X6~Unif(0.2,0.5).
ii) Select parameters reflecting the range and demography of an artificial population by
generating values X1, X4 and X5 at random from the following distributions:
X1~Unif(-0.1,0.1), X4~Unif(0,5) and X5~Unif(0.1,4). Note, to put the latter two ranges in
perspective, that two of the commonest species in the UK Butterfly Monitoring
Scheme, Small Tortoiseshell and Large White, in the latest year available had
average log-counts of 3.5 and 2.6 respectively, and variances of 3.5 and 5.7 across
the few arable sites currently in the scheme.
iii) Generate X2 initial expected abundances µi,1, one for each site, by generating a
series of independent and identically distributed variables from a normal distribution:
log(µi,1) ~ N(X4,X5), i=1,2,…,X2.
iv) Generate expected abundances µi,t, t=2,3,4,5 for each site as the survey progresses,
using equation (2).
v) Generate a series of artificial counts Ci,t, where E(Ci,t) = µi,t, and a randomly selected
scale parameter X8 is used to produce overdispersion with respect to a Poisson
distribution. Fit model (2) to the data arising and, adjusting for overdispersion, test
the hypothesis of α=0.
vi) Return to (iii) and repeat, generating 2000 such data sets; count those returning a
significant value of α.
vii) Select new values of X7 and X9 at random from the range considered, return to (i)
and generate new values X1-X6, repeat the entire process to produce a count of
significant results from 2000 replicated simulations based on the new survey
dimensions and demographic parameters.
2 We consider X2 and X3 in the model by replacing them with two transformed variables
representing the numbers of treatment and control sites (i.e. No. sites x proportion treated and
No. sites x (1 - proportion treated) respectively), since power can be expected to increase
monotonically with these.
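The data-generating part of this procedure (steps iii to v, up to but not including the model fit) can be sketched with the Python standard library alone. Several choices here are our assumptions rather than the authors' method: treatment is assigned independently to each site and held fixed over time, missed visits are independent, R is constant, and overdispersion is produced by a gamma-mixed Poisson whose variance is q times its mean. Fitting model (2) and testing α = 0 would need a GLM routine and is omitted.

```python
import math
import random


def poisson(lam, rng):
    """Poisson draw: count arrivals of a rate-lam process in one unit of time."""
    if lam <= 0:
        return 0
    k, t = 0, rng.expovariate(lam)
    while t <= 1.0:
        k += 1
        t += rng.expovariate(lam)
    return k


def overdispersed_count(mu, q, rng):
    """Count with mean mu and variance q * mu; q = 1 gives a plain Poisson."""
    if q > 1.0:
        # Gamma-mixed Poisson: the rate has mean mu and variance mu * (q - 1).
        mu = rng.gammavariate(mu / (q - 1.0), q - 1.0)
    return poisson(mu, rng)


def simulate_survey(n_sites, prop_treated, prop_missed, duration,
                    R, alpha, abundance_mean, abundance_var, q, rng):
    """Return a list of (treated, counts) pairs; a missed visit is None."""
    survey = []
    for _ in range(n_sites):
        treated = 1 if rng.random() < prop_treated else 0
        # Step (iii): initial log expected abundance for this site.
        log_mu = rng.gauss(abundance_mean, math.sqrt(abundance_var))
        counts = []
        for year in range(1, duration + 1):
            if year > 1:
                # Step (iv): equation (1) with constant R and fixed treatment.
                log_mu += R + alpha * treated
            if rng.random() < prop_missed:
                counts.append(None)
            else:
                # Step (v): overdispersed count around the expected value.
                counts.append(overdispersed_count(math.exp(log_mu), q, rng))
        survey.append((treated, counts))
    return survey


rng = random.Random(2024)
survey = simulate_survey(n_sites=100, prop_treated=0.4, prop_missed=0.3,
                         duration=5, R=-0.02, alpha=-0.1,
                         abundance_mean=3.0, abundance_var=1.0, q=2.0, rng=rng)
n_treated = sum(treated for treated, _ in survey)
print(n_treated, "treated sites out of", len(survey))
```

The arrival-counting Poisson sampler is slow for very large means, so a production simulation would use a numerical library instead.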
This entire process was repeated 1000 times, yielding 1000 estimates Si of the number of
significant results based upon 1000 scenarios defined by the values X1i-X6i, i=1,2,… 1000.
Power θ is then estimated by assuming the Si are binomially distributed, Si ~ Binom(2000,θ) with
θ related to X1-X9 via equation (3), and interactions considered to improve the predictive ability
of this model.
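For a single scenario, the binomial assumption gives a direct estimate of power and of its Monte Carlo uncertainty. The sketch below uses a hypothetical count of 1500 significant results out of 2000 replicates; the function name is ours.

```python
import math


def power_estimate(n_significant, n_replicates=2000):
    """Estimated power and its binomial standard error for one scenario."""
    theta = n_significant / n_replicates
    se = math.sqrt(theta * (1.0 - theta) / n_replicates)
    return theta, se


theta_hat, se = power_estimate(1500)
print(theta_hat, round(se, 4))  # 0.75 0.0097
```

With 2000 replicates the standard error is at most about 0.011 (reached when power is 0.5), which indicates the precision of each simulated power value used to fit equation (3).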
The advantage of the method is that, once the parameters in (3) (or an extension of it – see
later) are estimated, they can be used quickly to derive an approximation of the power for any
chosen combination of survey scale / demographic variables X, and the relationship between
power and any of these covariates is easily explored, without the need to repeat the computer-intensive simulations every time.
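One such use is to ask how many sites would be needed to reach a target power. The sketch below searches for the smallest network size that achieves it, using the numbers of treated and control sites as predictors in the manner of footnote 2. The logit link and all three coefficients are hypothetical, chosen only so that power rises with both numbers; they are not estimates from this report.

```python
import math


def inv_logit(eta: float) -> float:
    """Inverse logit, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def power_for_sites(n_sites, prop_treated,
                    a0=-3.0, a_treated=0.03, a_control=0.01):
    """Predicted power from the numbers of treated and control sites.

    The coefficients are hypothetical placeholders, not fitted values.
    """
    n_treated = n_sites * prop_treated
    n_control = n_sites * (1.0 - prop_treated)
    return inv_logit(a0 + a_treated * n_treated + a_control * n_control)


def sites_needed(target, prop_treated, n_max=5000):
    """Smallest number of sites reaching the target power, or None if none does."""
    for n in range(1, n_max + 1):
        if power_for_sites(n, prop_treated) >= target:
            return n
    return None


n_required = sites_needed(0.8, prop_treated=0.2)
print(n_required)  # 314 with these illustrative coefficients
```

The linear search relies on power increasing monotonically with the number of sites, and any answer outside the range of site numbers used in the simulations is an extrapolation.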
Results
A simple, linear combination of the nine candidate predictor variables (Equation 3) proves to be
a rather poor predictor of the observed (simulated) power values (Figure 1). Although cases in
which the simulations produced high power were similarly identified by the model, the model
overestimated power in a substantial proportion of the cases in which the observed value was
low. The same linear model based upon most of the same predictors performed well when
applied to simulations derived from a single, constant value of α. We concluded that a more
complex approach is needed to span a range of values of α.
Figure 1: Estimated values of power from a 9
parameter univariate model (y) versus power
estimated by repeated simulations (x).
The same model performs considerably better when fitted to data over a smaller range of values
for α; two are shown for illustration in Figures 2 & 3, where the model is fitted only to data where
-0.01 < α < 0 in Figure 2 and -0.1 < α < -0.09 in Figure 3.
Figure 2: As Figure 1, but model fitted to cases
with a reduced range for α: -0.01 < α < 0
Note that in Figure 3, where the effect of a treatment is largest, results are concentrated into the
top right hand corner, where power is almost one.
Figure 3: As Figure 1, but model fitted to cases with a
reduced range for α: -0.1 < α < -0.09
Coefficients of this model fitted to the data in ten subsets, obtained by splitting the range of α
into ten equal segments, are shown in Table 4. Variation in these as α is changed is
considerable, and often erratic, implying potential for interactions between these variables and
α.
Table 4: Estimated coefficients a0 – a9 from the fitting of equation (3) to the simulated power estimates for α in ranges of
bin-width 0.01, centred on mid-points indicated in left-hand column.

| α (mid-point) | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7 | a8 | a9 |
|---|---|---|---|---|---|---|---|---|---|---|
| -0.005 | -7.79 | 5.27 | 0.01 | 0.01 | 0.57 | 1.00 | 0.15 | -0.66 | -287.17 | -0.11 |
| -0.015 | -8.61 | 9.26 | 0.01 | 0.00 | 0.95 | 1.28 | 0.23 | -0.95 | -149.40 | -0.25 |
| -0.025 | -7.07 | 3.42 | 0.02 | 0.01 | 1.00 | 1.42 | 0.32 | -3.91 | -24.61 | -0.19 |
| -0.035 | -3.29 | 6.68 | 0.03 | 0.00 | 1.02 | 1.31 | 0.25 | -1.79 | 42.58 | -0.30 |
| -0.045 | -10.06 | 2.43 | 0.04 | 0.01 | 1.06 | 1.20 | 0.28 | -3.35 | -81.18 | -0.14 |
| -0.055 | -13.52 | 3.33 | 0.03 | 0.00 | 1.26 | 1.29 | 0.33 | -4.74 | -164.66 | -0.25 |
| -0.065 | -19.62 | 6.79 | 0.06 | 0.01 | 1.73 | 1.65 | 0.34 | -3.03 | -187.10 | -0.37 |
| -0.075 | -5.82 | -0.38 | 0.04 | 0.00 | 0.92 | 0.70 | 0.30 | 0.34 | -11.26 | -0.18 |
| -0.085 | -6.09 | 2.64 | 0.05 | 0.02 | 1.50 | 1.75 | 0.31 | -0.35 | 18.38 | -0.24 |
| -0.095 | -10.05 | 9.33 | 0.05 | 0.01 | 1.27 | 1.30 | 0.33 | -1.26 | -45.95 | -0.19 |
We therefore next extended the model by adding first-order interactions between α and each of
the other predictors, effectively forcing the combined coefficients of the latter to vary (smoothly)
with α. Though a formal test shows the fit to be greatly improved (AIC arising from Figure 4
being 225,252 as opposed to 311,526 for Figure 1), the resulting improvement in the model's
capacity to predict the power estimated from the simulations is modest (Figure 4).
Figure 4: Estimated values of power under a
model with interactions between α and all
other predictor variables (y) versus those
estimated by repeated simulations (x).
We then examined models with the addition of an extra interaction between each of the (7 × 8)
/2 =28 possible pairs of predictors, other than α, computed the AIC of each and identified the
pair responsible for the greatest improvement in formal fit, over that of Figure 4. By some
distance, this was the interaction between the number of treated sites and the number of control
sites, which reduced AIC to 195,502. Extreme outliers are now somewhat reduced, the region
bounded by the upper and lower lines in Figure 5 illustrating the range in which the observed
and fitted values differ by less than 0.2.
Figure 5: Estimated values of power under a
model with interactions between (i) α and all
other predictor variables and (ii) the numbers
of sites treated and not treated (y), versus
those estimated by repeated simulations (x).
We thus added this interaction to the model, and then considered adding quadratic terms in
each of the variables – much the greatest improvement (AIC=176,041) came about as a
consequence of adding the quadratic form of α (Figure 6). Finally, with this quadratic term also
added we also considered adopting a non-symmetric complementary log-log link function, rather
than the logit used to date, but the logit remained a better fit. The number of outliers, however,
though much reduced, is not negligible (Figure 6).
Figure 6: Estimated values of power under a
model with interactions between (i) α and all
other predictor variables, (ii) the numbers of
sites treated and not treated and (iii) a
quadratic term for α (y), versus those
estimated by repeated simulations (x).
The same set of parameters performs much better estimated from small values of α (α > -0.02;
Figure 7), but remains prone to occasional erratic predictions of cases in which power is low
when fitted only to data in which α is greater than this (Figure 8). We restrict further
consideration therefore to predictions based upon this model with coefficients estimated from
(and hence applicable to) the subset of data with α > -0.02, that is, considering reductions of
more than 2% per annum in ‘treated’ sites, accumulating to, say, reductions of at least 17% over
a ten-year survey. Parameter values are given in Table 5.
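The link between an annual effect and its accumulated size can be checked with a short calculation. The sketch below assumes that growth at a treated site is scaled by exp(α) each year relative to an untreated site, and that a ten-year survey spans nine annual steps; both the function name and the nine-step convention are assumptions made for illustration.

```python
import math


def cumulative_reduction(alpha, survey_years):
    """Proportional reduction at treated relative to untreated sites by the final year."""
    intervals = survey_years - 1  # assumed: a ten-year survey spans nine annual steps
    return 1.0 - math.exp(alpha * intervals)


# An annual effect of alpha = -0.02 over a ten-year survey.
print(round(cumulative_reduction(-0.02, 10), 3))  # prints 0.165
```

Under these assumptions an annual effect of α = -0.02 accumulates to a reduction of about 16.5%, which is in line with the figure of roughly 17% quoted above.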
Figure 7: Expected and observed (simulated)
values of power using the model of Figure 6,
but fitted only to data where α < -0.02.
Figure 8: Expected and observed (simulated)
values of power using the model of Figure 6, but
fitted only to data where α > -0.02.
Table 5: Estimated coefficients under the final model, for α > -0.02, with various interactions.

| Variable(s) | Coefficient |
|---|---|
| Intercept | -7.65 |
| Slope | 7.28 |
| Nsites * treated | -0.0137 |
| Nsites * (1-treated) | -0.0123 |
| Abundance_mean | 0.527 |
| Abundance_var | 0.701 |
| Duration | 0.226 |
| Missed | -0.464 |
| α | -262.2 |
| Scale | -0.0785 |
| α² | -12650 |
| α * Slope | 24.65 |
| α * Nsites * treated | -0.761 |
| α * Nsites * (1-treated) | -0.125 |
| α * Abundance_mean | -30.85 |
| α * Abundance_var | -43.73 |
| α * Duration | -3.072 |
| α * Missed | 121.2 |
| α * Scale | 4.683 |
| Nsites² * treated * (1-treated) | 0.00042 |
An additional, cross-validation exercise (Figure 9) was also carried out as a more robust test of
the model’s predictive power. In this case, the model was fitted to one set of artificial data from
500 sets of simulations and then used to predict the power matched to 200 sets of entirely
separate simulations. That is, the predictions are based upon a model fitted beforehand to
separate data, so the coefficients are entirely independent of the values used subsequently in
the second set. While the match is predictably less close than those in previous figures, due to
this removal of the dependency, the model is largely adequate for identifying scenarios with low
or high associated power.
Figure 9: A cross-validation exercise. The model
of Figure 6 was fitted to the values of power
estimated from 500 sets of simulated data, and
used to predict the power from 200 further
additional, independent, observations. The
independent estimates of power from the latter
are plotted against their values as predicted
from the model fitted to the former. Model
coefficients are given in Table 5.
Using the generic equation to illustrate dependence of power on key variables
The power curves arising can be represented in an almost unlimited variety of forms, given the
large number of factors determining their shape. In Figure 10a, we assume a scenario in which
the slope at a control site is zero (i.e., the population is stable), and both the mean and variance
of the initial log-abundances are 2.0, with the ensuing ratio of variance/expected counts
(‘overdispersion’) = 5.0. A random 40% of scheduled visits are assumed missed over a survey
of ten years duration; the true effect α is taken as -0.01, that is the growth at a treated site is
given by exp(α) = 99% compared to that at an untreated site. To produce the curves, various
levels are assumed for the proportion of sites treated (ranging here from 0.2 to 0.5, in
increments of 0.1, bottom to top) and the total number of sites surveyed, presented on the x-axis. As would be predicted, power increases with the number of sites surveyed, and as the
proportion of these treated rises towards equality with the control sites. It also increases with the
duration of the survey (Figure 10b, where the survey is extended to fifteen years and all
else is as in Figure 10a), and decreases with the proportion of visits missed (Figure 10c, which differs
from 10a only in that this proportion is reduced from 40% to 20%). Power appears largely robust
to realistic variation in the extent of the decline at the control sites; Figure 10d differs from
Figure 10a in that this decline is taken as Rt = -0.05, that is a decline of almost 5% per year.
a: 40% of visits over 10 years missed at random. Value of slope at ‘control’ sites = 0
b: 40% of visits over 15 years missed at random. Value of slope at ‘control’ sites = 0
c: 20% of visits over 10 years missed at random. Value of slope at ‘control’ sites = 0
d: 40% of visits over 10 years missed at random. Value of slope at ‘control’ sites = -0.05
Figure 10: Estimated power as a function of sample size. Alpha = -0.01. Mean and variance of initial log-abundances
= 2 and the overdispersion parameter = 5. Power curves, top to bottom, represent 50%, 40%, 30% and 20% of all
sites treated. Other parameter values vary as stated in individual legends.
An alternative visualization is shown in Figure 11, where the number of sites is fixed at 100
throughout and the x-axis represents variation in the initial abundance. All other parameters are
fixed to the same values, respectively, as the analogous plots in Figure 10; that is, the
proportion of sites treated ranges from 0.2 to 0.5, bottom to top; 11b runs to 15 years duration, 11c
has only 20% of visits missed and 11d has a decreasing trend at ‘control’ sites. The predictable
increase in power with abundance is clear, and the conclusions drawn previously from Figure 10 are
reinforced: power increases with survey duration, decreases with visits missed and is relatively
little affected by modest changes in the underlying temporal trend.
a: 40% of visits to 100 sites over 10 years missed at random. Value of slope at ‘control’ sites = 0
b: 40% of visits to 100 sites over 15 years missed at random. Value of slope at ‘control’ sites = 0
c: 20% of visits to 100 sites over 10 years missed at random. Value of slope at ‘control’ sites = 0
d: 40% of visits to 100 sites over 10 years missed at random. Value of slope at the control sites = -0.05.
Figure 11: Estimated power as a function of initial abundance. Variance of initial log-abundances = 2 and the
overdispersion parameter = 5. Power curves, top to bottom, represent 50%, 40%, 30% and 20% of all sites treated.
Other parameter values vary as stated in the legend.
Needless to say, power also increases with the scale of the effect sought, α. Figure 12 provides
curves derived from parameters identical to those of 11(a,b), but with α = -0.02 rather than -0.01.
Figure 12: Power curves analogous to those of Figure 10 a & b, but with α reduced to -0.02.
Finally, the duration of the survey will also influence the power to detect change. Simulations
covered a range of 5 to 20 years; as Figure 13 illustrates, with time series at the lower end of
this range, given the other parameter values as stated in the legend, power is greatly reduced,
particularly for less abundant species when the number of sites is limited (Fig. 13 a & b).
Figure 13: Estimated power as a function of the duration of the study for species of Log (mean abundance) of a) 1; b) 2; c)
3; and d) 4. Other parameter values are R = 0; variance = 1; over dispersion = 5; sites = 50; alpha = -0.01; 40% visits
missed at random.
Exploring the utility of existing ESNs using the ‘Generic Equation’. (Obj 4)
We used data from two ESNs which collect counts of species abundance over time to explore
where individual species from each scheme lie on the power graphs generated by the generic
equation.
UK Butterfly Monitoring Scheme
We used data from the UK butterfly monitoring scheme to estimate the abundances at each site
over two visits (the scheme protocol) and the level of overdispersion under the standard model
of Freeman and Newson (2008) for two species, the Large White and the Small Tortoiseshell.
These we used to replace the arbitrary values in Figure 10a, thus producing approximate power
curves matched specifically to data that might be expected to accrue for these species during a
similar scheme. Results are shown in Figure 14, and show slightly higher power for the Large
White.
In reality, the number of UK BMS sites that cross arable land in which these two species are
recorded is ~30, so we would anticipate that the best estimate of the probability of detecting
change under these parameters would lie at the lower end.
Figure 14: A set of power curves matched to data observed in the UK Butterfly Monitoring Scheme for a) the Large
White with Log (mean abundance) = 2.61; Variance = 5.73 and b) Small Tortoiseshell with log (mean abundance) =
3.5; variance = 3.5. Other parameter values are: 40% of visits over 10 years missed at random; alpha = -0.01 (i.e. 1%
reduction p.a.); over dispersion = 10; power curves, top to bottom, represent 50%, 40%, 30% and 20% of all sites
treated.
Breeding Bird Survey (BBS)
Data from the breeding bird survey was used to parameterize the generic equation for three
species chosen from Table 2 (Linnet, Reed Bunting and Yellowhammer). As with the previous
example, species and ESN specific parameters determine the predicted power curve in each
case. However, for practical reasons, simulations covered a range of parameter values up to
200 sites, yet the BBS has up to 800 sites for some species. As extrapolation cannot be justified
beyond the simulated range, a conservative value of 200 sites was chosen for each of these
three sets of power curves, with the other parameters as specified in Table 6. These parameters
are derived from real survey data from BBS squares in south-east England (100km grid squares
TA, TF, TG, TL, TM, TQ, TR and TV) from 2002-10.
Table 6: Parameter values representing three bird species from the BBS for the generic power equation.

| Species | Growth at untreated site (R) | Mean initial abundance | Variance of initial site abundance | Proportion of visits missed | Scale parameter | Survey duration (years) | Real total sites | Sites assumed for simulation |
|---|---|---|---|---|---|---|---|---|
| Yellowhammer | -0.0204 | 0.6285 | 0.0022 | 0.3585 | 1.357 | 9 | 690 | 200 |
| Linnet | -0.0407 | 0.6482 | 0.0020 | 0.3319 | 4.110 | 9 | 704 | 200 |
| Reed Bunting | 0.0083 | -0.3806 | 0.0046 | 0.3292 | 1.322 | 9 | 377 | 200 |
[Figure 15 panels: (a) Linnet, (b) Reed Bunting, (c) Yellowhammer. Each panel plots Power (y-axis) against Proportion of sites treated (x-axis), with one curve for each of alpha = 0, -0.004, -0.008, -0.012, -0.016 and -0.02.]
Figure 15: A set of power curves matched to data in the breeding bird survey for a) Linnet; b) Reed Bunting; c)
Yellowhammer. Other parameter values are as in Table 6.
The predictions shown are derived from real survey data from Breeding Bird Survey squares,
but assuming a total sample of 200 squares. In this area, the real totals of survey squares in
which these species are found range from 377 to 704, so a total of 200 can be considered to
reflect “treatment” over a smaller spatial scale. Power would be expected to increase with larger
areas being treated. For all three species, the power to detect differences >1.6% p.a. is greater
than 0.8 after a nine year period.
Analyses of spatial data within years using Countryside Survey
It is necessary to adopt a different approach for The Countryside Survey (CS), as this scheme
undertakes an intensive sample once every 7 years (approximately). The approach uses CS
plot level data located within, or adjacent to, fields of the pre-specified crop of interest. Plots
located within fields of maize and potatoes were assessed under the ‘generic uptake and
change scenarios’. There were insufficient plots recorded within fields of sugar beet for this
ESN, so sugar beet scenarios were not included in the analysis. The analysis focused on the
spatial differences observed between plots located within GM uptake and plots located outside
GM uptake, rather than changes over time because of the long period between successive
surveys. Analysis of the Countryside Survey therefore plays to the strengths the survey has in
terms of spatial representation, the number of variables collected at the plot level and the direct
association between the variable and the particular crop of interest. The power analysis
investigated was therefore purely a spatial one, where the difference between GM and non-GM
plots was examined within the same survey – CS2007.
The analysis was performed at two different scales for which data on GM uptake may be
available: field level and 1km square level. This allowed us to model the differences we see in
power between the two levels of information and the direct influence scale of information has on
any analysis or attribution.
Three categories of data were considered: weed abundance (% cover in quadrats of a number
of common weed species), soil properties (soil Carbon, soil Nitrogen, and soil pH) and water
quality (associated with arable areas).
Methods
Under the different uptake and change scenarios, a proportion, equal to the uptake scenario, of
CS plots containing the crop of interest were randomly selected and the observed response
associated with these plots changed by a factor equal to the change scenarios. The resulting
data was then modelled and the significance of the indicator term corresponding to GM areas
was stored. This was then repeated 1000 times to obtain a percentage of times we observe a
significant effect of the GM indicator term.
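The resampling procedure just described can be sketched as follows. This is an illustration under stated assumptions rather than the analysis actually run: a two-sample comparison of log responses with a normal approximation stands in for the fitted model, the change scenario is applied as a multiplicative factor (1 + change), and all names and the example data are hypothetical.

```python
import math
import random


def _mean_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def two_sample_p_value(a, b):
    """Two-sided p-value for a difference in means (Welch statistic, normal approximation)."""
    mean_a, var_a = _mean_var(a)
    mean_b, var_b = _mean_var(b)
    z = (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulated_power(responses, uptake, change, n_sims=1000, level=0.05, seed=0):
    """Share of simulations in which the GM indicator is judged significant."""
    rng = random.Random(seed)
    n_gm = round(uptake * len(responses))
    hits = 0
    for _ in range(n_sims):
        # Randomly assign a proportion of plots, equal to the uptake scenario, to GM.
        gm = set(rng.sample(range(len(responses)), n_gm))
        # Apply the change scenario to GM plots only, then compare on the log scale.
        treated = [math.log(y * (1.0 + change)) for i, y in enumerate(responses) if i in gm]
        control = [math.log(y) for i, y in enumerate(responses) if i not in gm]
        if two_sample_p_value(treated, control) < level:
            hits += 1
    return hits / n_sims


# Hypothetical positive, skewed responses standing in for plot-level data.
rng = random.Random(1)
cover = [rng.gammavariate(2.0, 5.0) for _ in range(100)]
power = simulated_power(cover, uptake=0.4, change=-0.25, n_sims=200, seed=2)
```

Replacing `two_sample_p_value` with the test from a fitted mixed model would bring the sketch closer to the approach described, at the cost of an external modelling library.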
Weed Abundance
The percentage ground cover within a quadrat covered by each of these species was collected
in the CS vegetation plots randomly located within the 1km squares and vegetation plots at
arable field margins. The cover data was therefore directly associated with the specific crops of
interest – maize and potato (sample size for sugar beet fields was insufficient).
Power to detect spatial differences in cover within a survey year between GM uptake plots and
non-GM plots was examined using a generalised linear mixed model (GLMM) with a gamma
error distribution. Square level random effects were incorporated and the significance of the
term representing treatment (GM or conventional) was stored. The proportion of significant
results over 1000 simulations provided the statistical power. The statistical model used in the
analysis was a log-linear model with Gamma error distribution (as cover data was a positive,
continuous, skewed metric) and a random effect accounting for the differing levels of variation
between and within CS squares. The model is given by:
$$\ln(P_{C,i,j}) = \mu_C + \alpha G_{C,i,j} + V_{C,i} + \varepsilon_{C,i,j}$$
where $P_{C,i,j}$ is the percentage cover of the species in question in plot $j$ within square $i$ containing
crop $C$, $\mu_C$ is the mean percentage cover for plots containing crop $C$, $G_{C,i,j}$ is an indicator
variable taking the value 1 if plot $j$ containing crop $C$ in square $i$ is in GM and 0 otherwise, $\alpha$ is the
effect that GM has on percentage cover, $V_{C,i}$ is a random effect for the $i$th square for crop $C$ and
$\varepsilon_{C,i,j}$ is a random effect for the $j$th plot in square $i$ containing crop $C$. Note that no temporal
component is included here as we are not modelling change, merely a difference between two
treatments – GM and non-GM.
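Because the model is linear on the log scale, the coefficient α acts multiplicatively on the response. As an illustration only, and assuming that a change scenario is expressed as a proportional reduction c in the response of GM plots (the direction of the change is an assumption made here), the corresponding value of α follows directly:

```latex
\frac{E[P \mid G = 1]}{E[P \mid G = 0]} = e^{\alpha} = 1 - c
\quad\Longrightarrow\quad
\alpha = \ln(1 - c)

% Worked values for the four effect sizes considered:
% c = 0.05 -> alpha = ln(0.95) \approx -0.051
% c = 0.10 -> alpha = ln(0.90) \approx -0.105
% c = 0.25 -> alpha = ln(0.75) \approx -0.288
% c = 0.50 -> alpha = ln(0.50) \approx -0.693
```

The ratio is taken with the random effects held fixed, so it describes the contrast between a GM and a non-GM plot within the same square.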
The results of the power analysis to detect the effect of a difference in common weed species
abundance between GM and non-GM plots are shown in Figure 16.
Figure 16: Power to detect changes in common weed species cover in (a) Maize plots and (b) Potato plots. Power is
shown as a function of GM uptake and Effect size with 5%, 10%, 25% and 50% effects represented by open circles,
filled circles, open squares and filled squares respectively.
Soil Properties
CS takes a 15cm core from the topsoil located within the randomly distributed vegetation plots.
Multiple soil measures including chemistry and biology are obtained from this core. Although
CS soil measurements are made at the within-field plot level, only single cores are taken and
hence there is a much smaller sample size and typically much greater spatial variability than for
the botanical data. This data was used to assess the power to detect changes in soil Carbon,
Nitrogen and pH between the GM and non-GM areas under the uptake and change scenarios
proposed.
Similar to the species cover analysis, a generalised linear mixed model (GLMM) was used with
a gamma error distribution and a log link function. Square level random effects were
incorporated and the significance of the term indicating whether or not a plot was within GM uptake was stored. The
proportion of significant results over 1000 simulations provided the statistical power. The model
is given by:
$$\ln(S_{C,i,j}) = \mu_C + \alpha G_{C,i,j} + V_{C,i} + \varepsilon_{C,i,j}$$
where $S_{C,i,j}$ is the particular soil property in question (Carbon, Nitrogen or pH) in plot $j$ within
square $i$ containing crop $C$, and additional terms are as defined for the species cover analysis.
Results of the proportion of tests where a significant effect of GM uptake was detected for each
of the soil properties are shown in Figure 17.
Figure 17: Power to detect changes in soil chemistry in (a) Maize plots and (b) Potato plots. Power is shown as a
function of GM uptake and Effect size with 5%, 10%, 25% and 50% effects represented by open circles, filled circles,
open squares and filled squares respectively.
Square Level Spatial Analysis
The previous two analyses have used CS data at the plot level and the corresponding indicator
of GM uptake for each particular plot. Hence we have assumed that GM uptake information
would be available at the field level. However, as field level information on GM cultivation may
not be available, we decided to repeat the spatial analysis of common weed species abundance
and soil properties but at a square rather than plot level under the assumption that information
on GM uptake was known only at 1 km square resolution as opposed to field level resolution.
The statistical model is largely unchanged, except the modelled data now correspond to mean
species cover and mean soil carbon/nitrogen/pH in square $k$ containing crop $C$, represented by
$M_{C,k}$. Specifically, the model is given by
$$\ln(M_{C,k}) = \mu_C + \alpha G_{C,k} + \varepsilon_{C,k}$$
and, as we are modelling square level means, no square level random effect is needed and we
use a gamma error distribution as before. Note that $G_{C,k}$ represents the information we have on
GM uptake, which is either an indicator variable taking the value 1 if square $k$ contains any GM
occurrence or the number of plots within square $k$ with GM uptake. In the
simulations all change scenarios are implemented at the plot level before being aggregated
to square level for analysis.
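The aggregation step can be illustrated with a short sketch. It assumes plot-level records arrive as (square, response, GM flag) tuples after the change scenario has been applied, and it returns both forms of the square-level GM information mentioned above; the record layout and names are hypothetical.

```python
from collections import defaultdict


def aggregate_to_squares(plots):
    """Summarise plot-level (square_id, response, is_gm) records by square."""
    grouped = defaultdict(list)
    for square, response, is_gm in plots:
        grouped[square].append((response, is_gm))

    summary = {}
    for square, rows in grouped.items():
        n_gm = sum(1 for _, is_gm in rows if is_gm)
        summary[square] = {
            "mean_response": sum(r for r, _ in rows) / len(rows),
            "gm_indicator": 1 if n_gm > 0 else 0,  # any GM occurrence in the square
            "gm_plot_count": n_gm,                 # number of plots with GM uptake
        }
    return summary


# Hypothetical plot-level records for two squares.
records = [("A", 10.0, True), ("A", 20.0, False), ("B", 5.0, False)]
squares = aggregate_to_squares(records)
```

Either `gm_indicator` or `gm_plot_count` could then serve as the square-level covariate in the model.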
As this analysis uses the same plots as the results in the previous analyses, the two can be
directly compared to see the effect on power the resolution of available GM information has.
Plots in Figures 18-20 show the effect that this has had on our power to detect change.
Figure 18: Power to detect change in soil carbon against effect size for different uptake scenarios when GM
information is available at 1km square (dashed lines) and field (solid lines).
[Plot: Soil Carbon. % Power (y-axis) against % Change (x-axis, 0 to 50), with curves for 20%, 40%, 60% and 80% uptake.]
Figure 19: Power to detect changes in soil nitrogen against effect size for different uptake scenarios when
GM information is available at 1km square level (dashed lines) and field level (solid lines).
[Plot: Soil Nitrogen. % Power (y-axis) against % Change (x-axis, 0 to 50), with curves for 20%, 40%, 60% and 80% uptake.]
Figure 20: Power to detect changes in soil pH against effect size for different uptake scenarios when GM
information is available at 1km square level (dashed lines) and field level (solid lines).
[Plot: Soil pH. % Power (y-axis) against % Change (x-axis, 0 to 50), with curves for 20%, 40%, 60% and 80% uptake.]
Water Quality
Unlike the botanical plot level data and the soil data, the CS water samples are not specific to
crop fields. We therefore investigated what power the current CS sampling scheme has to
detect changes in water quality generally, under different effect scenarios deemed suitable for
that dataset. In CS, freshwater measures include the sampling of a single headwater stream site
per 1km survey square at which a biological sample of the stream macroinvertebrates is taken
(measured as ASPT – see earlier). Although there is no direct link between the field and the
freshwater data we have, we still have detailed habitat mapping of each 1km square and
therefore we limited our analyses only to freshwater samples taken from arable dominated
squares. There are currently 34 such squares sampled in CS for which we have freshwater data
available.
A generalised linear model (GLM) was used with a gamma error distribution. No square level
random effects were needed here because the indicator used was a square level metric. The
significance of the term indicating whether or not a square was within GM uptake was stored. The proportion of significant
results over 1000 simulations provided the statistical power. Specifically, the model is given by
$$\ln(W_k) = \mu + \alpha G_k + \varepsilon_k$$
where $W_k$ represents the ASPT for square $k$, $G_k$ is an indicator variable taking the value 1 if
square $k$ is in GM and 0 otherwise, $\alpha$ is the effect that GM has on water quality (ASPT) and $\varepsilon_k$ is
the random error for the $k$th square. Resulting power estimates for the different uptake and
change scenarios are shown in Table 7.
Table 7: Power to detect changes in water quality as measured by Countryside Survey. Power is shown as a percent under
different change effects and uptake scenarios.

| Uptake % | Change 5% | Change 10% | Change 25% | Change 50% |
|---|---|---|---|---|
| 20 | 11 | 43 | 99 | 99 |
| 40 | 17 | 71 | 99 | 99 |
| 60 | 27 | 65 | 99 | 99 |
| 80 | 18 | 51 | 99 | 99 |
Increasing the sample
For all analyses, the effect on power of increased sample size was investigated by repeating the
analysis with larger pseudo data sets. The new data sets were derived from the raw data by
repeatedly resampling plots with replacement until a sample size of the necessary number of
plots was achieved. Each pseudo sample set was analysed according to the same prescription
as set out above for the species abundance, soil properties and water quality data to obtain an
estimate of power. For each sample size the average power over 1000 pseudo sample sets was
taken. This provides an indication of how larger sample sizes can increase the power to detect
effects and also may provide an optimum efficiency sample size. Figures 21-23 clearly show
the increase in power obtained by increasing the sample size.
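The resampling scheme described above can be sketched as follows. The power calculation itself is passed in as a function, since it depends on the data type being analysed; that callable, like the other names here, is hypothetical, and the sketch is an illustration rather than the code used in the analysis.

```python
import random


def pseudo_sample(plots, target_size, rng):
    """Resample plots with replacement until the target sample size is reached."""
    return rng.choices(plots, k=target_size)


def mean_power(plots, target_size, power_fn, n_sets=1000, seed=0):
    """Average the power estimate over many pseudo sample sets of a given size."""
    rng = random.Random(seed)
    estimates = [power_fn(pseudo_sample(plots, target_size, rng)) for _ in range(n_sets)]
    return sum(estimates) / n_sets


# Hypothetical usage: 34 observed units inflated to a pseudo sample of 100.
observed = list(range(34))
bigger = pseudo_sample(observed, 100, random.Random(3))
```

Because the pseudo samples reuse the observed plots, the resulting power estimates reflect the variability seen in the current sample and should be read as indicative only.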
Figure 21: Power to detect changes in cover of Cirsium arvense in potato plots under different change and uptake scenarios
against sample size. Current CS sample size is at the lower end of the scale.
Figure 22: Power to detect changes in cover of Galium aparine in maize plots under different change and uptake scenarios
against sample size. Current CS sample size is at the lower end of the scale.
Figure 23: Power to detect changes in water quality under different change and uptake scenarios against sample size.
Current CS sample size (34 squares with arable dominated catchments) is at the lower end of the scale
Unified Model
Within the CS analysis we have investigated how sample size, uptake, effect size and level of
GM information all affect our power to detect and attribute change resulting from GM uptake.
However, we have shown this individually for different metrics and ideally we want to be able to
conclude what overall effect on power these features have. For example, we may want to know
what the average effect is of having GM information available only at square level rather than
field level. We therefore pooled together all the power analyses conducted above into a unified
model in order to see the average changes induced by certain factors and what overall effect
they have on power. We brought all the data from the analyses into a single unified logistic
model with gamma error distribution defined by
$$\ln(P_i) = \mu + \alpha_1 E_i + \alpha_2 N_i + \alpha_3 U_i + \alpha_4 S_i + \alpha_5 V_i + \varepsilon_i$$
where $P_i$ is the power estimate, $E_i$ is the effect size (either 5%, 10%, 25% or 50%), $N_i$ is the
sample size, $U_i$ is the uptake (either 20%, 40%, 60% or 80%), $S_i$ is the level of GM information
(either square level or field level), $V_i$ is the variation of the metric in question and $\varepsilon_i$ is the
associated error.
This is possible as all analyses conducted have used the same gamma based error distribution.
Variation in metric is slightly complicated by the inclusion of the square level random effect.
However, it is the residual top level error that will be used in the model above rather than
including any square level variation. This unified model-based approach allows us to estimate the
expected power we have to detect spatial differences between GM and non-GM areas given the
required set of input data. Table 8 shows the average effect each term has on the power to
detect changes, and we can immediately see that, on average, knowing GM uptake
information only at square level as opposed to plot level decreases the power by just over 6%. We
can also see that, on average, each additional percent of change increases the power by just under 2%; with 50% changes we are often at 100% power. The interaction between sample size (N) and
change effect shows that under 10% change we can reasonably expect an extra 100 plots to
increase the power by just over 5%.
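The arithmetic behind these statements can be reproduced from the coefficients reported in Table 8. The sketch below simply multiplies the relevant interaction coefficient by the number of additional plots; treating the coefficients as percentage points of power per unit is an interpretation taken from the text above, and the names are hypothetical.

```python
# Coefficients as reported in Table 8.
N_BY_CHANGE = {5: 0.007037, 10: 0.055757, 25: 0.080359, 50: 0.015341}
SQUARE_LEVEL = -6.011845
CHANGE_EFFECT = 1.9266


def gain_from_extra_plots(extra_plots, change_percent):
    """Approximate gain in power (percentage points) from adding plots."""
    return extra_plots * N_BY_CHANGE[change_percent]


# An extra 100 plots under a 10% change scenario.
gain = gain_from_extra_plots(100, 10)
```

For 100 extra plots under a 10% change this gives about 5.6 percentage points, matching the "just over 5%" quoted in the text.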
Table 8: Significant coefficients from fitting a unified model to all power analyses conducted in order to understand the
average effect that each of the influential factors has on power.

| Uptake | Change Effect | Square Level | N:Change 5% | N:Change 10% | N:Change 25% | N:Change 50% |
|---|---|---|---|---|---|---|
| 0.01661 | 1.9266 | -6.011845 | 0.007037 | 0.055757 | 0.080359 | 0.015341 |
Impacts of Environmental Stewardship Measures as a proxy for GM crop
impacts. (Obj 4)
We explored the power of data from the BTO/RSPB/JNCC Breeding Bird Survey (BBS), a standardised
national volunteer bird survey, to detect changes in populations of farmland birds with a
percentage area change in land-use (e.g. 40% conversion of maize crop to GM maize), using
change from standard to Environmental Stewardship (ES) stubble management as a proxy for
GM crop uptake. This can be justified on the basis that most of the ES stubble area reflects a
change in crop management (from herbicide-sprayed to unsprayed), rather than a change in the
cropping regime and, therefore, it is conceptually not dissimilar to a change from conventional to
GM crops, but retaining the same crop type. Furthermore, the known effects of ES stubble
considered here, although statistically significant, are not large and, thus, this power analysis
will provide an assessment of the potential to detect relatively small changes in species’
abundance.
A major caveat, however, is that there is no evidence as to how the magnitude of the biological
effect of a switch to a GM crop compares to that of a switch from standard to ES management
of stubble for any given species or for a ‘generic bird’. In this analysis, we assume a similar
magnitude of effect on bird population growth rates for a GM crop as for ES stubble. However,
“GM crop” areas are simulated (by resampling BBS data) to match the regional distributions of
maize, beet and potatoes in order to approximate realistic bird data sets for the geographical
distribution of each crop. Note that this assumes that the uptake of GM varieties of each crop
follows the current distribution of cropping.
Using bird species for which we have previously demonstrated statistically significant
relationships between ES stubble and population growth rate (linnet, reed bunting and
yellowhammer), we investigated the power to detect these relationships given the spatial
distributions expected for GM crops and the six-year time period that has elapsed since the
inception of ES in 2005. We ran this analysis twice: once using all squares across England and
once using only BBS squares in arable-dominated farmland.
METHODS
Breeding Bird Survey (BBS)
BBS (1994-present) covers c. 2000 randomly selected lowland farmland 1km squares
throughout England annually. Volunteers walk two nominally parallel 1km transects (500m
apart) through each square twice during the breeding season. Each transect is divided into five
200m sections; species-specific bird counts and habitat are recorded separately in each.
Annual, square-specific counts are calculated as the maximum over the two visits of the total
count summed across transect sections (Risely et al. 2011). For this study, BBS squares were
selected if they were in lowland farmland (CEH Land Cover Map 2000 Environmental Zones)
and had been surveyed in ≥2 years between 2002 and 2010. Squares comprising <50% farmed
land were omitted as non-agricultural. The major landscape type for each square was
categorised as arable (ratio of arable:pastoral areas ≥2), pastoral (pastoral:arable ≥2) or mixed
(all other squares), based on the CEH Land Cover Map 2000. The analysis was conducted twice:
with all squares and with arable squares only.
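The count definition and the landscape categories described above can be sketched in Python. This is a minimal illustration rather than the code used for the report; the function names and the input layout (one list of section counts per visit) are assumptions made for the example, and the counts are invented.

```python
def annual_count(visits):
    """Annual square count: maximum over visits of the per-visit total.

    `visits` has one entry per visit; each entry lists the counts recorded
    in the (up to ten) 200 m transect sections on that visit.
    """
    return max(sum(section_counts) for section_counts in visits)


def landscape_type(arable_area, pastoral_area):
    """Classify a square as 'arable', 'pastoral' or 'mixed' (ratio threshold of 2)."""
    if arable_area > 0 and arable_area >= 2 * pastoral_area:
        return "arable"
    if pastoral_area > 0 and pastoral_area >= 2 * arable_area:
        return "pastoral"
    return "mixed"


# Invented counts: the early visit totals 7 birds, the late visit totals 9.
early = [1, 0, 2, 0, 0, 3, 0, 1, 0, 0]
late = [0, 2, 2, 1, 0, 0, 3, 0, 1, 0]
print(annual_count([early, late]))   # 9
print(landscape_type(60.0, 20.0))    # arable
print(landscape_type(30.0, 40.0))    # mixed
```

Comparing areas by multiplication rather than division avoids a division by zero for squares with no pastoral (or no arable) land.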
The analysis was restricted to species that rely on agricultural land for some part of their lifecycle (i.e. breed or winter on farmland) (e.g. Vickery et al. 2009) and to those that were
previously found to respond significantly to ES stubble management (Baker et al. unpublished
report to Natural England). Consequently, this analysis includes linnet, reed bunting and
yellowhammer. For farmland specialists (linnet and yellowhammer), data from all transect
sections were included in the analysis (not just ‘Farmland’) because birds recorded in the non-farmland parts of a survey square are, nevertheless, likely to have been influenced by the
farmland nearby. For non-specialists (reed bunting) that regularly exploit non-agricultural
habitats (e.g. gardens, wetlands), only counts from transect sections that were recorded as
farmland were used for each square.
Environmental Stewardship data
Spatially referenced data containing the ES and Countryside Stewardship Scheme agreement
details for each holding were supplied by Natural England (NE) and were used to assess the
amount of stubble options per BBS square per year using the methods of Davey et al. (2010).
Data sampling
In order to generate a BBS data set in which the total area of ES stubble options was
representative of potential GM cropping scenarios, samples were drawn randomly, with
replacement, from the set of existing BBS squares for each region until a required area of
stubble was reached that reflected a predicted regional area coverage of a given GM crop, while
also maintaining the regional sample sizes found in the source data set. This was done by
separately sampling squares that included ES stubble and those that did not. The regional
random samples were then combined together for analysis.
Thus, in detail, the area of potato, maize and sugar beet cropping for each region in 2010 was
obtained (http://archive.defra.gov.uk/evidence/statistics/foodfarm/landuselivestock/junesurvey/results.htm)
and the areas for scenarios between 20 and 80% were calculated (e.g. the area if 20% of existing
crop was converted to GM). The amount of each crop under each scenario that would be
expected to fall within the randomly distributed BBS squares was calculated, using the number
of squares that were surveyed at least twice between 2002 and 2010. The data for each region
were divided into Stubble > 0 (Stubble) and Stubble = 0 (NoStubble) and random samples of
BBS squares were drawn from the ‘Stubble’ data with replacement until the total area of ES
stubble option approximately equalled the total area expected to occur within all BBS squares
(±5%) for that region. Random samples were then drawn, with replacement, from the
‘NoStubble’ data and added to the selected ‘Stubble’ data set until the combined sample size
was equal to the actual number of BBS squares in the region. This was done as a
separate step to ensure that the sample size remained the same as in the original BBS dataset.
This was repeated for all regions and scenarios, with 100 samples drawn for each
region/scenario combination. The data were combined into a national data set for analysis,
where the total number of BBS squares was equal to the total number from the original data set, but
the Stubble Area was different, reflecting the area expected given a particular GM cropping
scenario.
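The two-step resampling for a single region can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the code used for the report: the function name, the (square_id, stubble_area) layout, the rule of discarding any draw that overshoots the ±5% window, and the `max_tries` cap are all choices made for the example.

```python
import random


def resample_region(squares, target_area, tol=0.05, rng=None, max_tries=100):
    """Draw one bootstrap sample of squares for a region.

    `squares` is a list of (square_id, stubble_area) pairs. Returns a list of
    the same length whose total stubble area lies within +/- tol of
    `target_area`, or None if no such sample is found in `max_tries` attempts.
    """
    rng = rng or random.Random()
    stubble = [s for s in squares if s[1] > 0]
    no_stubble = [s for s in squares if s[1] == 0]
    n_region = len(squares)
    low, high = target_area * (1 - tol), target_area * (1 + tol)
    if not stubble:
        return None

    for _ in range(max_tries):
        sample, area = [], 0.0
        # Step 1: add 'Stubble' squares until the target area is reached.
        while area < low and len(sample) < n_region:
            square = rng.choice(stubble)
            sample.append(square)
            area += square[1]
        if not (low <= area <= high):
            continue  # overshot, or ran out of squares: discard and redraw
        # Step 2: top up with 'NoStubble' squares to the regional sample size.
        n_missing = n_region - len(sample)
        if n_missing > 0 and not no_stubble:
            continue
        sample += [rng.choice(no_stubble) for _ in range(n_missing)]
        return sample
    return None


# Hypothetical region: 10 squares with 2 units of stubble each, 30 with none.
region = [(i, 2.0) for i in range(10)] + [(i, 0.0) for i in range(10, 40)]
sample = resample_region(region, target_area=10.0, rng=random.Random(1))
print(len(sample), sum(area for _, area in sample))   # 40 10.0
print(resample_region(region, target_area=1000.0))    # None
```

Returning `None` when the target cannot be met mirrors the situation described in the text, where some scenarios exhaust the regional number of squares before the required area is reached.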
For several of the scenarios the samples reached the total number of squares in the region
before the required area of cropping was achieved. Where this occurred for only a few replicate
samples, these samples were deleted and new samples were randomly generated until 100
samples with the required area and number of squares had been obtained. However, for some
scenarios/crops (e.g. sugar beet at > 55% in the East of England) the required cropping areas
could not be sampled given the number of squares available, and so these scenarios were
omitted from the analysis (sugar beet > 55%).
Statistical analysis
We used a log-linear approach that models the change in expected abundance between
consecutive years and can incorporate effects of spatio-temporal covariates, e.g. ES option
quantities, on local growth rate. This approach allows maximum use of the available data by
including observations from squares not surveyed, or where counts were zero, in the previous
year. Fundamentally, the analyses estimated the additional effect of ES on each species’
population growth rate but, importantly, growth is not thereby forced to be greatest in the years
of highest management levels. The model is a multivariate extension of Freeman & Newson
(2008):
$$\ln \mu_{i,t+1} = R_t + \alpha P_{i,t} + \beta Q_{i,t} + \ln \mu_{i,t} \qquad (1)$$
where μ_{i,t} is the expected species count at site i at time t, P_{i,t} is the amount of a given ES
management variable in square i at time t and Q_{i,t} is the percentage of arable habitat per
square. Q_{i,t} was mean-centred prior to fitting, and was included because most ES options are
targeted at either arable or pastoral farmland (e.g. stubble or grassland management), so option
uptake is likely to be correlated with the balance of arable and pastoral farming in the
landscape, which could influence bird population trends (e.g. Robinson, Wilson & Crick 2001).
From (1), R_t is the ‘background’ population growth rate from t to t+1 at a hypothetical reference
site where Q_{i,t} has the mean value and there is no management. The parameter α introduces
the effect of ES management on population growth at a site, and β controls for the effect of the
surrounding landscape. For fitting, we rewrite (1) as:
$$\ln \mu_{i,t+1} = \sum_{j=1}^{t} R_j + \alpha \sum_{j=1}^{t} P_{i,j} + \beta \sum_{j=1}^{t} Q_{i,j} + \ln \mu_{i,1} + \ln G_i \qquad (2)$$
which is a standard generalized linear model, with offset ln(G_i), where G_i is the number of
transects surveyed in square i, introduced to standardise the square-specific intercepts μ_{i,1}, as
some squares had fewer than ten 200m sections. Models were fitted assuming a Poisson
distribution for the observed BBS counts using the GENMOD procedure in SAS 9.2 (SAS
Institute Inc. 2008), accounting for overdispersion using Pearson’s χ2 goodness-of-fit statistic.
The significance of ES effects on population growth rates was assessed using similarly adjusted
likelihood-ratio test statistics of the hypothesis that α = 0.
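To make the link between (1) and (2) concrete, the sketch below builds expected counts both by stepping the recursion in (1) forward and by using the cumulative sums in (2), then computes a Pearson dispersion estimate and a scaled likelihood-ratio test. All numerical values are invented, the function names are hypothetical, and dividing a one-degree-of-freedom likelihood-ratio statistic by the dispersion estimate is our assumption about the form of the adjustment; it is not a transcript of the GENMOD output.

```python
import math


def expected_counts_recursive(mu1, G, R, P, Q, alpha, beta):
    """Step equation (1) forward from the first-year expectation G * mu1."""
    mu = [G * mu1]
    for r, p, q in zip(R, P, Q):
        mu.append(mu[-1] * math.exp(r + alpha * p + beta * q))
    return mu


def expected_counts_cumulative(mu1, G, R, P, Q, alpha, beta):
    """Equation (2): cumulative sums of R, P and Q plus intercept and offset."""
    mu = [G * mu1]
    for t in range(1, len(R) + 1):
        eta = (sum(R[:t]) + alpha * sum(P[:t]) + beta * sum(Q[:t])
               + math.log(mu1) + math.log(G))
        mu.append(math.exp(eta))
    return mu


def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom."""
    chi2 = sum((y - m) ** 2 / m for y, m in zip(observed, fitted))
    return chi2 / (len(observed) - n_params)


def scaled_lr_p_value(lr_statistic, dispersion):
    """P-value of a 1-d.f. likelihood-ratio test scaled by the dispersion.

    Uses the identity P(chi-square_1 > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(lr_statistic / dispersion / 2.0))


# Illustrative values only: three year-to-year transitions for one square.
R, P, Q = [0.02, -0.05, 0.01], [0.0, 1.5, 1.5], [-4.0, -4.0, -4.0]
a = expected_counts_recursive(3.0, 10, R, P, Q, alpha=0.03, beta=0.002)
b = expected_counts_cumulative(3.0, 10, R, P, Q, alpha=0.03, beta=0.002)
print(all(math.isclose(x, y) for x, y in zip(a, b)))  # True
```

Because the two constructions agree, the growth-rate model can be fitted as an ordinary Poisson regression on cumulative covariates, which is what makes form (2) convenient in practice.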
The data sets were analysed for all squares and also for arable only squares (ratio of
arable:pastoral > 2:1), because there is likely to be a stronger relationship between the
distribution of crops and bird populations in this landscape type, thus potentially increasing the
power to detect effects of changes in crop management.
Figures were plotted using R 2.15.0 and the smoothed curves fitted using Friedman’s super
smoother function (supsmu).
RESULTS
The power to detect statistically significant changes in population growth rates for linnet, reed
bunting and yellowhammer was affected by both the crop type (representing changes in
regional distribution) and scenario. For maize conversion scenarios (Figure 24) both linnet and
yellowhammer show >50% chance of detecting significant effects from changes in crop
management across all squares nationally. Yellowhammer shows a particularly high probability
of detecting these effects with conversion scenarios >50%. The power to detect significant
effects from changes in crop management is reduced when just arable squares are analysed,
but still indicates a ca. 80% chance of detecting effects with >50% conversion scenarios. Reed
bunting shows very low power to detect effects from changes in crop management using all
squares and arable only.
Figure 24: Power to detect changes in population
growth rate for three bird species through BBS
monitoring in maize.
With potato cropping patterns (Figure 25), linnet showed the highest power to detect effects
from changes in crop management, for all squares and arable squares only. When all squares
were included in the analysis, this power was >80% for all conversion scenarios. For
yellowhammer this power was lower, reaching a maximum of ca. 80% chance with conversion
scenarios approximately >60%. Here the arable only analysis followed a similar trend, showing
only slightly lower power. Again reed bunting had low power to detect effects of change in crop
management, although when analysing all squares this power increased rapidly with cropping
scenario, reaching ca. 80% chance of detecting an effect at the maximum conversion scenario
tested here (80%).
Figure 25: Power to detect change in population growth
rates for three bird species through BBS monitoring in
potato.
Because of the large area of sugar beet crop in the east of England we were unable to
simulate cropping patterns for sugar beet above 55% conversion scenarios within the
number of existing BBS squares. Thus, for sugar beet the simulations cover 20 to 55%
scenarios (Figure 26). For the analysis using all BBS squares the power to detect effects of
crop management on bird population trends is greatest for linnet, approaching 100% with
>30% conversion of existing sugar beet to GM. The yellowhammer results are similar,
although the slope is steeper, approaching 100% with >40% conversion scenarios. The
results for reed bunting again show low power to detect changes in crop management, and
for all three species the analysis with arable only squares gives lower power.
Figure 26: Power to detect change in population growth
rates for three bird species through BBS monitoring in
sugar beet.
Overall, the results suggest that linnet and yellowhammer would be good species for
monitoring the effects of GM conversion (widespread, relatively abundant and strongly
associated with arable landscapes), together giving high statistical power across the three
crops considered here. The low power shown by reed bunting is likely to be a consequence
of its use of wider habitat and shows the importance of the species used for monitoring.
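If power for a scenario is taken to be the proportion of the 100 resampled data sets in which the test of α = 0 was significant (our reading of the simulation design; the calculation is not spelled out in the text), it can be computed as in the sketch below. The p-values are invented, the 5% significance level is an assumption, and the binomial standard error is included only to indicate the Monte Carlo uncertainty that comes with 100 replicates.

```python
import math


def estimate_power(p_values, significance=0.05):
    """Return (power, standard error) from per-replicate p-values."""
    n = len(p_values)
    power = sum(p < significance for p in p_values) / n
    std_error = math.sqrt(power * (1 - power) / n)
    return power, std_error


# Hypothetical p-values from 100 resampled data sets for one scenario.
p_values = [0.01] * 80 + [0.20] * 20
power, se = estimate_power(p_values)
print(power, round(se, 3))   # 0.8 0.04
```

With 100 replicates, an estimated power of 80% carries a standard error of about four percentage points, which is worth bearing in mind when comparing neighbouring scenarios on the power curves.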
Implications of increasing power through increasing sample size.
Our models and simulations have illustrated how power will be influenced by changes in
sample size, duration of study, and a range of other species specific and ESN specific
parameters. At some point in the design of a General Surveillance strategy, the decision
needs to be made as to how powerful GS strategies should be designed to be, and whether
existing ESNs will provide sufficient data. The two simplest ways in which to increase power
of a surveillance strategy are through the most appropriate statistical analysis, or through
increasing the sample size. This report has not compared different methods of analysis, but
chose a design (paired comparison within years) that is one of the more powerful methods.
Here we consider potential costs for the second option, increasing the sample size of three
ESNs: a) the Wider Countryside Butterfly Survey (WCBS); b) Countryside Survey; c) the
Breeding Bird Survey. We do not consider the UK BMS, as the choice of site is led by the
volunteer, and few volunteer to monitor arable sites due to their relatively low butterfly
biodiversity. However, the newly established WCBS will focus on arable land, and offers
greater potential for GS in future years.
Costs for increased butterfly sampling under WCBS
There are two possible approaches for increasing sample sizes within the Wider Countryside
Butterfly Survey (WCBS): professional and volunteer. The benefit of the former is that
funders have complete control over sample location, effort, consistency and observer quality.
This approach is likely to be adopted by the Welsh Government to monitor agri-environment
schemes and has been used to increase sampling in under-recorded areas in Scotland. In
the long term, however, using volunteers is more sustainable and cost-effective. There are
some important caveats: (i) the WCBS is run by Butterfly Conservation, the Centre for
Ecology & Hydrology, BTO, JNCC and other funders, so all stakeholders would have to
agree to any extension work; (ii) volunteers are a finite resource, are in demand for other
surveys and are difficult to recruit in remote areas, so uptake is unpredictable; (iii) all that
can be costed is the recruitment effort, not the cost per unit survey effort, because surveys
are voluntary and it is unknown how much of the maximum available volunteer pool current
survey effort uses (so there are also likely to be diminishing returns for larger samples as the
upper limit to the number of volunteers available is approached).
Given the professional and volunteer possibilities, guideline costs are given below (at current
rates, which will be subject to inflation). Please note that these are subject to formal approval
and should not be taken as definitive or fixed.
Professional - £25K per 100 extra squares annually (this is based on resource requirements
for supplementary WCBS monitoring and support in the context of monitoring agri-environment schemes in Wales).
Volunteers – initially c. £30K per 100 extra squares (very approximate one-off recruitment
cost; there would be some ongoing effort required to keep sample size up, probably c. £10K
per annum per 100 extra initial squares).
Note that these costs are lower than those estimated for increased sampling under BBS
(below). The WCBS is a newly established scheme and is judged to have good scope for
increased sample size by 100 extra squares.
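The guideline WCBS figures can be compared over time with a short calculation. The sketch assumes that the volunteer upkeep cost of c. £10K applies from the second year onwards and that all rates stay constant; both are simplifications for illustration, and the function names are hypothetical.

```python
def cumulative_cost_professional(years, annual_cost_k):
    """Cumulative cost (in £K) of professional surveys of 100 extra squares."""
    return annual_cost_k * years


def cumulative_cost_volunteer(years, recruitment_k, upkeep_k):
    """One-off recruitment cost plus upkeep in each subsequent year (in £K)."""
    return recruitment_k + upkeep_k * max(years - 1, 0)


for years in (1, 2, 5, 10):
    pro = cumulative_cost_professional(years, annual_cost_k=25)
    vol = cumulative_cost_volunteer(years, recruitment_k=30, upkeep_k=10)
    print(years, pro, vol)
# 1 25 30
# 2 50 40
# 5 125 70
# 10 250 120
```

Under these assumptions the volunteer route costs more in the first year but less from the second year onwards, consistent with the statement that volunteers are more cost-effective in the long term.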
Increasing the Countryside Survey sample
Countryside Survey is a stratified random sample of approximately 600 1km squares over
Great Britain. Within each 1km square data on various biophysical measurements are taken.
For some measurements however, such as species richness and soil properties, the 1km
scale is not appropriate and therefore smaller plots nested within these 1km squares are
used for specific vegetation and soil core sampling. In addition to a randomly located set of
5 plots per square the nested plots also cover different habitat features of the square.
Examples include arable field margins, river banks and roadside plots.
To increase the sample size of CS measurements there are therefore two possibilities to
consider. The first is to increase the number of squares surveyed across the landscape and
the second option is to increase the number of plots nested within the squares. The
advantage of increasing the number of squares is that the spatial coverage over GB is
increased and also any measures that we currently collect at the 1km scale, such as water
quality, will also have an increased sample size. The advantage of increasing the number of
plots within squares is that it will be considerably cheaper than any square level additions
and can be targeted at particular habitat types relevant to the needs of funders. For
example, we could specifically increase the number of arable field margin plots or within
arable field plots without affecting the overall statistical robustness, whereas we could not
target whole 1km squares to particular areas or habitats.
CS is a survey conducted entirely by professionals with extensive QA and QC procedures in
place to ensure optimum quality and efficiency of the data. Each square is surveyed by a
team of up to four people in as short a time as possible at the same time of year as for all
previous surveys. The greatest cost to CS is paying the surveyors, training them and actually
getting them to the squares. With previous experience of CS, the average cost per square
across the UK is approximately £7k. This includes surveyors’ time, overheads, training,
equipment and T&S. Preparation-phase activities, lab costs, project management and the cost of
providing vehicles are not included in this cost.
If we add all elements of the survey together including surveyors’ time, overheads, training,
equipment, T&S, data preparation, management, analysis and reporting then the cost per
square is doubled to approximately £14k.
Adding additional plots is a far cheaper option because there is no additional cost of placing
the surveyors at the square. For measures such as soil properties though there are still the
associated lab costs. The cost per plot therefore depends on whether soil cores are to be
taken. A rough approximation of £500 per plot is considered suitable to incorporate these
factors. Translating these costs per plot into a sample size increase that affects the power to
detect change, we can relate additional financial resources to increases
in power. This relationship is shown below in Figure 27, which shows the power to detect a
10% change in species cover assuming a 40% uptake in GM crop cultivation. It is important
to note that any change to the CS sampling design must be agreed with all
of the co-funders and, most importantly, must remain consistent with the previous 34 year
record of the survey. Increases in sample size obviously have time management implications
for the surveyors and this must be taken into account. Over the 34 years of CS there has
been an increased sample size from each survey year to the next and we therefore have
extensive experience of analysing the unequal sample sizes at each time point, whilst at the
same time maximising the use of the data.
Figure 27: Power to detect a 10% change in common weed species cover assuming 40% uptake in GM crop cultivation
against additional costs to the survey. This additional cost is assumed to translate directly into an increased number of
plots.
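As a rough guide to how a given budget translates into additional Countryside Survey sampling, the approximate unit costs quoted above can be applied directly. The sketch assumes costs scale linearly with the number of units, which ignores any fixed or shared costs; the constant and function names are hypothetical.

```python
COST_PER_SQUARE_FIELD = 7_000    # surveyor time, overheads, training, equipment, T&S
COST_PER_SQUARE_FULL = 14_000    # all survey elements included
COST_PER_PLOT = 500              # rough approximation per additional plot


def affordable_units(budget):
    """Whole numbers of extra squares or plots that a budget (in £) would cover."""
    return {
        "squares_full_cost": budget // COST_PER_SQUARE_FULL,
        "squares_field_cost": budget // COST_PER_SQUARE_FIELD,
        "plots": budget // COST_PER_PLOT,
    }


print(affordable_units(70_000))
# {'squares_full_cost': 5, 'squares_field_cost': 10, 'plots': 140}
```

On these figures, the fully costed price of one extra square would fund 28 extra plots, which is why targeting plots within existing squares is the cheaper route to a larger sample.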
Costs for increased bird sampling under BBS
There are two possible approaches for generating additional bird monitoring sample sizes:
professional and volunteer. The benefit of the former is that BTO/the funders have complete
control over sample location, effort, consistency and observer quality. The BTO has used
this approach before to supplement the Breeding Bird Survey, under funding from
Defra/Natural England. In the long term, using volunteers is, obviously, cheaper. However,
there are some important caveats: (i) BBS is run by BTO, JNCC and RSPB, so all
stakeholders would have to agree to any extension work; (ii) volunteers are a finite resource
and are in demand for other surveys as well as BBS, so in any given year BTO and
stakeholders for other, potentially competing surveys would need to prioritize recruitment
and retention for a BBS extension relative to the other surveys, whose future relative priority
is unpredictable; (iii) all that can be costed is the recruitment effort, not the cost per unit
survey effort, because surveys are voluntary and it is unknown how much of the maximum
available volunteer pool current survey effort uses (so there are also likely to be diminishing
returns for larger samples as the upper limit to the number of volunteers available is
approached).
Given the professional and volunteer possibilities, guideline costs are below (at current
rates, which will be subject to inflation). Please note that these are subject to formal approval
and should not be taken as definitive or fixed; in addition, consideration of the volunteer
option would need to take place within the broader context of volunteer surveys across the
BTO and its partners, so formal approval cannot be guaranteed at this stage.
Professional - £37K per 100 extra squares annually (this is based on resource requirements
for supplementary BBS monitoring and support in the context of monitoring Entry Level
Stewardship for Natural England).
Volunteers – initially c. £33K per 100 extra squares (very approximate one-off recruitment
cost based on a previous application in a particular context; NB retention rate is c. 85%, so
there would be some ongoing effort required to keep sample size up, probably c. £10K per
annum per 100 extra initial squares).
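The implication of an 85% retention rate can be illustrated with a short calculation. The sketch assumes that the same retention rate applies every year and to every square, which is a simplification; the function names are hypothetical.

```python
def squares_remaining(initial, retention, years):
    """Expected squares still surveyed after `years` with no new recruitment."""
    return initial * retention ** years


def annual_replacements(initial, retention):
    """Expected number of squares to re-recruit each year to hold the sample."""
    return initial * (1 - retention)


print(round(annual_replacements(100, 0.85)))      # 15
print(round(squares_remaining(100, 0.85, 5), 1))  # 44.4
```

On these assumptions, around 15 of every 100 extra squares would need a new volunteer each year, and without that effort fewer than half of the extra squares would still be covered after five years.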
Conclusions
This study has illustrated the potential and some of the complexities behind predicting the
power of a General Surveillance strategy using our existing Environmental Surveillance
Networks.
The Generic Equation for Count Data
Considering ‘count’ data collected in annual census surveys, it is possible to derive a
‘generic equation’ which can predict power (probability of detecting change) with reasonable
accuracy given nine explanatory variables (number of sites, duration of study, number of
survey visits missed, mean and variance in abundance of the indicator species, background
rate of growth at a ‘control’ site, the proportion of ‘treated’ sites, the degree of overdispersion
in the data and the magnitude of the effect between control and treated sites). However, it
was necessary to include as many as 10 interaction terms (9 of which involved alpha, the
magnitude of the effect), and only to consider values of alpha >-0.02, for the predictive
model to perform adequately. It should also be emphasized that this predictive model should
only be used within the range of parameter values used for the simulations upon which it is
based.
Having derived the generic equation, it can be used to estimate how power will be influenced
by the nine explanatory variables. The trends illustrated here are as expected:


• Factors that increase power include a) longer time series (Fig 10a v Fig 10b); b)
  fewer visits missed (Fig 10a v Fig 10c); c) higher mean abundance of indicator
  species (Fig 11); d) greater value of alpha (i.e. searching for larger differences
  between GM and non-GM sites) (Fig 12a v Fig 10a); e) greater number of sites (Fig
  10).
• Increasing the background decline in the indicator species at control sites (Fig 10a v
  Fig 10d) only has a very modest impact on power.
In practical terms, and for the purposes of general surveillance, power is best influenced by
choice of indicator species: this will influence both the mean and variance in abundance
across sites, and also the number of sites at which that species is detected. Provided that
the parameters of the network and the species fall within the range simulated in this study,
the generic equation may be used to ensure that power is within an acceptable range. Power
curves are drawn for two butterfly and three bird species, to illustrate how power varies with
effect size, proportion of sites treated and with number of sites monitored.
Analysis of spatial data within years
The Countryside Survey provides a contrasting case study, in which intensive monitoring
occurs within one year (but this is not repeated for 7-8 years). A range of metrics for the
delivery of ecosystem services are collected in a number of 1km squares which include
arable land. Key messages from this case study include:


• The power to detect large (>10%) change in many of these metrics is limited (Fig 16
  & 17)
• If the information on the location of GM crops is only available at 1km square (as
  opposed to the field) level, this reduces the power to detect change (Fig 18-20)
A second generic equation was derived based on a model with gamma errors which allowed
an estimate of the average effect that each influential factor has on power.
Using Environmental Stewardship Measures as a proxy for GM crops
A third, rather different, approach used an existing BBS dataset in which monitoring was
used to evaluate the effectiveness of environmental stewardship targeted at birds which
utilize stubbles over winter for feeding. If the premise is accepted that the impact of
over-wintering stubbles is a suitable proxy for a potential impact of change in management
associated with a GM crop, then this study illustrates how the probability of detecting those
impacts is likely to vary between crops and species. Using this ‘real life’ example also
provides further evidence that a change in management can be detected by the monitoring
of appropriate indicator species, in spite of the noise and bad behavior inevitably associated
with field data collected by an ESN.
One crucial point about this approach is that the management change which is monitored in
this example (provision of stubbles over winter) is designed to impact on the three target bird
species, whereas in the case of GS of GM crops, the surveillance is for unintended effects –
so there is no clear pathway that links change to the indicator species. Consequently, the
choice of indicator species should cover a range of ways in which the agro-ecosystem is
utilized. The sensitivity of any indicator species will depend upon its ecology, the timing of
monitoring with respect to the time of year when the species utilizes the agricultural
landscape, and how this is influenced by the change that occurs in the agro-ecosystem.
Limitations of this study
In many cases the transects from which ESNs conduct their monitoring are constrained to
follow field edges within arable land. This introduces an obvious bias towards edge habitat,
and counts will be more heavily influenced by how arable management impinges upon field
edges rather than crop centres. These biases will be greater for less mobile taxa (for
example butterflies will be more biased than birds). Population counts will also undoubtedly
be influenced by landscape context: for example, in a resource rich landscape, management
changes within an arable field would be expected to have much less influence than in a
resource poor landscape. A challenge for the analysis of data from monitoring networks,
seeking to detect the cause of changes in biodiversity, is to include those metrics which
capture the influence of landscape context on farmland biodiversity, along with other
covariates. This would be an important means of increasing the power to detect causes of
change in any analysis of ‘real data’. In our simulation study, these influences are all
captured in the variance of the indicator metric.
Other biases may also arise upon the introduction of GM crops. One example could be that
early adopters are a biased sample of all potential farmers: for example those with
pernicious weed problems may be more likely to be amongst the first to adopt herbicide
tolerant crops. Our simplifying assumption has been that early adopters are a sample of all
farmers currently growing the crop concerned, randomly chosen with respect to farming
practice, baseline biodiversity and all other metrics.
This study has focused on a limited range of metrics. In particular, the examples drawn from
the BMS and the BBS have involved counts of single species. Looking at covariation in a
range of species simultaneously could represent a more sensitive means of detecting
change, and would be a fruitful line of future investigation.
Limitations of General Surveillance
This study also highlights the limitations of general surveillance. This is most clearly
illustrated by two points.
The power to detect small changes is low
Figure 15 illustrates how power changes with the magnitude of alpha for three farmland bird
species: from -0.004 (which equates to a 0.4% change p.a., or 3.15% change over a nine
year period) to -0.02 (a 2% change p.a. amounting to a 14.8% change over a nine year
period). The power to detect small changes, even over fairly long time series of nine years,
is low. This message is also illustrated by the CS case study examining the power to detect
differences between two treatments within years, with power being relatively low for effect
sizes ≤ 10% (Figures 21, 22 & 23).
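The cumulative changes quoted for alpha are consistent with compounding on the log scale over the eight year-to-year transitions spanned by nine annual counts. That reading is our inference from the numbers rather than a stated formula, and the function name is hypothetical:

```python
import math


def cumulative_change(alpha, n_years):
    """Proportional change over a series of n_years annual counts.

    alpha is the change in log growth rate per year; n_years counts span
    n_years - 1 year-to-year transitions.
    """
    return math.exp(alpha * (n_years - 1)) - 1


print(round(100 * cumulative_change(-0.004, 9), 2))  # -3.15
print(round(100 * cumulative_change(-0.02, 9), 1))   # -14.8
```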
The power to detect change within a few years is likely to be low
Figure 13 illustrates how the duration of the study influences power, with a relatively short
study of five years providing low power to detect change even in abundant species if the
number of sites is limited (50 in this example).
In conclusion, to detect small effects, on rare species, at an early stage of GM uptake (or
any one of those), existing ESNs would need to be supplemented with extra sites, and the
costs of this have been estimated for our case studies.
References
Baker, D.J., Freeman, S.N., Grice, P.V. and Siriwardena, G.M. (2012) Landscape scale
responses of birds to agri-environment management: a test of the English Environmental
Stewardship Scheme. Journal of Applied Ecology, 49, 4, 871-882.
Davey, C., Vickery, J., Boatman, N., Chamberlain, D., Parry, H. and Siriwardena, G. (2010).
Regional variation in the efficacy of entry level stewardship in England. Agriculture,
Ecosystems and Environment, 139, 1-2, 121-128.
Freeman, S.N. and Newson, S.E. (2008) On a log-linear approach to detecting ecological
interactions in monitored populations. Ibis, 150, 2, 250-258.
Risely, K., Massimino, D., Johnston, A., Newson, S.E., Eaton, M.A., Musgrove, A.J., Noble,
D.G., Procter, D. & Baillie, S.R. (2012). The Breeding Bird Survey 2011. BTO Research
Report 624. British Trust for Ornithology, Thetford.
Robinson, R.A., Wilson, J.D. and Crick, H.Q.P. (2001) The importance of arable habitat for
farmland birds in grassland landscapes. Journal of Applied Ecology, 38, 5, 1059-1069.
SAS Institute Inc. (2008) SAS OnlineDoc, Version 9.2. SAS Institute Inc., Cary, NC.
Vickery, J.A., Feber, R.E. & Fuller, R.J. (2009). Arable field margins managed for biodiversity
conservation: a review of food resource provision for farmland birds. Agriculture,
Ecosystems and Environment, 133, 1–3.