Marine Environmental Research 48 (1999) 269–283
www.elsevier.com/locate/marenvrev
A benthic index of biological integrity for
assessing habitat quality in estuaries of the
southeastern USA
R.F. Van Dolah a,*, J.L. Hyland b, A.F. Holland a, J.S. Rosen c, T.R. Snoots a
a SC Marine Resources Division, PO Box 12559, Charleston, SC 29422, USA
b NOAA Carolinian Province Office, PO Box 12559, Charleston, SC 29422, USA
c TPMC, Mill Wharf Plaza, Suite 208, Scituate, MA 02066, USA
Received 1 April 1998; received in revised form 1 October 1998; accepted 1 February 1999
Abstract
A benthic index of biotic integrity was developed for use in estuaries of the southeastern USA (Cape Henry, VA to St. Lucie Inlet, FL) using a modification of the method developed by Weisberg et al. (1997. An estuarine benthic index of biotic integrity (B-IBI) for Chesapeake Bay. Estuaries, 20 (1), 149–158). Data from non-degraded stations sampled in 1993 and 1994 were analyzed using classification analysis of species composition to define major habitat types relative to selected physical parameters. Various benthic metrics were then tested on a larger 1994 data set for each major habitat to determine those that discriminated between non-degraded and degraded sites classified on the basis of dissolved oxygen, sediment chemistry, and sediment toxicity results. Scoring criteria for each metric were developed based on the distribution of values at non-degraded sites. Average scores from different combinations of the most sensitive metrics were compared to derive the final index, which integrates the average scores of four metrics (number of taxa, abundance, dominance, and percent sensitive taxa). An independent data set representing sites sampled in 1993 and 1995 was used to validate the index. The final combined index correctly classified 93% of stations province-wide in the developmental data set and 75% of stations in the validation data set. Comparison of the index results with those of individual benthic measures and sediment bioassays from stations sampled in 1993 and 1995 showed that the index detected a higher percentage of samples where bioeffects were expected (based on sediment chemistry) than did any of these other measures individually. © 1999 Elsevier Science Ltd. All rights reserved.
Keywords: Benthic index; Contaminants; Southeastern estuaries; Sediment toxicity; EMAP
* Corresponding author. Fax: +1-803-762-5110.
E-mail address: [email protected] (R.F. Van Dolah)
0141-1136/99/$ - see front matter © 1999 Elsevier Science Ltd. All rights reserved.
PII: S0141-1136(99)00056-2
1. Introduction
Marine and estuarine benthic communities have been used extensively to document biological responses to contaminant exposure, organic enrichment, hypoxic
events, and a variety of other changes in environmental quality. Traditional measures of response have included changes in faunal abundance and biomass, species
diversity, species dominance, presence of pollution-tolerant or pollution-sensitive
species, and changes in trophic function or structure (Berge, 1990; Dauer, 1993;
Dauer & Alden, 1995; Gaston, Rutledge & Walther, 1985; Pearson & Rosenberg,
1978; Rhoads, McCall & Yingst, 1978). More recent approaches have integrated
many or all of these biological measures into a single multi-metric index that can
effectively discriminate between degraded and non-degraded environments.
Two general approaches to developing multi-metric indices have been shown to
work well in estuarine environments. One approach, which has been used successfully in the middle Atlantic and Gulf coast regions of the USA (Engle, Summers &
Gaston, 1994; Weisberg et al., 1992), combines stepwise and canonical discriminant
analyses to produce a multi-variate index that is normalized to account for the
effects of natural environmental factors on component biological metrics used in
the index. However, when there are many environmental influences, the normalization process can be complex and produce results that may not always be consistent with established ecological principles.
The second approach is the benthic index of biotic integrity (B-IBI), which
has been applied recently in the Chesapeake Bay and New York–New Jersey
Harbor areas (Ranasinghe, Weisberg, O'Connor & Adams, unpublished; Weisberg
et al., 1997). This method is a variation of the index of biotic integrity (IBI) originally developed for freshwater systems (Karr, 1981, 1991; Karr, Fausch, Angermeier,
Yant & Schlosser, 1986; Kerans & Karr, 1994). The B-IBI is a multi-metric
index that reflects the degree to which component measures of biological response
deviate from values expected in habitats that show no evidence of anthropogenic
stress. Natural variations in these measures that are due to various environmental
factors (e.g. salinity, latitude) are accounted for by defining habitat-specific reference conditions for each metric. The simplicity of this approach makes it easy to
understand and interpret, and applicable to a range of habitats for which data are
available.
We used a modification of the B-IBI approach described by Weisberg et al. (1997)
to develop a benthic index for use in southeastern estuaries of the USA. Our goal
was to create an index that was effective in discriminating between degraded and
non-degraded sites in a variety of habitat types, and to test the applicability of the
method at a regional scale. This paper summarizes the basic steps we used to derive
the index and demonstrates its utility as a biological tool for detecting signals of
degraded sediment quality in southeastern estuaries.
2. Index development process
Benthic data used for development and validation of the index were collected from
171 sites in Virginia, North Carolina, South Carolina, Georgia and Florida over a
3-year period (1993–95). These sites were part of a larger array of stations sampled
for the Environmental Monitoring and Assessment Program (EMAP) in the Carolinian Province. Details on the specific sampling locations and protocols are
described by Ringwood, Holland, Kneib and Ross (1996), and Hyland et al. (1996,
1998).
Two to four grabs were collected at each site using a Young grab (0.04 m²). Stations sampled in 1993 generally included three replicate samples per station, whereas
stations sampled in 1994 and 1995 were generally represented by two replicate samples per station. Only six stations (all 1994) were represented with four grabs. All
samples were sieved through a 0.5-mm screen and the macrofauna retained on the
sieve were preserved, sorted, and identified to the lowest possible taxonomic level.
Several other measurements or samples were collected in conjunction with the
benthic samples to characterize conditions at each site (Hyland et al., 1996, 1998;
Ringwood et al., 1996). These measures included general habitat characteristics (water depth, temperature, salinity, dissolved oxygen, and pH), sediment grain-size composition, total organic carbon (TOC), sediment contaminant concentrations (up to 131 analytes), and sediment toxicity data based on two or more laboratory bioassays (Microtox® assay, 10-day acute amphipod assay using Ampelisca abdita and/or Ampelisca verrilli, and 7-day growth-inhibition assay using juvenile Mercenaria mercenaria).
The basic steps used to develop the index involved: (1) defining major habitat
types based on classification analysis of benthic species composition and evaluation
of the physical characteristics of the resulting site groups; (2) selecting a development data set representative of degraded and non-degraded reference sites in each
habitat; (3) comparing various benthic attributes between reference and degraded
sites for each of the major habitat types; (4) selecting the benthic attributes that
discriminated between reference and degraded sites for inclusion as component metrics in the index; (5) establishing scoring criteria (thresholds) for the selected
metrics based on the distribution of values at reference sites; (6) constructing a
combined index value for any given sample by assigning an individual score for each
metric based on the scoring criteria, and then averaging the individual scores; and
(7) validating the index by comparing observed to expected responses in an independent data set.
Stations were divided into non-degraded and degraded categories based on a
combination of chemical and toxicological criteria summarized in Table 1. Sites
that did not meet these criteria were considered to be marginal sites (i.e. with
intermediate characteristics) and were excluded from further analyses.

Table 1
Criteria for classifying stations as reference or degraded based on sediment contamination (a), sediment toxicity (b), and near-bottom dissolved oxygen (DO) (c) conditions

Reference              Degraded
Low contamination      High contamination
and                    or
Low toxicity           High toxicity
and                    or
Acceptable DO          Unacceptable DO

(a) High contamination: ≥3 analytes exceeding ER-L/TEL sediment quality guidelines, or ≥1 analyte exceeding ER-M/PEL guidelines. Low contamination: not meeting high contamination criteria. ER-L and ER-M sediment quality guideline values are from Long et al. (1995) and Long and Morgan (1990). TEL and PEL sediment quality guidelines are from MacDonald (1994) and MacDonald et al. (1996).
(b) High toxicity: ≥50% of assays with positive toxicity results. Low toxicity: no positive toxicity assay results. Up to four different toxicity assays were performed (Ampelisca abdita, Ampelisca verrilli, Microtox, Mercenaria mercenaria).
(c) Unacceptable DO: any observation with DO <0.3 mg/l, or 20% or more time-series observations <2 mg/l, or all time-series observations <5 mg/l. Acceptable DO: not meeting unacceptable DO criteria.

Data from 59 non-degraded sites sampled in 1993 and 1994 were then analyzed by classification
(cluster) analysis of benthic species composition, and evaluation of the physical
factors associated with the resulting station groups to define major habitat types. Several types of cluster analyses were performed. The one that produced the clearest results was a normal (Q-mode) analysis run on log10-transformed data with flexible sorting as the clustering method and Bray–Curtis similarity as a resemblance measure (Boesch, 1977). Differences in abiotic factors (salinity, latitude, percent silt–clay, TOC) among the resulting site groups were then examined by analysis of variance (ANOVA) and pair-wise multiple comparison tests (Duncan's test and Tukey's HSD) to help delineate the major habitat types. Four habitat groups resulted: (1) oligohaline–mesohaline stations (≤18‰) from all latitudes; (2) polyhaline–euhaline stations (>18‰) from northern latitudes (>34.5° N); (3) polyhaline–euhaline stations from middle latitudes (30–34.5° N); and (4) polyhaline–euhaline stations from southern latitudes (<30° N).
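The four habitat groups reduce to a lookup on bottom salinity and latitude. The following is a minimal sketch of that lookup, not the authors' code: the function name is hypothetical, and assigning stations that fall exactly on 30° N or 34.5° N to the middle-latitude group is an assumption read from the stated ranges.

```python
def habitat_group(salinity_ppt: float, latitude_deg_n: float) -> int:
    """Return the habitat group number (1-4) for a station.

    1: oligohaline-mesohaline (salinity <= 18), all latitudes
    2: polyhaline-euhaline, northern latitudes (> 34.5 N)
    3: polyhaline-euhaline, middle latitudes (30-34.5 N)
    4: polyhaline-euhaline, southern latitudes (< 30 N)
    """
    if salinity_ppt <= 18.0:
        return 1
    if latitude_deg_n > 34.5:
        return 2
    # Assumption: the boundary latitudes themselves belong to the middle group.
    if latitude_deg_n >= 30.0:
        return 3
    return 4
```

A station with salinity 25 at 32.8° N, for example, would fall in group 3 under these rules.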
Seventy-five stations sampled during the 1994 survey were selected to represent the 'development data set'. These stations provided data from both degraded and non-degraded sites in each of the four habitat types. Classification of the sites into degraded and non-degraded categories was based on the criteria listed in Table 1.
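The Table 1 criteria can be expressed as a small decision rule. The sketch below is illustrative only: all function and parameter names are hypothetical, and treating every DO reading for a station as a single series (so that the "any observation" and "time-series" rules are applied to the same list) is a simplifying assumption.

```python
from typing import Sequence


def high_contamination(n_over_erl_tel: int, n_over_erm_pel: int) -> bool:
    """>= 3 analytes above ER-L/TEL, or >= 1 analyte above ER-M/PEL."""
    return n_over_erl_tel >= 3 or n_over_erm_pel >= 1


def toxicity_level(n_positive: int, n_assays: int) -> str:
    """'low' = no positive assays; 'high' = >= 50% positive; else intermediate."""
    if n_assays <= 0:
        raise ValueError("at least one toxicity assay is required")
    if n_positive == 0:
        return "low"
    if n_positive / n_assays >= 0.5:
        return "high"
    return "intermediate"


def unacceptable_do(do_mg_l: Sequence[float]) -> bool:
    """Any value < 0.3 mg/l, or >= 20% of values < 2 mg/l, or all values < 5 mg/l."""
    if not do_mg_l:
        raise ValueError("at least one DO observation is required")
    frac_below_2 = sum(v < 2.0 for v in do_mg_l) / len(do_mg_l)
    return min(do_mg_l) < 0.3 or frac_below_2 >= 0.20 or max(do_mg_l) < 5.0


def classify_station(n_over_erl_tel: int, n_over_erm_pel: int,
                     n_positive: int, n_assays: int,
                     do_mg_l: Sequence[float]) -> str:
    """Return 'reference', 'degraded', or 'marginal' (excluded from analysis)."""
    tox = toxicity_level(n_positive, n_assays)
    if (high_contamination(n_over_erl_tel, n_over_erm_pel)
            or tox == "high" or unacceptable_do(do_mg_l)):
        return "degraded"
    if tox == "low":
        return "reference"
    # Neither set of criteria is met, e.g. one positive assay out of four.
    return "marginal"
```

Under these rules the only route to a marginal classification is an intermediate toxicity result at a station with low contamination and acceptable DO, which follows from how Table 1 defines "low" contamination and "acceptable" DO as the complements of their degraded counterparts.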
Forty different infaunal attributes (Table 2) were tested with the 1994 development data set to determine those that discriminated between non-degraded and degraded stations within each habitat type. This initial list of attributes included various measures of diversity, abundance, dominance, and presence of individual species or taxonomic groups that we considered to be potentially useful indicators of stress (based either on evidence in the literature or actual distributions observed in this study).

Table 2
List of biological attributes that were considered as candidate metrics for the benthic index (all attributes were evaluated using mean values of replicate samples at a site. Percent values represent percent of total faunal abundance)

Diversity measures
  Mean number of taxa**
  Shannon–Wiener Index**
Dominance measures
  100 minus percent abundance of most dominant taxon*
  100 minus percent abundance of two most dominant taxa**
  100 minus percent abundance of three most dominant taxa*
Abundance measures
  Mean abundance of all fauna**
  Percent abundance of subsurface feeders
  Percent abundance of surface feeders*

  Percent abundance of Amphipoda*
  Percent abundance of Ampeliscidae
  Percent abundance of Haustoriidae
  Percent abundance of Ampeliscidae and Haustoriidae

  Percent abundance of Polychaeta
  Percent abundance of Paraprionospio pinnata
  Percent abundance of Streblospio benedicti
  Percent abundance of Capitellidae
  Percent abundance of Spionidae
  Percent abundance of Orbiniidae*
  Percent abundance of Hesionidae
  Percent abundance of Cirratulidae
  Percent abundance of Nereididae
  Percent abundance of Mediomastus spp.

  Percent abundance of Oligochaeta
  Percent abundance of Mollusca*
  Percent abundance of Bivalvia*
  Percent abundance of Gastropoda*
  Percent abundance of Tellinidae*
  Percent abundance of Lucinidae
  Percent abundance of Lucinidae and Tellinidae*
  Percent abundance of Mulinia lateralis*
  Percent abundance of Acteocina canaliculata*

  Percent abundance of Crustaceans (a)*
  Percent abundance of Xanthidae
  Percent abundance of Cyathura polita + Cyathura burbancki*

  Percent abundance of pollution-tolerant taxa (b)
  Percent abundance of pollution-sensitive taxa (b)*
  Percent abundance of pollution-sensitive Group A (c)*
  Percent abundance of pollution-sensitive Group B (c)*
  Percent abundance of pollution-sensitive Group C (c)**
  Percent abundance of pollution-sensitive Group D (c)**

(a) All crustaceans except Insecta, Pycnogonida, Thoracica.
(b) Based on literature and/or previous EMAP databases.
(c) Based on different combinations of taxa that appeared to be pollution sensitive in this study: Group A: percent Ampeliscidae, Tellinidae, Lucinidae, Hesionidae, Cirratulidae, C. polita, C. burbancki; Group B: percent Ampeliscidae, Tellinidae, Hesionidae, Cirratulidae, C. polita, C. burbancki; Group C: percent Ampeliscidae, Haustoriidae, Tellinidae, Lucinidae, Hesionidae, Cirratulidae, C. polita, C. burbancki; Group D: percent Crustacea, Mulinia spp. in habitat 1; percent Acteocina spp., Mediomastus spp., Ampeliscidae, Orbiniidae, Tellinidae, Lucinidae, Cirratulidae in habitat 2; percent Crustacea in habitat 3; percent Mediomastus spp., Ampeliscidae, Hesionidae, C. polita, C. burbancki in habitat 4.
*Metrics which showed a significant difference between sites.
**Metrics chosen for final testing in various combinations.

After preliminary statistical evaluation of all the attributes, six were selected for further testing as possible component metrics of the index. Key criteria considered in the selection were whether differences between degraded and non-degraded stations were statistically significant, both throughout the region and within the majority of habitat types considered (based on results of Mann–Whitney U test at α = 0.1), and whether the differences were in a direction consistent with established ecological principles. The six attributes were: mean number of taxa, mean abundance (all taxa), mean H′ diversity, 100 minus percent abundance of the two most numerically dominant taxa, and two different measures of percent abundance of pollution-sensitive taxa (Table 2). Statistical evaluation of these candidate metrics indicated that all showed highly significant differences between non-degraded and degraded sites using the region-wide data set (Table 3). They also showed significant differences in at least two of the three sub-regional habitat groups that had a sufficient number of sites in each category to test.
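To make the candidate attributes concrete, the sketch below computes them for a single grab and then averages across replicate grabs at a station, as Table 2 specifies. It is an illustration under stated assumptions, not the authors' procedure: the function names and the taxon-keyed dictionary input are hypothetical, the logarithm base for H′ is not given in the text (base 2 is assumed here and exposed as a parameter), and the handling of a grab with no fauna is not described, so the sketch simply rejects it.

```python
import math
from typing import Iterable, List, Mapping


def grab_metrics(counts: Mapping[str, int],
                 sensitive_taxa: Iterable[str] = (),
                 log_base: float = 2.0) -> dict:
    """Compute candidate metrics for one grab from taxon -> count."""
    present = {taxon: n for taxon, n in counts.items() if n > 0}
    total = sum(present.values())
    if total == 0:
        raise ValueError("metrics are undefined for a grab with no fauna")
    # Combined abundance of the two numerically dominant taxa.
    top_two = sum(sorted(present.values(), reverse=True)[:2])
    # Shannon diversity: H' = -sum(p_i * log(p_i)).
    shannon = -sum((n / total) * math.log(n / total, log_base)
                   for n in present.values())
    sensitive = set(sensitive_taxa)
    n_sensitive = sum(n for taxon, n in present.items() if taxon in sensitive)
    return {
        "n_taxa": len(present),
        "abundance": total,
        "shannon": shannon,
        "dominance_top2": 100.0 - 100.0 * top_two / total,
        "pct_sensitive": 100.0 * n_sensitive / total,
    }


def station_means(grabs: List[Mapping[str, int]],
                  sensitive_taxa: Iterable[str] = (),
                  log_base: float = 2.0) -> dict:
    """Average each metric over the replicate grabs taken at one station."""
    sensitive = set(sensitive_taxa)
    per_grab = [grab_metrics(g, sensitive, log_base) for g in grabs]
    return {key: sum(m[key] for m in per_grab) / len(per_grab)
            for key in per_grab[0]}
```

In practice the `sensitive_taxa` argument would hold the taxa belonging to one of the sensitive groups in Table 2, which requires mapping each identified species to its family before the counts are passed in.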
Scoring criteria for each metric were developed based on the distribution of values
at the non-degraded (reference) sites in the 1994 development data set. A score of 1
was used if the value of the metric for the station being evaluated was in the lower
10th percentile of corresponding reference values. A score of 3 was used if the value
of the metric for the station was in the lower 10±50th percentile of reference values.
A score of 5 was used if the value of the metric for the station was in the upper 50th
percentile of reference values. Scoring criteria were determined separately for each
metric and habitat type using the threshold values provided in Table 4.
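The scoring rule can be written as a short function of a metric value and the two reference percentiles for the relevant habitat. This is a minimal sketch with a hypothetical function name; the text does not say how a value falling exactly on a breakpoint is scored, so giving it the higher score is an assumption.

```python
def score_metric(value: float, ref_p10: float, ref_p50: float) -> int:
    """Score one metric as 1, 3, or 5 against reference-site percentiles.

    ref_p10 and ref_p50 are the 10th and 50th percentiles of the metric
    at non-degraded (reference) sites in the same habitat type.
    """
    if ref_p10 > ref_p50:
        raise ValueError("the 10th percentile cannot exceed the 50th")
    if value < ref_p10:
        return 1  # lower 10th percentile of reference values
    if value < ref_p50:
        return 3  # between the 10th and 50th percentiles
    return 5      # upper 50th percentile


# Illustrative breakpoints only: a station averaging 8.0 taxa per grab,
# scored against breakpoints of 7.0 and 8.5, receives a 3.
taxa_score = score_metric(8.0, ref_p10=7.0, ref_p50=8.5)
```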
A combined index value was computed for a station by assigning a score for each
component metric and then averaging the individual scores. An index score <3
suggests the presence of a degraded benthic assemblage (some apparent level of
stress to very unhealthy) because the averaged metrics deviate from conditions
typical of the `best' (upper 50th percentile) reference sites.
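Combining the metric scores is a plain average followed by a comparison with 3. The sketch below assumes the individual scores have already been assigned; the function names are hypothetical.

```python
from typing import Sequence


def benthic_index(scores: Sequence[int]) -> float:
    """Average the individual metric scores (each must be 1, 3, or 5)."""
    if not scores:
        raise ValueError("at least one metric score is required")
    if any(s not in (1, 3, 5) for s in scores):
        raise ValueError("metric scores must be 1, 3, or 5")
    return sum(scores) / len(scores)


def suggests_degraded_benthos(index_value: float) -> bool:
    """An index value below 3 suggests a degraded benthic assemblage."""
    return index_value < 3.0
```

With four component metrics, the index can only take values in steps of 0.5 between 1 and 5; scores of 1, 1, 3, and 1, for example, average to 1.5.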
Forty different combinations of the six candidate benthic metrics were further evaluated to determine which represented the best combined index. These evaluations were made by calculating the rate at which each multi-metric index correctly classified degraded stations as degraded (index score <3), and non-degraded stations as non-degraded (index score ≥3), using the independent criteria described in Table 1. The metric combination that produced the highest percentage of correct classifications consistently across the various habitats was then selected to represent the final index.

Table 3
Results of Mann–Whitney U tests comparing degraded and non-degraded sites by metric and habitat group (statistical significance was determined at the α = 0.1 level)

Site category     n   Mean abundance Mean number    Mean H′        % Abundance    % Abundance    % Abundance
                                     of taxa                       of two most    of pollution-  of pollution-
                                                                   dominant taxa  sensitive taxa sensitive taxa
                                                                                  Group C        Group D
                      Mean   p       Mean   p       Mean   p       Mean   p       Mean   p       Mean   p
Oligohaline–mesohaline (all latitudes)
  Non-degraded    11  122.7  0.184   8.8    0.005   1.8    0.005   77.1   0.023   10.8   0.011   17.8   0.012
  Degraded        9   86.1           4.2            1.0            89.0           2.1            1.1
Polyhaline–euhaline (northern latitudes)
  Non-degraded    20  112.1  0.096   16.9   0.003   3.0    0.002   49.9   0.005   18.9   0.024   43.8   0.112
  Degraded        4   49.1           2.1            0.5            89.2           0.5            18.3
Polyhaline–euhaline (middle latitudes)
  Non-degraded    19  268.0  0.045   26.6   0.017   3.1    0.013   50.1   0.045   23.3   0.473   15.7   0.085
  Degraded        3   27.5           4.7            1.5            74.4           30.8           1.3
Polyhaline–euhaline (southern latitudes)
  Non-degraded    8   292.8  –       36.0   –       3.8    –       37.4   –       4.3    –       21.5   –
  Degraded        1   6.5            3.0            1.4            69.2           0.0            0.0
Overall
  Non-degraded    58  190.1  0.000   21.2   0.000   2.9    0.000   53.4   0.000   16.8   0.000   26.6   0.000
  Degraded        17  62.4           3.7            1.0            85.3           6.7            5.1

–, Not tested due to small sample size.

Table 4
Scoring criteria percentile breakpoints for component metrics used in index calculation

Metric                                    Oligohaline–      Polyhaline–       Polyhaline–       Polyhaline–
                                          mesohaline        euhaline          euhaline          euhaline
                                          All latitudes     Northern          Middle            Southern
                                                            latitudes         latitudes         latitudes
                                          10th     50th     10th     50th     10th     50th     10th     50th
Mean abundance per 0.04 m²                53.50    93.00    26.00    109.75   18.50    255.50   112.50   301.00
Mean number of taxa per 0.04 m²           7.00     8.50     7.50     17.00    6.25     23.00    26.50    35.00
100 − % of two most dominant taxa         9.62     25.45    28.94    51.53    17.36    52.04    52.89    61.19
% Pollution-sensitive taxa (Group C) (a)  0.61     5.04     0.00     12.83    1.61     12.23    0.71     2.22

(a) Group C: percent Ampeliscidae, Haustoriidae, Tellinidae, Lucinidae, Hesionidae, Cirratulidae, Cyathura polita, Cyathura burbancki.
The index selected for use in the Carolinian Province was calculated using the
average score of four metrics: (1) mean abundance; (2) mean number of taxa; (3) 100
minus percent abundance of the top two numerical dominants; and (4) percent abundance of pollution-sensitive taxa (Group C, Table 2). This index and the other top candidate indices were then evaluated using the combined 1993 and 1995 database as an independent 'validation data set' to confirm that the index we selected produced the highest correct classification efficiency of those considered for the overall study area.
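Classification efficiency, as used here, is the percentage of stations for which the index call (degraded if the index is below 3) agrees with the independent chemistry, toxicity, and DO classification. The following is a minimal sketch with hypothetical names and made-up example values, not data from the study.

```python
from typing import Sequence


def classification_efficiency(index_values: Sequence[float],
                              independently_degraded: Sequence[bool]) -> float:
    """Percent of stations where index < 3 agrees with the independent call."""
    if len(index_values) != len(independently_degraded):
        raise ValueError("inputs must have one entry per station")
    if not index_values:
        raise ValueError("at least one station is required")
    # Count correct positives (degraded and index < 3) plus
    # correct negatives (non-degraded and index >= 3).
    correct = sum((value < 3.0) == degraded
                  for value, degraded in zip(index_values, independently_degraded))
    return 100.0 * correct / len(index_values)


# Hypothetical stations: three of the five calls agree.
example = classification_efficiency(
    [1.5, 2.5, 3.0, 4.5, 2.0],
    [True, True, False, True, False],
)
```

Running the same function over each candidate metric combination and each habitat group would reproduce the kind of comparison used to choose among the candidate indices.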
3. Results and discussion
The index we selected for use throughout the province correctly classified 93% of
the stations in the development data set and 75% of the stations in the validation
data set (Table 5). When both data sets were considered together, the index correctly
classified 83% of the stations. Our evaluation of the utility of this index in the Carolinian Province is limited to the validation data set, which was independent of the
data set used to derive the index. In several cases, we also limited our comparisons
to only the 1995 data set because most of the stations in the 1993 pilot study were
not randomly selected and/or the full suite of bioassays were not conducted in that
year. While random selection of sites was not essential for development and testing
of the index, only the random, probability-based sites were used to estimate the
percent of the region that was degraded based on the benthic index values.
Table 5
Percent of correct station classifications (number of correct positives and correct negatives/n) using the Carolinian benthic index of biotic integrity within each habitat group (1–4) and for all groups combined in the 1994 development data set, the 1993–95 validation data set, and all years combined

Data set                  Habitat group                                   n     Percent correct
                                                                                classifications
Reference (1994)          (1) Oligohaline–mesohaline, all latitudes       20    90
                          (2) Polyhaline–euhaline, northern latitudes     24    92
                          (3) Polyhaline–euhaline, middle latitudes       22    95
                          (4) Polyhaline–euhaline, southern latitudes     9     100
                          Province wide                                   75    93
Validation (1993, 1995)   (1) Oligohaline–mesohaline, all latitudes       46    78
                          (2) Polyhaline–euhaline, northern latitudes     13    85
                          (3) Polyhaline–euhaline, middle latitudes       27    74
                          (4) Polyhaline–euhaline, southern latitudes     10    50
                          Province wide                                   96    75
All data                  (1) Oligohaline–mesohaline, all latitudes       66    82
                          (2) Polyhaline–euhaline, northern latitudes     37    89
                          (3) Polyhaline–euhaline, middle latitudes       49    84
                          (4) Polyhaline–euhaline, southern latitudes     19    74
                          Province wide                                   171   83

Index values in the validation data set covered the full scale from 1 to 5, with clear trends observed in the percentage of stations that were classified correctly based on independent measures of habitat quality (Fig. 1). Values ≤1.5, which represent the clearest evidence of degraded benthos, occurred at 23 (24%) of the 96 stations sampled in those years. Only one (5%) of the 23 stations was considered to be misclassified based on the lack of elevated contaminants, sediment toxicity, or low dissolved oxygen (DO). Other environmental or chemical stresses not measured may have accounted for the low index value observed at this site. Using only the 1995 probability-based samples, 14 stations (21% of the province area) had degraded benthos based on index values of ≤1.5 (Hyland et al., 1998).
Transitional values of 2–2.5, which indicate possible benthic stress, occurred at 18 (19%) of the sites sampled in 1993 and 1995. Thirteen of these sites were categorized as degraded based on sediment chemistry, sediment toxicity, or low DO values. The remaining five sites showed no evidence of degradation based on these factors. The moderately low scores at these latter sites may be the result of unmeasured anthropogenic stressors or other natural factors (e.g. predation or physical disturbance from storm events). Evaluation of the probability-based sites sampled in 1995 indicated that 14 stations (15% of the province area) had a benthic index of 2–2.5 (Hyland et al., 1998).
Fig. 1. Frequency distribution of the index scores from stations sampled in the validation data set compared to independent measures of station quality based on sediment contaminant levels, sediment toxicity using laboratory bioassays, and/or dissolved oxygen conditions.

Index values ≥3, which are indicative of a non-degraded benthos, occurred at the remaining 55 sites sampled in 1993/95 and at 58 (64% of the region) of the randomly placed sites sampled in 1995 (Hyland et al., 1998). None of the sites that scored an index value of 5 showed evidence of degradation based on sediment chemistry, toxicity, or low DO. Seven of the 16 stations with index values of 4 or 4.5 were misclassified based on the other parameters measured, and values of 3–3.5 resulted in the greatest uncertainty of correct classification (Fig. 1). These results indicate that index values in the transitional range should be interpreted with caution, but values at each end of the scale are in close agreement with other predictions of sediment bioeffects based on the combined exposure data. Additionally, some of the apparent misclassifications may be attributable to reduced bioavailability of the contaminants present or some other mitigating factor.
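The interpretive ranges used in this section can be summarized as a simple mapping from index value to category. The sketch below is one reading of the text, with hypothetical names and category labels; the caution band of 2.5 to 3.5 is an assumption chosen to cover the scores described as least certain, and it deliberately overlaps the 'possible stress' and 'non-degraded' categories.

```python
def interpret_index(index_value: float) -> str:
    """Map a benthic index value (1-5) to the category used in the text."""
    if not 1.0 <= index_value <= 5.0:
        raise ValueError("index values range from 1 to 5")
    if index_value <= 1.5:
        return "degraded"          # clearest evidence of degraded benthos
    if index_value < 3.0:
        return "possible stress"   # values of 2-2.5
    return "non-degraded"          # values of 3 or more


def interpret_with_caution(index_value: float) -> bool:
    """Flag scores near the degraded/non-degraded boundary (assumed 2.5-3.5)."""
    return 2.5 <= index_value <= 3.5
```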
The benthic index we selected for use in assessing conditions throughout the Carolinian Province proved to be very efficient at correctly classifying both non-degraded and degraded sites. The percentage of correctly classified stations in the validation data set (75%) was lower than estimates reported for the Chesapeake Bay (Weisberg et al., 1997) and New York/New Jersey Harbor (Ranasinghe et al., in review) using a similar approach. When samples with transitional index scores (≥2.5 to ≤3.5) were removed from the validation data set, the efficiency of our index increased to 81% province-wide for the validation data set and to 85% for all years combined.
Among the four habitat groups we identified, the southern-most stations with salinities >18 ppt scored the lowest in classification efficiency in the validation data set (Table 5). This result suggests that the index metrics may not be well suited to those sites; however, it should be noted that only 10 stations were sampled in this habitat group during 1993 and 1995. Thus, even 1–2 station misclassifications would substantially alter the efficiency estimate for this subgroup. The percentage of correct station classifications for this group increased to 74% when all 3 years of data were considered together.
Another possible explanation for the relatively low classification efficiency in the southern-most habitat group is that none of the benthic attributes we evaluated could be statistically compared for this station group, since only one station was classified as environmentally degraded based on sediment chemistry. Thus, some of the final attributes selected for common use in the regional index may be less suitable than others that might have been identified if a larger array of stations from this area could have been analyzed.
Station classification efficiency in the other three habitat groups varied from 74 to 85% in the validation data set (Table 5). Much of this variability may have been related to the distribution of the sensitive taxa that showed strong relationships to sediment contaminant levels in the development data set (Group C, Table 2). Although these taxa proved to be the best choice in our province-wide tests of the various metric combinations, some of these taxa were either absent or rarely collected in one or more of the habitat groups. For example, ampeliscid amphipods (primarily Ampelisca vadorum, A. abdita, and A. verrilli), lucinid bivalves (primarily Parvilucina multilineata), hesionid and cirratulid polychaetes (primarily Podarkeopsis levifuscina, Podarke spp., Tharyx killariensis, and Monticellina dorsobranchialis), and the isopod Cyathura burbancki were either absent or rare at lower-salinity stations (Habitat Group 1). Haustoriid amphipods (primarily Acanthohaustorius millsi and Protohaustorius deichmannae) and tellinid bivalves (primarily Tellina agilis and Tellina texana) were present at many of the lower-salinity sites, but they were generally more abundant at higher-salinity stations. Differences in the mean densities of Tellinidae at degraded versus non-degraded sites with higher salinities were also much greater than we observed at lower-salinity sites.
Those interested in using the B-IBI approach for a specific portion of the region should consider whether another combination of benthic metrics or habitat groups (e.g. based on sediment type) would provide even greater power for discriminating between degraded and non-degraded sites. For example, we recently re-analyzed the data from estuaries with high tidal amplitudes (primarily South Carolina and Georgia stations) and found that a mean score of three metrics (mean number of taxa, percent of total abundance represented by sensitive taxa Group C, and 100 minus percent of total abundance of the three most dominant species) provided a better station classification efficiency for that portion of the Carolinian Province than the region-wide index (Van Dolah, Snoots & Hyland, unpublished data). In the study reported here, the goal was to develop a single index that could be used province-wide to assess benthic condition. The index we selected worked best for that purpose.
Comparison of the Carolinian Province index with individual measures of benthic condition (H′, mean abundance, mean number of taxa) indicated that the index was better at detecting bioeffects where expected based on high sediment contamination or low DO (Fig. 2). Additionally, the index was substantially better at detecting bioeffects related to contaminants than the four sediment bioassays run on 1995 samples (Fig. 3). The two amphipod assays were the least sensitive indicators of stations with contaminants that exceeded bioeffect guidelines (Long, MacDonald, Smith & Calder, 1995; MacDonald, 1994). Fewer than 7% of these stations tested positive using either amphipod species. In contrast, 70% of the contaminated stations tested positive with the benthic index. The Microtox® whole sediment assay and the 7-day seed-clam growth assay (using M. mercenaria) showed a higher correct classification of contaminated sites than the amphipod assays (36 and 42%, respectively). Both of these latter assays, however, were less efficient than the index at correctly classifying chemically degraded sites.
The benthic index represents an in situ measure of biological condition that
should reflect chronic effects of degraded habitat quality more reliably than a
short-term laboratory bioassay. Our comparisons of the Carolinian Province index with
the laboratory assays do not account for differences that may be due to other factors
unrelated to pollution, such as recent bottom disturbance, predation effects, or low
dissolved oxygen effects. These and other non-chemical effects could have influenced
the index scores at some of the sites, but it is unlikely that these factors would have
accounted for all of the differences noted.
Fig. 2. Classification of contaminated stations (see Table 1 for criteria) sampled in the validation data set
using the benthic index versus other selected measures of benthic condition. Percent expected
bioeffects = number of stations where an effect was detected divided by the number of stations with either
contamination or low dissolved oxygen.
Fig. 3. Percent of stations with elevated sediment contaminants that also showed significant bioeffects
using various bioassays or the benthic index. Only `validation' stations sampled in 1995 are shown since
two of the four assays were not conducted in 1993. Percent expected bioeffects = number of stations where
an effect was detected divided by the number of stations where there were contaminants that exceeded
bioeffect guidelines (≥3 ER-L/TEL and/or ≥1 ER-M/PEL). Stations with low dissolved oxygen were
excluded from this analysis.
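The "percent expected bioeffects" statistic defined in the caption of Fig. 3 can be illustrated with a short sketch. It is only an illustration under stated assumptions, not the authors' analysis code: the `Station` record, its field names, and the example values are hypothetical, and the contamination rule (at least three ER-L/TEL exceedances or at least one ER-M/PEL exceedance, with low dissolved oxygen stations excluded) follows our reading of that caption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Station:
    """Hypothetical station record; field names are illustrative only."""
    name: str
    n_erl_tel: int         # contaminants exceeding ER-L or TEL guidelines
    n_erm_pel: int         # contaminants exceeding ER-M or PEL guidelines
    low_do: bool           # True if dissolved oxygen was classed as low
    effect_detected: bool  # True if the assay or index flagged an effect


def is_contaminated(s: Station, min_low: int = 3, min_high: int = 1) -> bool:
    """Assumed rule: >= 3 ER-L/TEL and/or >= 1 ER-M/PEL exceedances."""
    return s.n_erl_tel >= min_low or s.n_erm_pel >= min_high


def percent_expected_bioeffects(stations: Iterable[Station]) -> Optional[float]:
    """Percent of contaminated, non-low-DO stations where an effect was detected."""
    eligible = [s for s in stations if is_contaminated(s) and not s.low_do]
    if not eligible:
        return None  # statistic is undefined with no contaminated stations
    hits = sum(1 for s in eligible if s.effect_detected)
    return 100.0 * hits / len(eligible)


# Made-up example values, not data from the study.
example = [
    Station("A", n_erl_tel=3, n_erm_pel=0, low_do=False, effect_detected=True),
    Station("B", n_erl_tel=0, n_erm_pel=1, low_do=False, effect_detected=False),
    Station("C", n_erl_tel=5, n_erm_pel=2, low_do=True, effect_detected=True),
    Station("D", n_erl_tel=1, n_erm_pel=0, low_do=False, effect_detected=True),
]
result = percent_expected_bioeffects(example)
```

In this made-up example only stations A and B enter the denominator: C is excluded for low dissolved oxygen and D does not meet the contamination rule. The Fig. 2 variant, which also counts low dissolved oxygen stations as degraded, would need a different eligibility filter.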
Benthic indices have their limitations, but they have proven to be valuable
tools for assessing sediment quality in a variety of estuarine habitats (Engle, Summers
& Gaston, 1994; Hyland et al., 1996, 1998; Strobel et al., 1995; Weisberg et al.,
1997). The index we developed for the Carolinian Province proved to be effective at
a regional scale, while still employing a simple protocol that can be easily understood
by resource managers. Judgements about sediment quality, however, should
not be based solely on a benthic index, or any single measure of habitat condition.
Combining the index with other measures of habitat quality, such as direct measures
of sediment contamination and toxicity, can reduce misinterpretation of the data
and provide a powerful weight-of-evidence approach to assessing the overall condition
of a site, estuary, or region.
Acknowledgements
This work was jointly sponsored by US Environmental Protection Agency (EPA)
and the National Oceanic and Atmospheric Administration (NOAA). EPA funds
were provided by Interagency Agreement No. DW13936394-01 to NOAA from
EPA's National Health and Environmental Effects Research Laboratory
(NHEERL), Gulf Ecology Division. NOAA funds were provided by the Coastal
Monitoring and Bioeffects Assessment Division (CMBAD), of the Office of Ocean
Resources Conservation and Assessment (ORCA), and the Coastal Services Center
in Charleston, SC. Special recognition is extended to Martin Posey (University of
North Carolina-Wilmington) for providing benthic data from the North Carolina
portion of the study region and to David Camp and Tom Perkins (Florida Department
of Environmental Protection) for providing benthic data from the Florida
subregion. We also wish to thank Lynn Zimmerman, John Jones, and Len Balthis
for their assistance in summarizing the data and preparing figures. Finally, we wish
to thank Drs D. Dauer, A. Ranasinghe, and two anonymous reviewers for their
constructive criticism of an earlier draft of this manuscript.
References
Berge, J. A. (1990). Macrofauna recolonization of subtidal sediments. Experimental studies on defaunated
sediment contaminated with crude oil in two Norwegian fjords with unequal eutrophication status. I.
Community responses. Mar. Ecol. Prog. Ser., 66, 103–115.
Boesch, D. F. (1977). Application of Numerical Classification in Ecological Investigations of Water
Pollution. Ecological Research Series (EPA-600/3-77-033). Newport, OR: US EPA Corvallis Environmental
Research Laboratory.
Dauer, D. M. (1993). Biological criteria, environmental health and estuarine macrobenthic community
structure. Mar. Pollut. Bull., 26(5), 249–257.
Dauer, D. M., & Alden, R. W., III (1995). Long-term trends in the macrobenthos and water quality of the
Lower Chesapeake Bay. Mar. Pollut. Bull., 30(12), 840–850.
Engle, V. D., Summers, J. K., & Gaston, G. R. (1994). A benthic index of environmental condition of
Gulf of Mexico estuaries. Estuaries, 17(2), 372–384.
Gaston, G. R., Rutledge, P. A., & Walther, M. L. (1985). The effect of hypoxia and brine on
recolonization of macrobenthos off Cameron, Louisiana (USA). Contrib. Mar. Sci., 28, 79–93.
Hyland, J. L., Herrlinger, T. J., Snoots, T. R., Ringwood, A. H., Van Dolah, R. F., Hackney, C. T.,
Nelson, G. A., Rosen, J. S., & Kokkinakis, S. A. (1996). Environmental Quality of Estuaries of the
Carolinian Province: 1994. Annual Statistical Summary for the 1994 EMAP-Estuaries Demonstration Project
in the Carolinian Province (NOAA Technical Memorandum NOS ORCA 97). Silver Spring, MD:
NOAA/NOS, Office of Ocean Resources Conservation and Assessment.
Hyland, J. L., Balthis, L., Hackney, C. T., McRae, G., Ringwood, A. H., Snoots, T. R., Van Dolah, R.
F., & Wade, T. L. (1998). Environmental Quality of Estuaries of the Carolinian Province: 1995. Annual
Statistical Summary for the 1995 EMAP-Estuaries Demonstration Project in the Carolinian Province
(NOAA Technical Memorandum NOS ORCA 123). Silver Spring, MD: NOAA/NOS, Office of Ocean
Resources Conservation and Assessment.
Karr, J. R. (1981). Assessment of biotic integrity using fish communities. Fisheries, 6(6), 21–27.
Karr, J. R. (1991). Biological integrity: a long-neglected aspect of water resource management. Ecological
Applications, 1, 66–84.
Karr, J. R., Fausch, K. D., Angermeier, P. L., Yant, P. R., & Schlosser, I. J. (1986). Assessing Biological
Integrity in Running Waters: A Method and its Rationale. Special Publication 5. Champaign, IL: Illinois
Natural History Survey.
Kerans, B. L., & Karr, J. R. (1994). A benthic index of biotic integrity (B-IBI) for rivers of the Tennessee
Valley. Ecological Applications, 4, 768–785.
Long, E. R., & Morgan, L. G. (1990). The potential for biological effects of sediment-sorbed contaminants
tested in the National Status and Trends Program. NOAA Technical Memorandum NOS OMA 52.
NOAA/NOS, Office of Ocean Resources Conservation and Assessment, Rockville, MD. 175 p. plus
appendices.
Long, E. R., MacDonald, D. D., Smith, S. L., & Calder, F. D. (1995). Incidence of adverse biological effects
within ranges of chemical concentrations in marine and estuarine sediments. Envir. Man., 19, 81–97.
MacDonald, D. D. (1994). Approach to the Assessment of Sediment Quality in Florida Coastal Waters,
Vols. I–IV. FL: Florida Department of Environmental Protection.
MacDonald, D. D., Carr, R. S., Calder, F. D., Long, E. R., & Ingersoll, C. G. (1996). Development and
evaluation of sediment quality guidelines for Florida coastal waters. Ecotoxicology, 5, 253–278.
Pearson, T. H., & Rosenberg, R. (1978). Macrobenthic succession in relation to organic enrichment and
pollution of the marine environment. Oceanogr. Mar. Biol. Ann. Rev., 16, 229–311.
Ranasinghe, J. A., Weisberg, S. B., O'Connor, J. S., & Adams, D. A. A benthic index of biotic integrity
(B-IBI) for the New York/New Jersey Harbor. Journal of Aquatic Ecosystem Stress and Recovery, in
review.
Rhoads, D. C., McCall, P. L., & Yingst, J. Y. (1978). Disturbance and production on the estuarine
seafloor. Amer. Scient., 66, 577–586.
Ringwood, A. H., Holland, A. F., Kneib, R., & Ross, P. (1996). EMAP/NS&T Pilot Studies in the Carolinian Province: Indicator Testing and Evaluation in Southeastern Estuaries (Final Report under Grant
NA90AA-D-SG790 through SC Sea Grant College Program. NOAA Technical Memorandum NOS
ORCA 102). Charleston, SC: SC Dept. of Natural Resources, Marine Resources Research Institute.
Strobel, C. J., Buffum, H. W., Benyi, S. J., Petrocelli, E. A., Reifsteck, D. R., & Keith, D. J. (1995).
Statistical Summary EMAP-Estuaries Virginian Province - 1990 to 1993 (EPA/620/R-94/026).
Narragansett, RI: US Environmental Protection Agency, National Health and Environmental Effects
Research Laboratory, Atlantic Ecology Division.
Weisberg, S. B., Ranasinghe, J. A., Dauer, D. M., Schaffner, L. C., Diaz, R. J., & Frithsen, J. B. (1997).
An estuarine benthic index of biotic integrity (B-IBI) for Chesapeake Bay. Estuaries, 20(1), 149–158.
Weisberg, S. B., Frithsen, J. B., Holland, A. F., Paul, J. F., Scott, K. J., Summers, J. K., Wilson, H. T.,
Valente, R., Heimbuch, D. G., Gerritsen, J., Schimmel, S. C., & Latimer, R. W. (1992). EMAP-Estuaries Virginian Province 1990 Demonstration Project Report (EPA/600/R-92/100). Narragansett, RI: US
EPA Environmental Research Laboratory, Narragansett.