Use of expert judgment in exposure assessment Part I

Journal of Exposure Analysis and Environmental Epidemiology (2001) 11, 308 – 322
# 2001 Nature Publishing Group All rights reserved 1053-4245/01/$17.00
www.nature.com/jea
Use of expert judgment in exposure assessment Part I. Characterization of
personal exposure to benzene
KATHERINE D. WALKER,a JOHN S. EVANSb AND DAVID MACINTOSHb
a
Harvard School of Public Health, Boston, Massachusetts
The University of Georgia, Athens, Georgia
b
This paper presents the results of the first phase of a study, conducted as an element of the National Human Exposure Assessment Survey ( NHEXAS ) , to
demonstrate the use of expert subjective judgment elicitation techniques to characterize the magnitude of and uncertainty in environmental exposure to
benzene. In decisions about the value of exposure research or of regulatory controls, the characterization of uncertainty can play an influential role. Classical
methods for characterizing uncertainty may be sufficient when adequate amounts of relevant data are available. Frequently, however, data are neither abundant
nor directly relevant, making it necessary to rely to varying degrees on subjective judgment. Since the 1950s, methods to elicit and quantify subjective
judgments have been explored but have rarely been applied to the field of environmental exposure assessment. In this phase of the project, seven experts in
benzene exposure assessment were selected through a peer nomination process, participated in a 2 - day workshop, and were interviewed individually to elicit
their judgments about the distributions of residential ambient, residential indoor, and personal air benzene concentrations ( 6 - day integrated average )
experienced by both the non - smoking, non - occupationally exposed target and study populations of the US EPA Region V pilot study. Specifically, each expert
was asked to characterize, in probabilistic form, the arithmetic means and the 90th percentiles of these distributions. This paper presents the experts’ judgments
about the concentrations of benzene encountered by the target population. The experts’ judgments about levels of benzene in personal air were demonstrative of
patterns observed in the judgments about the other distributions. They were in closest agreement about their predictions of the mean; with one exception, their
best estimates of the mean fell within 7 – 11 g / m3 although they exhibited striking differences in the degree of uncertainty expressed. Their estimates of the
90th percentile were more varied with the best estimates ranging from 12 to 26 g / m3 for all but one expert. However, their predictions of the 90th percentile
were far more uncertain. The paper demonstrates that coherent subjective judgments can be elicited from exposure assessment scientists and critically examines
the challenges and potential benefits of a subjective judgment approach. The results of the second phase of the project, in which measurements from the
NHEXAS field study in Region V are used to calibrate the experts’ judgments about the benzene exposures in the study population, will be presented in a
second paper. Journal of Exposure Analysis and Environmental Epidemiology ( 2001 ) 11, 308 – 322.
Keywords: benzene, expert subjective judgment, probabilistic risk analysis, risk management.
Introduction
The recent movement of regulatory agencies toward
probabilistic analyses of human health and environmental
risks (Habicht, 1992; NRC, 1994; Browner, 1995 ) has
turned a spotlight on the source and quality of the estimates of
variability and uncertainty1 that underlie them ( CRARM,
1996; Cullen and Frey, 1997; US EPA, 1997 ) . Of particular
concern is how uncertainty — a measure of what is not
known — is characterized. Greatest attention is often paid to
1. Abbreviations: BEADS, Benzene Exposure and Absorbed Dose
Simulation model; CRARM, Commission on Risk Assessment and
Risk Management; ETS, environmental tobacco smoke; NHEXAS,
National Human Exposure Assessment Survey; NRC, National
Research Council; TEAM, Total Exposure Assessment Methodology;
US EPA, US Environmental Protection Agency
2. Address all correspondence to: Dr. Katherine D. Walker, P.O. Box
6308, Lincoln, MA 01773. Tel.: + 1 - 781 - 259 - 4490. E-mail:
[email protected]
Received 27 April 1998; accepted 25 April 2001.
classical statistical methods for characterizing uncertainty;
such approaches may suffice when adequate amounts of
relevant data are available. More often than not, however,
data are neither abundant nor directly relevant, making it
necessary to rely, to varying degrees, on subjective judgment.
The increased focus on characterization of uncertainty has
arisen in part out of the recognition that assumptions about
uncertainty — whether they have been quantified or not —
are a fundamental element of environmental decisions.
Consider, for example, decision makers faced with the
questions of whether to fund exposure assessment research,
to improve inferences about the relationship between
exposure and disease, or to help discriminate between
regulatory control options. They must depend on assess1
Variability is the distribution of quantity over time, space, age, or other
factors. Variability is, in theory, knowable and ultimately irreducible.
Uncertainty ( also referred to as knowledge or epistemic uncertainty
( Paté - Cornell, 1996 ) can be characterized as the distribution of a
quantity, which may have a true value but one which is currently unknown.
Uncertainty is, in theory, often reducible by appropriate research.
Expert judgment in exposure assessment
ments of what is known, what is not known, and some
estimate of what will be learned once the research is
complete. Their decision might rest on whether the research
will make it possible to observe a significant difference from
what has been observed before or whether new information
could help distinguish between the benefits of one regulatory
action and another.
Argument arises as to whether classical or subjectivist
approaches should be used to characterize uncertainties. The
reality is that it is often not possible to separate them.
Whether explicitly acknowledged or not, subjective judgment enters into many facets of environmental exposure
assessment. Development of exposure models requires
decisions about model structure, model inputs, and the data
to characterize them. The presence of subjective judgment is
most clear where fundamental gaps in knowledge due to the
scarcity of relevant data or substantial disagreements about
the interpretation of available data exist. It is less obvious that
even where classical statistical methods are used, subjective
judgment is still present. For example, to characterize
variability and uncertainty in a given parameter, choices or
assumptions must be made about the quality or relevance of
available data, and about the appropriate choice of model,
parametric or otherwise, to characterize the selected data.
Furthermore, when the goal is to provide answers of
relevance to decision makers, excluding important sources
of uncertainty simply because they are not readily quantified
is undesirable. The real argument between classical and
subjectivist approaches is not about how to analyze the
limited data that are available but about how to bridge the
gap between the sample and the population of interest to the
decision maker (e.g., between studies done in one region of
the country but used to inform regulations for the whole
country) . Analysts relying exclusively on classical statistical analysis prefer to provide an objective interpretation of
the available data and a qualitative description of the
relevance of these data to the problem at hand. In contrast,
analysts employing subjective approaches attempt to
address issues of relevance or other sources of uncertainty
quantitatively. Judgments about the impact of such factors
can be substantial even when they remain unquantified and
can play influential roles in decisions about environmental
risks ( Morgan and Henrion, 1990; Paté - Cornell, 1996 ).
Quantifying all sources of uncertainty to the extent possible
makes more explicit their relative importance. The decision
making process can then be more systematic and transparent.
Classical and subjectivist methods for describing
uncertainty are not mutually exclusive. The ‘‘Bayesian’’
view ( from Bayes theorem ) allows probability to be
defined as P ( x|"), the probability of observing x given "
where " is the person’s state of information on which the
judgment is conditioned (Morgan and Henrion, 1990) . If
no empirical data are available, " may depend entirely on
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Walker et al.
a subjective judgment and the probability will strictly be a
measure of ‘‘degree of belief.’’ However, a Bayesian
concept of probability does not exclude the possibility
that " includes some statistical characterization of
uncertainty, if it exists. Indeed, Cooke ( 1991 ) has argued
that analysts should begin with a statistical premise then
modify it as necessary to reflect other sources of
uncertainty for which empirical measurements do not
exist. For example, an analyst interested in characterizing
uncertainty in the ‘‘true,’’ but unknown, mean exposure to
a chemical experienced by some population could
appropriately use the standard error of the mean from a
pre - existing study as a first approximation, but might
choose to modify that estimate according to his /her
judgments about the quality of the study, the representativeness of the sample population for the population of
interest, or other factors.
The study of subjective judgment about uncertainty has
a long history that has been reviewed in two recent books
(Cooke, 1991; Wright and Ayton, 1994 ) . Numerous
articles document efforts to understand the cognitive basis
for subjective judgments about uncertainty and factors
affecting their quality (Kahneman et al., 1982; von
Winterfelt and Edwards, 1986 ). Cooke ( 1991 ) traces
current interest in the application of expert subjective
opinion to the period following World War II. Many of
the early applications were military in nature but formal
uses for expert judgment have found application in the
aerospace industry, artificial intelligence, engineering,
expert systems, medicine, and nuclear energy ( Cooke,
1991 ). Otway and von Winterfeldt ( 1992 ) discuss the use
and pitfalls of subjective judgment in risk analysis using
examples from nuclear power.
In contrast to the rather extensive literature on the
measurement of environmental exposures, relatively few
applications of subjective judgment to environmental
problems exist. While not exhaustive, the following
summary gives a sense of the numbers and breadth of
environmental studies that have incorporated subjective
judgment techniques. Amaral ( 1983 ) and Morgan et al.
(1984) elicited judgments about key parameters affecting
long - range transport of sulfur and the associated health
effects of exposure to sulfur compounds. Three studies have
focused on health outcomes directly; Wallsten and Whitfield ( 1986 ) obtained judgments about lead -induced
hemoglobin and Intelligence Quotient (IQ ) decrements in
support of the US EPA’s review of the airborne lead
standard; Hawkins and Graham ( 1988 ) elicited subjective
expert judgments about the carcinogenicity of formaldehyde; Evans et al. ( 1994a,b ) used weight -of -evidence
techniques ( Sielken, 1989) to characterize the distributions
of cancer potency of formaldehyde and chloroform. Winkler
et al. (1995 ) used subjective judgment in their recent
assessment of the impact of chronic ozone exposure on
309
Walker et al.
injuries to the lungs. Several studies of the impact of global
climate change have made use of the subjective judgments
of experts (Manne and Richels, 1994; Nordhaus, 1994;
Morgan and Keith, 1995) .
Relatively few studies have asked specific questions
about direct environmental exposures to chemical contaminants. Most of these have been in the field of industrial
hygiene, whether for use in retrospective epidemiologic
studies (Stewart et al., 1986; Rinsky et al., 1987 ) or for the
assessment of current exposures. Winn et al. ( 1977 )
assessed the utility of industrial hygiene questionnaires in
ranking exposures to dust, vapor, and noise. Other
investigators compared the ability of industrial hygienists,
plant supervisors, and workers to categorize the exposure of
different workers ( Kromhout et al., 1987 ). Hawkins and
Evans (1989 ) conducted a study in which they requested
industrial hygienists to predict toluene exposures for
workers in a batch chemical process in two phases, one in
which they were given only a description of the process and
the second in which some anecdotal monitoring data were
provided. Macaluso et al. ( 1993 ) evaluated agreement
between exposure assessors asked to rate exposures to
solvents in a car assembly plant. De Cock et al. ( 1996 )
asked three groups of experts with different expertise to rate
several aspects of fruitgrowers’ exposure to pesticides,
given descriptions of the farm operations, and to evaluate
their judgments using monitoring data.
Of these studies, only the Hawkins and Evans ( 1989 )
study requested probabilistic estimates of exposures. The
others characterized exposure by relative rank or by
exposure categories. More studies are needed to demonstrate the role for, and applicability of, subjective judgment
techniques in exposure analysis.
This article is the first of a two -part study, conducted as
an element of the National Human Exposure Assessment
Survey (NHEXAS ) 2 to study the application of subjective
judgment techniques to characterize uncertainty in current
estimates of environmental exposures to benzene. Benzene
is an organic chemical found in gasoline, tobacco smoke,
and a diminishing number of consumer products. It has been
regulated for many years by US EPA as a human carcinogen
on the basis of epidemiologic studies suggesting an
association between occupational benzene exposures and
leukemia. Benzene was selected for this study because of
the regulatory interest in this chemical and because it is one
of the primary compounds studied in the NHEXAS pilot
study in Region V.
2
The NHEXAS is a major research program launched by US EPA’s Office
of Research and Development ( ORD ) . The program was conceived to
improve research methods for measuring population exposures to
environmental contaminants, to provide data for validating exposure
models, and more generally to provide improved data for environmental
policy decisions ( Sexton, 1995 ) .
310
Expert judgment in exposure assessment
The project had two phases: (i ) the elicitation of experts’
judgments about uncertainty in levels of benzene in US EPA
Region V and ( ii ) the assessment of how those judgments
compare with observed results from the NHEXAS pilot
study in Region V ( i.e., the calibration phase ). This paper
presents the results of the first phase of the project in which
expert judgments were elicited about residential ambient
(‘‘ambient’’), residential indoor (‘‘indoor’’ ), and personal
air concentrations (‘‘personal exposure’’ ) of benzene
experienced by the non- smoking, non- occupationally
exposed population of Region V. The results of the second
phase will be presented in a later paper.
Methods
The methodology used in this study has been adapted from
several recent studies. Since their inception in the years
following World War II, formal methods for obtaining the
subjective judgments of experts have been evolving with the
intent of improving both the process of expert selection and
the scientific rigor and consistency of their subjective
judgments ( Cooke, 1991 ) . Despite improvements in
understanding of the characteristics that promote good
judgments, standardized protocols for the selection, preparation, and elicitation of experts do not exist. Analysts in
this field argue that they should not necessarily exist but
rather, protocols should be crafted to suit the particular
problem under investigation (Morgan and Henrion, 1990;
Hora, 1992; Otway and von Winterfeldt, 1992 ).
Identification of Issues for Expert Input
The first step in the development of a sound elicitation
strategy is a clear definition of the problem to be solved
(Spetzler and Staël von Holstein, 1975; Morgan and
Henrion, 1990; Hora, 1992 ). It underlies the identification
of the appropriate scientists for participation in the study as
well as the formulation of the questions for which subjective
judgments will be sought.
Two factors helped define the focus of our elicitations
— the design of the NHEXAS field study and an analysis
of the major contributors to uncertainty in benzene
personal exposure. Since an ultimate goal of this project
is to assess how well experts take into account uncertainty
in their quantitative judgments, the quantities for which
the NHEXAS study will provide empirical evidence were
the primary focus of our questions. During 1996 and
1997, the NHEXAS study team collected 6- day integrated
air samples3 to measure benzene and other contaminant
3
Volatile organic compounds ( VOCs ) were collected on 3M OVM 3520
charcoal badges. In the earlier TEAM studies, the VOCs were collected on
Tenax cartridges using battery - operated pumps over two 12 - h periods
representing a full 24 - h period.
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Expert judgment in exposure assessment
levels in ambient and indoor air at the residence and in
total personal air for a stratified random sample of the
population of US EPA Region V ( Illinois, Indiana,
Minnesota, Michigan, Ohio, and Wisconsin ). Expert
judgments about the distributions of 6 -day average
benzene concentrations in ambient, indoor, and personal
air for this population, covering the approximately 1 -year
period of the study, were therefore central to our study.
The focus of the questions was further restricted to the
non -smoking, non- occupationally exposed individuals in
Region V. Several studies have shown that exposures to
benzene from active cigarette smoking (Wallace, 1989;
Wallace et al., 1988; Thomas et al., 1993 ) and from
occupational settings (MacIntosh et al., 1995b ) can account
for the highest benzene exposures among the population.
Unless separated from the analysis, benzene exposures from
these sources would obscure attempts to understand
benzene exposures from other potential sources of interest.
Modeling conducted by MacIntosh et al. ( 1995b ) of the
microenvironmental and source contributions to variability
and uncertainty in personal exposure to benzene also helped
shape the focus of this study. Their analysis indicated that
variability in benzene concentrations in the indoor residential microenvironment accounted for 60% of the variance in
personal exposure. Ambient air in turn accounted for 50%
of the variability in residential indoor air; with indoor
sources, ETS -related benzene and penetration from
attached garages accounting accounted for the remainder.
Exposure to benzene received while ‘‘in transit’’ (commuting or other travel in motor vehicles) was also identified as a
potentially important contributor to overall personal exposure. The MacIntosh et al. ( 1995b ) analysis of the major
sources of uncertainty suggested that uncertainty in benzene
personal exposure was again dominated by uncertainty in
the ambient levels of benzene ( about 60%) and indoor
sources of benzene (about 10% ) contributing to indoor air.
The NHEXAS field study design and these analyses led us
to seek expertise in benzene personal exposure, ambient,
and indoor contributions to benzene exposure, transit related benzene exposure, and benzene related to environmental tobacco smoke.
Expert Selection
Numerous methods have been proposed for the selection of
experts: citations in the peer-reviewed literature (Wolff et
al., 1990 ), peer nomination counts from surveys of the
membership of professional societies ( Hawkins and Graham, 1988; Hawkins and Evans, 1989; Spedden, 1992 ),
membership on a specific committee of the National
Academy of Sciences (Siegel et al., 1990 ), as well as less
formal methods for selecting individuals with the desired
expertise ( Manne and Richels, 1994 ). No consensus on the
best approach has emerged and an evaluation of alternate
approaches was beyond the scope of this paper. However,
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Walker et al.
the goals for any expert selection method should be to
identify the appropriate expertise, to achieve a representation of scientific and ideological perspectives, and to do so
in a process that is fair, reproducible, and reasonably
transparent to outside observers (Hawkins and Graham,
1988 ).
We selected a panel of seven experts in various aspects of
benzene exposure through a peer nomination process. We
first conducted structured telephone interviews with 20
individuals in the field of exposure assessment. Each
interview included:
o an introduction to the NHEXAS expert judgment project
o a discussion of our assessment of the critical areas for
expert input. The interviewee was invited to offer
alternatives.
o a request for nomination of potential experts.4
These nominations were all entered into a database
containing the names of the nominators and nominees.
The scores for each nominee were summed and ranked to
identify the individuals with the highest scores. Eight
individuals with the highest scores who collectively
reflected a balance of expertise and scientific viewpoints
were offered the opportunity to participate in the project.5
Seven individuals agreed to participate; their names and
affiliations are shown in Table 1. All received reimbursement of expenses and a modest stipend for their
involvement.
Workshop
We convened a workshop in Boston, Massachusetts, on
April 18– 19, 1996, which was attended by all seven
experts, and two of the authors, John Evans and Katherine
Walker, John Graham of the Harvard Center for Risk
Analysis, and Karen Hammerstrom of the US EPA. The
purposes of the workshop were (1 ) to introduce the
NHEXAS expert judgment project in more detail, (2 ) to
discuss the subjective judgment literature, in particular the
findings regarding the common heuristics6 operating when
probability judgments are given, and (3 ) to review and
critique the existing literature relevant to characterization of
exposures to benzene in air.
Prior to the workshop, each expert was sent a notebook
containing selected papers regarding benzene exposure.
4
Individuals were allowed to self - nominate, but only three of the experts
on the panel chose to do so and their votes did not substantially alter the
rank.
5
The number of participants was determined by project resources.
6
Heuristics in this context refer to cognitive devices or procedures by
which individuals make judgments in the presence of uncertainty
( Kahneman et al., 1982; Morgan and Henrion, 1990 ) . See Appendix for a
brief discussion of heuristics.
311
Walker et al.
Expert judgment in exposure assessment
Table 1. Benzene expert panel.a
Expert
Affiliation
Joseph Behar
US EPA Exposure Assessment Research Division,
Las Vegas, Nevadab
Steven Colome
Integrated Environmental Sciences
Irvine, California
Katherine Hammond UC Berkeley
School of Public Health
Ted Johnson
TRJ Assoc.,
Chapel Hill, North Carolina
William Ollison
American Petroleum Institute,
Washington, DC
Lance Wallace
US EPA,
Washington, DC
Clifford Weisel
Environmental and Occupational Health
Sciences Institute, Rutgers University
Piscataway, New Jersey
a
The experts’ results are identified only by a randomly assigned capital
letter ( A through G ) in order to preserve confidentiality.
b
Retired.
Each participant was asked to give a 20- min presentation on
one or more of the following topics:
o
o
o
o
o
total personal exposure to benzene
ambient benzene concentration
indoor benzene concentrations
automobile - related contributions to benzene exposure
environmental tobacco smoke contribution to personal
benzene exposure.
The participants were clearly informed that the purpose
of these presentations was neither to express their personal
judgments about particular quantities (i.e., the mean
personal exposure to benzene in Region V ) nor to forge a
consensus within the group on any subject. Instead, the goal
was for the experts to identify the most important studies on
each subject and to spark a candid critique of the strengths
and weaknesses of the existing literature. Each topic was
covered by two experts, chosen to give somewhat different
perspectives and intended to stimulate discussion. These
discussions were intended to help provide a common
foundation for the experts’ judgments about uncertainty in
subsequent individual interviews.
Elicitation of Individual Judgments
The interviews were carried out in the offices of each expert
and required 1 – 1.5 days to complete.7 Two interviewers
were always present — Katherine Walker, Harvard School
of Public Health and David MacIntosh, University of
7
For reasons that were beyond the control of the project, the interviews
could not take place until 6 – 9 months after the workshop rather than
immediately following as originally planned. John Evans was not available
to participate in the interviews as originally scheduled.
312
Georgia. On the basis of recommendations at the workshop,
the interviewers compiled a more extensive set of papers
discussing benzene exposure. These were available for
reference during the interviews. Information on the basic
design and sampling protocols to be used in the NHEXAS
Region V pilot study had been sent to each expert prior to
the interviews. Discussion of the judgments of other experts
was not permitted at any time. Individual judgments were
obtained before any result from the NHEXAS field study
was available.
Each interview followed the same structure, delineated in
a written elicitation protocol.8 It began with brief reviews of
the objective of our study, the discussions at the workshop,
and the common pitfalls encountered in giving subjective
judgments. The next step was to characterize each individual’s conceptual or ‘‘mental’’ model (Bostrom et al., 1992,
1994 ) of personal exposure to benzene, its basic structure
(e.g., microenvironmental or other, time dependency of
exposure, etc. ) , critical inputs, and sources. The purpose of
this exercise was to help combat one of the common heuristic
procedures relied on in giving judgments — that of
availability (see Appendix ) . Development of a mental
model required each expert to identify and give a rationale for
the important factors to be considered in evaluating benzene
exposures. The process also laid the basis for questions
throughout the interview about the completeness of the
model and the relative importance of each component.9
The fundamental goal of the interviews was to develop
each expert’s probabilistic characterization of the arithmetic
mean and the 90th percentile of the distributions of ambient,
indoor, and personal air benzene concentrations in US EPA
Region V, based on 6 -day integrated average samples (g/
m3 ) . That is, they were asked to express their estimate of
each quantity (e.g., the mean ambient benzene concentration for Region V ) in the form of a probability
distribution, which represented both their best judgment
about the value that the ‘‘true’’ mean might take and their
uncertainty in that quantity.
The experts were asked to give their judgments about two
different sets of benzene concentrations for two different
quantities: (1 ) the ‘‘true,’’ but unknown, exposures to
benzene concentrations potentially encountered by the non smoking, non- occupationally exposed target population of
Region V, and ( 2) the ‘‘actual’’ benzene concentrations
encountered by the NHEXAS Region V study population as
estimated by the NHEXAS pilot study. The first set of
judgments for the Region V target population is illustrative
8
The protocol was pilot - tested on David MacIntosh, University of
Georgia, and Barry Ryan, Emory University, and revised to the form used
in the study.
9
The purpose of the mental model was to help combat one of the common
heuristic procedures relied on in giving judgments — that of availability
( see Appendix ) .
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Expert judgment in exposure assessment
of information that decision makers might initially want to
have before taking regulatory action or initiating research
( e.g., are the benzene concentrations sufficient to warrant
action or is scientific uncertainty about the possible levels
sufficient that further research should be conducted first? ).
In the second set of judgments, the experts were asked, in
essence, to predict uncertainty in the expected results of the
NHEXAS field pilot study. This paper presents only the
results of the first set of elicitations.
The judgments about each uncertain quantity were
obtained through a series of six steps, beginning with: ( 1)
discussion and critique of the data on which the expert
wished to rely for his judgments about the population
distributions of each benzene concentration; ( 2) characterization of the variability in the quantity for the target
population of Region V (i.e., the ‘‘true,’’ but unknown,
distribution of benzene concentrations experienced by the
approximately 40 million people living in Region V ) ; ( 3)
identification and discussion of potential sources of
uncertainty arising from use of existing data to predict the
current benzene concentrations in Region V ( e.g., date,
geographic location, and representativeness of the existing
studies, changes in regulations affecting benzene concentrations in gasoline, etc. ) ; (4 ) quantification of the
judgments about the ‘‘true,’’ but unknown, mean and 90th
percentile concentration of benzene experienced by the
target Region V population; ( 5) identification of potential
sources of bias and random error arising from the NHEXAS
study design; and (6 ) adjustment, as appropriate, of the
probabilistic judgments about the means and 90th percentiles to reflect these additional sources of uncertainty about
the benzene levels predicted for the pilot study population.
Once this process was complete for one distribution (e.g.,
ambient ), it was repeated for each of the other two
distributions (indoor and personal exposure ) that were
the focus of this study. The elicitations began in ambient
benzene concentrations, proceeded to indoor benzene
concentrations, and ended with personal exposure to
benzene.
Beginning the elicitations with an overall view of
variability assisted the experts in thinking separately about
issues affecting variability and uncertainty and to develop
some confidence in points for discussions of uncertainty.
Experts were allowed to estimate variability using whatever
approach was most comfortable or intuitive; some chose to
define the distribution parametrically ( e.g., using a lognormal distribution and specifying the geometric mean (GM )
and geometric standard deviation (GSD ) ) while others
preferred to estimate directly the mean and specific fractiles
of the distribution ( 10th, 25th, 50th, 75th, and 90th ) .
The elicitations of uncertainty generally followed a fixed
probability method in which experts were asked to specify
benzene concentrations for specified fractiles of a cumulative density function. Each expert was asked to estimate
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Walker et al.
the minimum and maximum, the 90% credible interval,10
the interquartile range, and the median of their uncertainty
distribution about the arithmetic mean and the 90th
percentile of each of the frequency distributions: the
distributions of benzene concentrations in residential
ambient, residential indoor, and personal air. However, the
experts were given some freedom in how they chose to
develop their distributions. Some chose to estimate the
fractiles of the uncertainty distribution directly in the order
specified. Others requested that it be simulated once they
had specified the distributional form and parameters of the
population distribution and their uncertainty about those
parameters.11
The interviewers had several roles. They were responsible for reading and explaining each question and step of
the interview. Each took extensive notes, recording the
qualitative as well as quantitative basis for any responses.
They provided papers or data requested by the expert where
possible, ran simulations12 as necessary, and graphed and
presented results to provide feedback as the interview
progressed. One of the most important roles played by the
interviewers was to question the expert closely on each
judgment, to push for explanations and clarification for any
numerical judgment given, and to challenge judgments in
various ways in an effort to test the strength of the expert’s
convictions.
Following the interviews, the authors prepared transcripts from written notes and prepared tabular graphical
presentations of the quantitative elicitation results for each
expert. The authors contacted individual experts to clarify
results as necessary.
Results
This section presents the experts’ judgments about the
‘‘true,’’ but unknown, benzene concentrations potentially
encountered by the target population of Region V. We begin
our discussion of results by presenting the judgments about
personal exposure to benzene, the quantity of most direct
relevance to public health, and later examine their relationship to judgments about ambient and indoor components of
personal exposure.
10
The 90% ‘‘credible’’ interval is that interval which the expert believes to
include the true value with 90% probability.
11
Several experts expressed a preference for detailed modeling of the
quantities they were asked to predict, which was not possible for this study.
Limited modeling support was available.
12
For example, Expert G specified inputs to a simple microenvironmental
model, which simulated variability in ambient, indoor, and personal
exposures using Monte Carlo sampling techniques. We also estimated the
distributions for the mean and 90th percentiles of ambient, indoor, and
personal exposure distributions for Expert E using estimates of variability
and uncertainty given by Expert E and two - dimensional Monte Carlo
simulation techniques.
313
Walker et al.
Expert judgment in exposure assessment
Personal Exposure to Benzene
Figures 1 and 2 give the experts’ subjective probability
distributions for the mean and 90th percentiles of personal
exposures to benzene experienced by the target Region V
population. They are given in the form of boxplots in
which the plus sign designates the median — typically the
expert’s ‘‘best estimate,’’ the box denotes the interquartile
range, and the ‘‘whiskers’’ define the 90% credible
interval.
The judgments about personal exposure revealed several
patterns that were also observed in the judgments about
ambient and indoor concentrations. First, the experts were
in closest agreement about their best estimates of the mean.
With the exception of Expert E, who gave a best estimate
of 23 g / m3, the experts’ best estimates of the mean
personal exposure fell within the relatively narrow range of
7 –11 g / m3.
90 %
credible
interval
55
3
50
+
3
60
55
50
45
40
35
30
25
20
15
10
5
0
A
B
E
F
G
Despite similarities in their best estimates, striking
differences existed in the degree of uncertainty the various
experts expressed in those estimates. For example, Expert
D’s 90% credible interval ( the range from the 5th to 95th
percentiles ) fell within ± 20 –30% of the mean, a relatively
narrow range. At the other extreme, Experts A and G gave
100
interquartile
range
3
40
D
Figure 2. Subjective characterization of the 90th percentile of the
concentration of benzene in personal air ( g / m3 ) for the non - smoking,
non - occupationally exposed target population of Region V.
median
45
C
Expert
6-day Ave r age Pe r s o nal Air Con ce n tr ation (u g/m )
6-day integrated average concentration (µg/m )
60
100
6-day integrated average concentration (µg/m )
It is important to bear in mind that their each expert’s
judgments about uncertainty, while they may be based in
part on empirical evidence, are also reflections of an each
individual’s state of knowledge about a particular quantity,
their academic training, and style of thought. Consequently,
individuals’ subjective judgments about uncertainty may
differ even if they base their judgments on the same data.
Comparison of several experts’ judgments about a quantity
of interest therefore yields perspective on the state of expert
opinion about that quantity.
35
30
25
20
15
10
5
0
A
B
C
D
E
F
Experts
10
A
B
C
D
E
F
G
1
-1.5
-1
-0.5
0
0.5
1
1.5
G
Expert
0
Stan d ar d No r m al De viate
Figure 1. Subjective characterization of mean concentration of
benzene in personal air ( g / m3 ) for the non - smoking, non occupationally exposed target population of Region V.
314
Figure 3. Expert characterizations of the distribution of 6 - day average
personal air concentrations ( exposures ) in US EPA Region V.
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Expert judgment in exposure assessment
Walker et al.
Figure 4. Mental model for benzene personal exposure: Expert D.
90% credible intervals, which varied by a factor of 3 or more
from their best estimates.
The experts’ judgments about the 90th percentile of
benzene personal exposure were less similar ( see Figure 2 ).
They reflected greater variation among the experts about
their best estimates and increasing uncertainty about these
values ( see Figure 2 ). With the exception of Expert E, who
gave a best estimate of 45 g/m3, the remaining experts
gave best estimates of the 90th percentile, which fell
between 12 and 26 g/ m3. Most of the experts also gave
relatively broad uncertainty intervals for the 90th percentile;
five of six experts provided 90% credible intervals that fell
within a factor of 1.5 of their median estimates. Expert A’s
90% credible interval for the 90th percentile ranged from a
factor of 2.5 below his best estimate to a factor of 4 above.
These broader distributions reflect their concern — clearly
expressed in the interviews — that currently available data
provide relatively weak evidence about the upper fractiles of
the population distribution of benzene exposure.
As the boxplots for the mean and 90th percentiles indicate,
most of the experts gave uncertainty distributions that were
skewed to varying degrees. These results reflected their
greater confidence in defining lower bounds for mean and
90th percentile personal exposures ( e.g., by reference to
ambient levels of benzene) than in defining upper bounds.
The mean and the 90th percentiles were drawn from the
distributions the experts chose to characterize the population
distribution of personal exposures for the Region V target
population. These distributions, shown in Figure 3, turned out
to be nearly lognormal despite the different methods by which
experts approached their development. The geometric standard deviations for these distributions ranged from about 2 to 3.
In the process of developing their estimates of the mean
and 90th percentiles of the distribution of population
exposures, the experts first considered the distribution of
residential ambient and indoor benzene concentrations. All
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
of the experts described some form of microenvironmental
model13 as their mental model for their characterization of
personal exposure although their specification of critical
microenvironments and sources varied to some degree.
Figure 4 presents Expert D’s mental model as an
illustrative example of the information developed during
the interview. The experts typically identified exposures in
‘‘residential indoor,’’ ‘‘other indoor’’ (i.e., work or school ),
and various automobile -related microenvironments as key
components of personal exposure. Ambient concentrations
of benzene were generally considered to underlie indoor
levels of benzene, based on analyses of data suggesting that
the penetration factor from outdoor to indoor is approximately 1 (MacIntosh et al., 1995a) .
In identifying major microenvironmental contributors to
personal exposure, experts were often implicitly acknowledging the importance of the fraction of time spent in those
microenvironments. For example, surveys of US time –
activity patterns show that most Americans spend a large
fraction of their time indoors, much of it at home (CARB,
1991; Robinson and Thomas, 1991 ). As a result, the
residential microenvironment often accounts for a substantial fraction of personal exposure. The experts also agreed
13
In the general form of a microenvironmental model, presented below,
personal exposure to an individual is the sum of exposures encountered in
different locations in which the individual spends time. Each microenvironmental exposure is the product of the concentration encountered and the
fraction of time spent in that microenvironment:
Ei ¼
j
X
Cij fij
j¼1
where E i = exposure to the ith individual, C ij = concentration encountered
by the ith individual in the jth microenvironment, f ij = fraction of time ( e.g.,
24 - h day ) spent by the ith individual in the jth microenvironment.
315
Walker et al.
Expert judgment in exposure assessment
Residential Ambient Benzene Concentrations
As residential ambient levels of benzene were often the
starting point for estimates of indoor concentrations and
personal exposures to benzene, differences in approaches to
characterizing ambient concentrations sometimes held the
key to understanding differences between estimates of
personal exposure. Consider, for example, Expert E’s and
Expert G’s best estimates of mean personal exposure to
benzene, approximately 5 and 23 g/ m3, respectively,
which span the range of judgments given (Figure 1 ). These
estimates have their roots in choices made to characterize
residential ambient benzene concentrations.
The experts’ judgments about the mean and 90th
percentile of residential ambient benzene are shown in
55
50
45
40
35
30
25
20
15
10
5
0
A
B
C
D
E
F
G
Ex pe rt
Figure 6. Subjective characterization of the 90th percentile of
residential ambient benzene concentration ( g / m3 — 6 - day
average ) for the non - smoking, non - occupationally exposed target
population of Region V.
Figures 5 and 6, respectively. Expert E’s best estimate for
mean ambient benzene is the second highest and Expert G’s
estimate the lowest of the seven estimates. Expert E relied on
the TEAM data for his estimates of residential ambient
levels,14 while Expert G simulated Region V residential
ambient benzene levels using data from National VOC
Database (pre - 1987 ) and the US EPA Aerometric Information and Retrieval System (AIRS ) (1985 – 1992 ) compiled
by MacIntosh et al. (1995a ) and differentiating rural and
‘‘non -rural’’ areas, but adjusting downward for declines in
benzene levels that have been observed in California
monitoring data beginning in 1993 and 1994 (Wallace,
1996 ). Wallace ( 1996 ) speculates that the drop in benzene
levels may be related to regulatory changes that have reduced
benzene releases from cars and from gas stations.
60
6-da y Inte gra te d Ave ra ge Conce ntra tion (µg/m 3 )
60
6-da y Inte gra te d Ave ra ge Conce ntra tion (µg/m 3 )
that other microenvironments (‘‘in transit,’’ gas stations,
etc. ) were potentially important contributors to personal
exposures because of the high concentrations encountered,
even though the fraction of time spent in them is often small.
Differences of opinion about the form of the microenvironmental model did not appear to have a substantial
impact on the judgments. Experts E and F took the position
that a more complex microenvironmental model (in which
the dependency between benzene concentration in a microenvironment and time of day was preserved) was necessary,
but they did not ultimately adjust their exposure estimates to
reflect the likely impact of such a change.
55
50
45
40
35
30
25
20
15
10
5
0
A
B
C
D
E
F
G
Residential Indoor Benzene Concentrations
Subsequent judgments these two experts made in progressing from ambient to indoor levels and from indoor to
personal exposure levels did not account for the large
differences in their final judgments about personal exposure. The experts’ judgments about mean residential
Ex pe rt
Figure 5. Subjective characterization of mean residential ambient
benzene concentration ( g / m3 — 6 - day average ) for the non smoking, non - occupationally exposed target population of Region V.
316
14
While other experts also used the TEAM data, Expert E appears not to
have made adjustments to his estimate to account for declines in benzene
content of gasoline and other regulations, which have resulted in lower
releases of benzene to ambient air.
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Expert judgment in exposure assessment
Walker et al.
55
50
45
40
35
30
25
20
15
10
5
0
A
B
C
D
E
F
G
Expe rt
Figure 7. Subjective characterization of mean residential indoor
benzene concentration ( g / m3 — 6 - day average ) for the non smoking, non - occupationally exposed target population of Region V.
indoor levels are shown in Figures 7 and 8. Expert E
estimated indoor and personal exposure levels directly from
ambient levels using the relationships between indoor,
outdoor, and personal benzene levels in co -located samples
( from non- smoking homes without attached garages ) in the
TEAM studies. Expert G approached indoor and personal
exposure levels indirectly using a simple microenvironmental model to combine contributions from different
sources and microenvironments. They both ended up
estimating similar quantitative relationships between residential indoor and ambient (i.e., indoor concentrations were
estimated to be a factor of 1.2– 1.5 times larger than ambient
levels) and between personal levels of exposure and
residential indoor levels ( i.e., personal exposure levels
were estimated to be a factor of between 1.5 and 1.8 greater
than indoor concentrations ) . Despite these different approaches, the quantitative relationships between residential
indoor and ambient and between personal and residential
indoor levels of benzene were remarkably similar.
Uncertainties in Expert Judgments About Benzene
Concentrations
Discussions with the experts provided insight about the
sources of uncertainty they considered for the probabilistic
characterization of the mean and 90th percentiles of each
benzene distribution. Comparison of the boxplots across
Figures 3– 8 indicates that most of the experts expressed
progressively greater uncertainty in their estimates of the
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
mean as they moved from ambient to indoor and, finally, to
personal exposure. The 90% credible intervals for the mean,
averaged across all experts, spanned 5 g/ m3 for ambient,
11 g/ m3 for indoor, and 15 g/ m3 for personal exposure.
In general, the broader probability distributions given by
the experts for mean and 90th percentile of personal
exposure represented the experts’ view that personal
exposures to benzene were strongly affected by an
individual’s personal activities but that predicting the
frequency and impact on exposure of particular activities
was difficult. Several experts noted that the personal
activities that could increase exposure to benzene (e.g.,
lawn mowing, hobbies such as home car repair, or substance
abuse ) have not been characterized well in either population
surveys of human time –activity patterns or in studies of
human exposure to benzene.
The experts’ judgments about the distributions characterizing uncertainty in the mean and 90th percentile of
personal exposure also generally built upon earlier judgments about indoor and ambient concentrations. For
example, many of the experts indicated that predicting
indoor levels was particularly difficult given existing data;
they noted that the TEAM study and other data are not as
extensive on this subject as they are for ambient
concentrations and personal exposures to benzene. In
particular, they felt that current information about the
64
60
6-Da y Inte gra te d Ave ra ge Conce ntra tion (µg/m 3 )
6-Da y Inte gra te d Ave ra ge Conce ntra tion (µg/m 3 )
60
55
50
45
40
35
30
25
20
15
10
5
0
A
B
C
D
E
F
G
Expe rt
Figure 8. Subjective characterization of the 90th percentile of
residential indoor benzene concentration ( g / m3 — 6 - day average )
for the non - smoking, non - occupationally exposed target population
of Region V.
317
Walker et al.
potential contribution of indoor sources is limited; little is
known about the current contribution of consumer products
to benzene indoor air and, although attached garages have
been suggested as potentially important contributors to
indoor benzene levels, no population - based studies have
been conducted to verify this hypothesis. Another source
of uncertainty was the impact of changing smoking
behaviors on the ETS -related benzene concentrations in
residential indoor air since the TEAM studies were
conducted. Although many experts expressed the belief
that smoking rates had declined and that the percentage of
non -smokers living with smokers had declined, none had
specific data to confirm those beliefs.
The relative weights given to different studies can
sometimes explain differences in the uncertainty distributions given by the individual experts. Consider, for example,
the probability distributions given by Expert A for both the
mean and the 90th percentile of the distribution of personal
exposures. They are quite broad relative to those given by
most of the other experts. Most of the experts focused
extensively on the recent summary of studies of personal
exposures to benzene of Wallace ( 1996 ) while developing
probability distributions for the mean and 90th percentiles.
Unlike most of the members of the panel, who discounted
the relevance to Region V of the Goldstein et al. ( 1992 )
study in Valdez, Alaska, where personal benzene exposures
were among the highest reported, Expert A argued that
some weight should be assigned to this study, particularly in
analysis of possible values for the 90th percentile of the
distribution of personal exposure.
Discussion
This paper has described results from the first phase of a
project designed to assess how well exposure assessment
experts can characterize their uncertainty in estimating
human exposure to environmental contaminants. In this first
phase, we have demonstrated the use of expert judgment
elicitation techniques in the development of subjective
judgments about the ‘‘true,’’ but unknown ambient, indoor
and personal exposures to benzene experienced by the nonsmoking, non -occupationally exposed population of US
EPA Region V.
Taken at face value, these judgments provide a snapshot
of the experts’ state of knowledge about benzene exposures.
They represent responses to questions that decision makers
might reasonably ask before considering the need for
additional regulation or for further research. What do
existing data tell us about whether current levels of exposure
to benzene warrant action to protect public health? If
exposures appear to be high enough to justify consideration
of further regulation, are these values well enough known to
support immediate regulation or is the uncertainty in the
318
Expert judgment in exposure assessment
estimates large enough that regulations should be postponed
until further research can be conducted?
Responses to these questions may be developed in a
number of different ways including: (i) using computerized
models of exposure (e.g., BEADS ) ; ( ii ) through consensus
workshops such as those held by the US EPA Office of
Research and Development prior to the NHEXAS program
(Burke et al., 1992; Goldman et al., 1992; Graham et al.,
1992; Matanoski et al., 1992 ); or (iii ) through formal
expert judgment as described here. Even when responses to
these questions are developed using sophisticated environmental models or through workshops, they are often based
on subjective judgments that are not explicitly characterized. Recognizing the possibility for differences in individual’s judgments, encoding them in a systematic way, and
analyzing their impact on the decisions should be an integral
part of decision making.15
This study has provided several lessons about eliciting
subjective probability judgments from exposure assessment
experts. For all of the experts, elicitation of their subjective
judgments about variability and uncertainty in a probabilistic form was a new and, for most, a discomforting process.
This sentiment is not uncommon ( Morgan and Henrion,
1990 ). Many would have preferred to construct a detailed
model to characterize the variability of benzene concentrations in the Region V population. Although the subjective
judgment literature is still somewhat divided on whether
such ‘‘disaggregated’’ ( e.g., modeled) approaches lead to
better judgments than ‘‘direct’’ (i.e., aggregate or holistic)
assessments, the conventional wisdom is that they do
(Morgan and Henrion, 1990 ).16 While this study was not
designed to shed light on that debate, it did reveal that all of
the experts felt that developing credible ‘‘best estimates’’ for
the arithmetic mean and the 90th percentile was an
important first step and that most relied on some form of
simple model to develop them.
A disadvantage of the direct approach employed in this
study protocol is that it is not possible to discern
quantitatively the impact of alternative data sets, parameter
values, or other model assumptions. However, the detailed
record compiled of the conversation does permit some
qualitative insights into the impact of different assumptions
on either the best estimates or subjective uncertainty
distributions.
15
Formal decision analytic techniques known as VOI analyses have been
developed to help integrate quantitative responses to these questions.
Several examples exist of VOI techniques applied to environmental
problems ( Evans et al., 1988; Finkel and Evans, 1988; North et al., 1992;
Taylor et al., 1993; Dakins et al., 1996; Thompson and Evans, 1997 ) .
16
The authors initially considered a disaggregated approach to this project,
but the complexity of the model and the potential number of parameters for
which sets of judgments would have to be elicited made it necessary to
switch to a direct approach.
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Expert judgment in exposure assessment
The process of assigning probability distributions to
characterize their uncertainty about these best estimates of
the mean and 90th percentiles of the distributions of
benzene concentrations, although sometimes less time consuming, was conceptually more difficult for the
experts. Most were quite comfortable with statistical
approaches for developing estimates of uncertainty, e.g.,
the standard error of the mean from sample observations.
However, they found it relatively difficult to embrace the
subjectivist framing of this concept, i.e., ‘‘Give us your
90% credible interval for the mean — an interval such
that you believe it will contain the true mean with 90%
confidence.’’
One factor in their discomfort was simply the limited
practice that they, like most scientists, have with providing
such judgments and another was the very few studies on the
calibration of subjective judgments in environmental
assessments from which they could gain insight. Given
these potential difficulties and the need to obtain coherent
probabilistic responses from each expert, it is essential to
include one or more individuals with strong statistical skills
on the interview team.
The approach to encoding the experts’ probabilities
chosen for this study was selected because it appeared to be
the more intuitive and appropriate method for these experts.
Several methods for encoding probabilities have been
evaluated over the years and Morgan and Henrion ( 1990 )
have carefully reviewed the literature with an eye toward
how well the methods performed in calibration tests.
Although the fixed probability approach taken in our study
appears to be viewed favorably, Morgan and Henrion’s
review does not reach clear conclusions about the methods
most likely to lead to the best judgments. The review does
suggest a preference for methods that are well suited and
acceptable to the individuals whose judgments are being
elicited.
The interview protocol was set up to foster a
conceptual and quantitative distinction between variability and uncertainty in the experts’ judgments. However,
separation of variability and uncertainty was problematic
for several reasons. Although the conceptual distinction
is one with which risk analysts are increasingly familiar,
it is not commonly made by other scientists, including
some in this study. It is also not a simple matter to
distinguish variability and uncertainty in many of the
benzene data sets underlying the judgments. Time, data,
and analytical constraints imposed by the interview made
it difficult to assure that the separation was always
achieved.
The need to differentiate variability and uncertainty (both
conceptually and practically ) made it difficult to avoid
some of the common heuristic procedures of anchoring and
overconfidence (see Appendix ) . Other analysts have
counseled that when characterizing uncertainty, experts
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Walker et al.
should begin with outer percentiles of the uncertainty
distribution and work inward to a best estimate to help
minimize the impact of anchoring and overconfidence
(Morgan and Henrion, 1990; Cooke, 1991; Hora, 1992 ).
While this study did ask the experts to begin with the outer
percentiles of their uncertainty distribution ( i.e., maximum,
minimum, 5th, and 95th percentiles ) , the experts found it
difficult to do so without first having established a best
estimate ( e.g., of the mean or the 90th percentile of the
population distribution of ambient benzene concentrations ).
Thus, they were likely to establish an early anchor, which
rarely shifted substantially as the discussion of uncertainty
went on.
The experts’ reliance on the availability heuristic (see
Appendix ) required constant vigilance on the part of the
interviewers and was also difficult to overcome. The
workshop’s role was to identify key studies and to critique
them as bases for estimating benzene concentrations and
personal exposures. However, we still observed a strong
tendency for individuals to gravitate toward the work they
knew best, in some cases their own, despite uncertainties or
criticisms that had been voiced about it. Providing sufficient
opportunity and incentive for the experts to consider fully
all the sources of data for a particular question remains a
challenge for analysts interested in the elicitation of
subjective judgments.
The fact, that different biases and heuristics may be
operating on individual judgments, that they are difficult to
overcome, and that they can lead to very different
judgments, is itself an argument for conducting studies of
this kind. They may reveal important differences in the
scientific community that would less likely come to light if
judgments about particular issues were left to an individual
analyst or concealed in more ad hoc, or qualitative,
approaches.
Quantitative characterizations of uncertainty that make
explicit use of both subjective as well as ‘‘objective’’
classical forms of uncertainty have the advantage that
important sources of uncertainty, particularly those
having to do with lack of or incomplete knowledge,
can be included. Unfortunately, this study cannot address
the concern that more conventional approaches to
characterizing uncertainty such as those used in the
BEADS model ( MacIntosh et al., 1995a ) may understate
uncertainty.17 However, the second phase of this project
will examine how well this group of environmental
exposure experts was able to characterize their uncer17
Although a comparison between the results of such a model and the
judgments expressed in this study would be valuable, the current BEADS
model predictions include exposures to active smokers and occupationally
exposed individuals, making a direct comparison inappropriate. Future
work is planned to allow a comparison to be made between the two
approaches.
319
Walker et al.
Expert judgment in exposure assessment
tainty by examining how well calibrated their judgments
were given the results of the NHEXAS Region V pilot
study.
Formal elicitation protocols similar to the one used in
this study are neither necessary nor desirable for every
analysis or decision; as with any analysis, careful
consideration needs to be given to what level of rigor is
necessary to ‘‘do the job’’ ( Paté - Cornell, 1996 ). However, exposure analysts need to be cognizant of where
significant uncertainties may lie, when classical techniques for characterizing uncertainty may not suffice, and
what issues are to be faced when characterizing
uncertainty subjectively.
Availability
Overconfidence
Acknowledgments
This work would not have been possible without the experts
who participated in this study. They have our esteem and
thanks for their patience and attentiveness throughout the
long, and often arduous, interviews. Barry Ryan of Emory
University played a valuable role by serving as one of the
pilot studies for the elicitation protocol. We are thankful to
Paul Catalano, James Hammitt, and John Graham of the
Harvard School of Public Health for their helpful comments
on this study. This work was supported by US EPA
Cooperative Agreement CR822038 -03 -1 -3, HRSA, Bureau of Health Professions grant A03 -AH01165 -01, the
Leslie Silverman Foundation, and the Harvard Center for
Risk Analysis.
Base rate bias
Appendix
Table A1: Heuristics in subjective judgments: some
definitions
Anchoring
320
The process of beginning with a
value, particularly an estimate of the
central tendency of the quantity to be
estimated, and then adjusting one’s
probabilities of the ‘‘true’’ values of
that quantity from the starting point.
Research has shown that individuals
tend to become overly ‘‘anchored’’
to their initial estimate and fail to
adjust their judgments adequately
to reflect additional information
or the strength of their
underlying knowledge. Two
consequences of this heuristic are
bias toward the anchor and
overconfidence ( defined below ).
The tendency to base judgments on
information that is readily recalled
or more easily imagined. If the
full body of data on which to base
a judgment is not taken into account,
the judgment may be biased to reflect
those data which were most available
to the individual.
The tendency to express too much
certainty in an estimate.
For responses to binary questions
( e.g., true or false, rain or no rain ) ,
overconfidence may be manifested
in too high a probability placed on
one outcome or the other. For responses
which take the form of probability
distributions (as in this study ),
overconfidence may result in
distributions that are too tight
( i.e., so that particular confidence
intervals are too narrow to contain the
‘‘truth’’ with the desired
level of confidence ) .
‘‘The failure to consider population
base rates when assigning
probabilities’’ (Kahneman and Tversky,
1973; Hora, 1992 ) .
In the context of environmental
exposures, an example might be
assigning a high probability of
observing a benzene concentration
that has been associated with a
particular activity (e.g., lawn mowing )
without considering how often the
activity is likely to occur in the
population of interest.
References
Amaral D.A.L. Estimating uncertainty in policy analysis: health effects
from inhaled sulfur oxides. PhD Thesis, Department of Engineering
and Public Policy, Carnegie Mellon University, Pittsburgh, PA, 1983
( as cited in Morgan and Henrion, 1990 ) .
Bostrom A., Fischhoff B., and Morgan M.G. Characterizing mental
models of hazardous processes: a methodology and an application to
Radon. J Soc Issues 1992: 48 ( 4 ) : 85 – 100.
Bostrom A., Atman C., Fischhoff B., and Morgan M.G. Evaluating risk
communications: completing and correcting mental models of
hazardous processes, Part II. Risk Anal 1994: 14 ( 5 ) : 789 – 798.
Browner C. Policy for risk characterization. Memorandum from Carol
Browner, EPA Administrator, dated March 21, 1995.
Burke T., Anderson H., Beach N., Colome S., Drew R.T., Firestone M.,
Hauchman F.S., Miller T.O., Wagener D.K., Zeise L., and Tran N. Role
of exposure databases in risk management. Arch Environ Health 1992:
47 ( 6 ) : 421 – 429.
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Expert judgment in exposure assessment
CARB ( 1991 ) . Activity patterns of California residents. Final Report,
Contract No. A6-177-33.
Cooke R.M. ( 1991 ) . Experts in Uncertainty: Opinion and Subjective
Probability in Science. Oxford University Press, New York, NY.
CRARM ( Commission on Risk Assessment and Risk Management ) . Risk
assessment and risk management in regulatory decision making. Draft
Report, Commission on Risk Assessment and Management, Washington, DC, June 1996.
Cullen A.C., and Frey H.C. Developing distributions for use in probabilistic
exposure assessment: a handbook for dealing with variability and
uncertainty in models and inputs. Unpublished draft, 1997.
Dakins M.E., Toll J.E., Small M.J., and Brand K.P. Risk - based
environmental remediation: Bayesian Monte Carlo analysis and the
expected value of sample information. Risk Anal 1996: 16: 67 – 79.
De Cock J., Kromhout H., Heederik D., and Burema J. Experts subjective
judgment assessment of pesticide exposure in fruit growing. Scand J
Work Environ Health 1996: 22 ( 6 ) : 425 – 432.
Evans J.S., Hawkins N.S., and Graham J.D. Uncertainty analysis and the
value of information: monitoring for radon in the home. J Air Pollut
Control Assoc 1988: 38: 1380 – 1385.
Evans J.S., Graham J.D., Gray G.M., and Sielken R.L. A distributional
approach to characterizing low - dose cancer risk. Risk Anal 1994a:
14 ( 1 ) : 25 – 34.
Evans J.S., Gray G.M., Sielken R.L., Smith A.E., Valdez - Flores C., and
Graham J.D. Use of probabilistic expert judgment in uncertainty
analysis of carcinogenic potency. Regul Toxicol Pharmacol 1994b: 20:
15 – 36.
Finkel A.M., and Evans J.S. Evaluating the benefits of uncertainty
reduction in environmental health risk management. J Air Pollut
Control Assoc 1988: 37: 1164 – 1171.
Goldman L.R., Gomez M., Greenfield S., Hall L., Hulka B.S., Kaye W.E.,
Lybarger J.A., Mckenzie D.H., Murphy R.S., Wellington D.G., and
Woodruff T. Use of exposure databases for status and trends analysis.
Arch Environ Health 1992: 47 ( 6 ) : 430 – 438.
Goldstein B.D., Tardiff R.G., Baker S.R., Hoffnagle G.F., Murray D.R.,
Catizone P.A., Kester R.A., and Caniparoli D.G. Valdez Air Health
Study. Alyeska Pipeline Service, Anchorage, AK, 1992.
Graham J.D., Walker K.D., Berry M., Bryan E.F., Callahan M.A., Fan A.,
Finley B., Lynch J., Mckone T., Özkaynak H., and Sexton K. Role of
exposure data bases in risk assessment. Arch Environ Health 1992:
47 ( 6 ) : 408 – 420.
Habicht F.H. Guidance on risk characterization for risk managers and risk
assessors. Memorandum from F. Henry Habicht, EPA Deputy
Administrator, dated February 26, 1992 ( reprinted in Appendix B in
NRC, 1994 ) .
Hawkins N.C., and Evans J.S. Subjective estimation of toluene exposures:
a calibration study of industrial hygienists. Appl Ind Hyg J 1989: 4:
61 – 68.
Hawkins N.C., and Graham J.D. Expert scientific judgment and cancer
risk assessment: a pilot study of pharmacokinetic data. Risk Anal 1988:
8 ( 4 ) : 615 – 625.
Hora, S.C. Acquisition of expert judgment: examples from risk
assessment. J Energy Eng 1992: 188 ( 2 ) : 136 – 148.
Kahneman D., and Tversky A. On the psychology of prediction. Psychol
Rev 1973: 80 ( 4 ) : 237 – 251.
Kahneman D., Slovic P., and Tversky A. Judgment Under Uncertainty:
Heuristics and Biases. Cambridge University Press, Cambridge, UK,
1982.
Kromhout H., Ostendorp Y., Heederik D., and Boleij J.S.M. Agreement
between qualitative exposure estimates and quantitative exposure
measurement. Am J Ind Med 1987: 12: 551.
Macaluso M., Delzell E., Rose V., Perkins J., and Oestenstad K. Inter - rater
agreement in the assessment of solvent exposure at a car assembly
plant. Am Ind Hyg Assoc J 1993: 54: 351 – 359.
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)
Walker et al.
MacIntosh D.L., Özkaynak H., Xue J., and Ryan P.B. NHEXAS pre - field
exposure assessment study: estimated benzene exposures and absorbed
doses in EPA Region 5 and Arizona. Report to Karen Hammerstrom,
U.S EPA, April 1995a.
Macintosh D.L., Xue J., Özkaynak H., Spengler J.D., and Ryan P.B. A
population - based exposure model for benzene. J Expos Anal Environ
Epidemiol 1995b: 5 ( 3 ) : 375 – 403.
Manne A.S., and Richels R.G. The costs of stabilizing global CO2
emissions: a probabilistic analysis based on expert judgments. Energy
J 1994: T15 ( 1 ) : 31 – 56.
Matanoski G., Selevan S.G., Akland C., Bornschein R.L., Dockery D.,
Edmonds L., Greife A., Mehlman M., Shaw G.M., and Elliott E. Role
of exposure databases in epidemiology. Arch Environ Health 1992:
47 ( 6 ) : 439 – 446.
Morgan M.G., and Henrion M. Uncertainty: A Guide to Dealing with
Uncertainty in Quantitative Risk and Policy Analysis. Cambridge
University Press, New York, 1990.
Morgan M.G., and Keith D.W. Subjective judgments by climate experts.
Environ Sci Technol 1995: 29: 468A – 476A.
Morgan M.G., Morris S.C, Henrion M., Amaral D.A.L., and Rish W.R.
Technical uncertainty in quantitative policy analysis — a sulfur air
pollution example. Risk Anal 1984: 4 ( 3 ) : 201 – 216.
Nordhaus W.D. Expert opinion on climatic change. Am Sci 1994: 82: 45 – 51.
North D.W., Selker F.K., and Guardino T. Estimating the Value of
Research: An Illustrative Calculation for Ingested Inorganic Arsenic.
Decision Focus, Palo Alto, CA, October 1992.
NRC ( National Research Council ) . Science and Judgment in Risk
Assessment. National Academy Press, National Academy of Sciences,
Washington, DC, 1994.
Otway H., and von Winterfeldt D. Expert judgment in risk analysis and
management: process, context, and pitfalls. Risk Anal 1992: 12 ( 1 ) :
83 – 93.
Paté - Cornell M.E. Uncertainties in risk analysis: six levels of treatment.
Reliab Eng Syst Saf 1996: 54: 95 – 111.
Rinsky R.A., Smith A.B., Hornung R., et al. Benzene and leukemia — an
epidemiologic risk assessment. N Engl J Med 1987: 316 ( 17 ) : 1044.
Robinson J.P., and Thomas J. Time Spent in Activities, Locations,
Microenvironments: a California – National Comparison. Environmental Monitoring Systems Laboratory, US EPA, Las Vegas, NV, 1991.
Sexton K. Informed decisions about protecting and promoting public
health: rationale for a national human exposure assessment survey. J
Expos Anal Environ Epidemiol 1995: 5(3): 233 – 256.
Siegel J.E., Graham J.D., and Stoto M.A. Allocating resources among
AIDS research strategies. Policy Sci 1990: 23: 1 – 23.
Sielken R.L. Useful tools for evaluating and presenting more science in
quantitative cancer risk assessments. Toxic Subst J 1989: 9: 353.
Spedden S.E. Judgments and perceptions of human risk from chemical
carcinogens. ScD Thesis, Harvard University, Cambridge, MA, 1992.
Spetzler C.S., and Staël von Holstein C. - A.S. Probability encoding in
decision analysis. Manage Sci 1975: 22: 340 – 358.
Stewart P.A., Blair A., Cubit D.A., et al. Estimating historical exposures
to formaldehyde in a retrospective mortality study. Appl Ind Hyg
1986: 1: 34.
Taylor A., Evans J.S., and Mckone T.E. The value of animal test
information in environmental control decisions. Risk Anal 1993:
13 ( 4 ) : 403 – 412.
Thomas K.W., Pellizzari E.D., Clayton A.C., Perrit R.L., Dietz R.N.D.,
Goodrich R.W., Nelson W.C., and Wallace L. Temporal variability of
benzene exposures for residents in several new jersey homes with
attached garages or tobacco smoke. J Expos Anal Environ Epidemiol
1993: 3 ( 1 ) : 49 – 73.
Thompson K.M., and Evans J.S. The value of improved national exposure
information for perchloroethylene: a case study for dry cleaners. Risk
Anal 1997: 17 ( 2 ) : 253 – 271.
321
Walker et al.
US EPA. Guiding Principles for Monte Carlo Analysis. Risk Assessment
Forum, US Environmental Protection Agency, Washington, DC, EPA /
630 / R - 97 / 001.
von Winterfelt D., and Edwards W. Decision Analysis and Behavioral
Research. Cambridge University Press, Cambridge, England, 1986.
Wallace L. The Total Exposure Assessment Methodology ( TEAM ) study:
an analysis of exposures, sources, and risks associated with four volatile
organic chemicals. J Am Coll Toxicol 1989: 8 ( 5 ) : 883 – 895.
Wallace L. Environmental exposure to benzene: an update. Environ Health
Perspect 1996: 104 ( suppl 6 ) : 1129 – 1139.
Wallace L.A., Pellizzari E.A. Hartwell T.D., Whitmore R., Zelon H., Perritt
R., and Sheldon L. The California team study: breath concentrations
and personal exposures to 26 volatile compounds in air and drinking
water of 188 residents of Los Angeles, Antioch, and Pittsburgh, CA.
Atmos Environ 1988: 22 ( 10 ) : 2141 – 2163.
Wallsten T.S., and Whitfield R.G. Assessing the Risks to Young Children of
322
Expert judgment in exposure assessment
Three Effects Associated with Elevated Blood Lead Levels. Argonne
National Laboratory, Argonne, IL, December 1986, ANL / AA - 32.
Winkler R.L., Wallsten T.S., Whitfield R.G., Richmond H.M., Hayes
S.R., and Rosenbaum A.S. An assessment of the risk of chronic lung
injury attributable to long - term ozone exposure. Oper Res 1995:
43 ( 1 ) : 19 – 28.
Winn D., Lednar W., Williams T., and Van Ert M. The identification of
occupational exposures by an industrial hygiene questionnaire.
Presented to the American Industrial Hygiene Conference, New
Orleans, May 1977. Occupational Health Studies Group, University of
North Caroline School of Public Health, 1977.
Wolff S.K., Hawkins N.C, Kennedy S.M., and Graham J.D. Selecting
experimental data for use in quantitative risk assessment: an expert
judgment approach. Toxicol Ind Health 1990: 6 ( 2 ) : 275 – 291.
Wright G.W., and Ayton P. ( Eds. ) Subjective Probability. John Wiley and
Sons, New York, NY, 1994.
Journal of Exposure Analysis and Environmental Epidemiology (2001) 11(4)