Exposure variability

AMERICAN JOURNAL OF INDUSTRIAL MEDICINE 45:113–122 (2004)
Exposure Variability: Concepts and
Applications in Occupational Epidemiology
Dana Loomis, PhD,1 and Hans Kromhout, PhD2
Background Standard approaches to assessing exposures for epidemiologic studies tend
to concentrate resources on obtaining detailed data for each study participant at the
expense of characterizing within-person variability.
Methods This paper presents some basic, generalizable concepts concerning exposure
and its variability, describes methods that can be used to analyze, describe, and understand
that variability, and reviews related implications for the design and interpretation of
epidemiologic studies.
Results and Conclusions Insufficient attention to the balance of within- and between-person variation in exposure can reduce the efficiency of measurement efforts and attenuate estimates of exposure-disease association. Exposure variability should consequently
be considered carefully in the planning, analysis, and interpretation of epidemiologic
studies. Greater attention to these matters can lead to more meaningful characterization of
exposure itself, and, most importantly, improve the chances that epidemiologic studies can
identify and accurately characterize health hazards. Am. J. Ind. Med. 45:113–122, 2004.
© 2003 Wiley-Liss, Inc.
KEY WORDS: exposure; within-individual variability; between-individual variability; epidemiology; methods
1 Departments of Epidemiology and Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, North Carolina
2 Environmental and Occupational Health Division, Institute for Risk Assessment Sciences, Utrecht University, Utrecht, The Netherlands
*Correspondence to: Dr. Dana Loomis, Department of Epidemiology, CB 7435, School of Public Health, UNC-CH, Chapel Hill, NC 27599. E-mail: [email protected]
Accepted 24 September 2003
DOI 10.1002/ajim.10324. Published online in Wiley InterScience (www.interscience.wiley.com)
INTRODUCTION
As used in epidemiology, the term ‘‘exposure’’ is
understood to refer to either ‘‘proximity and/or contact with
a source of a disease agent’’ or ‘‘the amount of a factor to
which a group or individual was exposed’’ [Last, 2001]. In
the first definition, exposure is treated as an attribute of individuals that is either present or absent. In this case,
measuring exposure becomes a problem of assigning people
to ‘‘exposed’’ and ‘‘unexposed’’ groups that are assumed to
be essentially homogeneous. This basic approach may be
extended to incorporate the second definition, in which
exposure has to do with the quantity of the agent, by creating
ordinal categories, also treated as homogeneous. In another
variation, quantitative exposure measurements are taken
and used to assign exposure scores to each category. All of
these standard approaches classify individuals in groups with
different exposures, but impose implicit assumptions that
individuals within groups are uniformly exposed, and
frequently that each individual’s exposure is fixed over time,
as well. These standard simplifying assumptions can mask
complex variation that should often be accounted for in the
design, conduct, and interpretation of studies. On those
occasions when exposure is measured on an individual basis,
it is often treated as an invariable fixed attribute of an individual.
Concepts and methods for working with complex
exposure information have been developed through epidemiological research on nutritional and occupational exposures [Willett, 1990; Rappaport, 1991; Boleij et al., 1995],
but are still not in common use. Here we will present some
basic, generalizable concepts concerning exposure and its
variation among people and over time, describe methods
that can be used to analyze, describe, and understand that
variability, and consider implications for design and interpretation of epidemiological studies. Although our discussion focuses on occupational and environmental exposures,
the concepts and methods have applications to a wider range
of epidemiological problems.
DIMENSIONS OF EXPOSURE VARIABILITY
From the standpoint of epidemiologic research, variation in exposure has two fundamental dimensions: person
and time. Because epidemiologic studies generally require
comparison of the health experience of groups or populations, the notion that exposure varies between groups of
people is fundamental to epidemiologic research.
Exposure may also vary within groups of people. In
occupational studies, groups are frequently defined by
existing structures, such as job titles or work areas. Thus,
workers employed in the same plant are assumed to have
different exposures determined by their job titles and work
areas. However, exposures of workers with the same job in
a work area might vary as a result of differences in tasks,
work styles, ventilation, and personal protective equipment.
Figure 1 depicts this situation for two automobile manufacturing workers performing the same job (painting car
bodies) in the same well-controlled spray booth: despite the
apparent similarity of these workers’ tasks and environment,
their mean exposures to isopropanol differed by a factor
of 2.7.
When exposure groups are selected or created for study,
some aggregate measure of exposure is normally applied to
all individuals in a given group. A statistical measure of
central tendency, such as the arithmetic or geometric mean, is
typically used when the exposure variable is quantitative.
Although metrics of this type are based on explicit statistical
assumptions about the distribution of exposure within the
group, standard practices treat every individual’s exposure as
if it were equal to the mean.
For categorical exposure variables, an arbitrary score
(e.g., 0 or 1 for binary data) is usually applied as a metric of
exposure. Metrics of this type do not explicitly presume
variability within groups. Nevertheless, it is normally the
case that exposure does vary within categories treated as
uniform. For example, the quantitative level of lifetime
exposure to tobacco smoke may vary widely among those
who are classified as smokers by a binary indicator for ever
having consumed cigarettes.
Classical statistical methods for the analysis of exposure-disease associations do not facilitate formal consideration of the variability within groups, so it is frequently
ignored. Nevertheless, within-group variability has important implications, including the potential for exposure measurement error and misclassification.
FIGURE 1. Daily mean exposures to isopropanol (IPA) for two workers preparing automobile bodies for painting in a ventilated booth
(diamonds, worker 1; squares, worker 2). Data from [Flynn and George, 1996]: Applied Occupational and Environmental Hygiene, A field
evaluation of a mathematical model to predict worker exposure to solvent vapors, 11(10), pages 1212–1216. Copyright 1996. Cincinnati OH.
Reprinted with permission.
Time is the basic dimension of exposure variability at the
individual level: people change jobs, quit smoking, retire,
and experience other changes. Indeed, congenital attributes
may be the only exposures that are truly fixed at the individual
level. Many methods of analyzing epidemiological data
nevertheless require that exposure be treated as fixed, rather
than as time-dependent, for each person. Case-control
studies, for example, typically use an exposure estimate
based on the exposure accumulated at the time of the cases’
diagnosis [e.g., Sanderson et al., 2001].
The temporal variability of exposure can be considered
on different time scales. Figure 2 illustrates this phenomenon
for workers exposed to magnetic fields. The two workers
shown in Figure 2a had different average exposures in each
decade, and each worker’s exposures declined by approximately 0.5 μT over the time interval. When worker 2’s
exposure is examined on a day-to-day basis, however, the
range of daily mean exposures is greater by a factor of 10, and
no secular trend is evident (Fig. 2b). The fluctuation of
exposure within a single day is greater still (Fig. 2c). As this
example suggests, developments in measurement technology
can make it possible to measure exposure on progressively
finer time scales. Nevertheless, decisions about which time
scale is relevant should still be biologically based, since there
is no reason to assume that any exposure that can be measured
must be important.
FIGURE 2. a–c: Variation in individual occupational exposure to magnetic fields (a) by year, (b) by day within 1 work week, and (c) by 10-s
interval within 1 day. Figure 2c from van der Woord et al. [2000]. Within-day variability of magnetic fields among electric utility workers:
consequences for measurement strategies. Am Ind Hyg Assoc J 61:31–38. Copyright 2000 by the American Industrial Hygiene Association.
Reprinted with permission.
ANALYSIS AND DESCRIPTION OF
EXPOSURE VARIABILITY
A description of the dimensions and determinants of
exposure variability can be useful for a number of purposes,
including planning exposure measurements, assigning
estimates of exposure to study participants, and predicting
and controlling future exposures.
Mathematical Description
Simple conceptual models can be developed to describe
exposures, taking account of variability associated with
various factors. For example, if exposure varies between
groups, between people within groups, and from time to time for each person as described above, the instantaneous
exposure X(t) of a person at time t can be described by:
$X_{ij}(t) = f(\mu + \alpha_i + \beta_{ij} + \gamma_{ij}(t))$, where $\mu$ is the long-term,
overall mean exposure level, and $\alpha$, $\beta$, and $\gamma$, respectively,
represent deviations from $\mu$ associated with being a member
of group i, being the jth person in that group, and temporal
fluctuation of exposure at time t.
Statistical Models
Random-effects analysis of variance (ANOVA) models
are a useful tool for quantitatively describing variability in
exposure by partitioning it into variance components associated with different factors [Heederik et al., 1991]. Time-varying exposures within a population can be analyzed
using a simple one-way random effects ANOVA model:
$X_{ij} = \mu + \beta_i + \varepsilon_{ij}$, where $X_{ij}$ is the observed exposure of
person i at time j, $\mu$ is the long-term, population mean
exposure, $\beta_i$ is the random deviation of the ith person’s
exposure from the population mean, and $\varepsilon_{ij}$ represents a
random deviation on the jth day from person i’s mean
exposure. This model assumes that $\beta_i$ and $\varepsilon_{ij}$ are normally
distributed and independent. By fitting the ANOVA model
to the observed exposure data, the variance components $\sigma^2_b$
and $\sigma^2_w$ associated with exposure variability between people
and within people, respectively, can be estimated. The total
variance in exposure, $\sigma^2_T$, is simply the sum $\sigma^2_b + \sigma^2_w$.
Because the distribution of occupational exposures
tends to be lognormal, the ANOVA model is usually fit to
log-transformed data. In this case, the model is written as
$Y_{ij} = \ln(X_{ij}) = \mu + \beta_i + \varepsilon_{ij}$, where $Y_{ij}$ is the log-transformed
individual exposure, $X_{ij}$. The estimated geometric standard
deviations exp(Sb) and exp(Sw) are then used to describe the
variability between and within people, respectively. In
Figure 3 an example is presented of within- and between-individual variability in exposure to cutting fumes for six
workers demolishing a railway bridge.
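As a minimal sketch of the variance-components estimation described above, the Python snippet below applies the standard method-of-moments (ANOVA) estimators for a balanced one-way random-effects model to log-transformed exposures. The function names, the restriction to balanced data (the same number of measurements for every person), and the truncation of a negative between-person estimate at zero are assumptions made for illustration and are not taken from the paper. The helper `r95` uses the conventional definition of R0.95 as the ratio of the 97.5th to the 2.5th percentile of a lognormal distribution, exp(3.92 S).

```python
import math


def variance_components(exposures):
    """Estimate between- and within-person variance of log exposure.

    exposures: list of per-person lists of positive measurements, all of
    the same length n >= 2 (balanced design; an assumption of this sketch).
    Returns (s2_between, s2_within) on the natural-log scale.
    """
    logs = [[math.log(x) for x in person] for person in exposures]
    k = len(logs)       # number of people
    n = len(logs[0])    # measurements per person
    if k < 2 or n < 2 or any(len(p) != n for p in logs):
        raise ValueError("need >= 2 people, each with the same n >= 2 measurements")

    person_means = [sum(p) / n for p in logs]
    grand_mean = sum(person_means) / k

    # Mean squares of the one-way ANOVA table.
    ms_between = n * sum((m - grand_mean) ** 2 for m in person_means) / (k - 1)
    ms_within = sum(
        (y - m) ** 2 for p, m in zip(logs, person_means) for y in p
    ) / (k * (n - 1))

    # E[MS_within] = s2_w and E[MS_between] = s2_w + n * s2_b.
    s2_within = ms_within
    s2_between = max((ms_between - ms_within) / n, 0.0)
    return s2_between, s2_within


def gsd(s2):
    """Geometric standard deviation exp(S) for a log-scale variance S^2."""
    return math.exp(math.sqrt(s2))


def r95(gsd_value):
    """Ratio of the 97.5th to the 2.5th percentile of a lognormal distribution."""
    return math.exp(3.92 * math.log(gsd_value))
```

Applied to the geometric standard deviations given in the Figure 3 caption, `r95(2.00)` evaluates to about 15 and `r95(1.46)` to about 4.4, and the corresponding log-scale variances split roughly 77% between and 23% within individuals, consistent with the values reported there.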
The basic random-effects ANOVA model can be expanded for the analysis of more complex situations by adding
terms for additional explanatory factors or sources of variability. For example, the variability of exposure among
several populations might be analyzed using the model
$X_{ijk} = \mu + \alpha_i + \gamma_{ij} + \varepsilon_{ijk}$, in which $X_{ijk}$ is the observed
instantaneous exposure at day k of person j in group i, $\alpha_i$
represents the random deviation of group i’s exposure from
the global mean, $\mu$, $\gamma_{ij}$ is the random deviation of person j’s
exposure from the mean of group i, and $\varepsilon_{ijk}$ is a random error
component consisting of temporal variability for person j.
This model assumes a hierarchical variance structure, with
each successive source of variability nested within the one
before it. The formulation is analogous to the conceptual
model above, with exposure varying between groups, between people within groups, and within people over time
[Kromhout and Heederik, 1995].
FIGURE 3. Exposure to cutting fumes of six demolition workers during 4 consecutive days. The differences between individuals outweigh
the day-to-day variability (77% between-individual variability vs. 23% within-individual variability). Estimated geometric standard deviations
exp(Sb) and exp(Sw) are 2.00 and 1.46, respectively, corresponding to a bR0.95 of 15 and a wR0.95 of 4.4.
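The hierarchical variance structure of the nested model can be sketched in the same way. The Python function below is an illustration only: it assumes a fully balanced design (the same number of people in every group and the same number of days for every person), takes values already on the log scale, and uses the textbook method-of-moments estimators for a two-level nested random-effects ANOVA. The function name and the dictionary keys are hypothetical.

```python
def nested_variance_components(data):
    """Variance components for a balanced nested design.

    data[i][j][k] is the (log) exposure of person j in group i on day k.
    Returns a dict with the estimated variances for groups, for people
    within groups, and for days within people.
    """
    g = len(data)          # number of groups
    p = len(data[0])       # people per group
    d = len(data[0][0])    # days per person
    if g < 2 or p < 2 or d < 2:
        raise ValueError("need at least 2 groups, 2 people per group, 2 days per person")

    person_means = [[sum(days) / d for days in group] for group in data]
    group_means = [sum(pm) / p for pm in person_means]
    grand_mean = sum(group_means) / g

    ms_group = p * d * sum((gm - grand_mean) ** 2 for gm in group_means) / (g - 1)
    ms_person = d * sum(
        (m - gm) ** 2
        for pm, gm in zip(person_means, group_means)
        for m in pm
    ) / (g * (p - 1))
    ms_day = sum(
        (y - m) ** 2
        for group, pm in zip(data, person_means)
        for days, m in zip(group, pm)
        for y in days
    ) / (g * p * (d - 1))

    # Expected mean squares:
    #   E[ms_day]    = s2_day
    #   E[ms_person] = s2_day + d * s2_person
    #   E[ms_group]  = s2_day + d * s2_person + p * d * s2_group
    return {
        "group": max((ms_group - ms_person) / (p * d), 0.0),
        "person": max((ms_person - ms_day) / d, 0.0),
        "day": ms_day,
    }
```

Negative moment estimates are truncated at zero here for simplicity; likelihood-based mixed-model software handles this boundary case, and unbalanced data, more carefully.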
In some situations, it can be useful to employ a multilevel mixed effects regression model that allows exposures to
be modeled as a function of a combination of random and
fixed factors [Burton et al., 1998]. For example, groups
defined by job title or work area might be entered as
fixed effects, while the effect corresponding to individual
workers within groups and temporal effects are modeled as
random factors [Nylander-French et al., 1999; Burstyn and
Kromhout, 2000].
PLANNING EXPOSURE ASSESSMENTS
FOR EPIDEMIOLOGICAL STUDIES
Who, What, and When to Measure
The structure and magnitude of exposure variability
have important practical implications for planning the assessment of exposure in an epidemiologic study. Among the
first steps in planning a study are decisions about who and
what to measure. Exposure variability should be considered
along with other issues when selecting a study population. It
can be shown that the distribution of a population’s exposure
can affect study efficiency, with less variable exposures
generally requiring larger numbers of subjects to achieve the
same study power [White et al., 1994; Armstrong, 1996].
Exposure variability, and thereby efficiency, can be increased by carefully selecting populations to increase exposure ranges or expand the overall variance of exposure
[McKeown-Eyssen and Thomas, 1985; Armstrong, 1996].
Causal models are the preferred starting point for
decisions about what to measure, but the balance of within- and between-person exposure variability can be an important, practical consideration with ramifications for both
validity and efficiency [Rappaport et al., 1995].
Exposure Measurement Strategies
Consideration of the relative magnitude of between- and
within-person components of exposure variability can yield
useful insights about the type of exposure measurement
strategy most likely to yield valid, precise estimates of true
exposure for study participants. Two extreme cases illustrate
the principles involved. If exposure varied only over time,
but not among people, then monitoring a single individual
would be sufficient to characterize the exposure of a population. Because of temporal variability, however, either
continuous monitoring of that individual’s exposure over
the entire time period of interest or multiple, shorter-term
measurements distributed randomly over that interval would
be required to accurately characterize the exposure. The
optimal exposure assessment program for this situation
would maximize the amount of data collected for an individual subject. At the opposite extreme, a single instantaneous measurement for each person in the study would be
sufficient if exposure varied only between people and not
over time. The optimum allocation of measurement effort
in this situation would maximize the number of randomly
selected individuals monitored, but taking more measurements per individual would add nothing.
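The two extreme cases can be illustrated with a small calculation. Under the one-way random-effects model introduced earlier, and assuming a balanced design, the variance of a population mean estimated from k people with n measurements each is $\sigma^2_b/k + \sigma^2_w/(kn)$. The sketch below evaluates this expression; the function name and the numerical values used with it are illustrative assumptions, not data from any study.

```python
def variance_of_mean(s2_between, s2_within, k_people, n_per_person):
    """Variance of the estimated population mean exposure.

    Assumes a balanced one-way random-effects model with k_people
    randomly selected individuals and n_per_person randomly timed
    measurements on each of them.
    """
    if k_people < 1 or n_per_person < 1:
        raise ValueError("k_people and n_per_person must be at least 1")
    return s2_between / k_people + s2_within / (k_people * n_per_person)


# Extreme case 1: exposure varies only over time (s2_between = 0).
# Twelve measurements on one person are as informative as one
# measurement on each of twelve people.
one_person = variance_of_mean(0.0, 1.2, k_people=1, n_per_person=12)
twelve_people = variance_of_mean(0.0, 1.2, k_people=12, n_per_person=1)

# Extreme case 2: exposure varies only between people (s2_within = 0).
# Repeating measurements on the same people adds nothing.
single = variance_of_mean(1.2, 0.0, k_people=12, n_per_person=1)
repeated = variance_of_mean(1.2, 0.0, k_people=12, n_per_person=5)
```

This calculation concerns only the precision of a group mean. When individual exposure estimates are needed, repeated measurements per person also matter, which is why intermediate allocations are usually preferable.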
Studies of the relation of childhood cancer to exposure to
residential magnetic fields provide a useful example from
environmental epidemiology. Initial attempts to estimate
exposure quantitatively in the 1980s were based on short-term ‘‘spot’’ measurements a few minutes in duration in
participants’ homes [Savitz et al., 1988]. In subsequent
studies conducted in the 1990s, researchers employed new
technology in attempts to improve the estimation of exposure
by monitoring residential magnetic field levels over a longer
period of 24 hr [London et al., 1991; Linet et al., 1997] and by
restricting studies to participants with stable addresses in
order to reduce the number of houses involved and thereby
make it more feasible to obtain a quantitative exposure
measurement for every subject [Linet et al., 1997]. A recent
methodological study showed that spot measurements resulted in a misclassification rate of 36%, and the authors
concluded that these measurements were not to be recommended because of high diurnal variability [Banks et al.,
2002].
The magnetic field investigators’ approach to improving
exposure estimation illustrates a common decision that emphasizes obtaining data for every participant at the expense of
assessing temporal variability. For most situations, however,
the optimal exposure assessment strategy would involve
distributing the sampling effort to achieve a balance between
maximizing the number of people monitored to account for
between-person variability and maximizing the temporal
detail of the data collected for each person monitored to
capture within-person variation [Phillips and Smith, 1993].
Meta-analyses of research on magnetic fields and childhood cancer suggest that the effort to improve the quality of
exposure data in this research area may have achieved some
success. An authoritative review issued in 1997 noted a
paradoxical finding that in the studies then available, cancer
risk was associated less strongly with quantitative indicators
of magnetic field intensity based on short-term quantitative
measurements than with a surrogate indicator derived from
outdoor wiring configurations [National Research Council,
1997]. The paradox appeared to be resolved by later studies
that estimated individual exposures from 24 hr residential
monitoring data, which would presumably be more temporally stable than spot measurements. Whereas, on average,
studies that estimated exposure from spot measurements
suggested no association with childhood leukemia (odds
ratio 1.0), those using 24 hr measurements yielded odds ratios
between 1.3 and 1.7 [Loomis et al., 1999]. Although these
results suggest that the adoption of 24-hr monitoring improved the quality of exposure assessment, even this strategy
would be optimal only if magnetic field levels did not vary on
a time scale longer than 1 day. If there were significant day-to-day variation, it would be preferable to obtain several
exposure measurements on different days for each participant. Banks et al. [2002] showed that a 2-week measurement
regime improved exposure classification slightly when compared to 24 hr measurements, but concluded that the added
intrusiveness and cost were likely to outweigh the improvement in precision.
Pilot studies are an excellent way to gather the information needed to optimize exposure assessment strategies.
These pilot studies should include assessment of a representative sample of the study population, with participants
chosen among all relevant exposure categories and repeated
random sampling of each individual. This allows a preliminary analysis of the components of exposure variability
before undertaking a large field study. A final exposure
assessment strategy can then be developed based on the pilot
study findings.
A detailed discussion of the principles of designing
exposure assessment studies for epidemiologic research is
beyond the scope of this paper, but exceptionally clear
explanations were given as early as the 1950s [Oldham and
Roach, 1952; Ashford, 1958].
ASSIGNING EXPOSURE ESTIMATES
Once exposure information is obtained, it must be used
to assign estimates of exposure to study participants. Exposure values can be assigned in many forms, ranging from
binary to fully quantitative. The sources and magnitude of
exposure variability are among the factors that should be
considered in selecting the exposure assignment method.
Individual- Versus Group-Based
Assignments
The situation where quantitative exposure data are
available for each individual, although not always realized in
practice, provides the clearest illustration of the principles
involved. When exposure has been assessed for each subject,
there are two basic options for assigning estimated exposures. The most intuitively transparent is the individual-
based strategy, in which the exposure of each subject is
estimated directly using his or her own data alone. This type
of exposure assignment strategy is the norm in studies when
the investigators have been able to contact each participant to
obtain exposure information by measurement or interview, as
in many case-control and prospective cohort studies.
The principal alternative is a group-based exposure
assignment strategy. With this method, subgroups of people
are constructed based on common factors like job title, task,
or plant. Each subgroup’s mean exposure is then estimated
from a sample and this metric is applied to all individuals
in the subgroup. This method of assigning exposures is
common in occupational studies where historical design or
logistical problems prevent measurement of each worker. In
such cases, the use of group exposure assessments based
on such characteristics as job title, task, or environment is
the norm [Checkoway et al., 1986]. Although group-based
designs are well-known when dealing with quantitative exposure data, the same principle has also been used for
interview data, for instance when population-based job-exposure matrices are elaborated from exposure scores from
interviews with individuals sharing the same job [Kromhout
et al., 1992; Post et al., 1994].
It is often assumed that individual-based methods of
assigning exposure are preferable whenever the requisite data
are available. However, estimates of exposure-disease association can be severely attenuated when the temporal component of exposure variability within people is large relative
to that between people. Different methods for exposure
assignment were compared in a recent study on exposure
to carbon black and lung function among manufacturing
workers [van Tongeren et al., 1999]. The overall between-worker variability was smaller than the temporal (day-to-day) variability in exposure concentrations and it was
estimated that the observed slope would be attenuated by
38% for inhalable dust and 59% for respirable dust with
exposure assigned on an individual basis. In contrast,
group-based assignments resulted in negligible attenuation.
The attenuation under individual-based assignment can be described by the equation
$\beta^* = \beta / (1 + \lambda/n)$, where $\beta$ and $\beta^*$ are, respectively, the
regression coefficients on true and observed exposure, n is
the number of exposure measurements per person, and $\lambda$ is
the ratio of the within- and between-person components of
exposure variance, $\sigma^2_w / \sigma^2_b$ [Cochran, 1968; Liu et al., 1978].
The potential magnitude of the resulting attenuation was
illustrated by analyses by Heederik and Attfield [2000] using
data from a study of coal miners that provided an average of
31 individual exposure measurements for each worker. When
the lung function data were re-analyzed using only a sample
of the repeated exposure measurements for each worker, the
observed 11-year decrement in lung function associated with
a 1 mg/m3 average coal dust exposure became progressively
more attenuated as the number of exposure measurements
was reduced (Table I). This type of attenuation can be reduced in the design phase by minimizing the ratio $\lambda/n$, either
by increasing the number of exposure measurements per
person, n, or by increasing $\sigma^2_b$ by selecting the study
population so that the inter-individual range in exposure is
broadened.
TABLE I. Attenuation of the Regression Coefficient Expressing the
Association of Individual Mean Coal Dust Exposure and Change in Lung
Function (FEV1) in Relation to the Number of Exposure Measurements
Available per Worker

Dust measurements per worker    FEV1 (ml) per mg/m3 (β)    SE(a)    β/SE(a)
All available (n ≈ 36,000)      4.5                        1.5      3.0
15                              3.8                        1.4      2.7
9                               3.2                        1.3      2.5
6                               2.5                        1.3      1.9
3                               1.8                        1.1      1.6

Based on National Study of Coal Workers’ Pneumoconiosis, 1969–1981, a study of
1,105 coal miners with adjustments for age, height, and smoking [from Heederik and
Attfield, 2000].
(a) SE, standard error; β, regression coefficient for change in FEV1 (ml) per mg/m3 of coal
dust exposure.
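The attenuation relation given above, with the observed coefficient equal to the true coefficient divided by (1 + λ/n), lends itself to a short worked sketch. The Python functions below are illustrative only: their names are hypothetical, λ is taken as the ratio of within- to between-person variance as defined in the text, and the second function simply searches for the smallest n that keeps the attenuation below a chosen tolerance.

```python
def attenuation_factor(lam, n):
    """Ratio of observed to true regression coefficient, 1 / (1 + lam / n).

    lam: ratio of within- to between-person exposure variance.
    n:   number of exposure measurements averaged per person.
    """
    if lam < 0 or n < 1:
        raise ValueError("lam must be non-negative and n at least 1")
    return 1.0 / (1.0 + lam / n)


def measurements_needed(lam, max_attenuation, n_max=10_000):
    """Smallest n for which the proportional attenuation, 1 - factor,
    does not exceed max_attenuation (for example 0.10 for 10%)."""
    if not 0.0 < max_attenuation < 1.0:
        raise ValueError("max_attenuation must lie strictly between 0 and 1")
    for n in range(1, n_max + 1):
        # The small tolerance guards against floating-point rounding at the boundary.
        if 1.0 - attenuation_factor(lam, n) <= max_attenuation + 1e-12:
            return n
    raise ValueError("no n up to n_max satisfies the requested tolerance")
```

For example, when within- and between-person variances are equal (λ = 1), a single measurement per person halves the observed coefficient, and `measurements_needed(1.0, 0.15)` returns 6.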
Substitution of a group-based exposure-assignment
scheme in the analysis phase of the study may be more effective, especially when temporal variability is large. Recent
methodological research shows that in most cases validity
can be substantially improved by reducing attenuation
through the use of group-, rather than individual-based
exposure assignments [Kromhout et al., 1996; 1997; Tielemans et al., 1998].
Group-based strategies may also be preferable for
practical reasons, as they tend to be logistically less demanding than individual measurement. Group-based exposure assignments are often the only kind possible in historical
studies of occupational cohorts, where the available exposure
data consist exclusively of area measurements representing
the average exposure of groups of workers in the same
section of a plant or company [e.g., Dement et al., 1983;
Burstyn et al., 2000]. A disadvantage of group-based
assignments, however, is that they may provide poorer statistical precision than an individual assignment using the same
data [Preller et al., 1995; Seixas and Sheppard, 1996].
Validity of the exposure–response relation is the more important consideration, however [Kupper, 1984].
Seixas and Sheppard [1996] proposed an alternative
exposure estimator based on an empirical-Bayes approach,
which weights individual and group estimates of mean
exposure to obtain a single exposure indicator with optimal
precision and bias. Although this indicator has not been
tested extensively, a study by Vermeulen et al. [2002] demonstrates how it combines the strengths of both assignment
strategies.
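The idea of weighting individual and group estimates can be sketched with the generic shrinkage form common to empirical-Bayes estimators. The code below is an assumption-laden illustration and not the specific estimator of Seixas and Sheppard [1996], which may differ in detail: it works on the log scale, treats the variance components as known, and uses hypothetical function names.

```python
def shrinkage_weight(s2_between, s2_within, n):
    """Weight given to a person's own mean of n measurements.

    s2_between: between-person variance within the group.
    s2_within:  within-person (day-to-day) variance.
    """
    if n < 1 or s2_between < 0 or s2_within < 0:
        raise ValueError("n must be >= 1 and variances non-negative")
    if s2_between == 0 and s2_within == 0:
        raise ValueError("at least one variance component must be positive")
    return s2_between / (s2_between + s2_within / n)


def shrunken_estimate(individual_mean, group_mean, s2_between, s2_within, n):
    """Weighted combination of the individual mean and the group mean."""
    w = shrinkage_weight(s2_between, s2_within, n)
    return w * individual_mean + (1.0 - w) * group_mean
```

With little day-to-day variability or many measurements per person, the weight approaches 1 and the individual mean dominates; with large within-person variability and few measurements, the estimate moves toward the group mean. The weight equals 1/(1 + λ/n), the same quantity that governs attenuation under individual-based assignment.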
Modeling Exposures
Empirical statistical modeling can sometimes be used to
improve the estimation of exposure in situations where direct
measurements are sparse or unstable. To obtain such exposure estimates, multiple regression models are constructed
to predict measured exposure (typically log-transformed) as
a function of readily observable determinants like location,
activity, or job title. The predicted values from these regressions are then assigned as estimates of exposure.
As an example, Preller et al. [1995] found large within-person variation in pig farmers’ exposure to endotoxins due
to the non-routine nature of the farmers’ work schedules.
They then used information on tasks performed and workplace characteristics (flooring, feeding system, etc.) from
activity diaries and repeated measurements of endotoxins
to develop statistical models that explained the temporal
variability in exposure. Consequently, information from
activity diaries, which were kept for several weeks, and the
statistical models were used to estimate more precise long-term average exposures. The modeled exposure estimates
yielded clear exposure–response relationships with lung
function, whereas no exposure–response relationship was
observed when individual exposure measurements were used
directly. Wameling et al. [2000] recently provided theoretical
proof for the use of this method.
INTERPRETING RESULTS
Exposure variability can influence the quantitative results of an epidemiologic study when data on exposure and
health outcomes are finally linked and analyzed to elucidate
the relationship of exposure and risk. Observed indicators of
exposure-disease association, like the rate ratio or odds ratio,
can be affected by the sources and magnitude of variation of
exposure in the populations under study, as well as by the way
in which that variability is handled in assigning exposure
estimates.
The statistical and epidemiological literature concerning
measurement error provides important concepts and methods
for understanding exposure variability and its consequences.
As used in the literature, the term ‘‘measurement error’’
actually describes two distinct phenomena. Classical, statistical measurement error refers to uncertainty introduced by
natural, random variation of the quantity being measured. In
epidemiologic research, this type of error is associated with
the sampling process. For example, if samples taken on
several randomly selected days are used to estimate an individual’s mean dermal exposure to pesticides, that estimate
will be associated with a certain amount of uncertainty about
the mean that can be described by the variance of the measurements. This uncertainty is an important component of
the total error in measuring exposure and consequently a
potentially important source of bias in epidemiologic results.
Analytical error, the second class of measurement error,
is a tendency of measuring instruments to produce incorrect
values, whose distribution may be random or systematic.
Systematic analytical error can, for example, be introduced
by the inability to measure exposures below a minimum
detectable value. For most environmental measurements,
analytical error is small relative to the imprecision resulting
from natural temporal and spatial variation.
In the absence of analytical error, a high ratio of
within- to between-person exposure variability can attenuate
estimated exposure-disease associations, as shown in Table I.
The dust levels in the coal mines investigated in that example
are highly variable over time and space, so the ability of the
study to detect an effect depends on having a large number of
exposure measurements for each worker. To illustrate, if only
three exposure measurements per worker had been available,
the regression coefficient would have been statistically nonsignificant
and so severely attenuated as to compromise
the sensitivity of the study [Heederik and Attfield, 2000].
Although inadequate, three exposure measurements per
worker is nevertheless more than the number available in
most studies. The potential effect of insufficient exposure
information is usually difficult to gauge directly because of
the limited data available for each person.
Many epidemiologic studies use categorical, rather than
continuous, exposure scores in the analysis. In such studies,
the mechanism of misclassification is relevant [Copeland
et al., 1977]. Misclassification is a special case of measurement error that applies when the exposure variable used in
the analysis is categorical; it can arise both from classical,
random statistical error [Flegal et al., 1991] and from analytical inaccuracy. When error-prone exposure estimates are used in turn to
assign individuals to exposure categories, the probability that
they will be misclassified depends in part on the magnitude of
the underlying within-person variability.
To illustrate this dependence, we computed the probabilities of correctly classifying the exposures of a simulated
population of 1,000 workers, assuming a dichotomous exposure variable and a range of values for Sb and Sw. For any
given level of variability between people, larger within-person variability is associated with lower sensitivity and
specificity, and thereby increased probability that an individual will be misclassified (Table II). Greater variability
between people has the opposite effect: for a given level of
within-person variability, greater between-person variability
is associated with higher probabilities of correct classification (Table II). The probability of misclassification can also
be reduced by taking more measurements for each person,
but the pattern of effects associated with the balance of Sb and Sw
remains the same.
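A simulation of the kind described can be sketched in a few lines of Python. The version below follows the assumptions stated for Table II (normally distributed true exposure with mean 0 and standard deviation Sb, normally distributed day-to-day deviations with standard deviation Sw, and dichotomization at a cutpoint), but the function name, the random seed, and the default population size are illustrative choices, and the code is not the authors' original program.

```python
import random


def classification_accuracy(s_b, s_w, cutpoint, n_people=1000, n_days=1, seed=0):
    """Simulate sensitivity and specificity of a dichotomous exposure
    classification based on the mean of n_days measurements per person."""
    rng = random.Random(seed)
    true_pos = false_neg = true_neg = false_pos = 0
    for _ in range(n_people):
        true_mean = rng.gauss(0.0, s_b)
        # Observed mean: true long-term mean plus day-to-day deviations.
        observed = sum(true_mean + rng.gauss(0.0, s_w) for _ in range(n_days)) / n_days
        truly_exposed = true_mean > cutpoint
        classified_exposed = observed > cutpoint
        if truly_exposed and classified_exposed:
            true_pos += 1
        elif truly_exposed:
            false_neg += 1
        elif classified_exposed:
            false_pos += 1
        else:
            true_neg += 1
    exposed = true_pos + false_neg
    unexposed = true_neg + false_pos
    sensitivity = true_pos / exposed if exposed else float("nan")
    specificity = true_neg / unexposed if unexposed else float("nan")
    return sensitivity, specificity
```

With `s_b = s_w = 0.5` and a cutpoint of 0.0, the simulated sensitivity and specificity should both fall near 0.75 for a large simulated population, in line with the corresponding entries of Table II. Increasing `n_days` raises both values, while the qualitative dependence on the balance of Sb and Sw is unchanged.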
When categorical exposure variables are assigned with
error, as in Table II, the resulting misclassification can be
differential by disease status if true exposure and disease risk
are quantitatively related [Flegal et al., 1991]. Such differential misclassification often causes estimates of exposure-disease association to be attenuated, but exaggeration
and reversal of associations are also possible [Brenner and
Loomis, 1994; Wacholder, 1995]. Knowledge about the
relative magnitude of exposure variability within and between people can be used to qualitatively assess the chances
that exposure will be misclassified when individuals are
assigned to exposure groups.

TABLE II. Influence of Within-Person Exposure Variability on Probability of Exposure Misclassification, When
Long-Term ‘‘True’’ Mean Exposure for an Individual Is Estimated From Short-Term Measurements

Probability of correct classification (a)

                        Cutpoint 0.0                      Cutpoint 1.0
              Sw = 0.5  Sw = 1.0  Sw = 2.0      Sw = 0.5  Sw = 1.0  Sw = 2.0
Sb (b) = 0.5
  Se (c)        0.75      0.65      0.58          0.61      0.60      0.48
  Sp (c)        0.75      0.62      0.56          0.93      0.82      0.69
Sb = 1.0
  Se            0.84      0.76      0.69          0.83      0.67      0.67
  Sp            0.86      0.73      0.67          0.93      0.85      0.72
Sb = 2.0
  Se            0.92      0.85      0.72          0.89      0.80      0.69
  Sp            0.93      0.87      0.76          0.96      0.88      0.81

(a) Simulated data assuming normally distributed exposure with mean = 0 and standard deviation Sb, with exposure for each person measured on one randomly selected day. Exposure groups formed by dichotomizing observed individual means at cutpoints
of 0.0 and 1.0.
(b) Sw, within-person component of exposure standard deviation; Sb, between-person component of exposure standard deviation.
(c) Sensitivity, Se, and specificity, Sp, of exposure classification, defined as the proportion classified as exposed among those with
true exposure exceeding the cutpoint (sensitivity), and the proportion classified as unexposed among those with true exposures
below the cutpoint (specificity).
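For the special case in which the cutpoint equals the population mean, this assessment can be made with a closed-form expression rather than a simulation. The sketch below assumes that true means and measurement errors are both normally distributed and independent. The correlation between the true and observed means is then rho = Sb / sqrt(Sb^2 + Sw^2 / n), and the probability that both fall on the same side of the mean is 1/2 + arcsin(rho) / pi, a standard result for the bivariate normal distribution. The function name is hypothetical.

```python
import math


def prob_correct_at_mean_cutpoint(s_b: float, s_w: float, n_days: int = 1) -> float:
    """Expected proportion correctly classified when the cutpoint is the mean.

    Assumes normal, independent true means (SD s_b) and measurement errors
    (SD s_w), with each worker's exposure estimated from n_days measurements.
    By symmetry this value is also the expected sensitivity and specificity.
    """
    if s_b <= 0 or s_w < 0 or n_days < 1:
        raise ValueError("require s_b > 0, s_w >= 0 and n_days >= 1")
    # Correlation between the true mean and the observed mean.
    rho = s_b / math.sqrt(s_b ** 2 + s_w ** 2 / n_days)
    rho = min(1.0, rho)  # guard against floating-point overshoot
    return 0.5 + math.asin(rho) / math.pi


if __name__ == "__main__":
    for s_b in (0.5, 1.0, 2.0):
        row = [prob_correct_at_mean_cutpoint(s_b, s_w) for s_w in (0.5, 1.0, 2.0)]
        print(f"Sb={s_b}: " + "  ".join(f"{p:.2f}" for p in row))
```

When Sb equals Sw and one measurement is taken, the expression gives exactly 0.75, in line with the corresponding entries for a cutpoint of 0.0 in Table II. The expression does not apply to cutpoints away from the mean, where sensitivity and specificity diverge.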
In this discussion, we have focused on the consequences of error in measuring and assigning exposure in
an epidemiologic study. Covariates of exposure can vary in
similar ways and can also be measured with error, but we
have not considered errors in covariates because the
principles involved are the same. It should be noted, however,
that error in measuring covariates that act as confounders
may result in residual confounding and biased estimation of
the exposure-disease association [Greenland, 1980].
CONCLUSIONS
Exposure levels vary along several dimensions defined
by person, space, and time. The existence of a gradient of
exposure between populations is essential to many types of
epidemiologic research. Nevertheless, the structure and
magnitude of exposure variation on other dimensions are
not always recognized or exploited to full advantage. In
particular, the range of exposures within study
groups and the temporal variation of exposure levels within
individuals are often neglected. As a result, investigators
sometimes concentrate resources on obtaining exposure data
for each study participant at the expense of characterizing
within-person variability. Exposure variability should be
considered carefully in the planning, analysis, and interpretation of epidemiologic studies. Failure to do so can reduce study sensitivity and efficiency and may introduce bias.
Attention to exposure variation and its multiple dimensions
can lead to more meaningful characterization of exposure
itself, and, most importantly, improve the chances that epidemiologic studies identify and accurately characterize health
hazards.
ACKNOWLEDGMENTS
We thank Dr. David Savitz, Dr. Harvey Checkoway, and
Dr. David Kriebel for thoughtful comments on an earlier
draft, which helped us improve the manuscript.
REFERENCES
Armstrong BG. 1996. Optimizing power in allocating resources to
exposure assessment in an epidemiologic study. Am J Epidemiol 144:
192–197.
Boleij JSM, Buringh E, Heederik D, Kromhout H. 1995. Occupational
hygiene of chemical and biological agents. Amsterdam: Elsevier.
285 p.
Brenner H, Loomis D. 1994. Varied forms of bias due to nondifferential
error in measuring exposure. Epidemiology 5:510–517.
Burstyn I, Kromhout H. 2000. Are the members of a paving crew
uniformly exposed to bitumen fume, organic vapour, and benzo(a)pyrene? Risk Anal 20:653–663.
Burstyn I, Kromhout H, Kauppinen T, Heikkilä P, Boffetta P. 2000.
Statistical modelling of the determinants of historical exposure to
bitumen and polycyclic aromatic hydrocarbons among paving workers.
Ann Occup Hyg 44:43–56.
Burton P, Gurrin L, Sly P. 1998. Extending the simple linear regression
model to account for correlated responses: An introduction to generalized estimating equations and multi-level mixed modelling. Stat Med
17:1261–1291.
Checkoway H, Pearce NE, Crawford-Brown DJ. 1986. Research
methods in occupational epidemiology. New York: Oxford University
Press. 344 p.
Cochran WG. 1968. Errors of measurement in statistics. Technometrics
10:637–666.
Copeland KT, Checkoway H, McMichael AJ, Holbrook RH. 1977.
Bias due to misclassification in the estimation of relative risk. Am
J Epidemiol 105:488–495.
Dement JM, Harris RL, Symons MJ, Shy CM. 1983. Exposure and
mortality among chrysotile workers. Part I: Exposure estimates. Am
J Ind Med 4:399–419.
Flegal KM, Keyl PM, Nieto FJ. 1991. Differential misclassification
arising from nondifferential errors in exposure measurement. Am
J Epidemiol 134:1233–1244.
Flynn MR, George DK. 1996. A field evaluation of a mathematical
model to predict worker exposure to solvent vapors. Appl Occup
Environ Hyg 11:1212–1216.
Greenland S. 1980. The effect of misclassification in the presence of
covariates. Am J Epidemiol 112:564–569.
Heederik D, Attfield M. 2000. Characterization of dust exposure for
the study of chronic occupational lung disease: A comparison of
different exposure assessment strategies. Am J Epidemiol 151:
982–990.
Heederik D, Kromhout H, Burema J. 1991. Letter to the Editor: Assessment of long-term exposures to toxic substances in air. Ann Occup
Hyg 35:671–673.
Kromhout H, Heederik D. 1995. Occupational epidemiology in the
rubber industry. Implications of exposure variability. Am J Ind Med 27:
171–185.
Kromhout H, Heederik D, Dalderup LM, Kromhout D. 1992.
Performance of two general job-exposure matrices in a study of lung
cancer morbidity in the Zutphen cohort. Am J Epidemiol 136:
698–711.
Kromhout H, Tielemans E, Preller L, Heederik D. 1996. Estimates of
individual dose from current exposure measurements. Occup Hyg 3:
23–39.
Ashford JR. 1958. The design of a long-term sampling programme to
measure the hazard associated with an industrial environment. J R Stat
Soc Series A 121:333–347.
Kromhout H, Loomis DP, Kleckner RC, Savitz DA. 1997. Sensitivity of
the relation between cumulative magnetic field exposure and brain
cancer mortality to choice of monitoring data grouping scheme.
Epidemiology 8:442–445.
Banks RS, Thomas W, Mandel JS, Kaune WT, Wacholder S, Tarone RE,
Linet MS. 2002. Temporal trends and misclassification in residential
60 Hz magnetic field measurements. Bioelectromagnetics 23:196–205.
Kupper LL. 1984. Effects of the use of unreliable surrogate variables on
the validity of epidemiologic research studies. Am J Epidemiol 120:
643–648.
Last JM. 2001. A dictionary of epidemiology. Oxford: Oxford
University Press. 141 p.
Linet MS, Hatch EE, Kleinerman RA, Robison LL, Kaune WT,
Friedman DR, Severson RK, Haines CM, Hartsock CT, Niwa S,
Wacholder S, Tarone RE. 1997. Residential exposure to magnetic
fields and acute lymphoblastic leukemia in children. N Engl J Med 337:
1–7.
Liu K, Stamler JA, Dyer A, McKeever J, McKeever P. 1978. Statistical
methods to assess and minimize the role of intra individual variability in
obscuring the relationship between dietary lipids and serum cholesterol.
J Chron Dis 31:399–418.
London SJ, Thomas DC, Bowman JD, Sobel E, Cheng TC, Peters JM.
1991. Exposure to residential electric and magnetic fields and risk of
childhood leukemia. Am J Epidemiol 134:923–937.
Loomis D, Lagorio S, Salvan A, Comba P. 1999. Update of evidence on
the association of childhood leukemia and 50/60 Hz magnetic field
exposure. J Expo Anal Environ Epidemiol 9:99–105.
McKeown-Eyssen GE, Thomas DC. 1985. Sample size determination in
case-control studies: The influence of the distribution of exposure.
J Chron Dis 122:55–61.
National Research Council. 1997. Possible health effects of exposure to
residential electric and magnetic fields. Washington: National Academy
Press.
Nylander-French LA, Kupper LL, Rappaport SM. 1999. An investigation of factors contributing to styrene and styrene-7,8-oxide exposures
in the reinforced-plastics industry. Ann Occup Hyg 43:99–105.
Oldham PD, Roach SA. 1952. A sampling procedure for measuring
industrial dust exposure. Br J Ind Med 9:112–119.
Phillips AN, Smith GD. 1993. The design of prospective epidemiological studies: More subjects or better measurements? J Clin Epidemiol
46:1203–1211.
Post WK, Heederik D, Kromhout H, Kromhout D. 1994. Occupational
exposures estimated by a population specific job exposure matrix and 25
year incidence rate of chronic nonspecific lung disease (CNSLD): The
Zutphen Study. Eur Respir J 7:1048–1055.
Rappaport SM, Symanski E, Yager JW, Kupper LL. 1995. The relationship between environmental monitoring and biological markers in
exposure assessment. Environ Health Perspect 103(Suppl 3):49–54.
Sanderson WT, Ward EM, Steenland K, Petersen MR. 2001. Lung
cancer case-control study of beryllium workers. Am J Ind Med 39:
133–144.
Savitz DA, Wachtel H, Barnes FA, John EM, Tvrdik J. 1988. Case-control study of childhood cancer and exposure to 60 Hz magnetic
fields. Am J Epidemiol 128:21–38.
Seixas NS, Sheppard L. 1996. Maximizing accuracy and precision using
individual and grouped exposure assessments. Scand J Work Environ
Health 22:94–101.
Tielemans E, Kupper LL, Kromhout H, Heederik D, Houba R. 1998.
Individual-based and group-based occupational exposure assessment:
Some equations to evaluate different strategies. Ann Occup Hyg 42:
115–119.
Van der Woord MP, Kromhout H, Barregard L, Jonsson P. 2000. Within-day variability of magnetic fields among electric utility workers:
Consequences for measurement strategies. Am Ind Hyg Assoc J 61:
31–38.
van Tongeren MJA, Kromhout H, Gardiner K, Calvert IA, Harrington
JM. 1999. Assessment of the sensitivity of the relation between current
exposure to carbon black and lung function parameters when using
different grouping schemes. Am J Ind Med 36:548–556.
Vermeulen R, Talaska G, Schumann B, Bos RP, Rothman N, Kromhout
H. 2002. Urothelial cell DNA adducts in rubber workers. Environ Mol
Mutagen 39:306–313.
Wacholder S. 1995. When measurement errors correlate with truth:
Surprising effects of nondifferential misclassification. Epidemiology
6:157–161.
Wameling A, Schaper M, Kunert J, Blaszkewicz M, van Thriel C,
Zupanic M, Seeber A. 2000. Individual toluene exposure in rotary
printing: Increasing accuracy of estimation by linear models based on
protocols of daily activity and other measures. Biometrics 56:1218–
1221.
Preller L, Kromhout H, Heederik D, Tielen MJ. 1995. Modelling
chronic exposure in occupational exposure–response analysis. Scand
J Work Environ Health 21:504–512.
White E, Kushi LH, Pepe MS. 1994. The effect of exposure variance
and exposure measurement error on study sample size: Implications
for the design of epidemiologic studies. J Clin Epidemiol 47:873–
880.
Rappaport SM. 1991. Assessment of long-term exposures to toxic
substances in air. Ann Occup Hyg 35:61–121.
Willett W. 1990. Nutritional epidemiology. New York: Oxford
University Press.