Exposure Measurement in Cohort Studies: The

Epidemiologic Reviews
Copyright © 1998 by The Johns Hopkins University School of Hygiene and Public Health
All rights reserved
Vol. 20, No. 1
Printed in U.S.A.
Exposure Measurement in Cohort Studies: The Challenges of Prospective
Data Collection
Emily White,1' 2 Julie R. Hunt,1 and Deborah Casso 1.
2
INTRODUCTION
be the primary concern, cost considerations often limit
what is practicable.
Collection of accurate exposure information on
study participants is a major objective of all epidemiologic studies. The term exposure is used here to
include not only exposures to exogenous agents that
may be causally related to disease, but also socioeconomic factors, health habits, endogenous factors, and
factors that may confound or modify the primary
exposure-disease relation. Because many of the issues
in exposure measurement are similar for case-control
and cohort studies (1-3), this presentation attempts to
focus on the unique challenges in measurement that
arise from certain characteristics of many (but not all)
cohort studies; these include the prospective, repeated
exposure assessment and the larger sample size, which
lead to greater need for cost considerations in data
collection. We review methods of exposure measurement in cohort studies and issues of data collection
costs, including selection of cost-efficient measurement methods and sampling approaches to reduce
costs. Issues relating to prospective data collection are
covered, including choice of a reference period, frequency of exposure updates, changes in the instrument
over time, and use of repeated measures. The final
focus is on sources of exposure measurement error,
and approaches to reduce such measurement error
through the design, execution, and analysis of the
study. These are key issues, because exposure measurement error can lead to substantial bias in the
estimated relative risk for the exposure-disease relation (1, 4, 5).
Selection of a measurement method
A large range of exposure measurement methods
have been used in prospective cohort studies (examples are presented in table 1). Prospective cohort studies have used mailed self-administered questionnaires,
interviewer-administered questionnaires, measures of
blood or other tissues, physical measures, medical
tests (such as electrocardiography), use of medical or
other exposure records, and/or measures of the environment (e.g., air or water sampling) (table 1). Many
studies use multiple methods. For example, the Cardiovascular Health Study uses home interviews and
clinic visits, which include interviewer- and selfadministered questionnaires, physical tests (for anthropometries), blood measures, medical tests (blood
pressure, electrocardiography, ultrasonography), and
abstraction of prescription labels of medication use
(27). In the study of Colorado Plateau uranium miners,
Wagoner et al. (9) used physical examinations and
detailed occupational questionnaires, as well as longitudinal measures of environmental radiation from each
mine. Storage of blood or other sources of DNA for
future studies of biomarkers has become an important
component of several large cohort studies, including
the Women's Health Initiative Observational Study
(28) and the Nurses' Health Study (15, 16). Diaries
(e.g., food records) have rarely been used in cohort
studies because of the costs of training subjects and of
coding the diary information. However, diaries could
be more cost effective if the coding were limited to a
subsample of participants (see "Measurement of exposures on a sample of the full cohort to reduce costs"
below).
A major consideration in the selection of a method
is the validity of the method. Although issues surrounding validity are often specific for each type of
exposure, one general issue is whether a current measure (or prospectively repeated measures of current
exposure) is appropriate or whether a measure of past
MEASUREMENT METHODS IN
COHORT STUDIES
The first measurement issue in designing a cohort
study is usually the selection of the data collection
method(s). While selecting an accurate method should
Received for publication August 11, 1997, and accepted for
publication April 1, 1998.
1
Fred Hutchinson Cancer Research Center, Seattle, WA.
department of Epidemiology, University of Washington, Seattle,
WA.
43
TABLE 1. Examples of exposure assessment methods in selected prospective cohort studies
Follow-up exposure
Instruments*
Baseline exposure
instruments*
Study
(relerence(s))
Year
begun
The Framlngham Heart Study (6-8)
1948
Risk factors for cardiovascular disease
Residents of Framingham,
MA, ages 28-62
years
5,209 Interview, clinic examination (PE, lab, tests)
Interview, clinic examination (PE, lab tests)
Every 2 years
Colorado Plateau Uranium Miners
Study (9)
1950
Occupational risk factors
for cancer
White, male underground
uranium miners,
Colorado Plateau
3,415 Clinic examination (PE),
questionnaire,
environmental
measures (airborne
radiation)
Questionnaire 0n-person
or mailed), environmental measures
(airborne radiation)
Triennial
American Cancer Society: Cancer
Prevention Study 1(10)
1959 Cigarette smoking and
cancer mortality
Self-administered
questionnaire
Every 2 years
Tha Alameda County Study (11)
1965
Mailed questionnaire;
telephone Interview
or home interview
of nonrespondents
At years 9,16
Honolulu Heart Program (12)
1965 Coronary heart disease
and stroke In men of
Japanese ancestry
Men of Japanese ancestry
living on Oahu, HI,
ages 45-65 years
The Oral Contraception Study of
the Royal College of General
Practitioners (13,14)
1968 Oral contraceptive use
and cancer
Married premenopausal
British women
Same
Every 6 month
Nurses' Health Study (15,16)
1976
Mailed questionnaire
Lab (toenall sample)
Lab (home blood draw)
Every 2 years
At year 6
At year 13
Port Plrle Cohort Study (17,18)
1979
Lab (blood samples)
At 6,15, and
annually u
and at 11
Multlcenter AIDS Cohort Study
(19-20)
Clinic examination (PE,
tab), seli-admlnlslered
questionnaire,
interview
Clinic examination (PE,
lab, tests), selfadministered
questionnaire,
Interview
Every 6 month
At years 2, 5,
Lab
Annual
Main
focus
Factors associated with
health and mortality
Study
population
US men and women aged
30 years and older
Residents ot Alameda
County, CA, ages
16-94 years
Originally oral contraceptive use and
cancer, expanded to
women's health
Lead exposure and child
development
Married female US
registered nurses ages
30-55 years
1984
Risk factors for HIVt In
gay men
US homosexual men
ages 18-70 years
Coronary Artery Risk Development
In Young Adults (CARDIA)
(21,22)
1985
Risk factors for coronary
heart disease In
young adults
Black and white US men
and women ages
18-30 years
New York University Women's
Health Study (23)
1985
Iowa Women's Study (24)
Endogenous hormones
and risk ot breast
cancer
1986 Cancer In women
Infants bom In Port Plrle,
South Australia,
1979-1982
New York women ages
35-65 years
Iowa women ages 55-69
years
Size
1,045,087 Self-administered
questionnaire
delivered by
volunteers
6,928 Mailed questionnaire;
telephone Interview or
home Interview of
nonrespondents
Frequ
exposure
8,006 Mailed questionnaire,
interview, clinic
examination (PE,
lab, tests)
47,000 Medical form completed
by physician based
on patient interview
or medical record
121,700 Mailed questionnaire
723 Lab (blood samples from
pregnant mother and
umbilical cord at
birth
4,954 Clinic examination (PE,
lab), setf-admlnistered
questionnaire,
interview
5,115 Telephone Interview,
clinic examination (PE,
lab, tests), selfadministered
questionnaire,
Interview
14,291 Lab, self-administered
questionnaire
41,837 Mailed questionnaire
with self body girth
measures
None
Exposure Measurement
45
exposure is needed (see the section on Developing
an exposure measure—time considerations, below).
Blood measures, environmental measures, and physical measures generally measure only current exposure.
However, if the characteristic is fixed (e.g., inherited
genetic traits) or reasonably stable over long periods of
time, then a current measure can be a good measure of
past exposure.
Cost considerations
:§.
3
2
in
g
f
.C
If
— ZQ
jai.
Epidemiol Rev Vol. 20, No. 1, 1998
For large cohort studies, cost is a major concern.
The two common approaches to low-cost cohort studies are record-linkage studies, in which records of
exposure are linked to disease records, and studies
which recruit a cohort and collect exposures by mailed
questionnaires.
Numerous studies have been published in which
medical, pharmacy, educational, occupational, or other
records of exposure have been used (29-31), see the
article in this issue by Howe (32) for other examples.
Obtaining records is usually inexpensive even compared with mailed questionnaires, particularly if the
cohort is selected to facilitate the availability of
records (such as members of a health maintainance
organization) or if the records are computerized.
Moreover, use of records often allows the conduct of
a cohort study in which the exposure data collection
has occurred in the past (retrospective cohort study)
(2). Because the exposures are recorded for each subject before disease onset, retrospective cohort studies
generally have the advantages of prospective cohort
studies, including less susceptibility to differential and
nondifferential measurement error and to selection
bias. However, records are often not available for the
exposure of interest or are not sufficiently accurate.
Even when records are available with reasonably accurate measures of the primary exposure, information on
potential confounders is often poorly measured or is
missing (1).
Mailed questionnaires are generally the least expensive of the other available methods (33, 34), and have
been used as the primary exposure method in several
large cohort studies, including the Nurses' Health
Study and the Iowa Women's Study (15, 16, 24).
Machine reading of questionnaires using "mark-sense"
or fax (digitization) technology further reduces
costs. However, mailed questionnaires generally
must be shorter and less complex than intervieweradministered questionnaires. For example, complex
skip patterns must be avoided on mailed questionnaires.
Other innovative methods have been used to reduce
the costs of large cohort studies. In some studies, the
subjects themselves have served as data collectors for
46
White et al.
physical measures or biologic specimens, tasks that
usually are conducted by study staff. In the Nurses'
Health Study, the subjects were mailed materials and
instructions to provide a blood sample, and were also
asked for toenail samples (to assess selenium exposure) (15). In the Oral Contraceptive Study of the
Royal College of General Practitioners, data collection
costs were reduced by soliciting physicians to recruit
subjects and be the primary informants on subjects'
exposures (oral contraceptive use) (13, 14). In the
American Cancer Society Cohort Studies, volunteers
recruited and delivered the study questionnaire to participants (10).
Measurement of exposures on a sample of the
full cohort to reduce costs
Another approach to reduce data collection costs in
cohort studies is to collect or have available the research material for all cohort members, but to measure
the exposures on less than the full cohort (2, 35-38).
For example, medical records might be available for
the full cohort but only abstracted on a sample of
cohort members, or stored blood might be available
for the full cohort but only analyzed for a subsample.
Several types of sampling designs can be used, each of
which has advantages and limitations (2, 35-38). In a
nested case-control design, cases are compared with a
sample of controls free of disease at the time of the
case's diagnosis (39, 40). In a two-phase design, a
proportion of each of the four disease-by-exposure
groups are sampled (41, 42). In a case-cohort design,
a random subsample of the cohort is selected at baseline, and is compared with those who develop the
disease of interest (43). This latter design allows the
cohort subsample to be compared with more than one
disease group. These designs can lead to reduced exposure measurement costs with a relatively small loss
in study power.
DEVELOPING AN EXPOSURE MEASURETIME CONSIDERATIONS
Dose and time
After the exposure measurement method has been
selected (e.g., self-administered questionnaire for
physical activity), the actual instrument is selected
from among those available, or a new instrument is
developed. Most of the considerations in selecting or
developing an instrument are similar for cohort and
case-control studies. Briefly, these begin with conceptualization of the actual "active agent" that is hypothesized to be causally related to disease, identification
of those exposures which do (and do not) include the
agent, and conceptualization of the best construct(s) of
"dose" (e.g., cumulative dose of the active agent,
cumulative dose above some threshold of intensity,
etc.) (1). Then the researcher identifies the items
needed to capture the duration, frequency, and intensity of exposure over a specified time period, and
develops the algorithm for the analytic exposure variable^) which combines these items to yield the exposure dose construct that is expected to be causally
related to disease. Conversion tables are often used in
the algorithm to quantify the intensity of exposure into
units of the active agent (e.g., to convert each type of
physical exercise into kilocalorie expenditure). The
final step is the actual phrasing of the questions for
each item. Most of these issues have been covered in
the context of measurement of specific exposures, e.g.,
diet (44), physical activity (45), drugs (46), occupational exposures (47), laboratory-based measurements
(48, 49), and in books and articles on questionnaire
design (50-53).
Because cohort studies are often prospective, issues
relating to the time period of importance in the
exposure-disease relation deserve particular consideration. Several authors discuss the importance and
methods of considering the time period of exposure in
addition to the dose of exposure in evaluating the
effect of exposure on disease occurrence (54-58). One
of the primary ways in which the dose-time effect of
exposure on disease risk can be assessed is by computing cumulative dose (or average dose) only over the
time period thought to be of etiologic importance (54)
(see Thomas (56) for other analytic approaches). Considering the etiologically important time period can
help answer questions that arise in the design of cohort
studies, such as:
• Should current or past exposures be measured?
• How often should the measure of exposure be
repeated prospectively (e.g., repeat questions on
exercise or repeat blood draws)?
• How should exposures that change over time be
handled (e.g., how does one classify a subject
who was on hormone replacement therapy at
baseline but not at the 2-year follow-up)?
Additionally, one must consider how to handle
changes in the instrument over time. Changes might be
necessary if there are new sources of the exposure
(e.g., new forms of hormone replacement therapy) or
if the instrument can be improved.
Etiologically relevant time window
Rothman (54) provides a structure with which to
view the relation between timing of exposure and the
occurrence of disease. Specifically, if there is a series
of component causes which must occur in a sequence
Epidemiol Rev Vol. 20, No. 1, 1998
Exposure Measurement
47
figure 1). In practice, for each exposure variable to be
calculated (e.g., cumulative dose of hormone replacement therapy), a reference time period needs to be
specified. This reference time period is selected based
on what is known about the natural history of the
disease, i.e., it should reflect the hypothesized etiologically relevant time period. The reference period could
be lifetime (up to diagnosis), a physiologic time period
(e.g., exposure during adolescence), or could be measured backward from the time of the diagnosis of
disease (e.g., the 20 year period ending 5 years before
diagnosis). This latter type of reference period is easier
to implement in case-control studies, because the date
of diagnosis of cases is known at the time of data
collection, while in cohort studies, the date of diagnosis is only known after the (prediagnostic) exposure
information has been collected. For exposures with
long induction and/or latent periods, if the etiologically important time period does not overlap with the
study time period, then the best measure of exposure
will be retrospective exposure information collected at
baseline only. In these situations, the reference period,
by necessity, is expressed in relation to baseline, e.g.,
cumulative dose of aspirin over the 10 years before
baseline. For other exposures, the retrospective information from baseline would need to be updated with
the prospective information from follow-up contacts to
capture exposure during the etiologically relevant time
period. For example, to compute hormone replacement
therapy use for the 10-year period ending, say, 3 years
prior to diagnosis of disease, one would need to com-
to lead to disease, then there would be a limited time
period during which a given exposure is causally related to disease, i.e., an etiologically relevant time
window (top line of figure 1). After the action of that
exposure is complete, then an induction period ensues
during which any remaining component causes in the
etiologic sequence occur. The length of the induction
period therefore varies for each exposure related to the
disease, with those acting early in the causal sequence
having a long induction period and those late in the
causal sequence a short induction period. The induction period ends when the disease irreversibly begins,
and that point to diagnosis is the latent period. Thus, in
theory, there is an etiologically relevant time window
during which the exposure of interest is causally related to the disease. Further exposure during the induction and latent periods does not contribute to etiology, and including exposure before or after the
etiologically important time window, leads to exposure measurement error.
Selecting the reference period and frequency of
data collection
The limits of knowledge about the biology of most
diseases precludes a precise definition of the etiologically relevant time period. Nonetheless, consideration
of this concept can provide insight into the appropriate
time interval during which to collect exposure data
(middle line of figure 1), and the appropriate time
period over which the exposure dose variable should
be calculated (the reference period) (bottom line of
Causal sequence for a specific exposure and disease :*,t
Exposure
Action of exposure
complete
complete
-4V
-H
l+i-H
Action of all
causes complete
causes
complete
11Disease
111begins
11
Symptoms
Disease
diagnosis
1-
4Etiologically relevant
time window
for that exposure
Induction
period
Latent
period
Timeline of cohort study data collection:
h
Baseline
Update
Yearl
Update
Year 2
Update
Year 3
Update
Year 4
Reference period for calculation of exposure dose:*
Reference period
FIGURE 1. Timeline showing the relation between causal exposure-disease sequence, timeline of a cohort study, and reference period for
calculation of exposure dose. *Examp!e for a subject who developed the disease endpoint. tHashmarks equal episodes of exposure.
Epidemiol Rev
Vol. 20, No. 1, 1998
48
White et al.
bine information from baseline with that from, say,
yearly updates ending at 3 years before diagnosis (as
shown in figure 1). Or, if the exposure acts late in the
causal sequence, e.g., aspirin or other anticoagulant
drugs in relation to myocardial infarction, then the
most recent exposure update(s) before the disease
event could be the best exposure measure.
Consideration of the etiologically relevant time period helps answer questions about collection of current
versus past exposure data, the frequency of prospective data collection, and the reference period to be
used in constructing the dose variable. It is important
to note that it is the natural history of the exposuredisease relation that provides answers to these questions, not the "natural history" of the cohort study
design. Thus, the advantages of cohort studies, in
particular the ability to accurately measure current
exposures (e.g., current physical activity, current
blood measures) and to measure them prospectively
over time are only advantages if they more accurately
capture the true (etiologic) exposure. While valid measurement of current physical activity may seem to be
an advantage compared with poorly recalled adult
lifetime activity, it is quite possible that recalled adult
activity is a better measure of the true exposure (e.g.,
actual lifetime adult activity) than is current activity.
Similarly, annual prospective data collection for, say,
10 years would only improve exposure measurement if
these years included the etiologically important time
window.
Of course, the choice of the frequency of data collection often depends on other study requirements in
addition to capturing exposure dose over the reference
period. One must consider the frequency needed to
obtain accurate endpoint data (if endpoint data are
ascertained from the subjects). Frequent (e.g., yearly)
contact would usually be needed in order to not lose
endpoint information for conditions that lead to major
morbidity and mortality. Frequent contact is also
needed to keep up-to-date information that allows
tracking of subjects' whereabouts and may help to
keep subjects feeling committed to the research project
(see the article in this issue by Hunt and White (59)).
Dose and time considerations when there are
multiple disease endpoints
One challenge of cohort studies is that there may be
multiple disease endpoints, and the appropriate active
agent, dose construct, or etiologically relevant time
window for a type of exposure could differ for different diseases. For example, the type of physical activity
that influences cardiovascular disease risk may differ
from the nature of activity that influences colon cancer
risk. Colon cancer might be influenced by even low
levels of intensity (which might reduce bowel transit
time), while a physical activity measure for cardiovascular disease may need to include only the amount of
activity of sufficient intensity to have a cardiovascular
effect. Similarly, recent use of hormone replacement
therapy primarily affects endometrial cancer risk,
while only long-term use may affect breast cancer risk.
For cohort studies with multiple disease endpoints,
instruments must incorporate the items necessary to
compute the different exposure dose variables and/or
the different etiologically relevant time windows for
the various disease outcomes.
Changing the instrument over time
As a cohort study continues over time, new exposures may be added at exposure updates. For example,
a food frequency questionnaire was added to the
Nurses' Health Study in year 4, and has been used to
assess diet-disease relations for subsequent disease
events (15).
A greater challenge in prospective cohort studies is
that there may be a need to change the measurement of
a specific exposure over time (e.g., one might need to
change the way in which dietary fat intake is estimated). While in principle one should never change the
measure of an exposure during a study, there are
situations for which this may be necessary. First, the
available sources of the exposure can change (e.g.,
new types of hormone replacement therapy have continually been introduced over the last 20 years) or the
content of existing products can change (e.g., the fat
content of beef has declined over the last several
decades). When the external environment or "market
place" changes, changes in the questionnaire are
needed to continue to accurately measure the exposure. These changes do not necessarily improve the
accuracy of the exposure measure; rather, not making
the changes would decrease the accuracy. Therefore,
such changes should be made to the instrument or to
the conversion tables (e.g., fat content of foods) used
in the algorithm to create the exposure dose variable.
Creating and interpreting a cumulative dose over the
reference time period should not be hindered by these
types of changes to the instrument. For example, at the
exposure updates there may be more types of hormone
replacement therapy on the questionnaire which enter
into calculation of, say, "cumulative years of hormone
replacement therapy use," but this would not change
the interpretation of the exposure variable.
The other type of change in exposure measurement
over time would be a change to improve the accuracy
of the measure. One way that this happens is that new
technology leads to better measures (e.g., a better
measure of bone mineral density becomes available).
Epidemiol Rev Vol. 20, No. 1, 1998
Exposure Measurement
Another way this occurs, unfortunately, is that the
researcher discovers errors or omissions on the original instrument. The decision about whether to change
to a more precise measure depends on several factors.
First, if one can redo all past measures from stored
data or stored specimens using the newer instrument or
algorithm, then this should be done. For example, if
blood has been stored, it can be reanalyzed using a
new method even for older specimens. Or, if a new
table for /3-carotene in foods becomes available due to
better techniques for measuring /3-carotene, then one
can use the originally collected food frequency questionnaire data and recreate the exposure variable for
dietary intake of j3-carotene. For situations such as
these, there is no reason to keep an older, less accurate
exposure measure.
It is most challenging to decide whether to change
an exposure measure when it will mean that, for some
subjects, or at some data collection time points, the
exposure assessment is more accurate than others. In
the Framingham study, for example, the measure of
serum cholesterol became more precise by the second
examination, and questions on cigarette smoking were
modified over time to improve accuracy (6). These
changes meant that, for example, serum cholesterol
measured at baseline was less accurate than that measured at the year 2 clinic visit. One way to deal with
two versions of a measure is to drop the earlier, less
accurate measure. If the exposure measurement
changed at, say, year 4, one could only use the new
instrument and only the disease events that occur after
year 4. This, however, would reduce study power by
loss of the early endpoints. Little has been written
about incorporating the use of two different methods
for the same exposure in a study. One approach is to
stratify the analysis by the instrument used; then one
would expect that the relative risk based on the more
accurate exposure instrument would be less attenuated
due to measurement error than that based on the less
accurate instrument. This approach has been suggested
in the context of case-control studies which must use
proxy respondents for some subjects (60, 61).
When the instrument has changed in a cohort study,
it is important to not induce differential error by using
the two instruments differentially for diseased and
control subjects. Specifically, if information from the
earlier instrument is to be used for the early cases (e.g.,
those who develop the disease before the new instrument was used), or if information from the earlier
instrument is combined with the improved version
(e.g., to compute cumulative dose), the analytic techniques must be such that exposures from those with
and without the disease outcome are compared on the
same instrument or combination of instruments.
Epidemiol Rev Vol. 20, No. 1, 1998
49
MEASUREMENT ERROR IN COHORT STUDIESSOURCES AND EFFECTS
Measurement error and its effects
Exposure measurement error for an individual is
defined as the difference between a subject's measured
exposure (X) and his/her true exposure (T). The true
exposure is the agent of interest; for example, the
active agent hypothesized to cause the disease expressed using the appropriate dose construct and covering the etiologically relevant time period. Validity or
accuracy refers to the capacity of an exposure measure
to capture the true exposure in the population of interest; measures of validity are measures of the exposure measurement error in the population. For a
continuous variable, validity is measured by the systematic bias, i.e., the difference between the average
measured exposure in the study population and the
average true exposure (X — T) and the validity coefficient (prx), i.e., the correlation of the measured exposure with the true exposure (1, 4, 5). The validity
coefficient is inversely related to the ratio of the exposure error variance to the true exposure variance.
Even though validity often cannot be measured (because there is rarely a gold standard measure of the
true exposure), it is an important concept because the
validity of the exposure measure influences the validity of the results of an epidemiologic study.
Differential measurement error is a major concern
when the systematic bias differs between those with
and without the outcome; for example, when those
with the disease of interest have their exposure overestimated on average compared with those without the
disease. Differential error can lead to uninterpretable
study results because the observed relative risk can be
biased toward the null, away from the null, or cross
over the null value compared with the true relative
risk. Measurement error is nondifferential when the
systematic bias and the validity coefficient are the
same for those with and without the disease. Under
simplifying assumptions, when there is nondifferential
measurement error, the observed relative risk (RRO) is
biased toward the null value compared with the true
relative risk (RR r ) by the equation (1, 62):
RRO = R R / " .
(1)
Thus a true relative risk of 4 would be attenuated to an
observed relative risk of 1.4 if the validity coefficient
were 0.5. Note that only p ^ and not the systematic
bias enters this equation.
Sources of measurement error
Cohort studies generally have several advantages
that have the potential to reduce both differential ex-
50
White et al.
posure measurement error and nondifferential measurement error. Exposure assessments (including
questionnaire, blood, air sampling, etc.) in cohort studies generally are made closer to the time of the true,
etiologic exposure than in case-control studies, which
should improve the validity of the exposures measured
in cohort studies compared with case-control studies.
Cohort studies often include repeated exposure measures, which allows for a more accurate measure of
exposure during different time periods. Moreover, cohort studies are much less prone to differential measurement error because, in most cohort designs, exposures are measured before disease occurrence. This
reduces the likelihood that knowledge about the disease or the physical and psychologic effects of the
disease would lead to different validity of the exposure
between those with and without disease. These advantages generally apply to both prospective and retrospective cohort designs.
Despite these advantages, cohort studies are subject
to many sources of measurement error. Errors can be
TABLE 2.
introduced during any phase of the study (examples
are given in table 2). As implied above, errors can
arise from the design of the instrument (i.e., the questionnaire or laboratory procedure to measure the exposure) through failure to correctly specify the specific
causative agent, the etiologic important time window,
and the appropriate construct of dose (i.e., the appropriate way to incorporate intensity, frequency, and
time period of the exposure) for a given disease outcome. Measurement error also arises from errors or
omissions in the protocol for use of the instrument and
from poor execution of the protocol during data collection, e.g., failure of subjects to follow instructions
on self-administered questionnaires or poor handling
of biologic specimens. Errors can also arise from limitations due to inherent subject characteristics; these
include memory limitations, social desirability bias,
and month-to-month variability in biologic characteristics. Some sources of error may be greater in cohort
studies than in case-control studies because of the
usual longer time frame of cohort studies. In particu-
Sources of measurement error in cohort studies*
Errors in the selection or design of the instrument to measure the exposure
Lack of coverage of all sources of the active agent by the instrument
Inclusion of exposures that do not have the actual active agent
Time period assessed by the instrument is not the etiologically important time period
Phrasing of questions that lead to misunderstanding or bias
Errors or omissions in the protocol for use of the instrument
Failure to specify the protocol (including instructions to the subject) in sufficient detail
Failure to specify a method to handle unanticipated situations consistently
Poor execution of the study protocol
Failure of the data collectors or laboratory technicians to follow the protocol in the same manner for all
subjects
Failure of the subjects to read or understand the instructions in self-administered questionnaires
Improper handling or storage of biologic specimens
Limitations due to inherent subject characteristics
Memory limitations of subjects including poor recall of past exposures and influence of recent exposures
on memory of past exposures
Tendency of subjects to overreport socially desirable behaviors (e.g., exercise) and underreport socially
undesirable or sensitive behaviors (e.g., induced abortion)
Short-term (month to month) variability in biologic characteristics.
Effect of the outcome under study on the exposure measures (e.g., effect of preclinical disease)
Drift in accuracy of exposure measures over time
Change in instrument over time
Interviewer fatigue
Failure to standardize laboratory method periodically
Degradation of agent in biologic specimens
Errors during data processing and in the creation of the exposure variables
Data entry errors
Errors in conversion tables used to convert subject responses to units of active agent (e.g., p-carotene by
type of food)
Development of an exposure variable algorithm that does not express the etiologically important dose or
time period
* Adapted from Armstrong et al. (1).
Epidemiol Rev Vol. 20, No. 1, 1998
Exposure Measurement
lar, the accuracy of the instrument could drift, over
time, due to failure to standardize the laboratory instrument, fatigue among interviewers, or degradation
of the active agent in biologic specimens. Finally,
errors enter at the time of data analysis, including key
entry errors and errors in conversion tables used to
convert subjects' responses to units of the active agent
(e.g., errors in the nutrient database used to convert
foods to international units of vitamin A).
In cohort studies, most of the sources of error described above would lead to nondifferential error;
nonetheless, these and others can be sources of differential error. Major sources would be the effects of the
biologic changes during the prediagnostic phase of
disease on biologic measures (e.g., undetected bleeding may lead to lower iron status during the prediagnostic phase of colon cancer), the influences of prediagnostic symptoms on a subject's behavior (e.g.,
gastrointestinal symptoms may lead to reduced fiber
intake before diagnosis of colon cancer), and the effects of the subject's risk of the disease on accuracy of
reporting (e.g., those with strong family history of
colon cancer may be both more accurate in reporting
family history and at higher risk of colon cancer).
METHODS FOR REDUCING MEASUREMENT
ERROR IN COHORT STUIDES
The primary method to reduce measurement error is
the careful selection or construction of the exposure
instrument; several issues related to the design of
accurate instruments have been reviewed above. Other
approaches to reduce measurement error include judicious selection of a study population to improve measurements, use of repeated measures of exposure, and
rigorous quality control procedures throughout the
conduct of the study. Some steps also can be taken at
the time of analysis or interpretation of study results—
these include removing outcome events that occur
soon after the measurements were taken and the use of
information on the accuracy of the exposure measure
to "correct" the results. Each of these topics is discussed in more detail below, with emphasis on those
issues particularly relevant to cohort studies.
Selection of a cohort to reduce
measurement error
In cohort studies, the investigator often has more
options in selection of a study group(s) than in casecontrol studies, which are usually population-based.
Several researchers have taken advantage of this by
selecting a study cohort that could lead to reduced
measurement error. The Nurses' Health Study is an
excellent example of a study in which a specific cohort
was selected because the cohort members were exEpidemiol Rev Vol. 20, No. 1, 1998
51
pected to provide more accurate information on exposures and outcomes than the general population (15,
63). Female registered nurses were selected to be
studied because they were assumed to be excellent
reporters of exogenous hormone use, which was the
original study exposure, and disease outcomes.
Several authors have proposed another approach by
which selection of the cohort(s) would reduce measurement error. By selecting a population with larger
variation of exposure, the effects of measurement error
are reduced (64). For example, the European Prospective Investigation into Cancer and Nutrition (EPIC)
includes a multinational population which would have
a larger variance of nutrient intake than within a single
country (65). If the variance of the (true) exposure is
greater in one study population than another, and the
variance of the exposure measurement error is the
same, then the effect of the measurement error on
the relative risk is less in the population with the
greater variability of exposure (64). As noted above,
the validity coefficient varies inversely with the ratio
of the exposure error variance to the true exposure
variance. Thus, the validity of the exposure measure
can be increased by increasing the true exposure variance as well as by the usual methods to reduce the
error variance. This approach of selecting a cohort
with larger exposure variance would lead to less attenuation of the relative risk from the effects of nondifferential measurement error. It could also substantially reduce the sample size (or equivalently increase
power for a fixed sample size) unless the study population with the larger exposure variance has a significantly lower disease incidence rate than the alternative choices with smaller exposure variance (64).
At least one of the current large cohort studies has
attempted to artificially create a cohort with a larger
exposure variance through sampling. The American
Association of Retired Persons (AARP) Cohort Study,
being conducted by the National Cancer Institute, is
screening potential cohort members by use of a food
frequency questionnaire (66). Subjects are then selected for cohort entry by sampling more heavily from
those with the more extreme values, e.g., those in the
highest and lowest 10 percent of fat intake. This approach would lead to larger exposure variance than
random sampling of this population.
Use of repeated measurements to reduce
measurement error
Although issues relating to frequency of data collection were discussed above, another related consideration in determining the frequency of exposure measurement is the advantage of averaging (or summing)
two or more measures of the exposure for each sub-
52
White et al.
ject. The approach of averaging repeated measures has
been recommended as an effective method of decreasing the measurement error, compared with the use of a
single measurement (1,2, 67-69). The measures are
typically repeated administrations of the same instrument over time. Use of repeated measures is particularly applicable to prospective cohort studies, because
the measures can be averaged over long time periods
(e.g., years). For example, the New York University
Women's Health Study measured blood hormone concentrations by use of samples collected annually over
5 years (23). On the other hand, the average exposure
only increases the validity of the exposure measure if
the average over that time period is a good expression
of the true exposure.
Fleiss (68) and others (70-72) have shown the degree of improvement in the validity of an exposure
variable that results from the combination of multiple
measures when the errors in the two or more measures
to be averaged are equal and uncorrelated (the parallel
test model). "Equal error" means that each measure is
an equally precise measure of the true exposure, as
would be expected when the same instrument was
used for the repeated measures. "Uncorrelated error"
means that the measurement error on one administration of the instrument would not be correlated with the
error on the second administration, i.e., those subjects
whose first measure overestimated their true exposure
would be equally likely to have the second measure
over- or underestimate their true exposure (beyond any
systematic bias in the instrument that affects all subjects). Often repeated laboratory measures based on
biologic samples collected at multiple times for each
subject (not repeated analysis of the same biologic
sample) can be assumed to meet the criteria of the
parallel test model, if the true exposure is thought to be
the true average over the time period of specimen
collection.
When each individual in a study population is measured k times using repeated parallel measures of
TABLE 3.
exposure variable X, the improvement in validity by
use of the average measure, A, for each individual
(versus use of only one measure of X) can be calculated. The correlation of the average with the true
exposure, pTA (the validity coefficient of A), is greater
than the correlation of a single measure X with the true
exposure, p-jx (the validity coefficient of X), by
PTA
=
1 +
(* -
\)P2TX
(2)
If p-rx is known from a validity study, then the validity
coefficient for A can be calculated from the above
equation. Table 3 gives examples of the improvement
in validity one can achieve by using multiple parallel
measures. For example, averaging three measures each
with a validity coefficient of 0.5 can yield a new
measure with validity coefficient of 0.71, or averaging
two measures each with a validity coefficient of 0.85
can yield a new exposure measure with a validity
coefficient of 0.92.
Often one does not have a measure of the validity
coefficient of X, p ^ , but, rather, a measure of the
reliability of repeated measures of X, Pxx-. Pxx' m a v be
estimated by the Pearson correlation of two parallel
measures X and X' (or the average Pearson correlation
between pairs of measures, if X is repeated multiple
times) (70, 71). Then the reliability of parallel measures of X, p^', can be used to estimate p ^ (70, 71):
(3)
and Pxx^ can be substituted for in equation 2. Table 3
also gives the improvement in the validity of a measure by use of multiple measures based on values of
Pxx>. For example, parallel measures with a reliability
coefficient of 0.36 suggests that the validity coefficient of one measure is 0.6 and of the average of two
measures would be 0.73.
Improvement in the validity of a measure, by averaging k parallel measures*
PM
PTX
k=2
0.30
0.40
0.50
0.60
0.70
0.80
0.85
0.90
0.09
0.16
0.25
0.36
0.49
0.64
0.72
0.81
0.41
0.53
0.63
0.73
0.81
0.88
0.92
0.95
k=3
0.48
0.60
0.71
0.79
0.86
0.92
0.94
0.96
/f=4
/r=5
0.53
0.66
0.76
0.83
0.89
0.94
0.96
0.97
0.57
0.70
0.79
0.86
0.91
0.95
0.96
0.98
0.70
0.81
0.88
0.92
0.95
0.97
0.98
0.99
* p w is the correlation coefficient between each (parallel) exposure measure, X, and the true exposure,T.
PCT is the correlation between any two (parallel) exposure measures, Xand X' ( p * ^ p^). pTA is the correlation
of A, the average of A-(parallel) exposure measures, and T, the true exposure.
Epidemiol Rev Vol. 20, No. 1, 1998
Exposure Measurement
The primary advantage of averaging two or more
parallel measures is that the increase in the validity of
the exposure measure will reduce the degree of attenuation of the relative risk due to (nondifferential)
measurement error (equation 1), and, therefore, reduce
the required sample size (68).
One disadvantage of the use of repeated measures is
that the cost per subject would increase due to the
repeated study contacts. Thus the total study cost could
be increased or decreased depending on the trade-off
between the reduced sample size and the increased cost
per subject. One way to select k, the number of repeated
measures per subject, is to determine k which would
minimize total costs (68).
The possibility of violation of the assumptions of
the parallel test model must be considered when evaluating the use of multiple measures (1, 2). It may be
that the two or more measures to be averaged are not
equally precise. In this situation, the average of a
precise measure with a less precise measure may result
in a measure which is less valid than the better measure alone. An obvious example is the use of the
average of a perfect measure of exposure and an
imperfect measure.
When the two or more measures to be averaged
have correlated errors, but the other assumptions of
parallel tests hold, the improvement in the validity of
the exposure measure will be less than that predicted
by equation 2 or table 3. If the errors are perfectly
correlated (i.e., the measure is perfectly repeatable,
even though it is not a perfect measure of exposure),
the use of multiple measures will lead to no improvement in validity. Most questionnaire measures would
have some degree of correlated error. For example,
subjects whose fat intake was underestimated on the
administration of a food frequency questionnaire
would tend to have their fat intake underestimated on
the second administration (due to some subjects consistently underreporting high fat foods, failure of the
instrument to include certain high fat foods frequently
eaten by some subjects, etc.). Similarly, repeated analysis of the same biologic specimen would have correlated error, e.g., if a subject's serum cholesterol was
higher than his/her long-term average on the day the
blood was drawn, it would tend to be higher on each
repeated analysis of the sample.
Finally, when the repeated exposure measurements
are not collected uniformly over time, a simple average might not be appropriate. In the Port Pirie Study,
blood lead measures were taken at birth, 6, 15, and 24
months, and then annually. A composite, lifetime measure was computed for each subject as the area under
the time-exposure curve (17,18).
Epidemiol Rev Vol. 20, No. 1, 1998
53
Quality control
Quality control procedures are an important component of exposure measurement, but can only be briefly
mentioned here. When studies are large, as cohort
studies often are, it is even more essential to implement systematic quality control procedures to reduce
measurement error. It is important to have a complete
and detailed study procedures manual, to pretest instruments well, to have standardized training and monitoring of data collectors, and to edit data as they are
collected (1, 73). These and other quality control issues
are covered in the article in this issue by Whitney (74).
Use of information from validity and
reliability studies
The conduct of validity or reliability studies on a
subsample of the cohort has become an important and
common component of many large cohort studies (75—
82). Some cohort studies have incorporated a validity
substudy in which the measured exposure using the
cohort study instrument is compared with a near perfect measure of exposure which is too expensive or too
burdensome to be used in the full study (75). Since
perfectly valid exposure measures are not often available, validity studies are usually not feasible. More
often test-retest reliability studies are conducted in
which repeated applications of the exposure instrument are compared, or relative validity or intermethod
reliability studies are conducted in which the measured exposure is compared with a more accurate (but
less than perfect) comparison measure (76-82). If the
comparison measure is carefully selected so that certain assumptions are met, these types of studies can
also provide some information on the validity of the
measure (1, 70, 71). The design, analysis, and interpretation of validity and reliability studies is beyond
the scope of this presentation, but is covered by others
(1, 44, 68, 70, 72, 83, 84).
One goal for the use of the information from validity
and reliability substudies in cohort studies is to estimate the effects of the inaccuracy of the exposure
measure on the relative risk (38, 69). Development of
statistical techniques which incorporate the information from validation substudies to correct relative risk
estimates for the effects of measurement error is an
active area of methodological research (85-92). Work
is also currently being conducted on the optimal design of the validation substudy (93).
Reducing measurement error at the time of
analysis by dropping early events
As noted above, one source of differential measurement error is the effect of preclinical disease on the
54
White et al.
exposures under study. In case-control studies, the
general approach to removing the effects of symptoms
or preclinical disease on exposures is to have the
reference period end at a fixed time before diagnosis,
e.g., fiber intake could be assessed for a 10 year
reference period ending 2 years prior to diagnosis, and
for a comparable time for controls. Although one
could attempt to simulate this approach in cohort studies, another common approach to remove this source
of differential error is to drop from analysis cases who
develop the disease within a certain time after the
exposure was measured. For example, in a cohort
study of colon cancer and serum cholesterol, removing
the cancer cases that occurred within the first several
years after cholesterol measurement changed the relative risk from less than 1 to close to 1 (94). Thus,
what appeared to be a protective effect of serum cholesterol on cancer risk was more likely an effect of
preclinical colon cancer in reducing serum cholesterol.
Avoiding inducing differential measurement error
during data analysis
Care must be taken to select the appropriate reference period and the appropriate statistical method so
that the exposure measures for those with and without
the disease outcome are computed in a similar manner.
This is a particular concern in cohort studies, because
exposure assessment ends at the outcome event for
those who develop the disease but continues to endof-study for most participants who do not develop the
disease. The control reference period needs to be similar in length and measured over the same chronologic
years as the case reference period to avoid inducing
differential measurement error.
SUMMARY
Cohort study designs have several advantages over
case-control studies in terms of exposure measurement. If exposure measurement occurs before disease
occurrence, cohort studies are much less prone to
differential measurement error. Prospective data collection should also reduce measurement error due to
poor recall of past exposures. The primary drawback
of cohort studies is the large sample size leading to
high data collection costs. Several approaches to reduce such costs have been discussed in this presentation, such as selection of lower cost measurement
methods and fully measuring the exposure only on a
subsample of the cohort (e.g., nested case-control design). However, other innovative approaches to reduce
costs are needed. In addition, study reviewers should
also consider that the higher costs are justified in
relation to the several benefits of this study design,
which include not only less measurement error, but
also less susceptibility to selection bias and often the
ability to study multiple disease outcomes.
Improving the accuracy of exposure measurement is
increasingly important for cohort studies as we move
on to the study of exposures that are difficult to measure or to those with lower relative risks of disease. In
such studies, attenuation of the relative risk by the
effects of measurement error can lead to failure to
detect an association between exposure and disease.
The validity of exposure measurements could be improved by a better understanding of the biologically
active agent and etiologically important time period of
the exposure-disease relation, and by incorporating
these into the measure. Long-term cohort studies
which cover the etiologically relevant time period
could improve the accuracy of measures of exposures
by use of repeated biologic measures or repeated updates of self-reported exposures. Measurement error
also can be reduced by judicious choice of a cohort to
study and by careful attention to quality control procedures. Continued emphasis on the evaluation and
improvement of the measurement properties of instruments used in epidemiologic studies will improve the
validity of the results of cohort studies.
REFERENCES
1. Armstrong BK, White E, Saracci R. Principles of exposure
measurement in epidemiology. New York, NY: Oxford University Press, 1992.
2. Kelsey JL, Whittemore AS, Evans AS, et al. Methods in
observational epidemiology. 2nd ed. New York, NY: Oxford
University Press, 1996.
3. Correa A, Stewart WF, Yen HC, et al. Exposure measurement
in case-control studies: reported methods and recommendations. Epidemiol Rev 1994;16:18-32.
4. de Klerk NH, English DR, Armstrong BK. A review of the
effects of random measurement error on relative risk estimates
in epidemiological studies. Int J Epidemiol 1989;18:705-12.
5. Armstrong BG, Whittemore AS, Howe GR. Analysis of casecontrol data with covariate measurement error: application to
diet and colon cancer. Stat Med 1989;8:1151-63.
6. Gordon T, Kannel WB. The Framingham, Massachusetts
study, twenty years later. In: Kessler II, Levin ML, eds. The
community as an epidemiologic laboratory: a casebook of
community studies. Baltimore, MD: The Johns Hopkins Press,
1970:123-46.
7. Sytkowski PA, D'Agostino RB, Belanger A, et al. Sex and
time trends in cardiovascular disease incidence and mortality:
the Framingham heart study, 1950-1989. Am J Epidemiol
1996; 143:338-50.
8. Dawber TR. The Framinghan study: the epidemiology of
atherosclerotic disease. Cambridge, MA: Harvard University
Press, 1980.
9. Wagoner JK, Archer VE, Lundin FE Jr, et al. Radiation as the
cause of lung cancer among uranium miners. N Engl J Med
1965;273:181-8.
10. Hammond EC. Smoking in relation to death rates of one
million men and women. Natl Cancer Inst Monogr 1966; 19:
127-204.
Epidemiol Rev Vol. 20, No. 1, 1998
Exposure Measurement
11. Kaplan GA, Strawbridge WJ, Cohen RD, et al. Natural history
of leisure-time physical activity and its correlates: associations
with mortality from all causes and cardiovascular disease over
28 years. Am J Epidemiol 1996;144:793-7.
12. Yano K, Reed DM, McGee DL. Ten-year incidence of coronary heart disease in the Honolulu Heart Program: relationship
to biologic and lifestyle characteristics. Am J Epidemiol 1984;
119:653-66.
13. Oral contraceptives and health; an interim report. Royal College of General Practitioners Oral Contraceptive Study. London, England: Pitman, 1974.
14. Kay CR, Hannaford PC. Breast cancer and the pill—a further
report from the Royal College of General Practitioners' Oral
Contraceptive Study. Br J Cancer 1988;58:675-80.
15. Colditz GA, Manson JE, Hankinson SE. The Nurses' Health
Study: 20-year contribution to the understanding of health
among women. J Womens Health 1997;6:49-62.
16. Stampfer MJ, Colditz GA, Willett WC, et al. Postmenopausal
estrogen therapy and cardiovascular disease: ten-year followup from the Nurses* Health Study. N Engl J Med 1991;325:
756-62.
17. Baghurst PA, McMichael AJ, Wigg NR, et al. Environmental
exposure to lead and children's intelligence at the age of seven
years: the Port Pirie Cohort Study. N Engl J Med 1992;327:
1279-84.
18. Tong S, Baghurst P, McMichael A, et al. Lifetime exposure to
environmental lead and children's intelligence at 11-13 years:
the Port Pirie Cohort Study. BMJ 1996;312:1569-75.
19. Dudley J, Jin S, Hoover D, et al. The Multicenter AIDS
Cohort Study: retention after 9'/2 years. Am J Epidemiol
1995;142:323-30.
20. Kaslow RA, Ostrow DG, Detels R, et al. The Multicenter
AIDS Cohort Study: rationale, organization, and selected
characteristics of the participants. Am J Epidemiol 1987; 126:
310-18.
21. Friedman GD, Cutter GR, Donahue RP, et al. CARDIA: study
design, recruitment, and some characteristics of the examined
subjects. J Clin Epidemiol 1988;41:1105-16.
22. Anderssen N, Jacobs DR, Sidney S, et al. Change and secular
trends in physical activity in young adults: a seven-year longitudinal follow-up in the Coronary Artery Risk Development
in Young Adults study (CARDIA). Am J Epidemiol 1996;
143:351-62.
23. Toniolo PG, Levitz M, Zeleniuch-Jacquotte A, et al. A prospective study of endogenous estrogens and breast cancer in
postmenopausal women. J Natl Cancer Inst 1995;87:190-7.
24. Sellers TA, Kushi LH, Potter JD, et al. Effect of family
history, body-fat distribution, and reproductive factors on the
risk of postmenopausal breast cancer. N Engl J Med 1992;
326:1323-9.
25. Cauley JA, Lucas FL, Kuller LH, et al. Bone mineral density
and risk of breast cancer in older women: the study of osteoporotic fractures. Study of the Osteoporotic Fractures Research Group. JAMA 1996,276:1404-8.
26. Browner WS, Pressman AR, Nevitt MC, et al. Mortality
following fractures in older women: the study of osteoporotic
fractures. Arch Intern Med 1996;156:1521-5.
27. Fried LP, Borhani NO, Enright P, et al. The Cardiovascular
Health Study: design and rationale. Ann Epidemiol 1991;1:
263-76.
28. Design of the Women's Health Initiative Clinical Trial and
Observational Study. Women's Health Initiative Study Group.
Control Clin Trials 1998,19:61-109.
29. Infante PF, Rinsky RA, Wagoner JK, et al. Leukemia in
benzene workers. Lancet 1977;2:76-8.
30. Paffenbarger RS Jr, Wing AL, Hyde RT. Physical activity as
an index of heart attack risk in college alumni. Am J Epidemiol 1978; 108:161-75.
31. Rossing MA, Daling JR, Weiss NS, et al. Ovarian tumors in a
cohort of infertile women. N Engl J Med 1994,331:771-6.
32. Howe GR. Use of computerized record linkage in cohort
studies. Epidemiol Rev 1998;20:112-21.
Epidemiol Rev Vol. 20, No. 1, 1998
55
33. Weeks MF, Kulka RA, Lessler JT, et al. Personal versus
telephone surveys for collecting household health data at the
local level. Am J Public Health 1983;73:1389-94.
34. Rolnick SJ, Gross CR, Garrard J, et al. A comparison of
response rate, data quality, and cost in the collection of data on
sexual history and personal behaviours: mail survey approaches and in-person interview. Am J Epidemiol 1989;129:
1052-61.
35. Langholz B, Thomas DC. Nested case-control and case-cohort
methods of sampling from a cohort: a critical comparison.
Am J Epidemiol 1990;131:169-76.
36. Wacholder S, Gail M, Pee D. Selecting an efficient design for
assessing exposure-disease relationships in an assembled cohort. Biometrics 1991;47:63-76.
37. Wacholder S. Practical considerations in choosing between
the case-cohort and nested case-control designs. Epidemiology 1991;2:155-8.
38. Prentice RL. Design issues in cohort studies. Stat Methods
Med Res 1995;4:273-92.
39. Mantel N. Synthetic retrospective studies and related topics.
Biometrics 1973;29:479-86.
40. Prentice RL, Breslow NE. Retrospective studies and failure
time models. Biometrika 1978,65:153-8.
41. White JE. A two stage design for the study of the relationship
between a rare exposure and a rare disease. Am J Epidemiol
1982;115:119-28.
42. Breslow NE, Cain KC. Logistic regression for two-stage casecontrol data. Biometrika 1988;75:11-20.
43. Prentice RL. A case-cohort design for epidemiologic cohort
studies and disease prevention trials. Biometrika 1986;73:
1-11.
44. Willett W. Nutritional epidemiology. New York, NY: Oxford
University Press, 1990:20-33.
45. Laporte RE, Montoye HJ, Caspersen CJ. Assessment of physical activity in epidemiologic research: problems and prospects. Public Health Rep 1985; 100:131-46.
46. Strom BL, ed. Pharmacoepidemiology. 2nd ed. New York,
NY: Wiley, 1994:549-80.
47. Checkoway H, Pearce NE, Crawford-Brown DJ. Research
methods in occupational epidemiology. New York, NY: Oxford University Press, 1989:18-45.
48. Schulte PA, Perera FP, eds. Molecular epidemiology. San
Diego, CA: Academic Press, 1993.
49. Toniolo P, Buffetta P, eds. Methodological issues in the use of
biomarkers in cancer epidemiology. Lyon, France: International Agency for Research on Cancer, 1997. (IARC scientific
publications no. 142).
50. Dillman DA. Mail and telephone surveys: the total design
method. New York, NY: Wiley, 1978.
51. Sudman S, Bradbum NM. Asking questions: a practical guide to
questionnaire design. San Francisco, CA: Jossey-Bass, 1982.
52. Biemer PP, Groves RM, Lyberg LE, et al., eds. Measurement
errors in surveys. New York, NY: Wiley, 1991:237-57.
53. Developing and using questionnaires. Washington, DC: US
General Accounting Office Evaluation and Methodology Division., 1993. (GAO/PEMD-10.1.7).
54. Rothman KJ. Induction and latent periods. Am J Epidemiol
1981;114:253-9.
55. Breslow NE, Day NE. Statistical methods in cancer research.
Volume II: The design and analysis of cohort studies. Lyon,
France: International Agency for Research on Cancer, 1987:
232-70. (IARC publications no. 32).
56. Thomas DC. Models of exposure-time-response relationships
with applications to cancer epidemiology. Annu Rev Public
Health 1988;9:451-82.
57. Checkoway H, Pearce N, Hickey JLS, et al. Latency analysis
in occupational epidemiology. Arch Environ Health 199O;45:
95-100.
58. Moulton LH, Le MG. Latency and time-dependent exposure
in a case-control study. J Clin Epidemiol 1991;44:915-23.
59. Hunt JR, White E. Retaining and tracking cohort study members. Epidemiol Rev 1998;20:57-70.
56
White et al.
60. Walker AM, Velema JP, Robins JM. Analysis of case-control
data derived in part from proxy respondents. Am J Epidemiol
1988; 127:905-14.
61. Nelson LM, Longstreth WT Jr, Koepsell TD, et al. Proxy
respondents in epidemiologic research. Epidemiol Rev 1990;
12:71-86.
62. Prentice RL. Covariate measurement errors and parameter
estimation in a failure time regression model. Biometrika
1982;69:331-42.
63. Colditz GA, Martin P, Stampfer MJ, et al. Validation of
questionnaire information on risk factors and disease outcomes in a prospective cohort study of women. Am J Epidemiol 1986; 123:894-900.
64. White E, Kushi LH, Pepe MS. The effect of exposure variance
and exposure measurement error on study sample size: implications for the design of epidemiologic studies. J Clin Epidemiol 1994;47:873-80.
65. Riboli E. Nutrition and cancer: background and rationale of
the European Prospective Investigation into Cancer and Nutrition (EPIC). Ann Oncol 1992;3:783-91.
66. Freedman LS, Schatzkin A, Wax Y. Re: The impact of dietary
measurement error on planning sample size required in a
cohort study. (Letter; The authors reply). Am J Epidemiol
1991:134:1472-3.
67. Liu K, Stamler J, Dyer A, et al. Statistical methods to assess
and minimize the role of intra-individual variability in obscuring the relationship between dietary lipids and serum cholesterol. J Chronic Dis 1978;31:399-418.
68. Fleiss JL. The design and analysis of clinical experiments.
New York, NY: Wiley, 1986:1-32.
69. Holford TR, Stack C. Study design for epidemiologic studies
with measurement error. Stat Methods Med Res 1995;4:
339-58.
70. Carmines EG, Zeller RA. Reliability and validity assessment.
Beverly Hills, CA: Sage, 1979.
71. Bohrnstedt GW. Measurement. In: Rossi PH, Wright JD,
Anderson AB, eds. Handbook of survey research. New York,
NY: Academic Press, 1983:70-121.
72. Dunn G. Design and analysis of reliability studies. New York,
NY: Oxford University Press, 1989.
73. Meinert CL. Clinical trials: design, conduct, and analysis.
Monographs in epidemiology and biostatistics. Vol 8. New
York, NY: Oxford University Press, 1986.
74. Whitney CW, Lind BK, Wahl PW. Quality assurance and
quality control in longitudinal studies. Epidemiol Rev 1998;
20:71-80.
75. Willett WC, Reynolds RD, Cottrell-Hoehner S, et al. Validaltion
of a semi-quantitative food frequency questionnaire: comparison
with a 1-year diet record. J Am Diet Assoc 1987;87:43-7.
76. Kushi LH, Kaye SA, Folsom AR, et al. Accuracy and reliability of self-measurement of body girths. Am J Epidemiol
1988;128:740-8.
77. Giovannucci, Colditz G, Stampfer MJ, et al. The assessment
of alcohol consumption by a simple self-administered questionnaire. Am J Epidemiol 1991 ;133:810-17.
78. Sidney S, Jacobs DR Jr, Haskell WL, et al. Comparison of two
methods of assessing physical activity in the Coronary Artery
Risk Development in Young Adults (CARDIA) study. Am J
Epidemiol 1991;133:1231-45.
79. Lambert WE, Samet JM, Stidley CA. Classification of residential exposure to nitrogen dioxide. Atmospheric Environ
1992;26A:2185-92.
80. Ainsworth BE, Leon AS, Richardson MT, et al. Accuracy of
the College Alumnus Physical Activity Questionnaire. J Clin
Epidemiol 1993;46:1403-11.
81. Toniolo P, Koenig KL, Pasternack BJ. Reliability of measurements of total, protein-bound, and unbound estradiol in serum.
Cancer Epidemiol Biomarkers Prev 1994;3:47-50.
82. European Prospective Investigation into Cancer and Nutrition:
validity studies on dietary assessment methods. Int J Epidemiol 1997;26(Suppl l):Sl-190.
83. Fleiss JL. Statistical methods for rates and proportions. 2nd
ed. New York, NY: Wiley, 1981:188-236.
84. White E. Effects of biomarker measurement error on epidemiologic studies. In: Toniolo P, Buffetta P, eds. Methodological issues in the use of biomarkers in cancer epidemiology.
Lyon, France: International Agency for Research on Cancer,
1997:73-94. (IARC scientific publications no. 142).
85. Willett W. An overview of issues related to the correction of
non-differential exposure measurement error in epidemiologic
studies. Stat Med 1989;8:1031-40.
86. Chen TT. A review of methods for misclassified categorical
data in epidemiology. Stat Med 1989;8:1095-106.
87. Clayton DG. Models for the analysis of cohort and casecontrol studies with inaccurately measured exposures. In:
Dwyer JH, Lippert P, Feinleib M, et al., eds. Statistical methods for longitudinal studies of health. New York, NY: Oxford
University Press, 1991:301-31.
88. Thomas D, Stram D, Dwyer J. Exposure measurement error:
influence on esposure-disease: relationships and methods of
correction. Annu Rev Public Health 1993; 14:69-93.
89. Whittemore AS, Keller JB. Approximations for regression
with covariate measurement error. J Am Stat Assoc 1988;83:
1057-66.
90. Rosner B, Willett WC, Spiegelman D. Correction of logistic
regression relative risk estimates and confidence intervals for
systematic within-person measurement error. Stat Med 1989;
8:1051-69.
91. Marshall RJ. Validation study methods for estimating exposure proportions and odds ratios with misclassified data. J Clin
Epidemiol 1990;43:941-7.
92. Lee LF, Sepanski JH. Estimation in linear and nonlinear errors
in variables models using validation data. J Am Stat Assoc
1995;90:130-40.
93. Spiegelman D. Cost-efficient study designs for relative risk
modeling with covariate measurement error. J Stat Planning
Inference 1994;42:187-208.
94. Winawer SJ, Flehinger BJ, Buchalter J, et al. Declining serum
cholesterol levels prior to diagnosis of colon cancer: a timetrend, case-control study. JAMA 1990;263:2083-5.
Epidemiol Rev Vol. 20, No. 1, 1998