Epidemiologic Reviews Copyright © 1998 by The Johns Hopkins University School of Hygiene and Public Health All rights reserved Vol. 20, No. 1 Printed in U.S.A. Exposure Measurement in Cohort Studies: The Challenges of Prospective Data Collection Emily White,1' 2 Julie R. Hunt,1 and Deborah Casso 1. 2 INTRODUCTION be the primary concern, cost considerations often limit what is practicable. Collection of accurate exposure information on study participants is a major objective of all epidemiologic studies. The term exposure is used here to include not only exposures to exogenous agents that may be causally related to disease, but also socioeconomic factors, health habits, endogenous factors, and factors that may confound or modify the primary exposure-disease relation. Because many of the issues in exposure measurement are similar for case-control and cohort studies (1-3), this presentation attempts to focus on the unique challenges in measurement that arise from certain characteristics of many (but not all) cohort studies; these include the prospective, repeated exposure assessment and the larger sample size, which lead to greater need for cost considerations in data collection. We review methods of exposure measurement in cohort studies and issues of data collection costs, including selection of cost-efficient measurement methods and sampling approaches to reduce costs. Issues relating to prospective data collection are covered, including choice of a reference period, frequency of exposure updates, changes in the instrument over time, and use of repeated measures. The final focus is on sources of exposure measurement error, and approaches to reduce such measurement error through the design, execution, and analysis of the study. These are key issues, because exposure measurement error can lead to substantial bias in the estimated relative risk for the exposure-disease relation (1, 4, 5). Selection of a measurement method A large range of exposure measurement methods have been used in prospective cohort studies (examples are presented in table 1). Prospective cohort studies have used mailed self-administered questionnaires, interviewer-administered questionnaires, measures of blood or other tissues, physical measures, medical tests (such as electrocardiography), use of medical or other exposure records, and/or measures of the environment (e.g., air or water sampling) (table 1). Many studies use multiple methods. For example, the Cardiovascular Health Study uses home interviews and clinic visits, which include interviewer- and selfadministered questionnaires, physical tests (for anthropometries), blood measures, medical tests (blood pressure, electrocardiography, ultrasonography), and abstraction of prescription labels of medication use (27). In the study of Colorado Plateau uranium miners, Wagoner et al. (9) used physical examinations and detailed occupational questionnaires, as well as longitudinal measures of environmental radiation from each mine. Storage of blood or other sources of DNA for future studies of biomarkers has become an important component of several large cohort studies, including the Women's Health Initiative Observational Study (28) and the Nurses' Health Study (15, 16). Diaries (e.g., food records) have rarely been used in cohort studies because of the costs of training subjects and of coding the diary information. However, diaries could be more cost effective if the coding were limited to a subsample of participants (see "Measurement of exposures on a sample of the full cohort to reduce costs" below). A major consideration in the selection of a method is the validity of the method. Although issues surrounding validity are often specific for each type of exposure, one general issue is whether a current measure (or prospectively repeated measures of current exposure) is appropriate or whether a measure of past MEASUREMENT METHODS IN COHORT STUDIES The first measurement issue in designing a cohort study is usually the selection of the data collection method(s). While selecting an accurate method should Received for publication August 11, 1997, and accepted for publication April 1, 1998. 1 Fred Hutchinson Cancer Research Center, Seattle, WA. department of Epidemiology, University of Washington, Seattle, WA. 43 TABLE 1. Examples of exposure assessment methods in selected prospective cohort studies Follow-up exposure Instruments* Baseline exposure instruments* Study (relerence(s)) Year begun The Framlngham Heart Study (6-8) 1948 Risk factors for cardiovascular disease Residents of Framingham, MA, ages 28-62 years 5,209 Interview, clinic examination (PE, lab, tests) Interview, clinic examination (PE, lab tests) Every 2 years Colorado Plateau Uranium Miners Study (9) 1950 Occupational risk factors for cancer White, male underground uranium miners, Colorado Plateau 3,415 Clinic examination (PE), questionnaire, environmental measures (airborne radiation) Questionnaire 0n-person or mailed), environmental measures (airborne radiation) Triennial American Cancer Society: Cancer Prevention Study 1(10) 1959 Cigarette smoking and cancer mortality Self-administered questionnaire Every 2 years Tha Alameda County Study (11) 1965 Mailed questionnaire; telephone Interview or home interview of nonrespondents At years 9,16 Honolulu Heart Program (12) 1965 Coronary heart disease and stroke In men of Japanese ancestry Men of Japanese ancestry living on Oahu, HI, ages 45-65 years The Oral Contraception Study of the Royal College of General Practitioners (13,14) 1968 Oral contraceptive use and cancer Married premenopausal British women Same Every 6 month Nurses' Health Study (15,16) 1976 Mailed questionnaire Lab (toenall sample) Lab (home blood draw) Every 2 years At year 6 At year 13 Port Plrle Cohort Study (17,18) 1979 Lab (blood samples) At 6,15, and annually u and at 11 Multlcenter AIDS Cohort Study (19-20) Clinic examination (PE, tab), seli-admlnlslered questionnaire, interview Clinic examination (PE, lab, tests), selfadministered questionnaire, Interview Every 6 month At years 2, 5, Lab Annual Main focus Factors associated with health and mortality Study population US men and women aged 30 years and older Residents ot Alameda County, CA, ages 16-94 years Originally oral contraceptive use and cancer, expanded to women's health Lead exposure and child development Married female US registered nurses ages 30-55 years 1984 Risk factors for HIVt In gay men US homosexual men ages 18-70 years Coronary Artery Risk Development In Young Adults (CARDIA) (21,22) 1985 Risk factors for coronary heart disease In young adults Black and white US men and women ages 18-30 years New York University Women's Health Study (23) 1985 Iowa Women's Study (24) Endogenous hormones and risk ot breast cancer 1986 Cancer In women Infants bom In Port Plrle, South Australia, 1979-1982 New York women ages 35-65 years Iowa women ages 55-69 years Size 1,045,087 Self-administered questionnaire delivered by volunteers 6,928 Mailed questionnaire; telephone Interview or home Interview of nonrespondents Frequ exposure 8,006 Mailed questionnaire, interview, clinic examination (PE, lab, tests) 47,000 Medical form completed by physician based on patient interview or medical record 121,700 Mailed questionnaire 723 Lab (blood samples from pregnant mother and umbilical cord at birth 4,954 Clinic examination (PE, lab), setf-admlnistered questionnaire, interview 5,115 Telephone Interview, clinic examination (PE, lab, tests), selfadministered questionnaire, Interview 14,291 Lab, self-administered questionnaire 41,837 Mailed questionnaire with self body girth measures None Exposure Measurement 45 exposure is needed (see the section on Developing an exposure measure—time considerations, below). Blood measures, environmental measures, and physical measures generally measure only current exposure. However, if the characteristic is fixed (e.g., inherited genetic traits) or reasonably stable over long periods of time, then a current measure can be a good measure of past exposure. Cost considerations :§. 3 2 in g f .C If — ZQ jai. Epidemiol Rev Vol. 20, No. 1, 1998 For large cohort studies, cost is a major concern. The two common approaches to low-cost cohort studies are record-linkage studies, in which records of exposure are linked to disease records, and studies which recruit a cohort and collect exposures by mailed questionnaires. Numerous studies have been published in which medical, pharmacy, educational, occupational, or other records of exposure have been used (29-31), see the article in this issue by Howe (32) for other examples. Obtaining records is usually inexpensive even compared with mailed questionnaires, particularly if the cohort is selected to facilitate the availability of records (such as members of a health maintainance organization) or if the records are computerized. Moreover, use of records often allows the conduct of a cohort study in which the exposure data collection has occurred in the past (retrospective cohort study) (2). Because the exposures are recorded for each subject before disease onset, retrospective cohort studies generally have the advantages of prospective cohort studies, including less susceptibility to differential and nondifferential measurement error and to selection bias. However, records are often not available for the exposure of interest or are not sufficiently accurate. Even when records are available with reasonably accurate measures of the primary exposure, information on potential confounders is often poorly measured or is missing (1). Mailed questionnaires are generally the least expensive of the other available methods (33, 34), and have been used as the primary exposure method in several large cohort studies, including the Nurses' Health Study and the Iowa Women's Study (15, 16, 24). Machine reading of questionnaires using "mark-sense" or fax (digitization) technology further reduces costs. However, mailed questionnaires generally must be shorter and less complex than intervieweradministered questionnaires. For example, complex skip patterns must be avoided on mailed questionnaires. Other innovative methods have been used to reduce the costs of large cohort studies. In some studies, the subjects themselves have served as data collectors for 46 White et al. physical measures or biologic specimens, tasks that usually are conducted by study staff. In the Nurses' Health Study, the subjects were mailed materials and instructions to provide a blood sample, and were also asked for toenail samples (to assess selenium exposure) (15). In the Oral Contraceptive Study of the Royal College of General Practitioners, data collection costs were reduced by soliciting physicians to recruit subjects and be the primary informants on subjects' exposures (oral contraceptive use) (13, 14). In the American Cancer Society Cohort Studies, volunteers recruited and delivered the study questionnaire to participants (10). Measurement of exposures on a sample of the full cohort to reduce costs Another approach to reduce data collection costs in cohort studies is to collect or have available the research material for all cohort members, but to measure the exposures on less than the full cohort (2, 35-38). For example, medical records might be available for the full cohort but only abstracted on a sample of cohort members, or stored blood might be available for the full cohort but only analyzed for a subsample. Several types of sampling designs can be used, each of which has advantages and limitations (2, 35-38). In a nested case-control design, cases are compared with a sample of controls free of disease at the time of the case's diagnosis (39, 40). In a two-phase design, a proportion of each of the four disease-by-exposure groups are sampled (41, 42). In a case-cohort design, a random subsample of the cohort is selected at baseline, and is compared with those who develop the disease of interest (43). This latter design allows the cohort subsample to be compared with more than one disease group. These designs can lead to reduced exposure measurement costs with a relatively small loss in study power. DEVELOPING AN EXPOSURE MEASURETIME CONSIDERATIONS Dose and time After the exposure measurement method has been selected (e.g., self-administered questionnaire for physical activity), the actual instrument is selected from among those available, or a new instrument is developed. Most of the considerations in selecting or developing an instrument are similar for cohort and case-control studies. Briefly, these begin with conceptualization of the actual "active agent" that is hypothesized to be causally related to disease, identification of those exposures which do (and do not) include the agent, and conceptualization of the best construct(s) of "dose" (e.g., cumulative dose of the active agent, cumulative dose above some threshold of intensity, etc.) (1). Then the researcher identifies the items needed to capture the duration, frequency, and intensity of exposure over a specified time period, and develops the algorithm for the analytic exposure variable^) which combines these items to yield the exposure dose construct that is expected to be causally related to disease. Conversion tables are often used in the algorithm to quantify the intensity of exposure into units of the active agent (e.g., to convert each type of physical exercise into kilocalorie expenditure). The final step is the actual phrasing of the questions for each item. Most of these issues have been covered in the context of measurement of specific exposures, e.g., diet (44), physical activity (45), drugs (46), occupational exposures (47), laboratory-based measurements (48, 49), and in books and articles on questionnaire design (50-53). Because cohort studies are often prospective, issues relating to the time period of importance in the exposure-disease relation deserve particular consideration. Several authors discuss the importance and methods of considering the time period of exposure in addition to the dose of exposure in evaluating the effect of exposure on disease occurrence (54-58). One of the primary ways in which the dose-time effect of exposure on disease risk can be assessed is by computing cumulative dose (or average dose) only over the time period thought to be of etiologic importance (54) (see Thomas (56) for other analytic approaches). Considering the etiologically important time period can help answer questions that arise in the design of cohort studies, such as: • Should current or past exposures be measured? • How often should the measure of exposure be repeated prospectively (e.g., repeat questions on exercise or repeat blood draws)? • How should exposures that change over time be handled (e.g., how does one classify a subject who was on hormone replacement therapy at baseline but not at the 2-year follow-up)? Additionally, one must consider how to handle changes in the instrument over time. Changes might be necessary if there are new sources of the exposure (e.g., new forms of hormone replacement therapy) or if the instrument can be improved. Etiologically relevant time window Rothman (54) provides a structure with which to view the relation between timing of exposure and the occurrence of disease. Specifically, if there is a series of component causes which must occur in a sequence Epidemiol Rev Vol. 20, No. 1, 1998 Exposure Measurement 47 figure 1). In practice, for each exposure variable to be calculated (e.g., cumulative dose of hormone replacement therapy), a reference time period needs to be specified. This reference time period is selected based on what is known about the natural history of the disease, i.e., it should reflect the hypothesized etiologically relevant time period. The reference period could be lifetime (up to diagnosis), a physiologic time period (e.g., exposure during adolescence), or could be measured backward from the time of the diagnosis of disease (e.g., the 20 year period ending 5 years before diagnosis). This latter type of reference period is easier to implement in case-control studies, because the date of diagnosis of cases is known at the time of data collection, while in cohort studies, the date of diagnosis is only known after the (prediagnostic) exposure information has been collected. For exposures with long induction and/or latent periods, if the etiologically important time period does not overlap with the study time period, then the best measure of exposure will be retrospective exposure information collected at baseline only. In these situations, the reference period, by necessity, is expressed in relation to baseline, e.g., cumulative dose of aspirin over the 10 years before baseline. For other exposures, the retrospective information from baseline would need to be updated with the prospective information from follow-up contacts to capture exposure during the etiologically relevant time period. For example, to compute hormone replacement therapy use for the 10-year period ending, say, 3 years prior to diagnosis of disease, one would need to com- to lead to disease, then there would be a limited time period during which a given exposure is causally related to disease, i.e., an etiologically relevant time window (top line of figure 1). After the action of that exposure is complete, then an induction period ensues during which any remaining component causes in the etiologic sequence occur. The length of the induction period therefore varies for each exposure related to the disease, with those acting early in the causal sequence having a long induction period and those late in the causal sequence a short induction period. The induction period ends when the disease irreversibly begins, and that point to diagnosis is the latent period. Thus, in theory, there is an etiologically relevant time window during which the exposure of interest is causally related to the disease. Further exposure during the induction and latent periods does not contribute to etiology, and including exposure before or after the etiologically important time window, leads to exposure measurement error. Selecting the reference period and frequency of data collection The limits of knowledge about the biology of most diseases precludes a precise definition of the etiologically relevant time period. Nonetheless, consideration of this concept can provide insight into the appropriate time interval during which to collect exposure data (middle line of figure 1), and the appropriate time period over which the exposure dose variable should be calculated (the reference period) (bottom line of Causal sequence for a specific exposure and disease :*,t Exposure Action of exposure complete complete -4V -H l+i-H Action of all causes complete causes complete 11Disease 111begins 11 Symptoms Disease diagnosis 1- 4Etiologically relevant time window for that exposure Induction period Latent period Timeline of cohort study data collection: h Baseline Update Yearl Update Year 2 Update Year 3 Update Year 4 Reference period for calculation of exposure dose:* Reference period FIGURE 1. Timeline showing the relation between causal exposure-disease sequence, timeline of a cohort study, and reference period for calculation of exposure dose. *Examp!e for a subject who developed the disease endpoint. tHashmarks equal episodes of exposure. Epidemiol Rev Vol. 20, No. 1, 1998 48 White et al. bine information from baseline with that from, say, yearly updates ending at 3 years before diagnosis (as shown in figure 1). Or, if the exposure acts late in the causal sequence, e.g., aspirin or other anticoagulant drugs in relation to myocardial infarction, then the most recent exposure update(s) before the disease event could be the best exposure measure. Consideration of the etiologically relevant time period helps answer questions about collection of current versus past exposure data, the frequency of prospective data collection, and the reference period to be used in constructing the dose variable. It is important to note that it is the natural history of the exposuredisease relation that provides answers to these questions, not the "natural history" of the cohort study design. Thus, the advantages of cohort studies, in particular the ability to accurately measure current exposures (e.g., current physical activity, current blood measures) and to measure them prospectively over time are only advantages if they more accurately capture the true (etiologic) exposure. While valid measurement of current physical activity may seem to be an advantage compared with poorly recalled adult lifetime activity, it is quite possible that recalled adult activity is a better measure of the true exposure (e.g., actual lifetime adult activity) than is current activity. Similarly, annual prospective data collection for, say, 10 years would only improve exposure measurement if these years included the etiologically important time window. Of course, the choice of the frequency of data collection often depends on other study requirements in addition to capturing exposure dose over the reference period. One must consider the frequency needed to obtain accurate endpoint data (if endpoint data are ascertained from the subjects). Frequent (e.g., yearly) contact would usually be needed in order to not lose endpoint information for conditions that lead to major morbidity and mortality. Frequent contact is also needed to keep up-to-date information that allows tracking of subjects' whereabouts and may help to keep subjects feeling committed to the research project (see the article in this issue by Hunt and White (59)). Dose and time considerations when there are multiple disease endpoints One challenge of cohort studies is that there may be multiple disease endpoints, and the appropriate active agent, dose construct, or etiologically relevant time window for a type of exposure could differ for different diseases. For example, the type of physical activity that influences cardiovascular disease risk may differ from the nature of activity that influences colon cancer risk. Colon cancer might be influenced by even low levels of intensity (which might reduce bowel transit time), while a physical activity measure for cardiovascular disease may need to include only the amount of activity of sufficient intensity to have a cardiovascular effect. Similarly, recent use of hormone replacement therapy primarily affects endometrial cancer risk, while only long-term use may affect breast cancer risk. For cohort studies with multiple disease endpoints, instruments must incorporate the items necessary to compute the different exposure dose variables and/or the different etiologically relevant time windows for the various disease outcomes. Changing the instrument over time As a cohort study continues over time, new exposures may be added at exposure updates. For example, a food frequency questionnaire was added to the Nurses' Health Study in year 4, and has been used to assess diet-disease relations for subsequent disease events (15). A greater challenge in prospective cohort studies is that there may be a need to change the measurement of a specific exposure over time (e.g., one might need to change the way in which dietary fat intake is estimated). While in principle one should never change the measure of an exposure during a study, there are situations for which this may be necessary. First, the available sources of the exposure can change (e.g., new types of hormone replacement therapy have continually been introduced over the last 20 years) or the content of existing products can change (e.g., the fat content of beef has declined over the last several decades). When the external environment or "market place" changes, changes in the questionnaire are needed to continue to accurately measure the exposure. These changes do not necessarily improve the accuracy of the exposure measure; rather, not making the changes would decrease the accuracy. Therefore, such changes should be made to the instrument or to the conversion tables (e.g., fat content of foods) used in the algorithm to create the exposure dose variable. Creating and interpreting a cumulative dose over the reference time period should not be hindered by these types of changes to the instrument. For example, at the exposure updates there may be more types of hormone replacement therapy on the questionnaire which enter into calculation of, say, "cumulative years of hormone replacement therapy use," but this would not change the interpretation of the exposure variable. The other type of change in exposure measurement over time would be a change to improve the accuracy of the measure. One way that this happens is that new technology leads to better measures (e.g., a better measure of bone mineral density becomes available). Epidemiol Rev Vol. 20, No. 1, 1998 Exposure Measurement Another way this occurs, unfortunately, is that the researcher discovers errors or omissions on the original instrument. The decision about whether to change to a more precise measure depends on several factors. First, if one can redo all past measures from stored data or stored specimens using the newer instrument or algorithm, then this should be done. For example, if blood has been stored, it can be reanalyzed using a new method even for older specimens. Or, if a new table for /3-carotene in foods becomes available due to better techniques for measuring /3-carotene, then one can use the originally collected food frequency questionnaire data and recreate the exposure variable for dietary intake of j3-carotene. For situations such as these, there is no reason to keep an older, less accurate exposure measure. It is most challenging to decide whether to change an exposure measure when it will mean that, for some subjects, or at some data collection time points, the exposure assessment is more accurate than others. In the Framingham study, for example, the measure of serum cholesterol became more precise by the second examination, and questions on cigarette smoking were modified over time to improve accuracy (6). These changes meant that, for example, serum cholesterol measured at baseline was less accurate than that measured at the year 2 clinic visit. One way to deal with two versions of a measure is to drop the earlier, less accurate measure. If the exposure measurement changed at, say, year 4, one could only use the new instrument and only the disease events that occur after year 4. This, however, would reduce study power by loss of the early endpoints. Little has been written about incorporating the use of two different methods for the same exposure in a study. One approach is to stratify the analysis by the instrument used; then one would expect that the relative risk based on the more accurate exposure instrument would be less attenuated due to measurement error than that based on the less accurate instrument. This approach has been suggested in the context of case-control studies which must use proxy respondents for some subjects (60, 61). When the instrument has changed in a cohort study, it is important to not induce differential error by using the two instruments differentially for diseased and control subjects. Specifically, if information from the earlier instrument is to be used for the early cases (e.g., those who develop the disease before the new instrument was used), or if information from the earlier instrument is combined with the improved version (e.g., to compute cumulative dose), the analytic techniques must be such that exposures from those with and without the disease outcome are compared on the same instrument or combination of instruments. Epidemiol Rev Vol. 20, No. 1, 1998 49 MEASUREMENT ERROR IN COHORT STUDIESSOURCES AND EFFECTS Measurement error and its effects Exposure measurement error for an individual is defined as the difference between a subject's measured exposure (X) and his/her true exposure (T). The true exposure is the agent of interest; for example, the active agent hypothesized to cause the disease expressed using the appropriate dose construct and covering the etiologically relevant time period. Validity or accuracy refers to the capacity of an exposure measure to capture the true exposure in the population of interest; measures of validity are measures of the exposure measurement error in the population. For a continuous variable, validity is measured by the systematic bias, i.e., the difference between the average measured exposure in the study population and the average true exposure (X — T) and the validity coefficient (prx), i.e., the correlation of the measured exposure with the true exposure (1, 4, 5). The validity coefficient is inversely related to the ratio of the exposure error variance to the true exposure variance. Even though validity often cannot be measured (because there is rarely a gold standard measure of the true exposure), it is an important concept because the validity of the exposure measure influences the validity of the results of an epidemiologic study. Differential measurement error is a major concern when the systematic bias differs between those with and without the outcome; for example, when those with the disease of interest have their exposure overestimated on average compared with those without the disease. Differential error can lead to uninterpretable study results because the observed relative risk can be biased toward the null, away from the null, or cross over the null value compared with the true relative risk. Measurement error is nondifferential when the systematic bias and the validity coefficient are the same for those with and without the disease. Under simplifying assumptions, when there is nondifferential measurement error, the observed relative risk (RRO) is biased toward the null value compared with the true relative risk (RR r ) by the equation (1, 62): RRO = R R / " . (1) Thus a true relative risk of 4 would be attenuated to an observed relative risk of 1.4 if the validity coefficient were 0.5. Note that only p ^ and not the systematic bias enters this equation. Sources of measurement error Cohort studies generally have several advantages that have the potential to reduce both differential ex- 50 White et al. posure measurement error and nondifferential measurement error. Exposure assessments (including questionnaire, blood, air sampling, etc.) in cohort studies generally are made closer to the time of the true, etiologic exposure than in case-control studies, which should improve the validity of the exposures measured in cohort studies compared with case-control studies. Cohort studies often include repeated exposure measures, which allows for a more accurate measure of exposure during different time periods. Moreover, cohort studies are much less prone to differential measurement error because, in most cohort designs, exposures are measured before disease occurrence. This reduces the likelihood that knowledge about the disease or the physical and psychologic effects of the disease would lead to different validity of the exposure between those with and without disease. These advantages generally apply to both prospective and retrospective cohort designs. Despite these advantages, cohort studies are subject to many sources of measurement error. Errors can be TABLE 2. introduced during any phase of the study (examples are given in table 2). As implied above, errors can arise from the design of the instrument (i.e., the questionnaire or laboratory procedure to measure the exposure) through failure to correctly specify the specific causative agent, the etiologic important time window, and the appropriate construct of dose (i.e., the appropriate way to incorporate intensity, frequency, and time period of the exposure) for a given disease outcome. Measurement error also arises from errors or omissions in the protocol for use of the instrument and from poor execution of the protocol during data collection, e.g., failure of subjects to follow instructions on self-administered questionnaires or poor handling of biologic specimens. Errors can also arise from limitations due to inherent subject characteristics; these include memory limitations, social desirability bias, and month-to-month variability in biologic characteristics. Some sources of error may be greater in cohort studies than in case-control studies because of the usual longer time frame of cohort studies. In particu- Sources of measurement error in cohort studies* Errors in the selection or design of the instrument to measure the exposure Lack of coverage of all sources of the active agent by the instrument Inclusion of exposures that do not have the actual active agent Time period assessed by the instrument is not the etiologically important time period Phrasing of questions that lead to misunderstanding or bias Errors or omissions in the protocol for use of the instrument Failure to specify the protocol (including instructions to the subject) in sufficient detail Failure to specify a method to handle unanticipated situations consistently Poor execution of the study protocol Failure of the data collectors or laboratory technicians to follow the protocol in the same manner for all subjects Failure of the subjects to read or understand the instructions in self-administered questionnaires Improper handling or storage of biologic specimens Limitations due to inherent subject characteristics Memory limitations of subjects including poor recall of past exposures and influence of recent exposures on memory of past exposures Tendency of subjects to overreport socially desirable behaviors (e.g., exercise) and underreport socially undesirable or sensitive behaviors (e.g., induced abortion) Short-term (month to month) variability in biologic characteristics. Effect of the outcome under study on the exposure measures (e.g., effect of preclinical disease) Drift in accuracy of exposure measures over time Change in instrument over time Interviewer fatigue Failure to standardize laboratory method periodically Degradation of agent in biologic specimens Errors during data processing and in the creation of the exposure variables Data entry errors Errors in conversion tables used to convert subject responses to units of active agent (e.g., p-carotene by type of food) Development of an exposure variable algorithm that does not express the etiologically important dose or time period * Adapted from Armstrong et al. (1). Epidemiol Rev Vol. 20, No. 1, 1998 Exposure Measurement lar, the accuracy of the instrument could drift, over time, due to failure to standardize the laboratory instrument, fatigue among interviewers, or degradation of the active agent in biologic specimens. Finally, errors enter at the time of data analysis, including key entry errors and errors in conversion tables used to convert subjects' responses to units of the active agent (e.g., errors in the nutrient database used to convert foods to international units of vitamin A). In cohort studies, most of the sources of error described above would lead to nondifferential error; nonetheless, these and others can be sources of differential error. Major sources would be the effects of the biologic changes during the prediagnostic phase of disease on biologic measures (e.g., undetected bleeding may lead to lower iron status during the prediagnostic phase of colon cancer), the influences of prediagnostic symptoms on a subject's behavior (e.g., gastrointestinal symptoms may lead to reduced fiber intake before diagnosis of colon cancer), and the effects of the subject's risk of the disease on accuracy of reporting (e.g., those with strong family history of colon cancer may be both more accurate in reporting family history and at higher risk of colon cancer). METHODS FOR REDUCING MEASUREMENT ERROR IN COHORT STUIDES The primary method to reduce measurement error is the careful selection or construction of the exposure instrument; several issues related to the design of accurate instruments have been reviewed above. Other approaches to reduce measurement error include judicious selection of a study population to improve measurements, use of repeated measures of exposure, and rigorous quality control procedures throughout the conduct of the study. Some steps also can be taken at the time of analysis or interpretation of study results— these include removing outcome events that occur soon after the measurements were taken and the use of information on the accuracy of the exposure measure to "correct" the results. Each of these topics is discussed in more detail below, with emphasis on those issues particularly relevant to cohort studies. Selection of a cohort to reduce measurement error In cohort studies, the investigator often has more options in selection of a study group(s) than in casecontrol studies, which are usually population-based. Several researchers have taken advantage of this by selecting a study cohort that could lead to reduced measurement error. The Nurses' Health Study is an excellent example of a study in which a specific cohort was selected because the cohort members were exEpidemiol Rev Vol. 20, No. 1, 1998 51 pected to provide more accurate information on exposures and outcomes than the general population (15, 63). Female registered nurses were selected to be studied because they were assumed to be excellent reporters of exogenous hormone use, which was the original study exposure, and disease outcomes. Several authors have proposed another approach by which selection of the cohort(s) would reduce measurement error. By selecting a population with larger variation of exposure, the effects of measurement error are reduced (64). For example, the European Prospective Investigation into Cancer and Nutrition (EPIC) includes a multinational population which would have a larger variance of nutrient intake than within a single country (65). If the variance of the (true) exposure is greater in one study population than another, and the variance of the exposure measurement error is the same, then the effect of the measurement error on the relative risk is less in the population with the greater variability of exposure (64). As noted above, the validity coefficient varies inversely with the ratio of the exposure error variance to the true exposure variance. Thus, the validity of the exposure measure can be increased by increasing the true exposure variance as well as by the usual methods to reduce the error variance. This approach of selecting a cohort with larger exposure variance would lead to less attenuation of the relative risk from the effects of nondifferential measurement error. It could also substantially reduce the sample size (or equivalently increase power for a fixed sample size) unless the study population with the larger exposure variance has a significantly lower disease incidence rate than the alternative choices with smaller exposure variance (64). At least one of the current large cohort studies has attempted to artificially create a cohort with a larger exposure variance through sampling. The American Association of Retired Persons (AARP) Cohort Study, being conducted by the National Cancer Institute, is screening potential cohort members by use of a food frequency questionnaire (66). Subjects are then selected for cohort entry by sampling more heavily from those with the more extreme values, e.g., those in the highest and lowest 10 percent of fat intake. This approach would lead to larger exposure variance than random sampling of this population. Use of repeated measurements to reduce measurement error Although issues relating to frequency of data collection were discussed above, another related consideration in determining the frequency of exposure measurement is the advantage of averaging (or summing) two or more measures of the exposure for each sub- 52 White et al. ject. The approach of averaging repeated measures has been recommended as an effective method of decreasing the measurement error, compared with the use of a single measurement (1,2, 67-69). The measures are typically repeated administrations of the same instrument over time. Use of repeated measures is particularly applicable to prospective cohort studies, because the measures can be averaged over long time periods (e.g., years). For example, the New York University Women's Health Study measured blood hormone concentrations by use of samples collected annually over 5 years (23). On the other hand, the average exposure only increases the validity of the exposure measure if the average over that time period is a good expression of the true exposure. Fleiss (68) and others (70-72) have shown the degree of improvement in the validity of an exposure variable that results from the combination of multiple measures when the errors in the two or more measures to be averaged are equal and uncorrelated (the parallel test model). "Equal error" means that each measure is an equally precise measure of the true exposure, as would be expected when the same instrument was used for the repeated measures. "Uncorrelated error" means that the measurement error on one administration of the instrument would not be correlated with the error on the second administration, i.e., those subjects whose first measure overestimated their true exposure would be equally likely to have the second measure over- or underestimate their true exposure (beyond any systematic bias in the instrument that affects all subjects). Often repeated laboratory measures based on biologic samples collected at multiple times for each subject (not repeated analysis of the same biologic sample) can be assumed to meet the criteria of the parallel test model, if the true exposure is thought to be the true average over the time period of specimen collection. When each individual in a study population is measured k times using repeated parallel measures of TABLE 3. exposure variable X, the improvement in validity by use of the average measure, A, for each individual (versus use of only one measure of X) can be calculated. The correlation of the average with the true exposure, pTA (the validity coefficient of A), is greater than the correlation of a single measure X with the true exposure, p-jx (the validity coefficient of X), by PTA = 1 + (* - \)P2TX (2) If p-rx is known from a validity study, then the validity coefficient for A can be calculated from the above equation. Table 3 gives examples of the improvement in validity one can achieve by using multiple parallel measures. For example, averaging three measures each with a validity coefficient of 0.5 can yield a new measure with validity coefficient of 0.71, or averaging two measures each with a validity coefficient of 0.85 can yield a new exposure measure with a validity coefficient of 0.92. Often one does not have a measure of the validity coefficient of X, p ^ , but, rather, a measure of the reliability of repeated measures of X, Pxx-. Pxx' m a v be estimated by the Pearson correlation of two parallel measures X and X' (or the average Pearson correlation between pairs of measures, if X is repeated multiple times) (70, 71). Then the reliability of parallel measures of X, p^', can be used to estimate p ^ (70, 71): (3) and Pxx^ can be substituted for in equation 2. Table 3 also gives the improvement in the validity of a measure by use of multiple measures based on values of Pxx>. For example, parallel measures with a reliability coefficient of 0.36 suggests that the validity coefficient of one measure is 0.6 and of the average of two measures would be 0.73. Improvement in the validity of a measure, by averaging k parallel measures* PM PTX k=2 0.30 0.40 0.50 0.60 0.70 0.80 0.85 0.90 0.09 0.16 0.25 0.36 0.49 0.64 0.72 0.81 0.41 0.53 0.63 0.73 0.81 0.88 0.92 0.95 k=3 0.48 0.60 0.71 0.79 0.86 0.92 0.94 0.96 /f=4 /r=5 0.53 0.66 0.76 0.83 0.89 0.94 0.96 0.97 0.57 0.70 0.79 0.86 0.91 0.95 0.96 0.98 0.70 0.81 0.88 0.92 0.95 0.97 0.98 0.99 * p w is the correlation coefficient between each (parallel) exposure measure, X, and the true exposure,T. PCT is the correlation between any two (parallel) exposure measures, Xand X' ( p * ^ p^). pTA is the correlation of A, the average of A-(parallel) exposure measures, and T, the true exposure. Epidemiol Rev Vol. 20, No. 1, 1998 Exposure Measurement The primary advantage of averaging two or more parallel measures is that the increase in the validity of the exposure measure will reduce the degree of attenuation of the relative risk due to (nondifferential) measurement error (equation 1), and, therefore, reduce the required sample size (68). One disadvantage of the use of repeated measures is that the cost per subject would increase due to the repeated study contacts. Thus the total study cost could be increased or decreased depending on the trade-off between the reduced sample size and the increased cost per subject. One way to select k, the number of repeated measures per subject, is to determine k which would minimize total costs (68). The possibility of violation of the assumptions of the parallel test model must be considered when evaluating the use of multiple measures (1, 2). It may be that the two or more measures to be averaged are not equally precise. In this situation, the average of a precise measure with a less precise measure may result in a measure which is less valid than the better measure alone. An obvious example is the use of the average of a perfect measure of exposure and an imperfect measure. When the two or more measures to be averaged have correlated errors, but the other assumptions of parallel tests hold, the improvement in the validity of the exposure measure will be less than that predicted by equation 2 or table 3. If the errors are perfectly correlated (i.e., the measure is perfectly repeatable, even though it is not a perfect measure of exposure), the use of multiple measures will lead to no improvement in validity. Most questionnaire measures would have some degree of correlated error. For example, subjects whose fat intake was underestimated on the administration of a food frequency questionnaire would tend to have their fat intake underestimated on the second administration (due to some subjects consistently underreporting high fat foods, failure of the instrument to include certain high fat foods frequently eaten by some subjects, etc.). Similarly, repeated analysis of the same biologic specimen would have correlated error, e.g., if a subject's serum cholesterol was higher than his/her long-term average on the day the blood was drawn, it would tend to be higher on each repeated analysis of the sample. Finally, when the repeated exposure measurements are not collected uniformly over time, a simple average might not be appropriate. In the Port Pirie Study, blood lead measures were taken at birth, 6, 15, and 24 months, and then annually. A composite, lifetime measure was computed for each subject as the area under the time-exposure curve (17,18). Epidemiol Rev Vol. 20, No. 1, 1998 53 Quality control Quality control procedures are an important component of exposure measurement, but can only be briefly mentioned here. When studies are large, as cohort studies often are, it is even more essential to implement systematic quality control procedures to reduce measurement error. It is important to have a complete and detailed study procedures manual, to pretest instruments well, to have standardized training and monitoring of data collectors, and to edit data as they are collected (1, 73). These and other quality control issues are covered in the article in this issue by Whitney (74). Use of information from validity and reliability studies The conduct of validity or reliability studies on a subsample of the cohort has become an important and common component of many large cohort studies (75— 82). Some cohort studies have incorporated a validity substudy in which the measured exposure using the cohort study instrument is compared with a near perfect measure of exposure which is too expensive or too burdensome to be used in the full study (75). Since perfectly valid exposure measures are not often available, validity studies are usually not feasible. More often test-retest reliability studies are conducted in which repeated applications of the exposure instrument are compared, or relative validity or intermethod reliability studies are conducted in which the measured exposure is compared with a more accurate (but less than perfect) comparison measure (76-82). If the comparison measure is carefully selected so that certain assumptions are met, these types of studies can also provide some information on the validity of the measure (1, 70, 71). The design, analysis, and interpretation of validity and reliability studies is beyond the scope of this presentation, but is covered by others (1, 44, 68, 70, 72, 83, 84). One goal for the use of the information from validity and reliability substudies in cohort studies is to estimate the effects of the inaccuracy of the exposure measure on the relative risk (38, 69). Development of statistical techniques which incorporate the information from validation substudies to correct relative risk estimates for the effects of measurement error is an active area of methodological research (85-92). Work is also currently being conducted on the optimal design of the validation substudy (93). Reducing measurement error at the time of analysis by dropping early events As noted above, one source of differential measurement error is the effect of preclinical disease on the 54 White et al. exposures under study. In case-control studies, the general approach to removing the effects of symptoms or preclinical disease on exposures is to have the reference period end at a fixed time before diagnosis, e.g., fiber intake could be assessed for a 10 year reference period ending 2 years prior to diagnosis, and for a comparable time for controls. Although one could attempt to simulate this approach in cohort studies, another common approach to remove this source of differential error is to drop from analysis cases who develop the disease within a certain time after the exposure was measured. For example, in a cohort study of colon cancer and serum cholesterol, removing the cancer cases that occurred within the first several years after cholesterol measurement changed the relative risk from less than 1 to close to 1 (94). Thus, what appeared to be a protective effect of serum cholesterol on cancer risk was more likely an effect of preclinical colon cancer in reducing serum cholesterol. Avoiding inducing differential measurement error during data analysis Care must be taken to select the appropriate reference period and the appropriate statistical method so that the exposure measures for those with and without the disease outcome are computed in a similar manner. This is a particular concern in cohort studies, because exposure assessment ends at the outcome event for those who develop the disease but continues to endof-study for most participants who do not develop the disease. The control reference period needs to be similar in length and measured over the same chronologic years as the case reference period to avoid inducing differential measurement error. SUMMARY Cohort study designs have several advantages over case-control studies in terms of exposure measurement. If exposure measurement occurs before disease occurrence, cohort studies are much less prone to differential measurement error. Prospective data collection should also reduce measurement error due to poor recall of past exposures. The primary drawback of cohort studies is the large sample size leading to high data collection costs. Several approaches to reduce such costs have been discussed in this presentation, such as selection of lower cost measurement methods and fully measuring the exposure only on a subsample of the cohort (e.g., nested case-control design). However, other innovative approaches to reduce costs are needed. In addition, study reviewers should also consider that the higher costs are justified in relation to the several benefits of this study design, which include not only less measurement error, but also less susceptibility to selection bias and often the ability to study multiple disease outcomes. Improving the accuracy of exposure measurement is increasingly important for cohort studies as we move on to the study of exposures that are difficult to measure or to those with lower relative risks of disease. In such studies, attenuation of the relative risk by the effects of measurement error can lead to failure to detect an association between exposure and disease. The validity of exposure measurements could be improved by a better understanding of the biologically active agent and etiologically important time period of the exposure-disease relation, and by incorporating these into the measure. Long-term cohort studies which cover the etiologically relevant time period could improve the accuracy of measures of exposures by use of repeated biologic measures or repeated updates of self-reported exposures. Measurement error also can be reduced by judicious choice of a cohort to study and by careful attention to quality control procedures. Continued emphasis on the evaluation and improvement of the measurement properties of instruments used in epidemiologic studies will improve the validity of the results of cohort studies. REFERENCES 1. Armstrong BK, White E, Saracci R. Principles of exposure measurement in epidemiology. New York, NY: Oxford University Press, 1992. 2. Kelsey JL, Whittemore AS, Evans AS, et al. Methods in observational epidemiology. 2nd ed. New York, NY: Oxford University Press, 1996. 3. Correa A, Stewart WF, Yen HC, et al. Exposure measurement in case-control studies: reported methods and recommendations. Epidemiol Rev 1994;16:18-32. 4. de Klerk NH, English DR, Armstrong BK. A review of the effects of random measurement error on relative risk estimates in epidemiological studies. Int J Epidemiol 1989;18:705-12. 5. Armstrong BG, Whittemore AS, Howe GR. Analysis of casecontrol data with covariate measurement error: application to diet and colon cancer. Stat Med 1989;8:1151-63. 6. Gordon T, Kannel WB. The Framingham, Massachusetts study, twenty years later. In: Kessler II, Levin ML, eds. The community as an epidemiologic laboratory: a casebook of community studies. Baltimore, MD: The Johns Hopkins Press, 1970:123-46. 7. Sytkowski PA, D'Agostino RB, Belanger A, et al. Sex and time trends in cardiovascular disease incidence and mortality: the Framingham heart study, 1950-1989. Am J Epidemiol 1996; 143:338-50. 8. Dawber TR. The Framinghan study: the epidemiology of atherosclerotic disease. Cambridge, MA: Harvard University Press, 1980. 9. Wagoner JK, Archer VE, Lundin FE Jr, et al. Radiation as the cause of lung cancer among uranium miners. N Engl J Med 1965;273:181-8. 10. Hammond EC. Smoking in relation to death rates of one million men and women. Natl Cancer Inst Monogr 1966; 19: 127-204. Epidemiol Rev Vol. 20, No. 1, 1998 Exposure Measurement 11. Kaplan GA, Strawbridge WJ, Cohen RD, et al. Natural history of leisure-time physical activity and its correlates: associations with mortality from all causes and cardiovascular disease over 28 years. Am J Epidemiol 1996;144:793-7. 12. Yano K, Reed DM, McGee DL. Ten-year incidence of coronary heart disease in the Honolulu Heart Program: relationship to biologic and lifestyle characteristics. Am J Epidemiol 1984; 119:653-66. 13. Oral contraceptives and health; an interim report. Royal College of General Practitioners Oral Contraceptive Study. London, England: Pitman, 1974. 14. Kay CR, Hannaford PC. Breast cancer and the pill—a further report from the Royal College of General Practitioners' Oral Contraceptive Study. Br J Cancer 1988;58:675-80. 15. Colditz GA, Manson JE, Hankinson SE. The Nurses' Health Study: 20-year contribution to the understanding of health among women. J Womens Health 1997;6:49-62. 16. Stampfer MJ, Colditz GA, Willett WC, et al. Postmenopausal estrogen therapy and cardiovascular disease: ten-year followup from the Nurses* Health Study. N Engl J Med 1991;325: 756-62. 17. Baghurst PA, McMichael AJ, Wigg NR, et al. Environmental exposure to lead and children's intelligence at the age of seven years: the Port Pirie Cohort Study. N Engl J Med 1992;327: 1279-84. 18. Tong S, Baghurst P, McMichael A, et al. Lifetime exposure to environmental lead and children's intelligence at 11-13 years: the Port Pirie Cohort Study. BMJ 1996;312:1569-75. 19. Dudley J, Jin S, Hoover D, et al. The Multicenter AIDS Cohort Study: retention after 9'/2 years. Am J Epidemiol 1995;142:323-30. 20. Kaslow RA, Ostrow DG, Detels R, et al. The Multicenter AIDS Cohort Study: rationale, organization, and selected characteristics of the participants. Am J Epidemiol 1987; 126: 310-18. 21. Friedman GD, Cutter GR, Donahue RP, et al. CARDIA: study design, recruitment, and some characteristics of the examined subjects. J Clin Epidemiol 1988;41:1105-16. 22. Anderssen N, Jacobs DR, Sidney S, et al. Change and secular trends in physical activity in young adults: a seven-year longitudinal follow-up in the Coronary Artery Risk Development in Young Adults study (CARDIA). Am J Epidemiol 1996; 143:351-62. 23. Toniolo PG, Levitz M, Zeleniuch-Jacquotte A, et al. A prospective study of endogenous estrogens and breast cancer in postmenopausal women. J Natl Cancer Inst 1995;87:190-7. 24. Sellers TA, Kushi LH, Potter JD, et al. Effect of family history, body-fat distribution, and reproductive factors on the risk of postmenopausal breast cancer. N Engl J Med 1992; 326:1323-9. 25. Cauley JA, Lucas FL, Kuller LH, et al. Bone mineral density and risk of breast cancer in older women: the study of osteoporotic fractures. Study of the Osteoporotic Fractures Research Group. JAMA 1996,276:1404-8. 26. Browner WS, Pressman AR, Nevitt MC, et al. Mortality following fractures in older women: the study of osteoporotic fractures. Arch Intern Med 1996;156:1521-5. 27. Fried LP, Borhani NO, Enright P, et al. The Cardiovascular Health Study: design and rationale. Ann Epidemiol 1991;1: 263-76. 28. Design of the Women's Health Initiative Clinical Trial and Observational Study. Women's Health Initiative Study Group. Control Clin Trials 1998,19:61-109. 29. Infante PF, Rinsky RA, Wagoner JK, et al. Leukemia in benzene workers. Lancet 1977;2:76-8. 30. Paffenbarger RS Jr, Wing AL, Hyde RT. Physical activity as an index of heart attack risk in college alumni. Am J Epidemiol 1978; 108:161-75. 31. Rossing MA, Daling JR, Weiss NS, et al. Ovarian tumors in a cohort of infertile women. N Engl J Med 1994,331:771-6. 32. Howe GR. Use of computerized record linkage in cohort studies. Epidemiol Rev 1998;20:112-21. Epidemiol Rev Vol. 20, No. 1, 1998 55 33. Weeks MF, Kulka RA, Lessler JT, et al. Personal versus telephone surveys for collecting household health data at the local level. Am J Public Health 1983;73:1389-94. 34. Rolnick SJ, Gross CR, Garrard J, et al. A comparison of response rate, data quality, and cost in the collection of data on sexual history and personal behaviours: mail survey approaches and in-person interview. Am J Epidemiol 1989;129: 1052-61. 35. Langholz B, Thomas DC. Nested case-control and case-cohort methods of sampling from a cohort: a critical comparison. Am J Epidemiol 1990;131:169-76. 36. Wacholder S, Gail M, Pee D. Selecting an efficient design for assessing exposure-disease relationships in an assembled cohort. Biometrics 1991;47:63-76. 37. Wacholder S. Practical considerations in choosing between the case-cohort and nested case-control designs. Epidemiology 1991;2:155-8. 38. Prentice RL. Design issues in cohort studies. Stat Methods Med Res 1995;4:273-92. 39. Mantel N. Synthetic retrospective studies and related topics. Biometrics 1973;29:479-86. 40. Prentice RL, Breslow NE. Retrospective studies and failure time models. Biometrika 1978,65:153-8. 41. White JE. A two stage design for the study of the relationship between a rare exposure and a rare disease. Am J Epidemiol 1982;115:119-28. 42. Breslow NE, Cain KC. Logistic regression for two-stage casecontrol data. Biometrika 1988;75:11-20. 43. Prentice RL. A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 1986;73: 1-11. 44. Willett W. Nutritional epidemiology. New York, NY: Oxford University Press, 1990:20-33. 45. Laporte RE, Montoye HJ, Caspersen CJ. Assessment of physical activity in epidemiologic research: problems and prospects. Public Health Rep 1985; 100:131-46. 46. Strom BL, ed. Pharmacoepidemiology. 2nd ed. New York, NY: Wiley, 1994:549-80. 47. Checkoway H, Pearce NE, Crawford-Brown DJ. Research methods in occupational epidemiology. New York, NY: Oxford University Press, 1989:18-45. 48. Schulte PA, Perera FP, eds. Molecular epidemiology. San Diego, CA: Academic Press, 1993. 49. Toniolo P, Buffetta P, eds. Methodological issues in the use of biomarkers in cancer epidemiology. Lyon, France: International Agency for Research on Cancer, 1997. (IARC scientific publications no. 142). 50. Dillman DA. Mail and telephone surveys: the total design method. New York, NY: Wiley, 1978. 51. Sudman S, Bradbum NM. Asking questions: a practical guide to questionnaire design. San Francisco, CA: Jossey-Bass, 1982. 52. Biemer PP, Groves RM, Lyberg LE, et al., eds. Measurement errors in surveys. New York, NY: Wiley, 1991:237-57. 53. Developing and using questionnaires. Washington, DC: US General Accounting Office Evaluation and Methodology Division., 1993. (GAO/PEMD-10.1.7). 54. Rothman KJ. Induction and latent periods. Am J Epidemiol 1981;114:253-9. 55. Breslow NE, Day NE. Statistical methods in cancer research. Volume II: The design and analysis of cohort studies. Lyon, France: International Agency for Research on Cancer, 1987: 232-70. (IARC publications no. 32). 56. Thomas DC. Models of exposure-time-response relationships with applications to cancer epidemiology. Annu Rev Public Health 1988;9:451-82. 57. Checkoway H, Pearce N, Hickey JLS, et al. Latency analysis in occupational epidemiology. Arch Environ Health 199O;45: 95-100. 58. Moulton LH, Le MG. Latency and time-dependent exposure in a case-control study. J Clin Epidemiol 1991;44:915-23. 59. Hunt JR, White E. Retaining and tracking cohort study members. Epidemiol Rev 1998;20:57-70. 56 White et al. 60. Walker AM, Velema JP, Robins JM. Analysis of case-control data derived in part from proxy respondents. Am J Epidemiol 1988; 127:905-14. 61. Nelson LM, Longstreth WT Jr, Koepsell TD, et al. Proxy respondents in epidemiologic research. Epidemiol Rev 1990; 12:71-86. 62. Prentice RL. Covariate measurement errors and parameter estimation in a failure time regression model. Biometrika 1982;69:331-42. 63. Colditz GA, Martin P, Stampfer MJ, et al. Validation of questionnaire information on risk factors and disease outcomes in a prospective cohort study of women. Am J Epidemiol 1986; 123:894-900. 64. White E, Kushi LH, Pepe MS. The effect of exposure variance and exposure measurement error on study sample size: implications for the design of epidemiologic studies. J Clin Epidemiol 1994;47:873-80. 65. Riboli E. Nutrition and cancer: background and rationale of the European Prospective Investigation into Cancer and Nutrition (EPIC). Ann Oncol 1992;3:783-91. 66. Freedman LS, Schatzkin A, Wax Y. Re: The impact of dietary measurement error on planning sample size required in a cohort study. (Letter; The authors reply). Am J Epidemiol 1991:134:1472-3. 67. Liu K, Stamler J, Dyer A, et al. Statistical methods to assess and minimize the role of intra-individual variability in obscuring the relationship between dietary lipids and serum cholesterol. J Chronic Dis 1978;31:399-418. 68. Fleiss JL. The design and analysis of clinical experiments. New York, NY: Wiley, 1986:1-32. 69. Holford TR, Stack C. Study design for epidemiologic studies with measurement error. Stat Methods Med Res 1995;4: 339-58. 70. Carmines EG, Zeller RA. Reliability and validity assessment. Beverly Hills, CA: Sage, 1979. 71. Bohrnstedt GW. Measurement. In: Rossi PH, Wright JD, Anderson AB, eds. Handbook of survey research. New York, NY: Academic Press, 1983:70-121. 72. Dunn G. Design and analysis of reliability studies. New York, NY: Oxford University Press, 1989. 73. Meinert CL. Clinical trials: design, conduct, and analysis. Monographs in epidemiology and biostatistics. Vol 8. New York, NY: Oxford University Press, 1986. 74. Whitney CW, Lind BK, Wahl PW. Quality assurance and quality control in longitudinal studies. Epidemiol Rev 1998; 20:71-80. 75. Willett WC, Reynolds RD, Cottrell-Hoehner S, et al. Validaltion of a semi-quantitative food frequency questionnaire: comparison with a 1-year diet record. J Am Diet Assoc 1987;87:43-7. 76. Kushi LH, Kaye SA, Folsom AR, et al. Accuracy and reliability of self-measurement of body girths. Am J Epidemiol 1988;128:740-8. 77. Giovannucci, Colditz G, Stampfer MJ, et al. The assessment of alcohol consumption by a simple self-administered questionnaire. Am J Epidemiol 1991 ;133:810-17. 78. Sidney S, Jacobs DR Jr, Haskell WL, et al. Comparison of two methods of assessing physical activity in the Coronary Artery Risk Development in Young Adults (CARDIA) study. Am J Epidemiol 1991;133:1231-45. 79. Lambert WE, Samet JM, Stidley CA. Classification of residential exposure to nitrogen dioxide. Atmospheric Environ 1992;26A:2185-92. 80. Ainsworth BE, Leon AS, Richardson MT, et al. Accuracy of the College Alumnus Physical Activity Questionnaire. J Clin Epidemiol 1993;46:1403-11. 81. Toniolo P, Koenig KL, Pasternack BJ. Reliability of measurements of total, protein-bound, and unbound estradiol in serum. Cancer Epidemiol Biomarkers Prev 1994;3:47-50. 82. European Prospective Investigation into Cancer and Nutrition: validity studies on dietary assessment methods. Int J Epidemiol 1997;26(Suppl l):Sl-190. 83. Fleiss JL. Statistical methods for rates and proportions. 2nd ed. New York, NY: Wiley, 1981:188-236. 84. White E. Effects of biomarker measurement error on epidemiologic studies. In: Toniolo P, Buffetta P, eds. Methodological issues in the use of biomarkers in cancer epidemiology. Lyon, France: International Agency for Research on Cancer, 1997:73-94. (IARC scientific publications no. 142). 85. Willett W. An overview of issues related to the correction of non-differential exposure measurement error in epidemiologic studies. Stat Med 1989;8:1031-40. 86. Chen TT. A review of methods for misclassified categorical data in epidemiology. Stat Med 1989;8:1095-106. 87. Clayton DG. Models for the analysis of cohort and casecontrol studies with inaccurately measured exposures. In: Dwyer JH, Lippert P, Feinleib M, et al., eds. Statistical methods for longitudinal studies of health. New York, NY: Oxford University Press, 1991:301-31. 88. Thomas D, Stram D, Dwyer J. Exposure measurement error: influence on esposure-disease: relationships and methods of correction. Annu Rev Public Health 1993; 14:69-93. 89. Whittemore AS, Keller JB. Approximations for regression with covariate measurement error. J Am Stat Assoc 1988;83: 1057-66. 90. Rosner B, Willett WC, Spiegelman D. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Stat Med 1989; 8:1051-69. 91. Marshall RJ. Validation study methods for estimating exposure proportions and odds ratios with misclassified data. J Clin Epidemiol 1990;43:941-7. 92. Lee LF, Sepanski JH. Estimation in linear and nonlinear errors in variables models using validation data. J Am Stat Assoc 1995;90:130-40. 93. Spiegelman D. Cost-efficient study designs for relative risk modeling with covariate measurement error. J Stat Planning Inference 1994;42:187-208. 94. Winawer SJ, Flehinger BJ, Buchalter J, et al. Declining serum cholesterol levels prior to diagnosis of colon cancer: a timetrend, case-control study. JAMA 1990;263:2083-5. Epidemiol Rev Vol. 20, No. 1, 1998
© Copyright 2025 Paperzz