American Journal of Epidemiology
Copyright © 1997 by The Johns Hopkins University School of Hygiene and Public Health
All rights reserved
Vol. 145, No. 3
Printed in U.S.A.
Construction of Composite Scales for Risk Assessment in Epidemiology:
An Application to Ectopic Pregnancy
Joel Coste,1,2 Jean Bouyer,1 and Nadine Job-Spira1
Composite scoring systems that combine information contained in a number of risk factors are being used
increasingly in clinical practice, planning, and health risk appraisal. The authors propose a methodological
framework for the construction and validation of a composite measurement scale to assess the risk, considered as a continuous phenomenon, of developing a particular disease or outcome. This framework integrates
several statistical methods, especially those concerning model fitting, coefficient rounding, and validation
strategy. It also uses psychometric methods, addressing important measurement properties such as measurement level, content and construct validity, and reliability of the constructed scale. The proposed framework is illustrated by application to the construction of a composite scale for measurement of the risk of
ectopic pregnancy. Am J Epidemiol 1997;145:278-89.
measurement; models, statistical; pregnancy, ectopic; risk
Composite scoring systems have been increasingly
developed for use in epidemiology to combine information contained in a number of risk factors in a way
that best evaluates one subject's risk of disease. Despite controversy regarding the accuracy and generalizability of these systems, they are widely used, both
by clinicians for predicting the development of disease
in individuals free of disease (1-6) and by public
health professionals for evaluating risk in populations
or the efficiency of prevention programs (7). In various domains such as, for example, coronary heart
disease or breast cancer, many attempts have been
made to quantify the risk of morbidity or mortality
(1-10). However, results obtained with composite
scoring systems are far from conclusive. In particular,
the scoring systems constructed are invariably less
predictive when applied to subsequent populations
(11-13). Several explanations for these poor performances have been suggested. Selection biases could
distort the sample used for the system development,
such that it is not representative of the population for
which the system is to be used (12, 13). Statistical
methods, particularly multivariate methods, are also
frequently misused, e.g., verifications of model relevance and assumptions and cross-validations or resamplings to avoid overfitting are seldom performed (14).
However, the problem may be more conceptual: Many
systems are devised with the final aim of discriminating
between diseased (nonsurvivors) and nondiseased (survivors) and are evaluated in terms of "misclassification
rates." This involves the assumption that it is possible
to separate subjects at high risk from those at low risk,
i.e., that there is a clear-cut threshold value (cutoff point)
for the risk function. However, this assumption is likely
to be erroneous because of the nature of most predictors,
which are either continuous (age, body mass index,
cholesterol, etc.) or related to pathophysiologic processes that are also often continuous, although potentially
with floor or ceiling effects. In particular, cutoff points
may be very sensitive to lifestyle characteristics or
changing conditions of care (12).
After a previous investigation of pitfalls in the development of composite measurement scales (CMS)
for epidemiology (14), we present herein a methodological framework for the construction and validation
of a CMS to assess the risk of disease (death) as a
continuous phenomenon. The framework is illustrated
by application: It was used to construct a CMS for
assessing risk of ectopic pregnancy.
Received for publication December 4, 1995, and in final form October 21, 1996.
Abbreviation: CMS, composite measurement scale.
1 INSERM Unité U. 292, Hôpital de Bicêtre, Le Kremlin-Bicêtre Cedex, France.
2 Département de Biostatistique et d'Informatique Médicale, Hôpital Cochin, 75674 Paris Cedex 14, France.
Reprint requests to Dr. J. Coste, INSERM Unité U. 292, Hôpital de Bicêtre, 82 rue du Général Leclerc, 94276 Le Kremlin-Bicêtre Cedex, France.
DEVELOPMENT OF A CMS FOR RISK
ASSESSMENT IN ECTOPIC PREGNANCY
Ectopic pregnancy constitutes 1.2-1.4 percent of all
reported pregnancies and is one of the major causes of
maternal mortality in industrialized countries (15, 16).
Furthermore, ectopic pregnancy leads to permanent
sterility in 20-60 percent of cases and recurs in more
than 20 percent (17). Therefore, an indicator for the
risk of ectopic pregnancy has considerable practical
value for physicians, emergency services, and public
health professionals. It could be used 1) to predict
occurrence and provide an indication about which type
of early pregnancy follow-up and information should
be given to women planning a pregnancy, 2) to provide earlier diagnostic screening in case of mild symptoms, and 3) to help design prevention programs and
health planning in populations. However, many risk
factors are involved in the etiology of ectopic pregnancy (15), and thus, a valid estimation of the risk of
ectopic pregnancy for an individual woman is not
straightforward. For these reasons, we developed a
CMS to assess, as precisely as possible, the risk of
ectopic pregnancy for women planning or starting a
pregnancy.
Data sources
The data from three case-control studies of ectopic
pregnancy conducted in France from 1988 to 1994
were used. The first two studies had the same design
and methods, described in detail in original papers (18,
19). Briefly, the first study was a hospital-based case-control study conducted in seven maternity hospitals
in the Paris, France, area in 1988; the second study
was a hospital-based case-control study conducted between October 1988 and December 1991, in 15 maternity hospitals in the Rhone-Alpes Region. In both
studies, the cases were women between ages 15 and 44
years whose diagnosis of ectopic pregnancy was confirmed by celioscopy or laparotomy. The controls
were women who gave birth in the same maternity
hospitals as the cases who delivered immediately after
surgical intervention of a case (one control per case in
the Paris study and two controls per case in the Rhone-Alpes study). The third case-control study was register
based. A full description of case identification procedures, assessment of registration completeness, data
collection, and validation has been published previously (20). Cases were women resident in three departments (administrative units) in the Auvergne Region, with confirmed diagnosis of ectopic pregnancy
who were treated from January 1 to December 31,
1994. In this register-based case-control study, controls were women who gave birth in the same hospitals
as cases, whose delivery immediately followed diagnosis of a case, and who were similar to the cases with
regard to contraception status at the time of conception
(two controls per case were included).
To minimize biases associated with the absence of
controls undergoing induced abortion (21) and obtain
samples similar to the target population of women
planning or beginning a pregnancy, only cases and
controls who were married or living as married and not
using contraception at the time of conception (i.e.,
women who probably planned to complete their pregnancy) were used for this study of development of a
CMS for assessment of the risk of ectopic pregnancy.
A total of 190 cases and 246 controls satisfied these
criteria in the Paris study, 382 cases and 1,142 controls
in the Rhone-Alpes study, and 122 cases and 231
controls in the register-based Auvergne study. In the
following analyses, women from the largest study
(Rhone-Alpes study) were used as the "training sample" for construction of the CMS, and women from
the Paris study and the Auvergne register-based study were
used as two "validation samples." (The largest sample
was used to maximize precision of the estimated risk
equation.)
In the three studies, data on confirmed or suspected
risk factors for ectopic pregnancy were collected by
trained physicians or midwives using similar questionnaires. These data included the woman's sexual history, medical history including sexually transmitted
disease and pelvic inflammatory disease, previous surgical and reproductive history, previous use of birth
control and condoms, induced conception cycle, and
other factors such as sociodemographic characteristics, smoking habits at the time of conception, and
endometriosis. Information about the most frequent
sexual partner was also collected: his level of education, sociooccupational class, smoking status, and
number of episodes of sexually transmitted disease
during the previous 6 months. Blood samples were
also collected and tested for Chlamydia trachomatis
immunoglobulin G antibodies (22).
Preliminary steps of CMS development (training
sample)
The 382 cases and 1,142 controls of the training
sample were compared for all of the investigated exposures. Quantitative exposures such as age and smoking were categorized in this study by a priori cutpoints
used in previous publications by ourselves (18, 19)
and others (15). (In some contexts, optimal data-oriented cutpoints may be found using the methodology proposed by Schulgen et al. (23).) Odds ratios and
95 percent confidence intervals were used to describe
the association between ectopic pregnancy and potential risk factors (table 1). As expected, a large number
(n = 14) of the potential factors were found to be
associated with ectopic pregnancy.
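The crude odds ratios and confidence intervals reported in table 1 can be recomputed from the exposure counts. The sketch below is illustrative only: the text does not state which interval method was used, so it assumes the common log-based (Woolf) interval with z = 1.96, and the function name is hypothetical. With the prior tubal surgery counts from table 1 it returns values that round to the published 3.9 (2.6-6.1).

```python
import math

def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Crude odds ratio with a log-based (Woolf) 95% confidence interval.

    The interval method is an assumption; the paper does not name one.
    """
    odds_ratio = (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases)
    # Standard error of log(OR): square root of the sum of reciprocal counts.
    se_log_or = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                          + 1 / exposed_controls + 1 / unexposed_controls)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Prior tubal surgery (table 1): 50 of 382 cases and 42 of 1,142 controls exposed.
or_, lo, hi = crude_odds_ratio(50, 332, 42, 1100)
print(round(or_, 1), round(lo, 1), round(hi, 1))
```

The same call can be repeated for each category against its reference category; rows with zero counts would need separate handling.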
TABLE 1. Predictive ability of various single risk factors: crude odds ratio (OR) and 95 percent confidence interval (CI) for each factor and ectopic pregnancy (training sample), Rhône-Alpes region, 1988-1991

| Variable | Controls (n = 1,142), no. exposed | Controls, % | Cases (n = 382), no. exposed | Cases, % | Crude OR | 95% CI |
|---|---|---|---|---|---|---|
| Maternal age (years) | | | | | | |
| 15-24 | 249 | 21.8 | 63 | 16.5 | 1 | |
| 25-29 | 519 | 45.4 | 131 | 34.3 | 1.0 | 0.7-1.4 |
| 30-34 | 252 | 22.1 | 116 | 30.4 | 1.8 | 1.3-2.6 |
| 35-39 | 95 | 8.3 | 55 | 14.4 | 2.3 | 1.5-3.5 |
| ≥40 | 18 | 1.6 | 16 | 4.2 | 3.5 | 1.7-7.3 |
| Missing | 9 | 0.8 | 1 | 0.2 | | |
| No. of cigarettes/day at the time of conception (woman) | | | | | | |
| 0 | 786 | 68.8 | 205 | 53.7 | 1 | |
| 1-10 | 171 | 15.0 | 81 | 21.2 | 1.8 | 1.3-2.5 |
| 11-20 | 125 | 11.0 | 53 | 13.9 | 1.6 | 1.1-2.3 |
| >20 | 60 | 5.2 | 43 | 11.2 | 2.8 | 1.8-4.2 |
| No. of cigarettes/day (partner) | | | | | | |
| 0 | 581 | 50.9 | 172 | 45.0 | 1 | |
| 1-10 | 166 | 14.6 | 59 | 15.4 | 1.2 | 0.8-1.7 |
| 11-20 | 207 | 18.1 | 82 | 21.5 | 1.3 | 0.9-1.8 |
| >20 | 157 | 13.7 | 67 | 17.6 | 1.4 | 1.0-2.0 |
| Missing | 31 | 2.7 | 2 | 0.5 | | |
| Prior STD* | | | | | | |
| No | 785 | 68.8 | 179 | 46.9 | 1 | |
| Yes, without salpingitis | 313 | 27.4 | 112 | 29.3 | 1.6 | 1.2-2.0 |
| Yes, with probable PID*,† | 22 | 1.9 | 40 | 10.5 | 8.0 | 4.6-13.7 |
| Yes, with confirmed PID‡ | 22 | 1.9 | 51 | 13.3 | 10.1 | 6.0-17.2 |
| STD of sexual partner during previous 6 months | | | | | | |
| No | 1,122 | 98.2 | 358 | 93.7 | 1 | |
| Yes | 17 | 1.5 | 24 | 6.3 | 4.4 | 2.3-8.3 |
| Missing | 3 | 0.3 | 0 | 0.0 | | |
| Positive Chlamydia trachomatis serology | | | | | | |
| No | 1,026 | 89.9 | 232 | 60.7 | 1 | |
| Yes | 103 | 9.0 | 145 | 38.0 | 6.2 | 4.7-8.3 |
| Not done | 13 | 1.1 | 5 | 1.3 | | |
| Appendectomy | | | | | | |
| No | 709 | 62.1 | 208 | 54.5 | 1 | |
| Yes, unruptured appendix | 400 | 35.0 | 151 | 39.5 | 1.3 | 1.0-1.6 |
| Yes, ruptured appendix | 33 | 2.9 | 23 | 6.0 | 2.4 | 1.4-4.1 |
| Prior tubal surgery§ | | | | | | |
| No | 1,100 | 96.3 | 332 | 86.9 | 1 | |
| Yes | 42 | 3.7 | 50 | 13.1 | 3.9 | 2.6-6.1 |
| Prior uterine surgery§ | | | | | | |
| No | 1,128 | 98.8 | 366 | 95.8 | 1 | |
| Yes | 14 | 1.2 | 16 | 4.2 | 3.5 | 1.7-7.3 |

Table continues
TABLE 1. Continued

| Variable | Controls (n = 1,142), no. exposed | Controls, % | Cases (n = 382), no. exposed | Cases, % | Crude OR | 95% CI |
|---|---|---|---|---|---|---|
| Endometriosis | | | | | | |
| No | 1,132 | 99.2 | 359 | 94.0 | 1 | |
| Yes | 10 | 0.8 | 23 | 6.0 | 8.1 | 3.7-17.6 |
| Prior delivery | | | | | | |
| Yes | 507 | 44.4 | 189 | 49.5 | 1 | |
| No | 635 | 55.6 | 193 | 50.5 | 0.8 | 0.6-1.0 |
| Prior ectopic pregnancy | | | | | | |
| No | 1,123 | 98.3 | 323 | 84.5 | 1 | |
| Yes | 19 | 1.7 | 59 | 15.5 | 10.8 | 6.3-18.4 |
| Prior spontaneous abortion | | | | | | |
| No | 915 | 80.1 | 281 | 73.6 | 1 | |
| 1 | 183 | 16.0 | 76 | 19.9 | 1.3 | 1.0-1.8 |
| ≥2 | 44 | 3.9 | 25 | 6.5 | 1.9 | 1.1-3.1 |
| Prior induced abortion | | | | | | |
| No | 963 | 84.3 | 301 | 78.8 | 1 | |
| Yes | 179 | 15.7 | 81 | 21.2 | 1.4 | 1.1-1.9 |
| Previous use of combined estroprogestative pill | | | | | | |
| No | 264 | 23.1 | 92 | 24.1 | 1 | |
| Yes | 878 | 76.9 | 290 | 75.9 | 0.9 | 0.7-1.2 |
| Previous use of intrauterine device | | | | | | |
| No | 986 | 86.3 | 305 | 79.8 | 1 | |
| Yes | 156 | 13.7 | 77 | 20.2 | 1.6 | 1.2-2.2 |
| Previous use of condoms | | | | | | |
| No | 1,012 | 88.6 | 342 | 89.5 | 1 | |
| Yes | 140 | 11.4 | 40 | 10.5 | 0.9 | 0.6-1.2 |
| Previous sterilization | | | | | | |
| No | 1,139 | 99.7 | 381 | 99.7 | 1 | |
| Yes | 3 | 0.3 | 1 | 0.3 | 1.0 | 0.1-9.6 |
| Induced conception cycle | | | | | | |
| No | 1,116 | 97.7 | 348 | 91.1 | 1 | |
| Yes, with gonadotropins | 4 | 0.3 | 5 | 1.3 | 4.0 | 1.1-15.0 |
| Yes, with clomiphene | 10 | 0.9 | 14 | 3.7 | 4.5 | 2.0-10.2 |
| IVF*, with gonadotropins | 12 | 1.1 | 15 | 3.9 | 4.0 | 1.9-8.6 |

* STD, sexually transmitted disease; PID, pelvic inflammatory disease; IVF, in vitro fertilization.
† Probable pelvic inflammatory disease: association of fever, abdominal pain, and vaginal discharge.
‡ PID confirmed by celioscopy.
§ Other than that associated with previous ectopic pregnancy.

When many risk factors are involved in disease development, multivariate descriptive methods can be
useful to study correlations between them and possibly to identify a "structure" in the risk pattern (24, 25).
We used multiple correspondence analysis to examine relations among categorical risk factors for ectopic
pregnancy (26). This method is the natural counterpart to principal component analysis for categorical data:
It provides a multidimensional representation of the dependence between the rows and columns of a binary
contingency table (26). This representation is found by allocating scores to row and column categories and
displaying the categories as points in a reduced factor space. The factor scores are used as coordinates of
these points. Plots of factor scores, which are sorted by descending order of the eigenvalues, show
associations of risk factors that may be less obvious by simple cross-tabulations. (Simple cross-tabulations
provide many results that are not easy to summarize when the variables are numerous, as in this study.)
Multiple correspondence analysis was performed using PROC CORRESP of the SAS package (27). Multiple
correspondence analysis of 14 categorical risk factors associated with ectopic pregnancy in the above
univariate analysis led to four interpretable factors. A plot of the four factor scores is shown in figure 1.
The first factor opposes positive categories (presence) of risk factors to negative (absence) ones. In
particular, variables that are associated with higher ectopic pregnancy risk (prior ectopic pregnancy,
endometriosis, confirmed pelvic inflammatory disease, and tubal surgery) score higher on this axis. The second
factor shows the influence and possibly confounding status of age, the categories of which are ordered along
the axis. Previous use of intrauterine device was also well represented on this axis, located close to the
highest categories of age. The third factor opposes clomiphene-induced pregnancy and endometriosis to a
"cluster" of infection markers (pelvic inflammatory disease and C. trachomatis seropositivity) and prior
ectopic pregnancy. The fourth factorial axis opposes tubal and uterine surgery to heavy smoking. The closeness
or (on the contrary) the distance between categories of variables on the interpretable axes, reflecting the
strength of the associations between them, suggests the value of splitting nominal or ordered variables (e.g.,
induced conception and age) into several binary variables or, alternatively, aggregating variables (clustered
in this analysis) that evaluate various aspects of the same process, e.g., infection, which can be detected by
clinical and serologic variables. Overall, at least four independent dimensions for ectopic pregnancy risk
factors were suggested: infection, surgery, smoking, and induction.

[Figure 1 graphic not reproduced: risk factor categories plotted as points on axes 1 and 2 (top panel) and on axes 3 and 4 (bottom panel).]
FIGURE 1. Multiple correspondence analysis of risk factors: plot of axes 1 and 2 (top panel) and axes 3 and 4 (bottom panel). Training
sample (n = 1,554). Only categories with a contribution to an axis over 5 percent are considered. Code names/categories: Age <25, maternal
age less than 25 years; Age 25-29, maternal age 25-29 years; Age 30-34, maternal age 30-34 years; Age 35-39, maternal age 35-39 years;
Age ≥40, maternal age 40 years or more; Chlt, C. trachomatis seropositivity; Clomiph, conception cycle induced by clomiphene; CPID,
confirmed pelvic inflammatory disease; Endom, endometriosis; GonadoStim, conception cycle induced by gonadostimulins; IVF, in vitro
fertilization; PEP, prior ectopic pregnancy; PIUD, previous use of intrauterine device; PPID, nonconfirmed pelvic inflammatory disease;
Smoke >20, maternal smoking of more than 20 cigarettes/day; TuSurg, prior tubal surgery; UtSurg, prior uterine surgery. + (-) indicates risk
factor present (absent).
Modeling the ectopic pregnancy risk (training
sample)
Linear logistic regression was used to model the
relation between candidate risk factors (single or
grouped as suggested by multiple correspondence
analysis) and ectopic pregnancy. The relevance of the
multiplicative structure on which the logistic model is
based was examined by testing interactions (multiplicative terms) between the independent variables included in the model (data not shown). Goodness-of-fit
of the candidate models was studied using the -2
log-likelihood statistic (sometimes called deviance
(28)) and two statistics based on the comparison of
observed and predicted values: the Hosmer and
Lemeshow statistic (29) and Harrell's c index (25).
The -2 log-likelihood statistic measures unexplained
variability in the data, and thus, lower values indicate
a better fit. The Hosmer and Lemeshow statistic is
obtained by calculating the Pearson chi-square statistic
from the 2 × g table of observed and expected frequencies, where g = 10 is the number of groups
formed by deciles of risk. This statistic has an asymptotic χ² (g − 2) distribution. To calculate the c index,
all possible pairs of patients, one with the disease and
one without, are considered. A pair is said to be
concordant if the one with the disease has the higher
predicted disease probability. The c index is the proportion of all pairs that are concordant. This statistic
has many advantages (25): 1) it is easy to interpret
since it estimates the probability that for a randomly
chosen pair of patients, the one having the disease is
the one who has the greater risk; 2) it is equal to the
area under a "receiver operating characteristic" curve;
and 3) it is not affected by the value of the model's
constant and can be used to assess and compare the
predictive ability of a model in both training and
validation samples.
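The two statistics based on observed and predicted values can be sketched in a few lines. This is an illustration under stated assumptions, not the SAS implementation used in the study: function names are hypothetical, tied pairs are counted as half-concordant (an assumption that keeps the c index equal to the area under the receiver operating characteristic curve), and the groups are formed by sorting on predicted risk and cutting into g nearly equal parts.

```python
def c_index(predicted, diseased):
    """Proportion of (diseased, nondiseased) pairs that are concordant."""
    cases = [p for p, d in zip(predicted, diseased) if d]
    controls = [p for p, d in zip(predicted, diseased) if not d]
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5  # assumption: ties count as half
    return concordant / (len(cases) * len(controls))

def hosmer_lemeshow(predicted, diseased, g=10):
    """Pearson chi-square from the 2 x g table of observed and expected counts.

    Assumes every predicted probability lies strictly between 0 and 1.
    """
    pairs = sorted(zip(predicted, diseased), key=lambda pair: pair[0])
    n = len(pairs)
    statistic = 0.0
    for i in range(g):
        group = pairs[i * n // g:(i + 1) * n // g]
        if not group:
            continue
        expected_cases = sum(p for p, _ in group)
        observed_cases = sum(d for _, d in group)
        expected_noncases = len(group) - expected_cases
        observed_noncases = len(group) - observed_cases
        statistic += ((observed_cases - expected_cases) ** 2 / expected_cases
                      + (observed_noncases - expected_noncases) ** 2
                      / expected_noncases)
    return statistic

# Toy values, made up for illustration only.
print(c_index([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

The resulting Hosmer and Lemeshow value would then be compared with a chi-square distribution on g − 2 degrees of freedom, as described above.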
Resampling was also performed: 20 random subsamples containing 50 percent of the women in the
training sample were formed and used to test the
independent predictive values of risk factors. Classical
graphical methods were used to look for possible
outlying responses and influential observations (30) (we
did not find any that required deletion). SAS PROC
LOGIST (27) was used to develop the models, compute the goodness-of-fit statistics, and create diagnostic plots.
As suggested by multiple correspondence analysis
results, not all risk factors provided independent information (e.g., previous use of an intrauterine device
was strongly associated with age, prior induced abortion was associated with pelvic inflammatory disease),
and some of them were no longer predictive in many of the 50
percent subsamples (prior spontaneous abortion, uterine surgery, appendectomy). Leaving out these variables and aggregating clinical and serologic markers
of infection into a single variable led to a model with
seven risk factors. This model exhibited a very good
fit (table 2) and was considered to be the "final
model."
The regression coefficients of the final model were scaled and rounded to integers to make the scoring
systems simpler to use. We used the algorithm proposed by Cole (31) to find optimal scaled and rounded
coefficients. Two equivalent optimal solutions were provided, using different scaling coefficients (5 and
2.5). We subsequently verified that the resulting scaled/rounded integer coefficients provided predictive
ability similar to that provided by the original coefficients. Loss of fit, measured by the c index, due
to using scaled/rounded coefficients, was negligible (table 2).

TABLE 2. Parameter estimates, standard errors (SE), and statistics of fit of the final model (training
sample), Rhône-Alpes region, 1988-1991, and its two derived simplified models (with rounded/scaled
coefficients)

| Variable | Final model: β | Final model: SE(β) | Final model: odds ratio | Scaled/rounded coefficients: solution 1 (scaling coefficient 5) | Scaled/rounded coefficients: solution 2 (scaling coefficient 2.5) |
|---|---|---|---|---|---|
| Prior ectopic pregnancy | 2.003 | 0.297 | 7.4 | 10 | 5 |
| Endometriosis | 1.814 | 0.458 | 6.1 | 9 | 4 |
| Previous infection* | 1.582 | 0.148 | 4.8 | 8 | 4 |
| Clomiphene-induced pregnancy | 1.414 | 0.452 | 4.1 | 7 | 4 |
| Tubal surgery | 0.814 | 0.252 | 2.3 | 4 | 2 |
| Age† | 0.580 | 0.148 | 1.8 | 3 | 1 |
| Smoking‡ | 0.394 | 0.104 | 1.5 | 2 | 1 |

| Goodness-of-fit statistics | Final model | Solution 1 | Solution 2 |
|---|---|---|---|
| -2 log-likelihood (df) | 1,380.8 (1,497) | | |
| Hosmer-Lemeshow statistic: value (p)§ | 2.74 (0.95) | | |
| c index | 0.760 | 0.760 | 0.759 |

* Either salpingitis (confirmed or not) or Chlamydia trachomatis seropositivity.
† Coded as follows: 0, <35 years; 1, ≥35 and <40 years; 2, ≥40 years.
‡ Coded as follows: 0, no smoking; 1, smoking ≤20 cigarettes per day; 2, smoking >20 cigarettes per day.
§ Degrees of freedom of the chi-square statistic = 8.

[Figure 2 graphic not reproduced: four scatterplots of observed proportion of ectopic pregnancy (left) and its logit (right) against composite score; R² = 0.98 is printed in both right-hand panels.]
FIGURE 2. Evaluation of the measurement level of the two scaled/rounded equations. Left panels: plots of true scores (observed proportion
of ectopic pregnancy, vertical axis) against composite scores (horizontal axis). Right panels: plots of logit of true scores (logit of observed
proportion of ectopic pregnancy, vertical axis) against composite scores (horizontal axis); values of R² (the logit of proportion of ectopic
pregnancy for a given score as the dependent variable and the CMS score as the independent variable). Upper panels refer to scaled
equation according to solution one; lower panels refer to rounded equation according to solution two (see table 2). Training sample (n =
1,554).
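As a simple illustration of the scaling step, multiplying the final-model coefficients of table 2 by a scaling coefficient of 5 and rounding to the nearest integer reproduces the integer weights of solution 1. This sketch uses plain rounding only; Cole's algorithm (31) searches for an optimal scaled and rounded solution and is not reproduced here, so plain rounding need not match its output for other scaling coefficients. The dictionary keys are illustrative names.

```python
# Final-model regression coefficients, copied from table 2.
FINAL_MODEL_BETAS = {
    "prior_ectopic_pregnancy": 2.003,
    "endometriosis": 1.814,
    "previous_infection": 1.582,
    "clomiphene_induced_pregnancy": 1.414,
    "tubal_surgery": 0.814,
    "age_code": 0.580,
    "smoking_code": 0.394,
}

def scale_and_round(betas, scaling_coefficient):
    """Multiply each coefficient by the scaling coefficient, round to an integer."""
    return {name: round(beta * scaling_coefficient)
            for name, beta in betas.items()}

weights = scale_and_round(FINAL_MODEL_BETAS, 5)
print(weights)
```

Because the integer weights are the original coefficients multiplied by roughly 5, dividing a composite score by the scaling coefficient returns it to the log-odds scale of the final model.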
Level of measurement
Interval and ratio scales, which allow the use of
parametric statistics, are characterized by their ability
to preserve distances between subjects with respect to
the measured phenomenon (32). We showed that this
implies that there must be a linear relation between the
composite score and the measured phenomenon (possibly after transformations: log-interval/ratio and
logistic-interval/ratio scales). In other words, estimation of the strength of the linear relation between
composite scores and true attribute values is a method
for directly assessing whether the measurement scale
is of an interval or ratio type (33). Therefore, linear
regression and the square of the multiple correlation
coefficient (R² or any criterion that reflects the amount
of variation that is not explained by the linear regression, such as residual sum of squares or mean square
error) can be used as an indicator of the linear relation
between the composite score and the measured phenomenon. Details may be found in the paper by Coste
et al. (33). To explore this relation within the whole
spectrum of score values, a pair of variables (composite score, (logit of) true value) was assembled for 10
equal-range intervals of CMS score values (34). Each
couple was weighted by the number of observations in
the interval. The center of the CMS score values
interval was used as the independent variable, and the
associated (logit of) observed proportion of ectopic
pregnancy was used for patients within the interval as
the dependent variable. Plots of composite scores versus true values and values of R² for the two
scaled/rounded equations derived from the final model are given in figure 2. After logistic transformation
of the true value, the linear relation between composite scores and proportion of ectopic pregnancy was
confirmed: R² was higher than 0.98, confirming the (logistic) interval level of measurement of the two
simplified models.

TABLE 3. Predictive ability and level of measurement achieved by the final model equations (rounded/scaled
coefficients) for validation samples, Paris study, 1988, and Auvergne study, 1988-1991

| Equation | c index | R² |
|---|---|---|
| Paris study | | |
| Scaled/rounded coefficients, solution 1* | 0.745 | 0.94 |
| Scaled/rounded coefficients, solution 2* | 0.747 | 0.92 |
| Auvergne study | | |
| Scaled/rounded coefficients, solution 1 | 0.922 | 0.94 |
| Scaled/rounded coefficients, solution 2 | 0.920 | 0.89 |

* Scaled/rounded coefficients (see table 2).
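The check of measurement level described in this section can be sketched as a weighted linear regression of the logit of the observed proportion on the interval centers. The code below is a minimal illustration with hypothetical function names. How intervals that are empty, or whose observed proportion is 0 or 1, were handled is not stated in the text, so skipping them here is an assumption.

```python
import math

def binned_logit_points(scores, outcomes, n_bins=10):
    """Return (interval center, logit of observed proportion, n) per interval.

    Intervals have equal range over the observed scores. Empty intervals and
    intervals with an observed proportion of 0 or 1 are skipped (assumption).
    """
    low, high = min(scores), max(scores)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    cases = [0] * n_bins
    for score, outcome in zip(scores, outcomes):
        k = min(int((score - low) / width), n_bins - 1)  # top edge joins last bin
        counts[k] += 1
        cases[k] += outcome
    points = []
    for k in range(n_bins):
        if counts[k] == 0 or cases[k] in (0, counts[k]):
            continue
        proportion = cases[k] / counts[k]
        center = low + (k + 0.5) * width
        points.append((center, math.log(proportion / (1 - proportion)), counts[k]))
    return points

def weighted_r2(points):
    """R-squared of the weighted least-squares line through (x, y, weight) points."""
    total = sum(w for _, _, w in points)
    x_bar = sum(w * x for x, _, w in points) / total
    y_bar = sum(w * y for _, y, w in points) / total
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in points)
    syy = sum(w * (y - y_bar) ** 2 for _, y, w in points)
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in points)
    return sxy ** 2 / (sxx * syy)

# Made-up points lying exactly on a line, for illustration only.
print(weighted_r2([(0, 1.0, 5), (1, 3.0, 2), (2, 5.0, 9)]))
```

For a single predictor the R² of the weighted regression equals the squared weighted correlation, which is what `weighted_r2` computes; values close to 1 support a (logistic) interval level of measurement.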
Cross-validation studies
The predictive ability and the measurement level of
the two scaled/rounded CMS were evaluated using the
two independent validation samples. The values for
the c index and R² for these models were similar (table
3). We retained for further application the first simplified model (solution 1, scaling coefficient = 5),
which appeared more precise and informative and
provided a measurement level closer to the interval
one.
Computation of absolute risk in a given population
Previous sections focused on the determination of relative risk equations. The constant of the models had
no relevance, nor did the proportion of cases/controls in samples used for model development. However, in
several circumstances of clinical or public health practice, the absolute risk, i.e., the actual probability of
occurrence of the disease, is of greater interest. Since this risk has far less generalizability than the relative
risk because of its relation with baseline incidence rates that may vary from one population to another, it
is preferable to make estimates using the population to which it will be applied. When the risk
modeling is conducted in samples derived from this population, a direct estimation of the absolute risk or
probability of developing disease can be obtained. Contrary to cohort studies, case-control designs do not
allow direct estimation of the absolute risk, in particular because the intercept of the risk model is directly
linked to the relative proportion of cases and controls in the studied sample. Various methods have therefore
been proposed (5, 35-38), most of which combine relative risk estimates and some value of incidence
rates or attributable fraction, obtained from the same population. Benichou and Wacholder (39) recently
compared these methods in the multivariate setting and discussed their results. In other situations, such as
ours, the relative risk model has already been developed in a parent population. There, an approximation
of the absolute risk can be obtained if values for the frequency of disease and the risk factor exposure level
are available for the population of interest. The ectopic pregnancy register in Auvergne, the basis of the third
case-control study, contains these data. In 1994, the incidence ratio of ectopic pregnancy was two per 100
livebirths (40). If studied controls are representative of the exposure to risk factors of women from this region
who are planning a pregnancy, we can estimate the risk of ectopic pregnancy for a woman from this
region who plans a pregnancy. (Since the only selection criterion for controls was to give birth immediately
after a woman was treated for ectopic pregnancy, this representativeness seems reasonable.) According
to the logistic model equation, the (expected) risk, Pr(EP), of ectopic pregnancy in the Auvergne region,
as a function of risk score is given by

$$
\Pr(\mathrm{EP}) = \int \Pr(\mathrm{EP} \mid \mathrm{score})\, d\,\mathrm{score}
= \int \frac{\exp\left(\alpha + (\text{scaling coefficient})^{-1} \times \mathrm{score}\right)}{1 + \exp\left(\alpha + (\text{scaling coefficient})^{-1} \times \mathrm{score}\right)}\, d\,\mathrm{score},
$$

where the integral is taken over the distribution of scores in the population.
Although it is impossible to express this integral in closed form, an approximation may be obtained by
expanding $\ln[\Pr(\mathrm{EP} \mid \mathrm{score})/\Pr(\overline{\mathrm{EP}} \mid \mathrm{score})]$ as a first-order
Taylor series about the mean score, $\overline{\mathrm{score}}$ (this approximation performs well when the disease is rare
(41), as is the case here), whereby one can obtain the approximation:

$$
\Pr(\mathrm{EP}) \approx \frac{\exp\left(\alpha + (\text{scaling coefficient})^{-1} \times \overline{\mathrm{score}}\right)}{1 + \exp\left(\alpha + (\text{scaling coefficient})^{-1} \times \overline{\mathrm{score}}\right)}.
$$

When probabilities and expected means are replaced by their estimates, obtained in the Auvergne
population-based case-control study, the estimation of the constant is straightforward:

$$
0.02 = \frac{\exp\left(\hat{\alpha} + 5^{-1} \times 3.23\right)}{1 + \exp\left(\hat{\alpha} + 5^{-1} \times 3.23\right)}, \quad \text{so that } \hat{\alpha} = -4.538.
$$
This equation allows determination, for a given level
of risk factors (risk factor coding as given in table 2),
of the absolute risk of developing ectopic pregnancy in
the population of women from Auvergne who are
planning a pregnancy.
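The calibration of the constant and the resulting absolute risk can be checked numerically. The sketch below uses the values quoted in the text (incidence of 2 per 100, mean score of 3.23, scaling coefficient of 5); the function names are illustrative and not taken from the paper or its appendix.

```python
import math

def calibrate_constant(incidence, mean_score, scaling_coefficient):
    """Solve incidence = expit(constant + mean_score / scaling) for the constant."""
    logit_incidence = math.log(incidence / (1.0 - incidence))
    return logit_incidence - mean_score / scaling_coefficient

def absolute_risk(score, constant, scaling_coefficient):
    """Logistic transformation of the constant plus the rescaled score."""
    linear_predictor = constant + score / scaling_coefficient
    return 1.0 / (1.0 + math.exp(-linear_predictor))

constant = calibrate_constant(incidence=0.02, mean_score=3.23,
                              scaling_coefficient=5)
print(round(constant, 3))                         # -4.538, as in the text
print(absolute_risk(3.23, constant, 5))           # about 0.02 at the mean score
```

Applying the same two functions to another population would only require that population's incidence and mean score, which is the sense in which the constant, unlike the weights, is population specific.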
Presentation and use of the scale
The scale can be used in two ways: 1) using the
computed constant (and appropriate scaling and logistic transformation, as in the worksheet shown in Appendix 1) as an estimate of the absolute risk of ectopic
pregnancy in the Auvergne region and in populations
with similar baseline ectopic pregnancy incidence
rates or 2) without the constant, as an indicator of the
relative risk of ectopic pregnancy in a wider context;
the relative risk (RR) (given by the odds ratio) for
ectopic pregnancy for a given woman is simply
$$
\mathrm{RR} = \exp\left[\text{woman's computed score} \times (\text{scaling coefficient})^{-1}\right].
$$
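A worked example of the relative use of the scale follows. It assumes the solution 1 weights and the age and smoking codings given in table 2, and it assumes that each ordinal code is multiplied by its weight, as the underlying logistic model implies; the worksheet of Appendix 1 is not reproduced here, and the profile shown is hypothetical.

```python
import math

# Integer weights of solution 1 (table 2); scaling coefficient 5.
WEIGHTS = {
    "prior_ectopic_pregnancy": 10,
    "endometriosis": 9,
    "previous_infection": 8,
    "clomiphene_induced_pregnancy": 7,
    "tubal_surgery": 4,
    "age_code": 3,
    "smoking_code": 2,
}
SCALING_COEFFICIENT = 5

def age_code(age_years):
    """Table 2 coding: 0 if <35 years, 1 if 35-39 years, 2 if 40 or more."""
    return 0 if age_years < 35 else 1 if age_years < 40 else 2

def smoking_code(cigarettes_per_day):
    """Table 2 coding: 0 if none, 1 if up to 20 per day, 2 if more than 20."""
    if cigarettes_per_day == 0:
        return 0
    return 1 if cigarettes_per_day <= 20 else 2

def composite_score(profile):
    """Sum of weight times coded value; absent factors count as 0."""
    return sum(weight * profile.get(name, 0) for name, weight in WEIGHTS.items())

def relative_risk(score):
    """Odds ratio relative to a woman whose score is 0."""
    return math.exp(score / SCALING_COEFFICIENT)

# Hypothetical woman: prior tubal surgery, aged 36, smoking 10 cigarettes/day.
profile = {"tubal_surgery": 1, "age_code": age_code(36),
           "smoking_code": smoking_code(10)}
score = composite_score(profile)
print(score, relative_risk(score))
```

For this profile the score is 4 + 3 + 2 = 9, so the relative risk is exp(9/5), roughly six times the odds of a woman with none of the seven factors.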
DISCUSSION
This study illustrates that a simple risk scale, using
an integer-based linear combination, may be a valid
tool for predicting the risk of occurrence of a disease,
assessed as a continuous phenomenon. The good, correspondence of results in the cross-validation studies
suggests that the scale can provide accurate risk evaluation even when used for different populations and
under conditions of standard practice. Moreover, we
showed that the scale may be used as either an evaluation of the relative or absolute risk, according to the
degree of generalizability of the results to the population considered.
This methodological framework for scale development could be used in a wide variety of applications
that compare probabilistic evaluations and binary outcomes, such as assessment by a physician of the probability of the outcome for a patient (disease occurrence
or complication) and assessment by public health professionals of the likely impact of an intervention. This
framework integrates several statistical methods concerning model fitting, coefficient rounding, and validation strategy. These methods, published mainly in
specialized journals, are not well known by researchers developing CMS for epidemiology (14). Indeed,
multivariate statistical methods and modeling are necessary in the development of CMS that predict disease or
outcome (11-13). Verifications of model relevance
should be performed. Although multiplicative models
(especially logistic ones) have an established place in
epidemiologic methodology, the multiplicative structure does not always provide the most accurate
description of the processes being analyzed (42, 43).
Testing interactions between the variables included in
the model is useful to identify large departures from
multiplicativity (44). A complementary approach is to
study the goodness-of-fit of the model. For several
types of models, a variety of methods are available,
from graphic methods such as examination of "diagnostic plots" to statistics based on comparison of observed and predicted values (30, 45, 46). Alternatively,
statistical models may be excessively sensitive to the
data from which they are developed (10, 13). For large
data sets, variables may be inappropriately selected on
the basis of type I error ("overfitting"). Overfitting can
be minimized or avoided by resampling or cross-validation of the data. The best approach to cross-validation is to perform independent studies at different times or at the same time in different settings (11).
This is especially true when the populations and design used for the CMS construction are open to classical selection and information biases. The use of a
stepwise algorithm may also lead to inappropriate
selection of the important variables, especially with
small training samples (25, 47). One recommended
strategy is to perform all possible subset regressions
and select the best subset with the help of goodness-of-fit criteria. The other strategy, which we used in this
study, is to employ a two-step procedure in which the
initial set of candidate variables is first reduced by
using a multivariate descriptive analysis, e.g., factor
analysis or cluster analysis, associated with clinical
judgment (to form groupings of variables that are then
used as predictors) (25, 48). This strategy, in which
variables are included not only on grounds of statistical significance but also on the basis of an assessment of conceptual importance, merits attention.
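The sensitivity of a fitted score to the data used to develop it can be illustrated with a small cross-validation sketch. Everything in it is an assumption made for illustration: the data are simulated, the predictors are binary, and the weights are simple per-predictor log odds ratios (a stand-in for, not a reproduction of, the multivariate logistic model used in this study). The sketch compares the apparent c index, computed on the development data, with the c index obtained when each subject is scored by weights estimated without that subject's fold.

```python
import math
import random


def c_index(scores, outcomes):
    """Concordance index: P(case scores higher than non-case); ties count 1/2."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    total = 0.0
    for a in cases:
        for b in controls:
            if a > b:
                total += 1.0
            elif a == b:
                total += 0.5
    return total / (len(cases) * len(controls))


def fit_weights(X, y):
    """Illustrative stand-in for model fitting: one log odds ratio per binary
    predictor from its 2x2 table, with 0.5 added to every cell."""
    weights = []
    for j in range(len(X[0])):
        a = sum(1 for r, o in zip(X, y) if r[j] == 1 and o == 1) + 0.5
        b = sum(1 for r, o in zip(X, y) if r[j] == 1 and o == 0) + 0.5
        c = sum(1 for r, o in zip(X, y) if r[j] == 0 and o == 1) + 0.5
        d = sum(1 for r, o in zip(X, y) if r[j] == 0 and o == 0) + 0.5
        weights.append(math.log((a * d) / (b * c)))
    return weights


def score(row, weights):
    return sum(w * x for w, x in zip(weights, row))


def cross_validated_c(X, y, k=5, seed=0):
    """Score every subject with weights fitted on the other k-1 folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    held_out = [0.0] * len(X)
    for fold in folds:
        in_fold = set(fold)
        train = [i for i in idx if i not in in_fold]
        w = fit_weights([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            held_out[i] = score(X[i], w)
    return c_index(held_out, y)


# Simulated data (hypothetical): 8 binary predictors, only the first two matter.
rng = random.Random(42)
X, y = [], []
for _ in range(300):
    row = [1 if rng.random() < 0.3 else 0 for _ in range(8)]
    p = 1.0 / (1.0 + math.exp(-(-1.5 + 1.2 * row[0] + 0.8 * row[1])))
    X.append(row)
    y.append(1 if rng.random() < p else 0)

all_weights = fit_weights(X, y)
apparent_c = c_index([score(r, all_weights) for r in X], y)
cv_c = cross_validated_c(X, y)
print("apparent c index:", round(apparent_c, 3))
print("cross-validated c index:", round(cv_c, 3))
```

A cross-validated value noticeably below the apparent one would suggest that some of the weights reflect noise in the development sample. Internal cross-validation of this kind does not replace the independent studies recommended above.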
An important condition for application of a CMS in
practice, especially in the clinical setting, is simplicity
of use. Scaling the equation coefficients and rounding them to integers
may contribute to this simplicity. Loss of
information due to this simplification may be minimized using the algorithm proposed by Cole (31). We
verified that the loss due to rounding was negligible in
this study. Note that the approach of rounding coefficients, followed by an assessment of the loss due to
rounding, is an indirect but practical way to consider
random variability, i.e., precision of estimates. A more
direct approach to statistical uncertainty has been proposed recently by Gail et al. (5) and further developed
by Benichou and Wacholder (39). They proposed a
method to compute confidence intervals for relative or
absolute risks associated with multiple exposures using parametric bootstrap (5) or the delta method (39).
Further research is needed, however, to assess its value
and applicability in epidemiologic practice. In particular,
other sources of error, associated with sampling
biases or model misspecification discussed above,
may greatly exceed the specific statistical error in magnitude and
seriousness. Moreover, the requirement for
a computer program to calculate estimates (49) may be
disadvantageous in the context of clinical or routine
public health practice. Authors who develop CMS
should be encouraged to present their instruments in a
simple and attractive form, such as a worksheet or
nomogram.
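The idea of scaling, rounding, and then checking the loss due to rounding can be sketched as follows. This is a simplified illustration and not the algorithm proposed by Cole (31): the scaling factor is fixed at 5 (the factor stated in the worksheet footnote), and the coefficients are hypothetical values chosen for the example, not the estimates from this study. The loss is summarized as the largest absolute difference in log odds between the exact and the rounded score over all risk factor profiles.

```python
from itertools import product

# Hypothetical logistic coefficients (log odds ratios) for five binary risk
# factors; illustrative values only, not the estimates from this study.
coefficients = [1.97, 1.83, 1.62, 1.38, 0.84]
SCALE = 5  # scaling factor applied before rounding

# Scale each coefficient and round it to an integer number of points.
points = [round(SCALE * b) for b in coefficients]


def exact_logit(profile):
    """Contribution of the risk factors to the log odds, exact coefficients."""
    return sum(b * x for b, x in zip(coefficients, profile))


def rounded_logit(profile):
    """Same contribution recovered from the integer points."""
    return sum(p * x for p, x in zip(points, profile)) / SCALE


# Loss due to rounding: largest absolute difference over all 2^5 profiles.
profiles = list(product([0, 1], repeat=len(coefficients)))
max_loss = max(abs(rounded_logit(x) - exact_logit(x)) for x in profiles)

print("points:", points)
print("largest difference in log odds:", round(max_loss, 3))
```

A larger scaling factor reduces the rounding loss but produces larger point values, so the choice is a trade-off between accuracy and ease of use.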
The methodological framework presented also integrates some psychometric principles and methods of
measurement. Indeed, it appears that the approach to
measurement of complex phenomena applied for decades to psychology and behavioral sciences ("psychometrics") may be of considerable value to investigators developing composite instruments in the fields
of clinical research or epidemiology (50-53). The
evaluation of the validity of a CMS that aims to
replace or predict an external criterion, such as disease
occurrence or outcome, should focus on the analysis of
criterion validity, i.e., the investigation of the concordance between the result of the CMS and the external
criterion (14, 50). We have presented above the statistical problems associated with this analysis. However, criterion validity is not the only important aspect
of validity for these CMS. The collection and selection of
all candidate predictors, and their multidimensional structure, should be considered when determining the usefulness and applicability of the CMS. Again, factor analysis, i.e., principal component analysis (when
predictors are continuous variables) or multiple correspondence analysis (when predictors are categorical
variables) may be useful to determine a structure (dimensionality) in a set of risk factors. Sometimes, the
reliability of the constructed CMS should be considered: data collected in epidemiologic studies are often
based on interviews or clinical, radiologic, or histologic examination and are, therefore, liable to significant intraobserver or interobserver variability. Finally,
the legitimacy of using particular statistics with CMS
scores depends mainly on the level of measurement
achieved (32, 33); in particular, parametric statistics
should, in general, be used only with interval and ratio
scales. If the constructed scale is to be used as if
it were a ratio scale, the underlying linear relation
between the CMS score and the measured phenomenon should be assessed. Note that this assessment
provides complementary information about the fit of
the model since it directly evaluates the strength of the
linear (logistic) relation between predictors and outcome (in contrast to the Hosmer-Lemeshow and c indices,
which provide categorical and ordinal assessments of
this relation, respectively).
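One possible way to examine the linear relation between the score and the logit of the observed risk is sketched below. The grouped data are hypothetical, and the choice of an empirical logit with an unweighted least-squares line and Pearson correlation is an assumption made for illustration, not necessarily the procedure used in this study.

```python
import math

# Hypothetical validation data: (score, number of subjects, number of events).
groups = [(0, 100, 2), (5, 100, 5), (10, 100, 12),
          (15, 100, 27), (20, 100, 50), (25, 100, 73)]


def empirical_logit(n, events):
    """Log odds of the observed proportion, with 0.5 added to avoid log(0)."""
    return math.log((events + 0.5) / (n - events + 0.5))


xs = [s for s, _, _ in groups]
ys = [empirical_logit(n, e) for _, n, e in groups]

mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((v - mean_y) ** 2 for v in ys)
sxy = sum((x - mean_x) * (v - mean_y) for x, v in zip(xs, ys))

slope = sxy / sxx                 # change in log odds per point
intercept = mean_y - slope * mean_x
r = sxy / math.sqrt(sxx * syy)    # strength of the linear relation

print("slope per point:", round(slope, 3))
print("intercept:", round(intercept, 3))
print("correlation:", round(r, 4))
```

If points are coefficients scaled by 5, a slope close to 1/5 per point would be consistent with the score behaving as intended on the logit scale.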
In summary, we present a framework for the construction and validation of a CMS to assess the risk,
considered as a continuous phenomenon, of developing a particular disease or outcome. The framework
integrates both classical psychometric principles of
measurement and some specialized statistical methods. In particular, we suggest that psychometric methods should be more widely used by investigators planning to construct a CMS. Important measurement
properties and, in particular, measurement level, content and construct validity, and reliability should be
addressed. We remind researchers who develop CMS
that multivariate methods and statistical modeling
should be conducted carefully. Notably, model relevance and assumptions should be checked, and cross-validation in independent samples should be performed to avoid problems of overfitting. Finally, we
suggest that every effort should be made to simplify
the constructed CMS and to present it in an attractive form
to ensure its applicability in diverse settings.
ACKNOWLEDGMENTS
The authors thank Dr. Pierre Ducimetiere for valuable
comments on an earlier draft of this paper.
REFERENCES
1. Multiple Risk Factor Intervention Trial Group. Statistical design considerations in the NHLBI Multiple Risk Factor Intervention Trial (MRFIT). J Chronic Dis 1977;30:261-75.
2. Lipid Research Clinics Program. The coronary primary prevention trial: design and implementation. J Chronic Dis 1979;32:609-31.
3. Anderson KM, Wilson PWF, Odell PM, et al. An updated coronary risk profile. A statement for health professionals. Circulation 1991;83:356-62.
4. Chambless LE, Dobson AJ, Patterson CC, et al. On the use of logistic risk score in predicting risk of coronary heart disease. Stat Med 1990;9:385-96.
5. Gail MH, Brinton LA, Byar DP, et al. Projecting individualized probabilities of developing breast cancer for white females who are being examined annually. J Natl Cancer Inst 1989;81:1879-86.
6. Gazmararian JA, Foxman B, Yen LT, et al. Comparing the predictive accuracy of health risk appraisal: the Centers for Disease Control versus Carter Center program. Am J Public Health 1991;81:1296-301.
7. Rothenberg R, Ford ES, Vartiainen E. Ischemic heart disease prevention: estimating the impact of interventions. J Clin Epidemiol 1992;45:21-9.
8. Negri E, DeCarli A, La Vecchia C, et al. Identification of high risk groups for breast cancer by means of logistic models. J Clin Epidemiol 1990;43:413-18.
9. Katz D, Foxman B. How well do prediction equations predict? Using receiver operating characteristic curves and accuracy curves to compare validity and generalizability. Epidemiology 1993;4:319-26.
10. Phillips AN, Thompson SG, Pocock SJ. Prognostic scores for detecting a high risk group: estimating the sensitivity when applied to new data. Stat Med 1990;9:1189-98.
11. Wasson JH, Sox HC, Neff RK, et al. Clinical prediction rules. Applications and methodological standards. N Engl J Med 1985;313:793-9.
12. Diamond GA. Future imperfect: the limitations of clinical prediction models and the limits of clinical prediction. J Am Coll Cardiol 1989;14(Suppl. A):12A-22A.
13. Charlson ME, Ales KL, Simon R, et al. Why predictive indexes perform less well in validation studies. Is it magic or methods? Arch Intern Med 1987;147:2155-61.
14. Coste J, Fermanian J, Venot A. Methodological and statistical problems in the construction of composite measurement scales: a survey of six medical and epidemiological journals. Stat Med 1995;14:331-45.
15. Chow WH, Daling JR, Cates W Jr, et al. Epidemiology of ectopic pregnancy. Epidemiol Rev 1987;9:70-94.
16. Washington AE, Katz P. Ectopic pregnancy in the United States: economic consequences and payment sources trends. Obstet Gynecol 1993;81:287-92.
17. Ory SJ. New options for diagnosis and treatment of ectopic pregnancy. JAMA 1992;267:534-7.
18. Coste J, Job-Spira N, Fernandez H, et al. Risk factors for ectopic pregnancy: a case-control study in France, with special focus on infectious factors. Am J Epidemiol 1991;133:839-49.
19. Coste J, Laumon B, Bremond A, et al. Sexually transmitted diseases as major causes of ectopic pregnancy: results from a large case-control study in France. Fertil Steril 1994;62:289-95.
20. Coste J, Job-Spira N, Aublet-Cuvelier B, et al. Incidence of ectopic pregnancy. First results of a population-based register in France. Hum Reprod 1994;9:742-5.
21. Weiss NS, Daling JR, Chow WH. Control definition in case-control studies of ectopic pregnancy. Am J Public Health 1985;75:67-8.
22. Wang SP, Grayston JT, Alexander ER, et al. Simplified microimmunofluorescence test with trachoma-lymphogranuloma venereum (Chlamydia trachomatis) antigens for use as a screening test for antibody. J Clin Microbiol 1975;1:250-5.
23. Schulgen G, Lausen B, Olsen JH, et al. Outcome-oriented cutpoints in analysis of quantitative exposures. Am J Epidemiol 1994;140:172-84.
24. Leclerc A, Luce D, Lert F, et al. Correspondence analysis and logistic modelling: complementary use in the analysis of a health survey among nurses. Stat Med 1988;7:983-95.
25. Harrell FE, Lee KL, Califf RM, et al. Regression modeling strategies for improved prognostic prediction. Stat Med 1984;3:143-52.
26. Lebart L, Morineau A, Warwick KM. Multivariate descriptive statistical analysis: correspondence analysis and related techniques. New York, NY: John Wiley & Sons, 1984.
27. SAS Institute, Inc. SAS user's guide: statistics. Version 6 ed. Cary, NC: SAS Institute, Inc., 1990.
28. Hosmer DW Jr, Lemeshow S. Applied logistic regression. New York, NY: John Wiley & Sons, 1989.
29. Lemeshow S, Hosmer DW Jr. A review of goodness of fit statistics for use in the development of logistic regression models. Am J Epidemiol 1982;115:92-106.
30. Makuch RW, Rosenberg PS. Identifying prognostic factors in binary outcome data: an application using liver function tests and age to predict liver metastases. Stat Med 1988;7:843-56.
31. Cole TJ. Scaling and rounding regression coefficients to integers. Appl Stat 1993;42:261-8.
32. Stevens SS. On the theory of scales and measurement. Science 1946;103:667-80.
33. Coste J, Walter E, Venot A. A new approach to selection and weighting of items in evaluative composite measurement scales. Stat Med 1995;14:2565-80.
34. Coste J, Wassermann D, Venot A. Predicting mortality in adult burned patients: methodological aspects of the construction and validation of a composite ratio scale. J Clin Epidemiol 1996;49:1125-31.
35. Miettinen O. Estimability and estimation in case-referent studies. Am J Epidemiol 1976;103:226-35.
36. Greenland S. Multivariate estimation of exposure-specific incidence from case-control studies. J Chronic Dis 1981;34:445-53.
37. Greenland S. Estimation of exposure-specific rates from sparse case-control data. J Chronic Dis 1987;40:1087-94.
38. Benichou J, Gail MH. Methods of inference for estimates of absolute risk derived from population-based case-control studies. Biometrics 1995;51:182-94.
39. Benichou J, Wacholder S. A comparison of three approaches to estimate exposure-specific incidence rates from population-based case-control data. Stat Med 1994;13:651-61.
40. Coste J, Job-Spira N, Aublet-Cuvelier B, et al. Stability of incidence rates of ectopic pregnancy. Results of a population-based register in France. Presented at the 15th Fertility and Sterility World Congress, Montpellier, France, September 1995.
41. Rosner B, Willett WC, Spiegelman D. Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error. Stat Med 1989;8:1051-69.
42. Rothman KJ. The estimation of synergy or antagonism. Am J Epidemiol 1976;103:506-11.
43. Greenland S. Limitations of the logistic analysis of epidemiologic data. Am J Epidemiol 1979;110:693-8.
44. Greenland S. Tests for interaction in epidemiologic studies: a review and a study of power. Stat Med 1983;2:243-51.
45. Pregibon D. Logistic regression diagnostics. Ann Stat 1981;9:705-24.
46. Kay R, Little S. Assessing the fit of the logistic model: a case study of children with the haemolytic uraemic syndrome. Appl Stat 1986;35:16-30.
47. Flack VF, Chang PC. Frequency of selecting noise variables in subset regression analyses: a simulation study. Am Stat 1987;41:84-6.
48. Marshall G, Grover FL, Henderson WG, et al. Assessment of predictive models for binary outcomes: an empirical approach using operative death from cardiac surgery. Stat Med 1994;13:1501-11.
49. Benichou J. A computer program for estimating individualized probabilities of breast cancer. Comp Biomed Res 1993;26:373-82.
50. Nunnally JC. Psychometric theory. New York, NY: McGraw-Hill Book Co., 1978.
51. Kirshner B, Guyatt G. A methodological framework for assessing health indices. J Clin Epidemiol 1985;38:27-36.
52. Cox DR, Fitzpatrick R, Fletcher AE, et al. Quality of life assessment: can we keep it simple? J R Stat Soc 1992;155:353-93.
53. Streiner DL, Norman GR. Health measurement scales. A practical guide to their development and use. Oxford, England: Oxford University Press, 1989.
APPENDIX 1
Ectopic pregnancy (absolute) risk prediction worksheet (to be used in the Auvergne Region, France).
1: Score each risk factor (number of points)
Age (yr)    Points*     Smoking (cig/day)    Points     Other factors               Points
                                                                                    Yes    No
<35         0           no                   0          Prior ectopic pregnancy     10     0
35-39       3           ≤20                  2          Endometriosis                9     0
≥40         6           >20                  4          Previous infection           8     0
                                                        Clomiphene induction         7     0
                                                        Tubal surgery                4     0
2: Add points for all risk factors
Age
Smoking
Prior ectopic pregnancy
Endometriosis
Previous Infection
Clomiphene Induction
Tubal Surgery
Total
3: Read the risk for ectopic pregnancy corresponding to the points total from the following nomogram:
[Nomogram: a "Point total" scale aligned with the corresponding "Risk" scale; the scale values are not recoverable from the extracted text.]
(For example, a 36-year-old woman smoking 25 cigarettes/day, with a previous ectopic pregnancy and a clomiphene-induced pregnancy,
will have a score = 3 + 4 + 10 + 7 = 24, which corresponds to a risk for ectopic pregnancy = 0.57.)
* Points are scaled (× 5) coefficients of the logistic model; therefore, exp[points × (scaling coefficient)⁻¹] represents the odds ratio associated with a given risk
factor scheme.
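The arithmetic of the worksheet can be sketched in a few lines. The point values below are those read from the worksheet, under the assumption that the points for endometriosis, previous infection, and tubal surgery are 9, 8, and 4, respectively, as the layout suggests. The function and dictionary names are hypothetical. The absolute risk is not computed because it has to be read from the nomogram. The odds ratio follows the footnote, exp(points / 5), and is relative to a woman with zero points.

```python
import math

SCALING = 5  # points are logistic coefficients multiplied by 5 (footnote)

# Points for the yes/no risk factors, as read from the worksheet.
OTHER_FACTOR_POINTS = {
    "prior_ectopic_pregnancy": 10,
    "endometriosis": 9,
    "previous_infection": 8,
    "clomiphene_induction": 7,
    "tubal_surgery": 4,
}


def age_points(age_years):
    """0 points under 35, 3 points for 35-39, 6 points for 40 or older."""
    if age_years < 35:
        return 0
    if age_years < 40:
        return 3
    return 6


def smoking_points(cigarettes_per_day):
    """0 points for non-smokers, 2 for up to 20 cig/day, 4 for more than 20."""
    if cigarettes_per_day <= 0:
        return 0
    if cigarettes_per_day <= 20:
        return 2
    return 4


def total_points(age_years, cigarettes_per_day, factors=()):
    """Step 2 of the worksheet: add the points for all risk factors."""
    return (age_points(age_years)
            + smoking_points(cigarettes_per_day)
            + sum(OTHER_FACTOR_POINTS[name] for name in factors))


def odds_ratio(points):
    """Odds ratio relative to zero points: exp(points / scaling coefficient)."""
    return math.exp(points / SCALING)


# Worked example from the worksheet: 3 + 4 + 10 + 7 = 24 points.
example = total_points(36, 25, ["prior_ectopic_pregnancy", "clomiphene_induction"])
print("total points:", example)
print("odds ratio versus zero points:", round(odds_ratio(example), 1))
```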