Human Reproduction Vol.17, No.6 pp. 1616–1622, 2002
An evaluation of the inter-observer and intra-observer
variability of the ultrasound diagnosis of polycystic ovaries
S.A.K.S.Amer1, T.C.Li1,5, C.Bygrave2, A.Sprigg3, H.Saravelos4 and I.D.Cooke1
1Department of Reproductive Medicine and Surgery, The Jessop Wing, Sheffield Teaching Hospitals NHS Trust, Tree Root Walk,
Sheffield S10 2SF, 2Department of Medical Physics & Clinical Technology, Royal Hallamshire Hospital, Glossop Road, Sheffield
S10 2JF, 3Department of Radiology, Sheffield Children’s Hospital NHS Trust, Western Bank, Sheffield S10 2TH, UK and 4University
Department of Obstetrics and Gynecology, Hippokration Hospital, Aristotle University of Thessaloniki, 54008 Thessaloniki, Greece
5To whom correspondence should be addressed. E-mail: [email protected]
BACKGROUND: This prospective observational study was undertaken to evaluate the reliability and consistency
of ultrasound diagnosis of polycystic ovarian syndrome (PCOS). METHODS: Eighteen women with clinical and
biochemical features suggestive of PCOS and nine normal control women underwent transvaginal ultrasound scan
by a single ultrasonographer. The 27 ovarian scans were video-recorded and the recordings were later edited and
arranged randomly so that each record appeared twice at random on the tape producing a total of 54 ovarian
scans. Four experienced observers independently reviewed the recordings. The observers scored each case as
follows: normal, possible polycystic ovary (PCO) and definite PCO. RESULTS: The mean intra-observer agreement
was 69.4% (κ = 0.54) and the mean inter-observer agreement was 51% (κ = 0.28). CONCLUSION: The results
suggest that the currently used ultrasonographic criteria for the diagnosis of polycystic ovaries do have significant
intra-observer and inter-observer variability and as such must be considered subjective. Transvaginal ultrasonography alone may not therefore be a reliable method of diagnosing or excluding PCOS.
Key words: observer variability/polycystic ovaries/transvaginal scan
Introduction
Polycystic ovarian syndrome (PCOS) is a common endocrine
disorder, affecting women in the reproductive age group.
Common presenting features include anovulatory infertility,
oligo/amenorrhoea, hirsutism and obesity. In many cases, the
condition is associated with a number of well-recognized
biochemical features including raised serum LH level, high
LH:FSH ratio and elevated plasma androgen levels. Ultrasonographically, the condition may be associated with bilaterally
enlarged polycystic ovaries (PCO). Despite this classic picture,
there is still much controversy in the diagnostic criteria for
PCOS. In England and Europe, the diagnosis is primarily
based on ovarian morphology as assessed by transvaginal scan
(Fox et al., 1991; Balen et al., 1995; Homburg, 1996), whereas
in North America, the emphasis appears to be on biochemical features, especially hyperandrogenaemia
(Lobo, 1995; Carmina et al., 1997; Lewis, 2001). In some
studies, the diagnosis of PCOS was based on clinical and
endocrinological features without reference to ultrasound morphology, e.g. in one study (Loucks et al., 2000) the diagnosis
of PCOS was based on chronic anovulation plus one of the
three other features: (i) hirsutism, (ii) hyperandrogenaemia, or
(iii) increased LH:FSH ratio (≥2); only 21/63 (33%) of ‘PCOS’
patients had ultrasound evidence of PCOS.
The classic ultrasonographic features of PCOS, which have
been previously described (Swanson et al., 1981; Adams et al.,
1986), include an enlarged ovary with multiple (≥10) small
cysts (2–8 mm in diameter), which are typically arranged
peripherally around an increased echogenic stroma. Although
these are the most frequently used ultrasonographic criteria
for the diagnosis of PCOS, there are several reasons why the
criteria have not been universally accepted as a gold standard
for diagnosis. First, considerable overlap exists between the
normal and the PCO in follicular number and size and ovarian
volume to the extent that a cut-off level with satisfactory
sensitivity and specificity cannot be obtained for many of the
parameters (Pache et al., 1992; Fox, 1999). The number of
small cysts necessary to define PCO on ultrasound has been
reported to vary between >5 (Yeh et al., 1987; Battaglia et al.,
1999), >10 (Adams et al., 1985) and >15 (Fox et al., 1991).
Furthermore, some of the criteria used to define PCO, such as
the stromal echogenicity and follicular pattern, are purely
subjective. Whilst some investigators believe that the ovarian
volume is the most important criterion (Swanson et al., 1981),
others put much emphasis on the stromal hyperechogenicity
(Ardaens et al., 1991). Second, the precision of the commonly
used ultrasound diagnostic criteria has never been formally
evaluated. It could well account for the significant variation
in the prevalence of PCO amongst various investigators:
normal population 2.5–33% (Swanson et al., 1981; Polson
et al., 1988; Clayton et al., 1992; Farquhar et al., 1994;
Borgfeldt and Andolf, 1999; Koivunen et al., 1999; Michelmore
et al., 1999; Loucks et al., 2000); anovulatory infertility
patients 57–83% (Adams et al., 1986; Hull, 1987; Kousta
et al., 1999); and recurrent miscarriage population 7.8–50%
(Sagle et al., 1988; Regan et al., 1990; Liddell et al., 1997;
Li et al., 2000; Rai et al., 2000).
© European Society of Human Reproduction and Embryology
In this study we aimed to evaluate the precision of one of
the more widely used ultrasound criteria for the diagnosis of
PCOS by measuring inter-observer and intra-observer variability using video-taped recordings of ovarian ultrasonography.
Materials and methods
Subjects
A total of 27 women were included in the study. Eighteen of
the women presented with either clinical or biochemical features
suggestive of PCOS, including oligo/amenorrhoea, elevated serum
LH (≥10 IU/l), elevated LH:FSH ratio of ≥2, elevated androgen
levels (testosterone ≥2.5 nmol/l, androstenedione ≥10 nmol/l, or
free androgen index >4). The remaining nine subjects did not have
any clinical or biochemical features indicative of the condition. In
the latter group, all the women had normal menstrual cycles with
cycle lengths of between 25 and 35 days, normal serum LH and
androgen levels, a normal body mass index (between 20 and
26 kg/m²) and were aged between 20 and 35 years.
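The biochemical thresholds listed above can be written as a simple screening check. The following is a minimal sketch only: the function name, argument names and returned labels are illustrative assumptions and do not form part of the study protocol.

```python
from typing import List, Optional


def suggestive_biochemical_features(
    lh_iu_l: Optional[float] = None,
    fsh_iu_l: Optional[float] = None,
    testosterone_nmol_l: Optional[float] = None,
    androstenedione_nmol_l: Optional[float] = None,
    free_androgen_index: Optional[float] = None,
) -> List[str]:
    """Return the biochemical features (thresholds from the Subjects section) that are met."""
    features = []
    if lh_iu_l is not None and lh_iu_l >= 10:
        features.append("elevated LH")
    # The ratio is only evaluated when both values are available and FSH is non-zero.
    if lh_iu_l is not None and fsh_iu_l and lh_iu_l / fsh_iu_l >= 2:
        features.append("elevated LH:FSH ratio")
    if testosterone_nmol_l is not None and testosterone_nmol_l >= 2.5:
        features.append("elevated testosterone")
    if androstenedione_nmol_l is not None and androstenedione_nmol_l >= 10:
        features.append("elevated androstenedione")
    if free_androgen_index is not None and free_androgen_index > 4:
        features.append("elevated free androgen index")
    return features
```

Clinical features such as oligo/amenorrhoea are deliberately left out of this sketch, since they are not numeric thresholds.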
Transvaginal ultrasonography
All subjects underwent a transvaginal scan by a single ultrasonographer, using a Toshiba ultrasound machine (model Sonolayer
SSA-250A) with a convex 6 MHz transvaginal ultrasound probe.
Women with regular menstrual cycles were scanned on days 2–5 of
the cycle, whereas women with irregular cycles were not timed
according to the menstrual cycle. Each ovary was localized in relation
to the iliac vessels, and scanned from inner to outer margins in
longitudinal cross-sections and from upper to lower ends in transverse
cross-sections. The three diameters of the ovary were measured
(longitudinal, anteroposterior and transverse). All the ovarian scans
were video-recorded using a Panasonic PAL video system (AG-6200).
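The text records three ovarian diameters but does not state how they were converted to a volume. A minimal sketch, assuming the widely used prolate ellipsoid approximation (an assumption, not a method confirmed by this paper) and diameters given in millimetres, might look as follows; the function and parameter names are hypothetical.

```python
import math


def ellipsoid_volume_ml(longitudinal_mm: float, anteroposterior_mm: float, transverse_mm: float) -> float:
    """Prolate ellipsoid approximation: V = (pi / 6) * d1 * d2 * d3, converted from mm^3 to ml."""
    volume_mm3 = (math.pi / 6.0) * longitudinal_mm * anteroposterior_mm * transverse_mm
    return volume_mm3 / 1000.0  # 1 ml = 1000 mm^3
```

Under this assumption, an ovary measuring 40 x 30 x 20 mm would have a volume of roughly 12.6 ml.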
Editing and randomization of ultrasonographic records
The 27 ultrasonographic records were duplicated using a Panasonic
video-editing system (AG-5700). A total of 54 ultrasonographic
records were therefore derived from the 27 original records from the
27 women who participated in the study. The 54 records were each
given a number and arranged randomly in a final edited videotape
record. In all, four identical videotapes were produced for evaluation
by each of the four observers.
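The duplication and random ordering described above can be sketched in a few lines. How the random order was actually generated is not stated in the text, so the seeded pseudo-random shuffle, the function name and the record numbering below are assumptions made purely for illustration.

```python
import random
from typing import List, Tuple


def build_review_tape(n_records: int = 27, seed: int = 0) -> List[Tuple[int, int]]:
    """Return (tape_position, original_record_id) pairs with every record appearing twice."""
    duplicated = [record_id for record_id in range(1, n_records + 1) for _ in range(2)]
    rng = random.Random(seed)  # fixed seed so the same order can be reproduced for every tape copy
    rng.shuffle(duplicated)
    return list(enumerate(duplicated, start=1))
```

Keeping the mapping from tape position to original record is what later allows the two readings of the same case to be paired for the intra-observer analysis.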
Evaluation of the ultrasound record by independent observers
Four individuals with experience in transvaginal ultrasonography
were asked to evaluate the videotape records. Two of them were
ultrasonographers with >15 years experience each in gynaecological
ultrasonography, whereas the other two were gynaecologists with a
special interest in Reproductive Medicine and transvaginal scan (who
would normally perform >20 transvaginal scans per week). They
were asked to examine the videotape records and score the appearance
of each ovary using the following published criteria (Swanson et al.,
1981; Adams et al., 1985). Definite PCO: if all the features of PCO
are present (score = 2) including: (i) an increased echogenic (bright)
stroma; (ii) ≥10 small (2–8 mm) peripheral cysts; (iii) an increased
ovarian volume (≥12 ml). Possible PCO: if some but not all the
features of PCO are present (score = 1). Normal ovarian morphology,
not indicative of PCO: if none of the above features of PCO are
present (score = 0).
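The three-level scoring rule above can be stated compactly in code. This is a sketch under the assumption that each feature has already been judged present or absent; the function and argument names are hypothetical, and stromal brightness in particular remains a visual judgement that the code cannot make.

```python
def score_ovary(bright_stroma: bool, peripheral_cyst_count: int, volume_ml: float) -> int:
    """Map the three morphological features to the 0/1/2 score described in the text."""
    features_present = [
        bright_stroma,                # (i) increased echogenic stroma (judged visually)
        peripheral_cyst_count >= 10,  # (ii) at least 10 small (2-8 mm) peripheral cysts
        volume_ml >= 12,              # (iii) increased ovarian volume
    ]
    if all(features_present):
        return 2  # definite PCO
    if any(features_present):
        return 1  # possible PCO
    return 0  # normal ovarian morphology
```

Note that the middle category covers every case with one or two features, which is where most of the room for disagreement lies.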
Ethical issues
This prospective study was approved by the South Sheffield Ethics
Committee. Informed consent was obtained from each of the patients
participating in the study.
Statistical analysis
The results of the scoring by each of the four observers were entered
into the Statistical Package for Social Science (SPSS) for PC version
10.01. κ-Statistics were used to determine the degree of intra-observer
and inter-observer agreement after correction for the agreement
expected by chance. A κ-value has a maximum of 1.0 when agreement
is perfect. A value of 0 indicates no agreement better than chance
agreement. Values between 0 and 1 are interpreted according to
published guidelines (Landis and Koch, 1977) subsequently modified
(Altman, 1991).
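For readers who wish to see the chance correction spelled out, the sketch below computes κ for two sets of categorical scores and attaches a strength-of-agreement label. It assumes the unweighted form of Cohen's κ (the text does not say whether a weighted variant was used) and the bands given by Altman (1991); the function names are illustrative and this is not the SPSS implementation.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(ratings_a: Sequence[int], ratings_b: Sequence[int]) -> float:
    """Unweighted Cohen's kappa for two equally long sequences of categorical scores."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: product of the two marginal proportions, summed over categories.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical category throughout
    return (observed - expected) / (1.0 - expected)


def altman_strength(kappa: float) -> str:
    """Strength-of-agreement bands as given by Altman (1991)."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"
```

For intra-observer agreement the two sequences would be the first and second readings by one observer; for inter-observer agreement they would be the readings of the same scans by two different observers.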
Results
Intra-observer agreement
The intra-observer agreement for each of the 27 cases for the
four observers is summarized in Tables I and II. Observer 1
agreed with himself in 20 out of 27 cases (74%, κ = 0.54),
observer 2 agreed with himself in 21 out of 27 cases (78%,
κ = 0.56), observer 3 agreed with himself in 17 out of 27
cases (63%, κ = 0.47) and observer 4 agreed with himself in
17 out of 27 cases (63%, κ = 0.46). The mean intra-observer
agreement for all the four observers is therefore 69.4% and
the corresponding κ-value is 0.54 (Figure 1).
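The percentages quoted above follow directly from the agreement counts. As a short check, using only the counts reported in the text (the variable names are ours):

```python
# Agreement counts per observer (out of 27 paired readings), as reported in the text.
agreements = {"observer 1": 20, "observer 2": 21, "observer 3": 17, "observer 4": 17}
cases_per_observer = 27

percent = {name: 100 * n / cases_per_observer for name, n in agreements.items()}
overall = 100 * sum(agreements.values()) / (cases_per_observer * len(agreements))

for name, value in percent.items():
    print(f"{name}: {value:.1f}%")
print(f"overall: {overall:.1f}%")
```

The overall figure is 75 agreements out of 108 paired readings, i.e. 69.4%.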
Inter-observer agreement
The inter-observer agreements between each pair of observers
(1 and 2, 1 and 3, 1 and 4, 2 and 3, 2 and 4, 3 and 4) are
summarized in Tables III and IV. The agreement ranged from
43 to 72%. The corresponding κ-values ranged from 0.11 to
0.58. The average agreement was 51% (Figure 2) and the
κ-value was 0.28.
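The range and average quoted above can be reproduced from the pairwise agreement counts in Table III. The following sketch uses those counts only; the variable names are illustrative.

```python
# Agreements per observer pair (out of 54 scans), taken from Table III.
pair_agreements = {"1-2": 23, "1-3": 27, "1-4": 26, "2-3": 25, "2-4": 25, "3-4": 39}
scans_per_pair = 54

pair_percent = {pair: 100 * n / scans_per_pair for pair, n in pair_agreements.items()}
overall_percent = 100 * sum(pair_agreements.values()) / (scans_per_pair * len(pair_agreements))

lowest = min(pair_percent.values())
highest = max(pair_percent.values())
print(f"range: {lowest:.0f}-{highest:.0f}%, overall: {overall_percent:.1f}%")
```

The overall value corresponds to 165 agreements out of 324 pairwise comparisons.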
Unilaterality of ovarian morphology
The results of ovarian morphology and assessment for each
of the ovaries were compared to the contralateral ovaries. In
total, as the 27 cases were evaluated twice by the four
investigators, it produced 216 pairs of results. Ten records
were excluded due to suboptimal image quality of one of the
two ovaries in each, leaving a total of 206 pairs of ovaries.
The ovarian morphology was deemed to be the same between
the right and left ovaries in 119/206 (58%). In the remaining
87 (42%) pairs of results, there was a lack of agreement in
morphology of the right and left ovaries: in 12 (6%) there
were two levels of disagreement, i.e. one ovary was deemed
to be normal, the other ovary was deemed to be definite for
PCO, and in the remaining 75 (36%) pairs, there was only
one level of disagreement. Among 150 ovarian video records
considered to show PCO, 87 (58%) occurred as unilateral
phenomena whereas the remaining 63 (42%) occurred as part
of a pair.
Table I. The intra-observer agreement for each observer
                              Observer 1   Observer 2   Observer 3   Observer 4   Total
                              n[a] (%)     n (%)        n (%)        n (%)        n (%)
Agreement                     20 (74)      21 (77.7)    17 (62.9)    17 (62.9)    75 (69.4)
Disagreement (one-place)[b]   7 (25.9)     6 (22.3)     9 (33.3)     7 (25.9)     29 (26.8)
Disagreement (two-place)[c]   0 (0)        0 (0)        1 (3.7)      3 (11.1)     4 (3.7)
Total                         27 (100)     27 (100)     27 (100)     27 (100)     108 (100)
[a] n represents the number of times each observer agreed/disagreed with himself.
[b] One-place disagreement: same case scored as normal [or definite polycystic ovaries (PCO)] and possible PCO by the same observer.
[c] Two-place disagreement: same case scored as normal and definite PCO by the same observer.
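The laterality figures in this section can be checked with simple arithmetic. The sketch below uses only the counts given in the text; the helper name is hypothetical.

```python
cases, readings_per_case, observers = 27, 2, 4
total_pairs = cases * readings_per_case * observers  # left/right comparisons available
usable_pairs = total_pairs - 10                       # 10 excluded for suboptimal image quality

same, one_level, two_level = 119, 75, 12              # counts reported in the text


def pct(n: int, total: int) -> float:
    """Percentage of n out of total."""
    return 100 * n / total


print(f"same morphology: {pct(same, usable_pairs):.0f}%")
print(f"one-level disagreement: {pct(one_level, usable_pairs):.0f}%")
print(f"two-level disagreement: {pct(two_level, usable_pairs):.0f}%")
```

The three categories sum to the 206 usable pairs, which confirms that the reported counts are internally consistent.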
Discussion
In this study, we examined the intra-observer and inter-observer
agreement of commonly used ultrasound criteria for the
diagnosis of PCOS between four independent observers.
To evaluate inter-observer variability/agreement, it is necessary for all the observers to be present at the time of the
ultrasound examination, or they need to perform a transvaginal
scan on the same patient separately, or as in this case, they
could look at the same record of the original ultrasound
examination. We chose the latter approach because it is less
intrusive to the patient, so that they do not have to go through
repeated scans by different observers or have to undergo the
transvaginal scan in the presence of four observers which
could be an embarrassing experience for the woman concerned.
Table II. The intra-observer agreement and κ-values for each observer
Observers   Agreement (%)   κ
1           74              0.57
2           78              0.62
3           63              0.45
4           63              0.46
Overall     69              0.54
Similarly, to evaluate intra-observer variability/agreement,
without the use of video-recording, the patient would have to
undergo examination by the same observer twice on separate
occasions, making it even more onerous for the patient.
On the other hand, the use of video records of the original
transvaginal scan may be criticized because the quality of the
original images may be compromised. This may not affect
so much the measurements, but the brightness, which may
potentially influence the assessment of stromal echogenicity.
However, as all the observers were assessing the same copies
of the original, the minor changes should, if any, remain
constant for all the video recordings and should not therefore
significantly influence the results relating to inter-observer and
intra-observer variability. Each of the observers was given an
opportunity to assess the quality of the imaging; it was
mentioned in the instructions to all the observers that they
could indicate if the quality of the image was not good enough
for evaluation. On 10 occasions, observers commented that
the quality of the image was suboptimal in one of the two
ovaries. However, the overall diagnosis in these cases was
based on the appearance of the other ovary with the better
image. In all other situations, the quality of the image was
deemed to be sufficiently good for evaluation.
In this study, all the observers were blinded to the clinical
information. In other words, they were not aware of the
biochemical results of investigations including LH and
androgen levels or of the pattern of menstruation in each
of these subjects. It is uncertain if the provision of relevant
clinical information may bias the observer into making a
diagnosis of PCO. The omission of this clinical information
from the evaluation ensures that the assessment is only on
the basis of ovarian morphology, independent of clinical
information.
Table III. The inter-observer agreement between each two observers
Observers                     1–2         1–3         1–4         2–3         2–4         3–4         Overall
                              n[a] (%)    n (%)       n (%)       n (%)       n (%)       n (%)       n (%)
Agreement                     23 (42.6)   27 (50)     26 (48.1)   25 (46.3)   25 (46.3)   39 (72.3)   165 (50.9)
Disagreement (one-place)[b]   21 (38.8)   18 (33.3)   21 (38.8)   25 (46.3)   26 (48.1)   13 (24)     124 (38.3)
Disagreement (two-place)[c]   10 (18.6)   9 (16.6)    7 (12.9)    4 (7.4)     3 (5.6)     2 (3.7)     35 (10.8)
Total                         54 (100)    54 (100)    54 (100)    54 (100)    54 (100)    54 (100)    324 (100)
[a] The number of times each pair of observers agreed/disagreed between themselves.
[b] One-place disagreement: same case diagnosed as normal [or definite polycystic ovaries (PCO)] by one observer and possible PCO by the other observer.
[c] Two-place disagreement: same case diagnosed as normal by one observer and definite PCO by the other observer.
Figure 1. Overall intra-observer agreement.
Figure 2. Overall inter-observer agreement.
Table IV. Inter-observer agreement and κ-values
Observers   Agreement (%)   κ
1–2         43              0.11
1–3         50              0.26
1–4         48              0.23
2–3         46              0.22
2–4         46              0.20
3–4         72              0.58
Overall     51              0.28
Although the assessment of the ovarian morphology in the
diagnosis of PCOS has been in use for 20 years, the precision
of the measurement has never been previously evaluated. To
the best of our knowledge, this is the first study ever conducted
to evaluate the intra-observer and inter-observer variability of
the measurement. In this study, we found that there was only
modest agreement within the same observer (mean agreement
69.4%, κ-value 0.54). This means that when a particular
observer is asked to assess the same ovaries on a separate
occasion, the likelihood of producing the same result is only
70%. In other words, there is a 30% chance that the same
observer will disagree with himself/herself on a subsequent
occasion. The inter-observer agreement, as a general rule, is
less than that of intra-observer agreement. In our study, the
mean agreement between observers was only 51% (κ-value
0.28), which indicates only a fair amount of agreement between
the observers.
Some optimists may argue that as the ultrasound diagnosis
of PCOS is primarily subjective, a level of disagreement
between definite PCO and possible PCO is not unusual and
so if one-place disagreement is excluded the overall intra-observer agreement will be 96% and inter-observer agreement
will be 89%.
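These more lenient figures are obtained by counting only two-place disagreements (normal versus definite PCO) as true disagreement. A brief check, using the totals from Tables I and III (variable names are ours):

```python
# Two-place disagreements (normal versus definite PCO) from Tables I and III.
intra_total, intra_two_place = 108, 4
inter_total, inter_two_place = 324, 35

intra_lenient = 100 * (intra_total - intra_two_place) / intra_total
inter_lenient = 100 * (inter_total - inter_two_place) / inter_total
print(f"intra-observer: {intra_lenient:.0f}%, inter-observer: {inter_lenient:.0f}%")
```

This amounts to collapsing 'possible PCO' into whichever neighbouring category the other reading chose, so it should be read as an upper bound on agreement rather than a like-for-like comparison with the figures above.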
Among the four observers, observers 3 and 4, who are both
gynaecologists with an interest in pelvic ultrasonography,
appear to have the best agreement (72%); whereas the agreement between the remaining combinations of the four observers
produced a result ranging from 43 to 50%. Although it is
possible that gynaecologists who have more ready access to
clinical information are more likely to agree on a clinical
diagnosis of PCOS, in this study the two gynaecologists were
similarly blinded to the clinical information, as were the two
radiologists. Nevertheless, the two gynaecologists spent 2 years
working closely together in the same Reproductive Medicine
Unit and it is possible that the sharing of experience in the
same team produces better agreement, a phenomenon which
is well recognised in the pathological evaluation (Langley
et al., 1983).
Overall, our results suggest that the currently used criteria
for the diagnosis of PCOS have significant intra-observer and
inter-observer variability and must therefore be considered
subjective. Transvaginal ultrasonography alone is therefore not
a reliable method of diagnosing or excluding PCOS. Indeed,
several criteria for the diagnosis of PCOS have been proposed
(Swanson et al., 1981; Adams et al., 1985; Yeh et al., 1987;
Ardaens et al., 1991; Fox et al., 1991; Franks, 1992; Battaglia
et al., 1999). None of them has been universally agreed as the
diagnostic criterion of choice. Given the subjective nature of
ultrasound assessment, it will be necessary to quantify the
measurements to improve their diagnostic value.
The lack of precision of the commonly used ultrasonographic
criteria for the definition of PCO may be due to a number of
reasons. First, the criteria refer to three different aspects of
ovarian morphology. Whilst the conclusion may be reached
easily if all three criteria are met, or if none of the three
criteria are met, the difficulties arise if only one or two of the
criteria are fulfilled. It is possible that an observer may then
attempt to make a decision, subconsciously, on the overall
impression of the ultrasonographic appearance, or even the
appearance of the other ovary.
Secondly, a more in-depth analysis of the various components of the criteria suggests that each of them may be subjective
in one way or another. In general terms, the ultrasound
criteria used can be classified into two types: quantitative and
qualitative. The former include parameters that are obtained
by measuring physical entities such as the ovarian volume,
stromal volume and the number and volume of small cysts.
Although they are relatively objective, their quantification is
influenced by the skill and the carefulness of the examiner
(Dewailly, 1997). The qualitative parameters include stromal
echogenicity and follicular pattern. Their evaluation is visual
and therefore more subjective and depends not only on the
perception of the sonographer but also on the setting of
the ultrasound machine. The most objective component of the
commonly used ultrasound criteria, at first glance, appears to
be ovarian volume; however, this measurement does not take
into consideration the possibility of finding a follicle or cyst
of significant size, i.e. >10 mm in diameter (perhaps
≥20 mm). In this situation, the validity of the ovarian volume
measurement is in question—it is possible that some observers
may consider it as evidence against PCOS; others may accept
it as a finding compatible with PCOS (on the assumption that
ovulation occurs sporadically in anovulatory PCOS women)
and so make certain allowance for the ovarian volume. In any
case, there is no universal agreement of how to make allowance
for such a finding; each observer is therefore left to make a
subjective decision on each occasion.
Another component of the criteria relates to the finding of
‘>10 small (2–8 mm) peripherally arranged cysts’. Whilst it
states the number of cysts to be >10, it is possible that certain
observers do not actually count the number of cysts but merely
form an impression that there are many small cysts. It is also
possible that some observers realize that certain investigators
accept a finding of more than five small follicles (Yeh et al.,
1987; Battaglia et al., 1999), and so might find it difficult to
be certain of the significance if the number of small cysts is
>5 but <10.
The third component of the criteria relates to stromal
echogenicity, which is subjective. The stromal echogenicity
may be affected by the setting of the ultrasound machine.
Nevertheless, many authors have emphasized the important
diagnostic value of abnormal ovarian stroma (Adams et al.,
1986; Conway et al., 1989; Eden et al., 1989; Ardaens et al.,
1991). One study (Buckett et al., 1999) objectively measured
ovarian stromal echogenicity and the stromal index (ratio of
mean stromal echogenicity to mean echogenicity of the entire
ovary) in normal ovaries (n = 77) and PCO (n = 46). They
found no difference in the mean stromal echogenicity, but the
stromal index was significantly greater in women with PCO.
They concluded that the apparent subjective increase in stromal
echogenicity in PCO, as exemplified by the greater stromal
index, is due to a combination of the increased volume of
ovarian stroma and the significantly lower mean echogenicity
of the entire ovary in these women.
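The stromal index defined above is a simple ratio, sketched below. The text does not describe how the stromal and whole-ovary regions were outlined or how echogenicity was digitized, so the inputs here (sequences of grey-level values for each region) and the function name are assumptions made for illustration only.

```python
from statistics import mean
from typing import Sequence


def stromal_index(stroma_pixels: Sequence[float], ovary_pixels: Sequence[float]) -> float:
    """Ratio of mean stromal echogenicity to mean echogenicity of the entire ovary."""
    whole_ovary_mean = mean(ovary_pixels)
    if whole_ovary_mean == 0:
        raise ValueError("mean echogenicity of the entire ovary must be non-zero")
    return mean(stroma_pixels) / whole_ovary_mean
```

Because the denominator includes the echo-poor follicles, an ovary with many small cysts can yield a higher index even when the stroma itself is no brighter, which is consistent with the interpretation given above.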
How may the precision and usefulness of the ultrasonographic criteria be improved?
First, the precision of the individual measurements of each of
the three components should be improved. Ovarian and stromal
volume could now be measured with greater precision by the
use of 3-dimensional (3-D) ultrasound (Kyei-Mensah et al.,
1996a,b). Kyei-Mensah et al. (1996a) compared the volume
of ovarian follicles measured by transvaginal 2-D and 3-D
ovarian scan carried out in 25 women immediately before
follicular aspiration and the volume of follicular fluid aspirated
during IVF treatment. They found that the true volume of
ovarian follicles measured by a 3-D ultrasound system is more
accurate than that measured by 2-D ultrasound techniques. In
another study, the same authors (Kyei-Mensah et al., 1996b)
investigated the reproducibility of ovarian and endometrial
volume measurements obtained using transvaginal 3-D ultrasound. Three observers independently measured the volume
of 20 stored ovarian and endometrial scans. The intra-observer
coefficient of variation for both ovarian and endometrial
volume was 8%. The inter-observer coefficient of variation
was 9% for ovarian volume and 11% for endometrial volume.
They concluded that transvaginal 3-D ultrasound produces
highly reproducible ovarian and endometrial volume measurements (Kyei-Mensah et al., 1996b).
Similarly, the number of follicles 2–8 mm in diameter may
be readily measured by the use of 3-D ultrasound.
As far as stromal echogenicity is concerned, 3-D ultrasound
will not improve upon the objectivity and hence the precision of
the measurement. It is unclear if stromal volume measurement,
considered important by some investigators (Dewailly et al.,
1997; Kyei-Mensah et al., 1998; Fox, 1999), could replace
stromal echogenicity as one of the morphological criteria. If
so, it is possible that 3-D ultrasound may, once again, provide
a more precise measurement of the ovarian stroma than
conventional 2-D ultrasound.
On the other hand, it is possible that a scoring system based
on the measurement of the three separate components of
ovarian morphology may help in the not so clear-cut cases,
which are not uncommon in day-to-day clinical practice.
Consideration should then be given to whether or not each
component should be weighted: a separate study will be required
to investigate such a possibility.
Finally, the combined assessment of the ovarian morphology
by transvaginal ultrasound and colour Doppler flow analysis
of the intraovarian and uterine vessels may be a promising
new approach to define PCOS. One study (Battaglia et al.,
1995) carried out a transvaginal colour Doppler measurement
of the uterine and intraovarian vessel variations in 22 PCOS
women and in 18 normal control women. They found significantly elevated uterine artery pulsatility index values associated
with a typical low resistance index of stromal ovary vascularization. The pulsatility index was positively correlated with
the LH:FSH ratio, and the resistance index was negatively
correlated. The elevated uterine artery resistance was correlated
with androstenedione levels. They concluded that Doppler
analysis could be a valuable additional tool for the diagnosis
of PCOS.
The significance of finding PCO on ultrasound scan
Several previous studies have investigated the significance of
ultrasound appearance of PCO in normal women and in women
with PCOS. Carmina et al. reported on the significance of
ultrasonographic finding of PCO in 15 normal non-PCOS
women (Carmina et al., 1997). The study found that about
a third of this group of women had some evidence of
hyperandrogenaemia and significantly lower insulin-like
growth factor I (IGF-I) than women with normal ovaries. They
concluded that the presence of PCO in apparently non-PCOS
women may represent a part of the spectrum of the patients
with PCOS or that these women may be susceptible to
developing PCOS in the future. Furthermore, it was reported
(Norman et al., 1995) that women with PCO without hyperandrogenaemia (n = 21) had disturbances in insulin and lipid
profile similar to those with PCOS, i.e. those with PCO and
hyperandrogenaemia (n = 97), suggesting that ultrasonographic finding of PCO alone, independent of clinical and
endocrine manifestations, is predictive of the metabolic sequelae of PCOS. In contrast, another team (Clayton et al., 1992)
investigated the significance of the ultrasound diagnosis of
PCO in 41 non-PCOS women. They found that the prevalence
of PCO was high (41/190, 22%) in non-PCOS women, but
was associated with minimal clinical manifestations and no
hormonal abnormalities. They concluded that an isolated
finding of PCO might be a normal variation.
To conclude, there appears to be significant intra-observer
and inter-observer variability in the currently used ultrasound
criteria for the diagnosis of PCOS. It remains to be seen
whether or not 3-D ultrasound evaluation, by providing a
more objective means of assessing ovarian morphology, could
improve the diagnostic accuracy of ultrasound in the diagnosis
of PCOS. It will also be interesting in future studies to directly
compare the positive predictive value and negative predictive
value of transvaginal ultrasonography and biochemical measurement in the diagnosis of PCOS. For now, we have identified
a clear need to continue to search for a better diagnostic tool
for PCOS.
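For such a future comparison, the positive and negative predictive values would be derived from a 2 × 2 table of test result against a reference diagnosis. The sketch below shows the calculation only; the function name is illustrative and the counts in the usage example are entirely hypothetical, since the present study did not collect such data.

```python
from typing import Tuple


def predictive_values(true_pos: int, false_pos: int, true_neg: int, false_neg: int) -> Tuple[float, float]:
    """Return (PPV, NPV) from a 2x2 table of test result against a reference diagnosis."""
    ppv = true_pos / (true_pos + false_pos)  # proportion of positive tests that are correct
    npv = true_neg / (true_neg + false_neg)  # proportion of negative tests that are correct
    return ppv, npv


# Hypothetical counts, for illustration only (not data from this study).
ppv, npv = predictive_values(true_pos=30, false_pos=10, true_neg=45, false_neg=15)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Both values depend on the prevalence of PCOS in the population studied, so figures from an infertility clinic would not transfer directly to an unselected population.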
References
Adams, J., Franks, S., Polson, D.W., Mason, H.D., Abdulwahid, N., Tucker,
M., Morris, D.V., Price, J. and Jacobs, H.S. (1985) Multifollicular ovaries:
clinical and endocrine features and response to pulsatile gonadotrophin
releasing hormone. Lancet, ii, 1375–1378.
Adams, J., Polson, D.W. and Franks, S. (1986) Prevalence of polycystic
ovaries in women with anovulation and idiopathic hirsutism. Br. Med. J.,
293, 355–359.
Altman, D.G. (1991) Inter-rater agreement. In Practical Statistics for Medical
Research. Chapman & Hall/CRC, London, pp. 403–409.
Ardaens, Y., Robert, Y., Lemaitre, L., Fossati, P. and Dewailly, D. (1991)
Polycystic ovarian disease: contribution of vaginal endosonography and
reassessment of ultrasonic diagnosis. Fertil. Steril., 55, 1062–1068.
Ardaens, Y., Robert, Y. and Dewailly, D. (1995) Polycystic ovaries: an
imprecise ultrasonographic definition. [In French.] Contracept. Fertil. Sex.,
23, 415–419.
Balen, A.H., Conway, G.S., Kaltsas, G., Techatrasak, K., Manning, P.J., West,
C. and Jacobs, H.S. (1995) Polycystic ovary syndrome: the spectrum of the
disorder in 1741 patients. Hum. Reprod., 10, 2107–2111.
Battaglia, C., Artini, P.G., D’Ambrogio, G., Genazzani, A.D. and Genazzani,
A.R. (1995) The role of color Doppler imaging in the diagnosis of polycystic
ovary syndrome. Am. J. Obstet. Gynecol., 172, 108–113.
Battaglia, C., Regnani, G., Petraglia, F., Primavera, M.R., Salvatori, M. and
Volpe, A. (1999) Polycystic ovary syndrome: it is always bilateral?
Ultrasound Obstet. Gynecol., 14, 183–187.
Borgfeldt, C. and Andolf, E. (1999) Transvaginal sonographic ovarian findings
in a random sample of women 25–40 years old. Ultrasound Obstet.
Gynecol., 13, 345–350.
Buckett, W.M., Bouzayen, R., Watkin, K.L., Tulandi, T. and Tan, S.L. (1999)
Ovarian stromal echogenicity in women with normal and polycystic ovaries.
Hum. Reprod., 14, 618–621.
Carmina, E., Wong, L., Chang, L., Paulson, R.J., Sauer, M.V., Stanczyk, F.Z.
and Lobo, R.A. (1997) Endocrine abnormalities in ovulatory women with
polycystic ovaries on ultrasound. Hum. Reprod., 12, 905–909.
Clayton, R.N., Ogden, V., Hodgkinson, J., Worswick, L., Rodin, D.A., Dyer,
S. and Meade, T.W. (1992) How common are the polycystic ovaries in
normal women and what is their significance for the fertility of the
population? Clin. Endocrinol., 37, 127–134.
Conway, G.S., Honour, J.W. and Jacobs, H.S. (1989) Heterogenity of polycystic
ovarian syndrome: clinical, endocrine and ultrasound features in 556
patients. Clin. Endocrinol., 30, 459–470.
Dewailly, D. (1997) Definition and significance of polycystic ovaries.
Baillière’s Clin. Obstet. Gynecol., 11, 350–368.
Eden, J.A., Place, J., Carter, G.D., Jones, J., Alaghband-Zadeh, J. and Pawson,
M.E. (1989) The diagnosis of polycystic ovaries in subfertile women. Br.
J. Obstet. Gynaecol., 96, 809–815.
Farquhar, C.M., Birdsall, M., Manning, P., Mitchell, J.M. and France, J.T.
(1994) The prevalence of polycystic ovaries on ultrasound scanning in a
population of randomly selected women. Aust. NZ J. Obstet. Gynecol., 34,
67–72.
Fox, R. (1999) Transvaginal ultrasound appearances of the ovary in normal
women and hirsute women with oligomenorrhoea. Aust. NZ J. Obstet.
Gynaecol., 39, 63–68.
Fox, R., Corrigan, E., Thomas, P.A. and Hull, M.G. (1991) The diagnosis of
polycystic ovaries in women with oligo-amenorrhoea: predictive power of
endocrine tests. Clin. Endocrinol., 34, 127–131.
Franks, S. (1992) Morphology of the polycystic ovary in polycystic ovary
syndrome. In Dunaif, A., Given, J.R., Haseltine, F.P. and Merriam, G.R.
(eds), Polycystic Ovary Syndrome. Blackwell, Boston, 19pp.
Homburg, R. (1996) Polycystic ovary syndrome—from gynaecological
curiosity to multisystem endocrinopathy. Hum. Reprod., 11, 29–39.
Hull, M.G.R. (1987) Epidemiology of infertility and polycystic ovarian
disease: endocrinological and dermographical studies. Gynecol. Endocrinol.,
1, 235–245.
Koivunen, R., Laatikainen, T., Tomas, C., Huhtaniemi, I., Tapanainen, J. and
Martikainen, H. (1999) The prevalence of polycystic ovaries in healthy
women. Acta Obstet. Gynecol. Scand., 78, 137–141.
Kousta, E., White, D.M., Cela, E., McCarthy, M.I. and Franks, S. (1999) The
prevalence of polycystic ovaries in women with infertility. Hum. Reprod.,
14, 2720–2723.
Kyei-Mensah, A., Zaidi, J., Pittrof, R., Shaker, A., Campbell, S. and Tan, S.L.
(1996a) Transvaginal three-dimensional ultrasound: accuracy of follicular
volume measurements. Fertil. Steril., 65, 371–376.
Kyei-Mensah, A., Maconochie, N., Zaidi, J., Pittrof, R., Campbell, S. and Tan,
S.L. (1996b) Transvaginal three-dimensional ultrasound: reproducibility of
ovarian and endometrial volume measurements. Fertil. Steril., 66, 718–722.
Kyei-Mensah, A.A., LinTan, S., Zaidi, J. and Jacobs, H.S. (1998) Relationship
of ovarian stromal volume to serum androgen concentrations in patients
with polycystic ovary syndrome. Hum. Reprod., 13, 1437–1441.
Landis, J.R. and Koch, G. (1977) The measurement of observer agreement
for categorical data. Biometrics, 33, 159–174.
Langley, F.A., Baak, J.P.A. and Oort, J. (1983) Diagnosis making: error
sources. In Baak, J.P.A. and Oort, J. (eds), A Manual of Morphometry in
Diagnostic Pathology. Springer-Verlag, Berlin, 6pp.
Lewis, V. (2001) Polycystic ovary syndrome: a diagnostic challenge. Obstet.
Gynecol. Clin. N. Am., 28, 1–20.
Li, T.C., Spuijbroek, M.D.E.H., Tuckerman, E., Anstie, B., Loxley, M. and
Laird, S. (2000) Endocrinological and endometrial factors in recurrent
miscarriage. Br. J. Obstet. Gynaecol., 107, 1471–1479.
Liddell, H.S., Sowden, K. and Farquhar, C.M. (1997) Recurrent miscarriage:
screening for polycystic ovaries and subsequent pregnancy outcome. Aust.
NZ J. Obstet. Gynecol., 37, 402–406.
Lobo, R.A. (1995) A disorder without identity: ‘HCA,’ ‘PCO,’ ‘PCOD,’
‘PCOS,’ ‘SLS’. What are we to call it? Fertil. Steril., 63, 1158–1160.
Loucks, T.L., Talbott, E.O. and McHugh, P. (2000) Do polycystic-appearing
ovaries affect the risk of cardiovascular disease among women with
polycystic ovary syndrome? Fertil. Steril., 74, 547–552.
Michelmore, K.F., Balen, A.H., Dunger, D.B. and Vessey, M.P. (1999)
Polycystic ovaries and associated clinical and biochemical features in young
women. Clin. Endocrinol., 51, 779–786.
Norman, R.J., Hague, W.M., Masters, S.C. and Wang, X.J. (1995) Subjects
with polycystic ovaries without hyperandrogenaemia exhibit similar
disturbances in insulin and lipid profiles as those with polycystic ovary
syndrome. Hum. Reprod., 10, 2258–2261.
Pache, T.D., Wladimiroff, J.W., Hop, W.C. and Fauser, B.C. (1992) How to
discriminate between normal and polycystic ovaries: transvaginal US study.
Radiology, 183, 421–423.
Polson, D.W., Adams, J., Wadsworth, J. and Franks, S. (1988) Polycystic
ovaries—a common finding in normal women. Lancet, ii, 870–872.
Rai, R., Backos, M., Rushworth, F. and Regan, L. (2000) Polycystic
ovaries and recurrent miscarriage—a reappraisal. Hum. Reprod., 15,
612–615.
Regan, L., Owen, E.J. and Jacobs, H.S. (1990) Hypersecretion of luteinising
hormone, infertility and miscarriage. Lancet, 336, 1141–1144.
Sagle, M., Bishop, K., Ridley, N., Alexander, F.M., Michel, M., Bonney, R.C.,
Beard, R.W. and Franks, S. (1988) Recurrent early miscarriage and
polycystic ovaries. Br. Med. J., 297, 1027–1028.
Swanson, M., Sauerbrie, E.E. and Cooperberg, P.L. (1981) Medical
implications of ultrasonically detected polycystic ovaries. J. Clin.
Ultrasound, 9, 219–222.
Yeh, H.C., Futterweit, W. and Thornton, J.C. (1987) Polycystic ovarian
disease: US features in 104 patients. Radiology, 163, 111–116.
Submitted on March 15, 2001; resubmitted on October 29, 2001; accepted
on January 25, 2002