Human Reproduction vol.12 no. 11 pp.2465-2468, 1997 Variation in the determination of follicular diameter: an inter-unit pilot study using an ultrasonic phantom G.L.DriscoII1'3, J.P.P.Tyler1 and D.Carpenter 2 ^ityWest IVF, 12 Caroline St, Westmead, NSW 2145, Australia and 2CSIRO Division of Telecommunication and Industrial Physics, Lindfield NSW 2070, Australia 3 To whom correspondence should be addressed Ultrasound operators in assisted reproductive technology units in New South Wales, Australia cooperated in a study that was conducted to assess their ability to measure fixed objects embedded in an ultrasound phantom. The results have demonstrated a large variation ranging between 10 and 25% coefficient of variation on distances between 10 and 32 mm. As ultrasonic imaging is the only component of assisted reproductive technology not currently controlled by an external quality assurance scheme, the authors suggest there may be a need to establish a programme. Key words: assisted reproductive technology/follicular diameter/quality control/ultrasound Introduction The monitoring of ovarian follicular development by transvesical, or more commonly, transvaginal ultrasound is a standard procedure in assisted reproductive technology treatment. Furthermore with the increasing use of 'down-regulation' with gonadotrophin releasing hormone (GnRH) analogues (e.g. lucrin, buserelin, syneral) the endocrine component of follicular tracking has been reduced and clinical decisions have become more dependent on the results of the ultrasound scans. The optimum follicle size from which to induce ovulation is not precisely known and the decision to give human chorionic gonadotrophin (HCG) depends on a number of factors including the total number of follicles recruited, their echogenic appearance by ultrasound (Fukuda et al., 1995; Gore et al., 1995), the sizes of the leading group and the perceived potential for hyperstimulation. Thus different clinics have their own criteria. This ranges between two or more follicles of > 17 mm diameter to those >21 mm (personal communication with all Australian assisted reproductive technology units). Considerable debate continues as to which is the 'correct' approach and about whether one is demonstrably better than another. However, according to statistics compiled by the Fertility Society of Australia (Lancaster et al., 1997) most Australian units appear to be achieving similar results with respect to pregnancy success, perhaps suggesting a wide window for inducing ovulation pre-assisted reproductive technology. Another reason for this wide latitude may be the calibration and/or operator variation in the ultrasound assessment of follicles. European Society for Human Reproduction and Embryology There have been few studies examining the performance of ultrasonic follicular measurements in assisted reproductive technology and those that have been done have examined observer reproducibility using a single scanner (Forman et al., 1991) rather than a range of machines likely to be found in different units. Studies in other disciplines (e.g. obstetrics) have, however, demonstrated the need for experienced ultrasonographers (Bahmaie and Edmonds, 1996) and the necessity of instrument calibration (Clark, 1988; Fish, 1990; Dudley and Griffith, 1996). Similarly there is increasing discussion about quality assurance in diagnostic medicine (Arger, 1995; Henderson, 1996; Weber, 1996) as the number of ultrasound services per 1000 head of population in Australia has increased from 34.9 to 97.9 over a 4 year period (Hailey, 1996). Similarly statements from professional bodies allied to ultrasound in the UK suggest that training and performance review should be routine and a Consortium for the Accreditation of Sonographic Education (CASE) has been established to provide an overview of the education of sonographers (BMUS Bulletin, 1996). Thus the object of this preliminary project was to assess the variation in ultrasonic determination of objects approximating follicular sizes using a standard ultrasound phantom distributed between seven assisted reproductive technology units and to make recommendations on the need — or otherwise — of establishing an external quality assurance (EQA) programme for ultrasound operators in assisted reproductive technology units. Materials and methods Experimental design For this preliminary study a standard 'phantom' used for the calibration of ultrasonic instruments (ATS Multipurpose Phantom Model 539: Gamasonics Institute for Research and Calibration, PO Box 411, Mittagong, NSW 2575, Australia) was used. Measurements were not made of the pre-set patterns within the phantom which would have alerted the operator to the probable 'correct' value required. The ultrasound phantom The phantom is a solid, ultrasound-permeable block of rubberized material in which are embedded wires and cylinders which, when scanned by ultrasound, reflect the appearance of either white spots or black filled circles of different diameters and consistency against a background echo scattering designed to represent liver tissue. They are situated at different levels and therefore can be used as both far and near objects (relative to the scanning probe). The technique of scanning a phantom is quite critical and the transducer needs to be aligned correctly to give the clearest picture. Similarly maximizing adjustments to the machine (e.g. total gain settings) will give clearer imaging. These, of course, are factors which would also affect the scanning and visualization of human ovaries. 2465 G.L.Driscoll, J.P.P.Tyler and D.Carpenter Participants The phantom was taken to each unit in Sydney and one in Canberra (see Acknowledgements for a list of the participating assisted reproductive technology units) and every operator who performed ultrasound measurements for that programme given the opportunity to take part in the study. The type of instrument used, the probe frequency and the status of the operator (trained ultrasonographer, medical practitioner, nurse specialist etc.) was also recorded. Each was shown the layout of the phantom before measurements were made. Measurements Ten values were requested (reproduced in Figure 1 and indexed as measurements 1-10 in the results). These were distances between different-sized 'circles', so had no exact distance by which a 'wanted' value could be guessed. Within the 10 determinations three sets of paired measurements were made and the final value (measurement 10) was a calibrated distance of 2 cm. Thus an assessment of an instrument's calibration and inter-observer variation in assessing distance could be made. Each operator performed the measurements without knowledge of the values obtained by the others. 8mm 6mm 4mm 3mm 2mm echogenic Figure 1. A diagram (not to scale) of the layout of the phantom. Participants were requested to measure the distances denned by the numbered 'lines' (i.e. outer circumference to outer circumference for measurements 3, 6 and 8; inner circumference to inner circumference for measurements 2, 5 and 9; diameter for measurement 4, etc.). Results Twenty-four operators from seven units completed the measurements. Five were trained ultrasonographers, 10 were experienced clinical nurse specialists, seven were experienced medical practitioners and two were scientists with no experience in ultrasound methods. The instruments used were: Ausonics Opus 1 with 7.5 MHz probe (10 operators), Ausonics Micro with 7.5 MHz probe (three operators), Siemens with 5.0 MHz probe (two operators), Acuson 128XP4 with 5.0 MHz probe (one operator), RT 3600 with 5.0 MHz probe (one operator), GERT with 5.0 MHz probe (one operator), Toshiba 140 with 7.5 MHz probe (one operator), ATL50 with 5.0 MHz probe (one operator) and Sonoace 1500 with 7.5 MHz probe (four operators). A summary of the results for each requested measurement is given in Table I. Mean values were normally distributed (mean approximated median value) and the measurements (mean range 10-32 mm) approximated the clinical situation in determining follicular diameter. However, the coefficient of variation was large and ranged between 10.1 and 29.9%. Furthermore it appeared unrelated to the size of the measurement (correlation of linear regression, r = 0.38) or to the distance of the object from the probe. For those measurements that were replicated (3 vs 8; 3 vs 6; 6 vs 8; 2 vs 5) no statistically significant difference could be found (P > 0.05) using Student's paired Mest. While the numbers were not large it was also possible to suggest that probe type (5.0 or 7.5 MHz) and operator experience also did not markedly affect measurement performance. However some experienced nurse specialists did have problems in clearly defining the 'deeper' and smaller measurements (numbers 1, 2, 6 and especially 5) suggesting a lack of knowledge of the ability to optimize their machine settings. Of interest was that measurements generated by the two novice participants always fell within the interquartile range. The distribution of values for each measurement is diagrammatically represented as box plots in Figure 2 where 'outliers' can be identified and the notched box represents the interquartile range (25th to 75th percentiles). Thus even with a defined distance (measurement 10 which was factory set at 20 mm) the range was between 18.0 and 29.4 mm. A line plot Table I. Summary statistics (in centimetres) for each measurement requested in the pilot external quality assurance study Measurement no.a 3 No. of operators' Mean distance SD Coefficient of variation (%) Median value Minimum value Interquartile range Maximum value a 1 2 3 4 5 6 7 8 9 10 22 1.76 0.34 19.0 22 1.06 0.25 23.4 24 3.21 0.58 18.2 24 1.56 0.47 29.9 20 1.04 0.19 18.6 22 3.24 0.52 16.1 24 2.13 0.22 10.1 24 3.16 0.58 18.5 24 1.73 0.32 18.6 24 2.18 0.29 13.3 1.70 1.06 0.30 2.40 1.04 0.70 0.16 1.94 3.18 1.39 0.47 4.30 1.40 1.17 0.30 3.40 1.04 0.64 0.16 1.64 3.20 1.90 0.30 4.87 2.10 1.29 0.12 2.50 3.12 1.86 0.33 5.11 1.80 1.10 0.54 2.30 2.10 1.80 0.27 2.94 Refer to Figure 1. Where n does not equal 24 some operators were unable to assess a measurement because of technical difficulties. b 2466 Ultrasound variation in assisted reproductive techniques 4 5 Measurement 3 4 5 6 7 Measurement number Figure 2. Notched box plots of all data for each measurement (refer to Figure 1) showing median values (notch), the interquartile range (box), 10th and 90th percentiles (whiskers) and outliers. of the data (Figure 3) further suggests that no one particular observer or instrument consistently affected the variation seen in these data. Discussion All aspects of pathology laboratories in Australia are required to be accredited by the National Agency of Testing Authorities (NATA). An integral component of this registration is demonstration of competence in an external quality assurance scheme. Similarly the Australian assisted reproductive technology licensing body [the Reproductive Technology Accreditation Committee (RTAC)] also requires careful monitoring of invitro fertilization culture laboratories, treatment consent forms, patient satisfaction with information brochures and counselling practice etc. Ultrasound measurement of ovarian follicles remains the only 'uncontrolled' aspect of assisted reproductive technology treatment. Notwithstanding the question of suitability of tissue-mimicking phantoms in assessing equipment calibration (Letter to the Editor, 1985), and the importance of the observer as a significant source of variability in clinical data assessment (Corson et al., 1995), this pilot survey has demonstrated a large variation in the ability of the participants to determine distance in an ultrasound phantom. It could be argued that this is of little consequence for assisted reproductive technology units since pregnancy rates across Australia appear similar (Lancaster et al., 1995) and an 'eyeball' approach to follicular size and number may be all that is necessary in clinical practice, particularly to limit the potential for ovarian hyper stimulation. However, the corollary of this is that the phantom is made using stable rigid bodies and thus should not be as difficult to measure as the ovary where it might be argued that variation could be even larger. Thus an error of 18% would mean that a follicle actually measuring 18 mm could be anywhere between 14.7 and 21.2 mm. While the exact relationship between follicle volume 6 7 number Figure 3. A line plot for each measurement for each observer demonstrating that no one instrument or person consistently 'read' high or low. and oocyte competence is still not precisely known, it is widely accepted that an oocyte collected from a smaller follicle (<16 mm) would generally be less competent than that from a larger one (>18 mm). Similarly, experienced ultrasonographers are capable of distinguishing healthy from atretic follicles but admit that it is 'impractical to observe and assess more than four or five follicles in one ovary by ultrasound', a scenario often encountered in women stimulated for assisted reproductive technology (Fukuda et al., 1995). Furthermore, intracytoplasmic sperm injection (ICSI) is demonstrating that there is as much variation in oocyte quality as there is in sperm morphology and motility. Should a woman decide not to use ovulating drugs for her assisted reproductive technology attempt then the current approach (i.e. as long as there is a reasonable group of follicles 'some' will be competent) ceases to be clinically applicable and the need to maximize oocyte quality based on biochemical and follicular maturity becomes paramount. Finally, in this survey the lowest coefficient of variation was 10%, a value generally considered the upper limit of allowable error in many pathology tests. The resolution of ultrasound determination is obviously limited by many factors (machine settings, patient compliance etc.) and since ovarian follicles are not usually spherical the interpretation of their volume will remain subjective. In conclusion this pilot study suggests the need for an EQA programme for ultrasound in assisted reproductive technology programmes. Individual clinics could consider making their own 'perishable' phantoms (Gibson and Gibson, 1995) for in-house assessment or develop within their geographic areas an external programme such as has been established for mammography (Royal Australian College of Radiology, 1996). Acknowledgements The authors thank Dr A.Ekangaki, of Macquarie University, for statistical assistance, the Ultrasonics Laboratory, CSIRO, for the loan of the phantom for this study and the following Medical Directors for allowing their staff to participate in this survey: Dr Geoffrey Driscoll, CityWest IVF, 12 Caroline St, Westmead. NSW 2145; Dr Robert Jansen, Sydney IVF, 4 O'Connell St, Sydney, NSW 2000; Dr Trevor Johnson, Royal Hospital for Women Fertility Group, 253 2467 G.L.Driscoll, J.P.P.Tyler and D.Carpenter Oxford St, Bondi Junction, NSW 2022; Dr David Macourt, St George Fertility Centre, 172-174 Forrest Rd, Hurstville, NSW 2220; Dr Ric Porter, North Shore ART, Royal North Shore Hospital, St Leonards, NSW 2065; Dr Howard Smith, Human Reproduction Unit, Westmead Hospital, Westmead, NSW 2145; Dr Martyn Stafford-Bell, Canberra Fertility Centre, John James Memorial Hospital, Strickland Cres, Deakin, ACT 2605. References Arger, P.H. (1995) Ultrasound Practice Accreditation. American Institute of Ultrasound in Medicine Reporter, 11, Issue 8, September. Bahmaie, A. and Edmonds, D.K. (1996) Booking ultrasound examination performed by obstetric senior house officers — Time to reevaluate? Aust. NZJ. Obstet. GynaecoL, 36, 389-391. British Medical Ultrasound Society (1996) Bulletin, August. Clark, P.D. (1988) Performance checks. In Lerski, R.A. (ed.), Practical Ultrasound. IRL Press, Oxford, pp. 74-76. Corson, S.L., Batzer, F.R., Gocial, B. et al. (1995) Intra-observer and interobserver variability in scoring laparoscopic diagnosis of pelvic adhesions. Hum. Reprod., 10, 161-164. Dudley, N.J. and Griffith, K. (1996) The importance of rigorous testing of circumference measuring callipers. Ultrasound Med. Biol., 22, 1117-1119. Fish, P. (1990) Physics and Instrumentation of Diagnostic Medical Ultrasound. Wiley, Chichester. Forman., R.G., Robinson, J., Yudkin, P. et al. (1991) What is the true follicular diameter: an assessment of the reproducibility of transvaginal ultrasound monitoring in stimulated cycles. Fertil. Steril., 56, 989-992. Fukuda, M, Fukuda, Y., Yding Andersen, C. and Byskov, A.G. (1995) Healthy and atritic follicles: vaginosonographic detection and follicular fluid hormone profiles. Hum. Reprod., 10, 1633-1637. Gibson, R.N. and Gibson, K.I. (1995) A home-made phantom for learning ultrasound-guided invasive techniques. Aust. Radiol., 39, 356-357. Gore, M.A., Nayudu, PL., Vlaisavljevic, V. and Thomas, N. (1995) Prediction of ovarian cycle outcome by follicular characteristics, stage 1. Hum. Reprod., 10, 2313-2319. Hailey, D.M. (1995) Ultrasound makes rapid gains in Australia. Diagnostic Imaging Asia Pacific, June, 25-27. Henderson, K. (1996) Quality assurance and the maintenance of professional standards. Australian Society Ultrasound Medicine Newsletter, June, 4. Lancaster, P., Shafir, E., Hurst, T. and Huang, J. (1997) Assisted Conception Australia and New Zealand 1994 and 1995. Fertility Society of Australia, 4-6. Perkins, A.C. (1985) The suitability of tissue mimicking phantoms for equipment calibration. [Letter.] Ultrasound Med. Biol., 11, 193-194. Royal Australian College of Radiology (1996) Mammography quality assurance programme. Newsletter 48, 22-23. Weber, A. (1996) Voluntary accreditation aims to deter regulation. Diagnostic Imaging Asia Pacific, January, pp. 45-47. Received on February 4, 1997; accepted on July 8, 1997 2468
© Copyright 2026 Paperzz