Mutagenesis vol. 23 no. 3 pp. 223–231, 2008
Advance Access Publication 7 March 2008
doi:10.1093/mutage/gen006

Variation in assessment of oxidatively damaged DNA in mononuclear blood cells by the comet assay with visual scoring

Lykke Forchhammer, Elvira Vaclavik Bräuner, Janne Kjærsgaard Folkmann, Pernille Høgh Danielsen, Claus Nielsen, Annie Jensen, Steffen Loft, Gitte Friis and Peter Møller*

Department of Environmental Health, Institute of Public Health, University of Copenhagen, Øster Farimagsgade 5, Building 5, 2nd floor, PO Box 2099, DK-1014 Copenhagen K, Denmark

The comet assay is popular for assessments of genotoxicity, but the comparison of results between studies is challenging because of differences in experimental procedures and reports of DNA damage in different units. We investigated the variation of DNA damage in mononuclear blood cells (MNBCs) measured by the comet assay with focus on the variation related to alkaline unwinding and electrophoresis time, number of cells scored, as well as the putative benefits of transforming the primary end points to common units by the use of reference standards and calibration curves. Eight experienced investigators scored pre-made slides of nuclei differently, but each investigator scored consistently over time. Scoring of 200 nuclei per treatment was associated with the lowest residual variation. Alkaline unwinding for 20 or 40 min and electrophoresis for 20 or 30 min yielded different dose–response relationships of cells exposed to γ-radiation. It was possible to reduce the variation in oxidized purines in MNBCs from humans by adjusting the level of lesions with protocol-specific calibration curves. However, there was a difference in the level of DNA damage measured by different investigators and this variation could not be reduced by use of investigator-specific calibration curves.
The mean numbers of lesions per 10⁶ bp in MNBCs from seven humans were 0.23 [95% confidence interval (CI): 0.14–0.33] and 0.31 (95% CI: 0.20–0.55) for strand breaks (SBs) and oxidized guanines, respectively. In conclusion, our results indicate that inter-investigator difference in scoring is a strong determinant of DNA damage levels measured by the comet assay.

Introduction

During the last two decades, the single-cell gel electrophoresis (comet) assay has become a method of increasing popularity. It was originally developed as a technique for the measurement of DNA strand breaks (SBs), but further modifications have increased the range of lesions that can be measured. Novel modifications to the original method include the detection of enzyme-sensitive sites such as oxidized pyrimidine and purine bases (1) and DNA repair incision activity of both oxidized and bulky DNA lesions (2,3). Oxidized purine bases can be measured by digestion of the DNA with formamidopyrimidine DNA glycosylase (FPG) that removes the altered bases and the resulting alkaline-labile sites can be measured by methods such as alkaline elution, alkaline unwinding and the comet assay.

The useful application of the comet assay in biomonitoring studies necessitates the comparison of results between studies and the minimization of assay variation. It is still a common practice that comet assay results are reported as primary end points, e.g. tail length, tail moment, percentage of fluorescence in the tail or arbitrary units obtained by categorizing nuclei in different classes. The primary end points reported as the percentage of DNA in tail and visual score can be compared, whereas the relationship between the other end points is less clear due to the large variation in the level of migration reported by different laboratories (1,4). We believe that the most informative way of reporting comet assay results is as lesions per unaltered nucleotides or diploid cells (e.g. lesions per 10⁶ bp).
We prefer to report the data obtained by the comet assay as lesions per 10⁶ bp, rather than 10⁶ dG, because we believe that it is somewhat inaccurate to report SBs as lesions per 10⁶ dG since it may be interpreted as breaks in the DNA strand which only occur at guanines. Lesions per 10⁶ bp can be obtained by using a calibration curve determined by the dose–response relationship of ionizing radiation.

In the early 2000s, the European Standards Committee on Oxidative DNA Damage (ESCODD) compared the FPG-based method on detection of oxidized purines in different laboratories. Rather discouragingly, only half of the laboratories detected a dose–response relationship over the full range of DNA damage in HeLa cells (5). A subsequent analysis showed a large variation in the level of DNA damage in mononuclear blood cells (MNBCs) from young and healthy subjects in different laboratories, but there was an almost identical variation in the level of DNA damage in HeLa cell samples that served as controls and should not vary between laboratories (6). This indicates that a significant contribution to the inter-laboratory variation arises from the variability in assay procedures, including variation in the scoring of slides. This notion has been further substantiated by another investigation where seven different laboratories scored the same set of slides; the variation of DNA damage expressed as the coefficient of variation had a range of 10–100% in samples treated with various doses of hydrogen peroxide (7). Likewise, we have previously shown that different investigators from the same laboratory obtained different levels of DNA damage when scoring the same set of slides, but the variation in DNA damage could be slightly reduced if the level of DNA damage was expressed as the number of lesions per unaltered nucleotides by using investigator-specific calibration curves (8).
The aim of this study was to investigate the variation in DNA damage measured by the comet assay with emphasis on the contribution of the most common variations in assay procedures, number of cells scored and the use of investigator-specific calibration curves.

*To whom correspondence should be addressed. Tel: +45 3532 7654; Fax: +45 3525 7686; Email: [email protected]
© The Author 2008. Published by Oxford University Press on behalf of the UK Environmental Mutagen Society. All rights reserved. For permissions, please e-mail: [email protected].

Materials and methods

Study design

The study encompasses three parts with specific focus on (i) the variation in slide scoring between investigators and over time in scoring the same set of slides, (ii) variation in DNA damage of cells that have been analysed by one investigator using three of the most common assay protocols and (iii) differences between investigators that have carried out separate comet assay experiments on identical samples of cells. Eight experienced comet assay investigators took part in the study. Their experience ranged from 1 to 12 years. All investigators participated in parts 1 and 3 of the study, while one (2 years of experience) also carried out part 2 of the study.

Part 1: variation in visual scoring on different occasions with varying number of nuclei per sample. The first part of the study was a slide-scoring exercise where the eight investigators scored the same set of GelBond slides on two different occasions. The aim was to investigate the variation in score between investigators and to determine whether or not an increased number of scored nuclei per sample decreased the residual variance. The slides were prepared as extra positive and negative controls for an in vitro assay for base excision repair (9,10).
In this assay, substrate A549 human lung epithelial cells were treated with Ro19-8022 (a gift from Hoffmann-La Roche, Basel, Switzerland) and white light in order to induce 8-oxoguanine, embedded in agarose and lysed (as for the standard comet assay). Extract prepared from a different batch of A549 cells was applied to the gels (in parallel with controls with just buffer) and incubated for 20 min at 37°C. 8-Oxoguanines were converted by 8-oxoguanine DNA glycosylase 1 (OGG1) in the extract to DNA breaks, which were measured by alkaline electrophoresis as in the standard comet assay. The investigators were instructed to score a set of 15 pairs of slides. Each pair of slides consisted of two slides labelled A or B (with two gels on each slide). The letters A and B referred to treatment with buffer or extract (but were coded so that the investigator did not know which was which). The slides were prepared on 15 separate days during a biomonitoring study (9): the pairs of slides were therefore numbered 1–15. On two occasions, 3 months apart, the investigators scored the 15 pairs of slides (with two gels each). The most common practice is to score two gels of 100 nuclei per gel (corresponding to 200 nuclei per sample). In this part of the study, the investigators scored 50 nuclei per gel (slides 1–5), 100 nuclei per gel (slides 6–10) and 200 nuclei per gel (slides 11–15), giving a total of 7000 nuclei scored on each occasion. The appearances of the nuclei were scored according to a five-class scoring system in arbitrary units from 0 to 400 (Figure 1). This type of scoring system generally shows excellent correlation with image analysis systems (4).

Part 2: variation in experimental setup. The second part of the study was designed to assess the variation in DNA damage when comparing three of the protocols most commonly used in biomonitoring studies.
Recent biomonitoring studies typically use 20 or 40 min of alkaline treatment and 20 or 30 min of electrophoresis, and the most common combinations in the biomonitoring studies appear to be 20/20, 40/20 or 40/30 min of alkaline treatment and electrophoresis (11). The alkaline treatment increases the level of cleavage of alkaline-labile sites and the DNA migration distance is influenced by the duration of the electrophoresis and alkaline unwinding (12,13). The levels of SB and FPG sites were detected by the comet assay in MNBCs using the following: (i) 20 min of alkaline unwinding and 20 min of electrophoresis, (ii) 40 min of alkaline unwinding and 20 min of electrophoresis and (iii) 40 min of alkaline unwinding and 30 min of electrophoresis. The level of migration was scored in 200 nuclei per sample. There were seven samples of MNBCs included in the experiment because this was the maximal number of samples that could be analysed in one batch of analysis. It was our hypothesis that the reduction of variation related to different comet assay protocols could be achieved by adjusting the raw data with the value obtained by either the reference standard or the protocol-specific calibration curves. It has previously been reported that correcting the raw data with a reference standard yielded less inter-electrophoresis and inter-investigator variation (14). The reference standard was a sample of A549 cells treated with Ro19-8022/white light. We obtained protocol-specific calibration curves for these three experimental setups by analysis of the level of SBs in A549 cells that were irradiated with ionizing radiation (see below). The level of DNA damage in MNBCs is reported as raw data (as arbitrary units), data corrected according to the reference standard (reported as arbitrary units), adjusted using the protocol-specific calibration curve (reported as lesions per 10⁶ bp) or adjusted according to both reference standard cells and protocol-specific calibration curve (reported as lesions per 10⁶ bp).

Fig. 1. The five-class scoring system used by all the investigators in the study. The figure shows representative digital images of nuclei with (A) score zero, (B) score 1, (C) score 2, (D) score 3 and (E) score 4. The colour indicates the intensity of light in the scale from red (highest intensity), yellow, green and blue (lowest intensity).

Part 3: variation in DNA damage measured by different investigators. The third part of the study focused on the variability in DNA damage obtained by eight different investigators carrying out the comet assay on identical samples of cryopreserved MNBCs from seven healthy individuals and on a reference standard sample containing A549 cells treated with Ro19-8022/white light. The lysis solution and enzyme buffer were identical for all investigators, whereas the investigators prepared their own solution for alkaline treatment and electrophoresis on the day of the experiment. The standard protocol followed by all investigators included 40 min of alkaline treatment and 20 min of electrophoresis, and 200 nuclei were scored per sample. The investigators obtained calibration curves by scoring slides of A549 cells irradiated with γ-rays. All investigators scored the same set of slides for the calibration curve. The level of DNA damage in MNBCs is reported as raw data (as arbitrary units), data corrected according to the reference standard (reported as arbitrary units), adjusted according to the investigator-specific calibration curve (reported as lesions per 10⁶ bp) and adjusted according to both reference standard and investigator-specific calibration curve (reported as lesions per 10⁶ bp).

Comet assay

The levels of SB and FPG sites were detected by the comet assay as described previously (15,16).
Briefly, MNBCs or A549 cells were embedded in 0.75% low-melting point agarose (Sigma-Aldrich, Brøndby, Denmark) on GelBond® films (Cambrex, Medinova Scientific A/S, Hellerup, Denmark) and lysed (1% Triton X-100, 2.5 mM NaCl, 100 mM Na2EDTA and 10 mM Tris, pH = 10) for a minimum of 1 h at 4°C. Afterwards, the GelBonds were washed 3 × 5 min in buffer (40 mM HEPES, 0.1 M KCl, 0.5 mM Na2EDTA and 200 µg/ml bovine serum albumin, pH = 8). The levels of FPG sites and SBs were detected by incubation of the agarose-embedded nuclei with 1 µg/ml of FPG or buffer for 45 min at 37°C, respectively. The FPG enzyme was kindly provided by Prof. Andrew Collins, University of Oslo, Norway. The nuclei were then immersed in an alkaline solution (300 mM NaOH and 1 mM Na2EDTA, pH > 13) for 20 or 40 min and the duration of the subsequent electrophoresis was 20 or 30 min in the same solution at 0.83 V/cm and 300 mA. After electrophoresis, the nuclei were washed 3 × 5 min in Tris buffer (0.4 M Tris–HCl, pH = 7.5), rinsed with MilliQ® water and placed in 96% ethanol for a minimum of 90 min and maximum of 12 h. Nuclei were scored by visual inspection with an Olympus fluorescence microscope at 40× magnification after staining with YOYO-1 in phosphate-buffered saline (PBS) solution (Molecular Probes, Eugene, OR, USA). The net number of FPG sites was obtained as the difference between slides treated with the FPG enzyme and with buffer. Scores were translated into lesions per 10⁶ bp by means of the investigator-specific calibration curves. The calculations were based on the fact that the human genome contains 2.9 × 10⁹ nucleotides (17), corresponding to 6 × 10⁹ bp per diploid cell in G0. Assuming that the average molecular weight of a DNA base pair is 650 Da, diploid human cells in G0 phase contain 4 × 10¹² Da DNA.
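The scoring arithmetic described above can be sketched in Python. This is an illustration only and not the software used in the study. The function names and the example class counts are hypothetical, and the sketch assumes that the visual score is normalised to 100 nuclei so that it spans 0–400 arbitrary units. The conversion factor of 0.025 lesions per 10⁶ bp per arbitrary unit is the common calibration value reported later in the Results; an investigator-specific slope would be substituted in practice.

```python
# Illustrative sketch only: names and example counts are hypothetical.

CLASS_WEIGHTS = (0, 1, 2, 3, 4)  # five-class visual scoring, classes 0 to 4


def visual_score(class_counts):
    """Visual score in arbitrary units (0-400), normalised to 100 nuclei (assumption)."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("no nuclei scored")
    weighted = sum(w * n for w, n in zip(CLASS_WEIGHTS, class_counts))
    return 100.0 * weighted / total


def net_fpg_score(fpg_score, buffer_score):
    """Net FPG sites: score of FPG-treated gels minus score of buffer-treated gels."""
    return fpg_score - buffer_score


def to_lesions_per_1e6_bp(score_au, lesions_per_au):
    """Convert arbitrary units with a calibration factor (lesions per 10^6 bp per a.u.)."""
    return score_au * lesions_per_au


# Genome mass stated in the text: 6e9 bp per diploid cell at 650 Da per bp.
BP_PER_DIPLOID_CELL = 6e9
DA_PER_BP = 650
diploid_dna_mass_da = BP_PER_DIPLOID_CELL * DA_PER_BP  # 3.9e12 Da, about 4e12 Da

# Invented counts of nuclei in classes 0-4 for 200 nuclei per sample.
buffer_counts = [150, 30, 12, 6, 2]
fpg_counts = [120, 45, 20, 10, 5]

sb_score = visual_score(buffer_counts)
fpg_net = net_fpg_score(visual_score(fpg_counts), sb_score)
fpg_lesions = to_lesions_per_1e6_bp(fpg_net, 0.025)  # common calibration factor from the Results
```

With these invented counts the buffer-treated gels score 40 arbitrary units and the FPG-treated gels 67.5, so the net FPG score is 27.5 arbitrary units.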
Calibration curve

Human A549 epithelial cells were cultured in medium consisting of Ham's F12, 10% foetal bovine serum (Invitrogen A/S, Tåstrup, Denmark) and 1% penicillin–streptomycin solution (the stock solution from Invitrogen A/S contains 10 000 U/ml of penicillin G and 10 000 µg/ml of streptomycin in 0.85% saline). The cells were irradiated with γ-rays from a ¹³⁷Cs source at 0, 2.5, 10 and 25 Gy with a Gamma Cell 2000 (dose rate 3.77 Gy/min) in PBS. We have previously reported that the yield of SBs increased linearly with ionizing radiation in the 0–10 Gy dose range using a comet assay protocol with 40 min of alkaline unwinding and 20 min of electrophoresis (8). In this experiment, we included a higher dose (25 Gy) because we expected that the yield of SBs would differ between assay protocols and because investigators score slides differently. In our laboratory, we usually observe a linear relationship between the dose of strand-breaking agents (ionizing radiation or hydrogen peroxide) and the visual score in the range of 0–300 arbitrary units, whereas the assay becomes increasingly saturated in the range of 300–400 arbitrary units (given a total range of 0–400 arbitrary units; results not shown). Inspection of the investigator-specific calibration curves indicated that those investigators who had the highest score in the 0–10 Gy dose range had a lower score in the 25 Gy samples than expected based on linear extrapolation from the 0–10 Gy dose range (the level of DNA damage would then have been >300 arbitrary units in the samples irradiated with 25 Gy). This indicates that the investigators' experiments had resulted in saturation of the assay. In an attempt to provide a similar dose range of the calibration curves for all the investigators, we therefore only used the 0–10 Gy interval.
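The restriction of the calibration curve to the 0–10 Gy interval can be illustrated with a short Python sketch. It is not the authors' analysis: the scores are invented, the helper name fit_line is hypothetical, and ordinary least squares is assumed as the regression method. The sketch fits a line to the 0–10 Gy points only and then compares the observed 25 Gy score with the linear extrapolation, which is the saturation check described above.

```python
# Illustrative sketch only: the scores below are invented, not data from the study.

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


doses_gy = [0.0, 2.5, 10.0, 25.0]        # doses used for the calibration slides
scores_au = [20.0, 70.0, 220.0, 340.0]   # hypothetical visual scores (a.u.)

# Use only the 0-10 Gy interval for the calibration slope.
linear_part = [(d, s) for d, s in zip(doses_gy, scores_au) if d <= 10.0]
slope, intercept = fit_line([d for d, _ in linear_part],
                            [s for _, s in linear_part])

# Saturation check: is the observed 25 Gy score below the linear extrapolation?
expected_at_25 = slope * 25.0 + intercept
saturated = scores_au[-1] < expected_at_25
```

In this invented example the slope is 20 arbitrary units per Gy, so linear extrapolation predicts 520 arbitrary units at 25 Gy. That exceeds the 0–400 range of the scoring system, and the observed 340 arbitrary units falls well below it, which is the pattern interpreted as saturation.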
MNBC separation

MNBCs from seven healthy volunteers were collected and isolated in Vacutainer® cell preparation tubes (Becton Dickinson A/S, Brøndby, Denmark) according to the manufacturer's instructions and frozen at −80°C in a mixture containing 50% foetal bovine serum, 40% culture medium (RPMI 1640, Invitrogen A/S, Tåstrup, Denmark) and 10% dimethylsulfoxide. The blood sampling was part of a larger project that has been approved by the Danish ethical committee (approval no. KF 01 283243).

Statistics

The data were analysed by parametric analysis of variance (ANOVA) tests where homogeneity of the variance between groups (assessed by Levene's test) and normal distributions of residuals (assessed by Shapiro–Wilk W test) were fulfilled for either raw or cube-root transformed data. Linear relationships were assessed by regression analysis. In all tests, the level of significance was set to 5%. All data are reported as the mean with standard deviation (SD) or SEM or as the geometric mean with 95% confidence interval (CI). Parametric tests were performed using Statistica version 5.5 for Windows, StatSoft, Inc. (1997), Tulsa, OK, USA.

In part 1 of the study, we could not a priori rule out that the 15 pairs of slides would have different levels of DNA migration as experiments were performed on 15 different occasions. We expected that the investigators would score the samples differently. In addition, the score obtained from the gels treated with enzyme extract was expected to be different from the gels treated with buffer. The statistical analysis thus included variables as follows: set of slides (n = 15), treatment (buffer or extract, n = 2) and investigator (n = 8). The variance of these variables was tested in three different ANOVA tests stratified into the set of slides scored as 50, 100 or 200 nuclei per gel (corresponding to 100, 200 or 400 nuclei per sample of cells). The three-factor main effect ANOVA test did not include tests for interaction between factors.
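How a main-effects ANOVA partitions the total sum of squares into components can be sketched as follows. This is a simplified illustration and not the Statistica analysis used in the study. It assumes a balanced design with one score per combination of factors, the function name variance_components is hypothetical and the scores are invented. Each component is expressed as a percentage of the total sum of squares, and the unexplained remainder is the residual.

```python
# Illustrative sketch only: balanced design assumed, data invented.
from collections import defaultdict


def variance_components(records, factors):
    """Percentage of the total sum of squares explained by each main effect."""
    scores = [r["score"] for r in records]
    grand_mean = sum(scores) / len(scores)
    ss_total = sum((s - grand_mean) ** 2 for s in scores)
    components = {}
    ss_explained = 0.0
    for factor in factors:
        groups = defaultdict(list)
        for r in records:
            groups[r[factor]].append(r["score"])
        # Between-level sum of squares for this factor.
        ss_factor = sum(len(v) * (sum(v) / len(v) - grand_mean) ** 2
                        for v in groups.values())
        components[factor] = 100.0 * ss_factor / ss_total
        ss_explained += ss_factor
    components["residual"] = 100.0 * (ss_total - ss_explained) / ss_total
    return components


# Hypothetical scores: additive effects of slide set, treatment and investigator,
# plus a small term that is orthogonal to all three main effects.
records = []
for slide, s in (("set 1", -10), ("set 2", 10)):
    for treatment, t in (("buffer", -5), ("extract", 5)):
        for investigator, i in (("A", -20), ("B", 20)):
            noise = 2 if s * t * i > 0 else -2
            records.append({"slides": slide, "treatment": treatment,
                            "investigator": investigator,
                            "score": 80 + s + t + i + noise})

components = variance_components(records, ["slides", "investigator", "treatment"])
```

In this invented data set the investigator accounts for the largest share of the total sum of squares, followed by the set of slides, the treatment and the residual.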
Differences in the distribution in score between the first and second occasion of scoring were analysed by χ²-test.

In part 2, differences in the slopes of linear regression lines of the calibration curves were tested by analysis of covariance (ANCOVA). The ANOVA components related to the differences in the level of DNA damage in MNBCs by the three protocols were tested by two-factor ANOVA without the test for interactions. This analysis included the sample (n = 7) and protocol (n = 3) as categorical variables.

In part 3, the assessment of DNA damage by different investigators yielded rather scattered data and thus the variance attributed to investigators and samples was analysed by two-factor non-parametric ANOVA without the test for interactions. The analysis included the sample (n = 7) and investigator (n = 8) as categorical variables. The data are reported as the geometric means and 95% CI.

Results

Part 1: variation in visual scoring on different occasions with varying nuclei per sample

Table I outlines the mean and SD of groups of slides where the levels of DNA damage have been scored in 100, 200 or 400 nuclei per sample of cells. As can be seen, the distribution of SD differs between the groups (P < 0.001, Levene's test), but the means are also different and this implies inhomogeneity of variance between the sample of cells in the three groups. Consequently, we stratified the statistical ANOVA in scoring 100, 200 or 400 nuclei per sample into these three groups. Each analysis included the set of slides (n = 15), treatment (buffer or extract, n = 2) and investigator (n = 8) as main effects. The residual variations of these statistical tests are also outlined in Table I. There was higher residual variation (SDres) in the set of slides where the level of DNA damage had been determined from scoring 100 nuclei per sample, whereas scoring 200 or 400 nuclei per sample was associated with the same magnitude of residual variation.
Although this means that scoring 200 nuclei per sample is sufficient to reduce the variance, it should be recognized that it may not apply to samples with less DNA damage because it can be argued that it is the number of nuclei with migration that determines the statistical power. In this experiment, the investigators scored 77 nuclei with DNA migration (mean) out of the 200 per sample (class 1–4, range: 46–110 damaged nuclei). The analysis also showed that the investigators scored the set of slides differently and this was the variable that contributed the most to the overall variance (Table I). There were differences in the level of DNA damage between the slides representing different days of analysis and the level of DNA damage depended on the treatment (buffer/extract). Table I outlines the contribution of these variables to the overall variance of the analysis. The difference attributed to the set of slides should be viewed as random variance (noise), which is related to the day-to-day experimental variation, since the nuclei in these slides originated from the same sample of cells. The treatment only accounted for a fraction of the overall variance, consistent with the fact that the score for gels treated with extract was only 15% (95% CI: 10–21%) higher than the score for gels treated with buffer.

Figure 2A depicts the mean level of DNA damage in the 15 samples of cells obtained by the eight investigators on the two occasions. It is obvious that the investigators scored the set of slides differently, but each investigator scored with remarkable consistency on the two occasions. This indicates that variation in DNA damage measured by the comet assay is not the consequence of large variation between different gels that are applied onto GelBond slides. It is our experience that gels applied onto frosted glass microscope slides produce a higher degree of variance between gels. Thus, it is possible that one will not see the same consistency between two scoring occasions if it is carried out with frosted glass slides. It should be emphasized that no scoring bias was introduced on the second occasion because the investigators did not know the results of their previous scoring. The distribution of single classes of images is depicted in Figure 2B. Here, it can be seen that the investigators scored a similar number of nuclei in each class on the two occasions. The distribution of nuclei in the five classes did not differ significantly between the two occasions for any investigator (χ² ranged from 0.05 to 5.6 among the eight investigators and the critical value for statistical significance is χ² (0.05, 4 df) = 9.5).

Table I. Variance in DNA damage of MNBCs attributed to differences in set of slides, investigator and treatment

Nuclei per      Mean (SD)     SDres(b)           Variance components(c)
treatment(a)                                     Set of slides   Investigator   Treatment
100             77.2 (36.3)   24.1 (P < 0.001)   17***           32***          6.1***
200             79.8 (35.9)   16.6 (P = 0.53)    34***           44***          0.7*
400             69.4 (26.6)   15.9 (P = 0.28)    24***           37***          2.7**

(a) The nuclei per treatment refers to the number of nuclei that were scored in two gels treated with either buffer or cell extract.
(b) Distribution of the residuals for ANOVA tests of main effect of investigator (n = 8), set of slides (n = 15) and treatment (n = 2). The P-values correspond to the Shapiro–Wilk W test for normality of the residuals.
(c) The data are the percentage of the total sum of squares that can be explained by differences in the set of slides originating from the same sample of cells, investigators and treatment. The unexplained fraction of the variance corresponds to the residual variance. The P-values correspond to a main effect ANOVA with the set of slides, investigator and treatment as categorical variables (*P < 0.05, **P < 0.01, ***P < 0.001).

Fig. 2. Level of DNA damage in MNBCs scored by eight investigators 3 months apart. (A) Each point represents the mean (SEM) of DNA damage in 15 sets of slides. The score for each set of slides was calculated as the mean score of nuclei in gels treated with cell extract or buffer. The overall regression coefficient was r = 0.80 (n = 120). (B) The distribution (in percent of the total number of cells scored) scored into the five categories as follows: diamond (class 0), square (class 1), triangle (class 2), cross (class 3) and circle (class 4). Each investigator scored 7000 nuclei on each of the two occasions. The overall regression coefficient is r = 0.99. [Axes: (A) first occasion against second occasion, a.u.; (B) first occasion against second occasion, % of total cells.]

Part 2: variation in experimental setup

Figure 3 shows the dose–response relationships of the calibration curves. There was a clear dose–response relationship between the dose of ionizing radiation and DNA migration for all the protocols. The slopes were different for the regression lines of the calibration curves of A549 cells analysed in protocols with differences in the duration of alkaline treatment and electrophoresis (P < 0.001, ANCOVA). The level of DNA damage in MNBCs from seven healthy individuals was assessed with the same variations in the experimental protocol as the calibration curve (i.e. 20/20, 40/20 or 40/30 min of alkaline unwinding and electrophoresis). In Figure 4, the levels of SB and FPG sites in MNBCs are depicted as both scores in arbitrary units (0–400 arbitrary units) and lesions per 10⁶ bp, as well as with and without correction with the reference standard and calibration curve. Table II outlines the statistical analysis of raw data as well as the data that have been corrected with the reference standard and/or the calibration curve. The analysis included samples (n = 7) and protocol (n = 3) as categorical variables. There was a significant effect of the protocol for the level of SBs (P < 0.05, ANOVA). The effect related to different protocols could not be eliminated by transformation of the data to SBs per 10⁶ bp (P < 0.001, ANOVA). However, the differences in SBs attributed to different protocols could be eliminated for both the raw data and the data transformed to SBs per 10⁶ bp by correcting the data with the reference standard (P > 0.05, ANOVA). The statistical analysis of the raw data of FPG sites showed a significant effect of the protocol (P < 0.001, ANOVA), which could not be corrected for by using the reference standard (P < 0.01, ANOVA). The effect of the protocol was eliminated by transformation of the data to FPG lesions per 10⁶ bp (P > 0.05, ANOVA).

Fig. 3. Calibration curves obtained by a single investigator in three different comet assay protocols of A549 cells irradiated with γ-rays. The points represent calibration curves obtained in assay protocols with the duration of alkaline unwinding/electrophoresis in minutes as follows: triangle (40/30), square (40/20) and diamond (20/20). [Axes: score (a.u.) against irradiation dose (Gy).]

Part 3: variation in DNA damage measured by different investigators

Figure 5 outlines the calibration curves obtained by the eight investigators who scored the same set of slides of A549 cells irradiated with γ-rays. The mean level of DNA damage in MNBCs obtained by the eight investigators is shown in Table III. In this experiment, the total variance is composed of the biological variation (differences in the level of DNA damage of the samples), variation due to experimental differences (as a consequence of differences in the analysis by different investigators) and the residual variation. The latter is the unexplained variance and it should be as low as possible. For both SB and FPG sites, there were no statistically significant contributions of the samples to the variance of the data, whereas the variance attributed to the investigators was statistically significant (P < 0.001, two-factor non-parametric ANOVA). This means that most of the variation in the results is due to random (assay) variation. Ideally, the correction of the raw data with the reference standard or calibration curve should reduce the variance attributed to the investigator and residual variation, whereas the variance attributed to the sample might increase if there is a real difference in the level of DNA damage between healthy humans. However, transformation of the data with the calibration curve or the reference standard did not alter the variance related to differences in scoring between investigators and the variance attributed to differences in the samples remained low. Figure 6 outlines the level of SB and FPG sites transformed by the calibration curves.

The unaltered variation by use of investigator-specific calibration curves indicates that it might be sufficient to use a common calibration curve for all investigators. We compared this approach with the investigator-specific calibration curve. The common calibration curve was the mean of all calibration curves (1 arbitrary unit = 0.025 lesions per 10⁶ bp). The mean level of DNA damage in the seven MNBC samples was 0.43 (95% CI: 0.14–1.27) and 0.31 (95% CI: 0.11–0.86) FPG sites per 10⁶ bp for the data obtained by a common and investigator-specific calibration curve, respectively. The distributions of FPG sites were not different (P = 0.65, Levene's test), although the CI is slightly larger for the data obtained by the common calibration curve than the investigator-specific calibration curve. The mean level of SBs per 10⁶ bp of the samples was 0.27 (95% CI: 0.12–0.61) and 0.23 (95% CI: 0.09–0.56) for the data calculated by a common and investigator-specific calibration curves, respectively. These distributions were not statistically different (P = 0.17, Levene's test) and there were similar ranges of the CIs.

Discussion

During the last decade, the comet assay has become a popular technique for the determination of genotoxicity in tissues or MNBCs in biomonitoring studies. However, researchers keep reporting DNA damage measured by the comet assay in different units, which renders comparison difficult. An important finding of this investigation is that investigators score slides differently, but each investigator displays a remarkable consistency in scoring over time. Judged from results obtained in the slide-scoring exercise (part 1) and analysis of DNA damage in MNBC (part 3), the major contribution to the overall variability of DNA damage was the variance of the investigators. Moreover, comparing the analysis of different protocols (part 2) with the analysis of FPG sites in MNBCs, it can be seen that the fraction of the variance that was explained by the biological variation (i.e. referred to as 'sample' in the variance component analysis in Tables II and III) was relatively low, whereas the variation explained by the protocol and investigator was about the same (i.e. <20% of the overall variation). These comparisons suggest that a large fraction of the variation in DNA damage reported in different studies is related to the fact that investigators score slides differently. Intuitively, it should be possible to diminish the variation in DNA damage observed by different investigators if the data are corrected with the use of a calibration curve or reference standard.
We corrected the primary comet assay scores with either a reference standard that was included in the same batch as the MNBC samples or a calibration curve. The results were somewhat disappointing in the sense that this correction did not consistently lead to lower investigator-specific variation of DNA damage in MNBCs. However, it is reassuring that the variation in FPG sites obtained by the different assay protocols (part 2) was reduced after correction of the primary end point with the calibration curve. It should be emphasized that in that experiment (part 2), the MNBC samples were analysed on two different days (each day of analysis consisted of seven MNBC samples and a reference standard). Thus, for each MNBC sample, there were two data points (one for each day of analysis). In comparison, each investigator in part 3 analysed the seven MNBC samples in the same analysis, but the investigators analysed their samples on different days. Therefore, we cannot discriminate the day-to-day variation in the assay from the investigator-specific variation in part 3 of the study, because that would have required the investigators to analyse the samples on two different days. This means that the variance attributed to the investigators in part 3 of the study contains both the variance related to the differences in scoring between investigators and the day-to-day variation. Previously, we have observed that 65% of the assay variation was attributed to the day-to-day variation, whereas 35% was intra-assay variation (18). Thus, a reduction in the variance of the lesions after calibration may have gone undetected in this study because of the strong effect of the day-to-day variation. Further studies should thus focus on reducing the effect of the day-to-day variation by having each investigator analyse the same sample of cells on different days. In fact, we typically assess duplicate samples by the comet assay on different days of analysis. For example, in a normal biomonitoring study, we would have calculated the mean of the two days of analysis, but in this study the measurements on both days of analysis in part 2 were included in the statistical analysis because we were interested in the variation of the measurement.

Fig. 4. Variation in DNA damage of MNBCs analysed by different experimental protocols. Each symbol represents the measurement of DNA damage in one MNBC sample analysed in the assay protocol with varying duration of alkaline treatment and electrophoresis. Open symbols represent SBs. Closed symbols represent FPG sites. The graphs depict the DNA damage (A) in arbitrary units (au), as well as data corrected for (B) the reference standard, (C) the calibration curve and (D) both the reference standard and calibration curve. Axes: score (a.u.) in A and B and lesions per 10⁶ bp in C and D, plotted against the protocol (20/20, 40/20 and 40/30).

In recent years, there has been focus on the true level of oxidized guanines in tissues and cells of humans. An assessment of DNA damage obtained in a number of biomonitoring studies concluded that reference values for human diploid MNBCs were in the range of 0.16–0.18 SBs per 10⁶ bp and 0.11–0.33 FPG sites per 10⁶ bp (11). The greater range observed for the FPG sites might be because the measurement of these lesions is technically more demanding than that of SBs, and there might be batch variation between preparations of the FPG enzyme. Our estimate of the number of FPG sites is in the upper end of this range, i.e. 0.31 FPG sites per 10⁶ bp (95% CI: 0.11–0.86 lesions per 10⁶ bp) for the data obtained by the investigator-specific calibration curves.
This estimate is higher than that measured in lymphocytes of 99 subjects in Sweden, where the level of FPG sites was reported to be 0.10 ± 0.033 FPG sites per 10⁶ bp (mean ± SD) and the number of lesions was calculated on the basis of the calibration curve from the ESCODD trial (19). Using investigator-specific calibration curves, we previously reported that MNBCs from healthy humans contained 0.22 ± 0.14 FPG sites per 10⁶ bp (8).

Table II. Variance in DNA damage of MNBCs attributed to differences in analysis by different comet assay procedures and samples

                                  DNA damage (mean ± SD)                       Variance componentsᵃ
Type                              Group 1       Group 2       Group 3       Sample   Protocol   Total variance
SBs (au)ᵇ                         2.7 ± 1.9     10.6 ± 3.5    6.3 ± 2.7     5.9      20.4*      2182
SBs (au, corrected)ᵇ,ᶜ            6.7 ± 5.4     8.8 ± 2.9     5.2 ± 3.3     14.7     4.1        2219
SBs per 10⁶ bpᵈ                   0.06 ± 0.04   0.20 ± 0.07   0.02 ± 0.01   4.0      34.0***    0.75
SBs per 10⁶ bp (corrected)ᶜ,ᵈ     0.15 ± 0.13   0.17 ± 0.05   0.49 ± 0.52   13.7     9.2        11.34
FPG (au)ᵇ                         47.5 ± 12.9   66.6 ± 10.3   86.4 ± 10.1   5.2      33.4***    31 664
FPG (au, corrected)ᵇ,ᶜ            48.8 ± 13.1   73.2 ± 11.4   78.9 ± 10.7   6.7      24.3**     29 574
FPG per 10⁶ bpᵈ                   1.10 ± 0.30   1.25 ± 0.19   1.22 ± 0.14   7.6      2.2        8.44
FPG per 10⁶ bp (corrected)ᶜ,ᵈ     1.13 ± 0.30   1.38 ± 0.21   1.11 ± 0.15   8.3      6.7        9.18

The groups correspond to alkaline treatment (minutes) and electrophoresis (minutes) as follows: 20/20 (group 1), 40/20 (group 2) and 40/30 (group 3).
ᵃ The data are analysed by two-factor ANOVA with protocol (n = 3) and sample (n = 7) as categorical variables and two independent determinations for each treatment. There were statistically significant single-factor effects of the protocol as follows: *P < 0.05, **P < 0.01 and ***P < 0.001. The variance components represent the percentage of the total sum of squares that can be explained by differences in the samples and protocols. The unexplained fraction of the variance corresponds to the residual variance.
ᵇ The levels of SBs and FPG sites are reported in arbitrary units (au) in the 0–400 range.
ᶜ The raw data were transformed by the formula as follows: corrected data = [(raw data) × (mean of reference standards)]/(protocol-specific standard). The level of SBs was corrected with the reference standard for the buffer-treated A549 cells exposed to Ro19-8022/white light (mean: 21.3 au), and the FPG sites were corrected with the reference standard of A549 cells exposed to Ro19-8022/white light and treated with FPG enzyme (mean: 173.3 au).
ᵈ The data are converted into lesions per 10⁶ bp by use of the protocol-specific calibration curve (cf. Figure 3).

Interestingly, Pitozzi et al. (20) obtained a protocol-specific calibration curve for assay conditions of 20 min of alkaline treatment and 20 min of electrophoresis, and it was reported that the level of FPG sites was 0.6 ± 0.07 FPG sites per 10⁶ bp (mean ± SEM) in 70 healthy volunteers. In fact, this estimate is well within the 0.13–1.85 lesions per 10⁶ bp (corresponding to 0.3–4.2 modifications per 10⁶ dG) that researchers involved in the ESCODD trial argued was the range of true values, and where the lowest values represent the FPG-based measurements by the comet assay, alkaline elution or alkaline unwinding assays (21). It is reassuring that similar levels of lesions are obtained by several FPG-based methods, but it should be recognized that all these assays are calibrated on the basis of an equivalence between ionizing radiation dose and DNA break frequency that was established using alkaline sucrose sedimentation (22,23). This technique has low sensitivity in the range of ionizing radiation where the comet assay displays a linear dose–response relationship. This means that estimations of the level of lesions in diploid cells are based on the assumption that ionizing radiation induces strand breakage with a linear dose–response relationship over a wide range of doses.
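The two units quoted for the ESCODD range (0.13–1.85 lesions per 10⁶ bp, corresponding to 0.3–4.2 modifications per 10⁶ dG) differ by a constant factor, namely the number of dG per base pair. The sketch below back-calculates that factor from the quoted end points and applies it to our estimate of 0.31 FPG sites per 10⁶ bp. The factor is inferred here from the quoted numbers and is not stated in the text, and the function names are hypothetical.

```python
def implied_dg_per_bp(lesions_per_1e6_bp, lesions_per_1e6_dg):
    """Number of dG per base pair implied by one pair of equivalent values."""
    return lesions_per_1e6_bp / lesions_per_1e6_dg


def per_bp_to_per_dg(lesions_per_1e6_bp, dg_per_bp):
    """Convert lesions per 10^6 bp into lesions per 10^6 dG."""
    return lesions_per_1e6_bp / dg_per_bp


# End points of the range quoted from the ESCODD discussion (21).
low = implied_dg_per_bp(0.13, 0.3)    # about 0.43 dG per bp
high = implied_dg_per_bp(1.85, 4.2)   # about 0.44 dG per bp

# Assumption for illustration: use the average of the two implied factors.
dg_per_bp = (low + high) / 2

# Our estimate of 0.31 FPG sites per 10^6 bp expressed per 10^6 dG.
estimate_per_1e6_dg = per_bp_to_per_dg(0.31, dg_per_bp)
```

Because every G·C base pair contains exactly one dG, the implied factor of roughly 0.43 to 0.44 can be read as the fraction of G·C pairs assumed in the conversion.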
In addition, it should also be emphasized that some cell types are radiosensitive and that this affects the slope of the calibration curve. It has, for example, been shown that there was an almost 3-fold difference in the slopes of the relationship between ionizing radiation and break frequency in six different bladder cancer cell lines (24). In fact, the slopes of the calibration curves that we have obtained in this study were in the low range of the estimates that have been reported, and the slopes are lower than those that we have obtained in previous and later experiments (results not shown). We do not have an explanation for the shallow dose–response curves, but it does not appear to be related to the comet assay because the levels of SBs and FPG sites in the MNBC samples were within the range that we usually have observed in biomonitoring studies.

Fig. 5. Calibration curves obtained by eight experienced investigators. Each curve represents the score of one investigator. The investigators scored the same set of slides for the calibration curves. The assay protocols for the investigator-specific calibration curves were all analysed with similar conditions (40 min of alkaline unwinding and 20 min of electrophoresis). Axes: score (a.u.) against irradiation dose (Gy).

Our analyses of the variability in the level of DNA damage obtained by the comet assay have been based only on the five-class visual scoring system. The purpose of our investigation was to display as much as possible of the comet assay variation. This might provide a grim picture of the visual classification system as a way of quantifying the DNA migration in the comet assay. However, we would like to emphasize that the visual classification is still a useful measurement because all the investigators were able to observe a dose–response relationship in the calibration curve, detect a relatively modest difference in slides treated with our cell extract (part 1), and had very consistent scoring of the slides.
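For readers unfamiliar with visual scoring, the arithmetic behind a score in the 0–400 range can be sketched as follows. This is an illustration under assumptions that are not spelled out in this section: the five classes are taken to be numbered 0 to 4, and the score is normalized to 100 nuclei so that it stays within 0–400 regardless of how many nuclei are scored. The function name visual_score and the class counts are hypothetical.

```python
def visual_score(class_counts):
    """Visual score in arbitrary units (0-400) from five class counts.

    class_counts[k] is the number of nuclei assigned to class k (k = 0..4).
    The score is 100 times the mean class, i.e. normalized to 100 nuclei.
    """
    if len(class_counts) != 5:
        raise ValueError("expected counts for exactly five classes (0-4)")
    total = sum(class_counts)
    if total == 0:
        raise ValueError("no nuclei were scored")
    weighted = sum(k * n for k, n in enumerate(class_counts))
    return 100 * weighted / total


# Hypothetical slide with 100 nuclei, mostly undamaged.
score_100 = visual_score([60, 20, 10, 6, 4])

# The same class distribution scored on 200 nuclei gives the same score.
score_200 = visual_score([120, 40, 20, 12, 8])
```

Under this normalization, scoring more nuclei does not change the expected score; it only reduces the sampling error around it.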
The visual classification system should be regarded as a reliable way of measuring DNA migration in the comet assay because it is common practice that only one investigator analyses the samples within one investigation. There is no general consensus as to which of the primary comet assay end points provides the best measurement of DNA damage. Earlier guidelines did not recommend specific primary end points (25,26). However, the reports from more recent workgroups recommend image analysis (27) and state that image analysis is preferred but not required for assessment of DNA damage by the comet assay (28). This suggests that in recent years there has been increased acceptance of image analysis at the expense of visual classification, which might be due to the increased availability of commercial and free image analysis systems or because measurements of DNA migration by image analysis systems comply better with good laboratory practice. To the best of our knowledge, there are no investigations that have focused on direct comparison of the assay variance between visual classification and image analysis. We believe that the overall conclusions obtained in this report will have general applicability to all the primary comet assay end points, although image analysis of nuclei might be associated with less variability between investigators than the visual classification of nuclei. However, it should also be emphasized that, whether the samples are analysed by image analysis or by visual classification, we believe that the most informative way of presenting comet assay results is as lesions per unaltered nucleotides or diploid cells.

Table III. Variance in DNA damage of blood cells attributed to differences in analysis by different investigators and MNBC samples

                                                       Variance componentsᵃ
Type of lesion                    Mean (95% CI)        Sample   Investigator
SBs (au)ᵇ                         10.7 (4.7–24.5)      0.8      20.7***
SBs (au, corrected)ᵇ,ᶜ            15.3 (7.2–32.6)      0.6      21.2***
SBs per 10⁶ bpᵈ                   0.23 (0.09–0.56)     0.6      19.0***
SBs per 10⁶ bp (corrected)ᶜ,ᵈ     0.42 (0.19–0.95)     0.9      18.9***
FPG (au)ᵇ                         17.1 (5.4–51.0)      2.6      19.9***
FPG (au, corrected)ᵇ,ᶜ            16.7 (5.0–56.1)      2.3      20.8***
FPG per 10⁶ bpᵈ                   0.31 (0.11–0.86)     2.2      21.0***
FPG per 10⁶ bp (corrected)ᶜ,ᵈ     0.50 (0.18–1.42)     2.1      20.5***

The data are analysed by two-factor non-parametric ANOVA with investigator (n = 8) and sample (n = 7) as categorical variables. There were statistically significant single-factor effects of the investigator (***P < 0.001).
ᵃ The data are the percentage of the total sum of squares that can be explained by differences in the samples and investigators. The total variance is 45 486. The unexplained fraction of the variance corresponds to the residual variance.
ᵇ The levels of SBs and FPG sites are reported in arbitrary units (au) in the 0–400 range.
ᶜ The raw data were transformed by the formula as follows: corrected data = [(raw data) × (mean of reference standards)]/(investigator-specific standard). The level of SBs was corrected with the reference standard for the buffer-treated A549 cells exposed to Ro19-8022/white light (mean of all investigators: 23.75 au), and the FPG sites were corrected with the reference standard of A549 cells exposed to Ro19-8022/white light and treated with FPG enzyme (mean of all investigators: 171.13 au).
ᵈ The data are converted into lesions per 10⁶ bp by use of the investigator-specific calibration curve (cf. Figure 5).

Funding

Environmental Cancer Risk, Nutrition and Individual Susceptibility, a network of excellence operating within the European Union 6th Framework Program, Priority 5: ‘Food Quality and Safety’ (Contract No 513943).
Acknowledgements

Conflict of interest statement: None declared.

Fig. 6. Variation in the level of DNA damage of seven MNBC samples scored by eight investigators and converted to lesions per 10⁶ bp by investigator-specific calibration curves. The investigators have been ranked according to the increasing level of FPG sites in MNBCs (A) and the same ranking order is used for the SBs in MNBCs (C) as well as the levels of DNA damage in the A549 reference standard samples for FPG sites (B) and SBs (D). The closed symbols represent the mean and 95% CIs of DNA damage in seven MNBCs scored by eight experienced investigators. The open symbols represent mean and 95% CIs of lesions in MNBCs based on the mean level of lesions of each investigator (n = 8). Axes: FPG sites per 10⁶ bp (A), FPG sites in a.u. (B), SBs per 10⁶ bp (C) and SBs in a.u. (D).

References

1. Collins, A. R. and Dusinska, M. (2002) Oxidation of cellular DNA measured with the comet assay. Methods Mol. Biol., 186, 147–159.
2. Collins, A. R., Dusinska, M., Horvathova, E., Munro, E., Savio, M. and Stetina, R. (2001) Inter-individual differences in repair of DNA base oxidation, measured in vitro with the comet assay. Mutagenesis, 16, 297–301.
3. Langie, S. A., Knaapen, A. M., Brauers, K. J., van, B. D., van Schooten, F. J. and Godschalk, R. W. (2006) Development and validation of a modified comet assay to phenotypically assess nucleotide excision repair. Mutagenesis, 21, 153–158.
4. Møller, P. (2006) The alkaline comet assay: towards validation in biomonitoring of DNA damaging exposures. Basic Clin. Pharmacol. Toxicol., 98, 336–345.
5. ESCODD (2003) Measurement of DNA oxidation in human cells by chromatographic and enzymic methods. Free Radic. Biol. Med., 34, 1089–1099.
6. ESCODD, Gedik, C. M. and Collins, A. (2005) Establishing the background level of base oxidation in human lymphocyte DNA: results of an interlaboratory validation study. FASEB J., 19, 82–84.
7. Garcia, O., Mandina, T., Lamadrid, A. I., Diaz, A., Remigio, A., Gonzalez, Y., Piloto, J., Gonzalez, J. E. and Alvarez, A. (2004) Sensitivity and variability of visual scoring in the comet assay. Results of an interlaboratory scoring exercise with the use of silver staining. Mutat. Res., 556, 25–34.
8. Møller, P., Friis, G., Christensen, P. H. et al. (2004) Intra-laboratory comet assay sample scoring exercise for determination of formamidopyrimidine DNA glycosylase sites in human mononuclear blood cell DNA. Free Radic. Res., 38, 1207–1214.
9. Vaclavik Bräuner, E., Forchhammer, L., Møller, P., Simonsen, J., Glasius, M., Wåhlin, P., Raaschou-Nielsen, O. and Loft, S. (2007) Exposure to ultrafine particles from ambient air and oxidative stress-induced DNA damage. Environ. Health Perspect., 115, 1177–1182.
10. Collins, A. R. and Horvathova, E. (2001) Oxidative DNA damage, antioxidants and DNA repair: applications of the comet assay. Biochem. Soc. Trans., 29, 337–341.
11. Møller, P. (2006) Assessment of reference values for DNA damage detected by the comet assay in human blood cell DNA. Mutat. Res., 612, 84–104.
12. Vijayalaxmi, Tice, R. R. and Strauss, G. H. (1992) Assessment of radiation-induced DNA damage in human blood lymphocytes using the single-cell gel electrophoresis technique. Mutat. Res., 271, 243–252.
13. Speit, G., Trenz, K., Schutz, P., Rothfuss, A. and Merk, O. (1999) The influence of temperature during alkaline treatment and electrophoresis on results obtained with the comet assay. Toxicol. Lett., 110, 73–78.
14. De Boeck, M., Touil, N., De Visscher, G., Vande, P. A. and Kirsch-Volders, M. (2000) Validation and implementation of an internal standard in comet assay analysis. Mutat. Res., 469, 181–197.
15. Risom, L., Møller, P., Vogel, U., Kristjansen, P. E. and Loft, S. (2003) X-ray-induced oxidative stress: DNA damage and gene expression of HO-1, ERCC1 and OGG1 in mouse lung. Free Radic.
Res., 37, 957–966.
16. Vinzents, P. S., Møller, P., Sørensen, M., Knudsen, L. E., Hertel, O., Jensen, F. P., Schibye, B. and Loft, S. (2005) Personal exposure to ultrafine particles and oxidative DNA damage. Environ. Health Perspect., 113, 1485–1490.
17. International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature, 431, 931–945.
18. Møller, P., Loft, S., Alfthan, G. and Freese, R. (2004) Oxidative DNA damage in circulating mononuclear blood cells after ingestion of blackcurrant juice or anthocyanin-rich drink. Mutat. Res., 551, 119–126.
19. Hofer, T., Karlsson, H. L. and Möller, L. (2006) DNA oxidative damage and strand breaks in young healthy individuals: a gender difference and the role of life style factors. Free Radic. Res., 40, 707–714.
20. Pitozzi, V., Pallotta, S., Balzi, M., Bucciolini, M., Becciolini, A., Dolara, P. and Giovannelli, L. (2006) Calibration of the comet assay for the measurement of DNA damage in mammalian cells. Free Radic. Res., 40, 1149–1154.
21. Collins, A. R., Cadet, J., Möller, L., Poulsen, H. E. and Vina, J. (2004) Are we sure we know how to measure 8-oxo-7,8-dihydroguanine in DNA from human cells? Arch. Biochem. Biophys., 423, 57–65.
22. Ahnström, G. and Erixon, K. (1981) Measurement of strand breaks by alkaline denaturation and hydroxyapatite chromatography. In Friedberg, E. C. and Hanawalt, P. C. (eds), DNA Repair: A Laboratory Manual of Research Procedures. Marcel Dekker, New York, NY, pp. 403–418.
23. Kohn, K. W., Erickson, L. C., Ewig, R. A. and Friedman, C. A. (1976) Fractionation of DNA from mammalian cells by alkaline elution. Biochemistry, 15, 4629–4637.
24. Moneef, M. A., Sherwood, B. T., Bowman, K. J., Kockelbergh, R. C., Symonds, R. P., Steward, W. P., Mellon, J. K. and Jones, G. D. (2003) Measurements using the alkaline comet assay predict bladder cancer cell radiosensitivity. Br. J. Cancer, 89, 2271–2276.
25. Tice, R. R., Agurell, E., Anderson, D. et al.
(2000) Single cell gel/comet assay: guidelines for in vitro and in vivo genetic toxicology testing. Environ. Mol. Mutagen., 35, 206–221.
26. Albertini, R. J., Anderson, D., Douglas, G. R. et al. (2000) IPCS guidelines for the monitoring of genotoxic effects of carcinogens in humans. International Programme on Chemical Safety. Mutat. Res., 463, 111–172.
27. Hartmann, A., Agurell, E., Beevers, C. et al. (2003) Recommendations for conducting the in vivo alkaline Comet assay. 4th International Comet Assay Workshop. Mutagenesis, 18, 45–51.
28. Burlinson, B., Tice, R. R., Speit, G. et al. (2007) Fourth International Workgroup on Genotoxicity testing: results of the in vivo Comet assay workgroup. Mutat. Res., 627, 31–35.

Received on September 10, 2007; revised on January 18, 2008; accepted on January 18, 2008