Mutagenesis vol. 23 no. 3 pp. 223–231, 2008
Advance Access Publication 7 March 2008
doi:10.1093/mutage/gen006
Variation in assessment of oxidatively damaged DNA in mononuclear blood cells by
the comet assay with visual scoring
Lykke Forchhammer, Elvira Vaclavik Bräuner, Janne
Kjærsgaard Folkmann, Pernille Høgh Danielsen,
Claus Nielsen, Annie Jensen, Steffen Loft, Gitte Friis and
Peter Møller*
Department of Environmental Health, Institute of Public Health, University
of Copenhagen, Øster Farimagsgade 5, Building 5, 2nd floor, PO Box 2099,
DK-1014 Copenhagen K, Denmark
The comet assay is popular for assessments of genotoxicity,
but the comparison of results between studies is challenging because of differences in experimental procedures and
reports of DNA damage in different units. We investigated
the variation of DNA damage in mononuclear blood cells
(MNBCs) measured by the comet assay with focus on the
variation related to alkaline unwinding and electrophoresis
time, number of cells scored, as well as the putative
benefits of transforming the primary end points to
common units by the use of reference standards and
calibration curves. Eight experienced investigators scored
pre-made slides of nuclei differently, but each investigator
scored consistently over time. Scoring of 200 nuclei per
treatment was associated with the lowest residual variation. Alkaline unwinding for 20 or 40 min and electrophoresis for 20 or 30 min yielded different dose–response
relationships of cells exposed to γ-radiation and it was
possible to reduce the variation in oxidized purines in
MNBCs from humans by adjusting the level of lesions with
protocol-specific calibration curves. However, there was
a difference in the level of DNA damage measured by
different investigators and this variation could not be
reduced by use of investigator-specific calibration curves.
The mean numbers of lesions per 10⁶ bp in MNBCs from
seven humans were 0.23 [95% confidence interval (CI):
0.14–0.33] and 0.31 (95% CI: 0.20–0.55) for strand breaks
(SBs) and oxidized guanines, respectively. In conclusion,
our results indicate that inter-investigator difference in
scoring is a strong determinant of DNA damage levels
measured by the comet assay.
Introduction
During the last two decades, the single-cell gel electrophoresis
(comet) assay has become a method of increasing popularity. It
was originally developed as a technique for the measurement of
DNA strand breaks (SBs), but further modifications have
increased the range of lesions that can be measured. Novel
modifications to the original method include the detection of
enzyme-sensitive sites such as oxidized pyrimidine and purine
bases (1) and DNA repair incision activity of both oxidized and
bulky DNA lesions (2,3). Oxidized purine bases can be
measured by digestion of the DNA with formamidopyrimidine
DNA glycosylase (FPG) that removes the altered bases and the
resulting alkaline-labile sites can be measured by methods such
as alkaline elution, alkaline unwinding and the comet assay.
The useful application of the comet assay in biomonitoring
studies necessitates the comparison of results between studies
and the minimization of assay variation. It is still a common
practice that comet assay results are reported as primary end
points, e.g. tail length, tail moment, percentage of fluorescence
in the tail or arbitrary units obtained by categorizing nuclei in
different classes. The primary end points reported as the
percentage of DNA in tail and visual score can be compared,
whereas the relationship between the other end points is less
clear due to the large variation in the level of migration
reported by different laboratories (1,4). We believe that the
most informative way of reporting comet assay results is as
lesions per unaltered nucleotides or diploid cells (e.g. lesions
per 10⁶ bp). We prefer to report the data obtained by the comet
assay as lesions per 10⁶ bp, rather than per 10⁶ dG, because we
believe that it is somewhat inaccurate to report SBs as lesions
per 10⁶ dG since it may be interpreted as breaks in the DNA
strand which only occur at guanines. Results in these units can be obtained by
using a calibration curve determined by the dose–response
relationship of ionizing radiation.
In the early 2000s, the European Standards Committee on
Oxidative DNA Damage (ESCODD) compared the FPG-based
method on detection of oxidized purines in different laboratories. Rather discouragingly, only half of the laboratories
detected a dose–response relationship over the full range of
DNA damage in HeLa cells (5). A subsequent analysis showed
a large variation in the level of DNA damage in mononuclear
blood cells (MNBCs) from young and healthy subjects in
different laboratories, but there was an almost identical
variation in the level of DNA damage in HeLa cell samples
that served as controls and should not vary between
laboratories (6). This indicates that a significant contribution
to the inter-laboratory variation arises from the variability
in assay procedures, including variation in the scoring of slides.
This notion has been further substantiated by another investigation where seven different laboratories scored the
same set of slides; the variation of DNA damage expressed
as the coefficient of variation had a range of 10–100% in
samples treated with various doses of hydrogen peroxide (7).
Likewise, we have previously shown that different investigators from the same laboratory obtained different levels of
DNA damage when scoring the same set of slides, but the
variation in DNA damage could be slightly reduced if the level
of DNA damage was expressed as the number of lesions per
unaltered nucleotides by using investigator-specific calibration
curves (8).
The aim of this study was to investigate the variation in
DNA damage measured by the comet assay with emphasis on
the contribution of the most common variations in assay
procedures, number of cells scored and the use of investigator-specific calibration curves.
*To whom correspondence should be addressed. Tel: +45 3532 7654; Fax: +45 3525 7686; Email: [email protected]
© The Author 2008. Published by Oxford University Press on behalf of the UK Environmental Mutagen Society.
All rights reserved. For permissions, please e-mail: [email protected].
Materials and methods
Study design
The study encompasses three parts with specific focus on (i) the variation in
slide scoring between investigators and over time in scoring the same set of
slides, (ii) variation in DNA damage of cells that have been analysed by one
investigator using three of the most common assay protocols and (iii)
differences between investigators that have carried out separate comet assay
experiments on identical samples of cells. Eight experienced comet assay
investigators took part in the study. Their experience ranged from 1 to 12 years.
All investigators participated in parts 1 and 3 of the study, while one (2 years of
experience) also carried out part 2 of the study.
Part 1 variation in visual scoring on different occasions with varying number
of nuclei per sample. The first part of the study was a slide-scoring exercise
where the eight investigators scored the same set of GelBond slides on two
different occasions. The aim was to investigate the variation in score between
investigators and to determine whether or not an increased number of scored
nuclei per sample decreased the residual variance. The slides were prepared as
extra positive and negative controls for an in vitro assay for base excision repair
(9,10). In this assay, substrate A549 human lung epithelial cells were treated
with Ro19-8022 (a gift from Hoffmann-La Roche, Basel, Switzerland) and white
light in order to induce 8-oxoguanine, embedded in agarose and lysed (as for
the standard comet assay). Extract prepared from a different batch of A549 cells
was applied to the gels (in parallel with controls with just buffer) and incubated
for 20 min at 37°C. 8-Oxoguanines were converted by 8-oxoguanine DNA
glycosylase 1 (OGG1) in the extract to DNA breaks which were measured by
alkaline electrophoresis as in the standard comet assay.
The investigators were instructed to score a set of 15 pairs of slides. Each
pair of slides consisted of two slides labelled A or B (with two gels on each
slide). The letters A and B referred to treatment with buffer or extract (but were
coded so that the investigator did not know which was which). The slides were
prepared on 15 separate days during a biomonitoring study (9): the pairs of
slides were therefore numbered 1–15.
On two occasions, 3 months apart, the investigators scored the 15 pairs of
slides (with two gels each). The most common practice is to score two gels of 100
nuclei per gel (corresponding to 200 nuclei per sample). In this part of the
study, the investigators scored 50 nuclei per gel (slides 1–5), 100 nuclei per gel
(slides 6–10) and 200 nuclei per gel (slides 11–15), giving a total of 7000
nuclei scored on each occasion.
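The total of 7000 nuclei per occasion follows from the design described above. A short check, assuming (as stated) two slides per pair and two gels per slide; the variable names are illustrative only:

```python
# Scoring design taken from the text: 15 pairs of slides, 2 slides per pair,
# 2 gels per slide, with the number of nuclei per gel depending on the pair.
SLIDES_PER_PAIR = 2
GELS_PER_SLIDE = 2

# (number of slide pairs, nuclei scored per gel)
design = [(5, 50), (5, 100), (5, 200)]


def nuclei_per_occasion(design):
    """Total number of nuclei one investigator scores on one occasion."""
    gels_per_pair = SLIDES_PER_PAIR * GELS_PER_SLIDE
    return sum(pairs * gels_per_pair * per_gel for pairs, per_gel in design)


total = nuclei_per_occasion(design)
print(total)  # 7000
```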
The appearances of the nuclei were scored according to a five-class scoring
system in arbitrary units from 0–400 (Figure 1). This type of scoring system
generally shows excellent correlation with image analysis systems (4).
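The text gives the class range (0 to 4) and the 0–400 scale but not the arithmetic. A minimal sketch, assuming the usual convention that the score is the sum of class values normalised to 100 nuclei (the function name and the example counts are hypothetical):

```python
def visual_score(class_counts):
    """Visual score in arbitrary units (0-400).

    class_counts holds five counts: the number of nuclei assigned to
    classes 0, 1, 2, 3 and 4. The class-weighted sum is normalised to
    100 nuclei, so a sample with every nucleus in class 4 scores 400.
    """
    if len(class_counts) != 5:
        raise ValueError("expected counts for classes 0-4")
    n = sum(class_counts)
    if n == 0:
        raise ValueError("no nuclei scored")
    weighted = sum(cls * count for cls, count in enumerate(class_counts))
    return 100.0 * weighted / n


# Hypothetical example: 200 nuclei scored in one sample.
example_counts = [120, 50, 20, 8, 2]
print(visual_score(example_counts))  # 61.0
```

Normalising to 100 nuclei keeps scores comparable when 100, 200 or 400 nuclei are scored per sample.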
Part 2 variation in experimental setup. The second part of the study was
designed to assess the variation in DNA damage when comparing three of the
protocols most commonly used in biomonitoring studies. Recent biomonitoring
studies typically use 20 or 40 min of alkaline treatment and 20 or 30 min of
electrophoresis, and the most common combinations in the biomonitoring
studies appear to be 20/20, 40/20 or 40/30 min of alkaline treatment and
electrophoresis (11). The alkaline treatment increases the level of cleavage of
alkaline-labile sites and the DNA migration distance is influenced by the
duration of the electrophoresis and alkaline unwinding (12,13). The levels of
SB and FPG sites were detected by the comet assay in MNBCs using the
following: (i) 20 min of alkaline unwinding and 20 min of electrophoresis, (ii)
40 min of alkaline unwinding and 20 min of electrophoresis and (iii) 40 min of
alkaline unwinding and 30 min of electrophoresis. The level of migration was
scored in 200 nuclei per sample. There were seven samples of MNBCs
included in the experiment because this was the maximal number of samples
that could be analysed in one batch of analysis. It was our hypothesis that the
reduction of variation related to different comet assay protocols could be
achieved by adjusting the raw data with the value obtained by either the
reference standard or the protocol-specific calibration curves. It has previously
been reported that correcting the raw data with a reference standard yielded less
inter-electrophoresis and inter-investigator variation (14). The reference
standard was a sample of A549 cells treated with Ro19-8022/white light. We
obtained protocol-specific calibration curves for these three experimental setups
by analysis of the level of SBs in A549 cells that were irradiated with ionizing
radiation (see below). The level of DNA damage in MNBCs is reported as raw
data (as arbitrary units), data corrected according to the reference standard
(reported as arbitrary units), adjusted using the protocol-specific calibration
curve (reported as lesions per 10⁶ bp) or adjusted according to both reference
standard cells and protocol-specific calibration curve (reported as lesions per
10⁶ bp).
Fig. 1. The five-class scoring system used by all the investigators in the study.
The figure shows representative digital images of nuclei with (A) score zero,
(B) score 1, (C) score 2, (D) score 3 and (E) score 4. The colour indicates the
intensity of light in the scale from red (highest intensity), yellow, green and
blue (lowest intensity).
Part 3 variation in DNA damage measured by different investigators. The third
part of the study focused on the variability in DNA damage obtained by eight
different investigators carrying out the comet assay on identical samples of
cryopreserved MNBCs from seven healthy individuals and on a reference
standard sample containing A549 cells treated with Ro19-8022/white light. The
lysis solution and enzyme buffer were identical for all investigators, whereas the
investigators prepared their own solutions for alkaline treatment and
electrophoresis on the day of the experiment. The standard protocol followed
by all investigators included 40 min of alkaline treatment and 20 min of
electrophoresis, and 200 nuclei were scored per sample. The investigators obtained
calibration curves by scoring slides of A549 cells irradiated with γ-rays. All
investigators scored the same set of slides for the calibration curve. The level of
DNA damage in MNBCs is reported as raw data (as arbitrary units), data
corrected according to the reference standard (reported as arbitrary units),
adjusted according to the investigator-specific calibration curve (reported as
lesions per 10⁶ bp) and adjusted according to both reference standard and
investigator-specific calibration curve (reported as lesions per 10⁶ bp).
Comet assay
The levels of SB and FPG sites were detected by the comet assay as described
previously (15,16). Briefly, MNBCs or A549 cells were embedded in 0.75%
low-melting point agarose (Sigma-Aldrich, Brøndby, Denmark) on GelBond®
films (Cambrex, Medinova Scientific A/S, Hellerup, Denmark) and lysed (1%
Triton X-100, 2.5 mM NaCl, 100 mM Na2EDTA and 10 mM Tris, pH = 10)
for a minimum of 1 h at 4°C. Afterwards, the GelBonds were washed 3 × 5
min in buffer (40 mM HEPES, 0.1 M KCl, 0.5 mM Na2EDTA and 200 µg/ml
bovine serum albumin, pH = 8). The levels of FPG sites and SBs were
detected by incubation of the agarose-embedded nuclei with 1 µg/ml of FPG or
buffer for 45 min at 37°C, respectively. The FPG enzyme was kindly provided
by Prof. Andrew Collins, University of Oslo, Norway. The nuclei were then
immersed in an alkaline solution (300 mM NaOH and 1 mM Na2EDTA,
pH > 13) for 20 or 40 min and the duration of the subsequent electrophoresis
was 20 or 30 min in the same solution at 0.83 V/cm and 300 mA. After
electrophoresis, the nuclei were washed 3 × 5 min in Tris buffer (0.4 M Tris–
HCl, pH = 7.5), rinsed with MilliQ® water and placed in 96% ethanol for
a minimum of 90 min and maximum of 12 h. Nuclei were scored by visual
inspection with an Olympus fluorescence microscope at 40× magnification
after staining with YOYO-1 in phosphate-buffered saline (PBS) solution
(Molecular Probes, Eugene, OR, USA). The net number of FPG sites was
obtained as the difference between slides treated with the FPG enzyme and with
buffer. Scores were translated into lesions per 10⁶ bp by means of the
investigator-specific calibration curves. The calculations were based on the fact
that the human genome contains 2.9 × 10⁹ nucleotides (17), corresponding to
6 × 10⁹ bp per diploid cell in G0. Assuming that the average molecular
weight of a DNA base pair is 650 Da, a diploid human cell in G0 phase contains
4 × 10¹² Da DNA.
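The unit arithmetic in this paragraph can be sketched as follows. The genome size and base-pair mass are taken from the text; the function names, the example scores and the treatment of the calibration factor as a simple multiplier (lesions per 10⁶ bp per arbitrary unit) are assumptions made for illustration, not the authors' stated procedure.

```python
BP_PER_DIPLOID_CELL = 6e9  # base pairs per diploid cell in G0 (from the text)
DA_PER_BP = 650.0          # assumed average mass of one base pair (from the text)


def diploid_genome_mass_da():
    """Mass of DNA in one diploid G0 cell, in daltons (about 4e12)."""
    return BP_PER_DIPLOID_CELL * DA_PER_BP


def lesions_per_cell(lesions_per_million_bp):
    """Convert lesions per 10^6 bp to lesions per diploid cell."""
    return lesions_per_million_bp * BP_PER_DIPLOID_CELL / 1e6


def net_fpg_score(score_fpg, score_buffer):
    """Net FPG-sensitive sites: FPG-treated score minus buffer-treated score."""
    return score_fpg - score_buffer


def score_to_lesions(score_au, lesions_per_au):
    """Translate a score in arbitrary units to lesions per 10^6 bp.

    lesions_per_au is a calibration factor supplied by the caller
    (hypothetical parameter name); it would come from a calibration curve.
    """
    return score_au * lesions_per_au


# Hypothetical scores for one sample, in arbitrary units.
net = net_fpg_score(score_fpg=40.0, score_buffer=28.0)
print(f"{diploid_genome_mass_da():.1e} Da per diploid cell")
print(round(score_to_lesions(net, lesions_per_au=0.025), 3), "FPG sites per 10^6 bp")
print(round(lesions_per_cell(0.23)), "SBs per cell at 0.23 SBs per 10^6 bp")
```

The product 6 × 10⁹ bp × 650 Da gives 3.9 × 10¹² Da, which the text rounds to 4 × 10¹² Da.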
Calibration curve
Human A549 epithelial cells were cultured in medium consisting of Ham’s F12,
10% foetal bovine serum (Invitrogen A/S, Tåstrup, Denmark) and 1% penicillin–
streptomycin solution (the stock solution from Invitrogen A/S contains 10 000 U/ml
of penicillin G and 10 000 µg/ml of streptomycin in 0.85% saline). The cells
were irradiated with γ-rays from a 137Cs source at 0, 2.5, 10 and 25 Gy with
a Gamma Cell 2000 (dose rate 3.77 Gy/min) in PBS. We have previously
reported that the yield of SBs increased linearly with ionizing radiation in the
0–10 Gy dose range using a comet assay protocol with 40 min of alkaline
unwinding and 20 min of electrophoresis (8). In this experiment, we included
a higher dose (25 Gy) because we expected that the yield of SBs would be
different in various assay protocols and investigators score slides differently. In
our laboratory, we usually observe a linear relationship between the dose of
strand-breaking agents (ionizing radiation or hydrogen peroxide) and the visual
score in the range of 0–300 arbitrary units, whereas the assay becomes
increasingly saturated in the range of 300–400 arbitrary units (given a total
range of 0–400 arbitrary units; results not shown). Inspection of the
investigator-specific calibration curves indicated that those investigators who
had the highest score in the 0–10 Gy dose range had lower score in the 25 Gy
samples than expected based on linear extrapolation from the 0–10 Gy dose
range (the level of DNA damage would then have been >300 arbitrary units in
the samples irradiated with 25 Gy). This indicates that these investigators’
scoring had reached the saturation range of the assay. In an attempt to provide
a similar dose range of the calibration curves for all the investigators, we
therefore only used the 0–10 Gy interval.
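The restriction to the 0–10 Gy interval can be illustrated with an ordinary least-squares fit. This is a sketch under stated assumptions: the scores below are invented for illustration (they are not the study's data), and the helper name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Doses from the text; scores are HYPOTHETICAL values in arbitrary units.
doses_gy = [0.0, 2.5, 10.0, 25.0]
scores_au = [20.0, 60.0, 180.0, 330.0]

MAX_LINEAR_DOSE_GY = 10.0  # only the 0-10 Gy interval is used for the fit
linear = [(d, s) for d, s in zip(doses_gy, scores_au) if d <= MAX_LINEAR_DOSE_GY]
slope, intercept = fit_line([d for d, _ in linear], [s for _, s in linear])

# Compare the 25 Gy observation with the linear extrapolation.
predicted_25 = intercept + slope * 25.0
saturated = scores_au[-1] < predicted_25
print(slope, intercept, predicted_25, saturated)
```

With these made-up numbers the extrapolated score at 25 Gy exceeds the 0–400 scale, while the observed score falls well below the extrapolation, which is the pattern the text describes as saturation.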
MNBC separation
MNBCs from seven healthy volunteers were collected and isolated in
Vacutainer® cell preparation tubes (Becton Dickinson A/S, Brøndby, Denmark)
according to the manufacturer’s instructions and frozen at −80°C in a mixture
containing 50% foetal bovine serum, 40% culture medium (RPMI 1640,
Invitrogen A/S, Tåstrup, Denmark) and 10% dimethylsulfoxide. The blood
sampling was part of a larger project that has been approved by the Danish
ethical committee (approval no. KF 01 283243).
Statistics
The data were analysed by parametric analysis of variance (ANOVA) tests
where homogeneity of the variance between groups (assessed by Levene’s test)
and normal distributions of residuals (assessed by Shapiro–Wilk W test) were
fulfilled for either raw or cube-root transformed data. Linear relationships were
assessed by regression analysis. In all tests, the level of significance was set to
5%. All data are reported as the mean with standard deviation (SD) or SEM or
as the geometric mean with 95% confidence interval (CI). Parametric tests were
performed using Statistica version 5.5 for Windows, StatSoft, Inc. (1997),
Tulsa, OK, USA.
In part 1 of the study, we could not a priori rule out that the 15 pairs of slides
would have different levels of DNA migration as experiments were performed
on 15 different occasions. We expected that the investigators would score the
samples differently. In addition, the score obtained from the gels treated with
enzyme extract was expected to be different from the gels treated with buffer.
The statistical analysis thus included variables as follows: set of slides
(n = 15), treatment (buffer or extract, n = 2) and investigator (n = 8). The
variance of these variables was tested in three different ANOVA tests stratified
into the set of slides scored as 50, 100 or 200 nuclei per gel (corresponding to
100, 200 or 400 nuclei per sample of cells). The three-factor main effect
ANOVA test did not include tests for interaction between factors. Differences
in the distribution in score between the first and second occasion of scoring
were analysed by χ²-test.
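A χ²-test on the class distribution from two scoring occasions can be sketched as below. The counts are hypothetical, and the function is a plain Pearson χ² for a 2 × k table rather than the exact routine run in Statistica. For five classes there are 4 degrees of freedom, for which the 5% critical value is about 9.49.

```python
CRITICAL_CHI2_DF4_5PCT = 9.488  # chi-square critical value, df = 4, alpha = 0.05


def chi_square_2xk(first, second):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    if len(first) != len(second):
        raise ValueError("both occasions need the same number of classes")
    total_first, total_second = sum(first), sum(second)
    if total_first == 0 or total_second == 0:
        raise ValueError("each occasion needs at least one scored nucleus")
    grand = total_first + total_second
    statistic = 0.0
    used_columns = 0
    for a, b in zip(first, second):
        column_total = a + b
        if column_total == 0:
            continue  # an empty class carries no information
        used_columns += 1
        expected_a = total_first * column_total / grand
        expected_b = total_second * column_total / grand
        statistic += (a - expected_a) ** 2 / expected_a
        statistic += (b - expected_b) ** 2 / expected_b
    return statistic, used_columns - 1


# HYPOTHETICAL counts of nuclei in classes 0-4 (7000 nuclei per occasion).
occasion_1 = [4200, 1600, 800, 300, 100]
occasion_2 = [4150, 1650, 790, 310, 100]

statistic, df = chi_square_2xk(occasion_1, occasion_2)
print(round(statistic, 2), df, statistic < CRITICAL_CHI2_DF4_5PCT)
```

A statistic below the critical value means the two distributions are not significantly different at the 5% level.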
In part 2, differences in the slopes of linear regression lines of the calibration
curves were tested by analysis of covariance (ANCOVA). The ANOVA
components related to the differences in the level of DNA damage in MNBCs
by the three protocols were tested by two-factor ANOVA without the test for
interactions. This analysis included the sample (n = 7) and protocol (n = 3) as
categorical variables.
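The variance components reported in this study are percentages of the total sum of squares explained by each factor. A minimal sketch for a balanced two-factor main-effects layout with one observation per cell (a simplification: the actual data had more observations, and the table below is hypothetical):

```python
def variance_components(table):
    """Percent of the total sum of squares explained by rows, columns, residual.

    table[i][j] is the observation for sample i under protocol j
    (balanced design, one observation per cell, no interaction term).
    """
    n_rows, n_cols = len(table), len(table[0])
    if any(len(row) != n_cols for row in table):
        raise ValueError("table must be rectangular")
    grand = sum(sum(row) for row in table) / (n_rows * n_cols)
    row_means = [sum(row) / n_cols for row in table]
    col_means = [sum(table[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    if ss_total == 0:
        raise ValueError("no variation in the data")
    ss_rows = n_cols * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n_rows * sum((m - grand) ** 2 for m in col_means)
    ss_res = ss_total - ss_rows - ss_cols
    return {
        "sample": 100.0 * ss_rows / ss_total,
        "protocol": 100.0 * ss_cols / ss_total,
        "residual": 100.0 * ss_res / ss_total,
    }


# HYPOTHETICAL scores (a.u.): rows are samples, columns are protocols.
scores = [
    [30.0, 45.0, 70.0],
    [28.0, 50.0, 66.0],
    [35.0, 47.0, 75.0],
]
components = variance_components(scores)
print({name: round(value, 1) for name, value in components.items()})
```

The three percentages sum to 100; the residual share is the unexplained fraction of the variance.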
In part 3, the assessment of DNA damage by different investigators yielded
rather scattered data and thus the variance attributed to investigators and
samples was analysed by two-factor non-parametric ANOVA without the test
for interactions. The analysis included the sample (n = 7) and investigator
(n = 8) as categorical variables. The data are reported as the geometric means
and 95% CI.
Results
Part 1 variation in visual scoring on different occasions with
varying nuclei per sample
Table I outlines the mean and SD of groups of slides where the
levels of DNA damage have been scored in 100, 200 or 400
nuclei per sample of cells. As can be seen, the distribution of
SD differs between the groups (P < 0.001, Levene’s test), but
the means are also different and this implies inhomogeneity of
variance between the samples of cells in the three groups.
Consequently, we stratified the statistical ANOVA in scoring
100, 200 or 400 nuclei per sample into these three groups. Each
analysis included the set of slides (n = 15), treatment (buffer
or extract, n = 2) and investigator (n = 8) as main effects. The
residual variations of these statistical tests are also outlined in
Table I. There was higher residual variation (SDres) in the set of
slides where the level of DNA damage had been determined
from scoring 100 nuclei per sample, whereas scoring 200 or
400 nuclei per sample was associated with the same magnitude
of residual variation. Although this means that scoring 200
nuclei per sample is sufficient to reduce the variance, it should
be recognized that it may not apply to samples with less DNA
damage because it can be argued that it is the number of nuclei
with migration that determines the statistical power. In this
experiment, the investigators scored 77 nuclei with DNA
migration (mean) out of the 200 per sample (class 1–4, range:
46–110 damaged nuclei).
The analysis also showed that the investigators scored the set
of slides differently and this was the variable that contributed
the most to the overall variance (Table I). There were
differences in the level of DNA damage between the slides
representing different days of analysis and the level of DNA
damage depended on the treatment (buffer/extract). Table I
outlines the contribution of these variables to the overall
variance of the analysis. The difference attributed to the set of
slides should be viewed as random variance (noise), which is
related to the day-to-day experimental variation, since the
nuclei in these slides originated from the same sample of cells.
The treatment only accounted for a fraction of the overall
variance, consistent with the fact that the score for gels treated
with extract was only 15% (95% CI: 10–21%) higher than
the score for gels treated with buffer.
Figure 2A depicts the mean level of DNA damage in the 15
samples of cells obtained by the eight investigators on the two
occasions. It is obvious that the investigators scored the set of
slides differently, but each investigator scored with remarkable
consistency on the two occasions. This indicates that variation
in DNA damage measured by the comet assay is not the
consequence of large variation between different gels that are
applied onto GelBond slides. It is our experience that gels
applied onto frosted glass microscope slides produce a higher
degree of variance between gels. Thus, it is possible that one
will not see the same consistency between two scoring
occasions if it is carried out with frosted glass slides. It should
be emphasized that no scoring bias was introduced on the
second occasion because the investigators did not know the
results of their previous scoring. The distribution of single
classes of images is depicted in Figure 2B. Here, it can be seen
that the investigators scored a similar number of nuclei in each
class on the two occasions. The distribution of nuclei in the five
classes was identical on the two occasions for all investigators
(χ² ranged from 0.05 to 5.6 among the eight investigators and the
critical value for statistical significance is χ²(0.05, 4) = 9.5).

Table I. Variance in DNA damage of MNBCs attributed to differences in set
of slides, investigator and treatment

Nuclei per     Mean (SD)     SDres (b)          Variance components (c)
treatment (a)                                   Set of slides   Investigator   Treatment
100            77.2 (36.3)   24.1 (P < 0.001)   17***           32***          6.1***
200            79.8 (35.9)   16.6 (P = 0.53)    34***           44***          0.7*
400            69.4 (26.6)   15.9 (P = 0.28)    24***           37***          2.7**

(a) The nuclei per treatment refers to the number of nuclei that were scored in two
gels treated with either buffer or cell extract.
(b) Distribution of the residuals for ANOVA tests of main effect of investigator
(n = 8), set of slides (n = 15) and treatment (n = 2). The P-values correspond
to the Shapiro–Wilk W test for normality of the residuals.
(c) The data are the percentage of the total sum of squares that can be explained by
differences in the set of slides originating from the same sample of cells,
investigators and treatment. The unexplained fraction of the variance
corresponds to the residual variance. The P-values correspond to a main effect
ANOVA with the set of slides, investigator and treatment as categorical
variables (*P < 0.05, **P < 0.01, ***P < 0.001).

[Figure 2. Panel A: 2nd occasion (a.u.) plotted against 1st occasion (a.u.). Panel B: 2nd occasion (% of total cells) plotted against 1st occasion (% of total cells).]
Fig. 2. Level of DNA damage in MNBCs scored by eight investigators
3 months apart. (A) Each point represents the mean (SEM) of DNA damage in
15 sets of slides. The score for each set of slides was calculated as the mean score
of nuclei in gels treated with cell extract or buffer. The overall regression
coefficient was r = 0.80 (n = 120). (B) The distribution (in percent of the
total number of cells scored) scored into the five categories as follows:
diamond (class 0), square (class 1), triangle (class 2), cross (class 3) and circle
(class 4). Each investigator scored 7000 nuclei on each of the two occasions.
The overall regression coefficient is r = 0.99.

Part 2 variation in experimental setup
Figure 3 shows the dose–response relationships of the
calibration curves. There was a clear dose–response relationship between the dose of ionizing radiation and DNA migration
for all the protocols. The slopes were different for the
regression lines of the calibration curves of A549 cells
analysed in protocols with differences in the duration of
alkaline treatment and electrophoresis (P < 0.001, ANCOVA).
The level of DNA damage in MNBCs from seven healthy
individuals was assessed with the same variations in the
experimental protocol as the calibration curve (i.e. 20/20, 40/20
or 40/30 min of alkaline unwinding and electrophoresis). In
Figure 4, the levels of SB and FPG sites in MNBCs are
depicted as both scores in arbitrary units (0–400 arbitrary units)
and lesions per 10⁶ bp, as well as with and without correction
with the reference standard and calibration curve. Table II
outlines the statistical analysis of raw data as well as the data
that have been corrected with the reference standard and/or the
calibration curve. The analysis included samples (n = 7) and
protocol (n = 3) as categorical variables. There was a significant effect of the protocol for the level of SBs (P < 0.05,
ANOVA). The effect related to different protocols could not be
eliminated by transformation of the data to SBs per 10⁶ bp
(P < 0.001, ANOVA). However, the differences in SBs
attributed to different protocols could be eliminated for both
the raw data and the data transformed to SBs per 10⁶ bp by
correcting the data with the reference standard (P > 0.05,
ANOVA). The statistical analysis of the raw data of FPG sites
showed a significant effect of the protocol (P < 0.001,
ANOVA), which could not be corrected for by using the
reference standard (P < 0.01, ANOVA). The effect of the
protocol was eliminated by transformation of the data to FPG
lesions per 10⁶ bp (P > 0.05, ANOVA).
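The text does not spell out the arithmetic of the reference-standard correction. One common approach, assumed here purely for illustration, is proportional scaling: each sample score is multiplied by the ratio of a nominal reference value to the reference value measured in the same batch. The function name, parameter names and numbers below are hypothetical.

```python
def correct_with_reference(sample_score, reference_in_batch, reference_nominal):
    """Scale a sample score by the batch's reference-standard response.

    ASSUMPTION: proportional scaling. reference_nominal could be, for
    example, the mean reference score over all batches or protocols.
    """
    if reference_in_batch <= 0:
        raise ValueError("reference score must be positive")
    return sample_score * reference_nominal / reference_in_batch


# HYPOTHETICAL scores (a.u.): the reference standard scored 120 in this
# batch against a nominal value of 100, so sample scores are scaled down.
corrected = correct_with_reference(
    sample_score=50.0, reference_in_batch=120.0, reference_nominal=100.0
)
print(round(corrected, 1))
```

Under this assumption a batch in which everything migrates further (for example, with longer electrophoresis) is pulled back towards the nominal scale, which is the intended effect of a reference standard.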
[Figure 3. Score (a.u.) plotted against irradiation dose (Gy).]
Fig. 3. Calibration curves obtained by a single investigator in three different
comet assay protocols of A549 cells irradiated with γ-rays. The points
represent calibration curves obtained in assay protocols with the duration of
alkaline unwinding/electrophoresis in minutes as follows: triangle (40/30),
square (40/20) and diamond (20/20).

Part 3 variation in DNA damage measured by different
investigators
Figure 5 outlines the calibration curves obtained by the eight
investigators who scored the same set of slides of A549 cells
irradiated with γ-rays. The mean level of DNA damage in
MNBCs obtained by the eight investigators is shown in Table
III. In this experiment, the total variance is composed of the
biological variation (differences in the level of DNA damage of
the samples), variation due to experimental differences (as
a consequence of differences in the analysis by different
investigators) and the residual variation. The latter is the
unexplained variance and it should be as low as possible. For
both SB and FPG sites, there were no statistically significant
contributions of the samples to the variance of the data,
whereas the variance attributed to the investigators was
statistically significant (P < 0.001, two-factor non-parametric
ANOVA). This means that most of the variation in the results
is due to random (assay) variation. Ideally, the correction of the
raw data with the reference standard or calibration curve should
reduce the variance attributed to the investigator and residual
variation, whereas the variance attributed to the sample might
increase if there is a real difference in the level of DNA damage
between healthy humans. However, transformation of the data
with the calibration curve or the reference standard did not alter
the variance related to differences in scoring between investigators and the variance attributed to differences in the
samples remained low. Figure 6 outlines the level of SB and
FPG sites transformed by the calibration curves.
The unaltered variation by use of investigator-specific
calibration curve indicates that it might be sufficient to use
a common calibration curve for all investigators. We compared
this approach with the investigator-specific calibration curve.
The common calibration curve was the mean of all calibration
curves (1 arbitrary unit = 0.025 lesions per 10⁶ bp). The mean
level of DNA damage in the seven MNBC samples was 0.43
(95% CI: 0.14–1.27) and 0.31 (95% CI: 0.11–0.86) FPG sites
per 10⁶ bp for the data obtained by a common and investigator-specific calibration curve, respectively. The distributions of
FPG sites were not different (P = 0.65, Levene’s test),
although the CI is slightly larger for the data obtained by the
common calibration curve than the investigator-specific calibration curve. The mean level of SBs per 10⁶ bp of the samples
was 0.27 (95% CI: 0.12–0.61) and 0.23 (95% CI: 0.09–0.56)
for the data calculated by common and investigator-specific
calibration curves, respectively. These distributions were not
statistically different (P = 0.17, Levene’s test) and there were
similar ranges of the CIs.

Discussion
During the last decade, the comet assay has become a popular
technique for the determination of genotoxicity in tissues or
MNBCs in biomonitoring studies. However, researchers keep
reporting DNA damage measured by the comet assay in
different units, which renders comparison difficult. An important
finding of this investigation is that investigators score slides
differently, but each investigator displays a remarkable consistency in scoring over time. Judged from results obtained in
the slide-scoring exercise (part 1) and analysis of DNA damage
in MNBCs (part 3), the major contributor to the overall
variability of DNA damage was the variance between
investigators. Moreover, comparing the analysis of different
protocols (part 2) with the analysis of FPG sites in MNBCs, it
can be seen that the fraction of the variance that was explained
by the biological variation (i.e. referred to as ‘sample’ in the
variance component analysis in Tables II and III) was relatively
low, whereas the variation explained by the protocol and
investigator was about the same (i.e. <20% of the overall
variation). These comparisons suggest that a large fraction of
the variation in DNA damage reported in different studies is
related to the fact that investigators score slides differently.
Intuitively, it should be possible to diminish the variation in
DNA damage observed by different investigators if the data are
corrected with the use of a calibration curve or reference
standard. We corrected the primary comet assay scores with
either a reference standard that was included in the same batch
as the MNBC samples or a calibration curve. The results were
somewhat disappointing in the sense that this correction did not
consistently lead to lower investigator-specific variation of
DNA damage in MNBCs. However, it is reassuring that the
variation in FPG sites obtained by the different assay protocols
(part 2) was reduced after correction of the primary end point
with the calibration curve. It should be emphasized that in this
experiment, the MNBC samples were analysed on two
different days (each day of analysis consisted of seven MNBC
samples and a reference standard). Thus, for each MNBC
sample, there were two data points (one for each day of
analysis). In comparison, each investigator in part 3 analysed
the seven MNBC samples in the same analysis, but the
investigators analysed their samples on different days.
Therefore, we cannot discriminate the day-to-day variation in
the assay from the investigator-specific variation in part 3 of
the study because that would have required that the
investigators had analysed the samples on two different days.
This means that the variance attributed to the investigators in
part 3 of the study contains both the variance related to the
differences in scoring between investigators and day-to-day
variation. Previously, we have observed that 65% of the assay
variation was attributed to the day-to-day variation, whereas
35% was intra-assay variation (18). Thus, the detection of
lower variance of the lesions by calibration may be missed in
this study because there is a strong effect of the day-to-day
[Figure 4, panels A–D. Y-axes: Score (a.u.) and Lesions per 10^6 bp. X-axis: assay protocol (20/20, 40/20 and 40/30; minutes of alkaline treatment/minutes of electrophoresis).]
Fig. 4. Variation in DNA damage of MNBCs analysed by different experimental protocols. Each symbol represents the measurement of DNA damage in one
MNBC sample analysed in the assay protocol with varying duration of alkaline treatment and electrophoresis. Open symbols represent SBs. Closed symbols
represent FPG sites. The graphs depict the DNA damage (A) in arbitrary units (au), as well as data corrected for (B) reference standard, (C) the calibration curve and
(D) both the reference standard and calibration curve.
variation. Further studies should thus focus on reducing the
effect of the day-to-day variation by having each investigator
analyse the same sample of cells on different days. In fact, we
typically assess duplicate samples by the comet assay on
different days of analysis. For example, in a normal biomonitoring study, we would have calculated the mean of the
two days of analysis, but in this study the measurements on
both days of analysis in part 2 were included in the statistical
analysis because we were interested in the variation of the
measurement.
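The reference-standard correction and the averaging of duplicate days can be sketched as follows. The formula is the one given in the footnotes to Tables II and III, corrected data = (raw data) × (mean of reference standards)/(batch-specific standard), and 173.3 au is the mean FPG reference standard from the Table II footnote. The daily raw scores and daily standards are hypothetical values chosen for illustration, and the function name is an assumption.

```python
# Mean of the FPG reference standards reported in the Table II footnote (au).
MEAN_REFERENCE_AU = 173.3


def correct_with_reference(raw_au, batch_standard_au,
                           mean_standard_au=MEAN_REFERENCE_AU):
    """corrected = raw * (mean of reference standards) / (batch standard)."""
    if batch_standard_au <= 0:
        raise ValueError("the reference standard must be positive")
    return raw_au * mean_standard_au / batch_standard_au


# Hypothetical duplicate measurements of one sample on two days:
# (raw score, reference standard scored in the same batch), both in au.
days = [(60.0, 200.0), (45.0, 150.0)]
corrected = [correct_with_reference(raw, std) for raw, std in days]

# In a biomonitoring setting the two days would be averaged.
mean_corrected = sum(corrected) / len(corrected)
print(corrected, mean_corrected)
```

In this constructed case the raw scores differ between days, but the reference standard shifts in the same proportion, so the corrected values coincide. Real data would not be this clean; the example only shows which part of the day-to-day variation the correction can remove.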
In recent years, there has been focus on the true level of
oxidized guanines in tissues and cells of humans. An
assessment of DNA damage obtained in a number of
biomonitoring studies concluded that reference values for
human diploid MNBCs were in the range of 0.16–0.18 SBs per
10^6 bp and 0.11–0.33 FPG sites per 10^6 bp (11). The greater
range observed for the FPG sites might be because the
measurement of these lesions is technically more demanding
than SBs and there might be batch variation of various
preparations of the FPG enzyme. Our estimate of the number of
FPG sites is in the upper end of this range, i.e. 0.31 FPG sites
per 10^6 bp (95% CI: 0.11–0.86 lesions per 10^6 bp) for the data
obtained by the investigator-specific calibration curves. This
estimate is slightly higher than that measured in lymphocytes
of 99 subjects in Sweden where the level of FPG sites was
reported to be 0.10 ± 0.033 FPG sites per 10^6 bp (mean ± SD) and the number of lesions was calculated on the basis of
the calibration curve from the ESCODD trial (19). Using
investigator-specific calibration curves, we previously reported
that MNBCs from healthy humans contained 0.22 ± 0.14 FPG
sites per 10^6 bp (8). Interestingly, Pitozzi et al. (20) obtained
a protocol-specific calibration curve for assay conditions of 20
Table II. Variance in DNA damage of MNBCs attributed to differences in analysis by different comet assay procedures and samples

                                   DNA damage (mean ± SD)                       Variance components [a]
Type                               Group 1        Group 2        Group 3        Sample   Protocol   Total variance
SBs (au)[b]                        2.7 ± 1.9      10.6 ± 3.5     6.3 ± 2.7      5.9      20.4*      2182
SBs (au, corrected)[b,c]           6.7 ± 5.4      8.8 ± 2.9      5.2 ± 3.3      14.7     4.1        2219
SBs per 10^6 bp[d]                 0.06 ± 0.04    0.20 ± 0.07    0.02 ± 0.01    4.0      34.0***    0.75
SBs per 10^6 bp (corrected)[c,d]   0.15 ± 0.13    0.17 ± 0.05    0.49 ± 0.52    13.7     9.2        11.34
FPG (au)[b]                        47.5 ± 12.9    66.6 ± 10.3    86.4 ± 10.1    5.2      33.4***    31 664
FPG (au, corrected)[b,c]           48.8 ± 13.1    73.2 ± 11.4    78.9 ± 10.7    6.7      24.3**     29 574
FPG per 10^6 bp[d]                 1.10 ± 0.30    1.25 ± 0.19    1.22 ± 0.14    7.6      2.2        8.44
FPG per 10^6 bp (corrected)[c,d]   1.13 ± 0.30    1.38 ± 0.21    1.11 ± 0.15    8.3      6.7        9.18

The groups correspond to alkaline treatment (minutes) and electrophoresis (minutes) as follows: 20/20 (group 1), 40/20 (group 2) and 40/30 (group 3).
[a] The data are analysed by two-factor ANOVA with protocol (n = 3) and sample (n = 7) as categorical variables and two independent determinations for each
treatment. There were statistically significant single-factor effects of the protocol as follows: *P < 0.05, **P < 0.01 and ***P < 0.001. The variance components
represent the percentage of the total sum of squares that can be explained by differences in the samples and protocols. The unexplained fraction of the variance
corresponds to the residual variance.
[b] The levels of SBs and FPG sites are reported in arbitrary units (au) in the 0–400 range.
[c] The raw data were transformed by the formula as follows: corrected data = [(raw data) × (mean of reference standards)]/(protocol-specific standard). The level of
SBs was corrected with the reference standard for the buffer-treated A549 cells exposed to Ro19-8022/white light (mean: 21.3 au), and the FPG sites were
corrected with the reference standard of A549 cells exposed to Ro19-8022/white light and treated with FPG enzyme (mean: 173.3 au).
[d] The data are converted into lesions per 10^6 bp by use of the protocol-specific calibration curve (cf. Figure 3).
min of alkaline treatment and 20 min electrophoresis, and it
was reported that the level of FPG sites was 0.6 ± 0.07 FPG
sites per 10^6 bp (mean ± SEM) in 70 healthy volunteers. In
fact, this estimate is well within the 0.13–1.85 lesions per 10^6
bp (corresponding to 0.3–4.2 modifications per 10^6 dG) that
researchers involved in the ESCODD trial argued was the
range of true values and where the lowest values represent the
FPG-based measurements by the comet assay, alkaline elution
or the alkaline unwinding assays (21). It is reassuring that
similar levels of lesions are obtained by several FPG-based
methods, but it should be recognized that all these assays are
calibrated on the basis of an equivalence between ionizing
radiation dose and DNA break frequency that was established
using alkaline sucrose sedimentation (22,23). This technique
has low sensitivity in the range of ionizing radiation where the
comet assay displays a linear dose–response relationship. This
means that estimations of the level of lesions in diploid cells
are based on the assumption that ionizing radiation induces
strand breakage with a linear dose–response relationship over
a wide range of doses. In addition, it should also be emphasized
that some cell types are radiosensitive and this affects the slope
of the calibration curve. It has, for example, been shown that
there was an almost 3-fold difference in the slopes of the
relationship between ionizing radiation and break frequency
in six different bladder cancer cell lines (24). In fact, the slopes
of the calibration curves that we have obtained in this study
were in the low range of reported estimates, and the slopes
are lower than those that we have obtained
in previous and later experiments (results not shown). We do
not have an explanation for the shallow dose–response curves,
but it does not appear to be related to the comet assay because
the levels of SBs and FPG sites in the MNBC samples were
within the range that we usually have observed in biomonitoring studies.
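The dependence of the lesion estimate on the slope of the calibration curve can be made explicit with a small sketch. It fits a straight line to dose–response data and converts a net score to lesions through the equivalent radiation dose. Everything numerical here is hypothetical: the calibration data are not those of Figure 5, and `LESIONS_PER_GY` is a placeholder set to 1.0 that stands in for the published equivalence between radiation dose and break frequency (22,23).

```python
def fit_line(doses_gy, scores_au):
    """Ordinary least-squares fit of score = intercept + slope * dose."""
    n = len(doses_gy)
    mean_x = sum(doses_gy) / n
    mean_y = sum(scores_au) / n
    sxx = sum((x - mean_x) ** 2 for x in doses_gy)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(doses_gy, scores_au))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def score_to_lesions(net_score_au, slope_au_per_gy, lesions_per_gy):
    """Express a net score as lesions, via the equivalent radiation dose."""
    return net_score_au / slope_au_per_gy * lesions_per_gy


# Hypothetical calibration data (not taken from Figure 5).
doses = [0, 5, 10, 15, 20, 25]
scores = [10, 50, 90, 130, 170, 210]
intercept, slope = fit_line(doses, scores)

# Placeholder conversion factor, not a measured value.
LESIONS_PER_GY = 1.0

steep = score_to_lesions(40, slope, LESIONS_PER_GY)
shallow = score_to_lesions(40, slope / 3, LESIONS_PER_GY)
print(slope, shallow / steep)
```

Under this linear model, the same net score yields a 3-fold higher lesion estimate when the calibration slope is 3-fold shallower, whatever the value of the conversion factor.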
Our analyses of the variability in the level of DNA damage
obtained by the comet assay have been based on only the five-class visual scoring system. The purpose of our investigation
was to display as much as possible of the comet assay
variation. This might provide a grim picture of the visual
classification system as a way of quantifying the DNA
[Figure 5. Y-axis: Score (a.u.), 0–250. X-axis: Irradiation dose (Gy), 0–25.]
Fig. 5. Calibration curves obtained by eight experienced investigators. Each
curve represents the score of one investigator. The investigators scored the
same set of slides for the calibration curves. The assay protocols for the
investigator-specific calibration curves were all analysed with similar
conditions (40 min of alkaline unwinding and 20 min of electrophoresis).
migration in the comet assay. However, we would like to emphasize
that the visual classification is still a useful measurement
because all the investigators were able to observe a dose–
response relationship in the calibration curve, detect a relatively
modest difference in slides treated with our cell extract (part 1),
and had very consistent scoring of the slides. The visual
classification system should be regarded as a reliable way of
measuring DNA migration in the comet assay because it is
a common practice that only one investigator analyses the
samples within one investigation. There is no general
consensus as to which of the primary comet assay end points
provides the best measurement of DNA damage. Earlier
guidelines did not provide recommendations for specific
Table III. Variance in DNA damage of blood cells attributed to differences in
analysis by different investigators and MNBC samples

                                                         Variance components [a]
Type of lesion                     Mean (95% CI)         Sample   Investigator
SBs (au)[b]                        10.7 (4.7–24.5)       0.8      20.7***
SBs (au, corrected)[b,c]           15.3 (7.2–32.6)       0.6      21.2***
SBs per 10^6 bp[d]                 0.23 (0.09–0.56)      0.6      19.0***
SBs per 10^6 bp (corrected)[c,d]   0.42 (0.19–0.95)      0.9      18.9***
FPG (au)[b]                        17.1 (5.4–51.0)       2.6      19.9***
FPG (au, corrected)[b,c]           16.7 (5.0–56.1)       2.3      20.8***
FPG per 10^6 bp[d]                 0.31 (0.11–0.86)      2.2      21.0***
FPG per 10^6 bp (corrected)[c,d]   0.50 (0.18–1.42)      2.1      20.5***

The data are analysed by two-factor non-parametric ANOVA with investigator
(n = 8) and sample (n = 7) as categorical variables. There were statistically
significant single-factor effects of the investigator (***P < 0.001).
[a] The data are the percentage of the total sum of squares that can be explained by
differences in the samples and investigators. The total variance is 45 486. The
unexplained fraction of the variance corresponds to the residual variance.
[b] The levels of SBs and FPG sites are reported in arbitrary units (au) in the 0–400
range.
[c] The raw data were transformed by the formula as follows: corrected data =
[(raw data) × (mean of reference standards)]/(investigator-specific standard).
The level of SBs was corrected with the reference standard for the buffer-treated
A549 cells exposed to Ro19-8022/white light (mean of all investigators: 23.75
au), and the FPG sites were corrected with the reference standard of A549 cells
exposed to Ro19-8022/white light and treated with FPG enzyme (mean of all
investigators: 171.13 au).
[d] The data are converted into lesions per 10^6 bp by use of the investigator-specific calibration curve (cf. Figure 5).
primary end points (25,26). However, the reports from more
recent workgroups recommend image analysis (27) and state
that image analysis is preferred but not required for assessment
of DNA damage by the comet assay (28). This suggests that in
recent years, there has been increased acceptance of image
analysis at the expense of the visual classification, which might
have been due to the increased availability of commercial and
free image analysis systems or because measurements of DNA
migration by image analysis systems have better compliance
with good laboratory practice. To the best of our knowledge,
there are no investigations that have focused on direct
comparison of the assay variance between visual classification
and image analysis. We believe that the overall conclusions
obtained in this report will have a general applicability to all
the primary comet assay end points, although image analysis of
nuclei might be associated with less variability between
investigators as compared with the visual classification of
nuclei. However, it should also be emphasized that whether
analysed with image analysis or by visual classification, we
believe that the most informative way of presenting comet
assay results is as lesions per unaltered nucleotides or diploid
cells.
Funding
Environmental Cancer Risk, Nutrition and Individual Susceptibility, a network of excellence operating within the European
Union 6th Framework Program, Priority 5: ‘Food Quality and
Safety’ (Contract No 513943).
Acknowledgements
Conflict of interest statement: None declared.
[Figure 6, panels A–D. Axes: (A) FPG sites per 10^6 bp and (C) SB per 10^6 bp, both on a logarithmic scale from 0.01 to 10.00; (B) FPG sites (a.u.), 0–400; (D) SB (a.u.), 0–100.]
Fig. 6. Variation in the level of DNA damage of seven MNBC samples scored
by eight investigators and converted to lesions per 10^6 bp by investigator-specific calibration curves. The investigators have been ranked according to the
increasing level of FPG sites in MNBCs (A) and the same ranking order is
used for the SBs in MNBCs (C) as well as the levels of DNA damage in the
A549 reference standard samples for FPG sites (B) and SBs (D). The closed
symbols represent the mean and 95% CIs of DNA damage in seven MNBCs
scored by eight experienced investigators. The open symbols represent mean
and 95% CIs of lesions in MNBCs based on the mean level of lesions of each
investigator (n = 8).
References
1. Collins, A. R. and Dusinska, M. (2002) Oxidation of cellular DNA
measured with the comet assay. Methods Mol. Biol., 186, 147–159.
2. Collins, A. R., Dusinska, M., Horvathova, E., Munro, E., Savio, M. and
Stetina, R. (2001) Inter-individual differences in repair of DNA base
oxidation, measured in vitro with the comet assay. Mutagenesis, 16, 297–301.
3. Langie, S. A., Knaapen, A. M., Brauers, K. J., van, B. D., van
Schooten, F. J. and Godschalk, R. W. (2006) Development and validation
of a modified comet assay to phenotypically assess nucleotide excision
repair. Mutagenesis, 21, 153–158.
4. Møller, P. (2006) The alkaline comet assay: towards validation in
biomonitoring of DNA damaging exposures. Basic Clin. Pharmacol.
Toxicol., 98, 336–345.
5. ESCODD (2003) Measurement of DNA oxidation in human cells by
chromatographic and enzymic methods. Free Radic. Biol. Med., 34,
1089–1099.
6. ESCODD, Gedik, C. M. and Collins, A. (2005) Establishing the background level of base oxidation in human lymphocyte DNA: results of an
interlaboratory validation study. FASEB J., 19, 82–84.
7. Garcia, O., Mandina, T., Lamadrid, A. I., Diaz, A., Remigio, A.,
Gonzalez, Y., Piloto, J., Gonzalez, J. E. and Alvarez, A. (2004) Sensitivity
and variability of visual scoring in the comet assay. Results of an interlaboratory scoring exercise with the use of silver staining. Mutat. Res., 556,
25–34.
8. Møller, P., Friis, G., Christensen, P. H. et al. (2004) Intra-laboratory comet
assay sample scoring exercise for determination of formamidopyrimidine
DNA glycosylase sites in human mononuclear blood cell DNA. Free
Radic. Res., 38, 1207–1214.
9. Vaclavik Bräuner, E., Forchhammer, L., Møller, P., Simonsen, J.,
Glasius, M., Wåhlin, P., Raaschou-Nielsen, O. and Loft, S. (2007)
Exposure to ultrafine particles from ambient air and oxidative stress-induced DNA damage. Environ. Health Perspect., 115, 1177–1182.
10. Collins, A. R. and Horvathova, E. (2001) Oxidative DNA damage,
antioxidants and DNA repair: applications of the comet assay. Biochem.
Soc. Trans., 29, 337–341.
11. Møller, P. (2006) Assessment of reference values for DNA damage
detected by the comet assay in human blood cell DNA. Mutat. Res., 612,
84–104.
12. Vijayalaxmi, Tice, R. R. and Strauss, G. H. (1992) Assessment of radiation-induced DNA damage in human blood lymphocytes using the single-cell
gel electrophoresis technique. Mutat. Res., 271, 243–252.
13. Speit, G., Trenz, K., Schutz, P., Rothfuss, A. and Merk, O. (1999) The
influence of temperature during alkaline treatment and electrophoresis on
results obtained with the comet assay. Toxicol. Lett., 110, 73–78.
14. De Boeck, M., Touil, N., De Visscher, G., Vande, P. A. and Kirsch-Volders, M. (2000) Validation and implementation of an internal standard
in comet assay analysis. Mutat. Res., 469, 181–197.
15. Risom, L., Møller, P., Vogel, U., Kristjansen, P. E. and Loft, S. (2003)
X-ray-induced oxidative stress: DNA damage and gene expression of
HO-1, ERCC1 and OGG1 in mouse lung. Free Radic. Res., 37, 957–966.
16. Vinzents, P. S., Møller, P., Sørensen, M., Knudsen, L. E., Hertel, O.,
Jensen, F. P., Schibye, B. and Loft, S. (2005) Personal exposure to ultrafine
particles and oxidative DNA damage. Environ. Health Perspect., 113,
1485–1490.
17. International Human Genome Sequencing Consortium (2004) Finishing the
euchromatic sequence of the human genome. Nature, 431, 931–945.
18. Møller, P., Loft, S., Alfthan, G. and Freese, R. (2004) Oxidative DNA
damage in circulating mononuclear blood cells after ingestion of
blackcurrant juice or anthocyanin-rich drink. Mutat. Res., 551, 119–126.
19. Hofer, T., Karlsson, H. L. and Möller, L. (2006) DNA oxidative damage
and strand breaks in young healthy individuals: a gender difference and the
role of life style factors. Free Radic. Res., 40, 707–714.
20. Pitozzi, V., Pallotta, S., Balzi, M., Bucciolini, M., Becciolini, A., Dolara, P.
and Giovannelli, L. (2006) Calibration of the comet assay for the
measurement of DNA damage in mammalian cells. Free Radic. Res., 40,
1149–1154.
21. Collins, A. R., Cadet, J., Möller, L., Poulsen, H. E. and Vina, J. (2004) Are
we sure we know how to measure 8-oxo-7,8-dihydroguanine in DNA from
human cells? Arch. Biochem. Biophys., 423, 57–65.
22. Ahnström, G. and Erixon, K. (1981) Measurement of strand breaks by
alkaline denaturation and hydroxyapatite chromatography. In Friedberg, E.
C. and Hanawalt, P. C. (eds), DNA Repair: A Laboratory Manual Research
Procedures. Marcel Dekker, New York, NY, pp. 403–418.
23. Kohn, K. W., Erickson, L. C., Ewig, R. A. and Friedman, C. A. (1976)
Fractionation of DNA from mammalian cells by alkaline elution.
Biochemistry, 15, 4629–4637.
24. Moneef, M. A., Sherwood, B. T., Bowman, K. J., Kockelbergh, R. C.,
Symonds, R. P., Steward, W. P., Mellon, J. K. and Jones, G. D. (2003)
Measurements using the alkaline comet assay predict bladder cancer cell
radiosensitivity. Br. J. Cancer., 89, 2271–2276.
25. Tice, R. R., Agurell, E., Anderson, D. et al. (2000) Single cell gel/comet
assay: guidelines for in vitro and in vivo genetic toxicology testing.
Environ. Mol. Mutagen., 35, 206–221.
26. Albertini, R. J., Anderson, D., Douglas, G. R. et al. (2000) IPCS guidelines
for the monitoring of genotoxic effects of carcinogens in humans.
International Programme on Chemical Safety. Mutat. Res., 463, 111–172.
27. Hartmann, A., Agurell, E., Beevers, C. et al. (2003) Recommendations for
conducting the in vivo alkaline Comet assay. 4th International Comet
Assay Workshop. Mutagenesis, 18, 45–51.
28. Burlinson, B., Tice, R. R., Speit, G. et al. (2007) Fourth International
Workgroup on Genotoxicity testing: results of the in vivo Comet assay
workgroup. Mutat. Res., 627, 31–35.
Received on September 10, 2007; revised on January 18, 2008;
accepted on January 18, 2008