Large DNA fragment sizing by flow cytometry

© 1996 Oxford University Press
4202–4209 Nucleic Acids Research, 1996, Vol. 24, No. 21
Large DNA fragment sizing by flow cytometry:
application to the characterization of P1 artificial
chromosome (PAC) clones
Zhengping Huang, Jeffrey T. Petty+, Brian O’Quinn+, Jonathan L. Longmire1,
Nancy C. Brown1, James H. Jett1 and Richard A. Keller*
Chemical Science and Technology Division and 1Life Sciences Division, Los Alamos National Laboratory,
Los Alamos, NM 87545, USA
Received July 9, 1996; Revised and Accepted September 13, 1996
ABSTRACT
A flow cytometry-based, ultrasensitive fluorescence
detection technique is used to size individual DNA
fragments up to 167 kb in length. Application of this
technology to the sizing of P1 artificial chromosomes
(PACs) in both linear and supercoiled forms is described. It is demonstrated that this method is well
suited to characterizing PAC/BAC clones and will be
very useful for the analysis of large insert libraries.
Fluorescence bursts are recorded as individual, dye
stained DNA fragments pass through a low power,
focused, continuous laser beam. The magnitudes of
the fluorescence bursts are linearly proportional to the
lengths of the DNA fragments. The histograms of the
burst sizes are generated in <3 min with <1 pg of DNA.
Results on linear fragments are consistent with those
obtained by pulsed-field gel electrophoresis. In comparison with pulsed-field gel electrophoresis, sizing of
large DNA fragments by this approach is more accurate, much faster, requires much less DNA, and is
independent of the DNA conformation.
INTRODUCTION
Sizing of DNA molecules plays a central role in all physical genetic analyses
and is typically accomplished using gel electrophoresis. Conventional acrylamide and agarose gels have been used to
separate DNA of lengths 10–2000 and 100–40 000 base pairs (bp)
respectively (1). For DNA fragments larger than 40 kilobase pairs
(kb), migration through the gel is independent of the fragment length
(2), so pulsed field gel electrophoresis (PFGE) is used (3). Although
pulsed field electrophoresis adequately separates large DNA
molecules, to achieve high resolution separation requires tens of
hours; gel-to-gel, day-to-day reproducibility and accuracy remain
problematic (2,4). Furthermore, PFGE cannot be used to size large
supercoiled or open circular DNA (5); samples must be linearized
before analysis. An optical contour maximization technique, based
on analyzing the optical images of single DNA fragments, is an
alternative approach (6). This methodology is relatively easy and
fast to implement, but the sample fragment selection is arbitrary and
many fragments need to be analyzed to achieve an accuracy
comparable with PFGE (7). The applicability of pulsed-field
technology to capillary electrophoresis shows considerable promise
for the rapid separation of DNA fragments up to 1.6 × 10⁶ bp (8,9).
We (10–12) and others (13,14) have developed a flow
cytometric (FCM) method to size DNA. DNA fragments are
stained with bisintercalating dye TOTO-1 [a thiazole orange
homodimer (15)], which binds to DNA in a stoichiometric
manner. Individual stained DNA fragments are passed through
the laser illuminated detection volume (∼10 pl), producing
fluorescence photon bursts. The burst size is proportional to the
number of dye molecules bound to DNA, and thus is proportional
to the fragment length. The primary advantages of our technique
are short data collection time (<3 min), high sensitivity (<1 pg of
DNA analyzed) and a linear response to both fragment size and
number of fragments. Our previous published work demonstrated
a sizing range of 1.5–48.5 kb (12).
Our goal is to size larger DNA fragments faster and more
accurately than can be done by pulsed field gel electrophoresis.
This could be applied in the construction of a high resolution
physical map for each of the human chromosomes. Completion
of the map requires the availability of comprehensive libraries of
DNA clones in appropriate vectors. Cosmids allow inserts of
30–45 kb (16); the bacteriophage P1 cloning system accepts
inserts in the 70–100 kb range (17); P1-derived artificial
chromosomes (PACs) allow inserts of 70–300 kb (18); bacterial
artificial chromosomes (BACs) also accept inserts up to 300 kb
(19); yeast artificial chromosomes (YACs) can propagate exogenous DNA in excess of 1 × 10⁶ bp (20). Due to drawbacks
associated with cosmids (comparatively small sizes, instability of
clones) and YACs (low transformation efficiency, excessive
presence of chimeric clones, difficulties in DNA manipulation)
(18), PAC/BAC systems are gaining increasing popularity (21).
In this work, we demonstrate the extension of our sizing range
up to 167 kb. A linear relationship between fluorescence burst
size and the length of DNA standards was observed. FCM was
successfully applied to size both linear and supercoiled PAC
clones. Our results for linear clone sizes are in agreement with
those obtained by PFGE. Clones from PAC libraries can be sized
directly and routinely.
*To whom correspondence should be addressed. Tel: +1 505 667 3018; Fax: +1 505 665 3024; Email: [email protected]
+Present address: Department of Chemistry, Furman University, Greenville, SC 29613, USA
MATERIALS AND METHODS
DNA
Bacteriophage λ DNA was obtained from GIBCO BRL Life
Technologies (Gaithersburg, MD); KpnI digests of λ DNA and
Low Range PFG Markers were obtained from New England
BioLabs (Beverly, MA). Coliphage T4 and T5 DNA were
obtained from Sigma Chemical Co. (St Louis, MO). Cl-7, Cl-10,
Cl-32 and Cl-56 are PAC clones obtained directly from a PAC
library constructed at Los Alamos.
PAC clones Cl-26 and Cl-29 were isolated from Escherichia coli
host cells using a rapid alkaline lysis miniprep method (18). This is
a modification of a standard Qiagen-Tip method which uses no
organic extractions or columns. Escherichia coli cells contained
clones that were transformed into the cells by electroporation (22).
Two ml cultures of cells were grown for 18 h at 37°C in Luria Broth
supplemented with 30 µg/ml kanamycin. For the final step, PAC
DNA was ethanol precipitated and resuspended in 40 µl of TE buffer
(10 mM Tris–HCl, 1 mM EDTA, pH 8). The procedure resulted in
a recovery of 0.5–2 µg of DNA. Extra care was used to minimize
the amount of contamination from E. coli genomic DNA, cellular
protein and RNA, and to minimize the loss and shearing of clone
DNA. After the cells were lysed, all pipetting was done with
wide-bore or cut pipet tips.
Restriction digestion
EagI digestion is required to linearize the PAC DNA before
PFGE can be used to size clones. A volume of 11.5 µl of TE
buffer, 2 µl of NEB 3 buffer (50 mM Tris–HCl, 10 mM MgCl2,
100 mM NaCl, 1 mM DTT, pH 7.9), 5 µl of DNA sample and
1.5 µl (15 U) of EagI (New England BioLabs, Beverly, MA) were
mixed gently and incubated for 3 h at 37°C. The mixture was then
heated at 65°C for 15 min to inactivate the enzyme.
Sample preparation
The stock DNA solutions were stored at 4°C. TOTO-1 (Molecular Probes, Eugene, OR) was stored as a 1 mM solution in DMSO
at –20°C. A 1 × 10⁻⁵ M solution of TOTO-1 was prepared by
diluting 1 µl of the stock in 99 µl of TE. To size intact supercoiled
PAC clones, 1 µl of the DNA clone (∼20–50 ng/µl), 0.7 µl of
10 ng/µl λ, 0.4 µl of 10 ng/µl λ KpnI digest, 2 µl of 25 ng/µl T4
DNA, and 3.4 µl of 1 × 10⁻⁵ M TOTO-1 solution were added to
150 µl of TE. This gives concentrations of 2 × 10⁻⁷ M TOTO-1
and 0.5–0.7 ng/µl DNA, corresponding to a base pair/dye
molecule ratio between 3.6:1 and 5:1. Only λ and T4 were added
as markers when sizing the EagI digest of PAC clones. The
mixture solutions were incubated for at least 60 min at room
temperature in the dark and diluted 150-fold in TE to give a final
total fragment concentration of ∼5 × 10⁻¹⁴ M. The final clone
concentration was ∼1.5 ng/ml. The DNA/dye complex was very
stable; narrow burst distributions were obtained for samples
sitting at room temperature (∼21°C) for up to several days.
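The staining stoichiometry quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original protocol; it assumes an average mass of 660 g/mol per base pair (a common rule of thumb that is not stated in the text), and the function name `staining_mix` is hypothetical.

```python
# Rough check of the staining recipe described in the text.
# ASSUMPTION (not from the paper): one DNA base pair weighs ~660 g/mol.
BP_MASS_G_PER_MOL = 660.0


def staining_mix(clone_ng_per_ul):
    """Return (dye molarity, DNA ng/ul, bp:dye ratio) for the staining mix."""
    # Volumes in ul, as listed in the recipe: clone, lambda, lambda KpnI
    # digest, T4, TOTO-1 working solution, TE.
    volume_ul = 1.0 + 0.7 + 0.4 + 2.0 + 3.4 + 150.0
    # DNA mass in ng contributed by each component.
    dna_ng = 1.0 * clone_ng_per_ul + 0.7 * 10.0 + 0.4 * 10.0 + 2.0 * 25.0
    dye_mol = 3.4e-6 * 1e-5  # 3.4 ul (in litres) of 1e-5 M TOTO-1
    volume_l = volume_ul * 1e-6
    dye_molar = dye_mol / volume_l
    bp_molar = (dna_ng * 1e-9 / BP_MASS_G_PER_MOL) / volume_l
    return dye_molar, dna_ng / volume_ul, bp_molar / dye_molar


for clone in (20.0, 50.0):  # the quoted range of clone stock concentrations
    dye, dna, ratio = staining_mix(clone)
    print(f"clone {clone:.0f} ng/ul: dye {dye:.2e} M, "
          f"DNA {dna:.2f} ng/ul, bp:dye {ratio:.1f}:1")
```

Under these assumptions the two ends of the clone concentration range give roughly 0.5 and 0.7 ng/µl DNA and base pair/dye ratios near 3.6:1 and 5:1, in line with the values quoted in the text.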
A challenge of large DNA analysis (>50 kb) is to avoid significant
DNA breakage during sample preparation and handling. Sample
manipulations must be done slowly and carefully. Since some
amount of shearing is unavoidable, the sample composition (relative
amount of different DNA markers, clones) is important to achieve
good results. If too much T4 or clone DNA is added, the bursts from
sheared fragments mask the smaller marker peaks; if too little T4
DNA or clone is added, their peak heights would be small with the
same data collection time and there would be more uncertainties in
locating the peaks. DNA shearing in the flow system is relatively
insignificant: flowing samples 3-fold faster did not decrease the
relative peak areas of larger fragments in the data.
Pulsed field gel electrophoresis (PFGE)
PFGE was carried out on a CHEF-DR II (Bio-Rad Laboratories,
Richmond, CA). Conditions were the following: 1% GTG
Seaplaque agarose (FMC BioProducts, Rockland, ME), 0.5×
TBE (45 mM Tris-borate, 1 mM EDTA, pH 8) running buffer;
16°C buffer temperature; switch times ramped from 2.0 to 30 s;
200 V; 15 h running time. The current was typically between 0.17
and 0.22 A. Gels were stained with ethidium bromide and
visualized under UV light.
Ultrasensitive flow cytometry (FCM)
The experimental apparatus is similar to that described previously
(10,12). The excitation light, provided by a cw Ar⁺/Kr⁺ laser
operated at 514.5 nm, was coupled into a single mode optical fiber.
The output from the fiber was collimated and focused onto a sheath
flow cuvette using a spherical lens with a 25 cm focal length. The
1/e² beam diameter at the center of the flow cuvette was measured
to be ∼46 µm by translating a razor blade through the beam. The
excitation laser power in the flow cell was measured to be
20–30 mW. The fused silica square bore cuvette has a (250 µm)²
internal dimension and ∼100 µm detection side thickness. Fluorescence was collected at 90° to both the optical and flow axes with
a 40×, NA 0.85 microscope objective. The collected light was
focused onto a 1.2 (horizontal) × 3.0 (vertical) mm slit (spatial filter)
located at the image plane of the objective, passed through a 550 ±
15 nm interference filter, and focused onto the photocathode of a
thermoelectrically cooled (–30°C) PMT (RCA, Model 31034A,
Sommerville, NJ). The detection volume of ∼10 pl was defined by
the image of the spatial filter in the flow cell, the focused laser beam
and the sample flow stream. Photoelectron pulses were amplified,
discriminated, and counted on a multichannel scaler (MCS). The
MCS summed the number of pulses in 40.96 µs bins. Two hundred
scans of data with 16384 bins per scan were collected and transferred
to a Macintosh computer and analyzed by a program written in the
LabVIEW language (National Instruments, Austin, TX). The total
data collection time was 134 s.
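The quoted acquisition time follows directly from the scaler settings. As a quick arithmetic check (variable names are illustrative only):

```python
# Acquisition time implied by the multichannel scaler settings in the text.
BIN_WIDTH_S = 40.96e-6   # MCS bin width (40.96 us)
BINS_PER_SCAN = 16384    # bins recorded per scan
SCANS = 200              # number of scans collected

scan_time_s = BIN_WIDTH_S * BINS_PER_SCAN
total_time_s = scan_time_s * SCANS
print(f"one scan: {scan_time_s:.3f} s, full run: {total_time_s:.1f} s")
```

Each scan spans about 0.67 s, so 200 scans give the 134 s stated above.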
Sample solution and sheath fluid were introduced into the flow cell
by gravity feed. The sample and sheath flow rates were controlled
by height differences between the sample tube, the sheath bottle
and the drain container. Ultrapure water (Millipore, Bedford,
MA) was used as the sheath fluid. Transit times of the stained
fragments through the sample volume were controlled by the
sheath flow rates (∼30 µl/min, corresponds to a linear velocity of
2–4 cm/s at the center of the cuvette and a corresponding transit
time through the laser beam of 1–2 ms). The sample flowed at a
rate of ∼0.2 µl/min, or ∼40 fragments/s. The probability of two
fragments being in the sample volume at the same time is given
by Poisson statistics as p(2) = 1 – exp(–τ × r), where τ is the transit
time and r is the count rate (23). Under our conditions, the
probability of double occupancy is ∼5%. The point at which
double occupancy becomes a problem depends on the application
and can be calculated from the above formula.
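The double-occupancy estimate can be reproduced from the formula given in the text. The sketch below evaluates p(2) = 1 – exp(–τ × r) for the quoted transit times and count rate; the helper `max_count_rate`, which inverts the same formula for a tolerated coincidence probability, is an illustrative addition and not something described in the paper.

```python
import math


def coincidence_probability(transit_time_s, count_rate_hz):
    """p(2) = 1 - exp(-tau * r), the formula quoted in the text."""
    return 1.0 - math.exp(-transit_time_s * count_rate_hz)


def max_count_rate(transit_time_s, tolerated_probability):
    """Invert the same formula: largest rate r for a tolerated p(2)."""
    return -math.log(1.0 - tolerated_probability) / transit_time_s


RATE_HZ = 40.0  # ~40 fragments/s, as stated in the text
for tau_ms in (1.0, 1.5, 2.0):
    p = coincidence_probability(tau_ms * 1e-3, RATE_HZ)
    print(f"transit {tau_ms:.1f} ms: p(2) = {100 * p:.1f}%")
```

For transit times of 1–2 ms at 40 fragments/s this gives roughly 4–8%, consistent with the ∼5% figure quoted above.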
An important factor to achieve good resolution in FCM is to
maintain a narrow and stable sample stream. To achieve this, the
flow system has to be free of contaminants and air bubbles.
Periodically, the flow system was disassembled and parts were
sonicated in 2% RBS (Pierce, Rockford, IL) methanol/H2O (1:1)
solution and rinsed with ultrapure water. Before running samples,
air bubbles were removed by flowing methanol, followed by
degassed methanol/H2O (1:1) through the flow system. Microspheres (Yellow-green Fluoresbrite, diameter 0.997 ± 0.026 µm,
Cat# 18860, Polysciences Inc., Warrington, PA) were used to
align the optics to maximize the signal as monitored by an
oscilloscope.
Data analysis
Transit times of 1–2 ms were determined by autocorrelation of the
first scan of raw data (12). The background was determined by
averaging the data below a level set near the maximum of the
background noise. The whole data set was scanned for bursts. A
burst was recorded when a series of points exceeded a threshold
set above the average background. A typical background rate for
the data discussed below was 5 photoelectrons per MCS bin
(40.96 µs), and a typical threshold was chosen to be 6
photoelectrons per bin. The criterion for choosing the proper
threshold has been discussed in detail previously (12). For large DNA fragments,
the value chosen for the threshold is not critical. The areas of the
bursts were integrated and histogrammed to give a burst size
distribution using Sigmaplot software (Jandel Scientific, San
Rafael, CA). Histograms were fit to a sum of Gaussians plus an
exponentially decaying background (Figs 1–4 and Fig. 6). Burst
size means (centroids of histogram peaks) of the DNA standards
determined by the fit were plotted versus the fragment lengths and
were fit by linear regression. The unknown DNA sizes were
determined by their burst size means and the linear regression
function.
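The burst search described above can be sketched as follows. This is a simplified illustration and not the authors' LabVIEW program: it assumes that burst areas are background-subtracted sums over consecutive bins above threshold, and the `min_bins` parameter and function names are hypothetical.

```python
def find_bursts(counts, background, threshold, min_bins=1):
    """Integrate runs of consecutive bins whose counts exceed `threshold`.

    ASSUMPTION: the burst area is the background-subtracted sum of counts;
    the text does not spell out this detail.
    """
    bursts = []
    area = 0.0
    n_bins = 0
    # A trailing sentinel equal to the threshold closes a burst at the end.
    for c in list(counts) + [threshold]:
        if c > threshold:
            area += c - background
            n_bins += 1
        else:
            if n_bins > 0 and n_bins >= min_bins:
                bursts.append(area)
            area = 0.0
            n_bins = 0
    return bursts


def histogram(burst_sizes, bin_width=10.0):
    """Count bursts per bin; keys are the lower edges of the bins (pe)."""
    hist = {}
    for size in burst_sizes:
        edge = int(size // bin_width) * bin_width
        hist[edge] = hist.get(edge, 0) + 1
    return hist


# Tiny synthetic trace (illustrative numbers only, not measured data),
# using the typical background (5 pe/bin) and threshold (6 pe/bin).
trace = [5, 5, 9, 12, 8, 5, 5, 7, 20, 5]
sizes = find_bursts(trace, background=5, threshold=6)
print(sizes, histogram(sizes))
```

In practice the resulting histogram would then be fit to a sum of Gaussians plus an exponential background, as described in the text.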
Optical saturation measurements
Detector saturation represents one potential problem when
measuring the photon bursts of large DNA fragments. To
characterize the saturation limit of our PMT, histograms of burst
sizes were obtained from yellow-green Fluoresbrite beads with
different neutral density filters (OD = 0–2.0) placed in the
detection path to attenuate the fluorescence. The log of burst size
means (pe/ms) were plotted versus OD (a linear plot is expected
if the detector is not saturating). We found that our detector did
not saturate when the burst size mean was below 4000 pe/ms. The
largest DNA we sized (T4 DNA) has a burst size mean of ∼2700
pe/ms, well below the saturation point.
RESULTS
Sizing of PAC clone Cl-29 by FCM
Figure 1a shows the histogram of a sample containing a mixture
of TOTO-1 stained λ KpnI digest (1.5, 17.1 and 29.9 kb), λ DNA
(48.5 kb), T4 DNA (167 kb) and supercoiled PAC clone Cl-29.
λ KpnI digest, λ DNA and T4 DNA were used as size standards.
Figure 1. (a) Histogram of the fluorescence burst sizes of intact supercoiled
clone Cl-29. λ KpnI digest, λ DNA and T4 DNA were used as size standards.
The bin width was 10 pe. The histogram was fit to a sum of 5 Gaussians plus
an exponentially decaying background. Experimental conditions: transit time
1.6 ms, laser power 30 mW. (b) Plot of burst size means versus fragment
lengths. The means of DNA standard peaks (17.1, 29.9, 48.5, 167 kb) were fit
by linear regression. The correlation coefficient of the linear fit is 0.99997. The
slope is 27.0 ± 0.2 pe/kb and the intercept is –28 ± 14 pe. The Cl-29 size of
88.9 ± 0.8 kb was calculated using the linear regression function and the burst
size mean as listed in Table 1.
The histogram is the result of analyzing 134 s of data obtained
from ∼5000 DNA fragments (this is <1 pg of DNA). Five
fragments were resolved while, in this data, the signature of the
1.5 kb fragment was masked by the background due to residual
scattered light, fluorescent impurities, and debris due to shearing
of the large fragments. The histogram was fit to a sum of 5
Gaussians [Ai exp{–[(x – xi)/σi]²/2}, where Ai is the amplitude
of a given peak, xi is the burst size mean, σi is the standard
deviation, x is the burst size, and i = 1–5] and a decaying
exponential [α exp(–βx), where α and β are constants] with Ai,
xi, σi, α and β as fitting parameters. The results are summarized
in Table 1 and the resulting curve is shown in Figure 1a. The burst
size means from the Gaussian fits were plotted versus the lengths
of the DNA standards (17.1, 29.9, 48.5, 167 kb) and were fit by
a linear regression (Fig. 1b). The linear correlation coefficient is
0.99997; the slope of the resulting line is 27.0 ± 0.2 pe/kb and the
intercept is –28 ± 14 pe. The uncertainties are the standard
deviations of the resulting parameters as reported by Sigmaplot.
Deviations of the measured values from the fitted line are also
listed in Table 1. The average absolute deviation is 1.7%. The
unknown size of Cl-29 was calculated using the linear regression
function and the burst size mean of Cl-29 from the Gaussian fit
as listed in Table 1. The obtained Cl-29 size is 88.9 ± 0.8 kb. Here,
the standard deviation of the clone size was calculated using error
propagation theory.
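The calibration step can be reproduced from the burst size means reported in Table 1. The sketch below is a plain least-squares fit written without external libraries; it is an illustration of the calculation rather than the Sigmaplot procedure used by the authors, and it does not attempt to reproduce the quoted parameter uncertainties.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Size standards (kb) and their burst size means (pe) from Table 1, Figure 1.
standards_kb = [17.1, 29.9, 48.5, 167.0]
burst_means_pe = [453.4, 771.7, 1266.7, 4486.9]
clone_mean_pe = 2372.3  # burst size mean of supercoiled Cl-29

slope, intercept = linear_fit(standards_kb, burst_means_pe)
clone_kb = (clone_mean_pe - intercept) / slope

# Deviation as defined in the Table 1 footnote: (fitted - measured) / fitted.
deviations_pct = [
    100.0 * ((slope * x + intercept) - y) / (slope * x + intercept)
    for x, y in zip(standards_kb, burst_means_pe)
]
mean_abs_dev = sum(abs(d) for d in deviations_pct) / len(deviations_pct)

print(f"slope {slope:.1f} pe/kb, intercept {intercept:.0f} pe")
print(f"Cl-29 size {clone_kb:.1f} kb, mean |deviation| {mean_abs_dev:.1f}%")
```

With these inputs the fit returns a slope of about 27.0 pe/kb, an intercept of about –28 pe, a Cl-29 size of about 88.9 kb and an average absolute deviation of about 1.7%, matching the values reported above.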
Table 1. Parameters obtained from the histograms of sizing PAC clone Cl-29 (Figs 1 and 2)
Fragment      Amplitude      Mean             Standard          CV (%)    Shot noise (%)   Deviation from
length (kb)   (Ai)           (xi)             deviation (σi)    (σi/xi)   (√xi/xi)         linear fit (%)a

Figure 1: Sizing intact Cl-29
17.1          35.3 ± 1.3b    453.4 ± 1.3b     29.8 ± 1.3b       6.6       4.7              –4.5
29.9          26.7 ± 1.2     771.7 ± 1.9      38.1 ± 1.9        4.9       3.6              1.0
48.5          30.2 ± 1.0     1266.7 ± 2.0     54.5 ± 2.0        4.3       2.8              1.2
88.9          23.3 ± 0.7     2372.3 ± 3.7     108.2 ± 3.8       4.5       2.1
167           28.3 ± 0.6     4486.9 ± 3.2     122.1 ± 3.3       2.7       1.5              –0.1

Figure 2: Sizing Cl-29 insert and vector
16.2          14.6 ± 0.8     358.4 ± 2.1      32.8 ± 2.2        9.1       5.3
48.5          17.1 ± 0.7     1066.8 ± 2.3     50.3 ± 2.3        4.7       3.1
73.7          9.7 ± 0.6      1617.9 ± 4.5     63.5 ± 4.6        3.9       2.5
167           13.3 ± 0.5     3663.2 ± 4.3     109.2 ± 4.4       3.0       1.6

aThe deviation of the data from the linear regression fit is defined as: (fitted burst size – measured burst size)/fitted burst size.
bThe uncertainties are the standard deviations of the resulting parameters as reported by Sigmaplot.
Table 2. Parameters obtained from the histograms of sizing PAC clone Cl-26 (Figs 3, 4)
Fragment      Amplitude       Mean              Standard
length (kb)   (Ai)            (xi)              deviation (σi)

Figure 3: Sizing intact Cl-26
17.1          76.1 ± 2.2      328.3 ± 0.8       23.4 ± 0.8
29.9          38.8 ± 1.8      540.3 ± 1.8       34.4 ± 1.9
48.5          41.7 ± 1.5      871.5 ± 2.1       52.4 ± 2.2
119.1         18.0 ± 1.1      2129.4 ± 6.7      99.3 ± 7.1
167           11.8 ± 0.9      2981.8 ± 12.4     145.1 ± 13.0

Figure 4: Sizing Cl-26 insert and vector
15.5          35.4 ± 2.1      236.6 ± 1.9       28.9 ± 2.1
48.5          103.4 ± 1.6     764.6 ± 0.9       50.0 ± 0.9
105.3         15.3 ± 1.1      1673.6 ± 8.2      98.0 ± 8.7
167           17.8 ± 0.9      2658.4 ± 8.7      151.7 ± 9.4
Supercoiled PAC clone Cl-29 was digested with EagI. This
enzyme cuts at the vector and insert junction sites (18). The released
linear restriction fragments were sized by FCM with λ DNA and T4
DNA as size standards and the results are shown in Figure 2a. The
histogram was fit to a sum of 4 Gaussians plus a decaying
exponential. The resulting parameters Ai, xi and σi (i = 1–4) are
summarized in Table 1 and the resulting curve is shown in Figure 2a.
The burst size means from the Gaussian fits were plotted versus
DNA fragment lengths, and a line was drawn through the points of
DNA standards of 48.5 and 167 kb (Fig. 2b). The slope of the
resulting line is 21.9 ± 0.1 pe/kb and the intercept is 4 ± 4 pe. The
unknown insert and vector sizes were calculated using the function
of the resulting line and the burst size means from the Gaussian fits
as listed in Table 1. The resulting sizes are 73.7 ± 0.4 kb for the Cl-29 insert
and 16.2 ± 0.2 kb for the vector. The sum of the Cl-29 insert and its
vector, 89.9 ± 0.5 kb, agrees within experimental error with the size
obtained for the intact supercoiled Cl-29 (88.9 ± 0.8 kb, Fig. 1). The
size of the PAC vector (pCYPAC2) was treated as unknown here.
A closely related PAC vector (pCYPAC1) has a reported size of
∼17 kb (18).
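The two-point calibration and the comparison with the intact clone can be sketched from the Figure 2 values in Table 1. Combining the two uncertainties in quadrature for the comparison is an assumption of this sketch (it treats the two measurements as independent); the paper does not state how the agreement was judged.

```python
import math


def two_point_line(p1, p2):
    """Line through two (size_kb, burst_mean_pe) calibration points."""
    (x1, y1), (x2, y2) = p1, p2
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


# Standards from Table 1 (Figure 2): lambda DNA and T4 DNA.
slope, intercept = two_point_line((48.5, 1066.8), (167.0, 3663.2))

insert_kb = (1617.9 - intercept) / slope  # burst mean of the insert peak
vector_kb = (358.4 - intercept) / slope   # burst mean of the vector peak
print(f"slope {slope:.1f} pe/kb, intercept {intercept:.0f} pe")
print(f"insert {insert_kb:.1f} kb, vector {vector_kb:.1f} kb")

# Compare the reported sum (89.9 +/- 0.5 kb) with the intact supercoiled
# clone (88.9 +/- 0.8 kb). ASSUMPTION: independent errors, added in quadrature.
difference_kb = 89.9 - 88.9
combined_err_kb = math.hypot(0.5, 0.8)
print(f"difference {difference_kb:.1f} kb vs combined uncertainty "
      f"{combined_err_kb:.2f} kb")
```

The difference of about 1.0 kb is comparable to the combined uncertainty of about 0.9 kb, which is the sense in which the two results agree within experimental error.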
Sizing of PAC clone Cl-26 by FCM
The same approach was used to determine the sizes of supercoiled
PAC clone Cl-26 and its linear restriction fragments generated by EagI.
Figure 3a shows the histogram of a sample of TOTO-1 stained
supercoiled Cl-26 with λ KpnI digest, λ DNA and T4 DNA as size
standards. Figure 3b shows the plot of the burst size means from
the Gaussian fits versus the DNA fragment lengths. The Gaussian
fit parameters Ai, xi and σi (i = 1–5) are summarized in Table 2
and the fitting curve is shown in Figure 3a. The linear correlation
coefficient of Figure 3b is 0.99998; the slope of the linear
regression line is 17.8 ± 0.1 pe/kb and the intercept is 15 ± 7 pe.
The obtained supercoiled Cl-26 size is 119.1 ± 0.9 kb.
Figure 4a shows the histogram of a sample of TOTO-1 stained
EagI restriction fragments of Cl-26 with λ DNA and T4 DNA as
size standards. Figure 4b shows the plot of the burst size means
from the Gaussian fits versus the DNA fragment lengths. The
Gaussian fit parameters Ai, xi and σi (i = 1–4) are summarized in
Table 2 and the fitting curve is shown in Figure 4a. The slope of
the line drawn through the points of size standards in Figure 4b
is 16.0 ± 0.1 pe/kb and the intercept is –10 ± 4 pe. The resulting
sizes are 105.3 ± 0.9 kb for the Cl-26 insert and 15.5 ± 0.3 kb for the vector.
The sum of the Cl-26 insert and its vector, 120.8 ± 1.0 kb, agrees
within experimental error with the size obtained for the intact
supercoiled Cl-26 (119.1 ± 0.9 kb, Fig. 3).
Figure 2. (a) Histogram of the fluorescence burst sizes of EagI restriction
fragments of Cl-29. λ DNA and T4 DNA were used as size standards. The bin
width was 10 pe. The histogram was fit to a sum of 4 Gaussians plus an
exponentially decaying background. Experimental conditions: transit time 1.4
ms, laser power 30 mW. (b) Plot of burst size means versus fragment lengths.
A line was drawn through the points of DNA standards (48.5, 167 kb). The
slope of this line is 21.9 ± 0.1 pe/kb and the intercept is 4 ± 4 pe. The insert (73.7
± 0.4 kb) and vector (16.2 ± 0.2 kb) sizes were calculated using the function
defined by the line and burst size means as listed in Table 1.
Sizing of supercoiled and linear PAC clones by PFGE
To check the validity of PAC clone sizing by FCM, the intact
supercoiled clones isolated from E. coli host cells and their linear
restriction fragments were also sized by PFGE. Approximately
0.2 µg of clone DNA was loaded into each gel well. The pulsed-field
gel was run for 15 h with switch times ramped from 2.0 to 30 s.
Figure 5 shows a photograph of the ethidium bromide stained gel.
Lanes 1, 6, 7, 16 are Low Range PFG markers (sizes indicated on
the figure); lanes 2 and 8 are intact supercoiled Cl-26 from two
different preps; lanes 3 and 9 are EagI restriction fragments of intact
supercoiled Cl-26, showing two bands, corresponding to the linear
insert and vector; lanes 4 and 10 are intact supercoiled Cl-29 from
two different preps; lanes 5 and 11 are EagI restriction fragments of
intact supercoiled Cl-29; lanes 12–15 are not relevant. To estimate
the sizes of DNA fragments, the migration distances of all the bands
from the loading wells were measured using a ruler and the lengths
of markers were plotted versus their corresponding migration
distances. The resulting plot was fit to a 5th order polynomial
function and the sizes of the clones were estimated using their
migration distances and the polynomial function obtained from the
fit to the markers. The estimated linear Cl-29 insert size is
71.0 ± 7.1 kb, where the uncertainty represents a typical 10% error
quoted for PFGE (6). The obtained Cl-29 insert size is in good
agreement with that measured by FCM (73.7 ± 0.4 kb, Fig. 2). The
estimated linear Cl-26 insert size from the gel is 107.1 ± 10.7 kb,
which is also in good agreement with that measured by FCM (105.3
± 0.9 kb, Fig. 4). The estimated size of the linear vector, 16.7 ±
1.7 kb, also agrees with that determined by FCM (16.2 ± 0.2 kb,
Fig. 2; 15.5 ± 0.3 kb, Fig. 4). The intact supercoiled Cl-26 and Cl-29
clones did not migrate appreciably, which is not surprising since
large supercoiled DNA migrates anomalously in gel electrophoresis
(5,24). Sizing of PAC clones by PFGE requires enzyme digestion
to linearize supercoiled DNA.
Figure 3. (a) Histogram of the fluorescence burst sizes of intact supercoiled
clone Cl-26. λ KpnI digest, λ DNA and T4 DNA were used as size standards.
The bin width was 10 pe. The histogram was fit to a sum of 5 Gaussians plus
an exponentially decaying background. Experimental conditions: transit time
1.1 ms, laser power 30 mW. (b) Plot of burst size means versus fragment
lengths. The means of DNA standard peaks (17.1, 29.9, 48.5, 167 kb) were fit
by linear regression. The correlation coefficient of the linear fit is 0.99998. The
slope is 17.8 ± 0.1 pe/kb and the intercept is 15 ± 7 pe. The Cl-26 size of 119.1
± 0.9 kb was calculated using the linear regression function and the burst size
mean as listed in Table 2.
Sizing of PAC clones directly from the PAC clone
library by FCM
Figure 6 shows four representative histograms of sizing clones
directly from the PAC clone library. Here λ KpnI digest, λ DNA
and T4 DNA were used as markers (same as Figs 1 and 3). The
data analysis was the same as that employed to analyze data for
intact supercoiled Cl-29 and Cl-26. The obtained sizes for
supercoiled Cl-7, Cl-10, Cl-32 and Cl-56 are listed in Table 3. The
insert sizes, calculated by subtracting the vector size (15.9 ±
0.4 kb, Figs 2 and 4), are in agreement with those obtained by
PFGE within experimental uncertainty (Table 3). The apparent
systematic discrepancy derived from these two methods is not
significant: in general, both positive and negative deviations are
observed (see above section).
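The insert sizes follow from the supercoiled clone sizes by subtracting the vector size. The sketch below uses the FCM values for total clone size listed in Table 3 and assumes that the two uncertainties are independent and combine in quadrature; the paper does not state the propagation rule, although the results agree with the tabulated values.

```python
import math

VECTOR_KB = 15.9      # vector size from the text (Figs 2 and 4)
VECTOR_ERR_KB = 0.4

# Supercoiled clone sizes by FCM (insert + vector, kb) with uncertainties.
clones = {
    "Cl-7": (105.0, 1.7),
    "Cl-10": (101.7, 1.8),
    "Cl-32": (95.5, 0.8),
    "Cl-56": (103.2, 1.9),
}


def insert_size(total_kb, total_err_kb):
    """Insert size and uncertainty, ASSUMING independent errors."""
    return total_kb - VECTOR_KB, math.hypot(total_err_kb, VECTOR_ERR_KB)


for name, (total, err) in clones.items():
    size, size_err = insert_size(total, err)
    print(f"{name}: insert {size:.1f} +/- {size_err:.1f} kb")
```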
Table 3. Comparison of the sizes of clones from the PAC library measured by FCM and PFGE
Clone     By FCM (insert + vector, kb)   By FCM (insert only, kb)   By PFGE (insert only, kb)
Cl-7      105.0 ± 1.7                    89.1 ± 1.7                 97 ± 10
Cl-10     101.7 ± 1.8                    85.8 ± 1.8                 93 ± 9
Cl-32     95.5 ± 0.8                     79.6 ± 0.9                 81 ± 8
Cl-56     103.2 ± 1.9                    87.3 ± 1.9                 94 ± 9
Figure 5. Sizing of PAC clones by pulsed field gel electrophoresis (PFGE).
Detailed conditions are in Materials and Methods. The gel was stained with
ethidium bromide. Lanes 1, 6, 7, 16: Low Range PFG markers (sizes on figure);
lanes 2 and 8, intact supercoiled Cl-26 from two different preps; lanes 3 and 9,
EagI restriction fragments of intact supercoiled Cl-26, showing two bands at
107.1 and 16.7 kb; lanes 4 and 10, intact supercoiled Cl-29 from two different
preps; lanes 5 and 11, EagI restriction fragments of intact supercoiled Cl-29,
showing two bands at 71.0 and 16.7 kb; lanes 12–15, not relevant.
Figure 4. (a) Histogram of the fluorescence burst sizes of EagI restriction
fragments of Cl-26. λ DNA and T4 DNA were used as size standards. The bin
width was 10 pe. The histogram was fit to a sum of 4 Gaussians plus an
exponentially decaying background. Experimental conditions: transit time 1.4
ms, laser power 20 mW. (b) Plot of burst size means versus fragment lengths.
A line was drawn through the points of DNA standards (48.5, 167 kb). The
slope of the line is 16.0 ± 0.1 pe/kb and the intercept is –10 ± 4 pe. The insert
(105.3 ± 0.9 kb) and vector (15.5 ± 0.3 kb) sizes were calculated using the
function defined by the line and burst size means as listed in Table 2.
Estimating DNA concentration based on the histogram
peak areas
Counts of the number of individual fragments are obtained from
FCM data. The relative numbers counted for different DNA
fragments are proportional to the molar concentrations of those
fragments in the sample solution. Therefore, it is possible to
estimate the unknown clone concentration based on its relative
peak area in the histogram. Seven sets of histogram data were
examined and the relative peak areas of λ and λ KpnI digest (48.5,
29.9 and 17.1 kb) were calculated to verify that the peak areas
correspond to the molar concentrations in the sample. The results
are listed in Table 4. The two approaches agree within listed
uncertainties, which means the accuracy of estimating the DNA
concentration based on the histogram peak area analysis is >80%
(see column 4 of Table 4). To show an example of using this
analysis to estimate the unknown clone concentration, the ratio of
the histogram peak areas for fragments with lengths 17.1, 29.9,
48.5 and 88.9 kb in Figure 1 is computed to be 0.6:0.6:1:1.6.
Based on the molar concentrations of λ (9.5 × 10⁻¹⁵ M) and λ
KpnI digest (5.6 × 10⁻¹⁵ M) in the final sample solution, the clone
concentration in the final sample solution is estimated to be
(1.5 ± 0.3) × 10⁻¹⁴ M. Assuming that DNA shearing did not occur
during sample preparation, the concentration of the original 1 µl
clone solution used to make up the staining solution is then
estimated to be 21 ± 4 µg/ml.
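The concentration estimate can be traced through with the numbers given in the text. The sketch below is a rough reconstruction, not the authors' calculation: it assumes 660 g/mol per base pair, takes the staining volume as the sum of the volumes in the sample preparation recipe, and scales only from the λ peak.

```python
# ASSUMPTION (not from the paper): one DNA base pair weighs ~660 g/mol.
BP_MASS_G_PER_MOL = 660.0

lambda_molar = 9.5e-15     # lambda DNA in the final sample (from the text)
area_ratio_clone = 1.6     # clone peak area relative to the lambda peak
clone_kb = 88.9            # size of supercoiled Cl-29 from Figure 1

# Peak areas are proportional to molar concentrations.
clone_molar = area_ratio_clone * lambda_molar

# Convert to a mass concentration in the final (diluted) sample.
final_g_per_l = clone_molar * clone_kb * 1000.0 * BP_MASS_G_PER_MOL

# Undo the 150-fold dilution and the dilution of 1 ul of clone stock into
# the staining mix (volumes in ul taken from the sample preparation recipe).
staining_volume_ul = 1.0 + 0.7 + 0.4 + 2.0 + 3.4 + 150.0
stock_g_per_l = final_g_per_l * 150.0 * staining_volume_ul / 1.0
stock_ug_per_ml = stock_g_per_l * 1000.0  # 1 g/l = 1000 ug/ml

print(f"clone in final sample: {clone_molar:.2e} M")
print(f"clone stock: {stock_ug_per_ml:.0f} ug/ml")
```

Under these assumptions the sketch gives about 1.5 × 10⁻¹⁴ M in the final sample and about 21 µg/ml for the original clone solution, in line with the estimates above.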
Table 4. Correspondence between the ratio of the histogram peak areas and the ratio of the molar concentrations of DNA fragments in the sample
Fragment      Ratio of the molar   Ratio of the histogram   Ratio of the histogram peak areas/
length (kb)   concentrations       peak areas               Ratio of the molar concentrations
17.1          0.59a                0.70 ± 0.10b             1.19 ± 0.17
29.9          0.59a                0.63 ± 0.08b             1.07 ± 0.14
48.5          1                    1                        1
aMolar concentrations are given in the text: the ratios of the molar concentrations were normalized to the molar concentration of λ DNA.
bRatios of the histogram peak areas were calculated and averaged from seven sets of data, and were normalized to the histogram peak areas of λ DNA. Uncertainties
listed are the standard deviations.
Figure 6. Histograms of the fluorescence burst sizes of the clones directly from a PAC clone library. The bin widths were all 10 pe. λ KpnI digest, λ DNA and T4
DNA were used as markers. The data analysis was the same as that of Figures 1 and 3. (a) Cl-7, transit time 1.4 ms, laser power 30 mW, obtained Cl-7 size: 105.0
± 1.7 kb; (b) Cl-10, transit time 1.2 ms, laser power 30 mW, obtained Cl-10 size: 101.7 ± 1.8 kb; (c) Cl-32, transit time 1.1 ms, laser power 30 mW, obtained Cl-32
size: 95.5 ± 0.8 kb; (d) Cl-56, transit time 1.2 ms, laser power 30 mW, obtained Cl-56 size: 103.2 ± 1.9 kb.
DISCUSSION
Accuracy
Sizing of large DNA fragments by FCM is more accurate (∼2%
uncertainty) than by PFGE, which is generally considered to have
a 10% uncertainty in estimating sizes (6). This conclusion is based
upon the following: (i) the average absolute deviation from the
linear fits for large DNA standards is 1.7% [Fig. 1; the 17.1 kb
point, which has the largest deviation (–4.5%), is always above the
line as observed previously (10,12)]; (ii) the results are reproducible
from run-to-run and day-to-day: the average precision of sizing the
same PAC clone is 1.7%; and (iii) the signal in FCM (photon burst)
is linear with fragment length, while the signal in PFGE (migration
distance) is non-linear with DNA length.
Resolution
When the instrument is optimized, photon shot noise (√xi/xi)
accounts for half or more of the observed coefficient of variation
(CV) (Table 1). Larger fragments have relatively better resolution
since the shot noise is relatively smaller due to their larger burst
sizes. For a given fragment, a larger burst size will result in a
better CV. Larger burst sizes could be attained by increasing the
laser power or using brighter dyes. The signal count rate (pe/ms)
achievable is ultimately limited by detector saturation (see
Materials and Methods section). Larger burst sizes could be
obtained without saturation by increasing the transit time. Our
resolution on large fragments (2–5% for >50 kb DNA) is
comparable with or better than that of PFGE (25).
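The scaling argument above can be sketched numerically. The example assumes Poisson photon statistics, so that the relative shot noise of a burst of x photoelectrons is 1/√x; the count rate and transit times are hypothetical values chosen only to show the trend.

```python
import math


def shot_noise_cv_percent(burst_pe):
    """Relative shot noise sqrt(x)/x = 1/sqrt(x), expressed as a percentage."""
    return 100.0 / math.sqrt(burst_pe)


def burst_size(count_rate_pe_per_ms, transit_ms):
    """Burst size (pe) collected at a fixed count rate over a given transit time."""
    return count_rate_pe_per_ms * transit_ms


# Hypothetical count rate held fixed (for example, just below detector saturation).
rate = 200.0  # pe/ms, an assumed value
for transit in (1.0, 2.0, 4.0):  # ms
    x = burst_size(rate, transit)
    print(f"transit {transit} ms: burst {x:.0f} pe, shot-noise CV {shot_noise_cv_percent(x):.2f}%")
```

Under this assumption, doubling the transit time at a fixed count rate doubles the burst size and lowers the shot-noise contribution to the CV by a factor of √2, without raising the count rate toward saturation.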
Sensitivity
Sensitivity is an important issue in characterizing clones obtained
in minute quantities (such as single copy PAC and BAC clones).
As little as 0.4 pg of total DNA (or ∼0.1 pg of clone DNA) is
analyzed to generate a histogram that allows accurate sizing of
clones. Although the staining solution was made up at ∼10⁻¹¹ M
in the currently reported experiments, good results were obtained
in other work for sizing a mixture of λ DNA and λ KpnI digest
stained directly at ∼10⁻¹³ M with TOTO-1. Only ∼2 ng of clone
DNA are needed to make up 1 ml of 10⁻¹³ M working solution,
of which only ∼0.5 µl is analyzed to generate one histogram.
Improvements in sample handling will reduce further the volume
of solution needed for analysis. The amount of DNA loaded per
gel well in Figure 5 is ∼0.2 µg. Thus, sizing of DNA by FCM
requires orders of magnitude less DNA than does PFGE.
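The comparison can be checked with a short calculation using only the figures quoted above (∼2 ng of clone DNA in 1 ml of working solution, ∼0.5 µl analyzed per histogram, ∼0.2 µg per gel well). The helper name is hypothetical and the result is an order-of-magnitude estimate, not a measured value.

```python
NG = 1e-9   # grams per nanogram
PG = 1e-12  # grams per picogram
UG = 1e-6   # grams per microgram


def analyzed_mass_g(stock_mass_g, stock_volume_ul, analyzed_volume_ul):
    """Mass of DNA contained in the analyzed aliquot of a uniform working solution."""
    return stock_mass_g * analyzed_volume_ul / stock_volume_ul


# ~2 ng of clone DNA in 1 ml (1000 ul), of which ~0.5 ul is analyzed.
fcm_mass = analyzed_mass_g(2 * NG, 1000.0, 0.5)
pfge_mass = 0.2 * UG  # DNA loaded per gel well

print("FCM mass per histogram (pg):", fcm_mass / PG)
print("PFGE / FCM mass ratio:", pfge_mass / fcm_mass)
```

With these inputs the analyzed aliquot contains about 1 pg of DNA, roughly five orders of magnitude less than the amount loaded per gel well.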
Signal linear with the number of fragments
FCM analyzes and classifies DNA fragments one at a time. The
relative numbers counted for different DNA fragments correspond to the molar concentrations of the fragments in the sample
solution. Unknown DNA concentrations can be estimated with
>80% accuracy based on the analysis of their relative peak areas
in the histogram.
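A minimal sketch of this estimate, assuming a reference fragment of known molar concentration is analyzed in the same run and that peak area is taken as the number of bursts counted in each peak. The counts and the reference concentration are hypothetical.

```python
def estimate_concentration(count_unknown, count_reference, reference_molar):
    """Scale the known reference concentration by the ratio of counted fragments."""
    if count_reference <= 0:
        raise ValueError("reference peak must contain at least one counted fragment")
    return reference_molar * count_unknown / count_reference


# Hypothetical example: a reference peak with 2000 events at an assumed 1e-13 M,
# and an unknown peak with 1500 events in the same histogram.
unknown_molar = estimate_concentration(1500, 2000, 1e-13)
print("estimated concentration (M):", unknown_molar)
```

The estimate relies on every fragment passing through the probe volume being detected with equal probability, which is an assumption of this sketch rather than a result stated here.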
Effect of DNA conformation on sizing
The sizes of intact supercoiled Cl-29 and Cl-26 obtained by FCM
are in good agreement with the sums of the sizes of their linear
restriction fragments (see Results section). These results demonstrate that FCM can be applied to size both linear and supercoiled
clones. Two implications can be drawn from these results: (i) DNA
conformation (supercoiled or linear) does not affect the stoichiometric binding of TOTO-1 to DNA double strands; and (ii) DNA
conformation does not affect the linear relationship between
photon burst size and the number of dye molecules associated
with the DNA strands. In contrast, large supercoiled DNA
migrates anomalously in PFGE and it is essential to digest
enzymatically the supercoiled clones into linear molecules before
they can be sized accurately.
CONCLUSIONS
We have demonstrated a flow cytometry-based technique to size
large DNA fragments and its application to the characterization of
P1 artificial chromosome clones. This technique is superior to the
most commonly used current technique (PFGE) for the following
reasons: (i) the data are acquired rapidly (<3 min compared with
15 h for PFGE); (ii) the technique is sensitive (<1 pg of DNA is
analyzed compared with 0.2 µg of DNA for PFGE); (iii) the
measurement is linear with both fragment length and number of
fragments; and (iv) the results are conformation independent as both
linear and supercoiled PAC clones were sized accurately. We
anticipate that this method will play an important role in characterizing PAC/BAC clone libraries that are widely used in gene mapping,
sequencing, and other genetic analyses. The primary challenge of
sizing large DNA in solution is to avoid significant DNA breakage
during sample preparation and handling. Since DNA as large as
1 × 10⁶ bp has been handled successfully in the genome community
(with YAC cloning systems), we anticipate our DNA sizing range
can be pushed well beyond 167 kb. This method also holds the
potential for scale-up through multiplicity and automation.
ACKNOWLEDGEMENTS
This work was supported by internal funding from Los Alamos
National Laboratory, by the DOE funded Los Alamos Center for
Human Genome Studies (W-7405-ENG-36), and by the NIH
funded National Flow Cytometry Resource (RR-01315). We
thank Harvey Nutter for his valuable technical assistance.
REFERENCES
1 Sambrook,J., Fritsch,E.F. and Maniatis,T. (1989) Molecular Cloning: A
Laboratory Manual, 2nd edition. Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, pp. 6.5, 6.37.
2 Cantor,C.R., Smith,C.L. and Mathew,M.K. (1988) Annu. Rev. Biophys.
Biophys. Chem., 17, 287–304.
3 Schwartz,D.C. and Cantor,C.R. (1984) Cell, 37, 67–75.
4 Carle,G.F., Frank,M. and Olson,M.V. (1986) Science, 232, 65–68.
5 Beverley,S.M. (1988) Nucleic Acids Res., 16, 925–939.
6 Guo,X.H., Huff,E.J. and Schwartz,D.C. (1992) Nature, 359, 783–784.
7 Cai,W., Aburatani,H., Stanton,V.P., Housman,D.E., Wang,Y.K. and
Schwartz,D.C. (1995) Proc. Natl Acad. Sci. USA, 92, 5164–5168.
8 Kim,Y.S. and Morris,M.D. (1994) Anal. Chem., 66, 3081–3085.
9 Kim,Y.S. and Morris,M.D. (1995) Anal. Chem., 67, 784–786.
10 Goodwin,P.M., Johnson,M.E., Martin,J.C., Ambrose,W.P., Marrone,B.L.,
Jett,J.H. and Keller,R.A. (1993) Nucleic Acids Res., 21, 803–806.
11 Johnson,M.E., Goodwin,P.M., Ambrose,W.P., Martin,J.C., Marrone,B.L.,
Jett,J.H. and Keller,R.A. (1993) Proc. SPIE-Int. Soc. Opt. Eng., 1895,
69–78.
12 Petty,J.T., Johnson,M.E., Goodwin,P.M., Martin,J.C., Jett,J.H. and
Keller,R.A. (1995) Anal. Chem., 67, 1755–1761.
13 Castro,A., Fairfield,F.R. and Shera,E.B. (1993) Anal. Chem., 65, 849–852.
14 Castro,A. and Shera,E.B. (1995) Anal. Chem., 67, 3181–3186.
15 Glazer,A.N. and Rye,H.S. (1992) Nature, 359, 859–861.
16 Gingrich,J.C., Boehrer,D.M., Garnes,J.A., Johnson,W., Wong,B.S.,
Bergmann,A., Eveleth,G.G., Langlois,R.G. and Carrano,A.V. (1995)
Genomics, 32, 65–74.
17 Sternberg,N. (1990) Proc. Natl Acad. Sci. USA, 87, 103–107.
18 Ioannou,P.A., Amemiya,C.T., Garnes,J., Kroisel,P.M., Shizuya,H.,
Chen,C., Batzer,M.A. and DeJong,P.J. (1994) Nature Genet., 6, 84–89.
19 Shizuya,H., Birren,B., Kim,U.J., Mancino,V., Slepak,T., Tachiiri,Y. and
Simon,M. (1992) Proc. Natl Acad. Sci. USA, 89, 8794–8797.
20 Burke,D.T., Carle,G.F. and Olson,M.V. (1987) Science, 236, 806–812.
21 Ashworth,L.K., Alegria-Hartman,M., Burgin,M., Devlin,L., Carrano,A.V.
and Batzer,M.A. (1995) Anal. Biochem., 224, 564–571.
22 Sheng,Y.L., Mancino,V. and Birren,B. (1995) Nucleic Acids Res., 23,
1990–1996.
23 Lindmo,T., Peters,D.C. and Sweet,R.G. (1990) In Melamed,M.R.,
Lindmo,T. and Mendelsohn,M.L. (eds), Flow Cytometry and Sorting, 2nd
edition. Wiley-Liss, Inc., New York, pp. 159.
24 Wang,M. and Lai,E. (1995) Electrophoresis, 16, 1–7.
25 Mathew,M.K., Smith,C.L. and Cantor,C.R. (1988) Biochemistry, 27,
9204–9210.