1996 Oxford University Press. Nucleic Acids Research, 1996, Vol. 24, No. 21, 4202–4209

Large DNA fragment sizing by flow cytometry: application to the characterization of P1 artificial chromosome (PAC) clones

Zhengping Huang, Jeffrey T. Petty+, Brian O’Quinn+, Jonathan L. Longmire¹, Nancy C. Brown¹, James H. Jett¹ and Richard A. Keller*

Chemical Science and Technology Division and ¹Life Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

Received July 9, 1996; Revised and Accepted September 13, 1996

ABSTRACT

A flow cytometry-based, ultrasensitive fluorescence detection technique is used to size individual DNA fragments up to 167 kb in length. Application of this technology to the sizing of P1 artificial chromosomes (PACs) in both linear and supercoiled forms is described. It is demonstrated that this method is well suited to characterizing PAC/BAC clones and will be very useful for the analysis of large insert libraries. Fluorescence bursts are recorded as individual, dye stained DNA fragments pass through a low power, focused, continuous laser beam. The magnitudes of the fluorescence bursts are linearly proportional to the lengths of the DNA fragments. The histograms of the burst sizes are generated in <3 min with <1 pg of DNA. Results on linear fragments are consistent with those obtained by pulsed-field gel electrophoresis. In comparison with pulsed-field gel electrophoresis, sizing of large DNA fragments by this approach is more accurate, much faster, requires much less DNA, and is independent of the DNA conformation.

INTRODUCTION

The sizing of DNA molecules plays a central role in all physical genetic analyses and is typically accomplished using gel electrophoresis. Conventional acrylamide and agarose gels have been used to separate DNA of lengths 10–2000 and 100–40 000 base pairs (bp) respectively (1).
For DNA fragments larger than 40 kilobase pairs (kb), migration through the gel is independent of the fragment length (2), so pulsed field gel electrophoresis (PFGE) is used (3). Although PFGE adequately separates large DNA molecules, high resolution separation requires tens of hours, and gel-to-gel, day-to-day reproducibility and accuracy remain problematic (2,4). Furthermore, PFGE cannot be used to size large supercoiled or open circular DNA (5); samples must be linearized before analysis. An optical contour maximization technique, based on analyzing the optical images of single DNA fragments, is an alternative approach (6). This methodology is relatively easy and fast to implement, but the sample fragment selection is arbitrary and many fragments need to be analyzed to achieve an accuracy comparable with PFGE (7). The application of pulsed-field technology to capillary electrophoresis shows considerable promise for the rapid separation of DNA fragments up to 1.6 × 10⁶ bp (8,9).

We (10–12) and others (13,14) have developed a flow cytometric (FCM) method to size DNA. DNA fragments are stained with the bisintercalating dye TOTO-1 [a thiazole orange homodimer (15)], which binds to DNA in a stoichiometric manner. Individual stained DNA fragments are passed through the laser illuminated detection volume (∼10 pl), producing fluorescence photon bursts. The burst size is proportional to the number of dye molecules bound to DNA, and thus is proportional to the fragment length. The primary advantages of our technique are short data collection time (<3 min), high sensitivity (<1 pg of DNA was analyzed) and a linear response versus both fragment size and number of fragments. Our previously published work demonstrated a sizing range of 1.5–48.5 kb (12). Our goal is to size larger DNA fragments faster and more accurately than can be done by pulsed field gel electrophoresis.
This could be applied in the construction of a high resolution physical map for each of the human chromosomes. Completion of the map requires the availability of comprehensive libraries of DNA clones in appropriate vectors. Cosmids allow inserts of 30–45 kb (16); the bacteriophage P1 cloning system accepts inserts in the 70–100 kb range (17); P1-derived artificial chromosomes (PACs) allow inserts of 70–300 kb (18); bacterial artificial chromosomes (BACs) also accept inserts up to 300 kb (19); yeast artificial chromosomes (YACs) can propagate exogenous DNA in excess of 1 × 10⁶ bp (20). Due to drawbacks associated with cosmids (comparatively small sizes, instability of clones) and YACs (low transformation efficiency, excessive presence of chimeric clones, difficulties in DNA manipulation) (18), PAC/BAC systems are gaining increasing popularity (21).

In this work, we demonstrate the extension of our sizing range up to 167 kb. A linear relationship between fluorescence burst size and the length of DNA standards was observed. FCM was successfully applied to size both linear and supercoiled PAC clones. Our results for linear clone sizes are in agreement with those obtained by PFGE. Clones from PAC libraries can be sized directly and routinely.

*To whom correspondence should be addressed. Tel: +1 505 667 3018; Fax: +1 505 665 3024; Email: [email protected]
+Present address: Department of Chemistry, Furman University, Greenville, SC 29613, USA

MATERIALS AND METHODS

DNA

Bacteriophage λ DNA was obtained from GIBCO BRL Life Technologies (Gaithersburg, MD); KpnI digests of λ DNA and Low Range PFG Markers were obtained from New England BioLabs (Beverly, MA). Coliphage T4 and T5 DNA were obtained from Sigma Chemical Co. (St Louis, MO). Cl-7, Cl-10, Cl-32 and Cl-56 are PAC clones obtained directly from a PAC library constructed at Los Alamos.
PAC clones Cl-26 and Cl-29 were isolated from Escherichia coli host cells using a rapid alkaline lysis miniprep method (18). This is a modification of a standard Qiagen-Tip method which uses no organic extractions or columns. The E. coli cells contained clones that had been transformed into the cells by electroporation (22). Two ml cultures of cells were grown for 18 h at 37°C in Luria Broth supplemented with 30 µg/ml kanamycin. For the final step, PAC DNA was ethanol precipitated and resuspended in 40 µl of TE buffer (10 mM Tris–HCl, 1 mM EDTA, pH 8). The procedure resulted in a recovery of 0.5–2 µg of DNA. Extra care was used to minimize the amount of contamination from E. coli genomic DNA, cellular protein and RNA, and to minimize the loss and shearing of clone DNA. After the cells were lysed, all pipetting was done with wide-bore or cut pipet tips.

Restriction digestion

EagI digestion is required to linearize the PAC DNA before PFGE can be used to size clones. A volume of 11.5 µl of TE buffer, 2 µl of NEB 3 buffer (50 mM Tris–HCl, 10 mM MgCl₂, 100 mM NaCl, 1 mM DTT, pH 7.9), 5 µl of DNA sample and 1.5 µl (15 U) of EagI (New England BioLabs, Beverly, MA) were mixed gently and incubated for 3 h at 37°C. The mixture was then heated at 65°C for 15 min to inactivate the enzyme.

Sample preparation

The stock DNA solutions were stored at 4°C. TOTO-1 (Molecular Probes, Eugene, OR) was stored as a 1 mM solution in DMSO at –20°C. A 1 × 10⁻⁵ M solution of TOTO-1 was prepared by diluting 1 µl of the stock in 99 µl of TE. To size intact supercoiled PAC clones, 1 µl of the DNA clone (∼20–50 ng/µl), 0.7 µl of 10 ng/µl λ, 0.4 µl of 10 ng/µl λ KpnI digest, 2 µl of 25 ng/µl T4 DNA, and 3.4 µl of 1 × 10⁻⁵ M TOTO-1 solution were added to 150 µl of TE. This gives concentrations of 2 × 10⁻⁷ M TOTO-1 and 0.5–0.7 ng/µl DNA, corresponding to a base pair to dye molecule ratio between 3.6:1 and 5:1. Only λ and T4 were added as markers when sizing the EagI digest of PAC clones.
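The concentrations quoted for the staining mixture follow from the listed volumes. The short sketch below reproduces that arithmetic; the average base-pair mass of 650 g/mol is an assumption that the paper does not state, and the variable and function names are purely illustrative.

```python
# Staining-mix arithmetic, using the volumes and stock concentrations quoted in the text.
BP_MASS_G_PER_MOL = 650.0  # ASSUMPTION: average mass of one base pair; not given in the paper

volumes_ul = {
    "clone": 1.0,        # 20-50 ng/ul
    "lambda": 0.7,       # 10 ng/ul
    "lambda_KpnI": 0.4,  # 10 ng/ul
    "T4": 2.0,           # 25 ng/ul
    "TOTO-1": 3.4,       # 1e-5 M
    "TE": 150.0,
}
total_volume_ul = sum(volumes_ul.values())

# Dye concentration after mixing (the volume units cancel).
toto_molar = 1e-5 * volumes_ul["TOTO-1"] / total_volume_ul

def dna_ng_per_ul(clone_ng_per_ul):
    """Total DNA concentration in the staining mixture."""
    marker_ng = 0.7 * 10 + 0.4 * 10 + 2.0 * 25
    clone_ng = volumes_ul["clone"] * clone_ng_per_ul
    return (marker_ng + clone_ng) / total_volume_ul

def bp_per_dye(clone_ng_per_ul):
    """Base pairs per TOTO-1 molecule in the staining mixture."""
    grams_per_litre = dna_ng_per_ul(clone_ng_per_ul) * 1e-3  # 1 ng/ul = 1e-3 g/l
    bp_molar = grams_per_litre / BP_MASS_G_PER_MOL
    return bp_molar / toto_molar

for clone_conc in (20, 50):
    print(clone_conc, round(dna_ng_per_ul(clone_conc), 2), round(bp_per_dye(clone_conc), 1))
```

With the assumed base-pair mass, the two ends of the clone concentration range give roughly 0.5 and 0.7 ng/µl of DNA and ratios close to the 3.6:1 and 5:1 quoted above.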
The mixtures were incubated for at least 60 min at room temperature in the dark and diluted 150-fold in TE to give a final total fragment concentration of ∼5 × 10⁻¹⁴ M. The final clone concentration was ∼1.5 ng/ml. The DNA/dye complex was very stable; narrow burst distributions were obtained for samples sitting at room temperature (∼21°C) for up to several days.

A challenge of large DNA analysis (>50 kb) is to avoid significant DNA breakage during sample preparation and handling. Sample manipulations must be done slowly and carefully. Since some amount of shearing is unavoidable, the sample composition (relative amount of different DNA markers, clones) is important to achieve good results. If too much T4 or clone DNA is added, the bursts from sheared fragments mask the smaller marker peaks; if too little T4 DNA or clone is added, their peak heights are small for the same data collection time and the peak positions are more uncertain. DNA shearing in the flow system is relatively insignificant: flowing samples 3-fold faster did not decrease the relative peak areas of larger fragments in the data.

Pulsed field gel electrophoresis (PFGE)

PFGE was carried out on a CHEF-DR II (Bio-Rad Laboratories, Richmond, CA). Conditions were the following: 1% GTG Seaplaque agarose (FMC BioProducts, Rockland, ME), 0.5× TBE (45 mM Tris-borate, 1 mM EDTA, pH 8) running buffer; 16°C buffer temperature; switch times ramped from 2.0 to 30 s; 200 V; 15 h running time. The current was typically between 0.17 and 0.22 A. Gels were stained with ethidium bromide and visualized under UV light.

Ultrasensitive flow cytometry (FCM)

The experimental apparatus is similar to that described previously (10,12). The excitation light, provided by a cw Ar⁺/Kr⁺ laser operated at 514.5 nm, was coupled into a single mode optical fiber. The output from the fiber was collimated and focused onto a sheath flow cuvette using a spherical lens with a 25 cm focal length.
The 1/e² beam diameter at the center of the flow cuvette was measured to be ∼46 µm by translating a razor blade through the beam. The excitation laser power in the flow cell was measured to be 20–30 mW. The fused silica square bore cuvette has a (250 µm)² internal dimension and ∼100 µm detection side thickness. Fluorescence was collected at 90° to both the optical and flow axes with a 40×, NA 0.85 microscope objective. The collected light was focused onto a 1.2 (horizontal) × 3.0 (vertical) mm slit (spatial filter) located at the image plane of the objective, passed through a 550 ± 15 nm interference filter, and focused onto the photocathode of a thermoelectrically cooled (–30°C) PMT (RCA, Model 31034A, Sommerville, NJ). The detection volume of ∼10 pl was defined by the image of the spatial filter in the flow cell, the focused laser beam and the sample flow stream. Photoelectron pulses were amplified, discriminated, and counted on a multichannel scaler (MCS). The MCS summed the number of pulses in 40.96 µs bins. Two hundred scans of data with 16384 bins per scan were collected and transferred to a Macintosh computer and analyzed by a program written in the LabVIEW language (National Instruments, Austin, TX). The total data collection time was 134 s.

Sample solution and sheath fluid were introduced into the flow cell by gravity feed. The sample and sheath flow rates were controlled by height differences between the sample tube, the sheath bottle and the drain container. Ultrapure water (Millipore, Bedford, MA) was used as the sheath fluid. Transit times of the stained fragments through the sample volume were controlled by the sheath flow rate (∼30 µl/min, corresponding to a linear velocity of 2–4 cm/s at the center of the cuvette and a transit time through the laser beam of 1–2 ms). The sample flowed at a rate of ∼0.2 µl/min, or ∼40 fragments/s.
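The acquisition figures quoted above are mutually consistent, as the following sketch shows. It uses only numbers given in the text; the variable names are illustrative, and the fragment count assumes that the ∼40 fragments/s rate holds for the whole run.

```python
# Acquisition arithmetic from the multichannel-scaler settings and flow rates in the text.
BIN_WIDTH_S = 40.96e-6
BINS_PER_SCAN = 16384
SCANS = 200

acquisition_s = SCANS * BINS_PER_SCAN * BIN_WIDTH_S      # total data collection time

SAMPLE_FLOW_UL_PER_MIN = 0.2
COUNT_RATE_PER_S = 40                                     # assumed constant over the run

sample_volume_ul = SAMPLE_FLOW_UL_PER_MIN * acquisition_s / 60.0
fragments_counted = COUNT_RATE_PER_S * acquisition_s

# Number of MCS bins spanned by one burst for the 1-2 ms transit times.
bins_per_burst = [t / BIN_WIDTH_S for t in (1e-3, 2e-3)]

print(round(acquisition_s, 1), round(sample_volume_ul, 2), round(fragments_counted))
```

The result is about 134 s of data, a little under 0.5 µl of sample, and roughly 5000 fragments per histogram, with each burst spanning a few tens of bins.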
The probability of two fragments being in the sample volume at the same time is given by Poisson statistics as p(2) = 1 – exp(–τ × r), where τ is the transit time and r is the count rate (23). Under our conditions, the probability of double occupancy is ∼5%. The point at which double occupancy becomes a problem depends on the application and can be calculated from the above formula.

An important factor in achieving good resolution in FCM is maintaining a narrow and stable sample stream. To achieve this, the flow system has to be free of contaminants and air bubbles. Periodically, the flow system was disassembled and parts were sonicated in 2% RBS (Pierce, Rockford, IL) methanol/H₂O (1:1) solution and rinsed with ultrapure water. Before running samples, air bubbles were removed by flowing methanol, followed by degassed methanol/H₂O (1:1), through the flow system. Microspheres (Yellow-green Fluoresbrite, diameter 0.997 ± 0.026 µm, Cat# 18860, Polysciences Inc., Warrington, PA) were used to align the optics to maximize the signal as monitored by an oscilloscope.

Data analysis

Transit times of 1–2 ms were determined by autocorrelation of the first scan of raw data (12). The background was determined by averaging the data below a level set near the maximum of the background noise. The whole data set was scanned for bursts. A burst was recorded when a series of points exceeded a threshold set above the average background. A typical background rate for the data discussed below was 5 photoelectrons per MCS bin (40.96 µs), and a typical threshold was chosen to be 6 photoelectrons per bin. The criterion for choosing the proper threshold has been discussed in detail previously (12). For large DNA fragments, the value chosen for the threshold is not critical. The areas of the bursts were integrated and histogrammed to give a burst size distribution using Sigmaplot software (Jandel Scientific, San Rafael, CA).
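The burst search described above can be sketched as follows. This is a minimal illustration rather than the authors' LabVIEW program: the background subtraction from each burst area, the `min_bins` requirement, and the synthetic trace are all assumptions introduced here, while the background of 5 and threshold of 6 photoelectrons per bin are the typical values quoted in the text.

```python
import numpy as np

def estimate_background(counts, level):
    """Average of all bins lying below `level` (set near the top of the background noise)."""
    counts = np.asarray(counts, dtype=float)
    return counts[counts < level].mean()

def burst_areas(counts, background, threshold, min_bins=2):
    """Integrate each run of consecutive bins above `threshold`.

    ASSUMPTIONS: the background is subtracted from every bin of a burst, and a
    burst must span at least `min_bins` bins; the paper does not specify either.
    """
    counts = np.asarray(counts, dtype=float)
    above = counts > threshold
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], above, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)  # exclusive end index of each run
    areas = [counts[a:b].sum() - background * (b - a)
             for a, b in zip(starts, stops) if b - a >= min_bins]
    return np.array(areas)

# Hypothetical trace: flat background of 5 counts per bin with two bursts.
trace = [5] * 10 + [20, 30, 20] + [5] * 10 + [50, 60, 50, 40] + [5] * 5
background = estimate_background(trace, level=8)
areas = burst_areas(trace, background, threshold=6)

# Burst-size histogram with the 10 pe bin width used in the figures.
hist, bin_edges = np.histogram(areas, bins=np.arange(0, areas.max() + 10, 10))
print(background, areas)
```

For the synthetic trace, the two bursts integrate to 55 and 180 photoelectrons above background.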
Histograms were fit to a sum of Gaussians plus an exponentially decaying background (Figs 1–4 and Fig. 6). Burst size means (centroids of histogram peaks) of the DNA standards determined by the fit were plotted versus the fragment lengths and were fit by linear regression. The unknown DNA sizes were determined from their burst size means and the linear regression function.

Optical saturation measurements

Detector saturation represents one potential problem when measuring the photon bursts of large DNA fragments. To characterize the saturation limit of our PMT, histograms of burst sizes were obtained from yellow-green Fluoresbrite beads with different neutral density filters (OD = 0–2.0) placed in the detection path to attenuate the fluorescence. The logarithm of the burst size mean (pe/ms) was plotted versus OD (a linear plot is expected if the detector is not saturating). We found that our detector did not saturate when the burst size mean was below 4000 pe/ms. The largest DNA we sized (T4 DNA) has a burst size mean of ∼2700 pe/ms, well below the saturation point.

RESULTS

Sizing of PAC clone Cl-29 by FCM

Figure 1a shows the histogram of a sample containing a mixture of TOTO-1 stained λ KpnI digest (1.5, 17.1 and 29.9 kb), λ DNA (48.5 kb), T4 DNA (167 kb) and supercoiled PAC clone Cl-29. λ KpnI digest, λ DNA and T4 DNA were used as size standards.

Figure 1. (a) Histogram of the fluorescence burst sizes of intact supercoiled clone Cl-29. λ KpnI digest, λ DNA and T4 DNA were used as size standards. The bin width was 10 pe. The histogram was fit to a sum of 5 Gaussians plus an exponentially decaying background. Experimental conditions: transit time 1.6 ms, laser power 30 mW. (b) Plot of burst size means versus fragment lengths. The means of DNA standard peaks (17.1, 29.9, 48.5, 167 kb) were fit by linear regression. The correlation coefficient of the linear fit is 0.99997. The slope is 27.0 ± 0.2 pe/kb and the intercept is –28 ± 14 pe.
The Cl-29 size of 88.9 ± 0.8 kb was calculated using the linear regression function and the burst size mean as listed in Table 1.

The histogram is the result of analyzing 134 s of data obtained from ∼5000 DNA fragments (this is <1 pg of DNA). Five fragments were resolved while, in these data, the signature of the 1.5 kb fragment was masked by the background due to residual scattered light, fluorescent impurities, and debris due to shearing of the large fragments. The histogram was fit to a sum of 5 Gaussians [Ai exp{–[(x – xi)/σi]²/2}, where Ai is the amplitude of a given peak, xi is the burst size mean, σi is the standard deviation, x is the burst size, and i = 1–5] and a decaying exponential [α exp(–βx), where α and β are constants], with Ai, xi, σi, α and β as fitting parameters. The results are summarized in Table 1 and the resulting curve is shown in Figure 1a. The burst size means from the Gaussian fits were plotted versus the lengths of the DNA standards (17.1, 29.9, 48.5, 167 kb) and were fit by a linear regression (Fig. 1b). The linear correlation coefficient is 0.99997; the slope of the resulting line is 27.0 ± 0.2 pe/kb and the intercept is –28 ± 14 pe. The uncertainties are the standard deviations of the resulting parameters as reported by Sigmaplot. Deviations of the measured values from the fitted line are also listed in Table 1. The average absolute deviation is 1.7%. The unknown size of Cl-29 was calculated using the linear regression function and the burst size mean of Cl-29 from the Gaussian fit as listed in Table 1. The obtained Cl-29 size is 88.9 ± 0.8 kb. Here, the standard deviation of the clone size was calculated using error propagation theory.

Table 1.
Parameters obtained from the histograms of sizing PAC clone Cl-29 (Figs 1 and 2)

| Sample | Fragment length (kb) | Amplitude (Ai)ᵇ | Mean (xi)ᵇ | Standard deviation (σi)ᵇ | CV (%) (σi/xi) | Shot noise (%) (√xi/xi) | Deviation from linear fit (%)ᵃ |
|---|---|---|---|---|---|---|---|
| Figure 1: sizing intact Cl-29 | 17.1 | 35.3 ± 1.3 | 453.4 ± 1.3 | 29.8 ± 1.3 | 6.6 | 4.7 | –4.5 |
| | 29.9 | 26.7 ± 1.2 | 771.7 ± 1.9 | 38.1 ± 1.9 | 4.9 | 3.6 | 1.0 |
| | 48.5 | 30.2 ± 1.0 | 1266.7 ± 2.0 | 54.5 ± 2.0 | 4.3 | 2.8 | 1.2 |
| | 88.9 | 23.3 ± 0.7 | 2372.3 ± 3.7 | 108.2 ± 3.8 | 4.5 | 2.1 | |
| | 167 | 28.3 ± 0.6 | 4486.9 ± 3.2 | 122.1 ± 3.3 | 2.7 | 1.5 | –0.1 |
| Figure 2: sizing Cl-29 insert and vector | 16.2 | 14.6 ± 0.8 | 358.4 ± 2.1 | 32.8 ± 2.2 | 9.1 | 5.3 | |
| | 48.5 | 17.1 ± 0.7 | 1066.8 ± 2.3 | 50.3 ± 2.3 | 4.7 | 3.1 | |
| | 73.7 | 9.7 ± 0.6 | 1617.9 ± 4.5 | 63.5 ± 4.6 | 3.9 | 2.5 | |
| | 167 | 13.3 ± 0.5 | 3663.2 ± 4.3 | 109.2 ± 4.4 | 3.0 | 1.6 | |

ᵃThe deviation of the data from the linear regression fit is defined as: (fitted burst size – measured burst size)/fitted burst size.
ᵇThe uncertainties are the standard deviations of the resulting parameters as reported by Sigmaplot.

Table 2. Parameters obtained from the histograms of sizing PAC clone Cl-26 (Figs 3, 4)

| Sample | Fragment length (kb) | Amplitude (Ai) | Mean (xi) | Standard deviation (σi) |
|---|---|---|---|---|
| Figure 3: sizing intact Cl-26 | 17.1 | 76.1 ± 2.2 | 328.3 ± 0.8 | 23.4 ± 0.8 |
| | 29.9 | 38.8 ± 1.8 | 540.3 ± 1.8 | 34.4 ± 1.9 |
| | 48.5 | 41.7 ± 1.5 | 871.5 ± 2.1 | 52.4 ± 2.2 |
| | 119.1 | 18.0 ± 1.1 | 2129.4 ± 6.7 | 99.3 ± 7.1 |
| | 167 | 11.8 ± 0.9 | 2981.8 ± 12.4 | 145.1 ± 13.0 |
| Figure 4: sizing Cl-26 insert and vector | 15.5 | 35.4 ± 2.1 | 236.6 ± 1.9 | 28.9 ± 2.1 |
| | 48.5 | 103.4 ± 1.6 | 764.6 ± 0.9 | 50.0 ± 0.9 |
| | 105.3 | 15.3 ± 1.1 | 1673.6 ± 8.2 | 98.0 ± 8.7 |
| | 167 | 17.8 ± 0.9 | 2658.4 ± 8.7 | 151.7 ± 9.4 |

Supercoiled PAC clone Cl-29 was digested with EagI. This enzyme cuts at the vector and insert junction sites (18). The released linear restriction fragments were sized by FCM with λ DNA and T4 DNA as size standards and the results are shown in Figure 2a. The histogram was fit to a sum of 4 Gaussians plus a decaying exponential.
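The calibration reported for intact Cl-29 (Fig. 1b) can be reproduced from the burst size means in Table 1. The sketch below uses an unweighted least-squares line from `numpy.polyfit`; whether the original Sigmaplot fit was weighted is not stated, so this is an assumption, and the variable names are illustrative.

```python
import numpy as np

# Standards from Table 1 (Figure 1): fragment length (kb) and burst size mean (pe).
lengths_kb = np.array([17.1, 29.9, 48.5, 167.0])
means_pe = np.array([453.4, 771.7, 1266.7, 4486.9])

# Unweighted linear regression: mean = slope * length + intercept.
slope, intercept = np.polyfit(lengths_kb, means_pe, 1)

# Deviation as defined in the Table 1 footnote: (fitted - measured) / fitted.
fitted = slope * lengths_kb + intercept
deviation_pct = 100.0 * (fitted - means_pe) / fitted
mean_abs_deviation_pct = np.abs(deviation_pct).mean()

# Invert the calibration for the unknown clone (burst size mean 2372.3 pe).
cl29_kb = (2372.3 - intercept) / slope

print(round(slope, 1), round(intercept), np.round(deviation_pct, 1), round(cl29_kb, 1))
```

Under these assumptions the fit returns a slope near 27.0 pe/kb, an intercept near –28 pe, an average absolute deviation of about 1.7%, and a Cl-29 size of about 88.9 kb, in line with the values reported in the text.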
The resulting parameters Ai, xi and σi (i = 1–4) are summarized in Table 1 and the resulting curve is shown in Figure 2a. The burst size means from the Gaussian fits were plotted versus DNA fragment lengths, and a line was drawn through the points of the DNA standards of 48.5 and 167 kb (Fig. 2b). The slope of the resulting line is 21.9 ± 0.1 pe/kb and the intercept is 4 ± 4 pe. The unknown insert and vector sizes were calculated using the function of the resulting line and the burst size means from the Gaussian fits as listed in Table 1. The sizes obtained are 73.7 ± 0.4 kb for the Cl-29 insert and 16.2 ± 0.2 kb for the Cl-29 vector. The sum of the Cl-29 insert and its vector, 89.9 ± 0.5 kb, agrees within experimental error with the size obtained for the intact supercoiled Cl-29 (88.9 ± 0.8 kb, Fig. 1). The size of the PAC vector (pCYPAC2) was treated as unknown here. A closely related PAC vector (pCYPAC1) has a reported size of ∼17 kb (18).

Sizing of PAC clone Cl-26 by FCM

The same approach was used to determine the sizes of supercoiled PAC clone Cl-26 and its linear EagI restriction fragments. Figure 3a shows the histogram of a sample of TOTO-1 stained supercoiled Cl-26 with λ KpnI digest, λ DNA and T4 DNA as size standards. Figure 3b shows the plot of the burst size means from the Gaussian fits versus the DNA fragment lengths. The Gaussian fit parameters Ai, xi and σi (i = 1–5) are summarized in Table 2 and the fitting curve is shown in Figure 3a. The linear correlation coefficient of Figure 3b is 0.99998; the slope of the linear regression line is 17.8 ± 0.1 pe/kb and the intercept is 15 ± 7 pe. The obtained supercoiled Cl-26 size is 119.1 ± 0.9 kb. Figure 4a shows the histogram of a sample of TOTO-1 stained EagI restriction fragments of Cl-26 with λ DNA and T4 DNA as size standards. Figure 4b shows the plot of the burst size means from the Gaussian fits versus the DNA fragment lengths.
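For the restriction digests only two standards (λ and T4 DNA) are present, so the calibration is a line through two points. The sketch below applies this to the Cl-29 digest using the Table 1 means. Combining the insert and vector uncertainties in quadrature is an assumption made here; the text says only that error propagation was used. The function name is illustrative.

```python
import math

def two_point_line(x1, y1, x2, y2):
    """Slope and intercept of the straight line through two calibration points."""
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1

# Standards in the Cl-29 digest (Table 1, Figure 2): lambda (48.5 kb) and T4 (167 kb).
slope, intercept = two_point_line(48.5, 1066.8, 167.0, 3663.2)

# Convert the two unknown burst size means (pe) to fragment lengths (kb).
insert_kb = (1617.9 - intercept) / slope
vector_kb = (358.4 - intercept) / slope

# ASSUMPTION: independent uncertainties add in quadrature (0.4 kb and 0.2 kb from the text).
sum_sigma_kb = math.hypot(0.4, 0.2)

print(round(slope, 1), round(intercept), round(insert_kb, 1), round(vector_kb, 1))
print(round(insert_kb + vector_kb, 1), round(sum_sigma_kb, 2))
```

This gives a slope near 21.9 pe/kb and an intercept near 4 pe, and sizes close to the reported 73.7 kb insert and 16.2 kb vector. The quadrature sum of the uncertainties is about 0.45 kb, comparable to the ± 0.5 kb quoted for the insert plus vector.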
The Gaussian fit parameters Ai, xi and σi (i = 1–4) are summarized in Table 2 and the fitting curve is shown in Figure 4a. The slope of the line drawn through the points of the size standards in Figure 4b is 16.0 ± 0.1 pe/kb and the intercept is –10 ± 4 pe. The sizes obtained are 105.3 ± 0.9 kb for the Cl-26 insert and 15.5 ± 0.3 kb for the Cl-26 vector. The sum of the Cl-26 insert and its vector, 120.8 ± 1.0 kb, agrees within experimental error with the size obtained for the intact supercoiled Cl-26 (119.1 ± 0.9 kb, Fig. 3).

Figure 2. (a) Histogram of the fluorescence burst sizes of EagI restriction fragments of Cl-29. λ DNA and T4 DNA were used as size standards. The bin width was 10 pe. The histogram was fit to a sum of 4 Gaussians plus an exponentially decaying background. Experimental conditions: transit time 1.4 ms, laser power 30 mW. (b) Plot of burst size means versus fragment lengths. A line was drawn through the points of DNA standards (48.5, 167 kb). The slope of this line is 21.9 ± 0.1 pe/kb and the intercept is 4 ± 4 pe. The insert (73.7 ± 0.4 kb) and vector (16.2 ± 0.2 kb) sizes were calculated using the function defined by the line and burst size means as listed in Table 1.

Sizing of supercoiled and linear PAC clones by PFGE

To check the validity of PAC clone sizing by FCM, the intact supercoiled clones isolated from E. coli host cells and their linear restriction fragments were also sized by PFGE. Approximately 0.2 µg of clone DNA was loaded into each gel well. The pulsed-field gel was run for 15 h with switch times ramped from 2.0 to 30 s. Figure 5 shows a photograph of the ethidium bromide stained gel.
Lanes 1, 6, 7, 16 are Low Range PFG markers (sizes indicated on the figure); lanes 2 and 8 are intact supercoiled Cl-26 from two different preps; lanes 3 and 9 are EagI restriction fragments of intact supercoiled Cl-26, showing two bands, corresponding to the linear insert and vector; lanes 4 and 10 are intact supercoiled Cl-29 from two different preps; lanes 5 and 11 are EagI restriction fragments of intact supercoiled Cl-29; lanes 12–15 are not relevant. To estimate the sizes of DNA fragments, the migration distances of all the bands from the loading wells were measured using a ruler and the lengths of markers were plotted versus their corresponding migration distances. The resulting plot was fit to a fifth-order polynomial function and the sizes of the clones were estimated using their migration distances and the polynomial function obtained from the fit to the markers. The estimated linear Cl-29 insert size is 71.0 ± 7.1 kb, where the uncertainty represents a typical 10% error quoted for PFGE (6). The obtained Cl-29 insert size is in good agreement with that measured by FCM (73.7 ± 0.4 kb, Fig. 2). The estimated linear Cl-26 insert size from the gel is 107.1 ± 10.7 kb, which is also in good agreement with that measured by FCM (105.3 ± 0.9 kb, Fig. 4).

Figure 3. (a) Histogram of the fluorescence burst sizes of intact supercoiled clone Cl-26. λ KpnI digest, λ DNA and T4 DNA were used as size standards. The bin width was 10 pe. The histogram was fit to a sum of 5 Gaussians plus an exponentially decaying background. Experimental conditions: transit time 1.1 ms, laser power 30 mW. (b) Plot of burst size means versus fragment lengths. The means of DNA standard peaks (17.1, 29.9, 48.5, 167 kb) were fit by linear regression. The correlation coefficient of the linear fit is 0.99998. The slope is 17.8 ± 0.1 pe/kb and the intercept is 15 ± 7 pe. The Cl-26 size of 119.1 ± 0.9 kb was calculated using the linear regression function and the burst size mean as listed in Table 2.
The estimated size of the linear vector is 16.7 ± 1.7 kb, which also agrees with that determined by FCM (16.2 ± 0.2 kb, Fig. 2; 15.5 ± 0.3 kb, Fig. 4). The intact supercoiled Cl-26 and Cl-29 clones did not migrate appreciably, which is not surprising since large supercoiled DNA migrates anomalously in gel electrophoresis (5,24). Sizing of PAC clones by PFGE requires enzyme digestion to linearize supercoiled DNA.

Sizing of PAC clones directly from the PAC clone library by FCM

Figure 6 shows four representative histograms of sizing clones directly from the PAC clone library. Here λ KpnI digest, λ DNA and T4 DNA were used as markers (same as Figs 1 and 3). The data analysis was the same as that employed to analyze data for intact supercoiled Cl-29 and Cl-26. The obtained sizes for supercoiled Cl-7, Cl-10, Cl-32 and Cl-56 are listed in Table 3. The insert sizes, calculated by subtracting the vector size (15.9 ± 0.4 kb, Figs 2 and 4), are in agreement with those obtained by PFGE within experimental uncertainty (Table 3). The apparent systematic discrepancy derived from these two methods is not significant: in general, both positive and negative deviations are observed (see above section).

Table 3. Comparison of the sizes of clones from the PAC library measured by FCM and PFGE

| Clone | By FCM (insert + vector, kb) | By FCM (insert only, kb) | By PFGE (insert only, kb) |
|---|---|---|---|
| Cl-7 | 105.0 ± 1.7 | 89.1 ± 1.7 | 97 ± 10 |
| Cl-10 | 101.7 ± 1.8 | 85.8 ± 1.8 | 93 ± 9 |
| Cl-32 | 95.5 ± 0.8 | 79.6 ± 0.9 | 81 ± 8 |
| Cl-56 | 103.2 ± 1.9 | 87.3 ± 1.9 | 94 ± 9 |

Figure 5. Sizing of PAC clones by pulsed field gel electrophoresis (PFGE). Detailed conditions are in Materials and Methods. The gel was stained with ethidium bromide.
Lanes 1, 6, 7, 16: Low Range PFG markers (sizes on figure); lanes 2 and 8, intact supercoiled Cl-26 from two different preps; lanes 3 and 9, EagI restriction fragments of intact supercoiled Cl-26, showing two bands at 107.1 and 16.7 kb; lanes 4 and 10, intact supercoiled Cl-29 from two different preps; lanes 5 and 11, EagI restriction fragments of intact supercoiled Cl-29, showing two bands at 71.0 and 16.7 kb; lanes 12–15, not relevant.

Figure 4. (a) Histogram of the fluorescence burst sizes of EagI restriction fragments of Cl-26. λ DNA and T4 DNA were used as size standards. The bin width was 10 pe. The histogram was fit to a sum of 4 Gaussians plus an exponentially decaying background. Experimental conditions: transit time 1.4 ms, laser power 20 mW. (b) Plot of burst size means versus fragment lengths. A line was drawn through the points of DNA standards (48.5, 167 kb). The slope of the line is 16.0 ± 0.1 pe/kb and the intercept is –10 ± 4 pe. The insert (105.3 ± 0.9 kb) and vector (15.5 ± 0.3 kb) sizes were calculated using the function defined by the line and burst size means as listed in Table 2.

Estimating DNA concentration based on the histogram peak areas

Counts of the number of individual fragments are obtained from FCM data. The relative numbers counted for different DNA fragments are proportional to the molar concentrations of those fragments in the sample solution. Therefore, it is possible to estimate the unknown clone concentration based on its relative peak area in the histogram. Seven sets of histogram data were examined and the relative peak areas of λ and λ KpnI digest (48.5, 29.9 and 17.1 kb) were calculated to verify that the peak areas correspond to the molar concentrations in the sample. The results are listed in Table 4. The two approaches agree within listed uncertainties, which means the accuracy of estimating the DNA concentration based on the histogram peak area analysis is >80% (see column 4 of Table 4).
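The peak-area reasoning in this section can be sketched with the values the section itself quotes: the Table 4 ratios, and the Figure 1 area ratio and λ concentration used in its worked example. The average base-pair mass of 650 g/mol is an assumption not stated in the paper, and the 157.5 µl staining volume is simply the sum of the volumes listed under Sample preparation. Variable names are illustrative.

```python
# Table 4: ratios normalized to lambda DNA (48.5 kb), averaged over seven data sets.
molar_ratio = {17.1: 0.59, 29.9: 0.59, 48.5: 1.0}
area_ratio = {17.1: 0.70, 29.9: 0.63, 48.5: 1.0}

# Column 4 of Table 4: how closely the counted peak areas track the molar concentrations.
agreement = {kb: area_ratio[kb] / molar_ratio[kb] for kb in molar_ratio}

# Worked example: the Cl-29 peak area is 1.6 times the lambda peak area in Figure 1.
LAMBDA_MOLAR = 9.5e-15          # M, lambda DNA in the final sample solution
clone_molar = 1.6 * LAMBDA_MOLAR

# Back-calculate the stock concentration of the 1 ul of clone solution.
BP_MASS_G_PER_MOL = 650.0       # ASSUMPTION: average mass of one base pair
DILUTION = 150                  # final dilution of the staining mixture
STAIN_VOLUME_L = 157.5e-6       # sum of the volumes listed under Sample preparation
CLONE_KB = 88.9

clone_mol = clone_molar * DILUTION * STAIN_VOLUME_L
clone_ng = clone_mol * CLONE_KB * 1e3 * BP_MASS_G_PER_MOL * 1e9
stock_ug_per_ml = clone_ng / 1.0  # ng in 1 ul is numerically ug/ml

print({kb: round(v, 2) for kb, v in agreement.items()})
print(clone_molar, round(stock_ug_per_ml, 1))
```

With these inputs the agreement factors come out near 1.19 and 1.07, the clone concentration in the final solution is about 1.5 × 10⁻¹⁴ M, and the stock concentration is about 21 µg/ml. These match the values reported in this section, provided no DNA was lost to shearing.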
To show an example of using this analysis to estimate the unknown clone concentration, the ratio of the histogram peak areas for fragments with lengths 17.1, 29.9, 48.5 and 88.9 kb in Figure 1 is computed to be 0.6:0.6:1:1.6. Based on the molar concentrations of λ (9.5 × 10⁻¹⁵ M) and λ KpnI digest (5.6 × 10⁻¹⁵ M) in the final sample solution, the clone concentration in the final sample solution is estimated to be (1.5 ± 0.3) × 10⁻¹⁴ M. Assuming that DNA shearing did not occur during sample preparation, the concentration of the original 1 µl clone solution used to make up the staining solution is then estimated to be 21 ± 4 µg/ml.

Table 4. Correspondence between the ratio of the histogram peak areas and the ratio of the molar concentrations of DNA fragments in the sample

| Fragment length (kb) | Ratio of the molar concentrationsᵃ | Ratio of the histogram peak areasᵇ | Ratio of the histogram peak areas / ratio of the molar concentrations |
|---|---|---|---|
| 17.1 | 0.59 | 0.70 ± 0.10 | 1.19 ± 0.17 |
| 29.9 | 0.59 | 0.63 ± 0.08 | 1.07 ± 0.14 |
| 48.5 | 1 | 1 | 1 |

ᵃMolar concentrations are given in the text: the ratios of the molar concentrations were normalized to the molar concentration of λ DNA.
ᵇRatios of the histogram peak areas were calculated and averaged from seven sets of data, and were normalized to the histogram peak areas of λ DNA. Uncertainties listed are the standard deviations.

Figure 6. Histograms of the fluorescence burst sizes of the clones directly from a PAC clone library. The bin widths were all 10 pe. λ KpnI digest, λ DNA and T4 DNA were used as markers. The data analysis was the same as that of Figures 1 and 3.
(a) Cl-7, transit time 1.4 ms, laser power 30 mW, obtained Cl-7 size: 105.0 ± 1.7 kb; (b) Cl-10, transit time 1.2 ms, laser power 30 mW, obtained Cl-10 size: 101.7 ± 1.8 kb; (c) Cl-32, transit time 1.1 ms, laser power 30 mW, obtained Cl-32 size: 95.5 ± 0.8 kb; (d) Cl-56, transit time 1.2 ms, laser power 30 mW, obtained Cl-56 size: 103.2 ± 1.9 kb.

DISCUSSION

Accuracy

Sizing of large DNA fragments by FCM is more accurate (∼2% uncertainty) than by PFGE, which is generally considered to have a 10% uncertainty in estimating sizes (6). This conclusion is based upon the following: (i) the average absolute deviation from the linear fits for large DNA standards is 1.7% [Fig. 1; the 17.1 kb point, which has the largest deviation (–4.5%), is always above the line as observed previously (10,12)]; (ii) the results are reproducible from run-to-run and day-to-day: the average precision of sizing the same PAC clone is 1.7%; and (iii) the signal in FCM (photon burst) is linear with fragment length, while the signal in PFGE (migration distance) is non-linear with DNA length.

Resolution

When the instrument is optimized, photon shot noise (√xi/xi) accounts for half or more of the observed coefficient of variation (CV) (Table 1). Larger fragments have relatively better resolution since the shot noise is relatively smaller due to their larger burst sizes. For a given fragment, a larger burst size will result in a better CV. Larger burst sizes could be attained by increasing the laser power or using brighter dyes. The signal count rate (pe/ms) achievable is ultimately limited by detector saturation (see Materials and Methods section). Larger burst sizes could be obtained without saturation by increasing the transit time. Our resolution on large fragments (2–5% for >50 kb DNA) is comparable with or better than that of PFGE (25).

Sensitivity

Sensitivity is an important issue in characterizing clones obtained in minute quantities (such as single copy PAC and BAC clones).
As little as 0.4 pg of total DNA (or ∼0.1 pg of clone DNA) is analyzed to generate a histogram that allows accurate sizing of clones. Although the staining solution was made up at ∼10⁻¹¹ M in the currently reported experiments, good results were obtained in other work for sizing a mixture of λ DNA and λ KpnI digest stained directly at ∼10⁻¹³ M with TOTO-1. Only ∼2 ng of clone DNA are needed to make up 1 ml of 10⁻¹³ M working solution, of which only ∼0.5 µl is analyzed to generate one histogram. Improvements in sample handling will reduce further the volume of solution needed for analysis. The amount of DNA loaded per gel well in Figure 5 is ∼0.2 µg. Thus, sizing of DNA by FCM requires orders of magnitude less DNA than does PFGE.

Signal linear with the number of fragments

FCM analyzes and classifies DNA fragments one at a time. The relative numbers counted for different DNA fragments correspond to the molar concentrations of the fragments in the sample solution. Unknown DNA concentrations can be estimated with >80% accuracy based on the analysis of their relative peak areas in the histogram.

Effect of DNA conformation on sizing

The sizes of intact supercoiled Cl-29 and Cl-26 obtained by FCM are in good agreement with the sums of the sizes of their linear restriction fragments (see Results section). These results demonstrate that FCM can be applied to size both linear and supercoiled clones. Two implications can be drawn from these results: (i) DNA conformation (supercoiled or linear) does not affect the stoichiometric binding of TOTO-1 to DNA double strands; and (ii) DNA conformation does not affect the linear relationship between photon burst size and the number of dye molecules associated with the DNA strands.
In contrast, large supercoiled DNA migrates anomalously in PFGE, and supercoiled clones must be digested enzymatically into linear molecules before they can be sized accurately.

CONCLUSIONS

We have demonstrated a flow cytometry-based technique to size large DNA fragments and its application to the characterization of P1 artificial chromosome clones. This technique is superior to the technique most commonly used at present (PFGE) for the following reasons: (i) the data are acquired rapidly (<3 min compared with 15 h for PFGE); (ii) the technique is sensitive (<1 pg of DNA is analyzed compared with 0.2 µg of DNA for PFGE); (iii) the measurement is linear with both fragment length and number of fragments; and (iv) the results are conformation independent, as both linear and supercoiled PAC clones were sized accurately. We anticipate that this method will play an important role in characterizing PAC/BAC clone libraries that are widely used in gene mapping, sequencing, and other genetic analyses. The primary challenge of sizing large DNA in solution is to avoid significant DNA breakage during sample preparation and handling. Since DNA as large as 1 × 10⁶ bp has been handled successfully in the genome community (with YAC cloning systems), we anticipate our DNA sizing range can be pushed well beyond 167 kb. This method also holds the potential for scale-up through multiplicity and automation.

ACKNOWLEDGEMENTS

This work was supported by internal funding from Los Alamos National Laboratory, by the DOE funded Los Alamos Center for Human Genome Studies (W-7405-ENG-36), and by the NIH funded National Flow Cytometry Resource (RR-01315). We thank Harvey Nutter for his valuable technical assistance.

REFERENCES

1 Sambrook,J., Fritsch,E.F. and Maniatis,T. (1989) Molecular Cloning: A Laboratory Manual, 2nd edition. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, pp. 6.5, pp. 6.37.
2 Cantor,C.R., Smith,C.L. and Mathew,M.K. (1988) Annu. Rev. Biophys. Biophys.
Chem., 17, 287–304.
3 Schwartz,D.C. and Cantor,C.R. (1984) Cell, 37, 67–75.
4 Carle,G.F., Frank,M. and Olson,M.V. (1986) Science, 232, 65–68.
5 Beverley,S.M. (1988) Nucleic Acids Res., 16, 925–939.
6 Guo,X.H., Huff,E.J. and Schwartz,D.C. (1992) Nature, 359, 783–784.
7 Cai,W., Aburatani,H., Stanton,V.P., Housman,D.E., Wang,Y.K. and Schwartz,D.C. (1995) Proc. Natl Acad. Sci. USA, 92, 5164–5168.
8 Kim,Y.S. and Morris,M.D. (1994) Anal. Chem., 66, 3081–3085.
9 Kim,Y.S. and Morris,M.D. (1995) Anal. Chem., 67, 784–786.
10 Goodwin,P.M., Johnson,M.E., Martin,J.C., Ambrose,W.P., Marrone,B.L., Jett,J.H. and Keller,R.A. (1993) Nucleic Acids Res., 21, 803–806.
11 Johnson,M.E., Goodwin,P.M., Ambrose,W.P., Martin,J.C., Marrone,B.L., Jett,J.H. and Keller,R.A. (1993) Proc. SPIE-Int. Soc. Opt. Eng., 1895, 69–78.
12 Petty,J.T., Johnson,M.E., Goodwin,P.M., Martin,J.C., Jett,J.H. and Keller,R.A. (1995) Anal. Chem., 67, 1755–1761.
13 Castro,A., Fairfield,F.R. and Shera,E.B. (1993) Anal. Chem., 65, 849–852.
14 Castro,A. and Shera,E.B. (1995) Anal. Chem., 67, 3181–3186.
15 Glazer,A.N. and Rye,H.S. (1992) Nature, 359, 859–861.
16 Gingrich,J.C., Boehrer,D.M., Garnes,J.A., Johnson,W., Wong,B.S., Bergmann,A., Eveleth,G.G., Langlois,R.G. and Carrano,A.V. (1995) Genomics, 32, 65–74.
17 Sternberg,N. (1990) Proc. Natl Acad. Sci. USA, 87, 103–107.
18 Ioannou,P.A., Amemiya,C.T., Garnes,J., Kroisel,P.M., Shizuya,H., Chen,C., Batzer,M.A. and DeJong,P.J. (1994) Nature Genet., 6, 84–89.
19 Shizuya,H., Birren,B., Kim,U.J., Mancino,V., Slepak,T., Tachiiri,Y. and Simon,M. (1992) Proc. Natl Acad. Sci. USA, 89, 8794–8797.
20 Burke,D.T., Carle,G.F. and Olson,M.V. (1987) Science, 236, 806–812.
21 Ashworth,L.K., Alegria-Hartman,M., Burgin,M., Devlin,L., Carrano,A.V. and Batzer,M.A. (1995) Anal. Biochem., 224, 564–571.
22 Sheng,Y.L., Mancino,V. and Birren,B. (1995) Nucleic Acids Res., 23, 1990–1996.
23 Lindmo,T., Peters,D.C. and Sweet,R.G. (1990) In Melamed,M.R., Lindmo,T. and Mendelsohn,M.L.
(eds), Flow Cytometry and Sorting, 2nd edition. Wiley-Liss, Inc., New York, pp. 159.
24 Wang,M. and Lai,E. (1995) Electrophoresis, 16, 1–7.
25 Mathew,M.K., Smith,C.L. and Cantor,C.R. (1988) Biochemistry, 27, 9204–9210.
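
The concentration estimate discussed under "Signal linear with the number of fragments" can be illustrated with the numbers reported in the text: peak-area ratios of 0.6:0.6:1:1.6 for the 17.1, 29.9, 48.5 and 88.9 kb peaks, and standard concentrations of 9.5 × 10⁻¹⁵ M (λ) and 5.6 × 10⁻¹⁵ M (λ KpnI digest). The following is a minimal sketch only. It assumes that the 88.9 kb peak is the clone, that counts scale linearly with molar concentration, and that the per-standard estimates are simply averaged; the text does not state how the reported value was actually computed, and the function and variable names are hypothetical.

```python
# Sketch: estimate an unknown molar concentration from histogram peak areas,
# using co-analyzed standards of known concentration (values taken from the text).

LAMBDA_CONC_M = 9.5e-15  # intact lambda DNA (48.5 kb), mol/l
KPNI_CONC_M = 5.6e-15    # lambda KpnI digest fragments (17.1 and 29.9 kb), mol/l

# Peak areas normalized to the lambda peak (ratio 0.6:0.6:1:1.6).
peak_area = {17.1: 0.6, 29.9: 0.6, 48.5: 1.0, 88.9: 1.6}

# Known concentrations of the standard fragments, keyed by length in kb.
standard_conc = {17.1: KPNI_CONC_M, 29.9: KPNI_CONC_M, 48.5: LAMBDA_CONC_M}


def estimate_concentration(unknown_kb, peak_area, standard_conc):
    """Return (per-standard estimates, mean estimate) in mol/l.

    Each estimate scales a standard's known concentration by the ratio of
    the unknown peak area to that standard's peak area.
    """
    estimates = [
        peak_area[unknown_kb] / peak_area[kb] * conc
        for kb, conc in standard_conc.items()
    ]
    return estimates, sum(estimates) / len(estimates)


# Assumption: the 88.9 kb peak is the clone of unknown concentration.
estimates, mean_conc = estimate_concentration(88.9, peak_area, standard_conc)
print(f"clone concentration estimate: {mean_conc:.2e} M")
```

Under these assumptions the sketch gives about 1.5 × 10⁻¹⁴ M, in line with the (1.5 ± 0.3) × 10⁻¹⁴ M reported in the text. The spread among the per-standard estimates is not an uncertainty analysis and does not reproduce the quoted ± 0.3.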