© 1990 Oxford University Press

Nucleic Acids Research, Vol. 18, No. 15, 4453

Random-breakage mapping, a rapid method for physically locating an internal sequence with respect to the ends of a DNA molecule

John C. Game1, Maren Bell1, Jeff S. King2 and Robert K. Mortimer1,2

1Division of Cellular and Molecular Biology, Lawrence Berkeley Laboratory, Berkeley, CA 94720 and 2Department of Molecular and Cell Biology, Division of Biophysics and Cell Physiology, University of California, Berkeley, CA 94720, USA

Received April 30, 1990; Revised and Accepted June 25, 1990

ABSTRACT

We describe a method for determining the position of a cloned internal sequence with respect to the ends of a DNA molecule. The molecules are randomly broken at low frequency and the fragments are subjected to electrophoresis. Southern hybridization using the cloned DNA as a probe identifies only those fragments containing the sequence. The size distribution of these fragments is such that two threshold changes in intensity of signal are seen in the smear pattern below the unbroken molecules. The positions of the changes represent the distances from the sequence to each molecular end. The intensity changes arise because the natural ends of the molecules influence the fragment distribution obtained. From once-broken molecules, no fragments can arise that contain a given sequence and are shorter than the distance between that sequence and the nearest molecular end. We tested the method by using x-rays to induce breakage in yeast DNA. Genes of independently known position were mapped within whole chromosomes or Not I restriction fragments using Southern blots from gels of irradiated molecules. We present equations to predict fragment distribution as a function of break-frequency and position of the probed sequence.

INTRODUCTION

The techniques of Southern hybridization (1) and conventional or pulsed-field gel electrophoresis (2-8) are widely used to assign cloned DNA sequences to particular chromosomes or DNA restriction fragments. However, locating the position of a given sequence within a chromosome or fragment is more difficult and often requires complex strategies (9-14). Current methods of directly positioning sequences with respect to the ends of chromosomes require transformation with specially constructed molecules containing these sequences (11, 13). We show here that the distance in base pairs between a cloned sequence and the nearest free end of a DNA molecule can in fact easily be determined by probing gels containing randomly broken molecules.

When molecules contain randomly distributed breaks at low frequency, sharp changes in intensity of probe signal are expected at certain positions in the smear pattern below a given DNA species. These positions reveal the distance of the probed sequence from the ends of the molecule. In principle, this is easily understood by considering molecules broken exactly once. Amongst the fragments of such molecules, none shorter than the distance between a sequence and the nearest end of the molecule can contain that sequence. A minimum of two breaks would be required to generate shorter fragments identifiable by the probe. When a low amount of DNA breakage is present (< 1 break/molecule), most fragments will arise from once-broken molecules. Hence, a given probe will mainly identify fragments extending downwards only to a size representing the distance of the gene probed from the nearest end of the molecule. A second change in intensity is expected at a position corresponding to the distance from the gene to the more distant end of the molecule. Molecules broken twice or more will also contribute to these intensity changes.

We tested the proposed method experimentally, using DNA from Saccharomyces cerevisiae. The results presented demonstrate that this is a simple and rapid way to map internal sequences with respect to molecular ends over a wide range of DNA sizes.

MATERIALS AND METHODS

Preparation of high molecular weight DNA

Chromosomal-sized DNA from the haploid, prototrophic S. cerevisiae strain X2180-1A was prepared in low melting point agarose, by adapting procedures described by Burgers et al. (15) and Van Ommen et al. (16). Cells were spheroplasted before they were encased in agarose and no 2-mercaptoethanol was used. 1.5 ml of a fresh overnight culture grown in YPD at 30°C were harvested for 3 seconds in a microfuge, washed with 1 ml EDTA pH 7.5 and resuspended in 1 ml SCE (1 M sorbitol, 0.1 M Na citrate, 10 mM EDTA pH 7.5) containing 200 U lyticase (Sigma). Cells were incubated in SCE/lyticase at 30°C for 20-25 minutes and harvested at 500 × g for 5 minutes. Spheroplasts were resuspended in 500 µl 1 M sorbitol. An equal volume of molten 1.5% LMP agarose at 47°C in 0.125 M EDTA was added and cells were gently mixed by pipetting. Aliquots of 100 µl were distributed into molds, refrigerated, and the solidified plugs were placed into 5 volumes ES (0.5 M EDTA pH 9.5, 1% Sarcosyl) containing 1 mg/ml proteinase K (Sigma). After ~6 hours of gentle shaking at room temperature, further proteinase K was added to give a total concentration of 2 mg/ml. Plugs were digested overnight at room temperature. They were then washed extensively with TE50 (10 mM Tris HCl pH 7.5, 50 mM EDTA pH 7.5) and stored at 4°C. If a restriction digest was to follow, plugs were incubated with 0.1 mM PMSF (Sigma) for two washes of 1 hour each.

Plasmid sources and preparation of probes

We used four unique sequences as hybridization probes for yeast DNA. A 1.7 kb BamH I fragment containing the HIS3 gene (17) was isolated from plasmid pYAC 5 (18), provided by Maynard Olson.
A 1.1 kb Hind III fragment containing the URA3 gene (19) and a 1.3 kb Nru I-Mlu I fragment containing the RAD51 gene (20) were both isolated from a pUC13-derived plasmid, provided by Mari Aker and George Basile. A 2.1 kb Cla I fragment from the LEU2 region (21) was isolated from plasmid YEP13-RAD54, provided by Garry Cole. Plasmid DNA and electroeluted fragments were prepared by the methods of Birnboim and Doly (22) and Maniatis et al. (23) respectively.

X-ray treatments

A Machlett OEG 60 beryllium window x-ray machine was used at 50 kV and 20 mA to irradiate yeast DNA in agarose plugs. Plugs were equilibrated in TE50 prior to irradiation, since the amount of breakage obtained varies sharply in different buffers. Each agarose plug was turned over after half the x-ray dose, to minimize dose heterogeneity within the agarose caused by shielding. The exact dose rate within the plugs is difficult to determine, since there is some absorption by the agarose, but our best estimate based on observed breakage is that the effective dose rate was approximately 6 kilorads/minute at the settings used.

Electrophoresis

CHEF gels were run using techniques described elsewhere (5, 6). Pulse times of specific gels were chosen to achieve maximum resolution in the size ranges of interest and frequently varied within a single run (see figure legends).

Southern blotting and hybridization

Gels were treated with short-wavelength UV, acid nicked, denatured and neutralized. DNA was transferred by capillary action to nitrocellulose (1) and bound to the membrane using a Stratagene UV-linker. Fragments of the S. cerevisiae HIS3, RAD51, LEU2 and URA3 genes were labeled by random oligonucleotide priming (Amersham). Hybridization was carried out overnight at 42°C in 50% formamide, 0.1% SDS, 3× SSC, 1 mM EDTA pH 7.8, 10 mM Tris HCl pH 7.6, 10× Denhardt's solution, 0.05% Na pyrophosphate and 100 µg/ml sonicated salmon sperm DNA.
Computer analysis

To verify the equations calculated in the appendix, a Modula-2 program was written and run on an Apple Macintosh IIcx computer. The program simulated the breakage of chromosomes by randomly placing a fixed number of breaks on a chromosome containing one hundred possible break sites. Random numbers were generated using the Pascal version of the uniform deviate random number generator RAN1 from Press et al. (24). Typically, 300,000 imaginary chromosomes were broken and the size of the site-containing fragment from each chromosome was recorded. The size distributions of fragments were normalized by dividing the total number of fragments of each size by the number of chromosomes broken. These distributions were compared to those predicted by the equations in the appendix.

[Figure 1: diagram and plot. Y-axis levels 2/L and 1/L; plot labels "Detected fragments from breaks in region D" and "Detected fragments from breaks in region L - D"; x-axis "Size of fragment", marked at D and L - D.]

Figure 1: Illustration of the size distribution of probed fragments arising from a population of once-broken molecules. The upper part represents a chromosome of length L, with a unique sequence (A), a distance D from the nearest end. The symbol -O- represents the centromere. The lower part represents the distribution of fragments containing this sequence versus the size of the fragment. It can be seen that there are no sequence-containing fragments in the size-range 0 to D, and only half as many per unit length in the size-range D to L - D as there are in the size-range L - D to L.

RESULTS

The basis of the method

The method relies on observing the pattern that broken molecules form when resolved according to size by gel electrophoresis. When molecules of a particular DNA species are randomly broken, the resultant fragments will form a smear on the gel below the position of the unbroken molecules. The overall size distribution of fragments will depend on the frequency of breaks, and has been described elsewhere (25).
However, after Southern hybridization, only those fragments that contain the probed sequence are seen. Each such fragment will contribute equal intensity independently of its size, whereas when total DNA is visualized by staining, the intensity of signal will depend on both the size and number of fragments at any point. Fragments that contain a specific sequence will show a size distribution that differs from that of total fragments, if the DNA is linear and the number of breaks per molecule is low. This difference results from the contribution that natural ends of the original molecules make to the distribution. When end effects are significant, the fragment distribution reveals the distance from the ends to the probed sequence.

This can be seen from the fragment distribution expected from molecules broken exactly once, at random positions (see figure 1). No fragment shorter than the distance from a unique sequence to the nearest end of the molecule can contain that sequence, since at least two breaks would be required to generate such a fragment. Hence, the distribution of fragments identified by a probe should show a sharp 'cut-off' in the smear pattern at a position representing the distance between the probed sequence and the nearest end of the molecule. Above this point, the number of fragments containing the sequence will be constant over a certain size range, when only once-broken molecules are considered, since each size will correspond to a break at only one position in the molecule. However, in a larger size range, corresponding to fragments larger than the distance from the sequence to the more distant end of the molecule, we expect twice this frequency of probe-binding fragments of each size. This is because there are now two separate positions in the molecule where breakage will lead to fragments of a specific size, namely one between the probed sequence and its nearest end and another at an equivalent distance from the opposite end of the molecule (see figure 1). Hence, we expect an increase in the amount of bound probe at a position representing the distance from the probed sequence to the furthest end. Above this, the number of sequence-containing fragments will again be constant versus size, until the position of the unbroken molecules is reached. Note that the sum of the sizes represented by the lower and upper intensity changes should equal the size of the unbroken molecules. We show below that these expected intensity changes can readily be observed.

Molecules broken more than once

Randomly broken DNA will always contain some molecules broken more than once. The proportion of total fragments arising from these becomes small at break frequencies significantly less than one per molecule, but clearly there will be a limiting frequency below which breakage will be insufficient to generate a visible smear in the probed gel. Figure 2 illustrates how the fraction of fragments that are derived from once-broken molecules changes with break-frequency. It can be seen that there is a region between about 0.7 and 1.2 breaks per molecule when most molecules are broken (line a), but most fragments are still derived from once-broken molecules (line c). However, in practice, useful information may be obtained from smear patterns from somewhat higher levels of breakage. This is because the fragment distribution from molecules with two to several breaks also shows significant peaks at the same size positions. The degree of peaking is dependent both on the break frequency and the position of the probed sequence within the molecule (see figure 5a and 5b). We investigated the effect of increasing break-frequency on the observed fragment distribution using x-rays, and derived and tested equations to predict the expected distribution for different break-frequencies and gene positions, as discussed below and in the appendix.

[Figure 2: plot. X-axis "Average number of breaks", marked from 1.0 to 5.0.]

Figure 2: Molecular species arising from breakage as a function of average break frequency per molecule. Line (a) shows the fraction of molecules that are broken once or more, line (b) shows the fraction of molecules broken exactly once and line (c) shows the fraction of total fragments arising from exactly once-broken molecules. The abscissa represents the average break frequency per molecule, and it is assumed that there is a Poisson distribution of number of breaks / molecule, randomly positioned.

Experimental tests of the method

a) Whole yeast chromosomes: We used DNA from the yeast Saccharomyces cerevisiae to test the usefulness of probed smear patterns for localizing genes on chromosomes and restriction fragments. We irradiated DNA with x-rays in vitro in agarose plugs at a variety of doses, and subjected this DNA to pulsed-field gel electrophoresis. We blotted the resultant gels to nitrocellulose, and probed these blots with DNA from several separate cloned yeast genes for which we had independent information concerning position. Figure 3 shows results for a single gel, probed consecutively with two different genes. For each probe, it can be seen that at least some lanes show two threshold changes of intensity, as indicated by arrows, at positions that differ for each of the two genes. In each case, the higher x-ray doses show a blurring of the threshold intensity changes and this occurs more rapidly with increasing dose at the upper position than at the lower one. As expected, the threshold changes are most readily seen at lower doses in the case of HIS3, which is on chromosome XV (1140 kb, (26)), than they are for RAD51 on chromosome V, which is much smaller (593 kb, (26)).
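The Poisson assumption behind figure 2, and the size dependence noted for chromosomes XV and V, can be illustrated numerically. The following Python is a minimal sketch and not the authors' program. It computes the quantities plotted as lines (a) and (b) of figure 2 and, assuming that breaks fall uniformly per unit length so that the mean number of breaks at a given dose scales with molecule size (an assumption made here for illustration), compares the two chromosomes at one dose. All function names are hypothetical.

```python
import math


def poisson_fraction(k, mean_breaks):
    """Fraction of molecules with exactly k breaks (Poisson assumption)."""
    return math.exp(-mean_breaks) * mean_breaks ** k / math.factorial(k)


def broken_at_least_once(mean_breaks):
    """Fraction of molecules with one or more breaks (line a of figure 2)."""
    return 1.0 - math.exp(-mean_breaks)


def scaled_mean_breaks(mean_breaks_ref, size_ref_kb, size_kb):
    """Mean breaks for a molecule of size_kb at the dose that gives
    mean_breaks_ref in a molecule of size_ref_kb.
    Assumes breaks are uniform per unit length (illustrative assumption)."""
    return mean_breaks_ref * size_kb / size_ref_kb


if __name__ == "__main__":
    # Lines (a) and (b) of figure 2 at a few average break frequencies.
    for m in (0.5, 0.7, 1.0, 1.2, 2.0):
        print(m, round(broken_at_least_once(m), 3),
              round(poisson_fraction(1, m), 3))

    # Dose at which chromosome V (593 kb) averages one break per molecule.
    m_v = 1.0
    m_xv = scaled_mean_breaks(m_v, 593, 1140)
    print("unbroken fraction, chromosome V :", round(poisson_fraction(0, m_v), 3))
    print("unbroken fraction, chromosome XV:", round(poisson_fraction(0, m_xv), 3))
```

Under these assumptions, a dose that leaves about 37% of chromosome V molecules unbroken leaves only about 15% of chromosome XV molecules unbroken, which is the direction of the difference seen on the blots.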
Note also that the intensity of the band representing unbroken molecules declines much more rapidly with dose for the larger chromosome. Evidence that the intensity changes in figure 3 are consistent in position with the expected physical distances from the telomeres can be seen in table 1. This compares estimates of position for each probed gene from this study with data from L. Riles, J. Dutchik, A. Baktha, A. Link and M. Olson (personal communication) using genomic restriction mapping (10), together with genetic position estimates from Mortimer et al. (26). Our measurements fall within the range expected from the restriction map data for both genes in figure 3 and also for URA3 and LEU2 (blots not shown). There is reasonable agreement when the sizes represented by the two intensity changes are summed and compared to the size of the unbroken molecules on the same gel. We believe these observations indicate that the intensity changes in probed smear patterns do reflect the distance of the probed sequence from the ends of the molecule.

[Figure 3, panels (a) to (c): gel and blot images. Lanes 1 to 10; size markers in kb; x-ray dose in minutes shown beneath each lane.]

Figure 3: Positioning genes on yeast chromosomes by random-breakage mapping. (a) An ethidium-stained pulsed-field gel. Lanes 1, 6 and 10 contain unirradiated λ-ladder DNA to provide size markers. The sample in lane 2 contained yeast DNA that received no x-rays, and the bands represent unbroken chromosomes. The samples in the other lanes contained the same yeast DNA but were treated with x-rays for successively longer times, as shown at the base of each lane. The x-ray dose-rate was approximately 6 kilorads/minute. DNA breakage is revealed at the higher doses by loss of bands and the progressively lower position of the ethidium-stained smear. Smaller yeast chromosomes show preferential survival with respect to dose compared to the larger ones. The gel was run for 29 hours with eight discrete pulse-times ranging from 16 seconds to 95 seconds. Sizes in kb are indicated on the right. (b) A Southern blot of the same gel, probed with a 1.7 kb fragment of the HIS3 gene. The probe identifies unbroken chromosome XV molecules, at a size position of about 1140 kb, as well as a smear pattern of broken fragments that contain the gene. Two discontinuities can be identified in this smear pattern, at positions indicated by arrows at the left and corresponding to 770 kb (arrow A) and 380 kb (arrow B) respectively. The magnitude of each discontinuity is dose-dependent and the upper discontinuity attenuates at a lower dose than the lower discontinuity. The weak band identified above chromosome XV probably represents partial hybridization of the probe to another site in the yeast genome. (c) The same blot, re-probed with a 1.3 kb fragment from the RAD51 gene. Chromosome V is identified at about 593 kb and two discontinuities in the smear pattern can be seen, at ~350 kb (arrow A) and ~235 kb (arrow B), whose sum is equivalent to 585 kb.

Table 1. Comparison of gene-to-telomere distances using genomic restriction mapping (1), random-breakage mapping (2) or genetic mapping methods (3).

Gene probed                        Restriction    Random-breakage   Genetic map
                                   mapping (1)    mapping (2)       position (3)
HIS3 (Chr. XV)
  nearest end                      402 kb         380 kb            116 cM
  furthest end                     768 kb         770 kb            246 cM
  sum                              1170 kb        1150 kb           362 cM
  % distance from nearest end      34.4           33.0              32.0
RAD51 (Chr. V)
  nearest end                      228 kb         235 kb            94 cM
  furthest end                     357 kb         350 kb            137 cM
  sum                              585 kb         585 kb            231 cM
  % distance from nearest end      39.0           40.2              40.1
URA3 (Chr. V)
  nearest end                      121 kb         130 kb            52 cM
  furthest end                     464 kb         444 kb            179 cM
  sum                              585 kb         574 kb            231 cM
  % distance from nearest end      20.7           22.7              22.5
LEU2 (Chr. III)
  nearest end                      120 kb         114 kb            46 cM
  furthest end                     230 kb         241 kb            102 cM
  sum                              350 kb         355 kb            148 cM
  % distance from nearest end      34.3           32.1              31.1

1. See ref. 10. Data provided by L. Riles, J. Dutchik, A. Baktha, A. Link, and M. Olson (personal communication).
2. Data from this study.
3. Data from ref. 26.

b) Not I restriction digests of yeast DNA: Many organisms have chromosomes that are too large to be resolved by contemporary electrophoretic techniques. However, we reasoned that the ability to position genes within large restriction fragments by random-breakage mapping would provide useful information when constructing physical maps. We tested the feasibility of this by irradiating Not I digests of yeast DNA. By analogy with chromosomes the smear patterns of these digests should reveal the distance of the probed sequence from its nearest Not I sites on either side. If digestion is incomplete, overlapping smear patterns from larger molecules (cut at more distant Not I sites) might obscure information about the position of the sequence with respect to the more distant of its two nearest sites. However, incomplete digestion should not interfere with estimates of the nearest-site distance, provided that a significant number of molecules are cut at this site.

Figure 4a shows an ethidium-stained gel containing lanes of x-irradiated yeast DNA which had been previously digested with Not I. Figure 4b shows the same gel blotted and probed with the HIS3 fragment used in figure 3b. It can be seen that the probe identifies a ~270 kb Not I fragment, which is progressively broken as the x-ray dose increases. In the irradiated lanes (4 to 8), the smear pattern identified by the probe shows a sharp cutoff at a position corresponding to ~64 kb. Independent data provided by L. Riles, J. Dutchik, A. Baktha, A. Link and M. Olson and based on genomic restriction mapping (10) have placed HIS3 ~49 kb from a Not I site. This ~15 kb difference may be within the range of experimental error in the pulsed-field gels used, or may reflect real differences between the different yeast strains studied in each laboratory. Figure 4b demonstrates that probed sequences can be positioned with respect to restriction sites by random-breakage mapping.

Fragment distribution as a function of breaks per molecule

In order to fully assess the analysis of broken DNA as a physical mapping method, we calculated the expected size distributions of fragments containing genes at different locations as a function of DNA breakage. We assumed that the overall fragment distribution will represent the sum of the distributions arising from a Poisson distribution of molecules with zero, one, two, etc. breaks. We calculated separately the distributions expected for molecules with exactly one, two, or three breaks and found that these distributions can be represented by equations 4 through 12 in the appendix. The distribution of once-broken molecules has been described above (see figure 1). Figure 6 shows the predicted distribution for molecules broken exactly twice (figure 6a) or exactly three times (figure 6b) for probed sequences at three different internal positions in the molecule. Clearly, molecules broken more than once contribute to the changes in intensity observed for one break. The general equations numbered 13-15 in the appendix were used to predict distributions for higher break classes, and were tested using computer simulations. We programmed an Apple Macintosh IIcx to place breaks randomly in any of one hundred domains along a hypothetical molecule and to graph the number of fragments containing a given domain versus size, for 300,000 such broken molecules.
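A simulation of this kind can be approximated in a few lines. The sketch below is written in Python rather than the Modula-2 of the original program, and several details are assumptions made here for illustration: the molecule is treated as 101 unit segments separated by 100 internal boundaries, each boundary is a possible break site, breaks within one molecule are drawn without replacement, and the probed site is one segment. The function names, the default of 20,000 molecules and the random seed are likewise hypothetical.

```python
import random
from collections import Counter


def probed_fragment_size(breaks, site, n_segments):
    """Size of the fragment that contains segment `site`.

    A break at boundary p separates segment p - 1 from segment p.
    """
    left = max((p for p in breaks if p <= site), default=0)
    right = min((p for p in breaks if p > site), default=n_segments)
    return right - left


def simulate(n_breaks, site, n_molecules=20000, n_segments=101, seed=1):
    """Normalized size distribution of site-containing fragments.

    Places exactly `n_breaks` breaks on each simulated molecule and
    returns {fragment size: fraction of molecules}.
    """
    rng = random.Random(seed)
    boundaries = range(1, n_segments)  # 100 internal break sites
    counts = Counter()
    for _ in range(n_molecules):
        breaks = rng.sample(boundaries, n_breaks)
        counts[probed_fragment_size(breaks, site, n_segments)] += 1
    return {size: c / n_molecules for size, c in sorted(counts.items())}


if __name__ == "__main__":
    # One break per molecule, probed segment about 10% from the left end.
    once = simulate(n_breaks=1, site=10)
    print("smallest probed fragment:", min(once))
    # Two breaks per molecule: shorter fragments now appear.
    twice = simulate(n_breaks=2, site=10)
    print("smallest probed fragment:", min(twice))
```

With exactly one break, no probed fragment in this model is shorter than the distance from the site to its nearest end, and sizes above the distance to the far end are about twice as frequent as those between the two thresholds, in line with figure 1. Distributions for other break numbers can be tabulated in the same way.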
We found agreement between these computer simulations and equations 13-15, for all break frequencies tested (one to twenty). Hence, we used these equations to compute the individual distributions for each break class. We used the Poisson distribution to compute the fraction of molecules with each break number for different average break frequencies, and summed the individual distributions to give the overall distributions expected for these frequencies.

[Figure 4, panels (a) and (b): gel and blot images. Lanes 1 to 10; size markers in kb; x-ray dose in minutes shown beneath each lane.]

Figure 4: Positioning a probed sequence within a restriction fragment. (a) An ethidium-stained pulsed-field gel. Lanes 1, 2 and 10 contain undigested yeast DNA, given no x-rays (lanes 1 and 10) or 1.5 minutes of x-rays (lane 2). Lane 3 contains unirradiated yeast DNA digested with Not I, and lanes 4-8 contain DNA digested with Not I and then irradiated with x-rays for successively longer times, as shown at the base of each lane. The dose rate was as in figure 3. Lane 9 contains undigested, unirradiated λ-ladder DNA to provide size markers. The gel was run for 24 hours with a pulse time of 18 seconds, conditions that expand the resolution in the range of 50-350 kb. (b) The same gel, probed with a 1.7 kb fragment of the HIS3 gene. In the Not I-digested lanes, a strong band at ~270 kb is identified by the probe. After irradiation, the smear pattern resulting from broken fragments from this band shows a discontinuity at ~64 kb, indicated by the arrow. In the original blot a weak discontinuity is also observable at ~222 kb. The weaker bands above 270 kb most likely represent restriction fragments arising from incomplete digestion by Not I, or weak binding of the probe to other sites in the yeast genome (see also figure 3b). Fragments from these bands are probably also responsible for partially obscuring the 222 kb discontinuity.

[Figure 5, panels (a) and (b): plots. X-axis "Fragment size (%)".]

Figure 5: Fragment distribution as a function of break frequency and position of probed site. (a) The predicted size distribution of molecular fragments containing a site at 10% of the length of the molecule from one end, for average break frequencies of 1, 3 and 5 per molecule, assuming a Poisson distribution of number of breaks / molecule, placed at random. (b) The predicted size distribution of molecular fragments containing a site 40% from one end of a molecule, for average break frequencies of 1, 3 and 5 per molecule, assuming a Poisson distribution of breaks / molecule, placed at random.

Representative results are given in figures 5a and 5b, where we show the effect of increasing the average number of breaks from one to five for genes positioned 10% and 40% from the nearest end of the molecule. For the nearest end it can be seen that higher break frequencies still provide information if the probed gene is near the telomere, but if the gene is near the center of the chromosome or restriction fragment, a lower breakage frequency is required to maintain significant changes in signal. This effect arises because, in order to produce fragments below the lower intensity change, two breaks are required which must straddle the gene and be closer together than the distance between the gene and its nearest telomere or restriction site. Thus, for any given amount of breakage, the relative frequency of such fragments will depend on the gene-to-end distance. It can also be seen that the upper intensity change diminishes much more rapidly with increasing dose than does the lower one. However, unlike the lower change, the upper change attenuates with increasing dose less rapidly for positions near the center of the molecule than for probed genes nearer one end.
This effect derives from the fact that the distance to the further end is shorter for more central sites, providing less 'target-length' within which a second break must fall to diminish the upper intensity change.

Experimental variables that influence the technique

As shown above, the optimal dose of x-rays to induce the required breakage will depend both on the size of the starting molecules and on the relative internal position of the sequence being located. We find that a single gel can be informative for a wide range of positions and DNA sizes by irradiating different samples with doses that increase two-fold at each step. In this way, eight irradiated lanes on a gel provide more than a hundred-fold range of break frequencies, and within this range, there will be one sample that is within fifty percent of any desired optimum. We note that in general optimal doses are lower when attempting to locate a gene with respect to the distal end of a chromosome or restriction fragment than they are when only information concerning the proximal end is required, and that for distal distances optimizing the break-frequency becomes much more critical.

A limiting case obtains when both the intact molecule and fragments extending from the probed gene to the distal end are too large to be resolved on the gel. Here, only the distance to the nearest end or restriction site is available. To see the intensity change, break frequency must be low enough to leave many molecules unbroken in this interval, but high enough to produce sufficient fragments in the resolvable size range. This may be equivalent to many breaks per molecule. Above a certain range the initial size of the molecule becomes less relevant than the maximum length that can be resolved and the distance of the site from the end.

Electrophoretic conditions also influence the effectiveness of the technique.
In general, conditions that maximize the degree of resolution at the region of a given intensity change will provide the most accurate information about its position. However, as the resolution increases, both the magnitude and sharpness of the change should decrease until a limit of observability is reached. In addition, variations in degree of resolution in different parts of the gel will themselves lead to intensity variations in the smear pattern. Care may be needed to distinguish these from the changes representing position effects, especially in situations where there is a complex relationship between DNA size and relative mobility. We also note that reciprocity failure (27) in film used for autoradiography with intensifier screens (fluorography) may help, rather than hinder, the detection of threshold intensity changes. For example, under ideal conditions we expect a maximum change of two-fold in the amount of bound probe at the upper position. However, this can be amplified into a more than two-fold difference in darkness of film, if the signal intensity is such that reciprocity failure is more significant below the threshold position than above it. In practice, pre-flashed film may be helpful when the signal is faint, but for stronger signals, repeated exposures on non-flashed film as the signal decays may help to exploit reciprocity failure. We used non-flashed film for the exposures presented here (figures 3b, 3c and 4b). Finally, the size range of the DNA being studied and the amount of extraneous DNA present will influence the experimental design. In general, the standard error in locating an internal site should be largely independent of DNA size when position is expressed as a percentage of the length of the molecule. However, problems may arise with very small or very large DNA. 
In the small range, the probed sequence may itself be a significant fraction of the length of the molecule; but since there will be more molecules to bind the probe per unit amount of DNA, short probes may be effective. In the very large size range, the signal will be weaker because there are fewer molecules per unit amount of DNA. This will also be true when the DNA being studied is a small fraction of the total DNA present, for example in attempting to locate a sequence using an irradiated restriction digest of total DNA from large genomes. Here, longer probes and more heavily loaded gels may be advantageous and recently developed techniques for analyzing blots using storage phosphor imaging (28) in place of film may provide greater sensitivity and improved quantitation. In the limiting case, signal may need to be intensified with some loss of resolution by running the fragments for shorter distances.

[Figure 6, panels (a) and (b): plots. Y-axis levels 1/L and 2/L; x-axis "Fragment size (%)", marked at 50 and 100.]

Figure 6: Fragment distribution from molecules with exactly two or exactly three breaks. (a) Predicted distribution of fragments containing a site whose distance from the nearest end is 10, 30 or 50% of the length of the molecule, and arising from molecules broken exactly twice. (b) Predicted distribution of fragments containing a site whose distance from the nearest end is 10, 30 or 50% of the length of the molecule, and arising from molecules broken exactly three times.

GENERAL DISCUSSION

We have shown that the physical position of a unique sequence within a DNA molecule can be determined by probing the fragments of randomly broken molecules after electrophoresis. We used ionizing radiation to break the DNA, since the induced breaks are at least approximately random and a reproducible series of doses could be given quickly and easily. However, other methods of randomly breaking DNA in controlled amounts should also be effective.
For example, the method of using DNase I with manganese as described by Melgar and Goldthwait (29) could probably be adapted for use with large DNA in agarose plugs. We note that Vollrath et al. (11) have used a chromosome breakage method to determine the physical position of cloned sequences in yeast. However, their technique (11) differs in concept and practice from random-breakage methods in that homologous recombination is used to break the molecule and insert a telomere uniquely at the site being mapped. The random-breakage method outlined here may be useful in constructing physical maps of whole chromosomes and very large restriction fragments. In principle, it is applicable to any genes in DNA molecules up to twice the size that can be resolved by conventional, pulsed-field or field-inversion electrophoresis, because the shorter gene-to-end distance cannot exceed half the length of the molecule. When used with larger chromosomes, it may reveal the positions of genes that are within a few megabases of a telomere, since it is only necessary here to resolve fragments the size of the gene-telomere distance, rather than the whole chromosome. In practice, it may often be desirable to map genes that are more centrally located within very large molecules, and it may be essential to resolve the ambiguity as to which molecular end the gene is nearest. These issues can be addressed by the method described here in combination with other approaches. Positioning of genes within large restriction fragments by using irradiated mammalian DNA digested with rarely cutting enzymes such as Not I, Sfi I or the recently described Fse I (30) may be useful for mapping previously defined internal regions of large chromosomes. Alternative methods for determining the relative distance between probed sites in large chromosomes include radiation hybrid mapping (9, 12) in which somatic cell hybrids are irradiated and the frequency with which sites are separated by x-ray breakage in vivo is assessed.
The random-breakage method described here should be well suited to the analysis of mammalian DNA in yeast artificial chromosomes. These are usually of a size suitable for pulsed-field gel studies, and the ratio of probed sequence to total DNA on the gel would be more favorable using yeast than if cells with much larger genomes were used. Moreover, the ambiguity regarding the two ends may be resolved with YACs by using the random-breakage method in combination with restriction analysis, or the technique described by Pavan et al. (13), in which a series of stable but shorter derivatives of a YAC are created by transformation with an integrating plasmid containing a telomere and mammalian DNA of a repeated sequence. The derivatives are formed by integration of the plasmid at any one of a number of sites, and thus consist of terminal deletions of differing extent derived from the original YAC. To define a unique position for a gene, it would simply be necessary to position it by random-breakage mapping both on the original YAC and on any one shorter derivative. The proximal end can then be identified by whether the shorter or the longer gene-to-end distance is altered in the shorter YAC derivative. Since only a few radiation doses are needed, both the original YAC and its shorter derivative could be studied on the same gel, and a single derivative could resolve the end-ambiguity for all genes on a YAC, provided that less than fifty per cent of the molecule is deleted. Analogous strategies may be devised for resolving the end ambiguity in other situations, and the use of restriction enzymes such as Not I and Sfi I in combination with random-breakage will frequently provide the additional information required. For Saccharomyces cerevisiae, whose Not I and Sfi I maps are known (A. Link and M.
Olson, submitted for publication), a single re-probable blot containing several lanes of undigested, Not I-digested and Sfi I-digested DNA given appropriate doses of x-rays should uniquely position any cloned single-copy DNA. Finally, we wish to point out that the discontinuities arising from end effects when broken DNA is probed may have some relevance to studies of DNA breakage and repair, in addition to their usefulness in mapping. A number of recent studies (25, 31, 32) have used pulsed-field gels to study double-strand breaks and their repair, and clearly the pattern of smeared DNA can provide information about these processes (25). To interpret such data, it is important to understand the fragment patterns to be expected (see appendix). In addition, the discontinuities observed may themselves provide experimental information about the number of breaks and their distribution along the molecules studied.

ACKNOWLEDGEMENTS
We would like to acknowledge early work on this topic by George Carle in the laboratory of Maynard Olson. We would also like to thank L. Riles, J. Dutchik, A. Baktha, A. Link and M. Olson for sharing unpublished information concerning the physical map of Saccharomyces cerevisiae. We thank Sylvia Spengler for careful reading of the manuscript. This work was supported in part by NIH grant GM30990 to R. Mortimer and by funds administered through DOE Contract No. DE-AC03-76SF00098 to the Human Genome Center at Lawrence Berkeley Laboratory.

REFERENCES
1. Southern, E.M. (1975) J. Mol. Biol., 98, 503-517.
2. Schwartz, D.C. and Cantor, C.R. (1984) Cell, 37, 65-75.
3. Carle, G.F. and Olson, M.V. (1984) Nucleic Acids Res., 12, 5647-5664.
4. Carle, G.F., Frank, M. and Olson, M.V. (1986) Science, 232, 65-68.
5. Chu, G., Vollrath, D. and Davis, R.W. (1986) Science, 234, 1582-1585.
6. Chu, G. (1989) Electrophoresis, 10, 290-295.
7. Gekeler, V., Weger, S., Eichele, E. and Probst, H. (1989) Anal. Biochem., 181, 227-233.
8. Daniels, D.L., Olson, C.H., Brumley, R. and Blattner, F.R. (1990) Nucleic Acids Res., 18, 1312.
9. Goss, S.J. and Harris, H. (1977) J. Cell Sci., 25, 39-57.
10. Olson, M.V., Dutchik, J.E., Graham, M.Y., Brodeur, G.M., Helms, C., Frank, M., MacCollin, M., Scheinman, R. and Frank, T. (1986) Proc. Natl. Acad. Sci. USA, 83, 7826-7830.
11. Vollrath, D., Davis, R.W., Connelly, C. and Hieter, P. (1988) Proc. Natl. Acad. Sci. USA, 85, 6027-6031.
12. Cox, D.R., Pritchard, C.A., Uglum, E., Casher, D., Kobori, J. and Myers, R.M. (1989) Genomics, 4, 397-407.
13. Pavan, W.J., Hieter, P. and Reeves, R.H. (1990) Proc. Natl. Acad. Sci. USA, 87, 1300-1304.
14. Smith, C.L. and Condemine, G. (1990) J. Bact., 172, 1167-1172.
15. Burgers, P.M.J. and Percival, K.J. (1987) Anal. Biochem., 163, 391-397.
16. Van Ommen, G.J.B. and Verkerk, J.M.H. (1986) in Davies, K.E. (ed.) Human Genetic Diseases, IRL Press, Oxford, vol. 8, pp. 113-133.
17. Struhl, K. (1985) Nucleic Acids Res., 13, 8587.
18. Burke, D.T., Carle, G.F. and Olson, M.V. (1987) Science, 236, 806-812.
19. Rose, M., Grisafi, P. and Botstein, D. (1984) Gene, 29, 113-124.
20. Calderon, I.L., Contopoulou, C.R. and Mortimer, R.K. (1982) Current Genetics, 7, 93-100.
21. Andreadis, A., Hsu, Y.-P., Kohlhaw, G.B. and Schimmel, P. (1982) Cell, 31, 319-325.
22. Birnboim, H.C. and Doly, J. (1979) Nucleic Acids Res., 7, 1513-1523.
23. Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) In Molecular Cloning: A Laboratory Manual. Cold Spring Harbor University Press, Cold Spring Harbor.
24. Press, W.H., Flannery, B.P., Teukolsky, S.A. and Vetterling, W.T. (1986) In Numerical Recipes, Cambridge University Press, New York, pp. 196-197, 714-715.
25. Contopoulou, C.R., Cook, V.E. and Mortimer, R.K. (1987) Yeast, 3, 71-76.
26. Mortimer, R.K., Schild, D., Contopoulou, C.R. and Kans, J.A. (1989) Yeast, 5, 321-403.
27. Laskey, R.A. and Mills, A.D. (1975) Eur. J. Biochem., 56, 335-341.
28. Johnston, R.F., Pickett, S.C. and Barker, D.L. (1990) Applied and Theoret. Electrophoresis, In Press.
29. Melgar, E. and Goldthwait, D.A. (1968) J. Biol. Chem., 243, 4409-4416.
30. Meyertons Nelson, J., Miceli, S.M., Lechevalier, M.P. and Roberts, R.J. (1990) Nucleic Acids Res., 18, 2061-2064.
31. Blocher, D., Einspenner, M. and Zajackowski, J. (1989) Int. J. Radiat. Biol., 56, 437-448.
32. Game, J.C., Sitney, K.C., Cook, V.E. and Mortimer, R.K. (1989) Genetics, 123, 695-713.

APPENDIX
Functions relating the size of a detected fragment of length X to its expected frequency relative to the total number of detected fragments were calculated for the first three classes (those with one, two or three breaks). The calculation for a specific break class was based on using the previous break class and determining what distribution would result from an additional break. These calculations are carried out on a chromosome of size L, with a probed site a distance D from the right end (assumed to be the closest) and a distance L - D from the other end (see figure 1). The length of the probed sequence is assumed to be very small relative to the length of the chromosome, and it is treated as a point. Breaks can occur on either the left side or the right side of the probed site. Given that a break occurs, the probability that it occurs to the right of the probed gene is D/L and the probability that it occurs to the left is (L-D)/L. For more than one break, combinations of these probabilities were used to normalize the relevant distributions. For exactly one break, the distribution of fragments f(X) is:

$$f(X) = 0, \quad 0 < X < D \tag{1}$$
$$f(X) = \frac{1}{L}, \quad D < X < L \quad \text{(arising from one break to the left)} \tag{2}$$
$$f(X) = \frac{1}{L}, \quad L-D < X < L \quad \text{(arising from one break to the right)} \tag{3}$$

These equations are then summed to yield the distribution of fragments for exactly one break.
$$f_1(X) = 0, \quad 0 < X < D \tag{4}$$
$$f_1(X) = \frac{1}{L}, \quad D < X < L-D \tag{5}$$
$$f_1(X) = \frac{2}{L}, \quad L-D < X < L \tag{6}$$

To calculate the distribution for exactly two breaks the equations are retained as they were originally calculated (equations 1-3) and the distribution that results from the addition of a break is determined. For example, to determine the distribution for exactly two breaks, with one break on each side of the site, we start with the function f(X) = 1/L that is due to one break to the left and add a break to the right. This is done by observing that fragments of size D, generated by a break on the left, that are then broken once at random will generate a distribution of detected fragments ranging in size from 0 to D, and likewise a fragment of size L will generate a distribution of detected fragments ranging in size from L-D to L, since the most that can be removed from a break on the right is a fragment of size D. The new function f2(X) is then calculated by integrating f1(X) over all the original fragments that will contribute a detected fragment of size X. When this is done for all the possible two-break combinations, normalizing for the relative frequencies of each combination, and the results summed, the functions obtained are:

$$f_2(X) = \frac{2X}{L^2}, \quad 0 < X < D \tag{7}$$
$$f_2(X) = \frac{2(L-X) + 2D}{L^2}, \quad D < X < L-D \tag{8}$$
$$f_2(X) = \frac{6(L-X)}{L^2}, \quad L-D < X < L \tag{9}$$

Equations 7-9 are graphed in figure 6A for three different values of D. The functions resulting from a similar calculation for exactly three breaks are:

$$f_3(X) = \frac{6X(L-X)}{L^3}, \quad 0 < X < D \tag{10}$$
$$f_3(X) = \frac{3(L-X)^2 + 6D(L-X)}{L^3}, \quad D < X < L-D \tag{11}$$
$$f_3(X) = \frac{12(L-X)^2}{L^3}, \quad L-D < X < L \tag{12}$$

Equations 10, 11 and 12 are graphed in figure 6B, for three values of D.
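The break model behind these distributions can be checked numerically. The following is a minimal Python sketch, not the simulation code used for this work: it places n breaks uniformly at random on a molecule of length L, records the size of the fragment that still carries the probed site, and builds a histogram that can be compared with the two-break density. The function names, the bin count and the example values L = 100 and D = 30 are illustrative assumptions, and the formula coded in `f2` is the n = 2 case of the general solution for n breaks.

```python
import random


def detected_fragment_size(L, D, n, rng):
    """Size of the fragment containing the probed site after n uniform breaks."""
    site = L - D              # site position measured from the left end
    left, right = 0.0, L      # fragment boundaries start at the molecular ends
    for _ in range(n):
        b = rng.uniform(0.0, L)
        if b < site:
            left = max(left, b)     # nearest break to the left of the site
        else:
            right = min(right, b)   # nearest break to the right of the site
    return right - left


def f2(X, L, D):
    """Predicted density for exactly two breaks (n = 2 in the general solution)."""
    if X < D:
        return 2.0 * X / L**2
    if X < L - D:
        return (2.0 * (L - X) + 2.0 * D) / L**2
    return 6.0 * (L - X) / L**2


def simulate_histogram(L, D, n, trials, bins, seed=0):
    """Estimated density of detected-fragment sizes, one value per bin."""
    rng = random.Random(seed)
    width = L / bins
    counts = [0] * bins
    for _ in range(trials):
        x = detected_fragment_size(L, D, n, rng)
        counts[min(int(x / width), bins - 1)] += 1
    # Convert raw counts to probability per unit length.
    return [c / (trials * width) for c in counts]
```

For example, `simulate_histogram(100.0, 30.0, 2, 200000, 10)` returns ten density estimates that can be set beside `f2` evaluated at the bin centres 5, 15, ..., 95. Because D = 30 and L - D = 70 fall on bin edges and the density is linear within each region, the bin average equals the value at the bin centre.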
These three sets of functions (equations 4-12) suggested a general solution (for n breaks) of:

$$f_n(X) = \frac{n(n-1)X(L-X)^{n-2}}{L^n}, \quad 0 < X < D \tag{13}$$
$$f_n(X) = \frac{n(L-X)^{n-1} + n(n-1)D(L-X)^{n-2}}{L^n}, \quad D < X < L-D \tag{14}$$
$$f_n(X) = \frac{n(n+1)(L-X)^{n-1}}{L^n}, \quad L-D < X < L \tag{15}$$

The functions were verified using computer simulations, and the general solution (equations 13-15) was found to match the computer simulations for all break classes tested (one through twenty). Each function, based on a specific number of breaks, was weighted according to the Poisson distribution and the weighted functions were summed to give the distribution of detected fragments for an average number of breaks. The resulting equations were used to graph the expected distribution of detected fragments as a function of position of probed site and average number of breaks. When equations 13-15 are multiplied by the appropriate Poisson terms and summed, the resulting equations can be simplified using the Taylor series expansion of $e^y$. This yields exponential solutions for an average number of breaks identical to analytical solutions to be presented elsewhere (V. Cook and R. Mortimer, submitted for publication).
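The Poisson weighting described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program used for this work: `f_n` encodes equations 13-15, `poisson_weighted` sums the break classes n = 1 up to an arbitrary cutoff `n_max`, and `closed_form` is the exponential expression obtained here by carrying out that summation with the series for e^y. It has not been compared with the analytical solutions cited as submitted. The symbol `mu` (mean number of breaks per molecule) and all function names are introduced for this example only. Unbroken molecules (the n = 0 class, a fraction e^(-mu) located at X = L) are left out of both functions.

```python
import math


def f_n(X, L, D, n):
    """Density of detected fragments of size X for exactly n >= 1 breaks."""
    u = L - X
    if X < D:
        # Fragments shorter than D need at least one break on each side.
        return n * (n - 1) * X * u ** (n - 2) / L**n if n >= 2 else 0.0
    if X < L - D:
        value = n * u ** (n - 1)
        if n >= 2:
            value += n * (n - 1) * D * u ** (n - 2)
        return value / L**n
    return n * (n + 1) * u ** (n - 1) / L**n


def poisson_weighted(X, L, D, mu, n_max=60):
    """Sum of f_n weighted by Poisson probabilities for mean mu breaks."""
    total = 0.0
    for n in range(1, n_max + 1):
        weight = math.exp(-mu) * mu**n / math.factorial(n)
        total += weight * f_n(X, L, D, n)
    return total


def closed_form(X, L, D, mu):
    """Exponential form of the same sum (derived for this sketch)."""
    decay = math.exp(-mu * X / L)
    if X < D:
        return (mu / L) ** 2 * X * decay
    if X < L - D:
        return (mu / L) * (1.0 + mu * D / L) * decay
    return (mu / L) * (2.0 + mu * (L - X) / L) * decay
```

Evaluating both functions at the same X, for instance with L = 100, D = 30 and mu = 1.5, gives a direct check that the truncated series and the exponential form agree. Both show the same two steps in the density, at X = D and at X = L - D.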