© 1990 Oxford University Press
Nucleic Acids Research, Vol. 18, No. 15 4453
Random-breakage mapping, a rapid method for physically
locating an internal sequence with respect to the ends of
a DNA molecule
John C. Game1, Maren Bell1, Jeff S. King2 and Robert K. Mortimer1,2
1Division of Cellular and Molecular Biology, Lawrence Berkeley Laboratory, Berkeley, CA 94720 and
2Department of Molecular and Cell Biology, Division of Biophysics and Cell Physiology, University of
California, Berkeley, CA 94720, USA
Received April 30, 1990; Revised and Accepted June 25, 1990
ABSTRACT
We describe a method for determining the position of
a cloned internal sequence with respect to the ends of
a DNA molecule. The molecules are randomly broken
at low frequency and the fragments are subjected to
electrophoresis. Southern hybridization using the
cloned DNA as a probe identifies only those fragments
containing the sequence. The size distribution of these
fragments is such that two threshold changes in
intensity of signal are seen in the smear pattern below
the unbroken molecules. The positions of the changes
represent the distances from the sequence to each
molecular end. The intensity changes arise because the
natural ends of the molecules influence the fragment
distribution obtained. From once-broken molecules, no
fragments can arise that contain a given sequence and
are shorter than the distance between that sequence
and the nearest molecular end. We tested the method
by using x-rays to induce breakage in yeast DNA.
Genes of independently known position were mapped
within whole chromosomes or Not I restriction
fragments using Southern blots from gels of irradiated
molecules. We present equations to predict fragment
distribution as a function of break-frequency and
position of the probed sequence.
INTRODUCTION
The techniques of Southern hybridization (1) and conventional
or pulsed-field gel electrophoresis (2-8) are widely used to assign
cloned DNA sequences to particular chromosomes or DNA
restriction fragments. However, locating the position of a given
sequence within a chromosome or fragment is more difficult and
often requires complex strategies (9-14). Current methods of
directly positioning sequences with respect to the ends of
chromosomes require transformation with specially constructed
molecules containing these sequences (11, 13). We show here
that the distance in base pairs between a cloned sequence and
the nearest free end of a DNA molecule can in fact easily be
determined by probing gels containing randomly broken
molecules. When molecules contain randomly distributed breaks
at low frequency, sharp changes in intensity of probe signal are
expected at certain positions in the smear pattern below a given
DNA species. These positions reveal the distance of the probed
sequence from the ends of the molecule. In principle, this is easily
understood by considering molecules broken exactly once.
Amongst the fragments of such molecules, none shorter than the
distance between a sequence and the nearest end of the molecule
can contain that sequence. A minimum of two breaks would be
required to generate shorter fragments identifiable by the probe.
When a low amount of DNA breakage is present (< 1
break/molecule), most fragments will arise from once-broken
molecules. Hence, a given probe will mainly identify fragments
extending downwards only to a size representing the distance of
the gene probed from the nearest end of the molecule. A second
change in intensity is expected at a position corresponding to the
distance from the gene to the more distant end of the molecule.
Molecules broken twice or more will also contribute to these
intensity changes. We tested the proposed method experimentally,
using DNA from Saccharomyces cerevisiae. The results presented
demonstrate that this is a simple and rapid way to map internal
sequences with respect to molecular ends over a wide range of
DNA sizes.
MATERIALS AND METHODS
Preparation of high molecular weight DNA
Chromosomal-sized DNA from the haploid, prototrophic
S. cerevisiae strain X2180-1A was prepared in low melting point
agarose, by adapting procedures described by Burgers et al. (15)
and Van Ommen et al. (16). Cells were spheroplasted before
they were encased in agarose and no 2-mercaptoethanol was used.
1.5 ml of a fresh overnight culture grown in YPD at 30°C were
harvested for 3 seconds in a microfuge, washed with 1 ml EDTA
pH 7.5 and resuspended in 1 ml SCE (1 M sorbitol, 0.1 M Na
citrate, 10 mM EDTA pH 7.5) containing 200 U lyticase (Sigma).
Cells were incubated in SCE/lyticase at 30°C for 20-25 minutes
and harvested at 500 × g for 5 minutes. Spheroplasts were
resuspended in 500 µl 1 M sorbitol. An equal volume of molten
1.5% LMP agarose at 47°C in 0.125 M EDTA was added and
cells were gently mixed by pipetting. Aliquots of 100 µl were
distributed into molds, refrigerated, and the solidified plugs were
placed into 5 volumes ES (0.5 M EDTA pH 9.5, 1% Sarcosyl)
containing 1 mg/ml proteinase K (Sigma). After ~6 hours of
gentle shaking at room temperature, further proteinase K was
added to give a total concentration of 2 mg/ml. Plugs were
digested overnight at room temperature. They were then washed
extensively with TE50 (10 mM Tris HCl pH 7.5, 50 mM EDTA
pH 7.5) and stored at 4°C. If a restriction digest was to follow,
plugs were incubated with 0.1 mM PMSF (Sigma) for two washes
of 1 hour each.
Plasmid sources and preparation of probes
We used four unique sequences as hybridization probes for yeast
DNA. A 1.7 kb BamH I fragment containing the HIS3 gene (17)
was isolated from plasmid pYAC5 (18), provided by Maynard
Olson. A 1.1 kb Hind III fragment containing the URA3 gene
(19) and a 1.3 kb Nru I-Mlu I fragment containing the RAD51
gene (20) were both isolated from a pUC13-derived plasmid,
provided by Mari Aker and George Basile. A 2.1 kb Cla I
fragment from the LEU2 region (21) was isolated from plasmid
YEP13-RAD54, provided by Garry Cole. Plasmid DNA and
electroeluted fragments were prepared by the methods of
Birnboim and Doly (22) and Maniatis et al. (23), respectively.
X-ray treatments
A Machlett OEG 60 beryllium window x-ray machine was used
at 50 kV and 20 mA to irradiate yeast DNA in agarose plugs.
Plugs were equilibrated in TE50 prior to irradiation, since the
amount of breakage obtained varies sharply in different buffers.
Each agarose plug was turned over after half the x-ray dose, to
minimize dose heterogeneity within the agarose caused by
shielding. The exact dose rate within the plugs is difficult to
determine, since there is some absorption by the agarose, but
our best estimate based on observed breakage is that the effective
dose rate was approximately 6 kilorads/minute at the settings
used.
Electrophoresis
CHEF gels were run using techniques described elsewhere (5,
6). Pulse times of specific gels were chosen to achieve maximum
resolution in the size ranges of interest and frequently varied
within a single run (see figure legends).
Southern blotting and hybridization
Gels were treated with short-wavelength UV, acid nicked,
denatured and neutralized. DNA was transferred by capillary action to
nitrocellulose (1) and bound to the membrane using a Stratagene
UV-linker. Fragments of the S. cerevisiae HIS3, RAD51, LEU2
and URA3 genes were labeled by random oligonucleotide priming
(Amersham). Hybridization was carried out overnight at 42°C
in 50% formamide, 0.1% SDS, 3× SSC, 1 mM EDTA pH 7.8,
10 mM Tris HCl pH 7.6, 10× Denhardt's solution, 0.05% Na
pyrophosphate and 100 µg/ml sonicated salmon sperm DNA.
Computer analysis
To verify the equations derived in the appendix, a Modula-2
program was written and run on an Apple Macintosh IIcx
computer. The program simulated the breakage of chromosomes
by randomly placing a fixed number of breaks on a chromosome
containing one hundred possible break sites. Random numbers
were generated using the Pascal version of the uniform deviate
random number generator RAN1 from Press et al. (24).
Typically, 300,000 imaginary chromosomes were broken and
the size of the site-containing fragment from each chromosome
was recorded. The size distributions of fragments were
normalized by dividing the total number of fragments of each
size by the number of chromosomes broken. These distributions
were compared to those predicted by the equations in the
appendix.
[Figure 1 plot. Abscissa: SIZE OF FRAGMENT, marked at D and L - D. Ordinate marked at 1/L and 2/L. Plot regions labelled 'Detected fragments from breaks in region D' and 'Detected fragments from breaks in region L - D'.]
Figure 1: Illustration of the size distribution of probed fragments arising from
a population of once-broken molecules. The upper part represents a chromosome
of length L, with a unique sequence (A), a distance D from the nearest end. The
symbol -O- represents the centromere. The lower part represents the distribution
of fragments containing this sequence versus the size of the fragment. It can be
seen that there are no sequence-containing fragments in the size-range 0 to D,
and only half as many per unit length in the size-range D to L - D as there are
in the size-range L - D to L.
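The simulation described under Computer analysis can be sketched in a few lines of Python. This is only an illustrative reconstruction, not the original Modula-2 program: the names `site_fragment_size` and `simulate` are hypothetical, Python's built-in generator stands in for RAN1, and it is assumed that the molecule consists of 100 equal domains with breaks falling on the 99 internal boundaries between them.

```python
import random
from collections import Counter


def site_fragment_size(n_domains, probe, breaks):
    """Size (in domains) of the fragment that contains domain `probe`.

    Assumption: boundary b (1 <= b <= n_domains - 1) lies between
    domain b - 1 and domain b, and domains are numbered from 0.
    """
    # Nearest break to the left of the probed domain, or the natural end (0).
    left = max((b for b in breaks if b <= probe), default=0)
    # Nearest break to the right of the probed domain, or the natural end.
    right = min((b for b in breaks if b > probe), default=n_domains)
    return right - left


def simulate(n_domains=100, probe=10, n_breaks=1, n_molecules=300_000, seed=1):
    """Normalized size distribution of probe-containing fragments.

    Each molecule receives exactly `n_breaks` breaks at distinct, randomly
    chosen internal boundaries. Counts are divided by the number of
    molecules broken, as in the normalization described in the text.
    """
    rng = random.Random(seed)
    boundaries = range(1, n_domains)
    counts = Counter()
    for _ in range(n_molecules):
        breaks = rng.sample(boundaries, n_breaks)
        counts[site_fragment_size(n_domains, probe, breaks)] += 1
    return {size: c / n_molecules for size, c in sorted(counts.items())}
```

With `n_breaks=1` and the probed domain ten domains from one end, the result should show the pattern of figure 1: no fragments at or below the gene-to-end distance, a plateau above it, and roughly twice that level for the largest fragments. Calling `simulate(n_breaks=2)` or `simulate(n_breaks=3)` gives the corresponding distributions for higher break classes under the same assumptions.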
RESULTS
The basis of the method
The method relies on observing the pattern that broken molecules
form when resolved according to size by gel electrophoresis.
When molecules of a particular DNA species are randomly
broken, the resultant fragments will form a smear on the gel
below the position of the unbroken molecules. The overall size
distribution of fragments will depend on the frequency of breaks,
and has been described elsewhere (25). However, after Southern
hybridization, only those fragments that contain the probed
sequence are seen. Each such fragment will contribute equal
intensity independently of its size, whereas when total DNA is
visualized by staining, the intensity of signal will depend on both
the size and number of fragments at any point.
Fragments that contain a specific sequence will show a size
distribution that differs from that of total fragments, if the DNA
is linear and the number of breaks per molecule is low. This
difference results from the contribution that natural ends of the
original molecules make to the distribution. When end effects
are significant, the fragment distribution reveals the distance from
the ends to the probed sequence. This can be seen from the
fragment distribution expected from molecules broken exactly
once, at random positions (see figure 1). No fragment shorter
than the distance from a unique sequence to the nearest end of
the molecule can contain that sequence, since at least two breaks
would be required to generate such a fragment. Hence, the
distribution of fragments identified by a probe should show a
sharp 'cut-off' in the smear pattern at a position representing the
distance between the probed sequence and the nearest end of the
molecule. Above this point, the number of fragments containing
the sequence will be constant over a certain size range, when
only once-broken molecules are considered, since each size will
correspond to a break at only one position in the molecule.
However, in a larger size range, corresponding to fragments
larger than the distance from the sequence to the more distant
end of the molecule, we expect twice this frequency of probe-binding
fragments of each size. This is because there are now
two separate positions in the molecule where breakage will lead
to fragments of a specific size, namely one between the probed
sequence and its nearest end and another at an equivalent distance
from the opposite end of the molecule (see figure 1). Hence, we
expect an increase in the amount of bound probe at a position
representing the distance from the probed sequence to the furthest
end. Above this, the number of sequence-containing fragments
will again be constant versus size, until the position of the
unbroken molecules is reached. Note that the sum of the sizes
represented by the lower and upper intensity changes should equal
the size of the unbroken molecules. We show below that these
expected intensity changes can readily be observed.
Molecules broken more than once
Randomly broken DNA will always contain some molecules
broken more than once. The proportion of total fragments arising
from these becomes small at break frequencies significantly less
than one per molecule, but clearly there will be a limiting
frequency below which breakage will be insufficient to generate
a visible smear in the probed gel. Figure 2 illustrates how the
fraction of fragments that are derived from once-broken molecules
changes with break-frequency. It can be seen that there is a region
between about 0.7 and 1.2 breaks per molecule when most
molecules are broken (line a), but most fragments are still derived
from once-broken molecules (line c). However, in practice, useful
information may be obtained from smear patterns from somewhat
higher levels of breakage. This is because the fragment
distribution from molecules with two to several breaks also shows
significant peaks at the same size positions. The degree of peaking
is dependent both on the break frequency and the position of the
probed sequence within the molecule (see figure 5a and 5b). We
investigated the effect of increasing break-frequency on the
observed fragment distribution using x-rays, and derived and
tested equations to predict the expected distribution for different
break-frequencies and gene positions, as discussed below and in
the appendix.
[Figure 2 plot. Abscissa: AVERAGE NUMBER OF BREAKS, marked from 1.0 to 5.0.]
Figure 2: Molecular species arising from breakage as a function of average break
frequency per molecule. Line (a) shows the fraction of molecules that are broken
once or more, line (b) shows the fraction of molecules broken exactly once and
line (c) shows the fraction of total fragments arising from exactly once-broken
molecules. The abscissa represents the average break frequency per molecule,
and it is assumed that there is a Poisson distribution of number of breaks / molecule,
randomly positioned.
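The Poisson quantities behind figure 2 can be computed directly. The sketch below is a hedged illustration rather than the authors' calculation: the function names are hypothetical, and the formula used for line (c) is an assumption, namely that only probe-containing fragments are counted (one per broken molecule), so that line (c) is the share of broken molecules that were broken exactly once.

```python
import math


def fraction_broken(mean_breaks):
    """Line (a): fraction of molecules with one or more breaks (Poisson)."""
    return 1.0 - math.exp(-mean_breaks)


def fraction_broken_once(mean_breaks):
    """Line (b): fraction of molecules with exactly one break (Poisson)."""
    return mean_breaks * math.exp(-mean_breaks)


def once_broken_share_of_detected(mean_breaks):
    """Assumed reading of line (c): among broken molecules, the share that
    were broken exactly once, i.e. m * exp(-m) / (1 - exp(-m)).
    """
    if mean_breaks <= 0:
        raise ValueError("mean_breaks must be positive")
    # m / (exp(m) - 1) is algebraically the same ratio; expm1 is accurate
    # for small m.
    return mean_breaks / math.expm1(mean_breaks)
```

Under this reading, `fraction_broken(m)` exceeds one half once m passes ln 2 (about 0.69), and `once_broken_share_of_detected(m)` stays above one half until m is a little over 1.2, which is consistent with the window of about 0.7 to 1.2 breaks per molecule quoted in the text.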
Experimental tests of the method
a) Whole yeast chromosomes: We used DNA from the yeast
Saccharomyces cerevisiae to test the usefulness of probed smear
patterns for localizing genes on chromosomes and restriction
fragments. We irradiated DNA with x-rays in vitro in agarose
plugs at a variety of doses, and subjected this DNA to pulsed-field gel electrophoresis. We blotted the resultant gels to
nitrocellulose, and probed these blots with DNA from several
separate cloned yeast genes for which we had independent
information concerning position. Figure 3 shows results for a
single gel, probed consecutively with two different genes. For
each probe, it can be seen that at least some lanes show two
threshold changes of intensity, as indicated by arrows, at positions
that differ for each of the two genes. In each case, the higher
x-ray doses show a blurring of the threshold intensity changes
and this occurs more rapidly with increasing dose at the upper
position than at the lower one. As expected, the threshold changes
are most readily seen at lower doses in the case of HIS3, which
is on chromosome XV (1140 kb, (26)), than they are for RAD51
on chromosome V, which is much smaller (593 kb, (26)). Note
also that the intensity of the band representing unbroken molecules
declines much more rapidly with dose for the larger chromosome.
Evidence that the intensity changes in figure 3 are consistent
in position with the expected physical distances from the telomeres
can be seen in table 1. This compares estimates of position for
each probed gene from this study with data from L.Riles,
J.Dutchik, A.Baktha, A.Link and M.Olson (personal
communication) using genomic restriction mapping (10), together
with genetic position estimates from Mortimer et al. (26). Our
measurements fall within the range expected from the restriction
map data for both genes in figure 3 and also for URA3 and LEU2
(blots not shown). There is reasonable agreement when the sizes
represented by the two intensity changes are summed and
compared to the size of the unbroken molecules on the same gel.
We believe these observations indicate that the intensity changes
in probed smear patterns do reflect the distance of the probed
sequence from the ends of the molecule.
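The consistency check described above is simple arithmetic, sketched here for the two genes of figure 3. The helper `summarize` is a hypothetical name; the input values are the discontinuity positions and intact chromosome sizes quoted in this paper, and nothing else is assumed.

```python
def summarize(nearest_kb, furthest_kb, intact_kb):
    """Sum the two intensity-change positions and compare with the intact size."""
    total = nearest_kb + furthest_kb
    return {
        "sum_kb": total,
        # Position of the gene as a percentage of the summed length.
        "percent_from_nearest_end": round(100.0 * nearest_kb / total, 1),
        # Positive when the sum overshoots the size of the unbroken molecule.
        "sum_minus_intact_kb": total - intact_kb,
    }


estimates = {
    # HIS3: discontinuities at 380 kb and 770 kb, chromosome XV at ~1140 kb.
    "HIS3": summarize(380, 770, 1140),
    # RAD51: discontinuities at ~235 kb and ~350 kb, chromosome V at ~593 kb.
    "RAD51": summarize(235, 350, 593),
}
```

For HIS3 the two positions sum to 1150 kb, 10 kb above the size of the intact chromosome; for RAD51 they sum to 585 kb, 8 kb below it.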
b) Not I restriction digests of yeast DNA: Many organisms
have chromosomes that are too large to be resolved by
contemporary electrophoretic techniques. However, we reasoned
that the ability to position genes within large restriction fragments
by random-breakage mapping would provide useful information
when constructing physical maps. We tested the feasibility of this
by irradiating Not I digests of yeast DNA. By analogy with
chromosomes the smear patterns of these digests should reveal
the distance of the probed sequence from its nearest Not I sites
on either side. If digestion is incomplete, overlapping smear
patterns from larger molecules (cut at more distant Not I sites)
might obscure information about the position of the sequence with
respect to the more distant of its two nearest sites. However,
incomplete digestion should not interfere with estimates of the
nearest-site distance, provided that a significant number of
molecules are cut at this site.
[Figure 3 gel images, panels a to c. Lanes 1 to 10; x-ray dose in minutes by lane: (λ), 0, 0.67, 1.3, 2.0, (λ), 4.0, 8.0, 16.0, (λ). Size scale in kb.]
Figure 3: Positioning genes on yeast chromosomes by random-breakage mapping. (a) An ethidium-stained pulsed-field gel. Lanes 1, 6 and 10 contain unirradiated λ-ladder DNA to provide size markers. The sample in lane 2 contained yeast DNA that received no x-rays, and the bands represent unbroken chromosomes. The samples in the other lanes contained the same yeast DNA but were treated with x-rays for successively longer times, as shown at the base of each lane. The x-ray dose-rate was approximately 6 kilorads/minute. DNA breakage is revealed at the higher doses by loss of bands and the progressively lower position of the ethidium-stained smear. Smaller yeast chromosomes show preferential survival with respect to dose compared to the larger ones. The gel was run for 29 hours with eight discrete pulse-times ranging from 16 seconds to 95 seconds. Sizes in kb are indicated on the right. (b) A Southern blot of the same gel, probed with a 1.7 kb fragment of the HIS3 gene. The probe identifies unbroken chromosome XV molecules, at a size position of about 1140 kb, as well as a smear pattern of broken fragments that contain the gene. Two discontinuities can be identified in this smear pattern, at positions indicated by arrows at the left and corresponding to 770 kb (arrow A) and 380 kb (arrow B) respectively. The magnitude of each discontinuity is dose-dependent and the upper discontinuity attenuates at a lower dose than the lower discontinuity. The weak band identified above chromosome XV probably represents partial hybridization of the probe to another site in the yeast genome. (c) The same blot, re-probed with a 1.3 kb fragment from the RAD51 gene. Chromosome V is identified at about 593 kb and two discontinuities in the smear pattern can be seen, at ~350 kb (arrow A) and ~235 kb (arrow B), whose sum is equivalent to 585 kb.
Figure 4a shows an ethidium stained gel containing lanes of
x-irradiated yeast DNA which had been previously digested with
Not I. Figure 4b shows the same gel blotted and probed with
the HIS3 fragment used in figure 3b. It can be seen that the probe
identifies a ~270 kb Not I fragment, which is progressively
broken as the x-ray dose increases. In the irradiated lanes (4 to
8), the smear pattern identified by the probe shows a sharp cut-off
at a position corresponding to ~64 kb. Independent data
provided by L. Riles, J. Dutchik, A. Baktha, A. Link and M.
Olson and based on genomic restriction mapping (10) have placed
HIS3 ~49 kb from a Not I site. This ~15 kb difference may
be within the range of experimental error in the pulsed-field gels
used, or may reflect real differences between the different yeast
strains studied in each laboratory. Figure 4b demonstrates that
probed sequences can be positioned with respect to restriction
sites by random-breakage mapping.
Table 1. Comparison of gene-to-telomere distances using genomic restriction
mapping¹, random-breakage mapping² or genetic mapping methods³.

gene probed                      Restriction   Random-Breakage   Genetic
                                 Mapping¹      Mapping²          Map Position³
HIS3 (Chr. XV)
  nearest end                    402 kb        380 kb            116 cM
  furthest end                   768 kb        770 kb            246 cM
  sum                            1170 kb       1150 kb           362 cM
  % distance from nearest end    34.4          33.0              32.0
RAD51 (Chr. V)
  nearest end                    228 kb        235 kb            94 cM
  furthest end                   357 kb        350 kb            137 cM
  sum                            585 kb        585 kb            231 cM
  % distance from nearest end    39.0          40.2              40.1
URA3 (Chr. V)
  nearest end                    121 kb        130 kb            52 cM
  furthest end                   464 kb        444 kb            179 cM
  sum                            585 kb        574 kb            231 cM
  % distance from nearest end    20.7          22.7              22.5
LEU2 (Chr. III)
  nearest end                    120 kb        114 kb            46 cM
  furthest end                   230 kb        241 kb            102 cM
  sum                            350 kb        355 kb            148 cM
  % distance from nearest end    34.3          32.1              31.1

1. See ref. 10. Data provided by L. Riles, J. Dutchik, A. Baktha, A. Link,
and M. Olson (personal communication).
2. Data from this study.
3. Data from ref. 26.
Fragment distribution as a function of breaks per molecule
In order to fully assess the analysis of broken DNA as a physical
mapping method, we calculated the expected size distributions
of fragments containing genes at different locations as a function
of DNA breakage. We assumed that the overall fragment
distribution will represent the sum of the distributions arising from
a Poisson distribution of molecules with zero, one, two, etc.
breaks. We calculated separately the distributions expected for
molecules with exactly one, two, or three breaks and found that
these distributions can be represented by equations 4 through 12
in the appendix. The distribution of once-broken molecules has
been described above (see figure 1). Figure 6 shows the predicted
distribution for molecules broken exactly twice (figure 6a) or
exactly three times (figure 6b) for probed sequences at three
different internal positions in the molecule. Clearly, molecules
broken more than once contribute to the changes in intensity
observed for one break. The general equations numbered 13-15
in the appendix were used to predict distributions for higher break
classes, and were tested using computer simulations. We
programmed an Apple Macintosh IIcx to place breaks randomly
in any of one hundred domains along a hypothetical molecule
and to graph the number of fragments containing a given domain
versus size, for 300,000 such broken molecules. We found
agreement between these computer simulations and equations
13-15, for all break frequencies tested (one to twenty). Hence,
we used these equations to compute the individual distributions
for each break class. We used the Poisson distribution to compute
the fraction of molecules with each break number for different
average break frequencies, and summed the individual
distributions to give the overall distributions expected for these
frequencies. Representative results are given in figures 5a and
5b, where we show the effect of increasing the average number
of breaks from one to five for genes positioned 10% and 40%
from the nearest end of the molecule. For the nearest end it can
be seen that higher break frequencies still provide information
if the probed gene is near the telomere, but if the gene is near
the center of the chromosome or restriction fragment, a lower
breakage frequency is required to maintain significant changes
in signal. This effect arises because, in order to produce fragments
below the lower intensity change, two breaks are required which
must straddle the gene and be closer together than the distance
between the gene and its nearest telomere or restriction site. Thus,
for any given amount of breakage, the relative frequency of such
fragments will depend on the gene-to-end distance. It can also
be seen that the upper intensity change diminishes much more
rapidly with increasing dose than does the lower one. However,
unlike the lower change, the upper change attenuates with
increasing dose less rapidly for positions near the center of the
molecule than for probed genes nearer one end. This effect
derives from the fact that the distance to the further end is shorter
for more central sites, providing less 'target-length' within which
a second break must fall to diminish the upper intensity change.
[Figure 4 gel images, panels a and b. Lanes 1 to 10; x-ray dose in minutes shown at the base of each lane. Size scale in kb, marked at 48.5, 97.0, 146, 194, 243, 291 and 355.]
Figure 4: Positioning a probed sequence within a restriction fragment. (a) An ethidium-stained pulsed-field gel. Lanes 1, 2 and 10 contain undigested yeast DNA, given no x-rays (lanes 1 and 10) or 1.5 minutes of x-rays (lane 2). Lane 3 contains unirradiated yeast DNA digested with Not I, and lanes 4-8 contain DNA digested with Not I and then irradiated with x-rays for successively longer times, as shown at the base of each lane. The dose rate was as in figure 3. Lane 9 contains undigested, unirradiated λ-ladder DNA to provide size markers. The gel was run for 24 hours with a pulse time of 18 seconds, conditions that expand the resolution in the range of 50-350 kb. (b) The same gel, probed with a 1.7 kb fragment of the HIS3 gene. In the Not I-digested lanes, a strong band at ~270 kb is identified by the probe. After irradiation, the smear pattern resulting from broken fragments from this band shows a discontinuity at ~64 kb, indicated by the arrow. In the original blot a weak discontinuity is also observable at ~222 kb. The weaker bands above 270 kb most likely represent restriction fragments arising from incomplete digestion by Not I, or weak binding of the probe to other sites in the yeast genome (see also figure 3b). Fragments from these bands are probably also responsible for partially obscuring the 222 kb discontinuity.
[Figure 5 plots, panels a and b. Abscissa: FRAGMENT SIZE (%).]
Figure 5: Fragment distribution as a function of break frequency and position of probed site. (a) The predicted size distribution of molecular fragments containing a site at 10% of the length of the molecule from one end, for average break frequencies of 1, 3 and 5 per molecule, assuming a Poisson distribution of number of breaks / molecule, placed at random. (b) The predicted size distribution of molecular fragments containing a site 40% from one end of a molecule, for average break frequencies of 1, 3 and 5 per molecule, assuming a Poisson distribution of breaks / molecule, placed at random.
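The dependence of the lower intensity change on gene position, discussed above, can be explored with a small Monte Carlo sketch. It does not use equations 13-15 of the appendix. Instead it assumes that a Poisson number of breaks placed uniformly along a molecule of unit length is equivalent to drawing exponentially distributed distances from the probed site to the nearest break on each side, truncated at the natural ends. The function names are hypothetical.

```python
import random


def detected_fragments(mean_breaks, site, n_molecules=100_000, seed=1):
    """Simulate probe-containing fragments from a molecule of length 1.

    `site` is the position of the probed sequence as a fraction of the
    length from one end. Returns (sizes of fragments from broken
    molecules, number of unbroken molecules).
    """
    rng = random.Random(seed)
    broken_sizes = []
    unbroken = 0
    for _ in range(n_molecules):
        # Distance from the site to the nearest break on each side.
        gap_left = rng.expovariate(mean_breaks)
        gap_right = rng.expovariate(mean_breaks)
        if gap_left >= site and gap_right >= 1.0 - site:
            unbroken += 1  # no break fell anywhere on this molecule
        else:
            # A natural end truncates the fragment if no break comes first.
            broken_sizes.append(min(gap_left, site) + min(gap_right, 1.0 - site))
    return broken_sizes, unbroken


def fraction_below_nearest_end(broken_sizes, site):
    """Share of detected broken fragments shorter than the gene-to-end distance."""
    nearest = min(site, 1.0 - site)
    return sum(size < nearest for size in broken_sizes) / len(broken_sizes)
```

At an average of three breaks per molecule, this sketch puts only a few percent of the detected fragments below the lower threshold for a site 10% from the end, but roughly a third for a site 40% from the end. That agrees with the statement that a gene near the telomere tolerates higher break frequencies than a centrally located one.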
Experimental variables that influence the technique
As shown above, the optimal dose of x-rays to induce the required
breakage will depend both on the size of the starting molecules
and on the relative internal position of the sequence being located.
We find that a single gel can be informative for a wide range
of positions and DNA sizes by irradiating different samples with
doses that increase two-fold at each step. In this way, eight
irradiated lanes on a gel provide more than a hundred-fold range
of break frequencies, and within this range, there will be one
sample that is within fifty percent of any desired optimum. We
note that in general optimal doses are lower when attempting to
locate a gene with respect to the distal end of a chromosome or
restriction fragment than they are when only information
concerning the proximal end is required, and that for distal
distances optimizing the break-frequency becomes much more
critical.
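The two-fold dose series can be checked numerically. In this sketch the function names and the starting dose of 1.0 (arbitrary units) are assumptions for illustration, and "within fifty percent" is read as the relative difference between the desired optimum and the nearest dose actually given.

```python
def doubling_series(first_dose, n_lanes):
    """Doses that increase two-fold at each step."""
    return [first_dose * 2 ** i for i in range(n_lanes)]


def worst_relative_miss(doses, steps=10_000):
    """Largest relative gap between a desired optimum and the nearest dose.

    The optimum is scanned on an even grid across the range covered by
    the series; the gap is expressed as a fraction of the optimum.
    """
    lowest, highest = doses[0], doses[-1]
    worst = 0.0
    for i in range(steps + 1):
        target = lowest + (highest - lowest) * i / steps
        miss = min(abs(dose - target) for dose in doses) / target
        worst = max(worst, miss)
    return worst


doses = doubling_series(1.0, 8)
fold_range = doses[-1] / doses[0]
```

Eight lanes give a 128-fold range, and under the stated reading the nearest dose is never more than about one third away from any optimum inside that range, comfortably within the fifty percent quoted.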
A limiting case obtains when both the intact molecule and
fragments extending from the probed gene to the distal end are
too large to be resolved on the gel. Here, only the distance to
the nearest end or restriction site is available. To see the intensity
change, break frequency must be low enough to leave many
molecules unbroken in this interval, but high enough to produce
sufficient fragments in the resolvable size range. This may be
equivalent to many breaks per molecule. Above a certain range
the initial size of the molecule becomes less relevant than the
maximum length that can be resolved and the distance of the site
from the end.
Electrophoretic conditions also influence the effectiveness of
the technique. In general, conditions that maximize the degree
of resolution at the region of a given intensity change will provide
the most accurate information about its position. However, as
the resolution increases, both the magnitude and sharpness of
the change should decrease until a limit of observability is
reached. In addition, variations in degree of resolution in different
parts of the gel will themselves lead to intensity variations in the
smear pattern. Care may be needed to distinguish these from the
changes representing position effects, especially in situations
where there is a complex relationship between DNA size and
relative mobility. We also note that reciprocity failure (27) in
film used for autoradiography with intensifier screens
(fluorography) may help, rather than hinder, the detection of
threshold intensity changes. For example, under ideal conditions
we expect a maximum change of two-fold in the amount of bound
probe at the upper position. However, this can be amplified into
a more than two-fold difference in darkness of film, if the signal
intensity is such that reciprocity failure is more significant below
the threshold position than above it. In practice, pre-flashed film
may be helpful when the signal is faint, but for stronger signals,
repeated exposures on non-flashed film as the signal decays may
help to exploit reciprocity failure. We used non-flashed film for
the exposures presented here (figures 3b, 3c and 4b).
Finally, the size range of the DNA being studied and the
amount of extraneous DNA present will influence the
experimental design. In general, the standard error in locating
an internal site should be largely independent of DNA size when
position is expressed as a percentage of the length of the molecule.
However, problems may arise with very small or very large
DNA. In the small range, the probed sequence may itself be a
significant fraction of the length of the molecule; but since there
will be more molecules to bind the probe per unit amount of
DNA, short probes may be effective. In the very large size range,
the signal will be weaker because there are fewer molecules per
unit amount of DNA. This will also be true when the DNA being
studied is a small fraction of the total DNA present, for example
in attempting to locate a sequence using an irradiated restriction
digest of total DNA from large genomes. Here, longer probes
and more heavily loaded gels may be advantageous and recently
developed techniques for analyzing blots using storage phosphor
imaging (28) in place of film may provide greater sensitivity and
improved quantitation. In the limiting case, signal may need to
be intensified with some loss of resolution by running the
fragments for shorter distances.
[Figure 6 plots, panels a and b. Abscissa: FRAGMENT SIZE (%), marked at 50 and 100. Ordinate marked at 1/L and 2/L.]
Figure 6: Fragment distribution from molecules with exactly two or exactly three breaks. (a) Predicted distribution of fragments containing a site whose distance from the nearest end is 10, 30 or 50% of the length of the molecule, and arising from molecules broken exactly twice. (b) Predicted distribution of fragments containing a site whose distance from the nearest end is 10, 30 or 50% of the length of the molecule, and arising from molecules broken exactly three times.
GENERAL DISCUSSION
We have shown that the physical position of a unique sequence
within a DNA molecule can be determined by probing the
fragments of randomly broken molecules after electrophoresis.
We used ionizing radiation to break the DNA, since the induced
breaks are at least approximately random and a reproducible
series of doses could be given quickly and easily. However, other
methods of randomly breaking DNA in controlled amounts should
also be effective. For example, the DNase I method
with manganese described by Melgar and Goldthwait (29)
could probably be adapted for use with large DNA in agarose
plugs. We note that Vollrath et al. (11) have used a chromosome
breakage method to determine the physical position of cloned
sequences in yeast. However, their technique (11) differs in
concept and practice from random-breakage methods in that
homologous recombination is used to break the molecule and
insert a telomere uniquely at the site being mapped.
The random-breakage method outlined here may be useful in
constructing physical maps of whole chromosomes and very large
restriction fragments. In principle, it is applicable to any genes
in DNA molecules up to twice the size that can be resolved by
conventional, pulsed-field or field-inversion electrophoresis.
When used with larger chromosomes, it may reveal the positions
of genes that are within a few megabases of a telomere, since
it is only necessary here to resolve fragments the size of the gene-telomere distance, rather than the whole chromosome. In practice,
it may often be desirable to map genes that are more centrally
located within very large molecules, and it may be essential to
resolve the ambiguity as to which molecular end the gene is
nearest. These issues can be addressed by the method described
here in combination with other approaches. Positioning of genes
within large restriction fragments by using irradiated mammalian
DNA digested with rarely cutting enzymes such as Not I, Sfi
I or the recently described Fse I (30) may be useful for mapping
previously defined internal regions of large chromosomes.
Alternative methods for determining the relative distance between
probed sites in large chromosomes include radiation hybrid
mapping (9, 12) in which somatic cell hybrids are irradiated and
the frequency with which sites are separated by x-ray breakage
in vivo is assessed.
The random-breakage method described here should be well
suited to the analysis of mammalian DNA in yeast artificial
chromosomes. These are usually of a size suitable for pulsed-field gel studies, and the ratio of probed sequence to total DNA
on the gel would be more favorable using yeast than if cells with
much larger genomes were used. Moreover, the ambiguity
regarding the two ends may be resolved with YACs by using
the random-breakage method in combination with restriction
analysis, or the technique described by Pavan et al. (13), in which
a series of stable but shorter derivatives of a YAC are created
by transformation with an integrating plasmid containing a
telomere and mammalian DNA of a repeated sequence. The
derivatives are formed by integration of the plasmid at any one
of a number of sites, and thus consist of terminal deletions of
differing extent derived from the original YAC. To define a
unique position for a gene, it would simply be necessary to
position it by random-breakage mapping both on the original
YAC and on any one shorter derivative. The proximal end can
then be identified by whether the shorter or the longer gene-to-end distance is altered in the shorter YAC derivative. Since only
a few radiation doses are needed, both the original YAC and
its shorter derivative could be studied on the same gel, and a
single derivative could resolve the end-ambiguity for all genes
on a YAC, provided that less than fifty per cent of the molecule
is deleted. Analogous strategies may be devised for resolving
the end ambiguity in other situations, and the use of restriction
enzymes such as Not I and Sfi I in combination with random-breakage will frequently provide the additional information
required. For Saccharomyces cerevisiae, whose Not I and Sfi
I maps are known (A. Link and M. Olson, submitted for
publication), a single re-probable blot containing several lanes
of undigested, Not I-digested and Sfi I-digested DNA given
appropriate doses of x-rays should uniquely position any cloned
single-copy DNA.
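The end-assignment reasoning for a YAC and one terminally deleted derivative can be sketched in code. The following is a minimal illustration only: the function name, the tolerance parameter and the example distances (in kb) are hypothetical and do not come from the experiments reported here. It assumes that each blot yields a shorter and a longer gene-to-end distance, and that the distance left unchanged in the derivative is the one measured to the intact end.

```python
def end_toward_truncation(original, derivative, tol):
    """Decide which original gene-to-end distance runs to the truncated end.

    original, derivative: pairs of gene-to-end distances (same units) measured
    by random-breakage mapping on the full-length YAC and on a shorter
    terminal-deletion derivative. tol is the measurement tolerance.
    Returns "shorter", "longer" or "ambiguous".
    """
    short_o, long_o = sorted(original)
    # A distance that reappears in the derivative runs to the intact end.
    short_unchanged = any(abs(short_o - d) <= tol for d in derivative)
    long_unchanged = any(abs(long_o - d) <= tol for d in derivative)
    if short_unchanged and not long_unchanged:
        return "longer"    # the longer distance was altered by the deletion
    if long_unchanged and not short_unchanged:
        return "shorter"   # the shorter distance was altered by the deletion
    return "ambiguous"     # e.g. a centrally located gene


# Hypothetical 300 kb YAC with a gene 90 kb from one end and 210 kb from the other.
# A derivative lacking 60 kb from the far end gives distances of 90 and 150 kb.
print(end_toward_truncation((90.0, 210.0), (90.0, 150.0), tol=5.0))
```

In this hypothetical case the 90 kb distance is unchanged, so the gene lies 210 kb from the end that was truncated.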
Finally, we wish to point out that the discontinuities arising
from end effects when broken DNA is probed may have some
relevance to studies of DNA breakage and repair, in addition
to their usefulness in mapping. A number of recent studies (25,
31, 32) have used pulsed-field gels to study double-strand breaks
and their repair, and clearly the pattern of smeared DNA can
provide information about these processes (25). To interpret such
data, it is important to understand the fragment patterns to be
expected (see appendix). In addition, the discontinuities observed
may themselves provide experimental information about the
number of breaks and their distribution along the molecules
studied.
ACKNOWLEDGEMENTS
We would like to acknowledge early work on this topic by George
Carle in the laboratory of Maynard Olson. We would also like
to thank L. Riles, J. Dutchik, A. Baktha, A. Link and M. Olson
for sharing unpublished information concerning the physical map
of Saccharomyces cerevisiae. We thank Sylvia Spengler for
careful reading of the manuscript. This work was supported in
part by NIH grant GM30990 to R. Mortimer and by funds
administered through DOE Contract No. DE-AC03-76SF00098
to the Human Genome Center at Lawrence Berkeley Laboratory.
REFERENCES
1. Southern, E.M. (1975) J. Mol. Biol., 98, 503-517.
2. Schwartz, D.C. and Cantor, C.R. (1984) Cell, 37, 65-75.
3. Carle, G.F. and Olson, M.V. (1984) Nucleic Acids Res., 12, 5647-5664.
4. Carle, G.F., Frank, M. and Olson, M.V. (1986) Science, 232, 65-68.
5. Chu, G., Vollrath, D. and Davis, R.W. (1986) Science, 234, 1582-1585.
6. Chu, G. (1989) Electrophoresis, 10, 290-295.
7. Gekeler, V., Weger, S., Eichele, E. and Probst, H. (1989) Anal. Biochem.,
181, 227-233.
8. Daniels, D.L., Olson, C.H., Brumley, R. and Blattner, F.R. (1990) Nucleic
Acids Res., 18, 1312.
9. Goss, S.J. and Harris, H. (1977) J. Cell Sci., 25, 39-57.
10. Olson, M.V., Dutchik, J.E., Graham, M.Y., Brodeur, G.M., Helms, C.,
Frank, M., MacCollin, M., Scheinman, R. and Frank, T. (1986) Proc. Natl.
Acad. Sci. USA, 83, 7826-7830.
11. Vollrath, D., Davis, R.W., Connelly, C. and Hieter, P. (1988) Proc. Natl.
Acad. Sci. USA, 85, 6027-6031.
12. Cox, D.R., Pritchard, C.A., Uglum, E., Casher, D., Kobori, J. and Myers,
R.M. (1989) Genomics, 4, 397-407.
13. Pavan, W.J., Hieter, P. and Reeves, R.H. (1990) Proc. Natl. Acad. Sci.
USA, 87, 1300-1304.
14. Smith, C.L. and Condemine, G. (1990) J. Bact., 172, 1167-1172.
15. Burgers, P.M.J. and Percival, K.J. (1987) Anal. Biochem., 163, 391-397.
16. Van Ommen, G.J.B. and Verkerk, J.M.H. (1986) in Davies, K.E. (ed.)
Human Genetic Diseases, IRL Press, Oxford, vol. 8, pp. 113-133.
17. Struhl, K. (1985) Nucleic Acids Res., 13, 8587.
18. Burke, D.T., Carle, G.F. and Olson, M.V. (1987) Science, 236, 806-812.
19. Rose, M., Grisafi, P. and Botstein, D. (1984) Gene, 29, 113-124.
20. Calderon, I.L., Contopoulou, C.R. and Mortimer, R.K. (1982) Current
Genetics, 7, 93-100.
21. Andreadis, A., Hsu, Y.-P., Kohlhaw, G.B. and Schimmel, P. (1982) Cell,
31, 319-325.
22. Birnboim, H.C. and Doly, J. (1979) Nucleic Acids Res., 7, 1513-1523.
23. Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) In Molecular Cloning:
A Laboratory Manual. Cold Spring Harbor University Press, Cold Spring
Harbor.
24. Press, W.H., Flannery, B.P., Teukolsky, S.A. and Vetterling, W.T. (1986)
In Numerical Recipes, Cambridge University Press, New York, pp.
196-197, 714-715.
25. Contopoulou, C.R., Cook, V.E. and Mortimer, R.K. (1987) Yeast, 3, 71-76.
26. Mortimer, R.K., Schild, D., Contopoulou, C.R. and Kans, J.A. (1989) Yeast,
5, 321-403.
27. Laskey, R.A. and Mills, A.D. (1975) Eur. J. Biochem., 56, 335-341.
28. Johnston, R.F., Pickett, S.C. and Barker, D.L. (1990) Applied and Theoret.
Electrophoresis, In Press.
29. Melgar, E. and Goldthwait, D.A. (1968) J. Biol. Chem., 243, 4409-4416.
30. Meyertons Nelson, J., Miceli, S.M., Lechevalier, M.P. and Roberts, R.J.
(1990) Nucleic Acids Res., 18, 2061-2064.
31. Blocher, D., Einspenner, M. and Zajackowski, J. (1989) Int. J. Radiat. Biol.,
56, 437-448.
32. Game, J.C., Sitney, K.C., Cook, V.E. and Mortimer, R.K. (1989) Genetics,
123, 695-713.
APPENDIX
Functions relating the size of a detected fragment of length X
to its expected frequency relative to the total number of detected
fragments were calculated for the first three classes (those with
one, two or three breaks). The calculation for a specific break
class was based on using the previous break class and determining
what distribution would result from an additional break. These
calculations are carried out on a chromosome of size L, with
a probed site a distance D from the right end (assumed to be
the closest) and a distance L - D from the other end (see figure
1). The length of the probed sequence is assumed to be very small
relative to the length of the chromosome, and it is treated as a
point. Breaks can occur on either the left side or the right side
of the probed site. Given that a break occurs, the probability that
it occurs to the right of the probed gene is D/L and the probability
that it occurs to the left is (L-D)/L. For more than one break,
combinations of these probabilities were used to normalize the
relevant distributions.
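For exactly one break, the distribution of fragments f(X) is:
\[ f(X) = 0, \qquad 0 < X < D \tag{1} \]
\[ f(X) = \frac{1}{L}, \qquad D < X < L \quad \text{(arising from one break to the left)} \tag{2} \]
\[ f(X) = \frac{1}{L}, \qquad L-D < X < L \quad \text{(arising from one break to the right)} \tag{3} \]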
These equations are then summed to yield the distribution of
fragments for exactly one break.
\[ f_1(X) = 0, \qquad 0 < X < D \tag{4} \]
\[ f_1(X) = \frac{1}{L}, \qquad D < X < L-D \tag{5} \]
\[ f_1(X) = \frac{2}{L}, \qquad L-D < X < L \tag{6} \]
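The one-break distribution (equations 4-6) implies that no detected fragment is shorter than D, that a fraction (L-2D)/L of detected fragments fall between D and L-D, and that a fraction 2D/L fall between L-D and L. The following is a minimal simulation sketch of that statement, assuming breaks are uniformly distributed along the molecule. The function names, the number of trials and the example values (L = 100, D = 30) are illustrative choices and are not the simulation code used in this work.

```python
import random


def detected_fragment_length(L, D, n_breaks, rng):
    """Length of the fragment that contains the probed site.

    The site lies a distance D from the right end, i.e. at position L - D
    measured from the left end. Each break is placed uniformly on (0, L).
    """
    site = L - D
    left, right = 0.0, L          # current ends of the fragment carrying the site
    for _ in range(n_breaks):
        b = rng.uniform(0.0, L)
        if b < site:
            left = max(left, b)   # break to the left of the site
        else:
            right = min(right, b) # break to the right of the site
    return right - left


def region_fractions(L, D, n_breaks, trials=200_000, seed=1):
    """Fractions of detected fragments with X < D, D <= X < L-D, and X >= L-D."""
    rng = random.Random(seed)
    counts = [0, 0, 0]
    for _ in range(trials):
        x = detected_fragment_length(L, D, n_breaks, rng)
        if x < D:
            counts[0] += 1
        elif x < L - D:
            counts[1] += 1
        else:
            counts[2] += 1
    return [c / trials for c in counts]


fractions = region_fractions(L=100.0, D=30.0, n_breaks=1)
print(fractions)  # expected to be close to [0, 0.4, 0.6] for one break
```

The same function accepts larger values of n_breaks, so it can also be used to check the two-break and three-break distributions numerically.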
To calculate the distribution for exactly two breaks, the one-break
equations are retained as they were originally calculated (equations 1-3)
and the distribution that results from the addition of a break is
determined. For example, to determine the distribution for exactly
two breaks, with one break on each side of the site, we start with
the function f(X) = 1/L that is due to one break to the left and
add a break to the right. This is done by observing that a fragment
of size D, generated by a break on the left and then broken
once at random, will generate detected fragments
ranging in size from 0 to D; likewise, a fragment of size L
will generate detected fragments ranging in size
from L-D to L, since the most that a break
on the right can remove is a piece of size D. The new function f2(X) is
then calculated by integrating f1(X) over all the original
fragments that will contribute a detected fragment of size X.
When this is done for all the possible two-break combinations,
normalizing for the relative frequencies of each combination, and
the results are summed, the functions obtained are:
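\[ f_2(X) = \frac{2X}{L^2}, \qquad 0 < X < D \tag{7} \]
\[ f_2(X) = \frac{2(L-X) + 2D}{L^2}, \qquad D < X < L-D \tag{8} \]
\[ f_2(X) = \frac{6(L-X)}{L^2}, \qquad L-D < X < L \tag{9} \]
Equations 7-9 are graphed in figure 6a for three different values
of D.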
The functions resulting from a similar calculation for exactly three
breaks are:
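\[ f_3(X) = \frac{6X(L-X)}{L^3}, \qquad 0 < X < D \tag{10} \]
\[ f_3(X) = \frac{3(L-X)^2 + 6D(L-X)}{L^3}, \qquad D < X < L-D \tag{11} \]
\[ f_3(X) = \frac{12(L-X)^2}{L^3}, \qquad L-D < X < L \tag{12} \]
Equations 10, 11 and 12 are graphed in figure 6b for three values
of D.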
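These three sets of functions (equations 4-12) suggested a
general solution (for n breaks) of:
\[ f_n(X) = \frac{n(n-1)\,X\,(L-X)^{n-2}}{L^n}, \qquad 0 < X < D \tag{13} \]
\[ f_n(X) = \frac{n(L-X)^{n-1}}{L^n} + \frac{n(n-1)\,D\,(L-X)^{n-2}}{L^n}, \qquad D < X < L-D \tag{14} \]
\[ f_n(X) = \frac{n(n+1)(L-X)^{n-1}}{L^n}, \qquad L-D < X < L \tag{15} \]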
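As a numerical check on the general solution, equations 13-15 can be written as a single function and integrated over all fragment sizes; for any number of breaks the result should be close to one. The sketch below is illustrative only. The function names, the midpoint-rule integration and the example values (L = 100, D = 30) are assumptions made for this example, and D is taken to be the distance to the nearer end (D no greater than L/2).

```python
def f_n(X, L, D, n):
    """Density of detected fragments of length X after exactly n breaks.

    Implements equations 13-15. Assumes 0 < D <= L/2 and n >= 1.
    """
    if not 0.0 < X < L:
        return 0.0
    # Fragments bounded by a break on both sides (only possible when n >= 2).
    both = n * (n - 1) * (L - X) ** (n - 2) / L ** n if n >= 2 else 0.0
    # Fragments bounded by one break and one natural end of the molecule.
    one_end = n * (L - X) ** (n - 1) / L ** n
    if X < D:
        return both * X                                 # equation 13
    if X < L - D:
        return one_end + both * D                       # equation 14
    return n * (n + 1) * (L - X) ** (n - 1) / L ** n    # equation 15


def total_probability(L, D, n, steps=20_000):
    """Midpoint-rule integral of f_n over 0 < X < L."""
    h = L / steps
    return sum(f_n((i + 0.5) * h, L, D, n) for i in range(steps)) * h


for n in (1, 2, 3, 10):
    print(n, total_probability(100.0, 30.0, n))
```

For n = 1 the function reduces to the values 0, 1/L and 2/L of equations 4-6.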
The functions were verified using computer simulations, and the
general solution (equations 13-15) was found to match the
computer simulations for all break classes tested (one through
twenty). Each function, based on a specific number of breaks,
was weighted according to the Poisson distribution and the
weighted functions were summed to give the distribution of
detected fragments for an average number of breaks. The
resulting equations were used to graph the expected distribution
of detected fragments as a function of position of probed site and
average number of breaks. When equations 13-15 are multiplied
by the appropriate Poisson terms and summed the resulting
equations can be simplified using the Taylor series expansion of
e^y. This yields exponential solutions for an average number of
breaks identical to analytical solutions to be presented elsewhere
(V. Cook and R. Mortimer, submitted for publication).
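The Poisson weighting described above can be illustrated numerically. The sketch below sums equations 13-15 over the number of breaks n, weighting each term by the Poisson probability for a mean of m breaks per molecule, and compares the result with exponential expressions obtained here by carrying out the series sum directly. Those closed forms are a reconstruction made for this example and are not quoted from the analytical solutions cited above; the function names, the truncation at 60 terms and the example values are likewise assumptions. Unbroken molecules, which occur with probability e^(-m) and have length L, are not included in either expression.

```python
import math


def f_n(X, L, D, n):
    """Equations 13-15: density of detected fragments after exactly n breaks."""
    if not 0.0 < X < L:
        return 0.0
    both = n * (n - 1) * (L - X) ** (n - 2) / L ** n if n >= 2 else 0.0
    if X < D:
        return both * X
    if X < L - D:
        return n * (L - X) ** (n - 1) / L ** n + both * D
    return n * (n + 1) * (L - X) ** (n - 1) / L ** n


def poisson_weighted(X, L, D, m, n_max=60):
    """Sum of f_n weighted by the Poisson probability of n breaks (mean m)."""
    total = 0.0
    for n in range(1, n_max + 1):
        weight = math.exp(-m) * m ** n / math.factorial(n)
        total += weight * f_n(X, L, D, n)
    return total


def closed_form(X, L, D, m):
    """Exponential form of the same sum (reconstructed here, see lead-in)."""
    decay = math.exp(-m * X / L)
    if X < D:
        return (m / L) ** 2 * X * decay
    if X < L - D:
        return (m / L) * (1.0 + m * D / L) * decay
    return (m / L) * (2.0 + m * (L - X) / L) * decay


for X in (10.0, 50.0, 90.0):
    print(X, poisson_weighted(X, 100.0, 30.0, 2.0), closed_form(X, 100.0, 30.0, 2.0))
```

In each size range the density falls off as exp(-mX/L), and the prefactor changes abruptly at X = D and X = L-D. These steps correspond to the two threshold changes in signal intensity that the method relies on.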