COMMENTARY
Pulsed field gel electrophoresis and investigations into mammalian genome
organization
KATHELEEN GARDINER
Eleanor Roosevelt Institute for Cancer Research, Denver, Colorado 80206, USA
Pulsed field gel (PFG) electrophoresis can resolve DNA
molecules as large as several million base pairs (mbp) in
size (Schwartz and Cantor, 1984; Carle and Olson, 1984;
Gardiner et al. 1986; Carle et al. 1986; Chu et al. 1986;
Clark et al. 1988; Anand, 1986; Gardiner and Patterson,
1988; Orbach et al. 1988). This is in contrast to conventional electrophoresis where the practical upper limit is
50-100 kb (1 kb = 10^3 bp). This increased resolving power
has especially important ramifications for the study of
complex genomes, allowing new questions to be asked and
providing faster solutions to older ones. It is now possible,
for example, to examine gene organization, physically link
and size mammalian genes, and search for translocation
breakpoints by means that are far more rapid and reliable
than conventional methods. PFG has made the cloning of
large genes, or groups of genes, possible via the yeast
artificial chromosome (YAC) method (Burke et al. 1987),
and it also makes the mapping of the human genome a
realistic endeavour. The purpose of this commentary is to
discuss how this technique can be applied to the study of
mammalian genomes, and to describe some of the insights
into human genome organization that are beginning to
emerge.
The essential technical innovation in PFG is the use of
electric fields that are not constant throughout the gel run,
but that regularly alternate in direction. The frequency
with which the change in direction occurs, called the pulse
time, dictates the size class of fragments to be resolved. In
general, pulse times in the order of seconds are used to
separate fragments of a few kb to 100-200 kb, 1 min for
those up to 1600 kb (Figs 1,2), and 30-120 min for
1600-10 000 kb. Several different pulsed field systems
have been developed, all capable of resolving up to 6 mbp,
and likely up to 10 mbp (for recent descriptions of the most
useful designs, and theoretical discussions, see Stellwagen, 1989). Later versions (Gardiner et al. 1986; Carle et al.
1986; Chu et al. 1986; Clark et al. 1988; Anand, 1986) have
largely eliminated the problem in the original designs
(Schwartz and Cantor, 1984; Carle and Olson, 1984) of lane-to-lane
distortion in the path of migration of the DNA. The
result is that fragments, and hybridization bands, can be
easily sized by comparison of their positions with those of
yeast chromosome or phage markers. This is particularly
important in making the technique readily applicable to
the study of the more complex mammalian genomes.
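The relationship between pulse time and fragment size described above can be written down as a simple lookup. This is only a sketch of the rule of thumb given in the text: the function name `suggest_pulse_time` is hypothetical, the cut-offs (200, 1600 and 10 000 kb) are the approximate boundaries quoted above, and real conditions depend on the apparatus, voltage and gel.

```python
def suggest_pulse_time(max_fragment_kb):
    """Return the pulse-time regime quoted in the text for a target size.

    max_fragment_kb: size in kb of the largest fragment to be resolved.
    The boundaries are approximate rules of thumb, not instrument settings.
    """
    if max_fragment_kb <= 0:
        raise ValueError("fragment size must be positive")
    if max_fragment_kb <= 200:
        return "seconds"        # a few kb up to 100-200 kb
    if max_fragment_kb <= 1600:
        return "about 1 min"    # up to 1600 kb
    if max_fragment_kb <= 10_000:
        return "30-120 min"     # 1600-10 000 kb
    raise ValueError("beyond the roughly 10 mbp limit quoted for PFG")


if __name__ == "__main__":
    for size_kb in (50, 1000, 5000):
        print(size_kb, "kb ->", suggest_pulse_time(size_kb))
```

The lookup is deliberately coarse: it restates the three regimes in the text and refuses sizes above the stated upper limit rather than extrapolating.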
Journal of Cell Science 96, 5-8 (1990)
Printed in Great Britain © The Company of Biologists Limited 1990
Because of the large size of the molecules to be examined, DNA for PFG must be carefully prepared to prevent
shearing. This problem has been neatly and quickly solved
by the practice of preparing DNA in agarose blocks or
plugs (Schwartz and Cantor, 1984; Gardiner et al. 1986).
Generally, live cells are resuspended in a high-EDTA
buffer (to inactivate nucleases) and mixed with low-melting-point agarose. The mixture can be pipetted into molds
and, once solidified, easily handled with no damage to the
immobilized DNA. Cells can be lysed, proteins degraded,
debris removed, and DNA restricted, all by dialyzing
appropriate reagents into and out of the agarose plugs.
The procedure is simple, and provides DNA fragments
larger than 10 mbp in size.
The analysis of genomes of higher organisms requires
restriction enzyme digestion. With an upper size limit (at
present) of 10 million base pairs, PFG is obviously incapable of separating intact mammalian chromosomes
(the smallest human chromosome, for example, is about
50 mbp in size), and because such large fragments are to be
examined, a special class of restriction enzymes is needed.
These must cut relatively infrequently, generating fragments in complete digests of between 100 and 1000 kb in
average size. Enzymes that are particularly useful in this
respect are those whose recognition sequences are rich in
G+C and/or contain the dinucleotide CpG. Mammalian
genomes are approximately 40% G+C, and the CpG
dinucleotide is overall statistically under-represented
(Swartz et al. 1962; McClelland and Ivarie, 1982). The
result is that the recognition sequences for enzymes such
as NruI (TCGCGA) or NotI (GCGGCCGC) are expected to
occur only about once every several hundred to a thousand
kb (Drmanac et al. 1986), an excellent range for pulsed
field work (in comparison, EcoRI sites occur approximately
every 4 kb). These fragment sizes are, of course, averages,
statistical in nature, and therefore subject to deviation. In
particular, the apparent frequency of these rare recognition sites is complicated by their clustering in CpG
islands and their resistance to cleavage due to methylation (see below). Nevertheless, these rare-cutting restriction enzymes remain the most generally useful group for
pulsed field analysis. Examples of human and hamster
DNA digested with five such enzymes are shown in Fig. 2.
All are complete digests, and all have generated fragments averaging >500 kb as demonstrated by comparison
with the Saccharomyces cerevisiae chromosomal markers.
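The reasoning behind the choice of enzymes can be illustrated with a very rough calculation of expected site spacing. The sketch below assumes independent bases at 40% G+C and, optionally, a single depletion factor applied to each CpG in the site. Both the function name `expected_spacing_bp` and the `cpg_factor` parameter are illustrative assumptions. The model ignores CpG islands and methylation and is not the calculation of Drmanac et al. (1986), so it should not be expected to reproduce their figures.

```python
import math

# Base frequencies for a genome of 40% G+C, assuming A=T and G=C.
BASE_FREQ = {"A": 0.3, "T": 0.3, "G": 0.2, "C": 0.2}


def expected_spacing_bp(site, base_freq=BASE_FREQ, cpg_factor=1.0):
    """Mean distance in bp between occurrences of a recognition site.

    site: recognition sequence, e.g. "GAATTC".
    cpg_factor: hypothetical multiplier (<1 means depletion) applied once
        per CpG dinucleotide in the site; 1.0 means no correction.
    """
    probability = math.prod(base_freq[base] for base in site)
    probability *= cpg_factor ** site.count("CG")
    return 1.0 / probability


if __name__ == "__main__":
    for name, site in (("EcoRI", "GAATTC"), ("NruI", "TCGCGA"), ("NotI", "GCGGCCGC")):
        plain = expected_spacing_bp(site)
        depleted = expected_spacing_bp(site, cpg_factor=0.2)  # illustrative value only
        print(f"{name}: {plain / 1000:.1f} kb uncorrected, {depleted / 1000:.1f} kb with CpG depletion")
```

With no correction the model places EcoRI sites about every 3 kb, the same order as the figure quoted above. Each CpG in a site lengthens the expected spacing further once depletion is included, which is why sites such as those of NruI and NotI are so rare.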
Fig. 1. Resolution of fragments <450 kb. Lane 1, lambda DNA
digested with HindIII (bands are 1.9, 2.05, 4.6, 6.4, 9.6 kb); lane
2, concatamers of lambda DNA (48, 96, 144,... kb); lane 3,
chromosomes of S. cerevisiae (220, 265, 340, 420 kb); lanes 4 and
5, human DNA digested with BssHII and NotI, respectively.
Electrophoresis was for 10 h at 250 V with a pulse time of 10 s,
in the TAFE pulsed field apparatus (Gardiner and Patterson,
1989; Beckman Instruments). Fragments larger than 450 kb do
not resolve and run as the diffuse band seen near the wells in
lanes 3-5.
Several features of Fig. 2 are noteworthy. First, in the
ethidium bromide stain a clear light/dark banding pattern is observable. This pattern is both enzyme- and
species-specific, but its source is not at all clear and has not
so far been investigated. It is, however, a useful phenomenon in that its appearance is a good diagnostic tool. It
indicates that the digestions are likely to be complete, the
DNA is undegraded, and the gel has not been overloaded.
This latter is an important point when accurate determination of band sizes is required. Increasing the quantity of
DNA in the plugs by 25% can noticeably decrease the
mobility of the restriction fragments, thus increasing
the apparent size of hybridization bands relative to the
markers. The upper limit on the quantity of DNA that can
be loaded without affecting mobility is possibly dependent
on the gel system and plug volumes, and needs to be
determined if accurate sizing of fragments is important.
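Sizing a band against the marker lane amounts to interpolating between the two markers that flank it. The sketch below uses straight-line interpolation, which is a simplifying assumption; the true relationship between mobility and size depends on the gel system and pulse time. The marker sizes are the S. cerevisiae chromosomes listed for Fig. 1, but the migration distances are hypothetical values chosen only for illustration.

```python
from bisect import bisect_left

# (migration distance in mm, size in kb). Sizes are the yeast markers of
# Fig. 1; the distances are made-up illustrative values.
MARKERS = [(20.0, 420), (30.0, 340), (40.0, 265), (50.0, 220)]


def estimate_size_kb(distance_mm, markers=MARKERS):
    """Estimate a band size by linear interpolation between flanking markers."""
    points = sorted(markers)
    distances = [d for d, _ in points]
    if not distances[0] <= distance_mm <= distances[-1]:
        # Extrapolating beyond the markers is not attempted.
        raise ValueError("band lies outside the marker range")
    i = bisect_left(distances, distance_mm)
    if distances[i] == distance_mm:
        return points[i][1]
    (d0, s0), (d1, s1) = points[i - 1], points[i]
    fraction = (distance_mm - d0) / (d1 - d0)
    return s0 + fraction * (s1 - s0)


if __name__ == "__main__":
    print(estimate_size_kb(35.0))
```

Because an overloaded lane migrates more slowly, the same calculation applied to an overloaded digest would return a size that is too large, which is the practical reason for controlling the quantity of DNA per plug.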
Fig. 2. Resolution of fragments <1600 kb. Lanes 1, human DNA; lanes 2 and 3, DNA from two hamster/human hybrid cell lines
that contain chromosome 21 as their only human material. Marker bands are the chromosomes from S. cerevisiae; the smallest band
is 220 kb, the largest 1600 kb. N, digestion with NotI; B, BssHII; M, MluI; Nr, NruI; S, SacII. On the left is an ethidium bromide-stained
gel, after electrophoresis for 20 h at 250 V with a 60 s pulse time. Because of the increased pulse time, 12 chromosomal bands
are resolved in the yeast marker lane, and the densely staining band of unresolved material in the digests does not appear (compare
with Fig. 1). The results of Southern transfer and hybridization with a chromosome 21-specific unique sequence probe are shown on
the right. In digests with NotI and NruI, the probe hybridizes to larger fragments in the hybrid DNA than in the human DNA.
A second interesting feature is the variation in band
sizes seen among different DNAs digested with the same
enzyme. In digests with NotI and NruI, the chromosome
21-specific unique sequence identifies a larger fragment in
DNA from the two hybrid cell lines than in DNA from
lymphocytes. This variation could be due to a sequence
polymorphism or to partial digestion arising from a methylation difference. The latter is a possibility because
mammalian DNA contains a 5-methylcytosine modification that largely occurs in CpG dinucleotides (Razin and
Riggs, 1980) and has been shown to inhibit many PFG
enzymes (Nelson and McClelland, 1987). A strong argument for the presence of variable methylation can be made
in cases where two or more bands are seen that are of the
same sizes but vary in intensity among DNA sources. Such
differences are useful in map construction, because they
sometimes indicate the linkage of two probes that in a
complete digest (or unmethylated DNA) would be found on
different fragments (Gardiner et al. 1988, 1990; Gardiner
and Patterson, 1989). Reasonable criteria, however, are
needed to conclude physical linkage. In digests with
several enzymes, two probes should identify the same-sized
fragments or, in partial digests, detect the same
complex patterns. Conversely, the failure to detect any
common fragments gives no information on the proximity
of two sequences. Sites for PFG enzymes frequently cluster
in the unmethylated CpG islands that designate the 5'
ends of many genes (Bird, 1987). Two sequences may
therefore be only the few hundred base pairs of a CpG
island apart and still share no PFG restriction fragments.
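The linkage criterion described above, that two probes should detect same-sized fragments with several enzymes, can be expressed as a small comparison routine. This is a sketch under stated assumptions: the function names, the `min_enzymes` threshold of two and the `tolerance_kb` parameter are illustrative choices, and the probe names and fragment sizes in the example are hypothetical rather than taken from the chromosome 21 map.

```python
def shared_fragments(frags_a, frags_b, tolerance_kb=0):
    """Return {enzyme: [sizes]} for fragments that both probes detect.

    frags_a, frags_b: dicts mapping enzyme name to a set of band sizes in kb.
    tolerance_kb: sizes within this many kb are treated as the same band.
    """
    shared = {}
    for enzyme in frags_a.keys() & frags_b.keys():
        common = sorted(
            a for a in frags_a[enzyme]
            if any(abs(a - b) <= tolerance_kb for b in frags_b[enzyme])
        )
        if common:
            shared[enzyme] = common
    return shared


def linkage_supported(frags_a, frags_b, min_enzymes=2, tolerance_kb=0):
    """True if the probes share bands with at least min_enzymes enzymes.

    False means only that no evidence for linkage was found; it does not
    show that the two sequences are far apart.
    """
    return len(shared_fragments(frags_a, frags_b, tolerance_kb)) >= min_enzymes


if __name__ == "__main__":
    # Hypothetical hybridization results, sizes in kb.
    probe_x = {"NotI": {700}, "MluI": {775}, "NruI": {1500}}
    probe_y = {"NotI": {700}, "MluI": {775}, "NruI": {500}}
    probe_z = {"NotI": {2000}, "MluI": {1100}}
    print(shared_fragments(probe_x, probe_y))
    print(linkage_supported(probe_x, probe_y), linkage_supported(probe_x, probe_z))
```

The asymmetry in the return value mirrors the argument in the text: shared bands with several enzymes support linkage, whereas the absence of shared bands is uninformative, because two probes on opposite sides of a CpG island may share no fragments at all.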
A third point from Fig. 2 is that many restriction
fragments are significantly larger than 500 kb; indeed,
many are so large that they are retained within the wells,
as can be judged by the intensity of staining there. It
follows that this level of resolution (<1600 kb) is likely to
be insufficient to determine the fragment sizes or to
explore the linkage relationships of many probes. For
molecules >1600 kb very different electrophoresis conditions are required, generally including lower voltages
(1-3 V cm^-1), pulse times of about 1 h, and run times of a
patience-straining 5-10 days. Physical mapping work
with human chromosome 21 has indicated that such
conditions are unfortunately frequently required (Gardiner et al. 1988, 1990). This is not a random requirement,
however, but rather one likely associated with the cytogenetic band location of the probe being examined. Human
chromosome 21 at low resolution can be divided into the
centromere-proximal half that is predominantly a Giemsa
dark band, 21q21, and the telomere-proximal half that is a
Giemsa light band, 21q22 (see Fig. 4, below). Fig. 3 shows
that fragments from 21q22, with several enzymes, average
considerably less than 1000 kb, whereas those from 21q21
average greater than 1500 kb. (Furthermore, in more
detailed analysis, it can be shown that the fragments
> 1500 kb from q22 largely map to the small Giemsa
dark band, 21q22.2.) The same observation is made with
the enzyme SfiI (separated out because, generally having
no CpGs in its recognition sequence, it tends to give
smaller fragments overall than the other enzymes used),
where the predominant size class (<100 kb) in 21q22 is
completely absent in 21q21. This observation is, in part, a
reflection of fundamental differences between Giemsa
light and dark bands. Dyes used in cytogenetic staining
are base-specific, and indicate that light bands are higher
in G+C content than dark bands (Simola et al. 1975). In
addition, light bands are enriched for genes (or at least for
those so far mapped; Goldman et al. 1984; Ikemura and
Aota, 1988) and, therefore, for the gene-associated CpG
islands. It is logical, therefore, that light bands, because of
their different base composition, will contain more sites
for the rare-cutting restriction enzymes, as has been
demonstrated here. Pulsed field analysis can, therefore,
give an indication of the band location of a sequence,
possibly bearing on the nature of a gene, whether it is
tissue-specific or housekeeping.
[Fig. 3 histograms: panels A and B (SfiI); horizontal axis, kilobase pairs.]
Fig. 3. Variation in restriction fragment size with Giemsa band
location. The histograms show the proportion of fragments
within each size range: A, with enzymes NotI, BssHII, MluI,
NruI and SacII; B, with SfiI. ( ) data for fragments within
the Giemsa light band 21q22; ( ) those within the Giemsa
dark band 21q21 (see Fig. 4 for a schematic of the Giemsa band
pattern). A total of 111 fragments from 21q22 and 58 fragments
from 21q21 were analyzed. (For details, see Gardiner et al.
1988, 1990).
[Fig. 4 map: Giemsa band schematic of 21q with probe designations and restriction fragment sizes in kb.]
Fig. 4. Clustering of unique and transcribed sequences mapped to human chromosome 21. At the left is shown a schematic drawing
of the Giemsa band pattern of the long arm of human chromosome 21 at intermediate resolution. Horizontal broken lines indicate
regions defined by translocation breakpoints (Gardiner et al. 1988, 1990) named at the extreme right. Column 1 lists the human gene
mapping designation for the probes in column 2. The next six columns give the sizes of restriction fragments, detected by each probe,
in kb, for the enzymes NotI, BssHII, MluI, NruI, SacII and SfiI. Brackets indicate probes that are physically linked (Gardiner et al.
1989); transcribed sequences are underlined (CP21G1 and GART also contain gene sequences (Davidson et al. 1985; Schild et al.
1990)). Boxed numbers indicate the smallest restriction fragment defining that group, and therefore give upper limits for the amount
of DNA necessary to contain all sequences within a group.
Pulsed field analysis can also provide information both
on the sizes of mammalian genes and on their organization. Again, the physical map of human chromosome 21 is
beginning to provide interesting numbers. Essentially the
entire long arm of this chromosome (40 million base pairs)
has been accounted for on a collection of 33 NotI restriction
fragments, using some 50 unique sequence probes. Using
complete digests, the majority (70%) of these probes have
been physically linked in 13 separate groups (Gardiner et
al. 1990, and unpublished observations). Fig. 4 shows data
on the six physical linkage groups mapping to the distal
third of the long arm. Together, these groups contain 14 of
the 24 genes (and 19 of 54 unique sequences) used to
construct the map. Consideration of the smallest restriction fragment that defines each group indicates that these
genes are locally clustered, being contained within only
3400 kb, or approximately 8% of the long arm. Certainly,
chromosome 21 contains many more than these two dozen
genes (probably 500-1000), but this current evidence
strongly suggests that they are not uniformly distributed.
Perhaps a separate class of genes remains to be discovered,
one comprising genes of larger size or of very different
location or organization.
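The clustering argument rests on simple arithmetic that can be made explicit. The sketch below uses only the figures quoted above (a 40 000 kb long arm, 3400 kb spanned by the six linkage groups, and 14 of the 24 mapped genes). The enrichment ratio it computes is a derived illustration rather than a number reported in the text, and it assumes that the mapped genes are a fair sample, which may not hold.

```python
LONG_ARM_KB = 40_000      # long arm of chromosome 21, about 40 million bp
CLUSTER_SPAN_KB = 3_400   # upper limit for DNA spanned by the six groups
GENES_IN_GROUPS = 14      # genes contained in those groups
GENES_ON_MAP = 24         # genes used to construct the map

# Fraction of the long arm occupied by the groups (about 8%).
span_fraction = CLUSTER_SPAN_KB / LONG_ARM_KB

# Fraction of the mapped genes that fall within that DNA.
gene_fraction = GENES_IN_GROUPS / GENES_ON_MAP

# How many times more genes than a uniform distribution would place there.
enrichment = gene_fraction / span_fraction

if __name__ == "__main__":
    print(f"span: {span_fraction:.1%} of the long arm")
    print(f"genes: {gene_fraction:.1%} of those mapped")
    print(f"enrichment over uniform: {enrichment:.1f}-fold")
```

On these figures more than half of the mapped genes lie in less than a tenth of the long arm, which is the sense in which the genes are described as locally clustered.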
In conclusion, pulsed field gel electrophoresis has
increased by over 100-fold the size of DNA molecules that
can be quickly and easily resolved. The physical mapping
projects that it has made feasible are already yielding new
information on the relative proximity of individual genes,
local clustering of groups of genes, and the molecular basis
of cytogenetic banding patterns.
This is a contribution (no. 1076) from the Eleanor Roosevelt
Institute for Cancer Research. This work was supported by grants
from the National Institutes of Health (HD17449 and HD22720).
References
ANAND, R. (1986). Pulsed field gel electrophoresis - a technique for
fractionating large DNA molecules. Trends Genet. 2, 278-283.
BIRD, A. P. (1987). CpG islands as gene markers in the vertebrate
nucleus. Trends Genet. 3, 342-347.
BURKE, D. T., CARLE, G. F. AND OLSON, M. V. (1987). Cloning of large
exogenous DNA into yeast using artificial chromosome vectors. Science
236, 806-812.
CARLE, G. F., FRANK, M. AND OLSON, M. V. (1986). Electrophoretic
separation of large DNA molecules by periodic inversion of the electric
field. Science 232, 65-68.
CARLE, G. F. AND OLSON, M. V. (1984). Separation of chromosomal DNA
molecules from yeast by orthogonal field alternating gel
electrophoresis. Nucl. Acids Res. 12, 5647-5664.
CHU, G., VOLLRATH, D. AND DAVIS, R. W. (1986). Separation of large DNA
molecules by contour-clamped homogeneous electric fields. Science 234,
1582-1585.
CLARK, S. M., LAI, E., BIRREN, B. W. AND HOOD, L. (1988). A novel
instrument for separation of large DNA molecules with pulsed
homogeneous electric field. Science 241, 1203-1205.
DAVIDSON, J., RUMSBY, G. AND NISWANDER, L. (1985). Expression of genes
on human chromosome 21. Ann. NY Acad. Sci. 450, 43-54.
DRMANAC, R., PETROVIC, L., GUSIN, V. AND CRKVENJAKOV, R. (1986). A
calculation of fragment lengths obtained from human DNA with 78
restriction enzymes. Nucl. Acids Res. 14, 4691-4692.
GARDINER, K., HORISBERGER, M., KRAUS, J., TANTRAVAHI, U.,
KORENBERG, J., RAO, V., REDDY, S. AND PATTERSON, D. (1990). Analysis
of human chromosome 21: correlation of physical and cytogenetic
maps; gene and CpG island distributions. EMBO J. 9, 25-34.
GARDINER, K., LASS, W. AND PATTERSON, D. (1986). Fractionation of large
mammalian DNA restriction fragments using vertical pulsed field
gradient gel electrophoresis. Somat. Cell molec. Genet. 12, 185-195.
GARDINER, K. AND PATTERSON, D. (1988). Transverse alternating field
electrophoresis. Nature 241, 271-272.
GARDINER, K. AND PATTERSON, D. (1989). TAFE and applications to
mammalian genome mapping. Electrophoresis J. 10, 296-302.
GARDINER, K., WATKINS, P., MUNKE, M., DRABKIN, H., JONES, C. AND
PATTERSON, D. (1988). Partial physical map of human chromosome 21.
Somat. Cell molec. Genet. 14, 623-638.
GOLDMAN, M. A., HOLMQUIST, G. P., GRAY, M. C., CASTON, L. A. AND NAG,
A. (1984). Replication timing of genes and middle repetitive sequences.
Science 224, 686-692.
IKEMURA, T. AND AOTA, S. (1988). Global variation in G+C content along
vertebrate genomic DNA. J. molec. Biol. 203, 1-13.
MCCLELLAND, M. AND IVARIE, R. (1982). Asymmetric distribution of CpG
in an "average" mammalian gene. Nucl. Acids Res. 10, 7865-7877.
NELSON, M. AND MCCLELLAND, M. (1987). Effect of site specific
methylation on restriction-modification enzymes. Nucl. Acids Res. 15,
r219-r230.
ORBACH, M. J., VOLLRATH, D., DAVIS, R. W. AND YANOFSKY, C. (1988). An
electrophoretic karyotype of Neurospora crassa. Molec. cell. Biol. 8, 4,
1469-1473.
RAZIN, A. AND RIGGS, A. D. (1980). DNA methylation and gene function.
Science 210, 604-610.
SCHILD, D., DRABB, A., KIEFER, M., YOUNG, D. AND BARR, P. (1990).
Cloning of the human GART gene. Proc. natn. Acad. Sci. U.S.A. (in
press).
SCHWARTZ, D. C. AND CANTOR, C. R. (1984). Separation of yeast
chromosome-sized DNAs by pulsed field gradient gel electrophoresis.
Cell 37, 67-75.
SIMOLA, K., SELANDER, R-K., DE LA CHAPELLE, A., CORNEO, G. AND
GINELLI, E. (1975). Molecular basis of chromosome banding.
Chromosoma 51, 199-205.
STELLWAGEN, N. C., ED. (1989). Paper symposium. Electrophoresis J. 10,
nos 6—6
SWARTZ, M. N., TRAUTNER, T. A. AND KORNBERG, A. (1962). Enzymatic
synthesis of DNA: further studies on nearest neighbor base sequences
in DNA. J. biol. Chem. 237, 1961-1967