doi:10.1006/jmbi.2001.5131, available online at http://www.idealibrary.com
J. Mol. Biol. (2001) 314, 139–152
Quadruple Intercalated G-6 Stack: A Possible Motif in
the Fold-back Structure of the Drosophila
Centromeric Dodeca-Satellite?
Shan-Ho Chou* and Ko-Hsin Chin
Institute of Biochemistry
National Chung-Hsing
University, Taichung
40227, Taiwan
The purine-rich strand d(GTACGGGACCGA)n of the Drosophila centromeric dodeca-satellite sequence is highly conserved and was found to
form stable fold-back structures in which the homopurine 5′-GGGA-3′
sequence was determined to play a crucial role. Here, we report the
stable formation of the d(GGGA)2 motif in the stem of a DNA hairpin
closed by a single-residue d(ACC) loop. Similar to the zipper-like
d(GGA)2 motif observed in the human centromeric (TGGAA)n sequence,
the central four guanosine bases in the d(GGGA)2 motif do not pair, but
interdigitate to form an elongated zipper-like quadruple-intercalated G-6
stack bracketed by sheared G·A base-pairs. Comparison between the current d(GGGA)2 structure and the published crystal d(GAAA)2 structure
implies that the alignment of the unpaired purine bases plays an
important role in determining the minor groove width of the purine-rich
d(GPuPuA)2 motif. Similarity between the zipper-like motifs possibly present in the Drosophila centromeric dodeca-satellite sequence and in the
human centromeric (TGGAA)n sequence led us to propose that these
special zipper-like motifs may constitute common cores in organizing
eukaryotic centromeres.
© 2001 Academic Press
*Corresponding author
Keywords: Drosophila centromere dodeca-satellite; unusual DNA structure;
sheared GA pairing; quadruple intercalation; NMR
Introduction
The centromere is a specialized region in the
chromosome that is essential for the accurate
segregation of chromosomes during mitosis
and meiosis.1 Except for the budding yeast
Saccharomyces cerevisiae, most centromeric DNAs
contain highly repetitive satellite DNA sequences
that are essential for their proper functioning.
Intriguingly, most centromeric satellite sequences
are asymmetric in the distribution of purine content, which usually results in one strand being purine-rich vis-à-vis the other strand. Examination of such extraordinary purine distribution has led to the finding that centromeric purine-rich sequences are capable of adopting stable fold-back structures.2–7 In this respect, it is important to note that a single-stranded DNA-binding protein from Drosophila nuclear extracts that preferentially binds to the individual dodeca-satellite pyrimidine-strand has been discovered, implying that its complementary purine-strand may be free to form fold-back structures in vivo to organize the centromere structure.8 Although centromeric satellite DNA sequences are poorly conserved for most species, it is expected that their structures may be conserved to some extent through evolution to serve their specialized function.9 In the search for possible common structural elements in centromeres, it is interesting to note that two DNA satellite sequences, the human 5 bp satellite 5′-(TGGAA)n-3′ 2,10,11 and the Drosophila dodeca-satellite 5′-(GTACGGGACCGA)n-3′,12,13 are abundantly present and highly conserved in the centromere. These may serve as model sequences for the structural characterization of the eukaryotic centromere.

Abbreviations used: DQ-COSY, double quantum correlated spectroscopy; HSQC, heteronuclear single quantum coherence; NOE, nuclear Overhauser enhancement; NOESY, NOE spectroscopy.

The fold-back structure of the human centromeric (TGGAA)n repeat2,3 was indeed found to
adopt a stable zipper-like (GGA)2 motif in our
earlier studies;4 the central guanosine bases are
not paired, but interdigitated to form a double-intercalated G-4 stack with the bracketed G·A
base-pairs. As a result, the unpaired guanine
bases and the bracketed sheared G·A pairs are
able to provide many hydrogen-bonding donors
and acceptors in the major groove for possible
interaction with other ligands. On the other
hand, the structure of the G-rich repeats
(GTACGGGACCGA)n of the Drosophila centromeric dodeca-satellite sequence is less well
characterized,12,13 although it was also demonstrated
to form stable fold-back structures, with the central 5′-GGGA-3′ tetranucleotide found to play a
determinant role in stabilizing the fold-back
structures.5 We are thus interested in studying
this fold-back structure in order to compare it
with the (GGA)2 motif in the human centromeric
(TGGAA)n repeats to search for possible common structural motifs in the eukaryotic centromeres. Since homopurine DNA sequences have a
tendency to be structurally polymorphic,14 we
embedded the (GPuPuA)2 tracts (Pu = purine) in
the stem of a DNA hairpin closed by a stable
single-residue d(ACC) loop to facilitate their study by NMR spectroscopy. The excellent stability of the resulting structures indicates that the
(GGGA)2 motif may play important
roles in organizing the highly conserved Drosophila centromeric dodeca-satellite (GTACGGGACCGA)n repeats. The similar zipper-like
motifs adopted by the (GPuA)2 tract and the
(GPuPuA)2 tract and their prevalence in the
eukaryotic centromere imply that the zipper-like
interdigitated motifs may serve as common cores
in organizing the eukaryotic centromere structure.
Centromeric Interdigitated G-6 Stack Structure
Results
Thermodynamic studies
The various homopurine (GPuPuA)2 motifs were embedded in the stem of a 19-mer hairpin, 5′-GCGPuPuAACACCGTGPuPuAGC-3′, for both UV-melting and NMR studies. The UV-melting curves for several such motifs are shown in Figure 1, with their thermodynamic parameters listed in Table 1. All hairpins containing the homopurine (GPuPuA)2 motifs exhibit well-behaved transition curves that indicate good structural formation for these sequences. Several points are noteworthy from the UV-melting studies. (1) The number of G-2NH2 groups in the (GPuPuA)2 motifs seems to affect the melting temperatures of the corresponding hairpins in a consistent manner, i.e. the presence of each G-2NH2 group accounts for an increase of approximately 3 to 4 °C in the melting temperature for this series of hairpin structures. Thus, when the 5′-GGGA/AGGG-5′ motif (row 2) is replaced with the 5′-GAGA/AGAG-5′ (row 6) or 5′-GAAA/AAAG-5′ motif (row 8), the melting temperatures of the corresponding hairpins are decreased by approximately 6 °C and 13 °C, respectively. Similarly, the replacement of the 5′-GGGA/AGGG-5′ motif with the 5′-GGIA/AGGG-5′ (row 3) or 5′-GIGA/AGIG-5′ motif (row 5) leads to a decrease of 4 °C and 7 °C, respectively, in the melting temperature of the corresponding hairpins. It is thus reasonable to
assume that each G-2NH2 in the central four guanosine residues (in bold) is involved in specific hydrogen bonding, and the absence of one such group would lead to the loss of one hydrogen bond, which would in turn reduce the melting temperature by about 3 to 4 °C.

Figure 1. The UV-melting transition curves for the 5′-GCGPuPuAACACCGTGPuPuAGC-3′ nonadecamers. The tm values were determined from the maximum of the first differential curves.

Table 1. The thermodynamic parameters for the formation of DNA hairpins 5′-GCGPuPuAACACCGTGPuPuAGC-3′ containing various zipper-like quadruple-intercalation motifs (bold letters) in the stem region

Row  Motif      Tm (°C)  ΔG37 (kcal/mol)  −ΔH (kcal/mol)  −ΔS (cal/mol K)
1    TTAA/AATT  52.5     −1.96            37.0            113.0
2    GGGA/AGGG  52.7     −2.75            57.0            174.8
3    GGIA/AGGG  48.6     −1.94            54.3            168.8
4    GIGA/AGGG  47.2     −1.58            49.0            153.0
5    GIGA/AGIG  45.5     −1.20            50.0            157.5
6    GAGA/AGAG  46.5     −1.46            50.0            156.6
7    GGAA/AAGG  43.3     0.96             48.0            151.7
8    GAAA/AAAG  40.0     −0.29            40.0            128.0

The data in the first row are from the control sequence containing all canonical A·T base-pairs in the corresponding region. The UV-melting experiments were performed with a Cary 100 spectrophotometer equipped with a temperature controller under the same low-salt, neutral buffer conditions (20 mM NaCl, 3 mM sodium phosphate (pH 6.8)) as those used in the NMR studies. A temperature probe was used to monitor the actual temperature inside the cell. Thermodynamic data were calculated from the van't Hoff plots obtained from the Thermal application software supplied by the vendor. The rows are arranged in order of decreasing number of potential hydrogen bonds.

(2) Although
the hairpin containing the 5′-GGGA/AGGG-5′ motif (row 2) exhibits approximately the same UV-melting temperature as the hairpin containing the canonically paired 5′-TTAA/AATT-5′ motif (row 1), the two differ considerably in their thermodynamic ΔH and ΔS values. This result indicates that they adopt different structures under identical conditions. Since approximately equal numbers of hydrogen bonds are present in these two motifs (assuming that each central guanosine residue in the (GGGA)2 motif contributes one hydrogen bond), the large difference in the ΔH values must originate from the different base stacking. The NMR structural studies described below show that there is excellent cross-strand G/G/G/G stacking in the 5′-GGGA/AGGG-5′ motif, in contrast to the much weaker partial intra-strand stacking that usually takes place in a canonically paired B-DNA duplex. This can explain why the 5′-GGGA/AGGG-5′ motif has a ΔH value of larger magnitude (−57 kcal/mol; 1 cal = 4.184 J) than the 5′-TTAA/AATT-5′ motif (−37 kcal/mol). However, this unusual interdigitated motif contains four unpaired guanine bases (see Figure 5) that expose a large number of functional groups for interacting with the surrounding water molecules. A larger entropy loss would be expected for this motif, due to the ordered formation of water molecules surrounding the interdigitated region. (3) The position of guanosine residues in the interdigitated motif also markedly affects the UV-melting temperatures of the corresponding hairpins. Comparing row 6 with row 7 in Table 1, one can see that when the 5′-GAGA/AGAG-5′ motif is replaced with the 5′-GGAA/AAGG-5′ motif, the UV-melting temperature drops by about 3 °C. This can be explained by the special stacking present in the (GPuPuA)2 motif. As shown in Figure 4, two interdigitated modes are possible for the (GGGA)2 motif; either type I with residue G4 stacking upon residue G3 and residue G15 upon residue G14, or type II with residue G5 stacking upon residue A6 and residue G16 upon residue A17. Experimental data described below clearly indicate that the (GGGA)2 motif adopts a type I interdigitated mode, i.e. the unpaired residue G4 stacking upon the sheared G3·A17 pair and the unpaired residue G15 stacking upon the sheared G14·A6 pair. This special intercalation mode can be used to rationalize why the hairpin containing the 5′-GAGA/AGAG-5′ motif is more stable than the hairpin containing the 5′-GGAA/AAGG-5′ motif due to the distinctive G/G stacking. Theoretical calculation
has indicated that the guanine base is much more
polar than the adenine base.15,16 Stability of the G/
G stacking would therefore depend considerably
upon whether they are engaged in the anti-parallel
or parallel stacking, with the anti-parallel stacking
stabilizing the G/G stacking and the parallel stacking destabilizing it. On the contrary, base stacking
with no polar base like A/A or with only one
polar base like G/A would experience no such
large orientation effect. This reasoning has in fact
been used to account for the sequence-dependent
stability of different DNA sequences.15,16 In the
current 5′-GGAA/AAGG-5′ motif, the first unpaired guanine base (in bold) stacks in parallel with the 5′-end guanine base (see Figure 4), which would cause destabilization. In the 5′-GAGA/AGAG-5′ motif, by contrast, it is the adenine that stacks in parallel with the 5′-end guanine; no destabilization would thus be expected. Furthermore, the two inner zipper guanine bases in the 5′-GAGA/AGAG-5′ motif are stacked in an anti-parallel way (see Figure 6), leading to its further stabilization relative to the 5′-GGAA/AAGG-5′ motif. (4) The 5′-GAAA/AAAG-5′ motif (row 8) is the least stable in this series of zipper-like motifs, possibly due to the lack of hydrogen bonding contribution from the inner zipper adenine bases. The broader linewidths (Figure 2) and weaker NOE cross-peaks for this motif degrade the NMR spectra, which are not of sufficient quality for structural studies. However, it is interesting to note that the crystal structure of a duplex containing the zipper-like 5′-GAAA/AAAG-5′ motif has been solved successfully in the presence of metal hexamine salts.17
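
The thermodynamic values in Table 1 can be cross-checked with the standard relation ΔG = ΔH − TΔS evaluated at 37 °C. The sketch below is only an illustration of that arithmetic: the numbers are copied from Table 1 as printed, whereas the function names, the 0.1 kcal/mol tolerance and the data layout are assumptions introduced here and are not part of the original analysis.

```python
# Table 1 as printed: Tm (°C), ΔG37 (kcal/mol), ΔH (kcal/mol), ΔS (cal/(mol·K)).
# The table lists −ΔH and −ΔS, so both are entered here with a negative sign.
TABLE1 = {
    1: ("TTAA/AATT", 52.5, -1.96, -37.0, -113.0),
    2: ("GGGA/AGGG", 52.7, -2.75, -57.0, -174.8),
    3: ("GGIA/AGGG", 48.6, -1.94, -54.3, -168.8),
    4: ("GIGA/AGGG", 47.2, -1.58, -49.0, -153.0),
    5: ("GIGA/AGIG", 45.5, -1.20, -50.0, -157.5),
    6: ("GAGA/AGAG", 46.5, -1.46, -50.0, -156.6),
    7: ("GGAA/AAGG", 43.3, 0.96, -48.0, -151.7),  # ΔG37 sign kept as printed
    8: ("GAAA/AAAG", 40.0, -0.29, -40.0, -128.0),
}

T37_K = 310.15  # 37 °C expressed in kelvin


def gibbs_energy(dh_kcal, ds_cal, temp_k=T37_K):
    """Return ΔG = ΔH − TΔS in kcal/mol (ΔS is converted from cal to kcal)."""
    return dh_kcal - temp_k * ds_cal / 1000.0


def inconsistent_rows(tolerance=0.1):
    """Rows whose printed ΔG37 differs from ΔH − TΔS by more than `tolerance`."""
    flagged = []
    for row, (_motif, _tm, dg, dh, ds) in TABLE1.items():
        if abs(gibbs_energy(dh, ds) - dg) > tolerance:
            flagged.append(row)
    return flagged


def tm_drop(reference_row, row):
    """Decrease in Tm (°C) on going from `reference_row` to `row`."""
    return TABLE1[reference_row][1] - TABLE1[row][1]


if __name__ == "__main__":
    for row, (motif, _tm, dg, dh, ds) in TABLE1.items():
        print(f"row {row} {motif}: printed {dg:+.2f}, "
              f"recomputed {gibbs_energy(dh, ds):+.2f} kcal/mol")
    print("rows outside tolerance:", inconsistent_rows())
    for row in (3, 5, 6, 8):
        print(f"Tm drop, row 2 -> row {row}: {tm_drop(2, row):.1f} °C")
```

Under these assumptions, every row except row 7 reproduces its printed ΔG37 to within about 0.05 kcal/mol. Row 7 agrees only in magnitude, which may indicate that a minus sign was lost from the printed value; this is an inference from the arithmetic and is not stated in the table.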
NMR studies

The one-dimensional imino and aromatic proton spectra under neutral (pH 6.8), low-salt buffer conditions at 0 °C for the hairpins containing the 5′-GGGA/AGGG-5′ motif (designated as GGGA), the 5′-GIGA/AGGG-5′ motif (GIGA), the 5′-GAGA/AGAG-5′ motif (GAGA), and the 5′-GAAA/AAAG-5′ motif (GAAA) are shown in Figure 2. The imino proton signals were assigned by 2D-NOESY in 90 % H2O/10 % 2H2O as previously described.18 All four oligomers reveal the characteristic imino proton signals expected for the four canonical G·C and A·T pairs in the 12.5–13.2 ppm region. However, extra signals from the (GPuPuA)2 motifs were clearly detected. In the GGGA spectrum, two sharp imino proton signals at approximately 10.2 ppm, as well as two sharp signals accounting for four imino protons at 9.6 ppm, were observed. The signals at 9.6 ppm were further separated into four peaks when one of the central guanosine residues was changed to inosine, as shown in the GIGA spectrum. The two imino proton signals at 10.2 ppm are characteristic of the unpaired G-imino proton in a sheared G·A base-pair, indicating the formation of bracketed sheared G3·A17 and G14·A6 base-pairs.19,20 The four imino protons at 9.6 ppm, on the other hand, imply that the central two G-G ``steps'' of the G4, G5, G15, and G16 residues are not involved in hydrogen bonding, yet are well protected from solvent exchange, as judged by their narrow linewidths. The chemical shifts of these G-imino protons are similar to those of the unpaired G-imino protons in the zipper-like (GGA)2 motif,4 further implying that the central two G-G steps adopt a similar interdigitated motif. In the GAGA spectrum, only two such unpaired G-imino protons (shifted to 9.9 ppm) were observed, while in the GAAA spectrum, no such proton was observed at all, although the two G-imino protons belonging to the bracketed sheared G·A base-pairs were observable in all cases. The imino proton spectra of these four oligomers are thus consistent with a picture in which the central four purine residues in the (GPuPuA)2 motif are not paired but interdigitated and bracketed by a pair of sheared G·A base-pairs. However, some minor forms of unknown nature are present in the GAGA and GAAA spectra, as revealed by the presence of minor peaks or the broader linewidth of several peaks. In fact, the signals of the two unpaired G-imino protons in the bracketed sheared G·A base-pairs in the GAAA spectrum have broadened to such an extent that they collapse into one broad peak. This phenomenon indicates that the amino protons in the unpaired guanosine bases are indispensable for stabilizing the interdigitated motif. They are possibly involved in hydrogen bonding that stabilizes the nearby imino protons and prevents them from exchanging with solvent.

Unusual cross-strand NOEs between the interdigitated guanosine residues
Due to the unusual structure formation in this quadruple intercalation motif, many extraordinary NOEs were observed, some of which are shown in the NOESY spectrum in Figure 3. Figure 3(a) illustrates the base-H1′ connectivity (in blue) and the base-H3′ connectivity (in gray) that could be followed successfully through the previously described sequential assignment procedure.21 The H1′ chemical shifts of residues G3/G4 and G14/G15 overlap but could be resolved by replacing either residue G4 or G5 with inosine (data not shown). Besides the regular NOEs, many informative unusual NOEs were observed in Figure 3(a), and these are marked with lower-case letters. We have observed systematic cross-strand NOEs between the G4H8-G16H1′ protons (cross-peak f, see also Figure 5(a)), the G16H8-G5H1′ protons (g), and the G5H8-G15H1′ protons (h), and their reciprocal NOEs between the G15H8-G5H1′ protons (i), the G5H8-G16H1′ protons (j), and the G16H8-G4H1′ protons (k), which are extremely useful in establishing the unusual quadruple-intercalation feature of the central two G-G steps. In other words, the observation of the G16H8-G4H1′/G5H1′ and G5H8-G16H1′/G15H1′ NOEs indicates that residue G16 is intercalated between residues G4 and G5, while residue G5 is intercalated between residues G16 and G15. These non-sequential and systematic NOEs are rarely detected in any regular B- or A-form double helix, except in the i-motif, which consists of two C·C+ paired duplexes interdigitated with each other to form a tetrameric structure.22
Figure 2. The one-dimensional 600 MHz imino, amino and aromatic proton NMR spectrum of the GGGA, GIGA,
GAGA, and GAAA hairpins. The imino protons were assigned from the NOESY experiments in 10 % 2H2O/90 %
H2O solution as described.18
Figure 3(b) further shows the inter-residue H1′-H1′ NOEs of the four unpaired guanosine residues. The G4H1′-G16H1′-G5H1′-G15H1′ NOEs again demonstrate that these four unpaired guanosine residues are interdigitated in the G4/G16/G5/G15 order, and not in the alternative G16/G4/G15/G5 order (see Figure 4). Although these cross-peaks are shown in the spectrum at 600 ms mixing time, they were clearly detectable in the spectrum at 100 ms mixing time (data not shown). Again, these NOEs are hardly ever detected in any regular B- or A-form duplex, except in the i-motif.22
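
The reasoning that turns the cross-strand NOEs into a stacking order can be sketched as a small chain-reconstruction problem: each H1′-H1′ contact links two residues that are neighbours in the stack, and the order is obtained by walking along the contacts from one end. The sketch below is an illustrative assumption of how this bookkeeping could be written; the function names are hypothetical, and only the residue pairs are taken from the text (the H1′-H1′ chain and cross-peaks f to k).

```python
from collections import defaultdict


def stacking_order(contacts, start):
    """Walk a linear chain of pairwise contacts, beginning at the end residue `start`."""
    neighbours = defaultdict(set)
    for a, b in contacts:
        neighbours[a].add(b)
        neighbours[b].add(a)
    if len(neighbours[start]) != 1:
        raise ValueError("start residue must sit at one end of the chain")
    order = [start]
    previous, current = None, start
    while True:
        candidates = [n for n in neighbours[current] if n != previous]
        if not candidates:
            return order
        if len(candidates) > 1 or candidates[0] in order:
            raise ValueError("contacts do not form a simple linear chain")
        previous, current = current, candidates[0]
        order.append(current)


def all_adjacent(order, pairs):
    """True if every residue pair occupies neighbouring positions in `order`."""
    position = {residue: index for index, residue in enumerate(order)}
    return all(abs(position[a] - position[b]) == 1 for a, b in pairs)


# Inter-residue H1'-H1' NOEs reported for the four unpaired guanosines.
H1_H1_NOES = [("G4", "G16"), ("G16", "G5"), ("G5", "G15")]

# Cross-strand H8-H1' NOEs f, g, h and their reciprocals i, j, k.
H8_H1_NOES = [("G4", "G16"), ("G16", "G5"), ("G5", "G15"),
              ("G15", "G5"), ("G5", "G16"), ("G16", "G4")]

if __name__ == "__main__":
    order = stacking_order(H1_H1_NOES, "G4")
    print("stacking order:", "/".join(order))
    print("H8-H1' NOEs consistent:", all_adjacent(order, H8_H1_NOES))
    print("alternative order consistent:",
          all_adjacent(["G16", "G4", "G15", "G5"], H8_H1_NOES))
```

In this toy representation the G4/G16/G5/G15 order satisfies every listed contact, whereas the alternative G16/G4/G15/G5 order would require a contact between non-neighbouring residues (for example G16 and G5).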
Figure 3. (a) The base-H1′/H5/H3′ fingerprint region of the NOESY spectrum of the GGGA nonadecamer. The H8/H6-H1′ connectivity is traced by blue dotted lines in the lower part of the Figure, and that of the H8/H6-H3′ by gray dotted lines in the upper part of the Figure. The C(n)H5-C(n)H6 cross-peaks are connected to the respective C(n)H5-Pu(n-1)H8 cross-peaks by red horizontal lines. Some crucial NOEs for this unusual motif are indicated by lower-case letters, namely: a, A9H2-C11H5; b, A9H2-C10H1′; c, A17H2-G4H1′; d, A17H8-G4H1′; e, A6H8-G15H1′; f, G4H8-G16H1′; g, G16H8-G5H1′; h, G5H8-G15H1′; i, G15H8-G5H1′; j, G5H8-G16H1′; k, G16H8-G4H1′. (b) The cross-strand G4H1′-G16H1′-G5H1′-G15H1′ NOE connectivity. The cross-peak indicated by the arrow is the C19H5-G18H1′ NOE that has a distance of approximately 4.5 Å. (c) The stacking plot of the boxed region in (a). The four CH5-CH6 cross-peaks are labeled by their residue numbers. The very strong intra-residue H8-H3′ NOE cross-peaks of the G4 and G15 residues (corresponding to a distance of approximately 2.3 Å) and the medium-strength H8-H3′ NOE cross-peaks of the G5 and G16 residues are labeled. Cross-peak a is the inter-residue G15H8-G14H1′ NOE and b is the inter-residue G4H8-G3H1′ NOE. These strong NOE cross-peaks at the junction between the first unpaired residues and the sheared G·A pair were also observed in the previously studied (GGA)2 motif.4 The weak G5H8-H1′ and G16H8-H1′ NOE cross-peaks in the lower part of the Figure are labeled, and are suggestive of anti glycosidic angles for the central guanine residues in the (GGGA)2 motif.
Figure 3(c) shows the expanded stacked plot of the boxed region shown in Figure 3(a) at a mixing time of 100 ms. The intra-residue G15H8-G15H3′ and G4H8-G4H3′ cross-peaks (marked by larger red capital letters G4 and G15) are very strong (even stronger than the CH5-CH6 cross-peaks of the C8, C2, C10, and C19 residues), indicating that the outer zipper guanosine residues (G4 and G15) in the interdigitated motif are located in the unusual C3′-endo domain with short intra-residue GH8-GH3′ distances (approximately 2.3 Å). On the other hand, the corresponding cross-peaks for the inner zipper guanosine residues (G5 and G16) have weaker intensity but are still stronger than those of other residues in this oligomer, indicating that residues G5 and G16 adopt a sugar conformation that is intermediate between the C2′-endo and C3′-endo domains. The unusual sugar puckers for the outer zipper G4 and G15 residues and the inner zipper G5 and G16 residues are also confirmed by the DQ-COSY23,24 and 1H-13C HSQC experiments.25 The C3′-endo sugar conformation of the G4 and G15 residues is clearly demonstrated by the observation of strong H3′-H4′ cross-peaks and equally strong H3′-H2′ and H3′-H2″ cross-peaks in the DQ-COSY spectrum (data not shown). This is not the case for the G5 and G16 residues, which instead have equally strong H1′-H2′ and H1′-H2″ cross-peaks, strong H3′-H2″ cross-peaks but very weak H3′-H2′ cross-peaks, which are more indicative of a sugar pucker between the C2′-endo and C3′-endo conformations. The C3′-endo sugar conformation of residues G4 and G15 is further confirmed by a 1H-13C HSQC experiment; the chemical shifts of G4C3′, G15C3′ and G4C4′, G15C4′ are found to be shifted upfield by approximately 5 and 3 ppm, respectively, a feature that is characteristic of the switching of sugar pucker from the C2′-endo to the C3′-endo conformation.25

Figure 4. The 31P-1H heteronuclear correlation spectrum of the DNA hairpin containing the purine-rich d(GGGA)2 motif. The two intercalation modes of the central unpaired guanine residues could be distinguished from this spectrum as described in the text. The cross-peaks that are too weak or undetectable are marked by x.

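
The comparison of the G4/G15 H8-H3′ cross-peaks with the CH5-CH6 cross-peaks can be made semi-quantitative with the isolated spin-pair approximation, in which NOE intensity scales as r^-6 and a fixed distance serves as the ruler. The sketch below is only an illustration under stated assumptions: the cytosine H5-H6 reference distance of 2.46 Å and the intensity ratio of 1.5 are values chosen here for the example, not measurements from this study, and the function name is hypothetical. Spin diffusion at longer mixing times makes such estimates approximate at best.

```python
def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate an inter-proton distance (Å) from NOE intensities.

    Uses the isolated spin-pair approximation, I ∝ r**-6, so that
    r = r_ref * (I_ref / I) ** (1/6).
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


# Assumed reference: the covalently fixed cytosine H5-H6 distance (Å).
CYT_H5_H6_DISTANCE = 2.46

if __name__ == "__main__":
    # Hypothetical case: a cross-peak 1.5 times stronger than the CH5-CH6 reference.
    estimate = noe_distance(1.5, 1.0, CYT_H5_H6_DISTANCE)
    print(f"estimated distance: {estimate:.2f} Å")
```

With these illustrative inputs the estimate comes out close to 2.3 Å, which shows how a cross-peak somewhat stronger than the CH5-CH6 reference is compatible with the short H8-H3′ distance expected for a C3′-endo sugar.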
The 31P-1H heteronuclear correlation spectrum of the GGGA sequence is shown in Figure 4, which is also used to distinguish between the two possible interdigitated modes (type I or II) for the unpaired guanine residues in the (GGGA)2 motif. From the Figure, it is clear that four phosphorus atoms (P5, P6, P16, and P17) out of the six phosphorus atoms in the backbone of the (GGGA)2 motif resonate at a higher field position of −2.60 to −3.0 ppm, shifted approximately 1.7 ppm from the cluster resonance at around −4.5 ppm. On the other hand, the signals of the two remaining phosphorus atoms P4 and P15 are located in the regular region at approximately −4.0 ppm. These results are more compatible with interdigitated mode I than with mode II. The 31P-1H heteronuclear correlation spectrum further reveals two H4′ protons (G4H4′ and G15H4′) that resonate at the very high field positions of 1.65 and 1.9 ppm, respectively. These large upfield shifts (approximately 2 ppm) from the regular H4′ chemical shifts are due to the stacking of the deoxyriboses of the unpaired G15 and G4 upon the A6 and A17 bases of the flanking sheared A6·G14 and A17·G3 pairs, respectively (see Figure 6). Such sugar-base stacking, with sugar O-4′ and H-4′ pointing directly toward the center of the flanking A6 and A17 bases, can help stabilize this unusual interdigitated motif, and has been observed several times in other cases.26–32 Theoretical calculation also shows that the interaction energy of sugar-base contacts can add up to 4 kcal/mol, comparable with that afforded by normal base-base interaction.15 Therefore, all the NMR data, whether from the cross-strand G4H1′-G16H1′-G5H1′-G15H1′ NOEs (Figure 3), the upfield-shifted signals of the G4-G5, G5-A6, G15-G16, and G16-A17 phosphodiesters (Figure 4), or the upfield-shifted signals of the G4H4′ and G15H4′ protons (Figure 4), indicate that the unpaired guanine residues in the (GGGA)2 motif adopt the type I interdigitated mode. In contrast, the deoxyriboses of the central unpaired residues G5 and G16 experience no stacking at all, even though their bases experience excellent stacking (Figure 5(a)). This accounts for the fact that the chemical shifts of their sugar protons return to normal values. Since only weak or undetectable (n)H4′-(n)P cross-peaks were observed for all the phosphodiesters in the (GGGA)2 motif (G4, G5, A6, G15, G16, and A17), their ζ and α torsional angles were all left unconstrained during the structural calculations.

Figure 5. (a) Some idiosyncratic NOEs of the zipper-like quadruple-intercalation (GGGA)2 motif. The protons that exhibit inter-strand H1′-H1′ NOEs in the G-rich zipper region are connected by red dotted arrows, while those exhibiting mutual H8-H1′ NOEs are connected by blue dotted arrows. These NOEs are hardly ever observed in any DNA structure. (b) The stereo overlapping picture of the 15 final structures produced by embedding from and refining against the distance bounds. (c) The view perpendicular to the helical axis and (d) the view into the minor groove of the DNA oligomer containing the zipper-like quadruple intercalation (GGGA)2 motif (the top ACC loop region is excluded for comparison). Guanine residues are shown in red, adenine in blue, cytosine in green, and thymine in yellow. The consecutive G/G/G/G/G/G stacking and the characteristic X-shaped, zigzag phosphodiester backbone are obvious from this Figure. (e) The stereo picture in space-filling mode into the major groove of the quadruple intercalation (GGGA)2 motif. The oxygen atoms are colored in red, nitrogen atoms in blue, carbon atoms in green, phosphorus atoms in orange, and hydrogen atoms in white. The abundant hydrogen-bonding donors and acceptors are obvious in the major groove and may play important roles in interacting with other unidentified proteins for organizing the Drosophila centromere.

Using the information obtained from through-space NOE connectivity, through-bond J-coupling connectivity, and 31P-1H correlations, all exchangeable protons, non-exchangeable protons (except for some H5′/H5″ protons), and phosphorus atoms of the 5′-d(GCGGGAACACCGTGGGAGC)-3′ hairpin were assigned unambiguously and are listed in the Supplementary Material (Table S1). A representation of the abundant inter-strand NOEs for this quadruple-intercalation G-6 motif is shown in Figure 5(a), with the constraint statistics used to determine its solution structure listed in Table 2.
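
The restraint counts in Table 2 can be checked for internal consistency with simple arithmetic. The sketch below assumes that the labels and values of Table 2 pair up in the order in which they are printed and that the hydrogen-bond restraints are counted within the NOE total; the variable names are introduced here for illustration only.

```python
N_RESIDUES = 19  # length of the 5'-d(GCGGGAACACCGTGGGAGC)-3' hairpin

# Restraint counts as read from Table 2 (label-to-value pairing is an assumption).
EXCHANGEABLE = {"H-bonds (1.8-2.1 Å)": 20, "2.0-5.0 Å": 4, "3.0-6.0 Å": 2}
NON_EXCHANGEABLE = {"2.0-4.0 Å": 57, "3.0-5.0 Å": 95, "4.0-6.0 Å": 55, "> 5 Å": 39}
TORSIONS = {"backbone (β, γ, ε)": 85, "glycosidic": 19}

total_noes = sum(EXCHANGEABLE.values()) + sum(NON_EXCHANGEABLE.values())
total_torsions = sum(TORSIONS.values())
noes_per_residue = total_noes / N_RESIDUES
restraints_per_residue = (total_noes + total_torsions) / N_RESIDUES

if __name__ == "__main__":
    print("total NOEs:", total_noes)
    print("total torsional angles:", total_torsions)
    print(f"NOEs per residue: {noes_per_residue:.2f}")
    print(f"NOEs and torsion angles per residue: {restraints_per_residue:.2f}")
```

Under this reading the sub-counts sum to the printed totals of 272 NOEs and 104 torsional angles, and the per-residue figures agree with the printed 14.3 and 19.7 (the latter if the quotient is truncated rather than rounded).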
Structural feature

Due to the abundant experimental distance and torsional angle constraints, the current unusual d(GGGA)2 motif was well determined, as judged from the overlay of the 15 final structures viewed perpendicular to the helical axis shown in Figure 5(b). Well-converged structures with r.m.s.d. values of 0.98 (±0.25) Å (the top ACC loop was excluded for comparison) were obtained after distance geometry/molecular dynamics calculation. The unusual X-shaped backbone due to the four interdigitated unpaired guanine bases is clearly revealed through the tracing ribbon shown in Figure 5(c). The four unpaired guanine bases are bracketed by two highly buckled sheared G·A pairs, with the G-6 stack clearly revealed on the right side (guanine residues are colored in red and adenine residues in blue). Figure 5(d) shows the view into the minor groove to reveal the zigzag backbone of the zipper-like quadruple intercalation motif. This Figure also reveals the rather narrow minor groove (the shortest inter-strand P-P distance is only 8.8 Å) and the elongated feature of this duplex resulting from the interdigitation of the central guanosine residues. Figure 5(e) further shows the wide-eye stereo picture of the quadruple-intercalation motif in space-filling mode. The abundant hydrogen bonding donors and acceptors of this unusual motif are obvious from this view into the major groove, which, along with other unidentified proteins, may work together to organize eukaryotic centromeres. Figure 6 shows the parallel stacking feature between the first unpaired guanine base (G15 in blue) and the bracketed guanine base (G14 in green) in the sheared G·A pair (top panel) and a typical anti-parallel stacking feature between the two unpaired guanine bases in the zipper (G5 and G15 in the middle panel) of the (GGGA)2 motif. As described above, the guanine base is significantly more polar than the adenine base, and thus the stability of guanine stacks would depend very much upon the guanine base alignment,15 with the anti-parallel alignment stabilizing and the parallel alignment destabilizing the guanine stacks. This idea is indeed found to be decisive in determining the conformation of the G-6 stack in this unusual zipper motif. Thus, the very strong inter-strand interaction between the inner zipper G5 and G15 bases (middle panel) is demonstrated by the excellent G5/G15 stacking (the G5 base stacks almost entirely upon the G16 base) and the nearly anti-parallel alignment (approximately 150°) between the G5/G15 bases (polarity is shown by orange arrows), while a twist of approximately 60° is employed to prevent the unfavorable parallel intra-strand G14/G15 stacking (top panel). A similar situation occurs in the symmetrical half between the G4/G16 and G3/G4 stacks (not shown). This Figure also accounts for the almost 2 ppm difference in the H4′ proton chemical shifts between the outer zipper G15 and the inner zipper G5 residues (Table S1); the H4′ proton of the outer zipper G15 residue (pink dots in the top panel of Figure 6) is situated right below the center of the A6 six-membered ring of the neighboring sheared G·A pair and exhibits a dramatically upfield chemical shift of 1.63 ppm, while that of the inner zipper G5 residue has no neighboring purine base to alter its chemical shift, and hence exhibits only a slightly upfield value of 3.82 ppm compared with other H4′ signals (middle panel of Figure 6). Another point worth mentioning about the G-6 stack is the dramatically different twisting angles employed to accommodate such a special multiple purine stacking arrangement; while a twisting angle of more than 60° is employed to prevent the parallel alignment of the first unpaired G15 base with the G14 base of the bracketed G14·A6 pair (indicated by purple arrows in the top panel of Figure 6), twisting angles of less than 10° are adopted between the unpaired guanine bases to maintain the excellent intra- and inter-strand stacking, as clearly revealed in the middle and bottom panels of Figure 6, in which the inter-strand G5/G15 stacking and the intra-strand G15/G16 overlapping are obvious, while both the inter-strand and intra-strand twisting angles are close to zero (indicated by purple arrows). It is important to note also that the intra-strand G15/G16 bases are not neighboring bases, but are intercalated by a cross-strand G5 base (Figure 5(a)). Such a particular arrangement of twisting angles has therefore resulted in an excellent overall stacking between the G14/G15/G5/G16/G4/G3 residues in the d(GGGA)2 motif (Figure 5(a)), which, along with the H-bonding between the amino protons of these unpaired guanine bases and the cross-strand phosphodiesters, can account for the comparable stability of the (GGGA)2 motif and the canonically paired 5′-(TTAA)/(AATT)-5′ segment (Table 1).

Table 2. Structural statistics for the 5′-d(GCGGGAACACCGTGGGAGC)-3′ hairpin

Restraints                                  Numbers
Exchangeable NOEs
  H-bonds (1.8 Å - 2.1 Å)                   20
  2.0 Å - 5.0 Å                             4
  3.0 Å - 6.0 Å                             2
Non-exchangeable NOEs
  2.0 Å - 4.0 Å                             57
  3.0 Å - 5.0 Å                             95
  4.0 Å - 6.0 Å                             55
  > 5 Å                                     39
Total NOEs                                  272
Torsional angles                            104
  Backbone (β, γ, ε)                        85
  Glycosidic                                19
NOEs per residue                            14.3
NOEs and torsion angles per residue         19.7
Violations of experimental restraints
  Distance restraints (> 0.15 Å)            0
  Torsional angle restraints (> 3°)         0
r.m.s.d. (Å)                                0.98 ± 0.25

Figure 6. Several typical characteristic base stackings in the d(GGGA)2 motif. Top: the intra-strand stacking at the junction between the first unpaired guanine base and the paired guanine base in the bracketed sheared G·A pair. Middle: the inter-strand stacking. Bottom: the intra-strand stacking between the outer zipper and inner zipper guanine bases. The polarity of guanine bases is indicated by orange arrows (top and middle) while the glycosidic bonds are marked by purple arrows (top, middle, and bottom). Significantly different twist angles were adopted to accommodate this special G-6 stack; while a twisting angle of greater than 60° was employed at the junction between the outer zipper G15 residue and the G14·A6 pair, twisting angles of less than 10° were adopted for both the inter-strand (middle) and the intra-strand G/G (bottom) stacking. Such uneven distribution of twisting angles results in an overall excellent G-6 stack. The outer zipper G15H4′ proton (shown as a pink dot in the top panel) is situated directly under the A6 base participating in the sheared pairing with G14 and experiences a huge ring-current shielding effect to exhibit a chemical shift at 1.63 ppm, while no such effect was observed for the inner zipper G5H4′, which has no neighboring purine base and therefore exhibits a chemical shift at a somewhat regular value of 3.82 ppm (Table S1).

Discussion

The dodeca-satellite sequence of the Drosophila centromere is highly conserved and has been detected in widely different species separated by more than 60 million years, such as plants, flies, and humans.12 Previous studies by Azorin's group have shown that the purine-rich dodeca-satellite strand alone can form stable intramolecular fold-back structures in a B-DNA environment (as judged by electron microscopy).5 No such unusual structure was detected for either the pyrimidine-rich strand alone or the double-stranded dodeca-satellite DNA under similar conditions. From chemical mapping studies, it was suggested that the central guanine residues in the GGGA-tract adopt special stacking interactions with the adjacent GA mismatches and contribute significantly to the stability of the fold-back structures.5 These data are consistent with the novel quadruple-intercalated G-6 stack structure in the (GGGA)2 motif presented here, in which the central guanine residues are unpaired, but interdigitated to exhibit excellent cross-strand stacking with each other and form cross-strand H-bonds with the opposite strand backbone phosphodiesters.
However, it is still unclear which register of the dodeca-satellite is responsible for the extraordinary stability of the fold-back structures, as three different registers were proposed for the GGGA-tract (hairpins I, II, and III 5) from the chemical mapping studies, yielding three different purine-rich motifs of the (GA)2, (GGA)2, and (GGGA)2 sequences, respectively. Although the (GA)2 motif containing tandem sheared G·A pairs was proposed to be the major cause of the high stability of the fold-back structures,7 it is located in an unfavorable 5′-G-(G-A)-C-3′ context, which is not consistent with our previous NMR studies.20,33 Our previous studies indicated that only when situated in either a 5′-Py(GA)Pu/Pu(AG)Py-5′ 20 or a 5′-Py(GA)Py/Pu(AG)Pu-5′ context 33 will the (GA)2 motif be stable enough to compare with the canonically paired duplex motif. Even when inserted into a longer stem sequence, the 5′-(GGAC)2-3′ motif still does not adopt the sheared GA pairing configuration, possibly to prevent the unfavorable parallel G/G stacking in this context. There is thus some discrepancy between the chemical mapping and the NMR data. But, as described by these authors, it is difficult to use either diethylpyrocarbonate (DEPC) or dimethyl sulfate (DMS) to determine the base-pair configuration unambiguously in the homopurine (GGA)n and (GGGA)n sequences.7,14 For example, even though the N-7 atoms of guanine bases in the tandem sheared GA base-pairs are not involved in hydrogen bonding, they are nevertheless unreactive toward such chemicals, possibly due to the excellent cross-strand purine-purine stacking, as suggested by the authors. More studies are therefore necessary to clarify this situation.
Sequence-dependent studies of the (GGA)2 and (GGGA)2 motifs are in progress in our laboratory (unpublished results) to determine which register is more responsible for the high stability of the fold-back structure. Judged by the great stability of these three distinct motifs, it is possible that no single register dominates but that all three registers are populated uniformly. In either case, the abundant hydrogen-bonding donors and acceptors from the interdigitated guanine bases and the bracketed sheared G·A pairs would very likely be involved in interacting with proteins to organize the eukaryotic kinetochore and centromere structure.
Recently, the crystal structure of a nonamer containing a zipper-like d(GAAA)2 motif has been solved at 2.1 Å 17 with the help of cobalt hexammine cation. The cobalt ion basically serves to form strong H-bonds with the N-7 and O-6 atoms of the G residue in the bracketed sheared G·A pair to bring together adjacent duplexes, which does not disturb the zipper-like structure. However, the d(GAAA)2 motif is the least stable among the d(GPuPuA)2 motifs in solution as studied here, possibly due to the lack of cross-strand hydrogen-bonding of the zipper adenines (Table 1). Addition of cobalt ion does not improve the spectral quality to any extent (S.-H.C. et al., unpublished results). Its detailed three-dimensional structure could not, therefore, be addressed by NMR methods due to its dynamic nature. Instead, we have studied the d(GGGA)2 motif, which is the most stable among the d(GPuPuA)2 motifs in solution and exhibits rather high quality NMR spectra (Figures 2 and 3) that are suitable for structural determination. However, it is still worthwhile to compare the solid-state d(GAAA)2 structure with the solution-state d(GGGA)2 structure, with the overlap between these two structures shown in Figure 7. From the Figure, it is clear that several features of these two structures are similar; the bracketed sheared G·A pairs are highly buckled, and the central four purine bases are unpaired and interdigitated with each other to form an elongated backbone with a characteristic X shape. However, two major differences exist between these two structures: (1) the solid-state d(GAAA)2 structure (in red) has a narrower minor groove than that of the solution-state d(GGGA)2 structure (in blue). The shortest inter-strand P-P distance for the d(GAAA)2 structure is only 6.6 Å, while that of the d(GGGA)2 structure is 8.5 Å; (2) the phosphodiesters and the zipper adenine residues in the solid-state d(GAAA)2 structure are not in a position to form cross-strand hydrogen bonds, even if an amino group were attached to the adenine C-2 position. This can be seen in the top of Figure 7, in which a typical cross-strand hydrogen-bond between the unpaired G-NH2 and the opposite strand phosphodiester in the d(GGGA)2 structure
is marked by a blue arrow.

Figure 7. Overlapping stereo picture between the crystal structure containing the d(GAAA)2 motif (in red) and the NMR structure containing the d(GGGA)2 motif (in blue). Different base orientation and groove width were observed between these two structures, possibly due to the different base alignment in the zipper region. One typical base-base stacking between the outer and the inner zipper adenine residues in the crystal d(GAAA)2 structure is marked by two red arrows and is expanded in the bottom panel. It is clear that the adenine base alignment in this Figure is different from the guanine base alignment in the middle panel of Figure 6.

It is clear from this view that a similar hydrogen-bonding in the crystal d(GAAA)2 structure would be unlikely to happen (the corresponding distances are larger than 3.5 Å), due to the different stacking pattern of
the zipper adenine residues. Closer examination of the overlapping structures indicates that the intercalated adenine residue stacking in the crystal d(GAAA)2 structure is considerably different from the intercalated guanine base stacking in the solution d(GGGA)2 structure, due to the different polar natures of the guanine and adenine bases. The twisting angle between the inner zipper adenine bases (marked by two red arrows in the top panel of Figure 7 and expanded in the bottom panel of Figure 7) in the crystal d(GAAA)2 structure is around 110°, while that of the inner zipper guanine bases in the solution d(GGGA)2 structure is closer to 180° (middle panel of Figure 6) to prevent unfavorable repulsion. The smaller twisting angle in the d(GAAA)2 motif thus draws the two strands nearer and decreases the minor groove width, while the larger twisting angle in the d(GGGA)2 motif pushes the two strands apart and increases the minor groove width, consistent with the shortest inter-strand P-P distances of 6.6 Å and 8.5 Å, respectively. The anti-parallel stacking nature among the four unpaired guanine bases in the d(GGGA)2 motif (top panel of Figure 7) thus considerably increases its minor groove width. However, it is not clear why the adenine zipper does not adopt a stacking geometry similar to that of the guanine zipper.
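The groove-width comparison above rests on the shortest inter-strand P-P distance. As a minimal sketch of how such a value could be extracted from a coordinate set, the Python snippet below computes all pairwise distances between the phosphorus atoms of two strands and reports the minimum. The function name and the coordinates are hypothetical placeholders chosen for illustration; they are not taken from the d(GAAA)2 or d(GGGA)2 structures, and the snippet is not the procedure used by the authors.

```python
import numpy as np

def shortest_interstrand_pp(strand_a_p, strand_b_p):
    """Return (distance, index_a, index_b) for the closest inter-strand P-P pair.

    strand_a_p, strand_b_p: sequences of (x, y, z) phosphorus coordinates in
    angstroms, one entry per phosphate, one sequence per strand.
    """
    a = np.asarray(strand_a_p, dtype=float)
    b = np.asarray(strand_b_p, dtype=float)
    # Pairwise distance matrix via broadcasting: shape (len(a), len(b)).
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    return float(dist[i, j]), int(i), int(j)

# Made-up coordinates (angstroms), for illustration only.
demo_strand_a = [(0.0, 0.0, 0.0), (0.0, 0.0, 3.4)]
demo_strand_b = [(6.6, 0.0, 0.0), (10.0, 0.0, 3.4)]
demo_distance, demo_i, demo_j = shortest_interstrand_pp(demo_strand_a, demo_strand_b)
```

In practice the phosphorus coordinates would be read from the deposited crystal and NMR coordinate files, and the same function applied to each structure in turn.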
Unconstrained nanosecond molecular dynamics studies of the d(GAAA)2 and d(GGGA)2 motifs starting from the d(GAAA)2 crystal coordinates have been performed.16 Both zipper motifs were found to be internally stable with no major conformational change along the trajectory. However, their theoretical calculation indicates that the intrinsic base-base stacking energy difference in vacuo between the (GGGA)2 and (GAAA)2 motifs is only about 1 kcal/mol.16 This is significantly different from our experimental data in the buffered aqueous solution, in which a large enthalpy difference of approximately 17 kcal/mol or a UV-melting temperature difference of about 13 deg. C was detected. It is likely that the different hydration energies between the two zippers could affect the zipper stability significantly, as the intrinsic base-base stacking does not differ too much (personal communication with Dr Sponer).
Materials and Methods
Sample preparation
All DNA samples were synthesized at the 3 μmol scale on an Applied Biosystems 380B DNA synthesizer with the final 5′-DMT groups attached. The samples were purified and prepared for NMR studies as described.34
UV-melting studies
The absorbance versus temperature profile was obtained at 260 nm with a Cary 100 spectrophotometer equipped with a temperature controller. A temperature probe was placed inside the UV chamber to monitor the cell temperature. The temperature in each run was increased from 20 °C to 90 °C at a rate of 0.5 deg. C/minute. All thermodynamic parameters were calculated by the van't Hoff method 35 using the program supplied by the vendor.
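The vendor program itself is not described here, so the following Python sketch only illustrates the general idea of a van't Hoff analysis under stated assumptions: a two-state, unimolecular (hairpin) transition, and a melting curve already converted to the fraction unfolded, alpha. The equilibrium constant K = alpha/(1 - alpha) is fitted to ln K = -ΔH/(RT) + ΔS/R, and Tm is taken as ΔH/ΔS. The function name, the fitting window, and the synthetic test curve are all hypothetical.

```python
import numpy as np

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def vant_hoff_fit(temps_k, alpha, lo=0.15, hi=0.85):
    """Estimate (dH, dS, Tm) from fraction-unfolded data for a unimolecular,
    two-state transition. dH is in kcal/mol, dS in kcal/(mol K), Tm in K.
    Only points with lo < alpha < hi are used (an assumed fitting window)."""
    temps_k = np.asarray(temps_k, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    mask = (alpha > lo) & (alpha < hi)
    k_eq = alpha[mask] / (1.0 - alpha[mask])
    # Linear fit of ln K against 1/T: slope = -dH/R, intercept = dS/R.
    slope, intercept = np.polyfit(1.0 / temps_k[mask], np.log(k_eq), 1)
    d_h = -slope * R_KCAL
    d_s = intercept * R_KCAL
    return d_h, d_s, d_h / d_s

# Synthetic melting curve with arbitrary illustrative parameters
# (not values from this paper): dH = 40 kcal/mol, Tm = 335 K.
true_dh, true_tm = 40.0, 335.0
true_ds = true_dh / true_tm
demo_temps = np.arange(293.15, 363.15, 0.5)  # 20 to 90 deg. C in 0.5 deg. C steps
demo_k = np.exp(-true_dh / (R_KCAL * demo_temps) + true_ds / R_KCAL)
demo_alpha = demo_k / (1.0 + demo_k)
fit_dh, fit_ds, fit_tm = vant_hoff_fit(demo_temps, demo_alpha)
```

With real data, the absorbance trace would first need baseline correction to obtain alpha, a step this sketch leaves out.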
NMR experiments
All NMR experiments were performed on a Varian Unity Inova 600 MHz spectrometer. One-dimensional imino proton spectra at 0 °C were acquired using a jump-return pulse sequence.36 The spectral width was 12,000 Hz with the carrier frequency set at the resonance of water. The maximum excitation was set at 12.5 ppm. For each experiment, 4 K complex points were collected and 112 scans were averaged with a two second relaxation delay.
The 2D NOESY in 90 % H2O/10 % 2H2O was performed at 0 °C in a pH 6.8 low-salt (20 mM NaCl, 3 mM sodium phosphate) buffer with the following parameters: delay time one second, mixing time 0.12 second, spectral width 12,136 Hz, complex points 2048, number of transients 96, and number of increments 500.
The 2D NOESY experiments in 2H2O were carried out at 20 °C in the hypercomplex mode with a spectral width of 4705 Hz. Spectra were collected using three mixing times of 100, 300, and 600 ms with a relaxation delay of one second between each transient and with 2048 complex points in the t2 and 200 complex points in the t1 dimension. For each t1 increment, 64 scans were averaged.
The 2D 1H-13C HSQC experiments in 2H2O were carried out at 20 °C in the hypercomplex mode with a 1H spectral width of 4705 Hz and a 13C spectral width of 25,649 Hz. Spectra were collected with a relaxation delay of one second between each transient and with 2048 complex points in the t2 and 100 complex points in the t1 dimension. For each t1 increment, 96 scans were averaged.
A DQF-COSY spectrum was collected in the TPPI mode with a spectral width of 4705 Hz in both dimensions; 2048 complex points in the t2 dimension and 320 (real) points in the t1 dimension were collected with a relaxation delay of one second, and 40 scans were averaged for each t1 increment.
A proton-detected 31P-1H heteronuclear correlation spectrum 37 was collected in the TPPI mode with a spectral width of 4705 Hz in the 1H dimension and a spectral width of 1000 Hz in the 31P dimension; 1024 complex points in the t2 (1H) dimension and 128 complex points in the t1 (31P) dimension were collected. Protons were presaturated for 1.0 second and 128 scans were accumulated for each t1 increment.
The acquired data were transferred to an IRIS 4D
workstation and processed by the software FELIX (MSI
Inc.) as described.38
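The acquisition parameters quoted above can be related to one another with a few standard conversions. The Python sketch below is illustrative only: it assumes a nominal 1H frequency of exactly 600 MHz (the true spectrometer frequency is not given in the text) and complex (simultaneous quadrature) sampling, for which the acquisition time is the number of complex points divided by the spectral width. The function names are hypothetical.

```python
def spectral_width_ppm(sw_hz, nucleus_freq_mhz):
    """Spectral width in ppm, given the width in Hz and the Larmor frequency in MHz."""
    return sw_hz / nucleus_freq_mhz

def acquisition_time_s(n_complex_points, sw_hz):
    """Direct-dimension acquisition time for complex sampling (dwell time = 1/SW)."""
    return n_complex_points / sw_hz

def digital_resolution_hz(n_complex_points, sw_hz):
    """Digital resolution before zero-filling, in Hz per point."""
    return sw_hz / n_complex_points

NOMINAL_1H_MHZ = 600.0  # assumed nominal value, not stated to this precision in the text

# 1D imino spectrum: 12,000 Hz spectral width.
imino_width_ppm = spectral_width_ppm(12000.0, NOMINAL_1H_MHZ)

# 2D NOESY in 2H2O: 4705 Hz, 2048 complex points in t2.
noesy_width_ppm = spectral_width_ppm(4705.0, NOMINAL_1H_MHZ)
noesy_acq_time = acquisition_time_s(2048, 4705.0)
noesy_resolution = digital_resolution_hz(2048, 4705.0)
```

Under these assumptions the 12,000 Hz window corresponds to 20 ppm and the 4705 Hz window to a little under 8 ppm, with a t2 acquisition time of roughly 0.44 second.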
Structure determination
The 3D structures of the 5′-GAAGC-TCC-GCTTC-3′ oligomer were generated by distance geometry and molecular dynamics calculations using distance and torsional angle constraints derived from NMR experiments. Most distance constraints from NOESY spectra in 2H2O were classified as strong, medium, or weak based on their relative intensities at 100 ms and 300 ms mixing times and were given generous distance bounds of 2.0-4.0 Å, 3.0-5.0 Å, or 4.0-6.0 Å, respectively. Canonical hydrogen-bond distances with bounds of 1.8-2.1 Å were assigned to Watson-Crick base-pairs. A large number of distance constraints involving exchangeable protons were also derived from H2O/NOESY spectra and were given only two wide distance bounds of either 2.0-5.0 Å or 3.0-6.0 Å, due to the exchange phenomena. The β and γ torsional angle constraints were determined primarily semi-quantitatively from the 31P-1H heteronuclear correlation data 30 using the in-plane "W" rule.39 Based on the absence of long-range 4J(H2′-P) coupling, all ε torsion angles were constrained to the trans domain (180(±30)°).40 The ζ and α dihedral angles were all constrained in the non-trans domain, since no backbone phosphorus signal of extraordinary shifting was observed.41 The χ dihedral angles were constrained to −100° (ideal B-DNA value) ± 30° when no aromatic-anomeric cross-peaks of comparable intensity to the CH5/CH6 cross-peaks were detected. These NOE distance (272 in total) and torsional angle (104 in total) constraints were used to generate initial structures using the DGII program (MSI, Inc.). The initial structures were further refined by restrained molecular dynamics using the program DISCOVER (MSI, Inc.). A 2 ps dynamics run was carried out at 300 K with a step size of 1.0 fs, followed by a conjugate gradient minimization of 200 iterations, looped ten times. Well-converged final structures with pair-wise r.m.s.d. values of approximately 0.98 Å were obtained after molecular dynamics calculations.
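The distance-bound scheme described above can be written down compactly. The following Python sketch maps NOE intensity classes to the lower and upper bounds quoted in the text and reports how far a model distance falls outside them. It is an illustration only, not the DGII or DISCOVER input format; the function names are hypothetical, and the class labels "tight" and "loose" for the two exchangeable-proton bounds are assumed, since the text does not name them.

```python
# Bounds in angstroms, as quoted in the text for non-exchangeable protons.
NOE_BOUNDS = {
    "strong": (2.0, 4.0),
    "medium": (3.0, 5.0),
    "weak": (4.0, 6.0),
}

# Two wide bounds for exchangeable protons; the labels are assumed for illustration.
EXCHANGEABLE_BOUNDS = {
    "tight": (2.0, 5.0),
    "loose": (3.0, 6.0),
}

# Canonical Watson-Crick hydrogen-bond distance bounds, as quoted in the text.
HBOND_BOUNDS = (1.8, 2.1)

def bounds_for(noe_class, exchangeable=False):
    """Look up (lower, upper) distance bounds for an NOE intensity class."""
    table = EXCHANGEABLE_BOUNDS if exchangeable else NOE_BOUNDS
    return table[noe_class]

def violation(distance, bounds):
    """Return how far a distance lies outside its bounds (0.0 if satisfied)."""
    lower, upper = bounds
    if distance < lower:
        return lower - distance
    if distance > upper:
        return distance - upper
    return 0.0

# Example: a hypothetical 4.5 angstrom model distance checked against a strong NOE.
example_violation = violation(4.5, bounds_for("strong"))
```

A check of this kind, summed over all restraints, is one simple way to judge how well a refined model agrees with the restraint list.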
Acknowledgments
We thank the National Science Council and the
Chung-Zhen Agricultural Foundation Society of Taiwan,
ROC for the instrumentation grants and Dr Larvery for
the kind gift of the CURVE program. Personal communication with Dr Sponer is highly appreciated. S.-H. C. is a
recipient of the Outstanding Research Award from the
National Science Council, Taiwan. This work
was supported by the NSC grants 89-2113-M-005-034 to
S.-H. C.
References
1. Choo, K. H. (1997). The Centromere, Oxford University Press, Oxford, UK.
2. Grady, D. L., Ratliff, R. L., Robinson, D. L.,
McCanlies, E. C., Meyne, J. & Moyzis, R. K. (1992).
Highly conserved repetitive DNA sequences are present at human centromeres. Proc. Natl Acad. Sci.
USA, 89, 1695-1699.
3. Catasti, P., Gupta, G., Garcia, A. E., Ratliff, R.,
Hong, L., Yau, P. et al. (1994). Unusual structures of
the tandem repetitive DNA sequences located at
human centromeres. Biochemistry, 33, 3819-3830.
4. Chou, S.-H., Zhu, L. & Reid, B. R. (1994). The unusual structure of the human centromere (GGA)2
motif: unpaired guanosines stacked between sheared
GA pairs. J. Mol. Biol. 244, 259-268.
5. Ferrer, N., Azorin, F., Villasante, A., Gutierrez, C. &
Abad, J. P. (1995). Centromeric dodeca-satellite DNA
sequences form fold-back structures. J. Mol. Biol.
245, 8-21.
6. Zhu, L., Chou, S.-H. & Reid, B. R. (1996). A single
G-to-C change causes human centromere TGGAA
repeats to fold back into hairpins. Proc. Natl Acad.
Sci. USA, 93, 12159-12164.
7. Ortiz-Lombardia, M., Cortes, A., Huertas, D., Eritia,
R. & Azorin, F. (1998). Tandem 5′-GA:GA-3′ mismatches account for the high stability of the fold-back structures formed by the centromeric Drosophila
dodeca-satellite. J. Mol. Biol. 277, 757-762.
8. Cortés, A., Huertas, D., Fanti, L., Pimpinelli, S.,
Marsellach, F. X., Piña, B. & Azorín, F. (1999).
DDP1, a single-stranded nucleic acid-binding protein
of Drosophila, associates with pericentric heterochromatin and is functionally homologous to the yeast
Scp 160p, which is involved in the control of cell
ploidy. EMBO J. 18, 3820-3833.
9. Sunkel, C. E. & Coelho, P. A. (1995). The elusive
centromere: sequence divergence and functional conservation. Curr. Opin. Genet. Dev. 5, 756-767.
10. Prosser, J., Frommer, M., Paul, C. & Vincent, P. C.
(1986). Sequence relationships of three human satellite DNAs. J. Mol. Biol. 187, 145-155.
11. Pluta, A. F., Cooke, C. A. & Earnshaw, W. C. (1990).
Structure of the human centromere at metaphase.
Trends Biochem. Sci. 15, 181-185.
12. Abad, J. P., Carmena, M., Baars, S., Saunders,
R. D. C., Glover, D. M., Ludena, P. et al. (1992).
Dodeca satellite: a conserved G+C-rich satellite
from the centromeric heterochromatin of Drosophila
melanogaster. Proc. Natl Acad. Sci. USA, 89, 4663-4667.
13. Carmena, M., Abad, J. P., Villasante, A. & Gonzalez,
C. (1993). The Drosophila melanogaster dodeca-satellite sequence is closely linked to the centromere and
can form connections between sister chromatids
during mitosis. J. Cell Sci. 105, 41-50.
14. Huertas, D. & Azorin, F. (1996). Structural polymorphism of homopurine DNA sequences. d(GGA)n
and d(GGGA)n repeats form intramolecular hairpins
stabilized by different base-pairing interactions.
Biochemistry, 35, 13125-13135.
15. Sponer, J., Gabb, H. A., Leszczynski, J. & Hobza, P.
(1997). Base-base and deoxyribose-base stacking
interactions in B-DNA and Z-DNA: a quantum-chemical study. Biophys. J. 73, 76-87.
16. Spackova, N., Berger, I. & Sponer, J. (2000). Nanosecond molecular dynamics of zipper-like DNA
duplex structures containing sheared G·A mismatch
pairs. J. Am. Chem. Soc. 122, 7564-7572.
17. Shepard, W., Cruse, W. B. T., Fourme, R., Fortelle,
E. & Prange, T. (1998). A zipper-like duplex in
DNA: the crystal structure of d(GCGAAAGCT) at
2.1 Å resolution. Structure, 6, 849-861.
18. Tseng, Y.-Y. & Chou, S.-H. (1999). Systematic NMR
assignment pathways for DNA exchangeable protons. J. Chin. Chem. Soc. 46, 699-706.
19. Chou, S.-H., Cheng, J.-W. & Reid, B. (1992). Solution
structure of [d(ATGAGCGAATA)]2: adjacent GA
mismatches stabilized by cross-strand base-stacking
and BII phosphate groups. J. Mol. Biol. 228, 138-155.
20. Cheng, J.-W., Chou, S.-H. & Reid, B. R. (1992). Base-pairing geometry in G·A mismatches depends
entirely on the neighboring sequence. J. Mol. Biol.
228, 1037-1041.
21. Hare, D. R., Wemmer, D. E., Chou, S.-H., Drobny,
G. & Reid, B. R. (1983). Assignment of the
non-exchangeable proton resonances of
d(CGCGAATTCGCG) using two-dimensional
nuclear magnetic resonance methods. J. Mol. Biol.
171, 319-336.
22. Gehring, K., Leroy, J.-L. & Gueron, M. (1993). A
tetrameric DNA structure with protonated cytosine:cytosine base pairs. Nature, 363, 561-565.
23. Varani, G. & Tinoco, I. J. (1991). RNA structure and
NMR spectroscopy. Quart. Rev. Biophys. 24, 479-532.
24. Marino, J. P., Schwalbe, H. & Griesinger, C. (1999).
J-coupling restraints in RNA structure determination. Accts Chem. Res. 32, 614-623.
25. Varani, G. & Tinoco, I. J. (1991). Carbon assignments
and heteronuclear coupling constants for an RNA
oligonucleotide from natural abundance 13C-1H correlated experiments. J. Am. Chem. Soc. 113, 9349-9354.
26. Wang, A. H.-J., Quigley, G. J., Kolpak, F. J., van der
Marel, G., van Boom, J. H. & Rich, A. (1981). Left-handed double helical DNA: variations in the backbone conformation. Science, 211, 171-176.
27. Frederick, C. A., Coll, M., van der Marel, G. A., van
Boom, J. H. & Wang, A. H.-J. (1988). Molecular
structure of cyclic deoxydiadenylic acid at atomic
resolution. Biochemistry, 27, 8350-8361.
28. Guan, Y., Gao, Y.-G., Liaw, Y.-C., Robinson, H. &
Wang, A. H.-J. (1993). Molecular structure of cyclic
diguanylic acid at 1 Å resolution of two crystal
forms: self-association, interactions with metal ion/
planar dyes and modeling studies. J. Biomol. Struct.
Dynam. 11, 253-276.
29. Chou, S.-H., Zhu, L. & Reid, B. R. (1996). On the
relative ability of centromeric GNA triplets to form
hairpins versus self-paired duplexes. J. Mol. Biol.
259, 445-457.
30. Chou, S.-H., Zhu, L., Gao, Z., Cheng, J.-W. & Reid,
B. R. (1996). Hairpin loops consisting of single
adenine residues closed by sheared A:A and G:G
pairs formed by the DNA triplets AAA and GAG:
Solution structure of the d(GTACAAAGTAC) hairpin. J. Mol. Biol. 264, 981-1001.
31. Chou, S.-H., Tseng, Y.-Y. & Wang, S.-W. (1999).
Stable sheared A:C pair in DNA hairpins. J. Mol.
Biol. 287, 301-313.
32. Umezawa, Y. & Nishio, M. (2000). CH/π interaction
in the crystal structure of TATA-box binding protein/DNA complexes. Bioorg. Med. Chem. 8, 2643-2650.
33. Chou, S.-H., Tseng, Y.-Y., Chen, Y.-R. & Cheng,
J.-W. (1999). Structural studies of symmetric DNA
undecamers containing non-symmetrical sheared
(PuGAPu):(PyGAPy) motifs. J. Biomol. NMR, 14,
157-167.
34. Chou, S.-H. & Tseng, Y.-Y. (1999). Cross-strand
purine-pyrimidine stack and sheared purine:pyrimidine pairing in the human HIV-1 reverse transcriptase inhibitors. J. Mol. Biol. 285, 41-48.
35. Marky, L. A. & Breslauer, K. J. (1987). Calculating
thermodynamic data for transitions of any molecularity from equilibrium melting curves. Biopolymers,
26, 1601-1620.
36. Plateau, P. & Gueron, M. (1982). Exchangeable Proton NMR without base-line distortion, using new
strong-pulse sequences. J. Am. Chem. Soc. 104, 7310-7311.
37. Sklenar, V., Miyashiro, H., Zon, G., Miles, H. T. &
Bax, A. (1986). Assignment of the 31P and 1H resonances in oligonucleotides by two-dimensional NMR
spectroscopy. FEBS Letters, 208, 94-98.
38. Chou, S.-H., Cheng, J.-W., Fedoroff, O. & Reid, B. R.
(1994). DNA sequence GCGAATGAGC containing
the human centromere core sequence GAAT forms a
self-complementary duplex with sheared G:A pairs
in solution. J. Mol. Biol. 241, 467-479.
39. Sarma, R. H., Mynott, R. J., Wood, D. J. & Hruska,
F. E. (1973). Determination of the preferred conformations constrained along the C4′-C5′ and C5′-O5′
bonds of β-5′-nucleotide in solution. Four-bond
31P-1H coupling. J. Am. Chem. Soc. 95, 6457-6459.
40. Altona, C. (1982). Conformational analysis of nucleic
acids. Determination of backbone geometry of
single-helical RNA and DNA in aqueous solution.
Recl. Trav. Chim. Pays-Bas. 101, 413-433.
41. Gorenstein, D. G., Schroeder, S. A., Fu, J. M., Metz,
J. T., Roongta, V. & Jones, C. R. (1988). Assignments
of 31P NMR resonances in oligodeoxyribonucleotides: origin of sequence-specific variations in the
deoxyribose phosphate backbone conformation and
the 31P chemical shifts of double-helical nucleic
acids. Biochemistry, 27, 7223-7237.
Edited by I. Tinoco
(Received 18 July 2001; received in revised form 27
September 2001; accepted 27 September 2001)
http://www.academicpress.com/jmb
Supplementary Material comprising one Table is
available on IDEAL