Nucleic Acids Research

Volume 14 Number 1 1986
Computer graphics program to reveal the dependence of the gross three-dimensional structure of
the B-DNA double helix on primary structure
Chang-Shung Tung
Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, and
Stephen C. Harvey
Department of Biochemistry, University of Alabama, Birmingham, AL 35294, USA
Received 6 June 1985
ABSTRACT
Programs are presented to plot the gross three-dimensional structure of
the DNA double helix with the base sequence as input information. The rules
that determine the overall structure of the double helix are those that predict the dependence of local helix parameters (specifically, helix twist
angle and relative basepair roll angle) on sequence. For this purpose, the
user can select either the Calladine-Dickerson parameters or the Tung-Harvey
parameters. These programs can be used as tools to investigate the variation
of DNA tertiary structure with sequence, which may play an important role in
the sequence-specific recognition of DNA by proteins.
INTRODUCTION
The dependence of DNA helix structure on base sequence has attracted much
attention in recent years. This sequence dependence of DNA tertiary structure
may play an important role in DNA packaging and in the sequence-specific
recognition of DNA by other molecules (1-3). In the past few years, several
models (2-8) have been developed for the prediction of helix parameters
[e.g., helix twist angles (tg), changes of roll angle (tr), propeller twist
angles (t ), basepair separations (d), etc.] of DNA double helices from
primary sequences. These predictions are sets of numerical data pertaining to
the three-dimensional structure of the DNA double helix. Even with all these
helix parameters available from prediction models, it is still very difficult
to visualize the overall shape of the DNA double helix in space. To address
this problem, we present computer programs that plot pictures of DNA double
helices from specified primary structures. Possible applications are also
suggested in this paper.
Source programs are available in either of two formats. Requests for
tapes in standard card image format should be directed to CST at Los Alamos,
while VAX VMS format versions are available from SCH in Birmingham. There
is no charge, and programs will be supplied upon receipt of a self-addressed
mailing label and a blank tape.
Figure 1: A rectangular plate to represent a basepair. The rectangular plate
BCDE is 10 Å long and 5 Å wide, with the helix rotational center indicated as
"+". A and F represent two points in the sugar-phosphate backbone.
METHOD
The purpose of these programs is to display the overall shape of the DNA
double helix with a specific sequence. Out of the helix parameters defined
by Dickerson et al. (4,7), only three (helix twist angle, change of roll
angle, and basepair separation) are used in this algorithm for the following
reasons: First, propeller twist angle is a property within a basepair which
does not affect the relative position and orientation of two consecutive
basepairs. Second, basepair sliding will not alter the orientation of the
following basepairs, and none of the existing prediction models can predict
basepair sliding well. Third, because basepair tilt angle is usually very
small in B-DNA (9, 10), it has been assumed to be zero.
To further simplify the pictorial presentation, each basepair is represented
by a rectangular plate which lies on the mean plane of the basepair (Fig. 1).
The plate, designated BCDE, is 10 Å long and 5 Å wide, and is connected to
two points (A and F) that represent points in the backbone. The center of
helix rotation is indicated as "+".
Each plate corresponds to a local coordinate system in which the origin
coincides with the center of helix rotation, the x-axis is parallel to AF, and
the y-axis is parallel to CB. The geometry of the nth basepair can be easily
deduced in the local coordinate system of the (n-1)th basepair by moving the
plate a distance d in the z direction (d is the basepair separation), rotating
through an angle tg about the z-axis (tg is the helix twist angle), and then
rotating through an angle tr about the x-axis (tr is the change of roll
angle), as shown in Fig. 2. By keeping track of the transformation matrices
between each local coordinate system and the global coordinate system, the
coordinates of all basepairs with respect to the global coordinate system can
be derived. The resulting set of coordinates is then used for the plotting of
the double helix.
Figure 2: Geometry of two consecutive basepairs. Each basepair corresponds
to a local coordinate system. The geometry of the top basepair can be easily
derived in the local coordinate system of the bottom basepair by moving a
distance d in the z direction, rotating through an angle tg with respect to
the z-axis, and then rotating through an angle tr with respect to the x-axis.
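The chaining of local transformations described above can be sketched in a few lines of Python. This is a minimal illustration, not the PREPLT source: the function names are hypothetical, the roll rotation is assumed to act about the x-axis of the already twisted plate, and the step values (3.4 Å rise, 36 degree twist, zero roll) are idealized B-DNA numbers chosen only for the check, not predictions of either model.

```python
import math


def rot_z(angle):
    """Rotation matrix for `angle` radians about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(angle):
    """Rotation matrix for `angle` radians about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def build_frames(steps):
    """Return (origin, rotation) of every basepair in the global system.

    `steps` holds one (d, tg_deg, tr_deg) tuple per basepair step. The first
    basepair defines the global coordinate system.
    """
    origin = [0.0, 0.0, 0.0]
    rotation = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    frames = [(origin, rotation)]
    for d, tg_deg, tr_deg in steps:
        # Move a distance d along the z-axis of the previous basepair.
        shift = apply(rotation, [0.0, 0.0, d])
        origin = [origin[i] + shift[i] for i in range(3)]
        # Twist about z, then roll about x (assumed: the twisted x-axis).
        local = matmul(rot_z(math.radians(tg_deg)), rot_x(math.radians(tr_deg)))
        rotation = matmul(rotation, local)
        frames.append((origin, rotation))
    return frames


# Ten idealized steps make one full turn and rise 34 Å along z.
ideal = build_frames([(3.4, 36.0, 0.0)] * 10)
```

Any point given in the local system of a basepair (for example a corner of the plate) can then be placed in the global system as the origin of that basepair plus its rotation matrix applied to the local point.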
Figure 3 shows the flow chart of the algorithm. Two programs were developed.
PREPLT calculates the coordinates of all basepairs, while PLT plots the
results from PREPLT. Both programs were first developed in FORTRAN on a
CDC 7600 computer at the Los Alamos National Laboratory with the DISPLA
package (Display Integrated Software System and Plotting Language from
Integrated Software Systems Corporation, 4186 Sorrento Valley Blvd., San
Diego, CA 92121). VAX versions of the programs were later developed at the
University of Alabama at Birmingham; these are also in FORTRAN, and they use
the PLOT10 plotting package (Tektronix, Inc., P.O. Box 500, Beaverton, OR
97077).
SAMPLE PLOTS
The programs can apply the helix parameters predicted by either of the
two prediction models (2,3,7) to make the plot. We chose to use the prediction
model developed by Tung and Harvey (2,3) because it is a more detailed
model, with the predictions derived from conformational energy calculations
on all-atom models for the basepairs. This model has two principal
advantages. First, it is detailed enough to distinguish between adenosine and
guanosine and between cytidine and thymidine, whereas the Calladine-Dickerson
model only distinguishes between purines and pyrimidines. Second, the only
adjustable parameters correspond to simple physical quantities that can be
compared with experimental quantities (3).
Figure 3: Flowchart of the algorithm. BP(N) indicates the coordinates of the
nth basepair. LC(N) represents the local coordinate system corresponding to
the nth basepair. GC is the global coordinate system, while TM(N) represents
the transformation matrix between LC(N) and GC.
Figure 4: Plot of d((G)12 (C)12). This 12-basepair DNA double helix is
straight, with constant helix parameters for all basepairs except for some
small deviations at the end basepairs, which lack the propagational effect
from the neighboring steps.
Figure 5: Plot of d(CGCGAATTCGCG). As expected, this helix is not regular
but shows local variations from the ideal B-DNA structure.
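The difference in resolution between the two models can be illustrated with a short Python sketch. It reduces a sequence to purine (R) and pyrimidine (Y) classes and counts the distinct dinucleotide steps seen at each level of detail. The function names are hypothetical and the code only classifies bases; it does not reproduce the parameter tables of either model.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def to_purine_pyrimidine(seq):
    """Map each base to R (purine) or Y (pyrimidine)."""
    classes = []
    for base in seq.upper():
        if base in PURINES:
            classes.append("R")
        elif base in PYRIMIDINES:
            classes.append("Y")
        else:
            raise ValueError("unexpected base: " + base)
    return "".join(classes)


def dinucleotide_steps(seq):
    """List the overlapping two-base steps along one strand."""
    return [seq[i:i + 2] for i in range(len(seq) - 1)]


dodecamer = "CGCGAATTCGCG"
reduced = to_purine_pyrimidine(dodecamer)

# Distinct step types a four-letter model can tell apart in this sequence,
# compared with a model that sees only purines and pyrimidines.
full_step_types = set(dinucleotide_steps(dodecamer))
reduced_step_types = set(dinucleotide_steps(reduced))
```

For the dodecamer of Fig. 5, the four-letter alphabet yields seven distinct step types along the strand, while the purine/pyrimidine alphabet yields only four.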
The first plot is that of a homopolymer (d(G)12 d(C)12) as shown in Fig.
4. This piece of DNA is straight with identical helix parameters within the
helix (except for some small deviations at the end basepairs due to the
absence of the propagational effect from the neighboring steps).
Figure 6: Plot of the 51-basepair bending locus of trypanosome kinetoplast
DNA. The helix is bent, with the bending nearly confined to a single plane.
This piece of DNA was identified by Wu and Crothers (12) from gel
electrophoresis measurements to be the bending locus of trypanosome
kinetoplast minicircle DNA. (b) shows the view of the helix after a 90 degree
rotation from (a).
Figure 7: Plot of d((A5T5)20). This piece of DNA is 200 basepairs long.
When compared to the structure shown in Fig. 6, this helix bends even more,
with a smaller radius of curvature. The bending of this helix is not planar
but forms a superhelical structure. For the purpose of clarity, basepairs
are not included in this plot. (a) shows the side view, while (b) shows the
top view of the DNA double helix.
The next plot (Fig. 5) is for the self-complementary dodecamer
d(CGCGAATTCGCG). The interesting feature one notices first is that the helix
is not straight. The helix structure is not regular but depends on the sequence, as indicated in the crystal structure (7,8).
Figure 6 is a plot of the 50 basepair bending locus of the trypanosome
kinetoplast DNA whose anomalous gel mobility (11) is believed to be due to
macroscopic bending (12,13). One can see that this piece of DNA is indeed
bent. The bending is nearly confined to a plane; i.e., bending in one direction is much more pronounced than in other directions.
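One way to put a number on "nearly confined to a plane" is sketched below in Python. This measure is an assumption introduced here for illustration and is not part of PREPLT or PLT: it takes the plane through the first, middle, and last points of a helix-axis path and reports the largest distance of any point from that plane. The function name and the test paths are hypothetical.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def max_out_of_plane(points):
    """Largest distance of any point from the plane through the first,
    middle, and last points. Returns 0.0 if those three are collinear."""
    first, mid, last = points[0], points[len(points) // 2], points[-1]
    normal = cross(sub(mid, first), sub(last, first))
    length = math.sqrt(dot(normal, normal))
    if length < 1e-12:
        return 0.0  # straight path: no bending plane is defined
    return max(abs(dot(sub(p, first), normal)) / length for p in points)


# A planar arc stays in its plane; a helical path does not.
arc = [(math.cos(t / 10.0), 0.0, math.sin(t / 10.0)) for t in range(16)]
coil = [(math.cos(t), math.sin(t), 0.5 * t) for t in range(5)]
planar_deviation = max_out_of_plane(arc)
coil_deviation = max_out_of_plane(coil)
```

A small value relative to the length of the path would correspond to planar bending of the kind seen in Fig. 6, while a large value would indicate an out-of-plane (superhelical) path.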
The last plot (Fig. 7) is a self-complementary DNA helix with an alternating adenosine-thymidine sequence (d(A5T5)20). This particular DNA is predicted to bend even more than the bending locus of trypanosome kinetoplast
DNA. The bending is not planar but forms a superhelical structure.
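The sequences used for the sample plots are easy to construct and check programmatically. The Python sketch below builds the d(A5T5)20 sequence and tests whether a sequence is self-complementary, i.e., equal to its own reverse complement. The function names are hypothetical and only standard Watson-Crick pairing is assumed.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Sequence of the complementary strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def is_self_complementary(seq):
    """True if both strands have the same 5' to 3' sequence."""
    return seq.upper() == reverse_complement(seq)


# d(A5T5)20: twenty repeats of five adenosines followed by five thymidines.
a5t5_20 = ("A" * 5 + "T" * 5) * 20
dodecamer = "CGCGAATTCGCG"
homopolymer_strand = "G" * 12
```

Both the dodecamer of Fig. 5 and d(A5T5)20 pass this check, whereas a single strand of the homopolymer of Fig. 4 does not.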
SUMMARY
All plots generated from PLT are stereoscopic views. The plotted DNA
double helix can be seen in three-dimensional space through a pair of stereo
glasses with each lens focused on one plot. This representation is a useful
visual aid to the prediction models.
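A stereoscopic pair can be produced from a single set of three-dimensional coordinates by drawing two projections that differ by a small rotation about the vertical axis. The Python sketch below shows the idea. How PLT builds its stereo views is not described here, so the total stereo angle of 6 degrees, the image separation, and the function names are all assumptions made for illustration.

```python
import math


def rotate_about_y(point, angle_deg):
    """Rotate a 3D point about the vertical (y) axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    x, y, z = point
    return (c * x + s * z, y, -s * x + c * z)


def stereo_pair(points, stereo_angle_deg=6.0, separation=40.0):
    """Return two lists of 2D points (left image, right image).

    Each image is an orthographic projection onto the xy-plane after a
    rotation of half the stereo angle, shifted sideways so that the two
    images sit next to each other on the page.
    """
    half = stereo_angle_deg / 2.0
    left, right = [], []
    for p in points:
        xl, yl, _ = rotate_about_y(p, -half)
        xr, yr, _ = rotate_about_y(p, +half)
        left.append((xl - separation / 2.0, yl))
        right.append((xr + separation / 2.0, yr))
    return left, right


# A point on the rotation axis and a point 10 Å in front of it.
left, right = stereo_pair([(0.0, 5.0, 0.0), (0.0, 0.0, 10.0)])
```

Points on the rotation axis land at the same relative position in both images, while points in front of or behind it are displaced in opposite directions, which is what produces the impression of depth. Depending on the viewing method, the two images may need to be swapped.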
These programs can be used to search for specific gross structural
features in known DNA sequences. They can also be easily modified to look for
the optimum sequence which comes closest to some specific predetermined
three-dimensional structure.
ACKNOWLEDGEMENTS
This work was supported by the U. S. Department of Energy and a grant to
S.C.H. from the National Science Foundation (PCM-8417001).
REFERENCES
1. Widom, J. (1984) Nature 309, 312-313.
2. Tung, C.S. (1984) Ph.D. dissertation, Univ. of Alabama, Birmingham.
3. Tung, C.S. and Harvey, S.C. (1984) Nucl. Acids Res. 12, 3343-3356.
4. Fratini, A.V., Kopka, M.L., Drew, H.R. and Dickerson, R.E. (1982) J.
Biol. Chem. 257, 14686-14707.
5. Kabsch, W., Sander, C. and Trifonov, E.N. (1982) Nucl. Acids Res. 10,
1097-1104.
6. Calladine, C.R. (1982) J. Mol. Biol. 161, 343-352.
7. Dickerson, R.E. (1983) J. Mol. Biol. 166, 411-419.
8. Dickerson, R.E. (1983) Scientific American 249(2), 94-111.
9. Mellema, J.R., van Kampen, P.N., Carlson, C.N., Bosshard, H.E. and
Altona, C. (1983) Nucl. Acids Res. 11, 2893-2905.
10. Wells, R.D., Goodman, T.C., Hillen, W., Horn, G.T., Klein, R.D., Larson,
J.E., Muller, U.R., Neuendorf, S.K., Panayotatos, N. and Stirdivant,
S.M. (1980) Prog. Nucl. Acid Res. Mol. Biol. 24, 167-267.
11. Marini, J.C., Levene, S.D., Crothers, D.M. and Englund, P.T. (1982)
Proc. Natl. Acad. Sci. USA 79, 7664-7668.
12. Wu, H.-M. and Crothers, D.M. (1984) Nature 308, 509-513.
13. Hagerman, P.J. (1984) Proc. Natl. Acad. Sci. USA 81, 4632-4646.