Modeling of DNA Replication

Xiaoli Yang1, Rong Ge1, Yifan Cai1 and Charles Tseng2
1 Department of Electrical and Computer Engineering
2 Department of Biological Sciences
Purdue University Calumet
Hammond, IN, USA
Abstract - DNA replication is a necessary step prior to cell
division, so that the genetic material can be duplicated for
equal distribution in the daughter cells. Although in the
course of evolution, cells have developed specific mechanisms
to ensure the fidelity of the process, faulty enzymes and
mutagens may cause changes in DNA sequences, leading to a
variety of diseases including cancers. As important as the
DNA replication process is, however, teaching and learning of
the subject have been difficult. The present paper describes
an innovative computer program that stresses inquiry-based
learning through visualization, cognitive feedback and
hands-on interactions. It is one of a series of interactive computer
modules for learning genetics at both high school and college
levels.
Keywords: DNA replication, modeling, computer program
1 Introduction
DNA replication is a fundamental property of all living
organisms. Prior to cell division, DNA must be replicated, so
that after cell division, each of the resulting daughter cells
ends up with the same amount of genetic material as the
original cell. This process ensures the constancy and
continuity of genomic DNA during cell reproduction. Like
most biological processes, the detailed mechanisms of DNA
replication have not been completely worked out, although a
great deal of effort has gone into their elucidation. Our
current knowledge is based primarily on the study of bacteria
such as E. coli (1). However, since similar proteins involved
in DNA replication have also been identified in eukaryotic
cells (e.g., yeast and other eukaryotic cell cultures), it seems
safe to say that the major DNA replication processes in
prokaryotic and eukaryotic cells are similar except for minor
details (2).
Although DNA replication is a subject that is taught in
both high school and college biology courses, students of all
levels still find this subject difficult. Part of the difficulty lies
in the intricate and abstract nature of the molecular processes
(3-7). Textbooks these days have detailed illustrations that are
quite helpful for learning (1), but in the end, the learning is
not active. Recent multimedia tools such as DVDs and
computerized animations represent a new way of teaching and
learning (8-10). However, these multimedia based learning
methodologies do not emphasize the interaction of eyes,
mind, and hands in the learning process. The present paper
describes an innovative computer program that stresses
inquiry based learning through visualization, cognitive
feedback and hands-on interactions.
The DNA replication module is one of several modules
developed (11-15) for learning genetics using an interactive
computer program. Our specific aim is to provide a useful
learning tool for a number of high school and college level
courses in the areas of general biology, genetics, cell biology,
and molecular biology.
2 Model Development
2.1 Overview
The DNA model has three levels in its structural
hierarchy and is composed of many independent ball-shaped
elements. Each element has a position, a color, and a radius.
Linked together, the elements can interact with one another
and move uniformly. A smoothing algorithm, which adds a
square outline to the linked elements, is used to fill the gap
between two elements. A string of elements forms a rod,
representing a DNA strand (Fig. 1).
Fig. 1. DNA modeling : from balls to rod
2.2 Basic model element: node
The node class represents the basic element of the DNA
molecule – the ball. The “ball” is nothing but structural data
in a linked list. Every ball along the linked list is regarded as
a node. A node is characterized by the Cartesian coordinates
X and Y (location), a radius (size), and a color (identity) (Fig.
2). Initially, a node is created according to preset parameters.
After creation, the color and the radius remain static, while
the coordinates may change from time to time during the
simulation process.
Fig. 2. Node class overview
It should be pointed out that the nodes cannot overlap
with one another in the coordinate system. Each node
occupies its own location so that there is no ambiguity. To
form a bidirectional linked list, every node must have two
pointers: one points to the previous node, while the other
points to the next node. Only the first node (head node) has a
void (null) previous pointer. Every node interacts with others
based on these relationships (Fig. 3). The interaction between
small internal nodes makes the rod move flexibly.
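The stepping-up displacements SX and SY (Section 2.4, Fig. 5) are
$$S_X = 2R \cdot \frac{d_x}{D_{XY}}, \qquad S_Y = 2R \cdot \frac{d_y}{D_{XY}} \qquad (3)$$
where $D_{XY}$ is the distance between the two nodes, $d_x$ and
$d_y$ are the distances along the x and y axes, and $2R$ is the
sum of the radii of the two adjacent nodes.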
Fig. 3. The bidirectional linked list
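The node and its bidirectional links can be sketched as follows. This is a minimal illustration in Python, not the program's actual source: the class name `Strand`, the `append` method, and the overlap test (centre distance smaller than the sum of the radii) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass(eq=False)
class Node:
    """One ball of the strand: position, size (radius), and identity (color)."""
    x: float
    y: float
    radius: float
    color: str
    # None marks the head (no previous node) or the tail (no next node).
    prev: Optional["Node"] = field(default=None, repr=False)
    next: Optional["Node"] = field(default=None, repr=False)


class Strand:
    """Bidirectional linked list of nodes (hypothetical class name)."""

    def __init__(self) -> None:
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None

    def append(self, node: Node) -> None:
        """Link a node at the tail, rejecting overlap with its neighbour."""
        if self.tail is None:
            self.head = self.tail = node
            return
        gap = ((node.x - self.tail.x) ** 2 + (node.y - self.tail.y) ** 2) ** 0.5
        if gap < node.radius + self.tail.radius:
            raise ValueError("nodes may not overlap")
        node.prev = self.tail
        self.tail.next = node
        self.tail = node

    def __iter__(self) -> Iterator[Node]:
        current = self.head
        while current is not None:
            yield current
            current = current.next


# Four touching balls of radius 1 laid out along the x axis.
strand = Strand()
for i in range(4):
    strand.append(Node(x=2.0 * i, y=0.0, radius=1.0, color="red"))
```

Keeping both `prev` and `next` lets a moving node reach either neighbour directly, which is what the interaction between adjacent nodes relies on.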
2.3 Smoothing algorithm
As mentioned above, to avoid the appearance of
discontinuous segments, a smoothing algorithm, which adds a
square outline to the linked nodes, is used to fill the gap
between every two nodes. This makes the list more like a rod
than a series of balls. To create the square outline, four points
need to be determined. Fig.4 shows how the 4 points are
calculated using a pair of homothetic triangles.
The known variables are the nodes’ coordinates and
radii. From the property of homothetic triangles, we know
that
(1)
Thus,
(2)
Fig. 4. Smoothing square and the smoothed rod
Assume that (N1x, N1y) and (N2x, N2y) are the
coordinates of the nodes. Point 1 is (N1x-a, N1y-b), Point 2 is
(N2x-a, N2y-b), Point 3 is (N2x+a, N2y+b), and Point 4 is
(N1x+a, N1y+b).
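The four outline points can be computed as in the sketch below. It assumes that the offset (a, b) is a vector of length R (the node radius) perpendicular to the line joining the two node centres, which is what the similar-triangle argument suggests; the sign convention and the function name `smoothing_outline` are assumptions, not taken from the program.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def smoothing_outline(n1: Point, n2: Point, radius: float) -> List[Point]:
    """Return Points 1-4 of the outline that fills the gap between two nodes."""
    dx = n2[0] - n1[0]
    dy = n2[1] - n1[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        raise ValueError("nodes must occupy distinct locations")
    # Similar triangles: |a| / |dy| = |b| / |dx| = radius / dist.
    # The sign choice (assumed) makes (a, b) perpendicular to (dx, dy).
    a = radius * dy / dist
    b = -radius * dx / dist
    return [
        (n1[0] - a, n1[1] - b),  # Point 1
        (n2[0] - a, n2[1] - b),  # Point 2
        (n2[0] + a, n2[1] + b),  # Point 3
        (n1[0] + a, n1[1] + b),  # Point 4
    ]


# Two nodes of radius 1 on the x axis give an outline spanning y = -1 to y = 1.
corners = smoothing_outline((0.0, 0.0), (2.0, 0.0), 1.0)
```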
2.4 Node movement
The node itself is not capable of doing complex
movement. In fact, only two types of movement are allowed:
1) teleporting the node to a specified location and 2)
connecting a node to another nearby node (what we call
stepping-up movements). These two movements are one-time
movements; there are no intermediate states during the
movements. The algorithm for stepping-up is also based on
two homothetic triangles (Fig. 5). In order to step up, SX and
SY are calculated according to Eq. (3).
Fig. 5. The stepping-up algorithm for the node
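A minimal sketch of the stepping-up movement follows. It assumes that SX and SY are the components of a displacement of length 2R along the line joining the two nodes (SX = 2R·dx/DXY, SY = 2R·dy/DXY) and that they are measured from the node being joined; the function name `step_up` and this interpretation are assumptions made for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def step_up(moving: Point, anchor: Point, r_moving: float, r_anchor: float) -> Point:
    """Return the new position of `moving` so that it just touches `anchor`."""
    dx = moving[0] - anchor[0]
    dy = moving[1] - anchor[1]
    dxy = math.hypot(dx, dy)  # current centre-to-centre distance
    if dxy == 0:
        raise ValueError("direction is undefined for coincident nodes")
    two_r = r_moving + r_anchor  # sum of the radii of the adjacent nodes
    sx = two_r * dx / dxy
    sy = two_r * dy / dxy
    # One-time movement: the node jumps straight to its final location.
    return (anchor[0] + sx, anchor[1] + sy)


# A node 10 units away is pulled in to a distance of 2R = 2.
new_pos = step_up((6.0, 8.0), (0.0, 0.0), 1.0, 1.0)
```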
3 Program Contents
Content design is based on three fundamental concepts:
1) Unlike RNA polymerase, DNA polymerase is unable to
initiate synthesis of a new strand de novo; that is, it requires
a preexisting primer. The major role of DNA polymerase is,
therefore, primer extension. In the cell, the primer is
synthesized by the enzyme RNA primase.
2) DNA replication is a protein-controlled process. Numerous
proteins are involved in changing the topology of the molecule
and separating the two strands of the double helix. The
proteins are aggregated in a complex “factory” through which
the DNA duplex passes (individual proteins do not “travel” to
the duplex); the duplex is recognized and bound by individual
proteins in the factory for specific reactions.
3) Due to the antiparallel nature of the DNA duplex,
semiconservative replication must proceed in opposite
directions on the two template strands. For the two core
enzymes of DNA polymerase to stay together, the lagging
strand template moves differently from the leading strand
template so that the two core enzymes can synthesize both
strands without coming apart (see details below).
This module is designed to emphasize inquiry based
learning (16); learning is achieved through questioning and
hands-on interactions. In each of the learning steps, dynamic
models of DNA molecules undergoing changes mediated by
various proteins are presented for visualization, cognition, and
operation. Completion of the program requires comprehension
of the entire concept and thus ensures the success of learning
experiences.
3.1 Antiparallel organization and semiconservative replication of DNA
Each of the two intertwined strands of the DNA double
helix is made of many basic units called nucleotides, which
are composed of a 5-carbon sugar (deoxyribose), a phosphate
group attached to the 5’C of the sugar, and a nitrogenous base
(A, C, G, or T) attached to the 1’C of the sugar. At the
opposite end of the phosphate group is an OH group attached
to the 3’C of the sugar. Therefore, each strand of DNA has
two ends: The 5’P end and the 3’OH end (Fig. 6).
Fig. 6. DNA double helix with antiparallel organization
After separation of the complementary DNA strands,
each strand serves as a template for DNA synthesis. Fig. 7
shows 2 new strands being synthesized in opposite directions.
The resulting two DNA duplexes each consist of an old
strand and a new strand; this is known as semiconservative
replication.
Fig. 8. a) DNA denaturation by helicase, b) binding of SSBs, c)
binding and synthesis of RNA primers (blue) by RNA primase
(green)
Fig. 7. New DNA strands synthesized in opposite directions.
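The antiparallel and semiconservative rules can be illustrated with a short sketch. Each strand is written as a string in the 5’ to 3’ direction; the function names and the example sequence are illustrative assumptions.

```python
from typing import Tuple

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

Duplex = Tuple[str, str]  # two strands, each written 5'->3'


def synthesize_new_strand(template_5_to_3: str) -> str:
    """Return the new strand, written 5'->3', for a given template.

    The new strand is antiparallel to its template, so the template is
    read from its 3' end, hence the reversal.
    """
    return "".join(COMPLEMENT[base] for base in reversed(template_5_to_3))


def replicate(duplex: Duplex) -> Tuple[Duplex, Duplex]:
    """Semiconservative replication: each daughter keeps one old strand."""
    old_top, old_bottom = duplex
    daughter_1 = (old_top, synthesize_new_strand(old_top))
    daughter_2 = (synthesize_new_strand(old_bottom), old_bottom)
    return daughter_1, daughter_2


parent = ("ATGGCA", synthesize_new_strand("ATGGCA"))
daughters = replicate(parent)
```

Both daughter duplexes carry the same sequence as the parent, and each contains exactly one of the parent's original strands.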
3.2 Individual steps: protein facilitated DNA replication
DNA replication involves the following steps in sequence:
a) Denaturation of double stranded DNA by helicase (Fig. 8a).
b) Binding of single strand binding proteins (SSBs) to prevent
renaturation of the newly separated DNA strands (Fig. 8b).
c) Binding of RNA primase to initiate the synthesis of a short
RNA primer in the 5’ to 3’ direction; the strand that serves as
a template for continuous DNA synthesis is called the leading
strand template, while the strand that serves as a template for
discontinuous synthesis is called the lagging strand template
(Fig. 8c).
d) Extension of the RNA primer by DNA polymerase III (core
enzyme), known as DNA synthesis or elongation (Fig. 9); the
discontinuous synthesis of the lagging strand is now evident,
and each fragment of the newly formed DNA is known as an
Okazaki fragment (Figs. 9 and 10a).
e) Removal of the RNA primers by the enzyme RNase H (Fig.
10b and c), which degrades the RNA nucleotides in the 5’ to 3’
direction one by one until the last RNA nucleotide, which is
then removed by an exonuclease (Fig. 10a).
f) Filling of the gap (after the removal of the RNA primer) by
DNA polymerase I through synthesis of a short piece of DNA
(Fig. 10c); the new DNA segment is not connected to the
neighboring Okazaki fragment, resulting in a nick that is then
sealed by DNA ligase (Fig. 10d), which catalyzes the formation
of a phosphodiester bond.
Fig. 9. Primer extension by DNA polymerase III: leading strand
(dark blue, left), lagging strand with two Okazaki fragments
(light blue, right)
Fig. 10. Replacement of RNA primer with new DNA. a) 3
Okazaki fragments (right) with an exonuclease (grayish blue
structure with spikes) for removing the last RNA primer
nucleotide of the oldest Okazaki fragment, b) RNase H
(circular gold structure) for removing the RNA primer, c)
DNA polymerase I (orange red oblong structure) for synthesis
of a new, short strand of DNA to replace RNA primer, d) DNA
ligase (triangular gold) for sealing the nick (phosphodiester
bond formation).
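Steps c) through f) on the lagging strand can be mimicked with simple string operations. In this sketch RNA is shown in lowercase and DNA in uppercase; the primer length, the fragment length, the function names, and the use of the finished new-strand sequence as input are all simplifying assumptions.

```python
from typing import List

PRIMER_LEN = 3  # hypothetical primer length, chosen only for illustration
DNA_TO_RNA = str.maketrans("ACGT", "acgu")  # lowercase marks RNA


def make_okazaki_fragments(strand: str, fragment_len: int) -> List[str]:
    """Steps c-d: each fragment is an RNA primer extended with DNA."""
    fragments = []
    for start in range(0, len(strand), fragment_len):
        piece = strand[start:start + fragment_len]
        primer = piece[:PRIMER_LEN].translate(DNA_TO_RNA)
        fragments.append(primer + piece[PRIMER_LEN:])
    return fragments


def remove_primers(fragments: List[str]) -> List[str]:
    """Step e: RNA nucleotides are degraded, leaving gaps ('-')."""
    return ["".join("-" if b.islower() else b for b in f) for f in fragments]


def fill_gaps(fragments: List[str], strand: str, fragment_len: int) -> List[str]:
    """Step f: DNA polymerase I replaces each gap with DNA."""
    filled = []
    for index, fragment in enumerate(fragments):
        start = index * fragment_len
        filled.append("".join(
            strand[start + i] if b == "-" else b
            for i, b in enumerate(fragment)
        ))
    return filled


def ligate(fragments: List[str]) -> str:
    """Step f, continued: DNA ligase seals the nicks between fragments."""
    return "".join(fragments)


new_strand = "ATGGCATTCGGA"  # expected lagging strand, written 5'->3'
primed = make_okazaki_fragments(new_strand, fragment_len=6)
gapped = remove_primers(primed)
sealed = ligate(fill_gaps(gapped, new_strand, fragment_len=6))
```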
3.3 Replication fork: the factory of DNA replication
The replication fork is the junction between the double
stranded DNA and the newly separated single stranded DNA.
Typically, there are two replication forks, one on each side of
ori, the origin of replication (the point where DNA starts
unwinding for replication). The two replication forks move
away from each other until replication is completed.
Functionally, the replication fork serves as a factory
containing numerous proteins to facilitate the DNA replication
process. According to the individual replication steps
described above, however, the two core enzymes of DNA
polymerase III (one for synthesizing the leading strand and
the other the lagging strand) move in opposite directions. How
can the two core enzymes stay together in the same protein
factory? The lagging strand template must first move forward
and pass through the replication protein complex (factory) in
one direction (from right to left) and then retract backward in
the opposite direction (from left to right) during the lagging
strand synthesis so that both the leading and the lagging strand
syntheses appear in the same direction (Figs. 11a-d).
3.4 Prokaryotic and eukaryotic DNA replications
The typical prokaryotic DNA is circular, and replication
starts at Ori. The general mode of DNA replication is shown
in Fig. 12.
The typical eukaryotic DNA is linear. During replication,
the entire chromosomal DNA molecule may be divided into
many segments, each with an Ori. In the early S phase of the
cell cycle, DNA replication starts at each Ori and extends
laterally until the replicated DNA duplexes meet and join
(Fig. 13).
Fig. 12. Prokaryotic DNA replication: a) a circular DNA
duplex, b) DNA replication starting at the top with two
replication forks moving away from each other, c) completion
of replication with two daughter DNA duplexes.
Fig. 11. a) Two core enzymes of DNA polymerase III are
joined by a β-clamp loader with a β-clamp (left); also shown
is a DNA duplex with a leading strand template (upper, dark
red) and a lagging strand template (lower, light red) which is
being encircled by the helicase for separating the duplex, b)
While the leading strand is being synthesized as the leading
strand template travels through the core DNA polymerase III
with a β-clamp, the lagging strand template is curved and an
RNA primer is made by the RNA primase, c) The lagging
strand template has entered and passed through the core
enzyme of DNA polymerase III and is about to be locked by
the β-clamp, d) Lagging strand synthesis occurs as its
template moves backwards (from left to right).
Fig. 13. Eukaryotic DNA replication: a) chromosomal DNA
synthesis at three origins of replication (Ori), b) completion
of DNA synthesis (only one daughter DNA duplex is shown).
3.5 Telomere and telomerase
Telomeres are specialized structures located at both ends
of eukaryotic chromosomes; they protect and stabilize the
chromosomes. Since a telomere is at the end of a chromosome,
it contains the 5’ end of one strand and the 3’ end of the
other. During DNA replication,
lagging strand synthesis requires periodic syntheses of primers
ahead of DNA elongation. Once the last RNA primer, which
is at the 5’ end of the lagging strand, is removed, there is no
way that it can be replaced with a new DNA segment.
Consequently, eukaryotic DNA gets shorter and shorter after
each replication until, eventually, the essential DNA coding
sequence near the telomere is affected. In other words,
eukaryotic chromosomes become shorter after every cell
division until the cell dies.
The enzyme telomerase can elongate eukaryotic DNA at
the 3’ end so that it can serve as a template for synthesizing a
new strand, replacing the lost segment due to the removal of
the RNA primer. In normal cells the telomerase activity is
relatively low. However, in cancer cells the telomerase
activity is high, so that cancer cells may divide and live
indefinitely; and chromosomes in cancer cells are not
shortened after each cell division. How can telomerase
accomplish this task? It turns out that telomeric DNA
contains tandem repetitive units at the chromosome ends. In
humans, for example, there are tandem repeats of
5’TTAGGG3’ totaling 10 to 15 kb long. Telomerase is an
RNA-protein complex whose RNA contains the sequence
3’AAUCCC5’, which acts as a template for the 5’TTAGGG3’
repeat. When telomerase binds to the terminal 3’ end of the
telomeric DNA, only part of the telomeric RNA is paired with
the telomeric DNA; the part near the 5’ end of the RNA
remains as a free single stranded end, serving as a template for
telomeric DNA synthesis. The enzyme then moves to expose
the RNA sequence at the 5’ end as a free template for another
round of telomeric DNA elongation. As the process
continues, the 3’ end of the telomeric DNA is lengthened and
serves as a template during the next round of DNA replication,
recovering the previously shortened DNA (Fig. 14).
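The template relationship between the telomerase RNA and the telomeric repeat can be checked with a short sketch. It adds one whole repeat per round and does not model the partial pairing or the sliding of the enzyme; the function names and the example 3’ end are assumptions.

```python
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}

# Telomerase RNA template written 3'->5', as in the text.
TEMPLATE_RNA_3_TO_5 = "AAUCCC"


def repeat_from_template(rna_3_to_5: str) -> str:
    """Return the DNA copied from the RNA template, written 5'->3'.

    An RNA written 3'->5' pairs position by position with a DNA strand
    written 5'->3', so no reversal is needed here.
    """
    return "".join(RNA_TO_DNA[base] for base in rna_3_to_5)


def extend_telomere(dna_3_end: str, rounds: int) -> str:
    """Append one telomeric repeat to the 3' end per round of elongation."""
    return dna_3_end + repeat_from_template(TEMPLATE_RNA_3_TO_5) * rounds


repeat = repeat_from_template(TEMPLATE_RNA_3_TO_5)
extended = extend_telomere("GGTTAGGG", rounds=2)
```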
Fig. 14. Sliding movement of telomerase (a, b, c) to create a
free 5’ end of the telomeric RNA template for elongating the
3’ end of telomeric DNA to replace the lost 5’ end telomeric
DNA after the next round of DNA replication.
4 Conclusions
DNA replication is an extremely complicated process. It
requires coordination of many enzymes to assure the fidelity
of replication. Missteps in DNA replication lead to a variety
of diseases including cancers (17). This important subject is
generally taught as a unit right after the introduction to DNA
structure and again as an integral part of another unit on
mutation in most biology textbooks. Students are usually
excited about the subject at the beginning, since DNA is not
only the fundamental molecule of life but also related to
diseases. After encountering the intricacies of the replication
process, however, many find the subject difficult to grasp and
become disenchanted with biology, an unfortunate situation
that needs to be improved.
This computer program is written with the hope that
teaching and learning DNA replication becomes easy and
interesting, motivating beginners and also serving as a basis
for a variety of topics in genetics. This module has recently
been tested in a Genetics course at Purdue University Calumet,
along with other genetics modules. The initial feedback was
positive. The ultimate goal of the project is to complete a
whole series of interactive computer modules for learning
genetics.
5 References
[1] Griffiths, A. J. F., S. R. Wessler, S. B. Carroll and J.
Doebley (2012) Introduction to Genetic Analysis, 10th ed., W.
H. Freeman and Co., New York, NY.
[2] Pursell, Z. F., I. Isoz, E-B. Lundström, E. Johansson
and T. A. Kunkel (2007) “Yeast DNA Polymerase ε
Participates in Leading-Strand DNA Replication” Science
317 (5834): 127-130.
[3] Tibell, L. A. E. and C. J. Rundgren (2010) “Educational
challenges of molecular life science: characteristics and
implications for education and research” CBE - Life Sci.
Educ. 9: 25-33.
[4] Huang, P. C. (2000) “The integrative nature of
biochemistry: challenges of biochemical education in the
USA” Biol. Educ. 28: 14-17.
[5] Bahar, M., A. H. Johnstone and M. H. Hansell (1999)
“Revisiting learning difficulties in biology” J. Biol. Educ. 33:
84-86.
[6] Biggs, J. (1996) “Enhancing teaching through
constructive alignment” Higher Education 32: 347-364.
[7] Sheley, S. M. and T. R. Mertens (1990) “A Survey of
Introductory College Genetics Courses” J. Heredity 81:
153-156.
[8] Essential Biochemistry - DNA Replication.
www.wiley.com/college/pratt/.../animations/dna_replication/index.ht
[9] DNA Replication Process - YouTube.
www.youtube.com/watch?v=teV62zrm2P0
[10] DNA makes DNA - Cell Biology Animation.
www.johnkyrk.com/DNAreplication.html
[11] Yang X., G. Rong, C. Tseng (2011) “Modeling of DNA
Replication” The 2011 International Conference on
Modeling, Simulation and Visualization Methods, p. 146-149,
Las Vegas.
[12] Yang, X., D. Wen, Y. Cui, X. Cao, J. Lacny and C.
Tseng (2009) “Computer Based Karyotyping” The 3rd
International Conference on Digital Society (ICDS 2009),
p. 310-315, Cancun, Mexico.
[13] Wu W., X. Yang, C. Tseng (2011) “Effective
Algorithms for Altering Human Chromosome Shapes” The
2011 International Conference on Modeling, Simulation and
Visualization Methods, p. 257-261, Las Vegas.
[14] Yang X., R. Ge, Y. Yang, H. Shen, Y. L and C. Tseng
(2009) “Interactive Computer Program for Learning the
Genetic Principles of Segregation and Independent
Assortment through Meiosis” The 31st Annual International
Conference of the IEEE Engineering in Medicine and
Biology Society (EMBC 2009), p. 5842-5845, Minneapolis.
[15] Wu W., X. Yang, B. Chen, Z. Zhao, J. Lacny and C.
Tseng (2009) “Computer Based Simulation of Chromosome
Abnormality” The 2009 World Congress in Computer
Science, Computer Engineering and Applied Computing
(WORLDCOMP 2009), p. 359-363, Las Vegas.
[16] Inquiry Based Learning.
www.thirteen.org/edonline/concept2class/inquiry/
[17] Helleday, T., E. Petermann, C. Lundin, B.
Hodgson and R. A. Sharma (2008) “DNA repair pathways as
targets for cancer therapy” Nature Reviews Cancer 8:
193-204.