Replication fork dynamics and dynamic mutations

Opinion
TRENDS in Genetics Vol.21 No.5 May 2005
Replication fork dynamics and dynamic
mutations: the fork-shift model of
repeat instability
John D. Cleary and Christopher E. Pearson
Department of Molecular and Medical Genetics, University of Toronto and Program of Genetics and Genomic Biology,
The Hospital for Sick Children, 555 University Avenue, Elm Wing 11-135, Toronto, Ontario, Canada M5G 1X8
Gene-specific repeat instability is responsible for >36
human diseases. Active instability varies in a tissue-,
developmental stage- and locus-specific manner and
occurs in both proliferative and non-proliferative cells. In
proliferative cells, DNA replication can contribute to
repeat instability either by switching the direction of
replication, which changes the repeat sequence that
serves as the lagging-strand template (origin switching),
or by shifting the location of the origin of replication
without altering the replication direction (origin shifting). We propose that changes in the dynamics of
replication-fork progression, or architecture, will alter
the location of the repeat within the single-stranded
lagging-strand template, thereby influencing instability
(fork shifting). The fork-shift model, which does not
require origin relocation, is influenced by cis-elements
and trans-factors associated with driving and maintaining replication forks. The fork-shift model can explain
some of the complex behaviours of repeat instability
because it is dynamic and responsive to variations in
epigenomic and locus activity.
Introduction
Throughout the human genome, the number of tandem
repeats at any locus often varies between individuals, with
these ‘natural’ variations rarely having negative consequences. One exception is a subclass of microsatellites,
including certain trinucleotide repeats (TNRs), tetranucleotide, pentanucleotide and dodecanucleotide repeats,
which have been associated with at least 36 human diseases
[1]. Repeat-associated disorders include Huntington disease
(HD), myotonic dystrophy types 1 and 2 (DM1 and DM2),
fragile X syndrome (FRAXA), spinal bulbar muscular
atrophy (SBMA), Friedreich’s ataxia (FRDA) and a series
of spinocerebellar ataxias (SCA1–3, 6, 7, 8, 10, 12 and 17).
For these disorders, instability (see Glossary) is associated
with age-of-onset, disease severity and, possibly, disease
progression [2,3] (Figure 1). Changes in repeat length
(instability) probably arise from the propensity of these
repetitive sequences at these loci to form unusual DNA
structures or to promote DNA slippage during DNA
Corresponding author: Pearson, C.E. ([email protected]).
Available online 24 March 2005
metabolism (replication, repair and/or recombination).
Repeat expansions are characterized as ‘dynamic
mutations’ [4] because the expanded product of a mutation
event has an increased propensity to undergo further
mutations. In contrast to the global repeat instability
observed with certain cancers [5], the instability observed
for repeat-associated disorders is limited to the disease
locus, suggesting a complex mutation process(es).
The underlying mechanism(s) causing gene-specific
repeat instability has been hotly debated, with DNA replication, repair [6] and/or recombination all championed as
potential contributors [7]. In somatic or germline tissues,
instability is likely to arise from multiple processes that
contribute individually or in combination, depending on the
tissue and/or developmental stage. Instability can therefore
be associated with genome-duplication in proliferating
cells, such as blood cells [8], or with genome-maintenance,
such as in terminally differentiated neurons [9]. Recent
Glossary
Genetic anticipation: an increase in disease severity and a decrease in age-of-onset as the mutation is transmitted from one generation to the next, caused by
the dynamic nature of the mutation event.
Strand slippage: the process by which two complementary repeat strands
move or slip relative to each other and, as a result, are mis-paired in an out-of-register fashion, producing an excess of repeats on either or both strands. This
process can either occur spontaneously or be facilitated by a protein such as a
polymerase or repair protein.
Heterogeneity: the observation of different repeat lengths within or between
tissues of an individual. Although heterogeneity is a clear indication that repeat
instability has occurred at some point, it is not always coincident with ongoing
instability.
Instability: the active process of changes in the number of units within a
repetitive sequence.
Okazaki initiation zone (OIZ): in primates, and probably other metazoans, this
is a ~290-nt region of single-stranded DNA template on the lagging strand on
which the selection of Okazaki priming-sites occurs.
Polymerase switching: during DNA replication, the switch from the distributive
polymerase-α to the processive polymerase-δ or ε that occurs at ~30–40 nt after
RNA initiation.
Leading and lagging strand coordination: owing to the necessity for the 5′-to-3′
direction of polymerase synthesis, the co-directional synthesis of leading and
lagging strands is thought to occur through a trombone looping of the lagging
strand, with its synthesis coordinated with the leading strand (Figure 2). This
coordination permits both leading and lagging strands to grow at the same
speed.
Imbalanced synthesis: a difference in the rate of synthesis between the leading
and lagging strands of a replication fork. Imbalanced synthesis can be induced
chemically by various replication inhibitors (e.g. emetine [10,50] and aphidicolin
[10,67]), or possibly by strand asymmetries as the result of Okazaki initiation-site preferences.
www.sciencedirect.com 0168-9525/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.tig.2005.03.008
[Figure 1 graphic. (a) Instability and disease progression plotted against repeat length (non-affected, intermediate, affected). (b) Schematics of (i) the origin-switch model, (ii) the origin-shift model and (iii) the fork-shift model, showing the origin of replication (ori), the Okazaki initiation zone (OIZ) and short (CNG)n or expanded (CNG)N repeat tracts on non-affected (stable) and affected (repeat instability) chromosomes.]
Figure 1. Dynamic mutations and repeat instability. (a) The level of genetic instability and the degree of disease progression increase with increasing repeat-tract length. In the
general population, the repeat tract is relatively short, polymorphic and rarely displays instability (green area). Generally, the genetic stability threshold length
(green to yellow transition) at which repeat instability can occur is ≥25–34 repeats, but this length varies between disease loci. Lengths in this range are not typically
associated with disease. Above this threshold, the repeat becomes highly unstable (yellow to red) in a ‘dynamic mutation’ process, in which expanded products are more
susceptible to further expansion events. The dynamic nature of repeat mutations is responsible for the phenomenon of genetic anticipation – the decrease in age-of-onset
and increase in disease severity in subsequent generations. Repeat instability is closely connected with DNA replication in proliferating cells, with several models of
instability involving relocation of the origin of replication around the repeat tract [59]. In this article, we propose an alternative model for repeat instability, one that is dynamic
and responsive to epigenetic elements. Replication might trigger instability in normal alleles close to the stability threshold length [32] or contribute to ongoing differences in
instability between expanded alleles. To accommodate both situations, the terms ‘non-affected’ and ‘affected’ chromosome are therefore used in the figure and in the main
text. In all models, depending on the repeat sequence, larger repeat tracts allow for a greater chance of Okazaki fragment initiation and/or processing within the repeat tract,
increasing the likelihood for formation of mutagenic intermediates or DNA slippage, thereby increasing instability (Box 1). (b)(i) Origin-switch model. The direction of
replication, which dictates the repeat strand that will serve as the lagging-strand template, is different between non-affected (upper) and affected (lower) chromosomes. The
direction of replication is dictated by the location of the origin of replication (ori in blue). Instability within the repeat tract on the affected chromosome probably results from
the unusual DNA structure that is formed by the repeat with the lagging-strand template or nascent DNA, yielding deletions or expansion, respectively. (ii) Origin-shift model.
The location of the origin of replication on one side of the repeat tract will affect repeat instability, probably because of the position of the repeat within the OIZ. On the
affected chromosome, the origin of replication is located either closer to (illustrated) or further away from (not illustrated) the repeat tract than on the non-affected chromosome,
thereby producing instability. This model might only be predictive at distances where the OIZ can be definitively determined from the location of the origin of replication
(~300 nt). Further away from the repeat tract, the influence of the genome and the epigenomic environment on the OIZ becomes more complex. (iii) Fork-shift model. The
dynamics of replication-fork progression through the repeat tract, regardless of the location of the origin of replication, is the key component of this model. Any epigenetic
event, sequence or cellular process (indicated by X in purple) that occurs within the path of the replication fork can alter the dynamics of fork progression. Altered replication-fork progression might shift the location of the OIZ relative to the repeat tract, thereby permitting the formation of a mutagenic DNA structure and ultimately leading to repeat
instability. As in the previous two models, the formation of unusual DNA structures within the leading or lagging strands (nascent or template) and the fidelity with which
these structures are processed by replication and/or repair proteins will contribute to repeat instability.
evidence from patient cells [10] and yeast [11,12]
supports the initiation of CTG/CAG mutations during
replication in proliferating cells, which are subsequently
acted on by an S-phase-specific repair process to produce
length alterations. This article focuses on the contribution
of DNA replication to repeat instability in proliferative
tissues, keeping in mind that repair and/or recombination
processes are also likely to be involved.
Repeat instability
The connection between DNA replication and repeat
instability is most evident in actively proliferating cells.
In the peripheral blood cells of DM1 patients, CTG repeats
expand continually [8], whereas the extremely large
expansions in their muscle cells (which are up to 4000
repeats larger than those in blood cells) appear to have
occurred during a period of post-natal muscle growth
[13,14]. Repair might also contribute to instability in the
germline – recent evidence from HD and DM1 patients
shows that expansions occur before meiosis in the male
[15] and female germline [16]. Such expansions might
occur during DNA replication in proliferating spermatogonia or oogonia [17] or through genome maintenance
during one of the extended periods of cellular arrest.
A proliferation-associated mechanism is supported by
recent data demonstrating ongoing pre-meiotic CTG
expansions in the male germline of transgenic mice
ranging from seven weeks to 11 months old [18]. For some
disorders, such as FRAXA [19] and the congenital form of
DM1 [20], somatic instability occurs in the period of rapid
cell proliferation during early fetal growth. However, fetal
instability is not evident in all diseases, such as HD [21]
and SBMA [22]. Importantly, proliferation was shown to
be a requirement for spontaneous repeat expansions in
cultured DM1 patient cells [10]. Moreover, compounds
that altered the dynamics of replication-fork progression
dramatically enhanced the magnitude of the expansions
in these cells. Therefore, in both somatic and germline
tissues, DNA replication-associated processes can contribute to repeat instability in proliferating cells.
The presence of repeat-length heterogeneity, regardless
of proliferation status, is not necessarily an indication of ongoing
instability. Although the observation of tract-length
heterogeneity within a tissue, or between tissues, of a
patient is a clear indication that post-zygotic somatic
instability has occurred, this heterogeneity might have
arisen during the cell divisions required to reach the
mature tissue. Active repeat instability occurs in the
terminally differentiated neurons of the brain [23–26,9],
probably as a result of genome maintenance (DNA repair).
However, within the same tissue, glial cells that can divide
post-natally also show repeat instability [26,9]. Therefore,
the contribution of replication to repeat instability
probably ranges from non-participant (non-proliferative
state) to major determinant (proliferative state), depending on the locus and tissue.
The genomic context of the repeat is a strong contributing factor in repeat instability. The same repeat sequence
(CTG/CAG) of similar tract length, at different disease
loci, can display remarkably different levels of instability.
For example, both DM1 (CTG) and SCA7 (CAG) repeats
are highly unstable compared with the SBMA (CAG)
repeat. In many of these diseases, specific chromosomal
backgrounds (some extending tens of kilobases) are associated with the expanded repeat tracts, suggesting that
cis-elements flanking the repeat actively drive the
instability. This association has been supported experimentally in transgenic mice, where the insertion site,
amount and nature of flanking human sequence strongly
affect repeat instability [27–29]. Transgenic mice that
have large amounts of human genomic sequence flanking
the repeats typically exhibit expansion patterns similar to
Box 1. Unusual DNA structures
Disease-associated repeats can form various unusual DNA structures
that are dependent on repeat sequence, tract length and tract purity
(Figure I). Although some repeats form unwound, triple-, quadruple-stranded and sticky DNA structures, the majority of repeats can form
intra-strand hairpins [63,64]. Intra-strand hairpins are key components
of the more complex slipped-strand structures – the long-hypothesised
mutagenic intermediate of repeat tract instability. Slipped-DNAs are
formed by shifting and out-of-register mis-pairing of two complementary repeat strands. There are two forms of slipped DNA: (i) slipped-homoduplex DNA (S-DNA; Figure Ia) formed between two strands that
have the same number of repeats; and (ii) slipped-intermediate DNA
(SI-DNA; Figure Ib) formed between strands that contain different
numbers of repeats. Both S-DNAs and SI-DNAs can form in replicating
or non-replicating DNA. For CTG/CAG repeats, S-DNAs are composed
mostly of short slip-outs of limited size (one-to-ten repeats), whereas
SI-DNAs are composed of unique slip-outs located at specific
locations, of a size equal to the difference between two strands. In
addition, unlike S-DNAs, the sequence of the SI-DNA slip-outs affects
the type of structure formed. Slip-outs of CAG repeats assume a
mixture of single-stranded random-coil structures and intra-strand
hairpins, whereas CTG slip-outs assume only unique intra-strand
hairpins. A similar difference in structure formation can occur for
CCTG/CCAG repeats associated with myotonic dystrophy type 2 [45], in
addition to CGG/CCG and GAA/TTC repeats associated with FRAXA
and FRDA, respectively [63]. The formation, recognition and processing of different types of structures can occur during numerous
biological processes, including replication, transcription, recombination, DNA metabolism and DNA ‘breathing’. The distinct biophysical
features of the unusual DNA are crucial to determine if, and to what
degree, these structures are recognized and processed by replication,
repair or recombination proteins.
Figure I. DNA structures formed by the mis-pairing of complementary repeat strands. (a) Slipped-homoduplex DNA (S-DNA). (b) Slipped-intermediate DNA (SI-DNA), with an excess of CTG or an excess of CAG repeats.
[Figure 2 graphic. Panels: (a) 5′ to 3′ DNA replication; (b) nested discontinuity; (c) imbalanced synthesis, with a single-stranded region of ~690 bp to ≥1400 nt; (d) protein–protein and protein–DNA interactions, with nucleosome-free regions of ~225 bp and ~285 bp and unprocessed Okazaki fragments occurring in <20% of nucleosomes.]
Figure 2. Dynamics of the replication fork. (a) Replication proceeds in the 5′-to-3′ direction, which permits the continual synthesis of the leading strand and requires the lagging
strand to be synthesised in a discontinuous manner. Lagging-strand synthesis occurs after a 290-nt stretch of single-stranded template, called the Okazaki initiation zone
(OIZ), is exposed and is accomplished through the production of a series of Okazaki fragments (typically 135–145 nt). Each Okazaki fragment is initiated as an RNA primer
(7–11 nt), extended as DNA for 34 nt by the distributive polymerase-α–primase complex, which is subsequently exchanged for the processive polymerase-δ. Synthesis is completed
when the Okazaki fragment reaches the RNA primer of the previous Okazaki fragment, which is subsequently removed, filled-in and sealed by the combined actions of FEN1,
DNA2, RNase H, polymerase-δ and DNA ligase I. It is thought that ≥80% of Okazaki fragments are processed before rechromatinization [60], although some processing can
occur in the nucleosome [61]. Additional polymerases, including polymerase-ε, might be involved in replication-fork dynamics, although their role is currently unclear.
(b) ‘Nested discontinuity’ is an alternative model for Okazaki-fragment maturation and involves the ligation of short Okazaki primers to form the larger continuous lagging
strand [49]. The sequence of the lagging-strand template might determine whether a ‘nested discontinuity’ method of Okazaki initiation and maturation occurs, a process that
might be particularly sensitive to repetitive sequences. The formation of multiple small Okazaki fragments can decrease the chance of hairpin formation in the template
strand while subsequently increasing the likelihood of small hairpin formation during Okazaki processing in the nascent strand. (c) The rates of leading and lagging strand
synthesis must be coordinated, because imbalanced synthesis could lead to catastrophic events, including the accumulation of large stretches of single-stranded DNA on
either the lagging (illustrated) or leading strand (not illustrated) while synthesis on the other strand proceeds. The presence of excess amounts of single-stranded DNA might
enable the formation of mutagenic intermediates that can contribute to repeat instability [10]. The replication forks in (b) and (c) are not shown in full for simplicity.
(d) Replication forks contain numerous protein–protein and protein–DNA interactions that maintain replication-fork dynamics, which, in addition to architectural and
synthetic restraints, involves traversing chromatinized DNAs (for the sake of clarity, not all known interactions are illustrated). The lagging strand is looped to coordinate the
rate of continuous leading and discontinuous lagging strand synthesis, enabling the two processes to be coupled and proceed in the same direction, in a ‘trombone’ model
originally proposed by Alberts [62]. In order for replication to proceed, the nucleosome packaging of DNA (146 bp DNA in the core histone package) must be removed at the
replication fork (~225 and ~285 bp on the leading and lagging strands, respectively) and replaced following fork progression [54]. Chromatin structure and other epigenetic
modifications, such as CpG methylation, must also be reconstituted following replication-fork passage. The two nucleosomes in front of the replication fork can be
destabilized by the removal of linker histone H1. Following replication-fork passage, nucleosome maturation occurs at a certain distance (not shown to scale) behind the
advancing replication fork and can involve incorporation of linker histone (illustrated), refolding of the chromatin fiber and histone deacetylation (not illustrated). The
formation of unusual DNA structures within the repeat tract might interfere with the numerous interactions occurring at the replication fork, producing aberrant processing
and ultimately leading to repeat instability. Abbreviations: RPA, replication protein A; RFC, replication factor C.
[Figure 3 graphic. (a) Okazaki initiation-site classes marked on (CTG)3, (CAG)3, (CGG)3, (CCG)3, (CCTG)3, (CAGG)3, (GAA)3 and (TTC)3: preferred sites, major (3′-PuT-5′) and minor (3′-PuC-5′); excluded sites (3′-PyPuPu-5′); and infrequent sites (3′-PuPyPu-5′ and 3′-PuPuPy-5′). (b) Maps of the DM1, FRAXA, HD, SCA1, SBMA, SCA7, DM2 and FRDA repeat loci with flanking sequence and direction of transcription, marking major preferred sites (3′-PuT-5′) and excluded sites (3′-PyPuPu-5′) of Okazaki initiation and sites of mammalian methylation (5′-CpG-3′); scale bar, 20 nt.]
those observed in humans [28,29]. DNA replication and its
regulation are also sensitive to genomic context,
suggesting that it contributes to the context-driven
instability that occurs in proliferating tissues.
Origin-switch model
In bacterial [30], yeast [31] and primate replication models
[32], repeat instability is altered simply by ‘switching’ the
direction of replication through the repeat (origin switch;
Figure 1). Repeat instability is often attributed to either
repeat-induced strand slippage or the formation of repeat-specific unusual DNA structures (Box 1) within the single-stranded region of the lagging strand, termed the Okazaki
initiation zone (OIZ; Figure 2). With respect to repeat
instability, the potential outcomes of the origin-switch
model (expansion or deletions) depend on the template for
lagging-strand synthesis (CAG or CTG) and the location of
the slip-out strand (nascent or template strand). The
involvement of the lagging strand in general, and Okazaki
processing specifically, is supported by the similarity
between the trinucleotide-repeat stability threshold
length (~35–40 repeats; 105–120 bp) and Okazaki fragment length (~135–145 bp) [33]. The altered instability of
repeat tracts in yeast with mutant flap endonuclease 1
(FEN1 or rad27) [34,35], and the poor ability of FEN1 to
recognize and process CTG or CAG structures also
supports the origin-switch model [36]. In addition,
replication-fork stalling at the repeat tract depends, in
part, on the orientation of the repeat tract relative to the
replication origin [37,38], adding further support for an
origin-switch model. Extended to humans, this model
suggests that the origin of replication on the affected
chromosome might lie on the opposite side of the repeat
tract relative to the non-affected chromosome (Figure 1).
Furthermore, the origin-switch model suggests that, in
affected individuals, chromosomes in proliferative tissues
will be replicated in the opposite direction, depending on
their instability status.
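The length comparison cited above (threshold tract length versus Okazaki fragment length) is simple arithmetic and can be checked directly. The following is a minimal sketch and not part of the original analysis: the function and constant names are illustrative assumptions, and the numeric ranges are the approximate values quoted in the text.

```python
# Illustrative arithmetic only. Function and constant names are hypothetical;
# the numeric ranges are the approximate values quoted in the text.
THRESHOLD_REPEATS = (35, 40)        # trinucleotide stability threshold, in repeats
OKAZAKI_FRAGMENT_BP = (135, 145)    # typical Okazaki fragment length, in bp


def tract_length_bp(n_repeats: int, unit_length: int = 3) -> int:
    """Return the length in bp of a tract of n_repeats units of unit_length bp."""
    if n_repeats < 0 or unit_length <= 0:
        raise ValueError("n_repeats must be >= 0 and unit_length must be > 0")
    return n_repeats * unit_length


threshold_bp = tuple(tract_length_bp(n) for n in THRESHOLD_REPEATS)
print(threshold_bp)  # (105, 120)

# Fraction of one Okazaki fragment that a threshold-length tract could occupy,
# taken across the extremes of both ranges.
fractions = [t / f for t in threshold_bp for f in OKAZAKI_FRAGMENT_BP]
print(f"{min(fractions):.2f} to {max(fractions):.2f}")
```

Under these assumed ranges, a tract at the stability threshold would span most of a single Okazaki fragment, which is the similarity the text refers to.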
In metazoans, origin usage, which in turn dictates the
replication direction, is influenced by many different
factors including sequence motifs, nuclear organization,
chromatin structure, DNA methylation, transcription, the
availability of initiation proteins and nucleotide pool
levels [39–41]. Metazoan origins of replication, which
lack an identifiable consensus sequence, frequently have
common elements such as AT-rich DNA, asymmetric
purine and/or pyrimidine tracts, CpG islands, DNA
flexibility and negative supercoiling [41]. The interplay
between these factors and elements is probably responsible for the differences in origin usage that are observed
between different tissues and developmental stages [42].
Interestingly, repeat instability is frequently [19,20], but
not always [21,22], coincident with early post-zygotic
development, a period associated with numerous seemingly unregulated replication-initiation events permitting
rapid cell division [41]. Mapping metazoan origins is
technically demanding, hence only a few have been
identified and characterized [42]. Preliminary evidence
from the arrangement of replication-initiation sites that
were mapped to several trinucleotide-repeat disease loci
[43] appears to dispute the origin-switch model. In this
study, the location of the initiation events did not correlate
with the extensive instability of these loci in affected
individuals; however, the patient cells used did not display
active repeat instability. Although some evidence supports
an origin-switch model, it would appear that a more
complex relationship exists between DNA replication and
repeat instability in humans.
Origin-shift model
In a primate replication model of repeat instability [32],
‘shifting’ the location of the replication origin while
maintaining the direction of replication altered the repeat
instability drastically (origin shift; Figure 1). In close
proximity (i.e. within 300 bp) to the repeat tract, small
shifts (±130 bp) in the placement of the replication origin
altered the nature of the instability (expansion bias versus
deletion bias versus stable). This observation was first
reported for CTG/CAG repeats [32] and, recently, in the
CGG/CCG repeats involved in FRAXA [44], in which CpG
methylation also stabilized the repeat tract. A similar
effect of origin placement might also occur for the DM2
tetranucleotide repeats CCTG/CAGG [45]. Instability in
the origin-shift model is related to position and length of
the repeat tract in relation to the lagging strand, which in
part dictates the location of Okazaki initiation and
processing. This model predicts that, in patients, the
location of a replication origin is shifted on affected
relative to non-affected chromosomes (origin shift). Similarly, in affected individuals, chromosomes in proliferative
tissues will be replicated from different origin positions,
depending on their instability status. However, in the
primate replication model, complex patterns of instability
were produced from origins located further (>300 bp)
from the repeat tract [32], where the determination of
Figure 3. The sequence of the repeat and flanking regions can alter Okazaki initiations. Although replication direction will determine which strand serves as the lagging strand
template, the selection of Okazaki initiation or priming sites will determine the location of Okazaki processing and the likelihood of repeat instability. (a) Okazaki initiation can
exhibit some preference in initiation-site selection, with favoured and disfavoured sites [48]. The frequency of these sites varies between the complementary strands of a
particular repeat sequence and between different repeat sequences. Such variations might determine the frequency and type of Okazaki initiations occurring within the repeat
(classical, nested or excluded). Favoured (green nucleotides indicate major or minor) and disfavoured (red nucleotides indicate excluded or infrequent) Okazaki initiation sites
are listed separately for CTG/CAG, CGG/CCG, GAA/TTC and CCTG/CAGG repeats. Owing to the architecture of the replication fork (Figure 2), Okazaki selection occurs in
the 3′-to-5′ direction on the template, whereas nascent strand synthesis proceeds in the 5′-to-3′ direction. (b) The preference for Okazaki initiation within flanking sequence.
Variation in Okazaki initiation sites within flanking sequences can also affect Okazaki initiation-site selection and fork progression into and out of the repeat tract, and hence
repeat instability. Okazaki initiation occurs on the lagging strand, which differs depending on the direction of replication (see fork schematic, where the lagging strand is
shown in black from left to right, and the lagging strand is shown in blue from right to left). CpG methylation, which is associated with altered chromatinization and repeat
instability [50–52], might also affect fork progression. Variations in the density of CpG sites are evident between the repeat and flanking sequence. The direction of
transcription relative to replication direction might affect repeat instability. The major preferred (green triangles) and excluded sites (red bars) of Okazaki initiation in addition
to the potential sites for methylation (black circle) are indicated for eight disease loci and ~150 nt of flanking sequence. The sequences were obtained from the following
GenBank accession numbers: L00727 (DM1), L29074 (FRAXA), Z49154 (HD), AL009031 (SCA1), NC_000023 (SBMA), AF020276 (SCA7), AY329622 (DM2) and NC_000009
(FRDA) and were modified so that the repeat tracts within each sequence are approximately the same length [i.e. (CNG)21 or (CCNG)15].
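The site classes described in the Figure 3 legend can be tallied for any template sequence. The sketch below is an illustration under stated assumptions, not the procedure used to prepare the figure: it scores only the two motifs named in the figure key, the major preferred site (3′-PuT-5′) and the excluded site (3′-PyPuPu-5′), and because sequences are written 5′-to-3′ each motif is reversed before matching. The function name `count_sites` and the pattern names are hypothetical.

```python
import re

# Hypothetical helper names. Motifs in the figure key are written 3'-to-5';
# input sequences are written 5'-to-3', so each motif is reversed here.
# Zero-width lookaheads let overlapping sites be counted.
MAJOR_PREFERRED = re.compile(r"(?=T[AG])")     # 3'-PuT-5'    read as 5'-T Pu-3'
EXCLUDED = re.compile(r"(?=[AG][AG][CT])")     # 3'-PyPuPu-5' read as 5'-Pu Pu Py-3'


def count_sites(template_5to3: str) -> dict:
    """Count major preferred and excluded motifs in a template written 5'-to-3'."""
    seq = template_5to3.upper()
    return {
        "major_preferred": len(MAJOR_PREFERRED.findall(seq)),
        "excluded": len(EXCLUDED.findall(seq)),
    }


for unit in ("CTG", "CAG"):
    print(unit, count_sites(unit * 3))
# CTG {'major_preferred': 3, 'excluded': 0}
# CAG {'major_preferred': 0, 'excluded': 2}
```

With these assumptions, the (CTG)3 template carries only preferred sites and the (CAG)3 template only excluded sites, in line with the strand asymmetry described for these repeats.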
Box 2. Outstanding questions
• Where are the origins of replication located relative to repeat
tracts in affected versus non-affected chromosomes for multiple
repeat-associated diseases?
• What is the identity of the cis-elements and cellular processes that
are responsible for influencing instability at each of the disease loci?
• What structural, sequence or epigenetic elements determine
tissue-specific and/or germline instability?
• How are the DNA mutagenic intermediates formed at trinucleotide-repeat tracts and how are they subsequently handled by cellular
repair processes?
• Given that mismatch repair appears to be required for expansions
in mouse models of repeat instability [3,18,65,66], what role do
repair pathways have in proliferative (replication-associated) versus
non-proliferative (replication-independent) tissues?
Okazaki initiation sites becomes more difficult, suggesting
that a more adaptive model is required.
Fork-shift model and fork dynamics
The complex pattern of repeat instability between disease
loci and/or between tissues of the same patient has not
been adequately explained by the origin-shift or origin-switch models. Hence, we propose a dynamic model that is
responsive to the epigenomic surroundings and the
accompanying cellular processes – the fork-shift model
for repeat instability. In the fork-shift model, cis-elements
within or flanking the repeat tract alter the dynamics of
the advancing replication fork to produce repeat instability (Figure 1). In this model, the position of the repeat tract
within the advancing replication fork determines the
location of Okazaki initiation, termination and processing
events, and thereby determines whether a mutagenic
event will occur. In this way, instability is not dependent
on the direction or location of an origin of replication, as in
the origin-switch or origin-shift model, respectively.
Repeat instability in the fork-shift model is therefore
dependent on the repeat sequence and the portion of the
repeat tract within the Okazaki initiation zone, and also,
by extension, the length of the repeat tract. These factors
might affect the probability of slippage, mutagenic DNA
structure formation and aberrant interactions with
replication and repair proteins. Mutagenic events at the
replication fork can lead to uncoupling, pausing and/or
slippage [10,37,38], events which might be processed
by S-phase-specific checkpoint-repair proteins [46,47].
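As a rough illustration of the geometric idea in the fork-shift model, the portion of a repeat tract that falls inside the Okazaki initiation zone (OIZ) can be computed as an interval overlap. The following Python sketch is not derived from experimental data: the coordinates, the 300-nt tract, the 200-nt zone and all function names are hypothetical, chosen only to show that moving the zone relative to a fixed tract changes the exposed fraction without any change in origin position or replication direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Half-open interval [start, end), in nt, along the lagging-strand template."""
    start: int
    end: int


def overlap(a: Interval, b: Interval) -> int:
    """Number of nucleotides shared by two intervals (0 if they are disjoint)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def repeat_fraction_in_oiz(repeat: Interval, oiz: Interval) -> float:
    """Fraction of the repeat tract that lies inside the Okazaki initiation zone."""
    length = repeat.end - repeat.start
    return overlap(repeat, oiz) / length if length > 0 else 0.0


# Hypothetical values: a 300-nt repeat tract at a fixed position and a
# 200-nt OIZ whose position shifts as fork dynamics change.
repeat = Interval(1000, 1300)
for oiz_start in (700, 900, 1050, 1200, 1400):
    oiz = Interval(oiz_start, oiz_start + 200)
    fraction = repeat_fraction_in_oiz(repeat, oiz)
    print(f"OIZ at {oiz.start}-{oiz.end}: {fraction:.2f} of repeat tract in zone")
```

In this toy picture the origin never moves; only the position of the zone relative to the tract varies, which is the distinction drawn between the fork-shift model and the origin-shift and origin-switch models.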
Factors affecting the fork-shift model
The placement of Okazaki fragments relative to the repeat
tract affects the probability of structure formation, strand
slippage and aberrant protein interactions – events that
are important to the fork-shift model. Okazaki initiation
can display some site selectivity, with the frequency of
priming events correlating directly with the occurrence of
‘preferred’ or ‘excluded’ sites [48]. This selectivity might be
important because CTG, CCG and CCTG repeats contain
either ‘major’ or ‘minor preferred’ sites but lack ‘excluded’
or ‘infrequently used’ sites, whereas CAG, CGG and
CAGG repeats are virtually saturated with ‘excluded’
and ‘infrequently used’ sites (Figure 3a). Multiple closely spaced or ‘nested’ Okazaki initiations [49] on CTG, CCG or
CCTG repeats can decrease the likelihood of hairpin
formation in the template strand, while increasing the
likelihood of relatively short hairpin-induced errors in the
nascent DNA during Okazaki processing (Figure 2b and
Box 1). By contrast, infrequent Okazaki initiations on
CAG, CGG or CAGG would cause imbalanced synthesis,
increasing the amount of single-stranded DNA on the
lagging strand and promoting structure formation in
either leading or lagging strands (Figure 2c). Imbalanced
synthesis can uncouple the coordination of leading and
lagging-strand synthesis, an event that, when induced
chemically with emetine or aphidicolin (inhibits DNA
polymerase α, δ and ε), provoked large expansions
(increases of up to 170 repeats) in cultured DM1 patient
cells [10]. Emetine preferentially blocks the synthesis of
Okazaki fragments, permitting synthesis of only the
leading strand and yielding replication forks with long
stretches of single-stranded DNA on the lagging-strand
template (from 690 nt to ≥1400 nt [10,50]). In a similar
manner, flanking sequences or cis-elements (sequence,
epigenetic or structural elements) might impair the
coordination of leading- and lagging-strand synthesis or influence
the placement of the OIZ. In addition to repeat-specific
replication fork interactions, Okazaki initiations dictated
by flanking sequences (Figure 3b) might influence the
placement of the OIZ within the repeat tract. Interestingly, the distribution of CpG-methylatable sites in the
flanking sequence varies considerably between repeat-disease loci [51,52], whereas within the repeat tract itself such sites are present only in the
CGG/CCG repeat (Figure 3). In this manner the effects of
flanking cis-elements proximal to the repeat might affect
fork dynamics – setting the stage for future errors once the
fork reaches the repeat tract.
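The observation that CpG sites occur within only the CGG/CCG repeat can be checked directly from the repeat units. The Python sketch below builds idealized, uninterrupted tracts of the normalized lengths used in Figure 3 [(CNG)21 or (CCNG)15] and counts CpG dinucleotides in each. Flanking sequence, tract interruptions and the GenBank records themselves are not modelled, and the helper names are illustrative only.

```python
def repeat_tract(unit: str, copies: int) -> str:
    """Return an idealized, uninterrupted tract of `copies` tandem repeat units."""
    return unit * copies


def count_cpg(seq: str) -> int:
    """Count CpG dinucleotides (a C immediately followed by a G) on one strand."""
    return seq.upper().count("CG")


# Repeat units named in the text, at the normalized tract lengths of Figure 3.
TRACTS = {
    "CTG": 21, "CAG": 21, "CGG": 21, "CCG": 21,  # (CNG)21
    "CCTG": 15, "CAGG": 15,                      # (CCNG)15 and its complement
}

cpg_counts = {unit: count_cpg(repeat_tract(unit, n)) for unit, n in TRACTS.items()}

for unit, n_cpg in cpg_counts.items():
    print(f"({unit}){TRACTS[unit]}: {n_cpg} CpG sites within the tract")
```

Only the CGG and CCG tracts yield a non-zero count, one CpG per repeat unit, so any methylation-dependent effect within the tract itself is specific to that repeat, whereas the other loci can differ only through CpG sites in their flanks.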
The surrounding epigenomic environment, such as the
state of chromatin packaging, protein factor availability
and regional CpG methylation, can influence DNA
replication. Recent evidence indicates that both origin
alterations and CpG methylation can manipulate CGG
instability [44]. Any influence on DNA replication,
including the activity of various repair proteins and/or
DNA-bound proteins (Figure 1b), might alter how the
replication fork encounters the repeat tract. This alteration in turn shifts Okazaki initiations and alters the
propensity to form mutagenic intermediates with the
repeat tract, ultimately affecting instability. Other factors
such as nucleosome packaging and dNTP-pool size can
also affect the location of Okazaki initiation [53]. Chromatin remodelling has been implicated in facilitating the
movement of the replication fork through heterochromatin domains [54] and its interaction with protein
complexes facilitates the binding of key replication
proteins. Nucleosome packaging is of particular importance for repeats, because some repetitive sequences
assemble preferentially (i.e. CTG/CAG) [55] or exclude
nucleosomes (i.e. CGG/CCG) [56]. This ability can be
modulated by both CpG methylation and repeat tract
purity [57]. DNA methylation, which can modulate
chromatin, alters the rate and fidelity of DNA synthesis.
In addition, gene activity within the immediate vicinity of
the repeat tract can enable head-to-head or co-directional
collisions between the advancing replication fork and
transcription machinery, which might alter fork
progression. Many of the factors that affect the fork-shift
model vary between loci and/or between tissues of the
same patient. Thus, replication forks initiated from the
same location might encounter the repeat tract in an
entirely different manner, depending on the cell type, tissue,
time point, developmental stage, differentiation status and
transcriptional activity.
Determining which model is occurring in humans
Given the size and complexity of the human genome,
mapping the replication origin, replication direction or
fork progression is extremely difficult – only a handful of
human origins have been identified to date [42], and only
one has been characterized at the nucleotide level [58].
Therefore, it is not trivial to distinguish between the origin-shift, origin-switch and fork-shift models experimentally.
Many of the replication-detection techniques are difficult
to adapt to detect small changes in single-locus origin
activity within the large, complex human genome [41]. In
addition, these analyses must be performed in tissues that
actively display repeat instability, adding further difficulty to the task. Given the complexity and diversity of
repeat-associated disorders, accurate data from model
systems must be carefully compared with the specific
locus-, tissue- and cell-specific instability observed in
patients, to yield important clues to identify which model
is occurring and to answer some outstanding questions in
the field (Box 2).
Concluding remarks
The dynamic and responsive nature of the fork-shift model
can explain some of the complex behaviours associated
with repeat instability. With this replication model,
instability does not have to strictly correlate with the
proliferation rate, because an increased probability of
altered replication fork dynamics in a slowly proliferating
tissue can produce greater instability than a decreased
probability in a rapidly proliferating tissue. The fork-shift model centres on the likelihood of mutagenic
intermediate formation, a process that is influenced by
repeat sequence, tract size, flanking cis-elements, the
epigenomic environment and the location of the repeat
within the replication fork. Ultimately, repair and/or
recombination proteins that are involved in replication
and post-replication processes will determine whether
DNA mutagenic intermediates are processed correctly.
The wide variation in repeat instability observed between
diverse tissues at different developmental times and
among specific repeat loci probably reflects the permutations of cis-elements and trans-factors that can contribute to instability. The fork-shift model is dynamic because
it is responsive to epigenomic surroundings and the
accompanying cellular and biochemical processes, both
of which vary between and within proliferative tissues.
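The argument that instability need not track proliferation rate can be made concrete with simple arithmetic. The Python sketch below assumes, purely for illustration, that each passage of a fork through the repeat carries an independent probability of a mutagenic event; the division counts and probabilities are invented numbers, not measurements from any tissue.

```python
def expected_events(replications: int, p_event: float) -> float:
    """Expected number of mutagenic events over a number of replications."""
    return replications * p_event


def prob_at_least_one(replications: int, p_event: float) -> float:
    """Probability of one or more events, assuming independent replications."""
    return 1.0 - (1.0 - p_event) ** replications


# Hypothetical tissues: few divisions with a high per-replication probability
# versus many divisions with a low per-replication probability.
tissues = {
    "slowly proliferating": {"replications": 10, "p_event": 0.05},
    "rapidly proliferating": {"replications": 100, "p_event": 0.001},
}

for name, params in tissues.items():
    mean = expected_events(**params)
    p_any = prob_at_least_one(**params)
    print(f"{name}: expected events = {mean:.2f}, P(at least one) = {p_any:.3f}")
```

Under these assumed values the slowly proliferating tissue accumulates more expected events despite undergoing one-tenth as many replications, which is the sense in which the per-replication probability, rather than the proliferation rate alone, governs instability in the model.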
Acknowledgements
We thank G. Almounzi, R.A. Bambara, J. Hayes, U. Hubscher, T. Krude
and J. Sogo for their comments on Figure 2b. We also acknowledge K.
Nichol Edamura for her thoughtful comments on this article. Research in
the Pearson laboratory is supported by grants from the Muscular
Dystrophy Association USA, the Canadian Institutes of Health Research
(CIHR), Fragile X Research Foundation of Canada and the University of
Toronto Deans Fund. J.D.C. is supported by a CIHR doctoral research
award. C.E.P. is a CIHR scholar and a Canadian Genetic Disease Network
Scholar.
References
1 Cleary, J.D. and Pearson, C.E. (2003) The contribution of cis-elements
to disease-associated repeat instability: clinical and experimental
evidence. Cytogenet. Genome Res. 100, 25–55
2 Kennedy, L. et al. (2003) Dramatic tissue-specific mutation length
increases are an early molecular event in Huntington disease
pathogenesis. Hum. Mol. Genet. 12, 3359–3367
3 Wheeler, V.C. et al. (2003) Mismatch repair gene Msh2 modifies the
timing of early disease in Hdh(Q111) striatum. Hum. Mol. Genet. 12,
273–281
4 Richards, R.I. and Sutherland, G.R. (1992) Dynamic mutations: a new
class of mutations causing human disease. Cell 70, 709–712
5 de la Chapelle, A. and Peltomaki, P. (1995) Genetics of hereditary
colon cancer. Annu. Rev. Genet. 29, 329–348
6 Lahue, R.S. and Slater, D.L. (2003) DNA repair and trinucleotide
repeat instability. Front. Biosci. 8, s653–s665
7 La Spada, A.R. et al. (2004) Dynamic mutations on the move in Banff.
Nat. Genet. 36, 667–670
8 Martorell, L. et al. (1998) Progression of somatic CTG repeat length
heterogeneity in the blood cells of myotonic dystrophy patients. Hum.
Mol. Genet. 7, 307–312
9 Hashida, H. et al. (2001) Single cell analysis of CAG repeat in brains of
dentatorubral-pallidoluysian atrophy (DRPLA). J. Neurol. Sci. 190,
87–93
10 Yang, Z. et al. (2003) Replication inhibitors modulate instability of an
expanded trinucleotide repeat at the myotonic dystrophy type 1
disease locus in human cells. Am. J. Hum. Genet. 73, 1092–1105
11 Freudenreich, C.H. and Lahiri, M. (2004) Structure-forming
CAG/CTG repeat sequences are sensitive to breakage in the absence
of Mrc1 checkpoint function and S-phase checkpoint signaling:
implications for trinucleotide repeat expansion diseases. Cell Cycle
3, 1370–1374
12 Lahiri, M. et al. (2004) Expanded CAG repeats activate the DNA
damage checkpoint pathway. Mol. Cell 15, 287–293
13 Thornton, C.A. et al. (1994) Myotonic dystrophy patients have larger
CTG expansions in skeletal muscle than in leukocytes. Ann. Neurol.
35, 104–107
14 Zatz, M. et al. (1995) Analysis of the CTG repeat in skeletal muscle of
young and adult myotonic dystrophy patients: when does the
expansion occur? Hum. Mol. Genet. 4, 401–406
15 Yoon, S.R. et al. (2003) Huntington disease expansion mutations in
humans can occur before meiosis is completed. Proc. Natl. Acad. Sci.
U. S. A. 100, 8834–8838
16 De Temmerman, N. et al. (2004) Intergenerational instability of the
expanded CTG repeat in the DMPK gene: studies in human gametes
and preimplantation embryos. Am. J. Hum. Genet. 75, 325–329
17 Pearson, C.E. (2003) Slipping while sleeping? Trinucleotide repeat
expansions in germ cells. Trends Mol. Med. 9, 490–495
18 Savouret, C. et al. (2003) CTG repeat instability and size variation
timing in DNA repair-deficient mice. EMBO J. 22, 2264–2273
19 Wohrle, D. et al. (1993) Mitotic stability of fragile X mutations in
differentiated cells indicates early post-conceptional trinucleotide
repeat expansion. Nat. Genet. 4, 140–142
20 Martorell, L. et al. (1997) Somatic instability of the myotonic
dystrophy (CTG)n repeat during human fetal development. Hum.
Mol. Genet. 6, 877–880
21 Benitez, J. et al. (1995) Somatic stability in chorionic villi samples and
other Huntington fetal tissues. Hum. Genet. 96, 229–232
22 Jedele, K.B. et al. (1998) Spinal and bulbar muscular atrophy (SBMA):
somatic stability of an expanded CAG repeat in fetal tissues. Clin.
Genet. 54, 148–151
23 Takano, H. et al. (1996) Somatic mosaicism of expanded CAG repeats
in brains of patients with dentatorubral-pallidoluysian atrophy:
cellular population-dependent dynamics of mitotic instability. Am.
J. Hum. Genet. 58, 1212–1222
24 Hashida, H. et al. (1997) Brain regional differences in the expansion of
a CAG repeat in the spinocerebellar ataxias: dentatorubral-pallidoluysian atrophy, Machado-Joseph disease, and spinocerebellar ataxia
type 1. Ann. Neurol. 41, 505–511
25 Matsuura, T. et al. (1999) Mosaicism of unstable CAG repeats in the
brain of spinocerebellar ataxia type 2. J. Neurol. 246, 835–839
26 Watanabe, H. et al. (2000) Differential somatic CAG repeat instability
in variable brain cell lineage in dentatorubral pallidoluysian atrophy
(DRPLA): a laser-captured microdissection (LCM)-based analysis.
Hum. Genet. 107, 452–457
27 Monckton, D.G. et al. (1997) Hypermutable myotonic dystrophy CTG
repeats in transgenic mice. Nat. Genet. 15, 193–196
28 Seznec, H. et al. (2000) Transgenic mice carrying large human
genomic sequences with expanded CTG repeat mimic closely the
DM CTG repeat intergenerational and somatic instability. Hum. Mol.
Genet. 9, 1185–1194
29 Libby, R.T. et al. (2003) Genomic context drives SCA7 CAG repeat
instability, while expressed SCA7 cDNAs are intergenerationally and
somatically stable in transgenic mice. Hum. Mol. Genet. 12, 41–50
30 Kang, S. et al. (1995) Expansion and deletion of CTG repeats from
human disease genes are determined by the direction of replication in
E. coli. Nat. Genet. 10, 213–218
31 Maurer, D.J. et al. (1996) Orientation dependence of trinucleotide
CAG repeat instability in Saccharomyces cerevisiae. Mol. Cell. Biol.
16, 6617–6622
32 Cleary, J.D. et al. (2002) Evidence of cis-acting factors in replication-mediated trinucleotide repeat instability in primate cells. Nat. Genet.
31, 37–46
33 Anderson, S. and DePamphilis, M.L. (1979) Metabolism of Okazaki
fragments during simian virus 40 DNA replication. J. Biol. Chem.
254, 11495–11504
34 Freudenreich, C.H. et al. (1998) Expansion and length-dependent
fragility of CTG repeats in yeast. Science 279, 853–856
35 Schweitzer, J.K. and Livingston, D.M. (1998) Expansions of CAG
repeat tracts are frequent in a yeast mutant defective in Okazaki
fragment maturation. Hum. Mol. Genet. 7, 69–74
36 Spiro, C. et al. (1999) Inhibition of FEN-1 processing by DNA
secondary structure at trinucleotide repeats. Mol. Cell 4, 1079–1085
37 Samadashwily, G.M. et al. (1997) Trinucleotide repeats affect DNA
replication in vivo. Nat. Genet. 17, 298–304
38 Pelletier, R. et al. (2003) Replication and expansion of trinucleotide
repeats in yeast. Mol. Cell. Biol. 23, 1349–1357
39 Leffak, M. and James, C.D. (1989) Opposite replication polarity of the
germ line c-myc gene in HeLa cells compared with that of two Burkitt
lymphoma cell lines. Mol. Cell. Biol. 9, 586–593
40 Kitsberg, D. et al. (1993) Replication structure of the human beta-globin gene domain. Nature 366, 588–590
41 Aladjem, M.I. and Fanning, E. (2004) The replicon revisited: an old
model learns new tricks in metazoan chromosomes. EMBO Rep. 5,
686–691
42 Todorovic, V. et al. (1999) Replication origins of mammalian chromosomes: the happy few. Front. Biosci. 4, D859–D868
43 Nenguke, T. et al. (2003) Candidate DNA replication initiation regions
at human trinucleotide repeat disease loci. Hum. Mol. Genet. 12,
1021–1028
44 Nichol Edamura, K. et al. (2005) Role of replication and CpG
methylation in Fragile X syndrome CGG deletions in primate cells.
Am. J. Hum. Genet. 76, 302–311
45 Dere, R. et al. (2004) Hairpin structure-forming propensity of the
(CCTG/CAGG) tetranucleotide repeats contributes to the genetic
instability associated with myotonic dystrophy type 2. J. Biol. Chem.
279, 41715–41726
46 Freudenreich, C.H. and Lahiri, M. (2004) Structure-forming
CAG/CTG repeat sequences are sensitive to breakage in the absence
of Mrc1 checkpoint function and S-phase checkpoint signaling:
implications for trinucleotide repeat expansion diseases. Cell Cycle
3, 1370–1374
47 Lahiri, M. et al. (2004) Expanded CAG repeats activate the DNA
damage checkpoint pathway. Mol. Cell 15, 287–293
48 Hay, R.T. et al. (1984) Sequence specificity for the initiation of
RNA-primed simian virus 40 DNA synthesis in vivo. J. Mol. Biol.
175, 131–157
49 Nethanel, T. et al. (1988) An Okazaki piece of simian virus 40 may
be synthesized by ligation of shorter precursor chains. J. Virol. 62,
2867–2873
50 Burhans, W.C. et al. (1991) Emetine allows identification of origins of
mammalian DNA replication by imbalanced DNA synthesis, not
through conservative nucleosome segregation. EMBO J. 10, 4351–4360
51 Gourdon, G. et al. (1997) Intriguing association between disease
associated unstable trinucleotide repeat and CpG island. Ann. Genet.
40, 73–77
52 Brock, G.J. et al. (1999) Cis-acting modifiers of expanded CAG/CTG
triplet repeat expandability: associations with flanking GC content
and proximity to CpG islands. Hum. Mol. Genet. 8, 1061–1067
53 Taljanidisz, J. et al. (1987) Initiation of simian virus 40 DNA
replication in vitro: identification of RNA-primed nascent DNA
chains. Nucleic Acids Res. 15, 7877–7888
54 Krude, T. (1999) Chromatin assembly during DNA replication in
somatic cells. Eur. J. Biochem. 263, 1–5
55 Wang, Y.H. et al. (1994) Preferential nucleosome assembly at DNA
triplet repeats from the myotonic dystrophy gene. Science 265,
669–671
56 Wang, Y.H. and Griffith, J. (1996) Methylation of expanded CCG
triplet repeat DNA from fragile X syndrome patients enhances
nucleosome exclusion. J. Biol. Chem. 271, 22937–22940
57 Mulvihill, D.J. et al. (2005) Effect of CAT or AGG interruptions and
CpG methylation on nucleosome assembly upon trinucleotide repeats
on spinocerebellar ataxia, type 1 and fragile X syndrome. J. Biol. Chem.
280, 4498–4503
58 Abdurashidova, G. et al. (2000) Start sites of bidirectional DNA
synthesis at the human lamin B2 origin. Science 287, 2023–2026
59 Mirkin, S.M. and Smirnova, E.V. (2002) Positioned to expand. Nat.
Genet. 31, 5–6
60 Herman, T.M. et al. (1981) Structure of chromatin at deoxyribonucleic
acid replication forks: location of the first nucleosomes on newly
synthesized simian virus 40 deoxyribonucleic acid. Biochemistry 20,
621–630
61 Huggins, C.F. et al. (2002) Flap endonuclease 1 efficiently cleaves base
excision repair and DNA replication intermediates assembled into
nucleosomes. Mol. Cell 10, 1201–1211
62 Alberts, B.M. et al. (1975) Reconstruction of the T4 bacteriophage DNA
replication apparatus from purified components. In DNA Synthesis
and its Regulation (Goulian, P. et al., eds), pp. 241–269, W.A.
Benjamin
63 Pearson, C.E. and Sinden, R.R. (1998) Trinucleotide repeat DNA
structures: dynamic mutations from dynamic DNA. Curr. Opin. Struct.
Biol. 8, 321–330
64 Pearson, C.E. et al. (2002) Slipped-strand DNAs formed by long
(CAG)*(CTG) repeats: slipped-out repeats and slip-out junctions.
Nucleic Acids Res. 30, 4534–4547
65 Manley, K. et al. (1999) Msh2 deficiency prevents in vivo somatic
instability of the CAG repeat in Huntington disease transgenic mice.
Nat. Genet. 23, 471–473
66 van Den Broek, W.J. et al. (2002) Somatic expansion behaviour of the
(CTG)(n) repeat in myotonic dystrophy knock-in mice is differentially
affected by Msh3 and Msh6 mismatch-repair proteins. Hum. Mol.
Genet. 11, 191–198
67 Michael, W.M. et al. (2000) Activation of the DNA replication
checkpoint through RNA synthesis by primase. Science 289,
2133–2137