I. DNA unzipping

The Central Dogma of molecular biology
DNA → DNA: replication (cell division). Helicase; DNA polymerase
DNA → RNA: transcription. RNA polymerase (creates mRNA transcript)
RNA → Protein: translation. Ribosomes (synthesize proteins from mRNA)
(Show movie)
Study in detail the molecular machines: the central dogma in action…
DNA unzipping and motor proteins:
effect of the genetic code
• Statistical mechanics of unzipping long dsDNA at constant force (Fc=15 pN)
Strong first-order transition: delocalization of the unzipping fork
• Sequence heterogeneity dominates within ~5pN of the transition
Energy barriers ~ kBT√M, M = genome size! (kBT ≈ 0.6 kcal/mole)
Anomalous dynamics of the unzipping fork for F > Fc
Experimental evidence for jumps and pauses while unzipping λ-phage DNA
• Anomalous dynamics of molecular motors near the stall force
Dynamics of molecular motors on DNA substrates (driven by NTP’s…)
Polymerases, exonucleases and translocation via SSB proteins
Theorists
David Lubensky
Yariv Kafri
Julius Lucks
Buddhapriya Chakrabarti
Experimenters
Claudia Danilowicz
Vincent Coljee
Cedric Bouzigues
Mara Prentiss
DNA Stretching Experiments
Single molecule DNA stretching experiments,
made possible by laser tweezers and highly
specific biological linking agents, provided
direct experimental evidence for the Kratky-Porod
(or “worm-like chain”) theory of entropic polymer
elasticity; circa 1949.
Excellent fits to force-extension curves of dsDNA
characterized only by a “contour length” L and an
effective monomer size a.
Smith et al., Science 271, 785 (1996)
Bustamante, Marko, Siggia and Smith, Science 265, 1599 (1994)
\frac{F a}{k T} = \frac{1}{4}\left(1 - \frac{x}{L}\right)^{-2} - \frac{1}{4} + \frac{x}{L}
Entropic spring
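A minimal sketch of the worm-like chain force-extension curve, assuming the standard interpolation form F a / kT = (1/4)(1 - x/L)^-2 - 1/4 + x/L from the Bustamante et al. reference cited on this slide. The function name wlc_force and the numbers (kT = 4.1 pN nm, a = 50 nm, L = 16,500 nm) are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def wlc_force(x, L, a, kT=4.1):
    """Force (pN) from F*a/kT = 1/(4*(1 - x/L)**2) - 1/4 + x/L.

    x, L, a are in nm; kT is in pN nm (about 4.1 pN nm near room temperature).
    """
    u = np.asarray(x, dtype=float) / L
    if np.any((u < 0) | (u >= 1)):
        raise ValueError("extension must satisfy 0 <= x < L")
    return (kT / a) * (0.25 / (1.0 - u) ** 2 - 0.25 + u)

if __name__ == "__main__":
    L_nm = 16500.0  # assumed contour length, roughly 48,502 bp x 0.34 nm/bp
    a_nm = 50.0     # assumed effective monomer size
    for frac in (0.1, 0.5, 0.9, 0.97):
        force = float(wlc_force(frac * L_nm, L_nm, a_nm))
        print(f"x/L = {frac:4.2f}   F = {force:7.3f} pN")
```

The force stays well below 1 pN until the chain is stretched to a large fraction of its contour length and then rises steeply, which is the entropic-spring behavior the fits capture.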
Random walks and entropic elasticity
L = contour length
a = monomer size
N = L/a = number of independent units
[Figure: an N ~ 10,000 step random walk with end-to-end vector r; 7 random walks, all starting at the origin]
The total number of random walks terminating at r is a Gaussian:
\Omega(\vec r) \approx z^N \frac{1}{(\pi N a^2)^{3/2}} e^{-r^2 / N a^2}
The free energy G(r) associated with end-to-end distance r is purely entropic:
G(\vec r) = -TS = -k_B T \ln[\Omega(\vec r)] = \text{const.} + \frac{1}{2} k r^2 ; \quad k = 2 k_B T / (L a)
Add a force F to obtain an “entropic spring”:
G(\vec r) = \text{const.} + \frac{1}{2} k r^2 - \vec F \cdot \vec r \qquad (\vec F = k \vec r)
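As a check on the spring constant quoted on this slide, the steps from the Gaussian count of walks to k can be written out. This is a worked sketch that uses only the slide's own convention for the Gaussian (exponent r^2 / N a^2) and N = L/a:

```latex
\begin{align*}
G(r) &= -k_B T \ln \Omega(r) = \text{const.} + k_B T\,\frac{r^2}{N a^2} \\
     &= \text{const.} + \frac{1}{2}\,k\,r^2,
       \qquad k = \frac{2 k_B T}{N a^2} = \frac{2 k_B T}{L a}
       \quad (\text{since } N a^2 = L a), \\
0 &= \frac{\partial}{\partial r}\left[\tfrac{1}{2} k r^2 - F r\right]
   \;\Rightarrow\; F = k r .
\end{align*}
```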
Single Molecule Biophysics Experiments
Reversible action of DNA
polymerase/exonuclease…
Relaxation of supercoiled
DNA by topoisomerase II …
chiral specificity revealed;
no “inactive” enzymes
Bustamante et al., Nature 421, 423 (2003);
see also van Oijen et al. Science 301 29 (2003)
Strick et al., Nature 404, 901 (2000)
Force-induced unfolding of RNA hairpins
Liphardt et al., Science 292, 733 (2001)
Single Molecule DNA Unzipping Transitions
Bockelmann et al., Biophys. J. 82, 1537 (2002)
Previous experiments and simulations
performed in a “constant extension” ensemble
However, exact results possible in a more
tractable (and biologically relevant?) “constant
force” ensemble ….
David K. Lubensky and drn:
Unzipping for F < Fc(T) dominated by pauses and
jumps determined by the base pair sequence
Precise predictions available even for single
molecule experiments
Anomalous dynamics of the DNA unzipping fork
for F > Fc(T); barriers scale as square root of genome
size!
Thermal Denaturation of DNA (poly A:T or G:C)
Z(\vec r, 0; N) \propto probability that N base pairs of DNA are separated by \vec r at top
T \partial_N Z(\vec r, 0; N) = -\frac{T^2}{2g} \left( \vec\nabla - \vec F / T \right)^2 Z(\vec r, 0; N) + U(\vec r)\, Z(\vec r, 0; N)
(\vec F = applied force at top…)
For homopolymers like poly(A:T) or poly(G:C)…
N_b(T) \sim \frac{1}{(T_m - T)^2}
a_0(T) \sim -\text{const.}\,(T_m - T)^2
Force-induced denaturation of poly(A:T)
[Figure: unzipping fork; a_0(T) = 2 a_1(F)]
First order “unzipping” transition at Fc(T)
Two “phases” coexist at the unzipping fork
If m base pairs are unzipped, the unzipping bias f = 2a_1(F) - a_0 controls the energy landscape ε(m):
\varepsilon(m) = (N - m)\, a_0 + 2 m\, a_1 = a_0 N + f m
Unzipping of heterogeneous DNA
\varepsilon(m) = a_0 N + f m + \sum_{m'=0}^{m} \eta(m')
(the sum over η(m') carries the sequence information!)
[Figure: generic energy landscape for DNA unzipping (f/kBT = 0.01): ε(m) vs. m for bacteriophage φX174, m from 0 to 5000 (f/kBT = 0.01; SantaLucia et al.)]
For most coding DNA…
\overline{\eta(m)\,\eta(m')} \approx \begin{cases} \Delta_0, & m = m' \\ 0, & \text{otherwise} \end{cases}
(overbar = average along sequence)
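A minimal sketch of such a landscape, assuming the sequence term η(m') can be modeled as uncorrelated Gaussian noise with variance Δ0 (a stand-in for real base-pairing energies, which the slides take from sequence data). The names unzipping_landscape and rms_end_fluctuation are hypothetical; energies are in units of kBT and the constant a0 N is dropped.

```python
import numpy as np

def unzipping_landscape(M, f, delta0, rng):
    """Return f*m + sum_{m'=1..m} eta(m') for m = 0..M, in units of kB*T.

    eta is uncorrelated Gaussian noise with variance delta0 (an assumption
    standing in for real sequence data). The constant a0*N is omitted.
    """
    eta = rng.normal(0.0, np.sqrt(delta0), size=M)
    m = np.arange(M + 1)
    return f * m + np.concatenate(([0.0], np.cumsum(eta)))

def rms_end_fluctuation(M, delta0, samples=400, seed=0):
    """RMS value of the f = 0 landscape at m = M over random sequences."""
    rng = np.random.default_rng(seed)
    ends = [unzipping_landscape(M, 0.0, delta0, rng)[-1] for _ in range(samples)]
    return float(np.sqrt(np.mean(np.square(ends))))

if __name__ == "__main__":
    for M in (25, 5386, 48502):
        rms = rms_end_fluctuation(M, delta0=1.0)
        print(f"M = {M:6d}   rms fluctuation = {rms:7.1f}   sqrt(M) = {np.sqrt(M):7.1f}")
```

With Δ0 = 1 the sampled fluctuations track √M, which is the random-walk scaling of the barriers quoted in these slides.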
Statistical mechanics of the unzipping fork
Z = \int_0^N e^{-\Delta\varepsilon(m)/T}\, dm, \quad (k_B = 1)
\Delta\varepsilon(m) = f m + \sum_{m'=0}^{m} \eta(m'), \qquad f \propto F_c(T) - F
For homopolymers [η(m) = ∆0 = 0], <m> ≈ T/(Fc-F)
However, sequence heterogeneity [∆0 ≠ 0] leads to plateaus, jumps and a stronger divergence:
\langle m \rangle \approx \Delta_0 / f^2 \propto 1/(F_c - F)^2
(homopolymer, for comparison: \langle m \rangle \propto T/(F_c - F))
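The homopolymer result can be checked numerically by summing the Boltzmann weights directly. This is a sketch under stated assumptions: the integral over m is replaced by a sum over integer m, kB = 1, and for the heterogeneous case η is again taken to be uncorrelated Gaussian noise of variance Δ0. The function name mean_unzipped is hypothetical.

```python
import numpy as np

def mean_unzipped(f, T=1.0, N=20000, delta0=0.0, seed=0):
    """<m> from the weights exp(-delta_eps(m)/T), m = 0..N (kB = 1)."""
    m = np.arange(N + 1)
    rng = np.random.default_rng(seed)
    # With delta0 == 0 every draw is exactly zero (homopolymer).
    eta = rng.normal(0.0, np.sqrt(delta0), size=N + 1)
    d_eps = f * m + np.cumsum(eta)
    # Shift by the minimum so the largest weight is 1 (avoids overflow).
    w = np.exp(-(d_eps - d_eps.min()) / T)
    return float(np.sum(m * w) / np.sum(w))

if __name__ == "__main__":
    for f in (0.04, 0.02, 0.01):
        print(f"f = {f:5.2f}   homopolymer <m> = {mean_unzipped(f):7.2f}   T/f = {1.0 / f:7.2f}")
    # Single random sequences: <m> varies strongly from sample to sample.
    for seed in range(3):
        print("heterogeneous sample", seed, round(mean_unzipped(0.02, delta0=0.05, seed=seed), 1))
```

For a single random sequence the value of <m> jumps between a few deep minima as f changes, so the Δ0/f² law on the slide should be read as a typical (sequence-averaged) scale rather than something one realization reproduces smoothly.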
The dynamics of the unzipping fork is controlled by a
random force energy landscape
\varepsilon(m) = -a_0 N - |f|\, m + \int_0^m \eta(m')\, dm'
Langevin equation for position of the unzipping fork m(t):
\frac{dm(t)}{dt} = -\Gamma \frac{d\varepsilon(m)}{dm} + \xi(t) = \Gamma |f| - \Gamma\, \eta(m) + \xi(t)
Physics same as particle in a random
force field: Sinai, Derrida, ~1982-83;
Long time dynamics depends only on
µ = 2kBT|f|/∆0
[Figure: energy landscape ε(m) vs. m for bacteriophage φX174, f/kT = -0.01; m from 0 to 5000]
if µ = |f| = 0, then
\sqrt{\langle [m(t) - m(0)]^2 \rangle} = \frac{(k_B T)^2}{\Delta_0} \ln^2(t/\tau_0)
(logarithmic localization)
if 0 < µ < 1, then
\langle m(t) \rangle \sim t^{\mu}
(sub-ballistic drift)
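A rough Euler-Maruyama sketch of the Langevin equation dm/dt = Γ|f| - Γη(m) + ξ(t). Several pieces are assumptions not stated on the slides: the noise strength <ξ(t)ξ(t')> = 2ΓkBT δ(t - t'), η(m) piecewise constant on unit intervals and drawn as Gaussian noise of variance Δ0, a reflecting wall at m = 0, and all parameter values. The names mu_exponent and simulate_fork are hypothetical.

```python
import numpy as np

def mu_exponent(f, delta0, kT=1.0):
    """mu = 2 kB T |f| / Delta0, the parameter controlling long-time dynamics."""
    return 2.0 * kT * abs(f) / delta0

def simulate_fork(f_abs, delta0, M=5000, steps=4000, dt=0.05,
                  gamma=1.0, kT=1.0, walkers=200, seed=0):
    """Final fork positions of independent walkers on ONE random landscape."""
    rng = np.random.default_rng(seed)
    eta = rng.normal(0.0, np.sqrt(delta0), size=M)  # zeros when delta0 == 0
    m = np.zeros(walkers)
    noise_amp = np.sqrt(2.0 * gamma * kT * dt)       # assumed noise strength
    for _ in range(steps):
        idx = m.astype(int)                          # local base-pair index
        drift = gamma * (f_abs - eta[idx])
        m = m + drift * dt + noise_amp * rng.normal(size=walkers)
        m = np.clip(np.abs(m), 0.0, M - 1e-9)        # reflect at 0, cap at M
    return m

if __name__ == "__main__":
    f_abs, delta0 = 0.05, 0.2
    print("mu =", mu_exponent(f_abs, delta0))
    for steps in (1000, 4000, 16000):
        m = simulate_fork(f_abs, delta0, steps=steps)
        t = steps * 0.05
        print(f"t = {t:7.1f}   <m> = {m.mean():8.2f}   <m>/t = {m.mean() / t:6.4f}")
```

On a single landscape the walkers tend to collect at the same deep minima, which is the pausing behavior described above; averaging over many landscapes would be needed to see the t^µ law cleanly.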
DNA unzipped under a constant force
exhibits multiple metastable intermediates
C. Danilowicz, M. Prentiss, et al., PNAS 100, 1694 (2003)
[Figure labels: fully zipped; fully unzipped]
Several dozen identical
λ-phage DNA’s attached to
magnetic beads and unzipped
in parallel in a magnetic field
gradient.
Positions and pauses during
unzipping process provide a
sequence-dependent molecular
fingerprint.
Unzipping is very slow at
piconewton forces; several hours
required to unzip the 48,502
base pairs of phage lambda.
(top view)
Unzipping histories and energy landscapes
for F = 15pN
Experiments consistent with large
energy barriers and sequence specific
pause points.
3 different DNA’s unzipped @ 15 pN
4 different DNA’s unzipped @ 20 pN
Positions and local jumps between
two-level systems coincide with those
predicted for phage lambda
Simulation of unzipping dynamics for
bacteriophage φX174
f = -0.02
Simulations and genome
landscapes courtesy of
Julius Lucks….
See movie…..
Helicase Unzipping of heterogeneous DNA
\varepsilon(m) = a_0 N + f m + \sum_{m'=0}^{m} \eta(m')
(the sum over η(m') carries the sequence information!)
For most coding DNA…
\overline{\eta(m)\,\eta(m')} \approx \begin{cases} \Delta_0, & m = m' \\ 0, & \text{otherwise} \end{cases}
(overbar = average along sequence)
Langevin equation for position of the unzipping fork m(t):
\frac{dm(t)}{dt} = -\Gamma \frac{d\varepsilon(m)}{dm} + \xi(t)
[Figure: energy landscape ε(m) vs. m for bacteriophage φX174, f/kT = -0.01; m from 0 to 5000]
Physics same as particle in a random force field:
Sinai, Derrida, ~1982-83; Long time dynamics
depends only on µ = 2kBT|f|/∆0
if 0 < µ < 1, then
\langle m(t) \rangle \sim t^{\mu}
(sub-ballistic drift)
Overview for DNA unzipping:
energy barriers scale like √M!
RNA hairpin: M = 25, √M ≈ 5, barriers ~ 5 kT (Liphardt et al.)
Bacteriophage φX174: M = 5386, √M ≈ 73, barriers ~ 73 kT
Danilowicz et al., Phys. Rev. Lett. 93, 078101 (2004)
[Figure: energy landscape ε(m) vs. m, m from 0 to 5000]
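The √M estimates can be reproduced with a few lines of arithmetic. This sketch uses the genome sizes quoted in these slides and kBT ≈ 0.6 kcal/mole from the overview slide; the function name barrier_scale is hypothetical, and the conversion simply multiplies √M by kBT.

```python
import math

KBT_KCAL_PER_MOL = 0.6  # value quoted in the overview slide

def barrier_scale(M):
    """Typical barrier height in units of kB*T for a genome of M base pairs."""
    return math.sqrt(M)

if __name__ == "__main__":
    genomes = {
        "RNA hairpin": 25,
        "phage phiX174": 5386,
        "phage T7": 39936,
        "phage lambda": 48502,
    }
    for name, M in genomes.items():
        scale = barrier_scale(M)
        print(f"{name:14s} M = {M:6d}  sqrt(M) = {scale:6.1f} kT"
              f"  = {scale * KBT_KCAL_PER_MOL:6.1f} kcal/mole")
```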
Are energy landscapes for viral genomes
really ‘random walks’?
[Figure panels: ε(m) vs. m, each with a scale bar kT√M]
E. coli phage φX174 (ssDNA, circular), M = 5386, √M ≈ 73
E. coli phage K, M = 6089
E. coli phage T7 (dsDNA, linear), M = 39,936, √M ≈ 200
Roseobacter phage SIO1, M = 39,898
Fluctuations scale like √M in many viruses, but phage lambda is exceptional…
E. coli phage lambda (dsDNA, linear), M = 48,502, √M ≈ 220
Front end of λ-phage is GC-rich and back end is AT-rich.
Is this the result of an ancient splicing of genomes infecting
thermophilic and thermophobic organisms?
Mechanical explanation from packing
constraints?
Anomalous landscapes also appear in other temperate
phages (share lysis/lysogeny switch)
upward slopes head/tail genes….
Phage HK022
Phage P2
P2 landscape hard to reconcile with “mechanical packing explanation” ….
Can we determine what part of landscape is due to the amino acid sequences in the proteins
and what part is due to codon usage?
Synonymous mutations in temperate phages
(assume mutation rate ~ 1 × 10^-6 per bp per replication)
Phage Lambda
Phage P2
●Residual features are due to conserving the protein sequences….
● Does codon bias preserve the original landscape over evolutionary time scales?
Hypothesis: GC content of phage lambda is related to
lysogenic/lytic life cycle and/or codon bias of E. coli host
[Figure: energy landscape ε(m) vs. m for phage lambda (m from 0 to 50000, ε(m) from 0 to 3000), with scale bar kT√M; label: phage tail ≈ 0.15 µ]
R. W. Hendrix & S. Casjens , “Bacteriophage λ and Its Genetic Neighborhood”, The Bacteriophages (Oxford, 2004)
Codon bias in E. coli
tRNA abundances and translational machinery of
E. coli (and other bacteria) are biased towards rapid
translation of certain codons.
Codon bias pattern of a gene correlates with the levels with which it is translated into protein and
indeed the accuracy of the translation process
Codon bias tables are typically constructed from abundantly produced ribosomal proteins (~12,000
ribosomes per cell), elongation factors, etc.
Leucine is spelled “CUG”
with frequency ~ 65%
http://www.evolvingcode.net/codon/cai/cais.php
In E. coli 8/19 (42%) of the most highly
preferred codons have AT in the 3rd position
But…11/19 (58%) of the most highly
preferred codons have GC in the 3rd position
[Figure panels for phage lambda: GC3 landscape; GC landscape; cumulative CAI landscape]
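The slides do not spell out how the GC3 landscape is built. One plausible construction, offered only as an assumption, is a cumulative walk that steps +1 when the third codon position is G or C and -1 when it is A or T (U). The function name gc3_landscape and the reading-frame argument are hypothetical.

```python
def gc3_landscape(seq, frame=0):
    """Cumulative GC3 walk: +1 for G/C, -1 for A/T/U at each third codon position.

    Assumed definition for illustration; ambiguous bases contribute 0.
    """
    walk = [0]
    for i in range(frame + 2, len(seq), 3):
        base = seq[i].upper()
        if base in "GC":
            step = 1
        elif base in "ATU":
            step = -1
        else:
            step = 0
        walk.append(walk[-1] + step)
    return walk

if __name__ == "__main__":
    # Three codons: CTG, CTG, CTA -> third positions G, G, A
    print(gc3_landscape("CTGCTGCTA"))
```

A steady upward slope in such a walk marks a region whose synonymous positions are GC-rich, which is the kind of feature the codon-bias hypothesis would need to explain.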
Anomalous dynamics for motor proteins?
[Y. Kafri, D. Lubensky, drn; Biophys. J. 86: 3373 (2004)]
[Figure: velocity vs. applied force F for kinesin; fit due to Fisher and Kolomeisky, PNAS 98, 7748-7753 (2001); see also Jülicher, Prost and Ajdari]
Kinesin: periodic substrate; small, simple
K. Visscher, M. J. Schnitzer, S. M. Block, Nature 400, 184 (1999)
RNAp: is sequence heterogeneity important?
M. Wang et al., Science 282, 902 (1998)
Dynamics with heterogeneity: one possibility is a random energy landscape
ssDNA translocation through a pore
[Figure: ssDNA threading a pore in a MEMBRANE between the cis side and the trans side, electrodes labeled − and +, m monomers translocated; energy landscape ε(m) vs. m]
Electric field drives heteropolymer like ssDNA through a pore…
Local interaction of nucleotides with pore leads
to tilted “random energy” landscape with bounded
fluctuations….
Resulting dynamics is “diffusion with drift”, with
diffusion constant and drift velocity renormalized
by the randomness….
m(t) \cong v_R t \pm \sqrt{2 D_R t}
Motors on periodic tracks: kinesin
walking along a microtubule
[Figure: kinesin-microtubule binding potential (note asymmetry); odd sites 1, 3, 5, 7 and even sites 0, 2, 4, 6, 8; applied force F]
Motor moves from the plus end to the minus end, powered by ATP
hydrolysis and opposed by an applied force F
• Describe motion of chemically driven
motors by a simple two-sublattice model
• Lack of inversion symmetry means
w_b^{\rightarrow} \neq w_b^{\leftarrow}, \quad w_a^{\rightarrow} \neq w_a^{\leftarrow}
Binding potential with
ATP driving force
Lattice Model of Motor Proteins
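Motor dynamics can be captured by a simplified two-state model
with two sublattices a and b and rate constants
w_a^{\rightarrow}, w_a^{\leftarrow}, w_b^{\rightarrow} and w_b^{\leftarrow}:
\frac{dP_n(t)}{dt} = w_a^{\rightarrow} P_{n-1}(t) + w_a^{\leftarrow} P_{n+1}(t) - (w_b^{\rightarrow} + w_b^{\leftarrow}) P_n(t) \quad (odd sites)
\frac{dP_n(t)}{dt} = w_b^{\rightarrow} P_{n-1}(t) + w_b^{\leftarrow} P_{n+1}(t) - (w_a^{\rightarrow} + w_a^{\leftarrow}) P_n(t) \quad (even sites)
w_a^{\rightarrow} = (\alpha e^{\Delta\mu/T} + \omega)\, e^{-\Delta\varepsilon/T - f/2T}
w_b^{\leftarrow} = (\alpha + \omega)\, e^{f/2T}
w_a^{\leftarrow} = (\alpha' e^{\Delta\mu/T} + \omega')\, e^{-\Delta\varepsilon/T + f/2T}
w_b^{\rightarrow} = (\alpha' + \omega')\, e^{-f/2T}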
On a periodic substrate, dynamics
is again diffusion with drift….
Two parallel channels, α and ω,
both biased by an applied force f
ω-channel describes thermal
transitions over a barrier ∆ε, unassisted
by ATP
α-channel powered by ATP hydrolysis
\Delta\mu = T \left[ \ln\frac{[ATP]}{[ADP][P_i]} - \ln\frac{[ATP]_{eq}}{[ADP]_{eq}[P_i]_{eq}} \right]
Directional motion requires α ≠ α’, ω ≠ ω’,
(due to lack of inversion symmetry)
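A minimal sketch of the drift velocity and stall force implied by these rates. The velocity formula v = 2 (w_a→ w_b→ - w_a← w_b←) / (w_a→ + w_a← + w_b→ + w_b←), in lattice sites per unit time, is not on the slides; it follows from the steady-state sublattice occupations of the master equation above and is added here as a derived result. The function names and the parameter values are illustrative assumptions.

```python
import math

def rates(f, dmu, d_eps, alpha, alpha_p, omega, omega_p, T=1.0):
    """Hopping rates of the two-sublattice model, as written on the slide."""
    boost = math.exp(dmu / T)
    wa_f = (alpha * boost + omega) * math.exp(-d_eps / T - f / (2 * T))
    wa_b = (alpha_p * boost + omega_p) * math.exp(-d_eps / T + f / (2 * T))
    wb_f = (alpha_p + omega_p) * math.exp(-f / (2 * T))
    wb_b = (alpha + omega) * math.exp(f / (2 * T))
    return wa_f, wa_b, wb_f, wb_b

def velocity(f, dmu, d_eps, alpha, alpha_p, omega, omega_p, T=1.0):
    """Steady-state drift (sites per unit time) on a periodic track."""
    wa_f, wa_b, wb_f, wb_b = rates(f, dmu, d_eps, alpha, alpha_p, omega, omega_p, T)
    return 2.0 * (wa_f * wb_f - wa_b * wb_b) / (wa_f + wa_b + wb_f + wb_b)

def stall_force(dmu, alpha, alpha_p, omega, omega_p, T=1.0):
    """Force at which the forward and backward rate products balance."""
    boost = math.exp(dmu / T)
    num = (alpha * boost + omega) * (alpha_p + omega_p)
    den = (alpha_p * boost + omega_p) * (alpha + omega)
    return 0.5 * T * math.log(num / den)

if __name__ == "__main__":
    p = dict(d_eps=1.0, alpha=1.0, alpha_p=0.1, omega=0.1, omega_p=0.1)
    fs = stall_force(5.0, p["alpha"], p["alpha_p"], p["omega"], p["omega_p"])
    for f in (0.0, 0.5 * fs, fs):
        print(f"f = {f:6.3f}   v = {velocity(f, 5.0, **p):8.4f}")
```

In this sketch the velocity vanishes at Δµ = 0 and f = 0 for any choice of parameters, and with Δµ > 0 it is nonzero only if α ω' differs from α' ω, which is one way to read the asymmetry requirement stated above.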
Decimation for Motors on Heterogeneous Tracks
Rates are now location-dependent!
For ∆µ >> ∆ε can again integrate out even sites to obtain, e. g.,
If ∆µ (13) = 0, then ∆E13 = E3 - E1 + 2 f
"random energy" landscape
Otherwise, η13 is a random function of position along the track and a
"random force" landscape is generated....
Sequence heterogeneity leads to
a random force landscape for RNAp…..
functions of location along track
for this setup is not a function of position
E(m) = 2 f m + \sum_{l=1}^{m} \eta(l)
Simulation with
η(l) nonzero
Random force landscape with energy
fluctuations which grow as
Other sources of random forcing
RNA polymerase
produces RNA
using NTP energy
Both random chemical energy & random chemical potential
for each nucleotide play a role….
effective energy landscape
E(m) = 2 f m + \sum_{l=1}^{m} f_\mu(l) + \sum_{l=1}^{m} \eta(l)
explicit random forcing due to
nucleotide chemical potentials
Experimental tests of predictions
m(t) \sim t^{\mu}, so v = \lim_{t \to \infty} \frac{m(t)}{t} \sim t^{\mu - 1} \to 0 if \mu(f) < 1!
stall force
Fs
Close to the stall force, the observed
“velocity” will depend on the
experimental time scale tE!
[Figure labels: convex velocity-force curve; window-dependent effective velocity; (MCS)]
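The window dependence can be made concrete with the scaling form alone. This sketch assumes m(t) = A t^µ exactly, with a hypothetical prefactor A, and evaluates the effective velocity m(tE)/tE for several experimental time windows tE; the function name effective_velocity is an illustration, not something from the slides.

```python
def effective_velocity(t_window, mu, prefactor=1.0):
    """v_eff = m(t)/t for m(t) = prefactor * t**mu, with 0 < mu < 1."""
    if t_window <= 0:
        raise ValueError("time window must be positive")
    return prefactor * t_window ** (mu - 1.0)

if __name__ == "__main__":
    mu = 0.5  # assumed value inside the anomalous window 0 < mu < 1
    for t_e in (1.0, 100.0, 10000.0):
        print(f"tE = {t_e:8.0f}   v_eff = {effective_velocity(t_e, mu):.4f}")
```

For µ = 0.5 each hundredfold increase in the observation window lowers the apparent velocity tenfold, so two experiments with different time resolution would report different velocities at the same force.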
Summary: anomalous dynamics near the
stall force of molecular motors…
M. D. Wang et al., Science 282, 902 (1998)
RNAp?
stall force
Lambda exonuclease? [see, e.g.,
van Oijen et al. Science 301 29 (2003)]
Velocity depends
on the time scale!
Simulations, courtesy of Yariv Kafri
Sequence heterogeneity matters
near the stall force Fs of RNAp.
Effective velocity vanishes at very
long times in a window around Fs…
Does sequence heterogeneity affect the dynamics of
even simpler models of molecular “motors”?
[Figure: ssDNA crossing a MEMBRANE pore between the cis side and the trans side, electrodes labeled − and +, m monomers translocated]
1. ssDNA translocation through a nanopore: random energy landscape
2. RecA- or SSB-protein-induced translocation of DNA through a nanopore: random force landscape
Pauses and jumps in molecular machines…
Pauses were actually ‘edited out’
of single molecule experiments
on RNA polymerase in an attempt
to obtain a well-defined velocity!!
Low load
high load
Is the velocity even well defined??
M. D. Wang et al., Science 282, 902 (1998)
Note scale-dependent
velocities…
T. T. Perkins et al., Science 301, 1914 (2003)
In later experiments on lambda exonuclease,
the velocity clearly depends on time scale!!!