Structure printout 5

Random walk / random coil
Why do we care?
The random coil is the “most” disordered structure (in a well-defined statistical way)
Real proteins approach this when drastically denatured
The random coil is a good model for DNA on medium to large distance scales
The same concepts will apply when we look at the movement of molecular motors,
chemotaxis of E. coli and diffusion.
Also to gambling, though that’s not biophysics.
In fact, you could teach this whole course as “examples of random walks”.
The 1D random walk
Consider the number of ways you can take 3 steps,
each to the left or right
This is exactly the same as the coefficients in the
binomial expansion
$(l + r)^3 = l^3 r^0 + 3 l^2 r^1 + 3 l^1 r^2 + l^0 r^3$
For N steps, the coefficients are
$(l + r)^N = \sum_{n=0}^{N} \frac{N!}{n!(N-n)!}\, l^n r^{(N-n)} \equiv \sum_{n=0}^{N} \binom{N}{n}\, l^n r^{(N-n)}$
l is the probability of going left; r is the probability of going
right; n is the number of steps to the left; (N–n) is the number
of steps to the right. Usually l+r=1, but it’s not required.
If you could go left and right with equal probability, then you
would have l=r=1/2, which is the PBoC case:
$p(n; N) = \frac{N!}{n!(N-n)!} \left(\frac{1}{2}\right)^N$
This is the probability of taking exactly n leftward steps out of N
total unbiased steps.
No contact energy: E=0 for every state
This is an entropy-driven phenomenon
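A minimal sketch of the counting argument above, assuming only the Python standard library. The function name `walk_probability` is illustrative and not from PBoC; it evaluates the binomial term for n leftward steps out of N.

```python
from math import comb

def walk_probability(n: int, N: int, l: float = 0.5) -> float:
    """Probability of exactly n leftward steps out of N, with P(left) = l and P(right) = 1 - l."""
    r = 1.0 - l
    return comb(N, n) * l**n * r**(N - n)

# Number of ways to take 3 steps with n = 0..3 of them to the left.
coefficients = [comb(3, n) for n in range(4)]

# Unbiased case l = r = 1/2 for the same 3-step walk.
probs = [walk_probability(n, 3) for n in range(4)]

# The probabilities over all n add up to 1 (checked here for N = 50).
total = sum(walk_probability(n, 50) for n in range(51))

print(coefficients, probs, total)
```

Every sequence of steps has the same energy here, so the probabilities only reflect how many sequences share the same n.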
The function p(n) has a maximum at n=N/2.
This is the most likely number of leftward steps.
We’re interested in behavior near this point.
We really care about net displacement R, which is (for step length h)
$R = n \cdot (-h) + (N - n) \cdot (+h) = (N - 2n)h$
We could solve for n and plug into the equation for p(n) on the previous slide.
This would give PBoC equation 8.10, which is exactly true but kind of messy.
Since factorials are unwieldy, we will approximate p(R) by a normal distribution.
The central limit theorem says we can do this for “most” random processes
We construct a normal distribution whose peak and curvature match p(R)
All 1D normal distributions look like
$p(R; N) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(R-R_0)^2/2\sigma^2}$
... we just need to find R0 and σ.
PBoC does this by constructing a Taylor series for p(R) and for the normal distribution and making sure all the terms
match up to order R^2 (though they don’t phrase it that way). This is equivalent to matching the peak and curvature
of p(R). They find that
$R_0 = 0$ and $\sigma = \sqrt{N}\,h$
All random walk problems have long-time behaviors that grow like $N^{1/2}$
The normal distribution is a very good approximation to the binomial
near the center of the distribution
Even as small as N=50, you can’t tell the difference
[Figure: p(R) versus R/h; binomial distribution (points) vs normal distribution (line)]
Except that the real probability of moving more than 50 steps to the left or right should be
zero. In the normal distribution it’s just very, very small – but not zero.
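The comparison can be reproduced numerically. This is a sketch under stated assumptions: the end point is written as R = k·h with integer k, and the normal density is multiplied by 2h because reachable end points are spaced 2h apart. The helper names `binomial_p` and `normal_p` are illustrative.

```python
from math import comb, exp, pi, sqrt

N = 50    # number of steps
h = 1.0   # step length (arbitrary units)

def binomial_p(k: int) -> float:
    """Exact probability that the walk ends at R = k*h after N unbiased steps."""
    if abs(k) > N or (N - k) % 2 != 0:
        return 0.0              # end point cannot be reached in N steps
    n = (N - k) // 2            # leftward steps, from R = (N - 2n) h
    return comb(N, n) / 2**N

def normal_p(k: int) -> float:
    """Normal approximation with R0 = 0 and sigma = sqrt(N) h, as a probability per reachable site."""
    sigma = sqrt(N) * h
    R = k * h
    density = exp(-R**2 / (2 * sigma**2)) / sqrt(2 * pi * sigma**2)
    return density * 2 * h      # reachable values of R are spaced 2h apart

for k in (0, 10, 20, 50, 52):
    print(f"R/h = {k:3d}  binomial = {binomial_p(k):.3e}  normal = {normal_p(k):.3e}")
```

Near the center the two columns agree closely. At |R/h| > N the binomial value is exactly zero while the normal value is small but positive.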
2D and 3D random walks
A similar argument leads to random walk end-to-end distributions for 2D and 3D
These are 1, 2, or 3 independent random walks along mutually perpendicular directions.
1D: $p(R; N) = \frac{1}{(2\pi\sigma^2)^{1/2}}\, e^{-R^2/2\sigma^2}$
2D: $p(R; N) = \frac{2\pi R}{(2\pi\sigma^2)^{1}}\, e^{-R^2/2\sigma^2}$
3D: $p(R; N) = \frac{4\pi R^2}{(2\pi\sigma^2)^{3/2}}\, e^{-R^2/2\sigma^2}$
in all cases, $\sigma = \sqrt{N}\,h$
An unbiased random walk always has a mean displacement of zero, but the mean square
movement depends on the dimension:
1D: $\langle R^2 \rangle = N h^2$
2D: $\langle R^2 \rangle = 2 N h^2$
3D: $\langle R^2 \rangle = 3 N h^2$
These results apply for an unbounded random walk only.
We will talk about boundaries when we get to molecular motors.
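A small Monte Carlo sketch of the mean square displacement, assuming (as stated above) that a walk in d dimensions is d independent 1D walks of N steps each. The function name, the number of walks, and the seed are illustrative choices.

```python
import random

def mean_square_displacement(dim, N, h=1.0, walks=4000, seed=0):
    """Estimate <R^2> for an unbounded walk built from `dim` independent 1D walks."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(walks):
        r_squared = 0.0
        for _ in range(dim):
            # N fair coin flips packed into one integer; set bits count leftward steps.
            n_left = bin(rng.getrandbits(N)).count("1")
            x = (N - 2 * n_left) * h      # net displacement along this axis
            r_squared += x * x
        total += r_squared
    return total / walks

N = 100
for dim in (1, 2, 3):
    msd = mean_square_displacement(dim, N)
    print(f"{dim}D: <R^2> / (N h^2) = {msd / N:.2f}")
```

The printed ratios should come out close to 1, 2, and 3, up to sampling noise of a few percent.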
Random coil
Make a chain out of N independent, freely jointed links, each of length a.
Each link points in some random direction in 3D space, independent of its neighbors
Tracing along the chain, the ith link is a vector displacement $\vec{r}_i$ where $|\vec{r}_i| = a$
The position of the Nth link is $\vec{R}_N = \sum_{i=1}^{N} \vec{r}_i$
We find that $\langle \vec{R}_N \rangle = 0$ and $\langle \vec{R}_N^2 \rangle = N a^2$
Even though we set up the problem differently, this is exactly the same result as for the random walk if you
remember that $a = \sqrt{3}\,h$ in 3D.
The final distribution is insensitive to the details of how the random walk is generated. Central limit theorem again.
We can use the results from the previous slide if we write for 2D: $h = a/\sqrt{2}$ and for 3D: $h = a/\sqrt{3}$.
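The freely jointed chain result can be checked by direct sampling. This is a sketch, not a definitive implementation: link directions are drawn uniformly on the sphere by choosing cos(θ) uniformly in [-1, 1], and the names `random_link` and `chain_statistics` are illustrative.

```python
import math
import random

def random_link(a, rng):
    """Vector of length a pointing in a uniformly random direction in 3D."""
    z = rng.uniform(-1.0, 1.0)               # cos(theta) is uniform on a sphere
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (a * s * math.cos(phi), a * s * math.sin(phi), a * z)

def chain_statistics(N, a=1.0, chains=2000, seed=1):
    """Return (<R_N> as a 3-vector, <R_N^2>) averaged over many independent chains."""
    rng = random.Random(seed)
    mean = [0.0, 0.0, 0.0]
    mean_sq = 0.0
    for _ in range(chains):
        R = [0.0, 0.0, 0.0]
        for _ in range(N):
            link = random_link(a, rng)
            for c in range(3):
                R[c] += link[c]
        for c in range(3):
            mean[c] += R[c] / chains
        mean_sq += sum(x * x for x in R) / chains
    return mean, mean_sq

mean, mean_sq = chain_statistics(N=100)
print("mean end-to-end vector:", [round(m, 2) for m in mean])
print("<R^2> / (N a^2):", round(mean_sq / 100, 2))
```

The mean vector should be small compared with the root mean square size √N·a, and the ratio should be close to 1.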
The link size a is also called the Kuhn length
In a more realistic chain, it’s often more reasonable to define the persistence length ξp over which the orientation
exponentially dephases:
$\langle \vec{r}_i \cdot \vec{r}_j \rangle = e^{-d/\xi_p}$
where d is the distance along the chain, which has total contour length L.
It can be shown that $a = 2\xi_p$
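One way to see the factor of 2 numerically, as a sketch under assumptions not spelled out on the slide: treat the chain as continuous with unit tangent vectors whose correlation decays as exp(-s/ξp). Then ⟨R²⟩ = 2∫₀ᴸ (L − s) exp(−s/ξp) ds, which for L much longer than ξp approaches 2ξp·L. Matching this to N a² with N = L/a gives a = 2ξp. The values of ξp and L are illustrative.

```python
import math

def mean_square_end_to_end(L, xi_p, steps=100_000):
    """<R^2> = 2 * integral_0^L (L - s) exp(-s / xi_p) ds, evaluated with the midpoint rule."""
    ds = L / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * ds
        total += (L - s) * math.exp(-s / xi_p)
    return 2.0 * total * ds

xi_p = 50.0            # nm, assumed persistence length (dsDNA-like)
L = 100 * xi_p         # nm, contour length much longer than xi_p
a = 2 * xi_p           # Kuhn length being tested

# For a freely jointed chain with N = L / a links, N a^2 = L a.
ratio = mean_square_end_to_end(L, xi_p) / (L * a)
print(f"<R^2> / (L a) = {ratio:.4f}")
```

The ratio tends to 1 as L/ξp grows; the small shortfall at finite length is the end correction of the continuous chain.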
By definition then, the DLS measured radius is the radius of a hypothetical hard sphere
that diffuses with the same speed as the particle under examination. This definition is
somewhat problematic with regard to visualization however, since hypothetical hard
spheres are non-existent. In practice, macromolecules in solution are non-spherical,
dynamic (tumbling), and solvated. As such, the radius calculated from the diffusional
properties of the particle is indicative of the apparent size of the dynamic
hydrated/solvated particle. Hence the terminology, ‘hydrodynamic’ radius.
A comparison of the hydrodynamic radius to other types of radii can be shown using
lysozyme as an example (see Figure 2). From the crystallographic structure, lysozyme
can be described as a 26 x 45 Å ellipsoid with an axial ratio of 1.73. The molecular
weight of the protein is 14.7 kDa, with a partial specific volume or inverse density of
0.73 mL/g. The radius of gyration (Rg) is defined by the expression given below, where
mi is the mass of the ith atom in the particle and ri is the distance from the center of mass
to the ith particle. RM is the equivalent radius of a sphere with the same mass and particle
specific volume as lysozyme, and RR is the radius established by rotating the protein
about the geometric center.
$R_g^2 = \frac{\sum_i m_i r_i^2}{\sum_i m_i}$
The end-to-end distance is probably not the best measure of the “size” of a random coil.
The radius of gyration is defined as $R_g \equiv \sqrt{\langle (R_j - \langle R \rangle)^2 \rangle_j}$
For a random coil, it works out that $R_g = \sqrt{N/6}\; l$, where l is the link length (a above).
This is what is measured in Small Angle X-ray Scattering (SAXS)
It’s closely related to the hydrodynamic radius, which is what is measured in diffusion-based methods
Rg: radius of gyration
RH: hydrodynamic radius
(the radius of a sphere that would have the same drag
coefficient or diffusion constant)
RM: equivalent mass radius
the radius of solid sphere that would have the same
mass and density
RR: radius of rotation
the maximum radius if you rotated the molecule
about its COM
Figure 2: Comparison of hydrodynamic radius (RH) to other radii for lysozyme.
It is instructive to note here, that RM is the hypothetical radius for a hard sphere with the
same mass and density as lysozyme. One might expect then, to see a closer correlation of
RM with RH. Remember however, that RH is the hydro-dynamic radius, which includes
both solvent (hydro) and shape (dynamic) effects.
Like in the random walk, there is a distribution of end-to-end distances:
$p(r/a) = \frac{1}{a} \left(\frac{3}{2\pi N}\right)^{3/2} 4\pi (r/a)^2\, e^{-3(r/a)^2/2N}$
PBoC equation 8.36 for p(r) on p. 297.
Note the difference with equation 8.34 for p(r).
[Figure: p(r/a) versus r/a for N = 25, 100, 400 and 1600]
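A quick numerical check of the end-to-end distribution for a random coil with N links of length a, written as a sketch: it integrates p(r) with the midpoint rule to confirm that it is normalized and that ⟨r²⟩ = N a². The function names and the integration cutoff are illustrative choices.

```python
import math

def p_end_to_end(r, N, a=1.0):
    """End-to-end distance distribution of a 3D random coil with N links of length a."""
    u = r / a
    prefactor = (1.0 / a) * (3.0 / (2.0 * math.pi * N)) ** 1.5
    return prefactor * 4.0 * math.pi * u * u * math.exp(-3.0 * u * u / (2.0 * N))

def moments(N, a=1.0, steps=20_000):
    """Return (integral of p, integral of r^2 p) using the midpoint rule."""
    r_max = 10.0 * math.sqrt(N) * a       # the tail beyond this is negligible
    dr = r_max / steps
    norm = 0.0
    mean_sq = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        w = p_end_to_end(r, N, a) * dr
        norm += w
        mean_sq += w * r * r
    return norm, mean_sq

for N in (25, 100, 400, 1600):
    norm, mean_sq = moments(N)
    peak = math.sqrt(2.0 * N / 3.0)       # most probable r/a, where d/dr of r^2 exp(...) vanishes
    print(f"N = {N:4d}  norm = {norm:.6f}  <r^2>/(N a^2) = {mean_sq / N:.6f}  peak r/a = {peak:.1f}")
```

The peak position grows like √N, which is why the curves for larger N shift to the right and flatten.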
The distribution of chain positions is quite soft (no sharp cutoff at Rg).
Most proteins do not look like this unless very harshly denatured (molar concentrations of GdHCl or drastic pH shifts)
Very good model for DNA (100s to 1000s of bp)
large-scale looping of double-stranded (ds) DNA (100s to 1000s of bp)
for dsDNA, the persistence length ξp is about 50 nm (150 bp)
small-scale looping of single-stranded (ss) DNA (much less stiff)
for ssDNA, the persistence length ξp is about 3 nm (10 bases)
describes DNA handles used in optical traps
describes some DNA size effects in gels
A self-avoiding or self-repelling coil is larger; a self-attracting coil is smaller
θ solvents compensate for these effects by matching the Rg of a real protein to that of a random coil of the same
length. They generally do not match the entire distribution p(R), though.
Size of genomes
The linear size of a genome is rarely relevant.
The Rg of most genomes is pretty large, so most cells have to compress their DNA
... which means the DNA explodes out when the cell ruptures.
[Figure: Rg (μm) versus number of base pairs, from 10^3 to 10^9, with labels for lambda phage, yeast chr. 3, fly chr. 4, E. coli, worm chr. 1 and human chr. 1]
Figure 8.5: Size of genomic DNA in solution. Plot of the average size of a DNA
molecule in solution as a function of the number of base pairs using the random
walk model. The labels correspond to particular chromosomes from viruses,
bacteria, yeast, flies, worms and humans.
Figure 8.6: Illustration of the spatial extent of a bacterial genome which has
escaped the bacterial cell. The expanded region in the figure shows a small
segment of the DNA and has a series of arrows on the DNA, each of which have
a length equal to the persistence length in order to give a sense of the scale over
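The genome sizes in Figure 8.5 can be estimated from the random coil formula Rg = √(N/6)·a. This is a rough sketch: it takes the dsDNA persistence length of 50 nm (150 bp) quoted earlier, sets the link length to a = 2ξp, and uses approximate genome lengths of 48.5 kbp for lambda phage and 4.6 Mbp for E. coli, which are assumptions not stated on the slides.

```python
import math

XI_P_NM = 50.0          # dsDNA persistence length in nm
BP_PER_XI_P = 150.0     # base pairs per persistence length

def radius_of_gyration_um(n_bp):
    """Random coil estimate of Rg in micrometers for a dsDNA of n_bp base pairs."""
    a_nm = 2.0 * XI_P_NM                  # Kuhn length a = 2 xi_p
    bp_per_link = 2.0 * BP_PER_XI_P       # base pairs in one Kuhn length
    n_links = n_bp / bp_per_link
    return math.sqrt(n_links / 6.0) * a_nm / 1000.0

# Approximate genome lengths in base pairs (assumed values for illustration).
genomes = {"lambda phage": 48.5e3, "E. coli": 4.6e6}

for name, n_bp in genomes.items():
    print(f"{name}: Rg ~ {radius_of_gyration_um(n_bp):.2f} um")
```

Under these assumptions the E. coli estimate comes out at several micrometers, which is larger than the cell itself and illustrates why the DNA must be compacted.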
Entropic force of DNA
Let’s treat DNA as a 1D random walk
Choose h = a = 2ξp and N = L/h = L/2ξp, so that Nh equals the contour length L.
Recall that the end-to-end displacement x has zero average and $N^{1/2} h$ width, with a
distribution
$p(x; N) \propto e^{-x^2/2Nh^2}$
Since W(x) is proportional to p(x), the free energy of the DNA is
$G(x, T) = G_0(T) + \frac{1}{2} \left(\frac{k_B T}{N h^2}\right) x^2$
Therefore it has an effective spring constant $k = \frac{k_B T}{N h^2}$,
either by inspection or from $\partial G/\partial x = f_x$
If you pull on DNA with a force $f_0$, it will stretch a distance $x = \frac{f_0}{k} = L \frac{f_0 h}{k_B T}$
PBoC represents this as weights $mg = f_0$ on either side of the “DNA”
This is the distance that minimizes the free energy function
$G(f, T) = G_0(T) + \frac{1}{2} \left(\frac{k_B T}{N h^2}\right) x^2 - f x$
We will see this again many times: letting a system change its size s to equilibrate against a
force $f_s$ makes the free energy pick up a linear term $-f_s s$.
This breaks down eventually, since you can’t have an extension greater than the contour
length L.
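The minimization can be checked numerically. This is a sketch under assumed numbers: kBT ≈ 4.1 pN·nm near room temperature, h = 100 nm, N = 165 links, and a pulling force of 0.01 pN. The constant G0(T) is dropped because it does not move the minimum, and the ternary search helper is an illustrative choice that works for any convex function.

```python
KBT = 4.1   # pN nm, thermal energy near room temperature (assumed)

def free_energy(x, f, N, h, kBT=KBT):
    """Harmonic entropic term plus the linear term -f x; the constant G0(T) is omitted."""
    k = kBT / (N * h**2)                  # effective spring constant
    return 0.5 * k * x**2 - f * x

def minimize_convex(func, lo, hi, iterations=200):
    """Ternary search for the minimum of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

N, h, f0 = 165, 100.0, 0.01               # links, nm per link, force in pN
L = N * h                                 # contour length in nm

x_numeric = minimize_convex(lambda x: free_energy(x, f0, N, h), 0.0, L)
x_formula = f0 * N * h**2 / KBT           # x = f0 / k = L f0 h / kBT

print(f"numerical minimum: {x_numeric:.2f} nm, formula: {x_formula:.2f} nm, L = {L:.0f} nm")
```

With these numbers the extension is a fraction of L. Raising f0 by a factor of ten would push the formula past L, which is where the harmonic approximation stops making sense.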
The normal distribution was itself an approximation to the binomial distribution, valid only in the center.
PBoC goes back to the binomial distribution and derives (p. 315)
$x = L \tanh\left(\frac{f h}{k_B T}\right)$
This agrees with the expression on the previous slide for small f.
The exact solution (taking into account the other two dimensions) gives another correction
(equation 8.81):
$x = L \left[\coth(f h/k_B T) - (f h/k_B T)^{-1}\right]$
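The three force-extension expressions can be compared side by side. This is a sketch with assumed values kBT ≈ 4.1 pN·nm and h = 100 nm, and with the extension reported as a fraction of L. For small arguments coth(u) − 1/u ≈ u/3, so the code switches to that series below a small cutoff to avoid subtracting two large numbers.

```python
import math

KBT = 4.1   # pN nm, thermal energy near room temperature (assumed)

def x_linear(f, L, h, kBT=KBT):
    """Harmonic (Gaussian) result: x = L f h / kBT."""
    return L * f * h / kBT

def x_tanh(f, L, h, kBT=KBT):
    """1D binomial result: x = L tanh(f h / kBT)."""
    return L * math.tanh(f * h / kBT)

def x_langevin(f, L, h, kBT=KBT):
    """3D result: x = L [coth(u) - 1/u] with u = f h / kBT."""
    u = f * h / kBT
    if abs(u) < 1e-4:
        return L * u / 3.0                # leading term of the series for small u
    return L * (1.0 / math.tanh(u) - 1.0 / u)

L, h = 1.0, 100.0                         # extension in units of L; h in nm
for f in (0.001, 0.01, 0.1, 1.0, 10.0):   # force in pN
    print(f"f = {f:6.3f} pN  linear = {x_linear(f, L, h):8.3f}  "
          f"tanh = {x_tanh(f, L, h):.3f}  langevin = {x_langevin(f, L, h):.3f}")
```

At low force the tanh curve follows the linear law, and the 3D curve is softer by a factor of 3. At high force the linear law overshoots L while the other two saturate at it.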
A nice summary of the force-extension curve of a random coil under various models: