Supplementary Material
General Transfer Matrix Formalism to Calculate DNA-Protein-Drug Binding in Gene
Regulation: Application to OR Operator of Phage λ
Vladimir B. Teif
Institute of Bioorganic Chemistry, Belarus National Academy of Sciences,
Kuprevich 5/2, 220141, Minsk, Belarus, E-mail: [email protected], Tel: +375 29 706 94 79
Table S1. Enumeration of states of an elementary DNA unit for the basic competitive model. A
DNA unit may be either free from proteins or covered by a protein bound to DNA. There are f
different types of protein-DNA complexes, g = 1, …, f. A g-type complex covers m_g DNA units.
The position of a DNA unit inside the complex is numbered from left to right (1, …, m_g). The
first unit in the complex is assigned the binding constant K_ng, where n is the number of the
DNA unit. The statistical weight of the last unit in the complex depends on the type and the
presence of a protein at the next DNA unit. The protein-protein contact is characterized by the
contact cooperativity parameter w = w(0, g_n, g_n+1). The weight of a DNA unit inside the
binding site is equal to one if the n-th unit in state i is followed by the (n+1)-st unit in
state i+1; otherwise the weight is equal to zero.
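State number of the n-th unit, i | Type of complex | Position of unit | Statistical weight
---------------------------------+-----------------+------------------+-------------------
1                                | 1               | 1                | K_n1
…                                | 1               | …                | …
m_1                              | 1               | m_1              | w(0, 1, g_n+1)
…                                | …               | …                | …
m_1 + … + m_(g-1) + 1            | g               | 1                | K_ng
…                                | g               | …                | …
m_1 + … + m_g                    | g               | m_g              | w(0, g, g_n+1)
…                                | …               | …                | …
m_1 + … + m_(f-1) + 1            | f               | 1                | K_nf
…                                | f               | …                | …
m_1 + … + m_f                    | f               | m_f              | w(0, f, g_n+1)
m_1 + … + m_f + 1                | free from proteins                 | 1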
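The state numbering of Table S1 can be sketched in a few lines of Python. This is only an illustration: the function names (`state_index`, `free_state`, `describe_state`) and the representation of the site sizes as a list `m = [m_1, …, m_f]` are assumptions made here, not part of the original material.

```python
def state_index(g, h, m):
    """1-based state number of the h-th unit (1..m_g) inside a g-type complex (g = 1..f).

    m is the list [m_1, ..., m_f] of site sizes (hypothetical representation).
    """
    if not 1 <= g <= len(m) or not 1 <= h <= m[g - 1]:
        raise ValueError("g or h out of range")
    # States of all complexes of types 1..g-1 come first, then position h.
    return sum(m[:g - 1]) + h


def free_state(m):
    """State number of a protein-free unit: m_1 + ... + m_f + 1."""
    return sum(m) + 1


def describe_state(i, m):
    """Inverse mapping: ('bound', g, h) for a covered unit, ('free',) otherwise."""
    if i < 1 or i > sum(m) + 1:
        raise ValueError("state number out of range")
    if i == sum(m) + 1:
        return ("free",)
    offset = 0
    for g, mg in enumerate(m, start=1):
        if i <= offset + mg:
            return ("bound", g, i - offset)
        offset += mg


# Example: two protein types covering 3 and 2 DNA units.
m = [3, 2]
print(state_index(2, 1, m))   # 4: first unit of a type-2 complex
print(free_state(m))          # 6: the protein-free state
```

With `m = [3, 2]` the six states are the three positions inside a type-1 complex, the two positions inside a type-2 complex, and the free state, in that order.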
The algorithm of the transfer matrix construction for the basic competitive model
According to Table S1, the nonzero elements of the transfer matrix Qn(i, j) corresponding to the
n-th DNA unit may be constructed using the following equations:
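A free DNA unit followed by a free unit:
$$i = j = R;\quad n = 1, \dots, N:\quad Q_n(i, j) = 1$$
A free unit followed by a g-type protein:
$$i = R;\quad g = 1, \dots, f;\quad j = \sum_{k=1}^{g} m_k - m_g + 1;\quad n = 1, \dots, N - (m_g - 1):\quad Q_n(i, j) = 1$$
A g-type protein followed by a free unit:
$$g = 1, \dots, f;\quad i = \sum_{k=1}^{g} m_k;\quad j = R;\quad n = m_g, \dots, N:\quad Q_n(i, j) = 1$$
A g1-type protein followed by a g2-type protein:
$$g_1 = 1, \dots, f;\quad i = \sum_{k=1}^{g_1} m_k;\quad g_2 = 1, \dots, f;\quad j = \sum_{k=1}^{g_2} m_k - m_{g_2} + 1;\quad n = m_{g_1}, \dots, N - (m_{g_2} - 1):\quad Q_n(i, j) = w(g_1, g_2)$$
First unit inside a g-type protein binding site:
$$g = 1, \dots, f;\quad i = \sum_{k=1}^{g} m_k - m_g + 1;\quad n = 1, \dots, N - (m_g - 1):\quad Q_n(i, j) = K_{ng} c_{0g}$$
h-th unit inside a g-type protein binding site (h > 1):
$$g = 1, \dots, f;\quad i = \sum_{k=1}^{g} m_k - m_g + h;\quad h = 2, \dots, m_g - 1;\quad n = h, \dots, N - (m_g - h):\quad Q_n(i, j) = 1$$
In these equations R is the number of the protein-free state (m_1 + … + m_f + 1 in Table S1), and
for units inside a binding site the nonzero element is the one with j = i + 1 (see Table S1).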
Here Kng is the binding constant for a protein of type g and a frame of mg DNA units starting at
unit n. c0g is the bulk concentration of the protein of type g. w(0, g1, g2) is the cooperativity
parameter for the contact of proteins g1 and g2. In the absence of protein-protein interactions
w(0, g1, g2) = 1.
These equations allow calculations for any number f of types of large proteins (mg > 1), which
are not allowed to overhang the DNA ends. The transfer matrices for the other models (e.g. Tables
S2 and S3) are constructed analogously to this algorithm.
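The construction rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the function names, the nested-list data layout, and the use of all-ones boundary vectors to sum the matrix product into a partition function are choices made here (the boundary vectors are not specified in this section). The sketch requires every m_g ≥ 2, in line with the restriction to large proteins, and takes j = i + 1 for units inside a binding site as in Table S1.

```python
def build_transfer_matrices(N, m, K, c0, w):
    """Return the list [Q_1, ..., Q_N] of R x R matrices (lists of lists), R = sum(m) + 1.

    N  : number of DNA units
    m  : site sizes [m_1, ..., m_f], each >= 2 (assumption of this sketch)
    K  : K[g][n-1] = binding constant of type g+1 for a site starting at unit n
    c0 : c0[g] = bulk concentration of protein type g+1
    w  : w[g1][g2] = contact cooperativity of type g1+1 followed by type g2+1
    """
    f = len(m)
    R = sum(m) + 1                                  # 1-based number of the free state
    first = [sum(m[:g]) + 1 for g in range(f)]      # 1-based state of the first unit
    last = [sum(m[:g + 1]) for g in range(f)]       # 1-based state of the last unit
    matrices = []
    for n in range(1, N + 1):
        q = [[0.0] * R for _ in range(R)]
        q[R - 1][R - 1] = 1.0                       # free unit followed by a free unit
        for g in range(f):
            mg = m[g]
            if n <= N - (mg - 1):
                q[R - 1][first[g] - 1] = 1.0        # free unit followed by g-type protein
                # first unit of the site, followed by its second unit (j = i + 1)
                q[first[g] - 1][first[g]] = K[g][n - 1] * c0[g]
            if n >= mg:
                q[last[g] - 1][R - 1] = 1.0         # g-type protein followed by a free unit
                for g2 in range(f):
                    if n <= N - (m[g2] - 1):        # g-type followed by g2-type protein
                        q[last[g] - 1][first[g2] - 1] = w[g][g2]
            for h in range(2, mg):                  # inner units h = 2 .. m_g - 1
                if h <= n <= N - (mg - h):
                    i = first[g] + h - 1            # 1-based state of the h-th unit
                    q[i - 1][i] = 1.0
        matrices.append(q)
    return matrices


def partition_function(matrices):
    """Z = (1 ... 1) Q_1 Q_2 ... Q_N (1 ... 1)^T, with all-ones boundary vectors (assumed)."""
    size = len(matrices[0])
    v = [1.0] * size
    for q in matrices:
        v = [sum(v[i] * q[i][j] for i in range(size)) for j in range(size)]
    return sum(v)


# Example: N = 3 units, one protein type covering 2 units, K * c0 = 1 everywhere.
Q = build_transfer_matrices(3, [2], [[4.0, 4.0, 4.0]], [0.25], [[1.0]])
print(partition_function(Q))   # 3.0 = 1 (bare DNA) + 2 positions of a single protein
```

The allowed ranges of n in the rules are what keep proteins from overhanging the DNA ends: a path that would start a site too close to the right end meets a zero element at the first unit of that site, so it contributes nothing to the product.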
Table S2. Enumeration of the free states of an elementary DNA unit for the competitive model
with long-range interactions. Table S2 should be combined with Table S1 to get the complete list
of states. The numbers of states continue from Table S1. Each protein g is assigned a maximum
interaction length, Vg. A gap of l units (l ≤ Vg) between the proteins g1 and g2 is assigned a
statistical weight w(l, g1, g2). Gaps longer than Vg between the proteins g1 and g2 are assigned
the weights w(0…Vg+1, 0, g2) = 1 and w(0…Vg+1, g1, 0) = 1.
State number                                       | State description                                 | Stat. weight
---------------------------------------------------+---------------------------------------------------+-------------
Σ_{g=1..f} m_g + 1                                 | a unit at the left free DNA end                   | 1
Σ_{g=1..f} m_g + 2                                 | a unit at the right free DNA end                  | 1
…                                                  | …                                                 | …
Σ_{g=1..f} m_g + 2 + Σ_{g=1..g2-1} V_g + l         | g1-l-g2 gap (l free units before the next g2      | w(l, g1, g2)
                                                   | protein), l ≤ V_g2                                |
…                                                  | …                                                 | …
Σ_{g=1..f} m_g + Σ_{g=1..f} V_g + 2 + l            | g1-l-g2 gap (l free units before the next g2      | 1
                                                   | protein), l > V_g2                                |
…                                                  | …                                                 | …
Σ_{g=1..f} m_g + Σ_{g=1..f} V_g + 2 + max(V_g) + 1 | a free unit outside the range of protein-protein  | 1
                                                   | interactions, not at the DNA ends                 |
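The weight rule for gaps between bound proteins can be illustrated with a short lookup function. This is a sketch under assumptions made here: the name `gap_weight`, the dictionary layout of the weights, and the default of 1 for pairs without a tabulated interaction are hypothetical, and the cutoff is taken as the interaction length of the right-hand protein g2, following the condition l ≤ V_g2 in Table S2.

```python
def gap_weight(l, g1, g2, V, w):
    """Statistical weight of a gap of l free units between protein g1 (left) and g2 (right).

    V : dict mapping protein type -> maximum interaction length V_g
    w : dict mapping (l, g1, g2) -> w(l, g1, g2); missing entries are treated as 1
        (no interaction), which is an assumption of this sketch
    """
    if l < 0:
        raise ValueError("gap length must be non-negative")
    if l <= V[g2]:
        return w.get((l, g1, g2), 1.0)   # within interaction range
    return 1.0                           # gap longer than the interaction length


# Example with two protein types; the numbers are arbitrary illustrations.
V = {1: 2, 2: 1}
w = {(0, 1, 2): 5.0, (1, 1, 2): 2.0}
print(gap_weight(1, 1, 2, V, w))   # 2.0: one free unit, within range V[2] = 1
print(gap_weight(2, 1, 2, V, w))   # 1.0: gap exceeds V[2]
```

A gap of l = 0 corresponds to the contact cooperativity w(0, g1, g2) of the basic competitive model in Table S1.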
Table S3. The enumeration of states of an elementary DNA unit for the large loops model. Each
row corresponds to one macrostate, which includes several microstates (listed in Tables S1 and
S2). The “layer 1” and “layer 2” columns indicate whether a protein is bound (“+”) or not (“-”) at
a given position in a given layer. The “bridge” column indicates whether a protein bridge
between DNA segments is formed (“+”) or not (“-”). The “1st bridge” column distinguishes the
units lying exactly under the first bridge (“=”), before it (“<”) and after it (“>”). The
last column lists statistical weights specific to the large loops model, which should be multiplied
by the corresponding weights of the simpler model (Tables S1 and S2) to get the final weight
matrices. The vertical contact between the g-type proteins bound to DNA sites n and n’ in the
first and second layer is assigned the statistical weight w = w(g(n), g(n’)). The contacts of the
second layer proteins with DNA are characterized by the binding constants Kn’g. The first bridge
between the DNA segments is assigned the statistical weight wloop.
Macrostate: 1, 2, 3, 4, 5, 6, 7, 8
State description, layer 1: +, +, +, +, +, +, -
State description, layer 2 / bridge / 1st bridge (cells in reading order): <, +, <, +, +, =, +, +, >, >, +, >, <, >
Additional stat. weights: w; w · Kn’g · wloop; w · Kn’g; w; -