Supplementary Material

General Transfer Matrix Formalism to Calculate DNA-Protein-Drug Binding in Gene Regulation: Application to $O_R$ Operator of Phage λ

Vladimir B. Teif

Institute of Bioorganic Chemistry, Belarus National Academy of Sciences, Kuprevich 5/2, 220141, Minsk, Belarus. E-mail: [email protected], Tel: +375 29 706 94 79

Table S1. Enumeration of states of an elementary DNA unit for the basic competitive model. A DNA unit may be either free from proteins or covered by a protein bound to DNA. There are $f$ different types of protein-DNA complexes, $g = 1, \ldots, f$. A $g$-type complex involves $m_g$ DNA units. The position of a DNA unit inside the complex is numbered from left to right ($1, \ldots, m_g$). The first unit in the complex is assigned the binding constant $K_{ng}$, where $n$ is the number of the DNA unit. The statistical weight of the last unit in the complex depends on the type and the presence of the protein at the next DNA unit. The protein-protein contact is characterized by the contact cooperativity parameter $w = w(0, g_n, g_{n+1})$. The weight of a DNA unit inside the binding site is equal to one if the $n$th unit in state $i$ is followed by the $(n+1)$th unit in state $i+1$; otherwise the weight is equal to zero.
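| Number of state of the $n$th unit, $i$ | Type of complex | Position of unit | Statistical weight |
|---|---|---|---|
| $1$ | 1 | $1$ | $K_{n1}$ |
| … | 1 | … | … |
| $m_1$ | 1 | $m_1$ | $w(0, 1, g_{n+1})$ |
| … | … | … | … |
| $m_1 + \ldots + m_{g-1} + 1$ | $g$ | $1$ | $K_{ng}$ |
| … | $g$ | … | … |
| $m_1 + \ldots + m_g$ | $g$ | $m_g$ | $w(0, g, g_{n+1})$ |
| … | … | … | … |
| $m_1 + \ldots + m_{f-1} + 1$ | $f$ | $1$ | $K_{nf}$ |
| … | $f$ | … | … |
| $m_1 + \ldots + m_f$ | $f$ | $m_f$ | $w(0, f, g_{n+1})$ |
| $m_1 + \ldots + m_f + 1$ | free from proteins | | $1$ |

The algorithm of the transfer matrix construction for the basic competitive model

According to Table S1, the nonzero elements of the transfer matrix $Q_n(i, j)$ corresponding to the $n$th DNA unit may be constructed using the following equations, where $R$ denotes the number of the free state (the last state, $m_1 + \ldots + m_f + 1$, in Table S1).

A free DNA unit followed by a free unit:
$$i = j = R; \quad n = 1, \ldots, N: \quad Q_n(i, j) = 1$$

A free unit followed by a $g$-type protein:
$$i = R; \quad g = 1, \ldots, f, \quad j = \sum_{k=1}^{g} m_k - m_g + 1; \quad n = 1, \ldots, N - (m_g - 1): \quad Q_n(i, j) = 1$$

A $g$-type protein followed by a free unit:
$$g = 1, \ldots, f; \quad i = \sum_{k=1}^{g} m_k; \quad j = R; \quad n = m_g, \ldots, N: \quad Q_n(i, j) = 1$$

A $g_1$-type protein followed by a $g_2$-type protein:
$$g_1 = 1, \ldots, f; \quad i = \sum_{k=1}^{g_1} m_k; \quad g_2 = 1, \ldots, f, \quad j = \sum_{k=1}^{g_2} m_k - m_{g_2} + 1; \quad n = m_{g_1}, \ldots, N - (m_{g_2} - 1): \quad Q_n(i, j) = w(0, g_1, g_2)$$

The 1st unit inside a $g$-type protein binding site:
$$g = 1, \ldots, f; \quad i = \sum_{k=1}^{g} m_k - m_g + 1; \quad n = 1, \ldots, N - (m_g - 1): \quad Q_n(i, j) = K_{ng} c_{0g}$$

The $h$th unit inside a $g$-type protein binding site ($h > 1$):
$$g = 1, \ldots, f; \quad i = \sum_{k=1}^{g} m_k - m_g + h; \quad h = 2, \ldots, m_g - 1; \quad n = h, \ldots, N - (m_g - h): \quad Q_n(i, j) = 1$$

In the last two cases $j = i + 1$, since inside a binding site a unit in state $i$ can only be followed by a unit in state $i + 1$ (Table S1).

Here $K_{ng}$ is the binding constant for a protein of type $g$ and a frame of $m_g$ DNA units starting at unit $n$; $c_{0g}$ is the bulk concentration of the protein of type $g$; $w(0, g_1, g_2)$ is the cooperativity parameter for the contact of proteins $g_1$ and $g_2$. In the absence of protein-protein interactions, $w(0, g_1, g_2) = 1$. These equations allow calculations for any number ($f$) of large proteins ($m_g > 1$), which may not extend beyond the DNA ends. The transfer matrices for the other models (e.g. Tables S2 and S3) are constructed analogously to this algorithm.

Table S2. Enumeration of the free states of an elementary DNA unit for the competitive model with long-range interactions. Table S2 should be combined with Table S1 to get the complete list of states. The numbers of states continue from Table S1.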
Each protein $g$ is assigned a maximum interaction length, $V_g$. A gap of $l$ units ($l \le V_g$) between the proteins $g_1$ and $g_2$ is assigned a statistical weight $w(l, g_1, g_2)$. Gaps longer than $V_g$ between the proteins $g_1$ and $g_2$ are assigned the weights $w(0 \ldots V_g + 1, 0, g_2) = 1$ and $w(0 \ldots V_g + 1, g_1, 0) = 1$.

| State description | Stat. weight |
|---|---|
| a unit at the left free DNA end | $1$ |
| a unit at the right free DNA end | $1$ |
| … | … |
| $g_1$-$l$-$g_2$ gap ($l$ free units before next $g_2$ protein), $l \le V_{g_2}$ | $w(l, g_1, g_2)$ |
| … | … |
| $g_1$-$l$-$g_2$ gap ($l$ free units before next $g_2$ protein), $l > V_{g_2}$ | $1$ |
| … | … |
| a free unit out of protein-protein interactions, not at the DNA ends | $1$ |

[State number column: expressions in $\sum_g m_g$, $V_g$, $l$ and $\max(V_g)$, not recoverable from the source text.]

Table S3. The enumeration of states of an elementary DNA unit for the large loops model. Each row corresponds to one macrostate, which includes several microstates (listed in Tables S1 and S2). The "layer 1" and "layer 2" columns indicate whether a protein is bound ("+") or not ("-") at a given position in a given layer. The "bridge" column indicates whether a protein bridge between DNA segments is formed ("+") or not ("-"). The "1st bridge" column distinguishes the units lying exactly under the first bridge ("="), before it ("<") and after it (">"). The last column lists statistical weights specific to the large loops model, which should be multiplied by the corresponding weights of the simpler model (Tables S1 and S2) to get the final weight matrices. The vertical contact between the $g$-type proteins bound to DNA sites $n$ and $n'$ in the first and second layer is assigned the statistical weight $w = w(g(n), g(n'))$. The contacts of the second layer proteins with DNA are characterized by the binding constants $K_{n'g}$. The first bridge between the DNA segments is assigned the statistical weight $w_{loop}$.

Columns: Macrostate (1 to 8); State description (layer 1, layer 2, bridge, 1st bridge); Additional stat. weights.

Cell entries in source order (row alignment not preserved):
layer 1: + + + + + + -
layer 2, bridge, 1st bridge: < + < + + = + + > > + > < >
Additional stat. weights: $w$ $w$ $K_{n'g}$ $w_{loop}$ $w$ $K_{n'g}$ $w$ -
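The construction rules for the basic competitive model (Table S1 and the algorithm that follows it) can be illustrated with a short sketch. The code below is a minimal illustration, not the author's implementation. It makes several assumptions that are not stated in this supplement: the function names `build_transfer_matrices` and `partition_function` are hypothetical; the inputs `K`, `c0` and `w` are plain nested lists; inside a binding site the nonzero element is taken at $j = i + 1$; and the partition function is obtained by multiplying the matrices $Q_1 \ldots Q_N$ between a row and a column vector of ones.

```python
def build_transfer_matrices(N, m, K, c0, w):
    """Build Q_1..Q_N for the basic competitive model (sketch).

    N  : number of DNA units
    m  : m[g] = site size of protein type g (assumed > 1, as in the text)
    K  : K[n-1][g] = binding constant for a frame of m[g] units starting at unit n
    c0 : c0[g] = bulk concentration of protein type g
    w  : w[g1][g2] = contact cooperativity w(0, g1, g2)
    State numbers are 1-based as in Table S1; list indices are 0-based.
    """
    if any(mg < 2 for mg in m):
        raise ValueError("this sketch assumes m_g > 1 for every protein type")
    f = len(m)
    ends, total = [], 0
    for mg in m:                      # ends[g] = m_1 + ... + m_g
        total += mg
        ends.append(total)
    R = total + 1                     # number of the free state
    matrices = []
    for n in range(1, N + 1):
        q = [[0.0] * R for _ in range(R)]
        q[R - 1][R - 1] = 1.0         # free unit followed by a free unit
        for g in range(f):
            first, last = ends[g] - m[g] + 1, ends[g]
            if n <= N - (m[g] - 1):
                q[R - 1][first - 1] = 1.0              # free -> first unit of g
                q[first - 1][first] = K[n - 1][g] * c0[g]  # first unit of g
            if n >= m[g]:
                q[last - 1][R - 1] = 1.0               # g followed by free unit
                for g2 in range(f):
                    if n <= N - (m[g2] - 1):           # g followed by g2
                        first2 = ends[g2] - m[g2] + 1
                        q[last - 1][first2 - 1] = w[g][g2]
            for h in range(2, m[g]):                   # h = 2 .. m_g - 1
                if h <= n <= N - (m[g] - h):
                    i = first + h - 1                  # state of the h-th unit
                    q[i - 1][i] = 1.0
        matrices.append(q)
    return matrices


def partition_function(matrices):
    """Z = (1 ... 1) * Q_1 * ... * Q_N * (1 ... 1)^T (assumed convention)."""
    R = len(matrices[0])
    v = [1.0] * R
    for q in reversed(matrices):      # apply Q_N first, then Q_{N-1}, ...
        v = [sum(q[i][j] * v[j] for j in range(R)) for i in range(R)]
    return sum(v)


# Usage: one protein type covering 2 units on a 4-unit DNA, K*c0 = 2, w = 3.
N, x, coop = 4, 2.0, 3.0
Q = build_transfer_matrices(N, [2], [[x]] * N, [1.0], [[coop]])
print(partition_function(Q))  # 1 + 3x + w x^2 = 19.0
```

For this small case the result can be checked by direct enumeration: one empty configuration, three positions for a single protein, and one configuration with two contacting proteins, giving $Z = 1 + 3x + w x^2$ with $x = K c_0$.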