Design of artificial tubular protein structures in 3D
hexagonal prism lattice under HP model
Arvind Gupta
Ján Maňuch
Mehdi Karimi
Alireza Hadj Khodabakhshi
Arash Rafiey
School of Computing Science
8888 University Drive, Simon Fraser University
Burnaby, BC, V5A 1S6, Canada
Abstract The inverse protein folding problem is
that of designing an amino acid sequence which
has a prescribed native protein fold. This problem
arises in drug design where a particular structure
is necessary to ensure proper protein-protein interactions. In this paper, we show that in the HP
model of Dill on the 3D (hexagonal prism) lattice it is possible to solve this problem for a class
of structures (tubular structures). We also show
that the proteins for the simplest (but arbitrarily
large) members of our class of structures (consisting of one or two tubes) are stable, i.e., fold
uniquely into the desired tubular structure.
Keywords: inverse protein folding, HP model, protein
stability, protein design, hexagonal prism lattice
1 Introduction
It has long been known that protein interactions
depend on their native three-dimensional fold, and
understanding the folding process and determining these
folds is a long-standing problem in molecular biology. Naturally occurring proteins fold so as to minimize total free energy. However, it is not known
how a protein can choose the minimum energy fold
amongst all possible folds [1].
Many forces act on the protein which contribute
to changes in free energy including hydrogen bonding, van der Waals interactions, intrinsic propensities, ion pairing, and hydrophobic interaction. Of
these, the most significant is hydrophobic interaction (see [2] for details). This led Dill to introduce the Hydrophobic-Polar model [3]. Here the
20 amino acids from which proteins are formed are
replaced by two types of monomers: hydrophobic
(H or ‘1’) or polar (P or ‘0’) depending on their
affinity to water. To simplify the problem, the protein is laid out on vertices of a lattice with each
monomer occupying exactly one vertex and neighboring monomers occupying neighboring vertices. The
free energy is minimized when the maximum number of non-neighbor hydrophobic monomers are adjacent in the lattice. Therefore, the “native” folds
are those with the maximum number of such HH
contacts. Even though the HP model is the simplest model of the protein folding process, protein folding in this model is computationally NP-hard, cf. [4] for two- and [5] for three-dimensional square lattices.
In many applications such as drug design, we
are interested in the complement problem to protein folding: inverse protein folding or protein design. A major challenge in designing proteins that
attain a specific native fold is to avoid proteins that
have multiple native folds. We say that a protein
is stable if its native fold is unique. It is generally
believed that all naturally occurring proteins are
stable; however, this is usually not true for random
protein sequences. Extreme examples are proteins
containing only polar monomers in the HP model.
In this case, every fold achieves the lowest free energy.
The inverse protein folding problem involves
starting with a prescribed target fold and designing an amino acid sequence whose native fold is the
target (positive design) and which is stable (negative design). Given the inherent complexity of
this problem, research has focused on a simple HP
model. Even in the HP model, the complexity of
this problem is unknown but conjectured to be NP-hard. Early work on this problem involved heuristics that bury the H monomers in a central core
with the P monomers on the outside [6], find all
possible short sequences and put these together [7],
or perform a sequence evolution, a form of local
search [8]. A relationship between symmetries and
designability of proteins was observed in [9].
Another approach to this problem is a heuristic
sequence design, i.e., design of a sequence fulfilling
easier alternative criteria which is likely to solve the
original inverse protein folding problem. There are
currently two sets of criteria studied, called canonical and grand canonical models, introduced in [10]
and [8], respectively. It has been shown that the
protein sequence design problem can be solved in
polynomial time in the grand canonical model for
both 2D and 3D square lattices, cf. [11], and in
polynomial time for 2D lattices while the problem
is NP-hard for 3D square lattice in the canonical
model, cf. [12]. Note however that design of heuristic sequences does not guarantee that the generated
sequence satisfies the two criteria (positive and negative design) of the inverse protein folding problem.
In [13] a new version of the inverse protein folding problem was considered: instead of a target
fold, a target structure (a connected set of lattice vertices) is given, and the goal is to design
a sequence which would (preferably uniquely) fold
into a structure (picked from a rich class of “constructible” structures) “close” to the target structure. The 2D square lattice was used and it was
shown that all designed proteins fold into corresponding constructible structures. It was also “formally” shown that the proteins for the simplest
(but arbitrarily long) constructible structures fold
uniquely, and conjectured that the same holds for
all constructible structures. Design of stable proteins of arbitrary lengths in the HP model was also
studied in [14] (for 2D square lattice) and in [15]
(for 2D triangular lattice), motivated by a popular
paper of Brian Hayes [16].
Figure 1: Illustration of a small part of the hexagonal prism lattice — the white beads depict all five
neighbors of one vertex in the lattice (depicted with
a black bead).
In this paper, we use a 3-dimensional lattice: the
hexagonal prism lattice. The hexagonal prism lattice is composed by stacking horizontal hexagonal
grids (“honeycomb nets”) on top of each other, cf.
Figure 1. Two facts about this lattice are useful
to us in our construction. First, the lattice has
a relatively low degree (the number of neighbors
of a vertex) of 5. The cubic lattice, for example,
has a degree of 6. This lower degree simplifies our
designs. Second, it was shown in [17] that, relative to its degree, this lattice is remarkably good at
representing a large class of natural protein structures.
Figure 2: An example of (a) a tubular structure
built with 3 tubes; (b) a coiled coil structure formed
by 6 alpha-helices in protein 1AIK.
In this paper, we design a class of structures
(called tubular structures) and corresponding proteins in the 3D hexagonal prism lattice and show
that each protein folds into the corresponding tubular structure. An example of a tubular structure is shown in Figure 2(a), where hydrophobic
monomers are depicted with black beads, and polar ones with white beads. Interestingly, the basic building block of our tubular structures is a
tube which consists of six parallel “alpha helix”-like structures. Similar designs appear in nature as
a coiled coil structural motif in which 2–6 alpha-helices are coiled together, cf. Figure 2(b). Many
coiled coil type proteins are involved in important biological functions such as the regulation of
gene expression, e.g., transcription factors [18, 19].
Although the class of tubular structures is not
rich enough to approximate many target shapes, it
demonstrates the potential for designing complex structures in the hexagonal prism lattice.
We conjecture that the proteins of our tubular structures are stable, and we are able to prove
this formally for an infinite subclass of the simplest
structures (consisting of one or two tubes). Here,
we assume that our proteins are closed chains of
monomers, an assumption similar to that used in [14], i.e.,
that the beginning and the end of the sequence are
adjacent in the lattice. We remark that this property could be ensured in any native fold by replacing
the second and the second-to-last monomers (which
are hydrophobic) with cysteine monomers, which
are hydrophobic as well and tend to pair with each
other to form a strong disulfide bridge (SS-bridge).
Despite the tremendous amount of work on protein design for 2D lattices, as far as we know, this is
the first such result for the 3D lattice. Given that
3D is the realistic setting, we believe that this work
could eventually help in designing actual proteins
with applications to drug design and nanotechnology.
2 Preliminaries
In this section we will review the HP model and
introduce some terminology used in the paper.
2.1 Hydrophobic-polar model
Proteins are chains of monomers where each
monomer is either hydrophobic or polar. We can
represent a protein chain as a binary string $p = p_1 p_2 \ldots p_{|p|}$ in $\{0, 1\}^*$, where “0” represents a polar
monomer and “1” a hydrophobic monomer.
The proteins are folded onto a regular lattice.
A fold of a protein $p$ is an embedding of a path with
$|p|$ vertices into the lattice, i.e., vertices of the path are
mapped into distinct lattice vertices and two consecutive vertices of the path are mapped to lattice
vertices connected by an edge (a peptide bond).
In our 3D HP model we use the hexagonal prism
lattice as a lattice structure. The vertices adjacent to a vertex are called the neighbors of that
vertex. As depicted in Figure 1, each vertex has
5 neighbors: 3 horizontal neighbors lying in the
same hexagonal grid and 2 vertical neighbors lying
above and below the vertex in the parallel hexagonal grids.
A protein will fold into a fold with the minimum
free energy, also called a native fold. In the HP
model, only interactions between adjacent hydrophobic monomers which are not consecutive in the protein (contacts) are considered in
the energy model, with each contact contributing
−1 to the total energy. Hence, a fold with the
lowest free energy corresponds to a fold with the
largest number of HH contacts. Note that there
might be several native folds for a given protein.
A protein with a unique native fold is called a stable
protein.
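The contact count that defines the energy can be sketched in code. The following is only an illustration and not part of the construction in this paper. It assumes a "brick-wall" integer coordinate system (a, b, z) for the hexagonal prism lattice, in which a vertex is joined to (a ± 1, b, z), to (a, b + 1, z) or (a, b − 1, z) depending on the parity of a + b, and to (a, b, z ± 1). All function names are hypothetical.

```python
def neighbors(v):
    """Return the five lattice neighbors of v = (a, b, z).

    Assumed 'brick-wall' embedding of the honeycomb grid: within a plane,
    (a, b) is joined to (a - 1, b), (a + 1, b) and to (a, b + 1) when a + b
    is even, or (a, b - 1) when a + b is odd.
    """
    a, b, z = v
    db = 1 if (a + b) % 2 == 0 else -1
    return [(a - 1, b, z), (a + 1, b, z), (a, b + db, z),
            (a, b, z - 1), (a, b, z + 1)]


def is_fold(path):
    """Check that the path visits distinct vertices and follows lattice edges."""
    distinct = len(set(path)) == len(path)
    connected = all(q in neighbors(p) for p, q in zip(path, path[1:]))
    return distinct and connected


def count_contacts(seq, path, closed=False):
    """Count HH contacts: adjacent 1-vertices that are not chain neighbors."""
    n = len(seq)
    total = 0
    for i in range(n):
        for j in range(i + 2, n):
            if closed and i == 0 and j == n - 1:
                continue  # first and last monomer are bonded in a closed chain
            if seq[i] == "1" and seq[j] == "1" and path[j] in neighbors(path[i]):
                total += 1
    return total


def energy(seq, path, closed=False):
    """Each contact contributes -1 to the total energy."""
    return -count_contacts(seq, path, closed)


# One hexagon in the plane z = 0, listed in the order of a chain of length 6.
HEXAGON = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (1, 1, 0), (0, 1, 0)]
```

For the chain 100001 laid around HEXAGON, the two end monomers are adjacent in the lattice, so `count_contacts("100001", HEXAGON)` finds one contact for the open chain and none when `closed=True`, where the same pair is a peptide bond.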
2.2 Terminology
A lattice vertex containing a hydrophobic (respectively, polar) monomer (in a fold) will be called a
1-vertex (respectively, a 0-vertex). A neighbor of a
vertex $v$ containing $a \in \{0, 1\}$ will be called an $a$-neighbor.
We number the hexagonal grids of the lattice (also
referred to as planes) with integers, and
denote the $i$-th grid by $H_i$. Consider a vertex $x \in H_i$.
We denote the vertical neighbor of $x$ in $H_{i+1}$ (above
$x$) by $x^1$, and recursively, the vertical neighbor of
$x^j$ in $H_{i+j+1}$ by $x^{j+1}$. Similarly, we denote the
neighbor of $x$ in $H_{i-1}$ by $x^{-1}$, and the neighbor of
$x^{-j}$ in $H_{i-j-1}$ by $x^{-j-1}$.
Let $G_x$ be the graph of all 1-vertices in $H_i$ which
are reachable from $x$ by a path of 1-vertices in $H_i$ (we also write $G_x^0 = G_x$).
For $j \ge 1$, let $G_x^j$ be the graph of all vertices in $H_{i+j}$
which have a neighbor in $G_x^{j-1}$, and $G_x^{-j}$ be the
graph of all vertices in $H_{i-j}$ which have a neighbor
in $G_x^{-j+1}$, i.e., $G_x^j$, $j \ne 0$, are vertical copies of the
set $G_x$.
Note that $G_x$ is a planar graph (as $H_i$ is as well).
Let $B_x$ be the boundary cycle of $G_x$, i.e., the set of
vertices of $G_x$ which lie on the outer face of $G_x$. A
component in a fold $F$ is a maximal set of 1-vertices
for which there is a path of 1-vertices between any
pair of them.
2.3 Saturated folds
The proteins used in [13] and the proteins we are
going to use in our design have a special property: the
number of contacts in their native folds is the maximum
possible with respect to the number of hydrophobic
“1” monomers contained in the protein. The following useful observation characterizes native folds of
such proteins.
Observation 1 (Saturated folds). Let
$p \in 0\{0, 1\}^*0$ be a protein, and $F$ be a fold of $p$. If
for every 1-vertex $v$, three out of the five edges incident
with $v$ are contacts, then (a) $F$ is a native fold of
$p$; and (b) any other native fold of $p$ satisfies this
property. We will call a fold satisfying this property
a saturated fold.
The proof of the observation follows by a simple argument that any 1-vertex v can have at most
three contacts since it is connected to exactly two
neighbors with a peptide bond.
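The counting behind this argument can be written out explicitly. In the sketch below, $h(p)$ (the number of 1s in $p$) and $c_F(v)$ (the number of contacts at the 1-vertex $v$ in the fold $F$) are notation introduced only for this illustration.

```latex
% Every 1-vertex v has 5 incident lattice edges, 2 of which are peptide bonds
% (p starts and ends with 0, so no 1 is an end of the chain).
\[
  c_F(v) \le 5 - 2 = 3 \quad \text{for every 1-vertex } v,
  \qquad
  \#\mathrm{contacts}(F) \;=\; \frac{1}{2} \sum_{v} c_F(v) \;\le\; \frac{3}{2}\, h(p).
\]
% A saturated fold attains this bound, hence it is native (part (a)); any other
% native fold must attain it as well, which forces c_F(v) = 3 at every
% 1-vertex (part (b)).
```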
Figure 3: Illustration of a tube with a hydrophobic
core of height 8; the wavy lines at the top and
dashed lines at the bottom represent loops.
3 Tubular structures and their proteins
In this section, we introduce a class of structures
in the hexagonal prism lattice, called tubular structures. A basic building block of tubular structures
is a tube, depicted in Figure 3. A tube consists of
6 identical “alpha helix”-like subfolds of the substring $P_n = (1001)^n$, each forming a $2 \times 2n$ vertical zigzag pattern (“plate”). On one side of the plate
there are only hydrophobic monomers, while on the
other side only polar monomers. Hydrophobic sides
of the plates attach together forming a hexagonal
tube of hydrophobic monomers (the hydrophobic
core). The plates are connected to each other with 6
short loops (3 at the top and 3 at the bottom), each
consisting of only two polar monomers. Thus the
hydrophobic core is completely surrounded by polar
monomers, i.e., the fold is saturated. The complete
protein string for the tube is $T_n = (0 P_n 0)^6$. The
height of the hydrophobic core of the tube $T_n$ is $2n$.
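The tube described above can be reproduced as a small computational sketch. The code below is an illustration under stated assumptions, not the authors' implementation: it assumes a "brick-wall" coordinate system (a, b, z) for the hexagonal prism lattice, places the hydrophobic core on one fixed hexagon (the hypothetical constant RING), and treats the protein as a closed chain. It builds the string $T_n$ and one fold of it, and checks that every hydrophobic monomer has exactly three contacts.

```python
RING = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]  # one hexagon (assumed)


def neighbors(v):
    """Five neighbors of v = (a, b, z) in the assumed brick-wall coordinates."""
    a, b, z = v
    db = 1 if (a + b) % 2 == 0 else -1
    return [(a - 1, b, z), (a + 1, b, z), (a, b + db, z),
            (a, b, z - 1), (a, b, z + 1)]


def tube_sequence(n):
    """T_n = (0 P_n 0)^6 with P_n = (1001)^n."""
    return ("0" + "1001" * n + "0") * 6


def tube_fold(n):
    """Lattice positions of the monomers of T_n, in chain order."""
    path = []
    for k, (a, b) in enumerate(RING):
        # The horizontal neighbor of the core vertex that lies outside the ring.
        outer = next((x, y) for (x, y, _) in neighbors((a, b, 0))[:3]
                     if (x, y) not in RING)
        going_up = (k % 2 == 0)          # consecutive plates alternate direction
        levels = list(range(1, 2 * n + 1))
        if not going_up:
            levels.reverse()
        first, last = (0, 2 * n + 1) if going_up else (2 * n + 1, 0)
        path.append((a, b, first))       # polar monomer ending the previous loop
        for t in range(4 * n):           # zigzag: core, outer, outer, core, ...
            x, y = (a, b) if t % 4 in (0, 3) else outer
            path.append((x, y, levels[t // 2]))
        path.append((a, b, last))        # polar monomer starting the next loop
    return path


def contacts_per_vertex(seq, path):
    """Map each 1-vertex to its number of contacts (closed chain assumed)."""
    n = len(seq)
    index = {v: i for i, v in enumerate(path)}
    result = {}
    for i, v in enumerate(path):
        if seq[i] != "1":
            continue
        count = 0
        for u in neighbors(v):
            j = index.get(u)
            if j is None or seq[j] != "1":
                continue                 # empty vertex or polar monomer
            if (i - j) % n in (1, n - 1):
                continue                 # peptide bond, not a contact
            count += 1
        result[v] = count
    return result


def is_saturated(seq, path):
    """True if every hydrophobic monomer has exactly three contacts."""
    return all(c == 3 for c in contacts_per_vertex(seq, path).values())
```

Under these assumptions, `is_saturated(tube_sequence(n), tube_fold(n))` holds for the values of n tried here, and the total number of contacts is 18n: each of the 12n hydrophobic monomers has three contacts, and each contact is shared by two monomers.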
Two tubes can be connected into one protein
structure as follows. One top loop of the first
tube is overlapped with a bottom loop of the second tube, and the peptide bond between the two polar monomers of each loop is removed, cf.
Figure 4. Thus when traversing the fold of the
first tube, instead of using the top loop (which
was overlapped with a bottom loop of the other
tube), we traverse the whole second tube and continue traversing the first tube from the other end
of the overlapped top loop. The complete protein string for two interconnected tubes is
$T_{n,m} = (0 P_n 0)^2\, 0 P_n\, (0 P_m 0)^6\, P_n 0\, (0 P_n 0)^2$.
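The two protein strings can be generated and compared directly. The sketch below uses hypothetical helper names and treats the proteins as closed chains, so patterns are counted cyclically. It illustrates two facts that follow from the formulas: overlapping one loop of each tube saves two polar monomers, and the substring 101 appears in the two-tube string exactly twice, at the two places where the chain passes from one tube to the other.

```python
def plate(n):
    """P_n = (1001)^n."""
    return "1001" * n


def tube(n):
    """T_n = (0 P_n 0)^6."""
    return ("0" + plate(n) + "0") * 6


def two_tubes(n, m):
    """T_{n,m} = (0 P_n 0)^2 0 P_n (0 P_m 0)^6 P_n 0 (0 P_n 0)^2."""
    first = "0" + plate(n) + "0"
    second = "0" + plate(m) + "0"
    return first * 2 + "0" + plate(n) + second * 6 + plate(n) + "0" + first * 2


def occurrences(seq, pattern):
    """Count (possibly overlapping) occurrences of pattern in the closed chain."""
    doubled = seq + seq[:len(pattern) - 1]   # wrap around the end of the chain
    return sum(doubled.startswith(pattern, i) for i in range(len(seq)))
```

For example, `occurrences(two_tubes(2, 3), "101")` is 2 while `occurrences(tube(2), "101")` is 0, and neither string contains 111.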
Tubular structures are structures constructed as
follows:
• start with a tube;
• repeatedly connect a new tube (as described
above) to the existing structure.
Figure 4: Illustration of interconnection of two
tubes by overlapping their loops: crosses depict a
peptide bond removed from both tubes.
Since the folds of tubular structures are saturated,
by Observation 1, they are native folds of the corresponding proteins (which can be easily reconstructed from the folds).
Unfortunately, due to spatial limitations, only
one of the three top (respectively, bottom) loops of
a tube can connect to another tube, thus all tubular structures form a staircase (which can change
direction and has steps of variable lengths), cf. Figure 2(a). It would be interesting to design more robust connections between two tubes or a different
basic building block which would allow more than
two other blocks to be attached to one block.
4 Stability of basic tubular structures
In what follows we will show that the proteins of
basic tubular structures (structures built from one
or two tubes), $T_n$ and $T_{n,m}$, are stable for any
$n, m \ge 1$.
Let $Q$ be the protein of any basic tubular structure. Let $F$ be an arbitrary native (i.e., saturated)
fold of $Q$. Since $F$ is saturated, each 1-vertex has
at least three 1-neighbors, and its remaining two
neighbors are also occupied by monomers. Since there is
no substring 111 in $Q$, each 1-vertex has at least
one 0-neighbor. Hence, we can classify every 1-vertex $x$ into one of the following five types based on
the position of its 0-neighbor(s), cf. Figure 5:
(a) vh-type: x has one vertical 0-neighbor (on top
or below) and one horizontal 0-neighbor (in the
same hexagonal grid);
(b) vv-type: x has two vertical 0-neighbors;
(c) hh-type: x has two horizontal 0-neighbors;
(d) h-type: x has one horizontal 0-neighbor;
(e) v-type: x has one vertical 0-neighbor.
A 1-vertex of type X will be called an X-vertex.
Figure 5: Five types of possible neighborhood of a
1-vertex x: (a) vh, (b) vv, (c) hh, (d) h and (e) v.
Let a be a 1-vertex with a 0-neighbor b. Observe
that ab or ba is a subsequence of Q, i.e., a and b are
connected with a peptide bond.
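Because every 0-neighbor of a 1-vertex in a saturated fold is one of its two chain neighbors, the number of 0-neighbors of each hydrophobic monomer can be read off the sequence alone. A 1 flanked by 0s on both sides must become a vh-, vv- or hh-vertex, and a 1 with exactly one 0 next to it must become an h- or v-vertex. The following sketch (hypothetical function names, closed chain assumed) counts both kinds for $T_{n,m}$. The counts 24 and $12(m + n - 2)$ are the ones used in the arguments below.

```python
def two_tubes(n, m):
    """T_{n,m} = (0 P_n 0)^2 0 P_n (0 P_m 0)^6 P_n 0 (0 P_n 0)^2, P_k = (1001)^k."""
    p, q = "1001" * n, "1001" * m
    first, second = "0" + p + "0", "0" + q + "0"
    return first * 2 + "0" + p + second * 6 + p + "0" + first * 2


def zero_neighbor_counts(seq):
    """Return (one, two): numbers of 1s with one / two 0s as chain neighbors.

    The chain is treated as closed, so the first and last monomers are
    chain neighbors of each other.
    """
    n = len(seq)
    one = two = 0
    for i, c in enumerate(seq):
        if c != "1":
            continue
        zeros = (seq[i - 1] == "0") + (seq[(i + 1) % n] == "0")
        if zeros == 1:
            one += 1
        elif zeros == 2:
            two += 1
        else:
            raise ValueError("substring 111 found")
    return one, two
```

For instance, `zero_neighbor_counts(two_tubes(2, 3))` returns `(36, 24)`, i.e., 12(2 + 3 − 2) monomers with one polar chain neighbor and 24 with two.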
Lemma 1. There is no vv-vertex in $F$.
Proof. Consider a vv-vertex. Assume it is in a grid
$H_i$. Observe that any finite non-empty set of vertices of one hexagonal grid has at least three edges
going to other vertices of this grid. Consider the set
$S$ of all vv-vertices in $H_i$. For any edge between
the set $S$ and $H_i \setminus S$, there is a distinct substring 101
in $F$. Since there are at most two occurrences of
this substring in $Q$, this can happen only twice, a
contradiction.
Lemma 2. There is no hh-vertex in $F$.
Proof. Let $x$ be an hh-vertex in plane $H_i$. Let $a, b > 0$ be the smallest integers such that $x^a$ and $x^{-b}$ are
not hh-vertices. Note that each of them is a 1-vertex. Since $x^{a-1}$ is an hh-vertex and $x^a$ is not,
$x^a$ has a horizontal 1-neighbor $\bar{x}$ such that $\bar{x}^{-1}$, a
horizontal neighbor of $x^{a-1}$, is a 0-vertex. We have
an occurrence of 101 in $F$. The same is true for
$x^{-b}$, i.e., we have two occurrences of 101 in $F$. No
other occurrence of this substring can occur in $F$.
This implies that vertical neighbors of every 1-vertex in $G_x$ are 1-vertices. Indeed, assume that
there is a 1-vertex in $G_x$ for which this is not true and
take the one closest to $x$, say $z$. W.l.o.g. assume $z^1$
is a 0-vertex. Let $\bar{z} \in G_x$ be a horizontal neighbor
of $z$ closer to $x$, i.e., $\bar{z}^1$ is a 1-vertex. We have
another occurrence of 101 on the path $(\bar{z}^1, z^1, z)$, a
contradiction.
Now, by Lemma 1, each vertex in $G_x$ has at least
one horizontal 0-neighbor, i.e., $G_x$ forms a path
with $x$ as one endpoint. Let $y$ be the other endpoint. Then $y$ is an hh-vertex as well, i.e., there are
another two occurrences of 101 above and below $y$,
a contradiction.
Note that by Lemma 2, for any 1-vertex $x$, $G_x$ is
a 2-connected graph, and hence the boundary $B_x$
of $G_x$ is a proper cycle (each vertex is visited only
once).
Lemma 3. Let $x, y$ be two adjacent 1-vertices in
$F$. Then
(1) If $x$ and $y$ are horizontal neighbors then either
both $x, y$ are of the same type, or one of them
is of vh-type and the other of v-type.
(2) If $x$ and $y$ are vertical neighbors then either
both $x, y$ are of the same type, or one of them
is of vh-type and the other of h-type. Furthermore, the horizontal neighborhoods of $x$ and $y$
are aligned (1-neighbor above 1-neighbor and 0-neighbor above 0-neighbor).
Proof. Suppose $x$ and $y$ are horizontal neighbors
and $x$ is either a vh-vertex or a v-vertex, and $y$ is
an h-vertex. W.l.o.g. assume that $x^1$ is a 0-vertex,
and let $z$ be the horizontal 0-neighbor of $y$. By
Lemma 2, both other horizontal neighbors of $y^1$
(including $z^1$) are 1-vertices. We have a substring
1011 in $F$ (on the path $(z^1, z, y, u)$, where $u$ is a 1-vertex connected to $y$ with a peptide bond) which
does not occur in $Q$, a contradiction. Claim (1)
follows.
Analogously one can show claim (2).
Corollary 1. Let $x$ be a 1-vertex. Then $G_x^1$ ($G_x^{-1}$)
contains either only 0-vertices or only 1-vertices.
Lemma 4. Let $x$ be a 1-vertex in plane $H_i$. If
the boundary cycle $B_x$ contains $K \ge 2$ hexagons
then $G_x$ has at least $K + 6$ vertices of degree 2.
If in addition there are vertices $u, v \in G_x$ sharing
a common 0-neighbor then $G_x$ has at least $K + 7$
vertices of degree 2.
Proof. Let us call a vertex of degree 2 a bivertex.
To show the first part of the claim, we will show
that the number of bivertices in $B_x$ is at least $K + 6$. We can assume that all vertices inside of the
boundary cycle $B_x$ belong to $G_x$, i.e., are 1-vertices,
as this would only decrease the number of boundary
bivertices. We proceed by induction.
If $B_x$ contains exactly 2 hexagons then it is easy
to check that $B_x$ contains exactly 8 bivertices. Suppose $B_x$ contains $K > 2$ hexagons. There is a
hexagon $X$ touching the boundary cycle that has at
least three bivertices. If we remove all bivertices of
$X$, we obtain a new graph $G'$ whose boundary cycle
contains only $K - 1$ hexagons ($X$ was removed). By
the induction hypothesis, $G'$ has at least $K + 5$ bivertices
on the boundary cycle. We have removed at least
three bivertices to obtain $G'$, and during this process exactly two bivertices become part of the boundary cycle of $G'$. Therefore $B_x$ has at least $(K + 5) - 2 + 3 = K + 6$
bivertices.
Now, assume that there are vertices $u, v \in G_x$
sharing a common 0-neighbor $z$. Consider the graph
$G' = G_x \cup \{z\}$. By adding $z$, at least two bivertices ($u$ and $v$) have degree 3 in $G'$, and we added
at most one bivertex ($z$). The number of hexagons
contained in the boundary cycle could not decrease.
By the first part of the claim, $G'$ has at least $K + 6$
bivertices, and hence $G_x$ has at least $K + 7$ bivertices.
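The bound in the first part of Lemma 4 can be checked on small examples. The sketch below is only an illustration, not a proof: it covers the special case in which the graph is exactly a union of full hexagons of one grid, and it assumes "brick-wall" coordinates in which a hexagon is identified by its lower-left corner (a, b) with a + b even. The function names are hypothetical.

```python
from collections import defaultdict


def hexagon_edges(a, b):
    """Edges of the hexagon with lower-left corner (a, b), where a + b is even."""
    if (a + b) % 2 != 0:
        raise ValueError("a + b must be even")
    cycle = [(a, b), (a + 1, b), (a + 2, b),
             (a + 2, b + 1), (a + 1, b + 1), (a, b + 1)]
    return {frozenset((cycle[i], cycle[(i + 1) % 6])) for i in range(6)}


def bivertex_count(corners):
    """Number of degree-2 vertices in the union of the given hexagons."""
    edges = set()
    for a, b in corners:
        edges |= hexagon_edges(a, b)     # shared edges are stored only once
    degree = defaultdict(int)
    for edge in edges:
        for v in edge:
            degree[v] += 1
    return sum(1 for d in degree.values() if d == 2)


# Small shapes: K hexagons each, given by their lower-left corners.
PAIR = [(0, 0), (2, 0)]                      # K = 2
TRIANGLE = [(0, 0), (2, 0), (1, 1)]          # K = 3, three mutually adjacent
ROW3 = [(0, 0), (2, 0), (4, 0)]              # K = 3, in a row
RHOMBUS = [(0, 0), (2, 0), (1, 1), (3, 1)]   # K = 4
```

The pair of hexagons has 8 bivertices, matching the base case of the induction, while the triangle and the rhombus attain the bound K + 6 exactly (9 and 10 bivertices, respectively).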
Lemma 5. Let x be a v-vertex and let C be the
component containing x. Then C lies only on two
planes, contains only v-type and vh-type vertices,
and at least 16 vertices are of vh-type.
Proof. Let $x$ be a v-vertex in component $C$.
W.l.o.g. assume that $x^{-1}$ is a 0-vertex and $x^1$ is
a 1-vertex. By Lemma 3, $x^1$ is a v-vertex again.
Therefore, by Corollary 1, $G_x^{-1}$ and $G_x^{2}$ contain only
0-vertices, while $G_x^{1}$ contains only 1-vertices. Note
also that $G_{x^1} = G_x^1$. Since $x$ has degree 3 in $G_x$,
$B_x$ contains at least 3 hexagons. By Lemma 4, the
number of vertices of degree 2 in $G_x$ is at least 8.
Since each of them has a 0-neighbor below and a 1-neighbor above, they are vh-vertices. The same is
true for vertices in $G_x^1$. Thus, $C$ contains at least
16 vh-vertices.
Lemma 6. Let $x$ be an h-vertex and let $C$ be the
component containing $x$. Then $C$ is a hydrophobic
core of a tube, i.e., $|G_x| = 6$, and $C$ contains
exactly 12 vh-vertices.
Proof. By Lemma 3, all vertices in $G_x$ are of h-type, i.e., they form a cycle, and there are integers
$j, k > 0$ such that $x^j$ and $x^{-k}$ are vh-vertices, while
$x^{-k+1}, \ldots, x^{j-1}$ are all h-vertices. By Corollary 1,
the vertices in $G_x^j$ and $G_x^{-k}$ are vh-vertices, i.e., $C$
contains at least $2|G_x|$ vh-vertices.
We have at most 24 vh-vertices in $Q$, so $|G_x| \le 12$.
This is only possible if $B_x$ contains three or
just one hexagon. However, in the first case, the
0-vertex common to all three hexagons has three
peptide bonds. Therefore, $|G_x| = 6$ and the number
of vh-vertices in $C$ is at least 12.
It is easy to see that the same is true for a component which contains only vh-vertices. Since $T_n$
has only 12 vh-vertices, we immediately have the
following corollary.
Corollary 2. If $Q = T_n$, for some $n \ge 1$, then $Q$
has a unique fold.
Theorem 1. If $Q = T_{n,m}$, for some $n, m \ge 1$, then
$Q$ has a unique fold.
Proof. First, assume that there is a v-vertex $x$ in
$H_i$. By Lemmas 5 and 6 and the fact that $Q$
has only 24 vh-vertices, $F$ has only one component $C$ containing only v-type and vh-type vertices. W.l.o.g. assume that $C$ lies on levels $H_i$
and $H_{i+1}$, hence each of $G_x$ and $G_x^1$ contains 12 vh-vertices. Since $Q$ contains 101, there are two vertices in $G_x$ ($G_x^1$) with degree 2 having a common
0-neighbor. Thus, by the second part of Lemma 4,
$B_x$ contains at most 5 hexagons. In such a case, by
enumerating all possible configurations of up to five
hexagons, we can see that the number of v-vertices
of $G_x$ is either 6 (if $B_x$ contains 4 hexagons) or 8 (if
it contains 5 hexagons). The number of v-vertices
in $Q$ is $12(m + n - 2)$. Therefore the number of v-vertices of $G_x$ is 6, $m + n = 3$, and this happens only
when $B_x$ contains 4 hexagons $X, Y, W, Z$ forming a
chain ($X$ intersects only $Y$, $Y$ only $X$ and $W$, etc.).
However, since either $n = 1$ or $m = 1$, $Q$ contains
the substring $(010)^{12}$, while the longest such substring in $C$ is $(010)^8$, a contradiction. Hence, we
can assume there is no v-vertex in $F$.
By Lemma 6 and the remark after the lemma,
we have that $F$ contains exactly two components
$C'$ and $C''$, and they are in the shape of hydrophobic cores of two tubes, as required. The size of
the cores must also match the desired size, otherwise we would get substrings $(1001)^k$ of incorrect
lengths. Finally, we need to verify that the tubes
are connected in the desired way. Note that $C'$ and
$C''$ do not have the substring 101. So there must be
two pairs $x, y$ and $v, w$ of 1-vertices where $x, v \in C'$
and $y, w \in C''$, and $x, y$ (respectively, $v, w$) have a
common 0-neighbor. It is easy to see that if both $x, y$
are in the same plane $H_i$ then $v, w$ are in plane
$H_i$ as well, and now we cannot pick up $x^{-1}$ and
$x$ in any path going through $F$. Also if $x \in H_i$
and $y \in H_{i+1}$ then we see a loop in $F$ which does
not contain all monomers of $Q$, a contradiction. We
conclude that the only possibility is that $x, v \in H_i$
and $y, w \in H_{i+2}$ and $C'$ and $C''$ can only have the
desired configuration, cf. Figure 4. Therefore $F$ is
unique.
5 Conclusions
In this paper we build on our previous work in
[13] to solve the inverse protein folding problem in
the HP model in 3D for designing tubular proteins.
This is interesting for two main reasons: first, this
is the first paper to solve the inverse protein folding problem in such
a general manner, and second, the proteins being
designed correspond closely to a large class of naturally occurring proteins. We also showed that our
constructions provably yield stable proteins for an
infinite class of examples and conjecture that this
holds for all proteins constructible with our techniques.
While the techniques presented here will not allow for the direct construction of proteins, they represent a starting point for this process. In particular, we believe that our techniques can be used to
form the basis of an actual protein: we specify, at
each point of the chain, whether a hydrophobic or
polar monomer is required, and a designer can use
this information to choose amino acids. The choice
of actual amino acid will depend on other desired
molecular interactions and finer details about the
protein structure.
Much remains to be done and we are actively exploring a number of extensions of this work. Foremost, as we did in the 2D case, we would like to generalize our techniques to a broader class of structures so that using simple building blocks we can
approximate any given 3D structure. We would
also like to extend our proof techniques to show that a
much broader class of constructible structures is
stable. This is non-trivial even in the 2D case where
we are still attempting to show this result in its full
generality. Finally, we are interested in exploring
whether our techniques can already be used in practice for the construction of proteins.
References
[1] K. A. Dill, S. Bromberg, K. Yue, K. M. Fiebig,
D. P. Yee, P. D. Thomas, and H. S. Chan.
Principles of protein folding: A perspective
from simple exact models. Protein Science,
4:561–602, 1995.
[2] K. A. Dill. Dominant forces in protein folding.
Biochemistry, 29(31):7133–7155, 1990.
[3] K. A. Dill. Theory for the folding and
stability of globular proteins. Biochemistry,
24(6):1501–1509, 1985.
[4] P. Crescenzi, D. Goldman, C. Papadimitriou,
A. Piccolboni, and M. Yannakakis. On the
complexity of protein folding. In Proc. of
STOC’98, pages 597–603, 1998.
[5] B. Berger and T. Leighton. Protein folding
in the hydrophobic-hydrophilic (HP) model
is NP-complete. J. Comp. Biol., 5(1):27–40,
1998.
[6] S. Kamtekar, J. M. Schiffer, H. Xiong, J. M.
Babik, and M. H. Hecht. Protein design by
binary patterning of polar and nonpolar amino
acids. Science, 262:1680–1685, 1993.
[7] K. Yue and K. A. Dill. Inverse protein folding
problem: Designing polymer sequences. Proc.
Natl. Acad. Sci. USA, Biophysics, 89:4163–4167, 1992.
[8] S. Sun, R. Brem, H. S. Chan, and K. A. Dill.
Designing amino acid sequences to fold with
good hydrophobic cores. Protein Engineering,
8(12):1205–1213, 1995.
[9] T. Wang, J. Miller, N. S. Wingreen, C. Tang,
and K. A. Dill. Symmetry and designability for
lattice protein models. J. of Chem. Physics,
113(18):8329–8336, 2000.
[10] E. I. Shakhnovich and A. M. Gutin. Engineering of stable and fast-folding sequences
of model proteins. Proc. Natl. Acad. Sci.,
90:7195–7199, 1993.
[11] W. E. Hart. On the computational complexity of sequence design problems. In Proc.
of Comp. Molecular Biology, pages 128–136,
1997.
[12] P. Berman, B. DasGupta, D. Mubayi,
R. Sloan, G. Turán, and Y. Zhang. The protein
sequence design problem in canonical model on
2D and 3D lattices. In Proc. CPM 2004, pages
244–253, 2004.
[13] A. Gupta, J. Maňuch, and L. Stacho.
Structure-approximating inverse protein folding problem in the 2D HP model. Journal
of Computational Biology, 12(10):1328–1345,
2005.
[14] O. Aichholzer, D. Bremner, E. D. Demaine,
H. Meijer, V. Sacristán, and M. Soss. Long
proteins with unique optimal foldings in the
H-P model. Computational Geometry: Theory
and Applications, 25(1-2):139–159, 2003.
[15] Z. Li, X. Zhang, and L. Chen. Unique optimal foldings of proteins on a triangular lattice.
Appl. Bioinformatics, 4(2):105–16, 2005.
[16] B. Hayes. Prototeins. American Scientist,
86:216–221, 1998.
[17] C. R. Mead, J. Maňuch, X. Huang, B. Bhattacharyya, L. Stacho, and A. Gupta. Investigating lattice structure for inverse protein
folding. FEBS Journal, 272(s1):4739 1 380,
2005.
[18] Y. B. Yu. Coiled-coils: stability, specificity,
and drug delivery potential. Advanced Drug
Delivery Reviews, 54(8):1113–1129, 2002.
[19] J. M. Mason and K. M. Arndt. Coiled
coil domains: Stability, specificity, and biological implications. ChemBioChem, 5(2):170–176, 2004.