Biologia 69/2: 139–151, 2014
Section Cellular and Molecular Biology
DOI: 10.2478/s11756-013-0310-3
Arabidopsis thaliana Tic110, involved in chloroplast protein
translocation, contains at least fourteen highly divergent HEAT-like
repeated motifs
Jorge Hernández-Torres1,2*, Adrián José Jaimes-Becerra1 & Jacques Chomilier2,3
1 Laboratorio de Biología Molecular CINBIN, Escuela de Biología, Universidad Industrial de Santander, Apartado Aéreo 678, Bucaramanga, Colombia; e-mail: [email protected]
2 Protein Structure Prediction, IMPMC, CNRS UMR 7590, Université Pierre et Marie Curie, Paris, France
3 RPBS, Bâtiment Lamarck, 15 rue Hélène Brion, 750013 Paris, France
Abstract: Endosymbiotic theory suggests that plastids originated from a photosynthetic bacterium that was engulfed by
a primitive eukaryotic cell. In consequence, the chloroplast genome remains affected by this ancestral event, although it
is reduced in size and the number of constituent genes. Most parts of the plastid genome have been transferred to the
host cell nuclear genome and are nuclear-encoded. Thus, chloroplast proteins are synthesized in the cytosol as precursors
with N-terminal extensions called transit peptides. The evolution of an import machinery was required to transfer precursors carrying transit
peptides to the stroma. To date, two protein complexes have been found to mediate the import process: the Toc
(outer) and Tic (inner) envelope membrane translocons. The evolutionary origin of many Tic and Toc proteins has been
established, but not that of the Tic110 subunit. Tic110 binds signal peptides and serves as a scaffold for the recruitment of
stromal components. In this study, we analyzed hydrophobic clusters, protein folds, and protein structure homology, and we
conclude that Tic110 is composed of fourteen repeated motifs related to HEAT repeats. An explanation for the presence
of such repeats in Tic110, their arrangement in separate domains relative to the membrane, and their probable function in the
chloroplast import process are discussed.
Key words: Tic110; HCA method; molecular evolution; HEAT-repeat; protein evolution; chloroplast translocon; chloroplast
protein import.
Abbreviations: HCA, Hydrophobic Cluster Analysis; HEAT, huntingtin, elongation factor 3, 65 kDa α-regulatory subunit
of protein phosphatase 2A and yeast PI3-kinase TOR1; IMS, intermembrane space; PDB, Protein Data Bank.
Introduction
The widely accepted theory of endosymbiosis suggests that chloroplasts originated from a photosynthetic bacterium absorbed by a primitive eukaryotic
cell. As a consequence, most of the original prokaryotic
genome was lost or transferred to the host cell nuclear
genome (Cavalier-Smith 2000; McFadden 2001). From
this event, most chloroplast proteins are encoded by
the nuclear genome and are synthesized in the cytosol
as precursors with N-terminal targeting signals, the
transit (or signal) peptides. These proteins are translocated into the chloroplast by the so-called general import pathway, mediated by the Toc (translocon at the
outer envelope of chloroplasts) and Tic (translocon at
the inner envelope of chloroplasts) complexes (Soll &
Schleiff 2004; Stengel et al. 2007; Jarvis 2008; Benz et
al. 2009).
In contrast to the Toc complex in which the activities, functions and evolutionary origins of most of the
components of the machinery have been established and
well defined (Hiltbrunner et al. 2001; Kessler & Schnell
2006; Hernández et al. 2007), the Tic complex has proven difficult to characterize. This is because
the assembly of functional complexes is a
dynamic process that occurs in response to pre-protein
translocation through the Toc complex. Additionally,
Tic proteins have not been found to be homologous to
any known membrane transport system (Kouranov et
al. 1998; Reumann et al. 2005).
Among the eight components of the Tic machinery (Kovács-Bogdán et al. 2010), Tic110 (translocon at
the inner envelope protein of 110 kDa) was the first to
be identified as a true subunit (Kessler & Blobel 1996;
Lübeck et al. 1996). Moreover, it was proposed to be one
of the essential components for protein translocation
into plastids, in association with at least two other Tic
subunits, namely Tic20 and Tic40 (Inaba et al. 2003,
2005). According to previous studies (Kessler & Blobel 1996; Inaba et al. 2003, 2005), Pisum sativum and
Arabidopsis thaliana Tic110 (PsTic110 and AtTic110,
respectively) bind transit peptides and serve as a docking module for the recruitment of stromal components
involved in late stages of protein import.
* Corresponding author
© 2013 Institute of Molecular Biology, Slovak Academy of Sciences
Since the discovery of Tic110, three main aspects
have been examined: (i) the secondary structure determination of Tic110; (ii) its membrane crossing of the
inner envelope of chloroplasts; and (iii) the evolutionary
origin of the protein. So far, contradictory results have
been published concerning the distribution of regular
secondary structures. Based on circular dichroism spectra, Heins et al. (2002) concluded that Tic110 mainly
consists of β-sheets forming a translocation pore. In
contrast and using the same spectroscopic method, Inaba et al. (2003) determined “that the bulk of native
AtTic110 consists of α-helical structure”. Balsera et al.
(2009) corroborated this result, stating that “the far-UV CD spectrum of Tic110 indicates that it is composed of about 53% α-helix, 8% β-strand, and 38%
non-regular elements of secondary structure”. Moreover, by computational methods, Balsera et al. (2009)
also proposed the presence of three HEAT-like repeats
in PsTic110, but their results suggest “Tic110 contains
more than six repetitions (. . .) that seem to be degenerated and therefore difficult to detect by direct sequence
comparison”. The canonical HEAT unit repeat (hereafter defined as a HEAT motif) involves two helices, A
and B, which form an α-hairpin. Arrays of HEAT motifs
consist of 3 to 36 units forming a rod-like structure and
appear to function as protein-protein interaction surfaces (Andrade et al. 2001a). HEAT tandem repeats are
found in a number of cytoplasmic proteins, including
the four proteins resulting in their name: Huntingtin,
elongation factor 3 (EF-3), the 65 kDa α-regulatory
subunit of protein phosphatase 2A (PP2A) and the
yeast PI3-kinase TOR1 (Andrade et al. 2001b).
The second controversial issue is the intramembrane arrangement of Tic110. Three models have attempted to explain the role of Tic110 in protein import
and each requires specific topological features. The first
model proposes that the mature protein projects a large
C-terminal soluble domain into the stroma and the N-terminus, containing two hydrophobic transmembrane
α-helices, serves to anchor the protein in the membrane
(Lübeck et al. 1996; Jackson et al. 1998; Inaba et al.
2003). Nevertheless, other experimental data showed
that Tic110 traverses the inner membrane in a zigzag fashion through multiple intramembrane domains
(Kessler & Blobel 1996). Recently, this hypothesis was
supported by limited proteolysis assays in inner envelope vesicles (trypsin, endoproteinase Glu-C and thermolysin) and Cys pegylation reactions (Balsera et al.
2009). According to this model, four extra amphipathic
transmembrane helices form a cation-selective channel
and stromal domains might interact with chaperones,
such as Hsp93 and Cpn60 (Kessler & Blobel 1996; Jackson et al. 1998; Inaba et al. 2003).
Finally, no homologous sequence has been found
for Tic110 in bacteria or in eukaryotic organisms, suggesting that this protein might have evolved in the originating plant cell, concurrently with chloroplast development.
Detecting protein repeats from primary structure is a particularly arduous task because sequence
repeats are short and show significant sequence divergence (Marcotte et al. 1999; Andrade et
al. 2001b; Bjorklund et al. 2006). In the present study,
we conducted robust sequence analysis (discriminative
motif discovery), HCA (Hydrophobic Cluster Analysis), and homology modelling of protein structure. In
agreement with the predictions of Balsera et al. (2009),
we found that the largest part of Tic110 appears to
be primarily composed of α-helices with at least 14
HEAT-like repeat motifs (37–50 amino acid residues
each), interspersed with six transmembrane domains
and two charged non-globular domains: one positively charged
(stromal) and one negatively charged (intermembrane space;
IMS). It has been noted that many HEAT repeat-containing proteins are involved in intracellular transport
(Oeffinger et al. 2004; Stewart 2007), and HEAT repeats in Tic110 should favour pre-protein import into
the stroma through interaction with the Toc complex
and all other Tic components. Furthermore, we discuss
the likely evolutionary origin of Tic110, as derived from
our results.
Material and methods
Sequence searches and HCA analyses
PSI-BLAST searches were performed using the A. thaliana
Tic110 as a query against the non-redundant database at
NCBI (http://www.ncbi.nlm.nih.gov/blast; Altschul et al.
1997) using default parameters. Output sequences were selected both on the basis of sequence identity,
favouring distantly related proteins (<30% identity),
and so as to include one representative of each major group of
photosynthetic organisms (higher plants, green algae, and
non-green algae).
To accurately align the retrieved sequences, two-dimensional Hydrophobic Cluster Analysis (HCA) was performed according to recommendations published previously
(Callebaut et al. 1997). Briefly, HCA writes the sequence on
the surface of a cylinder with the connectivity of an α-helix.
The two-dimensional planar surface is then duplicated in order to preserve the local environment of each amino acid residue,
and hydrophobic residues (VILFMWY) are then clustered,
provided they are first neighbours in this two-dimensional
pattern. The shapes of the clusters are a strong indication of the nature of the secondary structure. Furthermore,
the hydrophobic clusters observed in an HCA plot are not
distributed at random, and it has been demonstrated that
centres of the hydrophobic clusters statistically correspond
to the centres of regular secondary structures (Woodcock
et al. 1992). Therefore, it allows for the prediction of secondary structures, which in turn aids in aligning pairs of
distantly related sequences. HCA is a robust method that
allows alignments between distantly related proteins, with
as low as 10% sequence identity in the so-called ‘twilight
zone’ (Rost 1999). HCA identity is calculated as the ratio of the number
of identical amino acid residues aligned in both sequences
to the number of residues of the longest sequence.
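As an illustration of the identity measure defined above, the following minimal Python sketch computes it for a pair of already aligned sequences. The function name, the gap symbol and the example sequences are assumptions made for illustration only; the alignment itself is derived from the HCA plots and is not reproduced here.

```python
def hca_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical aligned residues divided by the length of the longer ungapped sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Count alignment columns where both sequences carry the same residue (gaps never count).
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    # Denominator: number of residues of the longest sequence, gaps removed.
    longest = max(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if longest == 0:
        raise ValueError("sequences contain no residues")
    return identical / longest


# Hypothetical toy alignment: 4 identical columns, longest sequence has 6 residues.
print(round(100 * hca_identity("MKV-LA", "MRVQLA"), 1), "% identity")
```

Because the denominator is the longest sequence rather than the aligned length, insertions in either sequence lower the reported identity, which makes the measure conservative for distantly related proteins.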
MEME queries and protein structure homology modelling
Repeated motifs were detected with MEME (Multiple EM
for Motif Elicitation), which is a homology-based method
that uses an iterative algorithm to estimate the significance
of possible repeats in a group of related sequences (Bailey & Elkan 1994). The SBASE program (Vlahovicek et
al. 2005) has been employed to predict known domains directly from the sequence. The PSIPRED protein structure
prediction server (Bryson et al. 2005) was used to predict
the secondary structure (Jones 1999a) and fold recognition
was performed using mGenTHREADER (Jones 1999b;
McGuffin & Jones 2003). Multiple fold recognition methods
were run through the CBS Meta-Server (Douguet & Labesse
2003).
Structural models were obtained by means of the I-Tasser (Zhang 2008) web server with AtTic110 as input. Secondary
structures of the models were assigned with
the PSEA algorithm (Labesse et al. 1997).
Hydropathy plots
Hydropathy plots were drawn according to the Kyte-Doolittle method (Kyte & Doolittle 1982) with a window of
9 amino acids using the online server Hydrophobicity Plot
1.0 (Cancer Research UK; http://bmm.cancerresearchuk.org/~offman01/hydro.html).
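The sliding-window calculation behind such plots can be sketched as follows. This is a minimal illustration assuming the published Kyte-Doolittle scale and a plain, unweighted window average; the function name is hypothetical, and the online server cited above may differ in details such as edge handling.

```python
from typing import List

# Kyte & Doolittle (1982) hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> List[float]:
    """Mean Kyte-Doolittle hydropathy for every full window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence]  # KeyError on non-standard residues
    # One value per window position; windows are not padded at the sequence ends.
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical test peptide: a hydrophobic stretch flanked by charged residues.
profile = hydropathy_profile("KKDEELLIVLLAVIGLLFKRDE", window=9)
print(max(profile))
```

Sustained high values over a stretch of roughly twenty residues are what is usually read as a candidate membrane-spanning segment; the threshold used for that call is a matter of convention and is not fixed by this sketch.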
Accession numbers
Sequence data were deposited in the EMBL/GenBank libraries under the following accession numbers: NP_172176.1
(A. thaliana Tic110), CAA92823 (P. sativum Tic110),
EFJ41589 (Volvox carteri Tic110), XP_002503178 (Micromonas sp. Tic110), XP_001712500 (Hemiselmis andersenii Tic110), CBJ30096 (Ectocarpus siliculosus Tic110),
NP_189208 (A. thaliana protein phosphatase 2A subunit A2) and NP_001030954 (A. thaliana cullin-associated
NEDD8-dissociated protein 1, chain C).
Results
Detection of repeated motifs and domain duplication in
AtTic110
We used A. thaliana Tic110 as a reference instead of
the P. sativum protein used in the study of Balsera et al. (2009) because A. thaliana has more genetic resources (Meinke
et al. 1998) and because the two species are closely related. MEME was used to define the number and family of the various domains in AtTic110. It identified four repeats of around 50 amino acid
residues, each predicted to contain two α-helices, with an
E-value of 8.2 × 10⁻²⁸⁰. Subsequently, the SBASE protein domain library (Vlahovicek et al. 2005) detected
the presence of one helix-hairpin-helix motif with 99%
confidence. Since these attempts to assign motifs and
domains were not entirely conclusive, we complemented
our approach by using the PSIPRED protein structure prediction server (Bryson et al. 2005). It revealed
that some fragments of the Tic110 protein shared structural similarity with HEAT repeats with a confidence
of P < 0.0001 in the range of 70–80% certainty. This
result was confirmed by multiple fold recognition methods (CBS Meta-Server; Douguet & Labesse 2003).
On the basis of previous bioinformatics results,
we hypothesized the existence of domain duplications
throughout the protein, as it was proposed for other
proteins from the Tic and Toc families (Kuchler et
al. 2002; Hernández et al. 2007). For robust identification of the HEAT-containing domains, we used a
PSI-BLAST run with the full 960 residues of AtTic110
as a query. It allowed the retrieval of distantly related
proteins, required for HCA analysis (<30% identity),
including V. carteri (multicellular chlorophyte alga),
Micromonas sp. (picoplanktonic green alga), H. andersenii (cryptophyte alga) and E. siliculosus (a filamentous brown alga). E-values ranged from 7 × 10⁻¹³⁷ to
3 × 10⁻⁷, so these sequences were clearly identified as Tic110 orthologues.
Comparison of the HCA plots of the five retrieved sequences led us to conclude that more than half
of the AtTic110 protein presented domain duplications,
interspersed with putative intramembrane domains. After further analyses to optimize the correspondence between aligned hydrophobic clusters, we found that AtTic110 contains four new membrane-spanning domains
as proposed by Balsera et al. (2009). Computational results and HCA analysis led to the conclusion that the
three globular domains consist of arrays of HEAT-like
motifs of 37–50 amino acid residues, with a high
leucine content (13%). The secondary structure consensus for
the HEAT repeat is a pair of helical domains, separated
by a non-helical spacer (Kobe et al. 1999).
The use of HCA plots of the analysed sequences
led us to confirm and redefine the positions of the four new
transmembrane regions proposed by Balsera et al.
(2009). It is important to note that they were inconclusively defined by proteolytic-immunoblot analysis and
by Cys mapping using pegylation and computational
methods. However, HCA analysis is one of the most
pertinent bioinformatics methods to delineate protein
domains (Callebaut et al. 1997) and by using this technique we made refined predictions of both soluble and
intramembrane domains of AtTic110, on the basis of
the Balsera et al. (2009) data.
Domain assignment of AtTic110
Figure 1 shows a schematic of the domain distribution according to Balsera et al. (2009) and according to our results. The
first two transmembrane regions, a and b, have been proposed
since the discovery of Tic110 (Kessler & Blobel 1996;
Lübeck et al. 1996). Although we propose some minor changes
in the new transmembrane regions (c–f), we corroborate
the number of regions (six) reported by Balsera et al. (2009) (see Figure 6). We further propose an increase in the number of
HEAT motifs from 6 to 14. This increase is based on the
number of repeats shown in more divergent elements,
with a refined method of motif discrimination like HCA
(Lemesle-Varloot et al. 1990; Callebaut et al. 1997). According to Balsera et al. (2009) Tic110 spans across the
inner envelope membrane, resulting in five IMS or stromal domains, I to V (Fig. 1). As will be shown in the
following paragraphs, HCA analysis allows the redefinition of the limits of some domains, consistent with
transmembrane regions.
Fig. 1. Domain boundary predictions according to (a) the work of Balsera et al. (2009) and (b) this work. Latin numbers correspond
to the various HEAT motifs, Arabic numbers to the domain names. The estimated molecular weight (kDa) and calculated isoelectric
point (pI) are also shown.
Domain I, from 91 to 204 (the limit was 208 in the
work of Balsera et al. 2009), lies in the stroma of the
chloroplast; its HCA plot, aligned with the plots of the other
green plant species that are evolutionarily very distant,
is shown in Figure 2. As AtTic110 has been used as a
reference throughout this paper, all sequence identities
resulting from the HCA alignments are given with respect to it (Table 1). Two repeated motifs are shown in
this domain and the shapes of their clusters are compatible with the presence of two helices per motif, with
a small linker between them, supporting an antiparallel topology. HEAT motifs in AtTic110 are sequentially
numbered, regardless of the domain they belong to (see
Figure 1).
Domain II is in the IMS and spans the positions
217–480. The corresponding HCA plots are represented
in Figure 3, with the same distantly related sequences
as in Figure 2. The number of repeats is higher for this
domain, which is completely covered by double-helix motifs like those
seen in domain I. In addition, homology was shown by
means of the I-Tasser server with the PR65/A subunit
of protein phosphatase 2A (PDB code: 1b3u; Groves et
al. 1999). An alignment by HCA was performed, leading
to a 10% sequence identity with AtTic110.
Table 1. HCA identities among AtTic110 domains I, II and V and related domains from distant photosynthetic organisms.
Species                   Domain I   Domain II   Domain V
Volvox carteri            37%        28%         28%
Micromonas sp.            32%        26%         30%
Hemiselmis andersenii     14%        13%         20%
Ectocarpus siliculosus    13%        12%         16%
The secondary structure of this template allows for
the identification of motifs with pairs of antiparallel helices. The PR65/A subunit of protein phosphatase 2A
(PDB code: 1b3u) is known to be composed of a series
of 15 HEAT repeated motifs, and therefore domain II
can be aligned on six HEAT motifs. Domains III and IV
appear to be unstructured and contain a high content
of charged residues, but they do not appear to contain
HEAT repeats.
Domain V, from position approximately 676 to the
C-terminus in AtTic110, is composed of six repeats
(numbered from 9 to 14; Fig. 4). Homology was identified between AtTic110 and the PR65/A subunit of protein phosphatase 2A (PDB code: 1b3u) template and
was calculated by HCA to be 7% identity.
It is important to note that the last three repeats
of domain V are homologous with three domains of the
PR65/A subunit of protein phosphatase 2A that are
themselves homologous to the first three motifs of domain II (Fig. 3). This reinforces the assumption that
Tic110 is built from HEAT repeats. The secondary structure prediction produced by I-Tasser is reported
under the AtTic110 plot. Most of these structures corroborate the localization of the hydrophobic clusters, although the borders of the helices are rather difficult to
infer.
Fig. 2. HCA plot alignment of the domain I of A. thaliana Tic110,
with related orthologous proteins from Vc (Volvox carteri), M
(Micromonas sp.), Ha (Hemiselmis andersenii) and Es (Ectocarpus siliculosus). Sequence is read vertically, one line over two, and
the secondary structure is read horizontally, a cluster corresponding statistically to a regular secondary structure. Vertical lines
connect aligned amino acids. Conserved hydrophobic clusters are
grey shaded. Non-hydrophobic strict identities are indicated by
white letters on a black background. H1 and H2 represent two
predicted HEAT motifs, each one composed of a series of two
antiparallel α-helices.
Validation of HEAT-like motifs in AtTic110
We concluded in the previous sections that some domains of Tic110 were built from duplications of one
motif, and that these repeats cover more than 60%
of the protein length. In order to further examine
the AtTic110 HEAT-like motifs, we show examples of
HCA alignments of AtTic110 motifs (repeats 2, 6 and
13 from domains I, II and V, respectively) with already characterized HEAT motifs (Fig. 5). AtTic110
repeat 2 is aligned with Homo sapiens repeat 7 (out of
15) from PR65/A subunit of protein phosphatase 2A
(Groves et al. 1999), and with the homologous sequence
from A. thaliana, also known to involve HEAT motifs (Fig. 5a). AtTic110 repeat 2 and AtPR65/A share
17% sequence identity. AtTic110 HEAT-like repeat 6 is
aligned with repeat 14 (out of 19) from β-importin of
both human (PDB code: 1qgk; Cingolani et al. 1999)
and A. thaliana (Fig. 5b). This last sequence shares 13%
identity with AtTic110 repeat 6. AtTic110 repeat 13 is
aligned with repeat 7 (out of 27) of Cand1
chain C from H. sapiens (PDB code: 1u6g; Goldenberg
et al. 2004) and A. thaliana, sharing 12% identity with
the AtTic110 repeat 13 (Fig. 5c).
Analysis of hydrophobic cluster shape, fully compatible with the existence of two α-helices separated
by a loop of polar or charged residues, in association with conservation of non-hydrophobic amino acids
(Fig. 5), improves the robustness of our assertion of a series of HEAT-like repeated motifs in AtTic110. Related
structures are available for human HEAT-containing sequences with PDB codes 1qgk and 1u6g, which confirm
this view. The repeat motifs found within Tic110 by
HCA and fold predictors fit well with the basic pattern of the HEAT repeats: two α-helices separated by
a variable region.
Fig. 3. HCA alignment for the domain II of A. thaliana Tic110 protein. The same species are shown as those in Figure 2. The top plot
comes from the phosphatase 2A PR65/A subunit (PDB code: 1b3u) with the secondary structure assignment below the plot. Below
AtTic110 plot, the secondary structure prediction is proposed, based on the alignment with 1b3u.
Taken together, these findings allow for the conclusion that Tic110 proteins contain at least 14 putative HEAT-like motifs of 37–50 amino acids each.
This assumption is derived from a combination of
BLAST searches, threading, HCA alignments and homology modelling. This conclusion is consistent with
previous work by Balsera et al. (2009) and it extends their prediction. We propose that the 14 HEAT repeats of Tic110 are located in domains
I, II and V. From an evolutionary point of view,
these motifs are found in all photosynthetic species examined,
with sequence identities in the range of 12–37% (Table 1). We cannot exclude that other,
more distantly related motifs are present in Tic110
sequences, but they are beyond the scope of this
work.
Fig. 4. HCA alignment for the domain V of A. thaliana Tic110 protein with the same species as in Figure 2.
Prediction of transmembrane domains
We present evidence supporting the assignment of the new transmembrane regions (Fig. 6; c,d,e,f)
connecting the last five domains proposed by Balsera
et al. (2009), but we propose to move the two passages
between domains II and III (Fig. 6d), and III and IV
(Fig. 6e). This modification is done according to the
HCA plots in A. thaliana, P. sativum, V. carteri, Micromonas sp., H. andersenii, E. siliculosus and other
Tic110 orthologues not shown. Membrane spanning domains are easily detected by the distinctive shape of
their hydrophobic clusters. A transmembrane region is
typically characterized by a long stretch of hydrophobic residues, resulting in a large vertical cluster (Eudes et al. 2007), as shown for the c and f regions (Figs
6c,f). Kyte-Doolittle hydropathy plots of that stretch of
amino acid residues depict their hydrophobic nature, a
known characteristic of intramembrane domains. In the
same vein, transmembrane regions d and e of Balsera
et al. (2009) do not provide suitable hydrophobicity for the formation of such intramembrane domains.
Indeed, sliding the d and e passages as proposed in Figures 6d,e yields more vertical clusters with the HCA
method (Eudes et al. 2007), and enhances their density
in hydrophobic residues. Our hypothesis is supported
by the poor results of Kyte-Doolittle hydropathy analysis of the passages proposed by Balsera et al. (2009)
compared to our results.
Fig. 5. HCA alignment of various known HEAT motifs. (a) PR65/A (PDB code: 1b3u) motif number 7, with its homologues in At,
both aligned with AtTic110 motif 2. (b) Homo sapiens β-importin HEAT 14 (PDB code: 1qgk) is aligned with its counterpart in At,
both aligned with AtTic110 motif 6. (c) HEAT motif 7 of H. sapiens Cand1 (PDB code: 1u6g), aligned with its homologue in At, and
the motif 13 of AtTic110.
Domains III and IV
Once this distribution of the transmembrane regions
has been revised, domain II is lengthened by 180
residues and domain III is shortened by 154 residues, compared to the work by Balsera et al. (2009) (Figs 1, 3
and 4). These 154 residues are now in the IMS, according to our results, and domain III is composed of only
88 amino acid residues. This domain
contains 18 positively charged residues (R + K, 20.6% of the size
of the domain) and 11 negatively charged residues (12.6%). Consequently, the isoelectric point of domain III is very basic,
close to 10. In illuminated chloroplasts, where the stromal pH is typically around 8 with the proton pump at
its peak (Heber & Heldt 1981), domain III is hypothesized to be highly positively charged because of its basic
pI.
Analysis of the amino acid composition of domain
IV reveals the opposite behaviour, with a larger number of negatively charged residues, 24, corresponding to
35.3% of the length of the domain, versus 12 positively
charged residues (17.7%). The pH of the IMS is of the same order as that of the cytosol (around 7.2; Heldt & Sauer
1971), so domain IV is likely negatively charged (pI = 4.3).
Domains III and IV appear to be slightly disordered because the density of hydrophobic residues
(FILMVWY) is smaller than the canonical value of 33%
for a globular protein (21% for domain III and 16% for
domain IV). The function of these two domains is still unclear, even though
their side chains are highly conserved in most photosynthetic species.
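The composition figures quoted for domains III and IV (charged-residue counts, their percentage of the domain length, and the density of FILMVWY residues) can be obtained with a short calculation such as the sketch below. It is an illustration only: the function name and the toy sequence are hypothetical, positive residues are counted as R + K as stated in the text, and counting D + E as the negative residues is an assumption, since the text does not list them.

```python
HYDROPHOBIC = set("FILMVWY")  # hydrophobic alphabet quoted in the text
POSITIVE = set("RK")          # R + K, as stated for domain III
NEGATIVE = set("DE")          # assumption: acidic residues counted as D + E


def composition(sequence: str) -> dict:
    """Counts and percentages of charged and hydrophobic residues in a domain."""
    sequence = sequence.upper()
    length = len(sequence)
    if length == 0:
        raise ValueError("empty sequence")
    positive = sum(aa in POSITIVE for aa in sequence)
    negative = sum(aa in NEGATIVE for aa in sequence)
    hydrophobic = sum(aa in HYDROPHOBIC for aa in sequence)
    return {
        "length": length,
        "positive": positive,
        "negative": negative,
        "net_charge": positive - negative,  # crude estimate, ignores His and termini
        "pct_positive": 100 * positive / length,
        "pct_negative": 100 * negative / length,
        "pct_hydrophobic": 100 * hydrophobic / length,
    }


# Hypothetical toy fragment, not a Tic110 sequence.
stats = composition("KKRDEAVLG")
print(stats["net_charge"], round(stats["pct_hydrophobic"], 1))
```

A hydrophobic density well below the 33% quoted for globular proteins is the criterion used in the text to suggest disorder; the sketch only reports the percentage and leaves that interpretation to the reader.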
Homology modelling of domains II and V
Structural models of domains II and V of AtTic110 were
built with the I-Tasser server (Zhang 2008).
The goal was not to obtain a structure useful for any
further process such as docking prediction, but rather
to increase the evidence for the repeated nature of domains II and V. The template used for both domains
was the PR65/A subunit of human protein phosphatase 2A
(PDB code: 1b3u), the constant regulatory subunit of the enzyme
(Groves et al. 1999). Sequence identity shared
between 1b3u and AtTic110 HEAT domains was 10%
and 7% for domains II and V, respectively.
Fig. 6. HCA plots of four of the six probable transmembrane regions, named according to Figure 1. The first two N-terminal domains
(a and b, Fig. 1) are not represented. Each square contains the HCA plot of the membrane-spanning domains suggested by Balsera
et al. (2009), but in d and e we propose to slide the predicted intramembrane amino acids to the residues indicated by arrows. Together
with the HCA plots are Kyte-Doolittle hydropathy predictions of the same chain of residues (continuous line: AtTic110; dotted line:
PsTic110). Hydrophobic amino acids are in green, charged ones in red and polar residues are in black. Transmembrane stretches are
surrounded by brackets and underlined in the hydropathy plots.
The models obtained by I-Tasser show the typical pairs of helices, corresponding to
six HEAT motifs for each domain (Figs 7a,b for domains II and V). The template modelling score for the
best model of domain II is –1.8 and is reduced to –4.5
for the second ranked model. All ten templates used
by I-Tasser to build the model belong to the α-class
and are annotated either as helix-turn-helix, TPR, or
HEAT motifs. Secondary structures have been assigned
by PSEA (Labesse et al. 1997) for the best model, and
they are reported under the AtTic110 HCA plot of Figure 3. The other templates proposed by I-Tasser shared a common feature: a series of tandemly repeated helices of various lengths.
For domain V, the template modelling score of the
best model is –1.69. The ten templates that I-Tasser
used do not include the PR65/A subunit of protein
phosphatase 2A (PDB code: 1b3u), and 9 of them are
different from the templates used for domain II. Nevertheless, they all still belong to the α-class of proteins, and
annotations in the PDB files usually refer to repeats of
helical motifs. Among the 10 proteins in the PDB that
are structurally related to the first-ranked I-Tasser
model, the PR65/A subunit of protein phosphatase 2A
(1b3u) is the closest for both domains (1.87 Å and
1.98 Å root-mean-square deviation, respectively; Reva
et al. 1998). These values are close, although the models are built on different templates for domains II and
V, and can be interpreted as corroborative evidence for
the repeated nature of these domains.
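For readers unfamiliar with the measure, the root-mean-square deviation quoted above can be sketched as follows. This minimal example assumes that the two structures have already been superposed and that their atoms are matched one to one; the function name and the coordinates are hypothetical, and the superposition and residue-matching steps performed by structural alignment tools are not shown.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two matched, pre-superposed coordinate sets."""
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal size")
    # Sum of squared distances between equivalent atoms.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical coordinates in Å: the second atom is displaced by 2 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(round(rmsd(model, template), 3))
```

The value depends on which atoms are matched and on how the structures were superposed, so figures from different tools are comparable only when those choices are the same.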
Fig. 7. Models proposed by I-Tasser server for the domains II and V of AtTic110. Each HEAT motif is represented by a unique color.
Discussion
In this paper, we clarified three primary, related questions concerning the evolutionary origin, the
number of transmembrane regions, and the secondary
structure of the domains. Some authors consider Tic110
to have only two N-terminal transmembrane domains
(a and b in this work, Fig. 1) so that the rest of the protein faces the stroma (see, e.g., Kalanon & McFadden
2008; Kovács-Bogdán et al. 2010; and Schwenkert et
al. 2011 for a detailed review of conflicting hypotheses). The controversy is limited to the C-terminal part,
since the presence of two helices at the N-terminus is
widely accepted (Schwenkert et al. 2011). However, a
number of experimental studies may have been underestimated, and they reopen the debate on the true topology
of the protein. The first analyses of Kessler & Blobel (1996) concluded that “IAP100 behaved as an integral membrane protein because it could not be extracted by carbonate at pH 11.5”. Subsequently, Jackson et al. (1998), with trypsin digestions of PsTic110,
and then Inaba et al. (2003), with the behaviour of
their AtTic110(93–966) mutant in transgenic Arabidopsis, a truncated construct lacking the transmembrane
regions a and b (Fig. 1), reached an opposite conclusion. They claimed that “pre-AtTic110(93–966) appeared
only in the soluble fraction indicating that the protein
was localized to the chloroplast stroma”.
Nevertheless, the same mutant constructed by Inaba et al. (2005) raised further uncertainty, because
AtTic110SolHis (i.e., the same mutant as
AtTic110₉₃₋₉₆₆ from Inaba et al. 2003), when overexpressed in A. thaliana,
“exhibited no apparent phenotypes, indicating that
these constructs do not interfere with the function of
endogenous AtTic110”, while other negative mutants
(AtTic110NHis and AtTic110CHis) did inhibit
import activity. This result may equally be interpreted
in the opposite way: in some manner, AtTic110SolHis
could accomplish its function by interacting with the
chloroplast inner membrane through intramembrane domains c to f. Finally, Balsera et al. (2009), with new refined experimental data (electrophysiological measurements, limited proteolysis assays in inner envelope vesicles, pegylation assays and CD spectroscopy), revisited
the hypothesis of a Tic110 with channel characteristics and six intramembrane domains. We re-examined this controversial
point here, and we show that AtTic110 is
actually composed of six intramembrane regions.
Domain I faces the stroma and contains two HEAT
motifs, presumably involved in chaperone recruitment
(Lubeck et al. 1996). According to Inaba et al. (2005),
some amino acids in the region 93–602 form the joint
translocation sites between the Toc and Tic complexes,
and some of residues 185–370 interact with transit peptides. Our results are compatible with these data, but
we propose to narrow these regions and
to better delineate the domains facing either the IMS or
the stroma. In particular, we propose that the segment 93–602 described
by Inaba et al. (2005) be reduced
to domain II, ranging from 217 to 480, and that the interaction with transit peptides attributed to residues 185–370
also be restricted to domain
II. This assertion is supported by the finding that amino acids 93–602 are sufficient for assembly into supercomplexes
(Inaba et al. 2005). Therefore, this activity should be
assumed by a larger domain II inside the IMS. In this
work, we show an enlargement of domain II from 83 to
263 residues (amino acids 217–480) and the assignment of three new HEAT repeats, thus increasing
the surface available for interaction with IMS proteins, including transit peptides.
According to Inaba et al. (2005), AtTic110 forms a
ternary complex with Tic40 and Hsp93, and some of the
Tic110 residues in the ranges 331–553 and 602–966 are
implicated in the binding (Balsera et al. 2009). This is
consistent with the domain boundaries we propose here:
domain III from amino acids 493 to 553 (60 residues)
and domain V from 676 to 966 (290 residues) both face
the stroma. The assignment of four additional HEAT
motifs in domain V favours the formation of such a
ternary complex.
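The boundary arithmetic quoted above can be checked with a short Python sketch. It assumes that lengths are counted as last minus first residue, which is the convention that reproduces the values given in the text (inclusive counting would give one residue more). The helper names span_length and overlap are hypothetical and introduced only for this illustration.

```python
# Domain boundaries (first, last residue) as quoted in the text.
DOMAINS = {
    "II": (217, 480),
    "III": (493, 553),
    "V": (676, 966),
}

# Domain lengths as stated in the text.
STATED_LENGTHS = {"II": 263, "III": 60, "V": 290}


def span_length(first, last):
    """Length counted as last - first (assumed convention, see lead-in)."""
    return last - first


def overlap(region_a, region_b):
    """Number of positions shared by two (first, last) ranges, same convention."""
    shared = min(region_a[1], region_b[1]) - max(region_a[0], region_b[0])
    return max(0, shared)


for name, (first, last) in DOMAINS.items():
    length = span_length(first, last)
    print(name, length, length == STATED_LENGTHS[name])

# Regions reported in earlier work, compared with the proposed domains.
print(overlap((185, 370), DOMAINS["II"]))   # transit-peptide region vs domain II
print(overlap((331, 553), DOMAINS["III"]))  # binding range vs domain III
print(overlap((602, 966), DOMAINS["V"]))    # binding range vs domain V
```

Under this convention, the ranges 331–553 and 602–966 fully contain domains III and V, respectively, while 185–370 only partly overlaps domain II.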
One unexpected finding of this work is the splitting of the highly charged region recognized by Balsera
et al. (2009) into two discrete domains, III (net positive charge) and IV (net negative charge). The significance
of this finding remains unknown, but it can be speculated that there is a relationship between Tic110 channel
properties and cationic selectivity with the sudden
reversal potential after the addition of CaCl2 (Balsera
et al. 2009). In any case, our results agree with the prediction that these two domains are disordered, given
their hydrophobic content, which is too low to give rise
to a globular structure.
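The two quantities invoked here, net charge and hydrophobic content, can be estimated from sequence alone. The sketch below is a simplified illustration under stated assumptions: only K and R are counted as positive and D and E as negative (histidine is ignored), the hydrophobic alphabet VILFMYW is the one assumed for HCA, and the two peptides are invented toy sequences, not the actual domains III and IV of AtTic110.

```python
POSITIVE = set("KR")          # assumed fully protonated
NEGATIVE = set("DE")          # assumed fully deprotonated
HYDROPHOBIC = set("VILFMYW")  # assumed HCA hydrophobic alphabet


def net_charge(seq):
    """Count of K/R minus count of D/E (crude estimate, histidine ignored)."""
    plus = sum(1 for aa in seq if aa in POSITIVE)
    minus = sum(1 for aa in seq if aa in NEGATIVE)
    return plus - minus


def hydrophobic_fraction(seq):
    """Fraction of residues belonging to the assumed hydrophobic alphabet."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(1 for aa in seq if aa in HYDROPHOBIC) / len(seq)


# Hypothetical toy peptides (illustration only).
basic_like = "KRKSGKRAEK"
acidic_like = "DEEDSGEDLK"

for label, seq in (("basic-like", basic_like), ("acidic-like", acidic_like)):
    print(label, net_charge(seq), hydrophobic_fraction(seq))
```

A segment with a strongly non-zero net charge and a very low hydrophobic fraction, as in these toy peptides, is the kind of profile that is generally not expected to fold into a compact globule.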
The use of HCA in combination with prediction
methods enabled us to identify highly divergent HEAT-like motifs that have been duplicated across the protein
at least 14 times. This task would be unattainable by
conventional sequence alignment methods, particularly because of the
absence of a Tic110 three-dimensional structure and the
fact that HEAT-like repeats have degenerated throughout evolution.
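To give a feel for the kind of signal HCA relies on, the following Python sketch delineates hydrophobic clusters along a sequence. It is a deliberately simplified one-dimensional approximation, not the HCA procedure itself, which works on a two-dimensional helical net. The assumptions are: the hydrophobic alphabet is VILFMYW, two hydrophobic residues belong to the same cluster when fewer than four other residues separate them, and proline always breaks a cluster. The function name and the toy sequence are hypothetical.

```python
HYDROPHOBIC = set("VILFMYW")  # assumed hydrophobic alphabet


def hydrophobic_clusters(seq, min_gap=4):
    """Return (start, end) index pairs (0-based, inclusive) of hydrophobic clusters.

    A cluster is closed when min_gap or more non-hydrophobic residues follow
    its last hydrophobic residue, or when a proline is met.
    """
    clusters = []
    start = None  # first hydrophobic position of the open cluster
    last = None   # most recent hydrophobic position of the open cluster
    for i, aa in enumerate(seq):
        if aa == "P":
            if start is not None:
                clusters.append((start, last))
            start = last = None
        elif aa in HYDROPHOBIC:
            if start is None:
                start = i
            elif i - last - 1 >= min_gap:
                clusters.append((start, last))
                start = i
            last = i
    if start is not None:
        clusters.append((start, last))
    return clusters


# Hypothetical toy sequence (illustration only).
toy = "LAALKKLGSGSGSVLPIW"
print(hydrophobic_clusters(toy))
```

In this toy case the first cluster spans positions 0 to 6, the long glycine and serine stretch separates it from the second cluster, and the proline splits the second cluster from the third.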
One can advance the following arguments to support the presence of HEAT repeats within Tic110 proteins: (i) the predicted Tic110 HEAT-like motifs fulfil the structural requirements for forming two helical patches with appropriate spacing; (ii) each predicted
repeat contains sequence elements that match the consensus sequence for HEAT repeats, mainly in the hydrophobic pattern; (iii) the resolution of the method
proposed by Balsera et al. (2009) allows for the recovery of some HEAT motifs, although limited to
three, whereas we propose that the number of such motifs is 14
over the whole AtTic110 sequence; and (iv) both Tic110 and
HEAT-repeat proteins rely on protein-protein interactions in their activities, such as assembly and membrane transport (Cingolani et al. 2000). These critical
actions would require multiple HEAT repeats to collect
an assortment of subunits and assemble active supercomplexes. Because other HEAT-repeat proteins play key
roles as scaffolds in transport processes, this finding
provides a plausible mechanistic explanation for the origin of the Tic110 proteins.
Since internal repeats result in an enlargement
of the available interacting surface, the most common
function of repeat ensembles is protein binding (Andrade et al. 2001b). Such a global structure provides opportunities for new proteins to expand their repertoire
of cellular functions, such as protein-protein interaction, transport across membranes and protein-complex
assembly (Street et al. 2006). Although HEAT-repeat
proteins are involved in a great diversity of cellular
processes, a common function is to mediate multiple
protein-protein interactions like a scaffold, for translocation processes of diverse macromolecules (Groves &
Barford 1999; Cingolani et al. 2000).
Although previous studies have given some insights
into the function and topology of the Tic110 protein,
its evolutionary origin has been impossible to determine
by traditional one-dimensional sequence alignments or
database screening methods. Up to now, Tic110 has been considered to be of eukaryotic origin (Reumann & Keegstra
1999; Kalanon & McFadden 2008; Kovács-Bogdán et
al. 2010). However, HEAT repeats are widely distributed
in prokaryotic proteins: for example, the InterPro
database (Hunter et al. 2009) registers 840 bacterial and
173 cyanobacterial HEAT proteins (entry IPR000357),
and structures are available in the PDB (Rubinson et
al. 2008, 2010). Because prokaryotes were the first organisms to evolve (Vellai & Vida 1999), it is reasonable
to suggest that Tic110 might have been newly developed in the arising plant cell in concert with chloroplast
development (i.e., a dual origin) or lost from ancestral
cyanobacteria (Reumann & Keegstra 1999; Reumann
et al. 2005).
Note added in proof
During the evaluation of the manuscript, a
paper appeared on the structure of Tic110 from
Cyanidioschyzon merolae (Tsai et al. 2013). The authors succeeded in obtaining a low-resolution structure
of the C-terminal part, which starts at the e transmembrane region (see Fig. 1b). They report seven HEAT motifs
(14 antiparallel helices) in this region, whereas our prediction gives six.
The missing one may be the transmembrane region f
(Fig. 1b), which Tsai et al. (2013) attribute to
a HEAT motif. Nevertheless, our proposal to increase the number of HEAT motifs compared to previous
work is coherent with the results of Tsai et al. (2013).
Their conclusion favours a model of Tic110 structure with only two transmembrane regions and a long
C-terminal region in the stroma. However, they are
very cautious in their assertion, because their evidence
is not complete. In particular, crystallization of two longer
fragments was not successful (Tsai et al. 2013), and one
can thus hypothesize that this might be due to the presence of a transmembrane region, prone to aggregation in
the absence of detergent in the crystallization conditions. The debate on the number
of transmembrane regions is therefore still not closed, although it now seems clear
that the number of HEAT motifs is higher than
previously expected.
Acknowledgements
JHT benefited from a three-month invitation from the
CNRS. Many thanks are due to Catherine Venien-Bryan for
fruitful discussions. We gratefully acknowledge Dr. Christine
D. Bacon for the English corrections to the final manuscript.
References
Altschul S., Madden T., Schaffer A., Zhang J., Zhang Z., Miller
W. & Lipman D. 1997. Gapped-BLAST and PSI-BLAST: a
new generation of protein database search programs. Nucleic
Acids Res. 25: 3389–3402.
Andrade M., Petosa C., O’Donoghue S., Muller C. & Bork P.
2001a. Comparison of ARM and HEAT protein repeats. J.
Mol. Biol. 309: 1–18.
Andrade M., Perez C. & Ponting C. 2001b. Protein repeats: structures, functions, and evolution. J. Struct. Biol. 134: 117–131.
Bailey T. & Elkan C. 1994. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc.
Int. Conf. Intell. Syst. Mol. Biol. 2: 28–36.
Balsera M.A., Goetze T.A., Kovács-Bogdán E., Schürmann P.,
Wagner R., Buchanan B.B., Soll J.A. & Bölter B. 2009. Characterization of Tic110, a channel-forming protein at the inner envelope membrane of chloroplasts, unveils a response
to Ca2+ and a stromal regulatory disulfide bridge. J. Biol.
Chem. 284: 2603–2616.
Benz J., Soll J. & Bolter B. 2009. Protein transport in organelles:
the composition, function and regulation of the Tic complex
in chloroplast protein import. FEBS J. 276: 1166–1176.
Bjorklund A., Ekman D. & Elofsson A. 2006. Expansion of protein domain repeats. PLoS Comput. Biol. 2: 114.
Bryson K., McGuffin L., Marsden R., Ward J., Sodhi J. & Jones
D. 2005. Protein structure prediction servers at University
College London. Nucleic Acids Res. 33: 36–38.
Callebaut I., Labesse G., Durand P., Poupon A., Canard L.,
Chomilier J., Henrissat B. & Mornon J.P. 1997. Deciphering protein sequence information through hydrophobic cluster
analysis: current status and perspectives. Cell. Mol. Life Sci.
53: 621–645.
Cavalier-Smith T. 2000. Membrane heredity and early chloroplast
evolution. Trends Plant Sci. 5: 174–182.
Cingolani G., Lashuel H., Gerace L. & Muller C. 2000. Nuclear
import factors importin α and importin β undergo mutually induced conformational changes upon association. FEBS
Lett. 484: 291–298.
Cingolani G., Petosa C., Weis K. & Muller C.W. 1999. Structure of importin-β bound to the IBB domain of importin-α.
Nature 399: 221–229.
Douguet D. & Labesse G. 2003. Easier threading through web-based comparisons and cross-validations. Bioinformatics 19:
874–881.
Eudes R., Le Tuan K., Delettré J., Mornon J.P. & Callebaut I.
2007. A generalized analysis of hydrophobic and loop clusters
within globular protein sequences. BMC Struct. Biol. 7: 2.
Goldenberg S.J., Cascio T.C., Shumway S.D., Garbutt K.C., Liu
J., Xiong Y. & Zheng N. 2004. Structure of the Cand1-Cul1-Roc1 complex reveals regulatory mechanisms for the assembly of the multisubunit Cullin-dependent ubiquitin ligases.
Cell 119: 517–528.
Groves M. & Barford D. 1999. Topological characteristics of helical repeat proteins. Curr. Opin. Struct. Biol. 9: 383–389.
Groves M.R., Hanlon N., Turowski P., Hemmings B.A. & Barford D. 1999. The structure of the protein phosphatase 2A
PR65/A subunit reveals the conformation of its 15 tandemly
repeated HEAT motifs. Cell 96: 99–110.
Heber U. & Heldt H.W. 1981. The chloroplast envelope: structure, function, and role in leaf metabolism. Annu. Rev. Plant
Physiol. 32: 139–168.
Heins L., Mehrle A., Hemmler R., Wagner R., Küchler M.,
Hörmann F., Sveshnikov D. & Soll J. 2002. The preprotein
conducting channel at the inner envelope membrane of plastids. EMBO J. 21: 2616–2625.
Heldt H.W. & Sauer F. 1971. The inner membrane of the chloroplast envelope as the site of specific metabolite transport.
Biochim. Biophys. Acta 234: 83–91.
Hernández J., Arias M.A. & Chomilier J. 2007. Tandem duplications of a degenerated GTP-binding domain at the origin
of GTPase receptors Toc159 and thylakoidal SRP. Biochem.
Biophys. Res. Commun. 364: 325–331.
Hiltbrunner A., Bauer J., Vidi P.A., Infanger S., Weibel P., Hohwy M. & Kessler F. 2001. Targeting of an abundant cytosolic
form of the protein import receptor at Toc159 to the outer
chloroplast membrane. J. Cell Biol. 154: 309–316.
Hunter S., Apweiler R., Attwood T.K., Bairoch A., . . . & Yeats
C. 2009. InterPro: the integrative protein signature database.
Nucleic Acids Res. 37: D211–D215.
Inaba T., Alvarez M., Li M., Bauer J., Ewers C., Kessler F. &
Schnell D.J. 2005. Arabidopsis Tic110 is essential for the assembly and function of the protein import machinery of plastids. Plant Cell 17: 1482–1496.
Inaba T., Li M., Alvarez M., Kessler F. & Schnell D.J. 2003.
atTic110 functions as a scaffold for coordinating the stromal
events of protein import into chloroplasts. J. Biol. Chem. 278:
38617–38627.
Jackson D., Froehlich J. & Keegstra K. 1998. The hydrophilic
domain of Tic110, an inner envelope membrane component
of the chloroplastic protein translocation apparatus, faces the
stromal compartment. J. Biol. Chem. 273: 16583–16588.
Jarvis P. 2008. Targeting of nucleus-encoded proteins to chloroplasts in plants. New Phytol. 179: 257–285.
Jones D. 1999a. Protein secondary structure prediction based on
position-specific scoring matrices. J. Mol. Biol. 292: 195–202.
Jones D. 1999b. GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. J. Mol.
Biol. 287: 797–815.
Kalanon M. & McFadden G.I. 2008. The chloroplast protein
translocation complexes of Chlamydomonas reinhardtii: a
bioinformatic comparison of Toc and Tic components in
plants, green algae and red algae. Genetics 179: 95–112.
Kessler F. & Blobel G. 1996. Interaction of the protein import
and folding machineries of the chloroplast. Proc. Natl. Acad.
Sci. USA 93: 7684–7689.
Kessler F. & Schnell D.J. 2006. The function and diversity of
plastid protein import pathways: a multilane GTPase highway into plastids. Traffic 7: 248–257.
Kobe B., Gleichmann T., Horne J., Jennings I.G., Scotney P.D.
& Teh T. 1999. Turn up the HEAT. Structure 7: R91–R97.
Kouranov A., Chen X., Fuks B. & Schnell D.J. 1998. Tic20 and
Tic22 are new components of the protein import apparatus at
the chloroplast inner envelope membrane. J. Cell Biol. 143:
991–1002.
Kovács-Bogdán E., Soll J. & Bölter B. 2010. Protein import into
chloroplasts: the Tic complex and its regulation. Biochim.
Biophys. Acta 1803: 740–747.
Kuchler M., Decker S., Hormann F., Soll J. & Heins L. 2002.
Protein import into chloroplasts involves redox regulated proteins. EMBO J. 21: 6136–6145.
Kyte J. & Doolittle R. 1982. A simple method for displaying the
hydropathic character of a protein. J. Mol. Biol. 157: 105–
132.
Labesse G., Colloc’h N., Pothier J. & Mornon J.P. 1997. P-SEA:
a new efficient assignment of secondary structure from Cα
trace of proteins. Comput. Applic. Biosci. 13: 291–295.
Lemesle-Varloot L., Henrissat B., Gaboriaud C., Bissery V., Morgat A. & Mornon J.P. 1990. Hydrophobic cluster analysis:
procedures to derive structural and functional information
from 2-D-representation of protein sequences. Biochimie 72:
555–574.
Lubeck J., Soll J., Akita M., Nielsen E. & Keegstra K. 1996.
Topology of IEP110, a component of the chloroplastic protein
import machinery present in the inner envelope membrane.
EMBO J. 15: 4230–4238.
Marcotte E., Pellegrini M., Yates T. & Eisenberg D. 1999. A
census of protein repeats. J. Mol. Biol. 293: 151–160.
McFadden G.I. 2001. Primary and secondary endosymbiosis and
the origin of plastids. J. Phycol. 37: 951–959.
McGuffin L. & Jones D. 2003. Improvement of the GenTHREADER method for genomic fold recognition. Bioinformatics 19: 874–881.
Meinke D.W., Cherry J.M., Dean C., Rounsley S.D. & Koornneef M. 1998. Arabidopsis thaliana: a model plant for genome
analysis. Science 282: 662–682.
Oeffinger M., Dlalic M. & Tollervey D. 2004. A pre-ribosome-associated HEAT-repeat protein is required for export of both
ribosomal subunits. Genes Dev. 18: 196–209.
Reumann S., Inoue K. & Keegstra K. 2005. Evolution of the
general protein import pathway of plastids (review). Mol.
Membr. Biol. 1-2: 73–86.
Reumann S. & Keegstra K. 1999. The endosymbiotic origin of
the protein import machinery of chloroplastic envelope membranes. Trends Plant Sci. 4: 302–307.
Reva B.A., Finkelstein A.V. & Skolnick J. 1998. What is the
probability of a chance prediction of a protein structure with
an rmsd of 6 Å? Fold. Des. 3: 141–147.
Rost B. 1999. Twilight zone of protein sequence alignments. Protein Eng. 12: 85–94.
Rubinson E.H., Gowda A.S.P., Spratt T.E., Gold B. & Eichman
B.F. 2010. An unprecedented nucleic acid capture mechanism
for excision of DNA damage. Nature 468: 406–411.
Rubinson E.H., Metz A.H., O’Quin J. & Eichman B.F. 2008. A
new protein architecture for processing alkylation damaged
DNA: the crystal structure of DNA glycosylase AlkD. J. Mol.
Biol. 381: 13–23.
Schwenkert S., Soll J. & Bölter B. 2011. Protein import into
chloroplasts: how chaperones feature into the game. Biochim.
Biophys. Acta 1808: 901–911.
Soll J. & Schleiff E. 2004. Protein import into chloroplasts. Nat.
Rev. Mol. Cell Biol. 5: 198–208.
Stengel A., Soll J. & Bolter B. 2007. Protein import into chloroplasts: new aspects of a well-known topic. Biol. Chem. 388:
765–772.
Stewart M. 2007. Molecular mechanism of the nuclear protein
import cycle. Nat. Rev. Mol. Cell Biol. 8: 195–208.
Street T., Rose G. & Barrick D. 2006. The role of introns in repeat
protein gene formation. J. Mol. Biol. 360: 258–266.
Tsai J.Y., Chu C.C., Yeh Y.H., Chen L.J., Li T.S. & Hsiao C.D.
2013. Structural characterizations of the chloroplast translocon protein Tic110. Plant J. 75: 847–857.
Vellai T. & Vida G. 1999. The origin of eukaryotes: the difference
between prokaryotic and eukaryotic cells. Proc. R. Soc. Lond.
B 266: 1571–1577.
Vlahovicek K., Kajan L., Agoston V. & Pongor S. 2005. The
SBASE domain sequence resource, release 12: prediction of
protein domain-architecture using support vector machines.
Nucleic Acids Res. 33: D223–D225.
Woodcock S., Mornon J.P. & Henrissat B. 1992. Detection of secondary structure elements in proteins by hydrophobic cluster
analysis. Protein Eng. 5: 629–635.
Zhang Y. 2008. I-TASSER server for protein 3D structure prediction. BMC Bioinformatics 9: 40.
Received May 30, 2013
Accepted October 26, 2013