Extra domains in secondary transport carriers and channel proteins

Biochimica et Biophysica Acta 1758 (2006) 1557 – 1579
www.elsevier.com/locate/bbamem
Review
Extra domains in secondary transport carriers and channel proteins
Ravi D. Barabote, Dorjee G. Tamang, Shannon N. Abeywardena, Neda S. Fallah,
Jeffrey Yu Chung Fu, Jeffrey K. Lio, Pegah Mirhosseini, Ronnie Pezeshk, Sheila Podell 1 ,
Marnae L. Salampessy, Mark D. Thever, Milton H. Saier Jr. ⁎
Division of Biological Sciences, University of California at San Diego, La Jolla, CA 92093-0116, USA
Received 9 May 2006; received in revised form 16 June 2006; accepted 20 June 2006
Available online 27 June 2006
Abstract
“Extra” domains in members of the families of secondary transport carrier and channel proteins provide secondary functions that expand,
amplify or restrict the functional nature of these proteins. Domains in secondary carriers include TrkA and SPX domains in DASS family
members, DedA domains in TRAP-T family members (both of the IT superfamily), Kazal-2 and PDZ domains in OAT family members (of the MF
superfamily), USP, IIAFru and TrkA domains in ABT family members (of the APC superfamily), ricin domains in OST family members, and TrkA
domains in AAE family members. Some transporters contain highly hydrophilic domains consisting of multiple repeat units that can also be found
in proteins of dissimilar function. Similarly, transmembrane α-helical channel-forming proteins contain unique, conserved, hydrophilic domains,
most of which are not found in carriers. In some cases the functions of these domains are known. They may be ligand binding domains,
phosphorylation domains, signal transduction domains, protein/protein interaction domains or complex carbohydrate-binding domains. These
domains mediate regulation, subunit interactions, or subcellular targeting. Phylogenetic analyses show that while some of these domains are
restricted to closely related proteins derived from specific organismal types, others are nearly ubiquitous within a particular family of transporters
and occur in a tremendous diversity of organisms. The former probably became associated with the transporters late in the evolutionary process;
the latter probably became associated with the carriers much earlier. These domains can be located at either end of the transporter or in a central
region, depending on the domain and transporter family. These studies provide useful information about the evolution of extra domains in
channels and secondary carriers and provide novel clues concerning function.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Transport; Secondary carriers; Evolution; Protein domains; DedA; TrkA; SPX; Kazal-2; PDZ; USP; IIAFru
Contents
1.
2.
3.
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Computer methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.1. TRKA-C domains in proteins of the DASS family (TC #2.A.47). . . . . . . . . . . . . . . . . . . . . . . .
3.2. SPX domains in proteins of the DASS family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3. DedA domains in TRAP-T family transporters (TC #2.A.56). . . . . . . . . . . . . . . . . . . . . . . . . .
3.4. Kazal-2 domains in proteins of the OAT family (TC #2.A.60) . . . . . . . . . . . . . . . . . . . . . . . . .
3.5. PDZ domains in proteins of the OAT family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.6. USP domains in proteins of the ABT family (TC #2.A.3.6) within the APC superfamily (TC #2.A.3) . . . .
3.7. TrkA-C domains in proteins of the aspartate:alanine exchanger (AAE) family (TC #2.A.81) . . . . . . . . .
3.8. Multiple domains in proteins of the eukaryotic-specific organic solute transporter (OST) family (TC #2.A.82)
3.9. Repeat domains in members of the 4 TMS multidrug endosomal transporter (MET) family (TC #2.A.74) . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
⁎ Corresponding author. Tel.: +1 858 534 4084; fax: +1 858 534 7108.
E-mail address: [email protected] (M.H. Saier).
1
Present address: Scripps Genome Center, Scripps Institution of Oceanography, University of California at San Diego, La Jolla, CA 92093-0202, USA.
0005-2736/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.bbamem.2006.06.018
.
.
.
.
.
.
.
.
.
.
.
.
1558
1558
1558
1558
1559
1560
1561
1564
1565
1565
1569
1572
1558
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
3.10. “Extra” domains in ion channel proteins . . . . . . . . . . . . .
3.11. The voltage-gated ion channel (VIC) superfamily (TC #1.A.1–5)
3.12. The ligand-gated ion channel (LIC) family (TC #1.A.9) . . . . .
3.13. The chloride ion channel (ClC) family (TC #1.A.11). . . . . . .
4. Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
1574
1574
1575
1575
1576
1577
1577
1. Introduction
2. Computer methods
For the past 15 years, our bioinformatics laboratory has
focused on the characterization, classification and evolutionary origins of transport proteins [1,2]. The availability of
the transporter classification system [3,4], now incorporated
into an extensive database [5], has allowed us to conduct
analyses leading to conclusions about the origins of specific
transporters and protein domains present in transport systems
[6,7].
Of particular value in identifying the functions of
transporters of unknown mechanism of action and specificity
has been the analysis of extra domains in “fusion” proteins.
For example, the fusion of a transporter of the major
facilitator superfamily (MFS) [8,9] to a fatty acyl transferase/
acyl-ACP synthetase led to the functional characterization of
a family of transporters catalyzing lysophospholipid flipping
across the cytoplasmic membranes of Gram-negative bacteria
[10]. In another case, fusion of transporters of the SulP
family to carbonic anhydrase homologues led to the correct
prediction that these systems are bicarbonate transporters
[11].
In further recent studies, we analyzed fusion proteins where
recognized domains of uncertain function are fused to transport
protein homologues. These studies also allowed correct
functional predictions [12]. Analyses of proteins that incorporate domains homologous to phosphoryl transfer proteins of the
phosphoenolpyruvate-dependent phosphotransferase system
(PTS) led to important predictions about their modes of
regulation [13].
Bioinformatic successes such as those described above have
tremendously facilitated functional genomic approaches. They
have prompted us to analyze a broader range of protein domains
associated with secondary transport carriers and α-type
channel-forming proteins. The results of our analyses are
reported here. We show that extra domains are not uncommon
in families of carriers and channels, and in many cases their
functions can be predicted. The presence of multiple such
domains often indicates the occurrence of complex molecular
interactions.
A Fasta-formatted file of all the protein sequences in the
Transporter Classification Database (TCDB; http://www.tcdb.
org/tcdb) was obtained and used for domain identification and
analysis. All TCDB proteins were screened against Pfam [14]
protein domain profiles using the HMMER program (version
2.3.1) (http://hmmer.wustl.edu/) [15] with an e-value cutoff of
e− 3. Secondary carriers (TC subclass #2.A) were chosen for
analysis of the occurrence of various Pfam domains. Secondary
transporters containing hydrophilic or amphipathic domains
that had not previously been identified and additional domains
that appeared to be of interest were detected and characterized.
TCDB proteins were used as query sequences for identification of all family homologues in the NCBI protein
database. All protein homologues discussed here were obtained
using the PSI-BLAST search tool with iterations [16], screening
the NCBI RefSeq databank. Transport proteins from the TCDB
database that were found to contain the domains under study
were used as query sequences in the PSI-BLAST searches.
Homologues obtained from the Blast searches were shown to be
members of the respective TC families using TC-BLAST
(http://www.tcdb.org/progs/blast.php), retaining only established members of the respective families under study.
Homologues were checked for redundancies using an unpublished program (S. Singhi and M.H. Saier, Jr. unpublished).
Multiple sequence alignments and clustal-generated trees were
constructed using the CLUSTAL X program [17]. Clustal-generated trees were viewed using the TreeView program [18]. Default
parameters were used. Tables listing all the proteins analyzed for
each family as well as the multiple alignments and clustal-generated trees (as figures) are provided as supplementary material on
our website: http://biology.ucsd.edu/~msaier/supmat/Domains.
The AveHAS program [19] was used for plotting the average
hydropathy, similarity and amphipathicity as a function of
alignment position after aligning the sequences with the
CLUSTAL X program. The different domains were recognized
using the NCBI CD-Search tool [20]. The occurrence of the
domain in all the homologues of a family was assessed by
analyzing the multiple sequence alignments.
3. Results
3.1. TRKA-C domains in proteins of the DASS family (TC #2.A.47)
The Divalent Anion:Na+ symporter (DASS) family (TC #2.A.47) [21] is a constituent family within the Ion Transporter (IT)
superfamily [22]. Various functionally characterized members of the DASS family transport inorganic sulfate, phosphate and organic
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1559
di- and tricarboxylates of the Krebs cycle as well as dicarboxylic amino acid. Some of these transporters exhibit narrow specificity
while others exhibit broad specificity. The DASS family is ubiquitous, occurring in all three domains of life, archaea, bacteria and
eukaryotes.
Hydrophilic TrkA domains consist of two conserved regions, TrkA-N and TrkA-C, both of about 120 residues, separated by a
short, poorly conserved linker region [23]. The TrkA-N domain occurs in a wide variety of transport and non-transport proteins such
as channels, carriers, exonucleases, dehydrogenases and phosphoesterases [24,25]. This domain binds NAD+. The β-structured
TrkA-C domain is less well defined but probably binds an unidentified ligand [23].
We have identified TrkA-C domains in members of the DASS family (TC #2.A.47) [21,22]. Members of this ubiquitous family
were divided into three subgroups based on the organisms of origin: eukaryotes, bacteria and archaea. The TrkA-C domain was
found in some but not all members of each of the three groups of proteins. In all of these proteins, these domains appear to occur
between hydrophobic peaks 6 and 7, that is, between the two putative repeat elements of these 10 or 12 TMS carrier proteins (see Fig.
1A–C) [22,26,27]. In the 10 TMS model, hydrophobic peaks 5 and 11 are reentrant loops that go into the membrane but do not
traverse it [26].
Of the eight archaeal homologues identified (see Table DASS-S1 for the list of proteins studied and their properties, and Fig.
DASS-S1A for the multiple alignment, both on our website), only 4 proteins contain the TrkA-C domain, and of them, three (Hma1
and Hma2 from Haloarcula marismortui as well as Hsa1-1 from Halobacterium salinarum) have this domain tandemly duplicated.
These two tandemly duplicated TrkA-C domains were designated as TrkA-C1 and TrkA-C2 (Fig. 1). The fourth TrkA-C-containing
archaeal homologue, Pae1-1 from Pyrobaculum aerophilum, had only one copy of the TrkA-C domain. The three halophilic
archaeal proteins cluster together on the clustal-generated tree while the Pyrobaculus protein clusters distantly from all others
(Fig. DASS-S1B).
While two archaeal Hma paralogues (Hma1 and Hma2) contain the TrkA-C domain, as noted above, a third Hma paralogue lacks
the TrkA-C domain. This protein clusters separately from its paralogues, but it clusters loosely with Mja1 from Methancaldococcus
jannaschii which also lacks the TrkA-C domain. The results suggest that both association and duplication of the TrkA-C domain
occurred early during the evolutionary divergence of these archaeal homologues.
Of the 195 bacterial homologues (see Table DASS-S2), the first TrkA-C domain (TrkA-C1) could be identified in about 75 of
them. The second TrkA-C domain (TrkA-C2) was present in almost all bacterial homologues. The multiple alignment and the clustalgenerated tree of the bacterial proteins (Figs. DASS-S2A and S2B, respectively) revealed that the occurrence of the TrkA-C domain
correlates reasonably well with phylogeny. For example, of the eight clusters shown in Figure DASS-S2B, no protein in clusters 1, 2,
7 and 8 contains a TrkA-C1 domain, but all members of clusters 4 and 5 and most members of clusters 3 and 6 contain these domains.
The organisms that either possess or lack the TrkA-C1 domain derive from diverse bacterial kingdoms, and many organisms encode
in their genomes homologues that both do and do not contain this domain. Thus, TrkA-C domain occurrence does not correlate with
organismal phylogeny.
One hundred and fifteen eukaryotic DASS family homologues were studied (Table DASS-S3). Of these, only four have
recognizable TrkA-C1 domains. All four of these proteins (Ehi1-4) are from Entamoeba histolytica (Fig. DASS-S3A). These
proteins cluster together and comprise a single subcluster of cluster 1 (see Fig. DASS-S3B). The TrkA-C2 domain was not detected
in any of the eukaryotic homologues.
3.2. SPX domains in proteins of the DASS family
The DASS family was described in the previous section. In addition to the TrkA domain described therein, some members of the
DASS family possess N-terminal SPX domains. These SPX (Syg1/Pho81/XPR1) domains, of about 180 residues, are found at the Ntermini of various proteins, particularly signal transduction proteins [28,29]. They are hydrophilic domains believed to bind Gprotein β-subunits [30,31]. Such a domain inhibits transduction of the mating pheromone signal in yeast [30]. Possibly all SPX
domains bind G-proteins for the purpose of signal transduction. SPX domains are often (1) variable in sequence, (2) split due to
insertions, or (3) fragmented due to deletions. They are found in DASS family proteins present in all of the major eukaryotic
kingdoms (fungi, plants, animals, slime molds and protozoans).
The clearly recognized SPX domains in the DASS family occur only in one cluster of the clustal-generated tree. This is fungal
cluster 7 (see Table DASS-S3 and Fig. DASS-S3B on our website). Almost all members of this cluster possess the SPX domain. It
accounts for a significant part of the large, N-terminal, poorly conserved, hydrophilic domain shown in Fig. 1C. The presence of
this domain suggests that fungal DASS family members of cluster 7 may have their transport functions regulated by direct Gprotein binding. Alternatively, they may mediate cellular regulatory functions in addition to their transport functions by serving as
receptors for G-proteins. Thus, by an uncharacterized mechanism, they may allow transmission of a signal to specific cellular
constituents.
Relevant to this second possibility, it is important to note that low affinity phosphate carriers, including Pho87, Pho90 and Pho91
in S. cerevisiae, all of cluster 7 in the DASS family (see Table DASS-S3 and Fig. DASS-S3B), have been shown to play a role in pho
gene regulation [32]. It is also noteworthy that several of the large superfamilies in TCDB include members that instead of, or in
addition to transport, function as receptors in signal transduction processes [2].
1560
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
Fig. 1. Average hydropathy (top) and similarity (bottom) plots for sequenced members of the DASS family. A, archaeal homologues; B, bacterial homologues; C,
eukaryotic homologues. The plots were generated from the multiple alignments shown on our website (Figs. DASS-S1A, S2A and S3A) using the AveHas program
[19]. The proteins included are presented in Tables DASS-S1–S3, respectively. The positions of the TrkA-C domains in all three figures as well as that of the SPX
domain in figure C are indicated. In this and other figures, average similarity values are presented as relative, not absolute values, and the scale is not presented
[see Ref. [19] for details].
3.3. DedA domains in TRAP-T family transporters (TC #2.A.56)
The Tripartite ATP-independent Periplasmic Transporter (TRAP-T) family consists of systems with three components, an
extracytoplasmic binding receptor, a large transmembrane (TM) protein with ≥ 12 putative TMSs (peaks of hydrophobicity in a
hydropathy plot) and a small transmembrane protein with 4 TMSs [33,34]. The larger TM constituents are members of the IT
superfamily [22], and they often include DedA domains [14]. Nothing is known about the function(s) of DedA domains, but these
domains are found in proteins from archaea, bacteria and eukaryotes. In addition to transporters, they are found in membraneassociated enzymes such as phospholipases and oxidoreductases. TRAP-T systems are found in bacteria and archaea but not in
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1561
eukaryotes. The archaeal homologues are very distantly related to the bacterial homologues [22,34]. Only the latter were
analyzed in this study.
Two hundred and forty-five large subunits of TRAP-T systems from bacteria were characterized phylogenetically and analyzed
for DedA domains (see Table TRAP-S1). The multiple alignment and clustal-generated tree for all of these proteins are shown in
Figures TRAP-T-S1A and S1B, respectively. There are three obvious clusters, and the alignments and trees for these three clusters,
clusters 1–3, are shown in Figures TRAP-S2-4, A and B, respectively. Clusters 1 and 3 include clearly delineated DedA domains, but
the DedA domain could not be identified in any of the cluster 2 proteins.
The average hydropathy and similarity plots for aligned clusters 1, 2 and 3 proteins are shown in Fig. 2A–C, respectively. Cluster
1 homologues (Fig. 2A) show 11 peaks of conserved hydrophobicity that might correspond to 12 TMSs, numbered 1–12. The first
peak of hydrophobicity is broad and is predicted to include two TMSs with a loop region of minimal size, probably just a β-turn. The
DedA domains occur at alignment positions 430 to 760, and putative TMSs 6, 7 and 8 lie within this domain. From the average
similarity plot, it is clear that insertions have occurred within this domain in some of the homologues. The cluster 3 plot (Fig. 2C)
shows the same pattern where putative TMSs 6, 7 and 8 lie within the DedA domain. In this plot, as for that of the cluster 1
homologues, regions of poor conservation are observed, corresponding to insertions in just one or a few of the homologues. It should
be noted that DedA domains differ from all other “extra” domains in being parts of the hydrophobic core of these proteins. The
significance of the DedA domain relative to the remainder of the transporter is not clear.
Cluster 2 proteins show 22 peaks of hydrophobicity. Several of these peaks are poorly conserved, but there is a much greater range
of conserved sequences than in either cluster 1 or cluster 3. The DedA domain was not identified in initial searches, but subsequent
comparisons between cluster 2 proteins and proteins of clusters 1 and 3 revealed that sequence divergent DedA domains could be
identified. The hydropathy plot shown in Fig. 2B indicates where this DedA domain occurs since according to our TMS numbering
system, it always corresponds to TMSs 6–8 (Fig. 2A–C).
Putative TMSs that are believed to be homologous to putative TMSs 1–12 in the clusters 1 and 3 proteins are so indicated in Fig.
2B. Additional peaks of hydrophobicity, always less well conserved, are indicated by roman numerals in accordance with the
convention used previously [22]. In Fig. 2B, there are three peaks (putative TMSs I–III) that are relatively poorly conserved, in
agreement with Prakash et al. [22], preceding the first peak common to clusters 1 and 3 (peak 1 in Fig. 2B). Then follow 6 TMSs, the
last of which is within the DedA domain. Two non-conserved peaks (IV and V) follow. Then two well-conserved peaks within the
DedA domain are found (peaks 7 and 8). These are followed by two non-conserved peaks (VI and VII). Each of these two nonconserved regions occurs in only four proteins: the first insertion (alignment positions 425–500) were in Dps9, Pmu9, Msp17 and
Bbr1, while the second insertion (alignment positions 500–610) occurred in Rpa8, Mma3, Ppu2 and Spo23. Each of these 4 protein
sets occurs together in one cluster on the clustal-generated tree (see Fig. TRAP-S1B).
Following peak VII are the last three well-conserved peaks (9–11), corresponding to the last four well-conserved putative TMSs.
These are followed by four additional poorly conserved peaks (VIII–XI). The first three of these are found in many of the proteins
while the last peak is present in only a few (see Figs. TRAP-S3A and S3B). The occurrence of four poorly conserved putative Cterminal TMSs is in agreement with the results of Prakash et al. [22].
It should be noted that the numbers and positions of the extra central TMSs differ in the two studies as expected since different
protein sets were analyzed. Prakash et al. [22] included proteins from all three clusters. The last hydrophobic peak (past alignment
position 1000) was found in seven proteins, Dps9, Pmu9, Msp17, Bbr1, Wsu3, Uba1 and Cef2. The first four of these proteins are the
same ones that have the first insert including TMSs I–III, and the first six of these proteins cluster together on the clustal-generated
tree (Figs. TRAP-S1B and S3B). The plot shown in Fig. 2B makes structural sense, since insertion of an even number of TMSs
allows the remaining conserved portions of the proteins to retain their original topologies.
3.4. Kazal-2 domains in proteins of the OAT family (TC #2.A.60)
Secondary carriers of the organic anion transporter (OAT) family (TC #2.A.60) catalyze the Na+-independent facilitated transport
of organic anions and cations such as prostaglandins, bile acids, steroid conjugates, thyroid hormones, anionic oligopeptides, drugs,
toxins and other xenobiotics [35]. OAT transporters are found only in animals although the OAT family belongs to the Major
Facilitator Superfamily (MFS) which is ubiquitous [8,9].
The Kazal-type serine protease inhibitor domain (Kazal-2), found in serine protease inhibitors as indicated by the name, are also
found as extracellular parts of agrins, proteins involved in maturation of neuromuscular junctions in vertebrates [36]. These 48 amino
acyl residue domains mediate protein–protein interactions with several other types of proteins, thereby forming structural complexes
[37]. The crystal structure of the insect-derived double domain Kazal inhibitor, rhodniin, has been solved complexed to thrombin [38].
We have identified Kazal-2 domains in OAT family transporters. Of 123 OAT family homologues (Table OAT-S1), all but eight
contain a recognizable Kazal-2 domain. The multiple alignment of these protein sequences is shown in Figure OAT-S1. Kazal
domains are found in an internal location between putative TMSs 9 and 10 (see Fig. 3A). They therefore correspond to expectation in
being localized to a site that is predicted to be on the extracytoplasmic surface of the membrane.
The average hydropathy plot shown in Fig. 3A (in which the poorly conserved N-terminal and C-terminal regions of the
alignment were omitted) reveals twelve conserved peaks (1–12) that might correspond to transmembrane α-helical spanners (TMSs).
1562
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
Fig. 2. Average hydropathy (top) and similarity (bottom) plots for the three clusters (A, cluster 1; B, cluster 2, and C, cluster 3) of the large transmembrane bacterial
subunits of TRAP-T transporters. These plots are based on the multiple alignments shown in Figures TRAP-S2A-4A on our website. Format of presentation and the
programs used are as for Fig. 1. In all three figures, Arabic numbers refer to the twelve conserved peaks of hydrophobicity while Roman numerals refer to the poorly
conserved peaks. The positions of the DedA domain encompass putative TMSs 6, 7 and 8.
Fig. 3. (A) Average hydropathy (top) and similarity (bottom) for all sequenced members of the OAT family included in this study. The entire conserved hydrophobic
domain (multiple alignment segment 1000–3500) is shown, but the long N- and C-terminal extensions, present only in a few proteins, were eliminated for clarity. The
plots are based on a CLUSTAL X alignment shown in Figure OAT-S1 [17] and were derived using the AveHas program [19]. Positions of Kazal-2 and PDZ domains
are indicated. (B) Clustal-generated tree of OAT family members. The tree shows 8 primary clusters, several of which can be subdivided into proteins from specific
animal types. The organismal sources of the proteins in each cluster are indicated. The tree, based on the CLUSTAL X alignment shown in Figure OAT-S1 (proteins
listed in Table OAT-S1) [17], was drawn with the TreeView program [63]. The proteins that either possess a very sequence-divergent Kazal-2 domain, or do not possess
this domain, are boxed.
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1563
These 12 putative TMSs occur in groups of 3, each separated by poorly conserved regions. TMSs 1–3 at the N-terminal region are
followed by a long (∼ 200 residue) poorly conserved region present in only one homologue, Cfa3. This 200 residue insert in Cfa3
includes two hydrophobic peaks (peaks A and B in Fig. 3A) that could span the membrane twice, bringing the polypeptide chain
1564
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
back to the same side of the membrane as expected for all other homologues which lack this inserted sequence. TMSs 4, 5 and 6 are
relatively close to each other, but a hydrophilic insert of about 100 aas between TMSs 4 and 5 occurs in just two homologues, Ptr10
and 11. Another large non-conserved insert that may include 2 TMSs (peaks C and D in Fig. 3A) occurs between TMSs 6 and 7. This
insert is about 350 aas long and is found only in Ptr10.
Between TMSs 9 and 10 is a ∼ 300 residue insert that includes the Kazal-2 domain. This domain is entirely hydrophilic. Of the
various inserts, this one is the best conserved among members of the OAT family; in fact, it is present in almost all of the homologues
(see below). Finally, within the last three TMSs, a short (∼ 100 aa) insert occurs between TMSs 10 and 11. Two hydrophobic peaks
(peaks E and F in Fig. 3A) may correspond to transmembrane segments. These occur in many but not all of the OAT family members.
It is worthy of note that each of the inserts that include putative TMSs has those TMSs in pairs so as not to disrupt the normal
topology of the protein in the membrane.
The clustal-generated tree for the 123 OAT family members is shown in Fig. 3B. There are 8 distinct clusters, but several of these
can be subdivided because of deep branching, particularly within clusters 2, 5 and 6. As noted above, the OAT family proteins are
derived exclusively from animals, but almost all organisms represented contain multiple paralogues. For example, round worms have
6, but interestingly one of these in C. elegans (Cel7) is duplicated internally, containing two full-length OAT family sequences in
tandem with the two halves of the protein exhibiting 62% identity. Flies have 8 paralogues, the chicken and the puffer fish have 10,
and mammals have 10–17 paralogues.
When proteins from any one of these organismal types are examined in the clustal-generated tree, we discover that some of the
paralogues are closely related while others are very distantly related. For example, worm paralogues occur in clusters 2, 6 and 8.
Insect paralogues occur in clusters 2, 4, 5 and 6. Vertebrate paralogues occur in clusters 1–5 and 7. Further, within cluster 2,
subclusters can be identified that are derived exclusively from (a) worms, (b) insects, (c) vertebrates and (d) mammals. Also, in
cluster 6, the worm homologues segregate from the insect proteins. These observations suggest that most of the paralogues arose by
gene duplication events after the organismal types diverged from each other. However, some early paralogue existed before these
divergences occurred, giving rise to at least some of the primary clusters.
The Kazal-2 domains analyzed here were found in the vast majority of OAT family members. Only 8 members were found to
either lack this domain altogether or possess a sequence between TMSs 9 and 10 that was so divergent from the Kazal-2 sequence
that it was non recognizable. These 8 proteins, present in clusters 1, 2, 3 and 5, are scattered fairly randomly around the tree. They are
boxed in Fig. 3B. Exon inspection revealed that the Kazal-2 domain is encompassed within exon 12 in the mouse [39]. Homologues
that lack this domain altogether may lack exon 12.
Homologues that lack the Kazal-2 domain and a few sequence divergent proteins were eliminated from the multiple alignment
revealing a large percentage of well-conserved cysteines. The best-conserved motif was:
*
*
*
*
C X3 C X C X6 P V C X6 Y X S X C X A G C ½X ¼ any residue; * ¼ full conservation
Thus, in this 30-residue sequence, six well-conserved cysteines, including three fully conserved cysteines, were found. These
residues presumably possess functional significance, perhaps for the purpose of protein domain interactions, possibly also for
formation of intradomain or interdomain covalent disulfide bonds.
3.5. PDZ domains in proteins of the OAT family
The OAT family was described in the previous section. In addition to the Kazal-2 domain, some of these proteins contain short
(60–90 residue) hydrophilic domains at their C-termini. These PDZ domains are modular protein interaction domains that play roles
in protein targeting and complex assembly [40]. PDZ stands for (1) post-synaptic protein, PSD-95, (2) the Drosophila septate
junction protein, Discs-large, and (3) the tight junction protein, ZO-1. PDZ-containing proteins are found ubiquitously in
prokaryotes and all major classes of eukaryotes. Many hundreds have been identified due to genome sequencing. Some of these
proteins contain more than one PDZ domain. Critical for OAT family protein interactions is a C-terminal X-(S/T)-X-Hy tetrapeptide
sequence where X = any residue and Hy = a hydrophobic residue [41]. Mutations in PDZ domains of some human proteins can give
rise to severe diseases [40]. The high-resolution 3-dimensional structures of several PDZ domains complexed to their cognate ligands
have been solved [see Ref. [40] for a review].
Wang et al. [41] have recently characterized the specific function of the PDZ domain in one of the Oat carriers, Oatp1a1
(TC #2.A.60.1.1). They showed that this domain is required for interaction with the PDZK1 protein, a major scaffold protein of the
epithelial cell membrane [42]. Oatp1a1 binds preferentially to the first and third PDZ binding domains in PDZK1 [41]. Without
PDZK1, Oatp1a1 proved to localize to intracellular membranes instead of its normal basolateral plasma membrane local [41]. Thus,
association of Oatp1a1 with PDZK1 confers upon this transporter its proper subcellular localization in the basolateral membranes of
epithelial cells.
We identified OAT family members with C-terminal PDZ domains, and discovered that they fall into vertebrate clusters 1 and 7 in
the tree shown in Fig. 3B. However, not all proteins of these two clusters bear PDZ domains. Only four proteins in cluster 7 (Gga4,
Cfa4, Bta3 and Ppy2) contain this domain. The last 3 of these proteins cluster together, but Rno26, which also clusters with these
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1565
proteins, apparently lacks the PDZ domain. Most proteins in cluster 1 contain the PDZ domain. Those that lack it include Gga2, Xla1,
Tni3 and Tni2 which cluster loosely together. Also, Ptr11 and Hsa8, which cluster together, and Gga8, Tmi4, Gga6 and Dre1, which
cluster loosely together, lack the PDZ domain. Thus, proteins with or without the PDZ domain generally cluster together, but with
apparent exceptions. We suggest that association of OAT family proteins with the PDZ domain was a relatively late evolutionary
event, designed for the purpose of membrane targeting [41].
3.6. USP domains in proteins of the ABT family (TC #2.A.3.6) within the APC superfamily (TC #2.A.3)
The Amino Acid-Polyamine-Organocation (APC) superfamily of transport proteins includes members that function as solute:
cation symporters and solute:solute antiporters [43]. They occur in bacteria, archaea, unicellular eukaryotic protists, slime molds,
fungi, plants and animals [44]. They are thus essentially ubiquitous. They vary in length, being as small as 350 residues and as large
as 850 residues. The smaller proteins are generally of prokaryotic origin while the larger ones are of eukaryotic origin [45]. Most of
them possess twelve transmembrane α-helical spanners but may have a re-entrant loop between TMSs 2 and 3 [46].
The universal stress protein of E. coli, UspA, is a small cytoplasmic protein whose expression is enhanced when the cell is
exposed to stress agents [47]. UspA enhances the rate of cell survival during prolonged exposure to such conditions and may provide
a general “stress endurance” activity [48]. The crystal structure of Haemophilus influenzae UspA [49] reveals an alpha/beta fold
similar to that of the Methanococcus jannaschii MJ0577 protein which binds ATP [48]. The UspA proteins of E. coli and H.
influenzae have not been shown to possess ATP-binding activity.
The “universal stress protein” (USP) domain is found in various proteins by themselves or fused to other domains in single copy
or in duplicate. It occurs in a variety of transport, regulatory and catalytic proteins. For example, it is present in single copy at the Nor C-termini of protein kinases (e.g., ZP_00400544) and at the C-termini of sensor histidine kinases (e.g., ZP_00233092) where the
entire protein is sometimes duplicated. In the universal stress proteins cited above, this domain alone is present. It is also present in
tandemly duplicated copies at the C-termini of chloride carriers of the ClC family (e.g., ZP_00516804; TC #1.A.11; Ref. [50]) and
Na+/H+ antiporters and K+ efflux systems of the CPA2 family (e.g., BAB72614; TC #2.A.37; Ref. [21]).
We identified single and tandemly duplicated USP domains in members of one of the families within the APC superfamily [44].
This family is the archaeal/bacterial transporter (ABT) family (TC #2.A.3.6) (see Table ABT-S1). Some, but not all members of the
family contain C-terminal, single, or tandemly duplicated USP domains. None of the members of this family are functionally
characterized. Another APC superfamily homologue, a parachlamydial protein (YP_007261) with a single C-terminal USP domain,
proved to be distantly related to the 2.A.3.1.4 GABA (GabP) permease of E. coli in the amino acid transporter (AAT) family.
Based on the multiple alignment shown in Figure ABT-S1, the average hydropathy and similarity plots for the ABT family were
derived as shown in Fig. 4A. Twelve hydropathy peaks are all well conserved except for the last two (TMSs 11 and 12). Moreover, all
of the cytoplasmic loops as well as several of the extracytoplasmic loops are well conserved. Following the hydrophobic
transmembrane domain is a stretch of about 300 residues where the tandemly duplicated USP domains can be found. As indicated in
the plot, this region is very poorly conserved because it is lacking in most of the homologues.
The clustal-generated tree for the ABT family, also based on the multiple alignment shown in Figure ABT-S1, is shown in Fig. 4B.
All but two of the proteins possessing the duplicated USP domains are found in a single cluster, cluster 1. However, one of the
members of this cluster, Hma4 from Haloarcula marismortui, lacks these domains altogether. One other protein, Dac1 from the δproteobacterium, Desulfuromonas acetoxidans, the only bacterial protein in this cluster, possesses a hydrophilic C-terminal
extension of the same size as most of the archaeal proteins of this cluster, but surprisingly, it contains first, a fructose-specific IIA
domain (IIAFru) of the bacterial phosphotransferase system (PTS; Ref. [13]) followed by a TrkA-C domain (see above). Other
paralogues of H. marismortui as well as other archaeal proteins in this cluster possess the full-length duplicated USP domains (see
boxed proteins in Fig. 4B).
Two other proteins in the tree shown in Fig. 4B were also found to possess C-terminally duplicated USP domains. One is the
archaeal Afu2 protein from Archaeoglobus fulgidus (cluster 5) and the other is the Rba1 protein from the planctobacterium,
Rhodopirellula baltica (branch 8). These proteins branch distantly from other members of this family suggesting that the
hydrophilic domains became associated with these evolutionarily divergent proteins independently of the homologous domains in
the cluster 1 proteins.
3.7. TrkA-C domains in proteins of the aspartate:alanine exchanger (AAE) family (TC #2.A.81)
No publication has previously described the characteristics of the AAE family and its constituent members. Consequently, in this
section, we not only describe the uniquely situated duplicated TrkA-C domains, we also provide a brief description of this family.
A single functionally characterized protein, the aspartate:alanine exchanger (AspT) of the Gram-positive lactic acid bacterium,
Tetragenococcus halophilaD10 (Pediococcus halophilus) serves to characterize the AAE family [51]. This organism takes up
L-aspartate via AspT, decarboxylates it to L-alanine and CO2 in the cytoplasm using L-aspartate β-decarboxylase (AspD), and exports
the L-alanine in a 1:1 exchange reaction with L-aspartate. AspT is a hydrophobic protein of 543 aas and 10–12 putative TMSs. This
protein has many bacterial homologues of unknown function and one very distant homologue in the archaeon, Halobacterium sp.
1566
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1567
strain NRC-1. This last protein (384 aas; 10–12 putative TMSs; AAC82885) is the only AAE family member from an archaeon in the
NCBI database, and there are none from eukaryotes (see below).
Because one more negative charge is brought in (aspartate) than is exported (alanine), the exchange transport process may result in
net charge movement, creating a membrane potential, negative inside. Further, decarboxylation of aspartate consumes a scalar proton
and thus generates a pH gradient (basic inside). The resultant pmf can drive ATP synthesis via the F-type ATPase (TC #3.A.2). Other
such exchangers generating a pmf are the prototypical oxalate/formate exchanger of the MFS (TC #2.A.1) as well as glutamate/
γ-amino butyrate, malate/lactate, citrate/lactate and histidine/histamine exchangers (for references see Ref. [51]).
Each of the two hydrophobic halves of AAE family proteins contains six peaks of hydropathy that may correspond to 5 or 6
TMSs. The two halves exhibit almost unprecedented degrees of sequence similarity, suggesting that they arose by an intragenic
duplication event relatively recently. If the six hydrophobic peaks correspond to 5 TMSs, there may be a single re-entrant loop
between TMSs 4 and 5 [26,27]. Although this characteristic resembles that of one well-studied putative member of the IT
superfamily, the CitS citrate transporter of Klebsiella pneumoniae in the CCS family (TC #2.A.24) [22,26,52], we could not establish
a statistically significant relationship between these two groups of proteins. Interestingly, in all but one of the bacterial homologues,
but not in the archaeal or the Fusobacterium nucleatum homologue, the two hydrophobic repeat elements are separated by two
tandem hydrophilic TrkA-C domains [23,24] that must have arisen by an independent duplication or insertional event.
Table AAE-S1 on our website (http://biology.ucsd.edu/~msaier/supmat/AAE) presents the members of the AAE family in
alphabetical order according to species name. All but one of the members of this family derive from bacteria, but several bacterial
kingdoms are represented. Most are derived from proteobacteria of the α, β, γ and δ-subclasses. Others are found in low and high
G + C Gram-positive bacteria (Firmicutes and Actinobacteria, respectively) as well as Bacteroides species, Chloroflexus aurantracus,
Fusobacterium nucleatum and Rhodopirellula baltica. Several species have multiple paralogues. Within the Bacteroides group,
Porphyromonas gingivalis has two, Bacteroides fragilus has 3, and B. thetaiotaomicron has 4, the most of any species examined.
Among the proteobacteria, organisms listed in Table AAE-S1 can encode within their genomes from one to three paralogues. The
α- and β-proteobacteria represented can have 1–3 paralogues while the γ- and δ-proteobacteria have just 1 or 2. The two Firmicutes
represented have just one member while actinobacteria can have either one or two. All three Corynebacterial species have two.
Finally, all other bacteria represented have only one. These proteins are almost all large, between 516 and 580 residues. This large
size reflects the presence of the two tandem TrkA-C domains between the two putative 5 or 6 TMS repeat units (mentioned above).
The two TrkA-C domains are each about 70 residues long. The central hydrophilic region is usually about 220 residues long while
the two flanking hydrophobic domains are about 170 residues long. These two hydrophobic and two hydrophilic domains thus
account for much of the full sizes of most of these proteins.
A few proteins listed in Table AAE-S1 are either shorter or longer than the large majority of homologues. Two of these, Msu1 and
Msu2 from Mannheimia succiniciproducens, are half-sized (280 and 297 residues). They correspond to the N- and C-terminal halves
of a typical AAE family member, each possessing a single AAE and a single TrkA–C domain. Fused, they comprise a full-length
AAE family homologue. Either a sequencing error or a mutation accounts for this splicing event.
Two proteins, Dps2 of Desulfotalea psychrophila and Vfi2 from Vibrio fischeri, are large (669 and 624 residues) because they
possess N-terminal extensions of 95 and 50 residues, respectively, both of which contain a single hydrophobic putative TMS. The
former, but not the latter, exhibits sequence similarity with the N-terminal region of the HlyA protein of E. coli (gi #68520714).
Thus, when residues 5–83 of Dps2 were compared with residues 15–94 of HlyD, with the GAP program [53], 36% identity, 50%
similarity and a comparison score of 14.3 S.D. was obtained. This is sufficient to establish homology. It is possible that these
sequences are important for targeting of the proteins to the secretory (Sec) apparatus for membrane insertion [54].
The sequences of members of the AAE family were multiply aligned with the CLUSTAL X program [17], and this alignment (see
Fig. AAE-S1 on our website) was used to generate average hydropathy/similarity plots (Fig. 5A; AveHas program; Ref. [19]) as well
as a clustal-generated tree (Fig. 5B; TreeView program; Ref. [55]).
The plot in Fig. 5A reveals one hydrophobic peak (peak 0) at alignment positions 20–60, very poorly conserved, found in only
two proteins as noted above. Five peaks of average hydropathy (peaks 1–5) follow peak 0, and these are all well conserved
(alignment positions 100–300). A potential semipolar re-entrant loop occurs between putative TMSs 4 and 5 [26,52]. Then follows
the central hydrophilic region where the two tandem TrkA–C domains are found. Finally, 5 more peaks of hydrophobicity, including
a potential re-entrant loop between putative TMSs 9 and 10, follow. Similarity between peaks 1–5 and 6–10 can be seen; for
example, peaks 1, 4, 6 and 9 are the most hydrophobic while the probable re-entrant loops, peaks R1 and R2, are the least
hydrophobic, exhibiting semipolar characteristics.
These observations prompted us to examine segments encompassing TMSs 1–5 and TMSs 6–10 to determine if they are
homologous. The IC [18] and GAP [53] programs were used for this purpose. The results of one comparison are shown in Figure
AAE-S2 on our website. The startling comparison score value of 23 S.D. is among the greatest observed for the two repeat units in
any family within the TCDB (see Refs. [22,56]). The high values obtained suggest that the intragenic duplication event that gave rise
Fig. 4. (A) The average hydropathy (top) and similarity (bottom) plots (AveHas program) for the ABT family within the APC superfamily. (B) The clustal-generated
tree (TreeView program) for the ABT family. The plots are based on a CLUSTAL X multiple alignment shown in Figure ABT-S1 using the proteins tabulated in Table
ABT-S1. In (B), the proteins possessing the C-terminal tandemly duplicated USP domains are boxed.
1568
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
Fig. 5. (A) Average hydropathy and similarity plots for the proteins of the AAE family. The proteins (Table AAE-S1) included in the CLUSTAL X alignment
(Fig. AAE-S1), upon which this plot was based, can be viewed on our website (http://biology.ucsd.edu/~msaier/supmat/Domains). The putative 5 TMS repeat unit
(putative TMSs 1–5 or 6–10) plus a proposed semipolar re-entrant loop (R1 or R2) [26,27] are indicated as are the duplicated TrkA-C domains [23,24]. (B) Clustalgenerated tree for the AAE family. The TreeView program was used to draw the tree, based on the CLUSTAL X alignment shown in Figure AAE-S1 on our website.
The proteins comprising the eleven clusters and their abbreviations are presented in Table AAE-S1.
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1569
to these putative 10 TMS proteins must have occurred relatively recently in evolutionary history. They also suggest that the two
halves of these proteins serve similar functions and have not yet acquired a high degree of functional selectivity.
The clustal-generated tree for the AAE family proteins is shown in Fig. 5B. It can be seen that the various clusters exhibit a
limited diversity of organismal types from which these proteins were derived. Thus, clusters 1 and 7 proteins are all from γproteobacteria, and all proteins in each of these two clusters are from different organisms. The only exception is the Msu1 and
Msu2 proteins which may represent the two halves of a single protein, split because of a sequencing error or an authentic
mutation. We suggest that each of these clusters represents a set of orthologues although the two clusters are clearly nonorthologous to each other. In a like fashion, cluster 4 proteins are derived from the only two Firmicutes represented, and cluster
6 proteins are all derived from members of the Bacteroides. Of these proteins, only Porphyromonas gingivalis has two
paralogues. Clusters 10 and 11 each includes a single protein, from the only member of the Planctomycetacia and from
Fusobacterium nucleatum, respectively. Other clusters consist of homologues from broader groups of bacteria, and each of
these clusters is likely to comprise a different set of functionally distinct proteins. Thus, cluster 2 includes proteins from β-, γand δ-proteobacteria, cluster 3 is from α- and β-proteobacteria, cluster 5 is from α- and β-proteobacteria as well as Bacteroides,
cluster 8 is from Bacteroides as well as a δ-proteobacterium, and cluster 9 is from β- and δ-proteobacteria as well as Chloroflexi
and Actinobacteria.
3.8. Multiple domains in proteins of the eukaryotic-specific organic solute transporter (OST) family (TC #2.A.82)
Members of the OST family have been characterized from humans, mice and the little skate, Raja erinacea [57,58]. Each of these
characterized systems consists of two polypeptide chains, α and β. For the human system, the α-subunit (Ostα) is of 340 aas with 7
putative TMSs while the β-subunit (Ostβ) is of 128 aas with 1 putative TMS near the N-terminus (residues 40–56).
Neither Ostα nor Ostβ alone has activity, but the two together transport a variety of organic compounds, mostly anions. Transport
of estrone-3-sulfate is Na+-independent, ATP-independent, saturable and inhibited by other steroids and anionic drugs. Bile acids,
taurocholate, digoxin and prostaglandin E1 are also substrates [57,58]. The two proteins are highly expressed in many human tissues.
The β-subunit is not required to target the α-subunit to the plasma membrane, but coexpression of both genes is required to convert
Ostα to the mature glycosylated form in enterocyte basolateral membranes and possibly for trafficking through the Golgi apparatus
[59].
Table OST-S1 on our website (http://biology.ucsd.edu/~msaier/supmat/OST) presents the members of the Ostα family while
Table OST-S2 presents the recognized members of the Ostβ family. All β-subunits are derived from vertebrate animals, but the αsubunits are derived from animals, plants, fungi and protozoans. While only the chicken appears to have more than one Ostβ,
many organisms encode multiple Ostα's within their genomes. Of the organisms with fully sequenced genomes, plants have 7 or 8,
animals have 3–6, fungi have 1 or 2, and the single protozoan represented has just one. The α-subunits vary in size from about
250–800 aas while the β-subunits are either of about 110–130 aas or 280–380 aas. The mammalian Ostβ proteins are of the
smaller size.
The clustal-generated tree of the Ostα family is shown in Fig. 6A while that for the Ostβ family is shown in Fig. 6B. In Fig. 6A,
we find four clusters of animal homologues, three clusters of plant proteins, two clusters of fungal proteins, and the single protozoan
protein is on a branch by itself. This fact suggests that early intergenic duplication events gave rise to the primary clusters within each
eukaryotic kingdom. These duplication events may have occurred at about the time the major eukaryotic kingdoms arose. Speciation
accounts for separation of clusters according to kingdom. The phylogenetic analysis suggests that there has been no horizontal
transfer of genetic material encoding these homologues between kingdoms. Interestingly, fungal cluster 1 has far more members than
fungal cluster 2, and the cluster 2 proteins are exclusively from organisms that have two paralogues, one of which is in fungal cluster
1. Thus, in fungi, there have been no late gene duplication events within the OST family. There was a single early gene duplication
event, and subsequently one of the paralogues (those in cluster 2) was lost although those in cluster 1 were not. Many of the clusters
show phylogenetic grouping in accordance with organismal phylogeny, again suggesting orthology for proteins from different
species within a specific cluster.
The Ostβ tree (Fig. 6B) shows a pattern virtually identical to that of one subcluster in the Ostα animal 3 cluster (Fig. 6A). The
transporters in this subcluster presumably require the participation of a β-subunit. However, other systems represented in Table
OST-S1 either function without a β-subunit or use a sequence divergent β-subunit that was not recognized in our BLAST searches.
The average hydropathy and similarity plots shown in Fig. 7A and B were generated from the multiple alignments shown in
Figures OST-S1 and OST-S2 on our website, respectively (http://biology.ucsd.edu/~msaier/supmat/Domains). The Ostα alignment
and the plots presented in Fig. 7A show 7 well-conserved peaks of hydrophobicity. Most of these peaks are sharp, but peak 5 looks
like a double peak with good conservation. Examination of the multiple alignment revealed that this double peak is due to a 4residue insertion in the center of this putative TMS in one of the proteins. Several inter-TMS and post-TMS regions are also quite
well conserved. The inter-TMS regions between putative TMSs 1 and 2, and TMSs 5 and 6 are much better conserved than those
between TMSs 2 and 3, 3 and 4, 4 and 5, and 6 and 7. Thus, assuming a 7 TMS topology, we predict, on this basis, that
cytoplasmic loops tend to be best conserved, that the N-termini of these proteins are in the extracellular medium and the C-termini
are in the cytoplasm. The positive inside rule [60–62] gave the same prediction. We also suggest that these putative 7 TMS proteins
1570
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
Fig. 6. Clustal-generated trees for the protein constituents of the OST family, Ostα (A) and Ostβ (B). The protein abbreviations are as presented in Tables OST-S1 and
OST-S2, respectively, and the multiple alignments upon which these trees were based can be viewed in Figures OST-S1 and OST-S2, respectively. The TreeView
program [55] was used to draw the tree.
arose by an intragenic duplication event in which a 3 TMS-encoding genetic element duplicated to give 7 TMSs (with the extra
TMS sandwiched in between the two repeat sequences) as has been documented for the Brho and LCT families [63].
Examination of the multiple alignment for the Ostα subunits revealed that over a dozen homologues were either truncated or
internally deleted, no two exhibiting the same apparent deletions. These atypical sequences, most believed to be artifactual due to
sequencing errors, incorrect initiation codon assignment or lack of exon recognition, were excluded from the alignment, yielding a
more coherent plot.
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1571
Fig. 7. Average hydropathy and similarity plots of (A) Ostα homologues and (B) Ostβ homologues. The proteins included are listed in Tables OST-S1 and OST-S2,
respectively, while the multiple alignments upon which these plots were based are shown in Figures OST-S1 and OST-S2, respectively. The AveHas program [19] was
used to generate these plots.
The most conserved regions of the Ostα proteins were used to design consensus sequences. These corresponded to TMSs 1 and 2
and TMSs 5 and 6. These sequences are:
H L X ðH NÞ Y ðT SÞ ðK NÞ P ðE NÞ ðEÞ Q R Hy5 P ðI VÞ Y ðA SÞ ðL FÞ ðD SÞ S ðW FÞ ðV LÞ ðS GÞ L
ðalignment positions 321357Þ
*
ð2Þ ðR KÞ ðE DÞ X L X P ðF YÞ ðR KÞ P Hy2 K F Hy3 K Hy3 F L ðS TÞ ðF YÞ W Q
ðalignment positions ð587612Þ
ðAlternative residues at a single position are in parentheses; Hy¼any hydrophobic residue; X ¼ any residue;
* ¼ fully conservedÞ
Examination of the Ostβ multiple alignment revealed that the β-subunits show poorly conserved N-terminal regions, and that a
single region in the β-subunit is well conserved. This region includes the single putative TMS with two fully conserved residues. The
consensus sequence for this region is:
*
*
ðS TÞ P W N Y S Hy4 ðS GÞ Hy4 ðR G AÞ R ðS NÞ I X A N R ðN KÞ R K ðalignment positions: 302334Þ
1572
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
Several of the larger Ostα homologues proved to possess C-terminal hydrophilic domains that are homologous to those in other
proteins. For example, Cne1 from Cryptococcus neoformans (796 aas; gi 50255021) has a large domain (residues 426–791) that
exhibits significant sequence similarity with the huge dynein-related AAA-type ATPase (midasin; Mdn1p of Saccharomyces
cerevisiae) that forms a complex with the Rix1 protein complex. This large complex mediates the ATP-dependent remodeling of 60S
ribosomal subunits and their subsequent export from the nucleoplasm to the cytoplasm [64,65].
Another Ostα homologue, Tni4 from Tetraodon nigroviridis (869 aas; gi 47209270) exhibits 14 putative TMSs and possesses a
C-terminal Ostα domain and an N-terminal membrane-bound O-acyl transferase (MBOAT) domain. The MBOAT family includes
proteins from all eukaryotic kingdoms. The protein-cysteine N-palmitoyl transferase, porcupine [66] is a well-characterized example.
Another Ostα homologue, Tni5 (842 aas; gi 47227190) shows an extensive region (residues 549–814) containing numerous
repeat units of about 27 aas. This region is homologous to a 300 residue domain in the human polycystic kidney disease protein
(2318 aas; gi 62665258) [67–69].
Finally, the Uma protein from Ustilaga maydis (969 residues; gi 46100056) shows a C-terminal region (residues 684–855) that is
homologous to the N-terminal region (residues 7–169) of human kinesin (208 aas; gi 51464016) [70] as well as with a central
domain (residues 729–884) of the plasmodium erythrocyte binding protein that recognizes the human Duffy glycoprotein blood
group determinant (1070 aas; gi 1346921). These results show that eukaryotic Ostα transporter homologues can be fused to any of a
variety of conserved protein domains that may function in catalysis (e.g., the MBOAT domain) or in macromolecular recognition.
The two Ostβ paralogues from the chicken, Gga1 and Gga2, and the fish homologue, Dre1 from Danio rerio, are substantially
larger than the mammalian homologues (see Table OST-S2). One of these, Gga2, has an N-terminal ricin-type β-trefoil domain, a
carbohydrate-binding domain, formed by a presumed intragenic triplication event [71]. The conserved hydrophobic domain occurs
in the C-terminal region of this protein at residues 290–315. The other chicken Ostβ paralogue, Gga1 (286 aas; gi 50753202) and the
fish homologue, Dre1 (332 aas; gi 68369922), have their single TMSs centered at about residue 200. Gga1 does not exhibit a CDDrecognized conserved domain, but Dre1 clearly has the ricin-like domain. All 3 homologues proved to be homologous throughout
most of their lengths, and consequently, although that in Gga1 is sequence divergent, they all possess N-terminal ricin domains. We
conclude that the three non-mammalian Ostβ subunits possess a ricin-like carbohydrate-binding domain, possibly for the purpose of
recognizing a carbohydrate chain in Ostα or another protein. The ricin domain is proposed to function in glycoprotein recognition
although the possibility of a glycolipid recognition function cannot be ruled out.
3.9. Repeat domains in members of the 4 TMS multidrug endosomal transporter (MET) family (TC #2.A.74)
Proteins of the MET family have been sequenced and characterized from mice and humans [72]. These two orthologues have the
same length and sequence. The two functionally characterized human and mouse orthologues are located in late endosomal, Golgi
and/or lysosomal membranes. Their transcripts are expressed in many tissues and cell types. The C-termini of these proteins possess
hydrophilic domains containing several tyrosine-based sorting motifs, YXXHy, where Hy represents a bulky hydrophobic residue.
This sequence directs proteins to intracellular compartments in eukaryotes. Substrates of the mouse MET (also called “mouse
transporter protein,” MTP) include thymidine, both nucleoside and nucleobase analogues, antibiotics, anthracyclines, ionophores
and steroid hormones [73,74]. MET transporters may be involved in the subcellular compartmentation of steroid hormones and other
compounds.
Drug sensitivity by mouse MET is regulated by compounds that inhibit lysosomal function, interface with intracellular cholesterol
transport, or modulate the multidrug resistance phenotype of mammalian cells [72–74]. Thus, MET family members may
compartmentalize diverse hydrophobic molecules, thereby affecting cellular drug sensitivity, nucleoside/nucleobase availability and
steroid hormone responses. The mode of transport and energy-coupling mechanism, if any, has not been investigated, but a proton
antiport mechanism is inferred.
A related protein is the retinoic acid-inducible E3 protein, associated with lymphoid, erythroid and myeloid tissues but not nonhematopoietic tissues of the mouse [75,76]. The E3 protein is 261 amino acids long and exhibits 5 putative TMSs. Both MTP and the
E3 protein have human homologues. Three TMS homologues found in the related flatworms, Schistosoma mansoni and S.
haematobium, are called the “tri-spanning orphan receptors,” TorE [77].
Table MET-S1 on our website (http://biology.ucsd.edu/~msaier/supmat/MET) presents the proteins of the MET family. Using the
BLAST search tool [16] with the human MTP as the query sequence and several iterations, many proteins were obtained. A number
of these proteins proved to be randomly truncated or incomplete, and these were eliminated. Redundant sequences were also
eliminated prior to construction of the multiple alignment. We paid particular attention to organisms with fully sequenced genomes,
but several others are included in Table MET-S1. All organisms represented except for Trypanosoma cruzi are animals. Since the
T. cruzi sequence and those from Schistosoma species are nearly identical to the human Hsa5 protein, we suggest either that
contaminating DNA encoding the gene products is responsible, or that the encoding genes have been acquired by these pathogens
from a mammal via a recent horizontal gene transfer event as originally proposed by Inal [78]. Examination of the proteins listed in
Table MET-S1 reveals that many organisms have two or three paralogues. Thus, three are found in D. melanogaster, H. sapiens and
T. nigroviridis. A multiple alignment was generated using the CLUSTAL X program (Fig. MET-S1) [17] and from it, an AveHas plot
(Fig. 8A) and a clustal-generated tree (Fig. 8B) were generated.
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1573
Fig. 8. (A) The average hydropathy (top) and similarity (bottom) plots (AveHas program; Ref. [19]) for the eukaryotic MET family and (B) the clustal-generated tree
(TreeView program; Ref. [55]) for the MET family. The plots are based on the multiple alignment shown in Figure MET-S1. The protein abbreviations are presented in
Table MET-S1.
Most of the proteins listed in Table MET-S1 have sizes of 230–290 aas, but some are substantially larger. All three roundworm
proteins are large (562–568 aas). The same is true of two of the fruitfly proteins (432–474 aas), one of the P. troglodytes proteins
(481 aas), the S. japonicum homologue (414 aas) and one of the T. nigroviridis proteins (355 aas).
1574
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
As shown in Table MET-S1, a majority of MET family homologues have 4 putative TMSs, but several are predicted to have 3 or
5. Eight of them are predicted to have 5 TMSs, and six have 3 putative TMSs. These topological differences are probably real and not
artifactual, due to sequencing or annotation errors.
The multiple alignment revealed tremendous diversity between the proteins. No single residue was conserved in all of the proteins
(see Fig. MET-S1). Fig. 8A shows the average hydropathy plot (top) and average similarity plot (bottom). The four well-conserved
hydrophobic peaks are labeled 1–4. Maximal conservation is observed not only for these peaks, but immediately following them,
particularly following peaks 1 and 4 (Fig. 8A). Three moderately hydrophilic regions are also quite well conserved.
The MET family clustal-generated tree is shown in Fig. 8B. There are four principal clusters. Clusters 1 and 4 proteins exhibit 4
putative TMSs while cluster 2 proteins exhibit 5 putative TMSs, and cluster 3 proteins are predicted to have 3 TMSs. In clusters 1–3,
there is a single exception to the standard topology for that group of proteins, possibly due to errors in sequencing, exon assignment or
initiation codon assignment. While the mouse transporter, MTP (Mmu1) is in cluster 1, the mouse retinoic acid-inducible E3 protein
(Mmu4) is in cluster 2. The tri-spanning orphan receptors, also called the complement inhibitory receptors, are present in cluster 3.
As noted above, clusters 1 and 4 proteins have 4 TMSs, cluster 2 proteins have 5 TMSs, and cluster 3 proteins have 3 TMSs. The
multiple alignment upon which Fig. 8 was based (Fig. MET-S1) was examined to determine which TMSs correspond. The 5 TMS
proteins in cluster 2 have an extra N-terminal TMS which in Fig. 8A occurs at the well-conserved, moderately hydrophobic peak
centered at alignment position 310. This peak is labeled “0” in Fig. 8A. The remaining 4 TMSs correspond in position to the 4 TMSs
in cluster 1 proteins. The 3 TMS proteins of cluster 3 correspond in position to TMSs 1–3 in the 5 TMS proteins of cluster 2. Thus, all
proteins share only TMSs 1 and 2 as labeled in Fig. 8A. They do, however, have C-terminal hydrophilic extensions. The Sja5 cluster
3 protein with 5 putative TMSs had a 150 residue N-terminal extension with two extra hydrophobic regions. Putative TMSs 3–5 in
this protein correspond to TMSs 1–3 in all other cluster 3 proteins.
We examined some of the proteins for “extra” domains. When a short homologue such as Rno5 was blasted against the NCBI
database, only recognized MET family members appeared. However, when we blasted a longer protein such as Cel1, which has an
N-terminal 4 TMS membrane domain and a long hydrophilic C-terminal extension of about 400 residues, many cytosolic proteins
were retrieved. One such protein, peptidylprolylisomerase G (cyclophilic G) of D. melanogaster (736 aas; similar to human
cyclophilins A, B and H; gi 45708892) showed ≥ 7 repeats of about 44 residues that were homologous to residues 320–490 in Cel1.
This repeat element, present at least 4 times in Cel1, proved to be exceptionally hydrophilic. The repeat unit was found in proteins
from amoebae, fungi, animals and plants. It was not recognized by CDD. Other proteins retrieved in the same BLAST search
included a hypothetical protein from Plasmodium yoelii yoelii (931 aas; gi 83314969) which had at least 11 repeats of this segment, a
putative protease from Gallus gallus, a hypothetical protein of Trypanosoma cruzi (839 aas; gi 70883026) with at least 13 repeat
units, and a fish homologue from Danio revio (1798 aas; gi 68367999).
3.10. “Extra” domains in ion channel proteins
Recognizable CDD-defined domains could be identified in various ion channel proteins. Using default settings for the NCBI
conserved domain database (CDD) search tool (expect cutoff of 0.01 with filter; Ref. [20]), α-helical-type channel proteins of TC
class 1.A were screened for extra hydrophilic domains. Domains of interest identified in representative channel-forming proteins are
presented in Table 1. A more detailed description of the proteins, their structures, and their functions can be found in TCDB.
3.11. The voltage-gated ion channel (VIC) superfamily (TC #1.A.1–5)
Within members of the VIC superfamily, we identified several different domains associated with various family members. In
families 4 and 5 of TC superfamily 1.A.1, three hydrophilic domains were identified. Thus, for example, in the Akt1 potassium
channel protein of Arabidopsis thaliana [79], the CAP_ED domain, a cyclic AMP binding domain found in transcription factors,
cyclic nucleotide-dependent protein kinases and cyclic nucleotide-gated ion channels, follows the voltage sensor and channelforming transmembrane domains. The CAP_ED domain is then followed by several (possibly six) ankyrin repeats (ANK domains)
found in a diverse group of proteins. ANK repeats are about 33 residues long and usually occur in 4-20 consecutive copies. The core
of this repeat appears to be a helix–loop–helix structure, and these repeats may mediate protein:protein interactions. The KHA
domain at the extreme C-terminus of Akt1 may promote tetramerization of the complex (Table 1; Refs. [80,81]).
Other VIC superfamily members include the calcium-activated potassium channel proteins such as human KCNN4 (TC #1.
A.1.16.2) [82,83]. These proteins have calmodulin-binding domains (CaMBD) C-terminal to the ion channel domains [84]. In
KCNN4, this domain occurs at residues 410–480. Other members of family 16 also bear this domain. Some VIC family members
have both CAP_ED and CaMBD domains [85], and thus are sensitive to both cyclic nucleotides and calcium.
Distantly related to the VIC superfamily is the ryanodine/inositol 1,4,5-triphosphate receptor calcium channel (RIR-CaC) family
(TC #1.A.3) [56,86]. One such protein, ITPR2 (TC #1.A.3.2.1) has multiple N-terminal MIR domains. These domains, in addition to
RIR-CaC family proteins, occur in O-mannosyl transferases [87]. Their function is unknown.
Another distantly related family within the VIC superfamily is the transient receptor potential-calcium channel (TRP-CC)
family (TC #1.A.4) [56]. Most if not all members of this family contain N-terminal ankyrin repeat (ANK) domains. As noted
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
1575
Table 1
“Extra” domains found in representative α-helical-type ion channel proteins (TC class 1.A)
Family
TC # a
Familya
Example b
Domain abbreviation c
1.A.1
Voltage-gated ion channel (VIC)
CAP_ED domain, ANK X5,
KDH
CaMBD
1.A.3
Ryanodine/inositol 1,4,5-triphosphate
receptor calcium channel (RIR-CaC)
Transient receptor potential protein cation
channel (TRP-CC)
Cyclic nucleotide-gated potassium channel,
AKT1 (TC #1.A.1.4.1) (Q38998)
Small conductance calcium-activated potassium channel
protein 2, KCNN2 (TC #1.A.1.16.1) (Q9H2S1)
Inositol 1,4,5-triphosphate receptor, ITPR2 (TC #1.A.3.2.1) (P29995)
1.A.4
1.A.5
Polycystin cation channel (PCC)
1.A.9
Ligand-gated ion channel (LIC)
1.A.11
Chloride ion channel (ClC)
Transient receptor potential protein, TRP (TC #1.A.4.1.1) (P19334)
Transient receptor potential cation (Ca2+/Mg2+) channel
subfamily M, member 7, TRPM7 (TC #1.A.4.5.6) (Q96QT4)
Polycystin-1 precursor, PDK1 (TC #1.A.5.1.1) (P98161)
Neuronal acetylcholine receptor protein, alpha-7 subunit
precursor, ACHA7 (TC #1.A.9.1.1) (P36544)
Voltage-gated chloride channel, GEF1 (TC #1.A.11.1.1) (P37020)
Chloride channel protein skeletal muscle, ClC-1,
CLCN1 (TC #1.A.11.2.1) (P35523)
MIR X4
ANK X4
Alpha-kinase
LRR, WSC, PKD, CLECT,
PKD X15, REJ, GPS, PLAT
Neur_chan_LBD
CBS
COG3448
a
TC #, family transporter classification number. The family name and abbreviation are provided in column 2.
Representative proteins bearing the indicated domains are indicated by name, abbreviation, protein TC #, and SwissProt/TREMBL accession number.
c
When multiple domains are reported, these are presented in the order in which they occur in the protein (N-terminus to C-terminus). When multiple repeats are
present, the number identified is indicated (e.g., X4: four repeats).
b
above, other members of the VIC superfamily contain these hydrophilic protein-protein interaction domains. They may occur in
different positions relative to the ion channel domain. Other TRP-CC members (e.g., TRPM7; TC #1.A.4.5.6) have C-terminal
alpha-kinase domains [88]. These domains are protein kinase catalytic domains that have little or no sequence similarity to
conventional kinases. The family includes myosin heavy chain kinases and elongation factor-2 kinases. Alpha-kinase domains
show similarity to enzymes with ATP-GRASP domains. At least some members of the TRP-CC family that have these domains,
including TRPM7, are bifunctional with N-terminal channels and C-terminal ser/thr kinase domains. The kinase domain in
TRPM7 is essential for channel activity and phosphorylates annexin A1. It therefore serves as a positive effector domain for
Ca2+/Mg2+ influx [89–91].
A final distant family of the VIC superfamily is the polycystin cation channel family (TC #1.A.5) [56]. One such protein, human
polycystin-1 precursor [92] has a size of 4303 aas and multiple domains [93]. These domains include (from N-terminus to Cterminus) (1) at least one leucine-rich repeat (LRR) of unknown function, (2) a yeast cell wall integrity and stress response (WSC)
domain, present also in WSC proteins and fungal exoglucanases, (3) a single polycystic kidney disease 1 (PKD1) domain, also
present in other proteins such as microbial collagenases, (4) a C-type lectin-binding calcium-dependent carbohydrate-binding
(CLECT) domain, (5) 15 more polycystic kidney disease 1 (PKD1) repeats, (6) a receptor of egg jelly (REJ) domain (of unknown
function) found also in sperm receptor proteins, (7) the G-protein-coupled receptor proteolytic site (GPS) domain, also present in sea
urchin REJ and latrophilin-CL-1, and (8) the polycystin-1, lipoxygenase, α-toxin (PLAT or LH2) domain. This last-mentioned
domain, which can exist in single domain proteins, is thought to facilitate access to sequestered membrane or micellar-bound
substrates (see CDD for more detailed descriptions).
3.12. The ligand-gated ion channel (LIC) family (TC #1.A.9)
In the ligand-gated ion channel (LIC) family (TC #1.A.9) [94], members consist of N-terminal ligand binding domains (LBD),
homologous to solute-binding proteins in prokaryotes that serve as receptors for ABC-type uptake transporters as well as the ligandbinding domains of transcription factors in the LacI/GalR family [94,95]. In LIC family members, these ligand-binding domains are
followed by the channel-forming integral membrane domains. This general structural pattern is observed for most if not all LIC
family members such as the acetylcholine, serotonin, glycine, glutamate, and γ-aminobutyrate (GABA) receptors [96]. Highresolution X-ray structures of the pentameric acetylcholine receptor (α2βγδ) have been reported, revealing how ligand binding to the
LBD domain controls channel conformation and activity [94]. In this case, the LBD is essential for responsiveness of the channel to
the presence of a neurotransmitter.
3.13. The chloride ion channel (ClC) family (TC #1.A.11)
The chloride ion channel (ClC) family (TC #1.A.11) has C-terminal cystathionine-β-synthase (CBS) domains similar to those
found in many other proteins from organisms of the three domains of life [97]. Although their general function is not known and may
1576
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
be multifaceted, in the synthases, they function in regulation by S-adenosyl methionine (SAM). Proposed functions of CBS domains
include (1) multimerization, (2) protein sorting, (3) channel gating, and (4) ligand binding. The recent demonstration that CBS
domains bind adenosine-containing ligands (ATP, AMP and SAM) has led to the suggestion that they are sensors of intracellular
metabolites and may also mediate signal transduction [97].
4. Discussion
In this bioinformatic analysis, we have identified “extra”
domains in known secondary carriers, channels and their
homologues. These proteins are all members of established
transporter superfamilies/families categorized in the Transporter
Classification (TC) System (see http://www.tcdb.org). Based on
their family/subfamily associations, the transported substrates,
the polarity of transport, and the cotransported cation (in
secondary carriers) can often be predicted. However, when no
functionally characterized member of a family or subfamily is
available, functional prediction is much more difficult.
In the past, we have found that prediction of transport
functions can be greatly facilitated by the identification of fused
domains of known function. For example, such associations
have allowed us to correctly predict the polarity and substrates
of lysophospholipid [10] and bicarbonate [11] uptake permeases. These successes have prompted us to seek extra
domains in other transporters.
Extra domains, particularly hydrophilic domains in channels
and secondary carriers, are likely to serve as regulatory,
targeting, or ligand/protein interaction domains. Often, these
domains have become fused to catalytic enzymes as well as
transporters, and if in any such case, the subfunction of the
domain is known, then extrapolation to other proteins of
unknown or differing function is, with caution, justified. For
example, TrkA-N domains bind NAD+ and function in
regulation or catalysis, and the sequence-related TrkA-C
domains may serve a similar function while binding some
other ligand [23].
The identification of these domains in members of the
DASS, ABT and AAE families of anion and amino acid
transporters suggests that these transporters may be regulated by
a mechanism, common to that of various channels, carriers and
enzymes that also contain these domains. In fact, it is possible
that these diverse vectorial and non-vectorial catalytic processes
are coordinately regulated by virtue of the presence of the same
fused regulatory domains. Since only certain members of a
particular family generally exhibit any one of these fused
domains, it can be supposed that a subfunction common to the
homologues containing that domain will be absent from those
that do not. Thus, the discovery of multidomain transporters
may facilitate functional characterization. However, it will also
extend the functionality of the carrier by providing clues about
specific subfunctions. Just as TrkA domains may be nucleotide
(e.g., NAD+) binding domains, the SPX domain may be Gprotein β-subunit binding domains.
The identification of large SPX domains in transport
homologues of the DASS family is particularly worthy of
note. These domains are believed to bind G-protein β-subunits
for the purpose of signal transduction [30,31]. This function is
consistent with our finding that they are restricted to fungal
members of the DASS family; none were found in the
prokaryotic homologues. Trimeric G-proteins and SPX domains
seem to be restricted to the eukaryotic domain, although they
are present in representative proteins from all eukaryotic
kingdoms. This may suggest that the fungal cluster 7 (in figure
DASS-S3B on our website) has gained a receptor function not
common to non-fungal members of the DASS family. It is
worthy of note that reception is one of the few non-transport
functions that transport homologues have acquired over
evolutionary time [2,32].
The DedA domains, found in TRAP-T family members,
differ from those cited above in that these domains are
hydrophobic, including putative transmembrane α-helices,
that presumably form structural constituents of the transport
proteins. They may function as structural scaffolds or
participate directly in transport. In this respect, DedA domains
differ from all of the other “extra” domains described in this
study.
The OAT family of animal organic anion transporters proved
to contain Kazal-2 and PDZ domains. These domains are
probably modular protein–protein interaction domains. The
presence of two structurally distinct types of such interaction
domains clearly suggests that the OAT transporters interact with
multiple cytoplasmic or membrane proteins. Possibly these
interacting proteins play a role in their transport functions. In
fact, mutations in PDZ domains of some human proteins result
in severe diseases [40,41]. In the case of Oatp1a1, the PDZ
domain promotes interaction with the PDZK1 scaffolding
protein of the epithelial cell, an event that is required for proper
subcellular localization [42].
A particularly complex situation proved to exist in members
of the OST family. Here, two subunits are required for the
function of at least some animal transporters. These subunits,
Ostα and Ostβ, could include in their polypeptides a number of
extra domains, depending on the homologue. Thus, different
Ostα subunits can incorporate (1) N-terminal catalytic membrane-bound O-acyl transferase (MBOAT) domains, (2) Cterminal hydrophilic domains also found in dynein-related
AAA ATPases, (3) C-terminal strongly hydrophilic repeat units
also found in the human polycystic kidney disease protein, and
(4) C-terminal domains found N-terminally in human kinesin.
The larger Ostβ homologues found in non-mammalian
animals have an N-terminal ricin carbohydrate binding domain
that probably functions in glycoprotein or glycolipid recognition. Such a degree of complexity within a single protein family
is almost unprecedented and suggests the occurrence of a
variety of late evolving subfunctions.
Our studies revealed the presence of Universal Stress Protein
(USP) domains in one family (ABT) of the APC superfamily of
amino acid/polyamine/organocation transporters. Since UspA
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
of E. coli promotes cell survival and is synthesized in increased
amounts under stress conditions, and because it lacks other
domains, it can be inferred that the presence of USP domains in
a transporter could promote the involvement of the transporter
in one or more stress responses. Similarly, the presence of a
IIAFru domain in a bacterial member of this family suggests that
for this protein (which lacks the USP domain), regulation is
sensitive to the phosphorylation state of the fused IIA protein
and hence to the energy state of the cell.
Finally, examination of representative α-helical-type channel
proteins of TC class 1.A revealed a number of domains, several
of which are known to function in regulation as ligand-binding
domains. These include the cyclic nucleotide-binding CAP_ED
domain, the calmodulin-binding domain (CaMBD), the neurotransmitter ligand-binding domain (LBD) and the α-adenosyl
methionine-binding domain (CBS) that may actually be capable
of binding any of a variety of adenosine-containing molecules
such as ATP, AMP and SAM [47]. Other hydrophilic domains
in channel-forming proteins probably function in signal
transduction, macromolecular interactions or subcellular targeting. However, in one case, that of the TRPM7 Ca2+/Mg2+
channel protein, the extra C-terminal domain has ser/thr protein
kinase activity that regulates channel activity.
All of the suggestions and conclusions resulting from the
reported studies are subject to experimental verification. Molecular genetic/biochemical approaches coupled to physiological
analyses are likely to reveal the primary and secondary functions
of these transporters and their subdomains. While clues as to their
functions are already available, detailed analyses are likely to
reveal functional diversity much greater than is currently
recognized. We hope that these studies will provide guides for
structure/function relationships for many years to come.
Acknowledgments
This work was supported by NIH grants GM55434 and
GM64368. We thank Mary Beth Hiller for her assistance in the
preparation of this manuscript.
References
[1] M.H. Saier Jr., Computer-aided analyses of transport protein sequences:
gleaning evidence concerning function, structure, biogenesis, and evolution, Microbiol. Rev. 58 (1994) 71–93.
[2] M.H. Saier Jr., A functional–phylogenetic classification system for
transmembrane solute transporters, Microbiol. Mol. Biol. Rev. 64 (2000)
354–411.
[3] W. Busch, M.H. Saier Jr., The Transporter Classification (TC) system,
2002, CRC Crit. Rev. Biochem. Mol. Biol. 37 (2002) 287–337.
[4] W. Busch, M.H. Saier Jr., The IUBMB-endorsed transporter classification
system, Mol. Biotechnol. 27 (2004) 253–262.
[5] M.H. Saier Jr., C.V. Tran, R.D. Barabote, TCDB: the transporter
classification database for membrane transport protein analyses and
information, Nucleic Acids Res. 34 (2006) D181–D186 (Database issue).
[6] M.H. Saier Jr., Answering fundamental questions in biology with
bioinformatics, ASM News 69 (2003) 175–181.
[7] M.H. Saier Jr., Tracing pathways of transport protein evolution, Mol.
Microbiol. 48 (2003) 1145–1156.
[8] S.S. Pao, I.T. Paulsen, M.H. Saier Jr., The major facilitator superfamily,
Microbiol. Mol. Biol. Rev. 62 (1998) 1–32.
1577
[9] M.H. Saier Jr., J.T. Beatty, A. Goffeau, K.T. Harley, W.H.M. Heijne, S.-C.
Huang, D.L. Jack, P.S. Jahn, K. Lew, J. Liu, S.S. Pao, I.T. Paulsen, T.-T.
Tseng, P.S. Virk, The major facilitator superfamily, J. Mol. Microbiol.
Biotechnol. 1 (1999) 257–279.
[10] E.M. Harvat, Y.M. Zhang, C.V. Tran, Z. Zhang, M.W. Frank, C.O. Rock,
M.H. Saier Jr., Lysophospholipid flipping across the Escherichia coli inner
membrane catalyzed by a transporter (LplT) belonging to the major
facilitator superfamily, J. Biol. Chem. 280 (2005) 12028–12034.
[11] J. Felce, M.H. Saier Jr., Carbonic anhydrases fused to anion transporters of
the SulP family: evidence for a novel type of bicarbonate transporter,
J. Mol. Microbiol. Biotechnol. 8 (2004) 169–176.
[12] T. von Rozycki, M. Schultzel, M.H. Saier Jr., Sequence analyses of
cyanobacterial bicarbonate transporters and their homologues, J. Mol.
Microbiol. Biotechnol. 7 (2004) 102–108.
[13] R.D. Barabote, M.H. Saier Jr., Comparative genomic analyses of the
bacterial phosphotransferase system (PTS), Microbiol. Mol. Biol. Rev. 69
(2005) 608–634.
[14] A. Bateman, L. Coin, R. Durbin, R.D. Finn, V. Hollich, S. Griffiths-Jones,
A. Khanna, M. Marshall, S. Moxon, E.L. Sonnhammer, D.J. Studholme,
C. Yeats, S.R. Eddy, The Pfam protein families database, Nucleic Acids
Res. 32 (2004) D138–D141 (Database issue).
[15] S.R. Eddy, Profile hidden Markov models, Bioinformatics 14 (1998)
755–763.
[16] S.F. Altschul, T.L. Madden, A.A. Schäffer, J. Zhang, Z. Zhang, W. Miller,
D.J. Lipman, Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs, Nucleic Acids Res. 25 (1997)
3389–3402.
[17] J.D. Thompson, T.J. Gibson, F. Plewniak, F. Jeanmougin, D.G. Higgins,
The CLUSTAL X windows interface: flexible strategies for multiple
sequence alignment aided by quality analysis tools, Nucleic Acids Res. 25
(1997) 4876–4882.
[18] Y. Zhai, M.H. Saier Jr., A simple sensitive program for detecting internal
repeats in sets of multiply aligned homologous proteins, J. Mol. Microbiol.
Biotechnol. 4 (2002) 29–31.
[19] Y. Zhai, M.H. Saier Jr., A web-based program for the prediction of average
hydropathy, average amphipathicity and average similarity of multiply
aligned homologous proteins, J. Mol. Microbiol. Biotechnol. 3 (2001)
285–286.
[20] A. Marchler-Bauer, S.H. Bryant, CD-Search: protein domain annotations
on the fly, Nucleic Acids Res. 32 (2004) W327–W331 (Web Server issue).
[21] M.H. Saier Jr., B.H. Eng, S. Fard, J. Garg, D.A. Haggerty, W.J.
Hutchinson, D.L. Jack, E.C. Lai, H.J. Liu, D.P. Nusinew, A.M. Omar,
S.S. Pao, I.T. Paulsen, J.A. Quan, M. Sliwinski, T.-T. Tseng, S. Wachi,
G.B. Young, Phylogenetic characterization of novel transport protein
families revealed by genome analyses, Biochim. Biophys. Acta 1422
(1999) 1–56.
[22] S. Prakash, G. Cooper, S. Singhi, M.H. Saier Jr., The ion transporter
superfamily, Biochim. Biophys. Acta 1618 (2003) 79–92.
[23] V. Anantharaman, E.V. Koonin, L. Aravind, Regulatory potential, phyletic
distribution and evolution of ancient, intracellular small-molecule-binding
domains, J. Mol. Biol. 307 (2001) 1271–1292.
[24] L. Aravind, E.V. Koonin, A novel family of predicted phosphoesterases
includes Drosophila prune protein and bacterial RecJ exonuclease, Trends
Biochem. Sci. 23 (1998) 17–19.
[25] A. Schlosser, A. Hamann, D. Bossemeyer, E. Schneider, E.P. Bakker,
NAD+ binding to the Escherichia coli K+-uptake protein TrkA and
sequence similarity between TrkA and domains of a family of
dehydrogenases suggest a role for NAD+ in bacterial transport, Mol.
Microbiol. 9 (1993) 533–543.
[26] J.S. Lolkema, I. Sobczak, D.J. Slotboom, Secondary transporters of the
2HCT family contain two homologous domains with inverted membrane
topology and trans re-entrant loops, FEBS J. 272 (2005) 2334–2344.
[27] J.S. Lolkema, D.-J. Slotboom, Classification of 29 families of secondary
transport proteins into a single structural class using hydropathy profile
analysis, J. Mol. Biol. 327 (2003) 901–909.
[28] S. Nakano, E. Kuster-Schock, A.D. Grossman, P. Zuber, Spx-dependent
global transcriptional control is induced by thiol-specific oxidative stress in
Bacillus subtilis, Proc. Natl. Acad. Sci. U. S. A. 100 (2003) 13603–13608.
1578
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
[29] S. Nakano, M.M. Nakano, Y. Zhang, M. Leelakriangsak, P. Zuber, A
regulatory protein that interferes with activator-stimulated transcription in
bacteria, Proc. Natl. Acad. Sci. U. S. A. 100 (2003) 4233–4238.
[30] B.H. Spain, D. Koo, M. Ramakrishnan, B. Dzudzor, J. Colicelli, Truncated
forms of a novel yeast protein suppress the lethality of a G protein αsubunit deficiency by interacting with the β-subunit, J. Biol. Chem. 270
(1995) 25435–25444.
[31] Y. Wang, C. Ribot, E. Rezzonico, Y. Poirier, Structure and expression
profile of the Arabidopsis PHO1 gene family indicates a broad role in
inorganic phosphate homeostasis, Plant Physiol. 135 (2004) 400–411.
[32] B. Pinson, M. Merle, J.M. Franconi, B. Daignan-Fornier, Low affinity
orthophosphate carriers regulate PHO gene expression independently
of internal orthophosphate concentration in Saccharomyces cerevisiae,
J. Biol. Chem. 279 (2004) 35273–35280.
[33] D.J. Kelly, G.H. Thomas, The tripartite ATP-independent periplasmic
(TRAP) transporters of bacteria and archaea, FEMS Microbiol. Rev. 25
(2001) 405–424.
[34] R. Rabus, D.L. Jack, D.J. Kelly, M.H. Saier Jr., TRAP transporters: an
ancient family of periplasmic solute receptor-dependent secondary active
transporters, Microbiology 145 (1999) 3431–3445.
[35] B. Hagenbuch, P.J. Meier, The superfamily of organic anion transporting
polypeptides, Biochim. Biophys. Acta 1609 (2003) 1–18.
[36] X. Xu, A.K. Fu, F.C. Ip, C.P. Wu, S. Duan, M.M. Poo, X.B. Yuan, N.Y. Ip,
Agrin regulates growth cone turning of Xenopus spinal motoneurons,
Development 132 (2005) 4309–4316.
[37] B. Schlott, J. Wohnert, C. Icke, M. Hartmann, R. Ramachandran, K.H.
Guhrs, E. Glusa, J. Flemming, M. Gorlach, F. Grosse, O. Ohlenschlager,
Interaction of Kazal-type inhibitor domains with serine proteinases:
biochemical and structural studies, J. Mol. Biol. 318 (2002) 533–546.
[38] A. van de Locht, D. Lamba, M. Bauer, R. Huber, T. Friedrich, B. Kroger,
W. Hoffken, W. Bode, Two heads are better than one: crystal structure of
the insect derived double domain Kazal inhibitor rhodniin in complex with
thrombin, EMBO J. 14 (1995) 5149–5157.
[39] K. Ogura, S. Choudhuri, C.D. Klaassen, Full-length cDNA cloning
and genomic organization of the mouse liver-specific organic anion
transporter-1 (lst-1), Biochem. Biophys. Res. Commun. 272 (2000)
563–570.
[40] A.Y. Hung, M. Sheng, PDZ domains: structural modules for protein
complex assembly, J. Biol. Chem. 277 (2002) 5699–5702.
[41] P. Wang, J.J. Wang, Y. Xiao, J.W. Murray, P.M. Novikoff, R.H. Angeletti,
G.A. Orr, D. Lan, D.L. Silver, A.W. Wolkoff, Interaction with PDZK1 is
required for expression of organic anion transporting protein 1A1 on the
hepatocyte surface, J. Biol. Chem. 280 (2005) 30143–30149.
[42] S.M. Gisler, S. Pribanic, D. Bacic, P. Forrer, A. Gantenbein, L.A. Sabourin,
A. Tsuji, Z.S. Zhao, E. Manser, J. Biber, H. Murer, PDZK1: I. a major
scaffolder in brush borders of proximal tubular cells, Kidney Int. 64 (2003)
1733–1745.
[43] M.H. Saier Jr., Families of transmembrane transporters selective for amino
acids and their derivatives, Microbiology 146 (2000) 1775–1795.
[44] D.L. Jack, I.T. Paulsen, M.H. Saier Jr., The amino acid/polyamine/
organocation (APC) superfamily of transporters specific for amino acids,
polyamines and organocations, Microbiology 146 (2000) 1797–1814.
[45] Y.J. Chung, C. Krueger, D. Metzgar, M.H. Saier Jr., Size comparisons
among integral membrane transport protein homologues in bacteria,
archaea, and eucarya, J. Bacteriol. 183 (2001) 1012–1021.
[46] E. Gasol, M. Jiménez-Vidal, J. Chillarón, A. Zorzano, M. Palacín,
Membrane topology of system xc-light subunit reveals a re-entrant loop
with substrate-restricted accessibility, J. Biol. Chem. 279 (2004)
31228–31236.
[47] T. Nystrom, F.C. Neidhardt, Expression and role of the universal stress
protein, UspA, of Escherichia coli during growth arrest, Mol. Microbiol.
11 (1994) 537–544.
[48] T.I. Zarembinski, L.W. Hung, H.J. Mueller-Dieckmann, K.K. Kim, H.
Yokota, R. Kim, S.H. Kim, Structure-based assignment of the biochemical
function of a hypothetical protein: a test case of structural genomics, Proc.
Natl. Acad. Sci. U. S. A. 95 (1998) 15189–15193.
[49] M.C. Sousa, D.B. McKay, Structure of the universal stress protein of
Haemophilus influenzae, Structure 9 (2001) 1135–1141.
[50] A. Accardi, C. Miller, Secondary active transport mediated by a
prokaryotic homologue of ClC Cl− channels, Nature 427 (2004)
803–807.
[51] K. Abe, F. Ohnishi, K. Yagi, T. Nakajima, T. Higuchi, M. Sano, M.
Machida, R.I. Sarker, P.C. Maloney, Plasmid-encoded asp operon confers a
proton motive metabolic cycle catalyzed by an aspartate–alanine exchange
reaction, J. Bacteriol. 184 (2002) 2906–2913.
[52] J.S. Lolkema, D.-J. Slotboom, Sequence and hydropathy profile analysis
of two classes of secondary transporters, Mol. Membr. Biol. 22 (2005)
177–189.
[53] J. Devereux, P. Haeberli, O. Smithies, A comprehensive set of sequence
analysis programs for the VAX, Nucleic Acids Res. 12 (1984) 387–395.
[54] T.B. Cao, M.H. Saier Jr., The general protein secretory pathway:
phylogenetic analyses leading to evolutionary conclusions, Biochim.
Biophys. Acta 1609 (2003) 115–125.
[55] Y. Zhai, J. Tchieu, M.H. Saier Jr., A web-based Tree View (TV) program
for the visualization of phylogenetic trees, J. Mol. Microbiol. Biotechnol. 4
(2002) 69–70.
[56] A.B. Chang, R. Lin, W.K. Studley, C.V. Tran, M.H. Saier Jr., Phylogeny as
a guide to structure and function of membrane transport proteins, Mol.
Membr. Biol. 21 (2004) 171–181.
[57] D.J. Seward, A.S. Koh, J.L. Boyer, N. Ballatori, Functional complementation between a novel mammalian polygenic transport complex and an
evolutionarily ancient organic solute transporter, OSTα-OSTβ, J. Biol.
Chem. 278 (2003) 27473–27482.
[58] W. Wang, D.J. Seward, L. Li, J.L. Boyer, N. Ballatori, Expression cloning
of two genes that together mediate organic solute and steroid transport in
the liver of a marine vertebrate, Proc. Natl. Acad. Sci. U. S. A. 98 (2001)
9431–9436.
[59] P.A. Dawson, M. Hubbert, J. Haywood, A.L. Craddock, N. Zerangue, W.V.
Christian, N. Ballatori, The heteromeric organic solute transporter α-β,
Ostα-Ostβ, is an ileal basolateral bile acid transporter, J. Biol. Chem. 280
(2005) 6960–6968.
[60] L. Sipos, G. von Heijne, Predicting the topology of eukaryotic membrane
proteins, Eur. J. Biochem. 213 (1993) 1333–1340.
[61] G. von Heijne, Net N–C charge imbalance may be important for signal
sequence function in bacteria, J. Mol. Biol. 192 (1986) 287–290.
[62] G. von Heijne, Y. Gavel, Topogenic signals in integral membrane proteins,
Eur. J. Biochem. 174 (1988) 671–678.
[63] Y. Zhai, W.H.M. Heijne, D.W. Smith, M.H. Saier Jr., Homologues of
archaeal rhodopsins in plants, animals and fungi: structural and functional
predications for a putative fungal chaperone protein, Biochim. Biophys.
Acta 1511 (2001) 206–223.
[64] J.E. Garbarino, I.R. Gibbons, Expression and genomic analysis of midasin,
a novel and highly conserved AAA protein distantly related to dynein,
BMC Genomics 3 (2002) 18.
[65] L.M. Iyer, D.D. Leipe, E.V. Koonin, L. Aravind, Evolutionary history and
higher order classification of AAA+ ATPases, J. Struct. Biol. 146 (2004)
11–31.
[66] K. Tanaka, K. Okabayashi, M. Asashima, N. Perrimon, T. Kadowaki, The
evolutionarily conserved porcupine gene family is involved in the
processing of the Wnt family, Eur. J. Biochem. 267 (2000) 4300–4311.
[67] G.I. Anyatonwu, B.E. Ehrlich, Organic cation permeation through the
channel formed by polycystin-2, J. Biol. Chem. 280 (2005) 29488–29493.
[68] M. Bycroft, A. Bateman, J. Clarke, S.J. Hamill, R. Sandford, R.L. Thomas,
C. Chothia, The structure of a PKD domain from polycystin-1.
Implications for polycystic kidney disease, EMBO J. 18 (1999) 297–305.
[69] G.M. Xu, S. González-Perrett, M. Essafi, G.A. Timpanaro, N. Montalbetti,
M.A. Arnaout, H.F. Cantiello, Polycystin-1 activates and stabilizes the
polycystin-2 channel, J. Biol. Chem. 278 (2003) 1457–1462.
[70] H. Miki, Y. Okada, N. Hirokawa, Analysis of the kinesin superfamily:
insights into structure and function, Trends Cell Biol. 15 (2005) 467–476.
[71] Z. Fujimoto, A. Kuno, S. Kaneko, H. Kobayashi, I. Kusakabe, H. Mizuno,
Crystal structures of the sugar complexes of Streptomyces olivaceoviridis
E-86 xylanase: sugar binding structure of the family 13 carbohydrate
binding module, J. Mol. Biol. 316 (2002) 65–78.
[72] C.N. Adra, S. Zhu, J.L. Ko, J.C. Guillemot, A.M. Cuervo, H. Kobayashi,
T. Horiuchi, J.M. Lelias, J.D. Rowley, B. Lim, LAPTM5: a novel
R.D. Barabote et al. / Biochimica et Biophysica Acta 1758 (2006) 1557–1579
[73]
[74]
[75]
[76]
[77]
[78]
[79]
[80]
[81]
[82]
[83]
lysosomal-associated multispanning membrane protein preferentially
expressed in hematopoietic cells, Genomics 35 (1996) 328–337.
D.L. Hogue, M.J. Ellison, J.D. Young, C.E. Cass, Identification of a novel
membrane transporter associated with intracellular membranes by
phenotypic complementation in the yeast Saccharomyces cerevisiae,
J. Biol. Chem. 271 (1996) 9801–9808.
D.L. Hogue, L. Kerby, V. Ling, A mammalian lysosomal membrane
protein confers multidrug resistance upon expression in Saccharomyces
cerevisiae, J. Biol. Chem. 274 (1999) 12877–12882.
Y. Hayami, S. Iida, N. Nakazawa, I. Hanamura, M. Kato, H. Komatsu, I.
Miura, B.J. Dave, W.G. Sanger, B. Lim, M. Taniwaki, R. Ueda,
Inactivation of the E3/LAPTm5 gene by chromosomal rearrangement
and DNA methylation in human multiple myeloma, Leukemia 17 (2003)
1650–1657.
L.M. Scott, L. Mueller, S.J. Collins, E3, a hematopoietic-specific transcript
directly regulated by the retinoic acid receptor alpha, Blood 88 (1996)
2517–2530.
J.M. Inal, Schistosoma TOR (trispanning orphan receptor), a novel,
antigenic surface receptor of the blood-dwelling, Schistosoma parasite,
Biochim. Biophys. Acta 1445 (1999) 283–298.
J.M. Inal, Complement C2 receptor inhibitor trispanning: from man to
schistosome, Springer Semin, Immunopathol. 27 (2005) 320–331.
R.E. Hirsch, B.D. Lewis, E.P. Spalding, M.R. Sussman, A role for
the AKT1 potassium channel in plant nutrition, Science 280 (1998)
918–921.
P. Daram, S. Urbach, F. Gaymard, H. Sentenac, I. Cherel, Tetramerization
of the AKT1 plant potassium channel involves its C-terminal cytoplasmic
domain, EMBO J. 16 (1997) 3455–3463.
P. Maser, S. Thomine, J.I. Schroeder, J.M. Ward, K. Hirschi, H. Sze, I.N.
Talke, A. Amtmann, F.J. Maathuis, D. Sanders, J.F. Harper, J. Tchieu, M.
Gribskov, M.W. Persans, D.E. Salt, S.A. Kim, M.L. Guerinot, Phylogenetic relationships within cation transporter families of Arabidopsis, Plant
Physiol. 126 (2001) 1646–1667.
T.M. Ishii, C. Silvia, B. Hirschberg, C.T. Bond, J.P. Adelman, J. Maylie, A
human intermediate conductance calcium-activated potassium channel,
Proc. Natl. Acad. Sci. U. S. A. 94 (1997) 11651–11656.
B.S. Jensen, D. Strobaek, S.P. Olesen, P. Christophersen, The Ca2+activated K+ channel of intermediate conductance: a molecular target for
novel treatments? Curr. Drug Targets 2 (2001) 401–422.
1579
[84] R. Roncarati, I. Decimo, G. Fumagalli, Assembly and trafficking of human
small conductance Ca2+-activated K+ channel SK3 are governed by
different molecular domains, Mol. Cell. Neurosci. 28 (2005) 314–325.
[85] T. Arazi, B. Kaplan, R. Sunkar, H. Fromm, Cyclic-nucleotide- and Ca2+/
calmodulin-regulated channels in plants: targets for manipulating heavymetal tolerance, and possible physiological roles, Biochem. Soc. Trans. 28
(2000) 471–475.
[86] J. Sneyd, M. Falcke, Models of the inositol trisphosphate receptor, Prog.
Biophys. Mol. Biol. 89 (2005) 207–245.
[87] C.P. Ponting, Novel eIF4G domain homologues linking mRNA translation
with nonsense-mediated mRNA decay, Trends Biochem. Sci. 25 (2000)
423–426.
[88] D. Drennan, A.G. Ryazanov, Alpha-kinases: analysis of the family and
comparison with conventional protein kinases, Prog. Biophys. Mol. Biol.
85 (2004) 1–32.
[89] M.V. Dorovkov, A.G. Ryazanov, Phosphorylation of annexin I by TRPM7
channel-kinase, J. Biol. Chem. 279 (2004) 40646–50643.
[90] L.V. Ryazanova, M.V. Dorovkov, A. Ansari, A.G. Ryazanov, Characterization of the protein kinase activity of TRPM7/ChaK1, a protein kinase
fused to the transient receptor potential ion channel, J. Biol. Chem. 279
(2004) 3708–3716.
[91] C. Schmitz, A.L. Perraud, D.O. Johnson, K. Inabe, M.K. Smith, R. Penner,
T. Kurosaki, A. Fleig, A.M. Scharenberg, Regulation of vertebrate cellular
Mg2+ homeostasis by TRPM7, Cell 114 (2003) 191–200.
[92] P.D. Wilson, Polycystin: new aspects of structure, function, and regulation,
J. Am. Soc. Nephrol. 12 (2001) 834–845.
[93] C. Stayner, J. Zhou, Polycystin channels and kidney disease, Trends
Pharmacol Sci. 22 (2001) 543–546.
[94] A. Miyazawa, Y. Fujiyoshi, N. Unwin, Structure and gating mechanism of
the acetylcholine receptor pore, Nature 423 (2003) 949–955.
[95] K. Brejc, W.J. van Dijk, R.V. Klaassen, M. Schuurmans, J. van Der Oost,
A.B. Smit, T.K. Sixma, Crystal structure of an ACh-binding protein
reveals the ligand-binding domain of nicotinic receptors, Nature 411
(2001) 269–276.
[96] H. Xue, Identification of major phylogenetic branches of inhibitory ligandgated channel receptors, J. Mol. Evol. 47 (1998) 323–333.
[97] S. Ignoul, J. Eggermont, CBS domains: structure, function, and pathology in human proteins, Am. J. Physiol., Cell Physiol. 289 (2005)
C1369–C1378.