View PDF - CiteSeerX

Page 1 ofArticles
58
in PresS. Physiol Genomics (January 16, 2007). doi:10.1152/physiolgenomics.00142.2006
1
The evolution and structural diversification of
Hyperpolarization-activated cyclic nucleotide-gated (HCN)
channel genes
Heather A. Jackson1, Christian R. Marshall2 and Eric A. Accili1
1
Department of Cellular and Physiological Sciences, Faculty of Medicine, University of
British Columbia, Vancouver, BC, Canada. 2Department of Molecular Biology and
Biochemistry, Simon Fraser University, Burnaby, BC, Canada.
‘Evolution of HCN channels’
Corresponding Author: E.A. Accili, University of BC, Vancouver, BC, Canada. e-mail
([email protected]), phone (604) 822-6900, fax (604) 822-6048.
Copyright © 2007 by the American Physiological Society.
Page 2 of 58
2
Abstract
Hyperpolarization-activated cyclic nucleotide-gated (HCN) channels are members of the
voltage-gated channel superfamily and play a critical role in cellular pacemaking. Overall
sequence conservation is high throughout the family and channel functions are similar but
not identical. Phylogenetic analyses are imperative to understand how these genes have
evolved and to make informed comparisons of HCN structure and function. These have
been previously limited, however, by the small number of available sequences, from a
minimal number of species unevenly distributed over evolutionary time. We have now
identified and annotated thirty-one novel genes from invertebrates, urochordates, fish,
amphibians, birds and mammals. With increased numbers and a broader representation, a
more precise sequence comparison was performed and an evolutionary history for these
genes was constructed. Our data confirm the existence of at least four vertebrate paralogs
and suggest that these arose via three duplication and diversification events from a single
ancestral gene. Additional lineage specific duplications appear to have occurred in
urochordate and fish genomes. Based on exon boundary conservation and phylogenetic
analyses, we hypothesize that mammalian gene structure was established, and duplication
events occurred, after the divergence of urochordates and before the divergence of fish
from the tetrapod lineage. In addition, we identified highly conserved sequence regions
that are likely important for general HCN functions, as well as regions with differences
conserved among each of the individual paralogs. The latter may underlie more subtle
isoform-specific properties that were otherwise masked by the high identity among
mammalian orthologs and/or inaccurate alignments between paralogs.
Page 3 of 58
3
Key Words:
Pacemaker channel; phylogeny; sequence analysis; molecular evolution
Page 4 of 58
4
Introduction
Hyperpolarization-activated currents, commonly referred to as Ih or If, have been identified in
species distributed across the metazoan lineage and contribute to membrane potential in excitable
and non-excitable cells. In the mammalian heart and nervous system they contribute to
spontaneous beating in pacemaker cells (11, 52). In non-pacing cells, Ih facilitates the regulation
of resting membrane properties by limiting the fluctuation of membrane potentials (52). The
proteins responsible for Ih have been cloned from both invertebrate and vertebrate species and are
referred to as hyperpolarization-activated cyclic nucleotide-gated (HCN) channels (18, 26, 35,
57). When expressed in heterologous systems, HCN subunits form cation-selective channels that
are directly modulated by cyclic nucleotides and exhibit biophysical properties that are similar to
those of Ih observed in native tissue.
HCN channels are members of the voltage-gated cation channel superfamily and based on
sequence homology, are most closely related to the cyclic nucleotide gated (CNG) channel and
ether-a-go-go (EAG) potassium channel families. Individual subunits (Figure 1) are predicted to
have six transmembrane segments (S1-S6), with a voltage sensor domain in the S4 segment and
an ion conducting pore between S5 and S6. Based on the crystal structures of related potassium
channels (14, 27, 32) it is proposed that four HCN subunits come together to form a tetramer
around a central pore. The distal termini of each subunit are cytoplasmic and from the crystal
structure of the C-terminus (78), it is now known that an C-helical linker region joins the
transmembrane region to an evolutionarily conserved cyclic nucleotide-binding domain (CNBD).
Page 5 of 58
5
To date, four mammalian isoforms have been cloned (HCN1-HCN4) (26, 35, 57). These
paralogs exhibit 80-90% sequence identity between the start of S1 to the end of the CNBD.
Differences in both sequence and length between paralogs occur predominantly in the N- and Ctermini (36). This high sequence identity suggests that the four mammalian HCN genes arose
from duplications of a single ancestral gene prior to the lineage divergence. However, when
these duplication and diversification events occurred remain unknown. The functional properties
of the individual isoforms, such as their responses to changes in voltage and modulation of
channel opening by cAMP, are similar but not identical. Likewise, the four HCN paralogs
display partial overlapping tissue distribution in both the heart (26, 45, 46, 63) and the central
nervous system (36). Despite what seems like redundant functional and expression profiles, the
overall physiological importance of the individual isoforms is clearly evident from the different
phenotypes of knockout mouse models. General and forebrain specific HCN1 knockout mice
demonstrate learning and movement dysfunction (49), HCN2-deficient mice exhibited absence
epilepsy as well as cardiac sinus dysrhythmia (34) and HCN4-deficient mice died during
embryonic development, possibly due to reduced HCN-mediated currents and heart rates of the
embryos (66). Furthermore, the physiological impact of the mammalian HCN4 isoform in the
heart has been confirmed by the association of mutations in this isoform to sinus arrhythmias in
humans (44, 60, 71). This pattern of functional disruption partially reflects the predominate
tissue distribution of the isoforms but also indicates that the sequence divergence that has
occurred in these four paralogs has been sufficient to differentiate their physiological roles,
thereby reducing functional redundancy and allowing each to have been retained throughout
evolution since the duplication process.
Page 6 of 58
6
In addition to the four mammalian isoforms, only a few HCN genes have been cloned from
lower vertebrate and invertebrate species. A single HCN1 ortholog has been cloned from the
rainbow trout (9) whereas one and two HCN homologs have been cloned from arthropods (20,
21, 30, 43) and the sea urchin (16, 18), respectively. These sequences demonstrate considerable
identity with the mammalian homologs, as well as some intriguing differences, and thus have
offered some preliminary clues about the evolutionary relationships among HCN genes. But due
to the lack of sequence representation from a diverse sampling of species distributed throughout
evolutionary history, previous phylogenetic analyses of the HCN family and its relatives have
been limited and have yielded inconsistent patterns of evolution (10, 16). Furthermore, the
current sampling of the four vertebrate isoforms is limited primarily to closely-related
mammalian sequences. The high residue conservation among these sequences, combined with
the lack of sequences from more distantly related vertebrates, renders comparative analyses of
the orthologs from the individual vertebrate isoforms ineffective.
In this study, we report the first thorough phylogenetic analysis and sequence comparison of the
HCN gene family. By performing an extensive search of the available protein and currently
accessible whole-genome databases we derived a comprehensive list of known and putative fulllength sequences for HCN homologs from a wide variety of species including urochordates and
lower vertebrates. The increased number of sequences and improved species diversification
provide information about HCN gene structure at critical periods during its evolutionary history.
We identified sequences that are conserved and likely important for general HCN functions, as
well as regions that may underlie more subtle differences in function among the different
isoforms. Furthermore, analyses of exon structure and genomic organization were carried out.
Page 7 of 58
7
These analyses provide insight into the molecular evolution of this protein within different taxa
and support the hypothesis that both lineage-specific and ancestral duplication and divergence
events of the HCN genes have occurred throughout its history.
Materials and Methods
Sequence Data:
Currently available HCN protein sequences were identified through BLASTP (1) searches of the
National Center for Biotechnology Information (NCBI) and UniProt/SwissProt non-redundant
(nr) protein databases. Human HCN2 (Genbank GI no. 21359848), trout HCN1 (Genbank GI no.
33312350) or drosophila HCN (Genbank GI no. 5326833) amino acid sequences were used as
queries and default parameters were applied. To prevent the inclusion of incorrect sequences
generated by computer prediction programs, all computer-derived annotations were temporarily
removed until confirmed by sequence alignment or database annotation described below. Splice
variants, short fragments, and other duplicates were also removed.
The Ensembl Genome Browser (http://www.ensembl.org/) (6, 24) was used to determine the
genomic position and distribution of the established HCN genes, and either to examine new
HCN genes identified by the computer gene-prediction programs and annotation process or to
identify novel genes. Sequences classified as HCN genes by Ensembl were examined and those
that spanned the entire length of known sequences were downloaded. Protein annotations that
resembled HCN but either lacked regions of the predicted sequence or showed signs of
additional exons or exon fragments were not included. However, the genomic DNA underlying
these protein predictions were used as further reference to help in the manual re-annotation of the
Page 8 of 58
8
protein sequence. TBLASTN (1) searches of the available genome databases using either the
low sensitivity default parameters (optimized for near-exact matches) or medium sensitivity
default parameters (optimized to allow for local mismatch) were conducted to identify any
further genes or genomic regions that showed significant sequence identity to the peptide query
sequence used. In total, thirteen genomes were analyzed, including: Zebrafish (Danio rerio) (v.
35.5b), Japanese pufferfish (Fugu rubripes) (v. 29.2e), Green Pufferfish (Tetraodon) (v. 31.1c),
opossum (Monodelphis domestica) (v35.2), dog (Canis Familiaris) (v. 35.1d), cow (Bos Taurus)
(v.36.2), chimpanzee (Pan troglodytes) (v.31.2a), clawed frog (Xenopus Tropicalis) (v. 31.1a),
chicken (Gallus gallus) (v.35.1k), sea squirt (Ciona Intestinalis) (v. 35.195b), pacific sea squirt
(Ciona Savignyi) (CSAV2.0), mosquito (Anopheles gambiae) (v. 23.2b.1), and worm
(Caenohabditis elegans) (v 29.130). Human HCN2, trout HCN1 and drosophila HCN sequences
were used as query sequences for vertebrate, fish and insect genomes, respectively. For the
urochordate genomes, sea urchin HCN (Genbank GI no. 74136757), trout HCN1 and human
HCN2 sequences were used. In general, similar TBLASTN results were found regardless of the
query sequence used, reflecting the high degree of sequence identity found within the core region
throughout the HCN family. Full length putative protein sequences were constructed from the
conceptual translation of genomic DNA as previously described (42) with the proposed starting
methionine in vertebrates supported by the presence of a consensus start sequence (29). Genome
position and intron-exon structure were examined and recorded. A list of the sequences used in
the analysis is shown in Table 1.
Multiple Sequence Alignments:
Page 9 of 58
9
Protein sequence alignments were performed using the default parameters of ClustalX (version
1.83) (69). Alignments were subsequently examined and edited in GeneDoc (48). To ensure that
the HCN genes identified were full-length and non-redundant, those lacking a putative start
codon were discarded. Isoform pairs identified in some of the fish genomes were included if they
corresponded to a different location in the genome, as they are likely the result of a
polyploidization event that is thought to have occurred early in their lineage ((23) and references
therein). Partial sequences that lacked a portion of the distal C-terminus corresponding to the last
mammalian exon were included as this region was not used in the phylogenetic analyses due to
the high degree of length and sequence variability among the different HCN genes. For the Nand C-terminal alignments, vertebrate sequences corresponding to the region upstream of S1 or
downstream of the CNBD (see Figure 1) were individually re-aligned for HCN1-4 and were
shaded based on sequence conservation of 100% (black), 80% (dark grey), and 60% (light grey).
Phylogenetic Analyses:
Sequences were trimmed to produce a core alignment spanning from the start of the
transmembrane segment S1 to the end of the CNBD (see Figure 1). This region exhibits high
sequence identity within the HCN family and among the other related sequences that were used.
As there is no known bacterial HCN channel, and based on a previous phylogenetic analysis of
HCN and CNG channels (10), KAT1 (Genbank GI no. 44888080), human ERG1 (Genbank GI
no. 7531135), human CNGA1 and CNGA3 (Genbank GI no. 2506302 and 13959682) sequences
were included in the analyses to serve as an outgroup for the rooted phylogenetic trees of the
HCN family. Six sequences that were missing exons due to gaps in the genome assembly were
removed from the alignment prior to running the programs. Neighbor-Joining (NJ) trees were
Page 10 of 58
10
generated using ClustalX, followed by tree evaluation with bootstrap re-sampling of 1000 times.
Additional NJ, Maximum Parsimony (MP) and Maximum Likelihood (ML) trees were created
using the Seqboot, Protdist, Neighbor, Protpars, Proml, and Consense programs from the
PHYLIP package (version 3.65), bootstrapping with 100 replicates with randomized input order
and 10 jumbles (15). The TreeView program (version 1.6.6) (51) was used to examine and
display all trees.
Results and Discussion
To date, 21 unique full-length HCN coding sequences have been cloned and deposited in
GenBank: 4 from arthropods, 2 from sea urchin, 1 from fish and 14 from mammals, specifically
human, rabbit, mouse and rat. Previous phylogenetic analyses of the HCN family and its
relatives have been limited and inconsistent, in part due to the lack of available sequences from a
diverse sampling of species. For example, the product of the first duplication event has been
shown to be either HCN1 (10) or HCN3 (16). Furthermore, the recently cloned spHCN2 (16)
sequence was shown to group with the HCN family, but was situated at the base of the
phylogenetic tree. This is not consistent with species evolution, which would place the sea urchin
genes with the other deuterostome species, after the arthropod clade division. Therefore,
spHCN2 represents either a new family of channels not yet identified in any other species or an
HCN homolog that was the product of a duplication event followed by rapid sequence
divergence. These inconsistencies underscore the current limitations and need for more detailed
analyses. This study expands the breadth of available HCN sequences through the manual
annotation of available genomes and in turn, provides the first comprehensive sequence
Page 11 of 58
11
comparison and phylogenetic analysis of this protein family including sequences from a
representative sampling of different species across evolutionary history.
HCN genes are present in multiple copies across a wide spectrum of species
Using a BLASTP search at NCBI or by searching the genome databases available at the Ensembl
website, we identified 58 non-redundant HCN sequences. Twenty-seven of these sequences were
previously identified by cloning or by computer annotation which were confirmed by genomic
data. The remaining thirty-one are novel and were completed by the data mining of genomes
available at Ensembl or the manual re-annotation of the predicted protein translations. Several of
the Ensembl predictions were either inconsistent with known sequences or the predicted gene did
not span the estimated length of the transcript. The inherent problems in gene prediction and
computer protein annotation methods have been previously described ((42) and references
therein). Therefore, we have corrected for this by multi-species comparisons and manual reannotation on a gene-by-gene basis. A complete list of the sequences included in this analysis is
shown in Table 1 and their protein sequences can be found in Supplementary Figure 1. Included
in the final list are: 23 mammalian sequences; including 2 and 3 new sequences from the
chimpanzee and opossum genomes, respectively; 22 lower vertebrate sequences; 6 urochordate
sequences, including 3 new sequences from both c. intestinalis and c. savignyi genomes; and 7
invertebrate sequences, with a new annotation of the single HCN gene found in the mosquito
genome. Of the lower vertebrates, 2 new sequences were from chicken, 3 from frog, 6 new
sequences from both the green pufferfish and the Japanese pufferfish genomes, and 2 new
sequences in the zebrafish. No HCN sequences were identified in c. elegans. With this
substantial increase in the number of full-length sequences representing a wide range of species,
Page 12 of 58
12
we were able to reconstruct the evolutionary history of this protein family and probe the
relationship of channel structure and function in much greater detail than was previously
possible.
High sequence identity amongst four vertebrate HCN isoforms within the core region
Due to the length and sequence variability that occurs in the N- and C-termini (see
Supplementary Figure 2), these regions were trimmed and a core alignment was produced
corresponding to a region between S1 and the end of the CNBD and approximately to exon 2
through 7 of the mammalian isoforms. Sequence conservation of the four vertebrate isoforms
within this region is high, at least 80-90% identity amongst the mammalian sequences and over
90% residue conservation between the newly identified fish sequences and their respective
mammalian orthologs (Figure 2). This indicates that this region has been slow to evolve during
the 450MY that separate the fish and human lineages. Two interesting findings based on this
sequence comparison are that HCN4 shares the highest sequence identity with all other
vertebrate isoforms and that the mammalian HCN3 and fish HCN3 sequences are as similar to
the other vertebrate isoforms as they are to each other. This suggests that HCN4 sequences have
diverged the least from a common ancestral sequence, as is further evident from the branch
lengths in the NJ and ML phylogenetic trees, and that HCN3 sequences have evolved
independently within the fish and mammalian lineages by equal amounts but at different sites.
Overall, the sequences of the invertebrates and urochordates display a lower conservation with
the mammalian sequences. Arthropods share a general sequence identity of approximately 6065% with the mammalian homologs, whereas one of the sea urchin sequences, spHCN1 (also
known as spIH or spHCN), and two of the ciona homologs, here named HCNb and HCNa, are
Page 13 of 58
13
only 55% identical. The second sea urchin sequence, spHCN2, and the other sequence from the
ciona species (HCNc) are even more diverged.
EST evidence supports the validity of highly diverged sequences identified in urochordates
Sequences identified from c. intestinalis and c. savignyi are substantially different from those of
either invertebrates or vertebrates. Two sequence pairs, here named HCNa and HCNb to limit
their association with any of the vertebrate isoforms, only share ~50-55% identity or 70-75%
conservation with either group. The third sequence, HCNc, is even less similar. It shares only
40% identity and 61-64% conservation with all other sequences with the exception of ciona
HCNa, with which it shares a slightly higher identity. The three sequences were found in the
genome databases of both ciona species and the orthologs share a conserved identity of over
90%. This HCN sequence similarity between the two ciona genome projects makes it very
unlikely that any differences between genes or with sequences from either invertebrates or
vertebrates are due to errors in sequencing.
To further support their validity, HCN expressed sequence tags (ESTs) were found in the
database of the Kyoto University ciona cDNA project (59) (http://ghost.zool.kyotou.ac.jp/indexr1.html). These ESTs span several exons and cover a large portion of the c.
intestinalis sequences provided here, suggesting that the three c. intestinalis genes have been
correctly identified and that their transcripts are expressed. A recent comprehensive analysis of
the ascidian genome revealed the existence of these three putative HCN ion channels (HCN1 =
HCNc, HCN2 = HCNa and HCN3 = HCNb) (50), however, the sequences themselves were not
Page 14 of 58
14
reported. The sequences identified here were in turn used to correctly annotate those found in
the c. savignyi genome.
Three different HCN duplication events occurred prior to the divergence of the fish lineage.
In order to examine the phylogeny of this family, a multiple sequence alignment of the core
region of HCN was created using 52 of the 58 identified sequences (see methods). Three
different phylogenetic programs, two NJ methods and one MP method, produced similar results.
Due to the high sequence conservation among mammalian sequences, ML methods were
incapable of analyzing the complete list of sequences. Using a subgroup of 41 sequences, ML
methods produced similar results to the NJ and MP trees. Figure 3 is a rooted MP consensus tree
produced by the PHYLIP program. The NJ rooted phylogram created by ClustalX and a single
representative ML phylogram produced by PHYLIP are provided in Supplementary Figure 3.
Four sequences were included as an outgroup: human CNGA1 and CNGA3, human ERG1 and
KAT1. Tree topology did not change when any of these outgroup sequences were removed.
HCN sequences did not group according to species but rather within isoform clades, indicating
that the HCN1-4 isoforms are paralogs. Genes from fish, birds and amphibians partitioned within
the four mammalian clades which strongly suggests that the four isoforms originated prior to the
origin of the vertebrate clade and were present in the common ancestor of fish and tetrapods.
From the tree topology, we also predict that an HCN ancestor underwent three separate gene
duplication events prior to the divergence of the fish lineage, over 450 MYA. Our data are
inconsistent with two rounds of whole-genome duplication and suggests that the HCN family
evolved through independent gene duplications or chromosomal block duplication (25).
Page 15 of 58
15
However, due to a relatively low sequence resolution, the 2R hypothesis can not be ruled out. Of
the four HCN isoforms, the clade belonging to HCN3 was the first to diverge and represents the
product of the first duplication event. This position of HCN3 is supported by all four phylogeny
methods and by a high bootstrap value, and supports the findings of Galindo and colleagues (16).
Functional data will be required to further explain this branching position. The low bootstrap
value at the division of the HCN3 clade in the MP and NJ trees indicates that this subdivision
remains unresolved. We hypothesize that this clade division may be due to the independent
sequence evolution which has occurred in the teleost and tetrapod HCN3 genes following the
lineage divergence. Interestingly, the HCN3 clade is not divided in the ML tree for which the
Jones-Taylor-Thornton model of evolution was imposed. We can not rule out the possibility that
this difference is due to the decreased sequences used in the ML analysis. The resolution of the
HCN3 clade will only be improved with the inclusion of more complete sequences from species
which evolved following the emergence of the four HCN paralogs but prior to the divergence of
mammals. Functional data of the HCN3 channels from lower vertebrates will also be required to
conclusively determine whether a functional divergence occurred within this clade.
Our data also suggests that HCN4 is the product of the second duplication event followed by the
emergence of HCN1 and HCN2. This proposed evolutionary pattern is evident in all three
phylogenetic trees and is further supported by the sequence conservation pattern shown in Figure
2. Based on phylogenetic results, the division between HCN1 and HCN2 remains unresolved.
Nevertheless, functional data for the mammalian isoforms suggests a clear difference between
these two isoforms. The discrepancy with the predicted order of species evolution within each
mammalian clade is likely due to the high sequence conservation seen within this core region.
Page 16 of 58
16
This results in a limited number of informative sites and produces a low phylogenetic signal (17).
The lack of sequence divergence in the mammalian clades may be due to insufficient
evolutionary time to permit the accumulation of mutations and/or a strong selective pressure to
retain the conserved sequence of this channel.
Fish lineage show evidence of duplicate HCN genes
Genome data mining revealed multiple HCN genes in both the Fugu rubripes and Tetraodon
species, in both cases exceeding the number found in the mammalian lineage. From the
branching order in the phylogenetic tree shown in Figure 3, it is clear that fish gene pairs group
together within the clades of individual mammalian orthologs. This suggests that the common
ancestor of teleost fish and tetrapods had 4 HCN genes, and that these underwent further
duplications independently in the fish lineage. These sequences have therefore been named
according to their mammalian ortholog and subsequently designated ‘a’ or ‘b’. This pattern is
clearly evident in the HCN2 and HCN4 clades, where the green puffer and fugu ‘a’ and ‘b’
sequences are grouped together and are separated from each other by high bootstrap values.
Because the teleost fish are predicted to have undergone a complete genome duplication early in
their own lineage ((23) and references therein), this distribution pattern is not unexpected.
However, because some of the genes identified were not full length and some showed evidence
of intron insertion, further analyses is required to determine whether these duplicates are
expressed and functional or have become pseudogenes (73)
Ciona genes most likely arose through lineage-specific duplication events
Page 17 of 58
17
In contrast to the observed pattern of the multiple fish genes, the multiple copies of HCN
sequences found within the urochordate species do not partition with any of the mammalian
isoform clades. However, each gene did show a highly significant pairing between c. intestinalis
and c. savignyi, indicating that these duplication events occurred before these two species
diverged. The timing of this duplication is unknown, but with the presence of multiple HCN
genes previously identified in the sea urchin, two different scenarios are possible. First, one
duplication of an ancestral gene may have occurred prior to the divergence of the deuterostomes
and given rise to the multiple HCN genes seen in the sea urchin and urochordate species. A
lineage specific duplication then occurred in the urochordates to produce the third ciona HCN
gene. Subsequently, the genes evolved rapidly and independently in the different lineages and
thus no longer resemble each other or any specific HCN isoform. The loss of one ancestral gene
and the duplication and diversification of the other, would then have given rise to the four
isoforms now common to all chordates. A second, and more parsimonious, pattern of evolution
is that lineage-specific duplication events of a single HCN ancestor occurred and gave rise to the
three HCN homologs within the urochordate lineage. Further ancestral duplication events
occurred within the vertebrate lineage, after the divergence of the urochordates and prior to the
fish lineage, giving rise to the four known mammalian isoforms. Lineage-specific gene
duplication in ciona has also been shown for the evolution of sodium gene family. Two sodium
channel genes have been identified but one possesses a sequence that has diverged considerably
from both its paralogous pair and from the other known sodium channel gene sequences. The
authors concluded that the duplication events occurred just prior to fish lineage (33). Similarly,
independent lineage-specific duplication was suggested for the ankyrin gene family based on
their phylogenetic results and the differences in gene sequences between ciona and the vertebrate
Page 18 of 58
18
homologs (7). The general branching order between the ciona HCN homologs, in which HCNa
and HCNc group together and HCNb is independent, is consistent among the different trees
produced. Their position within the tree, however, is variable and is most likely a result of the
different tree building methods used. It may also reflect the amount of lineage-specific sequence
divergence that has occurred in these HCN genes which has caused them to evolve
independently of both the invertebrate and vertebrate clades and has blurred their phylogenetic
position.
Predicted phylogenetic patterns are supported by exon boundary structure
Our phylogenetic analyses indicate that the four vertebrate isoforms arose via three duplications
of an ancestral HCN gene. This is supported by the exon structure of their coding sequences. The
four human HCN genes are located on different chromosomes: 1q21.2 (HCN3), 5p12 (HCN1),
15q24-q25 (HCN4) (61) and 19p13 (HCN2) (72). Using the genome databases, we found this
pattern of differential localization for all vertebrate HCN genes, fish through mammals.
Furthermore, the overall exon structure of the four isoforms has remained consistent since the
duplication and divergence events (Figure 4). They are comprised of 8 exons, with highly
conserved length and sequence in exons 2 through 7. An exception to this is observed in both of
the fish HCN3b genes which are predicted to have an intron of <75bp inserted in the middle of
exon 2. Exon 1 and 8, which directly correspond with the distal N- and C-termini, vary in both
length and sequence for all vertebrate genes analyzed. In addition, exon boundary positions are
highly conserved throughout the vertebrate lineage. This suggests that the extant vertebrate HCN
sequences are derived from a single ancestral gene that had an exon structure similar to current
Page 19 of 58
19
mammalian HCN genes and that the duplication events occurred after the intron positions were
fixed in the linear sequence.
Our phylogenetic results suggest that the invertebrate and urochordate HCN genes are homologs
of the four vertebrate isoforms but, because they group outside the vertebrate clades, they are not
orthologous to any one in particular. The presence of multiple genes in the urochordate species,
therefore, raised the question as to where and when duplications occurred and what course of
evolution lead from the invertebrate to vertebrate gene structure. To address these questions, we
compared the exon structures of invertebrates and urochordates to the highly conserved
vertebrate structure. The exon structure of the single HCN gene in the arthropod lineage is quite
different from the mammalian structure. Using the fly and mosquito genes as representatives, the
invertebrate gene is comprised of 13 or 14 shorter exons, separated by short introns. Of the seven
boundary positions conserved throughout the vertebrate lineage, only 2-4 are conserved in these
species. The exon structures of the urochordate genes, on the other hand, show similarities to
both the invertebrate and vertebrate genes. The HCNb gene contains 15 exons separated by short
introns, which is similar to the invertebrate structure. Nevertheless, seven of the exon boundaries
parallel all of those found in the vertebrate structure. The other two urochordate genes, HCNa
and HCNc, are different from both the vertebrate and invertebrate genes. HCNa contains 10
exons and only four boundaries are conserved with the vertebrate structure and none with the
invertebrate genes. The 13 exon boundaries of HCNc are not conserved with any other gene,
including paralogs from within the same species. In conjunction with sequence identity, residue
conservation and phylogenetic analyses, these findings further support the hypothesis that HCNb
represents the homolog most similar to the ancestral HCN gene of the urochordates, and that
Page 20 of 58
20
HCNa and HCNc arose through lineage-specific duplication processes. Furthermore, we suggest
that the ancestral gene of the HCN family contained numerous introns which were then lost
during the course of evolution. Exons became fused to form the structure now seen within the
vertebrate lineage.
The evolution of key residues in the voltage sensing domain and pore region
HCN channels open in response to changes in membrane voltage and allow for the passage of
Na+ and K+ ions across the plasma membrane. The transmembrane domains of the individual
subunits, which form tetramers around a central pore, are primarily responsible for these
functions. As might be expected from the natural constraints of the hydrophobic bilayer,
sequence conservation is abundant in areas predicted to correspond to the six transmembrane
segments. Similar to depolarization-activated K+ channels, the fourth transmembrane segment
(S4) contains positively charged residues and is likely to sense changes in voltage across the cell
membrane (Figure 5A, ‘a’). In contrast to depolarization-activated K+ channels, however, HCN
channels open instead of close in response to hyperpolarization. Interestingly, HCN channels
possess an additional 4 or 5 charged residues at the N-terminal end of S4 (Figure 5A ‘b’).
Together, in response to changes in voltage, these charges have been shown to move in the same
direction as that in depolarization-activated K+ channels (3). Therefore, it has been suggested that
the movement of the voltage-sensing domain is coupled to channel opening in the opposite way
(40). Throughout the HCN family, there is high sequence identity in this region among the
vertebrate isoforms and high conservation across all sequences. The more diverged sequences
from the sea urchin and urochordate species, spHCN2, ciona HCNa and ciona HCNc, which are
predicted to be the products of lineage-specific duplication, possess only three of the upstream
Page 21 of 58
21
positive charges. The effect of this loss of charge awaits functional characterization, but could be
due to relaxed constraints in these duplicate genes.
Based on their similarity to K+ channels (77), we predict that the HCN channels are in their
lowest energy state when closed and that the voltage-sensors work to open the gate upon
hyperpolarization of the membrane potential. The S4-S5 linker couples the sensing of membrane
voltage by S4 to the opening of the channel gate, which is located at the inner portion of the S6
segment. Channel opening is highly sensitive to mutations in the S4-S5 segment. Within this
region (Figure 5A ‘c’), a dipeptide of a tyrosine followed by an acidic residue is conserved
across the vertebrate and urochordate channels, but is not present in the invertebrates. Mutation
of the tyrosine residue to polar or acidic amino acids in the mammalian HCN2 isoform disrupts
channel closure (8, 38). Similar results were shown for the nearby arginine residue located at the
end of this segment which is conserved throughout the HCN family. These experiments confirm
that this region is critical for the regulation of channel opening by changes in voltage. The lack
of complete conservation within this region therefore suggests that the strength of coupling may
be variable among the different HCN genes. For example, spHCN1 channels exhibit an
inactivation process that is due to a desensitization of the opening of the intracellular gate to
voltage, which may reflect a weaker coupling of the voltage sensor and gate (55).
Based on sequence homology to the crystal structure of a bacterial K+ channel (14), the region
between S5 and the end of S6 is believed to form the ion conduction pore in HCN channels. It
contains a pore helix and selectivity filter and is involved in both ion selectivity and transport. In
K+ channels, the selectivity filter exhibits the K+ signature sequence (GY/FG), which is thought
Page 22 of 58
22
to confer their K+ selective nature (14). In the HCN family, this tripeptide has been conserved
but the channels allow the passage of significant amounts of both Na+ and K+. In all but two
HCN genes, the sequence motif that corresponds to the selectivity filter is CIGYG (5) (Figure 5b
‘a’). The conservation of this sequence only differs from K+ selective channels at the cysteine
residue, implicating this site’s involvement in the reduced K+ permeability. The two exceptions
are again found in gene duplicates from sea urchin and urochordate species. spHCN2 has a filter
sequence of SIGFG (16), which makes it more similar to the filter sequences of channels in the
EAG family. Functional data does not exist for the spHCN2 channel, so whether this difference
affects ion selectivity is not known. In ciona HCNc genes, the selectivity filter motif is CIGYS.
In mammalian HCN channels, substitution of the second glycine residue by serine (G404S in
HCN2), reduced the slow activating current (39). Evidence of channel function is required to
determine if this residue difference in the urochordate channel is involved in gene silencing. If
ciona HCNc is functional, a different mutational tolerance at this particular site seems likely and
may reflect an adaptive process that has enabled these channels to fill a different functional niche
specific to these species.
Recently, an N-linked glycosylation site (NXT/S) located in the outer turret between the end of
S5 and the pore helix (Figure 5b ‘b’) has been shown to play a role in membrane expression of
mammalian isoforms (47). Similar to residues in the S4-S5 linker, this motif emerges with the
urochordate sequences and is conserved in two of the three urochordate genes and throughout all
vertebrate sequences. To understand the necessity of this motif and the role it plays in channel
function throughout evolution, further studies are needed to determine if glycosylation does
occur in these ciona channels and if it, or some other compensatory post-translational
Page 23 of 58
23
modification that allows the channel to mature through the ER/golgi process, occurs elsewhere in
the invertebrate sequences.
Based on the K+ channel crystal structure (14), the S6 segments of HCN likely form the inner
vestibule of the pore and the gate. They show high sequence conservation with other protein
relatives, such as ERG and CNG, and are almost completely conserved throughout the HCN
family. Not surprisingly, the S6 segments of ciona HCNa and HCNc and spHCN2, which are
predicted to be the result of lineage-specific duplication, are the exceptions. Two glycine
residues (Figure 5b ‘c’ and ‘d’) are completely conserved among the HCN genes. Based on
sequence homology with voltage-gated K+ channels (62), it has been suggested that one of these
may form a glycine hinge involved in the opening of the channel gate in response to the
movement of the S4-S5 linker. However, more recent experimental and homology modeling
evidence (19, 55, 56) has shown that a threonine residue, positioned after the glycines in the
linear sequence and on the intracellular side of the channel, is only accessible in the open state.
Furthermore, a glutamine residue at the end of S6 is accessible in both the open and closed
positions, suggesting that the putative hinge position is located between these two residues
(Figure 5b ‘e’).
The evolution of the cyclic nucleotide binding and modulatory domains
Cyclic nucleotides are known to bind directly to the channel (12) in the CNBD on the
intracellular C-terminus and modify channel opening. The CNBD is an evolutionarily conserved
domain that is found in several cyclic nucleotide binding proteins, including the bacterial
catabolic activating peptide (CAP) and the protein kinase A (PKA) family (4). The crystal
Page 24 of 58
24
structure of the C-terminus of mouse HCN2 has been solved (78). Despite a low overall
sequence conservation, the tertiary structure of its CNBD is highly conserved with the crystal
structures of other CNBDs (4). Furthermore, residues identified as being critical to structure and
nucleotide binding are conserved throughout the HCN family (Figure 6). This includes the
phosphate binding cassette (PBC), the most conserved feature of the CNBD which makes direct
contact with cAMP (13), and the hinge region, important for the capping of cAMP by the C-helix
of the CNBD. The middle of the PBC has diverged in one of the sea urchin genes and all of the
urochordate genes, but these residues are also variable in the other related crystal structures,
undermining the importance of their identity to cyclic nucleotide binding. Because the HCN
family possesses these conserved key residues, it seems probable that all of their CNBDs can
stabilize the tertiary structure and bind cyclic nucleotides. One exception to this is seen in
urochordate HCNc genes. These two sequences are missing a key hydrophobic residue in the
PBC (Figure 6 ‘P’) that forms a conserved interaction with cAMP (53). In these genes, this
residue is threonine, a hydrophilic amino acid that could disrupt this cAMP interaction. This
difference is consistent with our hypothesis that the ciona HCNc genes have undergone
diversification following a lineage-specific duplication. Whether this difference also corresponds
to a functional change in cAMP binding is an interesting question that will require functional
studies to confirm.
The effects of cAMP-binding on HCN channel opening are variable among isoforms and depend
on the CNBD as well as the C-linker region which connects it to the transmembrane domain. For
the mammalian HCN1, HCN2 and HCN4 isoforms, the binding of cAMP relieves a tonic
inhibition of the transmembrane domain by the CNBD (75). The extent to which this relief
Page 25 of 58
25
occurs (HCN4 = HCN2 > HCN1) depends in part on the C-linker region which transmits the
effect of cAMP onto the gating machinery (76). In contrast, cAMP binding has no effect on
mouse HCN3 (45) and enhances the inhibition of channel opening of human HCN3 (67).
Whether the sequence differences within the C-linker are responsible for these differences is not
known. Further complicating the issue is that the effects of cAMP also depend upon regions in
the transmembrane domains (76) which may lead to more complex interactions with the
cytoplasmic portions of the gating elements. The complexity of these interactions is especially
apparent in the spHCN1 channel in which channel inactivation/desensitization after first opening
is eliminated by cAMP binding (64). C-linker sequences show considerable divergence
throughout the HCN family, including differences among mammalian paralogs, and even
between mammalian and non-mammalian orthologs. This divergence is consistent with the
variability in cAMP-mediated effects on channel opening.
Overall, the role of the C-linker in HCN channels is unusual compared to other regions in the
channel. Its sequence and function show variation throughout the family, but it connects two
domains that are themselves highly conserved in sequence throughout evolution and carry out
highly conserved, but distinct, functions in a cooperative manner (transmembrane = voltagesensing, channel opening and ion permeation, CNBD = cAMP binding and modulation of
channel open by the transmembrane domain). Because of its position between these two
domains, the sequence variability in the C-linker may be, in part, the result of an adaptive
process that has enabled the diversification of cyclic nucleotide binding and the modulation of
channel opening by the CNBD within the HCN family.
Page 26 of 58
26
Sequence variability and functional divergence of vertebrate HCN paralogs
From our data, we predict that the four vertebrate isoforms are paralogs of each other resulting
from three duplication events which occurred before the divergence of the teleost and tetrapod
lineages. At the time of their origin, the gene pairs produced by these events would have been
functionally redundant. Because the vast majority of duplicate genes are silenced throughout the
course of evolution (37), the retention of all four is probably due to the acquisition of unique
functional characteristics and/or expression patterns which result from tolerated mutations
specific to each paralog (65). Throughout the core region, there is high sequence identity among
the four vertebrate HCN isoforms. Some of these invariant residues probably contribute to
functional properties common among all vertebrate channels. In contrast, some of the residues
that vary among paralogs within this conserved region probably contribute to the more subtle
isoform-specific differences in function (31). The search has begun to identify which residues
underlie differences in function among the four mammalian isoforms but approaches have been
limited to direct sequence comparisons followed by single site mutagenesis and/or construction
of chimeras, and functional assays. Using these approaches, three studies have identified residues
or regions responsible for phenotypic variation among the four mammalian HCN channels. First,
differences in the rates of channel opening and cyclic nucleotide modulation between the HCN2
and HCN4 isoforms were localized to a variant residue in the S1 segment (68). Second, a single
residue difference in the inner selectivity filter was shown to confer chloride sensitivity to the
HCN2 channel (74) (see Figure 5b). Lastly, differences in the C-linker were shown to contribute
to differences in cAMP efficacy between HCN1 and HCN2, although the specific residues
involved were not identified (76). From these few examples, it is clear that the functional
consequences of residue variation among the four vertebrate isoforms, which may encompass not
Page 27 of 58
27
only overt differences in channel opening and closing but also differences in permeation, cyclic
nucleotide affinity, homo- and hetero-tetrameric assembly, glycosylation status, protein-channel
interactions and abilities to traffick to the cell surface, cannot be easily predicted simply by its
location within the channel. Furthermore, differences in function may involve multiple residues
and domains located throughout the channel. Therefore, site-directed mutagenesis and/or the
construction of chimeras between vertebrate isoforms may not be sufficient to determine all
differences, especially in the absence of high resolution three dimensional structures for each of
the four paralogs.
In this study, we expanded the list of sequences for each of the four HCN paralogs by the
addition of sequences from lower vertebrates. By broadening the evolutionary representation of
the four vertebrate isoforms, the sequence signal-to-noise ratio is improved and the identification
of residues that are conserved among orthologs, but differ among paralogs, is enhanced.
However, genes from lower vertebrates (eg. fish) and mammals have continued to evolve under
different selective pressures, since their divergence from a common ancestor, several million
years ago. Therefore, information derived from this expanded list of sequences is complex. We
identified three general groups of residues based on similarities among vertebrate orthologs.
First, sites may be conserved among orthologs and could contribute to a function specific to each
vertebrate isoform. Second, sites may be conserved in the mammalian and fish orthologs but
differ between these two groups. These sites may contribute to functional differences between
the mammalian and fish channels of a particular isoform. Finally, sites may be conserved in
orthologs of mammals or fish, but not both. These sites could be involved in species-specific
functions or paralog-specific functions within each species. Alternatively, they may be the result
Page 28 of 58
28
of relaxed evolutionary constraints. By analyzing these different sets of conserved sites together
with functional characterization of the various channels belonging to each of the vertebrate
isoforms, sites that may contribute to paralog-specific differences in channel function can be
more easily identified.
An informative sub-set of sites identified when comparing HCN genes from lower vertebrates
and mammals are those that are conserved with a paralog rather than its own ortholog. If we
assume that the probability of identical sites mutating to the same residues independently
following duplication and species divergence is low, then these probably represent sites that have
been retained from ancestral genes. Differential retention of sites from ancestral genes among
orthologs suggests that channel phenotypes may not be completely conserved among them. In
conjunction with the identification of conserved and non-conserved sites, the functional analysis
of HCN channels from lower vertebrates, and comparisons of their properties with those of
mammalian channels, will help to identify residues that contribute to different phenotypes, and
will also have the potential to shed light on the sequences and functions of ancestral HCN
channels.
Vertebrate isoform-specific alignments of sequences spanning 450MY, reveal conserved motifs
in the N- and C-termini
The large number of vertebrate HCN sequences assembled in this study has greatly increased the
power to identify isoform-specific motifs in the N- and C-termini that may contribute to unique
functions and specific patterns of expression within cells and tissues. The high sequence
conservation among the four vertebrate isoforms extends beyond the core region analyzed above
Page 29 of 58
29
and includes regions just upstream of the S1 segment and downstream of the CNBD. The more
distal N- and C-termini, however, vary in both length and sequence among paralogs and do not
align well. The sequences of the mammalian orthologs identified prior to this study were too
close in evolutionary time to reveal sequence divergence in the distal N- and C-termini. On the
other hand, the four paralogs from a single species were too diverged to align reliably. By
expanding the taxa sampling to include species from different time periods of vertebrate
evolution, the signal-to-noise ratio that is inherent in the sequence information was considerably
improved (2). Alignments of the distal termini, using this expanded list of vertebrate sequences,
revealed several isoform-specific blocks of conserved sequence interspersed with diverged
regions of variable length (Figure 7).
In the distal N- and C- termini, multiple regions specific to each of the four vertebrate paralogs
were identified and were seen to be interspersed with regions of variable length and sequence.
The conservation of these motifs throughout 450MY of evolution highlights a selective
constraint placed upon them and suggests that they confer isoform-specific properties. These
properties remain to be identified but there is some evidence to support a role for these motifs in
the regulation of cell surface expression. For example, a single block of five residues is
conserved in the N-terminus of HCN2 in fish, frog, chicken and mammals (Figure 7a ‘a’). This
motif is not analogous to any known consensus sequence. However, the deletion of the first 130
amino acids of the mouse HCN2 N-terminus, which includes this conserved block of five amino
acids, reduces cell surface expression and has only minor effects on channel opening and closing
(70). A second example is a block of residues found in the C-terminus of HCN1 channels (Figure
7b ‘a’). This region is responsible for binding of filamin A to HCN1, and not the other isoforms
Page 30 of 58
30
(22), and its removal abolishes the effects of filamin on channel cell surface expression. Overall,
the motifs conserved within the termini of an individual isoform are more likely involved in roles
specific to the function or expression of that paralog or may be involved in protein interactions
specific to the tissues where these proteins are expressed. The regions of variable sequence and
length observed between these conserved motifs result from DNA insertions and/or deletions.
This variability may represent either relaxed constraints and/or the beginnings of species-specific
adaptation processes. These adaptations may involve properties such as temperature sensitivity
(41, 42) or binding to regulatory proteins with species-specific sequences.
Finally, both the N and C-termini possess regions of high sequence conservation among the four
vertebrate isoforms, in addition to those of the core region used for the phylogenetic analyses.
These regions probably confer properties that are not unique to any individual isoform, but are
important for all vertebrate HCN channels. In the N-terminus, a region of approximately 50
residues immediately upstream of the start of S1 is conserved in all four vertebrate isoforms
(Figure 7a ‘b’, HCN3 not shown). In mouse HCN2, this region interacts with itself and thus may
be involved in intersubunit interactions of tetrameric assembly (70). Furthermore, the removal of
this region, along with the rest of the N-terminus, prevents the formation of functional channels.
Based on the high level of sequence conservation among the four isoforms, it seems probable
that this region provides similar interactions for HCN1, 3 and 4. If this is true, the few
differences among paralogs within this region may modify intersubunit interactions and thus
regulate homomeric and/or heteromeric assembly.
Page 31 of 58
31
In the C-terminus, a block of residues conserved among the vertebrate HCN1, 2 and 4 isoforms
was identified immediately downstream of the CNBD, which corresponds to the start of the last
exon (Figure 7b ‘b’). Deletion of this region, together with the entire distal C-terminus and the
C-helix of the CNBD, does not affect HCN1, HCN2 or HCN4 channel cell surface expression or
function in heterologous systems (54, 75). A function for this conserved block of residues is not
known but, whatever this function may be, based on the lack of sequence conservation, it is not
likely possessed by HCN3 channels. Finally, a motif found at the distal end of the C-terminus,
SNL, is conserved in HCN1, 2 and 4 (Figure 7b ‘c’). This motif, which qualifies as a consensus
PDZ-binding domain, interacts with PDZ-containing proteins in vitro (28), and also with the
TRP8 protein in vitro and in vivo, where it regulates channel cell surface expression (58). The
absence of this motif in the HCN3 genes suggests that either this isoform does not interact with
PDZ-containing proteins or that these interactions take place at different locations within the
channel.
Summary and Perspectives
The availability of an increasing number of genome sequences has enabled us to generate a list
of putative HCN coding sequences that has doubled the number of those previously known and
covers a significantly greater portion of evolutionary time. With this improved species
representation, we were able to more accurately complete sequence comparisons, phylogenetic
analyses and exon structure comparisons of the HCN gene family and put forward a model of its
molecular evolution. We propose that the vertebrate isoforms evolved from a single ancestral
sequence that had an exon structure similar to current mammalian HCN genes and that the
duplication events occurred after the intron positions were fixed in the linear sequence. We also
Page 32 of 58
32
suggest that HCN duplications have occurred independently in the sea urchin, urochordate and
fish lineages. Increasing the evolutionary distance between the sequences for each HCN isoform
provided a good contrast and enabled us to unmask and identify regions putatively important for
isoform-specific, as well as species-specific, functions. Together, our study provides a strong
basis from which to refine the proposed model of evolution as more genomes become available
and as the functional analysis of HCN genes progresses. Finally, our study provides a valuable
tool to aid in the planning of experiments designed to probe the relationship between structure
and function of HCN channels, and to determine the functional significance of sequence
similarities and differences among them.
Acknowledgements
We would like to acknowledge Dr. Fiona Brinkman from Molecular Biology and Biochemistry
at SFU for her advice, training and feedback on the manuscript as well as the Ouellette
laboratory at the UBC Bioinformatics Centre (http://bioinformatics.ubc.ca), for their resource
advice which has made this work possible.
Current address of C.R. Marshall: The Centre for Applied Genomics and Program in Genetics
and Genomic Biology, The Hospital for Sick Children, Department of Molecular and Medical
Genetics, University of Toronto, Ontario.
Grants
Supported by grants from the Natural Sciences and Engineering Research Council of Canada
(EAA).
Page 33 of 58
33
References
1.
Altschul SF, Gish W, Miller W, Myers EW, and Lipman DJ. Basic local alignment search
tool. J Mol Biol 215: 403-410, 1990.
2.
Anderson PA and Greenberg RM. Phylogeny of ion channels: clues to structure and function.
Comp Biochem Physiol B Biochem Mol Biol 129: 17-28, 2001.
3.
Bell DC, Yao H, Saenger RC, Riley JH, and Siegelbaum SA. Changes in local S4
environment provide a voltage-sensing mechanism for mammalian hyperpolarization-activated HCN
channels. J Gen Physiol 123: 5-19, 2004.
4.
Berman HM, Ten Eyck LF, Goodsell DS, Haste NM, Kornev A, and Taylor SS. The cAMP
binding domain: an ancient signaling module. Proc Natl Acad Sci U S A 102: 45-50, 2005.
5.
Biel M, Schneider A, and Wahl C. Cardiac HCN channels: structure, function, and
modulation. Trends Cardiovasc Med 12: 206-212, 2002.
6.
Birney E, Andrews TD, Bevan P, Caccamo M, Chen Y, Clarke L, Coates G, Cuff J,
Curwen V, Cutts T, Down T, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J,
Hammond M, Hotz HR, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S,
Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae
M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, UretaVidal A, Woodwark KC, Cameron G, Durbin R, Cox A, Hubbard T, and Clamp M. An overview
of Ensembl. Genome Res 14: 925-928, 2004.
7.
Cai X and Zhang Y. Molecular evolution of the ankyrin gene family. Mol Biol Evol 23: 550-
558, 2006.
8.
Chen J, Mitcheson JS, Tristani-Firouzi M, Lin M, and Sanguinetti MC. The S4-S5 linker
couples voltage sensing and activation of pacemaker channels. Proc Natl Acad Sci U S A 98: 1127711282, 2001.
Page 34 of 58
34
9.
Cho WJ, Drescher MJ, Hatfield JS, Bessert DA, Skoff RP, and Drescher DG.
Hyperpolarization-activated, cyclic AMP-gated, HCN1-like cation channel: the primary, full-length
HCN isoform expressed in a saccular hair-cell layer. Neuroscience 118: 525-534, 2003.
10.
Craven KB and Zagotta WN. CNG AND HCN CHANNELS: Two Peas, One Pod. Annu Rev
Physiol 68: 375-401, 2006.
11.
DiFrancesco D. Pacemaker mechanisms in cardiac tissue. Annu Rev Physiol 55: 455-472, 1993.
12.
DiFrancesco D and Tortora P. Direct activation of cardiac pacemaker channels by
intracellular cyclic AMP. Nature 351: 145-147, 1991.
13.
Diller TC, Madhusudan, Xuong NH, and Taylor SS. Molecular basis for regulatory subunit
diversity in cAMP-dependent protein kinase: crystal structure of the type II beta regulatory subunit.
Structure 9: 73-82, 2001.
14.
Doyle DA, Morais Cabral J, Pfuetzner RA, Kuo A, Gulbis JM, Cohen SL, Chait BT, and
MacKinnon R. The structure of the potassium channel: molecular basis of K+ conduction and
selectivity. Science 280: 69-77, 1998.
15.
Felsenstein J. Inferring phylogenies from protein sequences by parsimony, distance, and
likelihood methods. Methods Enzymol 266: 418-427, 1996.
16.
Galindo BE, Neill AT, and Vacquier VD. A new hyperpolarization-activated, cyclic
nucleotide-gated channel from sea urchin sperm flagella. Biochem Biophys Res Commun 334: 96-101,
2005.
17.
Gallin WJ. Evolution of the "classical" cadherin family of cell adhesion molecules in
vertebrates. Mol Biol Evol 15: 1099-1107, 1998.
18.
Gauss R, Seifert R, and Kaupp UB. Molecular identification of a hyperpolarization-activated
channel in sea urchin sperm. Nature 393: 583-587, 1998.
Page 35 of 58
35
19.
Giorgetti A, Carloni P, Mistrik P, and Torre V. A homology model of the pore region of
HCN channels. Biophys J 89: 932-944, 2005.
20.
Gisselmann G, Marx T, Bobkov Y, Wetzel CH, Neuhaus EM, Ache BW, and Hatt H.
Molecular and functional characterization of an I(h)-channel from lobster olfactory receptor neurons.
Eur J Neurosci 21: 1635-1647, 2005.
21.
Gisselmann G, Warnstedt M, Gamerschlag B, Bormann A, Marx T, Neuhaus EM,
Stoertkuhl K, Wetzel CH, and Hatt H. Characterization of recombinant and native Ih-channels from
Apis mellifera. Insect Biochem Mol Biol 33: 1123-1134, 2003.
22.
Gravante B, Barbuti A, Milanesi R, Zappi I, Viscomi C, and DiFrancesco D. Interaction of
the pacemaker channel HCN1 with filamin A. J Biol Chem 279: 43847-43853, 2004.
23.
Hoegg S, Brinkmann H, Taylor JS, and Meyer A. Phylogenetic timing of the fish-specific
genome duplication correlates with the diversification of teleost fish. J Mol Evol 59: 190-203, 2004.
24.
Hubbard T BD, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down
T, Durbin R, Eyras E, Gilbert J, Hammond M, Huminiecki L, Kasprzyk A, Lehvaslaiho H,
Lijnzaad P, Melsopp C, Mongin E, Pettett R, Pocock M, Potter S, Rust A, Schmidt E, Searle S,
Slater G, Smith J, Spooner W, Stabenau A, Stalker J, Stupka E, Ureta-Vidal A, Vastrik I, and
Clamp M. The Ensembl genome database project. Nucleic Acids Res 30: 38-41, 2002.
25.
Hughes AL and Friedman R. 2R or not 2R: testing hypotheses of genome duplication in early
vertebrates. J Struct Funct Genomics 3: 85-93, 2003.
26.
Ishii TM, Takano M, Xie LH, Noma A, and Ohmori H. Molecular characterization of the
hyperpolarization-activated cation channel in rabbit heart sinoatrial node. J Biol Chem 274: 1283512839, 1999.
27.
Jiang Y, Lee A, Chen J, Ruta V, Cadene M, Chait BT, and MacKinnon R. X-ray structure
of a voltage-dependent K+ channel. Nature 423: 33-41, 2003.
Page 36 of 58
36
28.
Kimura K, Kitano J, Nakajima Y, and Nakanishi S. Hyperpolarization-activated, cyclic
nucleotide-gated HCN2 cation channel forms a protein assembly with multiple neuronal scaffold
proteins in distinct modes of protein-protein interaction. Genes Cells 9: 631-640, 2004.
29.
Kozak M. Interpreting cDNA sequences: some insights from studies on translation. Mamm
Genome 7: 563-574, 1996.
30.
Krieger J, Strobel J, Vogl A, Hanke W, and Breer H. Identification of a cyclic nucleotide-
and voltage-activated ion channel from insect antennae. Insect Biochem Mol Biol 29: 255-267, 1999.
31.
Li B and Gallin WJ. Computational identification of residues that modulate voltage sensitivity
of voltage-gated potassium channels. BMC Struct Biol 5: 16, 2005.
32.
Long SB, Campbell EB, and Mackinnon R. Voltage sensor of Kv1.2: structural basis of
electromechanical coupling. Science 309: 903-908, 2005.
33.
Lopreato GF, Lu Y, Southwell A, Atkinson NS, Hillis DM, Wilcox TP, and Zakon HH.
Evolution and divergence of sodium channel genes in vertebrates. Proc Natl Acad Sci U S A 98: 75887592, 2001.
34.
Ludwig A, Budde T, Stieber J, Moosmang S, Wahl C, Holthoff K, Langebartels A, Wotjak
C, Munsch T, Zong X, Feil S, Feil R, Lancel M, Chien KR, Konnerth A, Pape HC, Biel M, and
Hofmann F. Absence epilepsy and sinus dysrhythmia in mice lacking the pacemaker channel HCN2.
Embo J 22: 216-224, 2003.
35.
Ludwig A, Zong X, Jeglitsch M, Hofmann F, and Biel M. A family of hyperpolarization-
activated mammalian cation channels. Nature 393: 587-591, 1998.
36.
Ludwig A, Zong X, Stieber J, Hullin R, Hofmann F, and Biel M. Two pacemaker channels
from human heart with profoundly different activation kinetics. Embo J 18: 2323-2329, 1999.
37.
Lynch M and Conery JS. The evolutionary fate and consequences of duplicate genes. Science
290: 1151-1155, 2000.
Page 37 of 58
37
38.
Macri V and Accili EA. Structural elements of instantaneous and slow gating in
hyperpolarization-activated cyclic nucleotide-gated channels. J Biol Chem 279: 16832-16846, 2004.
39.
Macri V, Proenza C, Agranovich E, Angoli D, and Accili EA. Separable gating mechanisms
in a Mammalian pacemaker channel. J Biol Chem 277: 35939-35946, 2002.
40.
Mannikko R, Elinder F, and Larsson HP. Voltage-sensing mechanism is conserved among
ion channels gated by opposite voltages. Nature 419: 837-841, 2002.
41.
Marshall C, Elias C, Xue XH, Le HD, Omelchenko A, Hryshko LV, and Tibbits GF.
Temperature dependence of cardiac Na+/Ca2+ exchanger. Ann N Y Acad Sci 976: 109-112, 2002.
42.
Marshall CR, Fox JA, Butland SL, Ouellette BF, Brinkman FS, and Tibbits GF.
Phylogeny of Na+/Ca2+ exchanger (NCX) genes from genomic data identifies new gene duplications
and a new family member in fish species. Physiol Genomics 21: 161-173, 2005.
43.
Marx T, Gisselmann G, Stortkuhl KF, Hovemann BT, and Hatt H. Molecular cloning of a
putative voltage- and cyclic nucleotide-gated ion channel present in the antennae and eyes of
Drosophila melanogaster. Invert Neurosci 4: 55-63, 1999.
44.
Milanesi R, Baruscotti M, Gnecchi-Ruscone T, and DiFrancesco D. Familial sinus
bradycardia associated with a mutation in the cardiac pacemaker channel. N Engl J Med 354: 151-157,
2006.
45.
Mistrik P, Mader R, Michalakis S, Weidinger M, Pfeifer A, and Biel M. The murine HCN3
gene encodes a hyperpolarization-activated cation channel with slow kinetics and unique response to
cyclic nucleotides. J Biol Chem 280: 27056-27061, 2005.
46.
Moosmang S, Stieber J, Zong X, Biel M, Hofmann F, and Ludwig A. Cellular expression
and functional characterization of four hyperpolarization-activated pacemaker channels in cardiac and
neuronal tissues. Eur J Biochem 268: 1646-1652, 2001.
Page 38 of 58
38
47.
Much B, Wahl-Schott C, Zong X, Schneider A, Baumann L, Moosmang S, Ludwig A,
and Biel M. Role of subunit heteromerization and N-linked glycosylation in the formation of
functional hyperpolarization-activated cyclic nucleotide-gated channels. J Biol Chem 278: 4378143786, 2003.
48.
Nicholas KB, Jr. NHB, and Deerfield DWI. GeneDoc: Analysis and Visualization of Genetic
Variation. EMBNEW NEWS 4: 14, 1997.
49.
Nolan MF, Malleret G, Lee KH, Gibbs E, Dudman JT, Santoro B, Yin D, Thompson RF,
Siegelbaum SA, Kandel ER, and Morozov A. The hyperpolarization-activated HCN1 channel is
important for motor learning and neuronal integration by cerebellar Purkinje cells. Cell 115: 551-564,
2003.
50.
Okamura Y, Nishino A, Murata Y, Nakajo K, Iwasaki H, Ohtsuka Y, Tanaka-Kunishima
M, Takahashi N, Hara Y, Yoshida T, Nishida M, Okado H, Watari H, Meinertzhagen IA, Satoh
N, Takahashi K, Satou Y, Okada Y, and Mori Y. Comprehensive analysis of the ascidian genome
reveals novel insights into the molecular evolution of ion channel genes. Physiol Genomics 22: 269282, 2005.
51.
Page RD. TreeView: an application to display phylogenetic trees on personal computers.
Comput Appl Biosci 12: 357-358, 1996.
52.
Pape HC. Queer current and pacemaker: the hyperpolarization-activated cation current in
neurons. Annu Rev Physiol 58: 299-327, 1996.
53.
Passner JM, Schultz SC, and Steitz TA. Modeling the cAMP-induced allosteric transition
using the crystal structure of CAP-cAMP at 2.1 A resolution. J Mol Biol 304: 847-859, 2000.
54.
Proenza C, Tran N, Angoli D, Zahynacz K, Balcar P, and Accili EA. Different roles for the
cyclic nucleotide binding domain and amino terminus in assembly and expression of hyperpolarizationactivated, cyclic nucleotide-gated channels. J Biol Chem 277: 29634-29642, 2002.
Page 39 of 58
39
55.
Rothberg BS, Shin KS, Phale PS, and Yellen G. Voltage-controlled gating at the
intracellular entrance to a hyperpolarization-activated cation channel. J Gen Physiol 119: 83-91, 2002.
56.
Rothberg BS, Shin KS, and Yellen G. Movements near the gate of a hyperpolarization-
activated cation channel. J Gen Physiol 122: 501-510, 2003.
57.
Santoro B, Liu DT, Yao H, Bartsch D, Kandel ER, Siegelbaum SA, and Tibbs GR.
Identification of a gene encoding a hyperpolarization-activated pacemaker channel of brain. Cell 93:
717-729, 1998.
58.
Santoro B, Wainger BJ, and Siegelbaum SA. Regulation of HCN channel surface expression
by a novel C-terminal protein-protein interaction. J Neurosci 24: 10750-10762, 2004.
59.
Satou Y, Kawashima T, Shoguchi E, Nakayama A, and Satoh N. An integrated database of
the ascidian, Ciona intestinalis: towards functional genomics. Zoolog Sci 22: 837-843, 2005.
60.
Schulze-Bahr E, Neu A, Friederich P, Kaupp UB, Breithardt G, Pongs O, and Isbrandt D.
Pacemaker channel dysfunction in a patient with sinus node disease. J Clin Invest 111: 1537-1545,
2003.
61.
Seifert R, Scholten A, Gauss R, Mincheva A, Lichter P, and Kaupp UB. Molecular
characterization of a slowly gating human hyperpolarization-activated channel predominantly
expressed in thalamus, heart, and testis. Proc Natl Acad Sci U S A 96: 9391-9396, 1999.
62.
Shealy RT, Murphy AD, Ramarathnam R, Jakobsson E, and Subramaniam S. Sequence-
function analysis of the K+-selective family of ion channels using a comprehensive alignment and the
KcsA channel structure. Biophys J 84: 2929-2942, 2003.
63.
Shi W, Wymore R, Yu H, Wu J, Wymore RT, Pan Z, Robinson RB, Dixon JE, McKinnon
D, and Cohen IS. Distribution and prevalence of hyperpolarization-activated cation channel (HCN)
mRNA expression in cardiac tissues. Circ Res 85: e1-6, 1999.
Page 40 of 58
40
64.
Shin KS, Maertens C, Proenza C, Rothberg BS, and Yellen G. Inactivation in HCN
channels results from reclosure of the activation gate: desensitization to voltage. Neuron 41: 737-744,
2004.
65.
Sidow A. Gen(om)e duplications in the evolution of early vertebrates. Curr Opin Genet Dev 6:
715-722, 1996.
66.
Stieber J, Herrmann S, Feil S, Loster J, Feil R, Biel M, Hofmann F, and Ludwig A. The
hyperpolarization-activated channel HCN4 is required for the generation of pacemaker action
potentials in the embryonic heart. Proc Natl Acad Sci U S A 100: 15235-15240, 2003.
67.
Stieber J, Stockl G, Herrmann S, Hassfurth B, and Hofmann F. Functional expression of
the human HCN3 channel. J Biol Chem 280: 34635-34643, 2005.
68.
Stieber J, Thomer A, Much B, Schneider A, Biel M, and Hofmann F. Molecular basis for
the different activation kinetics of the pacemaker channels HCN2 and HCN4. J Biol Chem 278: 3367233680, 2003.
69.
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, and Higgins DG. The CLUSTAL_X
windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools.
Nucleic Acids Res 25: 4876-4882, 1997.
70.
Tran N, Proenza C, Macri V, Petigara F, Sloan E, Samler S, and Accili EA. A conserved
domain in the NH2 terminus important for assembly and functional expression of pacemaker channels.
J Biol Chem 277: 43588-43592, 2002.
71.
Ueda K, Nakamura K, Hayashi T, Inagaki N, Takahashi M, Arimura T, Morita H,
Higashiuesato Y, Hirano Y, Yasunami M, Takishita S, Yamashina A, Ohe T, Sunamori M,
Hiraoka M, and Kimura A. Functional characterization of a trafficking-defective HCN4 mutation,
D553N, associated with cardiac arrhythmia. J Biol Chem 279: 27194-27198, 2004.
Page 41 of 58
41
72.
Vaccari T, Moroni A, Rocchi M, Gorza L, Bianchi ME, Beltrame M, and DiFrancesco
D. The human gene coding for HCN2, a pacemaker channel of the heart. Biochim Biophys Acta 1446:
419-425, 1999.
73.
Vanin EF. Processed pseudogenes: characteristics and evolution. Annu Rev Genet 19: 253-272,
1985.
74.
Wahl-Schott C, Baumann L, Zong X, and Biel M. An arginine residue in the pore region is a
key determinant of chloride dependence in cardiac pacemaker channels. J Biol Chem 280: 1369413700, 2005.
75.
Wainger BJ, DeGennaro M, Santoro B, Siegelbaum SA, and Tibbs GR. Molecular
mechanism of cAMP modulation of HCN pacemaker channels. Nature 411: 805-810, 2001.
76.
Wang J, Chen S, and Siegelbaum SA. Regulation of hyperpolarization-activated HCN
channel gating and cAMP modulation due to interactions of COOH terminus and core transmembrane
regions. J Gen Physiol 118: 237-250, 2001.
77.
Yifrach O and MacKinnon R. Energetics of pore opening in a voltage-gated K(+) channel.
Cell 111: 231-239, 2002.
78.
Zagotta WN, Olivier NB, Black KD, Young EC, Olson R, and Gouaux E. Structural basis
for modulation and agonist specificity of HCN pacemaker channels. Nature 425: 200-205, 2003.
Page 42 of 58
42
Figure Legends
Figure 1: Schematic representation of the overall topology of HCN channels. Included are the six
putative transmembrane segments (S1-S6), including an S4 voltage sensor (light grey) and the reentrant P-loop region between S5 and S6, all structural characteristics common to the entire voltagegated potassium channel superfamily, including ERG, CNG and HCN channels. HCN and ERG
channels also share the potassium selectivity filter GY/FG. Based on the crystal structure of mHCN2
(78), the C-linker region in HCN channels contains six alpha helices (A’-E’) and connects the
cytoplasmic end of S6 to the start of the cyclic nucleotide binding domain (CNBD). The CNBD
consists of two alpha helices (A and B) that flank an eight-stranded beta-roll followed by a third alpha
helix at its C-terminal end (C-helix). The cyclic nucleotides bind to pocket that is formed by the betaroll and the C-helix. Thick black lines delineate the region that is included in the phylogenetic analyses.
In HCN channels, these lines encompass vertebrate exons 2 through 7. Alignments of HCN vertebrate
isoform-specific N-terminal (exon 1) and C-terminal (exon 8) regions include those residues that fall
outside of this region.
Figure 2: High sequence conservation observed between vertebrate isoforms. Sequence
conservation between mammalian HCN1-4 (M1-M4) and fish HCN1-4 (F1-F4) isoforms indicate that
mammalian HCN3 and fish HCN3 sequences are as similar to each other as they are to the other
isoforms. This suggests that sequence divergence has continued independently in each lineage.
Numbers indicate % sequence identity/sequence conservation generated by GeneDoc for the region
between S1 and end of CNBD, where the total number of sites used was 457. Fore example, 85/94
correlates to 393 of 457 sites identical and 430 of 457 conserved. Human sequences were used as a
representative of mammalian sequences and HCNa_fugu used as an example of fish sequences. ‘B’
sequences tend to show lower conservation.
Page 43 of 58
43
Figure 3: A maximum parsimony (MP) consensus tree of the HCN family. 52 HCN sequences
were included in this analysis. The MP tree was constructed by resampling 100 datasets (bootstrap)
with randomized input order of 10 jumbles using Seqboot, Protpars and Consense programs in the
PHYLIP software. To control for length differences between the different isoforms, only the region
between the start of S1 and the end of the CNBD was used. The tree is rooted with an outgroup of
KAT1, hERG1, CNGA1 and CNGA3, 4 sequences that are known to be related to the HCN family.
Similar to the previous studies, HCN2_urchin is shown to branch before the invertebrate clade and may
be consequence of the MP method’s sensitivity to unequal rates of evolution. The six urochordate
sequences from c. intestinalis and c. savignyi fall according to their evolutionary position, between the
invertebrates and vertebrate species, and are supported by high bootstrap values. Because they group
within their own species, they are not considered orthologous to any of the mammalian isoforms and
are therefore named HCNa, HCNb and HCNc. HCN3 is shown to be the product of the first duplication
event, however, a split between the fish and mammalian sequences is observed but the partition is not
well defined. HCN4 is shown to be the product of the second duplication event, followed by the
emergence of HCN2 and HCN1. Low bootstrap values within isoform clades are most likely a
consequence of the high sequence identity within the region used and in turn, the lack of informative
sites required by the MP method. Duplicate copies of mammalian paralogs in the fish species are
denoted with ‘a’ or ‘b’ in their sequence names. * indicate sequences annotated in this study. Numbers
indicate bootstrap values and represent % support for the respective partitions.
Figure 4: Conservation of exon structure reveals evolutionary patterns of HCN genes. The general
regions of the HCN protein are outlined in the grey box, not drawn to scale. These include: the Nterminal region, six transmembrane segments (1-6), a pore region (P) that forms the pore and
Page 44 of 58
44
selectivity filter in the quaternary structure of the channel, C-linker, CNBD, and the distal Cterminus. Below, the vertical lines represent exon boundary position of the coding sequence: black
lines indicate an exon boundary position that is conserved with the human HCN isoforms, dark grey
lines indicate conservation with invertebrate boundary positions and light grey indicates boundary
position specific to the individual protein. The exon number is shown above the horizontal black line
and amino acid length of the protein corresponding to the exon is shown below. The exon structure of
most vertebrate isoforms (HCN1-4) is conserved with that of human (exception is fish HCN3b genes,
see text for details), consisting of 8 exons with 2 through 7 corresponding to the region between the
start of S1 and the end of the CNBD. The sequence diverges in length between isoforms in the N- and
C-termini. A general HCN_human structure is shown as a representative with length differences of the
four isoforms in exon 1 and 8 indicated by a range. The total lengths of the predicted exons and the
exon boundary structure is variable among the three different c. intestinalis HCN coding sequences
identified in this analysis. HCNb_c.intestinalis is composed of 15 exons, sharing all vertebrate HCN
channels exon boundary positions, with an additional six unique boundaries and one boundary shared
with the invertebrate HCN proteins of the fly and mosquito. HCNa_c.intestinalis is composed of 10
exons but only 4 boundaries are conserved with vertebrates. HCNc_c.intestinalis is predicted to have
14 exons, and shows no similarity in exon structure or boundary position to either of the other ciona
proteins or vertebrate HCN. In general, the mosquito and fly proteins have a higher number of exons
compared to vertebrate HCN, sharing only a few boundary positions.
Figure 5: Sequence comparison of the voltage sensing domain and pore region. A) Sequence of
voltage sensing domain conserved throughout evolution. The fly, human and fugu sequences were
chosen to represent arthropod, mammal and fish species, respectively. Sequences are organized into
invertebrates, urochordates and vertebrate groups. A high sequence identity is notable amongst all
Page 45 of 58
45
vertebrate sequences. Shading indicates four levels of sequence conservation (see methods),
excluding the assembly gap regions. Black stars indicate regions of assembly gaps in genome database
and the arrow indicates the conserved exon boundary between mammalian exon 2 and 3. a = region of
charge common to all potassium channel voltage sensing domains, b = region of charge unique to HCN
channels and c = region associated with normal channel closure. B) A sequence comparison of the
HCN pore region. The same representative sequences as in 5A were used, except HCN4_chicken.
This was omitted due to a gap in the genome assembly across this region. The selectivity filter
sequence CIGYG (a) is conserved throughout the HCN family, with the exception of two duplicate
genes from sea urchin and urochordate species. The NXT/S glycosylation site (b) is located
immediately prior to the pore helix and is conserved in urochordate and vertebrate sequences. Two
conserved glycine residues (c and d) are located in S6, one of which was hypothesized to be the hinge
that obstructs that pore. More recent evidence suggests that the hinge is located further downstream
between the threonine and glutamine residues (e) at the C-terminal end of S6 in which a dipeptide is
conserved throughout the entire HCN family. (f) The residue identified as responsible for the difference
in chloride sensitivity in HCN2 vs. HCN1.
Figure 6: Sequence comparison of the cyclic nucleotide binding domain. A sequence alignment of
the CNBDs from representative sequences of the HCN family indicates that functionally important
residues in the CNBD are conserved throughout evolution. Shading indicates four levels of sequence
conservation (see methods). Evolutionarily conserved glycine residues known to be important for
maintaining the fold of the V-roll found in all CNBDs (located between A and B) and those residues
known to interact with the cyclic nucleotide are indicated by stars below the alignment. The phosphate
binding cassette (PBC) and hinge region common to all CNBDs are identified by the black lines below
Page 46 of 58
46
the sequence. The arrow indicates the hydrophobic site where divergence has occurred in ciona
HCNc genes (as explained in text). Cylinders above the sequence represent helices and arrows = bstrands that together form the B-roll. Cyclic nucleotides are known to bind to the second conserved
arginine in the PBC.
Figure 7: Individual isoform alignments of the distal termini of HCN channels. 7A) Alignment of
the N-terminal region of vertebrate HCN1, 2 and 4 isoforms. Sequence alignments of available
vertebrate sequences for the N-terminal region of three of the four individual isoforms were produced
using ClustalX and edited and shaded using GenDoc. Shading indicates four levels of sequence
conservation (see methods). a = example of conserved motif in vertebrate HCN2 sequences that may
play a role in channel surface expression. b = region conserved amongst all vertebrate HCN sequences
that may be involved in tetrameric interactions in HCN2 and may play a similar role in HCN1, 3 and 4.
Arrows between conserved blocks represent variable regions with the range of amino acid length
variation listed above. Numbers at the start and end of the sequences correspond to their position in the
full length sequences provided in Supplementary Figure 1. 7B) Alignment of the C-terminal region
of vertebrate HCN1, 2 and 4 isoforms. Sequence alignments of all available vertebrate sequences for
the C-terminal region of three of the four individual isoforms were produced using ClustalX and
edited/shaded using GenDoc. Shading indicates four levels of sequence conservation (see methods). a =
example of conserved motif in vertebrate HCN1 sequences has been shown to interact with Filamin A.
b = region conserved amongst all vertebrate HCN sequences downstream of the CNBD. c = tripeptide
sequence shown to be involved in protein-protein interactions with scaffolding proteins in vivo and to
regulate cell surface expression. Dotted line indicates conserved region amongst four vertebrate
isoforms previously identified but of unknown function. Arrows between conserved blocks represent
variable regions with the range of amino acid length variation listed above. Numbers at the start and
Page 47 of 58
47
end of the sequences correspond to their position in the full length sequences provided in
Supplementary Figure 1.
Supplementary Figure 1: FASTA sequences of putative HCN homologs used in analysis
Supplementary Figure 2: Full length alignment of HCN sequences used in analysis. Channel
regions indicated above the alignment. TMD = transmembrane domain, S4 = voltage sensor, SF =
selectivity filter, CNBD = cyclic nucleotide binding domain. Black lines below alignment correspond
to region used in phylogenetic analyses and to the region between the start of S1 and the end of the
CNBD. Numbers correspond to position in sequence used in this study (see FASTA sequence in
Supplementary Figure 1). * indicate gaps in genome assembly. C-terminal region in some fish species
may be broken up into multiple exons but interpreted to be one because downstream portions are in
frame with upstream sequence and high sequence conservation is shared between fish species.
Supplementary Figure 3: Additional phylogenetic trees of HCN family. 3A) Neighbor-Joining
(NJ) phylogram of the HCN family. The NJ tree was constructed with resampling (bootstrap, 1000
datasets) using ClustalX (excluding positions with gaps and corrected for multiple substitutions). Only
the region between the start of S1 and the end of the CNBD was used, to control for length differences
between the different isoforms. The tree is rooted with an outgroup of KAT1, hERG1, CNGA1 and
CNGA3, four sequences that are known to be distantly related to the HCN family. Tree topology
indicates that HCN3 was the first product of duplication events in the vertebrate lineage. There is a split
between mammalian and fish HCN3 clades, which may be expected based on the sequence
conservation shown in Figure 2, but the low bootstrap value indicates that this intra-isoform division
remains unresolved. HCN4 is represented as the product of the second duplication event, followed by
the emergence of HCN2 and HCN1. Similar to the HCN2_urchin, the urochordates, c. intestinalis and
Page 48 of 58
48
c. savignyi, are shown to branch before the invertebrate clade. This position, however, is not
conserved in the MP tree in which the 6 sequences fall according their evolutionary position, between
the invertebrates and vertebrates. Furthermore, the low bootstrap values at the nodal partitions
throughout the invertebrate/urochordate group indicate that the topology of these divisions remains
unclear with this tree building method. This may be a consequence of excluding alignment positions
with gaps and that insertions/deletions need to be included to resolve the evolutionary order of these
primitive sequences. * indicate sequences identified in this study. Numbers indicate bootstrap values
and represent number of trees containing the particular division.
3B) Maximum Likelihood phylogram of the HCN family. Due to the high sequence conservation
and program limitations, the maximum likelihood phylogram was constructed using a subgroup of 41
sequences, resampling of 100 datasets and a randomized input order of 10 jumbles. Seqboot, Proml and
Consense programs from PHYLIP were used and the JTT model of evolution was employed. A single
representative tree similar in topology to consense tree was used to provide indication of branch length
divergence. Numbers indicate bootstrap values for consense tree node divisions and represent %
support for the respective partition. Consistent with the NJ and MP trees generated, HCN2_urchin is at
the base of the phylogram, and is separated by a high bootstrap value. Similar to the NJ tree, ciona
HCNb genes are independent from the HCNa and HCNc and the invertebrates/urochordates divisions
remain undefined. HCN3 is separated from the other three vertebrate isoforms by a high bootstrap
value, whereas the order of the other two duplication events remains unresolved by this method. As ML
tree results are dependent on the evolutionary model imposed, this undefined partition suggests that the
JTT model can not conclusively generate the sequence evolution observed within the HCN family.
Based on branch length information, mammalian HCN4 and HCN2 sequences seem to have evolved
the least from a common ancestral sequence, whereas all fish sequences and those in the HCN1 and
Page 49 of 58
49
HCN3 clades seem to have undergone more evolutionary change. Similar results were suggested by
the sequence identity profiles shown in Figure 2.
Table 1: List of HCN Sequences Used in Analyses
Naming of sequences = HCN isoform followed by the species. () indicate previous names for
sequences. For duplicates, ‘a’ and ‘b’ labels were assigned arbitrarily. * indicate those that were reannotated or identified in this study and † indicates a sequence missing at least one internal exon and
was therefore removed from phylogenetic analyses. Abbreviations: Invt., invertebrate; Mam., mammal;
Vt., lower vertebrate; Uro., urochordate. NA = protein identifier was not available for the sequence
included in final analysis.
Page 50 of 58
1.
++
++
S1 S2 S3 S4 S5
++
++
G
Y
G
S6
Clin
ke
r
P
CN
NH2
cAMP
COOH
Figure 1
BD
Page 51 of 58
2.
M1
F1
M2
F2
M3
F3
M4
F4
93/
97
85/
94
85/
94
80/
92
82/
93
87/
94
84/
94
M1
85/
93
83/
93
79/
92
82/
94
86/
94
82/
93
F1
91/
96
82/
92
84/
94
90/
96
87/
93
M2
80/
91
83/
93
88/
95
85/
93
F2
83/
93
85/
94
82/
93
M3
86/
95
84/
94
F3
91/
96
M4
F4
Figure 2
Page 52 of 58
3.
hCNGA3
hCNGA1
KAT1
hERG1
HCN2_urchin
Outgroup
HCN_urchin
HCN_lobster
100
Invertebrates
HCN_mosquito
100
HCN_silkmoth
46.9
HCN_bee
51.3
HCN_fly
HCNb_c.intestinalis*
100
HCNb_c.savignyi*
83.7
HCNa_c.intestinalis*
100
Urochordates
HCNa_c.savignyi*
96
HCNc_c.intestinalis*
100
HCNc_c.savignyi*
HCN3_green puffer*
100
HCN3a_fugu*
33.3
82.7
HCN3b_fugu*
100
HCN3_zebrafish*
HCN3_opossum*
100
HCN3_mouse
92.5
75.8
HCN3_rat
HCN3_cow
100
37.1
HCN3_human
28
HCN3_dog
HCN4b_fugu*
41.2
HCN4b_green puffer*
93.4
HCN4_zebrafish
44.5
30.6
HCN4a_green puffer*
100
48.1
HCN4a_fugu*
HCN4_opossum*
92.9
HCN4_mouse
66.1
HCN4_rat
41.5
HCN4_rabbit
38
HCN4_dog
41.3
HCN4_human
95
HCN2a_green puffer*
100
HCN2a_fugu*
56.6
HCN2b_green puffer*
72.1
97
HCN2_zebrafish
HCN2_frog*
71.5
HCN2_rat
94.9
HCN2_human
69.7
56.5
HCN2_mouse
HCN1_trout
100
HCN1_fugu
94.5
100
HCN1_green puffer*
HCN1_mouse
98.5
100
HCN1_rat
25.1
HCN1_opossum
26.4
HCN1_chimp
62
HCN1_dog
29.5
HCN1_human
31.1
HCN1_rabbit
98
100
98
Figure 3
HCN3
HCN4
Vertebrates
HCN2
HCN1
Page 53 of 58
4.
1 2 3 4 5 P
N-terminus
1
HCN 1-4_human
(774-1203aa)
2
93-262
141-143
HCNb_c.intestinalis
(804aa)
HCNa_c.intestinalis
(688aa)
HCNc_c.intestinalis
(608aa)
HCN_mosquito
(945aa)
HCN_fly
(945aa)
6 C-linker
CNBD
C-terminus
3
4
5
6
7
8
54
73
49
80
56
225-488
1
2
3
4
5
6
7
8
9
10
11
12
13 14
15
41
45
53
44
54
44
29
49
40
42
56
38
45
197
1
2
3
4
5
6
7
8
9
10
219
61
43
54
75
50
44
42
48
52
1
2
3
4
5
6
7
8
9
10
11
12
13
14
18
29
43
25
49
50
54
43
46
53
27
60
74
37
27
1
2
3
4
5
6
7
8
9
10
11
12
13
14
79
80
55
45
46
50
26
47
57
32
29
64
43
41
5
6
7
8
9
10
11
12
13
50
26
104
32
29
106
44
1
2
3
4
268
55
80
55
45
51
Figure 4
Page 54 of 58
5A.
S4 – voltage sensor
S4-S5 linker
Invertebrate
HCN
Urochordate
HCN
*
Vertebrate
HCN
*
a
b
5B.
c
S6
Pore
Invertebrate
HCN
Urochordate
HCN
Vertebrate
HCN
b
a
Figure 5
f
c
d
e
Page 55 of 58
6.
A
P
* *
*
B
* *** *
PBC
Figure 6
* ***
*
Hinge
C
**
Page 56 of 58
7a.
21-49 AAs
b
a
18-35
AAs
22-29
AAs
7-18
AAs
20-41
AAs
71-110
AAs
Figure 7a
Page 57 of 58
7b.
B
57-113
AAs
a
b
13-14
AAs
10-11
AAs
29-41
AAs
c
21-35
AAs
b
54-103
AAs
c
b
8-14
AAs
Figure 7b
191-560
AAs
66-112
AAs
c
Page 58 of 58
Table 1:List of HCN Sequences Used in Analyses
Name
Organism
HCN_silkmoth (HvIh)
Heliothis virescens
HCN_bee (AmIh)
Apis mellifera
HCN_lobster (PaIh)
Panulirus argus
HCN_fly (DmIh)
Drosophila
HCN_urchin (SpIh/spHCN) Strongylocentrotus purpuratus
HCN2_urchin (spHCN2) Strongylocentrotus purpuratus
HCN1_mouse (mHCN1) Mus musculus
HCN1_trout (tHCN1)
Oncorhynchus mykiss
HCN1_human (hHCN1)
Homo sapiens
HCN1_rabbit (rbHCN1)
Oryctolagus cuniculus
HCN1_dog
Canis familiaris
HCN1_rat (rHCN1)
Rattus norvegicus
HCN1_fugu
Takifugu rubripes
HCN2_human (hHCN2)
Homo sapiens
HCN2_mouse (mHCN2) Mus musculus
HCN2_rat (rHCN2)
Rattus norvegicus
HCN2_zebrafish
Danio Rerio
HCN3_mouse (mHCN3) Mus musculus
HCN3_human (hHCN3)
Homo sapiens
HCN3_rat (rHCN3)
Rattus norvegicus
HCN3_cow
Bos taurus
HCN3_dog
Canis familiaris
HCN4_human (hHCN4)
Homo sapiens
HCN4_mouse (mHCN4) Mus musculus
HCN4_rabbit (rbHCN4)
Oryctolagus cuniculus
HCN4_rat (rHCN4)
Rattus norvegicus
HCN4_dog
Canis familiaris
HCN1_chimpanzee*
Pan troglodytes
HCN4_chimpanzee* †
Pan troglodytes
HCN1_opposum*
Monodelphis domestica
HCN3_opposum*
Monodelphis domestica
HCN4_opposum*
Monodelphis domestica
HCN2_chicken* †
Gallus gallus
HCN4_chicken* †
Gallus gallus
HCN1_frog* †
Xenopus tropicalis
HCN2_frog*
Xenopus tropicalis
HCN4_frog* †
Xenopus tropicalis
HCN1_green_puffer*
Tetraodon nigroviridis
HCN4a_green puffer*
Tetraodon nigroviridis
HCN4b_green_puffer*
Tetraodon nigroviridis
HCN2a_green_puffer*
Tetraodon nigroviridis
HCN2b_green puffer*
Tetraodon nigroviridis
HCN3_green_puffer*
Tetraodon nigroviridis
HCN3_zebrafish*
Danio Rerio
HCN4_zebrafish*
Danio Rerio
HCN2b_fugu*†
Takifugu rubripes
HCN2a_fugu*
Takifugu rubripes
HCN3a_fugu*
Takifugu rubripes
HCN3b_fugu*
Takifugu rubripes
HCN4a_fugu*
Takifugu rubripes
HCN4b_fugu*
Takifugu rubripes
HCN_mosquito*
Anopheles gambiae
HCNa_c.intestinalis*
Ciona intestinalis
HCNb_c.intestinalis*
Ciona intestinalis
HCNc_c.intestinalis*
Ciona intestinalis
HCNa_c.savignyi*
Ciona savignyi
HCNb_c.savignyi*
Ciona savignyi
HCNc_c.savignyi*
Ciona savignyi
Common Name
Taxonomy
silkmoth
Invt.
honey bee
Invt.
spiny lobster
Invt.
fruit fly
Invt.
sea urchin
Invt.
sea urchin
Invt.
mouse
Mam.
rainbow trout
Vt.
human
Mam.
rabbit
Mam.
dog
Mam.
norway rat
Mam.
japanese pufferfish
Vt.
human
Mam.
mouse
Mam.
norway rat
Mam.
zebrafish
Vt.
mouse
Mam.
human
Mam.
norway rat
Mam.
cow
Mam.
dog
Mam.
human
Mam.
mouse
Mam.
rabbit
Mam.
norway rat
Mam.
dog
Mam.
chimpanzee
Mam.
chimpanzee
Mam.
opossum
Mam.
opossum
Mam.
opossum
Mam.
red jungle fowl
Vt.
red jungle fowl
Vt.
frog
Vt.
frog
Vt.
frog
Vt.
spotted green pufferfish Vt.
spotted green pufferfish Vt.
spotted green pufferfish Vt.
spotted green pufferfish Vt.
spotted green pufferfish Vt.
spotted green pufferfish Vt.
zebrafish
Vt.
zebrafish
Vt.
japanese pufferfish
Vt.
japanese pufferfish
Vt.
japanese pufferfish
Vt.
japanese pufferfish
Vt.
japanese pufferfish
Vt.
japanese pufferfish
Vt.
mosquito
Invt.
sea squirt
Uro.
sea squirt
Uro.
sea squirt
Uro.
pacific sea squirt
Uro.
pacific sea squirt
Uro.
pacific sea squirt
Uro.
Identifier
GI:3970750
GI:33355927
GI:33355925
GI:5326833
GI:47551101
GI:74136757
GI:6754168
GI:33312350
GI:32698746
GI:38605639
GI:73954256
GI:29840774
SINFRUP00000160459
GI:21359848
GI:6680189
GI:29840773
GI:68399930
GI:6680191
GI:38327037
GI:29840772
GI:76612489
GI:73961597
GI:4885407
GI:29840776
GI:38605640
GI:29840771
GI:74000965
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
NA
Database
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
Ensembl
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
GenBank
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl
Ensembl