A Newly Classified Vertebrate Calpain Protease

A Newly Classified Vertebrate Calpain Protease, Directly
Ancestral to CAPN1 and 2, Episodically Evolved a Restricted
Physiological Function in Placental Mammals
Daniel J. Macqueen,*,1 Margaret L. Delbridge,2 Sujatha Manthri,1 and Ian A. Johnston1
1 Physiological and Evolutionary Genomics Laboratory, School of Biology, Scottish Oceans Institute, University of St Andrews, St Andrews, Fife, United Kingdom
2 The ARC Centre of Excellence for Kangaroo Genomics, Ecology, Evolution and Genetics, Research School of Biology, The Australian National University, Canberra, ACT, Australia
*Corresponding author: E-mail: [email protected].
Associate editor: David Irwin
Research article
Abstract
The most studied members of the calpain protease superfamily are CAPN1 and 2, which are conserved across vertebrates.
Another similar family member called μ/m-CAPN has been identified in birds alone. Here, we establish that μ/m-CAPN
shares one-to-one orthology with CAPN11, previously described only in eutherians (placental mammals). We use the name
CAPN11 for this family member and identify orthologues across vertebrate lineages, which form a monophyletic
phylogenetic clade directly ancestral to CAPN1 and 2. In lineages branching before therians (live-bearing mammals), the
CAPN11 coding region has evolved under strong purifying selection, with low nonsynonymous (dN) versus synonymous
(dS) substitution rates (dN/dS = 0.076 across pretherians), and its transcripts were detected widely across different tissues.
These characteristics are present in CAPN1 and 2 across vertebrate lineages and indicate that pretherian CAPN11 likewise
has conserved a wide physiological function. However, an ~7-fold elevation in dN/dS is evident along the CAPN11 branch
splitting eutherians from platypus, paralleled by a shift to “testis-specific” gene regulation. Estimates of dN/dS in eutherians
were ~3-fold elevated compared with pretherians, and coding and transcriptional-level evidence suggests that CAPN11 is
functionally absent in marsupials. Many CAPN11 sites are functionally constrained in eutherians to conserve a residue with
radically different biochemical properties to a fixed state shared between pretherian CAPN11 and CAPN1 and 2. Protein
homology modeling demonstrated that many such eutherian-specific residue replacements modify or ablate interactions
with the calpain inhibitor calpastatin that are observed in both pretherian orthologues and CAPN1/2. We propose a model
akin to the Dykhuizen–Hartl effect, where inefficient purifying selection and increased genetic drift associated with
a reduction in effective population size, drove the fixation of mutations in regulatory and coding regions of CAPN11 of
a common marsupial–eutherian ancestor. A subset of these changes had a cumulative adaptive advantage in a eutherian
ancestor because of lineage-specific aspects of sperm physiology, whereas in marsupials, no advantage was realized and the
gene was disabled. This work supports the view that functional divergence among gene family member orthologues is possible in
the absence of widespread positive selection.
Key words: CAPN11 and μ/m-CAPN, episodic gene evolution, functional divergence of gene family orthologues,
transcriptional regulation, functional constraints, Dykhuizen–Hartl effect.
Introduction
The calpain superfamily of Ca2+-dependent cysteine proteases regulate a multitude of physiological processes including apoptosis, membrane fusion, cell motility, and
signal transduction (reviewed by Goll et al. 2003) and
are implicated in several human diseases (Saez et al.
2006). Vertebrates other than teleosts have up to 15 calpains, originating from duplications dating before and
within metazoan and vertebrate lineages (Jékely and
Friedrich 1999). Calpains have been classified by their
domain structure and/or expression patterns (Sorimachi
et al. 1997; Sorimachi and Suzuki 2001; Goll et al. 2003).
The first classification separates the family into “typical”
or “atypical” groups (Goll et al. 2003). Typical calpains, including CAPN-1, -2, -3, -8, -9, -11, -12, -13, and -14, have
a conserved set of functional domains called DI, DIIa, DIIb,
DIII, and DIV (Sorimachi and Suzuki 2001; Goll et al. 2003).
Atypical calpains invariably have DIIa and DIIb, which together form the papain-like protease domain present in all
calpains, some have DIII, which is reminiscent of C2-like
complement domains and some have additional conserved
domains (Sorimachi and Suzuki 2001; Goll et al. 2003). DIV,
a calmodulin-like penta-EF-hand domain, is unique to the
typical calpains (Sorimachi and Suzuki 2001; Goll et al.
2003). The second commonly used classification is based
on messenger RNA (mRNA) transcript levels in different
tissues and splits the family into “ubiquitous” or tissue-specific types (e.g., Sorimachi and Suzuki 2001; Saez
et al. 2006).
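The domain-based classification described above can be expressed as a simple rule. The following is a minimal sketch, not part of the original study; the function name and the example domain lists are illustrative assumptions, and only the domain names themselves come from the text.

```python
# Domain names follow the text; everything else here is a hypothetical helper.
TYPICAL_DOMAINS = ("DI", "DIIa", "DIIb", "DIII", "DIV")
PROTEASE_CORE = ("DIIa", "DIIb")  # papain-like protease domain, present in all calpains


def classify_calpain(domains):
    """Classify a calpain as 'typical' or 'atypical' from its domain names."""
    present = set(domains)
    if not all(d in present for d in PROTEASE_CORE):
        # Without DIIa and DIIb the protein lacks the defining protease core.
        return "not a calpain"
    if all(d in present for d in TYPICAL_DOMAINS):
        return "typical"
    return "atypical"


if __name__ == "__main__":
    print(classify_calpain(["DI", "DIIa", "DIIb", "DIII", "DIV"]))
    print(classify_calpain(["DIIa", "DIIb", "DIII"]))
```

Because DIV is described as unique to the typical calpains, any input lacking DIV falls through to the atypical class in this sketch.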
© The Author 2010. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please
e-mail: [email protected]
Mol. Biol. Evol. 27(8):1886–1902. 2010 doi:10.1093/molbev/msq071
Advance Access publication March 11, 2010
Episodic Evolution of a Newly Classified Calpain Gene · doi:10.1093/molbev/msq071
The majority of research into calpains has focused on
two typical family members, CAPN1 and 2, which are
closely related 80-kDa proteins coded by two respective
genes, CAPN1 and 2. When bound to a common 30-kDa
small subunit (called CAPNS1), they are classified as μ- and m-CAPN, respectively, which are considered the archetypal broad-functioning ubiquitous calpains (Sorimachi
and Suzuki 2001; Goll et al. 2003). Indeed, CAPN1 and 2
genes are each transcribed broadly across tissues in mammals and birds (Sorimachi et al. 1995; Farkas et al. 2003).
Both μ- and m-CAPN cleave hundreds of known substrates
in many cell types while being tightly regulated by a specific
ubiquitous inhibitor called calpastatin (CAST; Goll et al.
2003). Although μ- and m-CAPN share fundamental biochemical properties due to the close evolutionary relationship of CAPN1 and 2 (Jékely and Friedrich 1999; Macqueen
et al. 2010), they also have distinct functions; for example,
they are activated by different levels of cellular Ca2+ (μM
and mM concentrations, respectively; Sorimachi and Suzuki
2001; Goll et al. 2003). Another family member called
CAPN8 is more closely phylogenetically related to CAPN2
than CAPN1 (Jékely and Friedrich 1999), and while requiring
a similar Ca2+ concentration for its activation (Hata et al.
2007), has diverged in several critical functions shared by
CAPN1 and 2; for example, in mammals, its transcripts are
“stomach specific” (Sorimachi and Suzuki 2001) and its protein does not require CAPNS1 for activity (Hata et al. 2007).
Apart from CAPN1 and 2, the only other characterized
ubiquitous typical calpain was identified solely in birds and
called μ/m-CAPN because its sequence similarity and Ca2+
requirement are midway between those of CAPN1 and 2 (Sorimachi et al.
1995). It branched closely to CAPN1 and 2 within a calpain
phylogeny (Jékely and Friedrich 1999), and its mRNA transcripts were similarly broadly expressed across tissues
(Sorimachi et al. 1995; Lee et al. 2007). In common with
CAPN1 and 2, chicken μ/m-CAPN forms a heterodimer
with CAPNS1 and is inhibited by CAST (Wolfe et al.
1989). It is the predominant translated calpain in chicken
erythrocytes (Murakami et al. 1988) and across different
avian tissues (Lee et al. 2007). Furthermore, in chickens,
CAPN1 protein is present at very low concentrations in several tissues and CAPN2 is not translated (Lee et al. 2007).
These results are in marked contrast to the situation in
mammals, where, for example, CAPN2 is developmentally
essential (Dutt et al. 2006). The phylogenetic status of
μ/m-CAPN is unclear, and it has been variously described
as avian specific (Lee et al. 2007) or, based on sequence
identity, as a potential orthologue of CAPN11 (Dear et al.
1999). However, CAPN11 has only been characterized in
eutherians from the Euarchontoglires lineage, where its
gene transcripts are testis restricted (Dear and Boehm
1999; Dear et al. 1999).
Here, we demonstrate using shared synteny and phylogenetic analyses that μ/m-CAPN and eutherian CAPN11
genes share true orthology and are examples of a single vertebrate-wide calpain family member (CAPN11) that may
represent the progenitor sequence to CAPN1 and 2. Our
results suggest that CAPN11 of pretherians is under strict
purifying selection to maintain wide physiological functions across many cell types. However, CAPN11 became
MBE
testis specific in a common eutherian ancestor in parallel
with a striking shift in coding-level constraints that led to
residue replacements affecting physical interactions with
CAST that are conserved in both pretherian CAPN11
and CAPN1/2. This study provides evidence for functional
divergence of calpain orthologues on par with differences
previously observed between paralogous members of the
superfamily and highlights the pitfalls of adopting classification systems for orthologous gene family members based
on expression patterns or functions.
Materials and Methods
Sequences
Sequences used in this study are listed in supplementary
table S1, Supplementary Material online, and were obtained from Ensembl (http://www.ensembl.org) or NCBI
(http://www.ncbi.nlm.nih.gov/) databases. Blast searches
were performed using the NCBI Web server (http://blast.ncbi.nlm.nih.gov/Blast.cgi), specifically with BlastP
versus the nonredundant NCBI protein database or
TBlastN versus the marsupial expressed sequence tag
(EST) database. Regions of Ensembl marsupial genomes
that were predicted to harbor CAPN11 were screened
against the Ensembl trace server (http://trace.ensembl.org/cgi-bin/tracesearch) to confirm that observed nucleotides were not due to sequencing errors (see Results). Sequence alignment of marsupial calpain proteins against
other family members was performed with Mafft v.6 employing the G-INS-I strategy (Katoh and Toh 2008).
Phylogenetic Analyses
Fifty-eight amino acid sequences were used for phylogenetic analyses including CAPN-1, -2, -3, -8, -9, and -11
orthologues (listed in supplementary table S1, Supplementary Material online). The whole polypeptide sequence contains valuable phylogenetic signal because each included
calpain has an identical domain structure. Sequence alignment was performed with Mafft v.6 employing the G-INS-I
strategy (Katoh and Toh 2008). An 890-site output was
manually checked and submitted to Gblocks to eliminate
ambiguous/saturated sites with the most stringent block
selection setting (Castresana 2000). The 282-site output
(alignment A in supplementary fig. S1, Supplementary Material online) was submitted to ProtTest (Abascal et al.
2005), which indicated Le and Gascuel (LG) + G + I (LG
substitution model with estimation of gamma distribution
shape parameter, α, and the proportion of invariable sites)
as the best fitting of 112 examined evolutionary models according to Akaike information criterion (AIC) statistics.
Maximum likelihood (ML) was performed using this model
in PhyML (Guindon and Gascuel 2003) with four substitution rate categories. Bayesian inference (BI) was performed
in MrBayes 3.12 (Ronquist and Huelsenbeck 2003) with the
next best ProtTest model (Jones, Taylor and Thornton [JTT]
+ G + I) because LG is not available in the program. Two
runs were used, each of a single chain of 20,000,000 generations, sampled every 20,000 generations. Convergence was
Macqueen et al. · doi:10.1093/molbev/msq071
assessed by comparing the standard deviation (SD) of split
frequencies between runs. Visual assessment with Tracer
v1.4 (Drummond and Rambaut 2007) also indicated that
a suitable mixing of Markov chains was obtained. The first
5,000 generations were discarded and remaining sample independence was confirmed by a lack of autocorrelation in
tree log-likelihood values assessed with Minitab 13.2 (Minitab Inc., State College, PA). The final 15,000 samples were
used to obtain a consensus tree and posterior probabilities.
Neighbor joining (NJ) and maximum parsimony (MP) trees
were constructed in Mega 4.0 (Tamura et al. 2007) with
1,000 bootstrap replicates. For NJ, the best available ML
model was used (JTT + G, α = 0.945, estimated by PhyML),
and additionally, an exploratory analysis was performed to
establish the effect of among-site rate variation on tree topology, where the JTT model was used with α fixed at either
0.25, 0.5, 0.75, or 5. For MP, the close-neighbor-interchange
algorithm was used.
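The convergence check based on the SD of split frequencies between runs can be illustrated with a short calculation. This is a hedged sketch only: the function name, the dictionary representation of splits, the use of the population SD, and the minimum-frequency cutoff of 0.1 are all assumptions made for illustration, and the exact formula applied by the software used in this study is not stated in the text.

```python
from statistics import pstdev


def average_sd_split_frequencies(run_a, run_b, min_freq=0.1):
    """Average, over splits, the SD of each split's frequency in two runs.

    run_a and run_b map a split label to its sampled frequency (0 to 1).
    Splits below min_freq in both runs are ignored (an assumed cutoff).
    """
    sds = []
    for split in set(run_a) | set(run_b):
        fa = run_a.get(split, 0.0)
        fb = run_b.get(split, 0.0)
        if max(fa, fb) < min_freq:
            continue
        sds.append(pstdev([fa, fb]))  # population SD of the two frequencies
    return sum(sds) / len(sds) if sds else 0.0


if __name__ == "__main__":
    # Hypothetical split frequencies from two independent runs.
    run_1 = {"AB|CD": 1.0, "AC|BD": 0.2}
    run_2 = {"AB|CD": 1.0, "AC|BD": 0.4}
    print(average_sd_split_frequencies(run_1, run_2))
```

Values approaching zero indicate that the two runs sample splits at similar frequencies, which is the sense in which the statistic is used as a convergence diagnostic.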
Selection Analyses
Sequences used for selection analyses are listed in supplementary table S1, Supplementary Material online, and included
orthologues of CAPN11 (21 sequences), CAPN1 (18 sequences), and CAPN2 (14 sequences). Teleost CAPN2 sequences
were excluded because phylogenetic evidence indicates that
they are not direct orthologues of tetrapod CAPN2 (Macqueen et al. 2010). When sequences were obtained from Ensembl databases, only species with 6-fold coverage or greater
were included to avoid sequencing errors. Codon alignments
were constructed separately for each family member by submitting nucleotide sequences (complete or near-complete
coding sequences) to Pal2Nal (Suyama et al. 2006) along with
a manually checked alignment of translated amino acids produced using Mafft v.6 with the G-INS-I strategy (Katoh and
Toh 2008). For CAPN1, 2, and 11, finished codon alignments
of 2148, 2100, and 2223 respective sites (respective alignments
B, C, and D in supplementary fig. S1, Supplementary Material
online) were loaded into HyPhy v.0.99 (Kosakovsky Pond and
Frost 2005a) along with a corresponding ML phylogenetic
tree (provided in supplementary fig. S1, Supplementary Material online) constructed by the approach described above.
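The idea behind building a codon alignment from a protein alignment plus the unaligned coding sequence can be sketched as follows. This is a simplified illustration of the general technique and not the Pal2Nal implementation; the function name and the toy sequences are assumptions, and no check is made that each codon actually translates to the aligned residue.

```python
def thread_codons(aligned_protein, cds):
    """Thread an unaligned coding sequence onto an aligned protein sequence.

    Each residue consumes the next codon; each gap ('-') becomes '---'.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    n_residues = sum(1 for aa in aligned_protein if aa != "-")
    if n_residues > len(codons):
        raise ValueError("coding sequence is shorter than the protein")
    out = []
    k = 0  # index of the next unused codon
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")
        else:
            out.append(codons[k])
            k += 1
    return "".join(out)


if __name__ == "__main__":
    # Toy example: two residues separated by one alignment gap.
    print(thread_codons("M-K", "ATGAAA"))
```

Keeping gaps in multiples of three preserves the reading frame, which is what allows codon substitution models to be fitted to the resulting alignment.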
For each calpain alignment, the HyPhy batch file NucModelCompare was used to establish the best fitting of 203
general time reversible (GTR) models of nucleotide substitution (Kosakovsky Pond et al. 2009). To test if selective
constraints on calpain family members were altered during
mammalian evolution, the HyPhy batch file SelectionLRT
was employed using the best-fitting GTR model crossed
with the MG94 codon model. This approach used likelihood ratio tests (LRTs) to compare the plausibility of five
evolutionary models where dN estimates were either independent or constrained to be equal in different combinations of three specified partitions within the codon data
(Kosakovsky Pond et al. 2009). The specified data partitions
were the eutherian clade, the pretherian clade, and their
separating branch. The models tested were 1, global dN estimate; 2, constrained dN estimate for eutherians and pretherians with independent estimate for the separating
branch; 3, constrained dN estimate for eutherians and
the separating branch with independent estimate for pretherians; 4, constrained dN estimate for pretherians and the
separating branch with independent estimate for eutherians; and 5, independent dN estimates for eutherians, pretherians, and the separating branch. AIC statistics were
used to determine the relative rank of each model and approximate its relative probability (Burnham and Anderson
2002). The HyPhy batch file AnalyzeCodonData was used to
estimate dN and dS for every branch of each specified phylogenetic tree by locally fitting the MG94 codon substitution model crossed with the best-fitting GTR model. Final
branch dN and dS values were taken as the average of values
obtained by nonparametric bootstrapping of this procedure with 100 replicates.
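The ranking of competing models by AIC and the conversion of AIC values into approximate relative probabilities (Akaike weights) follow the standard formulas AIC = 2k - 2 lnL and w_i = exp(-Δ_i/2) / Σ_j exp(-Δ_j/2), where Δ_i is the difference between the AIC of model i and the lowest AIC. A minimal sketch is given below; the function names and the numerical inputs are hypothetical and are not values from this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion for one fitted model."""
    return 2 * n_params - 2 * log_likelihood


def akaike_weights(aic_values):
    """Convert a list of AIC values into Akaike weights that sum to 1."""
    best = min(aic_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


if __name__ == "__main__":
    # Hypothetical log-likelihoods and parameter counts for two models.
    scores = [aic(-1000.0, 10), aic(-995.0, 12)]
    for score, weight in zip(scores, akaike_weights(scores)):
        print(round(score, 2), round(weight, 3))
```

Subtracting the lowest AIC before exponentiating leaves the weights unchanged while avoiding numerical underflow for large AIC values.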
Gene Expression Analyses
The specific expression of CAPN1, 2, and 11 and RPS13
(coding a ribosomal protein) was examined using reverse
transcription–polymerase chain reaction (RT-PCR) with
first-strand complementary DNA (cDNA) templates derived from RNA extracted from a panel of adult tissues
for mouse (Mus musculus), pig (Sus scrofa), tammar wallaby
(Macropus eugenii), platypus (Ornithorhynchus anatinus),
green anole lizard (Anolis carolinensis), zebra finch (Taeniopygia guttata), frog (Xenopus laevis), and zebrafish (Danio
rerio). Forty PCR cycles were used for tammar wallaby
CAPN11 and 35 cycles for all other measured genes. Detailed methods and primer details are provided in supplementary file S1, Supplementary Material online.
DIVERGE 2.0 Rate Tests
Type I and II “functional divergence” was assessed using the
program DIVERGE 2.0 (Gu 2006). The employed alignment
(alignment E in supplementary fig. S1, Supplementary Material online) was constructed as for the main phylogenetic
analysis minus the Gblocks submission to ensure maximum
site inclusion. An ML tree was constructed using the same
approach as for the main phylogenetic analysis and loaded
into DIVERGE 2.0 before the following clades were specified: eutherian CAPN11 (six sequences), sauropsid (bird
and reptile) CAPN11 (four sequences), teleost CAPN11
(six sequences), vertebrate CAPN1 (eight sequences),
and tetrapod CAPN2 (six sequences). It was not possible
to specify other CAPN11 lineages (e.g., amphibians or
monotremes) because the minimum requirement of four
sequences (Gu 2006) was not met. P values were calculated
from z scores using an applet provided by the WEB Interface for Statistics Education (http://wise.cgu.edu/). When
the coefficient of functional divergence θ was significantly
greater than 0, posterior probability values for individual
sites were examined. Sequence logos were created in
WebLogo (Crooks et al. 2004).
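Converting a z score into a P value relies on the standard normal distribution. The sketch below assumes a one-sided test of whether θ exceeds 0 and assumes that z is formed as the estimate divided by its standard error; both choices, the function names, and the example numbers are illustrative assumptions rather than details given in the text.

```python
import math


def one_sided_p_from_z(z):
    """Upper-tail probability of a standard normal variate exceeding z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def z_score(estimate, standard_error):
    """z score of an estimate against a null value of 0."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return estimate / standard_error


if __name__ == "__main__":
    # Hypothetical coefficient and standard error.
    z = z_score(0.30, 0.10)
    print(z, one_sided_p_from_z(z))
```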
Ancestral Sequence Reconstruction for the
Common Eutherian Ancestor
Ancestral sequence reconstruction (ASR) was performed
with Datamonkey (Kosakovsky Pond and Frost 2005b)
using joint and marginal ML as well as sampled reconstruction to obtain CAPN11, CAPNS1, and CAST4 sequences
representing the common eutherian ancestor. For
CAPN11, ASR was performed using amino acid translations
of the codon alignment and the same phylogenetic tree employed for selection analyses. A model selection test indicated that JTT + F (F denotes data-derived frequencies)
was the best fitting of 28 examined evolutionary models
according to AIC statistics. This model was employed with
four substitution rate categories, allowing a gamma distribution of among-site substitution rates. The three methods
of ASR agreed at 98.5% of sites. Type II sites
agreed among all ASR methods, receiving marginal ML
probabilities >0.95 and posterior probability values of
1.0 (sampled reconstruction). Marginal ML reconstructed
sequences were selected because they generally had the
highest support values at nonagreeing positions and were
most realistic considering the alignment data.
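The percentage of sites at which several reconstructions agree can be computed column by column. The following sketch is an assumed illustration of that calculation; the function name and the toy sequences are hypothetical and do not reproduce the reconstructed ancestral sequences.

```python
def percent_agreement(*sequences):
    """Percentage of positions at which all equal-length sequences agree."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be non-empty and of equal length")
    # A column agrees when it contains a single distinct residue.
    agreeing = sum(1 for column in zip(*sequences) if len(set(column)) == 1)
    return 100.0 * agreeing / length


if __name__ == "__main__":
    # Toy reconstructions standing in for joint, marginal, and sampled output.
    print(percent_agreement("ACDE", "ACDE", "ACDF"))
```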
For CAST4 and CAPNS1, ASR was performed as for
CAPN11, using respective alignments representing 13
and 12 full-coding amino acid sequences, for a similar
set of pretherian and eutherian species (sequences are provided in supplementary table S1, Supplementary Material
online, respective alignments F and G in supplementary fig.
S1, Supplementary Material online). Sequences were
aligned and processed as described in the Phylogenetic Analyses section. For CAST4, a 621-site alignment was uploaded
to Datamonkey, which specified a suitable NJ tree and indicated JTT + F as the best-fitting model. We were only
interested in the accuracy of ASR for the CAST4 domain,
which spanned sites 503–586. In this region, all sites
agreed between ASR approaches and were well supported. For CAPNS1, a 174-site alignment was uploaded
to Datamonkey, which specified a suitable NJ tree and indicated JTT as the best-fitting model. All reconstructed
sites agreed between ASR approaches and were well
supported.
Homology Modeling of Eutherian and Pretherian
CAPN11
Protein homology modeling was performed using Protinfo
PPC, which predicts atomic-level structures of heterodimeric proteins with high accuracy (Kittichotirat et al.
2009). The chosen structural template was rat (Rattus norvegicus) CAPN2 bound to CAPNS1 and CAST4 (Hanna
et al. 2008, RCSB protein databank file: 3BOW.pdb). Target
sequences were as follows: Model A: rat CAPN2, CAPNS1,
and CAST4; Model B: rat CAPN1, CAPNS1, and CAST4;
Model C: rat CAPN11, CAPNS1, and CAST4; Model D:
chicken CAPN11, CAPNS1, and CAST4; and Model E:
CAPN11, CAPNS1, and CAST4 of a common eutherian ancestor. We only submitted regions directly alignable with
the sequences in 3BOW.pdb to avoid template-free modeling. For the full-modeling pipeline, the reader is directed
to Kittichotirat et al. (2009). The best energy-minimized
models were selected according to their structure and interface confidence scores (Kittichotirat et al. 2009). The
global quality of each model relative to the experimental
structure was assessed using ERRAT, a program which
can distinguish correctly and incorrectly determined regions of structures according to evidence-based expectations about atomic-level interactions (Colovos and
Yeates 1993). The experimental structure 3BOW.pdb received a global quality score of 89.9, meaning around
90% of residues fall below the program's 95% rejection
limit, which is typical of its 2.4 Å resolution (Colovos
and Yeates 1993). The homology models received near-equivalent ERRAT scores (Model A: 89.3, Model B: 89.2,
Model C: 89.9, Model D: 89.4, and Model E: 88.5), indicating
that the Protinfo PPC modeling did not reduce structural
resolution. Global model quality was then examined with
QMEAN, which provides a score based on a series of
atomic-level structural features, which becomes more negative as inferred model quality improves (Benkert et al.
2008). 3BOW.pdb received a score of 96.68 and the rat
CAPN2 control model 96.51, again suggesting that the
modeling approach did not reduce structural resolution.
Other models received an even lower QMEAN score
(86.2 to 90.29). ProQres was also used, which considers
multiple atomic-level expectations about protein structure
to score the local quality of homology models (Wallner and
Elofsson 2005). A sliding window analysis depicts the local
scoring function on a scale of 0–1 (0 being very unreliable).
According to ProQres, the homology models were of
comparable high local quality with the published calpain
template, with scores falling below 0.5 in only a few short
regions.
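Identifying the short regions in which a per-residue local quality score falls below 0.5 amounts to finding contiguous runs under a threshold. The sketch below is a generic illustration; the function name and the example scores are hypothetical and do not represent ProQres output.

```python
def low_score_regions(scores, threshold=0.5):
    """Return (start, end) index pairs of contiguous runs below threshold.

    Indices are zero-based and inclusive.
    """
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score < threshold:
            if start is None:
                start = i  # a new low-scoring run begins here
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        # The final run extends to the last position.
        regions.append((start, len(scores) - 1))
    return regions


if __name__ == "__main__":
    # Hypothetical local quality scores on a 0 to 1 scale.
    print(low_score_regions([0.8, 0.4, 0.3, 0.9, 0.2]))
```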
The 3D structures of the various models were rendered
and manipulated with Polyview 3D (Porollo and Meller
2007). PDB files for each model including a list of predicted
residue–residue interactions are available on request to
D.J.M.
Results
μ/m-CAPN Coding Genes are Present in Many
Vertebrate Lineages
Sequences with greater identity to chicken μ/m-CAPN
than other calpains were identified in the genomes of teleosts, amphibians, reptiles, birds, and monotreme mammals
(supplementary fig. S2, Supplementary Material online).
For most species, one “μ/m-CAPN–like” sequence was observed, although zebrafish, stickleback (Gasterosteus aculeatus), and the frog X. laevis had two (supplementary
fig. S2, Supplementary Material online). In Ensembl databases, these μ/m-CAPN–like sequences were variably annotated, for example, as CAPN1, CAPN11, CANX, novel,
or by a sequence identifier from another database (e.g., supplementary fig. S2, Supplementary Material online).
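Deciding which known family member a candidate sequence most resembles can be illustrated with pairwise percent identity over aligned sequences. This is a minimal sketch under stated assumptions: the sequences are already aligned to equal length, positions with a gap in either sequence are skipped, and the function names and toy sequences are hypothetical. It is not the Blast-based procedure used in this study.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def best_match(query, references):
    """Name of the reference sequence with the highest identity to query."""
    return max(references, key=lambda name: percent_identity(query, references[name]))


if __name__ == "__main__":
    # Toy aligned sequences with placeholder labels.
    refs = {"family_member_x": "ACDF", "family_member_y": "AAAA"}
    print(best_match("ACDE", refs))
```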
Shared Synteny Exists between μ/m-CAPN and
CAPN11 Coding Genes
We examined chromosomal regions containing genes coding the μ/m-CAPN–like sequences, plus CAPN1, 2, 8, and 11
in a broad range of vertebrates (fig. 1 and supplementary
fig. S3, Supplementary Material online). Eutherian
chromosomal tracts containing CAPN11 clearly share gene
order with those harboring μ/m-CAPN–like coding genes
of pretherian tetrapods and to a lesser extent zebrafish
(fig. 1 and supplementary fig. S3, Supplementary Material
online). This region was distinct from other genomic neighborhoods containing CAPN1, 2, and 8, where shared synteny was evident across vertebrates (supplementary fig.
S3, Supplementary Material online). These results strongly
indicate that eutherian CAPN11 genes are one-to-one orthologues of tetrapod μ/m-CAPN–like genes because there
is no plausible mechanism other than direct inheritance to
account for the exact pattern of conserved gene order observed among tetrapods (fig. 1). Although the opossum
(Monodelphis domestica) genome contained the same syntenic chromosomal neighborhood proximal to CAPN11 as
in other tetrapods, CAPN11 was missing from its expected
location (fig. 1 and supplementary fig. S3, Supplementary
Material online, examined in a following section).
FIG. 1. A comparison of shared synteny in the chromosomal neighborhood of eutherian CAPN11 and pretherian μ/m-CAPN–like genes
provides evidence for a one-to-one orthologous relationship. The question mark indicates the absence of an opossum CAPN11 gene in its
expected position. Syntenic genes are arrows pointing in the direction of sense-strand transcription. Avian-specific genes in this region are
shown as gray or black arrowheads pointing in the direction of sense-strand transcription, respectively showing genes conserved in chicken and
zebra finch or just chicken. An accepted phylogeny for the included taxa is shown to the figure's left.
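The logic of a shared synteny comparison, asking whether two focal genes sit among the same neighboring genes, can be sketched with ordered gene lists. The example below is a simplified illustration only: the function name, the flank size, and the placeholder gene labels are assumptions, and gene orientation and orthology assignment of the neighbors are ignored.

```python
def shared_neighbors(order_a, order_b, focal_a, focal_b, flank=3):
    """Genes found within `flank` positions of both focal genes.

    order_a and order_b are lists of gene labels in chromosomal order;
    neighbors are assumed to carry the same label when orthologous.
    """
    def neighborhood(order, focal):
        i = order.index(focal)
        upstream = order[max(0, i - flank):i]
        downstream = order[i + 1:i + 1 + flank]
        return set(upstream + downstream)

    return neighborhood(order_a, focal_a) & neighborhood(order_b, focal_b)


if __name__ == "__main__":
    # Placeholder gene labels; not real annotations.
    species_a = ["g1", "g2", "focalA", "g3", "g4"]
    species_b = ["g2", "focalB", "g3", "g9"]
    print(sorted(shared_neighbors(species_a, species_b, "focalA", "focalB", flank=2)))
```

A large overlap between the two neighborhoods is the kind of pattern that is difficult to explain other than by inheritance of the region from a common ancestor.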
Phylogenetic Position of μ/m-CAPN–Like Proteins
among the Wider Superfamily
BI and ML phylogenetic analyses employing the best-fitting
available evolutionary models supported the shared synteny analysis because eutherian CAPN11 and pretherian
μ/m-CAPN–like sequences branched within a monophyletic vertebrate clade, with nodes following the expected
species relationships (fig. 2). Therefore, it is probable that
these trees correctly reflect the expected topology of a vertebrate-wide calpain family member. We interpret this in
accordance with human nomenclature, as an expansion of
existing CAPN11 family members and adhere to this naming system onward. The branch splitting platypus and eutherian CAPN11 was highly extended compared with all
other CAPN11 branches and this seemed to affect several
phylogenetic reconstruction methods including ML,
evidenced by a low (<50%) bootstrap value (fig. 2).
Furthermore, MP failed to retrieve a monophyletic CAPN11
clade (fig. 3A), as did NJ (fig. 3B), except when enforcing
a gamma distribution shape parameter allowing for
extreme among-site rate variation (α = 0.25; fig. 3C).
Molecular Evolution of the CAPN11 Clade
Comparing rates of dN and dS is a common way to examine
selective pressures on coding regions (Hughes and Nei
1988; Fay and Wu 2003; Kosakovsky Pond et al. 2009; Pybus
and Shapiro 2009). A frequent assumption made is that
nonsynonymous replacements, by altering protein function, generally affect fitness more than silent replacements.
Although it should not be assumed that silent replacements in coding regions are always selectively neutral,
for example, due to constraints imposed by secondary nucleic acid structure or codon usage preference (Pybus and
Shapiro 2009), it is reasonable to assume that most are
more neutral than those replacements altering an amino
acid. Thus, changes in the dN/dS ratio approximate shifts
in selective constraint and can detect instances of diversifying positive selection (e.g., Hughes and Nei 1988). Commonly, a dN/dS value of 1 is used to imply neutral selection
on a coding region and values lesser or greater than 1 to
respectively indicate purifying and positive selection
(Fay and Wu 2003; Kosakovsky Pond et al. 2009; Pybus
and Shapiro 2009).
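The conventional interpretation of dN/dS described above can be written as a small rule. This sketch is illustrative only: the function names and the optional tolerance are assumptions, and in practice the classification depends on statistical tests rather than on the point estimate alone.

```python
def dn_ds_ratio(dn, ds):
    """Ratio of nonsynonymous to synonymous substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def interpret_dn_ds(omega, tolerance=0.0):
    """Label a dN/dS value using the conventional threshold of 1."""
    if omega < 1 - tolerance:
        return "purifying"
    if omega > 1 + tolerance:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # 0.076 is the pretherian CAPN11 estimate quoted in the abstract.
    print(interpret_dn_ds(0.076))
    # Hypothetical branch-level rates.
    print(interpret_dn_ds(dn_ds_ratio(0.02, 0.2)))
```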
FIG. 2. ML and BI phylogenetic analyses show that μ/m-CAPN–like and CAPN11 amino acid sequences form a monophyletic vertebrate clade
and represent a single calpain family member ancestral to CAPN1 and CAPN2/8. The shown BI topology was highly similar to trees constructed
by ML. Trees were rooted at the CAPN3/9 stem, an outgroup position established previously (Jékely and Friedrich 1999; Macqueen et al. 2010).
All posterior probability values are shown above nodes in the BI tree and supporting ML bootstrap values >50% are shown below nodes or
after BI values (i.e., BI value/ML value). The scale shows the number of substitutions per site along each branch. The chromosomal location of
teleost calpain family member co-orthologues is also shown.
We first examined how selective constraints varied
on the whole coding region of CAPN11 during vertebrate
evolution. By globally fitting an evolutionary model, a
dN/dS estimate of 0.12 was obtained, which was similar,
but slightly higher than for CAPN1 (0.080) and 2 (0.096)
(table 1), providing a crude measure that strong purifying
selection has been maintained across vertebrate lineages.
However, dN for the branch splitting platypus from eutherians was ~2.5-fold higher than any other pretherian branch
(not shown, evident in fig. 2). Thus, we sought to test
whether this was due to a shift in selective constraints. This
was achieved by comparing the likelihood of five
evolutionary models where estimates of dN/dS were either
constrained or allowed to be independent in three partitions within the codon alignments, namely eutherian
branches, pretherian branches and the separating branch
(table 1). By deriving AIC statistics, we obtained an Akaike
weight for each model to approximate the probability that
it was the best fitting of those tested (Burnham and Anderson 2002). For CAPN11, LRTs indicated that Models 2–5
provided a significantly better fit to the data than the
global model (table 1). However, Model 5, where dN/dS
was estimated separately for eutherians, pretherians and
the separating branch, received overwhelming support
as being the most plausible model (table 1, Akaike weight
of 0.998). Respective Model 5 dN/dS estimates were 0.076
and 0.25 for pretherian and eutherian clades and 0.53 for
the separating branch (table 1). Conversely, for both
1891
Macqueen et al. · doi:10.1093/molbev/msq071
MBE
invariably lower, with a mean value of 0.10 and SD of
0.091. For platypus, the closest pretherian relative to eutherians, purifying selection was as strong (dN/dS 5
0.08) as in other pretherian branches. Conversely, dN/dS
estimates for eutherian branches were almost invariably
higher than pretherian branches, but generally lower than
the common therian branch, with a mean value of 0.31 and
SD of 0.17. In support of the compartmentalization analysis, dN/dS values of individual branches in the CAPN1 and
2 phylogenies were similar in pretherians and eutherians,
again suggesting that selective constraints remained stable
during evolution.
The Transcriptional Regulation of CAPN11 across
Vertebrate Evolution
FIG. 3. MP and NJ methods of phylogenetic reconstruction did not
recapture a monophyletic CAPN11 clade in most instances.
Condensed trees are shown with bootstrap confidence values for
important nodes. (A) MP tree. (B) NJ tree using the best-fitting
available ML substitution model (JTT þ G, a 5 0.945 as estimated
by PhyML). (C) NJ tree imposing a gamma shape parameter (a 5
0.25) allowing extreme among-site rate variation.
CAPN1 and 2, dN/dS estimates did not deviate very strongly
from the global estimate in specified partitions for any of
three competing evolutionary models that had between
10% and 60% of being the best-fitting according to their
Akaike weight (table 1). However, it should be mentioned
that for both CAPN1 and 2, a small decrease in dN/dS is
observed in eutherians compared with pretherians in each
competing model (table 1).
The above approach provides a statistical framework
suggesting a striking shift in selective constraints on eutherian CAPN11 after the split from monotremes. However,
this method fixes dN/dS estimates across specified clades,
such that branch-by-branch variation was not considered.
Thus, an evolutionary model was locally fit to estimate
branch-specific dN and dS. With this approach, dN/dS for
the branch splitting platypus from eutherians was 0.54.
Estimates of dN/dS for individual pretherian branches were
Next we examined the tissue-specific mRNA expression of
CAPN1, 2, and 11 from vertebrates spanning all major classes (fig. 4). For species representing eutherians (mouse and
pig), marsupials (tammar wallaby), monotreme mammals
(platypus), reptiles (green anole lizard), birds (zebra finch),
amphibians (African clawed frog), and teleost fish (zebrafish), CAPN1 and 2 were expressed widely across tissues
(fig. 4). Likewise, CAPN11 of all examined pretherian vertebrates, including platypus, green anole lizard, zebra finch,
frog, and zebrafish, was not restricted in its expression
across tissues (fig. 4). Conversely, in mice, pigs, and humans
(Homo sapiens), CAPN11 was only expressed in testis (fig. 4
and Dear and Boehm 1999; Dear et al. 1999). Previous work
showed that CAPN11 transcripts accumulated specifically
in spermatozoa of adult mice (Dear and Boehm 1999)
and that its protein localized to the acrosomal cap
(Ben-Aharon et al. 2006). A parsimonious explanation
for these results is that changes in eutherian CAPN11 regulation predate the split of Laurasiatheria (pig) and Euarchontoglires (mouse/human) lineages, which occurred some
95–113 million years ago, a period spanning the base of
eutherian evolution (Benton and Donoghue 2007).
A 176-bp exonic fragment of a putative marsupial
CAPN11 orthologue in tammar wallaby (discussed further
in a following section) was confirmed as present in the genome, but we could not detect a transcribed product in
eight different tissues after 40 RT-PCR cycles (fig. 4). This
suggests that CAPN11 has been transcriptionally disabled in
this animal. Unfortunately, opossum samples were not
available to confirm if CAPN11 transcription is disabled
more widely in marsupials.
Divergence in Functional Constraints between
CAPN11 and Other Family Members
The established notion that functional importance and
evolutionary conservation are inherently linked (Kimura
1983) is a central tenet of statistical methods formulated
to identify sites in related proteins with distinct functional
constraints (e.g., Gu 1999, 2006; Gribaldo et al. 2003). Such
sites are thought to underlie functional specificities of distinct phylogenetic clades and two types have been defined.
Type I sites (Gu 1999), otherwise known as heterotachous
Episodic Evolution of a Newly Classified Calpain Gene · doi:10.1093/molbev/msq071
Table 1. Details of HyPhy Analysis Used to Compare the Plausibility of Five Models Estimating Selective Pressures in Different Phylogenetic Partitions of Calpain Family Member Codon Alignments.

| Gene | Model | LogL | P (a) | AICi (b) | Δi (c) | Relative likelihood(i) (d) | Akaike weight (wi) (e) | Data partition and dN/dS estimate |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CAPN1 | 1 | −17,757.14 | NA | 35,592.81 | 8.36 | 0.015 | 0.0092 | Global = 0.0802 (0.0758–0.0848) |
| CAPN1 | 2 | −17,754.94 | 0.11 | 35,592.28 | 7.83 | 0.020 | 0.012 | Eutherians and pretherians = 0.0816 (0.0770–0.117) versus separating branch = 0.0572 (0.0436–0.0732) |
| CAPN1 | 3 | −17,753.22 | 0.0012 | 35,584.45 | 0 | 1.00 | 0.60 | Eutherians and separating branch = 0.0646 (0.0567–0.0733) versus pretherians = 0.0881 (0.0827–0.0938) |
| CAPN1 | 4 | −17,754.95 | 0.0085 | 35,587.89 | 3.44 | 0.18 | 0.11 | Pretherians and separating branch = 0.0858 (0.0807–0.0911) versus eutherians = 0.0659 (0.0566–0.0762) |
| CAPN1 | 5 | −17,753.02 | 0.0045 | 35,586.05 | 1.60 | 0.45 | 0.27 | Eutherians = 0.0660 (0.0567–0.0763) versus pretherians = 0.0882 (0.0828–0.0938) versus separating branch = 0.0571 (0.0435–0.0730) |
| CAPN2 | 1 | −13,423.69 | NA | 26,913.38 | 16.06 | 0.00033 | 0.00017 | Global = 0.0961 (0.0894–0.1031) |
| CAPN2 | 2 | −13,422.54 | 0.13 | 26,913.08 | 15.76 | 0.00038 | 0.00020 | Eutherians and pretherians = 0.0945 (0.0878–0.102) versus separating branch = 0.200 (0.120–0.308) |
| CAPN2 | 3 | −13,416.08 | 9.56 × 10^−5 | 26,900.16 | 2.84 | 0.24 | 0.12 | Eutherians and separating branch = 0.0774 (0.0683–0.0873) versus pretherians = 0.113 (0.104–0.123) |
| CAPN2 | 4 | −13,414.66 | 2.14 × 10^−5 | 26,897.32 | 0 | 1.00 | 0.52 | Pretherians and separating branch = 0.114 (0.104–0.124) versus eutherians = 0.0749 (0.0657–0.0849) |
| CAPN2 | 5 | −13,414.02 | 6.32 × 10^−5 | 26,898.05 | 0.73 | 0.70 | 0.36 | Eutherians = 0.0743 (0.0652–0.0843) versus pretherians = 0.112 (0.102–0.122) versus separating branch = 0.198 (0.119–0.304) |
| CAPN11 | 1 | −21,272.13 | NA | 42,632.26 | 299.62 | 8.67 × 10^−66 | 8.65 × 10^−66 | Global = 0.118 (0.112–0.124) |
| CAPN11 | 2 | −21,238.72 | 3.33 × 10^−16 | 42,567.43 | 234.79 | 1.037 × 10^−51 | 1.035 × 10^−51 | Eutherians and pretherians = 0.111 (0.106–0.117) versus separating branch = 0.486 (0.426–0.552) |
| CAPN11 | 3 | −21,127.51 | 0 | 42,345.03 | 12.39 | 0.0020 | 0.0020 | Eutherians and separating branch = 0.273 (0.254–0.293) versus pretherians = 0.0769 (0.0718–0.0821) |
| CAPN11 | 4 | −21,167.94 | 0 | 42,425.89 | 93.25 | 5.63 × 10^−21 | 5.63 × 10^−21 | Pretherians and separating branch = 0.082 (0.077–0.087) versus eutherians = 0.253 (0.232–0.275) |
| CAPN11 | 5 | −21,120.32 | 0 | 42,332.64 | 0 | 1.00 | 0.998 | Eutherians = 0.252 (0.231–0.273) versus pretherians = 0.0761 (0.0711–0.0813) versus separating branch = 0.534 (0.468–0.606) |

NOTE. The values given in parentheses are 95% confidence intervals.
(a) P value established by LRT indicating whether the given model provides a significant improvement of fit to the data compared with the global model.
(b) AICi is the calculated AIC value for given model (i).
(c) Δi = AICi − minAIC (the model with the lowest AIC) (after Burnham and Anderson 2002).
(d) Relative likelihood(i) = exp(−1/2 Δi) (after Anderson and Burnham 2002).
(e) wi = relative likelihood(i)/sum of relative likelihood(i) of all tested models (after Anderson and Burnham 2002).
sites (Gribaldo et al. 2003), are those fixed in one clade, but
variable in another, whereas type II sites (Gu 2006) are functionally constrained in both clades but fixed as residues
with radically different biochemical properties. We employed DIVERGE 2.0 (Gu 2006) to explore the hypothesis
that shifts in functional constraints occurred in CAPN11
following the split of eutherians from pretherians. Clades
representing CAPN11 orthologues of eutherians, sauropsids, and teleosts were compared with each other and with
clades for their paralogues, CAPN1 and 2.
The coefficient of functional divergence (θ) is a statistical
measure of the strength of divergence in functional constraints between compared clades ranging from a value
of 0 to 1 (Gu 1999, 2006). Rejection of the null hypothesis
that its value is equal to 0 indicates that a shift in functional
constraints is present at some sites, which is a proxy for
differences in protein function (Gu 1999, 2006). As θ increases, so does the number of sites where constraints have
changed between compared clades, making functional
divergence more likely. Significant type I divergence was
evident across all compared calpain clades (table 2). The
type I θ values observed between eutherian CAPN11 in
comparisons with orthologous (i.e., teleost or sauropsid
CAPN11) or paralogous (i.e., CAPN1 and 2) clades were
FIG. 4. The figure shows the change in constraints on tissue-specific
transcriptional regulation of CAPN11 but not CAPN1 or 2 during
mammalian evolution. Shown are results for (A) mouse, (B) pig, (C)
tammar wallaby, (D) platypus, (E) green anole lizard, (F) zebra finch,
(G) frog, and (H) zebrafish. Abbreviations are B, brain; SKM, skeletal
muscle; H, heart; SKI, skin; SP, spleen; LI, liver; T, testis; OV, ovary; K,
kidney; LU, lung; -RTC, -reverse transcriptase control; NTC, no-template control; and gDNA, genomic DNA. The gDNA band for
wallaby CAPN1 was expected because the primers used did not span
an exon boundary.
approximately twice the value of comparisons excluding
eutherians (table 2). Therefore, type I sites seem to be more
associated with eutherian than pretherian CAPN11. However, it was previously shown by Gribaldo et al. (2003) that
type I sites were equally present in paralogous and orthologous subgroups of α and β hemoglobin subunits. These
authors suggested that type I sites underlie common processes related to the evolution of homologous protein structures rather than being signatures for functional change
(Gribaldo et al. 2003). In support of this notion, type I positions made up 95% of site variation among ~2,000 orthologues of cytochrome b, a mitochondrial protein unlikely to
have diverged functionally (Lopez et al. 2002). Here, type I θ values
were significantly positively correlated with the mean genetic
distance between the compared clades (Spearman’s R =
0.74, P = 0.02, not shown). Thus, the increased type I θ value
in eutherian CAPN11 may mainly reflect the increased dN in
the lineage’s stem. For these reasons, we deemed it would be
difficult to distinguish type I sites that were candidates for
clade-specific functions of CAPN11 from those representing
background evolution.
We also observed significant type II θ values between
eutherian CAPN11 and orthologous teleost or sauropsid
clades (table 2). However, there was no evidence for
type II divergence between teleost and sauropsid CAPN11
(table 2). Significant type II θ values were also observed
comparing vertebrate CAPN1 and eutherian CAPN11
but not comparing CAPN1 and teleost or sauropsid CAPN11
(table 2). Furthermore, although type II divergence was observed between CAPN2 and each CAPN11 clade, θ was
around twice as large in the comparison with eutherian
CAPN11 (table 2). Type II divergence was absent between
CAPN1 and 2 (table 2).
To gain insight into the type II residue replacements in
eutherian CAPN11, sites were examined with the highest
possible Bayesian posterior ratio score (PRS), which mainly
represented positions completely fixed as one residue in
eutherians and a radically different state in the compared
clade. Such type II sites were observed in 29, 34, 16, and 26
of 646 sites in respective comparisons of eutherian
CAPN11 with teleost CAPN11 and sauropsid CAPN11,
CAPN1, and CAPN2. Many of these type II sites were identified in all these comparisons and were generally identical
or conserved in side chain biochemical property in pretherian CAPN11 and CAPN1/2 (fig. 5). Some type II sites receiving the highest PRS were identified in comparisons of
eutherian CAPN11 with pretherian CAPN11 and with either
CAPN1 or CAPN2, but not both (fig. 5). For a broader comparison, we included in figure 5 the residues conserved at the
equivalent sites in CAPN3 and 8, which are phylogenetically closely related to CAPN11, 1, and 2 (Jékely and
Friedrich 1999; Macqueen et al. 2010) but have some fundamentally distinct functions. For example, CAPN3 is
‘‘muscle specific’’ in mammals and birds (Sorimachi
et al. 1995; Sorimachi and Suzuki 2001), does not bind
CAPNS1, and is not inhibited by CAST (Sorimachi et al.
1997). At many type II sites, the residue fixed in CAPN3
and/or 8 is also fixed or at least conserved in biochemical
property with the equivalent residue in pretherian
CAPN11 and/or CAPN1 and/or 2 but not with eutherian
CAPN11 (fig. 5). This suggests that many radical replacements at type II sites of eutherian CAPN11 are unique to
this section of the calpain family and cannot be related to
functional specificities of CAPN3/8 distinguishing them
from CAPN1/2.
Table 2. Details of DIVERGE 2.0 Analysis of Functional Divergence.

| Comparison | θ | θSE | z score | P |
| --- | --- | --- | --- | --- |
| Type I | | | | |
| CAPN11 eutherians versus sauropsids | 0.63 | 0.098 | 6.50 | <0.0001 |
| CAPN11 eutherians versus teleosts | 0.58 | 0.067 | 8.62 | <0.0001 |
| CAPN11 sauropsids versus teleosts | 0.35 | 0.07 | 4.89 | <0.0001 |
| CAPN1 versus CAPN11 eutherians | 0.50 | 0.054 | 9.32 | <0.0001 |
| CAPN1 versus CAPN11 sauropsids | 0.35 | 0.072 | 4.89 | <0.0001 |
| CAPN1 versus CAPN11 teleosts | 0.28 | 0.079 | 3.59 | <0.001 |
| CAPN2 versus CAPN11 eutherians | 0.55 | 0.073 | 7.50 | <0.0001 |
| CAPN2 versus CAPN11 sauropsids | 0.28 | 0.043 | 3.18 | <0.001 |
| CAPN2 versus CAPN11 teleosts | 0.36 | 0.063 | 5.72 | <0.0001 |
| CAPN1 versus CAPN2 | 0.197 | 0.048 | 4.11 | <0.0001 |
| Type II | | | | |
| CAPN11 eutherians versus sauropsids | 0.15 | 0.038 | 3.92 | <0.0001 |
| CAPN11 eutherians versus teleosts | 0.18 | 0.04 | 4.35 | <0.0001 |
| CAPN11 sauropsids versus teleosts | 0.031 | 0.028 | 1.09 | 0.14 |
| CAPN1 versus CAPN11 eutherians | 0.12 | 0.048 | 2.41 | 0.008 |
| CAPN1 versus CAPN11 sauropsids | −0.02 | 0.04 | −0.49 | 0.69 |
| CAPN1 versus CAPN11 teleosts | −0.031 | 0.043 | −0.71 | 0.76 |
| CAPN2 versus CAPN11 eutherians | 0.11 | 0.046 | 2.44 | 0.007 |
| CAPN2 versus CAPN11 sauropsids | 0.060 | 0.034 | 1.74 | 0.041 |
| CAPN2 versus CAPN11 teleosts | 0.068 | 0.037 | 1.81 | 0.035 |
| CAPN1 versus CAPN2 | −0.024 | 0.047 | 0.51 | 0.70 |

NOTE. θ, coefficient of functional divergence; θSE, standard error of θ; z score corresponding to P < 0.05 = 1.645, P < 0.01 = 2.326, P < 0.001 = 3.09, and P < 0.0001 = 3.719.
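The z score thresholds listed in the note to table 2 correspond to the upper tail of a standard normal distribution. As a minimal sketch of that conversion (the function name is illustrative, and this is not DIVERGE 2.0 code), a z score can be turned into a one-tailed P value as follows:

```python
import math


def one_tailed_p(z):
    """Upper-tail P value of a standard normal z score."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Thresholds quoted in the note to table 2.
for z in (1.645, 2.326, 3.09, 3.719):
    print(f"z = {z}: P = {one_tailed_p(z):.4f}")

# Two type II rows from table 2 (z scores as reported there).
print(f"CAPN1 versus CAPN11 eutherians: P = {one_tailed_p(2.41):.3f}")
print(f"CAPN2 versus CAPN11 sauropsids: P = {one_tailed_p(1.74):.3f}")
```

A negative z score, as seen where the estimated θ is below zero, gives a P value above 0.5 under this one-tailed convention.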
Implications of Type II Sites for Putative Residue
Interactions with CAST4 and CAPNS1
We next used protein homology modeling to examine if
type II residue replacements could modulate the interaction of calpains with other proteins. As a template, we used
a 2.4-Å resolution crystal structure of the rat CAPN2–CAPNS1
complex (i.e., m-CAPN) bound to an inhibitory domain of
CAST (CAST4) in the presence of Ca2+ (Hanna et al. 2008).
We produced homology models from this structure for
CAPN2, as well as other calpains known to bind CAST, including CAPN1 (Goll et al. 2003) and CAPN11 of chicken
(Wolfe et al. 1989), plus for eutherian CAPN11, where it is
unknown if a physical interaction with CAST exists. In this
regard, CAST is expressed in primate spermatozoa (Rojas
et al. 1999; Yudin et al. 2006), suggesting that it should
be physically proximal to eutherian CAPN11. CAST is
formed of four inhibitory domains, each split into regions
A, B, and C (Goll et al. 2003). Each inhibitory domain binds
calpain DIV through region A, CAPNS1 via region C, and
the remaining large subunit via region B, where the protease core active site is blocked while proteolysis is avoided
by a looping mechanism that bypasses the active-site cysteine (Hanna et al. 2008) (e.g., fig. 6A). The control model,
which used rat CAPN2, CAPNS1, and CAST4 sequences
(fig. 6A), was visually identical to the published structure
(Hanna et al. 2008) and all key-stated residue–residue interactions between CAPN2 and CAST4 were predicted (not
shown). We mapped to this model, type II sites where biochemical constraints were conserved in CAPN2 and pretherian CAPN11 but had radically shifted in the
eutherian ancestor (fig. 6A). Of 20 such type II sites, 13 were
positioned away from the interface with CAPNS1 and
CAST4, 6 were found at the interface with region B of
CAST4 (sites Q290, R337, D425, T456, R461, and T464 in
the full-length protein, accession number: 3BOW_A), interacting with one to three CAST4 residues, and 1 site (I516)
was located at the CAPN2–CAPNS1 interface, interacting
with three CAPNS1 residues (fig. 6B). At equivalent type II
sites in full-length rat CAPN1 (accession number:
NP_062025), the same residue–residue interactions were
conserved with no new interactions predicted (fig. 6C).
In chicken CAPN11 (accession number: NP_990634), identical residue–residue interactions were conserved at six of
these seven type II sites and no new interactions were predicted (fig. 6D). The only difference was that at Q293 of the
full-length sequence, which is equivalent to Q290 of
CAPN2, a single additional residue interaction with CAST4
was predicted (fig. 6D). Strikingly, in CAPN11 of the eutherian ancestor and rat (accession number: NP_001002806),
only one of these six type II sites had a conserved interaction with CAST4 compared with CAPN2 (fig. 6E and F). At
the other type II sites, interactions with CAST4 residues
were either lost (e.g., L347/L371 and F474/F499 of rat/ancestral eutherian CAPN11, respectively equivalent to R337
and R461 of CAPN2) or modified (e.g., R300/R324 and
I477/I502 of rat/ancestral eutherian CAPN11, respectively
equivalent to Q290 and T464 of CAPN2) (fig. 6E and F). Furthermore, one type II residue replacement in CAPN11 of rat
and the eutherian ancestor (W430 and W454, respectively)
led to an interaction with CAST4 absent at the equivalent
type II site of CAPN2 (R416) and pretherian CAPN11
(R419). The single type II residue interaction with CAPNS1
conserved in CAPN2 (I516), CAPN1 (I527), and chicken
CAPN11 (I519) was absent at the equivalent site in eutherian CAPN11 (N554; fig. 6E) and modified in rat CAPN11
FIG. 5. Conservation and distribution of identified type II sites in eutherian CAPN11 and pretherian CAPN11, CAPN1, and CAPN2. Sites
receiving the highest possible PRS in at least two DIVERGE 2.0 comparisons with eutherian CAPN11 are marked with stars. For comparison,
logos are shown for amphibian and platypus CAPN11, plus vertebrate CAPN3 and 8, which were not included in the analysis. Residues are color
coded by biochemical property and heights represent their relative frequency at each site. The distribution of conserved type II sites is shown
along the domain structure of human CAPN11, where DIIa, DIIb, DIII, and DIV are respectively boxed in green, yellow, red, and blue. The
location of catalytic residues (C, H, and N) and the EF hand motifs (black vertical lines) are shown. The number of species per logo is indicated.
(N529; fig. 6F). These results suggest that type II replacements fixed in CAPN11 of eutherians alter residue–residue
interactions at the interface with CAST4 (and to a smaller
extent CAPNS1) that have remained conserved between
pretherian orthologues and CAPN2. Because these differences in interaction dynamics are mainly conserved in
CAPN11 of both rat and the eutherian ancestor (fig. 6E
and F), it is likely that many of the original type II fixations
in the eutherian stem have not since been markedly affected or buffered by subsequent changes in the protein
in individual eutherian lineages. It should also be noted that
all interactions between CAST4 and the protease core residues (S105, H262, and N286 in full-length rat CAPN2) and
other critical flanking residues of the active-site cleft, for
example, W288 (Hanna et al. 2008) were conserved across
the homology models (not shown).
FIG. 6. (A) 3D surface representation of a protein homology model for rat CAPN2, bound to CAPNS1 and CAST4 (after Hanna et al. 2008). DIIa,
DIIb, DIII, and DIV are shaded by the same color scheme as figure 5, DVI of CAPNS1 is shaded orange, and CAST4 regions are labeled with
arrows. (B) The same model as part A in cartoon form with the CAPN2 large subunit shaded entirely gray. Blue spheres show type II residues
with no interaction with CAPNS1 or CAST4. Red spheres mark type II residues that interact with one to three residues in CAST4 or CAPNS1.
(C–F) Homology models for other calpain family members (indicated above each structure) in the style of part B. Blue and red
spheres show conserved residue–residue interactions relative to those observed in rat CAPN2, except when respectively labeled with blue or
red arrowheads, which show new or lost residue interactions. Cyan spheres show type II residues where an interaction occurs with a CAST4
residue that does not occur at the CAPN2–CAST4 interface, but other residue–residue interactions are conserved. Pink spheres show type II
replacements where an interaction with a CAST4 residue found at the CAPN2–CAST4 interface was lost, but other residue–residue
interactions are conserved.
Evidence for Loss of a Functional CAPN11 Gene in
the Marsupial Lineage
A CAPN11 gene is absent from the opossum genome,
which harbors predicted genes for all other expected family
members, that is, CAPN1, 2, 3, 5, 6, 7, 8, 9, 10, 12, 13, 14 and
15. We submitted the region of chromosome 2 between the
TMEM63B and SLC29A1 genes, which flank the 5′ and
3′ ends of CAPN11 in all tetrapod genomes (fig. 1 and supplementary fig. S3, Supplementary Material online), to GENSCAN (Burge and Karlin 1997), which returned an open-reading
FIG. 7. (A) Sequence alignment of putative nonfunctional CAPN11 proteins of marsupials with functional eutherian and pretherian
orthologues, plus CAPN1 and 2 of wallaby/chicken. Residues shaded in blue or green are positioned in different exons and those shaded red
span exon boundaries. Letters shaded in bold or underlined black font respectively identify residue replacements in marsupial or eutherian
CAPN11 that deviate from the state conserved in pretherian CAPN11. A stop codon in wallaby CAPN11 is marked as a black asterisk. Indels are
shown as a dash. A type II site in CAPN11 is marked with a 2. (B) Example DNA sequencing trace chromatograms (obtained from the Ensembl
trace server) demonstrating the high quality of nucleotide bases underlying the aligned protein sequence for opossum CAPN11. Amino acids
are shown above codons and arrowheads mark predicted donor and acceptor sites in introns at exon boundaries. (C) As for (B), except for
tammar wallaby.
frame (ORF) of 268 amino acids (804 bp, derived from eight
exons), which was BlastP screened against the nonredundant NCBI database, returning statistically strongest hits to
CAPN11 sequences. We initially checked that the nucleotides underlying the GENSCAN prediction were not due to
sequencing errors. Specifically, the 30,000-bp genomic
DNA region encompassing the predicted ORF was broken
into 800-bp segments, which were individually screened
against the Ensembl trace server. In each case, multiple
(>4) overlapping high-quality traces were retrieved (examples in fig. 7). Therefore, we are confident of the accuracy of
the sequence underlying the GENSCAN prediction. Sequence alignment revealed that the opossum ORF was positioned toward the C-terminus of functional CAPN11
proteins and contained several large deletions (not shown).
The CAPN11 ORF (minus indels) shared more sequence
identity with CAPN11 sequences (>45% vs. platypus or
chicken CAPN11) than with the next most closely related
typical calpains (e.g., ~35% vs. marsupial CAPN1 or 2),
but at a markedly lower level than in typical orthologue
comparisons (e.g., ~80% in platypus vs. chicken CAPN11).
The reduced sequence identity was due to frequent residue
replacements at sites conserved across typical calpains
(e.g., fig. 7).
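The trace-server check described above relied on breaking a 30,000-bp region into 800-bp segments. A minimal sketch of that partitioning is given below. It assumes non-overlapping, consecutive segments with a shorter final segment, which the text does not specify, and the function name is hypothetical.

```python
def segment_bounds(region_length, segment_length):
    """Return half-open (start, end) coordinates that tile a region.

    Segments are consecutive and non-overlapping (an assumption of this
    sketch); the final segment is shorter if the region length is not a
    multiple of the segment length.
    """
    if region_length <= 0 or segment_length <= 0:
        raise ValueError("lengths must be positive")
    return [
        (start, min(start + segment_length, region_length))
        for start in range(0, region_length, segment_length)
    ]


segments = segment_bounds(30_000, 800)
print(len(segments))   # number of query segments
print(segments[0])     # first segment
print(segments[-1])    # final, shorter segment
```

Under these assumptions the region yields 38 query segments, the last covering the final 400 bp.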
The tammar wallaby has a 2-fold coverage genome sequence, within which a putative CAPN11 orthologue was
identified (i.e., the sequence that was not transcribed
across tissues; fig. 4). Specifically, a gene prediction (Ensembl ID: ENSMEUG00000013426) was identified by Blast
screening. This gene has been named as CAPN12, but this
is a clear misannotation, which can be proven by simple
sequence alignment (see below). ENSMEUG00000013426
has a predicted cDNA distinct from a repertoire of other
calpain genes (including CAPN1, 2, 3, 5, 6, 7, 8, 9, 10, 13, 14,
and 15) but contains sections of unknown nucleotides,
coding 355 amino acids in total. We again used the Ensembl trace server to confirm the quality of the
~23,000-bp genomic sequence from which this gene
was predicted. The entire region as it appears in Ensembl
was covered by overlapping high-quality traces, although
several regions were covered by a single trace and the
missing sequence information was due to a lack of coverage. This approach identified the presence of a stop codon that is skipped by the Ensembl gene prediction (see
fig. 7). Like the opossum ORF, the wallaby sequence returned strongest Blast hits to CAPN11 sequences and
spanned the C-terminus of the protein. Similarly, by manual sequence alignment, large deletions were observed
(not shown). In regions where indels were removed,
the sequence shared greater sequence identity to CAPN11
(e.g., 54% vs. platypus) than the next most related calpain
family members (e.g., 46% and 39% respective identity
with wallaby CAPN1 and 2) and less identity with other
family members, including CAPN12 (e.g., 32% identity
with opossum CAPN12, which shares 75% identity with
human CAPN12). The reduced sequence identity was
again due to frequent residue replacements at sites conserved across typical calpains (e.g., see fig. 7).
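The identity values quoted above were calculated over aligned regions with indels removed. The sketch below shows one common way to compute such a value from a pairwise alignment, skipping any column that contains a gap in either sequence. The function name and the two short sequences are hypothetical toy examples, not data from this study, and the exact procedure used by the authors may have differed.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over gap-free columns of a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == gap or res_b == gap:
            continue  # skip indel columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, for illustration only.
toy_a = "MKT-LLVAG"
toy_b = "MRTALL-AG"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity over gap-free columns")
```

In the toy alignment, seven columns are gap free and six of them match.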
We could not identify marsupial CAPN11 sequences in
NCBI EST or nonredundant protein databases for marsupials using CAPN11 orthologues as in silico probes. These
degraded sequence predictions for opossum and wallaby
suggest that a functional CAPN11 product is absent in both
marsupial species.
Comparison of CAPN11 in Marsupials, Pretherians,
and Eutherians
Short regions of putative CAPN11 proteins of opossum
(129 amino acids) and wallaby (111 amino acids) were
conserved enough to allow confident sequence alignment
versus functional CAPN11 orthologues, plus marsupial
CAPN1 and 2 (fig. 7). The intron–exon structures of
the marsupial sequences are conserved with other typical
calpains (fig. 7). Unsurprisingly, more residue replacements are present in the marsupial sequences than in
functional calpains (fig. 7). Interestingly, many nonsynonymous changes present in eutherian CAPN11 occur at
sites where replacements have also occurred in marsupials
but are constrained across pretherian orthologues (fig. 7).
These include a class of sites that are constrained as the
same residue in CAPN11 of included pretherians but variable with respect to amino acid property in included
eutherians (fig. 7). Many such eutherian sites have replacements in one or both marsupial sequences that also
deviate from the conserved pretherian state (fig. 7). These
sites may represent those where functional constraints
were lost in both marsupials and eutherians compared
with the pretherian state. There is also a site-class constrained as the same residue in pretherian CAPN11 but
fixed as a distinct residue in the included eutherians
(fig. 7). Interestingly, at all such sites, nonsynonymous
changes are present in one or both marsupial sequences
that also deviate from the pretherian state (fig. 7). These
sites may represent those where functional constraints
were initially relaxed in both marsupials and eutherians
but strong constraints subsequently returned in eutherians alone. There are also some sites where the same
amino acid is conserved in eutherians and one or both
marsupials (fig. 7), suggesting that they are synapomorphies. One such site was a type II residue (fig. 7).
Discussion
CAPN11 is a Vertebrate-Wide Family Member
Directly Ancestral to CAPN1 and 2
Our phylogenetic trees supported a basal position for the
newly recognized CAPN11 clade relative to CAPN1 and 2/8
with respect to the CAPN3/9 outgroup (fig. 2). It was previously suggested that CAPN1, 2, and 8 genes arose following two rounds of genomic duplication in the vertebrate
stem of chordates, initially a tetraploidisation or large-scale
event leading to CAPN1 and an ancestor gene to CAPN2/8,
followed by a tandem duplication leading to separate
CAPN2 and 8 genes (Jékely and Friedrich 1999). Considering
its phylogenetic position, CAPN11 is a strong candidate for
being the progenitor sequence from which CAPN1 and the
ancestor gene to CAPN2/8 initially arose. Therefore, future
studies of CAPN11 may provide interesting clues into the
biochemical and functional properties of its highly studied
paralogues, CAPN1 and 2.
CAPN11 Diverged Functionally in a Eutherian
Ancestor and is Likely Subject to Reduced Selective
Constraints Relative to Pretherian Orthologues
This work indicates that eutherian CAPN11 performs
a more restricted physiological role than its pretherian
orthologues. For example, the shift in regulation to testis
(fig. 4) and specifically spermatozoa (Ben-Aharon et al.
2006) would have provided a dramatically different cellular
arena in which to function. The restriction of CAPN11 protein to the testis may also explain the ~3-fold elevated dN/dS
ratios estimated across eutherians compared with the
pretherian clade (table 1). It has been shown by Duret
and Mouchiroud (2000) and Jordan et al. (2005) that
the protein products of mouse and human orthologues expressed in a wide range of tissues evolve under greater functional constraints than those expressed in few tissues. For
broadly expressed calpains, it would be expected that
among their numerous substrates (as well as among proteins with which they interact but do not proteolyse), some
proportion would never be localized in testis or sperm.
Therefore, eutherian CAPN11 likely lost a plethora of interactions conserved in widely expressed pretherian orthologues. Accordingly, some sites involved in the
underlying interactions would be expected to be subject
to a lower level of purifying selection, with nonsynonymous
replacements being neutral that would be strongly deleterious in pretherian orthologues.
In addition to the remarkable shift in gene regulation,
several type II replacements fixed in the eutherian stem
(fig. 5) modify or ablate certain interactions with CAST4
that are conserved in pretherian orthologues and
CAPN1/2 paralogues (fig. 6). However, no type II sites or
other nonsynonymous changes in eutherian CAPN11 proteins modified any critical interactions of CAST4 with the
protease core and active-site cleft of CAPN11. Instead, the
type II sites fall in regions that may affect the overall stability of the interaction. Without experimental validation, it
is unclear if changes in CAPN11–CAST interactions will together relax the sum interaction between these proteins
and thus the availability of CAPN11 to its potential substrates. One intriguing possibility is that with its restricted
expression pattern, eutherian CAPN11 requires less tight
control at the protein level than its pretherian orthologues
and CAPN1/2.
were required to reach testis restriction in the eutherian
ancestor, it is unlikely that each along the chain would
be adaptive, particularly considering that the gene was
seemingly under purifying selection to retain a wide expression breadth until this point in evolutionary time
(fig. 4). Thus, we suggest that a strong relaxation in purifying selection is required to account for the number of
changes in regulatory regions required for such a dramatic
change in gene transcription.
In short regions of marsupial CAPN11 proteins that were
available for comparison, many nonsynonymous changes
present in eutherian CAPN11 that deviate from the pretherian state were observed at sites where constraints were
also altered in opossum and/or wallaby (fig. 7). Furthermore, several putative synapomorphic sites, including
one type II site, are shared by marsupial and eutherian
CAPN11 (fig. 7), suggesting that some nonsynonymous
changes present in CAPN11 of the eutherian ancestor occurred in a common eutherian–marsupial ancestor after
the split from monotremes. These results suggest that a loss
of constraint was common to many sites shared by marsupial and eutherian CAPN11 relative to pretherian orthologues. This points to a relaxation in purifying selection on
the accompanying residues in a common therian ancestor
being a driving force in the rapid evolution of CAPN11 in
the eutherian ancestor.
What Underlies the Episodic Evolution of CAPN11?
A Model Akin to the Dykhuizen–Hartl Effect is
Consistent with the Evolution of CAPN11
The point in evolutionary time where CAPN11 became
transcriptionally restricted to testis and where coding level
dN/dS estimates were first elevated encompasses the split of
marsupials and eutherians. Considering this fact and that
the marsupial gene is no longer expressed (fig. 4) and has
accumulated coding level changes that likely render it
nonfunctional (fig. 7), it is difficult to rule out that
a common mechanism that caused a loss of constraint
in a common eutherian–marsupial ancestor underlies
the regulatory and coding level changes to the functional
eutherian gene.
We find nothing in the literature proposing a population
mechanism to explain how orthologues of a widely expressed gene could become transcriptionally disabled in
one lineage (i.e., marsupials) and tissue restricted in its sister group (i.e., the eutherians). The sum spatial and temporal expression of a typical vertebrate gene is governed by
the interactions of the transcription machinery with its
promoter, plus numerous enhancers, silencers, and insulators found at multiple spatially distinct locations in the
proximal genome (Levine and Tjian 2003). Due to this overriding complexity, we suggest that the shift from a broad to
tissue-specific expression pattern could not be mediated by
positive selection alone. Our reasoning is that the adaptive
phenotype being under selection, in this case testis-restricted transcription, would likely require a number of
regulatory elements, such as tissue-specific enhancers, to
be cumulatively modified, meaning no single substitution
could reach the required endpoint. If multiple substitutions
Based on the above arguments, we feel it is impossible to
exclude that a loss of constraints accounts for the rapid
evolution of CAPN11 at the base of therians and its subsequent functional restriction in eutherians and disablement in marsupials. Relaxation of functional constraint
is common in one of two paralogues after genomic duplication due to redundancy in function (Zhang 2003). In the
case of CAPN11, it seems unlikely that functional redundancy with other calpain family members could account
for loss of constraints following the therian–monotreme
split, considering that at this point in time, all vertebrate-wide calpain family members would have been separated by a minimum of ~230–260 million years of
evolution (estimate from Benton and Donoghue 2007)
and would have diverged markedly in protein sequence
and presumably certain functions. Under the neutral theory of evolution (Kimura 1983), the strength of natural selection is intimately associated with population size,
meaning purifying selection becomes increasingly inefficient under smaller effective population sizes, whereas genetic drift is more prevalent. Effective population size can
be reduced via population bottlenecks, which are thought
to be common during speciation (Mayr 1963), a proposal that
has been supported experimentally (Guo et al. 2009). It is
possible that some event reducing effective population
size affected the CAPN11 locus of a common marsupial–
eutherian ancestor and consequently, mutations were fixed
in regulatory and coding regions due to inefficient purifying
selection/increased drift, leading to a reduction in
transcript expression breadth and the incorporation of
many nonsynonymous replacements in the protein that
deviated from the pretherian state. Kimura (1983) proposed the Dykhuizen–Hartl effect (after Dykhuizen and
Hartl 1980) as a mechanism whereby neutral mutations
fixed by drift become subsequently adaptive in an altered
biochemical environment. A model of this type can explain
how eutherian CAPN11 gained a distinct role in eutherians
compared with pretherians and how the gene was lost in
marsupials. There is evidence that several facets of eutherian sperm physiology are distinct from marsupials, including that the oocyte zona pellucida (a membrane the
spermatozoa must penetrate) is markedly thicker and
more resistant to enzymatic digestion (Bedford 1998). Furthermore, the manner by which spermatozoa bind to this
membrane in eutherians is different to marsupials (Bedford
1998). This binding is Ca2+ dependent (Yanagimachi 1978),
and critical membrane fusion events between the spermatozoa and oocyte are likely modulated by the calpain system (Rojas et al. 1999). Therefore, we suggest that after the
split of the common eutherian–marsupial ancestor, some
lineage-specific facet of sperm physiology arose in eutherians, providing an adaptive advantage for the ‘‘evolved’’
CAPN11 gene in testis, possibly in relation to Ca2+-dependent membrane fusion. It is possible that such an adaptive
response involved positive selection for a testis-specific enhancer in CAPN11 regulatory regions. Under this model,
certain combinations of nonsynonymous replacements
fixed in the coding sequence were also selectively advantageous in the testis-restricted biochemical environment
and with a return to a larger effective population size became subject to strong ongoing purifying selection leading
to the observed type II sites. Conversely, in marsupials, no
adaptive advantage was ever realized and the accumulation
of changes to regulatory and coding regions meant the
CAPN11 gene became subject to weak or even absent
ongoing purifying selection, eventually leading to its
functional disablement.
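The population-size argument above can be made concrete with a short numerical sketch. It uses Kimura's standard diffusion approximation for the fixation probability of a new mutation, which is a textbook result and not an analysis performed in this study. The selection coefficient and effective population sizes are hypothetical values chosen only for illustration, and the function names are our own. The sketch assumes a diploid population whose census and effective sizes are equal.

```python
import math


def fixation_probability(s, n_e):
    """Fixation probability of a single new mutation (diffusion approximation).

    s   : selection coefficient (negative = deleterious); illustrative only
    n_e : effective population size, assumed equal to the census size
    """
    p0 = 1.0 / (2.0 * n_e)  # starting frequency of one new copy in a diploid population
    if s == 0.0:
        return p0  # neutral limit: fixation probability equals starting frequency
    exponent = -4.0 * n_e * s
    if exponent > 700.0:
        # Strongly deleterious relative to drift: probability is effectively zero
        # (this guard also avoids floating-point overflow in expm1).
        return 0.0
    # (1 - exp(-4*Ne*s*p0)) / (1 - exp(-4*Ne*s)), written with expm1 for accuracy
    return math.expm1(exponent * p0) / math.expm1(exponent)


def relative_to_neutral(s, n_e):
    """Fixation probability expressed as a fraction of the neutral expectation."""
    return fixation_probability(s, n_e) / fixation_probability(0.0, n_e)


if __name__ == "__main__":
    s = -0.001  # hypothetical slightly deleterious nonsynonymous change
    for n_e in (100, 1000, 10000):  # hypothetical effective population sizes
        print(n_e, relative_to_neutral(s, n_e))
```

With these illustrative values, the slightly deleterious mutation fixes at close to the neutral rate when the effective population size is small and almost never when it is large. This is the sense in which a bottleneck renders purifying selection inefficient.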
Supplementary Material
Supplementary table S1, figures S1–S3, and file S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org).
Acknowledgments
Dr Lara Meischke (University of St Andrews) sequenced
a teleost μ/m-CAPN, leading to the study's conception.
We thank Dr Dave Ferrier (University of St Andrews) for
his comments on an earlier manuscript draft. The study
was supported by grants from the Natural Environment
Research Council (NE/E015212/1) and European Commission (contract 506359). D.J.M and I.A.J conceived the study.
D.J.M performed all experiments except the PCRs in platypus
and tammar wallaby, which were performed by M.L.D. S.M. contributed
to the shared synteny analyses. D.J.M prepared all figures
and drafted the manuscript, which was edited to final form
with significant input from I.A.J and some input from
M.L.D.
References
Abascal F, Zardoya R, Posada D. 2005. ProtTest: selection of best-fit
models of protein evolution. Bioinformatics 21:2104–2105.
Bedford JM. 1998. Mammalian fertilization misread? Sperm
penetration of the eutherian zona pellucida is unlikely to be
a lytic event. Biol Reprod. 59:1275–1287.
Ben-Aharon I, Brown PR, Shalgi R, Eddy EM. 2006. Calpain-11 is
unique to mouse spermatogenic cells. Mol Reprod Dev.
73:767–773.
Benkert P, Tosatto SCE, Schomburg D. 2008. QMEAN: a comprehensive scoring function for model quality assessment. Proteins
71:261–277.
Benton MJ, Donoghue PC. 2007. Paleontological evidence to date
the tree of life. Mol Biol Evol. 24:26–53.
Burge C, Karlin S. 1997. Prediction of complete gene structures in
human genomic DNA. J Mol Biol. 268:78–94.
Burnham KP, Anderson DR. 2002. Model selection and multimodel
inference. New York: Springer-Verlag.
Castresana J. 2000. Selection of conserved blocks from multiple
alignments for their use in phylogenetic analysis. Mol Biol Evol.
17:540–552.
Colovos C, Yeates TO. 1993. Verification of protein structures: patterns
of nonbonded atomic interactions. Protein Sci. 2:1511–1519.
Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo:
a sequence logo generator. Genome Res. 14:1188–1190.
Dear TN, Boehm T. 1999. Diverse mRNA expression patterns of the
mouse calpain genes Capn5, Capn6 and Capn11 during
development. Mech Dev. 89:201–209.
Dear TN, Möller A, Boehm T. 1999. CAPN11: a calpain with high
mRNA levels in testis and located on chromosome 6. Genomics
59:243–247.
Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary
analysis by sampling trees. BMC Evol Biol. 7:214.
Duret L, Mouchiroud D. 2000. Determinants of substitution rates in
mammalian genes: expression pattern affects selection intensity
but not mutation rate. Mol Biol Evol. 17:68–74.
Dutt P, Croall DE, Arthur JS, Veyra TD, Williams K, Elce JS, Greer PA.
2006. m-Calpain is required for preimplantation embryonic
development in mice. BMC Dev Biol. 6:3.
Dykhuizen D, Hartl DL. 1980. Selective neutrality of 6PGD allozymes
in E. coli and the effects of genetic background. Genetics
96:801–817.
Farkas A, Tompa P, Friedrich P. 2003. Revisiting ubiquity and tissue
specificity of human calpains. Biol Chem. 384:945–949.
Fay JC, Wu CI. 2003. Sequence divergence, functional constraint, and
selection in protein evolution. Annu Rev Genomics Hum Genet.
4:213–235.
Goll DE, Thompson VF, Li H, Wei W, Cong J. 2003. The calpain
system. Physiol Rev. 83:731–801.
Gribaldo S, Casane D, Lopez P, Philippe H. 2003. Functional
divergence prediction from evolutionary analysis: a case study of
vertebrate hemoglobin. Mol Biol Evol. 20:1754–1759.
Gu X. 1999. Statistical methods for testing functional divergence
after gene duplication. Mol Biol Evol. 16:1664–1674.
Gu X. 2006. A simple statistical method for estimating type-II
(cluster-specific) functional divergence of protein sequences.
Mol Biol Evol. 23:1937–1945.
Guindon S, Gascuel O. 2003. A simple, fast, and accurate algorithm
to estimate large phylogenies by maximum likelihood. Syst
Biol. 52:696–704.
Guo YL, Bechsgaard JS, Slotte T, Neuffer B, Lascoux M, Weigel D,
Schierup MH. 2009. Recent speciation of Capsella rubella from
Capsella grandiflora, associated with loss of self-incompatibility
and an extreme bottleneck. Proc Natl Acad Sci U S A.
106:5246–5251.
Hanna RA, Campbell RL, Davies PL. 2008. Calcium-bound structure
of calpain and its mechanism of inhibition by calpastatin. Nature
456:409–412.
Hata S, Doi N, Kitamura F, Sorimachi H. 2007. Stomach-specific
calpain, nCL-2/calpain 8, is active without calpain regulatory
subunit and oligomerizes through C2-like domains. J Biol Chem.
282:27847–27856.
Hughes AL, Nei M. 1988. Pattern of nucleotide substitution at major
histocompatibility complex class I loci reveals overdominant
selection. Nature 335:167–170.
Jékely G, Friedrich P. 1999. The evolution of the calpain family as
reflected in paralogous chromosome regions. J Mol Evol.
49:272–281.
Jordan IK, Mariño-Ramírez L, Koonin EV. 2005. Evolutionary
significance of gene expression divergence. Gene 345:119–126.
Katoh K, Toh H. 2008. Recent developments in the MAFFT multiple
sequence alignment program. Brief Bioinform. 9:286–298.
Kimura M. 1983. The neutral theory of molecular evolution.
Cambridge: Cambridge University Press.
Kittichotirat W, Guerquin M, Bumgarner RE, Samudrala R. 2009.
Protinfo PPC: a web server for atomic level prediction of protein
complexes. Nucleic Acids Res. 37:W519–W525.
Kosakovsky Pond SL, Frost SD. 2005a. HyPhy: hypothesis testing
using phylogenies. Bioinformatics 21:676–679.
Kosakovsky Pond SL, Frost SD. 2005b. Datamonkey: rapid detection
of selective pressure on individual sites of codon alignment.
Bioinformatics 21:2531–2533.
Kosakovsky Pond SL, Poon AFY, Frost SD. 2009. Estimating
selection pressures on alignments of coding sequences.
In: Lemey P, Salemi M, Vandamme A, editors. The phylogenetic
handbook. Cambridge: Cambridge University Press. p. 419–490.
Lee HL, Santé-Lhoutellier V, Vigouroux S, Briand Y, Briand M. 2007.
Calpain specificity and expression in chicken tissues. Comp
Biochem Physiol B Biochem Mol Biol. 146:88–93.
Levine M, Tjian R. 2003. Transcription regulation and animal
diversity. Nature 424:147–151.
Lopez P, Casane D, Philippe H. 2002. Heterotachy, an important
process of protein evolution. Mol Biol Evol. 19:1–7.
Macqueen DJ, Meischke L, Manthri S, Anwar A, Solberg C,
Johnston IA. 2010. Characterisation of capn1, capn2-like, capn3
and capn11 genes in Atlantic halibut (Hippoglossus hippoglossus
L.): transcriptional regulation across tissues and in skeletal
muscle at distinct nutritional states. Gene 453:45–58.
Mayr E. 1963. Animal species and evolution. Cambridge: Harvard
University Press.
Murakami T, Ueda M, Hamakubo T, Murachi T. 1988. Identification
of both calpains I and II in nucleated chicken erythrocytes.
J Biochem. 103:168–171.
Porollo A, Meller J. 2007. Versatile annotation and publication
quality visualization of protein complexes using POLYVIEW-3D.
BMC Bioinformatics. 8:316.
Pybus OG, Shapiro B. 2009. Natural selection and adaptation of
molecular sequences. In: Lemey P, Salemi M, Vandamme A,
editors. The phylogenetic handbook. Cambridge: Cambridge
University Press. p. 407–418.
Rojas FJ, Brush M, Moretti-Rojas I. 1999. Calpain-calpastatin: a novel
complete calcium-dependent protease system in human
spermatozoa. Mol Hum Reprod. 5:520–526.
Ronquist F, Huelsenbeck JP. 2003. MrBayes 3: Bayesian
phylogenetic inference under mixed models. Bioinformatics
19:1572–1574.
Saez ME, Ramirez-Lorca R, Moron FJ, Ruiz A. 2006. The therapeutic
potential of the calpain family: new aspects. Drug Discov Today.
11:917–923.
Sorimachi H, Ishiura S, Suzuki K. 1997. Structure and physiological
function of calpains. Biochem J. 328:721–732.
Sorimachi H, Suzuki K. 2001. The structure of calpain. J Biochem.
129:653–664.
Sorimachi H, Tsukahara T, Okada-Ban M, Sugita H, Ishiura S,
Suzuki K. 1995. Identification of a third ubiquitous calpain
species-chicken muscle expresses four distinct calpains. Biochim
Biophys Acta. 1261:381–393.
Suyama M, Torrents D, Bork P. 2006. PAL2NAL: robust conversion of
protein sequence alignments into the corresponding codon
alignments. Nucleic Acids Res. 34:W609–W612.
Tamura K, Dudley J, Nei M, Kumar S. 2007. MEGA4: molecular
evolutionary genetics analysis (MEGA) software version 4.0. Mol
Biol Evol. 24:1596–1599.
Wallner B, Elofsson A. 2005. Identification of correct regions in
protein models using structural, alignment and consensus
information. Protein Sci. 15:900–913.
Wolfe FH, Sathe SK, Goll DE, Kleese WC, Edmunds T, Duperret SM.
1989. Chicken skeletal muscle has three Ca2+-dependent
proteinases. Biochim Biophys Acta. 998:236–250.
Yanagimachi R. 1978. Calcium requirement for sperm–egg fusion in
mammals. Biol Reprod. 19:949–958.
Yudin AI, Goldberg E, Robertson KR, Overstreet JW. 2006. Calpain
and calpastatin are located between the plasma membrane and
outer acrosomal membrane of cynomolgus macaque spermatozoa. J Androl. 21:721–729.
Zhang J. 2003. Evolution by gene duplication: an update. Trends Ecol
Evol. 18:292–298.