Long-Lived Dichotomous Lineages of the Proteasome Subunit
Beta Type 8 (PSMB8) Gene Surviving More than 500 Million
Years as Alleles or Paralogs
Kentaro Tsukamoto,†,1 Fumi Miura,1 Naoko T. Fujito,1 Goro Yoshizaki,2 and Masaru Nonaka*,1
1Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
2Department of Marine Biosciences, Tokyo University of Marine Science and Technology, Tokyo, Japan
†Present address: Institute for Comprehensive Medical Science, Fujita Health University, Toyoake, Aichi, Japan
*Corresponding author: E-mail: [email protected].
Associate editor: Yoko Satta
Data deposition: Accessions: AB602772–AB602774.
Abstract
On an evolutionary time scale, polymorphic alleles are believed to have a short life, persisting at most tens of millions of
years even under long-term balancing selection. Here, we report highly diverged trans-species dimorphism of the
proteasome subunit beta type 8 (PSMB8) gene, which encodes a catalytic subunit of the immunoproteasome responsible
for the generation of peptides presented by major histocompatibility complex (MHC) class I molecules, in lower teleosts
including Cypriniformes (zebrafish and loach) and Salmoniformes (trout and salmon), whose last common ancestor dates
to 300 Ma. Moreover, phylogenetic analyses indicated that these dimorphic alleles share lineages with two shark
paralogous genes, suggesting that these two lineages have been maintained for more than 500 My either as alleles or as
paralogs, and that conversion between alleles and paralogs has occurred at least once during vertebrate evolution. Two
lineages termed PSMB8A and PSMB8F show an A31F substitution that would probably affect their cleaving specificity, and
whereas the PSMB8A lineage has been retained by all analyzed jawed vertebrates, the PSMB8F lineage has been lost by
most jawed vertebrates except for cartilaginous fish and basal teleosts. However, a possible functional equivalent of the
PSMB8F lineage has been revived as alleles within the PSMB8A lineage at least twice during vertebrate evolution in the
amphibian Xenopus and teleostean Oryzias species. Dynamic evolution of the PSMB8 polymorphism through long-term
persistence, loss, and regaining of dimorphism and conversion between alleles and paralogs implies the presence of strong
selective pressure for functional polymorphism of this gene.
Key words: PSMB8, long-lived lineages, balancing selection, immunoproteasome.
Introduction
Proteasomes are huge multicatalytic proteinase complexes,
which are responsible for selective proteolytic degradation
and production of the antigenic peptides to be presented
by MHC class I molecules to cytotoxic CD8+ T cells (Rock
and Goldberg 1999). Proteolysis is performed by the 20S proteasome, the catalytic core of the proteasome, which is a large
complex composed of a stack of four rings, two outer alpha rings and
two inner beta rings, containing seven alpha and seven beta
subunits, respectively (Groll et al. 1997; Unno et al. 2002).
Of these 14 subunits, only three beta subunits have proteolytic activity. These active beta subunits of a constitutive
proteasome are PSMB5 (X), PSMB6 (Y), and PSMB7 (Z),
with chymotrypsin-like, caspase-like, and trypsin-like proteinase activities, respectively. They are replaced by the IFN-γ-inducible beta subunits with proteinase activities, PSMB8
(LMP7), PSMB9 (LMP2), and PSMB10 (MECL-1), respectively, forming an immunoproteasome (Tanaka and
Kasahara 1998). These changes in subunit composition are believed to increase chymotrypsin-like activity to
generate peptides with hydrophobic C-terminal residues
suitable for binding into the cleft of the MHC class I
molecules (Tanaka and Kasahara 1998). In particular, the PSMB8
subunit is a critical component for the production of the
MHC class I binding peptides because PSMB8 knockout
mice show reduced expression of MHC class I molecules
on the cell surface (Fehling et al. 1994). The PSMB8 gene
is believed to have arisen by gene duplication of PSMB5
in a common ancestor of jawed vertebrates, simultaneously
with the appearance of canonical adaptive immunity
(Kasahara 1997).
Amino acid substitutions at the residues involved in
the determination of the cleaving specificity of PSMB8
may affect the recognition repertoire of adaptive immunity, potentially leading to differential susceptibility to
pathogens in individuals with different types of PSMB8.
To date, highly diverged dichotomous types of the PSMB8
gene have been reported in teleost Oryzias species (Miura
et al. 2010), amphibian Xenopus species (Nonaka et al.
2000), and sharks (cartilaginous fish) (Kandil et al. 1996).
The dichotomous types present in these species share
the curious amino acid substitution of Ala or Val, which
have a small side chain (termed A type), versus Phe or
Tyr, which have a bulky aromatic side chain (termed F type)
at the 31st residue of the mature peptide, which is most
probably involved in the determination of cleaving
specificity (Groll et al. 1997; Unno et al. 2002). Mammals
possess only the A-type PSMB8, and human PSMB8 with
V31 exhibits chymotrypsin-like specificity for cleavage after
bulky hydrophobic amino acids (Agarwal et al. 2010). In
contrast, the F-type PSMB8 is predicted to have elastase-like specificity for cleavage after small hydrophobic
amino acids based on the presumed 3D structure of the
F type of the medaka PSMB8 molecule (Tsukamoto
et al. 2005). However, a phylogenetic analysis indicated
that, in these animal groups, the A and F types were derived
via independent evolutionary events (Tsukamoto et al.
2005). Moreover, the A and F types of Oryzias and Xenopus
PSMB8 are alleles that were retained for 60–80 My through
several speciation events by long-term balancing selection
(Nonaka et al. 2000; Miura et al. 2010), whereas those of the
shark PSMB8 are assumed to be paralogous genes (Kandil
et al. 1996).
Research article
© The Author 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please
e-mail: [email protected]
Mol. Biol. Evol. 29(10):3071–3079. 2012 doi:10.1093/molbev/mss113
Advance Access publication April 6, 2012
To elucidate the evolutionary history of the A and F
types of the PSMB8 gene of jawed vertebrates, here, we
performed in silico data mining, phylogenetic analysis,
and segregation or population analysis to clarify whether
the two types are alleles or paralogs.
Materials and Methods
Animals
The India strain of zebrafish, Danio rerio, was supplied by
courtesy of Dr Shinji Takada (National Institute for Basic
Biology, Aichi, Japan) and Dr Sumito Koshida (The
University of Tokyo, Tokyo, Japan). The wild individuals
(N = 107) of the Japanese loach, Misgurnus anguillicaudatus, collected in the Ibaraki prefecture, Japan, were
purchased from Choei-shouten (Tokyo, Japan). The Okutama strain of rainbow trout, Oncorhynchus mykiss, was from
the Ooizumi station of the Tokyo University of Marine Science and Technology, Nagano, Japan. Wild-caught individuals of the banded hound shark (Triakis scyllium) (N = 17)
were supplied by courtesy of Mr Takeshi Nakai (Keikyu
Aburatsubo Marine Park, Kanagawa, Japan).
Blast Search of Expressed Sequence Tag Data for the
PSMB8 Alleles
We searched the coding sequence of PSMB8 among the expressed sequence tag (EST) data of teleost species using the
National Center for Biotechnology Information (NCBI)
BLAST search option (http://blast.ncbi.nlm.nih.gov/Blast.cgi).
The tblastn and tblastx searches were performed using the
coding sequences (nucleotide or amino acid sequence) of
the PSMB8N (F-type PSMB8, NCBI accession number
AB183488) and PSMB8d (A-type PSMB8, NCBI accession
number BA000027) alleles of Oryzias latipes as queries.
Two types of PSMB8 sequences, one with Ala and the other
with Phe at the 31st position of the mature peptide, were
identified in the Cypriniformes zebrafish (D. rerio; NCBI accession numbers BC066288 and BC092889) and Japanese
loach (M. anguillicaudatus; NCBI accession numbers
BJ830309 and BJ838416), and in the Salmoniformes Atlantic
salmon (Salmo salar; NCBI accession numbers DY720791
and DW555236) and rainbow trout (O. mykiss; NCBI accession numbers CA360946 and GE824997).
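The A/F typing used throughout depends only on the residue at position 31 of the mature peptide (Ala or Val for the A type, Phe or Tyr for the F type). A minimal sketch of that rule follows; the function name and the toy peptides are hypothetical placeholders for illustration, not real PSMB8 sequences.

```python
# Sketch: assign a PSMB8 mature peptide to the A or F type from residue 31.
# The residue sets follow the definition in the text; everything else is illustrative.
A_TYPE_RESIDUES = {"A", "V"}  # small side chain
F_TYPE_RESIDUES = {"F", "Y"}  # bulky aromatic side chain


def classify_psmb8_type(mature_peptide, position=31):
    """Return 'A', 'F', or 'unknown' for the residue at a 1-based position."""
    if len(mature_peptide) < position:
        return "unknown"
    residue = mature_peptide[position - 1].upper()
    if residue in A_TYPE_RESIDUES:
        return "A"
    if residue in F_TYPE_RESIDUES:
        return "F"
    return "unknown"


# Placeholder peptides (NOT real sequences): 30 filler residues, then residue 31.
toy_a = "G" * 30 + "A" + "G" * 10
toy_f = "G" * 30 + "F" + "G" * 10
print(classify_psmb8_type(toy_a), classify_psmb8_type(toy_f))
```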
These EST sequences identified included the full coding
sequence of the PSMB8 mature peptide, with the exception
of the Japanese loach EST sequence for the F type (NCBI
accession number BJ838416), which lacked 76 bp at the
5′-end. To obtain the nucleotide sequence of this region,
total RNA was isolated from the caudal fin of loach
individuals possessing the F-type PSMB8 using ISOGEN
(Nippon Gene, Tokyo, Japan) according to the manufacturer’s instructions and was reverse-transcribed to cDNA
by Superscript II (Invitrogen, Carlsbad, CA). The second
exon, which encompassed the missing 76 bp, was amplified
using the cDNA as a template by the F-type specific primers. The forward primer was 5′-ACTGGAATGGACCCGGAGCAGTT-3′ designed at the second coding exon, based
on the F-type sequence of zebrafish (NCBI accession number BC092889); 5′-TCTTTGGCCAGTCTTCTCTCCCAAT-3′ was used as the reverse primer designed at the third
exon of the F-type sequence of Japanese loach (NCBI accession number BJ838416). The polymerase chain reaction
(PCR) was performed using Ex-Taq (Takara Bio Inc., Shiga,
Japan) and the following conditions: denaturation at 95 °C
for 1 min, 35 cycles of denaturation at 95 °C for 30 s, annealing at 62 °C for 30 s and elongation at 72 °C for 60 s,
and final elongation at 72 °C for 3 min. The nucleotide
sequence of the PCR product was determined by direct sequencing. Sequencing reaction was performed using the
BigDye Terminator v3.1 Sequencing Standard kit (Applied
Biosystems, Foster City, CA), and 3100/3130xl Genetic Analyzer (Applied Biosystems) was used for determination of
nucleotide sequences. The 84 bp of the exon 2 sequence
determined (NCBI accession number AB602774) overlapped with the EST sequence of loach with 99% identity,
and we used the combined sequences (NCBI accession
numbers AB602774 and BJ838416) for the phylogenetic
analyses.
Genomic DNA Extraction
Genomic DNA was extracted from fish fins using the Puregene Genomic DNA Purification Kit (Gentra Systems, Minneapolis, MN) according to the manufacturer’s instructions
and was finally dissolved in Tris-ethylenediaminetetraacetic
acid (EDTA) buffer (10 mM Tris, 1 mM EDTA).
PCR Amplification of the PSMB8 Gene of Zebrafish,
Japanese Loach, Rainbow Trout, and Banded Hound
Shark
The PSMB8 gene sequence spanning from the second to
the third exon was amplified using genomic DNA as a template and the following primer sets, which are specific for
the F and A types and were designed based on sequences
from each of the species:
zebrafish F type: 5′-GGACTCGAGAGCATCTGCAGGGTCT-3′ and 5′-TGGCCAGTCTTCTCTCCCAATAAAC-3′
zebrafish A type: 5′-AGACTCCAGAGCTTCTGCTGGAAAA-3′ and 5′-AGCCAGCAGTCTCTCCCAGTACTG-3′
loach F type: 5′-CTCCAGAGCATCTGCTGGGTCTTAC-3′ and 5′-TCTTTGGCCAGTCTTCTCTCCCAAT-3′
loach A type: 5′-CTCCAGAGCATCAGCTGGAAAATAT-3′ and 5′-TCTTTAGCCAGCAGTCTCTCCCAGT-3′
rainbow trout F type: 5′-TGACTCCAGAGCATCTGCAGGATCT-3′ and 5′-TGGCCAACACTCTCTCCCAATAGAC-3′
rainbow trout A type: 5′-CTCCAGGGCCTCAGCTGGCAGC-3′ and 5′-AGCCAGCAGTCTCTCCCAGTACTG-3′
banded hound shark F type: 5′-ATTCCAGAGCATCTGCAGGCTCC-3′ and 5′-CCAGTAGACACAATCTGCAGCACTG-3′
banded hound shark A type: 5′-GACTCCAGAGCATCTGCAGGGAAT-3′ and 5′-CCAGTATTGGCAATCAGCAGCACTT-3′
The PCR condition
was as follows, with the exception of the primer sets used
to amplify the F and A types of banded hound shark: denaturation at 94 °C for 1 min, 35 cycles of denaturation at
94 °C for 30 s, annealing at 64 °C for 30 s and elongation at
72 °C for 1 min, and final elongation at 72 °C for 3 min with
Ex-Taq (Takara Bio Inc.). PCR products were separated in
2% agarose gels. The PCR conditions used for the primer
sets for the two types of banded hound shark were as follows: denaturation at 98 °C for 30 s, 30 cycles of denaturation at 98 °C for 10 s, annealing and elongation at 66 °C for
4 min, and final elongation at 72 °C for 3 min, with LA-Taq
(Takara Bio Inc.). PCR products were separated in 0.8% agarose gels. The nucleotide sequences of the PCR products of
all individuals were determined as described above.
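As a rough check on type specificity, the rainbow trout F-type and A-type primers listed above can be compared position by position. The sketch below is illustrative only: the function name is hypothetical, and the offsets used to bring the two primers into register (dropping the first 3 bases of the F-type forward primer and the first base of the F-type reverse primer) are assumptions made here, not something stated in the text.

```python
# Sketch: count mismatches between two primers of equal length and report
# how many consecutive mismatches sit at the 3' end.
def compare_primers(primer_a, primer_b):
    """Return (total mismatches, length of the mismatch run at the 3' end)."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must be trimmed to the same length first")
    mismatches = [x != y for x, y in zip(primer_a, primer_b)]
    run_at_3_prime = 0
    for is_mismatch in reversed(mismatches):
        if not is_mismatch:
            break
        run_at_3_prime += 1
    return sum(mismatches), run_at_3_prime


# Rainbow trout primers as listed in the text.
trout_f_fwd = "TGACTCCAGAGCATCTGCAGGATCT"
trout_a_fwd = "CTCCAGGGCCTCAGCTGGCAGC"
trout_f_rev = "TGGCCAACACTCTCTCCCAATAGAC"
trout_a_rev = "AGCCAGCAGTCTCTCCCAGTACTG"

# Assumed register: trim the 5' overhang of the longer F-type primers.
fwd = compare_primers(trout_f_fwd[3:], trout_a_fwd)
rev = compare_primers(trout_f_rev[1:], trout_a_rev)
print("forward:", fwd, "of", len(trout_a_fwd), "positions")
print("reverse:", rev, "of", len(trout_a_rev), "positions")
```

Under these assumed offsets, mismatches clustered at the 3′ end are what make cross-amplification of the other type unlikely, since primer extension starts from the 3′ end.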
To design the type-specific primers used to amplify the F
and A types of the banded hound shark, we first determined the cDNA sequences of these types (NCBI accession
numbers AB602772 and AB602773). Total RNA was isolated from the liver of a wild-caught banded hound shark
individual, as described above, and was reverse-transcribed
to cDNA using Superscript II (Invitrogen). The cDNAs were
used as templates for PCR amplification of the F- and A-type sequences spanning from the first to the sixth (last)
coding exon using the following type-specific primer sets,
which were designed based on nucleotide sequences that
are conserved between nurse shark (NCBI accession numbers D64056 and D64057) and horn shark (NCBI accession
numbers AF363583 and AF363582): F type (5′-CTGGTGGTACCGSTTGGTCTGGA-3′ and 5′-CCCGCATGTGATACATGTTAACTGT-3′) and A type (5′-CTGGTGGTACCGSTTGGTCTGGA-3′ and 5′-GACACCTTTATCCACCCATCTTCA-3′). The PCR condition was denaturation at 94 °C for
1 min, 35 cycles of denaturation at 94 °C for 30 s, annealing
at 58 °C for 30 s and elongation at 72 °C for 90 s, and final
elongation at 72 °C for 3 min, with Ex-Taq (Takara Bio Inc.).
The nucleotide sequences of the PCR products were determined as described above.
Phylogenetic Tree Construction
The alignment of multiple nucleotide sequences encoding
mature peptides of PSMB8 and PSMB5 (used as an outgroup) was performed by Clustal X 2.0 (Larkin et al.
2007). Based on the alignments, the short extension present at the 3′ side of some sequences was removed and two
alignment sets, 609 sites with all codon positions (A) and
406 sites without the third codon position (B), were used
for phylogenetic analyses.
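The relation between the two alignment sets (609 sites with all codon positions, 406 sites without the third position) can be illustrated with a short sketch. The function name and the toy sequence are hypothetical; the sketch assumes an in-frame alignment that starts at the first codon position.

```python
# Sketch: derive a "first and second codon positions only" data set from an
# in-frame coding alignment by removing every third site.
def drop_third_codon_positions(aligned_cds):
    """Remove sites 3, 6, 9, ... (1-based) from an in-frame coding sequence."""
    if len(aligned_cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(base for i, base in enumerate(aligned_cds) if i % 3 != 2)


# Toy sequence of 203 codons (609 sites); NOT a real PSMB8 sequence.
toy_cds = "ATG" * 203
reduced = drop_third_codon_positions(toy_cds)
print(len(toy_cds), len(reduced))
```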
A maximum likelihood (ML) tree was constructed using
PAUP*4.0b (Swofford 2000). The parameters of the nucleotide substitution models were calculated by hierarchical
likelihood ratio tests of Modeltest 3.7 (Posada and Crandall
1998) and the TrNef + I + G and HKY + G models were
selected as optimal substitution models for the A and B
sets, respectively. Bootstrap percentages were calculated
based on 100 replications of a full heuristic search.
Prediction of 3D Structures of PSMB8 Molecules
The 3D structures of PSMB8 molecules were predicted
by the Automated Mode of the SWISS-MODEL server (http://swissmodel.expasy.org/) (Arnold et al. 2006) based on
the crystal structure of the bovine PSMB5 molecule (Protein Data Bank ID code; 1iruL) (Unno et al. 2002). The interactive visualizations and analyses of predicted molecular
structures were performed with the UCSF Chimera program
(http://www.cgl.ucsf.edu/chimera/) (Pettersen et al. 2004).
Results
Two Lineages of PSMB8 Gene, the PSMB8F and
PSMB8A Lineages, in Jawed Vertebrates
A Blast search of the EST data deposited in the National
Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/index.html) identified both A and
F types of PSMB8 sequences in basal teleost species, such
as Cypriniformes (zebrafish, D. rerio and Japanese loach,
M. anguillicaudatus) and Salmoniformes (Atlantic salmon,
S. salar and rainbow trout, O. mykiss). The deduced amino
acid sequences of the mature peptide of PSMB8 of these
species were aligned using Clustal X 2.0 (Larkin et al.
2007), as were those of the shark, Oryzias, Xenopus, and
human PSMB8. These amino acid sequences were aligned
without any insertions/deletions, with the exception of
a short extension at the C-terminus of some sequences
(supplementary fig. S1, Supplementary Material online).
The ML tree constructed based on the nucleotide sequences of the coding region of the mature peptide, using the
human, Oryzias, and lamprey PSMB5 sequences as an outgroup, indicated the presence of two dichotomous clades
supported by high bootstrap percentages of 95% and 95%,
respectively (fig. 1). Exclusion of the third codon position
with possibly saturated substitutions did not change the
topology of the tree, and the presence of these dichotomous clades was also supported by high bootstrap percentages of 96% and 90%, respectively (fig. 1). One clade
contained only the PSMB8 sequences of sharks and basal
teleosts with F31; all other sequences belonged to the other
clade. These two clades contained genes from both cartilaginous and bony fish, indicating the presence of two
ancient lineages, termed the PSMB8A and PSMB8F lineages,
which are shared by Chondrichthyes and Teleostomi and
were thus maintained for more than 500 My (Blair and
Hedges 2005). In addition to the sequences shown here,
the BLAST search identified the presence of PSMB8A lineage sequences, but not PSMB8F lineage sequences, in many
mammalian species, in a reptilian green anole, and in
several higher teleost species, suggesting that the PSMB8F
lineage was lost at least twice during vertebrate evolution
in the tetrapod and teleost lineages. Interestingly, however, the apparent functional equivalents of the PSMB8F
lineage members that have Phe or Tyr at the 31st position
of the mature peptide seem to be regenerated from the
PSMB8A lineage in the Oryzias (medaka) and Xenopus
lineages.
[Figure 1: ML tree. Tips in the PSMB8A lineage clade: Human (V), Xenopus (F), Xenopus (A), Medaka (Y), Medaka (V), Rainbow trout (A), Atlantic salmon (A), Japanese loach (A), Zebrafish (A), Horn shark (A), Nurse shark (A). Tips in the PSMB8F lineage clade: Rainbow trout (F), Atlantic salmon (F), Japanese loach (F), Zebrafish (F), Horn shark (F), Nurse shark (F). Outgroup: Medaka, Human, and Lamprey PSMB5. Scale bar: 0.05.]
FIG. 1. ML phylogenetic tree based on the coding nucleotide sequences with the three codon positions of PSMB8 mature peptides. An ML tree
was constructed based on the TrNef + I + G model. The human, medaka, and sea lamprey PSMB5 coding nucleotide sequences were used as
an out-group. Bootstrap percentages >50%, based on 100 replicates, are shown on each branch. The numbers in parentheses indicate the
bootstrap percentages (>50%) from an ML analysis based on the coding nucleotide sequences, excluding the third codon position (the HKY +
G model). The 31st residue of the mature peptide is shown in parentheses after the name of the species. The members of the nucleotide
sequences of the PSMB8 mature peptide used for phylogenetic analyses are the same as those shown in supplementary figure S1
(Supplementary Material online). An alignment of these sequences was performed based on supplementary figure S1 (Supplementary Material
online), and the accession numbers of these sequences are shown in the legend of supplementary figure S1 (Supplementary Material online).
The neighbor joining (NJ) tree constructed based on the
p-distance of amino acid sequences also indicated the presence of the dichotomous clades, the PSMB8F and PSMB8A
lineages, supported by the 98% bootstrap percentages
(supplementary fig. S2, Supplementary Material online).
However, resolution within these two clades seems to
be poor, and the shark sequences are located at the inappropriate positions in both clades of the NJ tree.
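For readers unfamiliar with the distance measure behind the NJ tree, the p-distance between two aligned amino acid sequences is simply the proportion of compared sites that differ. A minimal sketch follows; the function name, the choice to skip gapped sites, and the toy sequences are assumptions for illustration and are not taken from the authors' analysis.

```python
# Sketch: p-distance = (number of differing sites) / (number of compared sites).
def p_distance(seq1, seq2, gap="-"):
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue  # ignore sites where either sequence has a gap
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy aligned peptides (NOT real PSMB8 sequences): 2 differences in 10 sites.
toy_1 = "MKTAYIAKQR"
toy_2 = "MKTAFIAKQK"
print(p_distance(toy_1, toy_2))
```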
As shown in figure 2, 96 of the 204 positions of the
aligned mature peptides showed amino acid substitutions,
and the allowance of up to two residues for each group led
to the recognition of lineage-specific substitutions at eight
positions: I13M, Q/K53V/C, L65I, C/S84A, M87V, S99T, G/
R156A, and R/C157Q/E. In contrast, the type-specific substitutions that are possibly linked to cleaving specificity were
recognized at only two positions, S28T and V/A31Y/F. These
results apparently suggest parallel evolution from S28 and
V/A31 to T28 and Y/F31 in common ancestors of Oryzias and
Xenopus. However, another scenario without parallel
evolution is also imaginable, as discussed below.
FIG. 2. Comparison of the amino acid sequences of the mature peptides of PSMB8. The alignment of these PSMB8 sequences was performed
based on supplementary figure S1 (Supplementary Material online), excluding the PSMB5 sequences and the human PSMB8 sequence. Ninety-six of the 204 positions of the aligned mature peptides showed amino acid substitutions and are depicted here. Dots indicate the identity of the
residues with the uppermost sequence. The diagnostic residues for discrimination between the PSMB8F and PSMB8A lineages and between the
F and A types are indicated by * and # under the alignment, respectively. The 31st residue of the mature peptide is shown in parentheses after
the name of the species.
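The criterion for lineage-specific substitutions described above (each group may use at most two residues at a position) can be read as: a column is diagnostic when each group shows no more than two distinct residues and the two residue sets do not overlap. The sketch below implements that reading; the exact rule, the function name, and the toy alignment are assumptions for illustration, not the authors' procedure or data.

```python
# Sketch: find alignment columns that separate two groups of sequences,
# allowing up to `max_residues` distinct residues within each group.
def diagnostic_positions(group_1, group_2, max_residues=2):
    """Return 1-based column indices where the two groups use disjoint residue sets."""
    length = len(group_1[0])
    if any(len(s) != length for s in group_1 + group_2):
        raise ValueError("all sequences must have the same aligned length")
    positions = []
    for col in range(length):
        residues_1 = {seq[col] for seq in group_1}
        residues_2 = {seq[col] for seq in group_2}
        small_enough = len(residues_1) <= max_residues and len(residues_2) <= max_residues
        if small_enough and residues_1.isdisjoint(residues_2):
            positions.append(col + 1)
    return positions


# Toy four-column alignment (NOT real data):
# col 1: I vs M (diagnostic); col 2: Q/K vs V/C (diagnostic);
# col 3: shared L (not diagnostic); col 4: three residues in one group (not diagnostic).
toy_a_lineage = ["IQLS", "IKLT", "IQLG"]
toy_f_lineage = ["MVLA", "MCIA"]
print(diagnostic_positions(toy_a_lineage, toy_f_lineage))
```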
Dichotomous PSMB8 Lineages Are Alleles in
Cypriniformes and Salmoniformes
Because the A and F types of Oryzias and Xenopus PSMB8
are alleles (Nonaka et al. 2000; Miura et al. 2010), whereas
those of sharks are predicted to be paralogs (Kandil et al.
1996), segregation analysis of the PSMB8A and PSMB8F lineages was performed in basal teleosts and sharks to assess
whether they are alleles or paralogs. Preliminary typing of
several individuals of the India strain of zebrafish from
a closed colony by genomic PCR using PSMB8F and
PSMB8A lineage-specific primers revealed that some animals had only PSMB8A, some had only PSMB8F, and all
others had both, indicating that the PSMB8A and PSMB8F
are alleles. To confirm this conclusion, one male and one
female possessing both PSMB8A and PSMB8F were crossed
and their 96 progeny were typed for PSMB8. Twenty-seven
progeny possessed only PSMB8A, 46 possessed both
PSMB8A and PSMB8F, and 23 possessed only PSMB8F
(fig. 3A). A chi-square test did not reject a ratio of 1:2:1
for the three genotypes observed (F/F:F/A:A/A; P > 0.05), supporting the idea that PSMB8A and PSMB8F are
alleles at a single locus in zebrafish. Indeed, the zebrafish
genome sequence (Zv9 assembly) of the Ensembl
genome databases (http://www.ensembl.org/Danio_rerio/Info/Index) indicates that there is only one PSMB8 locus
in the zebrafish genome. We then analyzed Japanese loach,
for which no laboratory stock was available, using 107 wild
individuals. Twenty possessed only PSMB8A, 54 possessed
both PSMB8A and PSMB8F, and 33 possessed only PSMB8F,
suggesting that PSMB8A and PSMB8F are alleles also in the
loach. These results indicate that PSMB8A and PSMB8F
segregate as alleles at a single locus in Cypriniformes.
FIG. 3. PSMB8 typing in zebrafish, rainbow trout, and banded hound shark. The PSMB8 gene region spanning from exons 2 to 3 was amplified
by genomic PCR using lineage-specific primers. (A) PSMB8 typing of offspring generated from the crossing of one male and one female
possessing both PSMB8A and PSMB8F in zebrafish. Of the 96 samples typed, the results for 16 samples are shown together with those of the
father (F) and mother (M). The genotypes of the offspring shown were as follows: PSMB8F/PSMB8F (numbers 4, 7, 14, and 16), PSMB8F/PSMB8A
(numbers 1, 6, 8, 9, 10, 11, 13, and 15), and PSMB8A/PSMB8A (numbers 2, 3, 5, and 12). (B) A pair of rainbow trout and their 41 offspring were
typed. Two PSMB8A bands of 499 bp (A1) and 751 bp (A2) and two PSMB8F bands of 266 bp (F1) and 302 bp (F2) were identified. Of the 41
offspring typed, the results for 21 offspring are shown together with those of the father (F) with PSMB8A1/PSMB8F1/PSMB8F2 and mother (M)
with PSMB8A1/PSMB8A2/PSMB8F2. For the locus harboring the PSMB8A2, PSMB8F1, and PSMB8F2 alleles, the genotypes of the offspring shown
were as follows: PSMB8A2/PSMB8F2 (numbers 2, 3, 6, 7, and 11–13), PSMB8A2/PSMB8F1 (numbers 4, 16, 18, 19, and 21), PSMB8F2/PSMB8F2
(numbers 5, 9, 17, and 20), and PSMB8F1/PSMB8F2 (numbers 1, 8, 10, 14, and 15). (C) The 17 wild individuals of banded hound shark were
typed. PSMB8A band of about 6.5 kb and PSMB8F band of about 3 kb were identified from all samples.
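The genotype counts reported above can be checked with a simple chi-square goodness-of-fit calculation. The sketch below tests the zebrafish cross against the 1:2:1 ratio and, as an additional illustration that is not part of the reported analysis, compares the wild loach sample with Hardy-Weinberg proportions. Function and variable names are hypothetical; the P values use the closed forms for 2 and 1 degrees of freedom.

```python
import math


def chi_square(observed, expected):
    """Pearson chi-square statistic for matched observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Zebrafish cross: F/F, F/A, A/A counts among 96 progeny, expected 1:2:1.
zebrafish_obs = [23, 46, 27]
n_zf = sum(zebrafish_obs)
zebrafish_exp = [n_zf * 0.25, n_zf * 0.50, n_zf * 0.25]
zebrafish_stat = chi_square(zebrafish_obs, zebrafish_exp)
# Survival function of the chi-square distribution with 2 degrees of freedom.
zebrafish_p = math.exp(-zebrafish_stat / 2)

# Wild loach: F/F, F/A, A/A counts among 107 individuals.
# Illustrative Hardy-Weinberg check (an assumption of this sketch, not in the text).
loach_obs = [33, 54, 20]
n_loach = sum(loach_obs)
freq_f = (2 * loach_obs[0] + loach_obs[1]) / (2 * n_loach)
freq_a = 1 - freq_f
loach_exp = [n_loach * freq_f ** 2, n_loach * 2 * freq_f * freq_a, n_loach * freq_a ** 2]
loach_stat = chi_square(loach_obs, loach_exp)
# One allele frequency is estimated from the data, leaving 1 degree of freedom.
loach_p = math.erfc(math.sqrt(loach_stat / 2))

print(f"zebrafish: chi2 = {zebrafish_stat:.3f}, P = {zebrafish_p:.3f}")
print(f"loach:     chi2 = {loach_stat:.3f}, P = {loach_p:.3f}")
```

Neither comparison approaches the 0.05 threshold, which is consistent with single-locus allelic segregation in both species.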
Typing in Salmoniformes is complicated because of the
presence of two PSMB8 loci generated by a recent tetraploidization (Shiina et al. 2005; Lukacs et al. 2007). A pair
of rainbow trout and their 41 progeny were typed by
genomic PCR amplification spanning exons 2–3 using
the PSMB8A-specific and PSMB8F-specific primers. The
nucleotide sequences of PSMB8A and PSMB8F corresponding to the forward primers showed 8 mismatches
of 22 positions, including four positions at the 3′ end of
the primers. Similarly, the reverse primer sequences differed
at 7 of 24 positions, including three positions at the 3′ end. Thus, it is highly unlikely that these
primers cross-amplify the other type of PSMB8 gene. Both the
PSMB8A-specific and PSMB8F-specific primers detected
two different-sized bands in this family (fig. 3B). Nucleotide
sequence analysis of each band of the parents indicated
that the F1 (266 bp), F2 (302 bp), and A2 (751 bp) bands
contained a single sequence, whereas the A1 (499 bp)
bands of both the mother and the father contained double
sequences showing double peaks at a few nucleotide positions. Cloning analysis of the A1 band of the parents identified three different sequences; one was common to the
mother and father (A1a), and the other two were unique
to the father (A1b) and mother (A1c) (supplementary fig.
S3A, Supplementary Material online). These three sequences showed only a few nucleotide differences to each other
of the 453 nucleotide positions compared, and A1c exhibited 100% identity with the published PSMB8 sequence
in the Onmy-IB region (Shiina et al. 2005) (supplementary
fig. S3A, Supplementary Material online). These results indicate that the A1a, A1b, and A1c bands represent alleles at
one PSMB8 locus in the Onmy-IB region. Actually, they segregated as alleles in the progeny (supplementary table S1,
Supplementary Material online). As shown in supplementary figure S3A (Supplementary Material online), the
intronic sequence of A2 showed a marked difference with
that of A1 bands. The sequences of two PSMB8F bands (F1,
266 bp and F2, 302 bp) were identical, with the exception of
a 36-bp long insertion/deletion located in the intronic
region (supplementary fig. S3B, Supplementary Material
online). These three bands, A2, F1, and F2, segregated
as alleles at the other locus, resulting in 16 A2/F2, 10
A2/F1, 8 F2/F2, and 7 F1/F2 progeny (supplementary table
S1, Supplementary Material online). In particular, progeny
with the F1 band also had either the A2 or F2 band, but
never both, suggesting that the A2 and F2 bands of the
mother segregate as alleles (fig. 3B). Thus, PSMB8A and
PSMB8F are alleles also in at least one locus of rainbow
trout. These results indicate that the two PSMB8 lineages
seem to have been retained as alleles for more than 300 My
after the divergence of Cypriniformes and Salmoniformes
(Yamanoue et al. 2006) as a trans-order polymorphism.
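The rainbow trout segregation result can be illustrated by enumerating the expected progeny classes at the locus carrying A2, F1, and F2. The sketch below assumes, from the parental band patterns, that the father is F1/F2 and the mother A2/F2 at this locus, so that four genotype classes are expected in equal proportions. The chi-square comparison is an illustration added here, not a test reported in the text, and the names are hypothetical.

```python
from collections import Counter
from itertools import product

# Assumed parental genotypes at the locus harboring A2, F1, and F2.
father = ("F1", "F2")
mother = ("A2", "F2")

# Enumerate the four equally likely gamete combinations.
expected_classes = Counter("/".join(sorted(pair)) for pair in product(father, mother))

# Progeny counts reported in the text.
observed = {"A2/F2": 16, "A2/F1": 10, "F2/F2": 8, "F1/F2": 7}
n_progeny = sum(observed.values())
n_combinations = sum(expected_classes.values())

trout_stat = 0.0
for genotype, weight in expected_classes.items():
    expected_count = n_progeny * weight / n_combinations
    trout_stat += (observed[genotype] - expected_count) ** 2 / expected_count

# Critical value of the chi-square distribution for 3 degrees of freedom at 0.05.
CRITICAL_DF3 = 7.815
print(sorted(expected_classes), f"chi2 = {trout_stat:.3f}", trout_stat < CRITICAL_DF3)
```

Under these assumptions the statistic stays below the critical value, so the observed counts do not contradict a 1:1:1:1 expectation for alleles at one locus.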
Two Lineages of PSMB8 Are Paralogs in Sharks
The shark PSMB8A and PSMB8F were suggestive of paralogous genes because by Southern blot analysis, all four
nurse shark (Ginglymostoma cirratum) individuals analyzed
had both sequences and these two sequences showed only
76.0% and 79.4% identities at the nucleotide and amino
acid levels, respectively (Kandil et al. 1996). In addition,
Northern and Southern blotting analysis suggested that
the nurse shark PSMB8 paralogs are pseudoalleles behaving
like usual alleles (Ohta et al. 2002). However, the lack of
genome sequence information for elasmobranch species
precludes direct confirmation of this possibility. Thus, we analyzed
17 wild-caught banded hound shark (T. scyllium) individuals by genomic PCR using the PSMB8A-specific and
PSMB8F-specific primers to get supportive evidence for this
idea. Both the PSMB8A-specific and PSMB8F-specific bands
were detected in all analyzed individuals (fig. 3C). Since
these individuals were caught by fisherman’s gill nets on
several occasions and showed wide variation in body size,
it was presumed that there was no intimate genetic relationship between them. Thus, these results strongly suggest
that the shark PSMB8A and PSMB8F lineages represent
paralogous genes rather than alleles.
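A back-of-the-envelope calculation shows why 17 of 17 individuals carrying both sequences argues against alleles at a single locus. The sketch below assumes unrelated individuals, random mating, and a single biallelic locus (assumptions made here for illustration, not a calculation from the text): the heterozygote frequency 2pq can then never exceed 0.5, which bounds the probability that every sampled individual is heterozygous.

```python
# Sketch: upper bound on the chance that all sampled individuals are
# heterozygous if the two sequences were alleles in Hardy-Weinberg proportions.
def heterozygote_frequency(p):
    """Expected heterozygote frequency 2pq for allele frequency p."""
    return 2 * p * (1 - p)


def max_prob_all_heterozygous(n_individuals, grid_steps=1000):
    """Maximize (2pq)^n over a grid of allele frequencies between 0 and 1."""
    best_het = max(heterozygote_frequency(i / grid_steps) for i in range(grid_steps + 1))
    return best_het ** n_individuals


upper_bound = max_prob_all_heterozygous(17)
print(f"{upper_bound:.2e}")
```

Even at the most favorable allele frequency (p = 0.5), the bound is 0.5 to the 17th power, which is below one in 100,000.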
Discussion
We clarified that dichotomous PSMB8 allelic lineages, the
PSMB8F and PSMB8A lineages, have persisted for more
than 300 My in the basal teleost fish, Cypriniformes and
Salmoniformes. As the longest persistence time of
a trans-species polymorphism reported to date is only
50–80 My (Satta et al. 1996; Su and Nei 1999; Nonaka
et al. 2000; Esteves et al. 2005; Klein et al. 2007; Miura
et al. 2010), the persistence time of the PSMB8 allelic dimorphism of basal teleosts is unprecedentedly long, suggesting
the presence of an extremely stringent balancing selection,
most probably reflecting a potential advantage of possessing dual specificity for MHC class I antigen processing. Our
conclusion that the dichotomous lineages of the PSMB8
gene are alleles in zebrafish and rainbow trout was based
on the segregation analysis, which cannot discriminate
between real alleles and pseudoalleles formed by
differential silencing of one of the tandemly duplicated paralogs. However, the genome sequences of the MHC class I
regions of zebrafish and rainbow trout harboring the
PSMB8A lineage show no evidence of such tandem
duplication (Michalova et al. 2000; Shiina et al. 2005).
Although the final conclusion must await physical
analysis of the genomic region of these species harboring
the PSMB8F lineage, pseudoallelic status is highly unlikely
since the gene order and orientation around the PSMB8
gene, TAP2-PSMB9-PSMB9-like-PSMB10-PSMB8-MHC class
I, is perfectly conserved by zebrafish, rainbow trout, fugu,
and the Hd-rR and HNI strains of medaka possessing the A- and F-type PSMB8, respectively. These results indicate that
the gene organization of this genomic region has been stable throughout teleost evolution (Lukacs et al. 2007).
FIG. 4. The supposed evolutionary history of dichotomous PSMB8 lineages in jawed vertebrates. The tree represents the phylogenetic
relationship among jawed vertebrate species, and the presence of the PSMB8A and PSMB8F lineages is indicated by black and gray lines, respectively.
The line lengths do not reflect the genetic distances. The PSMB8A and PSMB8F lineages were established as alleles or paralogs in the common
ancestor of jawed vertebrates. The shark (cartilaginous fish) possesses these two lineages as paralogs, whereas these lineages exist as alleles in
basal teleosts, Cypriniformes (carps) and Salmoniformes (salmons). The loss of the PSMB8F lineage occurred at least twice in the higher teleost
and tetrapod lineages as indicated by gray arrows. The F-type alleles were revived de novo within the PSMB8A lineage independently at least
twice in the amphibian Xenopus and teleostean Oryzias species as shown by black arrows.
An alternative explanation of the phylogenetic tree shown
in figure 1 is that the apparent presence of ancient lineages
of the PSMB8 gene is due to convergent evolution. However, seven of eight observed lineage-specific substitutions
(fig. 2) are located at the various parts of the steric structure of PSMB8 (data not shown) likely irrelevant to cleaving
specificity of PSMB8. Only the Q/K53V/C substitution
found at the residue involved in S1 pocket formation could
have functional importance. The apparently revived F-type
PSMB8 of Xenopus and medaka have the PSMB8A-type residues at the seven substituted positions located outside of
the S1 pocket, supporting the idea that these positions are
irrelevant to the cleaving specificity. In addition, the amino
acid sequences of the A- and F-type PSMB8 of spotted
green pufferfish, Tetraodon nigroviridis (NCBI accession
number CR697191 and CR691449) show only two amino
acid substitutions, A31F and V53M, suggesting that these
two substitutions are enough to change the cleaving
specificity. Since convergent evolution at the sites irrelevant to the cleaving specificity is difficult to imagine, these
results indicate that the observed lineages represent real
lineages.
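The substitution notation used above (for example, A31F) can be derived mechanically from a pair of aligned sequences. The following is a minimal sketch, assuming the two sequences are already aligned to equal length; the function name `list_substitutions`, the `first_position` parameter, and the toy sequences are illustrative assumptions and are not taken from the paper or its data.

```python
def list_substitutions(seq_a: str, seq_b: str, first_position: int = 1) -> list[str]:
    """List differences between two pre-aligned, equal-length sequences.

    Each difference is written as <residue in seq_a><position><residue in seq_b>,
    e.g. "A31F". Positions are alignment columns counted from first_position.
    Columns with a gap ("-") in either sequence are skipped (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for offset, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == "-" or b == "-":
            continue  # gaps are not substitutions
        if a != b:
            substitutions.append(f"{a}{first_position + offset}{b}")
    return substitutions


# Toy fragments invented for illustration only (not real PSMB8 sequences).
toy_a_type = "TTVLAGGAVQ"
toy_f_type = "TTVLFGGAVQ"
print(list_substitutions(toy_a_type, toy_f_type, first_position=27))
```

With an ungapped alignment that starts at Thr1, `first_position=1` makes the reported positions coincide with the residue numbering used in the text.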
The presumption that the PSMB8F lineage was lost in
the higher teleost and tetrapod lineages is a curious aspect
of the PSMB8 case. However, PSMB8A lineage molecules
with F31 or Y31, and supposedly with a cleaving specificity
similar to that of PSMB8F lineage molecules, were
established independently at least twice during the evolution of the teleost Oryzias and amphibian Xenopus lineages.
Once reestablished, the dimorphism of the PSMB8 gene in
these animal lineages was retained for more than 30–60 My
(Miura et al. 2010) and 80 My (Nonaka et al. 2000), respectively. These results suggest that the PSMB8F lineage was
lost in common ancestors of higher teleosts and tetrapods
hundreds of millions of years ago and that the F-type alleles
were revived de novo within the PSMB8A lineage independently at least twice tens of millions of years ago (fig. 4).
This scenario implies extremely strong balancing selection, capable of reviving the dimorphism multiple times.
It is therefore puzzling that the dimorphism was absent for so long between the loss of the F lineage and the revival of
the F types. Another conceivable scenario is that most of
the PSMB8F lineage sequence, with the exception of the
close vicinity of the F31 residue, was replaced with the
PSMB8A lineage sequence by homologous recombination
or gene conversion between alleles in common ancestors of
higher teleosts and tetrapods hundreds of millions of years
ago, creating PSMB8A lineage sequences with F31, and
that such sequence homogenization was repeated multiple
times thereafter between the two PSMB8A lineage sequences with A31 and F31. If the actual PSMB8 evolution followed this scenario, the A and F types have persisted
throughout jawed vertebrate evolution in spite of an
apparent absence of the F type in some parts of the phylogenetic tree (fig. 4). The actual evolutionary scenario of
this curious polymorphism of the PSMB8 gene in jawed
vertebrates remains to be clarified.
Unlike basal teleosts, whose PSMB8A and PSMB8F lineages
are present as alleles, cartilaginous fish possess them as paralogous genes, raising the question of whether the common
ancestor of the jawed vertebrates had these two lineages as
alleles or paralogous genes. The divergence between the
PSMB8A and PSMB8F alleles (65.8–71.6% and 71.6–74.0%
identities at the nucleotide and amino acid levels, respectively) of basal teleosts seems to be too great for normal
alleles, apparently supporting the scenario that they diverged
as paralogous genes and then were converted to alleles. If
this is the case, the PSMB8A and PSMB8F alleles of basal
teleosts are paralogous alleles. However, the revived allelic
dimorphism of Oryzias and Xenopus also shows a high
degree of sequence divergence (82.7–82.9% and 86.8–89.8%
identities at the nucleotide and amino acid levels, respectively) without any evidence that the alleles were once
paralogous. Thus, at present, we cannot rule out the alternative possibility that the two PSMB8 lineages started as alleles
and then were converted to paralogs in cartilaginous fish.
Analysis of the basal groups of both Actinopterygii and Chondrichthyes is expected to provide an answer to this question.
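Identity values such as those quoted above are computed from pairwise alignments. The sketch below shows one common way to do this, assuming pre-aligned sequences and excluding columns that contain a gap from the denominator. The paper does not state how gaps were treated, so that choice and the function name `percent_identity` are assumptions made here for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns with a gap ("-") in either sequence are excluded from both the
    numerator and the denominator (an assumption; other conventions exist).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy nucleotide alignment invented for illustration: 4 of 5 ungapped columns match.
print(percent_identity("ACGT-A", "ACGAAA"))
```

The same function applies to nucleotide or amino acid alignments, since it only compares characters column by column.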
In the MHC class I antigen presentation system, the antigen peptides bound to the MHC class I molecules are usually 8–9 amino acids long, and the immunoproteasome is
responsible for cleavage at the C-terminal side of these
peptides. In the PSMBs with proteinase activity, Thr1
serves as a single-residue active site, providing the nucleophile
and acting as the general base for the hydrolysis of the peptide bond, whereas the cleaving specificity is determined by
the 20th, 31st, 35th, 45th, 49th, and 53rd residues forming
the S1 pocket (Groll et al. 1997) (supplementary fig. S1,
Supplementary Material online). To infer the cleaving specificity of the A and F types of the PSMB8 molecules, we
predicted the 3D structures of human PSMB8 (A type)
and the A and F types of zebrafish, horn shark, and medaka
PSMB8 based on the published 3D structure of the bovine
PSMB5 molecule (Unno et al. 2002) using the Automated Mode
of the SWISS-MODEL server (http://swissmodel.expasy.org/)
(Arnold et al. 2006) (fig. 5). Consistent with the reported chymotrypsin-like activity of human PSMB8, which is known to be the
A type (Agarwal et al. 2010), all A-type PSMB8 molecules of these animals have a wide opening at the entrance of the S1 pocket,
which could allow insertion of bulky aromatic side
chains. In contrast, the entrance of the S1 pocket of the
F-type PSMB8 is much narrower, allowing insertion of only
smaller hydrophobic side chains. Thus, elastase-like specificity is predicted for the F-type PSMB8. The dual specificities of
PSMB8 conferred by the A and F types could be advantageous
at the population level in coping with various kinds of
intracellular pathogens.
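The residue numbering used in this paragraph can be illustrated with a short sketch that pulls out the six S1 pocket positions from a mature subunit sequence and assigns the A or F type from residue 31 (A31 for the A type; F31 or Y31 for the F type, as described in the text). It assumes an ungapped mature sequence beginning at Thr1; the function names and the synthetic test sequence are hypothetical, and the sketch makes no structural prediction of the kind performed with SWISS-MODEL.

```python
# S1 pocket positions named in the text, numbered from Thr1 of the mature subunit.
S1_POCKET_POSITIONS = (20, 31, 35, 45, 49, 53)


def s1_pocket_residues(mature_seq: str) -> dict[int, str]:
    """Return the residues at the six S1 pocket positions.

    mature_seq is assumed to be the ungapped mature subunit starting at Thr1.
    """
    if not mature_seq or mature_seq[0] != "T":
        raise ValueError("mature sequence is expected to start with Thr1")
    if len(mature_seq) < max(S1_POCKET_POSITIONS):
        raise ValueError("sequence is too short to cover the S1 pocket positions")
    return {pos: mature_seq[pos - 1] for pos in S1_POCKET_POSITIONS}


def classify_psmb8_type(mature_seq: str) -> str:
    """Assign "A" or "F" from residue 31; anything else is "unknown"."""
    residue_31 = s1_pocket_residues(mature_seq)[31]
    if residue_31 == "A":
        return "A"
    if residue_31 in ("F", "Y"):
        return "F"
    return "unknown"


# Synthetic sequence for illustration only: Thr1 followed by glycines,
# with phenylalanine placed at position 31.
residues = list("T" + "G" * 59)
residues[30] = "F"
print(classify_psmb8_type("".join(residues)))
```

Classifying on residue 31 alone is a simplification; the text indicates that position 53 also differs between the types and may contribute to specificity.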
FIG. 5. Predicted 3D structures of the A- and F-type PSMB8
molecules. The 3D structures of the A-type PSMB8 molecule in
human (NCBI accession number NP_004150) (A), the A- and F-type
PSMB8 molecules in zebrafish (NCBI accession numbers BC066288
and BC092889) (B), horn shark (NCBI accession numbers AF363583
and AF363582) (C), and medaka (NCBI accession numbers AB183488
and BA000027) (D) were predicted based on the steric structure of
the bovine PSMB5 molecule (PDB ID: 1iruL) using the SWISS-MODEL server
(http://swissmodel.expasy.org/). The six residues forming the S1
pocket are indicated by orange (20th position), red (31st), yellow
(35th), green (45th), cyan (49th), and magenta (53rd), respectively.
The threonine residue acting as the general base for the hydrolysis
reaction is shown in blue. These views are from the inside of the S1
pocket, and Thr1 of the A type is visible, whereas Thr1 of the F type is
hardly visible due to bulky side chains of Phe31 or Tyr31.
Our preliminary analysis in amphibians and reptiles identified both the A- and F-type PSMB8 in newt (Cynops pyrrhogaster), gecko (Gekko japonicus), and turtle (Trachemys
scripta) (Huang CH, Tanaka Y, Nonaka M, unpublished
data), suggesting that the lack of PSMB8 dimorphism in
placental mammals is rather an exceptional case. In this
context, it is interesting to note that the PSMB8 and class
IA genes are tightly linked in teleosts (Lukacs et al. 2007),
Xenopus (Ohta et al. 2006), and green anole (NCBI accession
number NW_003339585), whereas they are separated
by more than 1 Mb in most placental mammals (Kelley
et al. 2005). It is possible, therefore, that the dimorphism
of the PSMB8 gene is meaningful only when it is tightly
linked with the class IA gene. If this is the case,
the pattern of polymorphism of the class IA gene is expected to differ between placental mammals
and other gnathostomes, although such a comparison has yet
to be performed.
The present evolutionary analysis of the PSMB8 gene revealed dichotomous lineages retained by basal teleosts
and sharks for more than 500 My either as alleles or
paralogs, and a long-term trans-order polymorphism
that has persisted for more than 300 My in basal teleosts. These
unprecedented observations on genetic polymorphism indicate the presence of extremely strong selective pressure
for possessing dual specificities of PSMB8 in populations,
although the actual evolutionary mechanism remains to
be clarified.
Supplementary Material
Supplementary figures S1–S3 and table S1 are available at
Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Drs Masatoshi Nei and Jan Klein for critical reading
of the manuscript and Drs Shinji Takada, Sumito Koshida,
and Mr Takeshi Nakai for supplying us with fish samples.
This work was supported in part by a Grant-in-Aid for
Scientific Research on Priority Area, MEXT, Japan (to M.N.).
References
Agarwal AK, Xing C, DeMartino GN, Mizrachi D, Hernandez MD,
Sousa AB, Martinez de Villarreal L, dos Santos HG, Garg A. 2010.
PSMB8 encoding the beta5i proteasome subunit is mutated in
joint contractures, muscle atrophy, microcytic anemia, and
panniculitis-induced lipodystrophy syndrome. Am J Hum Genet.
87:866–872.
Arnold K, Bordoli L, Kopp J, Schwede T. 2006. The SWISS-MODEL
workspace: a web-based environment for protein structure
homology modelling. Bioinformatics 22:195–201.
Blair JE, Hedges SB. 2005. Molecular phylogeny and divergence times
of deuterostome animals. Mol Biol Evol. 22:2275–2284.
Esteves PJ, Lanning D, Ferrand N, Knight KL, Zhai SK, van der Loo W.
2005. The evolution of the immunoglobulin heavy chain variable
region (IgVH) in Leporids: an unusual case of transspecies
polymorphism. Immunogenetics 57:874–882.
Fehling HJ, Swat W, Laplace C, Kuhn R, Rajewsky K, Muller U, von
Boehmer H. 1994. MHC class I expression in mice lacking the
proteasome subunit LMP-7. Science 265:1234–1237.
Groll M, Ditzel L, Lowe J, Stock D, Bochtler M, Bartunik HD, Huber R.
1997. Structure of 20S proteasome from yeast at 2.4 Å
resolution. Nature 386:463–471.
Kandil E, Namikawa C, Nonaka M, Greenberg AS, Flajnik MF,
Ishibashi T, Kasahara M. 1996. Isolation of low molecular mass
polypeptide complementary DNA clones from primitive
vertebrates. Implications for the origin of MHC class I-restricted
antigen presentation. J Immunol. 156:4245–4253.
Kasahara M. 1997. New insights into the genomic organization and
origin of the major histocompatibility complex: role of
chromosomal (genome) duplication in the emergence of the
adaptive immune system. Hereditas 127:59–65.
Kelley J, Walter L, Trowsdale J. 2005. Comparative genomics of major
histocompatibility complexes. Immunogenetics 56:683–695.
Klein J, Sato A, Nikolaidis N. 2007. MHC, TSP, and the origin of
species: from immunogenetics to evolutionary genetics. Annu
Rev Genet. 41:281–304.
Larkin MA, Blackshields G, Brown NP, et al. (13 co-authors). 2007.
Clustal W and Clustal X version 2.0. Bioinformatics 23:2947–2948.
Lukacs MF, Harstad H, Grimholt U, et al. (11 co-authors). 2007.
Genomic organization of duplicated major histocompatibility
complex class I regions in Atlantic salmon (Salmo salar). BMC
Genomics 8:251.
Michalova V, Murray BW, Sultmann H, Klein J. 2000. A contig map
of the Mhc class I genomic region in the zebrafish reveals
ancient synteny. J Immunol. 164:5296–5305.
Miura F, Tsukamoto K, Mehta RB, Naruse K, Magtoon W,
Nonaka M. 2010. Transspecies dimorphic allelic lineages of the
proteasome subunit beta-type 8 gene (PSMB8) in the teleost
genus Oryzias. Proc Natl Acad Sci U S A. 107:21599–21604.
Nonaka M, Yamada-Namikawa C, Flajnik MF, Du Pasquier L. 2000.
Trans-species polymorphism of the major histocompatibility
complex-encoded proteasome subunit LMP7 in an amphibian
genus, Xenopus. Immunogenetics 51:186–192.
Ohta Y, Goetz W, Hossain MZ, Nonaka M, Flajnik MF. 2006.
Ancestral organization of the MHC revealed in the amphibian
Xenopus. J Immunol. 176:3674–3685.
Ohta Y, McKinney EC, Criscitiello MF, Flajnik MF. 2002. Proteasome,
transporter associated with antigen processing, and class I genes in
the nurse shark Ginglymostoma cirratum: evidence for a stable class
I region and MHC haplotype lineages. J Immunol. 168:771–781.
Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM,
Meng EC, Ferrin TE. 2004. UCSF Chimera–a visualization system
for exploratory research and analysis. J Comput Chem. 25:
1605–1612.
Posada D, Crandall KA. 1998. MODELTEST: testing the model of
DNA substitution. Bioinformatics 14:817–818.
Rock KL, Goldberg AL. 1999. Degradation of cell proteins and the
generation of MHC class I-presented peptides. Annu Rev
Immunol. 17:739–779.
Satta Y, Mayer WE, Klein J. 1996. HLA-DRB intron 1 sequences:
implications for the evolution of HLA-DRB genes and
haplotypes. Hum Immunol. 51:1–12.
Shiina T, Dijkstra JM, Shimizu S, et al. (15 co-authors). 2005.
Interchromosomal duplication of major histocompatibility
complex class I regions in rainbow trout (Oncorhynchus
mykiss), a species with a presumably recent tetraploid ancestry.
Immunogenetics 56:878–893.
Su C, Nei M. 1999. Fifty-million-year-old polymorphism at an
immunoglobulin variable region gene locus in the rabbit
evolutionary lineage. Proc Natl Acad Sci U S A. 96:9710–9715.
Swofford DL. 2000. Phylogenetic analysis using parsimony (* and
other methods). Sunderland (MA): Sinauer Associates.
Tanaka K, Kasahara M. 1998. The MHC class I ligand-generating
system: roles of immunoproteasomes and the interferon-gamma-inducible proteasome activator PA28. Immunol Rev. 163:161–176.
Tsukamoto K, Hayashi S, Matsuo MY, Nonaka MI, Kondo M,
Shima A, Asakawa S, Shimizu N, Nonaka M. 2005. Unprecedented intraspecific diversity of the MHC class I region of
a teleost medaka, Oryzias latipes. Immunogenetics 57:420–431.
Unno M, Mizushima T, Morimoto Y, Tomisugi Y, Tanaka K,
Yasuoka N, Tsukihara T. 2002. The structure of the mammalian
20S proteasome at 2.75 Å resolution. Structure 10:609–618.
Yamanoue Y, Miya M, Inoue JG, Matsuura K, Nishida M. 2006. The
mitochondrial genome of spotted green pufferfish Tetraodon
nigroviridis (Teleostei: Tetraodontiformes) and divergence time
estimation among model organisms in fishes. Genes Genet Syst.
81:29–39.