Long-Lived Dichotomous Lineages of the Proteasome Subunit Beta Type 8 (PSMB8) Gene Surviving More than 500 Million Years as Alleles or Paralogs

Kentaro Tsukamoto,1 Fumi Miura,1 Naoko T. Fujito,1 Goro Yoshizaki,2 and Masaru Nonaka*,1

1 Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
2 Department of Marine Biosciences, Tokyo University of Marine Science and Technology, Tokyo, Japan
Present address: Institute for Comprehensive Medical Science, Fujita Health University, Toyoake, Aichi, Japan
*Corresponding author: E-mail: [email protected].
Associate editor: Yoko Satta
Data deposition: Accessions: AB602772–AB602774.

Abstract

On an evolutionary time scale, polymorphic alleles are believed to have a short life, persisting at most tens of millions of years even under long-term balancing selection. Here, we report highly diverged trans-species dimorphism of the proteasome subunit beta type 8 (PSMB8) gene, which encodes a catalytic subunit of the immunoproteasome responsible for the generation of peptides presented by major histocompatibility complex (MHC) class I molecules, in lower teleosts including Cypriniformes (zebrafish and loach) and Salmoniformes (trout and salmon), whose last common ancestor dates to 300 Ma. Moreover, phylogenetic analyses indicated that these dimorphic alleles share lineages with two shark paralogous genes, suggesting that these two lineages have been maintained for more than 500 My either as alleles or as paralogs, and that conversion between alleles and paralogs has occurred at least once during vertebrate evolution. The two lineages, termed PSMB8A and PSMB8F, differ by an A31F substitution that would probably affect their cleaving specificity. Whereas the PSMB8A lineage has been retained by all analyzed jawed vertebrates, the PSMB8F lineage has been lost by most jawed vertebrates except for cartilaginous fish and basal teleosts.
However, a possible functional equivalent of the PSMB8F lineage has been revived as alleles within the PSMB8A lineage at least twice during vertebrate evolution, in the amphibian Xenopus and the teleostean Oryzias species. The dynamic evolution of the PSMB8 polymorphism through long-term persistence, loss, and regaining of dimorphism and conversion between alleles and paralogs implies the presence of strong selective pressure for functional polymorphism of this gene.

Key words: PSMB8, long-lived lineages, balancing selection, immunoproteasome.

Introduction

Proteasomes are large multicatalytic proteinase complexes that are responsible for selective proteolytic degradation and for production of the antigenic peptides presented by MHC class I molecules to cytotoxic CD8+ T cells (Rock and Goldberg 1999). Proteolysis is performed by the 20S proteasome, the catalytic core of the proteasome, which is a large complex composed of a stack of four rings, two outer alpha rings and two inner beta rings, containing seven alpha and seven beta subunits, respectively (Groll et al. 1997; Unno et al. 2002). Of these 14 subunits, only three beta subunits have proteolytic activity. The active beta subunits of the constitutive proteasome are PSMB5 (X), PSMB6 (Y), and PSMB7 (Z), with chymotrypsin-like, caspase-like, and trypsin-like proteinase activities, respectively. They are replaced by the IFNγ-inducible beta subunits with proteinase activities, PSMB8 (LMP7), PSMB9 (LMP2), and PSMB10 (MECL-1), respectively, forming an immunoproteasome (Tanaka and Kasahara 1998). These changes in subunit composition are believed to increase chymotrypsin-like activity, generating peptides with hydrophobic C-terminal residues suitable for binding in the cleft of MHC class I molecules (Tanaka and Kasahara 1998).
In particular, the PSMB8 subunit is a critical component for the production of MHC class I binding peptides because PSMB8 knockout mice show reduced expression of MHC class I molecules on the cell surface (Fehling et al. 1994). The PSMB8 gene is believed to have arisen by gene duplication of PSMB5 in a common ancestor of jawed vertebrates, simultaneously with the appearance of canonical adaptive immunity (Kasahara 1997). Amino acid substitutions at the residues involved in the determination of the cleaving specificity of PSMB8 may affect the recognition repertoire of adaptive immunity, potentially leading to differential susceptibility to pathogens in individuals with different types of PSMB8. To date, highly diverged dichotomous types of the PSMB8 gene have been reported in teleost Oryzias species (Miura et al. 2010), amphibian Xenopus species (Nonaka et al. 2000), and sharks (cartilaginous fish) (Kandil et al. 1996). The dichotomous types present in these species share a curious amino acid substitution at the 31st residue of the mature peptide, which is most probably involved in the determination of cleaving specificity (Groll et al. 1997; Unno et al. 2002): Ala or Val, which have a small side chain (termed A type), versus Phe or Tyr, which have a bulky aromatic side chain (termed F type). Mammals possess only the A-type PSMB8, and human PSMB8 with V31 exhibits chymotrypsin-like specificity for cleavage after bulky hydrophobic amino acids (Agarwal et al. 2010). In contrast, the F-type PSMB8 is predicted to have elastase-like specificity for cleavage after small hydrophobic amino acids based on the presumed 3D structure of the F type of the medaka PSMB8 molecule (Tsukamoto et al. 2005). However, a phylogenetic analysis indicated that, in these animal groups, the A and F types were derived via independent evolutionary events (Tsukamoto et al. 2005). Moreover, the A and F types of Oryzias and Xenopus PSMB8 are alleles that were retained for 60–80 My through several speciation events by long-term balancing selection (Nonaka et al. 2000; Miura et al. 2010), whereas those of the shark PSMB8 are assumed to be paralogous genes (Kandil et al. 1996). To elucidate the evolutionary history of the A and F types of the PSMB8 gene of jawed vertebrates, we performed in silico data mining, phylogenetic analysis, and segregation or population analysis to clarify whether the two types are alleles or paralogs.

© The Author 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]
Mol. Biol. Evol. 29(10):3071–3079. 2012 doi:10.1093/molbev/mss113 Advance Access publication April 6, 2012. Research article.

Materials and Methods

Animals

The India strain of zebrafish, Danio rerio, was supplied by courtesy of Dr Shinji Takada (National Institute for Basic Biology, Aichi, Japan) and Dr Sumito Koshida (The University of Tokyo, Tokyo, Japan). Wild individuals (N = 107) of the Japanese loach, Misgurnus anguillicaudatus, collected in Ibaraki prefecture, Japan, were purchased from Choei-shouten (Tokyo, Japan). The Okutama strain of rainbow trout, Oncorhynchus mykiss, was from the Ooizumi station of the Tokyo University of Marine Science and Technology, Nagano, Japan. Wild-caught individuals of the banded hound shark (Triakis scyllium) (N = 17) were supplied by courtesy of Mr Takeshi Nakai (Keikyu Aburatsubo Marine Park, Kanagawa, Japan).

Blast Search of Expressed Sequence Tag Data for the PSMB8 Alleles

We searched the coding sequence of PSMB8 among the expressed sequence tag (EST) data of teleost species using the National Center for Biotechnology Information (NCBI) BLAST search option (http://blast.ncbi.nlm.nih.gov/Blast.cgi).
The tblastn and tblastx searches were performed using the coding sequences (nucleotide or amino acid sequence) of the PSMB8N (F-type PSMB8, NCBI accession number AB183488) and PSMB8d (A-type PSMB8, NCBI accession number BA000027) alleles of Oryzias latipes as queries. Two types of PSMB8 sequences, one with Ala and the other with Phe at the 31st position of the mature peptide, were identified in the Cypriniformes zebrafish (D. rerio; NCBI accession numbers BC066288 and BC092889) and Japanese loach (M. anguillicaudatus; NCBI accession numbers BJ830309 and BJ838416), and in the Salmoniformes Atlantic salmon (Salmo salar; NCBI accession numbers DY720791 and DW555236) and rainbow trout (O. mykiss; NCBI accession numbers CA360946 and GE824997). These EST sequences identified included the full coding sequence of the PSMB8 mature peptide, with the exception of the Japanese loach EST sequence for the F type (NCBI accession number BJ838416), which lacked 76 bp at the 5′-end. To obtain the nucleotide sequence of this region, total RNA was isolated from the caudal fin of loach individuals possessing the F-type PSMB8 using ISOGEN (Nippon Gene, Tokyo, Japan) according to the manufacturer’s instructions and was reverse-transcribed to cDNA by Superscript II (Invitrogen, Carlsbad, CA). The second exon, which encompassed the missing 76 bp, was amplified using the cDNA as a template by the F-type specific primers. The forward primer was 5′-ACTGGAATGGACCCGGAGCAGTT-3′ designed at the second coding exon, based on the F-type sequence of zebrafish (NCBI accession number BC092889); 5′-TCTTTGGCCAGTCTTCTCTCCCAAT-3′ was used as the reverse primer designed at the third exon of the F-type sequence of Japanese loach (NCBI accession number BJ838416).
The polymerase chain reaction (PCR) was performed using Ex-Taq (Takara Bio Inc., Shiga, Japan) under the following conditions: denaturation at 95 °C for 1 min; 35 cycles of denaturation at 95 °C for 30 s, annealing at 62 °C for 30 s, and elongation at 72 °C for 60 s; and final elongation at 72 °C for 3 min. The nucleotide sequence of the PCR product was determined by direct sequencing. The sequencing reaction was performed using the BigDye Terminator v3.1 Sequencing Standard kit (Applied Biosystems, Foster City, CA), and a 3100/3130xl Genetic Analyzer (Applied Biosystems) was used for determination of nucleotide sequences. The 84 bp of the exon 2 sequence determined (NCBI accession number AB602774) overlapped with the EST sequence of loach with 99% identity, and we used the combined sequences (NCBI accession numbers AB602774 and BJ838416) for the phylogenetic analyses.

Genomic DNA Extraction

Genomic DNA was extracted from fish fins using the Puregene Genomic DNA Purification Kit (Gentra Systems, Minneapolis, MN) according to the manufacturer’s instructions and was finally dissolved in Tris-ethylenediaminetetraacetic acid (EDTA) buffer (10 mM Tris, 1 mM EDTA).
PCR Amplification of the PSMB8 Gene of Zebrafish, Japanese Loach, Rainbow Trout, and Banded Hound Shark

The PSMB8 gene sequence spanning from the second to the third exon was amplified using genomic DNA as a template and the following primer sets, which are specific for the F and A types and were designed based on sequences from each of the species:

zebrafish F type: 5′-GGACTCGAGAGCATCTGCAGGGTCT-3′ and 5′-TGGCCAGTCTTCTCTCCCAATAAAC-3′
zebrafish A type: 5′-AGACTCCAGAGCTTCTGCTGGAAAA-3′ and 5′-AGCCAGCAGTCTCTCCCAGTACTG-3′
loach F type: 5′-CTCCAGAGCATCTGCTGGGTCTTAC-3′ and 5′-TCTTTGGCCAGTCTTCTCTCCCAAT-3′
loach A type: 5′-CTCCAGAGCATCAGCTGGAAAATAT-3′ and 5′-TCTTTAGCCAGCAGTCTCTCCCAGT-3′
rainbow trout F type: 5′-TGACTCCAGAGCATCTGCAGGATCT-3′ and 5′-TGGCCAACACTCTCTCCCAATAGAC-3′
rainbow trout A type: 5′-CTCCAGGGCCTCAGCTGGCAGC-3′ and 5′-AGCCAGCAGTCTCTCCCAGTACTG-3′
banded hound shark F type: 5′-ATTCCAGAGCATCTGCAGGCTCC-3′ and 5′-CCAGTAGACACAATCTGCAGCACTG-3′
banded hound shark A type: 5′-GACTCCAGAGCATCTGCAGGGAAT-3′ and 5′-CCAGTATTGGCAATCAGCAGCACTT-3′

The PCR conditions were as follows, with the exception of the primer sets used to amplify the F and A types of banded hound shark: denaturation at 94 °C for 1 min; 35 cycles of denaturation at 94 °C for 30 s, annealing at 64 °C for 30 s, and elongation at 72 °C for 1 min; and final elongation at 72 °C for 3 min, with Ex-Taq (Takara Bio Inc.). PCR products were separated in 2% agarose gels. The PCR conditions used for the primer sets for the two types of banded hound shark were as follows: denaturation at 98 °C for 30 s; 30 cycles of denaturation at 98 °C for 10 s and combined annealing and elongation at 66 °C for 4 min; and final elongation at 72 °C for 3 min, with LA-Taq (Takara Bio Inc.). PCR products were separated in 0.8% agarose gels. The nucleotide sequences of the PCR products of all individuals were determined as described above.
To design the type-specific primers used to amplify the F and A types of the banded hound shark, we first determined the cDNA sequences of these types (NCBI accession numbers AB602772 and AB602773). Total RNA was isolated from the liver of a wild-caught banded hound shark individual, as described above, and was reverse-transcribed to cDNA using Superscript II (Invitrogen). The cDNAs were used as templates for PCR amplification of the F- and A-type sequences spanning from the first to the sixth (last) coding exon using the following type-specific primer sets, which were designed based on nucleotide sequences that are conserved between nurse shark (NCBI accession numbers D64056 and D64057) and horn shark (NCBI accession numbers AF363583 and AF363582): F type (5′-CTGGTGGTACCGSTTGGTCTGGA-3′ and 5′-CCCGCATGTGATACATGTTAACTGT-3′) and A type (5′-CTGGTGGTACCGSTTGGTCTGGA-3′ and 5′-GACACCTTTATCCACCCATCTTCA-3′). The PCR conditions were denaturation at 94 °C for 1 min; 35 cycles of denaturation at 94 °C for 30 s, annealing at 58 °C for 30 s, and elongation at 72 °C for 90 s; and final elongation at 72 °C for 3 min, with Ex-Taq (Takara Bio Inc.). The nucleotide sequences of the PCR products were determined as described above.

Phylogenetic Tree Construction

The alignment of multiple nucleotide sequences encoding mature peptides of PSMB8 and PSMB5 (used as an outgroup) was performed by Clustal X 2.0 (Larkin et al. 2007). Based on the alignments, the short extension present at the 3′ side of some sequences was removed, and two alignment sets, 609 sites with all codon positions (A) and 406 sites without the third codon position (B), were used for phylogenetic analyses. A maximum likelihood (ML) tree was constructed using PAUP*4.0b (Swofford 2000).
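The relationship between alignment sets A and B can be illustrated with a minimal sketch, assuming an in-frame coding sequence whose length is a multiple of three. The function name and the toy sequences are hypothetical; the text does not say which tool was used to exclude the third codon position.

```python
def drop_third_codon_positions(coding_seq: str) -> str:
    """Return the sequence with every third codon position removed.

    Assumes the input starts at codon position 1 and is in frame.
    """
    if len(coding_seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    # Keep positions 1 and 2 of each codon (0-based indices 0 and 1 modulo 3).
    return "".join(base for i, base in enumerate(coding_seq) if i % 3 != 2)


# Toy example: two codons, ATG and GCC, reduce to AT + GC.
reduced_toy = drop_third_codon_positions("ATGGCC")

# A 609-site alignment row (203 codons) reduces to 406 sites, as in set B.
reduced_len = len(drop_third_codon_positions("ACG" * 203))
```

Applying the same function to every row of an in-frame alignment keeps the columns aligned, because the same column indices are removed from each sequence.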
The parameters of the nucleotide substitution models were calculated by hierarchical likelihood ratio tests of Modeltest 3.7 (Posada and Crandall 1998), and the TrNef + I + G and HKY + G models were selected as optimal substitution models for the A and B sets, respectively. Bootstrap percentages were calculated based on 100 replications of a full heuristic search.

Prediction of 3D Structures of PSMB8 Molecules

The 3D structures of PSMB8 molecules were predicted by the Automated Mode of the SWISS-MODEL server (http://swissmodel.expasy.org/) (Arnold et al. 2006) based on the crystal structure of the bovine PSMB5 molecule (Protein Data Bank ID code: 1iruL) (Unno et al. 2002). The interactive visualizations and analyses of predicted molecular structures were performed with the UCSF Chimera program (http://www.cgl.ucsf.edu/chimera/) (Pettersen et al. 2004).

Results

Two Lineages of PSMB8 Gene, the PSMB8F and PSMB8A Lineages, in Jawed Vertebrates

A Blast search of the EST data deposited in the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/index.html) identified both A and F types of PSMB8 sequences in basal teleost species, such as Cypriniformes (zebrafish, D. rerio and Japanese loach, M. anguillicaudatus) and Salmoniformes (Atlantic salmon, S. salar and rainbow trout, O. mykiss). The deduced amino acid sequences of the mature peptide of PSMB8 of these species were aligned using Clustal X 2.0 (Larkin et al. 2007), as were those of the shark, Oryzias, Xenopus, and human PSMB8. These amino acid sequences were aligned without any insertions/deletions, with the exception of a short extension at the C-terminus of some sequences (supplementary fig. S1, Supplementary Material online).
The ML tree constructed based on the nucleotide sequences of the coding region of the mature peptide, using the human, Oryzias, and lamprey PSMB5 sequences as an outgroup, indicated the presence of two dichotomous clades supported by high bootstrap percentages of 95% and 95%, respectively (fig. 1). Exclusion of the third codon position with possibly saturated substitutions did not change the topology of the tree, and the presence of these dichotomous clades was also supported by high bootstrap percentages of 96% and 90%, respectively (fig. 1). One clade contained only the PSMB8 sequences of sharks and basal teleosts with F31; all other sequences belonged to the other clade. These two clades contained genes from both cartilaginous and bony fish, indicating the presence of two ancient lineages, termed the PSMB8A and PSMB8F lineages, which are shared by Chondrichthyes and Teleostomi and were thus maintained for more than 500 My (Blair and Hedges 2005). In addition to the sequences shown here, the BLAST search identified the presence of PSMB8A lineage sequences, but not PSMB8F lineage sequences, in many mammalian species, in a reptilian green anole, and in several higher teleost species, suggesting that the PSMB8F lineage was lost at least twice during vertebrate evolution in the tetrapod and teleost lineages. Interestingly, however, the apparent functional equivalents of the PSMB8F lineage members that have Phe or Tyr at the 31st position of the mature peptide seem to have been regenerated from the PSMB8A lineage in the Oryzias (medaka) and Xenopus lineages. The neighbor joining (NJ) tree constructed based on the p-distance of amino acid sequences also indicated the presence of the dichotomous clades, the PSMB8F and PSMB8A lineages, supported by bootstrap percentages of 98% (supplementary fig. S2, Supplementary Material online). However, resolution within these two clades seems to be poor, and the shark sequences are located at inappropriate positions in both clades of the NJ tree.

FIG. 1. ML phylogenetic tree based on the coding nucleotide sequences with the three codon positions of PSMB8 mature peptides. An ML tree was constructed based on the TrNef + I + G model. The human, medaka, and sea lamprey PSMB5 coding nucleotide sequences were used as an outgroup. Bootstrap percentages >50%, based on 100 replicates, are shown on each branch. The numbers in parentheses indicate the bootstrap percentages (>50%) from an ML analysis based on the coding nucleotide sequences, excluding the third codon position (the HKY + G model). The 31st residue of the mature peptide is shown in parentheses after the name of the species. The members of the nucleotide sequences of the PSMB8 mature peptide used for phylogenetic analyses are the same as those shown in supplementary figure S1 (Supplementary Material online). An alignment of these sequences was performed based on supplementary figure S1 (Supplementary Material online), and the accession numbers of these sequences are shown in the legend of supplementary figure S1 (Supplementary Material online).
[Tree image not reproduced. PSMB8A lineage, 95 (96): human (V), Xenopus (F), Xenopus (A), medaka (Y), medaka (V), rainbow trout (A), Atlantic salmon (A), Japanese loach (A), zebrafish (A), horn shark (A), nurse shark (A). PSMB8F lineage, 95 (90): rainbow trout (F), Atlantic salmon (F), Japanese loach (F), zebrafish (F), horn shark (F), nurse shark (F). Outgroup: medaka, human, and lamprey PSMB5. Scale bar: 0.05.]
As shown in figure 2, 96 of the 204 positions of the aligned mature peptides showed amino acid substitutions, and the allowance of up to two residues for each group led to the recognition of lineage-specific substitutions at eight positions: I13M, Q/K53V/C, L65I, C/S84A, M87V, S99T, G/R156A, and R/C157Q/E. In contrast, the type-specific substitutions that are possibly linked to cleaving specificity were recognized at only two positions, S28T and V/A31Y/F. These results apparently suggest parallel evolution from S28 and V/A31 to T28 and Y/F31 in common ancestors of Oryzias and Xenopus. However, another scenario without parallel evolution is also imaginable, as discussed below.

FIG. 2. Comparison of the amino acid sequences of the mature peptides of PSMB8. The alignment of these PSMB8 sequences was performed based on supplementary figure S1 (Supplementary Material online), excluding the PSMB5 sequences and the human PSMB8 sequence. Ninety-six of the 204 positions of the aligned mature peptides showed amino acid substitutions and are depicted here. Dots indicate the identity of the residues with the uppermost sequence. The diagnostic residues for discrimination between the PSMB8F and PSMB8A lineages and between the F and A types are indicated by * and # under the alignment, respectively. The 31st residue of the mature peptide is shown in parentheses after the name of the species.

Dichotomous PSMB8 Lineages Are Alleles in Cypriniformes and Salmoniformes

Because the A and F types of Oryzias and Xenopus PSMB8 are alleles (Nonaka et al. 2000; Miura et al. 2010), whereas those of sharks are predicted to be paralogs (Kandil et al. 1996), segregation analysis of the PSMB8A and PSMB8F lineages was performed in basal teleosts and sharks to assess whether they are alleles or paralogs.
Preliminary typing of several individuals of the India strain of zebrafish from a closed colony by genomic PCR using PSMB8F and PSMB8A lineage-specific primers revealed that some animals had only PSMB8A, some had only PSMB8F, and all others had both, indicating that PSMB8A and PSMB8F are alleles. To confirm this conclusion, one male and one female possessing both PSMB8A and PSMB8F were crossed and their 96 progeny were typed for PSMB8. Twenty-seven progeny possessed only PSMB8A, 46 possessed both PSMB8A and PSMB8F, and 23 possessed only PSMB8F (fig. 3A). A chi-square test did not reject a ratio of 1:2:1 for the three genotypes observed (F/F:F/A:A/A; P > 0.05), supporting the idea that PSMB8A and PSMB8F are alleles at a single locus in zebrafish. Indeed, the zebrafish genome sequence (Zv9 assembly) of the Ensembl genome databases (http://www.ensembl.org/Danio_rerio/Info/Index) indicates that there is only one PSMB8 locus in the zebrafish genome. We then analyzed Japanese loach, for which no laboratory stock was available, using 107 wild individuals. Twenty possessed only PSMB8A, 54 possessed both PSMB8A and PSMB8F, and 33 possessed only PSMB8F, suggesting that PSMB8A and PSMB8F are alleles also in the loach. These results indicate that PSMB8A and PSMB8F segregate as alleles at a single locus in Cypriniformes.

FIG. 3. PSMB8 typing in zebrafish, rainbow trout, and banded hound shark. The PSMB8 gene region spanning from exons 2 to 3 was amplified by genomic PCR using lineage-specific primers. (A) PSMB8 typing of offspring generated from the crossing of one male and one female possessing both PSMB8A and PSMB8F in zebrafish. Of the 96 samples typed, the results for 16 samples are shown together with those of the father (F) and mother (M). The genotypes of the offspring shown were as follows: PSMB8F/PSMB8F (numbers 4, 7, 14, and 16), PSMB8F/PSMB8A (numbers 1, 6, 8, 9, 10, 11, 13, and 15), and PSMB8A/PSMB8A (numbers 2, 3, 5, and 12). (B) A pair of rainbow trout and their 41 offspring were typed. Two PSMB8A bands of 499 bp (A1) and 751 bp (A2) and two PSMB8F bands of 266 bp (F1) and 302 bp (F2) were identified. Of the 41 offspring typed, the results for 21 offspring are shown together with those of the father (F) with PSMB8A1/PSMB8F1/PSMB8F2 and mother (M) with PSMB8A1/PSMB8A2/PSMB8F2. For the locus harboring the PSMB8A2, PSMB8F1, and PSMB8F2 alleles, the genotypes of the offspring shown were as follows: PSMB8A2/PSMB8F2 (numbers 2, 3, 6, 7, and 11–13), PSMB8A2/PSMB8F1 (numbers 4, 16, 18, 19, and 21), PSMB8F2/PSMB8F2 (numbers 5, 9, 17, and 20), and PSMB8F1/PSMB8F2 (numbers 1, 8, 10, 14, and 15). (C) The 17 wild individuals of banded hound shark were typed. A PSMB8A band of about 6.5 kb and a PSMB8F band of about 3 kb were identified in all samples.

Typing in Salmoniformes is complicated because of the presence of two PSMB8 loci generated by a recent tetraploidization (Shiina et al. 2005; Lukacs et al. 2007). A pair of rainbow trout and their 41 progeny were typed by genomic PCR amplification spanning exons 2–3 using the PSMB8A-specific and PSMB8F-specific primers. The nucleotide sequences of PSMB8A and PSMB8F corresponding to the forward primers showed 8 mismatches in 22 positions, including four positions at the 3′ end of the primers. Similarly, the reverse primer sequences differed at 7 of 24 positions, including the three positions at the 3′ end. Thus, it was highly unlikely that these primers cross-amplify the other type of PSMB8 gene. Both the PSMB8A-specific and PSMB8F-specific primers detected two different-sized bands in this family (fig. 3B).
Nucleotide sequence analysis of each band of the parents indicated that the F1 (266 bp), F2 (302 bp), and A2 (751 bp) bands contained a single sequence, whereas the A1 (499 bp) bands of both the mother and the father contained two sequences, showing double peaks at a few nucleotide positions. Cloning analysis of the A1 band of the parents identified three different sequences; one was common to the mother and father (A1a), and the other two were unique to the father (A1b) and mother (A1c) (supplementary fig. S3A, Supplementary Material online). These three sequences differed from each other at only a few of the 453 nucleotide positions compared, and A1c exhibited 100% identity with the published PSMB8 sequence in the Onmy-IB region (Shiina et al. 2005) (supplementary fig. S3A, Supplementary Material online). These results indicate that the A1a, A1b, and A1c sequences represent alleles at one PSMB8 locus in the Onmy-IB region. Indeed, they segregated as alleles in the progeny (supplementary table S1, Supplementary Material online). As shown in supplementary figure S3A (Supplementary Material online), the intronic sequence of A2 differed markedly from that of the A1 bands. The sequences of the two PSMB8F bands (F1, 266 bp and F2, 302 bp) were identical, with the exception of a 36-bp insertion/deletion located in the intronic region (supplementary fig. S3B, Supplementary Material online). These three bands, A2, F1, and F2, segregated as alleles at the other locus, resulting in 16 A2/F2, 10 A2/F1, 8 F2/F2, and 7 F1/F2 progeny (supplementary table S1, Supplementary Material online). In particular, progeny with the F1 band also had either the A2 or F2 band, but never both, suggesting that the A2 and F2 bands of the mother segregate as alleles (fig. 3B). Thus, PSMB8A and PSMB8F are alleles also in at least one locus of rainbow trout.
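The segregation counts reported above can be checked against Mendelian expectations with a simple goodness-of-fit calculation. The sketch below is an illustration only, not the authors' analysis script: the function name is hypothetical, the critical values are the standard 5% values of the chi-square distribution, and the 1:1:1:1 test for rainbow trout is not reported in the text. It assumes that the father (F1/F2) and mother (A2/F2) each transmit their two alleles at that locus with equal probability.

```python
import math


def chi_square_statistic(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Standard upper 5% critical values of the chi-square distribution, keyed by df.
CRITICAL_5PCT = {2: 5.991, 3: 7.815}

# Zebrafish cross (A/F x A/F): 27 A/A, 46 A/F, 23 F/F; expected ratio 1:2:1, df = 2.
zebrafish_chi2 = chi_square_statistic([27, 46, 23], [1, 2, 1])
# For df = 2 the upper-tail probability has the closed form exp(-x / 2).
zebrafish_p = math.exp(-zebrafish_chi2 / 2)

# Rainbow trout locus with A2, F1, F2: 16 A2/F2, 10 A2/F1, 8 F2/F2, 7 F1/F2.
# Assumed expectation for an F1/F2 x A2/F2 cross: 1:1:1:1, df = 3.
trout_chi2 = chi_square_statistic([16, 10, 8, 7], [1, 1, 1, 1])
trout_rejected = trout_chi2 > CRITICAL_5PCT[3]

print(f"zebrafish: chi2 = {zebrafish_chi2:.3f}, P = {zebrafish_p:.3f}")
print(f"trout: chi2 = {trout_chi2:.3f}, rejected at 5% = {trout_rejected}")
```

Under these assumptions the zebrafish statistic is 0.5 (P of about 0.78), in line with the reported P > 0.05, and the trout counts do not depart significantly from 1:1:1:1 at the 5% level.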
These results indicate that the two PSMB8 lineages seem to have been retained as alleles for more than 300 My after the divergence of Cypriniformes and Salmoniformes (Yamanoue et al. 2006) as a trans-order polymorphism.

Two Lineages of PSMB8 Are Paralogs in Sharks

The shark PSMB8A and PSMB8F were suggested to be paralogous genes because, by Southern blot analysis, all four nurse shark (Ginglymostoma cirratum) individuals analyzed had both sequences, and these two sequences showed only 76.0% and 79.4% identities at the nucleotide and amino acid levels, respectively (Kandil et al. 1996). In addition, Northern and Southern blotting analysis suggested that the nurse shark PSMB8 paralogs are pseudoalleles behaving like usual alleles (Ohta et al. 2002). However, the lack of genome sequence information for elasmobranch species prevents direct confirmation of this possibility. Thus, we analyzed 17 wild-caught banded hound shark (T. scyllium) individuals by genomic PCR using the PSMB8A-specific and PSMB8F-specific primers to obtain supportive evidence for this idea. Both the PSMB8A-specific and PSMB8F-specific bands were detected in all analyzed individuals (fig. 3C). Since these individuals were caught in fishermen's gill nets on several occasions and showed wide variation in body size, it was presumed that there was no close genetic relationship between them. Thus, these results strongly suggest that the shark PSMB8A and PSMB8F lineages represent paralogous genes rather than alleles.

Discussion

We clarified that dichotomous PSMB8 allelic lineages, the PSMB8F and PSMB8A lineages, have persisted for more than 300 My in the basal teleost fish, Cypriniformes and Salmoniformes. As the longest persistence time of a trans-species polymorphism reported to date is only 50–80 My (Satta et al. 1996; Su and Nei 1999; Nonaka et al. 2000; Esteves et al. 2005; Klein et al. 2007; Miura et al. 2010), the persistence time of the PSMB8 allelic dimorphism of basal teleosts is unprecedentedly long, suggesting the presence of extremely stringent balancing selection, most probably reflecting a potential advantage of possessing dual specificity for MHC class I antigen processing.

Our conclusion that the dichotomous lineages of the PSMB8 gene are alleles in zebrafish and rainbow trout was based on the segregation analysis, which cannot discriminate between real alleles and pseudoalleles formed by differential silencing of one of two tandemly duplicated paralogs. However, the genome sequences of the MHC class I regions of zebrafish and rainbow trout harboring the PSMB8A lineage show no evidence of such tandem duplication (Michalova et al. 2000; Shiina et al. 2005). Although the final conclusion should await physical analysis of the genomic region of these species harboring the PSMB8F lineage, pseudoallelic status is highly unlikely since the gene order and orientation around the PSMB8 gene, TAP2-PSMB9-PSMB9-like-PSMB10-PSMB8-MHC class I, is perfectly conserved in zebrafish, rainbow trout, fugu, and the Hd-rR and HNI strains of medaka possessing the A- and F-type PSMB8, respectively. These results indicate that the gene organization of this genomic region has been stable throughout teleost evolution (Lukacs et al. 2007).

An alternative explanation of the phylogenetic tree shown in figure 1 is that the apparent presence of ancient lineages of the PSMB8 gene is due to convergent evolution. However, seven of the eight observed lineage-specific substitutions (fig. 2) are located at various parts of the steric structure of PSMB8 (data not shown) that are likely irrelevant to the cleaving specificity of PSMB8. Only the Q/K53V/C substitution, found at a residue involved in S1 pocket formation, could have functional importance. The apparently revived F-type PSMB8 of Xenopus and medaka have the PSMB8A-type residues at the seven substituted positions located outside of the S1 pocket, supporting the idea that these positions are irrelevant to the cleaving specificity. In addition, the amino acid sequences of the A- and F-type PSMB8 of spotted green pufferfish, Tetraodon nigroviridis (NCBI accession numbers CR697191 and CR691449), show only two amino acid substitutions, A31F and V53M, suggesting that these two substitutions are enough to change the cleaving specificity. Since convergent evolution at sites irrelevant to the cleaving specificity is difficult to imagine, these results indicate that the observed lineages represent real lineages.

FIG. 4. The supposed evolutionary history of dichotomous PSMB8 lineages in jawed vertebrates. The tree represents the phylogenetic relationship among jawed vertebrate species, and the presence of the PSMB8A and PSMB8F lineages is indicated by black and gray lines, respectively. The line lengths do not reflect the genetic distances. The PSMB8A and PSMB8F lineages were established as alleles or paralogs in the common ancestor of jawed vertebrates. The shark (cartilaginous fish) possesses these two lineages as paralogs, whereas these lineages exist as alleles in basal teleosts, Cypriniformes (carps) and Salmoniformes (salmons). The loss of the PSMB8F lineage occurred at least twice, in the higher teleost and tetrapod lineages, as indicated by gray arrows. The F-type alleles were revived de novo within the PSMB8A lineage independently at least twice, in the amphibian Xenopus and teleostean Oryzias species, as shown by black arrows.
The presumption that the PSMB8F lineage was lost in the higher teleost and tetrapod lineages is a curious aspect of the PSMB8 case. However, PSMB8A lineage molecules with F31 or Y31, and supposedly with a cleaving specificity similar to that of PSMB8F lineage molecules, were established independently at least twice during the evolution of the teleost Oryzias and amphibian Xenopus lineages. Once reestablished, the dimorphism of the PSMB8 gene in these animal lineages was retained for more than 30–60 My (Miura et al. 2010) and 80 My (Nonaka et al. 2000), respectively. These results suggest that the PSMB8F lineage was lost in common ancestors of higher teleosts and tetrapods hundreds of millions of years ago and that the F-type alleles were revived de novo within the PSMB8A lineage independently at least twice tens of millions of years ago (fig. 4). Although this scenario predicts an extremely strong balancing selection to revive the dimorphism multiple times, it is puzzling that there was a long absence of the dimorphism between the loss of the F lineage and the revival of the F types.
Another conceivable scenario is that most of the PSMB8F lineage sequence, with the exception of the close vicinity of the F31 residue, was replaced with the PSMB8A lineage sequence by homologous recombination or gene conversion between alleles in common ancestors of higher teleosts and tetrapods hundreds of millions of years ago, creating a PSMB8A lineage sequence with F31, and that such sequence homogenization was repeated multiple times thereafter between the two PSMB8A lineage sequences with A31 and F31. If the actual PSMB8 evolution followed this scenario, the A and F types have been perpetuated throughout jawed vertebrate evolution in spite of an apparent absence of the F type at some parts of the phylogenetic tree (fig. 4).
The actual evolutionary scenario of this curious polymorphism of the PSMB8 gene in jawed vertebrates is still to be clarified.
Unlike basal teleosts, whose PSMB8A and PSMB8F lineages are present as alleles, cartilaginous fish possess them as paralogous genes, raising the question of whether the common ancestor of the jawed vertebrates had these two lineages as alleles or as paralogous genes. The divergence between the PSMB8A and PSMB8F alleles of basal teleosts (65.8–71.6% and 71.6–74.0% identities at the nucleotide and amino acid levels, respectively) seems to be too great for normal alleles, apparently supporting the scenario that they diverged as paralogous genes and then were converted to alleles. If this is the case, the PSMB8A and PSMB8F alleles of basal teleosts are paralogous alleles. However, the revived allelic dimorphism of Oryzias and Xenopus also shows a high degree of sequence divergence (82.7–82.9% and 86.8–89.8% identities at the nucleotide and amino acid levels, respectively) without any evidence that the alleles were once paralogous. Thus, at present, we cannot rule out the alternative possibility that the two PSMB8 lineages started as alleles and then were converted to paralogs in cartilaginous fish. Analysis of the basal groups of both Actinopterygii and Chondrichthyes is expected to provide an answer to this question.
In the MHC class I antigen presentation system, the antigen peptides bound to the MHC class I molecules are usually 8–9 amino acids long, and the immunoproteasome is responsible for the cleavage of the C-terminal side of these peptides. In the PSMBs with proteinase activity, Thr1 serves as the single-residue active site, providing the nucleophile and acting as the general base for the hydrolysis of the peptide bond, whereas the cleaving specificity is determined by the 20th, 31st, 35th, 45th, 49th, and 53rd residues forming the S1 pocket (Groll et al. 1997) (supplementary fig. S1, Supplementary Material online).
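The percent identities quoted above for the PSMB8A and PSMB8F alleles are pairwise measures over aligned sequences. A minimal sketch of one common way to compute such a value follows. It is an illustration only: the text does not state how gaps were treated, so skipping every column that contains a gap is an assumption, the function name `percent_identity` is hypothetical, and the input strings are toy sequences rather than PSMB8 data.

```python
def percent_identity(seq1: str, seq2: str, gap: str = "-") -> float:
    """Percent of identical residues over the gap-free columns of a pairwise alignment.

    Assumption: columns where either sequence has a gap are excluded from
    both the numerator and the denominator.
    """
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned nucleotide sequences (hypothetical, for illustration only).
allele_1 = "ACGTACGTAC"
allele_2 = "ACGTTCGTAA"

print(percent_identity(allele_1, allele_2))  # 80.0
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give slightly different values, so identities are comparable only when computed the same way.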
To infer the cleaving specificity of the A and F types of the PSMB8 molecules, we predicted the 3D structures of human PSMB8 (A type) and the A and F types of zebrafish, horn shark, and medaka PSMB8 based on the published 3D structure of the bovine PSMB5 molecule (Unno et al. 2002) using the Automated Mode of the SWISS-MODEL server (http://swissmodel.expasy.org/) (Arnold et al. 2006) (fig. 5). Supporting the reported chymotrypsin-like activity of human PSMB8, known to be the A type (Agarwal et al. 2010), all A-type PSMB8 of these animals have a wide opening at the entrance of the S1 pocket, which could allow insertion of bulky aromatic side chains. In contrast, the entrance of the S1 pocket of the F-type PSMB8 is much narrower, allowing insertion of only smaller hydrophobic side chains. Thus, elastase-like specificity is predicted for the F-type PSMB8. Dual specificities of PSMB8 borne by the A and F types could be advantageous at the population level in coping with various kinds of intracellular pathogens.
FIG. 5. Predicted 3D structures of the A- and F-type PSMB8 molecules. The 3D structures of the A-type PSMB8 molecule in human (NCBI accession number NP_004150) (A), the A- and F-type PSMB8 molecules in zebrafish (NCBI accession numbers BC066288 and BC092889) (B), horn shark (NCBI accession numbers AF363583 and AF363582) (C), and medaka (NCBI accession numbers AB183488 and BA000027) (D) were predicted based on the steric structure of the bovine PSMB5 molecule (PDB ID: 1iruL) by the SWISS-MODEL server (http://swissmodel.expasy.org/). The six residues forming the S1 pocket are indicated by orange (20th position), red (31st), yellow (35th), green (45th), cyan (49th), and magenta (53rd), respectively. The threonine residue acting as the general base for the hydrolysis reaction is shown in blue. These views are from the inside of the S1 pocket, and Thr1 of the A type is visible, whereas Thr1 of the F type is hardly visible due to the bulky side chains of Phe31 or Tyr31.
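The six S1 pocket positions named in the text (20, 31, 35, 45, 49, and 53) can be pulled out of a mature protein sequence, and residue 31 can then be used as a simple label for the A or F type. The sketch below makes several assumptions: the sequence is numbered from Thr1 of the mature subunit, the typing rule (Ala31 gives the A type, Phe31 or Tyr31 gives the F type) is a simplification of the description in this article, the names `s1_pocket_residues` and `type_from_residue_31` are hypothetical, and the sequences are made-up placeholders.

```python
# S1 pocket positions as listed in the text (1-based, mature protein numbering).
S1_POCKET_POSITIONS = (20, 31, 35, 45, 49, 53)


def s1_pocket_residues(mature_seq: str) -> dict[int, str]:
    """Map each S1 pocket position to the residue found there."""
    if len(mature_seq) < max(S1_POCKET_POSITIONS):
        raise ValueError("sequence is too short to contain all S1 pocket positions")
    return {pos: mature_seq[pos - 1] for pos in S1_POCKET_POSITIONS}


def type_from_residue_31(mature_seq: str) -> str:
    """Label a sequence by residue 31: Ala -> 'A type', Phe/Tyr -> 'F type'."""
    residue = s1_pocket_residues(mature_seq)[31]
    if residue == "A":
        return "A type"
    if residue in ("F", "Y"):
        return "F type"
    return "unclassified"


# Hypothetical placeholder sequences (NOT real PSMB8 sequences):
# 30 filler residues, the residue of interest at position 31, 29 more fillers.
toy_a = "G" * 30 + "A" + "G" * 29
toy_f = "G" * 30 + "F" + "G" * 29
toy_y = "G" * 30 + "Y" + "G" * 29

for name, seq in [("toy_a", toy_a), ("toy_f", toy_f), ("toy_y", toy_y)]:
    print(name, type_from_residue_31(seq), s1_pocket_residues(seq))
```

A label based on residue 31 alone says nothing about lineage: as discussed above, the F types of Xenopus and Oryzias carry F31 or Y31 within the PSMB8A lineage.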
Our preliminary analysis in amphibians and reptiles identified both the A- and F-type PSMB8 from newt (Cynops pyrrhogaster), gecko (Gekko japonicus), and turtle (Trachemys scripta) (Huang CH, Tanaka Y, Nonaka M, unpublished data), suggesting that the lack of PSMB8 dimorphism in placental mammals is rather an exceptional case. In this context, it is interesting to note that the PSMB8 and class IA genes are tightly linked in teleosts (Lukacs et al. 2007), Xenopus (Ohta et al. 2006), and green anole (NCBI accession number NW_003339585), whereas they are separated by more than 1 Mb in most placental mammals (Kelley et al. 2005). It is possible, therefore, that the dimorphism of the PSMB8 gene is meaningful only when it is tightly linked with the class IA gene. If this is the case, the pattern of polymorphism of the class IA gene is expected to differ between placental mammals and other gnathostomes, although such a comparison is still to be performed.
The present evolutionary analysis of the PSMB8 gene revealed dichotomous lineages retained by basal teleosts and sharks for more than 500 My either as alleles or as paralogs, and a long-term trans-order polymorphism that has persisted for more than 300 My in basal teleosts. These unprecedented observations on genetic polymorphism indicate the presence of extremely strong selective pressure for possessing dual specificities of PSMB8 in populations, although the actual evolutionary mechanism is still to be clarified.
Supplementary Material
Supplementary figures S1–S3 and table S1 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/).
Acknowledgments
We thank Drs Masatoshi Nei and Jan Klein for critical reading of the manuscript and Drs Shinji Takada, Sumito Koshida, and Mr Takeshi Nakai for supplying us with fish samples.
This work was supported in part by a Grant-in-Aid for Scientific Research on Priority Area, MEXT, Japan (to M.N.).
References
Agarwal AK, Xing C, DeMartino GN, Mizrachi D, Hernandez MD, Sousa AB, Martinez de Villarreal L, dos Santos HG, Garg A. 2010. PSMB8 encoding the beta5i proteasome subunit is mutated in joint contractures, muscle atrophy, microcytic anemia, and panniculitis-induced lipodystrophy syndrome. Am J Hum Genet. 87:866–872.
Arnold K, Bordoli L, Kopp J, Schwede T. 2006. The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling. Bioinformatics 22:195–201.
Blair JE, Hedges SB. 2005. Molecular phylogeny and divergence times of deuterostome animals. Mol Biol Evol. 22:2275–2284.
Esteves PJ, Lanning D, Ferrand N, Knight KL, Zhai SK, van der Loo W. 2005. The evolution of the immunoglobulin heavy chain variable region (IgVH) in Leporids: an unusual case of transspecies polymorphism. Immunogenetics 57:874–882.
Fehling HJ, Swat W, Laplace C, Kuhn R, Rajewsky K, Muller U, von Boehmer H. 1994. MHC class I expression in mice lacking the proteasome subunit LMP-7. Science 265:1234–1237.
Groll M, Ditzel L, Lowe J, Stock D, Bochtler M, Bartunik HD, Huber R. 1997. Structure of 20S proteasome from yeast at 2.4 Å resolution. Nature 386:463–471.
Kandil E, Namikawa C, Nonaka M, Greenberg AS, Flajnik MF, Ishibashi T, Kasahara M. 1996. Isolation of low molecular mass polypeptide complementary DNA clones from primitive vertebrates. Implications for the origin of MHC class I-restricted antigen presentation. J Immunol. 156:4245–4253.
Kasahara M. 1997. New insights into the genomic organization and origin of the major histocompatibility complex: role of chromosomal (genome) duplication in the emergence of the adaptive immune system. Hereditas 127:59–65.
Kelley J, Walter L, Trowsdale J. 2005. Comparative genomics of major histocompatibility complexes. Immunogenetics 56:683–695.
Klein J, Sato A, Nikolaidis N. 2007. MHC, TSP, and the origin of species: from immunogenetics to evolutionary genetics. Annu Rev Genet. 41:281–304.
Larkin MA, Blackshields G, Brown NP, et al. (13 co-authors). 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23:2947–2948.
Lukacs MF, Harstad H, Grimholt U, et al. (11 co-authors). 2007. Genomic organization of duplicated major histocompatibility complex class I regions in Atlantic salmon (Salmo salar). BMC Genomics 8:251.
Michalova V, Murray BW, Sultmann H, Klein J. 2000. A contig map of the Mhc class I genomic region in the zebrafish reveals ancient synteny. J Immunol. 164:5296–5305.
Miura F, Tsukamoto K, Mehta RB, Naruse K, Magtoon W, Nonaka M. 2010. Transspecies dimorphic allelic lineages of the proteasome subunit beta-type 8 gene (PSMB8) in the teleost genus Oryzias. Proc Natl Acad Sci U S A. 107:21599–21604.
Nonaka M, Yamada-Namikawa C, Flajnik MF, Du Pasquier L. 2000. Trans-species polymorphism of the major histocompatibility complex-encoded proteasome subunit LMP7 in an amphibian genus, Xenopus. Immunogenetics 51:186–192.
Ohta Y, Goetz W, Hossain MZ, Nonaka M, Flajnik MF. 2006. Ancestral organization of the MHC revealed in the amphibian Xenopus. J Immunol. 176:3674–3685.
Ohta Y, McKinney EC, Criscitiello MF, Flajnik MF. 2002. Proteasome, transporter associated with antigen processing, and class I genes in the nurse shark Ginglymostoma cirratum: evidence for a stable class I region and MHC haplotype lineages. J Immunol. 168:771–781.
Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE. 2004. UCSF Chimera–a visualization system for exploratory research and analysis. J Comput Chem. 25:1605–1612.
Posada D, Crandall KA. 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14:817–818.
Rock KL, Goldberg AL. 1999. Degradation of cell proteins and the generation of MHC class I-presented peptides. Annu Rev Immunol. 17:739–779.
Satta Y, Mayer WE, Klein J. 1996. HLA-DRB intron 1 sequences: implications for the evolution of HLA-DRB genes and haplotypes. Hum Immunol. 51:1–12.
Shiina T, Dijkstra JM, Shimizu S, et al. (15 co-authors). 2005. Interchromosomal duplication of major histocompatibility complex class I regions in rainbow trout (Oncorhynchus mykiss), a species with a presumably recent tetraploid ancestry. Immunogenetics 56:878–893.
Su C, Nei M. 1999. Fifty-million-year-old polymorphism at an immunoglobulin variable region gene locus in the rabbit evolutionary lineage. Proc Natl Acad Sci U S A. 96:9710–9715.
Swofford DL. 2000. Phylogenetic analysis using parsimony (* and other methods). Sunderland (MA): Sinauer Associates.
Tanaka K, Kasahara M. 1998. The MHC class I ligand-generating system: roles of immunoproteasomes and the interferon-gamma-inducible proteasome activator PA28. Immunol Rev. 163:161–176.
Tsukamoto K, Hayashi S, Matsuo MY, Nonaka MI, Kondo M, Shima A, Asakawa S, Shimizu N, Nonaka M. 2005. Unprecedented intraspecific diversity of the MHC class I region of a teleost medaka, Oryzias latipes. Immunogenetics 57:420–431.
Unno M, Mizushima T, Morimoto Y, Tomisugi Y, Tanaka K, Yasuoka N, Tsukihara T. 2002. The structure of the mammalian 20S proteasome at 2.75 Å resolution. Structure 10:609–618.
Yamanoue Y, Miya M, Inoue JG, Matsuura K, Nishida M. 2006. The mitochondrial genome of spotted green pufferfish Tetraodon nigroviridis (Teleostei: Tetraodontiformes) and divergence time estimation among model organisms in fishes. Genes Genet Syst. 81:29–39.