MBE Advance Access published July 25, 2007

The nonsynonymous/synonymous substitution rate ratio versus the radical/conservative replacement rate ratio in the evolution of mammalian genes

Kousuke Hanada1,2, Shin-Han Shiu2 and Wen-Hsiung Li1*

1. Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637
2. Department of Plant Biology, Michigan State University, East Lansing, MI 48824

Running head: Ka/Ks ratio vs radical/conservative replacement ratio

Key words: positive selection, radical substitution, conservative substitution, classification of amino acids, development.

*Corresponding author.
Wen-Hsiung Li, Department of Ecology and Evolution, University of Chicago, 1101 East 57th Street, Chicago, IL, 60637, USA.
Tel: +1-773-702-3104. Fax: +1-773-702-9740. E-mail: [email protected]

The Author 2007. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected]

Abstract

There are two ways to infer selection pressures in the evolution of protein-coding genes: the nonsynonymous and synonymous substitution rate ratio (KA/KS) and the radical and conservative amino acid replacement rate ratio (KR/KC). Since the KR/KC ratio depends on the definition of radical and conservative changes in the classification of amino acids, we develop an amino acid classification that maximizes the correlation between KA/KS and KR/KC. An analysis of 3,375 orthologous gene groups among five mammalian species shows that our classification gives a significantly higher correlation coefficient between the two ratios than those of existing classifications. However, there are many orthologous gene groups with a low KA/KS but a high KR/KC ratio. Examining the functions of these genes, we found an overrepresentation of functional categories related to development.
To determine if the over-representation is stage specific, we examined the expression patterns of these genes at different developmental stages of the mouse. Interestingly, these genes are highly expressed in the early middle stage of development (Blastocyst to Amnion). It is commonly thought that developmental genes tend to be conservative in evolution, but some molecular changes in developmental stages should have contributed to morphological divergence in adult mammals. Therefore, we propose that the relaxed pressures indicated by the KR/KC ratio but not by KA/KS in the early middle stage of development may be important for the morphological divergence of mammals at the adult stage, while purifying selection detected by KA/KS occurs in the early middle developmental stage.

Introduction

Selection pressure on protein-coding sequences is commonly estimated by the ratio of the nonsynonymous substitution rate (KA) to the synonymous substitution rate (KS) (Li and Gojobori 1983; Hughes and Nei 1988). If the KA/KS ratio is higher than 1, positive selection is assumed to have occurred during the evolution of the sequence. The ratio of the radical replacement rate (KR) to the conservative replacement rate (KC) has also been used to detect positive selection (Hughes, Ota, and Nei 1990). The KR/KC ratio is useful for examining selection pressure in distantly related protein-coding sequences because the KA/KS ratio cannot be accurately estimated in this case due to saturation of KS (Gojobori 1983; Smith and Smith 1996). Since there are two ways of inferring selection pressure on a sequence, an open question is whether these two approaches give the same conclusion or not. Zhang (2000) and Smith (2003) found that KA/KS is correlated with KR/KC based on the amino acid classification that considers polarity and volume, using 47 mammalian and 25 Drosophila genes.
However, there are several types of amino acid classifications and it is not known which classification gives a KR/KC measure that best correlates with the KA/KS ratio. Therefore, we do not know the degree of correlation between the two ratios in general.

In the present study, we searched for an amino acid classification that gives the best correlation between the two ratios. This amino acid classification is useful because the KR/KC ratio based on this classification can identify genes undergoing similar selection pressures inferred by the KA/KS ratio between distant protein-coding sequences.

Another issue is that the two ratios are likely not completely correlated even if the amino acid classification that gives the maximum correlation between the two ratios is used. To address the differences between the selection pressures inferred by KA/KS and KR/KC in the evolution of mammalian genes, we examined functions of genes that showed different selection pressures inferred by the two ratios, using Gene Ontology (GO) categories and expression data of a representative mammal, the mouse.

Materials & Methods

Construction of orthologous groups

cDNA data of five mammalian species were retrieved from the Ensembl database (www.ensembl.org): Homo sapiens (NCBI35.may), Pan troglodytes (CHIMP1.may), Mus musculus (NCBIM33.may), Rattus norvegicus (RGSC3.4.may) and Canis familiaris (BROADD1.may). Reciprocal best hits between every combination of two species were identified with Blastp (Altschul et al. 1997). Sequences that were reciprocal best hits among all species combinations (Fig. 1A) were considered an orthologous group among the five species. In total, 3,533 putative orthologous groups were constructed by this procedure.
To further verify the 3,533 orthologous groups, phylogenetic trees were constructed using the protein sequence alignments of members in an orthologous group by the neighbor-joining (NJ) method (Saitou and Nei 1987; Thompson, Higgins, and Gibson 1994). When the topology was different from the species tree, the data set was removed from the orthologous data (Fig. 1B). The total number of orthologous groups was reduced to 3,375. For the numbers of nucleotide sites used in these orthologous groups, the interquartile range (25%-75%) and the median number of nucleotide sites are 477.0-1175.0 and 756.0, respectively.

The orthologous gene groups in the five mammalian species were determined as follows. The orthologous gene data were carefully constructed to reduce errors in estimating nucleotide and amino acid substitutions. Only segments aligned among the five species without any gaps were used for the calculation of the KA/KS and KR/KC ratios.

Estimation of KA/KS and KR/KC in each orthologous gene set

A phylogenetic tree was reconstructed for each orthologous gene group by the NJ method (Saitou and Nei 1987). The ancestral sequence was inferred at each node in the phylogenetic tree using the maximum likelihood method (Yang, Kumar, and Nei 1995). The transition/transversion ratio was estimated in each orthologous group and the ratio was then used to estimate KA and KS in all branches in the phylogenetic tree by the modified Nei-Gojobori method (Zhang, Rosenberg, and Nei 1998). The sums of KA and KS of all branches were used to determine the KA/KS ratio in each orthologous gene group.
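The final step above (a ratio of branch sums rather than a mean of per-branch ratios) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `ka_ks_ratio` and the example branch values are hypothetical, and the per-branch KA and KS estimates are assumed to have been computed already by some other tool.

```python
def ka_ks_ratio(branch_ka, branch_ks):
    """Return sum(KA) / sum(KS) over all branches of one gene tree.

    Hypothetical helper: branch_ka and branch_ks are per-branch estimates
    for the same branches, in the same order. Returns None when the summed
    KS is zero, because the ratio is then undefined.
    """
    if len(branch_ka) != len(branch_ks):
        raise ValueError("KA and KS lists must cover the same branches")
    total_ks = sum(branch_ks)
    if total_ks == 0:
        return None
    return sum(branch_ka) / total_ks


# Made-up values for the seven branches of an unrooted five-species tree.
example_ka = [0.002, 0.003, 0.020, 0.025, 0.030, 0.010, 0.015]
example_ks = [0.010, 0.012, 0.150, 0.160, 0.200, 0.080, 0.090]
example_ratio = ka_ks_ratio(example_ka, example_ks)
```

Summing before dividing keeps short branches with very small KS from dominating the result, which is one plausible reason for preferring it over averaging per-branch ratios.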
Radical and conservative changes were defined by a classification (A) that gave the best correlation between KR/KC and KA/KS and also by three previous classifications with respect to the chemical properties: (B) polarity and volume, (C) charge and aromaticity, and (D) charge and polarity (Zhang 2000; Hanada, Gojobori, and Li 2006) (Table 1). These so-called physicochemical properties (aromaticity, charge, polarity, and volume) are thought to be relevant for the evolution of proteins (Grantham 1974; Miyata, Miyazawa, and Yasunaga 1979). Based on the ancestral sequences inferred at all nodes in the phylogenetic tree of each orthologous group, KR and KC were estimated in all branches in the phylogenetic tree by the Zhang method (Zhang 2000). The sums of branch lengths that reflected KR and KC were used to determine the KR/KC ratio in each orthologous group. Average KA, KS, KR and KC in each branch of the species tree among the 3,375 orthologous groups are given in Supplement A.

Construction of a new amino acid classification

To estimate the average KA/KS ratio for each amino acid replacement, we collected from the orthologous gene groups the amino acid replacements that had occurred. The average KA/KS ratio for each type of amino acid replacement is defined to be the average KA/KS ratio in the collected orthologous gene groups. The average KA/KS ratios were estimated for each of the 75 kinds of amino acid replacement occurring by single nucleotide substitution. Since an amino acid replacement having a low (high) KA/KS ratio should tend to be a conservative (radical) change in the highly associated classification, radical and conservative scores were numbered for the 75 types of amino acid replacement in descending (ascending) order of KA/KS (Supplement B).
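As an illustration of the quantities used in this section, the following sketch enumerates the amino acid pairs that are interchangeable by a single nucleotide substitution under the standard genetic code, and labels each pair as radical (between groups) or conservative (within a group) using the four groups of Classification A as listed in Table 1. The function and variable names are hypothetical, and the sketch is not the authors' implementation; it does not compute KR or KC themselves.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Groups of Classification A, copied from Table 1.
CLASSIFICATION_A = ["ANCGPST", "ILMV", "RQHKFWY", "DE"]


def single_step_replacements():
    """Unordered amino acid pairs that differ by one nucleotide change."""
    pairs = set()
    for codon, aa in CODON_TABLE.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                other = CODON_TABLE[mutant]
                # Skip synonymous changes and changes involving stop codons.
                if aa == other or "*" in (aa, other):
                    continue
                pairs.add(frozenset((aa, other)))
    return pairs


def is_radical(aa1, aa2, groups=CLASSIFICATION_A):
    """True if the two amino acids fall in different groups."""
    group_of = {aa: i for i, group in enumerate(groups) for aa in group}
    return group_of[aa1] != group_of[aa2]


replacements = single_step_replacements()
radical = [p for p in replacements if is_radical(*sorted(p))]
conservative = [p for p in replacements if not is_radical(*sorted(p))]
```

The enumeration recovers the 75 replacement types mentioned in the text; the split into radical and conservative pairs depends entirely on which classification is passed to `is_radical`.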
Using the radical and conservative scores for the 75 types of amino acid replacement, we calculated the totals of radical and conservative scores for each amino acid classification. To find an amino acid classification that would give the maximum correlation between KR/KC and KA/KS, amino acids were classified into two to five groups in all possible combinations and we identified the classification with the highest score. The new classification is regarded as the amino acid classification that can more adequately characterize the relationship between KA/KS and KR/KC.

Functional categories by Gene Ontology

Orthologous gene groups with the top and bottom 10% of KA/KS or KR/KC values were considered as relaxed selection groups and purifying selection groups, respectively. Under this classification, there are four possible combinations for the orthologous gene groups: (1) relaxed selection groups inferred by both KA/KS and KR/KC (a high KA/KS and a high KR/KC), (2) purifying selection groups inferred by both KA/KS and KR/KC (a low KA/KS and a low KR/KC), (3) relaxed and purifying selection groups inferred by KA/KS and by KR/KC (a high KA/KS and a low KR/KC), respectively, and (4) purifying selection and relaxed selection groups inferred by KA/KS and by KR/KC (a low KA/KS and a high KR/KC), respectively.

Gene Ontology (GO) assignments for the mouse genes were obtained from the mouse genome database (Hill et al. 2002). To simplify functional interpretation, we used the GO categories of biological processes from the top to the fourth level of the hierarchy. The expected proportion of each GO category assigned by the mouse genes was compared with the observed proportion of each GO category assigned by the mouse genes of orthologous gene groups undergoing different selection pressures by the chi-square test.
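A comparison of observed and expected proportions of this kind can be sketched as a one-degree-of-freedom chi-square goodness-of-fit test. The text does not specify the exact layout of the test, so the two-cell design below (in the category versus not in the category) is an assumption, and the function name and the example numbers are hypothetical.

```python
import math


def chi_square_go_test(observed_in_category, n_selected, expected_proportion):
    """Goodness-of-fit chi-square test with 1 degree of freedom.

    observed_in_category: genes of the selected set annotated with the GO term
    n_selected:           size of the selected gene set
    expected_proportion:  proportion of all mouse genes with the GO term
                          (must lie strictly between 0 and 1)
    Returns (chi2, p_value).
    """
    if not 0.0 < expected_proportion < 1.0:
        raise ValueError("expected_proportion must be between 0 and 1")
    expected_in = n_selected * expected_proportion
    expected_out = n_selected * (1.0 - expected_proportion)
    observed_out = n_selected - observed_in_category
    chi2 = ((observed_in_category - expected_in) ** 2 / expected_in
            + (observed_out - expected_out) ** 2 / expected_out)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical example: 20 of 50 selected genes carry a term found in 20% of all genes.
chi2, p_value = chi_square_go_test(20, 50, 0.2)
overrepresented = p_value < 0.05 and 20 / 50 > 0.2
```

The test itself is two-sided, so the direction of the deviation is checked separately before a category is called overrepresented.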
When the observed proportion is significantly higher than the expected proportion in a given GO category (P<0.05), the hierarchical pathways from the root to the overrepresented GO category were shown by the Graphviz software (www.graphviz.org).

The expression pattern at a developmental stage

The mouse expression dataset covering various stages of mouse development (Ringwald et al. 2001) was used to determine the relationships between gene expression and the nature of selection pressure as determined by the KA/KS and KR/KC measures. Among different selection pressures, we compared the expression bias of genes at a developmental stage by the following equation.

\[ R = \frac{N_{ob.}}{N_{ex.}} = \frac{N_{ob.}}{P_{all} \cdot N_{selected}} \]

For a particular developmental stage, Nob. and Nex. are the observed and expected numbers of expressed genes that experienced purifying or relaxed selection pressure at the developmental stage, Pall is the proportion of all mouse genes expressed at a given developmental stage, and Nselected is the total number of genes undergoing each of four types of selection pressures. Nex. was calculated by multiplying Pall by Nselected.

Results

A new classification of amino acids

To find a new classification that yields the maximum correlation between KA/KS and KR/KC, we first constructed all possible combinations in which the 20 amino acids can be classified into two to five groups. Second, a table representing the average KA/KS ratio for each type of amino acid replacement was constructed to see what kinds of amino acid replacements more adequately characterize the KA/KS ratio (Supplement B). Based on the table, a new classification of amino acids with a higher correlation between the KA/KS ratio and the radical or conservative change was constructed (Classification A in Table 1). In the new classification, amino acids are classified into basic, acidic and neutral charges.
The aromatic amino acids belong to the group of the basic charges because one of the aromatic amino acids has a basic charge. The amino acids with neutral charge are classified into small and large volumes that fall into distinct groups. Consequently, this new classification seems to be constructed with respect to the chemical properties of charge, aromaticity and volume.

Correlation between KR/KC and KA/KS

Using three existing amino acid classifications and our new classification, we estimated four KR/KC ratios for each orthologous gene group. The four KR/KC ratios were significantly positively correlated with each other (P < 0.01) (Table 2). In terms of the correlation between KR/KC and KA/KS, the correlation coefficient in the new classification (A, r = 0.48; Table 2) was expected to be the highest among the four chemical classifications because the new classification (A) was constructed by the chemical properties associated with the KA/KS ratio. In fact, the correlation coefficient between KA/KS and KR/KC based on the new classification is significantly higher than those based on the other three classifications (P < 0.01), though the other three KR/KC ratios are also each positively correlated with the KA/KS ratio (P < 0.01) (Fig. 2).

However, even under the new classification, which gives the highest correlation between the two ratios, the correlation coefficient is less than 0.5, indicating that selective pressures inferred by the KR/KC ratio and by the KA/KS ratio differ substantially. In particular, there are many orthologous gene groups with a low KA/KS and a high KR/KC ratio (Fig. 2). These orthologous gene groups have likely undergone relaxed selection in radical amino acid substitutions as indicated by the KR/KC ratio but experienced purifying selection in nonsynonymous changes as indicated by the KA/KS ratio.
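The correlation coefficients discussed above can be computed per classification from the per-group ratios. The sketch below assumes Pearson's product-moment coefficient, which the text does not state explicitly; the function name and the toy values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Toy per-group ratios (made up): one KA/KS and one KR/KC value per orthologous group.
ka_ks = [0.05, 0.10, 0.20, 0.40, 0.60]
kr_kc = [0.30, 0.90, 0.50, 1.20, 1.00]
r = pearson_r(ka_ks, kr_kc)
```

In practice one list of KR/KC values would be built for each of the four classifications and each compared against the same list of KA/KS values.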
Overrepresented functional categories undergoing opposite selection pressures inferred by two ratios

There are four types of selection pressure experienced by the orthologous gene groups. The number of orthologous gene groups that experienced relaxed or purifying selection pressures in the two ratios is shown in Table 3 and the gene lists are given in Supplement C. Since KA/KS was on the whole positively correlated with KR/KC in mammals, more groups underwent the same selection pressure according to both ratios than underwent opposite selection pressures according to the two ratios. Groups with opposite selection pressures were found only among those with a high KR/KC and a low KA/KS ratio.

To assess the functions of groups that underwent different selection pressures, we examined significantly overrepresented Gene Ontology (GO) categories of mouse genes in orthologous gene groups subject to each type of selection pressure (Fig. 3, Supplement D). The overrepresented functions of genes with a high KR/KC and a high KA/KS ratio are related to "response to stimulus" and "physiological process". In particular, several functions related to defense response can be clearly found in these genes. Since genes related to defense response are in general accepted as genes undergoing positive selection, these results seem biologically reasonable. On the other hand, the overrepresented functions of genes with a low KA/KS ratio are related to development. This result is also reasonable because most of the genes related to development are subject to purifying selection based on the KA/KS ratio between distantly related species (Powell et al. 1993; Slack, Holland, and Graham 1993). However, it is unclear whether this holds true if the KR/KC ratio is used to evaluate the selection pressure in genes related to development.
In genes with a low KA/KS ratio, sex determination and cell differentiation are overrepresented in genes with a low and a high KR/KC ratio, respectively (Fig. 3). Sex determination is likely conserved among mammals but cell differentiation may be required to be somewhat different among mammals for the divergent evolution seen in mammals. Thus, it is possible that relaxed selection pressures indicated by the KR/KC ratio may be one of the important factors for the evolution in mammals.

To further examine the different gene functions between the high and low KR/KC ratios in mammalian development, we examined the expression of mouse genes with different selection pressures using the mouse expression dataset covering various stages of development (Fig. 4 A, B). Genes subject to purifying selection based on both ratios are expressed at high levels at the early developmental stages (One cell egg to Blastocyst). On the other hand, genes subject to purifying selection indicated by KA/KS but relaxed selection indicated by KR/KC were expressed predominantly in the early middle stage of development (Blastocyst to Amnion). The relaxed pressures indicated solely by the KR/KC ratio in the early middle stage of development may be important for the divergent evolution in mammals.

Discussion

The key finding of the present study is that a positive correlation between KA/KS and KR/KC at a genomic scale is observed in all amino acid classifications, indicating that the two tests of selection pressure give similar conclusions in mammalian evolution. In particular, the KR/KC ratio of the new classification is useful for estimating selection pressure between distantly related sequences (Gojobori 1983; Smith and Smith 1996). Since the evolutionary rate of synonymous substitution is much faster than that of nonsynonymous substitution, KS is often saturated between distant sequences.
On the other hand, the KR/KC ratio is estimated from amino acid replacements only, and the evolutionary rate of amino acid replacement is much slower than that of synonymous substitution, so that the KR/KC ratio can be estimated for distant sequences. Thus, the new classification (A) can produce a useful KR/KC ratio for estimating the selection pressure in distant sequences. It should be noted that several reports had classified amino acid replacements into radical and conservative amino acid changes by the likelihood of amino acid replacements and estimated selection pressures by such radical and conservative amino acid changes (Tang et al. 2004; Gojobori et al. 2007). On the other hand, in the present study, we defined radical and conservative changes by the likelihoods of nonsynonymous and synonymous substitutions. Therefore, the selection pressures inferred by radical and conservative changes under our definition should be more likely to resemble the selection pressures inferred by the KA/KS ratio.

However, a major limitation in substituting KR/KC for KA/KS is that, even when we used the new classification aimed at maximizing the correlation between KR/KC and KA/KS, the correlation between KR/KC and KA/KS is still less than 0.5. There are potentially two reasons why the two ratios are not highly correlated. One reason is biological. For some genes, KR/KC may not be related to the type of natural selection identified by KA/KS. The other reason is technical. In the computation of the KR/KC ratio, radical and conservative changes were defined as amino acid replacements between groups and within groups, respectively. Because each replacement is scored as either "0" or "1" under this definition, the KR/KC ratio may not fully represent the selection pressure of amino acid replacements.

We note that there are many orthologous gene groups with a low KA/KS and a high KR/KC as outliers.
To address the opposite selection pressures, we examined the functions of mouse genes and found that functional categories related to development were overrepresented in these genes. We then examined the expression patterns of these genes at different developmental stages. The mouse genes that underwent such selection pressures tend to be over-expressed in the early middle developmental stages. Richardson (1999) proposed that the early middle developmental stages were important for speciation of mammals because these are the stages when many adult traits are specified even if these stages were conservative at the morphological level. Therefore, we propose that the relaxed selection pressures indicated by KR/KC but not by KA/KS in the early middle developmental stages may be important for the morphological divergence of mammals at the adult stage, while purifying selection detected by KA/KS tends to occur in the early middle developmental stages. The differences in the selection pressures assessed by KA/KS and KR/KC indicate that, although genes involved in development have strong constraints in amino acid substitutions, radical changes in the substitutions permitted are likely important for developmental divergence of adult mammals. Thus, opposite selection pressures inferred by the two measures might play an important role in the evolution of genes related to development in mammals.

In summary, we inferred 3,375 orthologous gene groups in 5 mammalian species in a stringent manner. KR/KC is positively correlated with KA/KS. The correlation was observed in each of four chemical classifications taking account of aromaticity, charge, polarity or volume. In particular, the chemical classification for aromaticity, charge and volume led to the highest correlation between these two ratios. Moreover, the genes with high KR/KC but low KA/KS were over-represented with genes expressed at a high level in the early middle developmental stages.
The selection pressures at these developmental stages may be important for the morphological diversification of mammals.

Acknowledgements

We thank the members of our laboratories for valuable comments and discussion. This study was supported by NIH grant (GM30998) to W.-H. L. and an NSF grant (DBI-0638591) to S.-H. S.

Table 1. Four classifications of amino acids.

Classification A by the maximum correlation with the KA/KS ratio
  Neutral & small                               ANCGPST   (MW*: 75-146)
  Neutral & large                               ILMV      (MW*: 146-204)
  Basic acid, Aromaticity & Relatively small    RQHKFWY   (MW*: 117-149)
  Acidic charge & Relatively large              DE        (MW*: 133-147)

Classification B by polarity & volume
  Special                                       C
  Neutral and Small                             AGPST
  Polar & relatively small                      NDQE
  Polar & relatively large                      RHK
  Nonpolar & relatively small                   ILMV
  Nonpolar & relatively large                   FWY

Classification C by charge & aromatic
  Acidic                                        DE
  Neutral & No aromaticity                      QAVLICSTNGPM
  Neutral & Aromaticity                         FYW
  Basic                                         KRH

Classification D by charge & polarity
  Neutral & Polarity                            STYCNQ
  Acidic & Polarity                             DE
  Basic & Polarity                              KRH
  No polarity                                   GAVLIFPMW

*MW: Molecular weight

Table 2. Correlation coefficient between KR/KC and KA/KS.

                            KR/KC (B)   KR/KC (C)   KR/KC (D)   KA/KS
  KR/KC (Classification A)  0.77        0.67        0.35        0.48
  KR/KC (Classification B)              0.73        0.35        0.38
  KR/KC (Classification C)                          0.52        0.37
  KR/KC (Classification D)                                      0.22

Table 3. The number of orthologous groups undergoing different selection pressures.

                                                     Relaxed selection by KR/KC   Purifying selection by KR/KC
                                                     (top 10% of KR/KC ratio)     (bottom 10% of KR/KC ratio)
  Relaxed selection by KA/KS (top 10% of KA/KS)      116                          0
  Purifying selection by KA/KS (bottom 10% of KA/KS) 47                           147

Literature Cited

Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389-3402.
Gojobori, J., H. Tang, J. M. Akey, and C. I. Wu. 2007. Adaptive evolution in humans revealed by the negative correlation between the polymorphism and fixation phases of evolution. Proc Natl Acad Sci U S A 104:3907-3912.
Gojobori, T. 1983. Codon substitution in evolution and the "saturation" of synonymous changes. Genetics 105:1011-1027.
Grantham, R. 1974. Amino acid difference formula to help explain protein evolution. Science 185:862-864.
Hanada, K., T. Gojobori, and W. H. Li. 2006. Radical amino acid change versus positive selection in the evolution of viral envelope proteins. Gene 385:83-88.
Hill, D. P., J. A. Blake, J. E. Richardson, and M. Ringwald. 2002. Extension and integration of the gene ontology (GO): combining GO vocabularies with external vocabularies. Genome Res 12:1982-1991.
Hughes, A. L., and M. Nei. 1988. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335:167-170.
Hughes, A. L., T. Ota, and M. Nei. 1990. Positive Darwinian selection promotes charge profile diversity in the antigen-binding cleft of class I major-histocompatibility-complex molecules. Mol Biol Evol 7:515-524.
Li, W. H., and T. Gojobori. 1983. Rapid evolution of goat and sheep globin genes following gene duplication. Mol Biol Evol 1:94-108.
Miyata, T., S. Miyazawa, and T. Yasunaga. 1979. Two types of amino acid substitutions in protein evolution. J Mol Evol 12:219-236.
Powell, J. R., A. Caccone, J. M. Gleason, and L. Nigro. 1993. Rates of DNA evolution in Drosophila depend on function and developmental stage of expression. Genetics 133:291-298.
Ringwald, M., J. T. Eppig, D. A. Begley, J. P. Corradi, I. J. McCright, T. F. Hayamizu, D. P. Hill, J. A. Kadin, and J. E. Richardson. 2001. The Mouse Gene Expression Database (GXD). Nucleic Acids Res 29:98-101.
Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol 4:406-425.
Slack, J. M., P. W. Holland, and C. F. Graham. 1993. The zootype and the phylotypic stage. Nature 361:490-492.
Smith, J. M., and N. H. Smith. 1996. Synonymous nucleotide divergence: what is "saturation"? Genetics 142:1033-1036.
Smith, N. G. 2003. Are radical and conservative substitution rates useful statistics in molecular evolution? J Mol Evol 57:467-478.
Tang, H., G. J. Wyckoff, J. Lu, and C. I. Wu. 2004. A universal evolutionary index for amino acid changes. Mol Biol Evol 21:1548-1556.
Thompson, J. D., D. G. Higgins, and T. J. Gibson. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 22:4673-4680.
Yang, Z., S. Kumar, and M. Nei. 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141:1641-1650.
Zhang, J. 2000. Rates of conservative and radical nonsynonymous nucleotide substitutions in mammalian nuclear genes. J Mol Evol 50:56-68.
Zhang, J., H. F. Rosenberg, and M. Nei. 1998. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc Natl Acad Sci U S A 95:3708-3713.

Figure legends

Fig. 1. Construction of ortholog data.
The similarity search was conducted by Blastp as in Fig. 1A. Reciprocal best hits were identified between every pair of species. The number of reciprocal best hits between pair of species is shown between each pair of species. When sequences reciprocally had the best hits among the five species, the sequences were considered an orthologous gene among the five species. A phylogeny was then generated for each orthologous gene group. When the phylogeny of the orthologs from the five species is different from the topology of the species phylogeny, this putative ortholog was removed from the ortholog data. The species phylogeny is shown in Fig. 1B.

Fig. 2. Correlation between KA/KS and KR/KC.
The X-axis is the KA/KS ratio and the Y-axis is the KR/KC ratio. The ratios were computed based on classification A (r=0.48) (A); classification B (r=0.38) (B); classification C (r=0.37) (C); and classification D (r=0.22) (D).

Fig. 3. Overrepresented functions in genes with a low KA/KS and a high KR/KC and genes with a low KA/KS and a low KR/KC ratio.
The arrowheads point to subcategories. (A) Categories overrepresented in genes with a low KA/KS and a high KR/KC are in black circles (P < 0.05). (B) Categories overrepresented in genes with a low KA/KS and a low KR/KC are in black circles (P < 0.05).

Fig. 4. Expression levels of genes with different selection pressures in each developmental stage.
(A) The X-axis indicates the developmental stage. The names of each stage are as follows: 1 (One cell egg), 2 (Beginning of cell division), 3 (Morula), 4 (Advanced division/segmentation), 5 (Blastocyst), 6 (Implantation), 7 (Formation of egg cylinder), 8 (Differentiation of egg cylinder), 9 (Advanced endometrial reaction; prestreak), 10 (Amnion; midstreak), 11 (Neural plate, presomite; no allantoic bud), 12 (First somites; late head fold), 13 (Turning), 14 (Formation & closure anterior neuropore), 15 (Formation of posterior neuropore, forelimb bud), 16 (Closure post. neuropore, hindlimb & tail bud), 17 (Deep lens indentation), 18 (Closure lens vesicle), 19 (Complete separation of lens vesicle), 20 (Earliest sign of fingers), 21 (Anterior footplate indented, marked pinna), 22 (Fingers separate distally), 23 (Toes separate), 24 (Reposition of umbilical hernia), 25 (Fingers and toes joined together), 26 (Long whiskers) and 28 (Postnatal development). The Y-axis indicates the normalized difference of expressed genes between genes undergoing a selection pressure and all genes. (B) The sliding window analysis (5 stages) was conducted based on (A). The X-axis is the mean of normalized difference in five developmental stages. The Y-axis indicates the average normalized difference in each window.
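The expression-bias ratio R defined in Materials & Methods and the five-stage sliding window used in panel B can be sketched as follows. This is a hedged illustration only: the function names and all numbers are hypothetical, and the legend does not say how the "normalized difference" on the Y-axis is derived from R, so the sketch stops at R and a plain moving average.

```python
def expression_bias(n_observed, p_all, n_selected):
    """R = N_ob / N_ex, with N_ex = P_all * N_selected.

    n_observed: genes of the selected set expressed at the stage (N_ob)
    p_all:      proportion of all mouse genes expressed at the stage (P_all)
    n_selected: total number of genes in the selected set (N_selected)
    """
    n_expected = p_all * n_selected
    if n_expected == 0:
        raise ValueError("expected count is zero; R is undefined")
    return n_observed / n_expected


def sliding_mean(values, window=5):
    """Mean of each run of `window` consecutive stages."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Made-up per-stage counts for one selected gene set of 100 genes.
observed = [30, 25, 20, 28, 35, 40, 38]
p_all = [0.20, 0.20, 0.25, 0.25, 0.25, 0.20, 0.20]
r_per_stage = [expression_bias(o, p, 100) for o, p in zip(observed, p_all)]
smoothed = sliding_mean(r_per_stage, window=5)
```

With seven stages and a window of five, the smoothed series has three values; R above 1 at a stage means the selected genes are expressed there more often than expected from all mouse genes.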
[FIG 1. Panel A: numbers of reciprocal best hits between each pair of species (human, chimpanzee, mouse, rat, dog). Panel B: species phylogeny.]

[FIG 2. Panels A-D: KR/KC ratio (Y-axis; classifications A, B, C and D, respectively) plotted against the KA/KS ratio (X-axis).]

[FIG 3. Hierarchies of GO biological-process categories. Panel A: biological_process, development, embryonic_development, embryonic_development (sensu_Metazoa), pattern_specification, axis_specification, anterior/posterior pattern_formation, cellular_process, cell_differentiation, epidermal_cell_differentiation, regulation_of_biological process, regulation_of_development, regulation_of_epidermis development, regulation_of_binding. Panel B: biological_process, sex_determination, male_sex_determination, development, pattern_specification, axis_specification, growth, developmental_growth, blastocyst_growth, response_to_stimulus, behavior, visual_behavior.]

[FIG 4. Panels A and B: values plotted over developmental stages for genes under purifying selection indicated by both KA/KS and KR/KC, genes under purifying selection indicated by KA/KS but relaxed selection indicated by KR/KC, and genes under relaxed selection indicated by both KA/KS and KR/KC.]