Acta Botanica Sinica 植物学报 2004, 46 (1): 10-19
http://www.chineseplantscience.com

A Preliminary Study on the Origin and Evolution of Chalcone Synthase (CHS) Gene in Angiosperms

HUANG Jin-Xia, QU Li-Jia, YANG Ji, YIN Hao, GU Hong-Ya*
(College of Life Sciences, Peking University, Beijing 100871, China)

Abstract: Using the thermal asymmetric interlaced PCR (TAIL-PCR) method, a DNA fragment of about 1 000 bp was amplified and cloned from a liverwort species, Lunularia cruciata (L.) Dum. ex Lindb. The nucleotide sequence of this fragment and its deduced amino acid sequence shared about 56% and 60% identity, respectively, with those of exon 2 of CHS genes from vascular plants. The four characteristic catalytic sites of CHS were conserved in the deduced amino acid sequence of the fragment when compared with other CHS sequences. This is the first report of cloning a CHS-like gene from liverworts, suggesting that the origin of CHS genes may predate liverworts. Using the CHS-like sequence from L. cruciata and CHS sequences from two fern-ally species, Psilotum nudum (L.) Griseb. and Equisetum arvense L., as outgroups, phylogenetic trees of about 250 CHSs from 29 families of angiosperms were constructed with the neighbour-joining (NJ), maximum parsimony (MP) and quartet puzzle (QP) methods. The results showed that the CHSs from most plant families were separated into two or more clades, while sequences from the families Brassicaceae, Fabaceae and Poaceae were each grouped into an independent monophyletic clade. The relative base substitution rates were estimated for CHS genes in three plant families, Solanaceae, Convolvulaceae, and Asteraceae, and rate heterogeneity was detected both within and among the families.
The results indicated that CHS genes in angiosperms are highly diverse in terms of copy number, base substitution rate, and duplication/deletion events, which might be correlated with the diversity of life history, habitat, floral characters, and defense systems of angiosperms.
Key words: Lunularia cruciata; chalcone synthase; phylogeny; substitution rate

Chalcone synthase (CHS), a key enzyme in the biosynthetic pathway of flavonoids, is found only in plants. It catalyzes the stepwise condensation of three acetate residues from malonyl-CoA with p-coumaroyl-CoA to yield the intermediate naringenin chalcone. In plants, flavonoids play important roles in many physiological processes, such as flower pigmentation, protection against UV damage and pathogens, and formation of root nodules in leguminous plants (Koes et al., 1994). Since the first CHS cDNA was cloned in 1983 (Reimold et al., 1983), the CHS gene has become an attractive model for studying the regulation of gene expression and the evolution of gene families (Ursula et al., 1987; Mo et al., 1992; Dong et al., 2001; Koch et al., 2001; Lukacin et al., 2001; Jez et al., 2002; Yang et al., 2002). The CHS genes studied so far contain one intron and two exons, with the sole exception of one gene from Antirrhinum majus L. that contains two introns (Sommer and Saedler, 1986). The intron splits a cysteine codon at a position that is conserved in all CHS genes analyzed. The first exon (exon 1) encodes about 60 amino acid residues, whereas the second exon (exon 2) encodes about 340 amino acid residues. Exon 2 is more conserved than exon 1 in both length and nucleotide sequence. The four residues acting as the chemically active sites are also located in exon 2 and are conserved in all known CHS enzymes (CHS2A from Medicago sativa L. as reference sequence, Ferrer et al., 1999).
The high sequence similarity and conserved gene structure suggest that CHS genes may have originated from a common ancestor. Since flavonoids have been found in mosses and vascular plants, it has been speculated that the gene(s) coding for CHS or CHS-like enzyme(s) should be present in the genomes of mosses and higher plants (Swain, 1986; Stafford, 1991). To date, many CHS genes have been cloned from gymnosperms and angiosperms. However, no such gene has been cloned from mosses. The most primitive plant from which CHS genes have been reported is a fern-ally species, Psilotum nudum (Yamazaki et al., 2001).

Received 2 Jul. 2003; accepted 15 Sep. 2003. Supported by the National Natural Science Foundation of China (39830020). * Author for correspondence. Tel: +86 (0)10 62751847; E-mail: <[email protected]>.

At least two genes are found coding for CHS in most angiosperm species, whereas in some species of the families Solanaceae and Fabaceae more than eight CHS genes have been detected. The expression pattern of CHS genes has been studied extensively (Ryder et al., 1987; Koes et al., 1989; Howles et al., 1995; Clegg et al., 1997; Ito et al., 1997). Different CHS genes have been found to have different expression profiles: some are expressed in roots, some in leaves, some in flowers or even in different parts of flowers, and some are wound- or UV-inducible, implying that different CHS genes may have diverged functionally. However, the evolution of CHS genes in different angiosperm families has not yet been studied extensively. To trace the "history" of the CHS gene, it is necessary to examine whether CHS or CHS-like genes exist in non-vascular plants and to study the evolutionary trends of this gene in angiosperms. In this study, a liverwort species, Lunularia cruciata, was selected to search for a CHS gene in its genome.
The relationships among the CHS genes of angiosperms were studied on the basis of the phylogenetic tree. The evolutionary pattern of CHS genes in angiosperm families is also discussed in terms of base substitution rates.

1 Materials and Methods
1.1 Materials
1.1.1 Plant materials
Fresh plants of Lunularia cruciata (L.) Dum. ex Lindb. were collected from the greenhouse of Peking University.
1.1.2 Sources of CHS sequences from fern-ally and angiosperm plants
The CHS sequences used in this study were collected from the EMBL database and from Wang et al. (2000). Only CHS genes were selected. The genes coding for other members of the CHS superfamily, such as stilbene synthase (STS), 2-pyrone synthase, acridone synthase and valerophenone synthase, were excluded from this study.
1.2 Methods
1.2.1 DNA isolation and gene cloning from L. cruciata
Total DNA was isolated from fresh plants of L. cruciata using the modified CTAB method (Gu et al., 1995). A pair of degenerate primers was designed based on the conserved region of CHSs: 5'-AT(T/C) AC(T/C) CA(C/T) (G/C)TN (G/A/C)T(A/C/T) TTC TGC AC(A/T/C) AC-3' and 5'-AG(G/A) ATN GC(A/C/G) GGN CC(A/T) CCN GG(G/A) TG-3'. The PCR was performed in a 50 µL reaction mixture containing 100 ng of total DNA of L. cruciata as template, 25 pmol of each primer, 2.5 units of Taq DNA polymerase, and 0.25 mmol/L each of dATP, dCTP, dGTP and dTTP. The template DNA was denatured at 94 ℃ for 5 min prior to amplification. PCR was performed in a Peltier Thermal Gradient Cycler programmed for 35 cycles of 94 ℃ for 50 s, 51-60 ℃ for 1 min, and 72 ℃ for 1 min, followed by 72 ℃ for 10 min. A 417-bp fragment was amplified, cloned and sequenced. The deduced amino acid sequence of this fragment shared about 70% similarity with that of other CHSs.
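The degenerate primer notation above can be unpacked programmatically. The following is a minimal sketch, not part of the original protocol: it parses the bracketed notation, counts how many distinct oligonucleotides each degenerate primer represents, and can enumerate them. The function names `parse_primer`, `degeneracy` and `expand` are hypothetical helpers introduced only for illustration.

```python
from itertools import product
from math import prod

# Primer strings as written in the text (5'->3'), without the end labels.
FORWARD = "AT(T/C) AC(T/C) CA(C/T) (G/C)TN (G/A/C)T(A/C/T) TTC TGC AC(A/T/C) AC"
REVERSE = "AG(G/A) ATN GC(A/C/G) GGN CC(A/T) CCN GG(G/A) TG"


def parse_primer(notation):
    """Return one string of allowed bases per primer position.

    '(T/C)' becomes 'TC', 'N' becomes 'ACGT', a plain base stays as is.
    """
    text = notation.replace(" ", "")
    positions = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "(":
            end = text.index(")", i)
            positions.append(text[i + 1:end].replace("/", ""))
            i = end + 1
        elif ch == "N":
            positions.append("ACGT")
            i += 1
        else:
            positions.append(ch)
            i += 1
    return positions


def degeneracy(positions):
    """Number of distinct sequences encoded by the degenerate primer."""
    return prod(len(choices) for choices in positions)


def expand(positions):
    """Lazily generate every concrete sequence in the primer pool."""
    return ("".join(bases) for bases in product(*positions))


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        pos = parse_primer(primer)
        print(name, "length:", len(pos), "degeneracy:", degeneracy(pos))
```

Under this reading of the notation, the forward primer is a 26-mer pool and the reverse primer a 23-mer pool; the degeneracy figure gives a rough sense of how dilute any single exact-match oligonucleotide is in the 25 pmol used per reaction.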
Based on this sequence, three specific primers for TAIL-PCR were designed at each of its 5'-end and 3'-end: 5'-end (P5-1: 5'-CG AAG CAT CCT TGC TGG TAG AG-3', P5-2: 5'-GGT AGA GCA TGG TGC GGT TCA CA-3' and P5-3: 5'-TTC ACT CCA CTG GTG GTG CAG AA-3'); 3'-end (P3-1: 5'-CAT CTT CGG TGA TGG AGC CTC AGT C-3', P3-2: 5'-TGA TGG AGC CTC AGT CCT CGT CAT T-3' and P3-3: 5'-GCT ATC GAA GGA CGC CTG ACT GAA G-3'). The arbitrary degenerate (AD) primers were AD1-1 (5'-NTC GA(G/C) T(A/T)T (G/C)G(A/T) GTT-3') and AD1-2 (5'-GT CGA (G/C)(A/T)G ANA (A/T)GAA-3'). TAIL-PCR amplification was performed according to the protocol described by Liu and Huang (1998), and its strategy is illustrated in Fig.1. After DNA fragments upstream of the 5'-end and downstream of the 3'-end of the 417-bp fragment had been obtained and sequenced, a pair of primers equivalent to the primers used previously (Wang et al., 2000) was designed, DQ5 (5'-CCC TCC CTT GAC GTT CGA CAG GAC-3') and DQ3 (5'-CTA TTC GTT CTC GAT CAG ATG CGG-3'), to ensure cloning of exon 2 of the CHS or CHS-like gene. PCR amplifications were carried out with the same reaction parameters used to amplify the 417-bp fragment. PCR products were purified from low-melting-point agarose gel and cloned into a pGEM T-Easy vector (Promega, Wisconsin). Plasmid DNA was purified using the Wizard Plus SV Minipreps DNA Purification System (Promega, Wisconsin) and sequenced on an ABI 377 automated DNA sequencer using the Dye Terminator Cycle Sequencing kit (PE Applied Biosystems, USA).

Fig.1. The strategy of TAIL-PCR amplification of the CHS-like sequences from Lunularia cruciata. There are three steps in TAIL-PCR for cloning the 5'-end and 3'-end flanking fragments of the 417-bp fragment. At the 5'-end, the first amplification reaction was carried out with the genomic DNA of L. cruciata as the template and P5-1 and AD1-1 as primers; the second with the product of the first reaction as the template and P5-2 and AD1-1 as primers; the third with the product of the second reaction as the template and P5-3 and AD1-1 as primers. At the 3'-end, the reactions were the same as those at the 5'-end, except that the primers P3-1 and AD1-2 were used in the first reaction, P3-2 and AD1-2 in the second, and P3-3 and AD1-2 in the third. The specific PCR products of the second and third reactions were cloned and sequenced.

1.2.2 Data analysis
Sequences were aligned with CLUSTAL W (Thompson et al., 1994) and then adjusted manually. To test for possible differences in the relative base substitution rate (abbreviated as "rate" in the following text), the programs RRTree (Robinson et al., 1998) and K2Wuli (Jermiin, 1996) were used to compare the rates within and between plant families. Because genes of one plant family that were clustered in different lineages in the tree constructed in this study may have significantly different rates, rate differences were tested both between lineages of the same family and between lineages of different families. The neighbor-joining (NJ) method (Saitou and Nei, 1987), as implemented in MEGA2.0 (Kumar et al., 2001), was used for phylogenetic analysis with the Kimura 2-parameter model. The robustness of the tree topology was assessed by bootstrap analysis with 1 000 resampling replicates. The maximum parsimony (MP) and quartet puzzle (QP) methods in PAUP 4.0b1 (Swofford, 1998) were also used for phylogenetic analysis with the default settings.
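For readers unfamiliar with the Kimura 2-parameter model named above, the pairwise distance it yields can be sketched in a few lines. This is an illustrative implementation of the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences; it is not the MEGA code, and the function name `k2p_distance` and the choice to skip columns containing gaps or ambiguous bases are assumptions made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Columns where either sequence has a gap or a non-ACGT symbol are
    skipped (an assumption of this sketch, similar to pairwise deletion).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # (saturated) for the model to give a finite distance.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))  # one transition in ten sites
```

A matrix of such pairwise distances is the input that the neighbor-joining algorithm clusters into a tree.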
A heuristic search with three options (MULPARS, 100 replications of random addition, and TBR branch swapping) was performed to search for the most parsimonious trees. To obtain a support estimate for each node, a bootstrap analysis (1 000 replications, heuristic search, TBR branch swapping, and simple addition of sequences) was also performed.

2 Results
2.1 Cloning the exon 2 of a CHS-like gene from Lunularia cruciata
Exon 2 of CHS or CHS-like genes has been amplified from many plant species ranging from ferns to angiosperms with the primers reported by Wang et al. (2000). However, no amplified DNA fragment was detected with the same pair of primers from L. cruciata genomic DNA. Thus, a new pair of primers was designed in this study based on a more conserved region of CHS genes. A 417-bp fragment was obtained, and its deduced amino acid sequence shared at least 72% identity with those of CHSs in angiosperms. Fragments of 401 bp and 396 bp were obtained from the 5'-upstream and 3'-downstream regions of the 417-bp fragment, respectively, by TAIL-PCR. These fragments overlapped the 5'- and 3'-ends of the 417-bp fragment by 179 bp and 56 bp, respectively, giving a total length of 962 bp. A stop codon was found at the 3'-end of this sequence, while the 5'-end was still within exon 2 of the CHS gene but upstream of the position defined by the 5' primer previously reported (Wang et al., 2000). A single DNA fragment of about 800 bp was amplified with primers DQ5 and DQ3 using the total DNA of L. cruciata as template. Sequencing confirmed that it coded for exon 2 of a CHS-like gene, designated LCCHS-like. Sequence comparison showed more than 56% nucleotide sequence identity and more than 60% deduced amino acid sequence identity between LCCHS-like and exon 2 of CHSs of other plants.
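Identity figures such as those quoted above depend on how aligned positions are counted. The sketch below shows one common convention, assumed here rather than taken from the paper: identity is the percentage of matching positions among alignment columns in which neither sequence has a gap. The function name `percent_identity` is hypothetical, and the same function works for nucleotide or amino acid alignments.

```python
def percent_identity(aligned1, aligned2, gap="-"):
    """Percent identity over gap-free columns of a pairwise alignment."""
    if len(aligned1) != len(aligned2):
        raise ValueError("sequences must be aligned to the same length")

    compared = matches = 0
    for a, b in zip(aligned1.upper(), aligned2.upper()):
        if a == gap or b == gap:
            continue  # columns with a gap in either sequence are ignored
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy sequences for illustration only, not real CHS data.
    print(percent_identity("ACGTAC-T", "ACGAACGT"))
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence) give somewhat different percentages, which is one reason identity values from different studies are not always directly comparable.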
It is interesting to note that all catalytic sites in LCCHS-like were the same as in other CHSs (MCHS2A as reference sequence) (Fig.2).

Fig.2. Sequence of exon 2 of the CHS-like gene from Lunularia cruciata and its deduced amino acid sequence. The four catalytic sites are underlined.

2.2 Phylogenetic analysis of CHSs in angiosperms
Phylogenetic trees of LCCHS-like, two CHSs of fern-ally species and about 253 CHSs of angiosperms were constructed using the NJ, MP and QP methods. The NJ tree is shown in Fig.3. It has been simplified so that CHSs from the same family are represented by a single branch if they clustered together as a monophyletic clade in the original tree; for example, the branch of the Brassicaceae represents 54 CHS sequences that were grouped into a monophyletic clade in the original tree. The first basal group in all the trees was L. cruciata, and the second and third basal groups were the fern-ally species. Generally, the topology of the NJ tree was more similar to that of the QP tree than to that of the MP tree. At positions where the NJ and QP trees disagreed, the QP tree usually showed a pattern similar to that of the MP tree. For example, the basal groups in angiosperms were the Nymphaeaceae and Fabaceae in the QP and MP trees, while in the NJ tree the basal group was the Caryophyllaceae, which was clustered with the monocot families in the QP and MP trees. In the NJ tree, CHSs from angiosperms were clustered into two major clades (Fig.3). Clade Ⅰ contained all of the angiosperm families in this study except the Caryophyllaceae, while clade Ⅱ contained only six families, i.e. Asteraceae, Cannabidaceae, Convolvulaceae, Nymphaeaceae, Orchidaceae, and Solanaceae. The two clades did not correspond to the traditional groups of monocots and dicots.
The monocots did not form a monophyletic group, and some CHSs from the Alliaceae and Orchidaceae were clustered with those from dicots. Four patterns of grouping for CHSs were found in the 29 plant families of angiosperms. The first and simplest pattern was found in the families Brassicaceae, Fabaceae, and Poaceae. At least 16 CHS genes were available in each of these families, and all of them were grouped into a monophyletic clade by family. These three families were in clade Ⅰ. The second pattern was found in nine families: Apiaceae, Juglandaceae, Lamiaceae, Liliaceae, Magnoliaceae, Rosaceae, Rutaceae, Theaceae, and Vitaceae. Each of these families had about three to six CHS genes available, but these genes were not grouped by family and each family appeared at least twice in the tree. They also belonged to clade Ⅰ. The third pattern was found in six families: Asteraceae, Cannabidaceae, Convolvulaceae, Nymphaeaceae, Orchidaceae, and Solanaceae. The number of CHS genes available in these families varied from two (Nymphaeaceae) to 34 (Convolvulaceae). The common feature of these families was that their CHS genes were found in both clades Ⅰ and Ⅱ, and the lineages of some families had long branches, such as Asteraceae, Convolvulaceae, Nymphaeaceae, Orchidaceae, and Solanaceae. The fourth pattern was found in the remaining 11 families, most of which had only one sequence available. These CHS genes were dispersed among different branches of clade Ⅰ.

Fig.3. Phylogenetic tree constructed using the NJ method. The number following a family name denotes a particular lineage within that family, and the number in parentheses indicates the number of sequences in that branch. The Roman numerals indicate the clades. Each color represents one of the four patterns of grouping.
2.3 Rate tests of CHS genes
In clade Ⅱ, CHSs from some families such as Asteraceae, Solanaceae and Convolvulaceae were divided into two to three lineages, and one of the lineages had a long branch in the NJ tree. This indicated that CHS genes in different lineages might have evolved at different rates.

First, the rates among the lineages within the same family were calculated (Table 1). The two lineages in the Asteraceae, Asteraceae 1 and Asteraceae 2, had significantly different rates (2 faster than 1). The rates between Solanaceae 1 and Solanaceae 2, or between Solanaceae 3 and Solanaceae 4, were not significantly different, but the rates between Solanaceae 1/2 and Solanaceae 3/4 were significantly different (3/4 faster than 1/2). Genes in Convolvulaceae 2 evolved faster than those in Convolvulaceae 1 and Convolvulaceae 3.

Second, with reference to the NJ tree, the rates among those three families in clade Ⅰ and clade Ⅱ were calculated. Clade Ⅰ contained two lineages from the Solanaceae, one from the Convolvulaceae and one from the Asteraceae. The rates among these four lineages are shown in Table 1. The rate differences among these lineages were not significant, except for the difference between Convolvulaceae 1 and Asteraceae 1. In clade Ⅱ, Convolvulaceae 2 evolved significantly faster than the other lineages in the Solanaceae and Asteraceae. The results clearly show that the rates of CHS genes are heterogeneous both within and among certain families.

Table 1 Relative base substitution rate tests of CHS genes

Lineage 1/Lineage 2                   dKa ± SD                 P
Within family
Asteraceae 1/Asteraceae 2             -0.065225 ± 0.0198       0.001012**
Solanaceae 1/Solanaceae 2              0.004333 ± 0.014206     0.760389
Solanaceae 3/Solanaceae 4             -0.04160 ± 0.028252      0.140961
Solanaceae 1/Solanaceae 3             -0.07438 ± 0.015485      0.000002**
Solanaceae 1/Solanaceae 4             -0.11597 ± 0.026326      0.000012**
Solanaceae 2/Solanaceae 3             -0.07871 ± 0.020048      0.000088**
Solanaceae 2/Solanaceae 4             -0.12031 ± 0.027667      0.000015**
Convolvulaceae 1/Convolvulaceae 3      0.043357 ± 0.034030     0.202648
Convolvulaceae 1/Convolvulaceae 2     -0.152045 ± 0.04737      0.001333**
Convolvulaceae 2/Convolvulaceae 3      0.1954023 ± 0.04773     0.000044**
In clade Ⅰ
Convolvulaceae 1/Solanaceae 1          0.014091 ± 0.033139     0.670683
Convolvulaceae 1/Solanaceae 2          0.033191 ± 0.036961     0.369192
Convolvulaceae 1/Asteraceae 1          0.049232 ± 0.022365     0.027731*
Solanaceae 1/Asteraceae 1              0.028961 ± 0.021186     0.171590
Solanaceae 2/Asteraceae 1              0.027272 ± 0.023725     0.250304
In clade Ⅱ
Convolvulaceae 2/Solanaceae 3          0.112872 ± 0.031382     0.000326**
Convolvulaceae 2/Solanaceae 4          0.270373 ± 0.022591     0.000184**
Convolvulaceae 2/Asteraceae 2          0.101151 ± 0.029912     0.000725**
Convolvulaceae 3/Solanaceae 3         -0.03534 ± 0.031802      0.266463
Convolvulaceae 3/Solanaceae 4          0.016291 ± 0.021671     0.452142
Convolvulaceae 3/Asteraceae 2         -0.02769 ± 0.036091      0.325756
Solanaceae 3/Asteraceae 2              0.022760 ± 0.036662     0.534708
Solanaceae 4/Asteraceae 2             -0.03078 ± 0.028029      0.272175

Reference sequence: AB030004 from Equisetum arvense; dKa, difference between the two groups compared in the number of nonsynonymous substitutions per nonsynonymous site; SD, standard deviation; +, the group on the left side of the pairwise comparison has the faster rate; -, the group on the right side of the pairwise comparison has the faster rate; P, exact probability; *, P<0.05; **, P<0.01.

3 Discussion
3.1 The origin of CHS genes
It is postulated that the structural genes encoding the enzymes of secondary metabolism were derived from genes encoding enzymes of primary metabolism (Koes et al., 1994). The condensation of p-coumaroyl-CoA with malonyl-CoA, which is catalyzed by CHS, is similar to the condensation reactions in fatty acid biosynthesis. Therefore, the condensing enzyme of fatty acid biosynthesis (Fab) from Escherichia coli and CHS are thought to have originated from a common ancestor (Verwoert et al., 1992). Recently, the RppA gene was reported and is thought to code for a CHS-related synthase. The RppA gene from the Gram-positive, soil-living filamentous bacterium Streptomyces griseus encodes a 372-aa protein that shows functional similarity to CHS, i.e. RppA selects malonyl-CoA as the starter, carries out four successive extensions and releases the resulting pentaketide to cyclize to 1,3,6,8-tetrahydroxynaphthalene (THN) (Funa et al., 1999; 2002). Although Fab and RppA have been postulated to share an ancestor with CHS, they share less than 30% amino acid sequence similarity with CHSs, suggesting that they have diverged greatly from CHSs. It is important to obtain CHS or CHS-like genes from primitive plants in order to find the "missing link" in the evolutionary history of CHS.
Based on the deduced amino acid sequence, LCCHS-like most likely catalyzes a reaction that is the same as or similar to that catalyzed by CHS, because all four characteristic catalytic sites found in CHSs are well conserved in LCCHS-like. Although no CHS gene has yet been found in mosses, it is reasonable to predict that, because a CHS-like gene has been found in a liverwort, CHS or CHS-like genes should exist in mosses. Furthermore, the fact that LCCHS-like has relatively high sequence similarity to the CHSs of vascular plants suggests that there might be more "primitive" CHS or CHS-like genes in algae. Further work on cloning the complete CHS-like genes from L. cruciata and from mosses is needed to compare them with those of vascular plants and to gain insight into the origin of CHS genes.

3.2 Gene tree versus plant family tree
Because the sequences used to construct the phylogenetic trees are only about 876 bp long but come from a wide range of angiosperm species, the distance-based method may be more suitable than the maximum parsimony method, which relies only on informative sites. Since it is difficult to distinguish orthologous from paralogous genes in this data set, and also difficult to obtain all members of the CHS gene family from every plant family, the phylogenetic tree is only a rough estimate of the relationships among the available CHSs and cannot be used as a phylogenetic tree for the plant species or families.

3.3 Evolution of CHS gene family
Although the evolution of the individual genes used in phylogeny reconstruction is generally not well understood, which might have a negative impact on phylogenetic analysis, the phylogenetic results can provide a framework for insight into the evolution of genes or gene families.
In previous studies on the evolution or phylogeny of CHS, sampling was limited to certain families or genera (Koes et al., 1989; Durbin et al., 1995; Clegg et al., 1997; Koch et al., 2000; Yang et al., 2002). In this study, all the available angiosperm sequences were used in order to draw an overall picture of CHS evolution. In general, gene duplication is considered to be a major mechanism for evolutionary innovation and functional divergence (Ohta, 1993; Force et al., 1999). Duplication of CHS genes has been detected in most plant species studied so far, including a primitive vascular plant, Psilotum nudum. The fact that the two CHS genes from P. nudum were clustered together in the phylogenetic trees indicates that the duplication event giving rise to these two genes occurred after the divergence of the ancestor of angiosperms from ferns. More data from fern species are needed to draw a general conclusion on the evolution of CHS genes in this primitive vascular plant group. In angiosperms, the number of CHS genes varies greatly among families, and the phylogenetic tree reveals a complicated evolutionary history of CHS genes. The four grouping patterns may represent two evolutionary trends of CHS genes. The first trend is that the CHS genes in a family, such as Fabaceae, Poaceae or Brassicaceae, maintain a close relationship with a homogeneous base substitution rate, which is reflected by a monophyletic clade in the NJ tree for each plant family. In the families Fabaceae and Poaceae, it appears that all the CHS genes are descendants of duplications that occurred after the divergence of the families, while in the family Brassicaceae the duplication of CHS genes appears to have been somehow "inhibited", so that most of the species in this family have only one CHS gene. The second trend is that the CHS genes of a family are clustered into more than one lineage together with genes of other families, i.e. they do not form a monophyletic clade.
One explanation for the second trend is that the duplication events of these genes occurred before the differentiation of those families, as was also found for the phytochrome (PHY) genes of angiosperms (Donoghue and Mathews, 1998; Mathews and Donoghue, 1999). Taking into consideration the heterogeneous substitution rates within some plant families, it could be predicted that duplication might also have occurred after family divergence and that some of the gene members evolved faster than others. Therefore, some lineages in the phylogenetic tree may not reflect a true phylogeny, but may be a result of parallel or convergent evolution. In this study, the CHS genes in 15 families follow the second trend. These results clearly indicate that the current phylogenetic trees of the CHS gene do not reflect the phylogeny of the angiosperm families, but can be applied to the analysis of evolutionary patterns in the CHS gene family. The analysis of substitution rates of gene sequences provides a tool to measure the degree of differentiation following gene duplication. If one of the duplicated genes is to evolve a new function, it must diverge fast enough to escape the homogenizing effects of gene conversion or recombination. In this study, the CHS genes within or among some families were divergent in terms of the relative base substitution rate, and some genes evolved faster than others, which was consistent with their functional divergence. For example, in the family Asteraceae the two CHS genes in lineage Asteraceae 2 had faster rates than other genes in the family, and it was reported that the nonsynonymous/synonymous substitution rate ratio for the gene ancestral to those two genes was higher than for the other lineages (Yang et al., 2002).
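The Asteraceae comparison can be related to the figures in Table 1 with a short calculation. The sketch below assumes that the ratio dKa/SD is treated as a standard normal deviate and that P is two-sided; the text does not state how RRTree derives P, so this is an illustrative approximation only, and the function names are hypothetical. The sign convention follows the Table 1 legend (positive: left-hand lineage faster; negative: right-hand lineage faster).

```python
import math


def two_sided_p(d_ka, sd):
    """Two-sided P for H0: dKa = 0, assuming dKa / SD ~ N(0, 1)."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    z = abs(d_ka) / sd
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def faster_lineage(d_ka, left, right):
    """Apply the Table 1 sign convention to name the faster lineage."""
    if d_ka > 0:
        return left
    if d_ka < 0:
        return right
    return None  # no difference in estimated rate


if __name__ == "__main__":
    # Values copied from the first two rows of Table 1.
    rows = [
        ("Asteraceae 1", "Asteraceae 2", -0.065225, 0.0198),
        ("Solanaceae 1", "Solanaceae 2", 0.004333, 0.014206),
    ]
    for left, right, d_ka, sd in rows:
        p = two_sided_p(d_ka, sd)
        print(f"{left}/{right}: P = {p:.6f}, faster = {faster_lineage(d_ka, left, right)}")
```

Under this assumption the computed values fall close to the P values listed for these two rows, which supports reading the table as a test of whether dKa differs from zero relative to its standard deviation.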
The two genes showed functional divergence similar to that of GCHS2, a CHS-like gene in the Asteraceae with a different substrate specificity and a truncated catalytic profile (Helariutta et al., 1996). Similarly, CHS genes in the Convolvulaceae, especially in the lineage Convolvulaceae 2, appeared to be among the most rapidly evolving CHS genes; and in the Solanaceae, Solanaceae 3 also evolved rapidly and had diverged greatly from the rest of the CHS sequences from the Solanaceae. Therefore, the CHS genes in the Solanaceae and Convolvulaceae may also have diverged functionally. It is possible that the different evolutionary rates of CHS genes are correlated with the differentiated functions of the genes. For example, Convolvulaceae 2 contained exclusively the CHS-A and CHS-B genes of Ipomoea purpurea (L.) Roth, which were postulated to have diverged in function from the CHS-C and CHS-D genes, which encode enzymes with typical CHS activities (Durbin et al., 1995; 2000; 2001; Clegg et al., 1997). It is most likely that the CHS genes in the lineages Asteraceae 2, Convolvulaceae 2 and Solanaceae 3 of clade Ⅱ have diverged from the rest of the CHS gene lineages in both sequence and function.

Although only a few plant families had enough CHS genes available for statistical analyses in this study, almost every family has a unique evolutionary pattern. The overall picture of CHS evolution in angiosperms, if there is one, is that the gene number varies greatly among plant families, duplication/deletion of the gene occurs repeatedly, and some genes in certain families, such as Asteraceae, Solanaceae and Convolvulaceae, may have evolved new functions independently. Angiosperms are the most diverse group of vascular plants in terms of habitat, life history, floral structure and coloring, defense system, interaction with microorganisms, and so on.
Gene duplication may provide new genetic material or more sophisticated regulation of gene expression that allows individuals to adapt to the environment. More data on the function of different CHS genes are needed to elucidate the significance of the diversity of this gene in angiosperms.

Acknowledgements: We gratefully acknowledge WANG Mei-Zhi (Institute of Botany, The Chinese Academy of Sciences) for identifying the liverwort plant and Dr. REN Bo for providing the AD primers.

References:
Clegg M T, Cumming M P, Durbin M L. 1997. The evolution of plant nuclear genes. Proc Natl Acad Sci USA, 94:7791-7798.
Dong X, Braun E L, Grotewold E. 2001. Functional conservation of plant secondary metabolic enzymes revealed by complementation of Arabidopsis flavonoid mutants with maize genes. Plant Physiol, 127:46-57.
Donoghue M J, Mathews S. 1998. Duplicate genes and the root of angiosperms, with an example using phytochrome sequences. Mol Phylogenet Evol, 9:489-500.
Durbin M L, Learn G H, Huttley G A, Clegg M T. 1995. Evolution of the chalcone synthase gene family in the genus Ipomoea. Proc Natl Acad Sci USA, 92:3338-3342.
Durbin M L, McCaig B, Clegg M T. 2000. Molecular evolution of the chalcone synthase multigene family in the morning glory genome. Plant Mol Biol, 42:79-92.
Durbin M L, Denton A L, Clegg M T. 2001. Dynamics of mobile element activity in chalcone synthase loci in the common morning glory (Ipomoea purpurea). Proc Natl Acad Sci USA, 98:5084-5089.
Ferrer J L, Jez J M, Bowman M E, Dixon R A, Noel J P. 1999. Structure of chalcone synthase and the molecular basis of plant polyketide biosynthesis. Nat Struct Biol, 6:775-784.
Force A, Lynch M, Pickett F B, Amores A, Yan Y L, Postlethwait J. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics, 151:1531-1545.
Funa N, Ohnishi Y, Fujii I, Shibuya M, Ebizuka Y, Horinouchi S. 1999. A new pathway for polyketide synthesis in microorganisms. Nature, 400:897-899.
Funa N, Ohnishi Y, Ebizuka Y, Horinouchi S. 2002. Properties and substrate specificity of RppA, a chalcone synthase-related polyketide synthase in Streptomyces griseus. J Biol Chem, 277:4628-4635.
Gu H-Y, Qu L-J, Ming X-T, Pan N-S, Chen Z-L. 1995. Plant Genes and Molecular Manipulations. Beijing: Peking University Press. (in Chinese)
Helariutta Y, Kotilainen M, Elomaa P, Kalkkinen N, Bremer K, Teeri T H, Albert V A. 1996. Duplication and functional divergence in the chalcone synthase gene family of Asteraceae: evolution with substrate change and catalytic simplification. Proc Natl Acad Sci USA, 93:9033-9038.
Howles P A, Aprioli T, Weinman J J. 1995. Nucleotide sequence of additional members of the gene family encoding chalcone synthase in Trifolium subterraneum. Plant Physiol, 107:1035-1036.
Ito M, Ichinose Y, Kato H, Shiraishi T, Yamada T. 1997. Molecular evolution and functional relevance of the chalcone synthase genes of pea. Mol Gen Genet, 255:28-37.
Jermiin L S. 1996. K2Wuli Version 1.0. Australia: Australian National University Press.
Jez J M, Bowman M E, Noel J P. 2002. Expanding the biosynthetic repertoire of plant type III polyketide synthases by altering starter molecule specificity. Proc Natl Acad Sci USA, 99:5319-5324.
Koch M A, Haubold B, Mitchell-Olds T. 2000. Comparative evolutionary analysis of chalcone synthase and alcohol dehydrogenase loci in Arabidopsis, Arabis and related genera (Brassicaceae). Mol Biol Evol, 17:1483-1498.
Koch M, Haubold B, Mitchell-Olds T. 2001. Molecular systematics of the Brassicaceae: evidence from coding plastidic matK and nuclear Chs sequences. Am J Bot, 88:534-544.
Koes R E, Spelt C E, van den Elzen P J, Mol J N. 1989. Cloning and molecular characterization of the chalcone synthase multigene family of Petunia hybrida. Gene, 81:245-257.
Koes R E, Quattrocchio F, Mol J N. 1994. The flavonoid biosynthetic pathway in plants: function and evolution. BioEssays, 16:123-132.
Kumar S, Tamura K, Jakobsen I B, Nei M. 2001. MEGA: molecular evolutionary genetics analysis software. Version 2.1. Arizona State University, Tempe, Arizona, USA.
Liu Y G, Huang N. 1998. Efficient amplification of insert end sequences from bacterial artificial chromosome clones by thermal asymmetric interlaced PCR. Plant Mol Biol Rep, 16:175-181.
Lukacin R, Schreiner S, Matern U. 2001. Transformation of acridone synthase to chalcone synthase. FEBS Lett, 508:413-417.
Mathews S, Donoghue M J. 1999. The root of angiosperm phylogeny inferred from duplicate phytochrome genes. Science, 286:947-949.
Mo Y, Nagel C, Taylor L P. 1992. Biochemical complementation of chalcone synthase mutants defines a role for flavonols in functional pollen. Proc Natl Acad Sci USA, 89:7213-7217.
Ohta T. 1993. Pattern of nucleotide substitution in growth hormone-prolactin gene family: a paradigm for evolution by gene duplication. Genetics, 134:1271-1276.
Reimold U, Kroeger M, Kreuzaler F, Hahlbrock K. 1983. Coding and 3' non-coding nucleotide sequence of chalcone synthase mRNA and assignment of amino acid sequence of the enzyme. EMBO J, 2:1801-1805.
Robinson M, Gouy M, Gautier C, Mouchiroud D. 1998. Sensitivity of the relative-rate test to taxonomic sampling. Mol Biol Evol, 15:1091-1098.
Ryder T B, Hedrick S A, Bell J N, Liang X W, Clouse S D, Lamb C J. 1987. Organization and differential activation of a gene family encoding the plant defense enzyme chalcone synthase in Phaseolus vulgaris. Mol Gen Genet, 210:219-233.
Saitou N, Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol, 4:406-425.
Sommer H, Saedler H. 1986. Structure of the chalcone synthase gene of Antirrhinum majus. Mol Gen Genet, 202:429-434.
Stafford H A. 1991. Flavonoid evolution: an enzymic approach. Plant Physiol, 96:680-685.
Swain T. 1986. The evolution of flavonoids. In: Cody V, Middleton E Jr, Harborne J B. Plant Flavonoids in Biology and Medicine: Biochemical, Pharmacological, and Structure Activity Relationships. New York: Alan R. Liss, Inc. 1-14.
Swofford D L. 1998. PAUP 4.0 Beta Version: Phylogenetic Analysis Using Parsimony. Sunderland, MA, USA: Sinauer Associates.
Thompson J D, Higgins D G, Gibson T J. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res, 22:4673-4680.
Ursula N K, Barzen E, Bernhardt J, Rohde W, Schwarz-Sommer Z, Reif H J, Wienand U, Saedler H. 1987. Chalcone synthase genes in plants: a tool to study evolutionary relationship. J Mol Evol, 26:213-225.
Verwoert I I, Verbree E C, van der Linden K H, Nijkamp H J J, Stuitje A R. 1992. Cloning, nucleotide sequence, and expression of the Escherichia coli fabD gene, encoding malonyl coenzyme A-acyl carrier protein transacylase. J Bacteriol, 174:2851-2857.
Wang J L, Qu L J, Chen J, Gu H, Chen Z L. 2000. Molecular evolution of the exon 2 of CHS genes and the possibility of its application to plant phylogenetic analysis. Chin Sci Bull, 45:1735-1742.
Yamazaki Y, Suh D Y, Sitthithaworn W, Ishiguro K, Kobayashi Y, Shibuya M, Ebizuka Y, Sankawa U. 2001. Diverse chalcone synthase superfamily enzymes from the most primitive vascular plant, Psilotum nudum. Planta, 214:75-84.
Yang J, Huang J X, Gu H, Zhong Y, Yang Z H. 2002. Duplication and adaptive evolution of the chalcone synthase genes of Dendranthema (Asteraceae). Mol Biol Evol, 19:1752-1759.

(Managing editor: ZHAO Li-Hui)