Masayuki Ishikawa1, Makoto Fujiwara1, Kintake Sonoike2 and Naoki Sato1,*

1Department of Life Sciences, Graduate School of Arts and Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo, 153-8902 Japan
2Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, 277-8562 Japan

Chloroplasts are descendants of a cyanobacterial endosymbiont, but many chloroplast protein genes of endosymbiont origin are encoded by the nucleus. The chloroplast–cyanobacteria relationship is a typical target of orthogenomics, an analytical method that focuses on the relationship of orthologous genes. Here, we present results of a pilot study of functional orthogenomics, combining bioinformatic and experimental analyses, to identify nuclear-encoded chloroplast proteins of endosymbiont origin (CPRENDOs). Phylogenetic profiling based on complete clustering of all proteins in 17 organisms, including eight cyanobacteria and two photosynthetic eukaryotes, was used to deduce 65 protein groups that are conserved in all oxygenic autotrophs analyzed but not in non-oxygenic organisms. Excluding 28 well-characterized protein groups, we analyzed 56 Arabidopsis proteins and 43 Synechocystis proteins in the remaining 37 conserved homolog groups. Green fluorescent protein (GFP) targeting experiments indicated that 54 Arabidopsis proteins were targeted to plastids. Expression of 39 Arabidopsis genes was promoted by light. Among the 40 disruptants of Synechocystis, 22 showed phenotypes related to photosynthesis. Arabidopsis mutants in 21 groups, including those reported previously, showed phenotypes. Characteristics of pulse amplitude modulation fluorescence were markedly different in corresponding mutants of Arabidopsis and Synechocystis in most cases.
We conclude that phylogenetic profiling is useful in finding CPRENDOs, but the physiological functions of orthologous genes may be different in chloroplasts and cyanobacteria.

Keywords: Arabidopsis thaliana • Chloroplast protein • Comparative genomics • Endosymbiogenesis • Photosynthetic gene • Synechocystis sp. PCC 6803.

Special Issue – Regular Paper

Orthogenomics of Photosynthetic Organisms: Bioinformatic and Experimental Analysis of Chloroplast Proteins of Endosymbiont Origin in Arabidopsis and Their Counterparts in Synechocystis

Abbreviations: CPRENDO, chloroplast protein of endosymbiont origin; GFP, green fluorescent protein; PAM, pulse amplitude modulation.

*Corresponding author: E-mail, [email protected].

Plant Cell Physiol. 50(4): 773–788 (2009) doi:10.1093/pcp/pcp027, available online at www.pcp.oxfordjournals.org © The Author 2009. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: [email protected]

Introduction

Chloroplasts are descendants of an ancestral endosymbiont related to cyanobacteria (Abdallah et al. 2000, Cavalier-Smith 2003, Sato 2006). The proteins encoded by chloroplast genomes are orthologs of cyanobacterial counterparts (Martin et al. 1998, Mulkidjanian et al. 2006, Sato 2006). In addition, many genes of the original endosymbiont were transferred to the nuclear genome. These are typical targets of orthogenomics, which classifies proteins according to orthologous relationships. A number of nuclear genes are known to encode chloroplast proteins that are related to photosynthesis or chloroplast biogenesis (Sato 2001, Sato 2006, Bowman et al. 2007). However, there are many other unidentified, nuclear-encoded proteins present in the chloroplast. Only a part of the chloroplast proteome has been elucidated by mass spectrometry (e.g. Peltier et al. 2002, Friso et al. 2004, Kleffmann et al. 2004, Peltier et al. 2004, Peltier et al. 2006). An estimate suggested that >3,600 proteins of the model plant Arabidopsis thaliana originate from the ancestral endosymbiont, and about half of these are proteins that function in compartments other than chloroplasts (Martin et al. 2002). Another study suggested that the cyanobacterial contribution to the nuclear genome of Cyanophora paradoxa is limited to chloroplast proteins (Reyes-Prieto et al. 2006).

A problem in previous bioinformatic studies (Abdallah et al. 2000, Martin et al. 2002) was the use of a single species of plant, A. thaliana, and the fact that sequence comparison was made using A. thaliana genes as queries against genes of other organisms. Even though multiple species of cyanobacteria were used, the relationship between various species of cyanobacteria was not assessed. Other studies on comparative genomics of cyanobacteria and algae or plants also used simple species–species comparison (Mulkidjanian et al. 2006, Reyes-Prieto et al. 2006). In contrast, all-against-all comparison is a preferred method that can correctly classify proteins according to their similarity (Sato 2002). The red alga Cyanidioschyzon merolae (Matsuzaki et al. 2004) as well as many other photosynthetic eukaryotes can be used for comparative genomics. The use of genomes of multiple photosynthetic eukaryotes as well as of various cyanobacteria in an all-against-all comparison will give an unbiased estimate of the genes that are shared by cyanobacteria and photosynthetic eukaryotes, and thus the nuclear-encoded chloroplast proteins of endosymbiont origin (abbreviated as CPRENDOs: Sato et al. 2005).

We present here results of a pilot functional analysis of nuclear-encoded chloroplast proteins based on eight cyanobacteria, a land plant and a red alga.
The study consists of bioinformatic estimation of putative CPRENDOs and experimental verification of their chloroplast localization in Arabidopsis. It also includes initial functional analysis of CPRENDOs in both Arabidopsis and Synechocystis. The aim of the present study is not to report detailed results on individual proteins, but to present the methodology as a new approach in photosynthesis and chloroplast research.

Results

Estimation of CPRENDOs

An all-against-all BLASTP search was performed on a data set (CZ16X) comprising all predicted proteins (102,513 sequences excluding duplication, as of December 2002) in 17 organisms including eight cyanobacteria, three photosynthetic bacteria, two non-photosynthetic bacteria, two non-photosynthetic eukaryotes and two photosynthetic eukaryotes (A. thaliana and C. merolae). The genomes used in the analysis are listed in Supplementary Table S1. Homolog groups were constructed by single-linkage clustering with several different threshold E-values using the Gclust software version 3.0. The method of clustering was briefly described in previous publications (Sato 2002, Sato et al. 2005). We used the results of clustering with E-values 10⁻⁸, 10⁻¹² and 10⁻²⁰. The homolog groups that are shared by all of the eight cyanobacteria and the two photosynthetic eukaryotes but not by other organisms were selected for each E-value, and the groups were combined. We thus obtained 65 homolog groups that were specific for photosynthetic organisms (Supplementary Table S2). This clustering was done using all the proteins coded for by the nuclear genome as well as organellar genomes and, therefore, some groups contained plastid-encoded proteins and cyanobacterial proteins, such as photosynthetic reaction center proteins, PsbB, PsbC (Group ID 1 in Supplementary Table S2), PsaA and PsaB (Group ID 3). However, other groups contained nuclear-encoded proteins of the plant and the alga with cyanobacterial homologs.
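The selection procedure described above (single-linkage clustering at an E-value threshold, followed by filtering on the phylogenetic profile of each group) can be illustrated with a minimal Python sketch. This is not the Gclust implementation: the function names, the toy proteins and organisms, and the E-values are hypothetical, and taking the plain union of the groups selected at each threshold is only one possible reading of "the groups were combined".

```python
from collections import defaultdict


def single_linkage(proteins, hits, threshold):
    """Group proteins connected by any chain of hits with E-value <= threshold."""
    parent = {p: p for p in proteins}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b, evalue in hits:
        if evalue <= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    return [frozenset(g) for g in groups.values()]


def select_groups(organism_of, hits, required, excluded, threshold):
    """Keep groups found in every required organism and in no excluded one."""
    selected = set()
    for group in single_linkage(organism_of, hits, threshold):
        present = {organism_of[p] for p in group}
        if required <= present and not (present & excluded):
            selected.add(group)
    return selected


# Hypothetical toy data: two cyanobacteria, a plant, an alga and E. coli.
organism_of = {
    "c1a": "cyanoA", "c2a": "cyanoB", "pa": "plant", "aa": "alga",
    "c1b": "cyanoA", "c2b": "cyanoB", "pb": "plant", "ab": "alga",
    "eb": "ecoli", "e1": "ecoli",
}
hits = [  # (protein, protein, E-value) from an all-against-all search
    ("c1a", "c2a", 1e-50), ("c2a", "pa", 1e-30), ("pa", "aa", 1e-25),
    ("c1b", "c2b", 1e-40), ("c2b", "pb", 1e-22), ("pb", "ab", 1e-21),
    ("ab", "eb", 1e-10),  # weak hit to a non-photosynthetic organism
]
required = {"cyanoA", "cyanoB", "plant", "alga"}
excluded = {"ecoli"}
thresholds = [1e-8, 1e-12, 1e-20]

# Assumption: "combined" is taken here as the union over all thresholds.
combined = set()
for t in thresholds:
    combined |= select_groups(organism_of, hits, required, excluded, t)
```

In this toy example the second group is rejected at 10⁻⁸ because a weak hit pulls in an E. coli protein, but it is recovered at the stricter thresholds, which shows why selecting at several thresholds and combining the results can retain groups that a single threshold would miss.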
Among the 65 selected groups, 28 groups contained known proteins involved in photosynthesis or chloroplast biogenesis. Finally, 37 homolog groups, in which the proteins in A. thaliana and C. merolae are encoded by the nucleus and have not been assigned a well-defined function, were selected as targets for further functional analysis. These homolog groups included 56 A. thaliana proteins and 43 Synechocystis proteins (Table 1). Each of the homologous protein groups was assigned a CPRE number.

An alignment of an example homolog group is shown in Fig. 1. In this example, the member proteins were highly similar to one another, but each of the proteins of A. thaliana and C. merolae had an N-terminal extension. Among the 56 A. thaliana sequences, 48 had an N-terminal extension. Three programs were tested for the prediction of intracellular localization (Table 2). With the exception of seven proteins, most of the 56 proteins were predicted to be localized in chloroplasts by at least two prediction programs. Note that these results were obtained with the most recent software. Only an old version of TargetP was available at the start of the present study. Based on these results, we considered these proteins as putative CPRENDOs, and analyzed protein targeting and light-regulated gene expression.

Initial functional analysis of putative CPRENDOs in Arabidopsis

Intracellular targeting of these A. thaliana proteins was analyzed using green fluorescent protein (GFP) fusion proteins in onion epidermis with particle bombardment. This is a heterologous system; however, we obtained no unexpected results that contradicted previously reported biochemical or immunological data in Arabidopsis or Cyanidioschyzon proteins (Moriyama et al. 2008). Of the 56 proteins analyzed, 54 proteins were targeted to plastids (Fig. 2 and Table 2). Seven of these were also targeted to mitochondria (see also Table 3). A complete set of fluorescence micrographs is presented in Supplementary Fig. S1.
Two proteins in CPRE 7, which consists of four paralogs, showed no clear localization, and are possibly cytoplasmic proteins. Note that these two cytoplasmic proteins (At2g43910 and At2g43920) are included in a different cluster in a recent database (see the rightmost column in Table 1: ALL95 cluster). The data were consistent with the proteomic studies on various chloroplast fractions (Table 2): namely, 37 of the 56 examined proteins, belonging to 29 homolog groups, have been detected in at least some fractions of chloroplast. It is known that the proteomics data alone should not be considered as conclusive evidence for the presence of these proteins in chloroplasts, because many non-chloroplast proteins have been detected in chloroplast fractions. The GFP data indicated that even the putative CPRENDOs that were not predicted to be chloroplast proteins by targeting prediction or that had not been detected in chloroplasts by mass spectrometry were indeed localized to chloroplasts. We conclude, therefore, that we identified 54 CPRENDOs.

Table 1 List of selected CPRENDOs analyzed in the present study

Columns: CPRE; Group ID with threshold value (10⁻⁸, 10⁻¹², 10⁻²⁰); Annotation in database [recent annotation]; Gene identifier (Syn, Cme, Ath); ALL95 cluster

1 357 416 Hypothetical 2 322 453 Hypothetical (ClpS homolog) ssl3379 ssr2723 3 4 285 366 482 530 1081 822 Ycf52 (acetyltransferase) Hypothetical (DUF1350) sll0286 slr1699 <5> <6> 7 477 607 602 579 698 718 616 762 810 ATAB2 (Tab2 homolog) Hypothetical [TIC21/PIC1] Hypothetical sll2002, slr1110 sll1656 slr1926 CMQ124C, CMG076C – CML200C, CMJ276C cp CMA083C, CMD109C, CMR396C CMH188C CMN128C CMS285C 8 9 10 746 772 798 835 1064 slr0959 sll1586, sll1265 slr0565 <11> 630 802 12 <18> 13 751 14 803 15 786 856 1075 859 941 952 (CAAX N-terminal protease) Hypothetical (Vitamin K epoxide reductase homolog) Probable ferredoxin (2Fe–2S) [NDF4] Ycf19 homolog [CCB3] 16 777 17 915 <19> 901 953 1020 1076 20 21 22 879 23 24 911 25 918 <26> 330 27 478 28 593 1085 1095 1119 1120 1135 1145 406 29 30 31 <32> 33 706 816 913 919 1178 1055 1210 1252 1219 1238 1251 1226 1168 1121 236 34 <35> 1014 1094 36 37 1144 1156 slr1638, slr1674 1g63610, 2g14910, 5g14970 2360, 5850 – 1g68660 4073 702 1g26220, 1g32070 3g43540, 5g47860 1993 1889 CMQ405C CMO228C CMO209C 3g08010 2g15290 3g59870, 2g43940 2g43910, 2g43920 2g20725, 5g60750, 1g14270 2g25660 4g35760 2082 1965 2379 5724 2754 2006, 11537 2536 ssl3044 CME070C 3g16250 1465 3g07430, 4g27990 5g36120 1g21350 5g55710, 2g47840 5g52970 2090 2310 2593 Ycf65(PSRP-3) Hypothetical Psb29/Thf1 slr0923 sll0295 sll1414 cp, CMC030C CMT057C CMP081C cp, CMS050C CMS436C, CMP233C cp CMP136C CME041C 411 Hypothetical Ycf60 Hypothetical ssr2142 ssl0353 sll1289 sll1737 sll1071 1g68590, 5g15760 2g45990 2g20890 2386 2697 2687 (Rubredoxin homolog) Hypothetical Hypothetical [ape1 locus] Hypothetical (NnrU homolog) Hypothetical Hypothetical PcyA HY2 Ycf20 Hypothetical (cyclase/dehydrase) slr2033 slr1702 slr0575 slr1599 sll0157 Slr0815 Slr0116 – Sll1509 slr0941 1g54500 5g27560 5g38660 1g10830 1g29700 3g17930 – 3g09150 1g65420, 3g56830, 5g43050 1g02470 1096 2692 2597 2911 2784 2810 3570 3705 1524 2163 2g20920, 3g51140, 5g23040 4g19100, 5g52780 3g26580 3g26710 1g12250 4316 2685 5595 3972, 14849 528 1g19740, 1g75460 4g25910, 5g49940 996 285 1g78620 5g17660 1510 536 Hypothetical [CDF1] Hypothetical Hypothetical (TPR region) Hypothetical [CCB1] Hypothetical (PPR homolog) slr1918 sll0933 slr1052 slr0589 sll0301, sll0577, sll0274 ATP-dependent proteinase (LON) sll0195 NifU/NFU2,3 ssl2667 Hypothetical YggH (tRNA methyltransferase) homolog sll0875 sll1300 CMS181C CMO077C CMC040C CMQ364C CMJ095C CMQ319C CMG110C – CMT591C CMD122C, CMD157C CMA064C CML309C CMD175C CMM306C CMO201C, CMQ266C, etc. CMD100C CMK204C, CMJ205C, CMP295C CMS030C CMQ129C, CMT407C

CPRENDOs that were identified by other groups after the start of this project are marked by <brackets>, and annotations are given in bold. Syn, Synechocystis sp. PCC 6803; Cme, Cyanidioschyzon merolae; Ath, Arabidopsis thaliana. cp, proteins encoded by the chloroplast genome. A CPRE number was assigned to each homologous protein group. Threshold indicates the level of E-value that was used to estimate each protein group. ALL95 cluster indicates the cluster number in a more recent Gclust database.

Fig. 1 An example of a homolog group specific to photosynthetic organisms consisting of unknown proteins. The alignment was prepared by the Clustal X program (Thompson et al. 1994).

The transcript level of the genes encoding the 56 proteins (CPREs) was also analyzed by RNA gel blot analysis. The transcript level of 39 genes was elevated in the light as compared with the dark, suggesting light-promoted expression of these genes (Fig. 2, Table 2 and Supplementary Fig. S1). Table 3 summarizes the targeting and expression analyses. Among the 56 candidate proteins, 38 (= 34 + 4) showed both chloroplast localization and light-promoted expression.

Analysis of Synechocystis mutants

Disruptants of the corresponding genes (called sCPRE) were prepared in Synechocystis sp. PCC 6803 by homologous recombination using a kanamycin resistance cassette from pUC4K. Among the 43 genes, 33 were knocked out, while seven genes were not completely disrupted and are likely to be essential. Constructs for three genes could not be made despite repeated attempts, either because of experimental failure or due to sequence differences in the strain used. The kinetics of fluorescence induction were measured for all the mutants as an initial survey (Supplementary Fig. S2). The abbreviation ‘FI’ in Fig. 3 indicates that the kinetics were different in the mutant.
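Fluorescence measurements of this kind are usually condensed into a few ratios, such as the qP, qN, NPQ, Fv/Fm, Fv′/Fm′ and ΦPSII values summarized in Fig. 3. The following Python sketch shows how such ratios are commonly derived from five fluorescence levels. It assumes the widely used textbook definitions, which may differ from the exact formulas applied in this study (they are not stated in this section); the class, function and field names and the numerical values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PamLevels:
    f0: float        # minimal fluorescence, dark-adapted
    fm: float        # maximal fluorescence (obtained with DCMU in the text)
    f0_prime: float  # minimal fluorescence, light-adapted
    fm_prime: float  # maximal fluorescence, light-adapted
    fs: float        # steady-state fluorescence under actinic light


def pam_parameters(x: PamLevels) -> dict:
    """Common PAM ratios (assumed standard definitions, not the paper's own)."""
    fv = x.fm - x.f0                    # variable fluorescence, dark-adapted
    fv_prime = x.fm_prime - x.f0_prime  # variable fluorescence, light-adapted
    return {
        "Fv/Fm": fv / x.fm,
        "Fv'/Fm'": fv_prime / x.fm_prime,
        "PhiPSII": (x.fm_prime - x.fs) / x.fm_prime,
        "qP": (x.fm_prime - x.fs) / fv_prime,
        "qN": 1.0 - fv_prime / fv,
        "NPQ": (x.fm - x.fm_prime) / x.fm_prime,
    }


def percent_of_wild_type(mutant: dict, wild_type: dict) -> dict:
    """Express each mutant parameter relative to the wild type (= 100%)."""
    return {k: 100.0 * mutant[k] / wild_type[k] for k in mutant}


# Hypothetical fluorescence levels, in arbitrary units.
wild_type = pam_parameters(PamLevels(0.20, 1.00, 0.18, 0.60, 0.30))
mutant = pam_parameters(PamLevels(0.20, 1.00, 0.18, 0.50, 0.30))
relative = percent_of_wild_type(mutant, wild_type)
```

With these definitions ΦPSII equals qP × Fv′/Fm′, so a drop in ΦPSII can be traced either to closed reaction centers (lower qP) or to less efficient open centers (lower Fv′/Fm′).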
To analyze the kinetic properties of photosynthesis in more detail, we performed pulse amplitude modulation (PAM) fluorescence analysis (for a review, see Schreiber 2004) on selected mutants. These included the mutants that showed some differences in the fluorescence kinetics (FI), and several apparently normal mutants. Representative traces for the PAM analysis are shown in Fig. 4. The maximal level of fluorescence (Fm) was obtained by the addition of DCMU at the end of each measurement (Fujimori et al. 2005). A disruptant of sCPRE36 showed significantly reduced Fv′ while retaining F and Fm values. A disruptant of sCPRE30 showed an elevated level of F, thus exhibiting very low Fv/Fm. Traces from PAM analysis for all 28 mutants analyzed are presented in Supplementary Fig. S3, and the results are summarized in Fig. 3. The mutants that showed normal fluorescence kinetics in the initial survey were not, in principle, analyzed by PAM. This was justified by the fact that six such ‘normal’ mutants (sCPRE 5, 11, 12, 13, 14 and 19) were also normal in the PAM analysis. Eleven mutants showed both increased qN and decreased ΦPSII. Two others showed elevated qN, while two others showed reduced ΦPSII. These 15 mutants also showed other symptoms in qP, NPQ, Fv/Fm or F0′/Fm. These results suggested that the mutants had defects in photosynthetic electron transport or photosystems.

The results of oxygen evolution and spectral analysis are summarized in Table 4. Oxygen evolution activity (per Chl) was low in three mutants, whereas seven mutants showed elevated O2 evolution per Chl. The elevated activity is likely due to a reduced content of Chl, most probably the antenna Chl of PSI. The ratio of carotenoid to Chl and the ratio of phycobilin to Chl, as estimated from the absorbance ratio, A492/A680 and A626/A680, respectively, were also affected in 18 mutants. It should be noted that the five mutants that did not show significant difference in PAM fluorescence showed changes in the absorption spectrum. Interestingly, the seven mutants that have not been completely segregated did show phenotypes in PAM fluorescence, oxygen evolution or pigment composition.

Table 2 Summary of expression and targeting analyses

CPRE: 1 2 3 4 <5> <6> 7 8 9 10 <11> 12 13 14 15 16 17 <18> <19> 20 21 22 23 24 25 <26> 27 28 29 30 31 <32> 33 34 <35> 36 37

AGI code: 1g63610 2g14910 5g14970 1g68660 1g26220 1g32070 3g43540 5g47860 3g08010 2g15290 3g59870 2g43945 2g43920 2g43910 2g20725 5g60750 1g14270 2g25660 4g35760 3g16250 3g07430 4g27990 1g21350 5g55710 2g47840 5g52970 1g68590 5g15760 2g45990 5g36120 2g20890 1g54500 5g27560 5g38660 1g10830 1g29700 3g17930 3g09150 1g65420 5g43050 3g56830 1g02470 2g20920 3g51140 5g23040 4g19100 5g52780 3g26580 3g26710 1g12250 1g19740 1g75460 5g49940 4g25910 1g78620 5g17660

Prediction results, Predotar and PSORT: cp cp cp cp cp cp cp cp [mt?] cp cp cp ER cp cp mt [mt?] cp cp cp ER cp ER cp None Cytosol None cytosk cp plasma cp Cytosol ER cp cp cp [mt?] cp cp cp cp cp cp cp cp cp Vacuole None cp cp cp cp cp cp cp cp None Cytosol cp cp cp cp cp cp cp cp cp cp cp cp cp cp [cp?] plasma cp cp [cp?] Vacuole cp cp cp cp cp mt cp cp none cp cp cp mt plasma ER cp cp extr cp cp cp cp cp cp cp cp None cp cp cp cp cp cp cp

Prediction results, TargetP: cp cp cp cp cp cp mt cp cp cp cp cp None None cp cp cp cp cp cp cp cp None cp cp cp cp cp None cp cp cp cp cp cp cp cp cp None cp cp cp cp cp cp None mt cp cp cp cp cp cp cp cp cp

Proteome references: 10 10 10 10 10 10 4, 10 10 10 10 6 4, 10 3, 10 10 5, 10 3, 4, 6, 9, 10 1, 2, 6, 10 5, 6, 10 8, 10 5, 10 4, 6, 7, 8, 10 5, 6, 7, 10 6, 7, 10 10 10 5, 7 3, 4, 10 4, 10 10 7, 10 7 10 5, 6, 7, 10 10 10 10 3, 10

GFP localization: cp cp cp cp cp cp cp cp cp cp cp cp Cytosol Cytosol cp cp cp, mt cp cp cp cp cp cp cp cp cp, mt cp cp cp cp cp cp cp cp cp cp cp cp cp, mt cp, mt cp, mt cp cp cp cp cp cp (mt?) cp cp cp cp cp cp cp cp, mt cp, mt

Expression in light: ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ – ++ ++ ++ – ++ – ++ ++ ++ ++ ++ ++ + ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ ++ – – – – ++ ++ ++ + ++ ++ ++ ++ ++ ++ ++ ++ ++ –

Expression in dark: – ++ + + – – ++ ++ ++ – + – + ++ + – – – + – ++ ++ – + + + – – – + + – + + – – – – – – – – – – + – + ++ + – – – + – + –

Predicted results that were different from the GFP results are marked by bold italic. CPREs that have been identified and reported by other groups are marked by <brackets>. Abbreviations: cp, chloroplasts; mt, mitochondria; nu, nuclear; cytosk, cytoskeleton; plasma, plasma membrane; extr, extracellular; ER, endoplasmic reticulum. References: 1, Schubert et al. (2002); 2, Peltier et al. (2002); 3, Ferro et al. (2003); 4, Froehlich et al. (2003); 5, Friso et al. (2004); 6, Kleffmann et al. (2004); 7, Peltier et al. (2004); 8, Peltier et al. (2006); 9, Dunkley et al. (2006); 10, Zybailov et al. (2008). We also used the PPDB (Sun et al. 2004) for the proteomics data.

[Fig. 2 panels: GFP and RNA-blot (L, D; 18S rRNA) for CPRE14 (At5g55710, At2g47840), CPRE16 (At1g68590) and CPRE27 (At1g65420; ND). Bar = 10 µm.]

Fig. 2 Representative results of targeting of GFP fusion proteins and RNA gel blot analysis. Results for CPRE14, 16 and 27 are shown. Disruption of these genes gave visible phenotypes in later analysis. Complete data are presented in Supplementary Fig. S1.

Table 3 Localization of 56 putative CPRENDOs and expression of the corresponding genes in A. thaliana

Localization                          Expression
Compartment                  Number   Light > dark   Light = dark   No detectable expression
Plastids                     47       34             8              5
Plastids and mitochondria    7        4              0              3
Cytoplasm                    2        1              1              0
Total                        56       39             9              8

Analysis of Arabidopsis mutants

T-DNA insertion lines (tag-lines) of A. thaliana that tagged the CPRE genes were obtained from The Arabidopsis Information Resource (TAIR) and analyzed. Among the 37 CPRE groups, we obtained data for 18 groups (Fig. 3), while results on eight other mutants were reported during the course of the present study (see the next section). Mutants were not obtained for seven CPREs because no T-DNA insertion lines were available in the stock centers, and only heterozygous lines were obtained for four CPREs. Mutants of a gene of CPRE27 (At1g65420) had variegated cotyledons and foliage leaves (Fig. 5B, C). A mutant of CPRE16 (At1g68590) had non-green cotyledons on sucrose-containing medium (Fig. 5D). PAM analysis showed that mutants of nine CPREs had some defects in photosynthesis (Fig. 3 and Supplementary Fig. S4). Representative traces for the PAM analysis are shown in Fig. 6. In our current analysis, 17 out of 30 CPREs (excluding CPREs without tag-line stocks) showed phenotypes that were considered to be related to photosynthesis.

Proteins that were reported after the start of this project

This project began 5 years ago and, during this time, some of the proteins have been characterized in detail and the reports have been published (Fig. 3). These are PIC1/Tic21 (iron transporter or translocon component: CPRE6, Teng et al. 2006, Duy et al. 2007), NDF4 [NAD(P)H dehydrogenase component: CPRE11, Takabayashi et al. 2009], Tab2/ATAB2 (RNA-binding protein involved in psaB translation: CPRE5, Dauvillée et al. 2003, Barneche et al. 2006), Psb29/THF1 (PSII component involved in thylakoid formation: CPRE19, Wang et al. 2004, Keren et al. 2005), PcyA/PebB/HY2 (phytochromobilin/phycobilin biosynthesis enzyme: CPRE26, Kohchi et al.
2001, Frankenberg and Lagarias 2003), CCB1 (CPRE32) and CCB3 (CPRE18) (cytochrome c biogenesis enzyme involved in b6f complex assembly: Lezhneva et al. 2008) and NifU/NFU2/NFU3 (iron–sulfur cluster assembly enzyme: CPRE35, Nishio and Nakai 2000, Léon et al. 2003, Touraine et al. 2004). Some other proteins were found to be homologs of proteins in other organisms, such as NnrU homolog (nitric oxide reductase: CPRE23, Bartnikas et al. 1997), VKOR homolog (vitamin K epoxide reductase: CPRE10, Goodstadt and Ponting 2004), ClpS homolog (component of Clp machinery in Escherichia coli: CPRE2, Dougan et al. 2002) and yggH homolog (tRNA methyltransferase: CPRE37, De Bie et al. 2003). APE1 (uncharacterized protein involved in acclimation to high light: CPRE22, Walters et al. 2003) was described as a gene involved in a light acclimation defect, but biochemical analysis of the protein was not reported. CDF1 (CPRE29, Kawai-Yamada et al. 2005) was described only as a transgene in yeast cells. Other proteins that have Ycf numbers or annotations have not been analyzed in plants to date.

Curiously, mutants in Psb29/THF1 and APE1 did not show detectable phenotypes in Synechocystis using the method of analysis and growth conditions in the study reported here. According to the publications on these proteins, the mutants showed phenotypes under high light conditions. This suggests that more of the mutants that were analyzed in the present study may show phenotypes under high light or other extreme conditions.

Discussion

In the present study, we describe a new approach of functional orthogenomics in photosynthetic organisms, which involves comprehensive clustering of all proteins conserved in photosynthetic organisms and functional analysis of the genes or proteins that have been selected. The two parts are discussed separately.

Protein clustering and phylogenetic profiling

The informatics part of the study is a combination of protein clustering and phylogenetic profiling.
Phylogenetic profiling (Pellegrini et al. 1999) relies on the availability of various genome sequences of both photosynthetic and non-photosynthetic organisms. Eight cyanobacterial genome sequences were already available at the start of this project and now tens of cyanobacterial sequences are available. However, Arabidopsis (Arabidopsis Genome Initiative 2000) and Cyanidioschyzon (Matsuzaki et al. 2004) were, until recently, the only genome sequences of photosynthetic eukaryotes. We used the Cyanidioschyzon data for the clustering prior to publication, and preliminary results of the cluster analysis were presented in the paper reporting the sequence data (Matsuzaki et al. 2004). The use of two eukaryotic genomes improved the quality of estimation of conserved proteins in photosynthetic organisms. Many other studies have used Arabidopsis as a pivot in searching for homologs (Abdallah et al. 2000, Martin et al. 2002), and many of the proposed proteins of cyanobacterial origin were predicted to be localized outside chloroplasts. The use of complete clustering based on all-against-all BLASTP analysis (Sato 2002, Sato et al. 2005) resulted in a smaller number of protein clusters that are conserved in photosynthetic organisms, but they are indeed localized to chloroplasts as revealed in the present study. Some recent reports (Keren et al. 2005, Duy et al. 2007, Lezhneva et al. 2008, Takabayashi et al. 2009) also used informatics to identify potential genes for chloroplast proteins, but these studies were based on simple comparisons of single genes. The present study clearly shows that comprehensive analysis of all CPRENDOs is now feasible and effective in finding new chloroplast proteins. The publication of genome sequences of Chlamydomonas reinhardtii (Merchant et al. 2007) and Physcomitrella patens (Rensing et al. 2008), as well as other genomes, has greatly changed the situation, and made phylogenomic predictions more practical (Bowman et al. 
2007, De Crécy-Lagard and Hanson 2007). We recently constructed a comparative genomic database involving 95 organisms including all available data of photosynthetic organisms (data set ALL95), based on a more sophisticated algorithm (Sato 2009), which has been made publicly accessible at http://gclust.c.u-tokyo.ac.jp. The results are included in Table 1 in the rightmost column. The use of the ALL95 data set expands the scope of phylogenetic profiling, such as the proteins conserved in green plants or proteins conserved only in photosynthetic eukaryotes.

An important characteristic of the clustering reported in the present article is that the data include not only nuclear- but also organellar-encoded proteins. Therefore, we find that some proteins are nuclear encoded in Arabidopsis while homologs are encoded by the chloroplast genome in Cyanidioschyzon, such as CPRE 3, 12, 14 and 16 (Table 1). This is a consequence of the fact that many genes originating from the cyanobacterial endosymbiont remained encoded in the chloroplast genome in red algae, reflecting differences in discontinuous plastid evolution in the green and red lineages (Sato 2001, Sato 2006).

Functional genomic analysis

The goal of the current functional analysis has been to determine whether the proteins conserved in photosynthetic organisms, including cyanobacteria and plants or algae, are really chloroplast proteins that are involved in photosynthesis directly or indirectly. We can now say yes to this question. The analysis of localization showed that almost all putative CPRENDOs are indeed localized to chloroplasts, with the exception of some paralogs. Expression of many of them was
[Fig. 3: tabular summary of PAM fluorescence data for each CPRE mutant. Columns: CPRE; Species (Syn or Ath; wild types: Syn, glucose tolerant; Ath, Columbia); Gene; SALK line; Phenotype [reference]; qP; qN; ΦPSII; NPQ; Fv/Fm; Fv'/Fm'; n. Mutant values are given as percentages of the wild type (100%). Phenotype entries taken from published reports: atab2, albino [1]; pic1, chlorosis [2]; ndf4, loss of NDH activity [3]; thf1, variegation [4]; ape1, low ΦPSII in HL [5]; hy2, long hypocotyl [6]; ccb1, pale green [7].]

Fig. 3 Summary of the analysis of CPRE mutants. $, incomplete segregation (Syn: Synechocystis) or heterozygous line (Ath: Arabidopsis); FI, anomaly in fluorescence induction (Supplementary Fig. S2); SG, slow growth on agar plate; LS, light sensitive on agar plate; normal, no apparent phenotype; n, number of determinations. If a gene name is given in ‘Phenotype’, a report on the gene function had appeared during the work, and the gene was no longer analyzed (the CPRE number is highlighted by <brackets>). If no visible phenotype or no difference in fluorescence kinetics was found
stock (not analyzed) normal No stock 068796 $, FI, SG nfu2: pale green [8] no mutants analyzed $, FI $ FI, SG 019461 96% 95% 114% 140% 91% 74% 122% 118% (not analyzed) 88% 211% 82% 309% 106% 93% 4 97% 94% 105% 115% 96% 85% 93% 117% 101% 99% 99% 90% 4 3 Fig. 3 Continued in Synechocystis mutants, PAM analysis was not carried out. All Arabidopsis homozygous mutants were subjected to PAM analysis. Each value in red or blue indicates an increase or decrease, respectively, at the significance level of 5% (Ath) or 1% (Syn). References: [1] Barneche et al. (2006); [2] Teng et al. (2006), Duy et al. (2007); [3] Takabayashi et al. (2009); [4] Wang et al. (2004), Keren et al. (2005); [5] Walters et al. (2003); [6] Kohchi et al. (2001); [7] Lezhneva et al. (2008); [8] Touraine et al. (2004). Plant Cell Physiol. 50(4): 773–788 (2009) doi:10.1093/pcp/pcp027 © The Author 2009. 781 M. Ishikawa et al. B sCPRE36::kanR A Wild type C sCPRE30::kanR Fig. 4 PAM analysis of Synechocystis mutants. Measurement was started at time 0. Actinic light was provided as shown at 15 or 30% of the full power. Saturating pulses were applied at intervals. DCMU (10 µM) was added at the end of the measurement to obtain the Fm value. (A) Wild type; (B) sCPRE36 disruptant; (C) sCPRE30 disruptant. A B salk_010998 (CPRE27) Col C salk_010998 (CPRE27) D Col (CPRE16) salk_010806 Fig. 5 Visible phenotypes of cpre16 and cpre27 mutants in Arabidopsis. (A) Wild type (Columbia); (B), cotyledon of the cpre27 mutant; (C) foliage leaves of the cpre27 mutant; (D) cotyledons of the wild type (left) and the cpre16 mutant (right). promoted by light. About half of the Synechocystis mutants were affected in photosynthesis, as revealed by fluorescence, oxygen evolution or pigment composition. Many of the plant mutants showed defects in photosynthesis or related processes. Based on these data, we can safely conclude that the conserved proteins in photosynthetic organisms are indeed CPRENDOs. 
This demonstrates the usefulness of phylogenetic profiling in identifying proteins involved in functions limited to certain groups of organisms, such as plants of the green lineage, cyanobacteria or land plants. However, we noticed interesting differences between the mutants of Synechocystis and Arabidopsis (Fig. 3). The results of PAM fluorescence were distinctly different in most cases in Synechocystis and Arabidopsis. The differences are graphically expressed as vectors in Fig. 7B. Each set of the six parameters of PAM analysis was taken as a vector, and the angle between the vectors for plant and cyanobacterial mutants of each CPRE and their mean size (the geometric mean of the two lengths) were plotted. Most data deviated from the axis representing parallel phenotypes in the two organisms. One reason for this is the different mechanisms of energy dissipation in chloroplasts and cyanobacteria (Schreiber 2004, Fujimori et al. 2005). Another reason is that cyanobacteria are free-living organisms, whereas chloroplasts are located within the cell.

The mutant data are summarized as a Venn diagram (Fig. 7A). In this figure, only the presence of phenotypes in mutants of Arabidopsis and Synechocystis was used to classify the 37 CPRE genes. By this criterion, mutation of 15 CPRE genes resulted in phenotypes in both Arabidopsis and Synechocystis. Uncertainty remains for 10 CPREs, which are shown by the two 5s at the boundary of Arabidopsis. Mutations in five CPRE genes caused phenotypes only in Arabidopsis. This result again suggests that the roles of many CPREs are different in chloroplasts and cyanobacteria.

Comprehensive analysis of the many proteins predicted to be CPRENDOs required a long time. Localization and expression analyses are feasible for a larger set of proteins, but analysis of mutants, especially of plants, must be performed one by one and is time consuming. In addition, detailed analysis of mutants, either in plants or in cyanobacteria, requires specified growth conditions.
We did not detect phenotypes in ape1 and thf1 mutants of Synechocystis under normal laboratory conditions of growth. This suggests that a more sophisticated analysis with a range of growth conditions is necessary to find clear phenotypes in CPRE mutants. Another problem is that not all mutants are currently available. The unavailable T-DNA insertion lines may show interesting phenotypes. We have now demonstrated that the proposed CPRENDOs are reasonable candidates for further detailed research; construction of all knockdown mutants for these CPREs will be a promising strategy.

Prospects and conclusion

The combined informatic and experimental approach presented here as a pilot study is a model for future projects. The total number of CPRENDOs as estimated by a recent Gclust analysis is about 1,200 (Sato et al. 2005), which is not as large as the number (about 3,600) presented by Martin et al. (2002). We expect that comprehensive analysis of this number of genes is feasible. In addition, the present study shows the possibility of performing comprehensive analysis of phylogenetically conserved proteins, such as those conserved in nodulating bacteria and those conserved in green plants, among others. It is true that plant genomics has advanced since the start of this project, by the development of genetic tools, comprehensive analysis of co-expression and proteomics. However, integrated comparative protein data, at a level beyond simple homology searches, as described in the present report, remain important in complementing other resources.

In conclusion, we demonstrated that phylogenetic profiling is effective in identifying hitherto undetected CPRENDOs. More efficient profiling, using improved protein clustering with more genomic data, will allow estimation of a nearly complete set of CPRENDOs. In addition, a similar approach could be applied to find various lineage-specific proteins, which might be involved in lineage-specific functions, such as chloroplast proteins of eukaryotic origin.

Table 4 Oxygen evolution and spectral properties of Synechocystis mutants

                                                                       Absorbance ratio
Kazusa code  Strain  CPRE   Notes   O2 evolution    P (5%)   n    A492/A680   A626/A680
Wild type                           210.6ᵃ (100%)   –        18   0.402       0.785
sll0286      6       3              111%            0.247    4    0.445       0.805
sll2002      15      <5>    ATAB2   104%            0.688    2    0.435       0.872
sll1656      17      <6>    PIC1    95%             0.574    5    0.542       0.889
slr0959      19      8              166%            0.000    5    0.454       0.901
sll1586      20      9              94%             0.524    3    0.493       0.767
sll1265      21      9              127%            0.028    3    0.411       0.796
ssl3044      23      <11>   NDF4    108%            0.326    5    0.403       0.824
sll1289      24      13             94%             0.568    2    0.451       0.798
sll1737      25      14     Ycf60   140%            0.000    3    0.432       0.780
sll1071      26      15             146%            0.000    3    0.444       0.970
slr0923      27      16     Ycf65   58%             0.000    5    0.424       0.741
ssl0353      4       <18>   Ycf19   64%             0.000    4    0.463       0.833
sll1414      29      <19>   Psb29   101%            0.924    3    0.427       0.781
slr2033      30      20             93%             0.334    4    0.553       1.000
slr1702      31      21             97%             0.699    3    0.531       0.768
slr0575      32      22             117%            0.044    4    0.424       0.772
slr1599      8       23             116%            0.069    5    0.388       0.780
slr0815      35      25             69%             0.000    6    0.430       0.835
slr0116      36      <26>   PcyA    125%            0.022    3    0.539       0.834
sll1509      37      27     Ycf20   88%             0.150    5    0.426       0.765
slr0941      38      28             99%             0.931    5    0.355       0.809
sll0933      33      30             133%            0.000    4    0.528       1.036
slr0589      41      <32>   CCB1    103%            0.696    5    0.399       0.836
sll0577      43      33             92%             0.324    4    0.520       0.770
ssl2667      46      <35>   NifU    100%            0.952    4    0.485       0.879
sll0875      47      36             126%            0.033    3    0.422       0.936
sll1300      48      37             116%            0.065    4    0.456       0.812

[Segregation column: ‘No’ is entered for seven strains; the rows to which these entries belong are not recoverable here.]
Values that are significantly different from wild-type values are underlined. CPREs that have been identified are marked by <brackets>.
ᵃOxygen evolution in µmol mg⁻¹ Chl h⁻¹.

[Fig. 6: PAM fluorescence traces (y-axis: fluorescence intensity, arb. units; x-axis: time, scale bar 5 min). Panels: A, Columbia; B, SALK_010806 (CPRE16); C, SALK_013444 (CPRE14); D, (CPRE27); E, (CPRE14). The levels Fo, F, Fm, Fo′ and Fm′ and the ML, AL and FR on/off events are marked on the traces.]

Fig. 6 PAM analysis of A. thaliana mutants. AL, actinic light; ML, measuring light. Saturating pulses were given as indicated. (A) Two traces of the wild type (Columbia); (B) a cpre16 mutant; (C) a cpre14 mutant; (D) a cpre27 mutant (green sector); (E) a cpre14 mutant.

Materials and Methods

Estimation of CPRE

The genomes used in the present study are summarized in Supplementary Table S1. The sequence data were assembled as of December 2002. We also have more recent data (ALL95 in Table 1); however, the analyses reported in the present article were based on the data of 2002. The Gclust software version 3.0 was used to construct protein clusters by single-linkage clustering at a threshold E-value, such as 10⁻⁸, 10⁻¹² or 10⁻²⁰. The details of data processing were described in previous papers (Sato 2002, Sato et al. 2005). For the estimation of CPRENDOs, the clusters that had at least one member in each of the eight cyanobacteria, the plant and the red alga, but had no members belonging to non-photosynthetic organisms or non-oxygenic photosynthetic bacteria, were selected. Prediction of intracellular targeting was performed for the selected Arabidopsis protein sequences using the TargetP server version 1.01 at http://www.cbs.dtu.dk/services/ (Emanuelsson et al. 2000), WoLF PSORT at http://wolfpsort.org/ (Horton et al. 2007) and Predotar at http://urgi.versailles.inra.fr/predotar/predotar.html (Small et al. 2004).

Growth of organisms

Wild-type and mutant A. thaliana were grown on 0.8% agar-solidified MS medium (Murashige and Skoog 1962) or on soil at 22°C under continuous illumination with fluorescent lamps (80 µmol m⁻² s⁻¹). T-DNA-tagged mutants of Arabidopsis were obtained from the Arabidopsis Biological Resource Center (Ohio State University, Columbus, OH, USA) or the Nottingham Arabidopsis Stock Centre (University of Nottingham, Loughborough, UK).
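The cluster-selection rule stated under ‘Estimation of CPRE’ can be illustrated with a short sketch. This is not Gclust and not the authors' pipeline; it only assumes that each protein cluster has already been reduced to the set of organisms that contribute at least one member. The function name, organism labels and cluster IDs are hypothetical.

```python
from typing import Dict, List, Set


def select_conserved_clusters(
    clusters: Dict[str, Set[str]],
    required: Set[str],
    excluded: Set[str],
) -> List[str]:
    """Return IDs of clusters found in every required organism and in no excluded one."""
    selected = []
    for cluster_id, organisms in clusters.items():
        # At least one member in each required organism.
        in_all_required = required <= organisms
        # No member in any excluded organism.
        in_no_excluded = organisms.isdisjoint(excluded)
        if in_all_required and in_no_excluded:
            selected.append(cluster_id)
    return sorted(selected)


# Hypothetical labels: eight cyanobacteria, one plant and one red alga are required;
# non-photosynthetic and non-oxygenic photosynthetic organisms are excluded.
REQUIRED = {f"cyano{i}" for i in range(1, 9)} | {"plant", "red_alga"}
EXCLUDED = {"nonphoto1", "anoxygenic1"}

example_clusters = {
    "c001": set(REQUIRED),                # only in oxygenic autotrophs: kept
    "c002": REQUIRED | {"nonphoto1"},     # also in a non-photosynthetic organism: dropped
    "c003": REQUIRED - {"red_alga"},      # missing from the red alga: dropped
}

print(select_conserved_clusters(example_clusters, REQUIRED, EXCLUDED))
```

Representing each cluster as a set of organisms keeps the two conditions (presence in all required genomes, absence from all excluded genomes) as single set operations, which mirrors the wording of the selection criterion.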
Wild-type (glucose-tolerant strain) and mutants of Synechocystis were grown at 30°C in BG-11 medium (Rippka et al. 1979) supplemented with 5 mM sodium bicarbonate and 5 mM HEPES/NaOH (pH 7.5) under continuous illumination with fluorescent lamps (50 µmol m⁻² s⁻¹). Liquid cultures were aerated with 1.0% CO2 in air.

Intracellular localization of selected proteins

Targeting of selected A. thaliana proteins was experimentally analyzed using the GFP fusion technique. A construct for the expression of a transit peptide or a full-length protein fused with GFP was prepared for each selected protein by repeated PCR. The standard PCR was done in a 100 µl reaction with 2.5 U of ExTaq (TAKARA Biomedicals, Kyoto, Japan) according to the program of 30 cycles, each consisting of denaturation at 93°C for 40 s, annealing at 55°C for 2 min and extension at 72°C for 2 min. After the final cycle, extension for 10 min was performed. First, the following three fragments were amplified by standard PCR: (1) the upstream half of the GFP vector containing the 35S promoter region plus a short connecting sequence A at its 3′ end; (2) the putative targeting sequence with a short connecting sequence A at its 5′ end and a short connecting sequence B at its 3′ end; and (3) the sGFP sequence with a short connecting sequence B at its 5′ end. The connecting sequences A and B were used to join two PCR fragments by PCR. For the amplification of fragments (1) and (3), the sGFP vector (Chiu et al. 1996) was used as a template. The primers for the amplification of fragment (1) were: primer 1, CCCTCAGAAGACCAGAGGGCTATTGAGACT; and primer 2, GGATCCTCTAGAGTCGAC. The primers for the amplification of fragment (3) were: primer 3, ATGGTGAGCAAGGGCGAG or GTGAGCAAGGGCGAGGAG; and primer 4, TCTCATGTTTGACAGCTTATCATCGGATCT. The underlined sequences represent connecting sequences A and B, respectively. The primers used for amplifying fragment (2) varied with the genes to be amplified and are summarized in Supplementary Table S3. In the second PCR, purified fragments (1), (2) and (3) were mixed and connected by amplification under conditions slightly different from the standard ones, i.e. the annealing temperature was 50°C. The amount of fragment (2) was twice that of fragment (1) or (3). The final product (1–2–3) was further amplified for use in particle bombardment.

[Fig. 7: (A) Venn diagram of Synechocystis and Arabidopsis; numbers shown: 5, 15, 2, 5, 5, 4; unknown: 1. (B) Hemi-circular plot; marks on the plot: 1, Parallel (similar changes); Orthogonal (change in different parameters); −1, Anti-parallel (reverse changes in similar parameters); 0 (no change); labelled CPREs: 8, 14, 15, 16, 21, 27, 30, 37.]

Fig. 7 Statistical comparison of Arabidopsis and Synechocystis mutants. (A) Classification of CPREs according to observed phenotypes in Arabidopsis and Synechocystis. Numbers within the circles indicate the number of CPREs that showed phenotypes in either Arabidopsis, Synechocystis or both. The number outside the circles indicates CPREs that did not show phenotypes in either organism. Where there are no data for a CPRE, this is described as ‘unknown’. (B) Similarity analysis of the PAM results in Arabidopsis and Synechocystis. Each set of the six parameters of PAM analysis (or, more exactly, deviation from 100%) (Fig. 3) was taken as a vector, and the similarity of the vectors for Arabidopsis and Synechocystis mutants of a CPRE, u and v, respectively, was plotted as a vector having a size, \((|u||v|)^{1/2}\), and an orientation \(\theta = \cos^{-1}(u \cdot v / |u||v|)\), where u·v is the scalar product of u and v. The data are plotted in a hemi-circular space with an arbitrary unit. Angle θ is measured anti-clockwise from the right (as marked by 1). CPRE numbers are indicated for significant signals.
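The similarity measure defined in the legend of Fig. 7B can be tried out numerically with the following sketch. It assumes that each mutant is described by six PAM parameters expressed as a percentage of the wild type, so that the deviation vector is the percentage minus 100. The function name and the example numbers are hypothetical and are not values taken from Fig. 3.

```python
import math
from typing import Sequence, Tuple


def pam_similarity(
    ath_percent: Sequence[float], syn_percent: Sequence[float]
) -> Tuple[float, float]:
    """Return (size, theta in degrees) for two sets of PAM parameters given as % of wild type.

    size  = (|u||v|)^(1/2)
    theta = arccos(u.v / (|u||v|))
    where u and v are the deviations from 100% in Arabidopsis and Synechocystis.
    """
    if len(ath_percent) != len(syn_percent):
        raise ValueError("both parameter sets must have the same length")
    u = [p - 100.0 for p in ath_percent]
    v = [p - 100.0 for p in syn_percent]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # No change in at least one organism: size is zero and the angle is undefined.
        return 0.0, float("nan")
    cos_theta = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding just outside [-1, 1]
    return math.sqrt(norm_u * norm_v), math.degrees(math.acos(cos_theta))


# Hypothetical examples (six parameters each, % of wild type).
parallel = pam_similarity([110, 90, 100, 100, 105, 100], [120, 80, 100, 100, 110, 100])
antiparallel = pam_similarity([110, 90, 100, 100, 105, 100], [90, 110, 100, 100, 95, 100])
orthogonal = pam_similarity([110, 100, 100, 100, 100, 100], [100, 110, 100, 100, 100, 100])
print(parallel, antiparallel, orthogonal)
```

With this convention, changes in the same direction in both organisms give an angle near 0° (cos θ near 1), opposite changes give an angle near 180° (cos θ near −1), and changes in different parameters give an angle near 90°.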
Tungsten particles (1 µm in diameter) were coated with the DNA and then introduced into scaly leaves of onion bulb by the He-driven particle delivery system PDS-1000/He (Bio-Rad Laboratories, Hercules, CA, USA), using rupture disks for 650 p.s.i. After incubating for 24 h at 25°C in dim light, the epidermis was peeled and examined under a fluorescence microscope (Olympus model BX-60) with an IB cube. As controls for chloroplast and mitochondrial proteins, cpRbcS (Lee et al. 2002) and mitochondrial ATP synthase subunit δ (Moriyama et al. 2008) were used, respectively.

RNA gel blot analysis

Seedlings of A. thaliana ecotype Columbia were grown on 0.2% agar-solidified MS medium (Murashige and Skoog 1962) for 7 d under light (50 µE m⁻² s⁻¹) or in darkness. Shoots were harvested, rapidly frozen in liquid nitrogen, and stored at –80°C until use. Digoxigenin (DIG)-labeled probes for the RNA gel blot analysis were prepared by PCR using a DIG-PCR labeling mixture (Roche Diagnostics, Mannheim, Germany). The primers are listed in Supplementary Table S3. Preparation of total RNA, glyoxylation, electrophoresis, blotting to a nylon membrane and hybridization were done as described previously (Sekine et al. 2007). The band was finally visualized by chemiluminescence of CDP-Star (Roche Diagnostics).

Disruption mutagenesis in Synechocystis

The genes for CPRE were individually disrupted in Synechocystis sp. PCC 6803 by homologous recombination using a PCR-based disruption cassette. The method was described in a recent publication (Sakurai et al. 2007). The primers are listed in Supplementary Table S4. Complete segregation was confirmed by PCR analysis using the upF and dnR primers. In some cases, complete segregation was not attained, but the partial mutants showing phenotypes were analyzed along with the complete disruptants.

Fluorescence measurement

PAM fluorescence analysis was performed with Fluorescence Monitoring System FMS1 (Hansatech Instruments Ltd., Norfolk, UK).
In the analysis of Arabidopsis, leaves of 30-day-old wild type or mutants grown on soil under a 16 h light/8 h dark cycle at 23°C were used. Modulated measuring light at 594 nm was used at setting 2 with gain 70. Actinic light at setting 15 (corresponding to a fluence rate of 80 µmol m⁻² s⁻¹) was used to drive photosynthesis. Pulses (0.8 s) of white light at setting 100 (fluence rate of 8,000 µmol m⁻² s⁻¹) were applied at 30 s intervals to obtain maximal fluorescence.

The kinetics of fluorescence induction in Synechocystis were measured as described (Fujimori et al. 2005). The results were used to select candidates for further analysis. In the PAM analysis of Synechocystis, exponentially growing cells (A750 ∼0.5) were used. Modulated light was used at setting 1 with gain 70. Actinic light at setting 15 and a series of 0.2 s saturating pulses were applied. At the end of each measurement, DCMU (10 µM) was added with actinic light at setting 30 to obtain Fm.

Oxygen evolution

Oxygen evolution of Synechocystis cells was measured polarographically in Oxytherm with an Oxygraph controller (Hansatech Instruments Ltd.).

Absorption spectra

Absorption spectra of Synechocystis cells were measured by the ‘opal glass’ method using a Shimadzu UV160 spectrophotometer. Two sheets of Parafilm were used as a light scatterer, and the cuvettes were placed just in front of the light detectors. Chlorophyll was determined in a 90% methanol extract at 665 nm using the absorption coefficient 12.7 mM⁻¹ cm⁻¹.

Supplementary data

Supplementary data are available at PCP online.

Funding

Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan Grants-in-Aid (Nos. 17018010, 18017005, 20017006 and 16GS0304 to N.S.).

Acknowledgments

The authors thank former students T. Saito and A.
Fukumoto in the laboratory for their help in the initial phase of the work, and Y. Niwa, University of Shizuoka, for kindly supplying us with the sGFP vector. We also acknowledge the Arabidopsis Biological Resource Center and the Nottingham Arabidopsis Stock Centre for Arabidopsis T-DNA-tag lines.

References

Abdallah, F., Salamini, F. and Leister, D. (2000) A prediction of the size and evolutionary origin of the proteome of chloroplasts of Arabidopsis. Trends Plant Sci. 5: 141–142.
Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796–815.
Barneche, F., Winter, V., Crèvecoeur, M. and Rochaix, J.D. (2006) ATAB2 is a novel factor in the signalling pathway of light-controlled synthesis of photosystem proteins. EMBO J. 25: 5907–5918.
Bartnikas, T.B., Tosques, I.E., Laratta, W.P., Shi, J. and Shapleigh, J.P. (1997) Characterization of the nitric oxide reductase-encoding region in Rhodobacter sphaeroides 2.4.3. J. Bacteriol. 179: 3534–3540.
Bowman, J.L., Floyd, S.K. and Sakakibara, K. (2007) Green genes—comparative genomics of the green branch of life. Cell 129: 229–234.
Cavalier-Smith, T. (2003) Genomic reduction and evolution of novel genetic membranes and protein-targeting machinery in eukaryote–eukaryote chimeras (meta-algae). Philos. Trans. R. Soc. B: Biol. Sci. 358: 109–134.
Chiu, W.-I., Niwa, Y., Zeng, W., Hirose, T., Kobayashi, H. and Sheen, J. (1996) Engineered GFP as vital reporter in plants. Curr. Biol. 6: 325–330.
Dauvillée, D., Stampacchia, O., Girard-Bascou, J. and Rochaix, J.-D. (2003) Tab2 is a novel conserved RNA binding protein required for translation of the chloroplast psaB mRNA. EMBO J. 22: 6378–6388.
De Bie, L.G., Roovers, M., Oudjama, Y., Wattiez, R., Tricot, C., Stalon, V., et al. (2003) The yggH gene of Escherichia coli encodes a tRNA (m7G46) methyltransferase. J. Bacteriol. 185: 3238–3243.
De Crécy-Lagard, V. and Hanson, A.D. (2007) Finding novel metabolic genes through plant-prokaryote phylogenomics. Trends Microbiol. 15: 563–570.
Dougan, D.A., Reid, B.G., Horwich, A.L. and Bukau, B. (2002) ClpS, a substrate modulator of the ClpAP machine. Mol. Cell 9: 673–683.
Dunkley, T.P., Hester, S., Shadforth, I.P., Runions, J., Weimar, T., Hanton, S.L., et al. (2006) Mapping the Arabidopsis organelle proteome. Proc. Natl Acad. Sci. USA 103: 6518–6523.
Duy, D., Wanner, G., Meda, A.R., Von Wirén, N., Soll, J. and Philippar, K. (2007) PIC1, an ancient permease in Arabidopsis chloroplasts, mediates iron transport. Plant Cell 19: 986–1006.
Emanuelsson, O., Nielsen, H., Brunak, S. and von Heijne, G. (2000) Predicting subcellular localization of proteins based on their N-terminal amino acid sequence. J. Mol. Biol. 300: 1005–1016.
Ferro, M., Salvi, D., Brugière, S., Miras, S., Kowalski, S., Louwagie, M., et al. (2003) Proteomics of the chloroplast envelope membranes from Arabidopsis thaliana. Mol. Cell Proteomics 2: 325–345.
Frankenberg, N. and Lagarias, J.C. (2003) Phycocyanobilin:ferredoxin oxidoreductase of Anabaena sp. PCC 7120. J. Biol. Chem. 278: 9219–9226.
Friso, G., Giacomelli, L., Ytterberg, A.J., Peltier, J.-B., Rudella, A., Sun, Q., et al. (2004) In-depth analysis of the thylakoid membrane proteome of Arabidopsis thaliana chloroplasts: new proteins, new functions, and a plastid proteome database. Plant Cell 16: 478–499.
Froehlich, J.E., Wilkerson, C.G., Ray, W.K., McAndrew, R.S., Osteryoung, K.W., Gage, D.A., et al. (2003) Proteomic study of the Arabidopsis thaliana chloroplastic envelope membrane utilizing alternatives to traditional two-dimensional electrophoresis. J. Proteome Res. 2: 413–425.
Fujimori, T., Higuchi, M., Sato, H., Aiba, H., Muramatsu, M., Hihara, Y., et al. (2005) The mutant of sll1961, which encodes a putative transcriptional regulator, has a defect in regulation of photosystem stoichiometry in the cyanobacterium Synechocystis sp. PCC 6803. Plant Physiol. 139: 408–416.
Goodstadt, L. and Ponting, C.P. (2004) Vitamin K epoxide reductase: homology, active site and catalytic mechanism. Trends Biochem. Sci. 29: 289–292.
Horton, P., Park, K.J., Obayashi, T., Fujita, N., Harada, H., Adams-Collier, C.J., et al. (2007) WoLF PSORT: protein localization predictor. Nucleic Acids Res. 35: W585–W587.
Kawai-Yamada, M., Saito, Y., Jin, L., Ogawa, T., Kim, K.M., Yu, L.H., et al. (2005) A novel Arabidopsis gene causes Bax-like lethality in Saccharomyces cerevisiae. J. Biol. Chem. 280: 39468–39473.
Keren, N., Ohkawa, H., Welsh, E.A., Liberton, M. and Pakrasi, H.B. (2005) Psb29, a conserved 22-kD protein, functions in the biogenesis of photosystem II complexes in Synechocystis and Arabidopsis. Plant Cell 17: 2768–2781.
Kleffmann, T., Russenberger, D., von Zychlinski, A., Christopher, W., Sjölander, K., Gruissem, W., et al. (2004) The Arabidopsis thaliana chloroplast proteome reveals pathway abundance and novel protein functions. Curr. Biol. 14: 354–362.
Kohchi, T., Mukougawa, K., Frankenberg, N., Masuda, M., Yokota, A. and Lagarias, J.C. (2001) The Arabidopsis HY2 gene encodes phytochromobilin synthase, a ferredoxin-dependent biliverdin reductase. Plant Cell 13: 425–436.
Lee, K.H., Kim, D.H., Lee, S.W., Kim, Z.H. and Hwang, I. (2002) In vivo import experiments in protoplasts reveal the importance of the overall context but not specific amino acid residues of the transit peptide during import into chloroplasts. Mol. Cells 14: 388–397.
Léon, S., Touraine, B., Ribot, C., Briat, J.F. and Lobréaux, S. (2003) Iron–sulphur cluster assembly in plants: distinct NFU proteins in mitochondria and plastids from Arabidopsis thaliana. Biochem. J. 371: 823–830.
Lezhneva, L., Kuras, R., Ephritikhine, G. and De Vitry, C. (2008) A novel pathway of cytochrome c biogenesis is involved in the assembly of the cytochrome b6f complex in Arabidopsis chloroplasts. J. Biol. Chem. 283: 24608–24616.
Martin, W., Rujan, T., Richly, E., Hansen, A., Cornelsen, S., Lins, T., et al. (2002) Evolutionary analysis of Arabidopsis, cyanobacterial, and chloroplast genomes reveals plastid phylogeny and thousands of cyanobacterial genes in the nucleus. Proc. Natl Acad. Sci. USA 99: 12246–12251.
Martin, W., Stoebe, B., Goremykin, V., Hansmann, S., Hasegawa, M. and Kowallik, K.V. (1998) Gene transfer to the nucleus and the evolution of chloroplasts. Nature 393: 162–165.
Matsuzaki, M., Misumi, O., Shin-i, T., Maruyama, S., Takahara, M., Miyagishima, S., et al. (2004) Genome sequence of the ultrasmall unicellular red alga Cyanidioschyzon merolae 10D. Nature 428: 653–657.
Merchant, S.S., Prochnik, S.E., Vallon, O., Harris, E.H., Karpowicz, S.J., Witman, G.B., et al. (2007) The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318: 245–250.
Moriyama, T., Terasawa, K., Fujiwara, M. and Sato, N. (2008) Purification and characterization of organellar DNA polymerases in the red alga Cyanidioschyzon merolae. FEBS J. 275: 2899–2918.
Mulkidjanian, A.Y., Koonin, E.V., Makarova, K.S., Mekhedov, S.L., Sorokin, A., Wolf, Y.I., et al. (2006) The cyanobacterial genome core and the origin of photosynthesis. Proc. Natl Acad. Sci. USA 103: 13126–13131.
Murashige, T. and Skoog, F. (1962) A revised medium for rapid growth and bioassays with tobacco tissue cultures. Physiol. Plant. 15: 473–497.
Nishio, K. and Nakai, M. (2000) Transfer of iron–sulfur cluster from NifU to apoferredoxin. J. Biol. Chem. 275: 22615–22618.
Pellegrini, M., Marcotte, E.M., Thompson, M.J., Eisenberg, D. and Yeates, T.O. (1999) Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl Acad. Sci. USA 96: 4285–4288.
Peltier, J.-B., Emanuelsson, O., Kalume, D.E., Ytterberg, J., Friso, G., Rudella, A., et al. (2002) Central functions of the lumenal and peripheral thylakoid proteome of Arabidopsis determined by experimentation and genome-wide prediction. Plant Cell 14: 211–236.
Peltier, J.-B., Ytterberg, A.J., Sun, Q. and van Wijk, K.J. (2004) New functions of the thylakoid membrane proteome of Arabidopsis thaliana revealed by a simple, fast, and versatile fractionation strategy. J. Biol. Chem. 279: 49367–49383.
Peltier, J.-B., Cai, Y., Sun, Q., Zabrouskov, V., Giacomelli, L., Rudella, A., et al. (2006) The oligomeric stromal proteome of Arabidopsis thaliana chloroplasts. Mol. Cell Proteomics 5: 114–133.
Rensing, S.A., Lang, D., Zimmer, A.D., Terry, A., Salamov, A., Shapiro, H., et al. (2008) The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science 319: 64–69.
Reyes-Prieto, A., Hackett, J.D., Soares, M.B., Bonaldo, M.F. and Bhattacharya, D. (2006) Cyanobacterial contribution to algal nuclear genomes is primarily limited to plastid functions. Curr. Biol. 16: 2320–2325.
Rippka, R., Deruelles, J., Waterbury, J.B., Herdman, M. and Stanier, R.Y. (1979) Generic assignments, strain histories and properties of pure cultures of cyanobacteria. J. Gen. Microbiol. 111: 1–61.
Sakurai, I., Mizusawa, N., Wada, H. and Sato, N. (2007) Digalactosyldiacylglycerol is required for stabilization of the oxygen-evolving complex in photosystem II. Plant Physiol. 145: 1361–1370.
Sato, N. (2001) Was the evolution of plastid genetic machinery discontinuous? Trends Plant Sci. 6: 151–156.
Sato, N. (2002) Comparative analysis of the genomes of cyanobacteria and plants. Genome Inform. 13: 173–182.
Sato, N. (2006) Origin and evolution of plastids: genomic view on the unification and diversity of plastids. In The Structure and Function of Plastids. Edited by Wise, R.R. and Hoober, J.K. pp. 75–102. Springer, Dordrecht.
Sato, N. (2009) Gclust: trans-kingdom classification of proteins using automatic individual threshold setting. Bioinformatics doi:10.1093/bioinformatics/btp047.
Sato, N., Ishikawa, M., Fujiwara, M. and Sonoike, K. (2005) Mass identification of chloroplast proteins of endosymbiont origin by phylogenetic profiling based on organism-optimized homologous protein groups. Genome Inform. 16: 56–68.
Schreiber, U. (2004) Pulse-amplitude-modulation (PAM) fluorometry, and saturation pulse method: an overview. In Chlorophyll a Fluorescence: A Signature of Photosynthesis. Edited by Papageorgiou, G.C. and Govindjee. pp. 279–339. Springer, Dordrecht.
Schubert, M., Petersson, U.A., Haas, B.J., Funk, C., Schröder, W.P. and Kieselbach, T. (2002) Proteome map of the chloroplast lumen of Arabidopsis thaliana. J. Biol. Chem. 277: 8354–8365.
Sekine, K., Fujiwara, M., Nakayama, M., Takao, T., Hase, T. and Sato, N. (2007) DNA-binding and partial nucleoid localization of the chloroplast stromal enzyme ferredoxin:sulfite reductase. FEBS J. 274: 2054–2069.
Small, I., Peeters, N., Legeai, F. and Lurin, C. (2004) Predotar: a tool for rapidly screening proteomes for N-terminal targeting sequences. Proteomics 4: 1581–1590.
Sun, Q., Emanuelsson, O. and van Wijk, K.J. (2004) Analysis of curated and predicted plastid subproteomes of Arabidopsis. Subcellular compartmentalization leads to distinctive proteome properties. Plant Physiol. 135: 723–735.
Takabayashi, A., Ishikawa, N., Obayashi, T., Ishida, S., Obokata, J., Endo, T., et al. (2009) Three novel subunits of Arabidopsis chloroplastic NAD(P)H dehydrogenase identified by bioinformatic and reverse genetic approaches. Plant J. 57: 207–219.
Teng, Y.-S., Su, Y.-S., Chen, L.-J., Lee, Y.J., Hwang, I. and Li, H.-M. (2006) Tic21 is an essential translocon component for protein translocation across the chloroplast inner envelope membrane. Plant Cell 18: 2247–2257.
Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–4680.
Touraine, B., Boutin, J.P., Marion-Poll, A., Briat, J.F., Peltier, G. and Lobréaux, S. (2004) Nfu2: a scaffold protein required for [4Fe–4S] and ferredoxin iron–sulphur cluster assembly in Arabidopsis chloroplasts. Plant J. 40: 101–111.
Walters, R.G., Shephard, F., Rogers, J.J., Rolfe, S.A. and Horton, P. (2003) Identification of mutants of Arabidopsis defective in acclimation of photosynthesis to the light environment. Plant Physiol. 131: 472–481.
Wang, Q., Sullivan, R.W., Kight, A., Henry, R.L., Huang, J., Jones, A.M., et al. (2004) Deletion of the chloroplast-localized Thylakoid Formation 1 gene product in Arabidopsis leads to deficient thylakoid formation and variegated leaves. Plant Physiol. 136: 3594–3604.
Zybailov, B., Rutschow, H., Friso, G., Rudella, A., Emanuelsson, O., Sun, Q., et al. (2008) Sorting signals, N-terminal modifications and abundance of the chloroplast proteome. PLoS ONE 3: e1994.

(Received January 8, 2009; Accepted February 12, 2009)

Plant Cell Physiol. 50(4): 773–788 (2009) doi:10.1093/pcp/pcp027 © The Author 2009.