Orthogenomics of Photosynthetic Organisms

Masayuki Ishikawa1, Makoto Fujiwara1, Kintake Sonoike2 and Naoki Sato1,*
1Department of Life Sciences, Graduate School of Arts and Sciences, University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo, 153-8902 Japan
2Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, 277-8562 Japan
Chloroplasts are descendents of a cyanobacterial
endosymbiont, but many chloroplast protein genes of
endosymbiont origin are encoded by the nucleus. The
chloroplast–cyanobacteria relationship is a typical target
of orthogenomics, an analytical method that focuses on
the relationship of orthologous genes. Here, we present
results of a pilot study of functional orthogenomics,
combining bioinformatic and experimental analyses, to
identify nuclear-encoded chloroplast proteins of
endosymbiont origin (CPRENDOs). Phylogenetic profiling
based on complete clustering of all proteins in 17
organisms, including eight cyanobacteria and two
photosynthetic eukaryotes, was used to deduce 65 protein
groups that are conserved in all oxygenic autotrophs
analyzed but not in non-oxygenic organisms. After excluding
28 well-characterized protein groups, we analyzed the 56
Arabidopsis proteins and 43 Synechocystis proteins in the
remaining 37 conserved homolog groups. Green
fluorescent protein (GFP) targeting experiments indicated
that 54 Arabidopsis proteins were targeted to plastids.
Expression of 39 Arabidopsis genes was promoted by light.
Among the 40 disruptants of Synechocystis, 22 showed
phenotypes related to photosynthesis. Arabidopsis
mutants in 21 groups, including those reported previously,
showed phenotypes. Characteristics of pulse amplitude
modulation fluorescence were markedly different in
corresponding mutants of Arabidopsis and Synechocystis
in most cases. We conclude that phylogenetic profiling is
useful in finding CPRENDOs, but the physiological
functions of orthologous genes may be different in
chloroplasts and cyanobacteria.
Keywords: Arabidopsis thaliana • Chloroplast protein •
Comparative genomics • Endosymbiogenesis • Photosynthetic
gene • Synechocystis sp. PCC 6803.
Special Issue – Regular Paper
Orthogenomics of Photosynthetic Organisms: Bioinformatic
and Experimental Analysis of Chloroplast Proteins of
Endosymbiont Origin in Arabidopsis and Their
Counterparts in Synechocystis
Abbreviations: CPRENDO, chloroplast protein of endosymbiont origin; GFP, green fluorescent protein; PAM, pulse
amplitude modulation.
Introduction
Chloroplasts are descendents of an ancestral endosymbiont
related to cyanobacteria (Abdallah et al. 2000, Cavalier-Smith 2003, Sato 2006). The proteins encoded by chloroplast
genomes are orthologs of cyanobacterial counterparts
(Martin et al. 1998, Mulkidjanian et al. 2006, Sato 2006). In
addition, many genes of the original endosymbiont were
transferred to the nuclear genome. These are typical targets
of orthogenomics, which classifies proteins according to
orthologous relationships. A number of nuclear genes are
known to encode chloroplast proteins that are related to
photosynthesis or chloroplast biogenesis (Sato 2001, Sato
2006, Bowman et al. 2007). However, there are many other
unidentified, nuclear-encoded proteins present in the chloroplast. Only a part of the chloroplast proteome has been
elucidated by mass spectrometry (e.g. Peltier et al. 2002, Friso
et al. 2004, Kleffmann et al. 2004, Peltier et al. 2004, Peltier
et al. 2006). An estimate suggested that >3,600 proteins of
the model plant Arabidopsis thaliana originate from the
ancestral endosymbiont, and about half of these are proteins that function in compartments other than chloroplasts
(Martin et al. 2002). Another study suggested that the
cyanobacterial contribution to the nuclear genome of
Cyanophora paradoxa is limited to chloroplast proteins
(Reyes-Prieto et al. 2006).
*Corresponding author: E-mail, [email protected].
Plant Cell Physiol. 50(4): 773–788 (2009) doi:10.1093/pcp/pcp027, available online at www.pcp.oxfordjournals.org
© The Author 2009. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists.
All rights reserved. For permissions, please email: [email protected]
A problem in previous bioinformatic studies (Abdallah
et al. 2000, Martin et al. 2002) was the use of a single species
of plant, A. thaliana, and the fact that sequence comparison
was made using A. thaliana genes as queries against genes of
other organisms. Even though multiple species of cyanobacteria were used, the relationship between various species of
cyanobacteria was not assessed. Other studies on comparative genomics of cyanobacteria and algae or plants also used
simple species–species comparison (Mulkidjanian et al.
2006, Reyes-Prieto et al. 2006). In contrast, all-against-all
comparison is a preferred method that can correctly classify
proteins according to their similarity (Sato 2002). The red
alga Cyanidioschyzon merolae (Matsuzaki et al. 2004) as well
as many other photosynthetic eukaryotes can be used for
comparative genomics. The use of genomes of multiple photosynthetic eukaryotes as well as of various cyanobacteria in
an all-against-all comparison will give an unbiased estimate
of the genes that are shared by cyanobacteria and photosynthetic eukaryotes, and thus the nuclear-encoded chloroplast
proteins of endosymbiont origin (abbreviated as CPRENDOs:
Sato et al. 2005).
We present here results of a pilot functional analysis of
nuclear-encoded chloroplast proteins based on eight
cyanobacteria, a land plant and a red alga. The study consists
of bioinformatic estimation of putative CPRENDOs and
experimental verification of their chloroplast localization in
Arabidopsis. It also includes initial functional analysis of
CPRENDOs in both Arabidopsis and Synechocystis. The aim
of the present study is not to report detailed results on individual proteins, but to present the methodology as a new
approach in photosynthesis and chloroplast research.
Results
Estimation of CPRENDOs
An all-against-all BLASTP search was performed on a data
set (CZ16X) comprising all predicted proteins (102,513
sequences excluding duplication, as of December 2002) in 17
organisms including eight cyanobacteria, three photosynthetic bacteria, two non-photosynthetic bacteria, two non-photosynthetic eukaryotes and two photosynthetic
eukaryotes (A. thaliana and C. merolae). The genomes used
in the analysis are listed in Supplementary Table S1.
Homolog groups were constructed by single-linkage clustering with several different threshold E-values using the Gclust
software version 3.0. The method of clustering was briefly
described in previous publications (Sato 2002, Sato et al.
2005). We used the results of clustering with E-values 10⁻⁸, 10⁻¹²
and 10⁻²⁰. The homolog groups that are shared by all of the
eight cyanobacteria and the two photosynthetic eukaryotes
but not by other organisms were selected for each E-value,
and the groups were combined. We thus obtained 65
homolog groups that were specific for photosynthetic organisms (Supplementary Table S2). This clustering was done
using all the proteins coded for by the nuclear genome as
well as organellar genomes and, therefore, some groups contained plastid-encoded proteins and cyanobacterial proteins, such as photosynthetic reaction center proteins, PsbB,
PsbC (Group ID 1 in Supplementary Table S2), PsaA and
PsaB (Group ID 3). However, other groups contained nuclear-encoded proteins of the plant and the alga with cyanobacterial homologs. Among the 65 selected groups, 28 groups
contained known proteins involved in photosynthesis or
chloroplast biogenesis. Finally, 37 homolog groups, in which
the proteins in A. thaliana and C. merolae are encoded by
the nucleus and have not been assigned a well-defined function, were selected as targets for further functional analysis.
These homolog groups included 56 A. thaliana proteins and
43 Synechocystis proteins (Table 1). Each of the homologous
protein groups was assigned a CPRE number. An alignment
of an example homolog group is shown in Fig. 1. In this
example, the member proteins were highly similar to one
another, but each of the proteins of A. thaliana and C. merolae
had an N-terminal extension. Among the 56 A. thaliana
sequences, 48 had an N-terminal extension. Three programs
were tested for the prediction of intracellular localization
(Table 2). All but seven of the
56 proteins were predicted to be localized in chloroplasts by
at least two prediction programs. Note that these results
were obtained with the most recent software. Only an old
version of TargetP was available at the start of the present
study. Based on these results, we considered these proteins
as putative CPRENDOs, and analyzed protein targeting and
light-regulated gene expression.
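The single-linkage step described above can be illustrated with a short sketch. This is not the Gclust implementation; it is a minimal union-find example that assumes the BLASTP hits are available as (query, subject, E-value) tuples. The protein identifiers and E-values are invented for illustration only.

```python
def single_linkage(hits, threshold):
    """Group proteins connected by any chain of hits with E-value <= threshold."""
    parent = {}

    def find(x):
        # Register unseen proteins as their own group, then follow links to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        root_q, root_s = find(query), find(subject)
        if evalue <= threshold and root_q != root_s:
            parent[root_q] = root_s  # merge the two groups

    groups = {}
    for protein in parent:
        groups.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical all-against-all hits: (query, subject, E-value)
HITS = [
    ("syn_a", "ath_a", 1e-30),
    ("ath_a", "cme_a", 1e-15),
    ("cme_a", "eco_x", 1e-9),
    ("eco_x", "eco_y", 1e-5),
]

for threshold in (1e-8, 1e-12, 1e-20):
    print(threshold, single_linkage(HITS, threshold))
```

In this toy data set a stricter threshold splits the group into smaller ones, which is one reason why a selection made at a single E-value can miss groups that are found at another.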
Initial functional analysis of putative CPRENDOs in
Arabidopsis
Intracellular targeting of these A. thaliana proteins was analyzed using green fluorescent protein (GFP) fusion proteins
in onion epidermis with particle bombardment. This is a
heterologous system; however, we obtained no unexpected
results that contradicted previously reported biochemical
or immunological data in Arabidopsis or Cyanidioschyzon
proteins (Moriyama et al. 2008). Of the 56 proteins analyzed,
54 proteins were targeted to plastids (Fig. 2 and Table 2).
Seven of these were also targeted to mitochondria (see also
Table 3). A complete set of fluorescence micrographs is presented in Supplementary Fig. S1. Two proteins in CPRE 7,
consisting of four paralogs, showed no clear localization, and
are possibly cytoplasmic proteins. Note that these two cytoplasmic proteins (At2g43910 and At2g43920) are included
in a different cluster in a recent database (see the rightmost
column in Table 1: ALL95 cluster). The data were consistent
Table 1 List of selected CPRENDOs analyzed in the present study
CPRE Group ID with
threshold value
10⁻⁸ 10⁻¹² 10⁻²⁰
1
357 416
Hypothetical
2
322
453
Hypothetical (ClpS homolog)
ssl3379
ssr2723
3
4
285
366
482
530
1081
822
Ycf52 (acetyltransferase)
Hypothetical (DUF1350)
sll0286
slr1699
<5>
<6>
7
477
607
602
579
698
718
616
762
810
ATAB2 (Tab2 homolog)
Hypothetical [TIC21/PIC1]
Hypothetical
sll2002, slr1110
sll1656
slr1926
Cme
CMQ124C,
CMG076C
–
CML200C,
CMJ276C
cp
CMA083C,
CMD109C,
CMR396C
CMH188C
CMN128C
CMS285C
8
9
10
746
772
798
835
1064
slr0959
sll1586, sll1265
slr0565
<11> 630
802
12
<18>
13
751
14
803
15
786
856
1075
859
941
952
(CAAX N-terminal protease)
Hypothetical
(Vitamin K epoxide reductase
homolog)
Probable ferredoxin (2Fe–2S)
[NDF4]
Ycf19 homolog [CCB3]
16
777
17
915
<19> 901
953
1020
1076
20
21
22
879
23
24
911
25
918
<26>
330
27
478
28
593
1085
1095
1119
1120
1135
1145
406
29
30
31
<32>
33
706
816
913
919
1178
1055
1210
1252
1219
1238
1251
1226
1168
1121
236
34
<35>
1014
1094
36
37
1144
1156
Annotation in database
[recent annotation]
Gene identifier
Syn
slr1638, slr1674
ALL95
cluster
Ath
1g63610, 2g14910, 5g14970 2360, 5850
–
1g68660
4073
702
1g26220, 1g32070
3g43540, 5g47860
1993
1889
CMQ405C
CMO228C
CMO209C
3g08010
2g15290
3g59870, 2g43940
2g43910, 2g43920
2g20725, 5g60750, 1g14270
2g25660
4g35760
2082
1965
2379
5724
2754
2006, 11537
2536
ssl3044
CME070C
3g16250
1465
3g07430, 4g27990
5g36120
1g21350
5g55710, 2g47840
5g52970
2090
2310
2593
Ycf65(PSRP-3)
Hypothetical
Psb29/Thf1
slr0923
sll0295
sll1414
cp, CMC030C
CMT057C
CMP081C
cp, CMS050C
CMS436C,
CMP233C
cp
CMP136C
CME041C
411
Hypothetical
Ycf60
Hypothetical
ssr2142
ssl0353
sll1289
sll1737
sll1071
1g68590, 5g15760
2g45990
2g20890
2386
2697
2687
(Rubredoxin homolog)
Hypothetical
Hypothetical [ape1 locus]
Hypothetical (NnrU homolog)
Hypothetical
Hypothetical
PcyA
HY2
Ycf20
Hypothetical (cyclase/dehydrase)
slr2033
slr1702
slr0575
slr1599
sll0157
Slr0815
Slr0116
–
Sll1509
slr0941
1g54500
5g27560
5g38660
1g10830
1g29700
3g17930
–
3g09150
1g65420, 3g56830, 5g43050
1g02470
1096
2692
2597
2911
2784
2810
3570
3705
1524
2163
2g20920, 3g51140, 5g23040
4g19100, 5g52780
3g26580
3g26710
1g12250
4316
2685
5595
3972, 14849
528
1g19740, 1g75460
4g25910, 5g49940
996
285
1g78620
5g17660
1510
536
Hypothetical [CDF1]
Hypothetical
Hypothetical (TPR region)
Hypothetical [CCB1]
Hypothetical (PPR homolog)
slr1918
sll0933
slr1052
slr0589
sll0301, sll0577,
sll0274
ATP-dependent proteinase (LON) sll0195
NifU/NFU2,3
ssl2667
Hypothetical
YggH (tRNA methyltransferase)
homolog
sll0875
sll1300
CMS181C
CMO077C
CMC040C
CMQ364C
CMJ095C
CMQ319C
CMG110C
–
CMT591C
CMD122C,
CMD157C
CMA064C
CML309C
CMD175C
CMM306C
CMO201C,
CMQ266C, etc.
CMD100C
CMK204C,
CMJ205C,
CMP295C
CMS030C
CMQ129C,
CMT407C
CPRENDOs that were identified by other groups after the start of this project are marked by <brackets>, and annotations are given in bold. Syn, Synechocystis sp. PCC 6803;
Cme, Cyanidioschyzon merolae; Ath, Arabidopsis thaliana. cp, proteins encoded by the chloroplast genome. A CPRE number was assigned to each homologous protein
group. Threshold indicates the level of E-value that was used to estimate each protein group. ALL95 cluster indicates the cluster number in a more recent Gclust database.
Fig. 1 An example of a homolog group specific to photosynthetic organisms consisting of unknown proteins. The alignment was prepared by the
Clustal X program (Thompson et al. 1994).
with the proteomic studies on various chloroplast fractions
(Table 2): namely, 37 of the 56 proteins examined, belonging to 29 homolog
groups, have been detected
in at least some chloroplast fractions. It is known that the
proteomics data alone should not be considered as conclusive evidence for the presence of these proteins in chloroplasts, because many non-chloroplast proteins have been
detected in chloroplast fractions. The GFP data indicated
that even the putative CPRENDOs that were not predicted
to be chloroplast proteins by targeting prediction or that
had not been detected in chloroplasts by mass spectrometry
were indeed localized to chloroplasts. We conclude, therefore, that we identified 54 CPRENDOs.
The transcript level of the genes encoding the 56 proteins
(CPREs) was also analyzed by RNA gel blot analysis. The transcript level of 39 genes was higher in the light than
in the dark (Table 3), suggesting light-promoted expression of
these genes (Fig. 2, Table 2 and Supplementary Fig. S1).
Table 3 summarizes the targeting and expression analyses.
Among the 56 candidate proteins, 38 (= 34 + 4) showed both
chloroplast localization and light-promoted expression.
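The counts in Table 3 can be cross-checked with a few lines of arithmetic. The sketch below transcribes the Table 3 counts into a dictionary (the key names are ours, chosen for illustration) and recomputes the row totals, the column totals and the number of plastid-targeted proteins with light-promoted expression.

```python
# Counts transcribed from Table 3 (rows: GFP localization; columns: expression class).
TABLE3 = {
    "Plastids": {"light > dark": 34, "light = dark": 8, "not detected": 5},
    "Plastids and mitochondria": {"light > dark": 4, "light = dark": 0, "not detected": 3},
    "Cytoplasm": {"light > dark": 1, "light = dark": 1, "not detected": 0},
}

# Number of proteins in each compartment.
row_totals = {loc: sum(cells.values()) for loc, cells in TABLE3.items()}

# Number of genes in each expression class, summed over compartments.
column_totals = {}
for cells in TABLE3.values():
    for expression_class, count in cells.items():
        column_totals[expression_class] = column_totals.get(expression_class, 0) + count

# Proteins targeted to plastids (with or without mitochondrial targeting).
plastid_rows = ("Plastids", "Plastids and mitochondria")
plastid_targeted = sum(row_totals[row] for row in plastid_rows)
plastid_and_light = sum(TABLE3[row]["light > dark"] for row in plastid_rows)

print(row_totals)
print(column_totals)
print(plastid_targeted, plastid_and_light)
```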
Analysis of Synechocystis mutants
Disruptants of the corresponding genes (called sCPRE) were
prepared in Synechocystis sp. PCC 6803 by homologous
recombination using a kanamycin resistance cassette from
pUC4K. Among the 43 genes, 33 were knocked out, while
seven genes were not completely disrupted and are likely to
be essential. Constructs for three genes could not be made
despite repeated attempts, either because of experimental
failure or due to sequence differences in the strain used. The
kinetics of fluorescence induction were measured for all
the mutants as an initial survey (Supplementary Fig. S2).
The abbreviation ‘FI’ in Fig. 3 indicates that the kinetics in the mutant
differed from those in the wild type. To analyze the kinetic properties of
photosynthesis in more detail, we performed pulse amplitude modulation (PAM) fluorescence analysis (for a review,
see Schreiber 2004) on selected mutants. These included the
mutants that showed some differences in the fluorescence
kinetics (FI), and several apparently normal mutants. Representative traces for the PAM analysis are shown in Fig. 4. The
maximal level of fluorescence (Fm) was obtained by the addition of DCMU at the end of each measurement (Fujimori
et al. 2005). A disruptant of sCPRE36 showed significantly
reduced Fv′ while retaining F and Fm values. A disruptant of
sCPRE30 showed an elevated level of F, thus exhibiting very
low Fv/Fm. Traces from PAM analysis for all 28 mutants
analyzed are presented in Supplementary Fig. S3, and the
results are summarized in Fig. 3. The mutants that showed
normal fluorescence kinetics in the initial survey were not,
Table 2 Summary of expression and targeting analyses
CPRE
1
2
3
4
<5>
<6>
7
8
9
10
<11>
12
13
14
15
16
17
<18>
<19>
20
21
22
23
24
25
<26>
27
28
29
30
31
<32>
33
34
<35>
36
37
AGI code
1g63610
2g14910
5g14970
1g68660
1g26220
1g32070
3g43540
5g47860
3g08010
2g15290
3g59870
2g43945
2g43920
2g43910
2g20725
5g60750
1g14270
2g25660
4g35760
3g16250
3g07430
4g27990
1g21350
5g55710
2g47840
5g52970
1g68590
5g15760
2g45990
5g36120
2g20890
1g54500
5g27560
5g38660
1g10830
1g29700
3g17930
3g09150
1g65420
5g43050
3g56830
1g02470
2g20920
3g51140
5g23040
4g19100
5g52780
3g26580
3g26710
1g12250
1g19740
1g75460
5g49940
4g25910
1g78620
5g17660
Prediction results
Predotar PSORT
cp
cp
cp
cp
cp
cp
cp
cp
[mt?]
cp
cp
cp
ER
cp
cp
mt
[mt?]
cp
cp
cp
ER
cp
ER
cp
None
Cytosol
None
cytosk
cp
plasma
cp
Cytosol
ER
cp
cp
cp
[mt?]
cp
cp
cp
cp
cp
cp
cp
cp
cp
Vacuole
None
cp
cp
cp
cp
cp
cp
cp
cp
None
Cytosol
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
[cp?]
plasma
cp
cp
[cp?]
Vacuole
cp
cp
cp
cp
cp
mt
cp
cp
none
cp
cp
cp
mt
plasma
ER
cp
cp
extr
cp
cp
cp
cp
cp
cp
cp
cp
None
cp
cp
cp
cp
cp
cp
cp
TargetP
cp
cp
cp
cp
cp
cp
mt
cp
cp
cp
cp
cp
None
None
cp
cp
cp
cp
cp
cp
cp
cp
None
cp
cp
cp
cp
cp
None
cp
cp
cp
cp
cp
cp
cp
cp
cp
None
cp
cp
cp
cp
cp
cp
None
mt
cp
cp
cp
cp
cp
cp
cp
cp
cp
Proteome references GFP localization
Expression in light
Expression in dark
10
++
++
++
++
++
++
++
++
++
++
++
–
++
++
++
–
++
–
++
++
++
++
++
++
+
++
++
++
++
++
++
++
++
++
++
++
++
++
–
–
–
–
++
++
++
+
++
++
++
++
++
++
++
++
++
–
–
++
+
+
–
–
++
++
++
–
+
–
+
++
+
–
–
–
+
–
++
++
–
+
+
+
–
–
–
+
+
–
+
+
–
–
–
–
–
–
–
–
–
–
+
–
+
++
+
–
–
–
+
–
+
–
10
10
10
10
10
4, 10
10
10
10
6
4, 10
3, 10
10
5, 10
3, 4, 6, 9, 10
1, 2, 6, 10
5, 6, 10
8, 10
5, 10
4, 6, 7, 8, 10
5, 6, 7, 10
6, 7, 10
10
10
5, 7
3, 4, 10
4, 10
10
7, 10
7
10
5, 6, 7, 10
10
10
10
3, 10
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
Cytosol
Cytosol
cp
cp
cp, mt
cp
cp
cp
cp
cp
cp
cp
cp
cp, mt
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp
cp, mt
cp, mt
cp, mt
cp
cp
cp
cp
cp
cp (mt?)
cp
cp
cp
cp
cp
cp
cp
cp, mt
cp, mt
Predicted results that were different from the GFP results are marked by bold italic. CPREs that have been identified and reported by other groups are marked by
<brackets>.
Abbreviations: cp, chloroplasts; mt, mitochondria; nu, nuclear; cytosk, cytoskeleton; plasma, plasma membrane; extr, extracellular; ER, endoplasmic reticulum.
References: 1, Schubert et al. (2002); 2, Peltier et al. (2002); 3, Ferro et al. (2003); 4, Froehlich et al. (2003); 5, Friso et al. (2004); 6, Kleffmann et al. (2004); 7, Peltier et al.
(2004); 8, Peltier et al. (2006); 9, Dunkley et al. (2006); 10, Zybailov et al. (2008). We also used the PPDB (Sun et al. 2004) for the proteomics data.
in principle, analyzed by PAM. This was justified by the fact
that six such ‘normal’ mutants (sCPRE 5, 11, 12, 13, 14 and 19)
were also normal in the PAM analysis.
Eleven mutants showed both increased qN and decreased
ΦPSII. Two others showed elevated qN, while two others
showed reduced ΦPSII. These 15 mutants also showed other
symptoms in qP, NPQ, Fv/Fm or F0′/Fm. These results suggested that the mutants had defects in photosynthetic electron transport or photosystems. The results of oxygen
evolution and spectral analysis are summarized in Table 4.
Oxygen evolution activity (per Chl) was low in three mutants,
whereas seven mutants showed elevated O2 evolution per
Chl. This must be due to a reduced content of Chl, most
probably the antenna Chl of PSI. The ratio of carotenoid to
Chl and the ratio of phycobilin to Chl, as estimated from the
absorbance ratio, A492/A680 and A626/A680, respectively, were
also affected in 18 mutants. It should be noted that the five
mutants that did not show significant difference in PAM
fluorescence showed changes in the absorption spectrum.
Interestingly, the seven mutants that have not been
completely segregated did show phenotypes in PAM fluorescence, oxygen evolution or pigment composition.
Fig. 2 Representative results of targeting of GFP fusion proteins and
RNA gel blot analysis. Results for CPRE14, 16 and 27 are shown.
Disruption of these genes gave visible phenotypes in later analysis.
Complete data are presented in Supplementary Fig. S1.
[Fig. 2 panel labels: GFP and RNA-blot (L, D; 18S rRNA) for At5g55710 and At2g47840 (CPRE14), At1g68590 (CPRE16) and At1g65420 (CPRE27; ND). Bar = 10 µm.]
Table 3 Localization of 56 putative CPRENDOs and expression of
the corresponding genes in A. thaliana
Localization                           Expression
Compartment                 Number     Light > dark   Light = dark   No detectable expression
Plastids                    47         34             8              5
Plastids and mitochondria   7          4              0              3
Cytoplasm                   2          1              1              0
Total                       56         39             9              8
Analysis of Arabidopsis mutants
T-DNA insertion lines (tag-lines) of A. thaliana that tagged
the CPRE genes were obtained from The Arabidopsis Information Resource (TAIR) and analyzed. Among the 37 CPRE
groups, we obtained data for 18 groups (Fig. 3), while results
on eight other mutants were reported during the course of
the present study (see the next section). Mutants were not
obtained for seven CPREs because no T-DNA insertion lines
were available in the stock centers, and only heterozygous
lines were obtained for four CPREs. Mutants of a gene of
CPRE27 (At1g65420) had variegated cotyledons and foliage
leaves (Fig. 5B, C). A mutant of CPRE16 (At1g68590)
had non-green cotyledons on sucrose-containing medium
(Fig. 5D). PAM analysis showed that mutants of nine CPREs
had some defects in photosynthesis (Fig. 3 and Supplementary Fig. S4). Representative traces for the PAM analysis are
shown in Fig. 6. In our current analysis, 17 of the 30 CPREs
for which tag-line stocks were available showed
phenotypes that were considered to be related to
photosynthesis.
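The PAM parameters referred to in this section (qP, qN, NPQ, Fv/Fm, ΦPSII and Fv′/Fm′) are ratios of a few measured fluorescence levels. The following is a minimal sketch based on the commonly used textbook definitions; the text above does not list the formulas that were applied, so this correspondence is an assumption, and the input values are invented for illustration.

```python
def pam_parameters(f0, fm, f, fm_prime, f0_prime):
    """Compute common PAM fluorescence parameters.

    f0, fm: minimal and maximal fluorescence of the dark-adapted sample
    f: steady-state fluorescence under actinic light
    fm_prime, f0_prime: maximal and minimal fluorescence in the light-adapted state
    """
    return {
        "Fv/Fm": (fm - f0) / fm,                          # maximal PSII quantum yield
        "PhiPSII": (fm_prime - f) / fm_prime,             # effective PSII quantum yield
        "qP": (fm_prime - f) / (fm_prime - f0_prime),     # photochemical quenching
        "qN": 1 - (fm_prime - f0_prime) / (fm - f0),      # non-photochemical quenching coefficient
        "NPQ": (fm - fm_prime) / fm_prime,                # Stern-Volmer type quenching
        "Fv'/Fm'": (fm_prime - f0_prime) / fm_prime,      # efficiency of open PSII centers
    }


# Hypothetical fluorescence levels in arbitrary units.
params = pam_parameters(f0=1.0, fm=2.0, f=1.2, fm_prime=1.8, f0_prime=0.95)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

With these definitions, ΦPSII equals the product of qP and Fv′/Fm′. The wild-type Synechocystis entries in Fig. 3 (0.820, 0.449 and 0.369) appear consistent with this identity to within rounding if they are read as qP, Fv′/Fm′ and ΦPSII, respectively.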
Proteins that were reported after the start of this
project
This project began 5 years ago and, during this time, some of
the proteins have been characterized in detail and the
reports have been published (Fig. 3). These are PIC1/Tic21
(iron transporter or translocon component: CPRE6, Teng
et al. 2006, Duy et al. 2007), NDF4 [NAD(P)H dehydrogenase
component: CPRE11, Takabayashi et al. 2009], Tab2/ATAB2
(RNA-binding protein involved in psaB translation: CPRE5,
Dauvillée et al. 2003, Barneche et al. 2006), Psb29/THF1 (PSII
component involved in thylakoid formation: CPRE19,
Wang et al. 2004, Keren et al. 2005), PcyA/PebB/HY2 (phytochromobilin/phycobilin biosynthesis enzyme: CPRE26,
Kohchi et al. 2001, Frankenberg and Lagarias 2003), CCB1
(CPRE32) and CCB3 (CPRE18) (cytochrome c biogenesis
enzyme involved in b6 f complex assembly: Lezhneva et al.
2008) and NifU/NFU2/NFU3 (iron–sulfur cluster assembly
enzyme: CPRE35, Nishio and Nakai 2000, Léon et al. 2003,
Touraine et al. 2004). Some other proteins were found to be
homologs of proteins in other organisms, such as NnrU
homolog (nitric oxide reductase: CPRE23, Bartnikas et al.
1997), VKOR homolog (vitamin K epoxide reductase:
CPRE10, Goodstadt and Ponting 2004), ClpS homolog (component of Clp machinery in Escherichia coli: CPRE2, Dougan
et al. 2002) and yggH homolog (tRNA methyltransferase:
CPRE37, De Bie et al. 2003). APE1 (uncharacterized protein
involved in acclimation to high light: CPRE22, Walters et al.
2003) was described as a gene involved in a light acclimation
defect, but biochemical analysis of the protein was not
reported. CDF1 (CPRE29, Kawai-Yamada et al. 2005) was
described only as a transgene in yeast cells. Other proteins
that have Ycf numbers or annotations have not been
analyzed in plants to date. Curiously, mutants in Psb29/
THF1 and APE1 did not show detectable phenotypes in
Synechocystis with the methods of analysis and growth
conditions used in the present study. According to the
publications on these proteins, the mutants showed
phenotypes under high light conditions. This suggests that more
of the mutants that were analyzed in the present study may
show phenotypes under high light or other extreme
conditions.
Discussion
In the present study, we describe a new approach of functional orthogenomics in photosynthetic organisms, which
involves comprehensive clustering of all proteins conserved
in photosynthetic organisms and functional analysis of the
genes or proteins that have been selected. The two parts are
discussed separately.
Protein clustering and phylogenetic profiling
The informatics part of the study is a combination of protein
clustering and phylogenetic profiling. Phylogenetic profiling
(Pellegrini et al. 1999) relies on the availability of various
genome sequences of both photosynthetic and non-photosynthetic organisms. Eight cyanobacterial genome sequences
were already available at the start of this project and now
tens of cyanobacterial sequences are available. However,
Arabidopsis (Arabidopsis Genome Initiative 2000) and
Cyanidioschyzon (Matsuzaki et al. 2004) were, until recently,
the only genome sequences of photosynthetic eukaryotes.
We used the Cyanidioschyzon data for the clustering prior to
publication, and preliminary results of the cluster analysis
were presented in the paper reporting the sequence data
(Matsuzaki et al. 2004). The use of two eukaryotic genomes
improved the quality of estimation of conserved proteins in
photosynthetic organisms. Many other studies have used
Arabidopsis as a pivot in searching for homologs (Abdallah
et al. 2000, Martin et al. 2002), and many of the proposed
proteins of cyanobacterial origin were predicted to be localized outside chloroplasts. The use of complete clustering
based on all-against-all BLASTP analysis (Sato 2002, Sato
et al. 2005) resulted in a smaller number of protein clusters
that are conserved in photosynthetic organisms, but they
are indeed localized to chloroplasts as revealed in the present study. Some recent reports (Keren et al. 2005, Duy et al.
2007, Lezhneva et al. 2008, Takabayashi et al. 2009) also used
informatics to identify potential genes for chloroplast
proteins, but these studies were based on simple comparisons of single genes. The present study clearly shows that
comprehensive analysis of all CPRENDOs is now feasible and
effective in finding new chloroplast proteins.
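The phylogenetic profiling step, that is, keeping only the homolog groups present in every photosynthetic organism of interest and absent from all other organisms, can be sketched as a simple set filter. This is an illustrative example and not the procedure implemented in Gclust; the species lists are shortened stand-ins for the 17 genomes, and the group and protein identifiers are hypothetical.

```python
def conserved_only_in(groups, required, excluded):
    """Return IDs of groups with a member in every required species
    and no member in any excluded species."""
    selected = []
    for group_id, members in groups.items():
        species = {sp for sp, _protein in members}
        if required <= species and not (species & excluded):
            selected.append(group_id)
    return selected


# Shortened, hypothetical species sets (the study used eight cyanobacteria,
# two photosynthetic eukaryotes and seven other organisms).
REQUIRED = {"Synechocystis", "Anabaena", "Arabidopsis", "Cyanidioschyzon"}
EXCLUDED = {"Escherichia", "Saccharomyces"}

# Hypothetical homolog groups: group ID -> list of (species, protein ID)
GROUPS = {
    "g1": [("Synechocystis", "s1"), ("Anabaena", "a1"),
           ("Arabidopsis", "t1"), ("Arabidopsis", "t2"), ("Cyanidioschyzon", "c1")],
    "g2": [("Synechocystis", "s2"), ("Anabaena", "a2"), ("Arabidopsis", "t3"),
           ("Cyanidioschyzon", "c2"), ("Escherichia", "e1")],
    "g3": [("Synechocystis", "s3"), ("Arabidopsis", "t4")],
}

print(conserved_only_in(GROUPS, REQUIRED, EXCLUDED))
```

In this toy example only g1 passes the filter: g2 has a homolog in an excluded organism, and g3 is missing from two of the required organisms. Paralogs (t1 and t2 in g1) do not affect the profile, because only presence or absence per species is considered.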
The publication of genome sequences of Chlamydomonas
reinhardtii (Merchant et al. 2007) and Physcomitrella patens
(Rensing et al. 2008), as well as other genomes, has greatly
changed the situation, and made phylogenomic predictions
more practical (Bowman et al. 2007, De Crécy-Lagard and
Hanson 2007). We recently constructed a comparative
genomic database involving 95 organisms including all available data of photosynthetic organisms (data set ALL95),
based on a more sophisticated algorithm (Sato 2009), which
has been made publicly accessible at http://gclust.c.u-tokyo.
ac.jp. The results are included in Table 1 in the rightmost
column. The use of the ALL95 data set expands the scope of
phylogenetic profiling, such as the proteins conserved in
green plants or proteins conserved only in photosynthetic
eukaryotes.
An important characteristic of the clustering reported in
the present article is that the data include not only nuclear- but also organellar-encoded proteins. Therefore, we find
that some proteins are nuclear encoded in Arabidopsis while
homologs are encoded by the chloroplast genome in Cyanidioschyzon, such as CPRE 3, 12, 14 and 16 (Table 1). This is a
consequence of the fact that many genes originating from
the cyanobacterial endosymbiont remained encoded in the
chloroplast genome in red algae, reflecting differences in
discontinuous plastid evolution in the green and red lineages
(Sato 2001, Sato 2006).
Functional genomic analysis
The goal of the current functional analysis has been to determine whether the proteins conserved in photosynthetic
organisms, including cyanobacteria and plants or algae, are
really chloroplast proteins that are involved in photosynthesis
directly or indirectly. We can now say yes to this question.
The analysis of localization showed that almost all putative
CPRENDOs are indeed localized to chloroplasts, with the
exception of some paralogs. Expression of many of them was
[Fig. 3 (first part, CPRE 1 to 17): tabular summary of the mutant analysis. Columns: CPRE; Species (Syn or Ath); Gene; SALK line; Phenotype [reference]; PAM parameters qP, qN, ΦPSII, NPQ, Fv/Fm and Fv′/Fm′; n. Wild-type values: Syn (glucose tolerant), qP 0.820, qN 0.173, ΦPSII 0.369, NPQ 0.069, Fv/Fm 0.508, Fv′/Fm′ 0.449, n = 21; Ath (Columbia), qP 0.874, qN 0.336, ΦPSII 0.568, NPQ 0.360, Fv/Fm 0.839, Fv′/Fm′ 0.649, n = 21. Values for mutants are given as percentages of the wild-type values.]
Fig. 3 Summary of the analysis of CPRE mutants. $, incomplete segregation (Syn: Synechocystis) or heterozygous line (Ath: Arabidopsis); FI, anomaly in fluorescence induction (Supplementary Fig. S2); SG, slow growth on agar plate; LS, light sensitive on agar plate; normal, no apparent phenotype; n, number of determinations. If a gene name is given in ‘Phenotype’, a report on the gene function had appeared during the work, and the gene was no longer analyzed (the CPRE number is highlighted by <brackets>). If no visible phenotype or no difference in fluorescence kinetics was found in Synechocystis mutants, PAM analysis was not carried out. All Arabidopsis homozygous mutants were subjected to PAM analysis. Each value in red or blue indicates an increase or decrease, respectively, at the significance level of 5% (Ath) or 1% (Syn). References: [1] Barneche et al. (2006); [2] Teng et al. (2006), Duy et al. (2007); [3] Takabayashi et al. (2009); [4] Wang et al. (2004), Keren et al. (2005); [5] Walters et al. (2003); [6] Kohchi et al. (2001); [7] Lezhneva et al. (2008); [8] Touraine et al. (2004).
Plant Cell Physiol. 50(4): 773–788 (2009) doi:10.1093/pcp/pcp027 © The Author 2009.
Fig. 4 PAM analysis of Synechocystis mutants. Measurement was started at time 0. Actinic light was provided as shown at 15 or 30% of the full power. Saturating pulses were applied at intervals. DCMU (10 µM) was added at the end of the measurement to obtain the Fm value. (A) Wild type; (B) sCPRE36 disruptant (sCPRE36::kanR); (C) sCPRE30 disruptant (sCPRE30::kanR).
Fig. 5 Visible phenotypes of cpre16 and cpre27 mutants in Arabidopsis. (A) Wild type (Columbia, Col); (B) cotyledon of the cpre27 mutant (salk_010998); (C) foliage leaves of the cpre27 mutant (salk_010998); (D) cotyledons of the wild type (Col, left) and the cpre16 mutant (salk_010806, right).
promoted by light. About half of the Synechocystis mutants
were affected in photosynthesis, as revealed by fluorescence,
oxygen evolution or pigment composition. Many of the
plant mutants showed defects in photosynthesis or related
processes. Based on these data, we can safely conclude that
the conserved proteins in photosynthetic organisms are
indeed CPRENDOs. This demonstrates the usefulness of
phylogenetic profiling in identifying proteins involved in
functions limited to certain groups of organisms, such as
plants of the green lineage, cyanobacteria or land plants.
However, we noticed interesting differences between the mutants of Synechocystis and Arabidopsis (Fig. 3). In most cases, the PAM fluorescence results differed distinctly between Synechocystis and Arabidopsis. The differences are graphically expressed as vectors in Fig. 7B. Each set of the six parameters of PAM analysis was taken as a vector, and the angle and the average size of the vectors for plant and cyanobacterial mutants of each CPRE were plotted. Most data deviated from the axis representing parallel phenotypes in the two organisms. One reason for this is the different mechanisms of energy dissipation in chloroplasts and cyanobacteria (Schreiber 2004, Fujimori et al. 2005). Another reason is that cyanobacteria are free-living organisms, whereas chloroplasts are located within the cell. The mutant data are summarized as a Venn diagram (Fig. 7A). In this figure, only the presence or absence of phenotypes in mutants of Arabidopsis and Synechocystis was used to classify the 37 CPRE genes. By this criterion, mutation of 15 CPRE genes resulted in phenotypes in both Arabidopsis and Synechocystis. Uncertainty remains for 10 CPREs, which are shown by the two 5s at the boundary of Arabidopsis. Mutations in five CPRE genes have phenotypes only in Arabidopsis. This result again suggests that the roles of many CPREs are different in chloroplasts and cyanobacteria.
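The vector comparison used for Fig. 7B can be sketched in a few lines of Python. This is a minimal illustration of the size and angle defined in the Fig. 7 legend, (|u||v|)^(1/2) and θ = cos⁻¹(u·v/(|u||v|)), not the authors' own script. The function names and the numerical values are hypothetical and are not taken from Fig. 3.

```python
import math


def deviation_from_wild_type(percent_values):
    """Convert percent-of-wild-type values into deviations from 100%."""
    return [p - 100.0 for p in percent_values]


def compare_vectors(u, v):
    """Return (size, angle in degrees) for two PAM deviation vectors.

    size  = sqrt(|u| * |v|)
    angle = arccos(u.v / (|u| * |v|)); None if either vector is zero.
    """
    if len(u) != len(v):
        raise ValueError("u and v must have the same number of parameters")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0, None  # no change in at least one organism
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.sqrt(norm_u * norm_v), math.degrees(math.acos(cos_theta))


# Hypothetical six-parameter results (percent of wild type), for illustration only.
ath = deviation_from_wild_type([103.0, 104.0, 100.0, 100.0, 100.0, 100.0])
syn = deviation_from_wild_type([106.0, 108.0, 100.0, 100.0, 100.0, 100.0])
size, theta = compare_vectors(ath, syn)
print(f"size = {size:.3f}, theta = {theta:.1f} degrees")
```

An angle near 0° corresponds to parallel changes in the two organisms, 90° to changes in different parameters, and 180° to opposite changes in the same parameters.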
Comprehensive analysis of the many proteins that were predicted to be CPRENDOs took a long time. Localization and expression analyses are feasible for a larger set of proteins, but analysis of mutants, especially of plants, must be performed one by one, and is time consuming. In addition, detailed analysis of mutants, either in plants or in cyanobacteria, requires specific growth conditions. We did not detect phenotypes in ape1 and thf1 mutants of Synechocystis under normal laboratory conditions of growth. This suggests that a more sophisticated analysis with a range of growth conditions is necessary to find clear phenotypes in CPRE mutants. Another problem is that not all mutants are currently available. The unavailable T-DNA insertion lines may show interesting phenotypes. We have now demonstrated that the proposed CPRENDOs are reasonable candidates for further detailed research; construction of knockdown mutants for all of these CPREs will be a promising strategy.
Prospects and conclusion
The combined informatic and experimental approach presented here as a pilot study is a model for future projects. The total number of CPRENDOs as estimated by a recent Gclust analysis is about 1,200 (Sato et al. 2005), which is not as large as the number (about 3,600) presented by Martin et al. (2002). We expect that comprehensive analysis of this number of genes is feasible. In addition, the present study shows the possibility of performing comprehensive analysis of phylogenetically conserved proteins, such as those conserved in nodulating bacteria and those conserved in green plants, among others. It is true that plant genomics has advanced since the start of this project, through the development of genetic tools, comprehensive analysis of co-expression and proteomics. However, integrated comparative protein data, at a level beyond simple homology searches, as described in the present report, remain important in complementing other resources.

Table 4 Oxygen evolution and spectral properties of Synechocystis mutants

Kazusa code  Strain  CPRE   Notes   O2 evolution   P (5%)  n   A492/A680  A626/A680
Wild type    –       –      –       210.6a (100%)  –       18  0.402      0.785
sll0286      6       3              111%           0.247   4   0.445      0.805
sll2002      15      <5>    ATAB2   104%           0.688   2   0.435      0.872
sll1656      17      <6>    PIC1    95%            0.574   5   0.542      0.889
slr0959      19      8              166%           0.000   5   0.454      0.901
sll1586      20      9              94%            0.524   3   0.493      0.767
sll1265      21      9              127%           0.028   3   0.411      0.796
ssl3044      23      <11>   NDF4    108%           0.326   5   0.403      0.824
sll1289      24      13             94%            0.568   2   0.451      0.798
sll1737      25      14     Ycf60   140%           0.000   3   0.432      0.780
sll1071      26      15             146%           0.000   3   0.444      0.970
slr0923      27      16     Ycf65   58%            0.000   5   0.424      0.741
ssl0353      4       <18>   Ycf19   64%            0.000   4   0.463      0.833
sll1414      29      <19>   Psb29   101%           0.924   3   0.427      0.781
slr2033      30      20             93%            0.334   4   0.553      1.000
slr1702      31      21             97%            0.699   3   0.531      0.768
slr0575      32      22             117%           0.044   4   0.424      0.772
slr1599      8       23             116%           0.069   5   0.388      0.780
slr0815      35      25             69%            0.000   6   0.430      0.835
slr0116      36      <26>   PcyA    125%           0.022   3   0.539      0.834
sll1509      37      27     Ycf20   88%            0.150   5   0.426      0.765
slr0941      38      28             99%            0.931   5   0.355      0.809
sll0933      33      30             133%           0.000   4   0.528      1.036
slr0589      41      <32>   CCB1    103%           0.696   5   0.399      0.836
sll0577      43      33             92%            0.324   4   0.520      0.770
ssl2667      46      <35>   NifU    100%           0.952   4   0.485      0.879
sll0875      47      36             126%           0.033   3   0.422      0.936
sll1300      48      37             116%           0.065   4   0.456      0.812

Values that are significantly different from wild-type values are underlined. CPREs that have been identified are marked by <brackets>.
Segregation: seven strains are marked ‘No’ in this column; their row assignment cannot be recovered from the flattened text.
aOxygen evolution in µmol mg⁻¹ Chl h⁻¹.
In conclusion, we demonstrated that phylogenetic profiling is effective in identifying hitherto undetected CPRENDOs. More efficient profiling, using improved protein clustering with more genomic data, should allow estimation of a nearly complete set of CPRENDOs. In addition, a similar approach could be applied to find various lineage-specific proteins, which might be involved in lineage-specific functions, such as chloroplast proteins of eukaryotic origin.

Fig. 6 PAM analysis of A. thaliana mutants. AL, actinic light; ML, measuring light. Saturating pulses were given as indicated. (A) Two traces of the wild type (Columbia); (B) a cpre16 mutant (SALK_010806); (C) a cpre14 mutant (SALK_013444); (D) a cpre27 mutant (green sector); (E) a cpre14 mutant. Traces show fluorescence intensity (arbitrary units) against time, with the levels Fo, Fm, F, Fm′ and Fo′ marked; scale bar, 5 min.
Materials and Methods
Estimation of CPRE
The genomes used in the present study are summarized in Supplementary Table S1. The sequence data were assembled as of December 2002. We also have more recent data (ALL95 in Table 1); however, the analyses reported in the present article were based on the data of 2002. The Gclust software version 3.0 was used to construct protein clusters by single-linkage clustering at a threshold E-value, such as 10⁻⁸, 10⁻¹² or 10⁻²⁰. The details of data processing were described in previous papers (Sato 2002, Sato et al. 2005). For the estimation of CPRENDOs, the clusters that had at least one member in each of the eight cyanobacteria, the plant and the red alga, but had no members belonging to non-photosynthetic organisms or non-oxygenic photosynthetic bacteria, were selected.
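The selection rule in the preceding paragraph amounts to a presence/absence filter over clusters. The following Python sketch illustrates that rule under simplifying assumptions. It is not part of Gclust: the function name, the organism labels and the toy clusters are all hypothetical, and only two cyanobacteria stand in for the eight used in the study.

```python
def select_conserved_clusters(clusters, required, excluded):
    """Return IDs of clusters present in every required organism
    and absent from every excluded organism.

    clusters: dict mapping cluster ID -> list of (organism, protein ID)
    required: set of organisms that must each contribute >= 1 member
    excluded: set of organisms that must contribute no member
    """
    selected = []
    for cluster_id, members in clusters.items():
        organisms = {organism for organism, _protein in members}
        if required <= organisms and not (organisms & excluded):
            selected.append(cluster_id)
    return selected


# Hypothetical organism labels (the study required eight cyanobacteria,
# the plant and the red alga; two cyanobacteria are used here for brevity).
required = {"cyano_1", "cyano_2", "plant", "red_alga"}
excluded = {"non_photosynthetic_1", "non_oxygenic_phototroph_1"}

# Hypothetical toy clusters.
clusters = {
    "c1": [("cyano_1", "p1"), ("cyano_2", "p2"), ("plant", "p3"), ("red_alga", "p4")],
    "c2": [("cyano_1", "p5"), ("cyano_2", "p6"), ("plant", "p7"),
           ("red_alga", "p8"), ("non_photosynthetic_1", "p9")],
    "c3": [("cyano_1", "p10"), ("plant", "p11"), ("red_alga", "p12")],
}

conserved = select_conserved_clusters(clusters, required, excluded)
print(conserved)
```

In this toy input, c2 is rejected because it has a member in an excluded organism, and c3 because one required cyanobacterium is missing. Organisms that belong to neither set do not affect the outcome.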
Prediction of intracellular targeting was performed for the selected Arabidopsis protein sequences using the TargetP server version 1.01 at http://www.cbs.dtu.dk/services/ (Emanuelsson et al. 2000), WoLF PSORT at http://wolfpsort.org/ (Horton et al. 2007) and Predotar at http://urgi.versailles.inra.fr/predotar/predotar.html (Small et al. 2004).
Growth of organisms
Wild-type and mutant A. thaliana were grown on 0.8% agar-solidified MS medium (Murashige and Skoog 1962) or on soil at 22°C under continuous illumination with fluorescent lamps (80 µmol m⁻² s⁻¹). T-DNA-tagged mutants of Arabidopsis were obtained from the Arabidopsis Biological Resource Center (Ohio State University, Columbus, OH, USA) or the Nottingham Arabidopsis Stock Centre (University of Nottingham, Loughborough, UK). Wild-type (glucose-tolerant strain) and mutants of Synechocystis were grown at 30°C in BG-11 medium (Rippka et al. 1979) supplemented with 5 mM sodium bicarbonate and 5 mM HEPES/NaOH (pH 7.5) under continuous illumination with fluorescent lamps (50 µmol m⁻² s⁻¹). Liquid cultures were aerated with 1.0% CO2 in air.
Intracellular localization of selected proteins
Targeting of selected A. thaliana proteins was experimentally analyzed using the GFP fusion technique. A construct for the expression of a transit peptide or a full-length protein fused with GFP was prepared for each selected protein by repeated PCR. The standard PCR was done in a 100 µl reaction with 2.5 U of ExTaq (TAKARA Biomedicals, Kyoto, Japan) using a program of 30 cycles, each consisting of denaturation at 93°C for 40 s, annealing at 55°C for 2 min and extension at 72°C for 2 min. After the final cycle, extension for 10 min was performed. First, the following three fragments were amplified by standard PCR: (1) the upstream half of the GFP vector containing the 35S promoter region plus a short connecting sequence A at its 3′ end; (2) the putative targeting sequence with a short connecting sequence A at its 5′ end and a short connecting sequence B at its 3′ end; and (3) the sGFP sequence with a short connecting sequence B at its 5′ end. The connecting sequences A and B were used to join two PCR fragments by PCR. For the amplification of fragments (1) and (3), the sGFP vector (Chiu et al. 1996) was used as a template. The primers for the amplification of fragment (1) were: primer 1, CCCTCAGAAGACCAGAGGGCTATTGAGACT; and primer 2, GGATCCTCTAGAGTCGAC. The primers for the amplification of fragment (3) were: primer 3, ATGGTGAGCAAGGGCGAG or GTGAGCAAGGGCGAGGAG; and primer 4, TCTCATGTTTGACAGCTTATCATCGGATCT. The underlined sequences represent connecting sequences A and B, respectively. The primers used for amplifying fragment (2) varied with the genes to be amplified and are summarized in Supplementary Table S3. In the second PCR, purified fragments (1), (2) and (3) were mixed and connected by amplification under conditions slightly different from the standard ones, i.e. the annealing temperature was 50°C. The amount of fragment (2) was twice that of fragment (1) or (3). The final product (1–2–3) was further amplified for use in particle bombardment.

Fig. 7 Statistical comparison of Arabidopsis and Synechocystis mutants. (A) Classification of CPREs according to observed phenotypes in Arabidopsis and Synechocystis. Numbers within the circles indicate the number of CPREs that showed phenotypes in either Arabidopsis, Synechocystis or both. The number outside the circles indicates CPREs that did not show phenotypes in either organism. Where there are no data for a CPRE, this is described as ‘unknown’. (B) Similarity analysis of the PAM results in Arabidopsis and Synechocystis. Each set of the six parameters of PAM analysis (or, more exactly, deviation from 100%) (Fig. 3) was taken as a vector, and the similarity of the vectors for Arabidopsis and Synechocystis mutants of a CPRE, u and v, respectively, was plotted as a vector having a size, (|u||v|)^(1/2), and an orientation θ = cos⁻¹(u·v/(|u||v|)), where u·v is the scalar product of u and v. The data are plotted in a hemi-circular space with an arbitrary unit. Angle θ is measured anti-clockwise from the right (as marked by 1). CPRE numbers are indicated for significant signals.
[Panel A values as extracted: 5, 15, 2, 5, 5, 4; unknown: 1. Panel B labels: Parallel, 1 (similar changes); Orthogonal (change in different parameters); Anti-parallel, −1 (reverse changes in similar parameters); 0 (no change).]
Tungsten particles (1 µm in diameter) were coated with
the DNA and then introduced into scaly leaves of onion bulb
by the He-driven particle delivery system PDS-1000/He
(BioRad Laboratories, Hercules, CA, USA), using rupture disks
for 650 p.s.i. After incubating for 24 h at 25°C in dim light, the
epidermis was peeled and examined under a fluorescence
microscope (Olympus model BX-60) with an IB cube. For the
control of chloroplast and mitochondrial proteins, cpRbcS
(Lee et al. 2002) and mitochondrial ATP synthase subunit δ
(Moriyama et al. 2008) were used, respectively.
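The sequence logic of the fusion PCR described in this section, in which three fragments are joined through the shared connecting sequences A and B, can be illustrated in silico. The Python sketch below only checks that adjacent fragments share the expected overlap and concatenates them. It does not model annealing or polymerase extension, and the fragment and connector sequences are hypothetical placeholders, not the sequences used in the study.

```python
def join_by_overlap(left, right, overlap):
    """Join two fragments that share `overlap` at the 3' end of `left`
    and the 5' end of `right`, keeping a single copy of the overlap."""
    if not overlap:
        raise ValueError("overlap must not be empty")
    if not left.endswith(overlap):
        raise ValueError("left fragment does not end with the connecting sequence")
    if not right.startswith(overlap):
        raise ValueError("right fragment does not start with the connecting sequence")
    return left + right[len(overlap):]


# Hypothetical connecting sequences and fragment bodies (illustration only).
CONNECT_A = "GGATCC"
CONNECT_B = "ATGGTG"

fragment_1 = "TTTACG" + CONNECT_A                 # promoter half + A at 3' end
fragment_2 = CONNECT_A + "CCCAAA" + CONNECT_B     # A + targeting sequence + B
fragment_3 = CONNECT_B + "AGCAAG"                 # B at 5' end + reporter

product = join_by_overlap(join_by_overlap(fragment_1, fragment_2, CONNECT_A),
                          fragment_3, CONNECT_B)
print(product)
```

The check on both ends mirrors the design constraint in the text: fragment (2) must carry A at its 5′ end and B at its 3′ end, otherwise the 1–2–3 product cannot form.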
RNA gel blot analysis
Seedlings of A. thaliana ecotype Columbia were grown on
0.2% agar-solidified MS medium (Murashige and Skoog
1962) for 7 d under light (50 µE m⁻² s⁻¹) or in darkness. Shoots
were harvested, rapidly frozen in liquid nitrogen, and stored
at –80°C until use. Digoxigenin (DIG)-labeled probes for the
RNA gel blot analysis were prepared by PCR using a DIG-PCR
labeling mixture (Roche Diagnostics, Mannheim, Germany).
The primers are listed in Supplementary Table S3. Preparation of total RNA, glyoxylation, electrophoresis, blotting to a
nylon membrane and hybridization were done as described
previously (Sekine et al. 2007). The band was finally
visualized by chemiluminescence of CDP-Star (Roche
Diagnostics).
Disruption mutagenesis in Synechocystis
The genes for CPRE were individually disrupted in Synechocystis sp. PCC 6803 by homologous recombination using a
PCR-based disruption cassette. The method was described
in a recent publication (Sakurai et al. 2007). The primers are
listed in Supplementary Table S4. Complete segregation
was confirmed by PCR analysis using the upF and dnR primers. In some cases, complete segregation was not attained,
but the partial mutants showing phenotypes were analyzed
along with the complete disruptants.
Fluorescence measurement
PAM fluorescence analysis was performed with the Fluorescence Monitoring System FMS1 (Hansatech Instruments Ltd., Norfolk, UK). In the analysis of Arabidopsis, leaves of 30-day-old wild type or mutants grown on soil under a 16 h light/8 h dark cycle at 23°C were used. Modulated measuring light at 594 nm was used at setting 2 with gain 70. Actinic light at setting 15 (corresponding to a fluence rate of 80 µmol m⁻² s⁻¹) was used to drive photosynthesis. Pulses (0.8 s) of white light at setting 100 (fluence rate of 8,000 µmol m⁻² s⁻¹) at 30 s intervals were applied to obtain maximal fluorescence.
The kinetics of fluorescence induction in Synechocystis were measured as described (Fujimori et al. 2005). The results were used to select candidates for further analysis. In the PAM analysis of Synechocystis, exponentially growing cells (A750 ∼0.5) were used. Modulated light was used at setting 1 with gain 70. Actinic light at setting 15 and a series of 0.2 s saturating pulses were applied. At the end of each measurement, DCMU (10 µM) was added with actinic light at setting 30 to obtain Fm.
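The six PAM parameters compared in Fig. 3 can be derived from the fluorescence levels marked in Fig. 6 (Fo, Fm, F, Fm′ and Fo′). The text does not state which formulas were used, so the sketch below assumes the commonly used definitions of these parameters. The class and function names and the input values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PamLevels:
    fo: float        # minimal fluorescence, dark-adapted
    fm: float        # maximal fluorescence, dark-adapted
    f: float         # steady-state fluorescence under actinic light
    fm_prime: float  # maximal fluorescence under actinic light
    fo_prime: float  # minimal fluorescence after actinic light


def pam_parameters(levels):
    """Compute six PAM parameters using commonly used definitions
    (assumed here; not confirmed by the source)."""
    fv = levels.fm - levels.fo
    fv_prime = levels.fm_prime - levels.fo_prime
    return {
        "Fv/Fm": fv / levels.fm,
        "PhiPSII": (levels.fm_prime - levels.f) / levels.fm_prime,
        "Fv'/Fm'": fv_prime / levels.fm_prime,
        "qP": (levels.fm_prime - levels.f) / fv_prime,
        "qN": 1.0 - fv_prime / fv,
        "NPQ": (levels.fm - levels.fm_prime) / levels.fm_prime,
    }


# Hypothetical fluorescence levels in arbitrary units.
example = PamLevels(fo=300.0, fm=1500.0, f=500.0, fm_prime=1000.0, fo_prime=250.0)
params = pam_parameters(example)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

With these definitions, PhiPSII equals the product of qP and Fv′/Fm′, which gives a quick internal consistency check on any set of measured levels.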
Oxygen evolution
Oxygen evolution of Synechocystis cells was measured polarographically in Oxytherm with an Oxygraph controller
(Hansatech Instruments Ltd.).
Absorption spectra
Absorption spectra of Synechocystis cells were measured by
the ‘opal glass’ method using a Shimadzu UV160 spectrophotometer. Two sheets of Parafilm were used as a light
scatterer, and the cuvettes were placed just in front of
the light detectors. Chlorophyll was determined in a 90%
methanol extract at 665 nm using the absorption coefficient
12.7 mM⁻¹ cm⁻¹.
Supplementary data
Supplementary data are available at PCP online.
Funding
Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan Grants-in-Aid (Nos. 17018010, 18017005,
20017006 and 16GS0304 to N.S.).
Acknowledgments
The authors thank former students T. Saito and A. Fukumoto in the laboratory for their help in the initial phase
of the work, and Y. Niwa, University of Shizuoka, for kindly
supplying us with the sGFP vector. We also acknowledge the
Arabidopsis Biological Resource Center and the Nottingham
Arabidopsis Stock Centre for Arabidopsis T-DNA-tag lines.
References
Abdallah, F., Salamini, F. and Leister, D. (2000) A prediction of the size
and evolutionary origin of the proteome of chloroplasts of
Arabidopsis. Trends Plant Sci. 5: 141–142.
Arabidopsis Genome Initiative (2000) Analysis of the genome sequence
of the flowering plant Arabidopsis thaliana. Nature 408: 796–815.
Barneche, F., Winter, V., Crèvecoeur, M. and Rochaix, J.D. (2006) ATAB2
is a novel factor in the signalling pathway of light-controlled
synthesis of photosystem proteins. EMBO J. 25: 5907–5918.
Bartnikas, T.B., Tosques, I.E., Laratta, W.P., Shi, J. and Shapleigh, J.P.
(1997) Characterization of the nitric oxide reductase-encoding
region in Rhodobacter sphaeroides 2.4.3. J. Bacteriol. 179:
3534–3540.
Bowman, J.L., Floyd, S.K. and Sakakibara, K. (2007) Green genes—
comparative genomics of the green branch of life. Cell 129:
229–234.
Cavalier-Smith, T. (2003) Genomic reduction and evolution of novel
genetic membranes and protein-targeting machinery in eukaryote–
eukaryote chimeras (meta-algae). Philos. Trans. R. Soc. B: Biol. Sci.
358: 109–134.
Chiu, W.-I., Niwa, Y., Zeng, W., Hirose, T., Kobayashi, H. and Sheen, J.
(1996) Engineered GFP as vital reporter in plants. Curr. Biol. 6:
325–330.
Dauvillée, D., Stampacchia, O., Girard-Bascou, J. and Rochaix, J.-D.
(2003) Tab2 is a novel conserved RNA binding protein required for
translation of the chloroplast psaB mRNA. EMBO J. 22: 6378–6388.
De Bie, L.G., Roovers, M., Oudjama, Y., Wattiez, R., Tricot, C., Stalon, V.,
et al. (2003) The yggH gene of Escherichia coli encodes a tRNA
(m7G46) methyltransferase. J. Bacteriol. 185: 3238–3243.
De Crécy-Lagard, V. and Hanson, A.D. (2007) Finding novel metabolic
genes through plant-prokaryote phylogenomics. Trends Microbiol.
15: 563–570.
Dougan, D.A., Reid, B.G., Horwich, A.L. and Bukau, B. (2002) ClpS, a
substrate modulator of the ClpAP machine. Mol. Cell 9: 673–683.
Dunkley, T.P., Hester, S., Shadforth, I.P., Runions, J., Weimar, T., Hanton,
S.L., et al. (2006) Mapping the Arabidopsis organelle proteome. Proc.
Natl Acad. Sci. USA 103: 6518–6523.
Duy, D., Wanner, G., Meda, A.R., Von Wirén, N., Soll, J. and Philippar, K.
(2007) PIC1, an ancient permease in Arabidopsis chloroplasts,
mediates iron transport. Plant Cell 19: 986–1006.
Emanuelsson, O., Nielsen, H., Brunak, S. and von Heijne, G. (2000)
Predicting subcellular localization of proteins based on their
N-terminal amino acid sequence. J. Mol. Biol. 300: 1005–1016.
Ferro, M., Salvi, D., Brugière, S., Miras, S., Kowalski, S., Louwagie, M., et al.
(2003) Proteomics of the chloroplast envelope membranes from
Arabidopsis thaliana. Mol. Cell Proteomics 2: 325–345.
Frankenberg, N. and Lagarias, J.C. (2003) Phycocyanobilin:ferredoxin
oxidoreductase of Anabaena sp. PCC 7120. J. Biol. Chem. 278:
9219–9226.
Friso, G., Giacomelli, L., Ytterberg, A.J., Peltier, J.-B., Rudella, A., Sun, Q.,
et al. (2004) In-depth analysis of the thylakoid membrane proteome
of Arabidopsis thaliana chloroplasts: new proteins, new functions,
and a plastid proteome database. Plant Cell 16: 478–499.
Froehlich, J.E., Wilkerson, C.G., Ray, W.K., McAndrew, R.S., Osteryoung,
K.W., Gage, D.A., et al. (2003) Proteomic study of the Arabidopsis
thaliana chloroplastic envelope membrane utilizing alternatives to
traditional two-dimensional electrophoresis. J. Proteome Res. 2:
413–425.
Fujimori, T., Higuchi, M., Sato, H., Aiba, H., Muramatsu, M., Hihara, Y.,
et al. (2005) The mutant of sll1961, which encodes a putative
transcriptional regulator, has a defect in regulation of photosystem
stoichiometry in the cyanobacterium Synechocystis sp. PCC 6803.
Plant Physiol. 139: 408–416.
Goodstadt, L. and Ponting, C.P. (2004) Vitamin K epoxide reductase:
homology, active site and catalytic mechanism. Trends Biochem. Sci.
29: 289–292.
Horton, P., Park, K.J., Obayashi, T., Fujita, N., Harada, H., Adams-Collier,
C.J., et al. (2007) WoLF PSORT: protein localization predictor. Nucleic
Acids Res. 35: W585–W587.
Kawai-Yamada, M., Saito, Y., Jin, L., Ogawa, T., Kim, K.M., Yu, L.H., et al.
(2005) A novel Arabidopsis gene causes Bax-like lethality in
Saccharomyces cerevisiae. J. Biol. Chem. 280: 39468–39473.
Keren, N., Ohkawa, H., Welsh, E.A., Liberton, M. and Pakrasi, H.B. (2005)
Psb29, a conserved 22-kD protein, functions in the biogenesis of
photosystem II complexes in Synechocystis and Arabidopsis. Plant
Cell 17: 2768–2781.
Kleffmann, T., Russenberger, D., von Zychlinski, A., Christopher, W.,
Sjölander, K., Gruissem, W., et al. (2004) The Arabidopsis thaliana
chloroplast proteome reveals pathway abundance and novel protein
functions. Curr. Biol. 14: 354–362.
Kohchi, T., Mukougawa, K., Frankenberg, N., Masuda, M., Yokota, A.
and Lagarias, J.C. (2001) The Arabidopsis HY2 gene encodes
phytochromobilin synthase, a ferredoxin-dependent biliverdin
reductase. Plant Cell 13: 425–436.
Lee, K.H., Kim, D.H., Lee, S.W., Kim, Z.H. and Hwang, I. (2002) In vivo
import experiments in protoplasts reveal the importance of
the overall context but not specific amino acid residues of the
transit peptide during import into chloroplasts. Mol. Cells 14:
388–397.
Léon, S., Touraine, B., Ribot, C., Briat, J.F. and Lobréaux, S. (2003) Iron–
sulphur cluster assembly in plants: distinct NFU proteins in
mitochondria and plastids from Arabidopsis thaliana. Biochem. J.
371: 823–830.
Lezhneva, L., Kuras, R., Ephritikhine, G. and De Vitry, C. (2008) A novel
pathway of cytochrome c biogenesis is involved in the assembly of
the cytochrome b6f complex in Arabidopsis chloroplasts. J. Biol.
Chem. 283: 24608–24616.
Martin, W., Rujan, T., Richly, E., Hansen, A., Cornelsen, S., Lins, T., et al.
(2002) Evolutionary analysis of Arabidopsis, cyanobacterial, and
chloroplast genomes reveals plastid phylogeny and thousands of
cyanobacterial genes in the nucleus. Proc. Natl Acad. Sci. USA 99:
12246–12251.
Martin, W., Stoebe, B., Goremykin, V., Hansmann, S., Hasegawa, M. and
Kowallik, K.V. (1998) Gene transfer to the nucleus and the evolution
of chloroplasts. Nature 393: 162–165.
Matsuzaki, M., Misumi, O., Shin-i, T., Maruyama, S., Takahara, M.,
Miyagishima, S., et al. (2004) Genome sequence of the ultrasmall
unicellular red alga Cyanidioschyzon merolae 10D. Nature 428:
653–657.
Merchant, S.S., Prochnik, S.E., Vallon, O., Harris, E.H., Karpowicz, S.J.,
Witman, G.B., et al. (2007) The Chlamydomonas genome reveals the
evolution of key animal and plant functions. Science 318:
245–250.
Moriyama, T., Terasawa, K., Fujiwara, M. and Sato, N. (2008) Purification
and characterization of organellar DNA polymerases in the red alga
Cyanidioschyzon merolae. FEBS J. 275: 2899–2918.
Mulkidjanian, A.Y., Koonin, E.V., Makarova, K.S., Mekhedov, S.L.,
Sorokin, A., Wolf, Y.I., et al. (2006) The cyanobacterial genome core
and the origin of photosynthesis. Proc. Natl Acad. Sci. USA 103:
13126–13131.
Murashige, T. and Skoog, F. (1962) A revised medium for rapid growth
and bioassays with tobacco tissue cultures. Physiol. Plant. 15:
473–497.
Nishio, K. and Nakai, M. (2000) Transfer of iron–sulfur cluster from
NifU to apoferredoxin. J. Biol. Chem. 275: 22615–22618.
Pellegrini, M., Marcotte, E.M., Thompson, M.J., Eisenberg, D. and Yeates,
T.O. (1999) Assigning protein functions by comparative genome
analysis: protein phylogenetic profiles. Proc. Natl Acad. Sci. USA 96:
4285–4288.
Peltier, J.-B., Emanuelsson, O., Kalume, D.E., Ytterberg, J., Friso, G.,
Rudella, A., et al. (2002) Central functions of the lumenal and
peripheral thylakoid proteome of Arabidopsis determined by
experimentation and genome-wide prediction. Plant Cell 14: 211–236.
Peltier, J.-B., Ytterberg, A.J., Sun, Q. and van Wijk, K.J. (2004) New
functions of the thylakoid membrane proteome of Arabidopsis
thaliana revealed by a simple, fast, and versatile fractionation
strategy. J. Biol. Chem. 279: 49367–49383.
Peltier, J.-B., Cai, Y., Sun, Q., Zabrouskov, V., Giacomelli, L., Rudella, A.,
et al. (2006) The oligomeric stromal proteome of Arabidopsis
thaliana chloroplasts. Mol. Cell Proteomics 5: 114–133.
Rensing, S.A., Lang, D., Zimmer, A.D., Terry, A., Salamov, A., Shapiro, H.,
et al. (2008) The Physcomitrella genome reveals evolutionary insights
into the conquest of land by plants. Science 319: 64–69.
Reyes-Prieto, A., Hackett, J.D., Soares, M.B., Bonaldo, M.F. and
Bhattacharya, D. (2006) Cyanobacterial contribution to algal nuclear
genomes is primarily limited to plastid functions. Curr. Biol. 16:
2320–2325.
Rippka, R., Deruelles, J., Waterbury, J.B., Herdman, M. and Stanier, R.Y.
(1979) Generic assignments, strain histories and properties of pure
cultures of cyanobacteria. J. Gen. Microbiol. 111: 1–61.
Sakurai, I., Mizusawa, N., Wada, H. and Sato, N. (2007)
Digalactosyldiacylglycerol is required for stabilization of the oxygen-evolving complex in photosystem II. Plant Physiol. 145: 1361–1370.
Sato, N. (2001) Was the evolution of plastid genetic machinery
discontinuous? Trends Plant Sci. 6: 151–156.
Sato, N. (2002) Comparative analysis of the genomes of cyanobacteria
and plants. Genome Inform. 13: 173–182.
Sato, N. (2006) Origin and evolution of plastids: genomic view on the
unification and diversity of plastids. In The Structure and Function
of Plastids. Edited by Wise, R.R. and Hoober, J.K. pp. 75–102. Springer,
Dordrecht.
Sato, N. (2009) Gclust: trans-kingdom classification of proteins using
automatic individual threshold setting. Bioinformatics doi:10.1093/
bioinformatics/btp047.
Sato, N., Ishikawa, M., Fujiwara, M. and Sonoike, K. (2005) Mass
identification of chloroplast proteins of endosymbiont origin by
phylogenetic profiling based on organism-optimized homologous
protein groups. Genome Inform. 16: 56–68.
Schreiber, U. (2004) Pulse-amplitude-modulation (PAM) fluorometry,
and saturation pulse method: an overview. In Chlorophyll a
Fluorescence: A Signature of Photosynthesis. Edited by Papageorgiou,
G.C. and Govindjee. pp. 279–339. Springer, Dordrecht.
Schubert, M., Petersson, U.A., Haas, B.J., Funk, C., Schröder, W.P. and
Kieselbach, T. (2002) Proteome map of the chloroplast lumen of
Arabidopsis thaliana. J. Biol. Chem. 277: 8354–8365.
Sekine, K., Fujiwara, M., Nakayama, M., Takao, T., Hase, T. and Sato, N.
(2007) DNA-binding and partial nucleoid localization of the
chloroplast stromal enzyme ferredoxin:sulfite reductase. FEBS J. 274:
2054–2069.
Small, I., Peeters, N., Legeai, F. and Lurin, C. (2004) Predotar: a tool for
rapidly screening proteomes for N-terminal targeting sequences.
Proteomics 4: 1581–1590.
Sun, Q., Emanuelsson, O. and van Wijk, K.J. (2004) Analysis of curated
and predicted plastid subproteomes of Arabidopsis. Subcellular
compartmentalization leads to distinctive proteome properties.
Plant Physiol. 135: 723–735.
Takabayashi, A., Ishikawa, N., Obayashi, T., Ishida, S., Obokata, J., Endo, T.,
et al. (2009) Three novel subunits of Arabidopsis chloroplastic
NAD(P)H dehydrogenase identified by bioinformatic and reverse
genetic approaches. Plant J. 57: 207–219.
Teng, Y.-S., Su, Y.-S., Chen, L.-J., Lee, Y.J., Hwang, I. and Li, H.-M. (2006)
Tic21 is an essential translocon component for protein translocation
across the chloroplast inner envelope membrane. Plant Cell 18:
2247–2257.
Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W:
improving the sensitivity of progressive multiple sequence alignment
through sequence weighting, position-specific gap penalties and
weight matrix choice. Nucleic Acids Res. 22: 4673–4680.
Touraine, B., Boutin, J.P., Marion-Poll, A., Briat, J.F., Peltier, G. and
Lobréaux, S. (2004) Nfu2: a scaffold protein required for [4Fe–4S]
and ferredoxin iron–sulphur cluster assembly in Arabidopsis
chloroplasts. Plant J. 40: 101–111.
Walters, R.G., Shephard, F., Rogers, J.J., Rolfe, S.A. and Horton, P. (2003)
Identification of mutants of Arabidopsis defective in acclimation of
photosynthesis to the light environment. Plant Physiol. 131:
472–481.
Wang, Q., Sullivan, R.W., Kight, A., Henry, R.L., Huang, J., Jones, A.M., et al.
(2004) Deletion of the chloroplast-localized Thylakoid Formation 1
gene product in Arabidopsis leads to deficient thylakoid formation
and variegated leaves. Plant Physiol. 136: 3594–3604.
Zybailov, B., Rutschow, H., Friso, G., Rudella, A., Emanuelsson, O., Sun, Q.,
et al. (2008) Sorting signals, N-terminal modifications and abundance
of the chloroplast proteome. PLoS ONE 3: e1994.
(Received January 8, 2009; Accepted February 12, 2009)