Vol. 20 no. 16 2004, pages 2662–2675 doi:10.1093/bioinformatics/bth306 BIOINFORMATICS DBRF–MEGN method: an algorithm for deducing minimum equivalent gene networks from large-scale gene expression profiles of gene deletion mutants Koji Kyoda1,3 , Kotaro Baba2,4 , Shuichi Onami1,2,3, ∗ and Hiroaki Kitano1,2,3,5 1 Kitano Symbiotic Systems Project, ERATO, Japan Science and Technology Corporation and 2 The Systems Biology Institute, M31 6A, 6-31-15 Jingumae, Shibuya, Tokyo 150-0001, Japan, 3 Graduate School of Science and Technology, Keio University, 3-14-1 Hiyoshi, Kohoku, Yokohama 223-8522, Japan, 4 National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan and 5 Sony Computer Science Laboratories, Inc., 3-14-13 Higashi-Gotanda, Shinagawa, Tokyo 141-0022, Japan Received on September 15, 2003; revised on March 5, 2004; accepted on April 28, 2004 Advance Access publication May 14, 2004 ABSTRACT Motivation: Large-scale gene expression profiles measured in gene deletion mutants are invaluable sources for identifying gene regulatory networks. Signed directed graph (SDG) is the most common representation of gene networks in genetics and cell biology. However, no practical procedure that deduces SDGs consistent with such profiles has been developed. Results: We developed the DBRF–MEGN (difference-based regulation finding–minimum equivalent gene network) method in which an algorithm deduces the most parsimonious SDGs consistent with expression profiles of gene deletion mutants. Positive (or negative) directed edges representing positive (or negative) gene regulations are deduced by comparing the gene expression level between the wild-type and mutant. The most parsimonious SDGs are deduced using graph theoretical procedures. Compensation for excess removal of edges by restoring a minimum number of edges makes the method applicable to cyclic gene networks. 
Use of independent groups of edges greatly reduces the computational cost, thus making the method applicable to large-scale expression profiles. We confirmed the applicability of our method by applying it to the gene expression profiles of 265 Saccharomyces cerevisiae deletion mutants, and we confirmed our method’s validity by comparing the pheromone response pathway, general amino acid control system, and copper and iron homeostasis system deduced by our method with those reported in the literature. Interpretation of the gene network deduced from the S. cerevisiae expression profiles by using our method led to the prediction of 132 transcriptional targets and modulators of transcriptional activity of 18 transcriptional regulators.
Availability: The software is available on request.
Contact: [email protected]
Supplementary information: http://www.so.bio.keio.ac.jp/dbrf-megn/
∗ To whom correspondence should be addressed.
INTRODUCTION
Identification of gene regulatory networks (hereafter called gene networks) is essential for understanding cellular functions. Large-scale gene deletion projects (Liu et al., 1999; Winzeler et al., 1999; Hamer et al., 2001; Giaever et al., 2002) and DNA microarrays (Schena et al., 1995; Lockhart et al., 1996) have enabled large-scale gene expression profiles of gene deletion mutants (deletants); these large-scale profiles comprise the expression levels of thousands of genes measured in deletants of those genes. Hughes et al. (2000) reported gene expression profiles of more than 6300 genes corresponding to 265 single-gene deletants in Saccharomyces cerevisiae. Such profiles are invaluable sources for identifying gene networks. Many procedures, such as those by Ideker et al. (2000), Kyoda et al. (2000), Pe’er et al. (2001), and Wagner (2001), infer gene networks from large-scale expression profiles of gene deletants. In these procedures, gene networks are modeled using various mathematical representations. Ideker et al.
(2000) modeled gene networks as acyclic Boolean networks and inferred a network consistent with profiles by using a combinatorial optimization technique. Kyoda et al. (2000) modeled gene networks as signed directed graphs (SDGs) and deduced the most parsimonious graph consistent with profiles by using a graph theoretical procedure. Pe’er et al. (2001) modeled gene networks as Bayesian networks and inferred gene networks by using machine learning technology. Wagner (2001) modeled gene networks as directed acyclic graphs and deduced the most parsimonious graph consistent with profiles by using a graph theoretical procedure.
Bioinformatics vol. 20 issue 16 © Oxford University Press 2004; all rights reserved.
The SDG is a desirable representation of gene networks because it is the most common representation of gene networks in genetics and cell biology. In such graphs, a regulation between two genes is represented as a signed directed edge (SDE) whose sign (positive or negative) represents whether the effect of the regulation is activation or inhibition and whose direction represents which gene regulates which other gene. Because of the commonness of SDGs in genetics and cell biology, SDGs consistent with large-scale gene expression profiles will provide fruitful information for understanding cellular function; such graphs can be directly compared with gene networks identified through classical small-scale experiments and then can be interpreted in the same manner as those small-scale gene networks. Kyoda et al. (2000) previously developed the DBRF (difference-based regulation finding) method, which deduces the most parsimonious SDG consistent with the expression profiles of gene deletants. However, the method is not applicable to cyclic gene networks. This is the critical drawback of the DBRF method, because real gene networks contain many feedback regulations (Ferrell, 2002; Guelzim et al., 2002; Lee et al., 2002).
Therefore, an algorithm that is applicable to cyclic gene networks needed to be developed. In this study, we developed the DBRF–MEGN (minimum equivalent gene network) method, an improved algorithm for deducing the most parsimonious SDG consistent with the expression profiles of gene deletants. This method is applicable not only to acyclic but also to cyclic gene networks. To show the applicability of this method, we applied it to large-scale expression profiles of S. cerevisiae (Hughes et al., 2000). The method successfully deduced the most parsimonious SDG consistent with these profiles. To evaluate the validity of the method, we then compared the deduced graph with gene networks reported in the literature and interpreted this graph to predict the transcriptional targets and modulators of transcriptional activity of known transcriptional regulators.
METHODS
DBRF–MEGN method
Four key concepts of the DBRF–MEGN method are described here. The first two concepts, difference-based deduction of edges and removal of redundant edges, were already implemented in the DBRF method (Kyoda et al., 2000). The last two, compensation for excess removal of edges by restoring a minimum number of non-essential edges and the use of independent groups, were originally developed for the DBRF–MEGN method and thus are improvements on the DBRF method.
The first concept is difference-based deduction of edges. The gene expression profiles of gene deletants consist of the expression levels of genes measured in deletants for each of these genes. To deduce the SDG that is consistent with these profiles, we used an assumption that is commonly used in genetics and cell biology as is done in the DBRF method (Kyoda et al., 2000), i.e. there exists a positive (negative) regulation from gene A to gene B when the expression level of gene B in the deletant of gene A is lower (higher) than in the wild-type (Fig. 1A).
For each possible pair of genes, we determined whether positive (negative) regulations between those genes exist and deduced all SDEs consistent with both the stated assumption and the profiles; we call these edges ‘initially deduced edges’ (Fig. 1B). This computation required $n^2$ iterations, where n represents the number of genes in the profiles.
The second concept is removal of redundant edges. The initially deduced edges consist not only of those representing direct gene regulations but also those representing indirect gene regulations. We define the regulation from gene A to gene B as direct when gene A regulates gene B independently of other gene regulations, e.g. a transcription factor A binds to upstream regulatory regions of gene B and increases the transcription of gene B. On the other hand, we define gene regulation as indirect when gene A regulates gene B as a result of other regulations, e.g. a transcription factor A increases the transcription of transcription factor C, which then increases the transcription of gene B. A desirable gene network consists only of direct gene regulations, because indirect regulations do not correspond to molecular mechanisms of gene regulation. To choose edges representing direct gene regulations from the initially deduced edges, all edges that are deductively explained by two other edges are removed (Fig. 1C), as is done in the DBRF method (Kyoda et al., 2000), because an indirect regulation is explained by direct regulations. To reduce the computational cost, we implemented this removal process by modifying Warshall’s algorithm (Warshall, 1962). The resulting algorithm required $n^3$ iterations.
The third concept is compensation for excess removal of edges by restoring a minimum number of non-essential edges. The edges chosen in the removal process, called ‘essential edges’, sometimes fail to explain the initially deduced edges. This is the problem of the DBRF method when it is applied to cyclic gene networks.
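The first two concepts, and the failure on cyclic networks just mentioned, can be illustrated with a minimal Python sketch. It is not the authors' implementation (which is described in the Supplementary information and modifies Warshall's algorithm); the function names, the dictionary layout and the toy expression levels are assumptions made for illustration only.

```python
def initial_edges(wild_type, deletants, tol=0.0):
    """Deduce signed edges from deletant profiles (about n^2 comparisons).

    wild_type: {gene: level}; deletants: {deleted_gene: {gene: level}}.
    Returns {(regulator, target): +1 or -1}. `tol` is a hypothetical
    tolerance standing in for a significance test.
    """
    edges = {}
    for a, profile in deletants.items():
        for b, level in profile.items():
            if a == b:
                continue  # self-regulation cannot be deduced from a deletion
            diff = level - wild_type[b]
            if diff < -tol:
                edges[(a, b)] = +1  # B is lower without A: A activates B
            elif diff > tol:
                edges[(a, b)] = -1  # B is higher without A: A inhibits B
    return edges


def essential_edges(edges):
    """Remove every edge explained by two other edges (about n^3 iterations)."""
    genes = sorted({g for pair in edges for g in pair})
    keep = dict(edges)
    for b in genes:  # intermediate gene
        for a in genes:
            for c in genes:
                if len({a, b, c}) < 3:
                    continue
                s = edges.get((a, c))
                # a -> c is redundant if sign(a -> b) * sign(b -> c) matches it
                if s and edges.get((a, b), 0) * edges.get((b, c), 0) == s:
                    keep.pop((a, c), None)
    return keep


# Toy chain A -> C -> B with invented expression levels.
wild_type = {"A": 1.0, "B": 1.0, "C": 1.0}
deletants = {
    "A": {"A": 0.0, "B": 0.4, "C": 0.5},
    "B": {"A": 1.0, "B": 0.0, "C": 1.0},
    "C": {"A": 1.0, "B": 0.5, "C": 0.0},
}
chain = initial_edges(wild_type, deletants)
chain_essential = essential_edges(chain)

# Toy positive cycle among g, h and i: every edge is explained by two others,
# so the removal step leaves no essential edges at all.
cycle = {(a, b): +1 for a in "ghi" for b in "ghi" if a != b}
cycle_essential = essential_edges(cycle)
```

In the chain, the indirect edge A to B is dropped and the two direct edges remain. In the cycle, nothing remains, which is the excess removal that the third concept compensates for.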
Some edges represent direct gene regulations even when they are explained by two other edges (Fig. 1C). Therefore, the removal process sometimes removes edges representing direct gene regulations, resulting in excess removal of edges (Fig. 1C). It is difficult to know whether an edge represents direct or indirect gene regulation when the edge is explained by two other edges and when only expression profiles of single-gene deletants are available. Therefore, instead of looking for edges representing direct regulations among the removed edges (hereafter we call the removed edges ‘non-essential edges’), the DBRF–MEGN method compensates for excessively removed edges by restoring a minimum number of non-essential edges so that the resulting edges (essential edges and the minimum number of non-essential edges) can explain the initially deduced edges (Fig. 1D).
Fig. 1. Example of the deduction of MEGNs from the gene expression profiles of gene deletants. (A) Assumption used in the DBRF–MEGN method. (B) Deduction of initially deduced edges. The matrix represents a set of expression profiles, and the schematic represents a set of initially deduced edges. In the matrix, a, b, ... represent expression levels of gene a, b, ..., and a, b, ... represent deletants of gene a, b, .... The up (down) arrows indicate that the gene expression levels are higher (lower) in the deletant than in the wild-type. (C) Essential edges. Non-essential edges are light-colored. Dotted edges represent unexplained edges, which cannot be explained by essential edges. Either d → e or d → f represents a direct gene regulation, and either h → i, i → g, and g → h or h → g, g → i, and i → h represent direct gene regulations. (D) Four MEGNs of the profiles. (E) Independent groups of unexplained edges. Combining the minimum sets of edges from the two independent groups (G1 and G2) produces all four MEGNs.
Often, several sets of such non-essential edges exist, and the DBRF–MEGN method deduces all sets. The resulting graphs are the most parsimonious SDGs consistent with given profiles and are called ‘the minimum equivalent gene networks’ (MEGNs) of the profiles.
The fourth concept is the use of independent groups. The computation of the described process of deducing MEGNs is bounded by $n^3 \sum_{i=0}^{m} {}_{(I-E)}C_i \cdot (I - E - i)$, where m is the number of non-essential edges to be restored, I is the number of initially deduced edges and E is the number of essential edges. This computation is impractical, however, because ${}_{(I-E)}C_m$ increases rapidly as I − E and/or m increase. To reduce the computational cost, non-essential edges are separated into ‘independent groups’ so that edges to be restored can be deduced independently for each group (Fig. 1E). Edges that are not explained by essential edges are chosen from non-essential edges, and these edges are divided into independent groups so that the edges in one group do not explain those in other groups. For each group, the minimum number of edges with which essential edges can explain all edges in the group are deduced. All sets of such edges are deduced for each group, and all possible combinations of these sets are computed to generate the MEGNs of the profiles. The computation is bounded by $\sum_{j=0}^{G} \sum_{i=0}^{m_j} {}_{R_j}C_i \cdot (R_j - i) \cdot n_j^3$, where G is the number of groups, $R_j$ is the number of edges in group j, $n_j$ is the number of genes in group j and $m_j$ is the number of edges to be restored in group j.
A detailed description of the algorithm of the DBRF–MEGN method is included in the Supplementary information. Software implementation of this algorithm can be obtained from the authors on request.
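The third and fourth concepts can be sketched as a brute-force search in Python. This is an illustrative reading of the text, not the authors' algorithm (see the Supplementary information for that). Several choices are assumptions: an edge counts as "explained" when the chosen edges contain a walk with the same sign product, groups are formed with a coarse reachability test, and the toy input mimics only the two situations named in the Fig. 1 caption, with all edges taken as positive.

```python
from itertools import combinations, product


def explains(edge_set, src, dst, sign):
    """True if edge_set holds a walk src -> dst whose sign product equals sign."""
    seen, stack = set(), [(src, 1)]
    while stack:
        node, s = stack.pop()
        for (u, v), es in edge_set.items():
            if u != node:
                continue
            state = (v, s * es)
            if state == (dst, sign):
                return True
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return False


def essential_edges(edges):
    """Keep only edges that no pair of other edges (a -> b, b -> c) explains."""
    genes = {g for pair in edges for g in pair}
    return {(a, c): s for (a, c), s in edges.items()
            if not any(b not in (a, c)
                       and edges.get((a, b), 0) * edges.get((b, c), 0) == s
                       for b in genes)}


def independent_groups(essential, unexplained):
    """Put two unexplained edges together if one could lie on a walk explaining the other."""
    graph = {**essential, **unexplained}
    reach = {}
    for n in {g for pair in graph for g in pair}:
        reach[n], stack = {n}, [n]
        while stack:
            u = stack.pop()
            for (x, y) in graph:
                if x == u and y not in reach[n]:
                    reach[n].add(y)
                    stack.append(y)
    groups = []
    for e in unexplained:
        linked = [g for g in groups if any(
            (f[0] in reach[e[0]] and e[1] in reach[f[1]])
            or (e[0] in reach[f[0]] and f[1] in reach[e[1]]) for f in g)]
        groups = [g for g in groups if g not in linked] + [{e}.union(*linked)]
    return groups


def minimal_restorations(essential, group, edges):
    """All smallest subsets of a group that, with the essential edges, explain the group."""
    for k in range(1, len(group) + 1):
        found = []
        for subset in combinations(sorted(group), k):
            trial = {**essential, **{e: edges[e] for e in subset}}
            if all(e in trial or explains(trial, e[0], e[1], edges[e]) for e in group):
                found.append(subset)
        if found:
            return found
    return []


def megns(edges):
    """Enumerate minimum equivalent gene networks for a set of signed edges."""
    essential = essential_edges(edges)
    unexplained = {e: s for e, s in edges.items()
                   if e not in essential and not explains(essential, e[0], e[1], s)}
    groups = independent_groups(essential, unexplained)
    choices = [minimal_restorations(essential, g, edges) for g in groups]
    return [{**essential, **{e: edges[e] for subset in combo for e in subset}}
            for combo in product(*choices)]


# Toy input: d regulates the mutually activating pair e, f; g, h, i form a cycle.
toy = {e: +1 for e in [("e", "f"), ("f", "e"), ("d", "e"), ("d", "f"),
                       ("h", "i"), ("i", "g"), ("g", "h"),
                       ("h", "g"), ("g", "i"), ("i", "h")]}
networks = megns(toy)
```

Searching each group separately keeps the subset enumeration small: here one group has two candidate edges and the other has six, instead of a single search over all eight unexplained edges. The toy input yields two choices per group and therefore four networks.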
RESULTS
Applicability to large-scale expression profiles obtained for real organisms
To evaluate the applicability of the DBRF–MEGN method, we applied it to a subset of large-scale expression profiles obtained for S. cerevisiae (Hughes et al., 2000). The set of profiles comprises expression levels of 265 genes measured in 265 gene deletants corresponding to those genes (see Supplementary information). Each expression level is accompanied by a P-value, which corresponds to the significance of the difference from the expression level in the wild-type (Hughes et al., 2000). We considered that the expression level in the deletant is increased (decreased) when the level significantly differed from that in wild-type at P ≤ 0.01, which is the same P-value used by Hughes et al. (2000). With this P-value threshold, the DBRF–MEGN method deduced 829 initially deduced edges and 675 essential edges (see Supplementary information). These essential edges deductively explained the initially deduced edges. Therefore, the method deduced a unique MEGN of the profiles, and this MEGN consisted only of those essential edges (Supplementary Figure S1 is a graphical representation of the MEGN). The computation took ∼0.02 s on an Intel Pentium 4 PC (2.8 GHz, 1 GB RAM).
The application we just described was a case in which essential edges deductively explained initially deduced edges. To evaluate the applicability when essential edges fail to explain initially deduced edges, we increased the number of initially deduced edges by increasing the P-value threshold from 0.01 to 0.05. Essential edges failed to explain initially deduced edges when the threshold was 0.03 and 0.05. In these two cases, the DBRF–MEGN method successfully deduced 2 and 16 384 MEGNs, respectively (see Supplementary information), and the computation took ∼0.02 and 0.75 s, respectively. These results show the applicability of the DBRF–MEGN method to actual large-scale expression profiles.
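The thresholding step described above can be sketched as follows. The input layout (a log ratio of deletant to wild-type plus a P-value per measurement), the function name and the numbers are hypothetical and serve only to show how raising the threshold adds initially deduced edges.

```python
def deduce_edges(measurements, threshold=0.01):
    """Turn (log_ratio, p_value) measurements into signed edges.

    measurements: {(deleted_gene, measured_gene): (log_ratio, p_value)},
    where log_ratio is assumed to be log(deletant / wild-type).
    A change counts as significant when p_value <= threshold.
    """
    edges = {}
    for (a, b), (log_ratio, p_value) in measurements.items():
        if a == b or p_value > threshold or log_ratio == 0:
            continue
        # Lower in the deletant means positive regulation, higher means negative.
        edges[(a, b)] = +1 if log_ratio < 0 else -1
    return edges


# Invented measurements for three hypothetical genes.
measurements = {
    ("A", "B"): (-0.9, 0.001),
    ("A", "C"): (0.4, 0.02),
    ("B", "C"): (0.7, 0.2),
}
strict = deduce_edges(measurements, threshold=0.01)
loose = deduce_edges(measurements, threshold=0.05)
```

With the stricter threshold only the A to B edge is called; at 0.05 the A to C edge is added, while the B to C measurement remains non-significant.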
Validity of gene networks deduced by the DBRF–MEGN method: pheromone response pathway
The pheromone response pathway is one of the best characterized cellular cascades in S. cerevisiae. The 265 genes of the expression profiles we used include most genes of this pathway. Therefore, to evaluate the validity of the DBRF–MEGN method, we first compared the MEGN deduced from the expression profiles for the 265 S. cerevisiae genes with the known gene network in this pathway reported in the literature (Fig. 2).
First, we focused on transcriptional regulations by Ste12p, which is the central transcription factor in the pheromone response pathway. Because the expression profiles applied to the DBRF–MEGN method were a collection of mRNA levels in gene deletants, an edge directing from STE12 was expected to represent a transcriptional regulation by Ste12p. Among the 265 genes in the profiles, 6 genes are transcriptional targets of Ste12p (Fig. 2A) (Errede and Ammerer, 1989; Sprague and Thorner, 1992; Oehlen et al., 1996; Oehlen and Cross, 1998; Ren et al., 2000; Roberts et al., 2000). The method cannot deduce self-regulations because of its assumption (see Fig. 1A for a schematic of this assumption). Therefore, five positive edges directing from STE12 to FAR1, FUS3, SST2, STE2 and TEC1 were expected to be deduced (Fig. 2B). As expected, the method deduced five edges directing from STE12, all five of which were positive edges directing to each of those five genes (Fig. 2C).
Next, we focused on the post-transcriptional regulation cascade that regulates Ste12p activity. Deletion of a single gene in this cascade increases (decreases) Ste12p activity, which then increases (decreases) the STE12 mRNA level because Ste12p self-increases its own transcription (Ren et al., 2000). The applied expression profiles were a collection of mRNA levels in gene deletants.
Therefore, an edge directing from a gene to STE12 was expected to indicate the existence of a post-transcriptional regulation cascade from this gene to Ste12p, unless the gene is a transcriptional regulator. Among the 265 genes, 11 are involved in the post-transcriptional regulation cascade that regulates Ste12p activity (Fig. 2A) (Tedford et al., 1997; Roberts et al., 2000; Elion, 2001). However, deletion of 6 of those 11 genes was not expected to affect the STE12 mRNA level for the following three reasons. First, STE2 encodes the receptor for α-factor (Jenness et al., 1983); the receptor would not be activated in any of the experiments in which gene expression profiles were measured because MATa cells, which do not secrete α-factor (Herskowitz et al., 1992), were used in these experiments. Second, deletion of STE20 does not completely block pheromone-induced Ste12p activation, suggesting unidentified pathways that bypass Ste20p activity (Ramer and Davis, 1993). Third, FUS3 and KSS1 (Elion et al., 1991) and DIG1 and DIG2 (Tedford et al., 1997) are functionally redundant. Therefore, five positive edges directing to STE12 from STE4, STE5, STE7, STE11 and STE18 were expected to be deduced by the DBRF–MEGN method (Fig. 2B). As expected, the method deduced five edges directing to STE12, all five of which were positive edges directed from each of those five genes (Fig. 2C). These results show the validity of the DBRF–MEGN method.
In addition to the 10 edges just described, the DBRF–MEGN method deduced 2 unexpected edges (STE20 to STE2 and DIG1 to SST2) in this pathway.
Fig. 2. Validation of MEGN in the S. cerevisiae pheromone response pathway. (A) Known pheromone response pathway. Six transcriptional regulations (red edges) and 14 post-transcriptional regulations (light-green edges) were reported previously (Errede and Ammerer, 1989; Sprague and Thorner, 1992; Oehlen et al., 1996; Tedford et al., 1997; Oehlen and Cross, 1998; Ren et al., 2000; Roberts et al., 2000; Elion, 2001). (B) Expected edges in the pheromone response pathway. Five edges from STE12 to transcriptional targets (red edges) and five edges from post-transcriptional regulators to STE12 (green edges) were expected. Six edges from post-transcriptional regulators to STE12 (dotted green edges) were not expected because of the experimental conditions or the redundancy of gene regulation. (C) The MEGN in the pheromone response pathway. Five expected edges from STE12 (red edges), five expected edges to STE12 (green edges) and two unexpected edges (blue edges) were deduced.
Validity of gene networks deduced by the DBRF–MEGN method: general amino acid control system
The general amino acid control system is a cross-pathway regulatory system that regulates many genes encoding amino acid biosynthesis enzymes and increases their expression under conditions of amino acid starvation in S. cerevisiae. We next evaluated the validity of the DBRF–MEGN method in this system.
First, we focused on transcriptional regulations by Gcn4p, which is the central transcriptional regulator in the general amino acid control system. Gcn4p is required for full induction of 539 genes in response to histidine starvation, suggesting transcriptional regulation of these genes by Gcn4p (Natarajan et al., 2001). As with the transcriptional regulations by Ste12p, an edge directing from GCN4 is expected to represent a transcriptional regulation by Gcn4p (see the second paragraph of the previous section). The DBRF–MEGN method deduced two edges directing from GCN4. Consistent with the expectation, each of these two edges was a positive edge directing to one of the 539 suggested transcriptional targets of Gcn4p (GCN4 to ALD5 and GCN4 to HIS1).
Although the 265 genes in the profiles include 18 of those 539 transcriptional targets, the method deduced only two edges directing from GCN4 to those targets. However, this finding does not indicate low sensitivity of the method, because Gcn4p is not always required for the basal expression of genes whose induction in response to amino acid starvation depends on Gcn4p (Pellman et al., 1990; Hinnebusch et al., 1992).
Next, we focused on modulations of Gcn4p activity. Deletion of a gene that encodes a modulator of Gcn4p activity increases (decreases) Gcn4p activity, which then increases (decreases) the mRNA level of transcriptional targets of Gcn4p. As described in the previous paragraph, ALD5 and HIS1 are putative transcriptional targets of Gcn4p. Therefore, we expected an edge directing from a gene to ALD5 or HIS1 to indicate the existence of modulation of Gcn4p activity by this gene unless the gene is a transcriptional regulator. The 265 genes in the profiles include 12 genes (CKA2, CKB2, MED2, RAD6, RPL12A, RPL20A, RPL27A, RPL6B, RPL8A, RPS24A, RTS1 and UBR1) that encode modulators of Gcn4p activity (Feng et al., 1994; Kornitzer et al., 1994; van den Heuvel et al., 1995; Hinnebusch, 1997; Planta and Mager, 1998; Myers et al., 1999; Cherkasova and Hinnebusch, 2003; Wang and Jiang, 2003; Mewes et al., 2004). We found 19 genes (ADE2, ASE1, CKB2, ERG2, FKS1, IMP2’, MED2, RML2, RPL27A, RPS24A, RTG1, RTS1, SIR4, SOD1, UBR1, VPS8, YHL029C, YMR014W and YMR293C) from which edges deduced by the DBRF–MEGN method directed to either ALD5 or HIS1 or both. Consistent with the expectation, these 19 genes included 6 (CKB2, MED2, RPL27A, RPS24A, RTS1 and UBR1) that encode modulators of Gcn4p activity.
Three of these six genes (CKB2, RTS1 and UBR1) have edges directing to both ALD5 and HIS1, whereas the remaining three have a single edge directing to either ALD5 or HIS1 (MED2 and RPS24A to ALD5, and RPL27A to HIS1), suggesting modulation of specific gene transcription or crosstalk among modulators of different transcriptional regulators.
As just described, the 265 genes in the profiles included 12 genes encoding modulators of Gcn4p activity, and edges directing to ALD5 or HIS1 were deduced from 6 of those 12 genes. Deletion of the remaining six genes (CKA2, RAD6, RPL12A, RPL20A, RPL6B and RPL8A) was expected not to affect the mRNA levels of ALD5 and HIS1 for the following reasons. CKA2 is functionally redundant to CKA1 (Padmanabha et al., 1990). RPL12A, RPL20A, RPL6B and RPL8A encode ribosomal proteins, many of which are duplicated in S. cerevisiae (Planta and Mager, 1998; Hughes et al., 2000). Absence of RAD6, which encodes a specific ubiquitin conjugating enzyme required for Gcn4p degradation, mildly inhibits Gcn4p degradation (Kornitzer et al., 1994).
Among the 19 genes from which edges directed to either ALD5 or HIS1 or both, 13 (ADE2, ASE1, ERG2, FKS1, IMP2’, RML2, RTG1, SIR4, SOD1, VPS8, YHL029C, YMR014W and YMR293C) do not encode known modulators of Gcn4p activity. However, this finding does not indicate low specificity of the deduction by the DBRF–MEGN method, because activity of cellular processes involving these genes may influence Gcn4p activity. For example, IMP2’ is involved in carbohydrate metabolism, and glucose limitation stimulates translation of Gcn4p (Donnini et al., 1992; Yang et al., 2000). These results support the validity of the DBRF–MEGN method.
Validity of gene networks deduced by the DBRF–MEGN method: copper and iron homeostasis system
To evaluate the validity of the DBRF–MEGN method in the copper and iron homeostasis system, we first focused on transcriptional regulations by Mac1p and Aft1p, both of which play key roles in this system.
An edge directing from MAC1 was expected to represent either a direct transcriptional regulation by Mac1p or an indirect transcriptional regulation by Mac1p through Aft1p for the following two reasons. First, the absence of MAC1 increases the expression of AFT1 and its transcriptional targets (De Freitas et al., 2004). Second, the 265 genes in the profiles include MAC1 but not AFT1. Among the 265 genes in the profiles, 3 (ERG3, FRE6 and MNN1) are transcriptionally regulated by Mac1p (Georgatsou and Alexandraki, 1999; De Freitas et al., 2004) and 2 (ERG3 and FRE6) are transcriptionally regulated by Aft1p (Martins et al., 1998; Rutherford et al., 2003). The DBRF–MEGN method deduced two edges directing from MAC1. Consistent with the expectation, these two edges were negative edges directing to ERG3 and FRE6. The edge directing from MAC1 to MNN1 was deduced when the P-value threshold was set to 0.02, which is consistent with the imperfect reproducibility of the reduction of MNN1 expression in mac1 cells (De Freitas et al., 2004).
Next, we focused on modulations of Mac1p and Aft1p activities. As described in the previous paragraph, ERG3 and FRE6 are transcriptionally regulated by Mac1p and Aft1p. Therefore, we expected that an edge directing from a gene to ERG3 or FRE6 would indicate the existence of a modulation of Mac1p or Aft1p by this gene unless the gene is a transcriptional regulator, similar to the situation involving modulators of Gcn4p activity (see the third paragraph of the previous section). Genes crucial for vacuolar functions are involved in modulations of Mac1p and Aft1p activities, whereas those crucial for mitochondrial functions are involved in modulations of Aft1p for the following two reasons. First, Mac1p activity is downregulated by its direct binding to copper ions (Jensen and Winge, 1998), whereas Aft1p activity is regulated by its nuclear localization in response to cellular iron status (Yamaguchi-Iwai et al., 2002).
Second, vacuoles are crucial for copper and iron homeostasis, whereas mitochondria are crucial for iron homeostasis (De Freitas et al., 2003). We found 13 genes (AEP2, AFG3, CUP5, ERG2, ERG28, MRPL33, RML2, RSM18, SSN6, VMA8, YEL044W, YMR031W-A and YMR293C) from which edges deduced by the DBRF–MEGN method directed to either ERG3 or FRE6 or both. Consistent with the above expectation, these 13 genes included 8 that are crucial for either vacuolar (CUP5 and VMA8) or mitochondrial (AFG3, AEP2, MRPL33, RML2, RSM18 and YMR293C) functions (Kang et al., 1991; Finnegan et al., 1995; Paul and Tzagoloff, 1995; Arlt et al., 1996; Pan and Mason, 1997; Forgac, 1999; Hughes et al., 2000). For the remaining five genes, the method deduced five edges directing to ERG3 or FRE6 (ERG2 to ERG3, ERG28 to ERG3, SSN6 to ERG3, YEL044W to FRE6 and YMR031W-A to FRE6). Two of these five edges are consistent with the previously reported gene regulations, although it is unknown whether Mac1p or Aft1p play roles in these regulations (ERG2 to ERG3 and SSN6 to ERG3; Arthington-Skaggs et al., 1996; Vik and Rine, 2001; Kwast et al., 2002). It is possible that those edges represent modulations of transcriptional regulators other than Mac1p and Aft1p because the 265 genes in the profiles do not include all transcriptional regulators.

Table 1. Performance of the DBRF–MEGN method

Pheromone response pathway
T       Sensitivity (IDE)   Sensitivity (MEGN)   Specificity (IDE)   Specificity (MEGN)
0.005   0.90 (9/10)         0.90 (9/10)          0.90 (9/10)         0.90 (9/10)
0.01    1.00 (10/10)        1.00 (10/10)         0.91 (10/11)        1.00 (10/10)
0.02    1.00 (10/10)        1.00 (10/10)         0.83 (10/12)        1.00 (10/10)
0.03    1.00 (10/10)        1.00 (10/10)         0.71 (10/14)        0.91 (10/11)
0.04    1.00 (10/10)        0.90 (9/10)          0.71 (10/14)        0.90 (9/10)
0.05    1.00 (10/10)        0.90 (9/10)          0.63 (10/16)        0.69 (9/13)

General amino acid control system
T       Sensitivity (IDE)   Sensitivity (MEGN)   Specificity (IDE)   Specificity (MEGN)
0.005   0.18 (7/38)         0.18 (7/38)          0.23 (7/31)         0.27 (7/26)
0.01    0.29 (11/38)        0.29 (11/38)         0.25 (11/44)        0.42 (11/26)
0.02    0.34 (13/38)        0.34 (13/38)         0.23 (13/57)        0.39 (13/33)
0.03    0.37 (14/38)        0.32 (12/38)         0.20 (14/69)        0.41 (12/29)
0.04    0.39 (15/38)        0.26 (10/38)         0.20 (15/75)        0.33 (10/30)
0.05    0.42 (16/38)        0.29 (11/38)         0.19 (16/83)        0.38 (11/29)

Sensitivity and specificity of the DBRF–MEGN method for the pheromone response pathway and the general amino acid control system, with the actual numbers given in parentheses. A total of 10 gene regulations in the pheromone response pathway were used as the ‘gold standard’ for calculating sensitivity and specificity (STE12 to FAR1, FUS3, SST2, STE2 and TEC1, and STE4, STE5, STE7, STE11 and STE18 to STE12). A total of 38 gene regulations in the general amino acid control system were used as the ‘silver standard’ (GCN4 to ADE1, ALD5, ARG80, CBP2, ECA39, GLN3, HIS1, IMP2’, PET117, STB4, STE11, YAL004W, YEL059W, YER024W, YER033C, YHL045W, YIL037C and YOR072W, and CKB2, MED2, RPL12A, RPL20A, RPL27A, RPL6B, RPL8A, RPS24A, RTS1 and UBR1 to ALD5 or HIS1). Four gene regulations that represent modulations of Gcn4p activity by Cka2p or Rad6p (CKA2 to ALD5 and HIS1, and RAD6 to ALD5 and HIS1) were excluded because these regulations were not expected to be deduced (Padmanabha et al., 1990; Kornitzer et al., 1994). All edges that were included in at least one MEGN were used in calculating sensitivity and specificity for T = 0.03 and 0.05. Note that sensitivity for the general amino acid control system was underestimated because not all gene regulations of the silver standard are necessarily expected to be deduced (see the section Validity of gene networks deduced by the DBRF–MEGN method: general amino acid control system). Specificity for the general amino acid control system was also underestimated because of the possible existence of uncharacterized gene regulations. T, P-value threshold; IDE, initially deduced edges.
The remaining three edges directed from recently characterized genes with little information (ERG28; Hughes et al., 2000) or from uncharacterized open reading frames (YEL044W, YMR031W-A). These results again support the validity of the DBRF–MEGN method.
Optimal P-value threshold for the DBRF–MEGN method
To determine the optimal P-value threshold for the DBRF–MEGN method, we examined the sensitivity and specificity of the method at various P-value thresholds (Table 1). For calculating the sensitivity and specificity, the 10 gene regulations that were expected to be deduced in the pheromone response pathway were used as the ‘gold standard’ (see the section Validity of gene networks deduced by the DBRF–MEGN method: pheromone response pathway). The 38 gene regulations in the general amino acid control system were used as the ‘silver standard’, because those regulations were not necessarily expected to be deduced (see the section Validity of gene networks deduced by the DBRF–MEGN method: general amino acid control system).
As expected, sensitivity increased as the threshold increased. For the pheromone response pathway, sensitivity was highest when the threshold was between 0.01 and 0.03. Interestingly, a decrease in sensitivity was observed as the threshold increased to ≥0.03. This decrease results from the removal, during the removal of redundant edges, of true-positive edges that are explained by two false-positive edges or by a combination of false-positive and true-positive edges, indicating that increased thresholds (≥0.03) do not provide the highest sensitivity because of the increased number of false-positive edges. In contrast, specificity decreased as the threshold increased. Notably, decreased thresholds (≤0.005) did not provide the highest specificity because the number of deduced edges was too small for efficient removal of false-positive edges during the removal of redundant edges.
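The two measures can be written out as a short Python sketch. Judging from the ratios in Table 1, sensitivity is the fraction of standard regulations that were deduced, and specificity is the fraction of deduced edges that belong to the standard (the quantity often called precision elsewhere). The function name and the placeholder gene `GENE_X` are hypothetical; the example reproduces only the counts reported for the initially deduced edges of the pheromone response pathway at T = 0.01 (10/10 and 10/11), not the identity of the extra edge.

```python
def sensitivity_specificity(deduced, standard):
    """Return (sensitivity, specificity) as the ratios used in Table 1.

    deduced and standard are sets of (regulator, target) pairs.
    """
    true_positives = len(deduced & standard)
    sensitivity = true_positives / len(standard)
    specificity = true_positives / len(deduced) if deduced else 0.0
    return sensitivity, specificity


# Gold standard for the pheromone response pathway, as listed with Table 1.
gold = ({("STE12", t) for t in ("FAR1", "FUS3", "SST2", "STE2", "TEC1")}
        | {(m, "STE12") for m in ("STE4", "STE5", "STE7", "STE11", "STE18")})

# Eleven deduced edges: the ten gold-standard edges plus one placeholder edge.
deduced = gold | {("GENE_X", "STE12")}
sens, spec = sensitivity_specificity(deduced, gold)
```

Here `sens` is 1.0 and `spec` rounds to 0.91, matching the corresponding Table 1 entries.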
In light of these results, we concluded that 0.01 is an optimal threshold that provides a good balance of sensitivity and specificity.

Prediction of transcriptional targets and modulators of transcriptional activity from MEGNs

As described in the previous three sections, the DBRF–MEGN method successfully deduced expected edges directing from Ste12p, Gcn4p and Mac1p to their transcriptional targets, those directing from its modulators (post-transcriptional regulators) to Ste12p, and those directing from their modulators to transcriptional targets of Gcn4p and Mac1p. These successful deductions suggest that transcriptional targets and modulators of a given transcriptional regulator can be predicted from MEGNs by interpreting edges directing from the transcriptional regulator and those directing to the regulator or its transcriptional targets. We examined such possible predictions as follows.

Fig. 3. Schemes to predict transcriptional targets and modulators of transcriptional activity from MEGNs. (A) Scheme for a transcriptional regulator that self-regulates its transcription. A gene is predicted to be a transcriptional target when an edge directs from this regulator to the gene (red edge). A gene is predicted to be a modulator of transcriptional activity when an edge directs from the gene to the regulator (green edge). (B) Scheme for a transcriptional regulator that does not self-regulate its own transcription. A gene is predicted to be a transcriptional target when an edge directs from this regulator to the gene (red edge). A gene is predicted to be a modulator of transcriptional activity when an edge directs from the gene to transcriptional targets of this regulator (green edge).

First, we considered the transcriptional targets of a given transcriptional regulator. An edge is expected to represent a transcriptional regulation by the transcriptional regulator when it directs from this regulator (Fig.
3), as was discussed for the transcriptional regulations by Ste12p (see the second paragraph of the section Validity of gene networks deduced by the DBRF–MEGN method: pheromone response pathway). Therefore, a gene is predicted to be a transcriptional target of a given transcriptional regulator when an edge directs from this regulator to the gene. Next, we considered modulators of transcriptional activity of a given transcriptional regulator. In this case, two alternative prediction schemes (schematically represented in Fig. 3A and B for prediction of modulators of transcriptional activity) should be used, depending on whether the given regulator self-regulates its own transcription or not. In the first alternative scheme, when the regulator self-regulates its transcription, deletion of a gene that encodes a modulator of the activity of this regulator increases (decreases) the mRNA level of this regulator, as was discussed for the post-transcriptional regulation cascade of Ste12p (see the third paragraph of the section Validity of gene networks deduced by the DBRF–MEGN method: pheromone response pathway). Therefore, a gene is predicted to be a modulator of transcriptional activity of a given transcriptional regulator when an edge directs from the gene to the given regulator (Fig. 3A). In the second alternative scheme, when the given transcriptional regulator does not self-regulate its own transcription, deletion of a gene that encodes a modulator of the activity of this regulator is expected not to influence the mRNA level of this regulator. Such deletion increases (decreases) the activity of this regulator, which then increases (decreases) the mRNA level of its transcriptional targets, as was discussed for the modulations of Gcn4p activity (see the third paragraph of the section Validity of gene networks deduced by the DBRF–MEGN method: general amino acid control system).
Therefore, a gene is predicted to be a modulator of the activity of a given transcriptional regulator when an edge directs from the gene to transcriptional targets of this regulator (Fig. 3B). Of the 265 genes in the profiles, 18 are listed as ‘transcriptional regulators’ in the Saccharomyces Genome Database (Cherry et al., 1998; see gene list in Supplementary information). Based on the above two schemes for transcriptional targets and modulators of transcriptional activity, we predicted transcriptional targets and modulators of transcriptional activity of those 18 genes from the MEGN deduced by the DBRF–MEGN method from the expression profiles of the 265 S. cerevisiae genes (Table 2). Nearly half (132) of the 265 genes were thus predicted as transcriptional targets or modulators of transcriptional activity or both. An important feature of a gene regulatory network is crosstalk between cellular processes (Schwikowski et al., 2000; Hinnebusch and Natarajan, 2002; Brun et al., 2003; Pawson and Nash, 2003). Because of crosstalk, it is expected that the activity of a single transcriptional regulator is modulated by genes involved in diverse cellular processes and that genes involved in a single cellular process modulate the activity of several different transcriptional regulators. To confirm the capability of our prediction schemes to predict such crosstalk-dependent modulations, we compared modulators predicted by our schemes with gene clusters generated by hierarchical clustering using the full set of profiles in the Rosetta Compendium (Hughes et al., 2000); genes belonging to the same cluster are likely to function in the same cellular process (Eisen et al., 1998). As expected, the predicted modulators of most transcriptional regulators involved genes belonging to several different clusters, and genes belonging to the same cluster were involved in the modulators of several different transcriptional regulators (Table 2).
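The prediction schemes of Fig. 3 amount to simple lookups on the edge list of an MEGN. The following is a minimal sketch under stated assumptions, not the software used for Table 2: an MEGN is represented as a list of (source, target, sign) triples, whether a regulator self-regulates is supplied by the caller, and the reported sign is simply that of the deduced edge. The function names and the toy genes R, T1, T2, M1 and M2 are hypothetical.

```python
from typing import List, Tuple

SignedEdge = Tuple[str, str, str]  # (source gene, target gene, "P" or "N")


def predict_targets(edges: List[SignedEdge], regulator: str) -> List[Tuple[str, str]]:
    """Genes reached by an edge directed from the regulator (red edges in Fig. 3)."""
    return sorted({(dst, sign) for src, dst, sign in edges
                   if src == regulator and dst != regulator})


def predict_modulators(
    edges: List[SignedEdge], regulator: str, self_regulating: bool
) -> List[Tuple[str, str]]:
    """Genes predicted to modulate the regulator's activity (green edges in Fig. 3)."""
    if self_regulating:
        # Scheme A: edges directed from a gene to the regulator itself.
        return sorted({(src, sign) for src, dst, sign in edges
                       if dst == regulator and src != regulator})
    # Scheme B: edges directed from a gene to the regulator's predicted targets.
    targets = {gene for gene, _ in predict_targets(edges, regulator)}
    return sorted({(src, sign) for src, dst, sign in edges
                   if dst in targets and src != regulator})


# Toy MEGN with hypothetical genes (not taken from the paper).
megn = [("R", "T1", "P"), ("R", "T2", "N"), ("M1", "T1", "P"), ("M2", "R", "N")]
print(predict_targets(megn, "R"))
print(predict_modulators(megn, "R", self_regulating=True))
print(predict_modulators(megn, "R", self_regulating=False))
```

In this toy network, scheme A picks out M2 and scheme B picks out M1, which shows why the choice of scheme has to follow from prior knowledge about the regulator.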
The results indicate that our prediction scheme can predict modulators that modulate activity of transcriptional regulators through crosstalk of cellular processes.

DISCUSSION

We developed the DBRF–MEGN method, an algorithm for deducing the most parsimonious SDGs consistent with large-scale expression profiles of gene deletants. One key feature of this method is compensation for excessively removed edges by restoring a minimum number of non-essential edges. This makes the method applicable not only to acyclic gene networks but also to cyclic gene networks. Our previous method, the DBRF method, fails to deduce the most parsimonious SDG when the target network is cyclic (Kyoda et al., 2000). This prevents the DBRF method from being widely used in the analysis of large-scale gene expression profiles, because real gene networks contain many feedback loops (Ferrell, 2002; Guelzim et al., 2002; Lee et al., 2002). The applicability of the DBRF–MEGN method to cyclic gene networks most probably will greatly improve the effectiveness of large-scale gene expression profiles. Another key feature of the DBRF–MEGN method is the implementation of independent groups of non-essential edges. This feature makes the method applicable to large-scale gene expression profiles by greatly reducing the computational cost of the process that deduces the minimum number of non-essential edges for the compensation. The method successfully deduced MEGNs from the large-scale expression profiles of 265 genes in 0.75 s, even when 16 384 different MEGNs exist. Without implementation of independent groups, such a deduction would take 3.8 × 10^15 years to obtain the same results. Despite the great reduction in computational cost by the use of independent groups, there is no guarantee that the method will deduce MEGNs from any given expression profiles in an acceptable time.
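The effect of independent groups on cost can be illustrated with simple counting. The following is a rough sketch and not the authors' implementation: this section does not spell out the search procedure, so the sketch assumes a brute-force enumeration of subsets of non-essential edges, and it assumes that solutions found in different groups combine freely. The function names and the group sizes are hypothetical.

```python
from math import prod
from typing import List


def joint_search_space(group_sizes: List[int]) -> int:
    """Subsets to examine if all non-essential edges are searched together."""
    return 2 ** sum(group_sizes)


def grouped_search_space(group_sizes: List[int]) -> int:
    """Subsets to examine if each independent group is searched on its own."""
    return sum(2 ** size for size in group_sizes)


def combined_solutions(solutions_per_group: List[int]) -> int:
    """Overall number of solutions when per-group solutions combine freely."""
    return prod(solutions_per_group)


# Hypothetical group sizes, for illustration only.
sizes = [3, 4, 2]
print(joint_search_space(sizes), grouped_search_space(sizes))
print(combined_solutions([2, 2, 2]))
```

Under these assumptions the grouped cost is dominated by the largest group, whereas the joint cost grows with the total number of edges, and a small number of alternatives per group can still multiply into a large number of overall solutions.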
The cost depends on the maximum number of edges among all independent groups, and this number corresponds to the modularity of the gene network. Gene networks are predicted to be highly modular (Hartwell et al., 1999; Ravasz et al., 2002; Rives and Galitski, 2003). Therefore, the DBRF–MEGN method will most probably deduce MEGNs from most sets of expression profiles in an acceptable time. A major advantage of the DBRF–MEGN method is the representation of gene networks. A gene network deduced by this method is represented by an SDG, the most common representation of gene networks in genetics and cell biology. This commonness allows the deduced gene networks to be compared with those identified through classical small-scale experiments and to be interpreted in the same way as those small-scale gene networks. We compared the pheromone response pathway, general amino acid control system, and copper and iron homeostasis system deduced by the DBRF–MEGN method with those reported in the literature, and found that the transcriptional targets and modulators of transcriptional activity of 18 transcriptional regulators were predicted from the MEGN of the expression profiles of 265 gene deletants. MEGNs probably will provide effective links between large- and small-scale gene network analyses and will provide important clues to understanding cellular function. Another advantage of the DBRF–MEGN method is the removal of redundant edges. The method removes as many non-essential edges as possible from the initially deduced edges. This makes the deduced graph simpler and a more direct representation of the molecular mechanisms of gene networks. The DBRF–MEGN method removed ∼20% (154 of 829) of the edges initially deduced from the S. cerevisiae gene expression profiles. In the pheromone response pathway, ∼65% of edges (23 out of 35) were removed. All these removed edges represent indirect regulations from post-transcriptional gene regulators to transcriptional targets of Ste12p.
Therefore, removal of these edges simplified interpretation of the MEGN in the pheromone response pathway. The DBRF–MEGN method has wider applicability and higher performance than the predictor method (Ideker et al., 2000), Pe’er et al.’s method (Pe’er et al., 2001) and Wagner’s method (Wagner, 2001). The predictor method (Ideker et al., 2000) is applicable only to acyclic gene networks, and its performance is lower than that of the DBRF method because of the Boolean modeling of gene networks in the predictor method (Kyoda et al., 2000). The DBRF–MEGN method is applicable to both acyclic and cyclic gene networks and is an improvement over the DBRF method. Pe’er et al.’s method estimates a Bayesian network that models gene networks, whereas the DBRF–MEGN method computes the exact solution of the most parsimonious SDG that is consistent with expression profiles. Pe’er et al.’s method failed to infer all gene regulations relating to STE12 (Pe’er et al., 2001), whereas the DBRF–MEGN method deduced 10 such regulations from the same expression profiles. Gene networks deduced by Wagner’s method (Wagner, 2001) have less information than those deduced by the DBRF–MEGN method. An edge deduced by Wagner’s method represents the direction of gene regulation but lacks information about whether the regulation is activation or inhibition, whereas an edge deduced by the DBRF–MEGN method has all this information. Wagner’s method avoids deducing the cycle structures of gene networks, whereas the DBRF–MEGN method deduces all candidates of such structures. Therefore, we conclude that the DBRF–MEGN method is better than all three of these methods.

Table 2. Predicted transcriptional targets and modulators of transcriptional activity

Transcriptional regulators | Effect* | Modulators of transcriptional activity | Transcriptional targets

ADE2 CKB2 MED2 RML2 RPS24A RTG1 RTS1 SOD1 VPS8 ALD5 HIS1 GCN4 P YMR014W YMR293C GLN3 N ASE1 ERG2 FKS1 IMP2' RPL27A SIR4 UBR1 YHL029C P ASE1 BIM1 CEM1 CLB2 DOT4 ISW2 MRT4 NPR2 PET117 RNR1 N AEP2 PFD2 RTG1 YOR051C P AEP2 AFG3 CUP5 ERG2 ERG28 MRPL33 RML2 RSM18 SSN6 RPL20A SBH2 SPF1 SSN6 SST2 VMA8 YHR011W YMR031W-A MAC1 AQY2A AQY2B YER024W VMA8 YMR031W-A YMR293C MBP1 N YEL044W P YHR031C YOR080W ERG3 FRE6 N ERG2 ERG3 ERG28 MED2 RAD6 STE12 P STE4 STE5 STE7 STE11 STE18 FAR1 FUS3 SST2 STE2 TEC1 RNR1 SWI4 P AEP2 AFG3 CEM1 CUP5 DIG1 ERG2 ERG3 ERG28 FKS1 FUS3 CLB2 CLB6 ERG6 HST3 MNN1 MRT4 GAS1 HPT1 MED2 MSU1 QCR2 RML2 RPL12A RPL27A RPS24A SWI5 RPS27B RSM18 SCS7 SGS1 SHE4 SIN3 SIR4 SPF1 SSN6 UBR1 VMA8 YEL044W YER083C YHL029C YHR011W YMR014W YMR031W-A YMR293C YOR078W N CKB2 CUP5 DOT4 ERG2 ERG3 MED2 RML2 RPD3 RPL8A ADE1 ADE2 ALD5 ARG80 ERG4 RPL12A RPL27A RPS24A RPS27B RTG1 RTS1 SIR4 SSN6 VMA8 YEL047C YHL029C YML011C YMR009W VPS8 YEL044W YMR014W YMR293C YOR078W SWI5 P AEP2 ASE1 CLB2 CUP5 DIG1 FKS1 FUS3 GAS1 GYP1 IMP2 JNM1 KIM4 KIN3 MSU1 OST3 QCR2 RAD57 RPS27B RSM18 SCS7 SGS1 SIR2 SST2 VMA8 YAR014C TEC1 N YER030W YIL117C P YER030W YIL117C N AEP2 ASE1 CLB2 CUP5 DIG1 FKS1 FUS3 GAS1 GYP1 IMP2 JNM1 KIM4 KIN3 MSU1 OST3 QCR2 RAD57 RPS27B RSM18 SCS7 SGS1 SIR2 SST2 VMA8 YAR014C TUP1 P N ADE2 AFG3 ERG28 MRT4 QCR2 RML2 RPD3 RRP6 RSM18 RTS1 ARD1 ERG2 FUS3 GYP1 HST3 ISW1 SHE4 SIN3 SIR4 SSN6 TOP3 VMA8 YEL033W YEL044W YHR011W KIN3 KSS1 RAD27 RPS24A SIR2 YMR031W-A YOR078W YOR080W YER083C YMR258C YOR015W AEP2 CUP5 ECM18 ERG3 ERG4 FUS3 HMG1 HOG1 RAD6 AQY2A BUB2 CAT8 CIN5 ECM34 GPA2 RML2 RPD3 RPS27B SIN3 SST2 YER083C MAC1 MAK10 NTA1 PEP12 PHD1 STE4 SWI4 UTR4 VPS21 YEL044W YEL059W YEL067C YER033C YHL029C YHR022C YHR031C YMR031C YOR009W YOR051C YAP1 P ASE1 BIM1 CEM1 CLB2 ISW2 NPR2 PET117 RNR1 RPL20A SBH2 N PFD2 RTG1 YOR051C SOD1 SPF1 SST2 VMA8 YER024W

A total of 18 transcriptional regulators that are both listed as ‘transcriptional regulators’ in the Saccharomyces Genome Database (Cherry et al., 1998) and included in the 265 genes in the profiles were analyzed (see gene list in Supplementary information). Transcriptional targets and modulators of transcriptional activity of these 18 transcriptional regulators were predicted from the MEGN deduced from the expression profiles of 265 S. cerevisiae gene deletants based on the two prediction schemes shown in Figure 3. All predicted transcriptional targets and modulators of transcriptional activity are shown. Gene clusters reported by Hughes et al. (2000) are represented by different colors: mitochondrial function (yellow), cell wall (brown), protein synthesis (sky blue), ergosterol biosynthesis (orange), mating (violet), MAPK activation (turquoise), rnr1 HU (red), histone deacetylase (blue), isw (purple), vacuolar ATPase/iron regulation (bright pink), sir (grey), tup1 ssn6 (light green), Gcn4 down (green) and Gcn4 up (bright green). * Indicates the positive (P) or negative (N) effect of transcriptional regulation or modulation of transcriptional activity.

The success of gene network deduction by the DBRF–MEGN method depends on the experimental conditions under which the expression profiles were obtained. The expression profiles used in the present study were obtained from asynchronous culture of deletant strains (Hughes et al., 2000). Therefore, the method could not deduce cell-cycle-specific gene regulations, such as MBP1 to CLB6 (Koch et al., 1993) and SWI6 to CLB6 (Dirick et al., 1998). The method also could not deduce diploid- or haploid-specific gene regulations, such as TUP1 to STE5 (Mukai et al., 1993) and SIN3 to STE2 (Vidal et al., 1991), because some profiles were obtained in diploid cells and others were obtained in haploid cells.
To deduce these types of regulations, expression profiles would need to be obtained through more controlled experiments, such as inactivation of gene function at a specific period of the cell cycle in synchronized culture and measurement of all the expression profiles in either diploid or haploid cells. Improvements in technologies that more accurately control experiments, such as real-time monitoring of the cell cycle and drug-induced rapid inactivation of gene function, will increase the effectiveness of the DBRF–MEGN method. A major drawback of the DBRF–MEGN method is its inability to deduce redundant gene regulations. The method deduces a gene regulation only when deletion of a single gene affects the expression level of another gene. Therefore, when two or more genes redundantly regulate a gene, the method cannot deduce any of these regulations. In the pheromone response pathway, the method could not deduce five redundant regulations from STE20, FUS3, KSS1, DIG1 and DIG2 to STE12. Importantly, this is not a drawback of our algorithm but of the expression profiles of single-gene deletants. One possible solution is to generate expression profiles of multiple gene deletants, although such generation might entail enormous experimental costs. We are now developing an algorithm applicable to such expression profiles. Transcriptional targets and modulators of transcriptional activity of given transcriptional regulators can be predicted from MEGNs. This prediction is an example of the interpretation of MEGNs. In this prediction, nearly half of the 265 S. cerevisiae genes were predicted as transcriptional targets or modulators of transcriptional activity or both. The remaining genes are likely transcriptional targets or modulators of transcriptional activity of transcriptional regulators that are not included in the 265 genes. In the pheromone response pathway, the DBRF–MEGN method deduced two unexpected edges, STE20 to STE2 and DIG1 to SST2.
In light of the prediction schemes described in the previous section, Ste20p is predicted to be a modulator of the transcriptional activity of some transcriptional regulator that is not included in the 265 genes and that regulates the transcription of STE2. Similarly, Dig1p is predicted to be a modulator of the transcriptional activity of some transcriptional regulator that is not included in the 265 genes and that regulates the transcription of SST2. One possible approach to predicting the functions of those remaining genes is to generate MEGNs from the expression profiles of all the S. cerevisiae genes. Giaever et al. (2002) generated single deletants of almost all S. cerevisiae genes; the expression profiles of all those deletants are highly anticipated. Many direct gene regulations are represented in MEGNs but not in the most parsimonious unsigned directed graphs consistent with expression profiles. The redundancy of edges is determined by both accessibility and effect of three edges in the DBRF–MEGN method, whereas it is determined only by accessibility when the most parsimonious unsigned directed graph is deduced. Therefore, a gene regulation whose accessibility is explained by two other regulations but whose effect is not explained is represented in the MEGN but not in the most parsimonious unsigned directed graph (e.g. regulation from gene h to gene a in Fig. 1B). Approximately 15% of edges (104 out of 675) in the MEGN deduced from S. cerevisiae expression profiles represent such regulations. Regulation from STE12 to FUS3 (Roberts et al., 2000) is one such regulation. Maki et al. (2001) proposed a combination approach, in which the most parsimonious unsigned directed graph is deduced from the expression profiles of gene deletants and then functions of its edges are inferred from time-series expression profiles. Integration of the DBRF–MEGN method probably will improve the performance of such combination approaches.
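The difference between signed and unsigned redundancy described above can be made concrete with a small check. The following is a minimal sketch and not the DBRF–MEGN algorithm itself: it covers only the case of one edge explained by exactly two others, it assumes that the effect of a two-edge path is the product of the edge signs (+1 for positive, -1 for negative), and it ignores the compensation step needed for cyclic networks. The function name and the toy genes x, y and z are hypothetical.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (regulating gene, regulated gene)


def explained_by_two_edges(edges: Dict[Edge, int], edge: Edge, use_sign: bool) -> bool:
    """Check whether `edge` is explained by a two-edge path src -> mid -> dst.

    With use_sign=False only accessibility is compared (unsigned graph).
    With use_sign=True the product of the two signs must also equal the
    sign of `edge` (signed graph).
    """
    src, dst = edge
    for (a, mid), first_sign in edges.items():
        if a != src or mid in (src, dst):
            continue
        second_sign = edges.get((mid, dst))
        if second_sign is None:
            continue
        if not use_sign or first_sign * second_sign == edges[edge]:
            return True
    return False


# Toy network with hypothetical genes: x activates y, y activates z, x represses z.
network = {("x", "y"): +1, ("y", "z"): +1, ("x", "z"): -1}
print(explained_by_two_edges(network, ("x", "z"), use_sign=False))
print(explained_by_two_edges(network, ("x", "z"), use_sign=True))
```

In this toy network the edge from x to z would be dropped from an unsigned parsimonious graph but kept in a signed one, because the path through y accounts for its accessibility but not for its negative effect.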
Large-scale gene expression profiles of gene deletants are invaluable sources for understanding cellular functions. Clustering has been the only method widely used in the analysis of these profiles. Although clustering can predict cellular processes that involve the target gene, it provides no overt information about the gene regulations, which make up gene networks. The DBRF–MEGN method deduces gene regulations from large-scale expression profiles of gene deletants. The method is applicable not only to expression profiles measured by using DNA microarrays but also to those measured by using other technologies, such as 2D-PAGE-MS (Gygi et al., 1999) and protein chips (Zhu et al., 2000). The DBRF–MEGN method will provide fruitful information for understanding cellular functions.

ACKNOWLEDGEMENTS

We thank K. Oka for his support and valuable discussion. We also thank A. Kimura, S. Hamahashi and M. Morohashi for critical comments on this manuscript. This work was supported in part by a grant from the Ministry of Agriculture, Forestry and Fisheries of Japan (Rice Genome Project SY1106) to H.K. and S.O.; a grant from Special Coordination Funds for Promoting Science and Technology (to H.K. and S.O.) and Grant-in-Aid for the 21st Century Center of Excellence (COE) Program entitled ‘Understanding and Control of Life’s Function via Systems Biology (Keio University)’ (to H.K.), the Ministry of Education, Culture, Sports, Science and Technology, the Japanese Government; and a grant from Institute for Bioinformatics Research and Development (BIRD), Japan Science and Technology Agency to S.O.

REFERENCES

Arlt,H., Tauer,R., Feldmann,H., Neupert,W. and Langer,T. (1996) The YTA10-12 complex, an AAA protease with chaperone-like activity in the inner membrane of mitochondria. Cell, 85, 875–885. Arthington-Skaggs,B.A., Crowell,D.N., Yang,H., Sturley,S.L. and Bard,M.
(1996) Positive and negative regulation of a sterol biosynthetic gene (ERG3) in the post-squalene portion of the yeast ergosterol pathway. FEBS Lett., 392, 161–165. Brun,C., Chevenet,F., Martin,D., Wojcik,J., Guénoche,A. and Jacq,B. (2003) Functional classification of proteins for the prediction of cellular function from a protein–protein interaction network. Genome Biol., 5, R6. Cherkasova,V.A. and Hinnebusch,A.G. (2003) Translational control by TOR and TAP42 through dephosphorylation of eIF2α kinase GCN2. Genes Dev., 17, 859–872. Cherry,J.M., Adler,C., Ball,C., Chervitz,S.A., Dwight,S.S., Hester,E.T., Jia,Y., Juvik,G., Roe,T., Schroeder,M. et al. (1998) SGD: Saccharomyces Genome Database. Nucleic Acids Res., 26, 73–79. De Freitas,J., Wintz,H., Kim,J.H., Poynton,H., Fox,T. and Vulpe,C. (2003) Yeast, a model organism for iron and copper metabolism studies. Biometals, 16, 185–197. De Freitas,J.M., Kim,J.H., Poynton,H., Su,T., Wintz,H., Fox,T., Holman,P., Loguinov,A., Keles,S., Van der Laan,M. et al. (2004) Exploratory and confirmatory gene expression profiling mac1. J. Biol. Chem., 279, 4450–4458. Dirick,L., Goetsch,L., Ammerer,G. and Byers,B. (1998) Regulation of meiotic S phase by Ime2 and a Clb5,6-associated kinase in Saccharomyces cerevisiae. Science, 281, 1854–1857. Donnini,C., Lodi,T., Ferrero,I. and Puglisi,P.P. (1992) IMP2, a nuclear gene controlling the mitochondrial dependence of galactose, maltose and raffinose utilization in Saccharomyces cerevisiae. Yeast, 8, 83–93. Eisen,M.B., Spellman,P.T., Brown,P.O. and Botstein,D. (1998) Cluster analysis and display of genome-wide expression patterns. Proc. Natl Acad. Sci., USA, 95, 14863–14868. Elion,E.A. (2001) The Ste5p scaffold. J. Cell Sci., 114, 3967–3978. Elion,E.A., Brill,J.A. and Fink,G.R. (1991) FUS3 represses CLN1 and CLN2 and in concert with KSS1 promotes signal transduction. Proc. Natl Acad. Sci., USA, 88, 9392–9396. Errede,B. and Ammerer,G. 
(1989) STE12, a protein involved in cell-type-specific transcription and signal transduction in yeast, is part of protein–DNA complexes. Genes Dev., 3, 1349–1361. Feng,L., Yoon,H. and Donahue,T.F. (1994) Casein kinase II mediates multiple phosphorylation of Saccharomyces cerevisiae eIF-2α (encoded by SUI2), which is required for optimal eIF-2 function in S. cerevisiae. Mol. Cell. Biol., 14, 5139–5153. Ferrell,J.E. (2002) Self-perpetuating states in signal transduction: positive feedback, double-negative feedback and bistability. Curr. Opin. Cell Biol., 14, 140–148. Finnegan,P.M., Ellis,T.P., Nagley,P. and Lukins,H.B. (1995) The mature AEP2 gene product of Saccharomyces cerevisiae, required for the expression of subunit 9 of ATP synthase, is a 58 kDa mitochondrial protein. FEBS Lett., 368, 505–508. Forgac,M. (1999) Structure and properties of the vacuolar (H+)-ATPases. J. Biol. Chem., 274, 12951–12954. Georgatsou,E. and Alexandraki,D. (1999) Regulated expression of the Saccharomyces cerevisiae Fre1p/Fre2p Fe/Cu reductase related genes. Yeast, 15, 573–584. Giaever,G., Chu,A.M., Ni,L., Connelly,C., Riles,L., Véronneau,S., Dow,S., Lucau-Danila,A., Anderson,K., André,B. et al. (2002) Functional profiling of the Saccharomyces cerevisiae genome. Nature, 418, 387–391. Guelzim,N., Bottani,S., Bourgine,P. and Képès,F. (2002) Topological and causal structure of the yeast transcriptional regulatory network. Nat. Genet., 31, 60–63. Gygi,S.P., Rochon,Y., Franza,B.R. and Aebersold,R. (1999) Correlation between protein and mRNA abundance in yeast. Mol. Cell. Biol., 19, 1720–1730. Hamer,L., Adachi,K., Montenegro-Chamorro,M.V., Tanzer,M.M., Mahanty,S.K., Lo,C., Tarpey,R.W., Skalchunes,A.R., Heiniger,R.W., Frank,S.A. et al. (2001) Gene discovery and gene function assignment in filamentous fungi. Proc. Natl Acad. Sci., USA, 98, 5110–5115. Hartwell,L.H., Hopfield,J.J., Leibler,S. and Murray,A.W. (1999) From molecular to modular cell biology. Nature, 402, C47–C52. Herskowitz,I., Rine,J.
and Strathern,J. (1992) Mating-type determination and mating-type interconversion in Saccharomyces cerevisiae. In Jones,E.W., Pringle,J.R. and Broach,J.R. (eds), The Molecular and Cellular Biology of the Yeast Saccharomyces. Cold Spring Harbor Laboratory Press, New York, pp. 583–656. Hinnebusch,A.G. (1992) General and pathway-specific regulatory mechanisms controlling the synthesis of amino acid biosynthetic enzymes in Saccharomyces cerevisiae. In Jones,E.W., Pringle,J.R. and Broach,J.R. (eds), The Molecular and Cellular Biology of the Yeast Saccharomyces. Cold Spring Harbor Laboratory Press, New York, pp. 319–414. Hinnebusch,A.G. (1997) Translational regulation of yeast GCN4. A window on factors that control initiator-tRNA binding to the ribosome. J. Biol. Chem., 272, 21661–21664. Hinnebusch,A.G. and Natarajan,K. (2002) Gcn4p, a master regulator of gene expression, is controlled at multiple levels by diverse signals of starvation and stress. Eukaryot. Cell, 1, 22–32. Hughes,T.R., Marton,M.J., Jones,A.R., Roberts,C.J., Stoughton,R., Armour,C.D., Bennett,H.A., Coffey,E., Dai,H., He,Y.D. et al. (2000) Functional discovery via a compendium of expression profiles. Cell, 102, 109–126. Ideker,T.E., Thorsson,V. and Karp,R.M. (2000) Discovery of regulatory interactions through perturbation: inference and experimental design. Pac. Symp. Biocomput., 305–316. Jenness,D.D., Burkholder,A.C. and Hartwell,L.H. (1983) Binding of α-factor pheromone to yeast a cells: chemical and genetic evidence for an α-factor receptor. Cell, 35, 521–529. Jensen,L.T. and Winge,D.R. (1998) Identification of a copper-induced intramolecular interaction in the transcription factor Mac1 from Saccharomyces cerevisiae. EMBO J., 17, 5400–5408. Kang,W., Matsushita,Y., Grohmann,L., Graack,H.R., Kitakawa,M. and Isono,K. (1991) Cloning and analysis of the nuclear gene for YmL33, a protein of the large subunit of the mitochondrial ribosome in Saccharomyces cerevisiae. J.
Bacteriol., 173, 4013–4020. Koch,C., Moll,T., Neuberg,M., Ahorn,H. and Nasmyth,K. (1993) A role for the transcription factors Mbp1 and Swi4 in progression from G1 to S phase. Science, 261, 1551–1557. Kornitzer,D., Raboy,B., Kulka,R.G. and Fink,G.R. (1994) Regulated degradation of the transcription factor Gcn4. EMBO J., 13, 6021–6030. Kwast,K.E., Lai,L.C., Menda,N., James,D.T.,III, Aref,S. and Burke,P.V. (2002) Genomic analyses of anaerobically induced genes in Saccharomyces cerevisiae: functional roles of Rox1 and other factors in mediating the anoxic response. J. Bacteriol., 184, 250–265. Kyoda,K.M., Morohashi,M., Onami,S. and Kitano,H. (2000) A gene network inference method from continuous-value gene expression data of wild-type and mutants. Genome Inform. Ser. Workshop Genome Inform., 11, 196–204. Lee,T.I., Rinaldi,N.J., Robert,F., Odom,D.T., Bar-Joseph,Z., Gerber,G.K., Hannett,N.M., Harbison,C.T., Thompson,C.M., Simon,I. et al. (2002) Transcriptional regulatory networks in Saccharomyces cerevisiae. Science, 298, 799–804. Liu,L.X., Spoerke,J.M., Mulligan,E.L., Chen,J., Reardon,B., Westlund,B., Sun,L., Abel,K., Armstrong,B., Hardiman,G. et al. (1999) High-throughput isolation of Caenorhabditis elegans deletion mutants. Genome Res., 9, 859–867. Lockhart,D.J., Dong,H., Byrne,M.C., Follettie,M.T., Gallo,M.V., Chee,M.S., Mittmann,M., Wang,C., Kobayashi,M., Horton,H. et al. (1996) Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat. Biotechnol., 14, 1675–1680. Maki,Y., Tominaga,D., Okamoto,M., Watanabe,S. and Eguchi,Y. (2001) Development of a system for the inference of large scale genetic networks. Pac. Symp. Biocomput., 446–458. Martins,L.J., Jensen,L.T., Simons,J.R., Keller,G.L. and Winge,D.R. (1998) Metalloregulation of FRE1 and FRE2 homologs in Saccharomyces cerevisiae. J. Biol. Chem., 273, 23716–23721. Mewes,H.W., Amid,C., Arnold,R., Frishman,D., Güldener,U., Mannhaupt,G., Münsterkötter,M., Pagel,P., Strack,N., Stümpflen,V. et al.
(2004) MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res., 32, D41–D44. Mukai,Y., Harashima,S. and Oshima,Y. (1993) Function of the ste signal transduction pathway for mating pheromones sustains MATα1 transcription in Saccharomyces cerevisiae. Mol. Cell. Biol., 13, 2050–2060. Myers,L.C., Gustafsson,C.M., Hayashibara,K.C., Brown,P.O. and Kornberg,R.D. (1999) Mediator protein mutations that selectively abolish activated transcription. Proc. Natl Acad. Sci., USA, 96, 67–72. Natarajan,K., Meyer,M.R., Jackson,B.M., Slade,D., Roberts,C., Hinnebusch,A.G. and Marton,M.J. (2001) Transcriptional profiling shows that Gcn4p is a master regulator of gene expression during amino acid starvation in yeast. Mol. Cell. Biol., 21, 4347–4368. Oehlen,L. and Cross,F.R. (1998) The mating factor response pathway regulates transcription of TEC1, a gene involved in pseudohyphal differentiation of Saccharomyces cerevisiae. FEBS Lett., 429, 83–88. Oehlen,L.J., McKinney,J.D. and Cross,F.R. (1996) Ste12 and Mcm1 regulate cell cycle-dependent transcription of FAR1. Mol. Cell. Biol., 16, 2830–2837. Padmanabha,R., Chen-Wu,J.L., Hanna,D.E. and Glover,C.V. (1990) Isolation, sequencing, and disruption of the yeast CKA2 gene: casein kinase II is essential for viability in Saccharomyces cerevisiae. Mol. Cell. Biol., 10, 4089–4099. Pan,C. and Mason,T.L. (1997) Functional analysis of ribosomal protein L2 in yeast mitochondria. J. Biol. Chem., 272, 8165–8171. Paul,M.F. and Tzagoloff,A. (1995) Mutations in RCA1 and AFG3 inhibit F1-ATPase assembly in Saccharomyces cerevisiae. FEBS Lett., 373, 66–70. Pawson,T. and Nash,P. (2003) Assembly of cell regulatory systems through protein interaction domains. Science, 300, 445–452. Pe’er,D., Regev,A., Elidan,G. and Friedman,N. (2001) Inferring subnetworks from perturbed expression profiles. Bioinformatics, 17(Suppl. 1), S215–S224. Pellman,D., McLaughlin,M.E. and Fink,G.R.
(1990) TATAdependent and TATA-independent transcription at the HIS4 gene of yeast. Nature, 348, 82–85. Planta,R.J. and Mager,W.H. (1998) The list of cytoplasmic ribosomal proteins of Saccharomyces cerevisiae. Yeast, 14, 471–477. Ramer,S.W. and Davis,R.W. (1993) A dominant truncation allele identifies a gene, STE20, that encodes a putative protein kinase necessary for mating in Saccharomyces cerevisiae. Proc. Natl Acad. Sci., USA, 90, 452–456. Ravasz,E., Somera,A.L., Mongru,D.A., Oltvai,Z.N. and Barabási,A.L. (2002) Hierarchical organization of modularity in metabolic networks. Science, 297, 1551–1555. Ren,B., Robert,F., Wyrick,J.J., Aparicio,O., Jennings,E.G., Simon,I., Zeitlinger,J., Schreiber,J., Hannett,N., Kanin,E. et al. (2000) Genome-wide location and function of DNA binding proteins. Science, 290, 2306–2309. Rives,A.W. and Galitski,T. (2003) Modular organization of cellular networks. Proc. Natl Acad. Sci., USA, 100, 1128–1133. Roberts,C.J., Nelson,B., Marton,M.J., Stoughton,R., Meyer,M.R., Bennett,H.A., He,Y.D., Dai,H., Walker,W.L., Hughes,T.R. et al. (2000) Signaling and circuitry of multiple MAPK pathways revealed by a matrix of global gene expression profiles. Science, 287, 873–880. Rutherford,J.C., Jaron,S. and Winge,D.R. (2003) Aft1p and Aft2p mediate iron-responsive gene expression in yeast through related promoter elements. J. Biol. Chem., 278, 27636–27643. Schena,M., Shalon,D., Davis,R.W. and Brown,P.O. (1995) Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science, 270, 467–470. Schwikowski,B., Uetz,P. and Fields,S. (2000) A network of protein– protein interactions in yeast. Nat. Biotechnol., 18, 1257–1261. Sprague,G.F.,Jr and Thorner,J.W. (1992) Pheromone response and signal transduction during the mating process of Saccharomyces Minimum equivalent gene networks cerevisiae. In Jones,E.W., Pringle,J.R. and Broach,J.R. 
(eds), The Molecular and Cellular Biology of the Yeast Saccharomyces, Cold Spring Harbor Laboratory Press, New York, pp. 657–744.
Tedford,K., Kim,S., Sa,D., Stevens,K. and Tyers,M. (1997) Regulation of the mating pheromone and invasive growth responses in yeast by two MAP kinase substrates. Curr. Biol., 7, 228–238.
van den Heuvel,J., Lang,V., Richter,G., Price,N., Peacock,L., Proud,C. and McCarthy,J.E. (1995) The highly acidic C-terminal region of the yeast initiation factor subunit 2α (eIF-2α) contains casein kinase phosphorylation sites and is essential for maintaining normal regulation of GCN4. Biochim. Biophys. Acta, 1261, 337–348.
Vidal,M., Strich,R., Esposito,R.E. and Gaber,R.F. (1991) RPD1 (SIN3/UME4) is required for maximal activation and repression of diverse yeast genes. Mol. Cell. Biol., 11, 6306–6316.
Vik,Å. and Rine,J. (2001) Upc2p and Ecm22p, dual regulators of sterol biosynthesis in Saccharomyces cerevisiae. Mol. Cell. Biol., 21, 6395–6405.
Wagner,A. (2001) How to reconstruct a large genetic network from n gene perturbations in fewer than n² easy steps. Bioinformatics, 17, 1183–1197.
Wang,H. and Jiang,Y. (2003) The Tap42-protein phosphatase type 2A catalytic subunit complex is required for cell cycle-dependent distribution of actin in yeast. Mol. Cell. Biol., 23, 3116–3125.
Warshall,S. (1962) A theorem on Boolean matrices. J. Assoc. Comput. Mach., 9, 11–12.
Winzeler,E.A., Shoemaker,D.D., Astromoff,A., Liang,H., Anderson,K., Andre,B., Bangham,R., Benito,R., Boeke,J.D., Bussey,H. et al. (1999) Functional characterization of the S. cerevisiae genome by gene deletion and parallel analysis. Science, 285, 901–906.
Yamaguchi-Iwai,Y., Ueta,R., Fukunaka,A. and Sasaki,R. (2002) Subcellular localization of Aft1 transcription factor responds to iron status in Saccharomyces cerevisiae. J. Biol. Chem., 277, 18914–18918.
Yang,R., Wek,S.A. and Wek,R.C. (2000) Glucose limitation induces GCN4 translation by activation of Gcn2 protein kinase. Mol. Cell.
Biol., 20, 2706–2717.
Zhu,H., Klemic,J.F., Chang,S., Bertone,P., Casamayor,A., Klemic,K.G., Smith,D., Gerstein,M., Reed,M.A. and Snyder,M. (2000) Analysis of yeast protein kinases using protein chips. Nat. Genet., 26, 283–289.