The COSTEX model: a cost-benefit model relating gene expression and selection
Daniel Kahn, Jean-François Gout & Laurent Duret
Laboratoire de Biométrie & Biologie Evolutive, Lyon 1 University, INRIA BAMBOO team & INRA MIA Department

Whole genome duplications as a tool to investigate dosage selection
Following whole-genome duplication (WGD):
• Relative gene dosage is initially unchanged
• Duplicated genes are gradually lost with probability inversely related to selective pressure
• This may be exploited to analyze selective pressure on gene dosage
D. Kahn, COSTEX model

Duplications in the Paramecium genome
Aury et al., 2006, Nature 444:171-178

Three successive rounds of WGD
Gene content: 2 x 2 x 2

Fate of genes after WGD
A brief introduction about Whole-Genome Duplications (WGDs)
• WGD creates identical copies of all genes (ohnologs)
• Mutations lead to pseudogenization of some ohnologs
• Finally, only a few pairs of genes are retained
[Figure labels: ohnologon; ohnologon that lost one copy; retained ohnologon]
Relationship between gene retention and gene expression
[Figure: frequency of gene retention versus expression level (log2).]
Data from the Paramecium post-genomics consortium, Jean Cohen & coll.

Model for expression-dependent selection
• Protein expression has a cost => trade-off between cost and benefit
• The model assumes that expression was optimal before WGD
• In vitro evolution experiments have shown that an optimum can indeed be reached in only a few hundred generations (e.g. Dekel & Alon, 2005)

Modelling the cost of expression
Dekel & Alon, 2005, Nature 436:588-592
$$C(X) = \frac{kX}{M - X}$$
where C(X) is the cost function, X the expression level, k the cost parameter and M the maximal capacity.
[Figure: expression cost C(X) versus expression level X, with the maximal capacity M marked.]

Cost-benefit optimization
[Figure: benefit B(X) and cost C(X) versus expression level X; fitness versus expression level, with the optimum at X_0.]

The COSTEX model
Express fitness as a function of expression x relative to the optimum level X_0:
$$x = \frac{X}{X_0}$$
$$w(x) = B(x) - \frac{k X_0 x}{M - X_0 x}$$

The COSTEX model
Approximate fitness around the optimum X_0 by Taylor expansion:
$$w(x) \approx 1 + \frac{1}{2}\,\frac{\partial^2 w}{\partial x^2}(1)\,(x - 1)^2$$
Therefore selection on expression can be quantified by:
$$\frac{\partial^2 w}{\partial x^2}(1) = \frac{d^2 B}{dx^2}(1) - \frac{2kMX_0^2}{(M - X_0)^3} < 0$$

Expression-dependent fitness
[Figure: fitness versus relative dosage or expression X/X_0 for low, medium and high X_0, with the loss of duplication marked.]

Selection against loss of duplicated gene
Fitness loss:
$$s = \frac{kMX_0^2}{4(M - X_0)^3} - \frac{1}{8}\,\frac{d^2 B}{dx^2}(1)$$
[Figure: fitness loss versus optimal expression X_0.]

Selection against pseudogene formation
Pseudogene formation after WGD entails a loss of fitness that can be expressed in the COSTEX model:
$$s = B(1) - B\left(\tfrac{1}{2}\right) \approx \frac{1}{2}\,\frac{dB}{dx}(1) = \frac{kMX_0}{2(M - X_0)^2}$$
The last equality uses the optimum condition dB/dx(1) = dC/dx(1), with the cost written as a function of x.
Therefore the pseudogenization path to gene loss is also under expression-dependent selection: the more highly a gene is expressed, the less likely the fixation of disabling mutations.
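The two selection coefficients above can be evaluated numerically to see how they grow with the optimal expression level X_0. The following is a minimal sketch, not the authors' implementation: the function names, the parameter values (k = 0.02, M = 1.0) and the default curvature d2B = 0 are assumptions chosen only for illustration, since the slides give no form for B(x).

```python
"""Sketch of COSTEX selection coefficients (illustrative parameters only)."""


def expression_cost(X, k, M):
    """Cost C(X) = k*X / (M - X), defined for 0 <= X < M."""
    if not 0 <= X < M:
        raise ValueError("expression level must satisfy 0 <= X < M")
    return k * X / (M - X)


def s_gene_loss(X0, k, M, d2B=0.0):
    """Fitness loss when one duplicate is lost (relative dosage x = 1/2).

    s = k*M*X0**2 / (4*(M - X0)**3) - d2B/8, where d2B stands for the
    curvature of the benefit function at x = 1. Its value is an
    assumption here (the source does not specify B); it should be <= 0.
    """
    return k * M * X0**2 / (4 * (M - X0) ** 3) - d2B / 8


def s_pseudogene(X0, k, M):
    """Fitness loss for pseudogenization: s = k*M*X0 / (2*(M - X0)**2)."""
    return k * M * X0 / (2 * (M - X0) ** 2)


if __name__ == "__main__":
    k, M = 0.02, 1.0  # hypothetical cost parameter and maximal capacity
    for X0 in (0.1, 0.3, 0.5):
        print(
            f"X0={X0:.1f}  cost={expression_cost(X0, k, M):.5f}  "
            f"s_loss={s_gene_loss(X0, k, M):.5f}  "
            f"s_pseudo={s_pseudogene(X0, k, M):.5f}"
        )
```

With these assumed parameters, both coefficients increase with X_0, which is the qualitative behaviour the model relies on; the absolute values carry no biological meaning.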
Expression constrains evolutionary rates
More generally, mutations that decrease the benefit function by a fraction a are counter-selected in an expression-dependent manner in the COSTEX model:
$$s(a) = a\,B(1) = a\left(1 + \frac{kX_0}{M - X_0}\right)$$
Mutations with an equivalent effect on protein function are more deleterious for highly expressed genes because of higher expression cost, a price the organism had to 'pay' for their function.
This relationship also applies for potentially suboptimal expression X ≠ X_0.

Expression constrains evolutionary rates
Expression is the best predictor of evolutionary rates in coding sequences (Duret & Mouchiroud, 2000; Drummond et al., 2006)
[Figure: Drummond et al., 2005, PNAS 102:14338]

Expression-dependent selection
• The COSTEX model can explain the relationship between retention rate and gene expression
• The model is also supported by gene knockout experiments in yeast (measure of fitness in heterozygotes wt/KO)
• The model predicts that the higher the expression level, the more strongly it is conserved in evolution
• It also explains the observation that highly expressed genes have low rates of sequence evolution

Retention of metabolic genes
• Unexpected observation that metabolic genes are more retained than other genes following WGD
• However, little selective pressure is expected on the dosage of individual enzyme genes (Kacser & Burns, 1981)
• Is this a paradox?

Metabolic genes are more expressed

High retention of metabolic genes: why?
Retention of metabolic genes is best explained by selection for gene expression.
Although the loss of individual enzyme genes should generally be neutral, each successive loss will be more and more counter-selected. For instance in a linear pathway:
$$J = \frac{J_0}{1 + \sum_{i=1}^{p} C_i^{J_0}}$$
where p is the number of enzyme genes that have lost one copy and C_i^{J_0} are the flux control coefficients.
Ultimately this would result in half of the flux, which should be strongly counter-selected in general.
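The linear-pathway formula can be explored with a short numerical sketch. It assumes that p counts the enzymes whose dosage has been halved and that the control coefficients sum to 1; the function name and the example pathway of ten enzymes with equal control coefficients are hypothetical, chosen only to illustrate the formula.

```python
"""Sketch: relative flux J/J0 in a linear pathway after successive gene losses."""


def relative_flux(control_coeffs, p):
    """Return J/J0 = 1 / (1 + sum of the first p control coefficients).

    control_coeffs: flux control coefficients C_i of the pathway enzymes,
                    assumed to sum to 1.
    p: number of enzymes (taken in list order) whose dosage was halved.
    """
    if abs(sum(control_coeffs) - 1.0) > 1e-9:
        raise ValueError("control coefficients should sum to 1")
    if not 0 <= p <= len(control_coeffs):
        raise ValueError("p must lie between 0 and the number of enzymes")
    return 1.0 / (1.0 + sum(control_coeffs[:p]))


if __name__ == "__main__":
    coeffs = [0.1] * 10  # hypothetical pathway: 10 enzymes, equal control
    for p in range(len(coeffs) + 1):
        print(f"{p:2d} duplicates lost -> J/J0 = {relative_flux(coeffs, p):.3f}")
```

In this example a single loss leaves the flux close to its original value, whereas losing all ten duplicates halves it, matching the limiting case stated above.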
Metabolic fluxes are not proportional to enzyme activities
• They typically show a hyperbolic dependency
• Most enzymes have low control on flux
• Summation theorem:
$$\sum_{i=1}^{n} C_i^{J_0} = 1$$
Kacser & Burns 1981, Genetics 97:639-666
• Therefore little selective pressure is expected on the dosage of individual enzyme genes
• This is a classical explanation of the recessivity of metabolic defects

Ongoing dynamics of gene inactivation
• 49% loss of duplicated genes following the recent WGD
• Contrary to initial expectation, metabolic genes are more retained than other genes: 42% gene loss (n = 1,144 metabolic genes, P-value < 10^-3)
• Why?
Gout, Duret & Kahn 2009, Mol. Biol. Evol., in press

[Figure: gene frequency versus number of genes within ohnologon (1 to 8).]
[Figure: gene loss frequency for "1 gene before WGD" versus "2 genes or more"; panel a, recent WGD; panel b, intermediary WGD.]
[Figure: relative fitness versus relative dosage or expression.]

P. tetraurelia: the best model organism for studying WGDs
P. tetraurelia: 3 successive WGDs with different loss rates (Aury et al., 2006): 92%, 76%, 49%