A Microarray-Based Screening Procedure for Detecting Differentially Represented Yeast Mutants

Rafael A. Irizarry
Department of Biostatistics, JHU
http://biostat.jhsph.edu/~ririzarr

[Figure, panels A and B: DOWNTAG, kanR, UPTAG; circular pRS416 and EcoRI-linearized pRS416 (CEN/ARS); transformation into deletion pool; select for Ura+ transformants; genomic DNA preparation; PCR; Cy5- and Cy3-labeled PCR products; oligonucleotide array hybridization; NHEJ defective]

Which mutants are NHEJ defective?
• Find mutants defective for transformation with linear DNA
• Dead in linear transformation (green)
• Alive in circular transformation (red)
• Look for spots with large log(R/G)

[Figure: array spots labeled YKU70, NEJ1, YKU80]

Data
• 5718 mutants
• 3 replicates on each slide
• 5 haploid slides, 4 diploid slides
• Arrays are divided into 2 downtags, 3 uptags (2 of which replicate uptags)

[Figure: Average Red and Green Scatter Plot]

[Figure: Average Red and Green MVA Plot]

Improvement to usual approach
• Take into account that some mutants are dead and some alive
• Use a statistical model to represent this
• Mixture model?
• With ratios we lose information about R and G separately
• Look at them separately (absolute analysis)

[Figure: Histograms]

Using the model we can attach uncertainty to tests
• For example, the posterior z-test: a weighted average of z-tests, with weights given by the posterior probabilities (obtained from EM)
• Is Normal(0,1)

[Figure: QQ-Plot]

[Figure: Uptag/Downtag Z-Scores]

[Figure: Average Red and Green MVA Plot]

[Figure: Average Red and Green Scatter Plot]

Results Table
 1  YMR106C   9.5  47    69.2  a  a  100
 2  YOR005C  19.7  35    44.9  a  d  100
 3  YLR265C   6.1  32    35.8  a  m  100
 4  YDL041W  10.4  32    35.6  a  m  100
 5  YIL012W  12.2  31    21.7  a  a  100
 6  YIL093C   4.8  29    30.8  a  a  100
 7  YIL009W   5.6  29   -23.5  a  a  100
 8  YDL042C  12.9  29    32.1  a  d  100
 9  YIL154C   1.8  28    91.3  m  m   82
10  YNL149C   1.7  27    93.4  m  d   71
11  YBR085W   2.5  26   -15.8  a  a   84
12  YBR234C   1.7  26    87.5  m  d   75
13  YLR442C   6.1  26  -100.0  a  a  100

Acknowledgements
• Siew Loon Ooi
• Jef Boeke
• Forrest Spencer
• Jean Yang

END

Summary
• Simple data exploration useful tool for quality assessment
• Statistical thinking helpful for interpretation
• Statistical models may help find signals in noise

Acknowledgements
Biostatistics: Karl Broman, Leslie Cope, Carlo Coulantoni, Giovanni Parmigiani, Scott Zeger
MBG (SOM): Jef Boeke, Siew-Loon Ooi, Marina Lee, Forrest Spencer
UC Berkeley Stat: Ben Bolstad, Sandrine Dudoit, Terry Speed, Jean Yang
Gene Logic: Francois Colin, Uwe Scherf’s Group
WEHI: Bridget Hobbs, Natalie Thorne
PGA: Tom Cappola, Skip Garcia, Joshua Hare

Warning
• Absolute analyses can be dangerous for competitive hybridization slides
• We must be careful about “spot effect”
• Big R or G may only mean the spot they were on had large amounts of cDNA
• Look at some facts that make us feel safer

Correlation between replicates
      R1    R2    R3    G1    G2    G3
R1  1.00  0.95  0.95  0.94  0.90  0.90
R2  0.95  1.00  0.96  0.90  0.95  0.91
R3  0.95  0.96  1.00  0.91  0.92  0.95
G1  0.94  0.90  0.91  1.00  0.96  0.96
G2  0.90  0.95  0.92  0.96  1.00  0.97
G3  0.90  0.91  0.95  0.96  0.97  1.00

Correlation between red, green, haploid, diploid, uptag, downtag
      RHD   RHU   RDD   RDU   GHD   GHU   GDD   GDU
RHD  1.00  0.59  0.56  0.32  0.95  0.58  0.54  0.37
RHU  0.59  1.00  0.38  0.56  0.58  0.95  0.40  0.58
RDD  0.56  0.38  1.00  0.58  0.54  0.39  0.92  0.64
RDU  0.32  0.56  0.58  1.00  0.33  0.53  0.58  0.89
GHD  0.95  0.58  0.54  0.33  1.00  0.62  0.56  0.39
GHU  0.58  0.95  0.39  0.53  0.62  1.00  0.41  0.58
GDD  0.54  0.40  0.92  0.58  0.56  0.41  1.00  0.73
GDU  0.37  0.58  0.64  0.89  0.39  0.58  0.73  1.00

BTW: the mean squared error across slides is about 3 times bigger than the mean squared error within slides

Mixture Model
We use a mixture model that assumes:
• There are three classes:
  – Dead
  – Marginal
  – Alive
• Normally distributed with same correlation structure from gene to gene

Random effect justification
Each x = (r1,…,r5,g1,…,g5) will have the following effects:
• Individual effect: same mutant, same expression (replicates are alike)
• Genetic effect: same genetics, same expression
• PCR effect: expect difference in uptag, downtag

Does it fit?

What can we do now that we couldn’t do before?
• Define a t-test that takes into account whether mutants are dead or not when computing variance
• For each gene compute likelihood ratios comparing two hypotheses: alive/dead vs. dead/dead or alive/alive

QQ-plot for new t-test
• Better looking than others
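
The three-class mixture (dead, marginal, alive) fitted by EM, and the idea of weighting z-tests by posterior class probabilities, can be illustrated with a minimal sketch. It rests on several assumptions that are not in the slides: the data are simulated one-dimensional log intensities, each class is a univariate Normal (the slides describe a multivariate model with a shared correlation structure, which is not attempted here), and the function names `class_posteriors`, `em_gaussian_mixture` and `posterior_weighted_z` are hypothetical. The weighted z-score is only one possible reading of "posterior z-test".

```python
import numpy as np

def class_posteriors(x, weights, means, sds):
    """Return (posterior matrix of shape (n, k), total log-likelihood)."""
    # log of weight_k * Normal(x_i | mean_k, sd_k) for every point and class
    log_p = (np.log(weights) - np.log(sds) - 0.5 * np.log(2.0 * np.pi)
             - 0.5 * ((x[:, None] - means) / sds) ** 2)
    log_norm = np.logaddexp.reduce(log_p, axis=1)
    return np.exp(log_p - log_norm[:, None]), log_norm.sum()

def em_gaussian_mixture(x, n_classes=3, n_iter=500, tol=1e-8):
    """Fit a 1-D Gaussian mixture by EM (illustrative, not the slides' model)."""
    x = np.asarray(x, dtype=float)
    # Start the class means at spread-out quantiles of the data.
    means = np.quantile(x, np.linspace(0.1, 0.9, n_classes))
    sds = np.full(n_classes, x.std() / n_classes + 1e-6)
    weights = np.full(n_classes, 1.0 / n_classes)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability of each class for each point.
        post, ll = class_posteriors(x, weights, means, sds)
        if ll - prev_ll < tol:
            break
        prev_ll = ll
        # M-step: posterior-weighted proportions, means and variances.
        nk = post.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (post * x[:, None]).sum(axis=0) / nk
        var = (post * (x[:, None] - means) ** 2).sum(axis=0) / nk
        sds = np.sqrt(np.maximum(var, 1e-8))
    # Posteriors under the final parameter estimates.
    post, _ = class_posteriors(x, weights, means, sds)
    return weights, means, sds, post

def posterior_weighted_z(x, means, sds, post):
    """Average the per-class z-scores, weighted by posterior class probability."""
    z_by_class = (np.asarray(x, dtype=float)[:, None] - means) / sds
    return (post * z_by_class).sum(axis=1)

# Simulated log intensities: 300 "dead", 200 "marginal", 500 "alive" (assumed values).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(6.0, 0.5, 300),
                    rng.normal(9.0, 0.5, 200),
                    rng.normal(12.0, 0.5, 500)])

weights, means, sds, post = em_gaussian_mixture(x)
z = posterior_weighted_z(x, means, sds, post)
print("estimated class means:", np.round(np.sort(means), 2))
print("estimated class proportions:", np.round(weights, 2))
```

Each row of `post` gives the estimated probability that a spot belongs to each class, so uncertainty about whether a mutant is dead, marginal or alive carries through to the test statistic instead of being fixed by a hard cutoff.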