2003 Joint Meeting of Statistics-Related Societies (統計関連学会連合大会)
Organized session: Statistical Issues in DNA Array Data Analysis
An Overview of DNA Array Data

Seiya Imoto (1), Tomoyuki Higuchi (2)
(1) Human Genome Center, Institute of Medical Science, University of Tokyo
(2) The Institute of Statistical Mathematics
(C) Copyright 2003 Seiya Imoto, Human Genome Center, University of Tokyo

Joint Statistical Meetings 2003 in San Francisco

August 4
  8:30-10:20   Analysis of gene expression data (p.96)
  10:30-12:20  Bayesian and mixture method in genomics data (p.125)
  10:30-12:20  Data analysis of microarray data (p.133)
August 5
  10:30-12:20  Classification of gene expression data (p.246)
  14:00-15:50  Microarray data analysis (p.276)
August 6
  8:30-10:20   Statistical issues in image analysis, microarrays, and machine learning (p.305)
  10:30-12:20  Bayesian methods for microarray data analysis (p.342)
  10:30-12:20  Statistics and genomics (p.345)
  10:30-12:20  Analysis of genetic data II (p.370)
August 7
  8:30-10:20   Statistics and microarrays (p.422)
  8:30-10:20   Normalization of microarray data (p.445)
  10:30-12:20  Multivariate approaches to gene expression data (p.465)

Gene expression data
  cDNA microarray data
  Oligonucleotide arrays (Affymetrix, GeneChip®)
  Macroarrays (radioisotope)

[Figure: cDNA microarray image]
  Red means Cell A < Cell B
  Green means Cell A > Cell B
  Yellow means Cell A = Cell B

The transfer of information from DNA to protein
  DNA (gene, e.g. AGGTTCAGCGC)
  Transcription: DNA to mRNA
  Splicing: a process that removes introns and joins exons in RNAs.
    exon: coding region
    intron: noncoding region
  Translation: mRNA to protein

cDNA microarray
  Reference cell and experimental cell
  Extract mRNA from all genes
  Colored cDNA
  Hybridize to chip

GeneX is more highly expressed in Cell B than in Cell A
  Cell A, Cell B
  Labeled cDNA from geneX
  Hybridize to chip
  The spot of geneX carries the sequence complementary to the colored cDNA.
  This spot shows a red color.
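The color convention for spots (red, green, yellow) can be sketched in a few lines of Python. This is only an illustration and not part of the original slides: the function name `spot_color` and the tolerance `tol` (how far the log2 ratio may deviate from zero and still count as "equal") are assumptions introduced here, and real scanners produce a continuous color scale rather than three classes.

```python
import math


def spot_color(cell_a: float, cell_b: float, tol: float = 0.5) -> str:
    """Classify a spot by comparing the intensities of Cell A and Cell B.

    `tol` is a hypothetical threshold on |log2(B / A)| below which the
    two cells are treated as equally expressed (yellow).
    """
    if cell_a <= 0 or cell_b <= 0:
        raise ValueError("intensities must be positive")
    log_ratio = math.log2(cell_b / cell_a)
    if log_ratio > tol:
        return "red"      # Cell A < Cell B
    if log_ratio < -tol:
        return "green"    # Cell A > Cell B
    return "yellow"       # Cell A = Cell B (within tolerance)


# GeneX is more highly expressed in Cell B than in Cell A.
print(spot_color(cell_a=1000.0, cell_b=4000.0))
```

Working on the log2 scale makes the classification symmetric: a fourfold increase and a fourfold decrease are equally far from zero.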

cDNA microarray: spotting machine
  This machine can make 48 microarrays simultaneously (in one day).
  Colored cDNAs are put at the tips of the needles.
  A 384-well plate contains 384 colored cDNAs. Yeast has over 6,000 genes, so the 384-well plate must be changed 16 times.
  Dip 32 spots at once.

Example of measured data
  Columns on the slide: green intensity, green background, green background-corrected intensity, red intensity, red background, red background-corrected intensity, ratio (red b.g.-corrected)/(green b.g.-corrected), systematic name, gene function.
  In the table, Ctrl is the green channel and Data is the red channel. "D x A - PSL" is the raw intensity and "sDxA" is the background-corrected intensity. The background (Bkgd) is constant within each channel: 512.92 for Ctrl and 1779.913 for Data. Ratio (sDxA) is Data / Ctrl.

Spot    Ctrl DxA-PSL  Ctrl sDxA  Data DxA-PSL  Data sDxA  Ratio     Systematic name
A_1_1   59358.75      58845.83   50953.13      49173.22   0.835628  YAL003W
A_1_2   1209.19       696.271    2522.345      742.4323   1.066298  YAR053W
A_1_3   1948.2        1435.28    3100.152      1320.239   0.919848  YBL078C
A_1_4   4940.806      4427.886   6670.604      4890.691   1.104521  YAL008W
A_1_5   1485.59       972.671    2916.086      1136.173   1.168096  YAR062W
A_1_6   32642.03      32129.11   42304.13      40524.22   1.261293  YBL087C
A_1_7   6919.441      6406.521   8540.246      6760.333   1.055227  YAL014C
A_1_8   2698.301      2185.382   4314.47       2534.557   1.159778  YAR068W
A_1_9   7167.958      6655.038   7379.286      5599.373   0.841374  YBL100C
A_1_10  5470.062      4957.142   6953.799      5173.886   1.043724  YAL025C
A_1_11  27879.49      27366.57   33746.9       31966.99   1.168103  YBL002W
A_1_12  2589.613      2076.693   4385.568      2605.655   1.254713  YBL107C
A_1_13  6196.245      5683.326   8840.475      7060.562   1.242329  YDR044W
A_1_14  34737.1       34224.18   36129.62      34349.7    1.003668  YDR134C
A_1_15  34035.35      33522.43   27128.53      25348.62   0.756169  YDR233C
A_1_16  1638.381      1125.461   2988.042      1208.129   1.073453  YDR048C
A_1_17  3873.718      3360.799   4955.141      3175.228   0.944784  YDR139C
A_1_18  2433.625      1920.706   3502.406      1722.493   0.896802  YDR252W
A_1_19  1800.736      1287.816   3011.855      1231.942   0.956613  YDR053W
A_1_20  1296.689      783.77     2636.549      856.6356   1.092968  YDR149C
A_1_21  3453.24       2940.32    4968.026      3188.113   1.084274  YDR260C
A_1_22  10731.55      10218.63   9307.246      7527.333   0.736629  YDR056C
A_1_23  6191.309      5678.39    8808.398      7028.485   1.23776   YDR152W
A_1_24  3589.998      3077.078   4420.744      2640.831   0.858227  YDR269C
A_1_25  27568.34      27055.42   20856.2       19076.29   0.705082  YGL189C
A_1_26  1956.182      1443.262   3150.716      1370.803   0.949795  YGL261C

  Gene function annotations, in the order they appear on the slide (the row alignment is not recoverable for every entry): translation elongation factor eef1beta; hypothetical protein; essential for autophagy; protein of unknown function; putative pseudogene; 60s large subunit ribosomal protein l23.e; strong similarity to hypothetical protein yhr214w; questionable orf; nuclear viral propagation protein; histone h2b.2; hypothetical protein; coproporphyrinogen iii oxidase; strong similarity to flo1p, flo5p, flo9p and ylr110c; similarity to hypothetical protein ydl204w; questionable orf; ubiquitin-like protein; strong similarity to egd1p and to human btf3 pro; questionable orf; questionable orf; hypothetical protein; hypothetical protein; weak similarity to c.elegans hypothetical protein; questionable orf; 40s small subunit ribosomal protein s26e.c7; strong similarity to members of the srp1/tip1 fam

Data
  1. $\{\mathrm{Cy3}_{ij}, \mathrm{Cy5}_{ij}\}$: expression data of the $j$-th gene observed on the $i$-th array
  2. $\{\mathrm{Cy3}^{B}_{ij}, \mathrm{Cy5}^{B}_{ij}\}$: intensities corrected for the background intensity
  3. $\{\mathrm{Cy3}^{BN}_{ij}, \mathrm{Cy5}^{BN}_{ij}\}$: normalized intensities
  4. $x_{ij} = \log_2 \left( \mathrm{Cy5}^{BN}_{ij} / \mathrm{Cy3}^{BN}_{ij} \right)$: log transformation

Normalization 1 (global normalization)
  [Figure]

Normalization 2 (local normalization)
  [Figure]

Available microarray data 1
  Stanford (SMD database)
  Human (Homo sapiens)
  Baker's yeast (Saccharomyces cerevisiae)
  Nematode (Caenorhabditis elegans)
  Abstracts of the papers and descriptions of the data
  http://genome-www5.stanford.edu/MicroArray/SMD/

Available microarray data 2
  KEGG database
  Cyanobacterium (Synechocystis sp. PCC6803)
  Bacillus subtilis
  Escherichia coli K-12 W3110
  Abstracts of the papers
  http://www.genome.ad.jp/

Available microarray data 3
  Golub et al. (1999). Science.
  Classification of the blood cancers AML and ALL
  38 patients (training data), 34 patients (test data)
  http://contest.genome.ad.jp/problem2.html

Lecture notes on microarray data analysis
  Terry Speed ed. (2002). Statistical analysis of gene expression microarray data. CHAPMAN&HALL/CRC
  Sorin Draghici. (2003). Data analysis tools for DNA microarrays. CHAPMAN&HALL/CRC

Other resources
  Grant-in-Aid for Scientific Research (KAKENHI) symposium "Mathematical Foundations of Biostatistics" (December 2002, Graduate School of Mathematical Sciences, University of Tokyo)
  Tutorial: An introduction to gene expression data analysis. Tetsutaro Hamano (濱野鉄太郎), Yoichi Ito (伊藤陽一), Seiya Imoto (井元清哉)
  http://www.ms.u-tokyo.ac.jp/~nakahiro/sympo14/tu1

  Biometric Society of Japan (日本計量生物学会), 2003 symposium, special session "Development of statistical methodology for microarray data analysis"
  Seiya Imoto (井元清哉), Megu Ohtaki (大瀧慈)
  http://bonsai.ims.u-tokyo.ac.jp/~imoto/imoto_biometrics2003.pdf
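
Worked example: the preprocessing steps listed under "Data" (background correction, normalization, log transformation) can be sketched as follows, using the first three spots of the measured-data table. This is a minimal sketch under stated assumptions, not the procedure used for the slides: the function names are hypothetical, and centering the log ratios at their median is just one common choice of global normalization, since the slides name the step without specifying the method.

```python
import math
from statistics import median


def background_correct(raw, background):
    """Step 2: subtract the background intensity from each raw intensity."""
    return [r - background for r in raw]


def log_ratios(reference, experimental):
    """Step 4: log2 of experimental over reference, spot by spot."""
    return [math.log2(e / r) for r, e in zip(reference, experimental)]


def center_by_median(values):
    """Assumed global normalization: shift log ratios so their median is 0."""
    m = median(values)
    return [v - m for v in values]


# Raw intensities and backgrounds of spots A_1_1 to A_1_3 from the table.
ctrl_raw = [59358.75, 1209.19, 1948.2]
data_raw = [50953.13, 2522.345, 3100.152]
ctrl = background_correct(ctrl_raw, 512.92)
data = background_correct(data_raw, 1779.913)

# Plain ratios, comparable to the "Ratio (sDxA)" column of the table.
ratios = [d / c for c, d in zip(ctrl, data)]
# Normalized log ratios x_ij.
x = center_by_median(log_ratios(ctrl, data))

print(ratios)
print(x)
```

Subtracting the median log ratio is equivalent to multiplying one channel by a single constant before taking logs, so applying it after the log transformation gives the same result as normalizing the intensities first with that constant.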