Physical Mapping Problem
[email protected]

Problem Definition
Definition of physical mapping
[Figure: a DNA molecule labeled with markers A through Q, and a fragment of the DNA.]

Why We Need Physical Mapping
• The map can be used to sequence the DNA completely.
• It helps us understand how genes actually act on humans.
• It allows hereditary traits to be improved, for example with artificial proteins.

Human chromosome (about 10^8 bp)
Physical map (about 10^6 bp)
DNA sequencing (about 10^3 bp): AGACTAGTCGTAACGATCGCTAATTTAAGGCTACT.....

Why We Need Physical Mapping (continued)
• It gives the approximate positions of genes (or markers).
• It provides more information about some hereditary diseases.
• It can help detect whether a hereditary disease is present.
[Figure: the DNA with markers A through Q and a fragment of the DNA; an enzyme is added to the target DNA.]

Partial Digest Problem
• Digestion by a single enzyme A
• Restriction sites: a1 < a2 < a3 < ..... < ap
• Multiset of fragment lengths: {aj - ai, i < j}

Double Digest Problem (DDP)
• Clones are first completely digested by enzyme A, then by enzyme B, and finally by A and B together.
• Restriction sites:
  by A: a1 < a2 < a3 < ..... < ap
  by B: b1 < b2 < b3 < ..... < bq
  by A+B: c1 < c2 < c3 < ..... < cp+q
• Reconstruct the restriction sites from these multisets of fragment lengths.

Example: DDP
Enzyme A      Enzyme B      Enzyme A+B
3             4             1
6             5             2
8             7             3
10            11            3
                            5
                            6
                            7
[Figure: solution of the example.]

Double Digest Problem (DDP)
[Figure: target DNA.]

By Probe Approach
target DNA: ....ATGCGCTAACTGGACTTCAAGCCTAAACTGCATCAGACTT....
Complementary probe: TACGCGATTGACCTGAAGT

The Principle of Hybridization
[Figure: target DNA carrying probes A through J and covered by clones 1 through 5; the clone-by-probe hybridization matrix is shown with the probes in their true order A through J and again with the probe columns permuted, for example B H C A G E I F D J.]

False Negative, False Positive, Chimeric Clones
The three slides share one example with six clones:

Clone   True probes   Observed probes   Error
1       A, B, C       A, C              false negative (B is missed)
2       C, D, E       C, D, E
3       E, F          E, F
4       F, G          A, F, G           false positive (A is spurious)
5       G, H, I       G, H, I
6       I, J, K       E, F, I, J, K     chimeric clone (E, F joined with I, J, K)

Clones and Probes (without errors)
Clones: 1 = I, J, K; 2 = G, H, I; 3 = E, F; 4 = F, G; 5 = A, B, C; 6 = C, D, E

Probe   1  2  3  4  5  6
A       0  0  0  0  1  0
B       0  0  0  0  1  0
C       0  0  0  0  1  1
D       0  0  0  0  0  1
E       0  0  1  0  0  1
F       0  0  1  1  0  0
G       0  1  0  1  0  0
H       0  1  0  0  0  0
I       1  1  0  0  0  0
J       1  0  0  0  0  0
K       1  0  0  0  0  0

Clones and Probes (with errors)
Clones: 1 = I, J, K; 2 = G, H, I; 3 = E, F, K; 4 = I, J, K, F, G; 5 = A, B, C; 6 = C, D, E

Probe   1  2  3  4  5  6
A       0  0  0  0  1  0
B       0  0  0  0  0  0
C       0  0  0  0  1  1
D       0  0  0  0  0  1
E       0  0  1  0  0  1
F       0  0  1  1  0  0
G       0  1  0  1  0  0
H       0  1  0  0  0  0
I       1  1  0  1  0  0
J       1  0  0  1  0  0
K       1  0  1  1  0  0

How to Use the Traveling Salesman Problem to Solve the Physical Mapping Problem

How to Convert to TSP?
• Each probe column of the hybridization matrix becomes a city, and the distance between two probes is the Hamming distance between their columns.
• Cycle weight = number of gap transitions + 2n (n: number of clones)
• So minimizing the cycle weight minimizes the number of gaps.
[Figure: the hybridization matrix for probes A through J and clones 1 through 5, with the 10 x 10 matrix of pairwise Hamming distances between the probe columns.]

Our Approach
We also convert the problem to an optimization problem:
F(A) = X*C(A) + Y*P(A) + Z*N(A) + T*M(A) + P*L(A)
[Equations: the weights X, Y, Z and T are each defined as a natural-logarithm (ln) term.]
• Using a more complicated model
• Using a genetic algorithm (GA) to solve it

Results of our approach on simulated data
(a) False negative rate 0.1, false positive rate 0.05.
(b) False negative rate 0.1, false positive rate 0.01.

Experimental results of our GA on real data from chromosome 1
(a) Results of our GA on a contig with about 95 clones and about 120 probes.
(b) Results of our GA on a contig with about 172 clones and about 136 probes.
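
The Hamming-distance conversion to TSP described earlier can be illustrated with a small sketch. It assumes that n is the number of clones and that an all-zero dummy column closes the tour; the slides do not spell out either point, so both are assumptions here. The function names are hypothetical, and the data is the error-free six-clone example (1 = A, B, C through 6 = I, J, K). The sketch only evaluates the weight of a given probe order; it does not search for the best tour and is not the GA used in the experiments.

```python
# Error-free example: clone number -> probes it hybridizes with.
CLONES = {1: "ABC", 2: "CDE", 3: "EF", 4: "FG", 5: "GHI", 6: "IJK"}
PROBES = "ABCDEFGHIJK"


def probe_columns(clones, probes):
    """Map each probe to its 0/1 column (one entry per clone)."""
    clone_ids = sorted(clones)
    return {p: tuple(int(p in clones[c]) for c in clone_ids) for p in probes}


def hamming(u, v):
    """Number of positions in which two columns differ."""
    return sum(a != b for a, b in zip(u, v))


def cycle_weight(order, columns):
    """Weight of the tour dummy -> probes in `order` -> dummy.

    The dummy city is an all-zero column (an assumption of this sketch).
    """
    n_clones = len(next(iter(columns.values())))
    dummy = (0,) * n_clones
    tour = [dummy] + [columns[p] for p in order] + [dummy]
    return sum(hamming(a, b) for a, b in zip(tour, tour[1:]))


def count_gaps(order, columns):
    """Count runs of 0s that lie between 1s in each clone's row."""
    n_clones = len(next(iter(columns.values())))
    gaps = 0
    for i in range(n_clones):
        row = "".join(str(columns[p][i]) for p in order).strip("0")
        gaps += sum(1 for run in row.split("1") if run)
    return gaps


if __name__ == "__main__":
    cols = probe_columns(CLONES, PROBES)
    n = len(CLONES)
    for order in ("ABCDEFGHIJK", "ACBDEFGHIJK"):
        w = cycle_weight(order, cols)
        g = count_gaps(order, cols)
        # Each gap costs two extra transitions, so w == 2 * g + 2 * n.
        print(order, "weight =", w, "gaps =", g, "check =", 2 * g + 2 * n)
```

With the true order A through K every clone's probes are consecutive, so the weight is 2n = 12. Swapping B and C opens one gap in clone 2 (C, D, E) and raises the weight to 14, which is why minimizing the cycle weight minimizes the number of gaps.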