WSSP Chapter 9: Determine ORF and BLASTP

atttaccgtg tgatgagtat ccggaaatag acttcaatga ttggattgaa gatacagttt gatcccgatc ttggttctaa attatcttgc tccgtattaa atgattgctt gcattcgaat atgagccagc taacgaacgg caatattttc gcgtacccgt

Steps and terms used in protein expression: 1st ATG in mRNA (p 9-1)

Cloning the cDNA library (p 9-1)

Possible reading frames (p 9-2)

Possible types of clones in the cDNA library (p 9-2)

DSAP Define ORF page: Link to Toolbox translation program (p 9-3)

Toolbox: DNA Sequence Translation Program. PolyA tail at 3’ end; reading frames (p 9-3)

EX1.12 +1 reading frame: longest ORF, translation stop (p 9-3)

Which one of these would be the correct ORF? A) B)
Rule #1: If downstream of a stop codon, translation of the protein MUST start with an M (Met) (p 9-3)

Could this ORF code for the protein? (p 9-4)
Does this region match the BLASTX matches? Region of DNA that codes for the sequence highlighted in the BLASTx protein alignment (p 9-4)

Could the DNA code for a partial protein? (p 9-4)
Does this region match the BLASTX matches? Region of DNA that codes for the sequence highlighted in the BLASTx protein alignment (p 9-4)

An example of a partial coding sequence (Similar Seq.)
Is this a partial ORF cDNA clone? What about this region?
The first part of the protein may not have matches because it is not conserved.
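The reading-frame translation and Rule #1 described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Toolbox translation program: it uses the standard genetic code, looks at the three forward frames only, and the function names (translate, candidate_orfs, longest_orf) are hypothetical. A stretch with no stop codon upstream is reported from the first base of its frame, because it could belong to a partial clone; a stretch downstream of a stop codon must begin at an M. Coordinates are 1-based and the stop codon, when present, is counted as part of the ORF.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame):
    """Translate forward frame 1, 2, or 3; '*' marks a stop, 'X' an unknown codon."""
    seq = dna.upper().replace(" ", "")
    codons = (seq[i:i + 3] for i in range(frame - 1, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def candidate_orfs(dna):
    """Yield (frame, dna_start, dna_end, protein) for each candidate ORF."""
    for frame in (1, 2, 3):
        protein = translate(dna, frame)
        pos = 0            # protein index where the current stretch begins
        seen_stop = False  # True once a stop codon lies upstream
        while pos < len(protein):
            stop = protein.find("*", pos)
            seg_end = stop if stop != -1 else len(protein)
            start = pos
            if seen_stop:
                # Rule #1: downstream of a stop codon, start at the first M.
                start = protein.find("M", pos, seg_end)
            if start != -1 and seg_end > start:
                dna_start = (frame - 1) + 3 * start + 1
                # Include the three bases of the stop codon when there is one.
                dna_end = (frame - 1) + 3 * seg_end + (3 if stop != -1 else 0)
                yield frame, dna_start, dna_end, protein[start:seg_end]
            if stop == -1:
                break
            seen_stop = True
            pos = stop + 1


def longest_orf(dna):
    """Return the candidate with the longest protein, or None if there is none."""
    return max(candidate_orfs(dna), key=lambda orf: len(orf[3]), default=None)


if __name__ == "__main__":
    toy = "TAAATGGCTAAATGACC"  # made-up sequence, not EX1.12
    for orf in candidate_orfs(toy):
        print(orf)
```

The longest candidate is not automatically the correct ORF; as the slides stress, it still has to agree with the BLASTx matches.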
[Figure: query vs. subject diagram marking the region of similarity; coordinate labels 2, 60, 410, 475]

BLASTx helps determine which reading frame is correct

>ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Score = 158 bits (400), Expect = 5e-37
Identities = 73/93 (78%), Positives = 83/93 (89%), Gaps = 0/93 (0%)
Frame = +2

    Query  11   MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  190
                MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
    Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

    Query  191  GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  289
                GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
    Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

It also helps suggest the start point (p 9-6)

Choose the reading frame and paste in the protein sequence
Do not include the * (stop codon)
Make sure to include bases that code for the stop codon (p 9-7)

The Five Commandments of DSAP
I. The stop codon is part of the ORF

DSAP BLASTp page (p 9-8)

NCBI BLASTp page: paste in protein sequence (p 9-8)

BLASTp results of EX1.12 +2 ORF: link to Conserved Domain Database (p 9-9)

BLASTp results of EX1.12 +1 ORF
BLASTp results of EX1.12 +3 ORF
No matches

Enter BLASTp data into table

[Figure: protein (M ... *) and possible DNA clones, each ending in a polyA tail (AAAAAA)] (p 9-10)

Suppose the cDNA was missing the first 13 bp
Does this DNA code for the start of the protein?

>gi|226493894|ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Score = 139 bits (351), Expect = 8e-32
Identities = 63/81 (77%), Positives =

    Query  1    MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
                MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
    Sbjct  13   MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72

    Query  61   GSFIYFRLETLHFLIFKGAAA  81
                GSFIYFRLE+L FL+FKGAAA
    Sbjct  73   GSFIYFRLESLRFLVFKGAAA  93

Suppose the cDNA was missing the first 13 bp
Did they choose the correct ORF?
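The coordinates in the BLASTx hit above can be checked with a little arithmetic. The sketch below is an illustration only, with hypothetical function names that are not part of BLAST or Toolbox. It assumes 1-based query coordinates on the forward strand: the frame follows from the query start, each aligned residue spans three bases, and removing bases from the 5’ end shifts the frame.

```python
def blastx_frame(query_start):
    """Forward reading frame (1, 2, or 3) implied by a 1-based query start."""
    return (query_start - 1) % 3 + 1


def aligned_residues(query_start, query_end):
    """Number of amino acids encoded by the DNA span query_start..query_end."""
    span = query_end - query_start + 1
    if span % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return span // 3


def frame_after_5prime_loss(frame, bases_removed):
    """Frame of the same codons after removing bases from the 5' end."""
    return (frame - 1 - bases_removed) % 3 + 1


if __name__ == "__main__":
    # Values taken from the EX1.12 BLASTx alignment shown above.
    print(blastx_frame(11))                # Query 11 -> frame +2
    print(aligned_residues(11, 190))       # first alignment row
    print(aligned_residues(191, 289))      # second alignment row
    print(frame_after_5prime_loss(2, 13))  # clone missing its first 13 bp
```

For the clone missing its first 13 bp, the ATG at base 11 is gone and the original +2 frame becomes +1, so the first full codon of the truncated clone is the one that encoded residue 2 of the protein.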
BLASTP starting here:

>gi|226493894|ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Score = 139 bits (351), Expect = 8e-32
Identities = 63/81 (77%), Positives =

    Query  1    MPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVGSSFGCFFTHKK  60
                MP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVGS FGC+ TH K
    Sbjct  13   MPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVGSGFGCYITHSK  72

    Query  61   GSFIYFRLETLHFLIFKGAAA  81
                GSFIYFRLE+L FL+FKGAAA
    Sbjct  73   GSFIYFRLESLRFLVFKGAAA  93

BLASTP starting here:

>gi|226493894|ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Score = 156 bits (395), Expect = 6e-37
Identities = 72/92 (78%), Positives = 82/92 (89%), Gaps = 0/92 (0%)

    Query  1    LEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVVG  60
                LEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVVG
    Sbjct  2    LEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVVG  61

    Query  61   SSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  92
                S FGC+ TH KGSFIYFRLE+L FL+FKGAAA
    Sbjct  62   SGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

Compare the BLASTx and BLASTp results for EX1.12: Are the matches to the same proteins? (p 9-11)

Compare the BLASTx and BLASTp results for EX1.12: Are the e-values similar? (p 9-12)

Compare the BLASTx and BLASTp results for EX1.12: Are the alignments similar?
BLASTx

>ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93

    Query  11   MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  190
                MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
    Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

    Query  191  GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  289
                GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
    Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

BLASTp

>gi|226493894|ref|NP_001150519.1| dynein light chain LC6, flagellar outer arm [Zea mays]
Length=93
Score = 158 bits (400), Expect = 2e-37

    Query  1    MLEGRARVEDTDMPRKMQAEAMNAASHALDLFDVADCKSLAAHIKKEFDKIYGPGWQCVV  60
                MLEG+A VEDTDMP KMQA+AM+AAS ALD FDV DC+S+A+HIKKEFD I+GPGWQCVV
    Sbjct  1    MLEGKAVVEDTDMPAKMQAQAMSAASRALDRFDVLDCRSIASHIKKEFDAIHGPGWQCVV  60

    Query  61   GSSFGCFFTHKKGSFIYFRLETLHFLIFKGAAA  93
                GS FGC+ TH KGSFIYFRLE+L FL+FKGAAA
    Sbjct  61   GSGFGCYITHSKGSFIYFRLESLRFLVFKGAAA  93

(p 9-12)

DSAP Review Page (p. 7-17)

Use Toolbox to determine the ORF
Do NOT use Toolbox to determine the 5’ UTR!!!

The Five Commandments of DSAP
I. The stop codon is part of the ORF
II. The start of the 5’ UTR is always the first base

Do NOT use Toolbox to determine the 3’ UTR!!!
Determine ranges of 5’ UTR and 3’ UTR by highlighting the ranges in the DSAP cDNA text box (p. 9-14)

What should you do if your clone is a partial?

An example of a partial coding sequence (Similar Seq.)
The first bases are part of the reading frame

     ?    S    I    R
    XGC  TCA  ATC  CGT

The Five Commandments of DSAP
I. The stop codon is part of the ORF
II. The start of the 5’ UTR is always the first base
III. If the clone is a partial, there is no 5’ UTR
IV. If the clone is a partial, the start of the ORF is always the first base

What should you do if you get these results? (BLASTX)

Why is my cDNA noncoding?
[Figure: genomic DNA, RNA, and cDNA, with a (partial) ORF and polyA tails (AAAAAAA)]

Recent genome-wide RNA sequencing studies show that more than 10% of polyA RNAs are non-coding.

If your DNA is noncoding, enter the entire sequence as 3’ UTR

The Five Commandments of DSAP
I. The stop codon is part of the ORF.
II. The start of the 5’ UTR is always the first base.
III. If the clone is a partial, there is no 5’ UTR.
IV. If the clone is a partial, the start of the ORF is always the first base.
V. If the clone is non-coding, the entire DNA is 3’ UTR.
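The Five Commandments can be read as a small rule set for turning a clone type and an ORF position into the three ranges entered in DSAP. The sketch below is one possible reading, not the DSAP implementation: the function name dsap_ranges, the clone-type labels, and the 600-base clone length in the example are all hypothetical. The example ORF of 11 to 292 assumes the EX1.12 ORF begins at the ATG at base 11 and encodes 93 residues followed by a stop codon. Ranges are 1-based and inclusive, and None means the region is absent.

```python
def dsap_ranges(seq_len, clone_type, orf_start=None, orf_end=None):
    """Return 5' UTR, ORF, and 3' UTR ranges for a cDNA of length seq_len.

    clone_type is "full", "partial", or "noncoding" (hypothetical labels).
    orf_end must already include the stop codon (Commandment I).
    """
    if clone_type == "noncoding":
        # Commandment V: the entire DNA is 3' UTR.
        return {"5' UTR": None, "ORF": None, "3' UTR": (1, seq_len)}

    if clone_type == "partial":
        # Commandments III and IV: no 5' UTR, ORF starts at the first base.
        orf_start = 1
    elif clone_type != "full":
        raise ValueError("unknown clone type: " + repr(clone_type))

    if orf_start is None or orf_end is None:
        raise ValueError("a coding clone needs ORF coordinates")
    if not 1 <= orf_start <= orf_end <= seq_len:
        raise ValueError("ORF coordinates fall outside the sequence")

    # Commandment II: a 5' UTR, when present, starts at the first base.
    five_utr = (1, orf_start - 1) if orf_start > 1 else None
    three_utr = (orf_end + 1, seq_len) if orf_end < seq_len else None
    return {"5' UTR": five_utr, "ORF": (orf_start, orf_end), "3' UTR": three_utr}


if __name__ == "__main__":
    # Hypothetical 600-base clone; ORF coordinates are an assumption (see text).
    print(dsap_ranges(600, "full", orf_start=11, orf_end=292))
    print(dsap_ranges(600, "partial", orf_end=279))
    print(dsap_ranges(600, "noncoding"))
```

For a partial clone the ORF length need not be a multiple of three, because the first bases may be an incomplete codon, so the sketch does not enforce that check.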