Use ORF Finder to predict the coding region for the gene you obtained in Homework 2.

(a) Which open reading frame is the correct one?
Answer: Frame +3, from 66 to 2045, length 1980.

(b) How many residues does the protein have? Give the sequence in FASTA format.
Answer: 659
MLKKIFYGFIVLFLIVVGLLAILIAQVWVSTNKDIAKIKDYRPSVASQILDRKGRLIANIYDKEFRFYAR
FEEIPPRFIESLLAVEDTLFFEHGGINLDAIMRAMIKNAKSGRYTEGGSTITQQLVKNMVLTREKTLTRK
LKEAIISIRIEKVLSKEEILERYLNQTFFGHGYYGVKTASLGYFKKPLDKLTLKEITMLVALPRAPSFYD
PTKNLEFSLSRANDILRRLYSLGWISSNELKGALNEVPIIYNQTSTQNIAPYVVDEVLKQLDQLDGLKTQ
GYTIKLTIDLDYQRLALESLRFGHQKILEKIAKEKPKTNASNEDEDNLNASMIVTDTSTGKILALVGGID
YKKSAFNRATQAKRQFGSAIKPFVYQIAFDNGYSTTSKIPDTARNFENGNYSKNSEQNHAWHPSNYSRKF
LGLVTLQEALSHSLNLATINLSDQLGFEKIYQSLSDMGFKNLPKDLSIVLGSFAISPIEAAEKYSLFSNY
GTMLKPMLIESITDQQNDVKTFTPMETKKITSKEQAFLTLSVLMNAVENGTGSLARIKGLEIAGKTGSSN
NNIDAWFIGFTPTLQSVIWFGRDDNTPIGKGATGGVVSAPVYSYFMRNILAIEPSLKRKFDVPKGLRKEI
VDKIPYYSTPNSITPTPQKTDDGEEPLLF

(c) Use BLAST to search the nr database. Set the E value to 0.0000001 with the PAM70 matrix.
Answer (hit, score in bits, E value):
gi|21262171|dbj|BAB96754.1| penicillin binding protein [Hel... 1653 0.0
gi|21262169|dbj|BAB96753.1| penicillin binding protein [Hel... 1636 0.0
gi|21262167|dbj|BAB96752.1| penicillin binding protein [Hel... 1634 0.0
gi|15645222|ref|NP_207392.1| penicillin-binding protein 1A ... 1623 0.0
gi|19073477|gb|AAL84835.1|AF479618_1 penicillin-binding pro... 1618 0.0
gi|19073475|gb|AAL84834.1|AF479617_1 penicillin-binding pro... 1613 0.0
gi|13272374|gb|AAK17126.1|AF315503_1 PBP1 [Helicobacter pyl... 1609 0.0
gi|15611611|ref|NP_223262.1| PENICILLIN-BINDING PROTEIN [He... 1608 0.0
gi|15791870|ref|NP_281693.1| penicillin-binding protein [Ca... 683 0.0
gi|17232816|ref|NP_489364.1| penicillin-binding protein [No... 322 3e-86

(d) Use BLAST 2 Sequences to compare the number 1 and number 10 hits.
Answer:
Sequence 1 lcl|seq_1 Length 659 (1 .. 659)
Sequence 2 lcl|seq_2 Length 643 (1 .. 643)

NOTE: The statistics (bitscore and expect value) is calculated based on the size of nr database

Score = 334 bits (788), Expect = 6e-90
Identities = 216/607 (35%), Positives = 327/607 (53%), Gaps = 84/607 (13%)

Query: 38  IKDYRPSVASQILDRKGRLIANIYDKEFRFYARFEEIPPRFIESLLAVEDTLFFEHGGIN 97
           I+++ P+ ++ I D KGRL+A+I R I P + LA EDT F+ H GI+
Sbjct: 65  IRNFVPAETTYIYDIKGRLLASIHGEVNREVVPLKKISPHLKRAVLASEDTSFYHHHGID 124

Query: 98  LDAIMRAMIKNAKSGRYTEGGSTITQQLVKNMVLTREKTLTRKLKEAIISIRIEKVLSKE 157
           I RA++ N +G EGGST+T QLVKN+ L+ E+T TRK+ EA+++IR+E VLSK+
Sbjct: 125 PVGIGRALVVNLEAGEVQEGGSTLTMQLVKNLFLSQERTFTRKIAEAVLAIRLEQVLSKD 184

Query: 158 EILERYLNQTFFGHGYYGVKTASLGYFKKPLDKLTLKEITMLVALPRAPSFYDPTKNLEF 217
           EIL+ YLNQ + G YGV A+ YF K L L E +M+ L AP + P NLE
Sbjct: 185 EILDLYLNQVYWGDNNYGVQMAARYYFNKSAANLNLAESAMMAGLLPAPENFSPFINLEL 244

Query: 218 SLSRANDILRRLYSLGWISSNELKGALNEVPIIYNQTSTQNI------------APYVVD 265
           + + ++L R+ L WIS + YNQ+ Q I APY+ +
Sbjct: 245 AKQKQKEVLLRMLELNWISQQD-----------YNQALKQKIQLNNKRTLEGSAAPYITN 293

Query: 266 EVLKQL------DQL--DGLKTQGYTIKLTIDLDYQRLALESLRFGHQKILEKIAKEKPK 317
           V +L D L GL+ Q TID +Q +A + + HQ++ K
Sbjct: 294 SVAQELVRRFGRDVLLKGGLRVQT-----TIDAQFQMMANKTVKRWHQRL-------KRQ 341

Query: 318 TNASNEDEDNLNASMIVTDTSTGKILALVGGIDYKKSAFNRATQAKRQFGSAIKPFVYQI 377
           +N+ +++ D T I ALVGG+D K S FNRATQA+RQ GSA KPFVY
Sbjct: 342 GLRNNQ------IALVAIDPRTHFIKALVGGVDAKTSEFNRATQARRQPGSAFKPFVYYA 395

Query: 378 AFDNG-YSTTSKIPDTARNFENGN--YSKNSEQNHAWHPSNYSRKFLGLVTLQEALSHSL 434
           AF +G ++ + + DT + +GN YS P NY F+G + + ALS S
Sbjct: 396 AFASGKFTPNTIVQDTPVRYRDGNGWYS----------PRNYDNSFMGAIPIRTALSLSR 445

Query: 435 NLATINLSDQLGFEKIYQSLSDMGFKNLPKD--LSIVLGSFAISPIEAAEKYSLFSNYGT 492
           N+ +I L G ++ ++ +G + P + S+ LG+ ++P+E A Y+ +NYG
Sbjct: 446 NIPAIKLGKAVGLNRVIETSRTLGITS-PMEPVTSLPLGAIGVTPVEMASAYATLANYGW 504

Query: 493 MLKPMLIESITDQQNDVKTFTPMETKKITSKEQAFLTLS---------VLMNAVENGTGS 543
           LI ++D +V I + L L+ V+ + + NGTG
Sbjct: 505 QSPTTLIMRVSDSNGNV---------LIDNTPKPRLVLNPWASASVIDVMQSVINNGTGR 555

Query: 544 LARIKGLEIAGKTGSSNNNIDAWFIGFTPTLQSVIWFGRDDNTPIGKGATGGVVSAPVYS 603
           A I G AGKTG++++ D WF+G P L + IW GRDDN +G GATGG AP+
Sbjct: 556 AAAI-GRPAAGKTGTTSSERDVWFVGTVPQLTTAIWVGRDDNKRLGYGATGGGTVAPIWR 614

Query: 604 YFMRNIL 610
           FM N L
Sbjct: 615 DFMQNAL 621

CPU time: 0.15 user secs. 0.01 sys. secs 0.16 total secs.

Lambda   K       H
0.332    0.230   0.987

Gapped
Lambda   K       H
0.291    0.0910  0.410

Matrix: PAM70
Gap Penalties: Existence: 10, Extension: 1
Number of Hits to DB: 4681
Number of Sequences: 0
Number of extensions: 668
Number of successful extensions: 7
Number of sequences better than 10.0: 1
Number of HSP's better than 10.0 without gapping: 1
Number of HSP's successfully gapped in prelim test: 0
Number of HSP's that attempted gapping in prelim test: 0
Number of HSP's gapped (non-prelim): 1
length of query: 659
length of database: 442,539,632
effective HSP length: 49
effective length of query: 610
effective length of database: 442,539,583
effective search space: 269949145630
effective search space used: 269949145630
T: 9
A: 40
X1: 15 ( 7.2 bits)
X2: 119 (50.0 bits)
X3: 119 (50.0 bits)
S1: 41 (21.7 bits)
S2: 75 (34.9 bits)

(e) Search for conserved domains of this gene in the Pfam database. How many domains can you find?
(1) Give the name, Pfam number and consensus sequence of the conserved domain.
(2) Show Domain Relatives.
(3) How many similar domain architectures can you find?
Answer:
(1) Domain hits (domain, score in bits, E value):
gnl|CDD|1466 pfam00912, Transglycosyl, Transglycosylase. The penicillin-bin... 233 2e-62
gnl|CDD|7821 pfam00905, Transpeptidase, Penicillin binding protein transpep... 122 5e-29

gnl|CDD|1466, pfam00912, Transglycosyl, Transglycosylase. The penicillin-binding proteins are bifunctional proteins consisting of transglycosylase and transpeptidase in the N- and C-terminus respectively.
CD-Length = 169 residues, 100.0% aligned
Score = 233 bits (595), Expect = 2e-62

Query: 47  SQILDRKGRLIANIYDKEFRFYARFEEIPPRFIESLLAVEDTLFFEHGGINLDAIMRAMI 106
Sbjct: 1   MKIYDADGELIGEFGEERRR-PVPLNDIPPNLKEALIASEDRRFYEHHGIDPKGIGRAAL 59

Query: 107 KNAKSGRYTEGGSTITQQLVKNMVLTREKTLTRKLKEAIISIRIEKVLSKEEILERYLNQ 166
Sbjct: 60  ANLKSGGVVQGASTITQQLAKNLFLSHERTFTRKANEAWLALQLEQVYSKDEILELYLNK 119

Query: 167 TFFGHGYYGVKTASLGYFKKPLDKLTLKEITMLVALPRAPSFYDPTKNLE 216
Sbjct: 120 IYFGNGVYGIEAAAQYYFGKPAKDLTLAEAALLAGLPKAPSRYNPVRNPE 169

gnl|CDD|7821, pfam00905, Transpeptidase, Penicillin binding protein transpeptidase domain. The active site serine is conserved in all members of this family.
CD-Length = 327 residues, 99.7% aligned
Score = 122 bits (306), Expect = 5e-29

Query: 305 QKILEKIAKEKPKTNASNEDEDNLNASMIVTDTSTGKILALVGGIDYKKSAF-------- 356
Sbjct: 1   SKLQKAAERALDKAVAKYKAK---RGAAVVMDPKTGEVLAMASSPSYDPNLFVGGENEPL 57

Query: 357 -NRATQAKRQFGSAIKPFVYQIAFDNGYSTTSKIPDTARNFENGNYSKNSEQNHAWHPSN 415
Sbjct: 58  RNRAVTGVYEPGSTFKPITAAAALENGVIK----PNEVLDDSGGIYQGGG----STIKYD 109

Query: 416 YSRKFLGLVTLQEALSHSLNLATINLSDQLGFEKIYQSLSDMGF---------------- 459
Sbjct: 110 WRRGGHGTITLRQALEKSSNTGFVKLALKLGPDKLRDYLKRFGLGVKTGIDLPGEAAGSL 169

Query: 460 KNLPKDLSIVLGSFAI------SPIEAAEKYSLFSNYGTMLKPMLIESITDQQNDVKTFT 513
Sbjct: 170 PPSNKRLLADTATSAFGQGDTVTPLQMAQAYATIANGGTLVQPHLVKSIVDPNGQIDG-- 227

Query: 514 PMETKKITSKEQAFLTLSVLMNAVENGTGSLARIKGLEIAGKTGSSN---------NNID 564
Sbjct: 228 TPVSKETISKTVSEMLQAGLEGVVGGGTGQTAAVPGYDVAGKTGTAQKAGKGGGYTNTYN 287

Query: 565 AWFIGFTPTLQSVIWFGRDDNTPIGKGATGGVVSAPVYS 603
Sbjct: 288 AWFVGYAPADNPKYAVAVVIDNPQDKGGYGGAVAAPIFK 326

(2)
(3) 173 similar domain architectures
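
The numbers reported in answers (a), (b) and (d) can be cross-checked with a little arithmetic. The Python sketch below is only an illustration. All function names are made up for this example, and it rests on three assumptions: the ORF coordinates are 1-based and inclusive, the 1980-nucleotide length includes the stop codon, and the percentages in the alignment header are truncated rather than rounded (truncation is what reproduces the 35%, 53% and 13% shown). The search-space formula simply mirrors the reported values, where the effective HSP length is subtracted once from the query length and once from the database length.

```python
"""Sanity checks for the numbers reported in answers (a), (b) and (d).

All function names are illustrative; the input values are copied from the
answers in this document.
"""


def orf_length(start: int, end: int) -> int:
    """Nucleotide length of an ORF with 1-based, inclusive coordinates."""
    return end - start + 1


def reading_frame(start: int) -> int:
    """Forward reading frame (1, 2 or 3) for a 1-based start position."""
    return (start - 1) % 3 + 1


def residue_count(length_nt: int, includes_stop: bool = True) -> int:
    """Residues encoded by an ORF.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute a residue.
    """
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = length_nt // 3
    return codons - 1 if includes_stop else codons


def truncated_percent(part: int, total: int) -> int:
    """Integer percentage, truncated toward zero (assumed reporting style)."""
    return part * 100 // total


def effective_search_space(query_len: int, db_len: int, hsp_len: int) -> int:
    """Product of effective query and database lengths, as in the report."""
    return (query_len - hsp_len) * (db_len - hsp_len)


if __name__ == "__main__":
    length = orf_length(66, 2045)
    print("ORF length:", length)
    print("Reading frame: +%d" % reading_frame(66))
    print("Residues:", residue_count(length))
    print("Identities %:", truncated_percent(216, 607))
    print("Positives %:", truncated_percent(327, 607))
    print("Gaps %:", truncated_percent(84, 607))
    print("Search space:", effective_search_space(659, 442_539_632, 49))
```

Under these assumptions the checks agree with the reported answers: frame +3, 1980 nucleotides, 659 residues, and an effective search space of 269949145630.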