Supp. 8. ItWRKY1 gene from I. trifida
1. Alignment of three transcripts:
http://www.ebi.ac.uk/Tools/msa/clustalw2/
From top to bottom, the three transcripts are:
1. contig_1743 from the I. batatas (L.) Lam. cv. Xushu18 transcriptome;
2. comp26335_c0_seq1 from the I. trifida transcriptome;
3. comp26335_c0_seq2 from the I. trifida transcriptome.
2. ORFs of two transcripts
http://www.ncbi.nlm.nih.gov/projects/gorf/
comp26335_c0_seq1 (2329 nt)
Frame   From     To   Length
   -3    271   2040     1770
   -1    813   1169      357
   -1    363    614      252
    3    282    482      201
    2   1847   2005      159
   -1   1923   2074      153
   -2    125    271      147
   -2    611    751      141
    2    245    358      114
    3   1971   2074      105
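The ORF list above comes from a six-frame scan. As a rough illustration of that kind of scan, the Python sketch below looks for ATG-to-stop reading frames on both strands. It is not the ORF Finder implementation: the names find_orfs and revcomp are hypothetical, lengths are assumed to include the stop codon, minus-strand coordinates are mapped back to the forward strand, and the numbering of minus frames is an assumption that may differ from the tool's convention.

```python
STOPS = {"TAA", "TAG", "TGA"}
_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMP)[::-1]


def find_orfs(seq, min_len=0):
    """Return (frame, from, to, length) tuples, longest ORF first.

    Coordinates are 1-based, inclusive, and given on the forward strand.
    Frame numbering (+1..+3, -1..-3) is a hypothetical convention.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for sign, strand in ((1, seq), (-1, revcomp(seq))):
        for offset in range(3):
            start = None
            for i in range(offset, n - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOPS:
                    end = i + 3                    # stop codon is included
                    if end - start >= min_len:
                        if sign == 1:
                            lo, hi = start + 1, end
                        else:                      # map back to forward strand
                            lo, hi = n - end + 1, n - start
                        orfs.append((sign * (offset + 1), lo, hi, end - start))
                    start = None
    return sorted(orfs, key=lambda orf: -orf[3])


if __name__ == "__main__":
    # Toy sequence, not taken from the transcripts.
    for orf in find_orfs("CCATGAAATAGCC"):
        print(orf)
```

Passing a min_len of about 100 would restrict the output to ORFs of roughly the sizes listed in the table.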
comp26335_c0_seq2 (2314 nt) gives the same ORF results as comp26335_c0_seq1;
the longest ORF (1,770 nt, frame -3) encodes a 589 aa sequence. A BLAST search
of this ORF against the NR database showed that the reverse complement of the
segment from position 312 to 1964 of comp26335_c0_seq1 is a 1,653 nt CDS
encoding a 550 aa protein, which we named ItWRKY1.
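The coordinate and length arithmetic in this paragraph can be checked with a short Python sketch. It assumes 1-based inclusive positions and that the reported nucleotide lengths include the stop codon; the helper names reverse_complement and minus_strand_segment are hypothetical, and the toy sequence in the usage lines is not from the transcriptome.

```python
_COMP = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(_COMP)[::-1]


def minus_strand_segment(seq, start, end):
    """Reverse complement of seq[start..end], 1-based and inclusive."""
    return reverse_complement(seq[start - 1:end])


def protein_length(nt_length):
    """Residues encoded by an ORF whose length includes the stop codon."""
    if nt_length % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return nt_length // 3 - 1


cds_length = 1964 - 312 + 1            # segment reported in the text
print(cds_length, protein_length(cds_length))
print(protein_length(1770))            # longest ORF in the table

# Toy example: positions 3..8 of this string read ATGTGA on the minus strand.
print(minus_strand_segment("GGTCACATCC", 3, 8))
```

With the full comp26335_c0_seq1 sequence in hand, minus_strand_segment(seq, 312, 1964) would be expected to return the CDS listed below.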
> ItWRKY1 protein
MAASSGTIDAPTASSSFSFSTASSFMSSSFTDLLSSDAYSGGSVSRGLGDRIAERTGSGV
PKFKSLPPPSLPLSSPAVSPSSYFAFPPGLSPSELLDSPVLLSSSNILPSPTTGTFPAQT
FNWKNDSNASQEDVKQEEKGYPDFSFQTNSASMTMNYEDSKRKDELNSLQSLPPVTTSTQ
MSSQNNGGSYSEYNNQCCPPSQTLREQRRSDDGYNWRKYGQKQVKGSENPRSYYKCTHPN
CPTKKKVERALDGQITEIVYKGAHNHPKPQSTRRSSSSTASSASTLAAQSYNAPASDVPD
QSYWSNGNGQMDSVATPENSSISVGDDEFEQSSQKREPVGDEFDEDEPDAKRWKVENESE
GVSAQGSRTVREPRVVVQTTSDIDILDDGYRWRKYGQKVVKGNPNPRSYYKCTSQGCPVR
KHVERASHDIRSVITTYEGKHNHDVPAARGSGSHGLNRGANPNNNAAMAMAIRPSTMSLQ
SNYPIPIPSTRPMQQGEGQVPYEMLQGPGGFGYSGFGNPMNAYANQIQDNAFSRAKEEPR
DELFLETLLA
> ItWRKY1 CDS
ATGGCTGCTTCTTCAGGGACAATAGACGCCCCCACAGCTTCTTCATCTTTCTCTTTCTCC
ACCGCCTCTTCATTCATGTCCTCCTCCTTCACTGACCTCCTTTCCTCCGACGCCTATTCC
GGCGGCTCTGTGAGCAGAGGGCTGGGTGATCGGATAGCGGAGAGGACGGGGTCGGGTGTG
CCCAAGTTTAAGTCTTTGCCGCCGCCGTCTCTGCCGCTTTCTTCGCCGGCCGTCTCGCCG
TCGTCTTACTTCGCTTTTCCTCCTGGGTTGAGCCCCAGTGAGCTCCTGGATTCCCCTGTT
CTTCTATCTTCCTCAAACATTTTGCCGTCTCCGACAACTGGGACTTTTCCTGCTCAGACC
TTCAACTGGAAGAATGATTCTAACGCATCCCAGGAAGATGTTAAGCAAGAAGAGAAAGGA
TACCCAGATTTCTCTTTCCAGACTAACTCTGCTTCAATGACAATGAATTATGAAGATTCT
AAGAGGAAAGATGAGCTCAATTCTCTGCAGAGCCTTCCCCCTGTGACTACTTCAACTCAG
ATGAGCTCTCAGAACAATGGTGGGAGCTACTCTGAGTATAATAATCAATGCTGCCCGCCC
TCCCAGACGTTGAGGGAGCAGAGGCGATCTGATGACGGGTACAATTGGAGGAAATACGGG
CAGAAACAGGTGAAGGGGAGCGAAAACCCGAGGAGTTATTACAAGTGCACGCACCCGAAT
TGCCCCACGAAGAAGAAGGTCGAGAGGGCTTTGGATGGGCAGATTACTGAGATTGTCTAC
AAAGGAGCTCACAATCACCCGAAGCCTCAGTCCACTAGGAGATCGTCGTCCTCCACAGCT
TCTTCGGCTTCAACTTTGGCTGCCCAGTCTTATAACGCGCCTGCCAGTGATGTCCCGGAT
CAGTCGTATTGGTCTAATGGTAACGGGCAGATGGATTCTGTTGCCACGCCAGAGAATTCT
TCGATCTCCGTGGGGGATGATGAATTCGAGCAGAGCTCTCAGAAGAGGGAGCCCGTGGGA
GACGAGTTTGATGAAGACGAACCCGATGCAAAGAGATGGAAAGTGGAAAACGAAAGCGAG
GGAGTTTCTGCACAGGGGAGTAGGACAGTAAGGGAACCGAGAGTTGTAGTTCAAACGACG
AGTGATATTGATATTCTCGACGATGGTTATAGATGGAGAAAATATGGCCAGAAAGTTGTG
AAGGGAAATCCCAATCCAAGGAGCTATTACAAATGCACGAGCCAAGGCTGCCCGGTGAGG
AAACACGTGGAAAGGGCTTCACACGATATCCGCTCGGTGATAACAACCTACGAAGGGAAA
CACAACCACGACGTTCCTGCTGCCCGAGGGAGTGGCAGCCACGGCCTCAACCGGGGCGCC
AATCCTAACAACAATGCGGCCATGGCTATGGCGATTAGGCCTTCGACGATGTCTCTCCAA
TCTAACTACCCCATCCCAATCCCGAGCACGAGGCCAATGCAGCAGGGAGAAGGCCAAGTG
CCTTACGAGATGTTGCAGGGACCGGGCGGTTTTGGGTACTCGGGATTTGGGAACCCGATG
AATGCCTACGCGAACCAAATCCAGGACAACGCGTTCTCGAGGGCCAAGGAGGAGCCCAGA
GATGAGTTGTTCCTGGAGACATTGCTAGCTTGA
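To confirm that the CDS and the protein listing agree, the CDS can be translated with the standard genetic code. The sketch below is a minimal pure-Python version that assumes the standard code and an upper- or lower-case ACGT alphabet; the names CODON_TABLE and translate are hypothetical. The usage lines translate only the first and last lines of the CDS above.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a CDS in frame 1, stopping at the first stop codon."""
    cds = "".join(cds.split()).upper()   # drop line breaks and spaces
    protein = []
    for i in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


first_line = "ATGGCTGCTTCTTCAGGGACAATAGACGCCCCCACAGCTTCTTCATCTTTCTCTTTCTCC"
last_line = "GATGAGTTGTTCCTGGAGACATTGCTAGCTTGA"
print(translate(first_line))   # MAASSGTIDAPTASSSFSFS
print(translate(last_line))    # DELFLETLLA
```

Both fragments match the start and end of the ItWRKY1 protein listing; pasting the whole CDS into translate would allow a full comparison.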
3. Alignment to SPF1 protein
http://www.uniprot.org/blast
Q: ItWRKY1 protein, 550 aa
H: SPF1 protein (Uniprot: Q40090), Ipomoea batatas (Sweet potato), 549 aa
E-value: 0.0, Score: 2842, Identity: 98.4%, Positives: 99.1%
Nine mutations:
S29-, S35A, M165L, P338S, V339G, V500A, P508S, E542D, E546D
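The reported identity is consistent with the nine listed differences if the alignment is assumed to span 550 columns (the 550 aa query against the 549 aa hit with a single gap). The Python sketch below parses the list and redoes that arithmetic. The reading of each token as residue, position, replacement, with "-" marking a gap, and the 550-column alignment length are assumptions; parse_difference is a hypothetical helper.

```python
import re

# One token: original residue, position, replacement residue or "-" for a gap.
DIFF = re.compile(r"^([A-Z])(\d+)([A-Z-])$")


def parse_difference(token):
    """Split a token such as 'S35A' into ('S', 35, 'A')."""
    match = DIFF.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognised token: {token!r}")
    return match.group(1), int(match.group(2)), match.group(3)


listed = "S29-, S35A, M165L, P338S, V339G, V500A, P508S, E542D, E546D"
differences = [parse_difference(token) for token in listed.split(",")]

ALIGNED_COLUMNS = 550   # assumption: 550 aa query, one gap in the hit
gaps = sum(1 for _, _, new in differences if new == "-")
identity = 100 * (ALIGNED_COLUMNS - len(differences)) / ALIGNED_COLUMNS

print(f"{len(differences)} differences ({gaps} gap), identity = {identity:.1f}%")
```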
4. Pfam domains
http://www.uniprot.org/uniprot/Q40090
Source           Domain   Start   End
low_complexity   n/a         11    29
disorder         n/a         44    70
low_complexity   n/a         58    86
disorder         n/a        100   103
disorder         n/a        106   108
disorder         n/a        114   223
Pfam             WRKY1      210   266
disorder         n/a        226   239
disorder         n/a        242   381
low_complexity   n/a        268   288
Pfam             WRKY2      386   443
disorder         n/a        393   397
disorder         n/a        402   417
disorder         n/a        419   424
disorder         n/a        426   472
low_complexity   n/a        458   469
disorder         n/a        478   503
low_complexity   n/a        505   515
disorder         n/a        523   539
disorder         n/a        542   546
Domains:
WRKY1 (57 aa): DGYNWRKYGQ KQVKGSENPR SYYKCTHPNC PTKKKVERAL DGQITEIVYK GAHNHPK;
WRKY2 (58 aa): DGYRWRKYGQ KVVKGNPNPR SYYKCTSQGC PVRKHVERAS HDIRSVITTY EGKHNHDV
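Both domain sequences contain the conserved WRKYGQK heptapeptide that gives the WRKY family its name. The Python sketch below counts the residues in each listed domain and locates that motif; motif_positions is a hypothetical helper, and the domain strings are copied from the two lines above with the spaces removed.

```python
def motif_positions(protein, motif="WRKYGQK"):
    """Return the 1-based start position of every occurrence of motif."""
    positions = []
    i = protein.find(motif)
    while i != -1:
        positions.append(i + 1)
        i = protein.find(motif, i + 1)
    return positions


DOMAINS = {
    "WRKY1": "DGYNWRKYGQKQVKGSENPRSYYKCTHPNCPTKKKVERALDGQITEIVYKGAHNHPK",
    "WRKY2": "DGYRWRKYGQKVVKGNPNPRSYYKCTSQGCPVRKHVERASHDIRSVITTYEGKHNHDV",
}

for name, sequence in DOMAINS.items():
    print(name, len(sequence), motif_positions(sequence))
```

Run on the full ItWRKY1 protein instead of the excerpts, the same function would give the motif positions in protein coordinates, which can then be compared with the Pfam ranges reported for SPF1.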