
Supplementary Material
Title: Analysis and Prediction of Anti-bacterial Peptides
Authors: Sneh Lata, B. K. Sharma and G. P. S. Raghava*; Institute of
Microbial Technology, Sector-39A, Chandigarh, India
Figure S1: Analysis of residues at first position of N-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
Figure S2: Analysis of residues at 2nd position of N-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
Figure S3: Analysis of residues at 3rd position of N-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
Figure S4: Analysis of residues at 4th position of N-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
Figure S5: Analysis of residues at 5th position of N-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
[Plot: Propensity of amino acids at position 5 in N-terminal dataset. x-axis: amino acids (A to Y, one-letter codes); y-axis: No. of peptides (0 to 50); legend: Antibacterial peptides, Non-antibacterial peptides.]
Figure S6: Analysis of residues at 1st position of C-terminus of antibacterial peptides. Height
of bar shows the frequency of residues at a position.
[Plot: Propensity of amino acids at position 1 in C-terminal dataset. x-axis: amino acids (A to Y, one-letter codes); y-axis: No. of peptides (0 to 70); legend: Antibacterial peptides, Non-antibacterial peptides.]
Figure S7: Analysis of residues at 2nd position of C-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
[Plot: Propensity of amino acids at position 2 in C-terminal dataset. x-axis: amino acids (A to Y, one-letter codes); y-axis: No. of peptides (0 to 60); legend: Antibacterial peptides, Non-antibacterial peptides.]
Figure S8: Analysis of residues at 3rd position of C-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
[Plot: Propensity of amino acids at position 3 in C-terminal dataset. x-axis: amino acids (A to Y, one-letter codes); y-axis: No. of peptides (0 to 80); legend: Antibacterial peptides, Non-antibacterial peptides.]
Figure S9: Analysis of residues at 4th position of C-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
[Plot: Propensity of amino acids at position 4 in C-terminal dataset. x-axis: amino acids (A to Y, one-letter codes); y-axis: No. of peptides (0 to 60); legend: Antibacterial peptides, Non-antibacterial peptides.]
Figure S10: Analysis of residues at 5th position of C-terminus of antibacterial peptides.
Height of bar shows the frequency of residues at a position.
[Plot: Propensity of amino acids at position 5 in C-terminal dataset. x-axis: amino acids (A to Y, one-letter codes); y-axis: No. of peptides (0 to 60); legend: Antibacterial peptides, Non-antibacterial peptides.]
Figure S11: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 1st position of N-terminus.
Figure S12: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 2nd position of N-terminus.
Figure S13: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 3rd position of N-terminus.
Figure S14: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 4th position of N-terminus.
Figure S15: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 5th position of N-terminus.
Figure S16: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 1st position of C-terminus.
Figure S17: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 2nd position of C-terminus.
Figure S18: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 3rd position of C-terminus.
Figure S19: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 4th position of C-terminus.
Figure S20: Frequency of polar, non-polar, negative charge and C+R+K in antibacterial and
non-antibacterial peptides at 5th position of C-terminus.
Figure S21: Number of antibacterial peptides of various lengths in original
dataset.
[Plot: x-axis: Length of peptides (>=5, >=10, >=15, >=20, >=25); y-axis: No. of peptides (0 to 600); series: antibacterial peptides.]
Figure S22: Creation of NT15, CT15 and NTCT15 datasets.
Figure S23: Creation of NT5 dataset.
Figure S24: Creation of NT10 dataset.
Figure S25: Creation of NT20 dataset.
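The dataset names in Figures S22 to S25 suggest fixed-length terminal segments. The following is a minimal sketch of how such segments could be cut from a peptide, assuming that NT*n* is the first *n* residues, CT*n* is the last *n* residues, and NTCT*n* is the two joined together. The function name `terminal_segments`, the joining rule, and the handling of peptides shorter than *n* are illustrative assumptions, not details confirmed by the figures.

```python
def terminal_segments(seq: str, n: int = 15) -> dict:
    """Cut fixed-length terminal segments from a peptide sequence.

    Assumption: peptides shorter than n residues are rejected rather than padded.
    """
    seq = seq.strip().upper()
    if len(seq) < n:
        raise ValueError(f"peptide has {len(seq)} residues, need at least {n}")
    nt = seq[:n]    # first n residues (N-terminus)
    ct = seq[-n:]   # last n residues (C-terminus)
    # Assumed joining rule for the combined dataset: N-terminal part then C-terminal part.
    return {"NT": nt, "CT": ct, "NTCT": nt + ct}


# Made-up 20-residue sequence, used only to show the slicing.
example = terminal_segments("ACDEFGHIKLMNPQRSTVWY", n=15)
print(example["NT"])    # ACDEFGHIKLMNPQR
print(example["CT"])    # GHIKLMNPQRSTVWY
```

Calling the same function with `n=5`, `n=10` or `n=20` would give segments matching the NT5, NT10 and NT20 naming.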
Table S1: Quantitative weight matrix for the first fifteen residues of the N-terminus of antibacterial
peptides. P1, P2, ..., P15 show residue preferences for positions 1, 2, ..., 15, respectively. The number
shown in bold is the highest propensity of a residue at a given position.
AA
A
C
D
E
P1
0.241
0.231
0.120
0.882
P2
P3
0.641
P4
0.030
0.379
-0.889
0.467
0.400
0.556
0.472
0.488
0.283
0.467
0.695
0.200
-0.111
-0.538
0.470
0.263
0.000
0.240
0.485
0.636
0.667
0.852
0.278
0.590
-0.304
-0.600
0.765
-0.294
0.135
P6
0.014
P7
0.261
P8
0.091
0.556
0.059
0.829
0.333
0.750
0.500
0.082
0.538
0.442
0.318
0.833
0.027
0.048
0.412
0.185
0.765
0.812
0.455
0.111
0.062
0.281
0.236
0.055
0.333
0.176
0.137
0.171
0.250
0.300
0.216
0.524
0.091
0.368
0.667
0.167
0.050
0.533
0.531
0.429
P5
P9
P10
0.168
0.444
0.250
0.562
0.267
0.586
0.500
0.368
0.532
0.432
0.111
0.185
0.270
0.097
0.678
0.059
0.500
0.333
0.039
1.000
0.429
0.438
0.517
F
G
H
I
K
L
M
N
P
Q
0.411
-0.429
-0.048
-0.643
-0.806
R
S
T
0.244
0.365
0.600
-0.064
0.034
0.111
0.273
-0.300
-0.423
-0.111
V
W
Y
0.778
-0.429
0.500
0.143
0.127
0.275
0.111
0.667
0.273
0.636
0.000
0.314
0.556
0.055
0.207
0.048
0.088
0.091
0.333
0.600
0.619
0.381
0.500
0.023
0.548
0.167
0.000
0.111
0.414
0.038
0.353
0.442
0.500
0.000
0.364
0.477
P11
0.074
P12
0.156
P13
0.028
0.083
0.529
0.317
0.462
0.333
0.524
0.394
0.333
0.786
0.471
0.097
0.429
0.875
0.800
0.154
0.231
0.333
0.100
0.176
0.129
0.486
0.434
0.229
0.065
0.111
0.385
0.667
0.417
0.347
0.191
0.000
0.125
0.500
0.185
0.471
0.318
0.515
0.229
0.200
0.600
0.314
0.583
0.164
0.333
0.750
0.217
0.613
0.472
0.800
0.576
0.333
0.333
0.167
0.026
0.467
0.600
0.067
0.098
0.636
0.133
0.422
0.652
0.048
0.500
0.241
0.000
0.164
0.273
0.250
0.500
0.148
0.556
0.176
0.240
0.059
0.333
0.190
0.200
1.000
0.000
0.765
0.091
0.238
0.273
0.333
0.036
0.068
0.027
0.250
0.143
0.222
0.250
0.586
0.053
0.188
0.288
0.191
0.077
0.300
0.394
0.355
0.244
0.444
0.562
0.211
1.000
0.125
P14
P15
0.333
0.276
0.133
0.556
0.282
0.500
0.167
0.905
0.742
0.310
0.176
0.200
0.036
0.377
0.023
1.000
0.569
0.404
0.455
0.105
0.037
0.238
0.067
0.294
0.500
0.200
0.636
0.440
0.191
0.400
0.053
0.056
0.000
0.040
0.208
0.118
0.077
0.086
0.667
0.529
Table S2: Quantitative weight matrix for the last fifteen residues of the C-terminus of antibacterial peptides. P1,
P2, ..., P15 show residue preferences for positions 1, 2, ..., 15, respectively. The number shown in bold is the
highest propensity of a residue at a given position.
AA
A
P1
0.320
P2
0.016
P3
0.051
P4
P5
P6
P8
P9
P10
P11
0.108
P7
0.160
P12
0.072
P13
0.059
P14
P15
0.205
0.375
0.167
0.012
0.217
0.134
0.413
0.455
0.775
0.867
0.185
0.520
0.857
0.167
0.226
0.692
0.579
0.622
0.097
0.871
0.375
0.172
0.467
0.471
0.625
0.739
0.565
0.676
0.500
0.758
0.231
0.167
0.579
0.562
0.294
0.833
0.724
0.389
0.412
0.576
0.116
0.125
0.238
0.200
0.636
0.200
0.362
0.148
0.143
0.200
0.257
0.385
0.224
0.091
0.636
0.565
0.409
0.143
0.487
0.429
0.273
0.190
0.714
0.161
0.167
0.714
0.000
0.517
0.333
0.630
0.081
0.200
0.217
0.167
0.538
0.857
0.214
0.037
0.049
0.000
0.032
0.200
0.103
0.228
0.043
0.143
0.367
0.194
0.239
0.222
0.349
0.467
0.048
0.625
0.050
0.000
0.125
0.048
0.224
0.143
0.273
0.231
0.410
0.273
0.362
0.256
0.283
0.360
0.222
0.100
0.091
0.016
0.067
0.500
0.310
0.464
0.250
0.548
0.487
0.273
0.403
0.108
0.429
0.098
0.317
0.207
0.356
0.148
0.018
0.250
0.289
0.395
0.167
0.524
0.212
0.345
0.068
0.800
0.333
0.391
0.059
0.366
0.333
0.500
0.051
0.333
0.091
0.070
0.188
0.178
0.231
0.474
0.091
0.438
0.222
0.231
0.462
0.026
0.600
0.615
0.366
0.111
0.421
0.353
0.129
0.357
0.357
0.000
0.455
0.226
0.073
0.255
0.026
0.061
0.143
0.057
0.286
0.333
0.550
0.133
0.013
0.333
0.625
0.353
0.556
0.048
1.000
0.250
0.091
0.028
0.429
0.164
1.000
0.167
0.100
0.263
0.043
0.176
0.667
0.412
0.415
0.075
0.333
0.077
0.455
0.125
0.375
0.149
0.286
0.400
0.556
0.250
0.600
0.351
0.214
0.179
0.256
0.200
0.212
0.032
0.352
0.298
0.238
0.412
0.081
0.227
0.200
0.043
0.000
0.152
0.353
0.022
0.167
0.222
0.167
0.300
0.346
0.290
C
D
E
F
G
H
I
0.059
0.244
K
L
M
0.000
0.000
N
P
Q
0.061
R
S
T
V
0.443
0.433
0.412
0.400
0.286
0.097
0.182
0.333
0.125
0.048
W
Y
0.000
0.268
0.360
0.750
0.385
0.225
0.000
0.360
0.137
0.263
0.077
0.333
0.143
0.089
0.273
0.200
0.128
0.571
0.111
0.053
0.167
0.000
0.600
0.455
0.067
0.333
0.032
0.189
0.351
0.163
0.156
0.600
0.077
0.037
0.208
0.098
0.037
0.226
0.667
0.238
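A position-specific weight matrix such as those in Tables S1 and S2 is commonly applied by summing, over the positions of a segment, the weight of the residue found at each position. The sketch below illustrates that summation only. The three-position matrix `TOY_MATRIX` holds made-up values and is not taken from either table, and the rule of scoring unlisted residues as 0.0 is an assumption.

```python
# Hypothetical weights for three positions; NOT values from Table S1 or S2.
TOY_MATRIX = [
    {"G": 0.5, "K": -0.2},   # position 1
    {"L": 0.3, "D": -0.6},   # position 2
    {"K": 0.4, "E": -0.5},   # position 3
]


def matrix_score(segment: str, matrix: list) -> float:
    """Sum the position-specific weights of the residues in a segment.

    Assumption: a residue missing from a position's dictionary contributes 0.0.
    """
    if len(segment) > len(matrix):
        raise ValueError("segment is longer than the weight matrix")
    return sum(matrix[pos].get(res, 0.0) for pos, res in enumerate(segment))


print(round(matrix_score("GLK", TOY_MATRIX), 3))   # 1.2
print(round(matrix_score("KDE", TOY_MATRIX), 3))   # -1.3
```

A higher total would indicate a segment whose residues are, position by position, more typical of the class the matrix was built from.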
Table S3: Performance of the SVM module developed using amino acid
composition and binary pattern of the NT5 dataset.

            Amino acid composition            Binary pattern
Threshold   Sen. (%)   Spec. (%)   Acc. (%)   Sen. (%)   Spec. (%)   Acc. (%)
-1          97.31      25.45       61.38      95.81      24.25       60.03
-0.9        95.21      30.54       62.87      94.31      30.54       62.43
-0.8        93.71      35.33       64.52      93.71      37.13       65.42
-0.7        91.62      39.82       65.72      91.62      43.11       67.37
-0.6        90.42      46.11       68.26      90.12      48.50       69.31
-0.5        89.22      50.60       69.91      86.83      54.19       70.51
-0.4        87.72      55.09       71.41      85.03      59.88       72.46
-0.3        84.73      58.98       71.86      82.34      64.97       73.65
-0.2        81.44      63.47       72.46      80.24      68.86       74.55
0           73.65      72.75       73.20      74.25      74.85       74.55
0.1         70.36      76.65       73.50      70.66      77.54       74.10
0.2         66.77      79.64       73.20      67.96      83.83       75.90
0.3         63.47      83.83       73.65      64.97      88.32       76.65
0.4         57.49      86.83       72.16      60.78      88.92       74.85
0.5         53.59      89.52       71.56      57.78      90.12       73.95
0.6         48.50      91.62       70.06      54.49      91.02       72.75
0.7         44.61      94.01       69.31      51.50      93.41       72.46
0.8         39.22      94.91       67.07      44.91      95.51       70.21
0.9         34.43      96.11       65.27      39.52      96.11       67.81
1           29.04      97.90       63.47      32.34      97.01       64.67
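Tables S3 to S5 compare two input encodings, amino acid composition and binary pattern. The sketch below shows one conventional way to compute each for a peptide segment. It assumes composition is expressed as the percentage of each of the 20 standard residues and that the binary pattern is a 20-element one-hot vector per residue. The function names and the residue ordering are illustrative choices.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues, alphabetical by one-letter code


def aa_composition(segment: str) -> list:
    """Return a 20-element vector: percentage of each residue in the segment."""
    if not segment:
        raise ValueError("segment must not be empty")
    return [100.0 * segment.count(aa) / len(segment) for aa in AMINO_ACIDS]


def binary_pattern(segment: str) -> list:
    """Return a (20 * length)-element vector with one 1 per residue position."""
    vector = []
    for residue in segment:
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


# Made-up 5-residue segment, the length used in an NT5-style dataset.
segment = "GLKKA"
print(len(aa_composition(segment)))   # 20
print(len(binary_pattern(segment)))   # 100
```

Under these assumptions the composition vector always has 20 elements, while the binary pattern grows with segment length (100, 200 and 300 elements for segments of 5, 10 and 15 residues) and keeps positional information that composition discards.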
Table S4: Performance of the SVM module developed using amino acid
composition and binary pattern of the NT10 dataset.

            Amino acid composition            Binary pattern
Threshold   Sen. (%)   Spec. (%)   Acc. (%)   Sen. (%)   Spec. (%)   Acc. (%)
-1          97.29      44.33       70.81      99.75      13.55       56.65
-0.9        97.04      49.01       73.03      99.75      17.98       58.87
-0.8        95.81      52.46       74.14      99.26      25.12       62.19
-0.7        94.58      55.91       75.25      98.77      34.48       66.63
-0.6        93.60      59.61       76.60      97.04      42.86       69.95
-0.5        92.86      63.05       77.96      95.81      50.99       73.40
-0.4        91.87      65.52       78.69      94.33      59.61       76.97
-0.3        90.89      67.73       79.31      91.87      67.73       79.80
-0.2        89.66      72.17       80.91      90.64      74.38       82.51
0           83.74      81.28       82.51      85.22      87.68       86.45
0.1         81.03      83.25       82.14      81.28      90.39       85.84
0.2         78.82      85.22       82.02      78.33      93.60       85.96
0.3         75.12      87.68       81.40      74.63      95.07       84.85
0.4         70.94      90.89       80.91      69.21      96.80       83.00
0.5         68.97      91.38       80.17      65.76      98.03       81.90
0.6         64.29      93.10       78.69      59.85      99.01       79.43
0.7         58.37      93.84       76.11      51.48      99.51       75.49
0.8         53.94      94.58       74.26      43.10      99.75       71.43
0.9         49.51      96.31       72.91      32.27      99.75       66.01
1           45.57      96.55       71.06      25.62      99.75       62.68
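Each row of Tables S3 to S5 reports sensitivity, specificity and accuracy at one cut-off on the SVM output score. The sketch below shows how these three percentages follow from a set of scores and true labels under the standard definitions. It assumes a peptide is predicted antibacterial when its score is greater than or equal to the threshold; the scores and labels are made up for illustration.

```python
def threshold_metrics(scores, labels, threshold):
    """Return (sensitivity, specificity, accuracy) in percent at one threshold.

    labels: 1 for antibacterial (positive), 0 for non-antibacterial (negative).
    Assumption: score >= threshold is predicted positive.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Made-up SVM scores: three positives followed by three negatives.
scores = [0.9, 0.2, -0.4, -0.8, 0.1, -0.3]
labels = [1, 1, 1, 0, 0, 0]
for t in (-0.5, 0.0, 0.5):
    sen, spec, acc = threshold_metrics(scores, labels, t)
    print(f"{t:5.1f}  Sen {sen:6.2f}  Spec {spec:6.2f}  Acc {acc:6.2f}")
```

Raising the threshold trades sensitivity for specificity, which is the pattern visible down each column of the tables.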
Table S5: Performance of the SVM module developed using amino acid
composition and binary pattern of the NT15 dataset.

            Amino acid composition            Binary pattern
Threshold   Sen. (%)   Spec. (%)   Acc. (%)   Sen. (%)   Spec. (%)   Acc. (%)
-1          96.93      51.66       74.30      97.95      46.55       72.25
-0.9        96.68      56.52       76.60      96.93      51.15       74.04
-0.8        96.42      62.40       79.41      96.68      57.54       77.11
-0.7        95.91      65.98       80.95      96.16      60.36       78.26
-0.6        94.63      68.29       81.46      95.65      65.22       80.43
-0.5        93.61      71.61       82.61      95.40      68.29       81.84
-0.4        92.07      76.21       84.14      93.86      73.66       83.76
-0.3        91.05      78.77       84.91      92.33      76.98       84.65
-0.2        90.03      81.33       85.68      91.05      81.33       86.19
0           88.24      88.24       88.24      87.72      87.98       87.85
0.1         86.45      89.77       88.11      85.93      91.30       88.62
0.2         84.40      90.79       87.60      83.63      92.84       88.24
0.3         81.33      92.58       86.96      81.59      93.86       87.72
0.4         78.52      94.12       86.32      79.03      93.86       86.45
0.5         77.49      94.37       85.93      75.70      94.63       85.17
0.6         73.66      95.65       84.65      73.40      95.40       84.40
0.7         70.59      96.16       83.38      70.08      95.65       82.86
0.8         67.26      96.68       81.97      62.92      96.42       79.67
0.9         62.92      97.70       80.31      57.29      96.93       77.11
1           57.80      97.70       77.75      52.43      97.95       75.19
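The accuracy columns of Table S5 can be scanned for the threshold at which each encoding performs best. The sketch below does this for the NT15 values, transcribed from the table. The helper name `best_threshold` is illustrative, and picking the threshold by maximum accuracy is an assumption about how the table might be read, not a selection rule stated in this supplement.

```python
# Thresholds as listed in Table S5 (the table has no row for -0.1).
THRESHOLDS = [-1.0, -0.9, -0.8, -0.7, -0.6, -0.5, -0.4, -0.3, -0.2,
              0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

# Accuracy (%) on NT15, transcribed from Table S5.
ACC_COMPOSITION = [74.30, 76.60, 79.41, 80.95, 81.46, 82.61, 84.14, 84.91, 85.68,
                   88.24, 88.11, 87.60, 86.96, 86.32, 85.93, 84.65, 83.38, 81.97,
                   80.31, 77.75]
ACC_BINARY = [72.25, 74.04, 77.11, 78.26, 80.43, 81.84, 83.76, 84.65, 86.19,
              87.85, 88.62, 88.24, 87.72, 86.45, 85.17, 84.40, 82.86, 79.67,
              77.11, 75.19]


def best_threshold(thresholds, accuracies):
    """Return (threshold, accuracy) for the row with the highest accuracy."""
    if len(thresholds) != len(accuracies):
        raise ValueError("thresholds and accuracies must have the same length")
    best_index = max(range(len(accuracies)), key=accuracies.__getitem__)
    return thresholds[best_index], accuracies[best_index]


print(best_threshold(THRESHOLDS, ACC_COMPOSITION))   # (0.0, 88.24)
print(best_threshold(THRESHOLDS, ACC_BINARY))        # (0.1, 88.62)
```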