Determining minimum set of driver nodes in protein-protein interaction networks: Additional file 1

Xiao-Fei Zhang, Le Ou-Yang, Yuan Zhu, Meng-Yun Wu, and Dao-Qing Dai

Contents
1 Supplementary Table
2 Supplementary Figure

1 Supplementary Table

Table S1: Significance of the difference between degree populations of predicted driver proteins and non-driver proteins

Dataset      intlinprog              lp_solve
             MDS        CC-MDS       MDS        CC-MDS
combined     4.9E-10    0            1.6E-131   0
binary       2.3E-15    0            6.7E-183   0
co-complex   8.3E-05    1.3E-134     4.2E-64    1.3E-134

Table S2: Significance of the difference between betweenness populations of predicted driver proteins and non-driver proteins

Dataset      intlinprog              lp_solve
             MDS        CC-MDS       MDS        CC-MDS
combined     1.6E-25    0            1.1E-235   0
binary       1.3E-30    0            4.6E-314   0
co-complex   1.1E-15    8.5E-248     9.4E-130   8.5E-248

Table S3: Significance of the difference between populations of the number of annotated protein complexes of predicted driver proteins and non-driver proteins

Dataset      intlinprog              lp_solve
             MDS        CC-MDS       MDS        CC-MDS
combined     3.5E-03    2.3E-06      5.2E-05    2.2E-06
binary       2.3E-02    6.1E-06      5.1E-04    5.9E-06
co-complex   1.5E-02    5.6E-05      5.4E-04    5.6E-05

Table S4: Significance of the difference between populations of the number of associated GO annotations of predicted driver proteins and non-driver proteins

Dataset      Ontology   intlinprog              lp_solve
                        MDS        CC-MDS       MDS        CC-MDS
combined     BP         2.9E-06    1.4E-33      6.3E-17    2.5E-33
             CC         7.8E-11    1.7E-44      6.7E-26    8.7E-44
             MF         1.5E-15    3.2E-78      8.1E-50    6.4E-78
binary       BP         3.1E-07    5.6E-31      2.9E-20    9.8E-31
             CC         1.3E-14    4.9E-43      1.8E-28    1.2E-42
             MF         1.8E-16    7.1E-72      3.8E-48    1.4E-71
co-complex   BP         1.9E-05    3.2E-28      7.8E-20    3.2E-28
             CC         4.4E-07    1.7E-31      2.0E-21    1.7E-31
             MF         5.0E-13    5.5E-75      1.1E-43    5.5E-72

2 Supplementary Figure

[Figure S1: panels A and B; x-axis: γ (0 to 1); y-axis of panel A: number of driver proteins (about 1392 to 1406); marked value: γ = 0.15]

Figure S1: Effect of parameter γ on the resulting CC-MDS proteins for the intlinprog method in the binary network.
In (A), we present the effect of parameter γ on the number of predicted driver proteins. The x-axis denotes the value of γ; the y-axis denotes the number of driver proteins determined using the CC-MDS model; the red circle marks the value of γ we choose. In (B), we present the overlap rate between the sets of driver proteins obtained using different values of γ.

[Figure S2: panels A and B; x-axis: γ (0 to 1); y-axis of panel A: Number of driver proteins (about 545 to 552); marked value: γ = 0.05]

Figure S2: Effect of parameter γ on the resulting CC-MDS proteins for the intlinprog method in the co-complex network. In (A), we present the effect of parameter γ on the number of predicted driver proteins. The x-axis denotes the value of γ; the y-axis denotes the number of driver proteins determined using the CC-MDS model; the red circle marks the value of γ we choose. In (B), we present the overlap rate between the sets of driver proteins obtained using different values of γ.

[Figure S3: box plots, panels A to C; y-axis: Degree (log scale, 10^0 to 10^2); groups: MDS, non-MDS, CC-MDS, non-CC-MDS]

Figure S3: Degree distributions of predicted driver and non-driver proteins. The degree distributions of predicted driver and non-driver proteins are represented by box plots (line = median). (A) combined network; (B) binary network; (C) co-complex network.

[Figure S4: box plots, panels A to C; y-axis: log scale, 10^-6 to 10^0]

Figure S4: Betweenness distributions of predicted driver and non-driver proteins. The betweenness distributions of predicted driver and non-driver proteins are represented by box plots (line = median). (A) combined network; (B) binary network; (C) co-complex network.
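The comparison summarized by the box plots in Figure S3 can be illustrated with a minimal sketch that computes node degrees from an edge list and reports the median degree of driver and non-driver proteins. The toy edge list, the driver set, and the function names `degrees` and `median_degree_by_group` are hypothetical and chosen only for illustration; they are not the networks or code used in the paper.

```python
from collections import defaultdict
from statistics import median


def degrees(edges):
    """Return {node: degree} for an undirected edge list.

    Self-loops and duplicate edges are ignored.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return {node: len(nbrs) for node, nbrs in neighbors.items()}


def median_degree_by_group(edges, drivers):
    """Return (median degree of drivers, median degree of non-drivers)."""
    deg = degrees(edges)
    driver_deg = [d for node, d in deg.items() if node in drivers]
    other_deg = [d for node, d in deg.items() if node not in drivers]
    return median(driver_deg), median(other_deg)


# Hypothetical toy network; {"a", "e"} is a dominating set of this graph.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "f")]
drivers = {"a", "e"}

driver_median, non_driver_median = median_degree_by_group(edges, drivers)
print(driver_median, non_driver_median)
```

The same grouping could feed a two-sample significance test such as those reported in Table S1; the specific test is not named in this file, so none is assumed here.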
[Figure S5: panels A to C; x-axis: Number of deleted proteins; y-axis: Number of connected components; curves: MDS-intlinprog, CC-MDS-intlinprog, MDS-lp-solve, CC-MDS-lp-solve]

Figure S5: Vulnerability to attack against predicted driver proteins quantified using the number of connected components. Starting with the most connected proteins, the proteins are successively deleted and the number of connected components after each deletion is calculated. There is one curve for each set of predicted driver proteins that shows the number of connected components as a function of the number of deleted proteins. (A) combined network; (B) binary network; (C) co-complex network.

[Figure S6: panels A to C; x-axis: Number of deleted proteins; y-axis: Largest connected component (fraction of nodes); curves: MDS-intlinprog, CC-MDS-intlinprog, MDS-lp-solve, CC-MDS-lp-solve]

Figure S6: Vulnerability to attack against predicted driver proteins quantified using the largest connected component. Starting with the most connected proteins, the proteins are successively deleted and the size of the largest connected component after each deletion is calculated. There is one curve for each set of predicted driver proteins that shows the fraction of nodes in the largest connected component as a function of the number of deleted proteins.
(A) combined network; (B) binary network; (C) co-complex network.

[Figure S7: box plots, panels A to C; y-axis: Number of complexes (log scale, 10^0 to 10^2); groups: MDS, non-MDS, CC-MDS, non-CC-MDS]

Figure S7: Distributions of the number of associated complexes of predicted driver and non-driver proteins. The distributions of the number of associated protein complexes of predicted driver and non-driver proteins are represented by box plots (line = median). (A) combined network; (B) binary network; (C) co-complex network.

[Figure S8: box plots, panels A to C; y-axis: Number of annotations (log scale, 10^0 to 10^2)]

Figure S8: Distributions of the number of associated GO annotations of predicted driver and non-driver proteins in the combined network. The distributions of the number of associated GO annotations of predicted driver and non-driver proteins are represented by box plots (line = median). (A) biological process; (B) cellular component; (C) molecular function.

[Figure S9: box plots, panels A to C; y-axis: log scale, 10^0 to 10^2; groups: MDS, non-MDS, CC-MDS, non-CC-MDS for intlinprog and lp_solve]

Figure S9: Distributions of the number of associated GO annotations of predicted driver and non-driver proteins in the binary network. The distributions of the number of associated GO annotations of predicted driver and non-driver proteins are represented by box plots (line = median). (A) biological process; (B) cellular component; (C) molecular function.
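The attack procedure described in the captions of Figures S5 and S6 (successive deletion of predicted driver proteins, most connected first, with the number of connected components and the largest component recorded after each deletion) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: degrees are taken from the original network and not recomputed after deletions, the largest component is reported as a fraction of the original node count, and the toy network and the names `build_adjacency`, `component_sizes`, and `attack_curve` are hypothetical.

```python
def build_adjacency(nodes, edges):
    """Build an undirected adjacency map; self-loops are ignored."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def component_sizes(adj, removed):
    """Sizes of the connected components among nodes not in `removed`."""
    seen = set(removed)  # removed nodes are never visited
    sizes = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:  # iterative depth-first search
            node = stack.pop()
            size += 1
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        sizes.append(size)
    return sizes


def attack_curve(nodes, edges, drivers):
    """Delete drivers by decreasing original degree (assumption) and return
    (number deleted, number of components, largest component fraction)."""
    adj = build_adjacency(nodes, edges)
    order = sorted(drivers, key=lambda n: (-len(adj[n]), n))
    removed = set()
    curve = []
    for protein in order:
        removed.add(protein)
        sizes = component_sizes(adj, removed)
        largest = max(sizes, default=0)
        # Fraction relative to the original node count (assumption).
        curve.append((len(removed), len(sizes), largest / len(adj)))
    return curve


# Hypothetical toy network.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "f")]
curve = attack_curve(nodes, edges, {"a", "e"})
for step in curve:
    print(step)
```

Each tuple corresponds to one point on a curve: the second entry is the quantity plotted in Figure S5 and the third the quantity plotted in Figure S6, under the normalization assumed above.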
[Figure S10: box plots, panels A to C; y-axis: Number of annotations (log scale, 10^0 to 10^2); groups: MDS, non-MDS, CC-MDS, non-CC-MDS for intlinprog and lp_solve]

Figure S10: Distributions of the number of associated GO annotations of predicted driver and non-driver proteins in the co-complex network. The distributions of the number of associated GO annotations of predicted driver and non-driver proteins are represented by box plots (line = median). (A) biological process; (B) cellular component; (C) molecular function.

[Figure S11: three-set overlap diagrams, panels A to C. Set sizes and region counts as printed:
(A) CC-MDS (1407), DS-DC (1536), DS-GDC (1534); regions: 33, 18, 17, 1339, 33, 146, 32
(B) CC-MDS (1393), DS-DC (1525), DS-GDC (1532); regions: 33, 21, 12, 1327, 26, 151, 32
(C) CC-MDS (546), DS-DC (618), DS-GDC (617); regions: 10, 9, 4, 523, 9, 77, 13]

Figure S11: Overlap of the three sets of driver proteins produced by the CC-MDS, DS-DC and DS-GDC algorithms applied to the three networks considered. (A) combined network; (B) binary network; (C) co-complex network.
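The seven region counts of a three-set overlap diagram such as Figure S11 can be derived from the sets themselves. The sketch below shows one way to do this; the protein identifiers are hypothetical toy data, and the function name `venn3_regions` is an illustrative choice, not part of the original analysis.

```python
def venn3_regions(a, b, c):
    """Return the sizes of the seven disjoint regions of three sets."""
    a, b, c = set(a), set(b), set(c)
    return {
        "A only": len(a - b - c),
        "B only": len(b - a - c),
        "C only": len(c - a - b),
        "A&B only": len((a & b) - c),
        "A&C only": len((a & c) - b),
        "B&C only": len((b & c) - a),
        "A&B&C": len(a & b & c),
    }


# Hypothetical driver sets standing in for CC-MDS, DS-DC and DS-GDC.
cc_mds = {"p1", "p2", "p3", "p4"}
ds_dc = {"p2", "p3", "p4", "p5"}
ds_gdc = {"p3", "p4", "p6"}

regions = venn3_regions(cc_mds, ds_dc, ds_gdc)

# Consistency check: the four regions inside A must add up to |A|.
size_a = (regions["A only"] + regions["A&B only"]
          + regions["A&C only"] + regions["A&B&C"])
print(regions, size_a)
```

The consistency check at the end is the same arithmetic a reader can use to relate the region counts printed in a panel to the set sizes given in parentheses.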