Barabasi in Bioinformatics

A Real-Life Application of Barabasi’s Scale-Free Power Law
Presentation for ENGS 112
Doug Madory
Wed, 1 JUN 05
Fri, 27 MAY 05
Background


A common property of many large networks is that vertex connectivities follow a scale-free power-law distribution.
This is a consequence of two generic mechanisms:
(i) networks expand continuously by the addition of new vertices, and
(ii) new vertices attach preferentially to sites that are already well connected.
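
The two mechanisms above can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `grow_network`, the parameter `m` (links added per new vertex), and the network size are all assumptions chosen for the example.

```python
import random
from collections import Counter

def grow_network(n_vertices, m=2, seed=0):
    """Grow a graph by preferential attachment and return its edge list.

    n_vertices and m are illustrative parameters, not values from the slides.
    """
    rng = random.Random(seed)
    # Start from a small fully connected core of m + 1 vertices.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every vertex appears in `endpoints` once per link it holds, so a
    # uniform draw from this list picks a vertex in proportion to its degree.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m + 1, n_vertices):      # (i) continuous growth
        chosen = set()
        while len(chosen) < m:                # (ii) preferential attachment
            chosen.add(rng.choice(endpoints))
        for old in chosen:
            edges.append((new, old))
            endpoints.extend((new, old))
    return edges

edges = grow_network(2000)
degree = Counter(v for edge in edges for v in edge)
print("largest degree:", max(degree.values()))
print("vertices with the minimum degree:",
      sum(1 for d in degree.values() if d == min(degree.values())))
```

Most vertices end up with only a few links while a handful of early vertices become hubs, which is the qualitative signature of a scale-free distribution.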
So what?
The objective of network theory is not network diagrams, but insight!
Application of Barabasi’s theory to bioinformatics has produced several significant biological discoveries

Determining Roles of Proteins
Within Metabolism
Proteins are traditionally identified on the basis of their individual actions
Modern research is trying to determine the contextual or cellular function of proteins
  Requires analysis of thousands of simultaneous protein-protein interactions: unworkable!
  Must analyze as a complex network
Yeast proteome
Protein-Protein Interaction

Map of protein-protein interactions forms a scale-free power-law network
  A few highly connected proteins play a central role in mediating interactions among numerous, less connected proteins
  Consequence is tolerance to random errors
  Computational removal of highly connected proteins rapidly increases network diameter
Highly-Connected Proteins

When highly connected proteins are removed in order of connectivity, cell mortality increases
  Highly connected proteins are paramount to survival
  93% of proteins have <5 links; 21% of these are essential
  0.7% of proteins have >15 links; 62% of these are essential
Conversely, when proteins are removed at random, the effect is negligible
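
The contrast between targeted and random removal can be reproduced on a toy network. The sketch below is an assumption-laden illustration, not the yeast data: it builds a synthetic preferential-attachment graph (helper names such as `preferential_graph` and `mean_path_length` are hypothetical), removes 5% of the vertices either hubs-first or at random, and compares the average shortest path within the largest connected component.

```python
import random
from collections import deque

def preferential_graph(n, m=2, seed=1):
    """Toy scale-free graph as an adjacency dict (illustrative only)."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    endpoints = []
    for i in range(m + 1):                 # fully connected starting core
        for j in range(i):
            adj[i].add(j)
            adj[j].add(i)
            endpoints += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:             # degree-proportional choice
            chosen.add(rng.choice(endpoints))
        for old in chosen:
            adj[new].add(old)
            adj[old].add(new)
            endpoints += [new, old]
    return adj

def remove_nodes(adj, doomed):
    """Return a copy of the graph without the vertices in `doomed`."""
    doomed = set(doomed)
    return {v: nbrs - doomed for v, nbrs in adj.items() if v not in doomed}

def bfs_distances(adj, source):
    """Hop counts from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def mean_path_length(adj):
    """Average shortest path over all pairs in the largest component."""
    seen, largest = set(), {}
    for v in adj:
        if v not in seen:
            component = bfs_distances(adj, v)
            seen.update(component)
            if len(component) > len(largest):
                largest = component
    total = pairs = 0
    for v in largest:
        dist = bfs_distances(adj, v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

graph = preferential_graph(500)
k = 25                                      # remove 5% of the vertices
hubs = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:k]
unlucky = random.Random(2).sample(sorted(graph), k)

baseline = mean_path_length(graph)
after_attack = mean_path_length(remove_nodes(graph, hubs))
after_failure = mean_path_length(remove_nodes(graph, unlucky))
print(f"intact {baseline:.2f}  hubs removed {after_attack:.2f}  "
      f"random removed {after_failure:.2f}")
```

In this toy setting, removing hubs lengthens paths far more than removing the same number of random vertices, mirroring the lethality pattern described above.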
More Characteristics of
Highly-Connected Proteins


Most hub proteins are the same across species
  4% of all proteins were found in all organisms of the experiment
  These were also the most highly connected proteins
Species-specific differences are expressed in the least connected proteins
Small-World in Organisms

Connectivity is characterized by network diameter
  Shortest biochemical pathway averaged over all pairs of substrates
For all known non-biological networks, average node connectivity is fixed
  Implies increased diameter as new nodes are added
  Therefore more complex organisms should have greater network diameters, but they don't!
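
The scaling argument above can be made concrete with a rough estimate. The sketch below assumes the textbook random-graph approximation d ≈ ln N / ln⟨k⟩, which is not stated in the slides; the function names and the sample sizes are likewise illustrative. It shows the diameter growing with N when the mean connectivity is fixed, and, conversely, how much the mean connectivity would have to grow to hold the diameter at about 3.3.

```python
import math

def estimated_diameter(n_nodes, mean_degree):
    """Random-graph estimate of diameter: ln N / ln <k> (an approximation)."""
    return math.log(n_nodes) / math.log(mean_degree)

def degree_for_diameter(n_nodes, diameter):
    """Mean degree needed to keep the estimated diameter fixed: N ** (1/d)."""
    return n_nodes ** (1.0 / diameter)

for n in (100, 1000, 10000):
    d_fixed_k = estimated_diameter(n, 4.0)   # connectivity held at <k> = 4
    k_fixed_d = degree_for_diameter(n, 3.3)  # diameter held at 3.3
    print(f"N={n:>6}  diameter at <k>=4: {d_fixed_k:.2f}  "
          f"<k> needed for d=3.3: {k_fixed_d:.2f}")
```

Under this approximation, a network can only keep its diameter constant as it grows if its nodes become more densely connected.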
Conservation of Diameter



All metabolic networks share the same diameter!
As organism complexity increases, individual proteins are increasingly connected to maintain a constant metabolic network diameter
Larger diameter would attenuate the organism’s ability to respond efficiently to external changes
Conservation of Diameter
Summary for Diameter
(Minitab graphical summary: histogram of diameter over 3.0 to 3.5, with 95% confidence interval plot for mean and median)

Anderson-Darling Normality Test
  A-Squared        1.48
  P-Value        < 0.005

  Mean             3.2907
  StDev            0.1231
  Variance         0.0151
  Skewness         0.104338
  Kurtosis        -0.311452
  N                43

  Minimum          3.0000
  1st Quartile     3.2000
  Median           3.3000
  3rd Quartile     3.4000
  Maximum          3.5000

  95% Confidence Interval for Mean      3.2528 to 3.3286
  95% Confidence Interval for Median    3.2000 to 3.3000
  95% Confidence Interval for StDev     0.1015 to 0.1564

Minitab analysis of Barabasi’s data for diameter
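
As a quick sanity check on the summary above, the 95% confidence interval for the mean can be recomputed from the reported N, mean, and standard deviation. This is a sketch under one stated assumption: the Student-t critical value for 42 degrees of freedom is hard-coded as approximately 2.018 rather than computed.

```python
import math

n, mean, stdev = 43, 3.2907, 0.1231   # values reported in the summary
t_crit = 2.018                        # assumed two-sided 95% t value, 42 d.o.f.

half_width = t_crit * stdev / math.sqrt(n)
low, high = mean - half_width, mean + half_width
print(f"95% CI for mean diameter: {low:.4f} to {high:.4f}")
print(f"variance from StDev: {stdev ** 2:.4f}")
```

The recomputed interval agrees with the reported one to within rounding, so the mean diameter is pinned down to a band less than 0.08 wide across the 43 organisms.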
Conservation of Gamma

All metabolic networks share a power-law connectivity distribution
  a. A. fulgidus (archaeon)
  b. E. coli (bacterium)
  c. C. elegans (eukaryote)
  d. All 43 organisms (avg)
  γ for all life is about 2.2
Conservation of Gamma
Summary for γ
(Minitab graphical summary: histogram of γ over 2.0 to 2.4, with 95% confidence interval plot for mean and median)

Anderson-Darling Normality Test
  A-Squared        4.99
  P-Value        < 0.005

  Mean             2.1837
  StDev            0.0893
  Variance         0.0080
  Skewness         0.024458
  Kurtosis         0.234872
  N                86

  Minimum          2.0000
  1st Quartile     2.1000
  Median           2.2000
  3rd Quartile     2.2000
  Maximum          2.4000

  95% Confidence Interval for Mean      2.1646 to 2.2029
  95% Confidence Interval for Median    2.2000 to 2.2000
  95% Confidence Interval for StDev     0.0776 to 0.1050

Minitab analysis of Barabasi’s data for γ
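
The exponent summarized above describes a distribution of the form P(k) ∝ k^(-γ), so it appears as the negative slope of a straight line on a log-log plot. The sketch below estimates it with an ordinary least-squares fit on synthetic counts; it is a rough illustration, not necessarily how the values in this deck were obtained, and the function name `fit_exponent` and the synthetic data are assumptions. Maximum-likelihood estimators are often preferred for noisy real data.

```python
import math

def fit_exponent(degree_counts):
    """Estimate gamma as the negative least-squares slope of
    log(count) against log(k). `degree_counts` maps degree k -> count."""
    points = [(math.log(k), math.log(c))
              for k, c in degree_counts.items() if k > 0 and c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return -sxy / sxx

# Synthetic, noise-free counts that follow k ** -2.2 exactly (made-up data).
synthetic = {k: 1e6 * k ** -2.2 for k in range(1, 101)}
gamma = fit_exponent(synthetic)
print(f"fitted gamma: {gamma:.3f}")
```

On noise-free input the fit recovers the exponent used to generate the counts; with measured degree distributions, the sparse high-degree tail makes the estimate sensitive to binning.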
Conclusions
Barabasi’s network theory offers insights into metabolic networks in cellular biology
Correlation between connectivity and indispensability of a protein confirms that robustness against lethal mutations is derived from the organization of protein interactions
Metabolic networks within all living things have almost the same diameter and γ
