extreme pathway matrix

Extreme Pathway Lengths and Reaction Participation in Genome Scale Metabolic Networks
Jason A. Papin, Nathan D. Price and Bernhard Ø. Palsson
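Introduction
Reaction Network (duplication is only for easy drawing)
Stoichiometric Matrix
\[
S = \begin{array}{c|rrrrrrrrr}
 & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 & b_1 & b_2 & b_3 \\ \hline
A & -1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
B & 1 & -2 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\
C & 0 & 1 & 0 & 0 & -1 & -1 & 0 & 0 & 0 \\
D & 0 & 0 & 1 & -1 & 1 & 0 & 0 & 0 & 0 \\
E & 0 & 0 & 0 & 1 & 0 & 1 & 0 & -1 & 0 \\
\mathrm{byp} & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\
\mathrm{cof} & 0 & 0 & -1 & 1 & -1 & 0 & 0 & 0 & 0
\end{array}
\]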
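Background
\[
S = \begin{array}{c|rrrrrrrrr}
 & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 & b_1 & b_2 & b_3 \\ \hline
A & -1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
B & 1 & -2 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\
C & 0 & 1 & 0 & 0 & -1 & -1 & 0 & 0 & 0 \\
D & 0 & 0 & 1 & -1 & 1 & 0 & 0 & 0 & 0 \\
E & 0 & 0 & 0 & 1 & 0 & 1 & 0 & -1 & 0 \\
\mathrm{byp} & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\
\mathrm{cof} & 0 & 0 & -1 & 1 & -1 & 0 & 0 & 0 & 0
\end{array}
\]
For every metabolite in the system we get the following equation:
\[
\frac{d[X_i]}{dt} = \sum_j S_{i,j} v_j
\]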
Background
Let's look at B (the second row of S) for example:
\[
\frac{d[B]}{dt} = v_1 - 2v_2 - 2v_3
\]
Since the time constants associated with growth are much larger than those associated with each individual reaction we assume:
\[
\frac{d[B]}{dt} = v_1 - 2v_2 - 2v_3 = 0
\]
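As a minimal sketch of the balance equation, the Python snippet below evaluates d[X_i]/dt = Σ_j S_{i,j} v_j for every metabolite. The sign convention in `S` (negative for consumption) and the flux vector `v_example` are assumptions made for illustration; only the balance for B is spelled out in the slides.

```python
# Rows: A, B, C, D, E, byp, cof; columns: v1..v6, b1, b2, b3.
# Assumed sign convention: negative = consumed, positive = produced.
METABOLITES = ["A", "B", "C", "D", "E", "byp", "cof"]
S = [
    [-1,  0,  0,  0,  0,  0, 1,  0,  0],  # A
    [ 1, -2, -2,  0,  0,  0, 0,  0,  0],  # B
    [ 0,  1,  0,  0, -1, -1, 0,  0,  0],  # C
    [ 0,  0,  1, -1,  1,  0, 0,  0,  0],  # D
    [ 0,  0,  0,  1,  0,  1, 0, -1,  0],  # E
    [ 0,  1,  1,  0,  0,  0, 0,  0, -1],  # byp
    [ 0,  0, -1,  1, -1,  0, 0,  0,  0],  # cof
]


def rates(S, v):
    """Return d[X_i]/dt = sum_j S[i][j] * v[j] for every metabolite i."""
    return [sum(s * f for s, f in zip(row, v)) for row in S]


# Hypothetical flux vector (v1..v6, b1..b3), chosen so that B accumulates.
v_example = [3, 1, 0, 0, 0, 1, 3, 1, 1]
dXdt = dict(zip(METABOLITES, rates(S, v_example)))
print(dXdt["B"])  # evaluates v1 - 2*v2 - 2*v3 for this flux
```

A flux vector for which every entry of `rates(S, v)` is zero satisfies the steady-state assumption for all metabolites at once.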
Background
We get:
\[
\forall i,\quad \frac{d[X_i]}{dt} = \sum_j S_{i,j} v_j = 0 \iff S \cdot v = 0
\]
\[
\begin{pmatrix}
-1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
1 & -2 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & -1 & -1 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 1 & 0 & -1 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & -1 & 1 & -1 & 0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \\ v_6 \\ v_7 \\ v_8 \\ v_9 \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
\]
(here $v_7, v_8, v_9$ denote the exchange fluxes $b_1, b_2, b_3$)
Every solution of this set of equations is a steady state that the system can be in.
Background
Reminder:
• Such a system is called homogeneous.
• Such a system always has a solution (the zero solution).
• If it has more than one solution it has an infinite number of
solutions.
• The set of all the solutions is a vector space.
• This vector space is called the null space.
• From the rank theorem of linear algebra we know:
$\dim \mathrm{Nul}(S) = n - \mathrm{rank}(S)$ ($n$ is the number of reactions)
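The rank theorem can be checked on the example network with a short Python sketch. The helper `rank` below is a plain Gaussian elimination over exact fractions written for this illustration (it is not taken from the slides), and the signs in `S` are an assumed reconstruction.

```python
from fractions import Fraction

# Example stoichiometric matrix (rows A, B, C, D, E, byp, cof).
S = [
    [-1,  0,  0,  0,  0,  0, 1,  0,  0],
    [ 1, -2, -2,  0,  0,  0, 0,  0,  0],
    [ 0,  1,  0,  0, -1, -1, 0,  0,  0],
    [ 0,  0,  1, -1,  1,  0, 0,  0,  0],
    [ 0,  0,  0,  1,  0,  1, 0, -1,  0],
    [ 0,  1,  1,  0,  0,  0, 0,  0, -1],
    [ 0,  0, -1,  1, -1,  0, 0,  0,  0],
]


def rank(matrix):
    """Rank via Gaussian elimination with exact rational arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0  # index of the next pivot row
    for c in range(len(rows[0])):
        # Find a row at or below r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Eliminate column c from all rows below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


n = len(S[0])           # number of reactions
nullity = n - rank(S)   # dim Nul(S) by the rank theorem
print(rank(S), nullity)
```

In this matrix the cof row is the negative of the D row, which is why the rank is smaller than the number of metabolites.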
Background
Defining the null space
In order to define the null space we need to find a spanning set.
Reminder:
A spanning set for a vector space U of dimension m is a set of vectors, $K = \{k_1, \dots, k_m, \dots, k_l\} \subseteq U$, such that every vector in U can be written as a linear combination of the vectors in K.
Mathematically:
\[
\forall u \in U,\ \exists \alpha_1, \dots, \alpha_l \ \text{s.t.}\ \sum_{j=1}^{l} \alpha_j k_j = u
\]
The minimal possible size of a spanning set is m.
If the spanning set satisfies this then it is called a base and all the vectors in it are linearly independent.
Background
Defining the null space
Since $K = \{k_1, \dots, k_m\} \subseteq \mathrm{Nul}(S)$ we have $S \cdot k_i = 0$ for every i.
This implies that every member of the base is a possible steady state.
A Problem
Mathematically,
\[
\forall u \in \mathrm{Nul}(S),\ \exists \alpha_1, \dots, \alpha_m \ \text{s.t.}\ \sum_{j=1}^{m} \alpha_j k_j = u
\]
and $\alpha_1, \dots, \alpha_m$ can take negative values.
Biologically this creates a problem since each vector defines a flux which cannot be “reversed”.
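To make the problem concrete, the sketch below combines two null-space vectors with one negative coefficient. The vectors `k1` and `k2` are assumed members of a base of Nul(S) for the example matrix (with assumed signs); the coefficients are hypothetical.

```python
S = [
    [-1,  0,  0,  0,  0,  0, 1,  0,  0],  # A
    [ 1, -2, -2,  0,  0,  0, 0,  0,  0],  # B
    [ 0,  1,  0,  0, -1, -1, 0,  0,  0],  # C
    [ 0,  0,  1, -1,  1,  0, 0,  0,  0],  # D
    [ 0,  0,  0,  1,  0,  1, 0, -1,  0],  # E
    [ 0,  1,  1,  0,  0,  0, 0,  0, -1],  # byp
    [ 0,  0, -1,  1, -1,  0, 0,  0,  0],  # cof
]


def matvec(M, v):
    """Matrix-vector product M * v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Two null-space vectors, ordered v1..v6, b1, b2, b3 (assumed base members).
k1 = [2, 1, 0, 0, 0, 1, 2, 1, 1]
k2 = [2, 0, 1, 1, 0, 0, 2, 1, 1]

# Linear combination with a negative coefficient: u = 1*k1 + (-1)*k2.
alphas = [1, -1]
u = [alphas[0] * a + alphas[1] * b for a, b in zip(k1, k2)]

is_steady_state = all(x == 0 for x in matvec(S, u))
reversed_reactions = [i for i, x in enumerate(u) if x < 0]
print(u, is_steady_state, reversed_reactions)
```

The vector `u` satisfies S·u = 0, yet it asks some reactions to run backwards, which is exactly the case the slide rules out.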
Background
The solution
Notice that we are only interested in solutions where $v_i \ge 0$
for every i (since the reactions must take place in the “right”
direction).
We find a spanning set K such that every such solution can
be written as a linear combination of the vectors in K where
all the coefficients take non-negative values.
Notice that the vectors in such a set need not be linearly independent.
Background
The solution
These vectors will be called genetically independent.
Genetically independent vectors are a group of vectors in which no vector can be expressed as a linear combination of the other vectors such that all the coefficients are non-negative.
An algorithm to find a genetically independent minimum spanning set is described in Clarke’s paper “Complete set of steady states for the general stoichiometric dynamical systems” and will not be shown in the framework of this presentation.
Background
The resulting solution space takes the shape of a convex polyhedral cone.
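One way to write this cone down explicitly is sketched below. The notation is an assumption introduced here, not taken from the slides: $C$ is the cone, $p_1, \dots, p_k$ are the genetically independent spanning vectors, and $\alpha_i$ are the non-negative weights.

```latex
\[
C = \Bigl\{ v \in \mathbb{R}^n \;:\; v = \sum_{i=1}^{k} \alpha_i p_i,\ \alpha_i \ge 0 \Bigr\},
\qquad S \cdot p_i = 0 \ \text{and}\ p_i \ge 0 \ \text{for all } i
\]
```

Every $v \in C$ then satisfies both $S \cdot v = 0$ and $v \ge 0$, since both properties are preserved by non-negative combinations.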
Extreme Pathways
The genetically independent spanning set in the above example is the following (entries ordered $v_1, \dots, v_6, b_1, b_2, b_3$):
\[
\begin{pmatrix} 2 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 2 \\ 1 \\ 1 \end{pmatrix},\quad
\begin{pmatrix} 2 \\ 0 \\ 1 \\ 1 \\ 0 \\ 0 \\ 2 \\ 1 \\ 1 \end{pmatrix},\quad
\begin{pmatrix} 2 \\ 1 \\ 0 \\ 1 \\ 1 \\ 0 \\ 2 \\ 1 \\ 1 \end{pmatrix}
\]
Extreme Pathways
Notice that each such vector defines a pathway in the reaction network (entries ordered $v_1, \dots, v_6, b_1, b_2, b_3$):
\[
(2, 1, 0, 0, 0, 1, 2, 1, 1)^T,\quad
(2, 0, 1, 1, 0, 0, 2, 1, 1)^T,\quad
(2, 1, 0, 1, 1, 0, 2, 1, 1)^T
\]
Extreme Pathways
These pathways are called extreme pathways.
From the way they were calculated we know that every possible steady state flux can be expressed as a non-negative linear combination of these extreme pathways.
The extreme pathways define the topological structure of
the network.
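The non-negative combination property can be illustrated with a small Python sketch. The weights `[1, 2, 3]` and the helper names are hypothetical; the signs in `S` are an assumed reconstruction of the example matrix.

```python
S = [
    [-1,  0,  0,  0,  0,  0, 1,  0,  0],  # A
    [ 1, -2, -2,  0,  0,  0, 0,  0,  0],  # B
    [ 0,  1,  0,  0, -1, -1, 0,  0,  0],  # C
    [ 0,  0,  1, -1,  1,  0, 0,  0,  0],  # D
    [ 0,  0,  0,  1,  0,  1, 0, -1,  0],  # E
    [ 0,  1,  1,  0,  0,  0, 0,  0, -1],  # byp
    [ 0,  0, -1,  1, -1,  0, 0,  0,  0],  # cof
]

# Extreme pathways of the example, ordered v1..v6, b1, b2, b3.
EXTREME_PATHWAYS = [
    [2, 1, 0, 0, 0, 1, 2, 1, 1],  # EP1
    [2, 0, 1, 1, 0, 0, 2, 1, 1],  # EP2
    [2, 1, 0, 1, 1, 0, 2, 1, 1],  # EP3
]


def matvec(M, v):
    """Matrix-vector product M * v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def combine(weights, pathways):
    """Non-negative linear combination of pathway vectors."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    size = len(pathways[0])
    return [sum(w * p[i] for w, p in zip(weights, pathways)) for i in range(size)]


# Hypothetical weights: 1*EP1 + 2*EP2 + 3*EP3.
v = combine([1, 2, 3], EXTREME_PATHWAYS)
print(v, matvec(S, v))
```

Any choice of non-negative weights gives a flux vector that is at steady state and has no reversed reaction.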
Extreme Pathways
We now define the extreme pathway matrix:
\[
P = \begin{array}{c|ccc}
 & EP_1 & EP_2 & EP_3 \\ \hline
v_1 & 2 & 2 & 2 \\
v_2 & 1 & 0 & 1 \\
v_3 & 0 & 1 & 0 \\
v_4 & 0 & 1 & 1 \\
v_5 & 0 & 0 & 1 \\
v_6 & 1 & 0 & 0 \\
b_1 & 2 & 2 & 2 \\
b_2 & 1 & 1 & 1 \\
b_3 & 1 & 1 & 1
\end{array}
\]
$P_{i,j}$ equals the relative flux value through the $i$-th reaction in the $j$-th extreme pathway.
Extreme Pathway Length
A property of the extreme pathways which we are interested
in is the length of the extreme pathways.
These lengths can be calculated from the extreme pathway
matrix.
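First we transform $P$ to a binary matrix $\hat{P}$ by changing all the non-zero values to 1.
\[
P = \begin{array}{c|ccc}
 & EP_1 & EP_2 & EP_3 \\ \hline
v_1 & 2 & 2 & 2 \\
v_2 & 1 & 0 & 1 \\
v_3 & 0 & 1 & 0 \\
v_4 & 0 & 1 & 1 \\
v_5 & 0 & 0 & 1 \\
v_6 & 1 & 0 & 0 \\
b_1 & 2 & 2 & 2 \\
b_2 & 1 & 1 & 1 \\
b_3 & 1 & 1 & 1
\end{array}
\qquad\longrightarrow\qquad
\hat{P} = \begin{array}{c|ccc}
 & EP_1 & EP_2 & EP_3 \\ \hline
v_1 & 1 & 1 & 1 \\
v_2 & 1 & 0 & 1 \\
v_3 & 0 & 1 & 0 \\
v_4 & 0 & 1 & 1 \\
v_5 & 0 & 0 & 1 \\
v_6 & 1 & 0 & 0 \\
b_1 & 1 & 1 & 1 \\
b_2 & 1 & 1 & 1 \\
b_3 & 1 & 1 & 1
\end{array}
\]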
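The binarization step is a one-liner in code. The sketch below uses plain Python lists; the function name `binarize` is a hypothetical helper chosen for this illustration.

```python
# Extreme pathway matrix P: rows v1..v6, b1, b2, b3; columns EP1, EP2, EP3.
P = [
    [2, 2, 2],  # v1
    [1, 0, 1],  # v2
    [0, 1, 0],  # v3
    [0, 1, 1],  # v4
    [0, 0, 1],  # v5
    [1, 0, 0],  # v6
    [2, 2, 2],  # b1
    [1, 1, 1],  # b2
    [1, 1, 1],  # b3
]


def binarize(matrix):
    """Replace every non-zero entry with 1 and keep zeros as 0."""
    return [[1 if x != 0 else 0 for x in row] for row in matrix]


P_hat = binarize(P)
for row in P_hat:
    print(row)
```

Only the rows for v1 and b1 change in this example, since all other entries of P are already 0 or 1.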
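Extreme Pathway Length
We then simply multiply $\hat{P}^T$ with $\hat{P}$ (the binary form of $P$).
\[
\hat{P}^T \cdot \hat{P} = \begin{array}{c|ccc}
 & EP_1 & EP_2 & EP_3 \\ \hline
EP_1 & 6 & 4 & 5 \\
EP_2 &   & 6 & 5 \\
EP_3 &   &   & 7
\end{array}
\]
(the matrix is symmetric; only the upper triangle is shown)
The numbers in the $i,i$ position represent the length of the $i$-th extreme pathway.
The numbers in the $i,j$ position represent the shared length of the $i$-th and $j$-th extreme pathways.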
Extreme Pathway Length
Why is this true?
Let's look at the 1,3 entry for example (entries ordered $v_1, \dots, v_6, b_1, b_2, b_3$):
1st row of $\hat{P}^T$: $(1, 1, 0, 0, 0, 1, 1, 1, 1)$
3rd column of $\hat{P}$: $(1, 1, 0, 1, 1, 0, 1, 1, 1)^T$
Each term of the product is 1 only when the reaction appears in both pathways. Here these are $v_1, v_2, b_1, b_2, b_3$, so the 1,3 entry is 5.
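The pathway length matrix can be reproduced with a few lines of Python. This is a minimal sketch with plain lists; the names `columns` and `length_matrix` are illustrative choices, not notation from the slides.

```python
# Binary extreme pathway matrix: rows v1..v6, b1, b2, b3; columns EP1..EP3.
P_hat = [
    [1, 1, 1],  # v1
    [1, 0, 1],  # v2
    [0, 1, 0],  # v3
    [0, 1, 1],  # v4
    [0, 0, 1],  # v5
    [1, 0, 0],  # v6
    [1, 1, 1],  # b1
    [1, 1, 1],  # b2
    [1, 1, 1],  # b3
]

# Columns of P_hat are the rows of its transpose (one per extreme pathway).
columns = [list(col) for col in zip(*P_hat)]

# Entry (i, j) of P_hat^T * P_hat is the dot product of pathways i and j.
length_matrix = [
    [sum(a * b for a, b in zip(ci, cj)) for cj in columns]
    for ci in columns
]

lengths = [length_matrix[i][i] for i in range(len(columns))]
print(length_matrix)
print(lengths)
```

The diagonal gives the pathway lengths and each off-diagonal entry gives the number of reactions shared by two pathways.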
Extreme Pathway Length
Using this method we can calculate the extreme pathway
lengths for various organisms.
In this article the lengths of the extreme pathways
responsible for producing amino acids were calculated for:
1. Haemophilus influenzae – AKA Pfeiffer's bacillus or Bacillus influenzae.
2. Helicobacter pylori – A bacterium that infects the lining of the human stomach.
Extreme Pathway Length
Haemophilus influenzae
These distributions have more than one peak.
This implies that there are often multiple common extreme pathway lengths around which deviations are made.
Extreme Pathway Length
Helicobacter pylori
The valine and alanine histograms are almost identical except that the histogram is shifted.
It takes five extra reaction steps to make valine for shorter extreme pathways and only three extra reactions for the longer ones.
Conclusion: The number of extra reaction steps needed to create valine instead of alanine depends on the length of the pathway.
Extreme Pathway Reaction
Participation
Another property of the extreme pathways which we are
interested in is the reaction participation in the extreme
pathways.
The reaction participation of a reaction vi is the number of
extreme pathways that the reaction takes place in.
For example, the reaction participation of $v_1$ is 3: it takes place in EP1, EP2 and EP3.
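Counting reaction participation directly from the extreme pathway matrix can be sketched as follows. The dictionary name `participation` is a hypothetical choice for this illustration.

```python
REACTIONS = ["v1", "v2", "v3", "v4", "v5", "v6", "b1", "b2", "b3"]

# Extreme pathway matrix P: one row per reaction, columns EP1, EP2, EP3.
P = [
    [2, 2, 2],  # v1
    [1, 0, 1],  # v2
    [0, 1, 0],  # v3
    [0, 1, 1],  # v4
    [0, 0, 1],  # v5
    [1, 0, 0],  # v6
    [2, 2, 2],  # b1
    [1, 1, 1],  # b2
    [1, 1, 1],  # b3
]

# Participation = number of extreme pathways with a non-zero flux
# through the reaction.
participation = {
    name: sum(1 for flux in row if flux != 0)
    for name, row in zip(REACTIONS, P)
}
print(participation)
```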
Extreme Pathway Reaction
Participation
We want to calculate the reaction participation value for
each of the reactions.
Recall that $\hat{P}$ is the matrix obtained from $P$ by changing all the non-zero values to 1.
\[
\hat{P} = \begin{array}{c|ccc}
 & EP_1 & EP_2 & EP_3 \\ \hline
v_1 & 1 & 1 & 1 \\
v_2 & 1 & 0 & 1 \\
v_3 & 0 & 1 & 0 \\
v_4 & 0 & 1 & 1 \\
v_5 & 0 & 0 & 1 \\
v_6 & 1 & 0 & 0 \\
b_1 & 1 & 1 & 1 \\
b_2 & 1 & 1 & 1 \\
b_3 & 1 & 1 & 1
\end{array}
\]
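Extreme Pathway Reaction
Participation
This can be achieved by multiplying $\hat{P}$ (the binary form of $P$) with $\hat{P}^T$.
\[
\hat{P} \cdot \hat{P}^T = \begin{array}{c|ccccccccc}
 & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 & b_1 & b_2 & b_3 \\ \hline
v_1 & 3 & 2 & 1 & 2 & 1 & 1 & 3 & 3 & 3 \\
v_2 &   & 2 & 0 & 1 & 1 & 1 & 2 & 2 & 2 \\
v_3 &   &   & 1 & 1 & 0 & 0 & 1 & 1 & 1 \\
v_4 &   &   &   & 2 & 1 & 0 & 2 & 2 & 2 \\
v_5 &   &   &   &   & 1 & 0 & 1 & 1 & 1 \\
v_6 &   &   &   &   &   & 1 & 1 & 1 & 1 \\
b_1 &   &   &   &   &   &   & 3 & 3 & 3 \\
b_2 &   &   &   &   &   &   &   & 3 & 3 \\
b_3 &   &   &   &   &   &   &   &   & 3
\end{array}
\]
(the matrix is symmetric; only the upper triangle is shown)
The numbers in the $i,i$ position represent in how many extreme pathways the $i$-th reaction participates.
The numbers in the $i,j$ position represent in how many extreme pathways both reaction $i$ and reaction $j$ participate.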
Extreme Pathway Reaction
Participation
Why is this true?
Let's look at the 2,4 entry for example (entries ordered EP1, EP2, EP3):
2nd row of $\hat{P}$ (reaction $v_2$): $(1, 0, 1)$
4th column of $\hat{P}^T$ (reaction $v_4$): $(0, 1, 1)^T$
Their product is $1 \cdot 0 + 0 \cdot 1 + 1 \cdot 1 = 1$: only EP3 contains both $v_2$ and $v_4$.
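The reaction participation matrix can be computed with the same dot-product idea, this time over the rows of the binary matrix. This is a minimal sketch; the variable name `R` is an illustrative choice.

```python
REACTIONS = ["v1", "v2", "v3", "v4", "v5", "v6", "b1", "b2", "b3"]

# Binary extreme pathway matrix: one row per reaction, columns EP1..EP3.
P_hat = [
    [1, 1, 1],  # v1
    [1, 0, 1],  # v2
    [0, 1, 0],  # v3
    [0, 1, 1],  # v4
    [0, 0, 1],  # v5
    [1, 0, 0],  # v6
    [1, 1, 1],  # b1
    [1, 1, 1],  # b2
    [1, 1, 1],  # b3
]

# Entry (i, j) of P_hat * P_hat^T is the dot product of rows i and j,
# i.e. the number of extreme pathways that use both reactions.
R = [
    [sum(a * b for a, b in zip(row_i, row_j)) for row_j in P_hat]
    for row_i in P_hat
]

i, j = REACTIONS.index("v2"), REACTIONS.index("v4")
print(R[0])     # row for v1
print(R[i][j])  # the 2,4 entry
```

The diagonal of `R` repeats the per-reaction participation counts, so nothing is lost by working with this single matrix.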
Extreme Pathway Reaction
Participation
What can we learn from the extreme pathway reaction participation matrix?
Let's look at $v_1$ for example:
The $(v_1, v_1)$ entry of $\hat{P} \cdot \hat{P}^T$ is 3, so $v_1$ participates in 3 extreme pathways.
Since there are only 3 extreme pathways we know that $v_1$ participates in all the extreme pathways.
Extreme Pathway Reaction
Participation
What else can we learn from the extreme pathway reaction participation matrix?
Let's look at $v_1$ and $b_1$ for example:
$v_1$ participates in 3 extreme pathways (the $(v_1, v_1)$ entry is 3).
$b_1$ participates in 3 extreme pathways (the $(b_1, b_1)$ entry is 3).
Since there are 3 extreme pathways in which they both appear (the $(v_1, b_1)$ entry is 3) we know that one takes place iff the other takes place.
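The "iff" claim can be checked directly from the binary matrix: if the three counts agree, the set of pathways using one reaction equals the set using the other. The sketch below does this set comparison; `pathways_using` and `always_together` are hypothetical helper names.

```python
REACTIONS = ["v1", "v2", "v3", "v4", "v5", "v6", "b1", "b2", "b3"]

# Binary extreme pathway matrix: one row per reaction, columns EP1..EP3.
P_hat = [
    [1, 1, 1],  # v1
    [1, 0, 1],  # v2
    [0, 1, 0],  # v3
    [0, 1, 1],  # v4
    [0, 0, 1],  # v5
    [1, 0, 0],  # v6
    [1, 1, 1],  # b1
    [1, 1, 1],  # b2
    [1, 1, 1],  # b3
]


def pathways_using(reaction):
    """Return the 1-based indices of the extreme pathways using a reaction."""
    row = P_hat[REACTIONS.index(reaction)]
    return {j + 1 for j, used in enumerate(row) if used}


def always_together(r1, r2):
    """True if both reactions appear in exactly the same extreme pathways."""
    return pathways_using(r1) == pathways_using(r2)


print(pathways_using("v1"), pathways_using("b1"), always_together("v1", "b1"))
print(pathways_using("v2"), pathways_using("v4"), always_together("v2", "v4"))
```

The pair v2, v4 shows why all three counts are needed: both reactions participate in two pathways, but they share only one.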
Extreme Pathway Reaction
Participation
The reactions in region 1 participate in all of the extreme pathways.
Conclusion: all the reactions in region 1 either participate or not together.
Extreme Pathway Reaction
Participation
This information is of value. If we know all the reactions that must occur together we can control (or completely prevent) a reaction by affecting a different reaction.
As we just saw, in some cases this information is easily seen in the matrix.
We will now describe an algorithm which, based on the reaction participation matrix, will find all the reactions that must occur (or not occur) together.
Extreme Pathway Reaction
Participation
$R_{PM}$ is the reaction participation matrix.
The algorithm:
1. Each reaction is in a group of its own: $K = \{\{v_1\}, \dots, \{v_n\}\}$.
2. While $K$ changes:
   I. Check if there exist $i, j$ such that $(R_{PM})_{i,i} = (R_{PM})_{j,j} = (R_{PM})_{i,j}$.
   II. For every such couple merge the two groups.
Naïve implementation: $O(n^3)$ iterations.
The implementation can be improved to work in $O(n^2)$ iterations by choosing the pairs on which we perform the test more carefully.
Can we make the algorithm work faster?
Answer: Who cares?
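The grouping step can be sketched in Python as below. This is not the loop from the slide verbatim: it swaps the "while K changes" iteration for a single pass over all pairs with a union-find structure, which is one way to reach the O(n²) bound. The function name `correlated_groups` is hypothetical.

```python
REACTIONS = ["v1", "v2", "v3", "v4", "v5", "v6", "b1", "b2", "b3"]

# Binary extreme pathway matrix: one row per reaction, columns EP1..EP3.
P_hat = [
    [1, 1, 1], [1, 0, 1], [0, 1, 0], [0, 1, 1], [0, 0, 1],
    [1, 0, 0], [1, 1, 1], [1, 1, 1], [1, 1, 1],
]

# Reaction participation matrix R_PM = P_hat * P_hat^T.
R_PM = [
    [sum(a * b for a, b in zip(row_i, row_j)) for row_j in P_hat]
    for row_i in P_hat
]


def correlated_groups(R, names):
    """Group reactions i, j for which R[i][i] == R[j][j] == R[i][j]."""
    n = len(names)
    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        # Follow parent links to the group representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if R[i][i] == R[j][j] == R[i][j]:
                parent[find(j)] = find(i)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(names[i])
    # Order groups by the position of their first reaction.
    return sorted(groups.values(), key=lambda g: names.index(g[0]))


K = correlated_groups(R_PM, REACTIONS)
print(K)
```

A single pass is enough here because the test describes an equivalence: two reactions pass it exactly when they appear in the same set of extreme pathways.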
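Extreme Pathway Reaction
Participation
Example:
\[
R_{PM} = \begin{array}{c|ccccccccc}
 & v_1 & v_2 & v_3 & v_4 & v_5 & v_6 & b_1 & b_2 & b_3 \\ \hline
v_1 & 3 & 2 & 1 & 2 & 1 & 1 & 3 & 3 & 3 \\
v_2 &   & 2 & 0 & 1 & 1 & 1 & 2 & 2 & 2 \\
v_3 &   &   & 1 & 1 & 0 & 0 & 1 & 1 & 1 \\
v_4 &   &   &   & 2 & 1 & 0 & 2 & 2 & 2 \\
v_5 &   &   &   &   & 1 & 0 & 1 & 1 & 1 \\
v_6 &   &   &   &   &   & 1 & 1 & 1 & 1 \\
b_1 &   &   &   &   &   &   & 3 & 3 & 3 \\
b_2 &   &   &   &   &   &   &   & 3 & 3 \\
b_3 &   &   &   &   &   &   &   &   & 3
\end{array}
\]
$K = \{v_1\}, \{v_2\}, \{v_3\}, \{v_4\}, \{v_5\}, \{v_6\}, \{b_1\}, \{b_2\}, \{b_3\}$
$K = \{v_1, b_1\}, \{v_2\}, \{v_3\}, \{v_4\}, \{v_5\}, \{v_6\}, \{b_2\}, \{b_3\}$
$K = \{v_1, b_1, b_2\}, \{v_2\}, \{v_3\}, \{v_4\}, \{v_5\}, \{v_6\}, \{b_3\}$
$K = \{v_1, b_1, b_2, b_3\}, \{v_2\}, \{v_3\}, \{v_4\}, \{v_5\}, \{v_6\}$
We could make a small optimization.
When we reach a reaction that was already joined with another reaction we need not check it.
This however doesn’t change the worst case complexity.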