A Network of Cellular Automata
for the simulation of the Immune System
Castiglione F.†, Mannella G.‡, Motta S.‡, and Nicosia G.‡
† Center for Parallel Computing, University of Cologne
Weyertal 80, D-50931 Köln, Germany
‡ Department of Mathematics, University of Catania
V.le A. Doria 6-I, 95126, Catania, Italy
{mannella,nicosia}@dmi.unict.it
Wed Feb 24 10:30:18 MET 1999
Abstract
We propose a new computational scheme for the Celada-Seiden model
(CSmodel) for the simulation of the Immune System response. This new
approach is closer to the classical definition of cellular automata than the
previous one. Its advantage is that it achieves the same results as the
existing implementation of the CSmodel (IMMSIM) with a lower computational effort.
keywords: Immune System, CSmodel , Stochastic Cellular Automata, Lattice Gas Automata, computational complexity.
1 Introduction
The Immune System (IS) is a complex system of cells and molecules, distributed
throughout our body, that provides us with a basic defense against pathogenic
organisms. It has two different responses to fight antigenic challenge: the humoral
response, mediated by antibodies, and the cellular response, mediated by cells.
Like the nervous system, the IS performs pattern recognition tasks, and retains
a memory of the antigens that it has fought. The IS contains more than $10^7$
different cell clones that communicate via cell-cell contact and secretion of
molecules.
During the last decade we have witnessed the proliferation of studies in the field
of computational biology that attempt to use methods and concepts from
physics to handle the great complexity of the IS [1, 2]. In this respect the
models proposed so far can be classified into two basic approaches: equation-based
and automata-like. While the first models the population dynamics of the
main cells involved in the immune response using ordinary or partial differential
equations, the latter goes down to the description of the interactions among
the cells and the molecules, mimicking the cell behavior so that the emergent
population dynamics is the result of a large number of cooperating entities.
The power of the first approach is its ability to give a clear answer to the
question addressed, but it is limited by the great complexity of the overall picture.
Moreover, immunology is far from being a well understood field;
much is unclear or even not known at all. The methods of statistical
physics, and the use of cellular automata (CA) in particular [3], seem best suited to
handle both the complexity and the uncertainty of the IS, because it is possible to break down
the relations among the entities into a set of elementary rules that can be either
simulated on computers or easily replaced to validate new hypotheses against
the results obtained. In a certain sense both in machina methods are valid to
study the immune system as long as the results match real in vivo or in vitro
experiments.
One of the richest models of the immune system is the CSmodel [4, 5].
It has been shown to be able to reproduce real phenomena of the immune
response. The model has a computational counterpart called IMMSIM and a
more recent implementation in the C language [6], CIMMSIM (in the following we will
make a clear distinction between the model and the algorithm that implements
it).
At variance with IMMSIM, which is based on a lattice gas approach (see §2),
our implementation of the CSmodel is based on a network of cellular automata.
It has proven to produce results that are well in line with the ones obtained
by previous computational schemes, while requiring less computational effort.
The paper is organized as follows: in §2 we recall the CSmodel and
show that its computational counterpart, IMMSIM, can be described within the larger
class of Integer Lattice Gas Automata (ILGA). In §3 the new computational
model, ARIMMSY (ARtificial IMMune SYstem), is introduced and described in
detail. In §4 we estimate the complexity of both schemes. The numerical
equivalence has been confirmed by means of extensive computer simulations; in
§5 we show the immunization process, to be compared with previous results;
finally, in §6 we draw some conclusions and comment on future work.
2 The CSmodel and IMMSIM
In this section we briefly review the CSmodel without going into the details,
referring the reader to the bibliography for further reference [4, 5]. The model
includes the following entities: antigens (Ag), lymphocyte B cells (B), plasma
B cells (PLB), antigen processing cells (APC), lymphocyte T-helper cells (T),
immune-complexes (IC), and antibodies (Ab). The Ag represent generic foreign
substances, such as parasites, bacteria or viruses, potentially dangerous to the
host organism, that must be eliminated by the system response. B, PLB and
T cells represent lymphocytes with different functions (e.g. the plasma cells
produce antibodies); AP cells (APC) represent the wide class of macrophages;
and Ab are Ag-specific soluble molecules secreted by plasma cells that eliminate
the antigens by labeling them as IC (Ab-Ag ties), which are subsequently eliminated by
the macrophages. The CSmodel defines precise interaction rules of two types:
specific interactions, which occur when some entities bind each other through
receptors (i.e. the bond is determined by their matching), and non-specific
interactions, which occur when two entities interact without any recognition process
(e.g. this is the case of APC catching small entities such as Ag or IC). To avoid the
great complexity of the chemical interactions between receptors, the bindings
are mimicked as stochastic events. The probability that two receptors interact
is determined by a bit-to-bit matching over the bit strings representing them. In
this respect the model is also considered to belong to the class of bit string
models in immunology [7], recently generalized in [8].
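The bit-to-bit matching can be illustrated with a minimal Python sketch. Two points are assumptions of ours and are not fixed by the text: that a match at a given position means complementary bits (counted here with an exclusive OR), and the linear shape of the hypothetical function binding_probability. The only property taken from the model is that the probability is zero below a minimum match threshold (called mc in §4).

```python
import random


def match_bits(receptor_a: int, receptor_b: int, l: int) -> int:
    """Count the positions at which two l-bit receptors are complementary.

    Assumption: a 'match' means complementary bits, hence the XOR.
    """
    mask = (1 << l) - 1
    return bin((receptor_a ^ receptor_b) & mask).count("1")


def binding_probability(m: int, l: int, mc: int) -> float:
    """Hypothetical affinity function: zero below the threshold mc,
    then rising linearly up to 1 for a perfect match (m == l)."""
    if m < mc:
        return 0.0
    return (m - mc + 1) / (l - mc + 1)


def try_bind(receptor_a: int, receptor_b: int, l: int, mc: int,
             rng: random.Random) -> bool:
    """Treat the binding as a stochastic event driven by the match."""
    m = match_bits(receptor_a, receptor_b, l)
    return rng.random() < binding_probability(m, l, mc)
```

With l = 12 and mc = 10, two perfectly complementary strings always bind, while any pair with fewer than 10 matching positions never does.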
Every cell can be found in one of the allowed states (e.g. a B cell can be Active,
Internalized, Exposing, Stimulated or Memory according to whether it has bound
an antigen or not, whether it exposes the MHC/peptide complex, whether it
duplicates or whether it is considered a memory B cell) and successful interactions
between two entities produce a cell-state change. Along this line the immune response
is realized as simultaneous multiple interactions and state changes, leading to the dominance
(in population) of those clones of cells that specifically recognize the antigen.
The computational counterpart IMMSIM (or CIMMSIM) has an underlying regular,
hexagonal, two-dimensional lattice (six links at 60°). Each site incorporates
many entities. These interact within the site and diffuse to adjacent sites according
to a random choice. It is important to note that no interactions occur between
entities on different sites. Every site of the automaton includes a large number
of bit strings accounting for the definition of the various entities and their states
(i.e. both receptors and cell states). By contrast, classical cellular automata
[9] are characterized by having only one entity for each site, interacting with the
neighborhood (each site is updated in discrete time steps according to the value
of its neighborhood).
Thus IMMSIM cannot be clearly identified as a classical cellular automaton;
as a matter of fact the IMMSIM structure is more naturally identified as a Lattice
Gas Automaton.
Lattice Gas Automata. LGA are a class of dynamical systems [10] in which
particles move on a lattice in discrete time steps. They have been developed
to study a wide variety of physical systems [11] and capture many features of
statistical mechanical systems out of equilibrium, such as fluids [12]. In an LGA
the sites are connected by links carrying a bounded number of discrete particles.
Many particles are included in a site and they interact and diffuse in one step
according to a fixed set of deterministic rules. The state of each site is usually
represented by a few bits.
If we denote by $n$ the number of directions and by $L$ the number of bits per
direction, for a total of $nL$ bits per site, then the number of particles moving
along any lattice direction can range from $0$ to $2^{L} - 1$. The total number of
states per site is then $2^{nL}$. Computationally, this means that the state of each
site is described by an integer in $\mathbb{N} \cap [0, 2^{nL} - 1]$, hence the name Integer
Lattice Gas Automata [13] (ILGA).
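As an illustration of this integer encoding, the following sketch packs the particle counts of the n directions of a site into a single integer, using L bits per direction, and unpacks them again. The function names and the bit layout (direction k occupies bits kL to kL + L - 1) are our own choices, made only for the example.

```python
def pack_site(counts, L):
    """Pack per-direction particle counts into one integer site state.

    Each count must fit in L bits, i.e. lie in 0 .. 2**L - 1.
    """
    state = 0
    for k, c in enumerate(counts):
        if not 0 <= c < (1 << L):
            raise ValueError(f"count {c} does not fit in {L} bits")
        state |= c << (k * L)
    return state


def unpack_site(state, n, L):
    """Recover the n per-direction counts from an integer site state."""
    mask = (1 << L) - 1
    return [(state >> (k * L)) & mask for k in range(n)]
```

For a hexagonal lattice (n = 6) with L = 2 bits per direction, the site state is an integer between 0 and 2**12 - 1 = 4095.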
ILGA is the mathematical structure formally closest to the implementations
of the CSmodel developed so far (IMMSIM, CIMMSIM). However, it is important
to point out a distinction: in LGA (used for example in fluid dynamics) the
collisions among particles conserve mass and momentum; the existence of these
conserved quantities is what gives rise to fluid-like behavior at the macroscopic
level. In IMMSIM (and in general in any biologically oriented automaton) this
cannot hold, because the aging phase (see below) and the interaction phase
clearly alter the population of the various entities (e.g. during the interaction
the Ab, IC and Ag may be phagocytosed by other cells). Even though stochastic
rules have been introduced in the framework of CA and LGA [14], IMMSIM and
CIMMSIM are probably among the few ILGA that implement stochastic rules.
Both IMMSIM and CIMMSIM perform the following subtasks to complete a
step of the simulation:
interaction: the local interaction rules among cell and molecular
    entities change the state of each site of the body;
diffusion: entities move to adjacent sites;
aging: accounts for both natural death and cell production from
    the bone marrow, as well as clone expansion and molecular
    secretion.
3 The new approach
The key point in constructing the new approach to implement the CSmodel is to
think of a single site of the ILGA as a distinct Nondeterministic Cellular
Automaton (NCA); a similar approach was used starting in [15] as a generalization
of the Weisbuch-Atlan model for the immune response.
Therefore we define each site as a rectangular NCA called a CA-node. The result
is a decoupling of the three phases just mentioned over two layers (fig. 1): the
upper layer constitutes the body and is a regular NCA, while the lower level is the
microstructure where the interactions take place (in the following we will call
CA-node a single site of the body and point a single site of the microstructure).
The structure of a CA-node is a hybrid between an NCA (as far as
the diffusion phase is concerned) and an ILGA (as far as the interaction is concerned),
but the number of cells and molecules allocated in each point is limited by a
certain factor whereas, in the previous scheme, the only limitation to the molecular
growth is due to the machine numerical representation. This limitation on the
number of entities that can be allocated in each point of the CA-node turns out to be
crucial in limiting the computational complexity.
The interaction phase is performed at the lower level, while the diffusion
phase affects both levels. The aging phase affects every single entity in the
simulation regardless of its position on the grid, so no distinction between
levels is needed. The following diagram together with figure
1 should clarify the structure and the functions of the two levels.
Level   Formal Description   Dimension   Site-Names   Function
upper   NCA                  r × s       CA-nodes     diffusion
lower   hybrid NCA/ILGA      p × q       points       interact./diff.
The hybrid approach just mentioned for the CA-nodes corresponds to having in
each point of the CA-node only one cell (note that here the word cell refers to
the concrete biological cell) and at most a certain number $n_{max}$ of molecules.
Given this restriction it is easy to estimate the physical dimension
of the body of the simulation: a lymphocyte in mammals has a diameter
of approximately 15 µm, bacteria ≃ 1 to 10 µm and viruses from about
0.02 µm to some µm (molecules such as Ab are much smaller). Hence, a reasonable
estimate of the dimension of a single point of a CA-node is from $(15 + 0.02\,n_{max})$
to $(15 + 10\,n_{max})$ µm. Given that a usual value of $n_{max}$ is 24, each point of a
30 × 30 CA-node body represents a physical dimension from about 15.48 to 255 µm.
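The arithmetic behind this estimate can be checked with a few lines of Python. The constant and function names are ours; the numerical values are the ones quoted in the text (one cell of about 15 µm plus n_max molecules of 0.02 to 10 µm each).

```python
CELL_DIAMETER_UM = 15.0   # lymphocyte diameter quoted in the text
MOLECULE_MIN_UM = 0.02    # smallest entity considered (small virus)
MOLECULE_MAX_UM = 10.0    # largest entity considered (large bacterium)


def point_size_range_um(n_max):
    """Lower and upper estimate (in micrometres) of the size of one point
    holding one cell and n_max molecules."""
    low = CELL_DIAMETER_UM + MOLECULE_MIN_UM * n_max
    high = CELL_DIAMETER_UM + MOLECULE_MAX_UM * n_max
    return low, high
```

For n_max = 24 this gives 15 + 0.02 · 24 = 15.48 µm and 15 + 10 · 24 = 255 µm.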
We shall now describe the aforementioned procedures in more detail.
The interaction phase. The interaction phase occurs at the lower level (i.e.
there are no interactions between CA-nodes). At variance with IMMSIM, in a
single step every entity can perform only one interaction, and the interactions
occur in two phases. In the first phase, called inner interaction, the
single cell (either B, PLB, T or APC) allocated in a point of the CA-node
interacts with the molecules (Ab, IC or Ag) allocated in the same point and
then these molecules interact among themselves. In the second phase, called
external interaction, the cell and the molecules in the point that have not changed
state through interactions in the inner step have a chance to interact with cells
and molecules allocated in an extended R-Moore neighborhood: if we call
the points-space $\mathcal{L} = \{(i, j) : i, j \in \mathbb{N},\ 0 \le i < p,\ 0 \le j < q\}$ for certain
$p, q \in \mathbb{N}$, then the R-Moore neighborhood of the point $(i, j)$ is defined as
$N_{i,j} = \{(u, v) \in \mathcal{L} : |u - i| \le R \ \text{and}\ |v - j| \le R\}$.
It is worth noting that if we choose R so as to include all the points of a CA-node
in the neighborhood of a single point, then the automaton degenerates into
a lattice gas again.
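The definition of the R-Moore neighborhood translates directly into code. The sketch below is a straightforward transcription of the set N_{i,j}, clipped at the borders of a p × q CA-node; the function name is ours. Note that, as in the definition, the point (i, j) itself belongs to its own neighborhood.

```python
def r_moore_neighborhood(i, j, p, q, R):
    """Return the points (u, v) of a p x q CA-node with
    |u - i| <= R and |v - j| <= R (the point (i, j) included)."""
    return [(u, v)
            for u in range(max(0, i - R), min(p, i + R + 1))
            for v in range(max(0, j - R), min(q, j + R + 1))]
```

An interior point has (2R + 1)^2 neighbors, fewer near the border; choosing R at least as large as the CA-node makes the neighborhood coincide with the whole CA-node, which is the degenerate lattice gas case mentioned above.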
Diffusion phase. At the end of the interaction phase entities are allowed to
diffuse in the lattice according to the following rules. Each cell and molecule
allocated in an inner point of a CA-node (see fig. 1) tries to move into
one of the neighboring points according to a uniform random choice. The
diffusion process must satisfy the constraint of “$n_{max}$ molecules and one cell”
per point; therefore, if an entity cannot move, we try to diffuse it over an
extended 2-Moore neighborhood. Finally, entities that fail to be allocated in
a new point do not diffuse at all. For the boundary points of a CA-node we
operate as follows: if the final destination is inside the CA-node then we follow
the above procedure, otherwise the destination is randomly chosen among
the adjacent CA-nodes of the upper level; again, if this does not succeed, the
entity does not move.
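The diffusion rule for a single molecule inside one CA-node can be sketched as follows. This is one possible reading of the rule, not the ARIMMSY code: the data layout (a matrix of molecule counts per point) and the function name are hypothetical, cells are ignored, and the transfer of entities from boundary points to adjacent CA-nodes is left out.

```python
import random


def try_diffuse_molecule(occupancy, i, j, n_max, rng):
    """Try to move one molecule away from point (i, j).

    occupancy[u][v] is the number of molecules held by point (u, v).
    A destination is drawn uniformly from the 1-Moore neighbors; if it
    is full, one more attempt is made over the 2-Moore neighbors.
    Returns the new position, or (i, j) if the molecule stays put.
    """
    p, q = len(occupancy), len(occupancy[0])
    for R in (1, 2):
        neighbours = [(u, v)
                      for u in range(max(0, i - R), min(p, i + R + 1))
                      for v in range(max(0, j - R), min(q, j + R + 1))
                      if (u, v) != (i, j)]
        if not neighbours:          # 1 x 1 CA-node: nowhere to go
            break
        u, v = rng.choice(neighbours)
        if occupancy[u][v] < n_max:  # capacity constraint of the point
            occupancy[i][j] -= 1
            occupancy[u][v] += 1
            return (u, v)
    return (i, j)
```

The capacity check is what distinguishes this scheme from the lattice gas, where the number of entities per site is unbounded.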
Aging. At any time step some cells may die. This process is stochastic, since
the probability of deleting an entity is governed by a negative exponential law
with parameter τ (mean life, different for each entity). The production of new
cells from the bone marrow is set so as to guarantee, in the absence
of any antigenic stimulation, a steady state of the overall population.
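A minimal sketch of the death part of the aging phase is given below. The exact form of the law is an assumption: we take the survival probability over one time step to be exp(-1/τ), so that τ plays the role of the mean life measured in time steps. If τ were meant as a half-life, the exponent would be -ln(2)/τ instead. Function names are ours.

```python
import math
import random


def death_probability(tau, dt=1.0):
    """Probability that an entity with mean life tau dies during a step
    of length dt, assuming exponentially distributed lifetimes."""
    return 1.0 - math.exp(-dt / tau)


def age_population(entities, tau, rng):
    """Return the entities that survive one aging step."""
    p_death = death_probability(tau)
    return [e for e in entities if rng.random() >= p_death]
```

Entities with a longer mean life have a smaller death probability per step, and each entity type would use its own value of τ.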
4 Complexity comparison
We shall now compare the complexity of the two implementations, focusing on
the interactions as the most time-consuming procedures. In any site of the
lattice of IMMSIM or CIMMSIM we may find any number of cells and molecules.
This leads to a worst-case number of possible interactions given by the product
of the numbers of interacting entities. The number of possible clones
(i.e. cells with different receptors) that can be stimulated to proliferate during
the immune response depends on a parameter called $m_c$, which is the minimum
receptor-receptor match required to have a nonzero probability for the entities
to interact. The proliferation of a certain clone of cells is a consequence of a
stimulation by interaction with the Ag, thus the number of potentially
proliferating clones, once we challenge the system with a single antigen, is
$\sum_{m=m_c}^{l} \binom{l}{m}$.
This leads to a computational complexity of the interaction procedure, with respect to
the whole body composed of $D$ sites, of

$$O\!\left(D \cdot \binom{l}{m_c}\right) \simeq O\!\left(\frac{D \cdot l^{\,l+\frac{1}{2}}}{m_c^{\,m_c+\frac{1}{2}}\,(l-m_c)^{\,l-m_c+\frac{1}{2}}}\right) \qquad (1)$$

where we used the Stirling approximation for large l and mc . In practice this
complexity is exponential in l for 8 < l < 32, l − mc > 1 and mc ≥ l/2, which
are reasonable values for our simulations.
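The clone count and the Stirling estimate can be evaluated numerically with the short sketch below. The function names are ours. The approximation keeps the constant factor 1/sqrt(2π) of Stirling's formula, which is dropped inside the O(·) of eq. 1.

```python
from math import comb, pi, sqrt


def proliferating_clones(l, mc):
    """Number of receptors matching a given antigen in at least mc of l
    bits: the sum over m = mc .. l of the binomial coefficient C(l, m)."""
    return sum(comb(l, m) for m in range(mc, l + 1))


def stirling_binomial(l, m):
    """Stirling estimate of C(l, m), valid for 0 < m < l."""
    if not 0 < m < l:
        raise ValueError("the estimate requires 0 < m < l")
    return l ** (l + 0.5) / (
        sqrt(2 * pi) * m ** (m + 0.5) * (l - m) ** (l - m + 0.5))
```

For l = 12 and mc = 10 (the values of table 1) the sum is 66 + 12 + 1 = 79, and the Stirling estimate of the leading term C(12, 10) = 66 is accurate to within a few percent.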
Let us now estimate the computational time complexity of the new scheme. Let
$D$ be the dimension of the body, $d = p \times q$ the dimension of the CA-nodes, $l$ the
length of the bit strings, and $n_{max}$ the maximum number of molecules allowed
in each point of the CA-node. In ARIMMSY the interactions occur only inside
each CA-node; this is where the limitation on the number of entities in the
CA-nodes comes into play, reducing the complexity of the model to polynomial.
The cardinality of an R-Moore neighborhood is $N_R = (2R + 1)^2$, hence for a
point of the CA-node the number of interactions among entities allocated in the
aforementioned neighborhood is at most $N_R\, n_{max}^2$; for the whole CA-node it is
$O(d\, N_R\, n_{max}^2)$ and, finally, the complexity of the interaction procedure for the
whole lattice is

$$O(D\, d\, N_R\, n_{max}^2) \qquad (2)$$
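The bound of eq. 2 is easy to evaluate. In the sketch below the function names are ours and the argument values in the usage note are purely illustrative; the point to observe is that the bit string length l does not appear among the arguments.

```python
def moore_cardinality(R):
    """Number of points in an R-Moore neighborhood: (2R + 1)^2."""
    return (2 * R + 1) ** 2


def interaction_bound(D, d, R, n_max):
    """Upper bound D * d * N_R * n_max^2 on the number of interactions
    per time step over the whole lattice (eq. 2, without constants)."""
    return D * d * moore_cardinality(R) * n_max ** 2
```

For example, interaction_bound(16 * 32, 30 * 30, 1, 24) evaluates the bound for a 16 × 32 body of CA-nodes with 30 × 30 points each, R = 1 and n_max = 24; doubling n_max multiplies the bound by four, while changing l leaves it unchanged.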
Comparing eq. 1, eq. 2 and table 2, we notice that ARIMMSY performs better for
larger values of $l$ because its complexity is independent of the bit string
length.
5 Results
To compare the results obtained with the new approach ARIMMSY, we performed
three sets of simulations with different bit string lengths, respectively 8,
10 and 12. The 8-bit-string input parameter set is referred to as the Standard
Parameter Set [4, 6]. It was used as the reference for the development of
CIMMSIM, in particular to calibrate the system to exhibit the primary and
secondary immune response, obtaining, apart from statistical fluctuations due to
the stochastic nature of the simulations, the same population dynamics for all
the entities [6].
In table 1 a subset of the parameters for a 12-bit system is reported.
We have used these parameters to produce the results shown below. The main
parameter values are derived from the Standard Set by scaling the system from 8- to
12-bit string length. Figure 2 shows the total population of B cells according to
their match with the injected antigen; the two plots show the results obtained with
the new and the old approach respectively. Figure 3 shows plots for antigens,
B and T memory cells and antibodies obtained with the new approach.
In agreement with the estimate made in §4, table 2 shows a clear advantage of
ARIMMSY for longer bit string systems compared with CIMMSIM (version 2).
6 Conclusions
In the present paper we presented a new computational scheme for the CSmodel
of the immune system. The driving idea was to build a classic (nondeterministic)
CA. Cellular automata are much simpler formal structures than the lattice
gas of IMMSIM. A formalization and a theoretical treatment of such a scheme can
therefore be achieved. Our goal was partially fulfilled, leading to a network of NCA able to
reproduce the systematic behavior of a previous implementation of the CSmodel
when given the same input parameters. Previous implementations of the
model, CIMMSIM and IMMSIM, are basically LGAs and have been shown to be
expensive in terms of computing requirements. The new approach, instead, has
proven to give results well in line with the previous ones with a lower computational
effort. The present result also indicates that the CSmodel is a robust
model, performing independently of the computational approach. As further
work we plan to implement ARIMMSY on parallel platforms using MPI libraries
in order to add to the model increasing biological complexity (e.g. mutation
processes, cell-mediated response) or spatial effects (e.g. modeling lymphatic
channels connecting primary and secondary lymphoid organs).
Acknowledgments
F. Celada is kindly acknowledged for many valuable discussions.
This work has been partially supported with a grant from the Italian Universities
Consortium Computing Center (CINECA).
References
[1] Perelson A.S.(ed.), Theoretical Immunology, Part One and Two, SFI Studies in the Sciences of Complexity, Proceedings Volumes Vol. II and III,
Addison Wesley, NY, 1988.
[2] Perelson A. and Weisbuch G., Reviews of Modern Physics, 51, 576, 1997.
[3] Stauffer D. and Pandey R., Computers in Physics, 6(4), 404, 1992.
[4] Seiden P.E. and Celada F., J. Theor. Biol., 158, 329, 1992.
[5] Celada F. and Seiden P.E., Immunology Today, 13, 56, 1992.
[6] Succi S., Castiglione F., and Bernaschi M., Phys. Rev. Lett., 79, 4493, 1997.
[7] Farmer, J.D., Packard, N.H. & Perelson, A.S., Physica D, 22, 187, 1986.
[8] Zorzenon Dos Santos R. M. & Bernardes A. T., Phys. Rev. Lett., 81, 3034,
1998.
[9] Wolfram S., Cellular Automata and Complexity, Addison Wesley, NY, 1994.
[10] Boghosian B.M., and Taylor W., Phys. Rev. E, 52, 510, 1995.
[11] Manneville P. et al., (eds.) Springer Verlag Series in Physics, 46, 1989.
[12] Frisch U., Hasslacher B., Pomeau Y., Phys. Rev. Lett., 56, 1505, 1986.
[13] Boghosian B.M., Yepez J., Alexander F.J., and Margolus N.H., Phys. Rev.
E, 55, 4137, 1997.
[14] Gutowitz H. (ed.), Physica D Nonlinear Phenomena, 45, 136, 1990.
[15] Dayan I., et al., J. Phys. A, 21, 2473, 1988.
[Figure 1 schematic: the upper level is the body, a grid of r × s CA-nodes; the
lower level is a single CA-node, a grid of p × q points, in which inner points,
boundary points and neighbourhood points are marked.]
Figure 1: Decomposition of the automaton in two layers; the body in which the
diffusion phase is accomplished and the CA-node where the interactions take
place.
Standard Parameter Set
Parameter     Value      Meaning
l             12         bit string length
D             16 × 32    body size
m_c           10         minimum match to bind
t_inj Ag      (0, 120)   injection time
# Ag          1000       Ag injected
# (B=T=APC)   5000       initial cell population
num steps     200        number of steps
Table 1: A subset of the Standard Parameter Set
Parameter set   l    D         CIMMSIM 2   ARIMMSY 1
dat8            8    16 × 16   101         196
dat10           10   28 × 28   416         403
dat12           12   16 × 32   1543        1107
Table 2: Execution time comparison (seconds) on a 120 MHz CISC processor
with 32 MB of memory. ARIMMSY performs better than CIMMSIM on systems with
longer bit strings. Note that changing the bit string length l
requires other parameter values to be scaled accordingly, to allow some
quantities (for example the concentration of clones of antigen-reactive cells per
site) to remain constant.
[Figure 2 plots: two panels showing the B cell population for each affinity class
(mismatch 0, 1 and 2) versus time steps (0 to 200).]
Figure 2: The first plot (top) refers to the results obtained with the new
computational model ARIMMSY for the B cell population in a 12-bit system. The
second refers to the corresponding results obtained with CIMMSIM. Since $l - m_c = 2$,
we show mismatch zero, corresponding to the perfect match, and one- and
two-bit mismatches. The system shows the immunization process after the first
injection of antigens at time step zero, followed by a proliferation of memory
clones that trigger a sudden second response at the second injection (time step
120). Due to the stochastic nature of the model the results show quantitative
but not qualitative differences.
[Figure 3 plots, horizontal axes from 0 to 200: (a) memory cells, B and T;
(b) antibodies by mismatch 0, 1 and 2; (c) Ag.]
Figure 3: The system develops the humoral immune response in correspondence
with the injection of the Ag (c). The development of the memory cells (a) allows
the system to build up a second, sharper response against the Ag. The depletion
time for the second injection is shorter than for the first, demonstrating the
efficiency of the system in achieving the learning task. Panel (b) shows the total
antibody population per match, respectively zero-, one- and two-bit mismatch with the
antigen.