A Probabilistic Approach to Simulation of Molecular Biological

Dresden University of Technology
Department of Computer Science
An Object Oriented Simulation of Real
Occurring Molecular Biological
Processes for DNA Computing
and its Experimental Verification
T. Hinze, U. Hatnik, M. Sturm
Dresden DNA Computation Group
email: [email protected]
www: http://wwwtcs.inf.tu-dresden.de/dnacomp
T. Hinze, U. Hatnik, M. Sturm
1/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
Contents
1.
State of the Art in DNA Computing
2.
Side Effects of DNA Operations
3.
A Probabilistic Approach to DNA Computing
4.
An Object Oriented Simulation Tool
5.
Selected DNA Operations
6.
A PCR Example
7.
Conclusions
T. Hinze, U. Hatnik, M. Sturm
2/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
State of the Art in DNA Computing
vision
establish a universal biocomputer in theory and laboratory
biocomputer based on a formal model should feature by
– computational completeness (universality), reliability
– high operational speed using massive data parallelism
– high storage capacity and density, persistence of stored data
– DNA reusability, energy efficient processing without mech. wear
challenges
making DNA operations error resistent reducing side effects
bridging the gap between formal models of DNA computing and
lab-reality
our approach
T. Hinze, U. Hatnik, M. Sturm
3/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
Gap between Models and Lab-Reality
side effects of DNA operations
not controllable, unreproducible, stochastically occurring effects
of molecular biological processes used as DNA operations
can sum up in sequences of DNA operations
lead to unexpected, unprecise, unreproducible or even unusable
final results of experimental DNA computations
frequently used abstractions of formal models of DNA computing
only linear DNA used as data carrier (words of formal languages)
unrestricted approach; arbitrary ( ) number of strand copies
unique result strands detectable absolutely reliable
all DNA operations performed completely and reproducible
idea to bridge the gap
specification of DNA operations on molecular level
include side effects specified by statistical parameters into the
description of DNA operations
probabilistic approach
T. Hinze, U. Hatnik, M. Sturm
4/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
(differences
in DNA
(diff. from
lin. DNA
structure)
(differences from
perfect specification
incomplete reaction (% unprocessed strands)
of reaction)
sequence)
mutations
artifacts
loss of linear DNA strands by forming hairpins,
bulges, loops, junctions, and compositions of them
(% loss rate of tube contents)
failures in reaction procedure
classification of side effects
gel electrophoresis
affinity purification
PCR
polymerisation
labeling
digestion
ligation
union
melting
annealing
operations performed with
state of the art laboratory techniques
synthesis
Side Effects of DNA Operations
point mutation (% mutation rate)
deletion (% deletion rate, max. length of deletion)
insertion
unspecificity (% error rate, maximum difference)
supercoils
strand instabilities caused by temperature or pH
impurities by rests of reagences
undetectable low DNA concentration (min. # copies)
loss of DNA strands (% loss rate of tube contents)
: supported in simulation tool
T. Hinze, U. Hatnik, M. Sturm
in brackets: statistical parameters
5/13
: significant side effect caused by the operation
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
A Probabilistic Approach to DNA Computing
properties
multiset based, nondeterministic, restricted model
description of DNA operations on level of single nucleotides and
strand end labels
operation param., side effect param.
recently supported: synthesis, annealing, melting, union, ligation,
digestion, labeling, polymerisation, PCR, affinity purification,
gel electrophoresis; formal description by prog. language
operation control
iteration of molecular events, probability-controlled
probabilities of molecular events depend on: DNA pool, number
of strand copies, operation parameters, side effect parameters
iteration terminates iff empty list (matrix) of possible mol. events
T. Hinze, U. Hatnik, M. Sturm
6/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
A Probabilistic Approach to DNA Computing
iteration exemplified by annealing (simplified)
10 x
6x
A AGC T CCGA T GGAGC T
1.
T GA AGC T CC A T CGGA
DNA pool
10 strands
6 strands
A AGC T CCGA T GGAGC T
T GA AGC T CC A T CGGA
probability for collision:
10 strands
A AGC T CCGA T GGAGC T
0.39
probability for collision:
= 10 * 210
p
0.23
possible complete molecular reactions:
A AGC T CCGA
T
GGAGC T
possible complete molecular reactions:
A AGC T CCGA T GGAGC T
T CGAGG T AGCC T CGA A
AGGC T A CC T CGA AG T
3.
loop
probability for collision:
6 strands
T GA AGC T CC A T CGGA
0.23
2.
= 10 * 26
p
probability for collision:
= 6 * 210
p
0.15
possible complete molecular reactions:
A AGC T CCGA T GGAGC T
= 6 * 26
p
4.
possible complete molecular reactions:
no new product
1x
A AGC T CCGA T GGAGC T
AGGC T A CC T CGA AG T
AGGC T A CC T CGA AG T
number of DNA strands: p = 10 + 6 = 16
minimum nucleotide bonding rate for stable hybridized DNA double strands: 50%
molecular event: strand hybridization
1.
2.
3.
4.
9x
A AGC T CCGA T GGAGC T
5x
T GA AGC T CC A T CGGA
modified DNA pool
Create list (matrix) of molecular events and their probabilities including side effects
Select one molecular event randomly with respect to the probability distribution
Determine all possible reaction products from this molecular event and select one of them
Modify DNA pool
T. Hinze, U. Hatnik, M. Sturm
7/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
An Object Oriented Simulation Tool
main features
specification of DNA operations on the level of single nucleotides
and strand end labels using probabilistic approach
number of strand copies considered
concentrations of different
DNA strands and their influence to the behaviour in op. process
each DNA operation processed inside a virtual test tube collecting
a multiset of DNA strands, several test tubes supported
each DNA operation characterized by a set of specific
parameters and side effect parameters
arbitrary sequences of DNA operations including propagation of
side effects can be visualized and logged
Java, simulation tool requires at least Java Development Kit 2.0
other objects
object
frame
interface
simulation
algorithm model
a)
T. Hinze, U. Hatnik, M. Sturm
algorithms for
process control
tubes
sequences
b)
8/13
tubes
sequences
sequences
algorithms for
sets of DNA strands
algorithms for
sequences molecular interactions
(DNA strand, reactants)
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
Selected DNA Operations – Synthesis
operation parameters:
tube1
tube name:
nucleotide sequence (5’-3’): AGGCACTGAGGTGATTGGC
number of strand copies: 8 000
side effect parameters:
point mutation rate:
deletion rate
maximum deletion length:
0.06%
0.06%
11% of strand length
output of
test tube
contents
T. Hinze, U. Hatnik, M. Sturm
9/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
Selected DNA Operations – Digestion
operation parameters:
tube name:
recognition sequence
and restriction site:
tube2
5’ T C G C G A 3’
3’ A G C G C T 5’
side effect parameters:
rate of not executed molecular cuts: 5%
rate of star activity (unspecificity):
5%
N C
5’
wildcarded recognition sequence:
G C G N 3’
3’ N G C G C N 5’
output of
test tube
contents
T. Hinze, U. Hatnik, M. Sturm
10/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
A PCR Example (I)
1000 copies template1
Synthesis
1000 copies template2
Synthesis
8000 copies primer1
Synthesis
8000 copies primer2
Synthesis
Union
Union
7910
correct synthesized strands
(7910 copies of primer2,
7909 copies of primer1,
942 copies of template1,
941 copies of template2)
7909
Union
942
941
7
incorrect synthesized
strands carrying
point mutations and deletions
(totally 90 strands of primer2,
91 strands of primer1,
58 strands of template1,
59 strands of template2)
6
5
5
T. Hinze, U. Hatnik, M. Sturm
11/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
A PCR Example (II)
1st PCR cycle
Melting
max. length: 100bp
min. bonding rate: 50%
Annealing
Polymerisation
correct amplified double stranded
template (5965 copies)
5965
2739
primer rests
2713
unwanted strands (including amplified
strands with deletions resulting in too
short double stranded PCR fragments)
2nd PCR cycle
Melting
max. length: 100bp
min. bonding rate: 50%
Annealing
Polymerisation
6
1
2
3
4
5
2
2
1
3rd PCR cycle
Melting
max. length: 100bp
min. bonding rate: 50%
1
100
50
Annealing
lane 1: PCR negative control
lanes 2, 3, 4: PCR product (simulation screenshot)
lane 5: 50bp marker
Polymerisation
T. Hinze, U. Hatnik, M. Sturm
12/13
Simulation of Molecular Biological Processes
Dresden University of Technology
Department of Computer Science
Conclusions
results
first formal model of DNA computing considering side effects of
DNA operations using a probabilistic, restricted, and
multiset based approach
contribution to bridge the gap between experimental and
theoretical DNA computing
supports experimental setup of DNA algorithms as well as
implementations of models for DNA computation
prediction of experimental results and cost effective optimization
of error reducing and error compensating operation sequences
object oriented simulation tool based on this approach enables a
flexible, interoperable, and ergonomic model handling
further work
extension to additional side effects concerning nonlinear DNA
application to the implementation of distributed splicing systems
T. Hinze, U. Hatnik, M. Sturm
13/13
Simulation of Molecular Biological Processes