Molecular Computation and Splicing Systems

J.H.M. Dassen, 1996.
Summarized by Dongmin Kim
2002. 4.
Introduction

Molecular computation is interesting from both a theoretical and a practical viewpoint.
  - Computational models differ in which problems are tractable.
  - A Turing machine can perform the same computation as any other device (Church-Turing hypothesis).
  - However, some implementable models may be more than polynomially faster than others.
© 2002, SNU BioIntelligence Lab, http://bi.snu.ac.kr/
Advantages of Molecular Computation

Energy efficiency
Massive parallelism
  - A sequential computer is an approximation of a deterministic Turing machine.
  - A parallel computer is an approximation of a nondeterministic Turing machine.
  - From a practical perspective, molecular computation may redefine the limits of feasible computation.
Density of information storage
Limitations of Adleman’s approach

It solves combinatorial problems only.
The operations involved are very slow and highly error-prone.
Scalability to large problem instances is doubtful.
It requires external operators.
However, several universal models now exist, some approaches do not require an external operator, and less error-prone operations and probabilistic approaches are being studied.
Molecular Computation Models

Special purpose vs. universal
  - Special purpose: Adleman’s and Lipton’s approaches.
  - Universal: Beaver’s and Rothemund’s simulations of Turing machines.

In vitro vs. in vivo

The information carrier
  - DNA vs. RNA or ‘unusual’ DNA structures.

The operations
  - Instructions -> data
  - Rothemund’s Turing machine simulation treats instructions as data.
Molecular Computation Models (2)

Rewritability of the information carrier
One-pot vs. multiple phases
Error resilience
Communication
  - Between information carriers.
  - The operator works ‘blindly’.

Native or not
  - There is no model that is ‘native’ to molecular computation.
Special Purpose Models (1)

Adleman’s approach
  - A special-purpose, in vitro model; the information carrier is not rewritten, and the computation runs in multiple separate phases.
  - Limitations of the abstract model
      - It cannot break the exponential barrier: solving a 200-node instance of the directed Hamiltonian path problem (DHPP) would require an amount of DNA weighing more than the Earth (Juris Hartmanis, On the weight of computations, Bulletin of the European Association for Theoretical Computer Science, 55:136-138, 1995).
      - The output of the initialization step falls in a limited class of languages. When the self-assembly is linear, this class is that of the regular languages.
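
The weight argument can be made concrete with a back-of-envelope calculation. The sketch below is not Hartmanis's derivation. It assumes, purely for illustration, that a brute-force search needs one strand per candidate, that the number of candidates grows like 2^n, and that each node is encoded by 20 bases. The function name and parameters are hypothetical; the physical constants are approximate.

```python
# Back-of-envelope mass estimate for a brute-force DNA search.
# The growth model (2**n candidates, 20 bases per node) is an
# illustrative assumption, not a figure taken from Hartmanis's paper.

AVOGADRO = 6.022e23            # molecules per mole
NUCLEOTIDE_G_PER_MOL = 330.0   # approximate average mass of one ssDNA nucleotide
EARTH_MASS_G = 5.97e27         # approximate mass of the Earth in grams


def brute_force_dna_mass_g(n_nodes, bases_per_node=20):
    """Mass in grams if exactly one strand is needed per candidate."""
    candidates = 2 ** n_nodes                      # assumed number of candidates
    bases_per_strand = n_nodes * bases_per_node    # assumed strand length
    grams_per_strand = bases_per_strand * NUCLEOTIDE_G_PER_MOL / AVOGADRO
    return candidates * grams_per_strand


mass_200 = brute_force_dna_mass_g(200)
print(f"{mass_200:.2e} g, heavier than the Earth: {mass_200 > EARTH_MASS_G}")
```

Under these assumptions a 10-node instance needs a negligible amount of DNA, while a 200-node instance exceeds the mass of the Earth by many orders of magnitude, which is the point of the slide.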
Special Purpose Models (2)

Lipton’s model
  - Solves the SAT problem.
  - Introduces the notion of a “test tube”.
  - Suggests using a molecular computational device as a special-purpose co-processor or unit for performing exponential searches: an electronic/molecular hybrid computer.
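
The test-tube idea can be sketched in software. The code below is a minimal simulation, assuming a tube is modelled as a set of bit tuples and that the only operations are extraction by one bit and merging of tubes. The names initial_tube, extract and solve_sat are hypothetical and are not taken from Lipton's paper.

```python
from itertools import product


def initial_tube(n_vars):
    """Tube holding every assignment to n_vars variables, as tuples of 0/1."""
    return set(product((0, 1), repeat=n_vars))


def extract(tube, var, value):
    """Keep only the strands whose bit at position var equals value."""
    return {strand for strand in tube if strand[var] == value}


def solve_sat(n_vars, clauses):
    """Filter the tube clause by clause.

    Each clause is a list of literals (var, value):
    (i, 1) stands for x_i and (i, 0) for NOT x_i.
    """
    tube = initial_tube(n_vars)
    for clause in clauses:
        satisfied = set()
        for var, value in clause:
            satisfied |= extract(tube, var, value)  # merge the extracted tubes
        tube = satisfied
    return tube  # non-empty exactly when the formula is satisfiable


# (x0 OR x1) AND (NOT x0 OR NOT x1)
clauses = [[(0, 1), (1, 1)], [(0, 0), (1, 0)]]
result = solve_sat(2, clauses)
print(sorted(result))  # [(0, 1), (1, 0)]
```

The initial tube has 2^n members, so the number of laboratory steps stays linear in the size of the formula while the amount of material grows exponentially.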
Universal Models

Does a model of molecular computation exist that is capable of simulating all computations?
  - The answer seems to be ‘Yes’.
  - One answer comes from several more or less practical proposals that simulate classes of Turing machines using molecular computation.
  - The other comes from the theory of splicing systems.
Universal Models (2) - Turing machines

Basic model
  - A finite control (a transition table and a current state), a tape of potentially unlimited length holding symbols from a finite alphabet, and a read/write head.

Representation
  - ‘hardware – software’ vs. ‘constant – variable components’

Configurations
  - A configuration describes the contents of the tape, the position of the read/write head, and the state of the finite control.

Determinism vs. nondeterminism
  - A nondeterministic Turing machine is ‘faster’ than a deterministic one.
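
The notions of configuration and transition can be sketched as follows. This is a minimal deterministic simulator, assuming a configuration is a triple (state, tape, head) with the tape stored as a dictionary from positions to symbols. The example machine flip is hypothetical and only serves as an illustration.

```python
def step(config, delta, blank="_"):
    """Perform one transition of a deterministic Turing machine.

    config is (state, tape, head); tape maps positions to symbols.
    delta maps (state, symbol) to (new_state, written_symbol, move),
    where move is -1 (left) or +1 (right).
    Returns the next configuration, or None if no transition applies.
    """
    state, tape, head = config
    symbol = tape.get(head, blank)
    if (state, symbol) not in delta:
        return None  # the machine halts
    new_state, written, move = delta[(state, symbol)]
    new_tape = dict(tape)
    new_tape[head] = written
    return (new_state, new_tape, head + move)


def run(config, delta, max_steps=1000):
    """Apply step until the machine halts or max_steps is reached."""
    for _ in range(max_steps):
        nxt = step(config, delta)
        if nxt is None:
            break
        config = nxt
    return config


# Hypothetical example machine: flip every bit, halt on the first blank.
flip = {("q0", "0"): ("q0", "1", +1), ("q0", "1"): ("q0", "0", +1)}
start = ("q0", {0: "1", 1: "0", 2: "1"}, 0)
state, tape, head = run(start, flip)
final_tape = "".join(tape[i] for i in sorted(tape))
print(final_tape)  # 010
```

A molecular simulation has to encode exactly this triple (tape contents, head position, state) in a strand and realize step with laboratory operations.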
Universal Models (3)

Beaver’s model
  - Simulates a deterministic Turing machine.

A new operator: context-sensitive substitution
  - Goal: replace the substring X by Y within a strand LXR.
  - Add Y.
  - Then we have LYR; apply PCR.
  - We now have both LXR and LYR.
  - Destroy LXR with S1 nuclease.

Simulation
  - Each substitution corresponds to one transition of the TM from one configuration to the next.
  - Duplicating tubes makes it possible to simulate nondeterminism.
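
At the string level, context-sensitive substitution can be sketched as below. This models only the net effect on the strands (X becomes Y where it is flanked by L and R) and not the laboratory procedure. The function name and the example sequences are hypothetical.

```python
def substitute_in_context(strands, left, x, y, right):
    """Replace x by y only where x is flanked by left and right."""
    target = left + x + right
    replacement = left + y + right
    return [strand.replace(target, replacement) for strand in strands]


# The first strand contains GG in the context AC...CA; the second
# contains GG without that context and is left untouched.
tube = ["TACGGCAT", "TGGT"]
result = substitute_in_context(tube, left="AC", x="GG", y="TT", right="CA")
print(result)  # ['TACTTCAT', 'TGGT']
```

If L and R encode the neighbourhood of the head and the current state, one such substitution rewrites a configuration into its successor.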
Universal Models (4)

Rothemund’s model
  - The simulation is performed by implementing the single steps from one configuration to another.
  - Instead of developing a simulation of a universal Turing machine directly, Rothemund uses a small non-universal Turing machine and then suggests how to scale up to a universal TM.

Useful enzymes
  - Rothemund chose to use class IIS restriction enzymes.

Representing instantaneous descriptions
  - The contents of the tape: each symbol is assigned a sequence.
  - The position of the head: another sequence, which indicates the recognition site of a restriction enzyme and the splicing site.
  - The state of the finite control: encoded in the space between the recognition site and the current symbol.
Universal Models (5)

Transitions
  - Representing the transition table
      - The table is encoded in four types of oligonucleotides.
  - Implementing the transitions

Estimates
  - Representing one mole of bits requires about 260 m³ of water.
  - Each transition takes about 4.5 hours.

Problems
  - The scheme does not describe how to generate the initial tapes.
  - Rothemund does not explain whether his scheme is suitable for nondeterministic TMs.
  - The scheme requires many different kinds of restriction enzymes.
Universal Models (6)

Winfree’s model: simulating cellular automata
  - Blocked cellular automata
      - A one-dimensional variation of cellular automata that can be universal.
      - The transition rule is formulated for pairs of cells.
  - Simulates a universal blocked cellular automaton
      - By designing small units of DNA that self-assemble into two-dimensional complexes.
      - One direction corresponds to the state of the whole automaton, and the other shows the contents of one cell during the whole development of the automaton.
  - It is unclear how practical Winfree’s approach is, but it is conceptually much simpler than the previous ones.
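
A blocked cellular automaton can be sketched as follows, assuming the usual formulation in which the pairing of cells shifts by one position on alternate steps. The rule used here (swap the two cells of a pair) is a hypothetical toy rule, not a universal one, and the function names are illustrative.

```python
def bca_step(cells, rule, offset):
    """Apply the pair rule to cells (offset, offset+1), (offset+2, offset+3), ...

    Cells that are not covered by a complete pair are left unchanged.
    """
    new = list(cells)
    for i in range(offset, len(cells) - 1, 2):
        new[i], new[i + 1] = rule[(cells[i], cells[i + 1])]
    return new


def bca_run(cells, rule, steps):
    """Return the list of rows; row t is the automaton state after t steps."""
    history = [list(cells)]
    for t in range(steps):
        cells = bca_step(cells, rule, t % 2)  # alternate the block alignment
        history.append(cells)
    return history


# Toy rule: swap the two cells of each pair.
swap = {(a, b): (b, a) for a in (0, 1) for b in (0, 1)}
history = bca_run([1, 0, 0, 0], swap, steps=3)
for row in history:
    print(row)
```

The printed history is a two-dimensional array: each row is the state of the whole automaton at one time step, and each column traces one cell over time, which mirrors the two directions of the self-assembled DNA complex.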
Splicing Systems (1)

Abstract models for the languages generated by strands of DNA under the application of restriction enzymes and subsequent annealing and ligation.
  - Thomas Head, Formal language theory and DNA: an analysis of the generative capacity of specific recombinant behaviors, Bulletin of Mathematical Biology, 49(6):737-759, 1987.
  - If DNA-related problems are difficult to solve, then DNA-based primitives may enable solutions to difficult problems.

The splicing operator
  - In general formal language theory, strings are formed by concatenating symbols.
  - Splicing is the operation of concatenating a prefix of one string and a suffix of another string (e.g. splice(‘snack’, ‘tofu’) = ‘snafu’).
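
The unrestricted splicing operator can be sketched in a few lines. The function below enumerates every prefix/suffix combination; the name splice_all is hypothetical.

```python
def splice_all(first, second):
    """All strings made of a prefix of first followed by a suffix of second."""
    return {
        first[:i] + second[j:]
        for i in range(len(first) + 1)
        for j in range(len(second) + 1)
    }


results = splice_all("snack", "tofu")
print("snafu" in results)  # True: 'sna' + 'fu'
```

Without further restrictions the operator yields many strings for each pair, which is why its use has to be regulated by rules.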
Splicing Systems (2)

Splicing rules
  - Just as the use of concatenation is regulated by grammatical rules, the use of splicing is regulated by splicing rules.
  - A splicing rule consists of four finite strings u1, u2, u3, u4.
  - u1, u2 (u3, u4) determine the possible sites of the splicing in the first (second) string.
  - u1 and u4 are kept, but u2 and u3 are not.
  - Formally, a splicing rule is written as follows:
      r = u1 # u2 $ u3 # u4
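
Applying a splicing rule can be sketched as follows, assuming the standard reading: from x = x1 u1 u2 x2 and y = y1 u3 u4 y2 the rule produces x1 u1 u4 y2. A rule is represented as a tuple (u1, u2, u3, u4); the function name apply_rule is hypothetical.

```python
def apply_rule(rule, x, y):
    """Apply r = u1#u2$u3#u4 to the pair (x, y).

    For every split x = x1 u1 u2 x2 and y = y1 u3 u4 y2,
    the string x1 u1 u4 y2 is produced.
    """
    u1, u2, u3, u4 = rule
    out = set()
    for i in range(len(x) + 1):
        if not x.startswith(u1 + u2, i):
            continue  # no site u1 u2 at position i of x
        for j in range(len(y) + 1):
            if y.startswith(u3 + u4, j):
                y2 = y[j + len(u3) + len(u4):]
                out.add(x[:i] + u1 + u4 + y2)
    return out


# The site 'ac' in 'snack' and the site 'of' in 'tofu' give 'snafu'.
rule = ("a", "c", "o", "f")
print(apply_rule(rule, "snack", "tofu"))  # {'snafu'}
```

The order of the arguments matters: with the same rule, the pair ('tofu', 'snack') has no matching sites and yields nothing.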
Splicing Systems (3)

H scheme
  - (V, R): V is an alphabet, R is a set of splicing rules.
H system
  - (V, L, R): L is a given language.
Extended H system
  - (V, T, A, R): T is the terminal alphabet, A is the set of axioms.

Classes
  - Both A and R are finite: regular languages.
  - A is finite, but R is regular: recursively enumerable languages.
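
An extended H system with finite A and finite R can be explored by iterating the rules from the axioms. The sketch below assumes the standard splicing semantics (x1 u1 u2 x2 and y1 u3 u4 y2 give x1 u1 u4 y2) and cuts the search off at a length bound max_len, so it computes only a finite approximation of the generated language. All names and the example system are hypothetical.

```python
def splice(rule, x, y):
    """All results of applying rule (u1, u2, u3, u4) to the pair (x, y)."""
    u1, u2, u3, u4 = rule
    out = set()
    for i in range(len(x) + 1):
        if x.startswith(u1 + u2, i):
            for j in range(len(y) + 1):
                if y.startswith(u3 + u4, j):
                    out.add(x[:i] + u1 + u4 + y[j + len(u3) + len(u4):])
    return out


def generate(axioms, rules, terminals, max_len=8):
    """Iterate splicing from the axioms, keeping strings up to max_len.

    Returns the generated strings that use terminal symbols only.
    Longer strings are discarded, so this is a truncated search.
    """
    language = set(axioms)
    while True:
        new = set()
        for rule in rules:
            for x in language:
                for y in language:
                    for z in splice(rule, x, y):
                        if len(z) <= max_len and z not in language:
                            new.add(z)
        if not new:
            break
        language |= new
    return {w for w in language if set(w) <= set(terminals)}


# V = {a, c}, T = {a}, A = {a, ca}, R = {a # (empty) $ (empty) # a}
words = generate({"a", "ca"}, [("a", "", "", "a")], {"a"}, max_len=4)
print(sorted(words))  # ['a', 'aa', 'aaa', 'aaaa']
```

In this toy system the terminal strings are the nonempty runs of a, a regular language, as expected when both A and R are finite.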
Splicing Systems (4)

Question
  - Are there splicing systems that can generate the recursively enumerable languages and for which a realistic implementation is possible?

Requirements
  - The number of initial strands and the number of different restriction enzymes are finite.
  - DNA strands are consumed in splicing.
  - The length of the recognition site of a restriction enzyme is limited.
  - Some restrictions on the use of the splicing operator are difficult to implement.
Splicing Systems (5)

Candidates
  - Splicing systems based on multisets
      - K.L. Denninghoff and R.W. Gatterdam. On the undecidability of splicing systems. International Journal of Computer Mathematics, 27:133-145, 1989.
  - Splicing systems for circular strings
      - Takashi Yokomori, Satoshi Kobayashi, and Claudio Ferretti. On the power of circular splicing systems and DNA computability. Technical Report CSIM 95-01, Univ. of Electro-Communications, 1995.
  - Multiset splicing systems with finite axioms and radius 2
      - Thomas Head, Gheorghe Paun, and Dennis Pixton. Generative Mechanisms Suggested by DNA Recombination. Vol. 2 of Rozenberg and Salomaa. 1996.
Conclusion

Practical molecular computation
  - Molecular computation has great potential.
  - The scale-up problem is difficult.
  - Some models are being refined, and new ones are being introduced that use very different paradigms or implementations.

Theory
  - Provides us with a new way of viewing biological and chemical processes, which may prove valuable in various fields.
About

Our TSP
  - H system series
  - Implementation

Another model of DNA computing
  - New model beating H system series (??)
  - Variants of H system series.
  - Another theoretically universal system.
  - More practical ones to address the Turing tar-pit.