abstract bio molecular computing

ABSTRACT
BIO MOLECULAR COMPUTING
Biomolecular computing, ‘computations performed by biomolecules’, is challenging
traditional approaches to computation both theoretically and technologically. Often placed
within the wider context of ‘natural’ or even ‘unconventional’ computing, the study of
natural and artificial molecular computations is adding to our understanding both of
biology and computer science well beyond the framework of neuroscience. The papers in
this special theme document only a part of an increasing involvement of Europe in this far
reaching undertaking. In this introduction, I wish to outline the current scope of the field
and assemble some basic arguments that biomolecular computation is of central
importance to both computer science and biology. Readers will also find arguments for not
dismissing DNA Computing as limited to exhaustive search and for a qualitatively
distinctive advantage over all other types of computation including quantum computing.
The idea that molecular systems can perform computations is not new and was
indeed more natural in the pre-transistor age. Most computer scientists know of
von Neumann’s discussions of self-reproducing automata in the late 1940s, some
of which were framed in molecular terms. Here the basic issue was that of
bootstrapping: can a machine construct a machine more complex than itself?
Important was the idea, appearing less natural in the current age of dichotomy
between hardware and software, that the computations of a device can alter the
device itself. This vision is natural at the scale of molecular reactions, although it
may appear utopic to those running huge chip production facilities. Alan Turing
also looked beyond purely symbolic processing to natural bootstrapping
mechanisms in his work on self-structuring in molecular and biological systems.
Purely chemical computers have been proposed by Ross and Hjelmfelt extending
this approach. In biology, the idea of molecular information processing took hold
starting from the unraveling of the genetic code and translation machinery and
extended to genetic regulation, cellular signaling, protein trafficking,
morphogenesis and evolution - all of this independently of the development in the
neurosciences. For example, because of the fundamental role of information
processing in evolution, and the ability to address these issues on laboratory time
scales at the molecular level, I founded the first multi-disciplinary Department of
Molecular Information Processing in 1992. In 1994 came Adleman’s key
experiment demonstrating that the tools of laboratory molecular biology could be
used to program computations with DNA in vitro. The huge information storage
capacity of DNA and the low energy dissipation of DNA processing led to an
explosion of interest in massively parallel DNA Computing. For serious
proponents of the field however, there really never was a question of brute search
with DNA solving the problem of an exponential growth in the number of
alternative solutions indefinitely. In a new field, one starts with the simplest
algorithms and proceeds from there: as a number of contributions and patents have
shown, DNA Computing is not limited to simple algorithms or even, as we argue
here, to a fixed hardware configuration.
After 1994, universal computation and complexity results for DNA Computing
rapidly ensued (recent examples of ongoing projects here are reported in this
collection by Rozenberg, and Csuhaj-Varju). The laboratory procedures for
manipulating populations of DNA were formalized and new sets of primitive
operations proposed: the connection with recombination and so called splicing
systems was particularly interesting as it strengthened the view of evolution as a
computational process. Essentially, three classes of DNA Computing are now
apparent: intramolecular, intermolecular and supramolecular. Cutting across this
classification, DNA Computing approaches can be distinguished as either
homogeneous (ie well stirred) or spatially structured (including multi-compartment
or membrane systems, cellular DNA computing and dataflow like architectures
using microstructured flow systems) and as either in vitro (purely chemical) or in
vivo (ie inside cellular life forms). Approaches differ in the level of
programmability, automation, generality and parallelism (eg SIMD vs MIMD) and
whether the emphasis is on achieving new basic operations, new architectures,
error tolerance, evolvability or scalability. The Japanese Project led by Hagiya
focuses on intramolecular DNA Computing, constructing programmable state
machines in single DNA molecules which operate by means of intramolecular
conformational transitions. Intermolecular DNA Computing, of which Adleman's
experiment is an example, is still the dominant form, focusing on the hybridization
between different DNA molecules as a basic step of computations and this is
common to the three projects reported here having an experimental component
(McCaskill, Rozenberg and Amos). Beyond Europe, the group at Wisconsin is
prominent in exploiting a surface based approach to intermolecular DNA
Computing using DNA Chips. Finally, supramolecular DNA Computing, as
pioneered by Eric Winfree, harnesses the process of self-assembly of rigid DNA
molecules with different sequences to perform computations. The connection with
nanomachines and nanosystems is then clear and will become more pervasive in
the near future.
In my view, DNA Computation is exciting and should be more substantially
funded in Europe for the following reasons:
• it opens the possibility of a simultaneous bootstrapping solution of future
computer design, construction and efficient computation
• it provides programmable access to nanosystems and the world of molecular
biology, extending the reach of computation
• it admits complex, efficient and universal algorithms running on
dynamically constructed dedicated molecular hardware
• it can contribute to our understanding of information flow in evolution and
biological construction
• it is opening up new formal models of computation, extending our
understanding of the limits of computation.
The difference with Quantum Computing is dramatic. Quantum Computing
involves high physical technology for the isolation of mixed quantum states
necessary to implement (if this is scalable) efficient computations solving
combinatorially complex problems such as factorization. DNA Computing
operates in natural noisy environments, such as a glass of water. It involves an
evolvable platform for computation in which the computer construction machinery
itself is embedded. Embedded computing is possible without electrical power in
microscopic, error prone and real time environments, using mechanisms and
technology compatible with our own make up. Because DNA Computing is linked
to molecular construction, the computations may eventually also be employed to
build three dimensional self-organizing partially electronic or more remotely even
quantum computers. Moreover, DNA Computing opens computers to a wealth of
applications in intelligent manufacturing systems, complex molecular diagnostics
and molecular process control.
The papers in this section primarily deal with Biomolecular Computing. The first
contribution outlines the European initiative in coordinating Molecular Computing
(EMCC). Three groups present their multidisciplinary projects involving joint
theoretical and experimental work. Two papers are devoted to extending the range
of formal models of computation. The collection concludes with a small sampler
from the more established approach to biologically inspired computation using
neural network models. It is interesting that one of these contributions addresses
the application of neural modelling to symbolic information processing. However,
the extent to which informational biomolecules play a specific role in long term
memory and the structuring of the brain, uniting neural and molecular
computation, still awaits clarification.
Bio-Molecular Computing Technology
Biomolecular computing uses DNA and other biological materials as the building blocks for designing
living computational machines that solve complex problems. It covers the natural computations
performed by biomolecules, and it is a field to which electronic engineering, biophysics, chemistry,
molecular biology, solid state physics and computer science all contribute to a large extent. The field
has already produced a biological computer, composed of DNA molecules and enzymes, that can give
rise to a biological phenomenon.
Leonard Adleman is regarded as the father of DNA computing. He showed that DNA computing is
possible by using a molecular computer to find a solution to a real problem: an instance of the
Hamiltonian Path Problem, which is closely related to the Traveling Salesman Problem.
Computation Process
The basic idea behind smart molecules is to develop massively distributed living machines that
perform basic automated tasks. A DNA system is a generic and highly advanced organic form of
autonomous programmable computing device. Biomolecular computing involves the encoding,
manipulation and retrieval of information at a macromolecular level. Biological systems offer further
capabilities such as self-assembly, recognition, high-speed processing, learning and
self-reproduction. Overall, the computation processes the input molecules step by step, and the last
step delivers the output in the form of dsDNA molecules.
Definition
Molecular computing is an emerging field to which chemistry, biophysics, molecular biology, electronic
engineering, solid state physics and computer science contribute to a large extent. It involves the
encoding, manipulation and retrieval of information at a macromolecular level in contrast to the
current techniques, which accomplish the above functions via IC miniaturization of bulk devices. The
biological systems have unique abilities such as pattern recognition, learning, self-assembly and
self-reproduction as well as high speed and parallel information processing. The aim of this article is to
exploit these characteristics to build computing systems, which have many advantages over their
inorganic (Si,Ge) counterparts.
DNA computing began in 1994 when Leonard Adleman proved that DNA computing was possible by
finding a solution to a real problem, an instance of the Hamiltonian Path Problem (which is closely
related to the Traveling Salesman Problem), with a molecular computer. In theoretical terms, some scientists say the actual
beginnings of DNA computation should be attributed to Charles Bennett's work. Adleman, now
considered the father of DNA computing, is a professor at the University of Southern California and
spawned the field with his paper, "Molecular Computation of Solutions of Combinatorial Problems."
Since then, Adleman has demonstrated how the massive parallelism of a trillion DNA strands can
simultaneously attack different aspects of a computation in order to tackle hard combinatorial
problems.
Adleman's Hamiltonian Path Problem:
The objective is to find a path from start to end that goes through all the points exactly once. This
problem is difficult for conventional computers to solve because it is NP-complete, that is, among the
hardest of the "non-deterministic polynomial time" problems. Large instances of such problems are
intractable for conventional computers, but small instances can in principle be attacked with
massively parallel computers such as DNA computers. The Hamiltonian Path problem was chosen by
Adleman because it is a well-known problem of this kind.
The following algorithm solves the Hamiltonian Path problem:
1. Generate random paths through the graph.
2. Keep only those paths that begin with the start city (A) and conclude with the
end city (G).
3. If the graph has n cities, keep only those paths with n cities (here n = 7).
4. Keep only those paths that enter all cities at least once.
5. Any remaining paths are solutions.
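The five steps above can be sketched in software as a generate-and-filter search. The Python sketch below is an illustration only: random walks stand in for the random joining of DNA strands, and the four-city graph, the function names, the stopping probability and the sample count are all assumptions made for this example, not details of Adleman's experiment.

```python
import random

def random_path(edges, max_len, rng):
    """Step 1: one random walk along directed edges (a stand-in for random strand joining)."""
    starts = sorted({u for u, _ in edges})
    path = [rng.choice(starts)]
    while len(path) < max_len:
        successors = sorted(v for u, v in edges if u == path[-1])
        if not successors or rng.random() < 0.1:  # dead end, or the walk stops early
            break
        path.append(rng.choice(successors))
    return tuple(path)

def hamiltonian_paths(edges, cities, start, end, samples=20000, seed=0):
    """Apply steps 1 to 5 and return the set of surviving paths."""
    rng = random.Random(seed)
    n = len(cities)
    pool = {random_path(edges, n + 2, rng) for _ in range(samples)}  # step 1
    pool = {p for p in pool if p[0] == start and p[-1] == end}       # step 2
    pool = {p for p in pool if len(p) == n}                          # step 3
    pool = {p for p in pool if set(p) == set(cities)}                # step 4
    return pool                                                      # step 5

# Hypothetical four-city graph, chosen only to keep the example small.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C"), ("C", "B"), ("B", "D")]
CITIES = ["A", "B", "C", "D"]

for path in sorted(hamiltonian_paths(EDGES, CITIES, "A", "D")):
    print(" -> ".join(path))
```

Because the candidates are generated at random, the search only succeeds if enough samples are drawn to cover every path; the DNA version relies on the same principle, using a very large number of molecules in place of the sample count.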
The key was using DNA to perform the five steps in the above algorithm. Adleman's first step was to
synthesize DNA strands of known sequences, each strand 20 nucleotides long. He represented each of
the seven vertices of the graph (numbered 0 to 6) by a separate strand, and further represented each
edge between two connected vertices, such as 1 to 2, by a DNA strand which consisted of the last ten
nucleotides of the strand representing vertex 1 plus the first ten nucleotides of the vertex 2 strand.
Then, through the sheer number of DNA molecules (3 x 10^13 copies of each edge strand in this
experiment!) joining together in all possible combinations, many random paths were generated. Adleman
used well-established techniques of molecular biology to isolate the Hamiltonian path, the one that
entered all vertices, starting at vertex 0 and ending at vertex 6. After generating the numerous random
paths in the first step, he used polymerase chain reaction (PCR) to amplify and keep only the paths
that began at vertex 0 and ended at vertex 6. The next two steps kept only those strands that passed
through seven vertices, entering each vertex at least once. At this point, any paths that remained
would code for a Hamiltonian path, thus solving the problem.
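The encoding described above (20-nucleotide vertex strands, with each edge built from the last ten nucleotides of one vertex and the first ten of the next) can be illustrated with a short Python sketch. The sequences here are random placeholders rather than Adleman's actual oligonucleotides, the helper names are hypothetical, and the sketch ignores strand complementarity and all laboratory chemistry.

```python
import random

def make_vertex_strands(vertices, rng, length=20):
    """Assign each vertex a random 20-nucleotide sequence (placeholder sequences)."""
    return {v: "".join(rng.choice("ACGT") for _ in range(length)) for v in vertices}

def edge_strand(strands, u, v):
    """Edge u -> v: last 10 nucleotides of u followed by the first 10 nucleotides of v."""
    return strands[u][-10:] + strands[v][:10]

def path_strand(strands, path):
    """Concatenate the edge strands along a path, as joined molecules would."""
    return "".join(edge_strand(strands, u, v) for u, v in zip(path, path[1:]))

rng = random.Random(1)
strands = make_vertex_strands([1, 2, 3], rng)
joined = path_strand(strands, [1, 2, 3])

# The two edge strands meet so that the full sequence of vertex 2 reappears in the middle.
print(joined[10:30] == strands[2])
```

The design point is that consecutive edges overlap on a whole vertex sequence, so a joined molecule spells out the vertices it passes through and its length reveals how many vertices the path contains.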
Biocomputers use systems of biologically derived molecules, such as DNA and proteins, to
perform computational calculations involving storing, retrieving, and processing data. The
development of biocomputers has been made possible by the expanding new science of
nanobiotechnology. The term nanobiotechnology can be defined in multiple ways; in a more general
sense, nanobiotechnology can be defined as any type of technology that uses both nano-scale
materials, i.e. materials having characteristic dimensions of 1-100 nanometers, as well as biologically
based materials. A more restrictive definition views nanobiotechnology more specifically as the
design and engineering of proteins that can then be assembled into larger, functional structures.
The implementation of nanobiotechnology, as defined in this narrower sense, provides
scientists with the ability to engineer biomolecular systems specifically so that they interact in a
fashion that can ultimately result in the computational functionality of a computer.
The promising field of biocomputer research uses the science behind nano-sized biomaterials to
create various forms of computational devices, which may have many potential applications in the
future. One day, biocomputers using nanobiotechnology may become the cheapest, most
energy-efficient, most powerful, and most economical of any commercially available computer. Already,
scientists are making significant headway in the advancement of this science.
Scientific Background
Biocomputers use biologically derived materials to perform computational functions. A biocomputer
consists of a pathway or series of metabolic pathways involving biological materials that are
engineered to behave in a certain manner based upon the conditions (input) of the system. The
resulting pathway of reactions that takes place constitutes an output, which is based on the
engineering design of the biocomputer and can be interpreted as a form of computational analysis.
Three distinguishable types of biocomputers include biochemical computers, biomechanical
computers, and bioelectronic computers. Our task is to investigate how synthetic biochemical systems
can be designed to carry out algorithms and compute; what models of computation arise from
biochemical processes and how they can be programmed; and how to "compile" abstract descriptions
of biomolecular algorithms down to specific synthetic DNA sequences that implement the desired
computation in the laboratory.
Like the carefully orchestrated molecular processes that occur within living cells, biomolecular
computation can in principle occur autonomously, without the need for any external intervention
during the computation. Being able to design and understand such systems is our ultimate goal. We
are exploring several interconnected paradigms of biomolecular computation, based loosely on
processes that are ubiquitous throughout living organisms.
Algorithmic self-assembly of DNA tiles (inspired by crystals, microtubules, and virus capsids on the
biological side, and by Wang tiles on the mathematical) encodes information in the geometric
arrangement of tiles, and performs logical steps by the selective addition of tiles at geometrically
compatible sites. Algorithmic self-assembly may be ideally suited for bottom-up self-fabrication of
complex nanostructures. A major question concerns how to reduce error rates during assembly; we
are investigating "proofreading" logic for error-resilient algorithmic growth, as well as methods to
programmably control the nucleation of self-assembled structures. Both theoretical and experimental
projects are ongoing.
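A toy model can make the idea of "logical steps by selective addition of tiles" concrete. The Python sketch below assumes one particular rule that is not taken from the text: each newly added tile carries the XOR of the two tiles it attaches to, a rule often used to illustrate algorithmic self-assembly because it grows a Sierpinski-like pattern. The function name and grid layout are illustrative, and the model is error-free, unlike real assembly.

```python
def grow_xor_rows(width, rows):
    """Grow layers of 0/1 tiles; each new tile is the XOR of the tile above it
    and the tile above-left (missing neighbours count as 0)."""
    seed = [0] * width
    seed[0] = 1  # a single 1-tile acts as the nucleation seed
    layers = [seed]
    for _ in range(rows - 1):
        prev = layers[-1]
        layers.append([prev[i] ^ (prev[i - 1] if i > 0 else 0) for i in range(width)])
    return layers

for layer in grow_xor_rows(8, 8):
    print("".join("#" if tile else "." for tile in layer))
```

Each layer depends only on local matches with the layer before it, which is why a single wrong tile can propagate and why the proofreading schemes mentioned above matter.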
In vitro RNA transcriptional circuits are a stripped-down, bare-bones version of genetic regulatory
networks in the cell; signals are carried by the concentration of specific RNA transcripts; RNA
polymerase and RNase regulate the production and degradation of RNA. In vitro RNA transcriptional
circuits should allow dynamic control of biomolecular processes -- at the time scale of minutes. On the
theory side, we have shown how these networks can function as biochemical neural networks; on the
experimental side, we have demonstrated and characterized a two-node bistable circuit. Future
research aims at spatial patterning in reaction-diffusion conditions, and at measuring stochastic
behavior due to small copy numbers in small volumes.
Biochemical circuits, such as cellular signal transduction cascades, are logically related to boolean
circuits. For example, a given enzyme molecule may be either phosphorylated ("on") or not ("off").
Phosphorylation cascades are ideal for
the study of reliable computation in the presence of thermal "noise". More generally, one may ask
how to design formal chemical reaction networks to perform computation, and how stochastic noise
is shaped by network activity. Experimentally, we are constructing DNA-based logic gates that can be
"wired" into arbitrary circuits.
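To give a feel for what a two-node bistable circuit does, here is a minimal Python sketch of two species that repress each other's production. The rate equations, the Hill-type repression term and the parameter values are generic textbook assumptions chosen for illustration; they are not the model or the measurements of the circuit described above.

```python
def simulate_toggle(x, y, alpha=4.0, hill=2, dt=0.01, steps=5000):
    """Euler integration of a mutual-repression model (assumed form):
        dx/dt = alpha / (1 + y**hill) - x
        dy/dt = alpha / (1 + x**hill) - y
    Returns the concentrations after steps * dt time units."""
    for _ in range(steps):
        dx = alpha / (1.0 + y ** hill) - x
        dy = alpha / (1.0 + x ** hill) - y
        x, y = x + dt * dx, y + dt * dy
    return x, y

# Two different starting points settle into two different stable states.
print(simulate_toggle(2.0, 1.0))  # x ends high, y ends low
print(simulate_toggle(1.0, 2.0))  # y ends high, x ends low
```

Whichever species starts ahead suppresses the other and stays on, so the circuit stores one bit; this is the sense in which such a network can act as an "on"/"off" element.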
Chemical self-replication and evolution must have gotten started somehow, way back when. We
are using algorithmic self-assembly to investigate a radical hypothesis of Graham Cairns-Smith, that
life got started as clay crystals that reproduced patterns as they grew. On paper, at least, it appears
that simple crystal growth mechanisms are sufficient for very complex Darwinian evolution.
RNA and DNA hybridization and folding are essential processes for all DNA computing, and can perform
complex logical operations in their own right. Realistic yet tractable models of nucleic acid
interactions form the foundation for higher-level descriptions of DNA nanodevices, and allow for
automated design of DNA sequences for DNA structures and devices. We are developing fast
algorithms for simulating folding at the secondary structure level.
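At its most idealised, hybridization reduces to Watson-Crick complementarity between one strand and the reverse of the other. The Python sketch below captures only that rule; it assumes perfect full-length matching and ignores thermodynamics, mismatches and secondary structure, which the realistic models mentioned above are designed to handle. The function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Return the strand that would pair with `strand`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def can_hybridize(strand_a, strand_b):
    """Idealised test: the strands pair only if they are exact reverse complements."""
    return strand_b == reverse_complement(strand_a)

print(reverse_complement("ATGC"))
print(can_hybridize("ATGC", "GCAT"))
```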
Deoxyribonucleic acid (DNA) is present in all organisms. Whether the organism is a mammal, a bird or a
bacterium, DNA is responsible for its functioning. Looking at the two tables provided, some noticeable
trends can be identified and conclusions derived. Four ideas emerge from the tables: that more complex
organisms have more DNA mass per cell, that the mass of DNA in somatic cells is constant (there is a
range but it is very slight) for any particular organism, that sperm cells are haploid cells and that all
organisms have DNA present in their cells.
The fact that all of the organisms, whether mammal, bird or fungus, have a mass of DNA present in their
cells shows that DNA is present for a reason. If a mammal has an approximate DNA mass in each cell of
6 pg and birds have an average DNA mass of 2 pg per cell, then this DNA has to have a specific function in
the body, which explains its presence in each cell. Also, because the organism has a DNA mass
in each cell, DNA would have to be passed from the parents on to their offspring.
The masses of DNA in the somatic cells of the chicken are all approximately the same. The DNA mass
found in a heart cell of the chicken measured at 2.45 pg while the mass of the DNA in each kidney cell
weighed at 2.50pg. This can be explained by the fact that when an egg is fertilized by a sperm cell, the
fertilized egg eventually becomes the starting point for all the different cells. During the process of mitosis,
the fertilized egg duplicates to form a cluster of cells. While it does so the DNA is also duplicated,
and eventually this cluster of cells becomes specialized for different functions in different areas of the
body. The small but notable variance in the value of masses of the somatic cells can be attributed to
experimental error.
More complex organisms have a higher DNA mass content per cell. A mammal has a DNA mass
content of 6 pg per cell while a bird has an average DNA mass content of 2 pg per cell. The mass of DNA
present in each cell depends on how complex the organism is: the higher the complexity, the more
DNA is needed for the organism to carry out its internal functions. Since more information is needed
to describe the organism itself, the DNA mass increases.
From the tables provided the mass of the DNA found in sperm cells can be noted. While the mass of DNA
falls in the same range for all the different somatic cells in the chicken’s body, the sperm cell is an odd
case. With a DNA mass of 1.26 pg, the sperm cell holds the smallest value, and the only value
that does not fit in with those recorded for the other cells of the
chicken’s body. This can be explained by the fact that the sperm cell is a haploid cell:
it carries half the DNA that an organism will eventually obtain, the other half coming from the egg. If the
sperm cell, with a DNA mass of 1.26 pg, were to fertilize the egg (which has a DNA mass of 1.26 pg), the
resulting organism would have a DNA mass of 2.52 pg per cell. This value fits in with the rest of the values observed
in the second chart. It can be noted that DNA is passed on from both the sperm and egg cell.
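The arithmetic behind this argument can be checked in a few lines. The Python sketch below uses only the masses quoted in the text (1.26 pg for sperm and egg, 2.45 pg for heart cells, 2.50 pg for kidney cells); the helper name is illustrative.

```python
sperm_pg = 1.26  # DNA mass per sperm cell, as quoted in the text
egg_pg = 1.26    # DNA mass per egg cell, as quoted in the text
fertilized_pg = sperm_pg + egg_pg

somatic_pg = {"heart": 2.45, "kidney": 2.50}  # somatic values quoted in the text

def percent_deviation(value, reference):
    """Relative difference between value and reference, in percent."""
    return abs(value - reference) / reference * 100.0

print(f"sperm + egg = {fertilized_pg:.2f} pg")
for tissue, mass in somatic_pg.items():
    print(f"{tissue}: {mass:.2f} pg, deviation {percent_deviation(fertilized_pg, mass):.1f}%")
```

The combined haploid masses land within a few percent of the measured somatic values, which is the quantitative content of the claim that the sum "fits in" with the second chart.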
The presence of DNA in each of the organism’s cells, the fact that the mass of DNA in each of
the somatic cells of the chicken is approximately the same, that sperm cells carry half of the mass of DNA
of an ordinary somatic cell, and that more complex organisms have a higher DNA mass content per cell
than less complex organisms are the ideas present in the tables. From these four points extracted from
the tables, DNA appears to be an important component of each organism, responsible for carrying information.