Computer-assisted design for scaling up systems based on DNA

Downloaded from rsif.royalsocietypublishing.org on January 29, 2014
Computer-assisted design for scaling up
systems based on DNA reaction networks
rsif.royalsocietypublishing.org
Nathanaël Aubert1, Clément Mosca2, Teruo Fujii2, Masami Hagiya1
and Yannick Rondelez2
1Graduate School of Information Science and Technology, and 2LIMMS/CNRS-IIS, Institute of Industrial Science,
University of Tokyo, Tokyo, Japan
Research
Cite this article: Aubert N, Mosca C, Fujii T,
Hagiya M, Rondelez Y. 2014 Computer-assisted
design for scaling up systems based on DNA
reaction networks. J. R. Soc. Interface 11:
20131167.
http://dx.doi.org/10.1098/rsif.2013.1167
Received: 13 December 2013
Accepted: 2 January 2014
Subject Areas:
biochemistry, computational biology
Keywords:
molecular programming, computer-assisted
design, in silico to in vitro
Author for correspondence:
Nathanaël Aubert
e-mail: [email protected]
In the past few years, there have been many exciting advances in the field
of molecular programming, reaching a point where implementation of
non-trivial systems, such as neural networks or switchable bistable networks, is a reality. Such systems require nonlinearity, be it through signal
amplification, digitalization or the generation of autonomous dynamics
such as oscillations. The biochemistry of DNA systems provides such
mechanisms, but assembling them in a constructive manner is still a difficult
and sometimes counterintuitive process. Moreover, realistic prediction
of the actual evolution of concentrations over time requires a number of
side reactions, such as leaks, cross-talks or competitive interactions, to be
taken into account. In this case, the design of a system targeting a given
function takes much trial and error before the correct architecture can be
found. To speed up this process, we have created DNA Artificial Circuits
Computer-Assisted Design (DACCAD), a computer-assisted design program
that supports the construction of systems for the DNA toolbox. DACCAD is
ultimately aimed at designing actual in vitro implementations, which is made
possible by building on the experimental knowledge available on the
DNA toolbox. We illustrate its effectiveness by designing various systems,
from Montagne et al.’s Oligator or Padirac et al.’s bistable system to new
and complex networks, including a two-bit counter or a frequency divider
as well as an example of a very large system encoding the game Mastermind.
In the process, we highlight a variety of behaviours, such as enzymatic saturation and load effect, which would be hard to handle or even predict with
a simpler model. We also show that those mechanisms, while generally seen
as detrimental, can be used in a positive way, as a functional part of a design.
Additionally, the number of parameters included in these simulations can be
large, especially in the case of complex systems. For this reason, we included
the possibility to use CMA-ES, a state-of-the-art optimization algorithm that
will automatically evolve parameters chosen by the user to try to match a
specified behaviour. Finally, because no single piece of software can
capture all possible functionality, DACCAD includes the possibility to export
a system in the synthetic biology markup language, a widely used language
for describing biological reaction systems. DACCAD can be downloaded
online at http://www.yannick-rondelez.com/downloads/.
1. Introduction
1.1. Generality
Electronic supplementary material is available
at http://dx.doi.org/10.1098/rsif.2013.1167 or
via http://rsif.royalsocietypublishing.org.
Computer-assisted design (CAD) has helped many fields reach their full potential by scaling up designs, making error-prone or repetitive tasks automated
and allowing easy simulation and verification of systems. As such, CAD has
proved to be a necessary tool to develop any advanced technology, examples
of which include areas as varied as integrated circuits, aeronautics or programming. The case of DNA computing, a field that uses the interactions between
various DNA molecules and enzymes to perform meaningful operations, is no
exception: while DNA computing systems have shown great promise [1–4],
advances are still considerably limited by the trial-and-error process, and many
© 2014 The Author(s) Published by the Royal Society. All rights reserved.
— it provides a straightforward graphical interface to create
the network and assess its dynamic behaviour;
— it eliminates routine and error-prone tasks such as ODE
writing and solving;
— it helps in simple optimization tasks such as adjusting the
parameters to tune the behaviour of the network towards
a quantitative target, once a qualitative agreement has
been obtained; and
— it can display an animated version of the network,
giving a better understanding of the evolution of the
system over time.
DACCAD was written entirely in Java to make it as easy
to deploy as possible. This software allows the user to set the
reaction parameters, such as enzymatic activities, to be as
close as possible to real-life experiments. Users can also refer
to the electronic supplementary material of Montagne et al.
[3] to fit these parameters to their own experimental settings.
It comes with preset parameters from Padirac et al.’s work, so
that only a general knowledge of CRNs is necessary to start
designing complex systems. DACCAD thus makes it quick
to test designs based on the user’s intuition. Designed systems can then be built in vitro, using Baccouche et al.’s [29]
protocol. Finally, DACCAD can convert any such designed
system to synthetic biology markup language (SBML) to
ensure compatibility with other simulation tools that the
user would like to use.
dependence of each nucleotide on the global environmental
conditions (salt concentration, temperature) is thus averaged
through those parameters. While those rules are simple in
essence, complexity emerges from their intrinsic nonlinearity.
Similarly, the biochemistry of DNA can be described by
simple models, such as the Michaelis–Menten equation or
competitive inhibition. Therefore, DNA-based CRNs can be
both implemented in test tubes and described by quantitative
or semi-quantitative ordinary differential equation (ODE)
models based on standard kinetics. Compared with previous
experimental studies on CRNs, the availability of such mathematical models is the true benefit of DNA systems. It puts
the molecular programming approach in stark contrast to
‘classic’ nonlinear reaction networks, such as those observed
in chemistry (such as the Belousov–Zhabotinsky reaction
[24]) and in biology (gene regulation networks [25,26], signalling cascades in metabolic networks [27], etc.). Note that in
these last two cases, it is generally possible to infer high-level models of interaction among the molecular members
of the reaction network, such as the Hill equation for the cooperative binding of ligands [28]. However, it is much more
difficult to obtain precise mathematical forms describing
these interactions, let alone accurate numerical parameters
needed for predictive simulations. This last point thus
makes those systems harder to use in an engineering context.
DNA-based systems thus provide the chemical tools to
implement any computation or dynamic behaviour at the
molecular scale. While the construction of complex systems
quickly becomes intractable for the human brain, these accurate mechanistic models allow us to numerically simulate
them with reasonable accuracy, justifying CAD. For those
reasons, we expect that this approach will be the only way
to push the complexity of experimentally accessible systems
further. In DACCAD, the services offered by the machine
are fourfold:
such systems are often left at the proof-of-concept stage. Computer assistance could remove this limitation and help
designers take the next step towards actual applications. In
this study, we present DNA Artificial Circuits Computer-Assisted Design (DACCAD), a CAD program for designing
systems using Montagne et al.’s [3] dynamic networks assembly (DNA) toolbox, a set of DNA-and-enzyme-based
modules that can be arbitrarily combined into chemical
reaction networks (CRNs).
CRNs are assemblies of cross-interacting chemical reactions
where each molecular compound can affect the formation
and degradation rate of the others. CRNs, when kept out of
equilibrium, can display a variety of nonlinear behaviours,
including oscillations and multistability. They are very powerful systems in terms of information processing, as shown by the
variety of complex cellular processes that are controlled by
CRNs in vivo [5]. The recent demonstrations that they possess
Turing universality [1,6–9] also advocate for their use in a variety of control tasks at the molecular level. However, CRNs
tend to resist rational design. A link between the properties
of the network and its dynamic functions can be obtained
from dynamical systems theory, including linear stability
analysis [10], Lyapunov stability analysis [11] or CRN theory
[12], but such analytical approaches are generally restricted
to idealized systems with simple mathematical models.
Actual implementations are generally much more intricate in
terms of chemical kinetics.
In recent years, great efforts have been invested in the
exploration of the class of reaction networks relying on in vitro
DNA chemistry and biochemistry. These systems are based
on the predictability of DNA–DNA interactions using simple Watson–Crick base-pairing rules as well as on the rich
repertoire of possible enzymatic transformations. Using DNA-based molecular programming, it is possible to build simple
computing systems based on Boolean logic [6,13,14] but also
neural networks [2] and CRNs [7]. Enzymatic systems are
of particular interest because they can be maintained out of
equilibrium for an extended period of time (for instance by
having a large amount of deoxyribonucleotide triphosphate
or nucleoside 5′-triphosphate in the solution, acting as a
generic fuel) and thus are suited to explore emergent behaviours. We focus on the DNA toolbox, a recent framework to
assemble out-of-equilibrium networks of arbitrary topology,
whose dynamics are powered by three enzymes (polymerase,
exonuclease and nickase). This framework has been used
to build oscillators [3], bistable systems [4], a push–push
memory circuit [4] or even molecular ecosystems reproducing
predator–prey cycles in bulk [15], spatially [16,17] or in droplets
[18]. Spatial implementations of those systems display waves
[16,17], showing they can be used for two-dimensional reaction–diffusion systems. All these circuits were simple enough
to be designed and tuned by hand, but doing so was still
an arduous process of trial and error that cannot be extended to larger systems. Indeed, in well-mixed solutions, any
molecular element can potentially interact with many others
at any given time, and keeping track of all of these interactions
and their kinetic consequences quickly becomes impossible
for a human. It gets worse when counterintuitive effects,
such as competition and saturation [19–21], have to be taken
into account.
However, a quite unique feature of DNA as a molecular
material is the possibility to find simple rules and kinetic
parameters describing its global behaviour [7,22,23]. The
1.2. Related work
CANDO [44], as the number of different sequences necessary
to create complex structures is very large.
Note that those programs work at different levels, ranging
from abstract species in COPASI to DNA strands in VISUALDSD
or DACCAD to actual DNA sequences in NUPACK or
DINAMelt, depending on the application.
1.3. The dynamic networks assembly toolbox
The DNA toolbox was introduced by Montagne et al. [3] to
help rationally design DNA-based molecular programs, and
demonstrated its effectiveness through multiple systems
such as the Oligator. It has the advantage of using very
simple modules (activation and inhibition) which can be
combined to form, in theory, arbitrarily complex systems.
Moreover, it was described both from the theoretical [45]
and experimental point of view [29]. In the toolbox, two
kinds of DNA strands have to be distinguished: short DNA
strands are used as signals, and longer strands are used as
templates to generate new strands (figure 1). Activation is
done by short strands being extended by a DNA polymerase
enzyme along a compatible template, after which both signal
and output strands are cleaved apart by nickase. Both signal
and output are too short to form stable duplexes and are
eventually released. Inhibition is done by slightly longer
strands that are complementary to most of their target, but
do not trigger polymerization or nicking, thus inactivating
the template. Both signal and inhibition strands are degraded over time by the action of an exonuclease enzyme, to
keep the system out of equilibrium. Templates are chemically
protected against such degradation.
Any system using this paradigm can be easily represented
as a graph: signal and inhibition strands are the nodes while
templates are represented by an arrow from their signal to
their output. A second kind of arrow (‘bar-headed’ arrow)
is also used to represent which template is inhibited by a
given inhibition strand. Simple examples of such representation are shown in figure 2. Conversely, any such graph
can be directly converted to a DNA implementation.
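The graph representation described above can be sketched as a small data structure. This is only an illustrative sketch, not DACCAD's actual classes (which extend the JUNG library in Java): the names `Template`, `ToolboxGraph` and `build_oligator` are hypothetical, and the Oligator wiring is an assumption taken from the node labels of figure 2a (A is autocatalytic, A activates B, B produces the inhibitor IAA, and IAA inhibits the A-to-A template).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Template:
    """A plain arrow of the graph: a template producing `target` from `source`."""
    source: str
    target: str


@dataclass
class ToolboxGraph:
    """Hypothetical container for a DNA toolbox network."""
    strands: set = field(default_factory=set)        # nodes: signal and inhibition strands
    templates: set = field(default_factory=set)      # activation arrows
    inhibitions: dict = field(default_factory=dict)  # bar-headed arrows: inhibitor -> template

    def add_template(self, source, target):
        template = Template(source, target)
        self.strands.update((source, target))
        self.templates.add(template)
        return template

    def add_inhibition(self, inhibitor, template):
        # An inhibition arrow points at a template (an edge), not at a strand.
        if template not in self.templates:
            raise ValueError("an inhibitor must target an existing template")
        self.strands.add(inhibitor)
        self.inhibitions[inhibitor] = template


def build_oligator():
    """Assumed Oligator topology, following the labels of figure 2a."""
    graph = ToolboxGraph()
    autocatalysis = graph.add_template("A", "A")
    graph.add_template("A", "B")
    graph.add_template("B", "IAA")
    graph.add_inhibition("IAA", autocatalysis)
    return graph


oligator = build_oligator()
```

Storing the inhibition as a mapping from a strand to a template reflects the one modification to an ordinary graph that the text mentions: inhibitors target arrows rather than nodes.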
DACCAD can simulate both autonomous and non-autonomous systems. The latter are simulated by adding a
given amount of a specific species to the system at a set
time in the simulation. This allows us to simulate inputs
coming from outside the system, such as injections made
by the experimenter or signal coming from another DNA
system present in the environment. Spikes, either unique or
Figure 1. Modules of the DNA toolbox: activation, inhibition and the special case
of autocatalysis. The black domain in the inhibition strand represents the final
mismatch preventing the action of the polymerase. (Online version in colour.)
The basic operations that can be executed by DNA, such as
two complementary strands attaching to form a double helix,
toehold-mediated strand displacement and enzymatic transformations, are well known. However, there are multiple
paradigms that use those mechanisms in very different ways.
One such paradigm, DNA strand displacement (DSD), already
benefits from its own software for CAD, VISUALDSD [30],
demonstrating the effectiveness of the CAD approach.
Recently, Kwiatkowska and co-workers [31] were able to
prove the incorrectness of a basic DSD system by coupling
VISUALDSD with a verification software and then propose
a corrected version. VISUALDSD is also useful to highlight
counterintuitive phenomena such as the winner-take-all
effect [19,32,33].
DSD is also the foundation of Qian and Winfree’s seesaw
gate [1,2], which they used to build complex logical circuits.
However, the translation is hard to do by hand, and so is
the design of the actual hundreds of DNA sequences making up
the system. This is why they designed a compiler (http://www.
dna.caltech.edu/SeesawCompiler/) which takes a description
of a logical circuit and outputs the DNA sequences to
implement it. The compiler can also export the system in the
MATHEMATICA and SBML [34,35] format for simulation of
the system at the molecular level, and VISUALDSD format for
simulation at the toehold level.
The motivation to automatically design DNA sequences
is that unwanted interactions at the molecular level are
very common. While simulation might show no problem
with abstract sequences, actual implementation, if done
incorrectly, can lead to unexpected secondary structures.
In particular, because sequences are made up of only four
nucleotides, it is hard to prevent partial complementarity.
NUPACK [36] and DINAMelt [37,38] are online tools
designed to check how specific sequences will interact
together and output the most stable configurations. They
also compute many characteristics of such structures, such
as melting temperature. NUPACK focuses more on the secondary structures of systems with multiple kinds of DNA
strands, whereas DINAMelt provides various data on the
interaction of two sequences.
For broader applications, COPASI [39] is a very complete
tool for analysing and simulating biological systems. It has
the advantage of being able to perform mass action, stochastic
and hybrid simulations of any system described in SBML and
is also able to handle parameter fitting and optimization,
which makes it very powerful. However, it has the drawback of requiring the user to provide the SBML file describing
the system, or to enter every molecule, interaction and reaction rate by hand through the graphical user
interface, a long process for realistic systems. For this reason,
we gave the option to export systems designed with
DACCAD in the SBML format, so that the user can perform
additional operations with COPASI, or any other software
supporting this format.
Furthermore, the possibilities of DNA are not limited to
computation. Because DNA molecules can be considered flexible when long enough, but locally rigid, they can be used to
build complex structures, for instance by using a long scaffold
strand held in place by short ‘staple’ strands [40], or by
directly using specific short strands [41,42]. Such approaches have
to rely heavily on CAD software, such as CADNANO [43] and
Figure 2. Examples of graph representation of the DNA toolbox: (a) Montagne
et al.’s Oligator [3] and (b) Padirac et al.’s bistable system [4].
2. Methods
2.1. Mathematical model
To keep the model simple, yet descriptive enough, we model reactions at the domain level. This level abstracts away actual DNA
sequences and instead considers only meaningful interactions,
from the designer’s point of view. In particular, in the DNA
toolbox, signal and inhibition strands are composed of only one
domain, so we consider they are either completely free or completely attached to their target. Similarly, templates are composed of
two domains, because they have both an activation and an
output site. This approach has proved itself over time in DSD systems [2,30,31], and can be extended to the DNA toolbox if we
consider that enzyme activity is carried out as a single step. This
means that no intermediate product of an enzymatic reaction,
such as a partially extended DNA strand, can interact with other
parts of the system. This also means that the evolution of the
state of a module (activation or autocatalysis) over time can be
derived only from the current concentrations of the relevant
species and its current state (figure 4). With those restrictions,
the set of possible reactions taking place is large, but straightforward: both signal and output strands can attach or detach to the
template and an inhibition strand can displace them if there is a
toehold [46] available, the polymerase enzyme will extend the
signal strand, displacing the output if it was attached and so on.
Possible reactions for a single template are shown in figure 5.
This demonstrates one of the advantages of DACCAD: in a generic
biochemical simulation program such as COPASI, the user would
need to add all those species and reactions by hand, setting the kinetic rates of each reaction based on enzymatic saturation.
DACCAD generates all those elements by implicitly referring to
a precise biochemical context, allowing the user to design systems
much faster. Generating a template takes one click in DACCAD,
whereas it generates for COPASI five new species, 11 reactions
and requires one to update carefully the Michaelis–Menten
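$$\frac{d[s]}{dt}(t) = \sum_{\mathrm{temp} \in I} f^{\mathrm{in}}_{\mathrm{temp}}(t) + \sum_{\mathrm{temp} \in O} f^{\mathrm{out}}_{\mathrm{temp}}(t) - \mathrm{exo}_s(t)\,[s](t), \qquad (2.1)$$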
where I is the set of all modules accepting s as activator, O is the set of
all modules generating s, $f^{\mathrm{in}}_{\mathrm{temp}}(t)$ is the flow of s as input of a template temp,
$f^{\mathrm{out}}_{\mathrm{temp}}(t)$ is its contribution as the output of a template
temp and $\mathrm{exo}_s(t)$ is the activity of the exonuclease enzyme with respect to s at time t, which is a constant depending only on s if
enzymatic saturation is not taken into account. Furthermore,
based on the measurements of Montagne et al. [3], we assume that $\mathrm{exo}_s$
mostly depends on the length of s, which means it can take only
two values: one for the signal strands and one for the inhibition
strands. Except for this strand-specific variation, enzymatic activity
is considered first order in the basic model, meaning in particular
that $\mathrm{exo}_s$ does not change over time. Also note that a given
module can be both in I and O in the case of autocatalysis.
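Read as code, equation (2.1) is a plain sum of contributions. The following minimal sketch is not taken from DACCAD: the function name and arguments are hypothetical, the flows are assumed to be signed values already computed from the state of each module (their actual expressions are in the electronic supplementary material), and the exonuclease activity is taken as first order, as in the basic model.

```python
def signal_derivative(concentration, in_flows, out_flows, exo_activity):
    """Right-hand side of equation (2.1) for a single signal strand s.

    concentration -- current concentration [s](t)
    in_flows      -- signed flows f_in, one per template in I (templates taking s as input)
    out_flows     -- flows f_out, one per template in O (templates producing s)
    exo_activity  -- first-order exonuclease activity exo_s
    """
    return sum(in_flows) + sum(out_flows) - exo_activity * concentration


# Hypothetical values: one template binding s (net loss), two templates producing s.
rate = signal_derivative(
    concentration=10.0,
    in_flows=[-0.5],
    out_flows=[2.0, 0.3],
    exo_activity=0.1,
)
```

In the autocatalytic case, the same template simply contributes one entry to `in_flows` and one to `out_flows`.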
The derivatives for inhibition strands are similar, but the
input term is replaced by an inhibition term:
$$\frac{d[i]}{dt}(t) = f^{\mathrm{inhib}}(t) + \sum_{\mathrm{temp} \in O} f^{\mathrm{out}}_{\mathrm{temp}}(t) - \mathrm{exo}_i(t)\,[i](t). \qquad (2.2)$$
Note that there is only one flux for the inhibition term, because a
given inhibition strand targets a specific module temp. $f^{\mathrm{in}}_{\mathrm{temp}}(t)$,
$f^{\mathrm{out}}_{\mathrm{temp}}(t)$ and $f^{\mathrm{inhib}}(t)$ can be directly derived from the current
state of the module temp. Their expressions are given in the
electronic supplementary material.
While the basic model does not take into account enzyme
saturation, using first-order activity, such an assumption is not realistic for a complex system. Specifically, all modules share
the same three enzymes, which is expected to have an impact
on their activity. The usual way to model such a burden is to use
a Michaelis–Menten term to quantify the enzymatic activity.
This term is further modified to take into account that multiple
modules are competing for those resources [21]. Coaxial stacking (the collaborative stabilization that happens at a nick) and
dangle are also considered by the model, as they affect the release
speed of input and output. Explicit saturation expressions and
a discussion on coaxial stacking are given in the electronic
supplementary material.
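To make the idea of a shared, saturable enzyme concrete, here is a minimal sketch using the standard textbook Michaelis–Menten rate with competing substrates. It is not the expression used by DACCAD (that one is given in the electronic supplementary material); the function name, arguments and numerical values are hypothetical.

```python
def saturated_rate(v_max, k_m, substrate, competitors=()):
    """Textbook Michaelis-Menten rate for one substrate of a shared enzyme.

    competitors is an iterable of (concentration, k_m) pairs for the other
    substrates of the same enzyme. Each competitor inflates the apparent
    Michaelis constant of the substrate of interest.
    """
    load = sum(conc / km for conc, km in competitors)
    return v_max * substrate / (k_m * (1.0 + load) + substrate)


# The same substrate is processed more slowly once a second module competes.
alone = saturated_rate(v_max=1.0, k_m=10.0, substrate=10.0)
shared = saturated_rate(v_max=1.0, k_m=10.0, substrate=10.0,
                        competitors=[(10.0, 10.0)])
```

With no competitor and a substrate far below its Michaelis constant, the rate reduces to the first-order form of the basic model, with an activity of roughly `v_max / k_m`.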
It is possible to toggle the saturation for part or all of the
enzymes (polymerase, nicking enzyme and exonuclease) to
obtain results closer to reality at a (light) cost in computation time.
It is also possible to toggle some additional effects: removing
enzymatic coupling, in which case saturation terms from substrates other than s are neglected; taking free templates into
account in the saturation term of the exonuclease (even though
they are not degraded, they can still bind to the enzyme and
act as competitive inhibitors); allowing signal and output strands
to invade inhibited templates; allowing the polymerase to generate output at a very slow rate from templates without primer
(a phenomenon called leak). Saturation is toggled on by default,
both to be closer to the actual in vitro behaviour and to help
design systems specifically using this phenomenon, such as in
a winner-take-all system [19,32,33].
2.2. Implementation
2.2.1. Graphical interface and graph manipulation
The main graphical interface of DACCAD is separated into two
areas (figure 7): a display panel showing the graphical
periodic, can be added through the interface. Arbitrary inputs
can also be loaded from an external file.
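A non-autonomous input of this kind can be sketched by interrupting the integration at each spike time and adding the injected amount to the concentration. The example below is an illustration only: it uses a single species with first-order degradation and an explicit Euler step as a stand-in for the real model and integrator, and all names and values are hypothetical.

```python
def periodic_spikes(first, period, count, amount):
    """Build a list of (time, amount) injections repeated at a fixed period."""
    return [(first + i * period, amount) for i in range(count)]


def simulate_with_spikes(decay_rate, spikes, t_end, dt=0.01):
    """Integrate d[s]/dt = -decay_rate * [s], adding each spike at its set time."""
    pending = sorted(spikes)
    concentration = 0.0
    trace = []
    for step in range(round(t_end / dt) + 1):
        t = step * dt
        # Apply every injection scheduled at or before the current time.
        while pending and pending[0][0] <= t + 1e-12:
            concentration += pending.pop(0)[1]
        trace.append((t, concentration))
        concentration += dt * (-decay_rate * concentration)  # explicit Euler step
    return trace


# Three injections of 5 nM at 10, 30 and 50 min.
trace = simulate_with_spikes(0.1, periodic_spikes(10.0, 20.0, 3, 5.0), t_end=60.0)
```

A unique spike is the special case `count=1`; an arbitrary input loaded from a file would simply provide the list of `(time, amount)` pairs directly.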
However, despite the simplicity of the DNA toolbox
paradigm, many effects are very hard to take into account
for a human designer. For instance, replacing a direct activation (A promotes B) by an indirect one (A promotes C
which promotes B) will result in more than just some latency
in the activation: there will also be a latency in all responses
of the strand B, meaning that its concentration will also
decrease more slowly (figure 3). Additionally, enzymes may get
saturated, which would change the reaction rates of other
parts of the system in ways that are difficult for a human
designer to grasp [21,33]. Those problems can be avoided by
modifying parameters in the system (strand stability, template concentrations, enzyme activities, etc.), or making
additional changes in the structure, operations that are
easily carried out with DACCAD.
equation of each enzyme. It gets even worse for COPASI if the template is inhibited, as an additional species and six new reactions
have to be added to the mathematical model. Figure 6 shows the
size difference between the representation of the same reaction
network in DACCAD and COPASI.
The derivative of the concentration of a signal strand s is deduced from its interactions with all compatible modules as follows:
[Figure 3 plots: concentration (nM) against time (min), 0 to 300 min; one panel shows species a and b, the other shows species a, b and c.]
2.2.3. Local optimization
Figure 4. An activation module from figure 1 as a black box, with its three
external components: the signal, inhibition and output strands. Based only on
the current concentrations of those three elements and the module current
state, it is possible to compute the module’s impact on said elements, as well
as the derivative of its state. (Online version in colour.)
representation of the system being designed and a data panel
showing context-dependent information.
As mentioned before, DNA toolbox systems can be represented
as a graph, with a minor modification to allow inhibitors to target
templates. The Graph class from the JUNG library [47] was
extended to store this information, as well as additional sequences
and template-relative parameters, such as stability and concentration. The display is done by the JUNG library, with a modified
renderer able to draw inhibition arrows. The plotting of time
traces is done by the JFreeChart library [48]. It is also possible to
obtain an animated version of the graph, displaying the evolution
of species concentrations and enzyme saturation, using a second
renderer supporting size and colour changes.
While trial and error is a simple way to optimize in silico a
system’s behaviour, such as an oscillator’s amplitude or frequency, it might be inconvenient owing to the sheer number of
parameters existing in the model (duplex stability of every
species, initial concentrations, template concentrations, Michaelis
constants of enzymes). For this purpose, we added the possibility to use CMA-ES, a state-of-the-art optimization algorithm
[50], to perform parameter optimization. The structure is
kept as defined by the user to make optimization possible in
a reasonable amount of time. Structure optimization can
also be attempted, but requires adapted algorithms [51,52]. The
user can define which parameters can be optimized, which
is useful to accommodate existing constraints or reduce
computation time.
The desired behaviour is defined by placing dots on the time
plot of concentrations or by loading a previously saved target
profile. CMA-ES will then try to minimize the least-square distance between the target and the actual concentration curves.
If no match can get below the user-defined termination
threshold, the algorithm stops after a given number of iterations
(default is 100) and sets the parameters to the best fit.
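The optimization loop described above can be illustrated with a much simpler evolution strategy. The sketch below deliberately swaps CMA-ES for a basic (1+1) evolution strategy to stay self-contained, and replaces the DNA toolbox model with a toy production and decay curve; `simulate`, `least_square_distance` and `evolve` are hypothetical names. Only the objective (least-square distance to user-placed dots), the termination threshold and the default of 100 iterations follow the text.

```python
import math
import random


def simulate(params, times):
    """Toy stand-in for a simulation: production at rate k, first-order decay d."""
    k, d = params
    return [k / d * (1.0 - math.exp(-d * t)) for t in times]


def least_square_distance(params, target):
    """Sum of squared gaps between the target dots (t, c) and the simulated curve."""
    times = [t for t, _ in target]
    return sum((c - y) ** 2 for (_, c), y in zip(target, simulate(params, times)))


def evolve(target, start, sigma=0.3, iterations=100, threshold=1e-6, seed=0):
    """(1+1) evolution strategy: a mutant replaces the parent only if it fits better."""
    rng = random.Random(seed)
    best, best_loss = list(start), least_square_distance(start, target)
    for _ in range(iterations):
        if best_loss < threshold:
            break  # user-defined termination threshold reached
        # Log-normal mutation keeps both rates strictly positive.
        candidate = [p * math.exp(sigma * rng.gauss(0.0, 1.0)) for p in best]
        loss = least_square_distance(candidate, target)
        if loss < best_loss:
            best, best_loss = candidate, loss
            sigma = min(sigma * 1.5, 1.0)  # widen the search after a success
        else:
            sigma *= 0.9                   # narrow it after a failure
    return best, best_loss


# Target dots generated from known parameters, then recovered from a poor start.
true_params = (2.0, 0.1)
sample_times = list(range(0, 60, 10))
target = list(zip(sample_times, simulate(true_params, sample_times)))
start = (1.0, 0.5)
fitted, fitted_loss = evolve(target, start)
```

Restricting the optimized parameters, as the text suggests, would amount to mutating only a chosen subset of the entries of `best`.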
3. Results
Here, we present various systems that were simulated using
DACCAD, going from simple to advanced networks. We
also present the effects of enzyme saturation and show that
they can be either deleterious or instrumental, depending on
the context.
2.2.2. Equation generation and solving
In §2.1, we have explained how to reduce a given graph to a
system of differential equations. This system is actually never
explicitly generated by the simulator. Instead, rewriting rules
are applied to deduce all the fluxes affecting a given part of the
system on the fly, generating the value of the derivatives of
the various parts of the DNA system. Those derivatives are
passed on demand to the Gragg–Bulirsch–Stoer integrator [49]
from the Apache Commons Mathematics Library, generating
the time trace. This algorithm was chosen for its high precision
and step adaptability, useful to solve systems such as the bistable
switch [4], where two mutually exclusive parts of a system get very close
in concentration and careful integration is required to obtain the
correct dynamics. Default parameters are given in the electronic
supplementary material.
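The idea of computing derivatives on demand, without ever writing out the full system of equations, can be sketched as follows. This is an assumption-laden illustration: each "rule" is a hypothetical function returning the fluxes it contributes for the current state, and a fixed-step fourth-order Runge–Kutta scheme stands in for the adaptive Gragg–Bulirsch–Stoer integrator used by DACCAD.

```python
def make_derivatives(rules):
    """Combine rule functions into one derivative callback, evaluated on demand."""
    def derivatives(state):
        totals = {name: 0.0 for name in state}
        for rule in rules:
            for name, flux in rule(state).items():
                totals[name] += flux
        return totals
    return derivatives


def rk4_step(derivatives, state, dt):
    """One classical Runge-Kutta step on a dictionary-valued state."""
    def shifted(slope, h):
        return {n: state[n] + h * slope[n] for n in state}
    k1 = derivatives(state)
    k2 = derivatives(shifted(k1, dt / 2))
    k3 = derivatives(shifted(k2, dt / 2))
    k4 = derivatives(shifted(k3, dt))
    return {n: state[n] + dt / 6 * (k1[n] + 2 * k2[n] + 2 * k3[n] + k4[n])
            for n in state}


def integrate(derivatives, state, t_end, dt):
    """Return the time trace as a list of states, one per step."""
    trace = [dict(state)]
    for _ in range(round(t_end / dt)):
        state = rk4_step(derivatives, state, dt)
        trace.append(state)
    return trace


# Two hypothetical rules: A is converted into B, and B is degraded.
def conversion(state):
    flux = 0.1 * state["A"]
    return {"A": -flux, "B": flux}


def degradation(state):
    return {"B": -0.05 * state["B"]}


trace = integrate(make_derivatives([conversion, degradation]),
                  {"A": 10.0, "B": 0.0}, t_end=10.0, dt=0.1)
```

Adding a module to the network then only means adding a rule to the list; the integrator never needs to know the structure of the system.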
The system of differential equations is only actually generated for export as a MATHEMATICA or SBML file. A generic file
template containing the basic rules of the DNA toolbox chemistry is completed through rewriting rules similar to those
used for the simulator.
3.1. Simple systems
As a proof of concept, we implemented Montagne et al.’s
Oligator [3] and Padirac et al.’s bistable circuit [4]. The time
series were compared with those of the original articles.
Those systems represent two basic behaviours, respectively
oscillations and memory, making them appropriate building
blocks for advanced systems. Multiple time plots showing the
impact of different parameters on those systems are shown in
figure 8. This also gives a chance to quickly explore the
impact of multiple parameters on such systems. In particular,
coaxial stacking (and thus indirectly the actual nucleotide
sequences) has a strong impact on autocatalytic signals. For
the Oligator, the system obtains more stable oscillations if
the autocatalyst has little to no coaxial slowdown. In the
case of the bistable circuit, it is possible to find a stability diagram based on the initial concentrations of A and B similar to
the one found by Padirac et al. [4]. This diagram is strongly
Figure 3. Impact of indirect activation of a given species. (Online version in colour.)
Figure 5. All the possible reactions between a template and its various inputs. The respective concentrations of the different configurations of template represent
the state of a module temp, noted temp_alone (template alone), temp_in (signal strand attached), temp_out (output strand attached), temp_both (both signal and output
strands attached), temp_ext (template completely double-stranded) and temp_inhib (inhibited template). (a) Working of an activation template without inhibition.
(b) Reactions related to inhibition. Toe represents the fact that signal and output strands invade an inhibited template at a slower rate, because they have a
very short toehold. (Online version in colour.)
3.3. Combined systems
the oscillator, making it twice as slow. Good parameters were found by excluding this parameter from the
optimization as well.
Figure 6. Comparison between the representation of a simple DNA toolbox
autocatalytic module in DACCAD, and the same chemical system in COPASI.
The latter graph was created through the interface of COPASI from the
SBML file exported by DACCAD. (Online version in colour.)
influenced by the coaxial stacking values of the various
templates (see the electronic supplementary material).
3.2. Simple system optimization
The Oligator oscillates only on a restricted area of the
complete parameter space [3]. To find a working set with
specific properties in terms of amplitude and frequency, we
used the optimization tool of DACCAD, leading to the results
shown in figure 9. In this particular case, a very good match
was found, but some target behaviours may be impossible to reach. If
the evolution runs into trouble, it is recommended to
reduce the number of evolved parameters and go through
multiple intermediary behaviours, so that the system evolves
progressively in the right direction. As an example, when
trying to optimize the frequency divider (figure 11), the optimizer was not explicitly allowed to modify the behaviour of
the oscillator. However, by increasing the concentration of the
template connecting the oscillator to the rest of the circuit and
using the load effect, the optimizer found a ‘good’ match: the
additional templates sequester an intermediary species of
Using the two previous building blocks, we can build
advanced systems that would require hundreds of lines if
the ODEs were written by the user. By cascading two bistable
systems, we can obtain a two-bit counter (figure 10). If
we instead combine an extended Oligator and a bistable circuit, we obtain a frequency divider (figure 11).
The systems presented here illustrate the problems that might arise when combining two circuits. While
the modularity of the DNA toolbox makes it simple to use
multiple systems in parallel, such as in the two-bit counter,
many factors have to be taken into account when cascading
systems. The most important is the load effect [53]: when
adding a template taking a given strand as input, the sequestering of this strand will change the behaviour of the system of
which it is part. In the frequency divider, for instance, the template connecting the oscillator to the bistable system will change
the oscillator’s frequency or even prevent anything more than
damped oscillations. This phenomenon is strongly dependent
on the strand used as input: any perturbation to the autocatalyst
will prevent true oscillations, but the intermediate species are
more robust. The optimal course of action is then to create a
link to a new species that will in turn bear the full load of the
connection to the other part of the system.
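The sequestration behind the load effect can be illustrated with a minimal sketch, assuming a simple 1:1 binding equilibrium between a signal strand and the added template. This is a deliberately simplified picture, not the dynamic model used by DACCAD; the function name and the numerical values are hypothetical.

```python
import math


def free_signal(total_signal, template, k_d):
    """Free signal concentration (nM) at 1:1 binding equilibrium.

    Solves s + template * s / (k_d + s) = total_signal for s >= 0,
    i.e. the quadratic s**2 + (k_d + template - total_signal) * s
    - k_d * total_signal = 0.
    """
    b = k_d + template - total_signal
    return (-b + math.sqrt(b * b + 4.0 * k_d * total_signal)) / 2.0


# Hypothetical values: 10 nM of signal, dissociation constant of 5 nM.
unloaded = free_signal(10.0, 0.0, 5.0)   # no connecting template
loaded = free_signal(10.0, 10.0, 5.0)    # 10 nM of connecting template
print(unloaded, loaded)
```

With these illustrative numbers, adding the connecting template halves the free concentration of the strand it takes as input, which is the kind of perturbation that an autocatalyst tolerates poorly.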
3.4. Saturation-based effects
Kim et al. [19] presented a biochemical phenomenon named
winner-take-all that arises when there is competition for a
common resource in which the species that reproduces the
fastest using this resource will eliminate all competitors.
This phenomenon was leveraged by Genot et al. to perform
efficient computation [32]. In the DNA toolbox, enzymes
are perfect examples of such shared resources [33]. Because
the standard experimental parameters are set to avoid saturation (and thus competition) as much as possible, the activity
of one enzyme has to be reduced by an order of magnitude.
Figure 7. 1. Display panel. 2. Data panel. 3. Parameters area. Sets species and template-relative parameters. 4. Graph of the system. Selected nodes are displayed in
blue. 5. File menu. 6. Species menu. 7. Options. Sets general and enzymatic parameters. 8. Plot. Simulates and displays the behaviour of the system. (Online version
in colour.)
Figure 8. (a,b) The Oligator [3], simulated with first-order enzymatic activity (a) and with enzymatic saturation and competitive coupling (b). With the chosen
parameters, the amplitude and frequency of oscillation are modified, but the behaviour is otherwise unchanged. (c,d) Simulation of Padirac’s bistable circuit. Only
the time plot of the autocatalytic species is shown. The initial concentration of A is higher than that of B, forcing the system into the A state. If the concentration of
autocatalytic templates is too high (d), the system saturates the enzymes and loses its bistable properties. Performing some optimizations on the enzyme parameters gives us some insights into this phenomenon: it turns out that with Padirac et al.’s [4] parameters, the nicking enzyme saturates first, causing this loss of
bistability. (Online version in colour.)
Figure 9. Optimization of an oscillator. (a) At first, the system displays only damped oscillations. The user can describe the desired behaviour through the interface
(the dots represent the user-defined behaviour). (b) CMA-ES then performs some rounds of optimization until a result close enough is achieved. Note that the
concentration scale is different between the two panels. (Online version in colour.)
Figure 10. A two-bit counter made of two push–push memory circuits [4], with (a) the actual circuit of the system and (b) the time plot of the relevant species.
Switches come from an input of species 9 (clock), making this a non-autonomous system. Note that the spike of species 9 when switching the least significant bit is
different when going from 0 (species 3 high) to 1 (species 4 high) than when going from 1 to 0. This is due to the load effect: when species 3 is high, more
templates that can capture species 9 are inhibited, so the free concentration of species 9 is higher.
Figure 11. A frequency divider. The original oscillator (s1 to I4, a version of Montagne et al.’s Oligator [3] with a two-species delay) is connected to a push–push
memory (s5 to I11). The two species delay has the double effect of making the oscillator more robust to the connection load and slowing the oscillations to give
enough time to the bistable circuit to switch. To produce spike-like activation of the switch, an additional species (I12) is used to form an incoherent feed-forward
loop. Species s13 and s14 are used to initially inhibit the switch, giving enough time to the bistable switch to reach a valid state. Both state species from the bistable
circuit can then be used as oscillators of half the original frequency, in opposition of phase. (Online version in colour.)
This can be achieved experimentally by reducing the
quantity of enzyme introduced into the system. Each of
the three enzymes creates different behaviours when they are
set to be the bottleneck of the reaction system, leading to
interesting dynamics. For instance, very low exonuclease
means that a dominating species will receive most of the
degradation, thus protecting other species that would not
survive otherwise. Conversely, very low polymerase enables
the winner-takes-all effect described by Kim et al. as shown in
figure 12. Saturation can also be harnessed to generate various behaviours, such as oscillations (figure 13) in a system
that would be stable otherwise.
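The winner-takes-all mechanism can be sketched with a toy model, assuming two autocatalysts whose growth is divided by a common saturation term (competition for the polymerase) and which are degraded at first order. This is not the DACCAD equation set; the rate constants, the function name and the Euler integration are illustrative assumptions, with the larger growth constant standing in for a larger template concentration.

```python
def simulate_competition(a1=0.2, a2=0.1, decay=0.05, k_m=10.0,
                         n_steps=30000, dt=0.1):
    """Explicit Euler integration of two autocatalysts sharing one enzyme.

    ds_i/dt = s_i * (a_i / (1 + (s1 + s2) / k_m) - decay)
    All parameter values are hypothetical (time in min, concentrations in nM).
    """
    s1, s2 = 1.0, 1.0
    for _ in range(n_steps):
        saturation = 1.0 + (s1 + s2) / k_m   # shared load on the enzyme
        ds1 = s1 * (a1 / saturation - decay)
        ds2 = s2 * (a2 / saturation - decay)
        s1 += dt * ds1
        s2 += dt * ds2
    return s1, s2


s1_final, s2_final = simulate_competition()
print(s1_final, s2_final)
```

In this sketch the faster species settles at the concentration where its saturated growth balances degradation, k_m * (a1 / decay - 1). At that load the slower species has a negative net growth rate and disappears.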
Figure 13. (a) Saturation-based oscillations. In those systems, saturation works mostly at the polymerase level. Each network, having positive and negative feedbacks but lacking delay, produces damped oscillations (b). When they are put together in a system, one network starting loads the polymerase, reinforcing the decay
of the other. The two systems then tend to use the polymerase alternately and thus synchronize in opposite phases. Each of them feels an apparent oscillating
polymerase activity, which provides the additional nonlinearity necessary to compensate for the damping. Oscillations can be symmetrical if template concentrations
are identical (c), or asymmetric if there is a small difference in concentrations (d). (Online version in colour.)
3.5. Complex system: the Mastermind game
Finally, as a proof of concept of modular design, we
implemented a large system where repeating motifs were
connected together. For this purpose, we chose to implement
a version of Mastermind, a classic board game in which one
player decides a secret combination that the other player tries
to guess. We chose to have two possible symbols and a combination of length four. The design was done incrementally.
First, we created a guess validation module, combining a bistable (memory), an input system (signal amplification) and a
reporter. The response was optimized in a saturated environment using CMA-ES. We then combined four such modules
and optimized again to cover all input possibilities. The
whole system is shown in figure 14. The design strategy is
explained in the electronic supplementary material. A
sample game is shown in figure 15.
4. Conclusion
In this study, we presented a mathematical model for the simulation of systems designed with the DNA toolbox. We also
introduced DACCAD, software created to help design such
systems. DACCAD allows its user to quickly create new DNA
toolbox systems by using its intuitive graphical interface. Such
systems can then be refined at will by setting a large range of parameters to fit the user’s experimental conditions. Once the
design is complete, it is possible to move on to the actual DNA
Figure 12. (a) Winner-takes-all effect. Time trace of two simple autocatalysts, with (b) and without (c) saturation. Interaction is done only through competition for
the polymerase enzyme, other enzymes are set to work in first-order regime. In this case, the autocatalyst s1 has more template than s2, resulting in the complete
disappearance of the latter. (Online version in colour.)
Figure 14. Implementation of the Mastermind game. State bistable circuits are shown in blue, possible inputs in green, reporter species in yellow and lock species
in red.
Figure 15. Mastermind game. The state of a given position can be either A or B. Each attempt improves the guess until the correct answer, BABA, is found on the
fourth trial.
sequence design, using tools such as NUPACK [36] or DINAMelt [37,38] and the design rules presented in Baccouche et al.
[29]. While the value of some parameters, such as the release
slowdown owing to coaxial stacking, can be hard to predict,
the actual value can be measured through some basic experiments and then updated in DACCAD, allowing the quick
correction of a design. Such parameters can also be slightly
modified from their actual experimental value to encompass
more specific or sequence-dependent effects [54,55]. Finding
a way to fit such experimental parameters directly from
fluorescence data could be an important future application.
Note also that the very strength of DACCAD, i.e. its black-box
handling of kinetic complexities, brings along a limitation: it can only work with pure polymerase–nickase
activator–inhibitor systems, and other reactions, such as
hairpin formation from the predator–prey system [15], are
forbidden. However, handling of additional operations can
be performed through external software as DACCAD
models can be exported in the widely used SBML format.
In particular, it is possible to quickly generate the DNA toolbox part of a system and then move on to another software to
add operations. In that sense, DACCAD can be seen as a
graph-to-SBML compiler, allowing the algorithmic generation
of DNA toolbox systems (for example, systems trying to solve
instances of the 3-SAT problem [56]). Moreover, this generation is facilitated by its simple graph description file format.
We then showed how the software can be used to design
systems of increasing complexity. In particular, one can also
observe different behaviours, such as saturation-based oscillation or the winner-takes-all effect. Local optimization can
also be performed by using the state-of-the-art algorithm
CMA-ES. Moreover, the mathematical model can be easily
re-used with other design techniques, such as evolutionary
algorithms, which may help optimize systems following
multiple possible objectives [57], or find new design patterns
to create complex systems [51,52]. It would also be interesting
to investigate two-dimensional reaction–diffusion implementations of the DNA toolbox [16,17], which would prove
beneficial to develop smart materials or create particular
spatio-temporal patterns. To do so, DACCAD could be extended
to add diffusion terms to the current equations, and then export
them to be solved by fast off-the-shelf reaction–diffusion simulation software, such as READY (https://code.google.com/p/
reaction-diffusion/).
It is our hope that this software will speed up the creation of
novel networks as well as spread the usage of molecular programming to a broader audience. The mathematical model of
the DNA toolbox presented here might also give new theoretical insights on its power, such as whether it would be possible
to approximate arbitrary CRNs. The executable file and all
examples from this article can be found at http://www.yannick-rondelez.com/downloads/. The code can be obtained
by contacting the first author.
Acknowledgements. We thank Prof. Nicolas Bredeche, Dr Anthony
Genot, Dr Adrien Padirac and Alexandre Baccouche for useful
discussions, the anonymous reviewers and all the DACCAD beta testers, who provided useful insights into the organization of the
software and priceless bug reports.
Endnote
1 Java 6 is required. Minimum specifications are available online on Oracle’s website at http://www.oracle.com/technetwork/java/javase/config-417990.html.
References
1. Qian L, Winfree E. 2011 A simple DNA gate motif for synthesizing large-scale circuits. J. R. Soc. Interface 8, 1281–1297. (doi:10.1098/rsif.2010.0729)
2. Qian L, Winfree E. 2011 Scaling up digital circuit computation with DNA strand displacement cascades. Science 332, 1196–1201. (doi:10.1126/science.1200520)
3. Montagne K, Plasson R, Sakai Y, Fujii T, Rondelez Y. 2011 Programming an in vitro DNA oscillator using a molecular networking strategy. Mol. Syst. Biol. 7, 466. (doi:10.1038/msb.2010.120)
4. Padirac A, Fujii T, Rondelez Y. 2012 Bottom-up construction of in vitro switchable memories. Proc. Natl Acad. Sci. USA 109, E3212–E3220. (doi:10.1073/pnas.1212069109)
5. Wagner A, Fell DA. 2001 The small world inside large metabolic networks. Proc. R. Soc. Lond. B 268, 1803–1810. (doi:10.1098/rspb.2001.1711)
6. Soloveichik D, Seelig G, Winfree E. 2010 DNA as a universal substrate for chemical kinetics. Proc. Natl Acad. Sci. USA 107, 5393–5398. (doi:10.1073/pnas.0909380107)
7. Seelig G, Soloveichik D, Zhang DY, Winfree E. 2006 Enzyme-free nucleic acid logic circuits. Science 314, 1585–1588. (doi:10.1126/science.1132493)
8. Magnasco MO. 1997 Chemical kinetics is Turing universal. Phys. Rev. Lett. 78, 1190–1193. (doi:10.1103/PhysRevLett.78.1190)
9. Winfree E. 1998 Algorithmic self-assembly of DNA. Doctoral dissertation. California Institute of Technology.
10. Murray JD. 2002 Mathematical biology I: an introduction, 3rd edn. Berlin, Germany: Springer.
11. Branicky MS. 1998 Multiple Lyapunov functions and other analysis tools for switched and hybrid systems. IEEE Automat. Control 43, 475–482. (doi:10.1109/9.664150)
12. Tyson JJ, Chen KC, Novak B. 2003 Sniffers, buzzers, toggles and blinkers: dynamics of regulatory and signaling pathways in the cell. Curr. Opin. Cell Biol. 15, 221–231. (doi:10.1016/S0955-0674(03)00017-6)
13. Chiniforooshan E, Doty D, Kari L, Seki S. 2011 Scalable, time-responsive, digital, energy-efficient molecular circuits using DNA strand displacement. In DNA computing and molecular programming (eds Y Sakakibara, Y Mi). Lecture Notes in Computer Science, no. 6518, pp. 25–36. Berlin, Germany: Springer. (doi:10.1007/978-3-642-18305-8_3)
14. Genot AJ, Bath J, Turberfield AJ. 2011 Reversible logic circuits made of DNA. J. Am. Chem. Soc. 133, 20 080–20 083. (doi:10.1021/ja208497p)
15. Fujii T, Rondelez Y. 2012 Predator–prey molecular ecosystems. ACS Nano 7, 27–34. (doi:10.1021/nn3043572)
16. Padirac A, Fujii T, Estévez-Torres A, Rondelez Y. 2013 Spatial waves in synthetic biochemical networks. J. Am. Chem. Soc. 135, 14 586–14 592. (doi:10.1021/ja403584p)
17. Padirac A. 2012 Tailoring spatiotemporal dynamics with DNA circuits. Doctoral dissertation, University of Lyon.
18. Hasatani K, Leocmach M, Genot AJ, Estevez-Torres A, Fujii T, Rondelez Y. 2013 High-throughput observation of compartmentalized biochemical oscillators. Chem. Commun. 49, 8090–8092. (doi:10.1039/c3cc44323j)
19. Kim J, Hopfield JJ, Winfree E. 2004 Neural network computation by in vitro transcriptional circuits. Adv. Neural Inf. 17, 681–688.
20. Wong WW, Tsai TY, Liao JC. 2007 Single-cell zeroth-order protein degradation enhances the robustness of synthetic oscillator. Mol. Syst. Biol. 3, 130. (doi:10.1038/msb4100172)
21. Rondelez Y. 2012 Competition for catalytic resources alters biological network dynamics. Phys. Rev. Lett. 108, 018102. (doi:10.1103/PhysRevLett.108.018102)
22. Zhang DY, Winfree E. 2009 Control of DNA strand displacement kinetics using toehold exchange. J. Am. Chem. Soc. 131, 17 303–17 314. (doi:10.1021/ja906987s)
23. Kim J, Winfree E. 2011 Synthetic in vitro transcriptional oscillators. Mol. Syst. Biol. 7, 465. (doi:10.1038/msb.2010.119)
24. Zaikin AN, Zhabotinsky AM. 1970 Concentration wave propagation in two-dimensional liquid-phase self-oscillating system. Nature 225, 535–537. (doi:10.1038/225535b0)
25. Goodwin BC. 1965 Oscillatory behavior in enzymatic control processes. Adv. Enzyme Regul. 3, 425–437. (doi:10.1016/0065-2571(65)90067-1)
26. Elowitz MB, Leibler S. 2000 A synthetic oscillatory network of transcriptional regulators. Nature 403, 335–338. (doi:10.1038/35002125)
27. Barabási AL, Oltvai ZN. 2004 Network biology: understanding the cell’s functional organization. Nat. Rev. Genet. 5, 101–113. (doi:10.1038/nrg1272)
28. Hill AV. 1910 The possible effects of the aggregation of the molecules of haemoglobin on its dissociation curves. J. Physiol. 40, 4–7.
29. Baccouche A, Montagne K, Padirac A, Rondelez Y. Submitted. Dynamic DNA-toolbox reaction circuits: a walkthrough.
30. Lakin MR, Youssef S, Polo F, Emmott S, Phillips A. 2011 Visual DSD: a design and analysis tool for DNA strand displacement systems. Bioinformatics 27, 3211–3213. (doi:10.1093/bioinformatics/btr543)
31. Lakin MR, Parker D, Cardelli L, Kwiatkowska M, Phillips A. 2012 Design and analysis of DNA strand displacement devices using probabilistic model checking. J. R. Soc. Interface 9, 1470–1485. (doi:10.1098/rsif.2011.0800)
32. Genot AJ, Fujii T, Rondelez Y. 2012 Computing with competition in biochemical networks. Phys. Rev. Lett. 109, 208102. (doi:10.1103/PhysRevLett.109.208102)
33. Genot AJ, Fujii T, Rondelez Y. 2013 Scaling down DNA circuits with competitive neural networks. J. R. Soc. Interface 10, 20130212. (doi:10.1098/rsif.2013.0212)
34. Hucka M et al. 2003 The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics 19, 524–531. (doi:10.1093/bioinformatics/btg015)
35. Finney A, Hucka M. 2003 Systems biology markup language: level 2 and beyond. Biochem. Soc. Trans. 31, 1472–1473. (doi:10.1042/BST0311472)
36. Zadeh JN, Steenberg CD, Bois JS, Wolfe BR, Pierce MB, Khan AR, Dirks RM, Pierce NA. 2011 NUPACK: analysis and design of nucleic acid systems. J. Comput. Chem. 32, 170–173. (doi:10.1002/jcc.21596)
37. Markham NR, Zuker M. 2005 DINAMelt web server for nucleic acid melting prediction. Nucleic Acids Res. 33, W577–W581. (doi:10.1093/nar/gki591)
38. Markham NR, Zuker M. 2008 UNAFold: software for nucleic acid folding and hybridization. Methods Mol. Biol. 453, 3–31. (doi:10.1007/978-1-60327-429-6_1)
39. Hoops S et al. 2006 COPASI: a complex pathway simulator. Bioinformatics 22, 3067–3074. (doi:10.1093/bioinformatics/btl485)
40. Rothemund PW. 2006 Folding DNA to create nanoscale shapes and patterns. Nature 440, 297–302. (doi:10.1038/nature04586)
41. Wei B, Dai M, Yin P. 2012 Complex shapes self-assembled from single-stranded DNA tiles. Nature 485, 623–626. (doi:10.1038/nature11075)
42. Ke Y, Ong LL, Shih WM, Yin P. 2012 Three-dimensional structures self-assembled from DNA bricks. Science 338, 1177–1183. (doi:10.1126/science.1227268)
43. Douglas SM, Marblestone AH, Teerapittayanon S, Vazquez A, Church GM, Shih WM. 2009 Rapid prototyping of 3D DNA-origami shapes with caDNAno. Nucleic Acids Res. 37, 5001–5006. (doi:10.1093/nar/gkp436)
44. Castro CE, Kilchherr F, Kim D-N, Shiao EL, Wauer T, Wortmann P, Bathe M, Dietz H. 2011 A primer to scaffolded DNA origami. Nat. Methods 8, 221–229. (doi:10.1038/nmeth.1570)
45. Plasson R, Rondelez Y. 2013 Synthetic biochemical dynamic circuits. In Multiscale analysis and nonlinear dynamics: from genes to the brain, pp. 113–145. Weinheim, Germany: Wiley-VCH. (doi:10.1002/9783527671632.ch05)
46. Zhang DY, Turberfield AJ, Yurke B, Winfree E. 2007 Engineering entropy-driven reactions and networks catalyzed by DNA. Science 318, 1121–1125. (doi:10.1126/science.1148532)
47. O’Madadhain J, Fisher D, Smyth P, White S, Boey YB. 2005 Analysis and visualization of network data using JUNG. J. Stat. Softw. 10, 1–25.
48. Gilbert D. 2002 The JFreeChart class library, developer guide. Harpenden, UK: Object Refinery.
49. Hairer E, Nørsett SP, Wanner G. 1987 Solving ordinary differential equations I. Nonstiff problems. Berlin, Germany: Springer.
50. Hansen N, Ostermeier A. 1996 Adapting arbitrary normal mutation distributions in evolution strategies: the covariance matrix adaptation. In Proc. IEEE Int. Conf. on Evolutionary Computation, Nagoya, Japan, 20–22 May 1996, pp. 312–317. (doi:10.1109/ICEC.1996.542381)
51. Dinh QH et al. In press. An effective method for evolving reaction network in synthetic biochemical systems. IEEE Trans. Evolut. Comput.
52. Aubert N et al. 2013 Evolving cheating DNA networks: a case study with the rock-paper-scissors game. In Advances in artificial life, ECAL, vol. 12, pp. 1143–1150. Cambridge, MA: MIT Press.
53. Franco E, Friedrichs E, Kim J, Jungmann R, Murray R, Winfree E, Simmel FC. 2011 Timing molecular motion and production with a synthetic transcriptional clock. Proc. Natl Acad. Sci. USA 108, E784–E793. (doi:10.1073/pnas.1100060108)
54. Pyshnyi DV, Ivanova EM. 2004 The influence of nearest neighbours on the efficiency of coaxial stacking at contiguous stacking hybridization of oligodeoxyribonucleotides. Nucleos. Nucleot. Nucl. Acids 23, 1057–1064. (doi:10.1081/NCN-200026071)
55. Qian J, Ferguson TM, Shinde DN, Ramirez-Borrero AJ, Hintze A, Adami C, Niemz A. 2012 Sequence dependence of isothermal DNA amplification via EXPAR. Nucleic Acids Res. 40, e87. (doi:10.1093/nar/gks230)
56. Hagiya M, Kawamata I, Aubert N. 2013 Towards persistent molecular computers for molecular robots. In Proc. 19th Int. Conf. on DNA Computing and Molecular Programming, Tempe, AZ, USA, 22–27 September 2013.
57. Warmflash A, Francois P, Siggia ED. 2012 Pareto evolution of gene networks: an algorithm to optimize multiple fitness objectives. Phys. Biol. 9, 056001. (doi:10.1088/1478-3975/9/5/056001)
Computer Assisted Design for Scaling Up Systems based on
DNA Reaction Networks
Supplementary materials
Nathanaël Aubert1, Clément Mosca2, Teruo Fujii2, Masami Hagiya1, Yannick Rondelez2
1 Graduate School of Information Science and Technology, University of Tokyo, Japan
2 LIMMS/CNRS-IIS, Institute of Industrial Science, University of Tokyo, Japan
naubert, hagiya{@is.s.u-tokyo.ac.jp}, [email protected]
S1 In vitro implementation of the DNA toolbox
In Padirac et al.’s implementation, on which the default parameters of DACCAD are based,
signal strands are 11-mers, while inhibition is done by 15-mers. Inhibitor strands are made
of three parts: the first seven nucleotides are complementary to the end of the input strand,
the next six are complementary to the beginning of the output strand and the last two are
mismatched with the target template to prevent elongation by the polymerase. Note that it
could be possible to make the inhibitors occupy the target template in different locations, as
long as the nicking recognition/cutting sites are not produced. This inhibition strategy also
makes it hard to have inhibition of templates generating inhibitors, since the output of such
templates, being inhibitors themselves, will have a long toehold to displace their inhibitor. Using
specific (longer) inhibitors was not deemed a good solution, since those inhibitors, in turn, will
require even longer strands, and so on. Instead, it is possible to use an intermediate activation
of inhibitor (A generates B which generates the inhibitor) and inhibit the first step (A generates
B). See Baccouche et al. [29] for additional details.
S2 Default parameters
Association constants and strand-displacement speeds are based on Zhang et al.’s values [22].
Strand dissociation constants and Michaelis-Menten parameters for the enzymes are taken from
Padirac et al.’s measurements [4]. The values of enzyme activities are also taken from Padirac
et al.’s experiment, except for the nickase activity, for which the fitted value was taken instead.
The slowdown due to coaxial stacking is computed following a simple two-state model: when
the bases at the nick are stacked, the DNA complex is considered stable and, as such, does not
denature; when the bases are not stacked, the release happens at the normal speed
(actually faster, because there is no dangle stabilization). The reactions going from one state
to the other are considered at equilibrium, which allows us to use the equilibrium constant as
the slowdown ratio. Values for this ratio were trickier to obtain, as there is not only a strong sequence
dependence [S1], but this dependence is not even limited to the nearest neighbours [54]. To get
realistic values, we considered the opening/stacking energy described by Frank-Kamenetskii’s
group [S2], with salt and temperature (42 °C) corrections [S3] to match the experimental conditions
of the DNA toolbox. Since we only care about the relative increase in stability compared to
a single-stranded input or output present on the template, we have to reduce those values to
take into account the stability increase of dangles [S4] (those values were also corrected for the
temperature, and averaged in the same way as by Frank-Kamenetskii’s group [S2] to allow
comparison). This yields a range of values between 0.04 (25 times slowdown for a nicking site GC)
to approximately 1 (no slowdown for a nicking site TA). The various possibilities are listed in
the next table and are provided as a pop-up in DACCAD. The default value corresponds to a
five times slowdown, but can be changed to match the actual sequences used in a particular
experimental setting. Note that there are many different and sometimes contradictory measures
of coaxial stacking, so DACCAD can only provide a best effort approach.
Sequence at the nick (5′ to 3′)    Coaxial-stacking stability increase (kcal/mol)    Final stability increase    Equivalent slowdown factor
TA                                 -0.46                                             0.09                        1.2
CA = TG                            -0.85                                             -0.30                       0.62
CG                                 -1.21                                             -1.06                       0.19
AG = CT                            -1.36                                             -1.14                       0.16
AA = TT                            -1.41                                             -1.14                       0.16
AT                                 -1.64                                             -1.33                       0.12
GA = TC                            -1.73                                             -1.15                       0.16
GG = CC                            -1.74                                             -1.34                       0.12
GT = AC                            -2.11                                             -2.00                       0.04
GC                                 -2.47                                             -2.03                       0.04
Average                            -1.51                                             -1.15                       0.16
However, those values do not take into account the base pairs neighbouring the nicking site,
although a strong dependence has been observed by Pyshnyi and Ivanova [54].
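The last column of the table can be approximately recovered from the final stability increase with a Boltzmann factor at 42 °C. The sketch below assumes that the tabulated factor is exp(ΔG/RT), which is an interpretation of the two-state equilibrium model described above rather than a formula given in the text; the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3          # gas constant in kcal/(mol*K)
T_KELVIN = 42.0 + 273.15   # working temperature of the DNA toolbox (42 °C)


def release_ratio(delta_g):
    """Ratio applied to the release rate for a stability change delta_g (kcal/mol).

    Assumed relation: ratio = exp(delta_g / (R * T)), so a more negative
    (more stabilizing) delta_g gives a smaller ratio, i.e. a slower release.
    """
    return math.exp(delta_g / (R_KCAL * T_KELVIN))


# Final stability increases taken from the table (kcal/mol).
for label, delta_g in [("CA = TG", -0.30), ("GT = AC", -2.00), ("Average", -1.15)]:
    print(label, round(release_ratio(delta_g), 2))
```

Under this assumption the average value gives a ratio close to 0.16, i.e. roughly a sixfold slowdown, of the same order as the five times slowdown used as the default.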
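S3 Equations for species
\[
\begin{aligned}
\phi^{in}_{temp}(t) ={}& k_{duplex} K_s \left([\mathrm{temp}_{in}](t) + stack_{temp}\,[\mathrm{temp}_{both}](t)\right) \\
& + k_{duplex}\,[\mathrm{inhib}_{temp}](t)\,[\mathrm{temp}_{in}](t) \\
& - k_{duplex}\,[s](t)\left([\mathrm{temp}_{alone}](t) + [\mathrm{temp}_{out}](t) + \lambda_{in}\,[\mathrm{temp}_{inhib}](t)\right)
\end{aligned}
\tag{1}
\]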
All bracketed terms are concentrations. kduplex is the kinetic constant of association, that is, the
constant representing how fast two complementary DNA strands will form a double helix. It is
considered sequence independent, and its default value is based on Zhang’s measures [22]. Ks is
the ratio of the kinetic constant of the reverse reaction over kduplex and can easily be obtained
by experiment or nearest-neighbour model. stacktemp is the dimensionless factor representing
the increase in stability due to coaxial stacking between the input and output when they are
both attached to the template. λin , as well as λout in the following equations, represent the
dimensionless slowdown due to a signal strand invading an inhibited template. The two different
values are based on the size of the toehold left by the inhibitor on the relevant side of the
template. The first term represents the signal strands restored by detaching from a template.
The second term represents those restored by being displaced by an invading inhibitor. The last
term comes from the reaction opposite to that of the first term and represents signal strands
captured by templates.
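\[
\begin{aligned}
\phi^{out}_{temp}(t) ={}& k_{duplex} K_s \left([\mathrm{temp}_{out}](t) + stack_{temp}\,[\mathrm{temp}_{both}](t)\right) \\
& + k_{duplex}\,[\mathrm{inhib}_{temp}](t)\,[\mathrm{temp}_{out}](t) \\
& - k_{duplex}\,[s](t)\left([\mathrm{temp}_{alone}](t) + [\mathrm{temp}_{in}](t) + \lambda_{out}\,[\mathrm{temp}_{inhib}](t)\right) \\
& + pol_{displ}(t)\,[\mathrm{temp}_{both}](t)
\end{aligned}
\tag{2}
\]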
The first three terms mirror those of the previous equation. The additional term represents
output strands released by the displacement activity of the polymerase enzyme when it extends
a nicked duplex. poldispl represents the polymerase activity when displacing a substrate. When
this substrate is long enough (for instance an inhibitor), this activity is slowed down by an
additional factor, specific to the length of the substrate. If not, the increased affinity for the
polymerase is considered to be offset by the slowdown, so that poldispl ≈ pol. Note that the
polymerase activity pol only appears in the equations relative to the internal update of templates
(see next section).
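\[
\begin{aligned}
\phi_{inhib}(t) ={}& \alpha\, k_{duplex} K_i\,[\mathrm{temp}_{inhib}](t) \\
& - k_{duplex}\,[i](t)\left([\mathrm{temp}_{alone}](t) + [\mathrm{temp}_{in}](t) + [\mathrm{temp}_{out}](t)\right) \\
& + k_{duplex}\,[\mathrm{temp}_{inhib}](t)\left(\lambda_{in}\,[s_{in}](t) + \lambda_{out}\,[s_{out}](t)\right)
\end{aligned}
\tag{3}
\]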
Where α represents the fact that an inhibition strand is less stable on its target than on the
template that created it due to sequence mismatch (see Figure 1).
S4
Equation set for templates
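The equations for the update of a module $temp$ taking $s_{in}$ as input, $s_{out}$ as output and $s_{inhib}$
as inhibitor are as follows:
\[
\begin{aligned}
\frac{d[temp_{alone}]}{dt}(t) ={}& k_{duplex} \left(K_{in}\,[temp_{in}](t) + K_{out}\,[temp_{out}](t) + \alpha K_{inhib}\,[temp_{inhib}](t)\right) \\
&- k_{duplex}\,[temp_{alone}](t) \left([s_{in}](t) + [s_{out}](t) + [s_{inhib}](t)\right) \\
\frac{d[temp_{in}]}{dt}(t) ={}& k_{duplex}\,[s_{in}](t) \left([temp_{alone}](t) + \lambda_{in}\,[temp_{inhib}](t)\right) + k_{duplex}\, stack\, K_{out}\,[temp_{both}](t) \\
&- k_{duplex}\,[temp_{in}](t) \left(K_{in} + [s_{out}](t) + [s_{inhib}](t)\right) - pol(t)\,[temp_{in}](t) \\
\frac{d[temp_{out}]}{dt}(t) ={}& k_{duplex}\,[s_{out}](t) \left([temp_{alone}](t) + \lambda_{out}\,[temp_{inhib}](t)\right) + k_{duplex}\, stack\, K_{in}\,[temp_{both}](t) \\
&- k_{duplex}\,[temp_{out}](t) \left(K_{out} + [s_{in}](t) + [s_{inhib}](t)\right) \\
\frac{d[temp_{both}]}{dt}(t) ={}& k_{duplex} \left([s_{in}](t)\,[temp_{out}](t) + [s_{out}](t)\,[temp_{in}](t)\right) + nick(t)\,[temp_{ext}](t) \\
&- k_{duplex}\, stack\,[temp_{both}](t) \left(K_{in} + K_{out}\right) - pol_{displ}(t)\,[temp_{both}](t) \\
\frac{d[temp_{ext}]}{dt}(t) ={}& pol(t)\,[temp_{in}](t) + pol_{displ}(t)\,[temp_{both}](t) - nick(t)\,[temp_{ext}](t) \\
\frac{d[temp_{inhib}]}{dt}(t) ={}& k_{duplex}\,[s_{inhib}](t) \left([temp_{alone}](t) + [temp_{in}](t) + [temp_{out}](t)\right) \\
&- k_{duplex}\,[temp_{inhib}](t) \left(\alpha K_{inhib} + \lambda_{in}\,[s_{in}](t) + \lambda_{out}\,[s_{out}](t)\right)
\end{aligned}
\]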
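The template equations can be written as a single right-hand-side function, as in the sketch below. It is only an illustration: the function name, the dictionary keys and the parameter names are assumptions, and the enzyme activities are passed in as plain numbers. The release of a strand from the doubly occupied state is scaled by the stacking factor in both the gain and the loss terms, which is the choice under which the six derivatives sum to zero, so that the total template concentration is conserved.

```python
def template_rhs(temp, s_in, s_out, s_inhib, p):
    """Time derivatives of the six template states of one module.

    temp: dict with (hypothetical) keys 'alone', 'in', 'out', 'both', 'ext', 'inhib'.
    p: dict with k_duplex, K_in, K_out, K_inhib, alpha, stack,
       lam_in, lam_out and the enzyme activities pol, pol_displ, nick.
    """
    k = p["k_duplex"]
    alone, t_in, t_out = temp["alone"], temp["in"], temp["out"]
    both, ext, inhib = temp["both"], temp["ext"], temp["inhib"]
    return {
        "alone": k * (p["K_in"] * t_in + p["K_out"] * t_out
                      + p["alpha"] * p["K_inhib"] * inhib)
                 - k * alone * (s_in + s_out + s_inhib),
        "in": k * s_in * (alone + p["lam_in"] * inhib)
              + k * p["stack"] * p["K_out"] * both
              - k * t_in * (p["K_in"] + s_out + s_inhib)
              - p["pol"] * t_in,
        "out": k * s_out * (alone + p["lam_out"] * inhib)
               + k * p["stack"] * p["K_in"] * both
               - k * t_out * (p["K_out"] + s_in + s_inhib),
        "both": k * (s_in * t_out + s_out * t_in)
                + p["nick"] * ext
                - k * p["stack"] * both * (p["K_in"] + p["K_out"])
                - p["pol_displ"] * both,
        "ext": p["pol"] * t_in + p["pol_displ"] * both - p["nick"] * ext,
        "inhib": k * s_inhib * (alone + t_in + t_out)
                 - k * inhib * (p["alpha"] * p["K_inhib"]
                                + p["lam_in"] * s_in + p["lam_out"] * s_out),
    }
```

Checking that the returned values sum to zero for arbitrary concentrations is a convenient sanity test when transcribing such an equation set.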
S5 Equation set for enzymes
While the basic model does not take enzyme saturation into account, preferring first-order
activity, this assumption is not realistic for complex systems. Specifically, all modules
share the same three enzymes, which is expected to have an impact on their activity. The
usual way to model such a burden is to use a Michaelis-Menten term to quantify the enzymatic
activity. This term is further modified to take into account that multiple modules are competing
for those resources [21].
The exonuclease term becomes:
\[
exo_s(t) = \frac{V_{m,exo}}{K^{s}_{m,exo} \left(1 + \sum_{s' \in seq} \frac{[s'](t)}{K^{s'}_{m,exo}}\right)}
\tag{4}
\]
where $V_{m,exo}$ is the maximum theoretical rate and $K^{s}_{m,exo}$ the Michaelis constant for the
species $s$. Note that in our model $V_{m,exo}$ is independent of $s$ and $K^{s}_{m,exo}$ can only take one
of two values, depending on whether $s$ is an inhibition species or not.
The polymerase activity is separated into two terms: $pol$ for templates with input alone and
$pol_{displ}$ for the case where both input and output are present. We consider that the polymerase
does not interact noticeably with other states of the template.
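\[
\begin{aligned}
pol(t) &= \frac{V_{m,pol}}{K_{m,pol} \left(1 + \sum_{temp} \left(\frac{[temp_{in}](t)}{K_{m,pol}} + \frac{[temp_{both}](t)}{K_{m,displ}}\right)\right)} \\
pol_{displ}(t) &= \frac{V_{m,displ}}{K_{m,displ} \left(1 + \sum_{temp} \left(\frac{[temp_{in}](t)}{K_{m,pol}} + \frac{[temp_{both}](t)}{K_{m,displ}}\right)\right)}
\end{aligned}
\tag{5}
\]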
Note that $V_{m,displ}$ depends on the length of the output.
The nickase term is the simplest. We consider that only fully double stranded templates
can capture this enzyme, yielding:
\[
nick(t) = \frac{V_{m,nick}}{K_{m,nick} \left(1 + \sum_{temp} \frac{[temp_{ext}](t)}{K_{m,nick}}\right)}
\tag{6}
\]
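The three saturated activities of equations (4) to (6) share the same structure: a first-order rate $V_m / K_m$ divided by one plus the total load of competing substrates. The sketch below illustrates this; the function names and the data layout (dictionaries of concentrations, a list of template-state dictionaries) are assumptions made for the example.

```python
def exo_activity(V_m_exo, K_m_s, conc, K_m):
    """Equation (4): exonuclease activity seen by a species with constant K_m_s.

    conc and K_m map each sequence name to its concentration and to its
    Michaelis constant, respectively.
    """
    load = sum(conc[name] / K_m[name] for name in conc)
    return V_m_exo / (K_m_s * (1.0 + load))


def polymerase_activities(V_m_pol, V_m_displ, K_m_pol, K_m_displ, templates):
    """Equation (5): returns (pol, pol_displ) for a list of template states.

    Each template is a dict with at least the keys 'in' and 'both'.
    """
    load = sum(t["in"] / K_m_pol + t["both"] / K_m_displ for t in templates)
    pol = V_m_pol / (K_m_pol * (1.0 + load))
    pol_displ = V_m_displ / (K_m_displ * (1.0 + load))
    return pol, pol_displ


def nick_activity(V_m_nick, K_m_nick, templates):
    """Equation (6): only fully extended templates ('ext') load the nickase."""
    load = sum(t["ext"] / K_m_nick for t in templates)
    return V_m_nick / (K_m_nick * (1.0 + load))
```

With no substrate present the load is zero and each activity reduces to the first-order rate $V_m / K_m$ of the basic model.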
S6 Full equation set for an autocatalytic module
In the case of a simple autocatalytic module s → s, we get the following equations:
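\[
\begin{aligned}
\frac{d[s]}{dt}(t) ={}& k_{duplex} \cdot K_s \cdot \left([temp_{in}](t) + [temp_{out}](t) + 2\, stack \cdot [temp_{both}](t)\right) \\
&- k_{duplex} \cdot [s](t) \cdot \left(2\,[temp_{alone}](t) + [temp_{in}](t) + [temp_{out}](t)\right) \\
&+ pol(t) \cdot [temp_{both}](t) - exo_s(t) \cdot [s](t) \\
\frac{d[temp_{alone}]}{dt}(t) ={}& k_{duplex} \cdot K_s \cdot \left([temp_{in}](t) + [temp_{out}](t)\right) \\
&- 2\, k_{duplex} \cdot [s](t) \cdot [temp_{alone}](t) \\
\frac{d[temp_{in}]}{dt}(t) ={}& k_{duplex} \cdot \left([s](t) \cdot [temp_{alone}](t) + stack \cdot K_s \cdot [temp_{both}](t)\right) \\
&- k_{duplex} \cdot [temp_{in}](t) \cdot \left([s](t) + K_s\right) - pol(t) \cdot [temp_{in}](t) \\
\frac{d[temp_{out}]}{dt}(t) ={}& k_{duplex} \cdot \left([s](t) \cdot [temp_{alone}](t) + stack \cdot K_s \cdot [temp_{both}](t)\right) \\
&- k_{duplex} \cdot [temp_{out}](t) \cdot \left([s](t) + K_s\right) \\
\frac{d[temp_{both}]}{dt}(t) ={}& k_{duplex} \cdot [s](t) \cdot \left([temp_{in}](t) + [temp_{out}](t)\right) \\
&- 2\, stack \cdot k_{duplex} \cdot K_s \cdot [temp_{both}](t) \\
&+ nick(t) \cdot [temp_{ext}](t) - pol(t) \cdot [temp_{both}](t) \\
\frac{d[temp_{ext}]}{dt}(t) ={}& pol(t) \cdot \left([temp_{in}](t) + [temp_{both}](t)\right) - nick(t) \cdot [temp_{ext}](t)
\end{aligned}
\]
\[
\begin{aligned}
exo_s(t) &= \frac{V_{m,exo}}{K_{m,short} \cdot \left(1 + \frac{[s](t)}{K_{m,short}} + \frac{[temp_{alone}](t)}{K_{m,temp}}\right)} \\
pol(t) &= \frac{V_{m,pol}}{K_{m,pol} \cdot \left(1 + \frac{[temp_{in}](t)}{K_{m,pol}} + \frac{[temp_{both}](t)}{K_{m,displ}}\right)} \\
nick(t) &= \frac{V_{m,nick}}{K_{m,nick} + [temp_{ext}](t)}
\end{aligned}
\]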
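The equation set of this section can be integrated numerically, as in the following minimal sketch. It uses a hand-written fixed-step Runge-Kutta scheme rather than DACCAD's own integrator, and every numerical value in `PARAMS` is an illustrative placeholder, not a measured parameter. The right-hand side is written so that the total template concentration is conserved: the unbinding term of the free template carries $K_s$ and the extended template is produced from the input-bound and doubly-bound states, following the general template equations of Section S4.

```python
# Placeholder parameters (hypothetical values chosen only for illustration).
PARAMS = dict(k_duplex=0.06, K_s=30.0, stack=0.2,
              V_exo=300.0, K_short=440.0, K_temp=150.0,
              V_pol=1050.0, K_pol=80.0, K_displ=5.5,
              V_nick=80.0, K_nick=30.0)


def rhs(y, p):
    """Derivatives of [s, temp_alone, temp_in, temp_out, temp_both, temp_ext]."""
    s, alone, t_in, t_out, both, ext = y
    k, Ks, st = p["k_duplex"], p["K_s"], p["stack"]
    exo = p["V_exo"] / (p["K_short"] * (1 + s / p["K_short"] + alone / p["K_temp"]))
    pol = p["V_pol"] / (p["K_pol"] * (1 + t_in / p["K_pol"] + both / p["K_displ"]))
    nick = p["V_nick"] / (p["K_nick"] + ext)
    d_s = (k * Ks * (t_in + t_out + 2 * st * both)
           - k * s * (2 * alone + t_in + t_out) + pol * both - exo * s)
    d_alone = k * Ks * (t_in + t_out) - 2 * k * s * alone
    d_in = k * (s * alone + st * Ks * both) - k * t_in * (s + Ks) - pol * t_in
    d_out = k * (s * alone + st * Ks * both) - k * t_out * (s + Ks)
    d_both = k * s * (t_in + t_out) - 2 * st * k * Ks * both + nick * ext - pol * both
    d_ext = pol * (t_in + both) - nick * ext
    return [d_s, d_alone, d_in, d_out, d_both, d_ext]


def rk4_step(y, dt, p):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return [b + h * m for b, m in zip(base, slope)]
    k1 = rhs(y, p)
    k2 = rhs(shifted(y, k1, dt / 2), p)
    k3 = rhs(shifted(y, k2, dt / 2), p)
    k4 = rhs(shifted(y, k3, dt), p)
    return [v + dt / 6 * (a + 2 * b + 2 * c + d)
            for v, a, b, c, d in zip(y, k1, k2, k3, k4)]


def simulate(y0, p, dt, n_steps):
    """Integrate from y0 for n_steps steps of size dt and return the final state."""
    y = list(y0)
    for _ in range(n_steps):
        y = rk4_step(y, dt, p)
    return y
```

A typical call would start from a small seed of signal and free template only, for instance `simulate([1.0, 10.0, 0.0, 0.0, 0.0, 0.0], PARAMS, dt=0.01, n_steps=500)`. The sum of the five template states should stay equal to its initial value up to rounding error.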
Figure S1: Graph animation, relative to the time trace. The simulated system is a three-switch
oscillator [17] without enzymatic saturation.
S7 Dynamic graph display
Linking the structure and behaviour of a toolbox system can be challenging, especially with large
systems. For this reason, we implemented a module displaying a dynamic representation of
the network. This module provides an animated version of the graph, in which the radii
of the nodes are updated to be proportional to the logarithm of the concentration of their
corresponding species at a particular time. We chose a logarithmic representation to be
able to see variations of concentration over multiple orders of magnitude. The edges are also
updated, using a colour gradient to represent the concentration of the corresponding templates
occupied by a signal strand (i.e. a completely inhibited connection would appear in grey).
Furthermore, to make the graph easier to read, it is drawn using a standard spring-electrical
graph drawing algorithm [S6] (each node is given a repulsive force for all other nodes and an
attractive force which applies specifically to the nodes with which it is connected). This has
proven to be satisfactory in terms of display results and computing time.
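The spring-electrical idea described above can be sketched in a few lines. This is not DACCAD's drawing code: it is a basic version of the model, assuming a repulsive force of magnitude $C K^2 / d$ between every pair of nodes and an attractive force of magnitude $d^2 / K$ along each edge, with a fixed cooling schedule. The function name, the default constants and the handling of self-loops are choices made for this example.

```python
import math
import random


def spring_electrical_layout(nodes, edges, k=1.0, c=0.2,
                             iterations=200, step=0.1, cooling=0.98, seed=0):
    """Return a dict mapping each node to an [x, y] position."""
    nodes = list(nodes)
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for idx, u in enumerate(nodes):
            for v in nodes[idx + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = c * k * k / d
                force[u][0] += f * dx / d
                force[u][1] += f * dy / d
                force[v][0] -= f * dx / d
                force[v][1] -= f * dy / d
        # Attraction along edges; self-loops (autocatalysis) exert no force.
        for u, v in edges:
            if u == v:
                continue
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            force[u][0] += f * dx / d
            force[u][1] += f * dy / d
            force[v][0] -= f * dx / d
            force[v][1] -= f * dy / d
        # Move each node a fixed distance along its net force, then cool down.
        for n in nodes:
            fx, fy = force[n]
            norm = math.hypot(fx, fy)
            if norm > 0.0:
                pos[n][0] += step * fx / norm
                pos[n][1] += step * fy / norm
        step *= cooling
    return pos
```

For a three-node cycle such as the oscillator discussed below, `spring_electrical_layout(["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")])` returns one position per node. The all-pairs repulsion loop is quadratic in the number of nodes, which is acceptable for the network sizes considered here.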
Enzymatic saturation is also displayed, giving the user a better understanding of the underlying
mechanisms at work at a given time.
As an example, we show some variations of the graph of Padirac’s switch oscillator, a
system that uses three autocatalytic modules inhibiting each other in a circular pattern (Figure
S1). Interestingly, the oscillations occur in the reverse order of inhibition, a fact that might be
counter-intuitive. This property can be hard to find by only reading the time plot, if one is
not looking for it. On the other hand, the activation order is obvious when watching the graph
animation.
Figure S2: Bistability domains of Padirac et al.’s experiments [4] (top left) and of simulations
with various template concentration bonuses: 15% (bottom left), 20% (top right) and 25% (bottom right).
Without bonus, α always wins over the considered range, which could be explained by an incorrect
coaxial stacking slowdown.
S8 Comparison of stability diagram between Padirac et al.’s experiment and DACCAD simulation
Simulation of the bistable circuit was done using the parameters measured by Padirac et al. [4].
Coaxial stacking values were set using the nucleotide sequences described in their work and the
table from Section S2. Finally, based on their discussion of a possible difference in polymerase
activity and on the experimental values of Qian et al. [S5], templates relative to β were given
a concentration bonus, due to their expected higher affinity with the polymerase. Simulation
results for different bonuses are shown alongside Padirac et al.’s experimental results (Figure S2).
Without template correction, the system is still bistable, but the attraction basin of the high β
state is too small to appear on the Figure.
S9 Design of the Mastermind system
In the Mastermind game, one player decides on a secret combination and a second player tries
to guess it in the fewest attempts. A combination is a sequence of symbols (usually
colours) in which order is important. After each guess from the second player, the first player
announces how many symbols are correctly placed (correct guess) and how many of the
remaining symbols appear in the combination, but at a different position (misplaced guess). For
instance, if the combination was red-green-blue-black, a guess of red-red-red-red would give
one correct, zero misplaced and a guess of black-blue-green-red would yield zero correct, four
misplaced. While the original game has multiple possibilities for each position, we
simplify it to use only two different symbols (A and B). For this reason, we also do not give the
number of misplaced guesses, as it would make the game too easy.
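The scoring rule described above can be made concrete with a short sketch. The function names are hypothetical and the code only models the rules of the game, not the molecular implementation.

```python
from collections import Counter


def score_guess(secret, guess):
    """Return (correct, misplaced) for two sequences of equal length."""
    pairs = list(zip(secret, guess))
    correct = sum(s == g for s, g in pairs)
    # Only positions that were not matched exactly can count as misplaced.
    rest_secret = Counter(s for s, g in pairs if s != g)
    rest_guess = Counter(g for s, g in pairs if s != g)
    misplaced = sum((rest_secret & rest_guess).values())
    return correct, misplaced


def simplified_feedback(secret, guess):
    """Two-symbol variant: only the number of correct positions is reported."""
    return score_guess(secret, guess)[0]
```

With the combination red-green-blue-black, `score_guess` gives `(1, 0)` for red-red-red-red and `(0, 4)` for black-blue-green-red, matching the example in the text. In the two-symbol variant, `simplified_feedback("ABBA", "ABAB")` returns 2.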
We designed a simple version of this game using DACCAD. In our version, we use four
bistable motifs to store the secret code decided by one player. The opponent may inject one
of two possible species per position. If the correct species is selected, a downstream species,
which we call the “reporter” species, is activated. If all “reporter” species are present, then the
concentration of the “lock” species falls to zero, indicating victory. To determine the number of
correct guesses, the concentration of the reporter species may be measured through fluorescence
in a real implementation. If the same fluorophore is used for all of them, the only relevant
information is the level of fluorescence, which is equivalent to the number of correct guesses. It
is also possible to use the drop of the “lock” species as a hint, although this drop is not
linear with the number of correct results, which might be confusing.
The design of this system was done step by step, starting from a single bistable circuit (blue
subsystem in Figure 14). This circuit was then complemented to prevent the expression of the
reporter species (yellow in Figure 14). Then the two possible inputs were added. DACCAD was
not only useful to adjust the various template concentrations until the behaviour was correct,
it also showed that input strands were not surviving long enough to have an impact on the
system. This prompted the development of the specialised input module (green in Figure 14).
The indirect activation generates a large amount of intermediary species (see Figure 3), which in
turn has enough strength to allow the creation of reporter species (yellow in Figure 14). Finally,
a direct activation is also added to counter the delay induced through the indirect activation.
Saturations were then turned on and template concentrations were optimized using CMA-ES
until reaching a satisfactory response to input. To do so, the reporter circuit was simulated for
about 2000 minutes. To get a saturation similar to that of the final system, dummy autocatalytic
and activation templates were also included in the system. Inputs were added at 500 and 1500
minutes: first the input corresponding to the state of the bistable circuit, then the one
corresponding to the other possible state. The long delay was used to ensure that the steady
state is reached at the time of input. The optimization target was then set to have a large
response when the correct input is added and no response otherwise. Once the optimization was
done, template concentrations were adjusted so that the system would be symmetrical. We
also rounded those concentrations to the nearest integer, to take into account the maximum
precision of the experimental material.
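The optimization target and the post-processing step can be illustrated as follows. The exact objective used with CMA-ES is not given in the text, so the score below (rise of the reporter after the correct input minus its rise after the wrong input) is only one plausible form. The function names, the sampled-trace representation and the `mirrored_pairs` argument are likewise assumptions, and the optimizer itself is not shown.

```python
import math


def response_fitness(times, reporter, t_correct=500.0, t_wrong=1500.0, t_end=2000.0):
    """Score a simulated reporter trace: high when only the correct input responds."""
    def rise(t_start, t_stop):
        before = [r for t, r in zip(times, reporter) if t < t_start]
        baseline = before[-1] if before else reporter[0]
        window = [r for t, r in zip(times, reporter) if t_start <= t < t_stop]
        return max(window) - baseline if window else 0.0

    wanted = rise(t_correct, t_wrong)            # response to the correct input
    unwanted = max(0.0, rise(t_wrong, t_end))    # response to the wrong input
    return wanted - unwanted


def symmetrize_and_round(conc, mirrored_pairs):
    """Average the concentrations of mirrored templates, then round to integers."""
    out = dict(conc)
    for a, b in mirrored_pairs:
        mean = 0.5 * (conc[a] + conc[b])
        out[a] = out[b] = mean
    return {name: int(math.floor(value + 0.5)) for name, value in out.items()}
```

A trace that responds only after the first input obtains a score equal to the height of that response, while a trace that responds equally to both inputs scores zero. An optimizer maximizing this score would therefore favour circuits that ignore the wrong input.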
Once the complete reporter circuit was considered correct, it was duplicated and the two
resulting circuits were connected, forwarding information about the correctness of the guess.
Then this whole system was duplicated once more and connected through an inhibition of the
lock species (red in Figure 14). Concentrations of the connecting templates were then optimized
so that the behaviour of the system, that is the levels of the lock species and reporters, reflected
roughly linearly the number of correct guesses. Restricting the optimization to those templates
is necessary to limit the number of evolved parameters. This is important due to the simulation
time required for the complete system, in all possible configurations. Particular attention was
given to the case with only two correct guesses, which can result in two possible configurations:
either both guesses are on the same “side” (that is on two directly connected bistable circuits)
or on opposite sides. In the first attempts at creating the system, there was a strong asymmetry
between those two cases, due to the fact that the first case would saturate the template going
from the connection species (like s42 in Figure 14) to the final inhibitor, so that inhibition was
stronger when distributed. Through optimization, the concentrations were adjusted so that
the production of both connection species would always stay in the linear response zone of the
templates connecting them to the final inhibitor. Finally, we edited the concentrations again
following the guidelines from the previous paragraph.
As mentioned, this system does not include misplaced guesses, since they would provide too
much information and spoil the fun in this version of the game. In a more complex version
of this game, one way to encode those alternative reporters is to use an indirect activation
going from all state species, inhibited if that position was correctly guessed, to a state-specific
global reporter, representing how many positions have yet to be matched. The last activation
is inhibited by compatible guesses that do not match locally. The difference between the
concentrations of the two species would then represent the number of misplaced matches.
References
[S1] Vasiliskov, V. A., Prokopenko, D. V., & Mirzabekov, A. D. (2001). Parallel multiplex thermodynamic analysis of coaxial base stacking in DNA duplexes by oligodeoxyribonucleotide
microchips. Nucleic Acids Research, 29(11), 2303-2313.
[S2] Yakovchuk, P., Protozanova, E., & Frank-Kamenetskii, M. D. (2006). Base-stacking and
base-pairing contributions into thermal stability of the DNA double helix. Nucleic Acids
Research, 34(2), 564-574.
[S3] Protozanova, E., Yakovchuk, P., & Frank-Kamenetskii, M. D. (2004). Stacked-unstacked
equilibrium at the nick site of DNA. Journal of Molecular Biology, 342(3), 775-785.
[S4] Bommarito, S., Peyret, N., & SantaLucia Jr, J. (2000). Thermodynamic parameters for
DNA sequences with dangling ends. Nucleic Acids Research, 28(9), 1929-1934.
[S5] Qian, J., Ferguson, T. M., Shinde, D. N., Ramírez-Borrero, A. J., Hintze, A., Adami, C.,
& Niemz, A. (2012). Sequence dependence of isothermal DNA amplification via EXPAR.
Nucleic Acids Research, 40(11), e87.
[S6] Hu, Y. (2005). Efficient, high-quality force-directed graph drawing. Mathematica Journal,
10(1), 37-71.