Generating Probability Distributions using Multivalued

Generating Probability Distributions using
Multivalued Stochastic Relay Circuits
David Lee
Jehoshua Bruck
Dept. of Electrical Engineering
Stanford University
Palo Alto, CA 94305
Email: [email protected]
Dept. of Electrical Engineering
California Institute of Technology
Pasadena, CA 91125
Email: [email protected]
Abstract—The problem of random number generation dates
back to von Neumann’s work in 1951. Since then, many algorithms have been developed for generating unbiased bits from
complex correlated sources as well as for generating arbitrary
distributions from unbiased bits. An equally interesting, but less
studied aspect is the structural component of random number
generation as opposed to the algorithmic aspect. That is, given
a network structure imposed by nature or physical devices,
how can we build networks that generate arbitrary probability
distributions in an optimal way?
In this paper, we study the generation of arbitrary probability
distributions in multivalued relay circuits, a generalization in
which relays can take on any of N states and the logical
‘and’ and ‘or’ are replaced with ‘min’ and ‘max’ respectively.
Previous work was done on two-state relays. We generalize these
results, describing a duality property and networks that generate
arbitrary rational probability distributions. We prove that these
networks are robust to errors and design a universal probability
generator which takes input bits and outputs arbitrary binary
probability distributions.
I. I NTRODUCTION
A. Motivation
Many biological systems involve stochasticity. Examples
of these include gene expression[1], chemical reactions[2],
and neuron signaling[3]. However, despite the stochasticity,
often deemed as noise, they are still capable of achieving
functionalities that artificial systems cannot yet compete with.
Motivated by the idea that stochasticity is an important enabler of biological computation, we tackle a simpler question
as a stepping stone: Can we design networks that generate
stochasticity in a systematic way?
B. Structural Aspects of Random Number Generation
This question is strongly connected to an important thread
of work in computer science on random number generation. In
1951, Von Neumann [4] studied this problem in the context of
generating a fair coin toss from a biased coin. Knuth and Yao
[5] studied the reverse problem of generating arbitrary distributions from unbiased bits (fair coins), which was extended by
Han and Hoshi [6] to generating arbitrary distributions from
biased distributions. These primarily focus on an algorithmic
perspective to random number generation.
In a biological system (or other physical device), we do not
have the full generality of the algorithmic approach to generating probability distributions. Randomness arises in specific
areas that can only propagate according to the structure of how
components are composed in the given network. The study
of random number generation under these constraints can be
greatly beneficial to understanding natural stochastic systems
as well as to designing devices for random number generation.
In this paper, we will analyze random number generation
in the context of multivalued relay circuits. Here, we focus
on developing the framework of probability generation in
multivalued relays and briefly summarize further results. These
results are expanded on in the longer version[7].
C. Deterministic Relays
A deterministic relay switch is a 2 terminal object which can
be connected (closed) by a wire or left open. The state of the
switch, which can be either 0 or 1, describes an open or closed
state respectively. These states are complements of each other.
That is, if switch x is in state 0, then the complement x̄ is in
state 1 and vice versa. When multiple switches are composed
together, these networks are known as relay circuits. One of
these, a series composition, is formed when two switches
are put end to end. A parallel composition is formed when
two switches are composed so that the beginning terminals
are connected together and the end terminals are connected
together.
Shannon showed that the series and parallel compositions
can be represented by the boolean ‘and’ and ‘or’ operations[8].
If switches x and y are composed in series to form z1 , then
z1 will only be closed if both x and y are closed. On the other
hand, if switches x and y are composed in parallel to form
z2 , then z2 will be closed if either x or y are closed. We will
denote the series composition of x and y by x ∗ y or simply
xy and the parallel composition by x + y. This notation will
be preserved for further generalizations of the relay circuits.
D. Stochastic Relays
Recently, Wilhelm and Bruck introduced the notion of a
stochastic relay circuit[9]. These circuits are a generalization
of Shannon’s relay circuits; instead of having deterministic
relay switches that are in either the open or closed state,
Fig. 1. Simple Examples. As stated, 12 is short for ( 21 , 12 ). (a) A single
1
-pswitch. (b) Putting (a) in parallel with a 21 -pswitch gives 34 . (c) Putting
2
(b) in series with a 12 -pswitch gives 38 . (d) Putting (c) in parallel with a
1
-pswitch gives 11
.
2
16
stochastic relay circuits can exist in either state with a specified
probability. If a stochastic relay switch x, called a pswitch, has
probability p of being closed and probability 1 − p of being
open, this distribution is represented by a vector v = (1−p, p)
where vi corresponds to the probability of x being in state
i. We say that x realizes (1 − p, p) or simply x realizes
p. If pswitches x and y, which realize probabilities p and
q respectively, are composed in series, the new composition
will realize pq. If they are composed in parallel, the new
composition will realize p + q − pq (see Figure 1).
One of the primary questions dealt with in their work was
the generation of probability distributions using a limited number of base switch types, known as a switch set. For example,
if the switch set S = { 21 }, then relay circuits built with this
switch set can only use pswitches with the distribution ( 12 , 21 ).
They proved that using the switch set S = { 21 }, all probability
distributions 2an could be realized with at most n pswitches.
Continuing, many more results were proved not only in realizing other probability distributions[9][10], but also in circuit
constructions such as a Universal Probability Generator[9] and
in robustness[11] and expressibility[10] properties of these
circuits.
E. Multivalued Stochastic Relays
In order to study generation of non-bernoulli distributions, it
is necessary to generalize Shannon’s relays to a larger number
of states. Multivalued logics have been studied as early as
in 1921 by Post[12] and followed up on by Webb[13] and
others[14]. The work presented in this paper concerns one
generalization of two-state relay circuits to multivalued relay
circuits.
A multivalued switch is a relay switch that can be in any
of n states: 0, 1, 2, ..., n-1. We define the complement of a
switch to be n− 1 − i, where i is the state of the switch. Series
and parallel compositions are redefined to ‘min’ and ‘max’,
respectively, rather than the boolean ‘and’ and ‘or’. This means
that when switches x and y are composed in series, the overall
circuit is in state min(x, y) and when they are composed in
parallel, the overall circuit is in state max(x, y) (see Figure 2a
and b). Multivalued relays have a physical interpretation as the
Fig. 2. 3-state Examples. We use × to represent a deterministic switch and
⊗ to represent a pswitch. (a) A comparison of the truth tables for 2-state and
3-state relay switches. (b) A simple 3-state deterministic example. Evaluating
series connections using min and parallel connections as max, we find that
this circuit is in state 1. (c) A simple 3-state stochastic example. On the top,
we put a ( 12 , 0, 12 ) pswitch in parallel with a deterministic switch in state 1.
Then, with 21 probability we get max(0, 1) and with 12 probability we get
max(2, 1), which yields a (0, 21 , 12 ) circuit. Similarly, we can calculate the
bottom circuit to realize distribution ( 21 , 14 , 41 ).
timings of a network of events. Max-min functions have been
studied as a way of reasoning about discrete event systems[15].
We will, however, only briefly survey applications in this short
version.
Now that we have defined multivalued relays, we can also
reason about probability distributions over the n states of
a multivalued stochastic relay. If a 3-state pswitch x has
probability p0 of being in state 0, p1 of being in state 1, and
p2 of being in state 2, we represent the distribution with the
vector v = (p0 , p1 , p2 ) and say x realizes (p0 , p1 , p2 ). If the
distribution is in the form (1 − p, 0, ..., 0, p), we will shorten
this to simply p if the number of states can be inferred from
the context (see Figure 2c).
We present the following results in our paper:
1) A duality property (Section II).
2) Networks for generating binary probability distributions
(Sections III and IV).
3) Summary of additional results: constructions for rational
distributions, proofs of robustness to errors, universal
probability generation, and applications to neural circuits
and DNA computing (Section V).
II. D UALITY
It is important to characterize properties of multivalued
circuits. One well-known property is duality, which plays a
role in resistor networks, deterministic and two-state stochastic
relay circuits. We show that a similar duality concept exists
for multivalued circuits. Define the dual state of i as the state
n − 1 − i; the dual distribution of v as the distribution v̄ where
v̄i = vn−1−i ; the dual switch of x as the switch x̄ that realizes
the dual distribution of x; and the dual circuit of C as the
circuit C̄ that realizes the dual distribution of C.
Theorem 1 (Duality Theorem): Given a stochastic seriesparallel circuit C, we can construct C̄ by replacing all the
switches in C with their dual switches and by replacing series
Fig. 3. 3-state Duality Example. Remember that the 3-state 12 -pswitch is
the shorthand for the ( 21 , 0, 21 ) pswitch. (a) A two element duality example.
(b) A duality example on a larger circuit. Note that since the pswitches
are symmetric distributions (duals of themselves), only the compositions are
changed.
connections with parallel connections and vice versa. (see
Figure 3).
Proof: This is shown using induction on series-parallel
connections.
Base Case: The dual of a single pswitch with distribution
(p0 , p1 , ..., pn−1 ) is (pn−1 , ..., p1 , p0 ), which trivially satisfies
the theorem.
Inductive Step: Suppose a circuit C with distribution (p0 , p1 , ..., pn−1 ) and a circuit C ′ with distribution
(q0 , q1 , ..., qn−1 ) satisfy the theorem, i.e. the distribution
of C̄ is (pn−1 , ..., p1 , p0 ) and the distribution of C̄ ′ is
(qn−1 , ..., q1 , q0 ). To prove the theorem, it is sufficient for us
to show that CC ′ is the dual of C̄ + C̄ ′ .
Let Cs = CC ′ and Cp = C̄ + C̄ ′ . Then Cs =
(c0 , c1 , ..., cn−1 ) and Cp = (d0 , d1 , ..., dn−1 ) where
X
P r(C = i)P r(C ′ = j)
ck =
min(i,j)=k
=
X
pi qj
min(i,j)=k
dk =
X
P r(C̄ = i)P r(C̄ ′ = j)
max(i,j)=k
=
X
max(i,j)=k
=
pn−1−i qn−1−j
X
pn−1−i qn−1−j
min(n−1−i,n−1−j)=n−1−k
=
X
pi′ qj ′
min(i′ ,j ′ )=n−1−k
= cn−1−k .
We see that d0 = cn−1 , d1 = cn−2 , ..., dn−1 = c0 ,
demonstrating that Cs is the dual of Cp .
III. R EALIZING B INARY 3- STATE D ISTRIBUTIONS
We can now ask questions about generating probability
distributions with stochastic relay circuits. These include:
What are the possible distributions that can be realized? What
is the smallest set of basic switching elements necessary to
realize these distributions? Are there efficient constructions to
realize these distributions? We will begin by demonstrating
how to generate distributions on 3 states.
In writing out algorithms and circuit constructions we will
use the notation introduced earlier for series and parallel
connections, i.e. xy and x + y will denote series and parallel
connections respectively. We will use this notation loosely, i.e.,
( 12 , 0, 21 )( 12 , 12 , 0) represents a circuit formed by composing a
( 12 , 0, 21 ) pswitch in series with a ( 12 , 12 , 0) pswitch.
Lemma 1 (3-state sp composition rules): Given
p = (p0 , p1 , p2 ) and q = (q0 , q1 , q2 ), let x = pq and
y = p + q. Then,
x0 = p0 q0 + p0 q1 + p0 q2 + p1 q0 + p2 q0
= 1 − (1 − p0 )(1 − q0 )
x1 = p1 q1 + p1 q2 + p2 q1
x2 = p2 q2
y0 = p0 q0
y1 = p1 q1 + p1 q0 + p0 q1
y2 = p2 q2 + p2 q1 + p2 q0 + p1 q2 + p0 q2
= 1 − (1 − p2 )(1 − q2 )
Proof: It follows from enumerating all 32 switch combinations.
Theorem 2 (Binary 3-state Distributions): Using
the
switch set S = { 21 } and the deterministic switches, we can
realize all 3-state distributions of the form ( 2an , 2bn , 2cn ) using
at most 2n − 1 pswitches with the following recursive circuit
construction (see Figure 4 for an example):
if p ∈ S then
return p
else if 2an > 21 then
n−1
b
c
return ( 12 , 0, 21 )( a−2
2n−1 , 2n−1 , 2n−1 )
a+b
1
else if 2n > 2 then
n−1
n−1
c
a
, 2 2n−1−a , 0) + ( 12 , 0, 12 )(0, 2 2n−1−c , 2n−1
)
return ( 2n−1
a+b+c
1
else if 2n > 2 then
n−1
1
b
1
a
, 2n−1
, c−2
return ( 2n−1
2n−1 ) + ( 2 , 0, 2 )
end if
Proof: For any distribution
Pk p = (p0 , p1 , p2 ), there exists
some smallest k such that i=0 pi > 12 , which correspond to
the 3 recursive cases enumerated above. We can verify that
for each of these cases:
1) The decompositions obey the 3-state composition rules.
2) The switches are valid (non-negative probabilities and
sum to 1)
Since each algorithm’s decomposition uses switches of type
a
b
c
( 2n−1
, 2n−1
, 2n−1
), then we will eventually have n = 0,
corresponding to a deterministic switch; at this point the
algorithm terminates and has successfully constructed any
( 2an , 2bn , 2cn ).
We will now prove that we use at most 2n−1 pswitches for
all n ≥ 1. Define fn as the maximum number of pswitches
Fig. 4. Realizing 3-state Distribution. (a) We want to realize ( 85 , 41 , 18 ).
This falls under the first case. (b) Now we have an unconstructed distribution
that falls under the second case. (c) There are two unconstructed distributions.
The top one falls under case 3 and the bottom falls under case 2. (Note that
case 2 is equivalent to case 1 for p0 = 12 ). (d) The final circuit.
used in the construction of any distribution ( 2an , 2bn , 2cn ). Then,
f1 = 1
(1)
fn = max(fn−1 + 1, 2(n − 1) + 1, fn−1 + 1)
= max(fn−1 + 1, 2n − 1)
(2)
where (1) is shown trivially, and (2) is derived from the 3
cases of the algorithm. Note that we are using a previous
result[9] that 2 state distributions of form ( 2an , 2bn ) use at
most n switches, and also assuming here that our algorithm
generalizes the previous algorithm; that is, distributions of
the form ( 2an , 2bn , 0) and its permutations also use at most
n switches. For now we will leave this unproved, but it can
be shown that this is true.
It is useful at this time to provide some intuition regarding
the algorithm. We can view the original distribution as a series
of blocks dividing up the interval [0, 1] (see Figure 5). By
applying the algorithm, we are separating this larger interval
into smaller intervals [0, 21 ] and [ 12 , 1], cutting any blocks on
that boundary. When this separation occurs, the total size of the
block is decreased (namely, in half), and so the probabilities
representing those intervals change - these probabilities are
precisely those of the algorithm. Namely, we can rewrite the
three cases of the algorithm:
( 2an , 2bn , 2cn ) =
n−1
b
c
Case 1: (1, 0, 0) + ( 12 , 0, 21 )( a−2
2n−1 , 2n−1 , 2n−1 )
Fig. 5. Visualization of 3-state Algorithm. We want to realize ( 58 , 14 , 18 ).
A ⊗ symbol marks where the cuts occur which correspond to the use of one
pswitch. (a) We start with one interval split into blocks corresponding to the
pswitch probabilities. (b) When we cut [0, 1] in half, we get two intervals; the
blocks are cut and the probabilities change according to the new interval size.
(c) The second application of the algorithm. (d) When we are left with single
block-interval pairs, we know we are done since the deterministic switch is
in our switch set.
realize all N-state distributions of the form ( 2xn0 , 2xn1 , ..., xN2n−1 )
using at most (N − 1)(n − ⌈log2 N ⌉) + 2⌈log2 N ⌉ − 1 pswitches
with the following algorithm:
if p ∈ S then
return p
else if 2xn0 > 21 then
n−1
x1
−1
, 2n−1
, ..., x2Nn−1
)
return 21 ( x02−2
n−1
x0 +x1
1
else if 2n > 2 then
n−1
n−1
x0
−x0
x2
1 −2
return ( 2n−1
, 2 2n−1
, ...) + 21 (0, x0 +x2n−1
, 2n−1
, ...)
else if ... then
...
N −1
> 21 then
else if x0 +x1 +...+x
2n
−2n−1
x0
x1
return ( 2n−1 , 2n−1 , ..., xN −1
) + 21
2n−1
end if
Proof: The proof is similar to the 3-state case, so we will
not detail it.
n−1
n−1
a
c
Case 2: ( 2n−1
, 2 2n−1−a , 0) + ( 21 , 0, 21 )(0, 2 2n−1−c , 2n−1
)
n−1
1
a
b
1
Case 3: ( 2n−1
, 2n−1
, c−2
2n−1 ) + ( 2 , 0, 2 )(0, 0, 1)
IV. R EALIZING B INARY N - STATE D ISTRIBUTIONS
We are now ready to continue with our algorithm for N
states. Intuitively, we can describe the algorithm for N states
in the same way asPfor 3 states. We first find the smallest
k
index k for which i=0 pi > 12 . Then based on the index
k, we can decompose our distribution in a way corresponding
to the interval-block visualization; the only difference is that
each interval can now have up to N blocks instead of just 3.
Theorem 3 (Binary N-state Distributions): Using
the
switch set S = { 21 } and the deterministic switches, we can
V. F URTHER W ORK
In the following sections, we will introduce other results
without details. These include: 1) generating rational probability distributions, 2) robustness on circuit error, 3) and a
circuit that can generate all binary probability distributions.
Finally, we will briefly discuss applications.
A. Realizing Rational Distributions
Given the previous results, a natural question arises: Can
we also generate probability distributions over non-binary
fractions? This question was studied for the 2-state case by
Wilhelm, Zhou, and Bruck[9][10] for distributions of the
form ( qan , qbn ) using the switch set S = { q1 , 2q , ..., q−1
q }. In
their work, they demonstrated algorithms for realizing these
distributions for any q that is a multiple of 2 or 3. In addition,
they proved that for any prime q greater than 3, no circuit
exists that can generate all ( qan , qbn ) using the given switch
set.
We were able to demonstrate two distinct algorithms for
generating non-binary distributions over N -states. For one of
these algorithms, we prove that if there is a two-state algorithm
for realizing any distribution of the form ( qxn0 , qxn1 ), then we can
also realize any distribution of the form ( qxn0 , qxn1 , ..., xNqn−1 )
by using the 21 -switch. The idea is to reduce our N -state
distribution into at most N − 1 two-state distributions and
then apply the two-state algorithm. In our second algorithm,
we prove that we can realize all rational distributions with an
algorithm that reduces the denominator in a similar method to
the binary fraction algorithm.
B. Robustness
Another important question is whether we can construct
probabilities robustly. That is, if there is an ǫ error on the
base pswitches used, will the error on circuit constructions be
bounded or grow uncontrollably? We were able to demonstrate
that using our construction algorithms, the error on any constructed circuit, regardless of size, is bounded by a constant
factor of ǫ. Specifically, define an active state as a state with
non-zero probability, i.e. ( 14 , 0, 43 ) has active states 0 and 2.
Then, if we allow at most ǫ error on the active states of
the pswitches used in the circuit synthesis, the overall circuit
will realize a distribution in which the errors on each state
probability will be bounded by kǫ, k = log2 ⌈N − 1⌉ + 2.
C. Universal Probability Generator
Up till now, we have only looked at circuits with set
switches and described the expressibility of distributions that
can be constructed. A next question is: What about circuits
that implement stochastic functions? How can we reason about
circuits that implement different probability distributions given
deterministic inputs. Wilhelm and Bruck[9] demonstrated a
circuit that can output arbitrary 2-state probabilities as a
function of deterministic inputs.
We were able to demonstrate an analogous circuit for
N > 2. Specifically, given a number of states N and the
power of the denominator n, we can create a circuit that will
generate every distribution of the form ( 2xn0 , 2xn1 , ..., xN2n−1 ) in
a systematic way. In our construction, we use a group of
n + 1 input switches that encode 2xn0 in binary, a second group
1
encoding x02+x
in binary, etc...
n
D. Discussion on Applications
We believe that our circuit constructions and related algorithms are interesting results in the context of the robust
stochastic computation in biological and natural systems. One
example of where our specific max/min state structure may be
useful is in understanding temporal or order coding in neurons.
We can represent the discrete time (or order) a neuron fires
as the state of a relay circuit. Then the circuit Cs that is the
series connection of relays x and y represents the state of an
output neuron which has a high resting potential and two input
neurons (x and y). Since it has a high resting potential, then if
either neurons x or y fire, then our output neuron will exceed
the threshold potential, firing at time min(x, y). Using similar
reasoning, we can represent an output neuron with low resting
potential as a parallel connection.
There are a number of directions for extending this work to
better model and understand biological circuits. These include
studying joint distributions, stochastic functions, probability
distributions on different state structures, and stochasticity
in compositions. With further understanding of biological
circuits, this can also aid in building artificial molecular
circuits, such as in DNA computing[16]. In addition, stochastic
circuits may be useful in modeling other stochastic networks
in engineering.
ACKNOWLEDGMENT
The authors would like to thank Dan Wilhelm, Hongchao
Zhou, and Ho-lin Chen for helpful discussions. This work
was supported in part by the Molecular Programming Project
funded by the NSF Expeditions in Computing Program under
grant CCF-0832824, the Caltech SURF program, and the
Aerospace Corporation.
R EFERENCES
[1] M. Elowitz, A. Levine, E. Siggia, P. Swain. Stochastic gene expression
in a single cell. Science 297:183-186 (2002).
[2] D. Soloveichik, M. Cook, E. Winfree, J. Bruck. Computation with finite
stochastic chemical reaction networks. Natural Computing 7(4):615–633
(2008).
[3] E. Schneidman, B. Freedman, I. Segev. Ion channel stochasticity may
be critical in determining the reliability and precision of spike timing.
Neural Comp. 10:1679-1703 (1998).
[4] J. von Neumann. Various techniques used in connection with random
digits. Applied Math Series 12:36-38 (1951).
[5] D. Knuth, A. Yao. The complexity of nonuniform random number
generation. Algorithms and Complexity: New Directions and Recent
Results 357-428 (1976).
[6] T. Han, M. Hoshi. Interval algorithm for random number generation.
IEEE Trans. on Information Theory 43(2):599-611 (1997).
[7] D. Lee, J. Bruck. Generating probability distributions using
multivalued stochastic relay circuits. Caltech Paradise ETR at
http://paradise.caltech.edu/etr.html
[8] C. Shannon. A symbolic analysis of relay and switching circuits. Trans.
AIEE 57:713723 (1938).
[9] D. Wilhelm, J. Bruck. Stochastic switching circuit synthesis. Proc. IEEE
ISIT (2008).
[10] H. Zhou, J. Bruck. On the expressibility of stochastic switching circuits.
Proc. IEEE ISIT (2009).
[11] P. Loh, H. Zhou, J. Bruck. The robustness of stochastic switching
networks. Proc. IEEE ISIT (2009).
[12] E. Post. Introduction to a general theory of elementary propositions.
American Journal of Mathematics 43(3):163185 (1921).
[13] D. Webb. Many-valued logics. Caltech Ph.D. Dissertation, Advisor Eric
Bell (1936).
[14] D. Rine. Computer science and multiple-valued logic: theory and
applications. North Holland (1984).
[15] J. Gunawardena. Min-max functions. Discrete Event Dynamic Systems:
Theory and Applications 4(4):377407 (1994).
[16] G. Seelig, D. Soloveichik, D. Zhang, E. Winfree. Enzyme-free nucleic
acid logic circuits. Science 314:15851588 (2002).