A General Model for Memory-Based Finite-State Machines

IEEE TRANSACTIONS ON COMPUTERS, VOL. C-36, NO. 2, FEBRUARY 1987

LEE D. CORAOR, MEMBER, IEEE, PAUL T. HULINA, MEMBER, IEEE, AND ORLANDO A. MOREAN, MEMBER, IEEE
Abstract-A general model of a memory-based finite-state machine architecture is introduced, the $2^k$-decision machine ($2^k$-D). The classical ($2^n$-D) and binary decision (2-D) architectures are shown to be special cases of the $2^k$-D architecture. The equivalence among the $2^k$-D solutions for different values of k follows from the sequentialization principle, which is stated in the paper. A cost measure is defined in terms of memory size, and a procedure is presented to determine the $2^k$-D architecture which offers the minimum cost, given that the speed is equal to one state transition per clock cycle. It is shown that this architecture is not in general the minimum cost solution when the speed of the circuit is not a critical design factor. Also discussed are the optimization problems to be solved when a minimum cost $2^k$-D architecture is desired. In addition, bounds on the cost of the $2^k$-D architecture are calculated in terms of simple measures taken from the mathematical model describing the behavior of the state machine. The results of this research are particularly attractive when LSI and VLSI technologies are considered. State machines constitute fundamental blocks in systems to be fabricated as integrated circuits. Memory-based implementations offer a short design time, regularity of structure, and expandability.
Index Terms-Conditional transition, decision diagrams, digital system design, finite-state machines, memory-based implementation, optimization, sequentialization.
I. INTRODUCTION
THE solutions to most problems in science and engineering
can be expressed in terms of algorithms which operate on
data sets. Once an algorithm has been devised, we can design a
machine able to implement it. However, the nature of the
problem usually imposes restrictions on the implementation,
e.g. resources to be used as dictated by the kind of
calculations, data types, speed limitations, etc. The machine
can, in general, be decomposed into a control unit and a
processing unit as shown in Fig. 1.
The processing unit comprises the set of resources operating
on the data. The sequence of the operations performed by
these resources is controlled by signals being generated in the
control unit. Decisions are made in the control unit using
signals fed back from the processing unit. Consequently, the
control unit behaves as a state machine.
Fig. 1. General machine decomposition (control unit and processing unit).

Manuscript received October 29, 1985; revised December 3, 1985.
The authors are with the Department of Electrical Engineering, Pennsylvania State University, University Park, PA 16802.
IEEE Log Number 8611720.

If the machine model in Fig. 1 has to deal with discrete signals, then both the control and the operational portion can be implemented as digital systems. The field of digital systems encompasses the study of the transformations of discrete signals, the physical devices designed to perform specific transformations, and the use of these devices as building blocks to construct larger systems. The input-output behavior
of a digital system can be described either explicitly by means
of tables, or discrete functions, or implicitly by means of
algebraic structures, algorithms, or computation schemes [5].
The design of the processing unit in the machine model of
Fig. 1 is application dependent and therefore its structure can
not be defined beforehand. The control unit, however,
behaves as a state machine and therefore it has a digital system
counterpart, i.e., the sequential circuit.
Due to their overwhelming importance in digital system
design, this paper will concentrate on synchronous finite-state
machines and will simply refer to them as state machines.
State machine models have been developed to describe the
behavior of a state machine. The models most commonly used
are: the finite-state automaton [7], [9], the state table, and the
state diagram. A particular model is selected according to the
needs of the digital system designer. For example, some
models are more convenient when optimization must be
performed, while others offer a better understanding of the
algorithm to be implemented by the state machine.
Once an appropriate description of the machine has been
obtained, we can proceed with an implementation. State
machine implementations can be broadly classified as gate
oriented or memory based.
Gate oriented implementations are customized for the
application and therefore represent the approach which maximizes the speed of the circuit. However, this type of
implementation is quite inflexible and changes in the behavior
of the machine usually entail major redesign efforts. In
addition, gate-oriented designs generally lead to irregular
structures, which are not very attractive when the machine
forms part of a system to be constructed as an integrated
circuit. To obtain regularity in structure, programmable logic
arrays (PLA) are often used in gate-oriented designs [12].
Although the logic structure which is implemented is still
irregular, the PLA structure is regular. With a PLA implementation, many gates are not used since 100 percent efficiency is not obtainable. In addition, since excitation equations must be generated for this type of design, a significant initial design effort is required.
0018-9340/87/0200-0175$01.00 © 1987 IEEE
Memory-based implementations are slower but very flexible in the sense that changes in the machine behavior are
usually accomplished by reprogramming of the memory, with
no hardware alterations whatsoever. In addition, they lead to
very regular structures and do not require the generation of the
excitation equations, which make memory-based implementations particularly attractive in the realm of integrated circuit
technology. These advantages offer a strong incentive for the
digital system designer to concentrate his efforts on memory-based implementations of finite-state machines.
The classical memory-based implementation ($2^n$-D) of a state machine with n input variables is given in Fig. 2.
In the implementation of Fig. 2, $A_0, A_1, \ldots, A_{2^n-1}$ represent the $2^n$ possible next states for the machine corresponding to the $2^n$ input combinations, s is the present state, $s^+$ is the next state, x is the input variable vector, and z is the output vector. If a Mealy model of the state machine is being implemented, then we need an extra multiplexer to select the proper output vector.
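The lookup performed by the architecture of Fig. 2 can be sketched in a few lines of Python for a Moore-type machine. This is only an illustration: the memory contents (a two-state machine with two input variables), the representation of a memory word as a Python tuple, and the names `MEMORY`, `step`, and `run` are assumptions made for the example.

```python
# Minimal sketch of a classical (2^n-D) memory-based Moore machine.
# Each memory word holds the output z and the 2^n candidate next states
# A_0 .. A_{2^n - 1}; the input vector x selects one of them.
# The contents below describe a hypothetical 2-state machine with n = 2.

N_INPUTS = 2

# MEMORY[s] = (z, [A_0, A_1, A_2, A_3]), addressed by the present state s.
MEMORY = {
    0: (0, [0, 1, 1, 0]),
    1: (1, [0, 0, 1, 1]),
}


def step(state, x):
    """One clock cycle: read the word at `state` and select A_x."""
    if not 0 <= x < 2 ** N_INPUTS:
        raise ValueError("input vector out of range")
    z, next_states = MEMORY[state]
    return next_states[x], z


def run(state, inputs):
    """Apply a sequence of input vectors; return the trace and final state."""
    trace = []
    for x in inputs:
        next_state, z = step(state, x)
        trace.append((state, z))  # (present state, its Moore output)
        state = next_state
    return trace, state
```

Each call to `step` corresponds to one memory access, so every state transition completes in one clock cycle, while each word must store all $2^n$ next-state fields.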
The maximum speed of the machine of Fig. 2 is mainly
determined by the memory access time. The cost and
complexity of the implementation are proportional to the
memory size. The number of memory elements required for a
realization is directly related to the number of states. Consequently, the minimization of the number of states does in many
cases reduce the complexity and cost of the realization.
Several techniques have been developed to eliminate redundant states, i.e., states whose functions can be accomplished
by other states [9]. The cost of the implementation is
particularly important when the machine is to be part of a
system to be fabricated as an integrated circuit (IC).
Current technology, to an increasing degree, dictates the
fabrication of more complex systems on a single IC. State
machines are fundamental building blocks in the control of
flow and processing of information inside the IC. This fact
motivates the study of alternative memory-based implementations of state machines, aimed at satisfying the specific time
constraints of the control tasks at minimum cost. An attempt in this direction is the memory-based binary decision (BD or 2-D) architecture proposed by Clare [4], whose block diagram is given in Fig. 3. In the architecture of Fig. 3, C is a field in the memory word which selects the input variable to be tested at present state s, $A_0$ is the next state if the tested variable is 0, and $A_1$ is the next state if the tested variable is 1.
Binary decision theory principles can be traced back to the
work of Lee in 1959 [11]. Since then, many researchers have
applied these principles to sequentialize the processes in the
control unit of a digital system [3], [15], [17]. The 2-D approach is an alternative implementation of a state machine which, through the sequential analysis of the input variables, offers the possibility of a reduction in the memory requirements and, consequently, in the cost of the implementation. This reduction comes, of course, at the expense of speed.
Speed is reduced because extra states must be added between
the original states of the machine, since only one input can be
tested at a time.
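The speed penalty can be seen in a small Python sketch of a 2-D machine. The word layout `(C, A0, A1, z)` follows the fields described for Fig. 3, but the machine contents, the state names, and the helper names `clock` and `transition` are hypothetical, and the output value stored in the added states is an arbitrary placeholder.

```python
# Sketch of a binary-decision (2-D) machine: each word is (C, A0, A1, z),
# where C names the single input variable tested in that state.
# "S0" and "S1" are original states; "T0" and "T1" are extra states added
# so that a second variable can be tested. All contents are hypothetical.

WORDS = {
    "S0": ("x1", "T0", "T1", 0),
    "T0": ("x0", "S0", "S1", 0),
    "T1": ("x0", "S1", "S0", 0),
    "S1": ("x1", "S0", "S1", 1),
}
ORIGINAL_STATES = {"S0", "S1"}


def clock(state, inputs):
    """One clock cycle: test the variable selected by C, follow A0 or A1."""
    tested, a0, a1, _z = WORDS[state]
    return a1 if inputs[tested] else a0


def transition(state, inputs):
    """Advance to the next original state; return it with the cycles used."""
    cycles = 0
    while True:
        state = clock(state, inputs)
        cycles += 1
        if state in ORIGINAL_STATES:
            return state, cycles
```

With these contents, a transition out of `S0` needs two clock cycles because both variables must be examined, whereas a transition out of `S1` needs only one.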
Fig. 3. Binary decision (2-D) architecture.
The $2^n$-D and 2-D memory-based implementations can be considered as instances of a more general class of realizations, the $2^k$-D architectures ($1 \le k \le n$). For a given finite-state machine we can determine its 2-D, $2^2$-D, ..., $2^n$-D implementations. The equivalence of the different implementations can be established in terms of the "sequentialization principle," which is stated later on.
In Section II of the paper several models of a finite-state machine are reviewed and a number of important definitions are provided. In Section III the $2^k$-D architecture is introduced and two implementations of this architecture are presented. In addition, a cost analysis of the $2^k$-D ($1 \le k \le n$) architecture is presented in which cost equations are developed. Based on the cost measure, it is demonstrated that for certain problems a $2^k$-D machine with $k < n$ can operate at the same speed as the $2^n$-D machine but at reduced cost. In Section IV the optimization problems to be solved, when the minimum cost $2^k$-D architecture is sought and the speed is not a factor in the design, are presented. In addition, bounds on the cost of the $2^k$-D architecture are calculated based on simple measures taken from the mathematical model describing the behavior of the state machine.
II. STATE MACHINE MODELS AND DEFINITIONS
In order to provide some necessary background material,
models for a finite state machine along with definitions and
notation used in the paper will now be presented.
Definition: A finite-state automaton is an algebraic structure

$$\mathcal{A} = (\Sigma, Z, S, \Delta, \Lambda)$$

where
$\Sigma = (X_1, X_2, \ldots, X_N)$ is the input alphabet,
$Z = (Z_1, Z_2, \ldots, Z_P)$ is the output alphabet,
$S = (S_1, S_2, \ldots, S_M)$ is the set of internal states of the automaton,
$\Delta = (\delta_1, \delta_2, \ldots, \delta_M)$ is the set of next state or transition functions $\delta_i: S \times \Sigma \to S$,
$\Lambda = (\lambda_1, \lambda_2, \ldots, \lambda_M)$ is the set of output functions $\lambda_i: S \times \Sigma \to Z$.

Fig. 4. State diagram of Example 1.
The elements of the set of internal states S can be represented by a binary code designated as the state variable vector or state vector, and denoted as s. The dimension of s, denoted by m, is the minimum number of state variables needed to uniquely identify an element of S. Therefore $m = \lceil \log_2 M \rceil$, where $\lceil \log_2 M \rceil$ is the smallest integer greater than or equal to $\log_2 M$.
Similarly, we can represent the elements of the input alphabet $\Sigma$ and the output alphabet Z by binary codes. These codes will be referred to as the input variable vector or input vector x, and output variable vector or output vector z, respectively. Their dimensions, denoted by n and p, are given by $n = \lceil \log_2 N \rceil$ and $p = \lceil \log_2 P \rceil$.
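As a quick check of these dimensions, the ceiling of the base-2 logarithm can be computed with integer arithmetic. The helper name `code_dimension` and the alphabet sizes in the usage lines are assumptions chosen only for illustration.

```python
def code_dimension(count):
    """Smallest number of binary variables that can label `count` symbols.

    Equals ceil(log2(count)) for count >= 1, computed without floating
    point: (count - 1) needs exactly that many bits.
    """
    if count < 1:
        raise ValueError("an alphabet or state set needs at least one element")
    return (count - 1).bit_length()


# Hypothetical machine with M = 5 states, N = 8 input symbols, P = 3 outputs.
m = code_dimension(5)
n = code_dimension(8)
p = code_dimension(3)
```

For these sizes the sketch gives m = 3, n = 3, and p = 2.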
Definition: The state table is an explicit representation of the behavior of a state machine. It consists of a matrix with M rows and N columns. The rows correspond to the elements of the state set S and the columns to the elements of the input alphabet $\Sigma$. The entry appearing at the intersection of row i and column j is the element $(\delta_i(X_j), \lambda_i(X_j))$ of $S \times Z$, referred to as the total state.
Definition: The state diagram is an explicit representation of the behavior of a state machine. It contains M nodes, which correspond to the elements of S. There is a directed arc from node i to node k, labelled $X_a/Z_b$, which indicates that if the machine is in state $S_i$ and the input combination is $X_a$, then the next state of the machine is $S_k$ and the corresponding output combination is $Z_b$.
The transition between nodes in the state diagram is determined by the value of the input vector. We can perform a parallel analysis of the input variables ($2^n$-D) while the machine is in the present state, or carry out a sequential analysis of the input variables as in the 2-D case. The latter, however, implies the expansion of the state diagram to include those nodes where the testing of the input variables is performed.
Definition: A link state $T_j$ of a machine is an added state where the intermediate steps in the evaluation of the input variable vector take place.
In order to avoid any ambiguity, an original state of the machine will simply be referred to as state $S_i$.
Let $Q_i$ denote the number of link states which can be reached from a state $S_i$ of the machine. Let Q be the total number of link states of the machine. Q is given by

$$Q = \sum_{i=1}^{M} Q_i. \tag{1}$$
Example 1: Consider the state machine described by the state diagram of Fig. 4.
In Fig. 4 the label of each directed arc corresponds to $x_1x_0/z$. The input variables $x_1$ and $x_0$ are tested in parallel to determine the next state and output of the machine. If the input variables are tested sequentially (it is assumed that $x_1$ is tested first), the state diagram of Fig. 5 is obtained.

Fig. 5. Alternative state diagram of Example 1.

In the state diagram of Fig. 5 there are 2 states $S_0$, $S_1$, and 2 link states $T_0$, $T_1$. $Q_0$ and $Q_1$ are 2 and 0, respectively, and therefore Q = 2. Note that if the machine is in state $S_1$, only $x_1$ has to be tested to determine the next state and output of the state machine. However, if the machine is in state $S_0$ the knowledge of the value of $x_1$ is not sufficient to determine the next state and output of the machine. The generalization of these facts leads us to the following concepts.
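Definition: An input variable $x_k$ is a control variable of a state transition function $\delta_i(x)$ if and only if

$$\delta_i(x_0, x_1, \ldots, x_{k-1}, 0, x_{k+1}, \ldots, x_{n-1}) \ne \delta_i(x_0, x_1, \ldots, x_{k-1}, 1, x_{k+1}, \ldots, x_{n-1})$$

for at least one assignment of the remaining input variables.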
Definition: The control vector $c_{\delta_i}$ of the state transition function $\delta_i(x)$ is a vector whose components are the control variables of $\delta_i(x)$. Consequently, $c_{\delta_i} \subseteq x$ and $\delta_i(x) = \delta_i(c_{\delta_i})$.
Denote the dimension of $c_{\delta_i}$ by $\mu_i$.
Definition: An input variable $x_k$ is a control variable of an output function $\lambda_i(x)$ if and only if

$$\lambda_i(x_0, x_1, \ldots, x_{k-1}, 0, x_{k+1}, \ldots, x_{n-1}) \ne \lambda_i(x_0, x_1, \ldots, x_{k-1}, 1, x_{k+1}, \ldots, x_{n-1}).$$
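Definition: The control vector $c_{\lambda_j}$ of the output function $\lambda_j(x)$ is a vector whose components are the control variables of $\lambda_j(x)$. Consequently $c_{\lambda_j} \subseteq x$ and $\lambda_j(x) = \lambda_j(c_{\lambda_j})$. Denote the dimension of $c_{\lambda_j}$ by $\nu_j$.
For the machine of Example 1 the control vectors are given by

$$c_{\lambda_0} = c_{\delta_0} = \{x\} = \{x_1 x_0\}$$
$$c_{\lambda_1} = c_{\delta_1} = \{x_1\}.$$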
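The definition of a control variable translates directly into an exhaustive check over the input combinations. The Python sketch below is illustrative only: the function name `control_vector` is hypothetical, and the two transition functions are an assumed encoding of Example 1 with the input passed as the tuple `(x0, x1)`.

```python
from itertools import product


def control_vector(func, n):
    """Indices k for which toggling x_k changes `func` for some assignment."""
    controls = []
    for k in range(n):
        for x in product((0, 1), repeat=n):
            if x[k] == 0:
                toggled = x[:k] + (1,) + x[k + 1:]
                if func(x) != func(toggled):
                    controls.append(k)
                    break  # one witness assignment is enough
    return controls


# Assumed next-state functions of Example 1; x = (x0, x1).
def delta_0(x):
    return "S0" if x[0] == x[1] else "S1"


def delta_1(x):
    return "S1" if x[1] else "S0"


C_DELTA_0 = control_vector(delta_0, 2)
C_DELTA_1 = control_vector(delta_1, 2)
```

Here `C_DELTA_0` contains both indices and `C_DELTA_1` contains only the index of $x_1$, matching the control vectors listed for the example. The exhaustive search visits all $2^n$ input combinations per variable, which is acceptable for the small n typical of a single state.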
We will now introduce a state machine model which makes
explicit use of the control vector concept.
Definition: A conditional transition $CT_i$ is an explicit representation of the transition and output mappings associated with the machine state $S_i$ expressed as

$$CT_i: \quad \begin{array}{lll} f_\alpha(x_c) & S_\alpha & Z_\alpha \\ f_\beta(x_c) & S_\beta & Z_\beta \\ \vdots & \vdots & \vdots \\ f_\Omega(x_c) & S_\Omega & Z_\Omega \end{array} \tag{2}$$

where $S_i \in S$ and $Z_i \in Z$ for $i = \alpha, \beta, \ldots, \Omega$; $x_c = c_{\delta_i} \oplus c_{\lambda_i}$, with $c_{\delta_i}$ and $c_{\lambda_i}$ being the control vectors of $\delta_i(x)$ and $\lambda_i(x)$, respectively, where $\oplus$ is the concatenation operation; and $f_\alpha, f_\beta, \ldots, f_\Omega$ are disjunctions of the control variable combinations so that

$$f_i \cdot f_j = 0 \quad \text{for } i \ne j$$

and

$$\bigvee_j f_j = 1,$$

$(S_i Z_i) \ne (S_j Z_j)$ if $i \ne j$, and $(S_i Z_i) = (S_j Z_j)$ if and only if $S_i = S_j$ and $Z_i = Z_j$.
Definition: The antecedents of a conditional transition $CT_i$ are the elements of the set of functions $\{f_\alpha, f_\beta, \ldots, f_\Omega\}$ whose evaluation determines the next state and output combination of the state machine.
Definition: The consequents of a conditional transition $CT_i$ are the elements of the set of next state and output combination pairs $\{(S_\alpha Z_\alpha), (S_\beta Z_\beta), \ldots, (S_\Omega Z_\Omega)\}$.
Definition: The conditional mapping function of a conditional transition $CT_i$, denoted by $CMF_i$, is the relation between the consequents and antecedents of $CT_i$ expressed as a Boolean function, i.e.,

$$CMF_i(x_c) = (\delta_i(x_c), \lambda_i(x_c)) \tag{3}$$
$$CMF_i(x_c) = f_\alpha(x_c) \cdot (S_\alpha Z_\alpha) + f_\beta(x_c) \cdot (S_\beta Z_\beta) + \cdots + f_\Omega(x_c) \cdot (S_\Omega Z_\Omega).$$

The interpretation of (2) is as follows: Let $S_i$ denote the present state of the machine. The antecedents associated with the conditional transition $CT_i$ are evaluated. Only one of them returns the value 1 because they are mutually disjoint, e.g., $f_\beta(x) = 1$. The next state and output combination of the machine form the consequent associated with $f_\beta(x)$, i.e., $(S_\beta Z_\beta)$.
For the machine of Example 1 the conditional transition model and the conditional mapping function are as follows:

$$CT_0: \quad \begin{array}{lll} x_1'x_0' + x_1x_0 & S_0 & 0 \\ x_1'x_0 + x_1x_0' & S_1 & 1 \end{array} \qquad CT_1: \quad \begin{array}{lll} x_1' & S_0 & 0 \\ x_1 & S_1 & 1 \end{array}$$
$$CMF_0 = (x_1'x_0' + x_1x_0) \cdot (S_0\ 0) + (x_1'x_0 + x_1x_0') \cdot (S_1\ 1)$$
$$CMF_1 = x_1' \cdot (S_0\ 0) + x_1 \cdot (S_1\ 1).$$

When considering the classical memory-based implementation of a state machine given in Fig. 2, the antecedents of all conditional transitions are evaluated simultaneously due to the operation of the address decoder. Note however that the evaluation of the antecedents of the conditional transition in (2) can be carried out sequentially. The evaluation stops when one of the antecedents returns the true value. The order of the evaluation is a degree of freedom available to the system designer. To evaluate the antecedents of the CT's of the machine we can restrict ourselves to the analysis of the control vectors associated with the CT's. To formalize these ideas we introduce the following.
Definition: A vector w is a subvector of a vector x if every component of w is a component of x.
Definition: Let
1) $E_x$ denote the set of all subsets of components of a vector x
2) w be a subvector of x
3) $E_w$ be a subset of $E_x$.
Then, $E_w$ is a complete description (CD) of w if and only if each component of w is contained in at least one element of $E_w$.
It should be noted at this point that for $2^n$-D and 2-D state machines, the complete description sets are unique and given by

$$CD_{2^n\text{-D}} = \{x\} = \{x_0 x_1 x_2 \cdots x_{n-1}\}$$
$$CD_{2\text{-D}} = \{x_0, x_1, x_2, \ldots, x_{n-1}\}.$$

Lemma 1: The sequentialization principle. The next state and output vectors of a state machine can be determined from the knowledge of the present state of the machine $S_i$ and the sequential evaluation, in any order, of the elements of a complete description of the control vector of the conditional transition $CT_i$ associated with $S_i$.
Proof: The proof of Lemma 1 is based on the application of Shannon's expansion theorem to the conditional mapping function $CMF_i(x_c)$ about the elements of the complete description set of the control vector $x_c$ associated with the conditional transition $CT_i$. This process generates a tree-like structure where the nodes are occupied by elements of the complete description set. The leaves or terminal nodes of the tree are the possible consequents of the conditional transition $CT_i$. It takes a finite number of steps to reach a terminal node because the uncertainty of the value of the conditional mapping function is reduced after each node of the tree is evaluated in a downward fashion. Consequently, the sequential evaluation of the elements of the complete description set of the control vector $x_c$ of $CT_i$ does indeed determine the next state and output combination of the state machine.
To clarify some of the definitions and concepts just presented, the following example is given.
Example 2: Consider a state machine whose behavior is described by means of a state table. The ith row of the state table, describing the next state function for some arbitrary present state, is given in Fig. 6.
x2x1x0   000  001  010  011  100  101  110  111
S_i      Sc   Sa   Sc   Sc   Sb   Sa   Sb   Sc

Fig. 6. ith row of state table.

The conditional transition $CT_i$, associated with the present state $S_i$, is given by

$$CT_i: \quad \begin{array}{lll} x_1'x_0 & S_a & Z_a \\ x_2x_0' & S_b & Z_b \\ x_2'x_0' + x_1x_0 & S_c & Z_c \end{array}$$

Consider one possible complete description set CD = $\{x_1x_0, x_2x_1\}$. The next state function $\delta_i(x)$ can be expressed in terms of the elements of CD as follows:

$$\delta_i(x) = x_1'x_0 S_a + x_2x_0' S_b + (x_2'x_0' + x_1x_0) S_c$$
$$\delta_i(x) = x_1'x_0'[x_2 S_b + x_2' S_c] + x_1'x_0[S_a] + x_1x_0'[x_2 S_b + x_2' S_c] + x_1x_0[S_c]$$
$$\delta_i(x) = x_1'x_0'\{x_2'x_1'[S_c] + x_2'x_1[S_c] + x_2x_1'[S_b] + x_2x_1[S_b]\} + x_1'x_0[S_a] + x_1x_0'\{x_2'x_1'[S_c] + x_2'x_1[S_c] + x_2x_1'[S_b] + x_2x_1[S_b]\} + x_1x_0[S_c]$$

We can express the next state function in the form of a diagram whose nodes are occupied by the elements of the complete description set. The diagram for the example is given in Fig. 7.

Fig. 7. Diagram representation of $\delta_i(x)$ with CD = $\{x_1x_0, x_2x_1\}$.

The labels of the directed arcs in the diagram of Fig. 7 correspond to the evaluation of the element of CD in the nodes from which the arcs depart.
We can generalize these concepts as follows.
Definition: A decision diagram is a hierarchical model of the evaluation of a discrete function wherein the value of a subset of the input variables is determined and the next action taken, either to choose another subset to evaluate, or to output the value of the function, is chosen accordingly. The root of the diagram is the node where the evaluation process begins. The leaves or terminal nodes of the diagram correspond to the values of the discrete function. The remaining nodes will be referred to as transition nodes. If a node in the diagram has more than one branch directed into it, then the node is qualified as reconvergent.
Definition: An optimum decision diagram for a multiple-output discrete function is a decision diagram for the function with the minimum number of nodes.
Definition: A decision tree is a decision diagram with no reconvergent nodes.
When we apply the sequentialization principle using a description set $CD_M$, we can generate different decision diagrams depending on the order of evaluation of the elements of $CD_M$. Optimum decision diagrams lead to state diagrams with minimum number of link states.

III. $2^k$-DECISION ($2^k$-D) MEMORY-BASED ARCHITECTURE
In the $2^n$-D memory-based architecture of Fig. 2, the antecedents of the conditional transition $CT_i$ of present state $S_i$ are evaluated simultaneously due to the operation of the address decoder. Therefore, the evaluation of the conditional mapping function $(\delta_i(x), \lambda_i(x))$, associated with the present state $S_i$ of the machine, takes place in just one clock cycle. As previously discussed, the complete description set for the $2^n$-D state machine is given by

$$CD_{2^n\text{-D}} = \{x\}.$$

The $2^k$-D memory-based architecture is characterized by a unique complete description set $CD_{2^k\text{-D}}$ which is given by

$$CD_{2^k\text{-D}} = \{ w \mid w \text{ is a subvector of } x \text{ and } w \text{ has } k \text{ components} \}. \tag{4}$$

Therefore, in the $2^k$-D architecture k input variables are tested during each clock cycle. If the control vector associated with $CT_i$ has dimension $\xi_i \le k$, then the evaluation of the conditional mapping function $CMF_i(x)$ can be accomplished in only one clock cycle. However, if $\xi_i > k$ then the evaluation of $CMF_i(x)$ requires the sequential evaluation of a subset of $CD_{2^k\text{-D}}$ constituting a complete description set of the control vector $x_{c_i}$. Let $CD_i$ denote such a subset. The evaluation of the first element of $CD_i$ takes place while the machine is in state $S_i$. The evaluation of the remaining elements of $CD_i$ requires the addition of extra or link states ($T_j$).
Extending the 2-D concept depicted in Fig. 3 results in the architecture shown in Fig. 8.
In the extended 2-D architecture of Fig. 8 there are k multiplexers, whose input is the vector of input variables, which will be referred to as the input multiplexers. The other multiplexer included in Fig. 8 will be referred to as the address multiplexer. The outputs of the input multiplexers form the group of input variables to be tested during the present state of the machine. They control the address multiplexer. Fields $C_1, C_2, \ldots, C_k$ of the memory word control the input multiplexers so that an element of $CD_{2^k\text{-D}}$ can be selected. Fields $A_0, A_1, \ldots, A_{2^k-1}$ hold the $2^k$ possible next states corresponding to the $2^k$ possible values of the element of $CD_{2^k\text{-D}}$ to be tested. There is one field to specify the output vector.
However, there is redundant information in the architecture of Fig. 8 because we can generate not only the elements of
$CD_{2^k\text{-D}}$, but also additional groups of input variables whose analysis is not necessary. It follows from Fig. 8 that we can generate all the elements of $CD_{2^x\text{-D}}$ for $1 \le x \le k$. To overcome this problem a second scheme is proposed as shown in Fig. 9.

Fig. 8. Extended 2-D ($2^k$-D) architecture.

Fig. 9. CD ($2^k$-D) architecture.

In the CD architecture of Fig. 9 there is one multiplexer whose inputs are the elements of $CD_{2^k\text{-D}}$; it will be referred to as the test set multiplexer. Field C in the memory word allows the selection of which element of $CD_{2^k\text{-D}}$ is being tested during the present state of the machine. The rest of the architecture is similar to the one of Fig. 8.

Cost Measure
Two of the most relevant factors characterizing an implementation of a state machine are speed and cost. The system designer generally seeks the solution that satisfies the time constraints of the problem with minimum cost.
The cost of a memory-based architecture implementing a state machine can be expressed as

$$\text{cost (architecture)} = \text{cost (elements)} + \text{cost (interconnections)}. \tag{8}$$

The elements correspond to the functional blocks required by the implementation, e.g., memory, address decoder, latches, etc.
As a rough approximation, the cost of the memory-based architecture is proportional to its memory requirements. Therefore we will use the size of the memory in bits as a measure of the cost of the architecture.
Let $C_{M,\mathrm{Arch}}$ stand for the number of bits of memory required by a memory-based implementation of a state machine, whose architecture is indicated by the subscript Arch. The costs of the $2^k$-D architectures of Figs. 8 and 9, denoted by $C_{M,\mathrm{EXT}}$ and $C_{M,\mathrm{CD}}$, respectively, are given by

$$C_{M,\mathrm{EXT}} = (M + Q) \times \{ k \lceil \log_2 n \rceil + 2^k \lceil \log_2 (M + Q) \rceil + p \} \tag{9}$$

and

$$C_{M,\mathrm{CD}} = (M + Q) \times \{ \lceil \log_2 |CD_{2^k\text{-D}}| \rceil + 2^k \lceil \log_2 (M + Q) \rceil + p \} \tag{10}$$

where
M is the number of original states of the machine
Q is the number of transition states
n is the dimension of the input variable vector
p is the dimension of the output vector
k is the degree of decision of the architecture
$|CD_{2^k\text{-D}}|$ is the number of elements of the complete description set $CD_{2^k\text{-D}}$, which is given by

$$|CD_{2^k\text{-D}}| = \frac{n!}{k!(n-k)!}. \tag{11}$$

Theorem 1: If a state machine is implemented using the Extended 2-D and CD architectures for a $2^k$-D implementation, then $C_{M,\mathrm{CD}} \le C_{M,\mathrm{EXT}}$.
Proof: From (9) and (10) it follows that

$$C_{M,\mathrm{CD}} - C_{M,\mathrm{EXT}} = (M + Q) \times \{ \lceil \log_2 |CD_{2^k\text{-D}}| \rceil - k \lceil \log_2 n \rceil \}$$
$$\lceil \log_2 |CD_{2^k\text{-D}}| \rceil = \left\lceil \log_2 \left\{ \frac{n!}{k!(n-k)!} \right\} \right\rceil = \left\lceil \log_2 \left\{ \frac{(n)(n-1)(n-2) \cdots (n-k+1)}{(k)(k-1)(k-2) \cdots (2)(1)} \right\} \right\rceil$$
$$= \left\lceil \log_2 \left\{ \frac{n}{k} \right\} + \log_2 \left\{ \frac{n-1}{k-1} \right\} + \log_2 \left\{ \frac{n-2}{k-2} \right\} + \cdots + \log_2 \left\{ \frac{n-k+1}{1} \right\} \right\rceil$$
$$\le \left\lceil \log_2 \left\{ \frac{n}{k} \right\} \right\rceil + \left\lceil \log_2 \left\{ \frac{n-1}{k-1} \right\} \right\rceil + \cdots + \left\lceil \log_2 \left\{ \frac{n-k+1}{1} \right\} \right\rceil \le k \lceil \log_2 n \rceil$$
$$\Rightarrow \lceil \log_2 |CD_{2^k\text{-D}}| \rceil - k \lceil \log_2 n \rceil \le 0 \Rightarrow C_{M,\mathrm{CD}} - C_{M,\mathrm{EXT}} \le 0.$$

Therefore, $C_{M,\mathrm{CD}} \le C_{M,\mathrm{EXT}}$.
From Theorem 1 it follows that between the $2^k$-D architecture types EXT 2-D and CD, the CD architecture offers the minimum memory requirements for any state machine. Therefore, from this point on whenever the $2^k$-D architecture is referred to, it is implicitly understood that the CD architecture of Fig. 9 is being used. The cost of the $2^k$-D architecture, denoted by $C_{M,2^k\text{-D}}$, corresponds to $C_{M,\mathrm{CD}}$, i.e.,

$$C_{M,2^k\text{-D}} = (M + Q) \times \{ \lceil \log_2 |CD_{2^k\text{-D}}| \rceil + 2^k \lceil \log_2 (M + Q) \rceil + p \}. \tag{12}$$

The system designer can use (12) to classify the $2^k$-D architecture implementing a given state machine on a memory cost basis. Note that the costs of the classical and 2-D memory-based architectures are derived from (12) with k = n and k = 1, respectively. However, the speed of the machine is also a major factor in determining the selection of a solution to a particular problem.
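The cost expressions (9) through (12) are simple enough to evaluate directly. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, and the parameter values in the usage lines (M = 4, n = 3, p = 0, with Q = 0 or Q = 4) are chosen only to exercise the formulas.

```python
from math import comb


def clog2(value):
    """Ceiling of log2 for a positive integer, using integer arithmetic."""
    if value < 1:
        raise ValueError("value must be a positive integer")
    return (value - 1).bit_length()


def cost_ext(M, Q, n, k, p):
    """Memory bits of the extended 2-D scheme of Fig. 8, following (9)."""
    return (M + Q) * (k * clog2(n) + 2 ** k * clog2(M + Q) + p)


def cost_cd(M, Q, n, k, p):
    """Memory bits of the CD scheme of Fig. 9, following (10) and (12).

    comb(n, k) is the size of the complete description set given by (11).
    """
    return (M + Q) * (clog2(comb(n, k)) + 2 ** k * clog2(M + Q) + p)


# Illustrative values: 4 states, 3 input variables, no stored output bits.
BITS_CD_K2 = cost_cd(M=4, Q=0, n=3, k=2, p=0)
BITS_EXT_K2 = cost_ext(M=4, Q=0, n=3, k=2, p=0)
BITS_CD_K1 = cost_cd(M=4, Q=4, n=3, k=1, p=0)
```

For these values the CD scheme needs 40 bits at k = 2 against 48 bits for the extended scheme, in line with Theorem 1, and a k = 1 realization with four link states needs 64 bits. The number of link states Q is an input here; finding its minimum for a given k is a separate problem.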
If a memory-based architecture is used to implement a state machine, the speed of the implementation is mainly determined by the memory access time. We can measure speed in terms of the number of clock cycles required to perform a transition between two states of the machine. For example, the classical memory-based architecture of Fig. 2 can be qualified as a 1 clock cycle implementation of a state machine because any state transition takes just 1 clock cycle to be executed. The effects of speed restrictions on the $2^k$-D architecture are the object of study of the next section.

Speed Considerations
For many state machine applications, it is required that their implementations operate at the maximum possible speed. This is particularly important in LSI and VLSI designs.
In order to characterize the $2^k$-D architecture in terms of speed, it is convenient to first introduce some notation. With no loss of generality, we will restrict our discussion to a Moore-type state machine.
Definition: The transition time $\tau_i$ associated with a conditional transition $CT_i$ is the maximum number of clock cycles required to determine the value of the conditional mapping function $CMF_i$.
Definition: The control vector dimension set CVDS of a state machine is given by CVDS = $\{\xi_1, \xi_2, \ldots, \xi_M\}$ where $\xi_i$ is the dimension of the control vector $x_{c_i}$ associated with the conditional transition $CT_i$ of the state machine.
We will measure the speed of a state machine implementation in terms of the maximum number of clock cycles necessary to perform a transition between any two states of the machine. The maximum speed of any implementation is 1 transition per clock cycle.
For a $2^k$-D architecture, the condition of maximum speed is achieved in the absence of link states, i.e., Q = 0. For the $2^n$-D architecture the condition Q = 0 is always satisfied because the control vectors of all conditional transitions are identical and equal to x, the vector of input variables. The control vector dimension set for the $2^n$-D architecture is thus given by CVDS = $\{n, n, n, \ldots, n\}$.
The $2^k$-D architecture for $k \ne n$ can, under certain circumstances, achieve maximum speed. The conditions that make this possible as well as the cost implications are considered in the following theorems.
Theorem 2: If $\zeta = \max\{\mathrm{CVDS}\}$ and $\zeta < n$, then there exists a $2^k$-D architecture which can be operated at maximum speed for any k where $\zeta \le k \le n$.
Proof: $\xi_i \le \zeta \le k\ \forall i$. Consequently, for every vector $\alpha \in CD_{2^{\xi_i}\text{-D}}$ there exists a set of vectors $\{\beta_1, \beta_2, \ldots, \beta_l\} \subseteq CD_{2^k\text{-D}}$ such that $\alpha$ is a subvector of $\beta_j$ for $j = 1, \ldots, l$, where l is given by

$$l = \frac{(n - \xi_i)!}{(n - k)!(k - \xi_i)!}.$$

This implies that the value of $\alpha$ can be determined from the value of any of the vectors $\beta_j$. Therefore, for each conditional transition $CT_i$ we can always select an element $\beta$ of $CD_{2^k\text{-D}}$ to test so that $x_{c_i}$ is a subvector of $\beta$, and consequently the transition time $\tau_i$ is unity and $Q_i = 0$. Applying the same procedure to all conditional transitions we achieve the condition Q = 0, which implies that the $2^k$-D architecture is operating at maximum speed.
Theorem 3: If $\zeta = \max\{\mathrm{CVDS}\}$, and $\zeta < n$, then there exists a $2^k$-D architecture with $k = \zeta$ such that $C_{M,2^\zeta\text{-D}} \le C_{M,2^k\text{-D}}$ for $\zeta \le k \le n$.
Proof: From Theorem 2 it follows that Q = 0 for the $2^k$-D architecture with $\zeta \le k \le n$. From (12) we can determine the cost of the architecture as follows:

$$C_{M,2^k\text{-D}} = M \times \{ \lceil \log_2 |CD_{2^k\text{-D}}| \rceil + 2^k \lceil \log_2 M \rceil + p \}.$$

In particular for $k = \zeta$ we have

$$C_{M,2^\zeta\text{-D}} = M \times \{ \lceil \log_2 |CD_{2^\zeta\text{-D}}| \rceil + 2^\zeta \lceil \log_2 M \rceil + p \}.$$

Let

$$\Delta = \frac{1}{M} \times \{ C_{M,2^\zeta\text{-D}} - C_{M,2^k\text{-D}} \}$$
$$\Delta = \lceil \log_2 |CD_{2^\zeta\text{-D}}| \rceil - \lceil \log_2 |CD_{2^k\text{-D}}| \rceil + (2^\zeta - 2^k) \lceil \log_2 M \rceil$$
$$\Delta \le \lceil \log_2 |CD_{2^\zeta\text{-D}}| \rceil - \lceil \log_2 |CD_{2^k\text{-D}}| \rceil + 2^\zeta - 2^k$$
$$\Delta \le \left\lceil \log_2 \binom{n}{\zeta} \right\rceil - \left\lceil \log_2 \binom{n}{k} \right\rceil + 2^\zeta - 2^k$$
$$\Delta \le \left\lceil \log_2 \left\{ \frac{k!(n-k)!}{\zeta!(n-\zeta)!} \right\} \right\rceil + 2^\zeta - 2^k.$$
182
IEEE TRANSACTIONS ON COMPUTERS, VOL. C-36, NO. 2, FEBRUARY 1987
$$\phi \le k \;\Rightarrow\; (n-k)! \le (n-\phi)! \;\Rightarrow\; \frac{(n-k)!}{(n-\phi)!} \le 1$$
$$\Rightarrow\; \Delta \le \left\lceil \log_2 \frac{k!}{\phi!} \right\rceil + 2^\phi - 2^k = f(k, \phi)$$
$$f(k+1, \phi) = \left\lceil \log_2 \frac{(k+1)!}{\phi!} \right\rceil + 2^\phi - 2^{k+1} = \left\lceil \log_2 (k+1) + \log_2 \frac{k!}{\phi!} \right\rceil + 2^\phi - 2^{k+1}$$
$$f(k+1, \phi) - f(k, \phi) = \lceil \log_2 (k+1) \rceil - 2^k < 0, \quad \forall k \ge \phi.$$
Therefore, $f(k, \phi)$ is a monotone decreasing function for $k \ge \phi \ge 1$ and reaches its maximum for $k = \phi$.
$$f(k, \phi)_{\max} = f(\phi, \phi) = \lceil \log_2 1 \rceil + 2^\phi - 2^\phi = 0.$$
Therefore, $\Delta \le 0 \Rightarrow CM_{2^\phi-D} \le CM_{2^k-D}$ for $\phi \le k \le n$.
From Theorems 2 and 3 we can conclude that the $2^k$-D architecture with $k = \max\{CVDS\}$ offers maximum speed of 1 transition per clock cycle at minimum cost in terms of memory requirements in the range $\max\{CVDS\} \le k \le n$.
For the $2^k$-D architecture, if $1 \le k < \phi$, then $Q > 0$ and the speed is less than unity. The cost of the architecture is given by (12). To compare it to the $2^\phi$-D architecture, we will use the figure of merit $FM_{\phi,k}$, which is given by
$$FM_{\phi,k} = \frac{CM_{2^\phi-D}}{CM_{2^k-D}}$$
$$FM_{\phi,k} = \frac{M \times \{\lceil \log_2 |CD_{2^\phi-D}| \rceil + 2^\phi \lceil \log_2 M \rceil + p\}}{(M+Q) \times \{\lceil \log_2 |CD_{2^k-D}| \rceil + 2^k \lceil \log_2 (M+Q) \rceil + p\}}$$
$$\lim_{p \to \infty;\ M, n\ \text{finite}} FM_{\phi,k} = \frac{M}{M+Q} < 1, \quad \forall k < \phi. \qquad (13)$$
$$\lim_{M \to \infty;\ p, n\ \text{finite}} FM_{\phi,k} = \ldots, \quad \forall k < \phi. \qquad (14)$$
From (13) and (14) we can conclude that the $2^k$-D architecture with $k = \max\{CVDS\}$ does not offer, in general, the most cost-effective solution for $1 \le k \le n$.
There exists, however, a special case. For those state machines with $t_i = 2$ for $i = 1, 2, \ldots, M$, it can be proved that the $2^k$-D architecture with $k = 2$ operates at the maximum speed and offers minimum cost in terms of memory requirements for $1 \le k \le n$.
Example 3: Consider the state machine whose behavior is described by the state table of Fig. 10. In Fig. 10, PS stands for the present state of the machine, and $CMF_i(x) = \delta_i(x)$ because no output lines are being considered. For this state machine, $x = x_2x_1x_0$, $n = 3$, $M = 4$, and $p = 0$. The conditional transitions for the machine can be deduced from the state table of Fig. 8. After some simplifications the following expressions are obtained.
        x2x1x0
PS      000  001  010  011  100  101  110  111
S0      S1   S2   S1   S3   S1   S2   S1   S3
S1      S0   S0   S1   S1   S2   S2   S2   S2
S2      S0   S0   S3   S3   S0   S0   S3   S3
S3      S0   S1   S2   S3   S0   S1   S2   S3
Fig. 10. State table of Example 3.
CT0:  x0' : S1;    x1'x0 : S2;    x1x0 : S3
CT1:  x2'x1' : S0;    x2'x1 : S1;    x2 : S2
CT2:  x1' : S0;    x1 : S3
CT3:  x2'x1' : S0;    x2'x1 : S1;    x2x1' : S2;    x2x1 : S3
From these expressions we can determine the control vectors and their dimensions, which are listed in Table I.
From Table I, $CVDS = \{2, 2, 1, 2\}$ and $\phi = \max\{CVDS\} = 2$. Therefore the $2^k$-D architecture with $k = 2$ offers maximum speed at minimum cost for $2 \le k \le 3$. To check this statement let us determine $CM_{2^k-D}$ for all possible k. Using equation (12), we obtain
$$CM_{2^3-D} = 48 \text{ bits.}$$
$$CM_{2^2-D} = 4 \times \{\lceil \log_2 |CD_{2^2-D}| \rceil + 2^2 \lceil \log_2 4 \rceil\} = 40 \text{ bits.}$$
To calculate the minimum value of $CM_{2-D}$ we need the value of $Q_{\min}$, which can be determined from the optimum 2-D diagrams given in Fig. 11. From Fig. 11, $Q_{\min} = 4$, and therefore,
$$CM_{2-D} = (4 + 4) \times \{\lceil \log_2 |CD_{2-D}| \rceil + 2 \times \lceil \log_2 (4 + 4) \rceil\} = 64 \text{ bits.}$$
$$\min \{CM_{2^k-D}\} = CM_{2^2-D} = 40 \text{ bits.}$$
This example shows the steps to be taken in determining the $2^k$-D architecture with minimum cost that operates at maximum speed. It also illustrates that in some cases the minimum cost $2^k$-D architecture for $1 \le k \le n$ can operate at maximum speed.
IV. $2^k$-D OPTIMIZATION PROBLEMS AND STRATEGIES
When speed is not a factor in the implementation and cost is to be minimized, the system designer must select the $2^k$-D architecture such that $CM_{2^k-D}$ is minimum for $1 \le k \le n$. Equation (12) reveals that, for a given value of k, $CM_{2^k-D}$ is a monotone increasing function of Q. Hence, $CM_{2^k-D}$ is minimum for $Q = Q_{\min}$. Consequently, to determine the $2^k$-D architecture with minimum cost we must first calculate $Q_{\min}$ for $1 \le k \le n$. From (1) it follows that
$$Q_{\min} = \sum_{i=1}^{M} Q_{i,\min}. \qquad (15)$$
$Q_{i,\min}$ corresponds to a $2^k$-D decision diagram with the minimum number of nodes. Finding $Q_{i,\min}$ entails the solution of an NP-complete optimization problem. Some solutions for
the case of a 2-D decision diagram are given in the literature [3], [6], [13], [18]-[21].
TABLE I
CONTROL VECTORS FOR EXAMPLE 3
i     x_ci     t_i
0     x1x0     2
1     x2x1     2
2     x1       1
3     x2x1     2
Fig. 11. 2-D diagrams for Example 3.
To illustrate the magnitude of the problem consider the evaluation of the conditional mapping function $CMF_i(x)$. The number of 2-D decision diagrams representing the sequential evaluation of $CMF_i(x)$ is a function of n, the dimension of x, and will be denoted by $ND_{2-D}(n)$. It can be shown that $ND_{2-D}(n)$ is given by
$$ND_{2-D}(n) = \prod_{i=0}^{n-1} (n - i)^{2^i}. \qquad (16)$$
We can also express (16) in the form of a recurrence relation, i.e.,
$$ND_{2-D}(n) = n \times [ND_{2-D}(n-1)]^2. \qquad (17)$$
The first few values of $ND_{2-D}(n)$ are listed in Table II.
TABLE II
EVALUATION OF $ND_{2-D}(n)$
n     ND_2-D(n)        n     ND_2-D(n)
1     1                5     1658880
2     2                6     1.65 x 10^13
3     12               7     1.91 x 10^27
4     576              8     2.91 x 10^55
To find $Q_{\min}$ we need to optimize M decision diagrams, one for each conditional transition of the state machine. Determining the $2^k$-D architecture such that $CM_{2^k-D}$ is minimum requires the solution of a difficult optimization problem for $1 \le k < \max\{CVDS\}$. However, it can be proved that the optimum $2^k$-D decision diagrams for $k > 1$ can be obtained from the optimum 2-D diagrams using a grouping procedure.
Bounds on $CM_{2^k-D}$
Up to this point we have established that an optimum cost implementation of a state machine using a $2^k$-D architecture requires the solution of a difficult optimization problem. However, the system designer can obtain bounds on $CM_{2^k-D}$ based on simple calculations using the elements of CVDS.
If the present state of the machine is $S_i$ then $t_i$ input variables must be tested to determine the value of $CMF_i(x_{c_i})$. The sequential evaluation of $CMF_i(x_{c_i})$ using a $2^k$-D approach is carried out by using $2^k$-D decision diagrams. As has already been pointed out in previous sections, many $2^k$-D diagrams can be used to evaluate $CMF_i(x_{c_i})$. Because k input variables are tested at any node in the diagram, the minimum possible number of nodes is given by $\lceil t_i/k \rceil$. Therefore, the lower bound for $Q_i$, denoted by $Q_{i,LB}$, is given by
$$Q_{i,LB} = \lceil t_i/k \rceil - 1. \qquad (18)$$
Similarly we will denote by $Q_{i,UB}$ the upper bound for $Q_i$, which can be determined from the maximum possible number of nodes in the decision diagram. The following concept will help us to calculate $Q_{i,UB}$.
Definition: The restriction $r_{w,v}(x)$ of a function $f(x)$ with respect to a subvector $w \in x$ is the function obtained when the variables in w are evaluated so that w assumes the value v. If k is the dimension of w then there exist $2^k$ possible restrictions of $f(x)$.
When a node in a $2^k$-D diagram representing $CMF_i(x_{c_i})$ is evaluated, we are left with a maximum of $2^k$ restrictions which depend on $t_i - k$ input variables. Therefore we can have a maximum of $2^k$ nodes in the second level of the diagram. Each one of these nodes tests k input variables and generates a maximum of $2^k$ restrictions which depend on $t_i - 2k$ variables. The total number of nodes is $(2^k)^2$. We can repeat this process until no more input variables need to be tested. The depth of the diagram is given by $\lceil t_i/k \rceil$, and therefore $Q_{i,UB}$ is given by
$$Q_{i,UB} = (2^k)^{\ldots} \qquad (19)$$
The lower and upper bounds for Q can be obtained from (5), (18), and (19), as shown in the following equations.
$$Q_{LB} = \sum_{i=1}^{M} \{\lceil t_i/k \rceil - 1\} = \sum_{i=1}^{M} \lceil t_i/k \rceil - M \qquad (20)$$
$$Q_{UB} = \sum_{i=1}^{M} Q_{i,UB} \qquad (21)$$
Substitution of $Q_{LB}$ and $Q_{UB}$ in (12) gives the lower and upper bounds for the cost of the $2^k$-D architecture, which will be denoted by $CM_{2^k-D,LB}$ and $CM_{2^k-D,UB}$, respectively. These bounds give the system designer some insight into the advisability of pursuing the $2^k$-D approach before attempting to obtain optimum costs. Let us illustrate these ideas by means of an example.
Consider the problem stated in Example 1. If a 2-D architecture is sought we can determine $CM_{2-D,LB}$. $CVDS = \{2, 2, 1, 2\}$, therefore
$$Q_{LB} = \sum_{i=1}^{4} \lceil t_i/1 \rceil - 4 = 3.$$
Use of (12) yields
$$CM_{2-D,LB} = (4 + 3) \times \{\lceil \log_2 3 \rceil + 2 \times \lceil \log_2 (4 + 3) \rceil\} = 56 \text{ bits.}$$
Note that $CM_{2^2-D} = 40$ bits, so that $CM_{2^2-D} < CM_{2-D,LB}$, and consequently it is not worth pursuing the 2-D approach.
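The numerical claims of Section IV can be checked with a short script. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, the cost function encodes equation (12) as it can be read back from the worked examples (with p output bits per word, p = 0 here), and the number of distinct control vectors is taken as 3, which is the value implied by the term $\lceil \log_2 3 \rceil$ above and is an assumption for the k = 2 case.

```python
def nd_2d(n: int) -> int:
    """Number of 2-D decision diagrams for n variables, using recurrence (17)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    count = 1  # ND(1) = 1
    for m in range(2, n + 1):
        count = m * count * count  # ND(m) = m * ND(m-1)^2
    return count


def nd_2d_product(n: int) -> int:
    """Same quantity computed from the product form (16)."""
    result = 1
    for i in range(n):
        result *= (n - i) ** (2 ** i)
    return result


def ceil_log2(v: int) -> int:
    """Integer ceil(log2(v)) for v >= 1, without floating point."""
    return (v - 1).bit_length()


def q_lower_bound(cvds, k):
    """Q_LB = sum over transitions of (ceil(t_i / k) - 1), as in (18) and (20)."""
    return sum(-(-t // k) - 1 for t in cvds)


def cost_bits(m, q, k, n_control_vectors, p=0):
    """Cost read from (12): (M+Q) * (ceil(log2|CD|) + 2^k * ceil(log2(M+Q)) + p)."""
    words = m + q
    return words * (ceil_log2(n_control_vectors) + (2 ** k) * ceil_log2(words) + p)


cvds = [2, 2, 1, 2]                      # CVDS of the example
q_lb = q_lower_bound(cvds, k=1)          # lower bound on link states for 2-D
lb_cost = cost_bits(4, q_lb, 1, 3)       # CM_{2-D,LB}
max_speed_cost = cost_bits(4, 0, 2, 3)   # CM_{2^2-D} with Q = 0 (|CD| = 3 assumed)
print(q_lb, lb_cost, max_speed_cost)
print([nd_2d(n) for n in range(1, 6)])
```

Because the lower bound for the 2-D architecture already exceeds the cost of the maximum-speed $2^2$-D architecture, the comparison can be made without solving the diagram optimization problem.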
V. CONCLUSIONS
The $2^k$-D architecture stands as a generalized model of the memory-based implementation of a finite-state machine. Different architectures are obtained by specifying the value of k, with $1 \le k \le$ number of input variables. The sequentialization principle establishes the equivalence among the architectures. The classical and binary decision memory-based architectures are particular instances of the $2^k$-D architecture with $k = n$ and $k = 1$, respectively.
Two implementations for the $2^k$-D architecture were presented: Extended 2 - D and CD architectures. The CD architecture offers the lower memory requirements of the two implementations.
Two important parameters characterize the $2^k$-D architecture: 1) Speed, measured in terms of state transitions per clock cycle, and 2) Cost, estimated by the memory requirements. Maximum speed is achieved by the $2^k$-D architecture with k greater than or equal to $\phi$, the maximum dimension in the control vector dimension set (CVDS). CVDS is obtained from simple operations on the mathematical model of the machine. Among those $2^k$-D architectures that operate at maximum speed, the $2^\phi$-D architecture offers minimum cost.
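The speed condition summarized above can be illustrated as follows. This is a minimal sketch under two assumptions: one level of a decision diagram is evaluated per clock cycle, and each conditional transition needs $\lceil t_i/k \rceil$ levels, the depth stated in Section IV. The function names are hypothetical.

```python
def cycles_per_transition(cvds, k):
    """Diagram depth ceil(t_i / k) for each conditional transition."""
    return [-(-t // k) for t in cvds]


def max_speed_k(cvds):
    """Smallest k giving one level per transition, i.e. phi = max(CVDS)."""
    return max(cvds)


def runs_at_max_speed(cvds, k):
    """True when every transition completes in a single clock cycle."""
    return max(cycles_per_transition(cvds, k)) == 1


cvds = [2, 2, 1, 2]  # CVDS of Example 3
phi = max_speed_k(cvds)
for k in range(1, 4):
    print(k, cycles_per_transition(cvds, k), runs_at_max_speed(cvds, k))
```

For this CVDS the sketch reports that k = 1 needs two cycles for three of the four transitions, while k = 2 and k = 3 both reach one transition per clock cycle, which matches $\phi = 2$.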
In some situations, the $2^k$-D architecture with $k = \phi$ offers minimum cost and maximum speed for $1 \le k \le n$ and $\phi > 1$. This is always true for the case of a 4-D architecture implementing a state machine with $t_i = 2\ \forall i$, where $t_i$ is the dimension of the control vector associated with state $S_i$.
Some bounds on the cost of the $2^k$-D architecture can be calculated from the knowledge of CVDS. These bounds give some insight into the advisability of pursuing the $2^k$-D approach.
The results of this research are particularly attractive when LSI and VLSI technologies are considered. State machines constitute fundamental blocks in systems to be fabricated as integrated circuits. Memory-based implementations offer a short design time, regularity of structure, and expandability. Minimum cost implementations operating at maximum speed are frequent requirements dictated by current technology.
Extensions of this work are in progress, with particular emphasis on the following aspects:
1) Expansion of the cost function to include the cost of multiplexers.
2) Use of DON'T CARE's in the state table describing the behavior of the state machine to reduce the cost of the $2^k$-D implementation.
3) Application of ideas and principles developed in this paper to PLA-based implementations of finite-state machines.
REFERENCES
[1] S. B. Akers, "Binary decision diagrams," IEEE Trans. Comput., vol. C-27, pp. 509-516, 1978.
[2] R. T. Boute, "The binary-decision machine as programmable controller," Euromicro Newslett., vol. 1, no. 2, pp. 16-22, 1976.
[3] D. Cerny, D. Mange, and E. Sanchez, "Synthesis of minimal binary decision trees," IEEE Trans. Comput., vol. C-28, no. 7, pp. 472-482, 1979.
[4] R. Clare, Designing Logic Systems Using State Machines. New York: McGraw-Hill, 1973.
[5] M. Davio, J. P. Deschamps, and A. Thayse, Digital Systems with Algorithmic Implementation. New York: Wiley, 1983.
[6] M. Davio and A. Thayse, "Implementation and transformation of algorithms based on automata. Part I: Introduction and elementary optimization problems," Philips J. Res., vol. 35, pp. 122-144, 1980.
[7] J. Hartmanis and R. E. Stearns, Algebraic Structure Theory of Sequential Machines. Englewood Cliffs, NJ: Prentice-Hall, 1966.
[8] Hill and Peterson, Introduction to Switching Theory and Logical Design. New York: Wiley, 1981.
[9] Z. Kohavi, Switching and Finite Automata Theory. New York: McGraw-Hill, 1978.
[10] D. E. Knuth, Fundamental Algorithms, The Art of Computer Programming, Vol. 1. Reading, MA: Addison-Wesley, 1969.
[11] C. Y. Lee, "Representation of switching functions by binary-decision diagrams," Bell Syst. Tech. J., pp. 985-999, July 1959.
[12] Monolithic Memories, Inc., Designing with Programmable Array Logic. New York: McGraw-Hill, 1981.
[13] B. M. E. Moret, "Decision trees and diagrams," Comput. Surveys, vol. 14, pp. 594-623, Dec. 1982.
[14] A. Mukhopadhyay, Recent Developments in Switching Theory. New York: Academic, 1971.
[15] Z. Murray, Vromen, Hudson, Le-Ngoc, and Holck, "Binary-decision-based programmable controllers, Part I," IEEE Micro, pp. 67-83, Aug. 1983; Part II, pp. 16-26, Oct. 1983; Part III, pp. 24-39, Dec. 1983.
[16] E. Sanchez and A. Thayse, "Implementation and transformation of algorithms based on automata. Part III: Optimization of evaluation programs," Philips J. Res., vol. 36, pp. 159-172, 1981.
[17] D. J. Stewart, Ed., "Binary decision-based programmable controllers," IEEE Micro, pp. 58-72, June 1985.
[18] A. Thayse, "P-functions: A new tool for the analysis and synthesis of binary programs," IEEE Trans. Comput., vol. C-30, pp. 126-134, 1981.
[19] A. Thayse, "Synthesis and optimization of programs by means of P-functions," IEEE Trans. Comput., vol. C-31, pp. 34-40, 1982.
[20] A. Thayse, "Synthesis and asynchronous implementation of algorithms using a generalized P-function concept," IEEE Trans. Comput., vol. C-33, pp. 861-868, 1984.
[21] A. Thayse, "Implementation and transformation of algorithms based on automata, Part II: Synthesis of evaluation programs," Philips J. Res., vol. 35, pp. 190-216, 1980.
Lee D. Coraor (S'70-M'78) received the B.S.
degree in 1974 from Pennsylvania State University,
University Park, and the Ph.D. degree in 1978 from
the University of Iowa, Iowa City, IA, both in
electrical engineering.
He is currently an Assistant Professor of Electrical Engineering at The Pennsylvania State University. His research interests include computer architecture, processor-memory interfacing, microcomputer system design, and computer security.
Paul T. Hulina (S'67-M'69) received the BSEE
degree from Carnegie-Mellon University, Pittsburgh, Pa., and the MSEE and Ph.D. degrees from
Pennsylvania State University, University Park.
He is currently an Associate Professor of Electrical Engineering, Pennsylvania State University,
where he is engaged in teaching and research
activities. His current interests are in computer
architecture, digital design, and microprocessor-based real-time systems.
Orlanda A. Morean (M'86) received the B.S.E.E.
and M.E. degrees in electronic engineering in 1975
and 1982, respectively, both from Simon Bolivar
University, Caracas, Venezuela. In 1984 he obtained the M.S. degree in electrical engineering
from the Pennsylvania State University, University
Park, where he is currently working toward the
Ph.D. degree in the field of digital systems.
His research interests include digital systems,
switching theory, and computer architecture.
Mr. Morean is a member of Eta Kappa Nu.