Synthesis, Design and Test of Reversible Circuits

Synthesis, Design and Test of Reversible Circuits Employing
Classical Techniques
Sayeeda Sultana
Department of Electrical and Computer Engineering
McGill University
Montreal, Quebec
May 2014
A thesis submitted to McGill University in partial fulfillment of the requirements of the
Degree of Doctor of Philosophy
© Copyright by
Sayeeda Sultana
2014
Dedication
To my husband, Shahriar Al Imam,
who always supported me in this journey
and
my daughter, Tanisha,
who encouraged me to make it possible.
Acknowledgement
First of all, I would like to thank my supervisors, Dr. Katarzyna Radecka and Prof.
Zeljko Zilic, for giving me the opportunity to work on such an interesting topic.
Their support and guidance made my PhD journey a memorable one. At every
step, they encouraged me to find proper solutions to the problems. I acquired
a lot of scientific knowledge, especially in testing and synthesis of circuits, and
learned how it can be applied in scientific ventures. They taught me how to
overcome difficulties both in research and in personal life. Finally, I am grateful
to them for the financial support they provided.
I would like to thank the members of my PhD supervisory committee, Professors
Roni Khazaka and Thomas Szkopek, for their input and suggestions. Prof.
Khazaka helped me by suggesting additional areas I needed to explore to
enrich the thesis, and Prof. Szkopek encouraged me to think critically in
the respective field.
I would like to take this opportunity to thank my lab mates, especially Bojan, Jason,
Omid and MH Neishabouri, with whom I shared my views, successes and tough
times. They encouraged and helped me whenever I was in need. They
cheered me up when I was depressed with my performance and made me smile
at the end of the day. Special thanks to Pang and Atena, who gave me ideas
and feedback regarding my work.
Thanks to the Natural Sciences and Engineering Research Council of Canada
(NSERC), Le Fonds de recherche du Quebec (FQRNT) and the Vadasz Family
Foundation for providing the financial support that helped me continue
my research smoothly.
Finally, I would like to thank my family, who were always with me to encourage
me during my journey. Thanks to my parents, who wanted me to reach my goal. A
very special thanks to my husband, Shahriar Al Imam, who stood by me in my
hard times and tried to make me smile by all means. Above all, thanks to my
daughter, Tanisha, who made me believe in myself; every time, her cuddle
gave me the strength to move forward.
Table of Contents
Dedication
Acknowledgement
Abstract
Abrégé
Chapter 1 Introduction
  1.1 Objective, scope and motivation
  1.2 Contributions
  1.3 Organization
Chapter 2 Background of reversible circuits
  2.1 Reversible function
  2.2 Reversible gates
  2.3 Cost of reversible circuits
  2.4 Implementation of reversible circuits and energy consumption
Chapter 3 Reversible synthesis using technology mapping
  3.1 State-of-art reversible synthesis methods
  3.2 Problem identification and motivation
  3.3 Proposed reversible circuits synthesis
  3.4 Methodology
    3.4.1 Toffoli modules of irreversible gates
    3.4.2 Transforming classical to reversible circuits
  3.5 Theoretical analysis: technical lemmas
  3.6 Experimental results
  3.7 Conclusion
Chapter 4 Reversible synthesis with redundancy removal
  4.1 Introduction
  4.2 Redundancy in classical and reversible circuits
  4.3 Redundant reversible gate removal
  4.4 Experimental results
  4.5 Conclusion
Chapter 5 Reversible architecture of computer arithmetic
  5.1 State-of-art and proposed contributions
  5.2 Reversible Controlled Adder/Subtractor
    5.2.1 Reversible Controlled Adder/Subtractor (RCAS) block
    5.2.2 n-bit Adder/Subtractor
    5.2.3 Reversible Adder/Subtractor with Overflow Detector
    5.2.4 Comparative Analysis of Reversible Adder/Subtractor
  5.3 Reversible Binary Comparator
    5.3.1 Comparator Basic
    5.3.2 Proposed Comparator Design
    5.3.3 Comparative Analysis of Reversible Comparators
  5.4 Reversible Arithmetic Logic Unit
    5.4.1 The Logical Operations
    5.4.2 The Arithmetic Operations
    5.4.3 Function Generator
    5.4.4 Function Selector
    5.4.5 One-bit Reversible Arithmetic Logic Unit (RALU)
    5.4.6 4-bit RALU
    5.4.7 Analysis of circuit parameters for n-bit RALU
    5.4.8 Comparison to previous work
    5.4.9 RALU with overflow detector and set-less-than function
  5.5 Reversible square root circuit
    5.5.1 n-bit reversible square root circuit
    5.5.2 Analysis of n-bit circuit parameters
  5.6 Simulation Results
  5.7 Conclusion
Chapter 6 Testing of Reversible Circuits
  6.1 State-of-art
  6.2 Fault Models and Testability of Reversible circuits
    6.2.1 Testability of Reversible Circuits
  6.3 Proposed SAT-based Testing for Gate Replacement Faults
    6.3.1 Proposed Conventional Miter-based Testing
    6.3.2 SAT formulation
    6.3.3 Reversible Miter as a test pattern generator
    6.3.4 SAT formulation of reversible miter
    6.3.5 Proposed Reversible Test Miter
  6.4 Wire Replacement Fault Testing
  6.5 Experimental Results
  6.6 Testability of Reversible Arithmetic Circuit
    6.6.1 Identifying Fault Location
    6.6.2 Testing n-bit controlled Adder/Subtractor
  6.7 Conclusion
Chapter 7 Conclusion
  7.1 Synopsis
  7.2 Findings of the project
  7.3 Future work
References
List of Figures
Figure 2-1: Standard reversible gates.
Figure 2-2: Reversible gates (a) Fredkin (b) Peres.
Figure 2-3: Universal AND-OR gate (a) classical (b) Toffoli.
Figure 2-4: Reversible gates for arithmetic designs [53].
Figure 2-5: Quantum realization of Toffoli gate.
Figure 2-6: Quantum realization of Fredkin gate.
Figure 2-7: Quantum realization of Peres gate.
Figure 2-8: CMOS realization of CNOT gate [5].
Figure 3-1: Classical circuit equivalent to reversible NOT, XOR and Toffoli gate
Figure 3-2: Reversible equivalent gates.
Figure 3-3: Creation of super cells.
Figure 3-4: Parallel packed cells.
Figure 3-5: Impact of larger size super cells.
Figure 3-6: Impact of different size packed cells in a) garbage bits, b) gate count and c) quantum cost.
Figure 3-7: Impact of hybrid cells.
Figure 3-8: Reversible mapping from irreversible specification/circuit.
Figure 3-9: Reversible mapping (a) classical (b) reversible equivalent gate mapping (c) Toffoli equivalent circuit.
Figure 3-10: Fan-out mapping: (a) Classical circuit, (b) Reversible equivalent gate mapping with copies of inputs, (c) Cascaded Toffoli modules.
Figure 3-11: XOR fan-out mapping: (a) Classical, (b) Problem with XOR mapping first, (c) XOR mapping after.
Figure 3-12: Reversible mapping with supercell library: (a) Original circuit, (b) Possible location of supercell and individual Toffoli modules usage.
Figure 3-13: Super-cell optimization, (a) Full Adder circuit, (b) Reversible realization of universal AND-OR gate, (c) Super-cell implementation.
Figure 4-1: Classical (a) to Toffoli (b) module mapping.
Figure 4-2: (a) Redundancy removal before reversible synthesis, (b) Redundancy removal during reversible synthesis.
Figure 4-3: Arbitrary circuit with s-a-v fault (a) classical implementation, with untestable fault, (b) Reversible implementation with direct mapping with fault testable through garbage (c) Toffoli realization of the circuit.
Figure 4-4: Removal of same gates in series- from classical to reversible.
Figure 4-5: Simplified Toffoli modules for s-a-v faults.
Figure 4-6: Steps for reversible redundancy removal.
Figure 4-7: Grouping of Toffoli modules: (a) Group 1 with constant ‘0’, (b) Group 2 with constant ‘1’.
Figure 4-8: Simplification by Toffoli gates removal in different conditions, (a) Modules from same group, (b) Modules from different groups, (c) Modules complementary to each other.
Figure 5-1: Controlled adder/subtractor design in 2’s complement a) CAS block b) 4-bit adder/subtractor [85].
Figure 5-2: Reversible Implementation of controlled adder/subtractor.
Figure 5-3: n-bit Reversible Controlled Adder/Subtractor circuit.
Figure 5-4: Modification to include overflow detector.
Figure 5-5: n-bit Reversible Controlled Adder/Subtractor with overflow detector.
Figure 5-6: Improvement in quantum cost with subtractor size with proposed design compared to [35].
Figure 5-7: Use of a subtractor as a comparator.
Figure 5-8: 4-bit reversible comparator.
Figure 5-9: Number of garbage outputs of different reversible comparators.
Figure 5-10: Delay for different reversible comparators.
Figure 5-11: Quantum cost for different reversible comparators
Figure 5-12: Reversible ALU two steps block diagram
Figure 5-13: Reversible ALU function generator.
Figure 5-14: Reversible ALU function selector (4:1 MUX).
Figure 5-15: Reversible ALU function selector using Fredkin gates.
Figure 5-16: Reversible ALU design I (using 4:1 MUX)
Figure 5-17: Reversible ALU design II (using Fredkin selector)
Figure 5-18: Reversible ALU circuit (a) 1-bit Block diagram and (b) 4-bit reversible implementation
Figure 5-19: Proposed Reversible ALU comparable to V-shaped design
Figure 5-20: Modified 4-bit RALU with overflow detection and set-less-than operation
Figure 5-21: Classical square-root circuit (a) Internal structure of CAS and (b) 8-bit square root circuit [87]
Figure 5-22: Modified RCAS module for Square Root
Figure 5-23: 4-bit reversible square root circuit (2-bit output) (a) Block diagram and (b) Internal reversible implementation.
Figure 5-24: Reversible controlled adder/subtractor
Figure 5-25: 4-bit Reversible controlled adder/subtractor
Figure 5-26: 4-bit Reversible controlled adder/subtractor with overflow detector
Figure 5-27: 4-bit Reversible Comparator
Figure 5-28: Simulation result of 1-bit Reversible ALU block
Figure 5-29: Simulation result of 1-bit RALU using Fredkin MUX
Figure 5-30: Simulation result of 4-bit Reversible ALU with 4:1 MUX
Figure 5-31: Simulation result of RALU_Fredkin_4bit
Figure 5-32: Simulation result of 4-bit RALU with overflow and Set less than
Figure 5-33: Simulation result of RCAS for Square Root Circuit
Figure 5-34: Simulation result of 8-bit Reversible Square Root Circuit
Figure 6-1: Controllability and Observability (a) without constant input (fully testable) (b) contradiction with constant input and search for alternative vector (c) untestable fault in embedded circuit with constant input [89].
Figure 6-2: (a) An example of a correct reversible circuit (b) the gate replacement and the affected area or fault cone.
Figure 6-3: The explanation of the cancellation
Figure 6-4: (a) A correct Toffoli network, (b) the corresponding faulty circuit of the Toffoli network, (c) the reduced reversible miter circuit to apply CNFs.
Figure 6-5: The proper vector to detect the gate replacement fault.
Figure 6-6: General block diagram representing fault excitation and propagation behavior for faulty and fault-free gates
Figure 6-7: Reversible Test Miter (a) Schematic of classical equivalent (b) proposed reversible form
Figure 6-8: Fault location of benchmark circuit 4 mod5
Figure 6-9: Fault observation of benchmark circuit 4 mod5
Figure 6-10: (a) The schematic of correct circuit, (b) with the wire replacement fault, (c) the reversible Toffoli fault free network, (d) the faulty Toffoli network.
List of Tables
Truth-table of a full adder (adding insufficient) garbage
Reversible embedding of full adder
Quantum cost of n-controlled Toffoli gate [27]
Functionality of reversible supercells
Comparison of large size to small size supercell combination of random circuits
Comparison of hybrid mapping to individual cell for random circuits
Comparison of equivalent gate libraries of different steps
Our proposed method vs. BDD based method
Proposed method vs. Technology mapping
Comparison of 2-input to 3-input cell library
Simplification by Redundant fault and Gate removal
Reversible controlled adder/subtractor truth table
Comparison of different 16-bit subtractors
Comparison of different 16-bit Adder/Subtractor
Comparators NOR-AND vs. INV-AND
Reversible ALU Operations with control inputs
Cost comparison of 1-bit ALU
Different 32-bit Reversible ALU realizations
RALU (Fig. 5-19) Operations with control inputs (X is unchanged)
Costs of Reversible Square Root for different size
Comparator outputs at various intervals of Fig. 5-27
RALU operations at various intervals of Fig. 5-32
CNF formula of standard reversible gates
Toffoli-Fredkin functionality table
Number of gates for different fault locations
Test Miter vector for various faults
Comparison of three proposed methods
Faulty behavior of missing control point at different location of RCAS
Testing Missing Control faults for 4-bit RCAS
Abstract
Over the last few years, research on reversible logic has emerged as an important
topic in many directions, from synthesis to test, debugging and
verification as well as arithmetic designs. The motivation behind reversible
computation comes from low power dissipation and close relation to quantum
circuits, which, in the near future, could become a competitor to current classical
circuits. As reversible circuits are still relatively new, the biggest research impact
is on synthesis of such circuits. In the first part of this thesis, we present a
synthesis approach to realize large reversible circuits based on classical
technology mapping. The irreversible nature of most of the original algorithms
makes the synthesis of reversible circuits from irreversible specifications a
challenging task. A large part of the existing algorithms, although optimized in
garbage bits and gate counts, are restricted to small functions, while some
approaches address large functions but are costly in terms of gate count,
additional lines and quantum cost. A synthesis solution for large circuits with lower
quantum cost and fewer garbage bits is presented in this thesis by avoiding
permutation-based reversible embedding.
In addition, we present an indirect way of realizing arithmetic circuits that avoids the
direct translation of the classical truth table and performs better with respect to
various reversible parameters. We develop an improved reversible controlled
adder/subtractor with overflow detection to enhance reliability. We use this
adder/subtractor module, with slight modification, to implement some complex
designs such as a reversible square-root circuit, a comparator for signed numbers
and finally a new integrated module, a reversible arithmetic logic unit, which
encapsulates most of the operations of the classical realization with fewer
control lines. This module is intended to perform the basic mathematical operations
of addition, subtraction with overflow detection, and comparison, as well as the logic
operations AND, OR, XOR and some negated logical functions such as NAND,
NOR and XNOR, including implication. Thus our design is very efficient and
versatile, with fewer lines and lower quantum cost.
Apart from synthesis and design, testing must also be brought on board to
support the reliable implementation of reversible logic. The final part of the
thesis addresses this issue. To date, most reversible circuit fault models include
stuck-at-value, missing gate and control point faults of Toffoli networks.
Nowadays, the synthesis process is not restricted to standard reversible gates;
some designs, especially arithmetic circuits, include other gates. In such
realizations, failures can happen due to erroneous replacements or incorrect
cascading of gates, which cannot be described by the existing fault models alone.
Thus, in this thesis, we present two fault models, namely the gate replacement fault
and the wire replacement fault, which target circuits implemented using any reversible
gate library. To test such faults, three testing schemes are proposed by adopting
the conventional testing methods for irreversible circuits based on Boolean
Satisfiability (SAT) formulation. In particular, a new Reversible Test Miter is
constructed, which, along with backtracking, speeds up the detection of gate and wire
replacement faults with less memory. In addition, in a separate study, the
testability of modular reversible designs is investigated and presented in this thesis,
showing that the same test set of the basic block is applicable to the cascaded design.
We hope our effort on synthesis, design and test of reversible circuits will contribute to
their viable technological realization.
Abrégé
Over the last few years, research on reversible logic has emerged as an important
topic in many areas, ranging from synthesis to test, debugging and verification as
well as arithmetic designs, bringing alternative solutions to classical networks. The
interest in reversible computation comes from its low energy dissipation and its close
relation to quantum circuits, which, in the near future, could come to compete with
current classical circuits. As reversible circuits are still relatively new, the biggest
research impact is on the synthesis of such circuits. In the first part of this thesis,
we present an approach for realizing large reversible circuits based on classical
technology. The irreversible nature of most of the underlying algorithms makes the
synthesis of reversible circuits from irreversible specifications difficult. A large part
of the existing algorithms, although optimized in terms of garbage bits and gate
count, are limited to small functions, while others target larger functions but are
costly in gate count, additional lines and quantum cost. A synthesis solution for
large circuits is presented in this thesis with lower quantum cost and fewer garbage
bits by avoiding permutation-based reversible embedding.
In addition, we present an indirect way of realizing arithmetic circuits.
We develop a reversible controlled adder/subtractor (RCAS) block.
This design includes overflow detection to improve its reliability. We
use this RCAS module to implement a square root circuit and
comparators for signed numbers. Then, we introduce a new integrated
module, a reversible arithmetic logic unit, which encapsulates most of the operations
of the classical realization with fewer control lines. This module aims to perform the
basic mathematical operations such as addition, subtraction with overflow
detection and comparison, as well as logic operations such as AND, OR,
XOR and other negated functions such as NAND, NOR and XNOR,
and implication. Consequently, our design is very efficient and versatile,
with fewer control lines and a lower quantum cost.
Besides synthesis, optimization and arithmetic design, testing must be put in
place to accommodate a reliable implementation of reversible logic. The last
part of this thesis addresses this issue. To date, fault models for reversible
circuits include stuck-at-value faults, missing gate faults and the appearance or
disappearance of control points. However, the synthesis process is not limited to
standard reversible gates, but extends to designs with other gates, in particular the
Fredkin and Peres gates as well as newly proposed gates. In such a realization,
faults can occur due to erroneous replacements or incorrect cascading. We present
two models, namely the gate replacement fault and the wire replacement fault,
which target circuits implemented using any reversible gate library. To test these
faults, we propose three testing schemes by adopting the conventional testing
methods for irreversible circuits based on Boolean Satisfiability (SAT) formulations.
We also study the testability of modular reversible designs and show that the same
test set for a basic block applies to cascaded designs. We hope that our work on
the synthesis, design and test of reversible circuits will help enrich the viability of
their technological realization.
Chapter 1 Introduction
Recently, reversible circuit synthesis has started to emerge as an important topic,
bringing alternative solutions to classical Boolean networks. The motivation
behind reversible computation emerged from two important facts. First, such
circuits dissipate less energy, and secondly, they are closely related to several
emerging technologies such as quantum circuits, which could become a viable
competitor to current classical circuits.
In 1961, Landauer showed that irreversible circuits, regardless of the underlying
technology, always consume power and consequently dissipate heat at a rate of at
least kT ln 2 for each bit of information erased, where k is Boltzmann’s constant and
T is the temperature [1]. Later, Bennett showed that, in principle, arbitrarily small or
zero energy dissipation is possible only if no information is lost during computation [2].
This holds for reversible circuits, since input data is processed into output data without
losing any of the original information. Although the fraction of the power consumption
in current VLSI circuits attributable to information loss is negligible, this is expected to
change as increasing packing densities force the power consumption per gate operation
to decrease [3], making reversible computation an attractive alternative.
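As a rough numerical illustration of the Landauer bound quoted above, the following Python sketch evaluates kT ln 2. The temperature of 300 K and the function name landauer_limit are assumptions chosen only for illustration; they are not taken from [1].

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K (exact SI value)


def landauer_limit(temperature_kelvin: float) -> float:
    """Minimum energy in joules dissipated per erased bit: k * T * ln 2."""
    return K_BOLTZMANN * temperature_kelvin * math.log(2)


# Assumed operating temperature of 300 K (roughly room temperature).
energy = landauer_limit(300.0)
print(f"{energy:.3e} J per erased bit at 300 K")
```

The result is on the order of 10^-21 J per bit, which helps explain why this contribution is negligible in current circuits.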
Studying reversible circuits enriches our knowledge of quantum computation, as
reversibility is an integral part of the latter. Quantum circuits use qubits instead of
traditional bits, allowing the values 0 and 1 as well as their superpositions. As every
quantum operation is inherently reversible, classical reversible circuits are a subclass
of quantum circuits [4]. Thus, any development in this domain can also contribute to
progress in quantum logic.
Additionally, applications of reversible circuits are found in low-power CMOS
design [5], adiabatic circuits [6,7], cryptography [8], optical computing [9] and
digital signal processing [10,11], all of which require that the information encoded in
the inputs be preserved in the outputs.
1.1 Objective, scope and motivation
The objective of this thesis is to propose new design, synthesis and testing
algorithms for reversible circuits built on the foundations of classical methods. In
our approach, we aim to apply existing classical synthesis and testing methods
with modifications as necessary. Note that our interest in this work is in logical
reversibility, not in physical (thermodynamic) reversibility. The technological
realizations of reversible circuits are still in their infancy, and hence we rely on the
existing literature. Similarly, our approaches do not explicitly target the unitary
operations of quantum computing, which exploit quantum mechanical phenomena
such as superposition and entanglement. Our interest is to propose general
approaches for the synthesis and design of binary reversible circuits that are
applicable to their future realization irrespective of the underlying technology.
Our first goal is to propose methods that are capable of addressing any
irreversible function, represented either at the function level or at the gate level,
and finding its reversible realization. Further, we want to study optimization
techniques to realize improved networks. Another avenue we focus on is designing
arithmetic circuits in the context of reversible computing, since, despite the
change in implementation technology, the basic structure of the stored-program
digital computer (known as the Von Neumann architecture) requires an arithmetic
logic unit to process data. Moreover, reversible (permutative) binary circuits,
which are a sub-category of quantum circuits, include blocks that are parts of
oracles, such as comparators, arithmetic blocks and counters of ones. Our final
goal is to concentrate on the testing of reversible circuits and to propose efficient
methods to detect faults in networks under different design error models.
Synthesis and hardware implementation of reversible logic differ from traditional
approaches, because fan-out and feedback are not allowed and a reversible function
is realized as a cascade of reversible gates [12]. Reversible synthesis processes
start from a reversible specification, in which the set of output vectors is a
permutation of the set of input vectors. These permutations are then synthesized to
create reversible networks. For example, three variables give eight input patterns,
from which we can create up to 8! = 40320 reversible functions [13]. On the other
hand, to find a reversible implementation of a classical irreversible function,
synthesis approaches mainly rely on first finding a reversible embedding into a
suitable reversible function, after which the synthesis can take place [14]. Even a
simple three-variable function requires searching among 40320 reversible functions
for a proper embedding, and the complexity grows rapidly with the number of
variables, since n variables admit (2^n)! reversible functions. A large part of the
existing algorithms, although optimized in garbage bits and gate count, are
restricted to small functions, while some approaches successfully address large
functions but are costly in terms of gate count, additional lines and quantum cost
[15-28]. Further, assigning arbitrary values to the extra signals to create
reversibility is an open problem. An efficient method is therefore required that
bypasses these permutations, reversible embeddings and extra signal assignments for
larger functions. This thesis aims to find a solution to this problem. Our solutions
do not restrict the synthesis process to finding permutations of reversible
specifications and thereby offer better scalability.
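The count of 40320 and the permutation property of a reversible specification can be checked with a short Python sketch. The function names and the integer encoding of input/output patterns are assumptions made here for illustration only.

```python
from math import factorial
from typing import Sequence


def count_reversible_functions(num_variables: int) -> int:
    """Number of permutations of the 2**n input patterns, i.e. (2**n)!."""
    return factorial(2 ** num_variables)


def is_reversible(outputs: Sequence[int], num_variables: int) -> bool:
    """True if the output column is a permutation of all 2**n patterns.

    outputs[i] is the output pattern, encoded as an integer, that the
    specification assigns to input pattern i.
    """
    return sorted(outputs) == list(range(2 ** num_variables))


print(count_reversible_functions(3))  # 8! = 40320
# Truth table of a 3x3 Toffoli gate: only patterns 110 and 111 are exchanged.
print(is_reversible([0, 1, 2, 3, 4, 5, 7, 6], 3))
# A specification with repeated output patterns is not reversible.
print(is_reversible([0, 1, 1, 2, 1, 2, 2, 3], 3))
```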
Some recent works focus on the reversible realization of arithmetic circuits [29-36]
by directly translating their truth tables. There are also attempts to propose
quantum reversible implementations of arithmetic circuits [37-40]. Most of these
designs require a high number of garbage bits and a high quantum cost. Our target is
to propose an improved design with lower cost that is applicable to realizing an
efficient and flexible arithmetic logic unit.
While the synthesis and optimization of reversible circuits have been studied
extensively, some work has also addressed the testing of such circuits for design and
implementation errors [41-48]. By nature, reversibility offers transparency in signal
propagation, and the cascading of reversible gates provides controllability and
observability that ease the testing process. Technology dependency introduces
different types of faults in reversible circuits. This area therefore needs
investigation, and our goal is to relate some classical methods to the testing of
reversible networks.
1.2 Contributions
In this study, we have proposed a synthesis technique for reversible circuits
based on classical technology mapping, which can handle larger circuits by
bypassing the permutation of input bit combinations. We then studied the
application of the redundancy removal procedure generally used in irreversible
design to the optimization of reversible networks. A large part of this thesis is
devoted to efficient reversible arithmetic designs: a novel controlled
adder/subtractor with overflow detector, a generalized square root circuit, a
comparator circuit and a reversible arithmetic logic unit, which accommodates more
functions than existing approaches. Our final contributions include error modeling
and testing of gate and wire replacement faults in reversible circuits. A number of
the findings of this study have been presented in refereed publications, while some
of the results have been submitted and are still under review. The scholarly
contribution of the thesis, as a list of publications, is as follows:
• S. Sultana, K. Radecka, ‘Reversible Architecture of Computer Arithmetic’,
  accepted for publication in International Journal of Computer Applications
  (IJCA), May 2014.
• S. Sultana, A. R. Fekr, K. Radecka, ‘SAT-based Reversible Gate/Wire
  Replacement Fault Testing’, IEEE 56th International Midwest Symposium
  on circuits and systems (MWSCAS 2013), special session on Reversible
  Computing, pp. 1075-1078.
• S. Sultana, K. Radecka, ‘Testing Reversible Adder/Subtractor for Missing
  Control Points’, IEEE 56th International Midwest Symposium on circuits
  and systems (MWSCAS 2013), pp. 412-415.
• S. Sultana, Y. Pang, K. Radecka, ‘A Study on Relating Redundancy
  Removal in Classical Circuits to Reversible Mapping’, 29th IEEE
  International Conference on Computer Design (ICCD 2011), pp. 206-211.
• S. Sultana, K. Radecka, ‘Reversible implementation of square-root circuit’,
  International Conference on Electronics, Circuits and Systems (ICECS
  2011), pp. 141-144.
• S. Sultana, K. Radecka, ‘Reversible adder/subtractor with overflow
  detector’, IEEE 54th International Midwest Symposium on circuits and
  systems (MWSCAS 2011), pp. 1-4.
• S. Sultana, K. Radecka, ‘Rev-map: A direct gate way from classical
  irreversible network to reversible network’, IEEE 41st International
  Symposium on Multiple-Valued Logic (ISMVL-2011), pp. 147-152.
1.3 Organization
The organization of this dissertation is as follows. In the first chapter, we give a
basic introduction to reversible logic and present the motivation and objective of
the research, the main contributions of the work and an overview of the rest of the
document. We review the literature to establish the basis of reversible logic, its
progress and our scope in this area. Details of the state of the art for the
different areas of reversible circuits related to our work are presented in the
corresponding chapters.
The background of reversible logic is presented in chapter 2. The definition of a
reversible specification, the standard reversible gates and their quantum
realizations are introduced there. We present the details of the reversible
embedding of irreversible circuits and the basic constraints on realizing such
designs. We also introduce the complexity measures for reversible circuits.
Finally, to put this study into perspective, we describe the state of the art
regarding the hardware implementation of some reversible circuits.
In chapter 3, we present our proposed synthesis method based on classical
technology mapping. We first give an overview of the existing approaches to
reversible synthesis. Then we describe the creation of our reversible Toffoli
modules, which are equivalent to classical gates, at different steps for enhanced
flexibility. The mapping algorithm used to obtain the reversible network is
presented next. We investigate the significance of various packings of classical
gates and their reversible modules for minimizing the number of garbage outputs and
improving the overall cost of the realization. We then compare our approach with
one of the best existing synthesis methods for large circuits.
In chapter 4, we study the application of the redundancy removal technique of
classical irreversible circuits to reversible circuits. We describe the basics of the
redundant stuck-at-value fault model and its impact on the simplification of
reversible circuits synthesized by technology mapping. We show the gate count
reduction achieved under redundancy conditions through experimental results.
In chapter 5, we focus on designing reversible arithmetic circuits. We first present
the state of the art regarding various arithmetic designs and their limitations.
Then we introduce our new controlled adder/subtractor design, which is enhanced
with overflow detection. We explain how this design is employed to realize a
reversible comparator and a reversible arithmetic logic unit (RALU). We present the
details of a flexible and improved realization of the RALU, which includes more
functions than existing designs. Finally, we explain our novel structure for a
reversible square root circuit. At each step we compare our proposed design with
other designs.
In chapter 6, we investigate the testing of reversible circuits using a well-known
classical method based on Boolean satisfiability. We introduce two fault models,
named gate and wire replacement faults, and show how they can represent other
faults. We discuss the basics of Boolean satisfiability and the SAT formulation of
reversible gates. Then we propose three methods to test the gate and wire
replacement faults. We compare the methods through experimental results and
describe the significance of our proposed reversible test miter in speeding up the
testing procedure. In addition, we give a glimpse of testing modular designs of
reversible circuits, in particular testing a reversible adder/subtractor for control
point faults. We show that a small test set can handle even a large design.
In chapter 7, we conclude by summarizing our contributions and presenting some
directions for further improvement. First, a synopsis of the work done is
presented, followed by the findings of the study. Several initial findings form the
basis of a short list of important future work, which wraps up the chapter.
Chapter 2 Background of reversible circuits
Reversible circuits have been studied extensively as an alternative to classical
design. Some unique properties of reversible logic have been identified that
facilitate the synthesis and testing of circuits and, above all, their application in
different areas of computation. The quantum realizations of reversible gates are
used to relate reversible logic to quantum circuits. In this chapter, we present the
basic definitions and properties of reversible functions, gate design and circuit
construction, followed by the cost parameters of reversible circuits.
2.1 Reversible function
An n×n reversible circuit realizes an n-input/n-output function in which each input
vector maps to a unique output vector, i.e. the mapping is a bijection. Reversible
circuits allow no fan-out and no feedback path. Any reversible circuit can be built
as a cascade of reversible gates that obeys these two rules.
Irreversible functions differ from their reversible counterparts in two respects.
First, the numbers of inputs and outputs are generally not equal, and secondly, the
input-output mapping is not unique, i.e., the same output pattern can occur for
different input combinations. Therefore, some processing is required to transform
an irreversible function into a reversible one. Simply adding extra output variables
to match the number of inputs is generally not sufficient, as this does not
guarantee reversibility; we additionally need to ensure that every output pattern is
unique. For example, consider the truth table of a full adder. The adder has three
inputs and two outputs: sum and carry. Even if we add one output to make the
numbers of inputs and outputs equal, the specification does not become reversible,
because the carry/sum patterns 01 and 10 each occur three times and a single extra
bit can distinguish only two occurrences (Table 2-1). Hence, we need to add more
outputs to make the circuit reversible.
Table 2-1: Truth table of a full adder (adding insufficient garbage)
Cin  A  B    carry  sum  g
0    0  0    0      0    0
0    0  1    0      1    0
0    1  0    0      1    1
0    1  1    1      0    0
1    0  0    0      1    -
1    0  1    1      0    1
1    1  0    1      0    -
1    1  1    1      1    1
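The argument behind Table 2-1 can be sketched in Python: a (carry, sum) pattern that occurs m times needs m distinct garbage values, and g garbage bits supply only 2^g of them. The function names below are hypothetical and chosen only for this illustration.

```python
from collections import Counter
from itertools import product


def full_adder(cin: int, a: int, b: int) -> tuple:
    """Return (carry, sum) of a one-bit full adder."""
    total = cin + a + b
    return total >> 1, total & 1


def can_be_made_unique(num_garbage_bits: int) -> bool:
    """Check whether the given number of garbage outputs can make all
    output rows of the full adder distinct.

    A (carry, sum) pattern occurring m times needs m distinct garbage
    values, while num_garbage_bits bits provide 2**num_garbage_bits.
    """
    counts = Counter(full_adder(*bits) for bits in product((0, 1), repeat=3))
    return max(counts.values()) <= 2 ** num_garbage_bits


print(can_be_made_unique(1))  # one garbage bit, as in Table 2-1
print(can_be_made_unique(2))  # two garbage bits
```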
Thus reversible synthesis from an irreversible specification aims to embed an
arbitrary irreversible function of I inputs and O outputs (generally I ≠ O) into a
reversible circuit constructed solely from reversible gates. The added A inputs
are referred to as ancilla bits and the extra G outputs as garbage bits. The lower
bound on the number of garbage bits G is based on the maximum number of
repetitions M of an identical output combination: the minimal number of added
garbage bits is $G = \lceil \log_2 M \rceil$. The number of ancilla bits A is then
such that I + A = O + G. For example, in the irreversible specification of the adder
circuit in Table 2-1 (I = 3 and O = 2), the most frequent output patterns (01 and 10)
are each repeated M = 3 times. So we need to add $G = \lceil \log_2 3 \rceil = 2$
extra bits at the output (garbage g1 and g2 in Table 2-2). To ensure an equal number
of inputs and outputs, we add one bit at the input, i.e. ancilla A = 1 (a1 in
Table 2-2).
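The bound on G and the resulting value of A can be computed mechanically. The following Python sketch does this for the full adder; the helper name embedding_size and the dictionary representation of the truth table are assumptions for illustration, and the integer expression (M - 1).bit_length() is used as an exact way to evaluate the ceiling of log2 M.

```python
from collections import Counter
from itertools import product


def embedding_size(truth_table: dict, num_inputs: int, num_outputs: int) -> tuple:
    """Return (M, G, A) for an irreversible truth table.

    M is the largest number of times one output pattern repeats,
    G = ceil(log2(M)) is the minimal number of garbage outputs, and
    A is the number of ancilla inputs satisfying I + A = O + G.
    """
    max_repetition = max(Counter(truth_table.values()).values())
    garbage = (max_repetition - 1).bit_length()  # integer ceil(log2(M))
    ancilla = num_outputs + garbage - num_inputs
    return max_repetition, garbage, ancilla


# Full adder: inputs (Cin, A, B), outputs (carry, sum).
full_adder_table = {
    (cin, a, b): ((cin + a + b) >> 1, (cin + a + b) & 1)
    for cin, a, b in product((0, 1), repeat=3)
}

print(embedding_size(full_adder_table, num_inputs=3, num_outputs=2))  # (3, 2, 1)
```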
Table 2-2: Reversible embedding of full adder
a1 Cin A B carry sum g1 g2
0
0
0
0
0
0
0
0
0
1
0
0
1
1
0
0
1
0
1
0
0
0
0
1
0
0
1
1
0
1
0
0
0
0
1
0
0
1
0
0
0
0
0
1
1
1
0
1
1
1
0
1
1
1
1
0
0
1
1
0
1
0
1
0
The ancilla bits can be set to constant 0 or 1 by the designer. Garbage outputs
can have any value following the reversible constraint, i.e. no output pattern is
repeated in the specification.
2.2 Reversible gates
Given a reversible specification, there exist several ways to construct reversible
networks using reversible gates. The most standard reversible gates are the NOT,
CNOT and Toffoli gates, which are special cases of the multiple control Toffoli
gate [49]. Two other gates commonly used in reversible designs are the Fredkin and
Peres gates. We present a brief description of these gates next.
Definition 1: A multiple control Toffoli gate has several control lines and a
single target line $x_j$. The gate maps $(x_1, x_2, x_3, \ldots, x_j, \ldots, x_n)$ to
$(x_1, x_2, x_3, \ldots, x_{i_1} x_{i_2} \cdots x_{i_k} \oplus x_j, \ldots, x_n)$,
where $x_{i_1}, x_{i_2}, \ldots, x_{i_k}$ are the control lines, i.e. the value of the
target line is inverted when all control lines are set to 1.
The NOT gate is a multiple control Toffoli gate with no controls, denoted as
TOF(xj). The CNOT (controlled NOT) gate, also known as the Feynman gate, is a
multiple control Toffoli gate with a single control bit, denoted as TOF(xi; xj). The
original Toffoli gate is a multiple control Toffoli gate with two controls, denoted
as TOF(xi1, xi2; xj). These gates are shown in Fig. 2-1.
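Definition 1 translates almost directly into code. The following Python sketch models a multiple control Toffoli gate acting on a tuple of bits; the function names and the use of 0-based line indices are assumptions made for this illustration.

```python
from itertools import product
from typing import Sequence, Tuple


def apply_mct(state: Sequence[int], controls: Sequence[int], target: int) -> Tuple[int, ...]:
    """Apply a multiple control Toffoli gate to a tuple of bits.

    The target bit is inverted when all control bits are 1. With an empty
    control list the gate is a NOT, with one control a CNOT, and with two
    controls the original Toffoli gate.
    """
    bits = list(state)
    if all(bits[c] == 1 for c in controls):
        bits[target] ^= 1
    return tuple(bits)


def is_self_inverse(num_lines: int, controls: Sequence[int], target: int) -> bool:
    """Check that applying the same gate twice restores every input."""
    return all(
        apply_mct(apply_mct(s, controls, target), controls, target) == s
        for s in product((0, 1), repeat=num_lines)
    )


print(apply_mct((0,), [], 0))             # NOT
print(apply_mct((1, 0), [0], 1))          # CNOT
print(apply_mct((1, 1, 0), [0, 1], 2))    # Toffoli
print(is_self_inverse(3, [0, 1], 2))
```

Because the controls pass through unchanged, each such gate is its own inverse, which the last check confirms exhaustively for three lines.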
[Figure 2-1 panels: NOT (A → X); CNOT (X = A, Y = A ⊕ B); Toffoli (X = A, Y = B, Z = C ⊕ AB)]
Figure 2-1: Standard reversible gates.
A 3x3 Toffoli gate implements the function Z = AB ⊕ C with two control inputs A and
B, which are copied to the outputs X and Y. In this way, the gate can pass signals A
and B on to the rest of the circuit in place of fan-out. This gate is universal,
since any reversible function can be realized with a cascade of this gate only. For
example, AND, NOT and XOR can be obtained from the Toffoli gate T (A, B, C) as
follows:
AND: T (A, B, 0) = AB
NOT: T (A, 1, 1) = Ā
XOR: T (1, B, C) = B ⊕ C or T (A, 1, C) = A ⊕ C
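The three identities above can be verified exhaustively with a small Python sketch. The function name toffoli and its return convention (X, Y, Z) are assumptions for illustration.

```python
from itertools import product


def toffoli(a: int, b: int, c: int) -> tuple:
    """3x3 Toffoli gate: returns (X, Y, Z) = (A, B, AB xor C)."""
    return a, b, (a & b) ^ c


# Check the AND, NOT and XOR identities on the target output Z.
for a, b, c in product((0, 1), repeat=3):
    assert toffoli(a, b, 0)[2] == (a & b)   # AND: T(A, B, 0)
    assert toffoli(a, 1, 1)[2] == 1 - a     # NOT: T(A, 1, 1)
    assert toffoli(1, b, c)[2] == b ^ c     # XOR: T(1, B, C)
    assert toffoli(a, 1, c)[2] == a ^ c     # XOR: T(A, 1, C)

print("all identities hold")
```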
A multiple controlled Fredkin gate [50], Fig. 2-2(a), is a controlled swap gate with
two target lines. The basic gate maps the inputs (A, B, C) to the outputs
(X = A, Y = A’B + AC, Z = AB + A’C). Note that the values of the target lines are
interchanged if the control lines are set to 1. The Fredkin gate is an important
part of many arithmetic circuit designs.
Another commonly used gate is the Peres gate [51], Fig. 2-2(b). This gate maps the
inputs (A, B, C) to (X = A, Y = A ⊕ B, Z = AB ⊕ C). The main advantage of the Peres
gate is its capability to implement a half adder using only one instance of the
gate [51].
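A short Python sketch can confirm that both gates are reversible and that a Peres gate with C = 0 acts as a half adder (Y = A ⊕ B is the sum, Z = AB is the carry). The function names are hypothetical and used only for this illustration.

```python
from itertools import product


def fredkin(a: int, b: int, c: int) -> tuple:
    """Controlled swap: (X, Y, Z) = (A, A'B + AC, AB + A'C)."""
    return a, ((1 - a) & b) | (a & c), (a & b) | ((1 - a) & c)


def peres(a: int, b: int, c: int) -> tuple:
    """Peres gate: (X, Y, Z) = (A, A xor B, AB xor C)."""
    return a, a ^ b, (a & b) ^ c


def is_reversible(gate) -> bool:
    """A 3x3 gate is reversible if its 8 outputs are all distinct."""
    inputs = list(product((0, 1), repeat=3))
    return len({gate(*bits) for bits in inputs}) == len(inputs)


print(is_reversible(fredkin), is_reversible(peres))

# Half adder from a single Peres gate with the third input fixed to 0.
for a, b in product((0, 1), repeat=2):
    _, half_sum, half_carry = peres(a, b, 0)
    assert half_sum == (a + b) & 1
    assert half_carry == (a + b) >> 1
```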
[Figure 2-2 panels: (a) Fredkin (A → A, B → A’B + AC, C → AB + A’C); (b) Peres (A → A, B → A ⊕ B, C → C ⊕ AB)]
Figure 2-2: Reversible gates (a) Fredkin (b) Peres.
Another commonly available reversible gate is the Kerntopf gate [17], which can
provide many functions at the same time; so far, however, circuits implemented with
this gate have been more costly than those built with other gates. This gate has
three outputs, P, Q and R, defined as:
P (A, B, C) = 1 ⊕ A ⊕ B ⊕ C ⊕ AB
Q (A, B, C) = 1 ⊕ AB ⊕ B ⊕ C ⊕ BC
R (A, B, C) = 1 ⊕ A ⊕ B ⊕ AC
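As a sanity check on the three output functions, the following Python sketch enumerates the gate and confirms that the mapping is a bijection. The function name kerntopf is an assumption; the observation that fixing C = 1 yields OR on P and AND on Q follows from the formulas themselves and is shown only as an example of the gate providing several functions at once.

```python
from itertools import product


def kerntopf(a: int, b: int, c: int) -> tuple:
    """Kerntopf gate outputs (P, Q, R) in exclusive-or sum-of-products form."""
    p = 1 ^ a ^ b ^ c ^ (a & b)
    q = 1 ^ (a & b) ^ b ^ c ^ (b & c)
    r = 1 ^ a ^ b ^ (a & c)
    return p, q, r


outputs = [kerntopf(*bits) for bits in product((0, 1), repeat=3)]
print(len(set(outputs)) == 8)  # all eight output patterns are distinct

# With C fixed to 1, P reduces to A OR B and Q reduces to A AND B.
for a, b in product((0, 1), repeat=2):
    p, q, _ = kerntopf(a, b, 1)
    assert p == (a | b)
    assert q == (a & b)
```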
The Universal AND/OR gate shown in Fig. 2-3 is an inexpensive gate that packs the
AND and OR functions together and whose polarity can be programmed [52]. This
gate can be realized with one Toffoli gate and two CNOT gates. Along with the two
output functions, the gate also passes one input signal through, which can be used
for signal fan-out. This gate reduces the number of garbage output bits by 1. The
gate outputs are:
F (A, B, C) = {P (A, B, C), Q (A, B, C), R (A, B, C)}
F (A, B, 1) = {P = A, Q = (A+B)’, R = (AB)’}
F (A, B, 0) = {P = A, Q = A+B, R = AB}
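One arrangement of a Toffoli gate and two CNOT gates that reproduces F (A, B, 0) = {A, A+B, AB} is sketched below in Python. The gate order and the placement of Q on the B line and R on the C line are assumptions made for this illustration; the text states only the gate count, so the actual construction in [52] may differ.

```python
from itertools import product


def universal_and_or(a: int, b: int, c: int) -> tuple:
    """Assumed realization with one Toffoli and two CNOT gates.

    Returns (P, Q, R), where P is the unchanged A line.
    """
    c ^= a & b   # Toffoli: controls A and B, target C
    b ^= a       # CNOT: control A, target B
    b ^= c       # CNOT: control C, target B
    return a, b, c


for a, b in product((0, 1), repeat=2):
    # C = 0 selects the positive polarity: OR and AND.
    assert universal_and_or(a, b, 0) == (a, a | b, a & b)
    # C = 1 selects the negative polarity: NOR and NAND.
    assert universal_and_or(a, b, 1) == (a, 1 - (a | b), 1 - (a & b))

print("polarity check passed")
```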
[Figure 2-3 panels: (a) classical and (b) Toffoli realizations, each with inputs A, B, C and outputs P, Q, R]
Figure 2-3: Universal AND-OR gate (a) classical (b) Toffoli.
Besides these gates, several new application-specific reversible gates, such as the
R gate [101], the TR gate [35], the universal reversible logic gate (URG gate) [102]
and the modified Toffoli gate/BJN gate [53], have been proposed for designing
adders, subtractors, comparators, arithmetic logic units, etc. The input-output
mappings of these 3x3 gates are shown in Fig. 2-4.
[Figure 2-4 panels, each with inputs A, B, C: R gate (O1 = A ⊕ B, O2 = A, O3 = C’ ⊕ AB); TR gate (O1 = A, O2 = A ⊕ B, O3 = C ⊕ AB’); URG gate (O1 = (A+B) ⊕ C, O2 = B, O3 = C ⊕ AB); BJN gate (O1 = A, O2 = B, O3 = (A+B) ⊕ C)]
Figure 2-4: Reversible gates for arithmetic designs [53].
2.3 Cost of reversible circuits
The parameters that define the figure of merit of a reversible circuit include the
number of gates, the number of extra garbage outputs and the number of constant
inputs (i.e. ancilla) needed to realize the circuit. In the simplest case, the cost
of an implementation of a reversible function f is defined as the number of Toffoli
gates in the network that realizes f. From the point of view of technological
implementation, however, this is too simple, since multiple controlled Toffoli gates
are themselves composed of a large set of gates, and each additional control line
increases the cost of the gate. The parameter most commonly used for comparative
analysis in the current literature is the quantum cost of a gate, which is defined
as follows:
Definition 2: The quantum cost of a reversible gate T is defined as the number
of elementary quantum operations (NOT, CNOT and controlled V or V+ gates)
required to realize this gate.
Gates V and V+ are called square-root-of-NOT gates, since
$V \cdot V = V^{+} \cdot V^{+} = \mathrm{NOT}$. The controlled V gate changes the
target line according to the transformation matrix
$V = \frac{1+i}{2} \begin{pmatrix} 1 & -i \\ -i & 1 \end{pmatrix}$
when the control line is set to 1. Similarly, the controlled V+ gate changes the
target following the transformation matrix
$V^{+} = \frac{1-i}{2} \begin{pmatrix} 1 & i \\ i & 1 \end{pmatrix}$
when the control line is set to 1 [103]. The standard reversible gates are sometimes
characterized by their quantum realization and the associated quantum cost. For
example, the Toffoli gate is realized with five elementary quantum operations
(Fig. 2-5) and hence has a quantum cost of 5.
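The square-root-of-NOT property can be checked numerically. The sketch below uses the commonly quoted matrix V = ((1+i)/2) [[1, -i], [-i, 1]] and takes V+ as its conjugate transpose; both the matrix form and the helper names are assumptions for this illustration rather than definitions taken from [103].

```python
def matmul2(x, y):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def close(x, y, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))


S = (1 + 1j) / 2
V = [[S, S * -1j], [S * -1j, S]]
# V+ is the conjugate transpose (Hermitian adjoint) of V.
V_DAGGER = [[V[j][i].conjugate() for j in range(2)] for i in range(2)]

NOT = [[0, 1], [1, 0]]
IDENTITY = [[1, 0], [0, 1]]

print(close(matmul2(V, V), NOT))                # V applied twice is NOT
print(close(matmul2(V_DAGGER, V_DAGGER), NOT))  # so is V+ applied twice
print(close(matmul2(V, V_DAGGER), IDENTITY))    # V+ undoes V
```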
[Figure 2-5 content: lines A, B, C; controlled V, V and V+ gates on the C line; outputs X = A, Y = B, Z = A·B ⊕ C]
Figure 2-5: Quantum realization of Toffoli gate.
The quantum cost of the Fredkin gate is 5, since it is realized by 2 CNOT gates, 1
controlled-V gate and two paired integrated qubit gates (dotted circles in Fig. 2-6)
[54]. In many cases, the quantum cost of the Fredkin gate is assumed to be 7.
[Figure 2-6 content: lines A, B, C; controlled V+, V and V gates; outputs X = A, Y = A’B + AC, Z = AB + A’C]
Figure 2-6: Quantum realization of Fredkin gate.
The Peres gate is constructed using one Toffoli gate and one CNOT gate.
However, in quantum realization, it requires only four elementary operations and
thus the cost is 4, Fig. 2-7.
[Figure 2-7 content: lines A, B, C; controlled V, V+ and V gates; outputs X = A, Y = A ⊕ B, Z = AB ⊕ C]
Figure 2-7: Quantum realization of Peres gate.
Table 2-3 shows the quantum cost for a selection of multiple controlled Toffoli
gate configurations [27]. For example, the NOT and CNOT gates each have a quantum
cost of one. Larger Toffoli gates have a higher quantum cost because of the number
of elementary quantum operations required for their realization.
Table 2-3: Quantum cost of n-controlled Toffoli gate [27]
No. of control lines    Quantum cost
0                       1
1                       1
2                       5
3                       13
4                       26, if at least 2 lines are unconnected
4                       29, otherwise
5                       38, if at least 3 lines are unconnected
5                       58, if at least 1 or 2 lines are unconnected
5                       61, otherwise
6                       50, if at least 4 lines are unconnected
6                       80, if at least 1, 2 or 3 lines are unconnected
6                       125, otherwise
The quantum cost of an entire circuit is the sum of the quantum costs of its gates.
Other complexity measures for a reversible network in the recent literature include
the total number of pass transistors and the delay of the realization in CMOS
technology [132].
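Summing per-gate costs is straightforward to sketch in Python. The lookup below uses only the Table 2-3 entries that do not depend on the number of unconnected lines; the list-of-control-counts circuit representation and the example cascade are hypothetical.

```python
# Quantum cost of a multiple control Toffoli gate by number of control
# lines, restricted to the unconditional entries of Table 2-3.
QUANTUM_COST = {0: 1, 1: 1, 2: 5, 3: 13}


def circuit_quantum_cost(control_counts) -> int:
    """Sum the quantum costs of a cascade of Toffoli-type gates.

    control_counts lists, for each gate in the cascade, how many control
    lines it has (0 = NOT, 1 = CNOT, 2 = Toffoli, ...).
    """
    return sum(QUANTUM_COST[n] for n in control_counts)


example_circuit = [2, 1, 0]  # hypothetical cascade: Toffoli, CNOT, NOT
print(circuit_quantum_cost(example_circuit))  # 5 + 1 + 1 = 7
```

Note that this gate-by-gate sum is an upper estimate: as the Peres gate shows, a Toffoli gate followed by a CNOT gate sums to 6, while the direct quantum realization of the pair costs 4.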
2.4 Implementation of reversible circuits and energy consumption
The possibility of generalizing Landauer’s principle in the context of logical and
physical reversibility is discussed in [55], while the application of reversible
circuits in different fields motivates growing efforts towards their physical
implementation. In CMOS technology, an implementation of reversible circuits was
proposed by A. De Vos [5, 56, 57], based on the fact that logical reversibility is a
necessary (although not sufficient) condition for physical reversibility. In this
approach, no power supply inputs are present: there are no Vdd, Vss or ground bus
bars and no clock lines, and all energy delivered at the outputs originates from
the inputs. A square-like representation of a reversible MOS gate, which is capable
of implementing any combinational circuit, is presented in Fig. 2-8 for any such
circuit. The function
is equivalent to
and
All energy supplied to the outputs
and
come from the inputs
and
.
To ensure low power consumption (quasi)-adiabatic operation is combined with
reversible logic in pass transistors. In this approach, adiabatic r-MOS logic
consumes only
is
while static CMOS has
, i.e. power reduction
. The scaling approach forces to shrink Vt along with Vdd suggesting Vdd
to be in between 2Vt and 6Vt which ensures low power consumption in reversible
logic.
[Figure 2-8 content: pass-transistor network with terminals Xk and Yk, switched by the function f]
Figure 2-8: CMOS realization of CNOT gate [5].
A bio-molecular realization of the reversible Fredkin gate is proposed in [58], based
on enzymatic reactions that are used in DNA amplification. A method to connect
these gates in a biochemical fashion to create a reversible logic network is also
presented. According to [58], because DNA amplification can combine different
elements by hybridization and polymerization, it is promising for powerful
computation. The process can operate at near-equilibrium conditions and dissipate
little energy. In this operation, the original input is recovered by feeding the
DNA output of the Fredkin gate into a chain of Fredkin gates. The advantage of this
technique is that DNA amplification can occur in an isothermal system, thereby
eliminating energy loss through heating and cooling of the reaction.
A mechanical quantum-dot cellular automata (QCA) implementation is proposed to
realize reversible networks, and its thermodynamic analysis is presented in [59] to
confirm the lower energy dissipation of reversible circuits. The operation of a
single mechanical cell offers non-dissipative features with respect to
quasi-adiabatic clocking and thermodynamics. Each cell clocking unit has four phases
(relax, switch, lock and release). Switching a cell in the relax phase to store
information is reversible, and its release in the presence of a driver puts the cell
back into the relax state reversibly, ensuring that the information erased is not
the only copy of that information. The energy analysis for different QCA reversible
circuits shows that the energy dissipation per switching event is much less than
kT ln 2 when a special clocking scheme, Bennett clocking, is used.
Other implementations of reversible logic worth mentioning are molecular
reversible logic [60], reversible electronic logic using switches [61] and a quantum
implementation of reversible logic in trapped-ion technology, where ions are
confined in an ion trap or between two electrodes [62]. Some of the ions are
grounded, representing |0>, and some are driven by a fast oscillating mode to
represent |1>. Gate operations are performed by laser pulses, with which the ions
interact at a certain frequency and duration. Two more attempts at realizing
reversible circuits are presented in [63] and [64].
Chapter 3 Reversible synthesis using technology mapping
The synthesis of reversible circuits is of interest both for the development of
practical implementations and for the fundamental understanding of reversible
properties. In general, synthesis methods target an efficient realization of
reversible circuits from a reversible specification. Different algorithms are
applied to transform the specification into a network of Toffoli gates. The main
objectives of these approaches are to minimize the reversible gate count and the
quantum cost of the implementation. In this chapter, we propose a new synthesis
technique, based on the classical technology mapping method, to realize a
reversible network from a classical gate-level netlist.
3.1 State-of-the-art reversible synthesis methods
Most reversible synthesis processes start from a truth-table-based reversible
specification, which is usually developed from permutations of input/output bits.
These permutations are then synthesized to create reversible networks. An early
heuristic synthesis approach is MMD (Miller, Maslov and Dueck) [12]. The process
starts with a reversible specification given in the form of a truth table. Each row
of the table is visited and gates are added until the output values match the input
values (i.e. identity is achieved). At each step, gates are chosen so that they do
not alter already considered rows, and they are placed from the output side of the
network. The approach has been extended by the application of templates to reduce
the size of the resulting circuits [13, 15].
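The row-by-row procedure described above can be sketched in Python as follows. This is a minimal illustration of the basic unidirectional, output-side variant without templates, not the implementation of [12]; the bit-mask gate encoding (control_mask, target_mask) and all function names are assumptions.

```python
def apply_gate(value, control_mask, target_mask):
    """Flip the target bit of value if all control bits are set."""
    if value & control_mask == control_mask:
        return value ^ target_mask
    return value


def mmd_synthesize(permutation):
    """Return a Toffoli cascade (input to output) realizing the permutation.

    permutation[i] is the output pattern for input pattern i. Gates are
    found at the output side until the table becomes the identity, so the
    list of gates found is reversed before it is returned.
    """
    n_rows = len(permutation)
    n_bits = n_rows.bit_length() - 1
    table = list(permutation)
    gates = []

    def add_gate(control_mask, target_mask):
        gates.append((control_mask, target_mask))
        for row in range(n_rows):
            table[row] = apply_gate(table[row], control_mask, target_mask)

    # Row 0: plain NOT gates clear every 1 in the first output pattern.
    for bit in range(n_bits):
        if (table[0] >> bit) & 1:
            add_gate(0, 1 << bit)

    for i in range(1, n_rows):
        # Set bits that are 1 in i but 0 in the current output,
        # controlled by the 1-bits of the current output.
        for bit in range(n_bits):
            mask = 1 << bit
            if (i & mask) and not (table[i] & mask):
                add_gate(table[i], mask)
        # Clear bits that are 1 in the output but 0 in i,
        # controlled by the 1-bits of i.
        for bit in range(n_bits):
            mask = 1 << bit
            if (table[i] & mask) and not (i & mask):
                add_gate(i, mask)

    assert table == list(range(n_rows))  # identity reached
    return gates[::-1]


def simulate(circuit, value):
    """Run an input pattern through the cascade."""
    for control_mask, target_mask in circuit:
        value = apply_gate(value, control_mask, target_mask)
    return value


example = [1, 0, 3, 2, 5, 7, 4, 6]  # hypothetical 3-variable specification
circuit = mmd_synthesize(example)
print(circuit)
print([simulate(circuit, x) for x in range(8)] == example)
```

The control choices guarantee that earlier rows are untouched: any row whose pattern contains all the chosen control bits must be numerically at least as large as the current row, so it has not been processed yet.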
In [15] an algorithm is presented to find a sequence of Toffoli gates that
transforms a given reversible function into the identity function. The process
starts by looking at the outputs of the reversible specification. If the output in a
row does not match the corresponding input pattern, a suitable Toffoli gate is
applied at the output side. The resulting output is then subjected to further
Toffoli gates to bring the output patterns closer to the input patterns without any
backtracking, i.e., once a row of the specification is transformed to the correct
value, it remains at that value regardless of the transformations required for later
rows. The process continues until the final output has reached the identity. To
reduce the size of the circuit, the approach can also be applied in both directions
simultaneously, choosing to add gates at the input side or at the output side. The
main advantage of this method is that it avoids extensive searching. Its drawbacks
are that it is restricted to small circuits (3 or 4 inputs) and that, for an
irreversible function, a completely specified reversible specification with added
garbage lines must be derived first.
In [16] the authors present a method that can synthesize circuits for functions
with up to 21 variables in reasonable time. The Reed-Muller spectra and
template matching algorithm offer reduced circuit size when compared to their
previous method. Another method proposed is similar to MMD method and works
with a single row at a time but update upper rows if necessary which is not
possible for MMD approach. A bidirectional modification has been developed
where the function or its inverse can be chosen based on the cost associated
with fixing the arbitrary ith row of the corresponding RM spectra to obtain a
smaller network. The method employs Reed-Muller spectra to select the Toffoli gate
that yields a lower RM cost. A local optimization technique, template matching, is
presented to further reduce the synthesized circuits. A re-synthesis approach is
also used: any sequence of gates in the reversible network is subjected to the
synthesis methods, and template matching is applied. If the resulting network is
smaller, it replaces the selected sequence in the original network. Two procedures
have been developed. The first, named random_driver, performs a user-specified
number of iterations. In each iteration, a number of random sub-networks are
re-synthesized and the best overall simplification is chosen and forwarded to the
next iteration. The second, exhaustive_driver, tries all possible sub-networks with
at least 5 gates of a given network for further simplification. Again, this method
assumes a reversible specification as a starting point. In this method, three
different synthesis techniques are applied first, and then templates are used for
local minimization. Finally, re-synthesis with random_driver is applied until
several iterations bring no simplification, and then exhaustive_driver is applied
until no further simplification occurs. For the purpose of minimization, the process
also synthesizes the inverse of the specification. This software can synthesize
functions with fewer than 21 variables within an allowed 12 hours. Though the
method is applicable to larger functions, the runtime for such synthesis grows
exponentially. Incompletely specified functions and classical irreversible circuits
are not targeted by this method. The method offers an option to minimize either
the gate count or a technology-motivated cost such as quantum cost; a smaller
quantum cost is often preferable to a smaller gate count, since quantum cost is a
better indication of the technological cost of implementing a circuit.
The algorithm proposed by Kerntopf [17] can utilize an arbitrary gate library
(NOT, CNOT, n-bit Toffoli, SWAP and n-bit Fredkin gates) during synthesis.
Given a reversible specification, all gates in the library are considered as
possible candidates. In every step of the algorithm, all gates are examined and,
for each of them, a shared binary decision diagram (SBDD) with complemented
edges is constructed. An SBDD is a graph representation of Boolean functions
in which multiple binary decision diagrams (BDDs) are joined into a single shared
diagram whose BDDs share their sub-graphs. A complemented edge represents the
function associated with it as the complement of the function pointed to by the
edge. The complexity measure of an n x n reversible function f, C(f) = x(f) - n,
where x(f) denotes the number of non-terminal nodes in the SBDD, is determined,
and the gate that results in the lowest complexity measure is chosen and added to
the synthesized circuit. Synthesis then proceeds on the remainder function. If two
or more gates yield the lowest complexity, multiple paths are explored
simultaneously.
The algorithm presented in [18] by Shende et al. is an iterative approach
based on generating all possible circuits containing n gates for increasing values
of n until a circuit implementing the specification is found. The effects of even
and odd permutation functions are also presented. The result obtained by this
method is optimal. However, the method is suitable only for reversible functions
of three or four variables that require eight or fewer gates in their implementation.
An algorithm based on spectral techniques, in particular the Rademacher-Walsh
spectrum, is presented in [19]. The circuit is synthesized from inputs to outputs
or vice versa at every stage, depending on the best possible application of a
generalized n-bit Toffoli gate. The best translation is determined by the maximum
positive change in the complexity measure of the function. An error is generated
if no translation can be found. Given enough time and memory, this algorithm is
expected to realize a valid circuit.
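The idea of enumerating circuits with an increasing number of gates can be sketched for three lines with a NOT/CNOT/Toffoli library as follows. This is only an illustration under stated assumptions: it uses a breadth-first search with a visited set rather than the exact procedure of [18], and the names gate_library, shortest_circuit and run are hypothetical.

```python
from collections import deque
from itertools import combinations


def gate_library(n):
    """All NOT, CNOT and multiple-controlled Toffoli gates on n lines."""
    gates = []
    for target in range(n):
        others = [b for b in range(n) if b != target]
        for k in range(len(others) + 1):
            for ctrl in combinations(others, k):
                gates.append((sum(1 << c for c in ctrl), 1 << target))
    return gates


def apply_gate(value, gate):
    controls, target = gate
    return value ^ target if (value & controls) == controls else value


def shortest_circuit(target, n=3):
    """Breadth-first search: circuits are visited in order of gate count,
    so the first circuit that matches `target` has minimal gate count."""
    target = tuple(target)
    identity = tuple(range(1 << n))
    gates = gate_library(n)
    parent = {identity: None}
    queue = deque([identity])
    while queue:
        state = queue.popleft()
        if state == target:
            circuit = []
            while parent[state] is not None:
                state, gate = parent[state]
                circuit.append(gate)
            return circuit[::-1]
        for gate in gates:
            nxt = tuple(apply_gate(v, gate) for v in state)
            if nxt not in parent:
                parent[nxt] = (state, gate)
                queue.append(nxt)
    return None


def run(circuit, value):
    for gate in circuit:
        value = apply_gate(value, gate)
    return value


if __name__ == "__main__":
    swap_lines_0_1 = [0, 2, 1, 3, 4, 6, 5, 7]
    print(shortest_circuit(swap_lines_0_1))
```

For three lines the search space contains 8! = 40320 functions, which is why this kind of search is practical only for very small circuits.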
The next method we consider is SAT (satisfiability) based exact synthesis, where
a minimal circuit realization of a given function is the main objective [20]. Given a
reversible function f, the exact synthesis of f into a network of Toffoli gates is
formulated as a sequence of decision problems. A decision problem is used to
check if for f and a number d, a Toffoli network with exactly d gates exists. The
process starts with d=1 and if no realization is found then d is increased until a
valid Toffoli network is obtained. The decision problem is encoded and solved by
using SAT techniques, i.e., finding satisfying assignments for each instance. To
deal with irreversible functions, constant inputs are assigned and garbage outputs
are left unspecified. In this synthesis approach, to guarantee the minimum number
of gates, both 0 and 1 values of the constant inputs are used, which requires an
exponential number of combinations to be checked and significantly increases
complexity. To reduce this overhead, the problem formulation is modified
such that all variables representing the same constant are equal. Though this is
currently the best method for small functions, there is no indication of whether
the unspecified don't cares really guarantee reversibility.
In [21] the authors present an algorithm and tool for the synthesis of reversible
functions based on positive polarity Reed-Muller (PPRM) expansion. The
reversible specification is converted to EXOR sum-of-products (ESOP) form first
using already developed EXORCISM-4 tool. Then this ESOP is transformed to
PPRM form by making the substitution a' = a ⊕ 1 on all the complemented
variables, algebraically expanding the product terms, and cancelling out an even
number of identical product terms. The PPRM expansion of all outputs is the
input of the algorithm which examines possible substitutions. For each
substitution a new node is created, the factor that is identified is substituted in the
PPRM expansion to get new PPRM expansion. If the new expansion has fewer
terms then all nodes are considered for the next step with priority queue. The
node with highest priority is considered to substitute the factor with available
Toffoli gate. The process continues until a valid solution is reached. The
advantage of this algorithm is that it exploits the functionality shared between
the outputs of multi-output functions. A straightforward algorithm simply uses as
many gates as there are terms in the Reed-Muller expansion of the function. In this
approach, sub-expressions common to the Reed-Muller expansions of multiple outputs
are identified and then substituted with available Toffoli gates. The experimental
results show that the basic algorithm can handle all 40320 three-variable reversible
functions efficiently. Some heuristics are introduced to improve performance for
larger functions. Experiments on scalability indicate this method can quickly find
solutions to a good proportion of the randomly generated functions with 6-16
inputs. However, the algorithm does not guarantee an optimized design in terms of
gate count or quantum cost. In addition, neither the processing of incompletely
specified reversible functions nor the time needed to generate the PPRM
expression is considered.
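The ESOP-to-PPRM conversion described above (substitute a' = a ⊕ 1, expand the products, cancel pairs of identical terms) can be sketched as follows. The cube representation as a dictionary from variable name to polarity and the function names are assumptions made for this illustration; the sketch does not reproduce the tool of [21] or EXORCISM-4.

```python
from typing import Dict, FrozenSet, List, Set

Term = FrozenSet[str]   # a product of positive literals; frozenset() is the constant 1


def cube_to_pprm(cube: Dict[str, bool]) -> Set[Term]:
    """Expand one ESOP cube; True = positive literal, False = complemented."""
    terms: Set[Term] = {frozenset()}
    for var, positive in cube.items():
        with_var = {t | {var} for t in terms}
        # a' = a XOR 1, so a complemented literal keeps both t*a and t.
        terms = with_var if positive else with_var | terms
    return terms


def esop_to_pprm(cubes: List[Dict[str, bool]]) -> Set[Term]:
    """XOR the cube expansions; identical terms cancel in pairs."""
    result: Set[Term] = set()
    for cube in cubes:
        result ^= cube_to_pprm(cube)
    return result


def show(pprm: Set[Term]) -> str:
    return " ⊕ ".join(sorted("".join(sorted(t)) or "1" for t in pprm))


if __name__ == "__main__":
    # f = a'b XOR ab' expands to ab + b + ab + a, and the two ab terms cancel.
    f = [{"a": False, "b": True}, {"a": True, "b": False}]
    print(show(esop_to_pprm(f)))
```

Representing the expansion as a set and combining with symmetric difference makes the cancellation of an even number of identical product terms automatic.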
In reversible logic synthesis with output permutation [22], the specification of a
reversible function is modified by reordering the output positions.
In general, the outputs in a given truth table are fixed in positions. However, the
output ordering is irrelevant and for each output permutation we can have distinct
realization of reversible network. The proposed method checks all permutations
or partial permutations to achieve a smaller realization. In addition, no extra
gates are required to achieve the output permutation. In order to find best
permutation, all permutations (n!) are checked (where n is the number of
variables of the reversible function) for a completely specified function. The
method is applied to both exact and heuristic synthesis algorithms proposed
earlier by encoding all permutations, and keeps the best realization. For heuristic
approach, a sifting algorithm is adopted where the gate count of first realization is
saved and for each output, the best position is searched. If the gate count for
such realization is smaller than the current best then it is stored as being current
best. Each position for every output is checked to find a realization with fewest
gates. The method results in improvements of gate counts at the expense of
runtime, but does not guarantee to always end up with fewer gates than existing
approaches. For example, with the exact synthesis method the mod5d2 circuit is
realized with 8 gates in 9.9 s, whereas synthesis with output permutation requires
1097.6 s for a realization having the same 8 gates. For the heuristic method,
output permutation gives better results than the original method, and again the
trade-off is runtime, which is sometimes as much as 30 times higher for a circuit
with 8 variables. Furthermore, there is no suggestion about how to handle large
circuits with output permutations.
In [23] the authors address the synthesis of reversible Toffoli circuits from
completely specified irreversible specifications. Since irreversible functions must
be embedded in reversible specifications, which adds constant inputs and garbage
outputs and leads to an assignment problem for the resulting don't cares, three
methods have been proposed for assigning the garbage outputs. The first method
minimizes the Hamming distance of the output
patterns to the corresponding input patterns. For every output pattern of
irreversible function, the set of rows containing that pattern are grouped and then
the candidate outputs are assigned with minimal Hamming distance to the input
assignment of the corresponding row. The second method uses Hungarian
algorithm to solve the output assignment problem to associate each row sharing
a common irreversible function output with one of the candidate outputs. The third
method is based on XOR combinations of primary inputs. The algorithm checks
one row of the specification at a time and, for each garbage output, sets the don't
care to the cumulative XOR of the bit positions corresponding to garbage outputs
up to and including that garbage position. The RM-based synthesis method is then
applied to realize the completely specified reversible function, and the template
matching algorithm is applied to minimize quantum cost using only the basic set
of 14 templates proposed by Maslov. The experimental results show the
effectiveness of the methods. However, no single method gives better results
consistently, and trying several don't care assignment methods increases the
computational complexity. Also, the approach cannot handle problems with more
than 13 to 15 lines, including garbage lines.
In [24] an iterative algorithm is presented that implements an incompletely
specified Boolean function as a cascade of reversible complex Maitra terms, known
as a reversible wave cascade. In this approach, the cascade for the original
incompletely specified function is equivalent to the cascade implementation of a
completely specified function having the same ON-set and OFF-set as the original
function. The set of completely specified functions representing a stage of the
cascade is generated first. The remainder function is computed next, assuming that
the completely specified functions will be used in the cascade. If the remainder
function is independent of at least one variable, the next stage is added to the
already synthesized network. This approach requires at most one constant input
and no garbage outputs.
Another approach that synthesizes large functions efficiently is based on binary
decision diagrams (BDDs) [25]. Any Boolean function can be efficiently
represented by a BDD, a directed acyclic graph obtained by employing Shannon
decomposition. Given a BDD G = (V, E) representing f, a reversible realization can
be derived by traversing the decision diagram and substituting each node v ∈ V
with a cascade of reversible gates. The respective cascade of gates depends on the
successors of the node v. Reversible equivalents of a node are derived first for all
possible scenarios. The method depends solely on state-of-the-art BDD packages;
once the BDD is created, the substitution algorithm is applied. The main problems
with this approach are its higher gate count and larger number of additional lines.
A recent development in [26] lays the foundations for a reversible programmable
logic array (RPLA) architecture using reversible Fredkin and Feynman gates. The
proposed RPLA has n inputs and m outputs, and can realize m functions of n
variables. For example, a 3-input RPLA can realize any of the 2^8 functions formed
from combinations of the 8 minterms. In this approach, classical AND and OR
functions are implemented by setting one input of the Fredkin gate to 0 and 1,
respectively. To propagate signals, i.e., to provide fan-out and complements of the
inputs, a Feynman gate is used with one of its inputs set to 0 or 1 accordingly.
First the reversible AND array realizes the selected product terms of the inputs,
and then the reversible OR array generates the various possible functions of the
product terms. The method is promising for reversible implementation, but there is
no verification of whether the realized circuits really preserve the reversibility of
the function. Also, the number of garbage bits generated by this implementation is
not reported. A simple 3-input full adder/subtractor requires 45 Feynman gates and
25 Fredkin gates, with 70 extra input lines. At this point, the large number of gates
and extra bits required even for small circuits limits its applicability.
A summary and overview of the synthesis and optimization of reversible circuits
is presented in [27] and [28]. Other methods worth mentioning include
exclusive-OR sum-of-products (ESOP) minimization and quantum multiple-valued
decision diagram (QMDD)-based swapping for generating Toffoli gate cascades [65].
QMDD-based synthesis manipulates matrix specifications of both binary and
multiple-valued reversible and quantum circuits [66, 67]. The Portland State
University research group recently performed extensive work on the synthesis of
reversible circuits for completely and incompletely specified reversible and
irreversible functions [106-110]. Their proposed MP (multiple path) algorithms
offer an improvement over the MMD method in dealing with large functions. Their
proposed cube grouping method for synthesizing large functions with a significant
number of don't cares outperforms the other existing methods in many cases [131].
3.2 Problem identification and motivation
From the above discussion, we can identify the main problem of the existing
synthesis approaches as the limitation on the I/O sizes of the circuits they can
handle, which stems from the computational complexity of permutation
assignments. Hence permutation-based synthesis targets only small functions. In
fact, exact and heuristic methods can handle only a few variables (around 6
variables for exact methods and 30 variables for heuristics). Requiring a reversible
specification as a starting point limits their applicability to the (2^n)! reversible
functions of n variables. Embedding irreversible circuits into reversible ones with
a proper assignment of garbage outputs is still an open problem. Further, an
optimized reversible realization requires significant run-time.
A more scalable approach is offered by positive polarity Reed-Muller (RM)
synthesis techniques [21]. The RM approach is also used to realize irreversible
specifications as reversible designs in [14]. However, the matrix-based
computation performed to obtain the RM expansion of a function with a large
number of variables requires significant processing time. The BDD-based
solution [25] is linear in the size of the BDD, and hence requires manageable
processing time whenever a BDD can be constructed. The technique, however,
needs more gates for the reversible implementation of larger functions, as well
as some pre-processing for BDD generation. To alleviate some of the above
limitations, yet another solution was presented in [52], where classical
technology mapping was employed to realize irreversible circuits as reversible
implementations. The cube re-ordering based synthesis methods can also
handle large functions with don't cares very efficiently and in many cases require
fewer gates and a lower quantum cost than other methods [131].
In our work, we propose solutions that not only overcome the limitations in
dealing with larger functions but also extend the synthesis of reversible circuits to
cases where the specifications and/or original implementations are irreversible
gate-level netlists. In particular, we propose a method for realizing a gate-level
implementation of an irreversible function as a reversible network using standard
reversible gates. Our approach takes a gate-level classical circuit as the
specification and realizes it as a reversible network using our predefined
equivalent Toffoli modules. Hence, we do not need to find a reversible embedding
of the target function before synthesis; instead, we create a realization from
Toffoli-based modules equivalent to the gates in the classical network. To reduce
the number of extraneous bits, we introduce a library of reversible modules called
supercells.
3.3 Proposed reversible circuits synthesis
Unlike most reversible synthesis approaches, we plan to structure our algorithms
such that they can fully benefit from classical synthesis techniques of irreversible
circuits for synthesis of reversible circuits. The method does not restrict synthesis
process to finding permutations of reversible function specifications, and thereby
results in better scalability. First, we explore the applicability of classical
methods in the design and synthesis of reversible circuits directly from an
irreversible specification or implementation. In this context, we need to resolve
the issues that are acceptable in irreversible circuits but not in their reversible
counterparts.
Furthermore, reversibility requires addition of some extra signals to uniquely map
inputs to outputs. As the number of extra signals can be substantial even for a
small circuit, we need to address the issue of their efficient reduction. The
secondary goal is the optimization of reversible circuits in terms of gate count or
quantum cost. The main steps include:
• Creation of reversible cells for reversible embedding of irreversible
  specification: To achieve reversible embedding of classical irreversible
  functions we plan to explore technology-mapping techniques. Here the
  first objective is to create a library of reversible cells that can cover any
  kind of irreversible network.
• Pre-mapping optimization process: We plan to start from the function
  specification level of classical circuits without obtaining any reversible
  specification. Many optimization techniques are currently available in the
  classical arena. If a function specification is synthesized and optimized
  prior to reversible mapping, then a minimal circuit can be obtained.
• A reversible circuit is constructed through a cascade of several gates. Each
  gate is a part of a stage of the cascade, which must include the gate and
  all the unconnected lines running through it. Signal fan-out, which is very
  common in classical irreversible circuits, is not acceptable in reversible
  logic. As in our work we map such irreversible circuits into reversible
  cells, we need to find an efficient solution to this problem with the least
  cost in terms of additional gates and ancilla/garbage bits.
• Gate, ancilla and garbage bits optimization: A direct mapping into
  reversible cells introduces a large number of extra lines and gates in
  reversible networks. We need to look for ways to reduce this number. We
  will investigate whether packing of gates leads to a reduced number of gates
  and extra lines, and to what extent we are allowed to pack.
3.4 Methodology
To obtain reversible implementations of classical irreversible circuits, we follow
the classical technology mapping procedure, in which a network of gates is
implemented with cells from a specific technology library. The work is an extensive
advancement of the ideas proposed in [52], which utilized classical techniques in
reversible circuit synthesis for the first time but did not reach maturity, since no
further development followed. Moreover, the experimental results in [52] report
the number of reversible cells used but give no information about the standard
Toffoli gates required. In this research, we supply the missing features and
improve the approach in terms of garbage count and quantum cost. Further,
in our approach, an already synthesized irreversible circuit can be directly
transferred to a reversible network.
The process starts with creating a reversible Toffoli module library. Then given a
function specification, a classical synthesis is employed to transform this
specification to a classical network implemented using irreversible gates such as
NAND, NOR, AND-OR-INV etc. In the process, various classical network
minimizations are explored. A reversible technology mapping is then applied
by traversing the already synthesized classical network in topological order and
replacing its gates with reversible modules. Different packings of gates into
parallel and serial reversible forms are proposed in our work to obtain a minimal
reversible network and to manage fan-out signals.
3.4.1 Toffoli modules of irreversible gates
To transform an irreversible circuit into a reversible one, we first create
Toffoli-based reversible modules of the available 2-input classical gates. Next, all
classical gates are substituted by their reversible equivalent modules. Each of
the modules consists only of a cascade of reversible NOT, CNOT and multiple
controlled Toffoli gates stored in the so-called reversible equivalent library. Our
proposed reversible library is created and enhanced in three steps (fundamental
A, extended B and supercells C) to provide better flexibility and reduction in
extraneous signals. Though Toffoli modules resulting from each of the above
three steps can alone be used to implement reversible network of irreversible
circuits, progressive steps offer better-optimized realization. To design such
modules we use Reed-Muller transform defined below and implement them with
standard reversible gates, i.e., NOT, CNOT and multiple controlled Toffoli gate.
Reed Muller Transformation
Every Boolean function $y = f(x_1, x_2, \dots, x_n)$ can be represented uniquely as a
polynomial of the form
$$a_0 \oplus a_1 x_1 \oplus a_2 x_2 \oplus a_3 x_1 x_2 \oplus \dots \oplus a_{2^n-1} x_1 x_2 \cdots x_n$$
with Boolean coefficients $a_0, a_1, \dots, a_{2^n-1}$, which is referred to as the
positive polarity Reed-Muller expansion. RM spectra can be efficiently computed
using fast transform techniques. The transformation is expressed in matrix form as
$$R = M^n F, \qquad M^0 = [1], \qquad M^n = \begin{bmatrix} M^{n-1} & 0 \\ M^{n-1} & M^{n-1} \end{bmatrix},$$
where the summation is modulo-2, i.e., EXOR, and F is the truth vector of the
given function. Some important properties of this transformation are [12, 21]:
• Self inverse, i.e., RMT(RMT(f)) = f;
• Order dependent, i.e., the RM coefficient at index i is not affected by any
  value f[j] where j > i;
• Power of two independence, i.e., the RM coefficient at index $i = 2^p$ is never
  changed by the value f[j], where $j = 2^q$, $p \neq q$ and $1 \le p, q \le n$.
The multiple controlled Toffoli gate, TOF($x_{i_1}, x_{i_2}, \dots, x_{i_p}; x_j$), is
applied to realize reversible specifications by replacing each occurrence of the
literal $x_j$ in the Reed-Muller expansion of the output variable
$y_o = a_0 \oplus a_1 x_1 \oplus a_2 x_2 \oplus a_3 x_1 x_2 \oplus \dots \oplus a_{2^n-1} x_1 x_2 \cdots x_n$
with the expression $x_j \oplus x_{i_1} x_{i_2} \cdots x_{i_p}$. The resulting
expression is then simplified [21].
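As an illustration of the fast transform mentioned above, the following sketch computes the positive polarity RM spectrum of a truth vector with an in-place butterfly. The function name and the index convention (bit k of the index corresponds to variable x_{k+1}, so index 3 is the coefficient of x1x2) are assumptions made for this example.

```python
from typing import List


def reed_muller(truth: List[int]) -> List[int]:
    """Positive polarity Reed-Muller spectrum of a truth vector of length 2^n.

    Coefficient r[i] multiplies the product of the variables whose bits are
    set in i. All sums are modulo 2 (XOR).
    """
    r = list(truth)
    step = 1
    while step < len(r):
        for i in range(len(r)):
            if i & step:
                # Entry i absorbs the entry that differs only in this bit.
                r[i] ^= r[i ^ step]
        step <<= 1
    return r


if __name__ == "__main__":
    or_gate = [0, 1, 1, 1]          # index = x1 + 2*x2
    spectrum = reed_muller(or_gate)
    print(spectrum)                 # coefficients of 1, x1, x2, x1x2
    print(reed_muller(spectrum) == or_gate)   # self-inverse property
```

For the 2-input OR gate the spectrum has nonzero coefficients for x1, x2 and x1x2, i.e., OR = x1 ⊕ x2 ⊕ x1x2, and applying the transform twice returns the original truth vector.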
We define the functionality of each Toffoli module in terms of classical gates such
that the modules can be used directly to replace classical 2-input gates in
technology mapping. To this end, we generate each module in such a way that at
least one input propagates to the output as garbage, so that the inputs can be
recovered reversibly.
Reversible module library with minimal cardinality
Step A, which is the first one in the creation of the proposed reversible equivalent
gates library, includes typical reversible gates, equivalent to classical gate library:
AND, NOT and XOR gates. The NOT gate is a reversible NOT, the irreversible XOR
is realized with a CNOT, and the AND gate is implemented by setting the target
line (input C of the Toffoli gate in Fig. 3-1) of a Toffoli gate to '0'. For the XOR
gate, in addition to the gate output, a copy of one of the inputs (referred to as
input A) appears at the output and can be used by another gate. The same is true
for the AND gate (copies of inputs A and B are available at the outputs), Fig. 3-1.
This set of modules can provide a reversible embedding of any irreversible
function expressed in exclusive-OR sum-of-products (ESOP) form. The proposed
library of Toffoli modules is directly related to the general reversible gates. As the
modules can be viewed as blocks with equal numbers of input and output ports, we
say that the AND gate is implemented in a 3x3 block, the XOR gate in a 2x2 block
and the inverter in a 1x1 block.
To realize an AND gate, we use a generic procedure for reversibility. As the
2-input AND gate produces the same output for three of its input patterns, we can
calculate that 2 extra bits are required to make a reversible cell of the AND gate
(M = 3, ⌈log2 M⌉ = 2). Hence, we copy the two inputs directly to the output such
that they can always be recovered from the outputs and be used by other gates in
the network. At the input side, we set the ancilla to a constant 0. On the other
hand, to realize a 2x2 XOR reversible cell, we find that only one garbage output is
sufficient and no ancilla is required. An inverter maps input to output uniquely.
The structure of the above gates is presented in Fig. 3-1.
Figure 3-1: Classical circuit equivalent to reversible NOT, XOR and Toffoli gate
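The counting argument used for the AND and XOR cells can be checked with a short sketch. It assumes the standard bound that the minimum number of garbage outputs is ⌈log2 M⌉, where M is the largest number of input patterns sharing one output pattern; the function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def min_garbage(outputs: List[int]) -> int:
    """ceil(log2(M)), with M the maximum multiplicity of an output pattern."""
    m = max(Counter(outputs).values())
    return (m - 1).bit_length()      # integer form of ceil(log2(m))


def embedding_size(outputs: List[int], n_in: int, n_out: int) -> Tuple[int, int]:
    """Return (garbage outputs, ancilla inputs) of a minimal embedding."""
    lines = max(n_in, n_out + min_garbage(outputs))
    return lines - n_out, lines - n_in


if __name__ == "__main__":
    # Truth vectors listed for input patterns 00, 01, 10, 11.
    print("AND", embedding_size([0, 0, 0, 1], n_in=2, n_out=1))
    print("XOR", embedding_size([0, 1, 1, 0], n_in=2, n_out=1))
    print("NOT", embedding_size([1, 0], n_in=1, n_out=1))
```

Under this bound the AND cell needs two garbage outputs and one ancilla (a 3x3 block), the XOR cell one garbage output and no ancilla (2x2), and the inverter neither (1x1), in agreement with the block sizes stated in the text.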
Extended library of reversible equivalent gates
The library presented in the previous section is restricted to only three basic
gates, and hence has to be extended to handle large circuits more efficiently.
The selection of the gates for the Step B library is based on observations coming
from the process of reversible circuit implementation. For example, consider an
irreversible network in an ESOP realization. The problem in transferring this
network to a reversible one is associated with the increasing size of product
terms, i.e., the higher number of literals. This requires Toffoli modules with more
control inputs leading to the increase in the quantum cost. Moreover, a typical
gate-level implementation of irreversible circuits includes gates like NAND, OR,
NOR, XNOR, and AND or OR with single input inverted. Hence, it is needed to
generate Toffoli modules to represent such gates in reversible equivalent library
(some examples in Fig. 3-2) in order to retain flexibility in reversible mapping of
gate-level classical circuits.
To generate each of the Toffoli modules corresponding to typical classical
standard gates we adopt classical spectral techniques, in particular Reed-Muller
Transformation. The gate functionality is expressed in terms of its RM spectrum,
and each term of the expression is replaced with Toffoli gates. Note that our use
of RM Transform is restricted only to the generation of reversible library of Toffoli
modules. It is not applied during the actual reversible synthesis of irreversible
gate-level specifications. This contrasts with the RM reversible synthesis
approaches [14, 21], which deal with circuit function specifications instead of
gate-level netlists.
In the previous section we already introduced the AND, INV and XOR reversible
modules. To construct the OR function, we follow the same procedure as for
the AND gate, but in this case the reversible realization requires three Toffoli
gates, Fig. 3-2. Note that for the AND and OR modules in Toffoli realizations,
controlling the polarity of the extra (constant) input provides the NAND and NOR
functions. The proposed extended reversible equivalent library offers flexibility
in the reversible mapping of any existing gate-level implementation of classical
circuits to a reversible network.
Figure 3-2: Reversible equivalent gates.
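The generation of a Toffoli module from the RM spectrum of a 2-input gate can be sketched as follows. The line layout (A, B and one constant line R), the absorption of the constant RM term into the initial value of R, and all function names are assumptions made for this illustration; the modules of Fig. 3-2 may be arranged differently.

```python
A, B, R = 1, 2, 4   # bit masks of the two input lines and the constant line


def pprm(truth):
    """RM coefficients by the direct definition: r[i] = XOR of f[j], j subset of i."""
    size = len(truth)
    return [sum(truth[j] for j in range(size) if (j & i) == j) % 2
            for i in range(size)]


def build_module(truth):
    """Return (constant on line R, list of (controls, target) gates).

    The truth vector is indexed by a + 2*b. Each nonzero non-constant RM
    term becomes one gate whose controls are the variables of the term.
    """
    coeffs = pprm(truth)
    gates = [(controls, R) for controls in (A, B, A | B) if coeffs[controls]]
    return coeffs[0], gates


def run_module(module, a, b):
    constant, gates = module
    state = a * A | b * B | constant * R
    for controls, target in gates:
        if (state & controls) == controls:
            state ^= target
    return state


GATES = {
    "AND": [0, 0, 0, 1], "OR": [0, 1, 1, 1],
    "NAND": [1, 1, 1, 0], "NOR": [1, 0, 0, 0],
}

if __name__ == "__main__":
    for name, truth in GATES.items():
        constant, gates = build_module(truth)
        print(name, "constant =", constant, "gates =", len(gates))
```

With this construction the OR module uses three gates (two CNOTs and one 2-control Toffoli) on a constant-0 line, and the NAND and NOR modules reuse the AND and OR gate cascades with the constant line set to 1.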
Reversible supercells library- garbage reduction
The reversible library created in Step B includes reversible equivalents (with two
extraneous outputs) of the popular 2-input irreversible gates. The problem with
such gates, however, lies in the accumulation of garbage outputs, which
increases with the number of gates. For example, the total number of garbage
bits is maximum, i.e., two times the total number of gates in worst case, when
there is no fan-out in the irreversible network, and hence no garbage bits can be
used for the valid fan-out signals propagation.
To overcome this problem, and to reduce the garbage bits, in Step C, in addition
to the Toffoli modules from Step B, we introduce a new notion of reversible
equivalent gate referred to as a supercell. A reversible circuit when obtained
through the mapping using supercells is characterized by reduced gate count,
minimized ancilla and garbage bits compared to similar reversible mapping using
basic reversible gates only.
Supercells are created from all possible combinations of basic irreversible logic
gates connected serially, such as AND-OR, NOR-AND, single input inverted
AND-NOR etc., some examples are shown in Fig. 3-3. Each of the functions
representing gate combinations is implemented in Toffoli embeddings, and is
stored in the reversible library. Note that these supercells present their combined
function instead of individual functions at the output. A small sample of supercells
is presented in Table 3-1. Note that all inputs are available at the garbage
outputs of a supercell, and can be used as the solution to fan-out problems in the
irreversible specifications. Further, we extend the concept of supercells to
incorporate reversible equivalents of 3-input classical gates and their possible
variants with differently inverted signals. Inclusion of these gates reduces the
number of garbage bits as well as multiple-controlled Toffoli gates.
Table 3-1: Functionality of reversible supercells
Classical function | Reversible supercells
Example 1: The supercell realizing an AND-OR function has three garbage bits
at the outputs, Fig. 3-3(b). If the AND and OR gates were represented
independently, this would require 4 standard reversible gates (1 Toffoli
gate with 2 garbage bits for the AND gate; 1 Toffoli and 2 CNOT gates with 2
garbage bits for the OR gate), generating 4 garbage and 2 ancilla bits, which carry
no useful information. By using the supercell concept, both garbage bits and gates
are saved.
Figure 3-3: Creation of super cells.
Another type of module optimization combines two classical gates and realizes
them in a single Toffoli embedding, which we name a parallel packed cell. This
cell realizes two classical functions from the set {AND, OR, XOR}, together
with possible inversions of the gate inputs and outputs, on the same two
inputs. The structure of the parallel packed cell is such that at least one
input is transmitted to the output to be used for fan-out, reducing the total
number of gates and garbage bits when possible. Examples of parallel packed
cells and their Toffoli realizations are shown in Fig. 3-4. For example, in
Fig. 3-4(b), X = 0⊕AB and Y = 0⊕A⊕AB, which requires 3 reversible gates when
the two functions are realized separately. When we pack them with X unchanged
and Y = X⊕A, we save 1 CNOT gate and one garbage output.
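The saving can be checked by simulation. The following is a minimal sketch, assuming three lines (A, B and an ancilla set to 0) and a gate encoding of (controls, target) tuples. The wiring and the helper names `run` and `packed_outputs` are illustrative assumptions and are not taken from Fig. 3-4.

```python
from itertools import product

def run(gates, state):
    """Apply multiple-control Toffoli gates, given as (controls, target), to a list of bits."""
    state = list(state)
    for controls, target in gates:
        if all(state[c] for c in controls):
            state[target] ^= 1
    return state

# Lines: 0 = A, 1 = B, 2 = ancilla (constant 0).
# Assumed packed cell: a Toffoli gate writes X = AB onto the ancilla,
# then a CNOT controlled by X turns line A into Y = X xor A.
packed_cell = [((0, 1), 2), ((2,), 0)]

def packed_outputs(a, b):
    """Return (X, Y, B) produced by the packed cell for inputs a, b."""
    y, b_out, x = run(packed_cell, [a, b, 0])
    return x, y, b_out

for a, b in product((0, 1), repeat=2):
    x, y, b_out = packed_outputs(a, b)
    assert x == (a & b)          # X = AB
    assert y == (a ^ (a & b))    # Y = A xor AB
    assert b_out == b            # B passes through and stays available for fan-out
```

Under these assumptions the packed cell uses two gates, and the only output that is not a required function is the copy of B.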
Figure 3-4: Parallel packed cells, panels (a)-(d).
Super cells in larger sizes
The supercells representing the reversible equivalents of irreversible functions of
two logic gates in cascade are referred to as size-2 supercells. Supercells
can easily be extended to size 3 and more. For example, consider an
AND-OR-NAND structure, which can be implemented by the size-3 supercell
Toffoli (a, b, c, d, 1⊕ab⊕acd⊕abcd).
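The target expression of this supercell can be checked against a gate-level cascade by exhaustive enumeration. The sketch below assumes one particular wiring that is consistent with the expression: AND(c, d) feeds an OR gate with b, whose output feeds a NAND gate with a. The text does not spell out this wiring, so it should be read as an assumption.

```python
from itertools import product

def and_or_nand(a, b, c, d):
    """Assumed cascade: AND(c, d) -> OR with b -> NAND with a."""
    return 1 - (a & (b | (c & d)))

def supercell_target(a, b, c, d):
    """Target-line function of the size-3 supercell: 1 xor ab xor acd xor abcd."""
    return 1 ^ (a & b) ^ (a & c & d) ^ (a & b & c & d)

# Compare both functions on all 16 input assignments.
matches = all(
    and_or_nand(*bits) == supercell_target(*bits)
    for bits in product((0, 1), repeat=4)
)
print(matches)
```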
However, unlike in classical synthesis, the benefits of a larger size may not be
obvious, and we need to find the limits on packing gates in the context of
garbage and quantum cost reduction. Hence, before using higher-order
supercells, we must decide whether they lead to an optimized implementation or
whether large supercells become too costly to provide any advantage. One
such comparison is illustrated in Fig. 3-5, where for some random circuits with 5
classical gates in series we calculated the number of garbage bits and gates as
well as the quantum cost for different supercell sizes.
Figure 3-5: Impact of larger size supercells, panels (a)-(f).
Example 2: Consider the irreversible circuit with 6 inputs and a single output,
Fig. 3-5(a). We implement the reversible network by replacing each gate with
its reversible equivalent Toffoli module, Fig. 3-5(b). This requires 9 gates
and 10 garbage bits, bringing the quantum cost to 29. Next, we consider
packing every two gates in topological order and replacing them with a 2-gate
supercell. Gates that cannot be placed in a two-gate configuration are
substituted with a simple cell. This saves 2 garbage bits and 2 gates,
although the quantum cost increases by 14, Fig. 3-5(c). On the other hand,
using a 3-gate supercell and a 2-gate supercell saves 3 garbage bits, but the
quantum cost is now 70, Fig. 3-5(d). In Fig. 3-5(e), we pack 4 gates as a
supercell and replace the remaining gate individually. Finally, all the gates
are combined into a single 5-gate supercell. This increases the number of
gates and the quantum cost as well, Fig. 3-5(f).
We tried various circuits to find the advantages and disadvantages of higher-level
supercells and present the results as plots in Fig. 3-6. The plots show garbage
bits, gate count and quantum cost for individual cell replacement (1), two
2-gate supercells plus one single gate (2), a 3-gate supercell plus a 2-gate
cell (3), a 4-gate supercell plus a single-gate cell (4) and a 5-gate
supercell (5). Note that packing gates at higher levels always reduces the
number of garbage bits. The gate savings depend on the circuit construction,
but the quantum cost always increases. The main reason is that when we pack
more gates, their reversible embeddings use Toffoli gates with more control
signals, and when the number of controls exceeds 3, the increase in quantum
cost is significant, as shown in chapter 2. For example, in Fig. 3-6(c) for
circuit 1, if size-2 supercells are used, then the quantum cost is 43. Using a
size-3 supercell increases the quantum cost to 70, a size-4 supercell to 133
and the size-5 supercell to 297. Another example is presented in Table 3-2,
where size-3 and size-2 supercells are used instead of a 5-gate supercell.
Here, the quantum cost is significantly reduced in the size-2 and size-3
supercell implementation at the expense of a single additional garbage bit.
Hence, although such a possibility always exists, we do not consider cascading
three or more irreversible gates when mapping to higher-size supercells, as
the savings in gates and garbage bits are overshadowed by the increase in
quantum cost.
Table 3-2: Comparison of large size to small size supercell combination of random
circuits

      5-gate cell          |   3-gate + 2-gate cell
Garbage   Gate    QC       | Garbage   Gate    QC
   6        7     297      |    7        6      70
   6       12     369      |    7        6      38
   6        7     160      |    7        8      68
   6       11     340      |    7        8      77
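The trade-off in Table 3-2 can be made explicit with a few lines of arithmetic. The sketch below only restates the table rows and computes, per circuit, how many garbage bits are added and how much quantum cost is saved when the 3-gate plus 2-gate combination replaces the 5-gate cell. The variable and function names are illustrative.

```python
# Rows of Table 3-2 as (garbage, gates, quantum cost) for each mapping.
five_gate_cell = [(6, 7, 297), (6, 12, 369), (6, 7, 160), (6, 11, 340)]
three_plus_two = [(7, 6, 70), (7, 6, 38), (7, 8, 68), (7, 8, 77)]

def trade_off(large, small):
    """Return (extra garbage bits, quantum cost saved) when the smaller cells are used."""
    return small[0] - large[0], large[2] - small[2]

summary = [trade_off(l, s) for l, s in zip(five_gate_cell, three_plus_two)]
for extra_garbage, qc_saved in summary:
    print(f"+{extra_garbage} garbage bit, quantum cost reduced by {qc_saved}")
```

In every row one additional garbage bit buys a quantum cost reduction between 92 and 331.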
Figure 3-6: Impact of different size packed cells on (a) garbage bits, (b) gate
count and (c) quantum cost, plotted against the number of gates in the
supercell (1 to 5) for circuits Ckt 1, Ckt 2 and Ckt 3.
Hybrid cells
We consider another type of gate packing where a number of parallel gates and
serial gates are packed together and their overall functionality is realized in
a reversible equivalent module; we call these hybrid cells. Hybrid cells offer
better results in terms of garbage bits, gates and quantum cost. Example 3
shows how the reduction is achieved.
Example 3: A segment of a classical irreversible circuit is presented in
Fig. 3-7(a). We consider the implementation where the two outputs of the
parallel gates are fed to two different gates. The reversible realization is
obtained by using 1 parallel packed cell with outputs X and Y and a 2-gate
supercell (OR and AND in series), Fig. 3-7(b). In this case the number of
garbage outputs is 4, and 6 gates are required with a quantum cost of 30.
The hybrid realization, Fig. 3-7(c), requires only 3 gates with 3 garbage
lines and a quantum cost of 15. Thus the hybrid cell saves 1 garbage bit and
3 gates and reduces the quantum cost by 15.
Figure 3-7: Impact of hybrid cells, panels (a)-(c).
One interesting feature of the hybrid cell is that all the inputs to the whole
function are available at the outputs of the cell and can thus be used for
fan-out propagation. The rationale behind our hybrid cell creation is shown in
Table 3-3. This cell offers reduced gate counts.
Table 3-3: Comparison of hybrid mapping to individual cells for random circuits

        Hybrid             |      Individual cell
Garbage   Gate    QC       | Garbage   Gate    QC
   3        2      14      |    4        5      29
   3        7      31      |    4        8      32
   3        2      10      |    4        5      21

        Hybrid             |   Parallel & serial cell
Garbage   Gate    QC       | Garbage   Gate    QC
   3        3      11      |    4        6      30
   3        5       1      |    4        8      24
   3        3      10      |    4        5      21
3.4.2 Transforming classical to reversible circuits
To synthesize a reversible circuit we adopt a modified classical mapping
procedure using the reversible supercell library presented in the previous section.
Our approach can implement a reversible circuit starting either from
the function specification or from any technology-dependent classical gate-level
realization of the circuit. We begin by transforming an irreversible circuit into
an equivalent classical implementation based on 2- or 3-input gates. This
restriction on the gate sizes comes from the fact that the resulting reversible
network is implemented using the reversible counterparts of the classical gates
in the original irreversible network. At this stage we also apply classical
optimization techniques to obtain a simplified network and its reversible
realization. The main goal is that the overall mapped network must be
reversible and hence must satisfy the two characteristics of reversible designs.
Note that the process neither creates a reversible specification from the
classical function specification nor uses permutation techniques. The
schematic of the steps is shown in Fig. 3-8. The process involves the following
steps:
• Create the reversible module library, which is circuit independent and can
be reused for mapping any design.
• If the classical function is given as a network implemented with 2-input
gates from any technology-dependent irreversible library of standard
cells (such as NAND-NAND, NOR-NOR, AND-OR-INV etc.), we first
employ classical network minimization. If only a function specification is
given, then classical technology mapping needs to be employed first to
transform the irreversible specification into a classical network of standard
cells (as found in the UC Berkeley SIS or ABC systems for synthesis and
verification).
• We traverse the classical network in topological order and replace the
irreversible gates with reversible modules.
Figure 3-8: Reversible mapping from irreversible specification/circuit.
Flowchart blocks: Read irreversible specification; Read gate-level description
of irreversible circuit; Classical optimization technique; Circuits to be
mapped to reversible; Read reversible equivalent gate library; Map; Reversible
network.
Technology mapping
Technology mapping is a crucial step in the synthesis of
classical circuits [68]. A library of customized cells of various functionality is
created. The mapping process consists of three major tasks. First, Boolean
networks are partitioned into an interconnection of single-output networks, with
the property that each internal node has fan-out 1. Then each sub-network is
decomposed into an interconnection of two-input functions (e.g. AND, OR, NAND
or NOR). Each sub-network is modeled by a directed acyclic graph (DAG), called a
subject graph. Finally, each subject graph is covered by an interconnection of
library cells.
During reversible technology mapping, we preserve the connectedness and
fan-out conditions for each cell replacement [52] to ensure network reversibility.
Furthermore, due to the structure of the gates in the packed cell library, we use
the extraneous bits to provide the fan-out signals of the original irreversible
network, limiting the otherwise necessary costly duplications. For example, the
garbage outputs of an AND or OR function can be used as inputs to other gates.
The actual mapping into reversible gates begins from the primary inputs of the
transformed irreversible network (level 0). Starting from level 0, the circuit is
traversed in topological order (increasing the level number by one for each gate
encountered on the path to the primary outputs) to replace each irreversible gate
or gate combination with its equivalent reversible Toffoli module or supercell.
The newly generated module is then cascaded with the reversible structure
generated so far. However, if two classical gates X and Y on the same level
share inputs, i.e., in the case of fan-out signals, then we first replace these
gates with their Toffoli modules. Next we cascade Toffoli module X with module Y
by connecting the shared (fan-out) input of module Y to the garbage output of
module X. Hence, during reversible mapping the fan-out-free condition for each
gate replacement is preserved due to the structural property of the gates in the
reversible supercell library, as illustrated in Example 4.
Example 4: Consider the irreversible network in Fig. 3-9(a). In the reversible
mapping, each gate of the original network is replaced by a reversible
module. The modules are connected in such a way that the garbage bits of one
gate provide the required fan-out signals to the other gate. Here, for
example, the AND1 and OR gates are at the same level and share inputs A and
B. Hence we map the AND1 gate first, followed by the OR gate. For the fan-out
nodes of A and B, we use the garbage outputs of the AND1 gate. An ancilla input
is added to each AND and OR gate. The resulting Toffoli network is
shown in Fig. 3-9(c).
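The mapping of Example 4 can be sketched with a small bit-level simulator. The network below is an assumption about the shape of Fig. 3-9: AND1 and OR share inputs A and B, and AND2 combines their outputs into Y. The OR module follows the structure described in Example 1 (one Toffoli and two CNOT gates); all names are illustrative.

```python
from itertools import product

def run(gates, state):
    """Apply multiple-control Toffoli gates, given as (controls, target), to a list of bits."""
    state = list(state)
    for controls, target in gates:
        if all(state[c] for c in controls):
            state[target] ^= 1
    return state

# Lines: 0 = A, 1 = B, 2..4 = ancillae initialised to 0.
AND1 = [((0, 1), 2)]                       # line 2 <- AB
OR = [((0, 1), 3), ((0,), 3), ((1,), 3)]   # line 3 <- AB xor A xor B = A OR B
AND2 = [((2, 3), 4)]                       # line 4 <- (line 2)(line 3) = Y
network = AND1 + OR + AND2

def mapped(a, b):
    """Return the five output lines of the cascaded Toffoli network."""
    return run(network, [a, b, 0, 0, 0])

for a, b in product((0, 1), repeat=2):
    out = mapped(a, b)
    # Lines 0 and 1 leave AND1 unchanged, so the OR module reads A and B from them.
    assert out[:2] == [a, b]
    assert out[4] == ((a & b) & (a | b))
```

Because the AND1 module leaves lines A and B untouched, the OR module reads its inputs from the same lines and no copy of A or B is needed. The network has five lines, one useful output and four garbage outputs.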
Figure 3-9: Reversible mapping: (a) classical circuit, (b) reversible equivalent
gate mapping, (c) Toffoli equivalent circuit.
For some circuits, passing fan-out signals through the garbage bits of preceding
gates may not be enough, and more resources are needed. In such
cases, we construct a cascade of reversible gates in order to make
the fan-out signals internal to the cascade structure. In the cascaded block, we
pass a fan-out signal to the destination gate through the garbage outputs of the
preceding gate. We continue transmitting the fan-out signal through
the cascade of serially connected gates until all fan-out locations in the
cascaded gates are reached. Thus, we can accommodate any number of fan-outs
of a signal. As a result, we not only minimize garbage
bits by reusing them, but also avoid repeating inputs. No extra
gates or additional control lines are imposed in the process. Hence,
there is no increase in the quantum cost, as the connectivity of reversible gates
with 3 or fewer control lines does not affect the quantum cost [27].
Finally, the testability of the reversible circuits is improved, as valid signals
are now propagated through the garbage lines.
For example, in Fig. 3-10(a), three irreversible gates share the same input
signals A and B. If these gates are mapped individually then we need to provide
three copies of A and B as inputs to each gate. This generates 6 garbage
outputs, Fig. 3-10(b). However, by our method, we reuse the garbage bits of gate
1 as inputs to gate 2 and garbage bits of gate 2 as inputs to gate 3. Thus, we
reduce garbage bits to 2, Fig. 3-10(c).
Figure 3-10: Fan-out mapping: (a) Classical circuit, (b) Reversible equivalent gate
mapping with copies of inputs, (c) Cascaded Toffoli modules.
Special care needs to be taken with XOR gates if the two inputs of the
XOR are fanned out to other gates. For example, for the circuit of Fig.
3-11(a), where f1 = AB and f2 = A⊕B, if the XOR gate is mapped first, then the
desired XOR output as well as the garbage output A (first CNOT gate of
Fig. 3-11(b)) are available. However, the AND gate (f1 = AB) needs both the A
and B signals as inputs. To obtain B, another XOR (second CNOT gate of
Fig. 3-11(b)) is added, which nullifies the previous XOR function. Hence, after
the AND gate mapping, we need to add another XOR (last CNOT, Fig. 3-11(b)), this
time to realize f2 = A⊕B. Instead, in our proposed method, the other gates
that share signals with the XOR gate (the AND gate in this case) are
mapped to reversible Toffoli modules first, and their garbage
outputs are then used to provide the input signals to the XOR gate, Fig. 3-11(c).
Thus, the quantum cost is reduced compared to the circuit with cascaded XOR
gates.
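The two mapping orders can be compared with a short simulation. This sketch assumes three lines (A, B and an ancilla set to 0) and the commonly used quantum costs of 1 for a CNOT gate and 5 for a 2-control Toffoli gate; the gate lists are a reading of Fig. 3-11(b) and (c) based on the description above, not a copy of the figure.

```python
from itertools import product

def run(gates, state):
    """Apply multiple-control Toffoli gates, given as (controls, target), to a list of bits."""
    state = list(state)
    for controls, target in gates:
        if all(state[c] for c in controls):
            state[target] ^= 1
    return state

# Lines: 0 = A, 1 = B, 2 = ancilla (constant 0).
xor_first = [((0,), 1), ((0,), 1), ((0, 1), 2), ((0,), 1)]  # XOR, undo XOR, AND, XOR again
and_first = [((0, 1), 2), ((0,), 1)]                        # AND, then XOR on its garbage lines

def quantum_cost(gates):
    """Assumed costs: 1 for NOT/CNOT, 5 for a 2-control Toffoli gate."""
    return sum(5 if len(controls) == 2 else 1 for controls, _ in gates)

for a, b in product((0, 1), repeat=2):
    for circuit in (xor_first, and_first):
        _, f2, f1 = run(circuit, [a, b, 0])
        assert f1 == (a & b) and f2 == (a ^ b)

print(quantum_cost(xor_first), quantum_cost(and_first))
```

Both circuits compute the same outputs, but mapping the AND gate first removes the two CNOT gates that only cancel each other.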
Figure 3-11: XOR fan-out mapping: (a) Classical, (b) Problem with XOR
mapping first, (c) XOR mapping after.
Super cell mapping
When mapping an irreversible function onto gates from the supercell library, the
aim is to use supercells first, as they require fewer extra bits
and gates in the equivalent reversible realization. Only if this is infeasible for a
particular configuration are the basic logic gates from the library used.
Algorithm 1 illustrates the process of mapping an irreversible network onto the
supercell library. The network is traversed in topological order starting from
the primary inputs. Whenever a combination of consecutive gates
matches a reversible supercell, and no intermediate signal is fanned out to the
next circuit level, the combination is replaced with the corresponding
reversible supercell structure, Step 7. We proceed to the next gate and repeat
the search for a matching gate combination. If the output of a gate has fan-out,
then it cannot be grouped with the next gate to be replaced by a
supercell, since a supercell structure does not provide intermediate output signals
as garbage bits (only the inputs and the grouped functionality). In this case,
the corresponding gate is replaced with its equivalent Toffoli module instead of a
supercell, Step 12.
1.  Select gate with primary inputs
2.  Until primary outputs are reached {
3.      Select the gate
4.      If (gate outputs not primary and no fanout) then
5.          Add next gate in topological order
6.          If (gate combination matches supercell) then
7.              Replace gate with supercell
8.          Else {
9.              Select next gate in topological order
10.             Return to step 4 }
11.     Else
12.         Replace with gate equivalent from supercell library
13. Else
14.     Replace with equivalent from supercell library }
Algorithm 1: Supercell mapping
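The idea behind Algorithm 1 can be sketched as a greedy pass over the gates in topological order. The sketch below is a simplified illustration rather than the thesis implementation: it only tries size-2 supercells, the network encoding and the function name `map_network` are assumptions, and the six-gate example network is hypothetical (loosely shaped like the circuit discussed with Fig. 3-12).

```python
def map_network(gates, order, supercells, primary_outputs):
    """Greedy size-2 supercell mapping (illustrative).

    gates: gate name -> (gate type, input signals); a gate drives the signal named after it.
    order: gate names in topological order.
    supercells: set of (first type, second type) pairs available in the library.
    primary_outputs: set of gate names whose outputs are primary outputs.
    """
    # Record which gates consume each gate output, to detect fan-out.
    consumers = {name: [] for name in gates}
    for name, (_, inputs) in gates.items():
        for signal in inputs:
            if signal in consumers:
                consumers[signal].append(name)

    mapped, cells = set(), []
    for name in order:
        if name in mapped:
            continue
        users = consumers[name]
        packed = False
        # Packing is only tried when the output is internal and has no fan-out.
        if name not in primary_outputs and len(users) == 1:
            nxt = users[0]
            # Every other input of the next gate must already be available.
            ready = all(s == name or s not in gates or s in mapped for s in gates[nxt][1])
            if ready and (gates[name][0], gates[nxt][0]) in supercells:
                cells.append(("SC", name, nxt))   # supercell replaces both gates
                mapped.update((name, nxt))
                packed = True
        if not packed:
            cells.append(("TM", name))            # individual Toffoli module
            mapped.add(name)
    return cells

# Hypothetical network: g4 has fan-out, so it cannot be packed with g5 or g6.
gates = {
    "g1": ("AND", ["A", "B"]),
    "g2": ("XOR", ["B", "C"]),
    "g3": ("OR", ["g1", "D"]),
    "g4": ("AND", ["g3", "g2"]),
    "g5": ("OR", ["g4", "E"]),
    "g6": ("AND", ["g4", "E"]),
}
cells = map_network(gates, ["g1", "g2", "g3", "g4", "g5", "g6"],
                    {("AND", "OR"), ("OR", "AND")}, {"g5", "g6"})
print(cells)
```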
For example, in Fig. 3-12(a), when traversing from primary inputs, gates 1 and 3
can be grouped to be replaced by a reversible supercell (SC). However, gate 4
output has fan-outs and we cannot pack this gate with gates 5 or 6. Thus, these
gates are replaced with individual Toffoli modules (TM). The process of mapping
continues until the final gate is reached.
Figure 3-12: Reversible mapping with supercell library: (a) Original
circuit, (b) Possible location of supercell and individual Toffoli modules usage.
Example 5: Given the full adder in Fig. 3-13(a), if each gate is replaced
individually, then we require 7 ancilla bits at the input side and 8 garbage
bits, with some of the garbage reused to provide fan-out paths. In addition, we
need to add one extra reversible inverter module at the input of gate 9. To
compare with the packing approach using the universal AND/OR gate in [52], we
use the reversible structure of this universal gate to implement the reversible
circuit. We save two extraneous bits on both the input and output sides at the
expense of adding two extra inverters, enclosed in circles in Fig. 3-13(b), for
fan-out propagation. This implementation requires 15 Toffoli gates. In
contrast, when using the supercell optimization with two supercells, we save
two ancilla and two garbage bits and eliminate the need
to add any inverter. Moreover, this realization requires only 7 Toffoli gates,
saving 8 Toffoli gates, Fig. 3-13(c).
Figure 3-13: Super-cell optimization, (a) Full Adder circuit, (b) Reversible
realization of universal AND-OR gate, (c) Super-cell implementation.
3.5 Theoretical analysis: technical lemmas
In general, a reversible circuit with n variables requires at least (2^n/ln 3) + O(2^n)
gates for implementation [69]. If the bound is tight, then this number is n·2^n gates.
The situation is slightly different in the BDD-based framework, as the total
number of gates (NOT, CNOT and 2-controlled Toffoli) depends directly on the
BDD size. In the worst-case scenario, for a single-output function, the
corresponding BDD has 2^n nodes and can be realized with at most 3×2^n gates
[25]. The number of reversible gates is not a consistent parameter for
comparison, as fewer gates with more control inputs sometimes result
in a higher quantum cost than a circuit with more gates from the standard
reversible library used in the BDD-based method. Instead, we concentrate on other
aspects, such as the number of garbage bits and the quantum cost. Note that our
method deals with gate-level implementations of classical circuits: we map
classical circuits realized in irreversible gates to reversible gates, and the
number of reversible gates required depends solely on the number and type of
classical gates, not on the function specification. Thus, we use the number of
logic gates in the classical realization, instead of the number of irreversible
function variables, to estimate bounds on the reversible gates and garbage bits
in the reversible implementation. The relation is stated in the following lemmas.
Lemma 1: For any classical circuit with n gates, when realized in reversible
embeddings, the maximum number of garbage bits produced is 2n (upper
bound).
Proof: Among all irreversible 2-variable function classes, only AND/OR and their
variants need 2 additional garbage bits for reversibility. Hence, in the worst case,
when no supercell can be used and all the gates of the network are of
the AND/OR type, the maximum number of garbage bits is two times the
number of gates. □
Lemma 2: For a reversible realization of any classical circuit the minimum
number of garbage bits in reversible embedding is 2(# of AND/OR) + (# of
XOR)-(# of fan-outs)-(# of super cells) (lower bound).
Proof: The total number of garbage bits is 2(# of AND/OR) + (# of XOR), as an
AND/OR function generates 2 garbage bits and XOR generates 1 in its reversible
equivalent, if no garbage is saved otherwise. If fan-outs are present in the
irreversible circuit, then we reuse some garbage to provide all fan-outs. In
addition, the use of supercells further reduces garbage by 1 bit for each
supercell. □
The actual number depends on the efficiency in using garbage signals. However,
by using super-cells we save extraneous bits related to the number of gates
packed together.
Lemma 3: If different 2-input irreversible gates are combined to create a
supercell, then the number of garbage bits saved in the resulting
supercell with n cascaded gates is n - 1.
Proof: A supercell is created in such a way that, in addition to the
supercell function output, copies of all the inputs are available at the outputs as
garbage bits. Also, an ancilla is added. Now, if we group n 2-input gates serially
such that the output of one gate is connected to the input of the next gate, then we
have n+1 inputs. Hence, the number of garbage bits with grouping into a
supercell is n+1. On the other hand, each 2-input AND/OR reversible equivalent gate
has 1 ancilla bit. Hence, for n such gates mapped individually the total number of
ancilla bits is n. To equalize the input and output counts of the network in the
reversible implementation (inputs + ancilla = outputs + garbage) with a single
useful output, the number of garbage bits is (n+1) + n - 1 = 2n. Therefore, the
garbage saved is 2n - (n+1) = n - 1, where n is the number of gates packed. □
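The counting arguments of Lemmas 1 to 3 can be written as small helper functions. This is a minimal sketch that only restates the formulas above; the function names are illustrative.

```python
def garbage_upper_bound(n_gates):
    """Lemma 1: at most two garbage bits per classical gate."""
    return 2 * n_gates

def garbage_lower_bound(n_and_or, n_xor, n_fanouts, n_supercells):
    """Lemma 2: garbage bits after reusing fan-outs and packing size-2 supercells."""
    return 2 * n_and_or + n_xor - n_fanouts - n_supercells

def supercell_garbage_saving(n_packed):
    """Lemma 3: garbage of n separately mapped gates minus garbage of one supercell."""
    separate = (n_packed + 1) + n_packed - 1   # inputs + ancillae - one useful output
    packed = n_packed + 1                      # one garbage copy of every input
    return separate - packed

print(supercell_garbage_saving(2), supercell_garbage_saving(5))
```

For a size-2 supercell the saving is one garbage bit, which matches the per-supercell term in Lemma 2.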
In section 3.4.1, we presented reversible modules for 2-input logic gates, which
are the Toffoli-based equivalents of classical gates. Based on that, we deduce
upper and lower bounds on the number of Toffoli gates in the reversible
realization obtained with our approach.
Lemma 4: A classical circuit implemented with n gates can be realized in
reversible embedding with at most 7n/2 Toffoli gates.
Proof: Of all the gates in the reversible supercell library, the 3-input OR gate
(a+b+c) requires the highest number of Toffoli gates, i.e., 7. In the worst
case, the classical circuit is implemented using only n
consecutive OR gates, which we can map with n/2 supercells (each supercell
is the reversible equivalent of a group of 2 logic gates). Therefore, the maximum
number of Toffoli gates required is 7n/2. □
Lemma 5: The minimum number of Toffoli gates required to realize a
classical circuit of n gates is n.
Proof: In contrast to the OR gate considered in the proof of Lemma 4, the AND,
NOT and XOR functions each require a single Toffoli gate. This set of gates is
universal. If an irreversible circuit has n such gates only and no fan-out
signals, then it can be implemented using n Toffoli gates. □
3.6 Experimental results
The approach presented in this chapter is verified with the aid of the Berkeley SIS
program [70]. We used some fundamental arithmetic circuits and benchmarks
such as the MCNC package, for which the reversible realization was completed
successfully on a 450MHz Sun Ultra 80 workstation.
We assessed the significance of generating our reversible equivalent gate library
with supercells, as well as of bypassing the search for a proper
reversible embedding for large irreversible functions. In Table 3-4, we compare
the impact of the different reversible equivalent gate libraries obtained in each step
(fundamental, extended and supercells) in terms of Toffoli gates and garbage
bits. We can see that supercells created by packing two gates save both Toffoli
gates and garbage bits. For the fundamental library, extra gates are required
for inverting signals.
Table 3-4: Comparison of equivalent gate libraries of different steps
(T = no. of Toffoli gates and G = no. of garbage outputs)

Circuit            AND-NOT-XOR library   Extended library   Super-cell
                      T      G              T      G          T      G
rd32                 23     11             18     11         14      9
Schneider             8      6              8      6          7      5
Decoder              12      7             12      7         11      6
MUX                   7      5              4      5          3      4
Parity Generator     10      5              7      5          7      4
Table 3-5 compares our approach to one of the best
approaches dealing with large irreversible functions, the binary decision diagram
(BDD) based technique, although in some cases the cube grouping methods offer
better results. The two solutions differ, but both can address larger functions. We
also note from [25] that the large circuits in Table 3-5 are not synthesizable within
the specified time limit using the Reed-Muller approach. We calculate the quantum
costs (Column 5) of the resulting circuits realized using Toffoli gates, adopting the
information presented in chapter 2 [27]. Our method outperforms the BDD one in
the number of Toffoli gates used and, for most circuits, in quantum cost and
number of lines. However, in some cases, the BDD method provides a lower
number of garbage bits. The specification constraints (fewer gates or fewer lines)
determine the overall effectiveness of the methods.
Table 3-5: Our proposed method vs. BDD based method

Circuit   PI/PO          Proposed            BDD Approach [25]
name                 Gate   Line    QC      Gate   Line     QC
frg2      143/139    1250    535   5868     4472   1411   14944
i8        133/81     1304    871   4788     3550    955   11478
i5        133/66      198    199   1254      530    345    1738
i7        199/67      841    705   2865      941    403    2953
i6        138/67      653    519   2177      734    280    2234
cordic    23/2         81     47    424      101     52     325
i4        192/6       172    326   1680     2115    729    6827
To compare our method with the technology mapping in [52], we present Table 3-6.
Note that [52] has no provision for calculating the actual number of Toffoli gates
and the quantum cost, which we explore in our method. Hence, for comparison
purposes we consider the number of reversible cells created in [52]. Column 1
gives the circuit names, with the primary I/O counts in Column 2. Columns 3 and 4
give the number of reversible cells used in [52] with the original mapping and with
the Universal AND/OR gate, respectively. Column 5 shows the reduced
number of reversible Toffoli modules in our approach, while Column 6 illustrates
the advantage of the reversible supercell method in reducing the number of cells
over the solutions in Column 4. Finally, the reduction of garbage bits due to
the reduction in equivalent cells is presented in the last column.
Table 3-6: Proposed method vs. Technology mapping

                         Ref. [52]                 Our proposed method
Circuit   PI/PO      Rev     With Univ.     Toffoli    With          Garbage
                     cells   AND/OR         modules    Super-cells   saving
apex6     135/99      591      587            589        376           211
C1355     41/32       246      212            246        202            10
C2670     233/140     547      514            533        373           141
C3540     50/22       928      895            925        598           297
C5315     178/123    1306     1221           1306        930           291
C6288     32/32      2138     1899           2043       1422           477
C7552     207/108    1630     1470           1574       1096           374
i10       257/224    1851     1778           1705       1138           640
pair      173/137    1260     1213           1254        841           372
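The relation between the cell counts and the garbage saving in Table 3-6 can be checked directly. In the sketch below, the assumption (inferred from the numbers, in line with the statement that the garbage reduction is due to the cell reduction) is that the garbage saving equals the number of cells with the Universal AND/OR gate in [52] minus the number of cells with supercells.

```python
# Table 3-6: circuit -> (cells with universal AND/OR gate in [52],
#                        cells with supercells, reported garbage saving)
table_3_6 = {
    "apex6": (587, 376, 211),
    "C1355": (212, 202, 10),
    "C2670": (514, 373, 141),
    "C3540": (895, 598, 297),
    "C5315": (1221, 930, 291),
    "C6288": (1899, 1422, 477),
    "C7552": (1470, 1096, 374),
    "i10": (1778, 1138, 640),
    "pair": (1213, 841, 372),
}

# Cell reduction of the supercell mapping relative to the universal-gate mapping.
cell_reduction = {name: univ - sc for name, (univ, sc, _) in table_3_6.items()}

# Check that every reported garbage saving equals the cell reduction.
consistent = all(
    cell_reduction[name] == saving for name, (_, _, saving) in table_3_6.items()
)
print(consistent)
```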
In Table 3-7, we compare the significance of 3-input module creation, in terms of
cell reduction and garbage reduction, with the results for the 2-input cells
from [52]. Each 3-input module saves one garbage bit, so if we save 83 cells, as
in the case of apex6, then we save 83 garbage bits. Further, pre-redundancy
removal allows more garbage and gate savings.
Table 3-7: Comparison of 2-input to 3-input cell library

Circuit   L2-in   Lpack2in   Red3in   L3in
apex6      591      587        503     504
C1355      246      212        234     234
C2670      547      514        470     483
C3540      928      895        766     767
C5315     1306     1221       1207    1207
C6288     2138     1899       1907    1913
C7552     1630     1470       1503    1559
i10       1851     1778       1449    1572
pair      1260     1213       1054    1061
x3         603      601        516     516
3.7 Conclusion
In this chapter, we provide techniques for realizing a reversible network from a
gate-level implementation of classical irreversible circuits of sizeable complexity,
using Toffoli-based modules of the classical counterparts. Furthermore, a new idea
of combining gates to create supercells is proposed.
The proposed method, which mainly works on the gate-level implementation of
classical circuits, is an alternative to two other efficient methods, i.e., the
Reed-Muller (RM) and BDD-based solutions. The RM method not only requires
significant processing time, but also often generates solutions with a high
quantum cost [21]. For example, a 10-input OR function has a Reed-Muller
expansion with quantum cost 83531 and 10 garbage outputs. On the other hand, our
method requires four 3-input OR supercells and one 2-input OR reversible
equivalent, resulting in a quantum cost of 131 and 14 garbage outputs.
The BDD-based reversible synthesis method is better than the RM method in terms
of quantum cost; however, it suffers from a large number of extra lines. For
example, a 3-input XOR function implemented using the BDD method needs 10
reversible gates with 6 garbage bits and a quantum cost of 34. In contrast, our
method requires 2 gates producing 2 garbage bits and has a quantum cost of 2.
Moreover, a shared BDD requires an additional line for each shared node [25]. We
observed a 75.39% improvement in quantum cost compared to BDD-based
synthesis for circuit i4 in Table 3-5. Additionally, we can see a 62% improvement
over BDD in the number of lines (Table 3-5).
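Both percentages can be reproduced from the entries of Table 3-5. The sketch below assumes that the 62% figure refers to the line count of frg2 (535 lines versus 1411), since the text does not name the circuit; the function name is illustrative.

```python
def improvement(ours, reference):
    """Relative reduction of `ours` with respect to `reference`, in percent."""
    return 100.0 * (reference - ours) / reference

qc_i4 = improvement(1680, 6827)       # quantum cost of i4: proposed vs. BDD
lines_frg2 = improvement(535, 1411)   # line count of frg2: proposed vs. BDD (assumed circuit)

print(f"{qc_i4:.2f}% {lines_frg2:.0f}%")
```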
Chapter 4 Reversible synthesis with redundancy removal
In this chapter, we present a way of synthesizing reversible circuits using
redundant fault information obtained with the aid of the classical counterpart. We
use the Toffoli-based modules of classical standard gates and the technology
mapping presented in chapter 3 to relate the effect of redundant stuck-at-value
faults in classical irreversible gate-level circuits to their reversible
implementation. Simplified forms of such Toffoli modules are created based on the
fixed values of input signals that correspond to stuck-at-value effects. We also
present the removal of redundant gates in reversible mapping.
4.1 Introduction
In our proposed method of synthesizing a classical irreversible network into its
reversible embedding, we use reversible technology mapping [52, 71]. In
particular, a gate-level irreversible circuit is mapped to a reversible one by
replacing each gate with its reversible equivalent, Fig. 4-1. A key issue, not
widely considered in the irreversible-to-reversible network mapping scenario, is the
presence of redundancies in classical specifications and their eventual transfer
to the corresponding reversible embedding. By definition, redundant errors do not
change the original functionality. They can be found among functional faults as
well as design errors such as stuck-at faults, gate and wire replacements, etc.
The main objective of this part of the work is to address the issue of redundant
errors in irreversible-to-reversible circuit mapping and their minimization.
Figure 4-1: Classical (a) to Toffoli (b) module mapping.
In [72], reversible circuits are corrupted with random gate replacements, and
symbolic equivalence checking is used to detect partially redundant logic gates.
The authors of [73] propose an approach that observes the presence of redundancies
based on multiple cross-point appearance and disappearance faults in a
Toffoli network. However, the inability to localize the faults is a problem, and the
results do not reflect the changes in the number of reversible gates or the quantum
cost after redundancy removal.
Typically, information about the location of redundant s-a-v faults in an
irreversible circuit comes from the Automatic Test Pattern Generator (ATPG). In
contrast, a reversible ATPG for identifying redundant s-a-v faults can be
constructed [41], but is not well developed. Therefore, in this chapter we propose
an approach where classical methods are applied for redundancy identification and
minimization of reversible designs. The scheme is well integrated with
reversible circuits synthesized by technology mapping [52, 71]. In particular, the
synthesis is done by a direct substitution of classical gates with reversible modules
of equivalent functionality. Here, we follow the realization involving Toffoli-based
modules equivalent to the standard gates of a classical network, presented in the
previous chapter. The appearance of a faulty value on the inputs of each module is
deduced, and an approach using this information to simplify a reversible network is
presented. Further, we study instances of redundancy in reversible networks
generated through mapping from irredundant classical circuits. This can happen
when gates in the irreversible network share inputs and their reversible modules
have common Toffoli gates with the same constant target input. We present an
algorithm to remove such extra Toffoli gates, reducing garbage outputs and
quantum cost.
The translation of a classical circuit to a reversible one is prone to adding
redundancies, which can often be identified during testing for stuck-at faults. In
particular, a stuck-at-v (s-a-v) fault at line A in a logic circuit is redundant if setting
A=v (1 for s-a-1, 0 for s-a-0) does not change the functionality of the circuit. Hence, the
line on which the fault is located can be set to that constant fault value [76]. Then the gate
related to this signal can be removed if this constant is the controlling value of the gate
(such as '0' for an AND gate and '1' for an OR gate). The minimization continues until the
propagated value is no longer a constant.
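The redundancy condition above can be illustrated by brute force on a small classical netlist. The following is only a minimal sketch: the netlist format, the wire names and the helper functions are hypothetical, and exhaustive simulation stands in for an ATPG tool, which is feasible only for very small circuits.

```python
from itertools import product

# Hypothetical netlist format: (output wire, operator, input wires), in topological order.
NETLIST = [
    ("n1", "AND", ("a", "b")),
    ("y", "OR", ("n1", "a")),      # y = ab + a, which is functionally just a
]
OPS = {
    "AND": lambda x, y: x & y,
    "OR": lambda x, y: x | y,
}

def evaluate(netlist, inputs, fault=None):
    """Evaluate the netlist; `fault` is an optional (wire, stuck_value) pair."""
    values = dict(inputs)
    if fault and fault[0] in values:
        values[fault[0]] = fault[1]            # fault on a primary input
    for out, op, ins in netlist:
        values[out] = OPS[op](*(values[w] for w in ins))
        if fault and fault[0] == out:
            values[out] = fault[1]             # fault on an internal wire
    return values

def is_redundant(netlist, primary_inputs, output, fault):
    """A fault is redundant if no input vector exposes it at the output."""
    for bits in product((0, 1), repeat=len(primary_inputs)):
        vec = dict(zip(primary_inputs, bits))
        good = evaluate(netlist, vec)[output]
        bad = evaluate(netlist, vec, fault)[output]
        if good != bad:
            return False                       # this vector is a test for the fault
    return True

print(is_redundant(NETLIST, ["a", "b"], "y", ("n1", 0)))   # n1 s-a-0
print(is_redundant(NETLIST, ["a", "b"], "y", ("n1", 1)))   # n1 s-a-1
```

In this sketch the s-a-0 fault on n1 is redundant, so n1 can be tied to 0 and the OR gate degenerates to the wire a, whereas the s-a-1 fault is detected by a=0.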
4.2 Redundancy in classical and reversible circuits
Classical circuits synthesized by automatic tools often include untestable faults.
The signal lines on which such faults are located can be removed without affecting
the output function. The redundant elements cannot be ignored, as they cause
undesirable effects such as increased delay and chip area, as well as
degraded circuit testability. There are many sources of redundancy in
classical networks. For example, technology-dependent synthesis methods such
as NAND-NAND mapping use only NAND gates, causing some signals to be
redundant [74]. Also, using pre-synthesized IPs of size or functionality broader
than required introduces some redundancy. Note that some form of
redundancy is often employed intentionally in fault-tolerant designs, or for
circuit delay adjustment, hazard elimination or testability [75, 76]. However,
as these types of redundancies are not used in circuit minimization, they are not
considered here.
The work in [75] presents a method for redundancy removal in classical circuits
based on transducer or logic verification. Other solutions come from hardware
testing, and use redundant stuck-at faults to identify and remove redundant wires
or minimize gate sizes [77, 78]. The removal of redundant faults must proceed
gradually, because when a redundant element is removed, other redundant
elements may become irredundant and vice versa. Therefore, only one
redundant fault is removed at a time, and ATPG must be rerun on the modified
circuit, making the overall procedure computationally expensive.
Redundancies present in irreversible circuits may contribute to the increased size
of their reversible representations if the latter are obtained by direct mapping
methods [52, 71]. For example, in the synthesis of reversible circuits from
irreversible specifications [71], classical synthesis methods are used to
produce an irreversible circuit, which is then mapped to a reversible one using
Toffoli modules. In the process, no special attention is paid to eliminating
redundancies in the irreversible circuit, which can therefore be passed on to the final
reversible circuit. To alleviate this problem, we can apply classical ATPG to the
irreversible circuit in order to detect redundant stuck-at faults, and then remove
them using re-synthesis, Fig. 4-2(a). Although this solution gives a minimized
irreversible circuit, the procedure is computationally intensive. However, we do
not need an irredundant irreversible circuit in order to obtain its
minimized reversible representation. All that is required is a list of untestable
faults in the irreversible circuit, generated by a single run of ATPG. Such a list of
faults is used to pinpoint the potential locations of redundancies in the reversible
realization. In our method we take the information about redundant faults from
classical ATPG and apply it during the reversible synthesis, as shown in the flow
diagram of Fig. 4-2(b). By doing that, all the minimization effort is directed at the
reversible network, avoiding a costly minimization of the intermediate-step
irreversible circuit.
Figure 4-2: (a) Redundancy removal before reversible synthesis, (b) Redundancy
removal during reversible synthesis.
In our proposed method we first investigate the characteristics of the classical
circuit's redundant faults in the reversible embedding. Garbage bits are known
redundancies added to the reversible network; however, our aim is to optimize
the network disregarding garbage outputs, which are assumed to be don't cares,
and eventually to reduce the number of garbage bits as well. Although some
faults can be detected through garbage outputs [41], when they do not change
the target functionality such faults are considered to be redundant, as they do
not affect the circuit functionality. For example, in Fig. 4-3, the e s-a-0 fault is
redundant since the test vector 11 generates "1" at the output for both the fault-free and
faulty cases. In the reversible realization, on the other hand, the faulty behavior is
observable at the garbage output g3, while the function output E in the reversible
embedding remains unaffected.
Figure 4-3: Arbitrary circuit with s-a-v fault: (a) classical implementation with
untestable fault, (b) reversible implementation by direct mapping with fault testable
through garbage, (c) Toffoli realization of the circuit.
Definition 1: In reversible circuits, similarly to the irreversible ones, if the
presence of a fault does not change the primary output value, then we call
the fault functionally redundant.
Although redundant faults in the classical counterpart do not affect the testability
of reversible circuits, their removal during synthesis results in an optimized
reversible implementation.
Example 1: Consider a network consisting of a NOR gate, the output of which is
connected to an XOR gate, Fig. 4-4. In the reversible realization, the XOR has
a control on the line "b" and the target on the output of the NOR module (line 3
in the red circle). Assume an s-a-0 fault located at the input "0" to the CNOT.
Then, from row 2 of Fig. 4-5 (with c=1 for NOR), the reduced NOR module
becomes b⊕1, i.e., a CNOT with a control at "b" and a target input equal to 1.
The reduced module (CNOT) is in series with the next CNOT, which has its
control at "b" and its target at the line of value 1. Hence, the final output is
f= b⊕1⊕b = 1, i.e., both CNOTs are removed.
Figure 4-4: Removal of same gates in series, from classical to reversible.
One way to obtain a reduced Toffoli module is to substitute the
faulty constant value at the input and then simplify the gate. For example, in the
case of the OR module composed of standard reversible gates, if any input (e.g.,
a) is fixed at value 1 (row 1 of Fig. 4-5), then the function output (f=c⊕a⊕b⊕ab)
becomes f=c⊕1, which is simply a NOT gate on the target line. Similarly, an s-a-0
fault at any input of the OR gate changes the module to a CNOT gate with the
remaining input as the control (f=c⊕b). The simplifications of standard
reversible gates are summarized in Fig. 4-5. Note that the simplification of each
individual gate follows from the presence of a redundant s-a-v fault at the gate input.
The garbage line associated with the fault is discarded. However, in a network of
cascaded Toffoli modules, the faulty input can propagate through the garbage
outputs of the previous module to the next module. Hence, the faulty signal is
kept until the simplification process is completed, and is removed when no further
simplification is possible.
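The two OR-module reductions quoted above can be checked exhaustively. The sketch below is illustrative only: the function names are hypothetical, and the module is modelled by its target-line expression f=c⊕a⊕b⊕ab rather than by an explicit gate cascade.

```python
from itertools import product

def or_module(a, b, c):
    """Target output of the reversible OR module: f = c XOR a XOR b XOR ab."""
    return c ^ a ^ b ^ (a & b)

def same_function(f, g, n):
    """Compare two Boolean functions of n variables on every input vector."""
    return all(f(*v) == g(*v) for v in product((0, 1), repeat=n))

# Input a stuck at 1: the module collapses to a NOT on the target line (f = c XOR 1).
sa1_ok = same_function(lambda b, c: or_module(1, b, c), lambda b, c: c ^ 1, 2)

# Input a stuck at 0: the module collapses to a CNOT controlled by b (f = c XOR b).
sa0_ok = same_function(lambda b, c: or_module(0, b, c), lambda b, c: c ^ b, 2)

# With the ancilla c = 0 the module realizes the classical OR.
or_ok = same_function(lambda a, b: or_module(a, b, 0), lambda a, b: a | b, 2)

print(sa1_ok, sa0_ok, or_ok)
```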
Figure 4-5: Simplified Toffoli modules for s-a-v faults.
Another example of redundancy involves a repetition of identical gates
connected in series (representing the identity function), and a cascade of two CNOT
gates with the same ancilla inputs, where the target of the first-stage CNOT is a control
of the next CNOT gate. This kind of redundancy is generated as a byproduct of
the s-a-v redundancy removal process in reversible mapping. For example,
consider a NAND gate (da⊕1), which has a redundant fault s-a-1 at the input d.
The removal of the fault generates a CNOT (a⊕1). If this gate is in cascade with
another CNOT gate (two CNOT gates in cascade represented as f=a⊕b⊕c),
which has the same constant 1 on its target line, then the resulting function becomes
f=a⊕1⊕1=a. Hence, both CNOTs can be removed, and the function output
becomes the non-constant input (f=a).
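Both cancellation patterns can be reproduced with a very small gate-level simulator. This is a sketch under assumptions: the gate encoding as (controls, target) pairs and the line names a, b, t, t1 and t2 are hypothetical and are not taken from the figures.

```python
from itertools import product

def run(gates, state):
    """Apply multiple-control Toffoli gates, given as (controls, target), to a dict of line values."""
    state = dict(state)
    for controls, target in gates:
        if all(state[c] for c in controls):    # an empty control tuple acts as a NOT gate
            state[target] ^= 1
    return state

# Two CNOTs in cascade: the target of the first is the control of the second,
# and both target lines start at the constant 1.
cascade = [(("a",), "t1"), (("t1",), "t2")]
for a in (0, 1):
    out = run(cascade, {"a": a, "t1": 1, "t2": 1})
    print(a, out["t1"], out["t2"])             # t1 = a XOR 1, t2 = a XOR 1 XOR 1 = a

# Two identical gates in series realize the identity function.
pair = [(("a", "b"), "t"), (("a", "b"), "t")]
identity = all(
    run(pair, {"a": a, "b": b, "t": t}) == {"a": a, "b": b, "t": t}
    for a, b, t in product((0, 1), repeat=3)
)
print(identity)
```

Since the last line of the cascade always equals a, both CNOTs and their ancillae can be dropped and the output taken directly from a.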
In order to relate redundant faults of classical and reversible networks, we rely on
classical ATPG to identify and locate redundant faults in the original irreversible
network.
Lemma 1: If a fault on wire "w" in a classical network is redundant, then the
fault on the corresponding wire in the reversible network obtained by technology
mapping is also functionally redundant.
Proof: The reversible mapping in Chapter 3 transforms each gate in the classical
network into a reversible module. The dedicated connections among gates of
the original network are maintained, since the irreversible gates are replaced
by their reversible counterparts. However, to preserve I/O compatibility, extra
garbage outputs are added. According to [71], such garbage bits carry values
that are copies of the module inputs. Hence, if a faulty value at wire w in a
classical circuit does not change the output functionality, i.e., F⊕Ff=0, then the
fault effect propagated through the dedicated path in the reversible embedding will
not alter the target output (i.e., Fr⊕Frf = 0). □
Example 2: Consider a classical network implementing a function f=ab+bc+a'c.
The redundant AND gate bc is added intentionally by the designer to avoid a
static hazard and to ensure the continuity of the value f=1 if b=c=1 when the
logical value of "a" changes. Although a static hazard may be present in a CMOS
reversible circuit, it will not occur in quantum circuits because of
their unitary operations. However, in our direct reversible mapping, the
redundant logic added to avoid the static hazard in the classical circuit is passed
on to the reversible network, so we need to get rid of this type of
redundancy in the reversible realization. The classical ATPG identifies the s-a-0
fault at the output of the gate "bc" as redundant. After the module-reversible
mapping with a target library of Toffoli gates, the reversible
representation of the original function f becomes
fr=0⊕(0⊕ac⊕c)⊕(0⊕ab⊕bc)⊕(0⊕ab)(0⊕bc)⊕(0⊕ab⊕bc⊕(0⊕ab)(0⊕bc))(0⊕ac⊕c).
If we perform the algebraic XOR simplification, i.e., x⊕x=0, then
we obtain fr=0⊕ab⊕ac⊕c. If "bc" is instead assumed to be corrupted by
the s-a-0 fault (identified by classical ATPG), then fr again becomes
fr=0⊕ab⊕ac⊕c. This implies that the s-a-0 fault at the line "bc" is indeed
redundant in both the reversible and the classical networks.
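The algebra of Example 2 can be confirmed by evaluating the expressions on all eight input vectors. The following sketch assumes, as in the expression for fr, that each AND/OR module contributes its XOR form with ancilla 0; the function names are hypothetical.

```python
from itertools import product

def f_classical(a, b, c):
    """f = ab + bc + a'c, including the redundant consensus term bc."""
    return (a & b) | (b & c) | ((1 - a) & c)

def f_reversible(a, b, c, bc_stuck_at_0=False):
    """XOR form of the mapped network, built from AND and OR modules with ancilla 0."""
    ab = 0 ^ (a & b)
    bc = 0 if bc_stuck_at_0 else 0 ^ (b & c)
    nac = 0 ^ (a & c) ^ c                      # a'c as an input-inverted AND module
    y = 0 ^ ab ^ bc ^ (ab & bc)                # first OR module: ab + bc
    return 0 ^ nac ^ y ^ (nac & y)             # second OR module

def f_simplified(a, b, c):
    """Result of the XOR simplification: fr = 0 XOR ab XOR ac XOR c."""
    return 0 ^ (a & b) ^ (a & c) ^ c

vectors = list(product((0, 1), repeat=3))
print(all(f_reversible(*v) == f_classical(*v) for v in vectors))
print(all(f_reversible(*v) == f_simplified(*v) for v in vectors))
print(all(f_reversible(*v, bc_stuck_at_0=True) == f_simplified(*v) for v in vectors))
```

All three comparisons hold, which is the functional redundancy of the bc s-a-0 fault stated in the example.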
Input: mapped reversible network, redundant s-a-v at l
Output: simplified reversible network with no redundant fault
1. For redundant fault v at l, set l= v
2. Until no module with constant control inputs reached {
3.    Replace all modules connected to l with reduced forms
4.    If (l primary input or not garbage output) then
5.       Retain line l
6.    Else remove l
7.    Select next module in topological order
8.    If (same as previous reduced module) then
9.       Remove gates
10.   Else {
11.      If (two CNOTs in cascade with similar constants in target) then
12.         Remove them
13.         Connect module output to non-constant input}
14.   Else {
15.      If (any control of current module is constant) then
16.         Set l= new constant
17.         Return to step 3 }
Algorithm 1: Simplification by Redundancy removal
We use redundant faults identified in a classical network to remove the logic gates
driven by the fault in the mapped reversible network. As we are only interested in
synthesizing a reversible network, after locating the redundant fault in the
irreversible circuit we proceed with the redundancy removal technique for the
reversible network described in Algorithm 1. In Step 1, we set the considered
signal to the constant corresponding to the polarity of the s-a-v fault. Then we
replace the original Toffoli module by the simplified gate/wire of Fig. 4-5. As a
consequence, the next module is subjected to another simplification if the
constant value of the previous stage propagates to that module. This replacement
continues until no module matches any of the simplification conditions in Fig. 4-5.
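The propagation of a constant from one module to the next can be sketched at the level of individual Toffoli gates. Note that this is a simplified gate-level analogue and not Algorithm 1 itself: it does not use the module table of Fig. 4-5, it does not cancel gate pairs, and the function names, gate encoding and line names are all hypothetical.

```python
from itertools import product

def run(gates, state):
    """Simulate a cascade of (controls, target) Toffoli gates on a dict of line values."""
    state = dict(state)
    for controls, target in gates:
        if all(state[c] for c in controls):
            state[target] ^= 1
    return state

def propagate(gates, constants):
    """Fold known constant line values into a cascade of (controls, target) gates.

    Returns the reduced gate list, the start value of each constant line that
    becomes a live target, and the lines that stay constant to the end.
    """
    known = dict(constants)      # lines whose current value is a known constant
    start = {}                   # constant held by a line when it first becomes live
    reduced = []
    for controls, target in gates:
        if any(known.get(c) == 0 for c in controls):
            continue             # a control fixed at 0 blocks the gate: remove it
        live = tuple(c for c in controls if c not in known)   # controls fixed at 1 are dropped
        if not live and target in known:
            known[target] ^= 1   # constant inversion of a constant: fold into the line value
            continue
        if target in known:
            start[target] = known.pop(target)   # target now depends on a signal
        reduced.append((live, target))
    return reduced, start, known

# NAND module (t1 = 1 XOR ab) feeding an AND module (t2 = 0 XOR t1*c).
gates = [(("a", "b"), "t1"), (("t1", "c"), "t2")]
# Line a is fixed by an s-a-0 fault assumed redundant; t1 and t2 are ancillae.
constants = {"a": 0, "t1": 1, "t2": 0}

reduced, start, known = propagate(gates, constants)
print(reduced)                   # [(('c',), 't2')]
print(known)                     # {'a': 0, 't1': 1}

for b, c in product((0, 1), repeat=2):
    full = run(gates, {"b": b, "c": c, **constants})
    small = run(reduced, {"b": b, "c": c, **start})
    assert full["t2"] == small["t2"] and full["t1"] == known["t1"]
```

In this sketch the Toffoli gate of the NAND module disappears, its output stays at the constant 1, and the following AND module degrades to a single CNOT, mirroring the chain of reductions described above.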
Example 3: Consider the classical circuit of Fig. 4-6 and its reversible
implementation. The classical ATPG determines that the fault "a" s-a-1 is
redundant. While testing for that fault, if the non-controlling values are set to
a=0 and b=1 for the first Toffoli gate, leaving the "c" input as a don't care value,
the target output function f=1⊕c will be the same as the fault-free function
output with a=1. So the value of input "a" has no effect on the target
function. Hence, we set a=1 (similarly to the classical circuit for the
corresponding redundant fault), and proceed with further simplification. In
the first step, Fig. 4-6(a), we reduce the OR module using the assignment
a=1. This results in a single inverter on line 3, the output of which is 0.
Using this assignment, we next simplify the 2nd OR module, leading to the
replacement of the OR module by a CNOT gate between lines 5 and 6.
Discarding two inverters in series, the resultant constant 1 on line 3
degrades the last NAND to a CNOT, Fig. 4-6(d). If we proceed further we
observe that this CNOT is cascaded with the next CNOT and has the same
constant inputs, which nullifies the two gates. Hence,
by discarding them in addition to the previously mentioned redundant lines
(1,3), we obtain a circuit that is the reversible equivalent of a classical NAND
gate, Fig. 4-6(e).
Figure 4-6: Steps for reversible redundancy removal.
Lemma 2: The network obtained by Algorithm 1 is reversible.
Proof: Any simplification performed on the network that leads to the removal of
Toffoli gates related to a constant fault value does not affect any of the original
lines of the network unless they are connected to an ancilla and a garbage output
(steps 4-6 in Algorithm 1). Hence, any eventual removal of such an (ancilla and
garbage) line has no effect on the target function. Further, an ancilla is removed
only when it becomes a garbage output due to Toffoli module simplification.
Hence, the numbers of inputs and outputs in the reversible network remain
compatible, preserving the reversibility condition. □
Note that the simplification process with redundancy removal in a reversible
circuit differs from the classical one. However, the resultant reversible circuit is
the same as the one obtained by applying classical redundancy removal prior to
reversible mapping. Therefore, we are able to avoid the time-consuming classical
redundancy removal and circuit optimization step in our method, and perform
removal and optimization only during the reversible synthesis. This is particularly
important if the reversible network is to be synthesized from an irreversible
specification with no optimized irreversible circuit existing prior to the process.
The irreversible circuit is created only at the intermediate step of the reversible
synthesis to serve as a template for the reversible gate mapping procedure.
However, such an irreversible circuit does not need to be optimized by removing
redundancies.
Based on Lemma 1, a redundant fault in a classical network is also functionally
redundant in the corresponding reversible circuit obtained from the
irreversible-gate-to-Toffoli-module mapping process. A classical gate affected by the faulty value is
replaced by a constant value or an input signal. Similarly, according to Fig. 4-5,
each Toffoli module is simplified to represent an input signal or a constant. For
example, an s-a-0 fault at any input of an AND gate forces the gate output to the
constant value 0. In the reversible network this is represented as the 0⊕0 condition.
In fact, each classical gate simplification has a corresponding Toffoli module
simplification.
4.3 Redundant reversible gate removal
Another type of redundancy is linked to redundant gates, for example, when
different gates share both inputs in the classical network. To illustrate the case, if
AND and OR gates have the same inputs a and b, then in their reversible
implementation the AND gate is represented as f1=0⊕ba, and the garbage outputs a
and b are then used as inputs to the OR gate: f2=0⊕ba⊕b⊕a. In this case, the
Toffoli gate (0⊕ba) is used twice, but one copy is redundant. The redundancy
can be eliminated by keeping one copy of the common gate, removing the other,
and interchanging the control points for the second gate.
In our approach, we divide the Toffoli modules of various classical gates into two
groups based on the constant ancilla bits on their target lines. Group 1 includes
AND, OR and single input-inverted AND gates, where the ancilla is 0, Fig. 4-7(a).
Group 2 consists of NAND, NOR and single input-inverted OR gates, which have
constant 1 as the ancilla, Fig. 4-7(b).
Figure 4-7: Grouping of Toffoli modules: (a) Group 1 with constant '0', (b) Group 2
with constant '1'.
The process of removing extra Toffoli gates is summarized in Algorithm 2. In
Step 1, we search for reversible Toffoli modules having the same inputs. Next
we identify whether the modules are from the same group (Group 1 or Group 2,
Fig. 4-7). There can be more than two modules. If the selected modules are
from the same group, then we keep the module with the minimum number of Toffoli gates
common to all the remaining modules, and remove the common Toffoli gates
from the other instances. For example, both the reversible NAND gate (1⊕ab)
and the reversible NOR gate (1⊕ab⊕b⊕a) (Group 2, Fig. 4-7(b), with constant '1')
contain the same 1⊕ab, which can be removed from the NOR module to minimize the
network.
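The grouping and the search for a common part can be sketched by storing each module as an ancilla constant plus the set of product terms XORed onto the target line. The dictionary layout and helper names below are hypothetical, and the XOR form used for the input-inverted OR (1⊕ab⊕a) is derived here rather than quoted from the text.

```python
from itertools import product

# Each module: (ancilla constant on the target line, set of product terms XORed onto it).
MODULES = {
    "AND": (0, {"ab"}),
    "OR": (0, {"ab", "a", "b"}),
    "INV_AND": (0, {"ab", "b"}),     # a'b
    "NAND": (1, {"ab"}),
    "NOR": (1, {"ab", "a", "b"}),
    "INV_OR": (1, {"ab", "a"}),      # a'+b
}

# Classical reference functions used only to check the XOR forms.
REFERENCE = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "INV_AND": lambda a, b: (1 - a) & b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
    "INV_OR": lambda a, b: (1 - a) | b,
}

def evaluate(name, a, b):
    """Value on the target line of the named module for inputs a and b."""
    const, terms = MODULES[name]
    value = const
    for term in terms:
        value ^= {"ab": a & b, "a": a, "b": b}[term]
    return value

def group(name):
    """Group 1 has ancilla 0, Group 2 has ancilla 1."""
    return 1 if MODULES[name][0] == 0 else 2

def shared_terms(first, second):
    """Product terms that two modules with the same inputs have in common."""
    return MODULES[first][1] & MODULES[second][1]

forms_ok = all(
    evaluate(name, a, b) == REFERENCE[name](a, b)
    for name in MODULES
    for a, b in product((0, 1), repeat=2)
)
print(forms_ok)
print(group("NAND"), group("NOR"), shared_terms("NAND", "NOR"))
```

For NAND and NOR the sketch reports the same group and the common term ab, which is the shared Toffoli gate mentioned above.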
In the next step, we consider the remaining part of the 2nd reversible module(s) (in the
above example, the ⊕b⊕a part of NOR). The common Toffoli part of the first
considered module is set as a control of the remaining part of the 2nd module. The
garbage outputs of the 1st module (common inputs a and b) are used to set the target
lines for this module. However, if the modules are from both groups and are not
complements of each other (as are, for example, AND (0⊕ab) from group 1 and NAND (1⊕ab)
from group 2), then we first consider the module with the maximum number of
Toffoli gates. We keep the Toffoli gates common to all modules (except the
ancilla) unchanged, while for the remaining CNOTs of the module we swap the
positions of the control and the target. Since for inputs a and b the CNOT gate
implements the function a⊕b, the roles of signals a and b as the control and the
target are interchangeable. For the next module, which is from the other group (with
a different ancilla constant), we add an inverter on the output of the part having
the common Toffoli gates, which yields the ancilla value required by that module.
To complete the
process, we add the remaining part of the second considered module (if any). If,
however, the modules are complements of each other, then we remove one module and
add only a CNOT gate whose target line input is the constant 1.
1. Traverse network from primary inputs in topological order
2. Until primary outputs not reached {
3.    Select modules having both inputs same
4.    If (modules from same group) then
5.       Retain module with minimum common gates
6.       Remove common part from other modules
7.       Set output of first module as control for other modules
8.       Set targets of rest modules on garbage of first module
9.    Else {
10.      If (modules from different groups and not invert to each other) then
11.         Select module with maximum gates
12.         Select common parts and interchange control and target for rest part
13.         Add inverter on common output
14.         Add remaining part of second module}
15.   Else {
16.      If (modules complementary to each other) then
17.         Retain one module in original form
18.         Add one CNOT with control at first module
19.         Set target constant 1 and output second module}
20.   Select next modules
21.   Return to step 3
Algorithm 2: Redundant reversible gate removal
Example 5: Let us consider the three instances of reversible mapping in Fig. 4-8.
In Fig. 4-8(a), the reversible Toffoli modules have the same constant '0' (group
1 of Fig. 4-7). The original quantum cost of this network is 13. In this
realization, the Toffoli and CNOT gates of the module of function f1 (0⊕ab⊕b)
are the same as the first two gates of the OR module, f2 (0⊕ab⊕b⊕a). Hence,
we remove these two reversible gates from the second (OR) module
and set the control for the remaining part at the output of the first module, line 3,
and the target on "a" (swapping control and target). Thus line 4 becomes
redundant and can be removed. With this simplified realization, the
quantum cost is only 7, while the garbage output count is 1. Fig. 4-8(b)
demonstrates redundancy removal for modules from different groups that are not
complements of each other. Here, one module, from group 2, is
f1=1⊕ab⊕a⊕b and the other, from group 1, is f2=0⊕ba⊕b. Hence,
we first keep 1⊕ab⊕b unchanged and swap the control and target of the last CNOT
gate. Inverting 1⊕ab⊕b (by adding a NOT gate on the output line)
creates 1⊕(1⊕ab⊕b)=0⊕ab⊕b, which is the function f2. Fig. 4-8(c)
illustrates the case where the modules are mutual complements.
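The quantum costs quoted for Fig. 4-8(a) can be reproduced with a gate-level sketch. The gate lists below are a reading of the textual description only (the figure itself is not reproduced here), the line names l3 and l4 are hypothetical, and the cost table assumes the usual convention of 1 for NOT and CNOT and 5 for a two-control Toffoli gate, which is consistent with the totals 13 and 7.

```python
from itertools import product

COST = {0: 1, 1: 1, 2: 5}     # quantum cost by number of controls: NOT, CNOT, Toffoli

def run(gates, state):
    """Simulate a cascade of (controls, target) Toffoli gates on a dict of line values."""
    state = dict(state)
    for controls, target in gates:
        if all(state[c] for c in controls):
            state[target] ^= 1
    return state

def quantum_cost(gates):
    return sum(COST[len(controls)] for controls, _ in gates)

# Direct mapping: two independent modules on ancilla lines l3 and l4.
direct = [
    (("a", "b"), "l3"), (("b",), "l3"),                       # f1 = 0 XOR ab XOR b
    (("a", "b"), "l4"), (("b",), "l4"), (("a",), "l4"),       # f2 = 0 XOR ab XOR b XOR a
]

# After sharing: f1 stays on l3 and controls a CNOT whose target is line a.
shared = [(("a", "b"), "l3"), (("b",), "l3"), (("l3",), "a")]

print(quantum_cost(direct), quantum_cost(shared))             # 13 7

for a, b in product((0, 1), repeat=2):
    d = run(direct, {"a": a, "b": b, "l3": 0, "l4": 0})
    s = run(shared, {"a": a, "b": b, "l3": 0})
    assert (d["l3"], d["l4"]) == (s["l3"], s["a"])            # same f1 and f2
```

In the shared version line a carries f2 and line 3 carries f1, so only line b remains as garbage, in agreement with the garbage count of 1.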
Figure 4-8: Simplification by Toffoli gates removal in different
conditions, (a) Modules from same group, (b) Modules from different groups, (c)
Modules complementary to each other.
Lemma 3: In the redundant gate removal, the maximum savings in quantum
cost is 11, while the maximum reduction in garbage bits is 2.
Proof: Since a two-input classical gate has three lines in its reversible realization,
we can obtain only three outputs from each Toffoli module. Now let us assume
three gates of the same group are in parallel and share the same inputs (i.e.,
a+b, āb and ab). Note that if we have more than three gates sharing the same
inputs, then we perform the individual mapping of the other gates first, such that both
inputs are available for grouping of the remaining three gates, and no more
input fan-out signal is required for further propagation. With ordinary mapping of
these three gates, the maximum quantum cost is 18 and there are 2 garbage outputs. However,
with redundant gate removal the quantum cost is 7, and the three outputs
provide three different functions with no garbage output. Thus the savings in
quantum cost is 18-7 = 11, and the garbage savings is 2. □
4.4 Experimental results
We applied our redundancy removal technique to several classical circuits with
the aid of the Berkeley SIS program [70]. First, we use the classical ATPG from SIS to
identify redundant faults in classical circuits without minimizing the irreversible
network. Then we apply the simplification by redundancy removal techniques
(both algorithms) to reversible representations of some MCNC benchmark
circuits, Table 4-1. Note that in Table 3-5 we already showed that our reversible
mapping method is better than the BDD-based method in terms of number of
gates, quantum cost and, in many cases, number of lines. Thus in Table 4-1 we
present the improvement over our direct mapping only. Column 2 gives the
number of primary inputs/outputs of the original classical circuits. Columns 3-4 indicate
the number of garbage outputs and the quantum cost of the reversible circuits
obtained with the original direct mapping [52]. Columns 5-6 present the number
of garbage bits and the quantum cost of the simplified circuits after applying the redundant
fault removal method (Algorithm 1). Next we apply Algorithm 2 to remove
redundant reversible gates, and columns 7-8 illustrate the effect on garbage and
quantum cost. The overall improvements of applying both algorithms over direct
mapping are stated in columns 9-10. With our approach, the savings in garbage bits
are more than 11% and the savings in quantum cost are above 14% for most of the
circuits.
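The overall improvement columns follow from simple arithmetic on the direct-mapping and final values of Table 4-1. The short sketch below recomputes them; the numbers are copied from the table, while the dictionary layout and the function name are only illustrative.

```python
# circuit: (garbage, quantum cost) after direct mapping [52], then after both algorithms
ROWS = {
    "k2": (809, 4861, 784, 4725),
    "alu_4": (576, 3306, 537, 3095),
    "C2670": (581, 2811, 527, 2556),
    "frg2": (659, 5868, 585, 3355),
    "i10": (1823, 10270, 1597, 9065),
    "comp": (118, 497, 96, 425),
    "cordic": (63, 246, 55, 210),
    "parity": (30, 95, 15, 19),
    "Cm82a": (13, 65, 8, 40),
    "count": (129, 658, 114, 592),
    "decoder": (15, 173, 10, 146),
    "Z4ml": (30, 151, 23, 124),
}

def savings(name):
    """Absolute and percentage savings in garbage bits and quantum cost."""
    g0, q0, g1, q1 = ROWS[name]
    return g0 - g1, q0 - q1, 100 * (g0 - g1) / g0, 100 * (q0 - q1) / q0

for name in ROWS:
    dg, dq, pg, pq = savings(name)
    print(f"{name:8s} garbage {dg:4d} ({pg:4.1f}%)  quantum cost {dq:5d} ({pq:4.1f}%)")

garbage_over_11 = sum(1 for name in ROWS if savings(name)[2] > 11)
cost_over_14 = sum(1 for name in ROWS if savings(name)[3] > 14)
print(garbage_over_11, cost_over_14, len(ROWS))
```

On these data, 9 of the 12 circuits save more than 11% of the garbage bits and 7 of the 12 save more than 14% of the quantum cost.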
Table 4-1: Simplification by Redundant fault and Gate removal

| Circuit | PI/PO | Direct mapping [52]: garbage | Direct mapping [52]: quantum cost | Fault removal (Alg1): garbage | Fault removal (Alg1): quantum cost | Gate removal (Alg2): garbage | Gate removal (Alg2): quantum cost | Overall improvement over [52]: garbage saving | Overall improvement over [52]: quantum cost saving |
|---|---|---|---|---|---|---|---|---|---|
| k2 | 45/45 | 809 | 4861 | 796 | 4786 | 784 | 4725 | 25 (3.1%) | 136 (2.8%) |
| alu_4 | 18/4 | 576 | 3306 | 549 | 3146 | 537 | 3095 | 39 (6.8%) | 211 (6.4%) |
| C2670 | 233/140 | 581 | 2811 | 567 | 2737 | 527 | 2556 | 54 (9.3%) | 255 (9.1%) |
| frg2 | 143/139 | 659 | 5868 | 609 | 3476 | 585 | 3355 | 74 (11.2%) | 2513 (42.8%) |
| i10 | 257/224 | 1823 | 10270 | 1679 | 9449 | 1597 | 9065 | 226 (12.4%) | 1205 (11.7%) |
| comp | 32/3 | 118 | 497 | 111 | 490 | 96 | 425 | 22 (18.6%) | 72 (14.5%) |
| cordic | 23/2 | 63 | 246 | 63 | 246 | 55 | 210 | 8 (12.7%) | 36 (14.6%) |
| parity | 16/1 | 30 | 95 | 30 | 95 | 15 | 19 | 15 (50%) | 76 (80%) |
| Cm82a | 5/3 | 13 | 65 | 13 | 65 | 8 | 40 | 5 (38.5%) | 25 (38.5%) |
| count | 35/16 | 129 | 658 | 129 | 658 | 114 | 592 | 15 (11.6%) | 66 (10%) |
| decoder | 5/16 | 15 | 173 | 15 | 173 | 10 | 146 | 5 (33.3%) | 27 (15.6%) |
| Z4ml | 7/4 | 30 | 151 | 30 | 151 | 23 | 124 | 7 (23.3%) | 27 (17.9%) |
4.5 Conclusion
In this chapter, we present the simplification of reversible circuits synthesized by
technology mapping, using the redundancy removal technique of classical
irreversible gates. A new approach that utilizes redundant faults for the
minimization of reversible circuits has been applied. We show that gate count
minimization is possible under redundancy conditions. We address the relation
between classical and reversible redundancy removal schemes with examples,
and find that in many cases the simplified reversible network is equivalent to
the simplified classical network with redundancy removal. Further, the gate
redundancies arising from direct reversible mapping can be removed to obtain
a better realization.
Chapter 5 Reversible architecture of computer arithmetic
Until now, a great part of the research on reversible logic has aimed at synthesizing
a function specification into an efficient reversible circuit realization. Recently, some
work has been done on designing reversible arithmetic circuits such as adders,
subtractors, multipliers and arithmetic logic units, as well as on proposing new
gates dedicated to implementing arithmetic circuits. In quantum computing,
arithmetic operations are implemented with elementary unitary operations and
special designs. We mentioned earlier that our approaches target a reversible
logic structure independent of any technology, and not the unitary operations of
quantum circuits. However, permutative (reversible) quantum computing often
includes arithmetic blocks, which are parts of the oracles in Grover's algorithm
(database search) and Shor's algorithm (integer factorization) [111]. Regardless of the
underlying technology, arithmetic operations always play an important role in
data processing. In current adiabatic CMOS technology, the reversible alternative
of a classical circuit can directly implement these arithmetic circuits for low-power
computation. In this chapter we present an efficient way to realize reversible
arithmetic circuits, especially targeting a reversible arithmetic logic unit
(RALU). Our main goal is to create a modular block that can be used in other
arithmetic circuits and extended to any size. To start with, we propose a
reversible controlled adder/subtractor (RCAS) with an overflow detector, which has not
been addressed earlier. Next we present a new design of a reversible comparator based
on a reversible subtractor using the RCAS. Our next target in this chapter is to present
an efficient and flexible design of a reversible arithmetic logic unit. In the literature on
reversible logic, we do not find a significant advancement in integrating both
logical and arithmetical functions, commonly known as an arithmetic logic unit
(ALU), a key feature of any computing system architecture. Here, we present a
novel reversible ALU performing basic functions similar to those of a
classical ALU, such as addition, subtraction, AND, OR and XOR operations.
Additional functions such as NAND, NOR, XNOR and logical functions with a
single input inverted, overflow detection and comparison can also be performed
with this design. The integration of these operations in a single module using fewer
control signals is not available in any of the existing approaches. We also
present the square root circuit, another arithmetic circuit that is important in many
scientific calculations and has not been addressed earlier in reversible logic.
The proposed designs and their analysis based on different parameters of reversible
circuits (number of gates, garbage bits and quantum cost), as well as simulation
results, are presented here. We compare our arithmetic circuits with existing
circuits, and most of them compare favorably in terms of these parameters. Our
reversible arithmetic logic unit offers efficient programmability and more flexibility
than other methods.
5.1 State-of-art and proposed contributions
In addition to the reversible realization of logical functions, a great deal of work
has been done to implement the basic reversible arithmetic units such as
adders [29-31], multipliers [32, 33] and subtractors [34-36] by finding a direct
translation from the classical truth table to reversible forms, using basic standard
reversible gates such as Toffoli, Peres, Fredkin and Feynman gates as well as
some new reversible gates such as HNG and TSG. In [35] a new reversible TR
gate is introduced, which is applied to implement n-bit subtractors. This design
outperforms the previously proposed subtractor [34] in terms of quantum cost
and garbage outputs. Recently proposed reversible n-bit adder/subtractors use
cascaded full adder/subtractor modules with a control input signal [36]. The
above methods use the reversible embedding of the truth table specification of
the adder/subtractor. Apart from that, there are also quantum/reversible circuits for
arithmetic operations ranging from addition to multiplication and modular exponentiation
[37-40], which play an important role in Shor's quantum algorithm.
In this dissertation, we implement subtraction utilizing full adders, where, instead
of subtracting a subtrahend from a minuend, we add the minuend to the subtrahend
expressed in 2's complement. First, we develop a reversible controlled
adder/subtractor (RCAS) block, which performs either an addition of two binary
(signed or unsigned) numbers or a subtraction, depending on the value of the
control input. This design shows an improvement over the existing parallel
adder/subtractor with respect to the quantum cost and garbage bits.
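The arithmetic idea behind a controlled adder/subtractor can be sketched with a classical behavioural model. This is not the RCAS circuit itself, whose gate-level design is presented later in the chapter; the function names are hypothetical, and the sketch only shows how one control bit selects between a+b and a-b through the 2's complement, together with the usual signed overflow flag.

```python
def controlled_add_sub(a_bits, b_bits, ctrl):
    """Ripple-carry add (ctrl=0) or subtract (ctrl=1) on little-endian bit lists.

    Subtraction adds the 2's complement of b: every b bit is XORed with ctrl
    and ctrl is fed in as the initial carry.
    """
    carry = ctrl
    carry_into_msb = 0
    result = []
    for i, (a, b) in enumerate(zip(a_bits, b_bits)):
        b ^= ctrl
        if i == len(a_bits) - 1:
            carry_into_msb = carry
        result.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    overflow = carry_into_msb ^ carry          # signed overflow flag
    return result, carry, overflow

def to_bits(value, n):
    """Little-endian n-bit 2's complement representation of an integer."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    """Signed integer value of a little-endian 2's complement bit list."""
    value = sum(bit << i for i, bit in enumerate(bits))
    if bits[-1]:
        value -= 1 << len(bits)
    return value

diff, _, ovf = controlled_add_sub(to_bits(5, 4), to_bits(3, 4), 1)
print(from_bits(diff), ovf)                    # 5 - 3 in 4 bits

total, _, ovf = controlled_add_sub(to_bits(7, 4), to_bits(1, 4), 0)
print(from_bits(total), ovf)                   # 7 + 1 overflows the 4-bit signed range
```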
In this dissertation, we also present a novel reversible implementation of a
comparator that utilizes our reversible controlled adder/subtractor. Typically,
implementations of reversible comparators are based on the truth table specification
of the functionality [53, 79, 80]. These designs require significant quantum cost and
garbage outputs. Here, we propose a subtraction-based comparator for signed
numbers using our RCAS with overflow detector. The design is more efficient
than existing approaches in terms of quantum cost and garbage outputs.
Next, we propose an efficient and versatile reversible arithmetic logic unit
(RALU), which is very close to its classical counterpart. The RCAS block plays an
important role in our newly proposed RALU. Integration of logical functions or
arithmetic units or both in reversible logic is still a challenge. However, a
reversible computing architecture with the instruction set, control logic and
address calculation has been demonstrated in recent work [81]. An integrated
logic unit using the approach in [21] performs eight logic functions. In [82] we find
some benchmark circuits performing several logic and arithmetic functions in one
module. One such example is a unit that performs AND, OR and XOR only.
Another benchmark circuit, named mini ALU and built with the BDD-based method [25],
includes OR, AND and addition operations. Some designs generated from the SyRec
programming language incorporate multiplication and division too [82]. An
arithmetic unit proposed in [54] performs ADD and SUBTRACT operations as
well as increment and decrement of the input by one. A V-shaped low power
reversible ALU is developed in [83] for programmable reversible/quantum
computing, which performs modular arithmetic like addition, subtraction, negative
subtraction, XOR and no operation (NOP). However, the result does not reflect
the overflow (carry out) of arithmetic operations. Moreover, this design does not
include logic functions more common to classical ALU such as AND/NAND or
OR/NOR. Recently, a reversible ALU design was proposed in [84], which includes
many operations close to the classical ALU design.
In this thesis, we introduce a new integrated RALU module, which
encapsulates most of the operations of the classical realization with fewer
control lines. This module performs the basic arithmetic operations
of addition and subtraction, as well as the logic operations AND and OR. Further, we
introduce an XOR function (not available in classical ALU), which is very useful in
reversible circuits. Finally, some negated logical functions such as NAND, NOR
and XNOR, as well as implication, are also realized in this design.
In our approach, we implement the RALU operating on single-bit data, which is
capable of realizing various arithmetic functions and can be cascaded into an n-bit
design. Further, we modify our RALU to detect overflow and to perform a
comparison (set-less-than) operation, which detects whether a number is less than
another number. Thus our design includes more functions with fewer
lines and a lower quantum cost.
Apart from this, we use the RCAS module with a slight modification to propose a
new and efficient structured methodology for implementing a reversible square-root
circuit, which performs 2’s Complement addition/subtraction controlled by a
digit-by-digit square root result.
5.2 Reversible Controlled Adder/Subtractor
Addition and subtraction are two atomic operations of complex data
processing in computer arithmetic. In our proposed reversible arithmetic designs,
such as the comparator and the arithmetic logic unit, a significant part relies on an
addition/subtraction module, which is described first.
An adder/subtractor block is a combinational circuit that adds or subtracts two
binary numbers X and Y depending on the value of the input control signal. For
addition, a general full adder block adds three bits X, Y, Z and generates two
outputs, sum (S) and carry-out (Co), according to the logic equations S = X⨁Y⨁Z
and Co = XY⨁Z(X⨁Y). A subtractor performs a subtraction on three bits X
(minuend), Y (subtrahend) and Z (borrow-in), and results in a difference D and
a borrow Bo calculated according to the logic equations D = X⨁Y⨁Z and Bo =
X'Y⨁Z(X⨁Y)' [36]. In operations on unsigned numbers, when Cout from the most
significant position of an n-bit adder/subtractor equals 1, an overflow occurs,
indicating that the result cannot be represented by the assumed n bits. In this work,
we present a reversible adder/subtractor which can handle both signed and
unsigned numbers. We use signed numbers in 2’s Complement notation, as the
speed of adding/subtracting in this number notation makes it a preferable choice
for ALU implementations of many modern computers.
5.2.1 Reversible Controlled Adder/Subtractor (RCAS) block
The controlled adder/subtractor (CAS) block is designed to perform an addition
or subtraction depending on the value of the input control signal. The concept
applied here is based on the use of an adder circuit to perform subtraction
instead of having a dedicated subtractor. Hence, the operation X-Y is
implemented as X+Y’+1, i.e., 2’s Complement. The schematic of classical
controlled adder/subtractor is presented in Fig. 5-1. The A/S control wire is set to
0 for an addition and to 1 for a subtraction, in which case the vector Y is 1’s
complemented using the XOR gates. The Cin is also set to 1 to complete the 2’s
complementation of Y, Fig. 5-1.
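The identity X-Y = X+Y'+1 used by the classical CAS chain can be checked with a short behavioural sketch. The function name `cas_add_sub` and the default word length of 4 bits are illustrative assumptions, not part of the original design description.

```python
def cas_add_sub(x: int, y: int, a_s: int, n: int = 4) -> tuple[int, int]:
    """Behavioural model of a classical n-bit CAS chain: returns (S/D, Cout)."""
    mask = (1 << n) - 1
    # XOR gates: Y is 1's complemented when A/S = 1, passed unchanged when A/S = 0.
    y_eff = (y ^ (mask if a_s else 0)) & mask
    # Cin follows A/S, so the final +1 of the 2's complement is supplied for subtraction.
    total = (x & mask) + y_eff + a_s
    return total & mask, total >> n


if __name__ == "__main__":
    print(cas_add_sub(0b0101, 0b0011, 0))  # 5 + 3
    print(cas_add_sub(0b0101, 0b0011, 1))  # 5 - 3
```

The model works on whole words rather than on individual full adders, which is enough to confirm that one adder with an inverting XOR stage covers both operations.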
Figure 5-1: Controlled adder/subtractor design in 2’s complement a) CAS block
b) 4-bit adder/subtractor [85].
Note that an irreversible adder/subtractor module has 4 inputs (A/S, Cin, X and Y)
and 2 outputs (S/D and Cout). As its input and output counts are unequal, the original
form cannot be used directly as a reversible element. To create a cascadable
reversible CAS module, we need to add some garbage bits to the original
irreversible CAS function to balance the I/O counts. The garbage bits must be
assigned such that every input combination maps to a unique output
pattern, while preserving the original addition/subtraction operation. To fulfill this
condition, the minimum number of garbage signals is ⌈log2 M⌉, where M is the
maximum number of times any output pattern is repeated. In the irreversible controlled full
adder/subtractor M is 6 (the S/D and Cout pattern 01 or 10), resulting in 3
garbage signals. As this would still leave the I/O counts unbalanced (4 inputs vs. 5
outputs), a single ancilla bit set permanently to 0 is added. The truth table of our
RCAS module is in Table 5-1.
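The garbage-bit count quoted above can be reproduced by enumerating the irreversible controlled full adder/subtractor and counting how often each (S/D, Cout) pattern occurs. This is a minimal sketch; the helper names `cas_outputs` and `min_garbage` are illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import ceil, log2


def cas_outputs():
    """(S/D, Cout) of the irreversible controlled full adder/subtractor for all 16 inputs."""
    rows = []
    for a_s, cin, x, y in product((0, 1), repeat=4):
        y_eff = y ^ a_s                      # Y or its complement
        s = x ^ y_eff ^ cin
        cout = (x & y_eff) ^ (cin & (x ^ y_eff))
        rows.append((s, cout))
    return rows


def min_garbage(outputs):
    """Return (M, ceil(log2 M)), M being the largest multiplicity of an output pattern."""
    m = max(Counter(outputs).values())
    return m, ceil(log2(m))


if __name__ == "__main__":
    print(min_garbage(cas_outputs()))
```

With 2 functional outputs and 3 garbage outputs the circuit has 5 outputs, which is why one constant input has to be added to the 4 original inputs.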
A second issue in our RCAS construction is the lack of fan-out signals in
reversible design. In the classical CAS, the same control signal A/S is fanned-out
to all cascaded CAS blocks which is not possible for reversible circuits. To
remedy this, the A/Sg garbage bit in the RCAS (Table 5-1) is used to provide a
copy of a control signal to next blocks.
Table 5-1: Reversible controlled adder/subtractor truth table
cnst.  A/S  Cin  X  Y  |  S/D  Cout  A/Sg  g1  g2
  0     0    0   0  0  |   0    0     0    0   0
  0     0    0   0  1  |   1    0     0    0   1
  0     0    0   1  0  |   1    0     0    1   1
  0     0    0   1  1  |   0    1     0    1   0
  0     0    1   0  0  |   1    0     0    0   0
  0     0    1   0  1  |   0    1     0    0   1
  0     0    1   1  0  |   0    1     0    1   1
  0     0    1   1  1  |   1    1     0    1   0
  0     1    0   0  0  |   1    0     1    0   1
  0     1    0   0  1  |   0    0     1    0   0
  0     1    0   1  0  |   0    1     1    1   0
  0     1    0   1  1  |   1    0     1    1   1
  0     1    1   0  0  |   0    1     1    0   1
  0     1    1   0  1  |   1    0     1    0   0
  0     1    1   1  0  |   1    1     1    1   0
  0     1    1   1  1  |   0    1     1    1   1
With the questions of fan-outs and I/O count compatibility resolved, the reversible
controlled adder/subtractor (RCAS) block is constructed as shown in Fig. 5-2
using reversible gates. The inputs to RCAS are: data signals: X, Y and Cin, a
control signal A/S and an ancilla bit (cnst.) set to 0. The outputs are: S/D, Cout
and A/Sg which is an A/S control signal propagating to the next block, and two
garbage outputs g1 and g2. The full adder is implemented by cascading two Peres
gates (highlighted), which results in generating the target outputs S/D and Cout,
where S/D = A/S⨁Y⨁X⨁Cin and Cout= X(A/S⨁Y)⨁Cin(A/S⨁Y⨁X), in addition to
garbage outputs A/Sg, g1 and g2, Fig. 5-2. Hence, when A/S is set to 0, then
CNOT (first gate, Fig. 5-2) passes the true copy of Y while A/Sg acts as a
garbage output. When A/S is 1 then Y' is available at the output of CNOT, and
the full adder adds X, Y’ and Cin (which is set to 1 for subtraction). The garbage
output A/Sg of a given RCAS block is reused for a control signal fed to the
consecutive RCAS block. The quantum cost of the RCAS module is 9 (8 for two
Peres plus 1 for CNOT gate).
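The gate cascade described above can be simulated bit by bit. The sketch below assumes one wiring that is consistent with the stated outputs (CNOT from A/S onto Y, a first Peres gate on X, Y⨁A/S and the ancilla, a second Peres gate on the intermediate sum, Cin and the intermediate carry); the exact line order of Fig. 5-2 is not recoverable from the text, so this wiring is an assumption.

```python
from itertools import product


def cnot(c, t):
    """Feynman gate: (c, t) -> (c, c xor t)."""
    return c, t ^ c


def peres(a, b, c):
    """Peres gate: (a, b, c) -> (a, a xor b, ab xor c), quantum cost 4."""
    return a, a ^ b, (a & b) ^ c


def rcas(a_s, x, y, cin, anc=0):
    """One RCAS block; returns (S/D, Cout, A/Sg, g1, g2)."""
    a_sg, y1 = cnot(a_s, y)            # y1 = Y when A/S = 0, Y' when A/S = 1
    g1, p, q = peres(x, y1, anc)       # p = X xor y1, q = X.y1 (ancilla is 0)
    g2, sd, cout = peres(p, cin, q)    # sd = p xor Cin, cout = p.Cin xor q
    return sd, cout, a_sg, g1, g2


if __name__ == "__main__":
    for a_s, cin, x, y in product((0, 1), repeat=4):
        print(0, a_s, cin, x, y, "->", *rcas(a_s, x, y, cin))
```

Printing all 16 rows in the column order cnst., A/S, Cin, X, Y gives a direct comparison with Table 5-1, and the 16 output patterns are pairwise distinct as reversibility requires.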
[Figure 5-2 line labels, input and output per line: A/S and A/Sg; X and g1=X; Y and
g2=A/S⊕Y⊕X; CIN and A/S⊕Y⊕X⊕CIN; 0 and X(A/S⊕Y)⊕CIN(X⊕A/S⊕Y). The two Peres gates
are highlighted.]
Figure 5-2: Reversible Implementation of controlled adder/subtractor.
5.2.2 n-bit Adder/Subtractor
An RCAS block can be cascaded to construct an n-bit adder/subtractor. In the
proposed architecture the A/Sg and Cout outputs of a previous stage propagate to
the input control signal A/S and the Cin of the next RCAS. For a 2’s Complement
subtraction X+Y’+1, an extra value 1 needs to be added at the least significant bit
(LSB) position. Hence, the original LSB block is modified by adding a
CNOT gate between A/S and a constant ‘0’ ancilla bit. An n-bit adder/subtractor
is presented in Fig. 5-3. Note that when A/S is set to 0 for an addition, a value 0
is passed through the CNOT gate to RCAS0, resulting in the addition of X0
and Y0 only. When A/S is set to 1 for a subtraction, a value 1 is added to X0+Y0’
as required by the 2’s Complement notation. The additional CNOT gate increases
the quantum cost by 1.
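The cascade can be sketched behaviourally as a ripple chain in which each block hands A/Sg and Cout to its neighbour. The names `rcas`, `rcas_chain`, `to_bits` and `from_bits` are illustrative, and each block is modelled only by its output equations, not by its gates.

```python
def rcas(a_s, x, y, cin):
    """Behavioural model of one RCAS block: returns (S/D, Cout, A/Sg)."""
    y1 = y ^ a_s
    p = x ^ y1
    return p ^ cin, (x & y1) ^ (cin & p), a_s


def to_bits(value, n):
    """LSB-first list of the n low bits of value."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


def rcas_chain(x, y, a_s, n=4):
    """n-bit reversible adder/subtractor model: returns (S/D word, final carry)."""
    carry = 0 ^ a_s                      # extra CNOT: copy of A/S on a 0 ancilla feeds Cin of RCAS0
    out = []
    for xi, yi in zip(to_bits(x, n), to_bits(y, n)):
        s, carry, a_s = rcas(a_s, xi, yi, carry)   # A/Sg and Cout go to the next block
        out.append(s)
    return from_bits(out), carry


if __name__ == "__main__":
    print(rcas_chain(0b0110, 0b0011, 0))  # 6 + 3
    print(rcas_chain(0b0110, 0b0011, 1))  # 6 - 3
```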
Figure 5-3: n-bit Reversible Controlled Adder/Subtractor circuit.
5.2.3 Reversible Adder/Subtractor with Overflow Detector
If the result of an n-bit addition/subtraction does not fall within the allowed range,
then an arithmetic overflow occurs. When adding unsigned numbers,
the carry-out signal Cout coming from the most significant RCAS block
serves as an overflow indicator; for signed numbers, however, the carry-out at the
sign-bit position is not sufficient. In the case of signed numbers the overflow can
occur when adding two numbers of the same sign with the carry-out signal from
the MSB position being 0, i.e., not indicating the overflow.
In general, when considering 2's Complement addition (subtraction), the overflow
occurs if the carry-in (borrow-in) to the most significant RCAS is different from the
carry-out (borrow-out) generated by that block, Ovf = Cn-1⊕Cn. To implement the
above check we use two additional CNOT gates at the outputs of the two most
significant RCAS modules, Fig. 5-4. The first CNOT gate, placed at the Cout of the
RCASn-2 block, provides a copy of the carry input Cn-1 to RCASn-1. The second
CNOT performs the XOR of Cn and Cn-1, where Cn is the carry output from the nth bit
RCAS. If Ovf is equal to 1, then the result is incorrect, falling outside the
assumed range. An n-bit signed (2’s Complement) reversible
adder/subtractor with overflow detector is given in Fig. 5-5.
Figure 5-4: Modification to include overflow detector.
Example 1: Consider two signed numbers X = 0001 (+1) and Y = 1101 (-3). To
subtract Y from X, the control signal A/S is set to 1, which inverts the 4
bits of Y. A carry-in equal to 1 is added at RCAS0 to create the 2’s
Complement of Y. The subtraction X - Y gives D = 0100 (+4). The carry outputs
C2 and C3 from the pre-last (RCAS2) and the last block (RCAS3) are 0,
indicating no overflow, i.e., Ovf = 0. Another operation, the addition of 0110 (+6)
and 0100 (+4) performed with this 4-bit signed adder, results in an incorrect value S =
1010 (-6). Thus Ovf = 1 (C3 = 0, C2 = 1).
Figure 5-5: n-bit Reversible Controlled Adder/Subtractor with overflow detector.
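The overflow rule and the two cases of Example 1 can be replayed with a small bit-serial model. The function names are illustrative assumptions; `carries[i]` denotes the carry out of bit position i, so the overflow flag is the XOR of the last two entries.

```python
def to_signed(value, n=4):
    """Interpret an n-bit pattern as a 2's complement number."""
    return value - (1 << n) if value >> (n - 1) else value


def add_sub_with_overflow(x, y, a_s, n=4):
    """Ripple add/subtract on n-bit patterns (n >= 2): returns (result pattern, Ovf)."""
    mask = (1 << n) - 1
    y_eff = (y ^ (mask if a_s else 0)) & mask   # invert Y for subtraction
    carry, result, carries = a_s, 0, []         # carry-in of the LSB equals A/S
    for i in range(n):
        xi, yi = (x >> i) & 1, (y_eff >> i) & 1
        result |= (xi ^ yi ^ carry) << i
        carry = (xi & yi) | (carry & (xi ^ yi))
        carries.append(carry)                   # carry out of bit i
    ovf = carries[-1] ^ carries[-2]             # carry out of MSB xor carry into MSB
    return result, ovf


if __name__ == "__main__":
    print(add_sub_with_overflow(0b0001, 0b1101, 1))  # (+1) - (-3)
    print(add_sub_with_overflow(0b0110, 0b0100, 0))  # (+6) + (+4)
```

An exhaustive loop over all 4-bit operand pairs shows that the flag is set exactly when the true signed result leaves the range -8..7, for both addition and subtraction.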
5.2.4 Comparative Analysis of Reversible Adder/Subtractor
Now we present a comparative study of the implementation cost of our RCAS
designs with respect to the circuits in [34] and [35] for 16-bit circuits. As each design
incorporates different reversible gates having different quantum costs, we rely on
the overall quantum cost of the circuit. However, for completeness we also give
the total number of reversible gates and garbage bits in each design.
The n-bit subtractor [34] requires 5n-3 reversible gates and 9n-6 garbage
outputs, with an overall quantum cost of 17n-11. The most cost-effective
subtractor [35] uses 2n-1 TR gates and 2n-1 garbage outputs, and its quantum cost
is 12n-6. Our reversible n-bit adder/subtractor requires 3n+1 gates (Peres and
CNOT) and 2n+1 garbage bits, and has a quantum cost of 9n+1. The results for
16-bit subtractors are summarized in Table 5-2. It can be seen that our design is
better than the method in [34] in all aspects. Compared to [35], our design is
slightly worse in gate count (the two approaches use different
gate libraries) and garbage count, but has a significantly lower quantum cost. Our
design always requires 2 more garbage outputs than the current best
subtractor [35]. One of these two garbage bits is generated from the control input
A/S, which selects between an addition and a subtraction (A/Sg from RCASn-1, Fig. 5-3).
Note that such a feature is not present in [35] since that design performs only a
subtraction.
The percentage improvement in the quantum cost of our method is
((3n-7)/(12n-6))*100. The improvement in quantum cost compared to the design in [35],
as a function of subtractor size, is shown in Fig. 5-6. For example,
for a 32-bit subtractor, the quantum cost improvement is 23.5% vs. a 3.17% increase
in garbage bits.
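The cost formulas quoted above are simple enough to tabulate. The sketch below evaluates them for any n; the dictionary layout and function names are illustrative assumptions, while the formulas themselves are the ones given in the text.

```python
def costs(n):
    """Gate count, garbage outputs and quantum cost of the n-bit designs compared above."""
    return {
        "[34]": {"gates": 5 * n - 3, "garbage": 9 * n - 6, "qc": 17 * n - 11},
        "[35]": {"gates": 2 * n - 1, "garbage": 2 * n - 1, "qc": 12 * n - 6},
        "proposed": {"gates": 3 * n + 1, "garbage": 2 * n + 1, "qc": 9 * n + 1},
    }


def qc_improvement_pct(n):
    """Relative quantum cost reduction of the proposed design versus [35], in percent."""
    c = costs(n)
    return 100 * (c["[35]"]["qc"] - c["proposed"]["qc"]) / c["[35]"]["qc"]


def garbage_increase_pct(n):
    """Relative increase in garbage outputs of the proposed design versus [35], in percent."""
    c = costs(n)
    return 100 * (c["proposed"]["garbage"] - c["[35]"]["garbage"]) / c["[35]"]["garbage"]


if __name__ == "__main__":
    print(costs(16))
    print(round(qc_improvement_pct(32), 1), round(garbage_increase_pct(32), 2))
```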
Table 5-2: Comparison of different 16-bit subtractors
Design            # of gates   # of garbage outputs   Quantum cost
Thapliyal [34]        77              138                 261
Thapliyal [35]        31               31                 186
Proposed              49               33                 145
[Plot: y-axis "Reduction in Quantum Cost" (ticks 0 to 100), x-axis "Subtractor size, n" (ticks 5 to 30).]
Figure 5-6: Improvement in quantum cost with subtractor size with
proposed design compared to [35]
Table 5-3 summarizes the comparison of our 16-bit design with binary
adder/subtractors presented in [36]. These are three designs constructed from
Fredkin, Peres and TR gates, and differ in the number of reversible gates,
garbage outputs, ancilla inputs and quantum costs. As the Add/Sub- Design III
implemented with Peres and CNOT gates clearly outperforms the Design I
(Fredkin and CNOT), and the Design II (TR and CNOT gates) [36], we use only
the Design III for comparison with our designs in Fig. 5-3 and Fig. 5-5
(adder/subtractor with the overflow detector). Our design is better than
adder/subtractor circuits [36] in terms of reversible gates, garbage outputs and
the quantum cost. For example, for an n-bit Design III [36] the quantum cost is
10(n-1)+6 =10n-4 with (n-1) full adder/subtractor each contributing 10 to the
quantum cost, and one LSB half adder/subtractor with quantum cost of 6. In
contrast, our n-bit design has a cost of 9n+1. Hence, the improvement in the
quantum cost of our design is (10n-4)-(9n+1) = n-5 for n > 5, i.e., 11 for the 16-bit
designs in Table 5-3. The gain grows with increasing size of
the adder/subtractor.
Table 5-3: Comparison of different 16-bit Adder/Subtractor
Designs             Reversible gates   Garbage outputs   Constant inputs   Quantum cost
Add/Sub- I [36]          124                 78                47              327
Add/Sub- II [36]          63                 47                16              218
Add/Sub- III [36]         63                 47                16              156
Proposed Design           49                 33                17              145
With overflow             51                 33                18              147
5.3 Reversible Binary Comparator
A binary comparator is an important part of computing systems, especially of the
central processing unit. It is also frequently used in microcontrollers,
communication systems, encryption devices, sorting networks, etc. In a
reversible implementation, we need a comparator module that can be easily
integrated with other arithmetic elements at minimal cost. In this work, we
propose a new design of a reversible comparator based on our reversible
adder/subtractor (RCAS) module, in contrast to the existing reversible
comparators constructed from truth-table-based logic equations. Our design is
capable of comparing two signed numbers.
5.3.1 Comparator Basic
The basic irreversible single-bit binary comparator compares two bits x and y
based on the logic expressions Fx>y = xy', Fx<y = x'y and Fx=y = (x⨁y)'. For a 2-bit
comparator the functional logic equations are as follows [86]:
Fx>y = x1y1' + a1x0y0',
Fx<y = x1'y1 + a1x0'y0,
Fx=y = a1a0,
where a1 = (x1⨁y1)', a0 = (x0⨁y0)'.
(1)
The above idea can be extended to compare two n-bit unsigned binary numbers
x and y. The comparison starts from the most significant bits of x and y and
checks each bit position for equality, ai = (xi⨁yi)', where i = 0, 1, 2, ..., n-1.
When ai is 1 for every bit position, the two numbers are equal:
Fx=y = an-1 an-2 ... a1 a0
(2)
In case of inequality we check for the greater-than or less-than relation. Starting from
the MSB, let k be the first bit position with ak = 0. If xk=1 and yk=0 then x>y. Otherwise, if
xk=0 and yk=1, then x<y. For example, the equations to compare two 4-bit
numbers x and y are as shown below:
Fx>y = x3y3' + a3x2y2' + a3a2x1y1' + a3a2a1x0y0',
Fx<y = x3'y3 + a3x2'y2 + a3a2x1'y1 + a3a2a1x0'y0,
Fx=y = a3a2a1a0
(3)
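The MSB-first comparison procedure can be written as a short loop that accumulates the product of the equality terms ai seen so far. This is a sketch of the standard unsigned magnitude comparator under the definitions above; the function names are illustrative assumptions.

```python
def msb_first(value, n):
    """Bits of an n-bit unsigned number, most significant bit first."""
    return [(value >> i) & 1 for i in range(n - 1, -1, -1)]


def compare_unsigned(x_bits, y_bits):
    """MSB-first bit lists -> (F_x>y, F_x=y, F_x<y)."""
    gt = lt = 0
    eq_prefix = 1                           # product of a_i over all higher positions
    for x, y in zip(x_bits, y_bits):
        gt |= eq_prefix & x & (1 - y)       # first differing position has x_k = 1, y_k = 0
        lt |= eq_prefix & (1 - x) & y       # first differing position has x_k = 0, y_k = 1
        eq_prefix &= 1 - (x ^ y)            # a_k = (x_k xor y_k)'
    return gt, eq_prefix, lt


if __name__ == "__main__":
    print(compare_unsigned(msb_first(9, 4), msb_first(12, 4)))
```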
5.3.2 Proposed Comparator Design
In this thesis, we implement the reversible comparison of two signed numbers
using the subtraction operation. We utilize our reversible controlled
adder/subtractor (RCAS) circuit with overflow detector presented in section 5.2.
From the magnitude and sign of the subtraction result, the two numbers can be
compared [86]. The schematic of the basic idea for comparing two signed
numbers is shown in Fig. 5-7. The controlled adder/subtractor with overflow
detector is set to perform the subtraction operation (A/S=1), and for
comparison purposes we extract three signals from the subtractor circuit: the overflow
Ovf; Neg, which is a copy of the most significant bit of the difference Sn/Dn; and Z,
which indicates when the difference between the two numbers is zero. Z can be
obtained by performing a NOR operation on all the bits of the subtraction result.
Ovf is set to 1 if an arithmetic overflow occurs, otherwise Ovf=0. Signal Z is equal to
1 if the difference is 0, otherwise Z is 0. Finally, the sign bit Neg=1
indicates a negative result, while Neg is 0 when the subtraction result is positive. By
utilizing these three signals we can determine the comparator outputs for two
numbers X and Y, i.e., X<Y, X=Y and X>Y. To detect the equality of
X and Y, we monitor Z. If all bits of the subtraction result are ‘0’ then the
numbers are equal and Z=1. Further, based on the values of the two signals Ovf
and Neg we can detect the X<Y relation. Suppose X<Y. If X and Y have the same sign,
there is no overflow, and hence Ovf=0; the difference X – Y is negative,
i.e., Neg=1. When X is negative and Y is positive, X – Y is negative
(Neg=1) if there is no overflow (Ovf=0). However, the result is positive
(Neg=0) if there is an overflow (Ovf=1). Thus, X<Y is detected if Ovf⨁Neg=1. In
a comparison, any relation can be detected if the other two relations are false.
Hence, we can determine the greater-than relation using the equality and less-than
information, i.e. Fx>y = (Fx<y + Fx=y)'. Thus, X>Y will be detected if
(Z + (Ovf⨁Neg))' = 1.
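The flag-based decision rule can be verified exhaustively for small word lengths. The sketch below models the subtractor at word level and derives Ovf, Neg and Z from it; the function names and the 4-bit default are illustrative assumptions.

```python
def to_signed(value, n=4):
    """Interpret an n-bit pattern as a 2's complement number."""
    return value - (1 << n) if value >> (n - 1) else value


def compare_signed(x, y, n=4):
    """x, y are n-bit 2's complement patterns; returns (F_x<y, F_x=y, F_x>y)."""
    mask = (1 << n) - 1
    low_mask = mask >> 1
    y_inv = ~y & mask                                   # A/S = 1 inverts Y
    c_in_msb = ((x & low_mask) + (y_inv & low_mask) + 1) >> (n - 1)
    total = x + y_inv + 1                               # X + Y' + 1
    c_out_msb = total >> n
    diff = total & mask
    ovf = c_in_msb ^ c_out_msb
    neg = diff >> (n - 1)                               # copy of the sign bit
    z = int(diff == 0)                                  # NOR of all difference bits
    lt = ovf ^ neg
    gt = 1 - (lt | z)                                   # NOR of the other two relations
    return lt, z, gt


if __name__ == "__main__":
    print(compare_signed(0b1000, 0b0001))  # -8 versus +1
    print(compare_signed(0b0111, 0b1000))  # +7 versus -8
```

Both printed cases involve an overflow in the subtraction, which is exactly where the sign bit alone would give the wrong answer.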
The 4-bit reversible comparator for two signed numbers is shown in Fig. 5-8. We
use the 4-bit reversible controlled adder/subtractor with overflow detector. To
generate the signal Neg, which is a copy of the sign bit (the most significant bit S3/D3),
we add the CNOTsign gate. Monitoring the equality of two numbers in our reversible
comparator implementation requires extra circuitry, which increases linearly with
the size of the comparator. In subtraction, if all bits of the difference are ‘0’, then the
two numbers are equal. So, if we perform a NOR operation on these difference
output bits (i.e. Z = (D3 + D2 + D1 + D0)'), the output will be 1 only when X=Y.
[Figure 5-7 block labels: n-bit controlled adder/subtractor with overflow detector; inputs X, Y, AS=1, Cin=1;
outputs S1/D1, S2/D2, ..., Sn/Dn, Ovf, Neg, Z.]
Figure 5-7: Use of a subtractor as a comparator.
To determine the equality of X and Y we use 2-input NOR gates for each pair of
difference bits, whose outputs are then AND-ed. We avoid NOR gates with a
higher number of inputs since they increase the number of controls in the reversible
gate and hence the quantum cost. The reversible implementation of a
2-input NOR gate (encircled dotted) has the input-output mapping (X, Y, ‘1’ -> X,
Y, 1⨁XY⨁X⨁Y) and uses one Toffoli and two CNOT gates. Finally we use a Toffoli
gate to perform the AND of all outputs from the NOR structure. Therefore, for a 4-bit
comparator we need a 2-controlled Toffoli gate (Fx=y = (D0+D1)'(D2+D3)'), where
Fx=y indicates the target output. The number of controls of the Toffoli gate
increases with the size of the comparator. For example, an 8-bit comparator
requires a 4-controlled Toffoli gate. An alternative solution for this part of the
comparator is to use inverters to invert all the bits of the difference output and then
AND them (Fx=y = D0'D1'D2'D3'). For a large comparator, a Toffoli gate
with many control bits increases the quantum cost significantly. Hence, we use
4-controlled Toffoli gates at the first level, and 2-controlled Toffoli gates at the next levels.
Table 5-4 compares these two implementations (NOR-AND
and INV-AND) of the equality detection in terms of quantum cost,
delay and garbage outputs. The advantage of the INV-AND design is a
reduction in garbage outputs. However, the NOR-AND design offers a smaller
quantum cost and delay.
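The NOR-AND equality detector can be sketched at gate level. The code below builds the reversible NOR from one Toffoli and two CNOT gates, as stated above, and models the final multi-controlled Toffoli as an AND onto a 0 target; the function names and the assumption of an even number of difference bits are illustrative.

```python
from itertools import product


def toffoli(a, b, t):
    """(a, b, t) -> (a, b, t xor ab)."""
    return a, b, t ^ (a & b)


def cnot(c, t):
    """(c, t) -> (c, c xor t)."""
    return c, t ^ c


def rev_nor(x, y, anc=1):
    """Reversible 2-input NOR: (X, Y, 1) -> (X, Y, 1 xor XY xor X xor Y)."""
    x, y, t = toffoli(x, y, anc)
    _, t = cnot(x, t)
    _, t = cnot(y, t)
    return x, y, t


def equal_nor_and(diff_bits):
    """F_x=y from an even-length list of difference bits (NOR per pair, then AND)."""
    nors = [rev_nor(diff_bits[i], diff_bits[i + 1])[2] for i in range(0, len(diff_bits), 2)]
    target = 0                              # ancilla target of the multi-controlled Toffoli
    all_ones = 1
    for bit in nors:
        all_ones &= bit
    return target ^ all_ones


if __name__ == "__main__":
    print(equal_nor_and([0, 0, 0, 0]), equal_nor_and([0, 1, 0, 0]))
```

For 4 difference bits there are two NOR outputs, so the final gate has two controls; for 8 bits it has four, matching the counts given in the text.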
To determine X<Y, we add another CNOTless to the original design determining
X=Y condition, as well as a copy of overflow Ovf. The output of the CNOTless is
denoted as Fx<y. Finally, to detect X>Y, we add a reversible NOR gate, which
implies that if both X<Y and X=Y are false then X must be greater than Y. The
three outputs of NOR gate are the reversible comparator outputs: Fx<y, Fx=y and
Fx>y.
Table 5-4: Comparators NOR-AND vs. INV-AND
              NOR-AND                  INV-AND
Size       QC    Delay   GOs       QC     Delay   GOs
4-bit      66      55     17        76      70     15
8-bit     126      96     33       147     222     29
16-bit    246     173     65       289     205     57
32-bit    486     322    129       573     357    113
64-bit    966     615    257      1141     685    225
The design methodology is extendable to reversible comparator of any size.
Only the number of NOR gates will vary depending on the number of bits and so
the controls of Toffoli gate for detecting equality.
Figure 5-8: 4-bit reversible comparator.
5.3.3 Comparative Analysis of Reversible Comparators
There are very few references to reversible binary comparator implementations
for unsigned numbers, most based on truth table representations of single bit or
2-bit comparison [53, 79, 80]. The early proposed reversible/quantum comparator
[79] is an iterative network to compare two n-digit binary numbers utilizing a
serial chain of n basic 1-bit comparator cells and a comparator output circuit.
Another design of reversible binary comparator proposed in [80] is based on
binary tree structure where each node consists of a 2-bit reversible comparator.
The 2-bit reversible binary comparator has a quantum cost of 27, 6 garbage
outputs and a delay of 18Δ (where Δ is assumed to be the delay of each elementary
quantum gate operation). In a recent work [53], the authors presented the
construction of a one-bit comparator using various standard reversible gates such
as Feynman, Peres, Toffoli and Fredkin gates as well as some special 3x3 gates
(with inputs A, B, C and outputs P, Q, R) such as the R gate (P=A⨁B, Q=A, and
R=C'⨁AB), the URG gate (P=(A+B)⨁C, Q=B, and R=C⨁AB), the TR gate (P=A,
Q=A⨁B, and R=AB'⨁C) and the BJN gate (P=A, Q=B and R=(A+B)⨁C). Among the
above, the best reversible 1-bit comparator, implemented with Peres and BJN
gates, has a quantum cost of 10 and 2 garbage outputs. However, the extension to
two n-bit numbers is not reported.
In our design we use signed numbers, which are compared in the indirect way
instead of in a bitwise manner. For example, the reversible/quantum comparator
in [79] has two parts: a chain of 1-bit comparator cells (quantum cost of 39, 8
garbage outputs and delay of 24 Δ) and an output circuit (quantum cost 9 and
delay 7Δ). Hence, for a 4-bit reversible comparator, the quantum cost is 165. The
number of garbage outputs is 32 and the delay is 103Δ. The comparator tree
structure proposed in [80] requires 2-bit reversible binary comparator modules as
well as a comparator output circuit. According to the proposed design a 2-bit
comparator consists of two R-B comp modules, two TR gates and one CNOT
gate. Its total quantum cost is 27, it needs 6 garbage outputs, and its delay is
18Δ. The additional output circuit is the same as in [79]. Hence, for a 4-bit
design, this tree-based reversible comparator has a quantum cost of 90, 18 garbage
outputs and a delay of 43Δ. In our design, a 4-bit reversible comparator has the
quantum cost QC = QC of 4-bit RCAS with overflow detector + CNOTsign +
CNOTless + 2 NORz + 2-controlled Toffoli + NORgt = 38+1+1+2*7+5+7 = 66. The
number of garbage outputs is 17, while the delay is 55Δ.
Figure 5-9: Number of garbage outputs of different reversible comparators.
Fig. 5-9, Fig. 5-10 and Fig. 5-11 present a comparison of a proposed comparator
with existing serial based [79] and tree based [80] designs. As can be seen, our
design is far better than serial reversible comparator [79] in all aspects.
Compared to tree-based design [80], our design has a smaller quantum cost and
fewer garbage outputs. However, our design is slower than that in [80] since our
reversible controlled adder/subtractor is a ripple carry structure.
Figure 5-10: Delay for different reversible comparators.
Figure 5-11: Quantum cost for different reversible comparators.
5.4 Reversible Arithmetic Logic Unit
An ALU is an integral module of multiple one- and two-input arithmetic and logic
functions. Instead of constructing several single-function circuits this integrated
module offers programmability with less gate cost. However, the incorporation of
several functions into a single unit requires additional control lines and circuit
resources. In this thesis, we implement a reversible ALU having operations of a
conventional irreversible ALU. For the ALU design we need to concentrate on
including as many arithmetic and logic operations as possible in a simple design
with maximum efficiency and minimum possible cost. Hence, the reversible ALU
presented here includes most operations available in conventional irreversible
ALU. The stepwise development of our RALU is described next.
5.4.1 The Logical Operations
In a classical design the most common logical functions included in basic ALU
are AND and OR. In our RALU we intend to realize these two functions first. The
reversible equivalent of these two functions are AND: ( A, B, ‘0’ →A, B, 0⨁AB)
implemented with single Toffoli gate and OR: ( A, B, ‘0’ → A, B, 0⨁A⨁B⨁AB)
implemented with one Toffoli gate and two CNOT gates. However, to ensure
reversibility the AND and OR embeddings require some extra signals as garbage
outputs. Moreover, we add another important logical operation, i.e., the bitwise
exclusive-OR (XOR), which is elementary in reversible logic. Extending ALU by
this operation is obvious, and increases the flexibility and applicability of RALU.
Moreover, suitable control signal facilitates performing NAND, NOR and XNOR
operations too. For example, we implement 0⨁A⨁B⨁Cnt. When control Cnt is
false then it performs bitwise XOR and when Cnt is true then we get XNOR
operation.
5.4.2 The Arithmetic Operations
The two basic arithmetic operations included in any ALU are the addition and the
subtraction of two binary numbers. The main problem in designing a reversible
adder or subtractor is that the function is not bijective, and hence we
need to find a proper reversible embedding with the aid of extra signals. In our
RALU we implement an adder and a subtractor in a single module with a control
signal. The structure of our controlled adder/subtractor (RCAS) module
presented earlier executes addition or subtraction based on a control signal.
In our RALU the arithmetic and logical operations are first performed in parallel,
and then the desired result is selected by a multiplexer (MUX). Thus, we have
two steps in our approach: function generation and function selection, Fig. 5-12.
We explain the implementation of each step next.
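[Figure 5-12 block labels: inputs X, Y and control AS enter the Function Generator; its outputs
AND (XY/XY'), OR (XY/XY'), ADD/Subtract and XOR/XNOR enter a 4:1 MUX with controls Ctl1 and Ctl2,
which delivers Result.]
Figure 5-12: Reversible ALU two steps block diagram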
5.4.3 Function Generator
The first module of our reversible ALU generates four arithmetic-logic functions in
parallel and transmits the input X unchanged. The other operand Y is
controlled by a control signal AS, which selects the inverted operation or a
subtraction. The implementation of this generator block is presented in Fig. 5-13.
The control signal Cpn defines the functionality of AND and OR: when Cpn=0, AND and OR
are performed; otherwise NAND or NOR functions are performed. Next, if the control signal AS is
‘0’ then the device performs the following operations simultaneously:
AND/NAND, OR/NOR, addition and XOR. On the other hand, if the control signal
AS is ‘1’ then results for AND/NAND with a single input inverted, OR/NOR with a
single input inverted, subtraction and XNOR are generated, Table 5-5.
[Figure 5-13 line labels: inputs AS, X, Y, Cin, 0, Cpn, 0; outputs ASout, g1=X,
OR=0⊕X⊕(Y⊕AS)⊕X(Y⊕AS)⊕Cpn, Sum/Diff=X⊕(Y⊕AS)⊕Cin, Cout, AND=(Y⊕AS)X⊕Cpn,
XOR/XNOR=X⊕Y⊕AS.]
Figure 5-13: Reversible ALU function generator.
In our design, we use the RCAS in Fig. 5-2. However, in order to implement the
extended functions AND, OR and XOR with fewer gates, we introduce
two CNOT gates between the two Peres gates (AND and XOR) and one extra
CNOT gate after the 2nd Peres gate, Fig. 5-13. The quantum cost of this function
generator is 12.
5.4.4 Function Selector
To select a desired function output we implement a 4:1 reversible multiplexer
shown in Fig. 5-14. The original irreversible multiplexer operation is translated
into a reversible function using Positive Davio expansion, which confirms
minimum number of lines. The number of reversible gates for this MUX is 6 and
the number of lines is also 6. The quantum cost is 18.
Result= (F1⨁Ctl1 (F1⨁F2)) ⨁Ctl2 (F1⨁Ctl1 (F1⨁F2) ⨁F3⨁Ctl1 (F3⨁F4)).
[Figure 5-14 line labels, input and output per line: F1 and Result; F2 and g1; F3 and g2; F4 and g3;
Ctl1 and g4=Ctl1g; Ctl2 and g5=Ctl2g.]
Figure 5-14: Reversible ALU function selector (4:1 MUX).
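The XOR-based Result expression of the 4:1 selector can be checked against an ordinary multiplexer for all 64 input combinations. The sketch below only evaluates the Boolean expression; it does not reproduce the six-gate circuit of Fig. 5-14, and the function names are illustrative assumptions.

```python
from itertools import product


def mux_davio(f1, f2, f3, f4, ctl1, ctl2):
    """Result = (F1 xor Ctl1(F1 xor F2)) xor Ctl2(F1 xor Ctl1(F1 xor F2) xor F3 xor Ctl1(F3 xor F4))."""
    lower = f1 ^ (ctl1 & (f1 ^ f2))      # F1 when Ctl1 = 0, F2 when Ctl1 = 1
    upper = f3 ^ (ctl1 & (f3 ^ f4))      # F3 when Ctl1 = 0, F4 when Ctl1 = 1
    return lower ^ (ctl2 & (lower ^ upper))


def mux_reference(f1, f2, f3, f4, ctl1, ctl2):
    """Plain 4:1 multiplexer with Ctl2 as the high select bit."""
    return (f1, f2, f3, f4)[2 * ctl2 + ctl1]


if __name__ == "__main__":
    ok = all(mux_davio(*v) == mux_reference(*v) for v in product((0, 1), repeat=6))
    print("expression matches 4:1 MUX:", ok)
```

Because the expression uses only AND and XOR, each term maps naturally onto Toffoli and CNOT gates.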
Another implementation of the function selector is obtained by using Fredkin
gates. A Fredkin gate itself is a reversible equivalent of a 2:1 MUX. Hence, instead
of using the 4:1 MUX proposed in the previous paragraph we can use three 2:1
MUXs (Fredkin gates). The advantage of the Fredkin-gate multiplexer lies
in a smaller quantum cost (15) as well as a smaller logic depth (the quantum cost of a
Fredkin gate is assumed to be 5 according to [54]). Further, a Fredkin gate generates
the selected function and also keeps the other functions available at its outputs. For
example in Fig. 5-15, if we set the controls Ctl1=’0’ and Ctl2=’1’ then the Fred1 gate
selects F1 in line 2 while F2 is also available at line 3. Similarly Fred2 selects F3 in
line 4 and F4 in line 5. The final Fred3 gate selects F3 as the resulting output while
F1 is also available as garbage output Gr. Thus, though it selects only one
function as a result in the target line, the rest of the functions can be obtained from
its garbage outputs, converting garbage lines into functional ones. We present
our RALU implemented using both function selectors (Fig. 5-14 and Fig. 5-15) in the
following sub-section 5.4.5.
[Figure 5-15 block labels: Fred1 (control Ctl1, data F1 and F2), Fred2 (control Ctl1, data F3 and F4),
Fred3 (control Ctl2, data F1/F2 and F3/F4); outputs Result, Gr, Ctl1g.]
Figure 5-15: Reversible ALU function selector using Fredkin gates.
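The three-gate Fredkin selector can be modelled as controlled swaps. The sketch assumes that Fred1 and Fred2 share the control Ctl1 and that Fred3 is controlled by Ctl2, as the example in the text suggests; the function names and the grouping of the leftover outputs are illustrative assumptions.

```python
def fredkin(c, a, b):
    """Fredkin gate: control passes through, data lines are swapped when c = 1."""
    return (c, b, a) if c else (c, a, b)


def fredkin_selector(f1, f2, f3, f4, ctl1, ctl2):
    """Returns (Result, (Gr, leftover of Fred1, leftover of Fred2))."""
    ctl1, sel12, rest12 = fredkin(ctl1, f1, f2)      # Fred1: F1 or F2
    ctl1, sel34, rest34 = fredkin(ctl1, f3, f4)      # Fred2: F3 or F4
    ctl2, result, gr = fredkin(ctl2, sel12, sel34)   # Fred3: final choice
    return result, (gr, rest12, rest34)


if __name__ == "__main__":
    # Labels instead of bits make it visible where each function ends up.
    print(fredkin_selector("F1", "F2", "F3", "F4", 0, 1))
```

Since a Fredkin gate only permutes its data lines, every function value is still present on some output line, which is the property the text exploits to turn garbage lines into functional ones.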
5.4.5 One-bit Reversible Arithmetic Logic Unit (RALU)
Now we create two designs of a reversible 1-bit ALU, one incorporating a 4:1
MUX and another using a Fredkin selector. The first design of a 1-bit RALU is
shown in Fig. 5-16. The circuit comprising the 4:1 multiplexer requires two Peres
gates, 3 Toffoli gates and 7 CNOT gates (12 reversible gates in total). The number
of lines is 9 and the overall quantum cost is 30. On the other hand, design 2 with
the Fredkin multiplexer, Fig. 5-17, requires two Peres gates, 4 CNOT gates and
three Fredkin gates (9 gates in total) with an overall quantum cost of 27. In
Fig. 5-17(b) we can see the availability of all outputs as the target function output (XOR)
and garbage outputs (AND, SUM and OR) for control inputs Ctl1g=1 and Ctl2g=1.
[Circuit diagram; signal labels: inputs AS, X, Y, Cin, Cpn, Ctl1, Ctl2 and constant 0 lines; outputs ASout, GX, G1, G2out, Cout, Result, G3, g4 = Ctl1g, g5 = Ctl2g.]
Figure 5-16: Reversible ALU design I (using 4:1 MUX)
Note, that in the overall n-bit RALU module, all the blocks are placed in a way to
comply with reversibility properties, while preserving the correctness of the
execution of the arithmetic and logical operations. The 1-bit RALU operation with
control inputs is presented in Table 5-5.
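Table 5-5: Reversible ALU operations with control inputs
Cpn | AS | Ctl2 | Ctl1 | Operation
--- | --- | --- | --- | ---
0 | 0 | 0 | 0 | AND
0 | 0 | 0 | 1 | OR
0 | 0 | 1 | 0 | ADD
0 | 0 | 1 | 1 | XOR
1 | 0 | 0 | 0 | NAND
1 | 0 | 0 | 1 | NOR
0 | 1 | 0 | 0 | XY’
0 | 1 | 0 | 1 | Y->X
1 | 1 | 0 | 0 | X->Y
0 | 1 | 1 | 0 | SUB
0 | 1 | 1 | 1 | XNOR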
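The operations of Table 5-5 can be summarized by a small behavioral model. The Python sketch below is an assumption inferred from the table only, not the gate-level circuit: AS is taken to invert Y, Ctl2 and Ctl1 to select among AND, OR, SUM and XOR, and Cpn to complement the selected result. The function name and argument order are hypothetical.

```python
def ralu_1bit(x, y, cin, cpn, as_, ctl2, ctl1):
    """Behavioral model of one RALU bit slice (inferred from Table 5-5)."""
    y_eff = y ^ as_                       # AS = 1 feeds the inverted Y
    and_ = x & y_eff
    or_ = x | y_eff
    xor_ = x ^ y_eff
    sum_ = xor_ ^ cin                     # sum/difference bit
    cout = and_ | (cin & xor_)            # carry out of the slice
    selected = (and_, or_, sum_, xor_)[2 * ctl2 + ctl1]
    return selected ^ cpn, cout           # Cpn = 1 complements the result

# Truth table of the row Cpn = 1, AS = 1, Ctl2 = 0, Ctl1 = 0 (X -> Y).
for x in (0, 1):
    for y in (0, 1):
        print(x, y, ralu_1bit(x, y, 0, 1, 1, 0, 0)[0])
```

Under these assumptions the row with Cpn = 1 and AS = 1 evaluates to (XY’)’ = X’ + Y, i.e., the implication X->Y listed in the table.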
5.4.6 4-bit RALU
In the proposed circuit, we cascade the 1-bit RALU modules and use the copies of
all control signals (AS, Ctl2, Ctl1) available at the module outputs for the next
stage of the RALU operation. Thus the RALU generates all the
fan-out signals required for controlling the selection of functions at minimal
implementation cost.
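[Circuit diagram; (a) design with gates Fred1, Fred2 and Fred3, inputs AS, X, Y, Cin, Cpn, Ctl1, Ctl2 and constant 0 lines, outputs ASout, GX, Cout, Gro, Gsx, Gr, Result, Ctl1g, Ctl2g; (b) output values for Ctl1g = 1 and Ctl2g = 1, with XOR on Result and AND, OR and SUM on the garbage outputs.]
Figure 5-17: Reversible ALU design II (using Fredkin selector)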
A 4-bit reversible arithmetic logic unit with inputs X3X2X1X0 and Y3Y2Y1Y0 is
presented in Fig. 5-18(b). The basic block, i.e., the RALU module, is shown in
Fig. 5-18(a). The control signal AS transmits the true or inverted copy of the input
signal Y and also selects between the addition and subtraction operations. The other input
X is transmitted unchanged to the output as garbage Gx. All function outputs
are available at the outputs of each RALU module as Gao, Gsx and Gr. The output
Result presents the desired function selected by the control signals. Since in
addition or subtraction the carry-out Cout of one stage is
propagated to the next stage, our cascade follows the same scheme. Thus
four outputs are reused to provide the next-stage signals, which minimizes the
garbage outputs. Hence, a 4-bit reversible implementation of the ALU requires 24
lines. Note that for the subtraction operation the carry input Cin should be set to ‘1’.
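The cascading scheme can be illustrated with a behavioral Python sketch of an n-bit ripple structure. It is a functional model under stated assumptions, not the reversible netlist: the control signals are shared by all slices, the carry ripples from bit 0 upward, and the initial carry is set equal to AS so that subtraction starts with Cin = 1. The function name and signature are hypothetical.

```python
def ralu_nbit(x, y, n, as_, ctl2, ctl1, cpn=0):
    """Behavioral n-bit cascade of 1-bit ALU slices (illustrative model)."""
    carry = as_                   # assumed: Cin = 1 for subtraction, else 0
    result = 0
    for i in range(n):
        xi = (x >> i) & 1
        yi = ((y >> i) & 1) ^ as_ # AS passes Y true or inverted
        p = xi ^ yi
        outputs = (xi & yi, xi | yi, p ^ carry, p)  # AND, OR, SUM, XOR
        result |= (outputs[2 * ctl2 + ctl1] ^ cpn) << i
        carry = (xi & yi) | (carry & p)             # ripple to next slice
    return result, carry

# OR of X = 1010 and Y = 0100 with controls AS Ctl2 Ctl1 = 001.
print(format(ralu_nbit(0b1010, 0b0100, 4, 0, 0, 1)[0], "04b"))
# 5 - 3 with controls AS Ctl2 Ctl1 = 110.
print(ralu_nbit(5, 3, 4, 1, 1, 0))
```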
[Block diagram; (a) 1-bit RALU block with inputs X, Y, Cin, AS, Ctl1, Ctl2 and three constant 0 lines, and outputs GX, Result, Cout, ASout, GCtl2, GCtl1, Gao, Gsx, Gr; (b) cascade of RALU0 to RALU3 with inputs X0..X3, Y0..Y3, Cin, AS, Ctl1, Ctl2, and outputs Result0..Result3, GX0..GX3, Gao0..Gao3, Gsx0..Gsx3, Gr0..Gr3, Cout, ASout, GC1, GC0.]
Figure 5-18: Reversible ALU circuit (a) 1-bit block diagram and (b) 4-bit reversible implementation
5.4.7 Analysis of circuit parameters for n-bit RALU
By cascading RALU blocks in the same way as in Fig. 5-18, an arithmetic logic
unit of any size can be realized easily. The n-bit ALU with five basic
arithmetic-logical operations requires only elementary reversible logic gates (Peres,
Toffoli and Feynman gates). Each RALU module with a 4:1 MUX requires 12 gates
with a quantum cost of 30. Hence, for the n-bit realization the number of gates is 12n
and the quantum cost is 30n. With the Fredkin multiplexer, on the other hand, the
number of reversible gates for an n-bit design is 9n and the quantum cost is 27n.
The number of garbage bits is also linear in the size of the inputs (5n+4). Note that
adding the control and combining multiple functions does not
necessarily require a large number of gates and ancilla lines.
5.4.8 Comparison to previous work
In the literature, we find different realizations of logic or arithmetic units as
benchmark circuits [82] implementing different operations. Table 5-6 compares
various reversible circuit parameters of the proposed design with existing realizations
for a 1-bit RALU. Table 5-7 summarizes existing 32-bit reversible ALU
or LU realizations together with our approach. Note that the operations performed by each
method are not the same; hence a strict comparison is not possible. We did not
include multiplication or division, since they increase the complexity of the design;
for the same reason these operations are usually not
integrated in the ALU in CMOS designs. For data path or DSP circuits we can rely on existing
reversible multiplication circuits. Our implementation is very close to the classical
ALU at an acceptable cost. For a 32-bit realization, our design is better than the Logic
Unit, which does not even include arithmetic operations.
Table 5-6: Cost comparison of 1-bit ALU
Circuit | Operation | Lines | Gates | QC
--- | --- | --- | --- | ---
Logic Unit [21] | AND, OR, NAND, NOR, XOR, 1, 0, XNOR | 5 | 18 | 114
Mini ALU using BDD [82] | AND, OR, ADD, no-op. | 10 | 20 | 60
Proposed Design | AND, OR, XOR, NAND, NOR, XNOR, AND/OR with single input inverted, ADD, SUB | 9 | 9 | 27
The V-shape design [83] is more economical and efficient for programmable
reversible computing. From Table 5-7 we can see that this design differs from our
proposed RALU in several operations. For example, the method in [83] calculates
modular addition and subtraction, so no carry output is considered, while our
design calculates the complete result with a sum and a carry output to indicate
the arithmetic overflow condition. Moreover, our original RALU includes more logic
operations, such as AND/NAND and OR/NOR, which require extra circuitry that
is absent in [83]. To present a meaningful comparison, we modify our
reversible ALU to include the operations performed in [83] and discard the extra
logic functions from our design. In Fig. 5-19, we show the new design, which consists
of our original reversible controlled adder/subtractor (RCAS: quantum cost 9, one
CNOT and two Peres gates), one Toffoli gate, one CNOT gate and one Fredkin gate. The
functions for the different control signals are presented in Table 5-8. The overall
quantum cost of the design is 20. Hence, the quantum cost of the 32-bit design is 640,
which is less than that of the V-shaped design (QC = 694).
[Circuit diagram; signal labels: controls Cres, Csns, Cnop, A/S, Ccarry; inputs X, Y and constant 0; two Peres gates; outputs A/Sg, F, S/D, X, Result, g, Cout.]
Figure 5-19: Proposed Reversible ALU comparable to V-shaped design
Table 5-7: Different 32-bit Reversible ALU realizations
Circuit | Operations | Const. | Gates used | Lines | Gates | QC
--- | --- | --- | --- | --- | --- | ---
Logic Unit [82] | AND, OR, XOR | yes | CNOT, Generalized Toffoli | 299 | 571 | 1223
Logic Unit [82] | AND, OR, XOR | no | CNOT, Generalized Toffoli | 203 | 385 | 6562
ALU SyReC [82] | ADD, SUB, MULT, DIV | yes | CNOT, Generalized Toffoli | 331 | 15950 | 1336477
ALU SyReC [82] | ADD, SUB, MULT, DIV | no | CNOT, Generalized Toffoli | 235 | 15764 | 1851487
Simple ALU SyReC [82] | ADD, SUB, MULT, XOR | yes | CNOT, Generalized Toffoli | 331 | 4413 | 27009
Simple ALU SyReC [82] | ADD, SUB, MULT, XOR | no | CNOT, Generalized Toffoli | 235 | 4227 | 152852
V-Shape [83] | Modular arithmetic (ADD, SUB, NSUB), XOR, no-op | no | CNOT, Toffoli and Fredkin | 69 | 190 | 694
ALU [84] | ADD, SUB, OR, NOR, AND/NAND (or XOR/XNOR) | yes | CNOT, Fredkin, HNG, MRG/PAOG | 196 | 254 | 830
Proposed design | AND, NAND, OR, NOR, ADD, SUB, XOR, XNOR, implication | yes | CNOT, Peres, Fredkin, Toffoli | 164 | 288 | 864
Recently, two designs of a reversible ALU were presented in [84], based on two
newly proposed gates, MRG and Peres-AND-OR (PAOG). These designs
perform arithmetic and logical functions similar to ours. The reversible
ALU with MRG and HNG gates performs the OR, NOR, XOR, XNOR, ADD and SUB
operations, and the reversible ALU with PAOG and HNG gates performs the AND,
NAND, OR, NOR, ADD and SUB operations. Note that the first design excludes
the AND/NAND operations while the second design excludes the XOR/XNOR operations.
Our RALU, however, integrates all the functions of these two designs. The total
cost of an n-bit ALU in [84] is 26n-2. To make a fair comparison, we drop the
excess functions (AND/NAND or XOR/XNOR), and then find the total
quantum cost to be 21n for the design with the Fredkin selector and 23n for the design
with the multiplexer. For a 32-bit reversible ALU our best design has a quantum
cost of 672, while the design in [84] has a quantum cost of 830. Moreover, for a 1-bit
ALU the design in [84] requires 10 lines whereas our ALU needs 9 lines
(8 lines if we drop one function for a proper comparison). Thus we save 2 lines
per bit of the ALU, and the proposed ALU is more economical than the other methods.
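Table 5-8: RALU (Fig. 5-19) Operations with control inputs (X is unchanged)
Cres | Csns | Cnop | AS | Ccarry | ALU Operation | Name
--- | --- | --- | --- | --- | --- | ---
1 | 0 | 0 | 0 | 0 | Y +n X | ADD
1 | 1 | 0 | 1 | 0 | Y -n X | SUB
1 | 0 | 0 | 1 | 1 | |
0 | 0 | 0 | 0 | 0 | Y ⨁ X | XOR
0 | 0 | 1 | 0 | 0 | Y | NOP
0 | 0 | 1 | 1 | 0 | |
0 | 0 | 0 | 1 | 0 | ⨁ |
1 | 0 | 0 | 0 | 1 | Y +n X +n 1 |
1 | 0 | 0 | 1 | 0 | X -n Y -n 1 |
 | | | | | X -n Y | NSUB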
5.4.9 RALU with overflow detector and set-less-than function
When the RALU performs an addition or a subtraction, we should consider
the range allowed by the number representation in use, i.e., we
should check whether the result lies within the representable range. For unsigned
numbers, the Cout output indicates the overflow of the operation. However, as
mentioned in Section 5.2, for operations on signed numbers (2’s
Complement computation) a controlled adder/subtractor requires an extra circuit
to monitor overflow. Similarly to the RCAS design with the overflow detector
(Fig. 5-5), we modify our RALU at the most significant bit position. A copy of the
carry input of this RALU block is obtained using the CNOTin gate, and this copy
is XOR-ed with the carry output of the block (gate CNOTovf) to detect overflow (Fig. 5-20).
[Block diagram; cascade of RALU0 to RALU3 as in Fig. 5-18(b), extended at the most significant bit with the gates CNOTin, CNOTovf, CNOTsign and CNOTslt on additional constant 0 lines; additional outputs Ovflow and Slt.]
Figure 5-20: Modified 4-bit RALU with overflow detection and set-less-than operation
We can employ this overflow detector to add another operation to our RALU. The
set-less-than operation is usually available in a classical arithmetic logic unit and is used to
test whether a number X is less than a number Y (X<Y). As discussed earlier,
during the subtraction of two signed numbers, i.e., X-Y, the sign of the result
XOR-ed with the overflow signal indicates whether X is smaller than Y. The copy
of the sign bit (the most significant bit of the difference) is obtained with the CNOTsign gate.
The set-less-than output Slt is generated by the CNOTslt gate (Fig. 5-20). The
overall quantum cost increases by only 4 with the inclusion of the two functions,
overflow detection and comparison.
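The two flag computations can be checked in software. The following Python sketch is a behavioral model, not the reversible circuit: it performs an n-bit 2's complement subtraction X-Y as X + Y’ + 1 with a ripple carry, derives the overflow flag as the XOR of the carry into and out of the most significant bit, and derives set-less-than as the XOR of the sign bit and the overflow flag. The function name is hypothetical.

```python
def sub_with_flags(x, y, n=4):
    """n-bit 2's complement X - Y with overflow and set-less-than flags."""
    mask = (1 << n) - 1
    a, b = x & mask, (~y) & mask      # X and inverted Y as n-bit patterns
    carry = 1                         # Cin = 1 completes the 2's complement
    carry_into_msb = 0
    result = 0
    for i in range(n):
        if i == n - 1:
            carry_into_msb = carry    # the copy taken by the CNOTin gate
        ai, bi = (a >> i) & 1, (b >> i) & 1
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (carry & (ai ^ bi))
    ovf = carry_into_msb ^ carry      # role of the CNOTovf gate
    sign = (result >> (n - 1)) & 1    # copy of the sign bit (CNOTsign)
    slt = sign ^ ovf                  # role of the CNOTslt gate
    return result, ovf, slt

# X = -7, Y = 2: the true difference -9 is outside the 4-bit range.
print(sub_with_flags(-7, 2))
```

An exhaustive check over all 4-bit signed pairs confirms that the Slt value of this model equals 1 exactly when X < Y, including the cases in which the subtraction overflows.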
5.5 Reversible square root circuit
In scientific computations, next to the basic mathematical operations of addition,
subtraction, multiplication and division, the square root is among the most useful operations. For
example, numerical analysis, complex number computations, statistical analysis,
computer graphics and signal processing are among the fields where the square root
is relevant [87]. In the classical irreversible domain we find different realizations of
square root circuits. Since reversible circuits are emerging as an alternative to
classical circuits, here we introduce a novel reversible realization of this operation
in addition to the other reversible arithmetic circuits, even though it is not suitable
for quantum algorithms. As a basic module, we use the reversible controlled
adder/subtractor (RCAS) block based on 2’s Complement computation, with a slight
modification. In our design we create an array of such RCAS blocks, which
perform addition or subtraction based on the result generated by the digit-by-digit
square root operation. To the best of our knowledge, this is the first methodical
approach to implementing a reversible square root circuit.
Although the realization of the square root in classical circuits is well established,
the way of implementing this operation in emerging technologies is not yet
settled. We find only one example of a reversible 8-bit square root circuit, as a
benchmark result in [82]. However, this is an isolated example of a square root
operation, not a regular structure or a generalized method of building reversible
square root circuits. In this part of our work, we propose a structured methodology
for implementing this arithmetic circuit. Based on its classical realization, we
generate the reversible embedding, which is an array structure of basic blocks
and can realize square root circuits of any size. In particular, we implement the
classical non-restoring array structure of the square root circuit [87] as a reversible
network, which performs 2’s Complement addition/subtraction controlled by the
digit-by-digit square root result.
A square root circuit is not modular, unlike an adder/subtractor, which has a
regular structure suitable for cascading. Nevertheless, the RCAS introduced in the
previous sections can serve as the common building block of a square root implementation. For example, the
array structure of the non-restoring square root circuit presented in [87] uses
classical controlled adder/subtractor blocks (Fig. 5-21). The 8-bit non-restoring
circuit realizes a digit-by-digit scheme in which each iteration, computed in one
row, produces only one digit of the square root [87]. Based on this structure we
create our reversible square root circuit using the reversible controlled
adder/subtractor block (RCAS), Fig. 5-22. A few additional CNOT gates are
added to provide the fan-out signals. Note that the RCAS blocks are placed in the
reversible n-bit square root circuit in such a way that no fan-out signals are
present.
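To make the digit-by-digit scheme concrete, the following Python sketch implements the textbook non-restoring integer square root in software. It is meant only as an illustration of the arithmetic that each row of the array performs (one subtraction or addition per root digit, chosen by the sign of the previous partial remainder); it is not claimed to reproduce the exact bit-level wiring of the array in [87]. The function name is hypothetical.

```python
def nonrestoring_sqrt(d, n):
    """Square root of a 2n-bit value d; returns (q, r) with d = q*q + r."""
    q = 0   # partial root, gains one bit per iteration (one array row)
    r = 0   # partial remainder; may become negative and is not restored
    for i in range(n - 1, -1, -1):
        pair = (d >> (2 * i)) & 0b11         # next two bits of the input
        if r >= 0:
            r = 4 * r + pair - (4 * q + 1)   # subtract: append 01 to q
        else:
            r = 4 * r + pair + (4 * q + 3)   # add: append 11 to q
        q = 2 * q + (1 if r >= 0 else 0)     # new root digit from the sign
    if r < 0:
        r += 2 * q + 1                       # final remainder correction
    return q, r

print(nonrestoring_sqrt(232, 4))
```

For the 8-bit input 232 the sketch yields the root 15 and the remainder 7, since 15 x 15 + 7 = 232.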
[Circuit diagram; (a) CAS cell built around a full adder with inputs X, Y, A/S, CIN and outputs COUT/BOUT, S/D; (b) array of cells CAS1 to CAS20 in four rows with inputs X1..X8, root outputs q1..q4 and remainder outputs r1..r8.]
Figure 5-21: Classical square-root circuit (a) Internal structure of CAS and (b) 8-bit square root circuit [87]
In the proposed reversible square-root circuit, we incorporate the RCAS block
from Fig. 5-22 and reuse the copy of the input signal Y of the previous stage for the next
stage of the square root operation. For example, the Y outputs of CAS3 and CAS4 in row 2
are connected to the Y inputs of CAS8 and CAS9 in row 3 (Fig. 5-21(b)). The RCAS
guarantees the generation of all the original irreversible fan-out signals required
for our square root circuit at minimal implementation cost. Before
explaining the generalized design, we first present the procedure to realize a small
reversible square root circuit.
[Circuit diagram; (a) RCAS block symbol and (b) internal implementation; inputs A/S, X, Y, CIN and constant 0; outputs A/Sg, Yg, g1, S/D = A/S⨁X⨁Y⨁CIN, COUT.]
Figure 5-22: Modified RCAS module for Square Root
A 4-bit reversible square root circuit with the input x1x2x3x4, the 2-bit square root
output q1q2 and the 4-bit remainder output r1r2r3r4 is presented in Fig. 5-23. It
incorporates the top two rows of Fig. 5-21. In the classical
implementation, however, the control input signal A/S of the first row (set to 1) is
propagated through CAS1 and CAS2, and is fed back to CAS2 as the carry-in signal.
The carry-out of CAS2 serves as the Cin input to CAS1. Note that the propagated
A/Sg from CAS1 is the same signal for both the CAS1 and CAS2 blocks. Further,
the carry-out of all CAS blocks is transmitted from right to left (for example, Cout of
CAS2 is connected to Cin of CAS1, Cout of CAS6 to Cin of CAS5, etc.). Since no
feedback is allowed in a reversible embedding, we change the order of the RCAS
signal propagation to the direction from right to left. Hence the right-most
RCAS1 block in the top row of Fig. 5-23(b) takes the control input A/S. In the
classical design (Fig. 5-21(b)), the A/S signal of CAS1 is set to ‘1’ in order to
calculate the first digit of the square root. We also set the A/S input of RCAS1 to ‘1’
(Fig. 5-23(b)), and the Cin input of the RCAS1 block is connected to the same control signal A/S. Hence,
the required copy of the A/S signal in the reversible embedding is obtained by using a
CNOT gate (the right-most gate CNOTcin in the top row of Fig. 5-23(a), with the I/O
mapping (A/S, ‘0’) → (A/Sg, A/S⨁0)). The outputs of this CNOT are connected to the
inputs A/S and Cin of RCAS1, respectively. The outputs A/Sg and Cout of the
RCAS1 block are connected to the RCAS2 block on its left. To provide the inverted A/S
as well as A/S itself, each inverter of the classical implementation is replaced by a
CNOT gate, marked CNOTinv in Fig. 5-23(a), with the I/O mapping (A/S, ‘1’) →
(A/Sg, A/S⨁1). In addition, since the square root bit qi of each row is the
control signal for the next row (Fig. 5-21), we need a fan-out of qi. In the reversible
implementation, we use a CNOT gate to generate a copy of the required signal.
For example, to obtain a copy of q1, we use a CNOT gate (CNOTq in Fig. 5-23(a))
with the I/O mapping (q1, ‘0’) → (q1, q1⨁0).
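The two uses of the CNOT gate described above, copying a signal onto a constant ‘0’ line and producing its inverse on a constant ‘1’ line, can be written down directly. The Python helper below is a minimal illustration; its name is an assumption and does not come from the design.

```python
def cnot(control, target):
    """CNOT gate: the control passes through, the target becomes target XOR control."""
    return control, target ^ control

for a_s in (0, 1):
    copied = cnot(a_s, 0)     # (A/S, 0) -> (A/S, A/S): fan-out copy
    inverted = cnot(a_s, 1)   # (A/S, 1) -> (A/S, NOT A/S): inverter
    print(a_s, copied, inverted)
```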
5.5.1 n-bit reversible square root circuit
The array structure of the reversible square root circuit can easily be extended to
any size of the square root operation. In general, for a 2n-bit square
root circuit, generating the n-bit square root output requires n rows of
CAS blocks, where row i has 2i CAS blocks (i = 1,
2, ..., n) [87]. The architecture of a 2n-bit square root circuit is a direct extension of the
schematic presented in Fig. 5-21.
Similarly, we can implement our reversible array structure of the 2n-bit square root
circuit. We need n(n+1) RCAS blocks, arranged in n rows with 2i RCAS blocks in
row i (i = 1, 2, ..., n). In each row, the control signal (A/S) of the right-most RCAS block
has a fan-out to the Cin input. To copy the A/S signal we need
n extra CNOT gates (CNOTcin in Fig. 5-23(a)) with the I/O configuration (X, ‘0’) → (X,
X⨁0). Also, the input Y of the second RCAS block from the right in each row is
connected to the inverted A/S signal. Hence, we need n additional CNOT gates
(CNOTinv in Fig. 5-23(a)) realizing the I/O mapping (X, ‘1’) → (X, X⨁1). Likewise,
the input Y of the third RCAS block from the right in each row, excluding the first row,
is connected to A/S. Hence n-1 additional CNOT gates (CNOTAS in Fig. 5-23(a))
implementing the function (X, ‘0’) → (X, X⨁0) are required. Finally, n-1
CNOT gates (CNOTq in Fig. 5-23(a)) performing the mapping (X, ‘0’) → (X, X⨁0)
are used to obtain a copy of the square root digit qi from each row except the final
row.
[Circuit diagram; (a) block diagram with RCAS 1 and RCAS 2 in the first row and RCAS 3 to RCAS 6 in the second row, the gates CNOTcin, CNOTinv, CNOTAS and CNOTq, inputs X1..X4 with constant 0 and 1 lines, outputs q1, q2, r1..r4 and garbage outputs g1..g10; (b) internal gate-level implementation with A/S = 1.]
Figure 5-23: 4-bit reversible square root circuit (2-bit output) (a) Block diagram and (b) Internal reversible implementation.
5.5.2 Analysis of n-bit circuit parameters
Each RCAS block in Fig. 5-23 requires two CNOT and two Peres gates, and
generates 3 garbage outputs. Note that the actual number of garbage bits is 2,
as the garbage signal A/Sg is reused to propagate the input control
signal A/S between two cascaded RCAS blocks. Hence, the implementation of a
circuit with a 2n-bit input and an n-bit output requires 2n(n+1) Peres and 2n(n+3)-2 CNOT
gates, for a total of 4n²+8n-2 reversible gates. The quantum cost of
the overall design is 10n²+14n-2. The number of garbage outputs generated by a
square root circuit with an n-bit output is n²+6n-2. The circuit parameters for different sizes are
shown in Table 5-9.
The 8-bit square root benchmark circuit (sqrt8) in [82] utilizes 40 multiple-control
Toffoli gates with an overall quantum cost of 622. In our approach, the
number of reversible gates (Peres and CNOT) is 94 and the overall quantum cost is
only 214. Note that the design in [82] requires fewer gates because it uses Toffoli
gates with many control lines (more than 5 controls), which have a higher quantum cost
(QC = 125 for a Toffoli gate with 6 controls), whereas we use only Peres (QC = 4) and
CNOT (QC = 1) gates. Hence, we rely on quantum cost as the comparison parameter,
and in that respect our approach is better (a 65% improvement).
Table 5-9: Costs of Reversible Square Root for different sizes
Circuit size (input bits) | # of gates | # of garbage outputs | Quantum cost
--- | --- | --- | ---
4 | 30 | 14 | 66
8 | 94 | 38 | 214
16 | 318 | 110 | 750
32 | 1150 | 350 | 2782
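The closed-form expressions of Section 5.5.2 can be reproduced by counting the building blocks. The Python sketch below assembles the totals from the per-block figures stated in the text (two Peres and two CNOT gates per RCAS block, quantum cost 4 per Peres gate and 1 per CNOT gate, and the extra CNOTcin, CNOTinv, CNOTAS and CNOTq gates); the function name and the dictionary keys are illustrative.

```python
def sqrt_circuit_costs(n):
    """Costs of the array for a 2n-bit input and an n-bit root output."""
    rcas_blocks = n * (n + 1)                 # 2i blocks in row i, i = 1..n
    peres = 2 * rcas_blocks
    extra_cnot = n + n + (n - 1) + (n - 1)    # CNOTcin, CNOTinv, CNOTAS, CNOTq
    cnot = 2 * rcas_blocks + extra_cnot
    return {
        "input_bits": 2 * n,
        "gates": peres + cnot,                # equals 4n^2 + 8n - 2
        "quantum_cost": 4 * peres + cnot,     # equals 10n^2 + 14n - 2
        "garbage": n * n + 6 * n - 2,         # formula quoted in the text
    }

for n in (2, 4, 8, 16):
    print(sqrt_circuit_costs(n))
```

For the 8-bit circuit (n = 4) the sketch gives 94 gates, a quantum cost of 214 and 38 garbage outputs, the same garbage count that is reported for the simulation of the 8-bit circuit.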
5.6 Simulation Results
We validated the reversible functionality through simulations of the proposed
designs: the reversible controlled adder/subtractor circuits (Fig. 5-24 to Fig. 5-26),
the reversible comparator circuit (Fig. 5-27), the two 1-bit RALU blocks (Fig. 5-28
and Fig. 5-29), the 4-bit reversible arithmetic logic circuits (Fig. 5-30 and Fig. 5-31) and the RALU
with overflow detector and set-less-than operation (Fig. 5-32). The simulation
results of the reversible square root circuit are presented in Fig. 5-33 and Fig. 5-34.
All of the above designs were implemented in VHDL and simulated using
Quartus II 9.1 sp1 web edition [88]. The RCAS module is modeled
behaviorally, while the remaining designs are implemented in structural
code with the RCAS block as a component.
Fig. 5-24 illustrates the simulations of the RCAS block with inputs: X, Y, Cin,
control signal AS and constant inputs ‘zero’, and outputs ASout, Cout, SD, G1 and
G2 (two garbage bits).
Figure 5-24: Reversible controlled adder/subtractor
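The kind of check performed by these simulations can also be sketched in Python. The model below assumes one plausible wiring of an adder/subtractor from one CNOT and two Peres gates, consistent with the gate count quoted for the RCAS but not necessarily identical to the exact layout of the thesis; all names are illustrative. It verifies that the mapping is a bijection (every input pattern has a unique output pattern) and that, with the constant line at 0, the outputs encode X + (Y⨁AS) + Cin.

```python
from itertools import product

def cnot(c, t):
    return c, t ^ c

def peres(a, b, c):
    """Peres gate: (a, b, c) -> (a, a XOR b, (a AND b) XOR c)."""
    return a, a ^ b, (a & b) ^ c

def rcas(as_, x, y, cin, z):
    """Assumed RCAS wiring: z is the constant-0 line that receives Cout."""
    as_, y = cnot(as_, y)            # Y is inverted when AS = 1
    x, p, z = peres(x, y, z)         # p = X xor Y', z = X and Y'
    p, sd, cout = peres(p, cin, z)   # sum/difference bit and carry out
    return as_, x, p, sd, cout       # x and p act as garbage outputs

# Reversibility: all 32 input patterns map to distinct output patterns.
images = {rcas(*bits) for bits in product((0, 1), repeat=5)}
print(len(images))  # 32

# Arithmetic: with z = 0, SD + 2*Cout = X + (Y xor AS) + Cin.
ok = all(sd + 2 * cout == x + (y ^ a) + cin
         for a, x, y, cin in product((0, 1), repeat=4)
         for (_, _, _, sd, cout) in [rcas(a, x, y, cin, 0)])
print(ok)  # True
```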
Fig. 5-25 presents the simulation of the 4-bit reversible adder/subtractor with the
random input values. The highlighted node SD illustrates the signed decimal
value of the arithmetic operation on two signed inputs X and Y (the addition when
the control input AS is 0, and the subtraction for AS equals to 1).
Figure 5-25: 4-bit Reversible controlled adder/subtractor
In Fig. 5-26, we present the random simulation results of the 4-bit 2’s
Complement reversible adder/subtractor with the overflow detector. Note that
the first two values of the output SD are the correct results of an addition and
a subtraction. However, the third value is erroneous, as it does not lie within the
4-bit range of the data representation. This is indicated by the overflow flag Ovf
being set high.
Figure 5-26: 4-bit Reversible controlled adder/subtractor with overflow detector
We present the simulation results of the 4-bit reversible comparator for signed
numbers in 2’s Complement computation in Fig. 5-27. The simulator sets some
random values of two signed numbers X and Y. Control input AS is set to 1 for
subtraction result. In Table 5-10, we show the comparator operations at different
intervals.
Figure 5-27: 4-bit Reversible Comparator
Table 5-10: Comparator outputs at various intervals of Fig. 5-27
Interval (ns) | Inputs | Outputs | Operation Satisfied
--- | --- | --- | ---
80 - 100 | X=5, Y=3 | Sout = 2, Grth = 1 | X>Y
100 - 120 | X=-1, Y=4 | Sout = -5, Less = 1 | X<Y
140 - 160 | X=3, Y=3 | Sout = 0, Equal = 0 | X=Y
220 - 240 | X=-2, Y=-2 | Sout = 0, Equal = 0 | X=Y
Fig. 5-28 illustrates the simulations of the RALU block using 4:1 MUX with inputs:
X, Y, Cin, control signals AS, Ctl2 and Ctl1, three constant inputs ‘zero’ (hidden
nodes), and outputs ASout, Cn1g, Cn0g, Cout, Result, G1 (copy of X) and G2, G3,
G4. Note that every input combination has an expected (unique) output pattern.
This, in addition to the correctness of behavior of RALU, confirms reversibility of
the RALU module function. For clarity, some of the signals are not displayed
(zero, G1, G2, G3, G4, ASout, Cn1g and Cn0g). However, the intermediate results
of different functions are shown in the simulation. The highlighted node ‘ Result’
indicates which operation is being performed based on the control signals.
Figure 5-28: Simulation result of 1-bit Reversible ALU block
Fig. 5-29 shows the simulations of a 1-bit RALU using a Fredkin multiplexer. For
each control inputs combination, we simulate 4-input patterns of X and Y. The
highlighted ‘Result’ represents correct function output values for corresponding
control signals. Moreover, the other non-selected outputs are available at
garbage outputs Gao, Gsx and Gr.
Figure 5-29: Simulation result of 1-bit RALU using Fredkin MUX
Fig. 5-30 illustrates the random simulations of a 4-bit RALU. The node ‘Result’
(highlighted) shows the outputs according to the control signals. For each control
combination we check two random inputs and outputs. For example, for controls
ASCtl2Ctl1= 001 and inputs X = 1010 and Y = 0100, the RALU performs an OR
operation. Hence, the outputs become Result= 1110. Note that the garbage
outputs are hidden in the display.
Figure 5-30: Simulation result of 4-bit Reversible ALU with 4:1 MUX
In Fig. 5-31 we present the simulation results of a 4-bit RALU using Fredkin
gates, with 24 inputs and 24 outputs. As before, two random values are applied to the
inputs X and Y, and the output is ‘Result’. The time interval 0-20 ns represents bit-wise
AND (XY), 20-40 ns bit-wise OR, 40-60 ns Sum, 60-80 ns XOR, 80-100 ns AND
with Y inverted (XY’), 100-120 ns OR with Y inverted (X+Y’), 120-140 ns
subtraction and finally 140-160 ns the XNOR operation. The garbage outputs Gx,
ASout, Gc1 and Gc0 are the copies of the inputs X, AS, Ctl1 and Ctl0, respectively.
The garbage outputs Gao, Gsx and Gr carry the non-selected outputs: AND or OR
(depending on Ctl0), Sum or XOR (depending on Ctl0), and whichever of AND/Sum or OR/XOR
is not selected by ‘Result’ (depending on Ctl1).
Figure 5-31: Simulation result of RALU_Fredkin_4bit
The simulation results of RALU with the overflow detector and the set-less-than
function are shown in Fig. 5-32. The node ‘Result’ shows the outputs according
to the control signals for some random values of inputs set by the simulator. In
Table 5-11, we summarize some of the operations.
Figure 5-32: Simulation result of 4-bit RALU with overflow and set-less-than
Table 5-11: RALU operations at various intervals of Fig. 5-32
Interval (ns) | Controls ASCtl0Ctl1 | Inputs | Outputs | Operation
--- | --- | --- | --- | ---
20 - 40 | 000 | X=3, Y=1 | Result = 1 | AND
60 - 80 | 001 | X=-1, Y=4 | Result = 3 | ADD
140 - 160 | 101 | X=-3, Y=0 | Result = -3, Slt = 1 | SUB, set-less-than
200 - 220 | 101 | X=-7, Y=2 | Ovf = 1, Slt = 1 | SUB, Overflow
260 - 280 | 011 | X=2, Y=2 | Result = 0 | XOR
Fig. 5-33 illustrates the simulations of our RCAS block for square root circuit with
inputs: X, Y, Cin, control signal AS, and constant inputs ‘zero’, and outputs
ASout, Cout, SD, G1(a copy of Y) and G2. Note, that every input combination
has an expected (unique) output pattern. This, in addition to the correctness of
behavior, confirms reversibility of the RCAS function.
Figure 5-33: Simulation result of RCAS for Square Root Circuit
We present the simulation result of the 8-bit square root circuit in Fig. 5-34. The
simulator used random (decimal) values for the input X. The node q (highlighted)
shows the decimal value of the square root of the input X, while the output r
represents the remainder. For example, for the input X = 232 (decimal), the
square root quotient is 15 and the remainder r is 7. The number of inputs/outputs is
50. We observe that the total number of garbage bits is 38, which complies with our
theoretical analysis. However, only 4 garbage outputs are shown here. The remaining
34 garbage bits and some constant inputs are hidden due to space limitations.
Figure 5-34: Simulation result of 8-bit Reversible Square Root Circuit
5.7 Conclusion
Reversible logic is considered to be compatible with future computing
technologies, which dissipate less energy. Finding an efficient reversible
implementation of classical computer arithmetic especially the arithmetic logic
unit is still a challenging issue. In this chapter, we presented reversible
architectures of computer arithmetic circuits with smaller overhead than designs
proposed by other authors. Starting with the basic RCAS module, which is then
extended to detect overflow in 2’s Complement computation, we demonstrated
that our design has better performance than the existing subtractors or combined
adder/subtractors in terms of quantum cost and garbage outputs. Next, we
employed our RCAS with overflow detector to present a novel design of
reversible comparator. Based on the information from subtraction and overflow
outputs, we can compare two signed numbers to identify equality, less than or
greater than operations. The design outperforms existing methods.
Finally, we proposed a complete and new RALU, which is similar to the basic
classical ALU. We presented two different realizations and then analyzed their
effectiveness. Our integrated module is better than any existing reversible
arithmetic logic unit incorporating more operations. The modular structure of an
n-bit RALU offers economical and acceptable values of reversible circuit
parameters comparable to other benchmark circuits.
Later, we presented a novel design methodology to implement a reversible square
root circuit with an array structure. As a building block, we first created a modified
reversible controlled adder/subtractor module performing reversible 2’s
Complement addition/subtraction. The quantum cost of this module was 10. Next
we implemented an array structure with the RCAS modules to perform the square
root operation. The methodology was generalized to realize square root circuits
of any size with less quantum cost.
Since reversible logic is a promising candidate alternative to classical circuits,
any development in arithmetic circuits will play an important role in their direct
future technological translation.
Chapter 6 Testing of Reversible Circuits
Reversible circuit testing is a challenging issue. Different models have been
introduced to address technology-specific faults. To date, the frequently considered
reversible circuit fault models include missing gates and control point appearance
or disappearance in Toffoli networks consisting primarily of basic gates such as
NOT, CNOT and Toffoli. In the process of synthesis or template matching, errors
can also occur due to the erroneous replacement of a gate with a wrong one or
the incorrect cascading of gates. While searching for an optimized alternative, designers
also sometimes erroneously add gates that were not intended. On
the other hand, in addition to the basic gates, reversible gate libraries nowadays often
include Peres gates, Fredkin gates and application-specific reversible gates
mainly used to realize arithmetic algorithms. To incorporate the extended
reversible gate libraries into testing, we present in this thesis gate replacement
and wire replacement fault models, which can also model errors such as missing gates
and control point appearance or disappearance. Similarly to the synthesis
and testing of irreversible circuits, the detection of such gate or wire faults and
the identification of any untestable (redundant) faults can contribute to the overall
circuit optimization. In this chapter, we propose three testing schemes based on
a Boolean Satisfiability (SAT) formulation addressing gate and wire replacement
faults. We also present the construction of a Reversible Test Miter, which, along
with backtracking, can easily detect such faults. Finally, we show that with
test vectors learned from a reversible test miter, a smaller test set for gate
replacement faults can be derived to increase the fault coverage and the speed of
testing.
In an additional part of this chapter, we introduce the testability of reversible
modular circuits, especially arithmetic circuits. As an example, the testing of the
reversible adder/subtractor for the missing control point fault is discussed, and
the special feature of the modular design, its small test size, is identified.
6.1 State-of-art
As reversible circuits are still a relatively new area, most research effort has
focused on the synthesis of such circuits. However, testing and verification are an integral
part of the overall design process. Recently, some progress has been made in
fault modeling and testing of reversible circuits [41-48, 117-124, 127]. In
particular, it has been demonstrated how to model failures in reversible logic at
the gate level. It has been shown that different types of faults, such as stuck-at
faults, bridging faults, missing gate faults and lock gate faults, can occur in
reversible circuits. Such faults can be difficult to detect, though circuit reversibility
offers better observability and controllability than classical circuits [41].
Patel et al. [41] proposed test generation schemes for reversible circuits. The authors
showed that only a few test vectors are necessary to fully test a reversible
circuit under the multiple stuck-at fault model, with the number growing at most
logarithmically in both the number of inputs and the number of gates. Although the
approach is not guaranteed to apply to real implementations of quantum
circuits, it opens the door to investigating the testing of reversible circuits.
Further, the authors of [43] observe that traditional fault models such as stuck-at-value
may not accurately represent the faulty behavior of reversible circuits or support
their test generation. A missing gate fault model is proposed to better represent the
physical failures of quantum technologies. This fault model is extended in [44, 117],
where a family of logic faults in quantum and reversible circuits is presented. The targeted
realization is the ion-trap technology of quantum circuits. The considered faults
are single missing gate, repeated gate, multiple missing gate and partial missing
gate faults.
A new fault model, the cross-point fault, which is the presence or absence of
control points, is presented in [89]. The proposed ATPG algorithm is based on
random selection of faults from the complete fault set, which offers better results in
terms of minimal test set and runtime.
Further, a design-for-testability method has been proposed to make any
reversible logic circuit composed of n-bit Toffoli gates fully testable for single
stuck-at faults and single intra-level bridging faults (a short between two lines) [45].
The method offers full coverage of these two fault types; however, the overhead
of the added testability circuitry in terms of quantum cost is significant.
In recent work, SAT-based ATPG is proposed for testing single missing control
faults, single additional control faults and single missing gate faults using known
fault-detecting constraints [48, 89]. The method is very efficient, can handle
additional constraints such as constant inputs, and can even identify
untestable faults.
Other recent efficient approaches for fault detection and test vector
generation for various faults in reversible and quantum circuits are presented in
[118-124].
The synthesis process generally uses a reversible gate library, which includes
basic gates such as NOT, CNOT and Toffoli as well as Peres and Fredkin gates.
Moreover, reversible arithmetic circuits proposed over the last few years
include some new reversible gates, a summary of which is presented in [53].
The replacement of one gate with another, or the wrong swapping of target and
controls, cannot be described by the control point fault or missing gate fault models
alone. Thus in this chapter, we consider two fault models, namely the gate
replacement fault and the wire replacement fault, which target circuits implemented
using the above-mentioned reversible gate library. Moreover, missing gate
faults and cross-point faults can also be defined under the proposed fault models.
Note that gate modifications in reversible logic have been exercised in [72] to
aid the search for partially redundant logic for optimization purposes. We also
consider the erroneous swapping of the target and control lines of a gate, labeling
these as wire replacement faults in reversible logic.
In this chapter, we propose three testing methods to detect gate and wire
replacement faults by adapting conventional testing schemes for irreversible
circuits. We also study their efficiency. Furthermore, we present a
general methodology for finding test vectors that aims to produce better fault
coverage. Our method is faster and requires less memory to find solutions. In
particular, we propose a reversible test miter and use Boolean satisfiability to find
a proper test vector.
6.2 Fault Models and Testability of Reversible Circuits
Different fault models have been reported for reversible circuits, such as stuck-at faults,
delay faults, bridging faults, various forms of missing gate faults, and cross-point
appearance or disappearance faults [41, 43, 117, 121-124, 127]. A summary
of various fault models in the context of logical reversible and quantum circuits,
with their technological relevance, is presented in [127]. A stuck-at fault
holds the output of a wire or gate at a constant value irrespective of its input
signal. The existence of this type of fault in real quantum technology is
questionable. To relate to ion-trap quantum technology, missing gate faults
are introduced in [43], where the qubits are represented by the ion states and
gates correspond to the laser pulses controlling the interaction of the ions. Wrong
duration, misalignment or mistuning of the pulses results in missing gate faults, which
produce logical outputs as if one or more gates were missing. The bridging
fault is an unwanted connection of elements that alters the gate functionality; for
example, in a transistor-based design a short between two terminals
changes the output.
In this dissertation, we address the testing of gate and wire replacement faults.
A gate replacement fault is present when another gate is erroneously placed
instead of the correct one. The circuit has the same number of gates as the specification
but produces a wrong logical output. In quantum reversible circuits, this type of fault
corresponds to a pulse of wrong wavelength or duration [127]. The correct and
erroneous gates must be I/O compatible. Although in irreversible circuits a fault is
sometimes found to be redundant, this is not the case in a complete reversible
implementation due to the unique mapping between inputs and outputs. Namely,
if the fault does not affect the main outputs, it is propagated through the garbage
outputs. Thus, there is always a vector for which the faulty and fault-free reversible
gates generate two different output vectors. However, constant inputs in a reversible
circuit can make a fault hard to detect.
We extend the definition of gate replacement faults to include both the missing gate
fault and cross-point appearance or disappearance faults. The missing gate fault
model represents the complete omission of a gate from the network. This fault can
be defined as a gate replacement fault in which the gate is replaced by an identity
gate. Similarly, a cross-point fault, which is the appearance or
disappearance of control points in a reversible gate and changes the
functionality of that gate, can be viewed as a gate replacement fault. For example,
if one control point of a Toffoli gate is missing, then the gate performs a CNOT
operation, i.e., the faulty behavior is modeled as a Toffoli gate replaced by a CNOT
gate. However, a recent study shows that this cross-point fault model is not
realistic in quantum technology, since in this case the gate becomes probabilistic
[127].
Wire replacements are considered both in synthesis and verification [78, 90, 91]
of classical circuits. Many recent advances in synthesis explore the rewiring
approach, where selected wires are chosen for replacement in an attempt to
find a better implementation of the logic at the considered nodes. This issue is very
important in the simulation-based verification of reversible logic when
redundant wire replacement and optimization are the goal. Wire replacements
can be classified into two categories. The first deals with errors affecting input
port connections. The second describes errors causing one or more
internal wires in the circuit to be wrongly connected to nodes [78]. In this work we
consider the second category of wire replacement faults. In reversible circuits,
especially in dual-rail technology or CMOS implementations, the interconnection
between gates can be wrong, or the control line can be erroneously swapped with the
target.
6.2.1 Testability of Reversible Circuits
In conventional circuits, the two important factors that measure the testability
of a circuit are controllability and observability. For reversible circuits these two
properties are easier to achieve, simplifying the test procedure [41]. In particular,
the full observability of any fault in the circuit is guaranteed, since each input
vector corresponds to a unique output. Hence, any change in the network due to
an error will alter the output vector. Similarly, due to the backtracking property of
reversible circuits, full controllability is guaranteed. This, however, is not
always true in embedded reversible circuits, where constant inputs restrict
the search for proper input vectors [89].
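The effect of constant inputs on testability can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the 3-line circuit, the gate positions and the helper names (`run`, `detecting_vectors`) are hypothetical and are not taken from Fig. 6-1, and exhaustive simulation stands in for any formal test generation.

```python
from itertools import product

def toffoli(bits, c1, c2, t):
    """Flip line t when both control lines c1 and c2 carry 1."""
    out = list(bits)
    out[t] ^= bits[c1] & bits[c2]
    return tuple(out)

def fredkin(bits, c, s1, s2):
    """Swap lines s1 and s2 when control line c carries 1."""
    out = list(bits)
    if bits[c]:
        out[s1], out[s2] = bits[s2], bits[s1]
    return tuple(out)

def run(circuit, bits):
    """Apply the gates of a circuit, given as (gate, lines) pairs, in order."""
    for gate, lines in circuit:
        bits = gate(bits, *lines)
    return bits

def detecting_vectors(good, faulty, n, constants=None):
    """Inputs on which the two n-line circuits differ, honoring constant inputs."""
    constants = constants or {}
    vectors = []
    for bits in product((0, 1), repeat=n):
        if any(bits[line] != value for line, value in constants.items()):
            continue  # this input cannot be applied to the embedded circuit
        if run(good, bits) != run(faulty, bits):
            vectors.append(bits)
    return vectors

# Hypothetical circuit: the second gate (Toffoli) is replaced by a Fredkin gate.
good = [(toffoli, (1, 2, 0)), (toffoli, (0, 1, 2))]
faulty = [(toffoli, (1, 2, 0)), (fredkin, (0, 1, 2))]

free_tests = detecting_vectors(good, faulty, 3)                      # no constants
one_constant = detecting_vectors(good, faulty, 3, {0: 0})            # line 0 fixed to 0
two_constants = detecting_vectors(good, faulty, 3, {0: 0, 1: 0})     # lines 0 and 1 fixed
```

In this hypothetical circuit the fault is detectable without constants, remains detectable through an alternative vector when one line is fixed, and becomes untestable when two lines are fixed, mirroring the three situations discussed in the text.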
Example 1: Consider the circuit shown in Fig. 6-1, constructed with Toffoli gates.
The gate at level 5 (circled in red) is erroneously replaced with a Fredkin gate. To
observe the fault effect, we set the input vector of the corresponding gate to
111. Using backward propagation, we find that the input assignment 110
activates this fault. For a complete reversible circuit, Fig. 6-1(a), an input
assignment leading to fault detection can be found easily. However, for the
embedded reversible circuit with constant 1 at line 3 in Fig. 6-1(b), it is
impossible to set this line to 0 for fault excitation. Again, if input a on line
1 is set to constant 0, Fig. 6-1(c), no test vector can detect the fault. Thus
constant inputs can sometimes make faults hard to test.
Figure 6-1: Controllability and Observability (a) without constant input (fully
testable) (b) contradiction with constant input and search for alternative vector (c)
untestable fault in embedded circuit with constant input [89].
6.3 Proposed SAT-based Testing for Gate Replacement Faults
In this section, we describe our methods for testing gate replacement faults. To
start with, we consider the application of the conventional SAT-based testing
method; then we move to the second method, based on the reversible miter; and
finally we present a novel fault identification based on a reversible test miter. A
SAT-based formulation is very common in irreversible circuit testing and, recently,
in the synthesis, testing, verification and debugging of reversible circuits. To
utilize this popular method, we first need to generate the SAT formulation of the different
reversible gates and then use them to obtain the clauses of the complete circuit.
Definition 6.1 The Boolean Satisfiability (SAT) is defined as follows [92]. Let
h be a Boolean function in Conjunctive Normal Form (CNF). Then the SAT
problem is constructed to determine whether there exists an assignment to
the variables of h such that h evaluates to true or prove that no such
assignment exists.
As the Toffoli gate is universal, i.e., all reversible circuits can be implemented as
networks of Toffoli gates, we take advantage of this universality
and present a simple and fast method to detect gate replacement faults
in such networks. However, in a circuit with different kinds of reversible gates, it
is necessary to apply the SAT formulation level by level. Note that for the
verification of any output (checking the change in output), it is enough to consider
the corresponding line and the gates that lead to that specified output.
6.3.1 Proposed Conventional Miter-based Testing
Our first method for testing reversible circuits follows the conventional
test pattern generation technique [92]. To that end, we need to extract a
formula that defines the set of test patterns that detect the fault. To
find a test vector, we use a Boolean satisfiability algorithm to satisfy the
conjunctive normal form (CNF) formula representing the circuit functionality with
the incorporated notion of the fault.
First, for each reversible gate, we construct a set of clauses defining the logical
functionality of this gate. If the gate has garbage outputs that are copies of
inputs, we do not introduce any new variables to represent them. For
example, in a Toffoli gate the two control inputs are propagated as garbage outputs,
so only the target line clauses are considered. The sets of clauses
corresponding to different reversible gates are shown in Table 6-1. As for the
Toffoli gate, for other types of gates, outputs that are copies of inputs can be
disregarded in the CNF formula.
Table 6-1 CNF formula of standard reversible gates
[Table: columns Gate, Function, CNF formula; rows NOT, CNOT, Toffoli, Fredkin, Peres, R-gate, TR, BJN, URG. The Function and CNF formula entries are not recoverable.]
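As an illustration of how the clauses of a single gate can be obtained, the following sketch derives a CNF for the target line of a Toffoli gate, r = c ⊕ ab, directly from its truth table. This is a canonical, unminimized encoding chosen for clarity; it is an assumption of this sketch and not necessarily the clause set listed in Table 6-1. The function names and the integer variable numbering (positive for a variable, negative for its complement) are likewise illustrative.

```python
from itertools import product

def gate_cnf(func, in_vars, out_var):
    """Clauses forcing out_var == func(inputs), one clause per truth-table row.

    Each clause blocks the single assignment in which the inputs take the row
    values and the output takes the wrong value.
    """
    clauses = []
    for values in product((0, 1), repeat=len(in_vars)):
        clause = [-v if bit else v for v, bit in zip(in_vars, values)]
        clause.append(out_var if func(*values) else -out_var)
        clauses.append(clause)
    return clauses

def satisfies(clauses, assignment):
    """True if every clause has at least one literal made true by the assignment."""
    return all(
        any(bool(assignment[abs(lit)]) == (lit > 0) for lit in clause)
        for clause in clauses
    )

# Variables 1, 2, 3 are the inputs a, b, c; variable 4 is the target output r.
# The control outputs are copies of a and b, so they need no new variables.
toffoli_target = gate_cnf(lambda a, b, c: c ^ (a & b), [1, 2, 3], 4)
```

The clause set is satisfied exactly by the assignments in which variable 4 equals c ⊕ ab, which is what a SAT solver needs in order to reason about the gate.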
6.3.2 SAT formulation
To extract the CNF formula of any reversible circuit for a gate replacement fault, we
consider the topological description of the circuit. At each level, for each gate of the
fault-free circuit, we add the clauses of the corresponding gate. Next, we
represent a faulty version of the circuit by replacing the correct gate with the faulty
one. Since the fault-free and faulty circuits behave identically except for the faulty
gate and the remaining part of the circuit towards the output (purple fault cone in Fig.
6-2(b)), we extract the formula of the faulty circuit starting at the fault location and
take the conjunction of the clauses of this fault cone only. Next, to test for this fault, we
need to find a set of inputs that causes the fault-free output to differ from the faulty
output. An additional formula for the XOR of these two outputs defines that
condition. Thus our final formula for all possible tests is the
conjunction of the clauses for the fault-free circuit, the faulty circuit and the additional XOR
formula. We then use the existing SAT solver minisat2 [93] to find a satisfying
assignment, i.e., one that detects the fault. If the formula is unsatisfiable,
then the fault is redundant. This redundancy information is sometimes helpful
in optimizing the circuit.
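The formulation described above can be sketched end to end on a toy example. The code below is a minimal sketch under several assumptions: the gate dictionary, the 3-line circuit and all function names are hypothetical; the gate clauses come from an unminimized truth-table encoding; and a tiny recursive search replaces minisat2, which the thesis actually uses.

```python
from itertools import count, product

# For each gate, the function of its inputs (a, b, c) on every output position that changes.
GATES = {
    "toffoli": {2: lambda a, b, c: c ^ (a & b)},
    "fredkin": {1: lambda a, b, c: c if a else b,
                2: lambda a, b, c: b if a else c},
}

def table_cnf(func, ins, out):
    """Clauses forcing variable `out` to equal func(ins), one per truth-table row."""
    clauses = []
    for values in product((0, 1), repeat=len(ins)):
        clause = [-v if bit else v for v, bit in zip(ins, values)]
        clause.append(out if func(*values) else -out)
        clauses.append(clause)
    return clauses

def encode(circuit, wires, fresh, clauses):
    """Add the clauses of `circuit`; return the variables on its output lines."""
    wires = list(wires)
    for name, lines in circuit:
        ins = [wires[line] for line in lines]
        for pos, func in GATES[name].items():
            wires[lines[pos]] = next(fresh)
            clauses += table_cnf(func, ins, wires[lines[pos]])
    return wires

def solve(clauses, model=None):
    """Tiny DPLL-style search; returns a (possibly partial) model or None."""
    model = dict(model or {})
    pending = []
    for clause in clauses:
        if any(model.get(abs(lit)) == (lit > 0) for lit in clause):
            continue  # clause already satisfied
        rest = [lit for lit in clause if abs(lit) not in model]
        if not rest:
            return None  # clause falsified
        pending.append(rest)
    if not pending:
        return model
    lit = min(pending, key=len)[0]
    for value in (lit > 0, lit < 0):
        found = solve(pending, {**model, abs(lit): value})
        if found is not None:
            return found
    return None

def find_test_vector(circuit, fault_index, faulty_gate, n):
    """Fault-free clauses + fault-cone clauses + XOR of outputs, then solve."""
    fresh = count(1)
    inputs = [next(fresh) for _ in range(n)]
    clauses = []
    at_fault = encode(circuit[:fault_index], inputs, fresh, clauses)
    good_out = encode(circuit[fault_index:], at_fault, fresh, clauses)
    cone = [faulty_gate] + circuit[fault_index + 1:]   # fault cone only
    bad_out = encode(cone, at_fault, fresh, clauses)
    diffs = []
    for g, b in zip(good_out, bad_out):
        if g != b:
            diffs.append(next(fresh))
            clauses += table_cnf(lambda x, y: x ^ y, [g, b], diffs[-1])
    clauses.append(diffs)  # at least one output must differ
    model = solve(clauses)
    if model is None:
        return None  # unsatisfiable: the fault is redundant
    return tuple(int(model.get(v, False)) for v in inputs)

def simulate(circuit, bits):
    """Reference simulation used to check a returned test vector."""
    bits = list(bits)
    for name, lines in circuit:
        ins = [bits[line] for line in lines]
        for pos, func in GATES[name].items():
            bits[lines[pos]] = func(*ins)
    return tuple(bits)

good = [("toffoli", (0, 1, 2)), ("toffoli", (1, 2, 0)), ("toffoli", (0, 2, 1))]
fault = ("fredkin", (1, 2, 0))          # replaces the second gate
vector = find_test_vector(good, 1, fault, 3)
```

Calling `find_test_vector` with the original gate in place of the faulty one returns `None`, which corresponds to the unsatisfiable (redundant) case mentioned above.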
Example 1: Let us consider the 2-to-4 decoder circuit (with constant inputs a1=0
and a2=0), Fig. 6-2. The CNF formula for the fault free circuit is extracted
level by level and is shown in detail next:
Level 1: Peres gate
Level 2: Toffoli gate
Level 3: Toffoli gate
Level 4: CNOT gate
Level 5: NOT gate
Assume that in the faulty circuit the Toffoli gate at level 3 is erroneously
replaced by a Fredkin gate. The clauses for the faulty gate and the
corresponding fault cone from level 3 through levels 4 and 5 to the output are:
Now, to compare the fault-free outputs (y41 and y51) with the faulty outputs (y41f and y51f),
we introduce two new variables, z1 and z2, defining the difference
(XOR) of the faulty and fault-free outputs, i.e., z1 = y41 ⊕ y41f and z2 = y51 ⊕ y51f. To
test the fault and find a satisfying assignment, at least one of the variables z1 and
z2 must be 1. Thus, the resulting clauses are:
(z1+y41+y41f’)(z1’+y41+y41f)(z1+y41’+y41f)(z1’+y41’+y41f’)
(z2+y51+y51f’)(z2’+y51+y51f)(z2+y51’+y51f)(z2’+y51’+y51f’)(z1+z2).
Next, the SAT formulation is solved to determine the test vector for the
given gate replacement fault. The test vector “0110” detects the above-mentioned gate replacement fault.
Figure 6-2: (a) An example of a correct reversible circuit (b) the
gate replacement and the affected area or fault cone.
6.3.3 Reversible Miter as a test pattern generator
In the second method proposed here, we take advantage of the reversible
miter, usually used for equivalence checking of reversible circuits. The
rationale is that if the fault-free and faulty circuits are
not equivalent, then the counterexample is a vector for which the two
circuits behave differently.
Definition 6.2: A reversible miter circuit of two reversible circuits, $C_1$
and $C_2$, is constructed as one of $C_1 C_2^{-1}$, $C_2^{-1} C_1$, $C_2 C_1^{-1}$
and $C_1^{-1} C_2$. The two circuits $C_1$ and $C_2$
are functionally equivalent if and only if all of their reversible miters
implement the identity transformation [94].
To compare two circuits with a gate or wire replacement error, we can use such a
miter. The main advantage of a reversible miter is that the miter
circuit can be simplified by applying a series of gate cancellations based
on the following properties of reversible circuits. The first reduction rule states that
any two adjacent identical reversible gates can be canceled out, Fig. 6-3. This is
in contrast to conventional circuits, where two circuits with identical gate
sequences cannot be canceled due to the observability don’t cares introduced by
them. In reversible circuits, a miter constructed from an arbitrary circuit $C_1$ and its
inverse circuit $C_1^{-1}$ reduces to an empty circuit by gate cancellation [94, 95].
A second rule, the swapping rule, which can facilitate simplification, states that any two
adjacent gates can be swapped if they do not act on the same bit lines. Two adjacent
NOT, CNOT or Toffoli gates can be swapped if the control bit of one gate is not
the target bit of the other gate.
To facilitate maximum gate cancellation in a miter circuit with $l$ levels and a fault
at level $l_f$, the miter circuit for the correct ($C$) and faulty ($C_f$) circuits is
constructed according to the position of the fault. If the inverse circuit is placed
second, as in $C C_f^{-1}$, the $l - l_f$ gate pairs after the fault location cancel;
otherwise, with the inverse circuit placed first, as in $C^{-1} C_f$, the $l_f - 1$ gate
pairs before the fault location cancel. However, in our testing
scheme we always construct $C C_f^{-1}$ or $C_f C^{-1}$, since their solutions directly
provide the satisfying input assignments. On the other hand, a miter with the inverse
circuit in its first part, such as $C^{-1} C_f$, finds the satisfying assignment of the
outputs corresponding to the input test vectors that detect the fault. This is because
in an inverse circuit the gates are placed in reverse order, i.e., the last gate is placed
at the first position and so on, to retrieve the input value from any particular output
combination. Thus, if we use $C^{-1} C_f$ as the miter circuit, an additional backward
propagation of the output vector is needed to find the proper input assignments.
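The construction and simplification of the miter can be sketched as follows. This is a minimal illustration that assumes all gates are self-inverse (as NOT, CNOT, Toffoli and Fredkin gates are), so that the inverse circuit is simply the gate list in reverse order; it applies only the adjacent-cancellation rule, not the swapping rule. The six-gate circuit and the function names are hypothetical.

```python
def inverse(circuit):
    """Inverse of a circuit of self-inverse gates: the same gates in reverse order."""
    return circuit[::-1]

def cancel(gates):
    """Repeatedly remove adjacent identical (self-inverse) gate pairs."""
    kept = []
    for gate in gates:
        if kept and kept[-1] == gate:
            kept.pop()          # the pair cancels to the identity
        else:
            kept.append(gate)
    return kept

def reversible_miter(good, faulty, inverse_first=False):
    """Simplified miter: C Cf^-1 by default, C^-1 Cf when inverse_first is set."""
    if inverse_first:
        return cancel(inverse(good) + faulty)
    return cancel(good + inverse(faulty))

def inject(circuit, level, gate):
    """Copy of `circuit` with the gate at 1-based `level` replaced by `gate`."""
    return circuit[:level - 1] + [gate] + circuit[level:]

# Hypothetical circuit with l = 6 distinct Toffoli gates on four lines.
good = [("toffoli", lines) for lines in
        [(0, 1, 2), (1, 2, 3), (0, 2, 3), (0, 1, 3), (1, 3, 0), (2, 3, 1)]]

def miter_sizes(inverse_first):
    """Remaining gate count for a Fredkin replacement at each level lf = 1..6."""
    return [len(reversible_miter(good,
                                 inject(good, lf, ("fredkin", good[lf - 1][1])),
                                 inverse_first))
            for lf in range(1, 7)]

sizes_inverse_second = miter_sizes(False)   # C Cf^-1
sizes_inverse_first = miter_sizes(True)     # C^-1 Cf
```

With the inverse placed second, the simplified miter grows with the fault level; with the inverse placed first, it shrinks, which is the trade-off the text describes.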
Figure 6-3. The explanation of the cancellation
6.3.4 SAT formulation of reversible miter
Once a simplified reversible miter is obtained, we construct a CNF-SAT formula in
the same way as in our first, conventional method. We start with the first gate of
the reversible miter and add a set of clauses for each gate. Next, we introduce new
formulas for each bit line that has a target bit on it. These are satisfied by
variable combinations where some circuit output differs from the respective input,
proving that the miter contains gates that are not identical and therefore do not
cancel out to yield an empty reversible miter of the faulty and fault-free
circuits. We assume that fi is 1 only when input xi ≠ output yi. This is
expressed as the XOR function fi = xi ⨁ yi, and the clauses are
(fi+xi+yi’)(fi’+xi+yi)(fi+xi’+yi)(fi’+xi’+yi’). Finally, the clause (f1+f2+f3+…+fn) is
added, where n is the number of bit lines having any target bit on them. To find a
proper test vector, at least one of the variables fi must be 1. The SAT
formula constructed here is satisfied by those input combinations for which
the corresponding outputs of the two circuits produce different values.
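The difference clauses described above can be generated mechanically. The sketch below writes the four XOR clauses in the same order as in the text, using signed integers for literals (a negative number denotes a complemented variable). The variable numbering, the example line variables and the function names are assumptions made only for illustration.

```python
from itertools import count, product

def xor_clauses(f, x, y):
    """(f+x+y')(f'+x+y)(f+x'+y)(f'+x'+y'), i.e. f = x XOR y."""
    return [[f, x, -y], [-f, x, y], [f, -x, y], [-f, -x, -y]]

def difference_clauses(inputs, outputs, fresh):
    """XOR clauses for every line whose output variable differs from its input
    variable, plus the final clause (f1 + f2 + ... + fn)."""
    clauses, flags = [], []
    for x, y in zip(inputs, outputs):
        if x == y:
            continue  # no target on this line: the output is the input itself
        f = next(fresh)
        flags.append(f)
        clauses += xor_clauses(f, x, y)
    clauses.append(flags)
    return clauses, flags

def holds(clauses, value):
    """True if the assignment `value` (variable -> 0/1) satisfies all clauses."""
    return all(any(value[abs(l)] == (l > 0) for l in c) for c in clauses)

# Hypothetical miter on three lines: line 1 has no target (variable 1 in and out),
# lines 2 and 3 end in the output variables 4 and 5.
clauses, flags = difference_clauses([1, 2, 3], [1, 4, 5], count(6))
```

Here `flags` holds the variables f1 and f2 for the two lines with targets, and the last clause in `clauses` requires at least one of them to be 1.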
Example 2: As the Toffoli gate is universal and also the most frequently
used, in this example we explain the gate replacement fault in a Toffoli
network. However, the gate type of an error is irrelevant and does not affect
the method. Fig. 6-4(a) shows the correct circuit and Fig. 6-4(b) the faulty
circuit. We construct the CNF formulas for the reversible miter
and add the consistency formula, which reflects that a gate
replacement fault (which can also be viewed as the appearance of a cross-point fault)
is detectable only when the XOR of the inputs and the main outputs results in
“1”. By solving the SAT formulation with minisat2 [93], a vector
is extracted that detects the injected gate
replacement fault. The simulation of the corresponding vector is shown in
Fig. 6-5. From the output vector we can see that the faulty and fault-free outputs
differ on line 3.
Figure 6-4: (a) A correct Toffoli network, (b) the corresponding faulty circuit
of the Toffoli network, (c) the reduced reversible miter circuit to apply CNFs.
Therefore, upon solving the SAT formulation, we obtain a test vector that detects the
fault. It is worth noting that if the conventional miter circuit is applied, 6 gates
must be considered in CNF form, whereas this method needs only 4
gates, making the SAT problem easier and faster to solve.
Figure 6-5: The proper vector to detect the gate replacement fault.
6.3.5 Proposed Reversible Test Miter
Finding test vectors using the conventional fault cone method or the reversible miter
(the methods presented in the previous sections) increases the complexity in
terms of the number of clauses as the fault location changes. For example, in the
conventional method, if the fault occurs in the first segment of the circuit, then the
fault cone includes the gates of the circuit starting from the fault location and ending
at the circuit outputs. This increases the number of gates in the SAT formulation. On
the other hand, in the reversible miter, if the fault is located in the first segment of the
circuit, then the gate cancellation of the reversible miter $C C_f^{-1}$ results in fewer
gates and, in consequence, fewer clauses. However, the
worst case occurs if the fault affects gates in the last circuit level. In that case no
gate cancellation is possible, and the reversible miter is twice the size of the
original circuit, i.e., the number of clauses in the SAT formulation doubles.
Solving such a formula requires an excessive run time.
Here we propose a new approach to detect gate and wire replacement faults in
reversible circuits. Unlike the SAT-based ATPG proposed in [48, 89], which tests for
missing and additional control faults or missing gate faults with a known fault
constraint, gate and wire replacement faults do not have a predefined
input assignment. Different gate replacements require different error defining
vectors. For example, if, in the process of design or synthesis to obtain an optimal
alternative realization of a circuit, a Peres gate is replaced by a Fredkin gate, or in
arithmetic design a TR gate is replaced by a BJN gate, then we first need to find the
fault constraint (110, 111), which is then applied to the SAT formulation of the
overall circuit to generate a test vector. Hence we propose a new testing
approach, which first determines the vectors for which the fault-free and faulty gates
behave differently; these vectors are then used as a known constraint in the SAT
formulation for backtracking.
Our approach follows the conventional testing scheme and divides the circuit into
three segments: the segment from the primary inputs to the fault location (fault
excitation), the fault-free/faulty gate, and the fault propagation circuit, Fig. 6-6.
From the first segment we need to generate an input vector exciting the fault. The
second segment identifies the possible patterns for which the outputs of the two gates
differ; we call this a reversible test miter. The final segment propagates the faulty
behavior to the outputs. This part is simplified by the full observability of reversible
circuits [41]: if the original and faulty gates generate two different patterns at their
outputs, then the outputs of the overall circuits will also differ. Hence, we can neglect
the fault propagation part. Our reversible test miter generates a vector that can be
used as a learned clause in solving the formula of the circuit’s fault excitation
segment. Thus, with backward propagation from the fault location, we can obtain an
input test vector.
Figure 6-6. General block diagram representing fault
excitation and propagation behavior for faulty and fault-free gates
Reversible Test Miter
The basic reversible test miter module is constructed by cascading the original
gate with its faulty replacement, Fig. 6-7(b). The rationale behind constructing
such a test miter comes from conventional equivalence checking. In that
scheme the outputs of both gates are compared by XORing them. The OR gate
defines a constant-0 function, i.e., the output CM is 0 only when all the outputs
of the two gates are equal, Fig. 6-7(a). However, in reversible logic, if two identical
gates are cascaded, then the outputs of the cascade are the same as its inputs.
Hence, we can assume that two gates are equivalent only when their cascaded
circuit behaves as a set of wires from input to output. Thus, if the same constraint
satisfies both the conventional scheme and its reversible counterpart, then we can
establish the effectiveness of the reversible test miter. This is illustrated in the
following example:
Example 3: The circuit of Fig. 6-7(a) realizes the following logical
equation:
$CM = (a \oplus a) + (b \oplus (a'b + ac)) + ((c \oplus ab) \oplus (a'c + ab))$
The output CM is 0, which verifies the equivalence, for every pattern with a=0
(and, as Table 6-2 shows, also for abc=100). Likewise, the outputs of the reversible
test miter in Fig. 6-7(b) are the same as the inputs for every pattern with a=0 (line 1
is always the same; on line 2, output2 = b = input2; and on line 3, output3 = c =
input3). Thus the reversible test miter works well in finding equivalence.
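[Figure 6-7 content: (a) the outputs a, b, c⊕ab of the Toffoli gate and a, a'b + ac, a'c + ab of the Fredkin gate are compared to produce CM; (b) the Toffoli-Fredkin cascade with outputs a, a'b + a(c⊕ab), a'(c⊕ab) + ab.]
Figure 6-7. Reversible Test Miter (a) Schematic of classical equivalent (b)
proposed reversible form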
We employ this reversible test miter to find the test vector for detecting
irredundant gate replacement faults. As before, we first derive the CNF
formulation of the test miter by adding the clauses of both gates. Next we add the
identity check clauses, which are satisfied where input patterns differ from output
ones. Finally the clause for the output goal is added. Once the CNF formulation is
complete, we solve for any satisfying assignment using a state-of-the-art SAT solver
(minisat2 in our case). For example, in Fig. 6-7 we construct a reversible test miter of
a Toffoli gate (the original gate in the fault-free circuit) and the replacing Fredkin
gate (faulty circuit). The truth table of each individual gate and of the combined test
miter circuit is shown in Table 6-2. From this table, we see that for the last three input
patterns, the output patterns of the two gates differ. The test miter outputs also differ
from the inputs for these three input patterns only. Thus, if we find a CNF expression
with the constraint that an input bit is not identical to the corresponding output bit
(i.e., their XOR is 1) for the test miter, then the satisfying assignments represent the
error defining vectors.
The CNF formulation of the Toffoli-Fredkin (T-F) test miter of Fig. 6-7(b) is:
The satisfying assignments from the SAT solver minisat2 are 111, 101 and 110,
which are in accordance with the truth table, Table 6-2.
Table 6-2 Toffoli-Fredkin functionality table
Inputs    Toffoli    Fredkin    Test miter (T-F)
a b c     p q r      p q r      p q r
0 0 0     0 0 0      0 0 0      0 0 0
0 0 1     0 0 1      0 0 1      0 0 1
0 1 0     0 1 0      0 1 0      0 1 0
0 1 1     0 1 1      0 1 1      0 1 1
1 0 0     1 0 0      1 0 0      1 0 0
1 0 1     1 0 1      1 1 0      1 1 0
1 1 0     1 1 1      1 0 1      1 1 1
1 1 1     1 1 0      1 1 1      1 0 1
In the next step, we use this vector to find the primary input test vector by backward
propagation from the fault location. The CNF formula of the circuit levels from the first to
the fault location is constructed, and we add the constraints derived from the vector
of the test miter. The satisfying assignment of the formula is our desired test
vector.
Example 4: Consider the benchmark circuit 4 mod 5 in Fig. 6-8. If the third gate
(Toffoli) is erroneously replaced by a Fredkin gate, then the test miter finds
the vector l3l4l5=111. The CNF formulation of this part of the circuit is,
The satisfying assignment for this formula is a=1, b=1, c=0 and d=0.
Figure 6-8: Fault location of benchmark circuit 4 mod5
To verify that the obtained test vector detects the fault, we present the bit
propagation in Fig. 6-9. Fig. 6-9(a) shows the original fault-free outputs for the
corresponding vector, while Fig. 6-9(b) shows the faulty outputs for the
replaced gate. The complete algorithm is presented in Algorithm 1.
1. Construct reversible test miter circuit go-gf = {CNOT, Toffoli, Fredkin, Peres}
   and obtain CNF formula.
2. Use SAT solver to find error defining vector (edv).
3. For gate level i = 1, 2, …, k, if i=1 return test_vector = edv, else
4. Construct CNF for sub-circuit from primary input to fault level.
5. For any constant input add corresponding clauses.
6. Add clauses for error defining vector.
7. Use SAT solver to find satisfying input assignment, i.e. the desired test vector.
Algorithm 1
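The steps of the algorithm can be sketched in executable form. The code below is a simplified illustration, not the SAT-based implementation of the thesis: the error defining vectors are found by enumerating the test miter, and the backward propagation is done by applying the gates of the excitation segment in reverse order, which assumes that all gates are self-inverse (Toffoli and Fredkin are). The circuit, the fault and all names are hypothetical.

```python
from itertools import product

def toffoli(bits, c1, c2, t):
    """Flip line t when both control lines carry 1."""
    out = list(bits)
    out[t] ^= bits[c1] & bits[c2]
    return tuple(out)

def fredkin(bits, c, s1, s2):
    """Swap lines s1 and s2 when the control line carries 1."""
    out = list(bits)
    if bits[c]:
        out[s1], out[s2] = bits[s2], bits[s1]
    return tuple(out)

def apply_gate(gate, bits):
    func, lines = gate
    return func(bits, *lines)

def run(circuit, bits):
    for gate in circuit:
        bits = apply_gate(gate, bits)
    return bits

def find_test_vector(circuit, fault_level, faulty_gate, n, constants=None):
    """fault_level is 1-based; constants maps a line index to its fixed value."""
    constants = constants or {}
    original = circuit[fault_level - 1]
    # Steps 1-2: test miter (original gate cascaded with the faulty gate);
    # an error defining vector is a state the cascade does not map to itself.
    edvs = [s for s in product((0, 1), repeat=n)
            if apply_gate(faulty_gate, apply_gate(original, s)) != s]
    # Steps 3-7: backward propagation through the fault excitation segment,
    # keeping only primary inputs that agree with the constant inputs.
    for edv in edvs:
        bits = edv
        for gate in reversed(circuit[:fault_level - 1]):
            bits = apply_gate(gate, bits)   # self-inverse gates undo themselves
        if all(bits[line] == value for line, value in constants.items()):
            return bits
    return None  # no applicable test vector: the fault is untestable

circuit = [(toffoli, (0, 1, 2)), (toffoli, (1, 2, 3)), (toffoli, (0, 2, 3))]
faulty_gate = (fredkin, (0, 2, 3))          # replaces the third gate
faulty = circuit[:2] + [faulty_gate]

vector = find_test_vector(circuit, 3, faulty_gate, 4)
vector_const = find_test_vector(circuit, 3, faulty_gate, 4, constants={3: 0})
untestable = find_test_vector(circuit, 3, faulty_gate, 4, constants={0: 0})
```

When the fault is at the first level, the excitation segment is empty and the error defining vector itself is returned, as in step 3 of the algorithm.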
Figure 6-9: Fault observation of benchmark circuit 4 mod5
As discussed earlier, in the conventional method only the fault-free circuit and
the fault cone, i.e., the part of the circuit following the fault location, are
considered. Thus, if a circuit has $l$ gates and the gate replacement fault occurs at
gate $l_f$, then the total number of gates involved in the SAT formulation of this
method is $2l - l_f + 1$. On the other hand, in the reversible miter method, the number
of gates involved is $2 l_f$, since the gate cancellation in the miter circuit
$C C_f^{-1}$ occurs on the gates after the fault location, and the inverse of the first
segment remains together with the faulty gate. Finally, in the reversible test miter
method the test miter is constructed from the fault-free and faulty gates, and backward
propagation involves $l_f - 1$ gates. Thus in total this method considers only
$l_f + 1$ gates. Table 6-3 shows the number of gates considered in each method for
different fault locations $l_f$ of an arbitrary circuit with $l = 6$ gates. Note that the
reversible test miter involves a smaller number of gates in the SAT formulation.
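The three gate counts given above can be evaluated directly. The short sketch below tabulates them for a circuit with l = 6 gates; the function name and dictionary keys are illustrative only.

```python
def gates_in_sat_formulation(l, lf):
    """Gates entering the SAT formulation for a fault at gate lf of an l-gate circuit."""
    return {
        "conventional": 2 * l - lf + 1,      # fault-free circuit plus fault cone
        "reversible_miter": 2 * lf,          # C Cf^-1 after gate cancellation
        "reversible_test_miter": lf + 1,     # test miter plus lf - 1 excitation gates
    }

l = 6
counts = {lf: gates_in_sat_formulation(l, lf) for lf in range(1, l + 1)}
for lf, row in counts.items():
    print(lf, row["conventional"], row["reversible_miter"], row["reversible_test_miter"])
```

For every fault location the reversible test miter count is no larger than either of the other two, with ties at the first level (against the reversible miter) and at the last level (against the conventional method).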
Table 6-3: Number of gates for different fault locations
Methods                  Fault location, $l_f$
                          1    2    3    4    5    6
Conventional             12   11   10    9    8    7
Reversible Miter          2    4    6    8   10   12
Reversible Test Miter     2    3    4    5    6    7
6.4 Wire Replacement Fault Testing
We model the wire replacement fault as the swapping of the control lines with the
target line of a gate, or the erroneous placement of target or control points. For
example, in Fig. 6-10(a), gate R.G 4 has control points at x14 and x24
and the target at x34. If we erroneously connect the target at x14 and a control at
x34, Fig. 6-10(b), then the circuit outputs will differ from the original outputs.
Figure 6-10: (a) The schematic of correct circuit, (b) with the wire replacement
fault, (c) the reversible Toffoli fault free network, (d) the faulty Toffoli network.
Using the same methods as for gate replacement faults, we perform testing for wire
replacement faults. In the first proposed method, the clauses of the original circuit and
of the fault cone are considered in the SAT formulation. In the second, reversible
miter-based method, a circuit is constructed by cascading the original circuit with
the inverse of the faulty version, with gate cancellations where applicable. In the
final proposed method, the reversible test miter, the original gate and the
restructured gate resulting from the swapped lines are used to build the miter circuit. Then, by
solving the CNF formula, we find the error defining vector, which is used to find the
primary input assignment by backward propagation. For example, in Fig. 6-10(a)
the test miter of the fault-free and faulty gate generates the error defining
vector abc = 110, from which we find the input test vector abcd = 1100.
The fault-free outputs of the circuit (Fig. 6-10(c)) are 1111, while the outputs of the
faulty version (Fig. 6-10(d)) are 1100. Thus the wire replacement fault is detected.
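The gate-level step of the test miter can be illustrated with a small sketch. It is only an illustration under stated assumptions: the SAT call to minisat2 is replaced by a brute-force enumeration, the lines a, b, c are taken to be the three lines touched by gate R.G4, and the helper names `toffoli` and `error_defining_vectors` are hypothetical.

```python
from itertools import product

def toffoli(bits, controls, target):
    """Apply a multiple-control Toffoli gate to a tuple of bits."""
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)

def error_defining_vectors(good_gate, bad_gate, width):
    """Inputs on which the two gates disagree (brute-force stand-in for the SAT solver)."""
    return [v for v in product((0, 1), repeat=width)
            if good_gate(v) != bad_gate(v)]

# Lines a, b, c are indexed 0, 1, 2 (assumed mapping to the gate's three lines).
def good(v):
    return toffoli(v, controls=(0, 1), target=2)   # controls a, b; target c

def bad(v):
    return toffoli(v, controls=(1, 2), target=0)   # target moved to a, control moved to c

vectors = error_defining_vectors(good, bad, width=3)
print(vectors)
```

The vector abc = 110 quoted in the text is among the assignments on which the two gates disagree; any one of them can serve as the constraint at the fault location for backward propagation.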
6.5 Experimental Results
To verify the proposed methods, we first constructed several reversible test
miters from different combinations of reversible gates, such as Toffoli-Fredkin,
Peres-Fredkin and TR-BJN, and generated the error defining vectors using
minisat2 [93]. Table 6-4 summarizes the test vectors of some reversible test miters
for various gate replacement faults, missing gate faults and cross-point faults.
With these test assignments as error defining vectors, the input test vectors are
easily found. We also notice that in many cases the same vector (for
example 111) defines the faulty behavior for different gate replacements. Note
that the error defining constraints for missing gate and cross-point
disappearance faults agree with the constraints presented in [44, 48].
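The following sketch shows how error defining vectors of this kind can be enumerated for 3-line gates. It assumes the standard definitions of the Toffoli, Fredkin and Peres gates and uses exhaustive enumeration instead of minisat2; the function names are illustrative only.

```python
from itertools import product

def toffoli(a, b, c):
    return (a, b, c ^ (a & b))

def fredkin(a, b, c):
    # Control a swaps b and c.
    return (a, c, b) if a else (a, b, c)

def peres(a, b, c):
    return (a, a ^ b, c ^ (a & b))

def identity(a, b, c):
    # Models a missing gate: the lines pass through unchanged.
    return (a, b, c)

def miter(g1, g2):
    """All input vectors on which the two gates produce different outputs."""
    return [v for v in product((0, 1), repeat=3) if g1(*v) != g2(*v)]

replacement = {
    "Toffoli-Peres": miter(toffoli, peres),
    "Fredkin-Peres": miter(fredkin, peres),
    "Fredkin-Toffoli": miter(fredkin, toffoli),
}
missing = {
    "Toffoli": miter(toffoli, identity),
    "Peres": miter(peres, identity),
    "Fredkin": miter(fredkin, identity),
}
# Vectors that expose every one of the three replacements at once.
common = set.intersection(*(set(v) for v in replacement.values()))
print(replacement, missing, common)
```

Under these gate definitions the vector 111 distinguishes all three gate pairs, which is consistent with the observation that one vector can define the faulty behavior of several replacements.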
Next, we compare the three proposed methods by inserting arbitrary gate
replacement faults at random locations in circuits and observing the number of
clauses, the number of conflicts encountered and the test vector generation time. For some
reversible benchmark circuits [82], we perform the SAT formulation and solve for
test vectors using minisat2. Table 6-5 presents the number of clauses in the
SAT formulation and the time to detect an arbitrary gate replacement fault for each
method. Note that for the reversible test miter the time is the total time of
backward propagation and error defining vector generation. We could not
compare our methods with other testing schemes, since they do not address
the gate and wire replacement faults considered here.
Table 6-4: Test miter vectors for various faults

Fault category               Gate involved      Test vector
Gate replacement             CNOT-NOT           a=0, b=0/1
                             Toffoli-CNOT       a=0, b=1, c=1/0
                             Toffoli-Peres      a=1, b=1, c=1
                             Fredkin-Peres      a=1, b=1, c=1
                             Fredkin-Toffoli    a=1, b=1, c=1
                             TR-BJN             a=1, b=1, c=1/0
Missing gate                 CNOT               a=1, b=1
                             Toffoli            a=1, b=1, c=1/0
                             Peres              a=1, b=1, c=1/0
                             Fredkin            a=1, b=1, c=0 or a=1, b=0, c=1
Cross-point disappearance    Any gate           affected point = 0, other controls = 1
From the result summary of each solution, we observe that the reversible test miter
is better than the other two methods in number of clauses, conflicts, decisions and
memory used. For example, for hwb4 the conventional fault cone method
has 9 conflicts and uses 78.19 MB of memory, the reversible miter has 13 conflicts
with the same memory use, and the reversible test miter with backward propagation has
0 conflicts with 77.94 MB of memory.
Table 6-5: Comparison of three proposed methods

Circuit     l     Fault           Conventional       Reversible Miter    Reversible Test Miter
                  location, lf    clauses   time     clauses   time      clauses   time
ham3        5     1               59        0.31     23        0.05      23        0.05
4gt11       4     3               39        0.06     29        0.05      25        0.05
4mod5       5     5               32        0.06     48        0.06      32        0.04
mod5mils    5     4               43        0.066    47        0.061     35        0.053
ALU_v1      7     4               63        0.063    55        0.058     35        0.056
2 of 5      12    9               97        0.150    93        0.100     53        0.050
hwb4        12    10              83        0.06     119       0.06      71        0.050
6.6 Testability of Reversible Arithmetic Circuit
In a preliminary study, we investigate a special feature of reversible
arithmetic circuits and present it in this section. In particular, we
present the testability of modular designs, such as the reversible
adder/subtractor, for a very common fault in reversible circuits: the missing
control point [43]. The missing control point fault is an important fault model for
reversible circuits from a technological viewpoint. By definition, a cross-point or
missing control fault is the omission of a control point from a reversible gate,
which can change the functionality of the circuit. In the context of ion-trap
quantum computing, this is similar to the partial missing-gate fault model, which
results from partially misaligned or mistuned gate pulses [43]. It turns a k-CNOT
gate into a k'-CNOT gate where k' < k. Thus we need a testing measure for
detecting such faults in any reversible design. Testing of this type of fault is
addressed in [44, 48].
In this thesis, we consider the testing of a single missing control fault in a modular
reversible design. As an example, we consider the RCAS circuit
proposed in Chapter 5. From the internal structure of the
RCAS we can see that there are 7 control points where the fault can occur. In the
first gate the fault can occur at line AS (AS1); in the second gate, at the control on line X
(Xg2) or on line Y (Yg2); in the third gate, at the control on line X (Xg3); in the 4th gate,
on line Y (Yg4) or line Cin (Cg4). The last fault location is in the 5th gate, on line Y (Yg5).
Note that the testing constraint for any missing control point fault is to set the
fault location to '1'. Keeping this in mind, we can derive the faulty output behavior
of a single missing control fault at the various locations. The mapping of inputs to
outputs of the original RCAS is (AS, X, Y, Cin, '0') → (AS, X, G=AS⊕Y⊕X,
S=AS⊕Y⊕X⊕Cin, Co=X(AS⊕Y)⊕Cin(AS⊕Y⊕X)).
The fault effects for the various control points can be expressed as follows, where ¬ denotes complement:
a. If the control at AS is missing, then to test the fault we assume AS=1, so that Y is always complemented. As a
consequence the output logical expressions are
AS, X, G=¬Y⊕X, S=¬Y⊕X⊕Cin, Co=X(¬Y)⊕Cin(¬Y⊕X).
b. Similarly, for fault Xg2 the outputs are
AS, X, G=AS⊕Y⊕X, S=AS⊕Y⊕X⊕Cin, Co=(AS⊕Y)⊕Cin(AS⊕Y⊕X).
c. A missing control at Yg2 results in the outputs
AS, X, G=AS⊕Y⊕X, S=AS⊕Y⊕X⊕Cin, Co=X⊕Cin(AS⊕Y⊕X).
d. The outputs with a missing control at Xg3 are
AS, X, G=¬(AS⊕Y), S=¬(AS⊕Y)⊕Cin, Co=X(AS⊕Y)⊕Cin(¬(AS⊕Y)).
e. The faulty behavior for a missing control at Yg4 is
AS, X, G=AS⊕Y⊕X, S=AS⊕Y⊕X⊕Cin, Co=X(AS⊕Y)⊕Cin.
f. For the fault at Cg4 the outputs are
AS, X, G=AS⊕Y⊕X, S=AS⊕Y⊕X⊕Cin, Co=X(AS⊕Y)⊕(AS⊕Y⊕X).
g. The fault at Yg5 causes the outputs to change to
AS, X, G=AS⊕Y⊕X, S=¬Cin, Co=X(AS⊕Y)⊕Cin(AS⊕Y⊕X).
Table 6-6: Faulty behavior of missing control point at different locations of RCAS

Inputs        Outputs (G S Co)
AS X Y Cin    Fault free   Fault AS1   Fault Xg2   Fault Yg2   Fault Xg3   Fault Yg4   Fault Cg4   Fault Yg5
0000          000          110         000         000         110         000         000         010
0001          010          101         010         010         101         011         010         000
0010          110          000         111         110         000         110         111         110
0011          101          010         100         101         010         101         101         101
0100          110          001         110         111         110         110         111         110
0101          101          011         101         100         101         101         101         101
0110          001          110         001         001         001         001         001         011
0111          011          101         011         011         011         010         011         001
1000          110          110         111         110         000         110         111         110
1001          101          101         100         101         010         101         101         101
1010          000          000         000         000         110         000         000         010
1011          010          010         010         010         101         011         010         000
1100          001          001         001         001         001         001         001         011
1101          011          011         011         011         011         010         011         001
1110          110          110         110         111         110         110         111         110
1111          101          101         101         100         101         101         101         101
Since the first two outputs, AS and X, are garbage outputs that copy the inputs and
remain unchanged for any control fault, we consider only the three outputs G, S and Co
in testing. To identify proper test vectors, we generate the function table of the
fault-free outputs and the faulty outputs for the various faults, as shown in Table 6-6.
Note that the constant input '0' is not shown in the table. In Table 6-6, we highlighted
the outputs of the RCAS block in different colors for the missing control faults
detected by an input vector. For example, the input vector 0010 detects the faults at
AS1, Xg2, Xg3 and Cg4; hence they are all highlighted in red. From the table, we can see
that the three input test vectors (AS X Y Cin) 0100, 0111 and 1000 detect all missing
control faults of our RCAS: test vector 0100 detects faults AS1, Yg2 and Cg4;
vector 0111 detects Yg4 and Yg5 as well as AS1; and vector 1000 detects Xg2 and
Xg3 in addition to Cg4.
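The coverage of the three test vectors can be cross-checked with a short simulation. This is a sketch under an assumption: the five-gate structure below (CNOT, Toffoli, CNOT, Toffoli, CNOT) is inferred from the fault expressions a to g and from Table 6-6, not copied from the RCAS schematic of Chapter 5. A missing control is modeled as a control that is always active.

```python
# Control points in the order used by Table 6-6.
FAULTS = ["AS1", "Xg2", "Yg2", "Xg3", "Yg4", "Cg4", "Yg5"]

def rcas(as_, x, y, cin, fault=None):
    """One RCAS block (assumed gate order); returns (G, S, Co)."""
    def ctl(name, value):
        # A missing control point behaves like a constant 1.
        return 1 if fault == name else value
    c = 0                                    # constant-0 input line
    y ^= ctl("AS1", as_)                     # gate 1: CNOT, control AS, target Y
    c ^= ctl("Xg2", x) & ctl("Yg2", y)       # gate 2: Toffoli, controls X and Y
    y ^= ctl("Xg3", x)                       # gate 3: CNOT, control X, target Y
    c ^= ctl("Yg4", y) & ctl("Cg4", cin)     # gate 4: Toffoli, controls Y and Cin
    cin ^= ctl("Yg5", y)                     # gate 5: CNOT, control Y, target Cin
    return (y, cin, c)

def detected(vector):
    """Faults whose outputs differ from the fault-free outputs for this vector."""
    good = rcas(*vector)
    return {f for f in FAULTS if rcas(*vector, fault=f) != good}

tests = [(0, 1, 0, 0), (0, 1, 1, 1), (1, 0, 0, 0)]   # AS, X, Y, Cin
coverage = {v: detected(v) for v in tests}
all_covered = set().union(*coverage.values()) == set(FAULTS)
print(coverage, all_covered)
```

With this model, the three vectors detect the same fault groups as listed above and together cover all seven control points.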
6.6.1 Identifying Fault Location
From Table 6-6, we notice that each fault in the RCAS creates a different sequence of
outputs in response to an input sequence. For example, the input sequence 0100
→ 0111 → 1000 generates the fault-free output sequence 110 → 011 → 110, while fault
AS1 outputs 001 → 101 → 110, fault Xg2 outputs 110 → 011 → 111, fault Yg2 outputs 111
→ 011 → 110, fault Xg3 outputs 110 → 011 → 000, fault Yg4 outputs 110 → 010 → 110, fault Cg4
outputs 111 → 011 → 111 and fault Yg5 outputs 110 → 001 → 110.
We use these output sequences to pinpoint the exact location of the
fault. Thus, if we store these output sequences, testing any RCAS becomes
easier, with proper detection of the fault location.
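Storing the output sequences amounts to building a fault dictionary. The sketch below illustrates the idea; the RCAS gate structure is again an assumption inferred from the fault expressions, and the names `signature` and `locate` are hypothetical.

```python
FAULTS = ["AS1", "Xg2", "Yg2", "Xg3", "Yg4", "Cg4", "Yg5"]
SEQUENCE = [(0, 1, 0, 0), (0, 1, 1, 1), (1, 0, 0, 0)]   # AS, X, Y, Cin

def rcas(as_, x, y, cin, fault=None):
    """One RCAS block (assumed gate order); returns (G, S, Co)."""
    def ctl(name, value):
        return 1 if fault == name else value   # missing control acts as constant 1
    c = 0
    y ^= ctl("AS1", as_)
    c ^= ctl("Xg2", x) & ctl("Yg2", y)
    y ^= ctl("Xg3", x)
    c ^= ctl("Yg4", y) & ctl("Cg4", cin)
    cin ^= ctl("Yg5", y)
    return (y, cin, c)

def signature(fault=None):
    """Output sequence produced by the three test vectors."""
    return tuple(rcas(*v, fault=fault) for v in SEQUENCE)

# Fault dictionary: observed output sequence -> fault location.
dictionary = {signature(f): f for f in FAULTS}
dictionary[signature()] = "fault-free"

def locate(observed):
    return dictionary.get(tuple(observed), "unknown")

print(locate([(1, 1, 0), (0, 1, 0), (1, 1, 0)]))
```

Because the eight sequences are pairwise distinct, a single lookup is enough to name the faulty control point.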
6.6.2 Testing n-bit controlled Adder/Subtractor
We can construct an n-bit reversible adder/subtractor by cascading n RCAS
blocks, where the carry output ripples through the stages. Thus the effect of a fault at
any stage can propagate through the following stages. Testing of reversible circuits is
not a global problem, i.e., any combination of fully testable modules is fully
testable, sometimes even by the same patterns. Classical circuits built from
hierarchical modules (such as a half adder cascaded with a full adder) likewise
do not face the globality problem. However, when re-convergent fan-out
occurs, testing becomes a global problem, i.e., cascading fully testable blocks does
not ensure testability of the complete circuit. In that case, we need to generate test
vectors for fault excitation and propagation considering the full circuit structure
instead of individual blocks. In reversible circuits, however, the fan-out
branches are flattened and all reversible gates are cascaded. In fact, all signals are transparent
through the reversible mapping. So a fault at any stage of the circuit can be
excited by backward propagation from the fault location, setting the value opposite to
the fault, and there remains at least one dedicated path for fault observation;
hence the fault is testable.
In our cascaded RCAS design, we observe the same testability feature as for
general reversible circuits. In particular, the three test vectors that test all
missing control point faults in the 1-bit RCAS module can be used to test an n-bit
design. The same input values for X and Y are applied to each block, and the
control signal AS is propagated through all blocks. The input Cin is set only at the first
block, such that the carry out of each block provides the same value to Cin of the next
block. In the 4-bit reversible adder/subtractor, Co is observable only at the 4th
stage, while the outputs G and S are available from all stages, in addition to the
garbage output that is a copy of input X. One interesting property of the
three test vectors extracted from Table 6-6 is that, for the fault-free circuit, they
generate the carry out (Co) required by the test vector itself. For example,
the vector 0100 (AS X Y Cin) generates outputs GSCo = 110, where the carry output
is '0', which is the Cin of the next block. So if we set AS, X and Y to 010 in all blocks, the carry
inputs in the fault-free case are '0' for all stages, as required. Similarly, vectors
0111 and 1000 generate carry outputs '1' and '0' respectively. Only a fault can
change Co and hence the output behavior of the following modules.
We exercise the proposed testing scheme for missing control point faults at
various stages in reversible controlled adder/subtractors of different sizes.
Table 6-7 presents the faulty outputs of a 4-bit RCAS. The same three test
vectors, extended to the bit length, detect all the faults. As before, the
input test vector 0100101010 (AS X1 Y1 Cin X2 Y2 X3 Y3 X4 Y4) identifies the faults at
AS1, Yg2 and Cg4 in all stages (highlighted turquoise), test vector 0111111111
tests AS1, Yg4 and Yg5 in all stages (highlighted pink) and test vector
1000000000 detects the Xg2, Xg3 and Cg4 faults of every stage (highlighted green).
However, the output sequence for each fault is different. To identify the block
where the fault occurred, we check the changes in the outputs. Since a fault at
the last stage changes only the corresponding outputs, the first 6 output bits remain
the same as the fault-free outputs. Thus the last three bits alone are sufficient to test the faults
in this module. However, if a fault in another block also changes only these
three bits, we observe the number and bit positions of the output changes to
decide the location of the fault. In general, we can state two observation rules:
If the first 6 bits remain unchanged, the fault is at the 4th stage; if the first 4 bits are the same, the fault is at the
3rd stage; if the first 2 bits are the same, the fault is at the 2nd stage; and if the first 2 bits change,
the fault is at the first RCAS block (faults AS1, Xg3 and Yg5 in Table 6-7).
In the other cases, if the output differs from the fault-free output in the last bit only, the fault
is at the last RCAS block; if the last two bits differ, the fault is at the 3rd stage; if the last 4
bits vary, the fault is at the 2nd stage; and changes in the last 6 bits identify a fault at the first
stage.
Table 6-7: Testing missing control faults for a 4-bit RCAS
Inputs are ordered AS X1 Y1 Cin X2 Y2 X3 Y3 X4 Y4; outputs are ordered G1S1 G2S2 G3S3 G4S4 Co.
* Faults identified by test vector 1 are highlighted in turquoise, by test vector 2 in pink and by test vector 3 in green.

Test vector   Inputs       Fault-free
1             0100101010   111111110
2             0111111111   010101011
3             1000000000   111111110

Fault location at stage 4
Test vector   Fault AS1   Fault Xg2   Fault Yg2   Fault Xg3   Fault Yg4   Fault Cg4   Fault Yg5
1             111111001   111111110   111111111   111111110   111111110   111111111   111111110
2             010101101   010101011   010101011   010101011   010101010   010101011   010101001
3             111111110   111111111   111111110   111111000   111111110   111111111   111111110

Fault location at stage 3
Test vector   Fault AS1   Fault Xg2   Fault Yg2   Fault Xg3   Fault Yg4   Fault Cg4   Fault Yg5
1             111100101   111111110   111111101   111111110   111111110   111111101   111111110
2             010110011   010101011   010101011   010101011   010101001   010101011   010100011
3             111111110   111111101   111111110   111100110   111111110   111111101   111111110

Fault location at stage 2
Test vector   Fault AS1   Fault Xg2   Fault Yg2   Fault Xg3   Fault Yg4   Fault Cg4   Fault Yg5
1             110010101   111111110   111110101   111111110   111111110   111110101   111111110
2             011001011   010101011   010101011   010101011   010100011   010101011   010001011
3             111111110   111110101   111111110   110011110   111111110   111110101   111111110

Fault location at stage 1
Test vector   Fault AS1   Fault Xg2   Fault Yg2   Fault Xg3   Fault Yg4   Fault Cg4   Fault Yg5
1             001010101   111111110   111010101   111111110   111111110   111010101   111111110
2             100101011   010101011   010101011   010101011   010001011   010101011   000101011
3             111111110   111010101   111111110   001111110   111111110   111010101   111111110
Thus we can identify any missing control point fault in a reversible adder/subtractor
circuit of any size. We can employ this methodology to detect missing control
point faults in any modular design of reversible circuits. We need to identify a
minimal test set of the basic module that detects all the faults, using any
state-of-the-art testing scheme. The only constraint is to choose test vectors that generate
the same outputs as the corresponding inputs for the signals that travel through
the cascaded modules.
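The cascaded experiment can be re-created with a small simulation. The sketch assumes the RCAS gate structure inferred from the fault expressions of Section 6.6 and a plain ripple connection of Co to the next Cin; the helper names are illustrative only.

```python
FAULTS = ["AS1", "Xg2", "Yg2", "Xg3", "Yg4", "Cg4", "Yg5"]
VECTORS = [(0, 1, 0, 0), (0, 1, 1, 1), (1, 0, 0, 0)]   # AS, X, Y, Cin per block

def rcas(as_, x, y, cin, fault=None):
    """One RCAS block (assumed gate order); returns (G, S, Co)."""
    def ctl(name, value):
        return 1 if fault == name else value   # missing control acts as constant 1
    c = 0
    y ^= ctl("AS1", as_)
    c ^= ctl("Xg2", x) & ctl("Yg2", y)
    y ^= ctl("Xg3", x)
    c ^= ctl("Yg4", y) & ctl("Cg4", cin)
    cin ^= ctl("Yg5", y)
    return (y, cin, c)

def cascade(vector, n, fault=None):
    """n-bit ripple of RCAS blocks; fault is (stage, name). Returns 'G1S1...GnSnCo'."""
    as_, x, y, carry = vector
    bits = []
    for stage in range(1, n + 1):
        name = fault[1] if fault and fault[0] == stage else None
        g, s, carry = rcas(as_, x, y, carry, fault=name)
        bits += [g, s]
    return "".join(map(str, bits + [carry]))

n = 4
undetected = [(stage, f)
              for stage in range(1, n + 1) for f in FAULTS
              if all(cascade(v, n, (stage, f)) == cascade(v, n) for v in VECTORS)]
print(undetected)                              # faults missed by all three vectors
print(cascade(VECTORS[0], n, (3, "AS1")))      # one entry of Table 6-7
```

In this model no single missing control fault in any of the four stages escapes the three extended vectors, and the simulated words agree with the entries of Table 6-7 that were checked.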
6.7 Conclusion
In this chapter we presented two important perspectives on testing reversible
circuits. First, we introduced methods to test wire and gate replacement
faults. Next, we addressed the testing of reversible arithmetic circuits.
We proposed three testing schemes to detect wire and gate replacement
faults in reversible circuits. Single missing gate or missing control point faults, which
are usually addressed in other works, have well-defined constraints and can easily be
incorporated into a SAT formulation. However, gate and wire replacement faults are
not straightforward to detect. For this purpose, we presented the reversible test
miter and adopted a SAT mechanism to find the proper error defining vector. Then,
backward propagation is applied to obtain the corresponding input test vectors, which exist for the
irredundant faults. Any fault whose formula is unsatisfiable is redundant and can be used
for optimization of the circuit.
Next, we illustrated a method to find a test set that detects missing control points in
reversible modular arithmetic circuits, for example the controlled adder/subtractor.
We observe that only three test vectors are needed to identify such faults in an
RCAS of any size. Further, we have shown that the sequence of outputs
corresponding to the input sequence of test vectors can locate the fault position
in the RCAS module. We demonstrated the fault localization method on the example
of a 4-bit adder/subtractor. To the best of our knowledge, this is the first attempt to
localize faults in reversible circuits with cascaded modules.
Chapter 7 Conclusion
Reversible logic is a possible alternative for low power computing
implementations. Recently, several attempts have been made at hardware
realization of reversible circuits in CMOS technology, trapped-ion quantum
technology, quantum dot cellular automata, and optical and molecular
implementations. These realizations are abstractions of some reversible gates in
real technology; the complete implementation of reversible computing is still a
challenge. However, the development of efficient methodologies for synthesis and
optimization, arithmetic design and testing, which support the reliable
implementation of reversible logic, will ease the realization process. Still, the
biggest research impact is on the synthesis of such circuits. During the course of
this work, our understanding of the synthesis and testing of reversible logic from
the literature led us to incorporate new ideas to obtain better
performance. A synopsis of the proposed approaches is presented here,
followed by the findings of the work. Finally, we present plausible
future work based on integrating recent advances with our own point of view.
7.1 Synopsis
We adopted conventional classical methods in the design,
synthesis and testing of reversible circuits. The rationale is that trying out
different existing options in a new area can provide a direct gateway to new
technologies. As mentioned earlier, our main interest is in logical reversibility, not
physical reversibility or quantum unitary operations. We use the quantum
cost to compare different realizations, since this is the only standardized
parameter commonly used for any reversible gate.
In the synthesis of reversible circuits there are two categories of methods. Methods of the
first category, such as MMD, start from a random reversible specification and generate circuits using
various optimization techniques. These methods are restricted to a small number
of variables. Methods of the second category start from a non-reversible specification and create a
reversible realization with extra ancilla and garbage bits. Our synthesis method
belongs to this category and targets large circuits in particular. Moreover, our approach
deals specifically with the gate-level netlist of a classical network to find its reversible
embedding. We use technology mapping to transform irreversible circuits into
reversible realizations. By mapping an already synthesized classical irreversible
network to its reversible embedding, we avoid generating the function
specification and finding a proper permutation
to obtain a reversible specification. We create a reversible library of Toffoli modules
equivalent to classical two-input gates. Later we propose packing gates into
supercells to minimize garbage outputs, and we analyze the limitations on their sizes.
We apply our approach, with the aid of the Berkeley SIS program, to MCNC
benchmark circuits and compare our results with one of the current efficient
approaches, the BDD-based method, which in many cases handles larger circuits more
efficiently than other existing methods. We also provide a theoretical analysis of the
maximum and minimum bounds on the number of gates and garbage bits. We show
that our method is better than the BDD-based method in terms of quantum cost and
number of gates and, in many cases, number of garbage bits.
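The idea of Toffoli modules for classical two-input gates can be sketched as follows. This is a textbook construction based on the positive-polarity Reed-Muller forms of the gates (for example a OR b = a ⊕ b ⊕ ab), given here as an assumption; the modules of the thesis library may differ in gate count and ancilla use.

```python
from itertools import product

def cnot(s, c, t):
    """CNOT: flip line t when line c is 1."""
    s[t] ^= s[c]

def toffoli(s, c1, c2, t):
    """Toffoli: flip line t when lines c1 and c2 are both 1."""
    s[t] ^= s[c1] & s[c2]

# Each module works on lines [a, b, t]; t is an ancilla with a constant input.
# The inputs a and b are preserved and the result appears on line t.
def and_module(a, b):
    s = [a, b, 0]
    toffoli(s, 0, 1, 2)            # t = ab
    return s

def nand_module(a, b):
    s = [a, b, 1]
    toffoli(s, 0, 1, 2)            # t = 1 xor ab
    return s

def xor_module(a, b):
    s = [a, b, 0]
    cnot(s, 0, 2)
    cnot(s, 1, 2)                  # t = a xor b
    return s

def or_module(a, b):
    s = [a, b, 0]
    cnot(s, 0, 2)
    cnot(s, 1, 2)
    toffoli(s, 0, 1, 2)            # t = a xor b xor ab
    return s

for a, b in product((0, 1), repeat=2):
    print(a, b, and_module(a, b)[2], or_module(a, b)[2], xor_module(a, b)[2])
```

Each module keeps its inputs on the first two lines, so they remain available to later modules or end up as garbage outputs, which is the cost that supercells aim to reduce.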
In the synthesis of a classical network, design errors such as stuck-at
faults, wrong connections or gate replacements may sometimes be present in the circuit without
causing any change in overall functionality. However, their removal often
results in a simpler circuit. If these redundancies remain in the classical
specification, they are eventually transferred to the corresponding reversible
embedding. Hence, our next concern in reversible mapping is the
issue of redundant errors in the irreversible-to-reversible circuit mapping and
their minimization. We investigate the effect of redundant stuck-at faults, which
may propagate to Toffoli modules in direct mapping, and show how they can
simplify the overall circuit. In addition, some redundant Toffoli gates may be
present in the mapped reversible embedding. This can happen if, in the irreversible
specification, several gates share inputs and their reversible modules have
common Toffoli gates with the same constant target input. We present an
algorithm that removes such extra Toffoli gates, reducing garbage outputs and
quantum cost.
In reversible arithmetic design, researchers have mostly proposed
new gates derived from function truth tables and applied them in individual circuits such as
adders, subtractors and multipliers. Our target, in contrast, is an integrated circuit,
specifically a reversible arithmetic logic unit. We try to keep the arithmetic and logic
operations close enough to their classical counterparts to facilitate direct
translation. We propose a basic block that can be used in other, larger circuits
with modifications. First, we develop a reversible controlled adder/subtractor
(RCAS) block, which performs both addition and subtraction of two binary
(signed or unsigned) numbers. Here, we implement subtraction indirectly,
using full adders based on 2's complement computation. To increase the
reliability of the circuit, we include overflow detection, which was not
attempted before.
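The arithmetic behind the controlled adder/subtractor can be sketched functionally. The sum and carry expressions are those of the RCAS mapping given in Section 6.6; setting the first carry input equal to AS and detecting overflow as the XOR of the carries into and out of the most significant bit are assumptions taken from classical practice, not a description of the reversible overflow circuit.

```python
def rcas_bit(as_, x, y, cin):
    """Sum and carry of one RCAS stage: S = AS^Y^X^Cin, Co = X(AS^Y) ^ Cin(AS^Y^X)."""
    y ^= as_
    return x ^ y ^ cin, (x & y) ^ (cin & (x ^ y))

def add_sub(x, y, n, subtract):
    """n-bit X + Y (subtract=0) or X - Y (subtract=1); returns (result, overflow)."""
    as_ = 1 if subtract else 0
    carry = as_                    # assumption: first carry-in equals AS
    carry_into_msb = carry
    result = 0
    for i in range(n):
        carry_into_msb = carry     # after the loop: carry entering the last stage
        s, carry = rcas_bit(as_, (x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    overflow = carry_into_msb ^ carry   # classical signed-overflow rule
    return result, overflow

print(add_sub(5, 3, 4, subtract=1))
print(add_sub(7, 1, 4, subtract=0))
```

The result is the n-bit two's complement word; the overflow flag is meaningful when both operands are interpreted as signed numbers.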
Next, we present a new and efficient design of a reversible comparator. This
design incorporates a basic part of the previous RCAS modules. Reversible
comparators are generally constructed from truth-table based logic, resulting in
significant quantum cost and garbage outputs, and are applicable to unsigned
numbers only. Our comparator uses the
proposed RCAS module with overflow detector; it is more efficient than
existing approaches in terms of quantum cost and garbage outputs and, more
notably, is capable of comparing signed numbers.
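A subtraction-based signed comparison can be illustrated in a few lines. The sketch relies on the classical identity that X < Y holds exactly when the sign bit of X - Y differs from the overflow flag; whether the comparator of the thesis derives its outputs in this way is an assumption.

```python
def rcas_bit(as_, x, y, cin):
    """Sum and carry of one controlled adder/subtractor stage."""
    y ^= as_
    return x ^ y ^ cin, (x & y) ^ (cin & (x ^ y))

def signed_less_than(x, y, n):
    """Compare two n-bit two's complement words by computing X - Y."""
    carry = 1                      # AS = 1 and Cin = 1 select subtraction
    prev = 1
    s = 0
    for i in range(n):
        prev = carry               # carry entering the current bit
        s, carry = rcas_bit(1, (x >> i) & 1, (y >> i) & 1, carry)
    overflow = prev ^ carry        # carry into MSB xor carry out of MSB
    return s ^ overflow            # sign of the true difference

print(signed_less_than(0b1110, 0b0011, 4))   # -2 < 3
```

Because the overflow flag corrects the sign bit, the comparison stays valid for operands of opposite sign, which is where a purely unsigned comparator fails.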
Finally, we present a novel reversible arithmetic logic unit (RALU) based on our
RCAS block. We introduce a new integrated RALU module which performs most
of the classical ALU operations, such as addition, subtraction, AND, OR and XOR,
with fewer control lines. Negated logical functions (NAND, NOR, XNOR)
and implication are also realized. We implement a 1-bit RALU in two parts:
a function generator and a function selector. In the function generator, all operation
outputs are generated in parallel, and the function selector selects the desired
output depending on the values of the control inputs. We propose two designs for
the function selector, one employing Fredkin gates and the other using our
proposed multiplexer. The RALU is expandable to an n-bit design by cascading
single-bit modules. Later, we modify our RALU to detect overflow and to
perform comparison of two numbers (set-less-than). To our knowledge, our
design is the most versatile and efficient reversible design reported to date.
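The generator/selector split can be illustrated with Fredkin gates used as reversible 2:1 multiplexers. The choice of four candidate functions, their select encoding and the names `ralu_bit` and `select4` are hypothetical; the sketch only shows the selection principle, not the RALU of the thesis.

```python
def fredkin(c, a, b):
    """Controlled swap: (c, a, b) when c = 0 and (c, b, a) when c = 1."""
    return (c, b, a) if c else (c, a, b)

def select4(s1, s0, f0, f1, f2, f3):
    """4:1 selection built from three Fredkin gates; unused outputs are garbage."""
    _, lo, _ = fredkin(s0, f0, f1)     # lo = f1 if s0 else f0
    _, hi, _ = fredkin(s0, f2, f3)     # hi = f3 if s0 else f2
    _, out, _ = fredkin(s1, lo, hi)    # out = hi if s1 else lo
    return out

def ralu_bit(x, y, cin, s1, s0):
    """Function generator followed by the function selector (assumed encoding)."""
    candidates = (x & y, x | y, x ^ y, x ^ y ^ cin)   # AND, OR, XOR, SUM
    return select4(s1, s0, *candidates)

print(ralu_bit(1, 0, 0, 0, 1))   # s1 s0 = 01 selects OR
```

The control line of a Fredkin gate passes through unchanged, so the same select signal can drive several gates in cascade without fan-out.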
For all of our proposed arithmetic circuits, we perform a cost analysis of the single-bit
and n-bit designs in terms of reversible parameters such as gate count,
garbage count and quantum cost. Then we compare our designs with
existing methods for various sizes. We show the improvements in terms of
quantum cost and garbage bits. To verify the functionality of our reversible
arithmetic circuits, we implemented them in VHDL and simulated them using Quartus II
9.1 sp1 web edition. The single-bit modules are modeled in a behavioral
manner, while the remaining designs are implemented using structural code with
the basic block as a component. The simulation results confirm the functionality and
reversibility of all designs.
Our next target in the project is a testing scheme to detect faults in any
reversible circuit. During our study of the synthesis of reversible circuits and the
design of reversible arithmetic circuits, we noticed that the realization of reversible
circuits is not restricted to standard reversible gates such as NOT, CNOT and
multiple-control Toffoli gates. Rather, we find circuits with Fredkin and Peres gates
as well as some other new gates. Thus, we present fault models and a
testing scheme that address circuits built from any type of reversible library. We
observe that in the process of design, synthesis or template matching, failures can
happen where a gate is inadvertently replaced by another gate, or where the lines
of two gates are erroneously cascaded. We present two fault models to define such
errors: gate replacement and wire replacement faults. To test these faults we
adopt a familiar classical testing scheme based on Boolean satisfiability (SAT). In
reversible circuits, SAT-based techniques have been used for synthesis and
verification. Here, we propose three testing methods. Our first method
follows the conventional procedure, adding SAT formulations for the fault-free circuit,
the fault cone and the fault observation constraint. The second method employs
equivalence checking to find a test vector, assuming that the fault-free and faulty
circuits are not equivalent; the counterexample is the vector for
which the two circuits behave differently. Our new reversible test miter finds an error
defining vector for which the original gate and the replaced gate generate different
outputs. This error defining vector is the same for many gate replacement cases and
can be used as a known constraint in the SAT formulation from the circuit primary inputs to
the fault location. We solve the formula to obtain the desired input assignment, i.e., the test
vector.
To verify the proposed methods, we first constructed several reversible test
miters from different combinations of reversible gates and generated the error
defining vectors using the state-of-the-art SAT solver minisat2. For some reversible
benchmark circuits, we performed the SAT formulation for the three proposed
methods and solved for test vectors to show how efficiently the reversible test miter
works.
In our study of arithmetic circuits and testing, we wanted to identify whether such
modular designs have any special testing feature. As preliminary research, we
applied exhaustive testing to find the output behavior for each input vector
in the presence of a missing control point fault. We searched for a minimal test set and
observed that only three test vectors are sufficient to test the reversible controlled
adder/subtractor block. Further, if we apply all three vectors in sequence, the
output sequence for each fault is distinct, and by observing the output sequence we
can identify the exact location of the fault. We applied the same technique to a 4-bit
adder/subtractor and found that we can use the same input values for each block.
Hence, the extended versions of the same three input vectors are sufficient to
test all missing control point faults.
7.2 Findings of the project
The basic findings of the project, which constitute its scholarly contribution,
are as follows:
•
Synthesis of reversible circuits from large irreversible specifications or
gate-level implementations results in better realizations with fewer
garbage bits and lower quantum cost. We can avoid complex and time
consuming permutation-based reversible embedding and address large
functions easily.
•
Though the stuck-at fault is not quite appropriate for reversible circuits, the
behaviour of redundant stuck-at faults is the same as in a classical network.
Hence, in direct mapping from an irreversible network to a reversible embedding we can
avoid the redundancy removal step in the classical network and remove the redundancies
during mapping to obtain simpler circuits. Moreover, we can remove
identical Toffoli modules sharing the same inputs to obtain an optimized
network.
•
Arithmetic circuits are important parts of information processing in any
technology. Researchers also predict the
presence of arithmetic blocks in reversible technology. Keeping that in mind, as in other
literature, our endeavour is to develop a series of reversible arithmetic
circuits which are better than currently available designs. An indirect way
of subtraction for both signed and unsigned numbers is presented. To
increase reliability we incorporated overflow detection. Utilizing this
subtractor, we propose comparator designs that avoid the increased
complexity of truth-table based realizations for larger sizes.
•
We propose a versatile reversible arithmetic logic unit which incorporates
more functions with less quantum cost than existing designs.
•
We demonstrate a generalized methodology to implement a square root
circuit, an important element in complex computation. This is the first
attempt to consider such an operation in a reversible network.
•
We investigate errors in reversible circuits and present the error models of
gate and wire replacement faults, which are general enough to handle other
faults such as the missing gate fault and the control point appearance or
disappearance fault.
•
We show that our proposed reversible test miter can address gate and
wire replacement faults of circuits constructed from standard reversible
gates as well as application-specific gates, especially those proposed in
reversible arithmetic designs. This method speeds up testing with a small test
size.
•
We study the testability of modular reversible designs, i.e., arithmetic circuits,
for missing control point faults. We show that only three test vectors are
required to test such faults, and that extended versions of the same test vectors that
test the basic block can test a modular design of large size. It is even
possible to identify the location of the fault if we observe the output
sequence corresponding to a sequence of input test vectors.
7.3 Future work
As an ongoing project, some initial attempts to extend the findings reported in
this thesis were performed. Based on them the evident experiments foreseeable
in the near future remain as follows:

Synthesis and optimization of our reversible realization is still an open
problem. The Toffoli modules we created in our method is based on Reed215
Muller transformation which sometimes introduced costly reversible
implementation. Using other spectral techniques such as positive davio
decision diagram (PDD) can be employed to obtain optimal gate.
Redundancy removal technique can be incorporated in current synthesis
approaches to obtain a cost effective realization. Then the results will be
comparable to recently proposed synthesis methods [112-113, 131].

A new study based on linear nearest neighbor model (LNNM) of quantum
technology suggests on limiting the number of ancilla to a smaller value. It
has been found that for this model the quantum cost for methods that add
a large number of ancillary qubits increases significantly (up to 1200%)
[131]. Hence, in future work we will consider minimizing ancilla bits for the
Toffoli modules and arithmetic blocks when target technology enforces
LNNM constraints.

The incompletely specified functions include don’t cares. Their proper
assignment of values can result in simplified network. Our future work
aims to perform such study based on very recent literatures [106, 131] and
improve our proposed synthesis methods. Alternatively we can apply
Reed-Muller (RM) Transform which is the polynomial representation of a
function and in binary domain is directly implementable by means of AND
and XOR. In classical synthesis it has been shown that considering don’t
216
care conditions for incompletely specified functions lead to minimal degree
polynomial [137]. With this method for incompletely specified functions
each unspecified point leads to a zero in the place of one highest-order
coefficient in the transform domain. We can use the approaches in case of
incompletely specified reversible functions to obtain a minimal network.

Reducing the number of garbage outputs is always a challenging issue.
We tried to reduce them by reusing them in the cascade. In the arithmetic
logic unit, however, the function generator can be redesigned to
accommodate more functions with fewer control inputs, which will
eventually reduce the number of lines.

In our proposed designs, we mainly use quantum cost as the parameter for
comparing designs, even those that do not target quantum circuits, since
it is the standard measure currently used in the literature. However, some
CMOS implementations of reversible circuits using pass transistors have
recently been proposed [132]. For example, an implementation in reversible
complementary pass-transistor logic (R-CPL) needs 4 pass gates (8
transistors) for a Feynman gate and 8 pass gates (16 transistors) for a
Toffoli or Fredkin gate. Thus we can calculate the overall number of pass
gates or transistors for any reversible circuit implemented using these
gates in CMOS technology. Quantum-dot cellular automata (QCA) are also a
plausible candidate for the hardware realization of reversible logic [133].
In future we can consider such realizations for cost analysis.
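
The pass-gate count described above can be sketched as a simple sum over
the gates of a circuit. The per-gate figures are the ones quoted from
[132]; the function name rcpl_cost and the example gate list are
hypothetical and serve only as an illustration.

```python
# Pass gates per reversible gate in R-CPL, as quoted in the text.
PASS_GATES = {"feynman": 4, "toffoli": 8, "fredkin": 8}
# 4 pass gates correspond to 8 transistors, i.e. 2 transistors per pass gate.
TRANSISTORS_PER_PASS_GATE = 2


def rcpl_cost(gate_names):
    """Return (pass gates, transistors) for a list of gate names."""
    pass_gates = sum(PASS_GATES[name] for name in gate_names)
    return pass_gates, pass_gates * TRANSISTORS_PER_PASS_GATE


# Hypothetical circuit with two Toffoli and two Feynman gates.
circuit = ["toffoli", "feynman", "toffoli", "feynman"]
print(rcpl_cost(circuit))  # (24, 48)
```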

We applied our testing schemes to two fault models. Technology-specific
faults also need to be considered, since there are ongoing attempts at
hardware implementation of reversible logic. More scope lies in testing
arithmetic circuits. We investigated only one type of fault of binary
reversible logic considered in the literature; in future we can consider
other technology-related faults to see whether the same test vectors are
sufficient. As a starting point, we can use the FPGA emulator of quantum
circuits and the Quantum Fourier Transform presented in [138] to check the
behavior of different faults and the applicability of our testing schemes.

The basic motivation for reversible logic is low power consumption.
However, synthesis does not include power calculation, and we could not
find significant research in the literature to aid the power analysis of
our proposed designs. Since the technological realization of reversible
circuits is still in its infancy, we can perform high-level power modeling
based on field programmable gate arrays (FPGA) to estimate the efficiency
of a design [134-136].

In our arithmetic designs we proposed a reversible controlled
adder/subtractor with ripple-carry propagation. Our target was to create a
basic module that can be used in many other designs, so that only
cascading is required to extend the modular design to any size. To speed
up data processing, other adder designs can be adopted. For example, we
attempted an alternative, a reversible carry-look-ahead adder, though it
does not have a regular modular structure. For a 4-bit design, it needs 24
reversible gates, two of which are 4-controlled Toffoli gates and one a
3-controlled Toffoli gate. The number of ancilla bits is 11 and the number
of garbage bits is 16. The overall quantum cost is 122. On the other hand,
our current ripple-carry design requires 13 gates and 9 garbage bits, with
an overall quantum cost of 37. In future we can explore other adder/ALU
designs for optimized realization.
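
The "cascade only" property of a ripple-carry structure can be
illustrated with a generic textbook module built from two Toffoli and two
Feynman (CNOT) gates. This sketch is not the controlled adder/subtractor
proposed in this thesis and does not reproduce its gate count or quantum
cost; the line layout and all function names are assumptions made for the
example.

```python
def cnot(state, control, target):
    """Feynman gate: flip target when control is 1."""
    if state[control]:
        state[target] ^= 1


def toffoli(state, c1, c2, target):
    """Toffoli gate: flip target when both controls are 1."""
    if state[c1] and state[c2]:
        state[target] ^= 1


def full_adder_module(state, a, b, cin, cout):
    """One-bit module: the sum ends on line cin, the carry on line cout."""
    toffoli(state, a, b, cout)    # cout = a AND b
    cnot(state, a, b)             # b    = a XOR b
    toffoli(state, b, cin, cout)  # cout = ab XOR (a XOR b)cin
    cnot(state, b, cin)           # cin  = a XOR b XOR cin


def ripple_carry_add(x, y, n):
    """Add two n-bit numbers by cascading n identical modules."""
    # Lines: a_0..a_{n-1}, b_0..b_{n-1}, then n+1 constant-0 carry lines.
    state = [(x >> i) & 1 for i in range(n)]
    state += [(y >> i) & 1 for i in range(n)]
    state += [0] * (n + 1)
    for i in range(n):
        full_adder_module(state, i, n + i, 2 * n + i, 2 * n + i + 1)
    # Sum bits sit on the first n carry lines, the final carry on the last.
    return sum(state[2 * n + i] << i for i in range(n + 1))


print(ripple_carry_add(9, 7, 4))  # 16
```

Extending the adder to another width changes only the loop bound n, which
is the regularity that the carry-look-ahead alternative lacks.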

Finally, another classical methodology that we can explore in future in the
context of reversible logic is the use of verification tools in
multi-abstraction-level designs of high-level reversible specifications.
We can use MATLAB with Simulink and SystemC to validate each level of
abstraction of the design flow, from algorithmic modeling of the design
specification to the final FPGA implementation of the reversible network
[139].
References
[1] R. Landauer, “Irreversibility and heat generation in the computational
process”, IBM Journal of Research and Development, 1961. 5(3): pp. 183-191.
[2] C. H. Bennett, “Logical reversibility of computation”, IBM Journal of
Research and Development, 1973. 17(6): pp. 525-532.
[3] V.V. Zhirnov, R.K. Cavin, III, J. A. Hutchby, and G. I. Bourianoff, “Limits to
binary logic switch scaling- A Gedanken model”, Proc. IEEE, vol 91, pp.
1934-1939, Nov. 2003.
[4] M. A. Nielsen, I. L. Chuang, “Quantum Computation and Quantum
Information”, Cambridge University Press, 2000-10-23.
[5] B. Desoete and A. De Vos, “A reversible carry-look-ahead adder using
control gates”, Integration, VLSI J., vol. 33, pp. 89-104, 2002.
[6] W. C. Athas and L.J. Svensson, “Reversible Logic Issues in Adiabatic
CMOS”, Workshop on Physics and Computation, 1994, pp. 111-118.
[7] S. Burignat and A.D. Vos, “Test of a Majority-based Reversible (Quantum)
4-bits Ripple Carry Adder in Adiabatic Calculation”, 18th International
Conference on Mixed Design of Integrated Circuits and Systems, 2011, pp.
368-373.
[8] H. Thapliyal and M. Zwolinski, “Reversible Logic to Cryptographic
Hardware: a New Paradigm”, CoRR abs/cs/0610089, 2006.
[9] H. Thapliyal and N. Ranganathan, “Mach-Zehnder interferometer based
design of all optical reversible binary adder”, Design, Automation and Test
in Europe, 2012, pp. 721-726.
[10] M. Skoneczny, Y van Rentergem and A. D. Vos, “Reversible Fourier
Transform Chip”, 15th International Conference on Mixed Design of
Integrated Circuits and Systems, 2008, pp. 281-286.
[11] A. D. Vos, S. Burignat and M. K. Thomsen, “ Reversible Implementation of a
discrete Integer Linear Transformation”, Journal of Multiple-Valued Logic
and Soft Computing, vol. 18, no. 1, pp. 25-35, 2012.
[12] D. M. Miller, D. Maslov, and G. W. Dueck, “A transformation based
algorithm for reversible logic synthesis”, Design Automation Conf., pp. 318–
323, 2003.
[13] D. Maslov, G. W. Dueck, and D. Michael Miller, “Simplification of Toffoli
networks via Templates”, In Proc. Symp. Integr. Circuits Syst. Des., pp.
53-58, Sep. 2003.
[14] D. M. Miller, G. W. Dueck and R. Wille, “Synthesizing Reversible Circuits
from Irreversible Specifications Using Reed-Muller Spectral Techniques”,
Proc. IEEE Intl. Symp. on MVL, pp. 87-96, May 2009.
[15] D. Maslov, G. W. Dueck, and D. Michael Miller, “Toffoli network synthesis
with templates”, IEEE Trans. on CAD, 24(6):807–817, 2005.
[16] D. Maslov, G.W.Dueck and D. M. Miller, “Techniques for the Synthesis of
Reversible Toffoli Networks”, ACM Trans. on Design Automation of
Electronic System, Vol. 12, No.4, pp. 42:1-42:28, Sept.2007.
[17] P. Kerntopf. “A new heuristic algorithm for reversible logic synthesis”, Proc.
Design Automation Conf., pp. 834–837, 2004.
[18] V. V. Shende, A. K. Prasad, I. L. Markov, and J. P. Hayes, “Synthesis of
reversible logic circuits”, IEEE Trans. on CAD, 22(6):710–722, 2003.
[19] D. M. Miller and G. W. Dueck, “Spectral techniques for reversible logic
synthesis”, Proc. Int. Symp. Represent. Methodology Future Comput.
Technol., pp. 56-62, March 2003.
[20] D. Grosse, R.Wille, G. Dueck and R. Drechsler, “Exact Multiple Control
Toffoli Network Synthesis with SAT Techniques”, IEEE Trans. On CAD,
28(5): 703-715, 2009.
[21] P. Gupta, A. Agrawal and N.K. Jha, “An Algorithm for Synthesis of
Reversible logic Circuits”, IEEE Transactions on CAD of Integrated Circuits
and Systems, 25(11): 2317-2330, 2006.
[22] R. Wille, D. Grosse, G. W. Dueck and R. Drechsler, "Reversible Logic
Synthesis with Output Permutation", 2009 22nd International Conference on
VLSI Design, vlsid, pp.189-194, 2009.
[23] T. Hirayama, M. Higashiohno and Y. Nishitani, “Search Space Reduction for
Reversible Logic Synthesis by Evaluating Lower Bounds”, Proc. Intl.
Symposium of Multiple-Valued Logic, pp. 73-78, May 2009.
[24] A. Mishchenko and M. Perkowski. “Logic synthesis of reversible wave
cascades”, International Workshop on Logic Synthesis, pp. 197-202, June
2002.
[25] R. Wille and R. Drechsler, “BDD-based Synthesis of Reversible Logic
Circuits for Larger Functions”, Proc. Design Automation Conference,
pp.270-275, July 2009.
[26] H. M. Thapliyal and H. R. Arabnia, “Reversible Programmable Logic Array
(RPLA) using Fredkin & Feynman gates for industrial electronics and
applications”, International Conference on Computer Design and
Conference on Computing in Nanotechnology (CDES), 2006, pp. 70-74.
[27] R. Wille and R. Drechsler, “Synthesizing Reversible Logic: An Overview”,
Proc. Intl. Symposium of Multiple-Valued Logic, pp. 79-86, May 2009.
[28] M. Saeedi and I. L. Markov, “Synthesis and Optimization of Reversible
Circuits- A Survey”, ACM Computing Surveys, Vol. 45, No. 2, Article 21 (34
pages), 2013.
[29] L. Ni, Z. Guan and W. Zhu, “A General method of Constructing the
Reversible Full Adder”, 3rd Intl. Symp. on Intelligent Inf. Technology and
Security Informatics, pp. 109-113, 2010.
[30] H. Thapliyal and M.B Srinivas, “Novel Design and Reversible Logic
Synthesis of Multiplexer Based Full Adder and Multipliers”, 48th Midwest
Symp.on Circuits and Systems, vol. 2, pp. 1593-1596, 2006.
[31] H. Thapliyal, M.B Srinivas, “Novel Reversible TSG gate and its application
for designing reversible carry look ahead adder and other adder
architectures”, Proc. of 10th Asia-pacific computer systems architecture
Conference, 3740, 2005.
[32] M. Haghparast and K. Navi, “Design of a novel reversible multiplier circuit
using HNG gate in nanotechnology”, Am. J. Applied Sciences, vol.5, 2008,
282.
[33] M. Ehsanpour, P. Moallem, A. Vafaei, “Design of a Novel Reversible
Multiplier Circuit Using Modified Full Adder”, 2010 Intl. Conf. on Computer
Design and Applications, vol.3 pp. 230-234.
[34] H. Thapliyal, M.B Srinivas and H.R. Arabnia, “Reversible Logic Synthesis of
Half, Full and Parallel Subtractors”, Proc. of Intl. Conf. on Embedded Sys.
and App., June 2005, Las Vegas, pp. 165-181.
[35] H. Thapliyal and N. Ranganathan, “Design of Efficient Binary Subtractors
Based on a New Reversible Gate”, Proc. of 2009 IEEE Computer Society
Annual Symposium on VLSI, pp. 229-234.
[36] H. G. Rangaraju, U. Venugopal, K.N. Muralidhara and K. B. Raja, “Low
Power Reversible Parallel Binary Adder/Subtractor”, Intl. J. of VLSI design &
Comm. Sys. (VLSICS), Vol. 1, no. 3, 2010, pp 23-34.
[37] V. Vedral, A. Barenco and A Ekert, “ Quantum Networks for Elementary
Arithmetic Operations”, Phys. Rev. A, vol. 54, no. 1, pp. 147-153, 1996.
[38] S. A. Cuccaro, T. G. Draper, S. A. Kutin and D. P. Moulton, “A new
Quantum Ripple-Carry Addition Circuit,” quant-ph/0410184, 2004.
[39] Y. Takahashi, S. Tani and N. Kunihiro, “Quantum Addition Circuits and
Unbounded Fan-out”, Quantum Information and Computation, vol. 10, no.
9&10, pp. 872-890, 2010.
[40] I. L. Markov and M. Saeedi, “Constant-Optimized Quantum Circuits for
Modular Multiplication and Exponentiation”, Quantum Information and
Computation, vol. 1. no. 5&6, pp. 872-890, 2012.
[41] K. N. Patel, J. P. Hayes and I. L. Markov, “Fault Testing for Reversible
Logic Circuits”, IEEE Trans. On CAD, vol. 23, no.8, pp. 1220-1230, 2004.
[42] J. P. Hayes, I. Polian and B. Becker, “Testing for Missing-Gate Faults in
Reversible Circuits”, Proc. Asian Test Symposium, Taiwan, November
2004.
[43] I. Polian, T. Fiehn, B. Becker and J. P. Hayes, “A Family of Logical Fault
Models for Reversible Circuits”, 14th Asian Symposium, 2005, pp. 100-105.
[44] J. Zhong and J. C. Muzio, “Analyzing fault models for reversible Logic
Circuits”, IEEE Congress on Evol. Computation, pp.2422-2427, 2006.
[45] M. Bubna, N. Goyal and I. Sengupta, “A DFT Methodology for Detecting
Bridging Faults in Reversible Logic Circuits”, Proc. of IEEE Tencon, Taipei,
2007.
[46] H. Rahman, D. K. Kole, D. K. Das and B. B. Bhattacharya, “Detection of
Bridging Faults in reversible Circuits”, Proc. VLSI Design and Test
Symposium, pp. 384-392, August 2006.
[47] J. S. Allen, Jacob D. Biamonte and M. A. Perkowski, “ATPG for Reversible
Circuits using Technology-Related Fault Models”, Proc. International
Symposium on Representations and Methodologies for Emergent
Computing Technologies, Tokyo, Japan, September 2005.
[48] R. Wille, H. Zang and R. Drechsler, "ATPG for reversible circuits using
simulation, Boolean Satisfiability and pseudo Boolean optimization," in IEEE
Annual Symp. on VLSI, 2011, pp. 120-125.
[49] T. Toffoli, “ Reversible Computing”, Technical Memo, MIT/LCS/TM-151,
Boston 1980.
[50] E. Fredkin, T. Toffoli, “Conservative Logic”, Int. J. Theor. Physics, vol. 21,
no. 3-4, pp. 219-253, 1982.
[51] A. Peres, “ Reversible logic and quantum computers,” Phys. Rev. A, Gen.
Phys., vol. 32, no. 6, pp. 3266-3276, Dec. 1985.
[52] Z. Zilic, K. Radecka and A. Khazamiphur, “Reversible Circuit technology
Mapping from Non-reversible Specifications”, Proc. of Design, Automation
and Test in Europe, pp. 558-563, 2007.
[53] A.N. Nagamani, H. V. Jayashree and H. R. Bhagyalakshmi, “Novel Low
power Comparator design using Reversible Logic Gates”, Indian J. of
Comp. Science and Engineering, vol. 2, no. 4, 2011, pp. 566-574.
[54] R. Aradhaya, K. N. Muralidhara, B. Kumar, “ Design of Low Power
Arithmetic Unit Based on Reversible Logic”, International Journal of VLSI
and Signal Processing Applications, vol. 1, no. 1, pp. 30-38, 2011.
[55] O. J. E. Maroney, “The (absence of a) relationship between thermodynamic
and logical reversibility”, Studies in History and Philosophy of Modern
Physics, 36, pp. 355–374.
[56] A. De Vos, B. Desoete, A. Adamaski, P. Pietrzak, M. Sibinski and T.
Widerski, “Design of Reversible Logic Circuits by Means of Control Gates”,
Proc. Patmos 2000 Conference, Goettinge (Springer Lecture Notes in
Computer Science, Vol. 1918), pp. 255-264, 2000.
[57] A. De Vos, B. Desoete, F. Janiak and Nogawski, “Control Gates as Building
Blocks for Reversible Computers”, Proc. Patmos 2001 Conference,
Yverdon, pp. 9201-9210.
[58] P. Klein, T. H. Leete and H. Rubin, “A Bio-molecular Implementation of
Logically Reversible Computation with Minimal Energy Dissipation”, Elsevier
Bio systems 52, pp. 15-23, 1999.
[59] J. Huang, X. Ma, and F. Lombardi, “Energy analysis of QCA circuits for
reversible computing”, Proceedings of the 6th IEEE Conference on
Nanotechnology (NANO) 1, pp. 39- 42, 2006.
[60] P. Remin et. al., “Reversible Molecular Logic: A Photophysical Example of a
Feynman Gate”, ChemPhysChem, vol. 10, issue 12, pp. 2004-2007.
[61] R. C. Merkle, “Reversible Electronic Logic using Switches”.
Nanotechnology: 4, pp 21-40, 1993.
[62] C. Monroe, D. M. Meekhof, B.E. King, W. M. Itano, and D. J. Wineland,
“Demonstration of a Fundamental Quantum Logic Gate”. Phys. Rev. Lett.
75, pp. 4714–4717, 1995.
[63] K. Morita and T. Ogiro, “Simple Universal Reversible Cellular Automata in
Which Reversible Logic Elements Can Be Embedded”, IEICE Trans. Inf.
And Syst., vol. E87-D, no. 3, March 2004.
[64] T. Metodiev et al., “Preliminary Results on Simulating a Scalable Fault
Tolerant ion-trap system for Quantum Computation”, In 3rd workshop on
Non-Silicon Computing.
[65] J. E. Rice, K. B.Fazel, M. A. Thornton and K. B. Kent, “Toffoli Gate Cascade
Generation Using ESOP Minimization and QMDD-based Swapping”, Proc.
Intl. Symp. on MVL, pp. 63-72, May 2009.
[66] D. M. Miller and M. A.Thornton, “QMDD: A Decision Diagram Structure for
Reversible and Quantum Circuits”, Proc. of IEEE International Symposium
on Multiple-Valued Logic, pp. 30, May 2006.
[67] D. Goodman, M. A. Thornton, D. Y. Feinstein and D. M. Miller, “Quantum
Logic Circuit Simulation Based on the QMDD Data Structure”, Proc. of
Applications of the Reed-Muller (RMW), May 16, 2007, pp. 99-105.
[68] F. Mailhot and G. De Micheli, “Technology Mapping Using Boolean
Matching and Don’t Care Sets”, Proc. of the 1990 European Design
Automation Conference, pp. 212-216, 1990.
[69] D. Maslov and G. W. Dueck, “Reversible cascades with minimal garbage”,
IEEE Trans. on CAD, 23(11):1497–1509,2004.
[70] SIS: A System for Sequential Circuit Synthesis
http://www.eecs.berkeley.edu/Pubs/TechRpts/1992/2010.html
[71] S. Sultana and K. Radecka, “Rev-map: A direct gate way from classical
irreversible network to reversible network”, Proc. IEEE Intl. Symp. on MVL,
May 2011, pp. 147-152.
[72] D. Y. Feinstein, M. A. Thornton and D. M. Miller, “Partially Redundant Logic
Detection using Symbolic Equivalence Checking in Reversible and
Irreversible Logic Circuits”, In Proc. of Design, Automation and Test in
Europe, 2008. pp.1378-1381.
[73] J. Zhang and J. C. Muzio, “Using crosspoint faults in simplifying Toffoli
networks”, Proc. of IEEE North-East Workshop on Circuits and Systems,
2006, pp. 129-132.
[74] S. Kajiwara, H. Shiba and Y. Kinoshita “Removal of redundancy in
combinational circuits by classification of undetectable faults”, Trans IEICE
1992; J75-D-I: pp. 107-115.
[75] D.K. Ray-Choudhuri, “On the construction of minimally redundant reliable
system designs”, Bell Sys. Tech. J., vol. 40, pp. 595, 1961.
[76] R. David, “Random testing of digital circuits”, Marcel Dekker, inc. New York,
April 1998, ISBN-10: 0824701828.
[77] K. Radecka and Z.Zilic, “Identifying redundant gate replacements in
verification by error modeling”, Int’l. Test. Conference, 2001, pp. 803-812.
[78] K. Radecka and Z. Zilic, “Identifying redundant wire replacements for
synthesis and verification”, Proc. Asia and South Pacific Design Automation
Conference (ASP-DAC), 2002, pp. 517-523.
[79] A.N. Al-Rabadi, “Closed-system quantum logic network implementation of
the viterbi algorithm”, Facta universitatis-Ser.: Elec. Energ., vol. 22, no. 1,
pp. 1-33, April 2009.
[80] H. Thapliyal, N. Ranganathan and R. Ferreira, " Design of a Comparator
Tree Based on Reversible Logic”, IEEE Intl. Conf. on Nanotechnology,
2010, pp. 1113-1116.
[81] M. K. Thomsen, H. B. Axelsen and R. Gluck, “A Reversible Process
Architecture and its Reversible Logic Design”, Reversible Computation,
Lecture Notes in Computer Science Volume 7165, 2012, pp 30-42.
[82] R. Wille, D. Grosse, L. Teuber, G. W. Dueck, and R. Drechsler, “Revlib: An
online resource for Reversible Functions and Reversible Circuits”, Int’l
Symp. On Multi-Valued Logic, pp. 220-225, http://revlib.org/
[83] M. K. Thomsen, R. Gluck, H. B. Axelsen, “Reversible arithmetic logic unit for
quantum arithmetic”, J. Phys. A: Math. Theor., vol. 43, no. 38, 2010.
[84] M. Morrison and N. Ranganathan, “Design of a Reversible ALU based on
Novel Programmable Reversible Logic Gate Structures”, IEEE Computer
Society Annual Symposium on VLSI, 2011, pp. 126-131.
[85] V. C. Hamacher, S. G. Zaky and Z. G. Vranesic, "Computer Organization”,
New York : McGraw-Hill, ©1984, ISBN: 0072320869.
[86] S. Brown and Z. Vranesic, “ Fundamentals of digital logic with VHDL
design”, McGraw-Hill, 2008, ISBN-13: 978-0077221430.
[87] S. Samavi, A. Sadrabadi, A. Fanian, “Modular array structure for
non-restoring square root circuit”, Journal of Systems Architecture, vol. 54,
pp. 957-966, 2008.
[88] Quartus, https://www.altera.com/download/software/quartus-ii-we/9.1.
[89] H. Zang, R. Wille and R. Drechsler, "SAT-based ATPG for reversible
circuits," in International Design and Test Workshop, 2010, pp. 149-154.
[90] S. C. Chang, L. Ginneken and M. Marek- Sadowska, "Circuit Optimization
by rewiring," IEEE Trans. on Computers, 1999, vol.48, no. 9, 962-970.
[91] K. Radecka and Z. Zilic, “Verification by error modeling: using testing
techniques in hardware verification”, Kluwer Academic Publishers, 2004.
[92] T. Larrabee, “Test Pattern Generation using Boolean Satisfiability”, IEEE
Trans. on CAD, vol. 11, no. 1, 1992, pp. 4-15.
[93] N. Een and N. Sorensson, MiniSat: A minimalistic, open-source Sat solver,
http://minisat.se.
[94] S. Yamashita and I. Markov, “Adaptive Equivalence Checking for Quantum
Circuits”, Proc. of Reed Muller Workshop, pp 97-106, 2009.
[95] R. Wille, D. Grosse, D. M. Miller and R. Drechsler, "Equivalence Checking
of Reversible Circuits", 2009 39th International Symposium on
Multiple-Valued Logic, ismvl, pp.324-330, 2009.
[96] Maslov Reversible Logic Benchmarks, http://www.cs.uvic.ca/dmaslov/
[97] G. Hachtel and F. Somenzi, “Logic Synthesis and Verification Algorithms”,
Kluwer Academic Publishers, 1996.
[98] S. Sultana, K. Radecka, “Reversible adder/subtractor with overflow
detector”, Intl. Midwest Symp. on Circuits and Systems (MWSCAS 2011),
pages 1- 4.
[99] S. Sultana, K. Radecka, “Reversible implementation of square-root circuit”,
Intl. Conf. on Electronics, Circuits and Systems (ICECS 2011), pp. 141-144.
[100] A. Barenco, C. H. Bennett, R. Cleve, D. DiVincenzo, N. Margolus, P. Shor,
T. Sleator, J. Smolin, H. Weinfurter, “Elementary gates for quantum
computation”, Phys. Rev. A., vol. 52, no. 5, pp. 3457-3467, 1995.
[101] D.P. Vasudevan, P.K. Lala, J. Di and J.P. Parkerson, “ Reversible-Logic
Design With Online Testability”, IEEE Transactions on Instrumentation and
Measurement, vol. 55, no. 2, pp. 406-414, 2006.
[102] S.N. Mahammad and K. Veezhinathan, “ Constructing Online Testable
Circuits using Reversible Logic”, IEEE Transactions on Instrumentation and
Measurement, vol. 59, no. 1, pp. 101-109, 2010.
[103] D.M. Miller, R. Wille and Z. Sasanian, “Elementary Quantum Gate
Realizations for Multiple-Control Toffoli Gates”, Proc. of 41st IEEE
International Symp. on Multiple-Valued Logic (ISMVL 2011), pp. 288-293.
[104] D. M. Miller and G.W. Dueck, “Spectral Techniques for Reversible Logic
Synthesis”, Proc. of Reed-Muller Workshop, 2003, pp. 56-62.
[105] A. Khlopotine, M. Perkowski and P. Kerntopf, “Reversible Logic Synthesis
by Gate Composition”, Proc. of IWLS 2002, pp. 261-266.
[106] M. Kumar, B. Iyer, N. Metzger, Y. Wang and M. Perkowski, “Realization of
Incompletely Specified Functions in Minimized Reversible Circuits”, Proc. of
Reed-Muller Workshop, 2007.
[107] N. Alhagi, M. Hawash and M. Perkowski, “Synthesis of Reversible Circuits
with No Ancilla Bits for Large Reversible Function Specified with Bit
Equations”, Proc. of 40th IEEE International Symp. on Multiple-Valued Logic
(ISMVL 2011), pp. 39-45.
[108] B. Schaeffer and M. Perkowski, “Linear Reversible Circuit Synthesis in the
Linear Nearest-Neighbor Model”, Proc. of IEEE International Symp. on
Multiple-Valued Logic (ISMVL 2012), pp. 157-160.
[109] B. Schaeffer, L. Tran, A. Gronquist, M. Perkowski and P. Kerntopf,
“Synthesis of Reversible Circuits Based on Products of Exclusive OR
Sums”, Proc. of IEEE International Symp. on Multiple-Valued Logic (ISMVL
2013), pp. 35-40.
[110] G. Yang, F. Xie, W. N. Hung, X. Song and M. Perkowski, “Realization and
Synthesis of Reversible Functions”, Theor. Comput. Sci, vol. 412, no. 17,
pp. 1606-1613, 2011.
[111] N. Alhagi, “Synthesis of reversible functions using various gate libraries and
design specifications”, 2010. Dissertation and Theses, paper 366.
[112] Z. Sasanian, M. Saeedi, M. Sedighi and M. S. Zamani, “A Cycle-based
Algorithm for Reversible Logic”, Proc. of Asia and South Pacific Design
Automation Conference, 2009, pp. 745-750.
[113] K. Datta, B. Ghuku, D. Sandeep and I. SenGupta, “A Cycle based
Reversible Logic Synthesis Approach”, International Conference on
Advances in Computing and Communication, 2013, pp. 316-319.
[114] E. Frosberg, “Reversible Logic Based on Electron Waveguide Y-branch
Switches”, Nanotechnology, vol. 15, pp. 298-302, 2004.
[115] L. Grover, “A fast quantum Mechanical Algorithm for Database Search”,
Proc. 28th Annual Symp. on Theory of Computing, 1996, pp. 212-219.
[116] H.B. Axelsen and M.K.Thomsen, “Garbage-Free Reversible Integer
Multiplication with Constants of the Form 2^k ± 2^i ± 1”, Reversible
Computation 2013, Lecture Notes in Computer Science, vol. 7581, 2013,
pp. 171-182.
[117] J. Biamonte, J. Allen and M. Perkowski, “Fault Models for Quantum
Mechanical Switching Networks”, Journal of Electronic Testing, vol. 26, pp.
499-511, 2010.
[118] N. Farazmand, M. Zamani and M. Tahoori, “Online Fault Testing of
Reversible Logic using Dual-Rail Coding”, Proc. of IEEE International
On-Line Testing Symposium, 2010, pp. 204-205.
[119] N. Farazmand, M. Zamani and M. Tahoori, “Online Multiple Fault Detection
in Reversible Circuit”, Proc. of IEEE International Symposium on Defect and
Fault Tolerance in VLSI Systems, 2010, pp. 429-437.
[120] M. Zamani and M. Tahoori, “Online Missing/Repeated Gate Faults
Detection in Reversible Circuits”, Proc. of IEEE International Symposium on
Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT),
2011, pp. 435-442.
[121] M. Zamani, M. Tahoori and K. Chakrabarty, “Ping-pong Test: Compact
Test Vector Generation of Reversible Circuits”, Proc. of 30th VLSI Test
Symp., 2012, pp. 164-169.
[122] N. Alves, “Detecting Errors in Reversible Circuits with Invariant
Relationships”, arXiv:0812.3871v1, 2008.
[123] I. Polian and J.P. Hayes, “Advanced Modeling of Faults in Reversible
Circuits”, invited paper, Proc. of the IEEE East-West Design and Test
Symp., 2010, pp. 376-381.
[124] A. Paler, I. Polian and J.P.Hayes, “Detection and Diagnosis of Faulty
Quantum Circuits”, Proc. of the Asia and South Pacific Design Automation
Conference, 2012, pp. 181-186.
[125] M. Lukac, B Shuai, M. Kameyama and M. Miller, “Information Preserving
Logic- using Logical Reversibility to Reduce CPU-Memory Bottleneck”,
Proc. of IEEE International Symp. on Multiple-Valued Logic (ISMVL 2011),
on CD.
[126] J. Fiurasek, “Linear Optical Fredkin Gate Based on Partial-SWAP Gate”,
Phys. Rev. A, 78:032317, 2008.
[127] M. Lukac, M. Kameyama, M. Perkowski, P. Kerntopf and C. Moraga, “
Analysis of Faults in Reversible Computing”, Proc. of IEEE International
Symp. on Multiple-Valued Logic (ISMVL 2014).
[128] J. E. Rice, “An Overview of Fault Models and Testing Approaches for
Reversible Logic”, Proc. of the Pacific Rim Conference on Communications,
Computers and Signal Processing (PACRIM), 2013, pp. 6.
[129] M. Micuda, M. Sedlak, I. Straka, M. Mikova, M. Duesk, M. Jezek and J.
Fiursek, “ Efficient Experimental Estimation of Fidelity of Linear Optical
Quantum Toffoli Gate”, Phys. Rev. Lett., 111: 160407, 2013.
[130] M. Hawash, “Methods for Efficient Synthesis of Large Reversible Binary
and Ternary Quantum Circuits and Applications of Linear Nearest Neighbor
Model”, 2013, Dissertations and Theses, paper 1090, chapters 9 and 10.
[131] M. Lukac, M. Kameyama, M. Perkowski and P. Kerntopf, “Decomposition of
Reversible Logic Functions Based on Cube-Reordering”, FACTA
UNIVERSITATIS, SER. Elec. Energ, vol. 24, no. 3, December 2011, pp.
403-422.
[132] M.K. Thomsen, “Design of Reversible Logic Circuits using Standard Cells”,
Technical report 2012-03, Dept. of Computer Science, University of
Copenhagen, 32 pages.
[133] N. A. Shah, F. A. Khandy, J. Iqbal, “Quantum Dot Cellular Automata (QCA)
Design of Multi-Function Reversible Logic Gate”, Journal of
Communications in Information Science and Management Engineering
(CISME), vol.2, no. 4, 2012, pp. 8-18.
[134] S. Gupta and F.N. Najm, “Power Modeling for High Level Power
Estimation”, IEEE Transactions on Very Large Scale Integration, vol.8, no.
1, 2000, pp. 18-29.
[135] L. Shang and N. K. Jha, “High Level Power Modeling of CPLDs and
FPGAs”, Proc. of IEEE International Conference of Computer Design, ICCD
2001, pp. 46-51.
[136] A. De Vos and Y. V. Rentergen, “Power Consumption in Reversible Logic
Addressed by a Ramp Voltage”, Proc. of Integrated Circuit and System
Design, Power and Timing Modeling, Optimization and Simulation, 15th
International Workshop PATMOS 2005.
[137] Z. Zilic and Z. Vranesic, "A Multiple-Valued Reed-Muller Transform for
Incompletely Specified Functions", IEEE Transactions on Computers,
vol. 44, No. 8, pp. 1012-1020, August 1995.
[138] A. U. Khalid, Z. Zilic and K. Radecka, "FPGA Emulation of Quantum
Circuits", Proceedings of IEEE International Conference on Computer
Design, ICCD04, pp. 310-315, Oct. 2004.
[139] J. F. Boland, C. Thibeault and Z. Zilic, "Using Matlab and Simulink in a
SystemC Verification Environment", Proceedings of Design and Verification
Conference and Exposition, DVCon 05, pp. 56-61, Feb. 2005.