vii TABLE OF CONTENTS CHAPTER TITLE DECLARATION DEDICATION ACKNOWLEDGEMENT ABSTRACT ABSTRAK TABLE OF CONTENTS LIST OF TABLES LIST OF FIGURES LIST OF ABBREVIATIONS LIST OF SYMBOLS LIST OF APPENDICES 1 INTRODUCTION 1.1 Block-based Neural Networks 1.2 Problem Statement 1.3 Objectives 1.4 Scope of Work 1.5 Contributions 1.6 Dissertation Organization 2 LITERATURE REVIEW AND THEORETICAL BACKGROUND 2.1 Artificial Neural Networks (ANN) 2.1.1 Background Theory of ANNs 2.1.2 Neuron Activation Functions 2.1.3 Common Criticism of Conventional ANNs 2.2 Background Theory of BbNNs 2.2.1 Previous Work on BbNNs in Hardware PAGE ii iii iv v vi vii xii xiv xix xxi xxii 1 2 3 6 6 8 9 11 11 11 13 16 18 20 viii 2.3 Genetic Algorithm 2.3.1 Multi-objective Optimization Models Using GA Previous Work on Selected Case Studies 2.4.1 Driver Drowsiness Classification Using Heart Rate Variability 2.4.2 ECG Signal Classification in Detection Heart Arrhythmia 2.4.3 Tomita Language Grammar Inference 22 METHODOLOGY 3.1 Research Approach 3.1.1 System Latency and GA-based Training for the Proposed BbNN 3.1.2 Design Considerations in Neuron Block Hardware Development 3.1.3 Design Approach of the Simulation Model for the Proposed BbNN 3.1.4 Hardware Design: Co-Design and SoC Implementation 3.2 Software Tools and Design Environment 3.2.1 The GAlib Library 3.2.2 Mentor Graphics Modelsim 3.2.3 Icarus Verilog 3.2.4 Verilator 3.2.5 Nios2-Linux Embedded Operating System 3.3 Hardware Design Verification Methodology 3.3.1 Using the Command-line Version of Modelsim for Verification 3.3.2 Using SystemVerilog DPI-C in Modelsim for Co-Verification 3.3.3 Viewing Waveforms with GTKWave 3.3.4 Using Icarus Verilog for Verification 3.3.5 Using Verilator for Advanced CoSimulation 3.4 Case Studies for Verification of the Proposed BbNN 3.4.1 Test Dataset Creation from HRV Feature Vectors for Case Study 2 31 31 2.4 3 25 26 26 28 30 33 34 35 36 38 38 40 40 41 42 44 45 47 49 49 51 54 55 ix 3.4.2 3.4.3 4 5 Test Dataset Creation from ECG Signal Feature Vectors for Case Study 3 Test Dataset Creation Using Tomita Grammar for Case Study 4 MODELING, ALGORITHMS, AND SIMULATION OF PROPOSED BLOCK-BASED NEURAL NETWORKS 4.1 BbNN as a Structured Neural Network 4.2 Modeling of a Neuron Block in the Proposed BbNN 4.2.1 System Latency and Synchronous Logic Structure of Neuron Blocks 4.2.2 Design Considerations: Range of Values for the Synaptic Weights 4.3 GA-based Training and Optimization of the Proposed BbNN 4.3.1 Formulation of the GA Fitness Function 4.4 BbNN Array Interconnect Algorithm 4.5 Implementation of the BbNN Simulation Model HARDWARE DESIGN AND IMPLEMENTATION 5.1 System Architecture of BbNN in Hardware 5.2 Hardware Design Considerations 5.2.1 Number Representation for Fixed-Point Arithmetics and Precision 5.2.2 Activation Function of the Hardware Neuron Block 5.3 Design of the Proposed Hardware Neuron Block 5.3.1 Neuron Block FSM Control Unit 5.3.2 Multiplexing Sequence of the Multiplier Inputs 5.3.3 Neuron Block Dual-Port On-chip Memory Submodule 5.3.4 Neuron Block Input Multiplexer Submodule 5.3.5 Neuron Block ALU Submodule 5.3.6 Neuron Block Operand Selector Submodule 58 61 63 63 65 68 71 72 75 79 84 87 87 89 89 92 96 98 100 102 104 105 108 x 5.3.7 6 Top Level BbNN Controller with Avalon Bus Interfacing EXPERIMENTAL RESULTS, CASE STUDIES, AND ANALYSIS 6.1 Analysis to Obtain Optimum Range of Weight Values 6.1.1 Geometric Analysis of a Neuron Block with One Synapse 6.1.2 Geometric Analysis of a Neuron Block with Two Synapses 6.1.3 Geometric Analysis of a Neuron Block with Three Synapses 6.1.4 Weight Boundary Selection and Empirical Analysis 6.2 Verification and Analysis of Proposed BbNN through Case Studies 6.2.1 Functional Verification of Proposed BbNN Using the XOR Classification Problem 6.2.2 Case Study 2: Driver Drowsiness Detection Based on HRV 6.2.3 Case Study 3: Heart Arrhythmia Classification from ECG Signals 6.2.4 Case Study 4: Grammar Inference of the Tomita Language 6.3 Verification of the BbNN Implementation Model in Hardware 6.3.1 Simulation Results of the Neuron Block Submodules 6.3.2 Simulation Results of the Neuron Block 6.3.3 Co-Simulation of BbNN HW-SW Design 6.4 Resource Utilization of the BbNN Implementation Model in Hardware 6.5 Performance Analysis of the Proposed BbNN Models 6.5.1 Single Thread vs Multithreaded BbNN Simulation Models 109 113 113 114 116 118 121 122 123 125 127 130 133 134 136 138 140 142 143 xi 6.5.2 6.5.3 7 Software vs Verilated BbNN Simulation Models Embedded Software vs Hardware BbNN Implementation Models CONCLUSION 7.1 Concluding the Experimental Results 7.2 Contributions 7.3 Future Work REFERENCES Appendices A – E 144 144 147 148 150 151 153 160 – 182 xii LIST OF TABLES TABLE NO. 5.1 5.2 5.3 5.4 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 6.10 6.11 6.12 6.13 TITLE All the required multiplication operands used for computing Equation 5.12-5.15 used in this work. Summary of ALU adder operation based on opcode. The values of Sel Addr, Sel X and Sel Y based on the operand counter that is required by the state machine. Description of all the registers in the BbNN controller module. Distribution cases for Equation 6.1. Distribution cases for Equation 6.2 with an ignored bias (b = 0). All possible combination of offsets used on Equation 6.2 with x1 and x2 preset to maximum positive values. Parameters used for GA evolution for this analysis. Statistics of generations required by GA to achieve a fitness of 0.95 when training a 7 × 1 BbNN for classification of heart arrhythmia. XOR training dataset for the BbNN structure. Parameters used for GA evolution. Results on BbNN classification accuracy of heart arrhythmia using individual record training. Benchmarking results on BbNN classification accuracy of the heart arrhythmia case study with all records combined. Results on BbNN classification accuracy for Tomita languages. Results on BbNN configuration and optimum latency achieved for each Tomita language. Benchmarking results in classification accuracy for the grammar inferencing case study. The training accuracy and latency of all the case studies tested with the verilated BbNN model. PAGE 101 106 109 111 115 117 119 122 122 123 125 129 130 131 132 133 139 xiii 6.14 6.15 6.16 6.17 6.18 6.19 6.20 FPGA logic utilization of the proposed design in relation to previous works using various neuron block array configurations. Selected FPGA prototyping platform and BbNN bit precision for the proposed design in comparison with previous work. Benchmark of the processors used in this work. Performance analysis for the training of single and multithreaded BbNN simulation models. Performance analysis for the training of the software and verilated BbNN simulation models. Performance analysis for both embedded software and hardware implementation BbNN models for various latency values. Performance analysis for the training of the embedded software and hardware implementation BbNN models. 141 141 142 143 144 145 145 xiv LIST OF FIGURES FIGURE NO. 1.1 1.2 1.3 1.4 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.1 3.2 3.3 3.4 TITLE A multilayer-perceptron ANN with a single hidden layer. Neuron blocks interconnected as a scalable grid to form a BbNN structure [8]. Example of a 2 × 2 BbNN structure configured in recurrent mode. Layers of the proposed embedded BbNN SoC. A single neuron with three synapses in a conventional ANN. A multilayer-perceptron ANN with a single hidden layer. The observable bell shaped curve from the plot of a sigmoidal function. Depiction of a massively interconnected multilayerperceptron ANN. Neuron blocks interconnected as a scalable grid to form a BbNN structure [8]. Functional block diagram of a neuron block in hardware as described in [20, 36, 37]. Behavioral flow chart for Simple GA [38]. Spectral analysis of heart rate using the three different power bands. ECG signals from a patient depicting (a) a normal heartbeat and (b) a heartbeat with premature ventricular contraction (PVC). BbNN structure with a latency of three computational cycles. Verification of the neuron block using Modelsim and DPIC interfacing with a golden reference software model as comparison. Methodology for co-simulation of the BbNN hardware design using Verilator and C/C++. Layers of the proposed embedded BbNN SoC. PAGE 1 2 4 8 12 12 15 17 18 21 23 27 29 33 37 37 38 xv 3.5 3.6 3.7 3.8 3.9 3.10 3.11 3.12 3.13 3.14 3.15 3.16 3.17 3.18 3.19 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 4.9 Diagram of the connection between Nios2-Linux, FPGA, Nios II CPU and the external hardware. The testbench design environment [69]. Constrained-random test progress over time vs. directed testing [69]. Modelsim compilation and simulation output for the TestVerilog module. Verification of a hardware module using Modelsim and DPIC interfacing with a golden reference software model as comparison. GTKWave displaying the waveforms of a simulated module. The methodology of verifying a design using Verilator by converting SystemVerilog hardware modules into cycle accurate C/C++ libraries. Output from the C/C++ verification program for testing the verilated adder module. HRV sleep state classification using BbNN. QRS complex detection. HR plotting at the center of every RR interval. HRV extraction to obtain power bands. Spectral analysis of HR. Overview of ECG signal preprocessing and feature extraction. ECG signal preprocessing flow. Diagram depicting (a) neuron blocks interconnected as a scalable grid to form a BbNN structure and (b) an example of a configured neuron block. All possible internal configurations of a neuron block. Conceptual model of a single neuron block. BbNN structure with a latency of (a) five computational cycles and (b) three computational cycles. Example of a recurrent mode 2 × 2 BbNN structure. A flowchart depicting the (a) basic concept of the BbNN training model and (b) multi-population parallel GA with migration applied in this work. Encoding the chromosome of a BbNN structure. A r × c BbNN with detailed labeling of the input/output interconnects. Architecture of the proposed BbNN simulation model. 43 44 45 46 47 50 51 54 55 55 57 57 58 60 60 64 65 66 68 70 73 74 80 86 xvi 4.10 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 5.10 5.11 5.12 5.13 5.14 5.15 5.16 5.17 5.18 6.1 6.2 6.3 6.4 CPU utilization during BbNN training. System architecture of the proposed FPGA-based BbNN SoC. Layers of the proposed embedded BbNN SoC. The Q4.11 fixed-point number format used in this work. A neuron block with the maximum possible combination of weights connected to the output neuron. Performing addition (or subtraction) using fixed-point arithmetic on Q4.11 type numbers. Performing multiplication using fixed-point arithmetic on Q4.11 type numbers with truncation. Plot of the proposed piecewise quadratic (PWQ) function. Functional block diagram of the proposed neuron block in hardware. Algorithmic state machine chart for the FSM control unit. Individual interconnections of the multiplier unit within the neuron block. Block diagram of the dual-port on-chip memory of the neuron block. Using the direction bits to perform a conditional statement to enable or disable the first operand. The architecture of the input multiplexer within the neuron block. The architecture of the ALU within the neuron block. The logic circuit for obtaining the select signal S2. Logic circuit for determining the positive and negative limit of the x value in register A. Block diagram of the operand selector module within the neuron block. Functional block diagram of the BbNN controller module designed for handling the CPU bus interface. A neuron block with one synapse connected to the output node. Geometric structure of Equation 6.1 for both input cases with weights ranging from −1 to +1. Geometric structure of Equation 6.1 with weights ranging from −3 to +3. A neuron block with two synapses connected to the output node. 86 88 88 90 90 91 92 95 97 99 100 102 103 104 105 106 107 108 110 115 115 116 116 xvii 6.5 6.6 6.7 6.8 6.9 6.10 6.11 6.12 6.13 6.14 6.15 6.16 6.17 6.18 6.19 6.20 6.21 6.22 6.23 6.24 Geometric structure of Equation 6.2 with weights ranging from −1 to +1 using (a) ignored bias and (b) maximum bias offset. Geometric structure of Equation 6.2 with weights ranging from −3 to +3 with (a) maximum positive bias offset and (b) maximum negative bias offset. A neuron block with three synapses connected to the output node for mathematical analysis. Geometric structure of Equation 6.3 for case 7 with weights ranging from (a) −1 to +1 and (b) −3 to +3. Geometric structure of Equation 6.3 for case 8 with weights ranging from (a) −1 to +1 and (b) −3 to +3. Geometric structure of Equation 6.3 for case 9 with weights ranging from (a) −1 to +1 and (b) −3 to +3. Evolved BbNN that solves the XOR problem. Fitness trend of the BbNN evolution for the XOR problem case study. Evolved BbNN structure in driver drowsiness detection. Fitness trend of the BbNN evolution for the driver drowsiness detection case study. Fitness trend of the BbNN evolution for the heart arrhythmia classification case study. Evolved BbNN structure for the classification of heart arrhythmia using all records combined as training data. Evolved BbNN structure for the classification of Tomita language 3. Fitness trend of the BbNN evolution for the grammar inference of Tomita language 3. Simulation results (in dumpfile form) for the dual-port onchip memory. Simulation results (in dumpfile form) for the ALU submodule. Simulation results (in dumpfile form) for the FSM submodule. Simulation results (in dumpfile form) for the operand selector submodule. SystemVerilog testbench with DPI-C interfacing for coverification of the neuron block. Results of the neuron block co-verification routine. 117 118 118 120 120 121 124 124 126 126 128 128 131 132 134 135 135 136 137 138 xviii 6.25 6.26 Co-simulation of the BbNN hardware design using Verilator and C/C++. Terminal output from the training algorithm after successfully training the verilated BbNN hardware applied on case study 2. 139 140 xix LIST OF ABBREVIATIONS ALU – Arithmetic Logic Unit ANN – Artificial Neural Network BPF – Band Pass Filter BbNN – Block-based Neural Network CPU – Central Processing Unit DDR SDRAM – Double Data Rate Synchronous Dynamic Random-Access Memory DPI – Direct Programming Interface DUT – Device-Under-Test DUV – Device-Under-Verification EA – Evolutionary Algorithm ECG – Electrocardiogram FFT – Fast Fourier Transform FPU – Floating Point Unit FPGA – Field Programmable Gate Array FSM – Finite State Machine GA – Genetic Algorithm GCC – GNU Compiler Collection GNU – Gnu’s Not Unix GUI – Graphical User Interface HDL – Hardware Description Language HF – High Frequency HPF – High Pass Filter HRV – Heart Rate Variability HW – Hardware HW-SW – Hardware-Software IC – Integrated Circuit I/O – Input/Output xx LE – Logic Elements LF – Low Frequency LPF – Low Pass Filter LUT – Lookup Table Mbit – Mega Bits MDIPS – Millions of Dhrystone Instructions Per Second MHz – Mega Hertz MLP – Multilayer perceptron MMU – Memory Management Unit ms – Millisecond MUX – Multiplexer MWIPS – Millions of Whetstone Instructions Per Second NoC – Network-on-Chip ns – Nanosecond OS – Operating System PC – Personal Computer PWL – Piecewice Linear PWQ – Piecewice Quadratic RTL – Register Transfer Level RTOS – Real Time Operating System SoC – System-on-Chip SOM – Self Organizing Maps SW – Software USB – Universal Serial Bus VHDL – Very High Speed Integrated Circuit Hardware Description Language VLF – Very Low Frequency WFDB – Waveform Database xxi LIST OF SYMBOLS x0−3 – Inputs of a neuron block y0−3 – Outputs of a neuron block w0−5 – Weight values for the synapses inside a neuron block b0−3 – Bias values for the nodes inside a neuron block d0−3 – Direction values for the neuron block interconnects r – The amount or rows in a BbNN structure c – The amount or columns in a BbNN structure fBbN N – Fitness value of the BbNN structure Perr – BbNN error parameter PLQ – BbNN latency quality parameter Ppen – BbNN penalty parameter α – General scaling or modifier constant β – General scaling or modifier constant β1−3 – BbNN parameter scaling constants Ls – Specified latency value of the BbNN system Lmin – Minimum possible latency value Lmax – Maximum possible latency value σ – Latency modifier for Lmax φ – Latency offset for Lmin Ninv – Number of neuron blocks with invalid configurations g(x) – General representation of an activation function sp – Activation function saturation point Ca – Classification accuracy TP – The number of true positives FP – The number of false positives TN – The number of true negatives FN – The number of false negatives xxii LIST OF APPENDICES APPENDIX A B C D E TITLE Publications BbNN Simulation Model Source Codes BbNN Hardware Implementation Source Codes Verification Routines Octave Routines used for Experimentation and Data Preparation PAGE 160 162 167 176 182
© Copyright 2026 Paperzz