Slide 1: Paper by E.M. Ortigosa, A. Cañas, E. Ros, P.M. Ortigosa, S. Mota, J. Díaz. Paper review by Ryan MacGowan.

Slide 2 (Outline):
• What does this even mean??
• Speech Recognition
• Artificial Neural Networks
• Solutions
• Results
• Likes and Dislikes
• Conclusion

Slide 3: A Multi-Layer Perceptron (referred to as MLP) is a type of standard feed-forward neural network which uses at least 3 layers (input, hidden, and output).

Slide 4: "Different abstraction levels" just means that the solution will be realized using two different methods: one using low-level VHDL, and another using higher-level Handel-C.

Slide 5: "Hardware Descriptions of Multi-Layer Perceptrons with Different Abstraction Levels" … REALLY MEANS … an FPGA implementation of two neural networks, using VHDL and Handel-C, to solve a problem. In this case, the problem solved is speech recognition.

Slide 6: Due to the increasing power of FPGAs, speech-recognition solutions can be designed using an artificial neural network built right into the FPGA.
• Useful for applications in cars, GPS units, toys, and other embedded systems where control by speech would be useful.
• A computer samples the audio, and this waveform is converted into feature vectors using filter-bank and prediction analysis. These vectors are what is sent to the neural network.
• The neural network computes which word was spoken as its output.

Slide 7: All solutions were realized using the artificial neural network presented here.
• The 10 vectors of 22 features each make up 220 input data values, which are sent to the 24 hidden neurons for computation; their outputs are sent to the output neurons, which classify the input and provide an output.
• If a spoken command falls in a class, we expect that class's output node to give a high value. (A C-style rendering of this forward pass appears as Sketch 1, after slide 15 below.)

Slide 8: In order to ensure maximum accuracy, a number of neural network structures were tested. The best result, 96.3% accuracy, was obtained when 24 hidden neurons were used.

Slide 9: This is the functional unit used in the implementations.
• The first 8-bit input is the input value, and the second represents the connection weight for that input.
• The output of the multiplier is sign-extended, because of the maximum size the summation of weighted values in Eq. 1 can reach.

Slide 10: The output of the functional unit presented on the previous slide is then passed into the sigmoid activation function, which gives an 8-bit output based on the 23-bit input.
• This 8-bit output can either be passed to another layer of hidden neurons, or passed to the output neurons. (A fixed-point rendering of this unit appears as Sketch 2, after slide 15.)

Slide 11: (figure only; no text survives)

Slide 12: We can enhance the speed of the design by placing the RAM containing the weights and the functional units in parallel.
• The output sums from these functional units are stored in registers and selected by a multiplexer to be sent to the activation function.
• This is only a partially parallel design, as the outputs of each layer are still computed sequentially.

Slide 13: The Handel-C design is done in both serial and parallel forms, shown on the slide as Serial and Parallel listings (reconstructed as Sketch 3, after slide 15).
• NumHidden is the number of hidden neurons (24), NumInput is the number of input values (220), W is the array containing the weights, In is the input array, and Sum is the sum of the weights multiplied by the inputs.

Slide 14: In order to test more solutions and come up with an optimal one, different RAM types were used in the Handel-C design:
1) Only distributed RAM blocks
2) A combination of embedded and distributed RAM blocks
3) Only embedded RAM blocks

Slide 15: Here are the results.
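Sketch 1 (for slide 7): a minimal C rendering of the MLP forward pass described on slides 7-8: 220 inputs, 24 sigmoid hidden neurons, and one output node per word of the 10-word vocabulary mentioned on slides 18 and 24. The weight arrays, bias terms, and function names are hypothetical; the paper's trained values and exact arithmetic are not reproduced here.

    #include <math.h>

    #define NUM_INPUT  220   /* 10 vectors x 22 features (slide 7)  */
    #define NUM_HIDDEN 24    /* best-accuracy structure (slide 8)   */
    #define NUM_OUTPUT 10    /* one node per word in the vocabulary */

    /* Hypothetical storage for trained weights and biases. */
    static float w_hid[NUM_HIDDEN][NUM_INPUT], b_hid[NUM_HIDDEN];
    static float w_out[NUM_OUTPUT][NUM_HIDDEN], b_out[NUM_OUTPUT];

    static float sigmoid(float x) { return 1.0f / (1.0f + expf(-x)); }

    /* Forward pass: returns the index of the highest output node,
       i.e. the class the network believes the spoken word falls in. */
    int mlp_classify(const float in[NUM_INPUT])
    {
        float hid[NUM_HIDDEN];
        for (int h = 0; h < NUM_HIDDEN; h++) {      /* hidden layer */
            float sum = b_hid[h];
            for (int i = 0; i < NUM_INPUT; i++)
                sum += w_hid[h][i] * in[i];
            hid[h] = sigmoid(sum);
        }
        int best = 0;
        float best_y = -1.0f;
        for (int o = 0; o < NUM_OUTPUT; o++) {      /* output layer */
            float sum = b_out[o];
            for (int h = 0; h < NUM_HIDDEN; h++)
                sum += w_out[o][h] * hid[h];
            float y = sigmoid(sum);
            if (y > best_y) { best_y = y; best = o; }
        }
        return best;  /* a high value here means "falls in this class" */
    }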
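Sketch 2 (for slides 9-10): the fixed-point functional unit. The bit widths are the slides' own (8-bit input, 8-bit weight, 23-bit accumulator, 8-bit activation output); a 23-bit signed accumulator suffices because 220 products of two signed 8-bit values sum to at most 220 * 128 * 127, roughly 3.6 million, which fits below 2^22. The Q-format scale factor and the direct sigmoid computation (in hardware this stage would typically be a small lookup table) are assumptions.

    #include <stdint.h>
    #include <math.h>

    /* Multiply-accumulate stage (slide 9): each 16-bit product is
       sign-extended as it is added into the 23-bit accumulator,
       modeled here with a 32-bit integer. */
    int32_t neuron_mac(const int8_t in[], const int8_t w[], int n)
    {
        int32_t sum = 0;
        for (int i = 0; i < n; i++)
            sum += (int16_t)in[i] * (int16_t)w[i];  /* widening add */
        return sum;
    }

    /* Activation stage (slide 10): 23-bit sum in, 8-bit value out.
       SCALE is a hypothetical fixed-point format; the paper's exact
       encoding is not given in the slides. */
    uint8_t neuron_activation(int32_t sum23)
    {
        const float SCALE = 1.0f / (1 << 14);       /* assumed Q-format */
        float y = 1.0f / (1.0f + expf(-(float)sum23 * SCALE));
        return (uint8_t)(y * 255.0f + 0.5f);        /* quantize to 8 bits */
    }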
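Sketch 3 (for slide 13): the Serial and Parallel listings on slide 13 did not survive extraction, so here is a plain-C rendering consistent with the slide's description, using the slide's own identifiers (NumHidden, NumInput, W, In, Sum). The replicated `par` form shown in the comment follows Handel-C's parallel-replicator syntax and is my reconstruction, not the paper's verbatim code.

    #define NumHidden 24    /* hidden neurons (slide 13) */
    #define NumInput  220   /* input values (slide 13)   */

    int W[NumHidden][NumInput];  /* connection weights        */
    int In[NumInput];            /* input values              */
    int Sum[NumHidden];          /* accumulated weighted sums */

    /* Serial form: one multiply-accumulate at a time. */
    void serial_version(void)
    {
        for (int h = 0; h < NumHidden; h++) {
            Sum[h] = 0;
            for (int i = 0; i < NumInput; i++)
                Sum[h] += W[h][i] * In[i];
        }
    }

    /* Parallel form: in Handel-C the outer loop becomes a replicated
       `par`, instantiating one MAC chain per hidden neuron so all 24
       accumulate simultaneously (hence the up-to-17X speedup quoted
       in the conclusion):

           par (h = 0; h < NumHidden; h++) {
               Sum[h] = 0;
               for (i = 0; i < NumInput; i++)
                   Sum[h] += W[h][i] * In[i];
           }
    */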
Slide 16: When looking at throughput, the VHDL design provides the best results in both the serial and parallel cases. Among the Handel-C variants, HC(a) provides the highest throughput, due to the higher clock frequency it can achieve.

Slide 17: This trend continues when we also examine performance cost. The VHDL design is the best, as it provides the highest throughput while also using the smallest number of gates. The Handel-C designs have at least a 1.6X higher performance cost.

Slide 18: These graphs show the linear relationship between the number of neurons in the hidden layer and the amount of resources used. Since this design uses only 24 hidden neurons, the resource usage is manageable; however, the design is also very small (a vocabulary of only 10 words).

Slide 19: While the tables and graphs above show that the VHDL design is superior in terms of throughput and performance cost, there are also drawbacks we must consider:
• Design time is 10X longer for the VHDL design.
• Exploring different solutions with a VHDL design takes considerably longer, as it requires an entirely new control unit to be designed each time.

Slide 20: The FPGA used is a Virtex-E 2000.
• CLBs contain 4 logic cells (LCs), each with a 4-input function generator, carry logic, and a storage element.
• The 4 LCs are placed in 2 slices; each slice provides function generators for 5- and 6-input functions.
• Each LC has a 4-input LUT which can provide a 16x1 memory block.
• The device also contains embedded memory (EMB) blocks.

Slide 21: VHDL was coded with the FPGA Advantage 5.3 tool from Mentor Graphics. The DK Design Suite was used for the Handel-C implementation. Both designs were placed and routed using the ISE Foundation 3.5i tool.

Slide 22 (Likes):
• The relevance to labs performed in the course.
• The comparison between parallel and serial versions for all types of implementation.
• A great description of the neural network, covering all aspects.
• The application is very practical in today's electronic culture.

Slide 23 (Dislikes):
• More detail is needed on the actual voice-recognition system, specifically the computer used to preprocess the audio.
• The paper is not entirely modern (2006), and some sources used are even older (1990-1993).
• The system is unrealistically small (10 different words), with no discussion of viability in a more complex environment.
• Not much information is given on the VHDL design: no pseudocode and no simulations are shown.

Slide 24 (Conclusion):
• Using neural networks for speech recognition allows compact embedded systems to be developed.
• Parallel processing allows a speedup of up to 17X over a serial implementation.
• VHDL results in an implementation that is 1.21-1.24X faster than the Handel-C implementation.
• When using Handel-C, it is important to know the most optimized type of RAM for your application; in this case, distributed RAM.
• Handel-C designs have a 1.6X higher performance cost.
• Computation of the output takes only 13-16 ms.

Slide 25: (closing slide; no text)