UNIVERSIDADE TÉCNICA DE LISBOA
INSTITUTO SUPERIOR TÉCNICO
ARTIFICIAL RETINA:
Development of a Bio-Inspired Model
with Configurable Implementation
José António Henriques Germano, n. 48014, AE: Sistemas Electrónicos
Ricardo Manuel Simões Baptista, n. 48095, AE: Computadores
DEGREE IN ELECTRICAL AND COMPUTER ENGINEERING
Graduation Report
127/2003/M
Supervisor: Prof. Leonel Sousa
October 2004
Acknowledgements
Many people have been a part of our graduate education, as friends, teachers, and colleagues. Prof. Leonel Sousa, first and foremost, has been all of these. The best advisor
and teacher we could have wished for, he is actively involved in the work of all his students, and clearly always has their best interest in mind. Thank you for the guidance and
support throughout the project.
At the Signal Processing Systems Research Group (SiPS) of INESC-ID we were surrounded by knowledgeable and friendly people who helped us daily. A special thanks to
Eng. Pedro Tomás, who helped us so many times from research to implementation and
even in reviewing this report, to M.Sc. Tiago Dias and Eng. Ricardo Guapo for their
helpful suggestions, discussions and ideas.
In addition to the people at INESC-ID, we have been lucky enough to have the support of
many good friends. This graduation project's long working days and nights would not have
been the same without our friends Eng. Tiago Rojão and (future Eng.) Pedro Pinho. A
special thanks also to our friend (future Eng.) João Pereira for reviewing this report in
such a short time.
Finally, we would like to thank those closest to us, whose presence helped make the
completion of this graduate work possible. We would like to thank our families, especially
our parents, and our friends and girlfriends, for their absolute confidence and to whom we
are forever indebted for their understanding, endless patience and encouragement when
it was most required.
José António Henriques Germano
Ricardo Manuel Simões Baptista
Abstract
Nowadays, in a society which relies heavily on sight, loss of vision poses extraordinary
challenges to individuals. The goal of the work described in this report is the development
of a bio-inspired processing module that produces spike events capable of exciting visual
cortex cells and therefore provides some sense of visual rehabilitation to profoundly blind
people.
Methods and techniques are proposed for modelling the retina's response to a visual stimulus and for generating neural impulses to stimulate the visual cortex. To improve
the fidelity of the visual neural code generated by the artificial system,
the modelling of the retinal cells' response was based on artificial neural networks, in contrast to the previously available classic model. Architectures are proposed for the
implementation of both models.
To implement the complete system, a dedicated prototype board was designed. The
purpose of this system is to achieve a small-size prototype derived from a low-power
design based on a Field-Programmable Gate Array (FPGA). Compared with a previous
prototype, the power consumption is reduced by about 50% and the available memory
resources are increased fivefold. This allowed the simulation and validation of the full
model.
Keywords
Retina Model, Spike Sorting, Neural Networks, VLSI Architectures, Configurable Systems.
Resumo
Nos dias de hoje, numa sociedade que depende fortemente da visão, a perda desta coloca
desafios significativos a indivíduos invisuais. O trabalho descrito neste relatório tem como
objectivo o desenvolvimento de um módulo de processamento bio-inspirado, de forma a
produzir os impulsos electromagnéticos capazes de excitar as células do córtex visual e,
consequentemente, fornecer algum sentido de reabilitação a cegos profundos.
São propostos métodos e técnicas com vista a modelar a resposta da retina a um
estímulo visual e gerar impulsos neuronais para estimular o córtex visual. Com a finalidade
de melhorar o código neuronal gerado pelos sistemas artificiais, a modelação da retina foi
baseada na utilização de redes neuronais artificiais em oposição à abordagem do modelo
clássico. São propostas arquitecturas para a implementação de ambos os modelos.
Para implementar o sistema completo, projectou-se um protótipo dedicado. A finalidade deste protótipo é conseguir um sistema de dimensão reduzida e de "baixo consumo"
baseado numa FPGA. Comparando com um protótipo anteriormente desenvolvido, o
consumo de potência é reduzido em aproximadamente 50% e os recursos de memória
disponíveis aumentados em cinco vezes. Este novo protótipo permitiu a simulação e a
validação do modelo completo.
Palavras Chave
Modelo da Retina, Classificação de Células por Potenciais de Acção, Redes Neuronais,
Arquitecturas VLSI, Sistemas Configuráveis
Contents
1 Introduction
  1.1 Project description and main goals
  1.2 Contributions of this work
  1.3 Report organization
2 Human Visual System
  2.1 The human eye
  2.2 An overview of the retina
  2.3 Central visual pathways
  2.4 Retinal ganglion cell response to light
  2.5 Spike sorting
    2.5.1 Feature and principal components analysis
    2.5.2 Cluster analysis
    2.5.3 Pre-processing and classification
3 Retina Neural Models
  3.1 Defining the problem
  3.2 Classic model
  3.3 Neural networks modelling
    3.3.1 Introducing neural networks
    3.3.2 A neural networks approach
    3.3.3 Training the neural network
    3.3.4 Simulations and results
4 Processing Architectures
  4.1 Classic model architecture
  4.2 Neural networks architecture
  4.3 Spike multiplexing
  4.4 Serial communication protocol
  4.5 System architecture conclusions
5 Full System Prototype
  5.1 Prototype processing core
  5.2 Power distribution system design
  5.3 VGA display port
  5.4 Complete board and components placement
  5.5 Digital logic blocks
    5.5.1 Image capture and resize
    5.5.2 Register configuration
    5.5.3 Classic model implementation
    5.5.4 Neural networks implementation
    5.5.5 Serial communication protocol
    5.5.6 Image display
  5.6 Conclusions and results
6 Conclusions
  6.1 Future work
A Clustering Algorithms
  A.1 K-MEANS algorithm
  A.2 EM algorithm and bayesian classification
B SPiKes Classifier - User Manual
C Neural Network Modelling Spike Trains Simulations
D FPL Filter Implementation
E Prototype Datasheet
F Prototype Board Schematics
List of Figures
1.1 CORTIVIS project main modules.
2.1 Scheme of the human eye.
2.2 The retina as a layered structure.
2.3 A rod-initiated pathway. B: Bipolar Cells, RB: Rod Bipolars, AII: Amacrine cells, G: Ganglion Cells. (Source: [1]).
2.4 The human visual pathways (Source: [2]).
2.5 Receptive field center-surround organization.
2.6 Example of spike signal pre-processing and prototype traces after separation.
2.7 Retinal ganglion cells response to a fullfield flash stimulus.
3.1 Model with space-time separability.
3.2 Model with space-time dependency as suggested in [1].
3.3 Proposed models with nonlinear processing.
3.4 Neural network training by adjusting weight parameters.
3.5 Neural network processing unit.
3.6 A multi-layer network with two hidden layers of units.
3.7 Stimulus (upper panel) and response (lower panel).
3.8 Organization of the training set data to feed the neural network.
3.9 Comparison of real and neural network modelled spike trains.
4.1 Global architecture of the Artificial Retina.
4.2 Classic Bio-Inspired processing module.
4.3 Retina Early Layers diagram (adapted from [2]).
4.4 Integrate-and-fire block diagram.
4.5 Parallel architecture for a single perceptron.
4.6 Perceptron data flow diagram.
4.7 MAC architecture.
4.8 Serialization and data packing block.
4.9 Implemented AER module.
4.10 Packet structure [3].
4.11 New packet structure.
4.12 Adopted packet structure.
5.1 Block diagram of the prototype.
5.2 FPGA available I/O signals.
5.3 MAX1830/1831 adjustable configuration electrical diagram.
5.4 Equivalent circuit of a real capacitor.
5.5 Power plane division, different colors identify split plane borders. The decoupling capacitors with the lowest value are also represented.
5.6 DAC electrical diagram.
5.7 TPS78601 adjustable configuration electrical diagram.
5.8 Complete prototype board.
5.9 Digital logic block diagram.
5.10 Camera module sync signals.
5.11 Implemented hardware for frame capture.
5.12 Spatial low-pass Gaussian filter.
5.13 Block diagram for register configuration module.
5.14 Register configuration block diagram.
5.15 Block diagram of the program regs.
5.16 Write cycle.
5.17 Read cycle.
5.18 Early Layers FPL full architecture.
5.19 Integrate-and-fire adopted architecture.
5.20 Data packing block diagram.
5.21 Data unpacking block diagram.
5.22 VGA timing diagram.
5.23 VGA monitor control circuit [2].
5.24 Complete prototype system.
5.25 Photographs of the experimental results obtained with the artificial retina prototype. The input image (after downsizing) is displayed in the top left corner and the output in the bottom right corner.
B.1 SPKC (SPiKes Classifier) user window.
C.1 Comparison of real and modelled spike trains, neural network model with stimulus input only.
C.2 Comparison of real and modelled spike trains, neural network model with stimulus and response feedback input.
D.1 FPL implementation of the filters.
E.1 Prototype block diagram.
E.2 JTAG connector pin order.
F.1 FPGA electrical diagram.
F.2 FPGA power connections and configuration.
F.3 Power regulators.
F.4 Digital to analog converters.
F.5 Main schematic.
List of Tables
5.1 Capacitor value percentages for a balanced decoupling network [4].
5.2 Decoupling capacitor quantities.
5.3 Implementation costs for the Bio-inspired Processing Module.
5.4 Neural networks implementation hardware cost for different topologies.
5.5 VGA timings [5].
5.6 Complete Artificial Retina system implemented on a Xilinx Spartan XC3S400 FPGA.
E.1 VGA synchronization pins and DAC clock.
E.2 FPGA pins for the video DACs.
E.3 FPGA pins for the slide switches and the push button.
E.4 FPGA pins for camera connector expansion slot.
E.5 FPGA pins for the generic expansion slot.
Acronyms
bit – Binary digit
CAD – Computer Aided Design
CGC – Contrast Gain Control
CMOS – Complementary Metal-Oxide Semiconductor
CORTIVIS – Cortical Visual Neuroprosthesis for the Blind
CRC – Cyclic Redundancy Check
DAC – Digital to Analog Converter
DoG – Difference of Gaussians
EM – Expectation-Maximization
ESL – Equivalent Series Inductance
ESR – Equivalent Series Resistance
FIFO – First In First Out
FIR – Finite Impulse Response
FPGA – Field-Programmable Gate Array
FPL – Field-Programmable Logic
GCLK – Global Clock
HCMOS – High Speed Complementary Metal-Oxide Semiconductor
I2C – Inter-Integrated Circuit
IC – Integrated Circuit
IIR – Infinite Impulse Response
JTAG – Joint Test Action Group (IEEE Standard 1149.1)
LGN – Lateral Geniculate Nucleus
MAC – Multiply and ACcumulate
MOS – Metal-Oxide Semiconductor
MSE – Mean Squared Error
NTSC – National Television System Committee
PCA – Principal Components Analysis
PCB – Printed Circuit Board
PDS – Power Distribution System
PWM – Pulse Width Modulation
QVGA – Quarter Video Graphics Array
RAM – Random Access Memory
RF – Radio Frequency
RGB – Red Green Blue
ROM – Read-Only Memory
SCCB – Serial Camera Control Bus
VGA – Video Graphics Array
VHDL – VHSIC Hardware Description Language
VHSIC – Very High Speed Integrated Circuits
ZV – Zoomed Video format
Chapter 1
Introduction
The work presented in this report has been developed in the scope of the European project
Cortical Visual Neuroprosthesis for the Blind (CORTIVIS) [6], conducted by a European
consortium which includes INESC-ID. This project is financially supported by the
Commission of the European Communities, specific RTD programme "Quality of Life
and Management of Living Resources", QLK6-CT-2001-00279. The research work has
been carried out in close collaboration with the University Miguel Hernández and the
University of Granada, Spain.
In this chapter, the project's main goals are presented, as well as the contributions of
this graduation project towards achieving these goals.
1.1 Project description and main goals
Nowadays, in a society which relies heavily on sight, loss of vision poses extraordinary
challenges to individuals. Currently, there is no effective treatment for some patients who
are profoundly visually handicapped due to degeneration or damage in the retina, optic
nerve or the brain.
The CORTIVIS European project aims to develop prototypes in the field of visual
rehabilitation and to demonstrate the feasibility of a cortical neuroprosthesis, interfaced
with the visual cortex, as a means through which a limited but useful visual sense may
be restored to profoundly blind people. Even though the full restoration of vision seems
to be impossible, the discrimination of shape and location of objects could allow blind
subjects to ’navigate’ in a familiar environment and to read enlarged text, resulting in a
substantial improvement in the standard of living of the blind and the visually impaired.
A block diagram of the cortical visual neuroprosthesis is presented in Figure 1.1, where
the grey blocks are the modules that will be addressed in this work.
[Figure 1.1: CORTIVIS project main modules. Outside the human head: Image capture → Visual Encoding → Data Packing → RF Link Modulator; through the RF channel and inside the human head: RF Link Demodulator → Data Unpacking → Electrode Stimulator → Visual Cortex Cells.]
Following the same approach as our own visual system, the system uses a bio-inspired (retina-like) visual
processing front-end. The visual stimulus is first captured by some sort of image capture
device, like a digital video camera, and then sent to a visual encoding block. This block's
function is to map the visual stimulus into a sequence of action potentials, also called spike
trains, so that a blind individual is able to transform the visual world into electrical
signals that can be used to excite, in real time, the neurons of the visual cortex. The
output information from this bio-inspired peripheral device is then packed and sent via a
Radio Frequency (RF) serial link to the inside of the human head. An electrode stimulator
drives a penetrating microelectrode array in order to stimulate the primary visual
cortex cells. Such a visual neuroprosthesis is expected to recreate a limited, but useful,
visual sense in the blind individual using it.
1.2 Contributions of this work
The main goals of the CORTIVIS project have been presented; a brief overview of this
work's contributions is now given.
The work focuses on the Visual Encoding block of Figure 1.1. Although there are
models¹ proposed for this functional block, they may not provide accurate results. The
purpose of this work is to study another approach to modelling the response of the retinal cells
responsible for vision. Accordingly, the core of the work consists in the application and
analysis of supervised learning methods in order to determine a pattern of action potential
generation that matches the one of the human retina.
Another part of this work consists of the development of a prototype board designed
specifically for this project. This new board allows the implementation of the complete model, i.e. the highlighted blocks of Figure 1.1 outside the human head. Also, with a
dedicated design it is possible to achieve a smaller board and a lower power consumption.
The board has a processing core based on an FPGA and also provides a video
port.
Finally, a serial communication protocol was designed and implemented to transmit
the spike information generated by the Visual Encoding processing module to the brain
over the RF serial link.
¹ See Chapter 3 for a brief description of the retina neural models.
1.3 Report organization
This report is composed of six chapters, including the introduction
and the conclusions – the first and sixth chapters, respectively. The introductory chapter
describes the CORTIVIS project goals and the contributions of this graduation work towards
fulfilling these goals.
The second chapter introduces the human biological visual system. The classification
of retinal ganglion cells is also analysed, for which clustering algorithms are discussed. As
a result, a spike-sorting software tool was developed.
Chapter 3 addresses the problem of modelling the neural code of the retina. In this
chapter the Classic Model approach is presented and a new model based on Neural Networks is proposed. This new approach provides more precise information: spike trains
instead of just the instantaneous firing rate.
Chapter 4 presents a possible architecture for implementing these two retina models,
the Classic Model and the Neural Networks. The architecture for a Spike Multiplexing
module is also described. This module serializes the spike information generated by the
retina processing module. To obtain a complete system, a serial communication protocol
was also designed.
Finally, Chapter 5 introduces a full prototype for the complete retina model. The
digital logic blocks necessary to implement the retina model are described, including
the retina Classic Model and the Neural Networks model. Other hardware modules,
responsible for the interface with the digital camera and for the generation of an output
image representing the processed stimulus, are also presented. The hardware design for the
data packing and unpacking module that implements the serial communication protocol
is introduced. A dedicated prototype board is presented and the obtained experimental
results are discussed.
Chapter 2
Human Visual System
Visual perception results from a series of optical and neural transformations [1]. To
understand how the models described in this report relate to biology, one must be at
least familiar with the present knowledge of the biological visual system. This chapter is
intended to give the minimum background to someone who lacks this familiarity.
Light arriving at the eye is first transformed by the cornea and the lens, which focus it
and create a retinal image. The retinal image is then transformed into neural responses by
the light-sensitive elements of the eye, the photoreceptors. The photoreceptors' responses
are transformed into several neural representations within the eye, and these are transformed into a multiplicity of cortical representations. Many of these transformations occur
in parallel streams within the visual pathways, and will be discussed in the next sections.
In addition, the processing of neural spike activity is addressed. The detection of
neural activity is an issue of great importance. This is because different retinal ganglion
cells do not respond to light in the same way. To better study the visual system, one
must know which basic element carries information and how to group the different types
of cells that code this element in the same way. This subject is considered in the last two
sections of the chapter.
2.1 The human eye
When arriving at the eye, light is focused and inverted by the cornea and lens onto the
photoreceptors, a collection of light-sensitive neurons that are part of a thin layer of
neural tissue called the retina. Figure 2.1 shows a scheme of the human eye. The iris is
like a shutter that enables the eye to regulate the amount of entering light. The lens is a
spherical, elastic structure, stretched out into a disc-like shape by the zonule fibers
to allow far focusing. When the ciliary muscle contracts, the zonule fibers go slack and the
lens is released from tension, free to round up. This change is necessary for
near focusing, and the adjustment process is called accommodation.
The photoreceptors can be fundamentally classified into two different types, the rods and
the cones. There are approximately 5 million cones and 100 million rods in each eye. The
Figure 2.1: Scheme of the human eye.
distributions of these two types of photoreceptors differ in many ways across the retina. For
instance, the central fovea contains no rods, but it does contain the highest concentration
of cones, which makes the fovea the region of highest visual acuity in the human retina.
The photoreceptors transduce light into neural signals, which are transmitted through the
several layers of retinal neurons to those neurons whose output fibers make up the optic
nerve. The optic nerve fibers exit the eye at a location in the retina called the optic
disk¹, and carry the neural signals to the brain for further processing.
2.2 An overview of the retina
The retina is a thin layer of neural tissue in the back of the eye. It can be decomposed
into five layers: three layers of cell bodies and two layers of synaptic interconnections
between the neurons. The fovea, however, consists of only a single layer of neurons, the cone
photoreceptors. This structural form is depicted in Figures 2.1 and 2.2.
Light enters from the ganglion cell layer side first, and must penetrate all cell types
before reaching the rods and cones. This is because the pigment-bearing membranes of
the photoreceptors have to be in contact with the eye’s pigment epithelial layer, which
provides vitamin A [7]. The photoreceptors’ cell bodies are located in the outer nuclear
layer of the retina. The synaptic terminals of the photoreceptors make contact with
the dendritic fields of the bipolar cells and horizontal cells in the outer plexiform
layer. The cell bodies of the bipolar and horizontal cells are located in the inner nuclear
layer. The horizontal cells make connections with the cells in the outer nuclear layer.
The bipolar cells, however, make connections onto the dendrites of the ganglion cells
within the inner plexiform layer. Since only the bipolar cells link the signals in the
outer and inner plexiform layers, all the visual signals must go through the bipolar cells.
Another class of cells located in the inner nuclear layer are the amacrine cells. These
¹ The optic disk is also known as the blind spot.
Figure 2.2: The retina as a layered structure.
cells have no identifiable axons, only dendrites. The dendritic fields of the amacrine and
ganglion cells connect in the inner plexiform layer. The retinal ganglion cell bodies are
located in the ganglion cell layer, and their dendritic fields connect with the axon terminals
of the bipolars as well as with the dendritic fields of the amacrine cells.
The axons of the retinal ganglion cells provide the only retinal output signal. They
comprise the optic nerve and exit from the retina at a single location, the optic disk.
There are no photoreceptors at the optic disk so we do not perceive the light that falls
there (blind spot). We are not aware of the blind spot in our eyes since the portion of the
visual field falling in the blind spot of one eye falls on the functional portion of the retina
in the other eye, and the brain fills in the missing information with whatever pattern
surrounds the hole.
The retina segregates visual information into parallel neural pathways specialized in
different visual tasks. The signals initiated within the rods follow a separate rod pathway
within the retina until the signals arrive at the retinal ganglion cells, Figure 2.3. The
rods make connections with a class of bipolar cells, the rod bipolars, which integrate
the responses of many different rod photoreceptors and synapse directly on the amacrine
cells. The amacrine cells synapse onto the ganglion cells, which also receive signals via the
cone pathway, merging the rod and cone pathways. The strong convergence of
signals in the rod pathway makes it well suited to capturing information at
low light levels (night vision), while paying a penalty in terms of visual acuity. The
cone pathway, in contrast, is less sensitive but more acute, and is used for color
vision. This way, specialized computational goals can be achieved as a result of special
anatomical connections. Rodents such as rats, which are nocturnal animals, have retinas
Figure 2.3: A rod-initiated pathway. B: Bipolar Cells, RB: Rod Bipolars, AII: Amacrine
cells, G: Ganglion Cells. (Source: [1]).
overwhelmingly dominated by rods. Most fish, frog, turtle and bird retinas have three
to five types of cones and consequently very good color vision. It is presumed that each
visual pathway carries an efficient representation of the retinal image, the one most relevant
to the tasks carried out at its destination, the visual cortex. Finally, based on the
size and spread of the dendritic arborizations, the retinal ganglion cells can be mainly
classified into two types: the midget cells, whose dendritic fields are denser and more
compact, and the parasol cells, with sparse dendritic fields and large to medium-size
cell bodies.
2.3 Central visual pathways
Once the ganglion cells axons leave the retina, they travel through the optic nerve to the
optic chiasm. The optic chiasm is a location where the optic nerve axons from the two
retinae join and are then reorganized into two separate groups that encode information
from the right and left visual fields. There, the fibers are sorted into two new groups,
each connecting to only one side of the brain. Axons from ganglion cells whose receptive
fields are located in the left visual field send their output toward the Lateral Geniculate
Nucleus (LGN) on the right side of the brain, and vice-versa. Thus, the LGN receives a
retinal signal from both eyes, but only one half of the visual field. Figure 2.4 illustrates the
pattern of connections schematically, from the retinas all the way to the visual cortex.
The LGN is a structure located in the thalamus which is a major recipient of axons from
the retina. The primate LGN contains six different layers. The four superficial layers are
called the parvocellular layers and the two deeper layers the magnocellular layers.
The axons of the parasol and midget retinal ganglion cells connect with different layers:
the axons of the midget retinal ganglion cells terminate in the parvocellular layers, while
the axons of the parasol cells terminate in the magnocellular layers. This way, the midget
and parasol cells form separate visual pathways. The path from the midget ganglion
cells to the parvocellular layers of the LGN is called the parvocellular pathway, while
the pathway from the parasol cells to the magnocellular layers is called magnocellular
pathway.
[Figure 2.4: The human visual pathways (Source: [2]) — left and right hemifields, optic nerves, optic chiasm, optic tracts, Meyer's loop, the two LGNs, optic radiations and V1 at the occipital poles.]
There are several differences in the way neurons in the parvocellular and magnocellular
pathways code information. Also, anatomical and physiological measurements suggest
that both pathways carry different types of information to the brain. Electrical signals
traveling from the retina to the LGN have longer conduction times to the parvocellular
layers than to the magnocellular layers [1]. When facing contrast patterns, the neurons
in these two pathways also respond differently. As the stimulus contrast increases, the
response of neurons in the magnocellular pathway changes more rapidly than the response
in the parvocellular pathway. Thus, it appears that the magnocellular pathway exists as a
specialization that improves the ability to perform tasks requiring high temporal-frequency
information. Two examples of such tasks are motion detection and motion tracking.
The neurons in the LGN send their axons directly to area V1 (primary visual cortex)
via the optic radiations. When entering the primary visual cortex, information from
left and right visual fields is still separated. Here, this information will be mixed so that
binocular vision is achieved. Cortical area V1 is the first point in the visual pathways
where individual neurons receive binocular input. These binocular neurons may play an
important role in our perception of stereo depth.
2.4 Retinal ganglion cell response to light
The retinal ganglion cells are part of the pathway that transforms light stimulus into a
temporal series of discrete electrical pulses called action potentials or spikes. Within
the field of electrophysiology², the transformation associated with a neuron is called the
neuron’s receptive field. In the vision domain, the receptive field of a neuron is defined
as the retinal area in which light influences the neuron’s response. It is possible to obtain
this region by stimulating neurons with small flashes of light and simple moving bars
and determining when the neuron responds and fails to respond. The responses of many
neurons in the visual pathway are influenced only by light falling in narrow regions of the
retina, hence small regions of the visual field. Although one usually refers to the neuron’s
receptive field, in fact it depends on the properties of the entire visual pathway.
Most retinal ganglion cells respond with a random stream of action potentials when
they are stimulated with a large uniform field. The average number of action potentials
per unit of time in the presence of a constant field is called the spontaneous firing
rate. When the central region of the receptive field is flashed with a small spot of light,
the firing rate increases when compared to the spontaneous activity. Conversely, if the
spot is placed on a surrounding area, the cell’s activity decreases. The intermediate
region shows some excitation at the beginning of the stimulus and some inhibition at
it’s extinction. This center-surround organization defines important properties of
our visual capacities, like edge-detection. The case just mentioned defines a on-center,
off-surround receptive field and is illustrated in Figure 2.5(a). If the ganglion cell is
inhibited by light falling in the center and excited by light falling in the surround they
are called off-center, on-surround cells, Figure 2.5(b).
[Figure 2.5: Receptive field center-surround organization. (a) ON center, OFF surround; (b) OFF center, ON surround.]
² The field that studies the electrical responses of neurons.
2.5 Spike sorting
Retinal ganglion cells, like other neurons, communicate with the central nervous system
by firing action potentials. Therefore, the detection of retinal neural spike activity is a
technical challenge and a prerequisite for studying most retinal functions. These brief
voltage spikes can be recorded using a microelectrode, which often also picks up signals
from neurons in a local region. Due to the presence of large amounts of background
noise and the difficulty of distinguishing the action potentials of one neuron from those of
neurons nearby, measuring the activity of individual neurons accurately can be quite hard. Thus,
experimental results can be greatly enhanced by the use of spike-sorting software.
This section reviews classification of action potentials, a problem commonly referred
to as spike sorting, and presents the pre-processing of recorded spike signals necessary
for neural network data preparation. Spike sorting proved to be indispensable, since most
experimental retinal recording results were either not classified or the classification was
not accurate.
2.5.1 Feature and principal components analysis
From the recorded neural activity it can be seen that there are several different types of
action potentials. The spike shapes differ in many aspects: the sign of the principal peak
determines whether the response is ON-type or OFF-type, the ratio of the two major
peaks influences how transient the response is and the overall time scale of the curve
sets the time scale of the response. Assuming that these different spike types identify
different neurons, this correspondence can be established. Also, recorded waveforms show
a significant amount of background noise, which can arise from the electronic amplifiers
or from the complex biological system. Nevertheless, the classification of neurons should be
reliable even in the presence of this background noise.
If the spike shape could be characterized, this information could be used to classify each spike.
One approach to characterizing shape is to measure its features, such as the spike height
(the maximum spike amplitude), the width (the time lag between the maximum and the
minimum) or the peak-to-peak amplitude. This was one of the earliest
approaches to spike sorting [8]. In general, the more features considered, the better the
distinction between different spike shapes.
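As an illustration, the following is a minimal sketch of such ad-hoc feature measurements. The 120-sample, 4-ms trace format is the one described in Section 2.5.3; the function name and its defaults are hypothetical.

    import numpy as np

    def simple_features(waveform, dt_ms=4.0 / 120):
        """Ad-hoc shape features of a single spike waveform.

        waveform: 1-D array of voltage samples (e.g. 120 samples over 4 ms).
        dt_ms:    sampling interval in milliseconds.
        """
        height = waveform.max()                         # maximum spike amplitude
        width = abs(int(waveform.argmax()) - int(waveform.argmin())) * dt_ms
        peak_to_peak = waveform.max() - waveform.min()  # amplitude span
        return np.array([height, width, peak_to_peak])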
But choosing a feature based on an intuitive idea of what might be useful is an ad-hoc
approach and, although simple, it can yield poor cluster separation. Therefore, features
should be chosen systematically and automatically. One way to do this is with Principal
Components Analysis (PCA). The idea behind PCA is to find an ordered set of orthogonal
basis vectors that capture, in the data, the directions of largest variation. In the present
case, data is the spike recorded waveforms. Because the components are ordered in terms
of how much variability they capture, adding together the first k components will describe
the most variation in the data. Adding additional components yields progressively smaller
corrections until the spike is described exactly. The principal component vectors are
obtained by computing the eigenvectors of the covariance matrix of the data. A previous
study comparing clustering methods found that using principal components as features yields
more accurate classification than other features [8].
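The following sketch computes PCA features as just described, taking the principal directions from the eigenvectors of the covariance matrix; the names and the choice of k = 2 (the two components depicted graphically in Section 2.5.3) are illustrative.

    import numpy as np

    def pca_features(waveforms, k=2):
        """Project spike waveforms onto their first k principal components.

        waveforms: (n_spikes, n_samples) array of recorded traces.
        Returns an (n_spikes, k) feature matrix for clustering.
        """
        centered = waveforms - waveforms.mean(axis=0)   # remove the mean trace
        cov = np.cov(centered, rowvar=False)            # covariance matrix of the data
        eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
        order = np.argsort(eigvals)[::-1]               # largest variance first
        basis = eigvecs[:, order[:k]]                   # first k principal directions
        return centered @ basis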
2.5.2 Cluster analysis
Finding clusters in multidimensional data sets and classifying data based on those clusters
is called cluster analysis. The underlying assumption in clustering is that the data result
from several independent classes, each described by a relatively simple model. This is
the case of spike sorting, since each action potential is supposed to arise from a unique
neuron.
There are many methods for clustering [9], but all share two common tasks. The first
task is to describe both the cluster location and the variability of the data around that
location. The second task is, given a description of the clusters, to classify that data.
Perhaps the simplest approach to clustering is nearest-neighbour or k-means³
clustering. K-means consists of defining each cluster location as the mean of the data
within that cluster. When using the Euclidean distance, a spike is associated with whichever
cluster has the closest mean. This defines a set of implicit decision boundaries that
separate the clusters while ignoring the distribution of data within each cluster. This
approach is adequate when the clusters are well separated, but fails when clusters significantly overlap or when the cluster shapes differ significantly from a spherical distribution.
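A minimal sketch of this nearest-neighbour scheme follows; the initialization and stopping rule are common choices, not prescribed by the text.

    import numpy as np

    def kmeans(features, n_clusters, n_iter=100, seed=0):
        """Minimal k-means: each spike is assigned to the nearest cluster mean."""
        rng = np.random.default_rng(seed)
        # initialize the means with randomly chosen feature vectors
        means = features[rng.choice(len(features), n_clusters, replace=False)]
        for _ in range(n_iter):
            # Euclidean distance from every spike to every cluster mean
            dist = np.linalg.norm(features[:, None, :] - means[None, :, :], axis=2)
            labels = dist.argmin(axis=1)
            # recompute each mean from the data assigned to its cluster
            means_new = np.array([features[labels == c].mean(axis=0)
                                  if np.any(labels == c) else means[c]
                                  for c in range(n_clusters)])
            if np.allclose(means_new, means):
                break
            means = means_new
        return labels, means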
Bayesian clustering is another approach to modelling the data. In the most common approach,
one tries to capture the statistical distribution of the data by modelling each cluster with a multivariate Gaussian centered on the cluster. This approach offers many advantages over
the one described above, perhaps the main one being that it quantifies the certainty of
classification. The Expectation-Maximization (EM)⁴ algorithm is a technique for this type of
clustering [10]. Note that the probabilistic nature of membership allows decision boundaries to be computed as a function of confidence level. Also, an idea of the degree of
separation between classes can be obtained by looking at the distribution of the probabilities
for a class.
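As a sketch, the E-step of EM for such a Gaussian mixture computes exactly these membership probabilities (the M-step, not shown, re-estimates the means, covariances and priors from them); SciPy's multivariate normal density is used for brevity.

    import numpy as np
    from scipy.stats import multivariate_normal

    def e_step(features, means, covs, priors):
        """Posterior probability that each spike belongs to each cluster,
        i.e. the certainty of classification mentioned above."""
        resp = np.column_stack([
            priors[c] * multivariate_normal.pdf(features, mean=means[c], cov=covs[c])
            for c in range(len(means))
        ])
        return resp / resp.sum(axis=1, keepdims=True)  # normalize over clusters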
A difficult task in clustering is to decide the number of classes to use. In spike sorting,
the ideal number of classes corresponds to the number of neuron types being observed, but
choosing is not so simple. The distribution of spikes produced by individual neurons might
not be well described by the model employed. When spikes have stationary shapes and
uncorrelated noise, the clusters will appear nearly spherical, and conclusions drawn
from nearest-neighbour or symmetric Gaussian models can be very accurate [11]. In
other situations, such as non-stationary spike shapes or correlated noise, Gaussian mixture
models can still accurately model data. But complicated cases arise when neurons generate
complex bursts or the background noise is non-stationary. Therefore, the structure of the
data can be very difficult to capture, and it will be difficult both to predict the number
of units and to make accurate classifications.
³ See Appendix A.1: K-MEANS algorithm.
⁴ See Appendix A.2: EM algorithm and bayesian classification.
2.5.3 Pre-processing and classification
In order to model the spike firing events of retinal ganglion cells, so that a full
retinal model can be developed, cells were classified as part of the pre-processing of the data
used to feed supervised classifiers such as neural networks⁵. As already seen in Section 2.4,
there are several types of retinal ganglion cells, mostly characterized by their ON-OFF
response. A model of these cells must take this into account. For that reason, and before
advancing any further, the experimentally recorded action potentials were sorted into classes,
a process which is described next.
The experimental results, unsorted spike data, were obtained as described in [12] and
supplied by the University Miguel Hernandez, Spain. Experiments were made mostly on
cat and rabbit retinas.
The classification of this data was performed with spike-sorting software developed specifically for this task. The software was written in MATLAB. A user manual for this
software can be found in Appendix B.
Typically, several spike waveforms occurred at one electrode (multi-unit activity).
These spike 'prototypes' at each electrode were separated using unsupervised classification
methods: the k-means and EM algorithms. Each 4-ms waveform consisted of 120 equally
spaced voltage samples. These formed one input vector, resulting in a 120-dimensional
space where the voltage traces cluster in a restricted volume. The two most important
dimensions for the cluster analysis were depicted graphically. The number of expected
clusters was adjusted and selected manually, after inspection of the raw data and its first
two principal components. After classification, a single cluster could be selected manually
and all traces belonging to it were shown superimposed. Classification could always be
repeated if the results were not satisfactory.
In the example of Figure 2.6(a) it was assumed that three prototypes were recorded in one electrode. The two most important principal components are shown in Figure 2.6(b). Clustering was performed using k-means and the results are shown in Figures 2.6(c) and 2.6(d). The three corresponding classified prototype traces are shown in Figures 2.6(e), 2.6(f) and 2.6(g). In general, multi-prototype signals were obtained and separation was difficult. Therefore, only those prototypes
whose waveforms were typical of single-unit activity, unequivocal in terms of both amplitude
and shape, such as those in Figures 2.6(e), 2.6(f) and 2.6(g), were selected for future use. As suggested in [12], spike
prototypes of identical waveform at a given electrode are assumed to originate from
a single ganglion cell.
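Putting the steps together, the pre-processing just described can be sketched as follows, reusing the pca_features and kmeans sketches from Sections 2.5.1 and 2.5.2; the file name and the cluster count are illustrative (the count was chosen manually after inspecting the data).

    import numpy as np

    # waveforms: (n_spikes, 120) array of 4-ms traces recorded at one electrode
    waveforms = np.load("electrode_traces.npy")      # hypothetical file name

    features = pca_features(waveforms, k=2)          # first two principal components
    labels, _ = kmeans(features, n_clusters=3)       # e.g. three expected prototypes

    # inspect one cluster: all traces assigned to unit 0, to be shown superimposed
    unit0 = waveforms[labels == 0]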
The sorting performed showed remarkable results. Before the several electrodes were sorted
into their types, the spike trains were as shown in Figure 2.7(a); as can be seen, some spike
trains do not appear to be correlated. Spike sorting grouped those that showed
similar temporal firing. The results are evident in Figure 2.7(b).
⁵ See Section 3.3.
[Figure 2.6: Example of spike signal pre-processing and prototype traces after separation. (a) Electrode multi-unit activity, amplitude (µV) versus time (ms). (b) Distribution of the first two principal components of the waveforms in (a). (c) Multi-unit activity sorted by classes: unit 1 = blue, unit 2 = green, unit 3 = red. (d) Principal components of the sorted waveforms in (c). (e)-(g) Responses classified to units 1, 2 and 3.]
[Figure 2.7: Retinal ganglion cells' response to a full-field flash stimulus — raster of spike firing events (stimulus on top, responses of cells of the same type below) over 60 s. (a) Before spike sorting; (b) after spike sorting.]
Chapter 3
Retina Neural Models
When facing the problem of modelling the neural code of the retina, one of the first
questions we ought to ask is: How do action potentials represent information? [13]
Within the central nervous system, information is conveyed using action potentials as
the standard signal. Since our only source of visual experience results from the electrical
spikes on the ganglion cell axons, it is important to understand how these electrical spikes
shape our visual perception. The retina performs a significant amount of processing,
compressing the visual signal from a neural population of 10⁸ photoreceptors into 10⁶
optic nerve fibers, and so one might even learn new principles for efficient coding.
3.1 Defining the problem
Since the structure of the retina is remarkably preserved from fish to primates, models of
light response that successfully predict a ganglion cell’s firing rate also share a common
structure: differences among species affect the quantitative parameters of these models,
but not their basic elements. Also, a useful description of the neural code should specify [13]: (1) the relevant measure of neural activity in the ganglion cell population; (2)
how this activity responds to any given visual stimulus; (3) the precision of this response;
(4) and the variability of the stimulus response based on the recent history of the visual
input. Thus, throughout the stimulus presentation, the neuron’s instantaneous firing
probability at various times is generally taken to be the most important response feature.
As seen before, in Section 2.3, our visual perception is conveyed to the brain in parallel
pathways, but most of what we know about retinal signalling results from recording single
retinal ganglion cells. The underlying assumption is that the retinal population consists
of groups of neurons with similar functional properties and that the firing of each neuron
depends only on the stimulus presented and not on the activity of the surrounding neurons.
Unfortunately, this last condition is not always met [14].
Due to the opposite action of the center and surround regions of the receptive field,
ganglion cells respond to stimuli whose intensity varies in space, creating a response
sensitive to object edges. Also, because the response to a light step stimulus lasts only a few
milliseconds, ganglion cells seem to emphasize stimuli that vary in time over static ones.
In general, the stimulus is given by the intensity distribution I(x, t, λ) on the retina,
as a function of the position x, time t, and wavelength λ. One thus seeks a mathematical
model that captures the most important aspects of the retinal processing, and whose
output is the temporal evolution of the cells firing rate.
3.2 Classic model
The basic model architecture is depicted in Figure 3.1. The input is a spatiotemporal
stimulus contrast pattern I(x, t). The model consists of a set of band-pass spatial filters,
S(x), and high-pass temporal filters, T (t), assumed to factorize into a spatial and temporal
part (space-time separability):
$R(x,t) = S(x) \cdot T(t)$   (3.1)
The spatial profile of the response amplitude, S(x), follows from a Difference of Gaussians (DoG) model. The DoG model assumes neural response as a combined signal of the
center and surround as separate mechanisms. The center mechanism contains the input
only from a small central region of the cell’s receptive field, whereas the surround mechanism includes the center and the surrounding region. Both mechanisms are assumed to
behave as space-time separable linear systems when responding to contrast stimuli.
[Figure 3.1: Model with space-time separability — the stimulus I(x,t) passes through the DoG spatial filter S(x), the high-pass temporal filter T(t) and a rectification stage to produce the firing rate F(t).]
To describe the spatial sensitivity of the center and surround, a curve with the shape of
a Gaussian distribution is used. The output of a neuron is obtained by the sum of the
temporal response of both mechanisms, although the temporal responses are different in
the center and surround, generally being slower in the latter, Figure 3.2. This spatial
profile can be formalized by:
$S(x) = \frac{g_+}{\sqrt{2\pi\sigma_+^2}}\, e^{-x^2/2\sigma_+^2} - \frac{g_-}{\sqrt{2\pi\sigma_-^2}}\, e^{-x^2/2\sigma_-^2}$   (3.2)
where the parameters $g_+$ and $g_-$ determine the relative weights of the center and surround,
respectively, and $\sigma_+$ and $\sigma_-$ their diameters.
The temporal profile can be approximated by a first-order high-pass filter:
$T(t) = \delta(t) - \alpha H(t)\, e^{-\alpha t}$   (3.3)
where $\delta(t)$ denotes the Dirac delta function, $H(t)$ is the Heaviside step function and $\alpha^{-1}$
is the decay time constant of the response.
[Figure 3.2: Model with space-time dependency as suggested in [1] — the surround signal is delayed relative to the center before the high-pass filtering and rectification that produce the firing rate F(t).]
As a result, R(x, t) can be seen as the change in the firing rate produced by flashing a
light spot at time t = 0 and location x. Assuming linearity, the firing rate F (t) produced
follows from the sum over the entire visual field and becomes
$F(t) = f_0 + \iint I(x,\tau) \cdot S(x) \cdot T(t-\tau)\, dx\, d\tau$   (3.4)
where f0 is the spontaneous firing rate without stimulation, and negative values are truncated to zero. This expression can also be seen as a weighted version of the stimulus
by S(x), summed over all space, and then passed through a temporal filter of impulse
response T (t). It can be rewritten in a more compact form as
$F(t) = \varphi\, [R(x,t) * I(x,t) + f_0]_+$   (3.5)
with $*$ denoting the convolution operator, $\varphi$ a scale factor and $[x]_+ := x H(x)$ the rectification operator. Although this model captures the main characteristics of the retinal
stimulus-response relation, it is a coarse approach. Stimulus contrast exerts a strong modulatory
effect on the responses of retinal ganglion cells. In humans, for example, visual stimulation with bright, full-field flicker of high frequency evokes impressive illusions of spatial
pattern [12]. To improve this model, a nonlinear contrast gain control mechanism
was added, which turned out to be crucial for motion processing. The model is depicted in
Figure 3.3¹.
The contrast gain control mechanism consists of a first-order low-pass temporal filter followed by a nonlinear gain control function. The low-pass temporal filtering can be
represented by
$v(x,t) = u(x,t) * [H(t)\, B\, e^{-t/\beta}]$   (3.6)
where $B$ and $\beta$ define the strength and the time constant of the low-pass filter, $u(x,t)$ is
given by
$u(x,t) = g(x,t) \times R(x,t)$   (3.7)
and the nonlinear gain control function is
$g(x,t) = \dfrac{1}{1 + [v(x,t)]_+^4}$   (3.8)
Finally, the system's output, the instantaneous firing rate, is obtained from the rectification of the result,
$F(t) = \psi\, [u(x,t) + f_0]_+$   (3.9)
¹ The following equations hold for the model presented in Figure 3.3(a).
[Figure 3.3: Proposed models with nonlinear processing. (a) Nonlinearity at the end [12]; (b) feedback loop around the whole system [15]. Both add a low-pass filter and a nonlinear gain control to the DoG, high-pass and rectification chain.]
where $\psi$ is once again a scale factor.
Functionally, this feedback loop generates a delayed suppression of high, sustained
retinal ganglion cell activation, thereby altering the temporal characteristics of the retinal
ganglion cells' firing rates [12]. In other words, the net effect of the contrast gain control is
that, during large light fluctuations, the retinal response is less sensitive and faster [13].
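As a summary of Equations (3.2)-(3.9), the following is a minimal one-dimensional, discrete-time sketch of the model of Figure 3.3(a); here r[i] stands for the linear stage output (R ∗ I)(t), and all numeric parameter values are illustrative assumptions rather than values used in the project.

    import numpy as np

    def dog(x, g_c=1.0, g_s=0.5, s_c=1.0, s_s=3.0):
        """Spatial DoG profile S(x) of Eq. (3.2)."""
        c = g_c / np.sqrt(2 * np.pi * s_c**2) * np.exp(-x**2 / (2 * s_c**2))
        s = g_s / np.sqrt(2 * np.pi * s_s**2) * np.exp(-x**2 / (2 * s_s**2))
        return c - s

    def classic_model(I, x, t, alpha=20.0, B=1.0, beta=0.1, f0=5.0, psi=1.0):
        """Firing rate for a stimulus I[space, time] sampled on grids x and t."""
        dx, dt = x[1] - x[0], t[1] - t[0]
        # spatial part of Eq. (3.4): weight the stimulus by S(x), sum over space
        r = (I * dog(x)[:, None]).sum(axis=0) * dx
        # Eq. (3.3): T(t) = delta(t) - alpha * H(t) * exp(-alpha * t)
        T = -alpha * np.exp(-alpha * t) * dt
        T[0] += 1.0
        r = np.convolve(r, T)[: len(t)]
        # Eqs. (3.6)-(3.9): contrast gain control loop and final rectification
        v, F = 0.0, np.zeros_like(r)
        for i in range(len(r)):
            g = 1.0 / (1.0 + max(v, 0.0) ** 4)  # Eq. (3.8)
            u = g * r[i]                        # Eq. (3.7)
            v += dt * (B * u - v / beta)        # Eq. (3.6) as a leaky integrator
            F[i] = max(psi * (u + f0), 0.0)     # Eq. (3.9), rectification
        return F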
3.3 Neural networks modelling
In this section, a different approach to modelling retinal ganglion cells' firing is presented. The
idea is to train a neural network with experimental data from multi-electrode recordings
of the ganglion cell layer of a retina, attempting to capture the structure of the fired
spike trains.
3.3.1 Introducing neural networks
Artificial neural networks can be most adequately characterized as computational models with particular properties, such as the ability to adapt or learn, to generalise, and to cluster
or organize data, and whose operation is based on parallel processing [16]. These computational models are inspired by biological nervous systems. As in nature, the connections
between the elements determine the function of the network. For that reason, an artificial neural network consists of a pool of simple processing units which communicate by
sending signals to each other over a large number of weighted connections.
Neural networks are trained by adjusting the values of the connections (also called
weights) between elements so that a particular input leads to a specific target output, see
Figure 3.4. The ability to perform a particular function depends largely on the chosen
weights. Typically, the training is performed in a supervised manner: a training set is
assumed to be available, containing both the input patterns and the corresponding desired output
patterns (also called target patterns). Training is based on the minimization of an error
measure between the network outputs and the desired outputs.
[Figure 3.4: Neural network training by adjusting weight parameters — the network output is compared with the target, and the weights are adjusted until the two match.]
The most widely used kind of neural network is composed of simple units of the type
shown in Figure 3.5 [17]. The unit performs a weighted sum of its inputs, which is then
passed through a nonlinear function known as the activation function. Units are arranged
in a feed-forward layered structure, as depicted in Figure 3.6. Each layer consists of units
that receive their inputs from the units in the previous layer and send their outputs to
the units in the next layer. There are no connections between units within a layer. An example of
a network with two hidden layers is presented in Figure 3.6.
[Figure 3.5: Neural network processing unit: inputs x1, ..., xN and a bias input are weighted by w0, ..., wN, summed, and passed through the activation function to produce the output y.]
The basic algorithm for training a neural network is based on the gradient method. This method consists of iteratively adapting the weights, w, by taking steps proportional to the negative of the gradient of the function to be minimized, E,

w_{n+1} = w_n − η · ∇E_n    (3.10)

where η is the step size. In the case of feed-forward networks, the gradient components are computed by the backpropagation rule [16], which can be derived by applying the chain rule of differentiation.
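To make the update (3.10) concrete, here is a minimal C sketch of one gradient step for a single linear unit under a squared-error measure; this is illustrative only, since the trainings reported below used conjugate gradient methods rather than plain gradient descent.

    /* One gradient step (3.10) for a single linear unit trained on one
     * pattern with squared error E = 0.5*(y - t)^2.  Illustrative only. */
    void gradient_step(double w[], const double x[], int N, double target,
                       double eta)
    {
        double y = 0.0;
        for (int i = 0; i < N; i++)          /* forward pass: weighted sum */
            y += w[i] * x[i];
        double err = y - target;             /* dE/dy for the squared error */
        for (int i = 0; i < N; i++)          /* w <- w - eta * dE/dw        */
            w[i] -= eta * err * x[i];
    }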
[Figure 3.6: A multi-layer feed-forward network with two hidden layers of units between the inputs and the outputs.]
3.3.2 A neural networks approach
The structure of firing of retinal ganglion cells in response to a stimulus can be seen as the probability of some sequence {t_i}, i = 1, 2, ..., N, where t_i are the times of the fired spikes, given that the stimulus I(t) is presented. As a result, given a particular time-dependent stimulus I(t), one aims to find a complete description of the neural response based on the conditional distribution P[{t_i}|I(t)], which measures the relative likelihood that spikes will arrive at the set of times {t_1, t_2, ..., t_N}. In practice, however, experiments usually seek a characterization through the first few moments of such a distribution, such as the mean, the variance, and so on. It can be shown that the mean of this conditional distribution matches the firing rate attained by the model of the previous section [18]. Therefore, the model presented in Section 3.2 suffers from limitations by considering only firing rates and neglecting the other moments of the distribution. It thus neglects the fine temporal structure of retinal spike trains and the correlations between retinal ganglion cells [14, 19].
The presented approach tries to overcome some of these deficiencies. The idea is to train a neural network using experimental spike trains obtained with simple types of stimuli. Due to the strong adaptation ability of these systems, they would ideally learn the feature sets that produce the spike occurrences. At each moment in time, the information fed to the network should allow it to predict the occurrence of a spike. Over a time interval, the spike train produced would then be similar to a spike train produced by the retinal ganglion cells themselves.
It has been assumed that spatial correlations between retinal ganglion cells do not exist, and that cells of the same type fire in approximately the same way. This means that the firing of a cell is assumed not to be influenced by its neighbouring cells, which is not true in real biological systems. The similarity between groups of cells, ON/OFF, might help one to find patterns in the way neurons represent the information conveyed to the brain. After classifying cells using the procedure described in Section 2.5, they were grouped by type of response. Each group represented a type of firing to be modelled.
The stimulus consisted of full-field flashes, i.e., stimulations where the whole stimulated area is illuminated with a spatially constant colour stimulus for a given time period (ON), followed by a dark (no light stimulation) period (OFF). This stimulus and the response of a cell at a given electrode are depicted in Figure 3.7. Notice that most of the time there is no spiking activity.
[Figure 3.7: Stimulus (upper panel) and cell response (lower panel): spike firing events over a 10 s window.]
3.3.3 Training the neural network
Artificial neural networks were constructed and trained using the Neural Network Toolbox
from MATLAB. Up to four-layer, fully connected, feed-forward networks with twenty
hidden units in each hidden layer were used, as illustrated in Figure 3.6.
To model retinal ganglion cell firing, a set of different experiments was performed. These experiments aimed at finding the network topology that most accurately classifies the firing events.
The first subset of experiments had the purpose of finding the fastest and most accurate training method. Activation functions were also tested and selected. After this, networks were trained by backpropagation to minimize the Mean Square Error (MSE) between the spike prediction and the true spike. The direction of the change to each weight was computed using conjugate gradient descent with Powell-Beale restarts [20, 21] (traincgb in MATLAB), and the size of the step was computed by Charalambous' line search [22] (srchcha in MATLAB), which provides an adaptive learning rate with cubic interpolation and sectioning. The sigmoid activation function was given by (3.11).

f(s) = 2 / (1 + e^{−2s}) − 1    (3.11)
The search for the best network topology ended with a second group of simulations. Here, the number of hidden layers and units was varied so that the data were modelled without underfitting or overfitting.
The input layer of the network was presented with a window of spike trains 0.96 s long (the sampling window width used for rabbit retinas [23]), binned at 10 ms resolution. In the first group of experiments, each input unit represented the stimulus in one time bin; a second group of experiments also added the existence of spikes in the time bins. The value of the output unit represented the spike prediction. In other words, the network takes the stimulus sequence {I(t_1), I(t_2), ..., I(t_N)} and the corresponding values of the spike train at times {t_i}, i = 1, 2, ..., N−1, {F(t_1), F(t_2), ..., F(t_{N−1})} (1 if spike, 0 if not) as input, and is trained to reproduce a single spike firing event at time t_N, F(t_N). Two different types of training were carried out, differing in the nature of the input used: just the stimulus, as seen in Figure 3.8(a), or the stimulus and past spike occurrences, as seen in Figure 3.8(b). In both cases there is only a single output value, the predicted spike event at time t_N.
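As an illustration of how such patterns could be assembled, the following C sketch builds one input/target pair in the layout of Figure 3.8(b); the array names and the helper itself are hypothetical, not part of the original MATLAB tooling.

    /* Sketch of one training pattern (Figure 3.8(b)): N = 96 stimulus bins
     * plus N-1 = 95 past spike bins as input, and the spike event at bin N
     * as target.  Assumes n >= N_BINS - 1 so the window fits. */
    #define N_BINS 96

    void build_pattern(const double stim[],  /* stimulus, one value per 10 ms bin */
                       const int spikes[],   /* 1 if a spike fell in the bin      */
                       int n,                /* index of the bin to predict       */
                       double input[2 * N_BINS - 1], double *target)
    {
        int k = 0;
        for (int i = n - N_BINS + 1; i <= n; i++)     /* I(t_1) ... I(t_N)     */
            input[k++] = stim[i];
        for (int i = n - N_BINS + 1; i < n; i++)      /* F(t_1) ... F(t_{N-1}) */
            input[k++] = (double)spikes[i];
        *target = (double)spikes[n];                  /* F(t_N), 1 or 0        */
    }

The resulting input dimension, 96 + 95 = 191, matches the pattern vectors reported in the simulations below.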
[Figure 3.8: Organization of the training set data fed to the neural network. (a) Training set with stimulus input I(t_1), ..., I(t_N) and firing event output F(t_N). (b) Training set with stimulus input I(t_1), ..., I(t_N) and past firing events F(t_1), ..., F(t_{N−1}) as input, and firing event output F(t_N).]
These experiments allowed convergence to the topology that produced the best results. A comparison of the performed trainings is reported in the following subsection.
3.3.4 Simulations and results
To illustrate the performance of the model, raster plots of real spike trains are presented,
along with the corresponding predicted spike trains from the neural networks modelling
(see Appendix C).
The inputs consisted of pattern vectors of dimension N = 96 when using only the stimulus and N = 96 + 95 = 191 when using the stimulus and feedback of past spikes. Networks were trained with a training set of approximately 60 000 patterns in the first case and 80 000 in the latter. From these, 75% were used for training and 25% for validation. Finally, network performance was assessed using a third data set reserved for testing, about the same size as the validation set.
Modelling with stimulus input
The first type of training used only stimulus as input. The assumption is straightforward:
there is no temporal correlation between spikes.
The result is patent in Figure C.1. As noted, the system was not able to recreate the spike train produced by the cell. The adaline² responded with a periodic output, as expected, since the input is totally periodic; this is a property of linear systems. As for the nonlinear networks with one and two hidden layers, no matter the number of units used, they also generated almost periodic responses. This can be explained by considering that sigmoid functions are approximately linear close to the origin and that the activation entries³ fall in that region.

² This is a particular case in which the unit activation function is the identity, f(s) = s.
³ Activation function input values.
Looking at the cell response, it is observable that the assumption was incorrect. Temporal correlation is one of the reasons why the visual system has such good coding properties.
Modelling with stimulus and past spikes input
The introduction of new information into the learning system by adding past spikes proved vital. Now, the input and the output are no longer periodic. As a result, the neural networks provided a much higher accuracy in the prediction of the spiking events. An example of the results is depicted in Figure C.2, for the same topologies trained earlier. The inclusion of past information transformed this into a system with memory. For that reason, the temporal correlation between successive spikes was exploited and the results are plain to see. Depending on the network topology, the structure of the spike train was captured to some extent.

In order to choose the topology that best classified spike events, a closer inspection (zooming) was performed. Figure 3.9 is a combination of the best results obtained (in this example) in the previous trainings. As in other experiments performed, the best results are associated with nonlinear topologies, generally using one hidden layer with 20 hidden units or two hidden layers with 20 plus 10 or 10 plus 5 units. In this case, the best topology is a two-hidden-layer network with 10 and 5 units in the respective layers.
[Figure 3.9: Comparison of real and neural-network-modelled spike trains: stimulus, cell response, ADALINE, Model 20-1 and Model 10-5-1 over 0-60 s, with a zoom on the 15-21 s interval.]
Chapter 4
Processing Architectures
In this chapter, architectures are proposed for implementing the two retina models: the Classic Model and the Neural Networks model. In addition, to create a complete system, it is necessary to design an architecture for the serial communication sub-system. The serial protocol allows the spike information to be transmitted into the brain over an asynchronous serial RF link. The full diagram of the system can be seen in Figure 4.1, in which grey highlights the blocks that are addressed in this chapter.
[Figure 4.1: Global architecture of the Artificial Retina: a digital camera feeds the visual stimulus to the bio-inspired processing module; the spikes pass through serialization and data packing, an RF link modulator and the RF link; inside the human head, an RF link demodulator and a data unpacking block deliver the spike bit vector to the electrode stimulator.]
The input is generated by an image capture device, in this case a digital camera, whose
output is in Red Green Blue (RGB) format, and connects to the processing modules. The
spike information is then wrapped into packets by a data formatting block. These data
packets are transmitted over an RF channel. At the receiver end, an RF demodulator receives the serial data and feeds it serially to a block responsible for data unpacking. Finally, the spike information is sent to the electrode stimulator.
4.1 Classic model architecture
The developed architecture for the bio-inspired processing module of the artificial retina model is divided into two distinct parts: Retina Early Layers and Neuromorphic Pulse Coding. Figure 4.2 represents the block diagram of this bio-inspired processing module.
[Figure 4.2: Classic bio-inspired processing module: the visual stimulus passes through the Retina Early Layers and the Neuromorphic Pulse Coding block, which outputs the spikes.]
Here, the Retina Early Layers block implements the model introduced in Section 3.2, and the Neuromorphic Pulse Coding block generates the action potentials from the instantaneous firing rate. This last block is described in the present chapter. It is specified that the Retina Early Layers module should be able to process frames at a maximum rate of 100 Hz, and that the maximum spike generation rate of the Neuromorphic Pulse Coding must be, at most, 1 kHz per cell [2].
The retina early layers
The full processing mechanism for the Retina Early Layers is presented in Figure 4.3.
[Figure 4.3: Retina Early Layers diagram (adapted from [2]): the R, G and B components of the visual stimulus are space-filtered by Gaussian kernels, delayed and combined, time-filtered by the high-pass stage, and passed through the contrast gain control (nonlinear gain and low-pass feedback) to produce u(x,t) and F(x,t).]
The spatial computation is done using one DoG¹ filter, equation (D.2), if space-time independence is considered. Nevertheless, the spatial filtering should be done using two Gaussians, in order to independently convolve the center and the surround, since several authors have suggested that space-time filtering is not completely separable [1, 13]. This implies the use of a low-pass filter, equation (D.8), after each Gaussian. With such an architecture it is possible to implement not only space-time dependent models, but also the one with space-time separability.

As different colors are treated differently by the retina, the spatial filter can be expanded into three separate filters, one per RGB channel. As depicted in Figure 4.3, considering space-time dependency, there will be six receptive fields, two per RGB channel. The time filtering is done using the high-pass filter presented in (D.6). The Contrast Gain Control (CGC) nonlinear function was implemented using a look-up table, and the low-pass corresponds to equation (D.8).

¹ See Appendix D for an overview of the filters' implementation.
Neuromorphic pulse coding
As the human brain responds to action potentials, the instantaneous firing rate at the output of the Early Layers must be converted to this representation. This task is carried out by the Neuromorphic Pulse Coding block. A first approach to fulfil this goal is to use the standard integrate-and-fire mechanism. To improve this model, some modifications have been suggested: (1) a leakage factor in the integrator [23]; (2) a spike height modulation factor [24]. Integrate-and-fire was implemented with a simplification: the spike amplitude remains constant, and only the leakage factor was included. This solution decreased the hardware needed, as it requires less memory. Figure 4.4 shows the schematic representation of this block.
[Figure 4.4: Integrate-and-fire block diagram: the firing rate F[q,n] is accumulated into P_acc[q,n] through a feedback loop with gain γ and leakage d, and compared against the threshold φ to produce pulse[q,n].]
The module can be described by (4.1) and (4.2),

P_acc[q, n] = F[q, n] + γ · P_acc[q, n − 1] − pulse[q, n − 1] − d    (4.1)

pulse[q, n] = H(P_acc[q, n] − φ)    (4.2)

where γ is the feedback loop gain, d is a leakage factor, φ is the firing threshold and q represents the discrete spatial dimensions.
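A minimal C sketch of this update for one cell q is given below, under the assumption of floating-point arithmetic (the hardware uses fixed point and a constant spike amplitude); parameter values are left to the caller.

    /* Leaky integrate-and-fire update (4.1)-(4.2) for one cell. */
    typedef struct {
        double p_acc;      /* accumulated potential P_acc[q, n-1] */
        int    pulse;      /* last output pulse[q, n-1]           */
    } if_cell;

    int integrate_and_fire(if_cell *c, double F,
                           double gamma, double d, double phi)
    {
        /* eq. (4.1): accumulate input, leak, subtract the previous spike */
        c->p_acc = F + gamma * c->p_acc - (double)c->pulse - d;
        /* eq. (4.2): fire when the accumulated value crosses the threshold */
        c->pulse = (c->p_acc - phi >= 0.0) ? 1 : 0;
        return c->pulse;
    }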
4.2 Neural networks architecture
The architecture for the bio-inspired processing module of the artificial retina model that
is presented does not include the learning mechanism of the neural network. It is assumed
that a similar net has been previously trained and thereby all the necessary coefficients
are known.
The basic block for implementing the neural networks model is the perceptron. As the polarization (bias) input requires one adder but no multiplier, a perceptron with N inputs implies that the system requires N multipliers and an adder tree with N adders. Figure 4.5 represents the parallel architecture for a single perceptron. This parallel solution requires too many hardware resources. Since a parallel solution is not reasonable to implement, the proposed architecture was achieved by using folding. The basic block is composed of a multiplier and an adder, for which a data flow diagram to calculate a single output of a perceptron is represented in Figure 4.6.

[Figure 4.5: Parallel architecture for a single perceptron: every input I_i is multiplied by its coefficient coef_i and the products are summed, together with coef_0, in an adder tree feeding the nonlinear block that outputs the spike.]
[Figure 4.6: Perceptron data flow diagram: the products coef_i · I_i are accumulated sequentially, starting from coef_0, over time steps t_1, ..., t_{N+1}, producing nl_input.]
The necessary functionality can be performed by a simple Multiply And Accumulate (MAC) unit (Figure 4.7), considering that the initial value of the feedback register is coef_0.
[Figure 4.7: MAC architecture: a multiplier for coef_i · I_i feeding an accumulating adder.]
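As a behavioural illustration, a C sketch of the folded evaluation on one MAC unit, matching the N + 1 cycles counted in (4.4) below, might look as follows; the names are illustrative.

    /* Folded perceptron evaluation on a single MAC unit (Figures 4.6, 4.7):
     * one multiply-accumulate per clock cycle, with the accumulator
     * initialised to coef[0] (the bias). */
    double perceptron_mac(const double coef[], const double in[], int N)
    {
        double acc = coef[0];                 /* feedback register preset */
        for (int i = 1; i <= N; i++)          /* N + 1 cycles in total    */
            acc += coef[i] * in[i - 1];       /* one MAC operation        */
        return acc;                           /* nl_input, before the LUT */
    }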
To estimate the cost of this solution, it is necessary to determine how many of these basic blocks are required. A maximum spike frequency (f_spike = 1/T_spike) was specified for each cell. Considering a micro-electrode array of size D, the time available for processing the output of one cell (T_electrode) will be

T_electrode = T_spike / D    (4.3)
Considering N as the number of inputs of a perceptron, using one MAC will require N + 1 clock cycles to compute the output,

T_percept(N) = (N + 1) · T_MAC    (4.4)

where T_MAC is the delay time of the MAC block². Denoting N_i, i = 1, ..., k, as the number of inputs of the i-th layer, the time necessary to compute the output of a network with k layers is given by (4.5).

T'_spike = N_2 · T_percept(N_1) + ... + N_k · T_percept(N_{k−1}) + T_percept(N_k)
         = Σ_{j=2}^{k} N_j · T_percept(N_{j−1}) + T_percept(N_k)    (4.5)

² The multiplier introduces the largest delay.
In order to achieve the desired spike frequency, the number of necessary MACs, M, must respect

M ≥ T'_spike / T_electrode    (4.6)

The coefficients, since their values are known, can be stored in a Random Access Memory (RAM) or Read-Only Memory (ROM) block. The amount of required memory is given by (4.7), where b is the number of bits used to represent a coefficient.

Mem_coef = [ Σ_{j=2}^{k} N_j · (N_{j−1} + 1) + N_k + 1 ] · b    (4.7)
Finally, the output of the adder tree must undergo a nonlinear transformation. This can be achieved by using the adder result to index a look-up table where samples of the nonlinear function are stored.
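To make the dimensioning concrete, the following C sketch evaluates (4.3)-(4.7) for an illustrative topology; the MAC delay, the array size and the layer sizes are assumptions, not measured values.

    /* Resource estimate for the folded architecture, equations (4.3)-(4.7).
     * N[0] is the number of network inputs, N[1..k-1] the units per layer;
     * all values below are illustrative assumptions. */
    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        const double T_spike = 1e-3;          /* 1 kHz per cell           */
        const double T_mac   = 20e-9;         /* assumed MAC delay        */
        const int    D       = 1024;          /* 32 x 32 electrode array  */
        const int    k       = 3, b = 18;     /* layers, bits/coefficient */
        const int    N[]     = {191, 10, 5};  /* inputs per layer         */

        double T_electrode = T_spike / D;                  /* eq. (4.3)   */
        double T_net = 0.0;                                /* eq. (4.5)   */
        long   mem = 0;                                    /* eq. (4.7)   */
        for (int j = 1; j < k; j++) {
            T_net += N[j] * (N[j - 1] + 1) * T_mac;        /* (4.4) terms */
            mem   += (long)N[j] * (N[j - 1] + 1);
        }
        T_net += (N[k - 1] + 1) * T_mac;      /* single output perceptron */
        mem    = (mem + N[k - 1] + 1) * b;
        int M = (int)ceil(T_net / T_electrode);            /* eq. (4.6)   */
        printf("M = %d MACs, %ld bits of coefficients\n", M, mem);
        return 0;
    }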
4.3 Spike multiplexing
This module serializes the spike information generated by the artificial retina processing module in order to create a single data flow that can be transmitted to the brain over a serial link. Figure 4.8 represents the block diagram of the serialization and data packing block. The data flow generated by the spike multiplexing module is packed according to the serial communication protocol defined in the next section.

[Figure 4.8: Serialization and data packing block: the spikes enter the Spike Multiplexing module, whose output feeds the Data Packing module that produces the data packets.]
The serialization module is based on the Address Event Representation (AER) [25] protocol, which is designed to be a neuromorphic representation of signals. The protocol defines a way to asynchronously transmit information without timestamps. When a spike is generated, it is placed on the bus as quickly as possible in order to minimize latency. No spikes are lost due to collisions: if two spikes occur simultaneously, they are transmitted in sequence as quickly as possible. A possible implementation for this module is an AER tree [26], which consists of multiple stages of arbiters that multiplex the spike requests at their inputs onto the output. However, this solution requires too much hardware and does not scale with the number of electrodes.
To overcome these drawbacks, a new solution was proposed using a First In First Out (FIFO) memory [2]. Since the spike generation circuit is intrinsically sequential for both the classic and the neural networks models, a FIFO memory can be used to register the generated spikes. This memory stores the information while the line is busy. In this implementation, the hardware needed does not increase with the number of electrodes; however, the performance becomes more dependent on the channel characteristics, which implies an increase in latency. Figure 4.9 depicts the architecture adopted for implementing the spike multiplexing module.
[Figure 4.9: Implemented AER module: a dual-port RAM used as a FIFO; the pulse signal enables the writing of the spike address (pulse_addr) through port A and drives the counter that generates the write address, while the events are read out through port B.]
The pulse signal is asserted by the retina processing module only when a new spike occurs, and the spike address is given by pulse_addr. When a new spike occurs, pulse is active, enabling the writing of the spike address into the RAM block. This signal also enables the counter that generates the write address of the FIFO memory.
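The following C sketch models this behaviour at a high level, with the dual-port memory abstracted as an array and the write counter advanced by the pulse signal; the depth, the types and the absence of an overflow check are simplifications for illustration.

    /* Behavioural sketch of the FIFO-based spike multiplexing (Figure 4.9). */
    #define FIFO_DEPTH 1024           /* power of two; one slot per spike */

    typedef struct {
        unsigned short addr[FIFO_DEPTH];  /* electrode address per spike  */
        unsigned wr, rd;                  /* write and read counters      */
    } spike_fifo;

    void on_pulse(spike_fifo *f, unsigned short pulse_addr)
    {
        f->addr[f->wr % FIFO_DEPTH] = pulse_addr;  /* write port A        */
        f->wr++;                                   /* counter, pulse-enabled */
    }

    int next_event(spike_fifo *f, unsigned short *out)
    {
        if (f->rd == f->wr)
            return 0;                              /* no pending events   */
        *out = f->addr[f->rd % FIFO_DEPTH];        /* read port B         */
        f->rd++;
        return 1;
    }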
4.4 Serial communication protocol
In order to transmit spike information to the brain, it is necessary to design a serial communication protocol. The protocol frame format and main characteristics were described in the CORTIVIS project specifications [3]. The emitter module receives event information from the spike multiplexing module and sends the events as soon as possible.

The RF link is a serial transmission channel that may distort signals and introduce errors. A simple error detection mechanism, without error correction capabilities, is used; the requirement of low power consumption at the receiver module was determinant for adopting this approach. There are no retransmissions, so, if the receiver detects errors, the circuit discards all data corresponding to the packet. The transmission is asynchronous and a continuous flow of data keeps the channel busy. When there are no new events, the packet is filled with dummy bits, ensuring that the receiver can always extract power from the received data. To simplify the protocol, no handshake of any kind is used. The packet structure is presented in Figure 4.10.
[Figure 4.10: Packet structure [3]. Header (10 bit) | Type (3 bit) | Reserved (1 bit) | Valid (4 bit) | Event 1 ... Event 8 (or 4), 22 bit each, 22 × 8 (or 4) = 176 (or 88) bit | CRC (16 bit). Each event carries Address (10 bit), Amplitude (7 bit) and Duration (5 bit), transmitted MSB to LSB.]
The packet structure is as follows:

• Header: delimits a packet and synchronizes the communication;
• Type: constant in this implementation, set to "Stimulation";
• Reserved: unused, equal to zero; added to allow future expansions of the protocol;
• Valid: represents the number of valid events in the packet;
• Event: carries the electrode address and data. Figure 4.10 highlights the structure of an event. The adopted retina model does not provide information about spike amplitude or duration, so in this implementation these fields are constant and cannot be adjusted dynamically; the number of events in a packet must also be set at compile time, and only the values 4 and 8 are currently accepted;
• CRC: this system uses a 16-bit Cyclic Redundancy Code (CRC) (CRC-CCITT) with the polynomial x^16 + x^12 + x^5 + 1 (a bit-serial sketch of its computation follows the list).
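As referenced above, here is a bit-serial C sketch of this CRC; the polynomial is the one specified, while the initial register value is an assumption (the CCITT variant commonly starts at 0xFFFF).

    /* Bit-serial CRC-CCITT, polynomial x^16 + x^12 + x^5 + 1 (0x1021). */
    #include <stdint.h>

    uint16_t crc_ccitt(const uint8_t *data, int nbytes)
    {
        uint16_t crc = 0xFFFF;                    /* assumed initial value */
        for (int i = 0; i < nbytes; i++) {
            crc ^= (uint16_t)data[i] << 8;        /* feed byte, MSB first  */
            for (int b = 0; b < 8; b++)           /* one bit per step      */
                crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x1021)
                                     : (uint16_t)(crc << 1);
        }
        return crc;
    }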
Although the CRC field was already defined in [3], it was necessary to change its structure. Since the packet header is shorter than the CRC field, the header bit sequence could appear inside the CRC, leading the receiver to detect a false header. A way found to avoid this misinterpretation was to change the packet structure and include some bits delimiting the CRC field. Figure 4.11 shows a possible packet structure that would avoid the erroneous header detection.
[Figure 4.11: New packet structure: Header (10 bit) | Type (3 bit) | Reserved (1 bit) | Valid (4 bit) | Event 1 ... Event 8 (or 4), 22 × 8 (or 4) = 176 (or 88) bit | 0 (1 bit) | CRC(7 downto 0) (8 bit) | 0 (1 bit) | CRC(16 downto 8) (8 bit) | 0 (1 bit).]
The function of the bits remains the same, but the CRC field has a different structure. By using zeros to delimit the CRC field and placing a zero in its middle, the header sequence can never occur in the payload of the packet. This solution, however, requires a more complex receiver, increasing its power consumption, which is a critical aspect since the receiver module will operate inside the human head and retrieves its energy from the serial link. As there are no retransmissions, the effect of an erroneous decoding is not much different from the loss of spike information: if a packet is corrupted, the data will be lost anyway. The simpler solution of eliminating the CRC field was therefore chosen, and Figure 4.12 shows the adopted packet structure.
[Figure 4.12: Adopted packet structure: Header (10 bit) | Type (3 bit) | Reserved (1 bit) | Valid (4 bit) | Event 1 ... Event 8 (or 4), 22 × 8 (or 4) = 176 (or 88) bit | 0...0 (16 bit).]
The CRC field was replaced by a sequence of zeros. This solution requires less hardware
and thus reduces the power consumption.
4.5 System architecture conclusions
This chapter presented two different architectures for the visual encoding processing module. For the Classic Model architecture suggested in [2], the hardware requirements are scalable: visual input can be processed for a larger target micro-electrode array with the same computational resources, with only the necessary memory increasing. The architecture required to implement a neural network in Field-Programmable Logic (FPL) was also developed. Considering the data path only, the developed architecture can be used for different electrode matrix sizes and different neural network topologies.
A Spike Multiplexing module was created to produce a single data flow for the spike information generated by the processing module. Although this module was developed in [2] for a different model, since the Neural Networks model also has a sequential nature, it can still be used with the proposed new model. To deal with a bigger electrode matrix, only the memory size needs to be changed.
To enable the transmission of the spike information generated by the processing module over an RF serial link, a serial communication protocol was developed. This protocol was specified in [3], although changes were required to achieve the correct functioning of the system. The chosen specification was the simplest one, in order to ensure that the unpacking module does not require much power. This is a critical aspect, since the unpacking module will operate inside the human head.
Chapter 5
Full System Prototype
In Chapter 4, the system architecture was introduced considering an FPL implementation. This chapter presents a full prototype for the complete retina model.

Analyzing the hardware requirements of the bio-inspired processing module, the previous prototype [2] was based on an FPGA with a small amount of memory. In order to validate the complete model and to allow future expansions, a prototype board with more resources was developed. The main goal of this system is to achieve a small-size prototype derived from a low-power design, based on an FPGA with more computational resources than the previous solution. The low-power objective was the most difficult to achieve, considering that the main component is an FPGA. This board also has a Video Graphics Array (VGA) port, in order to simulate and validate the model implementation. The digital output image of the processing module is converted to analog levels using three video Digital to Analog Converters (DACs). Figure 5.1 shows a block diagram of the intended prototype.
[Figure 5.1: Block diagram of the prototype: the C3188A camera module feeds the input image, through the camera connector, to the XILINX SPARTAN XC3S400; the processed image is output through 8-bit DACs to a standard VGA monitor, and the data packets go through the RF link modulator to the RF link; a generic connector is also provided.]
The prototype board was developed using Altium Protel 2004, a complete board-level Computer Aided Design (CAD) system. A datasheet for this board is presented in Appendix E. The developed prototype was implemented on a four-layer Printed Circuit Board (PCB) in which the two inner planes are dedicated to VCC and GND and the outer layers are used for routing.
This chapter also describes the digital logic blocks necessary to implement the retina
model. The two considered architectures for the main processing module were presented
in the previous chapter. To create a complete system, it was necessary to develop other
hardware modules. These modules, addressed in this chapter, are responsible for the
interface with the digital camera and for generating the output image representing the
processed stimulus. Finally, the hardware design for the data packing and unpacking
module, that implements the serial communication protocol, is also presented.
5.1 Prototype processing core
The developed board uses the XILINX SPARTAN-3 XC3S400 [27] FPGA as the processing core. This represents a low-cost solution and ensures a significant increase in hardware resources when compared to the FPGA employed in the previously developed prototype [2], the XILINX SPARTAN-2 XC2S200 [28]. The XC3S400 has 400 k system gates, a total memory of 288 kbit distributed over 16 RAM blocks, 16 dedicated 18-bit multipliers and fast look-ahead carry logic. The main difficulty with the previous prototype was the amount of memory available; the current solution is more adequate, since the available memory in the RAM blocks is more than five times greater. The dedicated multipliers save system gates and increase the operating frequency. This is an important feature, because a higher operating frequency allows data to be processed for a bigger electrode matrix. However, there is one main drawback in using this FPGA: it requires three independent supply voltages: V_CCINT, the internal supply voltage (1.2 V); V_CCAUX, the auxiliary supply voltage (2.5 V); and V_CCO, the output driver supply voltage (1.2 V to 3.3 V).
Although the XC3S400 has a maximum of 264 I/O ports, this design does not use them all. Figure 5.2 represents the available FPGA I/O signals. Two external connectors are provided: i) Cam is dedicated to the interface with the digital camera and uses 25 pins, including two Global Clock (GCLK)¹ inputs; ii) Gen is a generic connector that provides 36 pins, four of which are GCLK inputs; this connector can be used for debugging, for outputting results or for connecting expansion modules. The FPGA also provides three dedicated buses that connect directly to the video DACs, a clock net for the video DACs and the synchronization signals necessary for the VGA port. The connection diagrams for the FPGA can be found in Appendix F.1 and Appendix F.2.
The pins corresponding to the Joint Test Action Group IEEE Standard 1149.1 (JTAG) interface, used to configure the FPGA, are not represented in Figure 5.2, but can be found in Appendix F.2. The configuration pins are only available through a six-pin header, in order to reduce the PCB size.

¹ Global Clock: a low-capacitance, low-skew network well-suited to high-frequency signals.
[Figure 5.2: FPGA available I/O signals (XC3S400-5PQ208C pin assignment): the camera bus Cam[0..24], the generic bus Gen[0..35], the video DAC buses DR[0..7], DG[0..7] and DB[0..7], the VGA synchronization signals, the 50 MHz clock, the DAC clock and the buttons BTN[0..4], distributed over I/O banks 0 to 7.]

5.2 Power distribution system design

Since FPGAs can implement an almost infinite number of applications, at undetermined frequencies and in multiple clock domains, it can be very complicated to predict their current demands. These transient currents are the main cause of ground bounce, thus making the choice of the regulators to use in the Power Distribution System (PDS) very important. Two choices were considered to generate the required voltage levels: linear and Pulse Width Modulation (PWM) regulators. While the former requires fewer external components, its efficiency is highly dependent on the dropout². In contrast, PWM regulators are able to achieve high efficiency levels even with high voltage dropouts, thus saving thermal dissipation area. Based on these arguments, the choice was to use linear regulators for the 5 V power grid and PWM regulators for all other cases. This guarantees a highly efficient PDS with a minimum number of external components.
For all the supply voltages of the FPGA (1.2 V, 2.5 V and 3.3 V) the MAX1830/MAX1831 [29] was used, a low-voltage PWM step-down regulator which delivers a current of up to 3 A with a peak efficiency of 94%. This PWM architecture regulates the output voltage by changing the PMOS switch on-time relative to the constant off-time. Increasing the on-time increases the peak inductor current and the amount of energy transferred to the load per pulse. This device also has another mode of operation, saving power when the load current is low. Considering the device datasheet, it is possible to obtain all of the desired voltages. The adjustable configuration for this regulator is presented in Figure 5.3.

² The difference between the input and the output voltage levels.
[Figure 5.3: MAX1830/1831 adjustable configuration electrical diagram: the LX pins drive a 2.2 µH inductor towards Vout, the FB pin is fed by the R1/R2 resistive divider, and Rtoff (130 kΩ), the input and output capacitors (22 µF, 150 µF) and the compensation network complete the circuit.]
This regulator has an internal voltage reference source. The internal voltage is compared to the feedback voltage, provided by a resistive divider placed between the output voltage and ground. The output of this comparator, along with a current sense signal, is used as input to the PWM logic that controls the output MOS transistors. In the adjustable mode, the output voltage is determined by an equation presented in the regulator datasheet. Applying that equation to the circuit in Figure 5.3 yields

R2 = R1 · (Vout / VREF − 1),    (5.1)

where VREF is the regulator's internal reference voltage (typically 1.1 V) and R1 = 30 kΩ. Hence, for V_CCINT = 1.2 V, R2 must be set to 2.7 kΩ. The values for R_TOFF and for the inductor L were obtained considering the values proposed in the datasheet. The other voltage levels were achieved using the preset output voltages. Table 2 of the regulator datasheet shows how the output voltage programming can be done. Considering this table, V_CCAUX is obtained by using a MAX1830 and connecting FBSEL to VCC and FB to the output voltage. For V_CCO, FBSEL must be connected to the REF pin and FB to the output voltage. The inductor and R_TOFF are the same as for the adjustable mode.
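A small C sketch of the calculation (5.1), using the values quoted above:

    /* Feedback divider for the MAX1830 in adjustable mode, eq. (5.1). */
    #include <stdio.h>

    int main(void)
    {
        const double R1 = 30e3, VREF = 1.1;   /* divider base, reference   */
        double vout = 1.2;                    /* desired V_CCINT           */
        double R2 = R1 * (vout / VREF - 1.0); /* approx. 2.7 kOhm          */
        printf("R2 = %.0f Ohm\n", R2);
        return 0;
    }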
The second component of the PDS is the network of bypass (decoupling) capacitors, which act as local energy storage. Every capacitor has a narrow frequency band in which it is most effective as a decoupling capacitor. The Equivalent Series Resistance (ESR) of the capacitor determines its quality factor (Q), which in turn determines the width of the effective frequency band. Real capacitors also have a parasitic inductance³; these two parasitic components form an RLC circuit, Figure 5.4, and the resonant frequency of the capacitor is the resonant frequency of this RLC circuit.

³ Also known as Equivalent Series Inductance (ESL).

[Figure 5.4: Equivalent circuit of a real capacitor: C in series with its ESL and ESR.]

In order to determine a proper placement, it is necessary to consider the capacitor's parasitic inductance, L_SELF, previously designated ESL, and the mounting parasitic inductance, L_MOUNT. This L_MOUNT is within the range of 300 pH to 1500 pH [4].
Considering that the resonant frequency (F_R) of this RLC circuit is given by (5.2),

F_R = 1 / (2π √((L_SELF + L_MOUNT) · C))    (5.2)
it is now possible to determine the effective resonant frequency associated with a capacitor placed on the board. Noise from the FPGA falls in certain frequency bands, and different sizes of decoupling capacitors operate in different bands. For this reason, capacitor placement is determined based on the effective frequency of each capacitor. Capacitors need to be close to the device to perform the decoupling function, for two basic reasons. First, shortening the distance between the device and the decoupling capacitors reduces the inductance, resulting in a less impeded transient current flow. The second reason concerns the phase relationship between the FPGA noise source and the mounted capacitor: the placement of the capacitor determines the length of the transmission line that connects the capacitor to the FPGA, the propagation delay being the relevant factor. For any transient current demanded by the FPGA, there is a round-trip delay to the capacitor before any relief is seen. As a result, if the distance is greater than a quarter of a wavelength, the transferred energy is negligible; the energy transferred to the FPGA increases to approximately 100% at zero distance. In practical applications, one tenth of a quarter wavelength is a good target [4]. The wavelength corresponds to the capacitor's mounted resonant frequency.
To achieve a balanced decoupling network, it is desirable to use the capacitor values presented in Table 5.1. The exact value of these capacitors is not critical, but it is necessary to have some capacitors in every order of magnitude.

Table 5.1: Capacitor value percentages for a balanced decoupling network [4].
Capacitor Value | Quantity Percentage | Capacitor Type
470 µF to 1000 µF | 4% | Tantalum
1.0 µF to 4.7 µF | 14% | X7R 0805
0.1 µF to 0.47 µF | 27% | X7R 0603
0.01 µF to 0.047 µF | 55% | X7R 0402

According to Table 5.1, the smallest capacitance will be 0.01 µF; considering (5.2), its resonant frequency is
F_R = 1 / (2π √((0.9 × 10^−9 + 1.0 × 10^−9) · 0.01 × 10^−6)) = 36.5 MHz    (5.3)

assuming L_SELF = 0.9 nH [4] and L_MOUNT = 1.0 nH. With a propagation speed of about 1.54 × 10^8 m/s [4] in the FR4 dielectric, the wavelength associated with this capacitor is
λ = (1.54 × 10^8) / (36.5 × 10^6) = 4.22 m    (5.4)

Therefore the target radius, r_PLACE, will be one tenth of a quarter wavelength, r_PLACE = 4.22/40 ≈ 0.11 m. Since all the other capacitances are larger than this one, this will be the smallest necessary radius. Considering the expected board size, this radius is achievable.
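The following C sketch reproduces the placement calculation (5.2)-(5.4) for the smallest capacitor, using the parasitics assumed above:

    /* Mounted resonant frequency and placement radius, eqs. (5.2)-(5.4). */
    #include <math.h>
    #include <stdio.h>

    #ifndef M_PI
    #define M_PI 3.14159265358979323846
    #endif

    int main(void)
    {
        const double L_self = 0.9e-9, L_mount = 1.0e-9; /* parasitics [H]   */
        const double C = 0.01e-6;                       /* smallest cap [F] */
        const double v = 1.54e8;                        /* speed in FR4     */

        double f_r = 1.0 / (2.0 * M_PI * sqrt((L_self + L_mount) * C));
        double lambda = v / f_r;
        double r_place = lambda / 4.0 / 10.0;  /* tenth of a quarter wave   */
        printf("F_R = %.1f MHz, r_place = %.2f m\n", f_r / 1e6, r_place);
        return 0;
    }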
It is only necessary to provide one capacitor per VCC pin if all pins are used. Hence, V_CCINT and V_CCAUX must always be fully decoupled, but V_CCO can be prorated according to the I/O utilization. As in this design not all FPGA input/output pins are connected, the decoupling of V_CCO uses a ratio of only one capacitor per two pins. The adopted design uses the capacitors presented in Table 5.2.
Table 5.2: Decoupling capacitor quantities.
Capacitor Value | V_CCINT | V_CCAUX | V_CCO
470 µF | 1 | 1 | 1
1.0 µF | 1 | 1 | 2
0.1 µF | 2 | 2 | 4
0.01 µF | 4 | 4 | 8
Only V_CCINT has one capacitor per pin, as it is the most critical supply. V_CCAUX and V_CCO have a ratio of one capacitor per two supply pins, with the quantities for each capacitor value calculated according to Table 5.1.
Also, in order to achieve a more effective decoupling network, the prototype uses a four-layer PCB in which the two internal layers are power planes and the outer layers are used for routing. Figure 5.5 represents the arrangement of the power planes.
Different planes are identified with different colors, and the split planes used for the DACs are also represented. The ground plane is also divided, but in a simpler manner: all the digital circuitry shares the same plane, and only the analog part of the DACs has a separate ground. The arrangement of the lowest-value decoupling capacitors is also represented in the figure, since their placement is the most critical. This arrangement of the split planes leads to a small distance between the capacitors and the power pins and therefore increases the decoupling efficiency.
In Appendix F.3, the electrical diagram for all the regulators is presented, and in Appendix F.2 the decoupling network can be found.
[Figure 5.5: Power plane division; different colors identify the split plane borders (the V_CCO, V_CCINT and V_CCAUX planes and the R, G and B DAC planes) around the XC3S400, the regulators and the connectors. The lowest-value decoupling capacitors are also represented.]
5.3 VGA display port
In order to test the model implementation, a VGA display port and three video DACs were added to the prototype. The choice of three distinct DACs allows future expansions: if the model output is extended to separate processing of each color channel, this same prototype board can be used for validation, since it can display an image with color information if required. Also, in order to achieve a better image quality, the simple resistor DAC used in the previous prototype had to be put aside. The display port is based on three separate 8-bit video Digital-to-Analog converters, TLC5602C [30], one per RGB channel. This DAC presents a low power consumption (typically 80 mW), a 20 MHz conversion rate, a single 5 V power supply and TTL digital input voltage levels. In order to ensure the correctness of the digital output logic levels from the FPGA, the V_CCO voltage must be set to 3.3 V. Another necessary input for this DAC is a reference voltage of 4 V. This voltage is generated with a MAX6004 [31], a low-power, low-dropout voltage reference Integrated Circuit (IC). The electrical diagram for the DAC responsible for one of the VGA channels is presented in Figure 5.6.
[Figure 5.6: DAC electrical diagram: the TLC5602CDW converts the 8-bit bus DR[0..7], clocked by CLK_dac, to the analog output AoutR; ferrite beads and decoupling capacitors isolate the analog and digital supplies, and the MAX6004 provides the 4 V reference Vref_dac.]
According to the device datasheet, it is necessary to decouple the power pins, which is accomplished by placing a ferrite bead in series with the power and ground pins and a capacitor between these pins, close to the device. Also, in order to decrease the circuit noise, each DAC has a separate analog power and analog ground plane, isolated from the digital circuitry and from the other DACs. The electrical diagram for all DACs is presented in Appendix F.4.
5.4 Complete board and components placement
While the preceding sections characterized the individual parts of this prototype, this section integrates the several blocks into a complete board.

To design an autonomous system, it is necessary to include an additional power supply for the digital camera. As the required voltage is the same as for the DACs, these devices are powered by the same source. This voltage level is generated by a TPS78601 [32], an ultra-low-noise, low-dropout linear regulator with a maximum output current of 1.5 A. This regulator has an adjustable output voltage. Figure 5.7 presents the diagram of the circuit for the adjustable voltage mode.
[Figure 5.7: TPS78601 adjustable configuration electrical diagram: the FB pin is fed by the R1/R2 divider between Vout and ground, with the input and output capacitors (2.2 µF, 1 µF) and C1 across the divider.]
To control the output, the IC uses an internal reference voltage. This reference is compared to the feedback voltage, provided by a resistive divider between the output voltage and ground. The output of the comparator, along with a current sense signal, is used to control the gate voltage of a Metal-Oxide Semiconductor (MOS) transistor placed between the unregulated input and the regulated output voltage. By changing the gate voltage, the on-resistance of the transistor is altered, and thereby the output voltage is controlled. Hence, since the transistor is placed in series, the power dissipation is directly proportional to the voltage difference between the input and the output. As stated in Section 5.2, the linear regulator has a much simpler circuit, although it only presents a good efficiency if the input voltage is close to the output voltage. However, since the maximum input voltage for this regulator is 6 V, the maximum dropout voltage is 1 V, leading to a minimum efficiency of over 80%. Moreover, as it requires fewer external components than the PWM regulator, the electrical diagram is simpler; the usage of this device thus allows the PCB size to be reduced. To obtain the desired voltage, the R2 resistor must be set according to equation (5.5),
R2 = R1 · (Vout / VREF − 1),    (5.5)

where R1 = 30 kΩ and VREF is typically 1.2246 V. Hence, to achieve the desired output voltage, R2 must be set to 92.4 kΩ. Also, according to equation 3 of the regulator datasheet, to improve the stability of the output voltage, C1 must satisfy

C1 = (3 × 10^−7) × (R1 + R2) / (R1 × R2)    (5.6)

and thereby C1 will be equal to 15 pF.
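As with the MAX1830, the divider and the stability capacitor can be checked numerically; the C sketch below assumes the 5 V rail used for the camera and the DACs, and its printed values land close to the 92.4 kΩ and 15 pF quoted above, with the small differences due to rounding to standard component values.

    /* TPS78601 divider and stability capacitor, eqs. (5.5) and (5.6). */
    #include <stdio.h>

    int main(void)
    {
        const double R1 = 30e3, VREF = 1.2246, Vout = 5.0;
        double R2 = R1 * (Vout / VREF - 1.0);        /* eq. (5.5) */
        double C1 = 3e-7 * (R1 + R2) / (R1 * R2);    /* eq. (5.6) */
        printf("R2 = %.1f kOhm, C1 = %.1f pF\n", R2 / 1e3, C1 * 1e12);
        return 0;
    }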
The PCB has a generic input/output connector to enable debugging, the output of results and/or future expansions. It also has a connector to deliver power to, and receive the input image from, the digital camera. Another necessary component is a clock generator; this design employs the CFPS-73 [33], a 50 MHz, High-Speed Complementary Metal-Oxide Semiconductor (HCMOS), 3.3 V oscillator. Finally, to create simple control signals, a push button and a four-way slide switch were included. Figure 5.8 shows a picture of the actual developed PCB, where the main components are highlighted.
[Figure 5.8: Complete prototype board, with the camera connector, the JTAG header, the regulators (V_CCO, V_CCINT, V_CCAUX and 5 V), the oscillator, the VTO output, the VGA port, the control switches, the XILINX SPARTAN-3 XC3S400, the video DACs, the power jack and the generic connector highlighted.]
Appendix E provides a short datasheet of this prototype, including the connector pinouts and all the pin connections to the onboard input and output devices. The electrical diagram for the full prototype can be found in Appendix F.5.

When assembling the prototype, some minor errors were detected, none of which compromises the correct operation of the board: the pin order in the JTAG interface was incorrect, and the footprint size of some components was not adequate. Certain aspects could also be improved. Using capacitors with a smaller package outline would allow the placement of more decoupling capacitors on the V_CCAUX supply, thereby creating a better decoupling network. Moreover, placing more components on the back side of the PCB would decrease the board size.
5.5 Digital logic blocks
In order to validate the retina processing module, the architecture was implemented in VHDL and tested on the developed prototype board. Only the Classic Model was implemented, but the Neural Networks model was analysed to determine the required hardware resources. The tests were made using only one color channel, with the input represented in 8-bit greyscale; however, a 12-bit representation was used internally to reduce computational and filter discretization errors. The considered dimension of the electrode array was 32 × 32. It was also necessary to develop other modules; Figure 5.9 represents the developed logic circuits.
[Figure 5.9: Digital logic block diagram: the input image passes through the Image Capture and Resize module, with the Register Configuration module driving SDA/SCL; the resized image feeds the Visual Encoding module, whose spikes go to Serialization and Data Packing and whose processed image goes to the Image Display module (R, G, B, VS and HS outputs).]
Two modules were created for image acquisition: one to capture and resize the input image, and another to perform the power-on configuration of the digital camera's internal registers. The resized image is then fed to the visual encoding module. This module produces two outputs: the train of action potentials that will be sent to the brain, and the firing rate that results from the retina model filtering. The data packing module prepares the spike information to be transmitted over a serial link. Finally, the display module produces a screen image to allow the validation of the processing module; it displays the input image, the spike information and the processed image before the spikes are generated.
The synthesis results presented in the following subsections were achieved using the XILINX ISE WebPACK 6.2i synthesis tool with a XILINX SPARTAN-3 XC3S400-4 as the target device.
5.5.1 Image capture and resize
The input image was obtained using a digital camera module from Quasar Electronics [34] that uses the Omnivision OV7620 [35] Complementary Metal-Oxide Semiconductor (CMOS) image sensor. All camera functions can be configured using a serial data transmission protocol, the Serial Camera Control Bus (SCCB) [36], which is a simplified version of the Philips Inter-Integrated Circuit (I2C) [37] protocol. This camera is able to capture a window of 4.84 mm × 3.64 mm, generating an array of 644 × 492 pixels with a pixel size of 7.6 µm × 7.6 µm. The digital video port supports 60 Hz YCrCb 4:2:2 16-bit/8-bit format, Zoomed Video (ZV) Port output format, RGB raw data 16-bit/8-bit output format and CCIR601/CCIR656 format, with progressive or interlaced scan. It also has a black and white composite video signal output (VTO) in National Television System Committee (NTSC) format that can be used for test purposes. The sensor has a signal-to-noise ratio greater than 48 dB and a dynamic range of more than 72 dB. The supply voltage for this sensor is 5 V, and it requires less than 120 mW when active and only 50 µW in standby. The camera module includes a lens with a focal length of 6 mm and an aperture of F1.6. The current retina model version only uses greyscale data, and the camera is programmed to capture frames in Quarter Video Graphics Array (QVGA) format, at a frame rate of 20 Hz, in non-interlaced mode.
The camera module supplies three synchronization signals: vertical sync (VSYN), horizontal window reference (HREF) and pixel clock (PCLK). Frame synchronization is done by detecting a high pulse on VSYN, and a new line starts when HREF has a low pulse; Figure 5.10 shows the temporal diagrams for these two signals.
[Figure 5.10: Camera module sync signals: the VSYN frame pulse and the HREF line timing (T_pw, T_disp, T_SWL, T_fp, T_LINE, T_bp) relative to PCLK.]
PCLK is used to identify the pixels on a line; the pixel information is updated on every falling edge of the pixel clock, which implies that it can be read on the rising edge of PCLK. Another signal, the odd field flag (FODD), is also required, in order to assert WE only when FODD is high.

Simple hardware can be used to capture the frames using these four synchronization signals. Figure 5.11 shows the implementation used for this module [2].
[Figure 5.11: Implemented hardware for frame capture: falling-edge detection on HREF and rising-edge detection on PXL_CLK (two flip-flop chains) drive the line and column counters, reset by VSYNC, while FODD gates the write enable.]
The hardware cost of this synchronization device is 15 slices (1%), with a maximum operating frequency of 25 MHz.
The processing module developed for this prototype was configured to have an input of only one color channel, in this case greyscale, and a frame window of 32 × 32. In a first approach, the desired window was obtained using only register configuration, but the image frames of this solution were strongly affected by noise. The alternative was to perform an image resize: the visual frames were captured with a 128 × 128 window size and, using filtering followed by decimation, the image was reduced to 32 × 32 pixels. The filter's main purpose is to avoid the aliasing generated by the decimation process, but it also reduces some of the vertical noise. Three types of filters were considered: bicubic, Gaussian and bilinear. The chosen filter was a Gaussian low-pass type with a standard deviation of two, since it provides good results and does not require much logic. The filtering is done line by line, delivering the output in the same way, and can be described by equation (D.1). The schematic representation can be seen in Figure 5.12, where the symmetry of the Gaussian was exploited to reduce the necessary logic: it uses fewer multipliers and fewer delay lines.
[Figure 5.12: Spatial low-pass Gaussian filter: a nine-tap symmetric filter with coefficients a0 ... a4, implemented with delay lines, five multipliers and an adder tree, followed by decimation of the output pixels.]
The hardware cost of this filter is 130 slices (3%) and 1 RAM block, with a maximum operating frequency of 179 MHz.
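As a behavioural illustration of this resize step, the C sketch below applies a 9-tap symmetric Gaussian (σ = 2) along one 128-pixel line and decimates by four; the floating-point coefficients are illustrative samples of the Gaussian, normalised to sum to one, whereas the hardware uses fixed-point equivalents, and the border clamping is an assumption.

    /* One line of the 128 -> 32 resize: 9-tap symmetric Gaussian low-pass
     * (sigma = 2) followed by decimation by 4. */
    static const double a[5] = {0.2042, 0.1802, 0.1238, 0.0663, 0.0276};

    void filter_and_decimate(const unsigned char in[128], unsigned char out[32])
    {
        for (int m = 0; m < 32; m++) {
            int n = 4 * m;                        /* decimation by 4        */
            double acc = a[0] * in[n];
            for (int k = 1; k <= 4; k++) {        /* exploit filter symmetry */
                int lo = n - k < 0 ? 0 : n - k;   /* clamp at the borders   */
                int hi = n + k > 127 ? 127 : n + k;
                acc += a[k] * (in[lo] + in[hi]);
            }
            out[m] = (unsigned char)(acc + 0.5);  /* round to 8 bit          */
        }
    }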
5.5.2 Register configuration
A new hardware module was developed to improve the output image of the digital camera. This independent system was designed only to experiment with new register settings, although the VHDL core had to be included in the main design in order to perform the register configuration at setup time.
To allow a more flexible operation, the VHDL core receives the data with the register settings from a PC through the Digilab 2 [38] parallel port. The data is sent by a simple program written in the C language. Figure 5.13 shows the connection diagram for this module.
[Figure 5.13: Block diagram of the register configuration module: a PC drives the DIGILAB D2 board (XILINX SPARTAN XC2S200) through the DB25 parallel port; the board configures the C3188A camera module over SDA/SCL and displays the result, through the 8-bit DAC connector, on a standard monitor.]
Initially, the camera's composite video output, VTO, was used to evaluate the resulting image. However, this simpler solution had to be put aside, as it was necessary to adjust the frame rate to achieve better results. The block diagram of the developed system is shown in Figure 5.14.
[Figure 5.14: Register configuration block diagram: the epp2 block receives bytes from the PC through the DB25 port and stores them in a dual-port RAM, from which the program_regs block configures the C3188A camera module over SCL/SDA.]
The control program supplies the register address and value to the VHDL core and stores the register configuration in a file. A first VHDL block, epp2, reads the programming data from the parallel port and stores it in a dual-port RAM; this block has a counter that is incremented whenever a new byte is received. The dual-port RAM is also connected to a second block, program_regs, responsible for the interface with the camera using the SCCB protocol. The protocol was implemented using a hierarchical three-level state machine, as shown in Figure 5.15.
Figure 5.15: Block diagram of program_regs.
The first layer, camera_config, sends commands to enable a read or a write cycle and supplies the register address and the value, which were stored in the dual port RAM. The cycle is initiated when a button on the Digilab 2 is pressed. The next layer consists of two state machines: one implements the write cycle (write_cycle) and the other the read cycle (read_cycle); only one is active at a time. The third and last layer interfaces directly with the camera module, generating all the bit sequences necessary to implement the SCCB protocol. A write cycle and a read cycle are shown in Figures 5.16 and 5.17, respectively.
Figure 5.16: Write cycle.
Figure 5.17: Read cycle.
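As an illustration only, the bit sequences of Figures 5.16 and 5.17 can be generated in a few lines. The sketch below assumes the usual three-phase SCCB write transmission (device ID, register address, register data, each byte followed by a ninth don't-care bit, marked 'X' in the figures) and the read-bit convention suggested by Figure 5.17; the function names are ours:

    def byte_bits(value):
        # MSB-first data bits plus the trailing don't-care bit of SCCB
        return [(value >> i) & 1 for i in range(7, -1, -1)] + ['X']

    def sccb_write_cycle(dev_id, reg_addr, reg_data):
        # start condition, three 9-bit phases, stop condition (Figure 5.16)
        return (['START'] + byte_bits(dev_id) + byte_bits(reg_addr)
                + byte_bits(reg_data) + ['STOP'])

    def sccb_read_cycle(dev_id, reg_addr):
        # 2-phase write of the register address, then a 2-phase read (Figure 5.17)
        return (['START'] + byte_bits(dev_id) + byte_bits(reg_addr) + ['STOP']
                + ['START'] + byte_bits(dev_id | 0x01) + ['READ BYTE'] + ['STOP'])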
After achieving a suitable register configuration, the PC interface is no longer necessary. A modified version of this system was included in the main design in order to perform the power-on configuration. In this solution, the configuration values are stored in a ROM memory block and only write operations are necessary. This register configuration device requires a total of 114 slices (3%), operating at a maximum frequency of 191 MHz.
5.5.3 Classic model implementation
The processing model was described in Section 4.1. The employed implementation processes only one of the RGB channels and considers space-time separability. The global FPL architecture for one RGB channel is shown in Figure 5.18.
Figure 5.19 shows the adopted FPL architecture for the Spike Generation; the implementation of the several filters can be found in Appendix D. This model was used to test the developed prototype board and also the other developed hardware modules. Table 5.3 shows the hardware resources needed by the processing module without the CGC block.
The complete model, including the CGC block, occupies a total of 365 slices (10%) and 7 RAM blocks (43%), and has a maximum operating frequency of 86 MHz.
Figure 5.18: Early Layers FPL full architecture.
Figure 5.19: Integrate-and-fire adopted architecture.
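In software terms, the accumulate-and-compare datapath of Figure 5.19 behaves roughly as sketched below. This is one plausible reading of the figure: the leakage factor γ and the reset-by-subtraction of the accumulator are taken from the drawing, not from a verified specification:

    def integrate_and_fire(fr, gamma, threshold):
        # fr: sequence of firing-rate samples FR[q,n] for one cell q
        p_acc = 0.0
        pulses = []
        for sample in fr:
            p_acc = gamma * p_acc + sample    # leaky accumulation (Register A)
            if p_acc >= threshold:            # comparator of Figure 5.19
                pulses.append(1)              # spike of fixed amplitude
                p_acc -= threshold            # keep the residual charge
            else:
                pulses.append(0)
        return pulses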
5.5.4 Neural networks implementation
For the architecture presented in Section 4.2, the hardware resources needed to implement the neural networks approach are now estimated. It is important to notice that this estimate does not consider the unpredictable extra hardware costs due to routing, nor the delays generated by this same routing. The estimation assumes the usage of an 18-bit representation for the internal calculations.
The maximum spike generation rate is 1 kHz per cell [2]. Considering a 32 × 32 electrode array, the available time for calculating the output of one cell according to (4.3) is

T_{electrode} = \frac{1}{1024 \times 1000} = 977~\mathrm{ns}    (5.7)
One of the employed neural networks has two hidden layers. The first hidden layer receives 100 inputs, the second 20, and the output layer only has 10 inputs. Using (4.5) and (4.4), it is possible to calculate the time necessary for computing one output using a single MAC:

T_{spike} = 20 \cdot T_{percept}(100) + 10 \cdot T_{percept}(20) + T_{percept}(10) = 2241 \cdot T_{MAC}    (5.8)

The implementation costs of a single MAC using the developed prototype are, considering the usage of embedded multipliers, 19 slices, and the maximum operating frequency is 190 MHz. The minimum number of MACs necessary to respect the temporal restrictions can be calculated using (4.6):

M = \left\lceil \frac{2241 \times 5.26 \times 10^{-9}}{977 \times 10^{-9}} \right\rceil \Longrightarrow M = 13    (5.9)
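These figures can be verified numerically. The small check below assumes T_percept(n) = (n + 1) · T_MAC, i.e., one MAC operation per weight plus one for the bias term, which reproduces the factor 2241 of (5.8):

    import math

    T_MAC = 1 / 190e6                  # 5.26 ns at 190 MHz
    T_ELECTRODE = 1 / (1024 * 1000)    # 977 ns, equation (5.7)

    def t_percept(n):
        # assumption: n weights plus one bias term, in units of T_MAC
        return n + 1

    cycles = 20 * t_percept(100) + 10 * t_percept(20) + t_percept(10)
    print(cycles)                                   # 2241, equation (5.8)
    print(math.ceil(cycles * T_MAC / T_ELECTRODE))  # 13 MACs, equation (5.9)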
Table 5.3: Implementation costs for the Bio-inspired Processing Module.

FPGA                   | Block                     | Slice Occupation | Operating Frequency (MHz) | RAM Blocks (Total 16)
Xilinx Spartan XC2S200 | Retina Early Layers       | 21%              | 47*                       | 5
Xilinx Spartan XC2S200 | Neuromorphic Pulse Coding | 2%               | 51                        | 3
Xilinx Spartan XC3S400 | Retina Early Layers       | 5%               | 85*                       | 1
Xilinx Spartan XC3S400 | Neuromorphic Pulse Coding | 0% (24 slices)   | 186                       | 1

* 49 clock cycles are required to process each pixel
Also, to store all the coefficients, the total amount of required memory, considering an 18-bit fixed-point representation, can be calculated using (4.7), resulting in

Mem_{coef} = (101 \cdot 20 + 21 \cdot 10 + 11) \cdot 18 = 22.16~\mathrm{kb}    (5.10)

Assuming that each MAC requires an activation function, the implementation of these functions will require a total of approximately 30 kb of memory, using 128 samples with 18 bits each. Following the same procedure for the different network topologies considered in Section 3.3, the results presented in Table 5.4 were obtained.
Table 5.4: Neural networks implementation hardware cost for different topologies.

Network Topology | Number of inputs | Number of MAC | Number of Coefficients | Coefficients Memory (kb) | Look-up Table Memory (kb)
Adaline          | 191              | 2             | 192                    | 3.46                     | -
20-1             | 191              | 21            | 3861                   | 69.50                    | 48.38
10-5-1           | 191              | 11            | 1981                   | 35.66                    | 25.34
20-10-1          | 100              | 13            | 1231                   | 22.16                    | 29.95
Analysing the hardware requirements presented in Table 5.4, it can be concluded that a processing system based on the neural networks can be implemented in the developed prototype. The only limitation is the available memory. However, since the prototype offers an expansion connector, this drawback can be overcome using an external memory module.
5.5.5 Serial communication protocol
The function of this hardware module is to create the packet structure presented in
Section 4.4. Figure 5.20 represents the adopted hardware implementation.
The function of each block can be described as follows:
Figure 5.20: Data packing block diagram.
• Dual Port RAM, storage element of the FIFO memory introduced in Section 4.3;
• counter_5, bit counter; it also generates the number of sent events;
• SUB, generates the number of events waiting to be sent;
• MUX, selects the data to be loaded into the output buffer;
• out_buffer, sends the data to the RF modulator; with a parallel load, it stores the data corresponding to a given chunk of the packet;
• State Machine, a Mealy4 state machine that generates all the control signals necessary to perform the packet formatting; it also manages the read pointer of the FIFO memory; the outputs of counter_5 and of SUB are input signals to this state machine (a sketch of the resulting packet layout is given below).
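The resulting packet can be mimicked in software. The sketch assumes a packet made of a constant header word, the number of events, the queued event words and a trailing zero word, all 16 bits wide, as suggested by Figure 5.20; the header constant used here is a placeholder, not the value of the actual protocol:

    CONST_HEAD = 0xA5A5   # placeholder value only

    def pack_events(events):
        # events: list of 16-bit spike event words waiting in the FIFO
        words = [CONST_HEAD, len(events)] + list(events) + [0x0000]
        bits = []
        for w in words:
            # serialize MSB-first, as the out_buffer shift register would
            bits.extend((w >> i) & 1 for i in range(15, -1, -1))
        return bits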
This module requires 148 slices (4%) and 1 RAM block (6%), operating at a maximum frequency of 121 MHz.
It was also necessary to develop a hardware module capable of receiving and unpacking
the output data of the RF link demodulator. The implementation of this hardware device
is shown in Figure 5.21.
The function of each block can be described as follows:
• in_buffer, shift register that stores incoming data;
• counter_5, bit counter; it also counts the number of received events;
• event_data, stores the recovered spike information that will be written in the memory;
• num_events, stores the number of events of the current packet;
• comp1, active when a new packet is detected;
4 Finite state machine where the outputs are determined by the current state and the inputs.
Figure 5.21: Data unpacking block diagram.
• State Machine, generates all the control signals necessary for the data unpacking; it also controls the write signal and the write pointer of the Dual Port RAM; the output of counter_5, the num_events signal and the result of comp1 are the input signals for this state machine;
• Dual Port RAM, implements a spike buffer;
• Counter, uses the signal data_valid from the State Machine and the output of comp2 to manage the read pointer of the Dual Port RAM.
This unpacking module was implemented on a Xilinx Spartan XC2S200, requiring 72 slices (3%) and 3 RAM blocks; the maximum operating frequency is 57 MHz.
5.5.6 Image display
To verify the proper functioning of the artificial retina processing module, an output stimulus display module was included in the design. This module generates all the synchronization signals required by a standard VGA monitor. The adopted configuration uses a resolution of 640×480 and a refresh rate of 60 Hz. Figure 5.22 shows the temporal diagram of the VGA signals and Table 5.5 the necessary signal timings.
The circuit of Figure 5.23 fulfills this task. As the captured image is in greyscale, so is the output. Also, since an array of 32×32 pixels is very small for visual analysis, the output image was enlarged by replicating each pixel of the original image over a block of identical output pixels, resulting in a window of 256×256 pixels. Although this module is relatively simple, it required additional RAM.
The signals generated by this circuit only control the image display. This device requires 37 slices (1%) and uses 2 RAM blocks; the maximum operating frequency is 204 MHz.
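The synchronization logic of Figure 5.23 can be expressed compactly as a behavioural sketch. It uses only the constants visible in the figure (column counter modulo 800, line counter modulo 521, horizontal sync for columns 656 to 751, vertical sync for lines 490 and 491, both counters driven by the 25 MHz pixel clock) and models the sync generation alone, not the pixel addressing:

    def vga_sync(column, line):
        # column in [0, 800), line in [0, 521)
        hsync = 656 <= column < 752               # 96-pixel horizontal sync pulse
        vsync = 490 <= line < 492                 # 2-line vertical sync pulse
        blank = not (column < 640 and line < 480)
        return hsync, vsync, blank

    # sweep one full frame of timing
    for line in range(521):
        for column in range(800):
            hs, vs, blank = vga_sync(column, line)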
Figure 5.22: VGA timing diagram.
Table 5.5: VGA Timings [5].

Symbol | Parameter        | Vertical Sync | Horizontal Sync
TS     | Period           | 16.7 ms       | 32 µs
Tdisp  | Display time     | 15.36 ms      | 25.6 µs
Tpw    | Sync pulse width | 64 µs         | 3.84 µs
Tfp    | Sync front porch | 320 µs        | 640 ns
Tbp    | Sync back porch  | 928 µs        | 1.92 µs
In order to validate the actual spike generation, a module was developed that recovers the image using the spike information [2]. This recovery system was implemented using a low-pass filter with a very low cutoff frequency. To decrease the filter size, an Infinite Impulse Response (IIR) filter was used, with the following transfer function:

H(s) = \frac{c_1}{s+a} \cdot \frac{c_2}{s+a} = \frac{c_1 \cdot c_2}{(s+a)^2}    (5.11)
where c1 and c2 are gain coefficients. This filter can be implemented using a chain of two low-pass IIR filters of the form of equation (D.7). The discrete filter chain requires coefficients with a higher value than those of the equivalent second-order filter; hence, the chain makes the implementation less sensitive to discretization errors. The adopted design implements the filter chain, requiring 98 slices (2%), 2 RAM blocks and 2 multipliers. The maximum operating frequency is 91 MHz.
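A behavioural sketch of this recovery chain, built from two cascaded first-order sections of the form (D.8), is given below. The coefficient values are placeholders chosen only so that each section has unit DC gain and a very low cutoff; they are not the values used in the hardware:

    def lowpass(u, B, C):
        # first-order IIR section, v[n] = B*v[n-1] + C*(u[n] + u[n-1]), as in (D.8)
        v, v_prev, u_prev = [], 0.0, 0.0
        for x in u:
            v_prev = B * v_prev + C * (x + u_prev)
            u_prev = x
            v.append(v_prev)
        return v

    def recover_image_signal(spikes, B=0.995, C=0.0025):
        # chain of two identical sections approximates H(s) = c1*c2/(s+a)^2
        return lowpass(lowpass(spikes, B, C), B, C)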
5.6 Conclusions and results
Table 5.6 presents the implementation costs of the developed prototype; it includes the hardware costs of all the developed digital circuits. Considering the results shown in Table 5.6, this new platform achieves the proposed objectives: it implements the complete system with a high operating frequency and using a small amount of hardware resources. To achieve independent spatial processing, only the spatial calculus would have to be replicated. As the hardware resources for the classic model amount to only 10% of the total slices, this system allows the implementation of the classic model with independent spatial processing for the three RGB channels. The power requirements also follow the expected results: with the full implementation of the digital blocks on the FPGA, the board requires 450 mW, a value that increases to 950 mW when the digital camera is connected. The usage of the VGA output, with a conversion rate of 12.5 MHz for the DACs, requires 1.2 W. Compared with the previous prototype [2] (1.8 W), the power consumption is about 50% lower. Figure 5.24 presents a picture of the complete prototype system.

Figure 5.23: VGA monitor control circuit [2].
Figure 5.24: Complete prototype system.
In order to test the processing model on the new board, a switch was used to control the information shown in the display device. By using this switch, the output image can be set to display the output of the Early Layers or the recovered image from the spike output. Also, using another switch, it is possible to remove the temporal high-pass filter. Figure 5.25 shows the output images obtained from the prototype.
Table 5.6: Complete Artificial Retina system implemented on a Xilinx Spartan XC3S400 FPGA.

Block                          | Slice Occupation | Operating Frequency (MHz) | RAM Blocks (Total 16) | Multipliers (Total 16)
Image Capture and Resize       | 3%               | 178                       | 1                     | 0
Register Configuration         | 3%               | 190                       | 0                     | 0
Classic Model*                 | 10%              | 85**                      | 7                     | 3
Serialization and Data Packing | 4%               | 121                       | 1                     | 0
Image Display                  | 3%               | 91                        | 4                     | 2
Data Unpacking***              | 3%               | 57                        | 3                     | 0
Complete System                | 20%              | 85                        | 12                    | 5

* includes Image Capture and Resize
** 49 clock cycles are required to process each pixel
*** synthesized on a Xilinx Spartan XC2S200 FPGA
The resized input image is in the upper left corner and the processed image in the lower right corner.
The output image from the Early Layers with the high-pass filter included was obtained by moving the camera. This resulted, as expected, in an image where only the moving edges are displayed. When the filter is removed, the resulting image only presents the shapes of the objects. In order to test the spike generation, the input of the Neuromorphic Pulse Coding was set to the resized input image. The results followed the expected model behavior. The developed board also operates as expected, and the output image is slightly better when compared with that of the previously developed board.
The developed board can also be employed as a development platform for a different system, since it offers a total of 61 I/O pins, of which 6 are global clock pins. These connectors also provide a power source for a possible expansion module. The camera connector provides a 5 V supply and the generic connector provides 5 V, 3.3 V and the unregulated input voltage. The video port uses 8 bits per RGB channel, generating a true color video format, and the video DAC has a maximum conversion rate of 20 MHz. This video port makes the board suitable for image processing, although for larger image frames it might be necessary to introduce a memory expansion module.
(a) Early Layers experimental results.
(b) Early Layers without time filtering.
(c) Neuromorphic Pulse Coding experimental results after image recovery.
Figure 5.25: Photographs of the experimental results obtained with the artificial retina
prototype. The input image (after downsizing) is displayed on the top left corner and the
output at the bottom right corner.
Chapter 6
Conclusions
Modelling the retinal ganglion cells' response to a visual stimulus is the key to creating an artificial system for visual rehabilitation. Visual information is conveyed to the visual cortex in the form of spike trains, and the detection of spikes poses a wide number of challenges. After capturing neural activity with the proper devices, the application of spike-sorting techniques is a fundamental pre-processing task for recorded neural data, whatever the model representation. Although simple, the algorithms applied to cluster the data revealed good results, allowing different responses in the same electrode to be split and filtered to individual cells. Aggregation of individual cells by type maintained the structure of the fired spike trains.
The trained networks achieved their primary objective by showing that, like other nervous systems, the process of seeing can be explained by a learning rule. Neural networks performed quite well in retaining the structure of the spike trains. The topologies that revealed the best results used a nonlinear activation function and at least twenty hidden units; in most cases the number of units was higher and distributed among two hidden layers. A direct comparison of the model results cannot be accomplished, since the output variables of the models are not comparable: the Classic Model's output variable is the instantaneous firing rate, whereas the Neural Networks modelling predicts the individual spikes.
The hardware requirements for the Classic Model are scalable with the number of considered electrodes. The architecture required to implement a neural network in FPL was also developed. Considering the data path, the developed architecture can be used for different electrode matrix sizes and different neural network topologies. The Spike Multiplexing generates a single data flow for the spike information produced by the processing module; this architecture can also be employed for the Neural Networks. The transmission of the spike information over an RF serial link channel was done according to a protocol previously specified in [3], even though changes were required to achieve the correct functioning of the system. The chosen specification was the simplest, thus ensuring that the unpacking module does not require much power. This is a critical aspect, as the unpacking module operates inside the human head.
This new prototype achieves the expected goals. It implements the complete system with a higher operating frequency and using a smaller amount of hardware resources, when compared to previous solutions. The results followed the expected model behavior. The system allows the implementation of the classic model with independent spatial processing for the three RGB channels. The power requirements also follow the expected results: using the full implementation of the digital blocks on the FPGA, the board requires 450 mW, a value that increases to 950 mW when the digital camera is connected. The developed board also operates as expected, and the output image is slightly better when compared with the one obtained with the previous board. The developed board can also be employed as a development platform for a different system. It offers I/O pins through two connectors, which also provide a power source. The video port uses 8 bits per RGB channel and high-speed DACs, generating true color video. This video port makes the board suitable for image processing.
6.1 Future work
Neural Networks modelling showed that the use of learning systems as predictors to model spike trains is a hypothesis that can be explored much further. As of now, the results are encouraging. There are also other learning methods that could be used instead of neural networks to perform this task, such as decision trees or even unsupervised learning methods. Information theory could also be applied to the coding of spike trains, as a measure of the information each spike adds to the transmission.
There are several models of retinal responses in the literature and, as a means to
compare the different models, an almost universal error measure should be established and
implemented. Also, Neural Networks modelling only approached the temporal dimension
of the response and the spatial dimension is yet to be addressed.
Finally, using the new prototype board it is now possible to implement the complete Classic Model with independent spatial processing. The Neural Networks model was not completely implemented; an additional effort should be made to develop the control unit. Also, since the coefficients require a great amount of memory, implementing larger neural networks would require the development of an expansion board to provide external memory.
Appendix A
Clustering Algorithms
A.1 K-means algorithm
The k-means algorithm [9, 39] is one of the simplest unsupervised learning algorithms that solve the well known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids should be placed in a cunning way, because different locations cause different results. The first step is to take each point belonging to a given data set and associate it with the nearest centroid. When no point is pending, a first grouping is complete. At this point we need to re-calculate k new centroids as barycenters of the clusters resulting from the previous step. After we have these k new centroids, a new binding has to be done between the same data set points and the nearest new centroid. A loop has been generated and, as a result, we may notice that the k centroids change their location step by step until no more changes are made; in other words, the centroids do not move any more. Finally, this algorithm aims at minimizing an objective function, in this case a squared error function.
The objective function (A.1) is an indicator of the distance of the n data points from their respective cluster centres, where \|x_i^{(j)} - c_j\| is a chosen distance measure between a data point x_i^{(j)} and the cluster centre c_j:

J = \sum_{j=1}^{K} \sum_{i=1}^{n} \|x_i^{(j)} - c_j\|^2    (A.1)
The algorithm is composed of the following steps:
1. Place K points into the space represented by the objects that are being clustered.
These points represent initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, re-calculate the positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation
of the objects into groups from which the metric to be minimized can be calculated.
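The four steps translate almost directly into code; a minimal NumPy sketch with random initial centroids (any other seeding strategy serves equally well) could read:

    import numpy as np

    def kmeans(X, K, max_iter=100):
        # X: (n, d) data matrix; K: number of clusters fixed a priori
        rng = np.random.default_rng(0)
        centroids = X[rng.choice(len(X), K, replace=False)]        # step 1
        for _ in range(max_iter):
            d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
            labels = d.argmin(axis=1)                              # step 2
            # step 3 (assumes no cluster becomes empty)
            new = np.array([X[labels == k].mean(axis=0) for k in range(K)])
            if np.allclose(new, centroids):                        # step 4
                break
            centroids = new
        return labels, centroids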
Although it can be proved that the procedure will always terminate, the k-means algorithm does not necessarily find the optimal configuration. K-means can converge to a local optimum, which in this case is a partition of points in which moving any single point to a different cluster increases the total sum of distances. The algorithm is also significantly sensitive to the initial, randomly selected cluster centres. The k-means algorithm can be run multiple times to reduce this effect.
A.2 EM algorithm and Bayesian classification
The EM algorithm is a general method of finding the maximum-likelihood estimate of
the parameters of an underlying distribution from a given data set when the data is
incomplete or has missing values [10].
Consider a data set X of size N taken from a distribution with density function p(x|\Theta), characterized by the set of parameters \Theta (e.g., in the current case p is a Gaussian distribution and \Theta holds the mean and covariance of p). If it is assumed that the observed data points X = \{x_1, x_2, \ldots, x_N\} are independent and identically distributed (i.i.d.), then the resulting density for the samples is

p(X|\Theta) = \prod_{i=1}^{N} p(x_i|\Theta) = L(\Theta|X)    (A.2)

where the function L(\Theta|X) is called the likelihood function. The likelihood is considered a function of the parameters \Theta where the data X is fixed. The maximum likelihood problem consists of finding the parameters \Theta that maximize L, that is,

\Theta^* = \arg\max_{\Theta} L(\Theta|X)    (A.3)
It is usual to maximize the log-likelihood function, log(L(Θ|X)), for analytical simplification.
Frequently, the probability density function (pdf) results from a mixture of M classes of some kind. In this case the pdf is defined by

p(x|\Theta) = \sum_{i=1}^{M} \pi_i\, p_i(x|\theta_i)    (A.4)

where the parameters are \Theta = (\pi_1, \ldots, \pi_M, \theta_1, \ldots, \theta_M), such that each \pi_i is the i-th class prior probability, \sum_{i=1}^{M} \pi_i = 1, and each p_i is a density function parameterized by \theta_i.
The log-likelihood expression for this density function is given by

\log(L(\Theta|X)) = \log \prod_{i=1}^{N} p(x_i|\Theta) = \sum_{i=1}^{N} \log \sum_{j=1}^{M} \pi_j\, p_j(x_i|\theta_j)    (A.5)
and it is not easy to optimize since it consists of the log of a sum. However, considering X incomplete and assuming the existence of unobserved data items Y = \{y_1, y_2, \ldots, y_N\} whose values indicate the component density that generated each data observation, the log-likelihood expression (A.5) becomes substantially simplified. Assuming that y_i = k if the i-th sample was generated by the k-th mixture component, the likelihood becomes

\log(L(\Theta|X,Y)) = \log(P(X,Y|\Theta)) = \sum_{i=1}^{N} \log(P(x_i|y_i)P(y_i)) = \sum_{i=1}^{N} \log(\pi_{y_i}\, p_{y_i}(x_i|\theta_{y_i}))    (A.6)

and can be computed using a variety of techniques, like the EM algorithm.
The algorithm is composed of the following steps:
1. Start (t = 0) with an initial arbitrary set of parameters \Theta^0 = \{(\pi_1^0, \theta_1^0), (\pi_2^0, \theta_2^0), \ldots, (\pi_M^0, \theta_M^0)\}, and an initial mixture estimate p(x|\Theta^0);

2. Compute the posteriors

P^t(k|x_i, \theta^t) = \frac{\pi_k^t\, p_k^t(x_i|\theta_k^t)}{p^t(x_i|\Theta^t)}    (A.7)

3. Estimate the parameters \Theta^{t+1} by maximizing the likelihood function with fixed P^t(k|x_i, \theta^t). Assuming a d-dimensional Gaussian with mean \mu and covariance matrix \Sigma, i.e., \theta_i = (\mu_i, \Sigma_i), then

p_i(x|\mu_i, \Sigma_i) = \frac{1}{(2\pi)^{d/2} |\Sigma_i|^{1/2}}\, e^{-\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1} (x-\mu_i)}    (A.8)

and the maximization at each iteration leads to

\hat{\mu}_k^{t+1} = \frac{\sum_{i=1}^{N} x_i\, P^t(k|x_i, \Theta^t)}{\sum_{i=1}^{N} P^t(k|x_i, \Theta^t)}    (A.9)

\hat{\Sigma}_k^{t+1} = \frac{\sum_{i=1}^{N} (x_i - \hat{\mu}_k^t)(x_i - \hat{\mu}_k^t)^T\, P^t(k|x_i, \Theta^t)}{\sum_{i=1}^{N} P^t(k|x_i, \theta_k^t)}    (A.10)

\hat{\pi}_k^{t+1} = \frac{1}{N} \sum_{i=1}^{N} P^t(k|x_i, \Theta^t)    (A.11)

4. Set t = t + 1 and repeat steps 2 and 3 while the likelihood increases.
In the case of unsupervised learning, after estimating the parameters of a mixture of probability density functions it is possible to compute the posteriors P(k|x), which are obtained with Bayes' rule:

P(k|x, \theta_k) = \frac{\pi_k\, p_k(x|\theta_k)}{\sum_{j=1}^{M} \pi_j\, p_j(x|\theta_j)}    (A.12)

If it is possible to assume that each mixture component matches a class, the EM algorithm provides an estimate of the a posteriori probability of each class given an observation, without explicitly indicating the observation classes. The natural classification is: x \in \mathrm{Class}_k \rightarrow k = \arg\max_j P(j|x).
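For illustration, one EM iteration for a Gaussian mixture, following (A.7) to (A.11), can be sketched as below (NumPy/SciPy; no safeguards against degenerate covariances are included):

    import numpy as np
    from scipy.stats import multivariate_normal

    def em_step(X, pi, mu, sigma):
        # E-step: posteriors P(k|x_i), equation (A.7)
        N, M = len(X), len(pi)
        p = np.column_stack([pi[k] * multivariate_normal.pdf(X, mu[k], sigma[k])
                             for k in range(M)])
        post = p / p.sum(axis=1, keepdims=True)
        # M-step: re-estimate the parameters, equations (A.9)-(A.11)
        Nk = post.sum(axis=0)
        mu_new = (post.T @ X) / Nk[:, None]
        sigma_new = []
        for k in range(M):
            diff = X - mu_new[k]
            sigma_new.append((post[:, k, None] * diff).T @ diff / Nk[k])
        return Nk / N, mu_new, np.array(sigma_new)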
Appendix B
SPiKes Classifier - User Manual
SPKC (SPiKes Classifier) is a Matlab-based spike-sorting program. It implements two algorithms for classifying spikes based on the principal component analysis of the spike waveforms. The two algorithms for sorting are k-means clustering and expectation-maximization (EM). This software was developed by José Germano and Ricardo Baptista for their graduation project (IST / SiPS, INESC-ID).
Figure B.1: SPKC (SPiKes Classifier) user window.
Description of buttons and functions:
• File - Open Select one or more similarly configured *.nev data files for analysis.
• Electrode# Select the channel to analyze. This box lists all available channels/units, classified or not.
• Get Waveforms Loads a sub-sample of the total number of available waveform
events, displays the waveforms in window 1 and displays the waveforms projected
onto their first two principal components in window 2.
• Clustering Algorithm combo box selection defines the clustering algorithm for
sorting spikes.
• Classes combo box selection defines the number of classes to use in classification.
• Princ.Comp. combo box selection determines the number of principal components
used to define each waveform in the principal component analysis. This determines
the size of the feature set used in determining the clusters. However, only the first
two leading principal components are used for display in graph 2.
• Sort Classifies the sub-samples into several units. Additional options for the clustering process can be set in the Clustering Algorithm box, the Classes box and the Princ. Comp. box.
• Remove Clears the classification on a given channel.
• Save Saves classification generated by ’Sort’ button.
Description of Graphs:
• Graph 1 displays the voltage-time waveforms of the neural events. Waveforms are
color coded to indicate their defined units.
• Graph 2 displays the neural waveforms projected onto their principal components.
Appendix C
Neural Network Modelling Spike
Trains Simulations
(a) ADALINE. (b) Modelling with one hidden layer. (c) Modelling with two hidden layers.
Figure C.1: Comparison of real and modelled spike trains, neural network model with stimulus input only.
(a) ADALINE. (b) Modelling with one hidden layer. (c) Modelling with two hidden layers.
Figure C.2: Comparison of real and modelled spike trains, neural network model with stimulus and response feedback input.
Appendix D
FPL Filter Implementation
To implement the desired modules, several types of filters are required. This section proposes a possible FPL implementation for these filters.
The DoG space filter was discretized using a square window to limit the order of the resulting Finite Impulse Response (FIR) filter. In discrete time,

m[n] = \sum_{k=-l}^{l} C_{DoG}(k) \cdot I[n-k]    (D.1)

where I represents the input, 2l + 1 is the size of the window and C_{DoG} are the filter coefficients. Applying the Z transform to (D.1), the discrete transfer function is obtained:

\frac{M(z)}{I(z)} = \sum_{k=-l}^{l} C_{DoG}(k) \cdot z^{-k}    (D.2)
Since the response was truncated, it is necessary to correct the filter coefficients in order to guarantee that the static gain is zero, resulting in

a_k = C_{DoG}(k) - \frac{1}{2l+1}\left(\sum_{j=-l}^{l} C_{DoG}(j) - A_0\right)    (D.3)

where A_0 represents the static gain and a_k are the new filter coefficients.
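In code, the correction of (D.3) followed by the convolution of (D.1) looks as follows, for a 1-D window; the C_{DoG} profile is any sampled difference-of-Gaussians supplied by the caller:

    import numpy as np

    def corrected_coeffs(c_dog, A0=0.0):
        # enforce the desired static gain A0 (zero by default), equation (D.3)
        c = np.asarray(c_dog, dtype=float)
        return c - (c.sum() - A0) / len(c)

    def dog_filter(signal, c_dog):
        # discrete convolution of (D.1) using the corrected coefficients
        return np.convolve(signal, corrected_coeffs(c_dog), mode='same')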
This DoG filter was implemented using one multiplier by folding its architecture N^2 = (2l+1)^2 times, therefore taking N^2 cycles to compute the convolution. Figure D.1(a) represents the FPL implementation adopted for this filter, where the ROM stores the filter coefficients. Using this solution, the working frequency will be N^2 times smaller than with the parallel solution¹.

The discrete high-pass filter was obtained by applying the bilinear transformation to (D.4), the Laplace-domain transfer function of this filter:

G_{HP}(s) = \frac{s}{s+\alpha}    (D.4)

¹ Not implemented, as it needed too many resources.
As the bilinear transformation is given by s = \frac{2}{T_s}\frac{1-z^{-1}}{1+z^{-1}}, the resulting IIR filter is

G_{HP}(z) = C_{HP}\,\frac{1-z^{-1}}{1-B_{HP}\,z^{-1}}    (D.5)

where C_{HP} and B_{HP} are the filter coefficients. In discrete time this results in

r[n] = B_{HP} \cdot r[n-1] + C_{HP} \cdot (m[n] - m[n-1])    (D.6)

in which m represents the input and r the output.
The discrete IIR low-pass filter, which has a frequency response given by

H_{LP}(s) = \frac{B}{s + \frac{1}{\tau}}    (D.7)

was also discretized using the bilinear transformation, resulting in

v[n] = B_{LP} \cdot v[n-1] + C_{LP}\,(u[n] + u[n-1])    (D.8)

where u represents the input and v the output.
These two IIR filters can be implemented in FPL. In Figure D.1(b) a possible architecture is presented.
(a) Gaussian filter. (b) IIR filter.
Figure D.1: FPL implementation of the filters.
The only difference between the low-pass and the high-pass filter is the coefficient, coef, and the feedback signal. The high-pass has coef = -B_{HP}/C_{HP} and a negative feedback; the low-pass has coef = B_{LP}/C_{LP} and the feedback is positive.
Appendix E
Prototype Datasheet
OVERVIEW
This prototype board was developed to allow the validation of the complete processing
module developed in the scope of the European project CORTIVIS - Cortical Visual
Neuroprosthesis for the Blind. Figure E.1 presents the block diagram of this board.
Figure E.1: Prototype block diagram.
The board can also be employed as a development platform for a different system. It offers a total of 61 I/O pins, of which 6 are global clock pins, on two expansion connectors; these connectors also provide a power source to a possible expansion module. The video port uses 8 bits per RGB channel, generating a true color video format, which makes this board suitable for image processing.
FEATURES
The developed board presents the following main characteristics:
• Xilinx Spartan-3 XC3S400 FPGA as processing core. This device has 400K system gates, a total memory of 288 kbits distributed over 16 RAM blocks, and 16 dedicated 18-bit multipliers;
• three high efficiency, 3 A maximum output current power regulators (1.2 V, 2.5 V and 3.3 V) and a 1.5 A maximum output current 5 V linear power regulator. The 3.3 V and 5 V rails are available on the expansion connectors;
• 50 MHz HCMOS 3.3 V oscillator;
• VGA true color display port, using three onboard low power consumption 20 MHz conversion rate video DACs;
• four slide switches and one push button;
• FPGA programming through a six-pin JTAG header;
• two expansion slots with a total of 61 I/O pins, of which 6 are GCLK inputs. Both slots can provide supply power.
POWER SUPPLY
This board requires a power supply with an output voltage within the range of 5 V to 6 V DC. The power connector accepts a 2.1 mm female center-positive plug, and the supply must be capable of delivering at least 1 A. The board generates a total of four supply voltages: 1.2 V, 2.5 V, 3.3 V and 5 V. To save power, all the supply voltages except the 5 V are generated using high efficiency regulators. Some of these voltages, 5 V and 3.3 V, as well as the unregulated input, are available through the expansion connectors.
OSCILLATOR
The employed oscillator has an output frequency of 50 M Hz with a 3.3 V maximum
amplitude. The oscillator is connected to a GCLK FPGA pin (P181) and is placed close
to the FPGA. Oscillators for different frequencies can be employed.
VGA DISPLAY PORT
The display port is based on three separate 8-bit video Digital-to-Analog Converters, TLC5602C, one per RGB channel. The TLC5602C DAC has a low power consumption (80 mW typ.), a 20 MHz conversion rate, a single 5 V power supply and TTL digital input voltage. Each of these video DACs is connected by a dedicated 8-bit wide data bus to the FPGA, and the clock net is the same for all the devices. The VGA synchronization pins are connected directly to the FPGA. Table E.1 shows the FPGA pins that provide those signals and also the output clock net for the DACs.
Table E.1: VGA synchronization pins and DAC clock.

FPGA PIN | Signal
139      | HS
138      | VS
180      | DAC clock
The pin mapping between the FPGA I/O and the DACs is summarized in Table E.2.
Table E.2: FPGA pins for the video DACs.

Data Red             | Data Green           | Data Blue
FPGA PIN | BUS num   | FPGA PIN | BUS num   | FPGA PIN | BUS num
106      | 0 (LSB)   | 116      | 0 (LSB)   | 126      | 0 (LSB)
107      | 1         | 117      | 1         | 128      | 1
108      | 2         | 119      | 2         | 130      | 2
109      | 3         | 120      | 3         | 131      | 3
111      | 4         | 122      | 4         | 132      | 4
113      | 5         | 123      | 5         | 133      | 5
114      | 6         | 124      | 6         | 135      | 6
115      | 7 (MSB)   | 125      | 7 (MSB)   | 137      | 7 (MSB)
When using this video port, ensure that the clock input for the video DACs is lower than 20 MHz. Also, only connect the VGA plug when the board is connected to the power supply.
SLIDE SWITCHES AND PUSH BUTTON
When in the ON position, the corresponding FPGA pin is pulled to ground; when in the OFF position, the pin goes high. The push button is also pulled to ground when pressed; otherwise the pin is pulled to VCC. Table E.3 presents the pin mapping for the slide switches and the push button.
FPGA CONFIGURATION
The FPGA is configured using a 6 pin JTAG header. Figure E.2 shows the function of
each pin in the connector.
Table E.3: FPGA pins for the slide switches and the push button.

FPGA PIN | Name  | PIN num
156      | SW(0) | 1
155      | SW(1) | 2
154      | SW(2) | 3
152      | SW(4) | 4
150      | PB    | -
Figure E.2: JTAG connector pin order (TMS, TDI, TDO, TCK, GND, VCC).
The header can be connected to a standard JTAG programming cable (Digilent JTAG3); the VCC pin is connected to the 3.3 V supply voltage.
EXPANSION CONNECTORS
This board provides two expansion connectors, both of which can provide supply power to an expansion module. The pin arrangement is not the same for the two connectors: as the camera connector must mate with a digital camera, its pin arrangement must match the camera module. Table E.4 shows the pin mapping between the FPGA I/O and the camera connector. Table E.5 shows the FPGA pins for the generic connector expansion slot.
Table E.4: FPGA pins for the camera connector expansion slot.

FPGA PIN | Name | Connect num
189      |      | 1
187      |      | 2
191      |      | 3
190      |      | 4
196      |      | 5
194      |      | 6
198      |      | 7
197      |      | 8
200      |      | 9
199      |      | 10
204      |      | 11
203      |      | 12
205      |      | 13
183*     |      | 14
-        | GND  | 15
-        | GND  | 16
184*     |      | 17
2        |      | 18
3        |      | 19
-        | 5V   | 20
-        | GND  | 21
-        | 5V   | 22
4        |      | 23
5        |      | 24
7        |      | 25
9        |      | 26
10       |      | 27
11       |      | 28
12       |      | 29
13       |      | 30
-        | GND  | 31
-        | VTO  | 32

* CCLK pins

Table E.5: FPGA pins for the generic expansion slot.

FPGA PIN | Name     | Connect num
48       |          | 1
46       |          | 2
52       |          | 3
51       |          | 4
58       |          | 5
57       |          | 6
62       |          | 7
61       |          | 8
64       |          | 9
63       |          | 10
67       |          | 11
65       |          | 12
71       |          | 13
68       |          | 14
74       |          | 15
72       |          | 16
77*      |          | 17
76*      |          | 18
79*      |          | 19
78*      |          | 20
81       |          | 21
80       |          | 22
85       |          | 23
83       |          | 24
87       |          | 25
86       |          | 26
92       |          | 27
90       |          | 28
94       |          | 29
93       |          | 30
96       |          | 31
95       |          | 32
100      |          | 33
97       |          | 34
102      |          | 35
101      |          | 36
-        | 5 V      | 37
-        | 3.3 V    | 38
-        | Vccunreg | 39
-        | GND      | 40

* CCLK pins
Appendix F
Prototype Board Schematics
Figure F.1: FPGA electrical diagram.
Figure F.2: FPGA power connections and configuration.
Figure F.3: Power regulators.
Figure F.4: Digital to analog converters.
Figure F.5: Main schematic.
Bibliography
[1] Brian A. Wandell. Foundations of Vision. Sinauer Associates, Inc., 1995.
[2] Pedro F. Z. Tomás. Bio-inspired processing module for the development of an artificial retina. Graduation Report, Instituto Superior Técnico, Lisbon, 2003.
[3] CORTIVIS Project. Communication: Protocol, data formats and digital circuits.
Technical report, INESC-ID, June 2003.
[4] Xilinx. Power Distribution System (PDS) Design: Using Bypass/Decoupling Capacitors.
http://www.xilinx.com/bvdocs/appnotes/xapp623.pdf.
[5] Digilent Inc. Digilab DIO2 Reference Manual.
http://www.digilentinc.com/Data/Products/DIO2/DIO2-rm.pdf.
[6] Cortical Visual Neuroprosthesis for the Blind - CORTIVIS.
http://cortivis.umh.es.
[7] Helga Kolb. How the retina works. American Scientist, 91, 2003.
[8] B. C. Wheeler and W. J. Heetderks. A comparison of techniques for classification of multiple neural signals. IEEE Trans. Biomed. Eng., 29:752–759, 1982.
[9] Everitt B. S. Cluster Analysis. New York: Wiley, 1993.
[10] G. J. McLachlan and T. Krishnan. The EM Algorithm and Extensions. New York: Wiley, 1997.
[11] Michael S. Lewicki. A review of methods for spike sorting: the detection and classification of neural action potentials. Network: Comput. Neural Syst., 9:R53–R78,
1998.
[12] Stefan D. Wilke, Andreas Thiel, Christian W. Eurich, Martin Greschner, Markus Bongard, Josef Ammermüller and Helmut Schwegler. Population coding of motion patterns in the early visual system. Journal of Comparative Physiology A, 187(7):549–558, March 2001.
[13] Markus Meister and Michael J. Berry II. The Neural Code of the Retina. Neuron,
22:435–450, March 1999.
[14] Iman H. Brivanlou, David K. Warland and Markus Meister. Mechanisms of Concerted
Firing among Retinal Ganglion Cells. Neuron, 20:527–539, March 1998.
[15] Michael J. Berry II, Iman H. Brivanlou, Thomas A. Jordan and Markus Meister.
Anticipation of moving stimuli by the retina. Nature, 398:334–338, March 1999.
[16] Ben Kröse and Patrick van der Smagt. An Introduction to Neural Networks. 1996.
[17] E. Fiesler and R. Beale, editors. Handbook of Neural Computation. Institute of
Physics and Oxford University Press, 1996.
[18] Fred Rieke, David K. Warland, Rob de Ruyter van Steveninck and William Bialek.
Spikes: Exploring the Neural Code. Cambridge, MA: MIT Press, 1997.
[19] Michael J. Berry, David K. Warland and Markus Meister. The structure and precision
of retinal spike trains. Proc. Natl. Acad. Sci. USA, 94:5411–5416, May 1997.
[20] E. M. L. Beale. A derivation of conjugate gradients. In F. A. Lootsma, editor,
Numerical methods for nonlinear optimization. Academic Press, London, 1972.
[21] M. J. D. Powell. Restart procedures for the conjugate gradient method. Mathematical
Programming, 12:241–254, 1977.
[22] C. Charalambous. Conjugate gradient algorithm for efficient training of artificial
neural networks. IEEE Proceedings, 139(3):301–310, 1992.
[23] Justin Keat, Pamela Reinagel, R. Clay Reid and Markus Meister. Predicting Every
Spike: A Model for the Responses of Visual Neurons. Neuron, 30:803–817, June
2001.
[24] Rubén Moreno and Néstor Parga. Firing rate for a generic integrate-and-fire neuron with exponentially correlated input. Lecture Notes in Computer Science, Springer-Verlag, Heidelberg, 2415, 2002.
[25] Kwabena A. Boahen. Point-to-point connectivity between neuromorphic chips using
address-events. IEEE Transactions on Circuits and Systems, pages 100–117, 1999.
[26] Lazzaro J. and Wawrzynek J. A multi-sender asynchronous extension to the address
event protocol. In Proc. of 16th Conference on Advanced Research in VLSI, pages
158–169, 1995.
[27] Xilinx. Spartan-3 FPGA Family: Complete Data Sheet.
http://direct.xilinx.com/bvdocs/publications/ds099.pdf.
[28] Xilinx Inc. Spartan-2 Complete Data Sheet.
http://direct.xilinx.com/bvdocs/publications/ds077.pdf.
[29] MAXIM. MAX1830/1831, 3A, 1MHz, Low-Voltage, Step-Down Regulators with
Synchronous Rectification and Internal Switches.
http://pdfserv.maxim-ic.com/en/ds/MAX1830-MAX1831.pdf.
[30] Texas Instruments. TLC5602C, VIDEO 8-BIT DIGITAL-TO-ANALOG CONVERTERS.
http://www-s.ti.com/sc/ds/tlc5602.pdf.
[31] MAXIM. MAX6004, Low-Cost, Low-Power, Low-Dropout,SOT23-3 Voltage References.
http://pdfserv.maxim-ic.com/en/ds/MAX6001-MAX6005.pdf.
[32] Texas Instruments. TPS78601, ULTRALOW-NOISE, HIGH PSRR, FAST RF 1.5
A LOW-DROPOUT LINEAR REGULATORS.
http://www-s.ti.com/sc/ds/tps78601.pdf.
[33] C-MAC. CFPS-73, 50 MHz, Tri-state HCMOS (3.3V) oscillator.
http://www.cmac.com/mt/databook/oscillators/zarlink/72-73/CFPS-72,
[34] Quasar Electronics Ltd. C3188A - 1/3” Color Camera Module With Digital Output.
http://www.electronic-kits-and-projects.com/kit-files/cameras/d-c3188a.pdf.
[35] OmniVision Technologies Inc. OV7620 - Single-Chip Cmos Color Digital Camera.
http://mxhaard.free.fr/spca50x/Doc/Omnivision/OV7620.pdf.
[36] OmniVision Technologies Inc. The Serial Camera Control Bus Functional Specifications.
http://www.ovt.com/pdfs/ds note.pdf.
[37] Philips Semiconductors. THE I2C-BUS SPECIFICATION.
http://www.semiconductors.philips.com/acrobat/literature/9398/39340011.pdf.
[38] Digilent Inc. Digilab 2 Reference Manual.
http://www.digilentinc.com/Data/Products/D2/D2-rm.PDF.
[39] MacQueen, J. B. Some Methods for classification and Analysis of Multivariate Observations, Proceedings of 5-th Berkeley Symposium on Mathematical Statistics and
Probability. Berkeley, University of California Press, 1:281–297, 1967.