Powerful Neuroprocessor for Intelligent Systems SAND

IMS CHIPS, certified to ISO 9001 and DIN EN 100 114
General Description
SAND (Simply-Applicable Neural Device) is a neural processor based upon the principle of a systolic array.
Four parallel processor elements form the heart of this array. Each processor element has a multiplier
and two adders, one of which serves as an accumulator. A post-processing module determines the largest and smallest output activation. With a maximum clock frequency of fmax = 50 MHz,
SAND achieves a performance of 200 MCPS (Mega Connections Per Second). Multiple SAND chips may
be connected in parallel in order to attain a further increase in performance. SAND was designed by
Forschungszentrum Karlsruhe and IMS Chips in the framework of the neuro-logic working group.
SAND Properties
• 4 parallel processor elements
• Calculation of scalar product and vector distance
• Extreme value search (minimum and maximum)
• Cut-function with over/underflow-recognition
• On-line adaptation of arithmetic precision
• Cascadable architecture
• The following neural networks are supported
- multilayer perceptron (MLP)
- radial basis function networks (RBF)
- Kohonen feature maps
• Simple programming (34-bit control words)
• Parallel weight and data bus
• Only a few external peripheral components are
needed to operate SAND:
- Look-up table for the non-linear transfer function (no additional processor required)
- Sequencer for the overall memory management
as well as the control of SAND itself
- Weight and intermediate memory
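As an illustration of how vector distance and extreme value search combine for a Kohonen feature map, the following Python sketch computes the distance from one input vector to each neuron's weight vector and keeps the minimum. It is a plain software model, not the chip's implementation. The function names are hypothetical, and the squared Euclidean distance is an assumption, since this data sheet does not state which norm SAND evaluates.

```python
def squared_distance(x, w):
    """Accumulate (x_i - w_i)^2 over one input/weight vector pair."""
    acc = 0
    for xi, wi in zip(x, w):
        diff = xi - wi          # first adder: difference
        acc += diff * diff      # multiplier plus accumulating adder
    return acc


def winner_search(x, weights, lanes=4):
    """Return (index, distance) of the closest weight vector.

    Neurons are handled in groups of `lanes`, mirroring the four
    parallel processor elements; the grouping does not change the result.
    """
    best_index, best_dist = None, None
    for start in range(0, len(weights), lanes):
        group = weights[start:start + lanes]
        dists = [squared_distance(x, w) for w in group]
        for offset, d in enumerate(dists):
            if best_dist is None or d < best_dist:   # minimum search
                best_index, best_dist = start + offset, d
    return best_index, best_dist


x = [3, -1, 2]
weights = [[0, 0, 0], [3, -1, 1], [5, 5, 5], [3, 0, 2], [-3, 1, -2]]
print(winner_search(x, weights))
```

On a tie, this sketch keeps the neuron with the lowest index; how SAND resolves ties is not specified here.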
Technical Data
• 200 MCPS (600 MOPS) at maximum clock speed
• 100 Mbytes/s maximum data transfer rate on
the weight/data bus
• Maximum MLP network size:
512–512–512–...–512 (I-H-H-...-O)
• Maximum Kohonen network size: 512–256 (I-O)
• Maximum RBF network size: 512–512–512 (I-H-O)
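The headline figures above can be cross-checked with simple arithmetic. The sketch below assumes that each processor element completes one connection per clock cycle and that each 16-bit bus transfers one word per clock cycle; both assumptions are inferred from the numbers in this data sheet rather than stated in it, and the function names are hypothetical.

```python
CLOCK_HZ = 50_000_000       # maximum clock rate from this data sheet
PROCESSOR_ELEMENTS = 4      # parallel processor elements
WORD_BITS = 16              # width of the data and weight words


def peak_mcps(clock_hz=CLOCK_HZ, pes=PROCESSOR_ELEMENTS):
    """Peak connections per second in millions (one per PE per clock, assumed)."""
    return pes * clock_hz / 1e6


def bus_mbytes_per_s(clock_hz=CLOCK_HZ, word_bits=WORD_BITS):
    """Peak bus rate in Mbytes/s (one word per clock, assumed)."""
    return clock_hz * word_bits / 8 / 1e6


def layer_time_s(n_in, n_out, clock_hz=CLOCK_HZ, pes=PROCESSOR_ELEMENTS):
    """Lower bound on the time for one fully connected layer at peak rate."""
    return n_in * n_out / (pes * clock_hz)


print(peak_mcps())               # 200.0
print(bus_mbytes_per_s())        # 100.0
print(layer_time_s(512, 512))    # about 1.3 ms for a 512-512 layer
```

The 600 MOPS figure is consistent with counting three operations per connection (600 / 200 = 3), which would match the one multiplier and two adders in each processor element; that reading is an interpretation, not a statement from the data sheet.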
Pin Description
[Figure: pin assignment of the 120-pin CPGA package, drawn as a grid with rows 1 to 13 and columns A to N (no column I). Pins are grouped into Control, Data, Weights, Address, and Power. Package marking: IMS 201.047 • FZK. Dimension shown: 33.02 ± 0.33 mm.]
Performance Specifications
Electrical and Mechanical Characteristics
• Supply voltage Vdd: 4.5 V … 5.5 V
• Maximum clock rate: 50 MHz
• Maximum power consumption PV,max: 3 W
• CPGA package, 120 pins
• Dimensions: 33 mm x 33 mm x 2 mm
• Input / Output: CMOS
Chip Architecture
Description of the SAND architecture
• Calculation of the scalar product X·Y (multilayer
perceptron)
• Calculation of the vector distance ||X-Y||
(Kohonen and RBF networks)
• Extreme value search (Kohonen)
• 16-bit data and weights
• 40-bit internal precision
• AutoCut module for reduction of word width to
16 bits, with saturation function and on-line
adaptation of arithmetic precision
• Processing of packets consisting of 4 data words
• Activation (input and output) normalized to
the range -1.0…+1.0
• 8 fixed-point formats available for the weights
(0.25…128)
• 2 selectable output formats: linear activation
function or any transfer function as look-up table in external memory
• All data represented in two’s complement
format
• Continuous dataflow on the weight and data
busses (max. 100 MB/s)
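To make the number formats above concrete, here is a minimal Python model of one multiply-accumulate pass followed by a reduction to 16 bits with saturation. Several details are assumptions made for illustration: activations are taken to have 15 fractional bits (range -1.0…+1.0), weights 8 fractional bits (range ±128, one end of the quoted weight range), and the reduction uses a fixed shift. The actual AutoCut module adapts its precision on-line, which this sketch does not attempt to model. All function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1
ACC_MIN, ACC_MAX = -(1 << 39), (1 << 39) - 1   # 40-bit two's complement


def to_fixed(value, frac_bits):
    """Quantize a real number to a saturated 16-bit two's complement word."""
    raw = round(value * (1 << frac_bits))
    return max(INT16_MIN, min(INT16_MAX, raw))


def mac(data_words, weight_words):
    """Scalar product of 16-bit words in a 40-bit accumulator."""
    acc = 0
    for d, w in zip(data_words, weight_words):
        acc += d * w
        assert ACC_MIN <= acc <= ACC_MAX, "40-bit accumulator overflow"
    return acc


def cut_to_16(acc, shift):
    """Drop `shift` low bits, saturate to 16 bits, and flag over/underflow."""
    shifted = acc >> shift          # arithmetic shift on Python integers
    if shifted > INT16_MAX:
        return INT16_MAX, True
    if shifted < INT16_MIN:
        return INT16_MIN, True
    return shifted, False


DATA_FRAC, WEIGHT_FRAC = 15, 8      # assumed formats, see lead-in

data = [to_fixed(v, DATA_FRAC) for v in (0.5, -0.25, 0.75, 0.125)]
weights = [to_fixed(v, WEIGHT_FRAC) for v in (1.0, 2.0, -0.5, 4.0)]

word, overflow = cut_to_16(mac(data, weights), WEIGHT_FRAC)
print(word / (1 << DATA_FRAC), overflow)   # 0.125 False
```

Shifting by the number of weight fractional bits returns the result to the activation format, so the 16-bit output can be fed back in as input to the next layer.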
[Block diagram: 16-bit data and weight inputs pass through registers and switches to four processor elements, each with a multiplier (Mult) and adders (Add). Each element feeds a 40-bit result into its own AutoCut module, which reduces it to 16 bits. A switch and the post-processing stage deliver 16-bit data and address outputs.]
Figure 1: Internal architecture of the SAND neural processor
Application Notes
Typical fields of application of
the SAND neural processor are:
• Pattern recognition
• Image processing
• High-energy physics
• Control engineering
• Prediction systems
Very few external peripheral components are necessary for the operation of SAND. Only a weight store (SRAM, 64k words), a FIFO (2k words) and a look-up table (SRAM, 64k words) are required. For the control of SAND and the memory management, a sequencer component (FPGA) is available, providing a simple macro command
set. Figure 2 shows a minimal configuration.
[Block diagram: Data_in (16), weight memory SRAM 64k x 16, SAND, FIFO 2k x 16, Cmd_bus (8), Sequencer (34), look-up table, Data_out (16).]
Figure 2: Minimal Configuration
The input data streams (weights and data) are processed in the SAND chip. Depending on the selected
operating mode, a non-linear transfer function may be implemented by means of an external look-up
table. The output data of the hidden layers are stored temporarily in the FIFO before being processed
again by the SAND chip as previously described.
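The dataflow described above can be sketched in Python as a loop over layers: compute each neuron's scalar product, reduce it to a 16-bit word, optionally pass it through a look-up table, and queue the result in a FIFO for the next layer. This is a behavioural illustration only. The number formats (15 fractional bits for activations, 8 for weights), the fixed shift, and all function names are assumptions, and the sequencer's role is reduced to a plain loop.

```python
import math
from collections import deque

WORD_MIN, WORD_MAX = -(1 << 15), (1 << 15) - 1
SHIFT = 8   # assumption: weights carry 8 fractional bits


def saturate(value):
    """Clamp an integer to the 16-bit two's complement range."""
    return max(WORD_MIN, min(WORD_MAX, value))


def build_lut(transfer):
    """Tabulate `transfer` on [-1, 1) with one entry per 16-bit input word."""
    table = []
    for index in range(1 << 16):
        signed = index - (1 << 16) if index >= (1 << 15) else index
        y = transfer(signed / (1 << 15))
        table.append(saturate(round(y * (1 << 15))))
    return table


def forward(inputs, layers, lut=None):
    """Run an MLP layer by layer; `lut=None` selects the linear output."""
    fifo = deque(inputs)                 # intermediate activations
    for weights in layers:               # one weight row per neuron
        data = list(fifo)
        fifo.clear()
        for row in weights:
            acc = sum(d * w for d, w in zip(data, row))
            word = saturate(acc >> SHIFT)
            # The 16-bit word addresses the external look-up table.
            fifo.append(word if lut is None else lut[word & 0xFFFF])
    return list(fifo)


layers = [[[256, 256], [512, 0]], [[256, -256]]]   # weights 1.0, 1.0, 2.0, 0.0 / 1.0, -1.0
tanh_lut = build_lut(lambda x: math.tanh(2.0 * x))
outputs = forward([16384, -8192], layers, tanh_lut)
print([o / (1 << 15) for o in outputs])
```

Passing `lut=None` corresponds to the linear activation mode; any other transfer function only requires a different table, with no change to the loop.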
A neural processor board that contains up to four SAND chips and a PCI host interface is distributed by
INCO Systeme, Leipzig (www.inco.de). This board reaches a peak performance of 800 MCPS.
IMS • Allmandring 30a • D-70569 Stuttgart
Tel. +49/711/685-7333 • Fax +49/711/685-5930
http://www.uni-stuttgart.de/ims/
The printed data are preliminary and subject to change without prior notice.
8/97