
Neural Networks (NN)
Part 1
1. NN: Basic Ideas
2. Computational Principles
3. Examples of Neural Computation
1. NN: Basic Ideas
• Neural network: An example
• A neural network has been shown to be computationally equivalent to a universal
Turing machine; it can compute anything that is
computable (Siegelmann & Sontag, 1991).
• Further, if equipped with appropriate algorithms, the
neural network can be made into an intelligent
computing machine that solves problems in finite time.
• Such algorithms have been developed in recent years:
– Backpropagation learning rule (1985)
– Hebbian learning rule (1949)
– Kohonen’s self-organizing feature map (1982)
– Hopfield nets (1982)
– Boltzmann machine (1986)
• Biological neural network (BNN) vs artificial neural
network (ANN)
McCulloch-Pitts Networks (1943)
- Model of artificial neurons that computes Boolean
logical functions
Y  f (W X  W X  Q)
1 1
2 2
where Y, X1, X2 take on binary values of 0 or 1, and W1,
W2, Q take on continuous values
(e.g.) for W1 = 0.3, W2 = 0.5, Q = 0.6
X1  X2  Y
 0   0  0
 0   1  0
 1   0  0
 1   1  1
Boolean AND Computation
McCulloch & Pitts (1943, Fig. 1)
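As a rough sketch (in Python, using the example values above: W1 = 0.3, W2 = 0.5, threshold Q = 0.6), a single M-P unit reproduces the Boolean AND table:

```python
def mp_unit(x1, x2, w1=0.3, w2=0.5, q=0.6):
    """McCulloch-Pitts unit: fires (Y = 1) when the weighted input sum reaches the threshold Q."""
    return 1 if w1 * x1 + w2 * x2 - q >= 0 else 0

# Reproduce the Boolean AND truth table above.
for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "->", mp_unit(x1, x2))   # Y = 1 only for (1, 1)
```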
Knowledge representation in M-P network
Hiring Rule #1:
“A job candidate who has either good grades or prior job experience, and who also gets strong letters and receives a positive mark in the interview, tends to make a desirable employee and therefore should be hired.”
“Knowledge is in the connection weights (and the threshold).”
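To make “knowledge is in the connection weights” concrete, here is a minimal sketch of Rule #1 as a two-layer M-P network. The topology (one hidden OR unit feeding an output unit) and the weights and thresholds are illustrative assumptions, not values from the original slides.

```python
def threshold(net):
    """Step activation: fire when net input is non-negative."""
    return 1 if net >= 0 else 0

def hire_rule_1(grades, experience, letters, interview):
    """Rule #1 as an M-P network (hypothetical weights):
    hidden unit = grades OR experience; output = hidden AND letters AND interview."""
    # OR unit: fires if either input is on (weights 1.0, threshold 0.5).
    or_unit = threshold(1.0 * grades + 1.0 * experience - 0.5)
    # AND unit: fires only if all three inputs are on (weights 1.0, threshold 2.5).
    return threshold(1.0 * or_unit + 1.0 * letters + 1.0 * interview - 2.5)

print(hire_rule_1(1, 0, 1, 1))  # 1: good grades, strong letters, positive interview -> hire
print(hire_rule_1(1, 0, 1, 0))  # 0: negative interview -> do not hire
```

In this sketch, raising the weight from the OR unit to 2.0 (with the output threshold kept at 2.5) would change the output condition from “strong letters and a positive interview” to “strong letters or a positive interview”, which is exactly the kind of weight modification the next rule (Rule #1A) calls for.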
Learning in M-P network
Hiring Rule #1A:
“A job candidate who has either good grades or prior job experience, and who also gets strong letters or receives a positive mark in the interview, tends to make a desirable employee and therefore should be hired.”
“Learning through weight modification (e.g., Hebb rule).”
Acquisition of new knowledge in M-P network
Hiring Rule #2:
“In addition to Rule #1A, a job candidate who has no prior job experience and receives a negative mark in the interview shouldn’t be hired.”
“Acquisition of new knowledge through creation of new neurons
(i.e., synaptogenesis).”
2. Computational Principles
1. Distributed representation
2. Lateral inhibition
3. Bi-directional interaction
4. Error-correction learning
5. Hebbian learning
1. Distributed Representation
(vs. localist representation)
An object is represented by multiple units; the
same unit participates in multiple
representations:
Why distributed representation?
1. Efficiency
Solves the combinatorial explosion problem: with n binary units, 2^n different representations are possible.
(e.g.) How many English words can be formed from combinations of the 26 letters of the alphabet?
2. Robustness (fault-tolerance)
Loss of one or a few units does not affect representation.
(e.g., holographic image, Fourier transformation).
2. Lateral Inhibition
Selective activation through activation-dependent inhibitory competition (i.e., winner-take-all, WTA).
(Figure: letter units A, Z, and K competing through mutual inhibitory connections)
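A minimal sketch of winner-take-all selection through lateral inhibition (the number of units, the inhibition strength, and the step size are illustrative assumptions): each unit is driven by its external input and inhibited by the total activity of its competitors, and only the most strongly driven unit remains active after settling.

```python
import numpy as np

def winner_take_all(external_input, inhibition=1.0, dt=0.1, steps=500):
    """Settle a small network in which every unit inhibits every other unit
    in proportion to its own activity (lateral inhibition)."""
    e = np.array(external_input, dtype=float)
    a = np.zeros_like(e)                      # activations start at rest
    for _ in range(steps):
        others = a.sum() - a                  # total activity of the competitors
        a = np.maximum(0.0, a + dt * (e - a - inhibition * others))
    return a

# The 'A' unit receives slightly stronger input than 'Z' and 'K' and suppresses them.
print(winner_take_all([0.9, 0.7, 0.6]))       # approximately [0.9, 0.0, 0.0]
```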
3. Bi-directional Interaction
(interactive, recurrent connections)
This top-down-and-bottom-up processing performs constraint optimization and is generally faster than uni-directional computation (e.g., word recognition).
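As a rough illustration in the spirit of interactive-activation models of word recognition (the two candidate words, the weights, and the constants below are assumptions for this sketch, not part of the original), bottom-up input activates candidate words, the word units compete, and top-down feedback from the winning word disambiguates a weakly supported letter:

```python
import numpy as np

# Letter units [C1, A2, T3, R3] and word units [CAT, CAR].
# Each word shares bidirectional excitatory connections with its letters.
W = np.array([[1.0, 1.0, 1.0, 0.0],    # CAT <-> C1, A2, T3
              [1.0, 1.0, 0.0, 1.0]])   # CAR <-> C1, A2, R3

def settle(bottom_up, steps=30, rate=0.2, feedback=0.2, inhibition=2.0):
    bu = np.array(bottom_up, dtype=float)
    letters, words = bu.copy(), np.zeros(2)
    for _ in range(steps):
        # Bottom-up pass: letters drive the words; the word units also compete laterally.
        drive = W @ letters - inhibition * (words.sum() - words)
        words = np.clip(words + rate * (drive - words), 0.0, 1.0)
        # Top-down pass: active words feed excitation back to their letters.
        letters = np.clip(bu + feedback * (W.T @ words), 0.0, 1.0)
    return letters, words

# The third letter is ambiguous (weak support for T, slightly weaker for R);
# feedback from the word layer resolves it in favor of CAT.
letters, words = settle([0.9, 0.9, 0.3, 0.2])
print(words)    # the CAT unit ends up clearly more active than CAR
print(letters)  # top-down support raises the T unit above the R unit
```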
Bi-directional interactions in Speech Perception
4. Error-correction Learning
(e.g., backpropagation)
dWik = e * Ai * (Tk - Bk)
where Ai is the activation of input unit i, Bk is the actual activation of output unit k, Tk is the target activation of output unit k, and e is the learning rate.
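A minimal sketch of this error-correction rule applied to a single layer of threshold output units (the task, Boolean OR, and the constants are illustrative; backpropagation extends the same error signal to hidden layers):

```python
import numpy as np

def train_delta_rule(inputs, targets, lr=0.5, epochs=50):
    """Error-correction learning: dW[i, k] = lr * A[i] * (T[k] - B[k])."""
    n_in, n_out = inputs.shape[1], targets.shape[1]
    W = np.zeros((n_in, n_out))
    for _ in range(epochs):
        for A, T in zip(inputs, targets):
            B = (A @ W >= 0.5).astype(float)   # output activations (threshold units)
            W += lr * np.outer(A, T - B)       # weight change driven by the error T - B
    return W

# Learn Boolean OR with a single threshold output unit.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [1]], dtype=float)
W = train_delta_rule(X, T)
print((X @ W >= 0.5).astype(int).ravel())      # prints [0 1 1 1]
```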
A three-layer feedforward network with backpropagation learning can approximate any measurable function to any desired degree of accuracy by increasing the number of hidden units (Hornik et al., 1989, 1990; Hecht-Nielsen, 1989). However, the biological plausibility of backpropagation learning is yet to be confirmed.
5. Hebbian Learning
(unsupervised/self-organizing)
dWik = e * Bk * (Ai - Wik)
Hebbian Rule Encodes Correlational Information
Asymptotically,
Wik = P(Ai = ‘fire’ | Bk = ‘fire’)
In other words, the weight Wik stores information about correlation (i.e.,
co-firing activities) between the input and output units.
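A small simulation of the asymptotic claim (the firing probabilities and learning rate are illustrative assumptions): repeatedly applying dWik = e * Bk * (Ai - Wik) drives the weight toward the empirical P(Ai = ‘fire’ | Bk = ‘fire’).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative correlated activity: the output unit B fires on 40% of trials;
# when B fires, the input unit A fires with probability 0.8, otherwise 0.3.
trials = 20000
B = (rng.random(trials) < 0.4).astype(float)
A = np.where(B == 1, rng.random(trials) < 0.8, rng.random(trials) < 0.3).astype(float)

# Trial-by-trial Hebbian update: dW = e * B * (A - W).
W, e = 0.0, 0.01
for a, b in zip(A, B):
    W += e * b * (a - W)

print(W)                  # converges to roughly 0.8
print(A[B == 1].mean())   # empirical P(A = 'fire' | B = 'fire'), also roughly 0.8
```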
Q1: How about encoding the other half of the correlational information, that is, P(Bk = ‘fire’ | Ai = ‘fire’)?
Q2: Discuss implications of an anti-Hebbian learning rule such as
dWik = -e * Ai * (Bk - Wik)
Biological Plausibility of Hebbian Learning
The neurophysiological plausibility of Hebbian learning is well documented (e.g., Levy & Steward, 1980), including its sub-cellular mechanisms (NMDA-receptor-mediated long-term potentiation, LTP).
3. Examples of Neural Computation
• Noise-tolerant memory
• Pattern completion
• Content-addressable memory
• Language learning
• Sensory-motor control
• Visual perception
• Speech perception
• Word recognition
• …
NETtalk
(a program that learns to pronounce English words)
Submarine Sonar