
Course  : H0434/Jaringan Syaraf Tiruan (Artificial Neural Networks)
Year    : 2005
Version : 1

Pertemuan 12 (Session 12)
WIDROW-HOFF LEARNING
Learning Outcomes
By the end of this session, students are expected to be able to:
• Demonstrate Widrow-Hoff learning through an application example.
Outline
• The ADALINE network.
• The LMS algorithm.
ADALINE Network
a = purel in Wp + b  = Wp + b
w i, 1
T
w i, 2
=
w
i

T
ai = purelin ni  = purelin iw p + bi  = iw p + bi
w i, R
4
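As a minimal sketch of the forward computation above, a = purelin(Wp + b) = Wp + b can be evaluated directly; the weights, biases, and input below are made up for illustration:

```python
import numpy as np

# Hypothetical 2-neuron, 3-input ADALINE (values are illustrative only)
W = np.array([[0.5, -0.2, 0.1],
              [0.3,  0.4, -0.7]])   # row i is the weight vector iw^T
b = np.array([0.1, -0.3])           # one bias per neuron
p = np.array([1.0, -1.0, 0.5])      # input vector

n = W @ p + b    # net input n = Wp + b
a = n            # purelin is the identity transfer function, so a = n
print(a)
```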
Two-Input ADALINE
$$ a = \mathrm{purelin}(n) = \mathrm{purelin}({}_1 w^T p + b) = {}_1 w^T p + b $$

$$ a = {}_1 w^T p + b = w_{1,1}\, p_1 + w_{1,2}\, p_2 + b $$
Mean Square Error
Training Set:
$$ \{p_1, t_1\}, \{p_2, t_2\}, \ldots, \{p_Q, t_Q\} $$

Input: $p_q$    Target: $t_q$

Notation:
$$ x = \begin{bmatrix} {}_1 w \\ b \end{bmatrix}, \qquad
   z = \begin{bmatrix} p \\ 1 \end{bmatrix}, \qquad
   a = {}_1 w^T p + b = x^T z $$

Mean Square Error:
$$ F(x) = E[e^2] = E[(t - a)^2] = E[(t - x^T z)^2] $$
Error Analysis
$$ F(x) = E[e^2] = E[(t - a)^2] = E[(t - x^T z)^2] $$

$$ F(x) = E[t^2 - 2 t x^T z + x^T z z^T x] $$

$$ F(x) = E[t^2] - 2 x^T E[t z] + x^T E[z z^T]\, x $$

$$ F(x) = c - 2 x^T h + x^T R x $$

$$ c = E[t^2], \qquad h = E[t z], \qquad R = E[z z^T] $$

The mean square error for the ADALINE network is a quadratic function:

$$ F(x) = c + d^T x + \frac{1}{2} x^T A x, \qquad d = -2h, \qquad A = 2R $$
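As a numerical check of the quadratic form (a sketch with made-up samples, not data from the slides): estimate c, h, and R from samples of (z, t) and compare F(x) = c - 2 x^T h + x^T R x against the directly averaged squared error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up samples: each row of Z is z^T = [p^T 1]; t holds the targets
Z = np.column_stack([rng.normal(size=(50, 2)), np.ones(50)])
t = rng.normal(size=50)

c = np.mean(t**2)                  # c = E[t^2]
h = (Z * t[:, None]).mean(axis=0)  # h = E[t z]
R = (Z.T @ Z) / len(t)             # R = E[z z^T]

x = np.array([0.3, -0.1, 0.2])     # arbitrary x = [1w; b]

F_quadratic = c - 2 * x @ h + x @ R @ x   # c - 2 x^T h + x^T R x
F_direct = np.mean((t - Z @ x) ** 2)      # E[(t - x^T z)^2]
print(F_quadratic, F_direct)              # the two values agree
```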
Stationary Point
Hessian Matrix:
$$ \nabla^2 F(x) = A = 2R $$

The correlation matrix R must be at least positive semidefinite. If there are any zero eigenvalues, the performance index will either have a weak minimum or else no stationary point; otherwise there will be a unique global minimum x*.

$$ \nabla F(x) = \nabla\!\left( c + d^T x + \frac{1}{2} x^T A x \right) = d + Ax = -2h + 2Rx $$

$$ -2h + 2Rx = 0 $$

If R is positive definite:
$$ x^{*} = R^{-1} h $$
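Continuing the same kind of sample estimates, the stationary point x* = R^{-1} h can be checked against the condition -2h + 2Rx* = 0 (a sketch; it assumes the estimated R is positive definite):

```python
import numpy as np

rng = np.random.default_rng(0)
Z = np.column_stack([rng.normal(size=(50, 2)), np.ones(50)])  # made-up z samples
t = rng.normal(size=50)                                        # made-up targets

h = (Z * t[:, None]).mean(axis=0)   # h = E[t z]
R = (Z.T @ Z) / len(t)              # R = E[z z^T], positive definite here

x_star = np.linalg.solve(R, h)      # x* = R^{-1} h
print(np.allclose(-2 * h + 2 * R @ x_star, 0))   # gradient is zero at x*
```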
Approximate Steepest Descent
Approximate mean square error (one sample):
$$ \hat{F}(x) = (t(k) - a(k))^2 = e^2(k) $$

Approximate (stochastic) gradient:
$$ \hat{\nabla} F(x) = \nabla e^2(k) $$

$$ [\nabla e^2(k)]_j = \frac{\partial e^2(k)}{\partial w_{1,j}} = 2 e(k)\, \frac{\partial e(k)}{\partial w_{1,j}}, \qquad j = 1, 2, \ldots, R $$

$$ [\nabla e^2(k)]_{R+1} = \frac{\partial e^2(k)}{\partial b} = 2 e(k)\, \frac{\partial e(k)}{\partial b} $$
Approximate Gradient Calculation
$$ \frac{\partial e(k)}{\partial w_{1,j}} = \frac{\partial [t(k) - a(k)]}{\partial w_{1,j}}
   = \frac{\partial}{\partial w_{1,j}} \left[ t(k) - \left( {}_1 w^T p(k) + b \right) \right] $$

$$ \frac{\partial e(k)}{\partial w_{1,j}}
   = \frac{\partial}{\partial w_{1,j}} \left[ t(k) - \left( \sum_{i=1}^{R} w_{1,i}\, p_i(k) + b \right) \right] $$

$$ \frac{\partial e(k)}{\partial w_{1,j}} = -p_j(k), \qquad \frac{\partial e(k)}{\partial b} = -1 $$

$$ \hat{\nabla} F(x) = \nabla e^2(k) = -2 e(k)\, z(k) $$
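The single-sample gradient ∇e^2(k) = -2 e(k) z(k) can be verified against a finite-difference estimate; the numbers below are arbitrary and only illustrate the check:

```python
import numpy as np

p = np.array([1.0, -2.0])        # arbitrary input (illustrative)
t = 0.5                          # arbitrary target (illustrative)
z = np.append(p, 1.0)            # z = [p; 1]
x = np.array([0.2, -0.3, 0.1])   # x = [1w; b]

def sq_err(x):
    return (t - x @ z) ** 2      # e^2(k) = (t(k) - x^T z(k))^2

e = t - x @ z
analytic = -2 * e * z            # gradient from the slide: -2 e(k) z(k)

eps = 1e-6                       # central finite differences
numeric = np.array([(sq_err(x + eps * d) - sq_err(x - eps * d)) / (2 * eps)
                    for d in np.eye(3)])
print(np.allclose(analytic, numeric))   # True: the formulas match
```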
LMS Algorithm
$$ x(k+1) = x(k) - \alpha\, \nabla F(x) \Big|_{x = x(k)} $$

$$ x(k+1) = x(k) + 2 \alpha\, e(k)\, z(k) $$

$$ {}_1 w(k+1) = {}_1 w(k) + 2 \alpha\, e(k)\, p(k) $$

$$ b(k+1) = b(k) + 2 \alpha\, e(k) $$
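A minimal sketch of the single-neuron LMS loop defined above; the learning rate and the linearly generated data are assumptions for illustration, not part of the slides:

```python
import numpy as np

def lms_train(P, T, alpha, epochs=20):
    """Apply 1w(k+1) = 1w(k) + 2*alpha*e(k)*p(k) and b(k+1) = b(k) + 2*alpha*e(k)."""
    w = np.zeros(P.shape[1])
    b = 0.0
    for _ in range(epochs):
        for p, t in zip(P, T):
            e = t - (w @ p + b)         # e(k) = t(k) - a(k)
            w = w + 2 * alpha * e * p   # weight update
            b = b + 2 * alpha * e       # bias update
    return w, b

# Made-up data generated by a known linear rule
rng = np.random.default_rng(1)
P = rng.normal(size=(100, 2))
T = P @ np.array([1.5, -0.5]) + 0.2
print(lms_train(P, T, alpha=0.05))      # approaches ([1.5, -0.5], 0.2)
```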
Multiple-Neuron Case
iw k +
1 = iw k  + 2 ei k p k 
bi k + 1 = bi k  + 2e i k 
Matrix Form:
T
W  k + 1 = W  k + 2e k p k 
b k + 1 = bk  + 2e k
12
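One matrix-form update step for a multiple-neuron ADALINE, sketched with made-up sizes and values:

```python
import numpy as np

alpha = 0.1
W = np.zeros((2, 3))                # two neurons, three inputs (illustrative sizes)
b = np.zeros(2)
p = np.array([1.0, -1.0, 0.5])      # input p(k)
t = np.array([1.0, -1.0])           # target vector t(k)

a = W @ p + b                       # ADALINE output
e = t - a                           # error vector e(k)
W = W + 2 * alpha * np.outer(e, p)  # W(k+1) = W(k) + 2*alpha*e(k)*p(k)^T
b = b + 2 * alpha * e               # b(k+1) = b(k) + 2*alpha*e(k)
print(W, b)
```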
Analysis of Convergence
$$ x(k+1) = x(k) + 2\alpha\, e(k)\, z(k) $$

$$ E[x(k+1)] = E[x(k)] + 2\alpha\, E[e(k)\, z(k)] $$

$$ E[x(k+1)] = E[x(k)] + 2\alpha \left\{ E[t(k)\, z(k)] - E[(x^T(k)\, z(k))\, z(k)] \right\} $$

$$ E[x(k+1)] = E[x(k)] + 2\alpha \left\{ E[t(k)\, z(k)] - E[z(k)\, z^T(k)\, x(k)] \right\} $$

Assuming x(k) is independent of z(k):
$$ E[x(k+1)] = E[x(k)] + 2\alpha \left\{ h - R\, E[x(k)] \right\} $$

$$ E[x(k+1)] = [I - 2\alpha R]\, E[x(k)] + 2\alpha h $$

For stability, the eigenvalues of this matrix must fall inside the unit circle.
Conditions for Stability
$$ \mathrm{eig}[I - 2\alpha R] = 1 - 2\alpha\lambda_i, \qquad |1 - 2\alpha\lambda_i| < 1 $$

(where $\lambda_i$ is an eigenvalue of R)

Since $\lambda_i > 0$, we have $1 - 2\alpha\lambda_i < 1$. Therefore the stability condition simplifies to

$$ 1 - 2\alpha\lambda_i > -1 $$

$$ \alpha < 1 / \lambda_i \quad \text{for all } i $$

$$ 0 < \alpha < 1 / \lambda_{max} $$
Steady State Response
E xk + 1 =  I – 2R E xk  + 2h
If the system is stable, then a steady state condition will be reached.
E xss = I – 2R E xss + 2h
The solution to this equation is
–1
Ex ss = R h = x
This is also the strong minimum of the performance index.
15
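The mean recursion E[x(k+1)] = [I - 2αR] E[x(k)] + 2αh can be iterated to confirm that it settles at x* = R^{-1} h; R, h, and α below are made up, with α chosen below 1/λ_max:

```python
import numpy as np

R = np.array([[2.0, 0.5],           # made-up positive definite R
              [0.5, 1.0]])
h = np.array([1.0, -0.5])           # made-up cross-correlation vector
alpha = 0.2                         # below 1/lambda_max (lambda_max is about 2.2)

Ex = np.zeros(2)
for _ in range(200):
    Ex = (np.eye(2) - 2 * alpha * R) @ Ex + 2 * alpha * h   # mean LMS recursion

print(Ex, np.linalg.solve(R, h))    # both equal x* = R^{-1} h
```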
Example
Banana:
$$ p_1 = \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix}, \qquad t_1 = -1 $$

Apple:
$$ p_2 = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}, \qquad t_2 = 1 $$

$$ R = E[p\, p^T] = \frac{1}{2} p_1 p_1^T + \frac{1}{2} p_2 p_2^T $$

$$ R = \frac{1}{2} \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} \begin{bmatrix} -1 & 1 & -1 \end{bmatrix}
     + \frac{1}{2} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 1 & -1 \end{bmatrix}
     = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix} $$

$$ \lambda_1 = 1.0, \qquad \lambda_2 = 0.0, \qquad \lambda_3 = 2.0 $$

$$ \alpha < \frac{1}{\lambda_{max}} = \frac{1}{2.0} = 0.5 $$
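The correlation matrix, its eigenvalues, and the learning-rate bound above can be reproduced numerically (a sketch; the prototype vectors are the ones on this slide):

```python
import numpy as np

p1 = np.array([-1.0, 1.0, -1.0])    # banana prototype
p2 = np.array([ 1.0, 1.0, -1.0])    # apple prototype

# Each pattern occurs with probability 1/2
R = 0.5 * np.outer(p1, p1) + 0.5 * np.outer(p2, p2)
print(R)                            # [[1 0 0], [0 1 -1], [0 -1 1]]

lam = np.linalg.eigvalsh(R)
print(lam, 1.0 / lam.max())         # eigenvalues 0, 1, 2  ->  alpha < 1/2.0 = 0.5
```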
Iteration One
Banana:
$$ a(0) = W(0)\, p(0) = W(0)\, p_1
        = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} = 0 $$

$$ e(0) = t(0) - a(0) = t_1 - a(0) = -1 - 0 = -1 $$

$$ W(1) = W(0) + 2\alpha\, e(0)\, p^T(0) $$

$$ W(1) = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}
        + 2(0.2)(-1) \begin{bmatrix} -1 & 1 & -1 \end{bmatrix}
        = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix} $$
Iteration Two
Apple:
$$ a(1) = W(1)\, p(1) = W(1)\, p_2
        = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = -0.4 $$

$$ e(1) = t(1) - a(1) = t_2 - a(1) = 1 - (-0.4) = 1.4 $$

$$ W(2) = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix}
        + 2(0.2)(1.4) \begin{bmatrix} 1 & 1 & -1 \end{bmatrix}
        = \begin{bmatrix} 0.96 & 0.16 & -0.16 \end{bmatrix} $$
Iteration Three
$$ a(2) = W(2)\, p(2) = W(2)\, p_1
        = \begin{bmatrix} 0.96 & 0.16 & -0.16 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} = -0.64 $$

$$ e(2) = t(2) - a(2) = t_1 - a(2) = -1 - (-0.64) = -0.36 $$

$$ W(3) = W(2) + 2\alpha\, e(2)\, p^T(2) = \begin{bmatrix} 1.1040 & 0.0160 & -0.0160 \end{bmatrix} $$

$$ W(\infty) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} $$
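The three iterations above (and further ones) can be reproduced with a short loop; α = 0.2 and the banana/apple/banana presentation order follow the slides:

```python
import numpy as np

alpha = 0.2
p1, t1 = np.array([-1.0, 1.0, -1.0]), -1.0   # banana
p2, t2 = np.array([ 1.0, 1.0, -1.0]),  1.0   # apple

W = np.zeros(3)                               # W(0) = [0 0 0]
for k, (p, t) in enumerate([(p1, t1), (p2, t2), (p1, t1)]):
    a = W @ p                                 # a(k) = W(k) p(k)
    e = t - a                                 # e(k) = t(k) - a(k)
    W = W + 2 * alpha * e * p                 # W(k+1) = W(k) + 2*alpha*e(k)*p(k)^T
    print(k + 1, W)
# W(1) = [0.4 -0.4 0.4], W(2) = [0.96 0.16 -0.16], W(3) = [1.104 0.016 -0.016]
```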