Introduction to Artificial Neural Networks
Lecturer: 虞台文

Content
• Fundamental Concepts of ANNs
• Basic Models and Learning Rules
  – Neuron Models
  – ANN Structures
  – Learning
• Distributed Representations
• Conclusions

Fundamental Concepts of ANNs

What Is an ANN? Why ANNs?
• ANN: Artificial Neural Network
  – A model that simulates the behavior of the human brain.
  – A new generation of information processing system.

Applications
• Pattern Matching
• Pattern Recognition
• Associative Memory (Content-Addressable Memory)
• Function Approximation
• Learning
• Optimization
• Vector Quantization
• Data Clustering
• ...
Traditional computers are inefficient at these tasks, even though their raw computation speed is much higher.

The Configuration of ANNs
• An ANN consists of a large number of interconnected processing elements (PEs) called neurons.
  – A human brain consists of ~10^11 neurons of many different types.
• How does an ANN work? Through collective behavior.

The Biological Neuron
• Dendrites receive signals; the axon transmits them; the synapse is the junction where the nerve fibers of two neurons connect.
• Each synaptic input is either excitatory or inhibitory.

The Artificial Neuron
• Proposed by McCulloch and Pitts [1943]: the M-P neuron.
• Inputs $x_1, \ldots, x_m$ with weights $w_{i1}, \ldots, w_{im}$, an integration function $f(\cdot)$, an activation function $a(\cdot)$, and output $y_i$:
  $$y_i(t+1) = a(f_i), \qquad f_i = \sum_{j=1}^{m} w_{ij}\, x_j(t) - \theta_i, \qquad a(f) = \begin{cases} 1 & f \ge 0 \\ 0 & \text{otherwise.} \end{cases}$$
• $w_{ij} > 0$: excitatory connection; $w_{ij} < 0$: inhibitory connection; $w_{ij} = 0$: no connection.

What Can Be Done by M-P Neurons?
• A hard limiter; a binary threshold unit; hyperspace separation.
• With two inputs, $y = 1$ if $w_1 x_1 + w_2 x_2 - \theta \ge 0$ and $y = 0$ otherwise, so the line $w_1 x_1 + w_2 x_2 - \theta = 0$ separates the $(x_1, x_2)$ plane into two half-planes.

What Will ANNs Be?
• A neurally inspired mathematical model.
• Consists of a large number of highly interconnected PEs.
• Its connections (weights) hold the knowledge.
• The response of each PE depends only on local information.
• Its collective behavior demonstrates its computational power.
• It has learning, recall, and generalization capabilities.

Three Basic Entities of ANN Models
• Models of neurons or PEs.
• Models of synaptic interconnections and structures.
• Training or learning rules.

Basic Models and Learning Rules: Neuron Models

Processing Elements
• Extensions of the M-P neuron:
  – What integration functions $f(\cdot)$ may we have?
  – What activation functions $a(\cdot)$ may we have?

Integration Functions
• M-P neuron: $f_i = net_i = \sum_{j=1}^{m} w_{ij} x_j - \theta_i$
• Quadratic function: $f_i = \sum_{j=1}^{m} w_{ij} x_j^2 - \theta_i$
• Spherical function: $f_i = \sum_{j=1}^{m} (x_j - w_{ij})^2 - \theta_i$
• Polynomial function: $f_i = \sum_{j=1}^{m} \sum_{k=1}^{m} w_{ijk}\, x_j x_k - \theta_i$

Activation Functions
• Step function (M-P neuron): $a(f) = 1$ if $f \ge 0$, and $0$ otherwise.
• Hard limiter (threshold function): $a(f) = \operatorname{sgn}(f)$, i.e. $+1$ if $f \ge 0$ and $-1$ if $f < 0$.
• Ramp function: $a(f) = 1$ if $f \ge 1$; $a(f) = f$ if $0 \le f < 1$; $a(f) = 0$ if $f < 0$.
• Unipolar sigmoid function: $a(f) = \dfrac{1}{1 + e^{-\lambda f}}$, rising from 0 to 1.
• Bipolar sigmoid function: $a(f) = \dfrac{2}{1 + e^{-\lambda f}} - 1$, rising from −1 to 1.
(Plots: each activation function graphed against f; the sigmoids are shown over f ∈ [−4, 4].)
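The M-P neuron and the activation functions above translate directly into code. The following is a minimal Python sketch; the function names and the AND example with weights (1, 1) and threshold θ = 1.5 are our own illustration, not from the slides.

```python
import numpy as np

# Activation functions from the slides.
def step(f):                      # M-P neuron: 1 if f >= 0, else 0
    return np.where(f >= 0, 1.0, 0.0)

def hard_limiter(f):              # sgn(f): +1 if f >= 0, else -1
    return np.where(f >= 0, 1.0, -1.0)

def ramp(f):                      # clip f into [0, 1]
    return np.clip(f, 0.0, 1.0)

def unipolar_sigmoid(f, lam=1.0):
    return 1.0 / (1.0 + np.exp(-lam * f))

def bipolar_sigmoid(f, lam=1.0):
    return 2.0 / (1.0 + np.exp(-lam * f)) - 1.0

def mp_neuron(x, w, theta, a=step):
    """One neuron: y = a(sum_j w_j * x_j - theta)."""
    return a(np.dot(w, x) - theta)

# Hyperspace separation: with w = (1, 1) and theta = 1.5, the line
# x1 + x2 - 1.5 = 0 splits the plane and the neuron realizes logical AND.
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", mp_neuron(np.array(x, float), np.array([1.0, 1.0]), 1.5))
```

Swapping `a=step` for any of the other activations changes only how sharply the neuron switches across the same separating line.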
Example: Activation Surfaces
• Three M-P neurons L1, L2, L3 each define a line in the (x, y) input plane:
  – L1: $x - 1 = 0$, weights (1, 0), $\theta_1 = 1$
  – L2: $y - 1 = 0$, weights (0, 1), $\theta_2 = 1$
  – L3: $-x - y + 4 = 0$, weights (−1, −1), $\theta_3 = -4$
• Each neuron outputs 1 on one side of its line and 0 on the other, so the three outputs form a region code: the codes 000 through 111 label the regions of the plane, and 111 marks the triangular region bounded by the three lines.
• A second-layer neuron L4 with weights (1, 1, 1) and threshold $\theta_4 = 2.5$ fires only when all three first-layer neurons fire. Its output z is therefore 1 inside the triangle and 0 outside: the two-layer network carves a triangular plateau out of the (x, y) plane.
• With the step activation, the surface z(x, y) has sharp vertical walls. Replacing the step with the unipolar sigmoid $a(f) = 1/(1 + e^{-\lambda f})$ smooths the surface, and as λ grows (λ = 2, 3, 5, 10 in the figures) the smooth surface approaches the sharp one.
(Figures: the three lines and their region codes in the (x, y) plane; 3-D plots of z for the step activation and for sigmoids with increasing λ.)

Basic Models and Learning Rules: ANN Structures

ANN Structures (Connections)
• Single-layer feedforward networks
  – Inputs $x_1, \ldots, x_m$ connect directly to the output neurons $y_1, \ldots, y_n$ through the weights $w_{ij}$.
• Multilayer feedforward networks
  – An input layer, one or more hidden layers, and an output layer; signals flow from input to output only.
  – Where does the knowledge come from? Learning.
  – Typical uses: pattern recognition, classification, and analysis of input-output mappings.
• Recurrent networks (networks with feedback)
  – The simplest case: a single node with feedback to itself (a feedback loop).
  – Single-layer recurrent networks: the outputs $y_1, \ldots, y_n$ are fed back as inputs.
  – Multilayer recurrent networks: feedback connections across layers.
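The activation-surfaces example is itself a tiny multilayer feedforward network, and a minimal Python sketch of it follows. This is our reconstruction under the assumptions above: the three lines and the output threshold θ4 = 2.5 are taken from the example as reconstructed, and λ = 10 is one of the slope values shown in the figures.

```python
import numpy as np

def unipolar_sigmoid(f, lam=10.0):
    """Smooth stand-in for the step activation; large lam approaches it."""
    return 1.0 / (1.0 + np.exp(-lam * f))

# Hidden layer: one neuron per line (rows of W_h, thresholds theta_h).
W_h = np.array([[ 1.0,  0.0],    # L1:  x - 1 >= 0
                [ 0.0,  1.0],    # L2:  y - 1 >= 0
                [-1.0, -1.0]])   # L3: -x - y + 4 >= 0
theta_h = np.array([1.0, 1.0, -4.0])

# Output neuron L4: fires only when all three hidden neurons fire.
w_o, theta_o = np.array([1.0, 1.0, 1.0]), 2.5

def z(x, y, lam=10.0):
    h = unipolar_sigmoid(W_h @ np.array([x, y]) - theta_h, lam)
    return unipolar_sigmoid(w_o @ h - theta_o, lam)

# Inside the triangle (e.g. (2, 1.5)) z is near 1; outside it is near 0.
for p in [(2.0, 1.5), (0.0, 0.0), (5.0, 5.0)]:
    print(p, "->", round(float(z(*p)), 3))
```

Raising `lam` sharpens the plateau toward the step-activation surface, which is exactly the progression the λ = 2, 3, 5, 10 figures illustrate.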
Basic Models and Learning Rules: Learning

Learning
• Consider an ANN with n neurons, each with m adaptive weights. Its weight matrix is
  $$W = \begin{pmatrix} \mathbf{w}_1^T \\ \mathbf{w}_2^T \\ \vdots \\ \mathbf{w}_n^T \end{pmatrix} = \begin{pmatrix} w_{11} & w_{12} & \cdots & w_{1m} \\ w_{21} & w_{22} & \cdots & w_{2m} \\ \vdots & \vdots & & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{nm} \end{pmatrix}.$$
• To learn is to determine the weight matrix W. How?

Learning Rules
• Supervised learning
• Reinforcement learning
• Unsupervised learning

Supervised Learning
• Learning with a teacher; learning by examples.
• Training set:
  $$T = \{ (\mathbf{x}^{(1)}, \mathbf{d}^{(1)}), (\mathbf{x}^{(2)}, \mathbf{d}^{(2)}), \ldots, (\mathbf{x}^{(k)}, \mathbf{d}^{(k)}), \ldots \}$$
• The input x is fed to the ANN (weights W), which produces the output y; an error-signal generator compares y with the desired output d, and the error signal drives the weight adjustment.

Reinforcement Learning
• Learning with a critic; learning by comments.
• No desired output is given; a critic-signal generator produces only a reinforcement signal evaluating the output y, and this signal drives the weight adjustment.

Unsupervised Learning
• Self-organizing; clustering.
  – Form proper clusters by discovering the similarities and dissimilarities among objects.
• The ANN receives only the inputs x and organizes its weights by itself; there is no teacher or critic.

The General Weight Learning Rule
• Consider neuron i with inputs $x_1, \ldots, x_{m-1}$, weights $w_{i1}, \ldots, w_{i,m-1}$, and bias $\theta_i$:
  $$net_i = \sum_{j=1}^{m-1} w_{ij} x_j - \theta_i, \qquad y_i = a(net_i).$$
• We want to learn the weights and the bias together. Let $x_m = -1$ and $w_{im} = \theta_i$; then
  $$net_i = \sum_{j=1}^{m} w_{ij} x_j,$$
  and the goal becomes learning $\mathbf{w}_i = (w_{i1}, w_{i2}, \ldots, w_{im})^T$. How should $\Delta \mathbf{w}_i(t)$ be chosen?
• A learning-signal generator computes the learning signal from the input, the current weights, and (when available) the teacher signal:
  $$r = f_r(\mathbf{w}_i, \mathbf{x}, d_i).$$
• The weight change is proportional to the learning signal and to the input,
  $$\Delta \mathbf{w}_i(t) = \eta\, r\, \mathbf{x}(t),$$
  where η is the learning rate.
• Discrete-time weight modification rule:
  $$\mathbf{w}_i(t+1) = \mathbf{w}_i(t) + \eta\, f_r(\mathbf{w}_i(t), \mathbf{x}(t), d_i(t))\, \mathbf{x}(t).$$
• Continuous-time weight modification rule:
  $$\frac{d\mathbf{w}_i(t)}{dt} = \eta\, r\, \mathbf{x}(t).$$

Hebb's Learning Law
• Hebb [1949] hypothesized that when an axonal input from neuron A causes neuron B to immediately emit a pulse (fire), and this happens repeatedly or persistently, then the efficacy of that axonal input, in terms of its ability to help neuron B fire in the future, is somehow increased.
• Hebb's learning rule is an unsupervised learning rule. The learning signal is simply the neuron's output,
  $$r = f_r(\mathbf{w}_i, \mathbf{x}, d_i) = a(\mathbf{w}_i^T \mathbf{x}) = y_i,$$
  so that
  $$\Delta \mathbf{w}_i(t) = \eta\, y_i\, \mathbf{x}, \qquad \Delta w_{ij} = \eta\, y_i\, x_j.$$
  A weight grows when its input and the neuron's output are active together.

Distributed Representations

Distributed Representations
• Distributed representation:
  – An entity is represented by a pattern of activity distributed over many PEs.
  – Each PE is involved in representing many different entities.
• Local representation:
  – Each entity is represented by one PE.

Example
           P0 P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15
  Dog       +  −  +  +  −  −  −  −  +  +  +   +   +   −   −   −
  Cat       +  −  +  +  −  −  −  −  +  −  +   −   +   +   −   +
  Bread     −  −  −  −  −  −  +  +  +  +  +   +   +   +   +   +

Advantages
• Acts as a content-addressable memory.
  – Presented with a probe that activates only a few of the PEs ("What is this?"), the network completes the rest of the stored pattern.
• Makes induction easy.
  – A new entity Fido:
      Fido    +  −  −  +  −  −  −  −  +  +  +   +   +   +   −   −
  – Fido's pattern largely overlaps Dog's, so knowledge about dogs transfers: Dog has 4 legs, and by induction so does Fido.
• Makes the creation of new entities or concepts easy, without allocating new hardware.
  – Add Doughnut simply by changing weights:
      Doughnut  +  +  −  −  −  +  +  −  +  −  −   −   +   +   +   −
• Fault tolerance.
  – If some PEs break down, it does not cause a problem: the representation degrades gracefully.

Disadvantages
• How to understand it? How to modify it?
• Learning procedures are required.
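To tie Hebb's rule to the distributed-representation example, here is a small Python sketch of our own (not from the slides): the Dog, Cat, and Bread patterns are stored as bipolar (±1) vectors with Hebbian outer-product updates (in effect a tiny Hopfield-style associative memory, with η = 1 and no self-connections), and a corrupted Dog pattern is then restored by repeated thresholded recall, illustrating both content-addressable behavior and fault tolerance.

```python
import numpy as np

# Bipolar (+1/-1) codes for the three entities, read off the table above.
patterns = {
    "Dog":   [+1,-1,+1,+1,-1,-1,-1,-1,+1,+1,+1,+1,+1,-1,-1,-1],
    "Cat":   [+1,-1,+1,+1,-1,-1,-1,-1,+1,-1,+1,-1,+1,+1,-1,+1],
    "Bread": [-1,-1,-1,-1,-1,-1,+1,+1,+1,+1,+1,+1,+1,+1,+1,+1],
}

# Hebbian storage: delta w_jk = eta * y_j * x_k with eta = 1, which for a
# stored pattern reduces to the outer product x x^T (self-connections zeroed).
n = 16
W = np.zeros((n, n))
for x in patterns.values():
    x = np.asarray(x, float)
    W += np.outer(x, x)
np.fill_diagonal(W, 0.0)

def recall(x, steps=5):
    """Repeated thresholded recall: x <- sgn(W x)."""
    x = np.asarray(x, float)
    for _ in range(steps):
        x = np.where(W @ x >= 0, 1.0, -1.0)
    return x

# Break two PEs of the Dog pattern; the memory still completes it.
noisy = np.array(patterns["Dog"], float)
noisy[[5, 14]] *= -1
print(np.array_equal(recall(noisy), patterns["Dog"]))   # True
```

With only three fairly correlated patterns over sixteen PEs, recall survives small damage; flip too many PEs, or exactly the PEs that distinguish Dog from Cat, and the probe can be completed to the wrong entity. That is one way to see why learning procedures, rather than inspection, are needed to understand and modify such a memory.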