Neural networks
Conditional random fields - factor graph

Hugo Larochelle
Département d'informatique
Université de Sherbrooke
hugo.larochelle@
September 26, 2012

Abstract
Math for my slides "Training CRFs".

• Unary log-factor (context window of one position on each side):
  $a_u(y_k) = a^{(L+1,0)}(x_k)_{y_k} + 1_{k>1}\, a^{(L+1,-1)}(x_{k-1})_{y_k} + 1_{k<K}\, a^{(L+1,+1)}(x_{k+1})_{y_k}$
• Pairwise log-factor:
  $a_p(y_k, y_{k+1}) = 1_{1 \le k < K}\, V_{y_k, y_{k+1}}$
• $\exp(a^{(L+1,0)}(x_3)_{y_3})\, \exp(a^{(L+1,-1)}(x_2)_{y_3})\, \exp(a^{(L+1,+1)}(x_4)_{y_3})\, \exp(V_{y_3,y_4})$
• $\Psi_f(y, X) = \prod_{k=1}^{K} \Psi_f(y_k) = \exp\left(\sum_{k=1}^{K} b^{(L+1)}_{c}\, 1_{y_k=c}\right)$
• $\Psi_f(y, X) = \prod_{k=1}^{K} \Psi_f(y_k, x_{k,i}) = \exp\left(\sum_{k=1}^{K} W^{(L+1,0)}_{c,i}\, x_{k,i}\, 1_{y_k=c}\right)$
• $\Psi_f(y, X) = \prod_{k=2}^{K} \Psi_f(y_k, x_{k-1,i}) = \exp\left(\sum_{k=2}^{K} W^{(L+1,-1)}_{c,i}\, x_{k-1,i}\, 1_{y_k=c}\right)$
• $\Psi_f(y, X) = \prod_{k=1}^{K-1} \Psi_f(y_k, x_{k+1,i}) = \exp\left(\sum_{k=1}^{K-1} W^{(L+1,+1)}_{c,i}\, x_{k+1,i}\, 1_{y_k=c}\right)$
• $\Psi_f(y, X) = \prod_{k=1}^{K-1} \Psi_f(y_k, y_{k+1}) = \exp\left(\sum_{k=1}^{K-1} V_{c,c'}\, 1_{y_k=c}\, 1_{y_{k+1}=c'}\right)$
• $\Psi_f(y_k, y_{k+2})$, $\Psi_f(y_k, y_{k+1}, y_{k+2})$
• $\nabla_{W^{(1,+1)}} -\log p(y|X) = \sum_{k=1}^{K-1} \left(\nabla_{a^{(1,+1)}(x_{k+1})} -\log p(y|X)\right) x_{k+1}^\top = \sum_{k=1}^{K-1} -\left(e(y_k) - p(y_k|X)\right) x_{k+1}^\top$
• $\dfrac{\partial\, {-\log p(y|X)}}{\partial a_p(y'_k, y'_{k+1})} = -\left(1_{y_k=y'_k,\, y_{k+1}=y'_{k+1}} - p(y_k=y'_k, y_{k+1}=y'_{k+1}|X)\right)$
• $\dfrac{\partial\, {-\log p(y|X)}}{\partial V_{y'_k, y'_{k+1}}} = \sum_{k=1}^{K-1} -\left(1_{y_k=y'_k,\, y_{k+1}=y'_{k+1}} - p(y_k=y'_k, y_{k+1}=y'_{k+1}|X)\right)$
• $\nabla_{V} -\log p(y|X) = \sum_{k=1}^{K-1} -\left(e(y_k)\, e(y_{k+1})^\top - p(y_k, y_{k+1}|X)\right) = -\left(\mathrm{freq}(y_k, y_{k+1}) - \sum_{k=1}^{K-1} p(y_k, y_{k+1}|X)\right)$
• $-\log p(y, X) = -\log\left(p(y|X)\, p(X)\right) = -\log p(y|X) - \log p(X)$
• $p(y|X) = \prod_{k=1}^{K} p(y_k|y_{k-1}, X)$, with $p(y_k|y_{k-1}, X) = \dfrac{\exp\left(a_u(y_k) + a_p(y_{k-1}, y_k)\right)}{Z(y_{k-1}, X)}$

MARKOV NETWORK VISUALIZATION
Topics: Markov network
• Illustration for K=5
  [Figure: Markov network over the inputs x1, ..., x5 and the labels y1, ..., y5. Legend entry: "= observed".]
• Each edge is associated with one or more factors
• Representation is ambiguous:
  ‣ clique between $y_1$, $y_2$ and $x_2$, but we didn't explicitly define a joint factor $\Psi_f(y_1, y_2, x_2)$
  ‣ in fact, we defined $\Psi_f(y_1, y_2)$, $\Psi_f(y_2, x_2)$ and $\Psi_f(y_1, x_2)$, which is equivalent to having:
    $\Psi_f(y_1, y_2, x_2) = \Psi_f(y_1, y_2)\, \Psi_f(y_2, x_2)\, \Psi_f(y_1, x_2)$

FACTOR GRAPH VISUALIZATION
Topics: factor graph
• Factor graphs better represent factors
• This is less ambiguous:
  [Figure: a Markov network over x1, ..., x5 and y1, ..., y5 shown next to two candidate factor graphs, "Factor graph 1" and "Factor graph 2", each marked "?". Legend entries: "= observed", "= factor".]
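The factor-graph view can be checked numerically on a toy chain. The sketch below is only an illustration under stated assumptions: the tables `a0`, `am1`, `ap1` are hypothetical stand-ins for the pre-activations a^(L+1,0)(x_j), a^(L+1,-1)(x_j) and a^(L+1,+1)(x_j), filled with random numbers instead of network outputs, and every function name is made up for this example. It multiplies one exp(.) factor per factor node, compares the product with the exponential of the summed log-factors a_u and a_p, and normalizes by brute-force enumeration, which is only feasible for tiny K.

```python
import itertools
import math
import random


def unary_log_factor(k, c, a0, am1, ap1):
    """a_u(y_k = c), with 0-based position k.

    a0[j][c], am1[j][c], ap1[j][c] are hypothetical stand-ins for
    a^(L+1,0)(x_j)_c, a^(L+1,-1)(x_j)_c and a^(L+1,+1)(x_j)_c.
    """
    K = len(a0)
    total = a0[k][c]
    if k > 0:        # indicator 1_{k>1} in the 1-based notation
        total += am1[k - 1][c]
    if k < K - 1:    # indicator 1_{k<K}
        total += ap1[k + 1][c]
    return total


def log_score(y, a0, am1, ap1, V):
    """Sum of all unary and pairwise log-factors for the labelling y."""
    K = len(y)
    unary = sum(unary_log_factor(k, y[k], a0, am1, ap1) for k in range(K))
    pairwise = sum(V[y[k]][y[k + 1]] for k in range(K - 1))
    return unary + pairwise


def factor_product(y, a0, am1, ap1, V):
    """Product of one exp(.) factor per factor node of the factor graph."""
    K = len(y)
    prod = 1.0
    for k in range(K):
        prod *= math.exp(a0[k][y[k]])
        if k > 0:
            prod *= math.exp(am1[k - 1][y[k]])
        if k < K - 1:
            prod *= math.exp(ap1[k + 1][y[k]])
            prod *= math.exp(V[y[k]][y[k + 1]])
    return prod


def conditional(y, a0, am1, ap1, V, C):
    """p(y|X) by enumerating all C**K labellings (toy sizes only)."""
    K = len(y)
    Z = sum(factor_product(yp, a0, am1, ap1, V)
            for yp in itertools.product(range(C), repeat=K))
    return factor_product(y, a0, am1, ap1, V) / Z


def random_table(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
K, C = 5, 3                                   # K=5 as in the illustration
a0, am1, ap1 = (random_table(K, C, rng) for _ in range(3))
V = random_table(C, C, rng)
y = (0, 2, 1, 1, 0)
p = conditional(y, a0, am1, ap1, V, C)
```

Keeping the factors as separate terms in `factor_product` mirrors the factor graph, where each factor node touches only the variables it actually depends on; `log_score` is the same quantity in log space.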