Tensor data analysis Part 2

Mariya Ishteva
Machine Learning II: Advanced Topics
CSE 8803ML, Spring 2012

Outline

Last time:
!   Motivation
!   Basic concepts
!   Basic tensor decompositions

Today:
!   Other useful decompositions
!   Local minima
!   Tensors and graphical models

Tensor ranks

Matrix representations of a tensor
!   Multilinear rank: (rank(A_(1)), rank(A_(2)), rank(A_(3)))
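To make the definition concrete: a minimal numpy sketch of the matrix representations (unfoldings) A_(n) and the multilinear rank. The column ordering inside the unfolding is an assumed convention (papers differ), but it does not affect the ranks.

```python
import numpy as np

def unfold(A, mode):
    """Mode-n matrix representation A_(n): mode-n fibers become columns."""
    return np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)

def multilinear_rank(A):
    """(rank(A_(1)), rank(A_(2)), rank(A_(3))) for a 3rd-order tensor A."""
    return tuple(np.linalg.matrix_rank(unfold(A, n)) for n in range(A.ndim))

A = np.random.rand(4, 5, 6)
print(multilinear_rank(A))   # a generic random tensor gives (4, 5, 6)
```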
Tensor-matrix multiplication

!   Tensor-matrix product (a sketch follows below)
!   Contraction
!   4th-order tensors
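A minimal numpy sketch of the tensor-matrix (mode-n) product A ×_n U, which multiplies every mode-n fiber of A by U. It reuses the same unfolding convention as above; the example shapes are illustrative.

```python
import numpy as np

def mode_n_product(A, U, mode):
    """Compute A x_n U by unfolding, multiplying, and folding back."""
    # Unfold: put `mode` first, flatten the remaining modes.
    M = np.moveaxis(A, mode, 0).reshape(A.shape[mode], -1)
    # Fold back, with the mode-n dimension replaced by U.shape[0].
    rest = [s for i, s in enumerate(A.shape) if i != mode]
    return np.moveaxis((U @ M).reshape([U.shape[0]] + rest), 0, mode)

A = np.random.rand(4, 5, 6)
U = np.random.rand(3, 5)               # acts on mode 1 (size 5)
print(mode_n_product(A, U, 1).shape)   # (4, 3, 6)
```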
Basic decompositions

Outline

!   Other useful decompositions
    !   Constrained decompositions
    !   Block term decomposition
    !   Tensor Train decomposition
    !   Hierarchical Tucker decomposition
!   Local minima
!   Tensors and graphical models

Constrained decompositions
!   S: as diagonal as possible
!   CP with orthogonality constraints
!   Other constraints:
    !   Nonnegativity
    !   Sparsity
    !   Symmetry
    !   Missing values
    !   Dynamic tensor decompositions
    !   Etc.
!   Computation can often be performed using matrix algorithms for the matrix representations of the tensors (see the sketch below)
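As an illustration of that last point, a hedged sketch of one such constrained decomposition: a nonnegative CP fit computed entirely with matrix operations on the three unfoldings, using Lee-Seung-style multiplicative updates. The update rule, iteration count, and the eps safeguard are illustrative choices, not the lecture's algorithm.

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker product: (J x R), (K x R) -> (J*K x R)."""
    return (B[:, None, :] * C[None, :, :]).reshape(-1, B.shape[1])

def nonneg_cp(T, R, n_iter=500, eps=1e-9):
    """Rank-R nonnegative CP of a 3rd-order tensor via multiplicative
    updates applied to the matrix representations T_(1), T_(2), T_(3)."""
    I, J, K = T.shape
    A, B, C = (np.random.rand(n, R) for n in (I, J, K))
    T1 = T.reshape(I, -1)                       # mode-1 unfolding
    T2 = np.moveaxis(T, 1, 0).reshape(J, -1)    # mode-2 unfolding
    T3 = np.moveaxis(T, 2, 0).reshape(K, -1)    # mode-3 unfolding
    for _ in range(n_iter):
        M = khatri_rao(B, C); A *= (T1 @ M) / (A @ (M.T @ M) + eps)
        M = khatri_rao(A, C); B *= (T2 @ M) / (B @ (M.T @ M) + eps)
        M = khatri_rao(A, B); C *= (T3 @ M) / (C @ (M.T @ M) + eps)
    return A, B, C

A, B, C = nonneg_cp(np.random.rand(4, 5, 6), R=3)   # all factors stay >= 0
```

Because the factors start nonnegative and each update multiplies by a ratio of nonnegative quantities, nonnegativity is preserved automatically, which is why this constraint fits the matrix-algorithm framework so naturally.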
Block term decomposition

•  Uniqueness properties

!   L. De Lathauwer, Decompositions of a Higher-Order Tensor in Block Terms, SIAM Journal on Matrix Analysis and Applications, Vol. 30, No. 3, 2008

Tensor train (TT) decomposition

•  Avoids the curse of dimensionality
•  Small number of parameters, compared to the Tucker model
•  Slightly more parameters than CP, but more stable
•  The core G_k has dimensions r_{k-1} × n_k × r_k
•  The r_k are called compression ranks: r_0 = r_d = 1
•  Computation based on the SVD
•  Computation: top → bottom (a sketch follows below)
!   I. V. Oseledets, Tensor-Train Decomposition, SIAM Journal on Scientific Computing, Vol. 33, 2011
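A minimal sketch of the top-to-bottom computation (TT-SVD, in simplified form): sweep from the first mode to the last, taking a truncated SVD of a reshaped matrix at each step. The single max_rank cap stands in for the paper's accuracy-based truncation.

```python
import numpy as np

def tt_svd(T, max_rank):
    """TT cores G_k of shape r_{k-1} x n_k x r_k (with r_0 = r_d = 1),
    computed by sequential truncated SVDs, first mode to last."""
    dims, cores, r_prev = T.shape, [], 1
    M = T.reshape(dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        r = min(max_rank, len(s))               # truncate to the target rank
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        M = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))
    return cores

# Contract the cores back together to check the reconstruction.
T = np.random.rand(3, 4, 5, 6)
X = np.ones((1, 1))
for G in tt_svd(T, max_rank=30):
    X = np.tensordot(X, G, axes=([-1], [0]))
print(np.allclose(X.reshape(T.shape), T))   # True: no truncation occurred
```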
Hierarchical Tucker decomposition

•  Similar properties to the TT decomposition
•  Computation: bottom → top

!   L. Grasedyck, Hierarchical Singular Value Decomposition of Tensors, SIAM Journal on Matrix Analysis and Applications, Vol. 31, No. 4, 2010

Outline
!   Other useful decompositions
!   Local minima
!   Tensors and graphical models

Low multilinear rank approximation

Example 1

Example 2

Local minima: summary of results
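The figures for these slides are lost in extraction. For concreteness, a minimal sketch of computing a low multilinear rank approximation by truncated HOSVD; this is only the standard quasi-optimal starting point, while the local minima summarized above arise in the iterative methods that refine it toward the best approximation. The target ranks (3, 3, 3) are illustrative.

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def truncated_hosvd(T, ranks):
    """Project each mode of T onto the leading left singular vectors
    of the corresponding unfolding; returns the core and the factors."""
    Us = [np.linalg.svd(unfold(T, n), full_matrices=False)[0][:, :r]
          for n, r in enumerate(ranks)]
    S = T
    for n, U in enumerate(Us):
        S = np.moveaxis(np.tensordot(U.T, S, axes=([1], [n])), 0, n)
    return S, Us

T = np.random.rand(10, 10, 10)
S, Us = truncated_hosvd(T, (3, 3, 3))
print(S.shape)   # (3, 3, 3): the core of the rank-(3, 3, 3) approximation
```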
Outline

!   Other useful decompositions
!   Local minima
!   Tensors and graphical models

Tensors and graphical models
!   CP/CANDECOMP/PARAFAC
!   Tensor Train
!   Hierarchical Tucker
!   Tucker/MLSVD and the Block term decomposition are not commonly used as graphical models

Quartet relationships: topologies

Discovering tree structures
!   Assume: the data correspond to a latent tree model
!   For simplicity: assume each latent variable has 3 neighbors
!   Building trees based on quartet relationships (a sketch follows below):
    !   Choose 3 variables at random; add one more; resolve the relationship
    !   For t = 4 to the number of variables:
        !   Pick a root for the current tree (it should split the tree into 3 branches of approximately equal size)
        !   Pick a leaf in each branch (X1, X2, X3)
        !   Resolve the quartet relationship (X1, X2, X3, X_{t+1})
        !   Attach X_{t+1} to the corresponding subtree (i.e., repeat the last steps recursively until only 4 variables are left)
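A runnable but simplified sketch of this loop. The quartet resolver is injected: resolve(x1, x2, x3, x4) must return whichever of x1, x2, x3 groups with x4 (the lecture implements it with the nuclear-norm test on the next slides). All helper names are part of the sketch, and for simplicity the descent starts at an arbitrary hidden node rather than a balanced root.

```python
from collections import deque
from itertools import count

def build_tree(variables, resolve):
    fresh = (("hidden", i) for i in count())     # collision-free hidden names
    hidden, adj = set(), {}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def attach(u, v, x_new):
        """Break the edge (u, v); join u, v, x_new via a new hidden node."""
        adj[u].discard(v); adj[v].discard(u)
        h = next(fresh); hidden.add(h)
        link(h, u); link(h, v); link(h, x_new)

    def a_leaf(start, banned):
        """Some observed leaf reachable from start without crossing banned."""
        seen, todo = {banned, start}, deque([start])
        while todo:
            v = todo.popleft()
            if v not in hidden:
                return v
            for nb in adj[v] - seen:
                seen.add(nb); todo.append(nb)

    # Base tree: the quartet topology of the first four variables.
    a, b, c, d = variables[:4]
    mate = resolve(a, b, c, d)                   # who does d pair with?
    rest = [v for v in (a, b, c) if v != mate]
    h1, h2 = next(fresh), next(fresh); hidden.update({h1, h2})
    link(h1, rest[0]); link(h1, rest[1]); link(h2, mate); link(h2, d); link(h1, h2)

    for x_new in variables[4:]:
        node, prev, back_leaf = next(iter(hidden)), None, None
        while True:                              # one quartet decision per step
            fwd = [nb for nb in adj[node] if nb != prev]
            reps = [a_leaf(nb, node) for nb in fwd]      # a leaf per branch
            trio = (reps + [back_leaf])[:3]
            w = resolve(trio[0], trio[1], trio[2], x_new)
            if w == back_leaf:                   # x_new splits the back edge
                attach(prev, node, x_new); break
            nxt = fwd[reps.index(w)]
            if nxt not in hidden:                # reached an observed leaf
                attach(node, nxt, x_new); break
            prev, back_leaf, node = node, next(l for l in trio if l != w), nxt
    return adj
```

With an exact resolver this recovers the latent tree one variable at a time; each descent step discards the branches the new variable does not belong to, exactly as in the recursive step above.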
Tensor view of quartets

Matrix representations of the quartet tensor

Rank properties of the matrix representations

!   Due to sampling noise, A, B, C become full rank
!   Nuclear norm relaxation (a sketch follows below):
    !   Approximate the rank by the nuclear norm
    !   Nuclear norm: sum of the singular values
    !   Tightest convex lower bound of the rank
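A minimal numpy sketch of this test: compare the nuclear norms of the three matrix representations of the (empirical) quartet tensor and report the pairing whose representation has the smallest one, i.e., the relaxed stand-in for the smallest rank. How the quartet tensor is estimated from data, and any normalization, are omitted here.

```python
import numpy as np

def quartet_topology(T):
    """T: 4th-order tensor of empirical joint statistics of x1..x4.
    Return the pairing whose matrix representation has the smallest
    nuclear norm."""
    pairings = {"{x1,x2 | x3,x4}": (0, 1, 2, 3),
                "{x1,x3 | x2,x4}": (0, 2, 1, 3),
                "{x1,x4 | x2,x3}": (0, 3, 1, 2)}
    def nuc(perm):
        rows = T.shape[perm[0]] * T.shape[perm[1]]
        M = np.transpose(T, perm).reshape(rows, -1)
        return np.linalg.norm(M, ord="nuc")     # sum of singular values
    return min(pairings, key=lambda name: nuc(pairings[name]))

T = np.random.rand(2, 2, 2, 2)   # placeholder statistics
print(quartet_topology(T))
```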
Resolving quartet relations

!   The nuclear norm is an approximation of the rank, so there is no 100% guarantee of success
!   However, we can show it is always successful when H and G are independent or "close to" independent
!   Easy to compute
!   No need to know the number of hidden states in advance; they can also be different

Example: Stock data

Given: stock prices (25 years, 10 entries per day)
Find: relations between the stocks

!   Petroleum: SUN (Sunoco), SLB (Schlumberger), CVX (Chevron), XOM (Exxon Mobil), APA (Apache), COP (ConocoPhillips)
!   Retailers: RSH (RadioShack), TGT (Target), WMT (WalMart)
!   Finance: AXP (American Express), C (Citigroup), JPM (JPMorgan Chase)
!   F (Ford Motor: Automotive and Financial Services)

Conclusion
!   Real data: often multi-way
!   Matrix concepts and decompositions are generalizable to tensors
!   Local minima are not necessarily an issue, but if they are, reformulate using the nuclear norm (→ convex problems)
!   Advantages of the tensor approach:
    !   Tensor algorithms better exploit structure; interpretability
    !   Uniqueness properties (CP, Block term decomposition)
    !   No curse of dimensionality (Tensor train, hierarchical Tucker)

Thank you!
[email protected]