Curvilinear Component Analysis and Bregman divergences

Jigang Sun
Colin Fyfe
Malcolm Crowe
28 April 2010
University of the West of Scotland
Multidimensional Scaling (MDS)
• A group of information visualisation methods that
project data from a high-dimensional space to a
low-dimensional space, often of two or three dimensions,
keeping the inter-point dissimilarities (e.g. distances) in
the low-dimensional space as close as possible to the
original dissimilarities in the high-dimensional space.
• When Euclidean distances are used, the method is
called metric MDS.
Visualising 18-dimensional data
Basic MDS
• Basic MDS minimises the stress function

\[
E_{\mathrm{BasicMDS}} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} (L_{ij} - D_{ij})^2 = \sum_{i=1}^{N} \sum_{j=i+1}^{N} E_{ij}^2
\]

where the error is $E_{ij} = |L_{ij} - D_{ij}|$,
$D_{ij} = \|X_i - X_j\|$ is the distance between points $X_i$ and $X_j$ in data space, and
$L_{ij} = \|Y_i - Y_j\|$ is the mapped distance between points $Y_i$ and $Y_j$ in latent space.
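As a minimal sketch of the stress defined above, the following computes the basic MDS stress for a data matrix and its mapping. The function names and the toy data are assumptions made for illustration; they do not come from the slides.

```python
import numpy as np

def pairwise_distances(P):
    """Euclidean distance matrix between the rows of P."""
    diff = P[:, None, :] - P[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def basic_mds_stress(X, Y):
    """Sum over pairs i < j of (L_ij - D_ij)^2."""
    D = pairwise_distances(X)            # D_ij, distances in data space
    L = pairwise_distances(Y)            # L_ij, distances in latent space
    iu = np.triu_indices(len(X), k=1)    # index pairs with i < j
    return float(((L[iu] - D[iu]) ** 2).sum())

# Toy data (hypothetical): three points in 3-D, mapped to 2-D by dropping z.
X = np.array([[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 5.0]])
Y = X[:, :2]
stress = basic_mds_stress(X, Y)
print(stress)
```

Dropping the z coordinate collapses the third point onto the first, so the pair whose data-space distance is 5 contributes 25 to the stress on its own.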
 Sammon Mapping (1969)
\[
E_{\mathrm{Sammon}} = \frac{1}{\sum_{i=1}^{N} \sum_{j=i+1}^{N} D_{ij}} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \frac{(L_{ij} - D_{ij})^2}{D_{ij}}
\]

• Improve the Sammon mapping with Bregman divergence
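The effect of dividing each squared error by $D_{ij}$ can be seen in a small sketch. The function name and the distance matrices below are hypothetical, chosen only so that the same absolute error of 1 falls once on a short distance and once on a long one.

```python
import numpy as np

def sammon_stress(D, L):
    """Sammon stress from distance matrices D (data space) and L (latent space)."""
    iu = np.triu_indices(D.shape[0], k=1)   # index pairs with i < j
    d, l = D[iu], L[iu]
    return float(((l - d) ** 2 / d).sum() / d.sum())

# Hypothetical data-space distances: one short pair (1) and two long pairs (3).
D = np.array([[0.0, 1.0, 3.0],
              [1.0, 0.0, 3.0],
              [3.0, 3.0, 0.0]])

# Same absolute error of 1, placed on the short pair ...
L_short = D.copy()
L_short[0, 1] = L_short[1, 0] = 2.0
# ... or on a long pair.
L_long = D.copy()
L_long[0, 2] = L_long[2, 0] = 4.0

s_short = sammon_stress(D, L_short)
s_long = sammon_stress(D, L_long)
print(s_short, s_long)
```

The error on the short pair is penalised three times as heavily as the same error on the long pair, which is why the Sammon mapping favours preserving small distances.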
Bregman divergence
\[
d_F(p, q) = F(p) - F(q) - F'(q)(p - q)
\]
[Figure: plot of F with the labels p, q, F(p), F(q) and F'(q)(p - q).]
Intuitively, it is the
difference between the
value of F at point p and
the value of the first-order
Taylor expansion of F
around point q evaluated
at point p.
Two representations
When F is a function of one variable, the Bregman divergence is a
truncated Taylor series: the expansion of F(p) around q with the
zeroth- and first-order terms removed.
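The second representation can be written out explicitly. This is a sketch under the assumption that F is analytic and that p lies within the radius of convergence of its Taylor series around q; the slide does not state these conditions.

```latex
\[
d_F(p, q) = F(p) - F(q) - F'(q)(p - q)
          = \sum_{k=2}^{\infty} \frac{F^{(k)}(q)}{k!} (p - q)^k
          = \frac{F''(q)}{2!} (p - q)^2 + \frac{F'''(q)}{3!} (p - q)^3 + \cdots
\]
```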
Two useful properties for MDS
1. Non-negativity
$d_F(p, q) \ge 0$, and $d_F(p, q) = 0 \iff p = q$
2. Non-symmetry
$d_F(p, q) \ne d_F(q, p)$
Except in special cases such as F(x)=x^2
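Both properties can be checked numerically. The sketch below uses $F(x) = x \log x$ as an example convex function (an assumption made for illustration) alongside the $F(x) = x^2$ case named above; the helper name `bregman` is hypothetical.

```python
import math

def bregman(F, dF, p, q):
    """One-variable Bregman divergence d_F(p, q) = F(p) - F(q) - F'(q)(p - q)."""
    return F(p) - F(q) - dF(q) * (p - q)

# Example convex function on x > 0 (chosen for illustration): F(x) = x log x.
def F_xlogx(x):
    return x * math.log(x)

def dF_xlogx(x):
    return math.log(x) + 1.0

# The special case from the slide: F(x) = x^2, which gives d_F(p, q) = (p - q)^2.
def F_sq(x):
    return x * x

def dF_sq(x):
    return 2.0 * x

d_pq = bregman(F_xlogx, dF_xlogx, 2.0, 1.0)   # 2 log 2 - 1
d_qp = bregman(F_xlogx, dF_xlogx, 1.0, 2.0)   # 1 - log 2
d_pp = bregman(F_xlogx, dF_xlogx, 2.0, 2.0)   # zero when p = q

sq_pq = bregman(F_sq, dF_sq, 2.0, 1.0)
sq_qp = bregman(F_sq, dF_sq, 1.0, 2.0)
print(d_pq, d_qp, d_pp, sq_pq, sq_qp)
```

For $F(x) = x \log x$ the two orderings give different positive values, while for $F(x) = x^2$ both orderings give $(p - q)^2$.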
Improving Sammon Mapping with
Bregman divergences
Recall the classical Sammon Mapping (1969)
\[
E_{\mathrm{Sammon}} = \frac{1}{\sum_{i=1}^{N} \sum_{j=i+1}^{N} D_{ij}} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \frac{(L_{ij} - D_{ij})^2}{D_{ij}}
\]
• Choose a base convex function
• Common term: ignoring constant coefficients, the first term of the
Extended Sammon stress is the Sammon stress.
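The common-term claim can be illustrated numerically. The base function did not survive in this copy of the slides, so the sketch below assumes $F(x) = x \log x$ and the argument order $d_F(L_{ij}, D_{ij})$; both are assumptions, as are the function names. With this choice $F''(D) = 1/D$, so the leading Taylor term of the divergence is $(L - D)^2 / (2D)$, which is the Sammon term up to a constant factor.

```python
import math

def bregman_xlogx(L, D):
    """d_F(L, D) for the assumed base function F(x) = x log x.

    Expanding the definition gives L log(L / D) - L + D.
    """
    return L * math.log(L / D) - L + D

def sammon_term(L, D):
    """Leading Taylor term F''(D) / 2 * (L - D)^2 = (L - D)^2 / (2 D)."""
    return (L - D) ** 2 / (2.0 * D)

D = 2.0          # hypothetical data-space distance
L = 2.01         # latent distance close to D, where the first term dominates
ratio = bregman_xlogx(L, D) / sammon_term(L, D)
print(ratio)
```

When the mapped distance is close to the original, the ratio is close to 1, so the assumed divergence and the Sammon term nearly coincide. The higher-order terms matter only for larger errors.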
An Experiment on Swiss roll data set
Two groups of convex functions
• No. 1 is the base function of the Extended Sammon mapping.
OpenBox, Sammon and FirstGroup
SecondGroup on OpenBox
Curvilinear Component Analysis (CCA) and
Bregman Divergences
• The weight function W(·) takes the inter-point distance in latent space as its argument.
• CCA is good at unfolding strongly nonlinear structures.
• It is trained with a stochastic gradient descent updating rule.
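The formulas for this slide did not survive in this copy, so the following is only a sketch of one stochastic update. It assumes the commonly cited CCA form: a stress of $(L_{ij} - D_{ij})^2 W(L_{ij})$ summed over pairs, and an update that pins a chosen point $i$ and moves every other point $j$ by $\alpha \, W(L_{ij}) (D_{ij} - L_{ij}) (Y_j - Y_i) / L_{ij}$. The step-function weight with radius `lam`, the learning rate `lr` and the function name are all assumptions, not taken from the slides.

```python
import numpy as np

def cca_step(X, Y, i, lr, lam):
    """One stochastic update: keep point i fixed and move every other point j."""
    D = np.linalg.norm(X - X[i], axis=1)   # D_ij for the fixed i (data space)
    diff = Y - Y[i]                        # Y_j - Y_i
    L = np.linalg.norm(diff, axis=1)       # L_ij for the fixed i (latent space)
    W = (L <= lam).astype(float)           # assumed step weight on latent distance
    mask = L > 0                           # skips j == i and coincident points
    Y_new = Y.copy()
    scale = lr * W[mask] * (D[mask] - L[mask]) / L[mask]
    Y_new[mask] += scale[:, None] * diff[mask]
    return Y_new

# Hypothetical toy case: two points that are 2 apart in data space, 1 apart in latent space.
X = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
Y = np.array([[0.0, 0.0], [1.0, 0.0]])

Y_near = cca_step(X, Y, i=0, lr=0.5, lam=10.0)   # pair lies inside the weight radius
Y_far = cca_step(X, Y, i=0, lr=0.5, lam=0.5)     # pair lies outside the radius
print(Y_near)
print(Y_far)
```

Because the weight depends on the latent distance rather than the data-space distance, pairs that are already far apart in the map are left alone, which is what lets the mapping tear and unfold strongly curved structures.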
A version of CCA
One weight function can be
Updating rule
Rewriting stress function for CCA using
right Bregman divergences
Given convex function
Updating rule is the same
The common term between BasicCCA and Real CCA
• The first term is common with
Real CCA vs Basic CCA
Conclusions
We introduced
• The Extended Sammon mapping, compared with the Sammon mapping.
• Two groups of left Bregman divergences, tested on artificial data sets.
• A right Bregman divergence that redefines the stress
function for Curvilinear Component Analysis.
• Any questions?