
Slide 1: Robust Subspace Clustering in High Dimension: A Deterministic Result

Guangcan Liu
Slide 2: Problem Definition (Robust Subspace Clustering) -- Survey

Input: data points in R^m, possibly contaminated by errors (white noise, outliers, missing entries, corruptions).
Output: the membership of each point in its underlying subspace, ......

Applications: computer vision, image processing, biometrics, physics, system theory, ......

Methods (LD = low-dimensional data, HD = high-dimensional data):
- RANSAC (1981) -- LD
- SIM (1998) -- HD
- MSL (2004) -- HD
- LSA (2006) -- HD
- LLMC (2007) -- HD
- ALC (2007) -- HD
- GPCA (2008) -- LD
- SC (2009) -- HD
- SSC (2009) -- HD
- SCC (2009) -- HD
- LRR (2010) -- HD
- LBF (2011) -- HD
- SLBF (2011) -- HD
- ......

People: Rene Vidal, Takeo Kanade, Ali Sekmen, Gilad Lerman, David Donoho, Emmanuel Candes, Anna Little, Laura Balzano, Jerome Baudry, Robert Nowak, Alexander Powell, Elad Admir, Ehsan Elhamifar, Don Hong, Yi Ma, Teng Zhang, Shuicheng Yan, Huan Xu, Zhouchen Lin, Guangcan Liu, ......
Slide 3: In the General Case, the Problem Is Ill-Posed

- Unidentifiability (Example 1 and Example 2: 2D illustrations)
- Question: Under which conditions is it possible to EXACTLY solve the robust subspace clustering problem?
Slide 4: A Special Case

- Let S = S_1 + S_2 + ... + S_k be the sum of all k subspaces.
- High Dimension Assumption: the ambient data dimension m is so high that
      m ≫ Dim(S).

This special case is indeed significant! Consider Hopkins155 (widely used in research):
- Ambient data dimension: m = 100 ± 20
- Number of subspaces: k = 2 or 3
- Dimension of each subspace: ≤ 4
๐‘11
๐‘ž11
โ‹ฎ
๐‘๐น1
๐‘ž๐น1
๐‘˜
๐ท๐‘–๐‘š ๐‘† โ‰ค
๐‘12
๐‘ž12
โ‹ฎ
๐‘๐น2
๐‘ž๐น2
โ‹ฏ
โ‹ฏ
โ‹ฑ
โ‹ฏ
โ‹ฏ
๐‘น๐Ÿ๐‘ญ×๐’
๐‘1๐‘›
๐‘ž1๐‘›
๐ด1
โ‹ฎ = โ‹ฎ
๐‘๐น๐‘›
๐ด๐น
๐‘ž๐น๐‘›
๐‘น๐Ÿ๐‘ญ×๐Ÿ’
๐ท๐‘–๐‘š ๐‘†๐‘– โ‰ค 3 × 4 = 12 โ‰ช ๐‘š
๐‘–
โ„Ž11
โ„Ž21
โ„Ž31
1
โ‹ฏ
โ‹ฏ
โ‹ฏ
โ‹ฏ
๐‘น๐Ÿ’×๐’
โ„Ž1๐‘›
โ„Ž2๐‘›
โ„Ž3๐‘›
1
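The factorization lends itself to a quick numerical check. Below is a minimal numpy sketch (ours, not from the talk): stacking F random affine camera matrices A_f over homogeneous point coordinates yields a 2F × n trajectory matrix whose rank is at most 4. All sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
F, n = 30, 50                                # frames, tracked points
A = rng.standard_normal((2 * F, 4))          # stacked A_1, ..., A_F in R^{2F x 4}
H = np.vstack([rng.standard_normal((3, n)),  # 3D coordinates h_1j, h_2j, h_3j
               np.ones((1, n))])             # homogeneous row of ones, R^{4 x n}
traj = A @ H                                 # trajectory matrix in R^{2F x n}
print(np.linalg.matrix_rank(traj))           # prints 4: Dim(S_i) <= 4 << m = 2F
```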
Slide 5: A Special Case (cont.)

- Let S = S_1 + S_2 + ... + S_k be the sum of all k subspaces.
- High Dimension Assumption: the ambient data dimension m is so high that
      m ≫ Dim(S).

This special case is indeed significant! Consider face images:
- Suppose k = 10,000 subjects.
- The dimension of each (frontal) face subspace is about 5.
- Blessing of Dimensionality:
  - 10 × 10 face images (m = 100): Dim(S) ≤ 5k = 50,000 > m
  - 100 × 100 face images (m = 10,000): Dim(S) ≤ 5k = 50,000 > m
  - 1000 × 1000 face images (m = 1,000,000): Dim(S) ≤ 5k = 50,000 ≪ m
Slide 6: Problem Formulation (Robust Subspace Clustering)

Input: X = [x_1, x_2, ..., x_n], a given data matrix each column of which is an m-dimensional data point approximately drawn from some subspace:
      X = L_0 + E_0,
where X is observed, L_0 is the authentic data, and E_0 is the errors.

Output: the correct subspace membership.

What is the difference from PCA? PCA seeks a single linear subspace of R^m, whereas robust subspace clustering (in high dimension) seeks a union of subspaces: a nonlinear structure inside a linear space.

Low-Rankness Assumption: with S = S_1 + S_2 + ... + S_k (the sum of multiple subspaces is itself a subspace),
      r_0 ≜ rank(L_0) = Dim(S) ≤ min(m, n) / log(max(m, n)),
i.e., the High Dimension Assumption plus n also being sufficiently large.
Slide 7: A Baseline Idea

Step 1: error correction, by a (robust) PCA method matched to the error type:
- white noise -- PCA (principal component analysis):
      L_0 = argmin_L ||L||_* + λ ||X − L||_F^2
- sparse corruptions -- PCP (principal component pursuit) [Wright et al., NIPS'09; Candes et al., "Robust Principal Component Analysis?", JACM'11]:
      L_0 = argmin_L ||L||_* + λ ||X − L||_1
- outliers -- OP (outlier pursuit) [Xu et al., NIPS'10]:
      L_0 = argmin_L ||L||_* + λ ||X − L||_{2,1}
- missing entries -- matrix completion [Candes and Recht, "Exact Matrix Completion via Convex Optimization", Found. Comput. Math.'09]:
      L_0 = argmin_L ||L||_* + λ ||P_Ω(X − L)||_F^2

Step 2: standard subspace clustering on the recovered L_0:
- Compute an affinity matrix, e.g. by:
  - SIM (shape interaction matrix) [Costeira and Kanade, IJCV'98]: ① perform the SVD L_0 = UΣV^T; ② form the affinity matrix Z = |VV^T|.
  - SSC (sparse subspace clustering) [Elhamifar and Vidal, CVPR'09]:
        Z = argmin_Z ||Z||_1   s.t.   L_0 = L_0 Z
  - ......
- Spectral clustering on the affinity matrix (a minimal sketch follows below).

Comments:
- Positive: provided that L_0 is low-rank, it is possible for (robust) PCA methods to recover L_0 from X without considering multiple subspaces.
- Negative:
  - In the case of multiple subspaces, the success condition for recovering L_0 is actually very restrictive: the incoherence condition required by (robust) PCA methods is inconsistent with multiple subspaces [Guangcan Liu and Ping Li, "Recovery of Coherent Data via Low-Rank Dictionary Pursuit", NIPS'14].
  - The two-step style is not easy to use in practice.

Overall Grade: D (grading system A-F)
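For concreteness, here is a hedged sketch of Step 2's final stage: spectral clustering on a precomputed affinity matrix A (e.g., A = |VV^T| from the SIM). It uses scikit-learn's SpectralClustering; the helper name cluster_from_affinity is our own, and A is assumed symmetric and nonnegative after the absolute value.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def cluster_from_affinity(A, k):
    """Assign each of the n data points (columns of L0) to one of k subspaces."""
    model = SpectralClustering(n_clusters=k, affinity='precomputed',
                               random_state=0)
    return model.fit_predict(np.abs(A))  # labels in {0, ..., k-1}
```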
Slide 8: Preliminary: Shape Interaction Matrix (SIM)

Definition: For a data matrix M each column of which is a data point, its shape interaction matrix (SIM) is
      VV^T (the row projector),
where UΣV^T is the skinny SVD of M.

The SIM of L_0, namely V_0 V_0^T ∈ R^{n×n}, identifies the true subspace membership. The following has been proved (Kanatani, ECCV'11; Vidal et al., IJCV'08; Liu et al., TPAMI'13; Xu et al., ICML'13):
- for x_i and x_j from different subspaces, [V_0 V_0^T]_ij = 0;
- for x_i and x_j from the same subspace, [V_0 V_0^T]_ij ≠ 0 with high probability.
๐‘ฝ๐ŸŽ ๐‘ฝ๐‘ป๐ŸŽ
๐‘‹ = ๐‘ˆ๐‘‹ ฮฃ๐‘‹ ๐‘‰๐‘‹๐‘‡
independent
๐‘ฝ๐ŸŽ ๐‘ฝ๐‘ป๐ŸŽ
๐‘‰๐‘‹ ๐‘‰๐‘‹๐‘‡
intersection
SNRdB = 23
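The two stated properties are easy to see numerically. Below is a small numpy sketch (ours): for clean data drawn from two independent random subspaces, the cross-subspace entries of V_0 V_0^T are numerically zero while within-subspace entries are generically nonzero. All sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, per = 50, 3, 20                        # ambient dim, subspace dim, points each
L0 = np.hstack([rng.standard_normal((m, d)) @ rng.standard_normal((d, per))
                for _ in range(2)])          # points from two random subspaces

U, s, Vt = np.linalg.svd(L0, full_matrices=False)
V0 = Vt[:(s > 1e-10 * s[0]).sum()].T         # skinny SVD: keep nonzero singular values
sim = V0 @ V0.T                              # the SIM of L0

print(np.abs(sim[:per, per:]).max())         # ~1e-16: zero across subspaces
print(np.abs(sim[:per, :per]).min())         # nonzero within a subspace (w.h.p.)
```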
Slide 9: Our Method

Given X, X = L_0 + E_0, recover V_0 V_0^T (the SIM of L_0) using a single convex procedure.

A Basic Theory [Liu et al., TPAMI'13]:
      V_0 V_0^T = argmin_Z ||Z||_*   s.t.   L_0 = L_0 Z

About the nuclear norm ||·||_*:
- Let {σ_1, σ_2, ...} be the singular values of a matrix M. Then ||M||_* = Σ_i σ_i.
- The nuclear norm is the closest convex approximation to the rank function:
      π = (σ_1, σ_2, ...)^T,   rank(M) = ||π||_0,   ||M||_* = ||π||_1.
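These two identities can be checked in a couple of lines of numpy (our check, not the talk's); M below is an arbitrary rank-2 example.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))  # rank-2 example
pi = np.linalg.svd(M, compute_uv=False)                # singular value vector
print(np.isclose(pi.sum(), np.linalg.norm(M, 'nuc')))  # ||M||_* = ||pi||_1: True
print(int((pi > 1e-10 * pi[0]).sum()))                 # ||pi||_0 = rank(M) = 2
```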
Slide 10: Our Method (cont.)

Given X, X = L_0 + E_0, recover V_0 V_0^T (the SIM of L_0) using a single convex procedure.

A Basic Theory [Liu et al., TPAMI'13]:
      V_0 V_0^T = argmin_Z ||Z||_*   s.t.   L_0 = L_0 Z, i.e., X − E_0 = (X − E_0) Z

To recover V_0 V_0^T, one may try:
      min_{Z,E} ||Z||_* + λ ||E||_ℓ   s.t.   X − E = (X − E) Z,
with the error norm ||·||_ℓ matched to the error type:
- ||E||_F^2: white noise
- ||E||_1: randomly sparse errors
- ||E||_{2,1}: column-wise sparse errors

But this program is not convex! For randomly sparse E_0, Favaro, Vidal, and Ravichandran, "A Closed Form Solution to Robust Subspace Estimation and Clustering", CVPR'11, study
      min_{Z,E} ||Z||_* + λ ||E||_1   s.t.   X − E = (X − E) Z.
Slide 11: Our Method (cont.)

Given X, X = L_0 + E_0, recover V_0 V_0^T (the SIM of L_0) using a single convex procedure.

A Basic Theory [Liu et al., TPAMI'13]:
      V_0 V_0^T = argmin_Z ||Z||_*   s.t.   L_0 = L_0 Z

A Convex Approximation Scheme:
- An observation: E_0 V_0 V_0^T ≈ 0.
- We therefore remove the EZ term from X − E = (X − E) Z, i.e., from X = XZ + E − EZ, which gives the convex formulation:
      min_{Z,E} ||Z||_* + λ ||E||_ℓ   s.t.   X = XZ + E

Question: What is lost?
Slide 12: Our Method (cont.)

Given X, X = L_0 + E_0, recover V_0 V_0^T (the SIM of L_0) using a single convex procedure.

A Basic Theory [Liu et al., TPAMI'13]:
      V_0 V_0^T = argmin_Z ||Z||_*   s.t.   L_0 = L_0 Z

Rather surprisingly, in some cases nothing is lost! Under certain conditions, the convex procedure below can EXACTLY recover V_0 V_0^T:
      min_{Z,E} ||Z||_* + λ ||E||_ℓ   s.t.   X = XZ + E

Setting: sample-specific errors, i.e., E_0 is column-wise sparse (group sparsity). This gives Low-Rank Representation (LRR) [Liu et al., ICML'10, TPAMI'13]:
      (LRR)   min_{Z,E} ||Z||_* + λ ||E||_{2,1}   s.t.   X = XZ + E
A solver sketch follows.
Slide 13: Our Method: A Deterministic Result

Given X, X = L_0 + E_0, recover V_0 V_0^T (the SIM of L_0) using a single convex procedure.

LRR:
      (Z*, E*) = argmin_{Z,E} ||Z||_* + λ ||E||_{2,1}   s.t.   X = XZ + E

Assumption:
      m ≫ Dim(S)  ⇒  span(L_0) ∩ span(E_0) = {0}   (avoids unidentifiability)

Theorem: There exists γ* > 0 such that LRR with parameter
      λ = 3 / (7 ||X|| √(γ* n))
strictly succeeds as long as γ ≤ γ* (γ is the fraction of nonzero columns of E_0):
      U* (U*)^T = V_0 V_0^T   and   H* = H_0,
where U* holds the left singular vectors of Z*, H* is the column support of E*, V_0 holds the right singular vectors of L_0, and H_0 is the column support of E_0.

Notes:
- Z* ≠ V_0 V_0^T (Z* is asymmetric).
- LRR also depends on the incoherence condition.
- ℓ1 regularization, min_{Z,E} ||Z||_* + λ ||E||_1 s.t. X = XZ + E, gives near recovery of V_0 V_0^T, but not exact recovery.
- Provided that ||X_{:,i}||_2 = 1 for all i = 1, ..., n, one can choose λ = 1/log n.

A numerical check of the theorem's two claims is sketched below.
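The helper below (our own, not from the paper) shows how the two recovery claims could be verified numerically: U* (U*)^T should match V_0 V_0^T, and the column support of E* should match that of E_0. The thresholds are ad hoc.

```python
import numpy as np

def check_recovery(Z_star, E_star, V0, E0, tol=1e-4):
    U, s, _ = np.linalg.svd(Z_star)
    U_star = U[:, s > tol * s[0]]                      # left singular vectors of Z*
    sim_gap = np.linalg.norm(U_star @ U_star.T - V0 @ V0.T)
    supp_star = np.linalg.norm(E_star, axis=0) > tol   # column support H*
    supp_0 = np.linalg.norm(E0, axis=0) > tol          # column support H_0
    return sim_gap, bool((supp_star == supp_0).all())
```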
Slide 14: Results on Randomly Generated Data

Experimental Settings:
- ambient dimension m = 300
- r_0 ≜ rank(L_0) = 5, 10, ..., 150
- number of subspaces k = 5
- corruption fraction γ = 1.7%, 3.4%, ..., 50%
- number of data points n = 300
- number of trials = 100
- λ = 1/log n

Comparison:
- LRR, recovering V_0 V_0^T:
      min_{Z,E} ||Z||_* + λ ||E||_{2,1}   s.t.   X = XZ + E
- Outlier Pursuit (OP), recovering colu_supp(E_0):
      min_{L,E} ||L||_* + λ ||E||_{2,1}   s.t.   X = L + E

[Figure: success maps of LRR and OP over rank r_0 versus corruption percentage γ.]

One trial of this setup is sketched below.
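The snippet below generates one trial in the spirit of the experiment above, under assumptions we chose ourselves (the per-subspace dimension d and Gaussian column corruptions are invented); solve_lrr is the sketch from Slide 12 and check_recovery the helper from Slide 13.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, gamma, d = 300, 300, 5, 0.10, 4
per = n // k
L0 = np.hstack([rng.standard_normal((m, d)) @ rng.standard_normal((d, per))
                for _ in range(k)])                  # k random subspaces
E0 = np.zeros((m, n))
bad = rng.choice(n, size=int(gamma * n), replace=False)
E0[:, bad] = 5 * rng.standard_normal((m, bad.size))  # column-wise corruptions
X = L0 + E0

lam = 1.0 / np.log(n)                                # the talk's parameter choice
# Z_star, E_star = solve_lrr(X, lam)                 # then check against V0 V0^T:
# print(check_recovery(Z_star, E_star, V0, E0))
```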
Slide 15: Results on Motion Sequences

Experimental Settings:
- Dataset: Hopkins155 (+ synthetic corruptions)
- Baseline: Outlier Pursuit (OP) + SIM
Slide 16: Results on Face Images

Input: X = [face images, ......]. Run LRR:
      (Z*, E*) = argmin_{Z,E} ||Z||_* + λ ||E||_{2,1}   s.t.   X = XZ + E
Output: XZ* = [corrected face images, ......]
Slide 17: Some Comments on LRR

Positive:
- Subspace clustering + error correction.
- Computationally stable (convex).
- The model is flexible and can easily adapt to various problems: image segmentation (ICCV'11), saliency detection (TIP'11).

Negative:
- LRR still partially depends on the incoherence condition!
- LRR has NOT fully captured the structure of multiple subspaces.

Overall Grade: D+ (grading system A-F)
Slide 18: Conclusion & Future Work

Conclusion:
- Identified a significant and "easy" case of the robust subspace clustering problem (blessing of dimensionality).
- Proposed a convex formulation, termed LRR, that resolves the problem under certain conditions.

Future Work:
- Fast algorithms (curse of dimensionality).
- Completely removing the dependence on the incoherence condition.

References:
[1] Liu et al. Robust Subspace Segmentation by Low-Rank Representation. ICML'10.
[2] Liu et al. Robust Recovery of Subspace Structures by Low-Rank Representation. TPAMI'13.
[3] Liu and Li. Recovery of Coherent Data via Low-Rank Dictionary Pursuit. NIPS'14.
[4] Liu et al. A Deterministic Analysis for LRR. TPAMI (under revision).
Slide 19: Questions?

Welcome to email me: [email protected]