Robust Subspace Clustering in High Dimension:
A Deterministic Result
Guangcan Liu
Problem Definition (Robust Subspace Clustering)
Survey
[Diagram: input data matrix X → robust subspace clustering → output segmentation]
Applications
Computer Vision
Image Processing
Biometrics
Physics
System Theory
……
Methods
RANSAC (1981) -- LD
SIM (1998) -- HD
MSL (2004) -- HD
LSA (2006) -- HD
LLMC (2007) -- HD
ALC (2007) -- HD
GPCA (2008) -- LD
SC (2009) -- HD
SSC (2009) -- HD
SCC (2009) -- HD
LRR (2010) -- HD
LBF (2011) -- HD
SLBF (2011) -- HD
……
Errors
white noise
outliers
missing entries
corruptions
……
People
René Vidal
Takeo Kanade
Ali Sekmen
Gilad Lerman
David Donoho
Emmanuel Candès
Anna Little
Laura Balzano
Jerome Baudry
Robert Nowak
Alexander Powell
Elad Admir
Ehsan Elhamifar
Don Hong
Yi Ma
Teng Zhang
Shuicheng Yan
Huan Xu
Zhouchen Lin
Guangcan Liu
……
In the general case, the problem is ill-posed
Unidentifiability
[Figure: Example 1 and Example 2, 2D illustrations of unidentifiable decompositions]
Question: Under which conditions is it possible to EXACTLY solve the robust subspace clustering problem?
A Special Case
Let 𝒮 = 𝒮₁ + 𝒮₂ + … + 𝒮ₖ be the sum of all k subspaces.
High Dimension Assumption:
• The ambient data dimension m is so high that
    m ≫ dim 𝒮
This special case is indeed significant!
Hopkins155 (widely used in research):
• Ambient data dimension, m = 100 ± 20
• Number of subspaces, k = 2 or 3
• Dimension of each subspace ≤ 4
Under the affine camera model, the trajectory matrix of a single rigid motion factors through a basis of dimension at most 4:

    ⎡ x₁₁  x₁₂  ⋯  x₁ₙ ⎤         ⎡ A₁ ⎤   ⎡ h₁₁  ⋯  h₁ₙ ⎤
    ⎢ y₁₁  y₁₂  ⋯  y₁ₙ ⎥         ⎢ ⋮  ⎥   ⎢ h₂₁  ⋯  h₂ₙ ⎥
X = ⎢  ⋮    ⋮   ⋱   ⋮  ⎥    =    ⎢    ⎥ × ⎢ h₃₁  ⋯  h₃ₙ ⎥
    ⎢ x_F₁ x_F₂ ⋯  x_Fₙ ⎥        ⎣ A_F ⎦   ⎣  1   ⋯   1  ⎦
    ⎣ y_F₁ y_F₂ ⋯  y_Fₙ ⎦
        ∈ ℝ^(2F×n)              ∈ ℝ^(2F×4)   ∈ ℝ^(4×n)

Hence, with at most 3 motions of dimension ≤ 4 each,

    dim 𝒮 ≤ 3 × 4 = 12 ≪ m.
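The claim m ≫ dim 𝒮 is easy to sanity-check numerically; a minimal numpy sketch (the sizes and random bases below are illustrative assumptions, not Hopkins155 data):

```python
import numpy as np

rng = np.random.default_rng(0)

m, k, d = 200, 3, 4          # ambient dim, #subspaces, dim of each subspace
n_per = 50                   # points sampled per subspace

# Draw k random d-dimensional subspaces of R^m and sample points from each.
X = np.hstack([
    rng.standard_normal((m, d)) @ rng.standard_normal((d, n_per))
    for _ in range(k)
])

# For generic random bases, rank(X) = dim(S1 + ... + Sk) <= k*d << m.
print(np.linalg.matrix_rank(X), "<=", k * d, "<<", m)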
A Special Case
Let 𝒮 = 𝒮₁ + 𝒮₂ + … + 𝒮ₖ be the sum of all k subspaces.
High Dimension Assumption:
• The ambient data dimension m is so high that
    m ≫ dim 𝒮
This special case is indeed significant!
Face Images:
• Consider k = 10,000 subjects
• The dimension of each (frontal) face subspace is about 5
• Blessing of Dimensionality:
    10 × 10 face images (m = 100):          dim 𝒮 ≤ 5k = 50,000 > m
    100 × 100 face images (m = 10,000):     dim 𝒮 ≤ 5k = 50,000 > m
    1000 × 1000 face images (m = 1,000,000): dim 𝒮 ≤ 5k = 50,000 ≪ m
Problem Formulation (Robust Subspace Clustering)
What is the difference from PCA?
• PCA: the data lie near a single linear subspace.
• Robust subspace clustering (high dimension): the data lie near a union of several subspaces, a nonlinear structure built from linear pieces.
Input: X = [x₁, x₂, …, xₙ], a given data matrix, each column of which is an m-dimensional data point approximately drawn from some subspace:
    X (observed) = L₀ (authentic) + E₀ (errors)
Output:
• Correct subspace membership
Low-Rankness Assumption:
• 𝒮 = 𝒮₁ + 𝒮₂ + … + 𝒮ₖ, i.e., the sum of the multiple subspaces together is a subspace.
• r₀ ≜ rank(L₀) = dim 𝒮 ≤ min(m, n) / log max(m, n),
  i.e., the High Dimension Assumption + n is also sufficiently large.
A Baseline Idea

Error Correction: handle the errors with (robust) PCA methods.
• white noise: PCA (principal component analysis)
    L₀ = argmin_L ‖L‖_* + λ‖X − L‖²_F
• sparse corruptions: PCP (principal component pursuit) [Wright et al., NIPS'09; Candès et al., JACM'11]
    L₀ = argmin_L ‖L‖_* + λ‖X − L‖₁
• outliers: OP (outlier pursuit) [Xu et al., NIPS'10]
    L₀ = argmin_L ‖L‖_* + λ‖X − L‖_{2,1}
• missing entries: matrix completion [Candès and Recht, Found. Comput. Math.'09]
    L₀ = argmin_L ‖L‖_* + λ‖𝒫_Ω(X − L)‖²_F

Standard Subspace Clustering: two steps.
Step 1: compute an affinity matrix.
• SIM (shape interaction matrix) [Costeira and Kanade, IJCV'98]: perform the SVD L₀ = UΣVᵀ, then form the affinity matrix W = |VVᵀ|.
• SSC (sparse subspace clustering) [Elhamifar and Vidal, CVPR'09]:
    W = argmin_W ‖W‖₁   s.t.  L₀ = L₀W
• ……
Step 2: spectral clustering.

Comments
Positive:
• Provided that L₀ is low-rank, it is possible for (robust) PCA methods to recover L₀ from X without considering the multiple subspaces.
Negative:
1. In the case of multiple subspaces, the success condition for recovering L₀ is actually very restrictive.
2. The incoherence condition required by (robust) PCA methods is inconsistent with multiple subspaces (Guangcan Liu and Ping Li, Recovery of Coherent Data via Low-Rank Dictionary Pursuit, NIPS'14).
• The two-step style is not easy to use in practice.
Overall Grade: D (grading system A-F)

References:
• Candès et al. Robust Principal Component Analysis? JACM'11.
• Candès and Recht. Exact Matrix Completion via Convex Optimization. Found. Comput. Math.'09.
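For concreteness, the two-step pipeline could be sketched as below, assuming clean data, a known rank r and cluster number k, and scikit-learn for the spectral step; this is an illustration, not the exact code of any cited method:

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def sim_affinity(L0, r):
    """Step 1: shape interaction matrix |V V^T| from the rank-r skinny SVD."""
    _, _, Vt = np.linalg.svd(L0, full_matrices=False)
    V = Vt[:r].T                      # right singular vectors, n x r
    return np.abs(V @ V.T)

def two_step_clustering(L0, r, k):
    W = sim_affinity(L0, r)           # Step 1: affinity matrix
    sc = SpectralClustering(n_clusters=k, affinity="precomputed")
    return sc.fit_predict(W)          # Step 2: spectral clustering
```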
Preliminary: Shape Interaction Matrix (SIM)
Definition
For a data matrix M, each column of which is a data point, its shape interaction matrix (SIM) is
    VVᵀ   (the row projector),
where UΣVᵀ is the skinny SVD of M.
The SIM of L₀, V₀V₀ᵀ ∈ ℝⁿˣⁿ, identifies the true subspace membership. The following has been proved (Kanatani, ECCV'11; Vidal et al., IJCV'08; Liu et al., TPAMI'13; Xu et al., ICML'13):
• for xᵢ and xⱼ from different subspaces, [V₀V₀ᵀ]ᵢⱼ = 0;
• for xᵢ and xⱼ from the same subspace, [V₀V₀ᵀ]ᵢⱼ ≠ 0 with high probability.
However, this block structure degrades when the subspaces intersect rather than being independent:
[Figure: V₀V₀ᵀ of independent vs. intersecting subspaces, SNR = 23 dB]
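The block-diagonal property is easy to reproduce; a minimal sketch with two independently drawn subspaces and noiseless data (all sizes are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
m, d, n_per = 100, 4, 30

# Two independent subspaces: the SIM V0 V0^T is block-diagonal.
L0 = np.hstack([rng.standard_normal((m, d)) @ rng.standard_normal((d, n_per))
                for _ in range(2)])

_, _, Vt = np.linalg.svd(L0, full_matrices=False)
V0 = Vt[:2 * d].T                    # right singular vectors (rank = 2d)
S = V0 @ V0.T                        # shape interaction matrix

off_block = np.abs(S[:n_per, n_per:]).max()   # cross-subspace entries ~ 0
in_block = np.abs(S[:n_per, :n_per]).max()    # within-subspace entries != 0
print(f"cross-subspace ~ {off_block:.2e}, within-subspace ~ {in_block:.2e}")
```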
Our Method
Given X, X = L₀ + E₀, recover V₀V₀ᵀ (the SIM of L₀) using a single convex procedure.
A Basic Theory [Liu et al., TPAMI'13]:
    V₀V₀ᵀ = argmin_W ‖W‖_*   s.t.  L₀ = L₀W
About the nuclear norm ‖·‖_*:
• Let {σ₁, σ₂, …} be the singular values of a matrix M. Then ‖M‖_* = Σᵢ σᵢ.
• The nuclear norm is the closest convex approximation to the rank function. For example,
    M = diag(σ₁, σ₂, …):   rank(M) = ‖σ‖₀,   ‖M‖_* = ‖σ‖₁.
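As a quick numerical companion to the two bullets above, the sketch below compares rank(M) = ‖σ‖₀ with ‖M‖_* = ‖σ‖₁ on a random low-rank matrix (sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

# A random rank-3 matrix: nuclear norm = sum of singular values.
M = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 50))
sigma = np.linalg.svd(M, compute_uv=False)

rank = int(np.sum(sigma > 1e-9))      # ||sigma||_0 up to numerical tolerance
nuclear = sigma.sum()                 # ||sigma||_1 = ||M||_*

print(f"rank(M) = {rank}, ||M||_* = {nuclear:.3f}")
```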
Our Method
Given X, X = L₀ + E₀, recover V₀V₀ᵀ (the SIM of L₀) using a single convex procedure.
A Basic Theory [Liu et al., TPAMI'13]:
    V₀V₀ᵀ = argmin_W ‖W‖_*   s.t.  L₀ = L₀W,  i.e.  X − E₀ = (X − E₀)W
To recover V₀V₀ᵀ, one may try:
    min_{W,E} ‖W‖_* + λ‖E‖_ℓ   s.t.  X − E = (X − E)W
where the error norm is chosen to match the error pattern:
• ‖E‖²_F: white noise
• ‖E‖₁: randomly sparse errors
• ‖E‖_{2,1}: column-wise sparse errors
But this formulation is not convex!
Favaro, Vidal, and Ravichandran, "A Closed Form Solution to Robust Subspace Estimation and Clustering", CVPR'11, study the randomly sparse case of E₀:
    min_{W,E} ‖W‖_* + λ‖E‖₁   s.t.  X − E = (X − E)W
Our Method
Given X, X = L₀ + E₀, recover V₀V₀ᵀ (the SIM of L₀) using a single convex procedure.
A Basic Theory [Liu et al., TPAMI'13]:
    V₀V₀ᵀ = argmin_W ‖W‖_*   s.t.  L₀ = L₀W
A Convex Approximation Scheme:
• An observation: E₀V₀V₀ᵀ ≈ 0. Since the target W is V₀V₀ᵀ, the term EW in
    X = XW + E − EW
  is approximately zero, so we remove it.
• Convex formulation:
    min_{W,E} ‖W‖_* + λ‖E‖_ℓ   s.t.  X = XW + E
Question: What is lost?
Our Method
Given X, X = L₀ + E₀, recover V₀V₀ᵀ (the SIM of L₀) using a single convex procedure.
A Basic Theory [Liu et al., TPAMI'13]:
    V₀V₀ᵀ = argmin_W ‖W‖_*   s.t.  L₀ = L₀W
Rather surprisingly, in some cases nothing is lost! Under certain conditions, the convex procedure below can EXACTLY recover V₀V₀ᵀ.
Setting: sample-specific errors, i.e., E₀ is column-wise sparse (group sparsity).
Low-Rank Representation (LRR) [Liu et al., ICML'10, TPAMI'13]:
    (LRR)   min_{W,E} ‖W‖_* + λ‖E‖_{2,1}   s.t.  X = XW + E
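The LRR program is convex and solvable by first-order methods. Below is a minimal numpy sketch of an inexact-ALM / ADMM solver for this program, in the spirit of the algorithm used in the LRR papers; the auxiliary split W = J, the penalty schedule (mu, rho, mu_max), and the stopping tolerance are illustrative assumptions, not the papers' exact settings:

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: prox of tau * ||.||_*."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def l21_shrink(A, tau):
    """Column-wise shrinkage: prox of tau * ||.||_{2,1}."""
    norms = np.linalg.norm(A, axis=0)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return A * scale

def lrr(X, lam, n_iter=500, mu=1e-2, rho=1.1, mu_max=1e6):
    """min ||W||_* + lam * ||E||_{2,1}  s.t.  X = X W + E  (inexact ALM)."""
    m, n = X.shape
    W = np.zeros((n, n)); J = np.zeros((n, n)); E = np.zeros((m, n))
    Y1 = np.zeros((m, n)); Y2 = np.zeros((n, n))
    XtX = X.T @ X
    for _ in range(n_iter):
        # J-step: nuclear-norm prox on the split variable.
        J = svt(W + Y2 / mu, 1.0 / mu)
        # W-step: least-squares solve of the stationarity condition.
        W = np.linalg.solve(np.eye(n) + XtX,
                            X.T @ (X - E) + J + (X.T @ Y1 - Y2) / mu)
        # E-step: l_{2,1} prox on the residual.
        E = l21_shrink(X - X @ W + Y1 / mu, lam / mu)
        # Dual updates and penalty growth.
        R1 = X - X @ W - E
        R2 = W - J
        Y1 += mu * R1
        Y2 += mu * R2
        mu = min(rho * mu, mu_max)
        if max(np.abs(R1).max(), np.abs(R2).max()) < 1e-7:
            break
    return W, E
```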
Our Method
Given X, X = L₀ + E₀, recover V₀V₀ᵀ (the SIM of L₀) using a single convex procedure.
LRR:
    (W*, E*) = argmin_{W,E} ‖W‖_* + λ‖E‖_{2,1}   s.t.  X = XW + E
A Deterministic Result
Assumption:
    m ≫ dim 𝒮  ⇒  span(L₀) ∩ span(E₀) = {0}   (avoids unidentifiability)
Theorem
There exists γ* > 0 such that LRR with
    λ = 3 / (7‖X‖√(γ*n))
strictly succeeds as long as γ ≤ γ* (γ is the fraction of nonzero columns of E₀):
    U*(U*)ᵀ = V₀V₀ᵀ   and   ℐ* = ℐ₀,
where U* is the matrix of left singular vectors of W*, and ℐ* is the column support of E*.
    V₀: right singular vectors of L₀
    ℐ₀: column support of E₀
Notes
• LRR also depends on the incoherence parameter.
• W* ≠ V₀V₀ᵀ (W* is asymmetric).
• ℓ₁ regularization, min_{W,E} ‖W‖_* + λ‖E‖₁ s.t. X = XW + E, gives near recovery of V₀V₀ᵀ, but not exact.
• Provided that ‖X_{:,j}‖₂ = 1 for all j = 1, ⋯, n, one may take λ = 1/log n.
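Taken at face value, the theorem's λ is computable directly from X and γ*; a tiny sketch under the assumption that the formula above is reconstructed correctly (γ* itself is problem-dependent and passed in):

```python
import numpy as np

def lrr_lambda(X, gamma_star):
    """lambda = 3 / (7 * ||X|| * sqrt(gamma* * n)), as stated above."""
    n = X.shape[1]
    return 3.0 / (7.0 * np.linalg.norm(X, 2) * np.sqrt(gamma_star * n))
```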
Results on Randomly Generated Data
Experimental Settings
• ambient dimension m = 300
• #subspaces k = 5
• #data points n = 300
• r₀ ≜ rank(L₀) = 5, 10, …, 150
• γ = 1.7%, 3.4%, …, 50%
• #trials = 100
• λ = 1/log n
Comparison
LRR, recovering V₀V₀ᵀ:
    min_{W,E} ‖W‖_* + λ‖E‖_{2,1}   s.t.  X = XW + E
Outlier Pursuit (OP), recovering the column support of E₀:
    min_{L,E} ‖L‖_* + λ‖E‖_{2,1}   s.t.  X = L + E
[Figure: success maps over rank r₀ vs. corruption percentage γ for LRR and OP]
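To make the protocol concrete, here is a hedged sketch of one trial under these settings, reusing the illustrative lrr solver sketched earlier; the seed and the particular (r₀, γ) pair are arbitrary, and whether the final support check passes depends on the regime:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k, r0 = 300, 300, 5, 25        # settings as on this slide (r0 = k*d)
d = r0 // k                          # dimension of each subspace
gamma = 0.10                         # fraction of corrupted columns

# L0: n/k points from each of k random d-dimensional subspaces of R^m.
L0 = np.hstack([rng.standard_normal((m, d)) @ rng.standard_normal((d, n // k))
                for _ in range(k)])

# E0: column-wise sparse corruptions on a random gamma-fraction of columns.
E0 = np.zeros((m, n))
bad = rng.choice(n, size=int(gamma * n), replace=False)
E0[:, bad] = rng.standard_normal((m, bad.size))

X = L0 + E0
W, E = lrr(X, lam=1.0 / np.log(n))   # solver sketched after the LRR slide

# Compare the recovered column support of E with the planted one.
found = np.where(np.linalg.norm(E, axis=0) > 1e-3)[0]
print(sorted(bad.tolist()) == sorted(found.tolist()))
```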
Results on Motion Sequences
Experimental Settings
• Dataset: Hopkins155 (+ synthetic corruptions)
• Baseline: Outlier Pursuit (OP) + SIM
Results on Face Images
input:   X = [ face images ······ ]
    (W*, E*) = argmin_{W,E} ‖W‖_* + λ‖E‖_{2,1}   s.t.  X = XW + E
output:  XW* = [ recovered face images ······ ]
Some Comments on LRR
Positive:
• Subspace clustering + error correction in one convex program.
• Computationally stable (convex).
• The model is flexible and can easily adapt to various problems: image segmentation (ICCV'11), saliency detection (TIP'11).
Negative:
• LRR still partially depends on the incoherence condition!
• LRR has NOT fully captured the structure of multiple subspaces.
Overall Grade: D+ (grading system A-F)
Conclusion & Future Work
Conclusion
• Identified a significant and "easy" case of the robust subspace clustering problem (blessing of dimensionality).
• Proposed a convex formulation termed LRR that resolves the problem under certain conditions.
Future Work
• Fast algorithms (curse of dimensionality).
• Completely removing the dependence on the incoherence condition.
References
[1] Liu et al. Robust Subspace Segmentation by Low-Rank Representation. ICML'10.
[2] Liu et al. Robust Recovery of Subspace Structures by Low-Rank Representation. TPAMI'13.
[3] Liu et al. Recovery of Coherent Data via Low-Rank Dictionary Pursuit. NIPS'14.
[4] Liu et al. A Deterministic Analysis for LRR. TPAMI (under revision).
Questions?
Welcome to email me:
[email protected]