Regularization Methods: an Application to Matrix Completion with Lipschitz Loss
Vincent Cottet (with P. Alquier and G. Lecué)
PhD Supervisor: N. Chopin
Rencontres ENSAE/ENSAI, Rennes, 26.01.2017
Problem

Ingredients:
- data: $(X_i, Y_i)_{i=1}^N$ i.i.d. from a distribution $P$
- set of predictors: $F \subset L_2 = \{f : \mathbb{E}[f(X)^2] < +\infty\}$
- loss function: $\ell(f(X), Y)$
- oracle: $f^* \in \arg\min_{f \in F} \mathbb{E}[\ell(f(X), Y)]$
- excess risk: $\mathcal{E}(f) = \mathbb{E}[\ell(f(X), Y)] - \mathbb{E}[\ell(f^*(X), Y)]$
- regularization norm $\|\cdot\|$ over $F$
Regularized Empirical Risk Minimizer (RERM)

$$\hat f = \operatorname*{arg\,min}_{f \in F} \left\{ \frac{1}{N} \sum_{i=1}^N \ell(f(X_i), Y_i) + \lambda \|f\| \right\}$$
Introduction

Lipschitz Property:
$$|\ell(y'_1, y) - \ell(y'_2, y)| \le |y'_1 - y'_2|$$

Examples:
- Logistic loss: for all $y \in \{-1, +1\}$, $y' \in \mathbb{R}$, $\ell(y', y) = \log(1 + \exp(-y' y))$
- Hinge loss: for all $y \in \{-1, +1\}$, $y' \in \mathbb{R}$, $\ell(y', y) = (1 - y' y)_+$
- Quantile loss (level $\tau \in (0, 1)$): for all $y, y' \in \mathbb{R}$, $\ell(y', y) = \tau (y' - y)_+ + (1 - \tau)(y - y')_+$
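As an illustrative sketch (not the authors' code), the three Lipschitz losses above can be written in a few lines of NumPy; the function names are our own choices:

```python
import numpy as np

def logistic_loss(y_pred, y):
    """log(1 + exp(-y' y)) for labels y in {-1, +1}."""
    return np.log1p(np.exp(-y_pred * y))

def hinge_loss(y_pred, y):
    """(1 - y' y)_+ for labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y_pred * y)

def quantile_loss(y_pred, y, tau=0.5):
    """tau (y' - y)_+ + (1 - tau)(y - y')_+ ; tau = 0.5 gives the median loss."""
    diff = y_pred - y
    return tau * np.maximum(0.0, diff) + (1.0 - tau) * np.maximum(0.0, -diff)
```

All three are 1-Lipschitz in the prediction $y'$ (the quantile loss has constant $\max(\tau, 1-\tau) \le 1$), which is exactly the property the general theory requires.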
Strategy

$$\hat f = \operatorname*{arg\,min}_{f \in F} \left\{ \frac{1}{N} \sum_{i=1}^N \ell(f(X_i), Y_i) + \lambda \|f\| \right\}$$

Bernstein Parameter $\kappa$: for all $f \in F$,
$$\|f - f^*\|_{L_2}^{2\kappa} \le A \, \mathcal{E}(f)$$

Complexity function:
$$r(\rho) = C \left( \rho \, \frac{\mathrm{Rad}(B)}{\sqrt{N}} \right)^{1/2\kappa}, \qquad B = \{f : \|f\| \le 1\}$$

Sparsity equation: fixed point $\rho^*$ of
$$\Delta(\rho) = \inf_{h \in \rho S \cap r(2\rho) B_{L_2}} \; \sup_{g \in \partial\|\cdot\|(f^*)} \langle h, g \rangle, \qquad \Delta(\rho^*) \ge \frac{4}{5}\rho^*$$

Final Result, w/ high prob.:
$$\|\hat f - f^*\| \le \rho^*, \qquad \|\hat f - f^*\|_{L_2} \le r(2\rho^*), \qquad \mathcal{E}(\hat f) \le C \left( r(2\rho^*) \right)^{2\kappa}$$
Application – Matrix Completion

$m_1 \times m_2$ matrix:
- black cells: known
- white cells: unknown

Observations:
- $X_i$: location
- $Y_i$: value in $\mathbb{R}$ or $\{-1, +1\}$

Trace Regression:
- $i$-th location $(u, v)$: $X_i = e_u \otimes e_v$
- approximation of $Y_i$ by $f(X_i) = \langle X_i, M \rangle = \mathrm{Tr}(X_i^\top M) = M_{u,v}$
Regularized problem:
- predictor: $f = \langle \cdot, M \rangle$
- Lipschitz loss function $\ell$
- regularization: $\|\langle \cdot, M \rangle\| = \|M\|_{S_1} = \sum_{i=1}^{m_1 \wedge m_2} \sigma_i(M)$

Estimator:
$$\hat M = \operatorname*{arg\,min}_{M} \left\{ \frac{1}{N} \sum_{i=1}^N \ell(\langle X_i, M \rangle, Y_i) + \lambda \|M\|_{S_1} \right\}$$
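The estimator above can be computed, for instance, by proximal gradient descent: the proximal operator of $\lambda\|\cdot\|_{S_1}$ soft-thresholds singular values. The sketch below (our illustration, not the authors' implementation; the function names, fixed step size, and logistic choice of loss are assumptions) fits the binary case:

```python
import numpy as np

def svt(M, threshold):
    """Singular value soft-thresholding: prox of threshold * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - threshold, 0.0)) @ Vt

def fit_logistic_completion(rows, cols, y, shape, lam=0.1, step=1.0, n_iter=200):
    """RERM with logistic loss + S1 penalty over the observed entries."""
    M = np.zeros(shape)
    n = len(y)
    for _ in range(n_iter):
        z = M[rows, cols]                              # <X_i, M> = M_{u,v}
        grad_entries = -y / (1.0 + np.exp(y * z)) / n  # d/dz of logistic loss, averaged
        G = np.zeros(shape)
        np.add.at(G, (rows, cols), grad_entries)       # gradient lives on observed cells
        M = svt(M - step * G, step * lam)              # gradient step, then prox
    return M
```

Each iteration costs one SVD of an $m_1 \times m_2$ matrix, which is the standard price of nuclear-norm regularization.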
Application – Matrix Completion (2)

Bernstein Parameter: $\kappa = 1$ (logistic: always; hinge and quantile: under mild assumptions)

Complexity function:
$$r(\rho) = C \left( \rho \sqrt{\frac{\log(m_1 + m_2)}{N \min(m_1, m_2)}} \right)^{1/2}, \qquad \mathrm{Rad}(B) = C \sqrt{\frac{\log(m_1 + m_2)}{\min(m_1, m_2)}}$$

Sparsity equation ($s = \mathrm{rank}(M^*)$), fixed point:
$$\rho^* = C s m_1 m_2 \sqrt{\frac{\log(m_1 + m_2)}{N \min(m_1, m_2)}} \;\Rightarrow\; \Delta(\rho^*) \ge \frac{4}{5}\rho^*$$

Final Result, w/ high prob.:
$$\frac{1}{m_1 m_2} \left\| \hat M - M^* \right\|_{S_2}^2 \le C \, \frac{s (m_1 + m_2) \log(m_1 + m_2)}{N}, \qquad \mathcal{E}(\hat M) \le C \, \frac{s (m_1 + m_2) \log(m_1 + m_2)}{N}$$
Illustration – Median Reconstruction

Example:
- 20% of known entries from a 200 × 200 matrix
- rank-3 matrix with small noise
- 10% of corrupted entries with varying magnitude

[Figure: L1 risk (0.50 to 1.25) against outlier magnitude (0 to 30) for the Least Squares and Median loss methods]
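The simulated setup above can be reproduced along the following lines. This is our reconstruction: the exact noise and corruption distributions are not specified on the slide, so the Gaussian noise scale and the symmetric outlier scheme below are assumptions.

```python
import numpy as np

def make_experiment(m=200, rank=3, frac_known=0.2, frac_corrupt=0.1,
                    noise_sd=0.1, outlier_magnitude=10.0, seed=0):
    """Low-rank matrix, partially observed with noise and gross corruptions."""
    rng = np.random.default_rng(seed)
    # rank-3 ground truth as a product of Gaussian factors
    M_star = rng.normal(size=(m, rank)) @ rng.normal(size=(rank, m))
    # observe 20% of the entries, uniformly without replacement
    n_obs = int(frac_known * m * m)
    idx = rng.choice(m * m, size=n_obs, replace=False)
    rows, cols = np.unravel_index(idx, (m, m))
    y = M_star[rows, cols] + noise_sd * rng.normal(size=n_obs)
    # corrupt 10% of the observations with large symmetric outliers
    corrupt = rng.random(n_obs) < frac_corrupt
    y[corrupt] += outlier_magnitude * rng.choice([-1.0, 1.0], size=corrupt.sum())
    return M_star, rows, cols, y
```

Sweeping `outlier_magnitude` and comparing the least-squares and median (quantile, $\tau = 0.5$) reconstructions yields curves of the kind shown in the figure: the median loss is insensitive to the outlier magnitude while least squares degrades.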
Key Points

General case:
- general study of RERM with Lipschitz loss
- two settings:
  - subgaussian: computation of two objects, $w(B)$ and $\rho^*$
  - bounded: computation of two objects, $\mathrm{Rad}(B)$ and $\rho^*$
- estimation and excess risk bounds

Applications:
- Matrix Completion: binary with logistic and hinge loss; quantile reconstruction
- Logistic LASSO and Logistic SLOPE
- SVM without induced sparsity