Random matrix theory in sparse recovery

Maryia Kabanava
RWTH Aachen University
CoSIP Winter Retreat 2016
Compressed sensing
Goal: reconstruction of (high-dimensional) signals from a minimal
amount of measured data
Key ingredients:
Exploit low complexity of signals (e.g. sparsity/compressibility)
Efficient algorithms (e.g. convex optimization)
Randomness (random matrices)
Signal recovery problem
Signal x ∈ R^d is unknown.
Given:
Linear measurement map: M : R^d → R^m, m ≪ d.
Measurement vector: y = Mx + w ∈ R^m, ‖w‖_2 ≤ η.
Goal: recover x from y.
Idea: recovery is possible if x belongs to a set of low complexity.
Standard compressed sensing: sparsity (small number of
nonzero coefficients)
Cosparsity: sparsity after transformation
Structured sparsity: e.g. block sparsity
Low rank matrix recovery
Low rank tensor recovery
Noiseless model
[Figure: the under-determined linear system y = Mx with M ∈ R^{m×d}, m ≪ d, and supp x = S ⊂ {1, 2, . . . , d}.]
ℓ0-minimization:
min_{z∈R^d} ‖z‖_0 s.t. Mz = y
NP-hard

ℓ1-minimization:
min_{z∈R^d} ‖z‖_1 s.t. Mz = y
efficient minimization methods
Nonuniform vs. uniform recovery
Nonuniform recovery
A fixed sparse (compressible) vector is recovered with high
probability using M.
Sufficient conditions on M
Descent cone of ℓ1 -norm at x intersects ker M trivially.
Construct (approximate) dual certificate.
Uniform recovery
With high probability on M every sparse (compressible)
vector is recovered.
Sufficient conditions on M
Null space property.
Restricted isometry property.
Nonuniform recovery: descent cone
For fixed x ∈ R^d, we define the convex cone

T(x) = cone{z − x : z ∈ R^d, ‖z‖_1 ≤ ‖x‖_1}.

Theorem
Let M ∈ R^{m×d}. A vector x ∈ R^d is the unique minimizer of ‖z‖_1 subject to Mz = Mx if and only if

ker M ∩ T(x) = {0}.

[Figure: the affine space x + ker M meeting the shifted descent cone x + T(x) only at x.]

Let S^{d−1} = {x ∈ R^d : ‖x‖_2 = 1} and set T := T(x) ∩ S^{d−1}. If

inf_{x∈T} ‖Mx‖_2 > 0,   (1)

then ker M ∩ T = ∅ and ker M ∩ T(x) = {0}.
Uniform recovery: null space property (NSP)
M ∈ R^{m×d} is said to satisfy the stable NSP of order s with 0 < ρ < 1 if, for any S ⊂ [d] with |S| ≤ s, it holds

‖v_S‖_1 < ρ‖v_{S^c}‖_1   for all v ∈ ker M.   (2)
Theorem
Let M ∈ R^{m×d} satisfy (2). Then, for any x ∈ R^d, the solution x̂ of

min_{z∈R^d} ‖z‖_1   subject to Mz = y,

with y = Mx, approximates x with ℓ1-error

‖x − x̂‖_1 ≤ (2(1 + ρ)/(1 − ρ)) σ_s(x)_1,   (3)

where σ_s(x)_1 := inf{‖x − z‖_1 : z is s-sparse}.
Strategy to check NSP
Lemma
Let

T_{ρ,s} := {w ∈ R^d : ‖w_S‖_1 ≥ ρ‖w_{S^c}‖_1 for some S ⊂ [d], |S| ≤ s}.

Set T := T_{ρ,s} ∩ S^{d−1}. If

inf_{w∈T} ‖Mw‖_2 > 0,

then for any v ∈ ker M and any S ⊂ [d] with |S| ≤ s it holds

‖v_S‖_1 < ρ‖v_{S^c}‖_1.
Uniform recovery: restricted isometry property (RIP)
Definition
The restricted isometry constant δ_s of a matrix M ∈ R^{m×d} is defined as the smallest δ_s such that

(1 − δ_s)‖x‖_2² ≤ ‖Mx‖_2² ≤ (1 + δ_s)‖x‖_2²   (4)

for all s-sparse x ∈ R^d.

Requires that all s-column submatrices of M are well-conditioned:

δ_s = max_{|S|≤s} ‖M_S^T M_S − Id‖_{2→2}.

Implies the stable NSP.
We say that M satisfies the restricted isometry property if δ_s is small for reasonably large s.
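The characterization δ_s = max_{|S|≤s} ‖M_S^T M_S − Id‖_{2→2} can be checked directly for small matrices by enumerating supports. A brute-force sketch, assuming NumPy; the helper name rip_constant is ours, and the enumeration is exponential in s, so this is for illustration only:

```python
import itertools
import numpy as np

def rip_constant(M, s):
    """Brute-force delta_s = max over |S| = s of ||M_S^T M_S - I||_{2->2}.
    Maximizing over |S| = s suffices: the spectral norm of a principal
    submatrix of a symmetric matrix never exceeds that of the full matrix."""
    d = M.shape[1]
    delta = 0.0
    for S in itertools.combinations(range(d), s):
        cols = M[:, list(S)]
        G = cols.T @ cols - np.eye(s)
        delta = max(delta, np.linalg.norm(G, 2))  # ord=2: spectral norm
    return delta

rng = np.random.default_rng(1)
m, d = 40, 10
M = rng.standard_normal((m, d)) / np.sqrt(m)      # normalized Gaussian matrix
d1, d2 = rip_constant(M, 1), rip_constant(M, 2)
print(d1, d2)
```

For s = 1 the constant reduces to the worst deviation of a column's squared norm from 1, and δ_s is nondecreasing in s.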
RIP implies recovery by ℓ1 -minimization
(1 − δ_s)‖x‖_2² ≤ ‖Mx‖_2² ≤ (1 + δ_s)‖x‖_2²   (5)

Theorem
Assume that the restricted isometry constant of M ∈ R^{m×d} satisfies

δ_{2s} < 1/√2 ≈ 0.7071.

Then ℓ1-minimization reconstructs every s-sparse vector x ∈ R^d from y = Mx.
Matrices satisfying recovery conditions
Open problem: Give explicit matrices M ∈ Rm×d that satisfy
recovery conditions.
Goal: Successful recovery with M ∈ Rm×d , if
m ≥ C s ln^α(d),
for constants C and α.
Deterministic matrices are known only for m ≥ C s².
Way out: consider random matrices.
Gaussian random variables
A standard Gaussian random variable X ∼ N(0, 1) has probability density function

ψ(x) = (1/√(2π)) e^{−x²/2}.   (6)

1  The tail of X decays super-exponentially:

   P(|X| > t) ≤ e^{−t²/2},   t > 0.

2  The absolute moments of X can be computed as

   (E|X|^p)^{1/p} = √2 (Γ((1 + p)/2)/Γ(1/2))^{1/p} = O(√p),   p ≥ 1.   (7)

3  The moment generating function of X equals

   E exp(tX) = e^{t²/2},   t ∈ R.
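Formula (7) can be checked numerically against samples. A small sketch, assuming NumPy; the helper name abs_moment is ours:

```python
import math
import numpy as np

def abs_moment(p):
    """(E|X|^p)^(1/p) for X ~ N(0,1), via formula (7):
    sqrt(2) * (Gamma((1+p)/2) / Gamma(1/2))^(1/p)."""
    return math.sqrt(2) * math.exp(
        (math.lgamma((1 + p) / 2) - math.lgamma(0.5)) / p)

rng = np.random.default_rng(2)
X = rng.standard_normal(200_000)
emp = np.mean(np.abs(X) ** 4) ** 0.25   # empirical (E|X|^4)^(1/4)
print(emp, abs_moment(4))
```

For p = 2 the formula collapses to (E X²)^{1/2} = 1, and for p = 4 it gives 3^{1/4}, since E X⁴ = 3.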
Subgaussian random variables
Lemma
Let X be a random variable with EX = 0. Then the following properties are equivalent.

1  Tails: there exist β, κ > 0 such that

   P(|X| > t) ≤ β e^{−κt²}   for all t > 0.   (8)

2  Moments:

   (E|X|^p)^{1/p} ≤ C √p   for all p ≥ 1.   (9)

3  Moment generating function:

   E exp(tX) ≤ e^{ct²}   for all t ∈ R.   (10)

A random variable X with EX = 0 that satisfies one of the properties above is called subgaussian.
Subgaussian random variables: examples
1  Gaussian
2  Bernoulli: P{X = −1} = P{X = 1} = 1/2
3  Bounded: |X| ≤ M almost surely for some M
Hoeffding-type inequality
Theorem
Let X_1, . . . , X_N be a sequence of independent subgaussian random variables,

E exp(tX_i) ≤ e^{ct²}   for all t ∈ R and i ∈ {1, . . . , N}.   (11)

For a ∈ R^N, the random variable Z := Σ_{i=1}^N a_i X_i is subgaussian, i.e.

E exp(tZ) ≤ exp(c‖a‖_2² t²)   for all t ∈ R,   (12)

and

P(|Σ_{i=1}^N a_i X_i| ≥ t) ≤ 2 exp(−t²/(4c‖a‖_2²))   for all t > 0.   (13)
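Inequality (13) can be illustrated for Bernoulli variables, where E exp(tX) = cosh t ≤ e^{t²/2}, i.e. c = 1/2. A Monte Carlo sketch, assuming NumPy; the sizes N, trials and threshold t are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(3)
N, trials = 100, 20_000
a = np.ones(N)                       # weights, ||a||_2^2 = N
c = 0.5                              # Bernoulli MGF parameter

X = rng.choice([-1.0, 1.0], size=(trials, N))   # independent Bernoulli signs
Z = X @ a                                        # Z = sum_i a_i X_i per trial
t = 20.0
empirical = np.mean(np.abs(Z) >= t)              # empirical tail probability
bound = 2 * np.exp(-t**2 / (4 * c * np.dot(a, a)))   # inequality (13)
print(empirical, bound)
```

The empirical tail should sit well below the Hoeffding bound 2e^{−2} ≈ 0.27; the bound is loose but dimension-free in N once ‖a‖_2 is fixed.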
Subexponential random variables
A random variable X with EX = 0 is called subexponential if there exist β, κ > 0 such that

P(|X| > t) ≤ β e^{−κt}   for all t > 0.   (14)

Theorem (Bernstein-type inequality)
Let X_1, . . . , X_N be a sequence of independent subexponential random variables,

P(|X_i| > t) ≤ β e^{−κt}   for all t > 0 and i ∈ {1, . . . , N}.   (15)

Then

P(|Σ_{i=1}^N X_i| ≥ t) ≤ 2 exp(−(κt)²/(2(2βN + κt)))   for all t > 0.   (16)
Random matrices
Definition
Let M ∈ Rm×d be a random matrix.
If the entries of M are independent Bernoulli variables (i.e.
taking values ±1 with equal probability), then M is called a
Bernoulli random matrix.
If the entries of M are independent standard Gaussian random
variables, then M is called a Gaussian random matrix.
If the entries of M are independent subgaussian random variables,

P(|M_jk| ≥ t) ≤ β e^{−κt²}   for all t > 0,

then M is called a subgaussian random matrix.
RIP for subgaussian random matrices
Theorem
Let M ∈ R^{m×d} be a subgaussian random matrix. Then there exists C = C(β, κ) > 0 such that the restricted isometry constant of (1/√m) M satisfies δ_s ≤ δ with probability at least 1 − ε provided

m ≥ C δ^{−2} (s ln(ed/s) + ln(2ε^{−1})).   (17)
Random matrices with subgaussian rows
Let Y ∈ R^d be random.
If E|⟨Y, x⟩|² = ‖x‖_2² for all x ∈ R^d, then Y is called isotropic.
If, for all x ∈ R^d with ‖x‖_2 = 1, the random variable ⟨Y, x⟩ is subgaussian,

E exp(t⟨Y, x⟩) ≤ exp(ct²)   for all t ∈ R (c independent of x),

then Y is called a subgaussian random vector.
Theorem
Let M ∈ R^{m×d} be random with independent, isotropic, subgaussian rows with the same parameter c. If

m ≥ C δ^{−2} (s ln(ed/s) + ln(2ε^{−1})),   (18)

then the restricted isometry constant of (1/√m) M satisfies δ_s ≤ δ with probability at least 1 − ε.
Ingredients of the proof: concentration inequality
Let M ∈ R^{m×d} be random with independent, isotropic, subgaussian rows. Then, for all x ∈ R^d and every t ∈ (0, 1),

P(|m^{−1}‖Mx‖_2² − ‖x‖_2²| ≥ t‖x‖_2²) ≤ 2 exp(−c t² m).   (19)

Proof.
Let x ∈ R^d, ‖x‖_2 = 1. Denote the rows of M by Y_1, . . . , Y_m ∈ R^d. Define

Z_i = |⟨Y_i, x⟩|² − ‖x‖_2²,   i = 1, . . . , m.

Then EZ_i = 0 and P(|Z_i| ≥ r) ≤ β exp(−κr), so the Z_i are subexponential, and

m^{−1}‖Mx‖_2² − ‖x‖_2² = m^{−1} Σ_{i=1}^m Z_i.

Bernstein's inequality gives

P(m^{−1}|Σ_{i=1}^m Z_i| ≥ t) ≤ 2 exp(−(κ²/(4β + 2κ)) m t²).
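The concentration inequality (19) is easy to observe for a Gaussian matrix, whose rows are isotropic and subgaussian. A Monte Carlo sketch, assuming NumPy; m, d, the number of trials, and the threshold 0.3 are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
m, d, trials = 100, 20, 2_000
x = rng.standard_normal(d)
x /= np.linalg.norm(x)                    # unit vector, so ||x||_2^2 = 1

M = rng.standard_normal((trials, m, d))   # one Gaussian matrix per trial
dev = ((M @ x) ** 2).mean(axis=1) - 1.0   # m^{-1}||Mx||_2^2 - ||x||_2^2
frac = np.mean(np.abs(dev) >= 0.3)        # empirical deviation probability
print(frac)
```

Here m^{−1}‖Mx‖_2² is a scaled chi-square with m degrees of freedom, so its standard deviation is √(2/m) ≈ 0.14 and deviations of size 0.3 are already rare.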
Ingredients of the proof: covering argument
Let M ∈ R^{m×d} be random and

P(|m^{−1}‖Mx‖_2² − ‖x‖_2²| ≥ t‖x‖_2²) ≤ 2 exp(−c t² m)   for all x ∈ R^d.

Define M̃ = (1/√m) M. Then

P(|‖M̃x‖_2² − ‖x‖_2²| ≥ t‖x‖_2²) ≤ 2 exp(−c t² m)   for all x ∈ R^d.

For S ⊂ {1, . . . , d}, |S| = s and δ, ε ∈ (0, 1), if

m ≥ C δ^{−2} (7s + 2 ln(2ε^{−1})),   (20)

then with probability at least 1 − ε

‖M̃_S^T M̃_S − Id‖_{2→2} < δ.   (21)
Ingredients of the proof: union bound
Let M̃ ∈ R^{m×d} be random and

P(|‖M̃x‖_2² − ‖x‖_2²| ≥ t‖x‖_2²) ≤ 2 exp(−c t² m)   for all x ∈ R^d.

If for δ, ε ∈ (0, 1),

m ≥ C δ^{−2} (s(9 + 2 ln(d/s)) + 2 ln(2ε^{−1})),   (22)

then with probability at least 1 − ε, the restricted isometry constant δ_s of M̃ satisfies δ_s < δ.
Gaussian width
For T ⊂ R^d we define its Gaussian width by

ℓ(T) := E sup_{x∈T} ⟨x, g⟩,   g ∈ R^d standard Gaussian.   (23)

[Figure: the width of a set T in direction u.]

Due to rotation invariance, (23) can be written as

ℓ(T) = E‖g‖_2 · E sup_{x∈T} ⟨x, u⟩,

where u is uniformly distributed on S^{d−1}.

ℓ(S^{d−1}) = E sup_{‖x‖_2=1} ⟨x, g⟩ = E‖g‖_2 ∼ √d

D := conv{x ∈ S^{d−1} : |supp x| ≤ s},   ℓ(D) ∼ √(s ln(d/s))
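The identity ℓ(S^{d−1}) = E‖g‖_2 and the exact value √2 Γ((d+1)/2)/Γ(d/2) can be checked by simulation. A sketch, assuming NumPy; the helper name E_norm and the choice d = 50 are ours:

```python
import math
import numpy as np

def E_norm(d):
    """E||g||_2 = sqrt(2) * Gamma((d+1)/2) / Gamma(d/2) for g ~ N(0, I_d)."""
    return math.sqrt(2) * math.exp(math.lgamma((d + 1) / 2)
                                   - math.lgamma(d / 2))

rng = np.random.default_rng(5)
d = 50
g = rng.standard_normal((20_000, d))       # 20000 Gaussian vectors in R^d
est = np.linalg.norm(g, axis=1).mean()     # Monte Carlo estimate of E||g||_2
print(est, E_norm(d), math.sqrt(d))
```

The estimate should match the Gamma-function formula to a few hundredths, and the exact value sits between d/√(d+1) and √d, the bounds used on the next slide.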
Gordon’s escape through a mesh
ℓ(T) := E sup_{x∈T} ⟨x, g⟩,   g ∈ R^d standard Gaussian.

E_m := E‖g‖_2 = √2 Γ((m + 1)/2)/Γ(m/2),   g ∈ R^m standard Gaussian,

m/√(m + 1) ≤ E_m ≤ √m.

Theorem
Let M ∈ R^{m×d} be Gaussian and T ⊂ S^{d−1}. Then, for t > 0, it holds

P(inf_{x∈T} ‖Mx‖_2 > E_m − ℓ(T) − t) ≥ 1 − e^{−t²/2}.   (24)

The proof relies on the concentration of measure inequality for Lipschitz functions.
m is determined by:

E_m ≥ m/√(m + 1) ≥ ℓ(T) + t + τ   (m ≳ ℓ(T)²).
Estimates for Gaussian widths of T (x)
T(x) = cone{z − x : z ∈ R^d, ‖z‖_1 ≤ ‖x‖_1}   (25)

N(x) := {z ∈ R^d : ⟨z, w − x⟩ ≤ 0 for all w s.t. ‖w‖_1 ≤ ‖x‖_1}   (26)

ℓ(T(x) ∩ S^{d−1}) ≤ E min_{z∈N(x)} ‖g − z‖_2,   g ∈ R^d a standard Gaussian random vector.

Let supp(x) = S. Then

N(x) = ∪_{t≥0} {z ∈ R^d : z_i = t sgn(x_i) for i ∈ S, |z_i| ≤ t for i ∈ S^c}

ℓ(T(x) ∩ S^{d−1})² ≤ 2s ln(ed/s)
Nonuniform recovery with Gaussian measurements
Theorem
Let x ∈ R^d be an s-sparse vector and let M ∈ R^{m×d} be a randomly drawn Gaussian matrix. If, for some ε ∈ (0, 1),

m²/(m + 1) ≥ 2s (√(ln(ed/s)) + √(ln(ε^{−1})/s))²,   (27)

then with probability at least 1 − ε the vector x is the unique minimizer of ‖z‖_1 subject to Mz = Mx.
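Condition (27) can be evaluated numerically to see how many Gaussian measurements the theorem demands. A sketch, assuming only the standard library; the helper name m_bound and the inputs d = 1000, s = 10, ε = 0.01 are ours:

```python
import math

def m_bound(d, s, eps):
    """Smallest m with m^2/(m+1) >= 2s(sqrt(ln(ed/s)) + sqrt(ln(1/eps)/s))^2,
    the nonuniform Gaussian recovery condition (27)."""
    rhs = 2 * s * (math.sqrt(math.log(math.e * d / s))
                   + math.sqrt(math.log(1 / eps) / s)) ** 2
    m = 1
    while m * m / (m + 1) < rhs:
        m += 1
    return m

# measurements sufficient for a fixed 10-sparse vector in R^1000, w.p. 0.99
print(m_bound(1000, 10, 0.01))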
Estimates for Gaussian widths of Tρ,s
T_{ρ,s} := {w ∈ R^d : ‖w_S‖_1 ≥ ρ‖w_{S^c}‖_1 for some S ⊂ [d], |S| = s}   (28)

D := conv{x ∈ S^{d−1} : |supp(x)| ≤ s}   (29)

T_{ρ,s} ∩ S^{d−1} ⊂ (1 + ρ^{−1})D

ℓ(D) ≤ √(2s ln(ed/s)) + √s

ℓ(T_{ρ,s} ∩ S^{d−1}) ≤ (1 + ρ^{−1})(√(2s ln(ed/s)) + √s)
Uniform recovery with Gaussian measurements
Theorem
Let M ∈ R^{m×d} be Gaussian, 0 < ρ < 1 and 0 < ε < 1. If

m²/(m + 1) ≥ 2s(1 + ρ^{−1})² (√(ln(ed/s)) + 1/√2 + √(ln(ε^{−1})/(s(1 + ρ^{−1})²)))²,

then with probability at least 1 − ε, for every x ∈ R^d a minimizer x̂ of ‖z‖_1 subject to Mz = Mx approximates x with ℓ1-error

‖x − x̂‖_1 ≤ (2(1 + ρ)/(1 − ρ)) σ_s(x)_1.
Thank you for your attention!