Diffusion Maps Part I: Key Lemmas and Outline of Proof

Review of Riemannian Geometry
Two Easy Examples
Isometric Embeddings
Diffusion Maps
Proof of Lemmas 1-4
Diffusion Maps Part I:
Key Lemmas and Outline of Proof
Tyrus Berry
George Mason University
September 22, 2015
Review of Fundamental Questions
Given a set of points in a high-dimensional space:
1. How can we reduce the dimensionality of a data set?
   - PCA/MDS: Linear projection, Topology preserving
   - ISOMAP: Geometry preserving, sensitive to noise
2. How can we represent a geometric structure?
   - Riemannian Geometry ⇔ Laplace-Beltrami operator
   - Discrete approximation will be a Graph Laplacian ⇒ Robust
   - Eigenfunctions ⇒ Custom Fourier analysis
   - Represent function spaces and operators in this basis, smooth/denoise, many other applications
Review of Riemannian Geometry
Two Easy Examples
Circle vs. Ellipse
Torus
Isometric Embeddings
Diffusion Maps
Idea and Construction
Statement of Theorem and Caveats
Outline of Proof
Proof of Lemmas 1-4
Riemannian manifold M with metric g
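- Local coordinates: $H_x(y) = (x_1(y), \dots, x_d(y))^\top \in \mathbb{R}^d$
- Tangent vectors: $v = \sum_i v_i \frac{\partial}{\partial x_i}$ so that $v(f) = \sum_i v_i \frac{\partial \tilde f}{\partial x_i}$
- Riemannian metric: $g_{ij} = g_x\left(\frac{\partial}{\partial x_i}, \frac{\partial}{\partial x_j}\right)$ and $g^{ij} = (g^{-1})_{ij}$
- Exterior derivative: $(df)(v) \equiv \sum_i v_i \frac{\partial \tilde f}{\partial x_i}$, where $\tilde f = f \circ H_x^{-1}$
- Gradient: $(\nabla f)_j \equiv \sum_k g^{jk}\, df_k$ so that $df(v) = g(v, \nabla f)$
- Volume: $\int_M f(y)\, dV(y) = \int_M \tilde f(x_1, \dots, x_d) \sqrt{|g|}\, dx_1 \cdots dx_d$
- Divergence: $\langle \mathrm{div}\, X, f \rangle \equiv \langle X, \nabla f \rangle = \int_M g(X, \nabla f)\, dV$
- Laplacian: $\Delta f \equiv \mathrm{div}\, \nabla f = \sum_{ij} \frac{1}{\sqrt{|g|}} \frac{\partial}{\partial x_i} \left( \sqrt{|g|}\, g^{ij} \frac{\partial \tilde f}{\partial x_j} \right)$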
Local Coordinates
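[Figure: the chart $H_x$ maps a neighborhood $U_x \subset M$ of $x$ to $T_xM \cong \mathbb{R}^d$, with coordinate axes $x_1$ and $x_2 = H_x(y)_2$ and basis vectors $\partial/\partial x_1$, $\partial/\partial x_2$; $H_x^{-1}$ maps back to $U_x$, and a function $f : U_x \to \mathbb{R}$ is shown with its value $f(x)$.]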
Geometry of a Circle
- The circle $S^1$ is special because it has a global coordinate $\theta$
- Unit circle: $M = \{(\cos\theta, \sin\theta)^\top : \theta \in [0, 2\pi)\} \cong S^1$
- Tangent space: $T_\theta M = d\theta\, (-\sin\theta, \cos\theta)^\top \cong \mathbb{R}^1$
- Induced Riemannian metric: $g_\theta(u, v) = d\theta\, (-\sin\theta, \cos\theta) \cdot d\theta\, (-\sin\theta, \cos\theta) = d\theta^2$
- Laplacian: $\Delta f(\theta) = \frac{\partial^2 f}{\partial \theta^2}$
- Note: $\sqrt{|g|}$ gives the arclength differential
Geometry of an Ellipse
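- Ellipse: $M = \{(a\cos\theta, b\sin\theta)^\top : \theta \in [0, 2\pi)\}$
- $g_\theta(u, v) = (a^2 \sin^2\theta + b^2 \cos^2\theta)\, d\theta^2 = (b^2 + (a^2 - b^2)\sin^2\theta)\, d\theta^2$
- $g(\theta) = b^2 + (a^2 - b^2)\sin^2\theta$
- Laplacian: $\Delta f = \sum_{ij} \frac{1}{\sqrt{|g|}} \frac{\partial}{\partial x_i} \left( \sqrt{|g|}\, g^{ij} \frac{\partial f}{\partial x_j} \right) = g^{-1/2} \frac{\partial}{\partial \theta} \left( g^{-1/2} \frac{\partial f}{\partial \theta} \right)$
- If $a = \sqrt{2}$, $b = 1$, then $g = 1 + \sin^2\theta$ and $g' = \sin 2\theta$, so $\Delta f = \frac{1}{1 + \sin^2\theta} \frac{\partial^2 f}{\partial \theta^2} - \frac{\sin 2\theta}{2(1 + \sin^2\theta)^2} \frac{\partial f}{\partial \theta}$
- Note: $S^1$ is an embedding of the ellipse, but not an isometric one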
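The ellipse formulas can be checked numerically. The following is a minimal sketch, not part of the original slides: it compares the metric $g(\theta) = b^2 + (a^2 - b^2)\sin^2\theta$ with the squared speed of the embedding, and compares the nested form $g^{-1/2}\partial_\theta(g^{-1/2}\partial_\theta f)$ with its product-rule expansion, in which the first-derivative coefficient is $-g'(\theta)/(2g^2)$ with $g'(\theta) = (a^2 - b^2)\sin 2\theta$. The test function $f(\theta) = \sin 3\theta$, the evaluation point, and all function names are illustrative assumptions.

```python
import math

A, B = math.sqrt(2.0), 1.0  # semi-axes from the slide's example


def embed(theta):
    """Embedding of the ellipse in the plane."""
    return (A * math.cos(theta), B * math.sin(theta))


def metric(theta):
    """Closed-form metric g(theta) = b^2 + (a^2 - b^2) sin^2(theta)."""
    return B**2 + (A**2 - B**2) * math.sin(theta) ** 2


def metric_fd(theta, eps=1e-6):
    """Squared speed of the embedding, by a central difference."""
    (x0, y0), (x1, y1) = embed(theta - eps), embed(theta + eps)
    return ((x1 - x0) ** 2 + (y1 - y0) ** 2) / (2 * eps) ** 2


def laplacian_nested(df, theta, eps=1e-4):
    """g^{-1/2} d/dtheta (g^{-1/2} df/dtheta); outer derivative by central difference."""
    def flux(t):
        return df(t) / math.sqrt(metric(t))
    return (flux(theta + eps) - flux(theta - eps)) / (2 * eps) / math.sqrt(metric(theta))


def laplacian_expanded(df, d2f, theta):
    """Product-rule expansion: f''/g - g' f' / (2 g^2)."""
    g = metric(theta)
    dg = (A**2 - B**2) * math.sin(2 * theta)
    return d2f(theta) / g - dg * df(theta) / (2 * g**2)


# Illustrative test function f(theta) = sin(3 theta) and its derivatives.
df = lambda t: 3 * math.cos(3 * t)
d2f = lambda t: -9 * math.sin(3 * t)

THETA0 = 0.7
metric_gap = abs(metric(THETA0) - metric_fd(THETA0))
laplacian_gap = abs(laplacian_nested(df, THETA0) - laplacian_expanded(df, d2f, THETA0))
```

Both gaps should be at the level of finite-difference error, which is a quick way to catch a dropped factor when expanding the nested form by hand.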
Geometry of a Torus
- The Torus has global coordinates $(\theta, \phi)$
- Flat Torus: $M = \{(\cos\theta, \sin\theta, \cos\phi, \sin\phi)^\top : (\theta, \phi) \in [0, 2\pi)^2\}$
- Tangent space: $T_{\theta,\phi} M = (-\sin\theta, \cos\theta, 0, 0)\, d\theta + (0, 0, -\sin\phi, \cos\phi)\, d\phi$
- Induced Riemannian metric: $g_{\theta,\phi}(u, v) = d\theta^2 + d\phi^2$
- Laplace-Beltrami operator: $\Delta f = \frac{\partial^2 f}{\partial \theta^2} + \frac{\partial^2 f}{\partial \phi^2}$
- Note: $\sqrt{|g|}$ gives $dA$
Geometry of a Torus

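- Curved Torus: $\begin{pmatrix} (r + \cos\theta)\sin\phi \\ (r + \cos\theta)\cos\phi \\ \sin\theta \end{pmatrix}$
- Tangent space: $T_{\theta,\phi} M = \begin{pmatrix} -\sin\theta \sin\phi \\ -\sin\theta \cos\phi \\ \cos\theta \end{pmatrix} d\theta + \begin{pmatrix} (r + \cos\theta)\cos\phi \\ -(r + \cos\theta)\sin\phi \\ 0 \end{pmatrix} d\phi$
- Induced Riemannian metric and Laplace-Beltrami operator:
  $g_{\theta,\phi}(u, v) = d\theta^2 + (r + \cos\theta)^2\, d\phi^2$
  $\Delta f = \frac{1}{r + \cos\theta} \left[ \frac{\partial}{\partial \theta} \left( (r + \cos\theta) \frac{\partial f}{\partial \theta} \right) + \frac{\partial}{\partial \phi} \left( \frac{1}{r + \cos\theta} \frac{\partial f}{\partial \phi} \right) \right]$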
Pullback metric and Isometric Embeddings
- Let $(M, g)$ be a Riemannian manifold
- Consider an embedding $\iota : M \to \mathbb{R}^m$
- $\mathbb{R}^m$ has a smoothly varying inner product $\langle \cdot, \cdot \rangle$
- We can define the pullback metric by: $\tilde g_x(u, v) = \langle D\iota(x) u, D\iota(x) v \rangle = u^\top D\iota(x)^\top D\iota(x)\, v$
- If $g = \tilde g$ then we call $\iota$ an isometric embedding
- Nash's theorem: Every Riemannian manifold can be isometrically embedded in $\mathbb{R}^m$ for $m$ sufficiently large
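As an illustration of the pullback formula $\tilde g_x = D\iota(x)^\top D\iota(x)$, the sketch below computes the Jacobian of the curved-torus embedding by central differences and forms the 2 × 2 matrix of the pullback metric in the coordinates $(\theta, \phi)$. It should reproduce $d\theta^2 + (r + \cos\theta)^2 d\phi^2$. The value of `R`, the evaluation point, and the function names are assumptions made only for this example.

```python
import math

R = 2.0  # hypothetical major radius r; any r > 1 gives an embedded torus


def iota(theta, phi):
    """Curved-torus embedding into R^3."""
    w = R + math.cos(theta)
    return (w * math.sin(phi), w * math.cos(phi), math.sin(theta))


def jacobian_columns(theta, phi, eps=1e-6):
    """Columns d(iota)/d(theta) and d(iota)/d(phi), by central differences."""
    col_theta = [(p - m) / (2 * eps)
                 for p, m in zip(iota(theta + eps, phi), iota(theta - eps, phi))]
    col_phi = [(p - m) / (2 * eps)
               for p, m in zip(iota(theta, phi + eps), iota(theta, phi - eps))]
    return [col_theta, col_phi]


def pullback_metric(theta, phi):
    """Matrix of the pullback metric: entries are dot products of Jacobian columns."""
    cols = jacobian_columns(theta, phi)
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


THETA0, PHI0 = 0.4, 1.3
G = pullback_metric(THETA0, PHI0)
G_expected = [[1.0, 0.0], [0.0, (R + math.cos(THETA0)) ** 2]]
```

The off-diagonal entries vanish because the two tangent vectors are orthogonal in $\mathbb{R}^3$, which is why the metric is diagonal in these coordinates.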
Geometric Prior for Diffusion Maps
- Consider a data set $\{x_i\}_{i=1}^N \subset M \subset \mathbb{R}^m$
- Then $M$ inherits a metric $g$ from $\mathbb{R}^m$
- Alternatively, we can assume $y_i \in M$ and $x_i = \iota(y_i) \in \mathbb{R}^m$ where $\iota$ is an isometric embedding with respect to $g$
- Consequence: Diffusion maps will recover the geometry of the embedded data
- Fact: We can generalize Diffusion maps (Local Kernels, Berry & Sauer 2015) to recover any geometry using local kernels
Diffusion Maps Construction
- Consider a data set $\{x_i\}_{i=1}^N \subset M \subset \mathbb{R}^m$
- We want to approximate the Laplace-Beltrami operator $\Delta$
- We will approximate functions on $M$ with functions on $\{x_i\}$, which are $N \times 1$ vectors $\vec f = (f(x_1), \dots, f(x_N))^\top$
- We will approximate $\Delta$ with a sparse $N \times N$ matrix $L$
- We want: $\Delta f(x_i) = (L \vec f)_i = \sum_j L_{ij} f_j = \sum_j L_{ij} f(x_j)$
- The matrix $L$ will be a graph Laplacian for a weighted k-nearest neighbor graph on $\{x_i\}$
- We will prove that in the limit of large data, meaning as $N \to \infty$, we have $L \to \Delta$
What is a Diffusion Map
- The eigenvectors and eigenvalues of $L$ approximate the eigenfunctions $\varphi_j(x_i)$ and eigenvalues $\lambda_j$ of $\Delta$
- Fact: $0 = \lambda_0 \ge \lambda_1 \ge \lambda_2 \ge \cdots$ (discrete and continuous)
- We can consider a family of diffusion maps $\Phi_t(x) = (e^{\lambda_1 t} \varphi_1(x), \dots, e^{\lambda_k t} \varphi_k(x))^\top$
- $\Phi_t$ optimally preserves the distance (MDS): $D_t(x, y)^2 = \lim_{k \to \infty} \|\Phi_t(x) - \Phi_t(y)\|^2 = \sum_{j=1}^{\infty} e^{2\lambda_j t} (\varphi_j(x) - \varphi_j(y))^2$
- Fact: A diffusion map is a canonical isometric embedding
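For the unit circle the eigenpairs of $\Delta = \partial^2/\partial\theta^2$ are known in closed form, so the diffusion map can be written down without any data. The sketch below assumes the $L^2$-normalized eigenfunctions $\cos(k\theta)/\sqrt{\pi}$ and $\sin(k\theta)/\sqrt{\pi}$ with eigenvalue $-k^2$ (each of multiplicity two), and checks that the squared Euclidean distance between truncated embeddings matches the truncated sum for $D_t^2$. The truncation level `K_MAX`, the time `T`, and the function names are illustrative choices.

```python
import math


def diffusion_map(theta, t, k_max):
    """Truncated diffusion map on the unit circle, using analytic eigenpairs.

    For each k = 1..k_max the eigenvalue is -k^2 with eigenfunctions
    cos(k theta)/sqrt(pi) and sin(k theta)/sqrt(pi).
    """
    coords = []
    for k in range(1, k_max + 1):
        scale = math.exp(-k * k * t) / math.sqrt(math.pi)
        coords.append(scale * math.cos(k * theta))
        coords.append(scale * math.sin(k * theta))
    return coords


def diffusion_distance_sq(theta_x, theta_y, t, k_max):
    """Truncated sum of exp(2 lambda_j t) (phi_j(x) - phi_j(y))^2 over the same modes."""
    return sum(
        math.exp(-2 * k * k * t) * (2 - 2 * math.cos(k * (theta_x - theta_y))) / math.pi
        for k in range(1, k_max + 1)
    )


T, K_MAX = 0.1, 20
X, Y = 0.3, 1.9
embedded_sq = sum(
    (a - b) ** 2 for a, b in zip(diffusion_map(X, T, K_MAX), diffusion_map(Y, T, K_MAX))
)
closed_form_sq = diffusion_distance_sq(X, Y, T, K_MAX)
```

Because the weights decay like $e^{-k^2 t}$, a modest truncation already captures the distance for moderate $t$; smaller $t$ needs more modes.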
Diffusion Maps Theorem
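- We assume that $x_i$ are sampled from a smooth density $q(x)$
- Let $h : [0, \infty) \to [0, \infty)$ have decay $h(s) < c_0 e^{-c_1 s}$
- Define $J_{ij} = J(x_i, x_j) = h\left( \frac{\|x_i - x_j\|}{\delta} \right)$
- Define $D_i = \sum_j J_{ij} \approx \int_M h\left( \frac{\|x_i - y\|}{\delta} \right) q(y)\, dV(y)$
- Right normalization: $K_{ij} = D_j^{-1} J_{ij}$ and $\hat D_i = \sum_j K_{ij}$
- Left normalization: $\hat K_{ij} = \hat D_i^{-1} K_{ij}$ and finally $L = \frac{I - \hat K}{\delta^2}$
- Theorem: $(L \vec f)_i = \Delta f(x_i) + O\left( \delta^2, N^{-1/2} \delta^{-1-d/2} \right)$
- So $L \to \Delta$ as $\delta \to 0$ and $N \to \infty$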
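The construction can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions rather than the author's implementation: it uses the kernel shape $h(s) = e^{-s^2}$, a dense matrix with no k-nearest-neighbor truncation, and equally spaced points on the unit circle (so the density is uniform and the normalizations have little to correct). The row sum $\hat D_i$ is taken over the right-normalized matrix $K$. All names are hypothetical.

```python
import math


def graph_laplacian(points, delta, h=lambda s: math.exp(-s * s)):
    """Dense sketch of L = (I - K_hat) / delta^2 with right, then left, normalization."""
    n = len(points)
    # Kernel matrix J_ij = h(||x_i - x_j|| / delta)
    J = [[h(math.dist(p, q) / delta) for q in points] for p in points]
    # D_i = sum_j J_ij estimates the sampling density (up to constants)
    D = [sum(row) for row in J]
    # Right normalization: K_ij = J_ij / D_j
    K = [[J[i][j] / D[j] for j in range(n)] for i in range(n)]
    # Left normalization: rows of K_hat sum to one
    D_hat = [sum(row) for row in K]
    K_hat = [[K[i][j] / D_hat[i] for j in range(n)] for i in range(n)]
    return [[((1.0 if i == j else 0.0) - K_hat[i][j]) / delta**2 for j in range(n)]
            for i in range(n)]


N, DELTA = 200, 0.2
thetas = [2 * math.pi * i / N for i in range(N)]
points = [(math.cos(t), math.sin(t)) for t in thetas]
L = graph_laplacian(points, DELTA)

# Apply L to f = cos(theta), for which the Laplace-Beltrami operator gives -cos(theta).
f = [math.cos(t) for t in thetas]
Lf = [sum(L[i][j] * f[j] for j in range(N)) for i in range(N)]
ratio = Lf[0] / f[0]
row_sums = [sum(row) for row in L]
```

With this kernel, `ratio` comes out close to 1/4, so in this sketch $(I - \hat K)/\delta^2$ matches $\Delta$ only up to a kernel-dependent constant and a sign convention; the proportionality constant is the kind of factor the normalizations and the choice of $h$ are meant to account for. Constant vectors lie in the null space because the rows of $\hat K$ sum to one.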
Important Caveats we will Discover:
- The constant in the error term is proportional to: $q(x)^{-1/2+2-2d} \to \infty$ as $q(x) \to 0$
  - On compact manifolds: $\inf q(x) > 0$
  - Extending to noncompact manifolds requires $\delta = \delta(x) \propto q(x)^\beta$ for $\beta < 0$ (Berry and Harlim 2015)
- The constant in the error term is proportional to $\|\nabla f\|$ so rough functions require more data (sampling theory?)
- Pointwise result breaks down for manifolds with boundary
  - Fact: No pointwise convergence at the boundary (Gibbs), instead we get $L^2$ convergence to the Neumann Laplacian
  - Still trying to fix this issue!
Outline of the Proof:
- Monte-Carlo: $(J \vec f)_i = \sum_j J(x_i, x_j) f(x_j) \propto \int_M J(x_i, y) f(y)\, dV(y)$
- Exponential decay: $J(x_i, y) = h\left( \frac{\|y - x_i\|}{\delta} \right)$ so $\|y - x_i\| < \delta$
- Taylor expand: $f(y) = f(x) + (y - x)^\top \nabla f(x) + \frac{1}{2} (y - x)^\top H(f)(x) (y - x) + \cdots$
- Lemmas 1 & 2: In this neighborhood, $y - x_i \approx u \approx s \in T_x M$
- Lemma 3: The volume form is preserved, $|dy/du| = 1 + O(\delta^2)$
- Lemma 4: $f(y) = f(x) + u^\top \nabla f(x) + \frac{1}{2} u^\top H(f)(x) u + \cdots$
Outline of the Proof:
- Monte-Carlo: $(J \vec f)_i = \sum_j J(x_i, x_j) f(x_j) \propto \int_M J(x_i, y) f(y)\, dV(y)$
- Using lemmas, replace the integral with $\int_{T_x M} h\left( \frac{\|u\|}{\delta} \right) \left( f(x) + u^\top \nabla f(x) + \frac{1}{2} u^\top H(f)(x) u \right) du$
- All terms integrate to zero (symmetry) except $f(x) + \frac{1}{2} u_i^2 \frac{\partial^2 f}{\partial s_i^2}$
- Result: $(J \vec f)_i \propto f(x) + \delta^2 m \Delta f$ where $m = \frac{1}{2} \int u_i^2 h(\|u\|)\, du$
- So $\frac{I - J}{\delta^2} \propto \Delta$, then use normalizations to remove sampling bias and proportionality constants
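The symmetry argument can be checked in one dimension, where the tangent space is a line. The sketch below is an assumption-laden illustration: it takes $h(s) = e^{-s^2}$, normalizes by $\int h$, and uses the trapezoid rule on a truncated interval. For this kernel the normalized second moment is $m = \frac{1}{2}\int u^2 e^{-u^2} du \,/ \int e^{-u^2} du = 1/4$, so the kernel average of $f$ should equal $f(x) + \delta^2 f''(x)/4$ up to higher-order terms. The function names, test function, and evaluation point are hypothetical.

```python
import math


def h(s):
    """Assumed kernel shape; it satisfies the exponential decay bound."""
    return math.exp(-s * s)


def kernel_average(f, x, delta, half_width=8.0, n=4001):
    """Trapezoid estimate of int h(|u|/delta) f(x+u) du / int h(|u|/delta) du."""
    a = half_width * delta
    step = 2 * a / (n - 1)
    num = den = 0.0
    for i in range(n):
        u = -a + i * step
        w = h(abs(u) / delta) * (0.5 if i in (0, n - 1) else 1.0)
        num += w * f(x + u)
        den += w
    return num / den


DELTA, X0 = 0.1, 0.7
M = 0.25  # normalized second moment of exp(-u^2), including the factor 1/2

# (average - f(x)) / delta^2 should approximate m * f''(x); here f = sin, so f'' = -sin.
estimate = (kernel_average(math.sin, X0, DELTA) - math.sin(X0)) / DELTA**2
target = M * (-math.sin(X0))
```

The first-order term $u f'(x)$ drops out because the kernel is even in $u$, which is the one-dimensional version of the symmetry used in the outline.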
Key Fact To Prove
When ||y − x|| is sufficiently small, ||y − x|| ≈ ||u|| ≈ ||s||
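[Figure: a point $x \in M$ with its tangent space $T_x M$; the geodesic $\gamma$ from $x$ to $y = \exp_x(s)$ with initial velocity $s = \gamma'(0)$ and acceleration $\gamma''(0)$; $u$ is the projection of $y - x$ onto $T_x M$, and $h(u)$ labels the offset between $u$ and $y$.]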
Lemma 1
Claim: Let $s_i$ be geodesic coordinates based at a point $x \in M \subset \mathbb{R}^m$ and let $u_i$ be projection coordinates, then $s_i = u_i + P_{x,3}(u) + O(\|u\|^4)$
- Let $\gamma(s)$ be the geodesic with $\gamma(0) = x$ and $\gamma(1) = y$
- The geodesic coordinates of $y$ are $\vec s = \gamma'(0)$, set $s = \|\gamma'(0)\|$
- Reparametrize $\tilde\gamma$ according to arc length so that $\tau = \int_0^\tau \|\tilde\gamma'(t)\|\, dt$ and $1 = \|\tilde\gamma'(\tau)\|$
- Then $\tilde\gamma(s) = y$, $\tilde\gamma'(0) = \vec s / s$ and $\tilde\gamma''(0) \perp T_x M$ so
  $y = \tilde\gamma(s) = \tilde\gamma(0) + s \tilde\gamma'(0) + \frac{s^2}{2} \tilde\gamma''(0) + P_{x,3}(\vec s) + O(s^4)$
- $P_{x,3}(\vec s)$ is an order-3 homogeneous polynomial in $(s_1, \dots, s_d)$
Lemma 1
Claim: Let $s_i$ be geodesic coordinates based at a point $x \in M \subset \mathbb{R}^m$ and let $u_i$ be projection coordinates, then $s_i = u_i + P_{x,3}(u) + O(\|u\|^4)$
- From above: $y = \tilde\gamma(s) = x + s \tilde\gamma'(0) + \frac{s^2}{2} \tilde\gamma''(0) + P_{x,3}(\vec s) + O(s^4)$
- Project onto $e_i$ to find (recall $\tilde\gamma''(0) \perp T_x M$): $u_i \equiv \langle y - x, e_i \rangle = \langle s \tilde\gamma'(0), e_i \rangle + P_{x,3}(\vec s) + O(s^4)$
- Since $s \tilde\gamma'(0)_i = s_i$ we have $u_i = s_i + P_{x,3}(\vec s) + O(s^4)$
Lemma 1
Claim: Let $s_i$ be geodesic coordinates based at a point $x \in M \subset \mathbb{R}^m$ and let $u_i$ be projection coordinates, then $s_i = u_i + P_{x,3}(u) + O(\|u\|^4)$
- From above: $u_i = s_i + P_{x,3}(\vec s) + O(s^4)$
- Since this is true for all $u_i$ we have (absorbing the sign into $P_{x,3}$): $s_i = u_i + P_{x,3}(\vec s) + O(s^4)$
- Plugging this in: $P_{x,3}(\vec s) = P_{x,3}(u + P_{x,3}(\vec s)) = P_{x,3}(u) + O(u_i^2 s_j^3)$
- Repeating: $O(s^4)$ and $O(u_i^2 s_j^3)$ can be bounded by $O(\|u\|^4)$
- So: $s_i = u_i + P_{x,3}(u) + O(\|u\|^4)$
- We keep $P_{x,3}(u)$ because odd terms cancel in $\int_{T_x M}$
Lemma 2
Claim: $\|y - x\|^2 = \|u\|^2 + P_{x,4}(u) + P_{x,5}(u) + O(\|u\|^6)$
- Recall: $u$ is the projection of $y - x$ onto $T_x M$
- So $y - x = (u, g(u))$ and $\|y - x\|^2 = \|u\|^2 + \sum_{i=1}^{m-d} g_i(u)^2$, where $g(u)$ collects the $m - d$ components normal to $T_x M$
- When $y = x$ we find $0 = (0, g(0))$ so $g(0) = 0$
- Since $g(u) \perp T_x M$ we find $\frac{\partial g_i}{\partial u_j}(0) = 0$ so
  $g(u) = g(0) + u^\top Dg(0) + P_{x,2}(u) + P_{x,3}(u) + O(\|u\|^4) = P_{x,2}(u) + P_{x,3}(u) + O(\|u\|^4)$
- Thus: $g(u)^2 = P_{x,4}(u) + P_{x,5}(u) + O(\|u\|^6)$
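Lemmas 1 and 2 can be made concrete on the unit circle, where everything is explicit. This is an illustrative sketch under the assumption $x = (1, 0)$ with tangent direction $(0, 1)$: a point at arc length $s$ is $y = (\cos s, \sin s)$, its projection coordinate is $u = \sin s$, and its normal component is $g(u) = \cos s - 1$. Then $s = \arcsin u = u + u^3/6 + O(u^5)$ and $\|y - x\|^2 = u^2 + u^4/4 + O(u^6)$. The function names are hypothetical.

```python
import math


def circle_point(s):
    """Point on the unit circle at arc length s from x = (1, 0)."""
    return (math.cos(s), math.sin(s))


def lemma_residuals(s):
    """Residuals of the Lemma 1 and Lemma 2 expansions on the unit circle."""
    y = circle_point(s)
    u = y[1]                                   # projection onto the tangent (0, 1)
    chord_sq = (y[0] - 1.0) ** 2 + y[1] ** 2   # ||y - x||^2
    res1 = s - (u + u**3 / 6)                  # Lemma 1: geodesic vs projection coordinate
    res2 = chord_sq - (u**2 + u**4 / 4)        # Lemma 2: chord length expansion
    return u, res1, res2


u_a, res1_a, res2_a = lemma_residuals(0.1)
u_b, res1_b, res2_b = lemma_residuals(0.05)
# Halving s should shrink the Lemma 1 residual by roughly 2^5 (it is fifth order here).
shrink_factor = res1_a / res1_b
```

On the circle the even-order terms in Lemma 1 vanish by symmetry, so the residual is $O(u^5)$, slightly better than the general $O(\|u\|^4)$ bound.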
Lemma 3
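Claim: $|dy/du| = 1 + P_{x,2}(u) + P_{x,3}(u) + O(\|u\|^4)$
- Since $\frac{\partial g_i}{\partial u_j}(0) = 0$ we have $\frac{\partial g_i}{\partial u_j}(u) = P_{x,1}(u) + P_{x,2}(u) + O(\|u\|^3)$
- Since $y - x = (u, g(u))$ we have: $\frac{\partial y}{\partial u_j}(u) = (e_j, P_{x,1}(u) + P_{x,2}(u)) + O(\|u\|^3)$
- So $(dy/du) = \begin{pmatrix} I_{d \times d} \\ P_{x,1}(u) + P_{x,2}(u) \end{pmatrix}$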
Lemma 3
Claim: $|dy/du| = 1 + P_{x,2}(u) + P_{x,3}(u) + O(\|u\|^4)$
- From above: $(dy/du) = \begin{pmatrix} I_{d \times d} \\ P_{x,1}(u) + P_{x,2}(u) \end{pmatrix}$ so
  $|dy/du|^2 = |(dy/du)^\top (dy/du)| = |I_{d \times d} + (P_{x,1}(u) + P_{x,2}(u))^2| = 1 + P_{x,2}(u) + P_{x,3}(u) + O(\|u\|^4)$
- Since $\sqrt{1 + \epsilon^2 + \epsilon^3} = 1 + \epsilon^2/2 + \epsilon^3/2 + O(\epsilon^4)$ we have
  $|dy/du| = 1 + P_{x,2}(u) + P_{x,3}(u) + O(\|u\|^4)$
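For the unit circle the volume distortion in Lemma 3 is explicit, which gives a quick sanity check. The sketch below assumes the same setup as a graph over the tangent line at $x = (1, 0)$: $y - x = (u, g(u))$ with $g(u) = \sqrt{1 - u^2} - 1$, so $|dy/du| = \sqrt{1 + g'(u)^2} = 1/\sqrt{1 - u^2} = 1 + u^2/2 + O(u^4)$. The derivative is taken by central differences, and the names are illustrative.

```python
import math


def g(u):
    """Normal component of y - x for the unit circle, as a graph over the tangent line."""
    return math.sqrt(1.0 - u * u) - 1.0


def volume_factor(u, eps=1e-6):
    """|dy/du| = sqrt(det(J^T J)) with J = (1, g'(u))^T, using a central difference."""
    dg = (g(u + eps) - g(u - eps)) / (2 * eps)
    return math.sqrt(1.0 + dg * dg)


U0 = 0.1
factor = volume_factor(U0)
quadratic_model = 1.0 + U0**2 / 2     # 1 + P_{x,2}(u) for the circle
residual = factor - quadratic_model   # expected to be O(u^4)
```

Here $P_{x,2}(u) = u^2/2$ and the cubic term vanishes by symmetry, so the residual is about $3u^4/8$.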
Lemma 4
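Claim: For $\|y - x\|$ sufficiently small:
$f(y) = f(x) + \sum_i u_i \frac{\partial \tilde f}{\partial s_i}(0) + \frac{1}{2} \sum_{ij} u_i u_j \frac{\partial^2 \tilde f}{\partial s_i \partial s_j}(0) + P_{x,3}(u) + O(\|u\|^4)$
- In geodesic coordinates we have $f(y) = f(\exp_x(\vec s)) = \tilde f(\vec s)$ so
  $f(y) = \tilde f(0) + \sum_i s_i \frac{\partial \tilde f}{\partial s_i}(0) + \frac{1}{2} \sum_{ij} s_i s_j \frac{\partial^2 \tilde f}{\partial s_i \partial s_j}(0) + P_{x,3}(s) + O(\|s\|^4)$
- By Lemma 1, $s_i = u_i + P_{x,3}(u) + O(\|u\|^4)$ and $\tilde f(0) = f(x)$ so the claim follows directly.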