Spectral Graph Theory
Lecture 6: Conductance, the Normalized Laplacian, and Cheeger's Inequality
Daniel A. Spielman
September 21, 2015
Disclaimer
These notes are not necessarily an accurate representation of what happened in class. The notes
written before class say what I think I should say. I sometimes edit the notes after class to make
them say what I wish I had said.
There may be small mistakes, so I recommend that you check any mathematically precise statement
before using it in your own work.
These notes were last revised on September 21, 2015.
6.1 Overview
As the title suggests, in this lecture I will introduce conductance, a measure of the quality of a
cut, and the normalized Laplacian matrix of a graph. I will then prove Cheeger’s inequality, which
relates the second-smallest eigenvalue of the normalized Laplacian to the conductance of a graph.
Cheeger [Che70] first proved his famous inequality for manifolds. Many discrete versions of Cheeger’s
inequality were proved in the late 80’s [SJ89, LS88, AM85, Alo86, Dod84, Var85]. Some of these
consider the walk matrix (which we will see in a week or two) instead of the normalized Laplacian,
and some consider the isoperimetric ratio instead of conductance.
The proof that I present today follows an approach developed by Luca Trevisan [Tre11].
6.2 Conductance
Back in Lecture 2, we related the isoperimetric ratio of a subset of the vertices to the second
eigenvalue of the Laplacian. We proved that for every S ⊂ V
θ(S) ≥ λ2 (1 − s),
where s = |S| / |V | and
θ(S) := |∂(S)| / |S|.
Re-arranging terms slightly, this can be stated as

|V| |∂(S)| / (|S| |V − S|) ≥ λ2.
Cheeger’s inequality provides a relation in the other direction. However, the relation is tighter and
cleaner when we look at two slightly different quantities: the conductance of the set and the second
eigenvalue of the normalized Laplacian.
The formula for conductance has a different denominator that depends upon the sum of the degrees
of the vertices in S. I will write d(S) for the sum of the degrees of the vertices in S. Thus, d(V ) is
twice the number of edges in the graph. We define the conductance of S to be
φ(S) := |∂(S)| / min(d(S), d(V − S)).
Note that many similar, although sometimes slightly different, definitions appear in the literature.
For example, one could instead use

d(V) |∂(S)| / (d(S) d(V − S)),

which appears below in (6.5).
We define the conductance of a graph G to be
φG := min_{S⊂V} φ(S).
The conductance of a graph is more useful in many applications than the isoperimetric number.
I usually find that conductance is the more useful quantity when you are concerned about edges,
and that isoperimetric ratio is most useful when you are concerned about vertices. Conductance is
particularly useful when studying random walks in graphs.
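To make the definition concrete, here is a minimal sketch in Python; the function name `conductance` and the edge-list graph representation are my own choices, not from the lecture.

```python
# Sketch of the definition of conductance for an unweighted graph; the
# helper name and the edge-list representation are my own choices.
from collections import defaultdict

def conductance(edges, S):
    """phi(S) = |boundary(S)| / min(d(S), d(V - S))."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    boundary = sum(1 for u, v in edges if (u in S) != (v in S))
    d_S = sum(deg[u] for u in S)
    d_V = sum(deg.values())        # d(V) is twice the number of edges
    return boundary / min(d_S, d_V - d_S)

# A 4-cycle 1-2-3-4-1: the cut {1, 2} crosses two edges.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
```

For the 4-cycle, conductance(cycle, {1, 2}) = 2 / min(4, 4) = 0.5, while a single vertex or the cut {1, 3} gives conductance 1.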
6.3 The Normalized Laplacian
It seems natural to try to relate the conductance to the following generalized Rayleigh quotient:
yᵀLy / yᵀDy.    (6.1)
If we make the change of variables

D^{1/2} y = x,

then this ratio becomes

xᵀ D^{−1/2} L D^{−1/2} x / xᵀx.
That is an ordinary Rayleigh quotient, which we understand a little better. The matrix in the
middle is called the normalized Laplacian (see [Chu97]). We reserve the letter N for this matrix:
N := D^{−1/2} L D^{−1/2}.
This matrix often proves more useful when examining graphs in which nodes have different degrees.
We will let 0 = ν1 ≤ ν2 ≤ · · · ≤ νn denote the eigenvalues of N .
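As a quick sanity check, one can build N for a small graph and confirm numerically that the smallest eigenvalue is 0. This is my own illustration using NumPy, not part of the notes:

```python
# Building N = D^{-1/2} L D^{-1/2} from an adjacency matrix and checking
# that d^{1/2} is an eigenvector of eigenvalue 0. My own illustration.
import numpy as np

def normalized_laplacian(A):
    d = A.sum(axis=1)                       # degree vector
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.diag(d) - A                      # combinatorial Laplacian
    return D_inv_sqrt @ L @ D_inv_sqrt

# Path graph on three vertices: 1 - 2 - 3.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
N = normalized_laplacian(A)
nu = np.sort(np.linalg.eigvalsh(N))         # 0 = nu_1 <= nu_2 <= nu_3
d_half = np.sqrt(A.sum(axis=1))             # the eigenvector d^{1/2}
```

For this path the eigenvalues come out to 0, 1, and 2, and N d^{1/2} is the zero vector, as computed in the text below.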
The conductance is related to ν2 as the isoperimetric number is related to λ2 :
ν2/2 ≤ φG.    (6.2)
I include a proof of this in the appendix.
My goal for today’s lecture is to prove Cheeger’s inequality,
φG ≤ √(2 ν2),
which is much more interesting. In fact, it is my favorite theorem in spectral graph theory.
The eigenvector of eigenvalue 0 of N is d^{1/2}, by which I mean the vector whose entry for vertex u
is the square root of the degree of u. Observe that

D^{−1/2} L D^{−1/2} d^{1/2} = D^{−1/2} L 1 = D^{−1/2} 0 = 0.
The eigenvector of ν2 is given by

arg min_{x ⊥ d^{1/2}}  xᵀN x / xᵀx.
Transferring back to the variable y, and observing that

xᵀ d^{1/2} = yᵀ D^{1/2} d^{1/2} = yᵀ d,

we find

ν2 = min_{y ⊥ d}  yᵀLy / yᵀDy.

6.4 Cheeger's Inequality
Cheeger's inequality proves that if we have a vector y, orthogonal to d, for which the generalized
Rayleigh quotient (6.1) is small, then one can obtain a set of small conductance from y. We obtain
such a set by carefully choosing a real number t, and setting

St = {u : y(u) ≤ t}.
Theorem 6.4.1. Let y be a vector orthogonal to d. Then, there is a number t for which the set
St = {u : y(u) < t} satisfies

φ(St) ≤ √( 2 yᵀLy / yᵀDy ).
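The theorem suggests an algorithm, often called a sweep cut: sort the vertices by y(u) and examine every prefix set. The sketch below is my own rendering of that idea; the names and the edge-list representation are assumptions, not from the notes.

```python
# Sweep cut: try every threshold set S_t = {u : y(u) <= t} and return the
# one of smallest conductance. My own sketch, not code from the lecture.
from collections import defaultdict

def sweep_cut(edges, y):
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    order = sorted(adj, key=lambda u: y[u])   # vertices by increasing y(u)
    d_V = 2 * len(edges)
    S, d_S, boundary = set(), 0, 0
    best_phi, best_S = float('inf'), None
    for u in order[:-1]:                      # S_t must be a proper subset
        S.add(u)
        d_S += len(adj[u])
        for w in adj[u]:                      # edges into S leave the boundary,
            boundary += -1 if w in S else 1   # the rest join it
        phi = boundary / min(d_S, d_V - d_S)
        if phi < best_phi:
            best_phi, best_S = phi, set(S)
    return best_phi, best_S
```

On the 4-cycle 1-2-3-4-1 with y = (0, 0.1, 0.9, 1), this returns the cut {1, 2} with conductance 1/2. Maintaining the boundary incrementally keeps the whole sweep linear in the number of edge endpoints after sorting.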
Before proving the theorem, I wish to make one small point about the denominator in the expression
above: it is essentially minimized when yᵀd = 0, at least with regard to shifts.
Lemma 6.4.2. Let v_z = y + z1. Then, the minimum over z of v_zᵀ D v_z is achieved at the z for
which v_zᵀ d = 0.
Proof. The derivative with respect to z is 2 dᵀ v_z, and the minimum is achieved when this
derivative is zero.
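A small numeric check of the lemma (my own, with arbitrary degrees and y): setting the derivative 2 dᵀ(y + z1) to zero gives z = −dᵀy / d(V), and at that shift the vector is orthogonal to d.

```python
# Checking Lemma 6.4.2 numerically: minimize (y + z 1)^T D (y + z 1) over
# the shift z. The degrees and y below are arbitrary choices of mine.
import numpy as np

d = np.array([1.0, 2.0, 2.0, 1.0])     # degree vector
D = np.diag(d)
y = np.array([0.3, -1.0, 0.4, 2.0])

z_star = -(d @ y) / d.sum()            # zero of the derivative 2 d^T v_z
v = y + z_star * np.ones(4)            # the minimizing vector v_{z*}

# v is orthogonal to d, and any other shift increases the quadratic form.
```

Shifting away from z* by any dz adds dz² d(V) to the quadratic form, so the minimum is strict.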
We begin our proof of Cheeger's inequality by defining

ρ = yᵀLy / yᵀDy.

So, we need to show that there is a t for which φ(St) ≤ √(2ρ).
By renumbering the vertices, we may assume without loss of generality that
y(1) ≤ y(2) ≤ · · · ≤ y(n).
We begin with some normalization. Let j be the least number for which

Σ_{u=1}^{j} d(u) ≥ d(V)/2.
We would prefer a vector that is centered at j. So, set

z = y − y(j) 1.

This vector z satisfies z(j) = 0, and, by Lemma 6.4.2,
zᵀLz / zᵀDz ≤ ρ.
We also multiply z by a constant so that

z(1)² + z(n)² = 1.
Recall that
φ(S) = |∂(S)| / min(d(S), d(V − S)).
We will define a distribution on t for which we can prove that

E[|∂(St)|] ≤ √(2ρ) E[min(d(St), d(V − St))].

This implies¹ that there is some t for which

|∂(St)| ≤ √(2ρ) min(d(St), d(V − St)),

¹ If this is not immediately clear, note that it is equivalent to assert that
E[√(2ρ) min(d(St), d(V − St)) − |∂(St)|] ≥ 0, which means that there must be some St for which
the expression is non-negative.
which means φ(St) ≤ √(2ρ).
To switch from working with y to working with z, we now take St = {u : z(u) ≤ t}; as z is a shift
of y, these thresholds produce the same family of sets. Trevisan had the remarkable idea of
choosing t between z(1) and z(n) with probability density 2|t|. That is, the probability that t
lies in the interval [a, b] is

∫_a^b 2|t| dt.
To see that the total probability is 1, observe that

∫_{z(1)}^{z(n)} 2|t| dt = ∫_{z(1)}^{0} 2|t| dt + ∫_{0}^{z(n)} 2|t| dt = z(1)² + z(n)² = 1,

as z(1) ≤ z(j) ≤ z(n) and z(j) = 0.
Similarly, the probability that t lies in the interval [a, b] is

∫_a^b 2|t| dt = sgn(b)b² − sgn(a)a²,

where

sgn(x) = 1 if x > 0, 0 if x = 0, and −1 if x < 0.
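The closed form can be checked against a direct numerical integration of the density; this check, and the helper names in it, are mine rather than part of the notes.

```python
# Numerical sanity check that the integral of the density 2|t| over [a, b]
# equals sgn(b) b^2 - sgn(a) a^2. Helper names are my own.
def sgn(x):
    return (x > 0) - (x < 0)

def prob_interval(a, b):
    return sgn(b) * b**2 - sgn(a) * a**2

def integral_2abs(a, b, steps=100000):
    """Midpoint-rule integration of 2|t| over [a, b]."""
    h = (b - a) / steps
    return sum(2 * abs(a + (i + 0.5) * h) for i in range(steps)) * h
```

For instance, an interval straddling zero, such as [−0.6, 0.8], has probability 0.36 + 0.64 = 1, matching the two-piece computation in the text.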
Lemma 6.4.3.

E_t[|∂(St)|] = Σ_{(u,v)∈E} Pr_t[(u, v) ∈ ∂(St)] ≤ Σ_{(u,v)∈E} |z(u) − z(v)| (|z(u)| + |z(v)|).
Proof. An edge (u, v) with z(u) ≤ z(v) is on the boundary of St if

z(u) ≤ t < z(v).

The probability that this happens is

sgn(z(v))z(v)² − sgn(z(u))z(u)² = |z(v)² − z(u)²|   when sgn(z(u)) = sgn(z(v)), and
sgn(z(v))z(v)² − sgn(z(u))z(u)² = z(u)² + z(v)²    when sgn(z(u)) ≠ sgn(z(v)).

We now show that both of these terms are upper bounded by

|z(u) − z(v)| (|z(u)| + |z(v)|).

Regardless of the signs,

|z(u)² − z(v)²| = |(z(u) − z(v))(z(u) + z(v))| ≤ |z(u) − z(v)| (|z(u)| + |z(v)|).

When sgn(z(u)) = −sgn(z(v)),

z(u)² + z(v)² ≤ (z(u) − z(v))² = |z(u) − z(v)| (|z(u)| + |z(v)|).    (6.3)
We now derive a formula for the expected denominator of φ.
Lemma 6.4.4.

E_t[min(d(St), d(V − St))] = zᵀDz.
Proof. Observe that

E_t[d(St)] = Σ_u Pr_t[u ∈ St] d(u) = Σ_u Pr_t[z(u) ≤ t] d(u).

The result of our centering of z at j is that

t < 0 ⟹ d(St) = min(d(St), d(V − St)),   and
t ≥ 0 ⟹ d(V − St) = min(d(St), d(V − St)).

That is, for u < j, u is in the smaller set if t < 0; and, for u ≥ j, u is in the smaller set if
t ≥ 0. So,

E_t[min(d(St), d(V − St))]
  = Σ_{u<j} Pr[z(u) < t and t < 0] d(u) + Σ_{u≥j} Pr[z(u) > t and t ≥ 0] d(u)
  = Σ_{u<j} Pr[z(u) < t < 0] d(u) + Σ_{u≥j} Pr[z(u) > t ≥ 0] d(u)
  = Σ_{u<j} z(u)² d(u) + Σ_{u≥j} z(u)² d(u)
  = Σ_u z(u)² d(u)
  = zᵀDz.
Recall that our goal is to prove that

E[|∂(St)|] ≤ √(2ρ) E[min(d(St), d(V − St))],

and we know that

E_t[min(d(St), d(V − St))] = Σ_u z(u)² d(u)

and that

E_t[|∂(St)|] ≤ Σ_{(u,v)∈E} |z(u) − z(v)| (|z(u)| + |z(v)|).
We may use the Cauchy-Schwarz inequality to upper bound the term above by

√( Σ_{(u,v)∈E} (z(u) − z(v))² ) · √( Σ_{(u,v)∈E} (|z(u)| + |z(v)|)² ).    (6.4)
We have defined ρ so that the term under the left-hand square root is at most

ρ Σ_u z(u)² d(u).
To bound the right-hand square root, we observe

Σ_{(u,v)∈E} (|z(u)| + |z(v)|)² ≤ 2 Σ_{(u,v)∈E} (z(u)² + z(v)²) = 2 Σ_u z(u)² d(u).
Putting all these inequalities together yields

E[|∂(St)|] ≤ √( ρ Σ_u z(u)² d(u) ) · √( 2 Σ_u z(u)² d(u) )
  = √(2ρ) Σ_u z(u)² d(u)
  = √(2ρ) E[min(d(St), d(V − St))].
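The whole chain can be verified exactly on a small example by using the closed-form cut probabilities rather than sampling t. The construction below, a path on three vertices with z already centered and normalized, is my own, not from the notes.

```python
# Exact check of E[|boundary(S_t)|] <= sqrt(2 rho) * E[min(d(S_t), d(V-S_t))]
# on the path 1-2-3, with z centered (z(2) = 0) and normalized
# (z(1)^2 + z(3)^2 = 1). My own construction.
import math

def sgn(x):
    return (x > 0) - (x < 0)

edges = [(1, 2), (2, 3)]
deg = {1: 1, 2: 2, 3: 1}
a = 1 / math.sqrt(2)
z = {1: -a, 2: 0.0, 3: a}

rho = (sum((z[u] - z[v])**2 for u, v in edges)
       / sum(z[u]**2 * deg[u] for u in deg))

# E[|boundary(S_t)|]: sum of the exact cut probabilities from Lemma 6.4.3.
e_boundary = sum(abs(sgn(z[v]) * z[v]**2 - sgn(z[u]) * z[u]**2)
                 for u, v in edges)

# E[min(d(S_t), d(V - S_t))] = z^T D z, by Lemma 6.4.4.
e_min = sum(z[u]**2 * deg[u] for u in deg)
```

Here ρ = 1, E[|∂(St)|] = 1, and the bound √(2ρ) · E[min] = √2, so the inequality holds with room to spare.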
I wish to point out two important features of this proof:
1. This proof does not require y to be an eigenvector; it obtains a cut from any vector y that is
orthogonal to d.
2. This proof goes through almost without change for weighted graphs. The main difference is
that for weighted graphs we measure the sum of the weights of the edges on the boundary instead
of their number; in the proof itself, lines (6.3) and (6.4) become

E[w(∂(St))] = Σ_{(u,v)∈E} Pr[(u, v) ∈ ∂(St)] w_{u,v}
  ≤ Σ_{(u,v)∈E} |z(u) − z(v)| (|z(u)| + |z(v)|) w_{u,v}
  ≤ √( Σ_{(u,v)∈E} w_{u,v} (z(u) − z(v))² ) · √( Σ_{(u,v)∈E} w_{u,v} (|z(u)| + |z(v)|)² ),

and we observe that

Σ_{(u,v)∈E} w_{u,v} (|z(u)| + |z(v)|)² ≤ 2 Σ_u z(u)² d(u).
The only drawback that I see to the approach that we took in this proof is that the application of
Cauchy-Schwartz is a little mysterious. Shang-Hua Teng and I came up with a proof that avoids
this by introducing one inequality for each edge. If you want to see that proof, look at my notes
from 2009.
A  Proof of (6.2)
Lemma A.1. For every S ⊂ V ,
φ(S) ≥ ν2 /2.
Proof. As in Lecture 2, we would like to again use χS as a test vector. But, it is not orthogonal
to d. To fix this, we subtract a constant. Set

y = χS − σ1,
where
σ = d(S)/d(V ).
You should now check that yᵀd = 0:

yᵀd = χSᵀd − σ1ᵀd = d(S) − (d(S)/d(V)) d(V) = 0.
We already know that

yᵀLy = |∂(S)|.

It remains to compute yᵀDy. If you remember the analogous computation from Lecture 2, you would
guess that it is d(S)(1 − σ) = d(S)d(V − S)/d(V), and you would be right:
yᵀDy = Σ_{u∈S} d(u)(1 − σ)² + Σ_{u∉S} d(u)σ²
  = d(S)(1 − σ)² + d(V − S)σ²
  = d(S) − 2d(S)σ + d(V)σ²
  = d(S) − d(S)σ
  = d(S)d(V − S)/d(V).
So,

ν2 ≤ yᵀLy / yᵀDy = |∂(S)| d(V) / (d(S) d(V − S)).    (6.5)
As the larger of d(S) and d(V − S) is at least half of d(V ), we find
ν2 ≤ 2 |∂(S)| / min(d(S), d(V − S)) = 2 φ(S).
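Both directions, ν2/2 ≤ φG from this appendix and φG ≤ √(2ν2) from Cheeger's inequality, can be checked by brute force on a small graph by enumerating all nonempty proper subsets. The example graph and helper code are my own illustration.

```python
# Brute-force check of nu_2 / 2 <= phi_G <= sqrt(2 nu_2) on a small graph
# (4 vertices, 5 edges). The graph and helpers are my own illustration.
import itertools
import math
import numpy as np

A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 1.],
              [1., 1., 0., 1.],
              [0., 1., 1., 0.]])
deg = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
N = D_inv_sqrt @ (np.diag(deg) - A) @ D_inv_sqrt
nu2 = np.sort(np.linalg.eigvalsh(N))[1]

n = len(deg)
d_V = deg.sum()

def phi(S):
    boundary = sum(A[u, v] for u in S for v in range(n) if v not in S)
    d_S = sum(deg[u] for u in S)
    return boundary / min(d_S, d_V - d_S)

# phi_G = min over all nonempty proper subsets S of V.
phi_G = min(phi(S) for k in range(1, n)
            for S in itertools.combinations(range(n), k))
```

For this graph φG = 3/5, attained for example by S = {0, 1}, and both inequalities hold as the theorems guarantee.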
References
[Alo86] N. Alon. Eigenvalues and expanders. Combinatorica, 6(2):83–96, 1986.
[AM85] Noga Alon and V. D. Milman. λ1, isoperimetric inequalities for graphs, and superconcentrators. J. Comb. Theory, Ser. B, 38(1):73–88, 1985.
[Che70] J. Cheeger. A lower bound for the smallest eigenvalue of the Laplacian. In Problems in
Analysis, pages 195–199. Princeton University Press, 1970.
[Chu97] F. R. K. Chung. Spectral Graph Theory. American Mathematical Society, 1997.
[Dod84] Jozef Dodziuk. Difference equations, isoperimetric inequality and transience of certain
random walks. Transactions of the American Mathematical Society, 284(2):787–794, 1984.
[LS88] Gregory F. Lawler and Alan D. Sokal. Bounds on the L² spectrum for Markov chains and
Markov processes: A generalization of Cheeger's inequality. Transactions of the American
Mathematical Society, 309(2):557–580, 1988.
[SJ89] Alistair Sinclair and Mark Jerrum. Approximate counting, uniform generation and rapidly
mixing Markov chains. Information and Computation, 82(1):93–133, July 1989.
[Tre11] Luca Trevisan. Lecture 4 from CS359G: Graph Partitioning and Expanders, Stanford
University, January 2011. Available at http://theory.stanford.edu/~trevisan/cs359g/lecture04.pdf.
[Var85] N. Th. Varopoulos. Isoperimetric inequalities and Markov chains. Journal of Functional
Analysis, 63(2):215–239, 1985.