Summary: Normalized Cuts and Image Segmentation, by Jianbo Shi (Associate Professor at the University of Pennsylvania, Computer and Information
Sciences) and Jitendra Malik (Professor in Computer Science at University of
California, Berkeley)
Main Point: The authors introduce the normalized cut criterion and use
it to segment images by modeling the pixels in the image as vertices in a graph,
and then partitioning the graph so as to minimize the normalized cut. The
normalized cut between disjoint partitions A and B, written NCut(A, B), is defined
as:
\[
\mathrm{NCut}(A, B) = \frac{\sum_{u \in A, v \in B} w(u, v)}{\sum_{q \in A, t \in V} w(q, t)} + \frac{\sum_{u \in A, v \in B} w(u, v)}{\sum_{r \in B, s \in V} w(r, s)}
\]
The first term is the ratio of the sum of weights of edges that cross from A to B
to the sum of weights of all edges incident to A, and the second term is
defined similarly for B. A small NCut therefore corresponds to a large sum of edge
weights within each partition, relative to the sum of edge weights across partitions. The algorithm used to approximately minimize the normalized cut is known today
as the spectral bisection algorithm; it uses the eigenvector corresponding to
the second-smallest eigenvalue of a generalized eigenproblem on the graph Laplacian matrix to partition the vertices into
two groups.
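To make the definition concrete, the sketch below evaluates NCut for a given two-way partition of a small weighted graph. The function name `ncut_value` and the toy graph (two triangles joined by one weak edge) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ncut_value(W, in_A):
    """Normalized cut of the partition (A, B) of a weighted graph.

    W    : symmetric (n, n) array of non-negative edge weights
    in_A : boolean sequence of length n, True for vertices in A
    Both A and B are assumed to be non-empty with positive total degree.
    """
    W = np.asarray(W, dtype=float)
    in_A = np.asarray(in_A, dtype=bool)
    cut = W[np.ix_(in_A, ~in_A)].sum()   # weight of edges crossing from A to B
    assoc_A = W[in_A, :].sum()           # weight of all edges incident to A
    assoc_B = W[~in_A, :].sum()          # weight of all edges incident to B
    return cut / assoc_A + cut / assoc_B

# Hypothetical toy graph: two unit-weight triangles joined by one edge of weight 0.1
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

good = ncut_value(W, [True, True, True, False, False, False])    # split at the weak edge
bad = ncut_value(W, [True, False, False, False, False, False])   # isolate one vertex
print(good, bad)
```

Splitting at the weak edge gives a much smaller value than isolating a single vertex, which is the behavior the normalization is meant to produce.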
As presented, Shi and Malik’s algorithm is hierarchical, where at each level
the clusters found in the previous level are split in two using spectral bisection.
The first level is the entire graph where vertices represent pixels in an image
and edge weights between pixels are given by considering the distance between
these two pixels in the plane of the image, as well as the difference in brightness
values of the two pixels. More specifically,
\[
w_{ij} =
\begin{cases}
e^{-\|F_i - F_j\|_2^2 / \sigma_I^2} \cdot e^{-\|X_i - X_j\|_2^2 / \sigma_X^2}, & \text{if } \|X_i - X_j\|_2 < r \\
0, & \text{otherwise}
\end{cases}
\]
where F_i is the intensity of pixel i, X_i is the position of pixel i, σ_I and σ_X set the scales of the intensity and distance terms, and r is the neighborhood radius.
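As a minimal sketch of the weight definition, the code below builds the dense weight matrix for a tiny grayscale image. The function name `affinity_matrix` and the default values of `r`, `sigma_I`, and `sigma_X` are assumptions chosen for illustration; the summary does not state the values used in the paper. A dense n × n matrix is only practical for very small images.

```python
import numpy as np

def affinity_matrix(image, r=2.0, sigma_I=0.1, sigma_X=4.0):
    """Dense weight matrix W for a 2-D grayscale image (illustrative parameters)."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    X = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)  # pixel positions, (n, 2)
    F = image.ravel()                                                 # pixel intensities, (n,)
    dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)        # squared spatial distances
    feat2 = (F[:, None] - F[None, :]) ** 2                            # squared intensity differences
    W = np.exp(-feat2 / sigma_I**2) * np.exp(-dist2 / sigma_X**2)
    W[dist2 >= r**2] = 0.0   # pixels at distance r or more are not connected
    # The formula as written also gives w_ii = 1; the diagonal cancels in D - W.
    return W

# Hypothetical 2 x 4 image: dark left half, bright right half
image = np.array([[0.0, 0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0, 1.0]])
W = affinity_matrix(image)
print(W.shape)
```

Neighboring pixels with equal intensity get a weight near 1, while neighbors across the intensity edge get a weight near 0.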
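To partition the graph into two pieces, the objective is to find partitions A and
B that produce the minimal NCut on the graph. The total weight incident to A is the weight within A plus the cut weight (and likewise for B), so each term of NCut is an increasing function of the corresponding ratio of cut weight to within-cluster weight. A small NCut therefore goes hand in hand with a small value of:
\[
\frac{\sum_{u \in A, v \in B} w(u, v)}{\sum_{q \in A, t \in A} w(q, t)} + \frac{\sum_{u \in A, v \in B} w(u, v)}{\sum_{r \in B, s \in B} w(r, s)}
= \left( \frac{1}{\sum_{q \in A, t \in A} w(q, t)} + \frac{1}{\sum_{r \in B, s \in B} w(r, s)} \right) \sum_{u \in A, v \in B} w(u, v)
\]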
In other words, the objective of NCut is to “partition the set of vertices into disjoint sets V_1, ..., V_k, where by some measure the similarity among the vertices in
a set V_i is high and, across different sets V_i, V_j is low.” The cut $\sum_{u \in A, v \in B} w(u, v)$
between A and B is thus normalized by the within-cluster “associations” of A
and B, which are the sums of edge weights within each cluster. The authors point
out that finding an A and B that minimize the normalized cut is NP-hard
(a result they attribute to Papadimitriou). They go on to show how to approximately find A
and B by finding generalized eigenvectors of the matrix D − W with respect to D, where D is the
diagonal matrix of vertex degrees and W is the symmetric matrix of edge weights
w_ij defined above. The matrix D − W is today known as the Laplacian
matrix L. Shi and Malik’s algorithm, which recursively splits the graph into
two pieces using these eigenvectors, is known as spectral bisection.
Finding the minimal NCut is equivalent to finding the “indicator” vector x,
where x_i = 1 if node i is in partition A and x_i = −1 if node i is in partition B,
that attains:
\[
\min_x \mathrm{NCut}(x) = \min_x \left[ \frac{\sum_{x_i > 0, x_j < 0} -w_{ij} x_i x_j}{\sum_{x_i > 0} d_i} + \frac{\sum_{x_i < 0, x_j > 0} -w_{ij} x_i x_j}{\sum_{x_i < 0} d_i} \right]
\]
Shi and Malik prove that if we set
\[
y = (1 + x) - \frac{\sum_{x_i > 0} d_i}{\sum_{x_i < 0} d_i} (1 - x),
\]
where d_i is the degree of node i, then:
\[
\min_x \mathrm{NCut}(x) = \min_y \frac{y^T (D - W) y}{y^T D y}
\]
subject to
\[
y_i \in \left\{ 1, \; -\frac{\sum_{x_i > 0} d_i}{\sum_{x_i < 0} d_i} \right\} \quad \text{and} \quad y^T D \mathbf{1} = 0.
\]
This is in fact a Rayleigh quotient. Minimizing the above quotient over y is of course still NP-hard, because
we have constrained y to take on only discrete values. This is where the approximation comes in. If we drop the discreteness constraint on y, letting y be any
real-valued vector, the Rayleigh quotient above can be minimized by solving
the generalized eigenvalue problem:
\[
(D - W) y = \lambda D y \quad \text{subject to } y^T D \mathbf{1} = 0
\]
We can find such a y by Lanczos iteration.
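The relaxed problem can be tried out on a small graph with the sketch below. It is not the Lanczos procedure mentioned above: for brevity it substitutes z = D^{1/2} y, which turns the generalized problem into an ordinary symmetric one, and calls NumPy's dense solver `numpy.linalg.eigh`. For image-sized graphs a sparse Lanczos-type solver would be the practical choice. The function name `relaxed_ncut_eigen` and the toy graph are assumptions.

```python
import numpy as np

def relaxed_ncut_eigen(W):
    """Solve (D - W) y = lambda D y for a small dense graph.

    Returns eigenvalues in ascending order and the matching generalized
    eigenvectors as the columns of Y. Assumes every vertex has positive degree.
    """
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                        # vertex degrees
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # D^{-1/2} (D - W) D^{-1/2} = I - D^{-1/2} W D^{-1/2}
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, Z = np.linalg.eigh(L_sym)       # ordinary symmetric eigenproblem
    Y = d_inv_sqrt[:, None] * Z              # map back: y = D^{-1/2} z
    return eigvals, Y

# Hypothetical toy graph: two unit-weight triangles joined by one edge of weight 0.1
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

eigvals, Y = relaxed_ncut_eigen(W)
d = W.sum(axis=1)
y2 = Y[:, 1]                                        # eigenvector of the second-smallest eigenvalue
residual = np.abs((np.diag(d) - W) @ y2 - eigvals[1] * d * y2).max()
constraint = abs(y2 @ d)                            # y^T D 1, should vanish
print(eigvals[1], np.sign(y2))
```

On this graph the signs of `y2` separate the two triangles, and `eigvals[1]` does not exceed the NCut value of that split.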
This means that the second-smallest eigenvalue λ_2 of the generalized eigenvalue
problem above is the minimum of the relaxed problem, and hence a lower bound on the minimal NCut; it is attained by the real-valued eigenvector y_2. It remains to map the entries of y_2 back to discrete values
indicating whether node i, corresponding to entry y_i, belongs to partition A or
partition B. The authors do this by grouping the entries y_i into
two parts, where all entries in the same part have approximately the same value.
In other words, the hope is that the vector y will be approximately piecewise
constant, with every entry close to one of two distinct values. The
two groups of entries then partition the pixels: pixel i belongs to partition A if y_i falls in the first group and to partition B
if it falls in the second. This completes the spectral bisection of the
graph.
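One simple way to carry out the grouping step is sketched below: try a threshold between each pair of consecutive sorted entries of the second eigenvector and keep the split with the lowest NCut. This threshold search is an assumption about how to realize the grouping described above, and the helper names `ncut_value` and `spectral_bisection` are hypothetical.

```python
import numpy as np

def ncut_value(W, in_A):
    """NCut of the partition given by the boolean mask in_A."""
    cut = W[np.ix_(in_A, ~in_A)].sum()
    return cut / W[in_A, :].sum() + cut / W[~in_A, :].sum()

def spectral_bisection(W):
    """Split a small dense graph in two using the second generalized eigenvector."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    s = 1.0 / np.sqrt(d)
    # Dense solve of (D - W) y = lambda D y via the substitution z = D^{1/2} y
    _, Z = np.linalg.eigh(np.eye(len(d)) - s[:, None] * W * s[None, :])
    y2 = s * Z[:, 1]
    ys = np.sort(y2)
    best_mask, best_val = None, np.inf
    # Candidate thresholds: midpoints between consecutive sorted entries of y2
    for t in (ys[:-1] + ys[1:]) / 2.0:
        mask = y2 > t
        if mask.all() or not mask.any():
            continue                          # skip degenerate splits
        val = ncut_value(W, mask)
        if val < best_val:
            best_mask, best_val = mask, val
    return best_mask, best_val

# Hypothetical toy graph: two unit-weight triangles joined by one edge of weight 0.1
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

mask, value = spectral_bisection(W)
parts = sorted([np.flatnonzero(mask).tolist(), np.flatnonzero(~mask).tolist()])
print(parts, value)
```

Applying `spectral_bisection` again to the subgraph induced by each part would give the hierarchical scheme described earlier.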
The authors go on to discuss a variant of spectral bisection in which the first
k eigenvectors of the generalized problem (D − W)y = λDy are used to partition the graph into k pieces, instead of
recursively splitting partitions of the graph into two. This can be done because
the eigenvector corresponding to the kth smallest eigenvalue is shown to be the
real-valued solution that optimally subpartitions the first k − 1 parts. To find
k partitions using these eigenvectors, n k-dimensional vectors U_i representing
the n pixels are constructed by letting U_i = (y_i^1, y_i^2, ..., y_i^k), where y^j is the
(j + 1)st eigenvector (the first eigenvector is constant and is skipped). Then k-means
is used to cluster the U_i into k groups, and pixel i’s partition is chosen according to the group of vector U_i. This is the algorithm generally referred to as
the “spectral clustering” algorithm. The authors note that they “have experimented with this simultaneous k-way cut method” but show results only for
the spectral bisection algorithm in the paper. The results shown are examples
of images segmented by spectral bisection, which appears to have
partitioned them successfully.
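The k-way variant can be sketched as follows, under several stated assumptions. The embedding here keeps the k eigenvectors with the smallest eigenvalues, including the constant one (which does not change distances between rows); skipping it, as in the description above, would mean taking columns 1 through k instead. The k-means routine is a plain Lloyd iteration with a deterministic farthest-point initialization, chosen only to keep the example reproducible. All function names and the toy graph are hypothetical.

```python
import numpy as np

def spectral_embedding(W, k):
    """Rows are the vectors U_i built from k generalized eigenvectors of (D - W, D)."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)
    s = 1.0 / np.sqrt(d)
    _, Z = np.linalg.eigh(np.eye(len(d)) - s[:, None] * W * s[None, :])
    return (s[:, None] * Z)[:, :k]            # assumption: keep the constant eigenvector

def kmeans(U, k, n_iter=50):
    """Plain Lloyd iteration with farthest-point initialization (deterministic)."""
    centers = [U[0]]
    for _ in range(k - 1):
        dist = np.min([((U - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])    # next center: point farthest from chosen ones
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)            # assign each point to its nearest center
        new_centers = np.array([
            U[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical toy graph: three unit-weight triangles chained by two edges of weight 0.1
W = np.zeros((9, 9))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (6, 7), (6, 8), (7, 8)]:
    W[i, j] = W[j, i] = 1.0
for i, j in [(2, 3), (5, 6)]:
    W[i, j] = W[j, i] = 0.1

labels = kmeans(spectral_embedding(W, 3), 3)
print(labels)
```

The cluster labels themselves are arbitrary; only the grouping of vertices is meaningful.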
Interesting connections are drawn between Shi and Malik’s algorithm and other
concepts in spectral graph theory. A list of the connections described is given
below:
• Fan Chung proposed a normalized definition of the graph Laplacian as
D^{-1/2} (D − W) D^{-1/2}. If we multiply the eigenvectors of this normalized
Laplacian by D^{-1/2}, we obtain exactly the generalized eigenvectors y^j used to compute
the normalized cut; equivalently, the vectors D^{1/2} y^j are the
eigenvectors of the normalized Laplacian.
• The Cheeger constant h_G relates to the eigenvalues in (D − W)y = λDy
in the following way:
\[
2 h_G \ge \lambda_1 > \frac{h_G^2}{2}
\]
• The ratio cut RCut of a graph is defined similarly to the normalized cut:
\[
\mathrm{RCut} = \frac{\sum_{u \in A, v \in V \setminus A} w(u, v)}{\min(|A|, |V \setminus A|)}
\]
However, experimentally the NCut criterion performs better than RCut,
as is expected based on Chung’s explanation in Spectral Graph Theory.
• The conductance of a random walk on the graph defined by nodes i and
edge weights W_ij “can be shown [to be] the normalized cut value and
the normalized cut vectors y in (D − W)y = λDy are exactly the right
eigenvectors of [the probability transition matrix] P”, where P is simply
D^{-1} W.
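Two of the connections listed above can be checked numerically on a small graph. The sketch below verifies that the vectors D^{1/2} y are eigenvectors of the normalized Laplacian D^{-1/2} (D − W) D^{-1/2} with the same eigenvalues λ, and that each y is a right eigenvector of P = D^{-1} W with eigenvalue 1 − λ (which follows from rearranging (D − W)y = λDy). The toy graph and variable names are assumptions.

```python
import numpy as np

# Hypothetical toy graph: two unit-weight triangles joined by one edge of weight 0.1
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

d = W.sum(axis=1)
D = np.diag(d)
s = 1.0 / np.sqrt(d)

L_norm = s[:, None] * (D - W) * s[None, :]    # D^{-1/2} (D - W) D^{-1/2}
lam, Z = np.linalg.eigh(L_norm)               # eigenpairs of the normalized Laplacian
Y = s[:, None] * Z                            # columns y = D^{-1/2} z
P = W / d[:, None]                            # random-walk transition matrix D^{-1} W

# Generalized problem: (D - W) y = lambda D y, one column per eigenpair
gen_residual = np.abs((D - W) @ Y - (D @ Y) * lam).max()
# Normalized Laplacian: D^{1/2} y is an eigenvector with the same lambda
sqrt_d_Y = np.sqrt(d)[:, None] * Y
norm_residual = np.abs(L_norm @ sqrt_d_Y - sqrt_d_Y * lam).max()
# Random walk: P y = (1 - lambda) y
walk_residual = np.abs(P @ Y - Y * (1.0 - lam)).max()
print(gen_residual, norm_residual, walk_residual)
```

All three residuals should be at the level of floating-point round-off.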