A Class of Problems
We use
• Numerical continuation
• Bifurcation theory with symmetries
to analyze a class of optimization problems of the form
$\max_{q \in \Delta} F(q,\beta) = \max_{q \in \Delta} \left( G(q) + \beta D(q) \right)$.
The goal is to solve for $\beta = B \in (0,\infty)$, where:
• $\Delta := \{ q(z|y) \;|\; \sum_{z \in Z} q(z|y) = 1,\ q(z|y) \ge 0,\ \forall y \in Y \} \subset \mathbb{R}^n$.
• G and D are infinitely differentiable in $\Delta$.
• G is strictly concave.
• D is convex.
• G and D must be invariant under relabeling of the classes.
• The Hessian of F is block diagonal with N blocks $\{B_\nu\}$, and $B_\nu = B_\eta$ if $q(z_\nu|y) = q(z_\eta|y)$ for every $y \in Y$.
Problems in this Class
• Deterministic Annealing (Rose 1998): $\max_q H(Z|Y) - \beta D(Y,Z)$. A clustering algorithm.
• Rate Distortion Theory (Shannon ~1950): $\max_q -I(Y,Z) - \beta D(Y,Z)$. Optimal source coding.
• Information Distortion (Dimitrov and Miller 2001): $\max_q H(Z|Y) + \beta I(X,Z)$. Used in neural coding.
• Information Bottleneck Method (Tishby, Pereira, Bialek 2000): $\max_q -I(Y,Z) + \beta I(X,Z)$. Used for document classification, gene expression, neural coding and spectral analysis.
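As an illustration of one member of this class, the Information Distortion cost $F(q,\beta) = H(Z|Y) + \beta I(X,Z)$ can be evaluated numerically. The sketch below assumes the Markov chain X → Y → Z, so that $p(x,z) = \sum_y p(x,y) q(z|y)$, and measures information in bits. The array layout `q[z, y]`, the function names and the toy joint distribution `p_xy` are hypothetical choices made for this example, not part of the original slides.

```python
import numpy as np

def cond_entropy(q, p_y):
    """H(Z|Y) in bits, with q[z, y] = q(z|y) and p_y[y] = p(y)."""
    safe = np.where(q > 0, q, 1.0)           # log2(1) = 0 covers q = 0 entries
    return -np.sum(p_y * np.sum(q * np.log2(safe), axis=0))

def mutual_info_xz(q, p_xy):
    """I(X;Z) in bits, assuming the Markov chain X -> Y -> Z."""
    p_xz = p_xy @ q.T                        # p(x,z) = sum_y p(x,y) q(z|y)
    p_x = p_xz.sum(axis=1, keepdims=True)
    p_z = p_xz.sum(axis=0, keepdims=True)
    mask = p_xz > 0
    ratio = p_xz[mask] / (p_x @ p_z)[mask]
    return np.sum(p_xz[mask] * np.log2(ratio))

def info_distortion(q, p_xy, beta):
    """F(q, beta) = H(Z|Y) + beta * I(X;Z)."""
    return cond_entropy(q, p_xy.sum(axis=0)) + beta * mutual_info_xz(q, p_xy)

# Hypothetical toy joint distribution p(x, y) with two inputs and two outputs.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
q_uniform = np.full((2, 2), 0.5)             # q(z|y) = 1/N
q_relabeled = np.array([[0.2, 0.7],
                        [0.8, 0.3]])
F_uniform = info_distortion(q_uniform, p_xy, 2.0)
F_a = info_distortion(q_relabeled, p_xy, 2.0)
F_b = info_distortion(q_relabeled[::-1], p_xy, 2.0)   # swap the two classes
```

Swapping the rows of `q` relabels the classes of Z, so `F_a` and `F_b` agree, which is the invariance the class of problems requires.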
Rate Distortion
How well is the source X represented by Z?
Z is a representation of X using N symbols (or clusters).
[Figure: the source distribution p(X) over X.]
Information Distortion
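A good communication system has p(X,Y) like:
[Figure: the (x,y) pairs of p(X,Y), with X on one axis and Y on the other, grouped into distinguishable input/output classes numbered 1 to 4.]
• $2^{H(X)}$ input sequences
• $2^{H(Y)}$ output sequences
• $2^{I(X,Y)}$ distinguishable input/output classes of (x,y) pairs
• Size of an input/output class: $2^{H(X|Y) + H(Y|X)}$ pairs
[Figure: input source X → P(Y|X) → output source Y → q*(Z|Y) → clustered outputs Z, with Q*(Z|X) relating X to Z.]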
Goal: Determine the input/output classes of (x,y) pairs.
Idea: We seek to quantize (X,Y) into clusters which correspond to the input/output classes.
Method: We determine a quantizer, Q*, between X and Z, a representation of Y using N elements, such that F(Q*,B) is a maximum for some $B \in (0,\infty)$.
Some nice properties of the problem
The feasible region $\Delta$, a product of simplices, is nice.
Lemma: $\Delta$ is the convex hull of vertices($\Delta$).
The optimal quantizer q* is DETERMINISTIC.
[Figure: the simplices for y1, y2 and y3, shown in two panels.]
Theorem: The extrema of the optimization problem lie generically on the vertices of $\Delta$.
Corollary: The optimal quantizer is invariant to small perturbations in the model.
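To make the link between vertices and deterministic quantizers concrete, the following sketch enumerates the vertices of a product of simplices. Each vertex assigns every output y to exactly one class z, so there are $N^K$ of them for N classes and K outputs. The function name and the small sizes are hypothetical choices for illustration.

```python
from itertools import product

def vertices(n_classes, n_outputs):
    """Yield each vertex of Delta as a table q[z][y] with entries in {0, 1}."""
    for assignment in product(range(n_classes), repeat=n_outputs):
        # assignment[y] is the single class that output y is mapped to.
        yield [[1 if assignment[y] == z else 0 for y in range(n_outputs)]
               for z in range(n_classes)]

verts = list(vertices(2, 3))   # N = 2 classes, K = 3 outputs
column_sums = [[sum(q[z][y] for z in range(2)) for y in range(3)] for q in verts]
```

Every column of every vertex sums to one, so each vertex is a feasible quantizer that maps each y to one class with probability 1.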
Solution of the problem when p(X,Y) := 4 Gaussian blobs
[Figures: p(X,Y); I(X,Z) vs. N.]
The Dynamical System
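Goal: To efficiently solve $\max_{q \in \Delta} (G(q) + \beta D(q))$ for each $\beta$, incremented in sufficiently small steps, as $\beta \to B$.
Method: Study the equilibria of the flow
$\begin{pmatrix} \dot q \\ \dot \lambda \end{pmatrix} = \nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta) := \nabla_{q,\lambda} \left( G(q) + \beta D(q) + \sum_{y \in Y} \lambda_y \left( \sum_z q(z|y) - 1 \right) \right)$.
• The Jacobian with respect to q of the K constraints $\{\sum_z q(z|y) - 1\}$ is $J = (I_K \; I_K \; \dots \; I_K)$.
• The first equilibrium is $q^*(\beta_0 = 0) \equiv 1/N$.
• The Hessian $\Delta_{q,\lambda} \mathcal{L}(q,\lambda,\beta) = \begin{pmatrix} \Delta_q F & J^T \\ J & 0 \end{pmatrix}$ determines stability and location of bifurcation.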
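The constraint Jacobian and the bordered Hessian of the Lagrangian can be assembled directly from these definitions. The sketch below assumes q is stored as N consecutive blocks of length K, one block per class z; the function names and the placeholder Hessian `-I` are hypothetical and stand in for the Hessian of a real cost function.

```python
import numpy as np

def constraint_jacobian(n_classes, n_outputs):
    """J = (I_K I_K ... I_K): one identity block per class."""
    return np.hstack([np.eye(n_outputs)] * n_classes)

def bordered_hessian(hess_F, J):
    """Hessian of the Lagrangian: [[Hess_q F, J^T], [J, 0]]."""
    K = J.shape[0]
    return np.block([[hess_F, J.T],
                     [J, np.zeros((K, K))]])

N, K = 3, 2
J = constraint_jacobian(N, K)
q_uniform = np.full(N * K, 1.0 / N)          # the first equilibrium, q = 1/N
hess_F = -np.eye(N * K)                      # placeholder Hessian (assumption)
hess_L = bordered_hessian(hess_F, J)
```

With this layout, `J @ q` returns the K sums $\sum_z q(z|y)$, so the uniform quantizer satisfies every constraint.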
Assumptions:
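• Let q* be a local solution to the optimization problem and fixed by $S_M$.
• Call the M identical blocks of $\Delta_q F(q^*,\beta)$: B. Call the other N-M blocks of $\Delta_q F(q^*,\beta)$: $\{R_\nu\}$.
• At a singularity $(q^*,\lambda^*,\beta^*)$, B has a single nullvector v and $R_\nu$ is nonsingular for every $\nu$.
• If M<N, then $B \sum_\nu R_\nu^{-1} + M I_K$ is nonsingular.
Theorem: If $\Delta_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)$ is singular then $\Delta_q F(q^*,\beta^*)$ is singular.
Theorem: $(q^*,\lambda^*,\beta^*)$ is a bifurcation of equilibria of the flow if and only if $\Delta_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)$ is singular.
Theorem: If $(q^*,\lambda^*,\beta^*)$ is a bifurcation of equilibria of the flow, then $\beta^* \ge 1$.
Theorem: $\dim(\ker \Delta_q F(q^*,\beta^*)) = M$ with basis vectors $w_1, w_2, \dots, w_M$, where
$[w_i]_\nu = \begin{cases} v & \text{if } \nu \text{ is the } i\text{th unresolved class} \\ 0 & \text{otherwise} \end{cases}$
Theorem: $\dim(\ker \Delta_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)) = M-1$ with basis vectors
$\left\{ \begin{pmatrix} w_i \\ 0 \end{pmatrix} - \begin{pmatrix} w_M \\ 0 \end{pmatrix} \right\}_{i=1}^{M-1}$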
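The two kernel dimensions can be checked numerically on a synthetic example. The sketch below builds a block diagonal Hessian with M identical blocks B (each with a single nullvector v) and N-M nonsingular blocks R, borders it with the constraint Jacobian J = (I_K ... I_K), and counts near-zero singular values. The random matrices, the sizes and the tolerance are assumptions made for illustration; they are not data from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 4, 3, 3                            # classes, unresolved classes, outputs

def neg_def(k):
    """Random symmetric negative definite k x k matrix."""
    a = rng.standard_normal((k, k))
    return -(a @ a.T + k * np.eye(k))

v = rng.standard_normal(K)
P = np.eye(K) - np.outer(v, v) / (v @ v)     # projector onto the complement of v
B = P @ neg_def(K) @ P                       # symmetric, single nullvector v
R = [neg_def(K) for _ in range(N - M)]       # nonsingular resolved blocks

hess_F = np.zeros((N * K, N * K))
for i, blk in enumerate([B] * M + R):
    hess_F[i * K:(i + 1) * K, i * K:(i + 1) * K] = blk
J = np.hstack([np.eye(K)] * N)
hess_L = np.block([[hess_F, J.T], [J, np.zeros((K, K))]])

def nullity(a, tol=1e-9):
    """Number of singular values below tol."""
    return int(np.sum(np.linalg.svd(a, compute_uv=False) < tol))

def w(i):
    """Basis vector w_i: v in the i-th unresolved block, zero elsewhere."""
    out = np.zeros(N * K)
    out[i * K:(i + 1) * K] = v
    return out

# Candidate kernel vector of the bordered Hessian: (w_1 - w_M, 0).
kernel_vec = np.concatenate([w(0) - w(M - 1), np.zeros(K)])
dims = (nullity(hess_F), nullity(hess_L))
```

In this construction B is negative semidefinite and every R is negative definite, so $B \sum_\nu R_\nu^{-1} + M I_K$ has eigenvalues of at least M and the nonsingularity assumption holds.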
Investigating the Dynamical System
How:
• Use numerical continuation in a constrained system to choose $\beta$ and to choose an initial guess to find the equilibria $q^*(\beta)$.
• Use bifurcation theory with symmetries to understand bifurcations of the equilibria.
Continuation
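[Figure: one continuation step along the branch $q^*$, from the equilibrium $(q_k, \beta_k)$ along the tangent $\partial_\beta q_k$ to the initial guess $(q_{k+1}^{(0)}, \beta_{k+1}^{(0)})$, and from there to the equilibrium $(q_{k+1}, \beta_{k+1})$.]
• A local maximum $q_k^*(\beta_k)$ of the optimization problem is an equilibrium of the gradient flow.
• Initial condition $q_{k+1}^{(0)}(\beta_{k+1}^{(0)})$ is sought in the tangent direction $\partial_\beta q_k$, which is found by solving the matrix system
$\Delta_{q,\lambda} \mathcal{L}(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda} \mathcal{L}(q_k,\lambda_k,\beta_k)$.
• The continuation algorithm used to find $q_{k+1}^*(\beta_{k+1})$ is based on Newton's method.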
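The tangent system is a single linear solve. Since $\mathcal{L} = G + \beta D + \sum_y \lambda_y (\sum_z q(z|y) - 1)$, differentiating the gradient with respect to $\beta$ leaves $(\nabla_q D, 0)$ on the right-hand side. The sketch below is a minimal version under that observation; the function name, the placeholder Hessian and the gradient values are hypothetical.

```python
import numpy as np

def tangent(hess_F, J, grad_D):
    """Solve Hess(L) [dq; dlam] = -d/dbeta grad(L) = -[grad_D; 0]."""
    n, K = hess_F.shape[0], J.shape[0]
    hess_L = np.block([[hess_F, J.T],
                       [J, np.zeros((K, K))]])
    rhs = -np.concatenate([grad_D, np.zeros(K)])
    sol = np.linalg.solve(hess_L, rhs)
    return sol[:n], sol[n:]

# Toy data (assumptions): N = 2 classes, K = 2 outputs.
J = np.hstack([np.eye(2), np.eye(2)])
hess_F = -np.eye(4)                          # placeholder Hessian of F
grad_D = np.array([1.0, 0.0, 0.0, 2.0])      # placeholder gradient of D
dq, dlam = tangent(hess_F, J, grad_D)
```

The second block row of the system enforces `J @ dq = 0`, so a step along `dq` keeps every constraint $\sum_z q(z|y) = 1$ satisfied to first order.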
Conceptual Bifurcation Structure
[Figure: conceptual bifurcation diagram of $q^*(Y_N|Y)$, with the value 1/N marked. Caption: Bifurcations of $q^*(\beta)$.]
Observed Bifurcations for the 4 Blob Problem
Bifurcations with symmetry
To better understand the bifurcation structure, we capitalize on the symmetries of the optimization function $F(q,\beta)$.
The “obvious” symmetry is that $F(q,\beta)$ is invariant to relabeling of the N classes of Z.
The symmetry group of all permutations on N symbols is $S_N$.
The action of $S_N$ on $\begin{pmatrix} q \\ \lambda \end{pmatrix}$ and $\nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta)$ is represented by the finite Lie group
$\Gamma := \left\{ \begin{pmatrix} P & 0_{n \times K} \\ 0_{K \times n} & I_{K \times K} \end{pmatrix} \right\}$,
where P is a “block permutation” matrix.
The symmetry of $\begin{pmatrix} q \\ \lambda \end{pmatrix}$ is measured by its isotropy group, the subgroup of $\Gamma$ which fixes it.
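A block permutation matrix can be written as a Kronecker product of an N × N permutation matrix with $I_K$. The sketch below builds P and the corresponding group element acting on (q, λ), assuming q is stored as N blocks of length K and that the multipliers λ are left untouched. The helper names and the example permutation are hypothetical.

```python
import numpy as np

def block_permutation(perm, K):
    """P moves block perm[i] of q into block position i."""
    N = len(perm)
    S = np.zeros((N, N))
    for new, old in enumerate(perm):
        S[new, old] = 1.0
    return np.kron(S, np.eye(K))

def gamma_element(perm, K):
    """Group element acting on (q, lambda): blockdiag(P, I_K)."""
    P = block_permutation(perm, K)
    n = P.shape[0]
    return np.block([[P, np.zeros((n, K))],
                     [np.zeros((K, n)), np.eye(K)]])

K = 2
perm = [1, 0, 2]                             # swap classes 1 and 2 of N = 3
P = block_permutation(perm, K)
g = gamma_element(perm, K)
J = np.hstack([np.eye(K)] * 3)
q = np.arange(6.0)                           # blocks [0,1], [2,3], [4,5]
```

Because J is a row of identical identity blocks, `J @ P` equals `J`, so relabeling the classes leaves the constraints unchanged.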
What do the bifurcations look like?
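The Equivariant Branching Lemma gives the existence of bifurcating solutions for every isotropy subgroup which fixes a one dimensional subspace of $\ker \Delta_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)$.
Theorem: Let $(q^*,\lambda^*,\beta^*)$ be a singular point of the flow
$\begin{pmatrix} \dot q \\ \dot \lambda \end{pmatrix} = \nabla_{q,\lambda} \mathcal{L}(q,\lambda,\beta)$
such that q* is fixed by $S_M$. Then there exist M bifurcating solutions, $(q^*,\lambda^*,\beta^*) + (t u_k, 0, \beta(t))$, each with isotropy group $S_{M-1}$, where
$[u_k]_\nu = \begin{cases} (M-1)v & \text{if } \nu \text{ is the } k\text{th unresolved class} \\ -v & \text{if } \nu \neq k \text{ is any other unresolved class} \\ 0 & \text{otherwise} \end{cases}$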
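The bifurcating directions $u_k$ are simple to assemble once the unresolved classes and the nullvector v are known. The sketch below follows the case definition in the theorem; the function name, the zero-based class indices and the sample nullvector are hypothetical.

```python
import numpy as np

def bifurcating_direction(k, unresolved, n_classes, v):
    """u_k: (M-1)v on the k-th unresolved class, -v on the other unresolved
    classes, and 0 on every resolved class."""
    v = np.asarray(v, dtype=float)
    M = len(unresolved)
    blocks = []
    for cls in range(n_classes):
        if cls == unresolved[k]:
            blocks.append((M - 1) * v)
        elif cls in unresolved:
            blocks.append(-v)
        else:
            blocks.append(np.zeros_like(v))
    return np.concatenate(blocks)

v = [1.0, -1.0]                              # placeholder nullvector, K = 2
# All N = 4 classes unresolved (M = 4), third class singled out: (-v, -v, 3v, -v).
u = bifurcating_direction(2, [0, 1, 2, 3], 4, v)
J = np.hstack([np.eye(2)] * 4)
```

The blocks of $u_k$ sum to zero, so `J @ u` vanishes and a step $q^* + t u_k$ preserves the constraints.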
Bifurcation Structure
Let $T(q^*,\beta^*) = \sum_{k,m,l} \frac{\partial^3 F(q^*,\beta^*)}{\partial q_k \, \partial q_m \, \partial q_l} [v]_k [v]_m [v]_l$.
Transcritical or Degenerate?
Theorem: If $T(q^*,\beta^*) \neq 0$ and M>2, then the bifurcation at $(q^*,\beta^*)$ is transcritical. If $T(q^*,\beta^*) = 0$, it is degenerate.
Branch Orientation?
Theorem: If $T(q^*,\beta^*) > 0$ or if $T(q^*,\beta^*) < 0$, then the branch is supercritical or subcritical respectively. If $T(q^*,\beta^*) = 0$, then $\partial^4_{qqqq} F(q,\beta)$ dictates orientation.
Branch Stability?
Theorem: If $T(q^*,\beta^*) \neq 0$, then all branches fixed by $S_{M-1}$ are unstable.
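Given the array of third derivatives, T is a triple contraction with the nullvector v. The sketch below computes it with `numpy.einsum` and applies the transcritical/degenerate test. The tensor used here comes from the toy function $\sum_k q_k^3$ and is an assumption for illustration only, as are the function names and the tolerance.

```python
import numpy as np

def cubic_form(d3F, v):
    """T = sum_{k,m,l} d3F[k, m, l] * v[k] * v[m] * v[l]."""
    return float(np.einsum("kml,k,m,l->", d3F, v, v, v))

def classify(T, M, tol=1e-12):
    """Apply the transcritical/degenerate test to T."""
    if abs(T) <= tol:
        return "degenerate"
    if M > 2:
        return "transcritical"
    return "not determined by this test"

# Toy third-derivative tensor of sum_k q_k**3: 6 on the diagonal, 0 elsewhere.
K = 2
d3F = np.zeros((K, K, K))
for k in range(K):
    d3F[k, k, k] = 6.0

T = cubic_form(d3F, np.array([1.0, -2.0]))       # 6 * (1 - 8)
T_zero = cubic_form(d3F, np.array([1.0, -1.0]))  # 6 * (1 - 1)
```

The case $T \neq 0$ with $M \le 2$ is not covered by the transcritical statement, so the helper reports it as undetermined rather than guessing.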
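Partial lattice of the isotropy subgroups of $S_4$ (and associated bifurcating directions)
[Figure: lattice $S_4 \to S_3 \to S_2 \to 1$, with four copies of $S_3$ and three copies of $S_2$ below each. The final entry of every direction is the $\lambda$ component.]
$S_4 \to S_3$ along (3v,-v,-v,-v,0)^T, then $S_3 \to S_2$ along (0,2v,-v,-v,0)^T, (0,-v,2v,-v,0)^T or (0,-v,-v,2v,0)^T
$S_4 \to S_3$ along (-v,3v,-v,-v,0)^T, then $S_3 \to S_2$ along (2v,0,-v,-v,0)^T, (-v,0,2v,-v,0)^T or (-v,0,-v,2v,0)^T
$S_4 \to S_3$ along (-v,-v,3v,-v,0)^T, then $S_3 \to S_2$ along (2v,-v,0,-v,0)^T, (-v,2v,0,-v,0)^T or (-v,-v,0,2v,0)^T
$S_4 \to S_3$ along (-v,-v,-v,3v,0)^T, then $S_3 \to S_2$ along (2v,-v,-v,0,0)^T, (-v,2v,-v,0,0)^T or (-v,-v,2v,0,0)^T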
For the 4 blob problem: the isotropy subgroups and bifurcating directions of the observed bifurcating branches
isotropy group: $S_4 \to S_3 \to S_2 \to 1$
bifurcating direction: (-v,-v,3v,-v,0)^T, then (-v,2v,0,-v,0)^T, then (-v,0,0,v,0)^T. No more bifurcations!
Other Branches
The Smoller-Wasserman Theorem establishes the existence of bifurcating branches for every maximal isotropy subgroup.
Theorem: If M is a composite number, then there exist bifurcating solutions with isotropy group $\langle \gamma^p \rangle$ for every element $\gamma$ of order M in $\Gamma$ and every prime p|M. The bifurcating direction is in the p-1 dimensional subspace of $\ker \Delta_{q,\lambda} \mathcal{L}(q^*,\lambda^*,\beta^*)$ which is fixed by $\langle \gamma^p \rangle$.
We have never numerically observed solutions fixed by $\langle \gamma^p \rangle$ and so perhaps they are unstable.
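Lattice of the maximal isotropy subgroups $\langle \gamma^p \rangle$ in $S_4$
[Figure: lattice with $S_4$ at the top, $A_4$ below it, and the cyclic subgroups $\langle (1234)^2 \rangle$, $\langle (1324)^2 \rangle$ and $\langle (1243)^2 \rangle$, each with an associated bifurcating direction whose four entries are $\pm v$. The elements (1423) and (1324) are also labeled.]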
An example of redundancy: $(1423)^2 = (1324)^2 = (12)(34)$.
The full lattice of subgroups of the group $S_M$ is not known for arbitrary M.
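The redundancy example can be verified by composing the permutations directly. The sketch below represents a permutation of {1, 2, 3, 4} as a dictionary; the helper names are hypothetical.

```python
def cycle_to_map(cycle, n=4):
    """Permutation of {1..n} given by one cycle, as a dict i -> image of i."""
    mapping = {i: i for i in range(1, n + 1)}
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        mapping[a] = b
    return mapping

def compose(f, g):
    """Composition (f o g)(i) = f(g(i))."""
    return {i: f[g[i]] for i in g}

a = cycle_to_map((1, 4, 2, 3))
b = cycle_to_map((1, 3, 2, 4))
a_squared = compose(a, a)
b_squared = compose(b, b)
# (12)(34): the two transpositions are disjoint, so the order does not matter.
target = compose(cycle_to_map((1, 2)), cycle_to_map((3, 4)))
```

Both squares coincide with `target`, so the two distinct 4-cycles generate the same subgroup when squared.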
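The efficient algorithm to solve $\max_q F(q,\beta)$
Let $q_0$ be the maximizer of $\max_q G(q)$, $\beta_0 = 1$ and $\Delta s > 0$. For $k \ge 0$, let $(q_k, \beta_k)$ be a solution to $\max_q (G(q) + \beta_k D(q))$. Iterate the following steps until $\beta_K = B$ for some K.
1. Perform $\beta$-step: solve
$\Delta_{q,\lambda} \mathcal{L}(q_k,\lambda_k,\beta_k) \begin{pmatrix} \partial_\beta q_k \\ \partial_\beta \lambda_k \end{pmatrix} = -\partial_\beta \nabla_{q,\lambda} \mathcal{L}(q_k,\lambda_k,\beta_k)$
for $(\partial_\beta q_k, \partial_\beta \lambda_k)$ and select $\beta_{k+1} = \beta_k + d_k$, where $d_k = \Delta s / (\|\partial_\beta q_k\|^2 + \|\partial_\beta \lambda_k\|^2 + 1)^{1/2}$.
2. The initial guess for $q_{k+1}$ at $\beta_{k+1}$ is $q_{k+1}^{(0)} = q_k + d_k \, \partial_\beta q_k$.
3. Optimization: solve $\max_q (G(q) + \beta_{k+1} D(q))$ to get the maximizer $q^*_{k+1}$, using initial guess $q_{k+1}^{(0)}$.
4. Check for bifurcation: compare the sign of the determinant of an identical block of each of $\Delta_q [G(q_k) + \beta_k D(q_k)]$ and $\Delta_q [G(q_{k+1}) + \beta_{k+1} D(q_{k+1})]$. If a bifurcation is detected, then set $q_{k+1}^{(0)} = q_k + d_k u$, where u is a bifurcating direction $u_k$ from the Equivariant Branching Lemma theorem, and repeat step 3.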
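Two pieces of this iteration are easy to isolate: the step length $d_k$ from step 1 and the determinant sign test from step 4. The sketch below implements only those two pieces and leaves the optimization in step 3 to whatever solver is in use. The function names and the toy inputs are hypothetical, and `numpy.linalg.slogdet` is used here so that only the sign of the determinant is compared.

```python
import numpy as np

def beta_increment(dq, dlam, ds):
    """d_k = ds / sqrt(||dq||^2 + ||dlam||^2 + 1)."""
    return ds / np.sqrt(dq @ dq + dlam @ dlam + 1.0)

def bifurcation_detected(block_prev, block_next):
    """True when det of the Hessian block changes sign between two steps."""
    sign_prev, _ = np.linalg.slogdet(block_prev)
    sign_next, _ = np.linalg.slogdet(block_next)
    return sign_prev != sign_next

# Toy tangent (assumption): ||dq||^2 = 9 and ||dlam||^2 = 16.
d_k = beta_increment(np.array([3.0, 0.0]), np.array([4.0]), ds=0.1)

# Toy Hessian blocks (assumption): one eigenvalue crosses zero between steps.
block_k = np.diag([-1.0, -2.0])              # det = 2
block_k1 = np.diag([1.0, -2.0])              # det = -2
crossed = bifurcation_detected(block_k, block_k1)
```

The "+1" under the square root bounds the step by `ds` even when the tangent is very small, and shortens it when the branch is steep in q.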