MATHEMATICAL OPTIMIZATION FOR
THE INVERSE PROBLEM OF INTENSITY
MODULATED RADIATION THERAPY
Yair Censor, D.Sc.
Department of Mathematics, University of Haifa, Haifa,
Israel
The 2003 AAPM Summer School on Intensity Modulated Radiation Therapy, Colorado Springs, Colorado, USA, June 22-26, 2003.
Y.C./IMRT/COS/slide 1
Setting the stage
Figure 1: A 3D cross-section, external radiation field
and dose distribution for 3D-IMRT planning.
Y.C./IMRT/COS/slide 2
Road map (of our discussion)
• The continuous forward and inverse problems
• Full discretization of the inverse problem
• The general optimization problem
— Linear optimization
— Global optimization
— The convex feasibility problem
— Multi-objective optimization
• Optimization and feasibility methods
— Linear optimization and mixed integer programming (MIP)
— Pareto optimality in multi-objective optimization
— Global optimization: simulated annealing
— Projection methods: POCS and Cimmino’s algorithms
— Gradient methods
• What is the most practical thing and what is the road to wisdom
Y.C./IMRT/COS/slide 3
The continuous forward problem of IMRT
Assume that the cross-section of the patient and its
radiation absorption characteristics are known.
Given an external radiation intensity function (field)
ρ(u, w),
find the dose function D(r, θ), for all (r, θ), from the
formula
D(r, θ) = D[ρ(u, w)](r, θ)
where D is the dose operator which relates the dose
function to the radiation intensity function.
Y.C./IMRT/COS/slide 4
The main difficulty regarding the operator D
• There exists no closed-form analytic representation of the dose operator D.

• To solve the forward problem in practice, a state-of-the-art computer program, which represents a computational approximation of the operator D and which enables reasonably good dose calculations, must be used.
Y.C./IMRT/COS/slide 5
The continuous inverse problem of IMRT
Assume that the cross-section of the patient and its
radiation absorption characteristics are known.
Given a prescribed dose function D(r, θ),
find a radiation intensity function ρ(u, w) such that
ρ(u, w) = D^{-1}[D(r, θ)]
where D^{-1} is the inverse operator of D.
Y.C./IMRT/COS/slide 6
The main difficulty regarding the operator D^{-1}
• The inversion problem needs to be solved, in a computationally tractable way, although no closed-form analytic mathematical representation is available for the dose operator D.

• Without such a mathematical representation of D it is impossible to employ mathematical methods for analytic inversion to find the inverse operator D^{-1}.

• Conclusion: full discretization of the problem has to be adopted.
Y.C./IMRT/COS/slide 7
Y.C./IMRT/COS/slide 8
Full discretization of the inverse problem:
• Voxels are indexed by j = 1, 2, . . . , J.

• Beams are indexed by s = 1, 2, . . . , S.

• Rays (beamlets) are indexed, for all rays in all beams, by i = 1, 2, . . . , I.

Notations
• a_i^j — the dose absorbed in voxel j due to one unit of radiation intensity coming from ray i.
Y.C./IMRT/COS/slide 9
Full discretization of the inverse problem — cont'd.
• x_i — the total radiation intensity that the i-th ray should deliver.

• x = (x_i)_{i=1}^{I} — the I-dimensional radiation intensity vector.

• Σ_{i=1}^{I} a_i^j x_i — the total dose that will be absorbed at voxel j due to radiation from all rays.
· This is the inner (dot) product of a^j and x in the Euclidean space R^I:

Σ_{i=1}^{I} a_i^j x_i = ⟨a^j, x⟩
Y.C./IMRT/COS/slide 10
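Aside (mine, not part of the original slides): a minimal Python sketch of the discretized dose computation just described. The row vectors a^j are stacked into a J x I matrix A, so that all voxel doses are obtained at once as the matrix-vector product Ax; the sizes and entries are made up.

```python
import numpy as np

# Hypothetical dose matrix A: entry A[j, i] is the dose absorbed in
# voxel j per unit radiation intensity from ray (beamlet) i.
J, I = 4, 3                      # toy sizes: 4 voxels, 3 beamlets
rng = np.random.default_rng(0)
A = rng.uniform(0.0, 1.0, size=(J, I))

# A candidate intensity vector x (one nonnegative entry per beamlet).
x = np.array([1.0, 0.5, 2.0])

# The dose in voxel j is the inner product <a^j, x>; computing all J
# voxel doses at once is the matrix-vector product A @ x.
dose = A @ x
print(dose)                      # J-dimensional vector of absorbed doses
```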
Full discretization of the inverse problem — cont'd.
Prescriptions: For all voxels, indexed j = 1, 2, . . . , J:
· Exact prescriptions:
⟨a^j, x⟩ = b_j
· Interval prescriptions:
l_j ≤ ⟨a^j, x⟩ ≤ u_j
• The feasibility approach:

⟨a^j, x⟩ ≤ b_l, for all j ∈ B_l, l = 1, 2, . . . , L,
t_q ≤ ⟨a^j, x⟩, for all j ∈ T_q, q = 1, 2, . . . , Q,
⟨a^j, x⟩ ≤ c, for all j ∈ C,
x_i ≥ 0, for all i = 1, 2, . . . , I.
Y.C./IMRT/COS/slide 11
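Another aside of mine: a sketch of a feasibility check for the constraint system above. The index sets B_l (organs at risk), T_q (targets), and C, and all bounds, are hypothetical inputs.

```python
import numpy as np

def is_feasible(A, x, oar_sets, oar_bounds, target_sets, target_bounds,
                other_set, c, tol=1e-9):
    """Check the linear feasibility constraints of the discretized
    inverse problem for a given intensity vector x (illustrative)."""
    x = np.asarray(x)
    if np.any(x < -tol):                    # x_i >= 0
        return False
    dose = A @ x                            # <a^j, x> for every voxel j
    for B_l, b_l in zip(oar_sets, oar_bounds):        # OAR upper bounds
        if np.any(dose[B_l] > b_l + tol):
            return False
    for T_q, t_q in zip(target_sets, target_bounds):  # target lower bounds
        if np.any(dose[T_q] < t_q - tol):
            return False
    return not np.any(dose[other_set] > c + tol)      # remaining tissue
```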
Full discretization: Voxels, beams, rays.
Y.C./IMRT/COS/slide 12
Mathematical optimization and feasibility problems
• The general optimization problem
min f(x)
such that
h_m(x) = 0, m = 1, 2, . . . , M
g_j(x) ≤ 0, j = 1, 2, . . . , J
x ∈ Γ
f : R^I → R is an objective function,
g_j : R^I → R and h_m : R^I → R are constraint functions, and x ∈ Γ is called a set-constraint.
• Denote by Q ⊆ R^I the feasible set of the optimization problem.
Y.C./IMRT/COS/slide 13
Mathematical optimization and feasibility problems — cont'd.
Definitions
• A point x* ∈ Q is a global optimal solution (global minimizer) if
f(x*) ≤ f(x), for all x ∈ Q.

• A point x̃ ∈ Q is a local optimal solution (local minimizer) if there exists a neighborhood U ⊂ R^I of x̃ such that
f(x̃) ≤ f(x), for all x ∈ Q ∩ U.
Y.C./IMRT/COS/slide 14
Special cases
(1) Linear optimization: all functions (objective and constraints) are linear, i.e., of the generic form
f(x) = ⟨ϕ, x⟩ + β,
g_j(x) = ⟨a^j, x⟩ + b,
h_m(x) = ⟨ξ^m, x⟩ + σ.
• The feasible set of the fully discretized inverse problem of IMRT, discussed before, is represented by linear constraints.

(2) Global optimization: The objective function is multiextremal, i.e., has multiple local minimizers with different objective function values.
Y.C./IMRT/COS/slide 15
Special cases — cont'd.
(3) The convex feasibility problem: no objective function at all.
Q_j = {x ∈ R^I | q_j(x) ≤ 0}, j = 1, 2, . . . , J,
where each q_j(x) is a convex function; thus, the Q_j are convex sets.
Q = ∩_{j=1}^{J} Q_j.
• The convex feasibility problem: Find any x* ∈ Q.
Y.C./IMRT/COS/slide 16
Special cases — cont'd.
(4) Multi-objective optimization: The problem has more than one objective function.
“min” {(f_1, f_2, . . . , f_T) | x ∈ Q}.
For example, if T = 5 and we have two feasible vectors x¹, x² ∈ Q with function values

F(x¹) = (f_1(x¹), f_2(x¹), f_3(x¹), f_4(x¹), f_5(x¹)) = (1, 2, 3, 4, 5)
and
F(x²) = (1, 1, 4, 2, 7).

Which is better?
• The vector of function values induces on the feasible set Q a partial ordering (see the dominance-check sketch below).
Y.C./IMRT/COS/slide 17
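To make the partial ordering concrete, here is a small sketch (my addition) that tests whether one objective vector Pareto-dominates another; applied to the two vectors above, it reports that neither dominates the other, so they are incomparable.

```python
import numpy as np

def dominates(fa, fb):
    """True if objective vector fa Pareto-dominates fb: no worse in
    every component and strictly better in at least one (minimization)."""
    fa, fb = np.asarray(fa), np.asarray(fb)
    return bool(np.all(fa <= fb) and np.any(fa < fb))

F1 = [1, 2, 3, 4, 5]
F2 = [1, 1, 4, 2, 7]
print(dominates(F1, F2), dominates(F2, F1))   # False False: incomparable
```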
Optimization and feasibility methods
Linear optimization
Denote by Q the set of all radiation intensity vectors x = (x_i)_{i=1}^{I} that satisfy all constraints given by:

⟨a^j, x⟩ ≤ b_l, for all j ∈ B_l, l = 1, 2, . . . , L,
t_q ≤ ⟨a^j, x⟩, for all j ∈ T_q, q = 1, 2, . . . , Q,
⟨a^j, x⟩ ≤ c, for all j ∈ C,
x_i ≥ 0, for all i = 1, 2, . . . , I.
Y.C./IMRT/COS/slide 18
Linear optimization — cont'd.
Various linear optimization problems for IMRT are obtained from
min {f(x) | x ∈ Q}.
Examples:
(1) f(x) = Σ_{j=1}^{J} Σ_{i=1}^{I} a_i^j x_i

(2) f(x) = Σ_{l=1}^{L} β_l Σ_{j∈B_l} Σ_{i=1}^{I} a_i^j x_i + Σ_{q=1}^{Q} θ_q Σ_{j∈T_q} Σ_{i=1}^{I} a_i^j x_i + γ Σ_{j∈C} Σ_{i=1}^{I} a_i^j x_i,

after choosing (how?) user-specified weights of importance {β_l}_{l=1}^{L}, {θ_q}_{q=1}^{Q} and γ (see the LP sketch below).
Y.C./IMRT/COS/slide 19
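A hedged sketch (my own toy data, not a clinical case) of solving objective (1) over the feasible set Q with scipy.optimize.linprog; the voxel classification and all bounds are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 4-voxel, 2-beamlet dose matrix A[j, i].
A = np.array([[0.2, 0.1],    # voxel 0: organ at risk (OAR)
              [1.0, 0.5],    # voxel 1: target
              [0.4, 1.0],    # voxel 2: target
              [0.5, 0.5]])   # voxel 3: remaining tissue
oar, tgt, rest = [0], [1, 2], [3]
b_oar, t_tgt, c_rest = 2.0, 4.0, 6.0   # made-up bounds

# Objective (1): total dose over all voxels = (1^T A) x.
cost = A.sum(axis=0)

# linprog expects "A_ub x <= b_ub"; the target lower bounds
# t_tgt <= <a^j, x> are encoded by negating both sides.
A_ub = np.vstack([A[oar], -A[tgt], A[rest]])
b_ub = np.concatenate([[b_oar], [-t_tgt, -t_tgt], [c_rest]])

res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 2)
print(res.status, res.x)   # status 0: an optimal intensity vector found
```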
Optimization and feasibility methods
Mixed integer programming (MIP)
min ⟨c, x⟩ + ⟨d, y⟩
subject to t ≤ Ax + By ≤ b,
l ≤ x ≤ u,
y_p ∈ {0, 1, 2, . . .}.
Aim: To handle partial volume constraints (also called dose-volume constraints) of the form:
“up to ϕ% of all voxels inside a certain organ at risk (OAR), say B_l̃, might be allowed to exceed b_l̃ by ψ%”,
without specifying a priori which of the voxels in B_l̃ will actually use this relaxed upper bound.
Y.C./IMRT/COS/slide 20
Mixed integer programming (MIP) — cont'd.
The non-MIP problem:
min f(x)
such that
⟨a^j, x⟩ ≤ b_l, for all j ∈ B_l, l = 1, 2, . . . , L,
t_q ≤ ⟨a^j, x⟩, for all j ∈ T_q, q = 1, 2, . . . , Q,
⟨a^j, x⟩ ≤ c, for all j ∈ C,
x_i ≥ 0, for all i = 1, 2, . . . , I.
Y.C./IMRT/COS/slide 21
Mixed integer programming (MIP) — cont'd.
The MIP problem:
• Step 1: For l = l̃, replace the constraint
⟨a^j, x⟩ ≤ b_l̃, for all j ∈ B_l̃,
by
⟨a^j, x⟩ ≤ b_l̃ + ψ · b_l̃ · o_j, for all j ∈ B_l̃.
• o_j ∈ {0, 1} for all j ∈ B_l̃.
• Price: added (unknown) variables o_j, for all j ∈ B_l̃, to the original problem.
Y.C./IMRT/COS/slide 22
Mixed integer programming (MIP) — cont'd.
• Step 2: Add to the constraints of the original problem:
Σ_{j∈B_l̃} o_j = |B_l̃| · ϕ
• |B_l̃| is the cardinality of the index set B_l̃, i.e., the number of indices in B_l̃.
Y.C./IMRT/COS/slide 23
Mixed integer programming (MIP) — cont'd.
The resulting MIP problem:
min f(x) + Σ_{j∈B_l̃} 0 · o_j
such that
⟨a^j, x⟩ ≤ b_l, for all j ∈ B_l, l = 1, 2, . . . , L, l ≠ l̃,
⟨a^j, x⟩ ≤ b_l̃ + ψ · b_l̃ · o_j, for all j ∈ B_l̃,
t_q ≤ ⟨a^j, x⟩, for all j ∈ T_q, q = 1, 2, . . . , Q,
⟨a^j, x⟩ ≤ c, for all j ∈ C,
x_i ≥ 0, for all i = 1, 2, . . . , I,
Σ_{j∈B_l̃} o_j = |B_l̃| · ϕ.
Y.C./IMRT/COS/slide 24
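A minimal sketch of this dose-volume MIP, assuming the open-source PuLP modeling package and made-up data; the binary variables o_j are the "overdose permits" introduced above, and the equality Σ o_j = |B_l̃| · ϕ follows the slide (solvers often use ≤ instead).

```python
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary

# Toy data (hypothetical): 3 beamlets, one OAR with 4 voxels.
A = [[0.9, 0.2, 0.1],      # A[j][i]: dose in OAR voxel j per unit of x_i
     [0.5, 0.6, 0.2],
     [0.3, 0.7, 0.4],
     [0.1, 0.3, 0.8]]
b, psi, phi = 4.0, 0.25, 0.5   # bound b_l~, relaxation psi, fraction phi

prob = LpProblem("dose_volume_mip", LpMinimize)
x = [LpVariable(f"x{i}", lowBound=0) for i in range(3)]
o = [LpVariable(f"o{j}", cat=LpBinary) for j in range(4)]

# Objective: total OAR dose (a stand-in for f(x)); the 0*o_j term is omitted.
prob += lpSum(A[j][i] * x[i] for j in range(4) for i in range(3))

# Relaxed upper bounds: <a^j, x> <= b + psi*b*o_j for every OAR voxel j.
for j in range(4):
    prob += lpSum(A[j][i] * x[i] for i in range(3)) <= b + psi * b * o[j]

# Exactly phi*|B| voxels may use the relaxed bound (as on the slide).
prob += lpSum(o) == int(phi * 4)

# A hypothetical target constraint so the zero solution is not optimal.
prob += lpSum(A[0][i] * x[i] for i in range(3)) >= 3.0

prob.solve()
print([v.value() for v in x], [v.value() for v in o])
```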
Optimization and feasibility methods
Multi-objective optimization
• More than one objective function is defined over the feasible set.
“min” {F(x) | x ∈ Q}
F(x) = (f_1(x), f_2(x), . . . , f_T(x))
For each t = 1, 2, . . . , T, the function f_t(x) maps R^I → R.
• What does it mean to minimize a vector of functions?
Y.C./IMRT/COS/slide 25
Multi-objective optimization — cont'd.
Scalarization
Conversion of the multi-objective problem into a family of scalar optimization problems of the form:
min { Σ_{t=1}^{T} γ_t f_t(x) | x ∈ Q },
where γ = (γ_t)_{t=1}^{T} ∈ R^T is a parameter vector of weights of relative importance.

• Difficulty: How to choose the weights vector? (See the sketch after this slide.)
Y.C./IMRT/COS/slide 26
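As an illustration (mine, with toy objectives), sweeping the weight vector γ and solving each scalarized problem produces different Pareto optimal points; here two convex quadratics on R² stand in for the f_t.

```python
import numpy as np
from scipy.optimize import minimize

# Two toy convex objectives on R^2 (hypothetical stand-ins for f_t).
f1 = lambda x: (x[0] - 1) ** 2 + x[1] ** 2
f2 = lambda x: x[0] ** 2 + (x[1] - 1) ** 2

# Sweep the weights gamma = (g, 1 - g); each scalarized problem yields
# one Pareto optimal point of the pair (f1, f2).
for g in (0.1, 0.5, 0.9):
    res = minimize(lambda x: g * f1(x) + (1 - g) * f2(x), x0=[0.0, 0.0])
    print(g, res.x, f1(res.x), f2(res.x))
```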
Multi-objective optimization — cont'd.
Pareto optimality
Definition: A point x* ∈ R^I is called Pareto optimal (efficient) if x* ∈ Q and there is no other x ∈ Q, x ≠ x*, for which f_t(x) ≤ f_t(x*) for all t = 1, 2, . . . , T, with a strict inequality for at least one t, 1 ≤ t ≤ T.

Explanation: x* is Pareto efficient if it is impossible to decrease the value of any individual scalar objective function from its value at x* without increasing at least one other scalar objective function.
Y.C./IMRT/COS/slide 27
Optimization and feasibility methods
Global optimization — Simulated annealing
• Simulated annealing (SA) exploits an analogy between the way in which a metal cools and freezes into a minimum-energy crystalline structure (the annealing process) and the search for a minimum in a more general system.
• The SA algorithm is based on the Metropolis algorithm for finding the equilibrium configuration of a collection of atoms at a given temperature. The connection with mathematical minimization was first noted by Pincus. Kirkpatrick et al. proposed that it form the basis of an optimization technique. Webb was first to apply it to the inverse problem of RTTP (radiation therapy treatment planning).
• SA's major advantage over other methods is its ability to avoid becoming trapped at local minima.
Y.C./IMRT/COS/slide 28
Global optimization — Simulated annealing — cont'd.
• The algorithm employs a random search which not only accepts changes that decrease the objective function f, but also some changes that increase it.

• Reference: “Computational Science Education Project” (CSEP), an electronic book for teaching computational science and engineering, sponsored by the U.S. Department of Energy. Copyright (C) 1991-1996 by the Computational Science Education Project. http://csep1.phy.ornl.gov/csep.html
Y.C./IMRT/COS/slide 29
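A compact sketch of this acceptance rule (my own illustration, not Webb's planning code): downhill moves are always accepted, uphill moves with probability exp(−Δf/T), and the temperature T is lowered geometrically.

```python
import math
import random

def simulated_annealing(f, x0, step, t0=1.0, cooling=0.95, iters=2000):
    """Minimize f by simulated annealing (illustrative sketch)."""
    x, fx, t = x0, f(x0), t0
    best, fbest = x, fx
    for _ in range(iters):
        y = x + random.uniform(-step, step)   # random candidate move
        fy = f(y)
        # Metropolis rule: always accept improvements; accept a worse
        # point with probability exp(-(fy - fx) / t).
        if fy <= fx or random.random() < math.exp(-(fy - fx) / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling                          # geometric cooling schedule
    return best, fbest

# A multiextremal toy objective with many local minima.
f = lambda x: 0.1 * x * x + math.sin(5 * x)
print(simulated_annealing(f, x0=4.0, step=1.0))
```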
Projection methods
• The orthogonal projection P_C(x) of a point x ∈ R^I onto a closed convex set C ⊆ R^I is defined by
P_C(x) := argmin{‖z − x‖ | z ∈ C},
i.e.,
‖P_C(x) − x‖ ≤ ‖z − x‖, for all z ∈ C.
Figure 2: The projection of x onto the convex set C.
Y.C./IMRT/COS/slide 30
Projection methods — cont'd.
• A relaxation parameter is introduced:
T_C(x) = P_{C,λ}(x) := (1 − λ)x + λP_C(x).
T_C(x) is the relaxed projection of x onto C with relaxation parameter λ.
Figure 3: The geometric meaning of relaxation of a projection onto a convex set.
Y.C./IMRT/COS/slide 31
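A short sketch (my addition) of the orthogonal and relaxed projections onto a half-space C = {x : ⟨a, x⟩ ≤ d}, the special case for which the projection has a simple closed form; λ = 1 recovers the plain projection.

```python
import numpy as np

def project_halfspace(x, a, d):
    """Orthogonal projection of x onto C = {z : <a, z> <= d}."""
    viol = np.dot(a, x) - d
    if viol <= 0:                        # x already lies in C
        return x
    return x - (viol / np.dot(a, a)) * a

def relaxed_projection(x, a, d, lam=1.0):
    """Relaxed projection (1 - lam) * x + lam * P_C(x)."""
    return (1 - lam) * x + lam * project_halfspace(x, a, d)

x = np.array([2.0, 2.0]); a = np.array([1.0, 0.0]); d = 1.0
print(project_halfspace(x, a, d))        # [1. 2.]
print(relaxed_projection(x, a, d, 0.5))  # halfway there: [1.5 2. ]
```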
Projection methods — The convex feasibility problem (CFP)
Q_j = {x ∈ R^I | q_j(x) ≤ 0}, j = 1, 2, . . . , J,
each q_j(x) is a convex function; thus, the Q_j are convex sets, and
Q = ∩_{j=1}^{J} Q_j.
• The CFP: Find any x* ∈ Q.

• Q = ∅ (the empty set) =⇒ the CFP is inconsistent (not feasible).
• Q ≠ ∅ =⇒ the CFP is consistent (feasible).
Y.C./IMRT/COS/slide 32
Projection methods — The POCS algorithm
• Starting from an arbitrary initial point x⁰ ∈ R^I, the iterative step is
x^{k+1} = x^k + λ_k (P_{C_{j(k)}}(x^k) − x^k).

• {λ_k}_{k≥0} are relaxation parameters.

• {j(k)}_{k≥0} is a control sequence, 1 ≤ j(k) ≤ J, for all k ≥ 0.
• The cyclic control: j(k) = k mod J + 1.
Y.C./IMRT/COS/slide 33
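A sketch of POCS with cyclic control over half-spaces (my own toy system; the IMRT constraints ⟨a^j, x⟩ ≤ b_j are handled the same way, one row per set).

```python
import numpy as np

def pocs(A, b, x0, lam=1.0, sweeps=100):
    """Cyclic POCS for half-spaces C_j = {x : <a^j, x> <= b_j}.
    Each step relaxes x toward its projection onto one set."""
    x = np.asarray(x0, dtype=float)
    J = len(b)
    for k in range(sweeps * J):
        j = k % J                          # cyclic control j(k)
        viol = A[j] @ x - b[j]
        if viol > 0:                       # project only if outside C_j
            x = x + lam * (-(viol / (A[j] @ A[j])) * A[j])
    return x

# Toy system: x1 + x2 <= 2, x1 >= 0, x2 >= 0, x1 - x2 <= 0.5.
A = np.array([[1., 1.], [-1., 0.], [0., -1.], [1., -1.]])
b = np.array([2., 0., 0., 0.5])
print(pocs(A, b, x0=[3.0, -1.0]))
```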
Figure 4:
Y.C./IMRT/COS/slide 34
Projection methods — cont’d.
POCS for linear equations is the Kaczmarz algorithm, also known as ART (algebraic reconstruction technique) in image reconstruction from projections.
Y.C./IMRT/COS/slide 35
Projection methods — Cimmino's algorithm
x^{k+1} = x^k + λ_k Σ_{j=1}^{J} w_j (P_{C_j}(x^k) − x^k).
The weights {w_j}_{j=1}^{J}, such that w_j > 0 and Σ_{j=1}^{J} w_j = 1, are weights of importance.
Y.C./IMRT/COS/slide 36
Projection methods — Cimmino's algorithm for half-spaces
For C_j = {x ∈ R^I | ⟨a^j, x⟩ ≤ d_j}, for all j = 1, 2, . . . , J, the formula becomes:
x^{k+1} = x^k + λ_k Σ_{j=1}^{J} w_j c_j(x^k) a^j,
where
c_j(x^k) = min { 0, (d_j − ⟨a^j, x^k⟩) / ‖a^j‖² }.
Y.C./IMRT/COS/slide 37
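The same toy system as in the POCS sketch, now treated simultaneously: a sketch of Cimmino's algorithm for half-spaces with equal weights w_j = 1/J (my illustration).

```python
import numpy as np

def cimmino(A, b, x0, lam=1.0, iters=500):
    """Cimmino's simultaneous projection method for <a^j, x> <= b_j
    with equal weights w_j = 1/J (illustrative sketch)."""
    x = np.asarray(x0, dtype=float)
    J = len(b)
    w = np.full(J, 1.0 / J)
    norms = np.einsum('ji,ji->j', A, A)          # ||a^j||^2 per row
    for _ in range(iters):
        c = np.minimum(0.0, (b - A @ x) / norms)  # c_j(x^k) for all j
        x = x + lam * (w * c) @ A                 # simultaneous step
    return x

A = np.array([[1., 1.], [-1., 0.], [0., -1.], [1., -1.]])
b = np.array([2., 0., 0., 0.5])
print(cimmino(A, b, x0=[3.0, -1.0]))
```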
Projection methods — The String-Averaging Projection Method
Y.C./IMRT/COS/slide 38
Figure 5: Graphical illustration of the least-intensity
feasible (LIF) solution.
Y.C./IMRT/COS/slide 39
Projection methods — The principle of cyclic or simultaneous subgradient projections (CSP or SSP, respectively):
Figure 6:
Y.C./IMRT/COS/slide 40
Gradient methods
Purpose: Unconstrained optimization
min f(x) s.t. x ∈ R^I
General class of methods — line search methods:
x^{k+1} = x^k + α_k d^k
x^k — the current iterate
x^{k+1} — the next iterate
d^k — the step direction
α_k — the step size
Line search:
f(x^k + α_k d^k) = min{f(x^k + α d^k) | 0 ≤ α < ∞}
Y.C./IMRT/COS/slide 41
Gradient methods — cont'd.
x^{k+1} = x^k − α_k ∇f(x^k), i.e., d^k = −∇f(x^k)
x^{k+1} = x^k − α_k D_k ∇f(x^k), i.e., d^k = −D_k ∇f(x^k)
• Different choices of the positive definite symmetric matrix D_k give rise to:
steepest descent, Newton's, diagonally-scaled steepest descent, modified Newton's, discretized Newton's, Gauss-Newton's methods, and more.
• Other methods for selecting d^k:
conjugate gradients, coordinate descent methods, etc.
Y.C./IMRT/COS/slide 42
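A sketch of steepest descent on a toy quadratic (my illustration), with a backtracking (Armijo) line search standing in for the exact line search of the previous slide.

```python
import numpy as np

def steepest_descent(f, grad, x0, iters=100, beta=0.5, sigma=1e-4):
    """Steepest descent x^{k+1} = x^k - alpha_k grad f(x^k), with a
    backtracking (Armijo) line search choosing alpha_k (sketch)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad(x)
        d = -g                              # steepest-descent direction
        alpha = 1.0
        # Backtrack until the Armijo sufficient-decrease test holds.
        while f(x + alpha * d) > f(x) + sigma * alpha * (g @ d):
            alpha *= beta
        x = x + alpha * d
    return x

# Toy strictly convex quadratic f(x) = 1/2 x^T H x - c^T x.
H = np.array([[3.0, 1.0], [1.0, 2.0]])
c = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ H @ x - c @ x
grad = lambda x: H @ x - c
print(steepest_descent(f, grad, x0=[0.0, 0.0]))  # approx H^{-1} c
```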
• “The most practical thing in the world
is a good theory”,
attributed to H. von Helmholtz.
• “The road to wisdom? Well, it’s plain
and simple to express:
Err
and err
and err again
but less
and less
and less”,
Piet Hein, “The road to Wisdom”, 1966.
Y.C./IMRT/COS/slide 43