PhD-position: Non-convex optimization with applications in imaging.

PhD-position: Non-convex optimization with applications in
imaging.
Overview
Imaging is a mathematical field which consists of using indirect measurements (data) for reconstructing
(images of) structures which can not be seen directly. This appears for example in mathematical tomography,
where the interior of objects (such as a human being) can be reconstructed by X-ray absorption patterns.
When imaging on an atomic level, one retrieves e.g. crystal structures from the amplitude of diffraction
patterns, using highly monochromatic light. The problem of algorithmically converting the data to the
desired image is known as phase retrieval. Such methods will be important for e.g. the beamline NanoMax
at MAX IV, Lund, as well as ESS later on. When formulated in mathematical terms, these problems belong
to the class of “ill-posed inverse problems”, which usually are solved by optimization tools upon adding
additional constraints (related to prior knowledge) to stabilize the inversion process. This boils down to
finding the minimum of certain functionals which usually are non-convex, which has the effect that most
algorithms are ad-hoc with no guarantees of convergence and/or lacks concrete characterizations of the point
of convergence. For this reason it has become very popular to replace the desired functional with a “similar”
one which is convex. This has the advantage of yielding algorithms which are guaranteed to converge, but
the disadvantage that the optimal point is not the one initially sought, and it is well known that these methods
always give rise to biased solutions
Recently, our team has made advances in this direction [17]. The paper is the culmen of several years
of research devoted at understanding convergence of algorithms in non-convex settings, and of computing
convex envelopes of non-convex functionals [3, 10, 11, 14], thereby being able to provide convergence
proofs of related algorithms as well as descriptions of the convergence points. We have also developed
several algorithms for tomographic reconstructions [8, 9] which triggered a close collaboration with the
MAX IV laboratory imaging group. The PhD student (Viktor Nikitin) who developed these algorithms and
defended his thesis at the Mathematics department in 2016, is currently implementing the new algorithms
at the MAX IV Laboratory and LUNARC as a PostDoc hosted at the MAX IV Laboratory. The open PhDposition is intended to further strengthen the link between the Centre for Mathematical Sciences and MAX
IV. We aim to continue the study of the above mentioned areas (non-convex optimization, fast algorithms)
with a particular focus on their potential use for MAX IV and their future users, as well as the National
7T facility (MRI) and later on ESS (neutron tomography). MAX IV is currently the most intense lightsource on the planet, and due to new technology also the most monochromatic beam. However, this will
have a limited impact unless coupled with new imaging algorithms and on-site expertise who can guide the
users on how to extract the desired information from the acquired data. As mentioned, the phase retrieval
problem, which appears in coherent diffraction tomography and ptychography, is of particular interest for
the beamline NanoMax and can be used for the imaging of e.g. crystals and particles. A related problem
which we wish to tackle is that of inverting measurements of an object which is deforming during the
tomographic acquisition. A typical example is clinical observations of moving organs or the emerging field
of micrometer resolution functional anatomy of small animals [29]. As of today only indirect methods
are used to measure the function of the lung tissue. Yet to understand the underlying mechanism of lung
malfunctions the microstructure of the organ must be studied in vivo. So far the only promising way is
using synchrotron X-ray tomographic microscopy, but the problem of reconstruction of the moving organ
at the micrometer scale is not solved. Incidentally, the same type of problem is of interest to the Cardiac
MR group at the National 7T facility at LBIC/Skånes Universitetssjukhus, for example when doing MRIimaging on the heart [25], and they are also interested in collaboration on this matter. In our group we have
worked with a lot of different problems during the last couple of years. We have published work in complex
analysis and operator theory [2, 4, 12, 16, 18], inverse problems [1, 3], fast algorithms [8], optimization
[17, 3, 10, 11, 14], geophysics [5] and signal processing [6, 7, 13]. Our aim is to be active on the entire
chain of problems, ranging for pure mathematical analysis to the development of fast numerical algorithms
1
to the actual treatment of measured data. Despite the applied flavor of the project, the mathematical problems
to be tackled are by no means easy or only in the area “applied mathematics”. The position is suitable for
a person who likes pure mathematics but at the same time has knowledge and interest in developing and
numerically testing algorithms. Below we list in greater detail some different components of the planned
research.
Inverse problems/non-convex optimization
In many physical and engineering setups one seeks information about a physical object or system that can
only be measured indirectly. The problem of reconstructing the information of interest is called an inverse
problem, and these are usually ill-posed. This means that additional assumptions need to be added in order
to obtain stable reconstruction algorithms. A lot of attention has recently been put in replacing traditional
constraints on amplitude or frequency range by imposing various types of sparseness constraints on the
solutions. A good motivation for this concerning imaging technologies is that most images can typically
be substantially compressed without loosing much information content. Imposing such conditions will,
however, cause the inverse problem to become non-linear even if the original formulation is linear. Related
optimization problems are then non-convex and these are difficult to deal with due to the presence of local
minima. For this reason, it is important to try to find formulations that lead to convex or close to convex
optimization problems. There is an abundance of algorithms using optimization methods of various kinds
that attempt to solve non-convex/non-linear inverse problems with various degrees of success. Many of
them are ad-hoc type of algorithms, where the “solution” obtained is merely the output of some algorithm
without actually solving a well defined problem, since they often get stuck at a local minimum. This also
hold for inverse problems that are non-linear already in their original formulation, such as the phase retrieval
problem and the complex frequency estimation problem.
An interesting approach that can be taken to address these issues relies on replacing the original nonconvex objective function with its convex envelope. It has recently turned out that there are important cases
where it is possible to compute the convex envelope explicitly [27, 17, 3, 11, 14], or one may work with it
implicitly in the problem formulation to prove convergence of seemingly ad-hoc methods [10].
When looking for sparse solutions one is interested in minimizing the amount of non-zero contributions,
sometimes called the `0 -penalty. This is however non-convex, even discontinuous, and a popular alternative
is to add an `1 -penalty on the data terms (as in e.g. compressed sensing). This has the advantage that it leads
to a convex functional, and under some conditions the solutions coincide with those of the first (non-convex)
problem [15, 21, 36]. However, the region where this happens is quite limited, and many physical setups
work in regimes where this is not the case. One problem is for instance that the `1 -penalty will enforce a
bias on the obtained solution, where the amplitudes are consequently underestimated.
Attempts to solve these problems typically fall into three categories, either one change the problem formulation (and its solution) by replacing the actual cost-functional with a standard convex one (`1 -minimization,
nuclear norm, compressed sensing etc.), or one builds ad hoc semi-convex functionals using e.g. ` p -norms
with 0 ă p ă 1 and demonstrate that these “seem to do well in practice”, or one leaves the non-convex
functional as is and rely on classical algorithms which often get stuck in local minima or even diverge occasionally (like alternating projections, Douglas-Rachford, ADMM and so on), often combined with restart
schemes. We wish to provide a 4th alternative by continuing the investigation mentioned above.
The original motivation for using the nuclear norm was given by [22] where it is shown that the nuclear
norm is the convex envelope of the rank function on the set tA; }A} ď 1u. The constraint }A} ď 1 is artificial
and added since the convex envelope on the whole domain would simply be the zero function. Recently it
has been observed [27, 28] that this problem can be circumvented if one takes into account the least squares
penalty term }A ´ F}2 before taking the convex envelope. The replacement of the rank penalty arising in this
way is non-convex by itself (but convex when combined with }A ´ F}2 ) and has the property of being equal
to the original non-convex one far away from discontinuities, leading to global minima which often even
coincide with the originally sought one. In contrast, the nuclear norm always move the sought optimum
because it penalizes large and small singular values alike and therefore exhibit a shrinking bias. We have
recently shown that the basic idea behind [27, 28] is very general and does not restrict itself to low rank
approximation. Consider a non-convex problem of the form
argminXPM f pXq ` }L pXq ´ F}2
2
(0.1)
over some subset M of a Hilbert space H , (where previously X was a matrix, f pXq its rank, L “ Id and
F measured data). In other words we are seeking the value of X P M for which the minimum is attained.
It turns out that when L “ Id the convex envelope of this functional also is of the form f˘pXq ` }X ´ F}2
where f˘ is the so called Lasry-Lions approximant (previously used in regularization) which can be explicitly
computed in a number of concrete situations [17]. Replacing the original functional by this leads to a
convex problem which is much closer to the original problem than previously considered alternatives in the
literature. Related algorithms can be proven to converge and their output can be analyzed by understanding
the minimal points of f˘pXq ` }X ´ F}2 on M . When L ‰ Id it is still possible to prove that the replacement
of f by f˘ has a number of desirable features, such as not moving global minima (although convexity might
be lost) [17, 3]. Recently we have also been able to give criteria of when sparse stationary points to such
functionals are actually unique, i.e. when the point of convergence actually solves the original problem
despite lack of convexity [33]. A partial aim of the Phd work is to further develop these ideas and also apply
them to synchrotron tomography, MRI and seismic imaging.
Phase retrieval
Given a complex signal Fpkq with phase ψpkq, i.e. Fpkq “ |Fpkq|eiψpkq the “phase retrieval problem” consists
in finding ψ given measurements only of the amplitude |Fpkq| and a set of constraints. Phase retrieval
problems in two or three dimensions arise for both near field imaging (real space imaging described by the
Fresnel transform) [19] and far field (coherent diffraction) imaging [23]. In the early 1980’s [23, 24], it was
demonstrated that if the diffracting sample has a small support and if the diffraction patterns is adequately
measured, it is possible to retrieve the phase using an iterative approach now known as the GerchbergSaxton-Fienup-algorithm.
Since the first experimental demonstration realized in 1999 by Miao et al. [31] using a synchrotron X-ray
beam, coherent diffraction imaging (CDI) has significantly progressed and been applied to the reconstruction
of shape but also strain inside nanocrystals [34, 35]. Many efforts have also been made to improve the
ab. initio phase retrieval algorithms, combining the original iterative algorithm (Gerchberg-Saxton-Fienup)
with charge flipping algorithms [38], and more recently using ptychography [20]. Another direction is the
shrink wrap algorithm, in which the support of the object is a constraint which is iteratively updated to
fit the measurements [30]. The paper [32] from 2010 gives an overview of contemporary algorithms for
phase retrieval and concludes with (p. 65) “It must be observed, however, that none of these approaches
have yielded an algorithm that has been found to converge in all cases.” This continues to be the case, as
indicated by the recent publication [26] which provides state of the art algorithms for the solution of both
ptychography and traditional phase retrieval.
In collaboration with Gerardina Carbone (NanoMax) and Jesper Wallentin (Synchrotron Radiation Research, LU) we plan to work on improvements of these algorithms, based on the new ideas emerging from
the previous section.
4D-imaging
In transmission tomography as well as MRI-measurements on humans, a problem is blurring due to the fact
that the object measured is not still during the measurement process (see e.g. [25, 37]). Popular methods
for dealing with this problem involve optimizaion techniques of the type discussed above. For example, in
[25] the total variation norm is used in both spacial variables and time, to stabilize the inversion process.
However, the scientists with a need for such methods are usually not mathematicians (some are doctors) and
do not have the time to improve or tailormake these algorithms. For example, the lungs move in a continuous
fashion when healthy whereas undergoes unpredictable distortion during an asthma attack. Such differences
affect the design of the corresponding algorithm. Consequently, much data acquired at synchrotron imaging
beamlines or the National 7T facility (at LBIC/Skånes Universitetssjukhus) is sub-optimally processed, or
in worst cases not processed at all mainly because the ambitious acquisition schemes do not match the
current capabilities of the reconstruction methods. In collaboration with Rajmund Mokso (MedMax) and
the cardiac imaging group at the National 7T facility, we plan to work on improvements of such algorithms.
3
References
plied physics letters, 75(19):2912–2914, 1999.
[20] M. Dierolf, A. Menzel, P. Thibault, P. Schneider, C. M.
Kewish, R. Wepf, O. Bunk, and F. Pfeiffer. Ptychographic
x-ray computed tomography at the nanoscale. Nature,
467(7314):436–439, 2010.
[21] D. L. Donoho. Compressed sensing. IEEE Transactions
on Information Theory, 52(4):1289–1306, April 2006.
[22] M. Fazel, H. Hindi, and S. P. Boyd. A rank minimization heuristic with application to minimum order system
approximation. In American Control Conference, 2001.
[23] J. R. Fienup. Phase retrieval algorithms: a comparison.
Applied optics, 21(15):2758–2769, 1982.
[24] R. W. Gerchberg and W. O. Saxton. Optik, 35:237, 1972.
[25] K. Haris, E. Hedström, S. Bidhult, F. Testud, N. Maglaveras, E. Heiberg, S. R. Hansson, H. Arheden, and A. H.
Aletras. Self-gated fetal cardiac mri with tiny golden angle
igrasp: A feasibility study. Journal of Magnetic Resonance
Imaging, 2017.
[26] R. Hesse, D. R. Luke, S. Sabach, and M. K. Tam. Proximal
heterogeneous block input-output method and application
to blind ptychographic diffraction imaging. arXiv preprint
arXiv:1408.1887, 2014.
[27] V. Larsson and C. Olsson. Convex envelopes for low
rank approximation. In Energy Minimization Methods in
Computer Vision and Pattern Recognition, pages 1–14.
Springer International Publishing, 2015.
[28] V. Larsson, C. Olsson, E. Bylow, and F. Kahl. Rank
minimization with structured data patterns. In Computer Vision–ECCV 2014, pages 250–265. Springer International Publishing, 2014.
[29] G. Lovric, S. F. Barré, J. C. Schittny, M. Roth-Kleiner,
M. Stampanoni, and R. Mokso. Dose optimization approach to fast x-ray microtomography of the lung alveoli.
Journal of applied crystallography, 46(4):856–860, 2013.
[30] S. Marchesini, H. He, H. N. Chapman, S. P. Hau-Riege,
A. Noy, M. R. Howells, U. Weierstall, and J. C. Spence. Xray image reconstruction from a diffraction pattern alone.
Physical Review B, 68(14):140101, 2003.
[31] J. Miao, P. Charalambous, J. Kirz, and D. Sayre. Extending
the methodology of x-ray crystallography to allow imaging
of micrometre-sized non-crystalline specimens. Nature,
400(6742):342–344, 1999.
[32] K. A. Nugent. Coherent methods in the x-ray sciences.
Advances in Physics, 59(1):1–99, 2010.
[33] C. Olsson, M. Carlsson, F. Andersson, and V. Larsson.
Non-convex rank/sparsity regularization and local minima.
arXiv preprint arXiv:1703.07171, 2017.
[34] M. A. Pfeifer, G. J. Williams, I. A. Vartanyants, R. Harder,
and I. K. Robinson. Three-dimensional mapping of
a deformation field inside a nanocrystal.
Nature,
442(7098):63–66, 2006.
[35] I. K. Robinson, I. A. Vartanyants, G. Williams, M. Pfeifer,
and J. Pitney. Reconstruction of the shapes of gold
nanocrystals using coherent x-ray diffraction. Physical Review Letters, 87(19):195505, 2001.
[36] J. A. Tropp. Just relax: Convex programming methods for
identifying sparse signals in noise. Information Theory,
IEEE Transactions on, 52(3):1030–1051, 2006.
[37] V. Van Nieuwenhove, J. De Beenhouwer, J. Vlassenbroeck, M. Moesen, M. Brennan, and J. Sijbers. Registration based sirt: A reconstruction algorithm for 4d ct.
[38] J. Wu and J. Spence. Reconstruction of complex singleparticle images using charge-flipping algorithm. Acta
Crystallographica Section A: Foundations of Crystallography, 61(2):194–200, 2005.
[1] F. Andersson and M. Carlsson. Alternating projections on
nontangential manifolds. Constr. Approx., 38(3):489–525,
2013.
[2] F. Andersson and M. Carlsson. On General Domain Truncated Correlation and Convolution Operators with Finite
Rank. Integr. Eq. Op. Th., 82(3), 2015.
[3] F. Andersson and M. Carlsson. Fixed-point algorithms for
frequency estimation and structured low rank approximation. arXiv preprint arXiv:1601.01242, 2016.
[4] F. Andersson and M. Carlsson. On the structure of positive semi-definite finite rank general domain hankel and
toeplitz operators in several variables. Complex Analysis
and Operator Theory, pages 1–30, 2016.
[5] F. Andersson, M. Carlsson, and M. V. de Hoop. Frequency
extrapolation through sparse sums of lorentzians. Journal
of Earth Science, 25(1).
[6] F. Andersson, M. Carlsson, and M. V. de Hoop. Nonlinear approximation of functions in two dimensions by sums
of exponential functions. Appl. Comput. Harmon. Anal.,
29(2):156–181, 2010.
[7] F. Andersson, M. Carlsson, and M. V. de Hoop. Sparse approximation of functions using sums of exponentials and
AAK theory. J. Approx. Theory, 163(2):213–248, 2011.
[8] F. Andersson, M. Carlsson, and V. V. Nikitin. Fast algorithms and efficient gpu implementations for the radon
transform and the back-projection operator represented as
convolution operators. SIAM Journal on Imaging Sciences,
9(2):637–664, 2016.
[9] F. Andersson, M. Carlsson, and V. V. Nikitin. Fast inversion of the exponential Radon transform by using fast
laplace transforms. J. Four. Anal. Appl., 2017.
[10] F. Andersson, M. Carlsson, and C. Olsson. Convergence of
dual ascent in non-convex/non-differentiable optimization.
arXiv preprint arXiv:1609.06576, 2016.
[11] F. Andersson, M. Carlsson, and C. Olsson. Convex envelopes for fixed rank approximation. Opt. Let., 2017.
[12] F. Andersson, M. Carlsson, and K.-M. Perfekt. Aak-type
theorems for hankel operators on weighted spaces. Bulletin des Sciences Mathématiques, 139(2):184–197, 2015.
[13] F. Andersson, M. Carlsson, J.-Y. Tourneret, and H. Wendt.
A new frequency estimation method for equally and unequally spaced data. Signal Processing, IEEE Transactions on, 62(21):5761–5774, 2014.
[14] F. Andersson, M. Carlsson, and H. Wendt. On a fixed-point
algorithm for structured low-rank approximation and estimation of half-life parameters. EUSIPCO 2016 conference
proceedings.
[15] E. J. Candès, J. Romberg, and T. Tao. Robust uncertainty
principles: Exact signal reconstruction from highly incomplete frequency information. Information Theory, IEEE
Transactions on, 52(2):489–509, 2006.
[16] M. Carlsson. On truncated Wiener-Hopf operators and
BMOpZq. Proc. Amer. Math. Soc., 139(5):1717–1733,
2011.
[17] M. Carlsson. On convexification/optimization of functionals including an l2-misfit term.
arXiv preprint
arXiv:1609.09378, 2016.
[18] M. Carlsson and J. Wittsten. A note on holomorphic functions and the Fourier-Laplace transform. Mathematica
Scandinavica, to appear.
[19] P. Cloetens, W. Ludwig, J. Baruchel, D. Van Dyck,
J. Van Landuyt, J. Guigay, and M. Schlenker. Holotomography: Quantitative phase tomography with micrometer
resolution using hard synchrotron radiation x rays. Ap-
4