Smoothed Analysis for Linear Optimization Algorithm

Steven R. Dunbar
Department of Mathematics
203 Avery Hall
University of Nebraska-Lincoln
Lincoln, NE 68588-0130
http://www.math.unl.edu
Voice: 402-472-3731
Fax: 402-472-8466
Topics in Probability Theory and Stochastic Processes
Steven R. Dunbar
Smoothed Analysis of Linear Optimization
Rating
Mathematicians Only: prolonged scenes of intense rigor.
Question of the Day
Key Concepts
1. The performance of an algorithm is usually measured by its running time, expressed as a function of the input size of the problem it solves.
2. The performance profiles of algorithms across the landscape of input
instances can differ greatly.
3. Average-case analyses employ distributions with concise mathematical descriptions, such as Gaussian random vectors, uniform vectors, and other standard distributions. The drawback of using such distributions is that the inputs encountered in practice may bear little resemblance to the inputs likely to be generated by such distributions.
4. An alternative is to identify typical properties of real data, define an input model that captures these properties, and then rigorously analyze the performance of algorithms assuming their inputs have these properties. Smoothed analysis is a step in this direction.
Vocabulary
1. The worst case measure is defined as
\[ \mathrm{WC}_A[n] = \max_{x \in \Omega_n} T_A[x]. \]
2. Supposing S provides a distribution over each $\Omega_n$, the average case measure corresponding to S is
\[ \mathrm{Ave}^S_A[n] = \mathop{\mathrm{E}}_{x \in_S \Omega_n}\left[ T_A[x] \right], \]
where the expectation is over $x \in_S \Omega_n$, indicating that x is randomly chosen from $\Omega_n$ according to distribution S.
3. A Gaussian random vector of variance $\sigma^2$, centered at the origin in $\Omega_n = \mathbf{R}^n$, is a vector in which each entry is an independent Gaussian random variable of variance $\sigma^2$ and mean 0.
4. The smoothed complexity of A with σ-Gaussian perturbations is given by
\[ \mathrm{Smoothed}^\sigma_A[n] = \max_{x_0 \in [-1,1]^n} \mathrm{E}\left[ T_A(x_0 + g) \right], \]
where g is a Gaussian random vector of variance $\sigma^2$.
Mathematical Ideas
Standard Complexity Measures
The performance of an algorithm is usually measured by its running time, expressed as a function of the input size of the problem it solves. The
performance profiles of algorithms across the landscape of input instances
can differ greatly and can be quite irregular. Some algorithms run in time
linear in the input size on all instances, some take quadratic or higher order
polynomial time, while some may take an exponential amount of time on
some instances. For example, we showed in Worst Case and Average Case Behavior of the Simplex Algorithm that on the Klee-Minty example in $\mathbf{R}^n$ the Simplex Algorithm with Dantzig's Rule for pivoting takes $2^n - 1$ steps.
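The Klee-Minty construction is concrete enough to replay in code. Below is a minimal tableau simplex with Dantzig's most-negative-reduced-cost rule, applied to the classic formulation maximize $\sum_{j=1}^n 10^{n-j} x_j$ subject to $2 \sum_{j<i} 10^{i-j} x_j + x_i \le 100^{i-1}$, $x \ge 0$. This is a toy sketch for these notes, not a production solver; if the implementation is faithful, the printed pivot counts should match $2^n - 1$.

import numpy as np

def dantzig_simplex(A, b, c):
    # Maximize c.x subject to A x <= b, x >= 0, starting from the
    # slack basis; returns the number of pivots performed.
    m, n = A.shape
    T = np.zeros((m + 1, n + m + 1))        # tableau [A | I | b]
    T[:m, :n] = A
    T[:m, n:n + m] = np.eye(m)
    T[:m, -1] = b
    T[-1, :n] = -c                          # objective row
    pivots = 0
    while True:
        j = int(np.argmin(T[-1, :-1]))      # Dantzig: most negative reduced cost
        if T[-1, j] >= -1e-9:
            return pivots                   # optimal
        col = T[:m, j]
        ratios = np.full(m, np.inf)
        pos = col > 1e-9
        ratios[pos] = T[:m, -1][pos] / col[pos]
        i = int(np.argmin(ratios))          # ratio test picks the leaving row
        T[i, :] /= T[i, j]
        for r in range(m + 1):
            if r != i:
                T[r, :] -= T[r, j] * T[i, :]
        pivots += 1

for n in range(1, 7):
    A = np.zeros((n, n))
    for i in range(n):
        for j in range(i):
            A[i, j] = 2.0 * 10.0 ** (i - j)
        A[i, i] = 1.0
    b = np.array([100.0 ** i for i in range(n)])
    c = np.array([10.0 ** (n - 1 - j) for j in range(n)])
    print(n, dantzig_simplex(A, b, c), 2 ** n - 1)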
Although we normally evaluate the performance of an algorithm by its
running time, other performance parameters are often important. These
performance measures include the amount of memory space required, the
number of bits of precision required to achieve a given output accuracy, the
number of cache misses, the error probability of a decision algorithm, the
number of random bits needed in a randomized algorithm, the number of
calls to a given subroutine, and the number of examples needed in a learning
algorithm.
When A is an algorithm for solving problem P, we let $T_A[x]$ denote the running time of algorithm A on input instance x. An input domain $\Omega$ of all input instances is usually viewed as the union of a family of subdomains $\{\Omega_1, \Omega_2, \ldots, \Omega_n, \ldots\}$, where $\Omega_n$ represents all instances in $\Omega$ of size n.
The worst case measure is defined as
\[ \mathrm{WC}_A[n] = \max_{x \in \Omega_n} T_A[x]. \]
For example, the Klee-Minty example in Worst Case and Average Case
Behavior of the Simplex Algorithm shows that
\[ \mathrm{WC}_{\mathrm{Simplex}}[n] \ge C \cdot 2^n, \]
where C is some constant measuring the running time at each pivot.
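As a toy illustration of this definition (not of the simplex algorithm itself), one can exhaustively search a small finite input family and take the maximum of a running-time proxy. Here the family of sign vectors standing in for $\Omega_n$ and the proxy T_A are both illustrative assumptions.

import itertools

def T_A(x):
    # hypothetical running-time proxy for algorithm A on input x
    return sum(1 for v in x if v > 0) ** 2

def worst_case(n):
    # WC_A[n] = max over the (toy) Omega_n of T_A[x]
    return max(T_A(x) for x in itertools.product((-1, 1), repeat=n))

print(worst_case(8))   # exhaustive maximum over the 2^8 toy instances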
The average case measures have more parameters. In each average-case measure, one first determines a distribution of inputs and then measures the expected performance of the algorithm assuming inputs are drawn from this distribution. Supposing S provides a distribution over each $\Omega_n$, the average case measure corresponding to S is
\[ \mathrm{Ave}^S_A[n] = \mathop{\mathrm{E}}_{x \in_S \Omega_n}\left[ T_A[x] \right], \]
where the expectation is over $x \in_S \Omega_n$, indicating that x is randomly chosen from $\Omega_n$ according to distribution S. One would ideally choose the distribution of inputs that occurs in practice, but it is rare that one can determine
or cleanly express these distributions. Furthermore, the distributions can
vary greatly from one application to another. Instead, average-case analyses
have employed distributions with concise mathematical descriptions, such as
Gaussian random vectors, uniform vectors, and other standard distributions.
The drawback of using such distributions is that the inputs encountered in practice may bear little resemblance to the inputs likely to be generated by such distributions.
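In code, passing from the worst case to an average case simply replaces the exhaustive maximum by a sample mean. A minimal Monte Carlo sketch, assuming (purely for illustration) that S is the standard Gaussian distribution on $\mathbf{R}^n$ and reusing a hypothetical proxy T_A:

import numpy as np

def T_A(x):
    # hypothetical running-time proxy, as before
    return float(np.sum(np.abs(x)))

def average_case(n, samples=10_000, seed=0):
    # Ave^S_A[n] ~ sample mean of T_A over draws x ~ S
    rng = np.random.default_rng(seed)
    return float(np.mean([T_A(rng.standard_normal(n)) for _ in range(samples)]))

print(average_case(10))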
Smoothed Analysis Measures
An alternative is to identify typical properties of real data, define an input model that captures these properties, and then rigorously analyze the performance of algorithms assuming their inputs have these properties. Smoothed
analysis is a step in this direction. It is motivated by the observation that
real data is often subject to some small degree of noise. For example, in
industrial optimization and economic prediction, the input parameters could
be obtained by physical measurements, and the measurements usually have
some low magnitude uncertainty. At a high level, each input is generated
from a two-stage model. In the first stage, an instance of the problem is
formulated according to say physical, industrial or economic considerations.
In the second stage, the instance from the first stage is slightly perturbed.
The perturbed instance is the input to the algorithm.
In smoothed analysis, we assume the input to the algorithm is subject to a slight random perturbation. The smoothed measure of an algorithm
on an input instance is its expected performance over the perturbations of
that instance. Define the smoothed complexity of the algorithm to be the
maximum smoothed measure over the input instances.
A Gaussian random vector of variance $\sigma^2$, centered at the origin in $\Omega_n = \mathbf{R}^n$, is a vector in which each entry is an independent Gaussian random variable of variance $\sigma^2$ and mean 0, meaning that the probability density of each entry in the vector is
\[ \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/(2\sigma^2)}. \]
For a vector $x_0 \in \mathbf{R}^n$, the σ-Gaussian perturbation of $x_0$ is a random vector $x = x_0 + g$ where g is a Gaussian random vector of variance $\sigma^2$.
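In numpy, the σ-Gaussian perturbation of the definition is one line; the function name here is our own:

import numpy as np

def sigma_gaussian_perturbation(x0, sigma, rng=np.random.default_rng()):
    g = sigma * rng.standard_normal(len(x0))   # each entry ~ N(0, sigma^2)
    return x0 + g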
Definition. Suppose A is an algorithm with $\Omega_n = \mathbf{R}^n$. Then the smoothed complexity of A with σ-Gaussian perturbations is given by
\[ \mathrm{Smoothed}^\sigma_A[n] = \max_{x_0 \in [-1,1]^n} \mathrm{E}\left[ T_A(x_0 + g) \right], \]
where g is a Gaussian random vector of variance $\sigma^2$.
In words, this definition says:
1. Perturb the original input $x_0$ to obtain the input $x_0 + g$.
2. Feed the perturbed input into the algorithm.
3. For each original input, measure the expected running time of the algorithm A on random perturbations of that input.
4. Then obtain the smoothed complexity as the largest of these expected running times, that is, the expected running time under the worst possible original input; see the sketch below.
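This procedure can be approximated numerically. The following is a minimal Monte Carlo sketch with two loudly labeled assumptions: a random sample of base points stands in for the maximum over all of $[-1,1]^n$ (so the estimate only lower-bounds the true maximum), and T_A is a hypothetical running-time proxy supplied by the caller.

import numpy as np

def smoothed_estimate(T_A, n, sigma, n_base=50, n_pert=200, seed=1):
    # Estimate Smoothed^sigma_A[n]: inner mean over perturbations,
    # outer max over sampled base inputs x0 in [-1,1]^n.
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(n_base):
        x0 = rng.uniform(-1.0, 1.0, n)                          # original input
        perts = x0 + sigma * rng.standard_normal((n_pert, n))   # step 1: perturb
        expected = float(np.mean([T_A(x) for x in perts]))      # steps 2-3
        worst = max(worst, expected)                            # step 4: worst x0
    return worst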
By varying $\sigma^2$ between 0 and infinity, one can use smoothed analysis to interpolate between worst-case and average-case analysis. When $\sigma^2 = 0$, one recovers the ordinary worst-case analysis. As $\sigma^2$ grows large, the random perturbation g dominates the original $x_0$ and one obtains an average-case analysis. We are often interested in the case when σ (the standard deviation, measured in the same units as $\|x\|$) is small relative to $\|x\|$, in which case $x + g$ is a slight perturbation of x. Smoothed analysis often demonstrates that a perturbed problem is less time-consuming to solve.
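Reusing the hypothetical smoothed_estimate sketch above makes this interpolation visible: at σ = 0 the perturbation vanishes and each inner mean is just $T_A(x_0)$, so the estimate is a (sampled) worst case, while for large σ the noise swamps $x_0$ and an average-case quantity emerges.

import numpy as np

proxy = lambda x: float(np.sum(np.abs(x)))   # stand-in running-time proxy
for sigma in (0.0, 0.01, 0.1, 1.0, 10.0):
    print(sigma, smoothed_estimate(proxy, 10, sigma))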
Definition. Algorithm A has polynomial smoothed complexity if there exist positive constants $n_0$, $\sigma_0$, c, $k_1$ and $k_2$ such that for all $n \ge n_0$ and $0 \le \sigma \le \sigma_0$,
\[ \mathrm{Smoothed}^\sigma_A[n] \le c \cdot \sigma^{-k_1} \cdot n^{k_2}. \]
Recall Markov's Inequality: if X is a random variable that takes only nonnegative values, then for any $a > 0$,
\[ \mathrm{P}[X \ge a] \le \mathrm{E}[X]/a. \]
Therefore, if an algorithm A has smoothed complexity $T(n, \sigma)$, then for any $0 < \delta < 1$,
\[ \max_{x_0 \in [-1,1]^n} \mathrm{P}\left[ T_A[x_0 + g] \le \delta^{-1} T(n, \sigma) \right] \ge 1 - \delta. \]
Proof. Fix $x_0 \in [-1,1]^n$. By the definition of smoothed complexity, $\mathrm{E}[T_A[x_0 + g]] \le T(n, \sigma)$. Since running times are nonnegative, Markov's Inequality with $a = \delta^{-1} T(n, \sigma)$ gives
\[ \mathrm{P}\left[ T_A[x_0 + g] \ge \delta^{-1} T(n, \sigma) \right] \le \frac{\mathrm{E}[T_A[x_0 + g]]}{\delta^{-1} T(n, \sigma)} \le \delta. \]
Taking complements gives the displayed bound for every $x_0$, hence in particular for the maximizing $x_0$.
This says that if A has polynomial smoothed complexity, then for any x0 ,
with probability at least 1 − δ, A can solve a random perturbation of x0 in
time polynomial in n, 1/σ, and 1/δ.
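A quick numerical sanity check of this Markov-type bound, assuming (arbitrarily) exponentially distributed nonnegative "running times" whose mean plays the role of $T(n, \sigma)$: the observed fraction of runs exceeding $\delta^{-1} T$ should be at most δ.

import numpy as np

rng = np.random.default_rng(2)
times = rng.exponential(scale=1.0, size=100_000)   # nonnegative, mean ~ 1
T = times.mean()
for delta in (0.5, 0.1, 0.01):
    frac = float(np.mean(times > T / delta))
    print(f"delta={delta}: observed tail {frac:.5f} <= {delta}")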
This probabilistic upper bound does not imply that the smoothed complexity of A is $O(T(n, \sigma))$. Blum, Dunagan, Beier, and Vöcking introduced a relaxation of polynomial smoothed complexity:
Definition. Algorithm A has probably polynomial smoothed complexity if there exist constants $n_0$, $\sigma_0$, c and α such that for $n \ge n_0$ and $0 \le \sigma \le \sigma_0$,
\[ \max_{x_0 \in [-1,1]^n} \mathrm{E}\left[ T_A[x_0 + g]^\alpha \right] \le c \cdot \sigma^{-1} \cdot n. \]
They show that some algorithms have probably polynomial smoothed
complexity, in spite of the fact that their smoothed complexity is unbounded.
Spielman and Teng considered the smoothed complexity of the simplex algorithm with the shadow-vertex pivot rule developed by Gass and Saaty. They showed that the smoothed complexity of the algorithm is polynomial.
Vershynin improved their result to obtain a smoothed complexity of
\[ O\left( \max\left( n^5 (\log m)^2, \; n^9 (\log m)^4, \; n^3 \sigma^{-4} \right) \right). \]
Sources
This section is adapted from the article “Smoothed Analysis: An attempt to explain the behavior of algorithms in practice” by Daniel A. Spielman and Shang-Hua Teng, [1].
Problems to Work for Understanding
1.
2.
3.
4.
Reading Suggestion:
References
[1] Daniel A. Spielman and Shang-Hua Teng. Smoothed analysis: An attempt to explain the behavior of algorithms in practice. Communications
of the ACM, 52(10):77–84, October 2009.
Outside Readings and Links:
1.
2.
3.
4.
I check all the information on each page for correctness and typographical
errors. Nevertheless, some errors may occur and I would be grateful if you would
alert me to such errors. I make every reasonable effort to present current and
accurate information for public use, however I do not guarantee the accuracy or
timeliness of information on this website. Your use of the information from this
website is strictly voluntary and at your risk.
I have checked the links to external sites for usefulness. Links to external
websites are provided as a convenience. I do not endorse, control, monitor, or
guarantee the information contained in any external website. I don’t guarantee
that the links are active at all times. Use the links here with the same caution as
you would all information on the Internet. This website reflects the thoughts, interests and opinions of its author. They do not explicitly represent official positions
or policies of my employer.
Information on this website is subject to change without notice.
Steve Dunbar’s Home Page, http://www.math.unl.edu/~sdunbar1
Email to Steve Dunbar, sdunbar1 at unl dot edu
Last modified: Processed from LaTeX source on January 27, 2011