Minima and Maxima

Science of Choice
• Optimization - it’s all about finding the best way to do a
specific task
• Optimization helps us to choose, on the basis of defined
criteria and restrictions, the best alternative available
• Examples are found in most areas of human activity:
• an oil company may wish to find the optimal rate of extraction
from one of its wells
• a manager seeks optimal combinations of inputs to maximize
profits & minimize costs
Science of Choice
• Studying optimization systematically requires mathematical
modelling:
• Defining the objective function - what to maximize / minimize
• Defining the decision variables
• Optimization - process of finding the set of values of the
variables that will lead us to the desired extremum of the
objective function
Local and Global Maximum
Assume y = f(x) is a continuous function defined on a closed
interval I = [a, b]. The points in the domain of a function where it
reaches its largest values are usually referred to as maximum
points. We distinguish between local and global maximum points:
Definition
The point x0 is a local maximum of f on I if there exists a
neighborhood U(x0) of x0 such that f(x0) ≥ f(x) for all x ∈ U(x0).
Definition
The point x0 is a global maximum of f on I if f(x0) ≥ f(x) for
all x ∈ I.
Local and Global Minimum
The points in the domain of a function where it reaches its smallest
values are referred to as minimum points. As in the case of
maximum values, we distinguish between local and global minimum
points:
Definition
The point x0 is a local minimum of f on I if there exists a
neighborhood U(x0) of x0 such that f(x0) ≤ f(x) for all x ∈ U(x0).
Definition
The point x0 is a global minimum of f on I if f(x0) ≤ f(x) for
all x ∈ I.
Collectively local / global maximum or minimum points are known
as extreme points.
Local vs. Global Extremum
• Most of the time we discuss finding local (or relative)
extreme value points
• Remember that being able to show that a point is a local
minimum or maximum is no guarantee that it is also the global
minimum or maximum
• How do we find global extremum then?
• it must be either a local extreme value or one of the end points
of the interval
• e.g. finding maximum: if we know all the local maxima, it is
necessary only to select the largest of these and compare it
with the end points to see whether it is also the global
maximum
• An extreme point is termed a free extremum, if it is not an
end point of the interval.
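The recipe above can be sketched in a few lines of Python. This is a minimal illustration, not from the slides: the concave function, its stationary point, and the interval are all hypothetical choices.

```python
# Minimal sketch: find the global maximum of a continuous function on a
# closed interval by comparing stationary points with the end points.
# The function, stationary point, and interval below are hypothetical.

def f(x):
    return -(x - 1.0) ** 2 + 3.0  # concave parabola with peak at x = 1

# Stationary points normally come from solving f'(x) = 0;
# here f'(x) = -2(x - 1) = 0 gives x = 1.
stationary = [1.0]
a, b = -3.0, 1.5  # closed interval [a, b]

candidates = stationary + [a, b]
global_max_point = max(candidates, key=f)
print(global_max_point, f(global_max_point))  # x = 1 attains the global max
```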
Local vs Global Extremum
Figure: Local vs Global Extremum: Function domain is [−3, 1.5]
Local vs Global Extremum
Figure: Local vs Global Extremum: Function domain is [−2.5, 1.25]
Optimality for differentiable functions
• In the following slides we will present the local optimality
conditions for univariate differentiable functions:
• necessary first-order condition
• sufficient conditions:
• first derivative test
• second derivative test
First-order Condition (FOC)
Theorem
Suppose that a function f is differentiable in an interval I and x0 is
an interior point of I. For x = x0 to be a maximum or minimum
point of f in I, a necessary condition is that it is a stationary
(critical) point of f, i.e. x = x0 satisfies the equation f′(x0) = 0
(first-order condition)
Note:
• All extreme value points of differentiable functions must satisfy
the necessary conditions.
• However, other points which are not extreme values can also
satisfy the necessary first-order condition.
• Therefore, based on this condition alone, one cannot be sure
whether the point actually is a true extreme value point. ⇒
We need to take a look at the sufficient conditions:
the first-derivative test and the second-derivative test
First-derivative test
Theorem
If f(x) is a continuously differentiable function on an interval
[a, b], then the sufficient and necessary condition for a local extreme
value at a point x0 is the following:

x0 is a maximum ⇔ f′(x0) = 0,
                  f′(x) ≥ 0 when x < x0, x ∈ U(x0),
                  f′(x) ≤ 0 when x > x0, x ∈ U(x0)

x0 is a minimum ⇔ f′(x0) = 0,
                  f′(x) ≤ 0 when x < x0, x ∈ U(x0),
                  f′(x) ≥ 0 when x > x0, x ∈ U(x0)
Note: if f′(x) is not defined at a particular point, it does not mean
that the point cannot be an extreme value. Such points must be
checked carefully.
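The first-derivative test can be sketched as a small helper that checks the sign of f′ just left and right of a stationary point. The helper name and the finite step h are illustrative assumptions, not from the slides; the test function f(x) = x³ − 12x appears later in the examples.

```python
# Sketch of the first-derivative test (hypothetical helper): classify a
# stationary point x0 by the sign of f'(x) slightly left and right of it.

def classify_stationary(fprime, x0, h=1e-4):
    left, right = fprime(x0 - h), fprime(x0 + h)
    if left >= 0 >= right:
        return "maximum"       # f' changes sign from + to -
    if left <= 0 <= right:
        return "minimum"       # f' changes sign from - to +
    return "not an extremum"   # no sign change

# f(x) = x^3 - 12x has f'(x) = 3x^2 - 12, stationary at x = -2 and x = 2.
fprime = lambda x: 3 * x**2 - 12
print(classify_stationary(fprime, -2.0))  # maximum
print(classify_stationary(fprime, 2.0))   # minimum
```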
Examples
Examples
1. Find the local extrema of the function
y = f(x) = x³ − 12x² + 36x + 8
2. Find the local extremum of the average cost function
AC = f(Q) = Q² − 5Q + 8
3. You own real estate whose value t years from now is given
by the function V(t) = 10,000·exp(√t). Assuming that the
interest rate for the foreseeable future will remain at 6 percent,
what is the optimal selling time that maximizes the present
value of your asset?
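Example 1 can be checked directly: f′(x) = 3x² − 24x + 36 = 3(x − 2)(x − 6), so the stationary points are x = 2 and x = 6, classified here by the sign of f″. A minimal numerical sketch:

```python
# Worked check of Example 1: f(x) = x^3 - 12x^2 + 36x + 8.
# f'(x) = 3x^2 - 24x + 36 = 3(x - 2)(x - 6) gives stationary points 2 and 6.

def f(x):
    return x**3 - 12 * x**2 + 36 * x + 8

def fsecond(x):
    return 6 * x - 24          # f''(x) = 6x - 24

for x0 in (2.0, 6.0):
    kind = "max" if fsecond(x0) < 0 else "min"
    print(x0, f(x0), kind)     # (2, 40, max) and (6, 8, min)
```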
Second derivative test
Theorem
Assume that f(x) is twice continuously differentiable on an interval
[a, b]. A sufficient condition for a free local extreme value at x0 is
the following:

f′(x0) = 0 and f″(x0) < 0 ⇒ x0 is a maximum
f′(x0) = 0 and f″(x0) > 0 ⇒ x0 is a minimum

Remark: the case f′(x0) = 0 and f″(x0) = 0 means that either
there is no extremum or higher-order derivatives need to be
computed to determine the existence of an extremum.
Examples
Examples
1. Find the local extrema of y = f(x) = x³ − 12x and check the
2nd-order conditions.
2. Let total revenue R(Q) = 1200Q − 2Q² and total cost
C(Q) = Q³ − 61.25Q² + 1528.5Q + 2000. What is the
maximum profit?
3. Show that when profit is at its maximum, marginal revenue is
equal to marginal cost.
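Example 2 reduces to a univariate second-derivative test: profit is π(Q) = R(Q) − C(Q) = −Q³ + 59.25Q² − 328.5Q − 2000, and π′(Q) = 0 is a quadratic. A sketch that solves it with the quadratic formula and classifies the roots:

```python
import math

# Sketch for Example 2: pi(Q) = R(Q) - C(Q)
#   = (1200Q - 2Q^2) - (Q^3 - 61.25Q^2 + 1528.5Q + 2000)
#   = -Q^3 + 59.25Q^2 - 328.5Q - 2000
# FOC: pi'(Q) = -3Q^2 + 118.5Q - 328.5 = 0, a quadratic in Q.

def profit(Q):
    return -Q**3 + 59.25 * Q**2 - 328.5 * Q - 2000

a, b, c = -3.0, 118.5, -328.5
disc = math.sqrt(b * b - 4 * a * c)
roots = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])  # Q = 3, 36.5

# SOC: pi''(Q) = -6Q + 118.5 picks out the maximum at Q = 36.5.
for Q in roots:
    print(Q, profit(Q), "max" if -6 * Q + 118.5 < 0 else "min")
```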
Inflection points
• As we remarked earlier, we cannot use the second-derivative
test to detect the nature of an extreme value point
when f″(x0) = 0
• If f‴(x0) ≠ 0 (or a higher-order odd derivative is non-zero when
the lower-order derivatives are zero), it leads us to a discussion
of inflection points. An inflection point can be defined as
follows:
• A point on a curve at which the second derivative changes sign
from positive to negative or from negative to positive (e.g. the
curve changes from concave to convex)
• A point on a curve where the tangent crosses the curve itself
• Saddle point = a point which is both a stationary point (i.e.
f′(x0) = 0) and a point of inflection
• A saddle point is not a local extremum
Saddle Point
Examples
Examples
1. Show that the function x³ has a saddle point.
2. Check for extrema and saddle points of (x − 2)²(x − 3)³
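Example 1 can be verified from the definitions: x = 0 is stationary for f(x) = x³, and f″ changes sign there, so it is a saddle point. A minimal check (the step h is an illustrative choice):

```python
# Sketch for Example 1: f(x) = x^3 at x = 0 is both a stationary point
# (f'(0) = 0) and an inflection point (f'' changes sign), i.e. a saddle point.

def fprime(x):
    return 3 * x**2     # f'(0) = 0, so x = 0 is stationary

def fsecond(x):
    return 6 * x        # negative left of 0, positive right of 0

h = 1e-3
is_stationary = fprime(0.0) == 0.0
sign_change = fsecond(-h) < 0 < fsecond(h)
print(is_stationary and sign_change)  # True: x = 0 is a saddle point
```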
Exercise Problems
Answers to Exercise Problems
Multivariate First order condition (FOC)
• In order to find local maximum and minimum values, we need
first and second order partial derivatives in the case of
functions with multiple variables
• Like in the one-variable case, the first-order (necessary) condition
for a point x* to be a max / min of a function f is f′(x*) = 0,
i.e. x* is a critical point
• A similar first-order condition works for a function f of n
variables, but with partial derivatives
In the next slides we will consider multivariate functions of the form
f: Rⁿ → R, denoted f(x), where x represents a
vector (x1, x2, ..., xn).
Multivariate First order condition (FOC)
Theorem
(FOC) Let f: U → R be a continuously differentiable function
defined on a subset U of Rⁿ. If x* is a local max or min of f in U
and x* is an interior point of U, then ∂f/∂xi(x*) = 0 for i = 1, ..., n,
i.e. the total differential df = 0. We say that the n-vector x* is a
critical point of the function f.
Examples
Examples
Find the points, which satisfy FOC in the following problems (i.e.
candidate points for max/min):
1. z = x² + xy − y² + 3x
2. F(x, y) = x³ − y³ + 9xy
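Problem 2 can be checked by evaluating the gradient at the candidate points. Reading the garbled source as F(x, y) = x³ − y³ + 9xy (the signs are reconstructed, so treat them as an assumption), the partials F_x = 3x² + 9y and F_y = −3y² + 9x vanish simultaneously at (0, 0) and (3, −3):

```python
# Sketch checking the FOC candidates for problem 2, read here as
# F(x, y) = x^3 - y^3 + 9xy (signs reconstructed from the garbled source).

def grad(x, y):
    # F_x = 3x^2 + 9y,  F_y = -3y^2 + 9x
    return (3 * x**2 + 9 * y, -3 * y**2 + 9 * x)

for point in [(0.0, 0.0), (3.0, -3.0)]:
    print(point, grad(*point))  # both gradients are (0.0, 0.0)
```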
Second Order Condition
• Note that the FOC does not tell us whether a critical point
is a minimum, a maximum, or an inflection point.
• Therefore we need a second-order (sufficiency) condition to
verify the true nature of the extreme-point candidate
• We apply the same principle as in the univariate case,
although the required steps are a bit more elaborate due to the
number of variables involved
• Thus, we need to compute the 2nd-order differential, which in
the case of multiple variables is a matrix (referred to as the
Hessian):

d²f/dx² = [ ∂²f/∂x1∂x1  ⋯  ∂²f/∂x1∂xn ]
          [     ⋮        ⋱      ⋮     ]
          [ ∂²f/∂xn∂x1  ⋯  ∂²f/∂xn∂xn ]
Second Order Condition
Theorem
(SOC) Let f: U → R be a twice continuously differentiable function
defined on a subset U of Rⁿ, for which x* is a critical point. Then:
1. The function f has a local minimum at x* if the Hessian
matrix is positive definite
2. The function f has a local maximum at x* if the Hessian
matrix is negative definite
3. The critical point is neither a minimum nor a maximum, but
a saddle point, if the Hessian matrix is indefinite (neither
positive nor negative semidefinite)
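For two variables the definiteness check reduces to leading principal minors: with Hessian [[fxx, fxy], [fxy, fyy]], positive determinant plus fxx > 0 gives a minimum, positive determinant plus fxx < 0 a maximum, negative determinant a saddle. A sketch (the helper name is an illustrative assumption):

```python
# Sketch of the SOC for two variables: classify the 2x2 Hessian
# [[fxx, fxy], [fxy, fyy]] by its leading principal minors.

def classify_hessian_2x2(fxx, fxy, fyy):
    det = fxx * fyy - fxy * fxy
    if det > 0 and fxx > 0:
        return "local minimum"   # positive definite
    if det > 0 and fxx < 0:
        return "local maximum"   # negative definite
    if det < 0:
        return "saddle point"    # indefinite
    return "inconclusive"        # semidefinite: the test fails

# e.g. f(x, y) = x^2 + y^2 at its critical point (0, 0): H = [[2, 0], [0, 2]]
print(classify_hessian_2x2(2.0, 0.0, 2.0))  # local minimum
```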
Examples
Examples
1. Find the local maxima and minima of f(x, y) = x³ + y³ − xy
2. Find the extreme value(s) of z = 8x³ + 2xy − 3x² + y² + 1
3. Find the extreme value(s) of z = x + 2ey − e^x − e^(2y)
4. A monopolist producing a single output has two types of
customers. If it produces Q1 units for customers of type 1,
then these customers are willing to pay a price of 50 − 5Q1
dollars per unit. If it produces Q2 for customers of type 2, then
these customers are willing to pay a price of 100 − 10Q2 dollars
per unit. The monopolist’s cost of manufacturing Q units of
output is 90 + 20Q dollars. In order to maximize profits, how
much should the monopolist produce for each market?
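The monopolist problem (example 4) works out in closed form: π(Q1, Q2) = (50 − 5Q1)Q1 + (100 − 10Q2)Q2 − (90 + 20(Q1 + Q2)), the FOC gives 30 − 10Q1 = 0 and 80 − 20Q2 = 0, and the Hessian diag(−10, −20) is negative definite. A quick numerical confirmation:

```python
# Sketch for example 4: profit of the two-market monopolist.
#   pi(Q1, Q2) = (50 - 5Q1)Q1 + (100 - 10Q2)Q2 - (90 + 20(Q1 + Q2))
# FOC: 30 - 10*Q1 = 0 and 80 - 20*Q2 = 0 => Q1 = 3, Q2 = 4.
# SOC: Hessian diag(-10, -20) is negative definite, so this is a maximum.

def profit(Q1, Q2):
    return (50 - 5 * Q1) * Q1 + (100 - 10 * Q2) * Q2 - (90 + 20 * (Q1 + Q2))

Q1, Q2 = 30 / 10, 80 / 20
print(Q1, Q2, profit(Q1, Q2))  # 3.0 4.0 115.0
```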
Background for directional derivatives
• Before going to directional derivatives, let’s introduce curves:
• Let x(t) = (x1(t), ..., xn(t)), where each coordinate function
xi: R → R is continuous. We say that x(t) is a curve in Rⁿ
parametrized by t. Interpretation: the vector (n-tuple)
(x1(t), ..., xn(t)) describes the coordinates of the curve at the
point where the parameter value is t.
• E.g. consider that you travel from city A to city B by car. The
path of the trip can be described on a map as a curve.
Naturally there are various ways to parametrize the curve; one
simple way is to report map coordinates that identify the
location of the car when the car has been driving for t hours.
Chain Rule for Curves
Theorem
(Chain Rule for curves) If x(t) is a continuously differentiable curve
on an interval about t0 and f is a continuously differentiable
function in a neighborhood of the point x(t0), then
g(t) := f(x(t)) = f(x1(t), ..., xn(t)) is a continuously differentiable
function at t0 and the derivative of g with respect to t is given by

dg/dt(t0) = ∂f/∂x1(x(t0))·x1′(t0) + ∂f/∂x2(x(t0))·x2′(t0) + ... + ∂f/∂xn(x(t0))·xn′(t0)
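The theorem can be sanity-checked on a concrete case. With the hypothetical choices f(x1, x2) = x1²x2 and the curve x(t) = (t, t²), we have g(t) = t⁴, so g′(t) = 4t³ should agree with the chain-rule sum:

```python
# Numerical check of the chain rule for curves (hypothetical f and x(t)):
# f(x1, x2) = x1^2 * x2 along x(t) = (t, t^2), so g(t) = t^4, g'(t) = 4t^3.

def chain_rule_derivative(t):
    x1, x2 = t, t**2                      # the curve ...
    dx1, dx2 = 1.0, 2 * t                 # ... and its componentwise derivatives
    df_dx1, df_dx2 = 2 * x1 * x2, x1**2   # partials of f evaluated at x(t)
    return df_dx1 * dx1 + df_dx2 * dx2

print(chain_rule_derivative(2.0))  # 32.0, which equals 4 * 2^3
```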
Directional Derivative and Gradient
• Using the previous chain rule, we can compute the rate of
change of a multivariate function F(x1, ..., xn) at a given point
x* in any given direction v = (v1, ..., vn)
• For this purpose we need to parametrize the direction v from
the point x*. The equation of the line is given by x = x* + tv,
where t ∈ R, i.e.
x(t) = (x1(t), ..., xn(t)) = (x1* + tv1, ..., xn* + tvn)
• Now we can define g(t) = F(x(t)) = F(x* + tv) =
F(x1* + tv1, ..., xn* + tvn)
• Using the chain rule to compute the derivative of g at t = 0, we
obtain:

dg/dt(0) = ∂F/∂x1(x*)·v1 + ∂F/∂x2(x*)·v2 + ... + ∂F/∂xn(x*)·vn

• We say that g′(0) is the directional derivative of the function F in
the direction defined by the vector v
Directional Derivative and Gradient
Definition
(Directional derivative and gradient) If F(x1, ..., xn) is a
continuously differentiable function and v = (v1, ..., vn) is a direction
vector, the directional derivative of F at x* ∈ Rⁿ in direction v is
given by

Dv F(x*) = ∇F(x*) · v = ∂F/∂x1(x*)·v1 + ∂F/∂x2(x*)·v2 + ... + ∂F/∂xn(x*)·vn
         = Σ(i=1..n) ∂F/∂xi(x*)·vi

The vector ∇F = (∂F/∂x1(x*), ..., ∂F/∂xn(x*))ᵀ is called the gradient of F.
Note: some sources assume ||v|| = 1 as a part of the definition.
Examples
Examples
1. Consider the production function Q = 4K^(3/4)L^(1/4). Assume that
the current input is (K, L) = (10000, 625). Compute the
directional derivative in the direction (1, 1).
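This example evaluates cleanly: Q_K = 3(L/K)^(1/4) = 3·(1/16)^(1/4) = 1.5 and Q_L = (K/L)^(3/4) = 16^(3/4) = 8 at the given input, so the directional derivative in direction (1, 1) is 1.5 + 8 = 9.5. A sketch:

```python
# Sketch for the example: Q = 4 * K^(3/4) * L^(1/4) at (K, L) = (10000, 625).
# Partials: Q_K = 3*(L/K)^(1/4) and Q_L = (K/L)^(3/4), so the directional
# derivative in direction v = (1, 1) is Q_K*1 + Q_L*1.

K, L = 10000.0, 625.0
Q_K = 3 * (L / K) ** 0.25   # = 3 * (1/16)^(1/4) = 1.5
Q_L = (K / L) ** 0.75       # = 16^(3/4) = 8.0
v = (1.0, 1.0)
D_v = Q_K * v[0] + Q_L * v[1]
print(Q_K, Q_L, D_v)        # 1.5 8.0 9.5
```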
Interpreting Directional Derivatives
• According to the definition, the directional derivative
measures the rate at which the function F rises or falls as
one moves out from x* in the direction of v
• Let’s concentrate for a while purely on the direction of v,
ignoring its length, i.e. we set ||v|| = 1. Because the
directional derivative is a dot product of two vectors, we have
the following representation:

Dv F(x*) = ∇F(x*) · v = ||∇F(x*)|| ||v|| cos θ = ||∇F(x*)|| cos θ

• Question: "In what direction does F increase the most
rapidly?"
• Answer: F increases most rapidly when moving in the
direction of the gradient (cos θ = 1 when θ = 0)
Examples
Examples
1. Let f = f(x, y) be a function such that f(1, 2) = 10 and
∇f(1, 2) = (3, 4)ᵀ
1.1 What are the directional derivatives of f in the directions (1, 0)
and (0, 1)?
1.2 Compute the directional derivative of f in the direction given
by the function’s gradient, i.e. (3, 4)
1.3 Consider the direction vector (4/5, −3/5) and the directional
derivative of f in that direction. Interpret.
2. Consider again the production function Q = 4K^(3/4)L^(1/4) and
assume that the current input is (K, L) = (10000, 625). In
what direction should we add K and L in order to increase
production most rapidly?
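Example 1 reduces to dot products with the given gradient. A sketch (reading the direction in 1.3 as (4/5, −3/5), a unit vector orthogonal to the gradient, which appears to be the intended point of the exercise):

```python
# Sketch for Example 1: with gradient grad_f = (3, 4) at the point (1, 2),
# each directional derivative is just the dot product grad_f . v.

grad_f = (3.0, 4.0)

def D(v):
    return grad_f[0] * v[0] + grad_f[1] * v[1]

print(D((1.0, 0.0)))   # 3.0: rate of change in the x-direction
print(D((0.0, 1.0)))   # 4.0: rate of change in the y-direction
print(D((3.0, 4.0)))   # 25.0 = ||grad_f||^2: the steepest-ascent direction
print(D((4/5, -3/5)))  # ~0: this unit vector is orthogonal to grad_f
```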