Week 8 Notes

Seunghee Ye
Ma 8: Week 8
Nov 17
Week 8 Summary
This week, we will discuss optimization. Given a function f(x), how do we find x that maximizes
f(x)? This discussion will bring us to convex functions, which will prove to be quite useful.
Topics
1 Optimization and Convex Functions
1.1 Optimization
1.2 Convex Functions

1 Optimization and Convex Functions
1.1 Optimization
Definition (Critical points and local extrema)
Let f be a C^1 function. Then, we say that a is a critical point of f if f'(a) = 0. We say that a is a
local minimum (resp. local maximum) of f if there exists δ > 0 such that for all x ∈ (a − δ, a + δ), we
have f(x) ≥ f(a) (resp. f(x) ≤ f(a)). If a is either a local minimum or a local maximum, we say that a is
a local extremum.
What we care about are local extrema: we want to find values a such that f is locally minimized/maximized
at a. Then, why did we introduce critical points? As you might know already, we care about critical points
because local extrema are always critical points.
Proposition 1.1. Let f be a C^1 function and suppose a is a local extremum of f. Then, f'(a) = 0.
Proof. We proceed by contradiction. We will only prove that f'(a) = 0 when a is a local minimum. The
case of local maxima will follow either by symmetry of argument or by considering g(x) = −f(x).
Suppose a is a local minimum and suppose f'(a) ≠ 0. Then, either f'(a) > 0 or f'(a) < 0. Suppose
f'(a) > 0; the case f'(a) < 0 is analogous. Since f is C^1, we know that f'(x) is continuous. In
particular, we can find δ_1 > 0 such that f'(x) > 0 for all x ∈ (a − δ_1, a + δ_1).
Since a is a local minimum, we can find δ_2 > 0 such that f(x) ≥ f(a) for all x ∈ (a − δ_2, a + δ_2). Now,
let δ = min(δ_1, δ_2).
Since f is continuous and has a local minimum at a, the Intermediate Value Theorem lets us find
b_1 < b_2 in (a − δ, a + δ) such that f(b_1) = f(b_2). By the Mean Value Theorem, we can find
c ∈ (b_1, b_2) ⊂ (a − δ, a + δ) such that
\[ f'(c) = \frac{f(b_2) - f(b_1)}{b_2 - b_1} = 0 \]
However, we chose δ such that whenever x ∈ (a − δ, a + δ), we have f'(x) > 0, which is a contradiction.
Hence, we conclude that f'(a) = 0.
So we see that local extrema must always be critical points of f(x). But is the converse also true? In
other words, are all critical points local extrema? Unfortunately, the converse is not true in general. For
example, consider f(x) = x^3. Then, f'(0) = 0 and thus, 0 is a critical point of f. However, we also know
that f(x) is a strictly increasing function on R. Hence, 0 cannot be a local extremum.
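As a quick numerical sanity check of this counterexample, here is a minimal Python sketch that samples f(x) = x^3 on both sides of 0. The helper name `looks_like_local_extremum`, the radius `delta`, and the sample count are illustrative assumptions, not part of the notes. Sampling can only suggest the behavior, not prove it.

```python
def f(x):
    return x ** 3

def f_prime(x):
    # Derivative of x^3, computed by hand.
    return 3 * x ** 2

def looks_like_local_extremum(func, a, delta=1e-2, samples=100):
    """Sample func strictly inside (a - delta, a + delta) and report whether
    a looks like a local minimum or a local maximum on those samples.
    Illustrative check only, not a proof."""
    xs = [a - delta + 2 * delta * k / samples for k in range(1, samples)]
    is_min = all(func(x) >= func(a) for x in xs)
    is_max = all(func(x) <= func(a) for x in xs)
    return is_min or is_max

# 0 is a critical point of x^3 because the derivative vanishes there.
print(f_prime(0))
# x^3 is negative to the left of 0 and positive to the right.
print(looks_like_local_extremum(f, 0))
# For comparison, x^2 does have a local minimum at 0.
print(looks_like_local_extremum(lambda x: x ** 2, 0))
```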
However, with a few extra conditions on f, we can check if a critical point a is a local extremum of f.
Theorem 1.1 (First Derivative Test). Let a be a critical point of f.
• Suppose there exists δ > 0 such that f'(x) < 0 for all x ∈ (a − δ, a) and f'(x) > 0 for all x ∈ (a, a + δ).
Then, a is a local minimum of f.
• Suppose there exists δ > 0 such that f'(x) > 0 for all x ∈ (a − δ, a) and f'(x) < 0 for all x ∈ (a, a + δ).
Then, a is a local maximum of f.
If f is a C^2 function, we have an even better test.
Theorem 1.2 (Second Derivative Test). Let a be a critical point of f.
• Suppose f''(a) > 0. Then, a is a local minimum of f.
• Suppose f''(a) < 0. Then, a is a local maximum of f.
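The two tests above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the string labels, and the sampling parameters `delta` and `samples` are hypothetical, the derivatives are supplied by hand, and the sampled first derivative test is a heuristic rather than a proof. The "inconclusive" label covers the case f''(a) = 0, about which the second derivative test says nothing.

```python
def second_derivative_test(f_second, a):
    """Classify a critical point a by the sign of f''(a).
    Assumes the caller has already checked that f'(a) = 0."""
    value = f_second(a)
    if value > 0:
        return "local minimum"
    if value < 0:
        return "local maximum"
    return "inconclusive"

def first_derivative_test(f_prime, a, delta=1e-3, samples=50):
    """Sample f' strictly inside (a - delta, a) and (a, a + delta)
    and compare signs on the two sides. Heuristic check only."""
    steps = [delta * k / (samples + 1) for k in range(1, samples + 1)]
    left = [f_prime(a - s) for s in steps]
    right = [f_prime(a + s) for s in steps]
    if all(v < 0 for v in left) and all(v > 0 for v in right):
        return "local minimum"
    if all(v > 0 for v in left) and all(v < 0 for v in right):
        return "local maximum"
    return "inconclusive"

# f(x) = x^4 has f'(x) = 4x^3 and f''(x) = 12x^2, with critical point a = 0.
print(second_derivative_test(lambda x: 12 * x ** 2, 0))
print(first_derivative_test(lambda x: 4 * x ** 3, 0))
```

For f(x) = x^4 the second derivative vanishes at 0, so only the first derivative test gives an answer there.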
Using the first and second derivative tests, finding local extrema becomes very easy. Let’s end this section
with an optimization problem.
Example 1.1. Suppose that each week, Caltech Bookstore sells 200 iPad Minis with Retina Display™ (hereinafter
referred to as iPads) for $350 each. A market survey indicates that for each $10 rebate offered, the number
of iPads sold per week will increase by 20 units.
Write the price and the revenue as functions of the number of units sold per week. How large should the
rebate be if Caltech Bookstore wants to maximize the revenue?
Solution. Let’s first write the price and the revenue as functions of the number of units sold per week. Let
x be the total number of iPads sold in a week. Then, the increase in sales by offering a rebate is x − 200.
The market survey says that for each increment of $10 in rebate offered, x increases by 20. Therefore, we
can write the price as a function of x as:
\[ P(x) = 350 - \frac{10}{20}(x - 200) = 450 - \frac{x}{2} \]
Now, the revenue is simply the price multiplied by sales. Hence,
\[ R(x) = xP(x) = 450x - \frac{x^2}{2} \]
The goal is to find the value of x which maximizes R(x) i.e. we want to find the global maximum of R(x).
To do that, first we need to find all the local extrema of R(x). And to do that, we first find the critical
points of R(x). But that’s not hard at all.
\[ R'(x) = 450 - x \quad\Rightarrow\quad R'(450) = 0 \]
Hence, x = 450 is the unique critical point of R(x). Noting that R''(x) = −1 < 0 for all x, we conclude that
450 is indeed a local maximum of R(x).
Since 450 is the only local extremum of R(x), it is in fact the global maximum. Therefore, to maximize
revenue, Caltech Bookstore must offer a rebate which would sell 450 iPads. In other words, Caltech Bookstore
must offer a rebate which would increase sales by 450 − 200 = 250. This corresponds to a rebate of
($10/20) · 250 = $125. In other words, Caltech Bookstore should be selling iPad Minis at $350 − $125 = $225
after rebate!
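The arithmetic in this solution can be double-checked with a short brute-force search. The sketch below restricts x to whole numbers of units between 0 and 900 (the point where the price reaches zero). That range and the function names are assumptions made for illustration.

```python
def price(x):
    # P(x) = 350 - (10/20)(x - 200): every 20 extra units sold
    # correspond to $10 more rebate.
    return 350 - (10 / 20) * (x - 200)

def revenue(x):
    # R(x) = x * P(x)
    return x * price(x)

# Search whole-number sales levels for the one with the largest revenue.
best_x = max(range(0, 901), key=revenue)
rebate = 350 - price(best_x)

print("units sold:", best_x)
print("price after rebate:", price(best_x))
print("rebate:", rebate)
print("weekly revenue:", revenue(best_x))
```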
1.2 Convex Functions
You might have seen convex and concave functions before (some teachers say “concave up” and “concave
down”). An example of a convex function is f(x) = x^2 and an example of a concave function is g(x) = −x^2.
Intuitively, we think of convex functions as those functions that can be minimized, and concave functions
as those that can be maximized. Using the second derivative test, this means that convex functions should
have nonnegative second derivative and concave functions should have nonpositive second derivative. Our
intuition serves us well this time and we will see in a bit that f(x) is convex (resp. concave) if and only if
f''(x) ≥ 0 (resp. f''(x) ≤ 0).
But first, let’s give a formal definition of convex and concave functions.
Definition (Convex and Concave Functions)
Let f be a function. f is called convex if for all θ ∈ [0, 1] and for all x, y we have
\[ f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y) \]
f is called concave if for all θ ∈ [0, 1] and for all x, y we have
\[ f(\theta x + (1 - \theta) y) \ge \theta f(x) + (1 - \theta) f(y) \]
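To get a feel for the definition, the following sketch tests the convexity inequality on randomly chosen x, y and θ. The function names, the sampling interval [-10, 10], the number of trials and the floating-point tolerance are all illustrative assumptions. Passing such a check does not prove convexity, although a single failing triple does disprove it.

```python
import random

def satisfies_convexity_inequality(f, x, y, theta, tol=1e-9):
    """Check f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)
    for one choice of x, y, theta, up to a small rounding tolerance."""
    lhs = f(theta * x + (1 - theta) * y)
    rhs = theta * f(x) + (1 - theta) * f(y)
    return lhs <= rhs + tol

def looks_convex(f, trials=1000, lo=-10.0, hi=10.0, seed=0):
    """Return True if the inequality holds on every random sample."""
    rng = random.Random(seed)
    return all(
        satisfies_convexity_inequality(
            f, rng.uniform(lo, hi), rng.uniform(lo, hi), rng.random()
        )
        for _ in range(trials)
    )

print(looks_convex(lambda x: x ** 2))     # x^2 is convex
print(looks_convex(lambda x: -(x ** 2)))  # -x^2 is concave, not convex
```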
As we said in the beginning of this section, we have the following proposition.
Proposition 1.2. Let f be a C^2 function. Then f is convex (resp. concave) if and only if f''(x) ≥ 0
(resp. f''(x) ≤ 0) for all x.
Proof. You will need to prove this for this week’s problem set!
Theorem 1.3 (Jensen’s Inequality). Let f be a convex function and let a_1, ..., a_n > 0. Then, for all
x_1, ..., x_n, we have
\[ f\left( \frac{\sum_{i=1}^n a_i x_i}{\sum_{j=1}^n a_j} \right) \le \frac{\sum_{i=1}^n a_i f(x_i)}{\sum_{j=1}^n a_j} \]
If g is a concave function, we have
\[ g\left( \frac{\sum_{i=1}^n a_i x_i}{\sum_{j=1}^n a_j} \right) \ge \frac{\sum_{i=1}^n a_i g(x_i)}{\sum_{j=1}^n a_j} \]
You should think of Jensen’s inequality as an extension of the definition of convexity/concavity of a
function to more than 2 points. In short, Jensen’s inequality says that if you have a convex function, the
function’s value at the weighted average of the x_i’s is at most the weighted average of the f(x_i)’s.
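The weighted-average reading of Jensen’s inequality is easy to try numerically. The sketch below computes the gap between the two sides for one hand-picked set of weights and points. The helper names and the sample data are assumptions chosen for illustration, not part of the notes.

```python
import math

def weighted_average(weights, values):
    """Return sum(a_i * v_i) / sum(a_j) for positive weights a_i."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total

def jensen_gap(f, weights, xs):
    """Weighted average of the f(x_i) minus f of the weighted average of the x_i.
    Jensen's inequality says this is >= 0 for convex f and <= 0 for concave f."""
    return weighted_average(weights, [f(x) for x in xs]) - f(weighted_average(weights, xs))

weights = [1.0, 2.0, 3.0]
xs = [1.0, 0.5, 4.0]

print(jensen_gap(lambda x: x ** 2, weights, xs))  # convex: gap should be nonnegative
print(jensen_gap(math.log, weights, xs))          # concave: gap should be nonpositive
```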
Using Jensen’s inequality, we can prove the generalized AM-GM inequality. Let’s recall the AM-GM
inequality:
Theorem 1.4. Let a, b > 0. Then,
\[ \frac{a + b}{2} \ge \sqrt{ab} \]
Proof. We could prove this using Jensen’s inequality but that would be very inefficient. Instead, we give a
simpler proof. We know that
\[ 0 \le (a - b)^2 = a^2 - 2ab + b^2 \quad\Rightarrow\quad 4ab \le a^2 + 2ab + b^2 = (a + b)^2 \]
By taking square roots on both sides (both are positive since a, b > 0) and dividing by 2, we get the
desired inequality.
That was easy! And we didn’t even need to use Jensen’s inequality. Now, let’s try to prove the generalized
AM-GM inequality.
Theorem 1.5 (Generalized AM-GM Inequality). Let x_1, ..., x_n > 0. Then,
\[ \frac{x_1 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 \cdots x_n} \]
Proof. This time, we will use Jensen’s inequality to prove the inequality. Which convex/concave function do
we use? Let’s take f(x) = log x.
First, we claim that f(x) is a concave function that is strictly increasing on its domain of definition,
which is (0, ∞). We have that (log x)' = 1/x > 0 for all x ∈ (0, ∞). Hence, log x is strictly increasing.
Differentiating one more time we get
\[ (\log x)'' = \left( \frac{1}{x} \right)' = -\frac{1}{x^2} < 0 \quad \text{for all } x \in (0, \infty) \]
By Proposition 1.2, we conclude that log x is a concave function.
Now, let x_1, ..., x_n > 0. Then, by Jensen’s inequality for concave functions (where we let a_1 = · · · =
a_n = 1) we get
\begin{align}
\log\left( \frac{x_1 + \cdots + x_n}{n} \right) &\ge \frac{\log x_1 + \cdots + \log x_n}{n} \tag{1} \\
&= \frac{\log(x_1 \cdots x_n)}{n} \tag{2} \\
&= \log (x_1 \cdots x_n)^{1/n} \tag{3}
\end{align}
However, we showed that log x is strictly increasing on (0, ∞). Therefore, the above inequality implies
\[ \frac{x_1 + \cdots + x_n}{n} \ge \sqrt[n]{x_1 \cdots x_n} \]
which is precisely the generalized AM-GM inequality.
(Alternatively, we can apply e^x to both sides and use the fact that e^x is strictly increasing.)
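The generalized AM-GM inequality can also be checked numerically. In the sketch below the geometric mean is computed as the exponential of the average of the logarithms, mirroring the role of log in the proof. The function names and the sample values are assumptions for illustration.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # exp of the average of the logs equals the n-th root of the product.
    # Requires every x to be strictly positive.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

xs = [1.0, 2.0, 4.0, 8.0]
am = arithmetic_mean(xs)
gm = geometric_mean(xs)

print("arithmetic mean:", am)
print("geometric mean:", gm)
print("AM >= GM:", am >= gm)
```

When all the x_i are equal, the two means coincide up to rounding, which is the equality case of the inequality.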