Convex Functions Konvexa Funktioner

Faculty of Technology and Science
Department of Mathematics
Hamid Reza Ghadiri
Mathematics
Degree Project 15 ECTS, Bachelor Level
Date/Term: 2011-03-10
Supervisor: Sorina Barza
Examiner: Håkan Granath
Karlstads universitet 651 88 Karlstad
Tfn 054-700 10 00 Fax 054-700 14 60
[email protected] www.kau.se
Contents
(1) Introduction
(2) Regularity Properties of Convex Functions
(3) Closure under Functional Operations and Applications
Abstract
Convexity is a simple and natural notion which can be traced back
to ancient times. The theory of convex functions is part of the
general theory of convexity, since convex functions are exactly those
whose epigraph (the set of points on or above the graph) is a convex
set. This thesis presents an elementary introduction to the theory of
real convex functions. We give some characterizations of convex
functions, present some elementary regularity and geometric
properties, solve some problems and give some applications. Moreover,
we prove that convexity is preserved under many of the usual
functional operations, which offers a way of identifying more
complicated convex functions.
Sammanfattning: Konvexitet är ett enkelt och naturligt begrepp som
kan spåras till gamla tider. Teorin om konvexa funktioner är en del av
den allmänna konvexitetsteorin eftersom konvexa funktioner är just
de vars epigraf (mängden av punkter som ligger ovanpå funktionens
graf) är en konvex mängd. Detta arbete innehåller en grundläggande
introduktion till teorin om reella konvexa funktioner. Vi ger några
ekvivalenta beskrivningar av konvexa funktioner, bevisar några
elementära regularitets- och geometriska egenskaper samt löser några
problem. Utöver detta bevisar vi att konvexitet är sluten under många
vanliga operationer, något som visar ett sätt att identifiera en mer
komplicerad konvex funktion.
1. Introduction
The aim of this thesis is to study some geometric and regularity
properties of real-valued convex functions of a real variable. Many of
the results presented are contained in the books [4], [3] and [5].
However, in this work we give some details of proofs which are omitted
in the reference books and also solve some problems posed in them. We
start with some definitions which are necessary in our work.
Definition 1. Let f be a function defined on an interval I of the real
line. The function f is called convex if and only if the inequality
f (λx1 + (1 − λ)x2 ) ≤ λf (x1 ) + (1 − λ)f (x2 )
is satisfied for any x1 and x2 ∈I and any 0 ≤ λ ≤ 1.
The function f is called strictly convex if and only if
f (λx1 + (1 − λ)x2 ) < λf (x1 ) + (1 − λ)f (x2 )
for any x1 , x2 ∈ I with x1 ≠ x2 and any 0 < λ < 1.
Remark 1. For example, $f(x) = x^2$, $x \in \mathbb{R}$, is strictly convex, and the
function
\[
f(x) = \begin{cases} x^2 - 1, & |x| \ge 1, \\ 0, & |x| < 1, \end{cases}
\]
is convex but not strictly convex.
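The two examples of Remark 1 can be probed numerically. The following is only a sketch under our own assumptions: the helper names `convexity_gap` and `min_gap_on_grid`, the sample grid and the chosen values of λ are illustrative and not part of the thesis, and a finite grid can support but never prove convexity.

```python
def f_square(x):
    # First example of Remark 1: f(x) = x^2.
    return x * x

def f_flat(x):
    # Second example of Remark 1: x^2 - 1 for |x| >= 1 and 0 for |x| < 1.
    return x * x - 1.0 if abs(x) >= 1.0 else 0.0

def convexity_gap(f, x1, x2, lam):
    """Return lam*f(x1) + (1-lam)*f(x2) - f(lam*x1 + (1-lam)*x2).

    The gap is >= 0 for a convex f and > 0 (x1 != x2, 0 < lam < 1)
    for a strictly convex f.
    """
    return lam * f(x1) + (1 - lam) * f(x2) - f(lam * x1 + (1 - lam) * x2)

def min_gap_on_grid(f, points, lams):
    # Smallest gap over all pairs of distinct grid points and all sampled lam.
    return min(convexity_gap(f, a, b, lam)
               for a in points for b in points if a != b
               for lam in lams)

points = [i / 4 for i in range(-12, 13)]   # grid on [-3, 3] (illustrative choice)
lams = [j / 10 for j in range(1, 10)]      # lam = 0.1, ..., 0.9

gap_square = min_gap_on_grid(f_square, points, lams)
gap_flat = min_gap_on_grid(f_flat, points, lams)
flat_segment_gap = convexity_gap(f_flat, -0.5, 0.5, 0.3)
```

On this grid the gap of the first function stays strictly positive, while the second function has gap zero for points taken inside [−1, 1], which is exactly why it fails to be strictly convex.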
Definition 2. A function f is called concave if
f (λx1 + (1 − λ)x2 ) ≥ λf (x1 ) + (1 − λ)f (x2 )
is satisfied for any x1 and x2 ∈ I and any 0 ≤ λ ≤ 1.
The function f is called strictly concave if and only if
f (λx1 + (1 − λ)x2 ) > λf (x1 ) + (1 − λ)f (x2 )
for any x1 , x2 ∈ I with x1 ≠ x2 and any 0 < λ < 1.
Remark 2. Observe that a function f is convex if and only if −f is
concave. The theory of concave functions may therefore be subsumed
under that of convex functions and we shall concentrate our attention
on the latter.
Definition 3. A function f is called affine on I if and only if there
exist constants m, b ∈ R such that f (x) = mx + b for all x ∈ I.
Remark 3. It is clear that any affine function is convex and concave.
In fact, in the following proposition we show that also the converse
is true, i.e. the only functions that can be both convex and concave
are the affine ones. The following proposition is stated as an exercise
in [4] .
Proposition 1.
(1) A function f is affine on R if and only if
f (λx + (1 − λ)y) = λf (x) + (1 − λ)f (y),
for all λ ∈ R and x, y ∈ R.
(2) A function f is affine on an interval I if and only if both f and
−f are convex on I.
(3) If f : [a, b] → R is a convex function and there is a single value
of λ ∈(0,1) for which
f (λa + (1 − λ)b) = λf (a) + (1 − λ)f (b),
then f is affine on [a, b].
(4) Let f be a convex function defined on an interval I. Then it is
strictly convex there if and only if there is no subinterval of I
on which f is affine.
Proof.
(1) Let f (x) = mx + b be an affine function on R and x, y, λ ∈ R.
Then
\[
f(\lambda x + (1-\lambda)y) = m(\lambda x + (1-\lambda)y) + b = \lambda f(x) + (1-\lambda) f(y).
\]
Suppose now that we have the equality
\[
f(\lambda x + (1-\lambda)y) = \lambda f(x) + (1-\lambda) f(y)
\]
for all λ ∈ R and x, y ∈ R. Let x0 and y0 be two real numbers
such that x0 < y0 . We denote by M the affine function
whose graph contains the points (x0 , f (x0 )) and (y0 , f (y0 )).
Hence
\[
M(x) = \frac{f(x_0) - f(y_0)}{x_0 - y_0}\,(x - y_0) + f(y_0) = mx + b.
\]
Let x be an arbitrary point, x = λx0 + (1 − λ)y0 for some λ ∈ R.
Then
\[
\begin{aligned}
M(\lambda x_0 + (1-\lambda)y_0)
&= \frac{f(x_0) - f(y_0)}{x_0 - y_0}\,(\lambda x_0 + (1-\lambda)y_0 - y_0) + f(y_0) \\
&= \lambda\,\frac{f(x_0) - f(y_0)}{x_0 - y_0}\,(x_0 - y_0) + f(y_0) \\
&= \lambda f(x_0) + (1-\lambda) f(y_0) = f(\lambda x_0 + (1-\lambda)y_0).
\end{aligned}
\]
Hence M (x) = f (x) for all x, i.e. f is affine.
(2) Suppose first that f is affine on I i.e. f (x) = mx + b, x ∈ I.
By easy calculations we get
f (λx + (1 − λ)y) = λf (x) + (1 − λ)f (y)
for any x, y ∈ I, which implies that f is convex. Similarly, one
can show that −f is convex.
Suppose now that f and −f are convex functions on I. Hence
f (λx + (1 − λ)y) ≤ λf (x) + (1 − λ)f (y),
and
(−f )(λx + (1 − λ)y) ≤ λ(−f )(x) + (1 − λ)(−f )(y)
or, equivalently
f (λx + (1 − λ)y) ≥ λf (x) + (1 − λ)f (y),
for x, y ∈ I and 0 ≤ λ ≤ 1. By the first and third inequalities
we get
f (λx + (1 − λ)y) = λf (x) + (1 − λ)f (y)
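for 0 ≤ λ ≤ 1, i.e. on every segment [x, y] ⊂ I the function f
coincides with the chord through the endpoints of its graph,
which means that f is affine on I. Observe that we have proved
that the only functions which are both concave and convex are
the affine functions.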
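(3) Let c = λa + (1 − λ)b ∈ (a, b) be the point at which equality
holds, let M be the affine function whose graph passes through
(a, f (a)) and (b, f (b)), and put ϕ = M − f . Since f is convex,
ϕ is concave and ϕ ≥ 0 on [a, b]; moreover ϕ(a) = ϕ(b) = ϕ(c) = 0.
Suppose that f is not affine on [a, b], i.e. ϕ(x) > 0 for some
x ∈ (a, b), say x ∈ (a, c) (the case x ∈ (c, b) is symmetric, with
a in place of b). Then c = µx + (1 − µ)b for some µ ∈ (0, 1), and
the concavity of ϕ gives
ϕ(c) ≥ µϕ(x) + (1 − µ)ϕ(b) = µϕ(x) > 0,
which contradicts ϕ(c) = 0. Hence f is affine on [a, b].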
(4) We prove first that if f is strictly convex then there is no
subinterval of I on which f is affine. Suppose that there exists an
interval I0 ⊂ I on which f is affine. Hence
f (λx + (1 − λ)y) = λf (x) + (1 − λ)f (y)
for all x, y ∈ I0 and λ ∈ (0, 1). This means that f is not
strictly convex and contradicts the hypothesis. We prove now
that if there is no subinterval of I on which f is affine then f
is strictly convex. Suppose that f is not strictly convex. Hence
there exist x0 , y0 ∈ I, x0 ≠ y0 , and λ0 ∈ (0, 1) such that
f (λ0 x0 + (1 − λ0 )y0 ) = λ0 f (x0 ) + (1 − λ0 )f (y0 ).
We may assume that x0 < y0 . By (3), f is affine on [x0 , y0 ], so
there exists a subinterval I0 = (x0 , y0 ) ⊂ I on which f is affine,
and this contradicts the hypothesis.
The following proposition gives a geometric characterization of
convexity. It is given as an observation in [4, page 2], but we present
here a detailed proof.
Proposition 2. Let f be a function defined on an open interval I. For
two points x1 < x2 of I, let M be the linear (affine) function whose
graph passes through (x1 , f (x1 )) and (x2 , f (x2 )). The function f is
convex if and only if, for all such x1 and x2 , f (x) ≤ M (x) for all
x ∈ [x1 , x2 ] or, equivalently,
\[
\frac{f(x) - f(x_1)}{x - x_1} \le \frac{M(x) - M(x_1)}{x - x_1}
= \frac{M(x_2) - M(x)}{x_2 - x} \le \frac{f(x_2) - f(x)}{x_2 - x}
\]
for x ∈ (x1 , x2 ).
Proof. Suppose first that f is convex. Let
\[
M(x) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}\,(x - x_1) + f(x_1)
\]
be the linear function whose graph contains the points (x1 , f (x1 )) and
(x2 , f (x2 )) and let x ∈ [x1 , x2 ]. Then there exists 0 ≤ λ ≤ 1 such that
x = λx1 + (1 − λ)x2 . Since λx1 + (1 − λ)x2 − x1 = (1 − λ)(x2 − x1 ), we get
\[
\begin{aligned}
M(\lambda x_1 + (1-\lambda)x_2)
&= \frac{f(x_2) - f(x_1)}{x_2 - x_1}\,(\lambda x_1 + (1-\lambda)x_2 - x_1) + f(x_1) \\
&= (1-\lambda)\,\frac{f(x_2) - f(x_1)}{x_2 - x_1}\,(x_2 - x_1) + f(x_1) \\
&= \lambda f(x_1) + (1-\lambda) f(x_2).
\end{aligned}
\]
Hence
M (x) = λf (x1 ) + (1 − λ)f (x2 ),
and since f is convex we have
f (x) = f (λx1 + (1 − λ)x2 ) ≤ M (x).
The second statement of the proposition, as well as the converse,
follows easily from the first.
The next proposition is given as an exercise in [5].
Proposition 3. The function f is convex if and only if the determinant
\[
\begin{vmatrix} 1 & x_1 & f(x_1) \\ 1 & x_2 & f(x_2) \\ 1 & x_3 & f(x_3) \end{vmatrix}
\]
is nonnegative for any x1 < x2 < x3 in the interval I.
Proof. Suppose that the determinant
\[
\begin{vmatrix} 1 & x_1 & f(x_1) \\ 1 & x_2 & f(x_2) \\ 1 & x_3 & f(x_3) \end{vmatrix}
\]
is nonnegative. Then
\[
x_2 f(x_3) - x_3 f(x_2) - x_1 (f(x_3) - f(x_2)) + f(x_1)(x_3 - x_2) \ge 0,
\]
which after straightforward calculations leads to
\[
f(x_3)(x_2 - x_1) + f(x_2)(x_1 - x_3) + f(x_1)(x_3 - x_2) \ge 0,
\]
i.e.
\[
(x_3 - x_1) f(x_2) \le (x_2 - x_1) f(x_3) + (x_3 - x_2) f(x_1),
\]
or equivalently,
\[
f(x_2) \le \frac{x_2 - x_1}{x_3 - x_1}\, f(x_3) + \frac{x_3 - x_2}{x_3 - x_1}\, f(x_1).
\]
Since x1 < x2 < x3 , we have x2 = (1 − λ)x1 + λx3 for some 0 < λ < 1,
namely $\lambda = \frac{x_2 - x_1}{x_3 - x_1}$. Substituting x2 by
(1 − λ)x1 + λx3 in the above inequality we get
\[
f(\lambda x_3 + (1-\lambda)x_1) \le \lambda f(x_3) + (1-\lambda) f(x_1),
\]
which means that f is convex on I.
Suppose now that f is convex. Let x1 < x2 < x3 with
\[
x_2 = (1-\lambda)x_1 + \lambda x_3
\]
for some 0 < λ < 1. Observe that $\lambda = \frac{x_2 - x_1}{x_3 - x_1}$ and
$1 - \lambda = \frac{x_3 - x_2}{x_3 - x_1}$. We get
\[
f(x_2) \le \frac{x_2 - x_1}{x_3 - x_1}\, f(x_3) + \frac{x_3 - x_2}{x_3 - x_1}\, f(x_1),
\]
i.e.
\[
(x_3 - x_1)(f(x_2) - f(x_1)) \le (x_2 - x_1)(f(x_3) - f(x_1)),
\]
or,
\[
x_2 f(x_3) - x_3 f(x_2) - x_1 (f(x_3) - f(x_2)) + f(x_1)(x_3 - x_2) \ge 0.
\]
Thus
\[
\begin{vmatrix} 1 & x_1 & f(x_1) \\ 1 & x_2 & f(x_2) \\ 1 & x_3 & f(x_3) \end{vmatrix} \ge 0,
\]
and this completes the proof.
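The determinant criterion of Proposition 3 is easy to evaluate for concrete functions. The sketch below is an illustration under our own assumptions: the helper name `det3` and the sample points are hypothetical choices, and the expansion used is the one along the first row that appears in the proof.

```python
import math

def det3(f, x1, x2, x3):
    """Determinant with rows (1, x_i, f(x_i)), expanded along the first row."""
    f1, f2, f3 = f(x1), f(x2), f(x3)
    return (x2 * f3 - x3 * f2) - x1 * (f3 - f2) + f1 * (x3 - x2)

# For f(x) = x^2 this is a Vandermonde determinant:
# (x2 - x1)(x3 - x1)(x3 - x2), which is positive for x1 < x2 < x3.
d_square = det3(lambda x: x * x, 0.0, 1.0, 2.0)

# exp is convex, so the determinant should be nonnegative.
d_exp = det3(math.exp, -1.0, 0.0, 2.0)

# An affine function gives 0, and the concave function -x^2 a negative value.
d_affine = det3(lambda x: 3.0 * x + 1.0, 0.0, 1.0, 2.0)
d_concave = det3(lambda x: -x * x, 0.0, 1.0, 2.0)
```

The affine case returning zero reflects Proposition 1: affine functions are exactly those that are both convex and concave.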
2. Regularity Properties of Convex Functions.
The following lemma will be used in the proof of Theorem 1.
Lemma 1. Let f : I → R and suppose that f is twice differentiable at
x0 ∈ I. Then
\[
f''(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - 2f(x_0) + f(x_0 - h)}{h^2}.
\]
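Proof. Let x0 ∈ I and g(h) = f (x0 + h) − 2f (x0 ) + f (x0 − h) for
small |h|; then g(0) = 0 and g is differentiable at 0. By L'Hospital's
rule we have
\[
\lim_{h \to 0} \frac{f(x_0 + h) - 2f(x_0) + f(x_0 - h)}{h^2} = \lim_{h \to 0} \frac{g'(h)}{2h}.
\]
By using the chain rule we get
\[
\lim_{h \to 0} \frac{g'(h)}{2h} = \lim_{h \to 0} \frac{f'(x_0 + h) - f'(x_0 - h)}{2h},
\]
which equals $f''(x_0)$, since both $\frac{f'(x_0 + h) - f'(x_0)}{2h}$ and
$\frac{f'(x_0) - f'(x_0 - h)}{2h}$ tend to $\frac{1}{2} f''(x_0)$ as h → 0.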
Remark 4. The existence of the limit
\[
\lim_{h \to 0} \frac{f(x_0 + h) - 2f(x_0) + f(x_0 - h)}{h^2}
\]
does not imply that f is twice differentiable. The function
\[
f(x) = \begin{cases} x^2 \sin\frac{1}{x}, & x \ne 0, \\ 0, & x = 0, \end{cases}
\]
has no second derivative at 0, but the above limit for x0 = 0 exists and
it is equal to 0.
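The behaviour described in Remark 4 can be observed numerically. The following sketch is an illustration under our own assumptions: the helper names, the step sizes and the sequence h = 1/(2πn) are hypothetical choices, and the closed form used for f′ is the usual derivative of x² sin(1/x) away from 0 together with f′(0) = 0.

```python
import math

def f(x):
    # Function from Remark 4: x^2 * sin(1/x) for x != 0, and 0 at x = 0.
    return x * x * math.sin(1.0 / x) if x != 0.0 else 0.0

def f_prime(x):
    # Derivative of f: 2x sin(1/x) - cos(1/x) for x != 0, and f'(0) = 0.
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x) if x != 0.0 else 0.0

def second_symmetric_quotient(func, x0, h):
    """(func(x0 + h) - 2 func(x0) + func(x0 - h)) / h^2."""
    return (func(x0 + h) - 2.0 * func(x0) + func(x0 - h)) / (h * h)

def derivative_quotient_at_zero(h):
    """(f'(h) - f'(0)) / h, whose limit as h -> 0 would be f''(0)."""
    return (f_prime(h) - f_prime(0.0)) / h

# The symmetric quotient at x0 = 0 vanishes because f is odd.
symmetric = [second_symmetric_quotient(f, 0.0, 10.0 ** (-k)) for k in range(1, 8)]

# Along h = 1/(2*pi*n) the quotient of f' is close to -2*pi*n, hence unbounded.
unbounded = [derivative_quotient_at_zero(1.0 / (2.0 * math.pi * n))
             for n in (10, 100, 1000)]

# For a smooth function the symmetric quotient approximates f''; here exp''(0) = 1.
smooth = second_symmetric_quotient(math.exp, 0.0, 1e-3)
```

The symmetric quotients stay at zero while the difference quotients of f′ grow without bound, which is the numerical counterpart of the statement that the limit exists although f″(0) does not.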
Theorem 1. [1] Let I be an open interval and let f : I → R be a
function which has a second derivative on I. Then f is a convex function
on I if and only if $f''(x_0) \ge 0$ for all x0 ∈ I.
Proof. Let f be a convex function on I. By Lemma 1, the second
derivative is given by the limit
\[
f''(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - 2f(x_0) + f(x_0 - h)}{h^2}
\]
for each x0 ∈ I. Given x0 ∈ I, let h be such that x0 + h and x0 − h
belong to I. Then we have
\[
f(x_0) \le \frac{1}{2} f(x_0 + h) + \frac{1}{2} f(x_0 - h)
\]
by convexity of f . Therefore, we have f (x0 + h) − 2f (x0 ) + f (x0 − h) ≥ 0
for any such h. Hence
\[
f''(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - 2f(x_0) + f(x_0 - h)}{h^2} \ge 0.
\]
Now suppose that f is twice differentiable on I and $f'' \ge 0$. We will use
Taylor's Theorem to prove that f is convex. Let x1 , x2 be two arbitrary
points of I. Let 0 < t < 1 and x0 := (1 − t)x1 + tx2 be an arbitrary point
between x1 and x2 . By Taylor's Theorem there exists c1 between x0
and x1 such that
\[
f(x_1) = f(x_0) + f'(x_0)(x_1 - x_0) + \frac{1}{2} f''(c_1)(x_1 - x_0)^2
\]
and c2 between x0 and x2 such that
\[
f(x_2) = f(x_0) + f'(x_0)(x_2 - x_0) + \frac{1}{2} f''(c_2)(x_2 - x_0)^2.
\]
Since $f''$ is nonnegative on I the term
\[
R := \frac{1}{2}(1-t) f''(c_1)(x_1 - x_0)^2 + \frac{1}{2}\, t f''(c_2)(x_2 - x_0)^2
\]
is also nonnegative. Thus
\[
\begin{aligned}
(1-t) f(x_1) + t f(x_2)
&= f(x_0) + f'(x_0)\big((1-t)x_1 + t x_2 - x_0\big) \\
&\quad + \frac{1}{2}(1-t) f''(c_1)(x_1 - x_0)^2 + \frac{1}{2}\, t f''(c_2)(x_2 - x_0)^2 \\
&= f(x_0) + R \ge f(x_0) = f((1-t)x_1 + t x_2).
\end{aligned}
\]
Hence, f is a convex function on I and the proof is complete.
Remark 5. An easier proof is given in the second part of Corollary 2.
Theorem 2. [5] Let I be an open interval. If f : I → R is a convex
(strictly convex) function, then the left derivative $f'_-(x)$ and the right
derivative $f'_+(x)$ exist and are increasing (strictly increasing) on I.
Proof. We prove first that the right derivative exists. Consider (see
Proposition 2) the function $g(h) = \frac{f(x+h) - f(x)}{h}$, h > 0. If
0 < h1 < h2 , we have by the convexity of f that
\[
\frac{f(x + h_1) - f(x)}{h_1} \le \frac{f(x + h_2) - f(x)}{h_2},
\]
which means that g is increasing. Thus
\[
\lim_{h \to 0+} \frac{f(x + h) - f(x)}{h} = \inf_{h > 0} \frac{f(x + h) - f(x)}{h}
\]
is −∞ or finite. We prove that $\inf_{h>0} \frac{f(x+h) - f(x)}{h}$ cannot
be −∞. Let x and x′ be two fixed points with x′ < x. By Proposition 2 we have
\[
\frac{f(x) - f(x')}{x - x'} \le \frac{f(x + h) - f(x)}{h}
\]
for any h > 0 such that x + h ∈ I. Hence
\[
\frac{f(x) - f(x')}{x - x'} \le \inf_{h > 0} \frac{f(x + h) - f(x)}{h},
\]
which means that
\[
\lim_{h \to 0+} \frac{f(x + h) - f(x)}{h}
\]
is finite, i.e. the right derivative $f'_+(x)$ exists.
Now we show that the right derivative is an increasing function. Let
x, y ∈ I and x < y. Choose h > 0 such that x + h and y + h ∈ I. By
Proposition 2 we have
\[
\frac{f(x + h) - f(x)}{h} \le \frac{f(y + h) - f(y)}{h}.
\]
Passing to the limit as h → 0+ we get $f'_+(x) \le f'_+(y)$, which means that
$f'_+$ is an increasing function.
We prove now that the left derivative exists. The proof is similar but for
the sake of completeness we give the details also in this case. Consider
the function
\[
g(h) = \frac{f(x) - f(x - h)}{h}
\]
for h > 0. If 0 < h1 < h2 we have by convexity of f that
\[
\frac{f(x) - f(x - h_2)}{h_2} \le \frac{f(x) - f(x - h_1)}{h_1},
\]
which means that g is decreasing. Thus
\[
\lim_{h \to 0+} \frac{f(x) - f(x - h)}{h} = \sup_{h > 0} \frac{f(x) - f(x - h)}{h}
\]
is +∞ or finite. We will prove that $\sup_{h>0} \frac{f(x) - f(x-h)}{h}$
cannot be +∞. Let x and x′ be two fixed points with x < x′. By the
convexity of f we have
\[
\frac{f(x') - f(x)}{x' - x} \ge \frac{f(x) - f(x - h)}{h}
\]
for any h > 0 such that x − h ∈ I. Hence
\[
\frac{f(x') - f(x)}{x' - x} \ge \sup_{h > 0} \frac{f(x) - f(x - h)}{h},
\]
which means that
\[
\lim_{h \to 0+} \frac{f(x) - f(x - h)}{h}
\]
is finite, i.e. the left derivative $f'_-(x)$ exists.
Now we show that the left derivative is an increasing function. Let
x, y ∈ I and x < y. Choose h > 0 such that x − h and y − h ∈ I. By
convexity we have
\[
\frac{f(x) - f(x - h)}{h} \le \frac{f(y) - f(y - h)}{h}.
\]
Passing to the limit as h → 0+ we get $f'_-(x) \le f'_-(y)$, which means
that $f'_-$ is an increasing function.
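The monotone difference quotients used in the proof of Theorem 2 can be tabulated for a concrete function. This is only a sketch under our own assumptions: the function |x| + x², the point x = 0 and the dyadic step sizes are illustrative choices that do not come from the thesis.

```python
def f(x):
    # Convex but not differentiable at 0 (illustrative choice).
    return abs(x) + x * x

def right_quotient(func, x, h):
    """(func(x + h) - func(x)) / h for h > 0."""
    return (func(x + h) - func(x)) / h

def left_quotient(func, x, h):
    """(func(x) - func(x - h)) / h for h > 0."""
    return (func(x) - func(x - h)) / h

hs = [2.0 ** (-k) for k in range(1, 20)]          # step sizes decreasing to 0
right = [right_quotient(f, 0.0, h) for h in hs]   # equals 1 + h, decreases to 1
left = [left_quotient(f, 0.0, h) for h in hs]     # equals -(1 + h), increases to -1
```

As h decreases the right quotients decrease towards their infimum 1 and the left quotients increase towards their supremum −1, so here the one-sided derivatives at 0 are −1 and 1 and the left one does not exceed the right one.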
Corollary 1. Any convex function defined on an open interval I is
continuous.
Proof. Let x0 ∈ I. We want to show that
$f(x_0) = \lim_{x \to x_0+} f(x) = \lim_{x \to x_0-} f(x)$. We have
\[
\lim_{x \to x_0+} f(x) = \lim_{x \to x_0+} \left[ \frac{f(x) - f(x_0)}{x - x_0}\,(x - x_0) + f(x_0) \right]
= f'_+(x_0) \cdot 0 + f(x_0) = f(x_0),
\]
since by Theorem 2 the right derivative exists. Similarly
\[
\lim_{x \to x_0-} f(x) = f(x_0).
\]
Hence, f is continuous at x0 . Since x0 was arbitrary, f is continuous
on I.
Proposition 4. Let f be a convex function on an open interval I.
Then $f'_-(x) \le f'_+(x)$, x ∈ I.
Proof. Let h > 0 be such that x + h and x − h are in I. By Proposition
2 we have
\[
\frac{f(x) - f(x - h)}{h} \le \frac{f(x + h) - f(x)}{h}
\]
for any such h. By letting h → 0+ we get $f'_-(x) \le f'_+(x)$.
Corollary 2. Let f be defined on an open interval I.
(1) If f is differentiable on I then f is convex if and only if $f'$ is
increasing.
(2) If f is twice differentiable on I then f is convex if and only if
$f'' \ge 0$ on I.
Proof.
(1) Suppose first that f is convex and differentiable on I. Then
$f'_- = f'_+ = f'$ on I. By Theorem 2, $f'$ is increasing.
Conversely suppose now that $f'$ is increasing. Choose x1 <
x2 < x3 such that x1 , x2 , x3 ∈ I. By Lagrange's mean value
theorem we have that
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c_1)
\]
for some x1 < c1 < x2 and
\[
\frac{f(x_3) - f(x_2)}{x_3 - x_2} = f'(c_2)
\]
for some x2 < c2 < x3 . Since c1 < c2 , by the monotonicity of $f'$ we
have that
\[
\frac{f(x_2) - f(x_1)}{x_2 - x_1} \le \frac{f(x_3) - f(x_2)}{x_3 - x_2},
\]
i.e. f is convex (see Proposition 2).
(2) Suppose that f is convex. By (1), $f'$ is increasing and it follows
that $f'' \ge 0$. Conversely, suppose now that $f'' \ge 0$. Then $f'$ is
increasing and by (1) f is convex.
Proposition 5. Let f be a convex function, differentiable on an
interval I. Then the graph of f lies above any tangent line on that
interval.
Proof. Let x0 ∈ I and let $T(x) = f'(x_0)(x - x_0) + f(x_0)$ be the
tangent line to the graph of the function y = f (x) at the point x0 .
Denote
h(x) = f (x) − T (x).
We have that h(x0 ) = 0, $h'(x) = f'(x) - T'(x) = f'(x) - f'(x_0)$ and
hence $h'(x_0) = 0$. By Corollary 2, $f'$ is increasing. Hence $h'(x) \le 0$
for x ≤ x0 and $h'(x) \ge 0$ for x ≥ x0 , so h is decreasing to the left
of x0 and increasing to the right of it, i.e. x0 is a minimum point of h
on I. This means that h(x) ≥ h(x0 ) = 0 and therefore f (x) ≥ T (x) for
all x ∈ I.
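Proposition 5 can be illustrated by comparing a convex function with one of its tangent lines on a grid. The sketch below rests on our own assumptions: the helper names, the choice f = exp (which is its own derivative), the point x0 = 0.5 and the sample points are all hypothetical.

```python
import math

def tangent_line(f, df, x0):
    """Return the function T(x) = df(x0) * (x - x0) + f(x0)."""
    slope, value = df(x0), f(x0)
    return lambda x: slope * (x - x0) + value

def min_gap_above_tangent(f, df, x0, xs):
    """Smallest value of h(x) = f(x) - T(x) over the sample points xs."""
    T = tangent_line(f, df, x0)
    return min(f(x) - T(x) for x in xs)

xs = [k / 10 for k in range(-30, 31)]     # sample points in [-3, 3]

# Convex case: exp lies above its tangent line at x0 = 0.5.
gap_exp = min_gap_above_tangent(math.exp, math.exp, 0.5, xs)

# Concave case for contrast: -x^2 dips below its tangent line at x0 = 0.
gap_concave = min_gap_above_tangent(lambda x: -x * x, lambda x: -2.0 * x, 0.0, xs)
```

In the convex case the smallest gap is attained at the point of tangency, where it vanishes, in agreement with h(x0) = 0 being the minimum of h in the proof.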
Theorem 3. Let f be a convex function defined on an open interval
I. Then f is differentiable except possibly on a countable set.
Proof. Let x0 be a point where $f'_+$ is continuous, i.e.
$f'_+(x_0) = \lim_{x \to x_0} f'_+(x)$.
We show first that f is differentiable at x0 , i.e. $f'_+(x_0) = f'_-(x_0)$. By
Proposition 4 we have $f'_-(x_0) \le f'_+(x_0)$, so we just have to prove that
$f'_-(x_0) \ge f'_+(x_0)$.
By the convexity of f , for every h > 0 with x0 − h ∈ I we have
\[
f'_+(x_0 - h) \le \frac{f(x_0) - f(x_0 - h)}{h} \le f'_-(x_0),
\]
and letting h → 0+ the continuity of $f'_+$ at x0 gives the desired
inequality.
Since $f'_+$ is an increasing function it has at most countably many
discontinuities, and hence there are at most countably many points where
f is not differentiable, see e.g. [1, Theorem 5.6.4].
Proposition 6. [4] A convex function defined on a closed interval [a, b]
is bounded.
Proof. Let M = max(f (a), f (b)) and z = λa + (1 − λ)b for some
λ ∈ [0, 1]. Then
f (z) ≤ λf (a) + (1 − λ)f (b) ≤ (λ + (1 − λ))M = M,
i.e. f is bounded from above.
Now we show that the function is also bounded from below. We consider
an arbitrary point written in the form $\frac{a+b}{2} + t$, where
$t \in \left[\frac{a-b}{2}, \frac{b-a}{2}\right]$. Then by convexity of f we have
\[
f\left(\frac{a+b}{2}\right) \le \frac{1}{2} f\left(\frac{a+b}{2} + t\right) + \frac{1}{2} f\left(\frac{a+b}{2} - t\right).
\]
Since
\[
-f\left(\frac{a+b}{2} - t\right) \ge -M
\]
we get
\[
f\left(\frac{a+b}{2} + t\right) \ge 2 f\left(\frac{a+b}{2}\right) - M = m
\]
for any $t \in \left[\frac{a-b}{2}, \frac{b-a}{2}\right]$.
Hence, f (x) ≥ m for all x ∈ [a, b], which means that f is bounded from
below.
Remark 6. One can see that a convex function need not be continuous
at an endpoint of its domain, since it may have an upward jump
there (see Proposition 7). Moreover, a convex function defined on an
open interval need not be bounded from above. For instance, the
function $f(x) = \frac{1}{1 - x^2}$, x ∈ (−1, 1), is a convex function
not bounded from above.
Proposition 7. Let f be a convex function defined on the interval [a, b]
and suppose that f is not continuous at a. Then
$f(a) > \lim_{x \to a+} f(x) = f(a+)$.
Proof. Clearly, $\lim_{x \to a+} f(x)$ exists since f is monotone on a
neighborhood of a, see e.g. [3, Proposition 1.3.4]. Suppose that
$f(a) \le \lim_{x \to a+} f(x)$. Since f is convex we have for
$\lambda = \frac{1}{2}$ that
\[
f\left(\frac{a+x}{2}\right) \le \frac{f(a) + f(x)}{2}
\]
for x ∈ (a, b]. Hence,
\[
\lim_{x \to a+} \left( f\left(\frac{a+x}{2}\right) - \frac{f(x)}{2} \right) \le \frac{f(a)}{2},
\]
i.e.
\[
\lim_{x \to a+} \left( f(x) - \frac{f(x)}{2} \right) \le \frac{f(a)}{2},
\]
since
\[
\lim_{x \to a+} f\left(\frac{a+x}{2}\right) = \lim_{x \to a+} f(x).
\]
Therefore,
\[
\lim_{x \to a+} \frac{f(x)}{2} \le \frac{f(a)}{2},
\]
thus
\[
\lim_{x \to a+} f(x) \le f(a).
\]
Together with the assumption f (a) ≤ f (a+) this gives f (a) = f (a+),
which contradicts the fact that f is not continuous at a.
3. Closure under Functional Operations and Applications
Proposition 8. [4] If f and g are convex functions defined on an
interval I, then any linear combination αf + βg is also convex provided
α and β are nonnegative real numbers.
Proof. If we consider the function
h(x) = αf (x) + βg(x),
by using the convexity of f and g we get
h(λx + (1 − λ)y) = αf (λx + (1 − λ)y) + βg(λx + (1 − λ)y)
≤ α (λf (x) + (1 − λ)f (y)) + β (λg(x) + (1 − λ)g(y))
= λ(αf (x) + βg(x)) + (1 − λ)(αf (y) + βg(y))
= λh(x) + (1 − λ)h(y)
i.e. h is a convex function.
Proposition 9. Let f be a convex function on an interval I and let g
be a convex and increasing function on an interval containing the range
of f . Then the composition g ◦ f is also convex on I.
Proof. Let a, b ∈ I and λ ∈ [0, 1]. Let h(x) = g(f (x)) and let
x = λa + (1 − λ)b. Since f is convex,
f (x) ≤ λf (a) + (1 − λ)f (b)
Since g is convex and increasing, we have that
h(x) = g(f (x)) ≤
≤ g(λf (a) + (1 − λ)f (b))
≤ λg(f (a)) + (1 − λ)g(f (b))
= λh(a) + (1 − λ)h(b)
i.e. h = g ◦ f is a convex function.
Remark 7. In the above-mentioned proposition g has to be increasing.
For example, consider $f(x) = \frac{1}{x}$ and $g(x) = \frac{1}{\sqrt{x}}$, x > 0.
The function $g(f(x)) = \sqrt{x}$ is concave although both f and g are
convex functions.
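The counterexample of Remark 7 can be checked at a single pair of points by a midpoint test. This is a minimal sketch under our own assumptions: the helper name `midpoint_gap` and the points 1 and 4 are illustrative choices.

```python
import math

def midpoint_gap(h, a, b):
    """(h(a) + h(b)) / 2 - h((a + b) / 2); nonnegative for a convex h."""
    return 0.5 * (h(a) + h(b)) - h(0.5 * (a + b))

f = lambda x: 1.0 / x                  # convex on (0, inf)
g = lambda x: 1.0 / math.sqrt(x)       # convex but decreasing on (0, inf)
composition = lambda x: g(f(x))        # equals sqrt(x)

gap_f = midpoint_gap(f, 1.0, 4.0)                  # 0.625 - 0.4 > 0
gap_g = midpoint_gap(g, 1.0, 4.0)                  # 0.75 - 1/sqrt(2.5) > 0
gap_composition = midpoint_gap(composition, 1.0, 4.0)   # 1.5 - sqrt(2.5) < 0
```

Both f and g pass the midpoint test on this pair, while the composition fails it, so g ◦ f cannot be convex; the only hypothesis of Proposition 9 that is violated is the monotonicity of g.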
Remark 8. We can summarize convexity and concavity of the composite function g ◦ f in this way: if f : I → R, g : J → R and range
f ⊆ J, then we have
(1) If f and g are both convex and g is increasing, then g ◦ f is
convex.
(2) If f is concave and g is convex and decreasing, then g ◦ f is
convex.
(3) If f and g are both concave and g is increasing, then g ◦ f is
concave.
(4) If f is convex and g is concave and decreasing, then g ◦ f is
concave.
Corollary 3. Let f be a convex function on an open interval I. Then
$e^{f(x)}$ is also convex on I.
Proof. Apply Proposition 9 with $g(x) = e^x$.
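Proposition 10. If f and g are both nonnegative convex functions on an
interval I, and either both are increasing or both are decreasing, then
h(x) = f (x)g(x) is also convex.
Proof. Let x and y be two arbitrary points such that x < y and
λ ∈ [0, 1]. Since f and g are monotone in the same direction, for
x < y we have
[f (x) − f (y)][g(y) − g(x)] ≤ 0
i.e.
f (x)g(y) + f (y)g(x) ≤ f (x)g(x) + f (y)g(y).
Since f and g are convex and nonnegative, we can multiply the two
convexity inequalities and obtain
\[
f(\lambda x + (1-\lambda)y)\, g(\lambda x + (1-\lambda)y) \le \big(\lambda f(x) + (1-\lambda) f(y)\big)\big(\lambda g(x) + (1-\lambda) g(y)\big),
\]
i.e.
\[
f(\lambda x + (1-\lambda)y)\, g(\lambda x + (1-\lambda)y) \le \lambda^2 f(x)g(x) + \lambda(1-\lambda)[f(x)g(y) + f(y)g(x)] + (1-\lambda)^2 f(y)g(y).
\]
Using the inequality f (x)g(y) + f (y)g(x) ≤ f (x)g(x) + f (y)g(y) and
the identities $\lambda^2 + \lambda(1-\lambda) = \lambda$ and
$(1-\lambda)^2 + \lambda(1-\lambda) = 1 - \lambda$, we get
\[
f(\lambda x + (1-\lambda)y)\, g(\lambda x + (1-\lambda)y) \le \lambda f(x)g(x) + (1-\lambda) f(y)g(y) = \lambda h(x) + (1-\lambda) h(y).
\]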
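The role of the monotonicity hypothesis in Proposition 10 can be seen on two small examples. The following sketch is an illustration under our own assumptions: the helper names, the pair x² and exp on [0, 2], and the pair x and 1 − x on [0, 1] are hypothetical choices that are not taken from the thesis.

```python
import math

def convexity_gap(h, x, y, lam):
    """lam*h(x) + (1-lam)*h(y) - h(lam*x + (1-lam)*y); >= 0 for a convex h."""
    return lam * h(x) + (1 - lam) * h(y) - h(lam * x + (1 - lam) * y)

def product(f, g):
    # Pointwise product of two functions.
    return lambda x: f(x) * g(x)

# Both factors are nonnegative, increasing and convex on [0, 2].
h_good = product(lambda x: x * x, math.exp)

# Both factors are nonnegative and convex (affine) on [0, 1],
# but one is increasing and the other is decreasing.
h_bad = product(lambda x: x, lambda x: 1.0 - x)

points = [i / 10 for i in range(0, 21)]    # grid on [0, 2]
lams = [j / 10 for j in range(1, 10)]
min_gap_good = min(convexity_gap(h_good, a, b, lam)
                   for a in points for b in points for lam in lams)
gap_bad = convexity_gap(h_bad, 0.0, 1.0, 0.5)   # 0 - h_bad(0.5) = -0.25
```

The first product passes the convexity test on the whole grid, whereas x(1 − x) is concave and already fails it at the midpoint of [0, 1].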
The following theorem will be used in the proof of Propositions 11
and 12.
Theorem 4. (see e.g. [3]) Let f : I → R be a continuous function.
Then f is convex if and only if f is midpoint convex, i.e.
\[
f\left(\frac{x + y}{2}\right) \le \frac{f(x) + f(y)}{2}
\]
for all x, y ∈ I.
Proof. The fact that f convex implies f midpoint convex is trivial.
We need to prove the sufficiency part. Suppose that f is not convex.
Then there exists a subinterval [a, b] such that the graph of f is not
under the chord joining (a, f (a)) and (b, f (b)), that is, the function
\[
\varphi(x) = f(x) - \frac{f(b) - f(a)}{b - a}\,(x - a) - f(a), \qquad x \in [a, b],
\]
satisfies
γ = sup{ϕ(x) | x ∈ [a, b]} > 0.
Observe that ϕ is continuous and ϕ(a) = ϕ(b) = 0. A direct calculation
shows that ϕ is also midpoint convex. Put c = inf{x ∈ [a, b] | ϕ(x) = γ};
then ϕ(c) = γ and c ∈ (a, b). By definition of c, for every h > 0 for which
c + h, c − h ∈ (a, b) we have ϕ(c − h) < ϕ(c) and ϕ(c + h) ≤ ϕ(c), so
that
\[
\varphi(c) > \frac{\varphi(c - h) + \varphi(c + h)}{2},
\]
which is in contradiction to the fact that ϕ is midpoint convex.
Remark 9. The above theorem is not true without the continuity
assumption. There exist midpoint convex functions which are not
continuous, but such an example is beyond the scope of this thesis.
As an application of Theorem 4 we formulate the following geometric
inequality, called the Hermite-Hadamard inequality. The proof is our own.
Proposition 11. [3] Let f : (a, b) → R be a continuous function.
Then f is convex if and only if
\[
\frac{1}{t - s} \int_s^t f(x)\,dx \le \frac{1}{2}\,[f(s) + f(t)]
\]
for all a < s < t < b.
Proof. Suppose first that f is a convex function and x = λs + (1 − λ)t,
0 < λ < 1. Hence
\[
\lambda = \frac{x - t}{s - t}.
\]
Thus, by a change of variables we get
\[
\frac{1}{t - s} \int_s^t f(x)\,dx = \int_0^1 f(\lambda s + (1-\lambda)t)\,d\lambda.
\]
Since f is a convex function, we have
\[
\frac{1}{t - s} \int_s^t f(x)\,dx = \int_0^1 f(\lambda s + (1-\lambda)t)\,d\lambda
\le \int_0^1 [\lambda f(s) + (1-\lambda) f(t)]\,d\lambda = \frac{1}{2}\,[f(s) + f(t)]
\]
and this completes the proof of the first part of the proposition.
Now we suppose that
\[
\frac{1}{t - s} \int_s^t f(x)\,dx \le \frac{1}{2}\,[f(s) + f(t)]
\]
for any s < t in (a, b).
We assume that the function is not convex, i.e. by Theorem 4, there
exist s and t, s < t in the interval (a, b) such that
\[
f\left(\frac{s + t}{2}\right) > \frac{f(s) + f(t)}{2}.
\]
Consider the set
\[
C = \left\{ x \in (s, t) : f(x) > f(s) + \frac{f(t) - f(s)}{t - s}\,(x - s) \right\}.
\]
The set C is not empty. One can easily verify that $\frac{s+t}{2} \in C$.
Since f is continuous the set C is an open set. Let (y, z) be the maximal
connected component of the set C containing the point $\frac{s+t}{2}$.
Since (y, f (y)) and (z, f (z)) are on the graph of the function
$g(x) = f(s) + \frac{f(t) - f(s)}{t - s}\,(x - s)$ we have:
\[
f(y) = f(s) + \frac{f(t) - f(s)}{t - s}\,(y - s)
\]
and
\[
f(z) = f(s) + \frac{f(t) - f(s)}{t - s}\,(z - s).
\]
It implies that
\[
\frac{f(y) + f(z)}{2} = f(s) + \frac{f(t) - f(s)}{t - s} \left( \frac{y + z}{2} - s \right).
\]
Therefore
\[
\begin{aligned}
\frac{1}{z - y} \int_y^z f(x)\,dx
&> \frac{1}{z - y} \int_y^z \left( f(s) + \frac{f(t) - f(s)}{t - s}\,(x - s) \right) dx \\
&= f(s) + \frac{f(t) - f(s)}{t - s} \left( \frac{y + z}{2} - s \right) \\
&= \frac{f(y) + f(z)}{2},
\end{aligned}
\]
which contradicts our assumption.
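The inequality of Proposition 11 can be checked numerically for particular functions. The sketch below is an illustration under our own assumptions: the helper names `mean_value` and `hh_gap`, the composite midpoint rule with 10 000 subintervals and the test functions exp and sqrt are hypothetical choices, and the integral is only approximated.

```python
import math

def mean_value(f, s, t, n=10_000):
    """Approximate (1/(t-s)) * integral of f over [s, t] (composite midpoint rule)."""
    width = (t - s) / n
    total = sum(f(s + (k + 0.5) * width) for k in range(n))
    return total / n

def hh_gap(f, s, t):
    """(f(s) + f(t))/2 minus the mean value of f on [s, t]."""
    return 0.5 * (f(s) + f(t)) - mean_value(f, s, t)

# Convex example: the mean value of exp on [0, 1] is e - 1, below (1 + e)/2.
mean_exp = mean_value(math.exp, 0.0, 1.0)
gap_exp = hh_gap(math.exp, 0.0, 1.0)

# Concave example: the mean value of sqrt on [1, 4] is 14/9, above 3/2,
# so the inequality of the proposition fails.
gap_sqrt = hh_gap(math.sqrt, 1.0, 4.0)
```

The sign of the gap separates the two cases here, in line with the equivalence stated in the proposition.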
Proposition 12. [2] Let I be a closed interval of the form either [0, a]
or [0, ∞). Suppose that f is a continuous function on I which satisfies
f (0) = 0. Then f is convex if and only if
\[
\sum_{i=1}^{n} (-1)^{i-1} f(x_i) \ge f\left( \sum_{i=1}^{n} (-1)^{i-1} x_i \right)
\]
for any n ≥ 2 and any n points x1 ≥ x2 ≥ · · · ≥ xn−1 ≥ xn in the
interval I.
Proof. We prove the statement by induction. Suppose that f is
convex on the closed interval I. Let x1 > x2 > x3 be arbitrary points
in the interval I and let λ ∈ (0, 1) be such that x2 = λx1 + (1 − λ)x3 .
Since f is convex we have
f (x2 ) = f (λx1 + (1 − λ)x3 ) ≤ λf (x1 ) + (1 − λ)f (x3 ),
and
f (x1 − x2 + x3 ) = f ((1 − λ)x1 + λx3 ) ≤ (1 − λ)f (x1 ) + λf (x3 ).
Therefore we get
f (x2 ) + f (x1 − x2 + x3 ) ≤ f (x1 ) + f (x3 ),
which is also valid for any x1 ≥ x2 ≥ x3 in I by continuity of f . If we
take x3 = 0, then by f (0) = 0 we have
f (x1 − x2 ) ≤ f (x1 ) − f (x2 ).
Hence we have the inequality in the proposition for n = 2 and
n = 3.
Suppose now that the inequality is valid for n = m ≥ 2. Then for any
m + 2 points x1 ≥ x2 ≥ · · · ≥ xm+2 in the interval I we have
\[
\begin{aligned}
\sum_{i=1}^{m+2} (-1)^{i-1} f(x_i)
&= f(x_1) - f(x_2) + \sum_{i=3}^{m+2} (-1)^{i-1} f(x_i) \\
&\ge f(x_1) - f(x_2) + f\left( \sum_{i=3}^{m+2} (-1)^{i-1} x_i \right) \\
&\ge f\left( \sum_{i=1}^{m+2} (-1)^{i-1} x_i \right).
\end{aligned}
\]
Here the first inequality is the induction hypothesis applied to the m
points x3 ≥ · · · ≥ xm+2 , and the second one is the case n = 3 applied to
the points x1 ≥ x2 ≥ y, where $y = \sum_{i=3}^{m+2} (-1)^{i-1} x_i$ satisfies
0 ≤ y ≤ x3 . Since the inequality holds for n = 2 and n = 3, this means
that it is valid for every n ≥ 2.
Conversely suppose the inequality holds. Then we get in particular
that
f (x2 ) + f (x1 − x2 + x3 ) ≤ f (x1 ) + f (x3 ),
for any x1 ≥ x2 ≥ x3 in I. By taking $x_2 = \frac{x_1 + x_3}{2}$ we get
\[
f\left( \frac{x_1 + x_3}{2} \right) \le \frac{f(x_1) + f(x_3)}{2};
\]
hence f is convex on I, by Theorem 4.
The following proposition is formulated as an exercise in [4].
Proposition 13. Let f be a bijection between two intervals I and J.
Then f is convex and increasing if and only if its inverse $f^{-1}$ is
increasing and concave.
Proof. Suppose that f is convex and increasing. Then we have
\[
f(\lambda x + (1-\lambda)t) \le \lambda f(x) + (1-\lambda) f(t).
\]
Considering f (x) = y and f (t) = u and by changing variables we get
\[
f(\lambda f^{-1}(y) + (1-\lambda) f^{-1}(u)) \le \lambda y + (1-\lambda) u.
\]
By applying $f^{-1}$ to both sides and using the monotonicity of $f^{-1}$
we have
\[
\lambda f^{-1}(y) + (1-\lambda) f^{-1}(u) \le f^{-1}(\lambda y + (1-\lambda) u),
\]
which means that $f^{-1}$ is concave.
Remark 10. Observe that if f is convex and decreasing then $f^{-1}$ is
convex and decreasing as well.
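Proposition 13 and Remark 10 can be illustrated with two familiar pairs of inverse functions. This is only a sketch under our own assumptions: the helper name `concavity_gap`, the pair exp and log, the self-inverse function 1/x on (0, ∞) and the sample points are illustrative choices.

```python
import math

def concavity_gap(g, y, u, lam):
    """g(lam*y + (1-lam)*u) - (lam*g(y) + (1-lam)*g(u)); >= 0 for a concave g."""
    return g(lam * y + (1 - lam) * u) - (lam * g(y) + (1 - lam) * g(u))

# exp is convex and increasing on R; its inverse log should be concave.
round_trip = math.log(math.exp(0.7))
gap_log = concavity_gap(math.log, 1.0, 9.0, 0.5)    # log(5) - log(3) > 0

# 1/x is convex and decreasing on (0, inf) and is its own inverse,
# so the inverse is convex: the concavity gap is negative.
reciprocal = lambda y: 1.0 / y
gap_reciprocal = concavity_gap(reciprocal, 1.0, 4.0, 0.5)   # 0.4 - 0.625 < 0
```

The positive gap for log matches the proposition, and the negative gap for 1/x matches the remark about convex decreasing functions.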
Theorem 5. [3] (the Discrete Jensen's Inequality). Let f be a convex
function on the open interval I and let xi ∈ I, i = 1, . . . , k. If λi > 0
and $\sum_{i=1}^{k} \lambda_i = 1$, then
\[
f\left( \sum_{i=1}^{k} \lambda_i x_i \right) \le \sum_{i=1}^{k} \lambda_i f(x_i).
\]
Proof. In order to prove this theorem we use induction. If λ1 + λ2 =
1 then for x1 and x2 we have
λ1 f (x1 ) + λ2 f (x2 ) ≥ f (λ1 x1 + λ2 x2 ).
This is true by definition of convexity. Now, suppose that the theorem
is true with k − 1 values. Let $\lambda'_i = \frac{\lambda_i}{1 - \lambda_k}$ for
i = 1, 2, ..., k − 1, so that $\sum_{i=1}^{k-1} \lambda'_i = 1$. Then we
have
\[
\begin{aligned}
\sum_{i=1}^{k} \lambda_i f(x_i)
&= \lambda_k f(x_k) + (1 - \lambda_k) \sum_{i=1}^{k-1} \lambda'_i f(x_i) \\
&\ge \lambda_k f(x_k) + (1 - \lambda_k)\, f\left( \sum_{i=1}^{k-1} \lambda'_i x_i \right) \\
&\ge f\left( \lambda_k x_k + (1 - \lambda_k) \sum_{i=1}^{k-1} \lambda'_i x_i \right) \\
&= f\left( \sum_{i=1}^{k} \lambda_i x_i \right).
\end{aligned}
\]
Hence, by the principle of induction the inequality is true for any
k ≥ 2.
Remark 11. Observe that in fact Jensen's inequality is equivalent to
the notion of convexity: take λ1 = λ, λ2 = 1 − λ, k = 2 in the above
inequality to recover Definition 1. Jensen's inequality has a lot of
applications in mathematical analysis and elsewhere.
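The discrete Jensen inequality and its reduction to the definition of convexity for k = 2 can be checked on sample data. The following sketch is an illustration under our own assumptions: the helper name `jensen_gap`, the points and the weights are hypothetical choices.

```python
import math

def jensen_gap(f, xs, weights):
    """sum(w_i * f(x_i)) - f(sum(w_i * x_i)) for positive weights summing to 1."""
    if any(w <= 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be positive and sum to 1")
    mean_x = sum(w * x for w, x in zip(weights, xs))
    return sum(w * f(x) for w, x in zip(weights, xs)) - f(mean_x)

# Three points with weights 0.2, 0.3, 0.5 and the convex function exp.
gap_exp = jensen_gap(math.exp, [0.0, 1.0, 2.0], [0.2, 0.3, 0.5])

# For k = 2 the gap is the convexity gap of Definition 1; for f(x) = x^2
# it equals lam * (1 - lam) * (x1 - x2)^2 = 0.25 * 0.75 * 4 = 0.75.
gap_square = jensen_gap(lambda x: x * x, [1.0, 3.0], [0.25, 0.75])
```

A nonnegative gap is what Theorem 5 predicts for a convex function, and the k = 2 case reproduces the inequality of Definition 1.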
References
[1] R. G. Bartle and D. R. Sherbert, Introduction to Real Analysis, John Wiley
& Sons, Inc., 2000.
[2] M. Hata, Problems and Solutions in Real Analysis, World Scientific
Publishing Co. Pte. Ltd., 2007.
[3] C. P. Niculescu and L. E. Persson, Convex Functions, Universitaria Press,
2003.
[4] A. W. Roberts and D. E. Varberg, Convex Functions, Academic Press,
New York and London, 1973.
[5] B. S. Thomson, J. B. Bruckner and A. M. Bruckner, Elementary Real
Analysis, 2008.