B
Convex Sets and Functions
Definition B.1. Let (L, +, ·) be a real linear space and let C be a subset
of L. The set C is convex if, for all x, y ∈ C and all a ∈ [0, 1], we have
(1 − a)x + ay ∈ C. In other words, every point on the line segment connecting
x and y belongs to C.
Example B.2. The convex subsets of $(\mathbb{R}, +, \cdot)$ are the intervals of $\mathbb{R}$. Regular
polygons are convex subsets of $\mathbb{R}^2$.
Definition B.3. Let U be a subset of a real linear space (L, +, ·).
A convex combination of U is an element of L of the form $a_1 x_1 + \cdots + a_k x_k$,
where $x_1, \ldots, x_k \in U$, $a_i \ge 0$ for $1 \le i \le k$, and $a_1 + \cdots + a_k = 1$.
If the conditions $a_i \ge 0$ are dropped, we have an affine combination of U.
In other words, x is an affine combination of U if there exist $a_1, \ldots, a_k \in \mathbb{R}$
such that $x = a_1 x_1 + \cdots + a_k x_k$, for $x_1, \ldots, x_k \in U$, and $\sum_{i=1}^{k} a_i = 1$.
Definition B.4. Let U be a subset of a real linear space (L, +, ·). A subset
$\{x_1, \ldots, x_n\}$ of U is affinely dependent if $0 = a_1 x_1 + \cdots + a_n x_n$ for some
numbers $a_1, \ldots, a_n$ such that at least one of them is nonzero and
$\sum_{i=1}^{n} a_i = 0$. If no such combination exists, then $x_1, \ldots, x_n$ are affinely independent.
Theorem B.5. The set $U = \{x_1, \ldots, x_n\}$ is affinely independent if and only
if the set $V = \{x_1 - x_n, \ldots, x_{n-1} - x_n\}$ is linearly independent.
Proof. Suppose that U is affinely independent but V is linearly dependent;
that is, $0 = b_1(x_1 - x_n) + \cdots + b_{n-1}(x_{n-1} - x_n)$ such that not all numbers $b_i$
are 0. This implies
\[
b_1 x_1 + \cdots + b_{n-1} x_{n-1} - \left( \sum_{i=1}^{n-1} b_i \right) x_n = 0,
\]
where the coefficients sum to 0 and are not all 0, which contradicts the affine independence of U.
Conversely, suppose that V is linearly independent but U is not affinely
independent. In this case, $0 = a_1 x_1 + \cdots + a_n x_n$ such that at least one of the
numbers $a_1, \ldots, a_n$ is nonzero and $\sum_{i=1}^{n} a_i = 0$. This implies $a_n = -\sum_{i=1}^{n-1} a_i$,
so $0 = a_1(x_1 - x_n) + \cdots + a_{n-1}(x_{n-1} - x_n)$. Observe that at least one of the
numbers $a_1, \ldots, a_{n-1}$ must be distinct from 0 because otherwise we would
have $a_1 = \cdots = a_{n-1} = a_n = 0$. This contradicts the linear independence of
V, so U is affinely independent. □
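Theorem B.5 suggests a simple computational test for affine independence: form the differences $x_i - x_n$ and check that they are linearly independent. The following Python sketch is an illustration and not part of the original text; the helper names `rank` and `affinely_independent` are hypothetical, and exact rational arithmetic is assumed so that the rank computation is not affected by rounding.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        # Look for a row at or below position r with a nonzero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate the column entry from every row below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def affinely_independent(points):
    """Criterion of Theorem B.5: the vectors x_i - x_n (i < n) are linearly independent."""
    *rest, last = points
    diffs = [[a - b for a, b in zip(p, last)] for p in rest]
    return rank(diffs) == len(diffs)


print(affinely_independent([(0, 0), (1, 0), (0, 1)]))  # vertices of a triangle
print(affinely_independent([(0, 0), (1, 1), (2, 2)]))  # three collinear points
```

Under these assumptions, the vertices of a triangle in the plane are reported as affinely independent, while three collinear points, or any four points of the plane, are not.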
Example B.6. Let $x_1$ and $x_2$ be two elements of the linear space $(\mathbb{R}^2, +, \cdot)$.
The line that passes through $x_1$ and $x_2$ consists of all x such that $x - x_1$ and
$x - x_2$ are collinear; that is, $a(x - x_1) + b(x - x_2) = 0$ for some $a, b \in \mathbb{R}$ such
that $a + b \ne 0$. Thus, we have $x = a_1 x_1 + a_2 x_2$, where
\[
a_1 + a_2 = \frac{a}{a+b} + \frac{b}{a+b} = 1,
\]
so x is an affine combination of $x_1$ and $x_2$. It is easy to see that the line
segment between $x_1$ and $x_2$ is given by the convex combinations of $x_1$
and $x_2$; that is, by the affine combinations $a_1 x_1 + a_2 x_2$ such that $a_1, a_2 \ge 0$.
Theorem B.7. If C is a convex subset of a real linear space (L, +, ·), then C
contains all convex combinations of C.
Proof. The proof is by induction on the number k ≥ 2 of terms of the convex combination and is left to the reader. □
Theorem B.8. The intersection of any collection of convex sets of a linear
space (L, +, ·) is a convex set.
Proof. Let $\mathcal{C} = \{C_i \mid i \in I\}$ be a collection of convex sets and let $C = \bigcap \mathcal{C}$.
Suppose that $x_1, \ldots, x_k \in C$, $a_i \ge 0$ for $1 \le i \le k$, and $a_1 + \cdots + a_k = 1$.
Since $x_1, \ldots, x_k \in C_i$ and each $C_i$ is convex, it follows that $a_1 x_1 + \cdots + a_k x_k \in C_i$ for every $i \in I$.
Thus, $a_1 x_1 + \cdots + a_k x_k \in C$, which proves the convexity of C. □
Corollary B.9. The family of convex sets of a linear space (L, +, ·) is a
closure system on P(L).
Proof. This statement follows immediately from Theorem B.8 by observing
that the set L is convex. □
Corollary B.9 allows us to define the convex hull of a subset U of L as the
closure $K_{\mathrm{conv}}(U)$ of U relative to the closure system of the convex subsets of
L. If $U \subseteq \mathbb{R}^n$ consists of n + 1 points such that no point is an affine combination
of the other n points, then $K_{\mathrm{conv}}(U)$ is an n-dimensional simplex in $\mathbb{R}^n$.
Example B.10. A two-dimensional simplex is defined starting from three points
$x_1, x_2, x_3$ in $\mathbb{R}^2$ such that none of these points is an affine combination of
the other two (that is, the three points are not collinear). Thus, the two-dimensional simplex generated by $x_1, x_2, x_3$ is the full triangle determined
by $x_1, x_2, x_3$.
In general, an n-dimensional simplex is the convex hull of a set of n + 1
points $x_1, \ldots, x_{n+1}$ in $\mathbb{R}^n$ such that no point is an affine combination of the
remaining n points.
Let S be the n-dimensional simplex generated by the points $x_1, \ldots, x_{n+1}$ in
$\mathbb{R}^n$. If $x \in S$, then x is a convex combination of $x_1, \ldots, x_n, x_{n+1}$.
In other words, there exist $a_1, \ldots, a_n, a_{n+1} \in [0, 1]$ such that
$\sum_{i=1}^{n+1} a_i = 1$ and $x = a_1 x_1 + \cdots + a_n x_n + a_{n+1} x_{n+1}$.
The numbers $a_1, \ldots, a_n, a_{n+1}$ are the barycentric coordinates of x relative
to the simplex S and are uniquely determined by x. Indeed, if we have
\[
x = a_1 x_1 + \cdots + a_n x_n + a_{n+1} x_{n+1} = b_1 x_1 + \cdots + b_n x_n + b_{n+1} x_{n+1},
\]
and $a_i \ne b_i$ for some i, this implies
\[
(a_1 - b_1) x_1 + \cdots + (a_n - b_n) x_n + (a_{n+1} - b_{n+1}) x_{n+1} = 0,
\]
where the coefficients $a_i - b_i$ are not all 0 and sum to $1 - 1 = 0$, which contradicts the affine independence of $x_1, \ldots, x_{n+1}$.
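As an illustration of the uniqueness argument, the barycentric coordinates of a point relative to a triangle in $\mathbb{R}^2$ can be computed by solving a 2 × 2 linear system. The sketch below is not part of the original text: the function name `barycentric` is hypothetical, and Cramer's rule is used only as one convenient way to solve the system.

```python
from fractions import Fraction


def barycentric(p, x1, x2, x3):
    """Barycentric coordinates (a1, a2, a3) of p relative to the triangle x1 x2 x3."""
    # Solve a1*(x1 - x3) + a2*(x2 - x3) = p - x3 by Cramer's rule,
    # then recover a3 from the condition a1 + a2 + a3 = 1.
    ux, uy = x1[0] - x3[0], x1[1] - x3[1]
    vx, vy = x2[0] - x3[0], x2[1] - x3[1]
    wx, wy = p[0] - x3[0], p[1] - x3[1]
    det = Fraction(ux * vy - uy * vx)
    if det == 0:
        raise ValueError("x1, x2, x3 are affinely dependent")
    a1 = (wx * vy - wy * vx) / det
    a2 = (ux * wy - uy * wx) / det
    return a1, a2, 1 - a1 - a2


inside = barycentric((1, 1), (0, 0), (4, 0), (0, 4))   # all coordinates in [0, 1]
outside = barycentric((4, 4), (0, 0), (4, 0), (0, 4))  # one coordinate is negative
```

A point belongs to the simplex exactly when all of its coordinates are nonnegative, so the sign of the smallest coordinate gives a membership test.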
The next statement plays a central role in the study of convexity. We
reproduce the proof given in [59].
Theorem B.11 (Carathéodory’s Theorem). If U is a subset of $\mathbb{R}^n$, then
for every $x \in K_{\mathrm{conv}}(U)$ we have $x = \sum_{i=1}^{n+1} a_i x_i$, where $x_i \in U$, $a_i \ge 0$ for
$1 \le i \le n + 1$, and $\sum_{i=1}^{n+1} a_i = 1$.
Proof. Consider $x \in K_{\mathrm{conv}}(U)$. We can write $x = \sum_{i=1}^{p+1} a_i x_i$, where $x_i \in U$,
$a_i \ge 0$ for $1 \le i \le p + 1$, and $\sum_{i=1}^{p+1} a_i = 1$. Let p be the smallest number
which allows this kind of expression for x. We prove the theorem by showing
that $p \le n$.
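Suppose that $p \ge n + 1$. Then, the set $\{x_1, \ldots, x_{p+1}\}$ is affinely dependent,
so there exist $b_1, \ldots, b_{p+1}$ not all zero such that $0 = \sum_{i=1}^{p+1} b_i x_i$ and $\sum_{i=1}^{p+1} b_i = 0$.
Without loss of generality, we can assume $b_{p+1} > 0$ and $\frac{a_{p+1}}{b_{p+1}} \le \frac{a_i}{b_i}$ for all i
such that $1 \le i \le p$ and $b_i > 0$. Define
\[
c_i = a_i - \frac{a_{p+1}}{b_{p+1}} b_i
\]
for $1 \le i \le p$; when $b_i \ne 0$, this is $c_i = b_i \left( \frac{a_i}{b_i} - \frac{a_{p+1}}{b_{p+1}} \right)$. We have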
\[
\sum_{i=1}^{p} c_i = \sum_{i=1}^{p} a_i - \frac{a_{p+1}}{b_{p+1}} \sum_{i=1}^{p} b_i = 1.
\]
Furthermore, $c_i \ge 0$ for $1 \le i \le p$. Indeed, if $b_i \le 0$, then $c_i \ge a_i \ge 0$; if
$b_i > 0$, then $c_i \ge 0$ because $\frac{a_{p+1}}{b_{p+1}} \le \frac{a_i}{b_i}$ for all i such that $1 \le i \le p$ and $b_i > 0$.
Thus, we have
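\[
\sum_{i=1}^{p} c_i x_i = \sum_{i=1}^{p} \left( a_i - \frac{a_{p+1}}{b_{p+1}} b_i \right) x_i = \sum_{i=1}^{p+1} a_i x_i = x,
\]
where the second equality uses $\sum_{i=1}^{p} b_i x_i = -b_{p+1} x_{p+1}$. This expresses x as a convex combination of only p points, which contradicts the choice of p. □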
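The reduction used in this proof can be carried out numerically. The sketch below is an illustration only: the function name `caratheodory_step` is hypothetical, the affine dependence `b` is taken as given and is not computed, and instead of renumbering the points so that the critical index is the last one, the code selects that index directly.

```python
from fractions import Fraction


def caratheodory_step(a, b):
    """One reduction step from the proof of Theorem B.11.

    a: convex weights (nonnegative, summing to 1) of the points x_1, ..., x_m.
    b: coefficients of an affine dependence among the same points
       (sum of b_i is 0, sum of b_i * x_i is 0, not all b_i are 0).
    Returns weights c of the same point with at least one c_i equal to 0.
    """
    a = [Fraction(v) for v in a]
    b = [Fraction(v) for v in b]
    # Choose the index j with b_j > 0 that minimises a_j / b_j.
    j = min((i for i in range(len(b)) if b[i] > 0), key=lambda i: a[i] / b[i])
    ratio = a[j] / b[j]
    # c_i = a_i - (a_j / b_j) * b_i, so that c_j = 0.
    return [ai - ratio * bi for ai, bi in zip(a, b)]


# The points 0, 1, 2 of the real line satisfy 1*0 - 2*1 + 1*2 = 0 with 1 - 2 + 1 = 0.
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # represents x = 3/4
reduced = caratheodory_step(weights, [1, -2, 1])
```

In this one-dimensional example the point 3/4 is first written with three points and then, after one step, with only the two points 0 and 1, in agreement with the bound n + 1 = 2.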
A finite set of points P in R2 is a convex polygon if no member p of P lies
in the convex hull of P − {p}.
Theorem B.12. A finite set of points P in R2 is a convex polygon if and
only if no member p of P lies in a two-dimensional simplex formed by three
other members of P .
Proof. The argument is straightforward and is left to the reader as an exercise. □
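The criterion of Theorem B.12 can be checked by brute force for small point sets. The following sketch is an illustration, not part of the original text: the names `orient`, `in_triangle` and `is_convex_polygon` are hypothetical, triangles are treated as closed sets, and degenerate (collinear) triples are skipped because they do not form a two-dimensional simplex.

```python
from itertools import combinations


def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_triangle(p, a, b, c):
    """True if p lies in the closed triangle abc (assumed nondegenerate)."""
    d1, d2, d3 = orient(a, b, p), orient(b, c, p), orient(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # p is inside or on the boundary when the three orientations do not disagree.
    return not (has_neg and has_pos)


def is_convex_polygon(points):
    """Criterion of Theorem B.12: no point lies in a triangle formed by three others."""
    for p in points:
        others = [q for q in points if q != p]
        for a, b, c in combinations(others, 3):
            if orient(a, b, c) != 0 and in_triangle(p, a, b, c):
                return False
    return True


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
with_inner_point = [(0, 0), (4, 0), (0, 4), (1, 1)]
```

The test examines every triple of remaining points for every point, so it is meant for small examples such as the five-point configurations considered below.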
Theorem B.13 (Radon’s Theorem). Let $P = \{x_i \in \mathbb{R}^n \mid 1 \le i \le n + 2\}$
be a set of n + 2 points in $\mathbb{R}^n$. Then, there are two disjoint subsets R and Q
of P such that $K_{\mathrm{conv}}(R) \cap K_{\mathrm{conv}}(Q) \ne \emptyset$.
Proof. Since n + 2 points in $\mathbb{R}^n$ are affinely dependent, there exist $a_1, \ldots, a_{n+2}$
not all equal to 0 such that
\[
\sum_{i=1}^{n+2} a_i x_i = 0 \tag{B.1}
\]
and $\sum_{i=1}^{n+2} a_i = 0$. Without loss of generality, we can assume that the first k
numbers are positive and the last n + 2 − k are not. Let $a = \sum_{i=1}^{k} a_i > 0$
and let $b_j = \frac{a_j}{a}$ for $1 \le j \le k$. Similarly, let $c_l = -\frac{a_l}{a}$ for $k + 1 \le l \le n + 2$.
Equality (B.1) can now be written as
\[
\sum_{j=1}^{k} b_j x_j = \sum_{l=k+1}^{n+2} c_l x_l.
\]
Since the numbers $b_j$ and $c_l$ are nonnegative and $\sum_{j=1}^{k} b_j = \sum_{l=k+1}^{n+2} c_l = 1$,
it follows that $K_{\mathrm{conv}}(\{x_1, \ldots, x_k\}) \cap K_{\mathrm{conv}}(\{x_{k+1}, \ldots, x_{n+2}\}) \ne \emptyset$. □
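The construction in this proof translates directly into a short computation. The sketch below is illustrative only: the function name `radon_point` is hypothetical, and the coefficients of the affine dependence (B.1) are supplied by the caller and are not computed.

```python
from fractions import Fraction


def radon_point(points, coeffs):
    """Split points by the sign of the coefficients of an affine dependence.

    coeffs must satisfy sum(a_i) = 0 and sum(a_i * x_i) = 0, not all zero.
    Returns (R, Q, z), where z lies in the convex hulls of both R and Q.
    """
    coeffs = [Fraction(c) for c in coeffs]
    pos = [i for i, c in enumerate(coeffs) if c > 0]
    rest = [i for i, c in enumerate(coeffs) if c <= 0]
    a = sum(coeffs[i] for i in pos)  # normalising constant, a > 0
    dim = len(points[0])
    # z = sum_j b_j x_j with b_j = a_j / a, a convex combination of the points in R.
    z = tuple(sum(coeffs[i] / a * points[i][d] for i in pos) for d in range(dim))
    return [points[i] for i in pos], [points[i] for i in rest], z


# Four points of the plane (n = 2) with x_1 + x_2 - x_3 - x_4 = 0.
corners = [(0, 0), (2, 2), (2, 0), (0, 2)]
R, Q, z = radon_point(corners, [1, 1, -1, -1])
```

For the corners of a square, the two subsets are the pairs of opposite corners and the common point is the intersection of the diagonals.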
Theorem B.14 (Klein’s Theorem). If P ⊆ R2 is a set of five points such
that no three of them are collinear, then P contains four points that form a
convex quadrilateral.
Proof. Let P = {xi | 1 ≤ i ≤ 5}. If these five points form a convex polygon,
then any four of them form a convex quadrilateral. If exactly one point is in
the interior of a convex quadrilateral formed by the remaining four points,
then the desired conclusion is reached.
Suppose that none of the previous cases occur. Then, two of the points,
say $x_p, x_q$, are located inside the triangle formed by the remaining points
$x_i, x_j, x_k$. Note that the line $x_p x_q$ intersects two sides of the triangle $x_i x_j x_k$,
say $x_i x_j$ and $x_i x_k$ (see Figure B.1). Then $x_p x_q x_k x_j$ is a convex quadrilateral. □

Fig. B.1. A five-point configuration in $\mathbb{R}^2$.
A function f : R −→ R is convex if its graph on an interval is located
below the chord determined by the endpoints of the interval. More formally,
we have the following definition.
Definition B.15. A function f : R −→ R is convex if f (tx + (1 − t)y) ≤
tf (x) + (1 − t)f (y) for every x, y ∈ Dom(f ) and t ∈ [0, 1]. The function
g : R −→ R is concave if −g is convex.
Theorem B.16. If $f : \mathbb{R} \longrightarrow \mathbb{R}$ is a convex function and $a < b \le c$, then
\[
\frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(a)}{c - a}.
\]
Proof. Since $a < b \le c$, we can write $b = ta + (1 - t)c$, where $t = \frac{c - b}{c - a} \in [0, 1)$.
The convexity of f yields the inequality
\[
f(b) \le \frac{c - b}{c - a} f(a) + \frac{b - a}{c - a} f(c),
\]
which, after subtracting $f(a)$ from both sides and dividing by $b - a > 0$, is the desired inequality. □
A similar result follows.
Theorem B.17. If $f : \mathbb{R} \longrightarrow \mathbb{R}$ is a convex function and $a \le b < c$, then
\[
\frac{f(c) - f(a)}{c - a} \le \frac{f(c) - f(b)}{c - b}.
\]
Proof. The argument is similar to the proof of Theorem B.16. □
Corollary B.18. Let $f : \mathbb{R} \longrightarrow \mathbb{R}$ be a convex function and let $p, q, p', q'$ be
four numbers such that $p \le p' < q \le q'$. We have the inequality
\[
\frac{f(q) - f(p)}{q - p} \le \frac{f(q') - f(p')}{q' - p'}. \tag{B.2}
\]
Proof. By Theorem B.16 applied to the numbers $p', q, q'$, we have
\[
\frac{f(q) - f(p')}{q - p'} \le \frac{f(q') - f(p')}{q' - p'}.
\]
Similarly, by applying Theorem B.17 to $p, p', q$, we obtain
\[
\frac{f(q) - f(p)}{q - p} \le \frac{f(q) - f(p')}{q - p'}.
\]
The inequality of the corollary can be obtained by combining the last two
inequalities. □
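Inequality (B.2) says that chord slopes of a convex function do not decrease as the chord moves to the right. A small numerical illustration follows; it is a sketch rather than part of the original text, the helper names `slope` and `square` are hypothetical, and the convex function $f(x) = x^2$ is chosen only as an example.

```python
def slope(f, p, q):
    """Slope of the chord of f between the abscissas p and q (p != q)."""
    return (f(q) - f(p)) / (q - p)


def square(x):
    """A convex function used as an example."""
    return x * x


# Four numbers with p <= p' < q <= q', as required by Corollary B.18.
p, p_prime, q, q_prime = 0, 1, 3, 5
left = slope(square, p, q)               # (9 - 0) / (3 - 0)
right = slope(square, p_prime, q_prime)  # (25 - 1) / (5 - 1)
print(left <= right)
```

The same helper can be used to check Theorems B.16 and B.17 on specific triples $a, b, c$.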
From Corollary B.18, it follows that if f : R −→ R is convex and differentiable everywhere, then its derivative is an increasing function.
The converse is also true; namely, if f is differentiable everywhere and its
derivative is an increasing function, then f is convex. Indeed, let a, b, c be
three numbers such that $a < b < c$. By the mean value theorem, there exist
$p \in (a, b)$ and $q \in (b, c)$ such that
\[
f'(p) = \frac{f(b) - f(a)}{b - a} \quad \text{and} \quad f'(q) = \frac{f(c) - f(b)}{c - b}.
\]
Since $f'(p) \le f'(q)$, we obtain
\[
\frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(b)}{c - b},
\]
which implies
\[
f(b) \le \frac{c - b}{c - a} f(a) + \frac{b - a}{c - a} f(c);
\]
that is, the convexity of f. Thus, if f is twice differentiable everywhere and its
second derivative is nonnegative everywhere, then it follows that f is convex.
Clearly, under the same conditions of differentiability as above, if the second
derivative is nonpositive everywhere, then f is concave.
The functions listed in Table B.1, defined on the set $\mathbb{R}_{>0}$, provide
examples of convex (or concave) functions.
Theorem B.19 (Jensen’s Theorem). Let f be a function that is convex on
an interval I. If $t_1, \ldots, t_n \in [0, 1]$ are n numbers such that $\sum_{i=1}^{n} t_i = 1$, then
\[
f\left( \sum_{i=1}^{n} t_i x_i \right) \le \sum_{i=1}^{n} t_i f(x_i)
\]
for every $x_1, \ldots, x_n \in I$.
Proof. The argument is by induction on n, where n ≥ 2. The basis step, n = 2,
follows immediately from Definition B.15.
Suppose that the statement holds for n, and let $u_1, \ldots, u_n, u_{n+1} \in [0, 1]$ be n + 1
numbers such that $\sum_{i=1}^{n+1} u_i = 1$. We may assume that $u_n + u_{n+1} > 0$, since otherwise $u_{n+1} = 0$ and the inequality reduces to the inductive hypothesis. We have
\[
f(u_1 x_1 + \cdots + u_{n-1} x_{n-1} + u_n x_n + u_{n+1} x_{n+1})
= f\left( u_1 x_1 + \cdots + u_{n-1} x_{n-1} + (u_n + u_{n+1}) \frac{u_n x_n + u_{n+1} x_{n+1}}{u_n + u_{n+1}} \right).
\]
By the inductive hypothesis, we can write
\[
f(u_1 x_1 + \cdots + u_{n-1} x_{n-1} + u_n x_n + u_{n+1} x_{n+1})
\le u_1 f(x_1) + \cdots + u_{n-1} f(x_{n-1}) + (u_n + u_{n+1}) f\left( \frac{u_n x_n + u_{n+1} x_{n+1}}{u_n + u_{n+1}} \right).
\]
Next, by the convexity of f, we have
\[
f\left( \frac{u_n x_n + u_{n+1} x_{n+1}}{u_n + u_{n+1}} \right)
\le \frac{u_n}{u_n + u_{n+1}} f(x_n) + \frac{u_{n+1}}{u_n + u_{n+1}} f(x_{n+1}).
\]
Combining this inequality with the previous inequality gives the desired conclusion. □
Of course, if f is a concave function and $t_1, \ldots, t_n \in [0, 1]$ are n numbers
such that $\sum_{i=1}^{n} t_i = 1$, then
\[
f\left( \sum_{i=1}^{n} t_i x_i \right) \ge \sum_{i=1}^{n} t_i f(x_i). \tag{B.3}
\]
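Both forms of Jensen's inequality can be illustrated numerically. The sketch below is not part of the original text: the function name `jensen_gap` is hypothetical, and the weights and points are arbitrary sample values chosen for the illustration.

```python
import math


def jensen_gap(f, weights, xs):
    """Return sum t_i f(x_i) - f(sum t_i x_i); nonnegative when f is convex."""
    mean = sum(t * x for t, x in zip(weights, xs))
    return sum(t * f(x) for t, x in zip(weights, xs)) - f(mean)


weights = [0.2, 0.3, 0.5]  # numbers in [0, 1] that sum to 1
xs = [1.0, 2.0, 4.0]
gap_exp = jensen_gap(math.exp, weights, xs)  # exp is convex: the gap is >= 0
gap_log = jensen_gap(math.log, weights, xs)  # log is concave: the gap is <= 0
```

The sign of the gap reverses for the concave function, as stated by inequality (B.3).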
Table B.1. Examples of convex or concave functions.

Function              Second Derivative      Convexity Property
$x^r$ for $r > 0$     $r(r - 1)x^{r-2}$      concave for $r < 1$, convex for $r \ge 1$
$\ln x$               $-\frac{1}{x^2}$       concave
$x \ln x$             $\frac{1}{x}$          convex
$e^x$                 $e^x$                  convex
Example B.20. We saw that the function $f(x) = \ln x$ is concave. Therefore, if
$t_1, \ldots, t_n \in [0, 1]$ are n numbers such that $\sum_{i=1}^{n} t_i = 1$, then
\[
\ln\left( \sum_{i=1}^{n} t_i x_i \right) \ge \sum_{i=1}^{n} t_i \ln x_i.
\]
This inequality can be written as
\[
\ln\left( \sum_{i=1}^{n} t_i x_i \right) \ge \ln \prod_{i=1}^{n} x_i^{t_i},
\]
or equivalently
\[
\sum_{i=1}^{n} t_i x_i \ge \prod_{i=1}^{n} x_i^{t_i},
\]
for $x_1, \ldots, x_n \in (0, \infty)$.
In the special case where $t_1 = \cdots = t_n = \frac{1}{n}$, we have the inequality that
relates the arithmetic to the geometric average of n positive numbers:
\[
\frac{x_1 + \cdots + x_n}{n} \ge \left( \prod_{i=1}^{n} x_i \right)^{\frac{1}{n}}. \tag{B.4}
\]
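The inequality between the weighted arithmetic and geometric averages can be checked on sample data. The following sketch is an illustration only; the function names `weighted_am` and `weighted_gm` are hypothetical, and the geometric average is computed through logarithms, which assumes that all the numbers are positive.

```python
import math


def weighted_am(ts, xs):
    """Weighted arithmetic average: sum of t_i * x_i."""
    return sum(t * x for t, x in zip(ts, xs))


def weighted_gm(ts, xs):
    """Weighted geometric average: product of x_i ** t_i, for x_i > 0."""
    return math.exp(sum(t * math.log(x) for t, x in zip(ts, xs)))


xs = [1.0, 4.0, 16.0]
ts = [1 / 3] * 3           # equal weights, the special case (B.4)
am = weighted_am(ts, xs)   # close to 7
gm = weighted_gm(ts, xs)   # close to 4, the cube root of 64
print(am >= gm)
```

With unequal weights the same two functions illustrate the more general inequality of Example B.20.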
Let $w = (w_1, \ldots, w_n) \in \mathbb{R}^n$ be such that $\sum_{i=1}^{n} w_i = 1$. For $r \ne 0$,
the w-weighted mean of order r of a sequence of n positive numbers $x = (x_1, \ldots, x_n) \in \mathbb{R}^n_{>0}$ is the number
\[
\mu_w^r(x) = \left( \sum_{i=1}^{n} w_i x_i^r \right)^{\frac{1}{r}}.
\]
Of course, $\mu_w^r(x)$ is not defined for r = 0; we adopt the special definition
\[
\mu_w^0(x) = \lim_{r \to 0} \mu_w^r(x).
\]
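We have
\[
\begin{aligned}
\lim_{r \to 0} \ln \mu_w^r(x)
&= \lim_{r \to 0} \frac{\ln \sum_{i=1}^{n} w_i x_i^r}{r} \\
&= \lim_{r \to 0} \frac{\sum_{i=1}^{n} w_i x_i^r \ln x_i}{\sum_{i=1}^{n} w_i x_i^r} \\
&= \sum_{i=1}^{n} w_i \ln x_i \\
&= \ln \prod_{i=1}^{n} x_i^{w_i},
\end{aligned}
\]
where the second equality follows from l'Hôpital's rule.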
Thus, if we define $\mu_w^0(x) = \prod_{i=1}^{n} x_i^{w_i}$, the weighted mean of order r becomes
a function continuous everywhere with respect to r.
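For $w_1 = \cdots = w_n = \frac{1}{n}$, we have
\[
\begin{aligned}
\mu_w^{-1}(x) &= \frac{n x_1 \cdots x_n}{x_2 \cdots x_n + \cdots + x_1 \cdots x_{n-1}} && \text{(the harmonic average of } x\text{)}, \\
\mu_w^{0}(x) &= (x_1 \cdots x_n)^{\frac{1}{n}} && \text{(the geometric average of } x\text{)}, \\
\mu_w^{1}(x) &= \frac{x_1 + \cdots + x_n}{n} && \text{(the arithmetic average of } x\text{)}.
\end{aligned}
\]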
Theorem B.21. If $p < r$, we have $\mu_w^p(x) \le \mu_w^r(x)$.
Proof. There are three cases depending on the position of 0 relative to p and
r.
In the first case, suppose that $r > p > 0$. The function $f(x) = x^{\frac{r}{p}}$ is
convex, so by Jensen’s inequality applied to $x_1^p, \ldots, x_n^p$, we have
\[
\left( \sum_{i=1}^{n} w_i x_i^p \right)^{\frac{r}{p}} \le \sum_{i=1}^{n} w_i x_i^r,
\]
which implies
\[
\left( \sum_{i=1}^{n} w_i x_i^p \right)^{\frac{1}{p}} \le \left( \sum_{i=1}^{n} w_i x_i^r \right)^{\frac{1}{r}},
\]
which is the inequality of the theorem.
If $r > 0 > p$, the function $f(x) = x^{\frac{r}{p}}$ is again convex because
$f''(x) = \frac{r}{p} \left( \frac{r}{p} - 1 \right) x^{\frac{r}{p} - 2} \ge 0$. Thus, the same argument works as in the previous case.
Finally, suppose that $0 > r > p$. Since $0 < \frac{r}{p} < 1$, the function $f(x) = x^{\frac{r}{p}}$
is concave. Thus, by Jensen’s inequality,
\[
\left( \sum_{i=1}^{n} w_i x_i^p \right)^{\frac{r}{p}} \ge \sum_{i=1}^{n} w_i x_i^r.
\]
Since $\frac{1}{r} < 0$, we obtain again
\[
\left( \sum_{i=1}^{n} w_i x_i^p \right)^{\frac{1}{p}} \le \left( \sum_{i=1}^{n} w_i x_i^r \right)^{\frac{1}{r}}.
\]
□
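The monotonicity of the weighted mean in its order can be observed numerically. The sketch below is an illustration and not part of the original text: the function name `power_mean` is hypothetical, the weights are assumed to be nonnegative with sum 1 (as needed for Jensen's inequality), and the order r = 0 is handled with the geometric-mean definition given above.

```python
import math


def power_mean(w, x, r):
    """w-weighted mean of order r of the positive numbers x (weights sum to 1)."""
    if r == 0:
        # Limit case: the weighted geometric mean, the product of x_i ** w_i.
        return math.exp(sum(wi * math.log(xi) for wi, xi in zip(w, x)))
    return sum(wi * xi ** r for wi, xi in zip(w, x)) ** (1 / r)


w = [0.25, 0.25, 0.5]
x = [1.0, 2.0, 8.0]
orders = [-2, -1, 0, 1, 2]
means = [power_mean(w, x, r) for r in orders]
# Theorem B.21 predicts that this list is nondecreasing.
print(all(a <= b for a, b in zip(means, means[1:])))
```

Evaluating `power_mean` at a very small nonzero order, such as `1e-6`, gives a value close to the geometric mean, which illustrates the continuity in r at 0.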
C
Useful Integrals and Formulas
C.1 Euler’s Integrals
The integrals
\[
B(a, b) = \int_0^1 x^{a-1} (1 - x)^{b-1} \, dx, \qquad
\Gamma(a) = \int_0^{\infty} x^{a-1} e^{-x} \, dx,
\]
are known as Euler’s integral of the first type and Euler’s integral of the second
type, respectively. We assume here that a and b are positive numbers to ensure
that the integrals are convergent.
Replacing x by 1 − x yields the equality
\[
B(a, b) = -\int_1^0 (1 - x)^{a-1} x^{b-1} \, dx = B(b, a),
\]
which shows that B is symmetric.
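The symmetry of B can be observed numerically. The following sketch is only an illustration: the function name `beta_midpoint` is hypothetical, and the midpoint rule is chosen for simplicity, under the assumption that a ≥ 1 and b ≥ 1 so that the integrand is bounded on [0, 1].

```python
def beta_midpoint(a, b, n=20000):
    """Midpoint-rule approximation of B(a, b), the integral of x^(a-1) (1-x)^(b-1) on [0, 1]."""
    h = 1.0 / n
    return h * sum(
        ((k + 0.5) * h) ** (a - 1) * (1 - (k + 0.5) * h) ** (b - 1)
        for k in range(n)
    )


# Direct integration of x (1 - x)^2 gives 1/2 - 2/3 + 1/4 = 1/12.
approx = beta_midpoint(2, 3)
swapped = beta_midpoint(3, 2)
```

The two approximations agree up to rounding, and both are close to the exact value 1/12.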
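Integrating B(a, b) by parts, we obtain
\[
\begin{aligned}
B(a, b) &= \int_0^1 x^{a-1} (1 - x)^{b-1} \, dx \\
&= \int_0^1 (1 - x)^{b-1} \, d\!\left( \frac{x^a}{a} \right) \\
&= \left. \frac{x^a (1 - x)^{b-1}}{a} \right|_0^1 + \frac{b - 1}{a} \int_0^1 x^a (1 - x)^{b-2} \, dx \\
&= \frac{b - 1}{a} \int_0^1 x^{a-1} (1 - x)^{b-2} \, dx - \frac{b - 1}{a} \int_0^1 x^{a-1} (1 - x)^{b-1} \, dx \\
&= \frac{b - 1}{a} B(a, b - 1) - \frac{b - 1}{a} B(a, b),
\end{aligned}
\]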