521 JENSEN`S INEQUALITY* The simplest form of

1937-1
JENSEN'S INEQUALITY
521
22. Zariski, O., On the linear connection index of the algebraic surfaces
zn—f{x, y), Proceedings of the National Academy of Sciences, vol. 15 (1929),
pp. 494-501.
23. Zariski, 0 . , On the non-existence of curves of order 8 with 16 cusps,
American Journal of Mathematics, vol. 53 (1931), pp. 309-318.
24. Zariski, O., On the irregularity of cyclic multiple planes, Annals of
Mathematics, (2), vol. 32 (1931), pp. 485-511.
25. Zariski, O., Algebraic Surfaces, Ergebnisse der Mathematik^ vol. 3
(1935), pp. 160-181.
W E L L S COLLEGE
J E N S E N ' S INEQUALITY*
BY E . J. MCSHANE
The simplest form of Jensen's inequality is that if 4>{x) is a
convex function, and m is the arithmetic mean of Xi, • • • , xn,
then the mean of the numbers <j>(xn) is not less than 0(m). This
inequality can be generalized in several different ways. The
function <j>(x) can be replaced by a convex function of several
variables, and the arithmetic mean can be replaced by any one
of several other means, as has been shown in various proofs.
Since the inequality is of considerable utility, it seems worth
while to have it established in a form which is general enough to
cover a wide assortment of applications.
The proofs will rest on two well known properties of convex
sets, f If K is closed and convex and a point p is not in K, then
p can be separated from K by a hyperplane. If K is closed and
convex and p is a boundary point of K, there is a hyperplane of
support of K passing through p.
1. The Inequality in Geometric and in Analytic Form, It
be convenient in the following proofs to use these symbols
definitions :
Rn is w-dimensional Euclidean space. Its points will be
noted by fa, - • • , zn) or by z. Linear functions ^Z 0 ^»' o r
will be symbolized by l(z).
will
and
de^n
* Presented to the Society, December 31, 1936.
t A set is convex if for every pair P, Q of points of the set the line segment
PQ is contained in the set.
522
[August,
E. J. MCSHANE
L is a linear class of real valued functions f(x) defined on a
set E. It shall be supposed have the properties:
(la) If fiy f2 are in L and ki, k2 are real numbers, kifi+k2f2 is in
L.
(lb) The function defined and constantly equal to 1 on £ is
in L.
Mf is a linear mean defined on L, with these properties:
(2a) ATI = 1.
(2b) If ƒ], f2 are in L and ki, k2 are real numbers,
M(k\f^+k2f2)
= k1Mf1+k2Mf2.
(2c) If ƒ (a) is in L and ƒ(*) ^ 0 , then Af/ ^ 0 .
If f(x) is an n-tuple of functions (fi(x), • • • , fn(x)) of L, we
denote by Mf the w-tuple (M/i, • • • , Mfn). From (2) we obtain
(3) Ml(f)=l(Mf)
for every function l(z) linear on i£ n .
The geometric formulation of Jensen's inequality is as follows:
THEOREM 1. Let (1) and (2) be satisfied. Let K be a closed convex point set in Rn. Letfi(x), • • • ,fn(x) be functions of the class L
such that f = (fi, • • • , fn) is in Kfor all x in E. Then Mf is in K.
Let l(z)+c = 0 be a hyperplane in Rn such that K is entirely
to one side, say l(z)+c^0
for z in K. Then / ( / ) + £ ^ 0 for all x,
and OSM(l(f)+c)
= Ml(f) + Mc = l(Mf)+c, so that Mf lies on
the same side of the hyperplane as K. That is, no hyperplane
separates Mf from K. This is only possible if Mf is in K.
The more usual analytical formulation of the inequality is
covered by the following theorem :
THEOREM 2. Let (1) and (2) be satisfied. Let K be a closed convex point set in Rn, and let <j>(z) be continuous and convex* on K.
Let fi(x)y - - • , fn(x) be functions of the class L such that f(x)
=
(fi(x)> ' ' ' 1 fn(oc)) is in Kfor all x in E, and such, moreover,
that (t>(f(x)) is in the class L. Then <j>(Mf) is defined and
(J)
4>(Mf)^M(j>(f).
* We call <f> convex on K if 0(J(zi+z 2 ) ) Si(<f>(zi) +<Kz2) ) for all zh z2 in K.
523
JENSEN'S INEQUALITY
1937-1
Denote the points (si, • • • , sn+i) of Rn+i by z 1 . These can
also be denoted by (z, s n +i), where z is in i? n . Define X\ to
be the set in Rn+i of points (z, sn+i) such that z is in X and
2n+i^0( z )' It is clear that Ki is closed and convex, and for all
x in E the point (/(#), cj>(£(x)) is in JRTi. Hence by Theorem 1,
the point (Aff, M<f>(f)) is in 2Ti; that is, Mf is in K and
M<t>{i)^4>{Mi).
Obviously we could weaken our hypotheses somewhat by
omitting the requirement that 0 be continuous and assuming
instead that the set Ki is closed and convex. This would permit
0(z) to have infinite discontinuities on the boundary of K.
EXAMPLES: (1) Let a(xi, • • • , xm) be a positively monotonic* function of the variables (xi, • • • , xm), and let E be a set
measurable with respect to a and such that 0<maE < <*>. Let L
be the class of all functions f(x) which are Lebesgue-Stieltjes
integrable with respect to a over E. Let Mf = f Efda/ f Eda. Then
conditions (1) and (2) are satisfied. The conclusion of Theorem 2
assumes the form
<t>( \ fida/
I da, • • • ,
I fnda/
I da J
S J*(/i, • • ' ,fn)da/ J da.
In the next three examples requirements of convergence or
integrability are too obvious to need statement:
(2) The range of x is (1, • • • , m) or (1, 2, • • • ), so t h a t / ( x ) is a
(finite or infinite) sequence (&i, #2, * • * )> and
Mf—^CiaJ^cu
where c^O and 0 < ] ^ C ; < oo.
(3) The functions ƒ of L are continuous, and Mf =
fEdx/fEdx.
(4) The functions ƒ of L are Lebesgue measurable over the
measurable set E, and Mf = ffpdx/Jpdx}
where p(x)^0
and
0<fpdx<<x>.
(5) E is in the interval (0,1), L is the class of all bounded
functions on E, Mf is the Banach intégrait of ƒ over (0, 1).
(6) E is the set of all real numbers, L the class of all uniformly almost periodic functions, Mf is the mean value of ƒ.
* That is, a function whose mth difference is non-negative,
t Banach, Théorie des Opérations Linéaires, p. 31.
524
E. J. MCSHANE
[August,
(7) More generally, E is any group, L is the class of all functions almost periodic on E, Mf is von Neumann's* mean value
off.
(8) In the space (xi, x2, • • • ) of infinitely many dimensions,
(O^Xi^ 1), E is a set of finite positive measure,f L is the class
of functions summable over E, Mf =
fEfdx/mE.
2. Conditions for Strict Inequality. In §1 we have not mentioned conditions for strict inequality. To investigate this question it is convenient to define negligible sets. A set ScE is
negligible (with respect to L and M) if there exists a function ƒ
in the class L such that
(a) ƒ(*) ^ 0 on £ ,
(b) ƒ(*) > 0 on 5 ,
(c) Mf = 0.
It follows readily that every subset of a negligible set is negligible, and so is every set which is the sum of a finite number of
negligible sets. It is then easy to prove the following theorem:
THEOREM 3. In Theorem 1, Mf is a boundary point of K only
if all points f(x) except those corresponding to a negligible set of x
belong to the intersection of K with one of its hyperplanes of support.
If Mf is a boundary point of K, through it there passes a
hyperplane of support tr: l(z)+c = 0 of K\ say l(z)+c^Q for z
in K. Let S be the set of x such that f{x) is not in the intersection wK; then l(f(x))+c>0
on 5. Since l(f(x))+c^z0
for all x
and M(l(f(x))+c) = l{Mf)+c = 0, we see that S is negligible.
I omit the easy proof of the following theorem :
THEOREM 4. If the set K is strictly convex, and <j> is strictly
convex on K, then in Theorem 2 equality holds only if f%{x) = Mfi
— const. {i = 1, • • • , n) except on a negligible set.
I am unable to state whether the conditions ƒ* = Mf » except
on a negligible set are sufficient as well as necessary for equality
in Theorem 4. However, by adding a further postulate concern* J. von Neumann, Almost periodic functions in a group, Transactions of
this Society, vol. 36 (1934); in particular, p. 452.
t P. J. Daniell, Integrals in an infinite number of dimensions, Annals of
Mathematics, (2), vol. 20 (1919), p. 281.
B. Jessen, The theory of integration in a space of an infinite number of dimensions, Acta Mathematica, vol. 63 (1934), p. 249.
1937-]
JENSEN'S INEQUALITY
525
ing L and M we can establish this, even when K is not strictly
convex. We henceforth restrict our attention to systems L, M
such that the following condition holds:
(4) If S is any negligible subset of E, and f(x) is any (real)
function which vanishes on E — S, then ƒ(x) is in the class L
and Mf = 0.
In example (1) negligible sets are sets of measure maS = 0;
hence condition (4) is satisfied. In example (2), a set 5 of integers is negligible iî^usCi^O;
in (4), S is negligible if p(x) = 0
on almost all of S. In (8), S is negligible if mS = 0. In (3), (6),
and (7), only the empty set is negligible. For all of these (4)
is valid. I do not know whether example (5) satisfies (4).
An immediate consequence of conditions (1), (2), and (4) is
that if f(x) is any function of class L, and g(x) =f(x) except on a
negligible set, then g(x) is in L and Mf=Mg.
THEOREM 5. If condition (4) is satisfied, then in inequality
(J) equality holds if and only if the following condition holds :
For all x except at most those belonging to a negligible set S, the
point (fi(x), • - , fn(%)) belongs to a convex subset K' of K on
which <j){z) is linear. In particular, if<t>(z) is strictly convex* equality holds if and only if f%{x) — const, except on a negligible set.
The last statement is an immediate consequence of the first,
for if <j> is strictly convex the only subsets Kf on which <j> is
linear consist of single points. Suppose then that the condition
of Theorem 5 holds; by redefining ƒ*(#) on S, it will be true that i
is in K' for all x, without change in Mi or M(<f>(f)). On K' we
have 4>{z)**l(z)+c; hence ikf<£(/)= M(l(f)+c)=l(Mf)+c.
But
since K' is convex, by Theorem 1 the point Mi is in K', and so
4>(Mi)*=l(Mi)+c. Hence equality holds in the inequality (J).
To prove the necessity of our condition we first observe that
there may be linear relations l(i(x))+c = 0 holding for all x except those of a negligible set. We choose a maximal set of such
linear relations; there is no loss of generality in assuming that
these are of the form / 5+ i(x)— • • • =fn(x)=0
except on Si,
where Si is negligible. Then Mf,+i= • • • =Mfn = 0.
We now change notation. Let Rs be the space of points
(si, • • • , z8) ; let H be the set of (zi, • • • , z8) such that
0i, • • • , s„ 0, • • • , 0) is in K; let yp(zu • • - , * , ) =c/>(zlf • • - , * „
* We assume only that K is convex, not that it is strictly convex.
526
E. J. MCSHANE
[August,
O, •• - , 0) on H; in the space R8+i of points (zx, • • • , zat zn+i)
let Hi be the set for which (zï} • • • , za) is in H and
S n + i ^ ( * i , • • • i *«). For x in 22 —Si we have (fi(x), • • • ,ƒ,(*))
in i J and (/i, • • • , ƒ„ ^(/i, • • • , ƒ,)) in Hx. If equality holds in
(ƒ), then
W A , • • • , ƒ . ) = Mtf>(/i, • • - , ƒ . , 0, •• - , 0) = M4>(fL, • • • , fn)
= 0(Jf/i, • • • , M/ S , 0, • • • , 0)
= V W x , • - . , AT/.),
so the point (Mfa • • • , Mfs, M\[/(fi, • • • , ƒ,)) is a boundary
point of Hi in i? s+ i.
By Theorem 3, for all points x of E — Si except a negligible
set S2 the point (fi(x), • • • , ƒ,(#), ^(/i, • • • , ƒ,)) belongs
to the intersection of Hi with a hyperplane of support ir:
a+bi%i+ - - - +bszs+czn+i = 0. T h a t is, except on Si+S2 the
equation a+bifi+ • • • +bjs+c\[/{fi,
• • • , fs)=0
holds. Here
£ ^ 0 ; otherwise we would have a new linear relation between the
fi independent of the maximal set / s +i = • • • =fn = 0. We
may therefore suppose that c = l ; hence ir has the equation
a+^biZi+Zn+i^O.
The left member of this equation is positive
for some points (si, • • • , z8, zn+i) of Hi, since zn+i is arbitrarily
large. But i is a hyperplane of support, so
a+£b&i+zn+i
does not change sign on Hi; therefore a+]>2b&i+zn+i*zQ on Hi.
For (zu • • • , zs) in H the point (*i, • • • , zay \l/(zu • • • , z,)) is
in iJj, so that a + ^ ^ i + ^ f e •• • , 2 , ) ^ 0 o n £ Moreover, if
Zn+i>$(zi, - • • ,s s ), then a + ^ & ^ i + s w + i > 0 . Hence the points of
Hi which He in w are those of the form (zi, • • • , za, \l/(zi, • • , afi))
with Oi, • • • , z8) in i ? and a + ^ M i + ^ O s i , • • • , 2 8 )=0. The
points satisfying these conditions form the intersection irHi,
which, being the intersection of convex sets, must be convex, as
is also its projection H' on the space Rs. For all x in E— (Si+S 2 )
the point (fi(x), • • • , ƒ,(*)» ^Gfi» ' * ' » ƒ«)) i s i n ^#1» s o t n a t
(/i0*0> * • * » ƒ«(*)) i s *n ^ ' - If w e define i£' to be the set of all
points (zi, - • • , za, 0, • • • , 0) with (zx, • • • , z8) in UT', then X"' is
convex, and for all x in E — (S1+S3) the point (fi{x), • • • ,ƒ*(#),
0, • • - , 0) is in i£'. If (zi, • • • , zn) is in i£' then (si, • • • , z8)
is in üT, and
0(01, • • • , Zn) = 4>(ZU ' ' ' , Zê, 0, ' ' ' , 0)
1937-]
JENSEN'S INEQUALITY
527
This establishes the theorem.
3. Extension to Banach Spaces. It is possible to extend Theorems 1,2,3, and 4 to functions f(x) assuming values in a Banach
space B. Suppose that (1) and (2) are satisfied, and that L is a
class of functions f(x) defined on E and assuming values in B.
We shall assume that for every linear function l(z) on B and
every f(x) in L the function l(f(x)) is in L. Further, we shall
assume that there is a linear mean M defined on L such that
for every linear function l(z) on B and every ƒ in L we have
/(AH) = M (1(f)).
We then find that Theorems 1, 2, 3, and 4 extend with only
one change. The only properties of convex sets which we used
were these : through each boundary point of a convex set there
passes a hyperplane of support, and each point which does not
belong to a convex set can be separated from it by a hyperplane.
These properties have been established for convex bodies*
(closed convex sets having interior points). Hence our theorems
extend at once, provided that we replace the words "convex set"
by "convex body."
UNIVERSITY OF VIRGINIA
* Cf. Banach, Théorie des Opérations Linéaires, p. 246.