Solutions

Analysis & Optimization
Problem Set 4
June 19
1. (a) With f as given, we have
\[
\nabla f(a, b) = \begin{pmatrix} \displaystyle\sum_{i=1}^{n} -2x_i(y_i - ax_i - b) \\[1.5ex] \displaystyle\sum_{i=1}^{n} -2(y_i - ax_i - b) \end{pmatrix} = \begin{pmatrix} \displaystyle 2a\sum_{i=1}^{n} x_i^2 + 2b\sum_{i=1}^{n} x_i - 2\sum_{i=1}^{n} x_i y_i \\[1.5ex] \displaystyle 2a\sum_{i=1}^{n} x_i + 2nb - 2\sum_{i=1}^{n} y_i \end{pmatrix}.
\]
Setting this equal to zero, we have a system of two linear equations in a and b, which we can solve
however we like to obtain that \nabla f(a^*, b^*) = 0 if and only if
\[
a^* = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - (\sum x_i)^2}, \qquad
b^* = \frac{\sum x_i^2 \sum y_i - \sum x_i \sum x_i y_i}{n\sum x_i^2 - (\sum x_i)^2}
\]
when n\sum x_i^2 \neq (\sum x_i)^2; otherwise there are infinitely many solutions.¹ By the Cauchy-Schwarz
inequality,
\[
\sum_{i=1}^{n} x_i = (1, \dots, 1) \cdot (x_1, \dots, x_n) \le \|(1, \dots, 1)\| \, \|(x_1, \dots, x_n)\| = \sqrt{n}\sqrt{\sum_{i=1}^{n} x_i^2},
\]
with equality if and only if (x1 , . . . , xn ) is a multiple of (1, . . . , 1), i.e. when x1 = · · · = xn . This
corresponds to data where (x1 , y1 ), . . . , (xn , yn ) all lie on a vertical line in the xy-plane; in this
context it is clear that there cannot be a unique minimum point (a, b): if x_1 = \cdots = x_n = c, then f(a, b) = \sum_{i=1}^{n}(y_i - ac - b)^2 depends on (a, b) only through the combination ac + b, so every (a, b) with ac + b equal to its optimal value is a minimizer.
Since f is a sum of convex functions, it is convex, and so any critical point is both a local and
global minimum.
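The closed-form solution can be sanity-checked numerically; here is a minimal sketch in Python (the data values below are made up for illustration, not from the problem):

```python
# Numerical sanity check of the closed-form least-squares solution.
# The data points are arbitrary illustration values.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]
n = len(xs)

Sx = sum(xs)
Sy = sum(ys)
Sxx = sum(x * x for x in xs)
Sxy = sum(x * y for x, y in zip(xs, ys))

det = n * Sxx - Sx ** 2          # determinant of the normal-equations matrix
a_star = (n * Sxy - Sx * Sy) / det
b_star = (Sxx * Sy - Sx * Sxy) / det

# Both components of the gradient should vanish at (a*, b*).
grad_a = sum(-2 * x * (y - a_star * x - b_star) for x, y in zip(xs, ys))
grad_b = sum(-2 * (y - a_star * x - b_star) for x, y in zip(xs, ys))
assert abs(grad_a) < 1e-9 and abs(grad_b) < 1e-9
```

If the formulas are right, both gradient components vanish at (a^*, b^*), as the assertions confirm.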
(b) Substituting into the formulas obtained in part (a), we find a∗ = 0 and b∗ = 1.
One possible conclusion is that the quantity y is actually constant (we only observe fluctuations, due to measurement imprecision, which are independent of x). If you’re confident enough
in the precision of your experiment, another possible conclusion is that y is actually not well
approximated by a linear function of x.
¹This is by the computation that \nabla f(a, b) = 0 can be rewritten as the matrix equation
\[
\begin{pmatrix} \displaystyle\sum_{i=1}^{n} x_i^2 & \displaystyle\sum_{i=1}^{n} x_i \\[1.5ex] \displaystyle\sum_{i=1}^{n} x_i & n \end{pmatrix}
\begin{pmatrix} a \\ b \end{pmatrix} =
\begin{pmatrix} \displaystyle\sum_{i=1}^{n} x_i y_i \\[1.5ex] \displaystyle\sum_{i=1}^{n} y_i \end{pmatrix},
\]
with the determinant of the square matrix given by n\sum x_i^2 - (\sum x_i)^2.
2. Since z \mapsto e^z is a strictly increasing function, a maximum of f is precisely the same as a maximum of x^2 + 2xy + 2y^2 = (x + y)^2 + y^2. Since (1 + 1)^2 \ge (x + y)^2 and 1^2 \ge y^2 for any (x, y) in the given square, f(x, y) attains its maximum at (1, 1), where it has the value e^5. It also attains this value at (-1, -1). Moreover, if f(x, y) = e^5, then (x + y)^2 + y^2 = 2^2 + 1^2; since (x + y)^2 \le 4 and y^2 \le 1 we must have (x + y)^2 = 4 and y^2 = 1, implying (x, y) = (1, 1) or (x, y) = (-1, -1). So the maximum of f on the given square is attained only at (1, 1) and (-1, -1).
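As a quick corroboration, one can evaluate f on a fine grid over the square (a sketch; the resolution 200 is an arbitrary choice):

```python
import math

# Evaluate f(x, y) = exp(x^2 + 2xy + 2y^2) on a grid over [-1, 1]^2
# and confirm the maximum value e^5 occurs only at (1, 1) and (-1, -1).
N = 200
best_val, best_pts = -math.inf, []
for i in range(N + 1):
    for j in range(N + 1):
        x = -1 + 2 * i / N
        y = -1 + 2 * j / N
        v = math.exp(x * x + 2 * x * y + 2 * y * y)
        if v > best_val + 1e-12:
            best_val, best_pts = v, [(x, y)]
        elif abs(v - best_val) <= 1e-12:
            best_pts.append((x, y))

assert abs(best_val - math.exp(5)) < 1e-9
assert sorted(best_pts) == [(-1.0, -1.0), (1.0, 1.0)]
```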
3. Define A := a1 + · · · + an .
(a) It is clear that the region {x ∈ Rn : p · x ≤ w, x1 ≥ 0, . . . , xn ≥ 0} is closed. It can be seen to be
bounded by viewing it as the region bounded by the hyperplane normal to p and the coordinate
axes; here we are using the fact that each pi is strictly positive in an essential way; otherwise the
plane {x ∈ Rn : p · x = w} could be parallel to one of the coordinate axes, and the region would
be unbounded.2 Since the region is closed and bounded, any continuous function defined on it (in
this case u) must have a global maximum.
Since \nabla u(x) \neq 0 for all x ∈ R^n such that x_1, \dots, x_n > 0 (this can be seen from the expression
given for ∇u in part (b)), u has no critical points, and so its maximum in the given region must be
attained on the boundary. If xi = 0 for any i, then u(x) = 0; since it is clear that u is positive on
the interior of the region, its maximum cannot be obtained on any of the coordinate hyperplanes.
So it must be attained on the boundary portion given by {x ∈ Rn : p · x = w, x1 , . . . , xn > 0}.
(b) We have the Lagrangian L(x) = u(x) - \lambda p \cdot x, and compute
\[
\nabla L(x) = \begin{pmatrix} a_1 x_1^{a_1 - 1} x_2^{a_2} \cdots x_n^{a_n} - \lambda p_1 \\ \vdots \\ a_n x_1^{a_1} \cdots x_{n-1}^{a_{n-1}} x_n^{a_n - 1} - \lambda p_n \end{pmatrix} = \begin{pmatrix} \frac{a_1}{x_1} u(x) - \lambda p_1 \\ \vdots \\ \frac{a_n}{x_n} u(x) - \lambda p_n \end{pmatrix}.
\]
Then if \nabla L(x) = 0, then p_i x_i = \frac{a_i}{\lambda} u(x), and so w = p \cdot x = \frac{A}{\lambda} u(x), i.e. \frac{1}{\lambda} u(x) = \frac{w}{A}. Since \frac{a_i}{x_i} u(x) = \lambda p_i, we then have
\[
x_i = \frac{a_i}{p_i \lambda} u(x) = \frac{a_i w}{p_i A}.
\]
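The resulting formula x_i = a_i w / (p_i A) can be spot-checked against random feasible points on the budget hyperplane (a sketch with made-up values for a, p, w):

```python
import random

# Cobb-Douglas utility u(x) = prod x_i^{a_i}; the candidate maximizer on
# {p . x = w, x_i > 0} is x_i = a_i * w / (p_i * A) with A = sum(a).
# The parameters below are illustrative, not from the problem statement.
a = [0.5, 1.0, 1.5]
p = [2.0, 1.0, 4.0]
w = 12.0
A = sum(a)

def u(x):
    out = 1.0
    for xi, ai in zip(x, a):
        out *= xi ** ai
    return out

x_star = [ai * w / (pi * A) for ai, pi in zip(a, p)]
assert abs(sum(pi * xi for pi, xi in zip(p, x_star)) - w) < 1e-9  # feasible

random.seed(0)
for _ in range(1000):
    # Random positive point rescaled onto the budget hyperplane p . x = w.
    y = [random.uniform(0.1, 5.0) for _ in a]
    scale = w / sum(pi * yi for pi, yi in zip(p, y))
    y = [scale * yi for yi in y]
    assert u(y) <= u(x_star) + 1e-9
```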
(c) Let ξ be fixed; since f(x) \le f(\xi) for all x if and only if f(x)^a \le f(\xi)^a for all x, we see that ξ is a maximum of f(x) if and only if it is a maximum of f(x)^a. Since f(x) \le f(\xi) if and only if f(c \cdot \frac{x}{c}) \le f(c \cdot \frac{\xi}{c}) for all x in the domain of f, we likewise have that ξ is a maximum of f(x) if and only if \frac{\xi}{c} is a maximum of f(cx).³
(d) By Jensen’s inequality,
\[
\ln(a_1 x_1 + \cdots + a_n x_n) \ge a_1 \ln x_1 + \cdots + a_n \ln x_n
\]
for any x_1, \dots, x_n > 0 and a_1, \dots, a_n ∈ [0, 1] such that a_1 + \cdots + a_n = 1. Then
\[
a_1 x_1 + \cdots + a_n x_n = e^{\ln(a_1 x_1 + \cdots + a_n x_n)} \ge e^{a_1 \ln x_1 + \cdots + a_n \ln x_n} = e^{a_1 \ln x_1} \cdots e^{a_n \ln x_n} = x_1^{a_1} \cdots x_n^{a_n}.
\]
Since x \mapsto \ln x is a strictly concave function, equality holds here only when x_1 = \cdots = x_n. When a_1 = \cdots = a_n = \frac{1}{n}, this says
\[
\frac{x_1 + \cdots + x_n}{n} \ge (x_1 \cdots x_n)^{1/n}.
\]
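The weighted AM-GM inequality just derived can be spot-checked on random inputs:

```python
import random

# Weighted AM-GM: a1*x1 + ... + an*xn >= x1^a1 * ... * xn^an
# whenever the weights a_i are nonnegative and sum to 1.
random.seed(1)
for _ in range(1000):
    n = random.randint(2, 6)
    raw = [random.uniform(0.01, 1.0) for _ in range(n)]
    s = sum(raw)
    a = [r / s for r in raw]                  # weights summing to 1
    x = [random.uniform(0.01, 10.0) for _ in range(n)]
    arith = sum(ai * xi for ai, xi in zip(a, x))
    geom = 1.0
    for ai, xi in zip(a, x):
        geom *= xi ** ai
    assert geom <= arith + 1e-12

# Equality when all the x_i coincide:
assert abs((0.3 * 2 + 0.7 * 2) - 2 ** 0.3 * 2 ** 0.7) < 1e-12
```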
²(This footnote is complete fluff; you can ignore it if you want.) Those of you who are taking or have had Intro to Modern Analysis may find the following rigorous argument for the region’s boundedness amusing: suppose it is not, so that we can find a sequence \{x_k\}_{k \in \mathbb{N}} in the region such that |x_k| = k. Then \frac{1}{k} x_k \in S^{n-1}; since S^{n-1} is compact there is a subsequence of \{\frac{1}{k} x_k\}_{k \in \mathbb{N}} which converges to some x^0 \in S^{n-1}. (For simplicity we denote the subsequence still by \{\frac{1}{k} x_k\}.) Since \{x \in \mathbb{R}^n : x_1, \dots, x_n \ge 0\} is closed, we have x^0_1, \dots, x^0_n \ge 0; furthermore since x^0 \in S^{n-1} there must be some index i for which x^0_i > 0. Since p \cdot x_k \le w, we have p \cdot \frac{1}{k} x_k \le \frac{w}{k}, and so sending k \to \infty, we have p \cdot x^0 \le 0, but p \cdot x^0 = p_1 x^0_1 + \cdots + p_n x^0_n \ge p_i x^0_i > 0, a contradiction.
³Note here that \{\frac{x}{c} : x \text{ is in the domain of } f\} is the domain of f(cx).
(e) If a_i = p_i and a_1 + \cdots + a_n = 1, then by part (d) we have u(x) \le p \cdot x \le w for any x subject to the constraints. Since u(w, \dots, w) = w, we have that (w, \dots, w) is a global maximum. Now let (a_1, \dots, a_n) and (p_1, \dots, p_n) be proportional. Define P := (p_1 + \cdots + p_n)^{-1}. By part (c), we know that a global maximum of u(x) is the same as a global maximum of u(x)^{1/A} = x_1^{a_1/A} \cdots x_n^{a_n/A}. Furthermore \{p \cdot x = w\} = \{P p \cdot x = P w\}. Since P p = \frac{1}{A} a and \frac{a_1}{A} + \cdots + \frac{a_n}{A} = 1, we know from the simplified case above that (P w, \dots, P w) is a global maximum.
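A quick numerical check of the simplified case a_i = p_i, \sum a_i = 1 (the specific weights below are illustrative):

```python
import random

# When a_i = p_i and sum a_i = 1, part (d) gives u(x) <= p.x = w on the
# budget hyperplane, with the maximum value w attained at (w, ..., w).
a = p = [0.2, 0.3, 0.5]
w = 7.0

def u(x):
    out = 1.0
    for xi, ai in zip(x, a):
        out *= xi ** ai
    return out

assert abs(u([w] * 3) - w) < 1e-9   # u(w, ..., w) = w^(a1+...+an) = w

random.seed(2)
for _ in range(1000):
    y = [random.uniform(0.1, 5.0) for _ in range(3)]
    scale = w / sum(pi * yi for pi, yi in zip(p, y))
    y = [scale * yi for yi in y]    # now p . y = w
    assert u(y) <= w + 1e-9
```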
(f) Let (a_1, \dots, a_n) and (p_1, \dots, p_n) be arbitrary. Define \hat{x}_i := \frac{p_i}{a_i} x_i, so that
\[
\{x \in \mathbb{R}^n : p \cdot x = w\} = \{x \in \mathbb{R}^n : a \cdot \hat{x} = w\}.
\]
Since u(x) is a positive constant multiple of \hat{x}_1^{a_1} \cdots \hat{x}_n^{a_n}, part (e) (applied with prices proportional to the exponents) shows that the maximum of u on this set is attained at \hat{x} = (\frac{w}{A}, \dots, \frac{w}{A}). This corresponds to x = (\frac{a_1 w}{p_1 A}, \dots, \frac{a_n w}{p_n A}).
(g) We have
\[
x_i^* = \frac{a_i w}{p_i A}.
\]
Then
\[
\frac{\partial x_i^*}{\partial p_i} = -\frac{a_i w}{p_i^2 A} < 0
\]
and
\[
u(x^*) = \left(\frac{a_1}{p_1}\right)^{a_1} \cdots \left(\frac{a_n}{p_n}\right)^{a_n} \left(\frac{w}{A}\right)^A
\quad\text{and}\quad
\frac{\partial u(x^*)}{\partial w} = \left(\frac{a_1}{p_1}\right)^{a_1} \cdots \left(\frac{a_n}{p_n}\right)^{a_n} \left(\frac{w}{A}\right)^{A-1} > 0.
\]
This says that as the price of a good increases, the optimal quantity of that good to buy to maximize happiness decreases, while as personal wealth increases, the optimal happiness also increases.
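The signs of these partial derivatives can be confirmed by finite differences (parameters made up for illustration):

```python
# Finite-difference check of the comparative statics in part (g),
# using made-up parameters a, p, w.
a = [0.5, 1.0, 1.5]
p = [2.0, 1.0, 4.0]
w = 12.0
A = sum(a)

def x_star(p, w, i):
    # Optimal quantity of good i at prices p and wealth w.
    return a[i] * w / (p[i] * A)

def u_star(p, w):
    # Utility at the optimal bundle.
    out = 1.0
    for i in range(len(a)):
        out *= x_star(p, w, i) ** a[i]
    return out

h = 1e-6
# Raising the price of good 0 lowers the optimal quantity of good 0.
p_up = [p[0] + h] + p[1:]
assert x_star(p_up, w, 0) < x_star(p, w, 0)
# Raising wealth raises the optimal utility.
assert u_star(p, w + h) > u_star(p, w)
```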
4. (a) Since \nabla g(x, y, z) = (f'(x), f'(y), f'(z)), (x, y, z) can be a minimum of g subject to the given constraint only when there is \lambda \in \mathbb{R} such that
\[
f'(x) = \lambda, \qquad f'(y) = \lambda, \qquad f'(z) = \lambda, \qquad x + y + z = 1.
\]
Since f is strictly convex, f' is strictly increasing, and so f'(x) = f'(y) = f'(z) implies x = y = z. Substituting into x + y + z = 1, we find that (x, y, z) = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3}). Since g is a sum of strictly convex functions, it is itself strictly convex. So (\frac{1}{3}, \frac{1}{3}, \frac{1}{3}) is both a local and global minimum.
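For a concrete strictly convex choice of f, say f(t) = t^2 (our choice for illustration, not from the problem statement), this can be verified numerically:

```python
import random

# With f(t) = t^2 (an illustrative strictly convex choice),
# g(x, y, z) = x^2 + y^2 + z^2 should be minimized on the plane
# x + y + z = 1 at (1/3, 1/3, 1/3), with minimum value 1/3.
def g(x, y, z):
    return x * x + y * y + z * z

g_min = g(1/3, 1/3, 1/3)
random.seed(3)
for _ in range(1000):
    x = random.uniform(-5, 5)
    y = random.uniform(-5, 5)
    z = 1 - x - y                 # stay on the constraint plane
    assert g(x, y, z) >= g_min - 1e-12
```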
(b) By Jensen’s inequality,
\[
f\!\left(\tfrac{1}{3}\right) = f\!\left(\frac{x + y + z}{3}\right) \le \frac{f(x) + f(y) + f(z)}{3} = \frac{g(x, y, z)}{3}
\]
for any x, y, z such that x + y + z = 1. That is, g(x, y, z) \ge 3f(\frac{1}{3}) for all x, y, z such that x + y + z = 1. Since g(\frac{1}{3}, \frac{1}{3}, \frac{1}{3}) = 3f(\frac{1}{3}), we know that (\frac{1}{3}, \frac{1}{3}, \frac{1}{3}) is a global minimum.
(c) By Jensen’s inequality,
\[
\ln\frac{2}{n} = \ln\frac{x_1 + \cdots + x_n}{n} \ge \frac{\ln x_1 + \cdots + \ln x_n}{n}
\]
for all x_1, \dots, x_n > 0 such that x_1 + \cdots + x_n = 2. Since
\[
\frac{\ln\frac{2}{n} + \cdots + \ln\frac{2}{n}}{n} = \ln\frac{2}{n},
\]
we know that (\frac{2}{n}, \dots, \frac{2}{n}) is a global maximum of \frac{1}{n}(\ln x_1 + \cdots + \ln x_n), and equivalently of \ln x_1 + \cdots + \ln x_n. We could have also done this via Lagrange multipliers.
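This too can be spot-checked numerically by sampling random points on the constraint set:

```python
import math
import random

# Maximize ln x1 + ... + ln xn subject to x1 + ... + xn = 2, x_i > 0:
# the claimed maximizer is x_i = 2/n for each i.
random.seed(4)
for n in (2, 3, 5):
    best = n * math.log(2 / n)
    for _ in range(1000):
        raw = [random.uniform(0.01, 1.0) for _ in range(n)]
        s = sum(raw)
        x = [2 * r / s for r in raw]          # rescale onto the simplex
        assert sum(math.log(xi) for xi in x) <= best + 1e-9
```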