
EC 521 MATHEMATICAL METHODS FOR
ECONOMICS
Lecture 2: Convex Sets
Murat YILMAZ
Boğaziçi University
In this section, we focus on convex sets, separating hyperplane theorems, and the Farkas Lemma. As an application, we look at a linear production model and characterize efficiency. We begin with the definition of a convex set. These lecture notes are mostly based on Chapter 3 of Advanced Mathematical Economics by R. V. Vohra.
Definition 1 A set C of vectors/points is called convex if for all x, y ∈ C and λ ∈ [0, 1],
λx + (1 − λ)y ∈ C.
[Figure: two sets in the plane, one convex and one not convex.]
Remark 1.
(1) A set in Rn is convex if whenever it contains two vectors/elements, it also contains the entire line segment connecting them.
(2) If C is convex and x1, x2, . . . , xk ∈ C, then ∑_{i=1}^k λi xi ∈ C whenever ∑_{i=1}^k λi = 1 and λi ≥ 0 ∀i.
(3) Let X, Y ⊆ Rn be two convex sets. Then:
(i) X + Y = {z ∈ Rn : z = x + y, x ∈ X, y ∈ Y} is convex.
(ii) αX = {z ∈ Rn : z = αx, x ∈ X} is convex for any α ∈ R.
(iii) X ∩ Y is convex (in fact, the intersection of any collection of convex sets is convex).
(iv) X ∪ Y might not be convex.
(4) The set C = {x : Ax = b, x ≥ 0} is convex. Why? Let x1, x2 ∈ C. Then Ax1 = b = Ax2, and for λ ∈ [0, 1], A(λx1 + (1 − λ)x2) = λAx1 + (1 − λ)Ax2 = λb + (1 − λ)b = b. Since λx1 + (1 − λ)x2 ≥ 0 as well, λx1 + (1 − λ)x2 ∈ C.
(5) If f : S → R is concave (S ⊆ Rn convex), then the set {(x, y) ∈ Rn+1 : y ≤ f(x), x ∈ S} (the hypograph of f) is convex. Why? Let (x1, y1), (x2, y2) be in this set, so y1 ≤ f(x1) and y2 ≤ f(x2). For λ ∈ [0, 1], λy1 + (1 − λ)y2 ≤ λf(x1) + (1 − λ)f(x2) ≤ f(λx1 + (1 − λ)x2), where the second inequality is the concavity of f. Thus (λx1 + (1 − λ)x2, λy1 + (1 − λ)y2) is in the set.
Note the reflection of this property in consumer theory: if u is concave, then every upper contour set {x : u(x) ≥ c} is convex (convex upper contour sets characterize the weaker property of quasiconcavity).
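The convexity facts in items (4) and (5) can be checked numerically. A minimal sketch assuming NumPy is available; the matrix A, the feasible points, and the concave function f = sqrt are illustrative choices, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Item (4): C = {x : Ax = b, x >= 0}. Convex combinations of feasible
# points remain feasible. A, x1, x2 are illustrative.
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
x1 = np.array([1.0, 2.0, 0.0])
x2 = np.array([3.0, 0.0, 2.0])
b = A @ x1
assert np.allclose(A @ x2, b)            # both points lie in C
for lam in rng.uniform(0.0, 1.0, size=5):
    z = lam * x1 + (1 - lam) * x2
    assert np.allclose(A @ z, b) and (z >= 0).all()

# Item (5): the hypograph {(x, y) : y <= f(x)} of a concave f
# (here f = sqrt on [0, 4]) is convex: convex combinations of points
# on the graph stay below the graph.
f = np.sqrt
for _ in range(5):
    u, v = rng.uniform(0.0, 4.0, size=2)
    lam = rng.uniform()
    ym = lam * f(u) + (1 - lam) * f(v)   # mixed y-coordinate
    assert ym <= f(lam * u + (1 - lam) * v) + 1e-12
print("convexity checks passed")
```

The random draws only sample the defining inequalities, of course; the proofs above are what establish them for all points.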
Separating Hyperplane Theorems
Main idea:

[Figure: a closed convex set C, a point b ∉ C, the closest point x∗ ∈ C to b, and a line L perpendicular to the segment [b, x∗] through its midpoint.]

Here x∗ is chosen to be the closest element of C to the point b, with b ≠ x∗. The line L is perpendicular to the segment [b, x∗] and lies midway between x∗ and b. For L to be our separator, we need to show that no y ∈ C lies on b's side of L. If some y ∈ C did, then every z ∈ [y, x∗] would lie in C, but some such z is closer to b than x∗ is, a contradiction.
First, we make sure such an x∗ (closest point to b) exists.
Lemma 1 Let C be a compact set not containing the origin. Then, there exists an x∗ ∈ C such
that d(x∗ , 0) = inf x∈C d(x, 0) > 0.
Proof. Follows directly from the continuity of x ↦ d(x, 0) and the Weierstrass Maximum Theorem: a continuous function attains its infimum on a compact set, and the infimum is positive since 0 ∉ C.
Definition 2 Let h ∈ Rn, h ≠ 0, and β ∈ R.
A hyperplane is the set Hh,β = {x ∈ Rn : h · x = β}.
A halfspace (below Hh,β) is the set {x ∈ Rn : h · x ≤ β}.
A halfspace (above Hh,β) is the set {x ∈ Rn : h · x ≥ β}.
[Figure: left, the hyperplane H(a,b),1 = {x ∈ R2 : ax1 + bx2 = 1}; right, the halfspace {x ∈ R2 : ax1 + bx2 ≤ 1}, both drawn in the (x1, x2)-plane.]
Theorem 1 (Strict Separating Hyperplane Theorem) Let C be a closed convex set and b ∉ C. Then there is a hyperplane Hh,β such that h · b < β < h · x ∀x ∈ C.
Proof. By a translation of the coordinates we assume that b = 0, without loss of generality.
Choose x∗ ∈ C that minimizes d(x, 0) for x ∈ C. By Lemma 1 above, such an x∗ exists and
d(x∗, 0) > 0. (Note that Lemma 1 assumes compactness, which we do not have here. Here is why the minimizer still exists: pick any y ∈ C and let C′ = C ∩ {x : d(x, 0) ≤ d(y, 0)}. C′ is closed, since both C and {x : d(x, 0) ≤ d(y, 0)} are closed; it is also bounded and non-empty (y ∈ C′). Now, it is easy to see that the point in C′ closest to 0 is also the point in C closest to 0.)
Let m be the midpoint of the line joining 0 to x∗, i.e. m = x∗/2. Choose Hh,β that goes through m and is perpendicular to the line joining 0 and x∗. That is, we choose h to be the vector x∗ scaled by d(x∗, 0): h = x∗/d(x∗, 0). Set β = h · m. Notice β = h · m = (x∗/2) · (x∗/d(x∗, 0)) = d(x∗, 0)/2.
Next, we verify that b = 0 is on one side of Hh,β and x∗ is on the other side. Observe that h · b = 0 < d(x∗, 0)/2 = h · m = β. Next, h · x∗ = x∗ · x∗/d(x∗, 0) = d(x∗, 0) > d(x∗, 0)/2 = h · m = β.
Now, pick any x ∈ C (x ≠ x∗) and any λ ∈ (0, 1]. Since C is convex, (1 − λ)x∗ + λx ∈ C. From the choice of x∗, d(x∗, 0)² ≤ d((1 − λ)x∗ + λx, 0)². Since d(z, 0)² = z · z, we have d(x∗, 0)² ≤ [(1 − λ)x∗ + λx] · [(1 − λ)x∗ + λx] = (x∗ + λ(x − x∗)) · (x∗ + λ(x − x∗)) = d(x∗, 0)² + 2λ x∗ · (x − x∗) + λ² d(x − x∗, 0)². That is, 0 ≤ 2x∗ · (x − x∗) + λ d(x − x∗, 0)². Since λ can be picked arbitrarily small, we get x∗ · (x − x∗) ≥ 0 ∀x ∈ C. Using x∗ = 2m and h = x∗/d(x∗, 0), we get 0 ≤ [d(x∗, 0) h] · (x − 2m), that is, h · x ≥ 2m · h > h · m = β. So ∀x ∈ C, h · x > β.
[Figure: the hyperplane Hh,β = {x : h · x = β} passing through m = x∗/2, separating b = 0 from the convex set C, with x∗ the point of C closest to b.]
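The construction in the proof can be reproduced numerically. A minimal sketch assuming NumPy, taking C = [1, 2]² (a box, so the closest point to b is a coordinate-wise clip) and b = 0; all names and numbers are illustrative choices, not from the text:

```python
import numpy as np

# C = [1, 2]^2 is closed and convex and does not contain b = 0.
lo, hi = 1.0, 2.0
b = np.zeros(2)

# Step 1: closest point of C to b. For a box this is a coordinate-wise clip.
x_star = np.clip(b, lo, hi)                  # = (1, 1)

# Step 2: h = x*/d(x*, 0) and beta = h . m, with m the midpoint of [b, x*].
h = x_star / np.linalg.norm(x_star)
m = x_star / 2
beta = h @ m                                  # = d(x*, 0) / 2

# Step 3: verify strict separation h.b < beta < h.x for x in C
# (checked here on random sample points of the box).
assert h @ b < beta
rng = np.random.default_rng(1)
for x in rng.uniform(lo, hi, size=(100, 2)):
    assert h @ x > beta
print("strict separation verified")
```

For general closed convex sets the projection step has no closed form; the box is chosen so the whole construction fits in a few lines.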
This theorem basically says that a hyperplane Hh,β strictly separates C from b, if C is closed
and convex. If we drop the requirement that C be closed, we obtain a weaker result.
Theorem 2 (Weak Separating Hyperplane Theorem) Let C be a convex set and b ∉ C. Then there is a hyperplane Hh,β such that h · b ≤ β ≤ h · x ∀x ∈ C.
Proof. The only difference from the proof of the previous theorem is that x∗ is chosen so that d(x∗, 0) = inf_{x∈C} d(x, 0). Since it is possible that x∗ = b (if b is on the boundary of C, the infimum need not be attained in C), the strict inequalities of the previous theorem must be replaced by weak inequalities.
Theorem 3 Let C, D ⊆ Rn be two non-empty, disjoint, convex sets. Then there exists a hyperplane Hh,β such that h · x ≥ β ≥ h · y for all x ∈ C and y ∈ D.
Proof. K = {z : z = x − y, x ∈ C, y ∈ D} is convex and 0 ∉ K. By the weak separating hyperplane theorem, ∃Hh,β such that h · 0 ≤ β ≤ h · z ∀z ∈ K. Pick any x ∈ C, y ∈ D. Then h · (x − y) = h · x − h · y ≥ 0. In particular, h · x ≥ inf_{u∈C} h · u ≥ sup_{v∈D} h · v ≥ h · y. Choose β ∈ [sup_{v∈D} h · v, inf_{u∈C} h · u] to complete the proof.
What if both C and D are also closed? Do we get the strict version of the theorem above? No; only if one of them is also bounded. (Can you construct a counterexample with two closed, unbounded sets?)
Theorem 4 Let C, D ⊆ Rn be two non-empty, disjoint, closed and convex sets with at least one
of them being bounded. Then there exists a hyperplane Hh,β such that h · x > β > h · y for all
x ∈ C and y ∈ D (where C is bounded).
Proof. Similar to the one above: show that K is closed (this is where boundedness of one of the sets is used) and apply the strict separating hyperplane theorem.
Definition 3 The set of all non-negative linear combinations of the columns of Am×n is called
the finite cone generated by the columns of Am×n and denoted by cone(A). That is, cone(A) =
{y ∈ Rm : y = Am×n x for some x ∈ Rn+ }
Lemma 2 cone(A) is convex and closed.
Proof. Convexity is easy. For closedness, first show that cone(B) is closed whenever the columns of B are linearly independent. Complete the proof as an exercise.
Theorem 5 (Farkas Lemma) Let A be an m × n matrix and b ∈ Rm. Let F = {x ∈ Rn : Ax = b, x ≥ 0}. Then, either F ≠ ∅ or ∃y ∈ Rm such that yA ≥ 0 and y · b < 0, but not both.
Proof. First, we show the 'not both' part. Suppose F ≠ ∅ and choose any x ∈ F. Then for any y with yA ≥ 0, y · b = y · Ax = (yA) · x ≥ 0, since x ≥ 0. Now, suppose F = ∅. Then b ∉ cone(A). Since cone(A) is closed and convex, we can use the strict separating hyperplane theorem to find a hyperplane Hh,β that separates b from cone(A): h · b < β < h · z ∀z ∈ cone(A). Since the origin is in cone(A), β < h · 0 = 0, so β < 0. Let aj be the jth column vector of the matrix A. We show that h · aj ≥ 0. Suppose not, i.e. h · aj < 0. Note that λaj ∈ cone(A) for any λ ≥ 0, so h · (λaj) > β. But since λ can be chosen arbitrarily large and h · aj < 0, h · (λaj) can be made smaller than β, which gives a contradiction. Thus, h · aj ≥ 0 for all columns of A. Hence y = h is our required vector, with yA ≥ 0 and y · b = h · b < β < 0.
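The Farkas alternative can be explored numerically with a linear-programming solver: either the primal system is feasible, or the solver can produce a certificate y. A sketch assuming SciPy's linprog is available; the function name farkas and the normalization y · b ≥ −1 are illustrative choices, not from the text:

```python
import numpy as np
from scipy.optimize import linprog

def farkas(A, b):
    """Return ('primal', x) with Ax = b, x >= 0, or ('certificate', y)
    with yA >= 0 and y.b < 0 (exactly one case occurs, by Farkas)."""
    m, n = A.shape
    # Try the primal system F = {x : Ax = b, x >= 0} (pure feasibility LP).
    res = linprog(np.zeros(n), A_eq=A, b_eq=b,
                  bounds=[(0, None)] * n, method="highs")
    if res.status == 0:
        return "primal", res.x
    # Otherwise minimize b.y subject to A^T y >= 0; the extra constraint
    # b.y >= -1 is a normalization that keeps this LP bounded.
    res = linprog(b, A_ub=np.vstack([-A.T, -b.reshape(1, -1)]),
                  b_ub=np.concatenate([np.zeros(n), [1.0]]),
                  bounds=[(None, None)] * m, method="highs")
    return "certificate", res.x

kind, v = farkas(np.eye(2), np.array([-1.0, 0.0]))
print(kind, v)   # a y with yA >= 0 and y.b < 0, since no x >= 0 solves Ix = (-1, 0)
```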
Polyhedrons and Polytopes
Definition 4 Let S ⊆ Rn. A vector v ∈ Rn can be expressed as a convex combination of vectors in S if there is a finite set {v1, . . . , vm} ⊆ S such that v = ∑_{j=1}^m λj vj with ∑_{j=1}^m λj = 1 and λj ≥ 0 ∀j.
Definition 5 Let S ⊆ Rn . The convex hull of S, conv(S), is the set of all vectors that can
be expressed as a convex combination of vectors in S. (Alternatively: conv(S) is the smallest
convex set containing S, or conv(S) is the intersection of all convex sets that contain S.)
Definition 6 A set P ⊆ Rn is called a polytope if there is a finite S ⊆ Rn such that P =
conv(S).
Definition 7 A non-empty set P ⊆ Rn is called a polyhedron if there is an m × n matrix A
and a vector b ∈ Rm such that P = {x ∈ Rn : Ax ≤ b}.
Theorem 6 The set of all convex combinations of a finite number of vectors is a polyhedron.
Thus, a polytope is a polyhedron. A polyhedron is a polytope if it is also bounded.
Definition 8 Let S ⊆ Rn be convex. An extreme point of S is a point x ∈ S that cannot be expressed as a convex combination λy + (1 − λ)z with y, z ∈ S \ {x} and λ ∈ (0, 1).
Theorem 7 If P is a polytope, then each x ∈ P can be written as a convex combination of its
extreme points.
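Definition 5 and Theorem 7 suggest a computational test: v ∈ conv(S) iff the system ∑j λj vj = v, ∑j λj = 1, λ ≥ 0 is feasible, which is a linear-programming feasibility problem. A sketch assuming SciPy; the helper name in_conv_hull is illustrative:

```python
import numpy as np
from scipy.optimize import linprog

def in_conv_hull(points, v):
    """LP feasibility: is v a convex combination of the given points?"""
    P = np.asarray(points, dtype=float)          # one generator per row
    m = P.shape[0]
    # Constraints: sum_j lam_j * point_j = v and sum_j lam_j = 1, lam >= 0.
    A_eq = np.vstack([P.T, np.ones((1, m))])
    b_eq = np.concatenate([np.asarray(v, dtype=float), [1.0]])
    res = linprog(np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m, method="highs")
    return res.status == 0

square = [(0, 0), (1, 0), (0, 1), (1, 1)]        # extreme points of the unit square
print(in_conv_hull(square, (0.5, 0.5)))          # inside the square
print(in_conv_hull(square, (1.5, 0.5)))          # outside the square
```

This is exactly the representation promised by Theorem 7: for a polytope, membership reduces to finding the weights λj on the extreme points.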
Application: Linear Production Model
Let x ∈ Rm be a non-negative input vector and y ∈ Rn a non-negative output vector. Let P be an m × n production matrix that relates inputs to outputs as follows: y1×n = x1×m Pm×n. Here pij is the amount of the jth output generated from one unit of the ith input. Let b ∈ Rk be a non-negative resource/capacity vector that lists the amounts of raw materials available for production. Let Cm×k be an m × k non-negative consumption matrix that relates inputs to resources: x1×m Cm×k ≤ b1×k. Here cij is the amount of resource j consumed to produce one unit of input i. The input space is X = {x ∈ Rm : x · C ≤ b, x ≥ 0}. The output space is Y = {y ∈ Rn : y = x · P, x ∈ X, y ≥ 0}. An output vector y∗ ∈ Y is efficient if there is no other y ∈ Y such that y ≥ y∗.
Theorem 8 A vector y ∗ ∈ Y is efficient iff there exists a non-negative, non-trivial price vector
p such that y ∗ · p ≥ y · p for all y ∈ Y .
Proof. (⇐): This is almost trivial. If y∗ · p ≥ y · p ∀y ∈ Y for some such price vector p, then there is no other y ∈ Y with y ≥ y∗. Thus, y∗ is efficient.
(⇒): Suppose that y ∗ is efficient. First we prove the following claim:
Claim 1 There exists a matrix D with n rows and a vector r such that Y = {y ∈ Rn : y · D ≤ r}
Proof. Let x1, x2, . . . , xk be the extreme points of X. Pick any y ∈ Y. Then there is an x ∈ X such that y = x · P. Since X is a polytope (X is a bounded polyhedron, and every bounded polyhedron is a polytope), any element of X can be expressed as a convex combination of its extreme points. Thus, ∃{λj}_{j=1}^k such that x = λ1 x1 + λ2 x2 + . . . + λk xk, and hence y = λ1 x1 P + λ2 x2 P + . . . + λk xk P. This means each y ∈ Y can be written as a convex combination of {x1 P, x2 P, . . . , xk P}. It is straightforward to see that any convex combination of these vectors is also in Y. Hence, Y is the set of convex combinations of a finite number of points, i.e., Y = conv({x1 P, . . . , xk P}). Thus Y is a polytope and hence a polyhedron; that is, there exist a matrix D with n rows and a vector r such that Y = {y ∈ Rn : y · D ≤ r}, as claimed.

So now we know Y = {y ∈ Rn : y · D ≤ r} for some D and r. Let S = {j : y∗ · dj = rj}, where dj is the jth column of D. We show S ≠ ∅. Suppose not. Then y∗ · dj < rj for all j. Let w be the vector obtained from y∗ by adding some ε > 0 to the first component of y∗. Then w · dj = y∗ · dj + ε d1j. The assumption S = ∅ allows us to choose ε sufficiently small so that y∗ · dj + ε d1j ≤ rj for all j. Thus w ∈ Y and w ≥ y∗, contradicting the efficiency of y∗. Hence, S ≠ ∅.
Consider now the system {z · dj ≤ 0}j∈S. We claim that it has no non-trivial non-negative solution z ∈ Rn. If there were one, there would be an ε > 0 sufficiently small such that (y∗ + εz) · dj ≤ rj ∀j, implying y∗ + εz ∈ Y and contradicting the efficiency of y∗. Since the system {z · dj ≤ 0}j∈S does not admit a non-trivial non-negative solution, we have, by a version of the Farkas Lemma, non-negative numbers {λj}j∈S such that ∑_{j∈S} λj dj > 0. Setting p = ∑_{j∈S} λj dj completes the proof of Theorem 8, since y∗ · p ≥ y · p ∀y ∈ Y. Indeed, y · ∑_{j∈S} λj dj = ∑_{j∈S} λj y · dj ≤ ∑_{j∈S} λj rj = ∑_{j∈S} λj y∗ · dj = y∗ · ∑_{j∈S} λj dj.
See Propositions 5.F.1 and 5.F.2 in Mas-Colell et al., Microeconomic Theory, pages 150–151, for a similar result. Theorem 8 is a simpler version of the first and second welfare theorems.
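The easy direction of Theorem 8 also gives a recipe for computing efficient outputs: fix a strictly positive price vector p and maximize p · y = p · (xP) over X = {x : xC ≤ b, x ≥ 0}, a linear program. A numerical sketch assuming SciPy, with illustrative P, C, b, p (not from the text):

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative data: 2 inputs, 2 outputs, 2 resources.
P = np.array([[2.0, 0.0],     # p_ij: units of output j per unit of input i
              [1.0, 3.0]])
C = np.array([[1.0, 2.0],     # c_ij: units of resource j per unit of input i
              [2.0, 1.0]])
b = np.array([10.0, 10.0])    # resource endowment
p = np.array([1.0, 2.0])      # a strictly positive price vector

# maximize p.(xP) = (P p).x subject to xC <= b, x >= 0
# (linprog minimizes, so negate the objective; the row system xC <= b
# becomes C^T x <= b for a column vector x).
res = linprog(-(P @ p), A_ub=C.T, b_ub=b,
              bounds=[(0, None)] * 2, method="highs")
x_opt = res.x
y_opt = x_opt @ P
print("profit-maximizing input:", x_opt)   # -> (0, 5) for this data
print("efficient output:", y_opt)          # -> (5, 15)
```

By the (⇐) direction of Theorem 8, the maximizer y_opt is efficient; varying p over strictly positive vectors traces out different efficient points of Y.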