Convex sets

Convex functions
Lecture 3
Dr. Zvi Lotker
Convex set: an extension of the interval

A set S in a vector space over $\mathbb{R}$ is called a convex set if the line segment joining any pair of points of S lies entirely in S.
[Figure: a convex set and a non-convex set]
Examples
$\mathbb{R}^d$
Affine set
Half-space $H = \{x : a_1 x_1 + a_2 x_2 + \dots + a_d x_d \le b\}$
Unit ball $B = \{x : x_1^2 + x_2^2 + \dots + x_d^2 \le 1\}$
The Cartesian product of two convex sets

[Figure: hypercube and d-simplex]
Hyperplane

A hyperplane is a generalization of the
plane in d-dimensional space.


It divides the space into two half-spaces.
A hyperplane is an affine subspace of codimension 1.
[Figure: a hyperplane in 3-dimensional space; axes X, Y, Z]
Hyperplane: definition

An affine hyperplane in n-dimensional
space can be described by a nondegenerate linear equation of the
following form:


$a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b.$
Non-degenerate means that not all the $a_i$ are zero.

In short we can write

$H = \{x : \langle a, x \rangle = b\}$
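As a small illustration of the equation $\langle a, x \rangle = b$, the sketch below reports on which side of a hyperplane a point lies. The function name `side_of_hyperplane` and the sample hyperplane $x_1 + x_2 = 1$ are assumptions made for this example only.

```python
def side_of_hyperplane(a, b, x):
    """Return -1, 0 or +1 according to the sign of <a, x> - b."""
    if not any(a):
        # Non-degeneracy: not all a_i may be zero.
        raise ValueError("degenerate equation: all a_i are zero")
    value = sum(ai * xi for ai, xi in zip(a, x)) - b
    return (value > 0) - (value < 0)

# Hypothetical example: the hyperplane x1 + x2 = 1 in the plane.
a, b = [1, 1], 1
on_plane = side_of_hyperplane(a, b, [0.5, 0.5])  # lies on H
below = side_of_hyperplane(a, b, [0, 0])         # half-space <a, x> <= b
above = side_of_hyperplane(a, b, [2, 2])         # half-space <a, x> >= b
```

The two non-zero return values correspond to the two half-spaces into which the hyperplane divides the space.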
Ellipsoid


$\mathcal{E} = \{x : (x-o)^T P^{-1} (x-o) \le 1\}$
where $P \succ 0$, i.e. P is symmetric and positive definite (all the eigenvalues are > 0).
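A minimal sketch of a membership test for an ellipsoid $\{x : (x-o)^T P^{-1} (x-o) \le 1\}$ in the plane. It is restricted to 2x2 matrices so that the inverse can be written out by hand; the function name `in_ellipsoid_2d` and the sample matrix are assumptions for illustration.

```python
def in_ellipsoid_2d(P, o, x):
    """True if (x - o)^T P^{-1} (x - o) <= 1 for a symmetric positive definite 2x2 P."""
    (p11, p12), (p21, p22) = P
    det = p11 * p22 - p12 * p21
    # A symmetric 2x2 matrix is positive definite iff p11 > 0 and det > 0.
    if p12 != p21 or p11 <= 0 or det <= 0:
        raise ValueError("P must be symmetric positive definite")
    inv = [[p22 / det, -p12 / det], [-p21 / det, p11 / det]]
    d0, d1 = x[0] - o[0], x[1] - o[1]
    q = d0 * (inv[0][0] * d0 + inv[0][1] * d1) + d1 * (inv[1][0] * d0 + inv[1][1] * d1)
    return q <= 1

# Hypothetical example: semi-axes 2 and 1, centred at the origin.
P = [[4, 0], [0, 1]]
on_boundary = in_ellipsoid_2d(P, (0, 0), (2, 0))
interior = in_ellipsoid_2d(P, (0, 0), (1, 0.5))
exterior = in_ellipsoid_2d(P, (0, 0), (0, 1.5))
```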
Outline of the lecture







Polytope
Operations that preserve convexity
Carathéodory's theorem
Radon's theorem
Helly's theorem
Separating hyperplane theorem
Generalized inequalities
Polytope

A convex polytope may be defined as the convex hull of a finite set of points (which is always bounded), or as a bounded intersection of a finite set of half-spaces.
Example Polytope

Polygon (2-polytope)

Polyhedron (3-polytope)
Explicitly, a d-dimensional polytope may be specified as the set of solutions to a system of linear inequalities $Ax \le b$.
Polyhedra: the bounded or unbounded generalization of a polytope to any dimension.
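The description by a system of linear inequalities $Ax \le b$ can be checked directly. The sketch below is only an illustration: the function name `in_polyhedron`, the tolerance and the unit-square example are assumptions, not part of the lecture.

```python
def in_polyhedron(A, b, x, tol=1e-12):
    """True if every inequality <A[i], x> <= b[i] holds (up to tol)."""
    return all(
        sum(aij * xj for aij, xj in zip(row, x)) <= bi + tol
        for row, bi in zip(A, b)
    )

# Hypothetical example: the unit square [0,1]^2 as an intersection of four half-spaces:
#   x1 <= 1, -x1 <= 0, x2 <= 1, -x2 <= 0
A = [[1, 0], [-1, 0], [0, 1], [0, -1]]
b = [1, 0, 1, 0]
inside = in_polyhedron(A, b, [0.5, 0.5])
outside = in_polyhedron(A, b, [1.5, 0.5])
```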
d-Simplex

An n-simplex is the n-dimensional analogue of a triangle.

A simplex is the convex hull of a set of (n + 1) affinely independent points in some Euclidean space.
Facets


An n-dimensional convex polytope is
bounded by a number of (n-1)-dimensional
facets.
These bounding sub-polytopes are referred
to as faces:


A 0-dimensional face is called a vertex.
A 1-dimensional face is called an edge.
Positive semidefinite cone


A positive-definite matrix is a symmetric matrix which in many ways is analogous to a positive real number.
The following are equivalent for a symmetric matrix M:

M is a positive-definite matrix.
For all non-zero vectors $z \in \mathbb{R}^d$, $z^T M z > 0$.
All eigenvalues $\lambda_i$ of M are positive.
$\langle x, y \rangle_M = x^T M y$ is an inner product on $\mathbb{R}^d$.
Semidefinite cone: example $S_+^2$
$\begin{pmatrix} x & y \\ y & z \end{pmatrix} \in S_+^2 \iff x \ge 0,\ z \ge 0,\ xz \ge y^2$
[Figure: boundary of the cone $S_+^2$; axes X, Y, Z, each from 0 to 10]
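For a symmetric 2x2 matrix with diagonal entries x, z and off-diagonal entry y, positive semidefiniteness is equivalent to $x \ge 0$, $z \ge 0$, $xz \ge y^2$. The sketch below checks that criterion and, for a matrix that fails it, evaluates the quadratic form on a vector that shows the failure. The function names are assumptions for this example.

```python
def is_psd_2x2(x, y, z):
    """Test [[x, y], [y, z]] via the criterion x >= 0, z >= 0, x*z >= y^2."""
    return x >= 0 and z >= 0 and x * z >= y * y

def quadratic_form(x, y, z, v):
    """v^T M v for M = [[x, y], [y, z]] and v = (v1, v2)."""
    v1, v2 = v
    return x * v1 * v1 + 2 * y * v1 * v2 + z * v2 * v2

psd = is_psd_2x2(2, 1, 1)                   # 2*1 >= 1^2
not_psd = is_psd_2x2(1, 2, 1)               # 1*1 < 2^2
witness = quadratic_form(1, 2, 1, (1, -1))  # 1 - 4 + 1 = -2 < 0
```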
Operations that preserve
convexity
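Lemma:
If A, B are convex, then $A \cap B$ is convex.
Proof
Assume $x, y \in A \cap B$ and $\lambda \in [0,1]$.
It follows that $x, y \in A$ and $x, y \in B$.
Since A and B are convex, it follows that $\lambda x + (1-\lambda) y \in A$ and $\lambda x + (1-\lambda) y \in B$.
Hence $\lambda x + (1-\lambda) y \in A \cap B$.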
Operations that preserve
convexity
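Affine functions
$F: \mathbb{R}^n \to \mathbb{R}^m$, $F[x] = Ax + b$, $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$
If S is a convex set, then F[S] is convex.
Proof
Assume $x, y \in F[S]$, $\lambda \in [0,1]$.
There exist $u, v \in S$ s.t. $F[u] = x$, $F[v] = y$.
Since S is convex, it follows that $\lambda u + (1-\lambda) v \in S$.
Since F is affine and the weights sum to 1, it follows that
$F[\lambda u + (1-\lambda) v] = \lambda x + (1-\lambda) y \in F[S]$.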
Is the other direction true?
Carathéodory's theorem

If a point $x \in \mathbb{R}^d$ lies in the convex hull of a set P, there is a subset P′ of P consisting of no more than d+1 points such that x lies in the convex hull of P′.
[Figure: the unit square with vertices (0,0), (1,0), (0,1), (1,1)]
Carathéodory's theorem: proof
Let $x \in \mathrm{Conv}(P)$. Then x is a convex combination of points in P,
i.e. $x = \lambda_1 x_1 + \dots + \lambda_k x_k$ where every $x_j \in P$, every $\lambda_j \ge 0$ and $\lambda_1 + \dots + \lambda_k = 1$.

Suppose $k > d+1$. Then the $k-1 > d$ vectors $x_2 - x_1, \dots, x_k - x_1$ are linearly dependent,
so there are real scalars $\mu_2, \dots, \mu_k$, not all zero, s.t.
$\mu_2 (x_2 - x_1) + \dots + \mu_k (x_k - x_1) = 0$.

Let $\mu_1 := -(\mu_2 + \dots + \mu_k)$. Then
$\mu_1 + \mu_2 + \dots + \mu_k = 0$,
$\mu_1 x_1 + \mu_2 x_2 + \dots + \mu_k x_k = 0$,
and not all of the $\mu_j$ are equal to zero.
Therefore, at least one $\mu_j > 0$.

Then, for every real $\alpha$,
$x = \lambda_1 x_1 + \dots + \lambda_k x_k - \alpha (\mu_1 x_1 + \mu_2 x_2 + \dots + \mu_k x_k) = (\lambda_1 - \alpha \mu_1) x_1 + \dots + (\lambda_k - \alpha \mu_k) x_k$.
Define $\alpha = \min\{\lambda_i / \mu_i : \mu_i > 0\}$.
For all i, $\lambda_i - \alpha \mu_i \ge 0$, and for some i, $\lambda_i - \alpha \mu_i = 0$.
What is $(\lambda_1 - \alpha \mu_1) + \dots + (\lambda_k - \alpha \mu_k)$?
It equals $1 - \alpha \cdot 0 = 1$, so x is a convex combination of at most k-1 points of P. Repeating the argument until $k \le d+1$ proves the theorem.
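The reduction step of the proof can be tried on the unit square in the plane (d = 2, k = 4). The sketch below is an illustration under stated assumptions: the dependency `mu` is supplied by hand rather than computed, exact arithmetic is done with `fractions.Fraction`, and the function names are hypothetical.

```python
from fractions import Fraction

def caratheodory_step(points, lam, mu):
    """One reduction step: given x = sum(lam[i] * points[i]) and a dependency mu
    with sum(mu) == 0 and sum(mu[i] * points[i]) == 0, return the weights
    lam[i] - alpha * mu[i], where alpha = min(lam[i] / mu[i] over mu[i] > 0)."""
    dim = len(points[0])
    assert sum(mu) == 0
    assert all(sum(m * p[c] for m, p in zip(mu, points)) == 0 for c in range(dim))
    alpha = min(l / m for l, m in zip(lam, mu) if m > 0)
    return [l - alpha * m for l, m in zip(lam, mu)]

def combine(points, weights):
    """The point sum(weights[i] * points[i])."""
    dim = len(points[0])
    return tuple(sum(w * p[c] for w, p in zip(weights, points)) for c in range(dim))

# Centre of the unit square written with k = 4 > d + 1 = 3 points.
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
lam = [Fraction(1, 4)] * 4
# Hand-picked dependency: x1 - x2 - x3 + x4 = 0 and 1 - 1 - 1 + 1 = 0.
mu = [Fraction(1), Fraction(-1), Fraction(-1), Fraction(1)]
new_lam = caratheodory_step(points, lam, mu)
```

Here $\alpha = 1/4$, two of the new weights vanish, and the same point is expressed with the two remaining vertices (1,0) and (0,1).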
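Convex hull of a compact set

Theorem: If $S \subset \mathbb{R}^d$ is a compact set, then conv(S) is a compact set.
Proof
Let $\Delta$ be the standard simplex in $\mathbb{R}^{d+1}$.
$\Delta$ is compact.
$S^{d+1}$ is compact, where $S^j = \{(x_1, \dots, x_j) : x_i \in S\}$.
Consider the map $\Phi : S^{d+1} \times \Delta \to \mathbb{R}^d$,
$\Phi(u_1, \dots, u_{d+1}; a_1, \dots, a_{d+1}) = a_1 u_1 + \dots + a_{d+1} u_{d+1}$.
Carathéodory's theorem implies that the image of $\Phi$ is conv(S).
Since $\Phi$ is continuous and $S^{d+1} \times \Delta$ is compact, the image of $\Phi$ is compact.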
Question




Is it necessary to have d+1 points?
$\mathrm{Conv}(S) = \mathrm{Conv}(\mathrm{Conv}(S))$
$A \subseteq B \Rightarrow \mathrm{Conv}(A) \subseteq \mathrm{Conv}(B)$
Is the set $\{t x + (1-t) y : x, y \in P,\ 0 < t < 1\}$ convex?
Radon's theorem

(Johann Radon, 1887-1956)
Any set of d + 2 points in Rd can be
partitioned into two (disjoint) sets
whose convex hulls intersect.
Radon's theorem: proof
Theorem: Let $S \subset \mathbb{R}^d$ be a set containing at least d+2 points. Then there are two disjoint subsets $R, B \subset S$ s.t. $\mathrm{conv}(R) \cap \mathrm{conv}(B) \ne \emptyset$.
Proof
Suppose $X = \{x_1, x_2, \dots, x_{d+2}\} \subset \mathbb{R}^d$.
Since any set of d+2 points in $\mathbb{R}^d$ is affinely dependent, there exists a set of multipliers $a_1, \dots, a_{d+2}$, not all of them 0, s.t.
$a_1 x_1 + \dots + a_{d+2} x_{d+2} = 0$, $a_1 + \dots + a_{d+2} = 0$.
Let $I = \{i : a_i > 0\}$, $J = \{i : a_i < 0\}$,
$X_1 = \{x_i : i \in I\}$, $X_2 = \{x_i : i \in J\}$.
Both I and J are non-empty, because the $a_i$ sum to 0 and are not all 0.
$z = (\sum_{i \in I} a_i x_i) / (\sum_{i \in I} a_i) = (\sum_{j \in J} a_j x_j) / (\sum_{j \in J} a_j) \in \mathrm{Conv}(X_1) \cap \mathrm{Conv}(X_2)$
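The construction of the common point z can be followed on four points in the plane (d = 2, d + 2 = 4). This is a minimal sketch: the affine dependency `a` is supplied by hand rather than solved for, and the function name `radon_point` is an assumption for this example.

```python
from fractions import Fraction

def radon_point(points, a):
    """Given multipliers a, not all zero, with sum(a) == 0 and
    sum(a[i] * points[i]) == 0, return (I, J, z): the index sets of positive
    and negative multipliers and the point z built from the positive side."""
    dim = len(points[0])
    assert any(a) and sum(a) == 0
    assert all(sum(ai * p[c] for ai, p in zip(a, points)) == 0 for c in range(dim))
    I = [i for i, ai in enumerate(a) if ai > 0]
    J = [i for i, ai in enumerate(a) if ai < 0]
    total = sum(Fraction(a[i]) for i in I)
    z = tuple(sum(Fraction(a[i]) * points[i][c] for i in I) / total
              for c in range(dim))
    return I, J, z

# Vertices of the unit square with the dependency x1 - x2 - x3 + x4 = 0.
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
a = [1, -1, -1, 1]
I, J, z = radon_point(points, a)
```

In this example the two parts are the two diagonals of the square, and z is their intersection point.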
Helly's theorem

(1884-1943)
Suppose $A_1, \dots, A_m \subset \mathbb{R}^d$ is a family of convex sets, and every d+1 of them have a non-empty intersection. Then $\bigcap_i A_i$ is non-empty.
Proof of Helly's theorem

The proof is by induction on m.
If m = d+1, then the statement is true.
Suppose m-1 > d and the statement is true for any m-1 sets.
The sets $B_j = \bigcap_{i \ne j} A_i$ are non-empty by the inductive hypothesis.
Pick a point $p_j$ from each $B_j$; this gives $\{p_1, \dots, p_m\}$.
By Radon's theorem, there is a partition of the p's into two sets $P_1, P_2$ s.t.
$X = \mathrm{conv}(P_1) \cap \mathrm{conv}(P_2) \ne \emptyset$.
Let $I_1 = \{i : p_i \in P_1\}$, $I_2 = \{i : p_i \in P_2\}$.
Let $x \in X$. We claim that $x \in \bigcap_i A_i$.
Note that for all $j \ne i$, $p_j \in A_i$.
Consider $i \in \{1, 2, \dots, m\}$.
Then $i \in I_1$ or $i \in I_2$.
Assume that $i \in I_2$.
For each $j \in I_1$ we have $j \ne i$, so $p_j \in A_i$; hence $P_1 \subset A_i$.
Since $A_i$ is convex, $\mathrm{conv}(P_1) \subset A_i$.
Therefore $x \in \mathrm{conv}(P_1) \subset A_i$. The case $i \in I_1$ is symmetric, using $P_2$.
Separating hyperplane
theorem

Theorem
If C and D are disjoint convex sets, then there exist $a \ne 0$ and b such that
$a^T x \le b$ for $x \in C$ and $a^T x \ge b$ for $x \in D$.

Strict separation requires additional assumptions (e.g., C is closed, D is a singleton).
Quotient space

The quotient of a vector space V by a
subspace N is a vector space obtained
by "collapsing" N to zero. The space
obtained is called a quotient space
and is denoted V/N (read V mod N).
Quotient space


The points of V/L are the affine subspaces parallel to L.
Addition in V/L is defined as:

$A_1 + A_2 = A_3$,
where $A_1 = L + v_1$, $A_2 = L + v_2$, $A_3 = L + v_3$ and $v_1 + v_2 = v_3$ for some $v_1, v_2, v_3$.
Scalar multiplication in V/L is defined as:

$a A_1 = A_2$,
provided $A_1 = L + v_1$, $A_2 = L + v_2$, and $v_2 = a v_1$.
The main problem in V/L


There is freedom in the choice of v.
This is the same problem as with the rational numbers.

For example, 1/2 = 2/4 = 3/6.
Usually we use some unique representation. What is the standard representation for Q?
What is the standard representation for V/L?
The isolation theorem



Let $A \subset \mathbb{R}^d$ be an open convex set and let $u \notin A$ be a point in $\mathbb{R}^d$. Then there exists an affine hyperplane H which contains u and strictly isolates A.
Proof.
We can assume u = 0.
First we prove the claim for d = 2.
If d > 2, there is a straight line L s.t. $0 \in L$, $L \cap A = \emptyset$: consider any 2-dimensional plane P s.t. $0 \in P$. Then $P \cap A$ is convex (and open in P), so the case d = 2 applied inside P gives such a line.
Now we prove the theorem.
Let H be a maximal subspace s.t. $0 \in H$, $H \cap A = \emptyset$, i.e. a subspace with these properties that is not contained in any larger such subspace.
If $\dim(\mathbb{R}^d / H) \ge 2$, then there exists a line $L \subset \mathbb{R}^d / H$ s.t. $0 \in L$ and L does not meet the image of A in $\mathbb{R}^d / H$.
Using this line we can increase the dimension of H, which contradicts maximality. Hence H is a hyperplane.
Separating hyperplane
theorem

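If C and D are disjoint convex sets, then there exist $a \ne 0$ and b s.t.
for all x in C, $a^T x \le b$, and for all x in D, $a^T x \ge b$.
Proof:
Let $\mathrm{dist}(C, D) = \inf\{\|u - v\| : u \in C, v \in D\}$.
Assume the infimum is attained at points $c \in C$, $d \in D$.
Let $a = d - c$, $b = (\|d\|^2 - \|c\|^2)/2$.
$F[x] = a^T x - b = (d - c)^T (x - (c + d)/2)$ is the separating function: $F \le 0$ on C and $F \ge 0$ on D.
[Figure: the sets C and D with their closest points c and d and the separating hyperplane]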
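The separating function built from a pair of closest points, with $a = d - c$ and $b = (\|d\|^2 - \|c\|^2)/2$, can be evaluated numerically. The sketch below assumes the closest points are already known; the two discs and the function names are hypothetical choices for illustration.

```python
def separating_hyperplane(c, d):
    """Return (a, b) with a = d - c and b = (|d|^2 - |c|^2) / 2."""
    a = [di - ci for ci, di in zip(c, d)]
    b = (sum(di * di for di in d) - sum(ci * ci for ci in c)) / 2
    return a, b

def F(a, b, x):
    """Separating function F[x] = <a, x> - b."""
    return sum(ai * xi for ai, xi in zip(a, x)) - b

# Assumed example: C is the unit disc about (0, 0), D the unit disc about (4, 0).
# Their closest points are c = (1, 0) and d = (3, 0).
a, b = separating_hyperplane((1, 0), (3, 0))
values_on_C = [F(a, b, p) for p in [(0, 0), (1, 0), (0, 1), (-1, 0)]]
values_on_D = [F(a, b, p) for p in [(4, 0), (3, 0), (4, 1), (5, 0)]]
```

In this example the hyperplane is the vertical line $x_1 = 2$, midway between c and d.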
Generalized inequalities
partial order
1. a≤a (reflexivity)
2. if a≤b and b≤a then a = b (antisymmetry)
3. if a≤b and b≤c then a≤c (transitivity)
Generalized inequalities
proper cone

A convex cone $K \subseteq \mathbb{R}^n$ is a proper cone if:
K is closed (contains its boundary)
K is solid (has non-empty interior)
K is pointed (contains no line): $x \in K$, $-x \in K \Rightarrow x = 0$

Generalized inequalities
proper cone

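Example
nonnegative orthant $K = \{x : x_i \ge 0\}$
positive semidefinite cone $K = S_+^n$
nonnegative polynomials on [0,1]: $K = \{x \in \mathbb{R}^n : x_1 + x_2 t + \dots + x_n t^{n-1} \ge 0 \text{ for all } t \in [0,1]\}$

Generalized inequality defined by a proper cone K:

$x \le_K y \iff y - x \in K$, $x <_K y \iff y - x \in \mathrm{int}\, K$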
Properties of generalized inequality

$\le_K$ preserves addition: if $x \le_K y$ and $u \le_K v$, then $x + u \le_K y + v$.
$\le_K$ is transitive: if $x \le_K y$ and $y \le_K z$, then $x \le_K z$.
$\le_K$ is reflexive: $x \le_K x$.
$\le_K$ is antisymmetric: if $x \le_K y$ and $y \le_K x$, then x = y.
$\le_K$ is preserved under limits.
Minimum and minimal
elements

$\le_K$ is not in general a linear ordering: we can have $x \not\le_K y$ and $y \not\le_K x$.
$x \in S$ is the minimum element of S with respect to $\le_K$ if
$y \in S \Rightarrow x \le_K y$.
$x \in S$ is a minimal element of S with respect to $\le_K$ if
$y \in S$ and $y \le_K x \Rightarrow x = y$.
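The difference between the minimum element and minimal elements is easiest to see for the nonnegative orthant, where $x \le_K y$ means $y - x$ has nonnegative entries. The following sketch uses finite sets of points in the plane; the function names and the sets `S1`, `S2` are assumptions for this example.

```python
def leq_K(x, y):
    """x <=_K y for K = nonnegative orthant: every entry of y - x is >= 0."""
    return all(yi - xi >= 0 for xi, yi in zip(x, y))

def minimum_element(S):
    """The element x with x <=_K y for every y in S, or None if there is none."""
    for x in S:
        if all(leq_K(x, y) for y in S):
            return x
    return None

def minimal_elements(S):
    """Elements x such that y in S and y <=_K x force y == x."""
    return [x for x in S if all(y == x for y in S if leq_K(y, x))]

S1 = [(1, 1), (2, 1), (1, 3)]   # (1, 1) lies below every other point
S2 = [(1, 2), (2, 1), (2, 2)]   # (1, 2) and (2, 1) are incomparable

min_S1 = minimum_element(S1)
min_S2 = minimum_element(S2)
minimal_S1 = minimal_elements(S1)
minimal_S2 = minimal_elements(S2)
```

`S2` has two minimal elements but no minimum element, because (1, 2) and (2, 1) cannot be compared.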
Dual cones

Dual cone of a cone K:

$K^* = \{y : y^T x \ge 0 \text{ for all } x \in K\}$
Example:

$K = \mathbb{R}^d_+$, $K^* = \mathbb{R}^d_+$
$K = \{\alpha v : \alpha \ge 0\}$, $K^* = \{x : \langle x, v \rangle \ge 0\}$
The dual cone of a subspace V is its orthogonal complement $V^\perp$.
Minimum and minimal
elements via dual inequalities


x is the minimum element of S with respect to the generalized inequality $\le_K$ iff for all $\lambda \succ_{K^*} 0$, x is the unique minimizer of $\lambda^T z$ over $z \in S$.
This means that for any $\lambda \succ_{K^*} 0$ the hyperplane $\{z : \lambda^T (z - x) = 0\}$ is a strict supporting hyperplane to S at x.
Convex function



A real-valued function f defined on an
interval (or on any convex subset C of
some vector space) is called convex, if for
any two points x and y in its domain C and
any t in [0,1], we have
$f(tx + (1-t)y) \le t f(x) + (1-t) f(y)$.
f is concave if -f is convex.
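The defining inequality $f(tx + (1-t)y) \le t f(x) + (1-t) f(y)$ can be probed numerically on a grid. This is only a sanity check under stated assumptions (a finite sample and a small floating-point tolerance): passing it is evidence of convexity on the sample, not a proof. The function names are hypothetical.

```python
import math

def violates_convexity(f, x, y, t, tol=1e-9):
    """True if f(t x + (1-t) y) exceeds t f(x) + (1-t) f(y) by more than tol."""
    return f(t * x + (1 - t) * y) > t * f(x) + (1 - t) * f(y) + tol

def looks_convex(f, xs, ts):
    """Check the inequality for all sampled pairs x, y and weights t."""
    return not any(
        violates_convexity(f, x, y, t) for x in xs for y in xs for t in ts
    )

xs = [i / 4 for i in range(1, 21)]   # sample points in (0, 5]
ts = [i / 10 for i in range(11)]     # weights t in [0, 1]

exp_ok = looks_convex(math.exp, xs, ts)
neg_entropy_ok = looks_convex(lambda x: x * math.log(x), xs, ts)
log_ok = looks_convex(math.log, xs, ts)  # log is concave, so this fails
```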
Examples convex on R
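Affine: $ax + b$ on $\mathbb{R}$, for any $a, b \in \mathbb{R}$.
Exponential: $e^{ax}$, for any $a \in \mathbb{R}$.
Powers: $x^a$ on $\mathbb{R}_{++}$, for $a \ge 1$ or $a < 0$.
Powers of absolute value: $|x|^p$ on $\mathbb{R}$, for $p \ge 1$.
Negative entropy: $x \log(x)$ on $\mathbb{R}_{++}$.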
Examples concave on R



Affine: $ax + b$ on $\mathbb{R}$, for any $a, b \in \mathbb{R}$.
Powers: $x^a$ on $\mathbb{R}_{++}$, for $0 \le a \le 1$.
Logarithm: $\log(x)$ on $\mathbb{R}_{++}$.
Examples on Rn and Rmn





Affine function $f(x) = a^T x + b$
Norms: $\|x\|$
Max
$f(X) = \mathrm{tr}(A^T X) + b = \sum_{i,j} A_{ij} X_{ij} + b$
Spectral (maximum singular value) norm $f(X) = \|X\|_2 = \sigma_{\max}(X) = (\lambda_{\max}(X^T X))^{1/2}$