20225701 Advanced Seminar in Machine Learning
Pajor’s lemma
Lecturer: Aryeh Kontorovich
November 14, 2016
This proof of Pajor’s lemma is taken from http://tinyurl.com/pssvc-pajor. This version
has fewer words and more symbols, and of course any mistakes are mine alone.
We treat a subset \(S \subseteq X\) and a binary function \(f : X \to \{0,1\}\) as the same object. (I mention binary
functions only to be consistent with the notation used in class; for this note, the set notation is
more convenient.) Let \(X\) be a set and \(F \subseteq \{0,1\}^X\) be a family of subsets. We say that \(F\) shatters
a subset \(S \subseteq X\) if \(|F(S)| = 2^{|S|}\), where \(F(S)\) is the range of \(F\) on \(S\):
\[ F(S) = \{ S \cap T : T \in F \}. \]
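The two definitions above translate almost literally into code. The following is a minimal sketch, not part of the original note: the function names `trace` and `shatters` and the toy family are assumptions chosen for illustration.

```python
def trace(family, S):
    """Range of the family on S: the set {S ∩ T : T in family}."""
    S = frozenset(S)
    return {S & frozenset(T) for T in family}


def shatters(family, S):
    """True when the family realizes all 2^|S| subsets of S."""
    return len(trace(family, S)) == 2 ** len(frozenset(S))


# Hypothetical toy family over X = {1, 2, 3}.
family = [{1}, {2}, {1, 2}, set()]
print(shatters(family, {1, 2}))  # True: all four subsets of {1, 2} appear
print(shatters(family, {3}))     # False: no member of the family contains 3
```

Storing each set as a `frozenset` makes the traces hashable, so duplicates collapse automatically when they are collected into a Python `set`.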
Define the shards \(F^{\$}\) of \(F\) (my own non-standard term!) to be the collection of all subsets of
\(X\) that it shatters:
\[ F^{\$} = \{ S \subseteq X : |F(S)| = 2^{|S|} \}. \]
Theorem 0.1 For all finite \(F\),
\[ |F^{\$}| \ge |F|. \]
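For a very small ground set the theorem can be checked exhaustively. The sketch below is only an illustration under stated assumptions: the helper names `powerset` and `shards` are hypothetical, the ground set is fixed to three points, and the brute-force enumeration is exponential in \(|X|\).

```python
from itertools import combinations


def powerset(X):
    """All subsets of X, as frozensets."""
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]


def shards(family, X):
    """All S ⊆ X shattered by the family (brute force)."""
    family = {frozenset(T) for T in family}
    return {S for S in powerset(X)
            if len({S & T for T in family}) == 2 ** len(S)}


X = {0, 1, 2}
subsets = powerset(X)  # the 8 subsets of X

# Check the inequality for every nonempty family of subsets of X (255 families).
all_ok = all(
    len(shards(fam, X)) >= len(fam)
    for r in range(1, len(subsets) + 1)
    for fam in combinations(subsets, r)
)
print(all_ok)  # True
```

Such a check proves nothing beyond the three-point case, but it is a quick way to catch a misread definition.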
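Proof: We proceed by induction on \(|F|\). The base case is \(F = \{T\}\), and this singleton shatters
the empty set (since \(F(\emptyset) = \{\emptyset\}\) has \(2^0 = 1\) element), so \(|F^{\$}| \ge 1 = |F|\) and the claim holds.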
Now assuming \(|F| \ge 2\), there must be some \(x \in X\) that belongs to some but not all \(T \in F\). Split
\(F = F_x \cup \bar{F}_x\) into two nonempty disjoint collections of sets: those that contain \(x\) and those that do
not:
\[ F_x = \{ T \in F : x \in T \}, \qquad \bar{F}_x = \{ T \in F : x \notin T \}. \]
The main insight of the proof is the claim that
\[ |F^{\$}| \ge |F_x^{\$}| + |\bar{F}_x^{\$}|. \tag{*} \]
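As a sanity check, note that for all \(A, B \subseteq \{0,1\}^X\), we have \((A \cup B)^{\$} \supseteq A^{\$} \cup B^{\$}\) (why?) and hence
\(|F^{\$}| \ge |F_x^{\$} \cup \bar{F}_x^{\$}|\). However, we might have \(|F_x^{\$} \cup \bar{F}_x^{\$}| < |F_x^{\$}| + |\bar{F}_x^{\$}|\), since there might be sets \(T \subseteq X\)
that are shattered by both \(F_x\) and \(\bar{F}_x\). To prove (*), we must account for sets \(T \in F^{\$} \setminus (F_x^{\$} \cup \bar{F}_x^{\$})\).
Such sets necessarily exist: for all \(T \in F_x^{\$} \cup \bar{F}_x^{\$}\), we have \(x \notin T\) (why?), whereas \(\{x\} \in F^{\$}\)
because \(x\) belongs to some but not all members of \(F\). Next, convince yourself
that
\[ T \in F_x^{\$} \cap \bar{F}_x^{\$} \implies T \cup \{x\} \in F^{\$}. \]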
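Hence, every \(T \in F_x^{\$} \cap \bar{F}_x^{\$}\) contributes two distinct sets to \(F^{\$}\): \(T\) itself, and \(T \cup \{x\}\), which
contains \(x\) and therefore lies outside \(F_x^{\$} \cup \bar{F}_x^{\$}\). Of course,
every \(T \in F_x^{\$}\) contributes a set to \(F^{\$}\), as does every \(T \in \bar{F}_x^{\$}\). It follows that
\[ |F^{\$}| \ge |F_x^{\$} \cup \bar{F}_x^{\$}| + |F_x^{\$} \cap \bar{F}_x^{\$}| = |F_x^{\$}| + |\bar{F}_x^{\$}|, \]
which proves (*). We are now done, since by induction, \(|F_x^{\$}| \ge |F_x|\) and \(|\bar{F}_x^{\$}| \ge |\bar{F}_x|\), and \(|F_x| + |\bar{F}_x| = |F|\).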
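The steps of the inductive argument can be traced on a concrete family. This is a hedged sketch only: the five-set family, the choice x = 1, and the helper names `powerset` and `shards` are assumptions made for illustration and do not come from the note.

```python
from itertools import combinations


def powerset(X):
    """All subsets of X, as frozensets."""
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]


def shards(family, X):
    """All S ⊆ X shattered by the family (brute force)."""
    family = {frozenset(T) for T in family}
    return {S for S in powerset(X)
            if len({S & T for T in family}) == 2 ** len(S)}


X = {0, 1, 2}
F = [frozenset(T) for T in ({0}, {1}, {0, 1}, {1, 2}, set())]
x = 1  # belongs to some but not all members of F

F_x = [T for T in F if x in T]        # members containing x
F_bar = [T for T in F if x not in T]  # members avoiding x

sh, sh_x, sh_bar = shards(F, X), shards(F_x, X), shards(F_bar, X)
both = sh_x & sh_bar

# No set shattered by F_x or by F_bar can contain x.
assert all(x not in T for T in sh_x | sh_bar)
# A set shattered by both halves stays shattered by F after adding x.
assert all(T | {x} in sh for T in both)
# Inclusion-exclusion on the shards, then the inequality (*).
assert len(sh_x | sh_bar) + len(both) == len(sh_x) + len(sh_bar)
assert len(sh) >= len(sh_x) + len(sh_bar)

print(len(sh), len(sh_x), len(sh_bar))  # 5 3 2
```

In this example the inequality (*) holds with equality: the two sets shattered by both halves, the empty set and {0}, account for the two extra shards {1} and {0, 1}.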
Let’s end the note by showing how Pajor’s lemma implies the bound
\[ |F(X)| \le \sum_{i=0}^{d} \binom{|X|}{i}, \tag{**} \]
where \(d\) is the VC-dimension of \(F\). By Pajor’s lemma, we have \(|F(X)^{\$}| \ge |F(X)|\). But \(F\) can only
shatter sets \(S\) with \(|S| \le d\), and the number of such sets is upper-bounded by the right-hand side
of (**).
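The chain of inequalities behind (**) can be observed on a small example. The sketch below assumes a hypothetical family, the intervals of a four-point ground set, and hypothetical helper names (`powerset`, `shards`); it computes the VC-dimension by brute force as the size of the largest shattered set.

```python
from itertools import combinations
from math import comb


def powerset(X):
    """All subsets of X, as frozensets."""
    X = list(X)
    return [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]


def shards(family, X):
    """All S ⊆ X shattered by the family (brute force)."""
    family = {frozenset(T) for T in family}
    return {S for S in powerset(X)
            if len({S & T for T in family}) == 2 ** len(S)}


X = range(4)
# Hypothetical example: all intervals {i, ..., j-1} of X, including the empty one.
F = {frozenset(range(i, j)) for i in range(5) for j in range(i, 5)}

sh = shards(F, X)
d = max(len(S) for S in sh)                        # VC-dimension of F
bound = sum(comb(len(X), i) for i in range(d + 1))  # right-hand side of (**)

# |F| <= number of shattered sets <= number of subsets of size at most d.
assert len(F) <= len(sh) <= bound
print(len(F), len(sh), d, bound)  # 11 11 2 11
```

For intervals both inequalities are tight: every set of size at most 2 is shattered and no larger set is, so the family has exactly as many members as the bound allows.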