Lecture 7: The Pumping Lemma for CFLs
September 29, 2016
CS 1010 Theory of Computation
Recall the definition of a context-free grammar (CFG). Suppose we have a “mystery CFG”
with the following set of rules:
S → φ ∣ ε ∣ a ∣ b ∣ U ∣ C ∣ K ∣ (S)
U → S ∪ S
C → S ○ S
K → N∗
N → φ ∣ ε ∣ a ∣ b ∣ (U) ∣ (C)
A CFG consists of a finite set V of variables, a finite set Σ of terminals, a finite set R of
rules of the form A → u for A ∈ V and u ∈ (V ∪ Σ)∗, and a start variable S ∈ V. In our
“mystery CFG”, we have V = {S, U, C, K, N} and Σ = {φ, ε, a, b, (, ), ∗, ○, ∪}.
For example, one derivation from this CFG is:
S → U → S ∪ S → K ∪ S → N∗ ∪ S → a∗ ∪ S → a∗ ∪ (S) → a∗ ∪ (C) → a∗ ∪ (S ○ S) →∗ a∗ ∪ (a ○ b∗ ○ a),
where the final →∗ abbreviates several derivation steps. It turns out that this CFG generates
regular expressions!
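To make the notion of a derivation step concrete, here is a minimal Python sketch that encodes the "mystery CFG" as a dictionary and checks that each step of the derivation above rewrites exactly one variable using one rule. The names RULES, one_step, and is_derivation are ours, chosen only for this illustration.

```python
# The "mystery CFG": each variable maps to its list of right-hand sides,
# and each right-hand side is a list of symbols.
RULES = {
    "S": [["φ"], ["ε"], ["a"], ["b"], ["U"], ["C"], ["K"], ["(", "S", ")"]],
    "U": [["S", "∪", "S"]],
    "C": [["S", "○", "S"]],
    "K": [["N", "∗"]],
    "N": [["φ"], ["ε"], ["a"], ["b"], ["(", "U", ")"], ["(", "C", ")"]],
}

def one_step(form, rules):
    """Return every sentential form reachable from `form` in one derivation step."""
    results = []
    for i, symbol in enumerate(form):
        for rhs in rules.get(symbol, []):
            results.append(form[:i] + rhs + form[i + 1:])
    return results

def is_derivation(forms, rules):
    """Check that each form follows from the previous one by a single rule."""
    return all(nxt in one_step(cur, rules) for cur, nxt in zip(forms, forms[1:]))

# The single-step prefix of the derivation given in the notes.
steps = [
    ["S"],
    ["U"],
    ["S", "∪", "S"],
    ["K", "∪", "S"],
    ["N", "∗", "∪", "S"],
    ["a", "∗", "∪", "S"],
    ["a", "∗", "∪", "(", "S", ")"],
    ["a", "∗", "∪", "(", "C", ")"],
    ["a", "∗", "∪", "(", "S", "○", "S", ")"],
]
print(is_derivation(steps, RULES))
```

A sequence that skips a step, such as going from S directly to N, is rejected because no single rule produces it.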
Topics Covered
1. The Pumping Lemma for CFLs
2. Applications of the Pumping Lemma
1 The Pumping Lemma for CFLs
The Pumping Lemma. If L is a context-free language (CFL), then there exists a pumping
length p ≥ 1 such that every s ∈ L with ∣s∣ ≥ p can be written as s = uvxyz such that the
following conditions hold:
1. For all i ≥ 0, uv^i xy^i z ∈ L
2. ∣vy∣ ≥ 1
3. ∣vxy∣ ≤ p
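As a small illustration of what the lemma promises, the sketch below pumps a string from {a^n b^n ∣ n ≥ 0}, a standard CFL that we bring in only as an example. The value p = 3 and the particular split are assumptions chosen for the demo, not values derived from a grammar.

```python
def pump(u, v, x, y, z, i):
    """Return the pumped string u v^i x y^i z."""
    return u + v * i + x + y * i + z

def in_anbn(s):
    """Membership test for {a^n b^n | n >= 0} (example language, not from the notes)."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

p = 3  # hypothetical pumping length, chosen only for the demo
s = "a" * p + "b" * p

# One split that satisfies conditions 2 and 3: v is the last a, y is the first b.
u, v, x, y, z = "a" * (p - 1), "a", "", "b", "b" * (p - 1)
assert u + v + x + y + z == s
assert len(v + y) >= 1 and len(v + x + y) <= p

for i in range(5):
    t = pump(u, v, x, y, z, i)
    print(i, t, in_anbn(t))
```

Every pumped string keeps the number of a's and b's equal, so condition 1 holds for this split.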
We can consider the pumping lemma in terms of a parse tree diagram:
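[Figure: parse tree rooted at S with two nested occurrences of R; the leaves, read left to
right, spell u, v, x, y, z.]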
If the string is long enough, then somewhere in the derivation of the string uvxyz there is a
repeated variable R. That is, some path from the root to a terminal s_i has length greater
than ∣V∣, so by the pigeonhole principle some variable R appears at least twice on that path.
As a result, we can perform “surgery on trees” to pump the string uvxyz:
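[Figure: the pumped parse tree, with three nested occurrences of R; the leaves, read left to
right, spell u, v, v, x, y, y, z.]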
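The tree surgery can be sketched in Python on a toy parse tree. The grammar S → R, R → aRb ∣ ε and the tuple encoding of trees are assumptions made for this example; pumping up replaces the lower R subtree with a copy of the upper one, and pumping down does the reverse.

```python
import copy

# A node is (label, children); a leaf is a plain terminal string.
def tree_yield(node):
    """Concatenate the leaves of the tree from left to right."""
    if isinstance(node, str):
        return node
    _, children = node
    return "".join(tree_yield(child) for child in children)

def subtree(node, path):
    """Follow child indices in `path` down from `node`."""
    for index in path:
        node = node[1][index]
    return node

def replace(node, path, new):
    """Return a copy of `node` with the subtree at `path` replaced by `new`."""
    if not path:
        return copy.deepcopy(new)
    label, children = node
    children = list(children)
    children[path[0]] = replace(children[path[0]], path[1:], new)
    return (label, children)

# Parse tree of "aabb" under the hypothetical grammar S -> R, R -> aRb | ε.
lower_r = ("R", [])
upper_r = ("R", ["a", lower_r, "b"])
tree = ("S", [("R", ["a", upper_r, "b"])])
upper_path, lower_path = (0, 1), (0, 1, 1)

pumped_up = replace(tree, lower_path, subtree(tree, upper_path))
pumped_down = replace(tree, upper_path, subtree(tree, lower_path))
print(tree_yield(tree), tree_yield(pumped_up), tree_yield(pumped_down))
```

Here u = a, v = a, x = ε, y = b, z = b, so the three yields correspond to i = 1, i = 2, and i = 0.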
Proof of the Pumping Lemma. Let G be a CFG for L, and let d be the maximum number
of symbols on the right-hand side of a rule in G. That is, the longest rule has the form
A → w_1 w_2 … w_d, so every node of a parse tree has at most d children. A parse tree of
height h therefore has at most d^h leaves, so any string s it yields satisfies ∣s∣ ≤ d^h.
Conversely, if ∣s∣ ≥ d^h + 1, then every parse tree for s has height greater than h. Let
h = ∣V∣. If s ∈ L with ∣s∣ ≥ d^h + 1, then every parse tree for s has height at least
∣V∣ + 1, which means the parse tree is large enough for “tree surgery”.
Let p = d^(∣V∣+1) and let s ∈ L be of length at least p. (We may assume d ≥ 2, so that
p ≥ d^∣V∣ + 1.) Let τ be the smallest parse tree for s, in terms of the number of non-leaves.
Let i be the position of a leaf whose path from the root has length at least ∣V∣ + 1, such
that the leaf is as deep as possible and corresponds to terminal s_i. Let R be a variable that
appears at least twice among the last ∣V∣ + 1 variables on the path from the root to s_i.
Write s = uvxyz, where the upper occurrence of R generates vxy and the lower occurrence
generates x. Now consider the three conditions of the pumping lemma:
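1. We can pump uvxyz using the “tree surgery” method outlined above: replacing the lower
R subtree with a copy of the upper one (or the upper with the lower) still gives a valid
parse tree, so the result is still in L. Thus, uv^i xy^i z ∈ L for all i ≥ 0.
2. If v and y were both empty, then replacing the upper R subtree with the lower one would
give a smaller parse tree for the same string s, and τ would no longer be the smallest
parse tree. It follows that we must have ∣vy∣ ≥ 1.
3. Note that the upper R is at most ∣V∣ + 1 steps away from the leaf s_i, and s_i is as deep
as possible, so the subtree rooted at the upper R has height at most ∣V∣ + 1. It therefore
yields at most d^(∣V∣+1) = p leaves, so ∣vxy∣ ≤ p.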
Thus, we’ve shown that the three conditions of the pumping lemma hold when pumping a
string of appropriate length. ∎
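To get a feel for the size of the pumping length produced by this proof, the following sketch computes d, ∣V∣, and p = d^(∣V∣+1) for the “mystery CFG” from the start of the lecture. The dictionary encoding and the variable names are ours.

```python
# The "mystery CFG" as variable -> list of right-hand sides (lists of symbols).
RULES = {
    "S": [["φ"], ["ε"], ["a"], ["b"], ["U"], ["C"], ["K"], ["(", "S", ")"]],
    "U": [["S", "∪", "S"]],
    "C": [["S", "○", "S"]],
    "K": [["N", "∗"]],
    "N": [["φ"], ["ε"], ["a"], ["b"], ["(", "U", ")"], ["(", "C", ")"]],
}

# d: the maximum number of symbols on the right-hand side of any rule.
d = max(len(rhs) for alternatives in RULES.values() for rhs in alternatives)

# |V|: the number of variables.
num_vars = len(RULES)

# The pumping length used in the proof.
p = d ** (num_vars + 1)

print(d, num_vars, p)  # prints: 3 5 729
```

The bound is far from tight; the proof only needs some p that forces a repeated variable.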
2 Applications of the Pumping Lemma
Consider the language A = {a^n b^n c^n ∣ n ≥ 0}. We will prove that A is not context-free.
Proof that A = {a^n b^n c^n} is not Context-Free. Suppose that A were a CFL. Then
there exists a pumping length p that satisfies the pumping lemma. Let s = a^p b^p c^p ∈ A,
and write s = uvxyz as in the lemma. Because ∣vxy∣ ≤ p, the substring vxy cannot reach from
the a's to the c's, so v and y together contain at most two different types of letters.
Consider both cases:
1. Suppose v or y contains more than one type of letter. Then uv^2 xy^2 z ∉ A because
some letters will be out of order.
2. Alternatively, suppose v and y each contain only one type of letter. Then some
letter appears in neither v nor y, and since ∣vy∣ ≥ 1, the string uv^2 xy^2 z increases the
count of at least one and at most two types of letters. This means that these letters
occur more often than the third, so uv^2 xy^2 z ∉ A.
In either case this contradicts the first condition of the pumping lemma, implying that A is
not context-free. ∎
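The case analysis can be sanity-checked by brute force for small values of p. The sketch below enumerates every split s = uvxyz of a^p b^p c^p with ∣vy∣ ≥ 1 and ∣vxy∣ ≤ p and looks for one that stays in the language when pumped with i = 0 and i = 2. This is only a finite check under our own helper names (in_A, decompositions, survives), not a substitute for the proof.

```python
def in_A(s):
    """Membership test for {a^n b^n c^n | n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def decompositions(s, p):
    """Yield every split s = u v x y z with |vy| >= 1 and |vxy| <= p."""
    n = len(s)
    for start in range(n + 1):                           # where v begins
        for end in range(start, min(start + p, n) + 1):  # where y ends
            for a in range(start, end + 1):              # where v ends
                for b in range(a, end + 1):              # where y begins
                    u, v, x, y, z = s[:start], s[start:a], s[a:b], s[b:end], s[end:]
                    if len(v) + len(y) >= 1:
                        yield u, v, x, y, z

def survives(s, p, member, exponents=(0, 2)):
    """Return a split that stays in the language for all tested exponents, or None."""
    for u, v, x, y, z in decompositions(s, p):
        if all(member(u + v * i + x + y * i + z) for i in exponents):
            return (u, v, x, y, z)
    return None

for p in range(1, 5):
    s = "a" * p + "b" * p + "c" * p
    print(p, survives(s, p, in_A))
```

For each p tried, no split survives, which matches the two cases of the proof. Running the same search on a^2 b^2 with a membership test for {a^n b^n} does find a surviving split, as expected for a context-free language.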
Regular Operations on CFLs. Let A and B be CFLs, generated by grammars with start
variables S_A and S_B (with variables renamed so that the two grammars share none). What
happens when we apply the regular operations to CFLs?
1. For A ∪ B, we can make a new start variable S and rule S → S_A ∣ S_B, and the result is
still a CFL.
2. For A ○ B, we make a new start variable S and rule S → S_A S_B. The result is a CFL.
3. For A∗, we make a new start variable S and rule S → S_A ∣ ε ∣ SS, which results in a
CFL.
In other words, CFLs are closed under the regular operations union, concatenation, and
Kleene star.
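The three constructions are short enough to write out directly. In the sketch below a grammar is a dictionary from variables to lists of right-hand sides; the demo grammars for {a^n b^n} and {c^m}, the fresh start variable name "S", and the brute-force enumerator words_up_to are all assumptions made for illustration.

```python
def union(rules_a, start_a, rules_b, start_b, new_start="S"):
    """Grammar for A ∪ B: add the rule S -> S_A | S_B."""
    rules = {**rules_a, **rules_b}
    rules[new_start] = [[start_a], [start_b]]
    return rules, new_start

def concat(rules_a, start_a, rules_b, start_b, new_start="S"):
    """Grammar for A ○ B: add the rule S -> S_A S_B."""
    rules = {**rules_a, **rules_b}
    rules[new_start] = [[start_a, start_b]]
    return rules, new_start

def star(rules_a, start_a, new_start="S"):
    """Grammar for A*: add the rule S -> S_A | ε | SS."""
    rules = dict(rules_a)
    rules[new_start] = [[start_a], [], [new_start, new_start]]
    return rules, new_start

def words_up_to(rules, start, max_len):
    """Brute-force the strings of length <= max_len derivable from `start`.

    Sentential forms longer than 2 * max_len + 2 are discarded, a crude
    cutoff that is enough for the small demo grammars below.
    """
    seen, frontier, words = set(), [(start,)], set()
    while frontier:
        form = frontier.pop()
        if form in seen:
            continue
        seen.add(form)
        # Expand the leftmost variable, if there is one.
        idx = next((i for i, sym in enumerate(form) if sym in rules), None)
        if idx is None:
            words.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            terminals = sum(1 for sym in new if sym not in rules)
            if terminals <= max_len and len(new) <= 2 * max_len + 2:
                frontier.append(new)
    return words

# Hypothetical demo grammars: X generates {a^n b^n}, Y generates {c^m}.
G_A = {"X": [["a", "X", "b"], []]}
G_B = {"Y": [["c", "Y"], []]}

print(sorted(words_up_to(*union(G_A, "X", G_B, "Y"), 4)))
print(sorted(words_up_to(*concat(G_A, "X", G_B, "Y"), 3)))
print(sorted(words_up_to(*star(G_A, "X"), 4)))
```

The constructions assume the two grammars use disjoint variable names and that the new start variable is fresh; otherwise rules from one grammar would leak into the other.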
Other Operations on CFLs. However, CFLs are not closed under some operations
under which regular languages are closed. For example, consider the intersection A ∩ B.
Let A = {a^n b^n c^m ∣ m, n ≥ 0}, which is context-free with the following rules:
S → BC
B → ε ∣ aBb
C → ε ∣ cC
The language B = {a^m b^n c^n ∣ m, n ≥ 0} is also context-free with a similar grammar.
However, the intersection is A ∩ B = {a^n b^n c^n ∣ n ≥ 0}, which we know is not a CFL.
Hence, CFLs are not closed under intersection.
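The identity A ∩ B = {a^n b^n c^n} can be checked exhaustively on short strings. The sketch below uses direct membership tests (our own helper functions, written with Python's re module) rather than the grammars, and compares them on every string over {a, b, c} of length at most 6.

```python
import re
from itertools import product

def blocks(s):
    """Split s into its a-, b-, and c-blocks, or return None if s is not in a*b*c*."""
    m = re.fullmatch(r"(a*)(b*)(c*)", s)
    return m.groups() if m else None

def in_A(s):
    """Membership test for {a^n b^n c^m}."""
    parts = blocks(s)
    return parts is not None and len(parts[0]) == len(parts[1])

def in_B(s):
    """Membership test for {a^m b^n c^n}."""
    parts = blocks(s)
    return parts is not None and len(parts[1]) == len(parts[2])

def in_abc(s):
    """Membership test for {a^n b^n c^n}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

strings = ["".join(t) for k in range(7) for t in product("abc", repeat=k)]
agree = all((in_A(s) and in_B(s)) == in_abc(s) for s in strings)
print(agree)
```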
Similarly, CFLs are not closed under complement. If they were, then they would be closed
under intersection because A ∩ B = (A^C ∪ B^C)^C and we know CFLs are closed under union.
As another example, consider the language L = {ww ∣ w ∈ {0, 1}∗}. We will prove that it is
not context-free.
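Proof that L = {ww} is not Context-Free. Suppose that L were context-free. Let p be
its pumping length. Let s = 0^p 1^p 0^p 1^p ∈ L, and write s = uvxyz as in the lemma. Then
vxy cannot be within any single block 0^p or 1^p, as pumping would change the length of that
block only. Assume that vxy is within the first 0^p 1^p portion of the string. Pumping down
would make the first “half” of the string shorter than the second. The new string would be
of the form 0^m 1^n 0^p 1^p with m + n < 2p. If its length is odd, it is not in L. Otherwise,
cutting it in half would yield a first half 0^m 1^n 0^x and a second half 0^(p−x) 1^p for some
x ≥ 1. The first half ends in 0 and the second ends in 1, so this string is not of the form
ww and is not in L. The case where vxy lies within the second 0^p 1^p portion is symmetric.
Finally, if vxy straddles the midpoint of s, then pumping down gives 0^p 1^j 0^k 1^p, where j
and k are not both p, which again is not of the form ww. As a result, L cannot be
context-free. ∎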
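The pumping-down step can be seen on a concrete instance. The sketch below takes p = 4 as a small stand-in for the pumping length (an assumption for the demo), pumps down one split whose vxy lies in the first 0^p 1^p block, and then checks every admissible split of s for exponents 0 and 2. The helper names are ours.

```python
def in_ww(t):
    """Membership test for {ww | w in {0,1}*}."""
    half = len(t) // 2
    return len(t) % 2 == 0 and t[:half] == t[half:]

def pump(u, v, x, y, z, e):
    """Return u v^e x y^e z."""
    return u + v * e + x + y * e + z

def splits(s, p):
    """Yield every split s = u v x y z with |vy| >= 1 and |vxy| <= p."""
    n = len(s)
    for i in range(n + 1):
        for j in range(i, n + 1):
            for k in range(j, n + 1):
                for m in range(k, min(i + p, n) + 1):
                    if (j - i) + (m - k) >= 1:
                        yield s[:i], s[i:j], s[j:k], s[k:m], s[m:]

p = 4  # small stand-in for the pumping length (assumption for the demo)
s = "0" * p + "1" * p + "0" * p + "1" * p

# The case treated in the proof: vxy inside the first 0^p 1^p block.
u, v, x, y, z = "0" * (p - 1), "0", "", "1", "1" * (p - 1) + "0" * p + "1" * p
down = pump(u, v, x, y, z, 0)
half = len(down) // 2
print(down[:half], down[half:], in_ww(down))

# Finite sanity check: every admissible split fails for exponent 0 or 2.
unpumpable = all(
    any(not in_ww(pump(*parts, e)) for e in (0, 2))
    for parts in splits(s, p)
)
print(unpumpable)
```

In the printed halves, the first ends in 0 and the second ends in 1, which is the mismatch the proof relies on.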