Properties of Context-Free Languages

Properties of Context-Free Languages – p.1/45
Proving context-freeness
There are two mechanisms that may be used to
show that a language is not context-free:
1. Use closure operators.
2. Use the pumping lemma for context-free
languages.
Using closure operators
We illustrate the use of closure operators by
solving Problem 2.18 from the textbook.
Let $C$ be a context-free language and
$R$ be a regular language.
Problem 2.18
1. Prove that $C \cap R$ is context-free.
2. Use the result of (1) to show that the
language $A = \{w \in \{a,b,c\}^* \mid w$ contains an
equal number of a's, b's, and c's$\}$ is not a
context-free language.
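Part 1
Let $C$ be context-free and $R$ be regular. Consider
$P = (Q_P, \Sigma, \Gamma, \delta_P, q_P, F_P)$ the PDA that recognizes $C$ and
$M = (Q_M, \Sigma, \delta_M, q_M, F_M)$ the DFA that recognizes $R$.
Construct $P'$ that recognizes $C \cap R$ as follows:
1. The states of $P'$ are pairs of states, $Q' = Q_P \times Q_M$.
2. The start state is the pair of start states, i.e., $q' = (q_P, q_M)$.
3. $P'$ does whatever $P$ does, keeping track of the states of $M$,
i.e., when $P$ moves from $q$ to $r$ reading $a \in \Sigma$, $P'$ moves from $(q, m)$ to $(r, \delta_M(m, a))$
with the same stack operation; on an $\varepsilon$-move of $P$ the second component is unchanged.
4. $P'$ stops only at a state $(q, m)$ with $q \in F_P$ and $m \in F_M$
(i.e., $F' = F_P \times F_M$).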
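The construction in Part 1 can be sketched in Python. This is a minimal illustration, not the textbook's solution: the dictionary encoding of the transition functions, the toy PDA for $\{a^nb^n\}$, the toy DFA for "even number of a's", and all function names are assumptions made for the example.

```python
from itertools import product

def product_pda(p_delta, p_accept, m_states, m_delta, m_accept):
    """Build the transitions and accepting states of P' from P and M.

    p_delta maps (state, input symbol or "", stack top) to a set of
    (next state, string pushed in place of the top) pairs.
    m_delta maps (state, input symbol) to the next DFA state.
    """
    delta = {}
    for (q, a, top), targets in p_delta.items():
        for m in m_states:
            # On an epsilon-move of P the DFA component does not change.
            m_next = m if a == "" else m_delta[(m, a)]
            delta[((q, m), a, top)] = {((q2, m_next), push) for q2, push in targets}
    return delta, set(product(p_accept, m_accept))

def accepts(delta, start, accepting, word, bottom="Z"):
    """Search all runs of a PDA; accept by final state at end of input.

    Terminates for the toy machines below; a PDA whose epsilon-moves
    grow the stack forever would need an extra bound.
    """
    frontier = [(start, 0, bottom)]
    seen = set(frontier)
    while frontier:
        state, pos, stack = frontier.pop()
        if pos == len(word) and state in accepting:
            return True
        if not stack:
            continue
        top, rest = stack[0], stack[1:]
        moves = [(pos, delta.get((state, "", top), ()))]
        if pos < len(word):
            moves.append((pos + 1, delta.get((state, word[pos], top), ())))
        for new_pos, targets in moves:
            for new_state, push in targets:
                config = (new_state, new_pos, push + rest)
                if config not in seen:
                    seen.add(config)
                    frontier.append(config)
    return False

# Hypothetical PDA P for {a^n b^n | n >= 0}; "Z" marks the stack bottom.
P_DELTA = {
    ("p", "a", "Z"): {("p", "AZ")},
    ("p", "a", "A"): {("p", "AA")},
    ("p", "b", "A"): {("q", "")},
    ("q", "b", "A"): {("q", "")},
    ("p", "", "Z"): {("f", "Z")},
    ("q", "", "Z"): {("f", "Z")},
}
# Hypothetical DFA M: accepts words with an even number of a's.
M_STATES = {0, 1}
M_DELTA = {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}

DELTA, ACCEPT = product_pda(P_DELTA, {"f"}, M_STATES, M_DELTA, {0})
print([w for w in ["", "ab", "aabb", "aab", "aaaabbbb"]
       if accepts(DELTA, ("p", 0), ACCEPT, w)])
```

The product machine accepts exactly the sample words that are in both languages; `ab` is accepted by the PDA alone but rejected by the product because it has an odd number of a's.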
Conclusion
Since $P'$ recognizes $C \cap R$
it follows that $C \cap R$ is
context-free.
Note: this is a model for the construction of a machine that simulates the working of other (two in
our case) machines.
Part 2
Consider the regular language $R = a^*b^*c^*$
and let $B = A \cap R = \{a^nb^nc^n \mid n \ge 0\}$. If $A$ were context-free then
according to Part 1 $B$ would be a
context-free language. But we know that
$\{a^nb^nc^n \mid n \ge 0\}$ is not a context-free
language. Hence, $A$ is not a context-free
language.
Note: we will use the pumping lemma for CFL to prove that
$\{a^nb^nc^n \mid n \ge 0\}$ is not context-free.
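The identity used in Part 2, that intersecting the "equal counts" language with $a^*b^*c^*$ leaves exactly $\{a^nb^nc^n\}$, can be checked exhaustively on short words. The sketch below assumes the names $A$, $R$, and $B$ for these three languages; it is a sanity check up to length 6, not a proof.

```python
import re
from itertools import product

def in_A(w):
    """Equal numbers of a's, b's, and c's."""
    return w.count("a") == w.count("b") == w.count("c")

def in_R(w):
    """Membership in the regular language a*b*c*."""
    return re.fullmatch(r"a*b*c*", w) is not None

def in_B(w):
    """Membership in {a^n b^n c^n | n >= 0}."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

def all_words(alphabet, max_len):
    """Every word over the alphabet of length at most max_len."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

# Words on which "in A and in R" disagrees with "in B"; expected to be empty.
MISMATCHES = [w for w in all_words("abc", 6)
              if (in_A(w) and in_R(w)) != in_B(w)]
print(MISMATCHES)
```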
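Other closure properties
Theorem: The class of CFL is closed under union,
concatenation, and star.
Proof: by construction. Consider two CFL, $L_1$, $L_2$,
specified by the CFG $G_1 = (V_1, \Sigma, R_1, S_1)$, $G_2 = (V_2, \Sigma, R_2, S_2)$,
where $V_1 \cap V_2 = \emptyset$.
1. $G_\cup = (V_1 \cup V_2 \cup \{S\}, \Sigma, R_1 \cup R_2 \cup \{S \to S_1 \mid S_2\}, S)$, where $S$ is a
new nonterminal symbol, specifies $L_1 \cup L_2$.
2. $G_\cdot = (V_1 \cup V_2 \cup \{S\}, \Sigma, R_1 \cup R_2 \cup \{S \to S_1S_2\}, S)$, where $S$
is a new nonterminal symbol, specifies $L_1L_2$.
3. $G_* = (V_1 \cup \{S\}, \Sigma, R_1 \cup \{S \to S_1S \mid \varepsilon\}, S)$, where $S$ is a new
nonterminal symbol, specifies $L_1^*$.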
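The three grammar constructions can be sketched as follows. The encoding is an assumption for illustration: a grammar is a dictionary from nonterminal to a list of right-hand sides (tuples of symbols, `()` for $\varepsilon$), the two grammars use disjoint nonterminals, and `"S"` is fresh. The helper `words_up_to` is a hypothetical bounded enumerator used only to inspect the result.

```python
def union(g1, s1, g2, s2, start="S"):
    """New start rule S -> S1 | S2."""
    return {**g1, **g2, start: [(s1,), (s2,)]}

def concatenation(g1, s1, g2, s2, start="S"):
    """New start rule S -> S1 S2."""
    return {**g1, **g2, start: [(s1, s2)]}

def star(g1, s1, start="S"):
    """New start rules S -> S1 S | epsilon."""
    return {**g1, start: [(s1, start), ()]}

def words_up_to(grammar, start, n):
    """Terminal words of length <= n derivable from start (fixed point)."""
    lang = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for a, bodies in grammar.items():
            for body in bodies:
                partial = {""}
                for sym in body:
                    choices = lang[sym] if sym in grammar else {sym}
                    partial = {w + c for w in partial for c in choices
                               if len(w + c) <= n}
                if not partial <= lang[a]:
                    lang[a] |= partial
                    changed = True
    return lang[start]

G1 = {"S1": [("a", "S1", "b"), ()]}   # {a^n b^n | n >= 0}
G2 = {"S2": [("c", "S2"), ("c",)]}    # {c^n | n >= 1}

print(sorted(words_up_to(union(G1, "S1", G2, "S2"), "S", 4)))
print(sorted(words_up_to(concatenation(G1, "S1", G2, "S2"), "S", 3)))
print(sorted(words_up_to(star(G1, "S1"), "S", 4)))
```

Each construction only adds one or two rules for the new start symbol, which is why the requirement that the nonterminal sets be disjoint matters: otherwise rules of one grammar could be applied inside derivations of the other.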
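Corollary
Each language that differs from a context-free
language by a regular language is context-free.
Proof: by construction. If $L$ is context-free and $L'$ differs from $L$ by a
regular language $R$ then either $L' = L \cup R$ or $L' = L \setminus R$.
1. If $L' = L \cup R$ then $L'$ is a CFL by CFL closure for union
(every regular language is context-free).
2. If $L' = L \setminus R$ then $L' = L \cap \overline{R}$ by the definition of the set-subtraction
operator.
3. Since $\overline{R}$ is regular it follows by the solution of problem 2.18
that $L \cap \overline{R}$ is a CFL.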
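Homomorphism
Let $\Sigma$ and $\Delta$ be two alphabets.
A homomorphism is a function $h : \Sigma \to \Delta^*$
assigning words in $\Delta^*$ to symbols in $\Sigma$.
Extension: $h$ can be extended to $h : \Sigma^* \to \Delta^*$ by:
$h(\varepsilon) = \varepsilon$ and for all $w \in \Sigma^*$, $a \in \Sigma$,
$h(wa) = h(w)h(a)$.
$h$ may be extended to operate on languages:
for $L \subseteq \Sigma^*$, $h(L) = \{h(w) \mid w \in L\}$.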
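A homomorphism and its two extensions (to words, then to languages) can be sketched in a few lines. The dictionary `H` and the function names are illustrative assumptions; only finite languages can be passed to `image`.

```python
def extend_homomorphism(h):
    """Extend h (a dict symbol -> word) to words, symbol by symbol."""
    def h_word(word):
        return "".join(h[a] for a in word)
    return h_word

def image(h, language):
    """Apply the extended homomorphism to every word of a finite language."""
    h_word = extend_homomorphism(h)
    return {h_word(w) for w in language}

# Hypothetical homomorphism from {a, b} to words over {0, 1}; b is erased.
H = {"a": "01", "b": ""}
h_word = extend_homomorphism(H)
print(h_word("aba"))
print(image(H, {"ab", "ba", "b"}))
```

Distinct words may have the same image (here `ab` and `ba` both map to `01`), so $h(L)$ can be smaller than $L$.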
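Substitution
Consider again two alphabets, $\Sigma$ and $\Delta$.
A substitution is a function $\sigma : \Sigma \to 2^{\Delta^*}$.
A substitution assigns to each letter of $\Sigma$ a
(possibly infinite) set of words over $\Delta$.
Extension: by element-wise application:
$\sigma(\varepsilon) = \{\varepsilon\}$; for all $w \in \Sigma^*$, $a \in \Sigma$,
$\sigma(wa) = \sigma(w)\sigma(a)$.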
Note
A homomorphism $h$ is a particular case
of a substitution $\sigma$ where
$\sigma(a) = \{h(a)\}$ for all $a \in \Sigma$.
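A substitution can be sketched the same way when every letter is mapped to a finite set of words (a real substitution may assign infinite sets, which this illustration does not cover). The names `SIGMA`, `substitute_word`, and `substitute_language` are assumptions for the example; the last lines show a homomorphism recast as a substitution into singleton sets.

```python
def substitute_word(sigma, word):
    """All words obtained by replacing each letter a with a word of sigma[a]."""
    result = {""}
    for a in word:
        result = {w + piece for w in result for piece in sigma[a]}
    return result

def substitute_language(sigma, language):
    """Union of the substitutions of every word of a finite language."""
    return set().union(*(substitute_word(sigma, w) for w in language))

# Hypothetical finite substitution from {a, b} to sets of words over {0, 1}.
SIGMA = {"a": {"0", "00"}, "b": {"1"}}
print(substitute_word(SIGMA, "ab"))
print(substitute_language(SIGMA, {"ab", "aa"}))

# A homomorphism h viewed as the substitution a -> {h(a)}.
H = {"a": "01", "b": ""}
SIGMA_H = {a: {word} for a, word in H.items()}
print(substitute_word(SIGMA_H, "aba"))
```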
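Operations on languages
For $L \subseteq \Sigma^*$, $\sigma : \Sigma \to 2^{\Delta^*}$:
1. $\sigma(L) = \bigcup_{w \in L} \sigma(w)$;
2. $\sigma$ is regular if $\sigma(a)$ is regular for each $a \in \Sigma$.
3. $\sigma$ is context-free if $\sigma(a)$ is a CFL for each $a \in \Sigma$.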
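Fact
For each $L_1, L_2 \subseteq \Sigma^*$, and for each substitution
(homomorphism) $\sigma$:
1. $\sigma(L_1 \cup L_2) = \sigma(L_1) \cup \sigma(L_2)$
2. $\sigma(L_1L_2) = \sigma(L_1)\sigma(L_2)$
3. $\sigma(L_1^*) = \sigma(L_1)^*$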
Theorem 2.39 (Fleck)
The class of CFL is closed under substitution
and homomorphism.
Proof: by construction.
Consider a CFL $L \subseteq \Sigma^*$ and
$\sigma$ a CF substitution.
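Proof, continuation
1. There exist a CFG $G = (V, \Sigma, R, S)$ with $L = L(G)$, and for each $a \in \Sigma$
a CFG $G_a = (V_a, \Delta, R_a, S_a)$ with $\sigma(a) = L(G_a)$.
2. We may assume without loss of generality that $V \cap V_a = \emptyset$
and $V_a \cap V_b = \emptyset$ for $a \ne b$.
3. Construct $G' = (V', \Delta, R', S)$, where $V' = V \cup \bigcup_{a \in \Sigma} V_a$,
$R' = R'' \cup \bigcup_{a \in \Sigma} R_a$, and $R''$ consists of the rules of $R$
where each terminal $a$ is replaced by $S_a$.
4. $G'' = (V', \Delta, R'', S)$ is a CFG and derivations initiated in $S$
are strings in $L$ where each terminal $a$ is replaced by $S_a$.
5. Derivations in (4) may be continued with rules in $R_a$
obtaining words of $\sigma(L)$, thus $L(G') = \sigma(L)$.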
Theorem 2.40 (Fleck)
If $L$ is a CFL and $T$ is a
transducer then $T(L)$ is
a CFL.
Proof idea: from DGSM $T$ and PDA $P$ that accepts $L$
construct PDA $P'$ to accept $T(L)$:
1. Embed both $T$ and $P$ as sub-machines of $P'$.
2. $P'$ generates hypothetical input sequences for $T$ and $P$.
3. Monitor computations of $T$ and $P$ so that when $P$ accepts an
input and $T$'s output for it is $P'$'s input, $P'$ accepts.
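Most general CFL
Dyck languages are the most general CFL
and are defined over alphabets with an even
number of letters that are explicitly paired.
For $n \ge 1$ we assume an alphabet $\Sigma_n = \{a_1, \bar{a}_1, \ldots, a_n, \bar{a}_n\}$ where
pairing is explicitly given.
Dyck language $D_n$ over $\Sigma_n$ is the language of
the CFG with one non-terminal, $S$, and $n + 2$
productions, $S \to \varepsilon$, $S \to SS$ and
$S \to a_i S \bar{a}_i$, for $1 \le i \le n$.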
Intuitive interpretation
Given $n$ pairs of different parentheses, the
Dyck language $D_n$ is the collection of all
well-formed parenthesis sequences.
Example: for $n = 2$, take $a_1 = ($, $\bar{a}_1 = )$, $a_2 = [$ and $\bar{a}_2 = ]$.
Well-formed string
The properties that characterize a well-formed
parenthesis string:
1. each left parenthesis in the string is eventually
followed by a matching right parenthesis
2. each matched pair of left and right parentheses
encloses a well-formed string
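The two properties of a well-formed parenthesis string translate directly into a stack-based membership test for a Dyck language. The function name `is_dyck` and the `pairs` dictionary (left symbol mapped to its matching right symbol) are assumptions made for this sketch.

```python
def is_dyck(word, pairs):
    """Return True if word is well formed with respect to the given pairing."""
    closers = set(pairs.values())
    stack = []  # right symbols that are still expected, innermost last
    for symbol in word:
        if symbol in pairs:
            stack.append(pairs[symbol])
        elif symbol in closers:
            # A right symbol must match the most recent unmatched left symbol.
            if not stack or stack.pop() != symbol:
                return False
        else:
            return False  # symbol is not in the paired alphabet
    return not stack      # every left symbol must have been matched

PAIRS = {"(": ")", "[": "]"}
for w in ["([])[]", "", "([)]", "(()", ")("]:
    print(repr(w), is_dyck(w, PAIRS))
```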
Theorem 2.41 (Fleck)
A language $L$ is a CFL iff there is a Dyck
language $D_n$, a regular language $R$,
and a homomorphism $h$
so that $L = h(D_n \cap R)$.
Proof:
by construction (see Fleck 288–289)
Pumping lemma for CFL
The second mechanism for proving that a
given language is not context-free is the
pumping lemma for CFL.
This mechanism is similar to the pumping
lemma used for proving that a given language
is not regular.
Here we present the corresponding lemma for
context-free languages.
Informal
The pumping lemma for context-free languages states that every CFL
has a specific value, called the pumping length, such that all strings
in the language of at least that length can be pumped.
However, the meaning of pumping is a bit more complex than in
the case of regular languages.
Here pumping means that a string can be divided into five parts
so that the second and fourth parts may be repeated together any number
of times and the resulting string is still in the language.
Pumping lemma for CFL
Theorem 2.34: If $L$ is a context-free language, then
there is a number $p$ (the pumping length) where,
if $s \in L$ and $|s| \ge p$, then $s$ may be divided into
five pieces, $s = uvxyz$,
satisfying the conditions:
1. For each $i \ge 0$, $uv^ixy^iz \in L$
2. $|vy| > 0$
3. $|vxy| \le p$
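To make the three conditions concrete, the sketch below pumps one division $s = uvxyz$ of a string from the context-free language $\{a^nb^n\}$. The language, the chosen division, the value $p = 2$, and the function names are assumptions for illustration; condition 1 is only tested for finitely many $i$, so this illustrates the statement rather than proving anything.

```python
def pump(u, v, x, y, z, i):
    """Return the pumped string u v^i x y^i z."""
    return u + v * i + x + y * i + z

def satisfies_conditions(parts, p, member, max_i=4):
    """Check conditions 1-3 for one division (condition 1 only for i <= max_i)."""
    u, v, x, y, z = parts
    stays_in_language = all(member(pump(u, v, x, y, z, i))
                            for i in range(max_i + 1))
    return stays_in_language and len(v + y) > 0 and len(v + x + y) <= p

def in_anbn(w):
    """Membership in {a^n b^n | n >= 0}."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

# s = aaabbb divided so that v and y straddle the a/b boundary.
GOOD = ("aa", "a", "", "b", "bb")
# The same string divided so that only a's are pumped.
BAD = ("aa", "a", "", "", "bbb")
print(satisfies_conditions(GOOD, 2, in_anbn))
print(satisfies_conditions(BAD, 2, in_anbn))
```

The second division fails because pumping changes the number of a's but not the number of b's; the lemma only promises that some division works, not every one.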
Interpretation
Condition 1 states that the length of strings in $L$
can be unlimited but they have a fixed structure, $uv^ixy^iz$.
Condition 2 says that in the structure $uvxyz$
either $v$ or $y$ is not empty; otherwise the theorem
would be trivially true.
Condition 3 states that the pieces $v$, $x$, $y$ together
have length at most $p$; this is useful for
proving that some languages are not
context-free.
Proof idea
Let $L$ be a CFL and $G$ be the CFG generating $L$.
We must show that any sufficiently long $s \in L$
can be pumped and remains in $L$.
Proving steps:
1. Because $s \in L$, $s$ is derivable from $G$
and has a derivation tree $\tau$.
2. The derivation tree $\tau$
of $s$ must be very tall because $s$ is very long.
3. This means that $\tau$
contains some long path from the start variable at
the root to one of the terminal symbols at a leaf.
4. On this long path some variable symbol $R$
must be repeated
because of the pigeonhole principle.
Note
Repetition of $R$ in $\tau$
noticed before allows us
to replace the subtree under the second
occurrence of $R$ with the subtree under the
first occurrence of $R$ and still get a legal
derivation tree.
Hence, we may cut $s$ into five pieces, $uvxyz$, as
shown in Figure 1, and we may repeat the
second and fourth piece and obtain a string
$uv^ixy^iz \in L$, for any $i \ge 0$.
Tall derivation trees
[Figure 1: Very tall trees]
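Formal proof
Let $G$ be a CFG that generates $L$.
Let $b$ be the maximum number of symbols in
the right-hand side of a rule of $G$, i.e.
$b = \max\{|w| : A \to w \text{ is a rule of } G\}$; we may assume $b \ge 2$.
In any derivation tree using $G$ we know that a node cannot have
more than $b$ children; i.e., at most $b$ leaves are in step 1 from the start variable; at
most $b^2$ leaves in step 2; at most $b^h$ in step h.
If the height of $\tau$ is at most $h$,
the length of the string generated is
at most $b^h$.
If $|V|$ is the number of variables in $G$, then we set $p$
to $b^{|V|+1}$.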
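Observation
If $s \in L$, $|s| \ge p$, then $|s| \ge b^{|V|+1} > b^{|V|}$
and thus $\tau$ must be at least $|V| + 1$
high because a tree of height at most $|V|$ generates at most $b^{|V|}$ symbols.
Conclusion:
the derivation tree of any string from $L$
of length at least $p$ requires a height of at least $|V| + 1$.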
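The counting argument can be worked through on a small grammar. The sketch below assumes the choice $p = b^{|V|+1}$, where $b$ is the longest right-hand side and $|V|$ the number of variables (some presentations use a larger exponent); the grammar `G` and the function names are hypothetical.

```python
def pumping_length(grammar):
    """Return (b, number of variables, p) with p = b ** (|V| + 1)."""
    b = max(len(body) for bodies in grammar.values() for body in bodies)
    return b, len(grammar), b ** (len(grammar) + 1)

def max_yield(b, height):
    """Upper bound on the number of leaves of a tree of the given height."""
    return b ** height

# Hypothetical grammar: nonterminal -> list of right-hand sides.
G = {"S": [("a", "S", "b"), ("T",)], "T": [("c", "T"), ("c",)]}
b, n_vars, p = pumping_length(G)
print(b, n_vars, p)
# A tree of height at most |V| yields fewer than p symbols, so any
# string of length >= p needs a tree of height at least |V| + 1.
print(max_yield(b, n_vars), "<", p)
```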
Formal proof, continuation
Let $s \in L$, $|s| \ge p$. We show how to pump $s$.
Let $\tau$ be the derivation tree of $s$; if $s$ has several derivation trees
we choose $\tau$ to be the tree with the smallest number of nodes.
The longest path in $\tau$ has at least $|V| + 1$ nodes labeled by
variables because only the leaf is a terminal.
Since there are only $|V|$ variables in $G$, some variable $R$ appears
more than once on the longest path in $\tau$; we may assume that $R$
is the variable closest to the leaf that repeats on this path, so both
occurrences lie among the bottom $|V| + 1$ variables.
Formal proof, continuation
Now we divide $s$ into $uvxyz$ according to Figure 1.
Each occurrence of $R$ has a subtree that generates a portion of $s$.
The upper occurrence of $R$ has the larger subtree and generates
$vxy$, whereas the lower occurrence generates just $x$.
Since both trees mentioned above are generated by $R$ we may
substitute one for the other and still obtain a valid derivation tree.
Replacing the smaller tree by the larger repeatedly gives
derivation trees for the string $uv^ixy^iz$, at each $i > 1$; replacing the
larger tree by the smaller one generates the string $uxz$, at $i = 0$.
This establishes condition 1 of the lemma.
Condition 2
To get condition 2 we must ensure that not both $v$
and $y$ are $\varepsilon$.
If both $v$ and $y$ were $\varepsilon$ the derivation tree obtained by substituting
the smaller tree for the larger tree would have fewer nodes than $\tau$
and would still generate $s$.
This is impossible because we have chosen $\tau$ to be the
derivation tree for $s$ with the smallest number of nodes.
Condition 3
Now we need to be sure that $|vxy| \le p$.
In the derivation tree $\tau$
the upper occurrence of $R$ generates $vxy$.
We chose $R$ so that both occurrences fall within the bottom $|V| + 1$
variables on the path and we chose the longest path in $\tau$.
Hence, the subtree where $R$ generates $vxy$ is at most $|V| + 1$
high.
A tree of this height can generate a string of length at most $b^{|V|+1} = p$.
Example 1
Use the pumping lemma to show that the language
$B = \{a^nb^nc^n \mid n \ge 0\}$ is not context-free.
Proof: assume that $B$ is context-free to obtain a contradiction.
Let $p$ be the pumping length for $B$ that is guaranteed to exist by the
pumping lemma.
Consider the string $s = a^pb^pc^p \in B$ of length at least $p$.
The pumping lemma states that $s$ can be pumped and here we
show that it cannot be pumped.
I.e., we will show that no matter how we divide $s$ into $uvxyz$ one of
the three conditions of the pumping lemma is violated.
Checking condition 2
Condition 2 stipulates that either $v$ or $y$ is not empty. There are two
cases to examine:
1. When both $v$ and $y$ contain only one type of symbols (a,b,c):
$v$ does not contain both a's and b's or both b's and c's; the same
holds for $y$. In this case $uv^2xy^2z$ cannot contain equal numbers of
a's, b's and c's, hence it cannot be in $B$, which violates condition 1
of the lemma.
2. When either of $v$, $y$ contains more than one type of symbols (a,b,c):
$uv^2xy^2z$ may contain equal numbers of a's, b's, c's but they don't
come in the right order. Hence, it cannot be in $B$ either.
Note
One of the cases enumerated before must occur.
Because both cases result in a contradiction, a contradiction is unavoidable, so the assumption that $B$ is a
CFL must be false.
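For a fixed small value of the pumping length, the case analysis of Example 1 can be confirmed by brute force: enumerate every division $s = uvxyz$ with $|vy| > 0$ and $|vxy| \le p$ and check that each one leaves the language for some $i$. The sketch assumes the language $\{a^nb^nc^n\}$, the value $p = 3$, and hypothetical function names; it checks one value of $p$ and is therefore an illustration, not a replacement for the proof.

```python
def in_B(w):
    """Membership in {a^n b^n c^n | n >= 0}."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n

def decompositions(s, p):
    """All divisions s = uvxyz with |vy| > 0 and |vxy| <= p."""
    n = len(s)
    for i in range(n + 1):
        for j in range(i, n + 1):
            for k in range(j, n + 1):
                for l in range(k, min(i + p, n) + 1):
                    if (j - i) + (l - k) > 0:
                        yield s[:i], s[i:j], s[j:k], s[k:l], s[l:]

def can_be_pumped(s, p, member, max_i=3):
    """True if some division keeps u v^i x y^i z in the language for i <= max_i."""
    return any(
        all(member(u + v * i + x + y * i + z) for i in range(max_i + 1))
        for u, v, x, y, z in decompositions(s, p)
    )

P = 3
S = "a" * P + "b" * P + "c" * P
print(can_be_pumped(S, P, in_B))
```

Testing $i \in \{0, 1, 2, 3\}$ is enough here because the argument on the slides already finds a violation at $i = 2$ in both cases.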
Example 2
Use the pumping lemma to show that $C = \{a^ib^jc^k \mid 0 \le i \le j \le k\}$ is not a
context-free language.
Proof: assume that $C$ is CFL and obtain a contradiction. Let $p$ be the
pumping length and $s = a^pb^pc^p$. We will try to pump it down and pump
it up.
Let $s = uvxyz$ and again consider two cases:
1. Both $v$ and $y$ contain only one of the symbols a,b,c
2. Either $v$ or $y$ contains more than one of the symbols a,b,c
Case 1
Because $v$ and $y$ each contain only one symbol, at least one of the symbols a, b, or c
appears in neither $v$ nor $y$.
1. The a's do not appear in $v$ and $y$. Consider the string
$uv^0xy^0z = uxz$, which contains the same number of a's as $s$ but
contains fewer b's or fewer c's. Therefore it
is not a member of $C$.
2. The b's do not appear in $v$ and $y$. Since not both $v$ and $y$ may be
empty strings, a's or c's must appear in $v$ or $y$. If a's appear the
string $uv^2xy^2z$ contains more a's than b's so it is not in $C$; if c's
appear, the string $uv^0xy^0z$ contains more b's than c's so it is not in
$C$. Either way we obtain a contradiction.
3. The c's do not appear. Then $uv^2xy^2z$ contains more a's or b's
than c's so it is not in $C$, and a contradiction occurs.
Case 2
When either $v$ or $y$ contains more than one of a, b, c, then $uv^2xy^2z$ will
not contain the symbols a, b, c in the correct order. Hence it cannot be
a member of $C$
and a contradiction occurs.
Conclusion: $s$ cannot be pumped, in violation of the
pumping lemma, and $C$ is not a CFL.
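Example 2 needs both pumping down and pumping up, and two concrete divisions show why. The sketch assumes the language $\{a^ib^jc^k \mid 0 \le i \le j \le k\}$, the string $a^3b^3c^3$, and hypothetical helper names: the first division survives pumping up but fails when pumped down, and the second behaves the other way round.

```python
import re

def in_C(w):
    """Membership in {a^i b^j c^k | 0 <= i <= j <= k}."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if m is None:
        return False
    i, j, k = (len(block) for block in m.groups())
    return i <= j <= k

def pump(u, v, x, y, z, i):
    """Return the pumped string u v^i x y^i z."""
    return u + v * i + x + y * i + z

# v consists of b's, y of c's (the a's do not appear in v or y).
NO_AS = ("aaabb", "b", "", "c", "cc")
# v consists of a's, y of b's (the c's do not appear in v or y).
NO_CS = ("aa", "a", "", "b", "bbccc")

for name, parts in [("no a's", NO_AS), ("no c's", NO_CS)]:
    print(name, "down:", in_C(pump(*parts, 0)), "up:", in_C(pump(*parts, 2)))
```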
Example 3
Use the pumping lemma to show that $D = \{ww \mid w \in \{0,1\}^*\}$
is not a CFL.
Proof: assume that $D$ is CFL and obtain a contradiction. Let $p$ be the
pumping length given by the pumping lemma. However, here the choice
of $s$ is less obvious.
1. Try $s = 0^p10^p1$; $s \in D$, $|s| \ge p$. But we may choose:
$u = 0^{p-1}$, $v = 0$, $x = 1$, $y = 0$, $z = 0^{p-1}1$
and we can see that it can be pumped.
2. Another candidate is $s = 0^p1^p0^p1^p$. We show that this string
cannot be pumped using condition 3 of the pumping lemma.
Using $s = 0^p1^p0^p1^p$
Assume that $s$ can be pumped and set $s = uvxyz$ where $|vxy| \le p$.
We first show that the string $vxy$ must contain the midpoint of $s$. If $vxy$ would occur only
in the first half of $s$, pumping $s$ up to $uv^2xy^2z$ moves a 1 into the first
position of the second half and so it cannot be of the form $ww$.
Similarly if $vxy$ is in the second part of $s$, pumping $s$ up to $uv^2xy^2z$ moves
a 0 into the last position of the first half of $s$ so it cannot be of the
form $ww$.
If $vxy$ contains the midpoint of $s$, when we pump $s$ down to
$uxz$, it has the form $0^p1^i0^j1^p$ where $i$ and $j$ cannot be
both $p$ because then both $v$ and $y$ would have to be empty thus
violating $|vy| > 0$. Hence this string is not of the form $ww$
either.
Hence, $s$ cannot be pumped and $D$
is not a CFL.
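The difference between the two candidate strings of Example 3 can be observed by exhaustive search for a fixed small pumping length. The sketch assumes the language $\{ww \mid w \in \{0,1\}^*\}$ and $p = 3$; the function names are hypothetical, and only $i \le 3$ is tested, so a result of `True` means "survives these tests", not "pumpable for every $i$".

```python
def in_D(w):
    """Membership in {ww | w in {0,1}*}."""
    half, odd = divmod(len(w), 2)
    return odd == 0 and w[:half] == w[half:]

def decompositions(s, p):
    """All divisions s = uvxyz with |vy| > 0 and |vxy| <= p."""
    n = len(s)
    for i in range(n + 1):
        for j in range(i, n + 1):
            for k in range(j, n + 1):
                for l in range(k, min(i + p, n) + 1):
                    if (j - i) + (l - k) > 0:
                        yield s[:i], s[i:j], s[j:k], s[k:l], s[l:]

def can_be_pumped(s, p, member, max_i=3):
    """True if some division keeps u v^i x y^i z in the language for i <= max_i."""
    return any(
        all(member(u + v * i + x + y * i + z) for i in range(max_i + 1))
        for u, v, x, y, z in decompositions(s, p)
    )

P = 3
FIRST = "0" * P + "1" + "0" * P + "1"         # 0^p 1 0^p 1
SECOND = "0" * P + "1" * P + "0" * P + "1" * P  # 0^p 1^p 0^p 1^p
print(can_be_pumped(FIRST, P, in_D))
print(can_be_pumped(SECOND, P, in_D))
```

The first string admits the division with $v$ and $y$ on either side of the first 1, so it gives no contradiction; for the second string every admissible division fails, matching the midpoint argument.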
Example 4
$L = \{a^ib^jc^id^j \mid i, j \ge 0\}$ is not context free.
Note: A PDA would try to match the a's with the c's and the b's with the d's. Because this needs two
comparisons that interfere with one another, they cannot be accomplished with a stack.
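Using pumping lemma
Assume that $L$ is CF and let $p$ be its pumping length. Consider the
string $s = a^pb^pc^pd^p$, $s \in L$, $|s| \ge p$. Then by the pumping lemma
there exist $u, v, x, y, z$ such that $s = uvxyz$ and $uv^ixy^iz \in L$ for any $i \ge 0$,
$|vy| > 0$, and $|vxy| \le p$. Trying to pump $s$ we examine the
following cases:
Case 1: $v$ or $y$ contains two or more different letters.
Case 2: $v$ and $y$ each contain just one kind of letter.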
Case 1: 2 or more letters
1. Suppose that $v$ contains both a's and b's, $v = a^kb^l$, $k, l > 0$. Then $uv^2xy^2z \in L$
by the pumping lemma.
2. But note, $uv^2xy^2z$
has a's following b's, contradicting the definition of $L$.
3. Similarly, assuming that $v$ or $y$
contains any other 2 or more letters we reach a
similar contradiction.
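Case 2: $v$ and $y$ contain one letter only
1. Suppose $v \in a^+$. Since $|vxy| \le p$, $y \in a^* \cup b^*$. Then $uv^2xy^2z \in L$
by the pumping lemma. But it has either more a's than c's or more
b's than d's or both, since $|vy| > 0$, contradicting the definition of
$L$.
2. Suppose $v \in b^+$. Since $|vxy| \le p$, $y \in b^* \cup c^*$. Then $uv^2xy^2z \in L$
by the pumping lemma. But it has either more b's than d's or more
c's than a's or both, since $|vy| > 0$, again contradicting the
definition of $L$.
3. Suppose that $v \in c^+ \cup d^+$, or that $v = \varepsilon$ and $y$ is a nonempty block of one letter.
This can be analyzed as above and the
same contradiction is generated.
Conclusion: $L$ cannot be CF.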