Proof of Pumping Lemma PL Use Example

COSC 2001B
2002 Summer
Proof of Pumping Lemma
• Since there are only n different states, two of q0,
q1, … qn must be the same — say qi = qj, where
0”i<j”n
• Let x = a1…ai
y = ai+1…aj z = aj+1…am
• Since we claim L is regular, there must be a DFA A
such that L = L(A)
• Let A have n states; choose this n for the pumping
lemma
• Let w be a string of length ≥ n in L, say
w = a1a2…am, where m ≥ n
• Let qi be the state A is in after reading the first i
symbols of w
• Then by repeating loop from qi to qj with label
ai+1…aj k times, we can show that xykz is
accepted by A
y
z
ai+1 … aj
– q0 = start state, q1 = δ(q0, a1), q2 = δˆ (q0, a1 a2), etc.
a1 …
qj
…ai
qi
aj+1 … am
x
2002 Summer
COSC 2001
1
2002 Summer
COSC 2001
2
PL Use
• Pumping Lemma because string y is "pumped"
– Note: nature of FA's means cannot control number of times
pumped
– So, regular language with strings of length ≥ n is always
infinite
• PL only interesting for infinite languages
– but works for finite languages, which are always regular – in
this case n is larger than the longest string so nothing can be
pumped
• The PL is an application of the "pigeon-hole principle"
2002 Summer
COSC 2001
3
We use the PL to show a language L is not regular
• Start by assuming L is regular
• Then there must be some n that serves as the PL
constant
– We may not know what n is, but we can work the
rest of the "game" with n as a parameter
• We choose some w that is known to be in L
– Typically, w depends on n
2002 Summer
COSC 2001
4
Example
• Applying the PL, we know w can be broken
into xyz, satisfying the PL properties
• Again, we may not know how to break w,
so we use x, y, z as parameters
• We derive a contradiction by picking i
(which might depend on n, x, y, and/or z)
such that xyiz is not in L
2002 Summer
COSC 2001
5
• Consider the set of strings of 0's whose
length is a perfect square; formally
L = {0i | i is a square}
– We claim L is not regular
– Suppose L is regular. Then there is a constant n
satisfying the PL conditions
– Consider w = 0n2, which is surely in L
– Then w = xyz, where |xy| ” n and y ≠ ε
2002 Summer
COSC 2001
6
1
COSC 2001B
2002 Summer
The PL "game"
• By PL, xyyz is in L. But the length of xyyz is
greater than n2 and no greater than n2 + n (Why?)
• However, the next perfect square after n2 is
(n+ 1)2 = n2 + 2n + 1
• Thus, xyyz is not of square length and is not in L
• Since we have derived a contradiction, the only
unproved assumption – that L is regular – must be
at fault, and we have a "proof by contradiction"
that L is not regular
2002 Summer
Goal: win the PL game against our opponent
by establishing a contradiction of the PL,
while the opponent tries to foil us.
COSC 2001
7
2002 Summer
COSC 2001
8
Four Steps
Four Steps (II)
1. Number of states in automaton is n
Note: we don't have to know what n is, since
we use n to define our string
2. Given n, we pick a string w in L of length
equal to or greater than n
3. Our opponent chooses the decomposition
xyz, subject to
|xy| ” n
|y| ≥ 1
4. We try to pick i (the power factor in xyiz)
in such a way that the pumped string wi is
not in L. If we can do so, we win the game
– We are free to choose any w, subject to w ∈ L
and |w| ≥ n
– We usually define the string in terms of n
2002 Summer
COSC 2001
9
2002 Summer
COSC 2001
10
Example 1
• Σ = {a,b}; consider L = {wwR | w ∈ Σ*}.
– Whatever n the opponent chooses in step 1, we
can always choose a w as follows:
n n n n
|-------|-------|--------|-------|
a…ab…bb…ba…a
|---|---|------------------------|
x y
z
2002 Summer
COSC 2001
11
• Because of this choice and the requirement
that |xy| ” n, the opponent is restricted in
step 3 to choosing a y that consists entirely
of a's.
• In step 4, we use i=2.The string xy2z has
more a's on the left than on the right, so it
cannot be of form wwR. So L is not regular.
2002 Summer
COSC 2001
12
2
COSC 2001B
2002 Summer
Example 2
•
•
•
Example 3
Consider L = {aibi | i ≥ 0}
Given n, we choose the string anbn for our
argument. If the language is regular, our
opponent can break this string into xyz
where for any j ≥ 0 xyjz is in L
Because |xy|” n and |y|> 0, the string y
has to consist of a’s only. So pumping y
adds to the number of a’s and hence there
are more a’s than b’s
2002 Summer
COSC 2001
13
•
Consider L = {w | w has an equal number of
1's and 0's}
• Given n, we choose the string (01)n
• We need to show splitting this string into xyz
where xyiz is in L is impossible…
• But it is possible!
– If x = ε, y = 01, and z = (01)n-1, xyiz is in L for
every value of i.
•
Are we out of luck?
2002 Summer
First law of PL use
COSC 2001
14
Not this time…
If your string does not succeed, try another!
• Let's try 1n0n.
• Again, we need to show splitting this string into xyz
where xyiz is in L is impossible…
• But it is possible!
• … the PL says that our string has to be divided so
that |xy| ” n and |y| > 0
• If |xy|” Q then y must consist only of 0's,
so xyyz ∉ L
• Contradiction! We win!
– If x and y are the empty string and y is 1n0n, then xyiz
always has an equal number of 0's and 1's.
• Are we still in trouble?
2002 Summer
COSC 2001
15
2002 Summer
COSC 2001
16
Example 4
• Consider L = {ww | w ∈ Σ*}
• We choose the string anbanb, where n is the number of
states in the FA. We now show that there is no
decomposition of this string into xyz where for any j ≥ 0
xyjz is in L.
• Again, it is crucial that the PL insists that |xy|” n, because
without it we could could pump the string if we let x and z
be the empty string.
• With this condition, it's easy to show that the PL won't
apply because y must consist only of a's, so xyyz is not in
L.
2002 Summer
COSC 2001
17
As before, the choice of string is critical: had
we chosen anan (which is a member of L)
instead of anbanb, it wouldn't work because it
can be pumped and still satisfy the PL.
MORAL
Choose your strings wisely.
2002 Summer
COSC 2001
18
3
COSC 2001B
2002 Summer
Example 5
• The PL states that xyiz is in L even when i = 0
• So, consider the string xy0z
"Pumping down"
• L = {0i1j | i > j}
• Given n, choose s = 0n+11n.
• Split into xyz… etc.
• Because by the PL, |xy|” n, y consists only
of 0's.
• Is xyyz in L?
2002 Summer
COSC 2001
– Removing string y decreases the number of 0's in s
– s has only one more 0 than 1
– Therefore, xz cannot have more 0's than 1's, and is
not a member of L.
• Contradiction!
19
2002 Summer
•
•
•
•
• You need to find only one string for which
the PL does not hold to prove a language is
not regular
• But you must show that for any
decomposition into xyz the PL holds
L = {a3bici-3 | i > 3}
Assume L is regular, PL holds.
Choose w = a3bncn-3 with n as the PL constant.
Three ways to partition w into xyz:
1. y contains only a’s
2. y contains only b’s
3. y contains a’s and b’s
– This sometimes means considering several
different cases
COSC 2001
20
Example
Remember
2002 Summer
COSC 2001
• Have to show that each of these partitions leads to a
contradiction, i.e. that there is no possible way to
divide w into xyz so that the PL holds
21
2002 Summer
COSC 2001
22
Case1: y contains only a’s
Case 2: y contains only b’s
• Then x contains 0 to 2 a’s, y contains 1 to 3 a’s,
and z contains 0 to 2 a’s concatenated onto the rest
of the string bncn-3, such that there are exactly 3
a’s. So the partition is
x = ak y = aj z = a3-k-jbncn-3
where k • 0, j > 0, and k+j ” 3.
• It should be true that xyiz ∈ L for all i • 0.
• xy2z = (x)(y)(y)(z) = (ak)(aj)(aj)(a3-j-kbncn-3) =
a3+jbncn-3 ∉ L since j > 0, there are too many a’s
• CONTRADICTION!
• Then x contains 3 a’s followed by 0 or more b’s, y
contains 1 to n-3 b’s, and z contains 3 to n-3 b’s
concatenated onto the rest of the string cn-3. So the
partition is
x = a3bk y = bj z = bn-k-jcn-3
where k • 0, j > 0, and k+j ” Q-3.
• It should be true that xyiz ∈ L for all i • 0.
• xy0z = a3bn-jcn-3 ∉ L since j > 0, there are too few b’s
• CONTRADICTION!
2002 Summer
COSC 2001
23
2002 Summer
COSC 2001
24
4
COSC 2001B
2002 Summer
Case 3: y contains a’s and b’s
Closure Properties
• Then x contains 0 to 2 a’s, y contains 1 to 3 a’s and 1
to n-3 b’s, and z contains 3 to n-1 b’s concatenated
onto the rest of the string cn-3. So the partition is
x = a3-k y = akbj z = bn-jcn-3
where 3 • N • 0, and n-3 • M > 0.
• It should be true that xyiz ∈ L for all i • 0.
• xy2z =a3bjakbncn-3 ∉ L since j, k > 0, and there are b’s
before a’s
• There is no partition of w
• L is not regular!
• Certain operations on regular languages are
guaranteed to produce regular languages
• Example: the union of regular languages is
regular:
2002 Summer
COSC 2001
25
– start with RE's, and apply + to get an RE for the
union.
2002 Summer
Aside: Applications
COSC 2001
26
FST: Finite State Transducer
• FST: Finite State Transducer
• Implementing FSTs in Java, C, …
• Solving practical problems with FSTs
• Like DFA, but reads and writes
• Arcs show input and output: "in/out"
• I/O can be complex: "x in Σ-{a}/x"
1/1
Start
q0
x in Σ/x
q1
0/ε
2002 Summer
COSC 2001
27
2002 Summer
FST Implementation
2002 Summer
COSC 2001
28
Clamping FST
1/1
• Need to read/write, track state:
state = 0;
x in Σ/x
Start
q0
until EOF:
in = read()
switch on state:
0/ε
case 0:
write(in); state = 1; end case
case 1:
if in=='0' then state=0;
else write('1'); state=1; end if
end case
end switch
end until
COSC 2001
• derive from FA:
q1
0/0
start
0,1/1
1/1
1/1
0/0
29
2002 Summer
COSC 2001
30
5
COSC 2001B
2002 Summer
Clamping FST in Java
Solving practical problems
int state = 0;
char inchar, outchar;
while ( !in.eof() ) {
outchar = inchar = in.read();
switch( state ) {
case 0:
switch( inchar )
case '0': state = 1; break;
case '1': state = 0; break;
default: // error
}
break;
// ...
}
print(outchar);
}
• RE's used in:
– grep, egrep
– lex – lexical analyser generator
2002 Summer
Lex/Flex

definitions 



rules 



user code 

• generates huge file of tables like:
static yyconst short int yy_accept[12] =
{ 0, 0, 0, 6, 4, 3, 3, 2, 1, 3, 1, 0
};
printf("Int %s\n", yytext);
printf("Op %s\n", yytext);
/* eat up white space */
printf("Unk: %s\n", yytext);
plus functions like yylex, which emulates
an automaton
• tables drive transitions made by yylex
%%
int main(int argc, char *argv[]) {
yylex();
}
2002 Summer
COSC 2001
33
2002 Summer
Swap regular languages for every symbol in the alphabet
of a regular language, and the resulting language is
regular:
∀a∈Σ (for each a in Σ), we define a substitution s:
34
• Reverse of string w = a1a2...an is an...a2a1
– Denoted wR
– Note εR = ε
• Reverse of language L is set containing
reverse of each string in L
= La
s(a)
s(a1a2…an) = s(a1)s(a2)…s(an)
= La1La2 …Lan
where La is RL
extend to strings
s(M)
extend to languages
= ∪w in M s(w)
COSC 2001
Closure Under Reversal
Substitution of RLs into RL is Regular
– Denoted LR
• If L is regular, so is LR
– Proof: use RE's, recursive reversal as in text
Then s(L) is regular
2002 Summer
32
Lex/Flex (II)
%{
#include …
%}
%%
[0-9]+
[-+*/]
[ \t\n]+
.
COSC 2001
COSC 2001
35
2002 Summer
COSC 2001
36
6
COSC 2001B
2002 Summer
Decision Properties of RLs
Finiteness
• Given a representation (e.g., RE, FA) of a regular
language L, what can we tell about L?
• Is L a finite language?
– Have algorithms to convert between representations, so can
choose representation that makes test easiest
• Membership: Is string w in regular language L?
– Use DFA representation for L & simulate DFA on
input w.
• Emptiness: Is L = φ?
– Use DFA representation + graph-reachability
algorithm to test if at least one accepting state is
reachable from start state
2002 Summer
COSC 2001
37
– Note every finite language is regular (why?),
but a regular language is not necessarily finite
• DFA method:
– Given a DFA for L, eliminate all states that are
not reachable from the start state and all states
that do not reach an accepting state
– Test if there are any cycles in the remaining
DFA; if so, L is infinite, if not, then L is finite
2002 Summer
* in RE means infinite language (almost)
•
•
What about 0ε*1 or 0*φ
Follow steps:
?
1. (a) Find and (b) eliminate sub-expressions
equivalent to φ
2. Find sub-expressions equivalent to ε
3. L(R) is infinite if there is a non-ε sub-expression
E*
2002 Summer
COSC 2001
39
a) Find REs equivalent to φ:
if E=F=φ
E+F
if E=φ or F=φ
EF
But never:
ε or a or E*
b) Replace sub-expressions equivalent to φ:
φ+E⇒E
φ*⇒ ε
E+φ⇒E
2002 Summer
Look for non-ε E*
40
• Consider (0 + 1φ)* + 1φ*
1a) φ (twice) and 1φ are sub-expressions equivalent
to φ
1b)Remove these sub-expressions
(0 + 1φ)* ⇒ (0 + φ)* ⇒ 0* and 1φ* ⇒ 1ε
0* + 1ε
Now, if L(R) has a sub-expression E* such that E is
not equivalent to ε, then L is infinite.
COSC 2001
COSC 2001
Example
Sub-expressions equivalent to ε :
E+F iff E=F=ε
EF iff E=F=ε
E* iff E=ε
2002 Summer
38
Find and Eliminate φ
Finite L? — RE Method
•
COSC 2001
remains
2) only sub-expression ε is equivalent to ε
0* + 1
remains
Since 0 is starred, language is infinite.
41
2002 Summer
COSC 2001
42
7
COSC 2001B
2002 Summer
Minimization of States
Distinguishable States
• Key idea: find states p and q that are distinguishable
because there is some input w that takes exactly one
of p and q to an accepting state.
• Basis: any non-accepting state is distinguishable
from any accepting state (w = ε).
• Induction: p and q are distinguishable if there is
some input symbol a such that δ(p, a) is
distinguishable from δ(q, a).
• Real goal is testing equivalence of
(representations of) two regular languages.
• Interesting fact: DFA's have unique (up to
state names) minimum-state equivalents.
– All other pairs of states are indistinguishable, and can be
merged into one state.
2002 Summer
COSC 2001
43
2002 Summer
Example (very simple)
COSC 2001
44
Can we distinguish q from r?
0
• Consider FA
• No string beginning with 0 works.
q
0
p
1
1
1
• No string beginning with 1 works.
r
• p is distinguishable
from q and r by basis
2002 Summer
• Starting in either q or r, as long as
we have input 1, we are in one of the
accepting states. When a 0 is read,
we go to the same state (p) and then
regardless of input, the same states
forever after.
0
COSC 2001
• both states go to p, and therefore any
string of the form 0x takes q and r to
the same state.
45
2002 Summer
0
q
0
p
1
1
1
r
0
COSC 2001
46
Constructing Minimum-State DFA
• For each group of indistinguishable states, pick a
"representative."
– Note a group can be large, e.g., q1, q2,…, qk, if all
pairs are indistinguishable.
– Indistinguishability is transitive (why?) so
indistinguishability partitions states.
• If p is a representative, and δ(p, a) = q, in
minimum-state DFA the transition from p on a is
the representative of q's group (or to q itself, if q
is either alone in a group or a representative).
2002 Summer
COSC 2001
47
• Start state is representative of the original
start state.
• Accepting states are representatives of
groups of accepting states.
– Notice we could not have a "mixed" (accepting
+ non-accepting) group (why?).
• Delete any state that is not reachable from
the start state.
2002 Summer
COSC 2001
48
8
COSC 2001B
2002 Summer
Example
Why Can't Beat This Minimization
• For the DFA given earlier, p is in a group by itself;
{q, r} is the other group.
Proof by contradiction:
• suppose we minimize DFA A, constructing DFA
M
• suppose N is another DFA that accepts same
language as A and M, yet has fewer states than M
• Run state-distinguishability process on states of
M and N together
• Start states of M and N are indistinguishable
because L(M) = L(N)
0
q
0
1
0
p
1
1
q
p
1
0,1
r
0
2002 Summer
COSC 2001
49
2002 Summer
COSC 2001
• If {p,q} are indistinguishable, then their
successors on any one input symbol are also
indistinguishable
• since neither M nor N could have an inaccessible
state, every state of M is indistinguishable from at
least one state of N
• Since N has fewer states than M, there are two
states of M that are indistinguishable from the
same state of N, and therefore indistinguishable
from each other
• But M was designed so that all its states are
distinguishable from each other
• Contradiction! so assumption that N exists is
wrong, and M in fact has as few states as any
equivalent DFA for A
• In fact, there must be a 1-1 correspondence
between the states of any other minimum-state N
and the DFA M, showing that the minimum-state
DFA for A is unique up to renaming of the states.
2002 Summer
2002 Summer
COSC 2001
51
DFA Minimization: The Idea
COSC 2001
50
52
DFA Minimization: The Idea
Background:
Yes, there is a unique minimal DFA
& we can construct it.
– Have seen that single RL has many DFAs.
– Have seen that can simplify DFAs, REs, etc.,
by equivalences.
Minimal:
Minimal number of states.
Questions:
Unique:
– Is there a unique simplest DFA for a RL?
– If so, can we construct it?
2002 Summer
COSC 2001
Unique up to renaming of states.
I.e., has same shape. Isomorphic.
53
2002 Summer
COSC 2001
54
9
COSC 2001B
2002 Summer
Algorithm Idea
DFA Minimization: Algorithm
Identify & collapse states having same behavior.
Build table to compare each unordered pair of
distinct states p,q.
Build equivalence relation on states:
p≡q
Each table entry has
z, δ̂(p,z) ∈ F ↔ δ̂ (q,z) ∈ F
↔
– a “mark” as to whether p and q are known to be
not equivalent, and
– a list of entries, recording dependencies: “If
this entry is later marked, also mark these.”
I.e., iff for every string z, one of the following is true:
z
p
z
p
or
z
q
z
q
2002 Summer
COSC 2001
55
2002 Summer
DFA Minimization: Algorithm
COSC 2001
DFA Minimization: Example
1. Initialize all entries as unmarked and with no
dependencies.
2. Mark all pairs of a final and non-final state.
3. For each unmarked pair p,q and input symbol a:
1
0
0
a
1
b
1
b
1
e
c
0
0
f
1
g
0
d
1
0
c
1
h
d
0
1
1. Let r=δ(p,a), s=δ(q,a).
2. If (r,s) unmarked, add (p,q) to (r,s)’s dependencies,
3. Otherwise mark (p,q), and recursively mark all
dependencies of newly-marked entries.
0
e
1. Initialize table entries:
Unmarked, empty list
h
a
COSC 2001
57
2002 Summer
DFA Minimization: Example
1
e
1
0
0
f
c
1
g
0
0
0
1
d
1
0
2. Mark pairs of final
& non-final states
0
a
h
e
1
c
0
0
f
1
d
g
0
d
1
0
e
f
g
0
3. For each unmarked
pair & symbol, …
f
c
1
0
1
e
h
d
e
f
g
g
h
h
a
2002 Summer
d
δ(b,0) ≡? δ(a,0)
δ(b,1) ≡? δ(a,1)
b
1
b
1
c
1
c
58
1
0
b
1
b
b
COSC 2001
DFA Minimization: Example
1
0
0
a
f
g
4. Coalesce unmarked pairs of states.
5. Delete inaccessible states.
2002 Summer
56
COSC 2001
b
c
d
e
f
g
59
a
2002 Summer
COSC 2001
b
c
d
e
f
g
60
10
COSC 2001B
2002 Summer
DFA Minimization: Example
1
e
1
b
1
b
c
0
0
f
g ≡? b
c ≡? f
1
0
0
a
1
g
0
1
0
0
a
h
e
1
b
1
b
1
c
1
0
f
1
d
0
c
1
h
d
0
e
3. For each unmarked
pair & symbol, …
g
0
1
0
1
0
c
0
d
0
1
Maybe.
No!
d
1
0
DFA Minimization: Example
e
3. For each unmarked
pair & symbol, …
f
f
g
g
h
h
a
2002 Summer
b
c
d
e
f
COSC 2001
g
61
a
2002 Summer
DFA Minimization: Example
1
e
1
b
1
b
c
0
0
f
1
g
0
d
1
0
e
1
1
d
1
0
e
f
g
h
b
c
d
e
f
COSC 2001
63
2002 Summer
1
e
1
0
0
f
c
1
g
0
d
1
0
0
1
0
3. For each unmarked
pair & symbol, …
e
1
0
f
c
0
1
d
g
0
0
3. For each unmarked
pair & symbol, …
(g,a)
COSC 2001
d
e
f
(g,a)
(g,a)
h
(a,e)
b
c
d
e
f
g
h
g
(g,a)
a
f
c
1
0
h
e
Need to mark.
So, mark (g,a) also.
d
1
0
1
e
f
d
64
b
1
b
1
h
g
2002 Summer
0
a
c
1
c
1
0
b
1
b
COSC 2001
DFA Minimization: Example
1
0
(a,e)
a
g
DFA Minimization: Example
b
Maybe.
Yes.
d
3. For each unmarked
pair & symbol, …
f
a
0
g
h
h
a
f
c
1
g
2002 Summer
e
d
0
e
3. For each unmarked
pair & symbol, …
g
0
0
1
0
c
0
0
f
h ≡? b
f ≡? f
b
1
b
1
h
0
1
d
62
1
0
0
a
c
1
c
DFA Minimization: Example
δ(e,0) ≡? δ(a,0)
δ(e,1) ≡? δ(a,1)
1
0
0
a
b
COSC 2001
g
65
(a,e)
a
2002 Summer
COSC 2001
b
c
d
e
f
g
66
11
COSC 2001B
2002 Summer
DFA Minimization: Example
1
0
0
a
1
e
1
0
0
f
0
c
1
d
1
e
1
b
1
b
1
c
h
0
f
1
0
g
f
e
4. Coalesce unmarked
pairs of states.
(g,a)
(a,e)
a
b
c
d
e
f
COSC 2001
g
67
a≡e
b≡h
d≡f
1
1
e
1
0
f
0
c
0
1
d
1
c
1
0
g
d
0
e
5. Delete unreachable
states.
f
1
0
None.
0
ae
1
2002 Summer
bh
1
0
df
1
g
c
0
h
1
g
0
h
1
g
0
2002 Summer
Roll
Class
Text
Text
Char
Studs
h
0
1
df
c
0
0
(a,e)
a
b
c
d
e
f
COSC 2001
g
68
• Notation for recursive description of languages.
• Example:
b
1
1
bh
1
g
Context-Free Grammars
1
0
b
0
ae
DFA Minimization: Example
0
f
1
0
h
a
h
d
g
2002 Summer
c
1
0
e
3. For each unmarked
pair & symbol, …
d
1
0
1
0
0
c
0
d
0
1
0
a
1
0
g
1
0
b
1
b
DFA Minimization: Example
(a,e)
a
b
c
d
COSC 2001
e
f
g
69
→
→
→
→
→
→
<ROLL> Class Studs </ROLL>
<CLASS> Text </CLASS>
Char Text
Char
a … (other chars)
Stud Studs
Studs → ε
Stud → <STUD> Text </STUD>
2002 Summer
Example (II)
COSC 2001
70
Example (III)
• Variables (e.g., Studs) represent sets of
strings (i.e., languages).
• Generates "documents" such as:
<ROLL><CLASS>cs240</CLASS>
<STUD>Sally</STUD>
<STUD>Fred</STUD>
…
</ROLL>
– In sensible grammars, these strings share some
common characteristic or roll.
• Terminals (e.g., a or <ROLL>) = symbols
of which strings are composed
– <ROLL> could be considered either a single
terminal or the concatenation of 6 terminals
2002 Summer
COSC 2001
71
2002 Summer
COSC 2001
72
12
COSC 2001B
2002 Summer
CFG Notation
Example
• Productions = rules of the form
head → body
• A simple example generates strings of 0's
and 1's such that each block of 0's is
followed by at least as many 1's
– head is a variable.
– body is a string of zero or more variables and/or
terminals.
S → AS | ε
• Start Symbol = variable that represents "the
language."
• Notation: G = (V, Σ, P, S)
• Note vertical bar separates different bodies
for the same head.
Σ = terminals
S = start symbol
V = variables
P = productions
2002 Summer
A → 0A1 | A1 | 01
COSC 2001
73
2002 Summer
Derivations
• L(G) = set of terminal strings w such that
S ⇒G w, where S is the start symbol.
– (Subscript with name of grammar, e.g., ⇒G, if nec.)
– Example: 011AS ⇒ 0110A1S
• Notation
• α ⇒* β means string can become β in zero or more
derivation steps.
– Examples:
• 011AS ⇒*011AS (zero steps)
• 011AS ⇒*0110A1S (one step)
• 011AS ⇒*0110011 (three steps)
COSC 2001
75
– a, b,… are terminals; … y, z are strings of terminals.
– Greek letters are strings of variables and/or terminals,
often called sentential forms.
– A,B,… are variables.
– …, Y, Z are variables or terminals.
– S is typically the start symbol.
2002 Summer
• choice of variable to replace at each step
– Derivations may appear different only because we make
the same replacements in a different order.
– To avoid such differences, we may restrict the choice.
• A leftmost derivation always replaces the leftmost
variable in a sentential form.
– Yields left-sentential forms.
76
• Leftmost derivation:
S ⇒ AS ⇒ A1S ⇒ 011S ⇒ 011AS
⇒ 0110A1S ⇒ 0110011S ⇒ 0110011
• Rightmost derivation:
S ⇒ AS ⇒ AAS ⇒ AA ⇒ A0A1
⇒ A0011 ⇒ A10011 ⇒ 0110011
• Rightmost derivation defined analogously.
• ⇒,
etc., used to indicate derivations are
⇒,
rm
lm
leftmost or rightmost.
COSC 2001
COSC 2001
Example
Leftmost/Rightmost Derivations
2002 Summer
74
Language of a CFG
• αAβ ⇒ αγβ whenever there is a production
A→γ
2002 Summer
COSC 2001
• Note we derived the same string
77
2002 Summer
COSC 2001
78
13