Formal Languages, Automata and Computation

F ORMAL L ANGUAGES , AUTOMATA AND
C OMPUTATION
R EGULAR E XPRESSIONS
Carnegie Mellon University in Qatar
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
1 / 26
S UMMARY
Nondeterminism
Clone the FA at choice points
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
2 / 26
S UMMARY
Nondeterminism
Clone the FA at choice points
Guess and verify
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
2 / 26
S UMMARY
Nondeterminism
Clone the FA at choice points
Guess and verify
Nondeterministic FA
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
2 / 26
S UMMARY
Nondeterminism
Clone the FA at choice points
Guess and verify
Nondeterministic FA
Multiple transitions from a state with the same input symbol
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
2 / 26
S UMMARY
Nondeterminism
Clone the FA at choice points
Guess and verify
Nondeterministic FA
Multiple transitions from a state with the same input symbol
-transitions
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
2 / 26
S UMMARY
Nondeterminism
Clone the FA at choice points
Guess and verify
Nondeterministic FA
Multiple transitions from a state with the same input symbol
-transitions
NFAs are equivalent to DFAs
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
2 / 26
S UMMARY
Nondeterminism
Clone the FA at choice points
Guess and verify
Nondeterministic FA
Multiple transitions from a state with the same input symbol
-transitions
NFAs are equivalent to DFAs
Determinization procedure builds a DFA with up to 2k states
for an NFA with k states.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
2 / 26
C LOSURE T HEOREMS
T HEOREM
The class of
regular
languages is
closed under the
union operation.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
3 / 26
C LOSURE T HEOREMS
T HEOREM
The class of
regular
languages is
closed under the
union operation.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
P ROOF I DEA BASED ON NFA S
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
3 / 26
C LOSURE T HEOREMS
T HEOREM
The class of
regular
languages is
closed under the
concatenation
operation.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
4 / 26
C LOSURE T HEOREMS
T HEOREM
The class of
regular
languages is
closed under the
concatenation
operation.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
P ROOF I DEA BASED ON NFA S
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
4 / 26
C LOSURE T HEOREMS
T HEOREM
The class of
regular
languages is
closed under the
star operation.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
5 / 26
C LOSURE T HEOREMS
T HEOREM
The class of
regular
languages is
closed under the
star operation.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
P ROOF I DEA BASED ON NFA S
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
5 / 26
R EGULAR E XPRESSIONS
DFAs are finite descriptions of (finite or infinite)
sets of strings
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
6 / 26
R EGULAR E XPRESSIONS
DFAs are finite descriptions of (finite or infinite)
sets of strings
Finite number of symbols, states, transitions
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
6 / 26
R EGULAR E XPRESSIONS
DFAs are finite descriptions of (finite or infinite)
sets of strings
Finite number of symbols, states, transitions
Regular Expressions provide an algebraic
expression framework to describe the same class
of strings
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
6 / 26
R EGULAR E XPRESSIONS
DFAs are finite descriptions of (finite or infinite)
sets of strings
Finite number of symbols, states, transitions
Regular Expressions provide an algebraic
expression framework to describe the same class
of strings
Thus, DFAs and Regular Expressions are
equivalent.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
6 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
R, R1 , R2 are regular expressions; L(R) denotes
the corresponding regular set
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
R, R1 , R2 are regular expressions; L(R) denotes
the corresponding regular set
Regular Expression
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
Regular Set
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
R, R1 , R2 are regular expressions; L(R) denotes
the corresponding regular set
Regular Expression
φ
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
Regular Set
{}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
R, R1 , R2 are regular expressions; L(R) denotes
the corresponding regular set
Regular Expression
φ
a for a ∈ Σ
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
Regular Set
{}
{a}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
R, R1 , R2 are regular expressions; L(R) denotes
the corresponding regular set
Regular Expression
φ
a for a ∈ Σ
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
Regular Set
{}
{a}
{}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
R, R1 , R2 are regular expressions; L(R) denotes
the corresponding regular set
Regular Expression Regular Set
φ
{}
a for a ∈ Σ
{a}
{}
(R1 ∪ R2 )
L(R1 ) ∪ L(R2 )
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
R, R1 , R2 are regular expressions; L(R) denotes
the corresponding regular set
Regular Expression Regular Set
φ
{}
a for a ∈ Σ
{a}
{}
(R1 ∪ R2 )
L(R1 ) ∪ L(R2 )
(R1 R2 )
L(R1 )L(R2 )
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS
For every regular expression, there is a
corresponding regular set or language
R, R1 , R2 are regular expressions; L(R) denotes
the corresponding regular set
Regular Expression Regular Set
φ
{}
a for a ∈ Σ
{a}
{}
(R1 ∪ R2 )
L(R1 ) ∪ L(R2 )
(R1 R2 )
L(R1 )L(R2 )
∗
(R )
L(R)∗
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
7 / 26
R EGULAR E XPRESSIONS – M ORE S YNTAX
Regular Expression Regular Set
φ
{}
a for ∈ Σ
{a}
{}
(R1 ∪ R2 )
L(R1 ) ∪ L(R2 )
(R1 ◦ R2 )
L(R1 ) ◦ L(R2 )
∗
(R )
L(R)∗
Also some books use R1 + R2 to denote union.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
8 / 26
R EGULAR E XPRESSIONS – M ORE S YNTAX
Regular Expression Regular Set
φ
{}
a for ∈ Σ
{a}
{}
(R1 ∪ R2 )
L(R1 ) ∪ L(R2 )
(R1 ◦ R2 )
L(R1 ) ◦ L(R2 )
∗
(R )
L(R)∗
Also some books use R1 + R2 to denote union.
In (. . .), the parenthesis can be deleted
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
8 / 26
R EGULAR E XPRESSIONS – M ORE S YNTAX
Regular Expression Regular Set
φ
{}
a for ∈ Σ
{a}
{}
(R1 ∪ R2 )
L(R1 ) ∪ L(R2 )
(R1 ◦ R2 )
L(R1 ) ◦ L(R2 )
∗
(R )
L(R)∗
Also some books use R1 + R2 to denote union.
In (. . .), the parenthesis can be deleted
In which case, interpretation is done in the precedence
order: star, concatenation and then union.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
8 / 26
R EGULAR E XPRESSIONS – M ORE S YNTAX
Regular Expression Regular Set
φ
{}
a for ∈ Σ
{a}
{}
(R1 ∪ R2 )
L(R1 ) ∪ L(R2 )
(R1 ◦ R2 )
L(R1 ) ◦ L(R2 )
∗
(R )
L(R)∗
Also some books use R1 + R2 to denote union.
In (. . .), the parenthesis can be deleted
In which case, interpretation is done in the precedence
order: star, concatenation and then union.
R + = RR ∗ and R k for k-fold concatenation are useful shorthands.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
8 / 26
R EGULAR E XPRESSION E XAMPLES
Regular Expression
0∗ 10∗
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
Regular Language
→
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
9 / 26
R EGULAR E XPRESSION E XAMPLES
Regular Expression
0∗ 10∗
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
→
Regular Language
{ω|ω contains a single 1}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
9 / 26
R EGULAR E XPRESSION E XAMPLES
Regular Expression
0∗ 10∗
(0 ∪ 1)∗ 1(0 ∪ 1)∗
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
→
→
Regular Language
{ω|ω contains a single 1}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
9 / 26
R EGULAR E XPRESSION E XAMPLES
Regular Expression
0∗ 10∗
(0 ∪ 1)∗ 1(0 ∪ 1)∗
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
→
→
Regular Language
{ω|ω contains a single 1}
{ω|ω has at least one 1}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
9 / 26
R EGULAR E XPRESSION E XAMPLES
Regular Expression
0∗ 10∗
(0 ∪ 1)∗ 1(0 ∪ 1)∗
0(0 ∪ 1)∗ 0 ∪ 1(0 ∪ 1)∗ 1 ∪ 0 ∪ 1
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
→
→
→
Regular Language
{ω|ω contains a single 1}
{ω|ω has at least one 1}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
9 / 26
R EGULAR E XPRESSION E XAMPLES
Regular Expression
0∗ 10∗
(0 ∪ 1)∗ 1(0 ∪ 1)∗
0(0 ∪ 1)∗ 0 ∪ 1(0 ∪ 1)∗ 1 ∪ 0 ∪ 1
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
→
→
→
Regular Language
{ω|ω contains a single 1}
{ω|ω has at least one 1}
{ω|ω starts and ends
with the same symbol}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
9 / 26
R EGULAR E XPRESSION E XAMPLES
Regular Expression
0∗ 10∗
(0 ∪ 1)∗ 1(0 ∪ 1)∗
0(0 ∪ 1)∗ 0 ∪ 1(0 ∪ 1)∗ 1 ∪ 0 ∪ 1
→
→
→
(0∗ 10∗ 1)∗ 0∗
→
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
Regular Language
{ω|ω contains a single 1}
{ω|ω has at least one 1}
{ω|ω starts and ends
with the same symbol}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
9 / 26
R EGULAR E XPRESSION E XAMPLES
Regular Expression
0∗ 10∗
(0 ∪ 1)∗ 1(0 ∪ 1)∗
0(0 ∪ 1)∗ 0 ∪ 1(0 ∪ 1)∗ 1 ∪ 0 ∪ 1
→
→
→
(0∗ 10∗ 1)∗ 0∗
→
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
Regular Language
{ω|ω contains a single 1}
{ω|ω has at least one 1}
{ω|ω starts and ends
with the same symbol}
{ω|n1 (ω) is even}
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
9 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(0 ∪ 1)∗ 00(0 ∪ 1)∗
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(0 ∪ 1)∗ 00(0 ∪ 1)∗
All strings such that fourth symbol from the end is
a1
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(0 ∪ 1)∗ 00(0 ∪ 1)∗
All strings such that fourth symbol from the end is
a1
(0 ∪ 1)∗ 1(0 ∪ 1)(0 ∪ 1)(0 ∪ 1)
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(0 ∪ 1)∗ 00(0 ∪ 1)∗
All strings such that fourth symbol from the end is
a1
(0 ∪ 1)∗ 1(0 ∪ 1)(0 ∪ 1)(0 ∪ 1)
All strings with no pair of consecutive 0s
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(0 ∪ 1)∗ 00(0 ∪ 1)∗
All strings such that fourth symbol from the end is
a1
(0 ∪ 1)∗ 1(0 ∪ 1)(0 ∪ 1)(0 ∪ 1)
All strings with no pair of consecutive 0s
(1∗ 011∗ )∗ (0 ∪ ) ∪ 1∗
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(0 ∪ 1)∗ 00(0 ∪ 1)∗
All strings such that fourth symbol from the end is
a1
(0 ∪ 1)∗ 1(0 ∪ 1)(0 ∪ 1)(0 ∪ 1)
All strings with no pair of consecutive 0s
(1∗ 011∗ )∗ (0 ∪ ) ∪ 1∗
Strings consist of repetitions of 1 or 01 or two boundary
cases: (1 ∪ 01)∗ (0 ∪ )
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(0 ∪ 1)∗ 00(0 ∪ 1)∗
All strings such that fourth symbol from the end is
a1
(0 ∪ 1)∗ 1(0 ∪ 1)(0 ∪ 1)(0 ∪ 1)
All strings with no pair of consecutive 0s
(1∗ 011∗ )∗ (0 ∪ ) ∪ 1∗
Strings consist of repetitions of 1 or 01 or two boundary
cases: (1 ∪ 01)∗ (0 ∪ )
All strings that do not end in 01.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings with at least one pair of consecutive 0s
(0 ∪ 1)∗ 00(0 ∪ 1)∗
All strings such that fourth symbol from the end is
a1
(0 ∪ 1)∗ 1(0 ∪ 1)(0 ∪ 1)(0 ∪ 1)
All strings with no pair of consecutive 0s
(1∗ 011∗ )∗ (0 ∪ ) ∪ 1∗
Strings consist of repetitions of 1 or 01 or two boundary
cases: (1 ∪ 01)∗ (0 ∪ )
All strings that do not end in 01.
(0 ∪ 1)∗ (00 ∪ 10 ∪ 11) ∪ 0 ∪ 1 ∪ (C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
10 / 26
W RITING R EGULAR E XPRESSIONS
All strings over Σ = {a, b, c} that contain every
symbol at least once.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
11 / 26
W RITING R EGULAR E XPRESSIONS
All strings over Σ = {a, b, c} that contain every
symbol at least once.
(a ∪ b ∪ c)∗ a(a ∪ b ∪ c)∗ b(a ∪ b ∪ c)∗ c(a ∪ b ∪ c)∗ ∪
(a ∪ b ∪ c)∗ a(a ∪ b ∪ c)∗ c(a ∪ b ∪ c)∗ b(a ∪ b ∪ c)∗ ∪
(a ∪ b ∪ c)∗ b(a ∪ b ∪ c)∗ a(a ∪ b ∪ c)∗ c(a ∪ b ∪ c)∗ ∪
(a ∪ b ∪ c)∗ b(a ∪ b ∪ c)∗ c(a ∪ b ∪ c)∗ a(a ∪ b ∪ c)∗ ∪
(a ∪ b ∪ c)∗ c(a ∪ b ∪ c)∗ a(a ∪ b ∪ c)∗ b(a ∪ b ∪ c)∗ ∪
(a ∪ b ∪ c)∗ c(a ∪ b ∪ c)∗ b(a ∪ b ∪ c)∗ a(a ∪ b ∪ c)∗
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
11 / 26
W RITING R EGULAR E XPRESSIONS
All strings over Σ = {a, b, c} that contain every
symbol at least once.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
12 / 26
W RITING R EGULAR E XPRESSIONS
All strings over Σ = {a, b, c} that contain every
symbol at least once.
DFAs and REs may need different ways of looking
at the problem.
For the DFA, you count symbols
For the RE, you enumerate all possible patterns
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
12 / 26
RE I DENTITIES
R∪φ=R
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
13 / 26
RE I DENTITIES
R∪φ=R
R = R = R
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
13 / 26
RE I DENTITIES
R∪φ=R
R = R = R
φ∗ = (C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
13 / 26
RE I DENTITIES
R∪φ=R
R = R = R
φ∗ = Note that we do not have explicit operators for
intersection or complementation!
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
13 / 26
D IGRESSION : RE S IN R EAL LIFE
Linux/Unix Shell, Perl, Awk, Python all have built in regular
expression support for pattern matching functionality
See http://www.wdvl.com/Authoring/
Languages/Perl/PerlfortheWeb/
perlintro2 table1.html
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
14 / 26
D IGRESSION : RE S IN R EAL LIFE
Linux/Unix Shell, Perl, Awk, Python all have built in regular
expression support for pattern matching functionality
See http://www.wdvl.com/Authoring/
Languages/Perl/PerlfortheWeb/
perlintro2 table1.html
Mostly some syntactic extensions/changes to basic regular
expressions with some additional functionality for remembering
matches
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
14 / 26
D IGRESSION : RE S IN R EAL LIFE
Linux/Unix Shell, Perl, Awk, Python all have built in regular
expression support for pattern matching functionality
See http://www.wdvl.com/Authoring/
Languages/Perl/PerlfortheWeb/
perlintro2 table1.html
Mostly some syntactic extensions/changes to basic regular
expressions with some additional functionality for remembering
matches
Substring matches in a string!
Search for and download Regex Coach to learn and experiment
with regular expression matching
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
14 / 26
E QUIVALENCE WITH F INITE AUTOMATA
T HEOREM
A language is regular if and only if some regular expression describes
it.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
15 / 26
E QUIVALENCE WITH F INITE AUTOMATA
T HEOREM
A language is regular if and only if some regular expression describes
it.
L EMMA - T HE if PART
If a language is described by a regular expression, then it is regular
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
15 / 26
E QUIVALENCE WITH F INITE AUTOMATA
T HEOREM
A language is regular if and only if some regular expression describes
it.
L EMMA - T HE if PART
If a language is described by a regular expression, then it is regular
P ROOF I DEA
Inductively convert a given regular expression to an NFA.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
15 / 26
C ONVERTING RE S TO NFA S : BASIS C ASES
Regular Expression Corresponding NFA
φ
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
16 / 26
C ONVERTING RE S TO NFA S : BASIS C ASES
Regular Expression Corresponding NFA
φ
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
16 / 26
C ONVERTING RE S TO NFA S : BASIS C ASES
Regular Expression Corresponding NFA
φ
a for a ∈ Σ
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
16 / 26
C ONVERTING RE S TO NFA S
Union
Let N1 and N2 be NFAs for R1 and R2
respectively. Then the NFA for R1 ∪ R2 is
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
17 / 26
C ONVERTING RE S TO NFA S
Concatenation
Let N1 and N2 be NFAs for R1 and R2
respectively. Then the NFA for R1 R2 is
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
18 / 26
C ONVERTING RE S TO NFA S : S TAR
Star
Let N be NFAs for R. Then the NFA for R∗ is
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
19 / 26
RE
TO
NFA C ONVERSION E XAMPLE
Let’s convert (a ∪ b)∗ aba to an NFA.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
20 / 26
RE
TO
NFA TO DFA
Regular Expression → NFA (possibly with
-transitions)
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
21 / 26
RE
TO
NFA TO DFA
Regular Expression → NFA (possibly with
-transitions)
NFA → DFA via determinization
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
21 / 26
E QUIVALENCE WITH F INITE AUTOMATA
T HEOREM
A language is regular if and only if some regular expression describes
it.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
22 / 26
E QUIVALENCE WITH F INITE AUTOMATA
T HEOREM
A language is regular if and only if some regular expression describes
it.
L EMMA – T HE only if PART
If a language is regular then it is described by a regular expression
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
22 / 26
E QUIVALENCE WITH F INITE AUTOMATA
T HEOREM
A language is regular if and only if some regular expression describes
it.
L EMMA – T HE only if PART
If a language is regular then it is described by a regular expression
P ROOF I DEA
Generalized transitions: label transitions with regular expressions
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
22 / 26
E QUIVALENCE WITH F INITE AUTOMATA
T HEOREM
A language is regular if and only if some regular expression describes
it.
L EMMA – T HE only if PART
If a language is regular then it is described by a regular expression
P ROOF I DEA
Generalized transitions: label transitions with regular expressions
Generalized NFAs (GNFA)
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
22 / 26
E QUIVALENCE WITH F INITE AUTOMATA
T HEOREM
A language is regular if and only if some regular expression describes
it.
L EMMA – T HE only if PART
If a language is regular then it is described by a regular expression
P ROOF I DEA
Generalized transitions: label transitions with regular expressions
Generalized NFAs (GNFA)
Iteratively eliminate states of the GNFA one by one, until only two
states and a single generalized transition is left.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
22 / 26
G ENERALIZED T RANSITIONS
DFAs have single symbols as transition labels
If you are in state p and the next input symbol matches a, go
to state q
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
23 / 26
G ENERALIZED T RANSITIONS
DFAs have single symbols as transition labels
If you are in state p and the next input symbol matches a, go
to state q
Now consider
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
23 / 26
G ENERALIZED T RANSITIONS
DFAs have single symbols as transition labels
If you are in state p and the next input symbol matches a, go
to state q
Now consider
If you are in state p and a prefix of the remaining input
matches the regular expression ab∗ ∪ bc∗ then go to state q
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
23 / 26
G ENERALIZED T RANSITIONS AND NFA
A generalized transition is a transition whose
label is a regular expression
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
24 / 26
G ENERALIZED T RANSITIONS AND NFA
A generalized transition is a transition whose
label is a regular expression
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
24 / 26
G ENERALIZED T RANSITIONS AND NFA
A generalized transition is a transition whose
label is a regular expression
A Generalized NFA is an NFA with generalized
transitions.
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
24 / 26
G ENERALIZED T RANSITIONS AND NFA
A generalized transition is a transition whose
label is a regular expression
A Generalized NFA is an NFA with generalized
transitions.
In fact, all standard DFA transitions are generalized
transitions with regular expressions of a single symbol!
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
24 / 26
G ENERALIZED T RANSITIONS
Consider the 2-state DFA
↓
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
25 / 26
G ENERALIZED T RANSITIONS
Consider the 2-state DFA
↓
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
25 / 26
G ENERALIZED T RANSITIONS
Consider the 2-state DFA
↓
0∗ 1 takes the DFA from state q0 to q1
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
25 / 26
G ENERALIZED T RANSITIONS
Consider the 2-state DFA
↓
0∗ 1 takes the DFA from state q0 to q1
(0 ∪ 10∗ 1)∗ takes the machine from q1 back to q1
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
25 / 26
G ENERALIZED T RANSITIONS
Consider the 2-state DFA
↓
0∗ 1 takes the DFA from state q0 to q1
(0 ∪ 10∗ 1)∗ takes the machine from q1 back to q1
So ?= 0∗ 1(0 ∪ 10∗ 1)∗ represents all strings that take the
DFA from state q0 to q1
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
25 / 26
G ENERALIZED NFA S
Take any NFA and transform it into a GNFA
with only two states: one start and one accept
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
26 / 26
G ENERALIZED NFA S
Take any NFA and transform it into a GNFA
with only two states: one start and one accept
with one generalized transition
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
26 / 26
G ENERALIZED NFA S
Take any NFA and transform it into a GNFA
with only two states: one start and one accept
with one generalized transition
then we can “read” the regular expression from
the label of the generalized transition (as in the
example above)
(C ARNEGIE M ELLON U NIVERSITY IN Q ATAR )
S LIDES FOR 15-453 L ECTURE 4
S PRING 2011
26 / 26