Unit-3(Regular Languages)
Learning Objectives: The student should be able to:
Understand concepts such as regular sets and regular expressions.
Know the identity rules used to simplify regular expressions.
Construct a finite automaton from a regular expression.
Convert a finite automaton to a regular expression.
Learn the pumping lemma for regular sets.
Solve problems related to the pumping lemma.
Learn the closure properties of regular sets.
Regular Expressions
Any language that can be accepted by a finite automaton is called a regular language.
Regular expressions are useful for representing certain sets of strings in an algebraic
fashion. They describe exactly the languages accepted by finite automata, and they are
mainly used for pattern matching.
We give a formal recursive definition of regular expressions over ∑ as follows:
1. Any terminal symbol (i.e., an element of ∑), ^ (the empty string, written Λ later in this
unit) and Φ (the empty set) are regular expressions. When we view a in ∑ as a regular
expression, we denote it by a.
2. The union of two regular expressions R1 and R2, written as R1 + R2, is also a regular
expression.
3. The concatenation of two regular expressions R1 and R2, written as R1 R2, is also a regular
expression.
4. The iteration (closure) of a regular expression R, written as R*, is also a regular expression.
5. If R is a regular expression, then (R) is also a regular expression.
6. The regular expressions over ∑ are precisely those obtained recursively by applying
rules 1-5 one or more times.
Definition: Any set represented by a regular expression is called regular set.
Example3.1: If a, b ∈ ∑, then
(a) a denotes the set {a}
(b) a+b denotes {a, b}
(c) ab denotes {ab}
(d) a* denotes the set {^, a, aa, aaa, ………}
(e) (a+b)* denotes {a, b}*.
Let R1 and R2 denote two regular expressions. Then
a) A string in R1 + R2 is a string from R1 or a string from R2.
b) A string in R1 R2 is a string from R1 followed by a string from R2.
c) A string in R* is a string obtained by concatenating n elements of R for some n ≥ 0.
Consequently:
a) The set represented by R1 + R2 is the union of the sets represented by R1 and R2.
b) The set represented by R1 R2 is the concatenation of the sets represented by R1 and R2.
c) The set represented by R* is {w1 w2 ……… wn | each wi is in the set represented by R and n ≥ 0}.
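The three set operations above can be tried out directly. The following is a minimal Python sketch, not part of the original notes: the function names and the length bound max_len are assumptions, introduced only because R* is infinite and has to be cut off somewhere. The empty string ^ is written "" in the code.

```python
def union(A, B):
    """Set represented by R1 + R2."""
    return A | B

def concat(A, B, max_len):
    """Set represented by R1 R2, keeping only strings of length <= max_len."""
    return {u + v for u in A for v in B if len(u + v) <= max_len}

def star(A, max_len):
    """Set represented by R*, truncated to strings of length <= max_len."""
    result = {""}            # n = 0 contributes the empty string
    frontier = {""}
    while frontier:          # append one more element of A until nothing new appears
        frontier = concat(frontier, A, max_len) - result
        result |= frontier
    return result

a, b = {"a"}, {"b"}
print(sorted(star(a, 3)))             # → ['', 'a', 'aa', 'aaa']
print(sorted(star(union(a, b), 2)))   # → ['', 'a', 'aa', 'ab', 'b', 'ba', 'bb']
```

Every string of length at most max_len in the true set is produced, so the truncated sets can be compared with each other safely.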
Example3.2: Describe the following sets by regular expressions:
(a) {101}
(b) {abba}
(c) {01, 10}
(d) {^, ab}
(e) {abb, a, b, bba}
(f) {^, 0, 00, 000, …….}
(g) {1, 11, 111, …..}
Sol:
(a) {101} is represented by 101.
(b) {abba} is represented by abba.
(c) {01, 10} is the union of {01} and {10}, so {01, 10} is represented by 01+10.
(d) The set {^, ab} is represented by ^+ab.
(e) The set {abb, a, b, bba} is represented by abb+a+b+bba.
(f) {^, 0, 00, 000, ……} is {0}*, so it is represented by 0*.
(g) Any element in {1, 11, 111, …..} can be obtained by concatenating 1 and any element of
{1}*. Hence 1(1)* represents {1, 11, 111, ………}.
Identities for Regular Expressions
Two Regular expressions P and Q are equivalent (P=Q) if P and Q represent the same set of
strings.
The identities for regular expressions are useful for simplifying regular expressions. Here Φ
denotes the empty set and ^ the empty string.
I1   Φ + R = R
I2   ΦR = RΦ = Φ
I3   ^R = R^ = R
I4   ^* = ^ and Φ* = ^
I5   R + R = R
I6   R* R* = R*
I7   RR* = R* R
I8   (R*)* = R*
I9   ^ + RR* = R* = ^ + R* R
I10  (PQ)* P = P(QP)*
I11  (P + Q)* = (P* Q*)* = (P* + Q*)*
I12  (P + Q)R = PR + QR and R(P + Q) = RP + RQ
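An identity can be spot-checked on concrete sets before it is used in a simplification. The sketch below is an illustration only and proves nothing in general: the sets P and Q and the bound N are arbitrary assumptions, and all sets are truncated to strings of length at most N.

```python
def concat(A, B, n):
    """Concatenation of two sets, truncated to strings of length <= n."""
    return {u + v for u in A for v in B if len(u + v) <= n}

def star(A, n):
    """Closure A*, truncated to strings of length <= n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, A, n) - result
        result |= frontier
    return result

N = 6                                  # assumed length bound
P, Q = {"a", "ab"}, {"b", "ba"}        # assumed sample sets
EMPTY, LAMBDA = set(), {""}            # the empty set and the set {empty string}

i2 = concat(EMPTY, P, N) == EMPTY and concat(P, EMPTY, N) == EMPTY
i4 = star(LAMBDA, N) == LAMBDA and star(EMPTY, N) == LAMBDA
i9 = (LAMBDA | concat(P, star(P, N), N)) == star(P, N)
i10 = (concat(star(concat(P, Q, N), N), P, N)
       == concat(P, star(concat(Q, P, N), N), N))
i11 = (star(P | Q, N)
       == star(concat(star(P, N), star(Q, N), N), N)
       == star(star(P, N) | star(Q, N), N))

print(i2, i4, i9, i10, i11)
```

A result of False for any of these on some choice of P and Q would point to a mistake in the code, since the identities hold for all regular expressions.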
Theorem3.1 (Arden's Theorem):
Let P and Q be two regular expressions over ∑. If P does not contain ^, then the
following equation in R, viz.,
R = Q + RP ……………………………….(1)
has a unique solution (i.e., only one solution) given by R = QP*.
Proof:
Q + (QP*)P = Q(^ + P* P) = QP* by I9.
Hence (1) is satisfied when R = QP*. This means R = QP* is a solution of (1).
To prove uniqueness, consider (1).
Replacing R by Q + RP on the RHS repeatedly, we get
Q + RP = Q + (Q + RP)P
= Q + QP + RPP
= Q + QP + RP²
= Q + QP + QP² + ……… + QPⁱ + RPⁱ⁺¹
= Q(^ + P + P² + ……… + Pⁱ) + RPⁱ⁺¹
Hence R = Q(^ + P + P² + ……… + Pⁱ) + RPⁱ⁺¹ for i ≥ 0 ……………….(2)
We now show that any solution of (1) is equivalent to QP*.
Suppose R satisfies (1); then it satisfies (2).
Let w be a string of length i in the set R. Then w belongs to the set
Q(^ + P + P² + ……… + Pⁱ) + RPⁱ⁺¹. As P does not contain ^, RPⁱ⁺¹ has no string of length
less than i+1, and so w is not in the set RPⁱ⁺¹. This means w belongs to the set
Q(^ + P + P² + ……… + Pⁱ) and hence to QP*.
Conversely, let w be in QP*. Then w is in QPᵏ for some k ≥ 0, so w is in
Q(^ + P + P² + ……… + Pᵏ), which by (2) with i = k is contained in R.
Thus R and QP* represent the same set.
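Both halves of Arden's theorem can be observed on small sets. The following Python sketch is an illustration under stated assumptions: the sets Q, P and P_bad and the bound N are made up for the purpose, and every set is truncated to strings of length at most N. It checks that QP* solves R = Q + RP, and that the condition "P does not contain the empty string" matters: when P contains it, the set of all strings is a second solution.

```python
from itertools import product

def concat(A, B, n):
    """Concatenation of two sets, truncated to strings of length <= n."""
    return {u + v for u in A for v in B if len(u + v) <= n}

def star(A, n):
    """Closure A*, truncated to strings of length <= n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, A, n) - result
        result |= frontier
    return result

N = 6                      # assumed length bound
Q = {"b"}
P = {"a", "ab"}            # does not contain the empty string

R = concat(Q, star(P, N), N)               # candidate solution QP*
solves = R == (Q | concat(R, P, N))        # does R satisfy R = Q + RP ?

P_bad = {"", "a"}          # contains the empty string
ALL = {"".join(t) for k in range(N + 1) for t in product("ab", repeat=k)}
also_solves = ALL == (Q | concat(ALL, P_bad, N))   # all strings also solve it
differs = ALL != concat(Q, star(P_bad, N), N)      # ...and differ from QP*

print(solves, also_solves, differs)
```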
Example3.3: (a) Give an RE for representing the set L of strings in which every 0 is
immediately followed by at least two 1's.
(b) Prove that the regular expression R = ^ + 1*(011)*(1*(011)*)* also describes the same set of
strings.
Sol: (a) If w is in L, then either
(i) w does not contain any 0, or
(ii) every 0 in w is immediately followed by 11.
So w can be written as w1 w2 ……. wn, where each wi is either 1 or 011. So L is represented by
the RE (1+011)*.
(b) R = ^ + P1 P1*, where P1 = 1*(011)*
= P1*                 using I9
= (1*(011)*)*
= (P2* P3*)*          letting P2 = 1, P3 = 011
= (P2 + P3)*          using I11
= (1+011)*
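The result of Example 3.3 can be cross-checked by brute force. This is a sketch, not a proof: it uses Python's re module, where union is written | instead of +, and it only tests binary strings up to an assumed length of 8. The helper name every_zero_followed_by_11 is hypothetical.

```python
import re
from itertools import product

def every_zero_followed_by_11(w):
    """Direct test of the defining property of L."""
    return all(w[i + 1:i + 3] == "11" for i, c in enumerate(w) if c == "0")

part_a = re.compile(r"(1|011)*")                   # (1+011)*
part_b = re.compile(r"(|1*(011)*(1*(011)*)*)")     # ^ + 1*(011)*(1*(011)*)*

words = ["".join(t) for k in range(9) for t in product("01", repeat=k)]
all_agree = all(
    bool(part_a.fullmatch(w)) == bool(part_b.fullmatch(w)) == every_zero_followed_by_11(w)
    for w in words
)
print(all_agree)
```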
Example3.4: Prove (1+00*1) + (1+00*1)(0+10*1)*(0+10*1) = 0*1(0+10*1)*
Sol: LHS
= (1+00*1)(^ + (0+10*1)*(0+10*1))      using I12
= (1+00*1)(0+10*1)*                    using I9
= (^+00*)1(0+10*1)*                    using I12 for 1+00*1
= 0*1(0+10*1)*                         using I9
= RHS
Example3.5:
(a) Obtain a regular expression to accept a language consisting of strings of a's and b's
of even length.
Sol: (aa+ab+ba+bb)*
L(R) = {(aa+ab+ba+bb)ⁿ | n ≥ 0}
(b) Obtain a regular expression to accept a language consisting of strings of a's and b's of
odd length.
Sol: (aa+ab+ba+bb)*(a+b)
or (a+b)(aa+ab+ba+bb)*
(c) Obtain a regular expression such that
L(R) = {w | w ∈ {0,1}* with at least three consecutive 0's}
Sol: (0+1)*000(0+1)*
L(R) = {(0+1)ᵐ 000 (0+1)ⁿ | m ≥ 0 and n ≥ 0}
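The three answers of Example 3.5 can be tested against their plain-English descriptions. The sketch below is only a sanity check on strings up to an assumed length of 8; the helper name words is hypothetical, and | is the re module's notation for +.

```python
import re
from itertools import product

even = re.compile(r"(aa|ab|ba|bb)*")             # (aa+ab+ba+bb)*
odd = re.compile(r"(aa|ab|ba|bb)*(a|b)")         # (aa+ab+ba+bb)*(a+b)
three_zeros = re.compile(r"(0|1)*000(0|1)*")     # (0+1)*000(0+1)*

def words(alphabet, max_len):
    """All strings over the alphabet of length 0..max_len."""
    return ("".join(t) for k in range(max_len + 1) for t in product(alphabet, repeat=k))

even_ok = all((len(w) % 2 == 0) == bool(even.fullmatch(w)) for w in words("ab", 8))
odd_ok = all((len(w) % 2 == 1) == bool(odd.fullmatch(w)) for w in words("ab", 8))
zeros_ok = all(("000" in w) == bool(three_zeros.fullmatch(w)) for w in words("01", 8))
print(even_ok, odd_ok, zeros_ok)
```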
Constructing Finite Automata for a given Regular Expression
Here we construct an NFA from a regular expression. Once we have an NFA, we can
easily construct an equivalent DFA, and we know that any language accepted by a DFA is regular.
Since a DFA can be obtained from an NFA, any language accepted by an
NFA is also regular.
Since an NFA can be obtained from a regular expression, the language denoted by a
regular expression is in fact regular.
Construct NFA from the Regular Expression:
Theorem3.2:
Let R be a regular expression. Then there exists a finite automaton M = (Q, ∑, δ, q0, F)
which accepts L(R).
Proof:
By definition, Φ, ^ and a are regular expressions. So, the corresponding machines to
recognise these expressions are shown in fig(1) (a), (b) and (c) respectively.
The schematic representation of a regular expression R to accept the language L(R) is
shown in Fig(ii), where q is the start state and F is the final state of machine M.
[Fig(ii): Schematic representation of FA accepting L(R): machine M with start state q and final state F]
From the definition of regular expressions it is clear that if R and S are regular expressions, then
R+S, R.S and R* are regular expressions, which use the three operators '+', '.' and '*'.
Let us take each separately and construct the equivalent machine.
Let M1 = (Q1, ∑1, δ1, q1, F1) be a machine which accepts the language L(R1) corresponding
to the regular expression R1, with start state q1 and final state f1.
Let M2 = (Q2, ∑2, δ2, q2, F2) be a machine which accepts the language L(R2) corresponding
to the regular expression R2, with start state q2 and final state f2.
Case1: R = R1+R2. We can construct an NFA which accepts either L(R1) or L(R2), which can
be represented as L(R1+R2), as shown in fig(iii).
It is clear from fig(iii) that the machine can accept either L(R1) or L(R2). Here q0 is the start state of the
combined machine and qf is the final state of the combined machine M.
Case2: R = R1.R2. We can construct an NFA which accepts L(R1) followed by L(R2), which can
be represented as L(R1.R2), as shown in fig(iv).
It is clear from fig(iv) that the machine, after accepting L(R1), moves from state q1 to f1. Since there
is a ^-transition, without any input there will be a transition from f1 to state q2. In state q2, upon
accepting L(R2), the machine moves to f2, which is the final state.
Thus q1, which is the start state of machine M1, becomes the start state of the combined machine M,
and M accepts the language L(R1.R2).
Case3: R = (R1)*. We can construct an NFA which accepts L(R1)* as shown in fig v(a). It
can also be represented as shown in fig v(b).
It is clear from the figure that the machine can accept either ^ or any number of strings from L(R1),
thus accepting the language L(R1)*. Here q0 is the start state and qf is the final state.
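The three cases can be sketched in code. The following Python sketch follows the usual Thompson-style wiring with ^-moves; since the figures are not reproduced here, the exact placement of the ^-edges is an assumption, and the names NFA, symbol, union, concat, star, closure and accepts are all hypothetical. A ^-move is stored as an edge whose label is None.

```python
import itertools

_ids = itertools.count()            # fresh state numbers

class NFA:
    """NFA with ^-moves, one start state and one final state."""
    def __init__(self, start, final, edges):
        self.start, self.final = start, final
        self.edges = edges          # list of (from_state, label, to_state); None = ^-move

def symbol(a):                      # basis: machine accepting the single symbol a
    s, f = next(_ids), next(_ids)
    return NFA(s, f, [(s, a, f)])

def union(m1, m2):                  # Case 1: R1 + R2
    s, f = next(_ids), next(_ids)
    extra = [(s, None, m1.start), (s, None, m2.start),
             (m1.final, None, f), (m2.final, None, f)]
    return NFA(s, f, m1.edges + m2.edges + extra)

def concat(m1, m2):                 # Case 2: R1 . R2
    return NFA(m1.start, m2.final, m1.edges + m2.edges + [(m1.final, None, m2.start)])

def star(m):                        # Case 3: R1*
    s, f = next(_ids), next(_ids)
    extra = [(s, None, m.start), (m.final, None, f),
             (s, None, f), (m.final, None, m.start)]
    return NFA(s, f, m.edges + extra)

def closure(nfa, states):
    """All states reachable from the given states by ^-moves alone."""
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for (u, label, v) in nfa.edges:
            if u == p and label is None and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def accepts(nfa, w):
    current = closure(nfa, {nfa.start})
    for c in w:
        moved = {v for (u, label, v) in nfa.edges if u in current and label == c}
        current = closure(nfa, moved)
    return nfa.final in current

# The regular expression ab(a+b)*, built bottom-up from its parts
m = concat(concat(symbol("a"), symbol("b")), star(union(symbol("a"), symbol("b"))))
print(accepts(m, "abba"), accepts(m, "ba"))
```

Each call to symbol creates fresh states, so a sub-machine should be built once per occurrence in the expression rather than reused.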
Example3.6: Obtain an NFA which accepts strings of a's and b's starting with ab.
Sol: The regular expression corresponding to this language is ab(a+b)*.
Step1: The machine to accept ‘a’
Step2: The machine to accept ‘b’
Step3: The machine to accept (a+b)
Step4: The machine to accept (a+b)*.
Step5: The machine to accept ‘ab’
Step6: The machine to accept ab(a+b)*.
Example3.7: Obtain an NFA for the RE a*+b*+c*.
Example3.8: Obtain an NFA for the RE (a+b)* aa(a+b)*.
Sol:
Step1: The machine to accept (a+b)
Step2: The machine to accept (a+b)*
Step3: The machine to accept aa
Step4: The machine to accept aa(a+b)*.
Step5: The machine to accept (a+b)*aa(a+b)*.
Example3.9: Obtain an NFA to accept the following language:
L = {w | w ∈ ababⁿ or abaⁿ, where n > 0}.
Sol:
Construction of FA equivalent to a Regular Expression:
The method used here to construct an FA equivalent to a given regular expression is called the
subset method. It involves two steps.
Step1: Construct a transition graph equivalent to the regular expression using ^-moves.
Step2: Construct the transition table for the transition graph obtained in Step 1, and from it
construct the equivalent DFA. We reduce the number of states if possible.
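Step 2 can be sketched as follows. This Python sketch is an illustration under assumptions: the transition graph for (0+1)*(00+11)(0+1)* is hand-written here (state names q0, q1, q2, qf are made up and the graph happens to need no ^-moves), the function names are hypothetical, and no state reduction is performed. A ^-move, if present, would be stored under the label None.

```python
from itertools import product

def eps_closure(delta, states):
    """States reachable through ^-moves (label None) from the given states."""
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for q in delta.get((p, None), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return frozenset(seen)

def subset_construction(delta, start, finals, alphabet):
    """Build the DFA whose states are sets of NFA states."""
    d_start = eps_closure(delta, {start})
    d_delta, todo, seen = {}, [d_start], {d_start}
    while todo:
        S = todo.pop()
        for a in alphabet:
            moved = {q for p in S for q in delta.get((p, a), ())}
            T = eps_closure(delta, moved)
            d_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    d_finals = {S for S in seen if S & finals}   # subsets containing an NFA final state
    return d_start, d_delta, d_finals

def dfa_accepts(d_start, d_delta, d_finals, w):
    S = d_start
    for c in w:
        S = d_delta[(S, c)]
    return S in d_finals

# Assumed transition graph for (0+1)*(00+11)(0+1)*
nfa_delta = {
    ("q0", "0"): {"q0", "q1"}, ("q0", "1"): {"q0", "q2"},
    ("q1", "0"): {"qf"},       ("q2", "1"): {"qf"},
    ("qf", "0"): {"qf"},       ("qf", "1"): {"qf"},
}
dfa = subset_construction(nfa_delta, "q0", {"qf"}, "01")

words = ["".join(t) for k in range(9) for t in product("01", repeat=k)]
dfa_ok = all(dfa_accepts(*dfa, w) == ("00" in w or "11" in w) for w in words)
print(dfa_ok)
```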
Example3.10: Construction of FA equivalent to regular expression
(0+1)*(00+11)(0+1)*.
Sol:
Conversion of FA to Regular expression:
The following method is used to find the regular expression recognized by a transition system.
The following assumptions are made regarding the transition system:
i. The transition system doesn't have Λ-moves.
ii. It has only one initial state, say v1.
iii. Its vertices are v1, ………, vn.
iv. Vi is the RE representing the set of strings that take the system from the initial state to vi,
i.e., the strings that would be accepted if vi were a final state.
v. α ij denotes the regular expression representing the set of labels of edges from vi to vj.
When there is no such edge, α ij = Φ.
Consequently, we get the following set of equations in V1, ………, Vn:
V1 = V1 α 11 + V2 α 21 + …………. + Vn α n1 + Λ
V2 = V1 α 12 + V2 α 22 + …………. + Vn α n2
…………………………………….
…………………………………….
Vn = V1 α 1n + V2 α 2n + …………. + Vn α nn
By repeatedly applying substitution and Arden's theorem, we can express Vi in terms of the α ij's.
For getting the set of strings recognized by the transition system, we have to take the
union of all Vi's corresponding to final states.
Example3.11: Consider the transition system given in fig. Prove that the strings recognized
are (a+a(b+aa)*b)* a(b+aa)*a
Sol: We can directly apply the method since the graph doesn't contain any Λ-moves and there is
only one initial state.
The three equations for q1, q2 and q3 can be written as
q1 = q1 a + q2 b + Λ,
q2 = q1 a + q2 b + q3 a
and
q3 = q2 a.
Substituting q3 = q2 a in the equation for q2:
q2 = q1 a + q2 b + q2 aa
= q1 a + q2 (b+aa)
= q1 a (b+aa)*                 by Arden's theorem
Substituting q2 in the equation for q1:
q1 = q1 a + q2 b + Λ
= q1 a + q1 a (b+aa)* b + Λ
= q1 (a + a (b+aa)* b) + Λ
Hence, by Arden's theorem,
q1 = Λ(a + a (b+aa)* b)* = (a + a (b+aa)* b)*
q2 = (a + a (b+aa)* b)* a (b+aa)*
q3 = (a + a (b+aa)* b)* a (b+aa)* a
Since q3 is the final state, the set of strings recognized by the transition system is
(a + a (b+aa)* b)* a (b+aa)* a.
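The answer of Example 3.11 can be checked against the transition system itself. In the sketch below the transition table is read off from the three equations (an edge from qi to qj labelled x for every term qi x in the equation for qj), because the figure is not available; that reading, the helper name system_accepts and the test length 9 are assumptions.

```python
import re
from itertools import product

# Transitions read off from the equations for q1, q2, q3
delta = {
    ("q1", "a"): {"q1", "q2"},
    ("q2", "b"): {"q1", "q2"},
    ("q2", "a"): {"q3"},
    ("q3", "a"): {"q2"},
}
start, finals = "q1", {"q3"}

def system_accepts(w):
    """Follow all possible paths of the (nondeterministic) transition system."""
    current = {start}
    for c in w:
        current = {q for p in current for q in delta.get((p, c), ())}
    return bool(current & finals)

# (a+a(b+aa)*b)* a(b+aa)*a in re syntax, where | stands for +
answer = re.compile(r"(a|a(b|aa)*b)*a(b|aa)*a")

words = ["".join(t) for k in range(10) for t in product("ab", repeat=k)]
same_language = all(system_accepts(w) == bool(answer.fullmatch(w)) for w in words)
print(same_language)
```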
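Example3.12: Prove that the FA whose transition diagram is given in fig accepts the set of all
strings over the alphabet {a,b} with an equal number of a's and b's such that each prefix
has at most one more a than b's and at most one more b than a's.
Sol:
q1 = q2 b + q3 a + Λ
q2 = q1 a
q3 = q1 b
q4 = q2 a + q3 b + q4 a + q4 b
Substituting q2 and q3 in the equation for q1:
q1 = q1 ab + q1 ba + Λ = q1 (ab+ba) + Λ
By Arden's theorem,
q1 = Λ(ab+ba)* = (ab+ba)*
Every string in (ab+ba)* is a sequence of blocks ab and ba, so it has an equal number of a's
and b's, and in every prefix the two counts differ by at most one.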
Example3.13: Describe in English the set accepted by the FA whose transition diagram is given
in fig.
Sol: q1 = q1 0 + Λ,
q2 = q1 1 + q2 1
and
q3 = q2 0 + q3 (0+1)
By Arden's theorem,
q1 = Λ 0* = 0*
q2 = q1 1 + q2 1 = 0* 1 + q2 1
= (0* 1)1*
As the final states are q1 and q2, we need not solve for q3:
q1 + q2 = 0* + 0*(11*) = 0*(Λ + 11*) = 0* 1*      by I9.
The set of strings represented by the graph is 0*1*. We can interpret it in
English in the following way:
"The strings accepted by the FA are precisely the strings of any number of 0's followed
by a string of any number of 1's."
PUMPING LEMMA OF REGULAR SETS
The pumping lemma is a powerful tool for proving that certain languages are not regular.
It is also useful in the development of algorithms to answer certain questions concerning
finite automata, such as whether the language accepted by a given FA is finite or infinite.
Here we give a necessary condition for an input string to belong to a regular set. The
result is called the pumping lemma as it gives a method of pumping (generating) many input strings
from a given string. As the pumping lemma gives a necessary condition, it can be used to show
that certain sets are not regular.
Theorem (Pumping Lemma):
Let M = (Q, ∑, δ, q0, F) be an FA with n states. Let L be the regular set accepted by M. Let w ∈ L
and |w| ≥ m. If m ≥ n, then there exist x, y, z such that w = xyz, y ≠ Λ and xyⁱz ∈ L for each i ≥ 0.
Proof:
Let
w = a1 a2 …………. am, m ≥ n
δ(q0, a1 a2 …………. ai) = qi for i = 1, 2, ……, m;
Q1 = {q0, q1, ……………., qm}
i.e., Q1 is the sequence of states in the path with path value w = a1 a2 ….. am.
As there are only n distinct states, at least two states in Q1 must coincide.
Among the various pairs of repeated states, we take the first pair.
Let us take them as qj and qk (qj = qk). Then j and k satisfy the condition
0 ≤ j < k ≤ n.
The string w can be decomposed into three substrings a1 a2 ……. aj, aj+1 ………. ak and ak+1
……… am.
Let x, y, z denote the three strings a1 a2 ……. aj, aj+1 ………. ak and ak+1 ……… am, respectively.
As k ≤ n, |xy| ≤ n and w = xyz.
The automaton M starts from the initial state q0.
On applying the string x, it reaches qj (= qk).
On applying the string y, it comes back to qj (= qk).
So after application of yⁱ for each i ≥ 0, the automaton is in the same state qj.
On applying z, it reaches a final state (since w ∈ L). Hence xyⁱz ∈ L.
As every state in Q1 is obtained by applying an input symbol and j < k, y ≠ Λ.
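The decomposition used in the proof can be carried out mechanically on a DFA. The sketch below is an illustration only: the three-state DFA for 0*1* (with q3 as a dead state) and the function names run and pump_split are assumptions. It finds the first repeated state on the path of w, cuts w = xyz there, and confirms that the pumped strings are still accepted.

```python
def run(delta, state, w):
    """State reached from the given state after reading w."""
    for c in w:
        state = delta[(state, c)]
    return state

def pump_split(delta, start, w):
    """Split w = xyz at the first repeated state (qj = qk, j < k) on the path of w."""
    first_visit = {start: 0}            # state -> number of symbols read on first visit
    state = start
    for k, c in enumerate(w, start=1):
        state = delta[(state, c)]
        if state in first_visit:
            j = first_visit[state]
            return w[:j], w[j:k], w[k:]
        first_visit[state] = k
    raise ValueError("no state repeats: |w| is smaller than the number of states")

# Assumed example DFA accepting 0*1*
delta = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "1"): "q2", ("q2", "0"): "q3",
    ("q3", "0"): "q3", ("q3", "1"): "q3",
}
start, finals = "q1", {"q1", "q2"}

x, y, z = pump_split(delta, start, "0011")
print((x, y, z))                                          # → ('', '0', '011')
pumped_ok = all(run(delta, start, x + y * i + z) in finals for i in range(6))
print(pumped_ok)                                          # → True
```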
Application of Pumping Lemma
This theorem can be used to prove that certain sets are not regular.
Step1: Assume L is regular. Let n be the number of states in the corresponding FA.
Step2: Choose a string w in L such that |w| ≥ n. Use the pumping lemma to write w = xyz, with
|xy| ≤ n and |y| > 0.
Step3: Find a suitable integer i such that xyⁱz ∉ L. This is a contradiction. Hence L is not regular.
Note: The crucial part of the procedure is to find i such that xyⁱz ∉ L. In some cases we prove
xyⁱz ∉ L by considering |xyⁱz|. In some cases we may have to use the structure of the strings in L.
Example3.14: Show that L = {aᵖ | p is prime} is not regular.
Sol:
Step1: We suppose L is regular and get a contradiction. Let n be the number of states in the FA
accepting L.
Step2: Let p be a prime number greater than n. Let w = aᵖ. By the pumping lemma, w can be
written as w = xyz, with |xy| ≤ n and |y| > 0. x, y, z are simply strings of a's. So,
y = aᵐ for some m with 1 ≤ m ≤ n.
Step3: Let i = p+1. Then |xyⁱz| = |xyz| + |yⁱ⁻¹| = p + (i-1)m = p + pm.
By the pumping lemma, xyⁱz ∈ L. But |xyⁱz| = p + pm = p(1+m), and p(1+m) is not a prime. So
xyⁱz ∉ L. This is a contradiction. Thus L is not regular.
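The arithmetic in Step 3 can be checked numerically. This small Python sketch is only an illustration: the primes tried and the helper names is_prime and pumped_length are assumptions, and m stands for |y|.

```python
def is_prime(k):
    """Trial-division primality test, adequate for small k."""
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def pumped_length(p, m):
    """Length of x y^i z when |xyz| = p, |y| = m and i = p + 1."""
    i = p + 1
    return p + (i - 1) * m

checks = [(p, m, pumped_length(p, m)) for p in (5, 7, 11, 13) for m in range(1, p + 1)]
never_prime = all(
    length == p * (1 + m) and not is_prime(length) for p, m, length in checks
)
print(never_prime)
```

Whatever the value of m, the pumped length factors as p(1+m) with both factors at least 2, which is why the choice i = p+1 works.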
Example3.15: Show that L = {aᵐ | m = i² for some i ≥ 1} is not regular.
Sol:
Step1: We suppose L is regular and get a contradiction. Let n be the number of states in the FA
accepting L.
Step2: Let w = aᵐ with m = n². Then |w| = n² ≥ n. By the pumping lemma we write w = xyz with
|xy| ≤ n and |y| > 0.
Step3: Consider xy²z. |xy²z| = |x| + 2|y| + |z| > |x| + |y| + |z| as |y| > 0.
This means n² = |xyz| = |x| + |y| + |z| < |xy²z|. As |xy| ≤ n, |y| ≤ n.
Therefore |xy²z| = |x| + 2|y| + |z| ≤ n² + n, i.e., n² < |xy²z| ≤ n² + n < n² + 2n + 1 = (n+1)².
Hence |xy²z| lies strictly between n² and (n+1)², and is not equal to either
of them. Thus |xy²z| is not a perfect square and so xy²z ∉ L. But by the
pumping lemma xy²z ∈ L. This is a contradiction. Thus L is not regular.
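The squeeze between n² and (n+1)² can be confirmed numerically for small n. In this sketch k stands for |y|, which ranges over 1, ..., n; the range of n tried and the helper name is_square are assumptions.

```python
import math

def is_square(k):
    """True when k is a perfect square."""
    r = math.isqrt(k)
    return r * r == k

# The pumped length is n*n + k with 1 <= k <= n; it should never be a perfect square.
gap_ok = all(
    n * n < n * n + k < (n + 1) ** 2 and not is_square(n * n + k)
    for n in range(1, 200)
    for k in range(1, n + 1)
)
print(gap_ok)
```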
Example3.16: Show that L = {ww | w ∈ {a,b}*} is not regular.
Sol:
Step1: We suppose L is regular and get a contradiction. Let n be the number of states in the FA
accepting L.
Step2: Let us consider ww = aⁿbaⁿb in L. |ww| = 2(n+1) > n. We can apply the pumping lemma to
write ww = xyz with |y| > 0, |xy| ≤ n.
Step3: We want to find i so that xyⁱz ∉ L for getting a contradiction. The string y can be
in only one of the following forms:
Case1: y has no b's, i.e., y = aᵏ for some k ≥ 1.
Case2: y has only one b.
We may note that y can't have two b's.
If so, |y| ≥ n+2. But |y| ≤ |xy| ≤ n.
In case 1, we can take i = 0. Then xy⁰z = xz is of the form aᵐbaⁿb, where m = n-k < n
(or aⁿbaᵐb). We can't write xz in the form uu with u ∈ {a,b}*, and so xz ∉ L.
In case 2 also, we can take i = 0. Then xy⁰z = xz has only one b (as one b is removed from xyz, b
being in y). So xz ∉ L, as any element in L should have an even number of a's and an even
number of b's.
Thus in both cases we get a contradiction. Therefore L is not regular.
Closure properties of regular languages
We have several theorems of the form, "If certain languages are regular, and a language L is
formed from them by certain operations (such as the union of two regular languages), then L is also
regular". These theorems are often called closure properties of the regular languages.
Closure properties express the idea that when one or several languages are regular, then certain
related languages are also regular.
They also serve as an interesting illustration of how the equivalent representations of
the regular languages reinforce each other in our understanding of the class of languages.
Closure of Regular Languages under Boolean Operations
The first closure properties are the three Boolean operations:
union, intersection and complementation.
1. Let L and M be languages over alphabet ∑. Then L ∪ M is the language that contains all
strings that are in either or both of L and M.
2. Let L and M be languages over alphabet ∑. Then L ∩ M is the language that contains all
strings that are in both L and M.
3. Let L be a language over alphabet ∑. Then L~, the complement of L, is the set of strings in ∑*
that are not in L.
It turns out that regular languages are closed under all three of the Boolean
operations.
Closure Under Union:
Theorem1: If L and M are regular languages, then so is L ∪ M.
Closure Under Intersection:
Theorem2: If L and M are regular languages, then so is L ∩ M.
Closure Under Complementation:
Theorem3: If L is a regular language over alphabet ∑, then L~ = ∑* - L is also a regular
language.
Closure Under Difference:
Theorem4: If L and M are regular languages, then so is L – M.
Reversal: The reversal of a string a1 a2 ……… an is the string written backwards, i.e.,
an an-1 ….. a1.
wᴿ is the reversal of the string w. Thus, 0010ᴿ = 0100.
Theorem5: If L is a regular language, so is Lᴿ.
Homomorphism: A string homomorphism is a function on strings that works by substituting a
particular string for each symbol.
Theorem6: If L is a regular language over alphabet ∑, and h is a homomorphism on ∑, then h(L) is
also regular.
Closure under Iteration:
Theorem7: If L is a regular language, then so is L*.
Closure under Concatenation:
Theorem8: If L and M are regular languages, then so is L.M.