CS 4510 : Automata and Complexity : Spring 2010
Myhill-Nerode Theorem
Subrahmanyam Kalyanasundaram
January 26, 2010
Myhill-Nerode theorem gives a necessary and suο¬cient condition for a language to be regular. The
pumping lemma gives only a necessary condition for a language to be regular, and Myhill-Nerode
theorem does one better in that respect. The theorem was proved by John Myhill and Anil Nerode
in 1958. We state the necessary deο¬nitions and the theorem below, and explain the proof in a
top-down manner.
Deο¬nition 1 (Distinguishable by a language). Let π₯, π¦ be strings and πΏ be a language over the
same alphabet Ξ£. π₯, π¦ are said to be distinguishable by πΏ if βπ§ β Ξ£β such that π₯π§ β πΏ and π¦π§ β
/πΏ
or vice versa.
Also, if π₯ and π¦ are not distinguishable by πΏ, they are said to be indistinguishable by πΏ. This is
denoted by π₯ β‘πΏ π¦.
Exercise 1
Show that β‘πΏ is an equivalence relation.
Deο¬nition 2 (Pairwise Distinguishable). Let πΏ be a language and π be a set of strings. π is
pairwise distinguishable by πΏ if every two distinct strings π₯, π¦ β π are distinguishable by πΏ.
Deο¬nition 3 (Index). The index of a language πΏ is the size of the largest set π of strings which
is pairwise distinguishable by πΏ.
These are the deο¬nitions we need. The theorem is stated below.
Theorem 1 (Myhill-Nerode Theorem). Language πΏ is regular if and only if it has a ο¬nite index.
Moreover, the index of πΏ is equal to the size of the smallest DFA which recognizes πΏ.
The proof of the theorem follows by the following two lemmas.
Lemma 1. If πΏ is recognized by a DFA with π states, then index(πΏ) β€ π.
Lemma 2. If index(πΏ) = π < β, then there exists a DFA with π states that recognizes πΏ.
The proof of the theorem follows very easily once we assume the above two lemmas.
1
Proof of Myhill-Nerode Theorem. We have by deο¬nition, πΏ is regular ββ there exists a DFA which
recognizes πΏ.
Suppose πΏ is regular. Choose the smallest (least number of states) DFA that recognizes πΏ. Let π
be the number of states of this DFA. By lemma 1, index(πΏ) β€ π. In particular
index(πΏ) β€ Size of the smallest DFA that recognizes πΏ
This also implies that πΏ has ο¬nite index as well. This shows one direction of the theorem.
For the other direction, let index(πΏ) = π < β. Then by lemma 2 there exists a DFA with π states
that recognizes πΏ. This shows the other direction of the theorem, that πΏ is regular. This also shows
that
Size of the smallest DFA that recognizes πΏ β€ index(πΏ)
By the above two inequalities, we can conclude that index(πΏ) = the size of the smallest DFA that
recognizes πΏ.
Now let us see the proofs of the lemmas. Note that we use the notation πΏ(π, π₯) for strings π₯ β Ξ£β .
πΏ(π, π₯) denotes the state that the DFA reaches starting from π after reading the string π₯.
Proof of Lemma 1. We prove this by contradiction. We have a language πΏ recognized by a DFA
with π states. We assume, for the sake of contradiction, that index(πΏ) > π. This means that there
is a set π, β£πβ£ > π, such that π is pairwise distinguishable by πΏ.
Let π be the DFA with π states that recognizes πΏ. Let π0 be the start state of π . We have β£πβ£ > π.
Then by pigeonhole principle, there exist two distinct π₯, π¦ β π such that πΏ(π0 , π₯) = πΏ(π0 , π¦).
Now we shall assert that π₯, π¦ are pairwise indistinguishable, thus contradicting our assumption that
index(πΏ) > π. Consider any π§ β Ξ£β .
πΏ(π0 , π₯π§) = πΏ(πΏ(π0 , π₯), π§) = πΏ(πΏ(π0 , π¦), π§) = πΏ(π0 , π¦π§)
So π₯π§ β πΏ ββ π¦π§ β πΏ. π₯, π¦ are pairwise indistinguishable, and hence by contradiction, index(πΏ) β€
π.
Proof of Lemma 2. Suppose index(πΏ) = π. We have to show the existence of a DFA with π states
that recognizes πΏ. We shall in fact, construct such a DFA. Let π = {π₯1 , π₯2 , . . . , π₯π } be a largest
set which is pairwise distinguishable by πΏ (the existence of π follows by the deο¬nition of index).
We build DFA π = (π, Ξ£, πΏ, π0 , πΉ ). Ξ£ is chosen to be the same as the alphabet of πΏ. We choose
π = {π1 , π2 , . . . , ππ }. It would be useful to think of each ππ as corresponding to the string π₯π β π.
For each π β Ξ£, we deο¬ne πΏ(ππ , π) as follows. Consider the string π₯π π. We have π₯π π β‘πΏ π₯π for some
π₯π β π. This follows from the fact that π is a largest pairwise distinguishable set by πΏ. If π₯π ββ‘πΏ π₯π
for any π₯π , this would mean that π βͺ {π₯π π} would be a bigger pairwise distinguishable set by πΏ.
Now deο¬ne πΏ(ππ , π) = ππ .
Similarly, we have the empty string π β‘πΏ ππ for some ππ . Set ππ = π0 as the starting state.
We claim (and shall prove soon) that πΏ(ππ , π€) = ππ ββ π₯π π€ β‘πΏ π₯π .
Deο¬ne πΉ = {ππ β£ π₯π β πΏ}. We have deο¬ned all the components of the DFA π now. All that remains
is to show that this DFA recognizes πΏ. Suppose π₯ β πΏ. Then π₯ β‘πΏ π₯π for some π₯π , such that π₯π β πΏ
2
(Why?). By the above claim,
π₯ = ππ₯ β‘πΏ π₯π =β πΏ(π0 , π₯) = ππ
But ππ β πΉ since π₯π β πΏ. So π₯ is recognized by π .
If π₯ β
/ πΏ, then π₯ β‘πΏ π₯π for some π₯π β
/ πΏ. Like above, we get that πΏ(π0 , π₯) = ππ . Again ππ β
/ πΉ since
π₯π β
/ πΏ. So in this case π does not recognize π₯. So π recognizes exactly the language πΏ.
The only part that we are yet to prove is the claim above.
Proof of Claim. The claim is that πΏ(ππ , π€) = ππ ββ π₯π π€ β‘πΏ π₯π .
We prove this by induction on β£π€β£, the length of π€.
β When β£π€β£ = 0, π€ = π. In this case πΏ(ππ , π€) = πΏ(ππ , π) = ππ . The claim is true since π₯π π = π₯π .
β When β£π€β£ = 1, the claim is true by the deο¬nition of the transition function πΏ.
β When β£π€β£ > 1, we use induction. Suppose the claim holds for all π€ such that β£π€β£ β€ π. Let π€
be such that β£π€β£ = π + 1. Let π€ = π£π where β£π£β£ = π and β£πβ£ = 1, π β Ξ£. We have
πΏ(ππ , π€) = πΏ(ππ , π£π) = πΏ(πΏ(ππ , π£), π) = πΏ(ππ1 , π) = ππ2
where ππ1 = πΏ(ππ , π£) and ππ2 = πΏ(ππ1 , π). This means that π₯π1 β‘πΏ π₯π π£ and π₯π2 β‘πΏ π₯π2 π (this
follows by induction hypothesis). The claim is true since
π₯π π€ = π₯π π£π β‘πΏ π₯π1 π β‘πΏ π₯π2
3
© Copyright 2026 Paperzz