Lower bound methods for the size of nondeterministic
finite automata revisited
Hellis Tamm
Tallinn University of Technology
LATA 2017, Umeå, March 6-9, 2017
Joint work with Brink van der Merwe
Lower bound methods for the size of NFA
We consider the following lower bound methods for the size of NFA:
- fooling set technique
- extended fooling set technique
- biclique edge cover technique
Lower bounds obtained by these methods are not necessarily tight;
a minimal NFA may have more states than the obtained bound.
Some classes of languages for which tight bounds can be achieved
are known.
The class of regular languages for which the fooling set technique
provides a tight bound is known as the class of biseparable languages.
The extended fooling set technique provides a tight bound for
biresidual languages.
The exact classes of languages for which the extended fooling set
technique and the biclique edge cover technique provide tight bounds
are not known.
Outline
We present the lower bound methods in terms of quotients and atoms
of regular languages.
We then consider certain subsets of the sets of quotients and atoms,
so-called prime quotients and prime atoms, and present the lower
bound methods in terms of prime quotients and prime atoms.
The languages with maximal reversal complexity belong to the class of
languages for which the fooling set technique provides a tight bound.
The extended fooling set technique is tight for a subclass of unary
cyclic languages.
Fooling set techniques
Fooling set technique (Glaister and Shallit, 1996):
Let L ⊆ Σ∗ be a regular language, and suppose there exists
a set of pairs S = {(xi, yi) | 1 ≤ i ≤ p} such that
(a) xi yi ∈ L, for 1 ≤ i ≤ p, and
(b) xi yj ∉ L, for 1 ≤ i, j ≤ p, i ≠ j.
Then any NFA accepting L has at least p states.
Extended fooling set technique (Birget, 1992):
As above, but with condition (b) replaced by the weaker condition
(b’) xi yj ∉ L or xj yi ∉ L, for 1 ≤ i, j ≤ p, i ≠ j.
The extended fooling set technique may provide a better lower bound.
Lower bounds obtained by these techniques are not necessarily tight.
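The two definitions can be checked mechanically for a concrete set of pairs. The following is a minimal sketch, assuming the language is given as a membership predicate on strings; the function names and the example languages are illustrative and do not come from the talk.

```python
from itertools import permutations

def is_fooling_set(pairs, in_language):
    """Conditions (a) and (b): xi yi in L, and xi yj not in L for i != j."""
    diagonal = all(in_language(x + y) for x, y in pairs)
    off_diagonal = all(
        not in_language(pairs[i][0] + pairs[j][1])
        for i, j in permutations(range(len(pairs)), 2)
    )
    return diagonal and off_diagonal

def is_extended_fooling_set(pairs, in_language):
    """Conditions (a) and (b'): for i != j, xi yj or xj yi is not in L."""
    diagonal = all(in_language(x + y) for x, y in pairs)
    off_diagonal = all(
        not in_language(pairs[i][0] + pairs[j][1])
        or not in_language(pairs[j][0] + pairs[i][1])
        for i, j in permutations(range(len(pairs)), 2)
    )
    return diagonal and off_diagonal

# Hypothetical example language: words over {a} whose length is divisible by 3.
def length_divisible_by_3(word):
    return len(word) % 3 == 0

pairs = [("", "aaa"), ("a", "aa"), ("aa", "a")]
print(is_fooling_set(pairs, length_divisible_by_3))
print(is_extended_fooling_set(pairs, length_divisible_by_3))
```

For the finite language {ε, a}, the pairs (ε, a) and (a, ε) satisfy (b’) but not (b), which illustrates how the extended technique can give a larger bound.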
Biclique edge cover technique
Let G = (X, Y, E) be a bipartite graph, with sets of vertices X and Y,
and set of edges E ⊆ X × Y.
A set C = {H1, H2, ...} of bipartite subgraphs of G is an edge cover of G
if every edge e ∈ E is an edge of some Hi.
An edge cover C of G is a biclique edge cover if every Hi is a biclique,
that is, if Hi = (Xi, Yi, Ei) with Ei = Xi × Yi.
The bipartite dimension of G, d(G), is the size of the smallest biclique
edge cover of G if it exists and is infinite otherwise.
The biclique edge cover technique (Gruber and Holzer, 2006):
Theorem
Let L ⊆ Σ∗ be a regular language, and let X, Y ⊆ Σ∗.
Let G = (X, Y, EL) be the bipartite graph in which,
for x ∈ X and y ∈ Y, (x, y) ∈ EL if and only if xy ∈ L.
Then any NFA accepting L has at least d(G) states.
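For small finite graphs the bipartite dimension can be found by exhaustive search. The sketch below is a brute-force illustration only, with exponential running time; the function name and the graph encoding (a list of left vertices plus a set of edge pairs) are assumptions made for this example.

```python
from itertools import combinations

def bipartite_dimension(left, edges):
    """Smallest number of bicliques covering every edge (brute force)."""
    edges = set(edges)
    if not edges:
        return 0
    left = list(left)
    neighbours = {x: {y for (u, y) in edges if u == x} for x in left}
    # Candidate bicliques: a non-empty set of left vertices together with all
    # of their common neighbours. Every biclique is contained in one of these.
    candidates = set()
    for size in range(1, len(left) + 1):
        for subset in combinations(left, size):
            common = set.intersection(*(neighbours[x] for x in subset))
            if common:
                candidates.add(frozenset((x, y) for x in subset for y in common))
    candidates = list(candidates)
    # Try covers of increasing size; the set of all candidates always works.
    for k in range(1, len(candidates) + 1):
        for choice in combinations(candidates, k):
            if set().union(*choice) == edges:
                return k

# A perfect matching on three edges needs three bicliques.
print(bipartite_dimension([1, 2, 3], {(1, "a"), (2, "b"), (3, "c")}))
```

Restricting the search to subsets of left vertices with their common neighbours keeps the candidate list small without losing any optimal cover.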
Dependency graph of a language
The Nerode right congruence is well known:
for x, y ∈ Σ∗, x ≡L y if for every v ∈ Σ∗, xv ∈ L if and only if yv ∈ L.
The left congruence is defined symmetrically:
for x, y ∈ Σ∗, x L≡ y if for every u ∈ Σ∗, ux ∈ L if and only if uy ∈ L.
Gruber and Holzer (2006) defined the dependency graph of a language L
as the bipartite graph GL = (X, Y, EL), where X = Σ∗/≡L and
Y = Σ∗/L≡, and ([x]L, L[y]) ∈ EL if and only if xy ∈ L.
They suggested that the maximal fooling sets and extended fooling sets,
as well as the smallest biclique edge cover for L, can be found by
inspecting the dependency graph GL .
Quotients and atoms
Let L be a regular language over an alphabet Σ.
The left quotient of a language L by a word w is the language
w⁻¹L = {x ∈ Σ∗ | wx ∈ L}.
Let K1 , . . . , Kn be the quotients of L.
An atom of L is any non-empty language of the form
A = K̃1 ∩ K̃2 ∩ · · · ∩ K̃n,
where K̃i is either Ki or its complement K̄i.
Any quotient Ki is a (possibly empty) union of atoms.
Atoms define a partition of Σ∗ .
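Atoms can be enumerated from a DFA by working backwards from the final states. The sketch below assumes a complete, minimal DFA, so that its states stand for the quotients; each atom is then named by the set of states (quotients) that appear uncomplemented in it. The function name and the example automaton are illustrative.

```python
def atom_signatures(states, alphabet, delta, finals):
    """Return one frozenset of states per atom of the DFA's language.

    Assumes a complete, minimal DFA, so states stand for quotients. The set
    attached to an atom lists the quotients that contain the atom.
    """
    start = frozenset(finals)  # signature of the atom containing the empty word
    seen = {start}
    todo = [start]
    while todo:
        current = todo.pop()
        for letter in alphabet:
            # If y has signature `current`, then letter + y has this signature.
            previous = frozenset(q for q in states if delta[q, letter] in current)
            if previous not in seen:
                seen.add(previous)
                todo.append(previous)
    return seen

# Hypothetical example: minimal DFA for the words over {a, b} that end in a.
delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}
atoms = atom_signatures(states=(0, 1), alphabet="ab", delta=delta, finals={1})
print(sorted(sorted(s) for s in atoms))
```

In this example the language has two quotients and three atoms; the empty signature corresponds to the atom in which every quotient is complemented.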
Quotient-atom graph of a language
The dependency graph of L was defined as GL = (X, Y, EL), where
X = Σ∗/≡L and Y = Σ∗/L≡, and ([x]L, L[y]) ∈ EL if and only if xy ∈ L.
Classes of ≡L correspond to the quotients of L.
Classes of L≡ are the atoms of L (Iván 2016).
We can define GL in terms of quotients and atoms of L:
Let K = {K1 , . . . , Kn } be the set of quotients of L, and
let A = {A1 , . . . , Am } be the set of atoms of L.
Proposition
For any x, y ∈ Σ∗, xy ∈ L if and only if Aj ⊆ Ki, where y ∈ Aj and
Ki = x⁻¹L.
We can express GL = (K, A, EL), with (Ki, Aj) ∈ EL if and only if Aj ⊆ Ki.
With this view, we call GL the quotient-atom graph of L.
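The proposition can be tested on short words for a concrete automaton. The sketch below assumes a complete, minimal DFA (a hypothetical two-state example for the words over {a, b} ending in a): the state reached by x stands for the quotient x⁻¹L, and the atom containing y is named by the set of states from which y is accepted. Only words up to length 3 are examined, so the collected edges are complete for this small example but not necessarily for larger automata.

```python
from itertools import product

ALPHABET = "ab"
DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}
INITIAL, FINALS, STATES = 0, {1}, (0, 1)

def run(state, word):
    """State reached from `state` after reading `word`."""
    for letter in word:
        state = DELTA[state, letter]
    return state

def quotient_of(x):
    """The state reached by x stands for the quotient of L by x."""
    return run(INITIAL, x)

def atom_of(y):
    """The atom containing y, named by the set of quotients that contain y."""
    return frozenset(q for q in STATES if run(q, y) in FINALS)

words = ["".join(w) for n in range(4) for w in product(ALPHABET, repeat=n)]
edges = set()
for x, y in product(words, repeat=2):
    in_language = run(INITIAL, x + y) in FINALS
    atom_inside_quotient = quotient_of(x) in atom_of(y)
    assert in_language == atom_inside_quotient  # the proposition, on short words
    if atom_inside_quotient:
        edges.add((quotient_of(x), atom_of(y)))
print(sorted((q, sorted(a)) for q, a in edges))
```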
Lower bound methods in terms of quotients and atoms
By Gruber and Holzer (2006), maximal fooling sets and extended fooling
sets, as well as the smallest biclique edge cover for L, can be found by
inspecting the dependency graph GL , that is, the quotient-atom graph of L.
Consequently, we can express the above-mentioned lower bound methods
in terms of quotients and atoms.
The biclique edge cover technique can be presented by the following
theorem:
Theorem
Let L ⊆ Σ∗ be a regular language, and let the quotient-atom graph of L be
GL = (K , A, EL ), with (Ki , Aj ) ∈ EL if and only if Aj ⊆ Ki . Then any NFA
accepting L has at least d(GL ) states.
Fooling set methods in terms of quotients and atoms
The fooling set technique and the extended fooling set technique can be
expressed as the first and the second case, respectively, of the following
theorem:
Theorem
Let L ⊆ Σ∗ be a regular language, and suppose there exists a set of
quotient-atom pairs S = {(Ki, Ai) | 1 ≤ i ≤ p} such that either
1. (a) Ai ⊆ Ki for 1 ≤ i ≤ p,
   (b) Ai ⊈ Kj for 1 ≤ i, j ≤ p and i ≠ j,
or
2. (a) Ai ⊆ Ki for 1 ≤ i ≤ p,
   (b) Ai ⊈ Kj or Aj ⊈ Ki for 1 ≤ i, j ≤ p and i ≠ j,
holds. Then any NFA accepting L has at least p states.
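Given the edge set of a quotient-atom graph, the largest set of pairs satisfying either case of the theorem can be found by exhaustive search. The following is a brute-force sketch for small graphs; the function names and the vertex labels in the example are assumptions made for illustration.

```python
from itertools import combinations

def satisfies_condition_b(pairs, edges, extended=False):
    """Check condition (b) of case 1, or of case 2 when `extended` is True."""
    for (k1, a1), (k2, a2) in combinations(pairs, 2):
        a1_in_k2 = (k2, a1) in edges
        a2_in_k1 = (k1, a2) in edges
        if extended:
            if a1_in_k2 and a2_in_k1:
                return False
        elif a1_in_k2 or a2_in_k1:
            return False
    return True

def largest_fooling_set(edges, extended=False):
    """Largest set of edges (Ki, Ai) forming a fooling set; brute force."""
    edges = set(edges)
    ordered = sorted(edges, key=repr)
    # Pairs are drawn from the edges, so condition (a) holds by construction.
    for p in range(len(ordered), 0, -1):
        for pairs in combinations(ordered, p):
            if satisfies_condition_b(pairs, edges, extended):
                return list(pairs)
    return []

# Hypothetical graph: K0 contains atom A1; K1 contains atoms A1 and A2.
edges = {("K0", "A1"), ("K1", "A1"), ("K1", "A2")}
print(len(largest_fooling_set(edges)))
print(len(largest_fooling_set(edges, extended=True)))
```

In this example the plain fooling set bound is 1 while the extended bound is 2.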
Prime quotient-atom graph
A non-empty quotient Ki is prime if Ki is not a union of other quotients.
Let Aj = ⋂_{i∈Sj} Ki ∩ ⋂_{i∈S̄j} K̄i, where K̄i denotes the complement of Ki,
Sj ⊆ {1, ..., n} and S̄j = {1, ..., n} \ Sj.
We say that an atom Aj is prime if it has at least one uncomplemented
quotient in its intersection and if the set of uncomplemented quotients is
not a union of such sets of quotients corresponding to other atoms.
We form a subgraph of the quotient-atom graph, with prime quotients and
prime atoms as sets of vertices and corresponding edges between these
vertices.
We call this subgraph the prime quotient-atom graph.
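Both primality tests reduce to the same check on a family of sets: a member is prime if its set is non-empty and is not the union of the other sets in the family. The sketch below uses toy incidence data that is hypothetical and not derived from a particular language; quotients are given as sets of the atoms they contain, and the function names are illustrative.

```python
def prime_members(family):
    """Keys whose set is non-empty and not a union of other sets in the family.

    Assumes that distinct keys carry distinct sets.
    """
    primes = set()
    for name, members in family.items():
        if not members:
            continue
        covered = set()
        for other, other_members in family.items():
            if other != name and other_members <= members:
                covered |= other_members
        if covered != members:
            primes.add(name)
    return primes

def transpose(family):
    """Map each atom to the set of quotients that contain it.

    An atom contained in no quotient does not appear; it is not prime anyway.
    """
    result = {}
    for name, members in family.items():
        for member in members:
            result.setdefault(member, set()).add(name)
    return result

# Hypothetical data: each quotient listed with the atoms it contains.
quotients = {
    "K1": {"A1", "A3"},
    "K2": {"A2", "A3"},
    "K3": {"A1", "A2", "A3"},
    "K4": set(),
}
print(sorted(prime_members(quotients)))
print(sorted(prime_members(transpose(quotients))))
```

Here K3 is the union of K1 and K2, and the quotient set of A3 is the union of those of A1 and A2, so neither K3 nor A3 is prime.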
Lower bound methods with prime quotients and atoms
Let K′ ⊆ K be the set of prime quotients of L, and let A′ ⊆ A be the set
of prime atoms of L.
The biclique edge cover technique can be presented as follows:
Theorem
Let L ⊆ Σ∗ be a regular language, and let the prime quotient-atom graph
of L be G′L = (K′, A′, E′L), with (Ki, Aj) ∈ E′L if and only if Aj ⊆ Ki. Then
any NFA accepting L has at least d(G′L) states, with d(G′L) = d(GL).
Both of the fooling set techniques can also be expressed in terms of
prime quotients and prime atoms:
Theorem
If P = {(Ki, Ai) | 1 ≤ i ≤ p} is a fooling set (an extended fooling set,
resp.) for a regular language L, then there is a fooling set (an extended
fooling set, resp.) P′ = {(K′i, A′i) | 1 ≤ i ≤ p} for L such that all K′i’s and
A′i’s are prime.
When are the lower bounds tight?
Lower bounds obtained by the discussed methods are not necessarily
tight.
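FST(L) ≤ EFST(L) ≤ BECT(L) ≤ NFAmin(L),
where FST, EFST and BECT denote the bounds given by the fooling set,
extended fooling set and biclique edge cover techniques, and NFAmin(L)
is the number of states of a minimal NFA for L.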
Some classes of languages and automata for which tight bounds can
be achieved were shown by Tamm (LATA 2010):
- The fooling set technique is tight for, and only for, biseparable
  languages:
  - A trim NFA N = (Q, Σ, δ, I, F) is separable if for every state q ∈ Q
    there is a word u ∈ Σ∗ such that δ(I, u) = {q};
  - N is biseparable if both N and its reverse N^R are separable.
- The extended fooling set technique is tight for any biresidual language:
  - An NFA N is a residual NFA if for every state q of N, the right
    language of q is a left quotient of L(N);
  - N is biresidual if both N and N^R are residual NFAs.
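Separability can be tested with the subset construction: N is separable when every singleton {q} is among the reachable subsets. The sketch below checks this for N and for its reverse. It assumes the NFA is trim and is given as a dictionary from (state, letter) to a set of states; trimness is not checked, and the function names and example automata are illustrative.

```python
def reachable_subsets(alphabet, delta, initial):
    """All state sets reachable from `initial` by the subset construction."""
    start = frozenset(initial)
    seen, todo = {start}, [start]
    while todo:
        current = todo.pop()
        for letter in alphabet:
            nxt = frozenset(r for q in current for r in delta.get((q, letter), ()))
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def is_separable(states, alphabet, delta, initial):
    """Every state q must be reachable as the singleton {q}."""
    subsets = reachable_subsets(alphabet, delta, initial)
    return all(frozenset([q]) in subsets for q in states)

def reverse(delta):
    """Transition relation of the reversed automaton."""
    rev = {}
    for (q, letter), targets in delta.items():
        for r in targets:
            rev.setdefault((r, letter), set()).add(q)
    return rev

def is_biseparable(states, alphabet, delta, initial, finals):
    return (is_separable(states, alphabet, delta, initial)
            and is_separable(states, alphabet, reverse(delta), finals))

# Hypothetical two-state automaton: a swaps the states, c sends both to state 1.
swap_reset = {(0, "a"): {1}, (1, "a"): {0}, (0, "c"): {1}, (1, "c"): {1}}
print(is_biseparable((0, 1), "ac", swap_reset, initial={0}, finals={0}))

# Hypothetical DFA for the words over {a, b} that end in a.
ends_in_a = {(0, "a"): {1}, (0, "b"): {0}, (1, "a"): {1}, (1, "b"): {0}}
print(is_biseparable((0, 1), "ab", ends_in_a, initial={0}, finals={1}))
```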
Languages with maximal reversal complexity
If L is a regular language such that the minimal DFA of L has n states
and the minimal DFA of L^R has 2^n states, then L is a language of
maximal reversal complexity.
Such languages have been studied by Salomaa et al. (2004) and
Salomaa (2012).
Such languages have the maximal number of atoms.
In this paper, we show that such languages are biseparable.
Therefore, the fooling set technique is tight for any language with
maximal reversal complexity.
Since a biseparable NFA is a unique minimal NFA (Latteux et al.
2009), the minimal DFA of a language with maximal reversal
complexity is a unique minimal NFA.
Unary cyclic languages
Let L be a regular language over a unary alphabet Σ = {a}, and let
the minimal DFA of L be D = (Q, Σ, δ, q0, F) with a state set
Q = {q0, ..., qn−1} such that δ(qi, a) = qi+1 for i = 0, ..., n − 2, and
δ(qn−1, a) = q0.
This kind of language is called a unary cyclic language.
Proposition
If there is a final state qi ∈ F of D such that for every qj ∈ F with j ≠ i,
the state qk with k = (2i − j) mod n is not final, then L has an extended
fooling set of size n.
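The condition of the proposition is easy to test for a given cycle length n and set of final state indices. The sketch below also builds one candidate set of n pairs, (a^k, a^((i−k) mod n)) for k = 0, ..., n − 1; this particular construction is an assumption made for illustration, not taken from the slides, and it is verified directly against the extended fooling set conditions. The function names are illustrative.

```python
from itertools import permutations

def witness_state(n, finals):
    """Index i of a final state satisfying the proposition's condition, or None."""
    for i in sorted(finals):
        if all((2 * i - j) % n not in finals for j in finals if j != i):
            return i
    return None

def candidate_pairs(n, i):
    """Assumed construction: the pairs (a^k, a^((i - k) mod n)) for k < n."""
    return [("a" * k, "a" * ((i - k) % n)) for k in range(n)]

def accepts(n, finals, word):
    """Membership in the unary cyclic language with cycle length n."""
    return len(word) % n in finals

def is_extended_fooling_set(pairs, member):
    """xi yi in L for all i, and xi yj or xj yi not in L for all i != j."""
    if not all(member(x + y) for x, y in pairs):
        return False
    return all(
        not member(pairs[i][0] + pairs[j][1]) or not member(pairs[j][0] + pairs[i][1])
        for i, j in permutations(range(len(pairs)), 2)
    )

# Hypothetical example: cycle of length 4 with final states q0 and q1.
n, finals = 4, {0, 1}
i = witness_state(n, finals)
pairs = candidate_pairs(n, i)
print(i, is_extended_fooling_set(pairs, lambda w: accepts(n, finals, w)))
```

For n = 5 with final states q0, ..., q3, no final state satisfies the condition, so the proposition does not apply there.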
If the condition of the proposition holds for a unary cyclic language L,
then the extended fooling set technique is tight for L.
Consequently, the minimal DFA D of L is a minimal NFA for L.
Does the converse of the proposition also hold?
Concluding remarks
We presented the lower bound methods in terms of quotients and
atoms of regular languages.
We showed some classes of languages for which fooling set techniques
can provide tight bounds.
It would be of interest to find other classes of languages for which
lower bound methods (esp. the biclique edge cover technique) provide
tight bounds.