Property testing of Tree Regular Languages

Frédéric Magniez, LRI, CNRS
Michel de Rougemont, LRI, University Paris II

1. Tester for regular words with the Edit Distance with Moves
2. Tester for ranked regular trees with the Tree-Edit Distance with Moves

Testers on a class K

Let F be a property on a class K of structures U (n is the size of U).
An ε-tester for F is a probabilistic algorithm A such that:
• If U |= F, A accepts.
• If U is ε-far from F, A rejects with high probability.
• Time(A) is independent of n.
(Goldreich, Goldwasser, Ron 1996; Rubinfeld, Sudan 1994)
A tester usually implies a linear-time corrector.

History of Testers

Self-testers and correctors for linear algebra: Blum & Kannan 1989
Robust characterizations of polynomials: R. Rubinfeld, M. Sudan 1994
Testers for graph properties (k-colorability): Goldreich et al. 1996
Σ2 graph properties have testers: Alon et al. 1999
Regular languages have testers: Alon et al. 2000
Testers for regular tree languages: Mdr and Magniez, ICALP 2004

Edit distance on Words

1. Classical Edit Distance: insertions, deletions, modifications
2. Edit Distance with Moves:
       0111000011110011001
       0111011110000011001
   (a single move of the block 1111 turns the first word into the second)
3. Edit Distance with Moves generalizes to trees

Testers on words

Simpler proof which generalizes to regular trees.
L is a regular language and A an automaton for L.
[Figure: automaton A with connected components C0, C1, C2, C3, C4 between init and accept; admissible path Z = C0.C2.C3.C4]
A word W is Z-feasible if there are two states q ∈ Ci, q' ∈ Cj such that A can go from q to q' by reading W, and Z = ...Ci...Cj...

The Tester

Tester. Input: W, A, ε
    For i = 1, ..., log(m/ε):
        Choose N_i = (2^([…]i) · m^3 / ε) random subwords w_ij of size 2^(i[…]1)
    For every admissible path Z:
        If all w_ij of W are Z-feasible, ACCEPT.
    else REJECT.
Theorem: Tester(W, A, ε) is an ε-tester for L(A).

Proof schema of the Tester

Theorem: Regular words are testable.
Robustness lemma: If W is ε-far from L, then for every admissible path Z there exists i ≤ log(5·m[…]) such that the number of Z-infeasible subwords of size 2^(i[…]1) is at least 2^(i[…]1) · […] · n / m^2.
Splitting lemma: If W is far from L, there are many disjoint infeasible subwords.
Amplifying lemma: If there are many infeasible subwords, there are many short ones.

Merging

Merging lemma: Let Z be an admissible path, and let F be a Z-feasible cut of size h'. Then Dist(F, L) ≤ m^2·h'.
[Figure: a cut whose words are split along the connected components C]
Take each word w_i ∈ F and split it along its connected components, removing single letters.
Rearrange all the words of the same component in its Z-order.
Add gluing words to obtain W' in L: W' = g0.w1.g1.w2.g2. ...

Splitting

Splitting lemma: If Z is an admissible path and W a word such that Dist(W, L) > h, then W has more than h/m^2 Z-infeasible disjoint subwords (h = ε·n).
Proof by contraposition: W has less than h' = h/m^2 minimal Z-infeasible and disjoint subwords. Removing the last letters provides a feasible cut F, with Dist(W, F) ≤ h'. By the merging lemma, Dist(F, L) ≤ m^2·h'. Hence Dist(W, L) ≤ h' + m^2·h', and Dist(W, L) ≤ h.

Tree-Edit-Distance

[Figure: example trees illustrating the basic operations: deletion of an edge; insertion of a node and label]
Tree Edit Distance with moves:
[Figure: two trees that differ by 1 move]
The distance problem is NP-complete and non-approximable.

Tree-Edit-Distance on binary trees

Binary trees: the distance with moves allows permutations.
[Figure: two binary trees T1 and T2]
Distance(T1, T2) = 4
m-Distance(T1, T2) = 2

Tree automata

(q0, q0) → q1
(q0, q1) → q1
(q1, q1) → q2
(q1, q0) → q2
(q2, -) → q2
(-, q2) → q2
A = (Q, q0, δ, q1)
[Figure: run of A on a binary tree, each node labelled with its state]

Infeasible subtrees

Fact. If Distance(T, L) > ε·n, then the number of infeasible subtrees of constant size is O(n).

Tester for regular Trees

Tester. Input: T, A, ε
    For i […] r^(2m[…]1):
        Choose N_i = (m · r^(4m[…]3) […]) random nodes and subtrees t_ij of size i
    If all t_ij of T are Z-feasible, ACCEPT.
Theorem: Tester(T, A, ε) is an ε-tester for L(A).

Proof schema of the Tester

Theorem: Regular trees are testable.
Robustness lemma: If T is ε-far from L, then for every admissible path Z there exists i ≤ ([…] r^(m[…]1) […]) such that the number of Z-infeasible i-subtrees is at least […] · n / r^(4m[…]3).
Splitting lemma: If T is far from L, there are many disjoint infeasible subtrees.
Amplifying lemma: If there are many infeasible subtrees, there are many small ones.

Splitting and Merging

Splitting and Merging on words:
[Figure: a word split along connected components C]
Splitting and Merging on trees:
[Figure: a tree split along connected components C]

Splitting and Merging trees

[Figure: connected components C, D, E of a tree and the corrected tree]

Conclusion

• Verification is hard.
• Approximate verification can be feasible.
1. Testers and correctors for regular words
2. Tester for regular trees
3. Corrector for regular trees
4. Unranked trees: XML files
5. Applications: constant algorithm for Edit Distance with Moves (Fischer, Magniez, Mdr)
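
The transition table on the Tree automata slide can be run bottom-up on a binary tree. The following is a minimal sketch in Python, not the authors' implementation. It assumes that each pair on the slide maps to the state written after it, that in A = (Q, q0, δ, q1) every leaf starts in q0 and q1 is the only accepting state, and that trees are unlabelled. The names Node, DELTA, run and accepts are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Binary tree node; a leaf has no children (unlabelled trees are an assumption)."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


# Transitions as read from the slide: (left state, right state) -> state.
DELTA = {
    ("q0", "q0"): "q1",
    ("q0", "q1"): "q1",
    ("q1", "q1"): "q2",
    ("q1", "q0"): "q2",
}
SINK = "q2"        # (q2, -) -> q2 and (-, q2) -> q2
LEAF_STATE = "q0"  # assumption: every leaf is assigned q0
ACCEPTING = "q1"   # assumption: q1 is the accepting state


def run(tree: Node) -> str:
    """Return the state reached at the root by a bottom-up run."""
    if tree.left is None and tree.right is None:
        return LEAF_STATE
    if tree.left is None or tree.right is None:
        raise ValueError("internal nodes must have exactly two children")
    left, right = run(tree.left), run(tree.right)
    if SINK in (left, right):
        return SINK
    return DELTA[(left, right)]


def accepts(tree: Node) -> bool:
    """True if the run ends in the accepting state at the root."""
    return run(tree) == ACCEPTING


leaf = Node  # calling leaf() builds a node without children
right_comb = Node(leaf(), Node(leaf(), Node(leaf(), leaf())))
left_comb = Node(Node(leaf(), leaf()), leaf())
print(accepts(right_comb), accepts(left_comb))
```

Under this reading, the automaton accepts exactly the trees in which every internal node has a leaf as its left child, and q2 acts as a rejecting sink state.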