On the size of partial derivatives and the word membership problem

Abstract

Partial derivatives are widely used to convert regular expressions to nondeterministic automata. For the word membership problem, it is not strictly necessary to build an automaton. In this paper, we study the size of partial derivatives in the average case. For expressions in strong star normal form, we show that, on average and asymptotically, the largest partial derivative is at most half the size of the expression. The results are obtained in the framework of analytic combinatorics, considering generating functions of parametrised combinatorial classes defined implicitly by algebraic curves. Our average case estimates suggest that a detailed word membership algorithm based directly on partial derivatives should be analysed both theoretically and experimentally.
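To illustrate the remark that word membership does not require building an automaton, the following is a minimal sketch that decides membership by computing Antimirov-style partial derivatives on the fly, one input symbol at a time. It is not the algorithm analysed in the paper. The tuple encoding of expressions and the helper names (`sym`, `alt`, `cat`, `star`, `pd`, `member`) are assumptions made for this example, and the empty-language constant is omitted for brevity.

```python
from functools import lru_cache

# Expressions are nested tuples (an encoding assumed for this sketch):
#   ("eps",)  ("sym", a)  ("alt", r, s)  ("cat", r, s)  ("star", r)
EPS = ("eps",)

def sym(a): return ("sym", a)
def alt(r, s): return ("alt", r, s)
def cat(r, s): return ("cat", r, s)
def star(r): return ("star", r)

def nullable(r):
    """True iff the language of r contains the empty word."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "sym":
        return False
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # cat

def _concat(parts, s):
    """Right-concatenate s to every expression in parts, simplifying eps.s to s."""
    return frozenset(s if p == EPS else cat(p, s) for p in parts)

@lru_cache(maxsize=None)
def pd(r, a):
    """Set of partial derivatives of r with respect to the symbol a."""
    tag = r[0]
    if tag == "eps":
        return frozenset()
    if tag == "sym":
        return frozenset([EPS]) if r[1] == a else frozenset()
    if tag == "alt":
        return pd(r[1], a) | pd(r[2], a)
    if tag == "cat":
        left = _concat(pd(r[1], a), r[2])
        if nullable(r[1]):
            return left | pd(r[2], a)
        return left
    return _concat(pd(r[1], a), r)  # star

def member(r, w):
    """Decide whether w is in L(r) by tracking a set of partial derivatives."""
    current = frozenset([r])
    for a in w:
        current = frozenset().union(*(pd(e, a) for e in current))
        if not current:
            return False
    return any(nullable(e) for e in current)

# Example: (a+b)*abb
r = cat(star(alt(sym("a"), sym("b"))), cat(sym("a"), cat(sym("b"), sym("b"))))
print(member(r, "aabb"), member(r, "ab"))
```

The work per input symbol in this sketch depends on how many partial derivatives are tracked and on how large each one is, which is why the size of the largest partial derivative is a natural quantity to study.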

Notes

  1. This hypothesis states that for every positive \(\delta <1\), SAT cannot be solved in time \(O^*(2^{\delta n})\); see [22].

  2. Note that \(m(\partial ^+(s_n))\) is sequence A034856 minus 2 in OEIS (https://oeis.org/A034856).

  3. This could be \(\Vert \gamma \Vert \). We note, however, that representing a set of PDs with the compact method of Sect. 8 results in subtree sharing, so the total size of the set is less than the sum of the sizes of its elements.

References

  1. Adams, M.D., Hollenbeck, C., Might, M.: On the complexity and performance of parsing with derivatives. In: Krintz, C., Berger, E. (eds.) Proceedings of the 37th ACM SIGPLAN PLDI, pp. 224–236. ACM (2016). https://doi.org/10.1145/2908080.2908128

  2. Antimirov, V.M.: Partial derivatives of regular expressions and finite automaton constructions. Theoret. Comput. Sci. 155(2), 291–319 (1996)

  3. Backurs, A., Indyk, P.: Which regular expression patterns are hard to match? In: Dinur, I. (ed.) Proceedings of the 57th FOCS, pp. 457–466. IEEE Computer Society (2016). https://doi.org/10.1109/FOCS.2016.56

  4. Bille, P., Thorup, M.: Faster regular expression matching. In: Albers, S., Marchetti-Spaccamela, A., Matias, Y., Nikoletseas, S.E., Thomas, W. (eds.) Proceedings of the 36th ICALP, Part I, LNCS, vol. 5555, pp. 171–182. Springer, Berlin (2009). https://doi.org/10.1007/978-3-642-02927-1_16

  5. Bringmann, K., Grønlund, A., Larsen, K.G.: A dichotomy for regular expression membership testing. In: Umans, C. (ed.) Proceedings of the 58th FOCS, pp. 307–318. IEEE Computer Society (2017). https://doi.org/10.1109/FOCS.2017.36

  6. Broda, S., Holzer, M., Maia, E., Moreira, N., Reis, R.: Mesh of automata. Inf. Comput. 265, 94–111 (2019). https://doi.org/10.1016/j.ic.2019.01.003

  7. Broda, S., Machiavelo, A., Moreira, N., Reis, R.: On the average state complexity of partial derivative automata: an analytic combinatorics approach. Int. J. Found. Comput. Sci. 22(7), 1593–1606 (2011). https://doi.org/10.1142/S0129054111008908

  8. Broda, S., Machiavelo, A., Moreira, N., Reis, R.: On the average size of Glushkov and partial derivative automata. Int. J. Found. Comput. Sci. 23(5), 969–984 (2012). https://doi.org/10.1142/S0129054112400400

  9. Broda, S., Machiavelo, A., Moreira, N., Reis, R.: A Hitchhiker’s guide to descriptional complexity through analytic combinatorics. Theoret. Comput. Sci. 528, 85–100 (2014)

  10. Broda, S., Machiavelo, A., Moreira, N., Reis, R.: On average behaviour of regular expressions in strong star normal form. Int. J. Found. Comput. Sci. 30(6–7), 899–920 (2019). https://doi.org/10.1142/S0129054119400227

  11. Broda, S., Machiavelo, A., Moreira, N., Reis, R.: Analytic combinatorics and descriptional complexity of regular languages on average. ACM SIGACT News 51(1), 38–56 (2020). https://doi.org/10.1145/3388392.3388400

  12. Brüggemann-Klein, A.: Regular expressions into finite automata. Theoret. Comput. Sci. 48, 197–213 (1993)

  13. Champarnaud, J., Ziadi, D.: From c-continuations to new quadratic algorithms for automaton synthesis. Int. J. Algebra Comput. 11(6), 707–736 (2001). https://doi.org/10.1142/S0218196701000772

  14. Champarnaud, J.M., Ouardi, F., Ziadi, D.: Normalized expressions and finite automata. Int. J. Algebra Comput. 17(1), 141–154 (2007). https://doi.org/10.1142/S021819670700355X

  15. Champarnaud, J.M., Ziadi, D.: From Mirkin’s prebases to Antimirov’s word partial derivatives. Fundam. Inform. 45(3), 195–205 (2001)

  16. Cochran, W.G.: Sampling Techniques, 3rd edn. Wiley, New York (1977)

  17. Flajolet, P., Sedgewick, R.: Analytic Combinatorics. CUP, Cambridge (2008)

  18. Gulan, S.: On the relative descriptional complexity of regular expressions and finite automata. Ph.D. thesis, Universität Trier (2011)

  19. Hille, E.: Analytic Function Theory, vol. 2. Blaisdell Publishing Company (1962)

  20. Khorsi, A., Ouardi, F., Ziadi, D.: Fast equation automaton computation. J. Discrete Algorithms 6(3), 433–448 (2008). https://doi.org/10.1016/j.jda.2007.10.003

  21. Konstantinidis, S., Machiavelo, A., Moreira, N., Reis, R.: On the average state complexity of partial derivative transducers. In: Chatzigeorgiou, A., Dondi, R., Herodotou, H., Kapoutsis, C.A., Manolopoulos, Y., Papadopoulos, G.A., Sikora, F. (eds.) Proceedings of the SOFSEM 2020, LNCS, vol. 12011, pp. 174–186. Springer, Berlin (2020). https://doi.org/10.1007/978-3-030-38919-2_15

  22. Lokshtanov, D., Marx, D., Saurabh, S.: Lower bounds based on the exponential time hypothesis. Bull. EATCS 105, 41–72 (2011)

  23. Mirkin, B.G.: An algorithm for constructing a base in a language of regular expressions. Eng. Cybern. 5, 51–57 (1966)

  24. Myers, E.W.: A four Russians algorithm for regular expression pattern matching. J. ACM 39(2), 430–448 (1992). https://doi.org/10.1145/128749.128755

  25. Nicaud, C.: On the average size of Glushkov’s automata. In: Dediu, A., Ionescu, A.M., Vide, C.M. (eds.) Proceedings of the 3rd LATA, LNCS, vol. 5457, pp. 626–637. Springer, Berlin (2009)

  26. Project FAdo: tools for formal languages manipulation. http://fado.dcc.fc.up.pt. Accessed 1 Jan 2021

  27. Thompson, K.: Regular expression search algorithm. CACM 11(6), 410–422 (1968)

Author information

Corresponding author

Correspondence to Nelma Moreira.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors António Machiavelo, Nelma Moreira and Rogério Reis were partially supported by CMUP, which is financed by national funds through FCT—Fundação para a Ciência e a Tecnologia, I.P., under the project with reference UIDB/00144/2020. Stavros Konstantinidis was partially supported by NSERC, Canada.

About this article

Cite this article

Konstantinidis, S., Machiavelo, A., Moreira, N. et al. On the size of partial derivatives and the word membership problem. Acta Informatica 58, 357–375 (2021). https://doi.org/10.1007/s00236-021-00399-6