POSIX Lexing with Derivatives of Regular Expressions (Proof Pearl)

  • Fahad Ausaf
  • Roy Dyckhoff
  • Christian Urban
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9807)

Abstract

Brzozowski introduced the notion of derivatives for regular expressions. They can be used for a very simple regular expression matching algorithm. Sulzmann and Lu cleverly extended this algorithm in order to deal with POSIX matching, which is the underlying disambiguation strategy for regular expressions needed in lexers. Sulzmann and Lu have made available on-line what they call a “rigorous proof” of the correctness of their algorithm w.r.t. their specification; regrettably, it appears to us to have unfillable gaps. In the first part of this paper we give our inductive definition of what a POSIX value is and show (i) that such a value is unique (for given regular expression and string being matched) and (ii) that Sulzmann and Lu’s algorithm always generates such a value (provided that the regular expression matches the string). We also prove the correctness of an optimised version of the POSIX matching algorithm. Our definitions and proof are much simpler than those by Sulzmann and Lu and can be easily formalised in Isabelle/HOL. In the second part we analyse the correctness argument by Sulzmann and Lu and explain why the gaps in this argument cannot be filled easily.
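As a rough illustration of the two algorithms mentioned in the abstract, the following Python sketch shows a derivative-based matcher and a value-producing lexer in the style of Sulzmann and Lu. It is a minimal sketch, not the formalisation from the paper: the tuple encoding of regular expressions and values, and the names `nullable`, `der`, `matches`, `mkeps`, `inj`, `lexer` and `flat`, are assumptions chosen for this example, and no simplification of derivatives (the optimised version) is attempted.

```python
# Regular expressions as tagged tuples (this encoding is an assumption of the sketch).
ZERO = ("zero",)                      # matches nothing
ONE = ("one",)                        # matches only the empty string
def CHAR(c): return ("char", c)
def ALT(r1, r2): return ("alt", r1, r2)
def SEQ(r1, r2): return ("seq", r1, r2)
def STAR(r): return ("star", r)

def nullable(r):
    """True iff r matches the empty string."""
    tag = r[0]
    if tag in ("one", "star"):
        return True
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    if tag == "seq":
        return nullable(r[1]) and nullable(r[2])
    return False                      # zero, char

def der(c, r):
    """Brzozowski derivative of r with respect to the character c."""
    tag = r[0]
    if tag == "char":
        return ONE if r[1] == c else ZERO
    if tag == "alt":
        return ALT(der(c, r[1]), der(c, r[2]))
    if tag == "seq":
        left = SEQ(der(c, r[1]), r[2])
        return ALT(left, der(c, r[2])) if nullable(r[1]) else left
    if tag == "star":
        return SEQ(der(c, r[1]), r)
    return ZERO                       # zero, one

def matches(r, s):
    """The simple matcher: take derivatives character by character."""
    for c in s:
        r = der(c, r)
    return nullable(r)

def mkeps(r):
    """Value showing how a nullable r matches the empty string."""
    tag = r[0]
    if tag == "one":
        return ("empty",)
    if tag == "alt":
        return ("left", mkeps(r[1])) if nullable(r[1]) else ("right", mkeps(r[2]))
    if tag == "seq":
        return ("pair", mkeps(r[1]), mkeps(r[2]))
    if tag == "star":
        return ("stars", [])
    raise ValueError("mkeps: regular expression is not nullable")

def inj(r, c, v):
    """Turn a value v for der(c, r) into a value for r by injecting c back."""
    tag = r[0]
    if tag == "char":
        return ("chr", c)
    if tag == "alt":
        side, inner = v
        return (side, inj(r[1] if side == "left" else r[2], c, inner))
    if tag == "seq":
        if v[0] == "pair":            # derivative was SEQ(der r1, r2)
            return ("pair", inj(r[1], c, v[1]), v[2])
        if v[0] == "left":            # left branch of ALT(SEQ(der r1, r2), der r2)
            return ("pair", inj(r[1], c, v[1][1]), v[1][2])
        return ("pair", mkeps(r[1]), inj(r[2], c, v[1]))   # right branch
    if tag == "star":                 # derivative was SEQ(der r, STAR r)
        return ("stars", [inj(r[1], c, v[1])] + v[2][1])
    raise ValueError("inj: no value can exist for this derivative")

def lexer(r, s):
    """Return a value for r and s, or None if r does not match s."""
    if not s:
        return mkeps(r) if nullable(r) else None
    v = lexer(der(s[0], r), s[1:])
    return None if v is None else inj(r, s[0], v)

def flat(v):
    """The string underlying a value."""
    if v[0] == "chr":
        return v[1]
    if v[0] in ("left", "right"):
        return flat(v[1])
    if v[0] == "pair":
        return flat(v[1]) + flat(v[2])
    if v[0] == "stars":
        return "".join(flat(x) for x in v[1])
    return ""                         # empty

# (a + ab)(b + 1) on "ab": the first component takes the longest prefix "ab".
r = SEQ(ALT(CHAR("a"), SEQ(CHAR("a"), CHAR("b"))), ALT(CHAR("b"), ONE))
print(lexer(r, "ab"))
```

For the example expression, the returned value chooses the right alternative `ab` for the first component and the empty string for the second, which is the longest-match behaviour POSIX disambiguation asks for; a greedy strategy would instead pick `a` followed by `b`.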

Keywords

  • POSIX matching
  • Derivatives of regular expressions
  • Isabelle/HOL

Notes

Acknowledgements

We are very grateful to Martin Sulzmann for his comments on our work and moreover for patiently explaining to us the details in [11]. We also received very helpful comments from James Cheney and anonymous referees.

References

  1. Ausaf, F., Dyckhoff, R., Urban, C.: POSIX Lexing with Derivatives of Regular Expressions. Archive of Formal Proofs (2016). http://www.isa-afp.org/entries/Posix-Lexing.shtml, Formal proof development
  2. Brzozowski, J.A.: Derivatives of regular expressions. J. ACM 11(4), 481–494 (1964)
  3. Coquand, T., Siles, V.: A decision procedure for regular expression equivalence in type theory. In: Jouannaud, J.-P., Shao, Z. (eds.) CPP 2011. LNCS, vol. 7086, pp. 119–134. Springer, Heidelberg (2011)
  4. Frisch, A., Cardelli, L.: Greedy regular expression matching. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142, pp. 618–629. Springer, Heidelberg (2004)
  5. Grathwohl, N.B.B., Henglein, F., Rasmussen, U.T.: A Crash-Course in Regular Expression Parsing and Regular Expressions as Types. Technical report, University of Copenhagen (2014)
  6. Krauss, A., Nipkow, T.: Proof pearl: regular expression equivalence and relation algebra. J. Autom. Reasoning 49, 95–106 (2012)
  7. Kuklewicz, C.: Regex Posix. https://wiki.haskell.org/Regex_Posix
  8. Nipkow, T.: Verified lexical analysis. In: Grundy, J., Newey, M. (eds.) TPHOLs 1998. LNCS, vol. 1479, pp. 1–15. Springer, Heidelberg (1998)
  9. Owens, S., Slind, K.: Adapting functional programs to higher order logic. High. Order Symbolic Comput. 21(4), 377–409 (2008)
  10. Pierce, B.C., Casinghino, C., Gaboardi, M., Greenberg, M., Hriţcu, C., Sjöberg, V., Yorgey, B.: Software Foundations. Electronic Textbook (2015). http://www.cis.upenn.edu/~bcpierce/sf
  11. Sulzmann, M., Lu, K.Z.M.: POSIX regular expression parsing with derivatives. In: Codish, M., Sumii, E. (eds.) FLOPS 2014. LNCS, vol. 8475, pp. 203–220. Springer, Heidelberg (2014)
  12. Sulzmann, M., van Steenhoven, P.: A flexible and efficient ML lexer tool based on extended regular expression submatching. In: Cohen, A. (ed.) CC 2014 (ETAPS). LNCS, vol. 8409, pp. 174–191. Springer, Heidelberg (2014)
  13. Vansummeren, S.: Type inference for unique pattern matching. ACM Trans. Program. Lang. Syst. 28(3), 389–428 (2006)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. King’s College London, London, UK
  2. University of St Andrews, St Andrews, UK
