POSIX Lexing with Derivatives of Regular Expressions (Proof Pearl)

Ausaf, Fahad; Dyckhoff, Roy; Urban, Christian

doi:10.1007/978-3-319-43144-4_5

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9807)

Included in the following conference series:

International Conference on Interactive Theorem Proving

Abstract

Brzozowski introduced the notion of derivatives for regular expressions. They can be used for a very simple regular expression matching algorithm. Sulzmann and Lu cleverly extended this algorithm in order to deal with POSIX matching, which is the underlying disambiguation strategy for regular expressions needed in lexers. Sulzmann and Lu have made available on-line what they call a “rigorous proof” of the correctness of their algorithm w.r.t. their specification; regrettably, it appears to us to have unfillable gaps. In the first part of this paper we give our inductive definition of what a POSIX value is and show (i) that such a value is unique (for given regular expression and string being matched) and (ii) that Sulzmann and Lu’s algorithm always generates such a value (provided that the regular expression matches the string). We also prove the correctness of an optimised version of the POSIX matching algorithm. Our definitions and proof are much simpler than those by Sulzmann and Lu and can be easily formalised in Isabelle/HOL. In the second part we analyse the correctness argument by Sulzmann and Lu and explain why the gaps in this argument cannot be filled easily.
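
The two algorithms mentioned in the abstract can be illustrated with a minimal Python sketch. The first half is the derivative-based matcher; the second half is a value-producing lexer in the style of Sulzmann and Lu, built from the usual mkeps and inj functions. The tuple encoding and all function names below are assumptions made for this illustration; they are not taken from the paper's Isabelle/HOL formalisation, and the sketch carries no correctness proof.

```python
# Regular expressions as tagged tuples (a hypothetical encoding for this sketch).
ZERO, ONE = ("0",), ("1",)
def CHR(c): return ("c", c)
def ALT(r1, r2): return ("+", r1, r2)
def SEQ(r1, r2): return (".", r1, r2)
def STAR(r): return ("*", r)

def nullable(r):
    """True iff r matches the empty string."""
    tag = r[0]
    if tag in ("1", "*"): return True
    if tag in ("0", "c"): return False
    if tag == "+": return nullable(r[1]) or nullable(r[2])
    return nullable(r[1]) and nullable(r[2])  # sequence

def der(c, r):
    """Brzozowski derivative of r with respect to the character c."""
    tag = r[0]
    if tag in ("0", "1"): return ZERO
    if tag == "c": return ONE if r[1] == c else ZERO
    if tag == "+": return ALT(der(c, r[1]), der(c, r[2]))
    if tag == ".":
        left = SEQ(der(c, r[1]), r[2])
        return ALT(left, der(c, r[2])) if nullable(r[1]) else left
    return SEQ(der(c, r[1]), r)  # star

def matches(r, s):
    """Simple matcher: take derivatives character by character, then test nullability."""
    for c in s:
        r = der(c, r)
    return nullable(r)

def mkeps(r):
    """Value describing how a nullable r matches the empty string."""
    tag = r[0]
    if tag == "1": return ("Empty",)
    if tag == "+":
        return ("Left", mkeps(r[1])) if nullable(r[1]) else ("Right", mkeps(r[2]))
    if tag == ".": return ("Seq", mkeps(r[1]), mkeps(r[2]))
    if tag == "*": return ("Stars", [])
    raise ValueError("mkeps called on a non-nullable regular expression")

def inj(r, c, v):
    """Turn a value v for der(c, r) into a value for r by re-inserting c."""
    tag = r[0]
    if tag == "c": return ("Char", r[1])
    if tag == "+":
        side = r[1] if v[0] == "Left" else r[2]
        return (v[0], inj(side, c, v[1]))
    if tag == ".":
        if v[0] == "Seq":   # r[1] was not nullable
            return ("Seq", inj(r[1], c, v[1]), v[2])
        if v[0] == "Left":  # c was consumed by r[1]
            return ("Seq", inj(r[1], c, v[1][1]), v[1][2])
        return ("Seq", mkeps(r[1]), inj(r[2], c, v[1]))  # Right: r[1] matched ""
    if tag == "*":          # v has the shape Seq(v1, Stars(vs))
        return ("Stars", [inj(r[1], c, v[1])] + v[2][1])
    raise ValueError("inj called on a regular expression with no derivative value")

def lexer(r, s):
    """Return a value for r matching s, or None if r does not match s."""
    if s == "":
        return mkeps(r) if nullable(r) else None
    v = lexer(der(s[0], r), s[1:])
    return None if v is None else inj(r, s[0], v)

def flat(v):
    """Recover the underlying string of a value."""
    tag = v[0]
    if tag == "Empty": return ""
    if tag == "Char": return v[1]
    if tag in ("Left", "Right"): return flat(v[1])
    if tag == "Seq": return flat(v[1]) + flat(v[2])
    return "".join(flat(w) for w in v[1])  # Stars

# (a + ab)(b + 1) on "ab": the first component takes the longest prefix "ab".
r = SEQ(ALT(CHR("a"), SEQ(CHR("a"), CHR("b"))), ALT(CHR("b"), ONE))
v = lexer(r, "ab")
```

For the regular expression (a + ab)(b + 1) and the string "ab", the computed value records that the second alternative ab was used for the first component and the empty string for the second, which is the longest-leftmost choice that POSIX matching prescribes.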

Notes

1. An extended version of [11] is available at the website of its first author; this extended version already includes remarks in the appendix that their informal proof contains gaps, and possible fixes are not fully worked out.
2. Sulzmann and Lu state this clause as \(inj\ c\ c\ () \,\stackrel{\text{def}}{=}\, Char\ c\), but our deviation is harmless.
3. All deviations we introduced are harmless.

References

1. Ausaf, F., Dyckhoff, R., Urban, C.: POSIX Lexing with Derivatives of Regular Expressions. Archive of Formal Proofs (2016). http://www.isa-afp.org/entries/Posix-Lexing.shtml, Formal proof development
2. Brzozowski, J.A.: Derivatives of regular expressions. J. ACM 11(4), 481–494 (1964)
3. Coquand, T., Siles, V.: A decision procedure for regular expression equivalence in type theory. In: Jouannaud, J.-P., Shao, Z. (eds.) CPP 2011. LNCS, vol. 7086, pp. 119–134. Springer, Heidelberg (2011)
4. Frisch, A., Cardelli, L.: Greedy regular expression matching. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142, pp. 618–629. Springer, Heidelberg (2004)
5. Grathwohl, N.B.B., Henglein, F., Rasmussen, U.T.: A Crash-Course in Regular Expression Parsing and Regular Expressions as Types. Technical report, University of Copenhagen (2014)
6. Krauss, A., Nipkow, T.: Proof pearl: regular expression equivalence and relation algebra. J. Autom. Reasoning 49, 95–106 (2012)
7. Kuklewicz, C.: Regex Posix. https://wiki.haskell.org/Regex_Posix
8. Nipkow, T.: Verified lexical analysis. In: Grundy, J., Newey, M. (eds.) TPHOLs 1998. LNCS, vol. 1479, pp. 1–15. Springer, Heidelberg (1998)
9. Owens, S., Slind, K.: Adapting functional programs to higher order logic. High. Order Symbolic Comput. 21(4), 377–409 (2008)
10. Pierce, B.C., Casinghino, C., Gaboardi, M., Greenberg, M., Hriţcu, C., Sjöberg, V., Yorgey, B.: Software Foundations. Electronic Textbook (2015). http://www.cis.upenn.edu/~bcpierce/sf
11. Sulzmann, M., Lu, K.Z.M.: POSIX regular expression parsing with derivatives. In: Codish, M., Sumii, E. (eds.) FLOPS 2014. LNCS, vol. 8475, pp. 203–220. Springer, Heidelberg (2014)
12. Sulzmann, M., van Steenhoven, P.: A flexible and efficient ML lexer tool based on extended regular expression submatching. In: Cohen, A. (ed.) CC 2014 (ETAPS). LNCS, vol. 8409, pp. 174–191. Springer, Heidelberg (2014)
13. Vansummeren, S.: Type inference for unique pattern matching. ACM Trans. Program. Lang. Syst. 28(3), 389–428 (2006)

Acknowledgements

We are very grateful to Martin Sulzmann for his comments on our work and moreover for patiently explaining to us the details in [11]. We also received very helpful comments from James Cheney and anonymous referees.

Author information

Authors and Affiliations

King’s College London, London, UK
Fahad Ausaf & Christian Urban
University of St Andrews, St Andrews, UK
Roy Dyckhoff

Corresponding author

Correspondence to Christian Urban.

Editor information

Editors and Affiliations

Inria Nancy – Grand Est, Villers-lès-Nancy, France
Jasmin Christian Blanchette
Inria Nancy – Grand Est, Villers-lès-Nancy, France
Stephan Merz

About this paper

Cite this paper

Ausaf, F., Dyckhoff, R., Urban, C. (2016). POSIX Lexing with Derivatives of Regular Expressions (Proof Pearl). In: Blanchette, J., Merz, S. (eds) Interactive Theorem Proving. ITP 2016. Lecture Notes in Computer Science, vol 9807. Springer, Cham. https://doi.org/10.1007/978-3-319-43144-4_5

DOI: https://doi.org/10.1007/978-3-319-43144-4_5
Published: 07 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43143-7
Online ISBN: 978-3-319-43144-4
eBook Packages: Computer Science, Computer Science (R0)
