Type-Based Optimization for Regular Patterns

  • Michael Y. Levin
  • Benjamin C. Pierce
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3774)

Abstract

Pattern matching mechanisms based on regular expressions feature in a number of recent languages for processing XML. The flexibility of these mechanisms demands novel approaches to the familiar problems of pattern-match compilation—how to minimize the number of tests performed during pattern matching while keeping the size of the output code small.

We describe semantic compilation methods in which we use the schema of the value flowing into a pattern matching expression to generate efficient target code. We start by discussing a pragmatic algorithm used currently in the compiler of Xtatic and report some preliminary performance results. For a more fundamental analysis, we define an optimality criterion of “no useless tests” and show that it is not satisfied by Xtatic’s algorithm. We constructively demonstrate that the problem of generating optimal pattern matching code is decidable for finite (non-recursive) patterns.
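The idea of using the input schema to avoid tests can be illustrated with a deliberately simplified sketch. It is not Xtatic's algorithm and does not handle regular tree patterns: it assumes a flat, first-match list of clauses that test a single tag, and a schema given as a finite set of possible tags. All names (`compile_naive`, `compile_with_schema`, `run`) are hypothetical and introduced only for this illustration.

```python
from typing import List, Optional, Set, Tuple

# A clause pairs a tag to test for (None means "wildcard") with a result label.
Clause = Tuple[Optional[str], str]


def compile_naive(clauses: List[Clause]) -> List[Clause]:
    """Ignore the schema: keep every clause and test them in order."""
    return list(clauses)


def compile_with_schema(clauses: List[Clause], schema_tags: Set[str]) -> List[Clause]:
    """Drop tests that the (assumed) input schema makes useless."""
    remaining = set(schema_tags)  # tags that can still reach the current clause
    plan: List[Clause] = []
    for tag, result in clauses:
        if not remaining:
            break  # no schema-conforming input gets this far
        if tag is None or remaining == {tag}:
            # Every input that reaches this clause matches it: emit no test.
            plan.append((None, result))
            break
        if tag in remaining:
            plan.append((tag, result))
            remaining.discard(tag)
        # Otherwise the tag cannot occur under the schema; skip the clause.
    return plan


def run(plan: List[Clause], tag: str) -> Tuple[str, int]:
    """Execute a plan on a tag; return the result and the number of tests."""
    tests = 0
    for expected, result in plan:
        if expected is None:
            return result, tests
        tests += 1
        if expected == tag:
            return result, tests
    raise ValueError("no clause matched")


clauses = [("a", "A"), ("b", "B"), ("c", "C"), (None, "other")]
schema = {"a", "c"}  # assumed: inputs are known to carry tag "a" or "c"

naive = compile_naive(clauses)
optimized = compile_with_schema(clauses, schema)
```

Under this schema the optimized plan keeps a single test (for `a`) and falls through to the `c` clause without testing, while the naive plan needs three tests to reach the same result for an input tagged `c`. Both plans agree on every input that conforms to the schema, which is the only guarantee this sketch aims for.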

Keywords

Pattern Match, Target Program, Target Language, Regular Pattern, Tree Automaton
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Michael Y. Levin (1)
  • Benjamin C. Pierce (2)
  1. Microsoft Center for Software Excellence
  2. University of Pennsylvania
