A Case Study on Grammatical-Based Representation for Regular Expression Evolution

  • Conference paper
Trends in Practical Applications of Agents and Multiagent Systems

Part of the book series: Advances in Intelligent and Soft Computing (AINSC, volume 71)

Abstract

Regular expressions, or simply regex, have been widely used for decades as a powerful pattern-matching and text-extraction tool. Although they provide a powerful and flexible notation for defining and retrieving patterns from text, the syntax and grammatical rules of regex notations are not easy to use, or even to understand. Any regex can be represented as a deterministic or non-deterministic finite automaton, so it is possible to design a representation for building a regex automatically, together with an optimization algorithm able to find the best regex in terms of complexity. This paper introduces both a graph-based representation for regex and a heuristic-based evolutionary computing algorithm that exploits grammatical features of this language, and applies them to a particular data extraction problem.
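To make the idea of searching for the "best regex in terms of complexity" concrete, the following is a minimal sketch of how a candidate regex might be scored against labelled example strings. It is not the representation or algorithm of the paper, which this preview does not describe; the function name `fitness`, the `length_penalty` parameter, the use of pattern length as a stand-in for complexity, and the example strings are all assumptions made for illustration.

```python
import re


def fitness(pattern, positives, negatives, length_penalty=0.01):
    """Score a candidate regex (hypothetical fitness, not the paper's).

    Rewards accepting the positive examples and rejecting the negative
    ones, then subtracts a penalty proportional to the pattern length.
    """
    try:
        compiled = re.compile(pattern)
    except re.error:
        # Syntactically invalid candidates get the worst possible score.
        return float("-inf")
    hits = sum(1 for s in positives if compiled.fullmatch(s))
    rejections = sum(1 for s in negatives if not compiled.fullmatch(s))
    accuracy = (hits + rejections) / (len(positives) + len(negatives))
    return accuracy - length_penalty * len(pattern)


# Illustrative data: extract four-digit tokens such as years.
positives = ["2010", "1999", "0042"]
negatives = ["20a0", "199", "year"]

# A hand-written candidate pool; an evolutionary algorithm would instead
# generate and vary such candidates over successive generations.
candidates = [r"\d+", r"\d{4}", r"[0-9][0-9][0-9][0-9]", r".*", r"(\d{4}"]

best = max(candidates, key=lambda p: fitness(p, positives, negatives))
print(best)
```

In this sketch `\d{4}` and `[0-9][0-9][0-9][0-9]` classify every example correctly, and the length penalty is what separates them. An evolutionary search would use a score of this kind to decide which candidates survive into the next generation.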



Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

González-Pardo, A., Barrero, D.F., Camacho, D., R-Moreno, M.D. (2010). A Case Study on Grammatical-Based Representation for Regular Expression Evolution. In: Demazeau, Y., et al. Trends in Practical Applications of Agents and Multiagent Systems. Advances in Intelligent and Soft Computing, vol 71. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12433-4_45

  • DOI: https://doi.org/10.1007/978-3-642-12433-4_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12432-7

  • Online ISBN: 978-3-642-12433-4

  • eBook Packages: Engineering, Engineering (R0)
