Regular Approximation of Context-Free Grammars through Transformation

  • Mehryar Mohri
  • Mark-Jan Nederhof
Part of the Text, Speech and Language Technology book series (TLTB, volume 17)

Abstract

We present an algorithm for approximating context-free languages with regular languages. The algorithm is based on a simple transformation that applies to any context-free grammar and guarantees that the result can be compiled into a finite automaton. The resulting grammar contains at most one new nonterminal for each nonterminal symbol of the input grammar, so it remains readable and, if necessary, modifiable. We extend the approximation algorithm to weighted context-free grammars. We also report experiments with several grammars showing that the minimal deterministic automata accepting the resulting approximations are of a size that is practical for applications such as speech recognition.
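The abstract does not spell out the transformation, so the following is only a minimal sketch of how such a grammar rewrite can look, based on the construction as it is usually described for this approach. For a set M of mutually recursive nonterminals, each nonterminal A in M gets one new nonterminal A' with a rule A' → ε, and each rule for A is cut at every occurrence of a nonterminal from M. The function name `transform`, the rule encoding as `(lhs, rhs_tuple)` pairs, and the assumption that the caller supplies M (rather than computing it from the grammar) are all illustrative choices, not taken from the chapter.

```python
def transform(rules, m):
    """Rewrite the rules of one mutually recursive set `m` (sketch).

    rules: list of (lhs, rhs) pairs, rhs being a tuple of symbols.
    m:     set of mutually recursive nonterminals (assumed given).
    Returns a new rule list with one primed nonterminal per member of m.
    """
    prime = {a: a + "'" for a in m}
    # Each new nonterminal A' may derive the empty string.
    out = [(prime[a], ()) for a in sorted(m)]
    for lhs, rhs in rules:
        if lhs not in m:
            # Rules outside the recursive set are kept unchanged.
            out.append((lhs, tuple(rhs)))
            continue
        current, segment = lhs, []
        for sym in rhs:
            if sym in m:
                # Cut the rule here: everything seen so far, then the
                # recursive nonterminal in final position.
                out.append((current, tuple(segment) + (sym,)))
                current, segment = prime[sym], []
            else:
                segment.append(sym)
        # The remaining tail ends in the primed left-hand side.
        out.append((current, tuple(segment) + (prime[lhs],)))
    return out


if __name__ == "__main__":
    # S -> a S b | epsilon, which generates a^n b^n.
    grammar = [("S", ("a", "S", "b")), ("S", ())]
    for lhs, rhs in transform(grammar, {"S"}):
        print(lhs, "->", " ".join(rhs) or "ε")
```

For the grammar S → a S b | ε, this sketch produces S' → ε, S → a S, S' → b S', and S → S'. Every rule is now right-linear with respect to the recursive set, and the generated language is a*b*, a regular superset of the original aⁿbⁿ.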

Keywords

Regular Language, Finite Automaton, Parse Tree, Input String, Nonterminal Symbol



Copyright information

© Springer Science+Business Media Dordrecht 2001

Authors and Affiliations

  • Mehryar Mohri (1)
  • Mark-Jan Nederhof (1)
  1. AT&T Labs - Research, USA
