Computation of the N Best Parse Trees for Weighted and Stochastic Context-Free Grammars

  • Víctor M. Jiménez
  • Andrés Marzal
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1876)

Abstract

Context-Free Grammars are attracting increasing interest in the pattern recognition research community as a way to overcome the limited modeling capabilities of the simpler regular grammars, and have applications in a variety of fields such as language modeling, speech recognition, optical character recognition, and computational biology. This paper proposes an efficient algorithm to solve one of the problems associated with the use of weighted and stochastic Context-Free Grammars: computing the N best parse trees of a given string. After the best parse tree has been computed using the CYK algorithm, a large number of alternative parse trees are obtained, in order of weight (or probability), in a small fraction of the time the CYK algorithm requires to find the best parse tree. This is confirmed by experimental results using grammars from two different domains: a chromosome grammar and a grammar modeling natural language sentences from the Wall Street Journal corpus.
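To make the problem statement concrete, the following is a minimal sketch of N-best parsing for a stochastic grammar in Chomsky normal form. It is not the algorithm proposed in the paper, whose details the abstract does not give: it is a simple baseline that keeps up to N candidate trees in every CYK chart cell rather than enumerating alternatives after the best parse has been found. The function name `n_best_parses`, the dictionary-based rule format, and the toy grammar are all assumptions made for illustration.

```python
import heapq
from collections import defaultdict


def n_best_parses(words, lexical, binary, start, n):
    """Return up to n (probability, tree) pairs for `words`, best first.

    Assumed (hypothetical) rule format for a grammar in Chomsky normal form:
      lexical: dict mapping (A, word) -> probability of rule A -> word
      binary:  dict mapping (A, B, C) -> probability of rule A -> B C
    """
    length = len(words)
    # chart[(i, j)][A] holds (prob, tree) pairs for A spanning words[i:j],
    # sorted from most to least probable and truncated to n entries.
    chart = defaultdict(dict)

    # Spans of length 1: apply the lexical rules.
    for i, word in enumerate(words):
        for (a, w), p in lexical.items():
            if w == word:
                chart[(i, i + 1)].setdefault(a, []).append((p, (a, word)))

    # Longer spans: combine the candidate lists of the two sub-spans.
    for span in range(2, length + 1):
        for i in range(length - span + 1):
            j = i + span
            candidates = defaultdict(list)
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for (a, b, c), p in binary.items():
                    if b in left and c in right:
                        for pl, tl in left[b]:
                            for pr, tr in right[c]:
                                candidates[a].append((p * pl * pr, (a, tl, tr)))
            for a, items in candidates.items():
                # Keep only the n most probable trees for this cell.
                chart[(i, j)][a] = heapq.nlargest(n, items, key=lambda item: item[0])

    return chart[(0, length)].get(start, [])


# Toy ambiguous grammar, invented for this example.
toy_lexical = {("S", "a"): 0.5, ("T", "a"): 1.0}
toy_binary = {("S", "S", "T"): 0.3, ("S", "T", "S"): 0.2}

for prob, tree in n_best_parses(["a", "a"], toy_lexical, toy_binary, "S", 2):
    print(prob, tree)
```

Storing N candidates in every cell multiplies the work done in each CYK step, which is the kind of overhead that motivates computing the best tree first and then obtaining the alternatives at small additional cost, as the abstract describes.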

Keywords

Weighted Context-Free Grammars, Stochastic Context-Free Grammars, CYK Algorithm, N Best Parse Trees

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Víctor M. Jiménez (1)
  • Andrés Marzal (1)
  1. Dept. de Informática, Universitat Jaume I, Castellón, Spain
