Abstract
This chapter presents a new formalization of probabilistic GLR language modeling for statistical parsing. Our model inherits its essential features from Briscoe and Carroll’s generalized probabilistic LR model (Briscoe and Carroll 1993), which takes the context of a parse derivation into account by assigning a probability to each LR parsing action according to its left and right context. Briscoe and Carroll’s model, however, is not formalized in a probabilistically well-founded way, which may degrade its parsing performance. Our formulation overcomes this drawback with a few significant refinements while retaining all the advantages of Briscoe and Carroll’s model. We discuss the formal and qualitative aspects of our model, illustrating how it differs qualitatively from Briscoe and Carroll’s model and how these differences are expected to affect parsing performance.
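The general idea of assigning a probability to each LR parsing action according to its context can be sketched as follows. This is only a generic illustration under stated assumptions, not the formalization presented in the chapter and not Briscoe and Carroll’s exact normalization: the LR state stands in for the left context, the lookahead symbol for the right context, and the function names, state numbers, and action labels are all hypothetical.

```python
from collections import Counter, defaultdict
from math import prod


def estimate_action_probs(derivations):
    """Relative-frequency estimate of P(action | state, lookahead).

    Each derivation is a list of (state, lookahead, action) triples.
    Assumption: the state summarizes the left context and the lookahead
    the right context; the chapter's own conditioning may differ.
    """
    counts = defaultdict(Counter)
    for derivation in derivations:
        for state, lookahead, action in derivation:
            counts[(state, lookahead)][action] += 1
    return {
        context: {a: n / sum(c.values()) for a, n in c.items()}
        for context, c in counts.items()
    }


def derivation_prob(derivation, probs):
    """Score a derivation as the product of its action probabilities.

    Unseen (context, action) pairs get probability 0.0 (no smoothing).
    """
    return prod(
        probs.get((state, lookahead), {}).get(action, 0.0)
        for state, lookahead, action in derivation
    )


# Hypothetical toy data: state 1 with lookahead "v" has a
# shift/reduce conflict that the probabilities must resolve.
reduce_first = [(0, "n", "sh1"), (1, "v", "re2"), (2, "v", "sh3")]
shift_first = [(0, "n", "sh1"), (1, "v", "sh4")]

probs = estimate_action_probs([reduce_first, reduce_first, shift_first])
print(derivation_prob(reduce_first, probs))
print(derivation_prob(shift_first, probs))
```

In this toy setting the competing derivations are ranked by how often each action was chosen in the same state and lookahead context. Whether such scores form a well-founded probability distribution over parses depends on how the normalization is defined, which is the issue the chapter addresses.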
References
Aho, A., R. Sethi, and J. Ullman (1986) Compilers: Principles, Techniques, and Tools. Reading, MA: Addison Wesley.
Black, E., F. Jelinek, J. Lafferty, D. Magerman, R. Mercer, and S. Roukos (1993) Towards history-based grammars: Using richer models for probabilistic parsing. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pp. 31–37.
Briscoe, T. and J. Carroll (1993) Generalized probabilistic LR parsing of natural language (corpora) with unification-based grammars. Computational Linguistics 19 (1): 25–60.
Carroll, J. and E. Briscoe (1992) Probabilistic normalization and unpacking of packed parse forests for unification-based grammars. In Proceedings, AAAI Fall Symposium on Probabilistic Approaches to Natural Language, pp. 33–38.
Carroll, J. and E. Briscoe (1998) Can subcategorisation probabilities help a statistical parser? In Proceedings of the 6th ACL/SIGDAT Workshop on Very Large Corpora, pp. 92–100.
Chapman, N. (1987) LR Parsing: Theory and Practice. Cambridge University Press.
Charniak, E. (1997) Statistical parsing with a context-free grammar and word statistics. In Proceedings of the National Conference on Artificial Intelligence, pp. 598–603.
Collins, M. (1996) A new statistical parser based on bigram lexical dependencies. In Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, pp. 184–191.
Collins, M. (1997) Three generative, lexicalised models for statistical parsing. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics, pp. 16–23.
Fujio, M. and Y. Matsumoto (1998) Japanese dependency structure analysis based on lexicalized statistics. In Proceedings of the 3rd Conference on Empirical Methods in Natural Language Processing, pp. 88–96.
Haruno, M., S. Shirai, and Y. Ooyama (1998) Using decision trees to construct a practical parser. In Proceedings of COLING-ACL’98, pp. 505–511.
Inui, K., K. Shirai, H. Tanaka, and T. Tokunaga (1997) Integrated probabilistic language modeling for statistical parsing. Technical Report TR97–0005, Dept. of Computer Science, Tokyo Institute of Technology. Available from http://www.cs.titech.ac.jp/tr.html.
Inui, K., V. Sornlertlamvanich, H. Tanaka, and T. Tokunaga (1997) A new probabilistic LR language model for statistical parsing. Technical Report TR97–0004, Department of Computer Science, Tokyo Institute of Technology. Available from http://www.cs.titech.ac.jp/tr.html.
Kita, K. (1994) Spoken sentence recognition based on HMM-LR with hybrid language modeling. IEICE Trans. Inf. & Syst., Vol. E77-D, No. 2.
Li, H. (1996) A probabilistic disambiguation method based on psycholinguistic principles. In Proceedings of the Fourth Workshop on Very Large Corpora (WVLC-4). Available from cmp-lg/9606016.
Li, H. and H. Tanaka (1995) A method for integrating the connection constraints into an LR table. In Proceedings of the Natural Language Processing Pacific Rim Symposium ’95, pp. 703–708.
Magerman, D. and M. Marcus (1991) Pearl: A probabilistic chart parser. In Proceedings of the 5th Conference of the European Chapter of the Association for Computational Linguistics, pp. 15–20.
Schabes, Y. (1992) Stochastic lexicalized tree-adjoining grammars. In Proceedings of the 14th International Conference on Computational Linguistics, Vol. 2, pp. 425–432.
Sekine, S. and R. Grishman (1995) A corpus-based probabilistic grammar with only two non-terminals. In Proceedings of the Fourth International Workshop on Parsing Technologies. Prague: Charles University, pp. 216–223.
Shirai, K., K. Inui, T. Tokunaga, and H. Tanaka (1998) An empirical evaluation of statistical parsing of Japanese sentences using a lexical association statistics. In Proceedings of the 3rd Conference on Empirical Methods in Natural Language Processing, pp. 80–87.
Sornlertlamvanich, V., K. Inui, K. Shirai, H. Tanaka, and T. Tokunaga (1997) Empirical evaluation of probabilistic GLR parsing. In Proceedings of the Natural Language Processing Pacific Rim Symposium, pp. 169–174.
Sornlertlamvanich, V., K. Inui, K. Shirai, H. Tanaka, T. Tokunaga, and T. Takezawa (1999) Empirical support for new probabilistic generalized LR parsing. Journal of Natural Language Processing, Vol. 6, No. 2, pp. 3–22.
Su, K., J.-N. Wang, M.-H. Su, and J.-S. Chang (1991) GLR parsing with scoring. In M. Tomita (ed.) (1991), Generalized LR Parsing, pp. 93–112.
Tomita, M. (1986) Efficient Parsing for Natural Language. Boston (Mass): Kluwer.
Tomita, M. (ed.) (1991) Generalized LR Parsing. Boston (Mass): Kluwer.
Uchimoto, K., S. Sekine, and H. Isahara (1999) Analysis of Japanese sentences using an integrated statistical language model. In Proceedings of the 13th Conference of the European Chapter of the ACL, pp. 196–203.
Wright, J. and E. Wrigley (1991) GLR parsing with probability. In M. Tomita (ed.) (1991), Generalized LR Parsing, pp. 113–128.
Copyright information
© 2000 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Inui, K., Sornlertlamvanich, V., Tanaka, H., Tokunaga, T. (2000). Probabilistic GLR Parsing. In: Bunt, H., Nijholt, A. (eds) Advances in Probabilistic and Other Parsing Technologies. Text, Speech and Language Technology, vol 16. Springer, Dordrecht. https://doi.org/10.1007/978-94-015-9470-7_5
DOI: https://doi.org/10.1007/978-94-015-9470-7_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5579-8
Online ISBN: 978-94-015-9470-7
eBook Packages: Springer Book Archive