Abstract
Large practical NLP applications require robust analysis components that can effectively handle input that is disfluent or extra-grammatical. The effectiveness and efficiency of any robust parser are a direct function of three main factors: (1) Flexibility: what types of disfluencies and deviations from the grammar can the parser handle? (2) Search: how does the parser search the space of possible interpretations, and what techniques are applied to prune the search space? (3) Parse Selection and Disambiguation: what methods and resources are used to evaluate and rank potential parses and sub-parses, and how does the parser cope with the extreme levels of ambiguity introduced by its flexibility parameters? In this chapter we describe our investigations into how to balance flexibility and efficiency in the context of two different robust parsers, a GLR parser and a left-corner chart parser, both based on a unification-augmented context-free grammar formalism. We demonstrate how the combination of beam search with ambiguity packing and statistical disambiguation provides a flexible framework for achieving a good balance between robustness and efficiency in such parsers. Our investigations are based on experimental results and comparative performance evaluations of both parsers using a grammar for the spoken-language ESST (English Spontaneous Scheduling Task) domain.
Copyright information
© 2001 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Rosé, C.P., Lavie, A. (2001). Balancing Robustness and Efficiency in Unification-Augmented Context-Free Parsers for Large Practical Applications. In: Junqua, JC., van Noord, G. (eds) Robustness in Language and Speech Technology. Text, Speech and Language Technology, vol 17. Springer, Dordrecht. https://doi.org/10.1007/978-94-015-9719-7_10
DOI: https://doi.org/10.1007/978-94-015-9719-7_10
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5643-6
Online ISBN: 978-94-015-9719-7
eBook Packages: Springer Book Archive