Employing Oracle Confusion for Parse Quality Estimation

Jain, Sambhav; Jain, Naman; Agrawal, Bhasha; Sangal, Rajeev

doi:10.1007/978-3-319-18111-0_17

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9041)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

We propose an approach for Parse Quality Estimation based on the dynamic computation of an entropy-based confusion score for directed arcs and for joint prediction of directed arcs and their dependency labels, in a typed dependency parsing framework. This score accompanies a parsed output and aims to present an exhaustive picture of the parse quality, detailed down to each arc of the parse tree. The methodology explores the confusion encountered by the oracle of a transition-based data-driven dependency parser. We support our hypothesis by analytically illustrating, for 18 languages, that the arcs with high confusion scores are notably the predominant parsing errors.
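
The exact scoring formula is not reproduced in this abstract, so the following is only a minimal sketch of what an entropy-based confusion score could look like. It assumes the parser's classifier exposes a probability (or non-negative score) for each candidate decision at a parser state, and it normalizes Shannon entropy by the log of the number of candidates so that scores fall in [0, 1]. The function name `confusion_score`, the dictionary input, and the normalization are illustrative assumptions, not the authors' definitions.

```python
import math
from typing import Dict


def confusion_score(probs: Dict[str, float]) -> float:
    """Normalized Shannon entropy over candidate decisions (hypothetical helper).

    `probs` maps each candidate (e.g. a transition, or a transition plus
    dependency label) to a non-negative score. Returns a value in [0, 1]:
    0 when one candidate takes all the mass, 1 when all are equally likely.
    """
    total = sum(probs.values())
    # With fewer than two candidates, or no mass, there is nothing to confuse.
    if total <= 0 or len(probs) < 2:
        return 0.0
    entropy = 0.0
    for p in probs.values():
        q = p / total  # renormalize in case the scores do not sum to 1
        if q > 0:
            entropy -= q * math.log(q)
    # Dividing by log(n) is an assumption made here for comparability
    # across states with different numbers of candidates.
    return entropy / math.log(len(probs))


if __name__ == "__main__":
    # Hypothetical distributions for two arcs of a parse.
    confident_arc = {"LEFT-ARC": 0.97, "RIGHT-ARC": 0.02, "SHIFT": 0.01}
    uncertain_arc = {"LEFT-ARC": 0.40, "RIGHT-ARC": 0.35, "SHIFT": 0.25}
    print(confusion_score(confident_arc))
    print(confusion_score(uncertain_arc))
```

Under these assumptions, an arc whose decision was spread almost evenly across candidates receives a score near 1 and would be flagged as a likely error, while a decision dominated by a single candidate scores near 0.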

Author information

Authors and Affiliations

International Institute of Information Technology, Gachibowli, Hyderabad, 500032, Telangana, India
Sambhav Jain, Naman Jain, Bhasha Agrawal & Rajeev Sangal

Corresponding author

Correspondence to Sambhav Jain.

Editor information

Editors and Affiliations

Centro de Investigación en Computación, Instituto Politécnico Nacional, Mexico DF, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Jain, S., Jain, N., Agrawal, B., Sangal, R. (2015). Employing Oracle Confusion for Parse Quality Estimation. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_17

DOI: https://doi.org/10.1007/978-3-319-18111-0_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18110-3
Online ISBN: 978-3-319-18111-0
eBook Packages: Computer Science, Computer Science (R0)