Employing Oracle Confusion for Parse Quality Estimation

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9041)

Abstract

We propose an approach for Parse Quality Estimation based on the dynamic computation of an entropy-based confusion score for directed arcs and for joint prediction of directed arcs and their dependency labels, in a typed dependency parsing framework. This score accompanies a parsed output and aims to present an exhaustive picture of the parse quality, detailed down to each arc of the parse tree. The methodology explores the confusion encountered by the oracle of a transition-based data-driven dependency parser. We support our hypothesis by analytically illustrating, for 18 languages, that the arcs with high confusion scores are notably the predominant parsing errors.
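
The abstract does not state the formula for the confusion score. As a minimal sketch, assuming the parser's classifier exposes a probability distribution over candidate decisions (transitions, or arc and label pairs) at each step, an entropy-based score could be computed as a normalized Shannon entropy. The function name `confusion_score`, the normalization by the maximum entropy, and the example probabilities are illustrative assumptions, not the authors' definition.

```python
import math
from typing import Sequence


def confusion_score(probs: Sequence[float]) -> float:
    """Normalized Shannon entropy of a decision distribution, in [0, 1].

    0 means the classifier put all its mass on one candidate (no confusion);
    1 means the mass is spread uniformly over all candidates (maximal confusion).
    Hypothetical helper: the paper's exact formulation may differ.
    """
    total = sum(probs)
    if total <= 0 or len(probs) < 2:
        return 0.0  # nothing to be confused about
    entropy = 0.0
    for p in probs:
        q = p / total  # renormalize in case the scores do not sum to 1
        if q > 0:
            entropy -= q * math.log(q)
    # Divide by the entropy of the uniform distribution over the same candidates.
    return entropy / math.log(len(probs))


# Made-up distributions over three candidate decisions for two arcs.
confident_arc = confusion_score([0.98, 0.01, 0.01])
uncertain_arc = confusion_score([0.40, 0.35, 0.25])
```

Under this sketch, an arc whose decision distribution is close to uniform receives a score near 1 and would be the kind of arc flagged as a likely parsing error.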

Author information

Corresponding author

Correspondence to Sambhav Jain.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Jain, S., Jain, N., Agrawal, B., Sangal, R. (2015). Employing Oracle Confusion for Parse Quality Estimation. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_17

  • DOI: https://doi.org/10.1007/978-3-319-18111-0_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18110-3

  • Online ISBN: 978-3-319-18111-0

  • eBook Packages: Computer Science, Computer Science (R0)
