Abstract

This chapter introduces the Interactive Parsing (IP) framework for obtaining the correct syntactic parse tree of a given sentence. This formal framework makes it possible to construct interactive systems for tree annotation. Such systems can help human annotators create error-free parse trees with little effort compared with manually post-editing the trees provided by an automatic parser.

In principle, the interaction protocol defined in the IP framework differs from the left-to-right interaction protocol used throughout this book. Specifically, the IP protocol imposes no fixed order; that is, in IP the user can edit any part of the parse tree, in any order. However, in order to calculate the next best tree efficiently in the IP framework, a left-to-right depth-first tree review order is introduced in Sect. 9.4. This order also brings computational advantages to the search for the most probable tree in interactive bottom-up parsing algorithms. The use of Confidence Measures in IP is also presented as an efficient technique for detecting erroneous parse trees. Confidence Measures can be computed efficiently in the IP framework and can help detect erroneous constituents more quickly, as they provide discriminant information throughout the IP process.
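As a rough illustration of what a left-to-right depth-first review order could look like, the following Python sketch lists the constituents of a parse tree in pre-order, each with its word span. The tree encoding, the function names `review_order` and `first_error`, and the use of pre-order are assumptions made for this example; they are not taken from the chapter, and the simulated check against a reference tree only stands in for a human annotator.

```python
from typing import List, Optional, Tuple, Union

# Assumed encoding: a tree is either a word (str) or a pair (label, children).
Tree = Union[str, Tuple[str, list]]
# A constituent is (label, start, end), with `end` exclusive.
Constituent = Tuple[str, int, int]


def _walk(tree: Tree, start: int) -> Tuple[List[Constituent], int]:
    """Return the constituents under `tree` in pre-order and the next word index."""
    if isinstance(tree, str):
        # A word covers exactly one position and is not itself a constituent.
        return [], start + 1
    label, children = tree
    below: List[Constituent] = []
    pos = start
    for child in children:  # children are visited left to right
        found, pos = _walk(child, pos)
        below.extend(found)
    # The parent comes before everything it dominates (depth-first, pre-order).
    return [(label, start, pos)] + below, pos


def review_order(tree: Tree) -> List[Constituent]:
    """Constituents in the order a reviewer would visit them."""
    return _walk(tree, 0)[0]


def first_error(proposed: Tree, reference: Tree) -> Optional[Constituent]:
    """First proposed constituent, in review order, absent from the reference tree."""
    correct = set(review_order(reference))
    for constituent in review_order(proposed):
        if constituent not in correct:
            return constituent
    return None


reference = ("S", [
    ("NP", [("DT", ["the"]), ("NN", ["dog"])]),
    ("VP", [("VBD", ["saw"]),
            ("NP", [("DT", ["a"]), ("NN", ["cat"])])]),
])
# Same bracketing, but the object phrase carries the wrong label.
proposed = ("S", [
    ("NP", [("DT", ["the"]), ("NN", ["dog"])]),
    ("VP", [("VBD", ["saw"]),
            ("PP", [("DT", ["a"]), ("NN", ["cat"])])]),
])

for constituent in review_order(proposed):
    print(constituent)
print("first error:", first_error(proposed, reference))
```

In this sketch, every constituent visited before the first error has already been checked, which hints at why a fixed review order could make the next parse cheaper to compute than free-order editing.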

With Contribution Of: José Miguel Benedí, Joan Andreu Sánchez and Ricardo Sánchez-Sáez.

Notes

  1. http://nltk.sourceforge.net/.

Author information

Correspondence to Alejandro Héctor Toselli.

Copyright information

© 2011 Springer-Verlag London Limited

About this chapter

Cite this chapter

Toselli, A.H., Vidal, E., Casacuberta, F. (2011). Interactive Parsing. In: Multimodal Interactive Pattern Recognition and Applications. Springer, London. https://doi.org/10.1007/978-0-85729-479-1_9

  • DOI: https://doi.org/10.1007/978-0-85729-479-1_9

  • Publisher Name: Springer, London

  • Print ISBN: 978-0-85729-478-4

  • Online ISBN: 978-0-85729-479-1

  • eBook Packages: Computer Science (R0)