Detection of Incorrect Case Assignments in Paraphrase Generation

  • Conference paper
Natural Language Processing – IJCNLP 2004 (IJCNLP 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3248)

Abstract

This paper addresses the post-transfer process in paraphrasing. Our previous investigation into transfer errors revealed that case assignment tends to be incorrect, irrespective of the type of transfer, in lexical and structural paraphrasing of Japanese sentences [3]. Motivated by this observation, we propose an empirical method for detecting incorrect case assignments. Our error detection model combines two models trained separately: one on a large collection of positive examples and the other on a small collection of manually labeled negative examples. Experimental results show that the combined model significantly outperforms the baseline model, which is trained only on positive examples. We also propose a selective sampling scheme to reduce the cost of collecting negative examples and confirm its effectiveness in the error detection task.
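
The abstract does not give the form of the two models, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, for illustration, that a case assignment is encoded as a (verb, case particle, noun) triple, that the positive-example model is a simple frequency score, that the negative-example model is slot overlap with the nearest labeled error, and that the two are mixed linearly. The names `CombinedDetector`, `select_for_labeling`, `weight`, and `threshold` are hypothetical, and the selective sampling shown is plain uncertainty sampling, which may differ from the scheme in the paper.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Triple = Tuple[str, str, str]  # hypothetical encoding: (verb, case particle, noun)


class CombinedDetector:
    """Toy combination of a positive-example model and a negative-example model."""

    def __init__(self, positives: Sequence[Triple], negatives: Sequence[Triple],
                 weight: float = 0.5) -> None:
        self.pos_counts = Counter(positives)  # large collection of correct triples
        self.negatives = list(negatives)      # small collection of labeled errors
        self.weight = weight                  # assumed mixing weight in [0, 1]

    def positive_score(self, t: Triple) -> float:
        """Plausibility in [0, 1): 0 for unseen triples, rising with frequency."""
        c = self.pos_counts[t]
        return c / (c + 1)

    def negative_score(self, t: Triple) -> float:
        """Similarity in [0, 1] to the closest labeled error (share of equal slots)."""
        if not self.negatives:
            return 0.0
        return max(sum(a == b for a, b in zip(t, neg)) / 3 for neg in self.negatives)

    def error_score(self, t: Triple) -> float:
        """Higher values mean the triple looks more like an incorrect assignment."""
        return (self.weight * (1.0 - self.positive_score(t))
                + (1.0 - self.weight) * self.negative_score(t))


def select_for_labeling(model: CombinedDetector, candidates: Sequence[Triple],
                        k: int, threshold: float = 0.5) -> List[Triple]:
    """Uncertainty sampling (an assumption): pick the k candidates whose
    error score lies closest to the decision threshold."""
    return sorted(candidates, key=lambda t: abs(model.error_score(t) - threshold))[:k]


# Made-up romanized examples: "hon o yomu" (read a book) is fine, "hon ni yomu" is not.
positives = [("yomu", "o", "hon"), ("yomu", "o", "hon"), ("iku", "ni", "gakkou")]
negatives = [("yomu", "ni", "hon")]
detector = CombinedDetector(positives, negatives)
candidates = [("yomu", "ni", "hon"), ("iku", "ni", "gakkou"), ("yomu", "o", "hon")]
to_label = select_for_labeling(detector, candidates, k=1)
```

In this sketch a triple that was never seen among the positive examples and that matches a labeled error receives the maximum error score, while the candidate nearest the threshold is the one sent for manual labeling.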

References

  1. ACL. The 2nd International Workshop on Paraphrasing: Paraphrase Acquisition and Applications (IWP) (2003)

  2. Carroll, J., Minnen, G., Pearce, D., Canning, Y., Devlin, S., Tait, J.: Simplifying text for language-impaired readers. In: Proc. of the 9th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 269–270 (1999)

  3. Fujita, A., Inui, K.: Exploring transfer errors in lexical and structural paraphrasing. Journal of Information Processing Society of Japan 44(11), 2826–2838 (2003) (in Japanese)

  4. Hofmann, T.: Probabilistic latent semantic indexing. In: Proc. of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 50–57 (1999)

  5. Ikehara, S., Miyazaki, M., Shirai, S., Yokoo, A., Nakaiwa, H., Ogura, K., Ooyama, Y., Hayashi, Y. (eds.): Nihongo Goi Taikei – A Japanese Lexicon. Iwanami Shoten (1997) (in Japanese)

  6. Inui, K., Fujita, A., Takahashi, T., Iida, R., Iwakura, T.: Text simplification for reading assistance: a project note. In: Proc. of the 2nd International Workshop on Paraphrasing: Paraphrase Acquisition and Applications (IWP), pp. 9–16 (2003)

  7. Keller, F., Lapata, M., Ourioupina, O.: Using the Web to overcome data sparseness. In: Proc. of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 230–237 (2002)

  8. Kudo, T., Matsumoto, Y.: Japanese dependency analysis using cascaded chunking. In: Proc. of the 6th Conference on Natural Language Learning (CoNLL), pp. 63–69 (2002)

  9. Lapata, M., Keller, F., McDonald, S.: Evaluating smoothing algorithms against plausibility judgements. In: Proc. of the 39th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 346–353 (2001)

  10. Lee, L.: On the effectiveness of the skew divergence for statistical language analysis. In: Proc. of the 8th International Workshop on Artificial Intelligence and Statistics, pp. 65–72 (2001)

  11. NLPRS. Workshop on Automatic Paraphrasing: Theories and Applications (2001)

  12. Pereira, F., Tishby, N., Lee, L.: Distributional clustering of English words. In: Proc. of the 31st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 183–190 (1993)

  13. Ravichandran, D., Hovy, E.: Learning surface text patterns for a question answering system. In: Proc. of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 215–222 (2002)

  14. Shirai, S., Ikehara, S., Kawaoka, T.: Effects of automatic rewriting of source language within a Japanese to English MT system. In: Proc. of the 5th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI), pp. 226–239 (1993)

  15. Takahashi, T., Iwakura, T., Iida, R., Fujita, A., Inui, K.: KURA: a transfer-based lexicostructural paraphrasing engine. In: Proc. of the 6th Natural Language Processing Pacific Rim Symposium (NLPRS) Workshop on Automatic Paraphrasing: Theories and Applications, pp. 37–46 (2001)

  16. Torisawa, K.: An unsupervised learning method for associative relationships between verb phrases. In: Proc. of the 19th International Conference on Computational Linguistics (COLING), pp. 1009–1015 (2002)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fujita, A., Inui, K., Matsumoto, Y. (2005). Detection of Incorrect Case Assignments in Paraphrase Generation. In: Su, KY., Tsujii, J., Lee, JH., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2004. IJCNLP 2004. Lecture Notes in Computer Science (LNAI), vol 3248. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30211-7_59

  • DOI: https://doi.org/10.1007/978-3-540-30211-7_59

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24475-2

  • Online ISBN: 978-3-540-30211-7

  • eBook Packages: Computer Science, Computer Science (R0)
