Detection of Incorrect Case Assignments in Paraphrase Generation

  • Atsushi Fujita
  • Kentaro Inui
  • Yuji Matsumoto
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3248)


This paper addresses the post-transfer process in paraphrasing. Our previous investigation into transfer errors revealed that case assignment tends to be incorrect, irrespective of the type of transfer, in lexical and structural paraphrasing of Japanese sentences [3]. Motivated by this observation, we propose an empirical method for detecting incorrect case assignments. Our error detection model combines two models that are trained separately: one on a large collection of positive examples and the other on a small collection of manually labeled negative examples. Experimental results show that the combined model significantly outperforms the baseline model, which is trained only on positive examples. We also propose a selective sampling scheme to reduce the cost of collecting negative examples, and confirm its effectiveness in the error detection task.




  1. ACL: The 2nd International Workshop on Paraphrasing: Paraphrase Acquisition and Applications (IWP) (2003)
  2. Carroll, J., Minnen, G., Pearce, D., Canning, Y., Devlin, S., Tait, J.: Simplifying text for language-impaired readers. In: Proc. of the 9th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pp. 269–270 (1999)
  3. Fujita, A., Inui, K.: Exploring transfer errors in lexical and structural paraphrasing. Journal of Information Processing Society of Japan 44(11), 2826–2838 (2003) (in Japanese)
  4. Hofmann, T.: Probabilistic latent semantic indexing. In: Proc. of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), pp. 50–57 (1999)
  5. Ikehara, S., Miyazaki, M., Shirai, S., Yokoo, A., Nakaiwa, H., Ogura, K., Ooyama, Y., Hayashi, Y. (eds.): Nihongo Goi Taikei – A Japanese Lexicon. Iwanami Shoten (1997) (in Japanese)
  6. Inui, K., Fujita, A., Takahashi, T., Iida, R., Iwakura, T.: Text simplification for reading assistance: a project note. In: Proc. of the 2nd International Workshop on Paraphrasing: Paraphrase Acquisition and Applications (IWP), pp. 9–16 (2003)
  7. Keller, F., Lapata, M., Ourioupina, O.: Using the Web to overcome data sparseness. In: Proc. of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 230–237 (2002)
  8. Kudo, T., Matsumoto, Y.: Japanese dependency analysis using cascaded chunking. In: Proc. of 6th Conference on Natural Language Learning (CoNLL), pp. 63–69 (2002)
  9. Lapata, M., Keller, F., McDonald, S.: Evaluating smoothing algorithms against plausibility judgements. In: Proc. of the 39th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 346–353 (2001)
  10. Lee, L.: On the effectiveness of the skew divergence for statistical language analysis. In: Proc. of the 8th International Workshop on Artificial Intelligence and Statistics, pp. 65–72 (2001)
  11. NLPRS: Workshop on Automatic Paraphrasing: Theories and Applications (2001)
  12. Pereira, F., Tishby, N., Lee, L.: Distributional clustering of English words. In: Proc. of the 31st Annual Meeting of the Association for Computational Linguistics (ACL), pp. 183–190 (1993)
  13. Ravichandran, D., Hovy, E.: Learning surface text patterns for a question answering system. In: Proc. of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 215–222 (2002)
  14. Shirai, S., Ikehara, S., Kawaoka, T.: Effects of automatic rewriting of source language within a Japanese to English MT system. In: Proc. of the 5th International Conference on Theoretical and Methodological Issues in Machine Translation (TMI), pp. 226–239 (1993)
  15. Takahashi, T., Iwakura, T., Iida, R., Fujita, A., Inui, K.: KURA: a transfer-based lexicostructural paraphrasing engine. In: Proc. of the 6th Natural Language Processing Pacific Rim Symposium (NLPRS) Workshop on Automatic Paraphrasing: Theories and Applications, pp. 37–46 (2001)
  16. Torisawa, K.: An unsupervised learning method for associative relationships between verb phrases. In: Proc. of the 19th International Conference on Computational Linguistics (COLING), pp. 1009–1015 (2002)

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Atsushi Fujita (1)
  • Kentaro Inui (1)
  • Yuji Matsumoto (1)

  1. Graduate School of Information Science, Nara Institute of Science and Technology
