Refinement by Filtering Translation Candidates and Similarity Based Approach to Expand Emotion Tagged Corpus

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 631)

Abstract

Research on emotion estimation from text mostly relies on machine learning. Because machine learning requires a large corpus of examples, how to acquire high-quality training data has been discussed as one of its major problems. Existing language resources include emotion corpora; however, they cannot be used directly when the target language is different. Constructing a bilingual corpus manually is also costly. We propose a method to convert training data into a different language using an existing Japanese-English parallel emotion corpus. With a bilingual dictionary, translation candidates are extracted for every word of each sentence in the corpus. The extracted translation candidates are then narrowed down to a set of words that contribute highly to emotion estimation, and this set of words is used as training data. Moreover, when a large amount of unannotated linguistic resources can be obtained in one language, the words can be expanded based on their distributed representations. By using distributed representations, we can improve accuracy without decreasing the amount of information in each sentence. We then attempted to expand the corpus without translating the target linguistic resources. The results of an evaluation experiment using a machine learning algorithm showed the effectiveness of the emotion corpus expanded on the basis of unannotated sentences in the original language and their similar sentences.
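The pipeline outlined in the abstract (dictionary lookup, candidate filtering, similarity-based expansion) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration only: the toy dictionary, the emotion labels, the "label concentration" score used for filtering, the thresholds, and the two-dimensional word vectors are all hypothetical, and the abstract does not specify the authors' actual criteria.

```python
import math
from collections import Counter

# Hypothetical toy bilingual dictionary (Japanese word -> English candidates).
BILINGUAL_DICT = {
    "嬉しい": ["happy", "glad", "pleased"],
    "悲しい": ["sad", "sorrowful"],
    "今日": ["today"],
}


def translation_candidates(tokens, dictionary):
    """Collect every dictionary translation for each token of a sentence."""
    candidates = []
    for token in tokens:
        candidates.extend(dictionary.get(token, []))
    return candidates


def emotion_scores(labelled_sentences):
    """Score each word by how concentrated it is in a single emotion label.

    Assumed criterion (not taken from the paper):
    max over labels of count(word, label) / count(word).
    """
    per_word = {}
    for words, label in labelled_sentences:
        for word in set(words):
            per_word.setdefault(word, Counter())[label] += 1
    return {w: max(c.values()) / sum(c.values()) for w, c in per_word.items()}


def filter_candidates(words, scores, threshold=0.8):
    """Keep only the words whose score reaches the (assumed) threshold."""
    return [w for w in words if scores.get(w, 0.0) >= threshold]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_words(words, vectors, top_n=1, min_sim=0.9):
    """Append the nearest neighbours of each word in the vector space."""
    expanded = list(words)
    for word in words:
        if word not in vectors:
            continue
        neighbours = sorted(
            ((cosine(vectors[word], vec), other)
             for other, vec in vectors.items() if other != word),
            reverse=True,
        )[:top_n]
        for sim, other in neighbours:
            if sim >= min_sim and other not in expanded:
                expanded.append(other)
    return expanded


# Toy emotion-tagged corpus: (tokenised Japanese sentence, emotion label).
corpus = [(["嬉しい", "今日"], "joy"), (["悲しい", "今日"], "sorrow")]
translated = [(translation_candidates(t, BILINGUAL_DICT), lab) for t, lab in corpus]
scores = emotion_scores(translated)
kept = filter_candidates(translated[0][0], scores)

# Hypothetical word vectors standing in for learned distributed representations.
vectors = {"happy": [1.0, 0.0], "joyful": [0.9, 0.1], "sad": [0.0, 1.0]}
expanded = expand_words(["happy"], vectors)
```

In this toy run, "today" appears under both labels and is filtered out, while "joyful" is added to "happy" because its vector is close. In practice the vectors would be trained on the unannotated resources mentioned in the abstract, and the filtering criterion would be whatever measure of contribution to emotion estimation the method actually uses.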

Keywords

Training Data, Machine Translation, Emotion Category, Statistical Machine Translation, Japanese Language
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgements

This work was supported by JSPS KAKENHI Grant Numbers 15H01712, 15K16077, 15K00425.


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Tokushima University, Tokushima, Japan
