Latent Sentiment Model for Weakly-Supervised Cross-Lingual Sentiment Classification

  • Conference paper
Advances in Information Retrieval (ECIR 2011)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6611)

Abstract

In this paper, we present a novel weakly-supervised method for cross-lingual sentiment analysis. Specifically, we propose a latent sentiment model (LSM) based on latent Dirichlet allocation in which sentiment labels are treated as topics. Prior information extracted from English sentiment lexicons through machine translation is incorporated into LSM learning, with preferences on the expected sentiment labels of those lexicon words expressed using generalized expectation criteria. An efficient parameter estimation procedure using variational Bayes is presented. Experimental results on Chinese product reviews show that the weakly-supervised LSM performs comparably to supervised classifiers such as Support Vector Machines, achieving an average accuracy of 81% over a total of 5484 review documents. Moreover, starting from a generic sentiment lexicon, the LSM is able to extract highly domain-specific polarity words from text.
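
As a rough illustration of how lexicon words can act as soft supervision, the sketch below computes a generalized-expectation-style penalty: for each lexicon word, the KL divergence between a reference label distribution and the label distribution a model assigns to that word. It is not the paper's objective or its variational Bayes procedure; the lexicon entries, the 0.9 confidence value, and the names `LEXICON`, `reference_distribution` and `ge_penalty` are all assumptions made for this example.

```python
import math

# Hypothetical machine-translated lexicon: word -> polarity label (assumed entries).
LEXICON = {"好": "pos", "满意": "pos", "差": "neg", "失望": "neg"}
LABELS = ("pos", "neg")


def reference_distribution(label, confidence=0.9):
    """Target label distribution for a lexicon word.

    `confidence` is an assumed value: the probability mass placed on the
    lexicon's label; the remainder is spread evenly over the other labels.
    """
    rest = (1.0 - confidence) / (len(LABELS) - 1)
    return {l: (confidence if l == label else rest) for l in LABELS}


def ge_penalty(model_probs, lexicon=LEXICON, confidence=0.9):
    """Sum of KL(reference || model) over lexicon words the model covers.

    `model_probs` maps word -> {label: probability}, with every probability
    strictly positive. Smaller values mean the model's label expectations
    for lexicon words are closer to the lexicon's preferences.
    """
    total = 0.0
    for word, label in lexicon.items():
        if word not in model_probs:
            continue  # words the model has not seen contribute nothing
        ref = reference_distribution(label, confidence)
        total += sum(
            ref[l] * math.log(ref[l] / model_probs[word][l]) for l in LABELS
        )
    return total


# A model that agrees with the lexicon is penalised less than one that disagrees.
agree = {"好": {"pos": 0.9, "neg": 0.1}}
disagree = {"好": {"pos": 0.2, "neg": 0.8}}
print(ge_penalty(agree), ge_penalty(disagree))
```

In a full model, a term of this kind would be combined with the likelihood of the review corpus, so that lexicon words steer the sentiment-label topics without any labelled documents.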

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

He, Y. (2011). Latent Sentiment Model for Weakly-Supervised Cross-Lingual Sentiment Classification. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_22

  • DOI: https://doi.org/10.1007/978-3-642-20161-5_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20160-8

  • Online ISBN: 978-3-642-20161-5

  • eBook Packages: Computer Science, Computer Science (R0)
