Latent Sentiment Model for Weakly-Supervised Cross-Lingual Sentiment Classification

He, Yulan

doi:10.1007/978-3-642-20161-5_22

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6611)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

In this paper, we present a novel weakly-supervised method for cross-lingual sentiment analysis. Specifically, we propose a latent sentiment model (LSM) based on latent Dirichlet allocation in which sentiment labels are treated as topics. Prior information extracted from English sentiment lexicons through machine translation is incorporated into LSM learning, where preferences on expectations of sentiment labels of those lexicon words are expressed using generalized expectation criteria. An efficient parameter estimation procedure using variational Bayes is presented. Experimental results on Chinese product reviews show that the weakly-supervised LSM performs comparably to supervised classifiers such as Support Vector Machines, achieving an average accuracy of 81% over a total of 5484 review documents. Moreover, starting from a generic sentiment lexicon, the LSM is able to extract highly domain-specific polarity words from text.
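
The core idea of the abstract, an LDA-style model whose topics are sentiment labels and whose word distributions are steered by a seed lexicon, can be illustrated with a toy sketch. This is not the paper's method: it swaps in hard lexicon-seeded Dirichlet priors and collapsed Gibbs sampling in place of the generalized expectation criteria and variational Bayes named above. The function name `fit_sentiment_lda`, the hyperparameters, and the English tokens (standing in for machine-translated lexicon entries) are all hypothetical.

```python
import numpy as np

def fit_sentiment_lda(docs, lexicon, labels=("pos", "neg"),
                      alpha=0.5, beta=0.01, n_iter=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA whose topics are sentiment labels.

    Assumptions (illustrative only): every lexicon label appears in `labels`,
    and the vocabulary contains at least one word allowed under each label.
    """
    rng = np.random.default_rng(seed)
    vocab = sorted({w for doc in docs for w in doc})
    w2i = {w: i for i, w in enumerate(vocab)}
    K, V = len(labels), len(vocab)

    # Lexicon prior: a seed word gets zero prior mass under every other label,
    # so it can only ever be assigned to its own sentiment.
    prior = np.full((K, V), beta)
    for word, lab in lexicon.items():
        if word in w2i:
            for k in range(K):
                if labels[k] != lab:
                    prior[k, w2i[word]] = 0.0

    n_dk = np.zeros((len(docs), K))  # label counts per document
    n_kw = np.zeros((K, V))          # word counts per label
    n_k = np.zeros(K)                # total tokens per label
    z = []                           # label assignment of every token

    # Initialise each token with a random label that its prior allows.
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            i = w2i[w]
            k = int(rng.choice(np.flatnonzero(prior[:, i] > 0)))
            zd.append(k)
            n_dk[d, k] += 1
            n_kw[k, i] += 1
            n_k[k] += 1
        z.append(zd)

    prior_sum = prior.sum(axis=1)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                i, k = w2i[w], z[d][n]
                # Remove the token, resample its label, then add it back.
                n_dk[d, k] -= 1
                n_kw[k, i] -= 1
                n_k[k] -= 1
                p = (n_dk[d] + alpha) * (n_kw[:, i] + prior[:, i]) / (n_k + prior_sum)
                k = int(rng.choice(K, p=p / p.sum()))
                z[d][n] = k
                n_dk[d, k] += 1
                n_kw[k, i] += 1
                n_k[k] += 1

    theta = (n_dk + alpha) / (n_dk + alpha).sum(axis=1, keepdims=True)
    phi = (n_kw + prior) / (n_kw + prior).sum(axis=1, keepdims=True)
    return vocab, theta, phi


# Hypothetical toy reviews and seed lexicon.
docs = [["good", "good", "excellent"],
        ["bad", "poor"],
        ["good", "battery", "lasts"],
        ["poor", "battery", "bad"]]
lexicon = {"good": "pos", "excellent": "pos", "bad": "neg", "poor": "neg"}

vocab, theta, phi = fit_sentiment_lda(docs, lexicon)
for doc, row in zip(docs, theta):
    print(" ".join(doc), "->", ("pos", "neg")[int(row.argmax())])
```

Each document is labelled by its dominant sentiment proportion in `theta`, and the rows of `phi` rank words by polarity, which is roughly how a model of this kind could surface domain-specific polarity words that are absent from the seed lexicon.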

Author information

Authors and Affiliations

Knowledge Media Institute, The Open University, Walton Hall, Milton Keynes, MK7 6AA, UK
Yulan He

Editor information

Editors and Affiliations

Information School, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP, Sheffield, UK
Paul Clough
CLARITY: Centre for Sensor Web Technologies, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland
Colum Foley, Cathal Gurrin & Hyowon Lee
Centre for Next Generation Localisation, School of Computing, Dublin City University, Glasnevin, Dublin 9, Ireland
Gareth J. F. Jones
TNO Human Factors, Brassersplein 2, 2612 CT, Delft, The Netherlands
Wessel Kraaij
Yahoo! Research, 177 Diagonal, 08018, Barcelona, Spain
Vanessa Murdock

About this paper

Cite this paper

He, Y. (2011). Latent Sentiment Model for Weakly-Supervised Cross-Lingual Sentiment Classification. In: Clough, P., et al. Advances in Information Retrieval. ECIR 2011. Lecture Notes in Computer Science, vol 6611. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20161-5_22

DOI: https://doi.org/10.1007/978-3-642-20161-5_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-20160-8
Online ISBN: 978-3-642-20161-5
eBook Packages: Computer Science, Computer Science (R0)
