
Sentiment Analysis on Twitter through Topic-Based Lexicon Expansion

  • Conference paper
Databases Theory and Applications (ADC 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8506)

Abstract

Supervised learning approaches are domain-dependent, and it is costly to obtain labeled training data for different domains. Lexicon-based approaches perform stably across domains but often cannot capture domain-dependent features. Lexicon-based classifiers also struggle to identify the polarity of abbreviations and misspellings, which are common in short informal social text but rarely found in general sentiment lexicons. We propose to overcome this limitation by expanding a general lexicon with domain-dependent opinion words as well as abbreviations and informal opinion expressions. The expanded terms are selected automatically based on their mutual information with emoticons. Because emoticon-bearing tweets are abundant on Twitter, our approach enables domain-dependent sentiment analysis without the cost of data annotation. We show that our technique yields statistically significant improvements in classification accuracy across 56 topics with a state-of-the-art lexicon-based classifier. We also present the expanded terms and show the most representative opinion expressions obtained from co-occurrence with emoticons.
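
The idea of selecting terms by their association with emoticons can be illustrated with a minimal sketch. The abstract does not give the exact scoring formula, emoticon inventory, tokenization, or thresholds, so everything below is an assumption made for illustration: the emoticon sets, the whitespace tokenizer, the add-one smoothing, the `min_count` cutoff, and the function names `emoticon_label` and `polarity_scores` are all hypothetical. The score used here is the difference of pointwise mutual information with the positive and negative classes, which reduces to a log ratio of smoothed conditional probabilities; the paper's actual measure may differ.

```python
import math
from collections import Counter

# Illustrative emoticon sets (assumed; the abstract does not list them).
POS_EMOTICONS = {":)", ":-)", ":D"}
NEG_EMOTICONS = {":(", ":-("}
EMOTICONS = POS_EMOTICONS | NEG_EMOTICONS


def emoticon_label(tokens):
    """Return 'pos' or 'neg' from emoticons; None if absent or mixed."""
    has_pos = any(t in POS_EMOTICONS for t in tokens)
    has_neg = any(t in NEG_EMOTICONS for t in tokens)
    if has_pos == has_neg:
        return None
    return "pos" if has_pos else "neg"


def polarity_scores(tweets, min_count=2):
    """Score terms by PMI(term, pos) - PMI(term, neg) over emoticon tweets.

    The difference of the two PMI values equals
    log(P(term | pos) / P(term | neg)); add-one smoothing (an assumption)
    keeps the ratio finite for terms seen with only one class.
    """
    term_label = Counter()   # (term, label) -> number of tweets
    label_total = Counter()  # label -> number of tweets
    for text in tweets:
        tokens = text.split()  # naive whitespace tokenizer (assumption)
        label = emoticon_label(tokens)
        if label is None:
            continue
        label_total[label] += 1
        # Count each term once per tweet, excluding the emoticons themselves.
        for term in {t.lower() for t in tokens if t not in EMOTICONS}:
            term_label[(term, label)] += 1

    scores = {}
    for term in {t for t, _ in term_label}:
        pos = term_label[(term, "pos")]
        neg = term_label[(term, "neg")]
        if pos + neg < min_count:
            continue  # skip rare terms whose estimates are unreliable
        p_term_given_pos = (pos + 1) / (label_total["pos"] + 2)
        p_term_given_neg = (neg + 1) / (label_total["neg"] + 2)
        scores[term] = math.log(p_term_given_pos / p_term_given_neg)
    return scores
```

Under these assumptions, calling `polarity_scores` on the tweets collected for one topic gives each sufficiently frequent term a signed score. Terms with large positive or negative scores would be the candidates for adding to that topic's lexicon, which is how informal expressions such as abbreviations could acquire a polarity without manual annotation.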




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhou, Z., Zhang, X., Sanderson, M. (2014). Sentiment Analysis on Twitter through Topic-Based Lexicon Expansion. In: Wang, H., Sharaf, M.A. (eds) Databases Theory and Applications. ADC 2014. Lecture Notes in Computer Science, vol 8506. Springer, Cham. https://doi.org/10.1007/978-3-319-08608-8_9

  • DOI: https://doi.org/10.1007/978-3-319-08608-8_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-08607-1

  • Online ISBN: 978-3-319-08608-8

  • eBook Packages: Computer Science, Computer Science (R0)
