Ensemble Learning for Sentiment Classification

  • Ying Su
  • Yong Zhang
  • Donghong Ji
  • Yibing Wang
  • Hongmiao Wu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7717)


This paper presents an ensemble learning method for sentiment classification of reviews. The diversity among machine learning algorithms for sentiment classification under different settings (different features, different weight measures, and the modeling of negation) is investigated in three domains; this diversity leaves room for improving performance by combining classifiers. An ensemble learning framework, stacked generalization, is then introduced on top of the different algorithms and settings and compared with majority voting. Based on the characteristics of reviews, the paper also proposes the opinion summary of a review, which is composed of the first two and last two sentences of the review. Results show that stacking is consistently effective across all domains and outperforms majority voting, and that using the opinion summary improves performance further.
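The two ideas in the abstract that are easiest to make concrete are the opinion summary and the contrast between majority voting and stacking. The following is a minimal sketch, not the authors' implementation. The regex sentence splitter, the function names (`opinion_summary`, `majority_vote`, `fit_stacker`, `stack_predict`), and the lookup-table meta-learner are all assumptions chosen for brevity; the abstract does not say which level-1 learner or sentence splitter the paper uses.

```python
import re
from collections import Counter, defaultdict


def opinion_summary(review, head=2, tail=2):
    """Keep the first `head` and last `tail` sentences of a review."""
    # Assumption: a naive splitter that breaks after '.', '!' or '?' plus whitespace.
    parts = re.split(r"(?<=[.!?])\s+", review.strip())
    sentences = [s.strip() for s in parts if s.strip()]
    if len(sentences) <= head + tail:
        return " ".join(sentences)  # short review: nothing to cut
    return " ".join(sentences[:head] + sentences[-tail:])


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def fit_stacker(base_preds, gold):
    """Toy level-1 learner (hypothetical): map each tuple of base-classifier
    predictions to the gold label it most often co-occurs with on held-out data."""
    table = defaultdict(Counter)
    for preds, label in zip(base_preds, gold):
        table[tuple(preds)][label] += 1
    return {combo: counts.most_common(1)[0][0] for combo, counts in table.items()}


def stack_predict(table, preds):
    """Use the learned table; fall back to majority voting for unseen combinations."""
    return table.get(tuple(preds), majority_vote(preds))


# Held-out predictions of three hypothetical base classifiers, with gold labels.
held_out = [("pos", "neg", "neg"), ("pos", "neg", "neg"),
            ("neg", "pos", "pos"), ("pos", "pos", "pos")]
gold = ["pos", "pos", "neg", "pos"]
stacker = fit_stacker(held_out, gold)

review = ("Great phone. Battery lasts long. Screen is fine. Camera is okay. "
          "Speakers are weak. Overall I love it. Would buy again.")
print(opinion_summary(review))
print(majority_vote(("pos", "neg", "neg")), stack_predict(stacker, ("pos", "neg", "neg")))
```

In this toy data the first classifier is always right while the other two often agree on the wrong label, so majority voting follows the two weaker classifiers and the stacked predictor follows the first. A practical level-1 learner would generalize over base outputs rather than memorize them, and would be trained on cross-validated base predictions.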


sentiment classification, sentiment analysis, stacked generalization, diversity measure





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ying Su (1)
  • Yong Zhang (2, 3)
  • Donghong Ji (2)
  • Yibing Wang (4)
  • Hongmiao Wu (5)
  1. Department of Computer and Electronic, Huazhong University of Science and Technology Wuchang Branch, Wuhan, P.R. China
  2. Computer School, Wuhan University, Wuhan, P.R. China
  3. Department of Computer Science, Huazhong Normal University, Wuhan, P.R. China
  4. Third Faculty, Second Artillery Command College, P.R. China
  5. School of Foreign Languages and Literature, Wuhan University, Wuhan, P.R. China
