Cluster Computing, Volume 22, Supplement 2, pp 3043–3058

Sentiment analysis of Chinese online reviews using ensemble learning framework

  • Jiafeng Huang
  • Yun Xue (email author)
  • Xiaohui Hu
  • Huixia Jin
  • Xin Lu
  • Zhihuang Liu


Unstructured online reviews are expanding rapidly with the development of e-commerce, and they contain sentiment information of great interest to both consumers and businesses. Effective sentiment classification has therefore become an important research topic. Many studies have shown that ensemble learning methods are promising for sentiment classification tasks. In this paper, we propose a new ensemble learning framework for sentiment classification of Chinese online reviews. First, to account for the complex characteristics of Chinese online reviews, we extract three kinds of patterns as input features: Part of Speech Combination Pattern, Frequent Word Sequence Pattern and Order Preserved Submatrix Pattern. Second, to handle the very large number of features in the reviews, we use a Random Subspace algorithm based on Information Gain, which improves the base classifiers at the same time. Finally, we adopt an algorithm for Constructing Base Classifiers based on Product Attributes, which combines the sentiment information of each attribute in a review to obtain better sentiment classification performance. The experimental results show that the proposed ensemble learning framework significantly improves sentiment classification of Chinese online reviews.
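The abstract names a Random Subspace algorithm based on Information Gain but does not give its details, so the following is only a minimal sketch of one plausible reading: each feature's information gain is computed against the sentiment labels, and feature subsets for the base classifiers are then sampled with probability proportional to that gain. The toy data, the function names, the small epsilon weight, and the majority-vote combination rule are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one discrete feature column w.r.t. the labels."""
    total = len(labels)
    gain = entropy(labels)
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain


def sample_subspaces(X, y, n_subspaces, subspace_size, seed=0):
    """Sample feature-index subsets, favouring features with high gain.

    Hypothetical helper: sampling is without replacement inside each
    subspace, with weights proportional to information gain.
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    gains = [information_gain([row[j] for row in X], y)
             for j in range(n_features)]
    # Assumed epsilon so that zero-gain features keep a tiny chance.
    weights = [g + 1e-9 for g in gains]
    subspaces = []
    for _ in range(n_subspaces):
        candidates = list(range(n_features))
        w = list(weights)
        chosen = []
        for _ in range(min(subspace_size, n_features)):
            idx = rng.choices(range(len(candidates)), weights=w, k=1)[0]
            chosen.append(candidates.pop(idx))
            w.pop(idx)
        subspaces.append(sorted(chosen))
    return gains, subspaces


def majority_vote(predictions):
    """Combine base-classifier labels for one review (assumed rule)."""
    return Counter(predictions).most_common(1)[0][0]


# Toy data: each row marks which of three extracted patterns occur in a review.
X = [[1, 0, 1], [1, 1, 1], [0, 0, 1], [0, 1, 0]]
y = ["pos", "pos", "neg", "neg"]

gains, subspaces = sample_subspaces(X, y, n_subspaces=3, subspace_size=2)
print("information gain per feature:", gains)
print("sampled feature subspaces:", subspaces)
print("combined label:", majority_vote(["pos", "neg", "pos"]))
```

In a full pipeline, one base classifier would be trained on each sampled subspace, and their outputs combined per review. Weighting the sampling by information gain keeps the randomness that gives the ensemble its diversity while making uninformative features less likely to dominate a subspace.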


Keywords: Online reviews; Sentiment classification; Ensemble learning; Feature extraction



The authors gratefully thank the colleagues who participated in this work and provided technical support. This work was supported by grants from the National Natural Science Foundation of China (No. 61672126), the Guangdong Provincial Engineering Technology Research Center for Data Science (Nos. 2016KF09, 2016KF10), and the National Statistical Science Research Project of China (No. 2016LY98). This work was also supported by the Science and Technology Department of Guangdong Province in China (Grant Nos. 2016A010101020, 2016A010101021, 2016A010101022), the Foundation of Guangdong Polytechnic of Science and Technology (No. XJSC2016206), the Natural Science Funds of Shenzhen Science and Technology Innovation Commission (No. JCYJ20160527172144272) and the Innovation Project of Graduate School of South China Normal University (No. 2015lkxm37). The authors also gratefully thank the scholars who shared the datasets used in this work.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Jiafeng Huang (1)
  • Yun Xue (1), email author
  • Xiaohui Hu (1)
  • Huixia Jin (2)
  • Xin Lu (1)
  • Zhihuang Liu (1)

  1. School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, China
  2. College of Information and Electronic Engineering, Hunan City University, Yiyang, China
