Ensemble of Feature Sets and Classification Methods for Stance Detection

  • Jiaming Xu
  • Suncong Zheng
  • Jing Shi
  • Yiqun Yao
  • Bo Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10102)

Abstract

Stance detection is the task of automatically determining the author’s favorability towards a given target. The task is difficult because the target may not be mentioned explicitly in the text, and an author may even use positive expressions to argue against the target. In this paper, we describe an ensemble framework that integrates various feature sets and classification methods and does not rely on any handcrafted templates or rules. We submitted our solution to the NLPCC 2016 shared task Detecting Stance in Chinese Weibo (Task A), a supervised task covering five targets. The official results show that our team, “CBrain”, achieved one 1st place and one 2nd place on individual targets and ranked 4th overall out of 16 teams. Our code is available at https://github.com/jacoxu/2016NLPCC_Stance_Detection.
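The abstract does not say which feature sets or classifiers are used, or how their outputs are combined. As a rough illustration of the general idea only, the sketch below trains one model per (feature set, classifier) pair and averages the predicted class probabilities. The choice of word unigrams and character bigrams, the naive Bayes and cosine-centroid scorers, the label names, the averaging rule, and the toy English sentences are all assumptions made for this example and are not taken from the paper or its repository.

```python
import math
from collections import Counter, defaultdict

def word_features(text):
    # Feature set 1 (assumed): lowercase whitespace-separated tokens.
    return text.lower().split()

def char_bigram_features(text):
    # Feature set 2 (assumed): character bigrams with spaces removed.
    s = text.lower().replace(" ", "")
    return [s[i:i + 2] for i in range(len(s) - 1)]

class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over one feature set."""
    def __init__(self, extract):
        self.extract = extract

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feat_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = self.extract(text)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict_proba(self, text):
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, count in self.class_counts.items():
            denom = sum(self.feat_counts[label].values()) + len(self.vocab)
            score = math.log(count / total)
            for f in self.extract(text):
                if f in self.vocab:  # skip features never seen in training
                    score += math.log((self.feat_counts[label][f] + 1) / denom)
            log_scores[label] = score
        top = max(log_scores.values())
        exp = {k: math.exp(v - top) for k, v in log_scores.items()}
        z = sum(exp.values())
        return {k: v / z for k, v in exp.items()}

class Centroid:
    """Cosine similarity to per-class summed count vectors, normalized to sum to 1."""
    def __init__(self, extract):
        self.extract = extract

    def fit(self, texts, labels):
        self.centroids = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.centroids[label].update(self.extract(text))
        return self

    def predict_proba(self, text):
        vec = Counter(self.extract(text))
        vec_norm = math.sqrt(sum(v * v for v in vec.values()))
        sims = {}
        for label, cen in self.centroids.items():
            dot = sum(vec[f] * cen[f] for f in vec)
            norm = vec_norm * math.sqrt(sum(v * v for v in cen.values()))
            sims[label] = dot / norm if norm else 0.0
        z = sum(sims.values())
        if z == 0:  # no overlap with any class: fall back to uniform
            return {label: 1 / len(sims) for label in sims}
        return {k: v / z for k, v in sims.items()}

def ensemble_predict(models, text):
    # Average the class probabilities of all models (assumed combination rule).
    avg = Counter()
    for model in models:
        for label, p in model.predict_proba(text).items():
            avg[label] += p / len(models)
    return max(avg, key=avg.get), dict(avg)

# Toy, made-up training data; the shared task itself uses Chinese Weibo posts.
train = [
    ("i support the new policy", "FAVOR"),
    ("this policy is great and helpful", "FAVOR"),
    ("i oppose the new policy", "AGAINST"),
    ("this policy is terrible and harmful", "AGAINST"),
    ("the weather is nice today", "NONE"),
    ("i had noodles for lunch", "NONE"),
]
texts, labels = zip(*train)

# One model per (classifier, feature set) combination.
models = [cls(extract).fit(texts, labels)
          for cls in (NaiveBayes, Centroid)
          for extract in (word_features, char_bigram_features)]

label, probs = ensemble_predict(models, "I support this great policy")
print(label, probs)
```

Averaging probabilities is only one way to combine the models; weighted voting or a second-stage classifier trained on the model outputs would fit the same structure.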

Keywords

Stance detection, Ensemble framework, Text classification, Chinese Weibo

Notes

Acknowledgments

We thank the anonymous reviewers for their insightful comments. This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB02070005), the National High Technology Research and Development Program of China (863 Program) (Grant No. 2015AA015402) and the National Natural Science Foundation (Grant Nos. 61602479 and 61403385).

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Jiaming Xu (1)
  • Suncong Zheng (1)
  • Jing Shi (1)
  • Yiqun Yao (1)
  • Bo Xu (1, 2)

  1. Institute of Automation, Chinese Academy of Sciences (CAS), Beijing, China
  2. Center for Excellence in Brain Science and Intelligence Technology, CAS, Shanghai, China
