An Empirical Feasibility Study of Societal Risk Classification Toward BBS Posts
Societal risk classification is a fundamental task in online societal risk monitoring. This paper presents an empirical analysis of the challenges and feasibility of classifying BBS posts by societal risk. Following an effectiveness analysis, a Support Vector Machine based on Bag-of-Words (BOW-SVM) is adopted to validate the challenges, and distributed document embeddings of BBS posts generated by Paragraph Vector are used for the feasibility study. With BOW-SVM, cross-validations are conducted on BBS posts labeled by different groups and annotators. The large fluctuation in the cross-validation results indicates differences in individual risk perception, which makes societal risk classification more challenging. Furthermore, using the distributed document embeddings, the pairwise similarities of more than 300 thousand BBS posts from different societal risk categories are compared. Posts in the same societal risk category are more similar to one another than posts in different categories, which suggests that they share more features. This supports the feasibility of societal risk classification of BBS posts and indicates that the performance of societal risk monitoring can be improved.
Keywords: Societal risk classification, Tianya Forum, cross validation, pairwise similarity, individual risk perception
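The pairwise-similarity comparison described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure and uses made-up two-dimensional vectors in place of Paragraph Vector embeddings; the function names, the toy vectors, and the category labels "A" and "B" are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_similarity(embeddings, labels):
    """Return (mean same-category similarity, mean cross-category similarity)."""
    same, cross = [], []
    # Visit every unordered pair of posts exactly once.
    for i, j in combinations(range(len(embeddings)), 2):
        sim = cosine(embeddings[i], embeddings[j])
        (same if labels[i] == labels[j] else cross).append(sim)
    return sum(same) / len(same), sum(cross) / len(cross)


# Toy stand-ins for document embeddings (assumed values, not study data).
toy_embeddings = [
    [1.0, 0.1], [0.9, 0.2],  # hypothetical category "A"
    [0.1, 1.0], [0.2, 0.9],  # hypothetical category "B"
]
toy_labels = ["A", "A", "B", "B"]

within, between = mean_pairwise_similarity(toy_embeddings, toy_labels)
print(f"same-category mean: {within:.3f}, cross-category mean: {between:.3f}")
```

A same-category mean that is clearly higher than the cross-category mean is the kind of pattern the abstract reports as evidence of feasibility. For a corpus of several hundred thousand posts, enumerating all pairs as done here would likely need to be replaced by sampling or batched matrix operations.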
This study is supported by the National Natural Science Foundation of China under grant Nos. 71601023, 61473284, 71731002 and L1624049, the Supplementary and Supportive Project for Teachers at Beijing Information Science and Technology University (2018–2020) (5029011103), and the National Key R&D Program of China (2017YFB1400500).
The authors would like to thank other members of our group for their effort in data collection and post labeling.