Abstract
Using posts from Tianya Forum as the data source and adopting societal risk indicators from social psychology, we conduct document-level multi-class societal risk classification of BBS posts. Two kinds of models are applied to generate post representations: Bag-of-Words captures the occurrence information of words in a post, while a deep learning model, Post Vector, is designed to capture the semantics and word order of a post. Based on these representations, two types of support vector machine (SVM) classifiers are developed and compared on the societal risk classification of the posts. Furthermore, because the two post representations contain complementary information, several SVM ensemble methods that operate on the decision scores of the two SVM classifiers are proposed to improve classification performance. The experimental results show that the SVM ensemble method achieves better results in document-level societal risk classification than an SVM based on a single representation.
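To make the idea of combining two classifiers at the decision score level concrete, the following is a minimal sketch, not the authors' method. The abstract does not say which fusion rules were used, so the weighted sum below, the min-max rescaling, the `fuse` and `min_max` helpers, and the risk labels and score values in the example are all assumptions chosen for illustration.

```python
from typing import Dict

# One classifier's output for a single post: risk label -> decision score.
Scores = Dict[str, float]


def min_max(scores: Scores) -> Scores:
    """Rescale scores to [0, 1] so the two classifiers are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no preference, so give every label the same value.
        return {label: 0.5 for label in scores}
    return {label: (s - lo) / (hi - lo) for label, s in scores.items()}


def fuse(bow: Scores, vec: Scores, weight: float = 0.5) -> str:
    """Weighted-sum fusion (an assumed rule) of two score sets.

    `weight` is the share given to the Bag-of-Words classifier; the
    remainder goes to the Post Vector classifier.
    """
    if bow.keys() != vec.keys():
        raise ValueError("both classifiers must score the same labels")
    a, b = min_max(bow), min_max(vec)
    fused = {k: weight * a[k] + (1.0 - weight) * b[k] for k in a}
    # Predict the label with the highest fused score.
    return max(fused, key=fused.get)


# Hypothetical labels and scores for one post (illustrative only).
bow_scores = {"housing": 0.2, "employment": 0.1, "risk_free": -0.5}
vec_scores = {"housing": -0.1, "employment": 0.9, "risk_free": -0.3}

prediction = fuse(bow_scores, vec_scores)
```

With equal weights the two score sets are averaged after rescaling; setting `weight` to 1.0 or 0.0 falls back to the Bag-of-Words or Post Vector classifier alone, which gives a simple baseline for comparing the ensemble with each single representation.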
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, J., Tang, X. (2015). Ensemble of SVM Classifiers with Different Representations for Societal Risk Classification. In: Zhang, S., Wirsing, M., Zhang, Z. (eds) Knowledge Science, Engineering and Management. KSEM 2015. Lecture Notes in Computer Science, vol 9403. Springer, Cham. https://doi.org/10.1007/978-3-319-25159-2_61
DOI: https://doi.org/10.1007/978-3-319-25159-2_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25158-5
Online ISBN: 978-3-319-25159-2
eBook Packages: Computer Science (R0)