An intelligent textual corpus big data computing approach for lexicons construction and sentiment classification of public emergency events

  • Wei Zhang
  • Yan-chun Zhu (corresponding author)
  • Jia-peng Wang


Existing emotional lexicons suffer from excessive manual intervention, poor scalability, and neglect of dependency parsing in emotion computing. To address these deficiencies, this paper first uses Word2Vec, cosine similarity between word vectors, and the SO-PMI algorithm to build a Weibo emotional lexicon oriented to public events. It then proposes a Weibo emotion computing method based on dependency parsing, designing an emotion binary tree and dependency-based emotion calculation rules. Finally, an experiment shows that this emotional lexicon has wider coverage and higher accuracy than existing ones, and a public opinion evolution analysis of an actual public event shows empirically that the algorithm is feasible and effective.
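The abstract names cosine similarity of word vectors and SO-PMI but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the commonly used SO-PMI definition (the sum of a word's PMI with positive seed words minus the sum of its PMI with negative seed words), document-level co-occurrence counts, and a small additive smoothing constant `eps`. The toy English corpus, the seed words, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_counts(docs):
    """Count, per word and per word pair, the number of documents containing them."""
    word_df, pair_df = Counter(), Counter()
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        # Pairs are stored with the alphabetically smaller word first.
        pair_df.update(combinations(vocab, 2))
    return word_df, pair_df


def pmi(a, b, word_df, pair_df, n_docs, eps=0.01):
    """Pointwise mutual information with additive smoothing (eps is an assumption)."""
    pair = (a, b) if a < b else (b, a)
    p_ab = (pair_df[pair] + eps) / n_docs
    p_a = (word_df[a] + eps) / n_docs
    p_b = (word_df[b] + eps) / n_docs
    return math.log2(p_ab / (p_a * p_b))


def so_pmi(word, pos_seeds, neg_seeds, word_df, pair_df, n_docs):
    """Semantic orientation: PMI with positive seeds minus PMI with negative seeds."""
    pos = sum(pmi(word, s, word_df, pair_df, n_docs) for s in pos_seeds if s != word)
    neg = sum(pmi(word, s, word_df, pair_df, n_docs) for s in neg_seeds if s != word)
    return pos - neg


# Hypothetical toy corpus: each inner list stands for one tokenized post.
docs = [
    ["rescue", "good", "relieved"],
    ["rescue", "good", "relieved", "happy"],
    ["delay", "bad", "angry"],
    ["delay", "bad", "angry", "sad"],
    ["rescue", "delay", "report"],
]
pos_seeds = ["good", "happy"]
neg_seeds = ["bad", "sad"]

word_df, pair_df = build_counts(docs)
scores = {
    w: so_pmi(w, pos_seeds, neg_seeds, word_df, pair_df, len(docs))
    for w in ["relieved", "angry", "report"]
}
```

In this sketch a positive score suggests a positive orientation and a negative score a negative one. In a pipeline like the one the abstract outlines, `cosine` would plausibly be applied to Word2Vec vectors to find candidate words near the seeds before scoring them, but that ordering is an assumption here.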


Keywords: Textual corpus; Big data; Lexicon construction; Sentiment computing; Public emergency events



This work was supported by the National Natural Science Foundation of China (Grant No. 71874215), the Beijing Natural Science Foundation (Grant No. 9182016), the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (Grant No. 17YJAZH120), and Beijing’s Philosophical and Social Science Foundation (Grant Nos. 13JGC128 and 13JGB058). The authors thank the anonymous reviewers, whose helpful comments and suggestions improved the quality and presentation of the paper.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. School of Information, Central University of Finance and Economics, Beijing, China
  2. Business School, Beijing Normal University, Beijing, China
