Real-time identification of urban rainstorm waterlogging disasters based on Weibo big data

Xiao, Yang; Li, Beiqun; Gong, Zaiwu

doi:10.1007/s11069-018-3427-4

Original Paper
Published: 13 August 2018

Volume 94, pages 833–842, (2018)

Natural Hazards

Abstract

With the acceleration of urbanisation in China, preventing and reducing the economic losses and casualties caused by urban rainstorm waterlogging disasters has become a critical and difficult issue for the government. Because urban rainstorms are sudden, clustered, and continuous and cause huge economic losses, emergency management is difficult. A more scientific method for real-time disaster identification would help prevent losses in a timely manner. Mining social media big data is a feasible way to obtain on-site disaster data and carry out disaster risk assessments. This paper presents a real-time identification method for urban rainstorm waterlogging disasters using Weibo data. Taking the June 2016 heavy rainstorm in Nanjing as an example, the collected Weibo data are divided into a training set and a test set at a ratio of 8:2. The text is pre-processed with the Jieba segmentation module for word segmentation. The term frequency–inverse document frequency (TF-IDF) method is then used to calculate feature weights and extract features, and hashing algorithms are introduced to process the resulting high-dimensional sparse vector matrices. Finally, naive Bayes, support vector machine, and random forest text classification algorithms are trained, and the test set is used to evaluate the models and select the optimal classification algorithm. In the experiments, the naive Bayes algorithm achieved the highest macro-average accuracy.
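The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library, covers only the naive Bayes branch (the paper also compares support vector machine and random forest classifiers), and replaces Jieba segmentation with hypothetical, already tokenised toy posts. The bucket count `N_BUCKETS`, the use of CRC32 as the hash function, the smoothed IDF formula, and the reading of "macro-average accuracy" as the mean of per-class accuracies are all assumptions made for illustration.

```python
import math
import zlib
from collections import Counter

N_BUCKETS = 1024  # assumed size of the hashed feature space


def bucket(token):
    """Hashing trick: map a token to a fixed bucket index (stable across runs)."""
    return zlib.crc32(token.encode("utf-8")) % N_BUCKETS


def fit_idf(docs):
    """Smoothed inverse document frequency per bucket, from the training docs."""
    df = Counter()
    for tokens in docs:
        df.update({bucket(t) for t in tokens})
    n = len(docs)
    return [math.log((1 + n) / (1 + df[b])) + 1 for b in range(N_BUCKETS)]


def vectorize(tokens, idf):
    """Sparse TF-IDF vector {bucket: weight} for one tokenised post."""
    tf = Counter(bucket(t) for t in tokens)
    return {b: count * idf[b] for b, count in tf.items()}


def train_nb(vectors, labels, alpha=1.0):
    """Multinomial naive Bayes with additive smoothing over TF-IDF weights."""
    class_docs = Counter(labels)
    weight = {c: [alpha] * N_BUCKETS for c in class_docs}
    for vec, label in zip(vectors, labels):
        for b, w in vec.items():
            weight[label][b] += w
    model = {}
    for c, row in weight.items():
        total = sum(row)
        log_prior = math.log(class_docs[c] / len(labels))
        model[c] = (log_prior, [math.log(w / total) for w in row])
    return model


def predict(model, vec):
    """Return the class with the highest log posterior score."""
    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(w * log_like[b] for b, w in vec.items())
    return max(model, key=score)


def macro_accuracy(y_true, y_pred):
    """Mean of per-class accuracies (assumed meaning of macro-average accuracy)."""
    per_class = []
    for c in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == c]
        per_class.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(per_class) / len(per_class)


# Hypothetical toy posts, already segmented (the paper uses Jieba on Weibo text).
posts = [
    (["road", "flooded", "water", "knee", "deep"], "waterlogging"),
    (["subway", "station", "flooded", "rain"], "waterlogging"),
    (["cars", "stuck", "water", "street"], "waterlogging"),
    (["heavy", "rain", "street", "flooded"], "waterlogging"),
    (["nice", "dinner", "friends", "tonight"], "other"),
    (["new", "movie", "tonight", "great"], "other"),
    (["exam", "week", "coffee", "tired"], "other"),
    (["great", "coffee", "friends", "weekend"], "other"),
    (["water", "flooded", "street", "rain"], "waterlogging"),
    (["movie", "friends", "coffee", "tonight"], "other"),
]
train, test = posts[:8], posts[8:]  # 8:2 split, as in the abstract

idf = fit_idf([tokens for tokens, _ in train])
X_train = [vectorize(tokens, idf) for tokens, _ in train]
y_train = [label for _, label in train]
model = train_nb(X_train, y_train)

y_test = [label for _, label in test]
y_pred = [predict(model, vectorize(tokens, idf)) for tokens, _ in test]
print("predictions:", y_pred)
print("macro-average accuracy:", macro_accuracy(y_test, y_pred))
```

Hashing keeps the feature space at a fixed width regardless of vocabulary size, which is one way to tame high-dimensional sparse matrices; the trade-off is that distinct words can collide in the same bucket.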

Acknowledgements

This research is partially supported by the Major Project of the National Social Science Foundation (grant no. 16ZDA047), the National Natural Science Foundation of China (71171115, 71571104), the Reform Foundation of Postgraduate Education and Teaching in Jiangsu Province (JGKT10034), the Six Talent Peaks Project in Jiangsu Province (2014-JY-014), the Top-notch Academic Programs Project of Jiangsu Higher Education Institutions, the Postgraduate Research & Practice Innovation Program, and the Major Project of Humanities and Social Sciences of Anhui Education Department (SK2015ZD07).

Author information

Authors and Affiliations

School of Management and Engineering, Nanjing University of Information Science and Technology, Nanjing, China
Yang Xiao, Beiqun Li & Zaiwu Gong

Corresponding author

Correspondence to Zaiwu Gong.

About this article

Cite this article

Xiao, Y., Li, B. & Gong, Z. Real-time identification of urban rainstorm waterlogging disasters based on Weibo big data. Nat Hazards 94, 833–842 (2018). https://doi.org/10.1007/s11069-018-3427-4

Received: 26 May 2018
Accepted: 03 August 2018
Published: 13 August 2018
Issue Date: November 2018
DOI: https://doi.org/10.1007/s11069-018-3427-4
