Development and Application of a Chinese Webpage Suicide Information Mining System (SIMS)
This study aimed to design and pilot a convenient Chinese webpage suicide information mining system (SIMS) that helps search and filter relevant data from the internet and uncover potential features and trends of suicide.
SIMS was developed with Microsoft Visual Studio 2008, SQL 2008 and C#. It collects webpage data via popular search engines; cleans the data using trained models with minimal manual help; translates the cleaned texts into quantitative data through models and supervised fuzzy recognition; and analyzes and visualizes the related variables with self-programmed algorithms.
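The step of translating cleaned text into quantitative data can be illustrated with a minimal Python sketch. This is not the authors' model: the trained models and supervised fuzzy recognition are not specified here, so approximate string matching from the standard library is swapped in. The keyword table, the function names `similarity` and `code_method`, and the 0.8 threshold are all hypothetical, and English phrases stand in for Chinese text, which would first need word segmentation.

```python
from difflib import SequenceMatcher

# Hypothetical keyword table; the system's actual vocabulary is not given in the source.
METHOD_KEYWORDS = {
    "poisoning": ["pesticide", "poison", "sleeping pills"],
    "cutting": ["cut wrist", "cut vein"],
    "jumping": ["jumped from building", "jumped off"],
}


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between two strings, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def code_method(text: str, threshold: float = 0.8) -> str:
    """Return the best-matching method category, or 'unknown' if below threshold."""
    words = text.lower().split()
    best_label, best_score = "unknown", 0.0
    for label, phrases in METHOD_KEYWORDS.items():
        for phrase in phrases:
            n = len(phrase.split())
            # Slide a window with the phrase's word count across the text.
            for i in range(len(words) - n + 1):
                window = " ".join(words[i:i + n])
                score = similarity(window, phrase)
                if score > best_score:
                    best_label, best_score = label, score
    return best_label if best_score >= threshold else "unknown"
```

Each report then yields a categorical code that can be counted and cross-tabulated. The threshold trades recall for precision, so reports coded `unknown` would be natural candidates for the manual check that the pipeline keeps to a minimum.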
The SIMS developed comprises functions for collecting suicide news and blogs; filtering, cleaning, extracting and translating data; and analyzing and presenting data. SIMS-mediated mining of one year of webpages revealed that: peak months and hours of web-reported suicide events were June-July and 10–11 am respectively, and the lowest months and hours were September-October and 1–7 am; suicide reports came mostly from Sohu, Tencent, Sina etc.; male suicide victims outnumbered female victims in most sub-regions, with the exception of southwest China; homes, public places and rented houses were the top three places of suicide; poisoning, vein cutting and jumping from buildings were the most commonly used methods; and love disputes, family disputes and mental diseases were the leading causes.
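As a rough illustration of how peak months and hours could be derived once each report carries an event timestamp, the following Python sketch tallies both. The function name `peak_month_and_hour` and the sample timestamps are made up for the example and are not data from the study; the system's own analysis algorithms are not described here.

```python
from collections import Counter
from datetime import datetime


def peak_month_and_hour(timestamps):
    """Return (most frequent month, most frequent hour) among event timestamps."""
    months = Counter(t.month for t in timestamps)
    hours = Counter(t.hour for t in timestamps)
    return months.most_common(1)[0][0], hours.most_common(1)[0][0]


# Made-up timestamps purely for illustration; not data from the study.
events = [
    datetime(2012, 6, 3, 10, 15),
    datetime(2012, 6, 18, 10, 40),
    datetime(2012, 7, 2, 22, 5),
    datetime(2012, 9, 9, 3, 30),
]

print(peak_month_and_hour(events))
```

The same counting approach extends to report source, place, method and cause by swapping the key that is tallied.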
SIMS provides a preliminary and supplementary means of monitoring and understanding suicide. It offers useful perspectives as well as tools for analyzing the features and trends of suicide using data derived from Chinese webpages. Yet given the intrinsic “dual nature” of internet-based suicide information and the tremendous difficulties experienced by ourselves and other researchers, much work remains to expand, refine and evaluate the system.
Keywords: Suicide; News; Blogs; Data mining; Support system
SIMS: suicide information mining system
URL: Uniform Resource Locator
SQL: structured query language
supervised machine learning
This paper was co-supported by the Natural Science Foundation of China (grant number 81172201) and Anhui Provincial Fund for Elite Youth (grant number 2011SQRL060). Penglai Chen and Jing Chai contributed equally to this manuscript.