Abstract
Public opinion monitoring has been well studied in sociology and informatics. Considerable amounts of crime-related information are posted on social media platforms every day. However, current methods for monitoring public opinion typically rely on rule matching and manual searching rather than automated processing and analysis, so extracting useful information from large volumes of social media data remains a major challenge.
This chapter describes a methodology for extracting key information from large volumes of Chinese text using named entity recognition based on the LSTM-CRF model. Because traditional named entity recognition datasets are small and cover only a few entity types, a custom crime-related corpus was created for training. The results demonstrate that the methodology can automatically extract key attributes such as person, location, organization and crime type with a precision of 87.58%, recall of 83.22% and F1 score of 85.24%.
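As a rough illustration of the last step in such a pipeline, the sketch below turns a sequence of per-character tags (the kind of output a sequence tagger such as an LSTM-CRF produces) into entity spans and scores them with entity-level precision, recall and F1. The BIO tagging scheme, the label names PER, LOC and CRIME, the example sentence and both function names are assumptions made for illustration; the abstract does not state the chapter's tag set or scoring procedure.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive); hypothetical representation
Entity = Tuple[str, int, int]


def extract_entities(tags: List[str]) -> Set[Entity]:
    """Collect (type, start, end) spans from a BIO tag sequence.

    An "I-" tag that does not continue an open entity of the same type
    is treated as outside any entity (an assumption of this sketch).
    """
    entities: Set[Entity] = set()
    start, etype = None, None
    # The trailing "O" sentinel flushes an entity that ends on the last tag.
    for i, tag in enumerate(tags + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if not inside:
            if etype is not None:
                entities.add((etype, start, i))
            if tag.startswith("B-"):
                start, etype = i, tag[2:]
            else:
                start, etype = None, None
    return entities


def precision_recall_f1(gold: Set[Entity], pred: Set[Entity]):
    """Entity-level scores: a prediction counts only on an exact span and type match."""
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: one tag per character of a made-up sentence.
chars = list("张三在香港抢劫")
gold_tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "B-CRIME", "I-CRIME"]
pred_tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O", "O"]

gold = extract_entities(gold_tags)
pred = extract_entities(pred_tags)
p, r, f1 = precision_recall_f1(gold, pred)

for etype, s, e in sorted(pred, key=lambda x: x[1]):
    print(etype, "".join(chars[s:e]))
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case the tagger misses the crime-type entity, so precision stays at 1.0 while recall drops to 2/3. Scores reported over a full corpus may be aggregated differently, for example averaged per entity type.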
© 2020 IFIP International Federation for Information Processing
About this paper
Cite this paper
Wu, W., Chow, KP., Mai, Y., Zhang, J. (2020). Public Opinion Monitoring for Proactive Crime Detection Using Named Entity Recognition. In: Peterson, G., Shenoi, S. (eds) Advances in Digital Forensics XVI. DigitalForensics 2020. IFIP Advances in Information and Communication Technology, vol 589. Springer, Cham. https://doi.org/10.1007/978-3-030-56223-6_11
DOI: https://doi.org/10.1007/978-3-030-56223-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-56222-9
Online ISBN: 978-3-030-56223-6
eBook Packages: Computer Science, Computer Science (R0)