Using Natural Language Processing and Machine Learning to Detect Online Grooming Attacks

Street, Jake; Olajide, Funminiyi

doi:10.1007/978-3-031-55568-8_22

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1454)

Included in the following conference series:

UK Workshop on Computational Intelligence

Abstract

This paper presents a Natural Language Processing solution that incorporates the phases of Online Grooming. The solution coded each phrase within a chatroom transcript between a Honeypot profile and an Online Groomer to one of these phases. The automated coding was then compared against a human-reviewed coding of the same phrases to check its accuracy. The coding identified the Initiation phase (with underage declaration detection) in 75% of the transcripts, with a 3% false positive rate. For the Risk Assessment and Sexual phases, most detections were incorrect. Analysis showed that 21% of the words/phrases used in Sexual phase detection had significantly more ‘incorrect’ human reviews than ‘correct’ ones, whereas 38% had significantly more ‘correct’ human reviews than ‘incorrect’ ones. Filtering the word/phrase list on this basis is therefore likely to yield an effective solution.
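The filtering step suggested in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the entry names are placeholders, the review records are invented, and the `ratio` threshold is a simple stand-in for "significantly more", which the abstract does not define. It is not the authors' implementation.

```python
from collections import Counter

# Hypothetical review records: (lexicon entry, human verdict on the automated coding).
# Entry names and counts are placeholders, not data from the paper.
reviews = [
    ("entry_a", "correct"), ("entry_a", "correct"),
    ("entry_a", "correct"), ("entry_a", "incorrect"),
    ("entry_b", "incorrect"), ("entry_b", "incorrect"), ("entry_b", "incorrect"),
    ("entry_c", "correct"), ("entry_c", "incorrect"),
]


def tally(records):
    """Count 'correct' and 'incorrect' human reviews per lexicon entry."""
    counts = {}
    for entry, verdict in records:
        counts.setdefault(entry, Counter())[verdict] += 1
    return counts


def partition(counts, ratio=2.0):
    """Split entries into keep / drop / unclear lists.

    Assumption: one verdict counts as 'significantly more' frequent when it
    occurs at least `ratio` times as often as the other verdict.
    """
    keep, drop, unclear = [], [], []
    for entry, c in sorted(counts.items()):
        correct, incorrect = c["correct"], c["incorrect"]
        if correct >= ratio * max(incorrect, 1):
            keep.append(entry)
        elif incorrect >= ratio * max(correct, 1):
            drop.append(entry)
        else:
            unclear.append(entry)
    return keep, drop, unclear


keep, drop, unclear = partition(tally(reviews))
print("keep:", keep)
print("drop:", drop)
print("unclear:", unclear)
```

With a real review set, the `drop` list corresponds to the entries whose removal the abstract proposes, and the `unclear` list marks entries that would need more reviews or a proper significance test before a decision.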

Author information

Authors and Affiliations

Nottingham Trent University, Nottingham, NG1 4FQ, UK
Jake Street & Funminiyi Olajide

Corresponding author

Correspondence to Jake Street.

Editor information

Editors and Affiliations

Department of Automatic Control and Systems Engineering, The University of Sheffield, Sheffield, UK
George Panoutsos
Department of Automatic Control and Systems Engineering, Head of the Intelligent Systems Research Laboratory, The University of Sheffield, Sheffield, South Yorkshire, UK
Mahdi Mahfouf
Department of Automatic Control and Systems Engineering, The University of Sheffield, Sheffield, South Yorkshire, UK
Lyudmila S Mihaylova

About this paper

Cite this paper

Street, J., Olajide, F. (2024). Using Natural Language Processing and Machine Learning to Detect Online Grooming Attacks. In: Panoutsos, G., Mahfouf, M., Mihaylova, L.S. (eds) Advances in Computational Intelligence Systems. UKCI 2022. Advances in Intelligent Systems and Computing, vol 1454. Springer, Cham. https://doi.org/10.1007/978-3-031-55568-8_22

DOI: https://doi.org/10.1007/978-3-031-55568-8_22
Published: 19 May 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-55567-1
Online ISBN: 978-3-031-55568-8
eBook Packages: Intelligent Technologies and Robotics (R0)
