Using Natural Language Processing and Machine Learning to Detect Online Grooming Attacks

  • Conference paper
Advances in Computational Intelligence Systems (UKCI 2022)

Abstract

This paper develops a Natural Language Processing solution that incorporates Online Grooming phases. The solution coded each phrase in chatroom transcripts between a Honeypot profile and an Online Groomer to one of these phases. The coding was then compared with a human-reviewed coding of the same phrases to check its accuracy. The coding identified the Initiation phase (with underage declaration detection) in 75% of the transcripts, with a 3% false positive rate. Most detections for the Risk Assessment and Sexual phases were incorrect. Analysis showed that some of the words/phrases used in Sexual phase detection (21%) had significantly more ‘incorrect’ human reviews than ‘correct’. Because 38% of these words/phrases had significantly more ‘correct’ human reviews than ‘incorrect’, it is likely that an effective solution could be established by filtering the word/phrase list.
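The workflow summarised in the abstract (code each phrase to a phase, compare with a human review, then filter the words/phrases that are mostly reviewed as incorrect) can be illustrated with a minimal sketch. This is not the authors' implementation: the lexicon in `PHASE_TERMS`, the function names, and the plain substring matching are all assumptions made for illustration, and the simple "more correct than incorrect" rule stands in for whatever significance criterion the paper applies.

```python
from collections import Counter

# Hypothetical lexicon: the paper's actual word/phrase lists are not reproduced here.
PHASE_TERMS = {
    "initiation": ["how old are you", "where are you from"],
    "risk_assessment": ["parents home", "are you alone"],
    "sexual": ["<term_a>", "<term_b>"],  # placeholders only
}


def code_phrase(phrase):
    """Return (phase, matched_term) for the first lexicon hit, else (None, None)."""
    text = phrase.lower()
    for phase, terms in PHASE_TERMS.items():
        for term in terms:
            if term in text:  # assumption: plain substring matching
                return phase, term
    return None, None


def tally_reviews(reviewed):
    """Count 'correct'/'incorrect' human reviews per matched term.

    `reviewed` is an iterable of (phrase, human_phase) pairs.
    Phrases with no lexicon hit are skipped.
    """
    tallies = {}
    for phrase, human_phase in reviewed:
        phase, term = code_phrase(phrase)
        if term is None:
            continue
        verdict = "correct" if phase == human_phase else "incorrect"
        tallies.setdefault(term, Counter())[verdict] += 1
    return tallies


def filter_terms(tallies):
    """Keep terms with more 'correct' than 'incorrect' reviews (not a significance test)."""
    return sorted(t for t, c in tallies.items() if c["correct"] > c["incorrect"])
```

A term that the human reviewer mostly marks as incorrect is dropped by `filter_terms`, which mirrors the filtration step the abstract proposes; a real system would need a statistical threshold and token-level rather than substring matching.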

Author information

Corresponding author

Correspondence to Jake Street.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Street, J., Olajide, F. (2024). Using Natural Language Processing and Machine Learning to Detect Online Grooming Attacks. In: Panoutsos, G., Mahfouf, M., Mihaylova, L.S. (eds) Advances in Computational Intelligence Systems. UKCI 2022. Advances in Intelligent Systems and Computing, vol 1454. Springer, Cham. https://doi.org/10.1007/978-3-031-55568-8_22
