A Comprehensive Review on Transforming Security and Privacy with NLP

Garg, Rachit; Gupta, Anshul; Srivastava, Atul

doi:10.1007/978-981-97-0641-9_10

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 918)

Included in the following conference series:

International Conference on Cryptology & Network Security with Machine Learning

Abstract

Natural language processing (NLP) is a rapidly advancing field with broad implications for security and privacy. This article reviews the wide range of NLP applications for security and privacy protection. Through a systematic literature review, we identify the most important NLP approaches employed for security tasks such as phishing email detection, cyberthreat analysis, anomaly detection, and privacy-aware text processing. We also discuss the ethical issues that must be taken into account when deploying NLP, including defenses against adversarial attacks, responsible AI practices, and safeguards for personal data. In examining the potential benefits and risks of NLP systems, the study emphasizes the importance of responsible and ethical use. Future directions for strengthening the use of NLP for privacy and security in the data-driven era are also discussed. This study is of interest to researchers, practitioners, and policymakers who want to understand NLP and how it relates to security and privacy.

Author information

Authors and Affiliations

Jain University, Bangalore, India
Rachit Garg
SP Jain School of Global Management, Dubai, UAE
Anshul Gupta
Amity School of Engineering and Technology Lucknow, Lucknow, India
Atul Srivastava

Corresponding author

Correspondence to Rachit Garg.

Editor information

Editors and Affiliations

Department of Mathematics, Pranveer Singh Institute of Technology, Kanpur, India
Atul Chaturvedi
Department of Mathematics, Indian Institute of Technology Jammu, Jammu, India
Sartaj Ul Hasan
Applied Statistics Unit, Indian Statistical Institute, Kolkata, India
Bimal Kumar Roy
Department of Mathematics, Bar-Ilan University, Ramat-Gan, Israel
Boaz Tsaban

About this paper

Cite this paper

Garg, R., Gupta, A., Srivastava, A. (2024). A Comprehensive Review on Transforming Security and Privacy with NLP. In: Chaturvedi, A., Hasan, S.U., Roy, B.K., Tsaban, B. (eds) Cryptology and Network Security with Machine Learning. ICCNSML 2023. Lecture Notes in Networks and Systems, vol 918. Springer, Singapore. https://doi.org/10.1007/978-981-97-0641-9_10

DOI: https://doi.org/10.1007/978-981-97-0641-9_10
Published: 23 April 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0640-2
Online ISBN: 978-981-97-0641-9
eBook Packages: Engineering, Engineering (R0)
