Abstract
Natural language processing (NLP) has broad implications for privacy and security. This article reviews the wide range of NLP applications for privacy protection. Through a systematic literature review, we identify the most important NLP approaches employed for various security tasks, such as phishing email detection, cyberthreat analysis, anomaly detection, and privacy-aware text processing. We also discuss the ethical issues that must be taken into account when deploying NLP, including defenses against adversarial attacks, responsible AI practices, and safeguards for personal data. In examining the possible benefits and hazards of NLP systems, the study emphasizes the significance of responsible and ethical use. Future directions for strengthening the use of NLP for privacy and security in the data-driven era are also discussed. This study is of interest to researchers, practitioners, and policymakers who want to learn about natural language processing and how it relates to security and privacy.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Garg, R., Gupta, A., Srivastava, A. (2024). A Comprehensive Review on Transforming Security and Privacy with NLP. In: Chaturvedi, A., Hasan, S.U., Roy, B.K., Tsaban, B. (eds) Cryptology and Network Security with Machine Learning. ICCNSML 2023. Lecture Notes in Networks and Systems, vol 918. Springer, Singapore. https://doi.org/10.1007/978-981-97-0641-9_10
DOI: https://doi.org/10.1007/978-981-97-0641-9_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0640-2
Online ISBN: 978-981-97-0641-9
eBook Packages: Engineering, Engineering (R0)