Abstract
LLM-based applications face a critical vulnerability known as sandbox breakout, in which attackers bypass the restrictions that system designers impose to prevent malicious access to the resources for which the LLM agent serves as a user interface. Once outside the sandbox, attackers can potentially steal data, tamper with other users' interactions, or inject malicious code or content into underlying databases. It is therefore essential to identify and address the vulnerabilities that enable such breakouts; they may reside in the sandbox itself, in the operating system, or in the LLM's software dependencies. Mitigating this risk requires robust security measures such as regular model updates, automated model red-teaming, systematic testing, and strict access control policies. In addition, sandboxing should be enforced at multiple levels to shrink the attack surface and keep attackers away from critical systems. Together, these measures significantly reduce the risk of LLM sandbox breakout and improve the security and reliability of LLM-based applications.
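To make the multi-level sandboxing recommendation concrete, the sketch below shows one plausible innermost layer: executing model-generated Python in a separate, resource-limited interpreter process. The run_untrusted helper, the specific CPU and memory caps, and the use of Python's subprocess and resource modules are illustrative assumptions, not a prescription from this chapter; a production deployment would wrap this layer in further isolation such as containers, seccomp profiles, and network egress controls.

import resource
import subprocess
import sys

# Illustrative caps for untrusted, model-generated code (assumed values).
CPU_SECONDS = 2
MEMORY_BYTES = 256 * 1024 * 1024  # 256 MiB address-space limit

def _apply_limits() -> None:
    # Runs in the child process just before exec (POSIX only): cap CPU time
    # and memory so a breakout attempt cannot exhaust host resources.
    resource.setrlimit(resource.RLIMIT_CPU, (CPU_SECONDS, CPU_SECONDS))
    resource.setrlimit(resource.RLIMIT_AS, (MEMORY_BYTES, MEMORY_BYTES))

def run_untrusted(code: str) -> "subprocess.CompletedProcess[str]":
    # -I starts the interpreter in isolated mode: it ignores environment
    # variables, the user's site-packages, and the current directory,
    # shrinking the attack surface available to injected code.
    return subprocess.run(
        [sys.executable, "-I", "-c", code],
        capture_output=True,
        text=True,
        timeout=CPU_SECONDS + 1,   # wall-clock backstop on top of RLIMIT_CPU
        preexec_fn=_apply_limits,
    )

if __name__ == "__main__":
    result = run_untrusted("print(sum(range(10)))")
    print(result.returncode, result.stdout.strip())

Even then, the subprocess boundary is only one layer: the same principle should be repeated at the container, host, and network levels so that a flaw in any single layer does not grant access to critical systems.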
Rights and permissions
Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
About this chapter
Cite this chapter
Majumdar, S., Vogelsang, T. (2024). Towards Safe LLMs Integration. In: Kucharavy, A., Plancherel, O., Mulder, V., Mermoud, A., Lenders, V. (eds) Large Language Models in Cybersecurity. Springer, Cham. https://doi.org/10.1007/978-3-031-54827-7_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-54826-0
Online ISBN: 978-3-031-54827-7
eBook Packages: Computer Science, Computer Science (R0)