Security risks in text-based AI applications
Text-based AI applications, better known as Large Language Models, have had an explosive impact in recent years both in efficiency and in popularity. It is a tool that has many different areas of use to streamline the everyday lives of users. Unfortunately, new technologies often come with new vulnerabilities. In this article we highlight these vulnerabilities and demonstrate how the OWASP Top 10 for Large Language Model Applications can be used as a resource to secure these types of applications.
Large Language Models (LLMs) represent a remarkable advancement in artificial intelligence, enabling computers to understand and generate human-like text. These models are built upon deep learning architectures and are trained on vast amounts of text data to develop a nuanced understanding of language. One of the more prominent achievements in this field was the release of ChatGPT, with the latest version GPT-4(Generative Pre-trained Transformer) as of this writing. Some of the practical use cases of LLMs are:
- Content Generation: LLMs are widely employed to generate human-like text content. This includes creating articles, blog posts, marketing copy, and even creative writing. Businesses can save time and resources by using these models to draft high-quality content.
- Chatbots and Virtual Assistants: LLMs power conversational AI applications like chatbots and virtual assistants. These systems can engage in human-like conversations, helping users with inquiries, providing information, or even offering companionship.
- Coding Assistance: Developers can benefit from LLMs that provide code suggestions, auto-completion, and debugging assistance, enhancing programming efficiency.
The OWASP (Open Worldwide Application Security Project) Foundation is a globally recognized non-profit organization dedicated to improving the security of software. The primary goal of the OWASP Foundation is to make security visible and accessible, ensuring that software is designed, developed, and deployed with security in mind. OWASP has closer to 300 different project that helps securing the software of the world. One of those projects is the OWASP Top 10 for Large Language Model Applications which aims to educate developers, designers, architects, managers, and organizations about the potential security risks when deploying and managing Large Language Models. The project lists the top 10 most critical vulnerabilities that most often occurs in LLM applications.
The most critical being LLM01 - Prompt injection: Which involves bypassing filters or manipulating the LLM using carefully crafted prompts that make the model ignore previous instructions or perform unintended actions. These vulnerabilities can lead to unintended consequences, including data leakage, unauthorized access, or other security breaches.
This attack can for example be used in a malicious way by trying to overwrite or reveal the underlying system prompt. This may allow attackers to exploit backend systems by interacting with insecure functions and data stores accessible through the LLM. We can use ChatGPT to demonstrate the concept of this attack in a simpler way by trying to bypass their filter that is meant to protect users from harm and safety concerns:
In the example below we can see that the model will not provide information to an inappropriate direct question.
There are multiple different ways to bypass these kind of filters, one way is to instead of asking a direct question we introduce a whole different context and problem to solve by doing so it’s possible to bypass the filter as shown in the example below.
The full list of the OWASP top 10 for Large Language Model applications is shown below and if you are interested in learning more about these types of attacks, with common attack examples and remediation techniques to prevent attacks look for the links under further reading in the bottom of this article.