Large Language Models (LLMs) are revolutionizing the way we interact with technology. As a result, SaaS vendors are vying for a competitive edge by integrating AI features, offering enterprises tools such as AI-based sales insights or coding co-pilots.
LLM – A User, An Application or a Bit of Both?
Traditionally, zero-trust security models have relied on a clear distinction between users and applications. Yet, LLM-integrated applications disrupt this distinction, functioning simultaneously as both. This reality introduces a new set of security vulnerabilities, such as data leakage, prompt injection, risky access to online resources, and even access to corporate resources on behalf of employees. To address these challenges in LLM deployment, a unique set of zero-trust measures is needed. Below are a few examples of what can go wrong when adopting GenAI services.
Prompt injection – When you hire Harry Potter
In a prompt injection attack, an attacker crafts inputs that manipulate an LLM into harmful or unethical behavior.
Prompts can be injected directly by the attacker, or indirectly by an unwitting user who is simply using an LLM-based application for its intended purpose.
Four types of prompt injections are:
- Direct Prompt Injection involves attackers entering specific prompts to change the LLM’s behavior or output in a harmful way. An attacker might directly instruct an LLM to role-play as an unethical model, leak sensitive information, or execute harmful code.
- Indirect Prompt Injection is subtler: the attacker manipulates the data sources the LLM uses, which makes this type of attack much more dangerous and harder to detect within organizational environments.
- Multimodal Prompt Injection targets LLMs that accept inputs such as images, video and audio. Instructions hidden in the media alter the behavior of the application bot, for example by making it chat like Harry Potter.
- Denial-of-Service (DoS) attacks can also be perpetrated through prompt injection, triggering resource-heavy operations that overload the LLM and result in service degradation or high costs.
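As a rough illustration of treating every prompt as untrusted, the sketch below screens input before it reaches the model. The pattern list, the size budget and the function name screen_input are assumptions made for this example, not part of any product or library. Keyword matching is easy to evade, so at best this would be one layer among several.

```python
import re

# Hypothetical phrases that often appear in direct injection attempts.
# Illustrative only: real attacks vary widely and can evade simple patterns.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.IGNORECASE),
    re.compile(r"disregard (the )?(system|previous) prompt", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
]

# Assumed size budget, meant to limit resource-heavy requests (the DoS case).
MAX_INPUT_CHARS = 4000


def screen_input(text: str) -> list[str]:
    """Return the reasons an input looks risky; an empty list means no finding."""
    findings = []
    if len(text) > MAX_INPUT_CHARS:
        findings.append("input exceeds size budget")
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(text):
            findings.append(f"matched pattern: {pattern.pattern}")
    return findings
```

The same check could be applied to retrieved documents and transcribed media, since indirect and multimodal injections arrive through those channels rather than through the user's own prompt.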
Sensitive data leakage – Can your AI keep a secret?
Models can be fine-tuned or augmented with access to data to achieve better domain-specific results. For example, you might fine-tune the model behind your customer support bot on past trouble tickets. But can your AI keep a secret?
In one study, researchers used the fine-tuning mechanism of ChatGPT to extract the names and email addresses of more than 30 New York Times employees. This example shows how sensitive data used to pre-train or fine-tune an LLM can be leaked, creating regulatory risk. LLMs therefore cannot be trusted to keep sensitive data from leaking.
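One common mitigation is to strip sensitive fields from records before they are used for fine-tuning. The sketch below redacts email addresses only. The regular expression is a simplification rather than a full address parser, and the function name and placeholder are assumptions chosen for this example.

```python
import re

# Simplified email pattern: good enough for illustration, not RFC-complete.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text: str, placeholder: str = "[EMAIL]") -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub(placeholder, text)
```

Names, phone numbers and account identifiers would need their own handling, and pattern-based redaction tends to miss free-form mentions, so this reduces exposure rather than eliminating it.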
An impressionable student – Training-associated risks
Generative AI models undergo extensive training on diverse datasets, often encompassing most internet content. The training process involves pre-training on large datasets for broad language and world understanding, followed by fine-tuning for specific goals using curated datasets.
In data poisoning, attackers compromise the security of these models by manipulating a small fraction of the training data, as little as 0.01%. Just as models and users cannot be blindly trusted, the integrity and security of the training data cannot be assumed either.
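To put that figure in perspective, 0.01% of a dataset of one million records is only 100 records. One way to notice later tampering with a curated dataset is to fingerprint a reviewed snapshot and compare against it before each training run. The sketch below assumes the records are available as strings, and the function names are hypothetical. It detects changes made after the snapshot was taken and says nothing about whether the original data was clean.

```python
import hashlib
import hmac


def fingerprint(records: list[str]) -> str:
    """SHA-256 over all records, each length-prefixed so boundaries are unambiguous."""
    digest = hashlib.sha256()
    for record in records:
        data = record.encode("utf-8")
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()


def is_unchanged(records: list[str], trusted_fingerprint: str) -> bool:
    """Compare the current fingerprint with one recorded from a reviewed snapshot."""
    return hmac.compare_digest(fingerprint(records), trusted_fingerprint)
```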
Access control – Welcome to the Wild Wild West
A growing number of organizations are integrating LLMs into multi-component applications, or “agents”. These integrations enhance the LLM with capabilities such as internet access, retrieval of corporate resources, and the ability to perform various actions on them. Notably, OpenAI’s recent launch of its plugin store facilitates widespread access to LLM augmentations.
Access to the internet
Fetching real-time data from the internet can be immensely valuable to users. These augmentations allow LLMs to provide better responses to user queries based on up-to-date information. However, giving LLMs internet access presents a serious challenge, specifically in the context of prompt injection. In recent examples, malicious instructions inserted in URLs caused Bing Chat to persuade users to visit a malicious website or to reveal sensitive information, which was then sent to an external server.
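One way to narrow this exposure is to let the application fetch only from hosts that have been explicitly approved. The sketch below is a minimal allowlist check. The host names and the function name is_fetch_allowed are placeholders invented for this example.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; a real deployment would manage this as policy.
ALLOWED_HOSTS = {"docs.example.com", "status.example.com"}


def is_fetch_allowed(url: str) -> bool:
    """Allow only HTTPS requests whose exact host is on the allowlist."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    # hostname is lowercased and excludes any port or embedded credentials.
    host = parsed.hostname
    return host is not None and host in ALLOWED_HOSTS
```

Matching the exact host, rather than searching for a substring, rejects lookalikes such as docs.example.com.evil.test. An allowlist does not make the fetched content safe, so pages from approved hosts would still be treated as untrusted input.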
Access to corporate resources
LLM-integrated applications can be designed to interact with corporate resources such as databases or applications. However, this type of access poses a risk even when non-malicious users are involved, as they may inadvertently gain access to sensitive data and resources by interacting with the LLM-integrated application.
In a zero-trust AI access framework, neither the behavior of the LLM-integrated application nor its decisions about accessing corporate resources are trusted. This covers which resources are accessed and when, what data is exported to which user, and what operations are performed on these resources.
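In practice, this suggests placing a policy check between the LLM-integrated application and each resource, evaluated against the end user's own permissions and denying by default. The sketch below illustrates the idea. The policy table, user names, resource names and the authorize function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRequest:
    user: str       # taken from the authenticated session, never from model output
    resource: str   # resource the application wants to touch on the user's behalf
    operation: str  # for example "read" or "write"


# Hypothetical policy table mapping (user, resource) to permitted operations.
POLICY = {
    ("alice", "crm/accounts"): {"read"},
    ("bob", "crm/accounts"): {"read", "write"},
}


def authorize(request: ResourceRequest) -> bool:
    """Deny by default: allow only operations explicitly granted to this user."""
    allowed_operations = POLICY.get((request.user, request.resource), set())
    return request.operation in allowed_operations
```

Because the decision depends only on the request and the policy, a manipulated model cannot talk its way into broader access. The most it can do is ask for something the user was already allowed to do.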
Zero Trust AI Access for safe GenAI adoption
With exponential productivity comes exponential risk.
A Zero Trust AI Access (ZTAI) approach proposes viewing LLM-integrated applications as entities that need strict access control, data protection and threat prevention policies, creating a more stringent line of defense than would be needed to secure the average employee.