Solving the data security challenge for AI builders
This demo highlights the potential risks of using contextual data with LLMs and demonstrates how HashiCorp Vault can integrate with Pinecone to tackle AI data security challenges.
This post takes a hands-on look at implementing a Microsoft Azure AI text search application that uses the Azure OpenAI GPT-3 models and Pinecone (a vector database), combined with HashiCorp Vault, to provide encryption and decryption capabilities that help protect the integrity of data in a retrieval-augmented generation (RAG) based large language model (LLM) application.
Generative AI chatbots such as ChatGPT, Google’s Gemini, and Microsoft’s Copilot are powerful tools that generate human-like text based on user prompts. However, the ability of these generative AI systems to follow instructions also makes them vulnerable to misuse. “Prompt injections” can let attackers bypass safety guardrails and manipulate the model’s responses. For instance, users have coerced ChatGPT into endorsing harmful content or suggesting illegal activities.
Even as companies work to improve LLM security, integrating AI chatbots into products that interact with the internet opens up new risks. Many companies are already using chatbots like ChatGPT for real-world actions such as booking flights or scheduling meetings. However, this could allow malicious actors to exploit these chatbots to create phishing attacks or leak private information.
» RAG makes it even more complicated
Securing RAG-enhanced AI applications is complex. RAG fetches information from external sources so the model can provide more accurate and comprehensive responses, but those sources expand the attack surface. RAG increases the risk of data leakage, because sensitive information from external databases can inadvertently be included in the model's responses. And because RAG integrates diverse data sources, bad actors have more surface area to work with: they can manipulate those sources to inject harmful or misleading information that the AI system then disseminates.
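To make that risk concrete, here is a minimal sketch of the RAG pattern in plain Python, with a hard-coded list and a toy keyword matcher standing in for a real vector database and retriever (none of this is the demo's actual code): whatever the retrieval step returns is spliced verbatim into the prompt, which is why a poisoned or leaky data source flows straight into the model's output.

# Toy document store standing in for a vector database
documents = [
    'The Vault transit engine encrypts data without storing it.',
    'Pinecone stores high-dimensional vectors for similarity search.',
]

def retrieve(query: str) -> list[str]:
    # Toy retriever: return documents sharing any word with the query
    words = set(query.lower().split())
    return [doc for doc in documents if words & set(doc.lower().split())]

def build_prompt(query: str) -> str:
    # Retrieved text is spliced in verbatim: anyone who can write to the
    # document store can steer the model's response from here
    context = '\n'.join(retrieve(query))
    return f'Answer using only this context:\n{context}\n\nQuestion: {query}'

print(build_prompt('How does Pinecone similarity search work?'))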
In this post’s example AI text search application, we use HashiCorp Terraform to build an Azure OpenAI-based application that provides inputs to the LLM prompt. It doesn't train the model, as the LLM is pre-trained using public data, but it does generate responses augmented by information from the additional context the user adds. The demo leverages Microsoft Azure to create an image in the Azure Container Registry (ACR) and uses this image to build a container within Azure Kubernetes Service (AKS). We then use Terraform to render a template file from the Terraform outputs and deploy the application into the container.
As part of this architecture, we also use Pinecone, a vector database that plays a crucial role in storing and retrieving relevant information based on the AI search's queries. Pinecone enables efficient searching and retrieval of data, enhancing the AI's ability to provide accurate and relevant responses. However, it also introduces new vulnerabilities: Attackers could exploit weaknesses in the vector database to gain unauthorized access to sensitive information stored within Pinecone, or manipulate the database to alter the information retrieved by the AI, leading to inaccurate or harmful responses. Ensuring the security of the data in the vector database is essential to protect against such threats and maintain the integrity of the RAG system. This is where HashiCorp Vault can help.
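For reference, storing and querying vectors with Pinecone's Python client looks roughly like this (the API key, index name, and four-element embedding are placeholders for this sketch, not values from the demo). Note that, by default, the metadata attached to each vector is stored and returned in plaintext:

from pinecone import Pinecone

# Connect to an existing Pinecone index (placeholder key and name)
pc = Pinecone(api_key='YOUR_PINECONE_API_KEY')
index = pc.Index('your-pinecone-index')

# Without encryption, the metadata is stored in plaintext
index.upsert(vectors=[{
    'id': 'doc-1',
    'values': [0.1, 0.2, 0.3, 0.4],  # placeholder embedding; real ones are much longer
    'metadata': {'text': 'Sensitive contextual data for the LLM prompt'},
}])

# Query for nearest neighbors and read the metadata back
results = index.query(vector=[0.1, 0.2, 0.3, 0.4], top_k=3, include_metadata=True)
for match in results.matches:
    print(match.id, match.metadata['text'])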
Building your own encryption engine is challenging due to the complex nature of cryptographic algorithms and the rigorous security requirements needed to protect sensitive data. Implementing algorithms that are both secure and efficient demands deep cryptographic expertise, and continuous testing is required to ensure there are no vulnerabilities or weaknesses that attackers could exploit. Proper key management, access control, and compliance with various regulations further complicate the process, and even small errors in design or implementation can lead to significant security flaws. That makes a dedicated tool like Vault, which provides robust secrets management and data encryption, an easy choice. Using Vault, sensitive information can be encrypted before it is stored in Pinecone, so the data remains protected even if the vector database is compromised. Vault also provides fine-grained access control, so only authorized entities can decrypt and access the sensitive information.
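As a sketch of what that fine-grained control can look like: with the hvac Python client you can register a Vault policy that lets an ingestion service encrypt with a transit key but never decrypt. The policy name and token handling here are illustrative assumptions, not part of the demo repository:

import hvac

client = hvac.Client(url='http://127.0.0.1:8200', token='YOUR_VAULT_TOKEN')

# Encrypt-only policy for one transit key: a service holding a token with
# this policy can write ciphertext into Pinecone but can never read it back
encrypt_only_policy = '''
path "transit/encrypt/your-transit-key" {
  capabilities = ["update"]
}
'''

client.sys.create_or_update_policy(name='rag-ingest-encrypt-only',
                                   policy=encrypt_only_policy)

# Issue a short-lived token bound to that policy for the ingestion service
token = client.auth.token.create(policies=['rag-ingest-encrypt-only'], ttl='1h')
print(token['auth']['client_token'])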
» How to secure a Pinecone-based RAG system
To illustrate how HashiCorp Vault can enhance the security of a RAG system, we have built a demo using Pinecone as the vector database. This demo shows how to integrate Vault into the RAG workflow to encrypt and manage sensitive data.
» Architecture overview
The architecture for this demo involves several components:
- AI language model: Generates responses and retrieves relevant information based on user prompts.
- Pinecone vector database: Stores and retrieves vectorized information.
- HashiCorp Vault: Manages secrets and encrypts sensitive data.
- The RAG workflow: Manages the interaction between the AI search, Pinecone, and Vault.
» Building the demo
Creating this demo involves five steps:
1. Set up the environment:
- Clone this repository.
- Install Vault and configure it to handle secret management and encryption.
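A minimal configuration sketch using the hvac Python client, assuming a dev-mode Vault server at the default local address (the key name is a placeholder used throughout this post):

import hvac

client = hvac.Client(url='http://127.0.0.1:8200', token='YOUR_VAULT_TOKEN')

# Enable the transit secrets engine (encryption as a service)
client.sys.enable_secrets_engine(backend_type='transit', path='transit')

# Create a named encryption key for the demo
client.secrets.transit.create_key(name='your-transit-key')

The same two steps can also be done with the Vault CLI: vault secrets enable transit followed by vault write -f transit/keys/your-transit-key.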
2. Integrate Vault with Pinecone:
- Modify the data flow to include encryption and decryption steps using Vault.
- When storing data in Pinecone, first encrypt it using Vault's transit secrets engine.
- When retrieving data from Pinecone, decrypt it with Vault before passing it to the AI search.
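A sketch of the storage path under these assumptions (the same placeholder credentials, key, index, and embedding as above; note that the transit engine expects base64-encoded plaintext):

import base64
import hvac
from pinecone import Pinecone

vault = hvac.Client(url='http://127.0.0.1:8200', token='YOUR_VAULT_TOKEN')
index = Pinecone(api_key='YOUR_PINECONE_API_KEY').Index('your-pinecone-index')

text = 'Sensitive contextual data for the LLM prompt'

# Encrypt with Vault's transit engine (plaintext must be base64-encoded)
b64 = base64.b64encode(text.encode('utf-8')).decode('utf-8')
resp = vault.secrets.transit.encrypt_data(name='your-transit-key', plaintext=b64)
ciphertext = resp['data']['ciphertext']  # e.g. 'vault:v1:...'

# Store only the ciphertext in Pinecone: a database compromise now yields
# nothing readable without access to Vault
index.upsert(vectors=[{
    'id': 'doc-1',
    'values': [0.1, 0.2, 0.3, 0.4],  # placeholder embedding
    'metadata': {'text': ciphertext},
}])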
3. Implement the encryption and decryption logic:
- Use Vault's API to handle encryption and decryption, as shown in this Python example code:
import base64
import hvac

# Initialize the Vault client
client = hvac.Client(url='http://127.0.0.1:8200', token='YOUR_VAULT_TOKEN')

# Vault's transit engine operates on base64-encoded plaintext
plaintext_b64 = base64.b64encode(b'your-plain-text-data').decode('utf-8')

# Encrypt data
response = client.secrets.transit.encrypt_data(
    name='your-transit-key',
    plaintext=plaintext_b64
)
ciphertext = response['data']['ciphertext']

# Decrypt data (the response carries base64-encoded plaintext)
response = client.secrets.transit.decrypt_data(
    name='your-transit-key',
    ciphertext=ciphertext
)
plaintext = base64.b64decode(response['data']['plaintext']).decode('utf-8')
4. Modify the RAG flow:
- Update the RAG flow to include steps for encryption before storing data in Pinecone and decryption after retrieving data.
- Ensure that the deployed application can handle the encrypted and decrypted data seamlessly.
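A sketch of the corresponding retrieval path (same placeholders as above; in the real demo the query embedding would come from the Azure OpenAI embedding model rather than a hard-coded vector):

import base64
import hvac
from pinecone import Pinecone

vault = hvac.Client(url='http://127.0.0.1:8200', token='YOUR_VAULT_TOKEN')
index = Pinecone(api_key='YOUR_PINECONE_API_KEY').Index('your-pinecone-index')

# Query Pinecone; the metadata comes back as Vault ciphertext
results = index.query(vector=[0.1, 0.2, 0.3, 0.4], top_k=3, include_metadata=True)

context_chunks = []
for match in results.matches:
    resp = vault.secrets.transit.decrypt_data(name='your-transit-key',
                                              ciphertext=match.metadata['text'])
    context_chunks.append(base64.b64decode(resp['data']['plaintext']).decode('utf-8'))

# Only after decryption does plaintext context enter the LLM prompt
prompt = 'Answer using only this context:\n' + '\n'.join(context_chunks)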
5. Test the integration:
- Run the demo to ensure the data is correctly encrypted before storage and decrypted after retrieval.
- Validate that the deployed application can generate accurate responses using the encrypted data workflow.
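A minimal round-trip check under the same assumptions (in the full demo the ciphertext would travel through Pinecone between the two transit calls):

import base64
import hvac

client = hvac.Client(url='http://127.0.0.1:8200', token='YOUR_VAULT_TOKEN')

original = 'Sensitive contextual data for the LLM prompt'
b64 = base64.b64encode(original.encode('utf-8')).decode('utf-8')

# Encrypt, then immediately decrypt, and verify the round trip
ciphertext = client.secrets.transit.encrypt_data(
    name='your-transit-key', plaintext=b64)['data']['ciphertext']
decrypted_b64 = client.secrets.transit.decrypt_data(
    name='your-transit-key', ciphertext=ciphertext)['data']['plaintext']

assert ciphertext.startswith('vault:'), 'unexpected ciphertext format'
assert base64.b64decode(decrypted_b64).decode('utf-8') == original
print('encryption round trip OK')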
See the full example in our git repository.
You can significantly enhance data security by integrating HashiCorp Vault into a RAG system built with Pinecone and Terraform. This demo showcases how encryption and secrets management can protect sensitive information, mitigate risks, and support compliance with data protection regulations.
» More on accelerating AI adoption on Azure with HashiCorp
If you’d like to learn more about using HashiCorp products for AI use cases, check out these blog posts: