Indirect Prompt Injection into LLMs

#175 - Does your AI bot see prompts everywhere? Can it take instructions from anywhere?

Feb 19, 2025

AI systems are so keen to follow instructions, that they seem to find prompts everywhere! It’s relatively easy to prompt an AI and update it’s long term memory using this technique.

Haeckel’s embryos as a question on the cover of a magazine that promoted evolution teaching after the Scopes trial. The answer was number 9.

Prompt injection is when an attacker writes prompts that causes an LLM to behave in a ways that it should not, or reveal information that it should not. There are two types of prompts:

✅ System Prompts - A prompt that sets the overall context and guidelines for the use of the LLM. These are typically like “You are a helpful assistant” or “You are a red teamer” etc. It helps set the tone of the conversation. If you have created your own GPTs or Gems you know that you have specify the system prompt.

The process of creating a new Gem - Note the “Instructions” here. This serves as as system prompt

The process of creating a new GPT. The System prompt is again “Instructions”

✅ User Prompts - This is where the user chats with the LLM.

Prompt injections can happen for both prompts.

Before we understand indirect prompt injection, there is one more thing to know — long term memory

Since we use LLMs everyday and the kind makers of these LLMs want to make it easy for us to use their LLMs, they store basic information about an individual in something called the ‘Long Term Memory’ of the LLM. This is why, the more you chat with an LLM, the more targeted the output is. Once, when I was asking it about perspective drawing, the LLM ended it with “Would you like to know more about perspective drawing in the context of cybersecurity” 🤷

As LLM implementations improve their security, prompt injections at system and user level are less and less effective. A new way to inject something into an AI is inject an instruction when the AI is performing a task. A malicious document, containing instructions to update the long term memory of the AI can be inserted into a long document. When the document is uploaded to AI and a task is given to it, the AI follows the instructions in the document and updates the long term memory as per the instruction.

This has been found in Gemini, by Google. Read this article and see the video below to know more.

Take Action:

This is a cause of concern, especially in Gemini, as Google allows a Google Workspace user to connect their workspace to Gemini. This means Google Docs, Sheets, etc in your workspace have the potential to be compromised.

If you are a heavy Google Workspace user, do not link Google workspace to Gemini till more clarity is obtained.

You can change Google Workspace admin console by going to Generative AI, Gemini App and clicking on Workspace extensions. Then you get this:

Screengrab of Google admin settings for allowing connection of Workspace to Gemini

Make sure that this is disabled.

Patrick Jordan

Feb 19

Good post. All the major models - open and closed - have proven to be vulnerable at one point or other, often more than once. This is one of the reasons I appreciate Anthropic; they have been proactive in pen testing and red teaming their models and very transparent about their results, warts and all.

Expand full comment

CyberInsights

Discussion about this post

CyberInsights

Indirect Prompt Injection into LLMs

#175 - Does your AI bot see prompts everywhere? Can it take instructions from anywhere?

Poisoning AI through indirect means

AI systems are so keen to follow instructions, that they seem to find prompts everywhere! It’s relatively easy to prompt an AI and update it’s long term memory using this technique.

Discussion about this post