Prompt injection attacks have been one of the bugbears of modern AI models: an unsolved problem that has made it dangerous, among other things, to let LLMs act on untrusted input. A lot of people have worked on the problem, but progress hasn't been promising.
But as Simon Willison points out, this is changing:
"In the two and a half years that we’ve been talking about prompt injection attacks I’ve seen alarmingly little progress towards a robust solution. The new paper Defeating Prompt Injections by Design from Google DeepMind finally bucks that trend. This one is worth paying attention to.
[...] CaMeL really does represent a promising path forward though: the first credible prompt injection mitigation I’ve seen that doesn’t just throw more AI at the problem and instead leans on tried-and-proven concepts from security engineering, like capabilities and data flow analysis."
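To make that last idea concrete, here's a minimal sketch, in Python with invented names, of what "capabilities and data flow analysis" can look like in practice: values that arrive from untrusted content carry their provenance with them, and a privileged tool checks that provenance before acting. This is only an illustration of the general security-engineering idea, not CaMeL's actual design.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: a toy "tainted value" that records where data came
# from, plus a capability check before a privileged tool runs. All names here
# are invented for the example; CaMeL's real mechanism is more sophisticated.

@dataclass(frozen=True)
class Tainted:
    value: str
    sources: frozenset = field(default_factory=frozenset)  # e.g. {"untrusted_email"}

def from_untrusted(text: str, source: str) -> Tainted:
    """Wrap data that arrived from outside the user's control."""
    return Tainted(text, frozenset({source}))

def send_email(to: Tainted | str, body: Tainted | str, *, allowed_sources: set[str]):
    """A privileged tool: refuses arguments whose provenance isn't allowed."""
    for name, arg in (("to", to), ("body", body)):
        if isinstance(arg, Tainted) and not arg.sources <= allowed_sources:
            raise PermissionError(
                f"Blocked: '{name}' was derived from {set(arg.sources)}, "
                f"which is not permitted to reach send_email"
            )
    print(f"Sending email to {to if isinstance(to, str) else to.value}")

# The recipient came from attacker-controlled content, so the capability check
# blocks the call even if the model was tricked into making it.
attacker_supplied = from_untrusted("attacker@example.com", "untrusted_email")
try:
    send_email(attacker_supplied, "quarterly report", allowed_sources={"user_prompt"})
except PermissionError as e:
    print(e)
```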
If these technologies are going to be part of our stacks going forward, this problem must be solved. CaMeL is certainly a step forward.
Next, do environmental impact, hallucinations, and ethical training sets.