Episode Details

Embedding Space Attacks | Episode 45

Episode 45 Published 2 months, 1 week ago

Description

In this episode of BHIS Presents: AI Security Ops, the team explores embedding space attacks — a lesser-known but increasingly important threat in modern AI systems — and how attackers can manipulate the mathematical foundations of how models understand data.

Unlike prompt injection, which targets instructions, embedding attacks operate at a deeper level by influencing how data is represented, retrieved, and interpreted inside vector spaces. By subtly altering embeddings or poisoning data sources, attackers can manipulate AI behavior without ever touching the model directly.

Through a hands-on walkthrough of a custom notebook with rich visualizations, this episode breaks down how embeddings work, why they are critical to LLM-powered systems like RAG pipelines, and how attackers can exploit them in real-world scenarios.

We dig into:
- What embeddings are and how AI systems convert text into numerical representations
- How vector spaces enable similarity search and retrieval in LLM applications
- What embedding space attacks are and why they matter for AI security
- How small perturbations in data can drastically change model behavior
- The risks of poisoned data in RAG and vector databases
- How attackers can influence search results and downstream AI outputs
- Why these attacks are subtle, hard to detect, and often overlooked
- The role of visualization in understanding embedding behavior
- Real-world implications for AI-powered applications and workflows
- Defensive considerations when building with embeddings and vector stores

This episode focuses on the foundational layer of AI systems, showing how security risks extend beyond prompts and into the underlying data representations that power modern AI.

⸻

📚 Key Concepts Covered

AI Foundations
- Embeddings and vector representations
- Similarity search and vector space reasoning

AI Security Risks
- Embedding space manipulation
- Data poisoning in vector databases
- Retrieval manipulation in RAG systems

Applications & Impact
- LLM-powered search and assistants
- AI pipelines using embeddings
- Risks in production AI systems

#AISecurity #Embeddings #CyberSecurity #LLMSecurity #AIThreats #BHIS #AIAgents #ArtificialIntelligence #InfoSec

Join the 5,000+ cybersecurity professionals on our BHIS Discord server to ask questions and share your knowledge about AI Security.
https://discord.gg/bhis

(00:00) - Intro & Episode Overview

(01:39) - What Are Embeddings? (AI Only Understands Numbers)

(03:44) - The Embedding Process (Text → Vectors)

(07:43) - Similarity, Classification & Vector Math

(09:55) - Visualizing Embedding Space (2D Projection)

(14:29) - Classifiers

(15:39) - Playing Games with Information

(18:06) - Attack Techniques: Synonyms & Context Manipulation

(20:29) - Context Padding

(27:10) - Collision Attacks, Defenses & Final Thoughts

Click here to watch this episode on YouTube.

Creators & Guests

Brian Fehrman - Host

Bronwen Aker - Host

Derek Banks - Host

Brought to you by:

Black Hills Information Security

https://www.blackhillsinfosec.com

Antisyphon Training

https://www.antisyphontraining.com/

Active Countermeasures

Listen Now

Episode Details

Embedding Space Attacks | Episode 45

Description

Listen Now

Love PodBriefly?