Episode Details
Earley AI Podcast - Episode 91: Real-Time Voice Intelligence, Fraud Detection, and AI Guardrails with Mike Pappas
Description
Why Voice Is Not a Solved Problem - and What Real-Time Audio Intelligence Changes for Enterprise AI
Guest: Mike Pappas, CEO at Modulate
Host: Seth Earley, CEO at Earley Information Science
In this episode, Seth Earley speaks with Mike Pappas, CEO of Modulate, whose work began in gaming - one of the most demanding environments for real-time voice intelligence - and has since expanded to enterprise applications including fraud detection, customer abuse prevention, AI agent guardrails, and sales coaching. They explore why transcription is not the same as understanding, what gets lost when audio is reduced to text, and why voice is the most powerful tool fraudsters have. Mike shares candid and specific insights on deepfake detection, the fine line between safety and surveillance, and what organizations need to put in place before deploying voice AI at scale.
Key Takeaways:
- Transcribing voice and understanding voice are not the same thing - intonation, emotion, cadence, and timbre carry information that transcripts cannot capture.
- Voice AI demos are typically built for pristine environments; the real challenge is building systems that hold up under noise, jargon, and emotional complexity in production.
- Real-time intervention changes behavior more effectively than after-the-fact review - feedback delivered in the moment produces measurable reductions in repeat offenses.
- Voice is the most powerful tool for manipulation because it bypasses rational judgment by triggering emotional responses - and AI is now making voice fraud scalable.
- AI voice agents cannot introspect - they cannot tell when a call is going wrong, which is why a separate supervisory layer is essential for any enterprise voice deployment.
- The line between safety systems and surveillance systems is real; collecting and storing only what is necessary for the specific risk being addressed is both a privacy and a trust requirement.
- Before deploying any voice AI, organizations need to define their KPIs clearly - if the system is driving customer satisfaction down, the deployment is failing regardless of what else it is doing.
Insightful Quotes:
"When you hear a voice, you hear the intonation, you hear the emotion, you hear pregnant pauses - there is so much information being carried in that audio that gets lost when you pull down to a transcript. And whenever we talk to someone who professionally works in a contact center, they are always saying, we know these transcripts are losing tons of good value." - Mike Pappas
"If I am actively harassing you and the platform is able to come in and put a stop to it live in the conversation, that feedback actually systematically changes behavior. Getting an email 30 minutes later saying we noticed you did something wrong - that just infuriates people, it does not lead to change." - Mike Pappas
"There is a fine line between safety systems and surveillance systems. How do you design voice AI that improves safety and trust but does not cross that boundary that makes users and employees uncomfortable?" - Seth Earley
Tune in to discover why real-time voice intelligence is one of the most consequential and least understood frontiers in enterprise AI - and what organizations need to get right before they deploy.
Links
LinkedIn: https://www.linkedin.com/in/mike-pappas-9a30a858/
Website: https://www.modulate.ai