Episode Details
From AI Pilot to Production
Description
Moiz Kohari, VP of Enterprise AI and Data Intelligence at DDN, breaks down what it actually takes to get AI into production and keep it there. If your org is stuck in pilot mode, this conversation will help you spot the real blockers, from trust and hallucinations to data architecture and GPU bottlenecks.
Key takeaways
• GenAI success in the enterprise is less about the demo and more about trust, accuracy, and knowing when the system should say “I don’t know.”
• “Operationalizing” usually fails at the handoff, when humans stay permanently in the loop and the business never captures the full benefit.
• Data architecture is the multiplier. If your data is siloed, slow, or hard to access safely, your AI roadmap stalls, no matter how good your models are.
• GPU spend is only worth it if your pipelines can feed the GPUs fast enough. Many teams are I/O-bound, so utilization stays low and budget is wasted.
• The real win is better decisions, faster. Moving from end-of-day batch thinking to intraday intelligence can materially change risk, margin, and response time.
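To make the I/O-bound point concrete, here is a minimal Python sketch of how a team might check whether its GPUs are starved for data. It is not from the episode: the `StepTiming` record, the function names, and the 0.5 threshold are illustrative assumptions, and the timings are made-up sample values. In practice you would fill them in from your own training or inference loop.

```python
from dataclasses import dataclass


@dataclass
class StepTiming:
    """Hypothetical per-step record; collect these from your own loop."""
    data_wait_s: float  # seconds spent waiting for the next batch to arrive
    compute_s: float    # seconds the accelerator spent processing the batch


def io_wait_fraction(steps):
    """Return the share of total wall time spent waiting on data (0.0 to 1.0)."""
    total_wait = sum(s.data_wait_s for s in steps)
    total = total_wait + sum(s.compute_s for s in steps)
    return total_wait / total if total > 0 else 0.0


def diagnose(steps, io_bound_threshold=0.5):
    """Label a run as I/O-bound when data waits dominate.

    The 0.5 threshold is an arbitrary placeholder, not a recommendation.
    """
    if io_wait_fraction(steps) >= io_bound_threshold:
        return "io_bound"
    return "compute_bound"


# Made-up sample timings: each step waits roughly 3x longer than it computes.
steps = [
    StepTiming(data_wait_s=0.30, compute_s=0.10),
    StepTiming(data_wait_s=0.28, compute_s=0.12),
    StepTiming(data_wait_s=0.32, compute_s=0.08),
]
print(diagnose(steps))
```

If a check like this reports that most of each step is spent waiting, adding more GPUs is unlikely to help until storage, networking, or the data pipeline is faster.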
Timestamped highlights
00:35 What DDN does, and why data velocity matters when GPUs are the pricey line item
02:12 AI vs GenAI in the enterprise, and why “taking the human out” is where value shows up
08:43 Hallucinations, trust, and why “always answering” creates real production risk
12:00 What teams do with the speed gains, and why faster delivery shifts you toward harder problems
12:58 From hours to minutes: how GPU acceleration changes intraday risk and decision-making in finance
20:16 Data architecture choices, POSIX vs object storage, and why your I/O layer can make or break AI readiness
A line worth stealing
“Speed is great, but trust is the frontier. If your system can’t admit what it doesn’t know, production is where the project stops.”
Pro tips you can apply this week
• Pick one workflow where the output can be checked quickly, then design the path from pilot to production up front, including who approves what and how exceptions get handled.
• Audit your bottleneck before you buy more compute. If your GPUs are waiting on data, fix storage, networking, and pipeline throughput first.
• Build “confidence behavior” into the system. Decide when it should answer, when it should cite, and when it should escalate to a human.
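The "confidence behavior" tip can be sketched as a small routing rule in Python. This is one possible shape, not something described in the episode: the `Action` names, the `choose_action` function, and both thresholds are hypothetical, and the sketch assumes your system already produces some confidence score between 0 and 1 and knows whether it retrieved supporting sources.

```python
from enum import Enum


class Action(Enum):
    ANSWER = "answer"
    ANSWER_WITH_CITATIONS = "answer_with_citations"
    ESCALATE = "escalate"


def choose_action(confidence, has_sources,
                  answer_threshold=0.9, cite_threshold=0.6):
    """Decide how the system should respond to a single request.

    confidence: assumed score in [0, 1] from your own evaluation step.
    has_sources: whether supporting documents were retrieved.
    Thresholds are placeholders; tune them against reviewed outputs.
    """
    if confidence >= answer_threshold:
        return Action.ANSWER
    if confidence >= cite_threshold and has_sources:
        # Moderately confident: answer, but show the evidence.
        return Action.ANSWER_WITH_CITATIONS
    # Low confidence, or nothing to back the answer up: hand off to a human.
    return Action.ESCALATE


print(choose_action(0.95, has_sources=False).value)
print(choose_action(0.70, has_sources=True).value)
print(choose_action(0.70, has_sources=False).value)
```

The design choice worth noting is the default: anything that does not clearly qualify for an answer falls through to escalation, so the system fails toward "I don't know" rather than toward always answering.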
Call to action
If you got value from this one, follow the show and turn on notifications so you do not miss the next episode.