Episode Details
The Lakehouse Architecture: Multimodal Data, Delta Lake, and the Future of Data Engineering (with R. Tyler Croy)
Description
In this episode of the Data Engineering Central Podcast, I sit down with R. Tyler Croy for a wide-ranging conversation on the present—and future—of modern data platforms.
Tyler is a long-time open-source contributor who has spent years deep in the data ecosystem, working on projects such as delta-rs and Delta Lake and thinking critically about how real-world data systems are built and maintained. You can watch him on YouTube, read his blog, or work directly with him through his consultancy, Buoyant Data.
This isn’t a hype-driven conversation. It’s a grounded discussion about what’s working, what’s breaking, and what’s coming next.
We dig into:
* What the Lakehouse architecture gets right—and where it still falls short
* Why multimodal data (text, images, audio, video, embeddings) changes everything
* How open table formats like Delta Lake fit into the next generation of data platforms
* The growing gap between data tooling hype and day-to-day data engineering reality
* What skills and architectural thinking will matter most for data engineers over the next decade
If you’re building or operating modern data platforms—and trying to separate real signal from noise—this episode is for you.
This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit dataengineeringcentral.substack.com/subscribe