Podcast Episodes
Bringing Automation To Data Labeling For Machine Learning With Watchful
Episode 316
Summary
Data engineers have typically left the process of data labeling to data scientists or other roles because of its nature as a manual and proce…
3 years, 7 months ago
Collecting And Retaining Contextual Metadata For Powerful And Effective Data Discovery
Episode 315
Summary
Data is useless if it isn’t being used, and you can’t use it if you don’t know where it is. Data catalogs were the first solution to this pro…
3 years, 7 months ago
Useful Lessons And Repeatable Patterns Learned From Data Mesh Implementations At AgileLab
Episode 314
Summary
Data mesh is a frequent topic of conversation in the data community, with many debates about how and when to employ this architectural patter…
3 years, 7 months ago
Optimize Your Machine Learning Development And Serving With The Open Source Vector Database Milvus
Episode 313
Summary
The optimal format for storage and retrieval of data is dependent on how it is going to be used. For analytical systems there are decades of …
3 years, 7 months ago
Interactive Exploratory Data Analysis On Petabyte Scale Data Sets With Arkouda
Episode 312
Summary
Exploratory data analysis works best when the feedback loop is fast and iterative. This is easy to achieve when you are working on small data…
3 years, 7 months ago
What "Data Lineage Done Right" Looks Like And How They're Doing It At Manta
Episode 311
Summary
Data lineage is the roadmap for your data platform, providing visibility into all of the dependencies for any report, machine learning model,…
3 years, 7 months ago
Re-Bundling The Data Stack With Data Orchestration And Software Defined Assets Using Dagster
Episode 310
Summary
The current stage of evolution in the data management ecosystem has resulted in domain and use case specific orchestration capabilities being…
3 years, 7 months ago
Writing The Book That Offers A Single Reference For The Fundamentals Of Data Engineering
Episode 309
Summary
Data engineering is a difficult job, requiring a large number of skills that often don’t overlap. Any effort to understand how to start a car…
3 years, 7 months ago
Joe Reis Flips The Script And Interviews Tobias Macey About The Data Engineering Podcast
Episode 308
Summary
Data engineering is a large and growing subject, with new technologies, specializations, and "best practices" emerging at an accelerating pac…
3 years, 7 months ago
Making The Total Cost Of Ownership For External Data Manageable With Crux
Episode 307
Summary
There are extensive and valuable data sets that are available outside the bounds of your organization. Whether that data is public, paid, or …
3 years, 7 months ago