Podcast Episodes
Building Tools And Platforms For Data Analytics
Episode 95
Summary
Data engineers are responsible for building tools and platforms to power the workflows of other members of the business. Each group of users …
6 years, 9 months ago
A High Performance Platform For The Full Big Data Lifecycle
Episode 94
Summary
Managing big data projects at scale is a perennial problem, with a wide variety of solutions that have evolved over the past 20 years. One of…
6 years, 9 months ago
Digging Into Data Replication At Fivetran
Episode 93
Summary
The extract and load pattern of data replication is the most commonly needed process in data engineering workflows. Because of the myriad sou…
6 years, 9 months ago
Solving Data Discovery At Lyft
Episode 92
Summary
Data is only valuable if you use it for something, and the first step is knowing that it is available. As organizations grow and data sources…
6 years, 10 months ago
Simplifying Data Integration Through Eventual Connectivity
Episode 91
Summary
The ETL pattern that has become commonplace for integrating data from multiple sources has proven useful, but complex to maintain. For a smal…
6 years, 10 months ago
Straining Your Data Lake Through A Data Mesh
Episode 90
Summary
The current trend in data management is to centralize the responsibilities of storing and curating the organization’s information to a data e…
6 years, 10 months ago
Data Labeling That You Can Feel Good About With CloudFactory
Episode 89
Summary
Successful machine learning and artificial intelligence projects require large volumes of data that is properly labelled. The challenge is th…
6 years, 10 months ago
Scale Your Analytics On The Clickhouse Data Warehouse
Episode 88
Summary
The market for data warehouse platforms is large and varied, with options for every use case. ClickHouse is an open source, column-oriented d…
6 years, 11 months ago
Stress Testing Kafka And Cassandra For Real-Time Anomaly Detection
Episode 87
Summary
Anomaly detection is a capability that is useful in a variety of problem domains, including finance, internet of things, and systems monitori…
6 years, 11 months ago
The Workflow Engine For Data Engineers And Data Scientists
Episode 86
Summary
Building a data platform that works equally well for data engineering and data science is a task that requires familiarity with the needs of …
6 years, 11 months ago