Fabric data lake performance: fix slow workloads with Azure Container Storage v2 and local NVMe for real‑time analytics

Season 1 · Published 6 months ago
Description
Fabric data lake performance: in this episode of M365.fm, Mirko Peters explains why your Fabric lakehouse feels slow, not because of Spark, Power BI, or your engineers, but because your data lives on remote, managed storage that behaves like a networked file share from 2003. He opens with a blunt truth: every query, transform, and dashboard waits on storage latency first, and as long as your bytes commute across Azure's network to reach compute, you are paying for CPUs to sit idle while I/O round trips crawl along.

He then unpacks how Fabric and Power Platform end up bottlenecked by their own convenience. Managed tiers promise elasticity and durability, but each layer—service fabrics, gateways, redundancy, regional routing—adds milliseconds that quietly stack into minutes on trillion‑row refreshes. Mirko likens managed storage to a postal service: reliable and distributed, but absurd when you are trying to do millisecond analytics. Meanwhile, administrators keep scaling nodes and Spark pools, unknowingly feeding a bottleneck that more compute cannot fix, because the physics of distance remain unchanged.
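To see how milliseconds stack into minutes, here is a toy back‑of‑the‑envelope calculation. Every number (the per‑layer latencies and the request count) is an illustrative assumption, not a measured Azure figure:

```python
# Toy illustration: how small per-layer latencies stack on a large refresh.
# All numbers below are assumed for illustration, not measured Azure figures.

layers_ms = {
    "gateway": 2.0,
    "service fabric hop": 1.5,
    "redundancy write path": 3.0,
    "regional routing": 5.0,
}

per_request_ms = sum(layers_ms.values())  # overhead added to each storage request
requests = 200_000                        # storage requests in one big refresh (assumed)

total_minutes = per_request_ms * requests / 1000 / 60
print(f"{per_request_ms:.1f} ms per request -> {total_minutes:.0f} minutes of pure waiting")
```

The point of the sketch is that none of the individual layers looks expensive; it is the multiplication by request count that turns "milliseconds" into wall‑clock minutes no amount of extra compute can claw back.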

From there, he introduces Azure Container Storage v2 as the NVMe fix for this drag. ACStor v2 abandons the old, complex design and goes all‑in on local NVMe disks wired directly to the host’s PCIe lanes, stripping out managed disks, LVM, and etcd to focus on raw I/O. Volumes are automatically striped across every NVMe drive on a node, trading redundancy for maximum throughput so even small workloads inherit the full bandwidth of the underlying hardware. Mirko explains how this transforms Spark shuffles, Fabric staging zones, and AI caches from network‑bound operations into near‑silicon‑speed workloads.
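The striping idea itself is simple enough to sketch. In ACStor v2 the driver handles this under the hood; the toy function below (a hypothetical illustration, not the product's code) just shows why striping helps: consecutive blocks land on different drives, so reads and writes can hit all of a node's NVMe devices in parallel:

```python
# Minimal sketch of block striping across a node's local NVMe drives.
# Hypothetical illustration only; the real ACStor v2 driver does this
# internally. Consecutive blocks go to different drives round-robin,
# so a single volume inherits the combined bandwidth of all drives.

def stripe(blocks, num_drives=4):
    """Assign each block index to a drive round-robin."""
    drives = [[] for _ in range(num_drives)]
    for i, block in enumerate(blocks):
        drives[i % num_drives].append(block)
    return drives

layout = stripe(list(range(8)))
print(layout)  # each drive holds every 4th block
```

Note the trade the episode highlights: this layout maximizes throughput, but losing one drive loses part of every volume, which is why it suits shuffle, staging, and cache data rather than the durable copy.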

The episode demystifies NVMe by contrasting it with traditional cloud storage. Legacy protocols serialize operations through a single lane, while NVMe uses thousands of parallel queues mapped straight to the CPU, turning I/O into a massively concurrent conversation instead of a checkout line. ACStor v2 leverages that design so Fabric and Kubernetes workloads talk to storage like it is part of the server, not a distant service—yielding sub‑millisecond latency and multi‑gigabyte‑per‑second throughput without renting premium SAN capacity.
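The queue contrast can be felt in a few lines of Python. This is only an analogy: a thread pool stands in for NVMe's hardware queues, and the "I/O" is a simulated sleep, but it shows why one serialized lane loses badly to many concurrent ones:

```python
# Toy analogy for legacy single-lane I/O vs. NVMe-style parallel queues.
# The sleep simulates a 10 ms I/O operation; thread-pool width stands in
# for hardware queue depth. Illustrative only, not a real storage benchmark.

import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(_):
    time.sleep(0.01)  # pretend this is one 10 ms storage operation

ops = range(32)

start = time.perf_counter()
for op in ops:                  # one lane: every operation waits its turn
    fake_io(op)
serial = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=32) as pool:  # many lanes at once
    list(pool.map(fake_io, ops))
parallel = time.perf_counter() - start

print(f"serialized: {serial:.2f}s, parallel: {parallel:.2f}s")
```

With real NVMe the effect is far larger than a thread pool can show, since the parallel queues are implemented in silicon and mapped per CPU core rather than scheduled by the OS.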

Mirko also tackles practicality and eligibility. He shows where local NVMe disks actually live in Azure—L‑series storage‑optimized VMs, NC‑series GPU machines, and selected D/E series with “temporary” disks—and why ACStor v2 turns those often‑ignored local drives into your primary performance engine instead of a scratchpad. Because NVMe is already baked into the VM price, you stop paying extra for managed speed and start exploiting hardware you already own. He closes with patterns for mapping Fabric lakehouses, Power Platform workloads, and analytic pipelines onto NVMe‑backed storage so your data lake finally moves at the speed your architectures were designed for.

WHAT YOU WILL LEARN
  • Why Fabric and Power Platform workloads feel slow even on powerful compute.
  • How managed storage distance, not bad queries, creates most data‑lake latency.
  • What Azure Container Storage v2 changes by going all‑in on local NVMe disks.