talk-data.com

Speaker

Christian Henrik Reich

2 talks

Principal Architect @ twoday Data & AI

Bio from: Outperform Spark with Python Notebooks in Fabric

Talks & appearances

2 activities

The Data Engineer's Guide to Microsoft Fabric

Modern data engineering is evolving, and with Microsoft Fabric the entire data platform experience is being redefined. This essential book offers a fresh, hands-on approach to navigating this shift. Rather than being an introduction to features, this guide explains how Fabric's key components (Lakehouse, Warehouse, and Real-Time Intelligence) work under the hood and how to put them to use in realistic workflows. Written by Christian Henrik Reich, a data engineering expert with experience that extends from Databricks to Fabric, this book is a blend of foundational theory and practical implementation of lakehouse solutions in Fabric. You'll explore how engines like Apache Spark and Fabric Warehouse collaborate with Fabric's Real-Time Intelligence solution in an integrated platform, and how to build ETL/ELT pipelines that deliver on speed, accuracy, and scale. Ideal for both new and practicing data engineers, this is your entry point into the fabric of the modern data platform.

- Acquire a working knowledge of lakehouses, warehouses, and streaming in Fabric
- Build resilient data pipelines across real-time and batch workloads
- Apply Python, Spark SQL, T-SQL, and KQL within a unified platform
- Gain insight into architectural decisions that scale with data needs
- Learn actionable best practices for engineering clean, efficient, governed solutions

Outperform Spark with Python Notebooks in Fabric

Session: When Microsoft Fabric was released, it came with Apache Spark out of the box. Spark’s ability to work with more programming languages opened up possibilities for creating data-driven and automated lakehouses. With Python Notebooks, we have a better tool for handling metadata, automation, and processing of more trivial workloads, while still keeping the option to use Spark Notebooks for more demanding processing. We will cover:

- The difference between Python Notebooks and a single-node Spark cluster, and why Spark Notebooks are more costly and less performant for certain types of workloads.
- When to use Python Notebooks and when to use Spark Notebooks.
- Where to use Python Notebooks in a metadata-driven Lakehouse.
- A brief introduction to tooling, and how to move workloads between Python Notebooks and Spark Notebooks.
- How to avoid overloading the Lakehouse tech stack with Python technologies.
- Costs.
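As a rough illustration of the distinction the session draws (not material from the talk itself), the sketch below contrasts a lightweight control-table lookup in a Fabric Python Notebook with the equivalent Spark Notebook code. It assumes the polars and deltalake packages are available in the Python notebook environment; the OneLake table path is a hypothetical placeholder.

import polars as pl

# Python Notebook: read a small ETL control/metadata Delta table on a single
# node, with no Spark cluster start-up cost. The path is a placeholder.
control = pl.read_delta("<onelake-path>/Tables/etl_control")
pending = control.filter(pl.col("status") == "pending")

# Spark Notebook equivalent: the same lookup pays for cluster spin-up and
# distributed execution, which only pays off for heavier workloads.
# control = spark.read.format("delta").load("<onelake-path>/Tables/etl_control")
# pending = control.filter("status = 'pending'")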