talk-data.com

Event

Outperform Spark with Python Notebooks in Fabric

2025-10-22 · Meetup

Detailed information and registration can be found here: https://dataminds.be/outperform-spark-with-python-notebooks-in-fabric/

Outperform Spark with Python Notebooks in Fabric

Session: When Microsoft Fabric was released, it came with Apache Spark out of the box. Spark’s support for multiple programming languages opened up possibilities for creating data-driven and automated lakehouses. On the other hand, Spark’s core strength, scaling out to handle large amounts of data, is in many cases over-dimensioned, less performant, and more costly for trivial workloads. With Python Notebooks, we have a better tool for handling metadata, automation, and the processing of more trivial workloads, while keeping the option to use Spark Notebooks for more demanding processing.

We will cover:

* The difference between Python Notebooks and a single-node Spark cluster, and why Spark Notebooks are more costly and less performant for certain types of workloads (see the sketch below for a flavour of such a workload).
* When to use Python Notebooks and when to use Spark Notebooks.
* Where to use Python Notebooks in a meta-driven Lakehouse.
* A brief introduction to tooling for moving workloads between Python Notebooks and Spark Notebooks.
* How to avoid overloading the Lakehouse tech stack with Python technologies.
* Costs.
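
For a flavour of the kind of trivial workload the first point refers to, here is a minimal single-node sketch. It is not taken from the session: the table paths and column names are hypothetical, and polars together with the deltalake (delta-rs) package simply stand in for the sort of single-node tooling available in a Fabric Python Notebook.

```python
# Illustrative single-node sketch: read a small Delta table, run a trivial
# aggregation, and write the result back, all without starting a Spark session.
import polars as pl
from deltalake import DeltaTable, write_deltalake

# Hypothetical lakehouse table paths; adjust to your own lakehouse.
source_path = "/lakehouse/default/Tables/sales_staging"
target_path = "/lakehouse/default/Tables/sales_daily"

# delta-rs reads the Delta table into Arrow; polars wraps the Arrow table.
sales = pl.from_arrow(DeltaTable(source_path).to_pyarrow_table())

# A trivial transformation that does not need a distributed engine.
daily = (
    sales.filter(pl.col("amount") > 0)
         .group_by("order_date")
         .agg(pl.col("amount").sum().alias("total_amount"))
)

# Write the result back as a Delta table, again without a Spark cluster.
write_deltalake(target_path, daily.to_arrow(), mode="overwrite")
```

The Spark Notebook equivalent would route the same logic through a Spark session and cluster start-up, which is the overhead the session argues is unnecessary for workloads of this size.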

Speakers: Christian Henrik Reich | Principal Architect @ twoday Data & AI

Agenda (CEST, UTC+2):

18u30 Welcome and introductions

18u30 Outperform Spark with Python Notebooks in Fabric (60 minutes)

19u45 Session End

Sessions & talks

Outperform Spark with Python Notebooks in Fabric

2025-10-22
talk
Christian Henrik Reich (twoday Data & AI)
