Speaker

Allison Wang

Staff Software Engineer, Databricks

Allison is a software engineer at Databricks, working on Spark SQL and PySpark. She holds a Bachelor’s degree in Computer Science from Carnegie Mellon University.

Bio from: Databricks DATA + AI Summit 2023

Talks & appearances

Polars on Spark: Unlocking Performance with Arrow Python UDFs

2025-11-07 · PyData Seattle 2025

talk

with Shujing Yang, Allison Wang (Databricks)

Arrow, Polars, PySpark, Python, Rust, Spark

PySpark’s Arrow-based Python UDFs open the door to dramatically faster data processing by avoiding expensive serialization overhead. At the same time, Polars, a high-performance DataFrame library written in Rust, offers zero-copy interoperability with Apache Arrow. This talk shows how combining the two unlocks new performance gains: writing Arrow UDFs with Polars in PySpark can deliver speedups over Python UDFs. Attendees will learn how Arrow UDFs work in PySpark, how they can be used with other data processing libraries, and how to apply this approach to real-world Spark pipelines for faster, more efficient workloads.