Speaker

Holden Karau

Co-founder Fight Health Insurance

Holden is a transgender Canadian open source developer with a focus on Apache Spark and related "big data" tools. By day (and night, go go startup life) she works on bringing large language models and other AI tools to healthcare users to help them deal with insurance through https://www.fighthealthinsurance.com & https://www.fightpaperwork.com.

She is the co-author of Learning Spark, High Performance Spark, and a few others. She is a committer and PMC member on Apache Spark. She was tricked into the world of big data while trying to improve search and recommendation systems and has long since forgotten her original goal.

Bio from: Small Data SF 2025

Frequent Collaborators

Rachel Warren 2

Talks & appearances

When not to use Spark?

2025-11-05 · Small Data SF 2025

talk

Analytics, Spark

In this talk, the somewhat biased Apache Spark PMC member Holden will explore the times when using Spark is more likely to lead to disappointment and pages than success and promotions. We'll, of course, look at places where Spark can excel, but also explore heuristics like "if it fits in Excel, double check whether you need Spark." By using Spark only when it's truly beneficial, you can demonstrate that elusive "thought leadership" that always seems to be required for the next level of promotion. We'll explore how some of Spark's largest disadvantages are changing, but also which ones are likely to stick around -- allowing you to seem like you have a magic tech eight ball the next time someone asks you to design your analytics strategy. Come for a place to sit after lunch and stay for the OOM therapy.