The lakehouse promised to unify our data, but popular formats can feel bloated and hard to use for most real-world workloads. If you've ever felt that the complexity and operational overhead of "Big Data" tools are overkill, you're not alone. What if your lakehouse could be simple, fast, and maybe even a little fun? Enter DuckLake, the native lakehouse format, managed on MotherDuck. It delivers the powerful features you need, like ACID transactions, time travel, and schema evolution, without the heavyweight baggage. This approach makes massive data sets feel like Small Data.

This workshop is a practical, step-by-step walkthrough for the data practitioner. We'll get straight to the point and show you how to build a fully functional, serverless lakehouse from scratch.

You will learn:

The Architecture: We’ll explore how DuckLake's design choices make it fundamentally simpler and faster for analytical queries compared to its JVM-based cousins.

The Workflow: Through hands-on examples, you'll create a DuckLake table, perform atomic updates, and use time travel, all with the simple SQL you already know.

The MotherDuck Advantage: Discover how the serverless platform makes it easy to manage, share, and query your DuckLake tables, enabling a seamless hybrid workflow between your laptop and the cloud.
talk-data.com
Speaker
Jacob Matson
4 talks
Jacob Matson is a Developer Advocate at MotherDuck, a cloud data warehouse built on DuckDB, and an amateur sports data scientist. He led Funko’s direct-to-consumer e-commerce launches, including Marvel Collector Corps and the Funko Shop, helping the business reach a $30M run rate and supporting its 2017 IPO, with licenses such as Star Wars, DC Comics, and Disney. He previously held roles at RoamBI (acquired by SAP) and Verimatrix (acquired by Inside Secure). Outside of work, he enjoys time with his wife Kristen and their two children, Olivia and Patrick, in Everett, Washington.
Bio from: Small Data SF 2025
Talks & appearances
4 activities · Newest first
Easy, fast, and scalable: pick 3. MotherDuck’s managed DuckLake data lakehouse blends the cost efficiency, scale, and openness of a lakehouse with the speed of a warehouse for truly joyful dbt pipelines. We will show you how!
Structured Query Language (SQL) is a programming language for managing data in a database system and an essential part of any data engineer’s tool kit. In this tutorial, you will learn how to use SQL to create databases and tables, insert data into them, and write queries that extract, filter, and join data or perform calculations on it. We will use DuckDB, a new open-source, embedded, in-process database system that combines cutting-edge database research with dataframe-inspired ease of use. DuckDB is only a pip install away (with zero dependencies) and runs right on your laptop. You will learn how to use DuckDB with your existing Python tools like Pandas, Polars, and Ibis to simplify and speed up your pipelines. Lastly, you will learn how to use SQL to create fast, interactive data visualizations, and how to teach your data to fly and share it via the cloud.
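To give a flavor of the basics this tutorial abstract lists (creating tables, inserting rows, filtering, joining, and calculating), here is a minimal sketch. It is not taken from the tutorial itself. To stay runnable without any installs, it uses Python's built-in sqlite3 module as a stand-in for DuckDB; the SQL statements are plain enough that they would likely carry over to a DuckDB connection, but that is an assumption. The table names, column names, and data are hypothetical.

```python
import sqlite3

# Stand-in engine: sqlite3 ships with Python, so no install is needed.
# (The tutorial itself uses DuckDB; this swap is only for illustration.)
con = sqlite3.connect(":memory:")

# Create two small tables (hypothetical schema).
con.execute("CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT, team TEXT)")
con.execute("CREATE TABLE scores (player_id INTEGER, points INTEGER)")

# Insert a few rows of made-up data.
con.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [(1, "Ana", "Ducks"), (2, "Ben", "Geese"), (3, "Cy", "Ducks")],
)
con.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(1, 10), (1, 14), (2, 7), (3, 21)],
)

# Filter: keep only the players on one team.
ducks = con.execute(
    "SELECT name FROM players WHERE team = 'Ducks' ORDER BY name"
).fetchall()

# Join and calculate: total points per team.
totals = con.execute(
    """
    SELECT p.team, SUM(s.points) AS total_points
    FROM players AS p
    JOIN scores AS s ON s.player_id = p.id
    GROUP BY p.team
    ORDER BY total_points DESC
    """
).fetchall()

print(ducks)
print(totals)
```

The filter query returns one row per matching player, while the join query first pairs each score with its player and then sums points within each team.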