talk-data.com

Jordan Tigani

Speaker

Co-Founder & Chief Duck-Herder, MotherDuck

11 talks

Jordan Tigani is the founder of MotherDuck and a former Chief Product Officer at SingleStore. He has held product and engineering leadership roles and was a founding engineer on Google BigQuery.

Bio from: Small Data SF 2024

Talks & appearances

11 activities · Newest first

DuckDB is well loved by SQL-ophiles for handling their small data workloads. How do you make it scale? What happens when you feed it Big Data? And what is this DuckLake thing I've been hearing about? This talk answers these questions from real-world experience running a DuckDB service in the cloud.

Big Data is Dead: Long Live Hot Data 🔥

Over the last decade, Big Data has been everywhere. Let's set the record straight on what is and isn't Big Data. We have been consumed by a conversation about data volumes when we should focus more on the immediate task at hand: simplifying our work.

Some of us may have Big Data, but our quest to derive insights from it is measured in small slices of work that fit on a laptop or in the palm of a hand. Easy data is here, so let's make the most of it.

📓 Resources
Big Data is Dead: https://motherduck.com/blog/big-data-is-dead/
Small Data Manifesto: https://motherduck.com/blog/small-data-manifesto/
Small Data SF: https://www.smalldatasf.com/

➡️ Follow Us
LinkedIn: https://linkedin.com/company/motherduck
X/Twitter: https://twitter.com/motherduck
Blog: https://motherduck.com/blog/


Explore the "Small Data" movement, a counter-narrative to the prevailing big data conference hype. This talk challenges the assumption that data scale is the most important feature of every workload, defining big data as any dataset too large for a single machine. We'll unpack why this distinction is crucial for modern data engineering and analytics, setting the stage for a new perspective on data architecture.

Delve into the history of big data systems, starting with the non-linear hardware costs that plagued early data practitioners. Discover how Google's foundational papers on GFS, MapReduce, and Bigtable led to the creation of Hadoop, fundamentally changing how we scale data processing. We'll break down the "big data tax"—the inherent latency and system complexity overhead required for distributed systems to function, a critical concept for anyone evaluating data platforms.

Learn about the architectural cornerstone of the modern cloud data warehouse: the separation of storage and compute. This design, popularized by systems like Snowflake and Google BigQuery, allows storage to scale almost infinitely while compute resources are provisioned on-demand. Understand how this model paved the way for massive data lakes but also introduced new complexities and cost considerations that are often overlooked.

We examine the cracks appearing in the big data paradigm, especially for OLAP workloads. While systems like Snowflake are still dominant, the rise of powerful alternatives like DuckDB signals a shift. We reveal the hidden costs of big data analytics, exemplified by a petabyte-scale query costing nearly $6,000, and argue that for most use cases, it's too expensive to run computations over massive datasets.
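
To see where a number like that comes from, here is a rough back-of-envelope sketch; the per-TB price is an assumption in the range of published on-demand list prices, not a figure taken from the talk.

    # Assumed on-demand price of ~$5.75 per TB scanned (illustrative assumption).
    price_per_tb_usd = 5.75
    tb_per_petabyte = 1024
    print(f"${price_per_tb_usd * tb_per_petabyte:,.0f}")  # ~$5,888, i.e. "nearly $6,000"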

The key to efficient data processing isn't your total data size, but the size of your "hot data" or working set. This talk argues that the revenge of the single node is here, as modern hardware can often handle the actual data queried without the overhead of the big data tax. This is a crucial optimization technique for reducing cost and improving performance in any data warehouse.
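
As a rough illustration of the working-set idea, the sketch below uses DuckDB on a single machine and scans only recent partitions; the file layout and column names are hypothetical, not from the talk.

    import duckdb

    # Hypothetical layout: events partitioned by day under events/date=YYYY-MM-DD/*.parquet.
    # The full table may be enormous, but this query only reads the recent "hot" slice.
    con = duckdb.connect()
    top_users = con.sql("""
        SELECT user_id, count(*) AS views
        FROM read_parquet('events/*/*.parquet', hive_partitioning = true)
        WHERE date >= '2024-06-01'   -- partition filter: cold partitions are never scanned
        GROUP BY user_id
        ORDER BY views DESC
        LIMIT 10
    """).fetchall()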

Discover the core principles for designing systems in a post-big data world. We'll show that since only 1 in 500 users run true big data queries, prioritizing simplicity over premature scaling is key. For low latency, process data close to the user with tools like DuckDB and SQLite. This local-first approach offers a compelling alternative to cloud-centric models, enabling faster, more cost-effective, and innovative data architectures.
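
A minimal local-first sketch, assuming a hypothetical orders.csv already on the user's machine: the engine runs in-process (DuckDB here; SQLite plays the same role for transactional data), so there is no warehouse round-trip at query time.

    import duckdb

    # Embedded database file on the client: no server, no network hop.
    con = duckdb.connect("local_analytics.duckdb")
    con.execute("""
        CREATE TABLE IF NOT EXISTS orders AS
        SELECT * FROM read_csv_auto('orders.csv')   -- hypothetical local export
    """)
    print(con.sql("SELECT status, count(*) AS n FROM orders GROUP BY status").fetchall())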

From data science to software engineering, Large Language Models (LLMs) have emerged as pivotal tools in shaping the future of programming. In this session, Michele Catasta, VP of AI at Replit, Jordan Tigani, CEO at MotherDuck, and Ryan J. Salva, VP of Product at GitHub, will explore practical applications of LLMs in coding workflows, how to best approach integrating AI into the workflows of data teams, what the future holds for AI-assisted coding, and a lot more.

Links Mentioned in the Show:
Rewatch Session from RADAR: AI Edition
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for Business

Hybrid query execution: What is a database client, anyway? - Coalesce 2023

Running a full-fledged analytical database inside the client opens up new ways of executing your query: you can run part of your query locally and part remotely. Once you can split the query plan into two pieces, the same mechanism works with N stages, which can be arranged in series or as a tree. This talk covers the hybrid execution system based on DuckDB built at MotherDuck, and also explores some further query topologies enabled by this pattern.
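
The snippet below is only an illustrative sketch of the idea, not MotherDuck's planner: one DuckDB query joins a table living on the client with a Parquet file living remotely (the URL and file names are hypothetical), which is exactly the kind of plan a hybrid executor can split across machines.

    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")   # enables reads over http(s)

    # Local half: a small table that already sits on the client.
    con.execute("CREATE TABLE local_prefs AS SELECT * FROM read_csv_auto('prefs.csv')")

    # Remote half: a (hypothetical) Parquet file in cloud storage.
    result = con.sql("""
        SELECT e.category, count(*) AS n
        FROM read_parquet('https://example.com/warehouse/events.parquet') AS e
        JOIN local_prefs AS p USING (user_id)
        GROUP BY e.category
    """).fetchall()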

Speaker: Jordan Tigani, Co-Founder & Chief Duck-Herder, MotherDuck

Register for Coalesce at https://coalesce.getdbt.com

Big Data is Dead | MotherDuck

This talk will make the case that the era of Big Data is over. Now we can stop worrying about data size and focus on how we’re going to use it to make better decisions.

The data behind the graphs shown in this talk comes from Jordan Tigani's analysis of query logs, deal post-mortems, benchmark results (published and unpublished), customer support tickets, customer conversations, service logs, and published blog posts, plus a bit of intuition.

ABOUT THE SPEAKER: Jordan Tigani is co-founder and chief duck-herder at MotherDuck, a startup building a serverless analytics platform based on DuckDB. He helped create Google BigQuery, wrote two books on it, and led first the engineering team and then the product team through its first $1B or so in revenue.

👉 Sign up for our “No BS” Newsletter to get the latest technical data & AI content: https://datacouncil.ai/newsletter

ABOUT DATA COUNCIL: Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.

Make sure to subscribe to our channel for the most up-to-date talks from technical professionals on data-related topics, including data infrastructure, data engineering, ML systems, analytics, and AI from top startups and tech companies.

FOLLOW DATA COUNCIL:
Twitter: https://twitter.com/DataCouncilAI
LinkedIn: https://www.linkedin.com/company/datacouncil-ai/

Jordan Tigani is an expert in large-scale data processing, having spent more than a decade on the development and growth of BigQuery and, later, SingleStore. Today, Jordan and his team at MotherDuck are in the early days of working on commercial applications for the open-source DuckDB OLAP database. In this conversation with Tristan and Julia, Jordan dives into the origin story of BigQuery, why he thinks we should do away with the concept of working in files, and how truly performant "data apps" will require bringing data to an end user's machine (rather than requiring them to query a warehouse directly).

Google BigQuery: The Definitive Guide

Work with petabyte-scale datasets while building a collaborative, agile workplace in the process. This practical book is the canonical reference to Google BigQuery, the query engine that lets you conduct interactive analysis of large datasets. BigQuery enables enterprises to efficiently store, query, ingest, and learn from their data in a convenient framework. With this book, you’ll examine how to analyze data at scale to derive insights from large datasets efficiently. Valliappa Lakshmanan, tech lead for Google Cloud Platform, and Jordan Tigani, engineering director for the BigQuery team, provide best practices for modern data warehousing within an autoscaled, serverless public cloud. Whether you want to explore parts of BigQuery you’re not familiar with or prefer to focus on specific tasks, this reference is indispensable.

Google BigQuery Analytics

How to effectively use BigQuery, avoid common mistakes, and execute sophisticated queries against large datasets. Google BigQuery Analytics is the perfect guide for business and data analysts who want the latest tips on running complex queries and writing code to communicate with the BigQuery API. The book uses real-world examples to demonstrate current best practices and techniques, and also explains and demonstrates streaming ingestion, transformation via Hadoop in Google Compute Engine, App Engine Datastore integration, and using GViz with Tableau to generate charts of query results. In addition to the mechanics of BigQuery, the book also covers the architecture of the underlying Dremel query engine, providing a thorough understanding that leads to better query results.

Features a companion website that includes all code and data sets from the book
Uses real-world examples to explain everything analysts need to know to effectively use BigQuery
Includes web application examples coded in Python