talk-data.com

Showing 16 results

Activities & events

Title & Speakers Event
Keynote: Hannes Mühleisen 2025-09-24 · 06:30
Hannes Mühleisen – co-creator and CEO @ DuckDB Labs

Keynote talk by Hannes Mühleisen, Co-founder & CEO at DuckDB Labs.

DuckDB
Keynote: Maria Börner 2025-09-24 · 06:30
Maria Börner – Head of the AI Competence Center @ Westernacher Solutions

Keynote talk by Maria Börner, Head of the AI Competence Center at Westernacher Solutions.

AI/ML
Keynote: Demetrios Brinkmann 2025-09-24 · 06:30
Demetrios Brinkmann – Chief Vibe Officer @ MLOps Community

Keynote talk by Demetrios Brinkmann, Chief Vibe Officer at MLOps Community.

MLOps
Keynote: Judith Dijk 2025-09-24 · 06:30
Judith Dijk – Project Officer @ European Defence Agency

Keynote talk by Judith Dijk, Project Officer at European Defence Agency.

Hannes Mühleisen – co-creator and CEO @ DuckDB Labs , Mark Raasveldt – Co-creator @ DuckDB , Tobias Macey – host

Summary In this episode of the Data Engineering Podcast, Hannes Mühleisen and Mark Raasveldt, the creators of DuckDB, share their work on DuckLake, a new entrant in the open lakehouse ecosystem. They discuss how DuckLake focuses on simplicity and flexibility and offers a unified catalog and table format, in contrast to other lakehouse formats like Iceberg and Delta. Hannes and Mark share insights into how DuckLake changes data architecture by enabling local-first data processing, simplifying the deployment of lakehouse solutions, and offering benefits such as encryption features, data inlining, and integration with existing ecosystems.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.

Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey and today I'm interviewing Hannes Mühleisen and Mark Raasveldt about DuckLake, the latest entrant into the open lakehouse ecosystem.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what DuckLake is and the story behind it?
What are the particular problems that DuckLake is solving for?
How does this compare to the capabilities of MotherDuck?
Iceberg and Delta already have a well established ecosystem, but so does DuckDB. Who are the primary personas that you are trying to focus on in these early days of DuckLake?
One of the major factors driving the adoption of formats like Iceberg is cost efficiency for large volumes of data. That brings with it challenges of large batch processing of data. How does DuckLake account for these axes of scale?
There is also a substantial investment in the ecosystem of technologies that support Iceberg. The most notable ecosystem challenge for DuckDB and DuckLake is in the query layer. How are you thinking about the evolution and growth of that capability beyond DuckDB (e.g. support in Trino/Spark/Flink)?
What are your opinions on the viability of a future where DuckLake and Iceberg become a unified standard and implementation? (why can't Iceberg REST catalog implementations just use DuckLake under the hood?)
Digging into the specifics of the specification and implementation, what are some of the capabilities that it offers above and beyond Iceberg?
Is it now possible to enforce PK/FK constraints, indexing on underlying data?
Given that DuckDB has a vector type, how do you think about the support for vector storage/indexing?
How do the capabilities of DuckLake and the integration with DuckDB change the ways that data teams design their data architecture and access patterns?
What are your thoughts on the impact of "data gravity" in today's data ecosystem, with engines like DuckDB, KuzuDB, LanceDB, etc. available for embedded and edge use cases?
What are the most interesting, innovative, or unexpected ways that you have seen DuckLake used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on DuckLake?
When is DuckLake the wrong choice?
What do you have planned for the future of DuckLake?

Contact Info

Hannes: Website
Mark: Website

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

DuckDB (Podcast Episode)
DuckLake
DuckDB Labs
MySQL
CWI
MonetDB
Iceberg
Iceberg REST Catalog
Delta
Hudi
Lance
DuckDB Iceberg Connector
ACID == Atomicity, Consistency, Isolation, Durability
MotherDuck
MotherDuck Managed DuckLake
Trino
Spark
Presto
Spark DuckLake Demo
Delta Kernel
Arrow
dlt
S3 Tables
Attribute Based Access Control (ABAC)
Parquet
Arrow Flight
Hadoop
HDFS
DuckLake Roadmap

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

AI/ML Flink Data Engineering Data Lakehouse Data Management Datafold Delta DuckDB ETL/ELT Iceberg Lance Motherduck Prefect Python Spark Data Streaming Trino
Data Engineering Podcast
Hannes Mühleisen – co-creator and CEO @ DuckDB Labs , Joe Reis – founder @ Ternary Data

Hannes Mühleisen shows off DuckLake and answers live questions. DuckLake: https://duckdb.org/2025/05/27/ducklake.html

DuckDB HTML
The Joe Reis Show
AI Council 2025
Mark Raasveldt @ DuckDB , Hannes Mühleisen – Creator of DuckDB @ DuckDB Labs

Opening talk of DuckCon #6: https://duckdb.org/events/2025/01/31/duckcon6/

Speakers: Hannes Mühleisen and Mark Raasveldt (DuckDB Labs) Slides: https://blobs.duckdb.org/events/duckcon6/hannes-muehleisen-mark-raasveldt-duckcon6-welcome-and-opening-remarks.pdf

DuckCon #6 Amsterdam 2025
Mark Raasveldt @ DuckDB , Hannes Mühleisen – Creator of DuckDB @ DuckDB Labs

Speakers: Hannes Mühleisen, Mark Raasveldt (DuckDB Labs) Slides: https://blobs.duckdb.org/events/duckcon5/hannes-muhleisen-mark-raasveldt-introduction-and-state-of-project.pdf

DuckDB
DuckCon #5 Seattle 2024
Hannes Mühleisen – Creator of DuckDB @ DuckDB Labs

Hannes Mühleisen of DuckDB Labs addressed an audience of thousands during his keynote address at Data + AI Summit 2024 in San Francisco. Mühleisen announced DuckDB support for Delta Lake, a new DuckDB Extension to Unity Catalog, and Community Extensions.

Speaker: Hannes Mühleisen, Creator of DuckDB, DuckDB Labs @duckdb @duckdb3282

AI/ML Delta DuckDB
Bilal Aslam – Sr. Director of Product Management @ Databricks , Yejin Choi – Professor and MacArthur Fellow; Senior Research Director for Commonsense AI at AI2 @ University of Washington; AI2 , Darshana Sivakumar – Staff Product Manager @ Databricks , Ryan Blue – Creator of Apache Iceberg and co-founder @ Tabular , Zeashan Pappa – Staff Product Manager @ Databricks , Ali Ghodsi – CEO @ Databricks , Reynold Xin – Co-founder and Chief Architect @ Databricks , Matei Zaharia – Chief Technologist @ Databricks , Hannes Mühleisen – Creator of DuckDB @ DuckDB Labs , Alexander Booth – Assistant Director of R&D @ Texas Rangers Baseball Club , Tareef Kawaf – President @ Posit Software, PBC

Speakers:
- Alexander Booth, Asst Director of Research & Development, Texas Rangers
- Ali Ghodsi, Co-Founder and CEO, Databricks
- Bilal Aslam, Sr. Director of Product Management, Databricks
- Darshana Sivakumar, Staff Product Manager, Databricks
- Hannes Mühleisen, Creator of DuckDB, DuckDB Labs
- Matei Zaharia, Chief Technology Officer and Co-Founder, Databricks
- Reynold Xin, Chief Architect and Co-Founder, Databricks
- Ryan Blue, CEO, Tabular
- Tareef Kawaf, President, Posit Software, PBC
- Yejin Choi, Sr Research Director Commonsense AI, AI2, University of Washington
- Zeashan Pappa, Staff Product Manager, Databricks

About Databricks Databricks is the Data and AI company. More than 10,000 organizations worldwide — including Block, Comcast, Conde Nast, Rivian, and Shell, and over 60% of the Fortune 500 — rely on the Databricks Data Intelligence Platform to take control of their data and put it to work with AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake and MLflow.

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data… Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

AI/ML Data Lakehouse Databricks Delta DuckDB Spark
Mark Raasveldt @ DuckDB , Hannes Mühleisen – Creator of DuckDB @ DuckDB Labs

Overview of DuckDB's release v0.10.0 and plans for the upcoming v1.0.0, presented on 2024-02-02 at DuckCon #4 in Amsterdam.

Speakers: Hannes Mühleisen, Mark Raasveldt (DuckDB Labs) Slides: https://blobs.duckdb.org/events/duckcon4/duckcon4-mark-raasveldt-hannes-muhleisen-state-of-the-duck.pdf

DuckDB
DuckCon #4 Amsterdam 2024
Hannes Mühleisen – Creator of DuckDB @ DuckDB Labs

Speaker: Hannes Mühleisen (DuckDB Labs)

DuckDB
DuckCon #3 San Francisco 2023
Mark Raasveldt @ DuckDB , Hannes Mühleisen – Creator of DuckDB @ DuckDB Labs

Slides: https://drive.google.com/file/d/1bscd15MjXnWc7SGEX0AF8F2zYdLVgx-q/view?usp=share_link

DuckCon #2 Brussels 2023
TdT x DuckDB 2022-05-17 · 07:00
Hannes Mühleisen – co-creator and CEO @ DuckDB Labs

We are joined by Hannes Mühleisen, the co-founder of DuckDB and CEO of DuckDB Labs. DuckDB is an in-process SQL OLAP database management system that greatly simplifies data processing without compromising performance.

Check out more on DuckDB and Hannes here: DuckDB website, DuckDB Labs website, Hannes' LinkedIn

DuckDB SQL
DataTopics: All Things Data, AI & Tech
Hannes Mühleisen – co-creator and CEO @ DuckDB Labs , Tobias Macey – host

Summary When you think about selecting a database engine for your project you typically consider options focused on serving multiple concurrent users. Sometimes what you really need is an embedded database that is blazing fast for single user workloads. DuckDB is an in-process database engine optimized for OLAP applications to speed up your analytical queries that meets you where you are, whether that’s Python, R, Java, even the web. In this episode, Hannes Mühleisen, co-creator and CEO of DuckDB Labs, shares the motivations for creating the project, the myriad ways that it can be used to speed up your data projects, and the detailed engineering efforts that go into making it adaptable to any environment. This is a fascinating and humorous exploration of a truly useful piece of technology.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

Atlan is a collaborative workspace for data-driven teams, like Github for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription.

RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their state-of-the-art reverse ETL pipelines enable you to send enriched data to any cloud tool. Sign up free… or just get the free t-shirt for being a listener of the Data Engineering Podcast at dataengineeringpodcast.com/rudder.

Your host is Tobias Macey and today I’m interviewing Hannes Mühleisen about DuckDB, an in-process embedded database engine for columnar analytics

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what DuckDB is and the story behind it?
Where did the name come from?
What are some of the use cases that DuckDB is designed to support?
The interface for DuckDB is similar (at least in spirit) to SQLite. What are the deciding factors for when to use one vs. the other?
How might they be used in concert to take advantage of their relative strengths?
What are some of the ways that DuckDB can be used to better effect than options provided by different language ecosystems?
Can you describe how DuckDB is implemented?
How has the design and goals of the project changed or evolved since you began working on it?
What are some of the optimizations that you have had to make in order to support performant access to data that exceeds available memory?
Can you describe a typical workflow of incorporating DuckDB into an analytical project?
What are some of the libraries/tools/systems that DuckDB might replace in the scope of a project or team?
What are some of the

Analytics CDP Cloud Computing Data Engineering Data Lake Data Management DuckDB ETL/ELT GitHub Java Kubernetes Looker Modern Data Stack Python Snowflake SQL Data Streaming
Data Engineering Podcast