talk-data.com

Topic

Lance

file_format vector_db embeddings open_table_format data_lake

Activity Trend

Peak: 3 activities per quarter · Range: 2020-Q1 to 2026-Q2

Top Events

Data Engineering Podcast · 7
Data + AI Summit 2025 · 2
Data Skeptic · 2
AI Meetup (August): Generative AI, LLMs and ML · 1
PyData Seattle 2025 · 1
DuckCon #3 San Francisco 2023 · 1
Microsoft Ignite 2023 · 1
PyData Berlin 2025 · 1
dbt Coalesce 2023 · 1
Moody's Talks - Inside Economics · 1
The Analytics Engineering Podcast · 1

Top Speakers

Tobias Macey · 7
Chang She (LanceDB) · 4
Kyle Polich · 2
LU QIU (LanceDB) · 1
Sam Kleinman · 1
Lance Witheridge (Trade Me) · 1
Lance Wright · 1
Lance Lambert (Fortune) · 1
Thomas Maurer · 1
Mark Raasveldt (DuckDB) · 1
Tristan Handy (dbt Labs) · 1
Lance Fortnow · 1

Activities

7 activities · Newest first

Supercharging Multimodal Feature Engineering with Lance and Ray

2025-11-08 · PyData Seattle 2025

talk

by Jack Ye (AWS Open Data Analytics)

AI/ML GitHub

Efficient feature engineering is key to unlocking modern multimodal AI workloads. In this talk, we’ll dive deep into how Lance - an open-source format with built-in indexing, random access, and data evolution - works seamlessly with Ray’s distributed compute and UDF capabilities. We’ll walk through practical pipelines for preprocessing, embedding computation, and hybrid feature serving, highlighting concrete patterns attendees can take home to supercharge their own multimodal pipelines. See https://lancedb.github.io/lance/integrations/ray to learn more about this integration.

AI-Ready Data in Action: Powering Smarter Agents

2025-09-01 · PyData Berlin 2025

talk

by Chang She (LanceDB), Violetta Mishechkina

AI/ML

This hands-on workshop focuses on what AI engineers do most often: making data AI-ready and turning it into useful production applications. Together with dltHub and LanceDB, you’ll walk through an end-to-end workflow: collecting and preparing real-world data with best practices, managing it in LanceDB, and powering AI applications with search, filters, hybrid retrieval, and lightweight agents. By the end, you’ll know how to move from raw data to functional, production-ready AI setups without the usual friction. We will also touch on multimodal data and on taking this end-to-end use case to production.

Bridging Big Data and AI: Empowering PySpark With Lance Format for Multi-Modal AI Data Pipelines

2025-06-11 · Data + AI Summit 2025

lightning_talk

by LU QIU (LanceDB), Allison Wang (Databricks)

AI/ML Analytics API Big Data Data Analytics PySpark Python Spark SQL

PySpark has long been a cornerstone of big data processing, excelling in data preparation, analytics and machine learning tasks within traditional data lakes. However, the rise of multimodal AI and vector search introduces challenges beyond its capabilities. Spark’s new Python data source API enables integration with emerging AI data lakes built on the multimodal Lance format. Lance offers zero-copy schema evolution and robust support for data with large record sizes (e.g., images, tensors, embeddings), simplifying multimodal data storage. Its advanced indexing for semantic and full-text search, combined with rapid random access, enables high-performance AI data analytics at the level of SQL. By unifying PySpark's robust processing capabilities with Lance's AI-optimized storage, data engineers and scientists can efficiently manage and analyze the diverse data types required for cutting-edge AI applications within a familiar big data framework.

LanceDB: A Complete Search and Analytical Store for Serving Production-scale AI Applications

2025-06-10 · Data + AI Summit 2025

talk

by Chang She (LanceDB), Zero Qu (Databricks)

AI/ML Vector DB

If you're building AI applications, chances are you're solving a retrieval problem somewhere along the way. This is why vector databases are popular today. But if we zoom out from just vector search, serving AI applications also requires handling KV workloads like a traditional feature store, as well as analytical workloads to explore and visualize data. As a result, building an AI application often requires multiple data stores, which in turn means multiple data copies, manual syncing, and extra infrastructure expenses. LanceDB is the first and only system that supports all of these workloads in one place. Powered by the Lance columnar format, LanceDB breaks open the impossible triangle of performance, scalability, and cost for AI serving. Serving AI applications is different from previous waves of technology, and a new paradigm demands new tools.

Bring enhanced manageability to SQL Server anywhere with Azure Arc | OD45

2023-11-16 · Microsoft Ignite 2023

video

by Dhananjay Mahajan, Thomas Maurer, Lance Wright, Raj Pochiraju, Nikita Takru

Azure Cloud Computing Microsoft Cyber Security SQL

Join this discussion, which includes live demos, to discover how connecting your SQL Servers to Azure can enhance your management, security, and governance capabilities. SQL Server enabled by Azure Arc is a hybrid cloud solution that lets you manage, secure, and govern from Azure a SQL Server estate running anywhere. Our experts will also explore different options for deploying Azure Arc to your SQL Servers at scale.

To learn more, please check out these resources:
* https://aka.ms/Ignite23CollectionsOD45
* https://aka.ms/ArcSQL

Speakers:
* Dhananjay Mahajan
* Lance Wright
* Nikita Takru
* Raj Pochiraju

Session Information: This video is one of many sessions delivered for the Microsoft Ignite 2023 event. View sessions on-demand and learn more about Microsoft Ignite at https://ignite.microsoft.com

OD45 | English (US) | Data

MSIgnite

Data warehouse as a product: Design to delivery - Coalesce 2023

2023-10-27 · dbt Coalesce 2023

video

by Lance Witheridge (Trade Me)

DWH

Every day, Trade Me gets 1.5 million new listings and 20 million listing views. With all that data comes the difficulty of managing a complex data ecosystem. This got the Trade Me team thinking: "Which problems are we trying to solve? How can we increase speed to customer value?" Using this framework, the team developed a new mission statement: "To build a data warehouse that analysts love to use." In this session, Trade Me shares exactly how they achieved that vision, with a focus on planning, data operating models, and database architecture.

Speaker: Lance Witheridge, Data Modernisation Lead, Trade Me

Bringing AI to DuckDB with Lance columnar format for multi-modal AI – DuckCon #3 (San Francisco)

DuckCon #3 San Francisco 2023

video

by Chang She (LanceDB)

AI/ML DuckDB

Speaker: Chang She (LanceDB)

Slides: https://blobs.duckdb.org/events/duckcon3/chang-she-lancedb-bringing-ai-to-duckdb-with-lance-columnar-format.pdf