talk-data.com

Topic

Lance

file_format vector_db embeddings open_table_format data_lake

Activity Trend: peak of 3 per quarter, 2020-Q1 to 2026-Q1

Activities

Filtered by: PyData Seattle 2025

Supercharging Multimodal Feature Engineering with Lance and Ray

Efficient feature engineering is key to unlocking modern multimodal AI workloads. In this talk, we’ll dive deep into how Lance, an open-source format with built-in indexing, random access, and data evolution, works seamlessly with Ray’s distributed compute and UDF capabilities. We’ll walk through practical pipelines for preprocessing, embedding computation, and hybrid feature serving, highlighting concrete patterns attendees can take home to supercharge their own multimodal pipelines. See https://lancedb.github.io/lance/integrations/ray to learn more about this integration.