In this talk, Zain Hasan will discuss how we can use open-source multimodal embedding models, in conjunction with large generative multimodal models that can see, hear, read, and feel data(!), to perform cross-modal search (searching audio with images, videos with text, etc.) and multimodal retrieval-augmented generation (MM-RAG) at the billion-object scale with the help of open-source vector databases. He will also demonstrate, with live code demos, how performing this cross-modal retrieval in real time enables users to apply LLMs that reason over their enterprise multimodal data. The talk will focus on how to scale the use of multimodal embedding and generative models in production.
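The core idea behind cross-modal search is that one multimodal model embeds images, audio, video, and text into a single shared vector space, so a query in any modality can retrieve objects of any other modality by nearest-neighbor similarity. A minimal sketch of that retrieval step, using mocked embedding vectors in place of a real multimodal model and an in-memory dict in place of a vector database (all names and vectors here are illustrative assumptions, not part of the talk):

```python
import math

# Mock "shared embedding space": in practice a multimodal model would map
# each object (image, audio clip, video, text) to a high-dimensional vector;
# here we use tiny hand-picked 4-d vectors purely for illustration.
DB = {
    "beach_photo.jpg": [0.9, 0.1, 0.0, 0.1],
    "ocean_waves.wav": [0.8, 0.2, 0.1, 0.0],
    "city_night.mp4":  [0.1, 0.9, 0.2, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, k=2):
    """Rank stored objects (of any modality) by similarity to the query."""
    ranked = sorted(DB.items(), key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]

# A text query like "sunny coastline" would be embedded by the same model;
# we mock it with a vector close to the beach/ocean items.
text_query = [0.85, 0.15, 0.05, 0.05]
print(search(text_query))
```

At production scale, a vector database replaces the dict and the linear scan with an approximate-nearest-neighbor index, and an MM-RAG pipeline feeds the retrieved objects into a generative multimodal model as context.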
talk-data.com
Topic: vector databases (1 tagged)
Activity Trend: 1 peak/qtr (2020-Q1 to 2026-Q1)
Top Events
- Virtual Summit: LLMs and the Generative AI Revolution (1)
- Google I/O Extended 2023 North America (1)
- AI Meetup (June): GenAI, LLMs and ML (1)
- AI Meetup (October): AI, GenAI and LLMs (1)
- Build Your Own CLI Chatbot with RAG Support — An All-Code, No-Slides Session (1)
- Agentic AI Workshop | New York City (1)
- AI Agents from Scratch: A Beginner's LLM Workshop (1)
- Building AI Agents with Multimodal Models: NVIDIA DLI Workshop for Academia (1)
- AI Meetup (February): AI, GenAI, LLMs and ML (1)
- AI Agents from Scratch: A Beginner's LLM Workshop (1)
- AI Meetup (September): AI, GenAI and LLMs (1)
- AI Meetup: End of Year Celebration for AI, GenAI, LLMs and ML (1)