Topic

vision-language models

Activities

1

tagged

Activity Trend

1 peak/qtr (2020-Q1 to 2026-Q2)

Top Events

OpenSearch Paris Meetup - 17 Juillet 2025 à Courbevoie (FR/EN) (1)
Building AI Agents with Multimodal Models: NVIDIA DLI Workshop for Academia (1)
Nov 24 - Best of ICCV (Day 4) (1)

Top Speakers

Praveen Mohan Prasad (AWS) (1)
Shijie Zhou (UCLA) (1)

Activities

Unlocking Insights from Multimodal PDFs using OpenSearch and Vision-Language Models

2025-07-17 · OpenSearch Paris Meetup - 17 Juillet 2025 à Courbevoie (FR/EN)

talk

by Praveen Mohan Prasad (AWS)

opensearch

PDFs are packed with text, tables, and images, but extracting insights from them isn’t easy. Traditional methods chain together multiple components, such as OCR and task-specific models, which makes them complex and hard to scale. Vision-language models like ColPali simplify this by representing all modalities in a unified format.

In this session, you’ll see how ColPali can be combined with OpenSearch to enable conversational search over rich PDF content. We’ll also showcase a live demo to bring this concept to life.