talk-data.com

Topic: opensearch (11 tagged)

Activity Trend: peak of 4 activities per quarter, 2020-Q1 to 2026-Q1

Activities

11 activities · Newest first

As part of its missions, the Ministry of Culture aims to centralize the diverse data it manages. How can a platform be designed to host and distribute management, heritage, and decision-making data? What architecture choices were made considering the constraints? How does the search engine integrate into this platform, providing performance and smooth access? This is the story we will share, presented as a case study with practical insights.

Search engines are at the heart of the user experience. But how do you move from a “classic” keyword-based search to a semantic search that truly understands user intent? In this session, The Fork team will share their journey toward AI-augmented search. Building on their existing OpenSearch stack, they added a semantic layer powered by LLMs. The goal: analyze user queries, extract the key elements, and translate them into a much more relevant semantic search. You’ll discover the challenges they faced, the implementation choices they made, and the real-world results achieved in a high-impact use case.
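As a rough illustration of the "extract the key elements, then translate them into a search" step, the sketch below turns already-extracted query elements into an OpenSearch query DSL body. The field names (`description`, `city`, `cuisine`), the shape of the extracted dictionary, and the function name are hypothetical: the abstract does not describe the team's actual schema or how the LLM output is structured, and the LLM call itself is left out.

```python
from typing import Any, Dict, List


def build_search_body(extracted: Dict[str, Any], size: int = 10) -> Dict[str, Any]:
    """Translate extracted query elements into an OpenSearch bool query.

    `extracted` stands in for the output of an LLM extraction step
    (hypothetical keys: city, cuisine, free_text).
    """
    # Exact-value elements become non-scoring filters.
    filters: List[Dict[str, Any]] = []
    for field in ("city", "cuisine"):
        value = extracted.get(field)
        if value:
            filters.append({"term": {field: value}})

    # Whatever is left of the user's intent is matched as free text.
    free_text = extracted.get("free_text")
    if free_text:
        must = [{"match": {"description": free_text}}]
    else:
        must = [{"match_all": {}}]

    return {"size": size, "query": {"bool": {"must": must, "filter": filters}}}


# Hypothetical extraction result for a query such as
# "quiet italian place with a terrace in paris for a date".
example = {"city": "paris", "cuisine": "italian", "free_text": "quiet terrace for a date"}
body = build_search_body(example)
```

In a semantic setup, the free-text clause would typically be swapped for a vector query over embeddings; that part is omitted here because the abstract gives no detail on it.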

The latest version of OpenSearch introduces powerful new features and tools that enable us to build AI-powered assistants capable of interacting not only with our data, but also with cluster configurations and even the external world. We’ll explore how these capabilities open the door to intelligent agent-based systems that can reason, retrieve, and act. We'll walk through how to combine OpenSearch with modern AI tools, like LLMs, agents, and orchestration frameworks, to create assistants that can autonomously diagnose issues, generate insights, or even automate operational tasks.

RAG systems are everywhere in 2025. But between vectors, databases, and Python APIs, most RAG stacks end up as convoluted contraptions. What if it could all be simplified? In this talk, Samir shows how OpenSearch can cover all the needs of a RAG system without depending on 15 different tools. The result is a minimalist, fully open source stack that can be deployed anywhere, well suited to teams that want to keep control of their data. The session ends with a live demo of a Q&A engine, ready to be tried out.

PDFs are packed with text, tables, and images, but extracting insights from them isn’t easy. Traditional methods involve multiple components, such as OCR and task-specific models, which makes them complex and hard to scale. Vision-Language Models like ColPali simplify this by representing all modalities in a unified format. In this session, you’ll see how ColPali can be combined with OpenSearch to enable conversational search over rich PDF content. We’ll also showcase a live demo to bring this concept to life.
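For context, ColPali represents each page as a set of patch embeddings and scores it against a query with late interaction, in the style of ColBERT: each query token vector is matched to its best page vector, and the maxima are summed. The sketch below shows that scoring in plain Python on toy 2-dimensional vectors. The vectors and page names are made up for illustration, and how the session wires this scoring into OpenSearch is not stated in the abstract.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], page_vecs: List[Vector]) -> float:
    """Late-interaction (MaxSim) score.

    For every query vector, take its best similarity over all page
    vectors, then sum those maxima.
    """
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


def rank_pages(query_vecs: List[Vector], pages: Dict[str, List[Vector]]) -> List[str]:
    """Return page identifiers ordered from best to worst score."""
    return sorted(pages, key=lambda name: maxsim_score(query_vecs, pages[name]), reverse=True)


# Toy data: two query token vectors and two pages of patch vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0]],  # covers both query directions
    "page_b": [[1.0, 0.0], [1.0, 0.0]],  # covers only the first one
}
ranking = rank_pages(query, pages)
```

Real embeddings have many more dimensions and far more vectors per page, so a production setup would usually narrow the candidates first and apply this scoring only to a shortlist.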

OpenSearch has become a cornerstone of open source search and observability, empowering developers and organizations to derive meaningful insights from unstructured data at scale. This year marks a significant milestone in its journey, with OpenSearch officially joining The Linux Foundation, further cementing its position in the open source ecosystem. In this session we’ll introduce OpenSearch, from indexing and analyzing unstructured logs to full observability capabilities across tracing, monitoring, and security. We’ll share the latest improvements in query performance, scalability, and real-time analytics, as well as its expanding ecosystem of plugins and SDKs in multiple programming languages and its compatibility with cloud-native environments.

Abstract: You've been tasked with implementing a data streaming pipeline for propagating data changes from your operational Postgres database to a search index in OpenSearch. Data views in OpenSearch should be denormalized for fast querying, and of course there should be no noticeable impact on the production database. In this session we'll discuss how to build this data pipeline using two popular open-source projects: Debezium for log-based change data capture (CDC) and Apache Flink for stream processing. Join us for this talk and learn about:
* Setting up change data streams with Debezium
* Efficiently building nested data structures from 1:n joins
* Deployment options: Kafka Connect vs. Flink CDC
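To make the "nested data structures from 1:n joins" idea concrete, here is a minimal in-memory sketch that folds a stream of change events into one denormalized document per parent row. It is not Flink code: in the pipeline the abstract describes, the stream processor would hold this state. The event shape is a simplified stand-in for Debezium's envelope (only `op`, `before`, and `after` mirror it; the top-level `table` key is a simplification), and the `orders` and `order_lines` tables are hypothetical.

```python
from typing import Any, Dict, Optional

Row = Dict[str, Any]
State = Dict[str, Dict[int, Row]]  # table name -> primary key -> latest row


def apply_change(state: State, event: Dict[str, Any]) -> None:
    """Apply one simplified change event to the per-table state."""
    table = state.setdefault(event["table"], {})
    if event["op"] in ("c", "u", "r"):  # create, update, snapshot read
        row = event["after"]
        table[row["id"]] = row
    elif event["op"] == "d":  # delete
        table.pop(event["before"]["id"], None)


def build_document(state: State, order_id: int) -> Optional[Row]:
    """Join one order with its lines (1:n) into a nested document."""
    order = state.get("orders", {}).get(order_id)
    if order is None:
        return None
    lines = [
        {"sku": line["sku"], "quantity": line["quantity"]}
        for line in sorted(state.get("order_lines", {}).values(), key=lambda r: r["id"])
        if line["order_id"] == order_id
    ]
    return {**order, "lines": lines}


events = [
    {"table": "orders", "op": "c", "before": None, "after": {"id": 1, "customer": "alice"}},
    {"table": "order_lines", "op": "c", "before": None,
     "after": {"id": 10, "order_id": 1, "sku": "A", "quantity": 2}},
    {"table": "order_lines", "op": "c", "before": None,
     "after": {"id": 11, "order_id": 1, "sku": "B", "quantity": 1}},
    {"table": "order_lines", "op": "u",
     "before": {"id": 10, "order_id": 1, "sku": "A", "quantity": 2},
     "after": {"id": 10, "order_id": 1, "sku": "A", "quantity": 3}},
    {"table": "order_lines", "op": "d",
     "before": {"id": 11, "order_id": 1, "sku": "B", "quantity": 1}, "after": None},
]

state: State = {}
for event in events:
    apply_change(state, event)

# The document that would be indexed into OpenSearch for order 1.
document = build_document(state, 1)
```

Each change to a child row means re-emitting the parent document, which is why the join has to be stateful rather than a one-off query against the production database.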