talk-data.com

Topic

ClickHouse

columnar_database big_data analytics database data_warehouse olap realtime

Activity trend: peak of 17 activities per quarter, 2020-Q1 to 2026-Q1

Activities

59 activities · Newest first

Looking at some of the earliest pull requests in the ClickHouse repository, you will see a strong emphasis on integration with external systems. Over time, ClickHouse has become a powerful bridge between data lakes and data warehouses, supporting queues, databases, and object stores, with compatibility for more than 60 input and output formats. This versatility lets users benefit from the flexibility of a data lake while retaining real-time query performance.

In this session, we will discuss how our users leverage ClickHouse and Iceberg, as well as some features currently under development to support this trend.

At Berlin Buzzwords, industry voices highlighted how search is evolving with AI and LLMs.

  • Kacper Łukawski (Qdrant) stressed hybrid search (semantic + keyword) as core for RAG systems and promoted efficient embedding models for smaller-scale use.
  • Manish Gill (ClickHouse) discussed auto-scaling OLAP databases on Kubernetes, combining infrastructure and database knowledge.
  • André Charton (Kleinanzeigen) reflected on scaling search for millions of classifieds, moving from Solr/Elasticsearch toward vector search, while returning to a hands-on technical role.
  • Filip Makraduli (Superlinked) introduced a vector-first framework that fuses multiple encoders into one representation for nuanced e-commerce and recommendation search.
  • Brian Goldin (Voyager Search) emphasized spatial context in retrieval, combining geospatial data with AI enrichment to add the “where” to search.
  • Atita Arora (Voyager Search) highlighted geospatial AI models, the renewed importance of retrieval in RAG, and the cautious but promising rise of AI agents.

Together, their perspectives show a common thread: search is regaining center stage in AI—scaling, hybridization, multimodality, and domain-specific enrichment are shaping the next generation of retrieval systems.

Kacper Łukawski
Senior Developer Advocate at Qdrant, he educates users on vector and hybrid search. He highlighted Qdrant’s support for dense and sparse vectors, the role of search with LLMs, and his interest in cost-effective models like static embeddings for smaller companies and edge apps.
Connect: https://www.linkedin.com/in/kacperlukawski/
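
As a rough illustration of the hybrid search idea mentioned above (combining a keyword ranking with a semantic ranking), here is a minimal sketch of reciprocal rank fusion, one common way to merge ranked lists. It is not taken from the talk and does not use Qdrant's API; the function name, the document IDs, and the constant `k=60` are assumptions chosen for the example.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in;
    k is a smoothing constant (60 is a commonly used default, assumed here).
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword (sparse) and a semantic (dense) search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents that appear near the top of both lists (here `doc_b` and `doc_a`) rise above documents found by only one of the two searches.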

Manish Gill
Engineering Manager at ClickHouse, he spoke about running ClickHouse on Kubernetes, tackling auto-scaling and stateful sets. His team focuses on making ClickHouse scale automatically in the cloud. He credited its speed to careful engineering and reflected on the shift from IC to manager.
Connect: https://www.linkedin.com/in/manishgill/

André Charton
Head of Search at Kleinanzeigen, he discussed shaping the company’s search tech—moving from Solr to Elasticsearch and now vector search with Vespa. Kleinanzeigen handles 60M items, 1M new listings daily, and 50k requests/sec. André explained his career shift back to hands-on engineering.
Connect: https://www.linkedin.com/in/andrecharton/

Filip Makraduli
Founding ML DevRel engineer at Superlinked, an open-source framework for AI search and recommendations. Its vector-first approach fuses multiple encoders (text, images, structured fields) into composite vectors for single-shot retrieval. His Berlin Buzzwords demo showed e-commerce search with natural-language queries and filters.
Connect: https://www.linkedin.com/in/filipmakraduli/
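
The composite-vector idea described above can be sketched as follows: each field is encoded separately, each part is normalized and weighted, and the parts are concatenated into one vector so that a single similarity computation covers all fields. This is a toy sketch under stated assumptions, not Superlinked's API; the helper names, the field names, and the weights are hypothetical, and the "encoders" are stand-in vectors.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm else list(vector)


def fuse(parts, weights):
    """Concatenate per-field vectors into one composite vector.

    Fields are visited in sorted order so queries and items line up,
    and each normalized part is scaled by its (hypothetical) weight.
    """
    fused = []
    for name in sorted(parts):
        weight = weights.get(name, 1.0)
        fused.extend(weight * x for x in l2_normalize(parts[name]))
    return fused


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Stand-ins for encoder outputs: a text embedding and a numeric price feature.
item = {"text": [3.0, 4.0], "price": [1.0]}
weights = {"text": 1.0, "price": 0.5}

item_vector = fuse(item, weights)
score = dot(item_vector, item_vector)  # one similarity over all fields
print(item_vector, score)
```

Changing the weights shifts how much each field contributes to the single similarity score, without running a separate search per field.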

Brian Goldin
Founder and CEO of Voyager Search, which began with geospatial search and expanded into documents and metadata enrichment. Voyager indexes spatial data and enriches pipelines with NLP, OCR, and AI models to detect entities like oil spills or windmills. He stressed adding spatial context (“the where”) as critical for search and highlighted Voyager’s 12 years of enterprise experience.
Connect: https://www.linkedin.com/in/brian-goldin-04170a1/

Atita Arora
Director of AI at Voyager Search, with nearly 20 years in retrieval systems, now focused on geospatial AI for Earth observation data. At Berlin Buzzwords she hosted sessions, attended talks on Lucene, GPUs, and Solr, and emphasized retrieval quality in RAG systems. She is cautiously optimistic about AI agents and values the event as both a learning hub and a professional reunion.
Connect: https://www.linkedin.com/in/atitaarora/

This talk walks listeners through a practical use case: I will show how I processed more than 11.55M LinkedIn candidate records to build a recommender system (RecSys) that finds ideal candidates for a bespoke job description, with an AI plus human-in-the-loop workflow. I will show how I was initially stuck because latency and real-time processing of these records were a problem, and how ClickHouse solved this.

AI is no longer just a buzzword—it’s a fundamental part of our daily lives. Forward-thinking organizations are embracing AI technologies, not just to stay competitive but to drive innovation, agility, and efficiency. When building AI platforms, the ability to process vast amounts of data in real-time is crucial to deliver optimised performance. This talk explores how ultra-fast analytics databases are transforming the AI landscape by enabling scalable, high-performance data processing.

ClickHouse® is famous for real-time response, cost-efficiency, and flexible Apache 2.0 licensing. In the first half of this talk, we’ll show how ClickHouse® implements sub-second analytics with columnar storage, vectorized query execution, and compression on datasets running to hundreds of terabytes. We’ll then pivot and discuss how Altinity is adapting open source ClickHouse® to deliver economical, real-time response over 100 petabytes of data or more.
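
To give a feel for why columnar storage and compression help analytics, here is a toy sketch in pure Python. It is not how ClickHouse is implemented; the helper names and sample rows are assumptions for illustration, and run-length encoding stands in for real compression codecs.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical web-log rows, already sorted by country.
rows = [
    {"country": "DE", "status": 200, "bytes": 512},
    {"country": "DE", "status": 200, "bytes": 2048},
    {"country": "DE", "status": 404, "bytes": 128},
    {"country": "FR", "status": 200, "bytes": 1024},
]

columns = to_columns(rows)

# Similar values sit next to each other in a column, so they compress well.
encoded_country = rle_encode(columns["country"])

# An aggregate touches only the one column it needs, not whole rows.
total_bytes = sum(columns["bytes"])

print(encoded_country, total_bytes)
```

The same two effects, reading only the needed columns and compressing runs of similar values, are what the abstract's "columnar storage" and "compression" refer to, at a much larger scale.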

ClickHouse and Databricks for Real-Time Analytics

ClickHouse is a C++-based, column-oriented database built for real-time analytics. While it has its own internal storage format, the rise of open lakehouse architectures has created a growing need for seamless interoperability. In response, we have developed integrations with your favorite lakehouse ecosystem to enhance compatibility, performance, and governance. From integrating with Unity Catalog to embedding the Delta Kernel into ClickHouse, this session will explore the key design considerations behind these integrations, their benefits to the community, the lessons learned, and future opportunities for improved compatibility and seamless integration.