Image retrieval is the task of searching a large database for images that are similar to one or more query images. A classical approach is to transform both the database images and the query images into embeddings with a feature extractor (e.g., a CNN or a ViT), so that they can be compared using a distance metric. Self-supervised learning (SSL) can train such a feature extractor without labeled training data, which is expensive and time-consuming to produce. We will use DINO's SSL method to build a feature extractor, and Milvus, an open-source vector database built for scalable similarity search, to index the image representation vectors for efficient retrieval. We will compare the SSL approach with supervised and pre-trained feature extractors.
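The embed-then-compare idea described above can be illustrated with a minimal sketch. It is not the speakers' implementation: random vectors stand in for the embeddings a feature extractor would produce, cosine similarity is assumed as the comparison metric, and the function names `cosine_similarity` and `top_k_similar` are hypothetical. The search is a brute-force scan, whereas a vector database would typically answer the same query with an index.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query, database, k=3):
    """Return the k (index, score) pairs most similar to the query.

    Brute-force scan over every stored vector; assumed here only for
    illustration of what an index-backed search would approximate.
    """
    scored = [(cosine_similarity(query, vec), idx) for idx, vec in enumerate(database)]
    scored.sort(reverse=True)  # highest similarity first
    return [(idx, score) for score, idx in scored[:k]]


# Stand-in data: 100 random 16-dimensional "image embeddings" (assumption).
random.seed(0)
database = [[random.gauss(0, 1) for _ in range(16)] for _ in range(100)]

# The query is a slightly perturbed copy of item 42, mimicking a near-duplicate image.
query = [x + 0.01 * random.gauss(0, 1) for x in database[42]]

results = top_k_similar(query, database, k=3)
for idx, score in results:
    print(idx, round(score, 4))
```

Because the query is a near-copy of item 42, that item should rank first. With real data, the quality of the ranking depends entirely on how well the feature extractor places similar images close together.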
Filtering by: PyConDE & PyData Berlin 2023
Innovations such as sentence-transformers, neural search, and vector databases have recently fueled rapid development of question-answering systems. At scieneers, we wanted to test these components on our own information needs by building a Slack bot that answers our questions by reading through our internal documents and Slack conversations. To do so, we combined the Haystack QA framework with a Weaviate vector database and many fine-tuned NLP models. This talk will give you insights into both the technical challenges we faced and the organizational lessons we learned.