talk-data.com

Meetup talk 2024-11-12 at 19:10

Vector Search: Precision/Recall Balance

Description

Vector search never returns zero results: as long as products are available, it will always return the top N results for any search query, whether or not those results are relevant. To optimize the precision/recall balance of the vector search system, we need to control the cosine similarity threshold below which results are dropped. We will explore how different models inherently have different cosine similarity distributions, and how factors such as fine-tuning, query length, and query language affect them.
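As a rough illustration of the idea, the sketch below ranks a toy catalogue by cosine similarity and then optionally drops results under a cutoff. Everything in it is an assumption made for illustration and not taken from the talk: the three-dimensional vectors, the product names, the `search` helper, and the 0.5 threshold. In practice the threshold would have to be tuned per model, since similarity distributions differ between models.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def search(query_vec, products, top_n=3, min_similarity=None):
    """Return up to top_n (name, score) pairs, best first.

    Hypothetical helper: without min_similarity it always returns
    top_n items; with it, weak matches are filtered out.
    """
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in products.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    top = scored[:top_n]
    if min_similarity is not None:
        top = [(name, score) for name, score in top if score >= min_similarity]
    return top


# Toy embeddings, invented for this example.
products = {
    "red running shoes": [0.9, 0.1, 0.0],
    "blue running shoes": [0.8, 0.2, 0.1],
    "garden hose": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]

unfiltered = search(query, products, top_n=3)
filtered = search(query, products, top_n=3, min_similarity=0.5)  # 0.5 is arbitrary

print([name for name, _ in unfiltered])
print([name for name, _ in filtered])
```

Raising the threshold trades recall for precision: fewer irrelevant items survive, but some relevant ones with modest scores may be cut as well.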