talk-data.com

Speaker

Yinxi Zhang

1 talk

Staff Data Scientist, Databricks

Yinxi Zhang is a Staff Data Scientist at Databricks, where she works with customers to build GenAI applications at scale. Prior to joining Databricks, Yinxi worked as an ML specialist in the energy industry for 7 years, optimizing production for conventional and renewable assets. She holds a Ph.D. in Electrical Engineering from the University of Houston. Yinxi is a former marathon runner, and is now a happy yogi.

Bio from: Data + AI Summit 2025

Talks & appearances

Entity Resolution for the Best Outcomes on Your Data

There are many ways to implement an entity resolution (ER) system, using either vendor software or open-source libraries that enable do-it-yourself ER. Whatever the approach, however, we generally see the same challenges: limited scalability, being bound to a single model architecture, a lack of metrics and explainability, and stagnant implementations that do not "learn" with experience. Recent experiments with transformer-based approaches, fast lookups with vector search, and Databricks components such as Databricks Apps and Agent Eval provide the foundations for a composable ER system that can improve over time on your data. In this presentation, we include a demo of how to use these components to build a composable ER system that achieves the best outcomes for your data.