Wenwen Xie

Specialist Solutions Architect, Databricks

Wenwen Xie is a Machine Learning and Generative AI Solutions Architect at Databricks, based in Jersey City, New Jersey. She specializes in the design and deployment of advanced ML systems, with a strong focus on model evaluation and validation frameworks. She is a proponent of Evaluation-Driven Development and advocates treating robust model evaluation as a foundational practice in ML system design. Before joining Databricks, Wenwen served briefly as a Data Scientist Lead Vice President at J.P. Morgan. Her earlier role as Senior Data Scientist and Machine Learning Engineer at LexisNexis shaped her expertise in generative AI and natural language processing (NLP).

Bio from: Data + AI Summit 2025

Talks & appearances

Evaluation-Driven Development Workflows: Best Practices and Real-World Scenarios

2025-06-12 · Data + AI Summit 2025

talk

with Wenwen Xie (Databricks), Arthur Dooner (Databricks)

AI/ML · API · LLM

In enterprise AI, Evaluation-Driven Development (EDD) helps ensure reliable, efficient systems by embedding continuous assessment and improvement into the AI development lifecycle. High-quality evaluation datasets are built with techniques such as document analysis, synthetic data generation via Mosaic AI’s synthetic data generation API, SME validation, and relevance filtering, which reduce manual effort and accelerate workflows. EDD focuses on metrics such as context relevance, groundedness, and response accuracy to identify and address issues like retrieval errors or model limitations. Custom LLM judges, tailored to domain-specific needs such as PII detection or tone assessment, enhance evaluations. With tools such as Mosaic AI Agent Framework, Agent Evaluation, and MLflow, EDD automates data tracking, streamlines workflows, and quantifies improvements, helping teams deliver scalable, high-performing systems that drive measurable organizational value.
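
The metrics and custom judges named in the abstract can be illustrated with a small, framework-free sketch. Everything below is an assumption made for illustration: the `EvalRecord` type, the token-overlap scoring, and the regex-based `pii_judge` are hypothetical stand-ins, not the Mosaic AI Agent Evaluation or MLflow APIs, and a real workflow would typically use LLM judges rather than these crude proxies.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class EvalRecord:
    """One row of a hypothetical evaluation dataset."""
    question: str
    retrieved_context: str
    response: str
    expected: str


def _tokens(text: str) -> set[str]:
    # Lowercased alphanumeric tokens; a deliberately crude tokenizer.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def context_relevance(rec: EvalRecord) -> float:
    """Share of question tokens found in the retrieved context (proxy only)."""
    q = _tokens(rec.question)
    return len(q & _tokens(rec.retrieved_context)) / len(q) if q else 0.0


def groundedness(rec: EvalRecord) -> float:
    """Share of response tokens found in the retrieved context (proxy only)."""
    r = _tokens(rec.response)
    return len(r & _tokens(rec.retrieved_context)) / len(r) if r else 0.0


def response_accuracy(rec: EvalRecord) -> float:
    """1.0 if the expected answer appears verbatim in the response, else 0.0."""
    return 1.0 if rec.expected.lower() in rec.response.lower() else 0.0


# Assumption: email addresses stand in for PII; real detection is far broader.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def pii_judge(rec: EvalRecord) -> bool:
    """Hypothetical custom judge: passes when the response has no email address."""
    return EMAIL_RE.search(rec.response) is None


def evaluate(records: list[EvalRecord]) -> dict[str, float]:
    """Average each metric over the dataset so runs can be compared."""
    if not records:
        raise ValueError("evaluation dataset is empty")
    n = len(records)
    return {
        "context_relevance": sum(context_relevance(r) for r in records) / n,
        "groundedness": sum(groundedness(r) for r in records) / n,
        "response_accuracy": sum(response_accuracy(r) for r in records) / n,
        "pii_free_rate": sum(pii_judge(r) for r in records) / n,
    }
```

Calling `evaluate` on the same dataset before and after a change to retrieval or prompting gives one comparable set of numbers per run, which is the loop EDD builds on. The scores could then be logged to an experiment tracker of choice.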