talk-data.com

Speaker

Mingyang Ge

Software Engineer, Databricks

Mingyang is the tech lead for the Databricks Feature Store team and previously worked on Google Assistant.

Bio from: Data + AI Summit 2025

Talks & appearances

High-Throughput ML: Mastering Efficient Model Serving at Enterprise Scale

Ever wondered how industry leaders handle thousands of ML predictions per second? This session reveals the architecture behind high-performance model serving systems on Databricks. We'll explore how to build inference pipelines that scale efficiently to handle massive request volumes while maintaining low latency. You'll learn how to leverage Feature Store for consistent, low-latency feature lookups and implement auto-scaling strategies that optimize both performance and cost.

Key takeaways:

- Determining optimal compute capacity using the QPS × model execution time formula
- Configuring Feature Store for high-throughput, low-latency feature retrieval
- Managing cold starts and scaling strategies for latency-sensitive applications
- Implementing monitoring systems that provide visibility into inference performance

Whether you're serving recommender systems or real-time fraud detection models, you'll gain practical strategies for building enterprise-grade ML serving systems.
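
The QPS × model execution time formula named in the abstract can be sketched as a small calculation. This is an illustrative reading, not the speaker's implementation: the product of request rate and per-request execution time (in seconds) estimates how many requests are in flight at once, which is the same relationship as Little's law. The function name `required_concurrency` and the `headroom` parameter are hypothetical and do not come from the talk.

```python
import math


def required_concurrency(qps: float, exec_time_ms: float, headroom: float = 0.0) -> int:
    """Estimate how many requests are in flight at once.

    Applies QPS x model execution time (converted to seconds).
    `headroom` is an assumed safety margin (0.2 means 20% extra);
    it is not part of the formula named in the abstract.
    """
    if qps < 0 or exec_time_ms < 0 or headroom < 0:
        raise ValueError("qps, exec_time_ms and headroom must be non-negative")

    # Multiply before dividing so typical integer inputs stay exact.
    in_flight = qps * exec_time_ms / 1000.0
    padded = in_flight * (1.0 + headroom)

    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(padded, 9))


if __name__ == "__main__":
    # 1000 requests/s at 50 ms each keeps about 50 requests in flight.
    print(required_concurrency(1000, 50))
    # The same load with an assumed 20% safety margin.
    print(required_concurrency(1000, 50, headroom=0.2))
```

How that concurrency figure maps onto a specific serving tier or instance count depends on the platform, so treat the result as a starting estimate to validate with load testing.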