talk-data.com

Eric Sun

Speaker

2 talks

Head of Data Platform, Coinbase

Innovating a highly scalable data platform for the rapidly growing digital-native industry. Eric pioneers open data lake architecture, builds high-performance data platforms, and crafts intelligent data tools.

Bio from: Databricks DATA + AI Summit 2023

Filtering by: Data + AI Summit 2025

Talks & appearances

Showing 2 of 4 activities

Master Schema Translations in the Era of Open Data Lake

Unity Catalog puts a variety of schemas into a centralized repository. Now the developer community wants more productivity and automation for schema inference, translation, evolution, and optimization, especially for ingestion and reverse-ETL scenarios, with more code generation.

Coinbase Data Platform attempts to pave a path with "Schemaster", which interacts with the data catalog through a (proposed) metadata model to make schema translation and evolution more manageable across popular systems such as Delta, Iceberg, Snowflake, Kafka, MongoDB, DynamoDB, and Postgres.

This Lightning Talk covers four areas: (1) the complexity and caveats of schema differences among these systems; (2) the proposed field-level metadata model and two translation patterns, point-to-point vs. hub-and-spoke; (3) why data profiling should be augmented to enhance schema understanding and translation; (4) how to integrate it with ingestion and reverse-ETL in a Databricks-oriented ecosystem.

Takeaway: standardize schema lineage and translation.
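To make the point-to-point vs. hub-and-spoke distinction concrete, here is a minimal Python sketch. It is not Schemaster's actual metadata model: the `Field` class, the canonical "hub" type names, and the type mappings are all hypothetical and chosen only for illustration. The idea is that each system maps to and from one canonical type, so adding a system needs two mappings rather than one per existing system.

```python
from dataclasses import dataclass

# Hypothetical native-type -> canonical ("hub") type mappings.
# Illustrative only; a real model would also cover precision, nesting, etc.
TO_HUB = {
    "postgres": {"bigint": "int64", "text": "string", "timestamptz": "timestamp_tz"},
    "snowflake": {"NUMBER(38,0)": "int64", "VARCHAR": "string", "TIMESTAMP_TZ": "timestamp_tz"},
    "delta": {"bigint": "int64", "string": "string", "timestamp": "timestamp_tz"},
}

# Reverse direction: canonical type -> native type, derived per system.
FROM_HUB = {
    system: {hub: native for native, hub in mapping.items()}
    for system, mapping in TO_HUB.items()
}


@dataclass(frozen=True)
class Field:
    """Hypothetical field-level metadata: name, native type, nullability."""
    name: str
    type: str
    nullable: bool = True


def translate(fields, source, target):
    """Hub-and-spoke translation: source type -> canonical type -> target type."""
    translated = []
    for f in fields:
        hub_type = TO_HUB[source][f.type]
        translated.append(Field(f.name, FROM_HUB[target][hub_type], f.nullable))
    return translated


def translator_count(n_systems):
    """Directed translators needed: (point-to-point, hub-and-spoke)."""
    return n_systems * (n_systems - 1), 2 * n_systems


if __name__ == "__main__":
    pg_schema = [Field("id", "bigint", nullable=False), Field("note", "text")]
    print(translate(pg_schema, "postgres", "snowflake"))
    print(translator_count(7))  # the seven systems named in the abstract
```

For the seven systems named in the abstract, point-to-point would need 7 × 6 = 42 directed translators, while hub-and-spoke needs 14. The trade-off is that the canonical model has to be expressive enough to carry every system's types without losing information.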

Graph-Powered Observability Data Analysis in Databricks With Credential Vending

Observability data (logs, metrics, and traces) captures the complex interactions within modern distributed systems. A graph query engine on top of Databricks enables complex traversal of massive observability data, helping users trace service dependencies, analyze upstream and downstream impacts, and uncover recurring error patterns, which makes it easier to diagnose issues and optimize system performance.

A critical challenge in handling observability data is managing dynamic RBAC for sensitive system telemetry. This session explains how Coinbase leverages credential vending, a method for issuing short-lived credentials, to enable fine-grained, secure access to observability data stored in Databricks without long-lived secrets.

Key takeaways: (1) querying Databricks tables as graph structures without ETLing data out; (2) secure access management with credential vending; (3) a practical graph-based incident analysis solution at Coinbase, with insights on how PuppyGraph enables this application.

Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.
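The general idea behind credential vending, short-lived credentials scoped to one resource, can be sketched in a few lines of Python. This is a toy illustration using only the standard library, not the Databricks or Unity Catalog API and not Coinbase's implementation. The function names, the token format, and the `SIGNING_KEY` are all hypothetical.

```python
import hashlib
import hmac
import time

# Hypothetical signing key held only by the vending service (demo value).
SIGNING_KEY = b"demo-only-key"


def vend_credential(principal, table, ttl_seconds=900, now=None):
    """Issue a signed token scoped to one table that expires after ttl_seconds."""
    issued = time.time() if now is None else now
    expires_at = int(issued) + ttl_seconds
    payload = f"{principal}|{table}|{expires_at}"
    signature = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def is_valid(token, table, now=None):
    """Accept a token only if it is untampered, scoped to `table`, and unexpired."""
    current = time.time() if now is None else now
    try:
        principal, scoped_table, expires_at, signature = token.split("|")
    except ValueError:
        return False  # malformed token
    payload = f"{principal}|{scoped_table}|{expires_at}"
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return (
        hmac.compare_digest(signature, expected)
        and scoped_table == table
        and current < int(expires_at)
    )


if __name__ == "__main__":
    token = vend_credential("alice", "obs.logs", ttl_seconds=60, now=1000)
    print(is_valid(token, "obs.logs", now=1030))    # within TTL, correct scope
    print(is_valid(token, "obs.traces", now=1030))  # wrong table
    print(is_valid(token, "obs.logs", now=1061))    # expired
```

Because each token names a single table and carries its own expiry, a leaked token exposes only that table and only until the TTL runs out, which is the property that lets a system avoid long-lived secrets.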