talk-data.com

Speaker

Akanksha Nagpal

1 talk

Sr. Data Engineer, Adobe

Akanksha is a Sr. Software Engineer with extensive experience in designing and building large-scale distributed systems within the Adobe Experience Platform (AEP). She has led the development of high-performance data pipelines for efficient big data ingestion into Adobe's Identity Graph, leveraging technologies such as Apache Spark, Spark Streaming, and Flink. Her expertise includes working with Delta Lake for scalable data storage and optimizing real-time and batch data processing workflows. Passionate about innovation and continuous learning, she actively contributes to the tech community, sharing insights on scalable data processing, streaming architectures, and distributed systems.

Bio from: Data + AI Summit 2025

Talks & appearances

Scaling Identity Graph Ingestion to 1M Events/Sec with Spark Streaming & Delta Lake

Adobe’s Real-Time Customer Data Platform relies on the identity graph to connect over 70 billion identities and deliver personalized experiences. This session will showcase how the platform uses Databricks, Spark Streaming, and Delta Lake, along with 25+ Databricks deployments across multiple regions and clouds (Azure and AWS), to process terabytes of data daily and handle over a million records per second. The talk will highlight the platform’s ability to scale, demonstrating a 10x increase in ingestion pipeline capacity to accommodate peak traffic during events like the Super Bowl. Attendees will learn about the technical strategies employed, including migrating from Flink to Spark Streaming, optimizing data deduplication, and implementing robust monitoring and anomaly detection. Discover how these optimizations enable Adobe to deliver real-time identity resolution at scale while ensuring compliance and privacy.