talk-data.com

Speaker

Anurag Bharati

Senior Data Engineer, DigiCert

Anurag works as a Senior Data Engineer at DigiCert, where he has built data pipelines that scale to ingest data of varying volumes. These pipelines use Apache Spark, Structured Streaming and Delta Lake tables. Beyond data engineering, he has also built an internal, customer-facing UI that uses Databricks' Genie API to let DigiCert's internal stakeholders ask business questions in natural language.

Bio from: Data + AI Summit 2025

Talks & appearances

Supercharging Sales Intelligence: Processing Billions of Events via Structured Streaming

DigiCert is a digital security company that provides digital certificates, encryption and authentication services. It serves 88% of the Fortune 500 and secures over 28 billion web connections daily. Our project aggregates and analyzes certificate transparency logs via public APIs to provide comprehensive market and competitive intelligence. Instead of relying on third-party providers with limited data, our project gives us full control, deeper insights and automation. Databricks has helped us reliably poll public APIs at scale, fetching millions of events daily, deduplicating them and storing them in our Delta tables. Specifically, we use Spark for parallel processing, Structured Streaming for real-time ingestion and deduplication, Delta tables for data reliability, and pools and jobs to keep our costs optimized. These technologies help us keep our data fresh, accurate and cost-effective. This data has given our sales team real-time intelligence that supports DigiCert's success.
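
The deduplication step mentioned in the abstract can be illustrated with a minimal sketch. This is plain Python, not Spark or the speaker's actual pipeline: it only shows the idea of dropping events already seen in an earlier poll. The `entry_id` key, the `issuer` field and the sample events are hypothetical, and the in-memory list merely stands in for a Delta table.

```python
from typing import Iterable


def deduplicate_batches(batches: Iterable[list[dict]], key: str = "entry_id") -> list[dict]:
    """Keep the first occurrence of each key across all polled batches."""
    seen: set = set()        # keys already stored (assumed unique per log entry)
    stored: list[dict] = []  # stand-in for a table of deduplicated events
    for batch in batches:
        for event in batch:
            if event[key] in seen:
                continue     # duplicate from an overlapping poll; skip it
            seen.add(event[key])
            stored.append(event)
    return stored


# Two overlapping polls of a hypothetical log API.
polls = [
    [{"entry_id": 1, "issuer": "CA-A"}, {"entry_id": 2, "issuer": "CA-B"}],
    [{"entry_id": 2, "issuer": "CA-B"}, {"entry_id": 3, "issuer": "CA-A"}],
]
unique_events = deduplicate_batches(polls)
```

In a streaming engine the set of seen keys would typically be bounded, for example by a time window, so that state does not grow without limit; this sketch omits that for brevity.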