Topic

Kinesis

Amazon Kinesis

stream_processing realtime aws

Activities: 2 tagged

Activity Trend: peak of 3 per quarter (2020-Q1 to 2026-Q2)

Top Events

Data Engineering Podcast: 7
AWS re:Invent 2024: 5
Databricks DATA + AI Summit 2023: 3
O'Reilly Data Engineering Books: 2
Data + AI Summit 2025: 2
Airflow Summit 2025: 1
Data Council 2023: 1
O'Reilly Data Science Books: 1

Top Speakers

Tobias Macey: 7
Jonathan Katz (Amazon Redshift): 1
Derek Nelson (PipelineDB): 1
Josh Beemster (Snowplow): 1
Irfan Elahi (Databricks): 1
Eric Sammer (Decodable): 1
Frank Munz (Databricks): 1
Usman Masood (PipelineDB): 1
Mark Needham: 1
Margi Dubal (Freewheel, a Comcast Company): 1
Shilpi Saxena: 1
Vikram Koka (Astronomer): 1

Activities

Filtering by: Data + AI Summit 2025

Race to Real-Time: Low-Latency Streaming ETL Meets Next-Gen Databricks OLTP-DB

2025-06-11 · Data + AI Summit 2025

lightning_talk

by Irfan Elahi (Databricks)

Databricks ETL/ELT Spark Data Streaming postgresql

In today’s digital economy, real-time insights and rapid responsiveness are paramount to delivering exceptional user experiences and lowering TCO. In this session, discover a pioneering approach that leverages a low-latency streaming ETL pipeline built with Spark Structured Streaming and Databricks’ new OLTP-DB, a serverless, managed Postgres offering designed for transactional workloads. Validated in a live customer scenario, this architecture achieves sub-2-second end-to-end latency by ingesting streaming data from Kinesis and merging it into OLTP-DB. This breakthrough not only enhances performance and scalability but also provides a replicable blueprint for transforming data pipelines across various verticals. Join us as we delve into the advanced optimization techniques and best practices that underpin this innovation, demonstrating how Databricks’ next-generation solutions can revolutionize real-time data processing and unlock a myriad of new use cases across the data landscape.

Let's Save Tons of Money With Cloud-Native Data Ingestion!

2025-06-10 · Data + AI Summit 2025

talk

by Tyler Croy (Scribd, Inc.)

Airbyte AWS Aurora Azure Cloud Computing Databricks Delta GCP Kafka

Delta Lake is a fantastic technology for quickly querying massive data sets, but first you need those massive data sets! In this session we will dive into the cloud-native architecture Scribd has adopted to ingest data from AWS Aurora, SQS, Kinesis Data Firehose and more. By using off-the-shelf open source tools like kafka-delta-ingest, oxbow and Airbyte, Scribd has redefined its ingestion architecture to be more event-driven, more reliable, and most importantly, cheaper. No jobs needed! Attendees will learn how to use third-party tools in concert with a Databricks and Unity Catalog environment to provide a highly efficient and available data platform. This architecture will be presented in the context of AWS but can be adapted for Azure, Google Cloud Platform or even on-premises environments.