talk-data.com

Topic: MySQL

Tags: relational_database, open_source, sql

2 tagged activities

Activity Trend: 27 peak/qtr (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Data + AI Summit 2025
Creating a Custom PySpark Stream Reader with PySpark 4.0

PySpark supports many data sources out of the box, such as Apache Kafka, JDBC, ODBC, and Delta Lake. However, some older systems, such as those that use the JMS protocol, are not supported by default and require considerable extra work for developers to read from. One such example is streaming from ActiveMQ. Traditionally, ActiveMQ users have had to rely on a middle-man to read the stream with Spark (for example, writing to a MySQL database with Java code and then reading that table back with Spark JDBC). With PySpark 4.0’s custom data sources (supported in DBR 15.3+), we can cut out the middle-man and consume the queues directly from PySpark, in batch or with Spark Structured Streaming, saving developers considerable time and complexity in getting source data into Delta Lake, governed by Unity Catalog, and orchestrated with Databricks Workflows.
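As a rough illustration of what such a source looks like, here is a minimal sketch of the PySpark 4.0 Python data source API (pyspark.sql.datasource) with a streaming reader. The broker-facing fetch_messages() helper, the option names (broker_url, queue), and the two-column schema are illustrative assumptions, not the session’s actual implementation; a real reader would use an ActiveMQ client (for example over STOMP) in its place.

from pyspark.sql.datasource import DataSource, DataSourceStreamReader, InputPartition


def fetch_messages(broker_url, queue, start, end):
    # Hypothetical stand-in for a real queue client (e.g. a STOMP consumer):
    # return the messages with sequence numbers in [start, end).
    return [(i, f"message-{i}") for i in range(start, end)]


class QueuePartition(InputPartition):
    def __init__(self, start, end):
        self.start = start
        self.end = end


class ActiveMQStreamReader(DataSourceStreamReader):
    def __init__(self, options):
        self.broker_url = options.get("broker_url", "tcp://localhost:61616")
        self.queue = options.get("queue", "events")
        self.current = 0

    def initialOffset(self):
        # Offsets are plain JSON-serializable dicts tracked by the engine.
        return {"offset": 0}

    def latestOffset(self):
        # A real reader would ask the broker how far it can read; this sketch
        # simply advances a fixed batch size per micro-batch.
        self.current += 10
        return {"offset": self.current}

    def partitions(self, start, end):
        # One partition covering this micro-batch's offset range.
        return [QueuePartition(start["offset"], end["offset"])]

    def read(self, partition):
        # Runs on executors; yield tuples matching the declared schema.
        for msg_id, body in fetch_messages(self.broker_url, self.queue,
                                           partition.start, partition.end):
            yield (msg_id, body)

    def commit(self, end):
        # Optionally acknowledge/clean up everything at or before `end`.
        pass


class ActiveMQDataSource(DataSource):
    @classmethod
    def name(cls):
        return "activemq"

    def schema(self):
        return "msg_id INT, body STRING"

    def streamReader(self, schema):
        return ActiveMQStreamReader(self.options)


# Register and consume like any built-in format, e.g. in a notebook:
#   spark.dataSource.register(ActiveMQDataSource)
#   df = (spark.readStream.format("activemq")
#         .option("broker_url", "tcp://broker:61616")
#         .option("queue", "orders")
#         .load())

The offset bookkeeping lives on the driver, while read() is the only piece shipped to executors, which is why the partition object has to carry everything needed to pull its slice of the queue.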

Sponsored by: Fivetran | Raw Data to Real-Time Insights: How Dropbox Revolutionized Data Ingestion

Dropbox, a leading cloud storage platform, is on a mission to accelerate data insights to better understand customers’ needs and elevate the overall customer experience. By leveraging Fivetran’s data movement platform, Dropbox gained real-time visibility into customer sentiment, marketing ROI, and ad performance, empowering teams to optimize spend, improve operational efficiency, and deliver greater business outcomes. Join this session to learn how Dropbox:

- Cut data pipeline time from 8 weeks to 30 minutes by automating ingestion and streamlining reporting workflows.
- Enabled real-time, reliable data movement across tools like Zendesk Chat, Google Ads, MySQL, and more, at the scale of global operations.
- Unified fragmented data sources into the Databricks Data Intelligence Platform to reduce redundancy, improve accessibility, and support scalable analytics.