talk-data.com

Topic: postgresql (332 tagged)

Activity Trend: peak of 6 activities per quarter, 2020-Q1 to 2026-Q1

Activities: 332 activities · Newest first

Lakeflow Connect: Easy, Efficient Ingestion From Databases

Lakeflow Connect streamlines the ingestion of incremental data from popular databases like SQL Server and PostgreSQL. In this session, we’ll review best practices for networking, security, minimizing database load, monitoring and more — tailored to common industry scenarios. Join us to gain practical insights into Lakeflow Connect's functionality so that you’re ready to build your own pipelines. Whether you're looking to optimize data ingestion or enhance your database integrations, this session will provide you with a deep understanding of how Lakeflow Connect works with databases.

Getting Started With Lakeflow Connect

Hundreds of customers are already ingesting data with Lakeflow Connect from SQL Server, Salesforce, ServiceNow, Google Analytics, SharePoint, PostgreSQL and more to unlock the full power of their data. Lakeflow Connect introduces built-in, no-code ingestion connectors from SaaS applications, databases and file sources to help unlock data intelligence. In this demo-packed session, you’ll learn how to ingest ready-to-use data for analytics and AI with a few clicks in the UI or a few lines of code. We’ll also demonstrate how Lakeflow Connect is fully integrated with the Databricks Data Intelligence Platform for built-in governance, observability, CI/CD, automated pipeline maintenance and more. Finally, we’ll explain how to use Lakeflow Connect in combination with downstream analytics and AI tools to tackle common business challenges and drive business impact.

Future-proof your data architecture: Learn how DoorDash built a data lakehouse powered by Starburst to achieve a 20-30% faster time to insights. Akshat Nair shares lessons learned about what drove DoorDash to move beyond Snowflake and embrace the lakehouse. He will share his rationale for selecting Trino as their lakehouse query engine and why his team chose Starburst over the open source distribution. Discover how DoorDash seamlessly queries diverse sources, including Snowflake, Postgres, and data lake table formats, achieving faster data-driven decision-making at scale along with cost savings.
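
The federated querying described above can be sketched with Trino's Python client. The catalog, schema, and table names below (postgresql, iceberg, orders, order_events) are hypothetical placeholders, and the connection details would come from your own Trino or Starburst deployment.

    # A minimal sketch of a cross-source Trino query, assuming a Postgres catalog
    # named "postgresql" and an Iceberg catalog named "iceberg" are configured.
    import trino  # pip install trino

    conn = trino.dbapi.connect(
        host="trino.example.internal",  # hypothetical coordinator host
        port=8080,
        user="analyst",
        catalog="postgresql",
        schema="public",
    )
    cur = conn.cursor()

    # Join an operational Postgres table against an Iceberg table in the lakehouse.
    cur.execute("""
        SELECT o.order_id, o.status, e.event_type, e.event_ts
        FROM postgresql.public.orders AS o
        JOIN iceberg.analytics.order_events AS e
          ON o.order_id = e.order_id
        WHERE e.event_ts > current_date - INTERVAL '7' DAY
    """)
    for row in cur.fetchmany(10):
        print(row)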

Streaming data with Apache Kafka® has become the backbone of modern applications. While streams are ideal for continuous data flow, they lack built-in querying capability. Unlike databases with indexed lookups, Kafka's append-only logs are designed for high-throughput processing, not for on-demand querying. This forces teams to build additional infrastructure to enable query capabilities for streaming data. Traditional methods replicate this data into external stores: relational databases like PostgreSQL for operational workloads, and object storage like S3 queried with Flink, Spark, or Trino for analytical use cases. While sometimes useful, these methods deepen the divide between the operational and analytical estates, creating silos, complex ETL pipelines, and issues with schema mismatches, freshness, and failures.

In this session, we'll explore and see live demos of some solutions to unify the operational and analytical estates, eliminating data silos. We'll start with stream processing using Kafka Streams, Apache Flink®, and SQL implementations, then cover integration of relational databases with real-time analytics databases such as Apache Pinot® and ClickHouse. Finally, we'll dive into modern approaches like Apache Iceberg® with Tableflow, which simplifies data preparation by seamlessly representing Kafka topics and associated schemas as Iceberg or Delta tables in a few clicks. While there's no single right answer to this problem, as responsible system builders we must understand our options and trade-offs to build robust architectures.
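
As a concrete illustration of the "replicate into an external store" pattern the abstract describes, here is a minimal sketch that copies a Kafka topic into PostgreSQL with confluent-kafka and psycopg2. The topic, table, and connection settings are hypothetical placeholders.

    # Minimal sketch: drain a Kafka topic into a Postgres table for ad-hoc querying.
    # Assumes a table: CREATE TABLE orders (order_id text PRIMARY KEY, payload jsonb).
    import json
    import psycopg2                       # pip install psycopg2-binary
    from confluent_kafka import Consumer  # pip install confluent-kafka

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",   # hypothetical broker
        "group.id": "pg-replicator",
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["orders"])               # hypothetical topic

    pg = psycopg2.connect("dbname=analytics user=app password=secret host=localhost")
    cur = pg.cursor()

    try:
        while True:
            msg = consumer.poll(1.0)
            if msg is None or msg.error():
                continue
            event = json.loads(msg.value())
            # Upsert keeps the table idempotent if the consumer reprocesses offsets.
            cur.execute(
                """INSERT INTO orders (order_id, payload)
                   VALUES (%s, %s)
                   ON CONFLICT (order_id) DO UPDATE SET payload = EXCLUDED.payload""",
                (event["order_id"], json.dumps(event)),
            )
            pg.commit()
    finally:
        consumer.close()
        pg.close()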

Summary: In this episode of the Data Engineering Podcast, Tobias Macey interviews Jeremy Edberg, CEO of DBOS, about durable execution and its impact on designing and implementing business logic for data systems. Jeremy explains how DBOS's serverless platform and orchestrator provide local resilience and reduce operational overhead, ensuring exactly-once execution in distributed systems through the use of the Transact library. He discusses the importance of version management in long-running workflows and how DBOS simplifies system design by reducing infrastructure needs like queues and CI pipelines, making it beneficial for data pipelines, AI workloads, and agentic AI.

Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey and today I'm interviewing Jeremy Edberg about durable execution and how it influences the design and implementation of business logic.

Interview:
- Introduction
- How did you get involved in the area of data management?
- Can you describe what DBOS is and the story behind it?
- What is durable execution?
- What are some of the notable ways that inclusion of durable execution in an application architecture changes the ways that the rest of the application is implemented? (e.g. error handling, logic flow, etc.)
- Many data pipelines involve complex, multi-step workflows. How does DBOS simplify the creation and management of resilient data pipelines? How does durable execution impact the operational complexity of data management systems?
- One of the complexities in durable execution is managing code/data changes to workflows while existing executions are still processing. What are some of the useful patterns for addressing that challenge and how does DBOS help?
- Can you describe how DBOS is architected?
- How have the design and goals of the system changed since you first started working on it?
- What are the characteristics of Postgres that make it suitable for the persistence mechanism of DBOS?
- What are the guiding principles that you rely on to determine the boundaries between the open source and commercial elements of DBOS?
- What are the most interesting, innovative, or unexpected ways that you have seen DBOS used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on DBOS?
- When is DBOS the wrong choice?
- What do you have planned for the future of DBOS?

Contact Info: LinkedIn

Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements: Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links: DBOS, Exactly Once Semantics, Temporal, Semaphore, Postgres, DBOS Transact, Python, Typescript, Idempotency Keys, Agentic AI, State Machine, YugabyteDB (Podcast Episode), CockroachDB, Supabase, Neon (Podcast Episode), Airflow

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
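
To make the durable-execution idea discussed in this episode concrete without reproducing the DBOS Transact API (whose exact decorators are best taken from the DBOS docs), here is a hand-rolled sketch of the underlying pattern: each workflow step records its completion in a Postgres table keyed by workflow id and step name, so a restarted workflow skips work it has already done. The table and function names are hypothetical.

    # Hand-rolled sketch of exactly-once-style step execution backed by Postgres.
    # This illustrates the pattern, not the DBOS Transact API. Assumes a table:
    #   CREATE TABLE workflow_steps (
    #       workflow_id text, step_name text, output text,
    #       PRIMARY KEY (workflow_id, step_name));
    import psycopg2

    conn = psycopg2.connect("dbname=app user=app password=secret host=localhost")
    conn.autocommit = True

    def run_step(workflow_id: str, step_name: str, fn):
        """Run fn() once per (workflow_id, step_name), surviving restarts."""
        with conn.cursor() as cur:
            cur.execute(
                "SELECT output FROM workflow_steps WHERE workflow_id=%s AND step_name=%s",
                (workflow_id, step_name),
            )
            row = cur.fetchone()
            if row is not None:
                return row[0]            # already completed: return recorded output
            output = fn()                # do the actual work
            cur.execute(
                "INSERT INTO workflow_steps (workflow_id, step_name, output) VALUES (%s, %s, %s)",
                (workflow_id, step_name, output),
            )
            return output

    # Re-running the same workflow_id after a crash re-executes only the steps
    # that never recorded a completion. The primary key (plus ON CONFLICT, if
    # added) is what would harden this against concurrent runners.
    run_step("wf-123", "extract", lambda: "rows-extracted")
    run_step("wf-123", "load", lambda: "rows-loaded")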

Experience the power of AlloyDB Omni, a cutting-edge PostgreSQL-compatible database designed for multicloud and hybrid cloud environments. This session explores how AlloyDB Omni accelerates the development of modern applications, enabling generative AI experiences, efficient vector search, real-time operational analytics, and scalable transactional performance. We’ll also showcase how to run your applications on multiple clouds using Aiven’s seamless managed service, and how to supercharge hybrid cloud deployments with cloud-ready partners.
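
Since AlloyDB Omni is PostgreSQL-compatible, the vector search mentioned above can be sketched with standard pgvector SQL from Python. The table, column, and embedding values below are hypothetical, and in a real application the query vector would come from an embedding model.

    # Minimal pgvector similarity search against a PostgreSQL-compatible database.
    import psycopg2  # works against AlloyDB Omni the same way as vanilla Postgres

    conn = psycopg2.connect("dbname=shop user=app password=secret host=localhost")
    cur = conn.cursor()

    # One-time setup (requires the pgvector extension to be available).
    cur.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS products (
            id bigserial PRIMARY KEY,
            name text,
            embedding vector(3)   -- tiny dimension just for the sketch
        )
    """)
    conn.commit()

    # Nearest-neighbour lookup: "<->" is pgvector's Euclidean distance operator.
    query_vec = "[0.12, 0.98, 0.33]"   # would normally come from an embedding model
    cur.execute(
        "SELECT name FROM products ORDER BY embedding <-> %s::vector LIMIT 5",
        (query_vec,),
    )
    print(cur.fetchall())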

Time to make generative AI a reality for your application. This session is all about how to build high-performance gen AI applications fast with Cloud SQL for MySQL and PostgreSQL. Learn about Google Cloud’s innovative full-stack solutions that make gen AI app development, deployment, and operations simple and easy – even when deploying high-performance, production-grade applications. We’ll highlight best practices for getting started with Vertex AI, Cloud Run, Google Kubernetes Engine, and Cloud SQL, so that you can focus on gen AI application development from the get-go.
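
For the application side of this stack, connecting from Python to a Cloud SQL for PostgreSQL instance is typically done with the Cloud SQL Python Connector. The instance connection name and credentials below are placeholders.

    # Minimal sketch: connect to Cloud SQL for PostgreSQL with the Cloud SQL
    # Python Connector (pip install "cloud-sql-python-connector[pg8000]").
    from google.cloud.sql.connector import Connector

    connector = Connector()

    # "project:region:instance" is the instance connection name; values are placeholders.
    conn = connector.connect(
        "my-project:us-central1:my-instance",
        "pg8000",
        user="app-user",
        password="change-me",
        db="appdb",
    )

    cur = conn.cursor()
    cur.execute("SELECT version()")
    print(cur.fetchone())

    conn.close()
    connector.close()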

Build resilient, scalable applications that thrive in the face of increasing demands. Cloud SQL offers new features designed to optimize performance, availability, and cost efficiency for MySQL and PostgreSQL databases, including managed replica pools and connection pooling. Learn how to make downtime a thing of the past, implement advanced disaster recovery strategies, and maximize your application’s performance. Join our demo-packed session for a deep dive into these new Cloud SQL capabilities and best practices.
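
Cloud SQL's managed replica pools and connection pooling are configured on the instance side; they pair with a client-side pool in the application. Below is a minimal SQLAlchemy sketch of that client-side piece, with a placeholder connection string.

    # Client-side connection pooling with SQLAlchemy (complements Cloud SQL's
    # server-side managed pooling; connection string is a placeholder).
    from sqlalchemy import create_engine, text

    engine = create_engine(
        "postgresql+psycopg2://app:change-me@10.0.0.5:5432/appdb",
        pool_size=5,          # steady-state connections kept open
        max_overflow=10,      # extra connections allowed under burst load
        pool_pre_ping=True,   # detect and replace dead connections after failover
        pool_recycle=1800,    # recycle connections every 30 minutes
    )

    with engine.connect() as conn:
        print(conn.execute(text("SELECT now()")).scalar())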

Learn how Database Migration Service can help you modernize your SQL Server databases to unleash the power of cloud databases and open source PostgreSQL! Convert your SQL Server schema and T-SQL code to the PostgreSQL dialect with a click of a button in the DMS Conversion Workspace. If some objects could not be fully converted, Gemini can suggest a fix. Not yet familiar with PostgreSQL features? Ask Gemini to teach you how to convert SQL Server features to their PostgreSQL equivalents. While Gemini is there, ask it to optimize the converted code or add comments explaining the business logic. Once your database is fully converted and optimized, you can migrate the data with minimal downtime using the change data capture powered migration job and complete your migration journey.

AI is revolutionizing observability. Learn about Cloud SQL AI-powered Database Insights and how it can help you optimize your queries and boost database performance. We’ll dive deep into the new Insights capabilities for MySQL, PostgreSQL, and SQL Server, including the Gemini-powered chat agent. Learn how to troubleshoot those tricky database performance issues and get practical tips to improve the performance of your applications.
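
Database Insights layers AI on top of the engine's own statistics; for PostgreSQL, much of that raw material comes from pg_stat_statements, which you can also inspect directly. A minimal sketch is below (the extension must be enabled on the instance, connection details are placeholders, and the column names shown are the PostgreSQL 13+ ones).

    # Inspect the slowest queries via pg_stat_statements, the kind of signal
    # query-performance tooling builds on. Requires the extension to be enabled.
    import psycopg2

    conn = psycopg2.connect("dbname=appdb user=app password=secret host=localhost")
    cur = conn.cursor()
    cur.execute("""
        SELECT query, calls, mean_exec_time, total_exec_time
        FROM pg_stat_statements
        ORDER BY total_exec_time DESC
        LIMIT 10
    """)
    for query, calls, mean_ms, total_ms in cur.fetchall():
        print(f"{total_ms:10.1f} ms total | {calls:6d} calls | {query[:80]}")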

This talk demonstrates a fashion app that leverages the power of AlloyDB, Google Cloud’s fully managed PostgreSQL-compatible database, to provide users with intelligent recommendations for matching outfits. When a user uploads photos of their clothes, the app generates styling insights on how to pair the outfit, along with real-time fashion advice. This is enabled through intuitive contextual (vector) search powered by AlloyDB and Google’s ScaNN index, which delivers fast vector search results with low-latency querying and response times. While we’re at it, we’ll showcase the power of the AlloyDB columnar engine on the joins the application needs to generate style recommendations. To complete the experience, we’ll engage the Vertex AI Gemini API through Spring and LangChain4j integrations for generative recommendations and a visual representation of the personalized style. The entire application is built on the Java Spring Boot framework and deployed serverlessly on Cloud Run, ensuring scalability and cost efficiency. This talk explores how these technologies work together to create a dynamic and engaging fashion experience.
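
The ScaNN index mentioned here is created with plain SQL on AlloyDB. The hedged sketch below shows the general shape; the extension name and index options reflect my reading of the AlloyDB documentation and should be verified there, and the table, column, and values are hypothetical (the session itself uses Java Spring Boot, so Python is used only for brevity).

    # Hedged sketch: create a ScaNN index on an AlloyDB embedding column and run
    # a cosine-distance query. Extension and index options are assumptions to
    # verify against the AlloyDB docs; schema is hypothetical.
    import psycopg2

    conn = psycopg2.connect("dbname=fashion user=app password=secret host=10.0.0.7")
    cur = conn.cursor()

    cur.execute("CREATE EXTENSION IF NOT EXISTS alloydb_scann")  # AlloyDB-specific
    cur.execute("""
        CREATE INDEX IF NOT EXISTS outfits_embedding_scann
        ON outfits USING scann (embedding cosine)
        WITH (num_leaves = 100)
    """)
    conn.commit()

    # "<=>" is the cosine-distance operator used for similarity ordering.
    query_vec = "[0.05, 0.91, 0.12]"   # would come from an image/text embedding model
    cur.execute(
        "SELECT item_name FROM outfits ORDER BY embedding <=> %s::vector LIMIT 5",
        (query_vec,),
    )
    print(cur.fetchall())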

Newt Global's DMAP revolutionizes Oracle and MS-SQL to PostgreSQL migrations. This automated solution streamlines the entire process, from initial planning to final production deployment. Key advantages:
1. Container-driven parallelization: dramatically reduces migration timelines by harnessing powerful computing resources.
2. Unmatched speed: for medium-complexity databases, DMAP achieves in 12 weeks what other tools take 12 months to do, thanks to its advanced automation capabilities, including streamlined application and complex code translation.

This session is hosted by a Google Cloud Next Sponsor.

PostgreSQL is one of the most popular open source databases for application development. Furthermore, native support for vector search makes PostgreSQL an excellent choice for generative AI app development too. Learn why Google Cloud is the best place for your PostgreSQL workloads across the Cloud SQL, AlloyDB, and Spanner database offerings. Also hear from Salesloft about their journey to Google Cloud and how they selected the right database service for their workloads.

AlloyDB for PostgreSQL is Google Cloud’s high-performance, PostgreSQL-compatible database service for your demanding online transaction processing (OLTP), hybrid transactional and analytical, and generative AI applications. In this session, we’ll cover what’s new with AlloyDB, including critical enterprise features across performance, availability, ease of use, and more. You’ll also learn how customers are quickly migrating to AlloyDB with Database Migration Service and how they’ve successfully scaled their workloads with AlloyDB, saving money while doing so.

Unlock the power of generative AI and data. Join experts from LlamaIndex and Google Cloud databases and learn how to seamlessly integrate LlamaIndex with AlloyDB and Cloud SQL for PostgreSQL, enabling your apps to reason, act on your data, and leverage the performance of Google Cloud. We’ll share real-world examples and code. Discover new possibilities for building advanced gen AI applications.
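
The session covers Google's dedicated LlamaIndex integrations for AlloyDB and Cloud SQL; as a stand-in, the sketch below wires LlamaIndex to a PostgreSQL-backed vector store using the generic llama-index Postgres integration. Connection details are placeholders, and an embedding model (for example OpenAI via OPENAI_API_KEY) is assumed to be configured for LlamaIndex.

    # Minimal sketch: index documents into a Postgres-backed vector store with
    # LlamaIndex (generic integration shown; the session discusses the dedicated
    # AlloyDB / Cloud SQL integrations). Connection details are placeholders.
    from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
    from llama_index.vector_stores.postgres import PGVectorStore  # pip install llama-index-vector-stores-postgres

    vector_store = PGVectorStore.from_params(
        database="appdb",
        host="10.0.0.9",
        port="5432",
        user="app",
        password="change-me",
        table_name="docs_vectors",
        embed_dim=1536,            # must match the embedding model's dimension
    )

    documents = SimpleDirectoryReader("./docs").load_data()
    storage_context = StorageContext.from_defaults(vector_store=vector_store)
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

    # Ask a question grounded in the ingested documents.
    print(index.as_query_engine().query("What does our return policy say?"))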

Migrate PostgreSQL to AlloyDB with Database Migration Service (DMS). This hands-on lab covers setting up Virtual Private Cloud (VPC) peering, configuring a continuous migration job, migrating data, and verifying the results. Learn to propagate live updates to your AlloyDB instance.


Is siloed data hindering your operations? Learn how Nuro consolidated their transactional, relational, and vector data sets on AlloyDB for PostgreSQL, and how they’re now able to do operational analysis, real-time analytics, and business intelligence (BI) reporting on the same platform. Join this panel session to discover best practices for unifying vector and relational data, and learn how Nuro can now satisfy their self-driving vehicle analytics use cases in a cost-effective way.
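
The "vector plus relational on one platform" idea boils down to SQL like the hybrid query sketched below, where an ordinary WHERE clause and a pgvector ORDER BY run in the same statement. The schema, values, and connection details are hypothetical.

    # Hybrid query sketch: relational filters and vector similarity in one statement
    # (works on any pgvector-enabled Postgres, including AlloyDB). Schema is hypothetical.
    import psycopg2

    conn = psycopg2.connect("dbname=fleet user=app password=secret host=localhost")
    cur = conn.cursor()

    query_vec = "[0.3, 0.1, 0.7]"   # e.g. an embedding of a scene or event description
    cur.execute(
        """
        SELECT trip_id, vehicle_id, recorded_at
        FROM trip_events
        WHERE vehicle_id = %s
          AND recorded_at >= now() - interval '24 hours'
        ORDER BY embedding <-> %s::vector
        LIMIT 20
        """,
        ("veh-042", query_vec),
    )
    for row in cur.fetchall():
        print(row)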

This session presents the migration by Schnucks, a Midwest grocer, of its e-commerce application from Oracle Database to Cloud SQL for PostgreSQL. It will cover challenges such as addressing the complexities of even "simple" schemas, testing data movement options to minimize downtime, and transforming the database tier. Hear about the business impact, including cost savings, increased database-application proximity, and the potential for similar future migrations, allowing for direct integration options from Google Cloud SQL to Google Gemini AI.

This session is hosted by a Google Cloud Next Sponsor.