talk-data.com

Topic

Amazon RDS

Tags: relational_database, cloud, aws

17 tagged

Activity Trend: peak of 4 activities per quarter (2020-Q1 to 2026-Q1)

Activities

17 activities · Newest first

AWS re:Invent 2025 - Cut costs & operate efficiently on Amazon RDS for SQL Server & Oracle (DAT325)

Discover how leading enterprises are leveraging 15+ years of Amazon RDS operational excellence to power their SQL Server and Oracle databases in the cloud. In this session, explore features across Amazon RDS for SQL Server and Oracle that help you achieve substantial cost savings, enhanced scalability, and efficient operations. Through real-world cost optimization techniques and architectural best practices, learn how organizations are reducing operational overhead and costs while improving availability, scalability, and performance.
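
The session abstract stays high level; as a concrete starting point, here is a minimal sketch (not from the session) that flags rightsizing candidates by checking two weeks of CloudWatch CPU data for RDS SQL Server and Oracle instances. The 20% threshold and the engine list are illustrative assumptions.

```python
from datetime import datetime, timedelta

import boto3

rds = boto3.client("rds")
cloudwatch = boto3.client("cloudwatch")

SQL_SERVER_AND_ORACLE = ("sqlserver-ee", "sqlserver-se", "sqlserver-ex",
                         "sqlserver-web", "oracle-ee", "oracle-se2")

for db in rds.describe_db_instances()["DBInstances"]:
    if db["Engine"] not in SQL_SERVER_AND_ORACLE:
        continue
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "DBInstanceIdentifier",
                     "Value": db["DBInstanceIdentifier"]}],
        StartTime=datetime.utcnow() - timedelta(days=14),
        EndTime=datetime.utcnow(),
        Period=86400,          # one datapoint per day
        Statistics=["Average"],
    )
    points = stats["Datapoints"]
    avg = sum(p["Average"] for p in points) / len(points) if points else 0.0
    if avg < 20:  # illustrative threshold for an underutilized instance
        print(f"{db['DBInstanceIdentifier']} ({db['DBInstanceClass']}): "
              f"avg CPU {avg:.1f}%, rightsizing candidate")
```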

AWS re:Invent 2025 - Deep dive into databases zero-ETL integrations (DAT445)

In this session, learn how AWS zero-ETL integrations remove the need to manage complex data movement pipelines across multiple source database engines and targets, so data engineers, architects, and DBAs can eliminate maintenance overhead while ensuring near real-time data availability for analytics and ML workloads. Examine the underlying architecture and how it works for the supported zero-ETL integrations from Amazon Aurora, Amazon DynamoDB, and Amazon RDS sources to Amazon Redshift, Amazon SageMaker, and Amazon OpenSearch Service targets, all without traditional ETL complexity. Dive into the data movement options, the tunable settings, and how to monitor ongoing data movement.
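
As a small illustration of the monitoring angle, the sketch below lists existing zero-ETL integrations and their replication status via the RDS DescribeIntegrations API; the response field names follow the current boto3 shape and should be verified against your SDK version.

```python
import boto3

rds = boto3.client("rds")

# Enumerate zero-ETL integrations and report their status
# (e.g. "creating", "active", "failed").
for integration in rds.describe_integrations()["Integrations"]:
    print(
        integration["IntegrationName"],
        integration["SourceArn"],   # Aurora/RDS/DynamoDB source
        integration["TargetArn"],   # Redshift (or other supported) target
        integration["Status"],
    )
```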

AWS re:Invent 2025 - Revealing the northern lights: Amazon Aurora security deep dive (DAT456)

Researchers at the DEF CON 33 security conference shared that Amazon RDS successfully defended against their novel attack, even though the underlying open source database engine had not released patches at the time. This is the result of years of consistent investment in security and isolation, enabling customers to focus on their applications instead of managing databases. We dive deep into the engineering and layered security architecture of Amazon Aurora, examining how multiple complementary layers, including encryption, network controls, granular permissions, AWS IAM, active defense, and Amazon GuardDuty, work together to protect customers' data from the infrastructure through the application interface.
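
One of the layers the abstract names, AWS IAM, can replace static database passwords entirely. A minimal sketch, assuming an Aurora PostgreSQL cluster with IAM database authentication enabled; the endpoint and user name are hypothetical.

```python
import boto3
import psycopg2

ENDPOINT = "mycluster.cluster-abc123.us-east-1.rds.amazonaws.com"  # hypothetical

# Exchange IAM credentials for a short-lived (15-minute) auth token.
rds = boto3.client("rds", region_name="us-east-1")
token = rds.generate_db_auth_token(
    DBHostname=ENDPOINT,
    Port=5432,
    DBUsername="app_user",  # database user GRANTed the rds_iam role
)

# Use the token in place of a password; IAM auth requires SSL.
conn = psycopg2.connect(
    host=ENDPOINT,
    port=5432,
    user="app_user",
    password=token,
    dbname="postgres",
    sslmode="require",
)
```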

AWS re:Invent 2025 - Boost performance and reduce costs in Amazon Aurora and Amazon RDS (DAT312)

In this session, explore Amazon Aurora and Amazon RDS cost components and learn important best practices that can help you improve the performance of your relational database workloads while reducing spend on cost components such as compute, storage, backup, and I/O. Learn about the latest performance monitoring features, which provide insights on how to efficiently track and optimize database performance at scale and help ensure that you’re maximizing efficiency while keeping costs under control.
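
On the monitoring side, database load can be pulled programmatically from Performance Insights. A sketch, assuming Performance Insights is enabled on the instance; the identifier is a hypothetical DbiResourceId (not the instance name).

```python
from datetime import datetime, timedelta

import boto3

pi = boto3.client("pi")

response = pi.get_resource_metrics(
    ServiceType="RDS",
    Identifier="db-ABCDEFGHIJKLMNOP",           # hypothetical DbiResourceId
    MetricQueries=[{"Metric": "db.load.avg"}],  # average active sessions
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    PeriodInSeconds=60,
)

for metric in response["MetricList"]:
    for point in metric["DataPoints"]:
        print(point["Timestamp"], point.get("Value"))
```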

AWS re:Invent 2024 - Analyze Amazon Aurora & RDS data in Amazon Redshift with zero-ETL (DAT331)

Discover the power of Amazon Aurora and Amazon RDS zero-ETL integrations with Amazon Redshift. Zero-ETL integrations help unify your data across applications and data sources for holistic insights. This session explores how Amazon Aurora and Amazon RDS zero-ETL integrations with Amazon Redshift remove the need to build and manage complex data pipelines, enabling analytics and machine learning using Amazon Redshift on petabytes of transactional data from your relational databases. In this session, learn about key zero-ETL integration functionalities like data filtering, AWS CloudFormation support, and more.
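
To make the data-filtering feature concrete, here is a hedged sketch that creates an Aurora-to-Redshift zero-ETL integration replicating only selected tables. The ARNs are hypothetical and the filter grammar is approximate; check the current documentation for the exact syntax.

```python
import boto3

rds = boto3.client("rds")

rds.create_integration(
    IntegrationName="orders-to-redshift",
    # Hypothetical source and target ARNs:
    SourceArn="arn:aws:rds:us-east-1:123456789012:cluster:orders-cluster",
    TargetArn="arn:aws:redshift-serverless:us-east-1:123456789012:namespace/analytics",
    # Replicate the orders schema but leave out the audit table.
    DataFilter="include: ordersdb.orders.*, exclude: ordersdb.orders.audit_log",
)
```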

AWS re:Invent 2024 - Boost performance and reduce costs in Amazon Aurora and Amazon RDS (DAT315)

In this session, explore Amazon Aurora and Amazon RDS cost components and learn important best practices that can help you improve the performance of your relational database workloads while reducing spend on cost components such as compute, storage, backup, and I/O. Learn about the latest performance monitoring features, which provide insights on how to efficiently track and optimize database performance at scale and help ensure that you’re maximizing efficiency while keeping costs under control.

AWS re:Invent 2024 - Accelerate migrations using AWS DMS Schema Conversion with gen AI (DAT347-NEW)

Discover how AWS is reshaping database migrations with generative AI. AWS DMS now uses generative AI in Amazon Bedrock to improve automation of schema conversion, which reduces manual effort and accelerates migrations to fully managed services like Amazon Aurora and Amazon RDS. AWS DMS is a managed migration service that helps move your database and analytics workloads to AWS quickly, securely, and with minimal downtime and zero data loss. This session dives deep into the architectural decisions behind schema conversion with generative AI and includes demos of this innovation in action.
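
The generative-AI schema conversion itself is driven through DMS Schema Conversion rather than a single API call, but the surrounding migration is plain DMS. As context, a minimal sketch of a full-load-plus-CDC replication task using the long-standing DMS API; all ARNs are hypothetical.

```python
import json

import boto3

dms = boto3.client("dms")

table_mappings = {
    "rules": [{
        "rule-type": "selection",
        "rule-id": "1",
        "rule-name": "include-hr-schema",
        "object-locator": {"schema-name": "HR", "table-name": "%"},
        "rule-action": "include",
    }]
}

dms.create_replication_task(
    ReplicationTaskIdentifier="oracle-to-aurora",
    # Hypothetical endpoint and instance ARNs:
    SourceEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:SRC123",
    TargetEndpointArn="arn:aws:dms:us-east-1:123456789012:endpoint:TGT456",
    ReplicationInstanceArn="arn:aws:dms:us-east-1:123456789012:rep:INST789",
    MigrationType="full-load-and-cdc",  # initial copy, then ongoing changes
    TableMappings=json.dumps(table_mappings),
)
```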

AWS re:Invent 2024 - Amazon Aurora HA and DR design patterns for global resilience (DAT304)

Amazon Aurora is a fully managed relational database designed for unparalleled high performance and availability at global scale with full MySQL and PostgreSQL compatibility. Aurora provides managed high availability (HA) and disaster recovery (DR) capabilities in and across AWS Regions. In this session, explore the Aurora HA and DR capabilities and discover design patterns that enable the development of resilient applications. Learn how to establish in-Region and cross-Region HA and DR using Aurora features, including Multi-AZ deployments, Aurora Global Database, and Amazon RDS Proxy, and discover how to reduce failover times with a JDBC driver.
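
To ground the cross-Region DR pattern, a sketch of a planned switchover of an Aurora Global Database using boto3; identifiers are hypothetical, and the unplanned-outage variant is noted in the comment.

```python
import boto3

rds = boto3.client("rds")

# Planned, zero-data-loss switchover to a secondary Region
# (the old primary is demoted and resynchronized automatically).
rds.switchover_global_cluster(
    GlobalClusterIdentifier="my-global-cluster",  # hypothetical
    TargetDbClusterIdentifier=(
        "arn:aws:rds:eu-west-1:123456789012:cluster:my-secondary"  # hypothetical
    ),
)

# For an actual Regional outage, failover_global_cluster() recovers to the
# secondary with data loss bounded by replication lag (AllowDataLoss=True).
```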

Hands-On MySQL Administration

Geared to intermediate- to advanced-level DBAs and IT professionals looking to enhance their MySQL skills, this guide provides a comprehensive overview of how to manage and optimize MySQL databases. You'll learn how to create databases and implement backup and recovery, security configurations, high availability, scaling techniques, and performance tuning. Using practical techniques, tips, and real-world examples, authors Arunjith Aravindan and Jeyaram Ayyalusamy show you how to deploy and manage MySQL, Amazon RDS, Amazon Aurora, and Azure MySQL. By the end of the book, you'll have the knowledge and skills necessary to administer, manage, and optimize MySQL databases effectively.

Design and implement a scalable and reliable database infrastructure using MySQL 8, on premises and in the cloud
Install and configure software, manage user accounts, and optimize database performance
Use backup and recovery strategies, security measures, and high availability solutions
Apply best practices for database schema design, indexing strategies, and replication techniques
Implement advanced database features and techniques such as replication, clustering, load balancing, and high availability
Troubleshoot common issues and errors, using diagnostic tools and techniques to identify and resolve problems quickly and efficiently
Facilitate major MySQL upgrades, including MySQL 5.7 to MySQL 8

Processing Delta Lake Tables on AWS Using AWS Glue, Amazon Athena, and Amazon Redshift

Delta Lake is an open source project that helps implement modern data lake architectures, commonly built on cloud object storage. With Delta Lake, you can achieve ACID transactions, time travel queries, change data capture (CDC), and other common use cases on the cloud.

There are many use cases for Delta tables on AWS. AWS has invested heavily in this technology, and Delta Lake is now available with multiple AWS services, such as AWS Glue Spark jobs, Amazon EMR, Amazon Athena, and Amazon Redshift Spectrum. AWS Glue is a serverless, scalable data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources. With AWS Glue, you can easily ingest data from multiple data sources, such as on-premises databases, Amazon RDS, Amazon DynamoDB, and MongoDB, into Delta Lake on Amazon S3, even without coding expertise.

This session demonstrates how to get started with processing Delta Lake tables on Amazon S3 using AWS Glue, and how to query them from Amazon Athena and Amazon Redshift. The session also covers recent AWS service updates related to Delta Lake.

Talk by: Noritaka Sekiyama and Akira Ajisaka
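
Not from the talk itself, but as a rough sketch of the pattern under discussion: a Glue 4.0 Spark job (assumed to be started with --datalake-formats delta so the Delta Lake libraries are available) that writes a Delta table to S3, which can then be queried from Athena or Redshift Spectrum. Paths are hypothetical.

```python
from awsglue.context import GlueContext
from pyspark.context import SparkContext

glue_context = GlueContext(SparkContext.getOrCreate())
spark = glue_context.spark_session

# Hypothetical raw input landed on S3.
df = spark.read.json("s3://my-bucket/raw/orders/")

# Write (append) a Delta table to a hypothetical S3 location.
(df.write.format("delta")
   .mode("append")
   .save("s3://my-bucket/delta/orders/"))
```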

Using DMS and DLT for Change Data Capture

Bringing relational data store (RDS) data into your data lake is a critical process for enabling downstream use cases. By leveraging AWS Database Migration Service (DMS) and Databricks Delta Live Tables (DLT), we can simplify change data capture from your RDS databases. In this talk, we break down this complex process by discussing the fundamentals and best practices, and a demo brings it all together.

Talk by: Neil Patel and Ganesh Chand
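
A hedged sketch of the DMS + DLT pattern the talk describes: DMS lands change files on S3 with an Op column marking inserts, updates, and deletes; Auto Loader streams them in; and dlt.apply_changes merges them into a target table. The S3 path and the ordering column are hypothetical.

```python
import dlt
from pyspark.sql.functions import col

# Stream DMS change files from a hypothetical S3 target path via Auto Loader.
@dlt.view
def orders_cdc():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "parquet")
        .load("s3://my-bucket/dms-output/orders/")
    )

dlt.create_streaming_table("orders")

# Merge the change feed into the target table, newest change wins.
dlt.apply_changes(
    target="orders",
    source="orders_cdc",
    keys=["order_id"],                  # primary key in the source table
    sequence_by=col("commit_ts"),       # hypothetical ordering column from DMS
    apply_as_deletes=col("Op") == "D",  # DMS marks deletes with Op = 'D'
)
```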

Practical Database Auditing for Microsoft SQL Server and Azure SQL: Troubleshooting, Regulatory Compliance, and Governance

Know how to track changes and key events in your SQL Server databases in support of application troubleshooting, regulatory compliance, and governance. This book shows how to use key features in SQL Server, such as SQL Server Audit and Extended Events, to track schema changes, permission changes, and changes to your data. You'll even learn how to track queries run against specific tables in a database. Not all changes and events can be captured and tracked using SQL Server Audit and Extended Events, and the book goes beyond those features to also show what can be captured using common criteria compliance, change data capture, temporal tables, or querying the SQL Server log. You will learn how to audit just what you need to audit, and how to audit pretty much anything that happens on a SQL Server instance. This book will also help you set up cloud auditing, with an emphasis on Azure SQL Database, Azure SQL Managed Instance, and AWS RDS SQL Server. You don't need expensive, third-party auditing tools to make auditing work for you, and to demonstrate and provide value back to your business. This book will help you set up an auditing solution that works for you and your needs. It shows how to collect the audit data that you need, centralize that data for easy reporting, and generate audit reports using built-in SQL Server functionality for use by your own team, developers, and organization's auditors.

What You Will Learn
Understand why auditing is important for troubleshooting, compliance, and governance
Track changes and key events using SQL Server Audit and Extended Events
Track SQL Server configuration changes for governance and troubleshooting
Utilize change data capture and temporal tables to track data changes in SQL Server tables
Centralize auditing data from all your databases for easy querying and reporting
Configure auditing on Azure SQL, Azure SQL Managed Instance, and AWS RDS SQL Server

Who This Book Is For
Database administrators who need to know what's changing on their database servers, and those who are making the changes; database-savvy DevOps engineers and developers who are charged with troubleshooting processes and applications; developers and administrators who are responsible for generating reports in support of regulatory compliance reporting and auditing
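
For a taste of the core feature the book leans on, here is a hedged sketch that creates a server audit and an audit specification over pyodbc. The connection string and file path are hypothetical, and managed platforms such as Amazon RDS expose auditing through their own option settings rather than arbitrary file paths.

```python
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myserver;UID=dba;PWD=...;Encrypt=yes"  # hypothetical connection
)
conn.autocommit = True  # run the audit DDL outside an explicit transaction
cursor = conn.cursor()

# Server-level audit destination (file path is hypothetical).
cursor.execute(
    "CREATE SERVER AUDIT ChangeAudit "
    "TO FILE (FILEPATH = N'/var/opt/mssql/audit/');"
)
cursor.execute("ALTER SERVER AUDIT ChangeAudit WITH (STATE = ON);")

# Capture schema changes (DDL) across the instance.
cursor.execute(
    "CREATE SERVER AUDIT SPECIFICATION SchemaChanges "
    "FOR SERVER AUDIT ChangeAudit "
    "ADD (SCHEMA_OBJECT_CHANGE_GROUP) "
    "WITH (STATE = ON);"
)
```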

SQL Server Advanced Troubleshooting and Performance Tuning

This practical book provides a comprehensive overview of troubleshooting and performance tuning best practices for Microsoft SQL Server. Database engineers, including database developers and administrators, will learn how to identify performance issues, troubleshoot the system in a holistic fashion, and properly prioritize tuning efforts to attain the best system performance possible. Author Dmitri Korotkevitch, Microsoft Data Platform MVP and Microsoft Certified Master (MCM), explains the interdependencies between SQL Server database components. You'll learn how to quickly diagnose your system and discover the root cause of any issue. Techniques in this book are compatible with all versions of SQL Server and cover both on-premises and cloud-based SQL Server installations.

Discover how performance issues present themselves in SQL Server
Learn about SQL Server diagnostic tools, methods, and technologies
Perform health checks on SQL Server installations
Learn the dependencies between SQL Server components
Tune SQL Server to improve performance and reduce bottlenecks
Detect poorly optimized queries and inefficiencies in query execution plans
Find inefficient indexes and common database design issues
Use these techniques with Microsoft Azure SQL databases, Azure SQL Managed Instances, and Amazon RDS for SQL Server
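
In the spirit of the book's approach to detecting poorly optimized queries, a small sketch (not from the book) that pulls the top cached statements by CPU time from the plan-cache DMVs; the DSN is hypothetical.

```python
import pyodbc

conn = pyodbc.connect("DSN=sqlserver")  # hypothetical DSN
cursor = conn.cursor()

# Top 5 cached statements by total CPU (total_worker_time is in microseconds).
cursor.execute("""
    SELECT TOP 5
        qs.total_worker_time / 1000 AS total_cpu_ms,
        qs.execution_count,
        SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1, 200) AS query_text
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    ORDER BY qs.total_worker_time DESC;
""")

for row in cursor.fetchall():
    print(row.total_cpu_ms, row.execution_count, row.query_text)
```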

Summary

The past year has been an active one for the timeseries market. New products have been launched, more businesses have moved to streaming analytics, and the team at Timescale has been keeping busy. In this episode the TimescaleDB CEO Ajay Kulkarni and CTO Michael Freedman stop by to talk about their 1.0 release, how the use cases for timeseries data have proliferated, and how they are continuing to simplify the task of processing your time oriented events.
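
The episode itself is conversational; for readers new to TimescaleDB, here is a minimal sketch of its core primitive, the hypertable: a regular PostgreSQL table that the extension auto-partitions by time. The connection details and schema are hypothetical.

```python
import psycopg2

conn = psycopg2.connect("dbname=metrics user=postgres")  # hypothetical
cur = conn.cursor()

cur.execute("""
    CREATE TABLE conditions (
        time        TIMESTAMPTZ NOT NULL,
        device_id   TEXT,
        temperature DOUBLE PRECISION
    );
""")

# Convert the plain table into a hypertable partitioned on the time column;
# inserts and queries keep using ordinary SQL against "conditions".
cur.execute("SELECT create_hypertable('conditions', 'time');")
conn.commit()
```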

Introduction

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch.
To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat.
Your host is Tobias Macey and today I’m welcoming Ajay Kulkarni and Mike Freedman back to talk about how TimescaleDB has grown and changed over the past year.

Interview

Introduction
How did you get involved in the area of data management?
Can you refresh our memory about what TimescaleDB is?
How has the market for timeseries databases changed since we last spoke?
What has changed in the focus and features of the TimescaleDB project and company?
Toward the end of 2018 you launched the 1.0 release of Timescale. What were your criteria for establishing that milestone?

What were the most challenging aspects of reaching that goal?

In terms of timeseries workloads, what are some of the factors that differ across varying use cases?

How do those differences impact the ways in which Timescale is used by the end user, and built by your team?

What are some of the initial assumptions that you made while first launching Timescale that have held true, and which have been disproven? How have the improvements and new features in the recent releases of PostgreSQL impacted the Timescale product?

Have you been able to leverage some of the native improvements to simplify your implementation? Are there any use cases for Timescale that would have been previously impractical in vanilla Postgres that would now be reasonable without the help of Timescale?

What is in store for the future of the Timescale product and organization?

Contact Info

Ajay

@acoustik on Twitter LinkedIn

Mike

LinkedIn Website @michaelfreedman on Twitter

Timescale

Website Documentation Careers timescaledb on GitHub @timescaledb on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

TimescaleDB Original Appearance on the Data Engineering Podcast 1.0 Release Blog Post PostgreSQL

Podcast Interview

RDS DB-Engines MongoDB IOT (Internet Of Things) AWS Timestream Kafka Pulsar

Podcast Episode

Spark

Podcast Episode

Flink

Podcast Episode

Hadoop DevOps PipelineDB

Podcast Interview

Grafana Tableau Prometheus OLTP (Online Transaction Processing) Oracle DB Data Lake

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Support Data Engineering Podcast

Summary

As communications between machines become more commonplace the need to store the generated data in a time-oriented manner increases. The market for timeseries data stores has many contenders, but they are not all built to solve the same problems or to scale in the same manner. In this episode the founders of TimescaleDB, Ajay Kulkarni and Mike Freedman, discuss how Timescale was started, the problems that it solves, and how it works under the covers. They also explain how you can start using it in your infrastructure and their plans for the future.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure.
When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at dataengineeringpodcast.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
You can help support the show by checking out the Patreon page which is linked from the site.
To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers.
Your host is Tobias Macey and today I’m interviewing Ajay Kulkarni and Mike Freedman about TimescaleDB, a scalable timeseries database built on top of PostgreSQL.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by explaining what Timescale is and how the project got started?
The landscape of time series databases is extensive and oftentimes difficult to navigate. How do you view your position in that market and what makes Timescale stand out from the other options?
In your blog post that explains the design decisions for how Timescale is implemented, you call out the fact that the inserted data is largely append-only, which simplifies index management. How does Timescale handle out-of-order timestamps, such as from infrequently connected sensors or mobile devices?
How is Timescale implemented and how has the internal architecture evolved since you first started working on it?

What impact has the PostgreSQL 10 release had on the design of the project?
Is Timescale compatible with systems such as Amazon RDS or Google Cloud SQL?

For someone who wants to start using Timescale what is involved in deploying and maintaining it? What are the axes for scaling Timescale and what are the points where that scalability breaks down?

Are you aware of anyone who has deployed it on top of Citus for scaling horizontally across instances?

What has been the most challenging aspect of building and marketing Timescale?
When is Timescale the wrong tool to use for time series data?
One of the use cases that you call out on your website is for systems metrics and monitoring. How does Timescale fit into that ecosystem and can it be used along with tools such as Graphite or Prometheus?
What are some of the most interesting uses of Timescale that you have seen?
Which came first, Timescale the business or Timescale the database, and what is your strategy for ensuring that the open source project and the company around it both maintain their health?
What features or improvements do you have planned for future releases of Timescale?

Contact Info

Ajay

LinkedIn @acoustik on Twitter Timescale Blog

Mike

Website LinkedIn @michaelfreedman on Twitter Timescale Blog

Timescale

Website @timescaledb on Twitter GitHub

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Timescale PostgreSQL Citus Timescale Design Blog Post MIT NYU Stanford SDN Princeton Machine Data Timeseries Data List of Timeseries Databases NoSQL Online Transaction Processing (OLTP) Object Relational Mapper (ORM) Grafana Tableau Kafka When Boring Is Awesome PostgreSQL RDS Google Cloud SQL Azure DB Docker Continuous Aggregates Streaming Replication PGPool II Kubernetes Docker Swarm Citus Data

Website Data Engineering Podcast Interview

Database Indexing B-Tree Index GIN Index GIST Index STE Energy Redis Graphite Prometheus pg_prometheus OpenMetrics Standard Proposal Timescale Parallel Copy Hadoop PostGIS KDB+ DevOps Internet of Things MongoDB Elastic DataBricks Apache Spark Confluent New Enterprise Associates MapD Benchmark Ventures Hortonworks 2σ Ventures CockroachDB Cloudflare EMC Timescale Blog: Why SQL is beating NoSQL, and what this means for the future of data

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Summary

PostgreSQL has become one of the most popular and widely used databases, and for good reason. The level of extensibility that it supports has allowed it to be used in virtually every environment. At Citus Data they have built an extension to support running it in a distributed fashion across large volumes of data with parallelized queries for improved performance. In this episode Ozgun Erdogan, the CTO of Citus, and Craig Kerstiens, Citus product manager, discuss how the company got started, the work that they are doing to scale out PostgreSQL, and how you can start using it in your environment.
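
The interview is narrative, but Citus's core operation is compact enough to show: after loading the extension, a table is sharded across worker nodes by a distribution column. A minimal sketch with a hypothetical connection and schema.

```python
import psycopg2

conn = psycopg2.connect("dbname=app user=postgres")  # hypothetical coordinator
cur = conn.cursor()

cur.execute("CREATE EXTENSION IF NOT EXISTS citus;")
cur.execute("""
    CREATE TABLE events (
        user_id    BIGINT NOT NULL,
        event_time TIMESTAMPTZ NOT NULL,
        payload    JSONB
    );
""")

# Shard events across the cluster by user_id: queries filtered on the
# distribution column route to one shard, others fan out in parallel.
cur.execute("SELECT create_distributed_table('events', 'user_id');")
conn.commit()
```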

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure.
When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at dataengineeringpodcast.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show.
Continuous delivery lets you get new features in front of your users as fast as possible without introducing bugs or breaking production, and GoCD is the open source platform made by the people at Thoughtworks who wrote the book about it. Go to dataengineeringpodcast.com/gocd to download and launch it today. Enterprise add-ons and professional support are available for added peace of mind.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch.
You can help support the show by checking out the Patreon page which is linked from the site.
To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers.
Your host is Tobias Macey and today I’m interviewing Ozgun Erdogan and Craig Kerstiens about Citus, worry-free PostgreSQL.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what Citus is and how the project got started?
Why did you start with Postgres vs. building something from the ground up?
What was the reasoning behind converting Citus from a fork of Postgres to being an extension and releasing an open source version?
How well does Citus work with other Postgres extensions, such as PostGIS, PipelineDB, or Timescale?
How does Citus compare to options such as Postgres-XL or the Postgres-compatible Aurora service from Amazon?
How does Citus operate under the covers to enable clustering and replication across multiple hosts?
What are the failure modes of Citus and how does it handle loss of nodes in the cluster?
For someone who is interested in migrating to Citus, what is involved in getting it deployed and moving the data out of an existing system?
How do the different options for leveraging Citus compare to each other and how do you determine which features to release or withhold in the open source version?
Are there any use cases that Citus enables which would be impractical to attempt in native Postgres?
What have been some of the most challenging aspects of building the Citus extension?
What are the situations where you would advise against using Citus?
What are some of the most interesting or impressive uses of Citus that you have seen?
What are some of the features that you have planned for future releases of Citus?

Contact Info

Citus Data

citusdata.com @citusdata on Twitter citusdata on GitHub

Craig

Email Website @craigkerstiens on Twitter

Ozgun

Email ozgune on GitHub

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Citus Data PostgreSQL NoSQL Timescale SQL blog post PostGIS PostgreSQL Graph Database JSONB Data Type PipelineDB Timescale Postgres-XL Aurora Postgres Amazon RDS Streaming Replication CitusMX CTE (Common Table Expression) HipMunk Citus Sharding Blog Post Wal-e Wal-g Heap Analytics HyperLogLog C-Store

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Effective Business Intelligence with QuickSight
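
As a small, hedged companion to the book's AWS-integration theme, this sketch lists the QuickSight data sources (e.g. S3, RDS) and dashboards in an account via boto3; the account ID is hypothetical.

```python
import boto3

quicksight = boto3.client("quicksight")
ACCOUNT_ID = "123456789012"  # hypothetical AWS account ID

# Data sources registered in QuickSight (S3, RDS, Athena, ...).
for ds in quicksight.list_data_sources(AwsAccountId=ACCOUNT_ID)["DataSources"]:
    print("data source:", ds.get("Name"), ds.get("Type"))

# Published dashboards in the account.
resp = quicksight.list_dashboards(AwsAccountId=ACCOUNT_ID)
for dash in resp["DashboardSummaryList"]:
    print("dashboard:", dash["Name"], dash["DashboardId"])
```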

Effective Business Intelligence with QuickSight introduces you to Amazon QuickSight, a modern BI tool that enables interactive visualizations powered by the cloud. With comprehensive tutorials, you'll master how to load, prepare, and visualize your data for actionable insights. This book provides real-world examples to showcase how QuickSight integrates into the AWS ecosystem. What this Book will help me do Understand how to effectively use Amazon QuickSight for business intelligence. Learn how to connect QuickSight to data sources like S3, RDS, and more. Create interactive dashboards and visualizations with QuickSight tools. Gain expertise in managing users, permissions, and data security in QuickSight. Execute a real-world big data project using AWS Data Lakes and QuickSight. Author(s) None Nadipalli is a seasoned data architect with extensive experience in cloud computing and business intelligence. With expertise in the AWS ecosystem, she has worked on numerous large-scale data analytics projects. Her writing focuses on providing practical knowledge through easy-to-follow examples and actionable insights. Who is it for? This book is ideal for business intelligence architects, developers, and IT executives seeking to leverage Amazon QuickSight. It is suited for readers with foundational knowledge of AWS who want to enhance their capabilities in BI and data visualization. If your goal is to modernize your business intelligence systems and explore advanced analytics, this book is perfect for you.