
Activities & events


Getting started on AWS with Python

AWS CEO Matt Garman recently referenced Pulumi during a re:Invent keynote while discussing AgentCore, sparking new interest in how teams can bring automation and familiar programming languages into their AWS workflows.

Join this hands-on workshop to learn how modern infrastructure management becomes faster and more consistent when you use real programming languages — in this case, Python — to define and deploy AWS resources.

You’ll see how Pulumi lets developers and operations teams write infrastructure the same way they write applications, removing the need for domain-specific languages and enabling stronger software engineering practices across your cloud environments.

Speakers:

  • Engin Diri, Sr. Solutions Architect at Pulumi

Join us to learn:

  • How to provision AWS resources using Python and apply modern development practices to your infrastructure (a minimal sketch follows this list)
  • How Pulumi’s programming model helps teams deploy cloud architecture with confidence and repeatability
  • How Pulumi’s ecosystem supports deployments across multiple environments and workflows
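To give a feel for the programming model ahead of the session, here is a minimal sketch of a Pulumi program in Python. It assumes the pulumi and pulumi-aws packages are installed and AWS credentials are already configured; the resource name is illustrative.

    import pulumi
    from pulumi_aws import s3

    # Declare an S3 bucket; Pulumi creates it when you run `pulumi up`.
    bucket = s3.Bucket("workshop-bucket")

    # Export the generated bucket name so it shows up in the stack outputs.
    pulumi.export("bucket_name", bucket.id)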

Where: BigMarker at https://pulumip.us/AWS-on-Python
Join our other workshops here: https://pulumip.us/UpcomingWorkshops

Live Workshop: Getting Started with Infrastructure as Code on AWS, in Python

In this workshop, you will learn the fundamentals of infrastructure as code through guided exercises. You will be introduced to Pulumi and learn how to use programming languages to provision modern cloud infrastructure. This workshop is designed to help new users become familiar with the core concepts needed to deploy resources on AWS effectively.

Speakers:

  • Diana Esteves, Solutions Architect, Pulumi
  • Marina Novikova, Sr. Partner Solutions Architect, AWS

Join us to learn:

  • How to use Python with Pulumi
  • The basics of the Pulumi Programming Model
  • How to provision, update, and destroy AWS resources (see the lifecycle sketch after this list)
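For a sense of that full lifecycle, here is a sketch using Pulumi's Automation API, which drives deployments from ordinary Python rather than the CLI (the Pulumi engine must still be installed). The stack, project, and resource names are illustrative, and AWS credentials are assumed to be configured.

    from pulumi import automation as auto
    from pulumi_aws import s3


    def program():
        # Resources declared here belong to the stack managed below.
        s3.Bucket("intro-workshop-bucket")


    stack = auto.create_or_select_stack(
        stack_name="dev", project_name="intro-aws", program=program
    )
    stack.up(on_output=print)       # provision, or update on later runs
    stack.destroy(on_output=print)  # tear the resources down again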

Where: BigMarker at https://pulumip.us/Intro-AWS-Pyhton
Please register using the link above to attend. The example code will be provided during the workshop and emailed with the recording.
Join our other workshops here: https://pulumip.us/UpcomingWorkshops

Tyler Richards – author

Getting Started with Streamlit for Data Science is your essential guide to quickly and efficiently building dynamic data science web applications in Python using Streamlit. Whether you're embedding machine learning models, visualizing data, or deploying projects, this book helps you excel in creating and sharing interactive apps with ease.

What this book will help you do:

  • Set up a development environment to create your first Streamlit application.
  • Implement and visualize dynamic data workflows by integrating various Python libraries into Streamlit.
  • Develop and showcase machine learning models within Streamlit for clear and interactive presentations.
  • Deploy your projects effortlessly using platforms like Streamlit Sharing, Heroku, and AWS.
  • Utilize tools like Streamlit Components and themes to enhance the aesthetics and usability of your apps.

Author(s): Tyler Richards is a data science expert with extensive experience in leveraging technology to present complex data models in an understandable way. He brings practical solutions to readers, aiming to empower them with the tools they need to succeed in the field of data science. Tyler adopts a hands-on teaching method with illustrative examples to ensure clarity and easy learning.

Who is it for? This book is designed for anyone involved in data science, from beginners just starting in the field to experienced professionals who want to learn to create interactive web applications using Streamlit. Ideal for those with a working knowledge of Python, this resource will help you streamline your workflows and enhance your project presentations.
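For readers who have not seen Streamlit before, here is a minimal sketch of the kind of app the book starts from; the data is randomly generated for illustration. Save it as app.py and launch it with `streamlit run app.py`.

    import numpy as np
    import pandas as pd
    import streamlit as st

    st.title("My first Streamlit app")

    # Random data stands in for a real dataset loaded from disk or an API.
    df = pd.DataFrame(np.random.randn(50, 2), columns=["x", "y"])

    st.line_chart(df)           # interactive chart rendered in the browser
    if st.checkbox("Show raw data"):
        st.dataframe(df)        # display the underlying table on demand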

data data-science AI/ML AWS Data Science Python
O'Reilly Data Science Books

Analyze vast amounts of data in record time using Apache Spark with Databricks in the cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere fraction of what classical analytics solutions cost, while at the same time getting the results you need, incrementally faster. This book explains how the confluence of these pivotal technologies gives you enormous power, at low cost, when it comes to huge datasets.

You will begin by learning how cloud infrastructure makes it possible to scale your code to large amounts of processing units, without having to pay for the machinery in advance. From there you will learn how Apache Spark, an open source framework, can enable all those CPUs for data analytics use. Finally, you will see how services such as Databricks provide the power of Apache Spark without you having to know anything about configuring hardware or software. By removing the need for expensive experts and hardware, your resources can instead be allocated to actually finding business value in the data.

This book guides you through advanced topics such as analytics in the cloud, data lakes, data ingestion, architecture, machine learning, and tools, including Apache Spark, Apache Hadoop, Apache Hive, Python, and SQL. Valuable exercises help reinforce what you have learned.

What You Will Learn:

  • Discover the value of big data analytics that leverage the power of the cloud
  • Get started with Databricks using SQL and Python in either Microsoft Azure or AWS
  • Understand the underlying technology, and how the cloud and Apache Spark fit into the bigger picture
  • See how these tools are used in the real world
  • Run basic analytics, including machine learning, on billions of rows at a fraction of the cost, or for free

Who This Book Is For: Data engineers, data scientists, and cloud architects who want or need to run advanced analytics in the cloud. It is assumed that the reader has data experience, but perhaps minimal exposure to Apache Spark and Azure Databricks. The book is also recommended for people who want to get started in the analytics field, as it provides a strong foundation.
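As a taste of the workloads the book covers, here is a minimal PySpark aggregation. On a Databricks cluster a SparkSession named `spark` is already provided, so the builder line matters only when running locally, and the input path is illustrative.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # On Databricks, `spark` already exists; this line is for local runs.
    spark = SparkSession.builder.appName("demo").getOrCreate()

    # Read a CSV into a distributed DataFrame and aggregate across the cluster.
    df = spark.read.csv("/data/events.csv", header=True, inferSchema=True)
    df.groupBy("country").agg(F.count("*").alias("events")).show()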

data data-engineering apache-spark AI/ML Analytics AWS Azure Big Data Cloud Computing Confluence Data Analytics Databricks Hadoop Hive Microsoft Python Spark SQL
O'Reilly Data Engineering Books
Kent Graziano – chief technical evangelist @ SnowflakeDB, Tobias Macey – host

Summary

Data warehouses have gone through many transformations, from standard relational databases on powerful hardware, to column oriented storage engines, to the current generation of cloud-native analytical engines. SnowflakeDB has been leading the charge to take advantage of cloud services that simplify the separation of compute and storage. In this episode Kent Graziano, chief technical evangelist for SnowflakeDB, explains how it is differentiated from other managed platforms and traditional data warehouse engines, the features that allow you to scale your usage dynamically, and how it allows for a shift in your workflow from ETL to ELT. If you are evaluating your options for building or migrating a data platform, then this is definitely worth a listen.
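As a concrete illustration of the ELT pattern discussed in the episode, the following sketch uses the snowflake-connector-python package: raw data is loaded first, then transformed with SQL inside the warehouse. The account, credentials, stage, and table names are all placeholders, and raw_events is assumed to have a single VARIANT column v.

    import snowflake.connector

    # Connection values are placeholders; supply your own account and credentials.
    conn = snowflake.connector.connect(
        account="myorg-myaccount",
        user="me",
        password="...",
        warehouse="ANALYTICS_WH",
        database="RAW",
        schema="PUBLIC",
    )
    cur = conn.cursor()

    # Load first: staged JSON files land in a raw table unchanged.
    cur.execute("COPY INTO raw_events FROM @my_stage FILE_FORMAT = (TYPE = JSON)")

    # Then transform inside the warehouse, with compute scaled independently of storage.
    cur.execute("""
        CREATE OR REPLACE TABLE events AS
        SELECT v:id::string AS id, v:ts::timestamp_ntz AS ts
        FROM raw_events
    """)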

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too, with world-wide datacenters including new ones in Toronto and Mumbai. And for your machine learning workloads, they just announced dedicated CPU instances. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show! You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data management. For even more opportunities to meet, listen, and learn from your peers, you don’t want to miss out on this year’s conference season. We have partnered with organizations such as O’Reilly Media and the Python Software Foundation. Upcoming events include the Software Architecture Conference in NYC and PyCon US in Pittsburgh. Go to dataengineeringpodcast.com/conferences to learn more about these and other events, and take advantage of our partner discounts to save money when you register today. Your host is Tobias Macey, and today I’m interviewing Kent Graziano about SnowflakeDB, the cloud-native data warehouse.

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by explaining what SnowflakeDB is for anyone who isn’t familiar with it?
  • How does it compare to the other available platforms for data warehousing? How does it differ from traditional data warehouses?
  • How does the performance and flexibility affect the data modeling requirements?
  • Snowflake is one of the data stores that is enabling the shift from an ETL to an ELT workflow. What are the features that allow for that approach, and what are some of the challenges that it introduces?
  • Can you describe how the platform is architected and some of the ways that it has evolved as it has grown in popularity?
  • What are some of the current limitations that you are struggling with?
  • For someone getting started with Snowflake, what is involved with loading data into the platform?
  • What is their workflow for allocating and scaling compute capacity and running analyses?
  • One of the interesting features enabled by your architecture is data sharing. What are some of the most interesting or unexpected uses of that capability that you have seen?
  • What are some other features or use cases for Snowflake that are not as well known or publicized which you think users should know about?
  • When is SnowflakeDB the wrong choice?
  • What are some of the plans for the future of SnowflakeDB?

Contact Info

  • LinkedIn
  • Website
  • @KentGraziano on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

  • SnowflakeDB
  • Free Trial
  • Stack Overflow
  • Data Warehouse
  • Oracle DB
  • MPP == Massively Parallel Processing
  • Shared Nothing Architecture
  • Multi-Cluster Shared Data Architecture
  • Google BigQuery
  • AWS Redshift
  • AWS Redshift Spectrum
  • Presto (Podcast Episode)
  • SnowflakeDB Semi-Structured Data Types
  • Hive
  • ACID == Atomicity, Consistency, Isolation, Durability
  • 3rd Normal Form
  • Data Vault Modeling
  • Dimensional Modeling
  • JSON
  • Avro
  • Parquet
  • SnowflakeDB Virtual Warehouses
  • CRM == Customer Relationship Management
  • Master Data Management (Podcast Episode)
  • FoundationDB (Podcast Episode)
  • Apache Spark (Podcast Episode)
  • SSIS == SQL Server Integration Services
  • Talend
  • Informatica
  • Fivetran (Podcast Episode)
  • Matillion
  • Apache Kafka
  • Snowpipe
  • Snowflake Data Exchange
  • OLTP == Online Transaction Processing
  • GeoJSON
  • Snowflake Documentation
  • SnowAlert
  • Splunk
  • Data Catalog

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Support Data Engineering Podcast

AI/ML Avro AWS Big Data BigQuery Cloud Computing CRM Data Engineering Data Management Data Modelling Data Vault DWH ETL/ELT Fivetran Hive Informatica JSON Kafka dimensional modeling Master Data Management Matillion Oracle Parquet Presto Python RDBMS Redshift Snowflake Spark Splunk SQL SSIS Data Streaming Talend
Data Engineering Podcast