talk-data.com talk-data.com

Topic

Kinesis

Amazon Kinesis

stream_processing realtime aws

22

tagged

Activity Trend

3 peak/qtr
2020-Q1 2026-Q1

Activities

22 activities · Newest first

Summary

Building a data pipeline that is reliable and flexible is a difficult task, especially when you have a small team. Astronomer is a platform that lets you skip straight to processing your valuable business data. Ry Walker, the CEO of Astronomer, explains how the company got started, how the platform works, and their commitment to open source.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data infrastructure When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at www.dataengineeringpodcast.com/linode?utm_source=rss&utm_medium=rss and get a $20 credit to try out their fast and reliable Linux virtual servers for running your data pipelines or trying out the tools you hear about on the show. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the newsletter, read the show notes, and get in touch. You can help support the show by checking out the Patreon page which is linked from the site. To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers This is your host Tobias Macey and today I’m interviewing Ry Walker, CEO of Astronomer, the platform for data engineering.

Interview

Introduction How did you first get involved in the area of data management? What is Astronomer and how did it get started? Regulatory challenges of processing other people’s data What does your data pipelining architecture look like? What are the most challenging aspects of building a general purpose data management environment? What are some of the most significant sources of technical debt in your platform? Can you share some of the failures that you have encountered while architecting or building your platform and company and how you overcame them? There are certain areas of the overall data engineering workflow that are well defined and have numerous tools to choose from. What are some of the unsolved problems in data management? What are some of the most interesting or unexpected uses of your platform that you are aware of?

Contact Information

Email @rywalker on Twitter

Links

Astronomer Kiss Metrics Segment Marketing tools chart Clickstream HIPAA FERPA PCI Mesos Mesos DC/OS Airflow SSIS Marathon Prometheus Grafana Terraform Kafka Spark ELK Stack React GraphQL PostGreSQL MongoDB Ceph Druid Aries Vault Adapter Pattern Docker Kinesis API Gateway Kong AWS Lambda Flink Redshift NOAA Informatica SnapLogic Meteor

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA Support Data Engineering Podcast

Real-Time Big Data Analytics

This book delves into the techniques and tools essential for designing, processing, and analyzing complex datasets in real-time using advanced frameworks like Apache Spark, Storm, and Amazon Kinesis. By engaging with this thorough guide, you'll build proficiency in creating robust, efficient, and scalable real-time data processing architectures tailored to real-world scenarios. What this Book will help me do Learn the fundamentals of real-time data processing and how it differs from batch processing. Gain hands-on experience with Apache Storm for creating robust data-driven solutions. Develop real-world applications using Amazon Kinesis for cloud-based analytics. Perform complex data queries and transformations with Spark SQL and understand Spark RDDs. Master the Lambda Architecture to combine batch and real-time analytics effectively. Author(s) Shilpi Saxena is a renowned expert in big data technologies, holding extensive experience in real-time data analytics. With a career spanning years in the industry, Shilpi has provided innovative solutions for big data challenges in top-tier organizations. Her teaching approach emphasizes practical applicability, making her writings accessible and impactful for developers and architects alike. Who is it for? This book is for software professionals such as Big Data architects, developers, or programmers looking to enhance their skills in real-time big data analytics. If you are familiar with basic programming principles and seek to build solutions for processing large data streams in real-time environments, this book caters to your needs. It is also suitable for those seeking to familiarize themselves with using state-of-the-art tools like Spark SQL, Apache Storm, and Amazon Kinesis. Whether you're extending current expertise or transitioning into this field, this resource helps you achieve your objectives.