Golden Image Management and Lifecycle: creating, maintaining, and retiring golden images to ensure consistent and efficient infrastructure deployments, including the toolset used, the ownership model, and the deployment pipeline.

Streamlining Cloud Workload Provisioning: enabling engineers to provision cloud workloads via self-service using internally developed applications, incentivizing adoption, and addressing AWS capacity limits.
Topic: Amazon Web Services (AWS)
Learn the fundamentals of infrastructure as code through guided exercises in TypeScript. You will be introduced to Pulumi and learn how to provision modern cloud infrastructure on AWS. This workshop covers how to use TypeScript with Pulumi, the basics of the Pulumi Programming Model, and how to provision, update, and destroy AWS resources.
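The workshop's core loop is small enough to sketch. Here is a minimal Pulumi program in TypeScript (illustrative, not the workshop's actual exercises; the bucket name and tag are placeholders) that declares one AWS resource and exports a stack output:

```typescript
// index.ts - a minimal Pulumi program. `pulumi up` creates the bucket,
// `pulumi destroy` removes it, and edits to this file become updates.
import * as aws from "@pulumi/aws";

// Declaring a resource registers it with the Pulumi engine; the engine
// diffs the desired state against the stack's last-known state.
const bucket = new aws.s3.Bucket("workshop-bucket", {
    tags: { environment: "workshop" }, // placeholder tag
});

// Stack outputs surface resource attributes after deployment.
export const bucketName = bucket.id;
```

Because resources are ordinary TypeScript objects, loops, functions, and type checking all apply to infrastructure code, which is the heart of the Pulumi programming model the workshop covers.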
Explore data acquisition challenges and solutions with Artsiom Yudovin in his session. 📊 Learn how his team overcame obstacles to create a robust system that provides timely insights and data reliability. 🛠️ #DataAcquisition #AWS #DataChallenges
✨ H I G H L I G H T S ✨
🙌 A huge shoutout to all the incredible participants who made Big Data Conference Europe 2023 in Vilnius, Lithuania, from November 21-24, an absolute triumph! 🎉 Your attendance and active participation were instrumental in making this event so special. 🌍
Don't forget to check out the session recordings from the conference to relive the valuable insights and knowledge shared! 📽️
Once again, THANK YOU for playing a pivotal role in the success of Big Data Conference Europe 2023. 🚀 See you next year for another unforgettable conference! 📅 #BigDataConference #SeeYouNextYear
Effective data management has become a cornerstone of success in the digital era. It involves not just collecting and storing information but also organizing, securing, and leveraging data to drive progress and innovation. Many organizations turn to tools like Snowflake for advanced data warehousing capabilities. However, while Snowflake enhances data storage and access, it is not a complete solution for every data management challenge. To address this, tools like Capital One's Slingshot can be used alongside Snowflake, helping to optimize costs and refine data management strategies.

Salim Syed is VP and Head of Engineering for Capital One's Slingshot product. He led Capital One's data warehouse migration to AWS and specializes in deploying Snowflake at large enterprises. His expertise lies in developing Big Data (lake) and data warehouse strategy on the public cloud, and he leads an organization of more than 100 data engineers, support engineers, DBAs, and full-stack developers delivering enterprise data lake, data warehouse, data management, and visualization platform services. Salim has more than 25 years of experience in the data ecosystem. His career started in data engineering, where he built data pipelines, before moving into the maintenance and administration of large database servers using multi-tier replication architectures across remote locations. He later worked at CodeRye as a database architect and at 3M Health Information Systems as an enterprise data architect, and he has been at Capital One for the past six years.

In this episode, Adel and Salim explore cloud data management and the evolution of Slingshot into a major multi-tenant SaaS platform; the shift from on-premise to cloud-based data governance; the role of centralized tooling; strategies for effective cloud data management, including data governance, cost optimization, and waste reduction; and insights into navigating the complexities of data infrastructure, security, and scalability in the modern digital era.

Links mentioned in the show:
- Capital One Slingshot
- Snowflake
- Course: Introduction to Data Warehousing
- Course: Introduction to Snowflake
Summary
Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. One of the core complexities that needs to be addressed is the fractal set of integrations that need to be managed across the individual components. In this episode Tobias Macey shares his thoughts on the challenges that he is facing as he prepares to build the next set of architectural layers for his data platform to enable a larger audience to start accessing the data being managed by his team.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack

You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation, or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results, all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize today to get 2 weeks free!

Developing event-driven pipelines is going to be a lot easier: meet Memphis Functions! Memphis Functions enable developers and data engineers to build an organizational toolbox of functions to process, transform, and enrich ingested events "on the fly" in a serverless manner using AWS Lambda syntax, without boilerplate, orchestration, error handling, or infrastructure, in almost any language, including Go, Python, JS, .NET, Java, SQL, and more. Go to dataengineeringpodcast.com/memphis today to get started!

Data lakes are notoriously complex. For data engineers who battle to build and scale high-quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs, ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and DoorDash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake, and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.

Your host is Tobias Macey, and today I'll be sharing an update on my own journey of building a data platform, with a particular focus on the challenges of tool integration and maintaining a single source of truth.
Interview
Introduction
- How did you get involved in the area of data management?
- Data sharing
- Weight of history
  - Existing integrations with dbt
  - Switching cost for e.g. SQLMesh
  - De facto standard of Airflow
- Single source of truth
  - Permissions management across application layers: database engine, storage layer in a lakehouse, presentation/access layer (BI)
  - Data flows: dbt -> table-level lineage; orchestration engine -> pipeline flows
  - Task-based vs. asset-based orchestration
  - Metadata platform as the logical place for a horizontal view

Contact Info
- LinkedIn
- Website

Parting Question
In this live workshop, you will learn the fundamentals of setting up EKS clusters on AWS through guided exercises. This workshop covers the basics of writing Pulumi programs to manage infrastructure using real languages, how to create and manage EKS clusters in AWS with Pulumi, and how to create and manage Kubernetes resources with Pulumi.
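As a rough illustration of what such an exercise looks like (a sketch under assumed defaults, not the workshop's actual code; all names are placeholders), the @pulumi/eks package wraps the cluster, node group, and networking into one component, and the resulting cluster can then host Kubernetes resources managed by the same program:

```typescript
// A sketch: create an EKS cluster, then deploy a workload into it.
import * as eks from "@pulumi/eks";
import * as k8s from "@pulumi/kubernetes";

// One component resource stands up the control plane, node group, and
// the supporting IAM and VPC wiring.
const cluster = new eks.Cluster("workshop-cluster", {
    desiredCapacity: 2,
    minSize: 1,
    maxSize: 3,
});

// Target the new cluster's API server via the provider the component exposes.
const appLabels = { app: "nginx" };
new k8s.apps.v1.Deployment("nginx", {
    spec: {
        replicas: 1,
        selector: { matchLabels: appLabels },
        template: {
            metadata: { labels: appLabels },
            spec: { containers: [{ name: "nginx", image: "nginx" }] },
        },
    },
}, { provider: cluster.provider });

// Export the kubeconfig so kubectl can reach the cluster directly.
export const kubeconfig = cluster.kubeconfig;
```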
Free 1-day virtual instructor-led training on AWS Cloud Practitioner Essentials.
All the new features of aws-classic v6 and AWSX (see the sketch after this list)
How to provision, update, and destroy AWS resources
The basics of the Pulumi Programming Model
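For the AWSX item above, a sketch of what Crosswalk's higher-level components look like (the resource name and the two-AZ choice are illustrative, not material from the session):

```typescript
// One AWSX component replaces the dozens of aws-classic resources a VPC
// normally requires (subnets, route tables, NAT and internet gateways).
import * as awsx from "@pulumi/awsx";

const vpc = new awsx.ec2.Vpc("demo-vpc", {
    numberOfAvailabilityZones: 2,
});

// Downstream resources (EKS clusters, load balancers) consume these outputs.
export const vpcId = vpc.vpcId;
export const publicSubnetIds = vpc.publicSubnetIds;
```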
Companies today are moving rapidly to integrate generative AI into their products and services. But there's a great deal of hype (and misunderstanding) about the impact and promise of this technology. With this book, Chris Fregly, Antje Barth, and Shelbee Eigenbrode from AWS help CTOs, ML practitioners, application developers, business analysts, data engineers, and data scientists find practical ways to use this exciting new technology. You'll learn the generative AI project life cycle, including use case definition, model selection, model fine-tuning, retrieval-augmented generation, reinforcement learning from human feedback, and model quantization, optimization, and deployment. And you'll explore different types of models, including large language models (LLMs) and multimodal models such as Stable Diffusion for generating images and Flamingo/IDEFICS for answering questions about images.

- Apply generative AI to your business use cases
- Determine which generative AI models are best suited to your task
- Perform prompt engineering and in-context learning
- Fine-tune generative AI models on your datasets with low-rank adaptation (LoRA)
- Align generative AI models to human values with reinforcement learning from human feedback (RLHF)
- Augment your model with retrieval-augmented generation (RAG)
- Explore libraries such as LangChain and ReAct to develop agents and actions
- Build generative AI applications with Amazon Bedrock
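The Bedrock item is concrete enough to sketch. The following is illustrative only, not code from the book: the model ID is an assumed choice, and the request body follows Bedrock's legacy Claude text-completion format, while other model families define their own body shapes.

```typescript
// Invoke a text model on Amazon Bedrock with the AWS SDK for JavaScript v3.
import {
    BedrockRuntimeClient,
    InvokeModelCommand,
} from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: "us-east-1" });

async function generate(prompt: string): Promise<string> {
    const response = await client.send(new InvokeModelCommand({
        modelId: "anthropic.claude-v2", // assumed model choice
        contentType: "application/json",
        accept: "application/json",
        body: JSON.stringify({
            prompt: `\n\nHuman: ${prompt}\n\nAssistant:`,
            max_tokens_to_sample: 300,
        }),
    }));
    // The response body is a byte array containing the model's JSON reply.
    const payload = JSON.parse(new TextDecoder().decode(response.body));
    return payload.completion;
}

generate("Summarize the generative AI project life cycle.").then(console.log);
```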
A session on data access governance and security using AWS services.
A session on applying AI/ML to data using built-in machine learning capabilities in AWS.
A session on how AWS services can modernize data infrastructure, unify data silos, and drive innovation across data platforms.
In today's cloud ecosystem, many laud the visible pillars of AWS's Well-Architected Framework, yet an essential component often remains in the shadows: Infrastructure as Code (IAC). Elizabeth Adeotun Adegbaju, a DevOps Engineer with a rich history in AWS cloud infrastructure, unravels the indispensable role of IAC in fortifying each of the renowned AWS pillars. Through this illuminating talk, attendees will gain insights into the intricate interplay between IAC and AWS's principles of operational excellence, cost optimization, reliability, performance efficiency, security, and sustainability. Dive deep into real-world examples, understand the potential pitfalls of overlooking IAC, and emerge with a renewed appreciation for its foundational significance in cloud architecture. This session is a clarion call for organizations to recognize and harness the power of IAC, positioning it not just as an option but as an imperative in achieving success in the cloud.
Learn data engineering and modern data pipeline design with AWS in this comprehensive guide! You will explore key AWS services like S3, Glue, Redshift, and QuickSight to ingest, transform, and analyze data, and you'll gain hands-on experience creating robust, scalable solutions.

What this book will help me do:
- Understand and implement data ingestion and transformation processes using AWS tools.
- Optimize data for analytics with advanced AWS-powered workflows.
- Build end-to-end modern data pipelines leveraging cutting-edge AWS technologies.
- Design data governance strategies using AWS services for security and compliance.
- Visualize data and extract insights using Amazon QuickSight and other tools.

Author(s): Gareth Eagar is a Senior Data Architect with over 25 years of experience in designing and implementing data solutions across various industries. He combines deep technical expertise with a passion for teaching, aiming to make complex concepts approachable for learners at all levels.

Who is it for? This book is intended for current or aspiring data engineers, data architects, and analysts seeking to leverage AWS for data engineering. It suits beginners with a basic understanding of data concepts who want to gain practical experience, as well as intermediate professionals aiming to expand into AWS-based systems.
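As a taste of the kind of pipeline step the book describes (a hedged sketch, not the book's code; the job name and S3 path are hypothetical, and the Glue job itself would be defined elsewhere), triggering a transform job from the AWS SDK for JavaScript v3 looks like this:

```typescript
// Start a run of an existing AWS Glue job and report its run ID.
import { GlueClient, StartJobRunCommand } from "@aws-sdk/client-glue";

const glue = new GlueClient({ region: "us-east-1" });

async function runTransform(): Promise<void> {
    const { JobRunId } = await glue.send(new StartJobRunCommand({
        JobName: "raw-to-parquet",                                 // hypothetical job
        Arguments: { "--input_path": "s3://my-raw-bucket/2023/" }, // hypothetical path
    }));
    console.log(`Started Glue job run: ${JobRunId}`);
}

runTransform();
```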
Explore AWS CloudFormation templates and their structure, parameters, stacks, updates, resource imports, and drift detection
Understand the implementation of DevOps culture and techniques in the AWS Cloud
Date: 2023-10-26. Webinar: Master Class: Getting Started with AWS DevOps. Topics include AWS DevOps concepts and culture, infrastructure automation, and AWS CloudFormation templates (structure, parameters, stacks, updates, importing resources, and drift detection).
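The webinar covers raw CloudFormation templates; as a deliberate substitution to keep the examples in TypeScript, here is an AWS CDK sketch (names are placeholders) that synthesizes to a template containing the Parameters and Resources sections the session walks through. After deployment, drift detection (aws cloudformation detect-stack-drift --stack-name DemoStack) compares the stack's live resources against that synthesized template.

```typescript
// A CDK stack synthesizes to a CloudFormation template; `cdk synth`
// prints the resulting document with its Parameters and Resources sections.
import * as cdk from "aws-cdk-lib";

class DemoStack extends cdk.Stack {
    constructor(scope: cdk.App, id: string) {
        super(scope, id);

        // Becomes an entry in the template's Parameters section.
        const envName = new cdk.CfnParameter(this, "EnvName", {
            type: "String",
            default: "dev",
            description: "Environment name used to tag the bucket.",
        });

        // Becomes an AWS::S3::Bucket entry in the Resources section.
        new cdk.aws_s3.CfnBucket(this, "DemoBucket", {
            tags: [{ key: "environment", value: envName.valueAsString }],
        });
    }
}

const app = new cdk.App();
new DemoStack(app, "DemoStack");
```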
When companies work together through open source development, good things happen. Open source contributions lead to strong relationships between engineers across company lines, and positive outcomes for customers whether through improved functionality, performance, or supply chain security. In this keynote, learn about the power of open source in driving innovation, how AWS approaches open source collaboration, and some of the key improvements for Amazon Redshift, AWS Glue, and Amazon Athena customers and dbt users resulting from our partnership.
Speaker: David Nalley, Director, Open Source Strategy and Marketing, Amazon Web Services
Register for Coalesce at https://coalesce.getdbt.com/