From Notebook to Pipeline: Hands-On Data Engineering with Python

2025-12-08 · PyData Boston 2025 Watch

talk

by Gilberto Hernandez

Cloud Computing Data Engineering SQL

In this hands-on tutorial, you'll go from a blank notebook to a fully orchestrated data pipeline built entirely in Python, all in under 90 minutes. You'll learn how to design and deploy end-to-end data pipelines in familiar notebook environments, with Python handling data loading, data transformation, and insights delivery.

We'll dive into the Ingestion-Transformation-Delivery (ITD) framework for building data pipelines: ingest raw data from cloud object storage, transform the data using Python DataFrames, and deliver insights via a Streamlit application.

Basic familiarity with Python (and/or SQL) is helpful, but not required. By the end of the session, you'll understand practical data engineering patterns and leave with reusable code templates to help you build, orchestrate, and deploy data pipelines from notebook environments.
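
The three ITD stages can be sketched as three small functions. This is only an illustrative sketch, not the tutorial's code: it uses the standard library in place of a DataFrame library, an in-memory CSV string in place of cloud object storage, and plain report lines in place of a Streamlit app. The names `RAW_CSV`, `ingest`, `transform`, and `deliver` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw data. In a real pipeline this text would be read
# from cloud object storage rather than defined inline.
RAW_CSV = """region,amount
east,10.5
west,4.0
east,2.5
"""


def ingest(raw_text):
    """Ingestion: parse raw CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transformation: total the amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += float(row["amount"])
    return dict(totals)


def deliver(totals):
    """Delivery: format report lines (a stand-in for a Streamlit view)."""
    return [f"{region}: {total:.2f}" for region, total in sorted(totals.items())]


report = deliver(transform(ingest(RAW_CSV)))
print("\n".join(report))
```

Keeping each stage as a separate function means any one of them can be swapped out (for example, replacing `ingest` with a real object-storage reader) without touching the others.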

AWS re:Invent 2025 - Build production AI agents with the Strands Agents SDK for TypeScript (AIM3331)

2025-12-07 · AWS re:Invent 2025 Watch

video

Agile/Scrum AI/ML AWS Cloud Computing JavaScript LLM TypeScript

Discover how to build enterprise-ready AI agents using the newly launched Strands Agents SDK for TypeScript. This session introduces developers to a simple model-driven framework for building agents that run on any cloud, support multiple LLM providers, and integrate with the tools you already have. Learn how TypeScript developers can now leverage the same production-ready agent framework that Python teams have been using, with full type safety and seamless integration into modern JavaScript ecosystems. We'll cover key features, demonstrate multi-agent patterns, and explore deployment options from Amazon EKS to Amazon Bedrock AgentCore with live coding examples.

Learn More: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - Accelerating data engineering with AI Agents for AWS Analytics (ANT215)

2025-12-05 · AWS re:Invent 2025 Watch

video

Agile/Scrum AI/ML Analytics AWS Cloud Computing Data Engineering Amazon SageMaker Spark SQL

Data engineers face critical time sinks: writing code to build analytics pipelines from scratch and upgrading Apache Spark versions. In this lightning talk, discover how AWS is addressing both challenges with AI agents that accelerate development cycles. Learn how the Amazon SageMaker Data Agent transforms natural language instructions into executable SQL and Python code within SageMaker notebooks, maintaining full context awareness of your data sources and schemas. Then explore the Apache Spark upgrade agent, which accelerates complex multi-month upgrade projects into week-long initiatives through automated code analysis and transformation. Walk away understanding how these agents work to automate manual work from your data engineering workflows, whether you're building new applications or modernizing existing ones.

AWS re:Invent 2025 - Advanced agentic RAG Systems: Deep dive with Amazon Bedrock (AIM425)

2025-12-04 · AWS re:Invent 2025 Watch

video

Agile/Scrum AWS Cloud Computing RAG

Learn to build a production-grade agentic RAG system using Amazon Bedrock Knowledge Bases, Strands, and AgentCore in this expert-level code talk. Through live coding and detailed walkthroughs, learn how to build an intelligent event assistant agent that integrates knowledge retrieval, long-term memory, and user authentication. This hands-on session covers the complete journey from knowledge base setup through agent creation, memory integration (short-term and long-term), runtime deployment, and identity management. Prerequisites: strong experience with Python and familiarity with RAG concepts.

From Beats to Bytes: Teaching AI, Music, & Python Coding W. Elastic by @Treeko.mp3

2025-12-03 · From Beats to Bytes: AI, Observability & Creativity with Elastic

talk

Kibana ai elasticsearch

We’ll explore:
- How Elasticsearch can be used to organize and search through large content libraries (music samples, project files, video clips, code snippets) for fast retrieval.
- Ways Kibana visualizations can track audience engagement and content performance in real time.
- The role of AI-driven insights in shaping creative output, from deciding which DJ mixes to publish next to optimizing Python course delivery for beginners.
- How to bridge technical concepts with creative workflows to reach a global audience, including Spanish-speaking learners in Latin America and Spain.

Whether you’re a developer, educator, or creator, you’ll leave with practical ideas for using Elastic’s tools to bring structure, insight, and scalability to your own projects, technical or creative.

Why is your Python code slow? Recommendations for improving performance

2025-12-02 · PyBerlin 56 - December event

talk

optimization performance optimization profiling

It's very likely that throughout your journey with Python, you've heard people say that Python is slow. While there is a gap between interpreted and compiled languages that favors compiled languages, Python has ways to improve the performance of your programs, but these aren't widely known among coders. In this talk, we'll explore some tools and programming patterns that will help you improve the performance of your programs, thereby improving the speed of your applications, tests, and products. After the presentation, you'll have a list of techniques you can apply to your code, as well as the necessary steps to continue exploring code optimization. No prior knowledge of code profilers or advanced techniques is required to attend this talk.
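
The abstract does not name the tools it covers. As one possible starting point, here is a minimal sketch using `cProfile` and `pstats` from the standard library to see where a program spends its time. The functions `loop_sum` and `builtin_sum` are hypothetical examples, and no claim is made here about which of them is faster; the profiler output is what answers that on a given machine.

```python
import cProfile
import io
import pstats


def loop_sum(n):
    """Sum of squares with an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def builtin_sum(n):
    """Sum of squares with the built-in sum() and a generator expression."""
    return sum(i * i for i in range(n))


# Profile both implementations in a single run.
profiler = cProfile.Profile()
profiler.enable()
loop_result = loop_sum(100_000)
builtin_result = builtin_sum(100_000)
profiler.disable()

# Write the statistics, sorted by cumulative time, into a string.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats()
profile_report = stream.getvalue()
print(profile_report)
```

Measuring before changing anything keeps optimization effort focused on the functions that actually dominate the runtime.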

Moving beyond Slop Coding

2025-12-02 · PyBerlin 56 - December event

talk

by Matt Harrison

ai

AI can type faster than you. However, it has been trained on lots of naive or poor code (and a little decent code). Let's explore how you can take advantage of software engineering (and Python) best practices to help tame the bias of the AIs.

How I Learned to Stop Worrying and Love Generators

2025-12-02 · PyBerlin 56 - December event

talk

by Paweł Wiszniewski (Flink SE)

Data Engineering generators

Python's generators offer a simple, elegant way to build lightweight data pipelines. In this talk, we’ll break down generator functions and expressions and walk through practical Data Engineering examples: streaming large datasets in chunks, transforming records without exhausting memory, and using yield for clean setup and teardown. A concise tour of how generators can make data workflows more efficient—and more elegant.
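
The three patterns named above can be sketched in a few lines. This is an illustrative sketch rather than the speaker's code: `read_in_chunks`, `to_celsius`, `open_source`, and the sample readings are all hypothetical.

```python
from contextlib import contextmanager


def read_in_chunks(records, chunk_size):
    """Stream any iterable in fixed-size chunks without loading it all."""
    chunk = []
    for record in records:
        chunk.append(record)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk


def to_celsius(readings):
    """Transform records lazily, one at a time."""
    for fahrenheit in readings:
        yield round((fahrenheit - 32) * 5 / 9, 1)


events = []  # records setup and teardown so the order is visible


@contextmanager
def open_source(name):
    """Use yield for setup and teardown around a data source."""
    events.append(f"open {name}")
    try:
        yield range(32, 212, 36)  # hypothetical Fahrenheit readings
    finally:
        events.append(f"close {name}")


with open_source("sensor") as raw:
    chunks = list(read_in_chunks(to_celsius(raw), chunk_size=2))

print(chunks)
print(events)
```

Because each stage pulls one record at a time from the stage before it, memory use depends on the chunk size rather than on the size of the source.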

Emoji Master Challenge - Python Masterclass (Ages 12-18)

2025-11-29 · Python Emoji Master Challenge! [Ages 12-16] [EN/DE]

workshop

emoji

Hands-on Python workshop featuring the Emoji Master Challenge with levels (e.g., display a rose emoji 10 times; conceal a superhero with emojis) and a final reveal of student-created superheroes.
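
The first challenge level can be sketched with string repetition. The `conceal` function below is only one hypothetical reading of "conceal a superhero with emojis" (it masks each character of a name); the workshop's actual task may differ.

```python
ROSE = "🌹"
MASK = "🎭"

# Level example: display a rose emoji 10 times.
rose_row = ROSE * 10
print(rose_row)


def conceal(name):
    """Replace every non-space character of a name with a mask emoji."""
    return "".join(" " if ch == " " else MASK for ch in name)


# "Bat Girl" is a made-up example name.
hidden = conceal("Bat Girl")
print(hidden)
```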

Blurring Lines: Data, AI, and the New Playbook for Team Velocity

2025-11-24 · Data Engineering Podcast Listen

podcast_episode

by Maxime Beauchemin (Preset), Tobias Macey

AI/ML Cloud Computing Data Engineering Data Management Data Quality Datafold dbt ETL/ELT Git Prefect SQL Data Streaming

Summary

In this crossover episode, Max Beauchemin explores how multiplayer, multi‑agent engineering is transforming the way individuals and teams build data and AI systems. He digs into the shifting boundary between data and AI engineering, the rise of “context as code,” and how just‑in‑time retrieval via MCP and CLIs lets agents gather what they need without bloating context windows. Max shares hard‑won practices from going “AI‑first” for most tasks, where humans focus on orchestration and taste, and the new bottlenecks that appear (code review, QA, async coordination) when execution accelerates 2–10x. He also dives deep into Agor, his open‑source agent orchestration platform: a spatial, multiplayer workspace that manages Git worktrees and live dev environments, templatizes prompts by workflow zones, supports session forking and sub‑sessions, and exposes an internal MCP so agents can schedule, monitor, and even coordinate other agents.

Announcements

- Hello and welcome to the Data Engineering Podcast, the show about modern data management.
- Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
- Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
- Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
- Your host is Tobias Macey and today I'm interviewing Maxime Beauchemin about the impact of multi-player multi-agent engineering on individual and team velocity for building better data systems.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you start by giving an overview of the types of work that you are relying on AI development agents for?
- As you bring agents into the mix for software engineering, what are the bottlenecks that start to show up?
- In my own experience there are a finite number of agents that I can manage in parallel. How does Agor help to increase that limit?
- How does making multi-agent management a multi-player experience change the dynamics of how you apply agentic engineering workflows?

Contact Info

- LinkedIn

Links

- Agor
- Apache Airflow
- Apache Superset
- Preset
- Claude Code
- Codex
- Playwright MCP
- Tmux
- Git Worktrees
- Opencode.ai
- GitHub Codespaces
- Ona

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA