talk-data.com

Topic

Python

programming_language data_science web_development

1446

tagged

Activity Trend

185 peak/qtr, 2020-Q1 to 2026-Q1

Activities

1446 activities · Newest first

From Notebook to Pipeline: Hands-On Data Engineering with Python

In this hands-on tutorial, you'll go from a blank notebook to a fully orchestrated data pipeline built entirely in Python, all in under 90 minutes. You'll learn how to design and deploy end-to-end data pipelines in familiar notebook environments, using Python for data loading, data transformations, and insights delivery.

We'll dive into the Ingestion-Transformation-Delivery (ITD) framework for building data pipelines: ingest raw data from cloud object storage, transform the data using Python DataFrames, and deliver insights via a Streamlit application.

Basic familiarity with Python (and/or SQL) is helpful, but not required. By the end of the session, you'll understand practical data engineering patterns and leave with reusable code templates to help you build, orchestrate, and deploy data pipelines from notebook environments.

AWS re:Invent 2025 - Build production AI agents with the Strands Agents SDK for TypeScript (AIM3331)

Discover how to build enterprise-ready AI agents using the newly launched Strands Agents SDK for TypeScript. This session introduces developers to a simple model-driven framework for building agents that run on any cloud, support multiple LLM providers, and integrate with the tools you already have. Learn how TypeScript developers can now leverage the same production-ready agent framework that Python teams have been using, with full type safety and seamless integration into modern JavaScript ecosystems. We'll cover key features, demonstrate multi-agent patterns, and explore deployment options from Amazon EKS to Amazon Bedrock AgentCore with live coding examples.

Learn More: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2025 #AWS

AWS re:Invent 2025 - Accelerating data engineering with AI Agents for AWS Analytics (ANT215)

Data engineers face critical time sinks: writing code to build analytics pipelines from scratch and upgrading Apache Spark versions. In this lightning talk, discover how AWS is addressing both challenges with AI agents that accelerate development cycles. Learn how the Amazon SageMaker Data Agent transforms natural language instructions into executable SQL and Python code within SageMaker notebooks, maintaining full context awareness of your data sources and schemas. Then explore the Apache Spark upgrade agent, which accelerates complex multi-month upgrade projects into week-long initiatives through automated code analysis and transformation. Walk away understanding how these agents work to automate manual work from your data engineering workflows, whether you're building new applications or modernizing existing ones.

AWS re:Invent 2025 - Advanced agentic RAG Systems: Deep dive with Amazon Bedrock (AIM425)

Learn to build a production-grade agentic RAG system using Amazon Bedrock Knowledge Bases, Strands, and AgentCore in this expert-level code talk. Through live coding and detailed walkthroughs, learn how to build an intelligent event assistant agent that integrates knowledge retrieval, long-term memory, and user authentication. This hands-on session covers the complete journey from knowledge base setup through agent creation, memory integration (short-term and long-term), runtime deployment, and identity management. Prerequisites: strong experience with Python and familiarity with RAG concepts.

We’ll explore:

- How Elasticsearch can be used to organize and search through large content libraries (music samples, project files, video clips, code snippets) for fast retrieval.
- Ways Kibana visualizations can track audience engagement and content performance in real time.
- The role of AI-driven insights in shaping creative output, from deciding which DJ mixes to publish next to optimizing Python course delivery for beginners.
- How to bridge technical concepts with creative workflows to reach a global audience, including Spanish-speaking learners in Latin America and Spain.

Whether you’re a developer, educator, or creator, you’ll leave with practical ideas for using Elastic’s tools to bring structure, insight, and scalability to your own projects, technical or creative.

It's very likely that throughout your journey with Python, you've heard people say that Python is slow. While there is a gap between interpreted and compiled languages that favors compiled languages, Python has ways to improve the performance of your programs, but these aren't widely known among coders. In this talk, we'll explore some tools and programming patterns that will help you improve the performance of your programs, thereby improving the speed of your applications, tests, and products. After the presentation, you'll have a list of techniques you can apply to your code, as well as the necessary steps to continue exploring code optimization. No prior knowledge of code profilers or advanced techniques is required to attend this talk.
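The abstract does not say which tools or patterns the talk covers, so the following is only an illustrative sketch of the general workflow it alludes to: measure first with the standard library's `timeit` and `cProfile`, then swap in a faster pattern. The functions `count_hits_list`, `count_hits_set`, and `profile_report` are hypothetical names invented for this example; the pattern shown (set membership instead of list membership) is an assumption about the kind of technique such a talk might present.

```python
import cProfile
import io
import pstats
import timeit


def count_hits_list(needles, haystack):
    """Membership test against a list: each lookup scans the list."""
    haystack_list = list(haystack)
    return sum(1 for n in needles if n in haystack_list)


def count_hits_set(needles, haystack):
    """Same result, but each lookup is a hash lookup in a set."""
    haystack_set = set(haystack)
    return sum(1 for n in needles if n in haystack_set)


def profile_report(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical workload, chosen only to make the difference visible.
    needles, haystack = range(0, 4000, 2), range(2000)
    for func in (count_hits_list, count_hits_set):
        seconds = timeit.timeit(lambda: func(needles, haystack), number=5)
        print(f"{func.__name__}: {seconds:.4f}s for 5 runs")
    _, report = profile_report(count_hits_list, needles, haystack)
    print(report)
```

Timings depend on the machine and Python version, so no expected numbers are given here; the point is that both functions return the same count while the profiler shows where the time goes.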

Python's generators offer a simple, elegant way to build lightweight data pipelines. In this talk, we’ll break down generator functions and expressions and walk through practical Data Engineering examples: streaming large datasets in chunks, transforming records without exhausting memory, and using yield for clean setup and teardown. A concise tour of how generators can make data workflows more efficient—and more elegant.
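The three uses named in this abstract (chunked streaming, lazy record transformation, and `yield` for setup and teardown) can be sketched with the standard library alone. This is a minimal illustration, not the speaker's code: `read_in_chunks`, `clean_records`, `open_source`, `run_pipeline`, and the `SAMPLE_CSV` data are all hypothetical names and values made up for the example.

```python
import csv
import io
from contextlib import contextmanager
from itertools import islice


def read_in_chunks(rows, chunk_size):
    """Yield lists of at most chunk_size items from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def clean_records(records):
    """Transform one record at a time, never holding the full dataset."""
    for record in records:
        yield {
            "name": record["name"].strip().title(),
            "amount": float(record["amount"]),
        }


@contextmanager
def open_source(text):
    """Use yield to separate setup from teardown around a resource."""
    handle = io.StringIO(text)  # setup: stands in for opening a file
    try:
        yield csv.DictReader(handle)
    finally:
        handle.close()  # teardown runs even if the consumer raises


# Made-up sample data standing in for a large input file.
SAMPLE_CSV = "name,amount\n alice ,10.5\nBOB,4\ncarol,7.25\n"


def run_pipeline(text=SAMPLE_CSV, chunk_size=2):
    """Return (row count, amount total) for each chunk."""
    totals = []
    with open_source(text) as reader:
        for chunk in read_in_chunks(clean_records(reader), chunk_size):
            totals.append((len(chunk), sum(r["amount"] for r in chunk)))
    return totals


if __name__ == "__main__":
    print(run_pipeline())  # [(2, 14.5), (1, 7.25)]
```

Because each stage is a generator, a row is only read, cleaned, and grouped when the final loop asks for it, so memory use is bounded by the chunk size rather than by the input size.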

Summary

In this crossover episode, Max Beauchemin explores how multiplayer, multi‑agent engineering is transforming the way individuals and teams build data and AI systems. He digs into the shifting boundary between data and AI engineering, the rise of “context as code,” and how just‑in‑time retrieval via MCP and CLIs lets agents gather what they need without bloating context windows. Max shares hard‑won practices from going “AI‑first” for most tasks, where humans focus on orchestration and taste, and the new bottlenecks that appear (code review, QA, async coordination) when execution accelerates 2–10x. He also dives deep into Agor, his open‑source agent orchestration platform: a spatial, multiplayer workspace that manages Git worktrees and live dev environments, templatizes prompts by workflow zones, supports session forking and sub‑sessions, and exposes an internal MCP so agents can schedule, monitor, and even coordinate other agents.

Announcements

- Hello and welcome to the Data Engineering Podcast, the show about modern data management.
- Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
- Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
- Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
- Your host is Tobias Macey and today I'm interviewing Maxime Beauchemin about the impact of multi-player multi-agent engineering on individual and team velocity for building better data systems.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you start by giving an overview of the types of work that you are relying on AI development agents for?
- As you bring agents into the mix for software engineering, what are the bottlenecks that start to show up?
- In my own experience there are a finite number of agents that I can manage in parallel. How does Agor help to increase that limit?
- How does making multi-agent management a multi-player experience change the dynamics of how you apply agentic engineering workflows?

Contact Info

- LinkedIn

Links

- Agor
- Apache Airflow
- Apache Superset
- Preset
- Claude Code
- Codex
- Playwright MCP
- Tmux
- Git Worktrees
- Opencode.ai
- GitHub Codespaces
- Ona

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

SQL for Data Analytics - Fourth Edition

Dive into the world of data analytics with 'SQL for Data Analytics'. This book takes you beyond simple query writing to teach you how to use SQL to analyze, interpret, and derive actionable insights from real-world data. By the end, you'll build technical skills that allow you to solve complex problems and demonstrate results using data.

What this Book will help me do

- Understand how to create, manage, and utilize structured databases for analytics.
- Use advanced SQL techniques such as window functions and subqueries effectively.
- Analyze various types of data like geospatial, JSON, and time-series data in SQL.
- Apply statistical principles within the context of SQL for enhanced insights.
- Automate data workflows and presentations using SQL and Python integration.

Author(s)

The authors Jun Shan, Haibin Li, Matt Goldwasser, Upom Malik, and Benjamin Johnston bring together a wealth of knowledge in data analytics, database management, and applied statistics. Together, they aim to empower readers through clear explanations, practical examples, and a focus on real-world applicability.

Who is it for?

This book is aimed at data professionals and learners such as aspiring data analysts, backend developers, and anyone involved in data-driven decision-making processes. The ideal reader has a basic understanding of SQL and mathematics and is eager to extend their skills to tackle real-world data challenges effectively.

The Definitive Guide to Microsoft Fabric

Master Microsoft Fabric from basics to advanced architectures with expert guidance to unify, secure, and scale analytics on real-world data platforms.

Key Features

- Build a complete data analytics platform with Microsoft Fabric
- Apply proven architectures, governance, and security strategies
- Gain real-world insights from five seasoned data experts
- Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Microsoft Fabric is reshaping how organizations manage, analyze, and act on data by unifying ingestion, storage, transformation, analytics, AI, and visualization in a single platform. The Definitive Guide to Microsoft Fabric takes you from your very first workspace to building a secure, scalable, and future-proof analytics environment. You’ll learn how to unify data in OneLake, design data meshes, transform and model data, implement real-time analytics, and integrate AI capabilities. The book also covers advanced topics, such as governance, security, cost optimization, and team collaboration using DevOps and DataOps principles. Drawing on the real-world expertise of five seasoned professionals who have built and advised on platforms for startups, SMEs, and Europe’s largest enterprises, this book blends strategic insight with practical guidance. By the end of this book, you’ll have gained the knowledge and skills to design, deploy, and operate a Microsoft Fabric platform that delivers sustainable business value.

What you will learn

- Understand Microsoft Fabric architecture and concepts
- Unify data storage and data governance with OneLake
- Ingest and transform data using multiple Fabric tools
- Implement real-time analytics and event processing
- Design effective semantic models and reports
- Integrate AI and machine learning into data workflows
- Apply governance, security, and compliance controls
- Optimize performance and costs at scale

Who this book is for

This book is for data engineers, analytics engineers, architects, and data analysts moving into platform design roles. It’s also valuable for technical leaders seeking to unify analytics in their organizations. You’ll need only a basic grasp of databases, SQL, and Python.

Build a multi-agent application leveraging MCP (Model Context Protocol) with the Microsoft Agent Framework in C# or LangGraph in Python, integrated with Azure Cosmos DB for scalable and high-performance data persistence and retrieval. Define agents, functions, and external service integrations, implement memory, state management, and semantic search using Azure Cosmos DB. By the end, you’ll have a robust AI agent system designed for real-world applications.

Please RSVP and arrive at least 5 minutes before the start time, at which point remaining spaces are open to standby attendees.