Python is at the core of our analytics platform, which processes over 8,000 game records daily, each approximately 500 MB in size. Over the past two years, we have accumulated more than 200 TB of data, equivalent to 1,600 years of game time from over 7 million players—and our goal is to increase this user count tenfold. This talk will cover how we transitioned from Go and C++ parsers connected via PyBind to data frames in Python, how our analyses evolved from Pandas to Polars, and why we migrated our backend from Django to FastAPI. Finally, we will share our real-world experience with performance optimization, leveraging RabbitMQ, Redis, and process monitoring in an environment where Python bridges the worlds of game data and AI analysis.
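The abstract above mentions moving analyses from Pandas to Polars. As a rough illustration of what that shift can look like (the file name and columns below are made up, not taken from the platform), here is the same aggregation in both libraries:

```python
import pandas as pd
import polars as pl

# Pandas: eager, fully in-memory aggregation.
pdf = pd.read_parquet("game_records.parquet")  # hypothetical file
pandas_result = pdf.groupby("player_id")["score"].mean().reset_index()

# Polars: lazy scan with query optimization; nothing executes until .collect().
polars_result = (
    pl.scan_parquet("game_records.parquet")
    .group_by("player_id")
    .agg(pl.col("score").mean())
    .collect()
)
```

At multi-terabyte scale, the lazy API is typically where the gains come from, since Polars can prune columns and push filters down before the data is read.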
talk-data.com
Topic: Python (1,446 items tagged)

Top Events
How computers generate numbers for different purposes. Ever wondered how your computer decides what’s “random”? Let’s peek behind the curtain and see why getting it wrong can be disastrous.
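To make the talk's point concrete, here is a small standard-library example (not taken from the talk) contrasting pseudo-random numbers, which are fine for games and simulations, with the cryptographic source that security-sensitive values should come from:

```python
import random
import secrets

random.seed(42)
dice_roll = random.randint(1, 6)        # reproducible pseudo-randomness: fine for simulations and games
session_token = secrets.token_hex(16)   # unpredictable, OS-backed randomness: use for tokens, keys, passwords

print(dice_roll, session_token)
```

Seeding `random` makes results reproducible, which is exactly the property you do not want when generating secrets.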
Continuation of the conference with multiple advanced sessions and talks.
Conference officially opens with two keynote speakers.
R and Python workshops delivered by leading industry experts.
Summary In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Nick Schrock, CTO and founder of Dagster Labs, to discuss Compass - a Slack-native, agentic analytics system designed to keep data teams connected with business stakeholders. Nick shares his journey from initial skepticism to embracing agentic AI as model and application advancements made it practical for governed workflows, and explores how Compass redefines the relationship between data teams and stakeholders by shifting analysts into steward roles, capturing and governing context, and integrating with Slack where collaboration already happens. The conversation covers organizational observability through Compass's conversational system of record, cost control strategies, and the implications of agentic collaboration on Conway's Law, as well as what's next for Compass and Nick's optimistic views on AI-accelerated software engineering.
Announcements

• Hello and welcome to the Data Engineering Podcast, the show about modern data management.
• Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
• Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
• Your host is Tobias Macey and today I'm interviewing Nick Schrock about building an AI analyst that keeps data teams in the loop.

Interview

• Introduction
• How did you get involved in the area of data management?
• Can you describe what Compass is and the story behind it?
• Context repository structure: how to keep it relevant / avoid sprawl and duplication; providing guardrails
• How does a tool like Compass help provide feedback/insights back to the data teams?
• Preparing the data warehouse for effective introspection by the AI
• LLM selection
• Cost management
• Caching/materializing ad-hoc queries
• Why Slack and enterprise chat are important to B2B software
• How AI is changing stakeholder relationships
• How not to overpromise AI capabilities
• How does Compass relate to BI?
• How does Compass relate to Dagster and data infrastructure?
• What are the most interesting, innovative, or unexpected ways that you have seen Compass used?
• What are the most interesting, unexpected, or challenging lessons that you have learned while working on Compass?
• When is Compass the wrong choice?
• What do you have planned for the future of Compass?

Contact Info

• LinkedIn

Parting Question

• From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

• Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
• Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
• If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

• Dagster
• Dagster Labs
• Dagster Plus
• Dagster Compass
• Chris Bergh DataOps Episode
• Rise of Medium Code blog post
• Context Engineering
• Data Steward
• Information Architecture
• Conway's Law
• Temporal durable execution framework

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Hands-on coding workshop for ages 12-18 featuring the Emoji Master Challenge. Participants complete level-based emoji tasks using Python, culminating in a final reveal of the superheroes created by the students.
Who’s the most clutch quarterback in NFL history — Tom Brady, Patrick Mahomes, Aaron Rodgers, or someone completely unexpected? We’ll use Python + Data Science to figure it out (a rough pandas sketch of one possible approach appears after this listing).

👉 Try Sphinx for free - https://www.sphinx.ai

⏱️ TIMESTAMPS
00:00 - Who’s the most clutch QB?
00:40 - Python + Sphinx AI: analyzing 1M NFL plays
02:00 - Defining “clutch” in football (data-driven approach)
03:15 - “TV Clutch” Top 10
07:50 - Using AI to process play-by-play data
11:10 - Advanced Clutch Factor
17:00 - Advanced Top 10
24:30 - Build your own analysis

🔗 RESOURCES & LINKS
💌 Join 20k+ aspiring data analysts — https://www.datacareerjumpstart.com/newsletter
🎯 Free Training: How to Land Your First Data Job — https://www.datacareerjumpstart.com/training
👩‍💻 Accelerator Program: Data Analytics Accelerator — https://www.datacareerjumpstart.com/daa
💼 Interview Prep Tool: Interview Simulator — https://www.datacareerjumpstart.com/interviewsimulator

📱 CONNECT WITH AVERY
🎥 YouTube: @averysmith
🤝 LinkedIn: https://www.linkedin.com/in/averyjsmith
📸 Instagram: https://instagram.com/datacareerjumpstart
🎵 TikTok: https://www.tiktok.com/@verydata
💻 Website: https://www.datacareerjumpstart.com

📱 CONNECT WITH SPHINX
🐦 Twitter/X - https://x.com/getsphinx
🔗 LinkedIn - https://www.linkedin.com/company/sphinx-ml/

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!
To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more
If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.
👉 Join the December Cohort & Claim Your Bonuses: https://DataCareerJumpstart.com/daa https://www.datacareerjumpstart.com/daa
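As a rough, hypothetical sketch of the kind of analysis the clutch-quarterback episode above describes, here is one way to rank quarterbacks on late-game passing with pandas; the file name and columns (qb, qtr, score_diff, play_type, epa) are assumptions, not the episode's actual method:

```python
import pandas as pd

plays = pd.read_csv("nfl_play_by_play.csv")  # hypothetical file and schema

# Keep late-game, close-score passing plays: 4th quarter, within one score.
clutch_plays = plays[
    (plays["qtr"] == 4)
    & (plays["score_diff"].abs() <= 8)
    & (plays["play_type"] == "pass")
]

# Average expected points added (EPA) per quarterback on those plays.
clutch_rank = clutch_plays.groupby("qb")["epa"].agg(["mean", "count"])

# Require a minimum sample size, then rank.
clutch_rank = clutch_rank[clutch_rank["count"] >= 100].sort_values("mean", ascending=False)
print(clutch_rank.head(10))
```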
Modern data engineering leverages Python to build robust, scalable, end-to-end workflows. In this talk, we will cover how Snowflake offers a flexible development environment for building Python data pipelines, performing transformations at scale, and orchestrating and deploying your pipelines. Topics we’ll cover include:
• Ingest: data source APIs; reading and ingesting data of any format as files arrive; sources outside Snowflake
• Develop: packaging (artifact repository), Python runtimes, IDEs (Notebook, VS Code)
• Transform: Snowpark pandas, UDFs, UDAFs (see the sketch after this list)
• Deploy: Tasks, Notebook scheduling
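As a minimal sketch of the Transform step above, the following registers a Python UDF with Snowpark and applies it in a DataFrame expression; the connection parameters, table, and column names are placeholders rather than anything from the talk:

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col, udf
from snowflake.snowpark.types import FloatType

# Placeholder connection parameters; fill in your own account details.
session = Session.builder.configs({
    "account": "<account>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<database>",
    "schema": "<schema>",
}).create()

# Register a Python UDF that Snowflake executes server-side.
@udf(name="discounted_price", replace=True,
     input_types=[FloatType(), FloatType()], return_type=FloatType())
def discounted_price(price: float, discount: float) -> float:
    return price * (1.0 - discount)

# Apply the UDF in a Snowpark DataFrame transformation (hypothetical table/columns).
orders = session.table("RAW.ORDERS")
result = orders.with_column("FINAL_PRICE", discounted_price(col("PRICE"), col("DISCOUNT")))
result.show()
```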
Learn how to efficiently scale and manage data engineering pipelines with Snowflake's latest native capabilities for transformations and orchestration with SQL, Python, and dbt Projects on Snowflake. Join us for new product and feature overviews, best practices, and live demos.
Free workshop putting Python basics into practice and using Python libraries to predict sales and inventory from historical data.
You have likely witnessed the hype-cycle around MCP (the Model Context Protocol) for LLMs. It was heralded as "the universal interface between LLMs and the world" but then faded into the background as attention shifted towards AI Agents. Yet, the background of your AI app is exactly where an MCP should be, and in this talk we cover why. We will tour the MCP protocol, the Python reference implementation, and an example agent using an MCP. Expect protocol flow-charts, architecture diagrams, and a real-world demo. You will walk away knowing the core ideas of MCP, how it connects to the broader ecosystem, and how to power your AI agents.
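For reference, a minimal MCP server built with the official Python SDK's FastMCP helper can look like the sketch below; the server name and tools are illustrative, and the talk's own demo may differ:

```python
from mcp.server.fastmcp import FastMCP

# Create a named MCP server that exposes tools to any MCP-aware client or agent.
mcp = FastMCP("demo-tools")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

@mcp.tool()
def lookup_player(name: str) -> str:
    """Hypothetical lookup tool an agent could call."""
    return f"No record found for {name} (demo data)."

if __name__ == "__main__":
    # Serve over stdio so a client (e.g. an agent framework) can connect to it.
    mcp.run()
```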
In this session, I'll walk you through how to build a smart, context-aware agent in just 45 minutes. You'll see how OpenAI APIs, LangChain, and Python can work together to create an agent that goes beyond basic chat. With a demo and easy-to-follow steps, you’ll leave with the confidence to start building and customizing your own AI Assistant. We'll cover:
• Core principles of AI agents and what makes them different from simple chatbots
• Step-by-step walkthrough of building an agent with LangChain and OpenAI APIs
• Demo of an AI agent
• Practical ways to customize agents for your own use cases
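As a hedged sketch of the building blocks this session describes (assuming the langchain-openai and langchain-core packages; the model name and tool are illustrative), a chat model bound to a single tool looks roughly like this:

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

@tool
def lookup_order(order_id: str) -> str:
    """Look up the status of an order (demo data)."""
    return f"Order {order_id} is out for delivery."

# Bind the tool so the model can decide whether to answer directly or call it.
llm = ChatOpenAI(model="gpt-4o-mini")
llm_with_tools = llm.bind_tools([lookup_order])

response = llm_with_tools.invoke("Where is order 42?")
print(response.tool_calls or response.content)
```

A full agent adds a loop that executes each requested tool call and feeds the result back to the model until it produces a final answer.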
Everyone’s talking about AI agents! But what are they, and how do you build one? This talk cuts through the hype. Drawing on my experience building a GenAI platform, I’ll show that powerful agents are within reach, no advanced degree required. We’ll define agents simply: LLMs + tools + memory. Then we’ll build an agent with the OpenAI Python SDK, using coding basics you know: functions, loops, and conditions. I’ll show how you can enhance your agent with a knowledge base using Elasticsearch as a tool. By the end, you won't just understand agents; you'll be fully equipped to build your own.
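As a rough sketch of the "LLMs + tools + memory" loop described above, using nothing beyond functions, loops, and conditions with the OpenAI Python SDK (the model name and the weather tool are assumptions, not the talk's example):

```python
import json
from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    # Stand-in tool; a real agent would call an external API here.
    return f"It is sunny in {city} (demo data)."

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Vienna?"}]  # memory

while True:
    response = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=tools
    )
    msg = response.choices[0].message
    messages.append(msg)
    if not msg.tool_calls:           # condition: the model answered directly
        print(msg.content)
        break
    for call in msg.tool_calls:      # loop: run every tool the model requested
        args = json.loads(call.function.arguments)
        result = get_weather(**args)
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
```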
Brought to You By:
• Statsig — The unified platform for flags, analytics, experiments, and more. Most teams end up in this situation: ship a feature to 10% of users, wait a week, check three different tools, try to correlate the data, and you’re still unsure if it worked. The problem is that each tool has its own user identification and segmentation logic. Statsig solved this problem by building everything within a unified platform. Check out Statsig.
• Linear – The system for modern product development. In the episode, Armin talks about how he uses an army of “AI interns” at his startup. With Linear, you can easily do the same: Linear’s Cursor integration lets you add Cursor as an agent to your workspace. This agent then works alongside you and your team to make code changes or answer questions. You’ve got to try it out: give Linear a spin and see how it integrates with Cursor.
—
Armin Ronacher is the creator of the Flask framework for Python, was one of the first engineers hired at Sentry, and is now the co-founder of a new startup. He has spent his career thinking deeply about how tools shape the way we build software. In this episode of The Pragmatic Engineer Podcast, he joins me to talk about how programming languages compare, why Rust may not be ideal for early-stage startups, and how AI tools are transforming the way engineers work. Armin shares his view on what continues to make certain languages worth learning, and how agentic coding is driving people to work more, sometimes to their own detriment.

We also discuss:
• Why the Python 2 to 3 migration was more challenging than expected
• How Python, Go, Rust, and TypeScript stack up for different kinds of work
• How AI tools are changing the need for unified codebases
• What Armin learned about error handling from his time at Sentry
• And much more

Jump to interesting parts:
• (06:53) How Python, Go, and Rust stack up and when to use each one
• (30:08) Why Armin has changed his mind about AI tools
• (50:32) How important are language choices from an error-handling perspective?
—
Timestamps
(00:00) Intro
(01:34) Why the Python 2 to 3 migration created so many challenges
(06:53) How Python, Go, and Rust stack up and when to use each one
(08:35) The friction points that make Rust a bad fit for startups
(12:28) How Armin thinks about choosing a language for building a startup
(22:33) How AI is impacting the need for unified code bases
(24:19) The use cases where AI coding tools excel
(30:08) Why Armin has changed his mind about AI tools
(38:04) Why different programming languages still matter but may not in an AI-driven future
(42:13) Why agentic coding is driving people to work more and why that’s not always good
(47:41) Armin’s error-handling takeaways from working at Sentry
(50:32) How important is language choice from an error-handling perspective
(56:02) Why the current SDLC still doesn’t prioritize error handling
(1:04:18) The challenges language designers face
(1:05:40) What Armin learned from working in startups and who thrives in that environment
(1:11:39) Rapid fire round
—
The Pragmatic Engineer deepdives relevant for this episode:
— Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
Immersive, hands-on training: learn to build and deploy an AI that predicts the price of a car by working with data, building a regression model, and putting it into production with Python, TensorFlow, PyTorch, Flask, and Ngrok.
Immersive, practice-oriented machine learning training: learn to build and deploy an AI that predicts the price of a car; working with data; building a regression model; putting it into production with Python, TensorFlow, PyTorch, Flask, and Ngrok.
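A compressed sketch of the workflow these two workshops describe, assuming TensorFlow/Keras for the model and a toy dataset (the feature names and numbers are illustrative):

```python
import numpy as np
import tensorflow as tf
from flask import Flask, request, jsonify

# Toy training data: [age_years, mileage_10k_km] -> price in EUR.
X = np.array([[1, 2], [3, 6], [5, 10], [8, 15]], dtype="float32")
y = np.array([25000, 18000, 12000, 7000], dtype="float32")

# Tiny regression model, trained in a few seconds.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(2,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=200, verbose=0)

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    features = np.array([[payload["age_years"], payload["mileage_10k_km"]]], dtype="float32")
    price = float(model.predict(features, verbose=0)[0][0])
    return jsonify({"predicted_price": price})

if __name__ == "__main__":
    # In the workshop setting, ngrok would tunnel this local server to a public URL.
    app.run(port=5000)
```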
Share your machine learning models, create chatbots, and build and deploy insightful dashboards speedily using Taipy with this hands-on book featuring real-world application examples from multiple industries.

Free with your book: DRM-free PDF version + access to Packt's next-gen Reader

Key Features
• Create visually compelling, interactive data applications with Taipy
• Bring predictive models to end users and create data pipelines to compare scenarios with what-if analyses
• Go beyond prototypes to build and deploy production-ready applications using the cloud provider of your choice
• Purchase of the print or Kindle book includes a free PDF eBook in full color

Book Description
While data analysts, data scientists, and BI experts have the tools to analyze data, build models, and create compelling visuals, they often struggle to translate these insights into practical, user-friendly applications that help end users answer real-world questions, such as identifying revenue trends, predicting inventory needs, or detecting fraud, without wading through complex code. This book is a comprehensive guide to overcoming this challenge. It teaches you how to use Taipy, a powerful open-source Python library, to build intuitive, production-ready data apps quickly and efficiently. Instead of creating prototypes that nobody uses, you'll learn how to build faster applications that process large amounts of data for multiple users and deliver measurable business impact. Taipy does the heavy lifting to enable your users to visualize their KPIs, interact with charts and maps, and compare scenarios for better decision-making. You’ll learn to use Taipy to build apps that make your data accessible and actionable in production environments like the cloud or Docker. By the end of this book, you won’t just understand Taipy, you'll be able to transform your data skills into impactful solutions that address real-world needs and deliver valuable insights. Email sign-up and proof of purchase required.

What you will learn
• Explore Taipy, its use cases, and how it's different from other projects
• Discover how to create visually appealing interactive apps, display KPIs, charts, and maps
• Understand how to compare scenarios to make better decisions
• Connect Taipy applications to several data sources and services
• Develop apps for diverse use cases, including chatbots, dashboards, ML apps, and maps
• Deploy Taipy applications on different types of servers and services
• Master advanced concepts for simplifying and accelerating your development workflow

Who this book is for
If you’re a data analyst, data scientist, or BI analyst looking to build production-ready data apps entirely in Python, this book is for you. If your scripts and models sit idle because non-technical stakeholders can’t use them, this book shows you how to turn them into full applications fast with Taipy, so your work delivers real business value. It’s also valuable for developers and engineers who want to streamline their data workflows and build UIs in pure Python.
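For a sense of what the book's subject looks like in code, here is a minimal Taipy GUI sketch, assuming the taipy package is installed; the page content and variable are illustrative and not taken from the book:

```python
from taipy.gui import Gui

price = 50

# Taipy pages are written in an augmented Markdown syntax; <|...|> blocks
# bind Python variables to visual controls.
page = """
# Demo dashboard

Select a price: <|{price}|slider|min=0|max=100|>

Current value: <|{price}|text|>
"""

if __name__ == "__main__":
    Gui(page=page).run()
```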
Summary In this episode of the Data Engineering Podcast Vijay Subramanian, founder and CEO of Trace, talks about metric trees - a new approach to data modeling that directly captures a company's business model. Vijay shares insights from his decade-long experience building data practices at Rent the Runway and explains how the modern data stack has led to a proliferation of dashboards without a coherent way for business consumers to reason about cause, effect, and action. He explores how metric trees differ from and interoperate with other data modeling approaches, serve as a backend for analytical workflows, and provide concrete examples like modeling Uber's revenue drivers and customer journeys. Vijay also discusses the potential of AI agents operating on metric trees to execute workflows, organizational patterns for defining inputs and outputs with business teams, and a vision for analytics that becomes invisible infrastructure embedded in everyday decisions.
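To make the idea more concrete, here is a rough illustration (not Trace's implementation) of a metric tree as a data structure: each node is a metric whose value rolls up from its children, so questions about cause and effect become tree traversals. The ride-hailing example is hypothetical:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class MetricNode:
    name: str
    value: float = 0.0
    children: List["MetricNode"] = field(default_factory=list)
    combine: Callable[[List[float]], float] = sum  # how child metrics roll up

    def compute(self) -> float:
        # Leaf nodes keep their observed value; parents derive theirs from children.
        if self.children:
            self.value = self.combine([child.compute() for child in self.children])
        return self.value

# Hypothetical ride-hailing business model: revenue = completed rides * average fare.
rides = MetricNode("completed_rides", value=1_000_000)
avg_fare = MetricNode("average_fare", value=12.5)
revenue = MetricNode(
    "gross_revenue",
    children=[rides, avg_fare],
    combine=lambda vals: vals[0] * vals[1],
)
print(revenue.compute())  # -> 12500000.0
```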
Announcements

• Hello and welcome to the Data Engineering Podcast, the show about modern data management.
• Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
• Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
• Your host is Tobias Macey and today I'm interviewing Vijay Subramanian about metric trees and how they empower more effective and adaptive analytics.

Interview

• Introduction
• How did you get involved in the area of data management?
• Can you describe what metric trees are and their purpose?
• How do metric trees relate to metric/semantic layers?
• What are the shortcomings of existing data modeling frameworks that prevent effective use of those assets?
• How do metric trees build on top of existing investments in dimensional data models?
• What are some strategies for engaging with the business to identify metrics and their relationships?
• What are your recommendations for storage, representation, and retrieval of metric trees?
• How do metric trees fit into the overall lifecycle of organizational data workflows?
• Creating any new data asset introduces overhead of maintenance, monitoring, and evolution. How do metric trees fit into the existing testing and validation frameworks that teams rely on for dimensional modeling?
• What are some of the key differences in useful evaluation/testing that teams need to develop for metric trees?
• How do metric trees assist in context engineering for AI-powered self-serve access to organizational data?
• What are the most interesting, innovative, or unexpected ways that you have seen metric trees used?
• What are the most interesting, unexpected, or challenging lessons that you have learned while working on metric trees and operationalizing them at Trace?
• When is a metric tree the wrong abstraction?
• What do you have planned for the future of Trace and applications of metric trees?

Contact Info

• LinkedIn

Parting Question

• From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

• Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
• Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
• If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

• Metric Tree
• Trace
• Modern Data Stack
• Hadoop
• Vertica
• Luigi
• dbt
• Ralph Kimball
• Bill Inmon
• Metric Layer
• Dimensional Data Warehouse
• Master Data Management
• Data Governance
• Financial P&L (Profit and Loss)
• EBITDA = Earnings before interest, taxes, depreciation and amortization

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA