talk-data.com

Topic: Data Science

Tags: machine_learning, statistics, analytics

1516 tagged activities

Activity Trend: peak of 68 activities per quarter, 2020-Q1 to 2026-Q1

Activities

1516 activities · Newest first

Shifting Left — Setting up Your GenAI Ecosystem to Work for Business Analysts

At Data + AI Summit in 2022, Databricks pioneered the term “shift left” to describe how AI workloads would enable less data-science-driven people to create their own apps. In 2025, we take a look at how Experian is doing on that journey. This session highlights Databricks services that support the shift-left paradigm for generative AI, including how AI/BI Genie helps with generative analytics, and how Agent Studio helps with synthetic generation of test cases to validate model performance.

FinOps: Automated Unity Catalog Cost Observability, Data Isolation and Governance Framework

Westat, a leader in data-driven research for more than 60 years, has implemented a centralized Databricks platform to support hundreds of research projects for government, foundations, and private clients. This initiative modernizes Westat’s technical infrastructure while maintaining rigorous statistical standards and streamlining data science. The platform enables isolated project environments with strict data boundaries, centralized oversight, and regulatory compliance. It allows project-specific customization of compute and analytics, and delivers scalable computing for complex analyses. Key features include config-driven Infrastructure as Code (IaC) with Terragrunt, custom tagging and AWS cost integration for ROI tracking, budget policies with alerts for proactive cost management, and a centralized dashboard with row-level security for self-service cost analytics. This unified approach provides full financial visibility and governance while empowering data teams to deliver value. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.

AI Powering Epsilon's Identity Strategy: Unified Marketing Platform on Databricks

Join us to hear about how Epsilon Data Management migrated Epsilon’s unique, AI-powered marketing identity solution from multi-petabyte on-prem Hadoop and data warehouse systems to a unified Databricks Lakehouse platform. This transition enabled Epsilon to further scale its Decision Sciences solution and enable new cloud-based AI research capabilities on time and within budget, without being bottlenecked by the resource constraints of on-prem systems. Learn how Delta Lake, Unity Catalog, MLflow and LLM endpoints powered massive data volume, reduced data duplication, improved lineage visibility, accelerated Data Science and AI, and enabled new data to be immediately available for consumption by the entire Epsilon platform in a privacy-safe way. Using the Databricks platform as the base for AI and Data Science at global internet scale, Epsilon deploys marketing solutions across multiple cloud providers and multiple regions for many customers.

From Overwhelmed to Empowered: How SAP is Democratizing Data & AI with Databricks to Solve Problems

Scaling the adoption of Data & AI within enterprises is critical for driving transformative business outcomes. Learn how the SAP Experience Garage, SAP’s largest internal enablement and innovation driver, is turning all employees into data enthusiasts through the integration of Databricks technologies. The SAP Experience Garage platform brings together colleagues with varying levels of data knowledge and skills in one seamless space. Here, they can explore and use tangible datasets and data science/AI tooling from Databricks, enablement capabilities, and collaborative features to tackle real business challenges and create prototypes that find their way into SAP’s ecosystem.

Industrial organizations are unlocking new possibilities through the partnership between AVEVA and Databricks. The seamless, no-code, zero-copy solution—powered by Delta Sharing and CONNECT—enables companies to combine IT and OT data effortlessly. By bridging the gap between operational and enterprise data, businesses can harness the power of AI, data science, and business intelligence at an unprecedented scale to drive innovation. In this session, explore real-world applications of this integration, including how industry leaders are using CONNECT and Databricks to boost efficiency, reduce costs, and advance sustainability—all without fragmented point solutions. You’ll also see a live demo of the integration, showcasing how secure, scalable access to trusted industrial data is enabling new levels of industrial intelligence across sectors like mining, manufacturing, power, and oil and gas.

AI/BI Driving Speed to Value in Supply Chain

Conagra is a global food manufacturer with $12.2B in revenue, 18K+ employees, and 45+ plants in the US, Canada and Mexico. Conagra's Supply Chain organization is heavily focused on delivering results in productivity, waste reduction, inventory rationalization, safety and customer service levels. By migrating the Supply Chain reporting suite to Databricks over the past 2 years, Conagra's Supply Chain Analytics & Data Science team has been able to deliver new AI solutions which complement traditional BI platforms and lay the foundation for additional AI/ML applications in the future. With Databricks Genie integrated within traditional BI reports, Conagra Supply Chain users can now go from insight to action faster and with fewer clicks, enabling speed to value in a complex Supply Chain. The Databricks platform also allows the team to curate data products to be consumed by traditional BI applications today, as well as the ability to rapidly scale for the AI/ML applications of tomorrow.

Sponsored by: EY | Navigating the Future: Knowledge-Powered Insights on AI, Information Governance, Real-Time Analytics

In an era where data drives strategic decision-making, organizations must adapt to the evolving landscape of business analytics. This session will focus on three pivotal themes shaping the future of data management and analytics in 2025. Join our panel of experts, including a Business Analytics Leader, Head of Information Governance, and Data Science Leader, as they explore:

- Knowledge-Powered AI: Discover trends in Knowledge-Powered AI and how these initiatives can revolutionize business analytics, with real-world examples of successful implementations.
- Information Governance: Explore the role of information governance in ensuring data integrity and compliance. Our experts will discuss strategies for establishing robust frameworks that protect organizational assets.
- Real-Time Analytics: Understand the importance of real-time analytics in today’s fast-paced environment. The panel will highlight how organizations can leverage real-time data for agile decision-making.

Boosting Data Science and AI Productivity With Databricks Notebooks

This session is repeated. Want to accelerate your team's data science workflow? This session reveals how Databricks Notebooks can transform your productivity through an optimized environment designed specifically for data science and AI work. Discover how notebooks serve as a central collaboration hub where code, visualizations, documentation and results coexist seamlessly, enabling faster iteration and development. Key takeaways:

- Leveraging interactive coding features including multi-language support, command-mode shortcuts and magic commands
- Implementing version control best practices through Git integration and notebook revision history
- Maximizing collaboration through commenting, sharing and real-time co-editing capabilities
- Streamlining ML workflows with built-in MLflow tracking and experiment management

You'll leave with practical techniques to enhance your notebook-based workflow and deliver AI projects faster with higher-quality results.

Today, I’m responding to a listener's question about what it takes to succeed as a data or AI product manager, especially if you’re coming from roles like design/BI/data visualization, data science/engineering, or traditional software product management. This listener correctly observed that most of my content “seems more targeted at senior leadership” — and had asked if I could address this more IC-oriented topic on the show. I’ll break down why technical chops alone aren’t enough, and how user-centered thinking, business impact, and outcome-focused mindsets are key to real success — and where each of these prior roles brings strengths and/or weaknesses. I’ll also get into the evolving nature of PM roles in the age of AI, and what I think the super-powered AI product manager will look like.

Highlights/ Skip to:

- Who can transition into an AI and data product management role? What does it take? (5:29)
- Software product managers moving into AI product management (10:05)
- Designers moving into data/AI product management (13:32)
- Moving into the AI PM role from the engineering side (21:47)
- Why the challenge of user adoption and trust is often the blocker to the business value (29:56)
- Designing change management into AI/data products as a skill (31:26)
- The challenge of value creation vs. delivery work — and how incentives are aligned for ICs (35:17)
- Quantifying the financial value of data and AI product work (40:23)

Quotes from Today’s Episode

“Who can transition into this type of role, and what is this role? I’m combining these two things. AI product management often seems closely tied to software companies that are primarily leveraging AI, or trying to, and therefore, they tend to utilize this AI product management role. I’m seeing less of that in internal data teams, where you tend to see data product management more, which, for me, feels like an umbrella term that may include traditional analytics work, data platforms, and often AI and machine learning. I’m going to frame this more in the AI space, primarily because I think AI product management tends to capture the end-to-end product more frequently than data product management does.” — Brian (2:55)

“There are three disciplines I’m going to talk about moving into this role. Coming into AI and data PM from design and UX, coming into it from data engineering (or just broadly technical spaces), and then coming into it from software product management. I think software product management and moving into the AI product management - as long as you’re not someone that has two years of experience, and then 18 years of repeating the second year of experience over and over again - and you’ve had a robust product management background across some different types of products; you can show that the domain doesn’t necessarily stop you from producing value. I think you will have the easiest time moving into AI product management because you’ve shown that you can adapt across different industries.” - Brian (9:45)

“Let’s talk about designers next. I’m going to include data visualization, user experience research, user experience design, product design, all those types of broad design, category roles. Moving into data and/or AI product management, first of all, you don’t see too many—I don’t hear about too many designers wanting to move into DPM roles, because oftentimes I don’t think there’s a lot of heavy UI and UX all the time in that space. Or at least the teams that are doing that work feel that’s somebody else’s job because they’re not doing end-to-end product thinking the way I talk about it, so therefore, a lot of times they don’t see the application, the user experience, the human adoption, the change management, they’re just not looking at the world that way, even though I think they should be.” - Brian (13:32)

“Coming at this from the data and engineering side, this is the classic track for data product management. At least that is the way I tend to see it. I believe most companies prefer to develop this role in-house. My biggest concern is that you end up with job title changes, but not necessarily the benefits that are supposed to come with this. I do like learning by doing, but having a coach and someone senior who can coach your other PMs is important because there’s a lot of information that you won’t necessarily get in a class or a course. It’s going to come from experience doing the work.” - Brian (22:26)

“This value piece is the most important thing, and I want to focus on that. This is something I frequently discuss in my training seminar: how do we attach financial value to the work we’re doing? This is both art and science, but it’s a language that anyone in a product management role needs to be comfortable with. If you’re finding it very hard to figure out how your data product contributes financial value because it’s based on this waterfalling of “We own the model, and it’s deployed on a platform.” The platform then powers these other things, which in turn power an application. How do we determine the value of our tool? These things are challenging, and if it’s challenging for you, guess how hard it will be for stakeholders downstream if you haven’t had the practice and the skills required to understand how to estimate value, both before we build something as well as after?” - Brian (31:51)

“If you don’t want to spend your time getting to know how your business makes money or creates value, then [AI and data product management work] is not for you. It’s just not. I would stay doing what you’re doing already or find a different thing because a lot of your time is going to be spent “managing up” for half the time, and then managing the product stuff “down.” Then, sitting in this middle layer, trying to explain to the business what’s going to come out and what the impact is going to be, in language that they care about and understand. You can't be talking about models, model accuracy, data pipelines, and all that stuff. They’re not going to care about any of that.” - Brian (34:08)

Polars, DuckDB, PySpark, PyArrow, pandas, cuDF: how Narwhals has brought them all together!

Suppose you want to write a data science tool to do feature engineering. Your experience may go like this:

- Expectation: you can focus on state-of-the-art techniques for feature engineering.
- Reality: you keep having to make your codebase more complex because a new dataframe library has come out and users are demanding support for it.

Or rather, it might have gone like that in the pre-Narwhals era. Because now, you can focus on solving the problems which your tool set out to do, and let Narwhals handle the subtle differences between different kinds of dataframe inputs!

Learn Python for Data Science in this Beginners’ Day Workshop

Would you like to learn to code but don’t know where to start? Taking your first steps in programming can seem like an impossible task, so we’ve decided to put on a workshop to show beginners how it can be done and share our passion for the world of data science!

Apply to be a student https://forms.gle/2cvNyRK8c8pNnpnz5

Analysing smart meter data to uncover energy consumption patterns

Smart meters have the potential not only to provide information to individual householders about their energy consumption, but also to identify patterns of usage across the entire energy system. At Nesta, we have been analysing smart meter data to uncover information about energy consumption habits, and how household appliances, physical property characteristics and demographic factors influence energy usage, as this can help develop energy-saving initiatives. In this talk we will present the data science techniques we used (such as clustering), share our results, discuss how we translate them for a non-data-science audience, and share learnings from conducting data science work in a secure data lab that allows analysis of sensitive and confidential data.
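The clustering technique mentioned above can be sketched as follows, using synthetic daily load profiles rather than Nesta's data (the two usage habits and all parameters here are invented for illustration):

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic hourly load profiles: 24 readings per household, two usage habits
rng = np.random.default_rng(42)
night_owls = rng.normal(0.3, 0.05, size=(50, 24))
night_owls[:, 0:6] += 1.0        # heavy overnight usage
nine_to_fivers = rng.normal(0.3, 0.05, size=(50, 24))
nine_to_fivers[:, 17:22] += 1.0  # evening peak
profiles = np.vstack([night_owls, nine_to_fivers])

# Cluster households by the shape of their daily consumption
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
labels = km.labels_
```

On data this well separated, the two clusters recover the two usage habits; real smart meter work would need careful preprocessing (normalisation, seasonality) before this step.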

Successful Projects through a bit of Rebellion

This talk is for leaders who want new techniques to improve their success rates. In the last 15 months I've built a private data science peer mentorship group where we discuss rebellious ideas that improve our ability to make meaningful change in organisations of all sizes.

As a leader you've no doubt had trouble defining new projects (perhaps you've been asked - "add ChatGPT!"), getting buy-in, building support, defining defensible metrics and milestones, hiring, developing your team, dealing with conflict, avoiding overload and ultimately delivering valuable projects that are adopted by the business. I'll share advice across all of these areas based on 25 years of personal experience and the topics we've discussed in my leadership community.

You'll walk away with new ideas, perspectives and references that ought to change how you work with your team and organisation.

Platforms for valuable AI Products: Iteration, iteration, iteration

In data science, experimentation is vital: the more we can experiment, the more we can learn. However, quick iteration isn't sufficient; we also need to be able to easily promote these experiments to production to deliver value. This requires all the stability and reliability of any production system. John will discuss building platforms that treat iteration as a first-class consideration, the role of open source libraries, and balancing trade-offs.

Conquering PDFs: document understanding beyond plain text

NLP and data science could be so easy if all of our data came as clean and plain text. But in practice, a lot of it is hidden away in PDFs, Word documents, scans and other formats that have been a nightmare to work with. In this talk, I'll present a new and modular approach for building robust document understanding systems, using state-of-the-art models and the awesome Python ecosystem. I'll show you how you can go from PDFs to structured data and even build fully custom information extraction pipelines for your specific use case.
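The "modular" framing above can be illustrated with a small pipeline skeleton: a layout parser turns a document into typed spans, and independent, swappable extractors turn spans into structured data. Everything here (the `Span` type, the extractor, the sample document) is a hypothetical sketch, not the speaker's actual library; in practice the hard part, parsing PDFs into spans, would be done by a layout model.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Span:
    page: int
    kind: str   # e.g. "heading", "paragraph", "table"
    text: str

def extract_headings(spans: list[Span]) -> dict:
    # One pluggable extractor: pull out document headings
    return {"headings": [s.text for s in spans if s.kind == "heading"]}

def run_pipeline(spans: list[Span],
                 extractors: list[Callable[[list[Span]], dict]]) -> dict:
    # Each extractor contributes its own keys to the structured output
    result: dict = {}
    for extract in extractors:
        result.update(extract(spans))
    return result

# Pretend these spans came from a PDF layout parser
doc = [Span(1, "heading", "Annual Report"),
       Span(1, "paragraph", "Revenue grew this year.")]
structured = run_pipeline(doc, [extract_headings])
```

The benefit of this shape is that a new use case means writing one more extractor, not rewriting the parsing stage.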

Since the end of 2022, the AI space has reached unprecedented velocity, scale and proliferation. When it seems like everyone (and their dog) is talking about AI, how should those of us who've been working in Machine Learning, Data Science (and AI) as domain experts look to navigate the conversation? In this talk, Leanne will aim to shine a light on the impact the AI arms race is having on our field, the reality of what it means to be a practitioner and some principles to stick by to help traverse what may appear to be a time of panic.

Driven by demanding requirements around data centralization, security and access auditability, we designed a custom solution that allows Data Science teams to collaborate effectively from a centralized access point. Discover how we combined client needs and technical architecture in this session focused on practice and field feedback. A must-listen for anyone interested in putting Data Science into production at scale!