talk-data.com

Topic: AI/ML (Artificial Intelligence/Machine Learning)
Tags: data_science, algorithms, predictive_analytics
9014 activities tagged
Activity Trend: 1532 peak/qtr (2020-Q1 to 2026-Q1)

Activities

9014 activities · Newest first

What Does It Take to Optimize Every Drop Of Milk Across a 150-year-old Global Dairy Cooperative?

In this session, Joëlle van der Bijl, Chief Data & Analytics Officer at FrieslandCampina, shares the bold journey of replacing legacy data systems with a single, unified data, analytics, and AI platform built on Databricks. Rather than evolving gradually, the company took a leap: transforming its entire data foundation in one go. Today, this data-centric vision is delivering high-value impact: from optimizing milk demand and supply to enabling commercial AI prediction models and scaling responsible AI across the business. Learn how FrieslandCampina is using Databricks to blend tradition with innovation, and unlock a smarter, more sustainable future for dairy.

What’s New in Security and Compliance on the Databricks Data Intelligence Platform

In this session, we’ll walk through the latest advancements in platform security and compliance on Databricks — from networking updates to encryption, serverless security and new compliance certifications across AWS, Azure and Google Cloud. We’ll also share our roadmap and best practices for how to securely configure workloads on Databricks SQL Serverless, Unity Catalog, Mosaic AI and more — at scale. If you're building on Databricks and want to stay ahead of evolving risk and regulatory demands, this session is your guide.

What’s new with Collaboration: Delta Sharing, Clean Rooms, Marketplace and the Ecosystem

Databricks continues to redefine how organizations securely and openly collaborate on data. With new innovations like Clean Rooms for multi-party collaboration, Sharing for Lakehouse Federation, cross-platform view sharing and Databricks Apps in the Marketplace, teams can now share and access data more easily, cost-effectively and across platforms — whether or not they’re using Databricks. In this session, we’ll deliver live demos of key capabilities that power this transformation:
• Delta Sharing: The industry’s only open protocol for seamless cross-platform data sharing
• Databricks Marketplace: A central hub for discovering and monetizing data and AI assets
• Clean Rooms: A privacy-preserving solution for secure, multi-party data collaboration
Join us to see how these tools enable trusted data sharing, accelerate insights and drive innovation across your ecosystem. Bring your questions and walk away with practical ways to put these capabilities into action today.

Your Wish is AI Command — Get to Grips With Databricks Genie

Picture the scene — you're exploring a deep, dark cave looking for insights to unearth when, in a burst of smoke, Genie appears and offers you not three but unlimited data wishes. This isn't a folk tale; it's the growing wave of Generative BI that is becoming part of analytics platforms. Databricks Genie is a tool powered by a SQL-writing LLM that redefines how we interact with data. We'll look at the basics of creating a new Genie room, scoping its data tables and asking questions. We'll help it out with some complex pre-defined questions and ensure it has the best chance of success. We'll give the tool a personality, set some behavioural guidelines and prepare some hidden easter eggs for our users to discover. Generative BI is going to be a fundamental part of the analytics toolset used across businesses. If you're using Databricks, you should be aware of Genie; if you're not, you should be planning your Generative BI roadmap. Either way, this session will answer your wishes.
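
Curating "pre-defined questions" like those mentioned above amounts to collecting question-to-SQL examples that ground the SQL-writing LLM. Here is a minimal, hypothetical sketch of that idea in plain Python; the names, structure, and example tables are all invented for illustration and are not the Genie API.

```python
# Hypothetical sketch: curated few-shot examples that ground a
# SQL-writing LLM, in the spirit of Genie's pre-defined questions.
# None of these names come from the Databricks API; tables and
# questions are invented examples.

CURATED_EXAMPLES = [
    {
        "question": "What were total sales last quarter?",
        "sql": "SELECT SUM(amount) FROM sales WHERE quarter = '2024-Q4'",
    },
    {
        "question": "Which region grew fastest?",
        "sql": "SELECT region FROM growth ORDER BY yoy_pct DESC LIMIT 1",
    },
]

def build_prompt(user_question: str) -> str:
    """Assemble a grounding prompt: behavioural guidelines first,
    then curated examples, then the user's question."""
    lines = ["You answer questions by writing SQL over the scoped tables."]
    for ex in CURATED_EXAMPLES:
        lines.append(f"Q: {ex['question']}\nSQL: {ex['sql']}")
    lines.append(f"Q: {user_question}\nSQL:")
    return "\n\n".join(lines)

prompt = build_prompt("How many orders were placed this week?")
print(prompt.endswith("SQL:"))  # the model completes the final SQL
```

The design point is simply that well-chosen examples, rather than model size alone, give a tool like this "the best chance of success."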

Supported by Our Partners
• Sonar — Code quality and code security for ALL code.
• Statsig — The unified platform for flags, analytics, experiments, and more.
• Augment Code — AI coding assistant that pro engineering teams love.

Kent Beck is one of the most influential figures in modern software development. Creator of Extreme Programming (XP), co-author of The Agile Manifesto, and a pioneer of Test-Driven Development (TDD), he’s shaped how teams write, test, and think about code. Now, with over five decades of programming experience, Kent is still pushing boundaries—this time with AI coding tools. In this episode of Pragmatic Engineer, I sit down with him to talk about what’s changed, what hasn’t, and why he’s more excited than ever to code. In our conversation, we cover:
• Why Kent calls AI tools an “unpredictable genie”—and how he’s using them
• Why Kent no longer has an emotional attachment to any specific programming language
• The backstory of The Agile Manifesto—and why Kent resisted the word “agile”
• An overview of XP (Extreme Programming) and how Grady Booch played a role in the name
• Tape-to-tape experiments in Kent’s childhood that laid the groundwork for TDD
• Kent’s time at Facebook and how he adapted to its culture and use of feature flags
• And much more!
Timestamps:
(00:00) Intro
(02:27) What Kent has been up to since writing Tidy First
(06:05) Why AI tools are making coding more fun for Kent and why he compares it to a genie
(13:41) Why Kent says languages don’t matter anymore
(16:56) Kent’s current project building a Smalltalk server
(17:51) How Kent got involved with The Agile Manifesto
(23:46) Gergely’s time at JP Morgan, and why Kent didn’t like the word ‘agile’
(26:25) An overview of “extreme programming” (XP)
(35:41) Kent’s childhood tape-to-tape experiments that inspired TDD
(42:11) Kent’s response to Ousterhout’s criticism of TDD
(50:05) Why Kent still uses TDD with his AI stack
(54:26) How Facebook operated in 2011
(1:04:10) Facebook in 2011 vs. 2017
(1:12:24) Rapid fire round

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast
Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe

keynote
by Jamie Dimon (JPMorgan Chase), Kasey Uhlenhuth (Databricks), Justin DeBrabant (Databricks), Greg Ulrich (Mastercard), Richard Masters (Virgin Atlantic Airways), Ali Ghodsi (Databricks), Reynold Xin (Databricks), Nikita Shamgunov (Neon), Dario Amodei (Anthropic), Holly Smith (Databricks), Hanlin Tang (Databricks)

Be first to witness the latest breakthroughs from Databricks and share the success of innovative data and AI companies.

Data Hackers News is live!! The hottest topics of the week, with the top news in Data, AI and Technology, the same stories you'll find in our weekly newsletter, now on the Data Hackers podcast! Press play and listen to this week's Data Hackers News! To keep up with everything happening in the data world, subscribe to the weekly newsletter: https://www.datahackers.news/
Meet the Data Hackers News commentators: Monique Femme, Paulo Vasconcellos
Other Data Hackers channels: Site, LinkedIn, Instagram, TikTok, YouTube

Send us a text
This week on Making Data Simple, we welcome Ralph Gootee, CTO and co-founder of TigerEye, a company reshaping strategic sales intelligence with a data-driven edge. Ralph’s journey spans Pixar, Sony, and PlanGrid — and now he’s building tools to help sales leaders see around corners. From the secrets behind TigerEye's intuitive reporting to the realities of entrepreneurship and the affordability of LLMs, this episode hits both the business brain and the tech heart.
00:52 Meet Ralph Gootee
01:43 TigerEye
04:49 PlanGrid
07:43 Monetization
08:50 TigerEye's Objective
12:38 Reinventing Reporting
17:06 How it Works
22:21 The Secret Sauce
27:48 LLM Affordability
34:14 Last Call
38:57 The Entrepreneur Dilemma
39:47 Where to Reach TigerEye
40:07 Do Code Assistants Work?
47:32 For Fun
🔗 Connect with Ralph & TigerEye: LinkedIn: Ralph Gootee | Website: TigerEye | Blog: TigerEye Blog

#MakingDataSimple #SalesIntelligence #AIinSales #EntrepreneurMindset #LLMs #StartupLife #PixarToPipeline #DataDrivenDecisions #TechLeadership #TigerEye

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Summary In this episode of the Data Engineering Podcast Alex Albu, tech lead for AI initiatives at Starburst, talks about integrating AI workloads with the lakehouse architecture. From his software engineering roots to leading data engineering efforts, Alex shares insights on enhancing Starburst's platform to support AI applications, including an AI agent for data exploration and using AI for metadata enrichment and workload optimization. He discusses the challenges of integrating AI with data systems, innovations like SQL functions for AI tasks and vector databases, and the limitations of traditional architectures in handling AI workloads. Alex also shares his vision for the future of Starburst, including support for new data formats and AI-driven data exploration tools.

Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
This is a pharmaceutical Ad for Soda Data Quality. Do you suffer from chronic dashboard distrust? Are broken pipelines and silent schema changes wreaking havoc on your analytics? You may be experiencing symptoms of Undiagnosed Data Quality Syndrome — also known as UDQS. Ask your data team about Soda. With Soda Metrics Observability, you can track the health of your KPIs and metrics across the business — automatically detecting anomalies before your CEO does. It’s 70% more accurate than industry benchmarks, and the fastest in the category, analyzing 1.1 billion rows in just 64 seconds. And with Collaborative Data Contracts, engineers and business can finally agree on what “done” looks like — so you can stop fighting over column names, and start trusting your data again. Whether you’re a data engineer, analytics lead, or just someone who cries when a dashboard flatlines, Soda may be right for you. Side effects of implementing Soda may include: increased trust in your metrics, reduced late-night Slack emergencies, spontaneous high-fives across departments, fewer meetings and less back-and-forth with business stakeholders, and in rare cases, a newfound love of data. Sign up today to get a chance to win a $1000+ custom mechanical keyboard. Visit dataengineeringpodcast.com/soda to sign up and follow Soda’s launch week.
It starts June 9th.
This episode is brought to you by Coresignal, your go-to source for high-quality public web data to power best-in-class AI products. Instead of spending time collecting, cleaning, and enriching data in-house, use ready-made multi-source B2B data that can be smoothly integrated into your systems via APIs or as datasets. With over 3 billion data records from 15+ online sources, Coresignal delivers high-quality data on companies, employees, and jobs. It is powering decision-making for more than 700 companies across AI, investment, HR tech, sales tech, and market intelligence industries. A founding member of the Ethical Web Data Collection Initiative, Coresignal stands out not only for its data quality but also for its commitment to responsible data collection practices. Recognized as the top data provider by Datarade for two consecutive years, Coresignal is the go-to partner for those who need fresh, accurate, and ethically sourced B2B data at scale. Discover how Coresignal's data can enhance your AI platforms. Visit dataengineeringpodcast.com/coresignal to start your free 14-day trial.
Your host is Tobias Macey and today I'm interviewing Alex Albu about how Starburst is extending the lakehouse to support AI workloads.
Interview
Introduction
How did you get involved in the area of data management?
Can you start by outlining the interaction points of AI with the types of data workflows that you are supporting with Starburst?
What are some of the limitations of warehouse and lakehouse systems when it comes to supporting AI systems?
What are the points of friction for engineers who are trying to employ LLMs in the work of maintaining a lakehouse environment?
Methods such as tool use (exemplified by MCP) are a means of bolting on AI models to systems like Trino. What are some of the ways that is insufficient or cumbersome?
Can you describe the technical implementation of the AI-oriented features that you have incorporated into the Starburst platform?
What are the foundational architectural modifications that you had to make to enable those capabilities?
For the vector storage and indexing, what modifications did you have to make to Iceberg?
What was your reasoning for not using a format like Lance?
For teams who are using Starburst and your new AI features, what are some examples of the workflows that they can expect?
What new capabilities are enabled by virtue of embedding AI features into the interface to the lakehouse?
What are the most interesting, innovative, or unexpected ways that you have seen Starburst AI features used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI features for Starburst?
When is Starburst/lakehouse the wrong choice for a given AI use case?
What do you have planned for the future of AI on Starburst?
Contact Info
LinkedIn
Parting Question
From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
Links
Starburst
Podcast Episode
AWS Athena
MCP == Model Context Protocol
LLM Tool Use
Vector Embeddings
RAG == Retrieval Augmented Generation
AI Engineering Podcast Episode
Starburst Data Products
Lance
LanceDB
Parquet
ORC
pgvector
Starburst Icehouse
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
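
Vector embeddings and retrieval augmented generation (RAG) recur throughout the interview topics above. At its core, the retrieval step is a nearest-neighbour search over embedding vectors. The sketch below shows only that core math in self-contained Python; real lakehouse implementations push this into SQL functions and table-format-backed vector indexes, and the corpus here is invented toy data.

```python
import math

# Illustrative brute-force vector search, the heart of the retrieval
# step in RAG. This only demonstrates the similarity math; production
# systems use approximate indexes over far larger corpora.

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 2):
    """Return the k document ids most similar to the query embedding."""
    scored = sorted(corpus.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 3-dimensional "embeddings"; real ones have hundreds of dimensions.
corpus = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
print(top_k([1.0, 0.05, 0.0], corpus))  # ['doc_a', 'doc_b']
```

Questions like "what did you change in Iceberg for vector storage" are really about making this search efficient at lakehouse scale, where a linear scan like the one above would be far too slow.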

Learn How the Virtue Foundation Saves Lives by Optimizing Health Care Delivery Across the Globe

The Virtue Foundation uses cutting-edge AI techniques to optimize global health care delivery and save lives. With Unity Catalog as a foundation, they are using advanced Gen AI with model serving, vector search and MLflow to radically change how they map volunteer health resources to the right locations and facilities. Audio for this session is delivered in the conference mobile app; you must bring your own headphones to listen.

Databricks as the Backbone of MLOps: From Orchestration to Inference

As machine learning (ML) models scale in complexity and impact, organizations must establish a robust MLOps foundation to ensure seamless model deployment, monitoring and retraining. In this session, we’ll share how we leverage Databricks as the backbone of our MLOps ecosystem — handling everything from workflow orchestration to large-scale inference. We’ll walk through our journey of transitioning from fragmented workflows to an integrated, scalable system powered by Databricks Workflows. You’ll learn how we built an automated pipeline that streamlines model development, inference and monitoring while ensuring reliability in production. We’ll also discuss key challenges we faced, lessons learned and best practices for organizations looking to operationalize ML with Databricks.
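
The loop described above (develop, infer, monitor, retrain) can be sketched abstractly as a pipeline whose monitoring stage gates retraining. This is a toy illustration, not Databricks Workflows code; every function name, the "model", and the drift threshold are invented stand-ins.

```python
# Toy MLOps pipeline sketch: each stage is a task, and a monitoring
# metric gates automated retraining. In a real deployment each
# function would be an orchestrated task; all names are illustrative.

DRIFT_THRESHOLD = 0.2

def train(data: list[float]) -> float:
    # Stand-in for real training: the "model" is just the data mean.
    return sum(data) / len(data)

def monitor(model: float, recent_data: list[float]) -> float:
    """Return observed drift: how far live data has moved from the model."""
    live_mean = sum(recent_data) / len(recent_data)
    return abs(live_mean - model)

def run_pipeline(train_data: list[float], live_data: list[float]):
    model = train(train_data)
    drift = monitor(model, live_data)
    retrained = drift > DRIFT_THRESHOLD
    if retrained:
        model = train(live_data)  # automated retraining task fires
    return model, retrained

model, retrained = run_pipeline([1.0, 1.0, 1.0], [2.0, 2.0])
print(model, retrained)  # large drift triggers retraining
```

The point of centralizing this in one orchestrator is exactly what the session describes: the gate between monitoring and retraining becomes a first-class, auditable step instead of a manual decision.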

Empowering Business Users With Databricks — Integrating AI/BI Genie With Microsoft Teams

In this session, we'll explore how Rooms To Go enhances organizational collaboration by integrating AI/BI Genie with Microsoft Teams. Genie enables warehouse employees and members of the sales team to interact with data using natural language, simplifying data exploration and analysis. By connecting Genie to Microsoft Teams, we bring real-time data insights directly to a user’s phone. We'll provide a comprehensive overview on setting up this integration as well as a demo of how the team uses it daily. Attendees will gain practical knowledge to implement this integration, empowering their teams to access and interact with data seamlessly within Microsoft Teams.

Enterprise Financial Crime Detection: A Lakehouse Framework for FATF, Basel III, and BSA Compliance

We will present a framework for financial crime (FinCrime) detection leveraging the Databricks lakehouse architecture, specifically how institutions can achieve both the data flexibility and the ACID transaction guarantees essential for FinCrime monitoring. The framework incorporates advanced ML models for anomaly detection, pattern recognition and predictive analytics, while maintaining the clear data lineage and audit trails required by regulatory bodies. We will also discuss specific improvements: reduced false positives, faster detection and faster regulatory reporting. We will delve into how the architecture addresses specific FATF recommendations, Basel III risk management requirements and BSA compliance obligations, particularly in transaction monitoring and SAR filing. The ability to handle structured and unstructured data while maintaining data quality and governance makes the framework particularly valuable for large financial institutions dealing with complex, multi-jurisdictional compliance requirements.
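
As a concrete illustration of the anomaly-detection component, the simplest statistical baseline flags transactions far from an account's historical behaviour. This is a toy z-score sketch with invented numbers, not the ML models in the framework itself, but it shows the shape of the problem those models refine.

```python
import statistics

# Toy anomaly detector for transaction monitoring: flag any amount
# more than 3 standard deviations from the account's history.
# Production frameworks layer far richer ML models on top of
# baselines like this; all figures below are invented.

def flag_anomalies(history: list[float], new_txns: list[float],
                   z_cutoff: float = 3.0) -> list[float]:
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return [t for t in new_txns if abs(t - mean) / stdev > z_cutoff]

history = [100.0, 120.0, 110.0, 95.0, 105.0, 115.0]
suspicious = flag_anomalies(history, [108.0, 5000.0])
print(suspicious)  # only the 5000.0 transfer is flagged for review
```

Reducing false positives, one of the improvements the session highlights, is largely about replacing crude cutoffs like this with models that understand context (counterparty, jurisdiction, transaction pattern).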

Future of Anti-Cheat With Riot Games

As online gaming evolves, so do cheating methods that exploit client-server vulnerabilities. Traditional anti-cheat, such as kernel-level drivers and runtime detections, has long been the primary defense. However, advanced cheats like Direct Memory Access (DMA) exploits and AI-powered Computer Vision (CV) hacks increasingly render client-side detection ineffective. This presentation examines the escalating arms race between cheat creators and developers, highlighting client-side limitations. With CV cheats mimicking human behavior, anti-cheat must shift toward server-side, data-driven detection. By leveraging AI, machine learning, and behavioral analytics to analyze player patterns, input anomalies, and decision inconsistencies, future solutions can move beyond static detection to adaptive security models, ensuring fair play at scale. The session will also include real-life examples from Riot Games’ anti-cheat efforts, specifically insights and case studies from the development and operation of Riot Vanguard, to illustrate how these strategies are applied in practice.
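
One simple example of the server-side behavioural signals described above: human reaction times are noisy, while aimbot-style automation tends to be both suspiciously fast and suspiciously consistent. The sketch below is a toy illustration with invented thresholds, not anything from Riot Vanguard.

```python
import statistics

# Toy server-side behavioural check: flag players whose reaction
# times are both inhumanly fast and inhumanly consistent.
# Both thresholds are invented for illustration only.

MIN_HUMAN_MEAN_MS = 150.0   # faster than this on average is suspect
MIN_HUMAN_STDEV_MS = 10.0   # less variance than this is suspect

def looks_automated(reaction_times_ms: list[float]) -> bool:
    mean = statistics.mean(reaction_times_ms)
    stdev = statistics.stdev(reaction_times_ms)
    return mean < MIN_HUMAN_MEAN_MS and stdev < MIN_HUMAN_STDEV_MS

human = [220.0, 340.0, 180.0, 290.0, 260.0]
bot = [95.0, 97.0, 96.0, 95.0, 98.0]
print(looks_automated(human), looks_automated(bot))  # False True
```

Real systems combine many such signals with ML over input anomalies and decision inconsistencies, precisely because any single heuristic like this can be mimicked once it becomes known.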

Prada has developed a sophisticated solution, leveraging Mosaic AI, to deliver an interactive, natural-language product discovery capability that improves its e-commerce search bar. The backbone is a 70B model and a vector store, which work together with additional filtering and AI solutions to suggest not only the perfect outfit for each occasion but also alternative options and similar items.

Integrating AI With Data: A Unified Strategy for Business

In the modern business landscape, AI and data strategies can no longer operate in isolation. To drive meaningful outcomes, organizations must align these critical components within a unified framework tied to overarching business objectives. This presentation explores the necessity of integrating AI and data strategies, emphasizing the importance of high-quality data, scalable architectures and robust governance. Attendees will learn three essential steps that need to be taken:
1. Recognize that AI requires the right data to succeed
2. Prioritize data quality and architecture
3. Establish strong governance practices
Additionally, the talk will highlight the cultural shift required to bridge IT and business silos, fostering roles that combine technical and business expertise. We’ll dive into specific practical steps that can be taken to ensure an organization has a cohesive and blended AI and data strategy, using specific case examples.

Maximize Retail Data Insights in Genie with Delta Sharing via Crisp’s Collaborative Commerce Platform

Crisp streamlines a brand’s data ingestion across 60+ retail sources to build a foundation of sales and inventory intelligence on Databricks. Data is normalized and analysis-ready, and integrates seamlessly with AI tools such as Databricks’ Genie and Blueprints. This session will provide an overview of the Crisp retail data platform and how our semantic layer and normalized, harmonized data sets can help drive powerful insights for supply chain, BI/analytics and data science teams.

Meet Goose, an Open Source AI Agent

goose is an open source AI agent framework that allows anyone to connect language model output to real-world action. Released in January by Block (the company made up of Square, Cash App, Afterpay, and TIDAL), its use cases range from vibe coding to connecting all of the internal apps and services an enterprise uses. It can be powered by any language model that has tool-calling capabilities. goose's modular design allows it to connect with any system through simple extensions. Built on the open Model Context Protocol (developed with Anthropic), goose transforms natural language into actions across various tools and services. Whether integrating with platforms like Jira and GitHub, or executing system commands and scripts, its plug-and-play architecture means anyone can extend goose's capabilities to suit their needs. Finally, goose has both a command line interface and a desktop app — it isn't limited to an IDE, so you can start connecting to MCP servers and building powerful agentic workflows right away.
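
The core mechanic, turning a model's structured tool call into a real action, can be sketched in a few lines. This is a generic illustration of tool dispatch, not goose's or MCP's actual interfaces; the registry decorator and the `create_ticket` tool are invented for the example.

```python
import json

# Generic tool-dispatch sketch: register callables, then route a
# model-emitted tool call (name + JSON arguments) to the right one.
# Neither goose nor MCP exposes this exact interface; it only shows
# the pattern of connecting language model output to action.

TOOLS = {}

def tool(fn):
    """Register a function as callable by the model."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def create_ticket(title: str, priority: str) -> str:
    return f"ticket created: {title} [{priority}]"

def dispatch(tool_call_json: str) -> str:
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A tool-calling model emits structured output like this; we execute it.
result = dispatch('{"name": "create_ticket", '
                  '"arguments": {"title": "Fix login bug", "priority": "high"}}')
print(result)  # ticket created: Fix login bug [high]
```

MCP standardizes how the set of available tools and their schemas are advertised to the model, which is what lets extensions be "plug-and-play" rather than hand-wired like this registry.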

One-Stop Machine Translation Solution in Game Domain From Real-Time UGC Content to In-Game Text

We present Level Infinite AI Translation, a translation engine developed by Tencent and tailored specifically for the gaming industry. The primary challenge in game machine translation (MT) lies in accurately interpreting the intricate context of game texts, effectively handling terminology, and adapting to the highly diverse translation formats and stylistic requirements across different games. Traditional MT approaches cannot effectively address these challenges due to their weak context representation and lack of common knowledge. Leveraging large language models and related technology, our engine is crafted to capture the subtleties of localized language expression while ensuring optimization for domain-specific terminology, jargon, and required formats and styles. To date, the engine has been successfully implemented in 15 international projects, translating over one billion words across 23 languages, and has demonstrated cost savings exceeding 25% for partners.
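
A common technique for the terminology handling described above is to protect glossary terms with placeholders before translation, then substitute the approved target-language terms afterwards. The sketch below is a simplified illustration of that idea; the glossary entries, placeholder format, and function names are invented, not part of the Level Infinite engine.

```python
# Simplified terminology protection for game MT: swap glossary terms
# for opaque placeholders before translation, then substitute the
# approved target-language terms. Glossary entries are invented
# English-to-French examples.

GLOSSARY = {"Mana Shield": "Bouclier de mana", "Raid Boss": "Boss de raid"}

def protect(text: str) -> tuple[str, dict[str, str]]:
    """Replace glossary terms with placeholders the MT engine
    should pass through unchanged."""
    mapping = {}
    for i, term in enumerate(GLOSSARY):
        placeholder = f"__TERM{i}__"
        if term in text:
            text = text.replace(term, placeholder)
            mapping[placeholder] = GLOSSARY[term]
    return text, mapping

def restore(translated: str, mapping: dict[str, str]) -> str:
    """Swap placeholders for the approved target-language terms."""
    for placeholder, target_term in mapping.items():
        translated = translated.replace(placeholder, target_term)
    return translated

masked, mapping = protect("Activate Mana Shield before the Raid Boss!")
# ...the masked text would pass through the MT engine here...
print(restore(masked, mapping))
```

LLM-based engines can often be prompted with the glossary directly instead, but placeholder protection remains a robust guardrail when a term must never be paraphrased.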