talk-data.com

Topic: Large Language Models (LLM)

Tags: nlp, ai, machine_learning

1405 activities tagged

Activity Trend: 158 peak/qtr, 2020-Q1 to 2026-Q1

Activities

1405 activities · Newest first

Taming the LLM Wild West: Unified Governance with Mosaic AI Gateway

Whether you're using OpenAI, Anthropic or open-source models like Meta Llama, the Mosaic AI Gateway is the central control plane across any AI model or agent. Learn how you can streamline access controls, enforce guardrails for compliance, ensure an audit trail and monitor costs across providers — without slowing down innovation. Lastly, we'll dive even deeper into how AI Gateway works with Unity Catalog to deliver a full governance story for your end-to-end AI agents across models, tools and data.

Key takeaways:

- Centrally manage governance and observability across any LLM (proprietary or open-source)
- Give developers a unified query interface to swap, experiment and A/B test across models (see the sketch below)
- Attribute costs and usage to teams for better visibility and chargebacks
- Enforce enterprise-grade compliance with guardrails and payload logging
- Ensure production reliability with load balancing and fallbacks
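As a rough illustration of the "unified query interface" idea, a minimal sketch follows. Databricks serving endpoints expose an OpenAI-compatible API, so swapping providers can be a one-line change of the endpoint name; the endpoint names and workspace URL below are hypothetical placeholders, not the session's actual setup.

```python
# Minimal sketch: one client, many providers behind gateway-managed endpoints.
# Endpoint names and the workspace URL are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    api_key="<databricks-token>",  # assumption: token-based auth
    base_url="https://<workspace>.cloud.databricks.com/serving-endpoints",
)

for endpoint in ["my-gpt-endpoint", "my-claude-endpoint", "my-llama-endpoint"]:
    # Same request shape against every provider; the gateway handles routing,
    # guardrails, payload logging and cost attribution behind the scenes.
    resp = client.chat.completions.create(
        model=endpoint,
        messages=[{"role": "user", "content": "Summarize our returns policy."}],
    )
    print(endpoint, "->", resp.choices[0].message.content[:80])
```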

Your Wish is AI Command — Get to Grips With Databricks Genie

Picture the scene — you're exploring a deep, dark cave looking for insights to unearth when, in a burst of smoke, Genie appears and offers you not three but unlimited data wishes. This isn't a folk tale; it's the growing wave of Generative BI that is going to be part of analytics platforms. Databricks Genie is a tool powered by a SQL-writing LLM that redefines how we interact with data. We'll look at the basics of creating a new Genie room, scoping its data tables and asking questions. We'll help it out with some complex pre-defined questions and ensure it has the best chance of success. We'll give the tool a personality, set some behavioural guidelines and prepare some hidden easter eggs for our users to discover. Generative BI is going to be a fundamental part of the analytics toolset used across businesses. If you're using Databricks, you should be aware of Genie; if you're not, you should be planning your Generative BI roadmap, and this session will answer your wishes.

This week on Making Data Simple, we welcome Ralph Gootee, CTO and co-founder of TigerEye, a company reshaping strategic sales intelligence with a data-driven edge. Ralph's journey spans Pixar, Sony, and PlanGrid — and now he's building tools to help sales leaders see around corners. From the secrets behind TigerEye's intuitive reporting to the realities of entrepreneurship and the affordability of LLMs, this episode hits both the business brain and the tech heart.

00:52 Meet Ralph Gootee
01:43 TigerEye
04:49 PlanGrid
07:43 Monetization
08:50 TigerEye's Objective
12:38 Reinventing Reporting
17:06 How it Works
22:21 The Secret Sauce
27:48 LLM Affordability
34:14 Last Call
38:57 The Entrepreneur Dilemma
39:47 Where to Reach TigerEye
40:07 Do Code Assistants Work?
47:32 For Fun

🔗 Connect with Ralph & TigerEye: LinkedIn: Ralph Gootee · Website: TigerEye · Blog: TigerEye Blog

#MakingDataSimple #SalesIntelligence #AIinSales #EntrepreneurMindset #LLMs #StartupLife #PixarToPipeline #DataDrivenDecisions #TechLeadership #TigerEye

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Summary In this episode of the Data Engineering Podcast Alex Albu, tech lead for AI initiatives at Starburst, talks about integrating AI workloads with the lakehouse architecture. From his software engineering roots to leading data engineering efforts, Alex shares insights on enhancing Starburst's platform to support AI applications, including an AI agent for data exploration and using AI for metadata enrichment and workload optimization. He discusses the challenges of integrating AI with data systems, innovations like SQL functions for AI tasks and vector databases, and the limitations of traditional architectures in handling AI workloads. Alex also shares his vision for the future of Starburst, including support for new data formats and AI-driven data exploration tools.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

This is a pharmaceutical ad for Soda Data Quality. Do you suffer from chronic dashboard distrust? Are broken pipelines and silent schema changes wreaking havoc on your analytics? You may be experiencing symptoms of Undiagnosed Data Quality Syndrome, also known as UDQS. Ask your data team about Soda. With Soda Metrics Observability, you can track the health of your KPIs and metrics across the business, automatically detecting anomalies before your CEO does. It's 70% more accurate than industry benchmarks, and the fastest in the category, analyzing 1.1 billion rows in just 64 seconds. And with Collaborative Data Contracts, engineers and business can finally agree on what "done" looks like, so you can stop fighting over column names and start trusting your data again. Whether you're a data engineer, analytics lead, or just someone who cries when a dashboard flatlines, Soda may be right for you. Side effects of implementing Soda may include: increased trust in your metrics, reduced late-night Slack emergencies, spontaneous high-fives across departments, fewer meetings and less back-and-forth with business stakeholders, and in rare cases, a newfound love of data. Sign up today for a chance to win a $1000+ custom mechanical keyboard. Visit dataengineeringpodcast.com/soda to sign up and follow Soda's launch week. It starts June 9th.

This episode is brought to you by Coresignal, your go-to source for high-quality public web data to power best-in-class AI products. Instead of spending time collecting, cleaning, and enriching data in-house, use ready-made multi-source B2B data that can be smoothly integrated into your systems via APIs or as datasets. With over 3 billion data records from 15+ online sources, Coresignal delivers high-quality data on companies, employees, and jobs. It is powering decision-making for more than 700 companies across AI, investment, HR tech, sales tech, and market intelligence industries. A founding member of the Ethical Web Data Collection Initiative, Coresignal stands out not only for its data quality but also for its commitment to responsible data collection practices. Recognized as the top data provider by Datarade for two consecutive years, Coresignal is the go-to partner for those who need fresh, accurate, and ethically sourced B2B data at scale. Discover how Coresignal's data can enhance your AI platforms. Visit dataengineeringpodcast.com/coresignal to start your free 14-day trial.

Your host is Tobias Macey and today I'm interviewing Alex Albu about how Starburst is extending the lakehouse to support AI workloads.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you start by outlining the interaction points of AI with the types of data workflows that you are supporting with Starburst?
- What are some of the limitations of warehouse and lakehouse systems when it comes to supporting AI systems?
- What are the points of friction for engineers who are trying to employ LLMs in the work of maintaining a lakehouse environment?
- Methods such as tool use (exemplified by MCP) are a means of bolting on AI models to systems like Trino. What are some of the ways that is insufficient or cumbersome?
- Can you describe the technical implementation of the AI-oriented features that you have incorporated into the Starburst platform?
- What are the foundational architectural modifications that you had to make to enable those capabilities?
- For the vector storage and indexing, what modifications did you have to make to Iceberg?
- What was your reasoning for not using a format like Lance?
- For teams who are using Starburst and your new AI features, what are some examples of the workflows that they can expect?
- What new capabilities are enabled by virtue of embedding AI features into the interface to the lakehouse?
- What are the most interesting, innovative, or unexpected ways that you have seen Starburst AI features used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI features for Starburst?
- When is Starburst/lakehouse the wrong choice for a given AI use case?
- What do you have planned for the future of AI on Starburst?

Contact Info

- LinkedIn

Parting Question

- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

- Starburst
- Podcast Episode
- AWS Athena
- MCP == Model Context Protocol
- LLM Tool Use
- Vector Embeddings
- RAG == Retrieval Augmented Generation
- AI Engineering Podcast Episode
- Starburst Data Products
- Lance
- LanceDB
- Parquet
- ORC
- pgvector
- Starburst Icehouse

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Generating Zero-Shot Hard-Case Hallucinations: A Synthetic and Open Data Approach

We present a novel framework for designing and inducing controlled hallucinations in long-form content generation by LLMs across diverse domains. The purpose is to create fully-synthetic benchmarks and mine hard cases for iterative refinement of zero-shot hallucination detectors. We will first demonstrate how Gretel Data Designer (now part of NVIDIA) can be used to design realistic, high-quality long-context datasets across various domains. Second, we will describe our reasoning-based approach to hard-case mining. Specifically, our methodology relies on chain-of-thought-based generation of both faithful and deceptive question-answer pairs based upon long-context samples. Subsequently, a consensus labeling & detector framework is employed to filter synthetic examples to zero-shot hard cases. The result of this process is a fully-automated system, operating under open data licenses such as Apache-2.0, for the generation of hallucinations at the edge-of-capabilities for a target LLM to detect.
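To make the consensus-filtering step concrete, here is a minimal sketch of the idea, under assumptions: each synthetic QA pair carries a ground-truth hallucination flag from the generation step, and each detector is a callable returning True when it flags a hallucination. The names are illustrative, not the authors' actual API.

```python
# Hedged sketch of consensus-based hard-case mining: keep the synthetic
# examples that most zero-shot detectors get wrong.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Example:
    context: str
    question: str
    answer: str
    is_hallucinated: bool  # ground truth injected at generation time

def mine_hard_cases(
    examples: List[Example],
    detectors: List[Callable[[Example], bool]],
    max_correct: int = 1,
) -> List[Example]:
    """Return examples that at most `max_correct` detectors classify correctly."""
    hard = []
    for ex in examples:
        n_correct = sum(det(ex) == ex.is_hallucinated for det in detectors)
        if n_correct <= max_correct:  # consensus failed -> a hard case
            hard.append(ex)
    return hard
```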

One-Stop Machine Translation Solution in the Game Domain: From Real-Time UGC Content to In-Game Text

We present Level Infinite AI Translation, a translation engine developed by Tencent and tailored specifically for the gaming industry. The primary challenge in game machine translation (MT) lies in accurately interpreting the intricate context of game texts, effectively handling terminology and adapting to the highly diverse translation formats and stylistic requirements across different games. Traditional MT approaches cannot effectively address these challenges due to their weak context representation and lack of common knowledge. Leveraging large language models and related technology, our engine is crafted to capture the subtleties of localized language expression while ensuring optimization for domain-specific terminology, jargon and required formats and styles. To date, the engine has been successfully implemented in 15 international projects, translating over one billion words across 23 languages, and has demonstrated cost savings exceeding 25% for partners.

Sponsored by: Google Cloud | Unleash the power of Gemini for Databricks

Elevate your AI initiatives on Databricks by harnessing the latest advancements in Google Cloud's Gemini models. Learn how to integrate Gemini's built-in reasoning and powerful development tools to build more dynamic and intelligent applications within your existing Databricks platform. We'll explore concrete ideas for agentic AI solutions, showcasing how Gemini can help you unlock new value from your data in Databricks.

Building Trustworthy AI at Northwestern Mutual: Guardrail Technologies and Strategies

This intermediate-level presentation will explore the various methods we've leveraged within Databricks to deliver and evaluate guardrail models for AI safety, from prompt engineering with custom-built frameworks to hosting models served from the Marketplace and beyond. We've utilized GPUs within clusters to fine-tune and run large open-source models at inference, such as Llama Guard 3.1, and to generate synthetic datasets based on questions we've received in production.

Accelerate End-to-End Multi-Agents on Databricks and DSPy

A production-ready GenAI application is more than the framework itself. As with ML, you need a unified platform to create an end-to-end workflow for production-quality applications. Below is an example of how this works on Databricks:

- Data ETL with Lakeflow Declarative Pipelines and Jobs
- Data storage for governance and access with Unity Catalog
- Code development with Notebooks
- Agent versioning and metric tracking with MLflow and Unity Catalog
- Evaluation and optimization with Mosaic AI Agent Framework and DSPy
- Hosting infrastructure with monitoring via Model Serving and AI Gateway
- Front-end apps using Databricks Apps

In this session, learn how to build agents that access all your data and models through function calling. Then, learn how DSPy enables agents to interact with each other to ensure the question is answered correctly. We will demonstrate a chatbot, powered by multiple agents, that can answer and reason about questions the base LLM does not know, including very specialized topics.
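For flavor, here is a minimal sketch (not the presenters' code) of a two-agent setup in DSPy: one ReAct agent that answers via function calling, and a second agent that checks the answer. The model name and the lookup tool are hypothetical placeholders.

```python
# Hedged sketch of a two-agent DSPy pipeline with tool use plus verification.
import dspy

dspy.configure(lm=dspy.LM("databricks/databricks-meta-llama-3-3-70b-instruct"))

def lookup_order(order_id: str) -> str:
    """Hypothetical tool: fetch order status (would query a Delta table)."""
    return f"Order {order_id}: shipped 2024-05-01"

# Agent 1: a ReAct agent that can call tools (function calling).
researcher = dspy.ReAct("question -> answer", tools=[lookup_order])

# Agent 2: a checker that verifies the first agent's answer.
class Verify(dspy.Signature):
    """Judge whether the answer actually addresses the question."""
    question: str = dspy.InputField()
    answer: str = dspy.InputField()
    is_grounded: bool = dspy.OutputField()

checker = dspy.ChainOfThought(Verify)

question = "What is the status of order 1234?"
draft = researcher(question=question)
verdict = checker(question=question, answer=draft.answer)
print(draft.answer, verdict.is_grounded)
```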

AI Meets SQL: Leverage GenAI at Scale to Enrich Your Data

This session is repeated. Integrating AI into existing data workflows can be challenging, often requiring specialized knowledge and complex infrastructure. In this session, we'll share how SQL users can leverage AI/ML to access large language models (LLMs) and traditional machine learning directly from within SQL, simplifying the process of incorporating AI into data workflows. We will demonstrate how to use Databricks SQL for natural language processing, traditional machine learning, retrieval augmented generation and more. You'll learn about best practices and see examples of solving common use cases such as opinion mining, sentiment analysis, forecasting and other common AI/ML tasks.
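As a small taste of the pattern, here is a hedged sketch of calling Databricks AI functions from SQL, run from a Databricks notebook where `spark` is predefined. The table, columns and endpoint name are hypothetical; `ai_analyze_sentiment` and `ai_query` are Databricks SQL functions, but check your workspace and runtime for availability.

```python
# Hedged sketch: enrich rows with sentiment and an LLM summary, in pure SQL.
reviews_with_ai = spark.sql("""
    SELECT
      review_id,
      review_text,
      ai_analyze_sentiment(review_text) AS sentiment,
      ai_query(
        'databricks-meta-llama-3-3-70b-instruct',  -- serving endpoint (assumed)
        CONCAT('Summarize this review in one sentence: ', review_text)
      ) AS summary
    FROM main.analytics.product_reviews
""")
reviews_with_ai.show(truncate=False)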

RecSys, Topic Modeling and Agents: Bridging the GenAI-Traditional ML Divide

The rise of GenAI has led to a complete reinvention of how we conceptualize Data + AI. In this breakout, we will recontextualize the rise of GenAI in traditional ML paradigms, and hopefully unite the pre- and post-LLM eras. We will demonstrate when and where GenAI may prove more effective than traditional ML algorithms, and highlight problems for which the wheel is unnecessarily being reinvented with GenAI. This session will also highlight how MLflow provides a unified means of benchmarking traditional ML against GenAI, and lay out a vision for bridging the divide between Traditional ML and GenAI practitioners.
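On the benchmarking point, a hedged sketch (MLflow 2.x-style API) of scoring an LLM-backed predictor on the same labeled data you would use for a traditional classifier follows; the column names and the toy predictor are illustrative, not the speakers' setup.

```python
# Hedged sketch: mlflow.evaluate can score a plain callable against labels,
# so traditional-ML and GenAI approaches get comparable metrics.
import mlflow
import pandas as pd

eval_df = pd.DataFrame({
    "inputs": ["great product", "broken on arrival"],
    "label": [1, 0],
})

def llm_predict(df: pd.DataFrame) -> list:
    # Stand-in for an LLM call mapping text -> 0/1 sentiment.
    return [1 if "great" in t else 0 for t in df["inputs"]]

result = mlflow.evaluate(
    model=llm_predict,
    data=eval_df,
    targets="label",
    model_type="classifier",
)
print(result.metrics)  # accuracy, F1, etc., comparable across approaches
```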

Curious to know how Adidas is transforming customer experience and business impact with agentic workflows, powered by Databricks? By leveraging cutting-edge tools like MosaicML's deployment capabilities, Mosaic AI Gateway, and MLflow, Adidas built a scalable GenAI agentic infrastructure that delivers actionable insights from a review base growing by 2 million product reviews annually. The results are remarkable:

- 60% latency reduction (15.5 seconds to 6 seconds)
- 91.67% cost savings (transitioning to more efficient LLMs)
- 98.5% token efficiency, reducing input tokens from 200k to just 3k
- 20% increase in productivity (faster time to insight)

Empowering over 500 decision-makers across 150+ countries, this infrastructure is set to optimize products and services for Adidas' 500 million members by 2025 while supporting dozens of upcoming AI-driven solutions. Join us to explore how Adidas turned agentic workflow infrastructure into a strategic advantage using Databricks and learn how you can do the same!

AI Powering Epsilon's Identity Strategy: Unified Marketing Platform on Databricks

Join us to hear about how Epsilon Data Management migrated Epsilon’s unique, AI-powered marketing identity solution from multi-petabyte on-prem Hadoop and data warehouse systems to a unified Databricks Lakehouse platform. This transition enabled Epsilon to further scale its Decision Sciences solution and enable new cloud-based AI research capabilities on time and within budget, without being bottlenecked by the resource constraints of on-prem systems. Learn how Delta Lake, Unity Catalog, MLflow and LLM endpoints powered massive data volume, reduced data duplication, improved lineage visibility, accelerated Data Science and AI, and enabled new data to be immediately available for consumption by the entire Epsilon platform in a privacy-safe way. Using the Databricks platform as the base for AI and Data Science at global internet scale, Epsilon deploys marketing solutions across multiple cloud providers and multiple regions for many customers.

Moody's AI Screening Agent: Automating Compliance Decisions

The AI Screening Agent automates the Level 1 (L1) screening process, essential for Know Your Customer (KYC) and compliance due diligence during customer onboarding. This system aims to minimize false positives, significantly reducing human review time and costs. Beyond typical Retrieval-Augmented Generation (RAG) applications like summarization and chat-with-your-data (CWYD), the AI Screening Agent employs a ReAct architecture with intelligent tools, enabling it to perform complex compliance decision-making with human-like accuracy and greater consistency. In this talk, I will explore the screening agent architecture, demonstrating its ability to meet evolving client policies. I will discuss evaluation and configuration management using MLflow LLM-as-judge and Unity Catalog, as well as challenges such as data fidelity and customization. This session underscores the transformative potential of AI agents in compliance workflows, emphasizing their adaptability, accuracy, and consistency.
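To ground the architecture, here is a minimal, generic sketch of the ReAct pattern the abstract describes (not Moody's implementation): the LLM alternates reasoning with tool calls until it emits an escalate/dismiss decision. The screening tools and the `llm` callable are hypothetical placeholders.

```python
# Hedged sketch of a ReAct screening loop with compliance tools.
import json
from typing import Callable, Dict

def search_watchlists(name: str) -> str:
    """Hypothetical tool: query sanctions/watchlist hits for an entity."""
    return json.dumps([{"name": name, "list": "OFAC", "score": 0.41}])

def compare_identifiers(payload: str) -> str:
    """Hypothetical tool: compare DOB/nationality between customer and hit."""
    return json.dumps({"dob_match": False, "nationality_match": True})

TOOLS: Dict[str, Callable[[str], str]] = {
    "search_watchlists": search_watchlists,
    "compare_identifiers": compare_identifiers,
}

def screen(customer: str, llm: Callable[[str], dict], max_steps: int = 5) -> str:
    """Alternate LLM reasoning and tool calls until a decision is emitted."""
    transcript = f"Screen customer: {customer}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # expected: {"thought","action","input"} or {"decision"}
        if "decision" in step:
            return step["decision"]  # e.g. "dismiss: false positive"
        observation = TOOLS[step["action"]](step["input"])
        transcript += (f"Thought: {step['thought']}\nAction: {step['action']}\n"
                       f"Observation: {observation}\n")
    return "escalate: needs human review"
```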

Gaining Insight From Image Data in Databricks Using Multi-Modal Foundation Model API

Unlock the hidden potential in your image data without specialized computer vision expertise! This session explores how to leverage Databricks' multi-modal Foundation Model APIs to analyze, classify and extract insights from visual content. Learn how Databricks provides a unified API to understand images using powerful foundation models within your data workflows. Key takeaways: Implementing efficient workflows for image data processing within your Databricks lakehouse Understanding multi-modal foundation models for image understanding Integrating image analysis with other data types for business insights Using OpenAI-compatible APIs to query multi-modal models Building end-to-end pipelines from image ingestion to model deployment Whether analyzing product images, processing visual documents or building content moderation systems, you'll discover how to extract valuable insights from your image data within the Databricks ecosystem.
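As a minimal sketch of the "OpenAI-compatible APIs" takeaway, the snippet below sends a base64-encoded image to a vision-capable serving endpoint. The endpoint name and workspace URL are hypothetical placeholders, not a documented configuration.

```python
# Hedged sketch: query a multi-modal endpoint through an OpenAI-compatible API.
import base64
from openai import OpenAI

client = OpenAI(
    api_key="<databricks-token>",
    base_url="https://<workspace>.cloud.databricks.com/serving-endpoints",
)

with open("product_photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="my-multimodal-endpoint",  # assumption: a vision-capable endpoint
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Classify the product category and list visible defects."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```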

Accelerating Model Development and Fine-Tuning on Databricks with TwelveLabs

Scaling large language models (LLMs) and multimodal architectures requires efficient data management and computational power. The NVIDIA NeMo Framework with Megatron-LM on Databricks is an open-source solution that integrates GPU acceleration and advanced parallelism with the Databricks Delta Lakehouse, streamlining workflows for pre-training and fine-tuning models at scale. This session highlights context parallelism, a unique NeMo capability for parallelizing over sequence lengths, making it ideal for video datasets with large embeddings. Through the case study of TwelveLabs' Pegasus-1 model, learn how NeMo empowers scalable multimodal AI development, from text to video processing, setting a new standard for LLM workflows.
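For orientation, a hedged sketch of what enabling context parallelism looks like in a NeMo-style configuration follows; it shards activations along the sequence dimension across GPUs. The exact config paths vary by NeMo version, so treat the values and keys as illustrative rather than the presenters' configuration.

```python
# Hedged sketch of a NeMo/Megatron-style parallelism config via OmegaConf.
from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "model": {
        "tensor_model_parallel_size": 2,    # split weights within a layer
        "pipeline_model_parallel_size": 1,  # no pipeline stages in this sketch
        "context_parallel_size": 4,         # split long sequences across 4 GPUs
        "encoder_seq_length": 131072,       # long-context video-embedding inputs
    }
})
print(OmegaConf.to_yaml(cfg))
```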

At Zillow, we have increased the volume and quality of our dashboards by adopting a modern SDLC with version control and CI/CD. In the past three months, we have released 32 production-grade dashboards and shared them securely across the organization while cutting error rates in half over that span. In this session, we will provide an overview of how we use Databricks Asset Bundles and GitLab CI/CD to create performant dashboards that can be confidently used for mission-critical operations. As a concrete example, we'll then explore how Zillow's Data Platform team used this approach to automate our on-call support analysis, leveraging our dashboard development strategy alongside Databricks LLM offerings to create a comprehensive view that provides actionable performance metrics alongside AI-generated insights and action items from the hundreds of requests that make up our support workload.

AI Agents in Action: Structuring Unstructured Data on Demand With Databricks and Unstructured

LLM agents aren’t just answering questions — they’re running entire workflows. In this talk, we’ll show how agents can autonomously ingest, process and structure unstructured data using Unstructured, with outputs flowing directly into Databricks. Powered by the Model Context Protocol (MCP), agents can interface with Unstructured’s full suite of capabilities — discovering documents across sources, building ephemeral workflows and exporting structured insights into Delta tables. We’ll walk through a demo where an agent responds to a natural language request, dynamically pulls relevant documents, transforms them into usable data and surfaces insights — fast. Join us for a sneak peek into the future of AI-native data workflows, where LLMs don’t just assist — they operate.
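To illustrate the MCP plumbing behind this kind of workflow, here is a hedged sketch that exposes a document-structuring step as an MCP tool an agent host can call. It uses the reference Python SDK's FastMCP helper; the partitioning logic is a hypothetical stand-in, not Unstructured's actual MCP server.

```python
# Hedged sketch: a tiny MCP server exposing one document-processing tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("doc-structurer")

@mcp.tool()
def structure_document(path: str) -> list[dict]:
    """Parse a document into structured elements (hypothetical stand-in)."""
    # A real workflow would run a partitioning/ETL pipeline and land rows
    # in a Delta table; here we return a dummy element for illustration.
    return [{"type": "NarrativeText", "text": f"contents of {path}"}]

if __name__ == "__main__":
    mcp.run()  # serve over stdio so an agent host can connect
```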

AT&T AutoClassify: Unified Multi-Head Binary Classification From Unlabeled Text

We present AT&T AutoClassify, built jointly by AT&T's Chief Data Office (CDO) and Databricks Professional Services: a novel end-to-end system for automatic multi-head binary classification from unlabeled text data. Our approach automates the challenge of creating labeled datasets and training multi-head binary classifiers with minimal human intervention. Starting only from a corpus of unlabeled text and a list of desired labels, AT&T AutoClassify leverages advanced natural language processing techniques to automatically mine relevant examples from raw text, fine-tune embedding models and train individual classifier heads for multiple true/false labels. This solution can reduce LLM classification costs by 1,000x, making it highly efficient in operational cost. The end result is a highly optimized, low-cost model, servable in Databricks, capable of taking raw text and producing multiple binary classifications. An example use case based on call transcripts will be examined.
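The multi-head idea itself is simple: one shared (optionally fine-tuned) embedding model feeds many independent binary heads, so adding a label is cheap and no per-request LLM call is needed. Below is a minimal sketch of that structure, not AT&T's actual model; the dimensions and label names are illustrative.

```python
# Hedged sketch: shared text embedding, one tiny binary head per label.
import torch
import torch.nn as nn

class MultiHeadBinaryClassifier(nn.Module):
    def __init__(self, embed_dim: int, labels: list):
        super().__init__()
        self.labels = labels
        # One linear head per true/false label, all on the same embedding.
        self.heads = nn.ModuleDict({name: nn.Linear(embed_dim, 1) for name in labels})

    def forward(self, embeddings: torch.Tensor) -> dict:
        return {name: torch.sigmoid(head(embeddings)).squeeze(-1)
                for name, head in self.heads.items()}

# Usage with precomputed sentence embeddings (e.g. 384-dim):
model = MultiHeadBinaryClassifier(embed_dim=384,
                                  labels=["is_billing_issue", "is_churn_risk"])
batch = torch.randn(8, 384)  # stand-in for real text embeddings
probs = model(batch)         # {"is_billing_issue": tensor of 8 probs, ...}
print({k: v.shape for k, v in probs.items()})
```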

Getting Data AI Ready: A Testimonial of Good Governance Practices for Constructing Accurate Genie Spaces

Genie Rooms have played an integral role in democratizing important datasets like Cell Tower and Lease Information. However, to ensure that this exciting new release from Databricks was configured as optimally as possible from development to deployment, we needed additional scaffolding around governance. In this talk we will describe the four main components we used in conjunction with the Genie Room to build a successful product, and we will share generalizable lessons to help others get the most out of this object. At the core is a declarative, metadata-driven approach to creating UC tables, deployed on a robust framework. Second, a platform that efficiently crowdsources targeted feedback from different user groups. Third, a tool that balances the LLM's creativity with human wisdom. And finally, a platform that enforces our principle of separating storage from compute to manage access to the room at a fine-grained level and enables a whole host of interesting use cases.