LLM

Secure and Protect AI Usage in your Organization with DSPM for AI

2025-11-18 · Microsoft Ignite 2025

theater

by Tushar Kumar (Codec Ireland)

AI/ML GenAI Microsoft Cyber Security

As organizations adopt AI to boost productivity and creativity, concerns grow about data security and possible leaks through generative AI tools like Copilot, ChatGPT, and Gemini. Microsoft’s Data Security Posture Management (DSPM) for AI helps organizations monitor AI activities, enforce data protection policies, and meet regulatory standards, allowing safe, productive AI use without compromising sensitive information.

Build A2A and MCP Systems using SWE Agents and agent-framework

2025-11-18 · Microsoft Ignite 2025

talk

by Govind Kamtamneni (Microsoft) , Mark Wallace (Microsoft)

Azure GitHub

Learn to leverage agent-framework, the new unified platform from Semantic Kernel and AutoGen engineering teams, to build A2A compatible agents similar to magnetic-one. Use SWE Agents (GitHub Copilot coding agent and Codex with Azure OpenAI models) to accelerate development. Implement MCP tools for secure enterprise agentic workflows. Experience hands-on building, deploying, and orchestrating multi-agent systems with pre-release capabilities. Note: Contains embargoed content.

Please RSVP and arrive at least 5 minutes before the start time, at which point remaining spaces are open to standby attendees.

Context Engineering for Multi-Agent Systems

2025-11-18 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Denis Rothman

AI/ML GenAI Marketing RAG architectural-patterns service-oriented-architecture-soa service-oriented architecture (soa) software-architecture software-development

Build AI that thinks in context using semantic blueprints, multi-agent orchestration, memory, RAG pipelines, and safeguards to create your own Context Engine Free with your book: DRM-free PDF version + access to Packt's next-gen Reader Key Features Design semantic blueprints to give AI structured, goal-driven contextual awareness Orchestrate multi-agent workflows with MCP for adaptable, context-rich reasoning Engineer a glass-box Context Engine with high-fidelity RAG, trust, and safeguards Book Description Generative AI is powerful, yet often unpredictable. This guide shows you how to turn that unpredictability into reliability by thinking beyond prompts and approaching AI like an architect. At its core is the Context Engine, a glass-box, multi-agent system you’ll learn to design and apply across real-world scenarios. Written by an AI guru and author of various cutting-edge AI books, this book takes you on a hands-on journey from the foundations of context design to building a fully operational Context Engine. Instead of relying on brittle prompts that give only simple instructions, you’ll begin with semantic blueprints that map goals and roles with precision, then orchestrate specialized agents using the Model Context Protocol. As the engine evolves, you’ll integrate memory and high-fidelity retrieval with citations, implement safeguards against data poisoning and prompt injection, and enforce moderation to keep outputs aligned with policy. You’ll also harden the system into a resilient architecture, then see it pivot across domains, from legal compliance to strategic marketing, proving its domain independence. By the end of this book, you’ll be equipped with the skills to engineer an adaptable, verifiable architecture you can repurpose across domains and deploy with confidence. Email sign-up and proof of purchase required What you will learn Develop memory models to retain short-term and cross-session context Craft semantic blueprints and drive multi-agent orchestration with MCP Implement high-fidelity RAG pipelines with verifiable citations Apply safeguards against prompt injection and data poisoning Enforce moderation and policy-driven control in AI workflows Repurpose the Context Engine across legal, marketing, and beyond Deploy a scalable, observable Context Engine in production Who this book is for This book is for AI engineers, software developers, system architects, and data scientists who want to move beyond ad hoc prompting and learn how to design structured, transparent, and context-aware AI systems. It will also appeal to ML engineers and solutions architects with basic familiarity with LLMs who are eager to understand how to orchestrate agents, integrate memory and retrieval, and enforce safeguards.

#332 How to Build AI Your Users Can Trust with David Colwell, VP of AI & ML at Tricentis

2025-11-17 · DataFramed Listen

podcast_episode

by David Colwell (Tricentis) , Richie (DataCamp)

AI/ML Data Governance

The relationship between data governance and AI quality is more critical than ever. As organizations rush to implement AI solutions, many are discovering that without proper data hygiene and testing protocols, they're building on shaky foundations. How do you ensure your AI systems are making decisions based on accurate, appropriate information? What benchmarking strategies can help you measure real improvement rather than just increased output? With AI now touching everything from code generation to legal documents, the consequences of poor quality control extend far beyond simple errors—they can damage reputation, violate regulations, or even put licenses at risk. David Colwell is the Vice President of Artificial Intelligence and Machine Learning at Tricentis, a global leader in continuous testing and quality engineering. He founded the company’s AI division in 2018 with a mission to make quality assurance more effective and engaging through applied AI innovation. With over 15 years of experience in AI, software testing, and automation, David has played a key role in shaping Tricentis’ intelligent testing strategy. His team developed Vision AI, a patented computer vision–based automation capability within Tosca, and continues to pioneer work in large language model agents and AI-driven quality engineering. Before joining Tricentis, David led testing and innovation initiatives at DX Solutions and OnePath, building automation frameworks and leading teams to deliver scalable, AI-enabled testing solutions. Based in Sydney, he remains focused on advancing practical, trustworthy applications of AI in enterprise software development. In the episode, Richie and David explore AI disasters in legal settings, the balance between AI productivity and quality, the evolving role of data scientists, and the importance of benchmarks and data governance in AI development, and much more. Links Mentioned in the Show: Tricentis 2025 Quality Transformation ReportConnect with DavidCourse: Artificial Intelligence (AI) LeadershipRelated Episode: Building & Managing Human+Agent Hybrid Teams with Karen Ng, Head of Product at HubSpotRewatch RADAR AI New to DataCamp? Learn on the go using the DataCamp mobile appEmpower your business with world-class data and AI skills with DataCamp for business

OpenAI fecha acordo bilionário com a Amazon para impulsionar IA; Elon Musk quer levar data centers para o espaço; Robô que lava, dobra e passa roupas - Data Hackers News #97

2025-11-13 · Data Hackers Listen

podcast_episode

AI/ML

Está no ar, o Data Hackers News !! Os assuntos mais quentes da semana, com as principais notícias da área de Dados, IA e Tecnologia, que você também encontra na nossa Newsletter semanal, agora no Podcast do Data Hackers !! Aperte o play e ouça agora, o Data Hackers News dessa semana ! Para saber tudo sobre o que está acontecendo na área de dados, se inscreva na Newsletter semanal: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠https://www.datahackers.news/⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Conheça nossos comentaristas do Data Hackers News: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Monique Femme⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠Preencha a pesquisa State of Data Brazil: https://www.stateofdata.com.br/ Demais canais do Data Hackers: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Site⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Linkedin⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Instagram⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Tik Tok⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠You Tube⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠

Accessible by Design: Redefining AI Inclusion with Valerie Lockhart

2025-11-12 · Women in AI and Data Science Conference 2025 Watch

video

by Valerie Lockhart

AI/ML Data Science

AI has the potential to transform learning, work, and daily life for millions of people, but only if we design with accessibility at the core. Too often, disabled people are underrepresented in datasets, creating systemic barriers that ripple through models and applications. This talk explores how data scientists and technologists can mitigate bias, from building synthetic datasets to fine-tuning LLMs on accessibility-focused corpora. We’ll look at opportunities in multimodal AI: voice, gesture, AR/VR, and even brain-computer interfaces, that open new pathways for inclusion. Beyond accuracy, we’ll discuss evaluation metrics that measure usability, comprehension, and inclusion, and why testing with humans is essential to closing the gap between model performance and lived experience. Attendees will leave with three tangible ways to integrate accessibility into their own work through datasets, open-source tools, and collaborations. Accessibility is not just an ethical mandate, it’s a driver of innovation, and it begins with thoughtful, human-centered data science.

From Predictions to Action: The AI Agent Revolution with Fareeha Amber Ansari

2025-11-12 · Women in AI and Data Science Conference 2025 Watch

video

by Fareeha Amber Ansari

AI/ML GenAI

Large language models are powerful, but their true potential emerges when they evolve into AI agents which are systems that can reason, plan, and take action autonomously. My talk will explore the shift from using models as passive tools to designing agents that actively interact with data, systems, and people.

I will cover: - Gen AI and Agentic AI – How are These Different - Single Agent (monolithic) and Multi Agent Architectures (modular / distributed) - Open Source and Closed Source AI Systems - Challenges of Integrating Agents with Existing Systems

I will break down the technical building blocks of AI agents, including memory, planning loops, tool integration, and feedback mechanisms. Examples will be used to highlight how agents are being used in workflow automation, knowledge management, and decision support.

I will wrap up with where limitations of AI Agents still pose risks: - Assessing Maturity Cycle of Agents - Cybersecurity Risks of Agents

By the end, attendees will understand: - What makes AI agents different from LLMs - Technical considerations required to build AI Agents responsibly - Applicable knowledge to begin experimenting with agents.

Large Language Models for Tacit Knowledge Extraction and Transfer with Mina Cho

2025-11-12 · Women in AI and Data Science Conference 2025 Watch

video

by Mina Cho

A central challenge in knowledge transfer lies in the transfer of tacit knowledge. LLMs, capable of identifying latent patterns in data, present an interesting opportunity to address this issue. This paper explores the potential of LLMs to externalize experts’ tacit knowledge and aid its transfer to novices. Specifically, we examine three questions: RQ1: Can LLMs effectively externalize experts’ tacit knowledge? How to do so (e.g., prompting strategy)? RQ2: How can LLMs use externalized tacit knowledge to make effective decisions? RQ3: How can LLM-externalized tacit knowledge support novice learning? We explore these questions using real-world tutoring conversations collected by Wang et al. (2024).

Our findings suggest that LLMs may be capturing nuances from experts’ observed behavior that are different from the knowledge experts articulate. With carefully designed prompting strategies, LLMs may offer a practical and scalable means of externalizing and transferring tacit knowledge.

Tech Talk: Context is All You Need

2025-11-12 · AI Meetup (November): GenAI LLMs and Agents

talk

AI/ML

In this session, we’ll show how to structure and deliver the right context to large language models (LLMs) so they can actually reason through tasks - not just retrieve answers. We'll show practical ways to provide context across prompts and tools, using a Model Context Repository to make your AI apps much smarter.

I Built an AI That Talks Like My Parents | Emotional AI, Empathy, and the Future of Human Tech

2025-11-11 · Data & AI with Mukundan | Learn AI by Building Listen

podcast_episode

by Mukundan Sankar

AI/ML GitHub Spark

I missed my parents, so I built an AI that talks like them. This isn’t about replacing people—it’s about remembering the voices that make us feel safe. In this 90-minute episode of Data & AI with Mukundan, we explore what happens when technology stops chasing efficiency and starts chasing empathy. Mukundan shares the story behind “What Would Mom & Dad Say?”, a Streamlit + GPT-4 experiment that generates comforting messages in the voice of loved ones. You’ll hear: The emotional spark that inspired the projectThe plain-English prompts anyone can use to teach AI empathyBoundaries & ethics of emotional AIHow this project reframed loneliness, creativity, and connectionTakeaway: AI can’t love you—but it can remind you of the people who do. 🔗 Try the free reflection prompts below: THE ONE-PROMPT VERSION: “What Would Mom & Dad Say?”
“You are speaking to me as one of my parents. Choose the tone I mention: either Mom (warm and reflective) or Dad (practical and encouraging). First, notice the emotion in what I tell you—fear, stress, guilt, joy, or confusion—and name it back to me so I feel heard. Then reply in 3 parts: Start by validating what I’m feeling, in a caring way.Share a short story, lesson, or perspective that fits the situation.End with one hopeful or guiding question that helps me think forward. Keep your words gentle, honest, and simple. No technical language. Speak like someone who loves me and wants me to feel calm and capable again.”

Join the Discussion (comments hub): https://mukundansankar.substack.com/notes Tools I use for my Podcast and Affiliate PartnersRecording Partner: Riverside → Sign up here (affiliate)Host Your Podcast: RSS.com (affiliate )Research Tools: Sider.ai (affiliate)Sourcetable AI: Join Here(affiliate)🔗 Connect with Me:Free Email NewsletterWebsite: Data & AI with MukundanGitHub: https://github.com/mukund14Twitter/X: @sankarmukund475LinkedIn: Mukundan SankarYouTube: Subscribe

How AI is Transforming Business, Retail & Data Careers with Walmart's Director of Data Science & AI, Ketan Mudda

2025-11-10 · Data Career School: Grow Your Career in Data Analytics & AI Listen

podcast_episode

by Ketan Mudda (Walmart) , Amlan Mohanty

AI/ML Analytics Data Analytics Data Science

AI and data analytics are transforming business, and your data career can’t afford to be left behind. 🎙️ In this episode of Data Career School, I sit down with Ketan Mudda, Director of Data Science & AI Solutions at Walmart, to explore how AI is reshaping retail, analytics, and decision-making—and what it means for students, job seekers, and early-career professionals in 2026.

We dive into: How AI is driving innovation and smarter decisions in retail and business Essential skills data professionals need to thrive in an AI-first world How AI tools like ChatGPT are changing the way analysts work What employers look for beyond technical expertise Strategies to future-proof your data career

Ketan also shares his journey from Credit Risk Analyst at HSBC to leading AI-driven initiatives at one of the world’s largest retailers.

Whether you’re starting your data career, exploring AI’s impact on business, or curious about analytics in action, this episode is packed with actionable insights, inspiration, and career guidance.

🎙️ Hosted by Amlan Mohanty — creator of Data Career School, where we explore AI, data analytics, and the future of work. Follow me: 📺 YouTube 🔗 LinkedIn 📸 Instagram

🎧Listen now to level up your data career!

Chapters 00:00 The Journey of Ketan Mudda05:18 AI's Transformative Impact on Industries12:49 Responsible AI Practices14:28 The Role of Education in Data Science23:18 AI and the Future of Jobs28:03 Embracing AI Tools for Success29:44 The Importance of Networking31:40 Curiosity and Continuous Learning32:50 Storytelling in Data Science Leadership36:22 Focus on AI Ethics and Change Management41:03 Learning How to Learn44:57 Identifying Problems Over Tools

LLMs, Chatbots, and Dashboards: Visualize and Analyze Your Data with Natural Language

2025-11-09 · PyData Seattle 2025

talk

by Daniel Chen

AI/ML Dashboard Data Science

LLMs have a lot of hype around them these days. Let’s demystify how they work and see how we can put them in context for data science use. As data scientists, we want to make sure our results are inspectable, reliable, reproducible, and replicable. We already have many tools to help us in this front. However, LLMs provide a new challenge; we may not always be given the same results back from a query. This means trying to work out areas where LLMs excel in, and use those behaviors in our data science artifacts. This talk will introduce you to LLMs, the Chatlas packages, and how they can be integrated into a Shiny to create an AI-powered dashboard (using querychat). We’ll see how we can leverage the tasks LLMs are good at to better our data science products.

Building a Deep Research Agentic Workflow

2025-11-09 · PyData Seattle 2025

talk

by Ravi Kumar Yadav , nidhin pattaniyil

OpenAI and Gemini's Deep Research offerings are a great way to get a detailed research report on a topic.

In this beginner friendly tutorial, we’ll walk through building a simple lightweight agent workflow to perform deep research.

Prompt Variation as a Diagnostic Tool: Exposing Contamination, Memorization, and True Capability in LLMs

2025-11-08 · PyData Seattle 2025

talk

by Aziza Mirsaidova

Prompt variation isn't just an engineering nuisance, it's a window into fundamental LLM limitations. When a model's accuracy drops from 95% to 75% due to minor rephrasing, we're not just seeing brittleness; we're potentially exposing data contamination, spurious correlations, and shallow pattern matching. This talk explores prompt variation as a powerful diagnostic tool for understanding LLM reliability. We discuss how small changes in format, phrasing, or ordering can cause accuracy to collapse revealing about models memorizing benchmark patterns or learning superficial correlations rather than robust task representations. Drawing from academic and industry research, you will learn to distinguish between LLM's true capability and memorization, identify when models are pattern-matching rather than reasoning, and build evaluation frameworks that expose these vulnerabilities before deployment.

Securing Retrieval-Augmented Generation: How to Defend Vector Databases Against 2025 Threats

2025-11-08 · PyData Seattle 2025 Watch

talk

by Rajesh

RAG Cyber Security Vector DB

Modern LLM applications rely heavily on embeddings and vector databases for retrieval-augmented generation (RAG). But in 2025, researchers and OWASP flagged vector databases as a new attack surface — from embedding inversion (recovering sensitive training text) to poisoned vectors that hijack prompts. This talk demystifies these threats for practitioners and shows how to secure your RAG pipeline with real-world techniques like encrypted stores, anomaly detection, and retrieval validation. Attendees will leave with a practical security checklist for keeping embeddings safe while still unlocking the power of retrieval.

Evaluation is all you need

2025-11-08 · PyData Seattle 2025 Watch

talk

by Sebastian Duerr

RAG

LLM apps fail without reliable, reproducible evaluation. This talk maps the open‑source evaluation landscape, compares leading techniques (RAGAS, Evaluation Driven Development) and frameworks (DeepEval, Phoenix, LangFuse, and braintrust), and shows how to combine tests, RAG‑specific evals, and observability to ship higher‑quality systems. Attendees leave with a decision checklist, code patterns, and a production‑ready playbook.

Building Machine Learning Systems with a Feature Store

2025-11-07 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Jim Dowling

AI/ML Data Modelling MLOps RAG ai-ml data machine-learning

Get up to speed on a new unified approach to building machine learning (ML) systems with a feature store. Using this practical book, data scientists and ML engineers will learn in detail how to develop and operate batch, real-time, and agentic ML systems. Author Jim Dowling introduces fundamental principles and practices for developing, testing, and operating ML and AI systems at scale. You'll see how any AI system can be decomposed into independent feature, training, and inference pipelines connected by a shared data layer. Through example ML systems, you'll tackle the hardest part of ML systems--the data, learning how to transform data into features and embeddings, and how to design a data model for AI. Develop batch ML systems at any scale Develop real-time ML systems by shifting left or shifting right feature computation Develop agentic ML systems that use LLMs, tools, and retrieval-augmented generation Understand and apply MLOps principles when developing and operating ML systems

From Swift to Mojo and high-performance AI Engineering with Chris Lattner

2025-11-05 · The Pragmatic Engineer Listen

podcast_episode

by Chris Lattner , Gergely Orosz

AI/ML Analytics Marketing Microsoft Python Rust TypeScript

Brought to You By: •⁠ Statsig ⁠ — ⁠ The unified platform for flags, analytics, experiments, and more. Companies like Graphite, Notion, and Brex rely on Statsig to measure the impact of the pace they ship. Get a 30-day enterprise trial here. •⁠ Linear – The system for modern product development. Linear is a heavy user of Swift: they just redesigned their native iOS app using their own take on Apple’s Liquid Glass design language. The new app is about speed and performance – just like Linear is. Check it out. — Chris Lattner is one of the most influential engineers of the past two decades. He created the LLVM compiler infrastructure and the Swift programming language – and Swift opened iOS development to a broader group of engineers. With Mojo, he’s now aiming to do the same for AI, by lowering the barrier to programming AI applications. I sat down with Chris in San Francisco, to talk language design, lessons on designing Swift and Mojo, and – of course! – compilers. It’s hard to find someone who is as enthusiastic and knowledgeable about compilers as Chris is! We also discussed why experts often resist change even when current tools slow them down, what he learned about AI and hardware from his time across both large and small engineering teams, and why compiler engineering remains one of the best ways to understand how software really works. — Timestamps (00:00) Intro (02:35) Compilers in the early 2000s (04:48) Why Chris built LLVM (08:24) GCC vs. LLVM (09:47) LLVM at Apple (19:25) How Chris got support to go open source at Apple (20:28) The story of Swift (24:32) The process for designing a language (31:00) Learnings from launching Swift (35:48) Swift Playgrounds: making coding accessible (40:23) What Swift solved and the technical debt it created (47:28) AI learnings from Google and Tesla (51:23) SiFive: learning about hardware engineering (52:24) Mojo’s origin story (57:15) Modular’s bet on a two-level stack (1:01:49) Compiler shortcomings (1:09:11) Getting started with Mojo (1:15:44) How big is Modular, as a company? (1:19:00) AI coding tools the Modular team uses (1:22:59) What kind of software engineers Modular hires (1:25:22) A programming language for LLMs? No thanks (1:29:06) Why you should study and understand compilers — The Pragmatic Engineer deepdives relevant for this episode: •⁠ AI Engineering in the real world • The AI Engineering stack • Uber's crazy YOLO app rewrite, from the front seat • Python, Go, Rust, TypeScript and AI with Armin Ronacher • Microsoft’s developer tools roots — Production and marketing by ⁠⁠⁠⁠⁠⁠⁠⁠https://penname.co/⁠⁠⁠⁠⁠⁠⁠⁠. For inquiries about sponsoring the podcast, email [email protected].

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe

Keep it Simple and "Scalable": pythonic Extract, Load, Transform (ELT) using dltHub

2025-11-04 · Small Data SF 2025

workshop

by Elvis Kahoro (Chalk) , Brian Douglas (Continue) , Thierry Jean (dltHub)

AI/ML Data Quality ETL/ELT Python

Get ready to ingest data and transform it into ready-to-use datasets using Python. We'll share a no-nonsense approach for developing and testing data connectors and transformations locally. Moving to production will be a matter of tweaking your configuration. In the end, you get a simple dataset interface to build dashboards & applications, train predictive models, or create agentic workflows. This workshop includes two guest speakers. Brian teach how to leverage AI IDEs, MCP servers and LLM scaffoldings to create ingestion pipelines. Elvis will show how to interactively define transformations and data quality checks.

Stop Measuring LLM Accuracy, Start Building Context

2025-11-04 · Small Data SF 2025

workshop

by Tahlia DeMaio (Hex)

AI/ML Analytics

Everyone's trying to make LLMs "accurate." But the real challenge isn't accuracy — it's context. We'll explore why traditional approaches like evals suites or synthetic question sets fall short, and how successful AI systems are built instead through compounding context over time. Hex enables a new workflow for conversational analytics that grows smarter with every interaction. With Hex's Notebook Agent and Threads, business users define the questions that matter while data teams refine, audit, and operationalize them into durable, trusted workflows. In this model, "tests" aren't written in isolation by data teams — they're defined by the business and operationalized through data workflows. The result is a living system of context — not a static set of prompts or tests — that evolves alongside your organization. Join us for a candid discussion on what's working in production AI systems, and get hands-on building context-aware analytical workflows in Hex!

talk-data.com

Activity Trend

Top Events

Top Speakers

Secure and Protect AI Usage in your Organization with DSPM for AI

Build A2A and MCP Systems using SWE Agents and agent-framework

Context Engineering for Multi-Agent Systems

#332 How to Build AI Your Users Can Trust with David Colwell, VP of AI & ML at Tricentis

OpenAI fecha acordo bilionário com a Amazon para impulsionar IA; Elon Musk quer levar data centers para o espaço; Robô que lava, dobra e passa roupas - Data Hackers News #97

Accessible by Design: Redefining AI Inclusion with Valerie Lockhart

From Predictions to Action: The AI Agent Revolution with Fareeha Amber Ansari

Large Language Models for Tacit Knowledge Extraction and Transfer with Mina Cho

Tech Talk: Context is All You Need

I Built an AI That Talks Like My Parents | Emotional AI, Empathy, and the Future of Human Tech

How AI is Transforming Business, Retail & Data Careers with Walmart's Director of Data Science & AI, Ketan Mudda

LLMs, Chatbots, and Dashboards: Visualize and Analyze Your Data with Natural Language

Building a Deep Research Agentic Workflow

Prompt Variation as a Diagnostic Tool: Exposing Contamination, Memorization, and True Capability in LLMs

Securing Retrieval-Augmented Generation: How to Defend Vector Databases Against 2025 Threats

Evaluation is all you need

Building Machine Learning Systems with a Feature Store

From Swift to Mojo and high-performance AI Engineering with Chris Lattner

Keep it Simple and "Scalable": pythonic Extract, Load, Transform (ELT) using dltHub

Stop Measuring LLM Accuracy, Start Building Context