talk-data.com talk-data.com

Topic

LLM

Large Language Models (LLM)

nlp ai machine_learning

1405

tagged

Activity Trend

158 peak/qtr
2020-Q1 2026-Q1

Activities

1405 activities · Newest first

Tackling Data Challenges for Scaling Multi-Agent GenAI Apps with Python

The use of multiple Large Language Models (LLMs) working together perform complex tasks, known as multi-agent systems, has gained significant traction. While orchestration frameworks like LangGraph and Semantic Kernel can streamline orchestration and coordination among agents, developing large-scale, production-grade systems can bring a host of data challenges. Issues such as supporting multi-tenancy, preserving transactional integrity and state, and managing reliable asynchronous function calls while scaling efficiently can be difficult to navigate.

Leveraging insights from practical experiences in the Azure Cosmos DB engineering team, this talk will guide you through key considerations and best practices for storing, managing, and leveraging data in multi-agent applications at any scale. You’ll learn how to understand core multi-agent concepts and architectures, manage statefulness and conversation histories, personalize agents through retrieval-augmented generation (RAG), and effectively integrate APIs and function calls.

Aimed at developers, architects, and data scientists at all skill levels, this session will show you how to take your multi-agent systems from the lab to full-scale production deployments, ready to solve real-world problems. We’ll also walk through code implementations that can be quickly and easily put into practice, all in Python.

Keynote- From Next Token Prediction to Reasoning and Beyond

Large Language Models (LLMs) have grown into prominence as some of the most popular technological artifacts of the day. This talk will provide a highly accessible and visual overview of LLM concepts relevant to today's data professionals. This includes looking at present-day Transformer architectures, tokenizers, reward models, reasoning LLMs, agentic trajectories, and the various training stages of a large language model including next-word prediction, instruction-tuning, preference-tuning, and reinforcement learning.

This talk explores how leveraging Large Language Models (LLMs) to generate structured customer profile summaries improved both compliance analyst workflows and fraud scoring models at a financial institution. Attendees will learn how embeddings derived from LLM-generated narratives outperformed traditional manual feature engineering and raw text embeddings, offering insights into practical applications of NLP in fraud detection.

Sovereign Data for AI with Python

The only certainty in life is that the pendulum will always swing. Recently, the pendulum has been swinging towards repatriation. However, the infrastructure needed to build and operate AI systems using Python in a sovereign (even air-gapped) environment has changed since the shift towards the cloud. This talk will introduce the infrastructure you need to build and deploy Python applications for AI - from data processing, to model training and LLM fine-tuning at scale to inference at scale. We will focus on open-source infrastructure including: a Python library server (Pypi, Conda, etc) and avoiding supply chain attacks a container registry that works at scale a S3 storage layer a database server with a vector index

Small Language Models (SLMs) provide a viable alternative to Large Language Models (LLMs) for developing high-performing, cost-effective, and secure generative AI solutions. CDAOs, AI architects, and data architects should attend this presentation to gain insights into the strengths and weaknesses of SLMs and discover five specific use cases where SLMs outperform LLMs.

In this session, we will explore cost transformative vendors such as DeepSeek and MistralAI’s impact on AI development, highlighting the potential to enhance enterprise innovation at a price/performance that is unprecedented. Data, analytics & AI leaders can bring their questions on how they can capitalize on new training and inferencing paradigms in AI, the future of AI model development and the impact of reasoning models on a more autonomous future in AI.

Patrick Thompson, co-founder of Clarify and former co-founder of Iteratively (acquired by Amplitude), joined Yuliia and Dumky to discuss the evolution from data quality to decision quality. Patrick shares his experience building data contracts solutions at Atlassian and later developing analytics tracking tools. Patrick challenges the assumption that AI will eliminate the need for structured data. He argues that while LLMs excel at understanding unstructured data, businesses still need deterministic systems for automation and decision-making. Patrick shares insights on why enforcing data quality at the source remains critical, even in an AI-first world, and explains his shift from analytics to CRM while maintaining focus on customer data unification and business impact over technical perfectionism.Tune in!

Supported by Our Partners •⁠ Statsig ⁠ — ⁠ The unified platform for flags, analytics, experiments, and more. •⁠ Sinch⁠ — Connect with customers at every step of their journey. •⁠ Cortex⁠ — Your Portal to Engineering Excellence. — What does it take to land a job as an AI Engineer—and thrive in the role? In this episode of Pragmatic Engineer, I’m joined by Janvi Kalra, currently an AI Engineer at OpenAI. Janvi shares how she broke into tech with internships at top companies, landed a full-time software engineering role at Coda, and later taught herself the skills to move into AI Engineering: by things like building projects in her free time, joining hackathons, and ultimately proving herself and earning a spot on Coda’s first AI Engineering team. In our conversation, we dive into the world of AI Engineering and discuss three types of AI companies, how to assess them based on profitability and growth, and practical advice for landing your dream job in the field. We also discuss the following:  • How Janvi landed internships at Google and Microsoft, and her tips for interview prepping • A framework for evaluating AI startups • An overview of what an AI Engineer does • A mini curriculum for self-learning AI: practical tools that worked for Janvi • The Coda project that impressed CEO Shishir Mehrotra and sparked Coda Brain • Janvi’s role at OpenAI and how the safety team shapes responsible AI • How OpenAI blends startup speed with big tech scale • Why AI Engineers must be ready to scrap their work and start over • Why today’s engineers need to be product-minded, design-aware, full-stack, and focused on driving business outcomes • And much more! — Timestamps (00:00) Intro (02:31) How Janvi got her internships at Google and Microsoft (03:35) How Janvi prepared for her coding interviews  (07:11) Janvi’s experience interning at Google (08:59) What Janvi worked on at Microsoft  (11:35) Why Janvi chose to work for a startup after college (15:00) How Janvi picked Coda  (16:58) Janvi’s criteria for picking a startup now  (18:20) How Janvi evaluates ‘customer obsession’  (19:12) Fast—an example of the downside of not doing due diligence (21:38) How Janvi made the jump to Coda’s AI team (25:48) What an AI Engineer does  (27:30) How Janvi developed her AI Engineering skills through hackathons (30:34) Janvi’s favorite AI project at Coda: Workspace Q&A  (37:40) Learnings from interviewing at 46 companies (40:44) Why Janvi decided to get experience working for a model company  (43:17) Questions Janvi asks to determine growth and profitability (45:28) How Janvi got an offer at OpenAI, and an overview of the interview process (49:08) What Janvi does at OpenAI  (51:01) What makes OpenAI unique  (52:30) The shipping process at OpenAI (55:41) Surprising learnings from AI Engineering  (57:50) How AI might impact new graduates  (1:02:19) The impact of AI tools on coding—what is changing, and what remains the same (1:07:51) Rapid fire round — The Pragmatic Engineer deepdives relevant for this episode: •⁠ AI Engineering in the real world •⁠ The AI Engineering stack •⁠ Building, launching, and scaling ChatGPT Images — See the transcript and other references from the episode at ⁠⁠https://newsletter.pragmaticengineer.com/podcast⁠⁠ — Production and marketing by ⁠⁠⁠⁠⁠⁠⁠⁠https://penname.co/⁠⁠⁠⁠⁠⁠⁠⁠. For inquiries about sponsoring the podcast, email [email protected].

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe

Está no ar, o Data Hackers News !! Os assuntos mais quentes da semana, com as principais notícias da área de Dados, IA e Tecnologia, que você também encontra na nossa Newsletter semanal, agora no Podcast do Data Hackers !! Aperte o play e ouça agora, o Data Hackers News dessa semana ! Para saber tudo sobre o que está acontecendo na área de dados, se inscreva na Newsletter semanal: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠https://www.datahackers.news/⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Conheça nossos comentaristas do Data Hackers News: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Monique Femme⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ Paulo Vasconcellos Demais canais do Data Hackers: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Site⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Linkedin⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Instagram⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠Tik Tok⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠You Tube⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠

Summary In this episode of the Data Engineering Podcast, host Tobias Macy welcomes back Shinji Kim to discuss the evolving role of semantic layers in the era of AI. As they explore the challenges of managing vast data ecosystems and providing context to data users, they delve into the significance of semantic layers for AI applications. They dive into the nuances of semantic modeling, the impact of AI on data accessibility, and the importance of business logic in semantic models. Shinji shares her insights on how SelectStar is helping teams navigate these complexities, and together they cover the future of semantic modeling as a native construct in data systems. Join them for an in-depth conversation on the evolving landscape of data engineering and its intersection with AI.

Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data managementData migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.Your host is Tobias Macey and today I'm interviewing Shinji Kim about the role of semantic layers in the era of AIInterview IntroductionHow did you get involved in the area of data management?Semantic modeling gained a lot of attention ~4-5 years ago in the context of the "modern data stack". What is your motivation for revisiting that topic today?There are several overlapping concepts – "semantic layer," "metrics layer," "headless BI." How do you define these terms, and what are the key distinctions and overlaps?Do you see these concepts converging, or do they serve distinct long-term purposes?Data warehousing and business intelligence have been around for decades now. What new value does semantic modeling beyond practices like star schemas, OLAP cubes, etc.?What benefits does a semantic model provide when integrating your data platform into AI use cases?How is it different between using AI as an interface to your analytical use cases vs. powering customer facing AI applications with your data?Putting in the effort to create and maintain a set of semantic models is non-zero. What role can LLMs play in helping to propose and construct those models?For teams who have already invested in building this capability, what additional context and metadata is necessary to provide guidance to LLMs when working with their models?What's the most effective way to create a semantic layer without turning it into a massive project? There are several technologies available for building and serving these models. What are the selection criteria that you recommend for teams who are starting down this path?What are the most interesting, innovative, or unexpected ways that you have seen semantic models used?What are the most interesting, unexpected, or challenging lessons that you have learned while working with semantic modeling?When is semantic modeling the wrong choice?What do you predict for the future of semantic modeling?Contact Info LinkedInParting Question From your perspective, what is the biggest gap in the tooling or technology for data management today?Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.Links SelectStarSun MicrosystemsMarkov Chain Monte CarloSemantic ModelingSemantic LayerMetrics LayerHeadless BICubePodcast EpisodeAtScaleStar SchemaData VaultOLAP CubeRAG == Retrieval Augmented GenerationAI Engineering Podcast EpisodeKNN == K-Nearest NeighbersHNSW == Hierarchical Navigable Small Worlddbt Metrics LayerSoda DataLookMLHexPowerBITableauSemantic View (Snowflake)Databricks GenieSnowflake Cortex AnalystMalloyThe intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

GenAI systems are evolving beyond basic information retrieval and question answering, becoming sophisticated agents capable of managing multi-turn dialogues and executing complex, multi-step tasks autonomously. However, reliably evaluating and systematically improving their performance remains challenging. In this session, we'll explore methods for assessing the behavior of LLM-driven agentic systems, highlighting techniques and showcasing actionable insights to identify performance bottlenecks and to creating better-aligned, more reliable agentic AI systems.

In this session, we will explore DeepSeek’s transformative impact on AI development, highlighting its potential to enhance enterprise innovation at a price/performance that is unprecedented. Data, analytics & AI leaders can bring their questions on how they can capitalize on new training and inferencing paradigms in AI, the future of AI model development and the impact of reasoning models on a more autonomous future in AI.

Today, I'm chatting with Stuart Winter-Tear about AI product management. We're getting into the nitty-gritty of what it takes to build and launch LLM-powered products for the commercial market that actually produce value. Among other things in this rich conversation, Stuart surprised me with the level of importance he believes UX has in making LLM-powered products successful, even for technical audiences.

After spending significant time on the forefront of AI’s breakthroughs, Stuart believes many of the products we’re seeing today are the result of FOMO above all else. He shares a belief that I’ve emphasized time and time again on the podcast–product is about the problem, not the solution. This design philosophy has informed Staurt’s 20-plus year-long career, and it is pivotal to understanding how to best use AI to build products that meet users’ needs.

Highlights/ Skip to 

Why Stuart was asked to speak to the House of Lords about AI (2:04) The LLM-powered products has Stuart been building recently (4:20) Finding product-market fit with AI products (7:44) Lessons Stuart has learned over the past two years working with LLM-power products (10:54)  Figuring out how to build user trust in your AI products (14:40) The differences between being a digital product manager vs. AI product manager (18:13) Who is best suited for an AI product management role (25:42) Why Stuart thinks user experience matters greatly with AI products (32:18) The formula needed to create a business-viable AI product (38:22)  Stuart describes the skills and roles he thinks are essential in an AI product team and who he brings on first (50:53) Conversations that need to be had with academics and data scientists when building AI-powered products (54:04) Final thoughts from Stuart and where you can find more from him (58:07)

Quotes from Today’s Episode

“I think that the core dream with GenAI is getting data out of IT hands and back to the business. Finding a way to overlay all this disparate, unstructured data and [translate it] to the human language is revolutionary. We’re finding industries that you would think were more conservative (i.e. medical, legal, etc.) are probably the most interested because of the large volumes of unstructured data they have to deal with. People wouldn’t expect large language models to be used for fact-checking… they’re actually very powerful, especially if you can have your own proprietary data or pipelines. Same with security–although large language models introduce a terrifying amount of security problems, they can also be used in reverse to augment security. There’s a lovely contradiction with this technology that I do enjoy.” - Stuart Winter-Tear (5:58) “[LLM-powered products] gave me the wow factor, and I think that’s part of what’s caused the problem. If we focus on technology, we build more technology, but if we focus on business and customers, we’re probably going to end up with more business and customers. This is why we end up with so many products that are effectively solutions in search of problems. We’re in this rush and [these products] are [based on] FOMO. We’re leaving behind what we understood about [building] products—as if [an LLM-powered product] is a special piece of technology. It’s not. It’s another piece of technology. [Designers] should look at this technology from the prism of the business and from the prism of the problem. We love to solutionize, but is the problem the problem? What’s the context of the problem? What’s the problem under the problem? Is this problem worth solving, and is GenAI a desirable way to solve it? We’re putting the cart before the horse.” - Stuart Winter-Tear (11:11) “[LLM-powered products] feel most amazing when you’re not a domain expert in whatever you’re using it for. I’ll give you an example: I’m terrible at coding. When I got my hands on Cursor, I felt like a superhero. It was unbelievable what I could build. Although [LLM products] look most amazing in the hands of non-experts, it’s actually most powerful in the hands of experts who do understand the domain they’re using this technology. Perhaps I want to do a product strategy, so I ask [the product] for some assistance, and it can get me 70% of the way there. [LLM products] are great as a jumping off point… but ultimately [they are] only powerful because I have certain domain expertise.” - Stuart Winter-Tear (13:01) “We’re so used to the digital paradigm. The deterministic nature of you put in X, you get out Y; it’s the same every time. Probabilistic changes every time. There is a huge difference between what results you might be getting in the lab compared to what happens in the real world. You effectively find yourself building [AI products] live, and in order to do that, you need good communities and good feedback available to you. You need these fast feedback loops. From a pure product management perspective, we used to just have the [engineering] timeline… Now, we have [the data research timeline]. If you’re dealing with cutting-edge products, you’ve got these two timelines that you’re trying to put together, and the data research one is very unpredictable. It’s the nature of research. We don’t necessarily know when we’re going to get to where we want to be.” - Stuart Winter-Tear (22:25) “I believe that UX will become the #1 priority for large language model products. I firmly believe whoever wins in UX will win in this large language model product world.  I’m against fully autonomous agents without human intervention for knowledge work. We need that human in the loop. What was the intent of the user? How do we get that right push back from the large language model to understand even the level of the person that they’re dealing with? These are fundamental UX problems that are going to push UX to the forefront… This is going to be on UX to educate the user, to be able to inject the user in at the right time to be able to make this stuff work. The UX folk who do figure this out are going to create the breakthrough and create the mass adoption.” - Stuart Winter-Tear (33:42)

Organisations adopting a Data Mesh framework often face challenges in ensuring regulatory compliance, transforming data assets into scalable products, and maintaining governance. Explore how NatWest addresses these complexities by integrating knowledge graphs with GenAI and LLMs to enhance data discovery, enforce governance policies, and accelerate product development. Learn how this approach strengthens regulatory data qualifications, automates metadata management, and delivers faster, more reliable insights— to build and scale AI-driven data products yielding a potential 10x efficiency gain.