Retrieval Augmented Generation (RAG)
Top Events
Building a product recommendation system or chatbot with LLMs is simple... on paper. In reality, simple RAG covers only the simplest scenarios. To handle more complicated ones, you may want to learn about things like conversation graphs, logical and semantic routing, and hybrid search. In this talk I share lessons and tricks we learned while building a product recommendation system with Gemini.
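The description stops at naming these techniques, but semantic routing, for example, reduces to a nearest-neighbor match between a query embedding and per-route example embeddings. A minimal sketch follows; the route names, example phrases, and embedding model are illustrative, not from the talk:

```python
# Minimal semantic-routing sketch: pick a handler for a user query by
# comparing its embedding to averaged embeddings of example phrases per route.
# Route names and example phrases are hypothetical, not from the talk.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

ROUTES = {
    "product_search": ["show me running shoes", "find a laptop under $1000"],
    "order_support": ["where is my order", "I want a refund"],
    "small_talk": ["hello", "how are you"],
}

# Pre-compute one vector per route by averaging its example phrases.
route_vecs = {
    name: model.encode(examples).mean(axis=0) for name, examples in ROUTES.items()
}

def route(query: str) -> str:
    q = model.encode(query)
    scores = {
        name: float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
        for name, v in route_vecs.items()
    }
    return max(scores, key=scores.get)

print(route("my package never arrived"))  # expected: "order_support"
```

A hybrid-search setup would combine a score like this with keyword (BM25) retrieval rather than routing on embeddings alone.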
This hands-on lab empowers you to build a cutting-edge multimodal question answering system using Google's Vertex AI and the powerful Gemini family of models. By constructing this system from the ground up, you'll gain a deep understanding of its inner workings and the advantages of incorporating visual information into Retrieval Augmented Generation (RAG). This hands-on experience equips you with the knowledge to customize and optimize your own multimodal question answering systems, unlocking new possibilities for knowledge discovery and reasoning.
If you register for a Learning Center lab, please sign up for a Google Cloud Skills Boost account with both your work and personal email addresses, and authenticate your account (be sure to check your spam folder!). This ensures you can arrive and access your labs quickly onsite.
Learn how to leverage Gemini's long context for creating and interpreting data analytics queries using Agentic Retrieval Augmented Generation with NL2SQL techniques and reasoning.
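As a rough illustration of the NL2SQL piece (the schema, question, and model name below are placeholders, not the session's actual code), the core step is handing the model a schema plus a natural-language question and asking for SQL back:

```python
# Minimal NL2SQL sketch with Gemini; schema, question, and model name are
# illustrative. Requires `pip install google-generativeai` and an API key.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder
model = genai.GenerativeModel("gemini-1.5-pro")

schema = "CREATE TABLE orders(id INT, region TEXT, total NUMERIC, placed_at DATE);"
question = "Average order total per region last month?"

prompt = (
    f"Given this schema:\n{schema}\n"
    f"Write one SQL query answering: {question}\n"
    "Return only SQL."
)
print(model.generate_content(prompt).text)
```

An agentic setup, as the session title suggests, would go further: execute the query, inspect errors or results, and let the model revise its SQL.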
Join this session to learn how to ground your AI in relevant data using retrieval-augmented generation (RAG) from Firebase Data Connect, which brings rapid development and intelligent context from your Cloud SQL database to your generative AI experiences. Data Connect connects your app, data, and AI together, and seamlessly integrates Vertex AI and Cloud SQL to make RAG straightforward and ready for AI agents.
This session will cover Claude's advanced reasoning capabilities, empirical evaluations, and best practices for prompt engineering, including how to optimize agents to advance Retrieval-Augmented Generation (RAG) within your Google Cloud environment. Whether you're an AI practitioner or an enterprise architect, this session will equip you with the knowledge to harness Claude's full potential for enhanced AI workflows on Google Cloud.
This Session is hosted by a Google Cloud Next Sponsor.
Visit your registration profile at g.co/cloudnext to opt out of sharing your contact information with the sponsor hosting this session.
Discover how to transform your unstructured data into a strategic advantage by embedding GenAI into the core of your business operations. In this live demo from Hyperscience and Google, learn how to convert your back-office information assets into LLM- and RAG-ready data, with unprecedented accuracy, governance, and relevance. Combining the Hyperscience enterprise AI platform with Google Vertex AI and BigQuery, organizations can now power GenAI initiatives that deliver smarter decision-making, operational agility, and competitive advantage.
App life cycles with Google Cloud databases and runtimes have gotten smarter. This session takes you through the life cycle of building and optimizing scalable, high-performance generative AI apps. We’ll use Google’s latest natural language and vector search technologies to build the “art of the possible” for AI-powered software. We’ll also cover various app design patterns using Gemini Code Assist, foundation models, agentic retrieval-augmented generation (RAG), and orchestration frameworks, and show you how it all comes together in a live demo.
Generative AI enables you to build amazing new apps, but you may also have concerns. Will it be too complex or too expensive, or will it create security risks? Fortunately, building good gen AI apps has become much easier in the past year, in large part due to updated capabilities within databases. We’ll learn how Google Cloud databases handle key AI technologies like vector search, retrieval-augmented generation (RAG), and orchestration, helping you bridge the gap between general-purpose AI models and your business-specific data. We’ll dive into some fun application examples to see how they were built and how you can use the same techniques in your apps.
Build a multimodal search engine with Gemini and Vertex AI. This hands-on lab demonstrates Retrieval Augmented Generation (RAG) to query documents containing text and images. Learn to extract metadata, generate embeddings, and search using text or image queries.
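For a flavor of what the lab covers, here is a hedged sketch of multimodal embedding search with Vertex AI; the model name, file names, and project settings are placeholders, not the lab's actual code:

```python
# Sketch of multimodal embedding search with Vertex AI. Requires
# `pip install google-cloud-aiplatform` and an authenticated GCP project.
# Project, location, file names, and text are illustrative placeholders.
import numpy as np
import vertexai
from vertexai.vision_models import Image, MultiModalEmbeddingModel

vertexai.init(project="your-project", location="us-central1")  # placeholders
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")

# Embed a document page image (with contextual text) and a text query
# into the same vector space.
page = model.get_embeddings(
    image=Image.load_from_file("page_1.png"),
    contextual_text="Quarterly revenue chart",
)
query = model.get_embeddings(contextual_text="revenue by quarter")

# Cosine similarity between the image embedding and the text query embedding.
sim = np.dot(page.image_embedding, query.text_embedding) / (
    np.linalg.norm(page.image_embedding) * np.linalg.norm(query.text_embedding)
)
print(f"image-text similarity: {sim:.3f}")
```

A full search engine would store these vectors in an index (e.g., a vector database) alongside extracted metadata and rank pages by similarity.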
If you register for a Learning Center lab, please sign up for a Google Cloud Skills Boost account with both your work and personal email addresses, and authenticate your account (be sure to check your spam folder!). This ensures you can arrive and access your labs quickly onsite.
A data scientist will present our implementation of a RAG system for exploring Radio France's podcasts.
It's finally possible to bring the awesome power of Large Language Models (LLMs) to your laptop. This talk will explore how to run and leverage small, openly available LLMs to power common tasks involving data, including selecting the right models, practical use cases for running small models, and best practices for deploying small models effectively alongside databases.
Bio: Jeffrey Morgan is the founder of Ollama, an open-source tool for getting up and running with large language models. Prior to founding Ollama, Jeffrey founded Kitematic, which was acquired by Docker and evolved into Docker Desktop. He has previously worked at companies including Docker, Twitter, and Google.
Discover how to run large language models (LLMs) locally using Ollama, the easiest way to get started with small AI models on your Mac, Windows, or Linux machine. Unlike massive cloud-based systems, small open source models are only a few gigabytes, allowing them to run incredibly fast on consumer hardware without network latency. This video explains why these local LLMs are not just scaled-down versions of larger models but powerful tools for developers, offering significant advantages in speed, data privacy, and cost-effectiveness by eliminating hidden cloud provider fees and risks.
Learn the most common use case for small models: combining them with your existing factual data to prevent hallucinations. We dive into retrieval augmented generation (RAG), a powerful technique where you augment a model's prompt with information from a local data source. See a practical demo of how to build a vector store from simple text files and connect it to a model like Gemma 2B, enabling you to query your own data using natural language for fast, accurate, and context-aware responses.
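A minimal sketch of that local RAG loop, assuming the `ollama` Python client with locally pulled `nomic-embed-text` and `gemma2:2b` models (the chunk contents are placeholders, not the video's data):

```python
# Minimal local RAG sketch using the ollama Python client.
# Assumes `ollama serve` is running and the models below have been pulled
# (`ollama pull nomic-embed-text`, `ollama pull gemma2:2b`).
import numpy as np
import ollama

# In the demo, chunks would come from your own text files; placeholders here.
chunks = [
    "Our support line is open 9am-5pm on weekdays.",
    "Refunds are processed within 14 days of a return.",
    "The warehouse ships orders Monday through Friday.",
]

def embed(text: str) -> np.ndarray:
    return np.array(ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"])

chunk_vecs = [embed(c) for c in chunks]  # the "vector store"

def answer(question: str, top_k: int = 2) -> str:
    q = embed(question)
    sims = [float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
            for v in chunk_vecs]
    # Augment the prompt with the most similar chunks.
    context = "\n".join(chunks[i] for i in np.argsort(sims)[-top_k:])
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    reply = ollama.chat(model="gemma2:2b", messages=[{"role": "user", "content": prompt}])
    return reply["message"]["content"]

print(answer("When will I get my refund?"))
```

A production setup would swap the in-memory list for a real vector store, but the retrieve-then-augment shape stays the same.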
Explore the next frontier of local AI with small agents and tool calling, a new feature that empowers models to interact with external tools. This guide demonstrates how an LLM can autonomously decide to query a DuckDB database, write the correct SQL, and use the retrieved data to answer your questions. This advanced tutorial shows you how to connect small models directly to your data engineering workflows, moving beyond simple chat to create intelligent, data-driven applications.
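A sketch of that tool-calling loop, assuming the `ollama` client's support for passing Python functions as tools and a tool-capable model such as `llama3.1` (Gemma-class models may not support tools; the table and data are invented):

```python
# Sketch of local tool calling with Ollama + DuckDB; assumes a
# tool-calling-capable model (e.g. `ollama pull llama3.1`) and the
# `ollama` and `duckdb` Python packages. Table and data are made up.
import duckdb
import ollama

con = duckdb.connect()  # in-memory database
con.execute("CREATE TABLE sales(region TEXT, amount INT)")
con.execute("INSERT INTO sales VALUES ('EMEA', 120), ('APAC', 340)")

def query_duckdb(sql: str) -> str:
    """Run a SQL query against the local DuckDB database and return rows."""
    return str(con.execute(sql).fetchall())

messages = [{"role": "user", "content": "What are total sales for APAC?"}]
response = ollama.chat(
    model="llama3.1",
    messages=messages,
    tools=[query_duckdb],  # the client derives a tool schema from the function
)

# If the model decided to call the tool, run it and feed the result back.
for call in response.message.tool_calls or []:
    result = query_duckdb(**call.function.arguments)  # expects a `sql` argument
    messages += [response.message, {"role": "tool", "content": result}]
    final = ollama.chat(model="llama3.1", messages=messages)
    print(final.message.content)
```

The key move is that the model writes the SQL itself; your code only executes it and returns the rows for the model to phrase an answer.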
Get started with practical applications for small models today, from building internal help desks to streamlining engineering tasks like code review. This video highlights how small and large models can work together effectively and shows that open-source models are rapidly catching up to their cloud-scale counterparts. There's never been a better time for developers and data analysts to harness the power of local AI.
In this session we review the data preparation tools and techniques discussed in the previous sessions: Data Prep Kit; Docling; and open-source RAG with Data Prep Kit + Milvus + Llama.
Summary
In this episode of the Data Engineering Podcast, Bartosz Mikulski talks about preparing data for AI applications. Bartosz shares his journey from data engineering to MLOps and emphasizes the importance of data testing over software development in AI contexts. He discusses the types of data assets required for AI applications, including extensive test datasets, especially in generative AI, and explains the differences in data requirements for various AI application styles. The conversation also explores the skills data engineers need to transition into AI, such as familiarity with vector databases and new data modeling strategies, and highlights the challenges of evolving AI applications, including frequent reprocessing of data when changing chunking strategies or embedding models.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
Your host is Tobias Macey and today I'm interviewing Bartosz Mikulski about how to prepare data for use in AI applications.

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by outlining some of the main categories of data assets that are needed for AI applications?
- How does the nature of the application change those requirements? (e.g. RAG app vs. agent, etc.)
- How do the different assets map to the stages of the application lifecycle?
- What are some of the common roles and divisions of responsibility that you see in the construction and operation of a "typical" AI application?
- For data engineers who are used to data warehousing/BI, what are the skills that map to AI apps?
- What are some of the data modeling patterns that are needed to support AI apps?
  - chunking strategies
  - metadata management
- What are the new categories of data that data engineers need to manage in the context of AI applications?
  - agent memory generation/evolution
  - conversation history management
  - data collection for fine tuning
- What are some of the notable evolutions in the space of AI applications and their patterns that have happened in the past ~1-2 years that relate to the responsibilities of data engineers?
- What are some of the skills gaps that teams should be aware of and identify training opportunities for?
- What are the most interesting, innovative, or unexpected ways that you have seen data teams address the needs of AI applications?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI applications and their reliance on data?
- What are some of the emerging trends that you are paying particular attention to?

Contact Info
- Website
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
- Spark
- Ray
- Chunking Strategies
- Hypothetical document embeddings
- Model Fine Tuning
- Prompt Compression

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
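As a one-glance illustration of the chunking decisions discussed in the episode, here is a toy fixed-size chunker with overlap; the size and overlap values are arbitrary examples, and changing them is exactly the kind of decision that forces the reprocessing and re-embedding work Bartosz describes:

```python
# Toy fixed-size chunker with overlap. Parameters are arbitrary examples;
# any change to them means re-chunking and re-embedding the whole corpus.
def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    step = size - overlap  # how far the window advances each time
    return [text[i : i + size] for i in range(0, max(len(text) - overlap, 1), step)]

doc = "lorem ipsum " * 400  # stand-in for a real document
pieces = chunk(doc)
print(len(pieces), len(pieces[0]))  # number of chunks, first chunk length
```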
If you're looking to build production-ready AI applications that can reason and retrieve external data for context-awareness, you'll need to master LangChain, a popular development framework and platform for building, running, and managing agentic applications. LangChain is used by several leading companies, including Zapier, Replit, Databricks, and many more. This guide is an indispensable resource for developers who understand Python or JavaScript but are beginners eager to harness the power of AI. Authors Mayo Oshin and Nuno Campos demystify the use of LangChain through practical insights and in-depth tutorials. Starting with basic concepts, this book shows you step-by-step how to build a production-ready AI agent that uses your data.
- Harness the power of retrieval-augmented generation (RAG) to enhance the accuracy of LLMs using external, up-to-date data
- Develop and deploy AI applications that interact intelligently and contextually with users
- Make use of the powerful agent architecture with LangGraph
- Integrate and manage third-party APIs and tools to extend the functionality of your AI applications
- Monitor, test, and evaluate your AI applications to improve performance
- Understand the foundations of LLM app development and how they can be used with LangChain
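As a taste of the pattern the book teaches, here is a minimal LangChain RAG chain; the FAISS store, OpenAI models, and documents are stand-ins, not the book's exact example (requires `faiss-cpu`, `langchain-openai`, and an OPENAI_API_KEY):

```python
# Minimal LangChain RAG sketch; vector store, models, and documents are
# illustrative stand-ins. Requires faiss-cpu, langchain-openai, and an
# OPENAI_API_KEY in the environment.
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

docs = [
    "LangGraph models agents as graphs of steps.",
    "RAG grounds answers in retrieved, up-to-date text.",
]
retriever = FAISS.from_texts(docs, OpenAIEmbeddings()).as_retriever(
    search_kwargs={"k": 1}
)

prompt = ChatPromptTemplate.from_template(
    "Answer from the context.\nContext: {context}\nQuestion: {question}"
)

# Retrieve context, fill the prompt, call the model, parse to a string.
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

print(chain.invoke("What does RAG do?"))
```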
This talk explores how modern AI tools can bridge the gap between developers and adopters, transforming documentation into an interactive, continuously improving resource.

Focusing on Retrieval-Augmented Generation (RAG) in Elixir, it highlights how RAG can enhance both the creation of and interaction with documentation, making it more engaging and effective. Attendees will learn about core RAG principles, handling fuzzy code searches, and mitigating security risks, offering a roadmap for small teams to create impactful documentation that supports growth and adoption.
Deepti Srivastava, Founder of Snow Leopard AI and former Spanner Product Lead at Google Cloud, joined Yuliia to chat about what's wrong with current approaches to AI integration. Deepti introduces a paradigm shift away from ETL pipelines toward federated, real-time data access for AI applications. She explains how Snow Leopard's intelligent data retrieval platform enables enterprises to connect AI systems directly to operational data sources without compromising security or freshness. Through practical examples, Deepti explains why conventional RAG approaches with vector stores are not good enough for business-critical AI applications, and how a systems-thinking approach to AI infrastructure can unlock greater value while reducing unnecessary data movement.
Deepti's LinkedIn: https://www.linkedin.com/in/thedeepti/
Snow Leopard: http://snowleopard.ai/
Supported by Our Partners:
• Swarmia — The engineering intelligence platform for modern software organizations.
• Graphite — The AI developer productivity platform.
• Vanta — Automate compliance and simplify security with Vanta.

On today's episode of The Pragmatic Engineer, I'm joined by Chip Huyen, a computer scientist, author of the freshly published O'Reilly book AI Engineering, and an expert in applied machine learning. Chip has worked as a researcher at Netflix, was a core developer at NVIDIA (building NeMo, NVIDIA's GenAI framework), and co-founded Claypot AI. She also taught Machine Learning at Stanford University.

In this conversation, we dive into the evolving field of AI Engineering and explore key insights from Chip's book, including:
• How AI Engineering differs from Machine Learning Engineering
• Why fine-tuning is usually not a tactic you'll want (or need) to use
• The spectrum of solutions to customer support problems – some not even involving AI!
• The challenges of LLM evals (evaluations)
• Why project-based learning is valuable—but even better when paired with structured learning
• Exciting potential use cases for AI in education and entertainment
• And more!

Timestamps:
(00:00) Intro
(01:31) A quick overview of AI Engineering
(05:00) How Chip ensured her book stays current amidst the rapid advancements in AI
(09:50) A definition of AI Engineering and how it differs from Machine Learning Engineering
(16:30) Simple first steps in building AI applications
(22:53) An explanation of BM25 (retrieval system)
(23:43) The problems associated with fine-tuning
(27:55) Simple customer support solutions for rolling out AI thoughtfully
(33:44) Chip's thoughts on staying focused on the problem
(35:19) The challenge in evaluating AI systems
(38:18) Use cases in evaluating AI
(41:24) The importance of prioritizing users' needs and experience
(46:24) Common mistakes made with Gen AI
(52:12) A case for systematic problem solving
(53:13) Project-based learning vs. structured learning
(58:32) Why AI is not the end of engineering
(1:03:11) How AI is helping education and the future use cases we might see
(1:07:13) Rapid fire round

The Pragmatic Engineer deepdives relevant for this episode:
• Applied AI Software Engineering: RAG https://newsletter.pragmaticengineer.com/p/rag
• How do AI software engineering agents work? https://newsletter.pragmaticengineer.com/p/ai-coding-agents
• AI Tooling for Software Engineers in 2024: Reality Check https://newsletter.pragmaticengineer.com/p/ai-tooling-2024
• IDEs with GenAI features that Software Engineers love https://newsletter.pragmaticengineer.com/p/ide-that-software-engineers-love

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
The future of education lies in personalized and scalable solutions, especially in fields like computer engineering where complex concepts often challenge students. This talk introduces Lumina (AI Teaching Assistant), a cutting-edge agentic system designed to revolutionize programming education through its innovative architecture and teaching strategies. Built using OpenAI API, LangChain, RAG, and ChromaDB, Lumina employs an agentic, multi-modal framework that dynamically integrates course materials, technical documentation, and pedagogical strategies into an adaptive knowledge-driven system. Its unique “Knowledge Components” approach decomposes programming concepts into interconnected teachable units, enabling proficiency-based learning and dynamic problem-solving guidance. Attendees will discover how Lumina’s agentic architecture enhances engagement, fosters critical thinking, and improves concept mastery, paving the way for scalable AI-driven educational solutions.
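Lumina's code isn't included in this description, but the ChromaDB retrieval layer such a system might rest on can be sketched; the collection name, documents, and prerequisite metadata below are hypothetical stand-ins for its "Knowledge Components":

```python
# Hypothetical sketch of a ChromaDB-backed "knowledge component" store,
# loosely mirroring the retrieval layer a system like Lumina might use.
# Collection name, documents, and metadata are invented for illustration.
import chromadb

client = chromadb.Client()  # in-memory; a real system would persist
kcs = client.create_collection("knowledge_components")

# Each document is one teachable unit; metadata links prerequisites,
# enabling proficiency-based sequencing.
kcs.add(
    ids=["kc-pointers", "kc-malloc"],
    documents=[
        "A pointer stores the memory address of another value.",
        "malloc allocates a block of heap memory and returns a pointer to it.",
    ],
    metadatas=[{"prereq": ""}, {"prereq": "kc-pointers"}],
)

# Retrieve the most relevant components for a student question; an agent
# would then weave these into an adaptive explanation.
hits = kcs.query(query_texts=["why does my program crash after free?"], n_results=2)
print(hits["ids"][0], hits["documents"][0])
```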