AI assistants are evolving from simple Q&A bots to intelligent, multimodal, multilingual, and agentic systems capable of reasoning, retrieving, and autonomously acting. In this talk, we’ll showcase how to build a voice-enabled, multilingual, multimodal RAG (Retrieval-Augmented Generation) assistant using Gradio, OpenAI’s Whisper, LangChain, LangGraph, and FAISS. Our assistant will not only process voice and text inputs in multiple languages but also intelligently retrieve information from structured and unstructured data. We’ll demonstrate this with a flight search use case—leveraging a flight database for retrieval and, when necessary, autonomously searching external sources using LangGraph. You will gain practical insights into building scalable, adaptive AI assistants that move beyond static chatbots to autonomous agents that interact dynamically with users and the web.
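To make the retrieval path concrete, here is a minimal sketch of the voice-to-retrieval step described above, assuming the open-source whisper, sentence-transformers, and faiss packages; the audio file name, flight records, and model choices are illustrative placeholders, and the LangChain/LangGraph orchestration and Gradio interface covered in the talk are omitted.
```python
import faiss
import numpy as np
import whisper
from sentence_transformers import SentenceTransformer

# 1. Transcribe the spoken query (Whisper detects the language automatically).
asr = whisper.load_model("base")
query_text = asr.transcribe("user_query.wav")["text"]

# 2. Embed a small corpus of flight records and index them with FAISS.
docs = [
    "Flight LH402 departs Frankfurt 10:05, arrives Boston 12:40.",
    "Flight BA117 departs London Heathrow 09:30, arrives New York JFK 12:15.",
]
embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)
index = faiss.IndexFlatIP(doc_vecs.shape[1])  # inner product on normalized vectors = cosine
index.add(np.asarray(doc_vecs, dtype="float32"))

# 3. Retrieve the most relevant record for the transcribed query.
query_vec = embedder.encode([query_text], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query_vec, dtype="float32"), k=1)
print(query_text, "->", docs[ids[0][0]])
```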
Designing tomorrow's materials requires understanding how atoms behave – a challenge that's both fascinating and incredibly complex. While machine learning offers exciting speedups in materials simulation, it often falls short, missing vital electronic structure information needed to connect theory with experimental results. This work introduces a powerful solution: Density Functional Tight Binding (DFTB), which, combined with the versatile tools of Scientific Python, allows us to understand the electronic behavior of materials while maintaining computational efficiency. In this talk, I will present our findings demonstrating how DFTB, coupled with readily available Python packages, allows for direct comparison between theoretical predictions and experimental data, such as XPS measurements. I will also showcase our publicly available repository, containing DFTB parameters for a wide range of materials, making this powerful approach accessible to the broader research community.
This talk explores various methods to accelerate traditional machine learning pipelines using scikit-learn, UMAP, and HDBSCAN on GPUs. We will contrast the experimental Array API Standard support layer in scikit-learn with the cuML library from the NVIDIA RAPIDS Data Science stack, including its zero-code-change acceleration capability. ML and data science practitioners will learn how to seamlessly accelerate machine learning workflows, see the performance benefits, and receive practical guidance for different problem types and sizes. We will also share insights into minimizing cost and runtime by effectively mixing hardware for various tasks, as well as the current implementation status and future plans for these acceleration methods.
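As a rough, hedged illustration of the first approach, the sketch below runs PCA on data that already lives on the GPU as a CuPy array via scikit-learn's experimental Array API dispatch; it assumes a recent scikit-learn with array_api_dispatch support plus CuPy and array-api-compat installed on a CUDA machine. cuML's zero-code-change mode is instead enabled outside the script, e.g. with `python -m cuml.accel your_script.py`, and is not shown here.
```python
import cupy as cp
import sklearn
from sklearn.decomposition import PCA

sklearn.set_config(array_api_dispatch=True)   # route array operations through the Array API

X = cp.random.standard_normal((100_000, 64), dtype=cp.float32)  # data already on the GPU
pca = PCA(n_components=8, svd_solver="full")
X_reduced = pca.fit_transform(X)              # stays on the GPU as a CuPy array
print(type(X_reduced), X_reduced.shape)
```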
In this episode of Hub & Spoken, Jason Foster talks to Cali Wood, Head of Data and AI Strategy & Culture at AXA UK and Ireland. Cali shares how AXA is shaping its data and AI transformation through a clear strategic framework built on creation of value, connection of data and tooling, and culture to accelerate value. From embedding human-centred design into automation use cases to launching a data and AI academy with more than 50% workforce engagement, AXA is making data and AI a true business-wide initiative. This conversation explores: The three pillars of AXA's data and AI strategy How culture and leadership unlock real business value Scaling responsible AI across a highly regulated industry Evolving from traditional to agentic AI in a people-first way Whether you're leading data transformation or navigating GenAI, this episode offers practical ideas and inspiration to help bring your people and strategy together. Listen now to learn how to build AI-driven change - the right way.
Cynozure is a leading data, analytics and AI company that helps organisations to reach their data potential. It works with clients on data and AI strategy, data management, data architecture and engineering, analytics and AI, data culture and literacy, and data leadership. The company was named one of The Sunday Times' fastest-growing private companies in both 2022 and 2023 and recognised as The Best Place to Work in Data by DataIQ in 2023 and 2024. Cynozure is a certified B Corporation.
Deliver flexible, scalable, and high-performance data storage that's perfect for AI and other modern applications with MongoDB 8.0 and the MongoDB Atlas multi-cloud data platform. In MongoDB 8.0 in Action, Third Edition you'll find comprehensive coverage of MongoDB 8.0 and the MongoDB Atlas multi-cloud data platform. Learn to utilize MongoDB's flexible schema design for data modeling, scale applications effectively using advanced sharding features, integrate full-text and vector-based semantic search, and more. This totally revised new edition delivers engaging hands-on tutorials and examples that put MongoDB into action! In MongoDB 8.0 in Action, Third Edition you'll: • Master new features in MongoDB 8.0 • Create your first, free Atlas cluster using the Atlas CLI • Design scalable NoSQL databases with effective data modeling techniques • Master Vector Search for building GenAI-driven applications • Utilize advanced search capabilities in MongoDB Atlas, including full-text search • Build event-driven applications with Atlas Stream Processing • Deploy and manage MongoDB Atlas clusters both locally and in the cloud using the Atlas CLI • Leverage the Atlas SQL interface for familiar SQL querying • Use MongoDB Atlas Online Archive for efficient data management • Establish robust security practices, including encryption • Master backup and restore strategies • Optimize database performance and identify slow queries
MongoDB 8.0 in Action, Third Edition offers a clear, easy-to-understand introduction to everything in MongoDB 8.0 and MongoDB Atlas, including new advanced features such as embedded config servers in sharded clusters and moving an unsharded collection to a different shard. The book also covers Atlas Stream Processing, full-text search, and vector search capabilities for generative AI applications. Each chapter is packed with tips, tricks, and practical examples you can quickly apply to your projects, whether you're brand new to MongoDB or looking to get up to speed with the latest version.
About the Technology MongoDB is the database of choice for storing structured, semi-structured, and unstructured data like business documents and other text and image files. MongoDB 8.0 introduces a range of exciting new features, from sharding improvements that simplify the management of distributed data to performance enhancements that stay resilient under heavy workloads. Plus, MongoDB Atlas brings vector search and full-text search features that support AI-powered applications.
About the Book In MongoDB 8.0 in Action, Third Edition, you'll learn how to take advantage of all the new features of MongoDB 8.0, including the powerful MongoDB Atlas multi-cloud data platform. You'll start with the basics of setting up and managing a document database. Then, you'll learn how to use MongoDB for AI-driven applications, implement advanced stream processing, and optimize performance with improved indexing and query handling. Hands-on projects like creating a RAG-based chatbot and building an aggregation pipeline mean you'll really put MongoDB into action!
What's Inside • The new features in MongoDB 8.0 • Getting familiar with MongoDB's Atlas cloud platform • Utilizing sharding enhancements • Using vector-based search technologies • Full-text search capabilities for efficient text indexing and querying
About the Reader For developers and DBAs of all levels. No prior experience with MongoDB required.
About the Author Arek Borucki is a MongoDB Champion and a certified MongoDB and MongoDB Atlas administrator with expertise in distributed systems, NoSQL databases, and Kubernetes.
Quotes
"An excellent resource with real-world examples and best practices to design, optimize, and scale modern applications." - Advait Patel, Broadcom
"Essential MongoDB resource. Covers new features such as full-text search, vector search, AI, and RAG applications." - Juan Roy, Credit Suisse
"Reflects the author's practical experience and clear teaching style. It's packed with real-world examples and up-to-date insights." - Rajesh Nair, MongoDB Champion & community leader
"This book will definitely make you a MongoDB star!" - Vinicios Wentz, JPMorgan Chase & Co.
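To make the vector search capability concrete, here is a small, illustrative PyMongo sketch of an Atlas Vector Search aggregation of the kind the book walks through; the connection string, database, index name, field names, and query vector are placeholders, not examples taken from the book itself.
```python
from pymongo import MongoClient

client = MongoClient("mongodb+srv://<user>:<password>@cluster0.example.mongodb.net")
collection = client["travel"]["flights"]

pipeline = [
    {
        "$vectorSearch": {
            "index": "flight_embeddings",        # Atlas Search index with a vector field
            "path": "embedding",                 # document field holding the stored vectors
            "queryVector": [0.12, -0.03, 0.88],  # embedding of the user's question
            "numCandidates": 100,
            "limit": 5,
        }
    },
    {"$project": {"_id": 0, "route": 1, "score": {"$meta": "vectorSearchScore"}}},
]

for doc in collection.aggregate(pipeline):
    print(doc)
```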
This talk presents a candid reflection on integrating generative AI into an Engineering Computations course, revealing unexpected challenges despite best intentions. Students quickly developed patterns of using AI as a shortcut rather than a learning companion, leading to decreased attendance and an "illusion of competence." I'll discuss the disconnect between instructor expectations and student behavior, analyze how traditional assessment structures reinforced counterproductive AI usage, and share strategies for guiding students toward using AI as a co-pilot rather than a substitute for critical thinking while maintaining academic integrity.
Explainable AI (XAI) emerged to clarify the decision-making of complex deep learning models, but standard XAI methods are often uninformative on Earth system models due to their high-dimensional and physically constrained nature. We introduce “physical XAI,” which adapts XAI techniques to maintain physical realism and handle autocorrelated data effectively. Our approach includes physically consistent perturbations, analysis of uncertainty, and the use of variance-based global sensitivity tools. Furthermore, we expand the definition of “physical XAI” to include meaningful interactive data analysis. We demonstrate these methods on two Earth system models: a data-driven global weather model and a winter precipitation type model to show how we can gain more physically meaningful insights.
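As one concrete flavor of the variance-based global sensitivity tools mentioned above, here is a small sketch using the SALib package; the toy model, variable names, and bounds are placeholders standing in for a real Earth system model's inputs and outputs, not the models analyzed in the talk.
```python
import numpy as np
from SALib.sample import saltelli
from SALib.analyze import sobol

problem = {
    "num_vars": 3,
    "names": ["temperature", "humidity", "wind_speed"],
    "bounds": [[250.0, 310.0], [0.0, 1.0], [0.0, 40.0]],
}

X = saltelli.sample(problem, 1024)  # quasi-random parameter samples
Y = X[:, 0] * 0.01 + np.sin(X[:, 1] * np.pi) + 0.1 * X[:, 2]  # stand-in for the model
Si = sobol.analyze(problem, Y)      # first-order and total-order Sobol indices

for name, s1, st in zip(problem["names"], Si["S1"], Si["ST"]):
    print(f"{name}: first-order={s1:.3f}, total-order={st:.3f}")
```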
At CERN (European Organization for Nuclear Research), machine learning models are developed and deployed for various applications, including data analysis, event reconstruction, and classification. These models must not only be highly sophisticated but also optimized for efficient inference. A critical application is in triggers: systems designed to identify and select interesting events from an immense stream of experimental data. Experiments like ATLAS and CMS generate data at rates of approximately 100 TB/s, requiring triggers to rapidly filter out irrelevant events. This talk will explore the challenges of deploying machine learning in such high-throughput environments and discuss solutions to enhance their performance and reliability.
X-ray ptychographic imaging is becoming an indispensable tool for visualizing matter at the nanoscale, driving innovation across many fields, including functional materials, electronics, and the life sciences. This imaging mode is particularly attractive thanks to its ability to generate a high-resolution view of an extended object without using a lens with a high numerical aperture. The technique relies on advanced mathematical algorithms to retrieve the missing phase information that is not directly recorded by a physical detector, and is therefore computationally intensive. Advances in accelerator, optics, and detector technologies have greatly increased data generation rates, posing a significant challenge to executing the reconstruction process efficiently enough to support decision-making during an experiment. Here, we demonstrate how efficient GPU-based reconstruction algorithms, deployed at the edge, enable real-time feedback during high-speed continuous data acquisition, increasing the speed and efficiency of the experiments. These developments further pave the way for AI-augmented autonomous microscopic experimentation performed at machine speeds.
Generative Artificial Intelligence (AI) is reshaping engineering education by offering students new ways to engage with complex concepts and content. Ethical concerns including bias, intellectual property, and plagiarism make generative AI a controversial educational tool. Overreliance on AI may also lead to academic integrity issues, necessitating clear student codes of conduct that define acceptable use. As educators, we should carefully design learning objectives to align with transferable career skills in our fields. By practicing backward design with a focus on career-readiness skills, we can incorporate useful prompt engineering, rapid prototyping, and critical reasoning skills that make use of generative AI. Engineering students want to develop essential career skills such as critical thinking, communication, and technological fluency. This talk will focus on case studies for using generative AI and rapid prototyping for scientific computing in engineering courses for physics, programming, and technical writing. These courses include assignments and reading examples using NumPy, SciPy, Pandas, etc. in Jupyter notebooks. Embracing generative AI tools has helped students compare, evaluate, and discuss work that was inaccessible before generative AI. This talk explores strategies for using AI in engineering education while accomplishing learning objectives and giving students opportunities to practice career-readiness skills.
Scaling artificial intelligence (AI) and machine learning (ML) workflows on high-performance computing (HPC) systems presents unique challenges, particularly as models become more complex and data-intensive. This study explores strategies to optimize AI/ML workflows for enhanced performance and resource utilization on HPC platforms.
We investigate advanced parallelization techniques, such as Data Parallelism (DP), Distributed Data Parallel (DDP), and Fully Sharded Data Parallel (FSDP). Implementing memory-efficient strategies, including mixed precision training and activation checkpointing, significantly reduces memory consumption without compromising model accuracy. Additionally, we examine various communication backends (i.e., NCCL, MPI, and Gloo) to enhance inter-GPU and inter-node communication efficiency. Special attention is given to the complexities of implementing these backends in HPC environments, providing solutions for proper configuration and execution.
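For readers unfamiliar with these pieces, the condensed PyTorch sketch below shows the DDP + NCCL + mixed-precision pattern in isolation; it assumes a torchrun-style launch (e.g. `torchrun --nproc_per_node=4 train.py`) and uses a toy model and random data rather than anything from this study.
```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

dist.init_process_group(backend="nccl")            # NCCL for GPU-to-GPU communication
local_rank = int(os.environ.get("LOCAL_RANK", 0))  # set by torchrun for each process
torch.cuda.set_device(local_rank)

model = DDP(torch.nn.Linear(1024, 1024).cuda(local_rank), device_ids=[local_rank])
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()               # loss scaling for mixed precision

for step in range(10):
    x = torch.randn(32, 1024, device=local_rank)
    with torch.cuda.amp.autocast():                # forward pass in reduced precision
        loss = model(x).square().mean()
    optimizer.zero_grad(set_to_none=True)
    scaler.scale(loss).backward()                  # DDP all-reduces gradients here
    scaler.step(optimizer)
    scaler.update()

dist.destroy_process_group()
```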
Our findings demonstrate that these optimizations enable stable and scalable AI/ML model training and inference, achieving substantial improvements in training times and resource efficiency. This presentation will detail the technical challenges encountered and the solutions developed, offering insights into effectively scaling AI/ML workflows on HPC systems.
Block-based programming divides inputs into local arrays that are processed concurrently by groups of threads. Users write sequential array-centric code, and the framework handles parallelization, synchronization, and data movement behind the scenes. This approach aligns well with SciPy's array-centric ethos and has roots in older HPC libraries, such as NWChem’s TCE, BLIS, and ATLAS.
In recent years, many block-based Python programming models for GPUs have emerged, like Triton, JAX/Pallas, and Warp, aiming to make parallelism more accessible for scientists and increase portability.
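To give a feel for the style these models share, here is a short kernel in plain Triton (not cuTile, whose API is introduced in the talk itself): each program instance loads one block of the inputs, operates on it as an array, and the framework handles the thread-level parallelism and the masking of the ragged final block.
```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)  # indices of this block
    mask = offsets < n_elements                            # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(10_000, device="cuda")
y = torch.randn(10_000, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)                     # one program per block
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```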
In this talk, we'll present cuTile and Tile IR, a new Pythonic tile-based programming model and compiler recently announced by NVIDIA. We'll explore cuTile examples from a variety of domains, including a new LLAMA3-based reference app and a port of miniWeather. You'll learn the best practices for writing and debugging block-based Python GPU code, gain insight into how such code performs, and learn how it differs from traditional SIMT programming.
By the end of the session, you'll understand how block-based GPU programming enables more intuitive, portable, and efficient development of high-performance, data-parallel Python applications for HPC, data science, and machine learning.
LLMs are powerful, flexible, easy-to-use... and often wrong. This is a dangerous combination, especially for data analysis and scientific research, where correctness and reproducibility are core requirements. Fortunately, it turns out that by carefully applying LLMs to narrower use cases, we can turn them into surprisingly reliable assistants that accelerate and enhance, rather than undermine, scientific work.
This is not just theory—I’ll showcase working examples of seamlessly integrating LLMs into analytic workflows, helping data scientists build interactive, intelligent applications without needing to be web developers. You’ll see firsthand how keeping LLMs focused lets us leverage their "intelligence" in a way that’s practical, rigorous, and reproducible.
Large language models (LLMs) enable powerful data-driven applications, but many projects get stuck in “proof-of-concept purgatory”—where flashy demos fail to translate into reliable, production-ready software. This talk introduces the LLM software development lifecycle (SDLC)—a structured approach to moving beyond early-stage prototypes. Using first principles from software engineering, observability, and iterative evaluation, we’ll cover common pitfalls, techniques for structured output extraction, and methods for improving reliability in real-world data applications. Attendees will leave with concrete strategies for integrating AI into scientific Python workflows—ensuring LLMs generate value beyond the prototype stage.
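As one example of the structured-output techniques alluded to here, a common pattern is to validate model output against a schema before anything downstream consumes it. In this sketch, `call_llm` is a hypothetical stand-in for your provider's client and the record schema is purely illustrative (Pydantic v2 API assumed).
```python
import json
from pydantic import BaseModel, ValidationError

class SampleRecord(BaseModel):
    sample_id: str
    temperature_c: float
    passed_qc: bool

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM client; returns the model's raw text."""
    return '{"sample_id": "S-042", "temperature_c": 21.5, "passed_qc": true}'

raw = call_llm(
    "Extract sample_id, temperature_c, and passed_qc as JSON from this lab note:\n"
    "Sample S-042 was measured at 21.5 C and passed QC."
)
try:
    record = SampleRecord.model_validate(json.loads(raw))  # fail fast on malformed output
except (json.JSONDecodeError, ValidationError) as err:
    record = None  # in production: log the error and retry with it appended to the prompt
print(record)
```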
Training Large Language Models (LLMs) requires processing massive-scale datasets efficiently. Traditional CPU-based data pipelines struggle to keep up with the exponential growth of data, leading to bottlenecks in model training. In this talk, we present NeMo Curator, an accelerated, scalable Python-based framework designed to curate high-quality datasets for LLMs efficiently. Leveraging GPU-accelerated processing with RAPIDS, NeMo Curator provides modular pipelines for synthetic data generation, deduplication, filtering, classification, and PII redaction—improving data quality and training efficiency.
We will showcase real-world examples demonstrating how multi-node, multi-GPU processing scales dataset preparation to 100+ TB of data, achieving up to 7% improvement in LLM downstream tasks. Attendees will gain insights into configurable pipelines that enhance training workflows, with a focus on reproducibility, scalability, and open-source integration within Python's scientific computing ecosystem.
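NeMo Curator's own pipeline APIs are best taken from its documentation; as a conceptual stand-in for the fuzzy deduplication step, here is a tiny MinHash-LSH near-duplicate check using the `datasketch` package, the same signature-and-banding idea that GPU-accelerated pipelines apply at terabyte scale.
```python
from datasketch import MinHash, MinHashLSH

def minhash(text: str, num_perm: int = 128) -> MinHash:
    """Build a MinHash signature from whitespace tokens."""
    m = MinHash(num_perm=num_perm)
    for token in text.lower().split():
        m.update(token.encode("utf-8"))
    return m

docs = {
    "a": "the quick brown fox jumps over the lazy dog",
    "b": "the quick brown fox jumped over the lazy dog",  # near-duplicate of "a"
    "c": "completely unrelated sentence about data curation",
}

lsh = MinHashLSH(threshold=0.8, num_perm=128)
for key, text in docs.items():
    lsh.insert(key, minhash(text))

print(lsh.query(minhash(docs["a"])))  # keys of likely near-duplicates, e.g. ['a', 'b']
```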
The widespread fascination with AI often fuels a "myth of the artificial", the belief that scientific and technological progress stems solely from algorithms and large tech breakthroughs. This talk challenges that notion, arguing that truly responsible and impactful science is fundamentally built upon and sustained by the resilient, collective intelligence of the scientific and research community.
Supported by Our Partners • Statsig — The unified platform for flags, analytics, experiments, and more. • Graphite — The AI developer productivity platform. • Augment Code — AI coding assistant that pro engineering teams love. — Steve Huynh spent 17 years at Amazon, including four as a Principal Engineer. In this episode of The Pragmatic Engineer, I join Steve in his studio for a deep dive into what the Principal role actually involves, why the path from Senior to Principal is so tough, and how even strong engineers can get stuck. Not because they’re unqualified, but because the bar is exceptionally high. We discuss what’s expected at the Principal level, the kind of work that matters most, and the trade-offs that come with the title. Steve also shares how Amazon’s internal policies shaped his trajectory, and what made the Principal Engineer community one of the most rewarding parts of his time at the company. We also go into: • Why being promoted from Senior to Principal is one of the hardest jumps in tech • How Amazon’s freedom of movement policy helped Steve work across multiple teams, from Kindle to Prime Video • The scale of Amazon: handling 10k–100k+ requests per second and what that means for engineering • Why latency became a company-wide obsession—and the research that tied it directly to revenue • Why companies should start with a monolith, and what led Amazon to adopt microservices • What makes the Principal Engineering community so special • Amazon’s culture of learning from its mistakes, including COEs (correction of errors) • The pros and cons of the Principal Engineer role • What Steve loves about the leadership principles at Amazon • Amazon’s intense writing culture and 6-pager format • Why Amazon patents software and what that process looks like • And much more! — Timestamps (00:00) Intro (01:11) What Steve worked on at Amazon, including Kindle, Prime Video, and payments (04:38) How Steve was able to work on so many teams at Amazon (09:12) An overview of the scale of Amazon and the dependency chain (16:40) Amazon’s focus on latency and the tradeoffs they make to keep latency low at scale (26:00) Why companies should start with a monolith (26:44) The structure of engineering at Amazon and why Amazon’s Principal is so hard to reach (30:44) The Principal Engineering community at Amazon (36:06) The learning benefits of working for a tech giant (38:44) Five challenges of being a Principal Engineer at Amazon (49:50) The types of managing work you have to do as a Principal Engineer (51:47) The pros and cons of the Principal Engineer role (54:59) What Steve loves about Amazon’s leadership principles (59:15) Amazon’s intense focus on writing (1:01:11) Patents at Amazon (1:07:58) Rapid fire round — The Pragmatic Engineer deepdives relevant for this episode: • Inside Amazon’s engineering culture — See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast — Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].
Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe
The practice of data science in genomics and computational biology is fraught with friction. This is largely due to a tight coupling of bioinformatic tools to file input/output. While omic data is specialized and the storage formats for high-throughput sequencing and related data are often standardized, the adoption of emerging open standards not tied to bioinformatics can help better integrate bioinformatic workflows into the wider data science, visualization, and AI/ML ecosystems. Here, we present two bridge libraries as short vignettes for composable bioinformatics. First, we present Anywidget, an architecture and toolkit based on modern web standards for sharing interactive widgets across all Jupyter-compatible runtimes, including JupyterLab, Google Colab, VSCode, and more. Second, we present Oxbow, a Rust and Python-based adapter library that unifies access to common genomic data formats by efficiently transforming queries into Apache Arrow, a standard in-memory columnar representation for tabular data analytics. Together, we demonstrate the composition of these libraries to build custom, connected genomic analysis and visualization environments. We propose that components such as these, which leverage scientific domain-agnostic standards to unbundle specialized file manipulation, analytics, and web interactivity, can serve as reusable building blocks for composing flexible genomic data analysis and machine learning workflows as well as systems for exploratory data analysis and visualization.
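For a flavor of the Anywidget model, here is a minimal counter widget, assuming the `anywidget` and `traitlets` packages: the Python side declares synced state, the JavaScript side renders it, and the same class runs unchanged in JupyterLab, Colab, or VS Code; the widget itself is a generic example, not one from these genomics workflows.
```python
import anywidget
import traitlets

class CounterWidget(anywidget.AnyWidget):
    # JavaScript module rendering the widget and reacting to state changes
    _esm = """
    function render({ model, el }) {
      const button = document.createElement("button");
      button.textContent = `count: ${model.get("count")}`;
      button.addEventListener("click", () => {
        model.set("count", model.get("count") + 1);
        model.save_changes();
      });
      model.on("change:count", () => {
        button.textContent = `count: ${model.get("count")}`;
      });
      el.appendChild(button);
    }
    export default { render };
    """
    count = traitlets.Int(0).tag(sync=True)  # state synced between Python and JS

CounterWidget()  # display in any Jupyter-compatible frontend
```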
Data Hackers News is on the air! The hottest topics of the week, with the top news in Data, AI, and Technology, which you can also find in our weekly newsletter, now on the Data Hackers podcast! Hit play and listen to this week's Data Hackers News now! To stay on top of everything happening in the data world, subscribe to the weekly newsletter: https://www.datahackers.news/ In this episode: registration for the Data Hackers Challenge 2025, and the Bain livestream on GenAI strategies for analyzing unstructured data. Meet our Data Hackers News commentators: Monique Femme and Paulo Vasconcellos. Other Data Hackers channels: Site, LinkedIn, Instagram, TikTok, YouTube.