talk-data.com

Topic: AI/ML (Artificial Intelligence/Machine Learning)

Tags: data_science, algorithms, predictive_analytics

9014 tagged activities

Activity Trend: 1532 peak/qtr (2020-Q1 to 2026-Q1)

Activities

9014 activities · Newest first

Business intelligence has been transforming organizations for decades, yet many companies still struggle with widespread adoption. With less than 40% of employees in most organizations having access to BI tools, there is a significant 'information underclass' making decisions without data-driven insights. How can businesses bridge this gap and achieve true information democracy? While new technologies like generative AI and semantic layers offer promising solutions, the fundamentals of data quality and governance remain critical. What balance should organizations strike between investing in innovative tools and strengthening their data infrastructure? How can you ensure your business becomes a 'data athlete' capable of making hyper-decisive moves in an uncertain economic landscape?

Howard Dresner is founder and Chief Research Officer at Dresner Advisory Services and a leading voice in Business Intelligence (BI), credited with coining the term "Business Intelligence" in 1989. He spent 13 years at Gartner as its lead BI analyst, shaping the research agenda and earning recognition as Analyst of the Year, Distinguished Analyst, and Gartner Fellow. He also led Gartner's BI conferences in Europe and North America. Before founding Dresner Advisory in 2007, Howard was Chief Strategy Officer at Hyperion Solutions, where he drove strategy and thought leadership, helping position Hyperion as a leader in performance management prior to its acquisition by Oracle.

Howard has written two books, The Performance Management Revolution: Business Results through Insight and Action and Profiles in Performance: Business Intelligence Journeys and the Roadmap for Change, both published by John Wiley & Sons.

In the episode, Richie and Howard explore the surprisingly low penetration of business intelligence in organizations, the importance of data governance and infrastructure, the evolving role of AI in BI, the strategic initiatives driving BI usage, and much more.
Links Mentioned in the Show:
- Dresner Advisory Services
- Howard's Book: Profiles in Performance: Business Intelligence Journeys and the Roadmap for Change
- Connect with Howard
- Skill Track: Power BI Fundamentals
- Related Episode: The Next Generation of Business Intelligence with Colin Zima, CEO at Omni
- Rewatch RADAR AI
- New to DataCamp? Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for Business

Since agent processing takes significant time, what happens to latency when agentic AI is introduced into an existing workflow? What are the latency challenges? What are the key strategies to overcome them? How should user expectations be reset? What should be done to maintain or enhance the user experience? What trade-offs should be considered between performance, latency, cost, and other factors?

Recent breakthroughs in large language model-based artificial intelligence (AI) have captured the public’s interest in AI more broadly. With the growing adoption of these technologies in professional and educational settings, public dialog about their potential impacts on the workforce has been ubiquitous. It is, however, difficult to separate the public dialog about the potential impact of the technology from the experienced impact of the technology in the research software engineer and data science workplace. Likewise, it is challenging to separate the generalized anxiety about AI from its specific impacts on individuals working in specialized work settings.

As research software engineers (RSEs) and those in adjacent computational fields engage with AI in the workplace, the realities of the impacts of this technology are becoming clearer. However, much of the dialog has been limited to high-level discussion around general intra-institutional impacts, and lacks the nuance required to provide helpful guidance to RSE practitioners in research settings, specifically. Surprisingly, many RSEs are not involved in career discussions on what the rise of AI means for their professions.

During this BoF, we will hold a structured, interactive discussion session with the goal of identifying critical areas of engagement with AI in the workplace including: current use of AI, AI assistance and automation, AI skills and workforce development, AI and open science, and AI futures. This BoF will represent the first of a series of discussions held jointly by the Academic Data Science Alliance and the US Research Software Engineer Association over the coming year, with support from Schmidt Sciences. The insights gathered from these sessions will inform the development of guidance resources on these topic areas for the broader RSE and computational data practitioner communities.

The rapid growth of scientific data repositories demands innovative solutions for efficient metadata creation. In this talk, we present our open-source project that leverages large language models to automate the generation of standard-compliant metadata files from raw scientific datasets. Our approach harnesses the capabilities of pre-trained open-source models, fine-tuned with domain-specific data and integrated with LangGraph to orchestrate a modular, end-to-end pipeline capable of ingesting heterogeneous raw data files and outputting metadata conforming to specific standards.

The methodology involves a multi-stage process where raw data is first parsed and analyzed by the LLM to extract relevant scientific and contextual information. This information is then structured into metadata templates that adhere strictly to recognized standards, thereby reducing human error and accelerating the data release cycle. We demonstrate the effectiveness of our approach using the USGS ScienceBase repository, where we have successfully generated metadata for a variety of scientific datasets, including images, time series, and text data.
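As a toy illustration of the template-structuring stage (the field names and schema below are hypothetical, not the project's actual code), LLM-extracted fields can be validated against the required metadata elements before a record is emitted, so a non-compliant record fails fast instead of entering the release cycle:

```python
# Illustrative sketch: validate LLM-extracted fields against a
# required metadata schema before emitting a record.
from dataclasses import dataclass, asdict, field

@dataclass
class DatasetMetadata:
    # Hypothetical required elements of a metadata standard
    title: str
    abstract: str
    keywords: list = field(default_factory=list)
    spatial_extent: str = ""

def build_metadata(extracted: dict) -> dict:
    """Reject LLM output that is missing required fields, otherwise
    structure it into the standard-compliant template."""
    required = ["title", "abstract", "keywords", "spatial_extent"]
    missing = [k for k in required if not extracted.get(k)]
    if missing:
        raise ValueError(f"LLM output missing required fields: {missing}")
    return asdict(DatasetMetadata(**{k: extracted[k] for k in required}))

record = build_metadata({
    "title": "Streamflow time series, 2020-2024",
    "abstract": "Daily discharge measurements at gauge stations.",
    "keywords": ["hydrology", "streamflow"],
    "spatial_extent": "-105.1,39.7,-104.9,39.9",
})
```

In a real pipeline, a second stage would serialize `record` into the target standard's file format and run it through a formal schema validator.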

Beyond its immediate application to the USGS ScienceBase repository, our open-source framework is designed to be extensible, allowing adaptation to other data release processes across various scientific domains. We will discuss the technical challenges encountered, such as managing diverse data formats and ensuring metadata quality, and outline strategies for community-driven enhancements. This work not only streamlines the metadata creation workflow but also sets the stage for broader adoption of generative AI in scientific data management.

Additional Material:
- Project supported by USGS and ORNL
- Codebase will be available on GitHub after paper publication
- Fine-tuned LLM models will be available on Hugging Face after paper publication

Flyte is a Linux Foundation OSS orchestrator built for data and machine learning workflows, focused on scalability, reliability, and developer productivity. Flyte's Python SDK, Flytekit, empowers developers to ship their code from their local environments onto a cluster with one simple CLI command. In this talk, you will learn about the design and implementation details that power Flytekit's core features, such as "fast registration" and "type transformers", and a plugin system that enables Dask, Ray, or distributed GPU workflows.
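The "type transformer" idea can be sketched in plain Python: a registry maps native Python types to functions that convert values to a serialized literal form and back, so typed task inputs and outputs can cross process and machine boundaries. This is a conceptual sketch only; Flytekit's actual TypeTransformer API differs, and all names here are invented for illustration.

```python
# Conceptual sketch of a type-transformer registry (not Flytekit's API).
import json

class TransformerRegistry:
    def __init__(self):
        self._by_type = {}

    def register(self, py_type, to_literal, from_literal):
        # Each Python type gets a pair of conversion functions.
        self._by_type[py_type] = (to_literal, from_literal)

    def serialize(self, value) -> str:
        to_lit, _ = self._by_type[type(value)]
        return json.dumps({"type": type(value).__name__,
                           "value": to_lit(value)})

    def deserialize(self, blob: str, py_type):
        _, from_lit = self._by_type[py_type]
        return from_lit(json.loads(blob)["value"])

registry = TransformerRegistry()
registry.register(dict, lambda d: d, lambda v: v)

blob = registry.serialize({"epochs": 10})       # wire format for the cluster
restored = registry.deserialize(blob, dict)     # back to a native dict
```

The design point is extensibility: plugins can register transformers for domain types (dataframes, tensors, file handles) without changing the orchestrator core.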

The increasing prevalence of AI models necessitates robust mechanisms to ensure their trustworthiness. This talk introduces a standardized, PKI-agnostic approach to verifying the origins and integrity of machine learning models, as built by the OpenSSF Model Signing project. We extend this methodology beyond models to encompass datasets and other associated files, offering a holistic solution for maintaining data provenance and integrity.
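The foundation of such provenance checking can be illustrated with a plain digest manifest; the OpenSSF Model Signing project layers real, PKI-agnostic cryptographic signatures on top of digests like these. The function names below are our own illustration, not the project's API.

```python
# Illustrative digest manifest for model/dataset integrity checking.
import hashlib
import pathlib
import tempfile

def digest_files(paths):
    """Map each file path to its SHA-256 digest."""
    return {str(p): hashlib.sha256(pathlib.Path(p).read_bytes()).hexdigest()
            for p in paths}

def verify(manifest, paths):
    """True only if every file still matches its recorded digest."""
    return digest_files(paths) == manifest

with tempfile.TemporaryDirectory() as d:
    weights = pathlib.Path(d) / "model.bin"
    weights.write_bytes(b"\x00" * 1024)       # stand-in model artifact
    manifest = digest_files([weights])        # recorded at signing time
    ok_before = verify(manifest, [weights])   # unmodified: passes
    weights.write_bytes(b"tampered")
    ok_after = verify(manifest, [weights])    # tampered: fails
```

A real signing scheme signs the manifest itself, so a verifier can trust both the digests and their origin.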

Real-time machine learning depends on features and data that by definition can't be pre-computed. Detecting fraud or acute diseases like sepsis requires processing events that emerged seconds ago. How do we build an infrastructure platform that executes complex data pipelines end-to-end, on demand, in under 10 ms? All while meeting data teams where they are, in Python, the language of ML! Learn how we built a symbolic interpreter that accelerates ML pipelines by transpiling Python into DAGs of static expressions. These expressions are optimized in C++ and ultimately run in production workloads at scale with Velox, an OSS (~4k stars) unified query engine (C++) from Meta.
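The symbolic-interpretation trick can be shown with a toy tracer (our own sketch, not the system described in the talk): running ordinary Python feature logic on placeholder values records a DAG of static expressions instead of computing numbers, and that DAG is what a backend such as Velox could then optimize and execute.

```python
# Toy symbolic tracer: operator overloading turns Python arithmetic
# into an expression DAG rather than numeric results.
class Expr:
    def __init__(self, op, *args):
        self.op, self.args = op, args

    def __add__(self, other):
        return Expr("add", self, other)

    def __mul__(self, other):
        return Expr("mul", self, other)

    def __repr__(self):
        if self.op == "input":
            return self.args[0]
        return f"{self.op}({', '.join(map(repr, self.args))})"

def risk_score(amount, velocity):
    # Ordinary Python feature logic, written as a data team would.
    return amount * velocity + amount

# Tracing with symbolic inputs yields a static DAG, not a number.
dag = risk_score(Expr("input", "amount"), Expr("input", "velocity"))
print(dag)  # add(mul(amount, velocity), amount)
```

Because the DAG is static, a compiler can fuse, reorder, and lower it to C++ once, then execute it per-request without re-running the Python interpreter.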

Our global forecast looks for a sharp slowing in growth in the coming months, concentrated in the US. Although this contrasts with the more benign outlook apparent in risk markets, we see downside risks edging higher on US trade and immigration policies. Recent US fiscal policies should provide some offset, but likely less than advertised.

Speakers:

Bruce Kasman

Joseph Lupton

This podcast was recorded on 11 July 2025.

This communication is provided for information purposes only. Institutional clients please visit www.jpmm.com/research/disclosures for important disclosures. © 2025 JPMorgan Chase & Co. All rights reserved. This material or any portion hereof may not be reprinted, sold or redistributed without the written consent of J.P. Morgan. It is strictly prohibited to use or share without prior written consent from J.P. Morgan any research material received from J.P. Morgan or an authorized third-party (“J.P. Morgan Data”) in any third-party artificial intelligence (“AI”) systems or models when such J.P. Morgan Data is accessible by a third-party. It is permissible to use J.P. Morgan Data for internal business purposes only in an AI system or model that protects the confidentiality of J.P. Morgan Data so as to prevent any and all access to or use of such J.P. Morgan Data by any third-party.

podcast_episode
by Cris deRitis, Austan Goolsbee (Federal Reserve Bank of Chicago), Mark Zandi (Moody's Analytics), Marisa DiNatale (Moody's Analytics)

Chicago Federal Reserve President Austan Goolsbee joins Mark and Cris to talk about the economy and monetary policy. He explains that the up-and-down tariffs and other economic policies have thrown lots of dirt in the air, so to speak, complicating things for the Fed and thus delaying the normalization of interest rates. He also weighs in on the policy response to the financial crisis and the economic repercussions of artificial intelligence. And tune in to hear why he wants to be 80% Paul Volcker and 20% Muhammad Ali. Guest: Austan Goolsbee, President of the Federal Reserve Bank of Chicago. Hosts: Mark Zandi, Chief Economist, Moody's Analytics; Cris deRitis, Deputy Chief Economist, Moody's Analytics; and Marisa DiNatale, Senior Director and Head of Global Forecasting, Moody's Analytics. Follow Mark Zandi on 'X' and BlueSky @MarkZandi, Cris deRitis on LinkedIn, and Marisa DiNatale on LinkedIn.

Questions or comments? Please email us at [email protected]. We would love to hear from you. To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.


AI, particularly generative AI, is rapidly transforming the scientific landscape, offering unprecedented opportunities and novel challenges across all stages of research. This Birds of a Feather session aims to bring together researchers, developers, and practitioners to share experiences, discuss best practices, and explore the evolving role of AI in science.

Building AI Agents with LLMs, RAG, and Knowledge Graphs

This book provides a comprehensive and practical guide to creating cutting-edge AI agents combining advanced technologies such as LLMs, retrieval-augmented generation (RAG), and knowledge graphs. By reading this book, you'll gain a deep understanding of how to design and build AI agents capable of real-world problem solving, reasoning, and action execution.

What this book will help me do:
- Understand the foundations of LLMs, RAG, and knowledge graphs, and how they can be combined to build effective AI agents.
- Learn techniques to enhance factual accuracy and grounding through RAG pipelines and knowledge graphs.
- Develop AI agents that integrate planning, reasoning, and live tool usage to solve complex problems.
- Master the use of Python and popular AI libraries to build scalable AI agent applications.
- Acquire strategies for deploying and monitoring AI agents in production for reliable operation.

Author(s): This book is written by Salvatore Raieli and Gabriele Iuculano, accomplished experts in artificial intelligence and machine learning. Both authors bring extensive professional experience from their work in AI-related fields, particularly in applying innovative AI methods to solve challenging problems. Through their clear and approachable writing style, they aim to make advanced AI concepts accessible to readers at various levels.

Who is it for? This book is ideally suited for data scientists, AI practitioners, and technology enthusiasts seeking to deepen their knowledge in building intelligent AI agents. It is perfect for those who already have a foundational understanding of Python and general artificial intelligence concepts. Experienced professionals looking to explore state-of-the-art AI solutions, as well as beginners eager to advance their technical skills, will find this book invaluable.
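The retrieve-then-ground loop at the heart of RAG can be shown in a few lines of pure Python. This is our own toy illustration, not code from the book: a keyword-overlap score stands in for a real embedding-based retriever, and the assembled prompt would be sent to an LLM.

```python
# Toy RAG sketch: retrieve the most relevant documents for a query,
# then build a prompt grounded in that retrieved context.
def score(query: str, doc: str) -> float:
    # Crude relevance: fraction of query words appearing in the doc.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q)

def retrieve(query: str, docs: list, k: int = 2) -> list:
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

docs = [
    "Knowledge graphs store entities and relations.",
    "RAG grounds LLM answers in retrieved documents.",
    "Paris is the capital of France.",
]
context = retrieve("how does RAG ground LLM answers", docs)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

Knowledge graphs slot into the same loop as a structured retrieval source: instead of (or alongside) text chunks, the agent retrieves entities and relations to ground its reasoning.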

The Universe isn't always so quiet: neutron stars, fast radio bursts, and potentially alien civilizations emit bursts of electromagnetic energy - radio transients - into the unknown. In some cases, these emissions, like with pulsars, are constant and periodic; but in others, like with fast radio bursts, they're short in duration and infrequent. Classical detection surveys typically rely on dedispersion techniques and human-crafted signal processing filters to remove noise and highlight a signal of interest. But what if we're missing something?

In this talk we will introduce a workflow that avoids classical processing altogether. By feeding RF samples directly from the telescope's digitizers into GPU computing, we can train an AI model to serve as a detector, not only enabling real-time performance but also making decisions directly on raw spectrogram data, eliminating the need for classical processing. We will demonstrate how each step of the pipeline works, from AI model training and data curation to real-time inferencing at scale. Our hope is that this new sensor-processing architecture can simplify development, democratize science, and process increasingly large amounts of data in real time.
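As a minimal illustration of the representation involved (our own toy example, not the talk's pipeline), raw samples can be turned into a power spectrogram with a short-time FFT; this 2-D time-frequency array is what an AI detector consumes directly, in place of hand-crafted dedispersion filters.

```python
# Toy radio-transient example: raw samples -> power spectrogram.
import numpy as np

def spectrogram(samples, nfft=64, hop=32):
    """Short-time FFT: windowed frames -> power in each frequency bin."""
    frames = [samples[i:i + nfft]
              for i in range(0, len(samples) - nfft + 1, hop)]
    window = np.hanning(nfft)
    return np.abs(np.fft.rfft(np.asarray(frames) * window, axis=1)) ** 2

rng = np.random.default_rng(0)
t = np.arange(4096)
burst = np.sin(2 * np.pi * 0.2 * t) * (t > 2000)   # transient "signal"
samples = 0.1 * rng.standard_normal(4096) + burst  # noise + late burst
spec = spectrogram(samples)                        # shape: (frames, bins)
```

A detector model (e.g. a small CNN) would then classify each spectrogram window as noise or transient; here the burst shows up as concentrated power in the bins near 0.2 cycles/sample in the later frames.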

Camera traps are an essential tool for wildlife research. Zamba is an open source Python package that leverages machine learning and computer vision to automate time-intensive processing tasks for wildlife camera trap data. This talk will dive into Zamba's capabilities and key factors that influenced its design and development. Topics will include the importance of code-free custom model training, Zamba’s origins in an open machine learning competition, and the technical challenges of processing video data. Attendees will walk away with a better understanding of how machine learning and Python tools can support conservation efforts.

In today’s world of ever-growing data and AI, learning about GPUs has become an essential part of software carpentry, professional development and the education curriculum. However, teaching with GPUs can be challenging, from resource accessibility to managing dependencies and varying knowledge levels.

During this talk we will address these issues by offering practical strategies to promote active learning with GPUs and sharing our experiences from running numerous Python conference tutorials that leveraged GPUs. Attendees will learn about different options for providing GPU access, tailoring content for different expertise levels, and simplifying package management where possible.

If you are an educator, researcher, and/or developer who is interested in teaching or learning about GPU computing with Python, this talk will give you the confidence to teach topics that require GPU acceleration and quickly get your audience up and running.

Women remain critically underrepresented in data science and Python communities, comprising only 15–22% of professionals globally and less than 3% of contributors to Python open-source projects. This disparity not only limits diversity but also represents a missed opportunity for innovation and community growth. This talk explores actionable strategies to address these gaps, drawing from my leadership in Women in AI at IBM, TechWomen mentorship, and initiatives with NumFOCUS. Attendees will gain insights and practical steps to create inclusive environments, foster diverse collaboration, and ensure the scientific Python community thrives by unlocking its full potential.

Generative AI has rapidly changed the landscape of computing and data education. Many learners are using generative AI to assist their learning, so what should educators do to address the opportunities, risks, and potential of its use? The goal of this open discussion session is to bring together community members to unravel these pressing questions and improve learning outcomes across a variety of contexts: students learning in a classroom setting, ed-tech or generative AI designers developing new user experiences that aim to improve human capacities, and scientists interested in best practices for communicating results to stakeholders or creating learning materials for colleagues. The open discussion will include ample opportunity for community members to network with each other and build connections after the conference.

Collaborating on code and software is essential to open science—but it’s not always easy. Join this BoF for an interactive discussion on the real-world challenges of open source collaboration. We’ll explore common hurdles like Python packaging, contributing to existing codebases, and emerging issues around LLM-assisted development and AI-generated software contributions.

We’ll kick off with a brief overview of pyOpenSci—an inclusive community of Pythonistas, from novices to experts—working to make it easier to create, find, share, and contribute to reusable code. We’ll then facilitate small-group discussions and use an interactive Mentimeter survey to help you share your experiences and ideas.

Your feedback will directly shape pyOpenSci’s priorities for the coming year, as we build new programs and resources to support your work in the Python scientific ecosystem. Whether you’re just starting out or a seasoned developer, you’ll leave with clear ways to get involved and make an impact on the broader Python ecosystem in service of advancing scientific discovery.

NVIDIA’s CUDA platform has long been the backbone of high-performance GPU computing, but its power has historically been gated behind C and C++ expertise. With the recent introduction of native Python support, CUDA is more accessible to the programming language you know and love, ushering in a new era for scientific computing, data science, and AI development.

Synthetic aviation fuels (SAFs) offer a pathway to improving efficiency, but high cost and volume requirements hinder property testing and increase risk of developing low-performing fuels. To promote productive SAF research, we used Fourier Transform Infrared (FTIR) spectra to train accurate, interpretable fuel property models. In this presentation, we will discuss how we leveraged standard Python libraries – NumPy, pandas, and scikit-learn – and Non-negative Matrix Factorization to decompose FTIR spectra and develop predictive models. Specifically, we will review the pipeline developed for preprocessing FTIR data, the ensemble models used for property prediction, and how the features correlate with physicochemical properties.
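A hedged sketch of the described decomposition, using synthetic Gaussian-peak spectra in place of real FTIR data: NMF factors a matrix of non-negative spectra into component "basis spectra" and per-sample weights, and those weights are the kind of interpretable features that can feed a downstream property-prediction model.

```python
# Sketch: NMF decomposition of synthetic FTIR-like spectra.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(42)
# Two synthetic "pure component" spectra over 200 wavenumber bins.
basis = np.vstack([np.exp(-((np.arange(200) - c) / 15.0) ** 2)
                   for c in (60, 140)])
weights = rng.uniform(0.1, 1.0, size=(50, 2))           # 50 fuel samples
spectra = weights @ basis + 0.01 * rng.uniform(size=(50, 200))  # + noise

model = NMF(n_components=2, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(spectra)   # per-sample component weights
H = model.components_              # recovered basis spectra
```

In the pipeline described above, columns of `W` (component concentrations) would become inputs to an ensemble regressor for fuel properties, and the rows of `H` can be inspected against known absorption bands for interpretability.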