AI/ML workloads depend heavily on complex software stacks, including numerical computing libraries (SciPy, NumPy), deep learning frameworks (PyTorch, TensorFlow), and specialized toolchains (CUDA, cuDNN). However, integrating these dependencies into Bazel-based workflows remains challenging due to compatibility issues, dependency resolution, and performance optimization. This session explores the process of creating and maintaining Bazel packages for key AI/ML libraries, ensuring reproducibility, performance, and ease of use for researchers and engineers.
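As a taste of what such packaging looks like, here is a minimal sketch of pulling NumPy and SciPy into a Bazel build via rules_python's `pip` module extension. This is a hedged illustration, not the session's actual setup: the rules_python version, hub name, and lock-file path are placeholders.

```starlark
# MODULE.bazel -- minimal sketch of making NumPy/SciPy available to a
# Bazel build via rules_python's pip extension (version numbers and
# names are illustrative placeholders).
bazel_dep(name = "rules_python", version = "0.34.0")

pip = use_extension("@rules_python//python/extensions:pip.bzl", "pip")
pip.parse(
    hub_name = "pypi",
    python_version = "3.11",
    # A fully pinned lock file keeps the resolved dependency graph
    # reproducible across machines and CI.
    requirements_lock = "//:requirements_lock.txt",
)
use_repo(pip, "pypi")
```

A `py_library` target could then depend on `"@pypi//numpy"` and `"@pypi//scipy"`; pinning everything through the lock file is what gives Bazel the reproducibility the abstract emphasizes.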
We all love to tell stories with data and we all love to listen to them. Wouldn't it be great if we could also draw actionable insights from these nice stories?
As scikit-learn maintainers, we would love to use PyPI download stats and other proxy metrics (website analytics, GitHub repository statistics, etc.) to help inform decisions such as:
- How do we increase user awareness of best practices (please use Pipeline and cross-validation)?
- How do we advertise our recent improvements (use HistGradientBoosting rather than GradientBoosting; TunedThresholdClassifier; PCA and a few other models can run on GPU)?
- Do users care more about new features from recent releases or consolidation of what already exists?
- How long should we support older versions of Python, NumPy, or SciPy?
In this talk we will highlight a number of lessons learned while trying to understand the complex reality behind these seemingly simple metrics.
Telling nice stories is not always hard; grasping the reality behind these metrics often is.
The array API standard is unifying the ecosystem of Python array computing, facilitating greater interoperability between code written for different array libraries, including NumPy, CuPy, PyTorch, JAX, and Dask.
But what are all of these "array-api-" libraries for? How can you use these libraries to 'future-proof' your libraries, and provide support for GPU and distributed arrays to your users? Find out in this talk, where I'll guide you through every corner of the array API standard ecosystem, explaining how SciPy and scikit-learn are using all of these tools to adopt the standard. I'll also be sharing progress updates from the past year, to give you a clear picture of where we are now, and what the future holds.
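The pattern the standard enables can be sketched in a few lines of library-agnostic code. This is an illustration only: real consumer libraries such as SciPy and scikit-learn typically use the array-api-compat package for this dispatch, and the NumPy fallback here is an assumption for the sketch.

```python
import numpy as np

def get_namespace(x):
    """Return the array API namespace for x, falling back to NumPy.

    Production code would normally use array-api-compat instead of this
    hand-rolled fallback; it is simplified here for illustration.
    """
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np

def standardize(x):
    # Works unchanged for any array type exposing the array API
    # (NumPy, CuPy, PyTorch, ...): all operations go through the
    # namespace object rather than a hard-coded library.
    xp = get_namespace(x)
    return (x - xp.mean(x)) / xp.std(x)

z = standardize(np.array([1.0, 2.0, 3.0, 4.0]))
```

The key point is that `standardize` never imports a specific array library at call time; passing in a CuPy or PyTorch array would route the same code through that library's namespace.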
Keynote talk by Ralf Gommers, SciPy and NumPy maintainer, director at Quansight Labs.
Come join the BoF, hosted by the sprint chairs, for a practice run at contributing to a GitHub project. We will walk through how to open a Pull Request for a bugfix, using the workflow that most libraries participating in the weekend sprints follow.
If you have interest in NumPy, SciPy, Signal Processing, Simulation, DataFrames, Linear Programming (LP), Vehicle Routing Problems (VRP), or Graph Analysis, we'd love to hear what performance you're seeing and how you're measuring.
Come share your ideas for next year's SciPy. Participants will have an opportunity to sign up to be on next year's organizing committee.
Lightning talks are 5-minute talks on any topic of interest for the SciPy community. We encourage spontaneous and prepared talks from everyone, but we can’t guarantee spots. Sign ups are at the NumFOCUS booth during the conference.
The SciPy library provides objects representing well over 100 univariate probability distributions. These have served the scientific Python ecosystem for decades, but they are built upon an infrastructure that has not kept up with the demands of today’s users. To address its shortcomings, SciPy 1.15 includes a new infrastructure for working with probability distributions. This talk will introduce users to the new infrastructure and demonstrate its many advantages in terms of usability, flexibility, accuracy, and performance.
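For context, here is a minimal example using the long-standing frozen-distribution API that the new infrastructure is designed to replace. The new-style call is shown only in a comment because it assumes SciPy >= 1.15, and its exact parameter names are an assumption here rather than something taken from this abstract.

```python
from scipy import stats

# Classic infrastructure: a "frozen" normal distribution object built
# from the loc/scale parameterization shared by all distributions.
rv = stats.norm(loc=0.0, scale=1.0)
p = rv.cdf(0.0)  # P(X <= 0) for a standard normal

# New infrastructure (SciPy >= 1.15), sketched for comparison only;
# names are assumptions and this is not executed here:
#   from scipy.stats import Normal
#   X = Normal(mu=0.0, sigma=1.0)
#   X.cdf(0.0)
```

The contrast the talk draws is between this generic loc/scale machinery and the new per-distribution objects with natural parameter names and improved accuracy and performance.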
As scientific computing increasingly relies on diverse hardware (CPUs, GPUs, etc) and data structures, libraries face pressure to support multiple backends while maintaining a consistent API. This talk presents practical considerations for adding dispatching to existing libraries, enabling seamless integration with external backends. Using NetworkX and scikit-image as case studies, we demonstrate how they evolved to become a common API with multiple implementations, handle backend-specific behaviors, and ensure robustness through testing and documentation. We also discuss technical challenges, differences in approaches, community adoption strategies, and the broader implications for the SciPy ecosystem.
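A stripped-down sketch of the dispatch pattern described above follows. The registry, function, and backend names are invented for illustration; NetworkX's real mechanism is entry-point based and considerably more complete.

```python
# Minimal backend dispatch: a registry maps backend names to
# alternative implementations of the same public function.
_BACKENDS = {}

def register_backend(name, impl):
    """Register an alternative implementation under a backend name."""
    _BACKENDS[name] = impl

def path_length(graph, backend=None):
    """Public API: one signature, potentially many implementations."""
    if backend is not None:
        # Delegate to the requested backend's implementation.
        return _BACKENDS[backend](graph)
    return _reference_impl(graph)

def _reference_impl(graph):
    # Toy reference implementation: edge count of a path graph given
    # as a list of nodes.
    return len(graph) - 1

# A pretend accelerated backend that computes the same answer.
register_backend("fake_gpu", lambda g: len(g) - 1)
```

The robustness concern mentioned in the abstract shows up even in this toy: the test suite must run the same assertions against every registered backend to guarantee behavioral equivalence.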
The SciPy Proceedings (https://proceedings.scipy.org) have long served as a cornerstone for publishing research in the scientific Python community, with over 330 peer-reviewed articles published over the last 17 years. In 2024, the SciPy Proceedings underwent a significant transformation, adopting MyST Markdown (https://mystmd.org) and Curvenote (https://curvenote.com) to enhance accessibility, interactivity, and reproducibility, including publication of Jupyter Notebooks. The new proceedings articles are web-first, providing features such as deep-dive links for cross-references, previews of GitHub content, interactive 3D visualizations, and rich rendering of Jupyter Notebooks. In this talk, we will (1) present the new authoring and reading capabilities introduced in 2024; (2) highlight connections to prominent open-science initiatives and their impact on advancing computational research publishing; and (3) demonstrate the underlying technologies, how they enhance integrations with SciPy packages, and how to use these tools in your own communication workflows.
Our presentation will give an overview of the revised authoring process for the SciPy Proceedings: how we improve metadata standards in a way similar to code linting and continuous integration, and the integration of live previews of the articles, including auto-generated PDFs and JATS XML (a standard used in scientific publishing). The peer-review process for the proceedings currently happens using GitHub's peer-review commenting, in a similar fashion to the Journal of Open Source Software; we will demonstrate this process and showcase opportunities for working with distributed review services such as PREreview (https://prereview.org). The open publishing pipeline has streamlined the submission, review, and revision processes while maintaining high scientific quality and improving the completeness of scholarly metadata.

Finally, we will present how this work connects to other high-profile scientific publishing initiatives that have incorporated Jupyter Notebooks, live computational figures, and interactive displays of large-scale data. These include Notebooks Now! by the American Geophysical Union, which focuses on ensuring that Jupyter Notebooks can be properly integrated into the scholarly record, and the Microscopy Society of America's work on interactive publishing of large-scale microscopy data with interactive visualizations. These initiatives and the SciPy Proceedings are enabled by recent improvements in open-source tools including MyST Markdown, JupyterLab, BinderHub, and Curvenote, which enable new ways to share executable research content. Collectively, they aim to improve the reproducibility, interactivity, and accessibility of research by providing better connections between data, software, and narrative research articles.
By embracing open science principles and modern technologies, the SciPy Proceedings exemplify how computational research can be more transparent, reproducible, and accessible. The shift to computational publishing, especially in the context of the scientific python community, opens new opportunities for researchers to publish not only their final results but also the computational workflows, datasets, and interactive visualizations that underpin them. This transformation aligns with broader efforts in open science infrastructure, such as integrating persistent identifiers (DOIs, ORCID, ROR), and adopting FAIR (Findable, Accessible, Interoperable, Reusable) principles for computational content. Building on these foundations, as well as open tools like MyST Markdown and Curvenote, provides a scalable model for open scientific publishing that bridges the gap between computational research and scholarly communication, fostering a more collaborative, iterative, and continuous approach to scientific knowledge dissemination.
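To make the authoring workflow concrete, here is an illustrative MyST project configuration of the kind the new proceedings pipeline builds on. The title, author, repository URL, and template name are placeholders, not values from any actual submission.

```yaml
# myst.yml -- illustrative project configuration for a MyST Markdown
# article (keys follow mystmd's config schema; values are placeholders).
version: 1
project:
  title: My SciPy Proceedings Paper
  authors:
    - name: Ada Lovelace
      orcid: 0000-0000-0000-0000   # persistent author identifier
  github: https://github.com/example/paper
  exports:
    - format: pdf   # auto-generated PDF alongside the web-first article
```

Running `myst build` against such a configuration produces the web article and export formats; the persistent-identifier fields (ORCID, and DOIs at publication time) are what connect the article into the open-science infrastructure described above.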
This track highlights the fantastic scientific applications that the SciPy community creates with the tools we collectively make. Talk proposals for this track should be stories of how, using the Scientific Python ecosystem, the speakers were able to overcome challenges, create new collaborations, reduce the time to scientific insight, and share their results in ways not previously possible. Proposals should focus on novel applications and problems and be of broad interest to the conference, but should not shy away from explaining the scientific nuances that make the story exciting.
This keynote will trace the personal journey of NumPy's development and the evolution of the SciPy community from 2001 to the present. Drawing on over two decades of involvement, I’ll reflect on how a small group of enthusiastic contributors grew into a vibrant, global ecosystem that now forms the foundation of scientific computing in Python. Through stories, milestones, and community moments, we’ll explore the challenges, breakthroughs, and collaborative spirit that shaped both NumPy and the SciPy conventions over the years.
Join us for an evening of connection and creativity at the Museum of Glass, just a short walk from the conference venue. Explore stunning glass art exhibits, enjoy refreshments, and connect with fellow SciPy attendees in a unique and inspiring setting.
The virtual poster session takes place 6:00–7:00 p.m. in the SciPy 2025 Gather space. Meet with the poster authors to ask questions and learn about the posters that will be on display in the poster hall inside our virtual space. Virtual and in-person ticket holders are all welcome to attend in Gather!
The poster session will be held in the Ballroom from 6:00 to 7:00 p.m. Meet with the poster authors to ask questions and learn about the posters that will be on display throughout the main conference.