talk-data.com

Event

SciPy 2025

2025-07-07 – 2025-07-13 PyData

Activities tracked

9

Filtering by: API

Sessions & talks

Showing 1–9 of 9 · Newest first

Lessons Learned from Adding Backend Dispatching to NetworkX and scikit-image

2025-07-11
talk

As scientific computing increasingly relies on diverse hardware (CPUs, GPUs, etc.) and data structures, libraries face pressure to support multiple backends while maintaining a consistent API. This talk presents practical considerations for adding dispatching to existing libraries, enabling seamless integration with external backends. Using NetworkX and scikit-image as case studies, we demonstrate how they evolved to expose a common API with multiple implementations, how they handle backend-specific behaviors, and how they ensure robustness through testing and documentation. We also discuss technical challenges, differences in approaches, community adoption strategies, and the broader implications for the SciPy ecosystem.
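The idea of one API with several implementations can be illustrated with a minimal sketch. Everything below is hypothetical and uses only the standard library: the registry `_BACKENDS`, the decorator `dispatchable`, the helper `register_backend`, and the toy function `degree_sum` are illustrative names, not the dispatching machinery of NetworkX or scikit-image.

```python
from functools import wraps

# Hypothetical registry: backend name -> {function name -> implementation}.
_BACKENDS = {}


def register_backend(name, implementations):
    """Register alternative implementations under a backend name."""
    _BACKENDS[name] = dict(implementations)


def dispatchable(func):
    """Route a call to the requested backend, falling back to `func`."""
    @wraps(func)
    def wrapper(*args, backend=None, **kwargs):
        if backend is not None:
            if backend not in _BACKENDS:
                raise ValueError(f"unknown backend: {backend!r}")
            impl = _BACKENDS[backend].get(func.__name__)
            if impl is not None:
                return impl(*args, **kwargs)
        # No backend requested, or the backend does not implement this function.
        return func(*args, **kwargs)
    return wrapper


@dispatchable
def degree_sum(edges):
    """Default implementation: every edge adds two to the total degree."""
    return 2 * len(edges)


def fast_degree_sum(edges):
    """Stand-in for an accelerated implementation with the same contract."""
    return sum(2 for _ in edges)


register_backend("fast", {"degree_sum": fast_degree_sum})

edges = [(0, 1), (1, 2), (2, 0)]
default_result = degree_sum(edges)
backend_result = degree_sum(edges, backend="fast")
```

In this sketch the caller keeps one function name and opts into a backend with a keyword argument, while a backend that lacks a function silently falls back to the default. Whether to fall back or raise is one of the design choices such a system has to make.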

Numba v2: Towards a SuperOptimizing Python Compiler

2025-07-10
talk

The rapidly evolving Python ecosystem presents increasing challenges for adapting code using traditional methods. Developers frequently need to rewrite applications to leverage new libraries, hardware architectures, and optimization techniques. To address this challenge, the Numba team is developing a superoptimizing compiler built on equality saturation-based term rewriting. This innovative approach enables domain experts to express and share optimizations without requiring extensive compiler expertise. This talk explores how Numba v2 enables sophisticated optimizations—from floating-point approximation and automatic GPU acceleration to energy-efficient multiplication for deep learning models—all through the familiar NumPy API. Join us to discover how Numba v2 is bringing superoptimization capabilities to the Python ecosystem.

GPUs & ML – Beyond Deep Learning

2025-07-10
talk

This talk explores various methods to accelerate traditional machine learning pipelines using scikit-learn, UMAP, and HDBSCAN on GPUs. We will contrast the experimental Array API Standard support layer in scikit-learn with the cuML library from the NVIDIA RAPIDS Data Science stack, including its zero-code-change acceleration capability. ML and data science practitioners will learn how to seamlessly accelerate machine learning workflows, see the performance benefits, and receive practical guidance for different problem types and sizes. The talk also covers how to minimize cost and runtime by effectively mixing hardware for various tasks, along with the current implementation status and future plans for these acceleration methods.

User guides: engaging new users, delighting old ones

2025-07-09
talk

User guides are the piece you often hit right after clicking the "Learn" or "Get Started" button in a package's documentation. They're responsible for onboarding new users and for providing a learning path through a package. Surprisingly, while pieces of documentation like the API Reference tend to look the same, the design of user guides tends to differ across packages.

In this talk, I'll discuss how to design an effective user guide for open source software. I'll explain how the guides for Polars, DuckDB, and FastAPI balance working end-to-end like a course with being browsable like a reference.

Python is all you need: an overview of the composable, Python-native data stack

2025-07-09
talk

For the past decade, SQL has reigned supreme in the data transformation world, and tools like dbt have formed a cornerstone of the modern data stack. Until recently, Python-first alternatives couldn't compete with the scale and performance of modern SQL. Now Ibis can provide the same benefits of SQL execution with a flexible Python dataframe API.

In this talk, you will learn how Ibis supercharges existing open-source libraries like Kedro and Pandera and how you can combine these technologies (and a few more) to build and orchestrate scalable data engineering pipelines without sacrificing the comfort (and other advantages) of Python.

Burning fuel for cheap! Transport-independent depletion in OpenMC

2025-07-09
talk

OpenMC is an open source, community-developed Monte Carlo tool for neutron transport simulations, featuring a depletion module for fuel burnup calculations in nuclear reactors and a Python API. Depletion calculations can be expensive, as they require solving the neutron transport and Bateman equations at each timestep to update the neutron flux and material composition, respectively. Material properties such as temperature and density govern material cross sections, which in turn govern reaction rates. The reaction rates can affect the neutron population. In a scenario where there is no significant change in the material properties or composition, the transport simulation may only need to be run once; the same cross sections are used for the entire depletion calculation. We recently extended the depletion module in OpenMC to enable transport-independent depletion using multigroup cross sections and fluxes. This talk will focus on the technical details of this feature, its validation, and briefly touch on areas where the feature has been used. Two recent use cases will be highlighted. The first use case calculates shutdown dose rates for fusion power applications, and the second performs depletion for fission reactor fuel cycle modeling.
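As a rough illustration of why reusing fixed cross sections is cheap, the sketch below depletes a two-nuclide chain with a constant one-group reaction rate (microscopic cross section times flux) and no transport solve between timesteps. It is a simplified assumption-laden example, not OpenMC's API or its multigroup method: the function `deplete`, the chain (parent transmutes to a daughter that decays), and all numerical values are hypothetical.

```python
import math

BARN_TO_CM2 = 1e-24  # 1 barn expressed in cm^2


def deplete(n_parent, n_daughter, reaction_rate, decay_const, dt):
    """Advance a parent -> daughter -> (decay) chain by dt seconds.

    Analytic solution of the two Bateman equations
        dNp/dt = -r * Np
        dNd/dt =  r * Np - lam * Nd
    with r (reaction_rate) and lam (decay_const) held constant over the step.
    """
    r, lam = reaction_rate, decay_const
    parent_left = n_parent * math.exp(-r * dt)
    if math.isclose(r, lam):
        # Degenerate case where production and decay constants coincide.
        produced = n_parent * r * dt * math.exp(-r * dt)
    else:
        produced = n_parent * r / (lam - r) * (math.exp(-r * dt) - math.exp(-lam * dt))
    daughter_left = n_daughter * math.exp(-lam * dt) + produced
    return parent_left, daughter_left


# Illustrative inputs (assumptions, not taken from the talk).
sigma_barns = 10.0   # one-group microscopic cross section
flux = 1e14          # neutrons / cm^2 / s, computed once and then reused
rate = sigma_barns * BARN_TO_CM2 * flux  # reactions per parent atom per second
lam = 1e-6           # daughter decay constant, 1/s
total_time = 30 * 24 * 3600.0  # 30 days in seconds

# One long step versus two half steps that reuse the same reaction rate.
one_step = deplete(1e24, 0.0, rate, lam, total_time)
half_step = deplete(1e24, 0.0, rate, lam, total_time / 2)
two_steps = deplete(half_step[0], half_step[1], rate, lam, total_time / 2)
```

Because the reaction rate never changes, each timestep only evaluates the decay and transmutation equations; in a coupled calculation the flux, and therefore `rate`, would be recomputed by a transport simulation before every step.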

Network Analysis Made Simple

2025-07-08
talk

Through the use of NetworkX's API, tutorial participants will learn about the basics of graph theory and its use in applied network science. Starting with a computationally oriented definition of a graph and its associated methods, we will progress through the following concepts: path and structure finding, visualization, and graph storage on disk. We will also offer tutorial participants an overview of one advanced topic, such as the use of graphs alongside LLMs for knowledge retrieval, scalable alternatives to NetworkX including cuGraph, or the use of linear algebraic translation of graph problems to speed up computations.
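A computationally oriented definition of a graph can be as small as a mapping from each node to its neighbors. The sketch below uses only the standard library rather than NetworkX, so `build_graph` and `shortest_path` are hypothetical helper names standing in for what a graph library provides; it shows path finding with breadth-first search on an undirected, unweighted graph.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected graph as a dict mapping node -> set of neighbors."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def shortest_path(adjacency, source, target):
    """Return a fewest-edges path from source to target, or None if none exists."""
    if source not in adjacency or target not in adjacency:
        return None
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk parent links back to the source, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


graph = build_graph([("a", "b"), ("b", "c"), ("c", "d"), ("a", "e"), ("e", "d")])
path = shortest_path(graph, "a", "d")
```

Here `path` is the two-edge route through `"e"` rather than the three-edge route through `"b"` and `"c"`, since breadth-first search explores nodes in order of distance from the source.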

3D Visualization with PyVista

2025-07-07
talk

PyVista is a general-purpose 3D visualization library used by over 2,000 open source projects for the visualization of everything from computer-aided engineering and geophysics to volcanoes and digital artwork.

PyVista exposes a Pythonic API to the Visualization Toolkit (VTK), providing tooling that is immediately usable without any prior knowledge of VTK. It is being built as the 3D equivalent of Matplotlib, with Jupyter plugins that enable visualization of 3D data using both server- and client-side rendering.

Vega-Altair: A Structured Way to Build Interactive Charts

2025-07-07
talk

This tutorial is an introduction to data visualization using the popular Vega-Altair Python library. Vega-Altair provides a simple and expressive API, enabling authors to rapidly create a wide range of interactive charts.

Participants will explore the fundamentals of effective chart design and gain hands-on experience building a variety of visualizations using Vega-Altair's declarative API. Furthermore, this tutorial will introduce users to advanced topics such as data transformations and interaction design. We will finish off by covering practical workflows such as integrating Vega-Altair into dashboarding systems, publishing visualizations, and creating reusable, themed charting libraries. By the end of the session, attendees will have the skills to leverage Vega-Altair for both rapid prototyping and production-ready visualizations in diverse environments.