Today’s quantum computers are far noisier than their classical counterparts. Quantum noise is also more complex than traditional computing errors, arising from decoherence, crosstalk, and gate imperfections that corrupt quantum states. Error mitigation has become a rapidly evolving field, offering ways to address these errors on existing devices. New techniques emerge regularly, requiring flexible tools for implementation and testing. This talk explores the challenges of mitigating noise and how researchers and engineers use Python to iterate quickly while maintaining reliable and reproducible workflows.
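As a flavor of the kind of quick iteration the abstract describes, here is a minimal sketch of one common mitigation technique, zero-noise extrapolation, written with plain NumPy. The `noisy_expectation` function is a hypothetical placeholder for a call into whatever quantum SDK is in use, and the numbers inside it are made up for illustration.

```python
import numpy as np

def noisy_expectation(scale: float) -> float:
    """Hypothetical placeholder: run the circuit with noise amplified by
    `scale` (e.g. via gate folding) and return the measured expectation value.
    In practice this would call into a quantum SDK."""
    true_value, noise_slope = 1.0, -0.15  # made-up numbers for the sketch
    return true_value + noise_slope * scale + np.random.normal(0, 0.01)

# Measure at several noise-scale factors and extrapolate back to zero noise.
scales = np.array([1.0, 2.0, 3.0])
values = np.array([noisy_expectation(s) for s in scales])
fit = np.polyfit(scales, values, deg=1)        # linear fit: value(scale)
zero_noise_estimate = np.polyval(fit, 0.0)     # extrapolate to scale = 0
print(f"mitigated estimate: {zero_noise_estimate:.3f}")
```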
In today’s world of ever-growing data and AI, learning about GPUs has become an essential part of software carpentry, professional development, and education curricula. However, teaching with GPUs can be challenging, from resource accessibility to managing dependencies and varying knowledge levels.
During this talk we will address these issues by offering practical strategies to promote active learning with GPUs and by sharing our experiences from running numerous Python conference tutorials that leveraged GPUs. Attendees will learn about different options for providing GPU access, tailoring content to different expertise levels, and simplifying package management where possible.
If you are an educator, researcher, and/or developer who is interested in teaching or learning about GPU computing with Python, this talk will give you the confidence to teach topics that require GPU acceleration and quickly get your audience up and running.
mybinder.org has served millions of scientific Python users for 8 years now! It is an experiment in running open source infrastructure as a public good. The sustainability challenges faced by open source software production are magnified here: we need people’s time to manage the infrastructure, money to pay for the computational infrastructure required to run the service, reliable operations that respond to outages in a timely fashion, and defenses against abuse from malicious actors. This talk covers the lessons learnt over the years, and the new community-oriented experiments to improve sustainability, functionality, and reliability that we are trying out now.
The rapidly evolving Python ecosystem presents increasing challenges for adapting code using traditional methods. Developers frequently need to rewrite applications to leverage new libraries, hardware architectures, and optimization techniques. To address this challenge, the Numba team is developing a superoptimizing compiler built on equality saturation-based term rewriting. This innovative approach enables domain experts to express and share optimizations without requiring extensive compiler expertise. This talk explores how Numba v2 enables sophisticated optimizations—from floating-point approximation and automatic GPU acceleration to energy-efficient multiplication for deep learning models—all through the familiar NumPy API. Join us to discover how Numba v2 is bringing superoptimization capabilities to the Python ecosystem.
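The Numba v2 superoptimizer described above is still in development, so as a point of reference the sketch below shows the kind of NumPy-API code that today’s `numba.njit` already compiles to native code, which is the same API surface the new compiler targets. The function and its inputs are illustrative, not taken from the talk.

```python
import numpy as np
from numba import njit

@njit(cache=True)
def rolling_rms(signal, window):
    """Root-mean-square over a sliding window, written against the
    plain NumPy API that Numba compiles to native code."""
    out = np.empty(signal.size - window + 1)
    for i in range(out.size):
        chunk = signal[i : i + window]
        out[i] = np.sqrt(np.mean(chunk * chunk))
    return out

x = np.random.default_rng(0).normal(size=10_000)
print(rolling_rms(x, 64)[:3])
```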
This track highlights the fantastic scientific applications that the SciPy community creates with the tools we collectively make. Talk proposals to this track should be stories of how, using the Scientific Python ecosystem, the speakers were able to overcome challenges, create new collaborations, reduce the time to scientific insight, and share their results in ways not previously possible. Proposals should focus on novel applications and problems and be of broad interest to the conference, but should not shy away from explaining the scientific nuances that make the story exciting.
Women remain critically underrepresented in data science and Python communities, comprising only 15–22% of professionals globally and less than 3% of contributors to Python open-source projects. This disparity not only limits diversity but also represents a missed opportunity for innovation and community growth. This talk explores actionable strategies to address these gaps, drawing from my leadership in Women in AI at IBM, TechWomen mentorship, and initiatives with NumFOCUS. Attendees will gain insights and practical steps to create inclusive environments, foster diverse collaboration, and ensure the scientific Python community thrives by unlocking its full potential.
Collaborating on code and software is essential to open science—but it’s not always easy. Join this BoF for an interactive discussion on the real-world challenges of open source collaboration. We’ll explore common hurdles like Python packaging, contributing to existing codebases, and emerging issues around LLM-assisted development and AI-generated software contributions.
We’ll kick off with a brief overview of pyOpenSci—an inclusive community of Pythonistas, from novices to experts—working to make it easier to create, find, share, and contribute to reusable code. We’ll then facilitate small-group discussions and use an interactive Mentimeter survey to help you share your experiences and ideas.
Your feedback will directly shape pyOpenSci’s priorities for the coming year, as we build new programs and resources to support your work in the Python scientific ecosystem. Whether you’re just starting out or a seasoned developer, you’ll leave with clear ways to get involved and make an impact on the broader Python ecosystem in service of advancing scientific discovery.
Conferences serve as a way to connect groups of humans around common topics of interest. In the open source community, they have played a critical role in knowledge sharing, advancing technology, and fostering a sense of community. This is especially true for the global Python community. Times are changing: the political climate, both in the US and abroad, has shifted drastically, making gathering in the real world much more complex. Advances in technology have also changed the calculus on what is considered quality participation. Join us in this BoF to discuss these challenges and how we can continue to come together as a community.
NVIDIA’s CUDA platform has long been the backbone of high-performance GPU computing, but its power has historically been gated behind C and C++ expertise. With the recent introduction of native Python support, CUDA is now accessible from the programming language you know and love, ushering in a new era for scientific computing, data science, and AI development.
Synthetic aviation fuels (SAFs) offer a pathway to improving efficiency, but high cost and volume requirements hinder property testing and increase the risk of developing low-performing fuels. To promote productive SAF research, we used Fourier Transform Infrared (FTIR) spectra to train accurate, interpretable fuel property models. In this presentation, we will discuss how we leveraged standard Python libraries – NumPy, pandas, and scikit-learn – and Non-negative Matrix Factorization to decompose FTIR spectra and develop predictive models. Specifically, we will review the pipeline developed for preprocessing FTIR data, the ensemble models used for property prediction, and how the features correlate with physicochemical properties.
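A minimal sketch of the kind of pipeline described, assuming scikit-learn’s NMF for decomposition and a random-forest ensemble for prediction; the array shapes, random data, and property values below are placeholders, not the authors’ actual FTIR dataset.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
spectra = np.abs(rng.normal(size=(120, 1800)))  # placeholder FTIR absorbances (non-negative)
property_values = rng.normal(size=120)          # placeholder fuel property measurements

# Decompose spectra into a small number of non-negative components.
nmf = NMF(n_components=8, init="nndsvda", max_iter=500, random_state=0)
weights = nmf.fit_transform(spectra)            # (samples, components)

# Use the component weights as interpretable features for an ensemble model.
model = RandomForestRegressor(n_estimators=300, random_state=0)
scores = cross_val_score(model, weights, property_values, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f}")
```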
Designing tomorrow's materials requires understanding how atoms behave – a challenge that's both fascinating and incredibly complex. While machine learning offers exciting speedups in materials simulation, it often falls short, missing vital electronic structure information needed to connect theory with experimental results. This work introduces a powerful solution: Density Functional Tight Binding (DFTB), which, combined with the versatile tools of Scientific Python, allows us to understand the electronic behavior of materials while maintaining computational efficiency. In this talk, I will present our findings demonstrating how DFTB, coupled with readily available Python packages, allows for direct comparison between theoretical predictions and experimental data, such as XPS measurements. I will also showcase our publicly available repository, containing DFTB parameters for a wide range of materials, making this powerful approach accessible to the broader research community.
Rydberg atoms offer unique quantum properties that enable radio-frequency sensing capabilities distinct from any classical analogue; however, large parameter spaces and complex configurations make understanding and designing these quantum experiments challenging. Current solutions are often developed as in-house, closed-source software simulating a narrow range of problems. We present RydIQule, an open-source package leveraging computational Python tools in novel ways to model the behavior of these systems generally. We describe RydIQule’s approach to representing quantum systems using computational graphs and leveraging NumPy broadcasting to define complete experiments. In addition to discussing the computational challenges RydIQule helps overcome, we outline how collaboration between physics and computational research backgrounds has led to this impactful tool.
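RydIQule’s actual API is not shown here; the sketch below only illustrates the broadcasting idea the abstract mentions, evaluating a detuning-dependent quantity over a full grid of experimental parameters without explicit loops. The two-level steady-state formula is a stand-in for the richer graph-based models RydIQule builds.

```python
import numpy as np

# Hypothetical experiment parameters, swept simultaneously via broadcasting.
detunings = np.linspace(-50e6, 50e6, 201)[:, None, None]   # Hz
rabi_freqs = np.linspace(1e6, 20e6, 40)[None, :, None]     # Hz
linewidths = np.array([1e6, 5e6])[None, None, :]           # Hz

# Steady-state excited-state population of a driven two-level system,
# computed for every parameter combination at once.
population = (rabi_freqs**2 / 4) / (
    detunings**2 + linewidths**2 / 4 + rabi_freqs**2 / 2
)
print(population.shape)  # (201, 40, 2): one value per parameter combination
```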
This keynote will trace the personal journey of NumPy's development and the evolution of the SciPy community from 2001 to the present. Drawing on over two decades of involvement, I’ll reflect on how a small group of enthusiastic contributors grew into a vibrant, global ecosystem that now forms the foundation of scientific computing in Python. Through stories, milestones, and community moments, we’ll explore the challenges, breakthroughs, and collaborative spirit that shaped both NumPy and the SciPy conventions over the years.
Computational needs in high energy physics (HEP) applications are increasingly met by using GPUs as hardware accelerators, but achieving the highest throughput requires reading data directly into GPU memory. This has yet to be achieved for HEP’s standard domain-specific “ROOT” file formats. Using KvikIO’s Python bindings to cuFile and nvCOMP, KvikUproot is a prototype package that supports reading ROOT file formats on the GPU. On GPUDirect Storage (GDS) enabled systems, data bypasses the CPU and is loaded directly from storage to the GPU. We will discuss the methodology we developed to read ROOT files into GPUs via RDMA.
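A minimal sketch of reading a raw binary file straight into GPU memory with KvikIO; the ROOT-specific decoding that KvikUproot adds is not shown, and the file name and element count are placeholders.

```python
import cupy
import kvikio

# Allocate the destination buffer on the GPU, then read directly into it.
# On a GDS-enabled system the transfer bypasses the CPU; elsewhere KvikIO
# falls back to a bounce-buffer path transparently.
n_values = 1_000_000
gpu_buffer = cupy.empty(n_values, dtype=cupy.float32)

f = kvikio.CuFile("events.bin", "r")  # placeholder file name
f.read(gpu_buffer)
f.close()

print(gpu_buffer[:5])
```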
Block-based programming divides inputs into local arrays that are processed concurrently by groups of threads. Users write sequential array-centric code, and the framework handles parallelization, synchronization, and data movement behind the scenes. This approach aligns well with SciPy's array-centric ethos and has roots in older HPC libraries, such as NWChem’s TCE, BLIS, and ATLAS.
In recent years, many block-based Python programming models for GPUs have emerged, like Triton, JAX/Pallas, and Warp, aiming to make parallelism more accessible for scientists and increase portability.
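cuTile and Tile IR are too new for a stable public example, so the sketch below uses Triton, one of the existing block-based models named above, to show what this style looks like in practice: each program instance works on one block of the arrays while the framework handles the parallel launch.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one contiguous block of the arrays.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the ragged final block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.rand(10_000, device="cuda")
y = torch.rand(10_000, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 1024),)   # one program per 1024-element block
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
```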
In this talk, we'll present cuTile and Tile IR, a new Pythonic tile-based programming model and compiler recently announced by NVIDIA. We'll explore cuTile examples from a variety of domains, including a new LLAMA3-based reference app and a port of miniWeather. You'll learn the best practices for writing and debugging block-based Python GPU code, gain insight into how such code performs, and learn how it differs from traditional SIMT programming.
By the end of the session, you'll understand how block-based GPU programming enables more intuitive, portable, and efficient development of high-performance, data-parallel Python applications for HPC, data science, and machine learning.
Tracking and Object-Based Analysis of Clouds (tobac) is a Python package that enables researchers to identify, track, and perform object-based analyses of phenomena in large atmospheric datasets. Over the past four years, tobac’s userbase has grown within atmospheric science, and the package has transitioned from its original life as a small, focused package with few maintainers to a larger package with more robust governance and structure. In this presentation, we will discuss the challenges and lessons learned during the transition to robust governance structures and the future of tobac as we incorporate new techniques for using multiple variables and scales to track the same system.
Large language models (LLMs) enable powerful data-driven applications, but many projects get stuck in “proof-of-concept purgatory”—where flashy demos fail to translate into reliable, production-ready software. This talk introduces the LLM software development lifecycle (SDLC)—a structured approach to moving beyond early-stage prototypes. Using first principles from software engineering, observability, and iterative evaluation, we’ll cover common pitfalls, techniques for structured output extraction, and methods for improving reliability in real-world data applications. Attendees will leave with concrete strategies for integrating AI into scientific Python workflows—ensuring LLMs generate value beyond the prototype stage.
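One concrete technique the abstract alludes to, structured output extraction, can be sketched with Pydantic: validate whatever JSON the model returns against a schema and repair or re-prompt on failure. The `call_llm` function and the `PaperSummary` schema below are hypothetical placeholders, not the speaker’s actual pipeline.

```python
from pydantic import BaseModel, ValidationError

class PaperSummary(BaseModel):
    """Schema we expect the LLM to fill in."""
    title: str
    methods: list[str]
    sample_size: int

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for a real LLM API call that has been
    instructed to respond with JSON matching PaperSummary."""
    return '{"title": "Example", "methods": ["NMF"], "sample_size": 120}'

raw = call_llm("Summarize the attached paper as JSON.")
try:
    summary = PaperSummary.model_validate_json(raw)  # parse + validate in one step
except ValidationError as err:
    # In production: log the failure and repair or re-prompt rather than crash.
    print("LLM output did not match schema:", err)
else:
    print(summary.methods)
```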
One of the most important aspects of developing scientific software is distributing it to others. The Scientific Python Development Guide was developed to provide up-to-date best practices for packaging, linting, and testing, along with a versatile template supporting multiple backends and a WebAssembly-powered repo-review tool that checks a repository directly in the guide. This talk, with the guide for reference, will cover key best practices for project setup, backend selection, packaging metadata, GitHub Actions for testing and deployment, and tools for validating code quality. We will even cover tools for packaging compiled components that are simple enough for anyone to use.
For the past decade, SQL has reigned as king of the data transformation world, and tools like dbt have formed a cornerstone of the modern data stack. Until recently, Python-first alternatives couldn't compete with the scale and performance of modern SQL. Now Ibis can provide the same benefits of SQL execution with a flexible Python dataframe API.
In this talk, you will learn how Ibis supercharges existing open-source libraries like Kedro and Pandera and how you can combine these technologies (and a few more) to build and orchestrate scalable data engineering pipelines without sacrificing the comfort (and other advantages) of Python.
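A small sketch of the Ibis dataframe API described above; the file name and column names are placeholders, and the same expression compiles to SQL on whichever backend is configured (DuckDB by default).

```python
import ibis
from ibis import _

# Lazily reference a table; nothing is read until execution.
orders = ibis.read_parquet("orders.parquet")  # placeholder file

# Dataframe-style expression that Ibis compiles to backend SQL.
summary = (
    orders.filter(_.status == "shipped")
    .group_by("region")
    .aggregate(
        n_orders=_.count(),
        total_revenue=_.amount.sum(),
    )
    .order_by(ibis.desc("total_revenue"))
)

print(summary.execute())  # runs on the default backend and returns a pandas DataFrame
```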
Cubed is a framework for distributed processing of large arrays without a cluster. Designed to respect memory constraints at all times, Cubed can express any NumPy-like array operation as a series of embarrassingly parallel, bounded-memory steps. By using Zarr as persistent storage between steps, Cubed can run in a serverless fashion on both a local machine and on a range of Cloud platforms. After explaining Cubed’s model, we will show how Cubed has been integrated with Xarray and demonstrate its performance on various large array geoscience workloads.
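A minimal sketch of Cubed’s model under assumed defaults: a Spec caps the memory any task may use, array operations build a plan of bounded-memory steps, and compute() executes them with Zarr as the intermediate store. The work directory, memory limit, and array sizes are placeholders.

```python
import cubed
import cubed.array_api as xp

# Cap the memory any single task may use; intermediates are persisted as Zarr.
spec = cubed.Spec(work_dir="cubed-tmp", allowed_mem="500MB")

a = xp.ones((20_000, 20_000), chunks=(2_000, 2_000), spec=spec)
b = xp.ones((20_000, 20_000), chunks=(2_000, 2_000), spec=spec)

# Each step is an embarrassingly parallel, bounded-memory operation.
c = xp.add(a, b)
total = xp.sum(c)

print(total.compute())  # runs locally by default; executors exist for cloud platforms
```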