GitHub

Episode 242: Thrust & Parallel Algorithms (Part 4)

2025-07-11 · ADSP: Algorithms + Data Structures = Programs

podcast_episode

by Conor Hoekstra, Jared Hoberock (NVIDIA), Bryce Adelstein Lelbach (NVIDIA), Ben Deane

Computer Science

In this episode, Conor and Bryce chat with Jared Hoberock about the NVIDIA Thrust Parallel Algorithms Library and more!

Link to Episode 242 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)

Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon
Bryce Adelstein Lelbach: Twitter

About the Guest
Jared Hoberock joined NVIDIA Research in October 2008. His interests include parallel programming models and physically-based rendering. Jared is the co-creator of Thrust, a high performance parallel algorithms library. While at NVIDIA, Jared has contributed to the DirectX graphics driver, Gelato, a final frame film renderer, and OptiX, a high-performance, programmable ray tracing engine. Jared received a Ph.D in computer science from the University of Illinois at Urbana-Champaign. He is a two-time recipient of the NVIDIA Graduate Research Fellowship.

Show Notes
Date Generated: 2025-05-21
Date Released: 2025-07-11
Thrust
Thrust Docs
CUB Library
CCCL Libraries

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons Attribution 3.0 Unported (CC BY 3.0)
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Reliable executable tutorials -- CI/CD challenges

2025-07-11 · SciPy 2025

talk

by Brigitta Sipőcz

CI/CD Python

This BoF aims to host a discussion about best practices for maintaining executable tutorials that are reproducible and reliable. It is also intended as a platform for collecting CI/CD tips and tricks. The moderators recently put together a repository, https://scientific-python.github.io/executable-tutorials/, that builds on their experience maintaining numerous tutorial repositories. It covers some use cases, but we are well aware that other user scenarios and use cases are still not well covered.

The BoF complements both the Teaching & Learning and Maintainers tracks, since none of the talks in those tracks seem to focus on the technical challenges around tutorials.

cuTile, the New/Old Kid on the Block: Python Programming Models for GPUs

2025-07-09 · SciPy 2025

talk

by Bryce Adelstein Lelbach (NVIDIA)

AI/ML C#/.NET Data Science HTML LLM Python SciPy

Block-based programming divides inputs into local arrays that are processed concurrently by groups of threads. Users write sequential array-centric code, and the framework handles parallelization, synchronization, and data movement behind the scenes. This approach aligns well with SciPy's array-centric ethos and has roots in older HPC libraries, such as NWChem’s TCE, BLIS, and ATLAS.

In recent years, many block-based Python programming models for GPUs have emerged, like Triton, JAX/Pallas, and Warp, aiming to make parallelism more accessible for scientists and increase portability.

In this talk, we'll present cuTile and Tile IR, a new Pythonic tile-based programming model and compiler recently announced by NVIDIA. We'll explore cuTile examples from a variety of domains, including a new LLAMA3-based reference app and a port of miniWeather. You'll learn the best practices for writing and debugging block-based Python GPU code, gain insight into how such code performs, and learn how it differs from traditional SIMT programming.

By the end of the session, you'll understand how block-based GPU programming enables more intuitive, portable, and efficient development of high-performance, data-parallel Python applications for HPC, data science, and machine learning.
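The block-based model described in this abstract can be illustrated with a minimal, CPU-only sketch in plain Python. This is not cuTile, Tile IR, Triton, or any GPU API: the names `tiles`, `saxpy_tile`, `saxpy_tiled`, and the tile size are hypothetical, and a thread pool merely stands in for the framework that would schedule tiles onto groups of GPU threads. The point is only the shape of the code: the per-tile function is sequential and array-style, while partitioning and concurrency live elsewhere.

```python
from concurrent.futures import ThreadPoolExecutor

TILE = 4  # hypothetical tile size, chosen only for illustration


def tiles(n, tile):
    """Yield (start, stop) index pairs that cover range(n) in fixed-size tiles."""
    for start in range(0, n, tile):
        yield start, min(start + tile, n)


def saxpy_tile(a, x, y, start, stop):
    """Sequential, array-style code for one tile: a * x + y."""
    return [a * xi + yi for xi, yi in zip(x[start:stop], y[start:stop])]


def saxpy_tiled(a, x, y, tile=TILE):
    """Stand-in for the framework: partition, run tiles concurrently, stitch."""
    bounds = list(tiles(len(x), tile))
    with ThreadPoolExecutor() as pool:
        # pool.map yields results in input order, so concatenation is correct.
        parts = pool.map(lambda b: saxpy_tile(a, x, y, *b), bounds)
        return [value for part in parts for value in part]


x = list(range(10))
y = [1] * 10
result = saxpy_tiled(2, x, y)
```

In a real block-based GPU model the tile function would operate on device arrays and the runtime would also handle synchronization and data movement; none of that is modeled here.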

Packaging a Scientific Python Project

2025-07-09 · SciPy 2025

talk

by Henry Fredrick Schreiner III

Python

One of the most important aspects of developing scientific software is distributing it to others. The Scientific Python Development Guide was developed to provide up-to-date best practices for packaging, linting, and testing, along with a versatile template supporting multiple backends, and a WebAssembly-powered repo-review tool to check a repository directly in the guide. This talk, with the guide for reference, will cover key best practices for project setup, backend selection, packaging metadata, GitHub Actions for testing and deployment, and tools for validating code quality. We will even cover tools for packaging compiled components that are simple enough for anyone to use.

Create Your First Python Package: Make Your Python Code Easier to Share and Use

2025-07-08 · SciPy 2025

talk

by Leah Wasser, Inessa Pawson, Carol Willing, Tetsuo Koyama, Jeremiah Paige

Python

Python packaging can be overwhelming. However, a trusted, community-vetted workflow can make it easier. In this hands-on workshop, you’ll learn a tested approach developed by the pyOpenSci community and vetted by Python packaging maintainers. You’ll create an installable, maintainable, and citable package using a quickstart template. You’ll also receive step-by-step guidance on publishing to TestPyPI (and resources for conda-forge, and adding a DOI with Zenodo). If you can’t install software on your laptop, you can use GitHub Codespaces to participate in the workshop. Join us to package your Python code confidently and to access ongoing support in our community beyond the workshop.

Processing Cloud-optimized data in Python with Serverless Functions (Lithops, Dataplug)

2025-07-08 · SciPy 2025

talk

by Pedro Garcia Lopez (Universitat Rovira i Virgili), Enrique Molina Giménez

Cloud Computing Cloud Storage Data Management Python

Cloud-optimized (CO) data formats are designed to efficiently store and access data directly from cloud storage without needing to download the entire dataset. These formats enable faster data retrieval, scalability, and cost-effectiveness by allowing users to fetch only the necessary subsets of data. They also allow for efficient parallel data processing using on-the-fly partitioning, which can considerably accelerate data management operations. In this sense, cloud-optimized data is a good fit for data-parallel jobs on serverless platforms. Function-as-a-Service (FaaS) provides a data-driven, scalable, and cost-efficient experience with practically no management burden. Each serverless function reads and processes a small portion of the cloud-optimized dataset in parallel, directly from object storage, which significantly speeds up processing.

In this talk, you will learn how to process cloud-optimized data formats in Python using the Lithops toolkit. Lithops is a serverless data processing toolkit that is specially designed to process data from Cloud Object Storage using serverless functions. We will also demonstrate the Dataplug library, which enables cloud-optimized data management in scientific settings such as genomics, metabolomics, or geospatial data. We will show different data processing pipelines in the cloud that demonstrate the benefits of cloud-optimized data management.
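The partition-then-process pattern this abstract describes can be sketched with the standard library alone. The sketch below does not use Lithops or Dataplug: an in-memory bytes object stands in for an object in cloud storage, `read_range` stands in for a ranged read, and a local thread pool stands in for a fleet of serverless functions. All names (`OBJECT`, `read_range`, `partition`, `count_records`) are hypothetical and exist only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-in for an object in cloud storage (assumption for the sketch).
OBJECT = b"".join(b"record-%d\n" % i for i in range(1000))


def read_range(start, stop):
    """Stand-in for a ranged read: return only bytes [start, stop)."""
    return OBJECT[start:stop]


def partition(size, n_parts):
    """Split [0, size) into at most n_parts contiguous byte ranges."""
    step = -(-size // n_parts)  # ceiling division
    return [(s, min(s + step, size)) for s in range(0, size, step)]


def count_records(byte_range):
    """Work done by one worker: fetch its own range and count newline terminators."""
    start, stop = byte_range
    return read_range(start, stop).count(b"\n")


ranges = partition(len(OBJECT), 8)
with ThreadPoolExecutor(max_workers=8) as pool:
    # Each worker touches only its slice; no worker reads the whole object.
    total = sum(pool.map(count_records, ranges))
```

Counting newline bytes is safe under arbitrary byte boundaries because each terminator falls in exactly one range. Formats with records that must not be split need boundary-aware partitioning, which this sketch does not attempt.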

3D Visualization with PyVista

2025-07-07 · SciPy 2025

talk

by Alexander Kaszynski, Tetsuo Koyama, Bane Sullivan

API HTML Matplotlib

PyVista is a general-purpose 3D visualization library used by 2000+ open source projects for the visualization of everything from computer-aided engineering and geophysics to volcanoes and digital artwork.

PyVista exposes a Pythonic API to the Visualization Toolkit (VTK) to provide tooling that is immediately usable without any prior knowledge of VTK and is being built as the 3D equivalent of Matplotlib, with plugins to Jupyter to enable visualization of 3D data using both server- and client-side rendering.

Episode 241: Parallel Algorithm Talk (Part 3)

2025-07-04 · ADSP: Algorithms + Data Structures = Programs

podcast_episode

by Conor Hoekstra, Jared Hoberock (NVIDIA), Bryce Adelstein Lelbach (NVIDIA), Ben Deane

Computer Science

In this episode, Conor and Bryce chat with Jared Hoberock about the NVIDIA Thrust Parallel Algorithms Library, specifically scan and rotate.

Link to Episode 241 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)

Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon
Bryce Adelstein Lelbach: Twitter

About the Guest
Jared Hoberock joined NVIDIA Research in October 2008. His interests include parallel programming models and physically-based rendering. Jared is the co-creator of Thrust, a high performance parallel algorithms library. While at NVIDIA, Jared has contributed to the DirectX graphics driver, Gelato, a final frame film renderer, and OptiX, a high-performance, programmable ray tracing engine. Jared received a Ph.D in computer science from the University of Illinois at Urbana-Champaign. He is a two-time recipient of the NVIDIA Graduate Research Fellowship.

Show Notes
Date Generated: 2025-05-21
Date Released: 2025-07-04
Thrust
Thrust Docs
NumPy
RAPIDS cuDF
thrust::inclusive_scan
C++98 std::rotate
thrust::permutation_iterator
thrust::gather
thrust::adjacent_difference

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons Attribution 3.0 Unported (CC BY 3.0)
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
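For readers unfamiliar with the algorithms named in these show notes, here is a small sequential Python sketch of their semantics. It is not Thrust code and says nothing about how the episode relates them. Expressing rotate as a gather through an index map is one possible formulation assumed here for illustration, and the helper names are hypothetical analogues of `thrust::inclusive_scan`, `thrust::gather`, `std::rotate`, and `thrust::adjacent_difference`.

```python
from itertools import accumulate


def inclusive_scan(xs):
    """Running sum where output[i] includes xs[i]."""
    return list(accumulate(xs))


def gather(indices, source):
    """output[i] = source[indices[i]]."""
    return [source[i] for i in indices]


def rotate_left(xs, k):
    """Left rotation by k, written as a gather through a shifted index map."""
    n = len(xs)
    return gather([(i + k) % n for i in range(n)], xs)


def adjacent_difference(xs):
    """output[0] = xs[0]; output[i] = xs[i] - xs[i - 1] for i > 0."""
    return [x if i == 0 else x - xs[i - 1] for i, x in enumerate(xs)]


data = [3, 1, 4, 1, 5]
scanned = inclusive_scan(data)            # [3, 4, 8, 9, 14]
rotated = rotate_left(data, 2)            # [4, 1, 5, 3, 1]
restored = adjacent_difference(scanned)   # [3, 1, 4, 1, 5]
```

Note that `adjacent_difference` undoes `inclusive_scan` for addition, which is why `restored` equals `data`.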

How AI is changing software engineering at Shopify with Farhan Thawar

2025-07-02 · The Pragmatic Engineer

podcast_episode

by Gergely Orosz, Farhan Thawar (Shopify)

AI/ML Analytics LLM Marketing SaaS Cyber Security

Supported by Our Partners
• WorkOS: The modern identity platform for B2B SaaS.
• Statsig: The unified platform for flags, analytics, experiments, and more.
• Sonar: Code quality and code security for ALL code.

What happens when a company goes all in on AI? At Shopify, engineers are expected to utilize AI tools, and they’ve been doing so for longer than most. Thanks to early access to models from GitHub Copilot, OpenAI, and Anthropic, the company has had a head start in figuring out what works. In this live episode from LDX3 in London, I spoke with Farhan Thawar, VP of Engineering, about how Shopify is building with AI across the entire stack. We cover the company’s internal LLM proxy, its policy of unlimited token usage, and how interns help push the boundaries of what’s possible.

In this episode, we cover:
• How Shopify works closely with AI labs
• The story behind Shopify’s recent Code Red
• How non-engineering teams are using Cursor for vibecoding
• Tobi Lütke’s viral memo and Shopify’s expectations around AI
• A look inside Shopify’s LLM proxy, used for privacy, token tracking, and more
• Why Shopify places no limit on AI token spending
• Why AI-first isn’t about reducing headcount, and why Shopify is hiring 1,000 interns
• How Shopify’s engineering department operates and what’s changed since adopting AI tooling
• Farhan’s advice for integrating AI into your workflow
• And much more!

Timestamps
(00:00) Intro
(02:07) Shopify’s philosophy: “hire smart people and pair with them on problems”
(06:22) How Shopify works with top AI labs
(08:50) The recent Code Red at Shopify
(10:47) How Shopify became early users of GitHub Copilot and their pivot to trying multiple tools
(12:49) The surprising ways non-engineering teams at Shopify are using Cursor
(14:53) Why you have to understand code to submit a PR at Shopify
(16:42) AI tools' impact on SaaS
(19:50) Tobi Lütke’s AI memo
(21:46) Shopify’s LLM proxy and how they protect their privacy
(23:00) How Shopify utilizes MCPs
(26:59) Why AI tools aren’t the place to pinch pennies
(30:02) Farhan’s projects and favorite AI tools
(32:50) Why AI-first isn’t about freezing headcount and the value of hiring interns
(36:20) How Shopify’s engineering department operates, including internal tools
(40:31) Why Shopify added coding interviews for director-level and above hires
(43:40) What has changed since Shopify added AI tooling
(44:40) Farhan’s advice for implementing AI tools

The Pragmatic Engineer deepdives relevant for this episode:
• How Shopify built its Live Globe for Black Friday
• Inside Shopify's leveling split
• Real-world engineering challenges: building Cursor
• How Anthropic built Artifacts

See the transcript and other references from the episode at https://newsletter.pragmaticengineer.com/podcast

Production and marketing by https://penname.co/. For inquiries about sponsoring the podcast, email [email protected].

Get full access to The Pragmatic Engineer at newsletter.pragmaticengineer.com/subscribe