talk-data.com

Event

PyConDE & PyData Berlin 2023

2023-04-17 – 2023-04-19 PyData

Top Speakers

Lev Konstantinovskiy (3), Noa Tamir (3), Patrick Hoefler (3), Joris Van den Bossche (2), Theodore Meynard (2), Gregor Riegler (2), Jérémy Tuloup (2), Nico Kreiling (2), Nitsan Avni (2), Tereza Iofciu (2), Cheuk Ting Ho (1), Ines Montani (1)

Sessions & talks

evosax: JAX-Based Evolution Strategies

2023-04-19

talk

Robert Lange

API GitHub

Tired of having to handle asynchronous processes for neuroevolution? Do you want to leverage massive vectorization and high-throughput accelerators for evolution strategies (ES)? evosax allows you to leverage JAX, XLA compilation and auto-vectorization/parallelization to scale ES to your favorite accelerators. In this talk we will get to know the core API and how to solve distributed black-box optimization problems with evolution strategies.
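
As a rough illustration of the kind of loop an evolution strategy runs, here is a minimal sketch of an ask, evaluate, tell cycle in plain Python. The `SimpleES` class, its parameters, and the `sphere` objective are hypothetical names made up for this sketch. It is not evosax's API and uses no JAX, so nothing here is compiled or vectorized.

```python
import random


def sphere(x):
    """Toy black-box objective: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


class SimpleES:
    """Hypothetical minimal evolution strategy with an ask/tell interface."""

    def __init__(self, mean, sigma=0.5, popsize=20, n_elite=5, seed=0):
        self.mean = list(mean)
        self.sigma = sigma
        self.popsize = popsize
        self.n_elite = n_elite
        self.rng = random.Random(seed)

    def ask(self):
        # Sample a population of candidates around the current mean.
        return [
            [m + self.sigma * self.rng.gauss(0.0, 1.0) for m in self.mean]
            for _ in range(self.popsize)
        ]

    def tell(self, candidates, fitness):
        # Keep the lowest-fitness candidates and move the mean to their average.
        ranked = sorted(zip(fitness, candidates), key=lambda pair: pair[0])
        elite = [cand for _, cand in ranked[: self.n_elite]]
        self.mean = [sum(dim) / len(elite) for dim in zip(*elite)]
        self.sigma *= 0.97  # slowly shrink the search radius


def run_es(generations=60):
    es = SimpleES(mean=[3.0, 3.0])
    initial = sphere(es.mean)
    for _ in range(generations):
        candidates = es.ask()
        # Evaluation is the only problem-specific step.
        fitness = [sphere(c) for c in candidates]
        es.tell(candidates, fitness)
    return initial, sphere(es.mean)


initial_fitness, final_fitness = run_es()
print(f"fitness: {initial_fitness:.3f} -> {final_fitness:.6f}")
```

Because the objective is only ever called on sampled candidates, the same loop works for any black-box function. The evaluation step is also the natural place to parallelize.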

The Beauty of Zarr

2023-04-19

talk

Sanket Verma

GitHub HTML Python

This talk introduces Zarr, an open-source data format for storing chunked, compressed N-dimensional arrays. It takes a systematic approach to understanding and implementing Zarr, covering how it works and why it is needed, and ends with a hands-on session. Zarr is based on an open technical specification, which makes implementations in several languages possible. I will focus on Zarr’s Python implementation and show how well it interoperates with the existing libraries in the PyData stack.
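
To make "chunked, compressed" concrete, here is a minimal standard-library sketch of the idea: an array is split into fixed-size chunks, each chunk is compressed on its own and stored under a key, and a read decompresses only the chunk it needs. This is an illustration of the concept, not Zarr's implementation or API. The function names, the dict-based store, the `i.j` key naming, and the JSON-plus-zlib encoding are all assumptions chosen for brevity.

```python
import json
import zlib


def chunk_key(index, chunk_shape):
    """Map an N-dimensional element index to the key of the chunk that holds it."""
    return ".".join(str(i // c) for i, c in zip(index, chunk_shape))


def write_chunked(rows, chunk_shape):
    """Split a 2-D list into chunks and store each one compressed under its key."""
    n_rows, n_cols = len(rows), len(rows[0])
    c_rows, c_cols = chunk_shape
    store = {}
    for r0 in range(0, n_rows, c_rows):
        for c0 in range(0, n_cols, c_cols):
            block = [row[c0:c0 + c_cols] for row in rows[r0:r0 + c_rows]]
            key = chunk_key((r0, c0), chunk_shape)
            # JSON + zlib stand in for a real binary encoding and compressor.
            store[key] = zlib.compress(json.dumps(block).encode("utf-8"))
    return store


def read_element(store, index, chunk_shape):
    """Decompress only the chunk containing `index` and return that element."""
    block = json.loads(zlib.decompress(store[chunk_key(index, chunk_shape)]))
    r, c = (i % size for i, size in zip(index, chunk_shape))
    return block[r][c]


data = [[r * 10 + c for c in range(6)] for r in range(4)]  # 4 x 6 toy array
store = write_chunked(data, chunk_shape=(2, 3))
print(sorted(store))                                    # the four chunk keys
print(read_element(store, (3, 4), chunk_shape=(2, 3)))  # same value as data[3][4]
```

Because each chunk is an independent object, chunks can be read and written separately, which is what makes this layout suitable for large arrays.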

Writing Plugin Friendly Python Applications

2023-04-18

talk

Travis Hathaway

GitHub Python

In modern software engineering, plugin systems are a ubiquitous way to extend and modify the behavior of applications and libraries. Software written to be plugin-friendly encourages modular organization, with well-thought-out contracts between the core software and its plugins. In this talk, we cover exactly how to define this contract and how you can start designing your software to be more plugin-friendly.

Throughout the talk we will build our own plugin-friendly application using the pluggy library to show these design principles in action. At the end of the talk, I also cover a real-life case study of how the package manager conda is currently making its 10-year-old code more plugin-friendly, to illustrate how to retrofit an existing project.
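
The contract between a core application and its plugins can be sketched with the standard library alone. The example below is a hypothetical, much-simplified illustration of the idea and deliberately does not use pluggy's API. The `PluginManager` class, the `transform_text` hook, and both plugins are invented for this sketch.

```python
from typing import Callable, Dict, List


class PluginManager:
    """Hypothetical minimal plugin manager: hooks are named extension points."""

    def __init__(self, hook_names: List[str]) -> None:
        # The contract: only these hook names may be implemented by plugins.
        self._hooks: Dict[str, List[Callable]] = {name: [] for name in hook_names}

    def register(self, plugin: object) -> None:
        # Collect every method on the plugin whose name matches a declared hook.
        for name, impls in self._hooks.items():
            impl = getattr(plugin, name, None)
            if callable(impl):
                impls.append(impl)

    def call(self, hook_name: str, **kwargs) -> list:
        # Call every implementation of the hook and gather the results.
        if hook_name not in self._hooks:
            raise KeyError(f"unknown hook: {hook_name}")
        return [impl(**kwargs) for impl in self._hooks[hook_name]]


class UppercasePlugin:
    def transform_text(self, text: str) -> str:
        return text.upper()


class ReversePlugin:
    def transform_text(self, text: str) -> str:
        return text[::-1]


manager = PluginManager(hook_names=["transform_text"])
manager.register(UppercasePlugin())
manager.register(ReversePlugin())
results = manager.call("transform_text", text="conda")
print(results)  # one result per registered plugin, in registration order
```

The core application only knows the hook names and their keyword arguments. Plugins can be added or removed without touching the core, which is the property a retrofit aims for.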

Exploring the Power of Cyclic Boosting: A Pure-Python, Explainable, and Efficient ML Method

2023-04-17

talk

Felix Wick

AI/ML GitHub Python

We have recently open-sourced a pure-Python implementation of Cyclic Boosting, a family of general-purpose, supervised machine learning algorithms. Its predictions are fully explainable at the level of individual samples, and yet Cyclic Boosting can deliver highly accurate and robust models. It requires little hyperparameter tuning and minimal data pre-processing (including support for missing information and categorical variables of high cardinality), making it an ideal off-the-shelf method for structured, heterogeneous data sets. Furthermore, it is computationally inexpensive and fast, allowing for rapid improvement iterations. The modeling process, especially the infamous but unavoidable feature engineering, is facilitated by the automatic creation of an extensive set of visualizations for data dependencies and training results. In this presentation, we will provide an overview of the inner workings of Cyclic Boosting, along with a few sample use cases, and demonstrate the usage of the new Python library.

You can find Cyclic Boosting on GitHub: https://github.com/Blue-Yonder-OSS/cyclic-boosting

Hyperparameter optimization for the impatient

2023-04-17

talk

Martin Wistuba

AI/ML GitHub

In recent years, Hyperparameter Optimization (HPO) has become a fundamental step in the training of Machine Learning (ML) models and in the creation of automatic ML pipelines. Unfortunately, while HPO improves the predictive performance of the final model, it comes at a significant cost in both computational resources and waiting time. This leads many practitioners to try to lower the cost of HPO by employing unreliable heuristics.

In this talk we will provide simple and practical algorithms for users who want to train models with almost-optimal predictive performance while incurring significantly lower cost and waiting time. The presented algorithms are agnostic to the application and the model being trained, so they can be useful in a wide range of scenarios.

We provide results from extensive experiments on public benchmarks, including comparisons with well-known techniques such as Bayesian Optimization (BO), ASHA, and Successive Halving. We will describe the scenarios in which the biggest gains are observed (up to 30x) and provide examples of how to use these algorithms in a real-world environment.

All the code used for this talk is available on GitHub: https://github.com/awslabs/syne-tune
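
For readers unfamiliar with Successive Halving, one of the baselines named above, here is a minimal textbook-style sketch in plain Python. It is not the algorithm presented in the talk and does not use Syne Tune's API. The `successive_halving` signature, the `toy_loss` function, and the candidate learning rates are assumptions made up for illustration.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the configurations each round and give the
    survivors eta times more budget.

    `evaluate(config, budget)` must return a loss (lower is better).
    """
    survivors = list(configs)
    budget = min_budget
    total_cost = 0
    while len(survivors) > 1:
        scored = [(evaluate(cfg, budget), cfg) for cfg in survivors]
        total_cost += budget * len(survivors)
        scored.sort(key=lambda pair: pair[0])
        keep = max(1, len(survivors) // eta)
        survivors = [cfg for _, cfg in scored[:keep]]
        budget *= eta
    return survivors[0], total_cost


def toy_loss(learning_rate, budget):
    # Hypothetical stand-in for a training run: the loss shrinks with more
    # budget and is lowest near a learning rate of 0.1.
    return abs(learning_rate - 0.1) + 1.0 / budget


candidates = [0.001, 0.01, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
best, cost = successive_halving(candidates, toy_loss)
print(best, cost)
```

The saving comes from spending most of the budget on the few configurations that survive early rounds, rather than training every candidate to completion.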

Large Scale Feature Engineering and Datascience with Python & Snowflake

2023-04-17

talk

Michael Gorkow

Data Science GitHub Python Cyber Security Snowflake

Snowflake as a data platform is the core data repository of many large organizations.
With the introduction of Snowflake's Snowpark for Python, Python developers can now collaborate and build on one platform with a secure Python sandbox, giving them dynamic scalability and elasticity as well as security and compliance.

In this talk I'll explain the core concepts of Snowpark for Python and how they can be used for large scale feature engineering and data science.
