Deepyaman Datta

Activities

Talks & appearances

Python is all you need: an overview of the composable, Python-native data stack

2025-07-09 · SciPy 2025

talk

API, Data Engineering, dbt, Modern Data Stack, Python, SQL

For the past decade, SQL has reigned as king of the data transformation world, and tools like dbt have formed a cornerstone of the modern data stack. Until recently, Python-first alternatives couldn't compete with the scale and performance of modern SQL. Now Ibis can provide the same benefits of SQL execution with a flexible Python dataframe API.

In this talk, you will learn how Ibis supercharges existing open-source libraries like Kedro and Pandera and how you can combine these technologies (and a few more) to build and orchestrate scalable data engineering pipelines without sacrificing the comfort (and other advantages) of Python.
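The core idea in the abstract above, a Python dataframe-style API whose work is executed as SQL, can be illustrated with a toy sketch. This is not Ibis's API: the `Expr` class, its `where_greater` and `sum_by` methods, and the `games` table are all hypothetical names invented for illustration, and the standard library's `sqlite3` stands in for a real backend. The sketch only shows the general pattern of building a deferred expression in Python and compiling it to SQL at execution time.

```python
import sqlite3
from dataclasses import dataclass, replace
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Expr:
    """Hypothetical deferred table expression; nothing runs until execute()."""

    table: str
    min_filters: Tuple[Tuple[str, float], ...] = ()
    group_col: Optional[str] = None
    sum_col: Optional[str] = None

    def where_greater(self, column: str, value: float) -> "Expr":
        # Return a new expression with one more "column > value" filter.
        return replace(self, min_filters=self.min_filters + ((column, value),))

    def sum_by(self, group_col: str, sum_col: str) -> "Expr":
        # Return a new expression that sums sum_col within each group_col.
        return replace(self, group_col=group_col, sum_col=sum_col)

    def to_sql(self) -> Tuple[str, List[float]]:
        # Compile the expression to a parameterized SQL string.
        # Column and table names are assumed to be trusted identifiers.
        if self.group_col is None:
            select = "*"
        else:
            select = f"{self.group_col}, SUM({self.sum_col}) AS total"
        sql = f"SELECT {select} FROM {self.table}"
        params = [value for _, value in self.min_filters]
        if self.min_filters:
            sql += " WHERE " + " AND ".join(f"{col} > ?" for col, _ in self.min_filters)
        if self.group_col is not None:
            sql += f" GROUP BY {self.group_col} ORDER BY {self.group_col}"
        return sql, params

    def execute(self, con: sqlite3.Connection) -> list:
        # Only here does the backend do any work.
        sql, params = self.to_sql()
        return con.execute(sql, params).fetchall()


def demo():
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE games (opening TEXT, moves INTEGER)")
    con.executemany(
        "INSERT INTO games VALUES (?, ?)",
        [("sicilian", 40), ("sicilian", 12), ("french", 55), ("french", 30)],
    )
    expr = Expr("games").where_greater("moves", 20).sum_by("opening", "moves")
    return expr.to_sql(), expr.execute(con)


if __name__ == "__main__":
    (sql, params), rows = demo()
    print(sql, params)
    print(rows)
```

Because the expression is immutable and only compiled in `to_sql`, the same Python code could in principle target a different engine by swapping the compile and execute step, which is the property the abstract attributes to Ibis.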

Building machine learning pipelines that scale: a case study using Ibis and IbisML

2025-07-07 · SciPy 2025

talk

with Anjali Datta, Deepyaman Datta

AI/ML, Analytics, Data Engineering, Pandas, Python, Scikit-learn

Pandas and scikit-learn have become staples in the machine learning toolkit for processing and modeling tabular data in Python. However, when data size scales up, these tools become slow or run out of memory. Ibis provides a unified, Pythonic, dataframe-like interface to 20+ execution backends, including dataframe libraries, databases, and analytics engines. Ibis enables users to leverage these powerful tools without rewriting their data engineering code (or learning SQL). IbisML extends the benefits of using Ibis to the ML workflow by letting users preprocess their data at scale on any Ibis-supported backend.

In this tutorial, you'll build an end-to-end machine learning project to predict the live win probability after each move during chess games.