Apache Arrow has become a de facto standard for efficient in-memory columnar data representation. Beyond the standardized, language-independent columnar memory format for tabular data, the Apache Arrow project also has a growing set of supplementary specifications and language implementations. This talk will give an overview of recent developments in the Apache Arrow ecosystem, including ADBC, nanoarrow, new data types, and the Arrow PyCapsule protocol.
talk-data.com
Speaker
Joris Van den Bossche
Talks & appearances
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing, and is becoming the de facto standard for tabular data. This talk will give an overview of recent developments both in Apache Arrow itself and in how it is being adopted in the PyData ecosystem (and beyond), and show how it can improve your day-to-day data analytics workflows.
Pandas reached its 2.0 milestone in 2023. But what does that mean? And what is coming after 2.0? This talk will give an overview of what happened in the latest pandas releases and highlight some of the topics and major new features the pandas project is working on.