Topic

Arrow

Apache Arrow

data_processing columnar_memory_format big_data

Activities: 2 tagged

Activity Trend

Peak: 6 per quarter (2020-Q1 to 2026-Q1)

Top Events

Data Engineering Podcast: 10
Data Council Austin 2024 - Day 1: 5
Databricks DATA + AI Summit 2023: 4
PyConDE & PyData Berlin 2023: 3
PyData Paris 2025: 3
The Analytics Engineering Podcast: 2
Data Council 2023: 2
O'Reilly Data Engineering Books: 2
PyData Paris 2024: 2
Data + AI Summit 2025: 2
Data Skeptic: 1
Making Data Simple: 1

Top Speakers

Tobias Macey: 10
Matthew Topol (Voltron Data): 3
Wes McKinney (Posit): 3
Julien Le Dem (Astronomer): 3
Alenka Frim (United.Cloud): 2
Joris Van den Bossche: 2
Tristan Handy (dbt Labs): 2
Julia Schottenstein (dbt labs): 2
Kyle Polich: 1
Hyukjin Kwon (Databricks): 1
Thomas Bierhance: 1
Yann LECHELLE (:PROBABL.): 1

Activities

Filtering by: PyData Paris 2024

xsimd: from xtensor to firefox

2024-09-25 · PyData Paris 2024

talk

by Serge « sans » Paille

Almost all modern CPUs have a vector processing unit, making it possible to write faster code for a large category of problems, at the cost of portability - there are many different instruction sets in the wild! The xsimd library makes it possible to write portable C++ code that targets different architectures and sub-architectures. The specialization choice can be made at compile time or at runtime, using a provided dispatching mechanism. Intel, ARM, RISC-V and WebAssembly are supported, and the library has already been adopted by xtensor, Pythran, Apache Arrow and Firefox.

The expanding Apache Arrow universe - standardizing and accelerating tabular data access and interchange

2024-09-25 · PyData Paris 2024

talk

by Joris Van den Bossche

Apache Arrow has become a de facto standard for efficient in-memory columnar data representation. Beyond the standardized and language-independent columnar memory format for tabular data, the Apache Arrow project also has a growing set of supplementary specifications and language implementations. This talk will give an overview of recent developments in the Apache Arrow ecosystem, including ADBC, nanoarrow, new data types, and the Arrow PyCapsule protocol.
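
For readers unfamiliar with what a "columnar memory format" means in practice, the following is a minimal sketch using only the Python standard library, not Arrow's own API. It mimics the general shape of a nullable fixed-width column as Arrow lays it out: a validity bitmap (one bit per value, least-significant bit first, 1 meaning "present") next to a contiguous values buffer. The function names `build_int32_column` and `read_value` are hypothetical and chosen for illustration only; real Arrow buffers also have alignment and padding conventions that are omitted here.

```python
from array import array


def build_int32_column(values):
    """Pack a list of ints (or None) into a validity bitmap and a values buffer.

    Illustrative only: array("i") is a C int, which is 4 bytes on most
    platforms but is not guaranteed to be exactly 32 bits.
    """
    n = len(values)
    validity = bytearray((n + 7) // 8)  # one bit per slot, rounded up to whole bytes
    data = array("i", [0] * n)          # null slots keep a placeholder value of 0
    null_count = 0
    for i, v in enumerate(values):
        if v is None:
            null_count += 1
        else:
            validity[i // 8] |= 1 << (i % 8)  # set bit i, least-significant bit first
            data[i] = v
    return bytes(validity), data, null_count


def read_value(validity, data, i):
    """Return the value at slot i, or None if its validity bit is unset."""
    if (validity[i // 8] >> (i % 8)) & 1:
        return data[i]
    return None


validity, data, null_count = build_int32_column([1, None, 3, 4])
print(validity, data.tolist(), null_count)
print([read_value(validity, data, i) for i in range(4)])
```

Because all values of a column sit in one contiguous buffer, a consumer in another language can read them without per-row parsing, which is the property that interchange mechanisms such as the ones named in the abstract build on.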