talk-data.com

Speaker

Petr Andreev

Senior Data Engineer, Mantel Group

Petr is a data engineer specializing in Databricks Azure platform design and development. With a strong focus on optimizing distributed systems, Petr brings extensive experience in creating efficient, scalable solutions that drive performance and innovation in data engineering.

Bio from: Data + AI Summit 2025

Talks & appearances

Highways and Hexagons: Processing Large Geospatial Datasets With H3

Matching GPS locations to roads and local government areas (LGAs) involves handling large datasets and a number of geospatial operations. In this deep dive, we will outline the challenges of developing scalable solutions for these tasks. We will discuss our multi-step approach, first using H3 indexing to isolate matches with a single candidate, then explaining the use of different geospatial computational techniques to accurately match points with multiple candidates. From a technical perspective, the talk will showcase the use of broadcasting and partitioning techniques and their effect on autoscaling, memory usage and effective data parallelization. This session is for anyone interested in geospatial data, Spark performance optimization and the real-world challenges of large-scale data engineering.
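The two-step idea in the abstract can be illustrated with a small, self-contained sketch. This is not the speakers' implementation. It swaps H3 hexagons for a plain square grid so that it runs with the Python standard library alone, and it models regions as rectangles rather than real LGA polygons. The names `CELL`, `REGIONS`, `build_index` and `match_points` are hypothetical. The sketch also assumes the regions tile the area without gaps, so a cell with exactly one candidate lies wholly inside that region.

```python
from collections import defaultdict
from math import floor

CELL = 1.0  # grid cell size (assumption; stands in for an H3 resolution)

# Hypothetical regions as half-open rectangles (min_x, min_y, max_x, max_y)
# that tile the area with no gaps.
REGIONS = {
    "west": (0.0, 0.0, 2.5, 4.0),
    "east": (2.5, 0.0, 5.0, 4.0),
}


def cell_of(x, y):
    """Return the grid cell containing a point (stand-in for an H3 cell id)."""
    return (floor(x / CELL), floor(y / CELL))


def build_index(regions):
    """Map each grid cell to the set of regions that overlap it."""
    index = defaultdict(set)
    for name, (x0, y0, x1, y1) in regions.items():
        for cx in range(floor(x0 / CELL), floor(x1 / CELL) + 1):
            for cy in range(floor(y0 / CELL), floor(y1 / CELL) + 1):
                # Keep only cells whose overlap with the region has positive area.
                overlaps_x = cx * CELL < x1 and (cx + 1) * CELL > x0
                overlaps_y = cy * CELL < y1 and (cy + 1) * CELL > y0
                if overlaps_x and overlaps_y:
                    index[(cx, cy)].add(name)
    return index


def contains(rect, x, y):
    """Exact containment test, used only for multi-candidate points."""
    x0, y0, x1, y1 = rect
    return x0 <= x < x1 and y0 <= y < y1


def match_points(points, regions, index):
    """Resolve single-candidate cells by lookup, the rest by exact test."""
    results = {}
    for pid, (x, y) in points.items():
        candidates = index.get(cell_of(x, y), set())
        if len(candidates) == 1:
            # Step 1: one candidate, so no geometry is needed.
            results[pid] = next(iter(candidates))
        else:
            # Step 2: zero or several candidates, so test each exactly.
            results[pid] = next(
                (n for n in sorted(candidates) if contains(regions[n], x, y)),
                None,
            )
    return results


index = build_index(REGIONS)
points = {
    "a": (1.2, 3.0),  # cell with one candidate
    "b": (2.7, 1.0),  # boundary cell, two candidates
    "c": (2.2, 0.5),  # boundary cell, two candidates
    "d": (4.9, 3.9),  # cell with one candidate
    "e": (9.0, 9.0),  # outside every region
}
matches = match_points(points, REGIONS, index)
print(matches)
# {'a': 'west', 'b': 'east', 'c': 'west', 'd': 'east', 'e': None}
```

Only points in boundary cells pay for the exact geometric test. The cell-to-candidates table is small relative to the point data, which is the kind of structure that could plausibly be broadcast to workers in a distributed setting. Whether the talk does so in this form is an assumption.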