talk-data.com
Event
PyData Paris 2025
Activities tracked
8
Sessions & talks
Boost your data quality while keeping your cloud spending under control
Actian Data Observability: real-time data quality, decoupled monitoring, controlled costs. Higher quality, lower spending. Live demo.
From dependence on the major cloud platforms to autonomy through sovereign AI: what trajectory for COVEA?
What if Europe were cut off from US AI? Covéa shares its journey and its vision for a sovereign, resilient, and controlled AI 🛡️🤖
From dependence on the major cloud platforms to autonomy through sovereign AI: what trajectory for Covéa?
What if Europe were deprived of US AI? Covéa & Orange Business explore sovereignty and autonomy.
Modernizing and migrating the data platform: MACIF's accelerated move-to-cloud journey
Experience report: Data & AI migration
Presentation of a project to migrate an on-prem data stack to a cloud architecture (Databricks + Power BI)
Documents Meet LLMs: Tales from the Trenches
Processing documents with LLMs comes with unexpected challenges: handling long inputs, enforcing structured outputs, catching hallucinations, and recovering from partial failures. In this talk, we’ll cover why large context windows are not a silver bullet, why chunking is deceptively hard, and how to design inputs and outputs that allow for intelligent retries. We'll also share practical prompting strategies, discuss OCR and parsing tools, compare different LLMs (and their cloud APIs), and highlight real-world insights from our experience developing production GenAI applications across multiple document processing scenarios.
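As a rough illustration of the chunking and retry ideas this abstract mentions, here is a minimal Python sketch. It is not the speakers' method. The names `chunk_text`, `extract_with_retry`, and `process_document` are hypothetical, `call_llm` stands in for whatever model client is used, and character-based windows with a fixed overlap are an assumption made only to keep the example short.

```python
import json
from typing import Callable, Optional


def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Split text into character windows that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if not text:
        return []
    step = size - overlap
    # Stop once the previous window has already reached the end of the text.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def extract_with_retry(
    chunk: str,
    call_llm: Callable[[str], str],  # hypothetical: prompt in, raw reply out
    required_keys: set[str],
    max_attempts: int = 3,
) -> Optional[dict]:
    """Ask for JSON, validate it, and retry with feedback when it is unusable."""
    feedback = ""
    for _ in range(max_attempts):
        raw = call_llm(chunk + feedback)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            feedback = f"\n\nThe previous reply was not valid JSON ({exc.msg}). Reply with JSON only."
            continue
        if isinstance(data, dict) and required_keys.issubset(data):
            return data
        feedback = f"\n\nThe previous reply lacked some of these keys: {sorted(required_keys)}."
    return None  # give up on this chunk only; other chunks are unaffected


def process_document(
    text: str,
    call_llm: Callable[[str], str],
    required_keys: set[str],
    size: int = 2000,
    overlap: int = 200,
) -> tuple[dict[int, dict], list[int]]:
    """Return per-chunk results plus the indices of chunks that still failed."""
    results: dict[int, dict] = {}
    failed: list[int] = []
    for index, chunk in enumerate(chunk_text(text, size, overlap)):
        data = extract_with_retry(chunk, call_llm, required_keys)
        if data is None:
            failed.append(index)
        else:
            results[index] = data
    return results, failed
```

Tracking failed chunk indices separately means a later pass can re-run only those chunks instead of the whole document, which is one simple way to recover from partial failures.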
A Journey Through a Geospatial Data Pipeline: From Raw Coordinates to Actionable Insights
Every dataset has a story — and when it comes to geospatial data, it’s a story deeply rooted in space and scale. But working with geospatial information is often a hidden challenge: massive file sizes, strange formats, projections, and pipelines that don't scale easily.
In this talk, we'll follow the life of a real-world geospatial dataset, from its raw collection in the field to its transformation into meaningful insights. Along the way, we’ll uncover the key steps of building a robust, scalable open-source geospatial pipeline.
Drawing on years of experience at Camptocamp, we’ll explore:
- How raw spatial data is ingested and cleaned
- How vector and raster data are efficiently stored and indexed (PostGIS, Cloud Optimized GeoTIFFs, Zarr)
- How modern tools like Dask, GeoServer, and STAC (SpatioTemporal Asset Catalogs) help process and serve geospatial data
- How to design pipelines that handle both "small data" (local shapefiles) and "big data" (terabytes of satellite imagery)
- Common pitfalls and how to avoid them when moving from prototypes to production
This journey will show how the open-source ecosystem has matured to make geospatial big data accessible — and how spatial thinking can enrich almost any data project, whether you are building dashboards, doing analytics, or setting the stage for machine learning later on.
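To make the "ingested and cleaned" step in the list above slightly more concrete, here is a small standard-library sketch of cleaning raw coordinates. It is an illustration under stated assumptions, not Camptocamp's pipeline. The function name `clean_points` is hypothetical, and the input is assumed to be (longitude, latitude) pairs in WGS 84 degrees. Rounding to six decimal places for de-duplication is an arbitrary choice.

```python
from typing import Iterable, Sequence


def clean_points(rows: Iterable[Sequence]) -> list[tuple[float, float]]:
    """Keep valid, de-duplicated (lon, lat) pairs, preserving input order."""
    seen: set[tuple[float, float]] = set()
    cleaned: list[tuple[float, float]] = []
    for row in rows:
        try:
            lon, lat = float(row[0]), float(row[1])
        except (TypeError, ValueError, IndexError):
            continue  # unparseable or incomplete record
        # Assumed convention: longitude first, WGS 84 degrees.
        # NaN values fail this range check and are dropped as well.
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            continue
        key = (round(lon, 6), round(lat, 6))  # arbitrary de-duplication precision
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(key)
    return cleaned
```

In a real pipeline this kind of validation would typically happen before loading into a spatial store such as PostGIS, where reprojection and indexing take over.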