Two-hour workshop on extracting data from websites with Python and automating data collection and analysis, using Jupyter notebooks.
talk-data.com
Topic: web scraping (6 tagged)
Async Python for Data Science: Speeding Up IO-Bound Workflows
Most Python scripts in data science are synchronous: they fetch one record at a time, wait for APIs, or slowly scrape websites. In this talk, we’ll introduce Python’s asyncio ecosystem and show how it transforms IO-heavy data workflows. You'll see how httpx, aiofiles, and async constructs speed up tasks like web scraping and batch API calls. We’ll compare async vs threading, walk through a real-world case study, and wrap with performance benchmarks that demonstrate async's value.
Keywords: Python 3.x, AsyncIO, Web Scraping, API, Concurrency, Performance, Optimization
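The sequential-versus-concurrent contrast in this abstract can be sketched with the standard library alone. This is a minimal illustration, not the talk's own code. `fetch_record` is a hypothetical stand-in that simulates network latency with `asyncio.sleep`; in a real workflow it would wrap an HTTP call, for example through `httpx.AsyncClient`. The helper names, the delay, and the concurrency limit are all assumptions chosen for the example.

```python
import asyncio
import time


async def fetch_record(record_id: int, delay: float = 0.05) -> dict:
    # Hypothetical stand-in for an HTTP request: the sleep simulates
    # waiting on the network without blocking the event loop.
    await asyncio.sleep(delay)
    return {"id": record_id, "status": "ok"}


async def fetch_sequential(ids: list[int]) -> list[dict]:
    # One record at a time: total time is roughly len(ids) * delay.
    return [await fetch_record(i) for i in ids]


async def fetch_concurrent(ids: list[int], limit: int = 5) -> list[dict]:
    # Run requests concurrently, capped by a semaphore so a real
    # server would not be hit with every request at once.
    semaphore = asyncio.Semaphore(limit)

    async def bounded(record_id: int) -> dict:
        async with semaphore:
            return await fetch_record(record_id)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(i) for i in ids))


def timed(coro_fn, ids: list[int]) -> tuple[list[dict], float]:
    # Run a coroutine function to completion and report wall-clock time.
    start = time.perf_counter()
    results = asyncio.run(coro_fn(ids))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    ids = list(range(10))
    _, t_seq = timed(fetch_sequential, ids)
    _, t_con = timed(fetch_concurrent, ids)
    print(f"sequential: {t_seq:.2f}s, concurrent: {t_con:.2f}s")
```

The semaphore is a design choice rather than a requirement: it keeps the number of in-flight requests bounded, which is usually the polite default when the target is a website rather than a local simulation.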
With web scraping, companies can gather large datasets and build predictive models to make better business decisions. In this workshop you will learn: ✨ The basics of programming in Python ✨ How to run code with Jupyter ✨ How to collect data for your analyses.
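As a rough illustration of the data-collection step, the sketch below extracts link text and URLs from an HTML snippet using only Python's standard library. It is not the workshop's material. The sample HTML, the `LinkCollector` class, and the `extract_links` helper are assumptions made up for this example; a real scraper would first download the page, and would often use a dedicated parsing library instead.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a href=...> element."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[tuple[str, str]] = []
        self._href = None   # href of the <a> currently being read
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Made-up sample page, standing in for HTML downloaded from a site.
SAMPLE_HTML = """
<ul>
  <li><a href="/talks/1">Async Python</a></li>
  <li><a href="/talks/2">Scraping 101</a></li>
</ul>
"""


def extract_links(html: str) -> list[tuple[str, str]]:
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    for text, href in extract_links(SAMPLE_HTML):
        print(text, href)
```

In a Jupyter notebook, the resulting list of tuples can be passed straight into a table or data frame for analysis.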
The project aims to build a web scraping library that uses OCR, an LLM, and automation scripts to retrieve data from highly protected websites that have no APIs.
This project aims to develop an AI-powered system that predicts the most cost-effective locations from which users can book flights, so that they can access the lowest available prices. It builds a rich dataset through various web scraping techniques, querying flight data and prices from different locations. A machine learning model then analyzes this dataset to suggest optimal booking locations and predict the potential savings for a given user query.