Sarah McKenna joins me to chat about all things web scraping. We discuss its applications, the evolution of alternative data, and AI's impact on the industry. We also discuss privacy concerns, the challenges of bot blocking, and the importance of data quality. Sarah shares ideas on how to get started with web scraping and the ethical considerations surrounding copyright and data collection.
Automating Data Quality via Shift Left for Real-Time Web Data Feeds at Industrial Scale | Sarah McKenna | Shift Left Data Conference 2025
Real-time web data is one of the hardest data streams to automate with trust: websites don't want to be scraped, change constantly without notice, and employ sophisticated bot-blocking mechanisms to stop automated data collection. At Sequentum, we cut our teeth on web data and have since released a general-purpose cloud platform for any type of data ingestion and data enrichment. Our clients can transparently audit the platform and ultimately trust it to deliver their mission-critical data on time and with the quality needed to fuel their business decision-making.