talk-data.com

Meetup workshop 2025-03-27 at 16:00

Data Prep Kit Workshop: Data wrangling for ML and data apps

Description

Hands-on workshop on using Data Prep Kit to clean and prepare high-quality datasets: extract content from PDF and HTML files, clean up markup, remove spam, score and filter out low-quality documents, identify and remove PII, and detect hateful or abusive language. Prerequisites: you should be comfortable with Python. The workshop runs in Google Colab.
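To give a rough feel for the kind of steps listed above, here is a minimal sketch in plain Python using only the standard library. It does not use Data Prep Kit and is not its API: the function names (`strip_markup`, `quality_score`, `redact_emails`, `prepare`), the alphabetic-token heuristic, and the 0.5 threshold are all assumptions made for illustration, and real PII or quality filtering is considerably more involved.

```python
import html
import re

# Hypothetical helpers for illustration only; not part of Data Prep Kit.
TAG_RE = re.compile(r"<[^>]+>")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def strip_markup(raw: str) -> str:
    """Drop HTML tags, decode entities, and collapse whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)
    return " ".join(text.split())


def quality_score(text: str) -> float:
    """Toy heuristic: fraction of whitespace-separated tokens that are purely alphabetic."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(t.isalpha() for t in tokens) / len(tokens)


def redact_emails(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub("<EMAIL>", text)


def prepare(docs, min_score=0.5):
    """Clean each document, drop low-scoring ones, then redact emails."""
    cleaned = []
    for raw in docs:
        text = strip_markup(raw)
        if quality_score(text) < min_score:  # assumed threshold
            continue
        cleaned.append(redact_emails(text))
    return cleaned


if __name__ == "__main__":
    docs = [
        "<p>Contact us at jane.doe@example.com for details</p>",
        "$$$ 100% 777 !!! 999",
    ]
    print(prepare(docs))
```

Scoring runs before redaction so that the placeholder token does not affect the quality heuristic. In the workshop itself, each of these stages would be handled by the toolkit rather than by hand-written regular expressions.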