talk-data.com
Meetup
Hands-on workshop
2025-03-20 at 16:00
Data Prep Kit Workshop
Description
Hands-on workshop exploring IBM Data Prep Kit for data preparation. It covers getting started, extracting content from PDF, DOCX, and HTML files, cleaning markup, deduplicating data, and removing low-quality or spam documents. The session runs in Google Colab and is suitable for LLM app developers, data scientists, and data engineers. Prerequisite: comfort with Python.
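To give a feel for the kind of steps listed above (cleaning markup, deduplicating, dropping low-quality documents), here is a minimal sketch in plain Python using only the standard library. It does not use the Data Prep Kit API. The names `strip_markup` and `prepare` and the `min_words` threshold are hypothetical, chosen for illustration, and the word-count filter is only a crude stand-in for a real quality or spam check.

```python
import hashlib
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join text nodes and collapse runs of whitespace into single spaces.
        return re.sub(r"\s+", " ", " ".join(self._chunks)).strip()


def strip_markup(html):
    """Return the visible text of an HTML string (illustrative helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def prepare(docs, min_words=5):
    """Clean markup, drop very short documents, and remove exact duplicates.

    Assumption: 'low quality' is approximated by a word-count threshold,
    and duplicates are detected by hashing the lowercased cleaned text.
    """
    seen = set()
    kept = []
    for raw in docs:
        text = strip_markup(raw)
        if len(text.split()) < min_words:
            continue  # too short to be useful
        digest = hashlib.sha256(text.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate after normalization
        seen.add(digest)
        kept.append(text)
    return kept
```

Calling `prepare` on a list of raw HTML strings returns the cleaned, unique texts in their original order. Hashing catches only exact duplicates after normalization; near-duplicate detection needs a fuzzy technique such as MinHash, which this sketch leaves out.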