A hands-on session exploring Data Prep Kit as a way to accelerate data preparation for building robust LLM applications. Topics include getting started with Data Prep Kit; extracting content from PDF, DOCX, and HTML files; cleaning up excess markup; detecting and removing duplicate documents; and filtering out low-quality and spam documents. Attendees should be comfortable with Python; the workshop code runs in Google Colab.
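To give a feel for one of the listed topics, detecting and removing duplicate documents, here is a minimal sketch in plain Python. It does not use Data Prep Kit's API; the function names `normalize` and `drop_exact_duplicates` are hypothetical and chosen only for illustration. The sketch assumes that "duplicate" means identical text after whitespace and case normalization, which is a simpler notion than the fuzzy matching a production pipeline might apply.

```python
import hashlib


def normalize(text: str) -> str:
    # Collapse runs of whitespace and lowercase the text so that
    # trivially different copies produce the same hash.
    return " ".join(text.split()).lower()


def drop_exact_duplicates(docs: list[str]) -> list[str]:
    # Keep the first occurrence of each document, comparing by the
    # SHA-256 digest of the normalized text (hypothetical helper,
    # not part of any library).
    seen: set[str] = set()
    unique: list[str] = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


docs = ["Hello  world", "hello world", "Something else"]
print(drop_exact_duplicates(docs))
```

Hashing keeps memory use proportional to the number of distinct documents rather than their total size, which is why it is a common first pass before more expensive near-duplicate checks.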
talk-data.com
Topic: data prep kit
Top Events
[AI Alliance] Review of Data Preparation Tools (incl. open Q&A)
AI Alliance: An in-depth look at Data Prep Kit
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
[AI Alliance] Workshop: Hands-on with Data Prep Kit
Open Source GenAI Hands-On Workshops November 10, 1 Madison Avenue, NYC
AI Alliance: Getting Started with Docling and Data Prep Kit
[AI Alliance] RAG pipeline With Data Prep Kit + Milvus + Granite (Workshop)