talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
Data Prep Kit Workshop
2025-03-27 · 16:00
Hands-on workshop on cleaning and preparing high-quality datasets using Data Prep Kit. Topics include extracting content from PDFs and HTML, cleaning up markup, detecting and removing SPAM content, scoring and removing low-quality documents, identifying and removing PII data, and detecting and removing HAP (Hate Abuse Profanity) speech. More about Data Prep Kit: https://github.com/IBM/data-prep-kit |
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
|
|
Data Prep Kit Workshop: Clean and Prepare High-Quality Datasets
2025-03-27 · 16:00
Hands-on workshop on using Data Prep Kit to extract content from PDFs/HTML, clean up data, remove SPAM, score and remove low-quality documents, identify and remove PII data, and detect and remove HAP (Hate Abuse Profanity) speech to improve dataset quality. Code will be run in Google Colab using Python. |
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
|
|
Data Prep Kit Workshop: Data wrangling for ML and data apps
2025-03-27 · 16:00
Hands-on workshop on using Data Prep Kit to clean and prepare high-quality datasets: extract content from PDFs/HTML, clean up markup, remove SPAM, score and filter low-quality documents, identify and remove PII data, and detect hateful or abusive language. Prerequisites: comfort with Python; the workshop runs in Google Colab. |
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
|
|
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
2025-03-27 · 16:00
Overview: When building machine learning and data applications, a significant portion of your time will be dedicated to data wrangling, from content extraction to filtering out problematic and low-quality data. In this hands-on session we will explore Data Prep Kit, an open source toolkit designed to streamline these essential tasks. Attendees will learn firsthand how to use Data Prep Kit to improve overall data quality by removing spam and low-quality documents, HAP (Hate, Abuse, Profanity) speech, and PII (Personally Identifiable Information), leading to a higher-quality dataset.
Description: Join us for an interactive, hands-on session where you will learn to clean up data and prepare high-quality datasets. In this workshop we will do the following:
More about Data Prep Kit: https://github.com/IBM/data-prep-kit What do you need to participate in this workshop?
Session Type: Workshop (hands-on). Audience: LLM app developers, data scientists, data engineers. Technical Level: Beginner to Intermediate. Prerequisites
Duration: 60 mins. Industry: Cross industry. Speaker Bio: https://sujee.dev/bio About the AI Alliance: The AI Alliance is an international community of researchers, developers, and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security, and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to developing and achieving safe and responsible AI that benefits society rather than a select few big players. |
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
|