talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
2025-03-27 · 16:00
Overview: When building machine learning and data applications, a significant portion of your time will be dedicated to data wrangling, from content extraction to filtering out problematic and low-quality data. In this hands-on session we will explore Data Prep Kit, an open source toolkit designed to streamline these essential tasks. Attendees will learn first-hand how to use Data Prep Kit to improve overall data quality by removing spam and low-quality documents, HAP (Hate, Abuse, Profanity) speech, and PII (Personally Identifiable Information), leading to a higher-quality dataset.
Description: Join us for an interactive, hands-on session where you will learn to clean up data and prepare high-quality datasets. In this workshop we will do the following:
More about Data Prep Kit: https://github.com/IBM/data-prep-kit
What do you need to participate in this workshop?
Session Type: Workshop (hands-on)
Audience: LLM app developers, data scientists, data engineers
Technical Level: Beginner - Intermediate
Prerequisites:
Duration: 60 mins
Industry: Cross industry
Speaker Bio: https://sujee.dev/bio
About the AI Alliance: The AI Alliance is an international community of researchers, developers and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to develop and achieve safe and responsible AI that benefits society rather than a select few big players. |
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
|
|
Data Prep Kit Hands-on Workshop
2025-03-20 · 16:00
Hands-on session to explore Data Prep Kit and accelerate data preparation for building robust LLM applications. Topics include getting started with Data Prep Kit, extracting content from PDFs, DOCX, and HTML, cleanup of excess markup, detecting/removing duplicate documents, and removing low-quality and spam documents. Attendees should be comfortable with Python; workshop code will run in Google Colab. |
[AI Alliance] Workshop: Hands-on with Data Prep Kit
|
|
Hands-on workshop: Data Prep Kit for data preparation and LLM applications
2025-03-20 · 16:00
Hands-on session to explore Data Prep Kit and how to accelerate data preparation for LLM applications. The workshop covers getting started with Data Prep Kit, extracting content from PDFs, DOCX, and HTML, cleaning markup, deduplicating content, and detecting/removing low-quality or spam documents. |
[AI Alliance] Workshop: Hands-on with Data Prep Kit
|
|
Data Prep Kit Workshop
2025-03-20 · 16:00
Hands-on workshop to explore IBM Data Prep Kit for data preparation, including getting started, extracting content from PDFs, DOCX, and HTML, cleaning markup, deduplicating data, and removing low-quality or spam documents. The session will be run in Google Colab and is suitable for LLM app developers, data scientists, and data engineers. Prerequisites: comfortable with Python. |
[AI Alliance] Workshop: Hands-on with Data Prep Kit
|
|
[AI Alliance] Workshop: Hands-on with Data Prep Kit
2025-03-20 · 16:00
Overview: When building machine learning and data applications, a significant portion of your time will be dedicated to data wrangling, from content extraction and cleaning to de-duplication and filtering out problematic data. In this hands-on session we will explore Data Prep Kit, an open source toolkit designed to streamline these essential tasks. Attendees will learn first-hand how to use Data Prep Kit to accelerate data preparation, improve overall data quality, and enhance the efficiency of building robust LLM applications.
Description: Data Prep Kit is a comprehensive Python library that democratizes and accelerates data preparation by providing out-of-the-box solutions for common tasks. Engineered to scale from a single laptop to large cloud clusters, it has been successfully used to process terabytes of data for training IBM Granite Large Language Models (LLMs). Data Prep Kit offers a robust feature set including duplicate elimination, advanced document and code handling, language detection (for both spoken and programming languages), removal of personally identifiable information (PII), as well as spam, hate speech, and malware detection.
More about Data Prep Kit: https://github.com/IBM/data-prep-kit
Join us for this hands-on session to explore how to use Data Prep Kit to accelerate data preparation and enhance data quality. In this workshop we will do the following:
What do you need to participate in this workshop?
Session Type: Hands-on workshop
Audience: LLM app developers, data scientists, data engineers
Technical Level: Beginner - Intermediate
Prerequisites:
Duration: 60 mins
Industry: Cross industry
Speaker Bio: https://sujee.dev/bio
About the AI Alliance: The AI Alliance is an international community of researchers, developers and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to develop and achieve safe and responsible AI that benefits society rather than a select few big players. |
[AI Alliance] Workshop: Hands-on with Data Prep Kit
|
|
[AI Alliance] RAG pipeline With Data Prep Kit + Milvus + Granite (Workshop)
2025-02-13 · 17:00
Whether you are building a RAG (Retrieval-Augmented Generation) application or fine-tuning a model, a significant portion of your time will be dedicated to data wrangling (cleaning, de-duping, removing markup, etc.). Data Prep Kit (https://github.com/IBM/data-prep-kit) can help you with data wrangling. Noteworthy features of DPK include:
In this workshop, we will demonstrate implementing an end-to-end RAG pipeline using open source technologies:
Session Type: Hands-on workshop
Audience: LLM app developers, data scientists, data engineers
Technical Level: Intermediate
Prerequisites:
Industry: Cross industry
Agenda:
About the instructor: Sujee Maniyam (AI Engineer, Developer Advocate @ Node51) is an expert in Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He is passionate about developer education and fostering community engagement. Sujee has led numerous training sessions, hackathons, and workshops. He is also an author, open source contributor and frequent speaker at conferences and meetups.
About the AI Alliance: The AI Alliance is an international community of researchers, developers and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to develop and achieve safe and responsible AI that benefits society rather than a select few big players. |
[AI Alliance] RAG pipeline With Data Prep Kit + Milvus + Granite (Workshop)
|