talk-data.com


Abstract Imagine being able to ask questions about a website in natural language—and receiving meaningful answers instead of simple keyword matches. In this talk, I’ll introduce Allycat, an open-source, end-to-end stack that enables conversational interaction with website content using Large Language Models (LLMs).

We’ll walk through the complete pipeline:

  • Crawling and indexing website content
  • Cleaning and extracting meaningful information from HTML
  • Creating embeddings and storing them in a vector database
  • Querying the data using an LLM for contextual, accurate responses
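The retrieval half of such a pipeline can be sketched with only the Python standard library. This is a toy illustration, not Allycat's actual code: the word-count "embedding", the in-memory list standing in for a vector database, the example.com pages, and every function name are assumptions made for this example. A real stack would use an embedding model, a vector database, and an LLM that answers from the retrieved text.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Cleaning step: reduce raw HTML to plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def embed(text):
    """Toy embedding (assumption): a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(pages):
    """Indexing step: pages maps url -> raw HTML (stands in for a crawl)."""
    index = []
    for url, html in pages.items():
        text = html_to_text(html)
        index.append((url, text, embed(text)))
    return index


def retrieve(index, question, top_k=1):
    """Query step: return the top_k most similar (url, text) pairs."""
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return [(url, text) for url, text, _ in ranked[:top_k]]


def build_prompt(question, hits):
    """Assemble the context an LLM would be asked to answer from."""
    context = "\n".join(text for _, text in hits)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical pages, hard-coded instead of crawled.
pages = {
    "https://example.com/pricing": (
        "<html><body><h1>Pricing</h1>"
        "<p>The basic plan costs ten dollars per month.</p>"
        "<script>track()</script></body></html>"
    ),
    "https://example.com/about": (
        "<html><body><h1>About</h1>"
        "<p>We are a small team building search tools.</p></body></html>"
    ),
}
index = build_index(pages)
question = "How much does the basic plan cost?"
hits = retrieve(index, question)
prompt = build_prompt(question, hits)
```

The resulting `prompt` string is what would be sent to the LLM; swapping `embed` for a real embedding model and the list for a vector database changes the quality of retrieval but not the overall shape of the flow.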

We’ll also demonstrate Allycat’s lightweight UI that allows users to interactively test their queries. The entire stack is built with Python and open-source components, making it easy to adopt, adapt, and extend.

You can check out Allycat here: https://github.com/The-AI-Alliance/allycat

Audience AI/ML Engineers, Data Engineers, Data Scientists interested in building intelligent, LLM-powered search and chatbot interfaces.

Level Beginner to Intermediate

Format 45-minute presentation with demonstration

About the speaker Sujee Maniyam (AI Engineer, Developer Advocate @ Node51) is an expert in Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He is passionate about developer education and fostering community engagement. Sujee has led numerous training sessions, hackathons, and workshops. He is also an author, open source contributor, and frequent speaker at conferences and meetups.

About the AI Alliance The AI Alliance is an international community of researchers, developers, and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security, and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to developing and achieving safe and responsible AI that benefits society rather than a select few big players.

[AI Alliance] Chat with your website using an LLM

Sujee Maniyam – AI Engineer, Developer Advocate @ IBM / The AI Alliance

In this session we will review the following data preparation tools and techniques we have discussed in the previous sessions: Data Prep Kit; Docling; Open source RAG with Data Prep Kit + Milvus + Llama.

data prep kit docling milvus llama RAG
[AI Alliance] Review of Data Preparation Tools (incl. open Q&A)
Sujee Maniyam – AI Engineer, Developer Advocate @ IBM / The AI Alliance


Agenda

  • Welcome, housekeeping, etc.
  • Quick intro about AI Alliance (1 min)
  • Presentation: Review of data preparation tools (40 mins)
  • Q&A (10 mins)
  • Wrap-up

Presentation: Reviewing Data Preparation Tools In this session we will review the data preparation tools and techniques discussed in the previous sessions: Data Prep Kit, Docling, and open source RAG with Data Prep Kit + Milvus + Llama.

Watch the replay:

  • January 30, 2025: Using Docling to process data (YouTube)
  • February 6, 2025: Using Data Prep Kit to process data (YouTube)
  • February 13, 2025: Open source RAG pipeline using Data Prep Kit + Milvus + Granite (recording available soon)

Session Type Talk and open Q&A

Audience LLM app developers, data scientists, data engineers

Technical Level Beginner – Intermediate

Prerequisites None

Resources See links to session recordings above

Speaker: Sujee Maniyam, AI Engineer, Developer Advocate (Consulting for IBM / The AI Alliance) Sujee Maniyam is an expert in Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He is passionate about developer education and fostering community engagement. Sujee has led numerous training sessions, hackathons, and workshops. He is also an author, open source contributor, and frequent speaker at conferences and meetups.

[AI Alliance] Review of Data Preparation Tools (incl. open Q&A)

Sujee Maniyam – AI Engineer, Developer Advocate @ IBM / The AI Alliance

In this workshop, we will demonstrate implementing an end-to-end RAG pipeline using open source technologies: Data Prep Kit for processing documents; Milvus as vector database; Granite 3 as the LLM.

data prep kit milvus granite 3

Introductory session on Milvus vector datastore, presented by Stefan Webb from Zilliz.

milvus

Whether you are building RAG (Retrieval-Augmented Generation) applications or fine-tuning a model, a significant portion of your time will be dedicated to data wrangling (cleaning, de-duping, removing markup, etc.). Data Prep Kit (https://github.com/IBM/data-prep-kit) can help you with data wrangling.

Noteworthy features of DPK include:

  • de-duping documents (exact dedupe and fuzzy dedupe),
  • handling documents and code,
  • language detection (spoken languages and programming languages),
  • malware detection, and
  • creating embeddings.
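To make the difference between exact and fuzzy de-duping concrete, here is a small standard-library sketch. It does not use Data Prep Kit's API; the function names, the 3-word shingles, and the 0.5 Jaccard threshold are assumptions chosen for illustration. Exact dedupe drops documents whose normalized text hashes identically, while fuzzy dedupe also drops documents whose word-shingle sets overlap heavily with a document that was already kept.

```python
import hashlib
import re


def normalize(text):
    """Lowercase and strip punctuation so trivial variants compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def exact_dedupe(docs):
    """Keep the first document for each distinct hash of normalized text."""
    seen = set()
    kept = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(doc)
    return kept


def shingles(text, size=3):
    """Set of overlapping word n-grams (size is an illustrative choice)."""
    words = normalize(text).split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def fuzzy_dedupe(docs, threshold=0.5):
    """Keep a document only if it is not too similar to any kept one."""
    kept = []
    kept_shingles = []
    for doc in docs:
        current = shingles(doc)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(doc)
            kept_shingles.append(current)
    return kept


docs = [
    "Milvus is an open source vector database.",
    "milvus is an open source vector database",          # exact duplicate after normalizing
    "Milvus is an open source vector database for AI.",  # near duplicate
    "Granite is a family of language models.",
]
unique = exact_dedupe(docs)      # drops the second document
distinct = fuzzy_dedupe(unique)  # also drops the near duplicate
```

This pairwise comparison is quadratic in the number of documents; large-scale fuzzy dedupe typically relies on techniques such as MinHash to avoid comparing every pair.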

In this workshop, we will demonstrate implementing an end-to-end RAG pipeline using open source technologies:

  • Data Prep Kit for processing documents
  • Milvus as vector database
  • Granite 3 as the LLM

Session Type Hands-on workshop

Audience LLM app developers, data scientists, data engineers

Technical Level Intermediate

Prerequisites

  • A computer with a Python development environment is strongly recommended. Setup instructions are here.
  • A FREE Replicate account - get one at replicate.com. Use this code to add some credit to your Replicate account! 💰

Industry Cross industry

Agenda

  • Welcome & introductions (5')
  • About the AI Alliance & how you can get involved (5')
  • Introduction to Milvus vector datastore by Stefan Webb from Zilliz (15')
  • Workshop: “RAG pipeline with Data Prep Kit + Milvus + Granite” (60')
  • Q&A
  • Closing

About the instructor Sujee Maniyam (AI Engineer, Developer Advocate @ Node51) is an expert in Generative AI, Machine Learning, Deep Learning, Big Data, Distributed Systems, and Cloud technologies. He is passionate about developer education and fostering community engagement. Sujee has led numerous training sessions, hackathons, and workshops. He is also an author, open source contributor, and frequent speaker at conferences and meetups.

About the AI Alliance The AI Alliance is an international community of researchers, developers, and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security, and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to developing and achieving safe and responsible AI that benefits society rather than a select few big players.

[AI Alliance] RAG pipeline With Data Prep Kit + Milvus + Granite (Workshop)
