talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
Building LLM applications with Python
2026-01-05 · 18:00
Overview Undeniably, large language models (LLMs) are at the centre of a modern gold rush in technology. This session covers the theory and practice of building LLM-based applications with Python. Who is this for? Students, developers, and anyone interested in getting started with building LLM-based applications with Python. Who is leading the session? The session is led by Dr. Stelios Sotiriadis, CEO of Warestack and Associate Professor and MSc Programme Director at Birkbeck, University of London. His expertise includes cloud computing, distributed systems, and AI engineering. Stelios holds a PhD from the University of Derby, completed a postdoctoral fellowship at the University of Toronto, and has worked with Huawei, IBM, Autodesk, and several startups. Since 2018 he has taught at Birkbeck and, in 2021, founded Warestack, building software for startups globally. What we’ll cover A practical introduction to the basics of local models and cloud APIs for building real software systems. You will learn:
Requirements
This space is needed for running local models. You may also use the lab computers if your device doesn’t meet the requirements. Format A 1.5-hour live session including:
The session will run in person, with streaming available for remote attendees. Prerequisites You should be comfortable writing Python scripts (basic to intermediate level). |
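The "local models and cloud APIs" theme above can be sketched in a few lines: many local model servers (e.g. Ollama, llama.cpp's server) expose an OpenAI-compatible chat-completions endpoint, so the same request body works against either backend by swapping the base URL. This is a minimal illustration, not the session's actual code; the model names and URLs are placeholders.

```python
import json

def chat_payload(prompt: str, model: str = "gpt-4o-mini", temperature: float = 0.2) -> dict:
    """Build an OpenAI-style chat-completions request body.

    The same payload can be POSTed to a cloud endpoint
    (https://api.openai.com/v1/chat/completions) or a local
    OpenAI-compatible server; only the base URL and model name change.
    """
    return {
        "model": model,
        "temperature": temperature,
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": prompt},
        ],
    }

# Same code, two backends: just swap the model identifier.
body = chat_payload("Summarise retrieval-augmented generation in one sentence.", model="llama3.2")
print(json.dumps(body, indent=2))
```

The design point is that the application code stays backend-agnostic, which is what makes it practical to prototype locally and deploy against a hosted API.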
Building LLM applications with Python
|
|
A Practical Starter's Guide to building LLM-based projects | Marcin S. | DSC DACH 25
2025-12-10 · 15:28
In his tech tutorial, Marcin showed how to go beyond creating prompts for ChatGPT and build full applications leveraging generative AI. He covered the fundamentals of large language models (LLMs), introduced LangChain, and demonstrated techniques like question answering over documents and creating reasoning agents. The session also addressed advanced methods and practical challenges of deploying LLMs in production. By the end, participants with Python experience gained hands-on knowledge to develop GPT-driven applications while understanding potential pitfalls and limitations. This tutorial by Marcin Szymaniuk was held on October 14th at DSC DACH 25 in Vienna. Follow us on social media: LinkedIn: https://www.linkedin.com/company/11184830/admin/ Instagram: https://www.instagram.com/datasciconf/ Facebook page: https://www.facebook.com/DataSciConference Website: https://datasciconference.com/ |
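The "question answering over documents" pattern the tutorial demonstrates boils down to three steps that frameworks like LangChain automate: split documents into chunks, retrieve the chunks most relevant to the question, and stuff them into the prompt. Here is a framework-free sketch of those steps using naive keyword-overlap scoring (real systems use embeddings); it is an illustration of the idea, not Marcin's code.

```python
def chunk(text: str, size: int = 50) -> list[str]:
    """Split a document into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by shared words with the question (embedding search in real RAG)."""
    q = set(question.lower().split())
    scored = sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)
    return scored[:k]

doc = ("LangChain wires retrievers and LLMs together. A retriever finds "
       "relevant chunks. The LLM answers using only those chunks.")
context = top_chunks("What does a retriever find?", chunk(doc, size=8))

# The retrieved context would be prepended to the LLM prompt:
prompt = f"Answer from the context only.\nContext: {' '.join(context)}\nQuestion: What does a retriever find?"
print(prompt)
```

Swapping the overlap score for vector similarity and the f-string for a prompt template recovers the standard retrieval-augmented QA chain.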
DSC DACH 25 |
|
Building AI Interoperability with MCP and ContextForge
2025-12-02 · 17:00
This meetup is in association with IBM. ----------------------------------------- Note: PyData Ireland will be collecting your name and email for smooth access to the venue. ----------------------------------------- APIs changed the web. MCP (Model Context Protocol) is changing AI. In this deep dive, Mihai takes you under the hood of the new interoperability layer that's powering 15,000+ AI tools and servers across the ecosystem. This session focuses on building real, working MCP servers in Python and understanding how they connect through ContextForge: https://github.com/IBM/mcp-context-forge - an open-source MCP Gateway and Registry that serves as a central hub for tools, resources, and prompts accessible to MCP-compatible LLMs. ContextForge converts REST APIs to MCP, composes virtual MCP servers with added security and observability, and bridges protocols such as stdio, SSE, and Streamable HTTP. Key Takeaways: Build secure, scalable MCP servers to drive AI agents. See how MCP and ContextForge work together to make AI tools interoperable, secure, and production-ready. Speaker Designation: Mihai Criveti, Distinguished Engineer for AI Agents at IBM Speaker Bio: Mihai Criveti is a Distinguished Engineer for AI Agents at IBM, leading the development of ContextForge, the open-source Model Context Protocol (MCP) Gateway and Registry. He shapes agentic AI standards across IBM and is building a team in Dublin to advance ContextForge and its global adoption. His work focuses on platform engineering, AI orchestration, and open systems that accelerate real-world AI adoption. -------------------------------------------- P.S.: The ContextForge team is hiring. Read on and apply. From team ContextForge: We're looking for highly motivated Python developers, from early-career engineers to senior contributors, to join the ContextForge MCP & A2A Gateway team at IBM Software. 
ContextForge - https://github.com/IBM/mcp-context-forge is an open-source, production-grade gateway, proxy, and registry for Model Context Protocol (MCP) servers and A2A agents, unifying discovery, authentication, rate-limiting, observability, and federation across distributed AI and REST ecosystems. Application portal: https://ibmglobal.avature.net/en_US/careers/JobDetail?jobId=71800 GitHub - IBM/mcp-context-forge: A Model Context Protocol (MCP) Gateway & Registry. Serves as a central management point for tools, resources, and prompts that can be accessed by MCP-compatible LLM applications. Converts REST API endpoints to MCP, composes virtual MCP servers with added security and observability, and converts between protocols (stdio, SSE, Streamable HTTP). Mention #pydata in the application. |
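Underneath any MCP SDK or gateway, MCP messages are JSON-RPC 2.0, which is why ContextForge can bridge transports like stdio, SSE, and Streamable HTTP: the payload is the same either way. The sketch below builds a `tools/call` request with the standard-library `json` module; the tool name and arguments are invented for illustration, and real servers add capability negotiation around this.

```python
import json

def tool_call(req_id: int, name: str, arguments: dict) -> str:
    """Serialise an MCP tools/call request (MCP messages are JSON-RPC 2.0)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

# The same bytes travel over stdio, SSE, or Streamable HTTP:
msg = tool_call(1, "get_weather", {"city": "Dublin"})
print(msg)
```

Because the envelope is transport-independent, a gateway can accept the request on one protocol and forward it on another without touching its contents.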
Building AI Interoperability with MCP and ContextForge
|
|
RSVPs are on Eventbrite - Hands-On : MCP (Model Context Protocol) Bootcamp
2025-09-27 · 13:00
Hands-On : MCP (Model Context Protocol) Bootcamp Date: 27th September 2025, 9 AM to 12.30 PM Eastern Time Level: Beginners/Intermediate Registration Link: https://www.eventbrite.com/e/hands-on-mcp-model-context-protocol-bootcamp-tickets-1583073859529?aff=oddtdtcreator Who Should Attend? This hands-on workshop is for developers, senior software engineers, IT pros, architects, IT managers, citizen developers, product managers, IT leaders, enterprise architects, chief analytics officers, CIOs, CTOs, and other decision-makers who want to learn how to seamlessly integrate AI applications and agents into Azure AI Foundry and Microsoft Copilot Studio using the Model Context Protocol (MCP). Experience with C#, Python, or JavaScript is helpful but not required—no prior AI knowledge needed. And while this isn’t a data & analytics-focused session, data scientists, data stewards, and tech-savvy data protection officers will also find it super valuable. Description: MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C gives you a universal way to connect devices to different peripherals and accessories, MCP gives AI models a standardized way to connect to data sources and tools. Why MCP? MCP makes it easier to build agents and complex workflows on top of LLMs. Since LLMs often need to integrate with data and tools, MCP provides:
In this half-day, hands-on virtual workshop, Microsoft AI and Business Applications MVP and Microsoft Certified Trainer Prashant G Bhoyar will show you how to use MCP with Azure AI Foundry and Copilot Studio to build powerful AI Agents. Here’s what we’ll cover in detail:
By the end of this bootcamp, you'll be ready to use MCP to create enterprise-grade agents. The labs will feature a mix of Python, C#, and low-code/no-code UI tools—so even if you don't want to write code, you're covered. Workshop Resources: You’ll get access to Microsoft Copilot, Azure, and Azure OpenAI services (a $500 value) for the hands-on labs. If you already have your own Microsoft Copilot or Azure subscription, you can use that instead. Attendee Workstation Requirements Bring your own computer (Windows or Mac) with:
|
RSVPs are on Eventbrite - Hands-On : MCP (Model Context Protocol) Bootcamp
|
|
Real-Time Context Engineering for LLMs
2025-09-26 · 12:55
Context engineering has replaced prompt engineering as the main challenge in building agents and LLM applications. Context engineering involves providing LLMs with relevant and timely context data from various data sources, which allows them to make context-aware decisions. The context data provided to the LLM must be produced in real time to enable it to react intelligently at human-perceivable latencies (a second or two at most); if the application takes longer to react, humans perceive it as laggy and unintelligent. In this talk, we will introduce context engineering and make the case for real-time context engineering in interactive applications. We will also demonstrate how to integrate real-time context data from applications into Python agents using the Hopsworks feature store and corresponding application IDs. Application IDs are the key to unlocking application context data for agents and LLMs. We will walk through an example of an interactive application (a TikTok clone) that we make AI-enabled with Hopsworks. |
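The core loop the talk describes — look up fresh features by application ID and render them as context for the LLM — can be sketched with a toy in-memory store. This is an illustration of the pattern only; Hopsworks' actual feature-store API differs, and the feature names here are made up. The freshness check reflects the talk's latency argument: stale context is worse than no context.

```python
import time

# Toy in-memory stand-in for a feature store, keyed by application ID.
FEATURES: dict[str, dict] = {}

def write_features(app_id: str, features: dict) -> None:
    """Application-side write path: record features with a timestamp."""
    FEATURES[app_id] = {"ts": time.time(), **features}

def build_context(app_id: str, max_age_s: float = 2.0) -> str:
    """Agent-side read path: fetch features for an application ID and
    render them as prompt context, rejecting rows older than max_age_s
    so the agent only ever reasons over real-time data."""
    row = FEATURES.get(app_id)
    if row is None or time.time() - row["ts"] > max_age_s:
        return "no fresh context"
    facts = ", ".join(f"{k}={v}" for k, v in row.items() if k != "ts")
    return f"User context for {app_id}: {facts}"

write_features("session-42", {"last_video": "cats", "watch_secs": 11})
print(build_context("session-42"))
```

In the TikTok-clone example, the application ID is what joins the interactive session to its feature rows, which is why the talk calls it the key that unlocks application context for agents.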
PyData Amsterdam 2025 |
|
From RAG to Relational: How Agentic Patterns Are Reshaping Data Architecture
2025-09-18 · 00:24
Marc Brooker
– VP and Distinguished Engineer
@ AWS
,
Tobias Macey
– host
Summary In this episode of the Data Engineering Podcast, Marc Brooker, VP and Distinguished Engineer at AWS, talks about how agentic workflows are transforming database usage and infrastructure design. He discusses the evolving role of data in AI systems, from traditional models to more modern approaches like vectors, RAG, and relational databases. Marc explains why agents require serverless, elastic, and operationally simple databases, and how AWS solutions like Aurora and DSQL address these needs with features such as rapid provisioning, automated patching, geodistribution, and spiky usage. The conversation covers topics including tool calling, improved model capabilities, state in agents versus stateless LLM calls, and the role of Lambda and AgentCore for long-running, session-isolated agents. Marc also touches on the shift from local MCP tools to secure, remote endpoints, the rise of object storage as a durable backplane, and the need for better identity and authorization models. The episode highlights real-world patterns like agent-driven SQL fuzzing and plan analysis, while identifying gaps in simplifying data access, hardening ops for autonomous systems, and evolving serverless database ergonomics to keep pace with agentic development. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed: flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute.
Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI engineering, streaming: Prefect runs it all, from ingestion to activation, in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect. Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details. Your host is Tobias Macey, and today I'm interviewing Marc Brooker about the impact of agentic workflows on database usage patterns and how they change the architectural requirements for databases. Interview Introduction. How did you get involved in the area of data management? Can you describe what the role of the database is in agentic workflows? There are numerous types of databases, with relational being the most prevalent. How does the type and purpose of an agent inform the type of database that should be used? Anecdotally I have heard about how agentic workloads have become the predominant "customers" of services like Neon and Fly.io. How would you characterize the different patterns of scale for agentic AI applications (e.g. proliferation of agents, monolithic agents, multi-agent, etc.)?
What are some of the most significant impacts on workload and access patterns for data storage and retrieval that agents introduce? What are the categorical differences in that behavior as compared to programmatic/automated systems? You have spent a substantial amount of time on Lambda at AWS. Given that LLMs are effectively stateless, how does the added ephemerality of serverless functions impact design and performance considerations around having to "re-hydrate" context when interacting with agents? What are the most interesting, innovative, or unexpected ways that you have seen serverless and database systems used for agentic workloads? What are the most interesting, unexpected, or challenging lessons that you have learned while working on technologies that are supporting agentic applications? Contact Info: Blog, LinkedIn. Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements: Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. Links: AWS Aurora DSQL, AWS Lambda, Three Tier Architecture, Vector Database, Graph Database, Relational Database, Vector Embedding, RAG (Retrieval Augmented Generation), AI Engineering Podcast Episode, GraphRAG, AI Engineering Podcast Episode, LLM Tool Calling, MCP (Model Context Protocol), A2A (Agent 2 Agent Protocol), AWS Bedrock AgentCore, Strands, LangChain, Kiro. The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
Data Engineering Podcast |
|
Navigating healthcare scientific knowledge: building AI agents for accurate biomedical data retrieval
2025-09-02 · 13:00
With a focus on healthcare applications where accuracy is non-negotiable, this talk highlights challenges and delivers practical insights on building AI agents that query complex biological and scientific data to answer sophisticated questions. Drawing from our experience developing Owkin-K Navigator, a free-to-use AI co-pilot for biological research, I'll share hard-won lessons about combining natural language processing with SQL querying and vector database retrieval to navigate large biomedical knowledge sources, addressing the challenges of preventing hallucinations and ensuring proper source attribution. This session is ideal for data scientists, ML engineers, and anyone interested in applying Python and the LLM ecosystem to the healthcare domain. |
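Two of the talk's themes — routing questions between SQL and vector retrieval, and refusing to answer without a source — can be sketched as plain functions. This is a deliberately crude illustration, not Owkin's implementation: real systems typically let an LLM classify the question, and the keyword list and source format here are invented.

```python
def route(question: str) -> str:
    """Send quantitative phrasing to SQL, everything else to vector search."""
    quantitative = ("how many", "count", "average", "list all")
    q = question.lower()
    return "sql" if any(kw in q for kw in quantitative) else "vector"

def answer(question: str, hits: list[dict]) -> str:
    """Compose an answer that keeps source attribution, and refuse when
    retrieval comes back empty rather than let the model guess — the
    anti-hallucination stance the talk argues for."""
    if not hits:
        return "No supporting source found; declining to answer."
    cites = "; ".join(h["source"] for h in hits)
    return f"{hits[0]['text']} [sources: {cites}]"

print(route("How many trials target EGFR?"))
print(answer("What is EGFR?", [{"text": "EGFR is a receptor.", "source": "doc-17"}]))
```

Declining on empty retrieval trades recall for precision, which is the right trade when, as the abstract puts it, accuracy is non-negotiable.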
PyData Berlin 2025 |
|
Build Agentic Assistants with OpenAI Function Calling: Part 2
2025-07-30 · 10:00
How to build, refactor, and extend your own agents - Alexey Grigorev Join us for a hands-on walkthrough of building a chat assistant powered by OpenAI’s function calling, led by Alexey Grigorev. This live session focuses on understanding the code behind agent-like assistants. During a previous workshop, Alexey demoed how to build such an assistant quickly. This time, we’ll slow down and go deeper, explaining the code line by line and refactoring it into a reusable library. We will also go over the MCP protocol and create a simple MCP client from scratch. By the end, you’ll better understand how this assistant works and gain a solid foundation for extending the same setup in your own projects. What You'll Learn
It will be a live demo with practical tips and a chance to ask your questions. Thinking About LLM Zoomcamp? This workshop is part of the projects we build in LLM Zoomcamp, a free online course about real-life applications of LLMs. In 10 weeks, you will learn how to build an AI system that answers questions about your knowledge base. The course is now live. You can join it by registering here. About the Speaker Alexey Grigorev is the Founder of DataTalks.Club and creator of the Zoomcamp series. Alexey is a seasoned software and ML engineer with over 10 years in engineering and 6+ years in machine learning. He has deployed large-scale ML systems at companies like OLX Group and Simplaex, authored several technical books including Machine Learning Bookcamp, and is a Kaggle Master with a 1st place finish in the NIPS'17 Criteo Challenge. Join our slack: https://datatalks.club/slack.html |
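The heart of a function-calling assistant like the one in this workshop is a dispatch loop: you declare tool schemas to the model, and when the model asks for a call, you execute the matching Python function and feed the result back. Below is a minimal sketch of that plumbing with a made-up `get_weather` tool; the schema follows the shape OpenAI's chat API expects, but the simulated tool-call dict stands in for what would come from a real API response.

```python
import json

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stand-in for a real lookup

# Registry mapping tool names to callables.
TOOLS = {"get_weather": get_weather}

# Schema advertised to the model so it knows the tool exists.
TOOL_SPECS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch(tool_call: dict) -> str:
    """Execute the function the model requested; the return value would
    be sent back to the model as a role='tool' message."""
    fn = TOOLS[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return fn(**args)

# Simulate the model requesting a call (in reality this comes from the API response).
print(dispatch({"name": "get_weather", "arguments": '{"city": "Berlin"}'}))
```

Refactoring this into a reusable library, as the session does, mostly means generalizing the registry and wrapping the request/response loop around `dispatch`.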
Build Agentic Assistants with OpenAI Function Calling: Part 2
|
|
Hugo Bowne-Anderson
– data scientist and educator
@ Outerbounds
Large language models (LLMs) enable powerful data-driven applications, but many projects get stuck in “proof-of-concept purgatory”—where flashy demos fail to translate into reliable, production-ready software. This talk introduces the LLM software development lifecycle (SDLC)—a structured approach to moving beyond early-stage prototypes. Using first principles from software engineering, observability, and iterative evaluation, we’ll cover common pitfalls, techniques for structured output extraction, and methods for improving reliability in real-world data applications. Attendees will leave with concrete strategies for integrating AI into scientific Python workflows—ensuring LLMs generate value beyond the prototype stage. |
SciPy 2025
|
|
Ravindranatha Anthapu
– author
,
Siddhant Agarwal
– author
Dive into building applications that combine the power of Large Language Models (LLMs) with Neo4j knowledge graphs, Haystack, and Spring AI to deliver intelligent, data-driven recommendations and search outcomes. This book provides actionable insights and techniques to create scalable, robust solutions by leveraging the best-in-class frameworks and a real-world project-oriented approach. What this Book will help me do Understand how to use Neo4j to build knowledge graphs integrated with LLMs for enhanced data insights. Develop skills in creating intelligent search functionalities by combining Haystack and vector-based graph techniques. Learn to design and implement recommendation systems using LangChain4j and Spring AI frameworks. Acquire the ability to optimize graph data architectures for LLM-driven applications. Gain proficiency in deploying and managing applications on platforms like Google Cloud for scalability. Author(s) Ravindranatha Anthapu, a Principal Consultant at Neo4j, and Siddhant Agarwal, a Google Developer Expert in Generative AI, bring together their vast experience to offer practical implementations and cutting-edge techniques in this book. Their combined expertise in Neo4j, graph technology, and real-world AI applications makes them authoritative voices in the field. Who is it for? Designed for database developers and data scientists, this book caters to professionals aiming to leverage the transformational capabilities of knowledge graphs alongside LLMs. Readers should have a working knowledge of Python and Java as well as familiarity with Neo4j and the Cypher query language. If you're looking to enhance search or recommendation functionalities through state-of-the-art AI integrations, this book is for you. |
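The knowledge-graph retrieval the book builds on can be pictured without a database: collect facts reachable from a seed entity and hand them to the LLM as grounding. The sketch below uses a toy adjacency list in place of Neo4j, with a hypothetical movie graph; the Cypher string shows the equivalent traversal the Neo4j driver would run, and is illustrative only.

```python
# Toy adjacency-list knowledge graph; in the book this lives in Neo4j.
GRAPH = {
    "Alice": [("RATED", "Inception")],
    "Inception": [("DIRECTED_BY", "Nolan"), ("GENRE", "sci-fi")],
}

def neighbours(node: str, depth: int = 2) -> set[str]:
    """Collect relationship facts reachable from a seed entity,
    to be serialised into the LLM prompt as grounding context."""
    seen, frontier = set(), [node]
    for _ in range(depth):
        nxt = []
        for n in frontier:
            for rel, other in GRAPH.get(n, []):
                seen.add(f"{n} {rel} {other}")
                nxt.append(other)
        frontier = nxt
    return seen

# The equivalent idea expressed in Cypher (illustrative):
CYPHER = "MATCH (p:Person {name: $name})-[r*1..2]->(m) RETURN p, r, m"

print(sorted(neighbours("Alice")))
```

Feeding relationship triples rather than raw documents is what lets the recommendation systems in the book reason over connections ("Alice rated a film directed by Nolan") instead of isolated text.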
O'Reilly Data Engineering Books
|
|
#300 End to End AI Application Development with Maxime Labonne, Head of Post-training at Liquid AI & Paul-Emil Iusztin, Founder at Decoding ML
2025-05-05 · 10:00
Maxime Labonne
– Senior Staff Machine Learning Scientist, Head of Post-training
@ Liquid AI
,
Richie
– host
@ DataCamp
,
Paul-Emil Iusztin
– Founder
@ Decoding ML
The roles within AI engineering are as diverse as the challenges they tackle. From integrating models into larger systems to ensuring data quality, the day-to-day work of AI professionals is anything but routine. How do you navigate the complexities of deploying AI applications? What are the key steps from prototype to production? For those looking to refine their processes, understanding the full lifecycle of AI development is essential. Let's delve into the intricacies of AI engineering and the strategies that lead to successful implementation. Maxime Labonne is a Senior Staff Machine Learning Scientist at Liquid AI, serving as the head of post-training. He holds a Ph.D. in Machine Learning from the Polytechnic Institute of Paris and is recognized as a Google Developer Expert in AI/ML. An active blogger, he has made significant contributions to the open-source community, including the LLM Course on GitHub, tools such as LLM AutoEval, and several state-of-the-art models like NeuralBeagle and Phixtral. He is the author of the best-selling book “Hands-On Graph Neural Networks Using Python,” published by Packt. Paul-Emil Iusztin designs and implements modular, scalable, and production-ready ML systems for startups worldwide. He has extensive experience putting AI and generative AI into production. Previously, Paul was a Senior Machine Learning Engineer at Metaphysic.ai and a Machine Learning Lead at Core.ai. He is a co-author of The LLM Engineer's Handbook, a best seller in the GenAI space. In the episode, Richie, Maxime, and Paul explore misconceptions in AI application development, the intricacies of fine-tuning versus few-shot prompting, the limitations of current frameworks, the roles of AI engineers, the importance of planning and evaluation, the challenges of deployment, and the future of AI integration, and much more. 
Links Mentioned in the Show: Maxime’s LLM Course on HuggingFace; Maxime and Paul’s Code Alongs on DataCamp; Decoding ML on Substack; Connect with Maxime and Paul; Skill Track: AI Fundamentals; Related Episode: Building Multi-Modal AI Applications with Russ d'Sa, CEO & Co-founder of LiveKit; Rewatch sessions from RADAR: Skills Edition. New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business |
DataFramed |
|
🎮 Discord Hosted Graph Workshop : Building Intelligent AI with Neo4j & RAG
2025-04-16 · 16:00
RSVP HERE: RSVPs on Meetup.com alone will not provide you workshop access. ✔️ Pre-Workshop Checklist
Beyond GenAI: Building Intelligent AI with Neo4j & RAG This 2-hour workshop is for anyone ready to move beyond GenAI demos and start building real, grounded AI applications. You’ll learn how to use Neo4j as a retrieval engine for large language models, bringing structure, context, and reasoning into your AI workflows. Using Python (please have Python installed) and the `neo4j-graphrag` library, we’ll walk through how to build Retrieval-Augmented Generation (RAG) pipelines that pull the right data from the graph at the right time. You'll explore concrete retrieval strategies, like vector search, hybrid search, and graph-native patterns, that reduce hallucinations and improve relevance. We’ll close by adding an agentic layer, where AI agents use Neo4j not just to look up facts, but to reason over connected data. By the end, you’ll know how to combine LLMs and knowledge graphs into intelligent, explainable systems you can actually trust. |
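Of the retrieval strategies the workshop names, hybrid search is the easiest to show in miniature: blend a vector-similarity score with a keyword-overlap score so that exact-term matches can rescue documents an embedding misses. This sketch uses hand-made two-dimensional vectors and a simple blend weight; it illustrates the scoring idea only, not the `neo4j-graphrag` API.

```python
from math import sqrt

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def hybrid_score(q_vec, q_text, doc_vec, doc_text, alpha=0.7):
    """Blend vector similarity with keyword overlap — the idea behind
    hybrid search. alpha weights the semantic signal; (1 - alpha)
    weights the lexical one."""
    kw = len(set(q_text.lower().split()) & set(doc_text.lower().split()))
    kw_norm = kw / max(len(q_text.split()), 1)
    return alpha * cosine(q_vec, doc_vec) + (1 - alpha) * kw_norm

s = hybrid_score([1.0, 0.0], "graph rag", [1.0, 0.0], "graph rag pipelines")
print(round(s, 3))
```

In a graph database the lexical half is typically a full-text index and the semantic half a vector index, with the blend ranking candidates before any graph-native expansion.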
🎮 Discord Hosted Graph Workshop : Building Intelligent AI with Neo4j & RAG
|
|
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
2025-03-27 · 16:00
Overview When building machine learning and data applications, a significant portion of your time will be dedicated to data wrangling, from content extraction to filtering out problematic and low-quality data. In this hands-on session we will explore Data Prep Kit, an open-source toolkit designed to streamline these essential tasks. Attendees will learn firsthand how to use Data Prep Kit to improve overall data quality, such as removing spam and low-quality documents, removing HAP (Hate, Abuse, Profanity) speech, and removing PII (Personally Identifiable Information) data, leading to a higher-quality dataset. Description Join us for an interactive, hands-on session where you will learn to clean up data and prepare high-quality datasets. In this workshop we will do the following:
More about Data Prep Kit: https://github.com/IBM/data-prep-kit What do you need to participate in this workshop?
Session Type Workshop (hands-on) Audience LLM app developers, data scientists, data engineers Technical Level Beginner - Intermediate Prerequisites
Duration 60 mins Industry Cross industry Speaker Bio https://sujee.dev/bio About the AI Alliance The AI Alliance is an international community of researchers, developers, and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape to accelerate progress, improve safety, security, and trust in AI, and maximize benefits to people and society everywhere. Members of the AI Alliance believe that open innovation is essential to develop and achieve safe and responsible AI that benefits society rather than a select few big players. |
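The filtering steps the workshop covers — dropping low-quality documents and scrubbing PII — can be sketched as two small functions over a list of strings. This is a toy illustration of the idea, not Data Prep Kit's API: the email regex covers only one narrow slice of PII, and the quality heuristics (minimum length, all-caps ratio) are invented for the example.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def scrub_pii(doc: str) -> str:
    """Replace email addresses with a placeholder (one narrow slice of PII removal)."""
    return EMAIL.sub("[EMAIL]", doc)

def quality_ok(doc: str, min_words: int = 5, max_caps_ratio: float = 0.5) -> bool:
    """Drop very short or shouty documents as a crude low-quality/spam filter."""
    words = doc.split()
    if len(words) < min_words:
        return False
    caps = sum(w.isupper() for w in words)
    return caps / len(words) <= max_caps_ratio

docs = [
    "BUY NOW!!! CLICK HERE NOW FREE",
    "Contact me at jane@example.com about the dataset results.",
]
clean = [scrub_pii(d) for d in docs if quality_ok(d)]
print(clean)
```

Toolkits like Data Prep Kit generalize exactly this shape: a pipeline of per-document transforms and filters, run at scale instead of over an in-memory list.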
[AI Alliance] Workshop: Preparing High Quality Datasets with Data Prep Kit
|
|
PyLadies Paris Python Talks #19
2025-03-25 · 17:30
Dear PyLadies 💚🐍 Our next on-site event is coming on the 25th of March, featuring 𓆙 Anna Astori from Cigna Group and Julia Wabant, and continuing with ⚡ lightning talks, where you can take 3 minutes to talk about anything Python or tech related (more below).
🌟 Agenda (preliminary)
18h30 - 18h45 Come and take your seat
18h45 - 19h00 Welcome by PyLadies Paris and Octo Technology
19h00 - 19h30 Building an LLM-backed chatbot with Chainlit by Anna Astori
19h30 - 20h00 Rustyfing Python: A Synergistic Approach to High Performance by Julia Wabant
20h00 - 20h20 Lightning talks
20h20 - 22h00 Snacks and networking
🌟 Anna Astori from Cigna Group Talk Title: Building an LLM-backed chatbot with Chainlit Abstract: If you’re interested in Generative AI, building a chatbot application sounds exciting, right? But creating the web UI can quickly become overwhelming, especially if you don't have much experience. That’s where Chainlit comes to the rescue! In this talk, you’ll see how you can quickly build fairly sophisticated interactive chatbots and integrate generative AI models using Chainlit. We’ll also cover features such as testing and debugging, streaming responses, and more advanced backend settings. About Anna: Anna Astori is a Software Engineer. She is also a co-organizer of PyLadies Boston, a Women Techmakers Ambassador, and formerly the Director of Women Who Code Boston.
🌟 Julia Wabant Talk Title: Rustyfing Python: A Synergistic Approach to High Performance Abstract: This talk explores the synergistic combination of Python and Rust for high-performance software development. While Python's dynamic typing and interpreted nature excel at rapid prototyping and development, its performance limitations become apparent in complex applications. Rust, with its static typing and compilation to native code, provides a powerful solution for optimizing computationally demanding tasks. By leveraging Rust's native performance within Python projects, we achieve a compelling balance: the flexibility and ease of development offered by Python coupled with the raw speed and efficiency of Rust. This talk showcases practical examples of integrating Rust into Python projects, demonstrating how to build a single package and ultimately achieve streamlined development, deployment, and execution. About Julia: Julia Wabant is a seasoned developer with 7 years of experience building and deploying Machine Learning solutions across various industries. She has a deep understanding of the entire ML lifecycle, from data preparation to model training, deployment, and monitoring. Beyond her industry experience, Julia is also a passionate educator, having spent significant time over the last 4 years teaching programming, mathematics, data science, and machine learning to students, developers, and executives. As a Google Cloud and AWS certified Machine Learning specialist, Julia is well-versed in both cloud platforms and their role in powering modern ML applications.
Get ready for lightning talks: Many of you told us that you would like to give a talk, but your project is not mature enough. You no longer have to worry about that. Come and practice your public speaking during the 3-minute time slot. Some ideas on what you can talk about:
You can decide any time before the lightning talks start, or you may want to prepare up to one slide (in PDF format), which you can send to us by the 11th of March at the latest at [email protected]. Octo Technology will be our host and the sponsor of the food and drinks during the networking session after the talks: thank you 💚, and special thanks to Loic from Octo for all the support.
Important info
1. ❗ For safety reasons, the venue's staff will check everyone's identity on site. 📝 Please remember to bring an ID with you and register for the event with your real first name and family name. Thank you!
2. Please be on time. We can’t guarantee a seat once the meetup has started.
🔍 FAQ Q. I'm not female, is it ok for me to attend? A. Yes, PyLadies Paris events are open to everyone at all levels. |
PyLadies Paris Python Talks #19
|