talk-data.com
Activities & events
Paris NLP saison 8 Meetup #3
2025-03-05 · 18:00
📍 GitGuardian office, 12 rue d'Aboukir, 75002 · 📆 March 5th, 7:00 p.m.

👥 Johan Leduc, Senior ML Engineer @ GitGuardian
➡️ Uncovering Critical Secrets with LLMs
Summary: GitGuardian detects secrets, such as passwords and API keys, in code, but the sheer volume can overwhelm users. Sifting through them to find the most critical ones is like searching for a needle in a haystack. In this talk, we'll dive into how we leveraged LLMs at key stages to prioritize secrets efficiently and at scale.

👥 Louis Leconte, ML Research Engineer @ Pruna AI
➡️ How to quantize an LLM in 3 lines of code
Summary: Large Language Models (LLMs) are powerful but often computationally expensive, making deployment challenging. In this talk, we'll explore how to quantize an LLM in just three lines of code using Pruna AI's frictionless solution. I'll introduce our data-free vector quantization approach, which optimizes CUDA kernels to enable efficient inference, all in under five minutes. Whether you're working on edge AI, server-side deployments, or simply curious about making LLMs more efficient, this session will give you a hands-on glimpse into state-of-the-art quantization techniques.
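The abstract does not show Pruna AI's actual three-line API. As background only, here is a minimal sketch of the general idea behind weight quantization: symmetric round-to-nearest int8 quantization with one scale per row, in NumPy. This is plain scalar quantization, simpler than the data-free vector quantization the talk describes, and `quantize_int8`/`dequantize` are illustrative names, not Pruna AI's API.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-row int8 quantization: map floats onto [-127, 127]."""
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0  # one scale per row
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes and per-row scales."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)  # stand-in weight matrix
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# int8 storage is 4x smaller than float32; rounding error is at most half a
# quantization step per element
print(np.abs(w - w_hat).max())
```

Real LLM quantizers add refinements on top of this (per-group scales, outlier handling, or, as in the talk, vector codebooks), but the compress/decompress round trip is the same shape.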
Paris NLP saison 8 Meetup #3
2024-03-06 · 18:00
Sergei Bogdanov, Data Scientist @ NuMind
Title: NuNER & NuSentiment: Creating efficient Foundation Models thanks to LLMs
Summary: How do you create small, data-efficient foundation models that are on par with 7B LLMs? In this talk we will discuss how we created NuNER & NuSentiment, 100M-parameter foundation models that outperform existing similar-sized models in few-shot classification and entity recognition.

Raphaël Bournhonesque, Machine Learning Engineer @ Open Food Facts
Title: Extracting ingredients from photos of food packaging: from LLM-augmented annotation to production
Summary: Raphaël from Open Food Facts will present their latest machine learning project: the automatic extraction of ingredient lists from photos of food packaging. He will share his experience of using LLMs to pre-annotate data and how this model was integrated in production.
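The abstract contains no code; as a hedged illustration of the few-shot classification setting it mentions, here is a toy nearest-centroid classifier over text embeddings. The bag-of-words `embed` below is a deliberately crude stand-in for a NuNER/NuSentiment-style learned encoder, and all names and example data are invented for this sketch.

```python
import numpy as np

def build_vocab(texts):
    """Index every token seen in the few-shot examples."""
    vocab = {}
    for text in texts:
        for tok in text.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def embed(text, vocab):
    """Toy bag-of-words vector; a real system would use a learned encoder."""
    v = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            v[vocab[tok]] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

few_shot = {  # two labeled examples per class: the "few-shot" training set
    "positive": ["great product , loved it", "wonderful experience overall"],
    "negative": ["terrible quality , very disappointed", "awful , do not buy"],
}
vocab = build_vocab(t for texts in few_shot.values() for t in texts)
centroids = {label: np.mean([embed(t, vocab) for t in texts], axis=0)
             for label, texts in few_shot.items()}

def predict(text):
    """Assign the label whose class centroid is most similar to the text."""
    e = embed(text, vocab)
    return max(centroids, key=lambda label: float(e @ centroids[label]))

print(predict("loved the experience"))  # prints "positive"
```

With a strong pretrained encoder in place of `embed`, this kind of centroid or prototype scheme is one common way a small foundation model is evaluated in few-shot classification.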
Paris NLP saison 8 Meetup #1
2023-10-25 · 17:00
This event is in-person only and will be followed by a networking apéro. We are looking forward to seeing you all in person!

Florent Gbelidji @ Hugging Face
Title: Customizing RAG System Components to Build a Domain-Specific Assistant
Summary: Retrieval Augmented Generation (RAG) has become a prevalent approach to developing Large Language Model (LLM) applications, incorporating industry-specific data and the most recent information. In this session, we'll delve into the mechanisms of RAG applications, focusing on key components such as the retriever and the LLM. Our exploration will include leveraging tools from the open-source ecosystem to fine-tune these components, enhancing their performance in providing assistance, especially when confronted with domain-specific questions.

Guillaume Richard and Marie Lopez @ InstaDeep
Title: The Nucleotide Transformer: Building and Evaluating Robust Foundation Models for Human Genomics
Summary: Closing the gap between measurable genetic information and observable traits is a longstanding challenge in genomics. Yet the prediction of molecular phenotypes from DNA sequence alone remains limited and inaccurate, often driven by the scarcity of annotated data and the inability to transfer learnings between prediction tasks. Here, we present an extensive study of foundation models pre-trained on DNA sequences, named the Nucleotide Transformer, ranging from 50M up to 2.5B parameters and integrating information from 3,202 diverse human genomes, as well as 850 genomes selected across diverse phyla, including both model and non-model organisms. These transformer models yield transferable, context-specific representations of nucleotide sequences, which allow for accurate molecular phenotype prediction even in low-data settings. We show that the developed models can be fine-tuned at low cost, even with little available data, to solve a variety of genomics applications. Despite receiving no supervision, the transformer models learned to focus attention on key genomic elements, including those that regulate gene expression, such as enhancers. Lastly, we demonstrate that using model representations can improve the prioritization of functional genetic variants. The training and application of foundation models in genomics explored in this study provide a widely applicable stepping stone toward accurate molecular phenotype prediction from DNA sequence.
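Neither abstract above includes code. As a minimal sketch of the retrieve-then-generate loop that RAG systems are built around, here is a toy retriever that ranks documents by token overlap and assembles a prompt for the generator. The corpus, `retrieve`, and `build_prompt` are invented for illustration; a production stack would use a learned dense retriever and an actual LLM call where the final comment indicates.

```python
import re

def tokens(text):
    """Lowercased word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, corpus, k=1):
    """Rank documents by Jaccard token overlap (stand-in for a dense retriever)."""
    def score(doc):
        q, d = tokens(query), tokens(doc)
        return len(q & d) / len(q | d)
    return sorted(corpus, key=score, reverse=True)[:k]

def build_prompt(query, corpus):
    """Stuff the retrieved context into the generator's prompt."""
    context = "\n".join(retrieve(query, corpus))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

corpus = [
    "GitGuardian is headquartered in Paris.",
    "Retrieval Augmented Generation combines a retriever with a generator.",
    "Python is a popular programming language.",
]
prompt = build_prompt("Where is GitGuardian headquartered?", corpus)
print(prompt)  # this prompt would then be sent to the generator LLM
```

The talk's theme of customizing components maps directly onto this skeleton: swap `retrieve` for a fine-tuned embedding model, or fine-tune the generator that consumes `prompt`.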
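As background to the genomics abstract: DNA language models typically tokenize a sequence into k-mers before feeding it to the transformer. Below is a small sketch of non-overlapping 6-mer tokenization with a single-nucleotide fallback for leftover bases; the function name and the exact scheme are illustrative, not the Nucleotide Transformer's published tokenizer.

```python
def kmer_tokenize(seq, k=6):
    """Split a DNA sequence into non-overlapping k-mers.

    Leftover bases (when len(seq) is not a multiple of k) are emitted as
    single-nucleotide tokens, a common fallback in DNA tokenizers.
    """
    seq = seq.upper()
    n_full = len(seq) // k
    toks = [seq[i * k:(i + 1) * k] for i in range(n_full)]
    toks += list(seq[n_full * k:])  # fallback: one token per leftover base
    return toks

print(kmer_tokenize("ACGTACGTACGTAC"))
# ['ACGTAC', 'GTACGT', 'A', 'C']
```

Each token is then mapped to an embedding and processed like a word in an NLP transformer, which is what lets the pretraining recipe transfer from text to genomes.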