talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
How the Texas Rangers Use a Unified Data Platform to Drive World Class Baseball Analytics
2025-06-11 · 23:10
Michael Topol
– Assistant Director, Baseball R&D
@ Texas Rangers
,
Oliver Dykstra
– Data Engineer
@ Texas Rangers
Don't miss this session where we demonstrate how the Texas Rangers baseball team is staying one step ahead of the competition by going back to the basics. After the Rangers implemented a modern data strategy with Databricks and won the 2023 World Series, the rest of the league quickly followed suit. Now more than ever, data and AI are a central pillar of every baseball team's strategy, driving profound insights into player performance and game dynamics. With a 'fundamentals win games' back-to-the-basics focus, join us as we explain our commitment to world-class data quality, engineering, and MLOps by taking full advantage of the Databricks Data Intelligence Platform. From system tables to federated querying, find out how the Rangers use every tool at their disposal to stay one step ahead in the hyper-competitive world of baseball. |
Data + AI Summit 2025 |
|
PyData Hamburg 17th February 2025 Meetup
2025-02-17 · 17:00
🗺 Hey PyData Hamburg Community 💚, Happy new 2025!! The PyData Hamburg community is hosting an exciting first meetup in 2025! This event is sponsored by Google Hamburg. We are looking forward to an evening of learning and collaboration! (drum rolls & background music please♬ ♭) The talk schedule is as follows: 💫💙The First Speaker - Dr. Tobias Gärtner Linkedin link: https://www.linkedin.com/in/tobias-g%C3%A4rtner-at-google/ The topic: Let Google do the boring stuff: MLOps for lone wolves and world conquerors alike Tired of the endless cycle of model training, deployment, and monitoring? Wish you could focus on the exciting parts of machine learning instead of the tedious infrastructure management? This talk explores how Google Vertex AI can streamline your MLOps workflow, whether you're a solo data scientist or part of a large team. We'll dive into: • Automated model training and deployment: Vertex AI's custom training options and Vizier allow you to build, tune, and deploy high-quality models with minimal code. • Simplified model monitoring and management: Track model performance, identify drift and retrain models with ease using Vertex AI's built-in tools. • Scalable infrastructure: Vertex AI provides the infrastructure you need to run your ML workloads, from data preprocessing to model serving, without worrying about scaling or resource management. Learn how Vertex AI can free you from the "boring stuff" and empower you to focus on what matters: building innovative data products that drive real-world impact. 💫💙 The Second Speaker - Dr. Alexander Meier Linkedin link: https://www.linkedin.com/in/reiemrednaxela?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=android_app The topic: Mind the Gap: Forecasting Demand from Censored Sales Data This talk explores how to handle demand forecasting when the historical sales data is censored by inventory or supply constraints. 
We will cover different modeling techniques, show how to evaluate them properly and walk through a simulation study that compares various approaches. ⭐️Agenda⭐️: • 18:00 - Open Doors • 18:20 - Short Intro • 18:30 -19:00 First Talk(Dr. Tobias Gärtner) and Questions • 19:00 - 19:30 Break -Networking & food/snacks • 19:30 - 20:00 Second Talk(Dr. Alexander Meier) and Questions • 20:00 - 20:45 Networking & food/snacks |
PyData Hamburg 17th February 2025 Meetup
|
|
Evolving Responsibilities in AI Data Management
2025-02-16 · 16:09
Bartosz Mikulski
– guest
,
Tobias Macey
– host
Summary
In this episode of the Data Engineering Podcast Bartosz Mikulski talks about preparing data for AI applications. Bartosz shares his journey from data engineering to MLOps and emphasizes the importance of data testing over software development in AI contexts. He discusses the types of data assets required for AI applications, including extensive test datasets, especially in generative AI, and explains the differences in data requirements for various AI application styles. The conversation also explores the skills data engineers need to transition into AI, such as familiarity with vector databases and new data modeling strategies, and highlights the challenges of evolving AI applications, including frequent reprocessing of data when changing chunking strategies or embedding models.
Announcements
• Hello and welcome to the Data Engineering Podcast, the show about modern data management.
• Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
• Your host is Tobias Macey and today I'm interviewing Bartosz Mikulski about how to prepare data for use in AI applications.
Interview
• Introduction
• How did you get involved in the area of data management?
• Can you start by outlining some of the main categories of data assets that are needed for AI applications?
• How does the nature of the application change those requirements? (e.g. RAG app vs. agent, etc.)
• How do the different assets map to the stages of the application lifecycle?
• What are some of the common roles and divisions of responsibility that you see in the construction and operation of a "typical" AI application?
• For data engineers who are used to data warehousing/BI, what are the skills that map to AI apps?
• What are some of the data modeling patterns that are needed to support AI apps? (chunking strategies, metadata management)
• What are the new categories of data that data engineers need to manage in the context of AI applications? (agent memory generation/evolution, conversation history management, data collection for fine tuning)
• What are some of the notable evolutions in the space of AI applications and their patterns that have happened in the past ~1-2 years that relate to the responsibilities of data engineers?
• What are some of the skills gaps that teams should be aware of and identify training opportunities for?
• What are the most interesting, innovative, or unexpected ways that you have seen data teams address the needs of AI applications?
• What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI applications and their reliance on data?
• What are some of the emerging trends that you are paying particular attention to?
Contact Info
• Website
• LinkedIn
Parting Question
• From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
• Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
• Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
• If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
Links
• Spark
• Ray
• Chunking Strategies
• Hypothetical document embeddings
• Model Fine Tuning
• Prompt Compression
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
Data Engineering Podcast |
|
52. Paris Women in Machine Learning & Data Science @Probabl
2025-02-11 · 18:00
The Women in Machine Learning & Data Science (WiMLDS) Meetup aims to inspire, educate, regardless of gender, and support women and gender minorities in the field. We are back for our 52nd edition to kick off 2025! All genders may attend our meetups. Agenda 18:50 - Arrival to go up to floor 27 of the Montparnasse Tower, with ID check --- 19:00 - Launch of the evening by Paris WiMLDS meetup team & Probabl --- 19:15 - Decision by AI in medical context, by Christel Gérardin, MD, PhD, Specialised in Internal Medicine and Automatic Language Processing, Assistant Chief of Clinic. Abstract: There are now a large number of solutions based on artificial intelligence algorithms in the healthcare sector. However, their implementation in clinical practice is not always guaranteed. The aim of the presentation will be to give several examples of AI algorithms that have been co-constructed from the design phase by multi-disciplinary teams (specialist clinicians, data engineers, developers), and whose performance has been evaluated throughout the development phases. 19:45 - Learning common structures in a collection of networks. An application to food webs, by Sophie Donnet, Directrice de Recherche (Senior researcher) INRAE, Unité MIA Paris Saclay Abstract: Studying networks in ecology helps uncover the structural patterns underlying ecosystem interactions, shedding light on species roles, resilience, and the organization of biodiversity. In this work with Pierre Barbillon and Saint Clair Chabert Liddell, we analyze collections of networks to identify shared structural patterns and cluster them into homogeneous groups. Using a probabilistic approach (Stochastic Block Model (SBM)) and a variational EM algorithm, we capture common connectivity structures and classify networks effectively. This approach, validated on ecological data, reveals structural homogeneity and key mesoscale patterns across diverse ecosystems. 
Based on the work https://projecteuclid.org/journals/annals-of-applied-statistics/volume-18/issue-2/Learning-common-structures-in-a-collection-of-networks-An-application/10.1214/23-AOAS1831.short 20:15 - Data in the shadows: How Open-Source Intelligence and Malicious AI can fuel Cybercrime, by Noor Bhatnagar, Senior Cybersecurity Analyst, EMEA @cybelangel Abstract: Data is everywhere and we interact with it on a regular basis. It is collected, analyzed, and shared at an unprecedented scale. But as much as this data powers insights, it also creates new vulnerabilities. For data scientists and anyone handling data, the challenge is not only about building smarter models but also navigating the fine line between secure data use and inadvertent risk. As open-source intelligence (OSINT) becomes more accessible, cybercriminals are finding new ways to exploit publicly available data for malicious purposes. Additionally, the rise of AI-driven tools like WormGPT has given bad actors the power to automate attacks at scale. We will explore two critical ways in which data practices, both innocent and intentional, can contribute to the dark web’s ever-growing ecosystem. --- 20:45 - Cocktail & Networking The cocktail is sponsored by Probabl. --- After the meet-up, a summary will be available on our Medium page: https://wimlds-paris.medium.com/ --- Code of Conduct WiMLDS & MLOps are dedicated to providing a harassment-free experience for everyone. We do not tolerate harassment of participants in any form. All communication should be appropriate for a professional audience including people of many different backgrounds. Sexual language and imagery is not appropriate. Be kind to others. Do not insult or put down others. Behave professionally. Remember that harassment and sexist, racist, or exclusionary jokes are not appropriate. Thank you for helping make this a welcoming, friendly community for all. 
All attendees should read the full Code of Conduct before participating: https://github.com/WiMLDS/starter-kit/wiki/Code-of-conduct |
52. Paris Women in Machine Learning & Data Science @Probabl
|
|
PyBerlin 51 - AI Night with a Panel Session
2025-02-05 · 17:30
We kick off 2025 with a great panel about AI and running AI in production! Join us for an expert panel where we will talk about AI, LLMs, MLOps and continuous experimentation. Agenda: • 18:00 - Opening doors of the venue • 18:30 - Welcome to PyBerlin! // Organisers • 18:40 - Welcome from the host - Aleph Alpha Moderator: Christian Barra Panelists: Ceyhun Derinbogaz, Jacek Golebiowski, Matheus Veleci Christian is a software engineer and co-founder of zerobang.dev. Ceyhun is an engineer and a serial entrepreneur with a background in data engineering. He started textcortex after reading the GPT-2 paper and creating an open source fine-tuning script for specific applications while working at trivago as data engineering lead. His previous startup developed machine vision applications for companies like Renault and Nokia. https://www.linkedin.com/in/ceyhunderinbogaz/ Jacek is a physics PhD who has focused on machine learning for most of his career. He has been a science lead in the AWS Long Term Science team, driving NLP and Automated Machine Learning (AutoML) products and research. Most recently, he is the CTO and co-founder of distil-labs, making fine-tuning task-specific small language models (SLM) as simple as writing an LLM prompt. https://www.linkedin.com/in/jacek-golebiowski/ Matheus Veleci is an engineer with experience in generative AI, NLP, and data solutions. At Lengoo, he led the development of a product centered on Machine Translation Models and LLMs, building an in-house data platform and MLOps infrastructure focused on fine-tuning models. Earlier in his career, he co-founded a startup, blending technical expertise with business insights to deliver innovative solutions. He now leads data initiatives as Head of Data at Aleph Alpha. https://www.linkedin.com/in/matheus-veleci-dos-santos/ • 21:20 - Closing session // Organisers This event will be in-person only. Please check our Code of Conduct and official health regulation in Berlin before coming. 
If you feel any signs of sickness, please consider skipping this event and attending another time. We will have plenty of events in different formats in the future. Looking forward to seeing you all soon! |
PyBerlin 51 - AI Night with a Panel Session
|
|
From Observability to Deployment: Exploring AI System Lifecycles
2024-12-11 · 17:30
📅 Date: 11-12-2024 📍 Location: Team System, Via Emilio Cornalia, 11 (Gioia) Agenda
Talks & Speakers
Talk 1: Observability for Large Language Models with OpenTelemetry
Speaker: Nir Gazit, CEO @ Traceloop, OpenLLMetry Co-Creator
Large Language Models (LLMs) represent a breakthrough in AI, performing tasks like text generation, translation, and querying. With LLMs becoming a core component of applications such as chatbots and search engines, monitoring their behavior and ensuring reliability is critical. This session will delve into the concept of observability for LLMs, focusing on:
An essential talk for practitioners looking to understand how observability can enhance trustworthiness and reliability in AI systems.
Talk 2: MLOps with AzureML: Build and Deploy a Recommender System
Speaker: Marco Bevilacqua, Machine Learning Engineer @ TeamSystem
For AI-focused organizations, MLOps is essential for managing machine learning workflows at scale. This session will present a practical MLOps use case using the Azure ecosystem, exploring its integration with various tools to streamline workflows from experimentation to deployment. Key highlights:
This session will provide actionable strategies to optimize machine learning lifecycle management, helping organizations leverage AI at scale. 🎟️ RSVP now to secure your spot! ⚠️ Note: If you realize you cannot attend, please cancel your RSVP to allow others to join. Looking forward to seeing you there! |
From Observability to Deployment: Exploring AI System Lifecycles
|
|
DevOops #5 + DevOps #48 - Unconference
2024-12-04 · 17:00
Details: Topics have been voted on! Join our discussion. This time, we're mixing things up with a mini unconference in cooperation with Berlin DevOops, offering an evening full of fun, learning, and, most importantly, your input! What is an unconference? An unconference is an open-style event with no formal speakers. Instead, the sessions are discussion-oriented. It is a great place to practice public speaking when you don’t normally do it. For DevOops that means we have 1 slot with 2-3 different sessions. You're in the driver's seat! Your votes are in, and the top topics are:
We’re excited about this great lineup, but we need your help to make it happen! If you’re passionate about any of these areas and would like to lead a session, please reach out to us: No full talk or CFP is needed. As a session leader you facilitate the discussion and give some input to a topic! Who is leading a session and how to lead? We value diversity and encourage everybody to get involved. The goal is to provide a respectful and supportive environment. You can submit a topic and already indicate that you are up for leading it. But no worries you can also suggest topics without the need to become a session leader. Once we know which sessions are voted, we will publish the sessions and also indicate which sessions still need a leader. So you can step up there too. We will provide some help and ideas on how to prepare your session. In the meanwhile this guide gives some first ideas. ---------------------------------------------------------------------------------------- If you have any questions feel free to reach out! See you soon! The joint Berlin DevO(o)ps Orga Team |
DevOops #5 + DevOps #48 - Unconference
|
|
PyLadies Paris Python Talks #17
2024-11-27 · 17:30
Dear PyLadies 💚🐍 Our next on-site event is coming on the 27th of November featuring 𓆙 Adrin Jalali from Probabl and Celia Kherfallah from Zama and continuing with ⚡ lightning talks where you can take 3 mins to talk about anything Python or tech related (more below) 🌟Agenda (preliminary) 18h30 - 18h45 Come and take your seat 18h45 - 19h00 Welcome by PyLadies Paris and GitGuardian 19h00 - 19h30 Let’s exploit pickle, and `skops` to the rescue! by Adrin Jalali from Probabl. 19h30 - 20h00 Privacy-Preserving Machine Learning With Fully Homomorphic Encryption (FHE) by Celia Kherfallah from Zama 20h00 - 20h20 Lightning talks 20h20 - 22h00 Pizza & networking 🌟 Adrin Jalali from Probabl Talk Title: Let’s exploit pickle, and `skops` to the rescue! Abstract: Pickle files can be evil and simply loading them can run arbitrary code on your system. This talk presents why that is, and we show in simple ways how you can create such an exploit. It would give you a good basis to understand pickle vulnerabilities. This talk also gives you the resources to find more about these exploits. We then talk about how `skops` [1] is tackling the issue for scikit-learn/statistical ML models. We go through some lower level pickle related machinery, and go in detail how the new format works. The new format does not only solve the issue for scikit-learn models, but also for most third party estimators which are in the same ecosystem. In terms of usage, you can simply change two import statements and use the new format almost as a drop in replacement. - [1] https://skops.readthedocs.io/en/stable/persistence.html About Adrin: Adrin, a cofounder at probabl.ai, works on a few open source projects including skops which tackles some of the MLOps challenges related to scikit-learn models. He has a PhD in Bioinformatics, has worked as a consultant, and in an algorithmic privacy and fairness team. He's also a core developer of scikit-learn and fairlearn. 
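For readers who want a feel for the problem the first abstract describes, here is a minimal sketch using only the Python standard library. It is not taken from the talk and is not how `skops` is implemented. The class names `Innocent` and `AllowListUnpickler` and the helper `safe_loads` are hypothetical, and `eval("1 + 1")` is a harmless stand-in for a real payload. The allow-list unpickler follows the `find_class` pattern from the Python `pickle` documentation and only illustrates the general idea of refusing unknown types at load time.

```python
import io
import pickle


class Innocent:
    """Looks harmless, but tells pickle how to 'rebuild' it on load."""

    def __reduce__(self):
        # pickle stores (callable, args) and calls callable(*args) at load time.
        # eval("1 + 1") is a benign stand-in for an attacker's payload.
        return (eval, ("1 + 1",))


payload = pickle.dumps(Innocent())

# Loading runs eval("1 + 1") and returns its result, not an Innocent instance.
loaded = pickle.loads(payload)


class AllowListUnpickler(pickle.Unpickler):
    """Hypothetical mitigation: refuse every global that is not allow-listed."""

    ALLOWED = {("builtins", "list"), ("builtins", "dict")}

    def find_class(self, module, name):
        if (module, name) not in self.ALLOWED:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def safe_loads(data: bytes):
    """Load pickled bytes through the allow-list unpickler."""
    return AllowListUnpickler(io.BytesIO(data)).load()


try:
    safe_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

print(loaded, blocked)  # prints: 2 True
```

The point of the sketch is that `pickle.loads` executed a call chosen by whoever produced the bytes, while the allow-list loader rejected the same bytes before anything ran.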
🌟 Celia Kherfallah from Zama Talk Title: Privacy-Preserving Machine Learning With Fully Homomorphic Encryption (FHE) Abstract: We live in an era where the amount of online data has reached hundreds of zettabytes, and cloud services are evolving at an unprecedented rate. Despite tighter regulations, the risk of personal data misuse remains a major concern. At Zama, we believe that responsibility for this issue doesn’t rest with Internet users, but with developers. It is their duty to ensure the protection and security of the data they process. In this talk, we'll raise awareness among developers about the importance of data privacy, thanks to Fully Homomorphic Encryption (FHE). We'll also introduce Zama's Concrete ML library, which provides the necessary tools (built using FHE) for training models, performing inference on encrypted data, and deploying these solutions, which will enable developers to integrate strong privacy protections without requiring any specific knowledge in cryptography. About Celia: Celia, Machine Learning Researcher at Zama, has contributed to the development of the Concrete ML library and to the democratization of Fully Homomorphic Encryption (FHE) in the field of Machine Learning. Get ready for lightning talks: Many of you told us that you would like to give a talk, but your project is not mature enough. You no longer have to worry about it. Come and practice your public speaking during the 3 minutes time-slot. Some ideas on what you can talk about:
You can decide anytime before the start of lightning talks or you may want to prepare up to one slide (in pdf format), which you can send us by the 11th of March at the latest to [email protected] GitGuardian will be our host and sponsor of the food and the drinks during the networking session after the talks: thank you 💚 and special thanks to Oscar Burns and Antoine Gaillard from GitGuardian for all the support. Important info 1: ❗For safety reasons, the venue's staff will check everyone's identity on site. 📝Please remember to bring an ID with you and register for the event with your real name and family name. Thank you! 2: Please be on time. We can’t guarantee a seat once the meetup has started. 🔍 FAQ Q. I'm not female, is it ok for me to attend? A. Yes, PyLadies Paris events are open to everyone at all levels. |
PyLadies Paris Python Talks #17
|
|
Eindhoven Data Community meetup 19 - ASML
2024-11-21 · 16:00
We’re excited to return to ASML for our annual meetup! This year, we have two concurrent tracks featuring a total of four talks. We're also thrilled to welcome a special guest from the US coming over to ASML: Joe Reis, co-author of "Fundamentals of Data Engineering," who will be doing an AMA. Joe \| Ask Me Anything about Data Engineering or Otherwise Joe Reis is here to answer all of your questions about data engineering, the state of the industry and technology, and anything else on your mind. This is a very rare chance to have a free-flowing conversation with Joe Reis. Cristiano & Shashank \| Automating Creating Trusted Data Products: a developer experience-driven approach Creating high-quality data products is a complex task that often burdens data professionals with repetitive activities. Our trusted dataset creation framework aims to alleviate this challenge by providing a comprehensive mechanism that automates essential processes in data product development. This presentation will delve into how it not only simplifies workflows but also improves developer experience by enhancing feedback loops and reducing cognitive load. Juan \| Standardization of Predictive Maintenance Pipelines Juan will show how his team, The Model Factory, is currently setting up a framework that ensures that all our predictive maintenance pipelines follow standards that ensure 1) short time-to-market, 2) maintainability, and 3) interpretability of outputs and intermediate calculations. Ismael & Ricardo \| Airflow 3.0: A New Perspective on MLOps and GenAi The new version of Airflow is more than just a tool for data orchestration, and is coming in early 2025. Airflow is evolving to meet the needs created by the explosion of GenAi applications, and it is even changing its internal architecture to be faster and more flexible. In this talk, we'll discuss how Airflow 3.0 is evolving to support the requirements of modern applications. 
We'll also provide a practical example of using Airflow with a RAG implementation. It's a look at the future of Airflow, and we hope you'll join us. Program 17:00 – 18:00 🍕 Food Track 1
Track 2
20:00-21:00 🥤 Drinks 20:15-21:00 Tour ASML experience center Joe Reis \| Author, data engineer, "recovering data scientist" Joe Reis, a "recovering data scientist" with 20 years in the data industry, is the co-author of the best-selling O'Reilly book, "Fundamentals of Data Engineering." He’s also the instructor for the wildly popular Data Engineering Professional Specialization on Coursera, created with DeepLearning.ai and AWS. Joe’s extensive experience encompasses data engineering, data architecture, machine learning, and more. He regularly keynotes major data conferences globally, advises and invests in innovative data product companies, writes at Practical Data Modeling and his personal blog, and hosts the popular data podcasts "The Monday Morning Data Chat" and "The Joe Reis Show." In his free time, Joe is dedicated to writing new books and articles, and thinking of ways to advance the data industry. Cristiano Rocha \| Lead Data Engineer Cristiano is a lead engineer at ASML with an educational background in Distributed and Parallel Computing. With 15+ years of experience in on-premise and cloud data-based solutions, Cristiano has a wealth of knowledge in building and maturing high-impact data platforms and self-service analytics programs for large organizations. He has extensive experience in a variety of roles, including data infrastructure engineer, self-service analytics platform engineer, data engineer, big data competence lead, DataOps competence lead, machine learning engineer, and data analyst. Shashank Shekhar \| Senior Data Engineer Shashank is a Senior Data Engineer at ASML with extensive expertise in cross-cloud technologies and architecting and optimizing data pipelines that drive actionable insights. Over 7 years in the industry, Shashank has successfully executed complex data projects, enabling organizations to harness the full potential of their data. 
Juan Manuel Ortiz Sevillano \| Machine Learning Engineer Juan is originally a Data Scientist who turned into a Machine Learning Engineer driven by the need to make ML models produce actual value. He currently focuses on reducing time-to-market and improving maintainability of Predictive Maintenance pipelines at ASML. Ismael Cabral \| Author, Machine Learning Engineer Ismael is a Machine Learning Engineer and Airflow trainer at Xebia Data in The Netherlands. At the same time, he is currently co-authoring the 2nd edition of “Data Pipelines with Apache Airflow”. Ricardo Granados \| Author, Analytics Engineer Ricardo Granados, co-author of Fundamentals of Analytics Engineering, is an analytics engineer specializing in data engineering and analysis. With a master’s in IT management and a focus on data science, he is proficient in using various programming languages and tools. Ricardo is skilled in exploring efficient alternatives and has contributed to multicultural teams, creating business value with data products using modern data stack solutions. As an analytics engineer, he helps companies enhance data value through data modeling, best practices, task automation, and data quality improvement. Note: For security reasons, we must register all visitors in advance. When registering, we ask for additional information such as first and last name, e-mail address and possibly license plate of your vehicle if you want to use a parking facility. Please use the extra field "Reason for visiting" to register your license plate. Please note: bring a valid ID! |
Eindhoven Data Community meetup 19 - ASML
|
|
MLOps as a Team - Raphaël Hoogvliets
2024-11-08 · 18:00
Raphaël Hoogvliets
– guest
We talked about: 00:00 DataTalks.Club intro 02:34 Career journey and transition into MLOps 08:41 Dutch agriculture and its challenges 10:36 The concept of "technical debt" in MLOps 13:37 Trade-offs in MLOps: moving fast vs. doing things right 14:05 Building teams and the role of coordination in MLOps 16:58 Key roles in an MLOps team: evangelists and tech translators 23:01 Role of the MLOps team in an organization 25:19 How MLOps teams assist product teams 27:56 Standardizing practices in MLOps 32:46 Getting feedback and creating buy-in from data scientists 36:55 The importance of addressing pain points in MLOps 39:06 Best practices and tools for standardizing MLOps processes 42:31 Value of data versioning and reproducibility 44:22 When to start thinking about data versioning 45:10 Importance of data science experience for MLOps 46:06 Skill mix needed in MLOps teams 47:33 Building a diverse MLOps team 48:18 Best practices for implementing MLOps in new teams 49:52 Starting with CI/CD in MLOps 51:21 Key components for a complete MLOps setup 53:08 Role of package registries in MLOps 54:12 Using Docker vs. packages in MLOps 57:56 Examples of MLOps success and failure stories 1:00:54 What MLOps is in simple terms 1:01:58 The complexity of achieving easy deployment, monitoring, and maintenance Join our Slack: https://datatalks.club/slack.html |
DataTalks.Club |
|
MLOps as a Team
2024-10-23 · 10:30
How I learned to stop worrying and love deployment - Raphaël Hoogvliets About the event Outline:
About the speaker: Raphaël Hoogvliets is a notable figure in the field of MLOps, known for his expertise as a data scientist and machine learning engineer. He is particularly recognized for his leadership and team-building skills within the technology sector. Raphaël is active in sharing his knowledge and insights on platforms like LinkedIn and Substack, where he regularly posts, and runs a newsletter. He shares learnings on MLOps from a variety of perspectives, including CxO-level strategies and beginner-level coding. Currently, Raphaël is leading a team of 12 engineers at Eneco, a leading sustainable energy company that provides energy to 2m households and businesses. DataTalks.Club is the place to talk about data. Join our slack community! |
MLOps as a Team
|
|
50. Paris Women in Machine Learning & Data Science @Pruna.ai
2024-10-11 · 16:30
The Women in Machine Learning & Data Science (WiMLDS) Meetup aims to inspire, educate, regardless of gender, and support women and gender minorities in the field. We are back for our 50th edition! For this anniversary edition, we have incredible speakers coming! To give them more time to enrich us, we have only two of them. All genders may attend our meetups. Agenda --- 18:30 - Launch of the evening by Paris WiMLDS meetup team & Pruna.ai --- 18:45 - Du Golem au code : l'IA et les fantasmes masculins d'autoengendrement (From the Golem to Code: AI and Male Fantasies of Self-Engendering), by Isabelle Collet, Associate Professor at University of Geneva, author of "Les oubliées du numérique" Western history is full of a long series of myths about artificial creatures and, in some cases, about humans trying to usurp the place of God by embarking on the process of creation. Twenty years ago, when I began working on gender issues in computing, I became interested in artificial creatures because, to my mind, the computer was dreamed of as a member of this large family. When they designed the ENIAC, the fathers of computing were not really trying to produce a big calculating machine, even though that is what they built. The computer of the 1950s, although very far from the performance of ChatGPT, was seen as a step toward the ultimate goal of science: a duplication of the human brain. If I connect these fantasies to the question of gender, it is because all the creators of artificial creatures are men, and all of them find a way to create a new being without sexual reproduction, that is, without the help of women. 19:30 - Low-rank optimization, by Irène Waldspurger, CNRS researcher at CEREMADE A low-rank optimization problem consists in identifying a matrix from a few observations, under the assumption that this matrix has low rank. I will give examples of such problems to motivate their analysis. 
Then, I will explain why they are difficult to solve, and present the algorithms which have been developed in the last fifteen years as well as some open research questions. --- 20:15 - Cocktail & Networking --- After the meetup, a summary will be available on our Medium page: https://wimlds-paris.medium.com/ --- Code of Conduct WiMLDS & MLOps are dedicated to providing a harassment-free experience for everyone. We do not tolerate harassment of participants in any form. All communication should be appropriate for a professional audience including people of many different backgrounds. Sexual language and imagery are not appropriate. Be kind to others. Do not insult or put down others. Behave professionally. Remember that harassment and sexist, racist, or exclusionary jokes are not appropriate. Thank you for helping make this a welcoming, friendly community for all. All attendees should read the full Code of Conduct before participating: https://github.com/WiMLDS/starter-kit/wiki/Code-of-conduct |
50. Paris Women in Machine Learning & Data Science @Pruna.ai
|
|
PyBerlin 49 - 🍃🍃October event 🍃🍃
2024-10-09 · 16:30
Agenda: • 18:30 - Opening doors of the venue • 19:00 - Welcome to PyBerlin! // Organisers • 19:10 - Welcome from the host - HeyJobs GmbH • 19:20 - Applied NLP with LLMs: Beyond black-box monoliths // Ines Montani Large Language Models (LLMs) have enormous potential, but also challenge existing workflows in industry that require modularity, transparency and data privacy. In this talk, I'll show some practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller and faster components that you can run and maintain in-house. Speaker's bio: Ines Montani is a developer specializing in tools for AI and NLP technology. She’s the co-founder and CEO of Explosion and a core developer of spaCy, a popular open-source library for Natural Language Processing in Python, and Prodigy, a modern annotation tool for creating training data for machine learning models. • 19:50 - Short break • 20:20 - MLOps at HeyJobs: Building a Continuous Training Pipeline for Our Custom Embeddings Model // Viktor Bubanja and Shantanu Ladhwe As user behaviour and the relationship between users and jobs evolve over time, it’s crucial that the embedding models powering our Recommendations and Search capture these changes. In this talk, we’ll explore how we use AWS SageMaker to train and dynamically update embedding models to reflect the latest user-job relationships. We’ll dive into the MLOps challenges of handling model versioning, backfilling job embeddings, and managing rollbacks, sharing insights from our approach to keeping our Matching services up-to-date. Viktor Bubanja speaker's bio: Viktor is a Senior Software Engineer in the Matching team at HeyJobs, where he is helping build the search and recommendations systems connecting talent with ideal job opportunities. His focus is on deploying ML systems into production and ensuring their scalability and reliability. 
Shantanu Ladhwe speaker's bio: Shantanu is an ML Engineering Manager at HeyJobs, where he works with a talented team of ML and Software Engineers to develop machine learning-driven search and recommendation systems for the job platform. He has over 8 years of experience in Data Science, Machine Learning, MLOps, and NLP - now LMMs ;) • 20:50 - Building Blinkist's Personalized Real-Time Recommendation Feed: From Design to Implementation // Idil Ismiguzel In this talk, I'll take you behind the scenes of how we revamped Blinkist's homepage with a real-time, personalized feed. I'll share how our system leverages multiple recommender algorithms and a ranking model (aka brain) to serve tailored content that boosts user engagement and retention. We’ll dive into how we utilize AWS SageMaker to train these models and host them as endpoints, making real-time recommendations accessible for multiple services. From design to deployment, we will cover the full journey of delivering impactful user experiences through smart personalization. Speaker's bio: Idil is a Data Scientist and Machine Learning Engineer at Blinkist+Go1, where she helps drive personalized content and learning experiences. With a background in mobile gaming and tourism, she brings a diverse range of industry experience to her work. In addition to her work in ed-tech, Idil is a contributor to Towards Data Science on Medium, where she writes about machine learning, AI, and data science trends. • 21:20 - Closing session // Organisers This event will be in-person only. Please check our Code of Conduct and the official health regulations in Berlin before coming. If you feel any signs of sickness, please consider skipping this event and attending another time. We will have plenty of events in different formats in the future. Looking forward to seeing you all soon! |
PyBerlin 49 - 🍃🍃October event 🍃🍃
|
|
MLOps as a Team
2024-10-07 · 10:30
How I learned to stop worrying and love deployment - Raphaël Hoogvliets About the event Outline:
About the speaker: Raphaël Hoogvliets is a notable figure in the field of MLOps, known for his expertise as a data scientist and machine learning engineer. He is particularly recognized for his leadership and team-building skills within the technology sector. Raphaël is active in sharing his knowledge and insights on platforms like LinkedIn and Substack, where he posts regularly and runs a newsletter. He shares learnings on MLOps from a variety of perspectives, including CxO-level strategies and beginner-level coding. Currently, Raphaël is leading a team of 12 engineers at Eneco, a leading sustainable energy company that provides energy to 2 million households and businesses. DataTalks.Club is the place to talk about data. Join our Slack community! |
MLOps as a Team
|
|
Shahar Epstein
– MLOps Engineer at NCR Voyix
The NCR Voyix Retail Analytics AI team offers ML products for retailers while embracing Airflow as its MLOps Platform. As the team is small and there have been twice as many data scientists as engineers, we encountered challenges in making Airflow accessible to the scientists: as they come from diverse programming backgrounds, we needed an architecture enabling them to develop production-ready ML workflows without prior knowledge of Airflow; due to dynamic product demands, we had to implement a mechanism to interchange Airflow operators effortlessly; and as workflows serve multiple customers, they should be easily configurable and simultaneously deployable. We came up with the following architecture to deal with the above: enabling our data scientists to formulate ML workflows as structured Python files; seamlessly converting the workflows into Airflow DAGs while aggregating their steps to be executed on different Airflow operators; and deploying DAGs via CI/CD’s UI to the DAGs folder for all customers while considering definitions for each in their configuration files. In this session, we will cover Airflow’s evolution in our team and review the concepts of our architecture. |
Airflow Summit 2024
|
|
Ensuring your AI project's success: A comprehensive guide
2024-06-28 · 10:00
Welcome to our webinar, "Ensuring Your AI Project's Success: A Comprehensive Guide." In this focused workshop, we provide you with the blueprint for unlocking the full potential of your AI endeavors. Join us as we delve into the essentials of the machine learning lifecycle, emphasizing the critical role of Human-Centric MLOps in achieving success. From data preparation and model training to deployment and monitoring, we'll dissect each stage of the process, equipping you with the tools and strategies needed for end-to-end implementation. But success in AI isn't just about algorithms and data—it's about people. That's why we place a strong emphasis on the human element throughout this workshop. Learn how to navigate the complexities of AI projects while fostering team synergy and collaboration. By prioritizing both technological efficiency and human connection, you'll pave the way for a truly successful AI project. Don't miss out on this invaluable opportunity to gain actionable insights and take your AI projects to new heights. Register now to secure your spot and unlock the keys to AI success! |
Ensuring your AI project's success: A comprehensive guide
|
|
AI and Deep Learning for Enterprise #17
2024-06-13 · 17:30
Join us at the Daemon Clubhouse on June 13th for an evening of talks, food, and conversation with ML and AI industry pros. Please note you will be unable to enter the venue before 6.30pm. RSVPs will close 24 hours before the event; you will be unable to register after this time, but you can still watch online. If you can't join us in person you can watch remotely via our YouTube channel. Agenda 06:30pm - Doors open, food and drink served 07:00pm - Welcome 07:05pm - An opening statement from Daemon 07:10pm - Konrad Bachusz, Lead Data Engineer at Credera, "State of AI in 2024" Konrad Bachusz has a broad range of experience in finance and data, working in various industries including accounting, government, trading, healthcare, transportation, finance, and consulting. Currently, he is a Lead Data Engineer at a global boutique consultancy called Credera. His current role involves helping enterprise clients deliver data engineering, data platforms, and AI solutions. Konrad's interest in AI stems from his background in data analytics, data science, machine learning and data engineering. Outside of work he enjoys travelling, playing his guitars and programming. 07:50pm - Break 08:00pm - Lucy Thomas, ML Engineer at the NatWest Group, and Dr. Ariadna Blanca Romero, Senior Data Scientist / AI-Machine Learning Engineer at the NatWest Group, "Enterprise Scale MLOps at NatWest" Lucy and Ariadna will take us on NatWest Group's journey of building an MLOps platform that streamlines the entire machine learning lifecycle, and share examples of using additional tools to support MLOps. This platform has boosted team productivity, reduced manual workflows for building, training, and deploying models, and enhanced security and compliance. Lucy Thomas is a Senior AI/ML Engineer at NatWest Group, working in the Data Innovation team. Her focus at the moment is building ML solutions for conversational intelligence that are reliable and robust. 
Outside of work she enjoys going to the gym, swimming in the sea and reading. Dr. Ariadna Blanca Romero is a Senior Data Scientist/AI-ML Engineer at NatWest with a background in computational material science. She is passionate about creating tools for automation that take advantage of the reproducibility of results from models and optimize the process to ensure prompt decision-making to deliver the best customer experience. Ariadna volunteers as a STEM virtual mentor for a non-profit organization helping the professional development of Latin-American students and young professionals, with a particular focus on empowering women. She is passionate about practising yoga, cooking, and playing her electric guitar. 09:00pm - Wrap up, drinks at The Bear Our hosts require that we provide a list of all attendees. Please ensure that you register with a name that matches your government-issued ID or bank card; if you do not, we cannot guarantee you entry to the building. Please RSVP for the event well in advance if you plan to attend in person, and unRSVP if you can no longer attend, as limited spaces are available. |
AI and Deep Learning for Enterprise #17
|
|
PyData Karlsruhe - Coding at Scale | Fast DataFrames
2024-05-14 · 16:00
DataScience and AI: in person in Karlsruhe and live on PyData.TV on YouTube Agenda 18:00 Doors open 18:30 Welcome 18:45 Kickstart Coding at Scale – How Project Template Automation unlocks Developer Productivity - Adrian Freund, Pavel Zwerschke, Bela Stoyan (QuantCo) 19:15 Break: Networking with snacks and beverages 20:15 Dask DataFrame is fast now - Florian Jetter (Coiled) 20:45 Lightning Talks 21:00 Networking with snacks and beverages 21:30 End ⚡️ Lightning Talks: 1. Tim Berti - A case study of custom kernels 2. Natalia Mokeeva - Find the best strategy to get involved in Open Source 3. Dr. Lisa A. Chalaguine - Legal Argument Mining from Court Decisions - A Flyby Asking questions: Please go to Slido to ask questions. How to sign up for on-site attendance It's important for us to make this meetup happen in a responsible way. Only a limited number of seats are available. There is no limit on remote sign-ups! How to join remotely Join the live stream on YouTube. This event will be in English. ---- Talk #1 Kickstart Coding at Scale: How Project Template Automation Unlocks Developer Productivity – Adrian Freund, Pavel Zwerschke, Bela Stoyan (QuantCo) As your company grows, often so does your software landscape. Setting up each repository from scratch quickly leads to a fragmented ocean of project and CI setups that basically solve the same problems. For this, we came up with a standardized, customizable internal project template that we use for all new projects. To operate this effectively at scale, we also came up with a solution to automatically migrate existing projects to newer versions of the template. This lets our developers focus on what they do best, writing code, and not get stuck on boilerplate, while still giving them the option to later deviate from the standard setup if needed. Adrian Freund is a working student in developer tools and is currently studying computer science at Karlsruhe Institute of Technology. 
One of his most popular OSS contributions is the support for the match statement in mypy. Pavel Zwerschke is a Data Engineer focused on building platforms for Data Science development. As part of his work, he is working on packaging topics, especially the adoption of pixi in the wider scientific Python community. Bela Stoyan is technical staff working at the intersection of Data Science and MLOps. He enables Data Scientists to write data pipelines and helps them bring those pipelines to production systems. Talk #2 Dask DataFrame is fast now - Florian Jetter (Coiled) Dask is a library for distributed computing with Python that integrates tightly with pandas. Historically, Dask was the easiest choice to use (it's just pandas) but struggled to achieve robust performance (there were many ways to accidentally perform poorly). The re-implementation of the DataFrame API addresses all of the pain points that users ran into. We will look into how Dask is a lot faster now, how it performs on benchmarks that it struggled with in the past and how it compares to other tools like Spark, DuckDB and Polars. Florian Jetter is leading the Dask Engineering team at Coiled Computing. He is a long-term Dask core maintainer and an expert in distributed cloud computing and data storage. ---- Acknowledgements Also a big thank you to our sponsors:
Contact If you have any questions or suggestions, please feel free to contact us via:
|
PyData Karlsruhe - Coding at Scale | Fast DataFrames
|
|
Road Map to learn AI Engineering
2024-04-26 · 23:00
Road Map: Learn AI Engineering. Do you have questions about Machine Learning Engineering and how to search for an AI job, even after reading multiple blog posts and articles? Come talk to the real experts! Our speakers will share with you their personal journeys and views on choosing MLOps as a career.
Please make sure to leave time before the start of the workshop for registration
Many AI projects fail due to the lack of MLOps expertise to help put ML in production. Companies now realize that building successful AI products requires Data Engineers, Scientists, ML Engineers, and Product Managers to work in a team. AI/ML Engineers play an essential role in putting models in production and ensuring models can be continuously integrated and deployed (CI/CD) with high-quality data (data-centric AI).
If you said yes to any of the above questions, then you don't want to miss this FREE Career Panel.
Are you interested in MLE & MLOps? WeCloudData's official Machine Learning Engineering Bootcamp starts this November. Do not miss out on the chance to kickstart your future career - use the link below to learn more about how WeCloudData can help you advance your career and achieve your goals! https://weclouddata.com/courses/online/ml-engineering-bootcamp/ |
Road Map to learn AI Engineering
|
|
ODSC East 2024 | Data Science Training Conference
2024-04-23 · 13:00
This is a PAID event. Pre-registration is required.
Use code COMMUNITY-EAST2024 for an extra discount on any pass of your choice. Join us for hundreds of hours of workshops, tutorials, and talks on Data Engineering, Generative AI, NLP & LLMs, MLOps & LLMOps, Machine Learning, Responsible AI, and more at ODSC East. ODSC Keynotes & Track Keynotes:
ODSC East’s schedule is now live! Visit the AI Expo Hall and engage with leading AI solution providers including IBM, SAS, PREFECT, HPCC Systems, Expanso, KNIME, Tangent Works, Dagster, Plotly, Xethub, Gurobi Optimization, Lightning AI, Covalent, and more. Learn about the products and solutions that will help you automate machine learning, optimize data science costs, extract value from data, and much more. Free Expo Hall passes are available here - https://hubs.ly/Q02pSyP60 ODSC Links: • Get free access to more talks/trainings like this at Ai+ Training platform: https://hubs.li/H0Zycsf0 • ODSC blog: https://opendatascience.com/ • Facebook: https://www.facebook.com/OPENDATASCI • Twitter: https://twitter.com/_ODSC & @odsc • LinkedIn: https://www.linkedin.com/company/open-data-science • Slack Channel: https://hubs.li/Q02mx9-s0 • Code of conduct: https://odsc.com/code-of-conduct/ |
ODSC East 2024 | Data Science Training Conference
|