talk-data.com talk-data.com

S

Speaker

Savin Goyal

2

talks

Co-Founder & CTO Outerbounds

Filter by Event / Source

Talks & appearances

2 activities · Newest first

Search activities →

The role of the data scientist is changing. Some organizations are splitting the role into more narrowly focused jobs, while others are broadening it. The latter approach, known as the Full Stack Data Scientist, is derived from the concept of a full stack software engineer, with this role often including software engineering tasks. In particular, one of the key functions of a full stack data scientist is to take machine learning models and get them into production inside software. So, what separates projects from production? Savin Goyal is the Co-Founder & CTO at Outerbounds. In addition to his work at Outerbounds, Savin is the creator of the open source machine learning management platform Metaflow. Previously Savin has worked as a Software Engineer at Netflix and LinkedIn. In the episode, Richie and Savin explore the definition of production in data science, steps to move from internal projects to production, the lifecycle of a machine learning project, success stories in data science, challenges in quality control, Metaflow, scalability and robustness in production, AI and MLOps, advice for organizations and much more.  Links Mentioned in the Show: OuterboundsMetaflowConnect with Savin on Linkedin[Course] Developing Machine Learning Models for ProductionRelated Episode: Why ML Projects Fail, and How to Ensure Success with Eric Siegel, Founder of Machine Learning Week, Former Columbia Professor, and Bestselling AuthorRewatch sessions from RADAR: AI Edition New to DataCamp? Learn on the go using the DataCamp mobile appEmpower your business with world-class data and AI skills with DataCamp for business

Airflow is a household brand in data engineering: It is readily familiar to most data engineers, quick to set up, and, as proven by millions of data pipelines powered by it since 2014, it can keep DAGs running. But with the increasing demands of ML, there is a pressing need for tools that meet data scientists where they are and address two pressing issues - improving the developer experience & minimizing operational overhead. In this talk, we discuss the problem space and the approach to solving it with Metaflow, the open-source framework we developed at Netflix, which now powers thousands of business-critical ML projects at Netflix & other companies. We wanted to provide data scientists with the best possible UX, allowing them to focus on parts they like (e.g., modeling) while providing robust solutions for the foundational infrastructure: data, compute, orchestration (using Airflow), & versioning. In this talk, we will demo our latest work that builds on top of Airflow.