Search – talk-data.com

Title & Speakers	Event
Event Data Engineering Open Forum at Netflix 2024 2024-06-19
Unbundling the Data Warehouse: The Case for Independent Storage 2024-06-19 · 20:02 Jason Reid – Director, Product @ Tabular Speaker: Jason Reid (Co-founder & Head of Product at Tabular) This tech talk is a part of the Data Engineering Open Forum at Netflix 2024. Unbundling a data warehouse means splitting it into constituent and modular components that interact via open standard interfaces. In this talk, Jason Reid discusses the pros and cons of both data warehouse bundling and unbundling in terms of performance, governance, and flexibility, and he examines how the trend of data warehouse unbundling will impact the data engineering landscape in the next 5 years. If you are interested in attending a future Data Engineering Open Forum, we highly recommend you join our Google Group (https://groups.google.com/g/data-engineering-open-forum) to stay tuned to event announcements. Data Engineering DWH	YouTube
Automating the Data Architect: Generative AI for Enterprise Data Modeling 2024-06-19 · 20:02 Jide Ogunjobi – Founder & CTO @ Context Data Speaker: Jide Ogunjobi (Founder & CTO at Context Data) This tech talk is a part of the Data Engineering Open Forum at Netflix 2024. As organizations accumulate ever-larger stores of data across disparate systems, efficiently querying and gaining insights from enterprise data remain ongoing challenges. To address this, we propose developing an intelligent agent that can automatically discover, map, and query all data within an enterprise. This “Enterprise Data Model/Architect Agent” employs generative AI techniques for autonomous enterprise data modeling and architecture. If you are interested in attending a future Data Engineering Open Forum, we highly recommend you join our Google Group (https://groups.google.com/g/data-engineering-open-forum) to stay tuned to event announcements. AI/ML Data Engineering Data Modelling GenAI	YouTube
Welcome Address for the Data Engineering Open Forum 2024 2024-06-19 · 20:02 Max Schmeiser – Vice President of Studio and Content Data Science & Engineering Max Schmeiser (Vice President of Studio and Content Data Science & Engineering) extends a warm welcome to all attendees, marking the beginning of our inaugural Data Engineering Open Forum. If you are interested in attending a future Data Engineering Open Forum, we highly recommend you join our Google Group (https://groups.google.com/g/data-engineering-open-forum) to stay tuned to event announcements. Data Engineering Data Science	YouTube
Real-Time Delivery of Impressions at Scale 2024-06-19 · 20:02 Tulika Bhatt – Senior Software Engineer @ Netflix Speaker: Tulika Bhatt (Senior Data Engineer at Netflix) This tech talk is a part of the Data Engineering Open Forum at Netflix 2024. Netflix generates approximately 18 billion impressions daily. These impressions significantly influence a viewer’s browsing experience, as they are essential for powering video ranker algorithms and computing adaptive pages. With the evolution of user interfaces to be more responsive to in-session interactions, coupled with the growing demand for real-time adaptive recommendations, it has become imperative that these impressions be provided on a near real-time basis. This talk will delve into the creative solutions Netflix deploys to manage this high-volume, real-time data requirement while balancing scalability and cost. If you are interested in attending a future Data Engineering Open Forum, we highly recommend you join our Google Group (https://groups.google.com/g/data-engineering-open-forum) to stay tuned to event announcements. Data Engineering	YouTube
Machine Learning Powered Auto Remediation in Netflix Data Platform 2024-06-19 · 20:02 Binbing Hou – Senior Software Engineer @ Netflix , Stephanie Vezich Tamayo – Senior Machine Learning Engineer @ Netflix Speakers: Stephanie Vezich Tamayo (Senior Machine Learning Engineer at Netflix) Binbing Hou (Senior Software Engineer at Netflix) This tech talk is a part of the Data Engineering Open Forum at Netflix 2024. At Netflix, hundreds of thousands of workflows and millions of jobs are running every day on our big data platform, but diagnosing and remediating job failures can impose considerable operational burdens. To handle errors efficiently, Netflix developed a rule-based classifier for error classification called “Pensive.” However, as the system has increased in scale and complexity, Pensive has been facing challenges due to its limited support for operational automation, especially for handling memory configuration errors and unclassified errors. To address these challenges, we have developed a new feature called “Auto Remediation,” which integrates the rules-based classifier with an ML service. If you are interested in attending a future Data Engineering Open Forum, we highly recommend you join our Google Group (https://groups.google.com/g/data-engineering-open-forum) to stay tuned to event announcements. AI/ML Big Data Data Engineering	YouTube
Reflections on Building a Data Platform From the Ground Up in a Post-GDPR World. 2024-06-19 · 20:02 Jessica Larson – Data Engineer and Author @ Netflix Speaker: Jessica Larson (Data Engineer & Author of “Snowflake Access Control”) This tech talk is a part of the Data Engineering Open Forum at Netflix 2024. The requirements for creating a new data warehouse in the post-GDPR world are significantly different from those of the pre-GDPR world, such as the need to prioritize sensitive data protection and regulatory compliance over performance and cost. In this talk, Jessica Larson shares her takeaways from building a new data platform post-GDPR. If you are interested in attending a future Data Engineering Open Forum, we highly recommend you join our Google Group (https://groups.google.com/g/data-engineering-open-forum) to stay tuned to event announcements. Data Engineering DWH GDPR/CCPA Snowflake	YouTube
Data Productivity at Scale 2024-06-19 · 20:02 Iaroslav Zeigerman – Co-Founder and Chief Architect @ Tobiko Data Speaker: Iaroslav Zeigerman (Co-Founder and Chief Architect at Tobiko Data) This tech talk is a part of the Data Engineering Open Forum at Netflix 2024. The development and evolution of data pipelines are hindered by outdated tooling compared to software development. Creating new development environments is cumbersome: Populating them with data is compute-intensive, and the deployment process is error-prone, leading to higher costs, slower iteration, and unreliable data. SQLMesh, an open-source project born from our collective experience at companies like Airbnb, Apple, Google, and Netflix, is designed to handle the complexities of evolving data pipelines at an internet scale. In this talk, Iaroslav Zeigerman discusses challenges faced by data practitioners today and how core SQLMesh concepts solve them. If you are interested in attending a future Data Engineering Open Forum, we highly recommend you join our Google Group (https://groups.google.com/g/data-engineering-open-forum) to stay tuned to event announcements. Data Engineering SQLMesh	YouTube
Data Quality Score: How We Evolved the Data Quality Strategy at Airbnb 2024-06-12 · 15:48 Clark Wright – Staff Analytics Engineer @ Airbnb Speaker: Clark Wright (Staff Analytics Engineer at Airbnb) This tech talk is a part of the Data Engineering Open Forum at Netflix 2024. Recently, Airbnb published a post to their Tech Blog called Data Quality Score: The next chapter of data quality at Airbnb. In this talk, Clark Wright shares the narrative of how data practitioners at Airbnb recognized the need for higher-quality data and then proposed, conceptualized, and launched Airbnb’s first Data Quality Score. If you are interested in attending a future Data Engineering Open Forum, we highly recommend you join our Google Group (https://groups.google.com/g/data-engineering-open-forum) to stay tuned to event announcements. Analytics Data Engineering Data Quality	YouTube
