talk-data.com

Topic

Stitch

etl data_integration cloud

2 tagged


Activities

2 activities · Newest first

Scalable & Sustainable Feature Engineering with Hamilton | DAGWorks

ABOUT THE TALK: Hamilton is a novel open-source framework for developing and maintaining scalable feature engineering dataflows.

We introduce the framework, discuss its motivations and initial successes at Stitch Fix, showcase its lightweight data lineage and catalog abilities, and share recent extensions that seamlessly integrate it with distributed compute offerings, such as Dask, Ray, and Spark.
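To make the dataflow idea concrete, here is a minimal sketch in plain Python. It is not Hamilton's API: it only imitates the general pattern of treating each function as a node whose parameter names refer to upstream nodes, and it derives a simple lineage from those names. The feature functions, the `compute` and `lineage` helpers, and the sample inputs are all hypothetical.

```python
import inspect

# Hypothetical feature functions: each function name is a node, and each
# parameter name refers to an upstream node or a raw input column.
def spend_per_signup(spend, signups):
    return [s / n for s, n in zip(spend, signups)]

def avg_spend_per_signup(spend_per_signup):
    return sum(spend_per_signup) / len(spend_per_signup)

# Registry of nodes, keyed by function name (an assumption of this sketch).
NODES = {f.__name__: f for f in (spend_per_signup, avg_spend_per_signup)}

def compute(name, inputs, cache=None):
    """Resolve one node by recursively computing the parameters it names."""
    cache = {} if cache is None else cache
    if name in inputs:
        return inputs[name]
    if name not in cache:
        params = inspect.signature(NODES[name]).parameters
        kwargs = {p: compute(p, inputs, cache) for p in params}
        cache[name] = NODES[name](**kwargs)
    return cache[name]

def lineage(name):
    """Return every upstream name (nodes and raw inputs) a node depends on."""
    if name not in NODES:
        return set()
    upstream = set()
    for p in inspect.signature(NODES[name]).parameters:
        upstream |= {p} | lineage(p)
    return upstream

inputs = {"spend": [10.0, 20.0, 30.0], "signups": [1, 2, 3]}
result = compute("avg_spend_per_signup", inputs)
deps = lineage("avg_spend_per_signup")
```

Because dependencies are declared through parameter names alone, the same function definitions serve both for execution and for answering which inputs a feature was derived from.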

ABOUT THE SPEAKER: Elijah Ben Izzy has always enjoyed working at the intersection of math and engineering. He has more recently focused on building tools to make data scientists and researchers more productive.

He built infrastructure to help quantitative researchers efficiently turn ideas into production trading models at Two Sigma and ran the Model Lifecycle team at Stitch Fix.

He is now the CTO at DAGWorks, which aims to solve the problem of building and maintaining complex ETLs for machine learning.

ABOUT DATA COUNCIL: Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.


Delta Live Tables: Modern Software Engineering and Management for ETL

Data engineers have the difficult task of cleansing complex, diverse data and transforming it into a usable source to drive data analytics, data science, and machine learning. They need to know the data infrastructure platform in depth, build complex queries in various languages, and stitch them together for production. Join this talk to learn how Delta Live Tables (DLT) simplifies data transformation and ETL. DLT is the first ETL framework to use modern software engineering practices to deliver reliable and trusted data pipelines at any scale. Discover how analysts and data engineers can innovate rapidly with simple pipeline development and maintenance; how automating administrative tasks and gaining visibility into pipeline operations removes operational complexity; how built-in quality controls and monitoring ensure accurate BI, data science, and ML; and how self-optimizing, auto-scaling data pipelines simplify batch and streaming.
