talk-data.com

Event

Data Council 2023

2026-01-10 YouTube Visit website ↗

Activities tracked

76

Sessions & talks

Showing 51–75 of 76 · Newest first

Making Moves with Arrow Data: Introducing Arrow Database Connectivity (ADBC) | Voltron Data

2023-05-11 Watch
video
Matthew Topol (Voltron Data)

ABOUT THE TALK: In this talk, we'll dive into one of the newest Apache Arrow subprojects, Arrow Database Connectivity (ADBC), an API specification for Arrow-based database access.

Over the course of this session, you’ll get a crash course in ADBC and learn how it communicates with different data APIs (like Arrow Flight SQL and Postgres) using Arrow-native in-memory data. By the end, you’ll understand the use cases it can conquer and know where to access the resources you need to get started.

This talk covers the goals, use cases, and examples of using ADBC to communicate with different data APIs (such as Flight SQL or PostgreSQL) using Arrow-native in-memory data.

ABOUT THE SPEAKER: Matthew Topol is a committer for the Apache Arrow project who frequently contributes to the Golang Arrow and Parquet libraries, among other enhancements, and helps grow the Arrow community. Matt recently joined Voltron Data to work on the Apache Arrow libraries full time and grow the Arrow Golang community. His first book, "In-Memory Analytics with Apache Arrow", was published in June 2022 and is the first (and currently only) book on Apache Arrow.

ABOUT DATA COUNCIL: Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.

Make sure to subscribe to our channel for the most up-to-date talks from technical professionals on data related topics including data infrastructure, data engineering, ML systems, analytics and AI from top startups and tech companies.

FOLLOW DATA COUNCIL: Twitter: https://twitter.com/DataCouncilAI LinkedIn: https://www.linkedin.com/company/datacouncil-ai/

Govern Your Data Clients the Right Way to Scale | Memphis

2023-05-11 Watch
video
Yaniv Ben Hemo (Memphis.dev)

ABOUT THE TALK: In this talk, you will learn the risks of growing without data governance, how to create a supportive framework to control your different data clients using common open-source tools, and how to construct the framework so that it adapts to the changes and additions that come as the business scales.

ABOUT THE SPEAKER: Yaniv Ben Hemo's fascination with data and its power to disrupt our world motivated him to go deeper into it, specifically into data engineering. In 2022, alongside his three best friends from college, Yaniv founded Memphis.dev, a real-time data processing platform born out of their struggles as developers working with legacy queues and brokers, to enable high-scale real-time data processing for developers.

Latency is the Mind Killer. It's Time to Reimagine Data Interactions | Hunch

2023-05-11 Watch
video
David Wilson (Hunch)

ABOUT THE TALK: Billions spent on data have one goal: better decisions, faster. Yet, with most tools echoing Tableau and Excel, are we really unlocking data's potential? What might a more productive—and playful—data experience look like?

Hunch is crafting an ambitious user interface for data that's more visual, spatial, and fluid—a shared canvas for both analysts and business users to make faster decisions together. Tune in to the discussion about why it’s time to reimagine how we engage with data, and Hunch's journey exploring the ideas that led them to now and why some work better than others.

ABOUT THE SPEAKER: David Wilson has worked with large datasets for companies across Africa and the Middle East.

David co-founded Cape Networks, combining his data and design experience to craft a SaaS tool that helped non-experts make sense of complex networking data. Aruba Networks acquired Cape to be the face of their cloud software portfolio.

Drawing from his experience at Cape, David co-founded Hunch—a data tool he’s dreamed of since his consulting days.

How Freewheel Processes Billions of Ad tech Events in Real Time | Freewheel

2023-05-11 Watch
video
Margi Dubal (Freewheel (a Comcast Company))

ABOUT THE TALK: The power to gather, analyze, and quickly act on real-time bidding data is critical for advertisers and publishers. A data platform that supports real-time bidding empowers these participants to obtain insights from the huge amounts of data generated by programmatic advertising.

Learn how our Beeswax data platform captures real-time information about bids and impressions and provides feedback to advertisers, enabling them to make data-driven decisions for optimal results. It is built on an event-based architecture, leveraging AWS Kinesis and Snowflake's Snowpipe, that is capable of processing bid requests at a massive scale - around half a million QPS in real-time! We also talk about how the platform evolved over time and how we've built the platform and monitoring infrastructure to enable sustained growth.

ABOUT THE SPEAKER: Margi Dubal is a Director of Data Engineering at Freewheel, a Comcast Company. She currently leads various data teams to build scalable, reliable, and high-quality data solutions. Prior to joining Freewheel, Margi held data engineering management positions at Paperless Post, Amplify, and Adknowledge Inc.

The End of History? Convergence of Batch and Realtime Data Technologies | Ternary Data

2023-05-11 Watch
video
Matt Housley (Halfpipe Systems)

ABOUT THE TALK: Hybridization approaches such as Lambda and Kappa architecture are powerful tools for combining the most useful characteristics of batch and real time systems. Implementation and management of these architectures is not for the faint of heart, but the last several years have seen a wave of SaaS platforms and managed services that deliver hybrid capabilities with a greatly reduced operational burden. This talk details the anticipated impact of these hybrid technologies on future data stacks, and touches on the mythical “one database to rule them all.”

ABOUT THE SPEAKER: Matt Housley holds a Ph.D. in mathematics and is co-author of the bestselling O’Reilly book Fundamentals of Data Engineering.

HuggingFace + Ray AIR Integration: A Python Developer’s Guide to Scaling Transformers | AnyScale

2023-05-11 Watch
video
Antoni Baum (Anyscale), Jules S. Damji (Anyscale Inc)

ABOUT THE TALK: Hugging Face Transformers is a popular open-source project that provides cutting-edge machine learning (ML) models. Still, meeting the computational requirements of the advanced models it provides often requires scaling beyond a single machine. This session explores the integration between Hugging Face and Ray AI Runtime (AIR), which allows users to scale their model training and data loading seamlessly. We will dive deep into the implementation and API and explore how we can use Ray AIR to create an end-to-end Hugging Face workflow, from data ingest through fine-tuning and hyperparameter optimization (HPO) to inference and serving.

ABOUT THE SPEAKERS: Jules S. Damji is a lead developer advocate at Anyscale Inc, an MLflow contributor, and co-author of Learning Spark, 2nd Edition. He is a hands-on developer with over 25 years of experience and has worked at leading companies, such as Sun Microsystems, Netscape, @Home, Opsware/LoudCloud, VeriSign, ProQuest, Hortonworks, and Databricks, building large-scale distributed systems.

Antoni Baum is a software engineer at Anyscale, working on Ray Tune, XGBoost-Ray, Ray AIR, and other ML libraries. In his spare time, he contributes to various open source projects, trying to make machine learning more accessible and approachable.

Feed The Alligators With the Lights On: How Data Engineers Can See Who Really Uses Data | Stemma

2023-05-11 Watch
video
Mark Grover (Stemma)

ABOUT THE TALK: At Lyft, Mark Grover built the Amundsen data catalog so data scientists could navigate hundreds of thousands of tables to distinguish trustworthy data from sandboxed, out-of-date data. When he took Amundsen open source, he helped dozens of data teams support a variety of demands to make data discoverable and self-serve. Mark frequently sees processes that seem “good enough” come back to bite data teams. In this talk, Mark takes us deep into query logs and APIs to see where all of that metadata lives, and he'll demonstrate how to use it so you don’t lose any fingers during your next data change.

ABOUT THE SPEAKER: Mark Grover is the co-founder/CEO of Stemma - a modern data catalog for building self-serve data culture used by Grafana, iRobot, SoFi, Convoy and many others. He is the co-creator of the leading open-source data catalog, Amundsen, used by Lyft, Instacart, Square, ING, Snap and many more! Mark was previously a developer on Apache Spark at Cloudera and is a committer and PMC member on a few open-source Apache projects. He is a co-author of Hadoop Application Architectures.

Building an ML Experimentation Platform for Easy Reproducibility | Treeverse

2023-05-11 Watch
video
Vino Duraisamy (lakeFS)

ABOUT THE TALK: Quality ML at scale is only possible when we can reproduce a specific iteration of the ML experiment, and this is where data is key.

In this talk, you will learn how to use a data versioning engine to intuitively and easily version your ML experiments and reproduce any specific iteration of the experiment.

This talk will demo, through a live code example:
- Creating a basic ML experimentation framework with lakeFS (in a Jupyter notebook)
- Reproducing ML components from a specific iteration of an experiment
- Building intuitive, zero-maintenance experiment infrastructure
- All with common data engineering stacks and open source tooling

ABOUT THE SPEAKER: Vino Duraisamy is a developer advocate at lakeFS, an open-source platform that delivers a Git-like experience to object-store-based data lakes. She previously worked at NetApp (on data management applications for NetApp data centers) and on data teams at Nike and Apple, where she worked mainly on batch processing workloads as a data engineer, built custom NLP models as an ML engineer, and even touched on MLOps for model deployments.

Scaling Uber's Metric System from Elasticsearch to Pinot | Uber

2023-05-11 Watch
video
Nan Ding (Uber), Yupeng Fu (Uber)

ABOUT THE TALK: Uber has been using real-time systems to support time-sensitive, critical use cases for years, including Gairos, which was initiated in the Marketplace Org and has been widely used across the company since 2014, and uMetric, which has emerged rapidly since 2020.

Continuous effort has gone into the reliability and performance of these real-time platforms to cope with traffic growth, an increasing number of users, and a wider variety of use cases, along with follow-on work such as operational cost, resource planning, and optimization feature development. This presentation shares what was done right to solve these challenges, including fully replacing Elasticsearch with Apache Pinot as the real-time storage of our ecosystem.

ABOUT THE SPEAKERS: Yupeng Fu is a Principal Engineer at Uber and he leads the Real-time Data platform and Search platform at Uber. Yupeng Fu is also an Apache Pinot committer.

Nan Ding is a staff engineer at Uber and leads data platform reliability and performance for the Marketplace uMetric team.

Human In The Loop Concept to Building Fully Adaptive ML Models Using Crowdsourcing | Toloka

2023-05-11 Watch
video
Fedor Zhdanov (Toloka)

ABOUT THE TALK: In this talk, Fedor Zhdanov explains how to combat concept drift in ML with crowdsourcing. He shows you how to build complex drift-monitoring systems and human-in-the-loop ML models that can be fully automated. He also shares how this has led him and his team to start building so-called “adaptive ML models”. Learn what they are and how to build and maintain them.

ABOUT THE SPEAKER: Fedor Zhdanov is the Head of AI at Toloka and previously held Principal Applied Scientist roles at AWS, Microsoft and Amazon. Fedor has been creating products with R&D in Machine Learning for the last 18+ years. For the last 6 years, he has been focusing on connecting ML and humans in human-in-the-loop processes. His ventures are focused on building responsible, state-of-the-art, AI-first business solutions with human oversight.

How to Interpret & Explain Your Black Box Models | Anaconda

2023-05-11 Watch
video
Sophia Yang (Anaconda)

ABOUT THE TALK: Interest in machine learning model interpretability and explainability has been increasing. Researchers and ML practitioners have designed many explanation techniques, such as explainable boosting machines, visual analytics, distillation, prototypes, saliency maps, counterfactuals, feature visualization, LIME, SHAP, InterpretML, and TCAV. In this talk, Sophia Yang provides a high-level overview of the popular model explanation techniques.

ABOUT THE SPEAKER: Sophia Yang is a Senior Data Scientist and a Developer Advocate at Anaconda. She is passionate about the data science community and the Python open-source community. She is the author of multiple Python open-source libraries such as condastats, cranlogs, PyPowerUp, intake-stripe, and intake-salesforce.

Big Data is Dead | MotherDuck

2023-05-11 Watch
video
Jordan Tigani (MotherDuck)

ABOUT THE TALK: This talk will make the case that the era of Big Data is over. Now we can stop worrying about data size and focus on how we’re going to use it to make better decisions.

The data behind the graphs shown in this talk come from Jordan Tigani having analyzed query logs, deal post-mortems, benchmark results (published and unpublished), customer support tickets, customer conversations, service logs, and published blog posts, plus a bit of intuition.

ABOUT THE SPEAKER: Jordan Tigani is co-founder and chief duck-herder at MotherDuck, a startup building a serverless analytics platform based on DuckDB. He helped create Google BigQuery, wrote two books on it, and led first the engineering team then the product team through its first $1B or so in revenue.

Extinguishing the Garbage Fire of ML Testing | Mailchimp

2023-05-11 Watch
video
Emily Curtin (Intuit Mailchimp)

ABOUT THE TALK: Our traditional testing and CI methods for Data Science are not working, but we can't just give up on providing guardrails.

As engineers, how do you solve ML testing?

In this talk, Emily Curtin discusses:
- abstracting, decoupling, and separating concerns
- keeping pytest only where it belongs
- substituting testing for observability in appropriate places
- applying data reliability practices and thereby solving some problems at the source
- honoring Data Scientists' mental models and ways of working

ABOUT THE SPEAKER: Emily Curtin is a Staff MLOps Engineer at Intuit Mailchimp. She leads a crazy good team focused on helping Data Scientists do higher quality work faster and more intuitively.

LLM's & Semantic Layer: Self Serve has Entered the Chat | Zenlytic

2023-05-11 Watch
video
Paul Blankley (Zenlytic)

ABOUT THE TALK: Self-serve analytics has always been the promised land for data teams: a fantastic ideal, but not something actually achievable. The combination of LLMs and the semantic layer completely changes that. LLMs and the semantic layer combine high-level intelligence with business context, which allows deep and accurate question answering. Together, they unlock truly self-serve analytics.

ABOUT THE SPEAKER: Paul Blankley is the Co-founder and CTO of Zenlytic. He previously co-founded Ex Quanta AI Studio. He was also a Data Engineer at Capital One.

Malloy: An Experimental Language for Data | Google

2023-05-11 Watch
video
Lloyd Tabb (Looker (acquired by Google))

ABOUT THE TALK: Forcing data through a rectangle shapes the way we solve problems (for example, dimensional fact tables, OLAP Cubes).

Most data isn’t rectangular; rather, it exists in hierarchies (orders, items, products, users). Most query results are better returned as a hierarchy (category, brand, product).

Malloy is a new experimental data programming language that, among other things, breaks the rectangle paradigm and several other long-held misconceptions in the way we analyze data.

In this talk, Lloyd Tabb shares the ideas behind the Malloy language, semantic data modeling, and his vision for the future of data.
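
To make the rectangle-versus-hierarchy point concrete, here is a minimal sketch in plain Python (not Malloy syntax). The sample rows and the `nest` helper are hypothetical, invented only for illustration: the same facts are first held as a flat, rectangular result and then regrouped into the category, brand, product hierarchy the abstract mentions.

```python
from collections import defaultdict

# Hypothetical flat ("rectangular") query result: one row per product,
# with category and brand repeated on every row.
rows = [
    ("Shoes", "Acme", "Runner"),
    ("Shoes", "Acme", "Trail"),
    ("Shoes", "Zoom", "Sprint"),
    ("Hats", "Acme", "Cap"),
]


def nest(rows):
    """Regroup (category, brand, product) rows into a nested dict."""
    tree = defaultdict(lambda: defaultdict(list))
    for category, brand, product in rows:
        tree[category][brand].append(product)
    # Convert the defaultdicts to plain dicts for a clean result.
    return {category: dict(brands) for category, brands in tree.items()}


nested = nest(rows)
print(nested)
```

In the nested form each category and brand appears once, with its children grouped beneath it, which is the shape the abstract argues most query results naturally have.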

ABOUT THE SPEAKER: Lloyd Tabb spent the last 30 years revolutionizing how the world uses the internet and, by extension, data. He is one of the internet pioneers, having worked at Netscape during the browser wars as the Principal Engineer on Navigator Gold, the first HTML WYSIWYG editor.

Originally a database & languages architect at Borland, Lloyd founded Looker, which Google acquired in 2019. Lloyd's work at Looker helped define the Modern Data Stack.

At Google, Lloyd continues to pursue his passion for data, and love of programming languages through his current project, Malloy.

Writing Unit Tests for Data Science Code | Microsoft

2023-05-11 Watch
video
Dr. Nile Wilson (Microsoft)

ABOUT THE TALK: In Data Science, the small piece of code that you want to test often also needs to take in data, train a model, or evaluate a model, but each of these steps is complicated and consists of many smaller units.

Learn Dr. Nile Wilson's Software Engineering best practices for testing Data Science code, along with techniques for common scenarios involving data, such as mocking calls or mocking data.
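
As an illustration of what "mocking calls or mocking data" can look like, here is a minimal sketch using Python's standard `unittest.mock`. It is not taken from the talk: the function `mean_feature` and the `loader` object are hypothetical names chosen for this example.

```python
from unittest.mock import Mock


def mean_feature(loader, column):
    """Average one column of the rows returned by loader.load()."""
    rows = loader.load()
    values = [row[column] for row in rows]
    return sum(values) / len(values)


def test_mean_feature_with_mocked_loader():
    # Mock the call: no file, database, or network is touched.
    loader = Mock()
    # Mock the data: a tiny hand-written sample with a known answer.
    loader.load.return_value = [{"age": 20}, {"age": 30}, {"age": 40}]

    assert mean_feature(loader, "age") == 30
    # Confirm the unit under test asked for data exactly once.
    loader.load.assert_called_once_with()


test_mean_feature_with_mocked_loader()
```

Keeping the data source behind a small object that can be replaced by a `Mock` lets the arithmetic be tested in isolation from loading, training, or evaluation steps.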

ABOUT THE SPEAKER: Dr. Nile Wilson is a Data Scientist 2 in Industry Solutions Engineering at Microsoft, focused on developing and implementing Machine Learning solutions for enterprise customers. She has worked with interdisciplinary teams across various industries to develop production-ready data science solutions to drive business impact.

Data Contracts - Accountable Data Quality | Data Quality Camp

2023-05-11 Watch
video
Chad Sanderson (Gable.ai)

ABOUT THE TALK: Data Contracts are a mechanism for driving accountability and data ownership between producers and consumers. Contracts are used to ensure production-grade data pipelines are treated as part of the product and have clear SLAs and ownership.

Learn about the why, when and how of Data Contracts and the spectrum from culture change to implementation details.

ABOUT THE SPEAKER: Chad Sanderson is the former Head of Data at Convoy. He has implemented Data Contracts at scale on everything from Machine Learning models to Embedded Metrics. He currently operates the Data Quality Camp Slack group.

The Fun Sized MLOps Stack from Scratch | Featureform

2023-05-11 Watch
video
Mikiko Bazeley (Featureform)

ABOUT THE TALK: Learn about "fun-sized companies" (SMBs, small startups, etc.) and how to build a fully fledged MLOps platform from scratch in under a day using the best OSS tools out there.

This talk covers:
- The main problems MLOps tries to solve
- The most common tools being used and their drawbacks
- OSS projects and tools developed in the past 2-3 years and how they solve some of the pain points of the prior tools
- The realistic roadmap for companies that are forever “not-Google” scale but want to continue improving their data and ML maturity

ABOUT THE SPEAKER: Mikiko Bazeley is Head of MLOps at Featureform, a Virtual Feature Store. She has previously taken on engineer, data scientist, and data analyst roles for companies including Mailchimp (Intuit), Teladoc, Sunrun, and Autodesk, along with a handful of early-stage startups.

Publishing Jupyter Notebooks with Quarto | RStudio

2023-05-11 Watch
video
J.J. Allaire (RStudio)

ABOUT THE TALK: Quarto is a multi-language, open-source toolkit for creating data-driven websites, reports, presentations, and scientific articles, built on Jupyter.

This talk teaches you how to use Quarto to publish Jupyter notebooks as production-quality websites, books, blogs, presentations, PDFs, Office documents, and more. It covers how to publish notebooks within existing content management systems like Hugo, Docusaurus, and Confluence, and also explores how Quarto works under the hood and how the system can be extended to accommodate unique requirements and workflows.

ABOUT THE SPEAKER: J.J. Allaire is the founder of RStudio and the creator of the RStudio IDE. He is an author of several packages in the R Markdown publishing ecosystem and has also worked extensively on the R interfaces to Python and TensorFlow. J.J. is now leading the Quarto project, which is a new Jupyter-based scientific and technical publishing system.

The Missing Manual: Everything You Need to Know about Snowflake Optimization | SELECT

2023-05-11 Watch
video
Ian Whitestone (Shopify), Niall Woodward (SELECT)

ABOUT THE TALK: Learn all about cost and performance optimization in Snowflake. This talk dives deep into Snowflake’s architecture & billing model, covering key concepts like virtual warehouses, micro-partitioning, the lifecycle of a query, and Snowflake’s two-tiered cache. It then goes in depth on the most important optimization strategies, like virtual warehouse configuration, table clustering, and query-writing best practices. Throughout the talk, code snippets and other resources are shared to help you get the most out of Snowflake.
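To make the billing model mentioned above a little more concrete, here is a minimal sketch of estimating virtual warehouse credit usage. It assumes standard warehouses, where an X-Small consumes 1 credit per hour and each size up doubles that rate, with per-second billing and a 60-second minimum each time a warehouse starts. The function name and the example runtimes are illustrative and not taken from the talk.

```python
# Credits per hour for standard warehouses; each size up doubles the rate.
CREDITS_PER_HOUR = {
    "XSMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "XLARGE": 16,
}

# Assumption: each start or resume is billed for at least 60 seconds.
MIN_BILLED_SECONDS = 60


def estimate_credits(size, run_seconds):
    """Estimate credits for a warehouse of `size` over several running periods.

    `run_seconds` lists how long the warehouse ran after each resume.
    """
    rate = CREDITS_PER_HOUR[size.upper()]
    # Apply the 60-second minimum to every period, then bill per second.
    billed_seconds = sum(max(MIN_BILLED_SECONDS, s) for s in run_seconds)
    return rate * billed_seconds / 3600
```

For example, `estimate_credits("MEDIUM", [30, 3600])` bills the 30-second period as a full 60 seconds, which comes to about 4.07 credits. This is why many short resumes can cost more than their actual runtime suggests.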

ABOUT THE SPEAKERS: Niall Woodward and Ian Whitestone are the co-founders of SELECT, a tool that helps Snowflake users optimize their Snowflake cost & performance.

Niall Woodward is well known in the data community for creating and contributing to open source packages.

Ian Whitestone previously led data teams at Shopify and Capital One. At Shopify, Ian spearheaded the efforts to reduce their data warehouse spend by over 50%.

Building a Control Plane for Data | Acryl

2023-05-11 Watch
video
Shirshanka Das (Acryl Data)

ABOUT THE TALK: This talk explains what the control plane of data looks like and how it fits into the reference architecture for the deconstructed data stack: a data stack that includes operational data stores, streaming systems, transformation engines, BI tools, warehouses, ML tools and orchestrators.

It dives into the fundamental characteristics for a control plane:

Breadth (completeness)
Latency (freshness)
Scale
Source of Truth
Auditability

ABOUT THE SPEAKER: Shirshanka Das is the Co-founder and CEO of Acryl Data, the company commercializing the open source DataHub project, a real-time metadata platform used by LinkedIn, Stripe, Pinterest, Optum, Expedia and many others. Prior to founding Acryl, he was the overall architect for Big Data at LinkedIn from 2010 to 2020, and was responsible for creating the metadata and data management strategy at the company.

Scaling Experimentation to 20 Billion Users | Statsig

2023-05-11 Watch
video
Timothy Chan (Statsig)

ABOUT THE TALK:

Statsig is a product observability platform that helps product teams move faster and make better decisions. Companies like Notion, Flipkart, Eventbrite, Ancestry, and Univision use it to release features, run experiments and measure impact.

In only two years, Statsig has grown to support thousands of experiments across billions of users (unique company-specific userIDs). In this session, you will learn lessons from the company's growth.

ABOUT THE SPEAKER: Timothy Chan is an experienced data science professional, currently serving as the Data Science Lead at Statsig. Before joining Statsig, Timothy spent almost 5 years as a Data Scientist at Facebook (now Meta), where he was involved in projects across Facebook App and Reality Labs. His background includes working in biotech, researching treatments for diseases such as Alzheimer’s, Multiple Sclerosis, Lupus, and Cancer.

Extreme Self-Service: Turning Data Consumers into Data Constructors | Whatnot

2023-05-11 Watch
video
Alice Leach (Whatnot)

ABOUT THE TALK: Small data teams face supply and demand problems. Triaging and prioritizing data work can be overwhelming. But what if data consumers could create their own products with minimal training?

Learn how to empower data consumers without disrupting others. Discover lessons from an 'extreme' self-service analytics approach: best practices, fostering a data community, promoting SQL literacy, and establishing solid guardrails.

ABOUT THE SPEAKER: Alice Leach is a Data Engineer at Whatnot Inc., a livestream platform and marketplace that enables collectors and enthusiasts to connect, buy, and sell verified products. She transitioned from academia to data in 2021, working first as a data scientist and then as a data engineer. Her current work at Whatnot focuses on designing and building robust, self-service data workflows using a modern data stack.

Modern Data Management: How to Set Your Data Team Up for Success | Select Star

2023-05-11 Watch
video
Alec Bialosky (Select Star)

ABOUT THE TALK: Got your Modern Data Stack set up, now what? A mature data practice goes beyond setting up the data pipeline and ensures there are both systems and processes in place to make it easy for everyone to find and understand data.

Learn how Select Star enables data discovery, making knowledge searchable and understandable for all. Uncover best practices for setting up a data discovery portal as your single source of truth.

ABOUT THE SPEAKER: Alec Bialosky is the Director of Business Operations at Select Star, where he spends most of his time working with prospects and customers to help them achieve their data discovery goals.

The State of Cross Company Data Exchange | General Folders

2023-05-11 Watch
video
Pardis Noorzad (General Folders)

ABOUT THE TALK: Data exchange is vital for business partnerships, but current practices are manual, prone to leaks, and hard to validate, monitor, and audit.

Tune in to this talk for an overview of data sharing methods and how they compare on security, simplicity, and speed. Discover best practices and solutions to overcome challenges.

ABOUT THE SPEAKER: Pardis Noorzad is CEO at General Folders. She led a data team at Twitter, covering a variety of consumer products. Pardis has also built products at growth-stage fintech and digital health companies and at early-stage AI platform companies.
