talk-data.com

Speaker

Vicky Andonova

2 talks

Filtering by: Databricks DATA + AI Summit 2023

Talks & appearances

Showing 2 of 2 activities

Sponsored: Anomalo | Data Archaeology: Quickly Understand Unfamiliar Datasets Using Machine Learning

One of the most daunting and time-consuming activities for data scientists and data analysts is understanding new and unfamiliar data sets. When given such a new data set, how do you understand its shape and structure? How can you quickly understand its important trends and characteristics? The typical answer is hours of manual querying and exploration, a process many call data archaeology.

This session will show a better way to explore new data sets by letting machine learning do the work for you. In particular, we will showcase how Anomalo simplifies the process of understanding and obtaining insights from Databricks tables — without manual querying. With a few clicks, you can generate comprehensive profiles and powerful visualizations that give immediate insight into your data's key characteristics and trends, as well as its shape and structure. With this approach, very little manual data archaeology is required, and you can quickly get to work on getting value out of the data (rather than just exploring it).

Talk by: Elliot Shmukler and Vicky Andonova

Here’s more to explore:
State of Data + AI Report: https://dbricks.co/44i2HBp
The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Connect with us:
Website: https://databricks.com
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc
Facebook: https://www.facebook.com/databricksi

Sponsored by: Anomalo | Scaling Data Quality with Unsupervised Machine Learning Methods

The challenge is no longer how big, diverse, or distributed your data is. It's that you can't trust it. Companies use rules and metrics to monitor data quality, but these are tedious to set up and maintain. We will present a set of fully unsupervised machine learning algorithms for monitoring data quality at scale. They require no setup, catch unexpected issues, and prevent alert fatigue by minimizing false positives. At the end of this talk, participants will be equipped with insight into unsupervised data quality monitoring, its advantages and limitations, and how it can help scale trust in your data.

Talk by: Vicky Andonova

Here’s more to explore:
LLM Compact Guide: https://dbricks.co/43WuQyb
Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us:
Website: https://databricks.com
Twitter: https://twitter.com/databricks
LinkedIn: https://www.linkedin.com/company/databricks
Instagram: https://www.instagram.com/databricksinc
Facebook: https://www.facebook.com/databricksinc