talk-data.com talk-data.com

Event

Databricks DATA + AI Summit 2023

2026-01-11 YouTube Visit website ↗

Activities tracked

47

Filtering by: MLOps ×

Sessions & talks

Showing 26–47 of 47 · Newest first

Search within this event →
Discuss How LLMs Will Change the Way We Work

Discuss How LLMs Will Change the Way We Work

2023-07-26 Watch
video
Ben Harvey , Sean Owen (Databricks) , Ankit Mathur (Databricks) , Debu Sinha , Jan van der Vegt

Will LLMs change the way we work?  Ask questions from a panel of LLM and AI experts on what problems LLMs will solve and its potential new challenges

Talk by: Ben Harvey, Jan van der Vegt, Ankit Mathur, Debu Sinha, and Sean Owen

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Foundation Models in the Modern Data Stack

Foundation Models in the Modern Data Stack

2023-07-26 Watch
video

As Foundation Models (FMs) continue to grow in size, innovations continue to push the boundaries of what these models can do on language and image tasks. This talk will describe our work on applying FMs to structured data tasks like data linkage, cleaning and querying. We will then discuss challenges and solutions that these models present for production deployment in the modern data stack.

Talk by: Ines Chami

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

PaLM 2: A Smaller, Faster and More Capable LLM

PaLM 2: A Smaller, Faster and More Capable LLM

2023-07-26 Watch
video

PaLM 2 is a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM. This improved efficiency enables broader deployment while also allowing the model to respond faster, for a more natural pace of interaction.

PaLM 2 demonstrates robust reasoning capabilities exemplified by large improvements over PaLM on BIG-Bench and other reasoning tasks. PaLM 2 exhibits stable performance on a suite of responsible AI evaluations, and enables inference-time control over toxicity without additional overhead or impact on other capabilities. Overall, PaLM 2 achieves state-of-the-art performance across a diverse set of tasks and capabilities.

Talk by: Andy Dai

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Perplexity: A Copilot for All Your Web Searches and Research

Perplexity: A Copilot for All Your Web Searches and Research

2023-07-26 Watch
video

In this demo, we will show you the fastest and functional answer engine and search copilot that exists right now: Perplexity.ai. It can solve a wide array of problems starting from giving you fast answers to any topic to planning trips and doing market research on things unfamiliar to you, all in a trustworthy way without hallucinations, providing you references in the form of citations. This is made possible by harnessing the power of LLMs along with retrieval augmented generation from traditional search engines and indexes.

We will also show you how information discovery can now be fully personalized to you: personalization through prompt engineering. Finally, we will see use cases of how this search copilot can help you in your day to day tasks in a data team: be it a data engineer, data scientist, or a data analyst.

Talk by: Aravind Srinivas

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Sponsored by: Anomalo | Scaling Data Quality with Unsupervised Machine Learning Methods

Sponsored by: Anomalo | Scaling Data Quality with Unsupervised Machine Learning Methods

2023-07-26 Watch
video

The challenge is no longer how big, diverse, or distributed your data is. It's that you can't trust it. Companies are utilizing rules and metrics to monitor data quality, but they’re tedious to set up and maintain. We will present a set of fully unsupervised machine learning algorithms for monitoring data quality at scale, which requires no setup, catching unexpected issues and preventing alert fatigue by minimizing false positives. At the end of this talk, participants will be equipped with insight into unsupervised data quality monitoring, its advantages and limitations, and how it can help scale trust in your data.

Talk by: Vicky Andonova

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Sponsored by: Wipro | Personalized Price Transparency Using Generative AI

Sponsored by: Wipro | Personalized Price Transparency Using Generative AI

2023-07-26 Watch
video

Patients are increasingly taking an active role in managing their healthcare costs and are more likely to choose providers and treatments based on cost considerations. Learn how technology can help build cost-efficient care models across the healthcare continuum, delivering higher quality care while improving patient experience and operational efficiency.

Talk by: Janine Pratt

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

MLOps at Gucci: From Zero to Hero

MLOps at Gucci: From Zero to Hero

2023-07-25 Watch
video

Delta Lake is an open-source storage format that can be ideally used for storing large-scale datasets, which can be used for single-node and distributed training of deep learning models. Delta Lake storage format gives deep learning practitioners unique data management capabilities for working with their datasets. The challenge is that, as of now, it’s not possible to use Delta Lake to train PyTorch models directly.

PyTorch community has recently introduced a Torchdata library for efficient data loading. This library supports many formats out of the box, but not Delta Lake. This talk will demonstrate using the Delta Lake storage format for single-node and distributed PyTorch training using the torchdata framework and standalone delta-rs Delta Lake implementation.

Talk by: Michael Shtelma

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

LLMOps: Everything You Need to Know to Manage LLMs

LLMOps: Everything You Need to Know to Manage LLMs

2023-07-25 Watch
video
Joseph Bradley , Eric Peter (Databricks)

With the recent surge in popularity of ChatGPT and other LLMs such as Dolly, many people are going to start training, tuning, and deploying their own custom models to solve their domain-specific challenges. When training and tuning these models, there are certain considerations that need to be accounted for in the MLOps process that differ from traditional machine learning. Come watch this session where you’ll gain a better understanding of what to look out for when starting to enter the world of applying LLMs in your domain.

In this session, you’ll learn about:

  • Grabbing foundational models and fine-tuning them
  • Optimizing resource management such as GPUs
  • Integrating human feedback and reinforcement learning to improve model performance
  • Different evaluation methods for LLMs

Talk by: Joseph Bradley and Eric Peter

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

ML on the Lakehouse: Bringing Data and ML Together to Accelerate AI Use Cases

ML on the Lakehouse: Bringing Data and ML Together to Accelerate AI Use Cases

2022-07-19 Watch
video

Discover the latest innovations from Databricks that can help you build and operationalize the next generation of machine learning solutions. This session will dive into Databricks Machine Learning, a data-centric AI platform that spans the full machine learning lifecycle - from data ingestion and model training to production MLOps. You'll learn about key capabilities that you can leverage in your ML use cases and see the product in action. You will also directly hear how Databricks ML is being used to maximize supply chain logistics and keep millions of Coca-Cola products on the shelf.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

MLOps at DoorDash

MLOps at DoorDash

2022-07-19 Watch
video

MLOps is one of the widely discussed topics in the ML practitioner community. Streamlining the ML development and productionalizing ML are important ingredients to realize the power of ML, however it requires a vast and complex infrastructure. The ROI of ML projects will start only when they are in production. The journey to implementing MLOps will be unique to each company. At DoorDash, we’ve been applying MLOps for a couple of years to support a diverse set of ML use cases and to perform large scale predictions at low latency.

This session will share our approach to MLOps, as well as some of the learnings and challenges. In addition, it will share some details about the DoorDash ML stack, which consists of a mixture of homegrown solutions, open source solutions and vendor solutions like Databricks.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

MLOps on Databricks: A How-To Guide

MLOps on Databricks: A How-To Guide

2022-07-19 Watch
video

As companies roll out ML pervasively, operational concerns become the primary source of complexity. Machine Learning Operations (MLOps) has emerged as a practice to manage this complexity. At Databricks, we see firsthand how customers develop their MLOps approaches across a huge variety of teams and businesses. In this session, we will show how your organization can build robust MLOps practices incrementally. We will unpack general principles which can guide your organization’s decisions for MLOps, presenting the most common target architectures we observe across customers. Combining our experiences designing and implementing MLOps solutions for Databricks customers, we will walk through our recommended approaches to deploying ML models and pipelines on Databricks. You will come away with a deeper understanding of how to scale deployment of ML models across your organization, as well as a practical, coded example illustrating how to implement an MLOps workflow on Databricks.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Multimodal Deep Learning Applied to E-commerce Big Data

Multimodal Deep Learning Applied to E-commerce Big Data

2022-07-19 Watch
video

At Mirakl, we empower marketplaces with Artificial Intelligence solutions. Catalogs data is an extremely rich source of e-commerce sellers and marketplaces products which include images, descriptions, brands, prices and attributes (for example, size, gender, material or color). Such big volumes of data are suitable for training multimodal deep learning models and present several technical Machine Learning and MLOps challenges to tackle.

We will dive deep into two key use cases: deduplication and categorization of products. For categorization the creation of quality multimodal embeddings plays a crucial role and is achieved through experimentation of transfer learning techniques on state-of-the-art models. Finding very similar or almost identical products among millions and millions can be a very difficult problem and that is where our deduplication algorithm comes to bring a fast and computationally efficient solution.

Furthermore we will show how we deal with big volumes of products using robust and efficient pipelines, Spark for distributed and parallel computing, TFRecords to stream and ingest data optimally on multiple machines avoiding memory issues, and MLflow for tracking experiments and metrics of our models.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Automating Model Lifecycle Orchestration with Jenkins

Automating Model Lifecycle Orchestration with Jenkins

2022-07-19 Watch
video

A key part of the lifecycle involves bringing a model to production. In regular software systems, this is accomplished via a CI/CD pipeline such as one built with Jenkins. However, integrating Jenkins into a typical DS/ML workflow is not straightforward for X, Y, Z reasons. In this hands-on talk, I will talk about what Jenkins and CI/CD practices can bring to your ML workflows, demonstrate a few of these workflows, and share some best practices on how a bit of Jenkins can level up your MLOps processes.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Quick to Production with the Best of Both Apache Spark and Tensorflow on Databricks

Quick to Production with the Best of Both Apache Spark and Tensorflow on Databricks

2022-07-19 Watch
video

Using Tensorflow with big datasets has been an impediment for building deep learning models due to the added complexities of running it in a distributed setting and complicated MLOps code, recent advancements in tensorflow 2, and some extension libraries for Spark has now simplified a lot of this. This talk focuses on how we can leverage the best of both Spark and tensorflow to build machine learning and deep learning models using minimal MLOps code letting Spark handle the grunt of work, enabling us to focus more on feature engineering and building the model itself. This design also enables us to use any of the libraries in the tensorflow ecosystem (like tensorflow recommenders) with the same boilerplate code. For businesses like ours, fast prototyping and quick experimentations are key to building completely new experiences in an efficient and iterative way. It is always preferable to have tangible results before putting more resources into a certain project. This design provides us with that capability and lets us spend more time on research, building models, testing quickly, and rapidly iterating. It also provides us with the flexibility to use our choice of framework at any stage of the machine learning lifecycle. In this talk, we will go through some of the best and new features of both spark and tensorflow, how to go from single node training to distributed training with very few extra lines of code, how to leverage MLFlow as a central model store, and finally, using these models for batch and real-time inference.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Survey of Production ML Tech Stacks

Survey of Production ML Tech Stacks

2022-07-19 Watch
video

Production machine learning demand stitching together many tools ranging from open source standards to cloud-specific and third party solutions. This session surveys the current ML deployment technology landscape to contextualize which tools solve for which features off production ML systems such as CI/CD, REST endpoint, and monitoring. It'll help answer the questions: what tools are out there? Where do I start with the MLops tech stack for my application? What are the pros and cons of open source versus managed solutions? This talk takes a features driven approach to tool selection for MLops tacks to provide best practices in the most rapidly evolving field of data science.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Designing Better MLOps Systems

Designing Better MLOps Systems

2022-07-19 Watch
video

Real-world data problems are becoming increasingly daunting to solve, as data volume grows and computing tools proliferate. Since 2018, Gartner has predicted that 85% of ML projects will fail and this trend will likely continue through 2022 as well. Nevertheless, in most cases, ML practitioners have the opportunity to avoid their projects from failing in the early phases.

In this talk, the speaker will borrow from her consultancy and hands-on implementation experience with cross-functional clients to share her takeaways in designing better ML systems. The talk will walk through common pitfalls to watch out for, relevant best practices in software engineering for ML, and technical anchors that make a robust system. This talk aims to empower the audience – beginner and experienced practitioners alike – with confidence in their ML project designs and help provide the big-picture design thinking framework for successful projects.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Emerging Data Architectures & Approaches for Real-Time AI using Redis

Emerging Data Architectures & Approaches for Real-Time AI using Redis

2022-07-19 Watch
video

As more applications harness the power of real-time data, it’s important to architect and implement a data stack to meet the broad requirements of operational ML and be able to seamlessly integrate neural embeddings into applications.

Real-time ML requires more than just deploying ML models to production using MLOps tooling; it requires a fast and scalable operational database that easily integrates into the MLOps workflow. Milliseconds matter and can make the difference in delivering fast online predictions whether it’s personalized recommendations, detecting fraud, or figuring out the most optimal food delivery route.

Attend this session to explore how a modern data stack can be used for real-time operational ML and building AI-infused applications. The session will over the following topics:

Emerging architectural components for operational ML such as the online feature store for real-time serving.

Operational excellence in managing globally distributed ML data and feature pipelines

Foundational data types of Redis including the representation of data using vector embeddings.

Using Redis as a vector database to build vector similarity search applications.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

FutureMetrics: Using Deep Learning to Create a Multivariate Time Series Forecasting Platform

FutureMetrics: Using Deep Learning to Create a Multivariate Time Series Forecasting Platform

2022-07-19 Watch
video

Liquidity forecasting is one of the most essential activities at any bank. TD bank, the largest of the big Five, has to provide liquidity for half a trillion dollars in products, and to forecast it to remain within a $5BN buffer.

The use case was to predict liquidity growth over short to moderate time horizons: 90 days to 18 months. Model must perform reliably in a strict regulatory framework, and as such validating such a model to the required standards is a key area of focus for this talk. While univariate models are widely used for this reason, their performance is capped preventing future improvements for these type problems.

The most challenging aspect of this problem is that the data is shallow (P N): the primary cadence is monthly, and chaotic nature of economic systems results in poor connectivity of behavior across transitions. Goal is to create an MLOps platform for these types of time series forecasting metrics across the enterprise.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Detecting Financial Crime Using an Azure Advanced Analytics Platform and MLOps Approach

Detecting Financial Crime Using an Azure Advanced Analytics Platform and MLOps Approach

2022-07-19 Watch
video

As gatekeepers of the financial system, banks play a crucial role in reporting possible instances of financial crime. At the same time, criminals continuously reinvent their approaches to hide their activities among dense transaction data. In this talk, we describe the challenges of detecting money laundering and outline why employing machine learning via MLOps is critically important to identify complex and ever-changing patterns.

In anti-money-laundering, machine learning answers to a dire need for vigilance and efficiency where previous-generation systems fall short. We will demonstrate how our open platform facilitates a gradual migration towards a model-driven landscape, using the example of transaction-monitoring to showcase applications of supervised and unsupervised learning, human explainability, and model monitoring. This environment enables us to drive change from the ground up in how the bank understands its clients to detect financial crime.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

MLflow Pipelines: Accelerating MLOps from Development to Production

MLflow Pipelines: Accelerating MLOps from Development to Production

2022-07-19 Watch
video

Despite being an emerging topic, MLOps is hard and there are no widely established approaches for MLOps. What makes it even harder is that in many companies the ownership of MLOps usually falls through the cracks between data science teams and production engineering teams. Data scientists are mostly focused on modeling the business problems and reasoning about data, features, and metrics, while the production engineers/ops are mostly focused on traditional DevOps for software development, ignoring ML-specific Ops like ML development cycles, experiment tracking, data/model validation, etc. In this talk, we will introduce MLflow Pipelines, an opinionated approach for MLOps. It provides predefined ML pipeline templates for common ML problems and opinionated development workflows to help data scientists bootstrap ML projects, accelerate model development, and ship production-grade code with little help from production engineers.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Financial Services Experience at Data + AI Summit 2022

Financial Services Experience at Data + AI Summit 2022

2022-07-19 Watch
video

The future of Financial Services is open with data and AI at its core. Welcome data teams and executives in Financial Services! This year’s Data + AI Summit is jam-packed with talks, demos and discussions on how Financial Services leaders are harnessing the power of data and analytics to digitally transform, minimize risk, accelerate time to market and drive sustainable value creation To help you take full advantage of the Financial Services industry experience at Summit, we’ve curated all the programs in one place.

Highlights at this year’s Summit:

Financial Services Industry Forum: Our flagship event for Financial Services attendees at Summit featuring keynotes and panel discussions with ADP, Northwestern Mutual, Point72 Asset Management, S&P Global and EY, followed by networking. More details in the agenda below. Financial Services Lounge: Stop by our lounge located outside the Expo floor to meet with Databricks’ industry experts and see solutions from our partners including Accenture, Avanade, Deloitte and others. Session Talks: Over 15 technical talks and demos on topics including hyper-personalization, AI-fueled forecasting, enterprise analytics in cloud, scaling privacy and cybersecurity, MLOps in cryptocurrency, ethical credit scoring and more.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Simplify Global DataOps and MLOps Using Okta’s FIG Automation Library

Simplify Global DataOps and MLOps Using Okta’s FIG Automation Library

2022-07-19 Watch
video

Think for a moment about an ML pipeline that you have created. Was it tedious to write? Did you have to familiarize yourself with technology outside your normal domain? Did you find many bugs? Did you give up with a “good enough” solution? Even simple ML pipelines are tedious. Complex ML pipelines make teams that include Data Engineers and ML Engineers still end up with delays and bugs. Okta’s FIG (Feature Infrastructure Generator) simplifies this with a configuration language for Data Scientists that produces scalable and correct ML pipelines, even highly complex ones. FIG is “just a library” in the sense that you can PIP install it. Once installed, FIG will configure your AWS account, creating ETL jobs, workflows, and ML training and scoring jobs. Data Scientists then use FIG’s configuration language to specify features and model integrations. With a single function call, FIG will run an ML pipeline to generate feature data, train models, and create scoring data. Feature generation is performed in a scalable, efficient, and temporally correct manner. Model training artifacts and scoring are automatically labeled and traced. This greatly simplifies the ML prototyping experience. Once it is time to productionize a model, FIG is able to use the same configuration to coordinate with Okta’s deployment infrastructure to configure production AWS accounts, register build and model artifacts, and setup monitoring. This talk will show a demo of using FIG in the development of Okta’s next generation security infrastructure. The demo includes a walkthrough of the configuration language and how that is translated into AWS during a prototyping session. The demo will also briefly cover how FIG interacts with Okta’s deployment system to make productionization seamless.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/