talk-data.com

Event

Databricks DATA + AI Summit 2023

2026-01-11 YouTube Visit website ↗

Activities tracked

287

Filtering by: AI/ML

Sessions & talks

Showing 76–100 of 287 · Newest first

Streamlining API Deploy ML Models Across Multiple Brands: Ahold Delhaize's Experience on Serverless

2023-07-26
Basak Eskili, Maria Vechtomova (Marvelous MLOps)

At Ahold Delhaize, we have 19 local brands. Most of our brands have common goals, such as providing personalized offers to their customers, a better search engine on e-commerce websites, and forecasting models to reduce food waste and ensure availability. As a central team, our goal is to standardize the way of working across all of these brands, including the deployment of machine learning models. To this end, we have adopted Databricks as our standard platform for our batch inference models.

However, API deployment for real-time inference models remained challenging due to the varying capabilities of our brands. Our attempts to standardize API deployments with different tools failed due to the complexity of our organization. Fortunately, Databricks has recently introduced a new feature: serverless API deployment. Since all our brands already use Databricks, this feature was easy to adopt. It allows us to easily reuse API deployment across all of our brands, significantly reducing time to market (from 6-12 months to one month), increasing efficiency, and reducing costs. In this session, you will see the solution architecture, a sample use case (a cross-sell model deployed to four different brands), and API deployment using Databricks Serverless API with a custom model.

Talk by: Maria Vechtomova and Basak Eskili

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

US Army Corp of Engineers Enhanced Commerce & National Sec Through Data-Driven Geospatial Insight

2023-07-26

The US Army Corps of Engineers (USACE) is responsible for maintaining and improving nearly 12,000 miles of shallow-draft (9'-14') inland and intracoastal waterways, 13,000 miles of deep-draft (14' and greater) coastal channels, and 400 ports, harbors, and turning basins throughout the United States. Because these components of the national waterway network are considered assets to both US commerce and national security, they must be carefully managed to keep marine traffic operating safely and efficiently.

The National DQM Program is tasked with providing USACE a nationally standardized remote monitoring and documentation system across multiple vessel types with timely data access, reporting, dredge certifications, data quality control, and data management. Government systems have often lagged commercial systems in modernization efforts, and the emergence of the cloud and Data Lakehouse Architectures has empowered USACE to successfully move into the modern data era.

This session incorporates aspects of these topics:

  • Data Lakehouse Architecture: Delta Lake, platform security and privacy, serverless, administration, data warehouse, Data Lake, Apache Iceberg, Data Mesh
  • GIS: H3, MOSAIC, spatial analysis
  • Data engineering: data pipelines, orchestration, CDC, medallion architecture, Databricks Workflows, data munging, ETL/ELT, lakehouses, data lakes, Parquet, Data Mesh, Apache Spark™ internals
  • Data streaming: Apache Spark Structured Streaming, real-time ingestion, real-time ETL, real-time ML, real-time analytics, real-time applications, Delta Live Tables
  • ML: PyTorch, TensorFlow, Keras, scikit-learn, Python and R ecosystems
  • Data governance: security, compliance, RMF, NIST
  • Data sharing: sharing and collaboration, Delta Sharing, data cleanliness, APIs

Talk by: Jeff Mroz


Accelerating the Development of Viewership Personas with a Unified Feature Store

2023-07-26

With the proliferation of video content and flourishing consumer demand, there is an enormous opportunity for customer-centric video entertainment companies to use data and analytics to understand what their viewers want and deliver more of the content that meets their needs.

At DIRECTV, our Data Science Center of Excellence is constantly looking to push the boundary of innovation in how we can better and more quickly understand the needs of our customers and leverage those actionable insights to deliver business impact. One way in which we do so is through the development of Viewership Personas with cluster analysis at scale to group our customers by the types of content they enjoy watching. This process is significantly accelerated by a unified feature store, which contains a wide array of features that capture key information on viewing preferences.

This talk will focus on how the DIRECTV Data Science team utilizes Databricks to help develop a unified feature store, and how we leverage the feature store to accelerate the process of running machine learning algorithms to find meaningful viewership clusters.

Talk by: Malav Shah and Taylor Hosbach


AI-Accelerated Delta Tables: Faster, Easier, Cheaper

2023-07-26

In this session, learn about recent releases for Delta Tables and the upcoming roadmap. Learn how to leverage AI to get blazing fast performance from Delta, without requiring users to do time-consuming and complicated tuning themselves. Recent releases like Predictive I/O and Auto Tuning for Optimal File Sizes will be covered, as well as the exciting roadmap of even more intelligent capabilities.

Talk by: Sirui Sun and Vijayan Prabhakaran


Bridging the Production Gap: Develop and Deploy Code Easily With IDEs

2023-07-26
Fabian Jakobs (Databricks), Saad Ansari (Databricks)

Hear from customers how they are using software development best practices to combine the best of Integrated Development Environments (IDEs) with Databricks. See the latest developments that unlock key productivity gains from IDEs like code linters, AI code assistants and integrations with CI/CD tools to make going to production smoother and more reliable.

Attend this session to learn how to use IDEs with Databricks and take advantage of:

  • Native development - Write code, edit files and run on Databricks with the familiarity of your favorite IDE with DB Connect
  • Interactive debugging - Step through code in a cluster to quickly pinpoint and fix errors so that code is more robust and easily maintained
  • CI/CD pipelines - Set up and manage your CI/CD pipelines using the new CLI
  • IDE ecosystems - Use familiar integrations to streamline code reviews and deploy code faster

Sign up today to boost your productivity by combining your favorite IDE with the scale of Databricks.

Talk by: Saad Ansari and Fabian Jakobs


Data & AI Products on Databricks: Making Data Engineering & Consumption Self-Service Data Platforms

2023-07-26

Our client, a large IT and business consulting firm, embarked on a journey to create “Data As a Product” for both their internal and external stakeholders. In this project, Infosys took a data platform approach and leveraged Delta Sharing, API endpoints, and Unity Catalog to effectively create a realization of Data and AI Products (Data Mesh) architecture. This session presents the three primary design patterns used, providing valuable insights for your evolution toward a no-code/low-code approach.

Talk by: Ankit Sharma


Databricks Marketplace: Going Beyond Data and Applications

2023-07-26

The demand for third-party data has never been greater, but existing marketplaces simply aren't cutting it. You deserve more than being locked into a walled garden of just data sets and simple applications. You deserve an open marketplace to exchange ML models, notebooks, datasets and more. The Databricks Marketplace is the ultimate solution for your data, AI and analytics needs, powered by open source Delta Sharing. Databricks is revolutionizing the data marketplace space.

Join us for a demo-filled session and learn how Databricks Marketplace is exactly what you need in today’s AI-driven innovation ecosystem. Hear from customers on how Databricks is empowering organizations to leverage shared knowledge and take their analytics and AI to new heights. Take advantage of this rare opportunity to ask questions of the Databricks product team that is building the Databricks Marketplace.

Talk by: Mengxi Chen and Darshana Sivakumar


Journey to Real-Time ML: A Look at Feature Platforms & Modern RT ML Architectures Using Tecton

2023-07-26

Are you struggling to keep up with the demands of real-time machine learning? Like most organizations building real-time ML, you’re probably looking for a better way to manage the lifecycle of ML models and features; implement batch, streaming, and real-time data pipelines; generate accurate training datasets; and serve models and data online with strict SLAs, supporting millisecond latencies and high query volumes. Look no further. In this session, we will unveil a modern technical architecture that simplifies the process of managing real-time ML models and features.

Using MLflow and Tecton, we’ll show you how to build a robust MLOps platform on Databricks that can easily handle the unique challenges of real-time data processing. Join us to discover how to streamline the lifecycle of ML models and features, implement data pipelines with ease, and generate accurate training datasets with minimal effort. See how to serve models and data online with mission-critical speed and reliability, supporting millisecond latencies and high query volumes.

Take a firsthand look at how FanDuel uses this solution to power their real-time ML applications, from responsible gaming to content recommendations and marketing optimization. See for yourself how this system can be used to define features, train models, process streaming data, and serve both models and features online for real-time inference with a live demo. Join us to learn how to build a modern MLOps platform for your real-time ML use cases.

Talk by: Mike Del Balso and Morgan Hsu


Leveraging Machine Learning on Databricks to Deliver Best in Class Customer Engagement

2023-07-26
Ryan Kennedy, Raja Lanka (Morgan Stanley)

In today's competitive business environment, customer engagement is a top priority for organizations looking to retain and grow their customer base. In this session, we will showcase how we used Databricks, a powerful machine learning platform, to build and deploy distributed deep learning models using Apache Spark™ and Horovod for best-in-class customer engagement. We will discuss the challenges we faced and the solutions we implemented, including data preparation, model training, and model deployment. We will also share the results of our efforts, including increased customer retention and improved customer satisfaction. Attendees will walk away with practical tips and best practices for using Databricks to drive customer engagement in their own organizations. In this session we will:

  • Explore Morgan Stanley’s approach to best-in-class customer engagement
  • Discuss how data and technology were leveraged to help solve the business problem
  • Share our experience using Databricks to build and deploy machine learning models for customer engagement
  • Provide practical tips and best practices for using Databricks in a production environment

Talk by: Raja Lanka and Ryan Kennedy


Planning and Executing a Snowflake Data Warehouse Migration to Databricks

2023-07-26

Organizations are going through a critical phase of data infrastructure modernization, laying the foundation for the future, and adapting to support growing data and AI needs. Organizations that embraced cloud data warehouses (CDW) such as Snowflake have ended up trying to use a data warehousing tool for ETL pipelines and data science. This created unnecessary complexity and resulted in poor performance since data warehouses are optimized for SQL-based analytics only.

Realizing the limitation and pain with cloud data warehouses, organizations are turning to a lakehouse-first architecture. Though a cloud platform to cloud platform migration should be relatively easy, the breadth of the Databricks platform provides flexibility and hence requires careful planning and execution. In this session, we present the migration methodology, technical approaches, automation tools, product/feature mapping, a technical demo and best practices using real-world case studies for migrating data, ELT pipelines and warehouses from Snowflake to Databricks.

Talk by: Satish Garla and Ramachandran Venkat


Scaling AI Applications with Databricks, HuggingFace and Pinecone

2023-07-26

The production and management of large-scale vector embeddings can be a challenging problem. The integration of Databricks, Hugging Face and Pinecone offers a powerful solution. Vector embeddings have become an essential tool in the development of AI-powered applications. Embeddings are representations of data learned by machine learning models. High-quality embeddings are unlocking use cases like semantic search, recommendation engines, and anomaly detection. Databricks' Apache Spark™ ecosystem together with Hugging Face's Transformers library enables large-scale vector embeddings production using GPU processing. Pinecone's vector database provides ultra-low latency querying and upserting of billions of embeddings, allowing for high-quality embeddings at scale for real-time AI apps.

In this session, we will present a concrete use case of this integration in the context of a natural language processing application. We will demonstrate how Pinecone's vector database can be integrated with Databricks and Hugging Face to produce large-scale vector embeddings of text data and how these embeddings can be used to improve the performance of various AI applications. You will see the benefits of this integration in terms of speed, scalability, and cost efficiency. By leveraging the GPU processing capabilities of Databricks and the ultra low-latency querying capabilities of Pinecone, we can significantly improve the performance of NLP tasks while reducing the cost and complexity of managing large-scale vector embeddings. You will learn about the technical details of this integration and how it can be implemented in your own AI projects, and gain insights into the speed, scalability, and cost efficiency benefits of using this solution.
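As a rough illustration of the semantic search use case mentioned above, the sketch below ranks documents by cosine similarity between embedding vectors. It is not taken from the talk: the three-dimensional vectors and document IDs are made up, and a real pipeline would presumably produce embeddings with a Hugging Face model and query them through a vector database such as Pinecone rather than an in-memory dictionary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query vector."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
index = {
    "doc-returns": [0.9, 0.1, 0.0],
    "doc-shipping": [0.1, 0.9, 0.0],
    "doc-careers": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # hypothetical embedding of a user question

results = top_k(query, index, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

A brute-force scan like this is linear in the number of documents, which is why systems holding billions of embeddings rely on a dedicated vector index instead.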

Talk by: Roie Schwaber-Cohen


Sponsored: AWS|Build Generative AI Solution on Open Source Databricks Dolly 2.0 on Amazon SageMaker

2023-07-26

Create a custom chat-based solution to query and summarize your data within your VPC using Dolly 2.0 and Amazon SageMaker. In this talk, you will learn about Dolly 2.0, Databricks' state-of-the-art, open source LLM available for commercial use, and Amazon SageMaker, AWS’s premier toolkit for ML builders. You will learn how to deploy and customize models to reference your data using retrieval augmented generation (RAG) and additional fine-tuning techniques, all using open-source components available today.

Talk by: Venkat Viswanathan and Karl Albertsen

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz


What’s New With Platform Security and Compliance in the Databricks Lakehouse Platform

2023-07-26
David Veuve (Databricks), Samrat Ray (Databricks)

At Databricks, we know that data is one of your most valuable assets and must always be protected. That’s why security is built into every layer of the Databricks Lakehouse Platform. Databricks provides comprehensive security to protect your data and workloads, such as encryption, network controls, data governance and auditing.

In this session, you will hear from Databricks product leaders on the platform security and compliance progress made over the past year, with demos on how administrators can start protecting workloads fast. You will also learn more about the roadmap that delivers on the Databricks commitment to you as the most trusted, compliant, and secure data and AI platform with the Databricks Lakehouse.

Talk by: Samrat Ray and David Veuve


Colossal AI: Scaling AI Models in Big Model Era

2023-07-26

The proliferation of large models based on Transformer has outpaced advances in hardware, resulting in an urgent need for the ability to distribute enormous models across multiple GPUs. Despite this growing demand, best practices for choosing an optimal strategy are still lacking due to the breadth of knowledge required across HPC, DL, and distributed systems. These difficulties have stimulated both AI and HPC developers to explore the key questions: How can training and inference efficiency of large models be improved to reduce costs? How can larger AI models be accommodated even with limited resources?

What can be done to enable more community members to easily access large models and large-scale applications? In this session, we investigate efforts to solve the questions mentioned above. Firstly, diverse parallelization is an important tool to improve the efficiency of large model training and inference. Heterogeneous memory management can help enhance the model accommodation capacity of processors like GPUs.

Furthermore, user-friendly DL systems for large models significantly reduce the specialized background knowledge users need, allowing more community members to get started with larger models more efficiently. We will provide participants with a system-level open-source solution, Colossal-AI. More information can be found at https://github.com/hpcaitech/ColossalAI.

Talk by: James Demmel and Yang You


Enterprise Use of Generative AI Needs Guardrails: Here's How to Build Them

2023-07-26

Large Language Models (LLMs) such as ChatGPT have revolutionized AI applications, offering unprecedented potential for complex real-world scenarios. However, fully harnessing this potential comes with unique challenges such as model brittleness and the need for consistent, accurate outputs. These hurdles become more pronounced when developing production-grade applications that utilize LLMs as a software abstraction layer.

In this session, we will tackle these challenges head-on. We introduce Guardrails AI, an open-source platform designed to mitigate risks and enhance the safety and efficiency of LLMs. We will delve into specific techniques and advanced control mechanisms that enable developers to optimize model performance effectively. Furthermore, we will explore how implementing these safeguards can significantly improve the development process of LLMs, ultimately leading to safer, more reliable, and robust real-world AI applications.

Talk by: Shreya Rajpal


Navigating the Complexities of LLMs: Insights from Practitioners

2023-07-26
Ankit Mathur (Databricks), Eric Peter (Databricks), Salman Mohammed, Sai Ravuru

Interested in diving deeper into the world of large language models (LLMs) and their real-life applications? In this session, we bring together our experienced team members and some of our esteemed customers to talk about their journey with LLMs. We'll delve into the complexities of getting these models to perform accurately and efficiently, the challenges, and the dynamic nature of LLM technology as it constantly evolves. This engaging conversation will offer you a broader perspective on how LLMs are being applied across different industries and how they’re revolutionizing our interaction with technology. Whether you're well-versed in AI or just beginning to explore, this session promises to enrich your understanding of the practical aspects of LLM implementation.

Talk by: Sai Ravuru, Eric Peter, Ankit Mathur, and Salman Mohammed


Simplifying Lakehouse Observability: Databricks Key Design Goals and Strategies

2023-07-26
Michael Milirud (Databricks)

In this session, we'll explore Databricks' vision for simplifying lakehouse observability, a critical component of any successful data, analytics, and machine learning initiative. By directly integrating observability solutions within the lakehouse, Databricks aims to provide users with the tools and insights needed to run a successful business on top of the lakehouse.

Our approach is designed to leverage existing expertise and simplify the process of monitoring and optimizing data and ML workflows, enabling teams to deliver sustainable and scalable data and AI applications. Join us to learn more about our key design goals and how Databricks is streamlining lakehouse observability to support the next generation of data-driven applications.

Talk by: Michael Milirud


Sponsored by: Dataiku | Have Your Cake and Eat it Too with Dataiku + Databricks

2023-07-26

In this session, we will highlight all parts of the analytics lifecycle using Dataiku + Databricks. Explore, blend, and prepare source data, train a machine learning model and score new data, and visualize and publish results — all using only Dataiku's visual interface. Plus, we will use LLMs for everything from simple data prep to sophisticated development pipelines. Attend and learn how you can truly have it all with Dataiku + Databricks.

Talk by: Amanda Milberg


Apache Spark™ Streaming and Delta Live Tables Accelerates KPMG Clients For Real Time IoT Insights

2023-07-26

Unplanned downtime in manufacturing costs firms up to a trillion dollars annually. Time that materials spend sitting on a production line is lost revenue. Even just 15 hours of downtime a week adds up to nearly 800 hours of downtime yearly. The use of Internet of Things (IoT) devices can cut this time down by providing details of machine metrics. However, IoT predictive maintenance is challenged by the lack of effective, scalable infrastructure and machine learning solutions. IoT data can amount to multiple terabytes per day and can come in a variety of formats. Furthermore, without any insights and analysis, this data becomes just another table.

The KPMG Databricks IoT Accelerator is a comprehensive solution that gives manufacturing plant operators a bird’s eye view of their machines’ health and empowers proactive machine maintenance across their portfolio of IoT devices. The Databricks Accelerator ingests IoT streaming data at scale and implements the Databricks Medallion architecture while leveraging Delta Live Tables to clean and process data. Real-time machine learning models are developed from IoT machine measurements and are managed in MLflow. The AI predictions and IoT device readings are compiled in the gold table powering downstream dashboards in tools like Tableau. Dashboards inform machine operators of not only machines’ ailments, but also actions they can take to mitigate issues before they arise. Operators can see fault history to aid in understanding failure trends, and can filter dashboards by fault type, machine, or specific sensor reading.

Talk by: MacGregor Winegard


Clean Room Primer: Using Databricks Clean Rooms to Use More & Better Data in your BI, ML, & Beyond

2023-07-26

In this session, we will discuss the foundational changes in the ecosystem, the implications of data insights on marketing, analytics, and measurement, and how companies are coming together to collaborate through data clean rooms in new and exciting ways to power mutually beneficial value for their businesses while preserving privacy and governance.

We will delve into the concept and key features of clean room technology and how they can be used to access more and better data for business intelligence (BI), machine learning (ML), and other data-driven initiatives. By examining real-world use cases of clean rooms in action, attendees will gain a clear understanding of the benefits they can bring to industries like CPG, retail, and media and entertainment. In addition, we will unpack the advantages of using Databricks as a clean room platform, specifically showcasing how interoperable clean rooms can be leveraged to enhance BI, ML and other compute scenarios. By the end of this session, you will be equipped with the knowledge and inspiration to explore how clean rooms can unlock new collaboration opportunities that drive better outcomes for your business and improved experiences for consumers.

Talk by: Matthew Karasick and Anil Puliyeril


Data Biases and Generative AI: A Practitioner Discussion

2023-07-26

Join this panel discussion that unpacks the technical challenges surrounding biases found in data, and poses potential solutions and strategies for the future including Generative AI. This session is a showcase highlighting diverse perspectives in the data and AI industry.

Talk by: Adi Polak, Gavita Regunath, Christina Taylor, and Layla Yang


Demonstrate-Search-Predict: Composing Retrieval and Language Models for Knowledge-Intensive NLP

2023-07-26 Watch
video

In this talk, you will learn how retrieval-augmented in-context learning has emerged as a powerful approach for addressing knowledge-intensive tasks using frozen language models (LMs) and retrieval models (RMs). Existing work has combined these in simple “retrieve-then-read” pipelines, in which the RM retrieves passages that are inserted into the LM prompt.

To begin to fully realize the potential of frozen LMs and RMs, we propose Demonstrate–Search–Predict (DSP), a framework that relies on passing natural language texts in sophisticated pipelines between an LM and an RM. DSP can express high-level programs that bootstrap pipeline-aware demonstrations, search for relevant passages, and generate grounded predictions, systematically breaking down problems into small transformations that the LM and RM can handle more reliably.

We have written novel DSP programs for answering questions in open-domain, multi-hop, and conversational settings, establishing in early evaluations new state-of-the-art in-context learning results and delivering 37–125%, 8–40%, and 80–290% relative gains against vanilla LMs, a standard retrieve-then-read pipeline, and a contemporaneous self-ask pipeline, respectively.
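To make the contrast concrete, here is a minimal Python sketch of a "retrieve-then-read" baseline next to a two-hop pipeline that passes text back and forth between an LM and an RM. It is an illustration only, not the DSP framework's API: the function names (`retrieve`, `build_prompt`, `retrieve_then_read`, `multi_hop`), the word-overlap retriever, and the prompt wording are all assumptions, and the `lm` argument stands in for any callable that maps a prompt string to generated text.

```python
import re
from typing import Callable, List, Set

Passages = List[str]
LM = Callable[[str], str]  # hypothetical interface: prompt in, text out


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: Passages, k: int = 1) -> Passages:
    """Toy retrieval model: rank passages by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: Passages) -> str:
    """Insert the retrieved passages into the LM prompt as numbered context."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def retrieve_then_read(question: str, corpus: Passages, lm: LM, k: int = 1) -> str:
    """Baseline: one retrieval call, then one LM call."""
    return lm(build_prompt(question, retrieve(question, corpus, k)))


def multi_hop(question: str, corpus: Passages, lm: LM, hops: int = 2, k: int = 1) -> str:
    """Multi-hop variant: the LM writes a follow-up query between retrievals."""
    passages: Passages = []
    query = question
    for hop in range(hops):
        for passage in retrieve(query, corpus, k):
            if passage not in passages:
                passages.append(passage)
        if hop < hops - 1:
            # Ask the LM for the next search query, given the evidence so far.
            query = lm(build_prompt(
                f"Write a search query that helps answer: {question}", passages))
    return lm(build_prompt(question, passages))
```

Keeping the LM and RM behind plain callables means each step stays a small text-to-text transformation, so a real model or index could be swapped in without changing the pipeline shape.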

Talk by: Keshav Santhanam

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp Databricks named a Leader in 2022 Gartner® Magic Quadrant™ CDBMS: https://dbricks.co/3phw20d

JoinBoost: In Data Base Machine Learning for Tree-Models

2023-07-26 Watch
video

Data and machine learning (ML) are crucial for enterprise operations. Enterprises store data in databases for management and use ML to gain business insights. However, there is a mismatch between the way ML expects data to be organized (a single table) and the way data is organized in databases (a join graph of multiple tables), which leads to inefficiencies when joining and materializing tables.

In this talk, you will see how we successfully address this issue. We introduce JoinBoost, a lightweight Python library that trains tree models (such as random forests and gradient boosting) over join graphs in databases. JoinBoost acts as a query rewriting layer that is compatible with cloud databases and eliminates the need for costly join materialization.
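As a rough illustration of how a query rewrite can avoid materializing a join, the sketch below computes per-group counts and sums of a target column two ways: by aggregating over the full join, and by aggregating the fact table first and joining only the small result. This is not JoinBoost's API or the SQL it generates. The tables (`stores`, `sales`), the columns, and the data are hypothetical, and SQLite from the Python standard library stands in for a cloud database. Counts and sums per candidate split value are the kind of statistics a tree learner needs to score a split.

```python
import sqlite3

# Hypothetical two-table join graph: a fact table (sales) and a dimension (stores).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stores (store_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE sales  (store_id INTEGER, y REAL);
INSERT INTO stores VALUES (1, 'north'), (2, 'north'), (3, 'south');
INSERT INTO sales  VALUES (1, 10.0), (1, 20.0), (2, 30.0), (3, 5.0), (3, 15.0);
""")

# Aggregate over the full join: every sales row is joined before grouping.
JOIN_THEN_AGGREGATE = """
SELECT s.region, COUNT(*) AS n, SUM(f.y) AS sum_y
FROM sales f JOIN stores s ON f.store_id = s.store_id
GROUP BY s.region ORDER BY s.region
"""

# Rewritten form: aggregate sales per join key first, then join the
# much smaller intermediate result with the dimension table.
AGGREGATE_THEN_JOIN = """
SELECT s.region, SUM(a.n) AS n, SUM(a.sum_y) AS sum_y
FROM (SELECT store_id, COUNT(*) AS n, SUM(y) AS sum_y
      FROM sales GROUP BY store_id) a
JOIN stores s ON a.store_id = s.store_id
GROUP BY s.region ORDER BY s.region
"""


def split_stats(query: str):
    """Run a query and return (region, count, sum_y) tuples."""
    return [(region, int(n), float(sum_y))
            for region, n, sum_y in conn.execute(query)]


full_join_stats = split_stats(JOIN_THEN_AGGREGATE)
rewritten_stats = split_stats(AGGREGATE_THEN_JOIN)
```

Both queries return the same statistics, but the second one never builds the row-per-sale join result, which is where the savings would come from on large fact tables.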

Talk by: Zachary Huang

Lakehouse Architecture to Advance Security Analytics at the Department of State

2023-07-26 Watch
video

In 2023, the Department of State surged forward on implementing a lakehouse architecture to get faster, smarter, and more effective on cybersecurity log monitoring and incident response. In addition to getting us ahead of federal mandates, this approach promises to enable advanced analytics and machine learning across our highly federated global IT environment while minimizing costs associated with data retention and aggregation.

This talk will include a high-level overview of the technical and policy challenges and a deeper technical dive into the tactical implementation choices made. We’ll share lessons learned related to governance and securing organizational support, connecting multiple cloud environments, and standardizing data to make it useful for analytics. Finally, we’ll discuss how the lakehouse leverages Databricks in multicloud environments to promote decentralized ownership of data while enabling strong, centralized data governance practices.

Talk by: Timothy Ahrens and Edward Moe

Sponsored by: Infosys | Topaz AI First Innovations

2023-07-26 Watch
video
Neeraj Dixit (Infosys)

This session offers insights into Infosys' Topaz AI First Innovations, including AI-enabled Analytics and AI-enabled Automation, which help clients achieve significant cost savings, improved efficiency, and better customer experience across industry segments.

Talk by: Neeraj Dixit

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz
