Databricks DATA + AI Summit 2023

Lineage System Table in Unity Catalog

2023-07-26 Watch

video

Menglei Sun

API Data Governance Data Lakehouse Databricks Delta Python

Unity Catalog provides fully automated data lineage for all workloads in SQL, R, Python, Scala and across all asset types at Databricks. The aggregated view has been available to end users through data explorer and API. In this session, we are excited to share that lineage is available via delta table in their UC metastore. It stores full history of recent lineage records and it is near real time. Additionally, customers can query it through standard SQL interface. With that, customers can get significant operational insights about their workload for impact analysis, troubleshooting, quality assurance, data discovery, and data governance.

Together with the system table platform effort, which provides query history, job run operational data, audit logs and more, lineage table will be a critical piece to link all the data asset and entity asset together, providing better lakehouse observability and unification to customers.

Talk by: Menglei Sun

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Processing Prescriptions at Scale at Walgreens

2023-07-26 Watch

video

Daniel Zafar

API Cosmos Data Engineering Data Lakehouse Databricks Microsoft

We designed a scalable Spark Streaming job to manage 100s of millions of prescription-related operations per day at an end-to-end SLA of a few minutes and a lookup time of one second using CosmosDB.

In this session, we will share not only the architecture, but the challenges and solutions to using the Spark Cosmos connector at scale. We will discuss usages of the Aggregator API, custom implementations of the CosmosDB connector, and the major roadblocks we encountered with the solutions we engineered. In addition, we collaborated closely with Cosmos development team at Microsoft and will share the new features which resulted. If you ever plan to use Spark with Cosmos, you won't want to miss these gotchas!

Talk by: Daniel Zafar

Here’s more to explore: Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Rapidly Scaling Applied AI/ML with Foundational Models and Applying Them to Modern AI/ML Use Cases

2023-07-26 Watch

video

Nick King (Snowplow)

AI/ML Databricks LLM

Today many of us are familiar with foundational models such as LLM/ChatGPT. However, there are many more enterprise foundational models that can be rapidly deployed, trained and applied to enterprise use cases. This approach dramatically increases the performance of AI/ML models in production, but also gives AI teams rapid roadmaps for efficiency and delivering value to the business. Databricks provides the ideal toolset to enable this approach.

In this session, we will provide a logically overview of foundational models available today, demonstrate a real-world use case, and provide a business framework for data scientists and business leaders to collaborate to rapidly deploy these use cases.

Talk by: Nick King

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksin

Real-Time Streaming Solution for Call Center Analytics: Business Challenges and Technical Enablement

2023-07-26 Watch

video

Natalia Demidova

Analytics Azure BI Cloud Computing Data Lake Data Lakehouse

A large international client with a business footprint in North America, Europe and Africa reached out to us with an interest in having a real-time streaming solution designed and implemented for its call center handling incoming and outgoing client calls. The client had a previous bad experience with another vendor, who overpromised and underdelivered on the latency of the streaming solution. The previous vendor delivered an over-complex streaming data pipeline resulting in the data taking over five minutes to reach a visualization layer. The client felt that architecture was too complex and involved many services integrated together.

Our immediate challenges involved gaining the client's trust and proving that our design and implementation quality would supersede a previous experience. To resolve an immediate challenge of the overly complicated pipeline design, we deployed a Databricks Lakehouse architecture with Azure Databricks at the center of the solution. Our reference architecture integrated Genesys Cloud : App Services : Event Hub : Databricks : : Data Lake : Power BI.

The streaming solution proved to be low latency (seconds) during the POV stage, which led to subsequent productionalization of the pipeline with deployment of jobs, DLTs pipeline, including multi-notebook workflow and business and performance metrics dashboarding relied on by the call center staff for a day-to-day performance monitoring and improvements.

Talk by: Natalia Demidova

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Sponsored: Accenture | Factory of the Future: Building Digital Twins Using Knowledge Graphs & Gen AI

Streaming Data Analytics with Power BI and Databricks

2023-07-26 Watch

video

Liping Huang , Marius Panga

Analytics BI Data Analytics Databricks Power BI Data Streaming

This session is comprised of a series of end-to-end technical demos illustrating the synergy between Databricks and Power BI for streaming use cases, and considerations around when to choose which scenario:

Scenario 1: DLT + Power BI Direct Query and Auto Refresh

Scenario 2: Structured Streaming + Power BI streaming datasets

Scenario 3: DLT + Power BI composite datasets

Talk by: Liping Huang and Marius Panga

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Testing Generative AI Models: What You Need to Know

2023-07-26 Watch

video

Yaron Singer

AI/ML AWS Databricks GenAI LLM MLOps

Generative AI shows incredible promise for enterprise applications. The explosion of generative AI can be attributed to the convergence of several factors. Most significant is that the barrier to entry has dropped for AI application developers through customizable prompts (few-shot learning), enabling laypeople to generate high-quality content. The flexibility of models like ChatGPT and DALLE-2 have sparked curiosity and creativity about new applications that they can support. The number of tools will continue to grow in a manner similar to how AWS fueled app development. But excitement must be tampered by concerns about new risks imposed to business and society. Increased capability and adoption also increase risk exposure. As organizations explore creative boundaries of generative models, measures to reduce risk must be put in place. However, the enormous size of the input space and inherent complexity make this task more challenging than traditional ML models.

In this session, we summarize the new risks introduced by the new class of generative foundation models through several examples, and compare how these risks relate to the risks of mainstream discriminative models. Steps can be taken to reduce the operational risk, bias and fairness issues, and privacy and security of systems that leverage LLM for automation. We’ll explore model hallucinations, output evaluation, output bias, prompt injection, data leakage, stochasticity, and more. We’ll discuss some of the larger issues common to LLMs and show how to test for them. A comprehensive, test-based approach to generative AI development will help instill model integrity by proactively mitigating failure and the associated business risk.

Talk by: Yaron Singer

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Unleashing the Magic of Large Language Modeling with Dolly 2.0

2023-07-26 Watch

video

Gavita Regunath

AI/ML Databricks LLM MLOps

As the field of artificial intelligence continues to advance at an unprecedented pace, LLMs are becoming increasingly powerful and transformative. LLMs use deep learning techniques to analyze vast amounts of text data, and can generate language that is like human language. These models have been used for a wide range of applications, including language translation, chatbots, text summarization, and more.

Dolly 2.0 is the first open-source, instruction-following LLM that has been fine-tuned on a human-generated instruction dataset – with zero chance of copyright implications. This makes it an ideal tool for research and commercial use, and opens up new possibilities for businesses looking to streamline their operations and enhance their customer service offerings.

In this session, we will provide an overview of Dolly 2.0, discuss its features and capabilities, and showcase its potential through a demo of Dolly in action. Attendees will gain insights into the LLMs, and learn how to maximize the impact of this cutting-edge technology in their organizations. By the end of the session, attendees will have a deep understanding of the capabilities of Dolly 2.0, and will be equipped with the knowledge they need to integrate LLMs into their own operations in order to achieve greater efficiency, productivity, and customer satisfaction.

Talk by: Gavita Regunath

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Weaving the Data Mesh in the Department of Defense

2023-07-26 Watch

video

Brad Corwin , Cody Ferguson

AI/ML Analytics Data Lakehouse Databricks

The Chief Digital and AI Office (CDAO) was created to lead the strategy and policy on data, analytics, and AI adoption across the Department of Defense. To enable that vision, the Department must achieve new ways to scale and standardize delivery under a global strategy while enabling decentralized workflows that capture the wealth of data and domain expertise.

CDAO’s strategy and goals are aligned with data mesh principles. This alignment starts with providing enterprise-level infrastructure and services to advance the adoption of data, analytics, and AI, creating the self-service data infrastructure as a platform. And it continues through implementing policy for federated computational governance centered around decentralizing data ownership to become domain-oriented but enforcing the quality and trustworthiness of data. CDAO seeks to expand and make enterprise data more accessible through providing data as a product and leveraging a federated data catalog to designate authoritative data and common data models. This results in domain-oriented, decentralized data ownership to empower the business domains across the Department to increase mission and business impact that result in significant cost savings, saving lives, and data serving as a “public good.”

Please join us in our session as we discuss how the CDAO leverages modern, innovative implementations that accelerate the delivery of data and AI throughout one of the largest distributed organizations in the world; the Department of Defense. We will walk through how this enables delivery in various Department of Defense use cases.

Talk by: Brad Corwin and Cody Ferguson

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Delta Sharing: The Key Data Mesh Enabler

2023-07-26 Watch

video

Francesco Pizzolon

Azure Cloud Computing Databricks Delta

Data Mesh is an emerging architecture pattern that challenges the centralized data platform approach by empowering different engineering teams to own the data products in a specific business domain. One of the keys to the success of any Data Mesh initiative is selecting the right protocol for Data Sharing between different business data domains that could potentially be implemented through different technologies and cloud providers.

In this session you will learn about how the Delta Sharing protocol and the Delta table format have enabled the historically stuck-in-the-past energy and construction industry to be catapulted to the 21st century by way of a modern Data Mesh implementation based on Azure Databricks.

Talk by: Francesco Pizzolon

Here’s more to explore: A New Approach to Data Sharing: https://dbricks.co/44eUnT1

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

How Mars Achieved a People Analytics Transformation with a Modern Data Stack

2023-07-26 Watch

video

Rachel Belino , Sreeharsha Alagani

AI/ML Analytics Data Governance Data Lakehouse Databricks Modern Data Stack

People Analytics at Mars was formed two years ago as part of an ambitious journey to transform our HR analytics capabilities. To transform, we needed to build foundational services to provide our associates with helpful insights through fast results and resolving complex problems. Critical in that foundation are data governance and data enablement which is the responsibility of the Mars People Data Office team whose focus is to deliver high quality and reliable data that is reusable for current and future People Analytics use cases. Come learn how this team used Databricks in helping Mars achieve its People Analytics Transformation.

Talk by: Rachel Belino and Sreeharsha Alagani

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Simplifying Migrations to Lakehouse

2023-07-26 Watch

video

Dan Smith

Data Lakehouse Databricks

This session will cover:

Challenges with legacy platforms
Perenti Databricks migration journey
Reimagining migrations the Databricks way
The Databricks migration methodology and approach

Talk by: Dan Smith

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Unlocking the Value of Data Sharing in Financial Services with Lakehouse

2023-07-26 Watch

video

Spencer Cook (Databricks)

Data Lakehouse Databricks Delta Data Streaming

The emergence of secure data sharing is already having a tremendous economic impact, in large part due to the increasing ease and safety of sharing financial data. McKinsey predicts that the impact of open financial data will be 1-4.5% of GDP globally by 2030. This indicates there is a narrowing window on a massive opportunity for financial institutions and it is critical that they prioritize data sharing. This session will first address the ways in which Delta Sharing and Unity Catalog on a Databricks Lakehouse architecture provides a simple and open framework for building a Secure Data Sharing platform in the financial services industry. Next we will use a Databricks environment to walk through different use cases for open banking data and secure data sharing, demonstrating how they will be implemented using Delta Sharing, Unity Catalog, and other parts of the Lakehouse platform. The use cases will include examples of new product features such as Databricks to Databricks sharing, change data feed and streaming on Delta Sharing, table/column lineage, and the Delta Sharing Excel plugin to demonstrate state of the art sharing capabilities.

In this session, we will discuss secure data sharing on Databricks Lakehouse and will demonstrate architecture and code for common sharing use cases in the finance industry.

Talk by: Spencer Cook

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Feeding the World One Plant at a Time

2023-07-26 Watch

video

Naveed Farooqui , Fahad Khan (Volt Active Data)

AI/ML CI/CD Data Management Databricks Delta LLM

Join this session to learn how the CVML and Data Platform team at BlueRiver Technology utilized Databricks to maximize savings on herbicide usage and revolutionize Precision Agriculture.

Blue River Technology is an agricultural technology company that uses computer vision and machine learning (CVML) to revolutionize the way crops are grown and harvested. BRT’s See & Spray technology, which uses CVML to identify and precisely determine whether the plant is a weed or a crop so it can deliver a small, targeted dose of herbicide directly to the plant, while leaving the crop unharmed. By using this approach, Blue River significantly reduces the amount of herbicides used in agriculture by over 70% and has a positive impact on the environment and human health.

The technical challenges we seek to overcome are: - Processing massive petabytes of proprietary data at scale and in real time. Equipment in the field can generate up to 40TBs of data per hour per machine. - Aggregating, curating and visualizing at scale data can often be convoluted, error-prone and complex. - Streamlining pipelines runs from weeks to hours to ensure continuous delivery of data. - Abstracting and automating the infra, deployment and data management from each program. - Building downstream data products based on descriptive analysis, predictive analysis or prescriptive analysis to drive the machine behavior.

The business questions we seek to answer for any machine are: - Are we getting the spray savings we anticipated? - Are we reducing the use of herbicide at the scale we expected? - Are spraying nozzles performing at the expected rate? - Finding the relevant data to troubleshoot new edge conditions. - Providing a simple interface for data exploration to both technical and non-technical personas to help improve our model. - Identifying repetitive and new faults in our machines. - Filtering out data based on certain incidents. - Identifying anomalies for e.g. sudden drop in spray saving, like frequency of broad spray suddenly is too high.

How we are addressing and plan to address these challenges: - Designating Databricks as our purposeful DB for all data - using the bronze, silver and gold layer standards. - Processing new machine logs using a Delta Live table as a source both in batch and incremental manner. - Democratize access for data scientists, product managers, data engineers who are not proficient with the robotic software stack via notebooks for quick development as well as real time dashboards.

Talk by: Fahad Khan and Naveed Farooqui

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksin

Sponsored: Kyvos | Analytics 100x Faster Lowest Cost w/ Kyvos & Databricks, Even on Trillions Rows

Activate Your Lakehouse with Unity Catalog

2023-07-26 Watch

video

Anup Segu

Analytics AWS Data Lakehouse Databricks Delta ETL/ELT

Building a lakehouse is straightforward today thanks to many open source technologies and Databricks. However, it can be taxing to extract value from lakehouses as they grow without robust data operations. Join us to learn how YipitData uses the Unity Catalog to streamline data operations and discover best practices to scale your own Lakehouse. At YipitData, our 15+ petabyte Lakehouse is a self-service data platform built with Databricks and AWS, supporting analytics for a data team of over 250. We will share how leveraging Unity Catalog accelerates our mission to help financial institutions and corporations leverage alternative data by:

Enabling clients to universally access our data through a spectrum of channels, including Sigma, Delta Sharing, and multiple clouds
Fostering collaboration across internal teams using a data mesh paradigm that yields rich insights
Strengthening the integrity and security of data assets through ACLs, data lineage, audit logs, and further isolation of AWS resources
Reducing the cost of large tables without downtime through automated data expiration and ETL optimizations on managed delta tables

Through our migration to Unity Catalog, we have gained tactics and philosophies to seamlessly flow our data assets internally and externally. Data platforms need to be value-generating, secure, and cost-effective in today's world. We are excited to share how Unity Catalog delivers on this and helps you get the most out of your lakehouse.

Talk by: Anup Segu

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

An API for Deep Learning Inferencing on Apache Spark™

2023-07-26 Watch

video

Lee Yang

API AWS Glue Big Data Databricks ETL/ELT LLM

Apache Spark is a popular distributed framework for big data processing. It is commonly used for ETL (extract, transform and load) across large datasets. Today, the transform stage can often include the application of deep learning models on the data. For example, common models can be used for classification of images, sentiment analysis of text, language translation, anomaly detection, and many other use cases. Applying these models within Spark can be done today with the combination of PySpark, Pandas_UDF, and a lot of glue code. Often, that glue code can be difficult to get right, because it requires expertise across multiple domains - deep learning frameworks, PySpark APIs, pandas_UDF internal behavior, and performance optimization.

In this session, we introduce a new, simplified API for deep learning inferencing on Spark, introduced in SPARK-40264 as a collaboration between NVIDIA and Databricks, which seeks to standardize and open source this glue code to make deep learning inference integrations easier for everyone. We discuss its design and demonstrate its usage across multiple deep learning frameworks and models.

Talk by: Lee Yang

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Best Data Warehouse is a Lakehouse: Databricks Achieves Ops Efficiency w/ Lakehouse Architecture

2023-07-26 Watch

video

Naveen Zutshi (Databricks) , Romit Jadhwani (Databricks)

Analytics Data Lakehouse Databricks DWH

At Databricks, we use the Lakehouse architecture to build an optimized data warehouse that drives better insights, increased operational efficiency, and reduces costs. In this session, Naveen Zutshi, CIO at Databricks and Romit Jadhwani, Senior Director Analytics and Integrations at Databricks will discuss the Databricks journey and provide technical and business insights into how these results were achieved.

The session will cover topics such as medallion architecture, building efficient third party integrations, how Databricks built various data products/services on the data warehouse, and how to use governance to break down data silos and achieve consistent sources of truth.

Talk by: Naveen Zutshi and Romit Jadhwani

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksin

Building AI-Powered Products with Foundation Models

2023-07-26 Watch

video

Vincent Chen

AI/ML Databricks LLM MLOps

Foundation models make for fantastic demos, but in practice, they can be challenging to put into production. These models work well over datasets that match common training distributions (e.g., generating WEBTEXT or internet images), but may fail on domain-specific tasks or long-tail edge case; the settings that matter most to organizations building differentiated products. We propose a data-centric development approach that organizations can use to adapt foundation models to their own private/proprietary datasets.

We'll describe several techniques, including supervision "warmstarts" and interactive prompting (spoiler alert: no code needed). To make these techniques come to life, we'll walk through real case studies describing how we've seen data-centric development drive AI-powered products, from "AI assist" use cases (e.g., copywriting assistants) to "fully automated" solutions (e.g., loan processing engines).

Talk by: Vincent Chen

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Building Apps on the Lakehouse with Databricks SQL

2023-07-26 Watch

video

Adriana Ispas , Chris Stevens

API BI CRM Data Lakehouse Databricks DWH

BI applications are undoubtedly one of the major consumers of a data warehouse. Nevertheless, the prospect of accessing data using standard SQL is appealing to many more stakeholders than just the data analysts. We’ve heard from customers that they experience an increasing demand to provide access to data in their lakehouse platforms from external applications beyond BI, such as e-commerce platforms, CRM systems, SaaS applications, or custom data applications developed in-house. These applications require an “always on” experience, which makes Databricks SQL Serverless a great fit.

In this session, we give an overview of the approaches available to application developers to connect to Databricks SQL and create modern data applications tailored to needs of users across an entire organization. We discuss when to choose one of the Databricks native client libraries for languages such as Python, Go, or node.js and when to use the SQL Statement Execution API, the newest addition to the toolset. We also explain when ODBC and JDBC might not be the best for the task and when they are your best friends. Live demos are included.

Talk by: Adriana Ispas and Chris Stevens

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Combining Privacy Solutions to Solve Data Access at Scale

2023-07-26 Watch

video

Maxime Agostini

Data Management Databricks DWH Marketing

The trend that has made data easier to collect and analyze has only aggravated privacy risks. Luckily, a range of privacy technologies have emerged to enable private data management; differential privacy, synthetic data, confidential computing. In isolation, those technologies have had a limited impact because they did not always bring the 10x improvement expected by data leaders.

Combining these privacy technologies has been the real game changer. We will demonstrate that the right mix of technologies brings the optimal balance of privacy and flexibility at the scale of the data warehouse. We will illustrate this by real-life applications of Sarus in three domains:

Healthcare: how to make hospital data available for research at scale in full compliance
Finance: how to pool data between several banks to fight criminal transactions
Marketing: how to build insights on combined data from partners and distributors

The examples will be illustrated using data stored in Databricks and queried using Sarus differential privacy engine.

Talk by: Maxime Agostini

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

talk-data.com

Databricks DATA + AI Summit 2023

Top Topics

Top Speakers

Lineage System Table in Unity Catalog

Processing Prescriptions at Scale at Walgreens

Rapidly Scaling Applied AI/ML with Foundational Models and Applying Them to Modern AI/ML Use Cases

Real-Time Streaming Solution for Call Center Analytics: Business Challenges and Technical Enablement

Sponsored: Accenture | Factory of the Future: Building Digital Twins Using Knowledge Graphs & Gen AI

Sponsored: Anomalo | Data Archaeology: Quickly Understand Unfamiliar Datasets Using Machine Learning

Sponsored: AWS-Real Time Stream Data & Vis Using Databricks DLT, Amazon Kinesis, & Amazon QuickSight

Sponsored: dbt Labs | Modernizing the Data Stack: Lessons Learned From Evolution at Zurich Insurance

Sponsored: Matillion - OurFamilyWizard Moves and Transforms Data for Databricks Delta Lake Easy

Streaming Data Analytics with Power BI and Databricks

Testing Generative AI Models: What You Need to Know

Unleashing the Magic of Large Language Modeling with Dolly 2.0

Weaving the Data Mesh in the Department of Defense

Delta Sharing: The Key Data Mesh Enabler

How Mars Achieved a People Analytics Transformation with a Modern Data Stack

Simplifying Migrations to Lakehouse

Unlocking the Value of Data Sharing in Financial Services with Lakehouse

Feeding the World One Plant at a Time

Sponsored: Kyvos | Analytics 100x Faster Lowest Cost w/ Kyvos & Databricks, Even on Trillions Rows

Activate Your Lakehouse with Unity Catalog

An API for Deep Learning Inferencing on Apache Spark™

Best Data Warehouse is a Lakehouse: Databricks Achieves Ops Efficiency w/ Lakehouse Architecture

Building AI-Powered Products with Foundation Models

Building Apps on the Lakehouse with Databricks SQL

Combining Privacy Solutions to Solve Data Access at Scale