talk-data.com

Event

Databricks DATA + AI Summit 2023

2026-01-11 · YouTube

Activities tracked

99

Filtering by: Cloud Computing

Sessions & talks

Showing 1–25 of 99 · Newest first

Databricks LakeFlow: A Unified, Intelligent Solution for Data Engineering. Presented by Bilal Aslam

2024-06-16 · Watch video
Bilal Aslam (Databricks)

Speaker: Bilal Aslam, Sr. Director of Product Management, Databricks

Bilal explains that everything starts with good data and outlines the three steps to good data: ingesting, transforming, and orchestrating it. He then announces Databricks LakeFlow, a unified solution for data engineering. With LakeFlow you can ingest data from databases, enterprise apps, and cloud sources; transform it in batch and real-time streaming; and confidently deploy and operate in production. Includes a live demo of Databricks LakeFlow.

To learn more about Databricks LakeFlow, see the announcement blog post: https://www.databricks.com/blog/introducing-databricks-lakeflow

Internet-Scale Analytics: Migrating a Mission Critical Product to the Cloud

2023-07-28 · Watch video

While we may not all agree on an “If it ain’t broke, don’t fix it” approach, we can all agree that “If it shows any crack, migrate it to the cloud and completely re-architect it.” Akamai’s CSI (Cloud Security Intelligence) group is responsible for processing the massive volume of security events arriving from our edge network, which is estimated to handle 30% of internet traffic, and for making that data accessible to various internal consumers powering customer-facing products.

In this session, we will walk through the reasons for migrating one of our mission-critical security products and its 10GB ingest pipeline to the cloud, examine our new architecture and its benefits, and touch on the challenges we faced during the process (and still face). While our requirements are unique and our solution contains a few proprietary components, this session will introduce several concepts built on popular off-the-shelf products that you can easily apply in your own cloud environment.

Talk by: Yaniv Kunda

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs


Sponsored: EY | Business Value Unleashed: Real-World Accelerating AI & Data-Centric Transformation

2023-07-28 · Watch video

Data and AI are revolutionizing industries and transforming businesses at an unprecedented pace. These advancements pave the way for groundbreaking outcomes such as fresh revenue streams, optimized working capital, and captivating, personalized customer experiences.

Join Hugh Burgin, Luke Pritchard and Dan Diasio as we explore a range of real-world examples of AI and data-driven transformation opportunities being powered by Databricks, including business value realized and technical solutions implemented. We will focus on how to integrate and leverage business insights, a diverse network of cloud-based solutions and Databricks to unleash new business value opportunities. By highlighting real-world use cases we will discuss:

  • Examples of how Manufacturing, Retail, Financial Services and other sectors are using Databricks services to scale AI, gain insights that matter and secure their data
  • The ways data monetization is changing how companies view data and incentivizing better data management
  • Examples of Generative AI and LLMs changing how businesses operate, how their customers engage, and what you can do about it

Talk by: Hugh Burgin and Luke Pritchard

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI


If a Duck Quacks in the Forest and Everyone Hears, Should You Care?

2023-07-28 · Watch video
Ryan Boyd (Databricks)

YES! "Duck posting" has become an internet meme for praising DuckDB on Twitter. Nearly every quack using DuckDB has done it once or twice. But why all the fuss? With advances in CPUs, memory, SSDs, and the software that enables it all, our personal machines are powerful beasts relegated to handling a few Chrome tabs and sitting 90% idle. As data engineers and data analysts, this seems like a waste that is not only expensive but also hard on the environment.

In this session, you will see how DuckDB brings SQL analytics capabilities that until recently required a large cluster to a 2MB standalone executable on your laptop. This session will explain the architecture that enables DuckDB's high-performance analytics on a laptop: great query optimization, vectorized execution, continuous improvements in compression, and more. We will show its capabilities using live demos, from the pandas library to WASM to the command line. We'll demonstrate performance on large datasets and talk about how we're exploring using the laptop to augment cloud analytics workloads.
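
As a rough illustration of the in-process analytics described above, here is a minimal sketch using DuckDB's Python API to query a pandas DataFrame directly; the example data and column names are invented for illustration:

    import duckdb
    import pandas as pd

    # Hypothetical example data standing in for a larger local dataset.
    events = pd.DataFrame({
        "region": ["us", "eu", "us", "apac"],
        "latency_ms": [120, 80, 95, 210],
    })

    # DuckDB can scan the pandas DataFrame in place by referring to the
    # Python variable name inside SQL -- no copy into a database needed.
    result = duckdb.sql("""
        SELECT region, avg(latency_ms) AS avg_latency_ms
        FROM events
        GROUP BY region
        ORDER BY region
    """).df()

    print(result)

The same engine also runs in the command-line client and in the browser via WASM, which is part of what makes the laptop-scale workflow described in the talk practical.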

Talk by: Ryan Boyd

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs


Democratize AI & ML in a Large Company: The Importance of User Enablement & Technical Training

2023-07-27 · Watch video

The most critical success factor in a cloud transformation is people. As such, having a change management process in place to manage the impact of the transformation, along with user enablement, is foundational to any large program. In this session, we will dive into how TD Bank democratizes data, mobilizes a community of over 2,000 analytics users, and the tactics we used to successfully enable new use cases in the cloud. The session will focus on the following:

To democratize data:

  • Centralize a data platform that is accessible to all employees and allows for easy data sharing
  • Implement privacy and security to protect data and use it ethically
  • Apply compliance and governance so data is used in a responsible and compliant way
  • Simplify processes and procedures to reduce redundancy and speed up adoption

To mobilize end users:

  • Increase data literacy: provide training and resources for employees to build their abilities and skills
  • Foster a culture of collaboration and openness: enable cross-functional teams to collaborate and share ideas
  • Encourage exploration of innovative ideas that impact the organization's values and customers

Technical enablement and adoption tactics we've used at TD Bank:

  1. Hands-on training for over 1,300 analytics users, with an emphasis on learning by doing and relating to real-life situations
  2. Online tutorials and documentation for self-paced study
  3. Workshops and office hours on specific topics to empower business users
  4. Coaching to work with teams on a specific use case or complex issue and provide recommendations for faster, cost-effective solutions
  5. Certification offerings and encouragement of continuous education so employees keep up to date with the latest developments
  6. A feedback loop: gather user feedback on training and user experience to improve future sessions

Talk by: Ellie Hajarian

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI


Five Things You Didn't Know You Could Do with Databricks Workflows

2023-07-27 · Watch video
Prashanth Babu (Databricks)

Databricks Workflows has come a long way since the early days of orchestrating simple notebooks and JAR/wheel files. Now we can orchestrate multi-task jobs, create chains of tasks with lineage as a DAG with fan-in or fan-out (among many other patterns), or even run one Databricks job directly inside another.

Databricks Workflows takes its tagline, “orchestrate anything anywhere,” seriously: it is a truly fully managed, cloud-native orchestrator for diverse workloads such as Delta Live Tables, SQL, notebooks, JARs, Python wheels, dbt, Apache Spark™, and ML pipelines, with excellent monitoring, alerting, and observability capabilities. Basically, it is a one-stop product for all the orchestration needs of an efficient lakehouse. Even better, it gives you full flexibility to run your jobs in a cloud-agnostic, cloud-independent way, and it is available across AWS, Azure, and GCP.
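
As a rough sketch of what a chain of tasks looks like in practice, the snippet below creates a three-task job with a fan-in dependency through the Jobs API 2.1; the workspace URL, token, notebook paths, cluster settings, and job name are placeholders for illustration, not anything shown in the session:

    import requests

    HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
    TOKEN = "<personal-access-token>"                        # placeholder

    # Two ingest tasks fan in to a single transform task.
    job_spec = {
        "name": "example-multitask-job",
        "job_clusters": [{
            "job_cluster_key": "shared_cluster",
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
        }],
        "tasks": [
            {
                "task_key": "ingest_orders",
                "notebook_task": {"notebook_path": "/Jobs/ingest_orders"},
                "job_cluster_key": "shared_cluster",
            },
            {
                "task_key": "ingest_customers",
                "notebook_task": {"notebook_path": "/Jobs/ingest_customers"},
                "job_cluster_key": "shared_cluster",
            },
            {
                "task_key": "transform",
                "depends_on": [
                    {"task_key": "ingest_orders"},
                    {"task_key": "ingest_customers"},
                ],
                "notebook_task": {"notebook_path": "/Jobs/transform"},
                "job_cluster_key": "shared_cluster",
            },
        ],
    }

    resp = requests.post(
        f"{HOST}/api/2.1/jobs/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=job_spec,
    )
    resp.raise_for_status()
    print("Created job:", resp.json()["job_id"])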

In this session, we will take a deep dive into some of the most interesting features and showcase end-to-end demos that will allow you to take full advantage of Databricks Workflows for orchestrating the lakehouse.

Talk by: Prashanth Babu


Improving Hospital Operations with Streaming Data and Real Time AI/ML

2023-07-27 · Watch video

Over the past two years, Providence has developed a robust streaming data platform (SDP) leveraging Databricks in Azure. The SDP enables us to ingest and process real-time data reflecting clinical operations across our 52 hospitals and roughly 1,000 ambulatory clinics. The HL7 messages generated by Epic are parsed using Databricks in our secure cloud environment and used to generate an up-to-the-minute picture of exactly what is happening at the point of care.

We are already leveraging this information to minimize hospital overcrowding and have been actively integrating AI/ML to accurately forecast future conditions (e.g., arrivals, length of stay, acuity, and discharge requirements). This allows us both to improve resource utilization (e.g., nurse staffing levels) and to optimize patient throughput. The result is improved patient care and greater operational efficiency.

In this session, we will share how these outcomes are only possible with the power and elegance afforded by our investments in Azure, Databricks, and, increasingly, the Lakehouse. We will demonstrate Providence's blueprint for enabling real-time analytics, which can be generalized to other healthcare providers.
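
As a loose sketch, not Providence's actual code, of what this kind of streaming HL7 ingest can look like on Databricks, the snippet below reads raw messages from an Event Hubs stream through its Kafka-compatible endpoint and applies a hypothetical parse_hl7 helper; the connection settings, topic, table name, and parser are all placeholders, and authentication options are omitted for brevity:

    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    # Hypothetical parser: a real pipeline would map HL7 segments
    # (MSH, PID, PV1, ...) into structured columns.
    @F.udf(returnType=StringType())
    def parse_hl7(message: str) -> str:
        # MSH-9 (message type) sits at index 8 of the pipe-delimited MSH segment.
        fields = message.split("|") if message else []
        return fields[8] if len(fields) > 8 else None

    # `spark` is predefined in Databricks notebooks.
    raw = (
        spark.readStream
        .format("kafka")  # Event Hubs exposes a Kafka-compatible endpoint
        .option("kafka.bootstrap.servers", "<namespace>.servicebus.windows.net:9093")
        .option("subscribe", "hl7-adt")  # placeholder topic
        .option("startingOffsets", "latest")
        .load()
    )

    parsed = (
        raw.select(F.col("value").cast("string").alias("hl7"))
           .withColumn("message_type", parse_hl7("hl7"))
           .withColumn("ingested_at", F.current_timestamp())
    )

    (parsed.writeStream
           .format("delta")
           .option("checkpointLocation", "/tmp/checkpoints/hl7_bronze")  # placeholder
           .toTable("bronze.hl7_messages"))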

Talk by: Lindsay Mico and Deylo Woo


Jet Streaming Data & Predictive Analytics: How Collins Aerospace to Keep Aircraft Flying

2023-07-27 · Watch video

Most have experienced the frustration and disappointment of a flight delay or cancelation due to aircraft issues. The Collins Aerospace business unit at Raytheon Technologies is committed to redefining aerospace by using data to deliver a more reliable, sustainable, efficient, and enjoyable aviation industry.

Ascentia is one product example of this, focused on helping airlines make smarter and more sustainable decisions by anticipating aircraft maintenance issues in advance, leading to more reliable flight schedules and fewer delays. Over the past five years, a variety of products from the Databricks technology suite have been employed to achieve this. Leveraging cloud infrastructure and harnessing the Databricks Lakehouse, Apache Spark™ development, and Databricks’ dynamic platform, Collins has been able to accelerate the development and deployment of the predictive health monitoring (PHM) analytics that generate Ascentia’s aircraft maintenance recommendations.

Labcorp Data Platform Journey: From Selection to Go-Live in Six Months

2023-07-27 · Watch video

Join this session to learn about the Labcorp data platform transformation from on-premises Hadoop to AWS Databricks Lakehouse. We will share best practices and lessons learned from cloud-native data platform selection, implementation, and migration from Hadoop (within six months) with Unity Catalog.

We will share the steps taken to retire several legacy on-premises technologies and leverage native Databricks features such as Spark streaming, Workflows, job pools, cluster policies, and Spark JDBC within the Databricks platform, along with lessons learned implementing Unity Catalog and building a security and governance model that scales across applications. We will show demos that walk you through the batch frameworks, streaming frameworks, and data comparison tools used across several applications to improve data quality and speed of delivery.

Discover how we have improved operational efficiency and resiliency, reduced TCO, and scaled the creation of workspaces and associated cloud infrastructure using the Databricks Terraform provider.

Talk by: Mohan Kolli and Sreekanth Ratakonda


Multicloud Data Governance on the Databricks Lakehouse

2023-07-27 · Watch video

Across industries, a multicloud setup has quickly become the reality for large organizations. Multicloud introduces new governance challenges, as permission models often do not translate from one cloud to another and, when they do, are insufficiently granular to accommodate privacy requirements and the principle of least privilege. This problem can be especially acute for data and AI workloads that rely on sharing and aggregating large and diverse data sources across business unit boundaries, and where governance models need to incorporate assets such as table rows/columns and ML features and models.

In this session, we will provide guidelines on how best to overcome these challenges for companies that have adopted the Databricks Lakehouse as their collaborative space for data teams across the organization, by exploiting some of the unique product features of the Databricks platform. We will focus on a common scenario: a data platform team providing data assets to two different ML teams, one using the same cloud and the other one using a different cloud.

We will explain the step-by-step setup of a unified governance model by leveraging the following components and conventions:

  • Unity Catalog for implementing fine-grained access control across all data assets: files in cloud storage, rows and columns in tables, and ML features and models
  • The Databricks Terraform provider to automatically enforce guardrails and permissions across clouds
  • Account-level SSO integration and identity federation to centrally administer access across workspaces
  • Delta Sharing to seamlessly propagate changes in provider datasets to consumers in near real time
  • Centralized audit logging for a unified view of which assets were accessed by whom
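
To make the first item above concrete, here is a minimal, hypothetical sketch of fine-grained grants expressed as Unity Catalog SQL run from a notebook; the catalog, schema, table, and group names are invented for illustration:

    # Run inside a Databricks notebook, where `spark` is predefined.
    spark.sql("CREATE CATALOG IF NOT EXISTS shared_data")
    spark.sql("CREATE SCHEMA IF NOT EXISTS shared_data.sales")

    # Give one ML team read access to a single schema without exposing
    # the rest of the catalog.
    spark.sql("GRANT USE CATALOG ON CATALOG shared_data TO `ml-team-a`")
    spark.sql("GRANT USE SCHEMA ON SCHEMA shared_data.sales TO `ml-team-a`")
    spark.sql("GRANT SELECT ON TABLE shared_data.sales.orders TO `ml-team-a`")

    # Review what has been granted on the table.
    display(spark.sql("SHOW GRANTS ON TABLE shared_data.sales.orders"))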

Talk by: Ioannis Papadopoulos and Volker Tjaden


The Future is Open: Data Streaming in an Omni-Cloud Reality

2023-07-27 · Watch video

This session begins with data warehouse trivia and lessons learned from production implementations of multicloud data architecture. You will learn to design future-proof low latency data systems that focus on openness and interoperability. You will also gain a gentle introduction to Cloud FinOps principles that can help your organization reduce compute spend and increase efficiency. 

Most enterprises today are multicloud. While an assortment of low-code connectors boasts the ability to make data available for analytics in real time, they pose long-lasting challenges:

  • Inefficient EDW targets
  • Inability to evolve schema
  • Prohibitively expensive data exports due to cloud and vendor lock-in

The alternative is an open data lake that unifies batch and streaming workloads. Bronze landing zones in an open format eliminate the data extraction costs imposed by proprietary EDWs. Apache Spark™ Structured Streaming provides a unified ingestion interface. Streaming triggers allow us to switch back and forth between batch and stream with one-line code changes. Streaming aggregation enables us to incrementally compute over data that arrives close together in time.

Specific examples show how to use Autoloader to discover newly arrived data and ensure exactly-once, incremental processing; how DLT can be configured to further simplify streaming jobs and accelerate the development cycle; and how to apply software engineering best practices to Workflows and integrate with popular Git providers, using either the Databricks Project or the Databricks Terraform provider.
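
As an illustrative sketch of the one-line switch between batch and streaming mentioned above (the storage paths and table name are placeholders, not from the talk):

    # Auto Loader incrementally discovers new files landing in cloud storage.
    stream = (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/tmp/schemas/clicks")  # placeholder
        .load("s3://example-bucket/landing/clicks/")                 # placeholder
    )

    writer = (
        stream.writeStream
        .option("checkpointLocation", "/tmp/checkpoints/clicks")     # placeholder
    )

    # Continuous streaming: pick up new files roughly every minute.
    # writer.trigger(processingTime="1 minute").toTable("bronze.clicks")

    # Batch-style run: the same pipeline processes everything currently
    # available and then stops -- the "one-line" switch.
    writer.trigger(availableNow=True).toTable("bronze.clicks")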

Talk by: Christina Taylor

Here’s more to explore: Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI


Sponsored: Gathr | Achieve 50x Faster Outcomes From Data at Scale - Using ML-Powered, No-Code Apps

2023-07-27 · Watch video

Data Engineers love data and business users need outcomes. How do we cross the chasm? While there is no dearth of data in today’s world, managing and analyzing large datasets can be daunting. Additionally, data may lose its value over time. It needs to be analyzed and acted upon quickly, to accelerate decision-making, and help realize business outcomes faster. 

Take a deep dive into the future of the data economy and learn how to drive 50 times faster time to value. Hear from United Airlines how they leveraged Gathr to process massive volumes of complex digital interactions and operational data, to create breakthroughs in operations and customer experience, in real time.

The session will feature a live demo showcasing how enterprises from across domains leverage Gathr’s machine-learning-powered zero-code applications for ingestion, ETL, ML, XOps, Cloud Cost Control, Business Process Automation, and more, to accelerate their journey from data to outcomes like never before.

Talk by: Sameer Bhide and Sarang Bapat

Here’s more to explore: LLM Compact Guide: https://dbricks.co/43WuQyb Big Book of MLOps: https://dbricks.co/3r0Pqiz


Enabling Data Governance at Enterprise Scale Using Unity Catalog

2023-07-26 · Watch video

Amgen has invested in building modern, cloud-native enterprise data and analytics platforms over the past few years, with a focus on tech rationalization, data democratization, overall user experience, increased reusability, and cost-effectiveness. One of these platforms is our Enterprise Data Fabric, which focuses on pulling in data across functions and providing capabilities to integrate and connect the data and govern access. For a while, we have been trying to set up robust data governance capabilities that are simple yet easy to manage through Databricks. There were a few tools in the market that solved a few immediate needs, but none solved the problem holistically. For use cases like maintaining governance on highly restricted data domains such as Finance and HR, a long-term solution native to Databricks and addressing the limitations below was deemed important:

  • The way these tools were set up allowed a few security policies to be overridden
  • Tools were not up to date with the latest DBR runtime
  • Complexity of implementing fine-grained security
  • Policy management split between AWS IAM and in-tool policies

To address these challenges, and for large-scale enterprise adoption of our governance capability, we started working on UC integration with our governance processes, with the aim of realizing the following technical benefits:

  • Independent of Databricks runtime
  • Easy fine-grained access control
  • Eliminated management of IAM roles
  • Dynamic access control using UC and dynamic views

Today, using UC, we have been able to implement fine-grained access control and governance for Amgen's restricted data. We are in the process of devising a realistic migration and change management strategy across the enterprise.
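
As a minimal, hypothetical illustration of the dynamic-view pattern listed above, the snippet below uses Unity Catalog's group-membership function to mask a column for everyone outside a given group; the catalog, schema, table, column, and group names are invented for the example:

    # Run in a Databricks notebook, where `spark` is predefined.
    spark.sql("""
        CREATE OR REPLACE VIEW hr_restricted.core.employee_pay_v AS
        SELECT
          employee_id,
          department,
          CASE
            WHEN is_account_group_member('hr-compensation') THEN salary
            ELSE NULL  -- masked for everyone outside the group
          END AS salary
        FROM hr_restricted.core.employee_pay
    """)

    # Consumers query the view; what they see depends on group membership.
    spark.sql("GRANT SELECT ON TABLE hr_restricted.core.employee_pay_v TO `hr-analysts`")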

Talk by: Lakhan Prajapati and Jaison Dominic


Self-Service Data Analytics and Governance at Enterprise Scale with Unity Catalog

2023-07-26 · Watch video

This session focuses on one of the first Unity Catalog implementations for a large-scale enterprise. In this scenario, a cloud-scale analytics platform based on the lakehouse approach serves 7,500 active users, with the potential for 1,500 further users who are subject to special governance rules. They consume more than 600 TB of data stored in Delta Lake, continuously growing at more than 1 TB per day, and this may grow further with local country data. The existing data platform must therefore be extended so users can combine global data with local data from their countries. A new data management approach was required that reflects strict information security rules on a need-to-know basis. The core requirements are: read-only access to global data, write access to local data, and the ability to share the results.

Due to very pronounced information security awareness and a lack of technological possibilities, it had previously been difficult or impossible to analyze and exchange data across disciplines. As a result, a lot of business potential and gains could not be identified and realized.

With new developments in the technology used and the lakehouse approach as a basis, and thanks to Unity Catalog, we were able to develop a solution that meets high requirements for security and process and enables globally secured, interdisciplinary data exchange and analysis at scale. This solution enables the democratization of data, which results not only in better insights for business management but also in entirely new business cases and products that require a higher degree of data integration, encouraging the culture to change. We highlight technical challenges and solutions, present best practices, and point out the benefits of implementing Unity Catalog for enterprises.

Talk by: Artem Meshcheryakov and Pascal van Bellen

Here’s more to explore: Data, Analytics, and AI Governance: https://dbricks.co/44gu3YU


ABN Story: Migrating to Future Proof Data Platform

2023-07-26 · Watch video

ABN AMRO Bank is one of the top leading banks in the Netherlands. It is the third-largest bank in the Netherlands by revenue and number of mortgages held within the Netherlands, and it has top-management support for the objective of becoming a fully data-driven bank. ABN AMRO started its data journey almost seven years ago and built an on-premises data platform with Hadoop technologies. This data platform has been used by more than 200 data providers and 150 data consumers, and it holds more than 3,000 datasets.

Becoming a fully digital bank and addressing the limitations of the on-premises platform required a future-proof data platform, DIAL (digital integration and access layer). ABN AMRO decided to build an Azure cloud-native data platform with the help of Microsoft and Databricks. Last year this cloud-native platform was ready for our data providers and data consumers. Six months ago we started the journey of migrating all the content from the on-premises data platform to the Azure data platform; this was a very large-scale migration and was achieved in six months.

In this session, we will focus on three things:

  1. The migration strategy going from on-premises to a cloud-native platform
  2. Which Databricks solutions were used in the data platform
  3. How the Databricks team assisted in the overall migration

Talk by: Rakesh Singh and Marcel Kramer


Data Democratization with Lakehouse: An Open Banking Application Case

2023-07-26 · Watch video

Banco Bradesco represents one of the largest companies in the financial sector in Latin America. They have more than 99 million customers, 79 years of history, and a legacy of data distributed in hundreds of on-premises systems. With the spread of data-driven approaches and the growth of cloud computing adoption, we needed to innovate and adapt to new trends and enable an analytical environment with democratized data.

We will show how more than eight business departments have already engaged in using the Lakehouse exploratory environment, with more than 190 use cases mapped, as well as a multi-bank financial manager. Unlike on-premises, the cost of each process can now be isolated and managed in near real time, allowing quick responses to cost and budget deviations, while the deployment speed of new features has increased 36-fold compared to on-premises.

The data is now used and shared safely and easily between different areas and companies of the group. Also, dashboards within Databricks allow panels to be efficiently "prototyped" with real data, making it easy for the business area to engage around its real needs and then create a definitive view with all relevant points duly stressed.

Talk by: Pedro Boareto and Fabio Luis Correia da Silva


Taking Your Cloud Vendor to the Next Level: Solving Complex Challenges with Azure Databricks

2023-07-26 · Watch video

Akamai's content delivery network (CDN) processes about 30% of the internet's daily traffic, resulting in a massive amount of data that presents engineering challenges, both internally and with cloud vendors. In this session, we will discuss the barriers faced while building a data infrastructure on Azure, Databricks, and Kafka to meet strict SLAs, hitting the limits of some of our cloud vendors’ services. We will describe the iterative process of re-architecting a massive scale data platform using the aforementioned technologies.

We will also delve into how, today, Akamai is able to quickly ingest terabytes of data and make it available to customers, as well as efficiently query petabytes of data and return results within 10 seconds for most queries. This discussion will provide valuable insights for attendees and organizations seeking to effectively process and analyze large amounts of data.

Databricks Lakehouse: How BlackBerry is Revolutionizing Cybersecurity Services Worldwide

2023-07-26 · Watch video
Robert Lombardi, Justin Lai (Arctic Wolf)

Cybersecurity incidents are costly, and using an endpoint detection and response (EDR) solution enables the detection of cybersecurity incidents as quickly as possible. Effectively detecting cybersecurity incidents requires the collection of millions of data points, and storing and querying endpoint data presents considerable engineering challenges. These include quickly moving local data from endpoints to a single table in the cloud and enabling performant querying against it.

The need to avoid internal data siloing within BlackBerry was paramount as multiple teams required access to the data to deliver an effective EDR solution for the present and the future. Databricks tooling enabled us to break down our data silos and iteratively improve our EDR pipeline to ingest data faster and reduce querying latency by more than 20% while reducing costs by more than 30%.

In this session, we will share the journey, lessons learned, and the future for collecting, storing, governing, and sharing data from endpoints in Databricks. The result of building EDR using Databricks helped us accelerate the deployment of our data platform.

Talk by: Justin Lai and Robert Lombardi


Databricks SQL: Why the Best Serverless Data Warehouse is a Lakehouse

2023-07-26 · Watch video

Many organizations rely on complex cloud data architectures that create silos between applications, users and data. This fragmentation makes it difficult to access accurate, up-to-date information for analytics, often resulting in the use of outdated data. Enter the lakehouse, a modern data architecture that unifies data, AI, and analytics in a single location.

This session explores why the lakehouse is the best data warehouse, featuring success stories, use cases and best practices from industry experts. You'll discover how to unify and govern business-critical data at scale to build a curated data lake for data warehousing, SQL and BI. Additionally, you'll learn how Databricks SQL can help lower costs and get started in seconds with on-demand, elastic SQL serverless warehouses, and how to empower analytics engineers and analysts to quickly find and share new insights using their preferred BI and SQL tools such as Fivetran, dbt, Tableau, or Power BI.

Talk by: Miranda Luna and Cyrielle Simeone


Data Extraction and Sharing Via The Delta Sharing Protocol

2023-07-26 · Watch video

The Delta Sharing open protocol for secure sharing and distribution of Lakehouse data is designed to reduce friction in getting data to users. Delivering custom data solutions from this protocol further leverages the technical investment committed to your Delta Lake infrastructure. There are key design and computational concepts unique to Delta Sharing to know when undertaking development. And there are pitfalls and hazards to avoid when delivering modern cloud data to traditional data platforms and users.

In this session, we introduce Delta Sharing Protocol development and examine our journey and the lessons learned while creating the Delta Sharing Excel Add-in. We will demonstrate scenarios of overfetching, underfetching, and interpretation of types. We will suggest methods to overcome these development challenges. The session will combine live demonstrations that exercise the Delta Sharing REST protocol with detailed analysis of the responses. The demonstrations will elaborate on optional capabilities of the protocol’s query mechanism, and how they are used and interpreted in real-life scenarios. As a reference baseline for data professionals, the Delta Sharing exercises will be framed relative to SQL counterparts. Specific attention will be paid to how they differ, and how Delta Sharing’s Change Data Feed (CDF) can power next-generation data architectures. The session will conclude with a survey of available integration solutions for getting the most out of your Delta Sharing environment, including frameworks, connectors, and managed services.

Attendees are encouraged to be familiar with REST, JSON, and modern programming concepts. A working knowledge of Delta Lake, the Parquet file format, and the Delta Sharing protocol is advised.
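
For orientation, here is a small sketch of the consumer side of the protocol using the open-source delta-sharing Python connector; the profile path and the share, schema, and table names are placeholders, and the Excel Add-in discussed in this session speaks the same REST protocol underneath:

    import delta_sharing

    # A profile file holds the sharing server endpoint and bearer token,
    # typically distributed by the data provider.
    profile = "config.share"  # placeholder path

    client = delta_sharing.SharingClient(profile)
    print(client.list_all_tables())  # discover the shares, schemas, and tables

    # Load one shared table into pandas. The URL format is
    # <profile>#<share>.<schema>.<table>.
    table_url = f"{profile}#example_share.example_schema.example_table"
    df = delta_sharing.load_as_pandas(table_url)
    print(df.head())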

Talk by: Roger Dunn

Here’s more to explore: A New Approach to Data Sharing: https://dbricks.co/44eUnT1


Essential Data Security Strategies for the Modern Enterprise Data Architecture

2023-07-26 · Watch video

Balancing critical data requirements is a 24-7 task for enterprise-level organizations that must straddle the need to open specific gates to enable self-service data access while closing other access points to maintain internal and external compliance. Data breaches can cost U.S. businesses an average of $9.4 million per occurrence; ignoring this leaves organizations vulnerable to severe losses and crippling costs.

The 2022 Gartner Hype Cycle for Data Security reports that more and more enterprises are modernizing their data architecture with cloud and technology partners to help them collect, store and manage business data; a trend that does not appear to be letting up. According to Gartner®, “by 2025, 30% of enterprises will have adopted the Broad Data Security Platform (bDSP), up from less than 10% in 2021, due to the pent-up demand for higher levels of data security and the rapid increase in product capabilities."

Moving to both a modern data architecture and data-driven culture sets enterprises on the right trajectory for growth, but it’s important to keep in mind individual public cloud platforms are not guaranteed to protect and secure data. To solve this, Privacera pioneered the industry’s first open-standards-based data security platform that integrates privacy and compliance across multiple cloud services.

During this presentation, we will discuss:

  • Why today's modern data architecture needs a DSP that works across the entire data ecosystem
  • Essential DSP prescriptive measures and adoption strategies
  • Why faster and more responsible access to data insights helps reduce cost, increases productivity, expedites decision making, and leads to exponential growth

Talk by: Piet Loubser

Here’s more to explore: Data, Analytics, and AI Governance: https://dbricks.co/44gu3YU


Real-Time Streaming Solution for Call Center Analytics: Business Challenges and Technical Enablement

2023-07-26 · Watch video

A large international client with a business footprint in North America, Europe, and Africa reached out to us with an interest in having a real-time streaming solution designed and implemented for its call center handling incoming and outgoing client calls. The client had a previous bad experience with another vendor, who overpromised and underdelivered on the latency of the streaming solution. The previous vendor delivered an overly complex streaming data pipeline, resulting in the data taking over five minutes to reach the visualization layer. The client felt that the architecture was too complicated and involved too many services integrated together.

Our immediate challenges involved gaining the client's trust and proving that the quality of our design and implementation would surpass the previous experience. To resolve the immediate challenge of the overly complicated pipeline design, we deployed a Databricks Lakehouse architecture with Azure Databricks at the center of the solution. Our reference architecture integrated Genesys Cloud → App Services → Event Hubs → Databricks → Data Lake → Power BI.

The streaming solution proved to be low latency (seconds) during the POV stage, which led to subsequent productionalization of the pipeline, with deployment of jobs and a DLT pipeline, including a multi-notebook workflow and business and performance metrics dashboards relied on by call center staff for day-to-day performance monitoring and improvement.
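
Below is a rough, hypothetical sketch of what a Delta Live Tables pipeline for this kind of feed can look like in Python; the Event Hubs/Kafka settings and table names are placeholders rather than the client's actual pipeline, and authentication options are omitted for brevity:

    import dlt
    from pyspark.sql import functions as F

    @dlt.table(comment="Raw call-center events from Event Hubs (Kafka endpoint)")
    def calls_bronze():
        return (
            spark.readStream
            .format("kafka")
            .option("kafka.bootstrap.servers", "<namespace>.servicebus.windows.net:9093")
            .option("subscribe", "genesys-calls")  # placeholder topic
            .load()
            .select(
                F.col("value").cast("string").alias("payload"),
                F.col("timestamp").alias("event_time"),
            )
        )

    @dlt.table(comment="Per-queue call volumes feeding the Power BI dashboard")
    def calls_by_queue_silver():
        return (
            dlt.read_stream("calls_bronze")
            .withColumn("queue", F.get_json_object("payload", "$.queueId"))
            .withWatermark("event_time", "10 minutes")
            .groupBy("queue", F.window("event_time", "1 minute"))
            .count()
        )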

Talk by: Natalia Demidova


Sponsored: Matillion - OurFamilyWizard Moves and Transforms Data for Databricks Delta Lake Easy

2023-07-26 · Watch video
Beth Mattson (OurFamilyWizard), Jamie Baker (Matillion)

OurFamilyWizard helps families living separately thrive, empowering parents with needed tools after divorce or separation. Migrating to a modern data stack built on a Databricks Delta Lake seemed like the obvious choice for OurFamilyWizard to start integrating 20 years of on-prem Oracle data with event tracking and SaaS cloud data, but they needed tools to do it. OurFamilyWizard turned to Matillion, a powerful and intuitive solution, to quickly load, combine, and transform source data into reporting tables and data marts, and empower them to turn raw data into information the organization can use to make decisions.

In this session, Beth Mattson, OurFamilyWizard Senior Data Engineer, will detail how Matillion helped OurFamilyWizard migrate their data to Databricks fast and provided end-to-end ETL capabilities. In addition, Jamie Baker, Matillion Director of Product Management, will give a brief demo and discuss the Matillion and Databricks partnership and what is on the horizon.

Talk by: Jamie Baker and Beth Mattson


Delta Sharing: The Key Data Mesh Enabler

2023-07-26 · Watch video

Data Mesh is an emerging architecture pattern that challenges the centralized data platform approach by empowering different engineering teams to own the data products in a specific business domain. One of the keys to the success of any Data Mesh initiative is selecting the right protocol for Data Sharing between different business data domains that could potentially be implemented through different technologies and cloud providers.

In this session you will learn about how the Delta Sharing protocol and the Delta table format have enabled the historically stuck-in-the-past energy and construction industry to be catapulted to the 21st century by way of a modern Data Mesh implementation based on Azure Databricks.

Talk by: Francesco Pizzolon

Here’s more to explore: A New Approach to Data Sharing: https://dbricks.co/44eUnT1


Sponsored: Kyvos | Analytics 100x Faster Lowest Cost w/ Kyvos & Databricks, Even on Trillions Rows

2023-07-26 · Watch video

Databricks and Kyvos together are helping organizations build their next-generation cloud analytics platform: a platform that can process and analyze massive amounts of data, even trillions of rows, and provide multidimensional insights instantly. By combining the power of Databricks with the speed, scale, and cost-optimization capabilities of the Kyvos Analytics Acceleration Platform, customers can push beyond their analytics boundaries. Join our session to learn how, and to hear about a real-world use case.

Talk by: Leo Duncan

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs
