talk-data.com

Event

Databricks DATA + AI Summit 2023

2026-01-11 YouTube

Activities tracked

205

Filtering by: Data Lakehouse

Sessions & talks

Showing 51–75 of 205 · Newest first

Processing Prescriptions at Scale at Walgreens

2023-07-26

We designed a scalable Spark Streaming job to manage hundreds of millions of prescription-related operations per day with an end-to-end SLA of a few minutes and a lookup time of one second using CosmosDB.

In this session, we will share not only the architecture but also the challenges of using the Spark Cosmos connector at scale and our solutions to them. We will discuss usages of the Aggregator API, custom implementations of the CosmosDB connector, and the major roadblocks we encountered, along with the solutions we engineered. In addition, we collaborated closely with the Cosmos development team at Microsoft and will share the new features that resulted. If you ever plan to use Spark with Cosmos, you won't want to miss these gotchas!
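
For readers unfamiliar with the Aggregator API mentioned above, its contract has four parts: an empty buffer (zero), a per-record update (reduce), a combiner for partial buffers (merge), and a final projection (finish). The sketch below mirrors that shape in plain Python so it can run anywhere. It is not Spark code and not the speakers' implementation; the `RxEvent` fields and the `run` helper are hypothetical names chosen for illustration.

```python
import functools
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class RxEvent:
    """Hypothetical prescription event; field names are illustrative only."""
    rx_id: str
    fill_count: int


class FillCountAggregator:
    """Mirrors the zero/reduce/merge/finish shape of a Spark Aggregator."""

    def zero(self) -> Tuple[int, int]:
        # Buffer holds (number of events, total fills).
        return (0, 0)

    def reduce(self, buf: Tuple[int, int], event: RxEvent) -> Tuple[int, int]:
        # Fold one record into a partition-local buffer.
        return (buf[0] + 1, buf[1] + event.fill_count)

    def merge(self, a: Tuple[int, int], b: Tuple[int, int]) -> Tuple[int, int]:
        # Combine buffers produced by different partitions.
        return (a[0] + b[0], a[1] + b[1])

    def finish(self, buf: Tuple[int, int]) -> float:
        # Project the final buffer to the result: average fills per event.
        return buf[1] / buf[0] if buf[0] else 0.0


def run(agg: FillCountAggregator, partitions: List[Iterable[RxEvent]]) -> float:
    """Simulate a distributed aggregation over a list of partitions."""
    partials = [functools.reduce(agg.reduce, part, agg.zero()) for part in partitions]
    return agg.finish(functools.reduce(agg.merge, partials, agg.zero()))
```

Because merge is associative and zero is its identity, partitions can be processed independently and combined in any order, which is what makes this shape suitable for distributed execution.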

Talk by: Daniel Zafar

Here’s more to explore: Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Real-Time Streaming Solution for Call Center Analytics: Business Challenges and Technical Enablement

2023-07-26

A large international client with a business footprint in North America, Europe and Africa reached out to us with an interest in having a real-time streaming solution designed and implemented for its call center handling incoming and outgoing client calls. The client had a previous bad experience with another vendor, who overpromised and underdelivered on the latency of the streaming solution. The previous vendor delivered an over-complex streaming data pipeline resulting in the data taking over five minutes to reach a visualization layer. The client felt that architecture was too complex and involved many services integrated together.

Our immediate challenges involved gaining the client's trust and proving that our design and implementation quality would surpass their previous experience. To resolve the immediate challenge of the overly complicated pipeline design, we deployed a Databricks Lakehouse architecture with Azure Databricks at the center of the solution. Our reference architecture integrated Genesys Cloud : App Services : Event Hub : Databricks : : Data Lake : Power BI.

The streaming solution proved to be low latency (seconds) during the POV stage, which led to subsequent productionalization of the pipeline with deployment of jobs and DLT pipelines, including a multi-notebook workflow and business and performance metrics dashboards relied on by the call center staff for day-to-day performance monitoring and improvements.

Talk by: Natalia Demidova

Sponsored: Anomalo | Data Archaeology: Quickly Understand Unfamiliar Datasets Using Machine Learning

2023-07-26

One of the most daunting and time-consuming activities for data scientists and data analysts is understanding new and unfamiliar data sets. When given such a new data set, how do you understand its shape and structure? How can you quickly understand its important trends and characteristics? The typical answer is hours of manual querying and exploration, a process many call data archaeology.

This session will show a better way to explore new data sets by letting machine learning do the work for you. In particular, we will showcase how Anomalo simplifies the process of understanding and obtaining insights from Databricks tables — without manual querying. With a few clicks, you can generate comprehensive profiles and powerful visualizations that give immediate insight into your data's key characteristics and trends, as well as its shape and structure. With this approach, very little manual data archaeology is required, and you can quickly get to work on getting value out of the data (rather than just exploring it).

Talk by: Elliot Shmukler and Vicky Andonova

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Sponsored: AWS-Real Time Stream Data & Vis Using Databricks DLT, Amazon Kinesis, & Amazon QuickSight

2023-07-26

Amazon Kinesis Data Analytics is a managed service that can capture streaming data from IoT devices. The Databricks Lakehouse Platform provides ease of processing streaming and batch data using Delta Live Tables. Amazon QuickSight provides advanced visualization capabilities with direct integration with Databricks. Combining these services, customers can capture, process, and visualize data from hundreds and thousands of IoT sensors with ease.

Talk by: Venkat Viswanathan

Here’s more to explore: Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Sponsored: dbt Labs | Modernizing the Data Stack: Lessons Learned From Evolution at Zurich Insurance

2023-07-26
Jose L Sanchez Ros (Zurich Insurance), Gerard Sola (Zurich Insurance)

In this session, we will explore the path Zurich Insurance took to modernize its data stack and data engineering practices, and the lessons learned along the way. We'll touch on how and why the team chose to:

  • Adopt community standards in code quality, code coverage, code reusability, and CI/CD
  • Rebuild the way data engineering collaborates with business teams
  • Explore data tools accessible to non-engineering users, with considerations for code-first and no-code interfaces
  • Structure our dbt project and orchestration — and the factors that played into our decisions

Talk by: Jose L Sanchez Ros and Gerard Sola

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Weaving the Data Mesh in the Department of Defense

2023-07-26

The Chief Digital and AI Office (CDAO) was created to lead the strategy and policy on data, analytics, and AI adoption across the Department of Defense. To enable that vision, the Department must achieve new ways to scale and standardize delivery under a global strategy while enabling decentralized workflows that capture the wealth of data and domain expertise.

CDAO’s strategy and goals are aligned with data mesh principles. This alignment starts with providing enterprise-level infrastructure and services to advance the adoption of data, analytics, and AI, creating the self-service data infrastructure as a platform. And it continues through implementing policy for federated computational governance centered around decentralizing data ownership to become domain-oriented while enforcing the quality and trustworthiness of data. CDAO seeks to expand and make enterprise data more accessible through providing data as a product and leveraging a federated data catalog to designate authoritative data and common data models. This results in domain-oriented, decentralized data ownership that empowers the business domains across the Department to increase mission and business impact, resulting in significant cost savings, saved lives, and data serving as a “public good.”

Please join us in our session as we discuss how the CDAO leverages modern, innovative implementations that accelerate the delivery of data and AI throughout one of the largest distributed organizations in the world: the Department of Defense. We will walk through how this enables delivery in various Department of Defense use cases.

Talk by: Brad Corwin and Cody Ferguson

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

How Mars Achieved a People Analytics Transformation with a Modern Data Stack

2023-07-26

People Analytics at Mars was formed two years ago as part of an ambitious journey to transform our HR analytics capabilities. To transform, we needed to build foundational services that provide our associates with helpful insights by delivering fast results and resolving complex problems. Critical to that foundation are data governance and data enablement, which are the responsibility of the Mars People Data Office team, whose focus is to deliver high-quality, reliable data that is reusable for current and future People Analytics use cases. Come learn how this team used Databricks in helping Mars achieve its People Analytics Transformation.

Talk by: Rachel Belino and Sreeharsha Alagani

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Simplifying Migrations to Lakehouse

2023-07-26

This session will cover:

  • Challenges with legacy platforms
  • Perenti Databricks migration journey
  • Reimagining migrations the Databricks way
  • The Databricks migration methodology and approach

Talk by: Dan Smith

Unlocking the Value of Data Sharing in Financial Services with Lakehouse

2023-07-26
Spencer Cook (Databricks)

The emergence of secure data sharing is already having a tremendous economic impact, in large part due to the increasing ease and safety of sharing financial data. McKinsey predicts that the impact of open financial data will be 1-4.5% of GDP globally by 2030. This indicates there is a narrowing window on a massive opportunity for financial institutions, and it is critical that they prioritize data sharing. This session will first address the ways in which Delta Sharing and Unity Catalog on a Databricks Lakehouse architecture provide a simple and open framework for building a Secure Data Sharing platform in the financial services industry. Next, we will use a Databricks environment to walk through different use cases for open banking data and secure data sharing, demonstrating how they will be implemented using Delta Sharing, Unity Catalog, and other parts of the Lakehouse platform. The use cases will include examples of new product features such as Databricks to Databricks sharing, change data feed and streaming on Delta Sharing, table/column lineage, and the Delta Sharing Excel plugin to demonstrate state-of-the-art sharing capabilities.

In this session, we will discuss secure data sharing on Databricks Lakehouse and will demonstrate architecture and code for common sharing use cases in the finance industry.

Talk by: Spencer Cook

Sponsored: Kyvos | Analytics 100x Faster Lowest Cost w/ Kyvos & Databricks, Even on Trillions Rows

2023-07-26

Databricks and Kyvos together are helping organizations build their next-generation cloud analytics platform: one that can process and analyze massive amounts of data, even trillions of rows, and provide multidimensional insights instantly. By combining the power of Databricks with the speed, scale and cost optimization capabilities of the Kyvos Analytics Acceleration Platform, customers can go beyond the limits of their analytics boundaries. Join our session to learn how and to hear about a real-world use case.

Talk by: Leo Duncan

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Activate Your Lakehouse with Unity Catalog

2023-07-26

Building a lakehouse is straightforward today thanks to many open source technologies and Databricks. However, it can be taxing to extract value from lakehouses as they grow without robust data operations. Join us to learn how YipitData uses the Unity Catalog to streamline data operations and discover best practices to scale your own Lakehouse. At YipitData, our 15+ petabyte Lakehouse is a self-service data platform built with Databricks and AWS, supporting analytics for a data team of over 250. We will share how leveraging Unity Catalog accelerates our mission to help financial institutions and corporations leverage alternative data by:

  • Enabling clients to universally access our data through a spectrum of channels, including Sigma, Delta Sharing, and multiple clouds
  • Fostering collaboration across internal teams using a data mesh paradigm that yields rich insights
  • Strengthening the integrity and security of data assets through ACLs, data lineage, audit logs, and further isolation of AWS resources
  • Reducing the cost of large tables without downtime through automated data expiration and ETL optimizations on managed delta tables

Through our migration to Unity Catalog, we have gained tactics and philosophies to seamlessly flow our data assets internally and externally. Data platforms need to be value-generating, secure, and cost-effective in today's world. We are excited to share how Unity Catalog delivers on this and helps you get the most out of your lakehouse.

Talk by: Anup Segu

Best Data Warehouse is a Lakehouse: Databricks Achieves Ops Efficiency w/ Lakehouse Architecture

2023-07-26
Naveen Zutshi (Databricks), Romit Jadhwani (Databricks)

At Databricks, we use the Lakehouse architecture to build an optimized data warehouse that drives better insights, increases operational efficiency, and reduces costs. In this session, Naveen Zutshi, CIO at Databricks, and Romit Jadhwani, Senior Director Analytics and Integrations at Databricks, will discuss the Databricks journey and provide technical and business insights into how these results were achieved.

The session will cover topics such as medallion architecture, building efficient third party integrations, how Databricks built various data products/services on the data warehouse, and how to use governance to break down data silos and achieve consistent sources of truth.

Talk by: Naveen Zutshi and Romit Jadhwani

Building Apps on the Lakehouse with Databricks SQL

2023-07-26

BI applications are undoubtedly one of the major consumers of a data warehouse. Nevertheless, the prospect of accessing data using standard SQL is appealing to many more stakeholders than just the data analysts. We’ve heard from customers that they experience an increasing demand to provide access to data in their lakehouse platforms from external applications beyond BI, such as e-commerce platforms, CRM systems, SaaS applications, or custom data applications developed in-house. These applications require an “always on” experience, which makes Databricks SQL Serverless a great fit.

In this session, we give an overview of the approaches available to application developers to connect to Databricks SQL and create modern data applications tailored to the needs of users across an entire organization. We discuss when to choose one of the Databricks native client libraries for languages such as Python, Go, or Node.js and when to use the SQL Statement Execution API, the newest addition to the toolset. We also explain when ODBC and JDBC might not be the best for the task and when they are your best friends. Live demos are included.
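
As a rough illustration of the SQL Statement Execution API option, the sketch below builds (but does not send) an HTTP request that submits one SQL statement to a SQL warehouse. The endpoint path and the `warehouse_id`, `statement`, and `wait_timeout` fields reflect the public REST API as we understand it and should be checked against the current Databricks documentation; the host, token, and warehouse ID are placeholders, and `build_statement_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_statement_request(
    host: str, token: str, warehouse_id: str, sql: str
) -> urllib.request.Request:
    """Build a POST request that submits one SQL statement for execution."""
    payload = {
        "warehouse_id": warehouse_id,  # placeholder: ID of a SQL warehouse
        "statement": sql,
        "wait_timeout": "30s",  # assumed: wait up to 30 seconds for a result
    }
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/sql/statements/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. A plain HTTP call like this needs no driver installation, which is the usual argument for a REST interface over ODBC or JDBC in lightweight applications.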

Talk by: Adriana Ispas and Chris Stevens

From Insights to Recommendations: How SkyWatch Predicts Demand for Satellite Imagery Using Databricks

2023-07-26
Aayush Patel (SkyWatch)

SkyWatch is on a mission to democratize earth observation data and make it simple for anyone to use.

In this session, you will learn how SkyWatch aggregates demand signals for the EO market and turns them into monetizable recommendations for satellite operators. SkyWatch’s Data & Platform Engineer, Aayush Patel, will share how the team built a serverless architecture that synthesizes customer requests for satellite images and identifies geographic locations with high demand, helping satellite operators maximize revenue and satisfying a broad range of consumers hungry for EO data.

This session will cover:

  • Challenges with Fulfillment in Earth Observation ecosystem
  • Processing large scale GeoSpatial Data with Databricks
  • Databricks in-built H3 functions
  • Delta Lake to efficiently store data leveraging optimization techniques like Z-Ordering
  • Data LakeHouse Architecture with Serverless SQL Endpoints and AWS Step Functions
  • Building Tasking Recommendations for Satellite Operators
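
The Z-Ordering item above refers to laying out data along a space-filling curve so that rows close together in several columns land in the same files. A minimal sketch of the underlying idea, bit interleaving (a Morton code) over two integer coordinates, follows. It illustrates the concept only and makes no claim about how Delta Lake implements it; `interleave_bits` is a hypothetical helper name.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Return the Morton (Z-order) code of two non-negative integers.

    Bit i of x goes to bit 2*i of the result and bit i of y goes to
    bit 2*i + 1, so nearby (x, y) pairs tend to get nearby codes.
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


# Sorting grid cells by their Morton code clusters neighbors together.
cells = [(2, 2), (0, 1), (1, 1), (0, 0), (1, 0)]
ordered = sorted(cells, key=lambda cell: interleave_bits(*cell))
```

Sorting by the interleaved key rather than by one column at a time lets range filters on either coordinate skip large contiguous portions of the data.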

Talk by: Aayush Patel

How the Texas Rangers Revolutionized Baseball Analytics with a Modern Data Lakehouse

2023-07-26
Alexander Booth (Texas Rangers Baseball Club), Oliver Dykstra (Texas Rangers)

Don't miss this session where we demonstrate how the Texas Rangers baseball team organized its predictive models by using MLflow and the MLRegistry inside Databricks. We started using Databricks as a simple solution for centralizing our development on the cloud. This helped lessen the issue of siloed development in our team, and allowed us to leverage the benefits of distributed cloud computing.

But we quickly found that Databricks was a perfect solution to another problem that we faced in our data engineering stack. Specifically, cost, complexity, and scalability issues hampered our data architecture development for years, and we decided we needed to modernize our stack by migrating to a lakehouse. With Databricks Lakehouse, ad hoc analytics, ETL operations, and MLOps all living within Databricks, development at scale has never been easier for our team.

Going forward, we hope to fully eliminate the silos of development, and remove the disconnect between our analytics and data engineering teams. From computer vision, pose analytics, and player tracking, to pitch design, base stealing likelihood, and more, come see how the Texas Rangers are using innovative cloud technologies to create action-driven reports from the current sea of big data.

Talk by: Alexander Booth and Oliver Dykstra

Improve Apache Spark™ DS v2 Query Planning Using Column Stats

2023-07-26

When running the TPC-DS benchmark using an external v2 data source, we observed that for several of the queries, DS v1 has better join plans than DS v2. The main reason is that DS v1 uses column stats, especially number of distinct values (NDV), for query optimization. Currently, Spark™ DS v2 only has interfaces for data sources to report table statistics such as size in bytes and number of rows. In order to use column stats in DS v2, we have added new interfaces to allow external data sources to report column stats to Spark.
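
To see why NDV matters for join planning, consider the textbook cardinality estimate for an equi-join, which assumes uniformly distributed keys and that the smaller key set is contained in the larger one. The sketch below shows that estimate only. It is not claimed to be Spark's exact formula, and the function name is hypothetical.

```python
def estimate_equijoin_rows(
    rows_left: int, ndv_left: int, rows_right: int, ndv_right: int
) -> float:
    """Textbook estimate: |L join R| ~ |L| * |R| / max(NDV_L, NDV_R)."""
    if min(ndv_left, ndv_right) <= 0:
        # An empty key set on either side cannot produce matches.
        return 0.0
    return rows_left * rows_right / max(ndv_left, ndv_right)
```

Without NDV, an optimizer has only row counts and byte sizes to go on, so it cannot tell a selective join from an exploding one and may pick a poor join order.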

For a data source with huge data, it’s always challenging to get the column stats, especially the NDV. We plan to calculate NDV using Apache DataSketches Theta sketch and save the serialized compact sketch in the statistics file. The NDV and other column stats will be reported to Spark for query plan optimization.
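
The idea behind sketch-based NDV estimation can be shown with a K-minimum-values (KMV) estimator, a simple relative of the Theta sketch: hash every value to the interval [0, 1), keep only the k smallest distinct hashes, and infer the distinct count from how small the k-th one is. This is a simplified stand-in written for illustration, not the Apache DataSketches implementation; the function names and the choice of SHA-256 are assumptions.

```python
import hashlib
import heapq
from typing import Iterable, List, Set


def _hash_to_unit(value: object) -> float:
    """Map a value to a pseudo-uniform float in [0, 1)."""
    digest = hashlib.sha256(repr(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_ndv(values: Iterable[object], k: int = 256) -> float:
    """Estimate the number of distinct values with a KMV sketch of size k."""
    heap: List[float] = []  # max-heap (negated) of the k smallest hashes
    members: Set[float] = set()  # hashes currently held in the heap
    for value in values:
        h = _hash_to_unit(value)
        if h in members:
            continue  # duplicate of a retained value
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # Replace the current largest retained hash with the smaller one.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return float(len(heap))  # fewer than k distinct values: exact count
    return (k - 1) / (-heap[0])  # standard KMV estimator
```

The sketch uses memory proportional to k regardless of input size, and two sketches can be combined by keeping the k smallest hashes of their union, which is why such summaries suit per-file statistics.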

Talk by: Huaxin Gao and Parth Chandra

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Leveraging IoT Data at Scale to Mitigate Global Water Risks Using Apache Spark™ Streaming and Delta

2023-07-26

Every year, billions of dollars are lost due to water risks from storms, floods, and droughts. Water data scarcity and excess are issues that risk models cannot overcome, creating a world of uncertainty. Divirod is building a platform of water data by normalizing diverse data sources of varying velocity into one unified data asset. In addition to publicly available third-party datasets, we are rapidly deploying our own IoT sensors. These sensors ingest signals at a rate of about 100,000 messages per hour into preprocessing, signal-processing, analytics, and postprocessing workloads in one Spark Streaming pipeline to enable critical real-time decision-making processes. By leveraging streaming architecture, we were able to reduce end-to-end latency from tens of minutes to just a few seconds.

We are leveraging Delta Lake to provide a single query interface across multiple tables of this continuously changing data. This enables data science and analytics workloads to always use the most current and comprehensive information available. In addition to the obvious schema transformations, we implement data quality metrics and datum conversions to provide a trustworthy unified dataset.

Talk by: Adam Wilson and Heiko Udluft

Here’s more to explore: Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Making the Shift to Application-Driven Intelligence

2023-07-26

In the digital economy, application-driven intelligence delivered against live, real-time data will become a core capability of successful enterprises. It has the potential to improve the experience that you provide to your customers and deepen their engagement. But to make application-driven intelligence a reality, you can no longer rely only on copying live application data out of operational systems into analytics stores. Rather, it takes the unique real-time application-serving layer of a MongoDB database combined with the scale and real-time capabilities of a Databricks Lakehouse to automate and operationalize complex and AI-enhanced applications at scale.

In this session, we will show how it can be seamless for developers and data scientists to automate decisioning and actions on fresh application data and we'll deliver a practical demonstration on how operational data can be integrated in real time to run complex machine learning pipelines.

Talk by: Mat Keep and Ashwin Gangadhar

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Processing Delta Lake Tables on AWS Using AWS Glue, Amazon Athena, and Amazon Redshift

2023-07-26

Delta Lake is an open source project that helps implement modern data lake architectures commonly built on cloud storage. With Delta Lake, you can achieve ACID transactions, time travel queries, CDC, and other common use cases on the cloud.

There are a lot of use cases of Delta tables on AWS. AWS has invested a lot in this technology, and now Delta Lake is available with multiple AWS services, such as AWS Glue Spark jobs, Amazon EMR, Amazon Athena, and Amazon Redshift Spectrum. AWS Glue is a serverless, scalable data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources. With AWS Glue, you can easily ingest data from multiple data sources such as on-prem databases, Amazon RDS, DynamoDB, MongoDB into Delta Lake on Amazon S3 even without expertise in coding.

This session will demonstrate how to get started with processing Delta Lake tables on Amazon S3 using AWS Glue, and querying them from Amazon Athena and Amazon Redshift. The session also covers recent AWS service updates related to Delta Lake.

Talk by: Noritaka Sekiyama and Akira Ajisaka

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Simon + Denny Live: Ask Us Anything

2023-07-26
Denny Lee (Databricks), Simon Whiteley (Advancing Analytics)

Simon and Denny have been discussing and debating all things Delta, Lakehouse and Apache Spark™ on their regular webshow. Whether you want advice on lake structures, want to hear their opinions on the latest trends and hype in the data world, or you simply have a tech implementation question to throw at two seasoned experts, these two will have something to say on the matter. In their previous shows, Simon and Denny focused on building out a sample lakehouse architecture, refactoring and tinkering as new features came out, but now we're throwing the doors open for any and every question you might have.

So if you've had a persistent question and think these two can help, this is the session for you. There will be a question submission form shared prior to the event, so the team will be prepped with a whole bunch of topics to talk through. Simon and Denny want to hear your questions, which they can field drawing from a wealth of industry experience, wide-ranging community engagement and their differing perspectives as an external consultant and a Databricks employee, respectively. There's also a chance they'll get distracted and go way off track talking about coffee, sci-fi, nerdery or the English weather. It happens.

Talk by: Simon Whiteley and Denny Lee

Sponsored by: Qlik | Extracting the Full Potential of SAP Data for Global Automotive Manufacturing

2023-07-26
Matthew Hayes, Bala Amavasai (Celebal Technologies)

Every year, organizations lose millions of dollars due to equipment failure, unscheduled downtime, or unoptimized supply chains because business and operational data is not integrated. During this session you will hear from experts at Qlik and Databricks on how global luxury automotive manufacturers are accelerating the discovery and availability of complex data sets like SAP. Learn how Qlik, Microsoft, and Databricks together are delivering an integrated solution for global luxury automotive manufacturers that combines the automated data delivery capabilities of Qlik Data Integration with the agility and openness of the Databricks Lakehouse platform and AI on Azure Synapse.

We'll explore how to leverage the convergence of IT and OT data to extract the full potential of business-critical SAP data, lower IT costs and deliver real-time prescriptive insights, at scale, for more resilient, predictable, and sustainable supply chains. Learn how organizations can track and manage inventory levels, predict demand, optimize production and identify opportunities for improvement.

Talk by: Matthew Hayes and Bala Amavasai

Sponsored by: Striim | Powering a Delightful Travel Experience with a Real-Time Operational Data Hub

2023-07-26 Watch
video

American Airlines champions operational excellence in airline operations to provide the most delightful experience to our customers with on-time flights and meticulously maintained aircraft. To modernize and scale technical operations with real-time, data-driven processes, we delivered a DataHub that connects data from multiple sources and delivers it to analytics engines and systems of engagement in real time. This enables operational teams to use any kind of aircraft data from almost any source imaginable and turn it into meaningful and actionable insights with speed and ease. It also empowers maintenance hubs to choose the best service and determine the most effective ways to utilize resources that can impact maintenance outcomes and costs. The end product is a smooth and scalable operation that results in a better experience for travelers. In this session, you will learn how we combine an operational data store (MongoDB) and a fully managed streaming engine (Striim) to provide analytics teams using Databricks with real-time operational data.

Talk by: John Kutay and Ganesh Deivarayan

Sponsored by: Toptal | Enable Data Streaming within Multicloud Strategies

2023-07-26 Watch
video

Join Toptal as we discuss how we can help organizations handle their data streaming needs in an environment that uses multiple cloud providers. We will delve into the data science and data engineering perspectives on this challenge. Embracing an open format, using open source technologies, and managing the solution through code are the keys to success.

Talk by: Christina Taylor and Matt Kroon

Unlock the Next Evolution of the Modern Data Stack With the Lakehouse Revolution -- with Live Demos

2023-07-26 Watch
video
Roberto Salcido (Databricks), Kyle Hale

As the data landscape evolves, organizations are seeking innovative solutions that provide enhanced value and scalability without exploding costs. In this session, we will explore the exciting frontier of the Modern Data Stack on Databricks Lakehouse, a game-changing alternative to traditional Data Cloud offerings. Learn how Databricks Lakehouse empowers you to harness the full potential of Fivetran, dbt, and Tableau, while optimizing your data investments and delivering unmatched performance.

We will showcase real-world demos that highlight the seamless integration of these modern data tools on the Databricks Lakehouse platform, enabling you to unlock faster and more efficient insights. Witness firsthand how the synergy of Lakehouse and the Modern Data Stack outperforms traditional solutions, propelling your organization into the future of data-driven innovation. Don't miss this opportunity to revolutionize your data strategy and unleash unparalleled value with the lakehouse revolution.

Talk by: Kyle Hale and Roberto Salcido

US Army Corp of Engineers Enhanced Commerce & National Sec Through Data-Driven Geospatial Insight

2023-07-26 Watch
video

The US Army Corps of Engineers (USACE) is responsible for maintaining and improving nearly 12,000 miles of shallow-draft (9'-14') inland and intracoastal waterways, 13,000 miles of deep-draft (14' and greater) coastal channels, and 400 ports, harbors, and turning basins throughout the United States. Because these components of the national waterway network are considered assets to both US commerce and national security, they must be carefully managed to keep marine traffic operating safely and efficiently.

The National Dredging Quality Management (DQM) Program is tasked with providing USACE a nationally standardized remote monitoring and documentation system across multiple vessel types with timely data access, reporting, dredge certifications, data quality control, and data management. Government systems have often lagged behind commercial systems in modernization efforts, and the emergence of the cloud and Data Lakehouse Architectures has empowered USACE to successfully move into the modern data era.

This session incorporates aspects of these topics:

Data Lakehouse Architecture: Delta Lake, platform security and privacy, serverless, administration, data warehouse, Data Lake, Apache Iceberg, Data Mesh.

GIS: H3, MOSAIC, spatial analysis.

Data engineering: data pipelines, orchestration, CDC, medallion architecture, Databricks Workflows, data munging, ETL/ELT, lakehouses, data lakes, Parquet, Data Mesh, Apache Spark™ internals.

Data Streaming: Apache Spark Structured Streaming, real-time ingestion, real-time ETL, real-time ML, real-time analytics, real-time applications, Delta Live Tables.

ML: PyTorch, TensorFlow, Keras, scikit-learn, Python and R ecosystems.

Data governance: security, compliance, RMF, NIST.

Data sharing: sharing and collaboration, Delta Sharing, data cleanliness, APIs.

Talk by: Jeff Mroz
