talk-data.com talk-data.com

Topic

Azure

Microsoft Azure

cloud cloud_provider microsoft infrastructure

29

tagged

Activity Trend

278 peak/qtr
2020-Q1 2026-Q1

Activities

Showing filtered results

Filtering by: Databricks DATA + AI Summit 2023 ×
Announcing Databricks Clean Rooms with Live Demo. Presented by Matei Zaharia and Darshana Sivakumar

Speakers: Matei Zaharia, Original Creator of Apache Spark™ and MLflow; Chief Technologist, Databricks Darshana Sivakumar, Staff Product Manager, Databricks

Organizations are looking for ways to securely exchange their data and collaborate with external partners to foster data-driven innovations. In the past, organizations had limited data sharing solutions, relinquishing control over how their sensitive data was shared with partners and little to no visibility into how their data was consumed. This created the risk for potential data misuse and data privacy breaches. Customers who tried using other clean room solutions have told us these solutions are limited and do not meet their needs, as they often require all parties to copy their data into the same platform, do not allow sophisticated analysis beyond basic SQL queries, and have limited visibility or control over their data.

Organizations need an open, flexible, and privacy-safe way to collaborate on data, and Databricks Clean Rooms meets these critical needs.

See a demo of Databricks Clean Rooms, now in Public Preview on AWS + Azure

Five Things You Didn't Know You Could Do with Databricks Workflows

Databricks workflows has come a long way since the initial days of orchestrating simple notebooks and jar/wheel files. Now we can orchestrate multi-task jobs and create a chain of tasks with lineage and DAG with either fan-in or fan-out among multiple other patterns or even run another Databricks job directly inside another job.

Databricks workflows takes its tag: “orchestrate anything anywhere” pretty seriously and is a truly fully-managed, cloud-native orchestrator to orchestrate diverse workloads like Delta Live Tables, SQL, Notebooks, Jars, Python Wheels, dbt, SQL, Apache Spark™, ML pipelines with excellent monitoring, alerting and observability capabilities as well. Basically, it is a one-stop product for all orchestration needs for an efficient lakehouse. And what is even better is, it gives full flexibility of running your jobs in a cloud-agnostic and cloud-independent way and is available across AWS, Azure and GCP.

In this session, we will discuss and deep dive on some of the very interesting features and will showcase end-to-end demos of the features which will allow you to take full advantage of Databricks workflows for orchestrating the lakehouse.

Talk by: Prashanth Babu

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Improving Hospital Operations with Streaming Data and Real Time AI/ML

Over the past two years, Providence has developed a robust streaming data platform (SDP) leveraging Databricks in Azure. The SDP enables us to ingest and process real-time data reflecting clinical operations across our 52 hospitals and roughly 1000 ambulatory clinics. The HL7 messages generated by Epic are parsed using Databricks in our secure cloud environment and used to generate an up-to-the minute picture of exactly what is happening at the point of care.

We are already leveraging this information to minimize hospital overcrowding and have been actively integrating AI/ML to accurately forecast future conditions (e.g., arrivals, length of stay, acuity, and discharge requirements.) This allows us to both improve resource utilization (e.g., nurse staffing levels) and to optimize patient throughput. The result is both improved patient care and operational efficiency.

In this session, we will share how these outcomes are only possible with the power and elegance afforded by our investments in Azure, Databricks, and increasingly Lakehouse. We will demonstrate Providence's blueprint for enabling real-time analytics which can be generalized to other healthcare providers.

Talk by: Lindsay Mico and Deylo Woo

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Sponsored by: Avanade | Accelerating Adoption of Modern Analytics and Governance at Scale

To unlock all the competitive advantage Databricks offers your organization, you might need to update your strategy and methodology for the platform. With over 1,000+ Databricks projects completed globally in the last 18 months, we are going to share our insights on the best building blocks to target as you search for efficiency and competitive advantage.

These building blocks supporting this include enterprise metadata and data management services, data management foundation, and data services and products that enable business units to fully use their data and analytics at scale.

In this session, Avanade data leaders will highlight how Databricks’ modern data stack fits Azure PaaS and SaaS (such as Microsoft Fabric) ecosystem, how Unity catalog metadata supports automated data operations scenarios, and how we are helping clients measure modern analytics and governance business impact and value.

Talk by: Alan Grogan and Timur Mehmedbasic

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp Databricks named a Leader in 2022 Gartner® Magic QuadrantTM CDBMS: https://dbricks.co/3phw20d

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

ABN Story: Migrating to Future Proof Data Platformh

ABN AMRO Bank is one of the top leading banks in the Netherlands. It is the third largest bank in the Netherlands by revenue and number of mortgages held within the Netherlands, and has top management support of the objective to become a fully data-driven bank. ABN AMRO started its data journey almost seven years ago and has built a data platform off-premises with Hadoop technologies. This data platform has been used by more than 200 data providers, 150 data consumers, and more than 3000 datasets.

To become a fully digital bank and address the limitation of the on-premises platform requires a future-proof data platform DIAL (digital integration and access layer). ABN AMRO decided to build an Azure cloud-native data platform with the help of Microsoft and Databricks. Last year this cloud-native platform was ready for our data providers and data consumers. Six months ago we started the journey of migrating all the content from the on-premises data platform to the Azure data platform, this was a very large-scale migration and was achieved in six months.

In this session, we will focus on three things: 1. The migration strategy going from on-premises to a cloud-native platform 2. Which Databricks solutions were used in the data platform 3. How the Databricks team assisted in the overall migration

Talk by: Rakesh Singh and Marcel Kramer

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Taking Your Cloud Vendor to the Next Level: Solving Complex Challenges with Azure Databricks

Akamai's content delivery network (CDN) processes about 30% of the internet's daily traffic, resulting in a massive amount of data that presents engineering challenges, both internally and with cloud vendors. In this session, we will discuss the barriers faced while building a data infrastructure on Azure, Databricks, and Kafka to meet strict SLAs, hitting the limits of some of our cloud vendors’ services. We will describe the iterative process of re-architecting a massive scale data platform using the aforementioned technologies.

We will also delve into how today, Akamai is able to quickly ingest and make available to customers terabytes of data, as well as efficiently query Petabytes of data and return results within 10 seconds for most queries. This discussion will provide valuable insights for attendees and organizations seeking to effectively process and analyze large amounts of data.

Real-Time Streaming Solution for Call Center Analytics: Business Challenges and Technical Enablement

A large international client with a business footprint in North America, Europe and Africa reached out to us with an interest in having a real-time streaming solution designed and implemented for its call center handling incoming and outgoing client calls. The client had a previous bad experience with another vendor, who overpromised and underdelivered on the latency of the streaming solution. The previous vendor delivered an over-complex streaming data pipeline resulting in the data taking over five minutes to reach a visualization layer. The client felt that architecture was too complex and involved many services integrated together.

Our immediate challenges involved gaining the client's trust and proving that our design and implementation quality would supersede a previous experience. To resolve an immediate challenge of the overly complicated pipeline design, we deployed a Databricks Lakehouse architecture with Azure Databricks at the center of the solution. Our reference architecture integrated Genesys Cloud : App Services : Event Hub : Databricks : : Data Lake : Power BI.

The streaming solution proved to be low latency (seconds) during the POV stage, which led to subsequent productionalization of the pipeline with deployment of jobs, DLTs pipeline, including multi-notebook workflow and business and performance metrics dashboarding relied on by the call center staff for a day-to-day performance monitoring and improvements.

Talk by: Natalia Demidova

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Delta Sharing: The Key Data Mesh Enabler

Data Mesh is an emerging architecture pattern that challenges the centralized data platform approach by empowering different engineering teams to own the data products in a specific business domain. One of the keys to the success of any Data Mesh initiative is selecting the right protocol for Data Sharing between different business data domains that could potentially be implemented through different technologies and cloud providers.

In this session you will learn about how the Delta Sharing protocol and the Delta table format have enabled the historically stuck-in-the-past energy and construction industry to be catapulted to the 21st century by way of a modern Data Mesh implementation based on Azure Databricks.

Talk by: Francesco Pizzolon

Here’s more to explore: A New Approach to Data Sharing: https://dbricks.co/44eUnT1

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Scaling MLOps for a Demand Forecasting Across Multiple Markets for a Large CPG

In this session, we look at how one of the world’s largest CPG company setup a scalable MLOps pipeline for a demand forecasting use case that predicted demand at 100,000+ DFUs (demand forecasting units) on a weekly basis across more than 20 markets. This implementation resulted in significant cost savings in terms of improved productivity, reduced cloud usage and faster time to value amongst other benefits. You will leave this session with a clearer picture on the following:

  • Best practices in scaling MLOps with Databricks and Azure for a demand forecasting use case with a multi-market and multi-region roll-out.
  • Best practices related to model re-factoring and setting up standard CI-CD pipelines for MLOps.
  • What are some of the pitfalls to avoid in such scenarios?

Talk by: Sunil Ranganathan and Vinit Doshi

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Sponsored by: Qlik | Extracting the Full Potential of SAP Data for Global Automotive Manufacturing

Every year, organizations lose millions of dollars due to equipment failure, unscheduled downtime, or unoptimized supply chains because business and operational data is not integrated. During this session you will hear from experts at Qlik and Databricks on how global luxury automotive manufacturers are accelerating the discovery and availability of complex data sets like SAP. Learn how Qlik, Microsoft, and Databricks together are delivering an integrated solution for global luxury automotive manufacturers that combines the automated data delivery capabilities of Qlik Data Integration with the agility and openness of the Databricks Lakehouse platform and AI on Azure Synpase.

We'll explore how to leverage the IT and OT data convergence to extract the full potential of business-critical SAP data, lower IT costs and deliver real-time prescriptive insights, at scale, for more resilient, predictable, and sustainable supply-chains. Learn how organizations can track and manage inventory levels, predict demand, optimize production and help their organizations identify opportunities for improvements.

Talk by: Matthew Hayes and Bala Amavasai

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksi

Sponsored by: Avanade | Enabling Real-Time Analytics with Structured Streaming and Delta Live Tables

Join the panel to hear how Avanade is helping clients enable real-time analytics and tackle the people and process problems that accompany technology, powered by Azure Databricks.

Talk by: Thomas Kim, Dael Williamson, Zoé Durand

Here’s more to explore: Data, Analytics, and AI Governance: https://dbricks.co/44gu3YU

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Taking Your Cloud Vendor to the Next Level: Solving Complex Challenges with Azure Databricks

Akamai's content delivery network (CDN) processes about 30% of the internet's daily traffic, resulting in a massive amount of data that presents engineering challenges, both internally and with cloud vendors. In this session, we will discuss the barriers faced while building a data infrastructure on Azure, Databricks, and Kafka to meet strict SLAs, hitting the limits of some of our cloud vendors’ services. We will describe the iterative process of re-architecting a massive scale data platform using the aforementioned technologies.

We will also delve into how today, Akamai is able to quickly ingest and make available to customers terabytes of data, as well as efficiently query Petabytes of data and return results within 10 seconds for most queries. This discussion will provide valuable insights for attendees and organizations seeking to effectively process and analyze large amounts of data.

Talk by: Tomer Patel and Itai Yaffe

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Applied Predictive Maintenance in Aviation: Without Sensor Data

We will show how using Azure Databricks Lakehouse is modernizing our data & analytics environment which has given us new capability to create custom predictive models for hundreds of families of aircraft components without sensor data. We currently have over 95% success rate with over $1.3 million in avoided operational impact costs in FY21.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Cloud and Data Science Modernization of Veterans Affairs Financial Service Center with Azure Databri

The Department of Veterans Affairs (VA) is home to over 420,000 employees, provides health care for 9.16 million enrollees and manages the benefits of 5.75 million recipients. The VA also hosts an array of financial management, professional, and administrative services at their Financial Service Center (FSC), located in Austin, Texas. The FSC is divided into various service groups organized around revenue centers and product lines, including the Data Analytics Service (DAS). To support the VA mission, in 2021 FSC DAS continued to press forward with their cloud modernization efforts, successfully achieving four key accomplishments:

Office of Community Care (OCC) Financial Time Series Forecast - Financial forecasting enhancements to predict claims CFO Dashboard - Productivity and capability enhancements for financial and audit analytics Datasets Migrated to the Cloud - Migration of on-prem datasets to the cloud for down-stream analytics (includes a supply chain proof-of-concept) Data Science Hackathon - A hackathon to predict bad claims codes and demonstrate DAS abilities to accelerate a ML use case using Databricks AutoML

This talk discusses FSC DAS’ cloud and data science modernization accomplishments in 2021, lessons learned, and what’s ahead.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

A Case Study in Rearchitecting an On-Premises Pipeline in the Cloud

This talk will give a detailed discussion of some of the many considerations that must be taken into account when rebuilding on-premises data pipelines in the cloud. I will give an initial overview of the original pipeline and the reasons that we chose to migrate this pipeline to Azure. Next, I will discuss the decisions that lead to the architecture we used to replace the original pipeline, and give a thorough overview of the new cloud pipeline, including design components and networking. I will also discuss the many lessons we learned along the way to successfully migrating this pipeline.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Improving Apache Spark Application Processing Time by Configurations, Code Optimizations, etc.

In this session, we'll go over several use-cases and describe the process of improving our spark structured streaming application micro-batch time from ~55 to ~30 seconds in several steps.

Our app is processing ~ 700 MB/s of compressed data, it has very strict KPIs, and it is using several technologies and frameworks such as: Spark 3.1, Kafka, Azure Blob Storage, AKS and Java 11.

We'll share our work and experience in those fields, and go over a few tips to create better Spark structured streaming applications.

The main areas that will be discussed are: Spark Configuration changes, code optimizations and the implementation of the Spark custom data source.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Improving patient care with Databricks

Learn how Wipro helped a world leader in medical technology to modernize its data used the PySpark interface on Azure Databricks to create reusable generic frameworks, including slowly changing dimensions (SCDs), data validation/reconciliation tools, and delta lake tables created from metadata.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Optimizing Speed and Scale of User-Facing Analytics Using Apache Kafka and Pinot

Apache Kafka is the de facto standard for real-time event streaming, but what do you do if you want to perform user-facing, ad-hoc, real-time analytics too? That's where Apache Pinot comes in.

Apache Pinot is a realtime distributed OLAP datastore, which is used to deliver scalable real time analytics with low latency. It can ingest data from batch data sources (S3, HDFS, Azure Data Lake, Google Cloud Storage) as well as streaming sources such as Kafka. Pinot is used extensively at LinkedIn and Uber to power many analytical applications such as Who Viewed My Profile, Ad Analytics, Talent Analytics, Uber Eats and many more serving 100k+ queries per second while ingesting 1Million+ events per second.

Apache Kafka's highly performant, distributed, fault-tolerant, real-time publish-subscribe messaging platform powers big data solutions at Airbnb, LinkedIn, MailChimp, Netflix, the New York Times, Oracle, PayPal, Pinterest, Spotify, Twitter, Uber, Wikimedia Foundation, and countless other businesses.

Come hear from Neha Power, Founding Engineer at a StarTree and PMC and committer of Apache Pinot, and Karin Wolok, Head of Developer Community at StarTree, on an introduction to both systems and a view of how they work together.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Accidentally Building a Petabyte-Scale Cybersecurity Data Mesh in Azure With Delta Lake at HSBC

Due to the unique cybersecurity challenges that HSBC faces daily - from high data volumes to untrustworthy sources to the privacy and security restrictions of a highly regulated industry - the resulting architecture was an unwieldy set of disparate data silos. So, how do we build a cybersecurity advanced analytics environment to enrich and transform these myriad data sources into a unified, well-documented, robust, resilient, repeatable, scalable, maintainable platform that will empower the cyber analysts of the future? That at the same time remains cost-effective and enables everyone from the less-technical junior reporting user to the senior machine learning engineers?

In this session, Ryan Harris, Principal Cybersecurity Engineer at HSBC, dives into the infrastructure and architecture employed, ranging from the landing zone concepts, secure access workstations, data lake structure, and isolated data ingestion, to the enterprise integration layer. In the process of building the data pipelines and lakehouses, we ended up building a hybrid data mesh leveraging Delta Lake. The result is a flexible, secure, self-service environment that is unlocking the capabilities of our humans.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Pushing the limits of scale/performance for enterprise-wide analytics: A fire-side chat with Akamai

With the world’s most distributed compute platform — from cloud to edge — Akamai makes it easy for businesses to develop and run applications, while keeping experiences closer to users and threats farther away. ​So when it was time to scale it’s legacy Hadoop-like infrastructure reaching its capacity limits, while keeping their global operations running uninterrupted, Akamai partnered with Microsoft and Databricks to migrate to Azure Databricks.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/