talk-data.com talk-data.com

Event

Databricks DATA + AI Summit 2023

2026-01-11 YouTube Visit website ↗

Activities tracked

50

Filtering by: DWH ×

Sessions & talks

Showing 26–50 of 50 · Newest first

Search within this event →
Improve Apache Spark™ DS v2 Query Planning Using Column Stats

Improve Apache Spark™ DS v2 Query Planning Using Column Stats

2023-07-26 Watch
video

When doing the TPC-DS benchmark using external v2 data source, we have observed that for several of the queries, DS v1 has better join plans than Apache Spark. The main reason is that DS v1 uses column stats, especially number of distinct values (NDV) for query optimization. Currently, Spark™ DS v2 only has interfaces for data sources to report table statistics such as size in bytes and number of rows. In order to use column stats in DS v2, we have added new interfaces to allow external data sources to report column stats to Spark.

For a data source with huge data, it’s always challenging to get the column stats, especially the NDV. We plan to calculate NDV using Apache DataSketches Theta sketch and save the serialized compact sketch in the statistics file. The NDV and other column stats will be reported to Spark for query plan optimization.

Talk by: Huaxin Gao and Parth Chandra

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Making the Shift to Application-Driven Intelligence

Making the Shift to Application-Driven Intelligence

2023-07-26 Watch
video

In the digital economy, application-driven intelligence delivered against live, real-time data will become a core capability of successful enterprises. It has the potential to improve the experience that you provide to your customers and deepen their engagement. But to make application-driven intelligence a reality, you can no longer rely only on copying live application data out of operational systems into analytics stores. Rather, it takes the unique real-time application-serving layer of a MongoDB database combined with the scale and real-time capabilities of a Databricks Lakehouse to automate and operationalize complex and AI-enhanced applications at scale.

In this session, we will show how it can be seamless for developers and data scientists to automate decisioning and actions on fresh application data and we'll deliver a practical demonstration on how operational data can be integrated in real time to run complex machine learning pipelines.

Talk by: Mat Keep and Ashwin Gangadhar

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Processing Delta Lake Tables on AWS Using AWS Glue, Amazon Athena, and Amazon Redshift

Processing Delta Lake Tables on AWS Using AWS Glue, Amazon Athena, and Amazon Redshift

2023-07-26 Watch
video

Delta Lake is an open source project that helps implement modern data lake architectures commonly built on cloud storages. With Delta Lake, you can achieve ACID transactions, time travel queries, CDC, and other common use cases on the cloud.

There are a lot of use cases of Delta tables on AWS. AWS has invested a lot in this technology, and now Delta Lake is available with multiple AWS services, such as AWS Glue Spark jobs, Amazon EMR, Amazon Athena, and Amazon Redshift Spectrum. AWS Glue is a serverless, scalable data integration service that makes it easier to discover, prepare, move, and integrate data from multiple sources. With AWS Glue, you can easily ingest data from multiple data sources such as on-prem databases, Amazon RDS, DynamoDB, MongoDB into Delta Lake on Amazon S3 even without expertise in coding.

This session will demonstrate how to get started with processing Delta Lake tables on Amazon S3 using AWS Glue, and querying from Amazon Athena, and Amazon Redshift. The session also covers recent AWS service updates related to Delta Lake.

Talk by: Noritaka Sekiyama and Akira Ajisaka

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

US Army Corp of Engineers Enhanced Commerce & National Sec Through Data-Driven Geospatial Insight

US Army Corp of Engineers Enhanced Commerce & National Sec Through Data-Driven Geospatial Insight

2023-07-26 Watch
video

The US Army Corps of Engineers (USACE) is responsible for maintaining and improving nearly 12,000 miles of shallow-draft (9'-14') inland and intracoastal waterways, 13,000 miles of deep-draft (14' and greater) coastal channels, and 400 ports, harbors, and turning basins throughout the United States. Because these components of the national waterway network are considered assets to both US commerce and national security, they must be carefully managed to keep marine traffic operating safely and efficiently.

The National DQM Program is tasked with providing USACE a nationally standardized remote monitoring and documentation system across multiple vessel types with timely data access, reporting, dredge certifications, data quality control, and data management. Government systems have often lagged commercial systems in modernization efforts, and the emergence of the cloud and Data Lakehouse Architectures have empowered USACE to successfully move into the modern data era.

This session incorporates aspects of these topics: Data Lakehouse Architecture: Delta Lake, platform security and privacy, serverless, administration, data warehouse, Data Lake, Apache Iceberg, Data Mesh GIS: H3, MOSAIC, spatial analysis data engineering: data pipelines, orchestration, CDC, medallion architecture, Databricks Workflows, data munging, ETL/ELT, lakehouses, data lakes, Parquet, Data Mesh, Apache Spark™ internals. Data Streaming: Apache Spark Structured Streaming, real-time ingestion, real-time ETL, real-time ML, real-time analytics, and real-time applications, Delta Live Tables. ML: PyTorch, TensorFlow, Keras, scikit-learn, Python and R ecosystems data governance: security, compliance, RMF, NIST data sharing: sharing and collaboration, delta sharing, data cleanliness, APIs.

Talk by: Jeff Mroz

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Planning and Executing a Snowflake Data Warehouse Migration to Databricks

Planning and Executing a Snowflake Data Warehouse Migration to Databricks

2023-07-26 Watch
video

Organizations are going through a critical phase of data infrastructure modernization, laying the foundation for the future, and adapting to support growing data and AI needs. Organizations that embraced cloud data warehouses (CDW) such as Snowflake have ended up trying to use a data warehousing tool for ETL pipelines and data science. This created unnecessary complexity and resulted in poor performance since data warehouses are optimized for SQL-based analytics only.

Realizing the limitation and pain with cloud data warehouses, organizations are turning to a lakehouse-first architecture. Though a cloud platform to cloud platform migration should be relatively easy, the breadth of the Databricks platform provides flexibility and hence requires careful planning and execution. In this session, we present the migration methodology, technical approaches, automation tools, product/feature mapping, a technical demo and best practices using real-world case studies for migrating data, ELT pipelines and warehouses from Snowflake to Databricks.

Talk by: Satish Garla and Ramachandran Venkat

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksin

What's New in Databricks SQL -- With Live Demos

What's New in Databricks SQL -- With Live Demos

2023-07-25 Watch
video
Can Efeoglu (Databricks)

We’ve been pushing ahead to make the lakehouse even better for data warehousing across several pillars: native serverless experience, best in class price performance, intelligent workload management & observability and enhanced connectivity, analyst & developer experiences. As we look to double down on that pace of innovation, we want to deep dive into everything that’s been keeping us busy.

In this session we will share an update on key roadmap items. To bring things to life, you will see live demos of the most recent capabilities, from data ingestion, transformation, and consumption, using the modern data stack along with Databricks SQL.

Talk by: Can Efeoglu

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Build Your Data Lakehouse with a Modern Data Stack on Databricks

Build Your Data Lakehouse with a Modern Data Stack on Databricks

2023-07-25 Watch
video
Pearl Ubaru (Databricks) , Ari Kaplan (Databricks)

Are you looking for an introduction to the Lakehouse and what the related technology is all about? This session is for you. This session explains the value that lakehouses bring to the table using examples of companies that are actually modernizing their data, showing demos throughout. The data lakehouse is the future for modern data teams that want to simplify data workloads, ease collaboration, and maintain the flexibility and openness to stay agile as a company scales.

Come to this session and learn about the full stack, including data engineering, data warehousing in a lakehouse, data streaming, governance, and data science and AI. Learn how you can create modern data solutions of your own.

Talk by: Ari Kaplan and Pearl Ubaru

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Building a Lakehouse for Data Science at DoorDash

Building a Lakehouse for Data Science at DoorDash

2022-07-19 Watch
video

DoorDash was using a data warehouse but found that they needed more data transparency, lower costs, and the ability to handle streaming data as well as batch data. With an engineering team rooted in big data backgrounds at Uber and LinkedIn, they moved to a Lakehouse architecture intuitively, without knowing about the term. In this session, learn more about how they arrived at that architecture, the process of making the move, and the results they have seen. While addressing both data analysts and data scientists from their lakehouse, this session will focus on their machine learning operations, and how their efficiencies are enabling them to tackle more advanced use cases such as NLP and image classification.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Secure Data Distribution and Insights with Databricks on AWS

Secure Data Distribution and Insights with Databricks on AWS

2022-07-19 Watch
video

Every industry must comply with some form of compliance or data security in order to operate. As data becomes more mission critical to the organization, so does the need to protect and secure it.

Public Sector organizations are responsible for securing sensitive data sets and complying with regulatory programs such as HIPAA, FedRAMP, and StateRAMP.

This does not come as a surprise given the many different attacks targeted at the industry and the extremely sensitive nature of the large volumes of data stored and analyzed. For a product owner or DBA, this can be extremely overwhelming with a security team issuing more restrictions and data access becoming more of a common request among business users. It can be difficult finding an effective governance model to democratize data while also managing compliance across your hybrid estate.

In this session, we will discuss challenges faced in the public sector when expanding to AWS cloud. We will review best practices for managing access and data integrity for a cloud-based data lakehouse with Databricks, and discuss recommended approaches for securing your AWS Cloud environment. We will highlight ways to enable compliance by developing a continuous monitoring strategy and providing tips for implementation of defense in depth. This guide will provide critical questions to ask, an overall strategy, and specific recommendations to serve all security leaders and data engineers in the Public Sector.

This talk is intended to educate on security design considerations when extending your data warehouse to the cloud. This guidance is expected to grow and evolve as new standards and offerings emerge for local, state, and federal government.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

So Fresh and So Clean: Learn How to Build Real-Time Warehouses on Lakehouse

So Fresh and So Clean: Learn How to Build Real-Time Warehouses on Lakehouse

2022-07-19 Watch
video

Warehouses? Where we are going, we won't need warehouses! Join Dillon, Franco, and Shannon as they take an industry-standard Data Warehouse integration benchmark, called TPC-DI, which is a typical 80s style data warehouse, and bring it into the future. We will review how to implement standard data warehousing practices on Lakehouse, and show you how to deliver optimal price/performance in the cloud and keep your data so fresh and so clean. We will take an assortment of structured, semi-structured, and unstructured data in the form of CSV, TXT, XML, and Fixed-Width files, and transform them warehouse-style into Lakehouse with a historical load and incremental CDC loads.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

How To Use Databricks SQL for Analytics on Your Lakehouse

How To Use Databricks SQL for Analytics on Your Lakehouse

2022-07-19 Watch
video

Most organizations run complex cloud data architectures that silo applications, users, and data. As a result, most analysis is performed with stale data and there isn’t a single source of truth of data for analytics.

Join this interactive follow-along deep dive demo to learn how Databricks SQL allows you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics — with up to 12x better price/performance than traditional cloud data warehouses. Now data analysts and scientists can work with the freshest and most complete data and quickly derive new insights for accurate decision-making.

Here’s what we’ll cover: • Managing data access and permissions and monitoring how the data is being used and accessed in real time across your entire lakehouse infrastructure • Configuring and managing compute resources for fast performance, low latency, and high user concurrency to your data lake • Creating and working with queries, dashboards, query refresh, troubleshooting features and alerts • Creating connections to third-party BI and database tools (Power BI, Tableau, DbVisualizer, etc.) so that you can query your lakehouse without making changes to your analytical and dashboarding workflows

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Backfill Streaming Data Pipelines in Kappa Architecture

Backfill Streaming Data Pipelines in Kappa Architecture

2022-07-19 Watch
video

Streaming data pipelines can fail due to various reasons. Since the source data, such as Kafka topics, often have limited retention, prolonged job failures can lead to data loss. Thus, streaming jobs need to be backfillable at all times to prevent data loss in case of failures. One solution is to increase the source's retention so that backfilling is simply replaying source streams, but extending Kafka retention is very costly for Netflix's data sizes. Another solution is to utilize source data stored in DWH, commonly known as the Lambda architecture. However, this method introduces significant code duplication, as it requires engineers to maintain a separate equivalent batch job. At Netflix, we have created the Iceberg Source Connector to provide backfilling capabilities to Flink streaming applications. It allows Flink to stream data stored in Apache Iceberg while mirroring Kafka's ordering semantics, enabling us to backfill large-scale stateful Flink pipelines at low retention cost.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Building an Analytics Lakehouse at Grab

Building an Analytics Lakehouse at Grab

2022-07-19 Watch
video

Grab shares the story of their Lakehouse journey, from the drivers behind their shift to this new paradigm, to lessons learned along the way. From a starting point of a siloed, data warehouse centric architecture that had inherent challenges with scalability, performance and data duplication, Grab has standardized upon Databricks to serve as an open and unified Lakehouse platform to deliver insights at scale, democratizing data through the rapid deployment of AI and BI use cases across their operations.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Radical Speed on the Lakehouse: Photon Under the Hood

Radical Speed on the Lakehouse: Photon Under the Hood

2022-07-19 Watch
video

Many organizations are standardizing on the lakehouse, however, this new architecture poses challenges with an underlying query execution engine for accessing structured and unstructured data. The execution engine needs to provide the performance of a data warehouse and the scalability of data lakes. To ensure optimum performance, the Databricks Lakehouse Platform offers Photon. This next-gen vectorized query execution engine outperforms existing data warehouses in SQL workloads and implements a more general execution framework for efficient processing of data with support of the Apache Spark™ API. With Photon, analytical queries are seeing a 3 to 5x speed increase, with a 40% reduction in compute hours for ETL workloads. In this session, we will dive into Photon, describe its integration with the Databricks Platform and Apache Spark™ runtimes, talk through customer use cases, and show how your SQL and DataFrame workloads can benefit from the performance of Photon.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Rethinking Orchestration as Reconciliation: Software-Defined Assets in Dagster

Rethinking Orchestration as Reconciliation: Software-Defined Assets in Dagster

2022-07-19 Watch
video

This talk discusses “software-defined assets”, a declarative approach to orchestration and data management that makes it drastically easier to trust and evolve datasets and ML models. Dagster is an open source orchestrator built for maintaining software-defined assets.

In traditional data platforms, code and data are only loosely coupled. As a consequence, deploying changes to data feels dangerous, backfills are error-prone and irreversible, and it’s difficult to trust data, because you don’t know where it comes from or how it’s intended to be maintained. Each time you run a job that mutates a data asset, you add a new variable to account for when debugging problems.

Dagster proposes an alternative approach to data management that tightly couples data assets to code - each table or ML model corresponds to the function that’s responsible for generating it. This results in a “Data as Code” approach that mimics the “Infrastructure as Code” approach that’s central to modern DevOps. Your git repo becomes your source of truth on your data, so pushing data changes feels as safe as pushing code changes. Backfills become easy to reason about. You trust your data assets because you know how they’re computed and can reproduce them at any time. The role of the orchestrator is to ensure that physical assets in the data warehouse match the logical assets that are defined in code, so each job run is a step towards order.

Software-defined assets is a natural approach to orchestration for the modern data stack, in part because dbt models are a type of software-defined asset.

Attendees of this session will learn how to build and maintain lakehouses of software-defined assets with Dagster.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Data Lake for State Health Exchange Analytics using Databricks

Data Lake for State Health Exchange Analytics using Databricks

2022-07-19 Watch
video

One of the largest State based health exchanges in the country was looking to modernize their data warehouse (DWH) environment to support the vision that every decision to design, implement and evaluate their state-based health exchange portal is informed by timely and rigorous evidence about its consumers’ experiences. The scope of the project was to replace existing Oracle-based DWH with an analytics platform that could support a much broader range of requirements with an ability to provide unified analytics capabilities including machine learning. The modernized analytics platform comprises a cloud native data lake and DWH solution using Databricks. The solution provides significantly higher performance and elastic scalability to better handle larger and varying data volumes with a much lower cost of ownership compared to the existing solution. In this session, we will walk through the rationale behind tool selection, solution architecture, project timeline and benefits expected.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Data Lakehouse and Data Mesh—Two Sides of the Same Coin

Data Lakehouse and Data Mesh—Two Sides of the Same Coin

2022-07-19 Watch
video

Over the last few years two new approaches to data management have been developed in the data community: Data Mesh and Data Lakehouse. The latter is an open architecture that pushes the technological advancements of a Data Lake by adding data management capabilities proven by a long history of Data Warehousing practices. Data Mesh on the other hand is addressing data management challenges from an organizational angle, by advocating decentralized ownership of domain data while applying product thinking and domain-driven design to analytics data. At first one might think that those two architectural approaches are competing with each other, however in this talk you will learn that the two are rather orthogonal and can go very well together.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Enabling BI in a Lakehouse Environment: How Spark and Delta Can Help With Automating a DWH Develop

Enabling BI in a Lakehouse Environment: How Spark and Delta Can Help With Automating a DWH Develop

2022-07-19 Watch
video

Traditional data warehouses typically struggle when it comes to handling large volumes of data and traffic, particularly when it comes to unstructured data. In contrast, data lakes overcome such issues and have become the central hub for storing data. We outline how we can enable BI Kimball data modelling in a Lakehouse environment.

We present how we built a Spark-based framework to modernize DWH development with data lake as central storage, assuring high data quality and scalability. The framework was implemented at over 15 enterprise data warehouses across Europe.

We present how one can tackle in Spark & with Delta Lake the data warehouse principles like surrogate, foreign and business keys, SCD type 1 and 2 etc. Additionally, we share our experiences on how such a unified data modelling framework can bridge BI with modern day use cases, such as machine learning and real time analytics. The session outlines the original challenges, the steps taken and the technical hurdles we faced.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Data Warehousing on the Lakehouse

Data Warehousing on the Lakehouse

2022-07-19 Watch
video

Most organizations routinely operate their business with complex cloud data architectures that silo applications, users and data. As a result, there is no single source of truth of data for analytics, and most analysis is performed with stale data. To solve these challenges, the lakehouse has emerged as the new standard for data architecture, with the promise to unify data, AI and analytic workloads in one place. In this session, we will cover why the data lakehouse is the next best data warehouse. You will hear from the experts success stories, use cases, and best practices learned from the field and discover how the data lakehouse ingests, stores and governs business-critical data at scale to build a curated data lake for data warehousing, SQL and BI workloads. You will also learn how Databricks SQL can help you lower costs and get started in seconds with instant, elastic SQL serverless compute, and how to empower every analytics engineers and analysts to quickly find and share new insights using their favorite BI and SQL tools, like Fivetran, dbt, Tableau or PowerBI.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Evolution of Data Architectures and How to Build a Lakehouse

Evolution of Data Architectures and How to Build a Lakehouse

2022-07-19 Watch
video

Data architectures are the key and part of a larger picture to building robust analytical and AI applications. One must take a holistic view of the entire data analytics realm when it comes to planning for data science initiatives.

Through this talk, learn about the evolution of the data landscape and why Lakehouses are becoming a de facto for organizations building scalable data architectures. A lakehouse architecture combines data management capability including reliability, integrity, and quality from the data warehouse and supports all data workloads including BI and AI with the low cost and open approach of data lakes.

Data Practitioners will also learn some core concepts of building an efficient Lakehouse with Delta Lake.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

How AARP Services, Inc. automated SAS transformation to Databricks using LeapLogic

How AARP Services, Inc. automated SAS transformation to Databricks using LeapLogic

2022-07-19 Watch
video

While SAS has been a standard in analytics and data science use cases, it is not cloud-native and does not scale well. Join us to learn how AARP automated the conversion of hundreds of complex data processing, model scoring, and campaign workloads to Databricks using LeapLogic, an intelligent code transformation accelerator that can transform any and all legacy ETL, analytics, data warehouse and Hadoop to modern data platforms.

In this session experts from AARP and Impetus will share about collaborating with Databricks and how they were able to: • Automate modernization of SAS marketing analytics based on coding best practices • Establish a rich library of Spark and Python equivalent functions on Databricks with the same capabilities as SAS procedures, DATA step operations, macros, and functions • Leverage Databricks-native services like Delta Live Tables to implement waterfall techniques for campaign execution and simplify pipeline monitoring

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

How to Automate the Modernization and Migration of Your Data Warehousing Workloads to Databricks

How to Automate the Modernization and Migration of Your Data Warehousing Workloads to Databricks

2022-07-19 Watch
video

The logic in your data is the heartbeat of your organization’s reports, analytics, dashboards and applications. But that logic is often trapped in antiquated technologies that can’t take advantage of the massive scalability in the Databricks Lakehouse.

In this session BladeBridge will show how to automate the conversion of this metadata and code into Databricks PySpark and DBSQL. BladeBridge will demonstrate the flexibility of configuring for N legacy technologies to facilitate an automated path for not just a single modernization project but a factory approach for corporate wide modernization.

BladeBridge will also present how you can empirically size your migration project to determine the level of effort required.

In this session you will learn: What BladeBridge Converter is What BladeBridge Analyzer is How BladeBridge configures Readers and Writers How to size a conversion effort How to accelerate adoption of Databricks

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Day 1 Morning Keynote | Data + AI Summit 2022

Day 1 Morning Keynote | Data + AI Summit 2022

2022-07-19 Watch
video
Kerby Johnson , Shant Hovespian , Dave Weinstein (Adobe) , Karthik Ramasamy (Databricks) , Ali Ghodsi (Databricks) , Reynold Xin (Databricks) , Matei Zaharia (Databricks) , Michael Armbrust (Databricks) , Tristan Handy

Day 1 Morning Keynote | Data + AI Summit 2022 Welcome & "Destination Lakehouse" | Ali Ghodsi Apache Spark Community Update | Reynold Xin Streaming Lakehouse | Karthik Ramasamy Delta Lake | Michael Armbrust How Adobe migrated to a unified and open data Lakehouse to deliver personalization at unprecedented scale | Dave Weinstein Data Governance and Sharing on Lakehouse |Matei Zaharia Analytics Engineering and the Great Convergence | Tristan Handy Data Warehousing | Shant Hovespian Unlocking the power of data, AI & analytics: Amgen’s journey to the Lakehouse | Kerby Johnson

Get insights on how to launch a successful lakehouse architecture in Rise of the Data Lakehouse by Bill Inmon, the father of the data warehouse. Download the ebook: https://dbricks.co/3ER9Y0K

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Data Warehousing   Shant Hovespian   Keynote Data + AI Summit 2022

Data Warehousing Shant Hovespian Keynote Data + AI Summit 2022

2022-07-19 Watch
video

Data + AI Summit Keynote talk from Shant Hovespian

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Unlocking the power of data, AI & analytics: Amgen’s journey to the Lakehouse | Kerby Johnson

Unlocking the power of data, AI & analytics: Amgen’s journey to the Lakehouse | Kerby Johnson

2022-07-19 Watch
video

In this keynote, you will learn more about Amgen's data platform journey from data warehouse to data lakehouse. They’’ll discuss our decision process and the challenges they faced with legacy architectures, and how they designed and implemented a sustaining platform strategy with Databricks Lakehouse, accelerating their ability to democratize data to thousands of users.
Today, Amgen has implemented 400+ data science and analytics projects covering use cases like clinical trial optimization, supply chain management and commercial sales reporting, with more to come as they complete their digital transformation and unlock the power of data across the company.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/