talk-data.com

Topic: Analytics
Tags: data_analysis, insights, metrics
170 tagged activities

Activity Trend: peak of 398 activities per quarter (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Databricks DATA + AI Summit 2023
How Mars Achieved a People Analytics Transformation with a Modern Data Stack

People Analytics at Mars was formed two years ago as part of an ambitious journey to transform our HR analytics capabilities. To transform, we needed to build foundational services that give our associates helpful insights by delivering fast results and resolving complex problems. Critical to that foundation are data governance and data enablement, which are the responsibility of the Mars People Data Office team, whose focus is to deliver high-quality, reliable data that is reusable for current and future People Analytics use cases. Come learn how this team used Databricks to help Mars achieve its People Analytics transformation.

Talk by: Rachel Belino and Sreeharsha Alagani

Here’s more to explore: State of Data + AI Report: https://dbricks.co/44i2HBp The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Sponsored by: Kyvos | Analytics 100x Faster at the Lowest Cost w/ Kyvos & Databricks, Even on Trillions of Rows

Databricks and Kyvos together are helping organizations build their next-generation cloud analytics platform: a platform that can process and analyze massive amounts of data, even trillions of rows, and provide multidimensional insights instantly. By combining the power of Databricks with the speed, scale, and cost-optimization capabilities of the Kyvos Analytics Acceleration Platform, customers can push beyond the boundaries of their analytics. Join our session to learn how, and to hear about a real-world use case.

Talk by: Leo Duncan

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Activate Your Lakehouse with Unity Catalog

Building a lakehouse is straightforward today thanks to many open source technologies and Databricks. However, without robust data operations it can be taxing to extract value from lakehouses as they grow. Join us to learn how YipitData uses Unity Catalog to streamline data operations and discover best practices for scaling your own lakehouse. At YipitData, our 15+ petabyte lakehouse is a self-service data platform built with Databricks and AWS, supporting analytics for a data team of over 250. We will share how leveraging Unity Catalog accelerates our mission to help financial institutions and corporations leverage alternative data by:

  • Enabling clients to universally access our data through a spectrum of channels, including Sigma, Delta Sharing, and multiple clouds
  • Fostering collaboration across internal teams using a data mesh paradigm that yields rich insights
  • Strengthening the integrity and security of data assets through ACLs, data lineage, audit logs, and further isolation of AWS resources
  • Reducing the cost of large tables without downtime through automated data expiration and ETL optimizations on managed delta tables

Through our migration to Unity Catalog, we have gained tactics and philosophies for moving our data assets seamlessly, both internally and externally. In today's world, data platforms need to be value-generating, secure, and cost-effective. We are excited to share how Unity Catalog delivers on this and helps you get the most out of your lakehouse.
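
To make the governance points above concrete, here is a minimal, hedged sketch of Unity Catalog-style table creation and ACL grants from a Databricks notebook; the catalog, schema, table, and group names are hypothetical placeholders, not YipitData's actual setup.

    # Hedged sketch: govern a managed Delta table with Unity Catalog ACLs.
    # All object and group names below are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Create a governed namespace for a data product
    spark.sql("CREATE CATALOG IF NOT EXISTS analytics")
    spark.sql("CREATE SCHEMA IF NOT EXISTS analytics.retail")

    # Register a managed Delta table inside Unity Catalog
    spark.sql("""
        CREATE TABLE IF NOT EXISTS analytics.retail.transactions (
            txn_id STRING,
            txn_date DATE,
            amount DOUBLE
        )
    """)

    # Grant read-only access to an internal analyst group; lineage and audit
    # logs for this table are then captured by Unity Catalog.
    spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `analysts`")
    spark.sql("GRANT USE SCHEMA ON SCHEMA analytics.retail TO `analysts`")
    spark.sql("GRANT SELECT ON TABLE analytics.retail.transactions TO `analysts`")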

Talk by: Anup Segu

Best Data Warehouse is a Lakehouse: Databricks Achieves Ops Efficiency w/ Lakehouse Architecture

At Databricks, we use the Lakehouse architecture to build an optimized data warehouse that drives better insights, increases operational efficiency, and reduces costs. In this session, Naveen Zutshi, CIO at Databricks, and Romit Jadhwani, Senior Director of Analytics and Integrations at Databricks, will discuss the Databricks journey and provide technical and business insights into how these results were achieved.

The session will cover topics such as the medallion architecture, building efficient third-party integrations, how Databricks built various data products and services on the data warehouse, and how to use governance to break down data silos and achieve consistent sources of truth.
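
As a rough illustration of the medallion pattern mentioned above (not the actual Databricks internal implementation), the following hedged PySpark sketch moves data through bronze, silver, and gold Delta tables; all table and column names are made up.

    # Hedged medallion-architecture sketch: raw events land in bronze, are
    # cleaned into silver, and aggregated into a gold table for BI use.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Bronze: raw ingestion, stored as-is in Delta
    raw = spark.read.json("/landing/crm_events/")
    raw.write.format("delta").mode("append").saveAsTable("bronze.crm_events")

    # Silver: cleaned and conformed records
    bronze = spark.table("bronze.crm_events")
    silver = (bronze
              .dropDuplicates(["event_id"])
              .filter(F.col("event_ts").isNotNull())
              .withColumn("event_date", F.to_date("event_ts")))
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.crm_events")

    # Gold: business-level aggregate consumed by downstream dashboards
    gold = (silver.groupBy("account_id", "event_date")
                  .agg(F.count("*").alias("events")))
    gold.write.format("delta").mode("overwrite").saveAsTable("gold.account_daily_activity")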

Talk by: Naveen Zutshi and Romit Jadhwani

How the Texas Rangers Revolutionized Baseball Analytics with a Modern Data Lakehouse

Don't miss this session, where we demonstrate how the Texas Rangers baseball team organized their predictive models using MLflow and the Model Registry inside Databricks. We started using Databricks as a simple way to centralize our development in the cloud. This helped lessen the problem of siloed development on our team and allowed us to leverage the benefits of distributed cloud computing.

But we quickly found that Databricks was also a perfect solution to another problem we faced in our data engineering stack. Specifically, cost, complexity, and scalability issues had hampered our data architecture development for years, and we decided we needed to modernize our stack by migrating to a lakehouse. With ad hoc analytics, ETL operations, and MLOps all living within the Databricks Lakehouse, development at scale has never been easier for our team.

Going forward, we hope to fully eliminate the silos of development, and remove the disconnect between our analytics and data engineering teams. From computer vision, pose analytics, and player tracking, to pitch design, base stealing likelihood, and more, come see how the Texas Rangers are using innovative cloud technologies to create action-driven reports from the current sea of big data.
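
For readers unfamiliar with the MLflow Model Registry workflow referenced above, here is a minimal, hedged sketch of logging and registering a model; the model name, features, and data are hypothetical and not the Rangers' actual code.

    # Hedged sketch: train a toy model, log it with MLflow, and register it in
    # the Model Registry so teammates can load a specific version later.
    import mlflow
    import mlflow.sklearn
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.datasets import make_classification

    X, y = make_classification(n_samples=500, n_features=8, random_state=42)

    with mlflow.start_run():
        model = RandomForestClassifier(n_estimators=100, random_state=42)
        model.fit(X, y)
        mlflow.log_metric("train_accuracy", model.score(X, y))
        # registered_model_name creates/updates a registry entry, giving the
        # team one versioned home for each predictive model.
        mlflow.sklearn.log_model(
            model,
            artifact_path="model",
            registered_model_name="stolen_base_likelihood",  # hypothetical name
        )

    # Later, another notebook or job can load a registered version:
    model_v1 = mlflow.pyfunc.load_model("models:/stolen_base_likelihood/1")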

Talk by: Alexander Booth and Oliver Dykstra

Leveraging IoT Data at Scale to Mitigate Global Water Risks Using Apache Spark™ Streaming and Delta

Every year, billions of dollars are lost to water risks from storms, floods, and droughts. Water data scarcity and excess are issues that risk models cannot overcome, creating a world of uncertainty. Divirod is building a platform of water data by normalizing diverse data sources of varying velocity into one unified data asset. In addition to publicly available third-party datasets, we are rapidly deploying our own IoT sensors. These sensors ingest signals at a rate of about 100,000 messages per hour into preprocessing, signal-processing, analytics, and postprocessing workloads in a single Spark Streaming pipeline that enables critical real-time decision-making. By leveraging a streaming architecture, we were able to reduce end-to-end latency from tens of minutes to just a few seconds.

We are leveraging Delta Lake to provide a single query interface across multiple tables of this continuously changing data. This enables data science and analytics workloads to always use the most current and comprehensive information available. In addition to the obvious schema transformations, we implement data quality metrics and datum conversions to provide a trustworthy unified dataset.
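
The streaming pattern described above can be sketched roughly as follows: a hedged, simplified example that reads high-rate sensor messages, applies light preprocessing, and appends continuously to a Delta table. The Kafka topic, schema, and paths are placeholders, not Divirod's actual pipeline.

    # Hedged sketch: stream sensor messages into a continuously updated Delta
    # table that analytics and data science jobs can query.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

    spark = SparkSession.builder.getOrCreate()

    schema = StructType([
        StructField("sensor_id", StringType()),
        StructField("measured_at", TimestampType()),
        StructField("water_level_m", DoubleType()),
    ])

    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")  # placeholder
           .option("subscribe", "sensor-readings")            # placeholder topic
           .load())

    readings = (raw
                .select(F.from_json(F.col("value").cast("string"), schema).alias("r"))
                .select("r.*")
                .filter(F.col("water_level_m").isNotNull()))  # basic preprocessing

    query = (readings.writeStream
             .format("delta")
             .option("checkpointLocation", "/chk/sensor_readings")  # placeholder path
             .outputMode("append")
             .toTable("silver.sensor_readings"))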

Talk by: Adam Wilson and Heiko Udluft

Here’s more to explore: Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

Making the Shift to Application-Driven Intelligence

In the digital economy, application-driven intelligence delivered against live, real-time data will become a core capability of successful enterprises. It has the potential to improve the experience that you provide to your customers and deepen their engagement. But to make application-driven intelligence a reality, you can no longer rely only on copying live application data out of operational systems into analytics stores. Rather, it takes the unique real-time application-serving layer of a MongoDB database combined with the scale and real-time capabilities of a Databricks Lakehouse to automate and operationalize complex and AI-enhanced applications at scale.

In this session, we will show how seamless it can be for developers and data scientists to automate decisioning and actions on fresh application data, and we'll deliver a practical demonstration of how operational data can be integrated in real time to run complex machine learning pipelines.
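
As a hedged illustration of the kind of integration described (assuming the MongoDB Spark Connector 10.x is installed on the cluster), this sketch reads operational documents into Spark so a downstream pipeline can use them; the URI, database, collection, and table names are placeholders.

    # Hedged sketch: pull fresh operational documents from MongoDB into Spark
    # and land them in the lakehouse for scoring by an ML pipeline.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    orders = (spark.read
              .format("mongodb")
              .option("connection.uri", "mongodb+srv://user:pass@cluster.example.net")  # placeholder
              .option("database", "shop")
              .option("collection", "orders")
              .load())

    # Hand the operational data to an existing feature/scoring pipeline
    recent = orders.filter("status = 'open'")
    recent.write.format("delta").mode("append").saveAsTable("bronze.open_orders")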

Talk by: Mat Keep and Ashwin Gangadhar

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Sponsored by: Striim | Powering a Delightful Travel Experience with a Real-Time Operational Data Hub

American Airlines champions operational excellence in airline operations to provide the most delightful experience to our customers, with on-time flights and meticulously maintained aircraft. To modernize and scale technical operations with real-time, data-driven processes, we delivered a DataHub that connects data from multiple sources and delivers it to analytics engines and systems of engagement in real time. This enables operational teams to use any kind of aircraft data from almost any source imaginable and turn it into meaningful, actionable insights with speed and ease. It empowers maintenance hubs to choose the best service and determine the most effective ways to utilize resources that can impact maintenance outcomes and costs. The end product is a smooth and scalable operation that results in a better experience for travelers. In this session, you will learn how we combine an operational data store (MongoDB) and a fully managed streaming engine (Striim) to enable analytics teams using Databricks with real-time operational data.

Talk by: John Kutay and Ganesh Deivarayan

Here’s more to explore: Big Book of Data Engineering: 2nd Edition: https://dbricks.co/3XpPgNV The Data Team's Guide to the Databricks Lakehouse Platform: https://dbricks.co/46nuDpI

US Army Corps of Engineers Enhanced Commerce & National Security Through Data-Driven Geospatial Insight

The US Army Corps of Engineers (USACE) is responsible for maintaining and improving nearly 12,000 miles of shallow-draft (9'-14') inland and intracoastal waterways, 13,000 miles of deep-draft (14' and greater) coastal channels, and 400 ports, harbors, and turning basins throughout the United States. Because these components of the national waterway network are considered assets to both US commerce and national security, they must be carefully managed to keep marine traffic operating safely and efficiently.

The National DQM Program is tasked with providing USACE a nationally standardized remote monitoring and documentation system across multiple vessel types, with timely data access, reporting, dredge certifications, data quality control, and data management. Government systems have often lagged commercial systems in modernization efforts, and the emergence of the cloud and data lakehouse architectures has empowered USACE to successfully move into the modern data era.

This session incorporates aspects of these topics:

  • Data Lakehouse Architecture: Delta Lake, platform security and privacy, serverless, administration, data warehouse, data lake, Apache Iceberg, Data Mesh
  • GIS: H3, Mosaic, spatial analysis
  • Data Engineering: data pipelines, orchestration, CDC, medallion architecture, Databricks Workflows, data munging, ETL/ELT, lakehouses, data lakes, Parquet, Data Mesh, Apache Spark™ internals
  • Data Streaming: Apache Spark Structured Streaming, real-time ingestion, real-time ETL, real-time ML, real-time analytics, real-time applications, Delta Live Tables
  • ML: PyTorch, TensorFlow, Keras, scikit-learn, Python and R ecosystems
  • Data Governance: security, compliance, RMF, NIST
  • Data Sharing: sharing and collaboration, Delta Sharing, data cleanliness, APIs

Talk by: Jeff Mroz

Accelerating the Development of Viewership Personas with a Unified Feature Store

With the proliferation of video content and flourishing consumer demand, there is an enormous opportunity for customer-centric video entertainment companies to use data and analytics to understand what their viewers want and deliver more of the content that meets their needs.

At DIRECTV, our Data Science Center of Excellence is constantly looking to push the boundary of innovation in how we can better and more quickly understand the needs of our customers and leverage those actionable insights to deliver business impact. One way we do so is through the development of viewership personas, using cluster analysis at scale to group our customers by the types of content they enjoy watching. This process is significantly accelerated by a unified feature store, which contains a wide array of features that capture key information on viewing preferences.

This talk will focus on how the DIRECTV Data Science team utilizes Databricks to help develop a unified feature store and how we leverage that feature store to accelerate the process of running machine learning algorithms that find meaningful viewership clusters.
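
A minimal, hedged sketch of the workflow described, assuming the Databricks Feature Store client is available on the cluster: read features from a feature table and cluster viewers with k-means. The feature table and column names are hypothetical, not DIRECTV's.

    # Hedged sketch: read viewing-preference features from a feature table and
    # group subscribers into candidate viewership personas with k-means.
    from databricks.feature_store import FeatureStoreClient
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.clustering import KMeans

    fs = FeatureStoreClient()

    # A unified feature table maintained by upstream feature pipelines
    features = fs.read_table(name="ml.viewership.subscriber_features")  # hypothetical

    assembler = VectorAssembler(
        inputCols=["pct_sports", "pct_news", "pct_drama", "avg_daily_hours"],
        outputCol="features",
    )
    vectors = assembler.transform(features)

    kmeans = KMeans(k=8, seed=7, featuresCol="features", predictionCol="persona")
    personas = kmeans.fit(vectors).transform(vectors)

    personas.groupBy("persona").count().show()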

Talk by: Malav Shah and Taylor Hosbach

Databricks Marketplace: Going Beyond Data and Applications

The demand for third-party data has never been greater, but existing marketplaces simply aren't cutting it. You deserve more than being locked into a walled garden of just data sets and simple applications. You deserve an open marketplace to exchange ML models, notebooks, datasets and more. The Databricks Marketplace is the ultimate solution for your data, AI and analytics needs, powered by open source Delta Sharing. Databricks is revolutionizing the data marketplace space.

Join us for a demo-filled session and learn how Databricks Marketplace is exactly what you need in today's AI-driven innovation ecosystem. Hear from customers how Databricks is empowering organizations to leverage shared knowledge and take their analytics and AI to new heights. Take advantage of this rare opportunity to ask questions of the Databricks product team that is building the Databricks Marketplace.

Talk by: Mengxi Chen and Darshana Sivakumar

Lakehouses: The Best Start to Your Graph Data and Analytics Journey

Data architects and IT executives are continually looking for the best ways to integrate graph data and analytics into their organizations to improve business outcomes. This session outlines how the Data Lakehouse provides the perfect starting point for a successful journey. We will explore how the Data Lakehouse offers the unique combination of scalability, flexibility, and speed to quickly and effectively ingest, pre-process, curate, and analyze graph data to create powerful analytics. Additionally, we will discuss the benefits of using the Data Lakehouse over traditional graph databases and how it can help improve time to insight, time to production, and overall satisfaction. At the end of this presentation, attendees will:

  • Understand the benefits of using a Data Lakehouse for graph data and analytics
  • Learn how to get started with a successful Lakehouse implementation (demo)
  • Discover the advantages of using a Data Lakehouse over graph databases
  • Learn specifically where graph databases integrate and perform better together

Key Takeaways:

  • Data lakehouses provide the perfect starting point for a successful graph data and analytics journey
  • Data lakehouses offer the scalability, flexibility, and speed to quickly and effectively analyze graph data
  • The data lakehouse is a cost-effective alternative to traditional graph databases, shortening your time to insight and de-risking your project
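
As a hedged illustration of graph analytics running directly on lakehouse tables (assuming the GraphFrames library is installed on the cluster), the following sketch builds a graph from two Delta tables and runs PageRank and connected components; the table and column names are hypothetical.

    # Hedged sketch: graph analytics on lakehouse tables with GraphFrames,
    # without moving data into a separate graph database.
    from pyspark.sql import SparkSession
    from graphframes import GraphFrame

    spark = SparkSession.builder.getOrCreate()
    spark.sparkContext.setCheckpointDir("/tmp/graph-checkpoints")  # required by connectedComponents

    # Vertices and edges curated as ordinary Delta tables in the lakehouse
    vertices = spark.table("silver.customers").withColumnRenamed("customer_id", "id")
    edges = (spark.table("silver.referrals")
                  .selectExpr("referrer_id as src", "referred_id as dst", "channel"))

    g = GraphFrame(vertices, edges)

    # Influence ranking and community detection
    ranks = g.pageRank(resetProbability=0.15, maxIter=10)
    communities = g.connectedComponents()

    ranks.vertices.select("id", "pagerank").orderBy("pagerank", ascending=False).show(10)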

Talk by: Douglas Moore

Planning and Executing a Snowflake Data Warehouse Migration to Databricks

Organizations are going through a critical phase of data infrastructure modernization, laying the foundation for the future and adapting to support growing data and AI needs. Organizations that embraced cloud data warehouses (CDWs) such as Snowflake have ended up trying to use a data warehousing tool for ETL pipelines and data science. This created unnecessary complexity and resulted in poor performance, since data warehouses are optimized for SQL-based analytics only.

Realizing the limitations and pain points of cloud data warehouses, organizations are turning to a lakehouse-first architecture. Though a migration from one cloud platform to another should be relatively easy, the breadth of the Databricks platform provides flexibility and hence requires careful planning and execution. In this session, we present the migration methodology, technical approaches, automation tools, product and feature mapping, a technical demo, and best practices, using real-world case studies of migrating data, ELT pipelines, and warehouses from Snowflake to Databricks.
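
One small building block of such a migration can be sketched in a hedged form: reading a Snowflake table with the Spark Snowflake connector and landing it as a Delta table. Connection options and table names are placeholders; real migrations automate this per table and pipeline.

    # Hedged sketch: copy one Snowflake table into a Delta table on Databricks.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    sf_options = {
        "sfUrl": "myaccount.snowflakecomputing.com",  # placeholder account
        "sfUser": "etl_user",
        "sfPassword": "********",
        "sfDatabase": "SALES",
        "sfSchema": "PUBLIC",
        "sfWarehouse": "MIGRATION_WH",
    }

    orders = (spark.read
              .format("snowflake")
              .options(**sf_options)
              .option("dbtable", "ORDERS")
              .load())

    orders.write.format("delta").mode("overwrite").saveAsTable("bronze.orders")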

Talk by: Satish Garla and Ramachandran Venkat

Simplifying Lakehouse Observability: Databricks Key Design Goals and Strategies

In this session, we'll explore Databricks' vision for simplifying lakehouse observability, a critical component of any successful data, analytics, and machine learning initiative. By directly integrating observability solutions within the lakehouse, Databricks aims to provide users with the tools and insights needed to run a successful business on top of the lakehouse.

Our approach is designed to leverage existing expertise and simplify the process of monitoring and optimizing data and ML workflows, enabling teams to deliver sustainable and scalable data and AI applications. Join us to learn more about our key design goals and how Databricks is streamlining lakehouse observability to support the next generation of data-driven applications.

Talk by: Michael Milirud

Sponsored by: Dataiku | Have Your Cake and Eat it Too with Dataiku + Databricks

In this session, we will highlight all parts of the analytics lifecycle using Dataiku + Databricks. Explore, blend, and prepare source data, train a machine learning model and score new data, and visualize and publish results — all using only Dataiku's visual interface. Plus, we will use LLMs for everything from simple data prep to sophisticated development pipelines. Attend and learn how you can truly have it all with Dataiku + Databricks.

Talk by: Amanda Milberg

Writing Data-Sharing Apps Using Node.js and Delta Sharing

JavaScript remains the most popular programming language today, with most code repositories on GitHub written in JavaScript. However, JavaScript is evolving beyond a language for web application development into a language built for tomorrow. Everyday tasks like data wrangling, data analysis, and predictive analytics are possible today directly from a web browser. For example, many popular data analytics libraries now offer JavaScript support, such as TensorFlow.js.

Another popular library, Danfo.js, makes it possible to wrangle data using familiar pandas-like operations, shortening the learning curve and arming the typical data engineer or data scientist with another data tool in their toolbox. In this presentation, we’ll explore using the Node.js connector for Delta Sharing to build a data analytics app that summarizes a Twitter dataset.
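
The talk itself uses the Node.js connector; as a rough analogue, here is a hedged sketch with the open-source Python delta-sharing connector, which follows the same Delta Sharing protocol. The profile file, share, schema, and table names are placeholders.

    # Hedged sketch: discover and read a shared table via Delta Sharing using
    # the Python connector (the talk demonstrates the Node.js equivalent).
    import delta_sharing

    profile = "config.share"  # credentials file issued by the data provider

    # Discover what the provider has shared with us
    client = delta_sharing.SharingClient(profile)
    for table in client.list_all_tables():
        print(table)

    # Load one shared table into pandas for a quick summary
    url = f"{profile}#tweets_share.social.tweets"  # <profile>#<share>.<schema>.<table>
    df = delta_sharing.load_as_pandas(url)
    print(df.groupby("lang").size().head())  # "lang" is a hypothetical column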

Talk by: Will Girten

Clean Room Primer: Using Databricks Clean Rooms to Use More & Better Data in your BI, ML, & Beyond

In this session, we will discuss the foundational changes in the ecosystem, the implications of data insights on marketing, analytics, and measurement, and how companies are coming together to collaborate through data clean rooms in new and exciting ways to power mutually beneficial value for their businesses while preserving privacy and governance.

We will delve into the concept and key features of clean room technology and how they can be used to access more and better data for business intelligence (BI), machine learning (ML), and other data-driven initiatives. By examining real-world use cases of clean rooms in action, attendees will gain a clear understanding of the benefits they can bring to industries like CPG, retail, and media and entertainment. In addition, we will unpack the advantages of using Databricks as a clean room platform, specifically showcasing how interoperable clean rooms can be leveraged to enhance BI, ML and other compute scenarios. By the end of this session, you will be equipped with the knowledge and inspiration to explore how clean rooms can unlock new collaboration opportunities that drive better outcomes for your business and improved experiences for consumers.

Talk by: Matthew Karasick and Anil Puliyeril

Lakehouse Architecture to Advance Security Analytics at the Department of State

In 2023, the Department of State surged forward on implementing a lakehouse architecture to get faster, smarter, and more effective at cybersecurity log monitoring and incident response. In addition to getting us ahead of federal mandates, this approach promises to enable advanced analytics and machine learning across our highly federated global IT environment while minimizing the costs associated with data retention and aggregation.

This talk will include a high-level overview of the technical and policy challenges and a deeper technical dive into the tactical implementation choices made. We'll share lessons learned about governance and securing organizational support, connecting multiple cloud environments, and standardizing data to make it useful for analytics. Finally, we'll discuss how the lakehouse leverages Databricks in multicloud environments to promote decentralized ownership of data while enabling strong, centralized data governance practices.

Talk by: Timothy Ahrens and Edward Moe

Leveraging Data Science for Game Growth and Matchmaking Optimization

For video games, data science solutions can be applied throughout the player lifecycle, from adtech and LTV forecasting to in-game economic system monitoring and experimentation. Databricks serves as the data and computation foundation that powers these solutions, enabling data scientists to easily develop and deploy them for different use cases.

In this session, we will share insights on how Databricks-powered data science solutions drive game growth and improve player experiences using advanced analytics, modeling, experimentation, and causal inference methods. We will introduce the business use cases and data science techniques, along with Databricks demos.

Talk by: Zhenyu Zhao and Shuo Chen

Sponsored by: AccuWeather | Weather Matters: How to Harness its Power to Improve Your Bottom Line

AccuWeather provides the world's most sophisticated weather intelligence to make lives and businesses simpler, safer, better, and more informed. To achieve its core mission and realize its technical and business ambitions, AccuWeather uses Databricks to support its next-generation forecasting engine and historical database, enabling its teams of scientists and engineers to develop accurate, scalable weather data solutions. Businesses can truly harness the power of AccuWeather's data suite by easily integrating weather information into their own decision support systems and analytics to improve their bottom line.

Talk by: Timothy Loftus and Eric Michielli

Here’s more to explore: A New Approach to Data Sharing: https://dbricks.co/44eUnT1
