talk-data.com talk-data.com

Topic

Analytics

data_analysis insights metrics

178

tagged

Activity Trend

398 peak/qtr
2020-Q1 2026-Q1

Activities

Showing filtered results

Filtering by: Data + AI Summit 2025 ×

How do you transform a data pipeline from sluggish 10-hour batch processing into a real-time powerhouse that delivers insights in just 10 minutes? This was the challenge we tackled at one of France's largest manufacturing companies, where data integration and analytics were mission-critical for supply chain optimization. Power BI dashboards needed to refresh every 15 minutes. Our team struggled with legacy Azure Data Factory batch pipelines. These outdated processes couldn’t keep up, delaying insights and generating up to three daily incident tickets. We identified Lakeflow Declarative Pipelines and Databricks SQL as the game-changing solution to modernize our workflow, implement quality checks, and reduce processing times.In this session, we’ll dive into the key factors behind our success: Pipeline modernization with Lakeflow Declarative Pipelines: improving scalability Data quality enforcement: clean, reliable datasets Seamless BI integration: Using Databricks SQL to power fast, efficient queries in Power BI

Get the Most of Your Delta Lake

Unlock the full potential of Delta Lake, the open-source storage framework for Apache Spark, with this session focused on its latest and most impactful features. Discover how capabilities like Time Travel, Column Mapping, Deletion Vectors, Liquid Clustering, UniForm interoperability, and Change Data Feed (CDF) can transform your data architecture. Learn not just what these features do, but when and how to use them to maximize performance, simplify data management, and enable advanced analytics across your lakehouse environment.

Leveling Up Gaming Analytics: How Supercell Evolved Player Experiences With Snowplow and Databricks

In the competitive gaming industry, understanding player behavior is key to delivering engaging experiences. Supercell, creators of Clash of Clans and Brawl Stars, faced challenges with fragmented data and limited visibility into user journeys. To address this, they partnered with Snowplow and Databricks to build a scalable, privacy-compliant data platform for real-time insights. By leveraging Snowplow’s behavioral data collection and Databricks’ Lakehouse architecture, Supercell achieved: Cross-platform data unification: A unified view of player actions across web, mobile and in-game Real-time analytics: Streaming event data into Delta Lake for dynamic game balancing and engagement Scalable infrastructure: Supporting terabytes of data during launches and live events AI & ML use cases: Churn prediction and personalized in-game recommendations This session explores Supercell’s data journey and AI-driven player engagement strategies.

Optimizing Smart Meter IIoT Data in Databricks for At-Scale Interactive Electrical Load Analytics

Octave is a Plotly Dash application used daily by about 1,000 Hydro-Québec technicians and engineers to analyze smart meter load and voltage data from 4.5M meters across the province. As adoption grew, Octave’s back end was migrated to Databricks to address increasingly massive scale (>1T data points), governance and security requirements. This talk will summarize how Databricks was optimized to support performant at-scale interactive Dash application experiences while in parallel managing complex back-end ETL processes. The talk will outline optimizations targeted to further optimize query latency and user concurrency, along with plans to increase data update frequency. Non-technology related success factors to be reviewed will include the value of: subject matter expertise, operational autonomy, code quality for long-term maintainability and proactive vendor technical support.

Powering Secure and Scalable Data Governance at PepsiCo With Unity Catalog Open APIs

PepsiCo, given its scale, has numerous teams leveraging different tools and engines to access data and perform analytics and AI. To streamline governance across this diverse ecosystem, PepsiCo unifies its data and AI assets under an open and enterprise-grade governance framework with Unity Catalog. In this session, we'll explore real-world examples of how PepsiCo extends Unity Catalog’s governance to all its data and AI assets, enabling secure collaboration even for teams outside Databricks. Learn how PepsiCo architects permissions using service principals and service accounts to authenticate with Unity Catalog, building a multi-engine architecture with seamless and open governance. Attendees will gain practical insights into designing a scalable, flexible data platform that unifies governance across all teams while embracing openness and interoperability.

Real-World Impact in Healthcare: How VUMC’s Enterprise Data Platform Supports Patient Care and Leading-Edge Research

Vanderbilt University Medical Center (VUMC) stands at the forefront of health informatics, harnessing the power of data to redefine patient care and make healthcare personal. Join us as we explore how VUMC enables operational and strategic analytics, supports research, and ultimately drives insights into clinical workflow in and around the Epic EHR platform.

Sponsored by: Confluent | Turn SAP Data into AI-Powered Insights with Databricks

Learn how Confluent simplifies real-time streaming of your SAP data into AI-ready Delta tables on Databricks. In this session, you'll see how Confluent’s fully managed data streaming platform—with unified Apache Kafka® and Apache Flink®—connects data from SAP S/4HANA, ECC, and 120+ other sources to enable easy development of trusted, real-time data products that fuel highly contextualized AI and analytics. With Tableflow, you can represent Kafka topics as Delta tables in just a few clicks—eliminating brittle batch jobs and custom pipelines. You’ll see a product demo showcasing how Confluent unites your SAP and Databricks environments to unlock ERP-fueled AI, all while reducing the total cost of ownership (TCO) for data streaming by up to 60%.

Sponsored by: Dataiku | Agility Meets Governance: How Morgan Stanley Scales ML in a Regulated World

In regulated industries like finance, agility can't come at the cost of compliance. Morgan Stanley found the answer in combining Dataiku and Databricks to create a governed, collaborative ecosystem for machine learning and predictive analytics. This session explores how the firm accelerated model development and decision-making, reducing time-to-insight by 50% while maintaining full audit readiness. Learn how no-code workflows empowered business users, while scalable infrastructure powered Terabyte-scale ML. Discover best practices for unified data governance, risk automation, and cross-functional collaboration that unlock innovation without compromising security. Ideal for data leaders and ML practitioners in regulated industries looking to harmonize speed, control, and value.

Sponsored by: Pantomath | The Shift from 3,000 to 500 BI Reports: A Data Leader’s Guide to Leaner, Smarter Data Operations

Join Sandy Steiger, Head of Advanced Analytics & Automation (formerly at TQL), as she walks through how her team tackled one of the most common and least talked about problems in data teams: report bloat, data blind spots, and broken trust with the business. You’ll learn how TQL went from 3,000 reports to fewer than 500 while gaining better visibility, faster data issue resolution, and cloud agility through practical use of lineage, automated detection, and surprising outcomes from implementing Pantomath (an automated data operations platform). Sandy will share how her team identified upstream issues (before Microsoft did), avoided major downstream breakages, and built the credibility every data team needs to earn trust from the business. Walk away with a playbook for using automation to drive smarter, faster decisions across your organization.

Sponsored by: Salesforce | From Data to Action: A Unified and Trusted Approach

Empower AI and agents with trusted data and metadata from an end-to-end unified system. Discover how Salesforce Data Cloud, Agentforce, and Databricks work together to fuel automation, AI, and analytics through a unified data strategy—driving real-time intelligence, enabling zero-copy data sharing, and unlocking scalable activation across the enterprise.

Techcombank's Multi-Million Dollar Transformation Leveraging Cloud and Databricks

The migration to the Databricks Data Intelligence Platform has enabled Techcombank to more efficiently unify data from over 50 systems, improve governance, streamline daily operational analytics pipelines and use advanced analytics tools and AI to create more meaningful and personalized experiences for customers. With Databricks, Techcombank has also introduced key solutions that are reshaping its digital banking services: AI-driven lead management system: Techcombank's internally developed AI program called 'Lead Allocation Curated Engine' (LACE) optimizes lead management and provides relationship managers with enriched insights for smarter lead allocation to drive business growth. AI-powered program for digital banking inclusion of small businesses: An AI-powered GeoSense assists frontline workers with analytics-driven insights about which small businesses and merchants to engage in the bank's digital ecosystem. And more examples, which will be presented.

Unlocking Cross-Organizational Collaboration to Protect the Environment With Databricks at DEFRA

Join us to learn how the UK's Department for Environment, Food & Rural Affairs (DEFRA) transformed data use with Databricks’ Unity Catalog, enabling nationwide projects through secure, scalable analytics. DEFRA safeguards the UK's natural environment. Historical fragmentation of data, talent and tools across siloed platforms and organizations, made it difficult to fully exploit the department’s rich data. DEFRA launched its Data Analytics & Science Hub (DASH), powered by the Databricks Data Intelligence Platform, to unify its data ecosystem. DASH enables hundreds of users to access and share datasets securely. A flagship example demonstrates its power, using Databricks to process aerial photography and satellite data to identify peatlands in need of restoration — a complex task made possible through unified data governance, scalable compute and AI. Attendees will hear about DEFRA’s journey, learn valuable lessons about building a platform crossing organizational boundaries.

Next-Gen Data Science: How Posit and Databricks Are Transforming Analytics at Scale

Modern data science teams face the challenge of navigating complex landscapes of languages, tools and infrastructure. Positron, Posit’s next-generation IDE, offers a powerful environment tailored for data science, seamlessly integrating with Databricks to empower teams working in Python and R. Now integrated within Posit Workbench, Positron enables data scientists to efficiently develop, iterate and analyze data with Databricks — all while maintaining their preferred workflows. In this session, we’ll explore how Python and R users can develop, deploy and scale their data science workflows by combining Posit tools with Databricks. We’ll showcase how Positron simplifies development for both Python and R and how Posit Connect enables seamless deployment of applications, reports and APIs powered by Databricks. Join us to see how Posit + Databricks create a frictionless, scalable and collaborative data science experience — so your teams can focus on insights, not infrastructure.

No-Trust, All Value: Monetizing Analytics With Databricks Clean Rooms

In a world where data collaboration is essential but trust is scarce, Databricks Clean Rooms delivers a game-changing model: no data shared, all value gained. Discover how data providers can unlock new revenue streams by launching subscription-based analytics and “Built-on-Databricks” services that run on customer data — without exposing raw data or violating compliance. Clean Rooms integrates Unity Catalog’s governance, Delta Sharing’s secure exchange and serverless compute to enable true multi-party collaboration — without moving data. See how privacy-preserving models like fraud detection, clinical analytics and ad measurement become scalable, productizable and monetizable across industries. Walk away with a proven pattern to productize analytics, preserve compliance and turn trustless collaboration into recurring revenue.

Scaling Blockchain ML With Databricks: From Graph Analytics to Graph Machine Learning

Coinbase leverages Databricks to scale ML on blockchain data, turning vast transaction networks into actionable insights. This session explores how Databricks’ scalable infrastructure, powered by Delta Lake, enables real-time processing for ML applications like NFT floor price predictions. We’ll show how GraphFrames helps us analyze billion-node transaction graphs (e.g., Bitcoin) for clustering and fraud detection, uncovering structural patterns in blockchain data. But traditional graph analytics has limits. We’ll go further with Graph Neural Networks (GNNs) using Kumo AI, which learn from the transaction network itself rather than relying on hand-engineered features. By encoding relationships directly into the model, GNNs adapt to new fraud tactics, capturing subtle relationships that evolve over time. Join us to see how Coinbase is advancing blockchain ML with Databricks and deep learning on graphs.

Sponsored by: Google Cloud | Powering AI & Analytics: Innovations in Google Cloud Storage for Data Lakes

Enterprise customers need a powerful and adaptable data foundation to navigate demands of AI and multi-cloud environments. This session dives into how Google Cloud Storage serves as a unified platform for modern analytics data lakes, together with Databricks. Discover how Google Cloud Storage provides key innovations like performance optimizations for Apache Iceberg, Anywhere Cache as the easiest way to colocate storage and compute, Rapid Storage for ultra low latency object reads and appends, and Storage Intelligence for vital data insights and recommendations. Learn how you can optimize your infrastructure to unlock the full value of your data for AI-driven success.

ClickHouse and Databricks for Real-Time Analytics

ClickHouse is a C++ based, column-oriented database built for real-time analytics. While it has its own internal storage format, the rise of open lakehouse architectures has created a growing need for seamless interoperability. In response, we have developed integrations with your favorite lakehouse ecosystem to enhance compatibility, performance and governance. From integrating with Unity Catalog to embedding the Delta Kernel into ClickHouse, this session will explore the key design considerations behind these integrations, their benefits to the community, the lessons learned and future opportunities for improved compatibility and seamless integration.

Most organizations run complex cloud data architectures that silo applications, users and data. Join this interactive hands-on workshop to learn how Databricks SQL allows you to operate a multi-cloud lakehouse architecture that delivers data warehouse performance at data lake economics — with up to 12x better price/performance than traditional cloud data warehouses. Here’s what we’ll cover: How Databricks SQL fits in the Data Intelligence Platform, enabling you to operate a multicloud lakehouse architecture that delivers data warehouse performance at data lake economics How to manage and monitor compute resources, data access and users across your lakehouse infrastructure How to query directly on your data lake using your tools of choice or the built-in SQL editor and visualizations How to use AI to increase productivity when querying, completing code or building dashboards Ask your questions during this hands-on lab, and the Databricks experts will guide you.

How the Texas Rangers Use a Unified Data Platform to Drive World Class Baseball Analytics

Don't miss this session where we demonstrate how the Texas Rangers baseball team is staying one step ahead of the competition by going back to the basics. After implementing a modern data strategy with Databricks and winnng the 2023 World Series the rest of the league quickly followed suit. Now more than ever, data and AI are a central pillar of every baseball team's strategy driving profound insights into player performance and game dynamics. With a 'fundamentals win games' back to the basics focus, join us as we explain our commmitment to world-class data quality, engineering, and MLOPS by taking full advantage of the Databricks Data Intelligence Platform. From system tables to federated querying, find out how the Rangers use every tool at their disposal to stay one step ahead in the hyper competitive world of baseball.

Intuit's Privacy-Safe Lending Marketplace: Leveraging Databricks Clean Rooms

Intuit leverages Databricks Clean Rooms to create a secure, privacy-safe lending marketplace, enabling small business lending partners to perform analytics and deploy ML/AI workflows on sensitive data assets. This session explores the technical foundations of building isolated clean rooms across multiple partners and cloud providers, differentiating Databricks Clean Rooms from market alternatives. We'll demonstrate our automated approach to clean room lifecycle management using APIs, covering creation, collaborator onboarding, data asset sharing, workflow orchestration and activity auditing. The integration with Unity Catalog for managing clean room inputs and outputs will also be discussed. Attendees will gain insights into harnessing collaborative ML/AI potential, support various languages and workloads, and enable complex computations without compromising sensitive information in Clean Rooms.