talk-data.com

Topic: Analytics

Tags: data_analysis, insights, metrics

178 activities tagged

Activity Trend: peak of 398 activities/quarter (2020-Q1 to 2026-Q1)

Activities

Showing filtered results

Filtering by: Data + AI Summit 2025
Daft and Unity Catalog: A Multimodal/AI-Native Lakehouse

Modern data organizations have moved beyond big data analytics to also incorporate advanced AI/ML data workloads. These workflows often involve multimodal datasets containing documents, images, long-form text, embeddings, URLs and more. Unity Catalog is an ideal solution for organizing and governing this data at scale. When paired with the Daft open source data engine, you can build a truly multimodal, AI-ready data lakehouse. In this session, we’ll explore how Daft integrates with Unity Catalog’s core features (such as volumes and functions) to enable efficient, AI-driven data lakehouses. You will learn how to ingest and process multimodal data (images, text and videos), run AI/ML transformations and feature extractions at scale, and maintain full control and visibility over your data with Unity Catalog’s fine-grained governance.
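As a flavor of the integration described above, here is a minimal sketch based on Daft's documented Unity Catalog reader. The endpoint, token, table name and the "image_url" column are placeholders, not details from the talk.

```python
# Minimal sketch: load a Unity Catalog table with Daft and decode images.
# Endpoint, token, table name and "image_url" column are placeholders.
import daft
from daft.unity_catalog import UnityCatalog

unity = UnityCatalog(
    endpoint="https://<your-workspace>.cloud.databricks.com",  # placeholder
    token="<personal-access-token>",                           # placeholder
)

# Resolve the governed table through Unity Catalog, then read it with Daft.
table = unity.load_table("main.multimodal.product_docs")  # hypothetical table
df = daft.read_deltalake(table)

# A typical multimodal step: fetch image bytes from a URL column and decode
# them into Daft's image type for downstream AI/ML transformations.
df = df.with_column("image", df["image_url"].url.download().image.decode())
df.show()
```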

Iceberg Geo Type: Transforming Geospatial Data Management at Scale

The Apache Iceberg™ community is introducing native geospatial type support, addressing key challenges in managing geospatial data at scale, including fragmented formats and inefficiencies in storing large spatial datasets. This talk will delve into the origins of the Iceberg geo type, its specification design and future goals. We will examine the impact on both the geospatial and Iceberg communities: introducing a standard data warehouse storage layer to the geospatial community, and enabling optimized geospatial analytics for Iceberg users. We will also present a live demonstration of the Iceberg geo data type with Apache Sedona™ and Apache Spark™, showcasing how it simplifies and accelerates geospatial analytics workflows and queries. Finally, we will provide an in-depth look at its current capabilities, outline the roadmap for future developments and offer a perspective on its role in advancing geospatial data management in the industry.
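For a sense of what such a demo might look like, here is a speculative sketch using Apache Sedona's documented SedonaContext setup against a local Iceberg catalog. The GEOMETRY column DDL follows the emerging Iceberg v3 geo spec, so the exact syntax, catalog settings and package versions may differ from what the session shows.

```python
# Speculative sketch: Spark + Apache Sedona over an Iceberg table with a
# native geometry column. Catalog config, table names and the GEOMETRY DDL
# are illustrative; the geo type is still rolling out across engines.
from sedona.spark import SedonaContext

config = (
    SedonaContext.builder()
    .config("spark.sql.catalog.demo", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.demo.type", "hadoop")
    .config("spark.sql.catalog.demo.warehouse", "/tmp/iceberg-warehouse")
    .getOrCreate()
)
sedona = SedonaContext.create(config)  # registers Sedona's ST_ functions

# Assumed DDL for a table with a native geo-typed column (Iceberg v3).
sedona.sql("""
    CREATE TABLE IF NOT EXISTS demo.db.places (
        id BIGINT,
        geom GEOMETRY
    ) USING iceberg
""")

# With a typed column, Sedona's spatial predicates run directly against the
# table, with no ad-hoc WKT/WKB conversion layer in between.
sedona.sql("""
    SELECT id
    FROM demo.db.places
    WHERE ST_Contains(
        ST_GeomFromText('POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))'),
        geom
    )
""").show()
```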

Lakeflow Observability: From UI Monitoring to Deep Analytics

Monitoring data pipelines is key to reliability at scale. In this session, we’ll dive into the observability experience in Lakeflow, Databricks’ unified data engineering solution — from intuitive UI monitoring to advanced event analysis, cost observability and custom dashboards. We’ll walk through the revamped UX for Lakeflow observability, showing how to:
- Monitor runs and task states, dependencies and retry behavior in the UI
- Set up alerts for job and pipeline outcomes and failures
- Use pipeline and job system tables for historical insights
- Explore run events and event logs for root cause analysis
- Analyze metadata to understand and optimize pipeline spend
- Build custom dashboards using system tables to track performance, data quality, freshness, SLAs and failure trends, and drive automated alerting based on real-time signals
This session will help you unlock full visibility into your data workflows.
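As a concrete taste of the system-table analysis in the list above, here is a hedged sketch of finding recent job failures from a Databricks notebook (where `spark` is predefined). It follows the documented `system.lakeflow` schema, but column names can vary by release, so verify against your workspace.

```python
# Hedged sketch: query Lakeflow job system tables for runs that failed in the
# last 7 days. Assumes a Databricks notebook where `spark` is predefined.
failed_runs = spark.sql("""
    SELECT t.job_id,
           j.name AS job_name,
           t.period_start_time,
           t.result_state
    FROM system.lakeflow.job_run_timeline AS t
    JOIN system.lakeflow.jobs AS j
      ON t.job_id = j.job_id
     AND t.workspace_id = j.workspace_id
    WHERE t.result_state = 'FAILED'
      AND t.period_start_time >= current_timestamp() - INTERVAL 7 DAYS
    ORDER BY t.period_start_time DESC
""")
# Note: system.lakeflow.jobs is versioned (one row per job change); in
# practice, deduplicate to the latest row per job_id before joining.
failed_runs.display()  # or feed this into a dashboard / alert query
```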

Latest Innovations in AI/BI Dashboards and Genie

Discover how the latest innovations in Databricks AI/BI Dashboards and Genie are transforming self-service analytics. This session offers a high-level tour of new capabilities that empower business users to ask questions in natural language, generate insights faster and make smarter decisions. Whether you're a long-time Databricks user or just exploring what's possible with AI/BI, you'll walk away with a clear understanding of how these tools are evolving — and how to leverage them for greater business impact.

Low-Emission Oil & Gas: Engineering the Balance Between Clean and Reliable

Join two energy industry leaders as they showcase groundbreaking applications of AI and data solutions in modern oil and gas operations. NOV demonstrates how their Generative AI pipeline revolutionized drilling mud report processing, automating the analysis of 300 reports daily with near-perfect accuracy and real-time analytics capabilities. BP shares how Unity Catalog has transformed their enterprise-wide data strategy, breaking down silos while maintaining robust governance and security. Together, these case studies illustrate how AI and advanced analytics are enabling cleaner, more efficient energy operations while maintaining the reliability demanded by today's market.

Revolutionizing Insurance: How to Drive Growth and Innovation

The insurance industry is rapidly evolving as advances in data and artificial intelligence (AI) drive innovation, enabling more personalized customer experiences, streamlined operations, and improved efficiencies. With powerful data analytics and AI-driven solutions, insurers can automate claims processing, enhance risk management, and make real-time decisions. Leveraging insights from large and complex datasets, organizations are delivering more customer-centric products and services than ever before. Key takeaways:
- Real-world applications of data and AI in claims automation, underwriting, and customer engagement
- How predictive analytics and advanced data modeling help anticipate risks and meet customer needs
- Personalization of policies, optimized pricing, and more efficient workflows for greater ROI
Discover how data and AI are fueling growth, improving protection, and shaping the future of the insurance industry!

Sponsored by: e6data, Inc. | Hybrid Lakehouses with Unity Governance, Local Execution and Egress Control

Data residency laws and legal mandates are driving the need for lakehouses across public and private clouds. This sprawl threatens centralized governance and compliance, while impacting cost, performance, and analytics/AI functionality. This session shows how e6data extends Unity Catalog across hybrid environments for consistent policy enforcement and query execution—regardless of data location—with guarantees around network egress, entitlements, performance, scalability, and cost. Learn how e6data’s “zero-data movement” philosophy powers a cost- and latency-optimized, location-aware architecture. We’ll cover onboarding strategies for hybrid fleets that enforce data movement restrictions and stay close to the data for better performance and lower cost. Discover how a location-aware compute strategy enables hybrid lakehouses with four key value metrics: cross-platform functionality, governed access, low latency, and total cost of ownership.

Accelerating Growth in Capital Markets: Data-Driven Strategies for Success

Growth in capital markets thrives on innovation, agility and real-time insights. This session highlights how leading firms use Databricks’ Data Intelligence Platform to uncover opportunities, optimize trading strategies and deliver personalized client experiences. Learn how advanced analytics and AI help organizations expand their reach, improve decision-making and unlock new revenue streams. Industry leaders share how unified data platforms break down silos, deepen insights and drive success in a fast-changing market. Key takeaways:
- Predictive analytics and machine learning strategies for growth
- Real-world examples of optimized trading and enhanced client engagement
- Tools to innovate while ensuring operational efficiency
Discover how data intelligence empowers capital markets firms to thrive in today’s competitive landscape!

Cutting Costs, Not Performance: Optimizing Databricks at Scale

As Databricks transforms data processing, analytics and machine learning, managing platform costs has become crucial for organizations aiming to maximize value while staying within budget. While Databricks offers unmatched scalability and performance, inefficient usage can lead to unexpected cost overruns. This presentation will explore common challenges organizations face in controlling Databricks costs and provide actionable best practices for optimizing resource allocation, preventing over-provisioning and eliminating underutilization. Drawing from NTT DATA’s experience, I'll share how we reduced Databricks costs by up to 50% through strategies like choosing the right compute resources, leveraging managed tables and using Unity Catalog features, such as system tables, to monitor consumption. Join this session to gain practical insights and tools that will empower your team to optimize Databricks without overspending.
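In that spirit, here is a hedged sketch of the kind of consumption monitoring the session describes, using the documented `system.billing.usage` table from a Databricks notebook. Column names and the `usage_metadata.job_id` accessor follow the published billing schema but may vary by release.

```python
# Hedged sketch: rank job workloads by DBU consumption over the last 30 days
# using the system.billing.usage table. Assumes a Databricks notebook where
# `spark` is predefined.
top_consumers = spark.sql("""
    SELECT usage_metadata.job_id AS job_id,
           sku_name,
           SUM(usage_quantity) AS dbus_30d
    FROM system.billing.usage
    WHERE usage_date >= current_date() - INTERVAL 30 DAYS
      AND usage_metadata.job_id IS NOT NULL
    GROUP BY usage_metadata.job_id, sku_name
    ORDER BY dbus_30d DESC
    LIMIT 20
""")
top_consumers.display()  # candidates for right-sizing or managed tables
```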

Databricks Lakeflow: The Foundation of Data + AI Innovation for Your Industry

Every analytics, BI and AI project relies on high-quality data. This is why data engineering, the practice of building reliable data pipelines that ingest and transform data, is consequential to the success of these projects. In this session, we'll show how you can use Lakeflow to accelerate innovation in multiple parts of the organization. We'll review real-world examples of Databricks customers using Lakeflow in different industries such as automotive, healthcare and retail. We'll touch on how the foundational data engineering capabilities Lakeflow provides help power initiatives that improve customer experiences, make real-time decisions and drive business results.

In this session, we’ll introduce Zerobus Direct Write API, part of Lakeflow Connect, which enables you to push data directly to your lakehouse and simplify ingestion for IoT, clickstreams, telemetry and more. We’ll start with an overview of the ingestion landscape to date. Then, we'll cover how you can “shift left” with Zerobus, embedding data ingestion into your operational systems to make analytics and AI a core component of the business rather than an afterthought. The result is a significantly simpler architecture that scales your operations, using this new paradigm to skip unnecessary hops. We'll also highlight one of our early customers, Joby Aviation, and how they use Zerobus. Finally, we’ll provide a framework to help you understand when to use Zerobus versus other ingestion offerings — and we’ll wrap up with a live Q&A so that you can hit the ground running with your own use cases.

IQVIA’s Serverless Journey: Enabling Data and AI in a Regulated World

Your data and AI use cases are multiplying. At the same time, there is increased focus and scrutiny on meeting sophisticated security and regulatory requirements. IQVIA utilizes serverless across data engineering, data analytics, and ML and AI use cases to empower their customers to make informed decisions, support their R&D processes and improve patient outcomes. By leveraging native controls on the platform, serverless enables them to streamline their use cases while maintaining a strong security posture, top performance and optimized costs. This session will go over IQVIA’s journey to serverless, how they met their security and regulatory requirements, and the latest and upcoming enhancements to the Databricks Platform.

Scaling AI/BI Genie: Best Practices for Curating and Managing Production Spaces

Unlock Genie's full potential with best practices for curating, deploying and monitoring Genie spaces at scale. This session offers a deep dive into the latest enhancements and provides practical guidance on designing high-quality spaces, streamlining deployment workflows and implementing robust monitoring to ensure accuracy and performance in production. Ideal for teams aiming to scale conversational analytics, you’ll leave with actionable strategies to keep your Genie spaces efficient, reliable and aligned with business outcomes.

Tech Industry Session: Building Collaborative Ecosystems With Openness and Portability

Join us to discover how leading tech companies accelerate growth using open ecosystems and built-on solutions to foster collaboration, accelerate innovation and create scalable data products. This session will explore how organizations use Databricks to securely share data, integrate with partners and enable teams to build impactful applications powered by AI and analytics. Topics include:
- Using Delta Sharing for secure, real-time data collaboration across teams and partners
- Embedding analytics and creating marketplaces to extend product capabilities
- Building with open standards and governance frameworks to ensure compliance without sacrificing agility
Hear real-world examples of how open ecosystems empower organizations to widen the aperture on collaboration, driving better business outcomes. Walk away with insights into how open data sharing and built-on solutions can help your teams innovate faster at scale.
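As a flavor of the first topic above, here is a minimal sketch of consuming a shared table with the open source `delta-sharing` Python connector. The profile file and the share/schema/table coordinates are placeholders a provider would issue, not details from the talk.

```python
# Minimal sketch: read a Delta Sharing table without replicating the data.
# "config.share" and the table coordinates below are placeholders.
import delta_sharing

profile = "config.share"  # credential file issued by the data provider

# Discover what the provider has shared with you.
client = delta_sharing.SharingClient(profile)
print(client.list_all_tables())

# Load one shared table into pandas: <profile>#<share>.<schema>.<table>
url = f"{profile}#retail_share.sales.transactions"  # hypothetical coordinates
df = delta_sharing.load_as_pandas(url)
print(df.head())
```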

What's New and What's Next: Building Impactful AI/BI Dashboards

Ready to take your AI/BI dashboards to the next level? This session dives into the latest capabilities in Databricks AI/BI Dashboards and how to maximize impact across your organization. Learn how data authors can tailor visualizations for different audiences, optimize performance and seamlessly integrate with Genie for a unified analytics experience. We’ll also share practical tips on how business users and data teams can better collaborate — ensuring insights are accessible, actionable and aligned to business goals.

AI-Assisted BI: Everything You Need to Know

Explore how AI is transforming business intelligence and data analytics across the Databricks platform. This session offers a comprehensive overview of AI-assisted capabilities, from generating dashboards and visualizations to integrating Genie on dashboards for conversational analytics. Whether you’re a data engineer, analyst or BI developer, this session will equip you to leverage AI with BI for better, smarter decisions.

Sponsored by: Anomalo | Reconciling IoT, Policy, and Insurer Data to Deliver Better Customer Discounts

As insurers increasingly leverage IoT data to personalize policy pricing, reconciling disparate datasets across devices, policies, and insurers becomes mission-critical. In this session, learn how Nationwide transitioned from prototype workflows in Dataiku to a hardened data stack on Databricks, enabling scalable data governance and high-impact analytics. Discover how the team orchestrates data reconciliation across Postgres, Oracle, and Databricks to align customer driving behavior with insurer and policy data—ensuring more accurate, fair discounts for policyholders. With Anomalo’s automated monitoring layered on top, Nationwide ensures data quality at scale while empowering business units to define custom logic for proactive stewardship. We’ll also look ahead to how these foundations are preparing the enterprise for unstructured data and GenAI initiatives.

Summit Live: Data Sharing and Collaboration

Hear the latest on data collaboration, which is paramount to unlocking business success. Delta Sharing is an open source approach to sharing and governing data, AI models, dashboards and notebooks across clouds and platforms, without the costly need for replication. Databricks Clean Rooms provide secure hosted environments for data collaboration across companies, also without the costly duplication of data. And the Databricks Marketplace is the open marketplace for all your data, analytics and AI needs.

Bridging BI Tools: Deep Dive Into AI/BI Dashboards for Power BI Practitioners

In the rapidly evolving field of data analytics, Databricks AI/BI dashboards and Power BI stand out as two formidable approaches, each offering unique strengths and catering to specific use cases. Power BI has earned its reputation for delivering user-friendly, highly customizable visualizations and reports for data analysis. AI/BI dashboards, on the other hand, have gained strong traction due to their seamless integration with the Databricks platform, making them an attractive option for data practitioners. This session will provide a comparison of these two tools, highlighting their respective features, strengths and potential limitations. Understanding the nuances between these tools is crucial for organizations aiming to make informed decisions about their data analytics strategy. This session will equip participants with the knowledge needed to select the most appropriate tool, or combination of tools, to meet their data analysis requirements and drive data-informed decision-making processes.

Databricks in Action: Azure’s Blueprint for Secure and Cost-Effective Operations

Erste Group's transition to Azure Databricks marked a significant upgrade from a legacy system to a secure, scalable and cost-effective cloud platform. The initial architecture, characterized by a complex hub-and-spoke design and stringent compliance regulations, was replaced with a more efficient solution. The phased migration addressed high network costs and operational inefficiencies, resulting in a 60% reduction in networking costs and a 30% reduction in compute costs for the central team. This transformation, completed over a year, now supports real-time analytics, advanced machine learning and GenAI while ensuring compliance with European regulations. The new platform features Unity Catalog, separate data catalogs and dedicated workspaces, demonstrating a successful shift to a cloud-based machine learning environment with significant improvements in cost, performance and security.