talk-data.com

Topic: Data Analytics

Tags: data_analysis, statistics, insights

13 tagged

Activity Trend: peak of 38 activities per quarter (2020-Q1 to 2026-Q1)

Activities

Filtered by: Data + AI Summit 2025
Daft and Unity Catalog: A Multimodal/AI-Native Lakehouse

Modern data organizations have moved beyond big data analytics to also incorporate advanced AI/ML data workloads. These workflows often involve multimodal datasets containing documents, images, long-form text, embeddings, URLs and more. Unity Catalog is an ideal solution for organizing and governing this data at scale. When paired with the Daft open source data engine, you can build a truly multimodal, AI-ready data lakehouse. In this session, we’ll explore how Daft integrates with Unity Catalog’s core features (such as volumes and functions) to enable efficient, AI-driven data lakehouses. You will learn how to ingest and process multimodal data (images, text and videos), run AI/ML transformations and feature extractions at scale, and maintain full control and visibility over your data with Unity Catalog’s fine-grained governance.
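
As a rough illustration of the pattern this session describes, here is a minimal sketch of reading a Unity Catalog table with Daft and running a multimodal transform. It follows Daft's documented Unity Catalog integration, but the endpoint, token, table name and `image_url` column are placeholders.

```python
import daft
from daft.unity_catalog import UnityCatalog

# Connect to Unity Catalog. Endpoint and token are placeholders.
unity = UnityCatalog(
    endpoint="https://<workspace>.cloud.databricks.com",
    token="<databricks-token>",
)

# Load a governed Delta table registered in Unity Catalog, then read it with Daft.
table = unity.load_table("main.multimodal.product_images")  # hypothetical table
df = daft.read_deltalake(table)

# A multimodal step of the kind the session describes: download and decode
# images directly from a URL column.
df = df.with_column("image", df["image_url"].url.download().image.decode())
df.show()
```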

Revolutionizing Insurance: How to Drive Growth and Innovation

The insurance industry is rapidly evolving as advances in data and artificial intelligence (AI) drive innovation, enabling more personalized customer experiences, streamlined operations and improved efficiencies. With powerful data analytics and AI-driven solutions, insurers can automate claims processing, enhance risk management and make real-time decisions. Leveraging insights from large and complex datasets, organizations are delivering more customer-centric products and services than ever before. Key takeaways:

- Real-world applications of data and AI in claims automation, underwriting and customer engagement
- How predictive analytics and advanced data modeling help anticipate risks and meet customer needs
- Personalization of policies, optimized pricing and more efficient workflows for greater ROI

Discover how data and AI are fueling growth, improving protection, and shaping the future of the insurance industry!

IQVIA’s Serverless Journey: Enabling Data and AI in a Regulated World

Your data and AI use cases are multiplying. At the same time, there is increased focus and scrutiny on meeting sophisticated security and regulatory requirements. IQVIA uses serverless compute across data engineering, data analytics, and ML and AI to empower its customers to make informed decisions, support their R&D processes and improve patient outcomes. By leveraging the platform's native controls, serverless lets IQVIA streamline its use cases while maintaining a strong security posture, top performance and optimized costs. This session will cover IQVIA's journey to serverless, how they met their security and regulatory requirements, and the latest and upcoming enhancements to the Databricks Platform.

AI-Assisted BI: Everything You Need to Know

Explore how AI is transforming business intelligence and data analytics across the Databricks platform. This session offers a comprehensive overview of AI-assisted capabilities, from generating dashboards and visualizations to integrating Genie on dashboards for conversational analytics. Whether you’re a data engineer, analyst or BI developer, this session will equip you to leverage AI with BI for better, smarter decisions.

Bridging BI Tools: Deep Dive Into AI/BI Dashboards for Power BI Practitioners

In the rapidly evolving field of data analytics, AI/BI dashboards and Power BI stand out as two formidable approaches, each offering unique strengths and catering to specific use cases. Power BI has earned its reputation for delivering user-friendly, highly customizable visualizations and reports for data analysis. AI/BI dashboards, on the other hand, have gained traction due to their seamless integration with the Databricks platform, making them an attractive option for data practitioners. This session will provide a comparison of these two tools, highlighting their respective features, strengths and potential limitations. Understanding the nuances between these tools is crucial for organizations aiming to make informed decisions about their data analytics strategy. Participants will leave equipped to select the most appropriate tool, or combination of tools, to meet their data analysis requirements and drive data-informed decision-making.

Unlocking Cross-Organizational Collaboration to Protect the Environment With Databricks at DEFRA

Join us to learn how the UK's Department for Environment, Food & Rural Affairs (DEFRA) transformed its use of data with Databricks' Unity Catalog, enabling nationwide projects through secure, scalable analytics. DEFRA safeguards the UK's natural environment. Historically, fragmentation of data, talent and tools across siloed platforms and organizations made it difficult to fully exploit the department's rich data. DEFRA launched its Data Analytics & Science Hub (DASH), powered by the Databricks Data Intelligence Platform, to unify its data ecosystem. DASH enables hundreds of users to access and share datasets securely. A flagship example demonstrates its power: using Databricks to process aerial photography and satellite data to identify peatlands in need of restoration, a complex task made possible through unified data governance, scalable compute and AI. Attendees will hear about DEFRA's journey and learn valuable lessons about building a platform that crosses organizational boundaries.

This hands-on lab guides participants through the complete customer data analytics journey on Databricks, leveraging leading partner solutions: Fivetran, dbt Cloud and Sigma. Attendees will learn how to:

- Seamlessly connect to Fivetran, dbt Cloud and Sigma using Databricks Partner Connect
- Ingest data using Fivetran, transform and model data with dbt Cloud, and create interactive dashboards in Sigma, all on top of the Databricks Data Intelligence Platform
- Empower teams to make faster, data-driven decisions by streamlining the entire analytics workflow using an integrated, scalable and user-friendly platform

Revolutionizing PepsiCo BI Capabilities: From Traditional BI to Next-Gen Analytics Powerhouse

This session will provide an in-depth overview of how PepsiCo, a global leader in food and beverage, transformed its outdated data platform into a modern, unified and centralized data and AI-enabled platform using the Databricks SQL serverless environment. Through three distinct implementations delivered at PepsiCo in 2024, we will demonstrate how the PepsiCo Data Analytics & AI Group unlocked pivotal capabilities that facilitated the delivery of diverse data-driven insights to the business, reduced operational expenses and enhanced overall performance through the newly implemented platform.

Bridging Big Data and AI: Empowering PySpark With Lance Format for Multi-Modal AI Data Pipelines

PySpark has long been a cornerstone of big data processing, excelling in data preparation, analytics and machine learning tasks within traditional data lakes. However, the rise of multimodal AI and vector search introduces challenges beyond its capabilities. Spark's new Python data source API enables integration with emerging AI data lakes built on the multimodal Lance format. Lance delivers unparalleled value with its zero-copy schema evolution capability and robust support for large-record data such as images, tensors and embeddings, simplifying multimodal data storage. Its advanced indexing for semantic and full-text search, combined with rapid random access, brings AI data analytics to the performance level of SQL. By unifying PySpark's robust processing capabilities with Lance's AI-optimized storage, data engineers and scientists can efficiently manage and analyze the diverse data types required for cutting-edge AI applications within a familiar big data framework.
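
To make the integration path concrete, here is a minimal sketch of a Spark Python data source wrapping a Lance dataset. It assumes Spark 4.0+ (which ships the `pyspark.sql.datasource` API) and the `pylance` package; the `lance_sketch` short name, dataset path and two-column schema are illustrative rather than the session's actual connector, and a production reader would also implement partition planning for parallel reads.

```python
from pyspark.sql import SparkSession
from pyspark.sql.datasource import DataSource, DataSourceReader

class LanceDataSource(DataSource):
    @classmethod
    def name(cls):
        return "lance_sketch"

    def schema(self):
        # Hard-coded for this hypothetical dataset; a real connector would
        # derive the schema from the Lance dataset's Arrow schema.
        return "id bigint, caption string"

    def reader(self, schema):
        return LanceReader(self.options["path"])

class LanceReader(DataSourceReader):
    def __init__(self, path: str):
        self.path = path

    def read(self, partition):
        import lance  # imported on the workers
        # Stream Arrow record batches from the Lance dataset and yield rows.
        for batch in lance.dataset(self.path).to_batches(columns=["id", "caption"]):
            for row in batch.to_pylist():
                yield (row["id"], row["caption"])

spark = SparkSession.builder.getOrCreate()
spark.dataSource.register(LanceDataSource)
df = spark.read.format("lance_sketch").load("/data/captions.lance")  # hypothetical path
df.show()
```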

Dusting off the Cobwebs — Moving off a 26-year-old Heritage Platform to Databricks [Teradata]

Join us to hear how National Australia Bank (NAB) completed a significant milestone in its data strategy by decommissioning its 26-year-old Teradata environment and migrating to a new strategic data platform called 'Ada'. This transition marks a pivotal shift from legacy systems to a modern, cloud-based data and AI platform powered by Databricks. The migration, which spanned two years, involved ingesting 16 data sources, transferring 456 use cases, and collaborating with hundreds of users across 12 business units. This strategic move positions NAB to leverage the full potential of cloud-native data analytics, enabling more agile and data-driven decision-making across the organization. The successful migration to Ada represents a significant step forward in NAB's ongoing efforts to modernize its data infrastructure and capitalize on emerging technologies in the rapidly evolving financial services landscape.

Federated Data Analytics Platform

Are you struggling to keep up with rapid business changes that demand constant updates to your data pipelines? Is your data engineering team growing rapidly just to manage this complexity? Databricks was not immune to this challenge either. Managing our BI with contributions from hundreds of Product Engineering Teams across the company while maintaining central oversight and quality posed significant hurdles. Join us to learn how we developed a config-driven data pipeline framework using Metric Store and UC Metrics that helped us reduce engineering effort — achieving the work of 100 classical data engineers with just two platform engineers.
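
The Metric Store and UC Metrics internals aren't public, but the config-driven idea can be sketched generically: product teams contribute declarative metric definitions, and a small framework expands each one into a pipeline. Everything below (the config keys, the `event_time` column, `generate_sql`) is invented for illustration, not the actual framework.

```python
# Hypothetical declarative metric definitions contributed by product teams.
METRICS = [
    {
        "name": "weekly_active_users",
        "source": "prod.analytics.user_events",   # hypothetical table
        "measure": "count(DISTINCT user_id)",
        "time_grain": "week",
        "dimensions": ["region", "product_tier"],
    },
]

def generate_sql(metric: dict) -> str:
    """Expand one metric config into a Databricks SQL query."""
    dims = ", ".join(metric["dimensions"])
    return (
        f"SELECT date_trunc('{metric['time_grain']}', event_time) AS period, "
        f"{dims}, {metric['measure']} AS {metric['name']}\n"
        f"FROM {metric['source']}\n"
        f"GROUP BY ALL"
    )

for m in METRICS:
    print(generate_sql(m))
```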

Tracing the Path of a Row Through a GPU-Enabled Query Engine on the Grace-Blackwell Architecture

Grace-Blackwell is NVIDIA’s most recent GPU system architecture. It addresses a key concern of query engines: fast data access. In this session, we will take a close look at how GPUs can accelerate data analytics by tracing how a row flows through a GPU-enabled query engine. Query engines read large volumes of data from CPU memory or from disk. On Blackwell GPUs, a query engine can rely on hardware-accelerated decompression of compact formats. The Grace-Blackwell system takes data access performance even further by reading data at up to 450 GB/s across its CPU-to-GPU interconnect. We demonstrate full end-to-end SQL query acceleration using GPUs in a prototype query engine with industry-standard benchmark queries, and compare the results to existing CPU solutions. Using Apache Spark™ and the RAPIDS Accelerator for Apache Spark, we demonstrate the impact GPU acceleration has on the performance of SQL queries at the 100TB scale using NDS, a suite that simulates real-world business scenarios.
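
For readers who want to try GPU-accelerated SQL themselves, the RAPIDS Accelerator is enabled through Spark configuration. A minimal sketch, assuming the RAPIDS Accelerator jar is already on the classpath and each executor has a GPU; the resource amounts shown are illustrative tuning values, not recommendations from the session.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("gpu-query-sketch")
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")   # RAPIDS Accelerator plugin
    .config("spark.rapids.sql.enabled", "true")
    .config("spark.executor.resource.gpu.amount", "1")       # one GPU per executor
    .config("spark.task.resource.gpu.amount", "0.25")        # illustrative: 4 tasks share a GPU
    .getOrCreate()
)

# explain() shows Gpu* operators (e.g., GpuHashAggregate) for the parts of the
# plan that the accelerator has placed on the GPU.
spark.range(1_000_000).selectExpr("id % 10 AS k").groupBy("k").count().explain()
```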

Transforming Government With Data and AI: Singapore GovTech's Journey With Databricks

GovTech is an agency in the Singapore Government focused on tech for good. The GovTech Chief Data Office (CDO) has built the GovTech Data Platform with Databricks at its core. As the government's tech agency, GovTech safeguards national-level government and citizen data, and a comprehensive data strategy is essential to uplifting data maturity. GovTech has adopted a service model in which data services are offered to stakeholders based on their data maturity; that maturity is then uplifted through partnership, readying them for more advanced data analytics. CDO offers a plethora of data assets in a “data restaurant”, ranging from raw data to data products, all delivered via Databricks and enabled through fine-grained access control, underpinned by data management best practices such as data quality, security and governance. Within its first year on Databricks, CDO saved 8,000 man-hours, democratized data across 50% of the agency and achieved six-figure savings through BI consolidation.