This presentation provides an overview of how NVIDIA RAPIDS accelerates data science and data engineering workflows end-to-end. Key topics include leveraging RAPIDS for machine learning, large-scale graph analytics, real-time inference, hyperparameter optimization, and ETL processes. Case studies demonstrate significant performance improvements and cost savings across various industries using RAPIDS for Apache Spark, XGBoost, cuML, and other GPU-accelerated tools. The talk emphasizes the impact of accelerated computing on modern enterprise applications, including LLMs, recommenders, and complex data processing pipelines.
talk-data.com
Topic
ETL/ELT
In data integration and data management, the focus is often on technology—databases, ETL processes, automation, reporting tools. But in the process, the true objective is easily overlooked: generating business value.
Why does this happen? What organizational and technical barriers contribute to the disconnect? And most importantly: what strategies can we adopt to better align data initiatives with the goals of business stakeholders?
This session explores the root causes and presents practical approaches to building a data-driven culture—with a clear focus on business impact.
This session will provide a Maia demo with roadmap teasers. The demo will showcase Maia's core capabilities: authoring pipelines in business language, multiplying productivity by accelerating tasks, and enabling self-service. It demonstrates how Maia takes natural language prompts and translates them into YAML-based, human-readable Data Pipeline Language (DPL), generating graphical pipelines. Expect to see Maia interacting with Snowflake metadata to sample data and suggest transformations, as well as its ability to troubleshoot and debug pipelines in real time. The session will also cover how Maia can create custom connectors from REST API documentation in seconds, a task that traditionally takes days. Roadmap teasers will likely include the upcoming Semantic Layer, a Pipeline Reviewing Agent, and enhanced file type support for various legacy ETL tools and code conversions.
The growth of connected data has made graph databases essential, yet organisations often face a dilemma: choosing between an operational graph for real-time queries or an analytical engine for large-scale processing. This division leads to data silos and complex ETL pipelines, hindering the seamless integration of real-time insights with deep analytics and the ability to ground AI models in factual, enterprise-specific knowledge. Google Cloud aims to solve this with a unified "Graph Fabric," introducing Spanner Graph, which extends Spanner with native support for the ISO standard Graph Query Language (GQL). This session will cover how Google Cloud has developed a Unified Graph Solution with BigQuery and Spanner graphs to serve a full spectrum of graph needs from operational to analytical.
In this session, Paul Wilkinson, Principal Solutions Architect at Redpanda, will demonstrate Redpanda's native Iceberg capability: a game-changing addition that bridges the gap between real-time streaming and analytical workloads, eliminating the complexity of traditional data lake architectures while maintaining the performance and simplicity that Redpanda is known for.
In a follow-along demo, Paul will explore how this new capability enables organizations to transition streaming data into analytical formats without complex ETL pipelines or additional infrastructure overhead, so you can build your own streaming lakehouse and show it to your team!