talk-data.com

Topic

Cloud Storage

object_storage file_storage cloud

67 tagged activities

Activity Trend

Peak of 5 activities per quarter, 2020-Q1 through 2026-Q1

Activities

67 activities · Newest first

(A) Deploy AVS and manage the VMware vSphere environment as an Azure resource (B) Connect the AVS private cloud securely to the Internet (C) Migrate VMs from on-premises to AVS using VMware HCX technology (D) Expand AVS private cloud storage with Azure NetApp Files or Azure Elastic SAN, scaling storage independently from compute (E) Manage AVS workload VMs through Azure interfaces by Arc-enabling the AVS private cloud and its VMs (F) Modernize workloads with Azure services such as Azure SQL Managed Instance and AI capabilities.

Note: this session will use demo environments instead of live environments due to complexity and time.

Please RSVP and arrive at least 5 minutes before the start time, at which point remaining spaces are open to standby attendees.

EQT, a global investment organization specializing in private capital, infrastructure, and real assets, has transformed its data operations by fully adopting the modern data stack. As a cloud-native company with hundreds of internal and external data sources — from YouTube to Google Cloud Storage — EQT needed a scalable, centralized solution to ingest and transform data for complex financial use cases. Their journey took them from fragmented, Excel-based workflows to a robust, integrated data pipeline powered by Fivetran.

In this session, you’ll learn how:

• EQT streamlined external data ingestion and broke down data silos
• A unified data pipeline supports scalable financial analytics and decision-making
• Fivetran’s ease of use, connector maintenance, and cost-effectiveness made it the clear choice

🛰️➡️🧑‍💻: Streamlining Satellite Data for Analysis-Ready Outputs

I will share how our team built an end-to-end system to transform raw satellite imagery into analysis-ready datasets for use cases like vegetation monitoring, deforestation detection, and identifying third-party activity. We streamlined the entire pipeline from automated acquisition and cloud storage to preprocessing that ensures spatial, spectral, and temporal consistency. By leveraging Prefect for orchestration, Anyscale Ray for scalable processing, and the open source STAC standard for metadata indexing, we reduced processing times from days to near real-time. We addressed challenges like inconsistent metadata and diverse sensor types, building a flexible system capable of supporting large-scale geospatial analytics and AI workloads.
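To make the orchestration pattern described above concrete, here is a minimal sketch of a Prefect flow that fans per-scene preprocessing out to a Ray cluster. This is an illustration only: the bucket name, scene URIs, and the `preprocess_scene` and `list_new_scenes` functions are hypothetical placeholders, not the team's actual pipeline code.

```python
# Hypothetical sketch: Prefect orchestrates the pipeline, Ray parallelizes per-scene work.
import ray
from prefect import flow, task


@ray.remote
def preprocess_scene(scene_uri: str) -> str:
    # Placeholder: reproject, mask clouds, and resample so outputs share a
    # consistent spatial/spectral/temporal grid.
    return scene_uri.replace("raw/", "analysis-ready/")


@task
def list_new_scenes(bucket: str) -> list:
    # Placeholder: in practice this would query a STAC catalog or bucket listing.
    return [f"gs://{bucket}/raw/scene_{i}.tif" for i in range(4)]


@flow
def satellite_pipeline(bucket: str = "example-imagery-bucket"):
    scenes = list_new_scenes(bucket)
    ray.init(ignore_reinit_error=True)  # or ray.init(address="auto") on a cluster
    futures = [preprocess_scene.remote(s) for s in scenes]
    ready = ray.get(futures)            # block until every scene is processed
    print(f"Produced {len(ready)} analysis-ready assets")


if __name__ == "__main__":
    satellite_pipeline()
```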

Cloud-optimized (CO) data formats are designed to efficiently store and access data directly from cloud storage without needing to download the entire dataset. These formats enable faster data retrieval, scalability, and cost-effectiveness by allowing users to fetch only the necessary subsets of data. They also allow for efficient parallel data processing using on-the-fly partitioning, which can considerably accelerate data management operations. This makes cloud-optimized data a natural fit for data-parallel jobs on serverless platforms: Function-as-a-Service (FaaS) provides a data-driven, scalable, and cost-efficient experience with practically no management burden, and each serverless function reads and processes a small portion of the cloud-optimized dataset in parallel directly from object storage, yielding significant speedups.

In this talk, you will learn how to process cloud-optimized data formats in Python using the Lithops toolkit. Lithops is a serverless data processing toolkit specifically designed to process data from cloud object storage using serverless functions. We will also demonstrate the Dataplug library, which enables cloud-optimized data management in scientific settings such as genomics, metabolomics, and geospatial data. We will show different data processing pipelines in the cloud that demonstrate the benefits of cloud-optimized data management.
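As a rough illustration of the data-parallel pattern described above (not code from the talk), the sketch below uses Lithops' FunctionExecutor to have several serverless functions each read one byte range of an object in parallel. The bucket, object key, and fixed-size chunking are hypothetical assumptions.

```python
# Minimal sketch: each serverless worker fetches only its byte range of the object.
import lithops

BUCKET = "example-bucket"         # hypothetical bucket
KEY = "dataset/points.csv"        # hypothetical object in cloud storage
CHUNK = 1_000_000                 # 1 MB per worker, for illustration


def process_chunk(chunk_id):
    # Read only this worker's slice directly from object storage; the full
    # dataset is never downloaded by any single function.
    storage = lithops.Storage()
    start = chunk_id * CHUNK
    end = start + CHUNK - 1
    data = storage.get_object(
        BUCKET, KEY, extra_get_args={"Range": f"bytes={start}-{end}"}
    )
    return len(data)              # placeholder "work": count the bytes in the chunk


fexec = lithops.FunctionExecutor()      # uses the backend set in your Lithops config
fexec.map(process_chunk, range(8))      # 8 functions read 8 ranges in parallel
print(sum(fexec.get_result()))
```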

Sponsored by: Google Cloud | Powering AI & Analytics: Innovations in Google Cloud Storage for Data Lakes

Enterprise customers need a powerful and adaptable data foundation to navigate demands of AI and multi-cloud environments. This session dives into how Google Cloud Storage serves as a unified platform for modern analytics data lakes, together with Databricks. Discover how Google Cloud Storage provides key innovations like performance optimizations for Apache Iceberg, Anywhere Cache as the easiest way to colocate storage and compute, Rapid Storage for ultra low latency object reads and appends, and Storage Intelligence for vital data insights and recommendations. Learn how you can optimize your infrastructure to unlock the full value of your data for AI-driven success.

Mastering Change Data Capture With Lakeflow Declarative Pipelines

Transactional systems are a common source of data for analytics, and Change Data Capture (CDC) offers an efficient way to extract only what’s changed. However, ingesting CDC data into an analytics system comes with challenges, such as handling out-of-order events or maintaining global order across multiple streams. These issues often require complex, stateful stream processing logic. This session will explore how Lakeflow Declarative Pipelines simplifies CDC ingestion using the Apply Changes function. With Apply Changes, global ordering across multiple change feeds is handled automatically — there is no need to manually manage state or understand advanced streaming concepts like watermarks. It supports both snapshot-based inputs from cloud storage and continuous change feeds from systems like message buses, reducing complexity for common streaming use cases.
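For orientation, here is a minimal Apply Changes sketch assuming a CDC feed has landed as JSON files in cloud storage. The path, table names, and columns (customer_id, event_ts, operation) are hypothetical, and the code would run inside a pipeline where `spark` is provided.

```python
# Illustrative Apply Changes flow; names and paths are placeholders.
import dlt
from pyspark.sql.functions import col, expr


@dlt.view
def customers_cdc():
    # Incrementally pick up raw CDC files from cloud storage
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("s3://example-bucket/cdc/customers/"))


dlt.create_streaming_table("customers")

dlt.apply_changes(
    target="customers",
    source="customers_cdc",
    keys=["customer_id"],                         # primary key of the source table
    sequence_by=col("event_ts"),                  # ordering column; resolves out-of-order events
    apply_as_deletes=expr("operation = 'DELETE'"),
    stored_as_scd_type=1,                         # keep only the latest row per key
)
```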

Real-Time Analytics Pipeline for IoT Device Monitoring and Reporting

This session will show how we implemented a solution to support high-frequency data ingestion from smart meters. We implemented a robust API endpoint that interfaces directly with IoT devices. This API processes messages in real time from millions of distributed IoT devices and meters across the network. The architecture leverages cloud storage as a landing zone for the raw data, followed by a streaming pipeline built on Lakeflow Declarative Pipelines. This pipeline implements a multi-layer medallion architecture to progressively clean, transform and enrich the data. The pipeline operates continuously to maintain near real-time data freshness in our gold layer tables. These datasets connect directly to Databricks Dashboards, providing stakeholders with immediate insights into their operational metrics. This solution demonstrates how modern data architecture can handle high-volume IoT data streams while maintaining data quality and providing accessible real-time analytics for business users.
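A rough sketch of the bronze/silver layering described above is shown below. It is not the presenters' code: the landing-zone path, column names, and quality rule are assumptions made for illustration, and `spark` is assumed to be provided by the pipeline runtime.

```python
# Hypothetical medallion layering for smart-meter readings.
import dlt
from pyspark.sql.functions import col, to_timestamp

LANDING = "gs://example-meter-landing/readings/"   # hypothetical landing zone


@dlt.table(comment="Raw smart-meter readings as received in the landing zone")
def bronze_readings():
    return (spark.readStream.format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load(LANDING))


@dlt.table(comment="Cleaned readings with typed columns and basic quality checks")
@dlt.expect_or_drop("valid_reading", "kwh >= 0")
def silver_readings():
    return (dlt.read_stream("bronze_readings")
            .withColumn("reading_ts", to_timestamp(col("reading_ts")))
            .select("meter_id", "reading_ts", "kwh"))
```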

Sponsored by: Fivetran | Raw Data to Real-Time Insights: How Dropbox Revolutionized Data Ingestion

Dropbox, a leading cloud storage platform, is on a mission to accelerate data insights to better understand customers’ needs and elevate the overall customer experience. By leveraging Fivetran’s data movement platform, Dropbox gained real-time visibility into customer sentiment, marketing ROI, and ad performance, empowering teams to optimize spend, improve operational efficiency, and deliver greater business outcomes.

Join this session to learn how Dropbox:
- Cut data pipeline time from 8 weeks to 30 minutes by automating ingestion and streamlining reporting workflows.
- Enabled real-time, reliable data movement across tools like Zendesk Chat, Google Ads, MySQL, and more — at global operations scale.
- Unified fragmented data sources into the Databricks Data Intelligence Platform to reduce redundancy, improve accessibility, and support scalable analytics.

Lakeflow Connect: Smarter, Simpler File Ingestion With the Next Generation of Auto Loader

Auto Loader is the definitive tool for ingesting data from cloud storage into your lakehouse. In this session, we’ll unveil new features and best practices that simplify every aspect of cloud storage ingestion. We’ll demo out-of-the-box observability for pipeline health and data quality, walk through improvements for schema management, introduce a series of new data formats and unveil recent strides in Auto Loader performance. Along the way, we’ll provide examples and best practices for optimizing cost and performance. Finally, we’ll introduce a preview of what’s coming next — including a REST API for pushing files directly to Delta, a UI for creating cloud storage pipelines and more. Join us to help shape the future of file ingestion on Databricks.
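The core Auto Loader pattern the session builds on looks roughly like the sketch below. Paths, the catalog/schema, and the chosen options are illustrative assumptions, not the new features previewed in the talk.

```python
# Minimal Auto Loader ingestion from cloud storage into a bronze table.
checkpoint = "s3://example-bucket/_checkpoints/events"   # hypothetical checkpoint path

df = (spark.readStream
      .format("cloudFiles")                              # Auto Loader source
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", checkpoint)   # tracks inferred schema and evolution
      .load("s3://example-bucket/raw/events/"))

(df.writeStream
   .option("checkpointLocation", checkpoint)
   .trigger(availableNow=True)                           # drain the current backlog, then stop
   .toTable("main.analytics.events_bronze"))
```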

This session explores the evolution of data management on Kubernetes for AI and machine learning (ML) workloads and modern databases, including Google’s leadership in this space. We’ll discuss key challenges and solutions, including persistent storage with solutions like checkpointing and Cloud Storage FUSE, and accelerating data access with caching. Customers Qdrant and Codeway will share how they’ve successfully leveraged these technologies to improve their AI, ML, and database performance on Google Kubernetes Engine (GKE).

Discover how to transition from legacy, siloed systems to a unified, scalable, and insights-driven data platform on GCP. This session will cover best practices for data migration, overcoming common challenges, and integrating SaaS and third-party solutions using key Google Cloud services like BigQuery, Data Fusion, Cloud Storage, Application Integration, Cloud Run, Cloud Build, and Artifact Registry.


Introducing how easy it can be to build a fully functional Flutter app with a Firebase backend: fast and simple, with a front-end UI, backend tasks (using Cloud Functions), a database (Firestore), storage (Cloud Storage), and more. Additional tips on Vertex AI and Gemini in both Flutter and Cloud Functions will be included if time allows.

Modern analytics and AI workloads demand a unified storage layer for structured and unstructured data. Learn how Cloud Storage simplifies building data lakes based on Apache Iceberg. We’ll discuss storage best practices and new capabilities that enable high performance and cost efficiency. We’ll also guide you through real-world examples, including Iceberg data lakes with BigQuery or third-party solutions, data preparation for AI pipelines with Dataproc and Apache Spark, and how customers have built unified analytics and AI solutions on Cloud Storage.
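As a hedged sketch of the Iceberg-on-Cloud-Storage setup referenced above: the snippet below points a Spark Iceberg catalog at a GCS warehouse path. The catalog name, bucket, and table are hypothetical, a simple Hadoop-style catalog is assumed rather than any specific metastore, and the Iceberg Spark runtime and GCS connector are assumed to be on the classpath.

```python
# Illustrative Iceberg table on Cloud Storage via Spark (names/paths are placeholders).
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("iceberg-on-gcs")
         .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
         .config("spark.sql.catalog.lake.type", "hadoop")
         .config("spark.sql.catalog.lake.warehouse", "gs://example-lake-bucket/warehouse")
         .getOrCreate())

spark.sql("""
  CREATE TABLE IF NOT EXISTS lake.analytics.trips (
    trip_id STRING, pickup_ts TIMESTAMP, distance_km DOUBLE
  ) USING iceberg
""")
spark.sql("INSERT INTO lake.analytics.trips VALUES ('t1', current_timestamp(), 3.2)")
spark.table("lake.analytics.trips").show()
```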

Discover the latest breakthroughs in Cloud Storage. This executive session provides a high-level overview of the latest object, block, file storage, and backup and recovery solutions. Gain insights into our cutting-edge storage technologies and learn how they can optimize your infrastructure, reduce costs, and enhance your data management strategy. Don’t miss this opportunity to learn directly from Google executives about the future of storage. This session is a must for IT decision-makers seeking a competitive edge.

This talk offers demonstrations and live discussions on how to rapidly deploy production-ready GKE or Slurm clusters using Cluster Toolkit and Terraform. Leverage the latest GPUs to accelerate machine learning workloads and optimize resource utilization with GKE's Kueue, autoscaling Slurm, and Dynamic Workload Scheduler (DWS). Explore storage solutions like Google Cloud Storage (GCS), GCSFuse, Filestore Zonal, and Parallelstore. Leave this session with the tools and knowledge you need to deploy a high-performance ML cluster in minutes.

Properly architecting your storage infrastructure for AI is critical for success. Snap will share some of their best practices, implementation tips, and success stories for AI workloads. This session dives deep into training, checkpointing, and serving recommendations, covering Cloud Storage FUSE, Anywhere Cache, and parallel file systems. Gain insights to optimize your AI infrastructure and unlock its full potential. Don’t miss this opportunity to learn from real-world examples and expert advice.

Managing petabytes of Google Cloud storage objects? Attend this session to learn how the new Storage Intelligence product simplifies managing billions of objects across thousands of buckets. Leverage AI-driven insights to analyze cost, performance, security, and compliance – all through intuitive natural language queries. Quickly act on insights with bucket relocation and batch operations. Join us to uncover practical tools and exciting new features that can transform you into a Cloud Storage superhero.