
Activities & events


This technical workshop focuses on the data infrastructure required to build and maintain production-grade mobility datasets at fleet scale.

Date, Time and Location

Jan 14, 2026, 9:00-10:00 AM Pacific, Online. Register for the Zoom!

We will examine how to structure storage, metadata, access patterns, and quality controls so that mobility teams can treat perception datasets as first-class, versioned “infrastructure” assets. The session will walk through how to design a mobility data stack that connects object storage, labeling systems, simulation environments, and experiment tracking into a coherent, auditable pipeline.

What you’ll learn:

  • Model the mobility data plane: Define schemas for camera, LiDAR, radar, and HD map data, and represent temporal windows, ego poses, and scenario groupings in a way that is queryable and stable under schema evolution.
  • Build a versioned dataset catalog with FiftyOne: Use FiftyOne’s customizable workspaces and views to represent canonical datasets and integrate with your raw data sources, all while preserving lineage between raw logs, curated data, and simulation inputs (see the sketch after this list).
  • Implement governance and access control on mobility data: Configure role-based access and auditable pipelines to enforce data residency constraints while encouraging multi-team collaboration across research, perception, and safety functions.
  • Operationalize curation and scenario mining workflows: Use FiftyOne’s embeddings and labeling capabilities to surface rare events such as adverse weather and sensor anomalies, assign review tasks, and codify “critical scenario” definitions as reproducible dataset views.
  • Close the loop with evaluation and feedback signals: Connect FiftyOne to training and evaluation pipelines so that model failures feed back into dataset updates.
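As a rough illustration of the catalog ideas above, here is a minimal sketch using FiftyOne's grouped datasets and saved views; the field names ("weather", "scenario"), file paths, and view name are hypothetical, not part of the workshop materials.

```python
# Illustrative sketch only: a grouped FiftyOne dataset with camera + LiDAR slices and a
# saved view that codifies a "critical scenario" definition. Field names and paths are made up.
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.Dataset("mobility-logs")
dataset.persistent = True                            # keep the dataset in the local catalog
dataset.add_group_field("group", default="camera")   # one group = one synchronized multi-sensor frame

group = fo.Group()
dataset.add_samples([
    fo.Sample(
        filepath="/logs/run_0042/cam_front/000123.jpg",
        group=group.element("camera"),
        weather="rain",
        scenario="unprotected_left_turn",
    ),
    fo.Sample(
        filepath="/logs/run_0042/lidar_top/000123.pcd",
        group=group.element("lidar"),
        weather="rain",
        scenario="unprotected_left_turn",
    ),
])

# Codify a "critical scenario" definition as a named, reproducible view
critical = dataset.match((F("weather") == "rain") & (F("scenario") == "unprotected_left_turn"))
dataset.save_view("critical-rain-left-turns", critical)
print(dataset.list_saved_views())
```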

By the end of the workshop, attendees will have a concrete mental model and reference architecture for treating mobility datasets as a governed, queryable, and continuously evolving layer in their stack.

Jan 14 - Designing Data Infrastructures for Multimodal Mobility Datasets

This technical workshop focuses on the data infrastructure required to build and maintain production-grade mobility datasets at fleet scale.

Date, Time and Location

Jan 13, 2026, 9:00-10:00 AM Pacific, Online. Register for the Zoom!

We will examine how to structure storage, metadata, access patterns, and quality controls so that mobility teams can treat perception datasets as first-class, versioned “infrastructure” assets. The session will walk through how to design a mobility data stack that connects object storage, labeling systems, simulation environments, and experiment tracking into a coherent, auditable pipeline.

What you’ll learn:

  • Model the mobility data plane: Define schemas for camera, LiDAR, radar, and HD map data, and represent temporal windows, ego poses, and scenario groupings in a way that is queryable and stable under schema evolution.
  • Build a versioned dataset catalog with FiftyOne: Use FiftyOne’s customizable workspaces and views to represent canonical datasets and integrate with your raw data sources, all while preserving lineage between raw logs, curated data, and simulation inputs.
  • Implement governance and access control on mobility data: Configure role-based access and auditable pipelines to enforce data residency constraints while encouraging multi-team collaboration across research, perception, and safety functions.
  • Operationalize curation and scenario mining workflows: Use FiftyOne’s embeddings and labeling capabilities to surface rare events such as adverse weather and sensor anomalies, assign review tasks, and codify “critical scenario” definitions as reproducible dataset views.
  • Close the loop with evaluation and feedback signals: Connect FiftyOne to training and evaluation pipelines so that model failures feed back into dataset updates.

By the end of the workshop, attendees will have a concrete mental model and reference architecture for treating mobility datasets as a governed, queryable, and continuously evolving layer in their stack.

Jan 13 - Designing Data Infrastructures for Multimodal Mobility Datasets

Start 2026 with the ClickHouse India community in Gurgaon!

Connect with fellow data practitioners and hear from industry experts through engaging talks focused on lessons learned, best practices, and modern data challenges.

Agenda:

  • 10:30 AM: Registration, light snacks & networking
  • 11:00 AM: Welcome & Introductions
  • 11:10 AM: Inside ClickStack: Engineering Observability for Scale by Rakesh Puttaswamy, Lead Solutions Architect @ ClickHouse
  • 11:35 AM: Supercharging Personalised Notifications At Jobhai With ClickHouse by Sumit Kumar and Arvind Saini, Tech Leads @ Info Edge
  • 12:00 PM: Simplifying CDC: Migrating from Debezium to ClickPipes by Abhash Solanki, DevOps Engineer @ Spyne AI
  • 12:25 PM: Solving Analytics at Scale: From CDC to Actionable Insights by Kunal Sharma, Software Developer @ Samarth eGov
  • 12:50 PM: Q&A
  • 1:30 PM: Lunch & Networking

👉🏼 RSVP to secure your spot!

Interested in speaking at this meetup or future ClickHouse events? 🎤 Shoot an email to [email protected] and she'll be in touch.

🎤 Session Details: Inside ClickStack: Engineering Observability for Scale

Dive deep into ClickStack, ClickHouse’s fresh approach to observability built for engineers who care about speed, scale, and simplicity. We’ll unpack the technical architecture behind how ClickStack handles metrics, logs, and traces using ClickHouse as the backbone for real-time, high-cardinality analytics. Expect a hands-on look at ingestion pipelines, schema design patterns, query optimization, and the integrations that make ClickStack tick.
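To make the schema-design and query-optimization themes concrete, here is a hedged sketch of a wide-events style logs table and a high-cardinality query using the clickhouse-connect Python driver; the table layout, column names, and filter values are our own illustration, not ClickStack's actual schema.

```python
# Illustrative only: an observability-style logs table and a high-cardinality filter query.
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost", username="default", password="")

client.command("""
CREATE TABLE IF NOT EXISTS logs (
    Timestamp     DateTime64(3),
    ServiceName   LowCardinality(String),
    SeverityText  LowCardinality(String),
    TraceId       String,
    Body          String,
    Attributes    Map(String, String)
) ENGINE = MergeTree
ORDER BY (ServiceName, Timestamp)
""")

# Filter on a high-cardinality attribute and aggregate recent errors per service
rows = client.query("""
SELECT ServiceName, count() AS errors
FROM logs
WHERE SeverityText = 'ERROR'
  AND Timestamp > now() - INTERVAL 1 HOUR
  AND Attributes['http.route'] = '/checkout'
GROUP BY ServiceName
ORDER BY errors DESC
""").result_rows
print(rows)
```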

Speaker: Rakesh Puttaswamy, Lead Solutions Architect @ ClickHouse

🎤 Session Details: Supercharging Personalised Notifications At Jobhai With ClickHouse

Calculating personalized alerts for 2 million users is a data-heavy challenge that requires more than just standard indexing. This talk explores how Jobhai uses ClickHouse to power its morning notification pipeline, focusing on the architectural shifts and query optimizations that made our massive scale manageable and fast.

Speakers: Sumit Kumar and Arvind Saini, Tech Leads @ Info Edge

Sumit is a seasoned software engineer with deep expertise in databases, backend systems, and machine learning. For over six years, he has led the Jobhai engineering team, driving continuous improvements across their database infrastructure and user-facing systems while streamlining workflows through ongoing innovation. Connect with Sumit Kumar on LinkedIn.

Arvind is a Tech Lead at Info Edge India Ltd with experience building and scaling backend systems for large consumer and enterprise platforms. Over the years, they have worked across system design, backend optimization, and data-driven services, contributing to initiatives such as notification platforms, workflow automation, and product revamps. Their work focuses on improving reliability, performance, and scalability of distributed systems, and they enjoy solving complex engineering problems while mentoring teams and driving technical excellence.

🎤 Session Details: Simplifying CDC: Migrating from Debezium to ClickPipes

In this talk, Abhash will share his engineering team's journey migrating their core MySQL and MongoDB CDC flows to ClickPipes. He will contrast the previous architecture—where every schema change required manual intervention or complex Debezium configurations—with the new reality of ClickPipes' automated schema evolution, which seamlessly handles upstream schema changes and ingests flexible data without breaking pipelines.

Speaker: Abhash Solanki, DevOps Engineer @ Spyne AI

Abhash serves as a DevOps Engineer at Spyne, orchestrating the AWS infrastructure behind the company's data warehouse and CDC pipelines. Having managed complex self-hosted Debezium and Kafka clusters, he understands the operational overhead of running stateful data stacks in the cloud. He recently led the architectural shift to ClickHouse Cloud, focusing on eliminating engineering toil and automating schema evolution handling.

🎤 Session Details: Solving Analytics at Scale: From CDC to Actionable Insights

As SAMARTH’s data volumes grew rapidly, our analytics systems faced challenges with frequent data changes and near real-time reporting. These challenges were compounded by the platform’s inherently high cardinality in multidimensional data models - spanning institutions, programmes, states, categories, workflow stages, and time - resulting in highly complex and dynamic query patterns.

This talk describes how we evolved from basic CDC pipelines to a fast, reliable, and scalable near real-time analytics platform using ClickHouse. We share key design and operational learnings that enabled us to process continuous high-volume transactional data and deliver low-latency analytics for operational monitoring and policy-level decision-making.

Speaker: Kunal Sharma, Software Developer @ Samarth eGov

Kunal Sharma is a data-focused professional with experience in building scalable data pipelines. His work includes designing and implementing robust ETL/ELT workflows, data-driven decision engines, and large-scale analytics platforms. At SAMARTH, he has contributed to building near real-time analytics systems, including the implementation of ClickHouse for large-scale, low-latency analytics.

ClickHouse Gurgaon/Delhi Meetup

The next evolution of agentic AI isn’t just “better prompts” or “more tools”; it’s agents that can collaborate across boundaries. The A2A (Agent-to-Agent) Protocol makes that collaboration practical by standardizing how agents discover each other, negotiate capabilities, exchange tasks, stream progress, and return artifacts, even when they’re built on different frameworks or run in different environments.

In this session, we’ll unpack why many multi-agent systems fail in production (fragile handoffs, unclear responsibilities, brittle integrations, and poor reliability under long-running workflows). Then we’ll introduce the core A2A building blocks (Agent Cards, task lifecycles, streaming updates, artifact delivery, and secure interoperability) and show how to orchestrate multiple specialist agents with clear contracts and robust coordination patterns.

A live walkthrough will demonstrate how to design a Supervisor + Specialist architecture using A2A, including real-time progress streaming, error recovery, and observable “handoffs” that make multi-agent workflows durable instead of demo-only.

What We Will Cover:

  • Why multi-agent systems fail in production – context loss, inconsistent handoffs, poor visibility, and unreliable sub-task delegation
  • A2A Protocol Fundamentals – standardizing agent discovery, capability signaling, tasks, artifacts, and streaming
  • Agent Discovery with Agent Cards – skills, modalities, endpoints, versioning, and trust boundaries
  • Scalable Orchestration Patterns – Supervisor → Router → Specialist teams, and contract-based delegation with clear inputs/outputs
  • Tool-Using Agents vs Agent-to-Agent Collaboration – understanding how A2A complements MCP
  • Long-Running Task Management – designing task lifecycles, handling partial outputs, cancellations, retries, fallbacks, and resumable execution
  • Streaming Progress & Artifact Delivery – real-time updates and structured outputs you can store and reuse
  • Production Considerations – observability, debugging, governance, authentication boundaries, and safety guardrails for agent networks

Hands-On Insights:

Through a guided demo and Q&A, you’ll learn how to:

  • Stand up a simple A2A orchestrator agent that discovers specialist agents via their Agent Cards
  • Delegate work across multiple agents with reliable handoffs
  • Stream progress updates and collect artifacts (reports, structured data, intermediate reasoning outputs)
  • Implement practical resilience (timeouts, retries, fallback agents, and error-aware routing)

You’ll leave with a clear mental model and a reusable orchestration blueprint to evolve from single-agent demos into durable multi-agent systems.
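As a taste of the discovery step, here is a minimal sketch of fetching Agent Cards and routing by skill. The endpoint URLs and skill id are hypothetical, and the well-known card path and field names reflect our reading of the A2A spec rather than any particular SDK.

```python
# Minimal discovery-and-routing sketch (not a full A2A client): fetch each specialist's
# Agent Card and pick the first agent advertising the skill we need.
from typing import Optional
import requests

SPECIALISTS = [
    "https://research-agent.example.com",   # hypothetical specialist endpoints
    "https://report-agent.example.com",
]

def discover(base_url: str) -> dict:
    # A2A agents advertise themselves via a well-known Agent Card document
    return requests.get(f"{base_url}/.well-known/agent.json", timeout=10).json()

def pick_agent(skill_id: str) -> Optional[dict]:
    for url in SPECIALISTS:
        card = discover(url)
        if any(skill.get("id") == skill_id for skill in card.get("skills", [])):
            return card
    return None

card = pick_agent("summarize-report")       # hypothetical skill id
if card:
    print(f"Delegating to {card['name']} at {card['url']}")
```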

A2A Protocol Workshop: Build Interoperable Multi-Agent Systems

Enterprise analytics platforms are undergoing a major transformation—from centralized, overloaded data warehouses to federated, governed, GenAI-ready multi-warehouse architectures. In this session, you’ll learn how to design your data warehouse architecture to scale with your business needs. We’ll explore the end-to-end architectural evolution from a monolithic Redshift cluster to a modern multi-warehouse architecture, along with best practices for deploying it cost-effectively.

Learn more: More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

ABOUT AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2025 #AWS

Agile/Scrum Analytics AWS Cloud Computing DWH GenAI Redshift
AWS re:Invent 2024

Large Language Models are increasingly used to evaluate, score, and audit the outputs of other AI systems, from code generation to customer interactions and risk assessments. But how can you actually design and maintain an LLM-as-a-Judge system that is trustworthy, scalable, and aligned with your business goals?

In this 60-minute online interactive webinar, we’ll explore the architectural patterns, governance frameworks, and operational practices that enable LLMs to act as reliable evaluators across domains.

You’ll learn:

🧩 Core Concepts of LLM-as-a-Judge: How evaluators differ from chatbots, copilots, and agents, and what makes them essential for assessing model quality and compliance.

🏗️ Design & Architecture Patterns: Key patterns for prompt evaluation, reasoning calibration, rubric-based scoring, multi-model arbitration, and continuous feedback loops.

⚙️ Tools & Infrastructure: Open-source and cloud solutions for evaluator orchestration, logging, monitoring, and performance tracking.

📏 Governance & Maintenance: Best practices for bias mitigation, rubric evolution, drift detection, and maintaining long-term consistency.

🏢 Real-World Use Cases: Examples from companies that use “AI judges” to review code, summarize documents, evaluate customer interactions, or enforce compliance.
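As a concrete illustration of rubric-based scoring (one of the patterns above), here is a minimal judge sketch. It uses the OpenAI Python SDK purely for illustration; the model name, rubric, and score schema are hypothetical choices, not the webinar's reference implementation.

```python
# Minimal rubric-based LLM-as-a-Judge sketch. Assumes OPENAI_API_KEY is set; the rubric,
# criteria, and model name are illustrative only.
import json
from openai import OpenAI

client = OpenAI()

RUBRIC = """Score the ANSWER against the QUESTION and CONTEXT on a 1-5 scale for each criterion:
- faithfulness: claims are supported by the provided context
- completeness: all parts of the question are addressed
- clarity: the answer is readable and well organized
Return JSON: {"faithfulness": int, "completeness": int, "clarity": int, "rationale": str}"""

def judge(question: str, answer: str, context: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": f"QUESTION:\n{question}\n\nCONTEXT:\n{context}\n\nANSWER:\n{answer}"},
        ],
        response_format={"type": "json_object"},  # ask for machine-readable scores
        temperature=0,
    )
    return json.loads(resp.choices[0].message.content)

print(judge("What is the refund window?", "30 days from delivery.", "Policy: refunds within 30 days of delivery."))
```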

🎯 Who should attend?

  • AI/ML engineers and data scientists designing LLM evaluation systems
  • Solution architects and MLOps professionals deploying LLM pipelines
  • Compliance and model governance leads ensuring fairness and auditability
  • Anyone curious about how “AI judges” are redefining quality assurance in AI

By the end of this session, you’ll know how to build, govern, and evolve an LLM-as-a-Judge framework, and how to apply it to your own AI evaluation workflows.

📅 Duration: 60 minutes

🔗 URL: https://events.teams.microsoft.com/event/4bb20580-cffe-4322-80d3-dfebab4062ce@d94ea0cb-fd25-43ad-bf69-8d9e42e4d175

🤖 Agentic AI for Engineers - How to build and maintain LLM as a Judge 🧑‍⚖️
Jacob Matson – Dev Advocate @ MotherDuck

The lakehouse promised to unify our data, but popular formats can feel bloated and hard to use for most real-world workloads. If you've ever felt that the complexity and operational overhead of "Big Data" tools are overkill, you're not alone. What if your lakehouse could be simple, fast, and maybe even a little fun? Enter DuckLake, the native lakehouse format, managed on MotherDuck. It delivers the powerful features you need, like ACID transactions, time travel, and schema evolution, without the heavyweight baggage. This approach truly makes massive data sets feel like Small Data. This workshop is a practical, step-by-step walkthrough for the data practitioner. We'll get straight to the point and show you how to build a fully functional, serverless lakehouse from scratch.

You will learn:

  • The Architecture: We’ll explore how DuckLake's design choices make it fundamentally simpler and faster for analytical queries compared to its JVM-based cousins.
  • The Workflow: Through hands-on examples, you'll create a DuckLake table, perform atomic updates, and use time travel—all with the simple SQL you already know.
  • The MotherDuck Advantage: Discover how the serverless platform makes it easy to manage, share, and query your DuckLake tables, enabling a seamless hybrid workflow between your laptop and the cloud.
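As a preview of that workflow, here is a hedged sketch using the duckdb Python package with the ducklake extension; the catalog name, table, data path, and snapshot id are made up, and the attach and time-travel syntax should be verified against the current DuckLake documentation.

```python
# Illustrative DuckLake sketch: create a table, make an atomic update, then time travel.
import duckdb

con = duckdb.connect()
con.sql("INSTALL ducklake")
con.sql("LOAD ducklake")

# Attach a DuckLake catalog: metadata in a small DuckDB file, data stored as Parquet files
con.sql("ATTACH 'ducklake:demo.ducklake' AS lake (DATA_PATH 'lake_data/')")

con.sql("CREATE TABLE lake.trips (id INTEGER, city VARCHAR, fare DOUBLE)")
con.sql("INSERT INTO lake.trips VALUES (1, 'SF', 23.5), (2, 'Oakland', 11.0)")
con.sql("UPDATE lake.trips SET fare = 12.0 WHERE id = 2")   # atomic update -> new snapshot

# Time travel: query an earlier snapshot (snapshot numbering depends on the operations run)
print(con.sql("SELECT * FROM lake.trips AT (VERSION => 2)").fetchall())
```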

Big Data Cloud Computing Data Lakehouse Motherduck SQL
Small Data SF 2025
Harry Carr – CEO @ Vcinity, Molly Presley – host

As we count down to the 100th episode of Data Unchained, we’re revisiting one of the conversations that perfectly captures the spirit of this show: how data mobility is transforming business. In this look-back episode, host Molly Presley welcomes Harry Carr, CEO of Vcinity, for a deep dive into the technology that’s redefining how enterprises access and move data across distributed environments. Harry explains why hybrid cloud exists, how Vcinity accelerates data access without duplication or compression, and why the future of data architecture lies in making data available anywhere—instantly. From connecting global AI workflows to eliminating the need to move massive datasets, this episode explores what true “data anti-gravity” looks like and how it’s reshaping the modern enterprise. Listen as Molly and Harry discuss the evolution of data architectures, the synergy between Hammerspace and Vcinity, and what it means to build a world where applications and data connect seamlessly, no matter where they live.

Music: Cyberpunk by jiglr | https://soundcloud.com/jiglrmusic, promoted by https://www.free-stock-music.com, Creative Commons Attribution 3.0 Unported License (https://creativecommons.org/licenses/by/3.0/deed.en_US)

AI/ML Cloud Computing
Data Unchained
Podcast

# Building a Unified Medallion Architecture in Microsoft Fabric: From Raw Data to AI-Ready Insights

As data platforms evolve toward unified and intelligent ecosystems, the Medallion Architecture has emerged as a foundational design pattern for building scalable, governed, and AI-ready analytics environments. Microsoft Fabric brings this vision to life by seamlessly integrating ingestion, transformation, governance, and visualization within a single, end-to-end platform.

In this session, Rajesh Vayyala will share an architecture-driven perspective on how organizations can conceptualize and operationalize a Medallion Architecture in Microsoft Fabric. We will focus on the architectural blueprint, governance principles, and organizational strategies required to establish Bronze, Silver, and Gold layers that ensure data consistency, lineage, and trust across data domains. 🏗️

Attendees will gain practical insights into designing resilient data estates, aligning data architecture with business goals, and enabling self-service analytics and AI through Fabric’s semantic layer and Power BI integration.

## Learning Objective

By the end of this session, attendees will be able to:

🔹 Recognize the key principles of the Medallion Architecture and understand how it supports scalable, governed, and AI-ready data ecosystems.
🔹 Map organizational data flows into Bronze, Silver, and Gold layers within Microsoft Fabric using an architecture-driven approach rather than deep technical coding.
🔹 Apply governance best practices for metadata management, lineage, and schema evolution using Fabric’s unified data foundation (OneLake, Data Activator, and Dataflows Gen2).
🔹 Align performance and reusability goals with enterprise data strategy—ensuring consistent, high-quality data delivery across analytics and AI initiatives.
🔹 Leverage the semantic layer and Power BI integration in Fabric to enable self-service analytics and informed decision-making across business domains.
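To ground the layering idea, here is a minimal, hedged Bronze-to-Silver promotion sketch in PySpark; the table names, columns, and lakehouse path are hypothetical, and the session itself stays at the architecture level rather than code.

```python
# Illustrative only: promoting raw Bronze data to a cleaned Silver Delta table.
# Table names, columns, and the lakehouse path are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()   # in a Fabric notebook, `spark` is already provided

bronze = spark.read.format("delta").load("Tables/bronze_orders")   # raw, as-ingested records

silver = (
    bronze
    .dropDuplicates(["order_id"])                          # de-duplicate on the business key
    .withColumn("order_ts", F.to_timestamp("order_ts"))    # enforce types
    .filter(F.col("order_id").isNotNull())                 # basic quality rule
)

silver.write.format("delta").mode("overwrite").saveAsTable("silver_orders")
```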

Session Level: Intermediate

Microsoft Fabric: Medallion Architecture for AI-Ready Insights

This presentation outlines the transformation from legacy systems to a modern data and AI platform within Toyota Material Handling Europe. It highlights the strategic adoption of Snowflake to unify data architecture, enable real-time analytics, and support external data sharing. The journey includes the foundation and evolution of an internal AI initiative, DataLabs, which matured into a full-scale AI program.

AI/ML Analytics Snowflake
Snowflake World Tour - Stockholm
Rafael Natali – Lead DevSecOps @ Marionete , Venkata Abburi – Lead Data Engineer @ Marionete

In this session, we’ll explore the real-world journey of implementing a scalable, secure, and resilient data streaming platform—from the ground up. Bridging DevOps and DataOps practices, we’ll cover how our team designed the architecture, selected the right tools (like Kafka and Kubernetes), automated deployments, and enforced data governance across environments. You'll learn how we tackled challenges like schema evolution, CI/CD for data pipelines, monitoring at scale, and team collaboration. Whether you're just starting or scaling your data platform, this talk offers practical takeaways and battle-tested lessons from the trenches of building streaming infrastructure in production.

Kafka Kubernetes
DevOps/DataOps Journey to implement a Data Platform streaming solution

The exponential growth of textual data—ranging from social media posts and digital news archives to speech-to-text transcripts—has opened new frontiers for research in the social sciences. Tasks such as stance detection, topic classification, and information extraction have become increasingly common. At the same time, the rapid evolution of Natural Language Processing, especially pretrained language models and generative AI, has largely been led by the computer science community, often leaving a gap in accessibility for social scientists.

To address this, we have been developing ActiveTigger since 2023: a lightweight, open-source Python application (with a web frontend in React) designed to accelerate the annotation process and manage large-scale datasets through the integration of fine-tuned models. It aims to support computational social science for a broad audience both within and outside the social sciences. The tool is already used by an active community of social scientists, and a stable version is planned for early June 2025.

From a more technical perspective, the API is designed to manage the complete workflow: project creation, embeddings computation, exploration of the text corpus, human annotation with active learning, fine-tuning of pretrained (BERT-like) models, prediction on a larger corpus, and export. It also integrates LLM-as-a-service capabilities for prompt-based annotation and information extraction, offering a flexible approach to hybrid manual/automatic labeling. Accessible through both a web frontend and a Python client, ActiveTigger encourages customization and adaptation to specific research contexts and practices.
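The active-learning loop at the heart of that workflow can be sketched generically. The snippet below is not ActiveTigger's client API; it is just a minimal uncertainty-sampling illustration with scikit-learn, using toy data and an arbitrary model choice.

```python
# Generic uncertainty-sampling loop (NOT the ActiveTigger API): train on current labels,
# score the unlabeled pool, and ask a human to annotate the most uncertain document.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

corpus = [
    "the government announced a new climate policy",
    "the team won the championship last night",
    "parliament debated the proposed budget cuts",
    "the striker scored twice in the final",
    "ministers met to discuss the energy transition",
    "fans celebrated the victory in the streets",
]
labels = {0: "politics", 1: "sports"}            # seed annotations from a human

X = TfidfVectorizer().fit_transform(corpus)

for _ in range(3):                               # each round: train, score, annotate one doc
    ids = sorted(labels)
    clf = LogisticRegression().fit(X[ids], [labels[i] for i in ids])
    uncertainty = 1.0 - clf.predict_proba(X).max(axis=1)
    uncertainty[ids] = -1.0                      # never re-ask about already-labeled docs
    pick = int(np.argmax(uncertainty))
    labels[pick] = input(f"Label for: {corpus[pick]!r} -> ")   # human-in-the-loop step
```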

In this talk, we will delve into the motivations behind the creation of ActiveTigger, outline its technical architecture, and walk through its core functionalities. Drawing on several ongoing research projects within the Computational Social Science (CSS) group at CREST, we will illustrate concrete use cases where ActiveTigger has accelerated data annotation, enabled scalable workflows, and fostered collaborations. Beyond the technical demonstration, the talk will also open a broader reflection on the challenges and opportunities brought by generative AI in academic research—especially in terms of reliability, transparency, and methodological adaptation for qualitative and quantitative inquiries.

The project repository: https://github.com/emilienschultz/activetigger/

The development of this software is funded by the DRARI Ile-de-France and supported by Progédo.

AI/ML API Computer Science GenAI GitHub LLM NLP Python React
PyData Paris 2025