Search – talk-data.com

SQL Engineering Connection Series - SQL Engine improvements 2026-01-22 · 17:00

SQL Engine - Query processing improvements
SQL Engine improvements - Query Processing
Derek Wilson
SQL Engine improvements for performance - Storage, optimized locking
SQL Engine improvements - Storage
Dimitri Furman


SQL Engine improvements - Query Processing & Storage 2026-01-22 · 17:00

SQL Community Engineering Connection Series: Episode 5

Part 1: SQL Engine - Query processing improvements
Part 2: SQL Engine improvements for performance - Storage, optimized locking

Speakers: Derek Wilson and Dimitri Furman

This series is all about connection: bringing Microsoft's product experts and the SQL community together to share knowledge, answer questions, and help you stay up to date on the latest in the SQL and data ecosystem.

Join Link: https://teams.microsoft.com/meet/2914607474379?p=x3u2ymC3IvLoLi18ZO
Meeting ID: 291 460 747 437 9
Passcode: NB9W7RP3


Announcing The SQL Community Engineering Connection Series 2025-11-19 · 17:00

Microsoft is excited to announce the launch of the first major step in rebuilding that connection: The SQL Community Engineering Connection Series, launching Wednesday, November 19th with Priya Sathy, Partner Director of Product SQL.

Get ready for an exciting session from Microsoft’s new SQL Community Engineering Connection Series - a biweekly event that brings members of the Microsoft SQL Product Group directly to the community!

Topic: SQL25 - MSFT Ignite Announcements
Date: November 19, 2025
Time: 9:00AM – 10:00AM PT
Speakers: Priya Sathy, Partner Director of Product Management (Azure Data)
Description: Learn all about what's new in SQL Server 2025, SQL in Fabric & Azure SQL and ask questions LIVE!

This series is all about connection: bringing Microsoft's product experts and the SQL community together to share knowledge, answer questions, and help you stay up to date on the latest in the SQL and data ecosystem. Be sure to register now in order to be notified when the Teams link goes live.
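The listing timestamp for this event (2025-11-19 · 17:00) matches the stated 9:00AM PT start if the listing time is in UTC. The page does not name its time zone, so that is an assumption. A minimal Python sketch of the conversion, using a fixed UTC-8 offset for Pacific Standard Time (the names `PST` and `to_utc` are illustrative only):

```python
from datetime import datetime, timedelta, timezone

# Assumption: "PT" on 19 November 2025 means Pacific Standard Time (UTC-8),
# because US daylight saving time ended earlier that month.
PST = timezone(timedelta(hours=-8), name="PST")


def to_utc(local_start: datetime) -> datetime:
    """Convert a timezone-aware start time to UTC."""
    return local_start.astimezone(timezone.utc)


# Session start as announced: November 19, 2025, 9:00AM PT.
session_start = datetime(2025, 11, 19, 9, 0, tzinfo=PST)
session_start_utc = to_utc(session_start)

print(session_start_utc.strftime("%Y-%m-%d %H:%M %Z"))  # 2025-11-19 17:00 UTC
```

Attendees in other regions can swap in their own offset to find the local start time.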

Session Format

55 min: Featured topics + demos from SQL Product Managers
Live Q&A with the speakers where time permits

NOTE: There will also be occasional special AMA sessions featuring SQL Leadership.

Join Link: https://teams.microsoft.com/meet/2914607474379?p=x3u2ymC3IvLoLi18ZO
Meeting ID: 291 460 747 437 9
Passcode: NB9W7RP3

We hope to see you there!

| Date/Time | Speaker | Product/Feature | Session Topic |
| --------- | ------- | --------------- | ------------- |
| November 19th 9:00am PST | Tejas Shah / Sudhir Raparla | Migration | Modernizing SQL : end to end Migration journey to Azure |


ClickHouse Delhi/Gurgaon Meetup - March 2025 2025-03-22 · 05:00

We are excited to finally have the first ClickHouse Meetup in the vibrant city of Delhi! Join the ClickHouse crew, from Singapore and from different cities in India, for an engaging day of talks, food, and discussion with your fellow database enthusiasts.

But here's the deal: to secure your spot, make sure you register ASAP!

🗓️ Agenda:

10:30 AM: Registration & Networking
11:05 AM: Welcome & Opening
11:10 AM: Introduction to ClickHouse by Rakesh Puttaswamy, Solution Architect @ ClickHouse
11:25 AM: ClickPipes Overview and demo by Kunal Gupta, Sr. Software Engineer @ ClickHouse
11:40 AM: Optimizing Log Management with Clickhouse: Cost-Effective & Scalable Solutions by Pushpender Kumar, DevOps Architect @ OLX India
12:10 PM: ClickHouse at Physics Wallah: Empowering Real-Time Analytics at Scale by Utkarsh G. Srivastava, Software Development Engineer III @ Physics Wallah
12:40 PM: FabFunnel & ClickHouse: Delivering Real-Time Marketing Analytics by Anmol Jain, SDE-2 (Full stack Developer) and Siddhant Gaba, SDE-2 (Python), @ Idea Clan
1:10 PM: From SQL to AI: Building Intelligent Applications with ClickHouse and LangDB by Matteo Pelati, Co-founder, LangDB.ai
1:40 PM: Lunch & Networking

If anyone from the community is interested in sharing a talk at future meetups, complete this CFP form and we’ll be in touch.

🎤 Session Details: Introduction to ClickHouse Discover the secrets behind ClickHouse's unparalleled efficiency and performance. Rakesh will give an overview of different use cases for which global companies are adopting this groundbreaking database to transform data storage and analytics.

Speaker: Rakesh Puttaswamy, Solution Architect @ ClickHouse Rakesh Puttaswamy is a Solution Architect with ClickHouse, working with users across India, with over 12 years of experience in data architecture, big data, data science, and software engineering. Rakesh helps organizations design and implement cutting-edge data-driven solutions. With deep expertise in a broad range of databases and data warehousing technologies, he specializes in building scalable, innovative solutions to enable data transformation and drive business success.

🎤 Session Details: ClickPipes Overview and demo ClickPipes is a powerful integration engine that simplifies data ingestion at scale, making it as easy as a few clicks. With an intuitive onboarding process, setting up new ingestion pipelines takes just a few steps—select your data source, define the schema, and let ClickPipes handle the rest. Designed for continuous ingest, it automates pipeline management, ensuring seamless data flow without manual intervention. In this talk, Kunal will demo the Postgres CDC connector for ClickPipes, enabling seamless, native replication of Postgres data to ClickHouse Cloud in just a few clicks—no external tools needed for fast, cost-effective analytics.

Speaker: Kunal Gupta, Sr. Software Engineer @ ClickHouse Kunal Gupta is a Senior Software Engineer at ClickHouse, joining through the acquisition of PeerDB in 2024, where he played a pivotal role as a founding engineer. With several years of experience in architecting scalable systems and real-time applications, Kunal has consistently driven innovation and technical excellence. Previously, he was a founding engineer for new solutions at ICICIdirect and at AsknBid Tech, leading high-impact teams and advancing code analysis, storage solutions, and enterprise software development.

🎤 Session Details: Optimizing Log Management with Clickhouse: Cost-Effective & Scalable Solutions Efficient log management is essential in today's cloud-native environments, yet traditional solutions like ElasticSearch often face scalability issues, high costs, and performance limitations. This talk will begin with an overview of common logging tools and their challenges, followed by an in-depth look at ClickHouse's architecture. We will compare ClickHouse with ElasticSearch, focusing on improvements in query performance, storage efficiency, and overall cost-effectiveness.

A key highlight will be OLX India's migration to ClickHouse, detailing the motivations behind the shift, the migration strategy, key optimizations, and the resulting 50% reduction in log storage costs. By the end of this talk, attendees will gain a clear understanding of when and how to leverage ClickHouse for log management, along with best practices for optimizing performance and reducing operational costs.

Speaker: Pushpender Kumar, DevOps Architect @ OLX India Born and raised in Bijnor, moved to Delhi to stay ahead in the race of life. Currently working as a DevOps Architect at OLX India, specializing in cloud infrastructure, Kubernetes, and automation with over 10 years of experience. Successfully optimized log storage costs by 50% using ClickHouse, bringing scalability and efficiency to large-scale logging systems. Passionate about cloud optimization, DevOps hiring, and performance engineering.

🎤 Session Details: ClickHouse at Physics Wallah: Empowering Real-Time Analytics at Scale This session explores how Physics Wallah revolutionized its real-time analytics capabilities by leveraging ClickHouse. We'll delve into the journey of implementing ClickHouse to efficiently handle large-scale data processing, optimize query performance, and power diverse use cases such as user activity tracking and engagement analysis. By enabling actionable insights and seamless decision-making, this transformation has significantly enhanced the learning experience for millions of users.

Today, more than five customer-facing products at Physics Wallah are powered by ClickHouse, serving over 10 million students and parents, including 1.5 million Daily Active Users. Our in-house ClickHouse cluster, hosted and managed within our EKS infrastructure on AWS Cloud, ingests more than 10 million rows of data daily from various sources. Join us to learn about the architecture, challenges, and key strategies behind this scalable, high-performance analytics solution.
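To put the figures above in perspective, here is a small back-of-the-envelope calculation in Python. It treats the stated "more than 10 million rows" per day as exactly 10 million, so the result is a lower bound, and it assumes ingestion is spread evenly over the day, which a real workload is unlikely to do. The function and variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rows_per_second(rows_per_day: int) -> float:
    """Average ingest rate, assuming rows arrive evenly across the day."""
    return rows_per_day / SECONDS_PER_DAY


# Figures quoted in the session description (lower bounds).
rows_per_day = 10_000_000
total_users = 10_000_000
daily_active_users = 1_500_000

rate = average_rows_per_second(rows_per_day)
dau_share = daily_active_users / total_users

print(f"Average ingest rate: {rate:.1f} rows/s")  # Average ingest rate: 115.7 rows/s
print(f"Daily active share: {dau_share:.0%}")     # Daily active share: 15%
```

Peak rates are usually well above the daily average, so capacity planning would need the actual traffic profile rather than this figure.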

Speaker: Utkarsh G. Srivastava, Software Development Engineer III @ Physics Wallah As a versatile Software Engineer with over 7 years of experience in the IT industry, I have had the privilege of taking on diverse roles, with a primary focus on backend development, data engineering, infrastructure, DevOps, and security. Throughout my career, I have played a pivotal role in transformative projects, consistently striving to craft innovative and effective solutions for customers in the SaaS space.

🎤 Session Details: FabFunnel & ClickHouse: Delivering Real-Time Marketing Analytics We are a performance marketing company that relies on real-time reporting to drive data-driven decisions and maximize campaign effectiveness. As our client base expanded, we encountered significant challenges with our reporting system—frequent data updates meant handling large datasets inefficiently, leading to slow query execution and delays in delivering insights. This bottleneck hindered our ability to provide timely optimizations for ad campaigns. To address these issues, we needed a solution that could handle rapid data ingestion and querying at scale without the overhead of traditional refresh processes. In this talk, we’ll share how we transformed our reporting infrastructure to achieve real-time insights, enhancing speed, scalability, and efficiency in managing large-scale ad performance data.

Speakers: Anmol Jain, SDE-2 (Full stack Developer), & Siddhant Gaba, SDE-2 (Python) @ Idea Clan From competing as a national table tennis player to building high-performance software, Anmol Jain brings a unique mix of strategy and problem-solving to tech. With 3+ years of experience at Idea Clan, they play a key role in scaling Lookfinity and FabFunnel, managing multi-million-dollar ad spends every month. Specializing in ClickHouse, React.js, and Node.js, Anmol focuses on real-time data processing and scalable backend solutions. At this meetup, they’ll share insights on solving reporting challenges and driving real-time decision-making in performance marketing.

Siddhant Gaba is an SDE II at Idea Clan, with expertise in Python, Java, and C#, specializing in scalable backend systems. With four years of experience working with FastAPI, PostgreSQL, MongoDB, and ClickHouse, he focuses on real-time analytics, database optimization, and distributed systems. Passionate about high-performance computing, asynchronous APIs, and system design, he aims to advance real-time data processing. Outside of work, he enjoys playing volleyball. At this meetup, he will share insights on how ClickHouse transformed real-time reporting and scalability.

🎤 Session Details: From SQL to AI: Building Intelligent Applications with ClickHouse and LangDB As AI becomes a driving force behind innovation, building applications that seamlessly integrate AI capabilities with existing data infrastructures is critical.

In this session, we explore the creation of agentic applications using ClickHouse and LangDB. We will introduce the concept of an AI gateway, explaining its role in connecting powerful AI models with the high-performance analytics engine of ClickHouse. By leveraging LangDB, we demonstrate how to directly interact with AI functions as User-Defined Functions (UDFs) in ClickHouse, enabling developers to design and execute complex AI workflows within SQL.

Additionally, we will showcase how LangDB facilitates deep visibility into AI function behaviors and agent interactions, providing tools to analyze and optimize the performance of AI-driven logic. Finally, we will highlight how ClickHouse, powered by LangDB APIs, can be used to evaluate and refine the quality of LLM responses, ensuring reliable and efficient AI integrations.

Speaker: Matteo Pelati, Co-founder, LangDB.ai Matteo Pelati is a seasoned software engineer with over two decades of experience, specializing in data engineering for the past ten years. He is the co-founder of LangDB, a company based in Singapore building the fastest Open Source AI Gateway. Before founding LangDB, he was part of the early team at DataRobot, where he contributed to scaling their product for enterprise clients. Subsequently, he joined DBS Bank where he built their data platform and team from the ground up. Prior to starting LangDB, Matteo led the data group for Asia Pacific and data engineering at Goldman Sachs.


Stateful, Distributed Stream Processing on Flink with Fabian Hueske - Episode 57 2018-11-19

Fabian Hueske – guest @ Data Artisans, Tobias Macey – host

Summary

Modern applications and data platforms aspire to process events and data in real time at scale and with low latency. Apache Flink is a true stream processing engine with an impressive set of capabilities for stateful computation at scale. In this episode Fabian Hueske, one of the original authors, explains how Flink is architected, how it is being used to power some of the world’s largest businesses, where it sits in the landscape of stream processing tools, and how you can start using it today.

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management.
When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute.
Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch.
Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat.
Your host is Tobias Macey and today I’m interviewing Fabian Hueske, co-author of the upcoming O’Reilly book Stream Processing With Apache Flink, about his work on Apache Flink, the stateful streaming engine.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by describing what Flink is and how the project got started?
What are some of the primary ways that Flink is used?
How does Flink compare to other streaming engines such as Spark, Kafka, Pulsar, and Storm?
What are some use cases that Flink is uniquely qualified to handle?
Where does Flink fit into the current data landscape?
How is Flink architected?
How has that architecture evolved?
Are there any aspects of the current design that you would do differently if you started over today?
How does scaling work in a Flink deployment?
What are the scaling limits?
What are some of the failure modes that users should be aware of?
How is the statefulness of a cluster managed?
What are the mechanisms for managing conflicts?
What are the limiting factors for the volume of state that can be practically handled in a cluster and for a given purpose?
Can state be shared across processes or tasks within a Flink cluster?
What are the comparative challenges of working with bounded vs unbounded streams of data?
How do you handle out of order events in Flink, especially as the delay for a given event increases?
For someone who is using Flink in their environment, what are the primary means of interacting with and developing on top of it?
What are some of the most challenging or complicated aspects of building and maintaining Flink?
What are some of the most interesting or unexpected ways that you have seen Flink used?
What are some of the improvements or new features that are planned for the future of Flink?
What are some features or use cases that you are explicitly not planning to support?
For people who participate in the training sessions that you offer through Data Artisans, what are some of the concepts that they are challenged by?
What do they find most interesting or exciting?

Contact Info

LinkedIn
@fhueske on Twitter
fhueske on GitHub

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Flink, Data Artisans, IBM DB2, Technische Universität Berlin, Hadoop, Relational Database, Google Cloud Dataflow, Spark, Cascading, Java, RocksDB, Flink Checkpoints, Flink Savepoints, Kafka, Pulsar, Storm, Scala, LINQ (Language INtegrated Query), SQL, Backpressure


Data Engineering Podcast
