talk-data.com talk-data.com

Topic

ClickHouse

columnar_database big_data analytics database data_warehouse olap realtime

5

tagged

Activity Trend

17 peak/qtr
2020-Q1 2026-Q1

Activities

5 activities · Newest first

ClickHouse and Databricks for Real-Time Analytics

ClickHouse is a C++ based, column-oriented database built for real-time analytics. While it has its own internal storage format, the rise of open lakehouse architectures has created a growing need for seamless interoperability. In response, we have developed integrations with your favorite lakehouse ecosystem to enhance compatibility, performance and governance. From integrating with Unity Catalog to embedding the Delta Kernel into ClickHouse, this session will explore the key design considerations behind these integrations, their benefits to the community, the lessons learned and future opportunities for improved compatibility and seamless integration.

Delta Kernel for Rust and Java

Delta Kernel makes it easy for engines and connectors to read and write Delta tables. It supports many Delta features and robust connectors, including DuckDB, Clickhouse, Spice AI and delta-dotnet. In this session, we'll cover lessons learned about how to build a high-performance library that lets engines integrate the way they want, while not having to worry about the details of the Delta protocol. We'll talk through how we streamlined the API as well as its changes and underlying motivations. We'll discuss some new highlight features like write support, and the ability to do CDF scans. Finally we'll cover the future roadmap for the Kernel project and what you can expect from the project over the coming year.

Coalesce 2024: A strategic approach to testing & monitoring with Data Products

With analytics teams' growing ambition to build business automation, foundational AI systems, or customer-facing products, we must shift our mindset about data quality. Mechanically applied testing will not be enough; we need a more robust strategy akin to software engineering.

We outline a new approach to data testing and observability anchored in the ‘Data Products’ concept and walk through the practical implementation of a production-grade analytics system at SYNQ, powered by ClickHouse and dbt.

Speaker: Petr Janda Founder SYNQ

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale https://www.getdbt.com/blog/coalesce-2024-product-announcements

Opening the Floodgates: Enabling Fast, Unmediated End User Access to Trillion-Row Datasets with SQL

Spreadsheets revolutionized IT by giving end users the ability to create their own analytics. Providing direct end user access to trillion-row datasets generated in financial markets or digital marketing is much harder. New SQL data warehouses like ClickHouse and Druid can provide fixed latency with constant cost on very large datasets, which opens up new possibilities.

Our talk walks through recent experience on analytic apps developed by ClickHouse users that enable end users like market traders to develop their own analytics directly off raw data. We’ll cover the following topics.

  1. Characteristics of new open source column databases and how they enable low-latency analytics at constant cost.

  2. Idiomatic ways to validate new apps by building MVPs that support a wide range of queries on source data including storing source JSON, schema design, applying compression on columns, and building indexes for needle-in-a-haystack queries.

  3. Incrementally identifying hotspots and applying easy optimizations to bring query performance into line with long term latency and cost requirements.

  4. Methods of building accessible interfaces, including traditional dashboards, imitating existing APIs that are already known, and creating app-specific visualizations.

We’ll finish by summarizing a few of the benefits we’ve observed and also touch on ways that analytic infrastructure could be improved to make end user access even more productive. The lessons are as general as possible so that they can be applied across a wide range of analytic systems, not just ClickHouse.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/