talk-data.com

Topic: Looker

Tags: bi, data_exploration, analytics

13 tagged activities

Activity Trend: peak of 14 activities per quarter, 2020-Q1 to 2026-Q1

Activities

13 activities · Newest first

Scaling Trust in BI: How Bolt Manages Thousands of Metrics Across Databricks, dbt, and Looker

Managing metrics across teams can feel like everyone’s speaking a different language, which often leads to a loss of trust in numbers. Based on a real-world use case, we’ll show you how to establish a governed source of truth for metrics that works at scale and builds a solid foundation for AI integration. You’ll explore how Bolt.eu’s data team governs consistent metrics for different data users and leverages Euno’s automations to navigate the overlap between Looker and dbt. We’ll cover best practices for deciding where your metrics belong and how to optimize engineering and maintenance workflows across Databricks, dbt, and Looker. For curious analytics engineers, we’ll dive into thinking in dimensions & measures vs. tables & columns and determining when pre-aggregations make sense. The goal is to help you contribute to a self-serve experience with consistent metric definitions, so business teams and AI agents can access the right data at the right time without endless back-and-forth.
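
As a rough illustration of the dimensions-and-measures mindset the talk describes (not Bolt's or Euno's actual implementation; all names and data below are invented), a governed measure can be defined once and reused across any dimension, instead of each consumer re-deriving it from tables and columns:

```python
# Hypothetical sketch: one governed measure definition, reused across dimensions.
import pandas as pd

trips = pd.DataFrame({
    "city": ["Tallinn", "Tallinn", "Riga"],
    "fare": [10.0, 14.0, 8.0],
    "completed": [True, False, True],
})

# Tables-and-columns thinking: every analyst re-derives the metric, and
# definitions drift (does "revenue" include cancelled trips or not?).
adhoc_revenue = trips.loc[trips["completed"], "fare"].sum()

# Dimensions-and-measures thinking: the definition lives in one place.
MEASURES = {
    "completed_fare": lambda df: df.loc[df["completed"], "fare"].sum(),
}

def aggregate(df, dimension, measure):
    """Apply a governed measure definition per value of a dimension."""
    fn = MEASURES[measure]
    return {key: fn(group) for key, group in df.groupby(dimension)}

print(aggregate(trips, "city", "completed_fare"))  # {'Riga': 8.0, 'Tallinn': 10.0}
```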

The Best Data Warehouse is a Lakehouse

Reynold Xin, Co-founder and Chief Architect at Databricks, presented at Data + AI Summit 2024 on Databricks SQL, its recent advancements, and how to drive performance improvements with the Databricks Data Intelligence Platform.

Speakers: Reynold Xin, Co-founder and Chief Architect, Databricks; Pearl Ubaru, Technical Product Engineer, Databricks

Main Points and Key Takeaways (AI-generated summary)

Introduction of Databricks SQL:
- Databricks SQL was announced four years ago and has become the fastest-growing product in Databricks history.
- Over 7,000 customers, including Shell, AT&T, and Adobe, use Databricks SQL for data warehousing.

Evolution from Data Warehouses to Lakehouses:
- Traditional data architectures involved separate data warehouses (for business intelligence) and data lakes (for machine learning and AI).
- The lakehouse concept combines the best aspects of data warehouses and data lakes into a single package, addressing issues of governance, storage formats, and data silos.

Technological Foundations:
- To support the lakehouse, Databricks developed Delta Lake (storage layer) and Unity Catalog (governance layer).
- Over time, lakehouses have been recognized as the future of data architecture.

Core Data Warehousing Capabilities:
- Databricks SQL has evolved to support essential data warehousing functionalities like full SQL support, materialized views, and role-based access control.
- Integration with major BI tools like Tableau, Power BI, and Looker is available out-of-the-box, reducing migration costs.

Price Performance:
- Databricks SQL offers significant improvements in price performance, which is crucial given the high costs associated with data warehouses.
- Databricks SQL scales more efficiently compared to traditional data warehouses, which struggle with larger data sets.

Incorporation of AI Systems:
- Databricks has integrated AI systems at every layer of their engine, improving performance significantly.
- AI systems automate data clustering, query optimization, and predictive indexing, enhancing efficiency and speed.

Benchmarks and Performance Improvements:
- Databricks SQL has seen dramatic improvements, with some benchmarks showing a 60% increase in speed compared to 2022.
- Real-world benchmarks indicate that Databricks SQL can handle high concurrency loads with consistent low latency.

User Experience Enhancements:
- Significant efforts have been made to improve the user experience, making Databricks SQL more accessible to analysts and business users, not just data scientists and engineers.
- New features include visual data lineage, simplified error messages, and AI-driven recommendations for error fixes.

AI and SQL Integration:
- Databricks SQL now supports AI functions and vector searches, allowing users to perform advanced analysis and query optimizations with ease.
- The platform enables seamless integration with AI models, which can be published and accessed through the Unity Catalog.

Conclusion:
- Databricks SQL has transformed into a comprehensive data warehousing solution that is powerful, cost-effective, and user-friendly.
- The lakehouse approach is presented as a superior alternative to traditional data warehouses, offering better performance and lower costs.

Transforming healthcare by putting data in the driver’s seat at Vida Health - Coalesce 2023

In this session, Vida Health’s senior director of data, mobile, and web engineering shares a story that can help other data and business leaders capitalize on the opportunities being created by current technology innovations, market realities, and real-world problems. This includes a playbook on how Vida Health uses modern data technologies like dbt Cloud, Fivetran, Looker, BigQuery, BigQueryML/dbtML, Vertex AI, LLMs, and more to put data in the driver’s seat to solve meaningful problems in complex industries like healthcare.

Speaker: Trenton Huey, Senior Director, Data and Frontend Engineering, Vida Health

Register for Coalesce at https://coalesce.getdbt.com

Business process occurrence, volume, and duration modeling using dbt Cloud - Coalesce 2023

Business processes are the foundation of any organization, directing entities toward specific outcomes. These processes can be simple or complex and may take days or even months to complete. Insights into business processes fall into three categories: occurrence, volume, and velocity.

In this presentation, Routable’s Director of Data & Analytics discusses the technical and process complexities involved in creating data models in a data warehouse using dbt Cloud. The session also provides tips to make the process easier and explains how to expose this data to users using Looker.
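
As a rough sketch of those three categories (invented column names and data; the talk builds these as dbt models in SQL, not pandas):

```python
# Hypothetical sketch: occurrence, volume, and duration from process events.
import pandas as pd

events = pd.DataFrame({
    "process_id": [1, 1, 2, 2, 3],
    "event": ["started", "completed", "started", "completed", "started"],
    "ts": pd.to_datetime(
        ["2023-01-01", "2023-01-03", "2023-01-02", "2023-01-10", "2023-01-05"]
    ),
})

# One row per process, with start and completion timestamps side by side.
processes = events.pivot(index="process_id", columns="event", values="ts")

# Occurrence: did the process reach its outcome at all?
processes["occurred"] = processes["completed"].notna()

# Volume: how many processes started per week?
volume = processes.groupby(processes["started"].dt.to_period("W")).size()

# Duration (velocity): how long did each completed process take?
processes["duration"] = processes["completed"] - processes["started"]

print(processes, volume, sep="\n\n")
```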

Speaker: Jason Hodson, Director, Data & Analytics, Routable

How Comcast Effectv Drives Data Observability with Databricks and Monte Carlo

Comcast Effectv, the 2,000-employee advertising wing of Comcast, America’s largest telecommunications company, provides custom video ad solutions powered by aggregated viewership data. As part of a global technology and media company connecting millions of customers to personalized experiences and processing billions of transactions, Comcast Effectv was challenged with handling massive loads of data, monitoring hundreds of data pipelines, and managing timely coordination across data teams.

In this session, we will discuss Comcast Effectv’s journey to building a more scalable, reliable lakehouse and driving data observability at scale with Monte Carlo. This has given Effectv a single-pane-of-glass view of their entire data environment, ensuring trust in consumer data across AWS, Databricks, and Looker.

Talk by: Scott Lerner and Robinson Creighton

Cubing and Metrics in SQL, Oh My!

ABOUT THE TALK: Apache Calcite has extended SQL to support metrics (which we call ‘measures’), filter context, and analytic expressions. With these concepts you can define data models (which we call Analytic Views) that contain metrics, use them in queries, and define new metrics in queries.

This talk, hosted by the original developer of Apache Calcite, describes the SQL syntax extensions for metrics and how to use them for cross-dimensional calculations such as period-over-period, percent-of-total, and non-additive and semi-additive measures. It details how we got around fundamental limitations in SQL semantics, and approaches for optimizing queries that use metrics.
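
The extensions themselves are SQL syntax; as a rough illustration of two of the calculations they encapsulate (invented data, plain pandas rather than Calcite):

```python
# Hypothetical sketch of percent-of-total and period-over-period calculations.
import pandas as pd

sales = pd.DataFrame({
    "quarter": ["2023-Q1", "2023-Q2", "2023-Q1", "2023-Q2"],
    "region": ["EMEA", "EMEA", "AMER", "AMER"],
    "revenue": [100.0, 120.0, 200.0, 180.0],
})

# Percent-of-total: each region's share of its quarter's total revenue.
sales["pct_of_total"] = (
    sales["revenue"] / sales.groupby("quarter")["revenue"].transform("sum")
)

# Period-over-period: revenue change versus the prior quarter, per region.
sales = sales.sort_values(["region", "quarter"])
sales["qoq_change"] = sales.groupby("region")["revenue"].diff()

print(sales)
```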

ABOUT THE SPEAKER: Julian Hyde is the original developer of Apache Calcite, an open source framework for building data management systems, and Morel, a functional query language. Previously he created Mondrian, an analytics engine, and SQLstream, an engine for continuous queries. He is a staff engineer at Google, where he works on Looker and BigQuery.

ABOUT DATA COUNCIL: Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.

Malloy: An Experimental Language for Data | Google

ABOUT THE TALK: Forcing data through a rectangle shapes the way we solve problems (for example, dimensional fact tables, OLAP Cubes).

Most data isn’t rectangular; rather, it exists in hierarchies (orders, items, products, users). Most query results are better returned as a hierarchy (category, brand, product).

Malloy is a new experimental data programming language that, among other things, breaks the rectangle paradigm and several other long-held misconceptions in the way we analyze data.
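
As a tiny, invented illustration of the rectangle point (plain Python, not Malloy syntax), here is the same data as flat rows versus the nested shape a hierarchical result takes:

```python
# Hypothetical sketch: flat rectangular rows versus a nested hierarchy.
from collections import defaultdict

flat_rows = [
    {"order_id": 1, "item": "kayak", "price": 500.0},
    {"order_id": 1, "item": "paddle", "price": 50.0},
    {"order_id": 2, "item": "tent", "price": 300.0},
]

# Nesting recovers the natural hierarchy: orders contain items.
orders = defaultdict(list)
for row in flat_rows:
    orders[row["order_id"]].append({"item": row["item"], "price": row["price"]})

for order_id, items in orders.items():
    print(order_id, items)
```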

In this talk, Lloyd Tabb shares the ideas behind the Malloy language, semantic data modeling, and his vision for the future of data.

ABOUT THE SPEAKER: Lloyd Tabb spent the last 30 years revolutionizing how the world uses the internet and, by extension, data. He is one of the internet pioneers, having worked at Netscape during the browser wars as the Principal Engineer on Navigator Gold, the first HTML WYSIWYG editor.

Originally a database & languages architect at Borland, Lloyd founded Looker, which Google acquired in 2019. Lloyd’s work at Looker helped define the Modern Data Stack.

At Google, Lloyd continues to pursue his passion for data and love of programming languages through his current project, Malloy.

Prevent dbt Changes from Breaking Looker with Spectacles

We'll explain how Spectacles makes dbt developers more confident and efficient by revealing the impact of their proposed changes on Looker—preventing dashboard outages and angry analysts.
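
As a loose sketch of the general idea (this is not how Spectacles works; it validates LookML by actually running queries, and the model-to-view mapping and paths below are invented):

```python
# Hypothetical CI check: flag Looker views built on dbt models a PR modifies.
import subprocess

# dbt's state:modified selector lists models that differ from a prior manifest.
modified = subprocess.run(
    ["dbt", "ls", "--select", "state:modified", "--state", "prod-artifacts/",
     "--resource-type", "model", "--output", "name"],
    capture_output=True, text=True, check=True,
).stdout.split()

# Invented mapping from dbt models to the Looker views that query them.
LOOKER_VIEWS = {
    "orders": ["orders_view", "finance_dashboard"],
    "customers": ["customers_view"],
}

for model in modified:
    for view in LOOKER_VIEWS.get(model, []):
        print(f"warning: dbt model '{model}' feeds Looker view '{view}'")
```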

Check the slides here: https://docs.google.com/presentation/d/1NOG61IhhN3GkkCe2osIZJ516oT4k_T7oJB3l9kklfyk/edit?usp=sharing

Coalesce 2023 is coming! Register for free at https://coalesce.getdbt.com/.

Beyond the buzz: 20 real metadata use cases in 20 minutes with Atlan and dbt Labs

Metadata has traditionally been used for only a few use cases like static and passive data catalogs. However, active metadata can be the key to unlocking a variety of use cases, acting as the glue that binds together our diverse modern data stacks (e.g. dbt, Snowflake, Fivetran, Databricks, Looker, and Tableau) and diverse teams (e.g. analytics engineers, data analysts, data engineers, and business users)! At Atlan, we’ve worked closely with modern data teams like WeWork, Plaid, PayU, SnapCommerce, and Bestow. In this session, we’ll lay out all our learnings about how real-life data teams are using metadata to drive powerful use cases like column-level lineage, programmatic governance, root cause analysis, proactive upstream alerts, dynamic pipeline optimization, cost optimization, data deprecation, automated quality control, metrics management, and more. P.S. We’ll also reveal how active metadata and the dbt Semantic Layer can work together to transform the way your team works with metrics!
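
As a small sketch of just one of those use cases, column-level lineage with upstream root-cause traversal (a plain graph walk over invented edges, not Atlan's implementation):

```python
# Hypothetical sketch: walk column-level lineage to find root-cause candidates.
LINEAGE = {  # column -> columns it is derived from
    "dashboard.revenue": ["mart.fct_orders.amount"],
    "mart.fct_orders.amount": ["staging.stg_orders.amount"],
    "staging.stg_orders.amount": ["raw.orders.amount_cents"],
}

def upstream(column):
    """Return every column upstream of the given one."""
    seen, stack = [], [column]
    while stack:
        for parent in LINEAGE.get(stack.pop(), []):
            if parent not in seen:
                seen.append(parent)
                stack.append(parent)
    return seen

# If dashboard.revenue looks wrong, these are the places to check.
print(upstream("dashboard.revenue"))
```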

Check the slides here: https://docs.google.com/presentation/d/1xrC9yhHOQ00qWt-gVlgbakRELg2FzEPt-RwMsUWzdZA/edit?usp=sharing

Building a Data Platform from Scratch with dbt, Snowflake and Looker

When Prateek Chawla, founding engineer, joined Monte Carlo in 2019, he was responsible for spinning up our data platform from scratch. He was more of a backend/cloud engineer but, as with any startup, had to wear many hats, so he got the opportunity to play the role of data engineer too. In this talk, we’ll walk through how we spun up Monte Carlo’s data stack with Snowflake, Looker, and dbt, touching on how and why we implemented dbt (and later, dbt Cloud), key use cases, and handy tricks for integrating dbt with other popular tools like Airflow and Spark. We’ll discuss what worked, what didn’t work, and other lessons learned along the way, as well as share how our data stack evolved over time to scale to meet the demands of our growing startup. We’ll also touch on a very critical component of the dbt value proposition, data quality testing, and discuss some of our favorite tests and what we’ve done to automate and integrate them with other elements of our stack.
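
The tests mentioned are dbt tests written in SQL; as a rough Python analogue of the two most common ones (invented rows, not Monte Carlo's or dbt's implementation):

```python
# Hypothetical sketch of not-null and unique data quality checks.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "c@example.com"},
]

def test_not_null(rows, column):
    """Rows where the column is missing a value."""
    return [r for r in rows if r[column] is None]

def test_unique(rows, column):
    """Rows whose column value was already seen."""
    seen, dupes = set(), []
    for r in rows:
        if r[column] in seen:
            dupes.append(r)
        seen.add(r[column])
    return dupes

print("not_null failures:", test_not_null(rows, "email"))
print("unique failures:", test_unique(rows, "id"))
```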

Perfect complements: Using dbt with Looker for effective data governance

Learn how a rapidly growing software development firm transformed their legacy data analytics approach by embracing analytics engineering with dbt and Looker. In this video, Johnathan Brooks of 4 Mile Analytics outlines the complementary benefits of these tools and discusses design patterns and analytics engineering principles that enable strong data governance, increased agility and scalability, while decreasing maintenance overhead.

Analytics on your analytics, Drizly

Using dbt's metadata on dbt runs (run_results.json), Drizly's analytics team is able to track, monitor, and alert on its dbt models, using Looker to visualize the data. In this video, Emily Hawkins covers how Drizly did this previously, using dbt macros and inserts, and how the process was improved using run_results.json in conjunction with Dagster (and teamwork with Fishtown Analytics!)
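
As a minimal sketch of the run_results.json approach (field names follow dbt's documented artifact schema; the path and what you do with the output are assumptions):

```python
# Hypothetical sketch: summarize model outcomes from dbt's run_results.json,
# e.g. before loading them into a table that Looker visualizes.
import json

with open("target/run_results.json") as f:  # dbt writes this after each run
    run_results = json.load(f)

for result in run_results["results"]:
    print(
        result["unique_id"],                # e.g. model.my_project.orders
        result["status"],                   # success / error / skipped
        f"{result['execution_time']:.1f}s"  # wall-clock time for the node
    )
```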