talk-data.com

Topic: DWH (Data Warehouse)
Tags: analytics, business_intelligence, data_storage
568 tagged activities

Activity Trend (chart): 35 peak/qtr, 2020-Q1 to 2026-Q1

Activities
568 activities · Newest first

Data Warehouse Design Solutions

"Each chapter is... a practice run for the way we all ought to design our data marts and hence our data warehouses."-Ralph Kimball, from the Foreword. Let the experts show you how to customize data warehouse designs for real business needs in Data Warehouse Design Solutions. To effectively design a data warehouse, you have to understand its many business uses. This guidebook shows you how business managers in different corporate functions actually use data warehouses to make decisions. You'll get a rich set of data warehouse designs that flow from realistic business cases. Two top experts show you how to customize your data warehouse designs for real-life business needs including: Sales and marketing Production and inventory management Budgeting and financial reporting Quality control Product delivery and fulfillment Strategic business analysis such as determining market share, rates of return on investment, and other key analytic ratios. CD-ROM includes All sample data warehouse designs with accompanying preformatted reports in HTML for specific business uses such as marketing, sales, and financial analysis. This title includes additional digital media when purchased in print format. For this digital book edition, media content may not be included. Contact the publisher's customer service directly for assistance.

Experience the transformative power of real-time interaction with your dashboards through natural language. Say goodbye to waiting for analysts to provide graphics and conclusions. Witness the immediacy of getting your questions answered accurately and precisely, grounded in your data warehouse. Join us for an immersive exploration of data-driven decision-making, complete with a live demo showcasing a practical business scenario. By attending this session, your contact information may be shared with the sponsor for relevant follow-up for this event only.

Big Data is Dead: Long Live Hot Data 🔥

Over the last decade, Big Data has been everywhere. Let's set the record straight on what is and isn't Big Data. We have been consumed by a conversation about data volumes when we should focus on the immediate task at hand: simplifying our work.

Some of us may have Big Data, but our quest to derive insights from it is measured in small slices of work that fit on your laptop or in your hand. Easy data is here; let's make the most of it.

📓 Resources Big Data is Dead: https://motherduck.com/blog/big-data-is-dead/ Small Data Manifesto: https://motherduck.com/blog/small-data-manifesto/ Small Data SF: https://www.smalldatasf.com/

➡️ Follow Us LinkedIn: https://linkedin.com/company/motherduck X/Twitter: https://twitter.com/motherduck Blog: https://motherduck.com/blog/


Explore the "Small Data" movement, a counter-narrative to the prevailing big data conference hype. This talk challenges the assumption that data scale is the most important feature of every workload, defining big data as any dataset too large for a single machine. We'll unpack why this distinction is crucial for modern data engineering and analytics, setting the stage for a new perspective on data architecture.

Delve into the history of big data systems, starting with the non-linear hardware costs that plagued early data practitioners. Discover how Google's foundational papers on GFS, MapReduce, and Bigtable led to the creation of Hadoop, fundamentally changing how we scale data processing. We'll break down the "big data tax"—the inherent latency and system complexity overhead required for distributed systems to function, a critical concept for anyone evaluating data platforms.

Learn about the architectural cornerstone of the modern cloud data warehouse: the separation of storage and compute. This design, popularized by systems like Snowflake and Google BigQuery, allows storage to scale almost infinitely while compute resources are provisioned on-demand. Understand how this model paved the way for massive data lakes but also introduced new complexities and cost considerations that are often overlooked.
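To make the separation concrete, here is a minimal sketch of the pattern using DuckDB's httpfs extension, where Parquet files in object storage play the storage layer and a throwaway local process supplies the compute; the bucket path is a hypothetical placeholder, not something from the talk.

    import duckdb

    # Compute: an ephemeral local process, discarded once the query finishes.
    con = duckdb.connect()
    con.execute("INSTALL httpfs; LOAD httpfs;")  # enable reads from object storage

    # Storage: Parquet files in S3 persist and scale independently of this process.
    rows = con.execute("""
        SELECT category, COUNT(*) AS n
        FROM read_parquet('s3://my-bucket/events/*.parquet')
        GROUP BY category
    """).fetchall()
    print(rows)

Warehouses like Snowflake and BigQuery industrialize this same split: storage persists, compute is provisioned per query.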

We examine the cracks appearing in the big data paradigm, especially for OLAP workloads. While systems like Snowflake are still dominant, the rise of powerful alternatives like DuckDB signals a shift. We reveal the hidden costs of big data analytics, exemplified by a petabyte-scale query costing nearly $6,000, and argue that for most use cases, it's too expensive to run computations over massive datasets.
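The arithmetic behind that figure is worth writing down. A back-of-the-envelope sketch, assuming on-demand pricing of roughly $6 per TiB scanned (an assumption; check your vendor's current price sheet):

    price_per_tib_usd = 6.0
    scanned_tib = 1024                      # 1 PiB expressed in TiB

    # Scanning the full table once:
    print(f"Full scan: ${price_per_tib_usd * scanned_tib:,.0f}")   # ~ $6,144

    # Scanning only a 2 TiB hot slice of the same table instead:
    print(f"Hot slice: ${price_per_tib_usd * 2:,.0f}")             # $12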

The key to efficient data processing isn't your total data size, but the size of your "hot data" or working set. This talk argues that the revenge of the single node is here, as modern hardware can often handle the actual data queried without the overhead of the big data tax. Right-sizing compute to the working set is a crucial optimization for reducing cost and improving performance in any data warehouse.
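As an illustration of the working-set idea, the sketch below assumes events stored as date-partitioned Parquet files (a hypothetical layout); the filter lets DuckDB prune away cold partitions, so the query touches only the hot data rather than the full history.

    import duckdb

    hot = duckdb.sql("""
        SELECT user_id, COUNT(*) AS sessions
        FROM read_parquet('events/date=*/*.parquet', hive_partitioning = true)
        WHERE "date" >= '2024-01-01'        -- prunes cold partitions entirely
        GROUP BY user_id
    """)
    print(hot.fetchall())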

Discover the core principles for designing systems in a post-big data world. We'll show that since only 1 in 500 users run true big data queries, prioritizing simplicity over premature scaling is key. For low latency, process data close to the user with tools like DuckDB and SQLite. This local-first approach offers a compelling alternative to cloud-centric models, enabling faster, more cost-effective, and innovative data architectures.
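In that spirit, a minimal local-first sketch using SQLite from Python's standard library; the data sits in a file next to the application, so queries never leave the machine.

    import sqlite3

    con = sqlite3.connect("app.db")          # a local file, no server to run
    con.execute("CREATE TABLE IF NOT EXISTS events (user_id TEXT, ts TEXT)")
    con.execute("INSERT INTO events VALUES ('u1', '2024-06-01T12:00:00')")
    con.commit()

    # Millisecond-latency analytics on the user's own machine.
    for row in con.execute("SELECT user_id, COUNT(*) FROM events GROUP BY user_id"):
        print(row)
    con.close()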

In this course, you will learn basic skills that will allow you to use the Databricks Data Intelligence Platform to perform a simple data engineering workflow and support data warehousing endeavors. You will be given a tour of the workspace and be shown how to work with objects in Databricks such as catalogs, schemas, volumes, tables, compute clusters and notebooks. You will then follow a basic data engineering workflow to perform tasks such as creating and working with tables, ingesting data into Delta Lake, transforming data through the medallion architecture, and using Databricks Workflows to orchestrate data engineering tasks. You’ll also learn how Databricks supports data warehousing needs through the use of Databricks SQL, DLT, and Unity Catalog.
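For a taste of that workflow, here is a condensed bronze-to-silver sketch in PySpark. It assumes a Databricks notebook where `spark` is predefined, and the catalog, schema, volume path, and column names are placeholders for illustration rather than anything the course prescribes.

    from pyspark.sql import functions as F

    # Bronze: land the raw files in a Delta table as-is.
    raw = spark.read.json("/Volumes/main/default/landing/orders/")
    raw.write.format("delta").mode("append").saveAsTable("main.default.orders_bronze")

    # Silver: deduplicate and clean for downstream consumers.
    silver = (
        spark.read.table("main.default.orders_bronze")
             .dropDuplicates(["order_id"])
             .withColumn("order_ts", F.to_timestamp("order_ts"))
             .filter(F.col("amount") > 0)
    )
    silver.write.format("delta").mode("overwrite").saveAsTable("main.default.orders_silver")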

This course provides a comprehensive overview of Databricks' modern approach to data warehousing, highlighting how a data lakehouse architecture combines the strengths of traditional data warehouses with the flexibility and scalability of the cloud. You'll learn about the AI-driven features that enhance data transformation and analysis on the Databricks Data Intelligence Platform. Designed for practitioners starting out in data warehousing, it covers the foundational material needed to begin building and managing high-performance, AI-powered data warehouses on Databricks. It also suits practitioners familiar with traditional data warehousing techniques and concepts who want to understand how data warehousing workloads are executed on Databricks.

Think Inside the Box: Constraints Drive Data Warehousing Innovation

As a Head of Data or a one-person data team, keeping the lights on for the business while running all things data-related as efficiently as possible is no small feat. This talk will focus on tactics and strategies to manage within and around constraints, including monetary costs, time and resources, and data volumes.

📓 Resources Big Data is Dead: https://motherduck.com/blog/big-data-is-dead/ Small Data Manifesto: https://motherduck.com/blog/small-data-manifesto/ Why Small Data?: https://benn.substack.com/p/is-excel-... Small Data SF: https://www.smalldatasf.com/

➡️ Follow Us LinkedIn: https://linkedin.com/company/motherduck X/Twitter: https://twitter.com/motherduck Blog: https://motherduck.com/blog/


Learn how your data team can drive innovation and maximize ROI by embracing constraints, drawing inspiration from SpaceX's revolutionary cost-effective approach. This video challenges the "abundance mindset" prevalent in the modern data stack, where easily scalable cloud data warehouses and a surplus of tools often lead to unmanageable data models and underutilized dashboards. We explore a focused data strategy for extracting maximum value from small data, shifting the paradigm from "more data" to more impact.

To maximize value, data teams must move beyond being order-takers and practice strategic stakeholder management. Discover how to use frameworks like the stakeholder engagement matrix to prioritize high-impact business leaders and align your work with core business goals. This involves speaking the language of business growth models, not technical jargon about data pipelines or orchestration, ensuring your data engineering efforts resonate with key decision-makers and directly contribute to revenue-generating activities.

Embracing constraints is key to innovation and effective data project management. We introduce the Iron Triangle—a fundamental engineering concept balancing scope, cost, and time—as a powerful tool for planning data projects and having transparent conversations with the business. By treating constraints not as limitations but as opportunities, data engineers and analysts can deliver higher-quality data products without succumbing to scope creep or uncontrolled costs.

A critical component of this strategy is understanding the Total Cost of Ownership (TCO), which goes far beyond initial compute costs to include ongoing maintenance, downtime, and the risk of vendor pricing changes. Learn how modern, efficient tools like DuckDB and MotherDuck are designed for cost containment from the ground up, enabling teams to build scalable, cost-effective data platforms. By making the true cost of data requests visible, you can foster accountability and make smarter architectural choices. Ultimately, this guide provides a blueprint for resisting data stack bloat and turning cost and constraints into your greatest assets for innovation.
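One way to make TCO visible is to write the formula down. A toy sketch, where every number is a hypothetical placeholder rather than a benchmark:

    def total_cost_of_ownership(compute, storage, maintenance_hours,
                                hourly_rate, downtime_hours, downtime_cost):
        """Annual TCO = direct platform spend + people time + downtime risk."""
        return (compute + storage
                + maintenance_hours * hourly_rate
                + downtime_hours * downtime_cost)

    print(total_cost_of_ownership(
        compute=24_000, storage=3_000,          # annual platform bill
        maintenance_hours=200, hourly_rate=90,  # pipeline upkeep
        downtime_hours=10, downtime_cost=500,   # cost of stale dashboards
    ))  # -> 50000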

Uncover data's hidden connections using graph analytics in BigQuery. This session shows how to use BigQuery's scalable infrastructure for graph analysis directly in your data warehouse. Identify patterns, connections, and influences for fraud detection, drug discovery, social network analysis, and recommendation engines. Join us to explore the latest innovations in graphs and see real-world examples. Transform your data into actionable insights with BigQuery's powerful graph capabilities.
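For a flavor of what this looks like from Python, here is a hedged sketch using the google-cloud-bigquery client. The GRAPH_TABLE query style follows GoogleSQL's graph support, but the `mydataset.fraud_graph` property graph and its Account/Transfers schema are assumptions for illustration; you would first define a property graph over your own tables.

    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default project credentials

    # GoogleSQL graph query over a previously defined property graph
    # (hypothetical name and schema).
    sql = """
    SELECT *
    FROM GRAPH_TABLE(
      mydataset.fraud_graph
      MATCH (a:Account)-[t:Transfers]->(b:Account)
      COLUMNS(a.id AS src, b.id AS dst, t.amount AS amount)
    )
    LIMIT 10
    """
    for row in client.query(sql).result():
        print(row.src, row.dst, row.amount)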