talk-data.com

Topic

Data Modelling

data_governance data_quality metadata_management

355 tagged

Activity Trend

Peak of 18 activities per quarter (2020-Q1 to 2026-Q1)

Activities

355 activities · Newest first

John Giles is a legend in the data modeling world, having authored "The Nimble Elephant" and "The Elephant in the Fridge," and written extensively on the topic. John and I discuss the power of using data model patterns, enterprise data modeling, "data town plans," and much more. It was an honor to chat with John, and I felt like I was conversing with someone who comes from a more advanced dimension than most data practitioners. You'll learn a ton from this discussion.

John's website: https://www.countrye.com.au/

Books: The Nimble Elephant (https://www.amazon.com/Nimble-Elephant-Delivery-Pattern-based-Approach/dp/1935504258), The Elephant in the Fridge (https://www.amazon.com/Elephant-Fridge-Success-Building-Business-Centered/dp/1634624890)

LinkedIn: https://www.linkedin.com/in/john-giles-data/

#data #datamodeling

Data Modeling with Snowflake

This comprehensive guide, "Data Modeling with Snowflake", is your go-to resource for mastering the art of efficient data modeling tailored to the capabilities of the Snowflake Data Cloud. In this book, you will learn how to design agile and scalable data solutions by effectively leveraging Snowflake's unique architecture and advanced features.

What this Book will help me do:
- Understand the core principles of data modeling and how they apply to Snowflake's cloud-native environment.
- Learn to use Snowflake's features, such as Time Travel and zero-copy cloning, to create efficient data solutions.
- Gain hands-on experience with SQL recipes that outline practical approaches to transforming and managing Snowflake data.
- Discover techniques for modeling structured and semi-structured data for real-world business needs.
- Learn to integrate universal modeling frameworks like Star Schema and Data Vault into Snowflake implementations for scalability and maintainability.

Author(s): The author, Serge Gershkovich, is a seasoned expert in database design and Snowflake architecture. With years of experience in the data management field, Serge has dedicated himself to making complex technical subjects approachable to professionals at all levels. His insights in this book are informed by practical applications and real-world experience.

Who is it for? This book is targeted at data professionals, ranging from newcomers to database design to seasoned SQL developers seeking to specialize in Snowflake. If you are looking to understand and apply data modeling practices effectively within Snowflake's architecture, this book is for you. Whether you're refining your modeling skills or getting started with Snowflake, it provides the practical knowledge you need to succeed.

In this reflective conversation, Shane and Alcine wrap up Season 2 by sharing some of their own stories, lenses, and learning around the work. You’ll hear what’s emerging on the ground as Shane and Dr. Dugan try to bring the Street Data model to life through communities of practice. You’ll consider the difference between cultural appropriation and appreciation, tapping into the brilliance of Jo Chrona’s book Wayi Wah! Indigenous Pedagogies: An Act for Reconciliation and Anti-Racist Education. We also learn more about how Alcine’s mother influenced her student-centered pedagogy and how her experiences as a good test taker during desegregation efforts in the 1980s shaped her views on standardized testing. And we say goodbye to our original producer, the incomparable Maya Cueva, who is off to work on a new film and other projects!

For Further Learning: 

Podcasts:
- Cheaper Than Therapy: Avoiding Resentment
- The Cult of Pedagogy with Jennifer Gonzalez

Books:
- Tomorrow, and Tomorrow, and Tomorrow by Gabrielle Zevin
- Wayi Wah! Indigenous Pedagogies by Jo Chrona

Articles:
- Cultivating a Pedagogy of Student Voice by Shane Safir
- Metacognition in the Classroom: Benefits & Strategies

Films:
- Watch Maya’s film On The Divide (https://vimeo.com/ondemand/onthedivide)

Summary

All of the advancements in our technology are based on the principle of abstraction. These abstractions are valuable until they break down, which is inevitable. In this episode the host, Tobias Macey, shares his reflections on recent experiences where the abstractions leaked, and some observations on how to deal with that situation in a data platform architecture.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack. Your host is Tobias Macey, and today I'm sharing some thoughts and observations about abstractions and impedance mismatches from my experience building a data lakehouse with an ELT workflow.

Interview

Introduction

Impact of community tech debt

Hive metastore: new work being done, but not widely adopted

Tensions between automation and correctness

Data type mapping: integer types, complex types (see the sketch after this outline)

Naming things (keys/column names from APIs to databases)

Disaggregated databases: pros and cons

Flexibility and cost control, but not as much tooling investment vs. Snowflake/BigQuery/Redshift

Data modeling: dimensional modeling vs. answering today's questions

What are the most interesting, unexpected, or challenging lessons that you have learned while working on your data platform?

When is ELT the wrong choice?

What do you have planned for the future of your data platform?
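A concrete place the abstraction leaks is automated type mapping from API payloads into warehouse columns. As a rough illustration of the automation-versus-correctness tension noted in the outline above (my own sketch, not code from the episode), consider how sample-based inference can pick a type that future data invalidates:

```python
# Hypothetical sketch: inferring warehouse column types from JSON API records.
# Automated inference is convenient but can be subtly wrong -- e.g. an integer
# column sampled before any large values appear, or an ID field that looks numeric.

from typing import Any

def infer_type(values: list[Any]) -> str:
    """Map a sample of JSON values to a (simplified) warehouse column type."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "VARCHAR"          # no evidence; fall back to the widest type
    if all(isinstance(v, bool) for v in non_null):
        return "BOOLEAN"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in non_null):
        # Correctness risk: a sample that fits in 32 bits doesn't mean future
        # rows will -- choosing BIGINT up front is the safer default.
        return "INTEGER" if all(abs(v) < 2**31 for v in non_null) else "BIGINT"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null):
        return "DOUBLE"
    if all(isinstance(v, (dict, list)) for v in non_null):
        return "JSON"             # complex types: keep raw, model downstream
    return "VARCHAR"

records = [
    {"user_id": 42, "score": 3.7, "tags": ["a", "b"], "active": True},
    {"user_id": 7_000_000_000, "score": 4, "tags": [], "active": None},
]
for column in records[0]:
    print(column, "->", infer_type([r.get(column) for r in records]))
# user_id -> BIGINT (only because the sample happened to include a large value)
```

The conservative fix is to widen defaults (BIGINT over INTEGER, keep complex types as raw JSON), trading storage and ergonomics for correctness.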

Contact Info

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it! Email [email protected] with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

Links

dbt, Airbyte

Podcast Episode

Dagster

Podcast Episode

Trino

Podcast Episode

ELT, Data Lakehouse, Snowflake, BigQuery, Redshift, Technical Debt, Hive Metastore, AWS Glue

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA. Sponsored By: RudderStack

RudderStack provides all your customer data pipelines in one platform. You can collect, transform, and route data across your entire stack with its event streaming, ETL, and reverse ETL pipelines.

RudderStack’s warehouse-first approach means it does not store sensitive information, and it allows you to leverage your existing data warehouse/data lake infrastructure to build a single source of truth for every team.

RudderStack also supports real-time use cases. You can Implement RudderStack SDKs once, then automatically send events to your warehouse and 150+ business tools, and you’ll never have to worry about API changes again.

Visit dataengineeringpodcast.com/rudderstack to sign up for free today, and snag a free T-Shirt just for being a Data Engineering Podcast listener. Support Data Engineering Podcast

How Vercel Builds Dozens of Metrics from One Heterogeneous Table

ABOUT THE TALK: This talk discusses how Vercel leverages dozens of metrics created from one heterogeneous table to drive business, technical, product, and operations decisions across the company. Vercel's approach has empowered technical and non-technical stakeholders to jump into their analytical discovery from the metrics table with more frequent iterations and less involvement from the data team.

Centralizing data and metadata used in creating Vercel's many metrics has increased the number of stakeholders that can participate in analytics, decreased the time needed to troubleshoot outlier events, and removed the data team as a dependency for all data-related tasks.
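The talk doesn't publish Vercel's actual schema, so the following is a hypothetical sketch of the general pattern: one long, heterogeneous table keyed by a metric name, with each business metric defined as a filter plus an aggregation over the same source. The column names (`metric_name`, `team`, `ts`, `value`) and metric definitions are illustrative:

```python
# Illustrative sketch (not Vercel's actual schema): one "long" heterogeneous
# table holds many metric streams; each business metric is a filter + aggregate.
import pandas as pd

events = pd.DataFrame({
    "metric_name": ["deploys", "deploys", "build_secs", "build_secs", "signups"],
    "team":        ["web",     "api",     "web",        "api",        "web"],
    "ts": pd.to_datetime(["2023-05-01", "2023-05-01", "2023-05-01",
                          "2023-05-02", "2023-05-02"]),
    "value":       [12, 7, 341.0, 298.5, 4],
})

# Every metric shares one contract, so new metrics need no new tables --
# stakeholders filter on metric_name and pick an aggregation.
metric_defs = {
    "deploys":    "sum",    # count-like metrics are summed
    "build_secs": "mean",   # duration-like metrics are averaged
    "signups":    "sum",
}

for name, agg in metric_defs.items():
    series = (events[events.metric_name == name]
              .groupby("team")["value"].agg(agg))
    print(f"{name} ({agg}):\n{series}\n")
```

Because every metric shares one contract, adding a metric means adding rows and a definition rather than a new table, which is what lets stakeholders iterate without the data team as a dependency.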

ABOUT THE SPEAKER: Thomas Mickley-Doyle leads analytics and data science initiatives at Vercel, scaling insights across engineering, product, and design. He focuses on making data modeling, analytics, and decision-making more accessible for all users.

ABOUT DATA COUNCIL: Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.

Make sure to subscribe to our channel for the most up-to-date talks from technical professionals on data-related topics, including data infrastructure, data engineering, ML systems, analytics, and AI from top startups and tech companies.

FOLLOW DATA COUNCIL: Twitter: https://twitter.com/DataCouncilAI LinkedIn: https://www.linkedin.com/company/datacouncil-ai/

Malloy An Experimental Language for Data | Google

ABOUT THE TALK: Forcing data through a rectangle shapes the way we solve problems (for example, dimensional fact tables, OLAP Cubes).

Most data isn't rectangular; rather, it exists in hierarchies (orders, items, products, users). Most query results are better returned as a hierarchy (category, brand, product).

Malloy is a new experimental data programming language that, among other things, breaks the rectangle paradigm and several other long-held misconceptions in the way we analyze data.
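Malloy's actual syntax is best seen in its documentation; the point about rectangles versus hierarchies, though, can be sketched in plain Python. A flat result repeats category and brand on every row, while a nested result returns each level exactly once, the shape the talk argues most questions actually want:

```python
# Illustrative reshaping of a flat "rectangle" of rows into the kind of
# nested hierarchy (category -> brand -> products) the talk describes.
from collections import defaultdict

flat_rows = [
    ("apparel", "acme",  "t-shirt"),
    ("apparel", "acme",  "hoodie"),
    ("apparel", "zenco", "socks"),
    ("gear",    "acme",  "backpack"),
]

nested: dict = defaultdict(lambda: defaultdict(list))
for category, brand, product in flat_rows:
    nested[category][brand].append(product)

# The flat form repeats category and brand on every row; the nested form
# returns each exactly once, mirroring how the data is actually shaped.
for category, brands in nested.items():
    print(category)
    for brand, products in brands.items():
        print(f"  {brand}: {products}")
```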

In this talk, Lloyd Tabb shares the ideas behind the Malloy language, semantic data modeling, and his vision for the future of data.

ABOUT THE SPEAKER: Lloyd Tabb spent the last 30 years revolutionizing how the world uses the internet and, by extension, data. He is one of the internet pioneers, having worked at Netscape during the browser wars as the Principal Engineer on Navigator Gold, the first HTML WYSIWYG editor.

Originally a database & languages architect at Borland, Lloyd founded Looker, which Google acquired in 2019. Lloyd's work at Looker helped define the Modern Data Stack.

At Google, Lloyd continues to pursue his passion for data, and love of programming languages through his current project, Malloy.

ABOUT DATA COUNCIL: Data Council (https://www.datacouncil.ai/) is a community and conference series that provides data professionals with the learning and networking opportunities they need to grow their careers.

Make sure to subscribe to our channel for the most up-to-date talks from technical professionals on data-related topics, including data infrastructure, data engineering, ML systems, analytics, and AI from top startups and tech companies.

FOLLOW DATA COUNCIL: Twitter: https://twitter.com/DataCouncilAI LinkedIn: https://www.linkedin.com/company/datacouncil-ai/

Summary

Every business has customers, and a critical element of success is understanding who they are and how they are using the company's products or services. The challenge is that most companies have a multitude of systems that contain fragments of the customer's interactions, and stitching those together is complex and time-consuming. Segment created the Unify product to reduce the burden of building a comprehensive view of customers and synchronizing it to all of the systems that need it. In this episode Kevin Niparko and Hanhan Wang share the details of how it is implemented and how you can use it to build and maintain rich customer profiles.
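The episode notes don't detail Unify's internals, but identity resolution of this kind is commonly framed as finding connected components over shared identifiers (emails, device IDs, user IDs). A minimal union-find sketch of that framing, with made-up identifiers; production systems layer merge rules and limits on top to avoid over-merging:

```python
# Minimal union-find sketch of identity resolution: events that share any
# identifier (email, device ID, user ID) collapse into one customer profile.
# Identifiers here are hypothetical; real systems add rules and priorities.

parent: dict[str, str] = {}

def find(x: str) -> str:
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path compression
        x = parent[x]
    return x

def union(a: str, b: str) -> None:
    parent[find(a)] = find(b)

events = [
    {"email": "ana@example.com", "device": "dev-1"},
    {"device": "dev-1", "user_id": "u-42"},         # links dev-1 to u-42
    {"email": "bo@example.com", "device": "dev-9"}, # a different person
]
for ev in events:
    ids = list(ev.values())
    for other in ids[1:]:
        union(ids[0], other)

# Group identifiers by their root to materialize profiles.
profiles: dict[str, list[str]] = {}
for identifier in parent:
    profiles.setdefault(find(identifier), []).append(identifier)
print(profiles)  # two profiles: ana's identifiers, and bo's
```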

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack. Your host is Tobias Macey, and today I'm interviewing Kevin Niparko and Hanhan Wang about Segment's new Unify product for building and syncing comprehensive customer profiles across your data systems.

Interview

Introduction

How did you get involved in the area of data management?

Can you describe what Segment Unify is and the story behind it?

What are the net-new capabilities that it brings to the Segment product suite?

What are some of the categories of attributes that need to be managed in a prototypical customer profile?

What are the different use cases that are enabled/simplified by the availability of a comprehensive customer profile?

What is the potential impact of more detailed customer profiles on LTV?

How do you manage permissions/auditability of updating or amending profile data?

Can you describe how the Unify product is implemented?

What are the technical challenges that you had to address while developing/launching this product?

What is the workflow for a team who is adopting the Unify product?

What are the other Segment products that need to be in use to take advantage of Unify?

What are some of the most complex edge cases to address in identity resolution?

How does reverse ETL factor into the enrichment process for profile data?

What are some of the issues that you have to account for in synchronizing profiles across platforms/products?

How do you mitigate the impact of "regression to the mean" for systems that don't support all of the attributes that you want to maintain in a profile record?

What are some of the data modeling considerations that you have had to account for to support historical changes (e.g. slowly changing dimensions)?

What are the most interesting, innovative, or unexpected ways that you have seen Segment Unify used?

What are the most interesting, unexpected, or challenging lessons that you have learned while working on Segment Unify?

When is Segment Unify the wrong choice?

What do you have planned for the future of Segment Unify?

Contact Info

Kevin

LinkedIn, Blog

Hanhan

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it! Email [email protected] with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

Expert Data Modeling with Power BI - Second Edition

Expert Data Modeling with Power BI, Second Edition, serves as your comprehensive guide to mastering data modeling using Power BI. With clear explanations, actionable examples, and a focus on hands-on learning, this book takes you through the concepts and advanced techniques that will enable you to build high-performing data models tailored to real-world requirements.

What this Book will help me do:
- Master time intelligence and virtual tables in DAX to enhance your data models.
- Understand best practices for creating efficient Star Schemas and preparing data in Power Query.
- Deploy advanced modeling techniques such as calculation groups, aggregations, and incremental refresh.
- Manage complex data models and streamline them to improve performance.
- Leverage data marts and dataflows within Power BI for modularity and scalability.

Author(s): Soheil Bakhshi is a seasoned expert in data visualization and analytics with extensive experience in leveraging Power BI for business intelligence solutions. Passionate about educating others, he combines practical insights and technical knowledge to make learning accessible and effective. His approachable writing style reflects his commitment to helping readers succeed.

Who is it for? This book is ideal for business intelligence professionals, data analysts, or report developers with basic knowledge of Power BI and experience with Star Schema concepts. Whether you're looking to refine your data modeling skills or expand your expertise in advanced features, this guide aims to help you achieve your goals efficiently.

Summary

The customer data platform is a category of services that was developed early in the current era of cloud services for data processing. When it was difficult to wire together event collection, data modeling, reporting, and activation, it made sense to buy monolithic products that handled every stage of the customer data lifecycle. Now that the data warehouse has taken center stage, a new approach of composable customer data platforms is emerging. In this episode Darren Haken is joined by Tejas Manohar to discuss how Autotrader UK is addressing their customer data needs by building on top of their existing data stack.
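As a rough sketch of what "composable" means in practice (illustrative code, not Autotrader's or Hightouch's implementation): the audience is just a SQL model in the warehouse you already operate, and activation is a sync step that reads it and pushes rows to a downstream tool. The table, column, and function names here are hypothetical, and sqlite3 stands in for the warehouse client:

```python
# Hypothetical sketch of a composable-CDP sync: the audience is just a SQL
# model in the warehouse; "activation" is reading it and pushing rows to a
# downstream tool's API. Table and function names are illustrative.
import sqlite3  # stand-in for the warehouse client

AUDIENCE_SQL = """
SELECT email, lifetime_value
FROM customer_profiles
WHERE lifetime_value > 1000 AND churn_risk = 'high'
"""

def sync_audience(conn: sqlite3.Connection) -> None:
    for email, ltv in conn.execute(AUDIENCE_SQL):
        # A real sync would call the destination tool's API here,
        # with batching, retries, and change detection (only sync diffs).
        print(f"would upsert {email} (LTV {ltv}) to ad platform")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_profiles (email TEXT, lifetime_value REAL, churn_risk TEXT)")
conn.executemany("INSERT INTO customer_profiles VALUES (?, ?, ?)",
                 [("a@example.com", 2400.0, "high"), ("b@example.com", 90.0, "low")])
sync_audience(conn)
```

The design point is that the audience definition lives with the rest of your transformations (e.g. as a dbt model), so the CDP inherits the warehouse's governance and lineage instead of duplicating data in a separate silo.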

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack. Your host is Tobias Macey, and today I'm interviewing Darren Haken and Tejas Manohar about building a composable CDP and how you can start adopting it incrementally.

Interview

Introduction

How did you get involved in the area of data management?

Can you describe what you mean by a "composable CDP"?

What are some of the key ways that it differs from the ways that we think of a CDP today?

What are the problems that you were focused on addressing at Autotrader that are solved by a CDP?

One of the promises of the first-generation CDP was an opinionated way to model your data so that non-technical teams could own this responsibility. What do you see as the risks/tradeoffs of moving CDP functionality into the same data stack as the rest of the organization?

What about companies that don't have the capacity to run a full data infrastructure?

Beyond the core technology of the data warehouse, what are the other evolutions/innovations that allow for a CDP experience to be built on top of the core data stack?

The added burden on core data teams to generate event-driven data models

When iterating toward a CDP on top of the core investment of the infrastructure to feed and manage a data warehouse, what are the typical first steps?

What are some of the components in the ecosystem that help to speed up the time to adoption? (e.g. pre-built dbt packages for common transformations, etc.)

What are the most interesting, innovative, or unexpected ways that you have seen CDPs implemented?

What are the most interesting, unexpected, or challenging lessons that you have learned while working on CDP-related functionality?

When is a CDP (composable or monolithic) the wrong choice?

What do you have planned for the future of the CDP stack?

Contact Info

Darren

LinkedIn @DarrenHaken on Twitter

Tejas

LinkedIn @tejasmanohar on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it! Email [email protected] with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

Links

Autotrader, Hightouch

Customer Studio

CDP == Customer Data Platform

Segment

Podcast Episode

mParticle

Is data modeling on life support? I posed this question to LinkedIn earlier this week. It got a fair number of replies, some supportive and others saying I'm full of sh*t. In this 5-minute Friday nerdy rant, I unpack what I mean by data modeling being on life support, and where I think data modeling needs to go given newer practices like streaming and machine learning, which aren't currently discussed in data modeling circles.

LinkedIn post about data modeling on life support: https://www.linkedin.com/posts/josephreis_dataengineering-datamodeling-data-activity-7048722463010013185-OyIy

#dataengineering #datamodel #data


If you like this show, give it a 5-star rating on your favorite podcast platform.

Purchase Fundamentals of Data Engineering at your favorite bookseller.

Check out my substack: https://joereis.substack.com/

Shane Gibson joins the show to discuss how to make data modeling more accessible, why the world's moved past traditional data modeling, enabling data mesh, and more.

Shane's LinkedIn: https://www.linkedin.com/in/shagility/

Shagility: https://shagility.nz/

Shane's podcasts: https://shagility.nz/podcasts/


If you like this show, give it a 5-star rating on your favorite podcast platform.

Purchase Fundamentals of Data Engineering at your favorite bookseller.

Check out my substack: https://joereis.substack.com/

Data Wrangling with R

Data Wrangling with R guides you through mastering data preparation in the R programming language using tidyverse libraries. You will learn techniques to load, explore, transform, and visualize data effectively, gaining the skills needed for data modeling and insights extraction.

What this Book will help me do:
- Understand how to use R and tidyverse libraries to handle data wrangling tasks.
- Learn methods to work with diverse data types like numbers, strings, and dates.
- Gain proficiency in building visual representations of data using ggplot2.
- Build and validate your first predictive model for useful insights.
- Create an interactive web application with Shiny in R.

Author(s): Gustavo Santos is an experienced data scientist specializing in R programming and data visualization. With a background in statistics and several years of professional experience in industry and academia, Gustavo excels at translating complex data analytics concepts into practical skills. His approach to teaching is hands-on and example-driven, aiming to empower readers to excel in real-world applications.

Who is it for? If you are a data scientist, data analyst, or even a beginner programmer who wants to enhance their data manipulation and visualization skills, this book is perfect for you. Familiarity with R or a general understanding of programming concepts is suggested but not mandatory. It caters to professionals looking to refine their data wrangling workflow and to students aspiring to break into data-centered fields. By the end, you'll be ready to apply data wrangling and visualization tools in your projects.

Summary

Business intelligence has gone through many generational shifts, but each generation has largely maintained the same workflow. Data analysts create reports that the business uses to understand and direct its operations, but the process is very labor- and time-intensive. The team at Omni has taken a new approach by automatically building models based on the queries that are executed. In this episode Chris Merrick shares how they manage integration and automation around the modeling layer and how it improves the organizational experience of business intelligence.
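The notes don't spell out Omni's mechanism, so the following is only a toy illustration of the general idea of building models from executed queries: track how often expressions recur across ad-hoc SQL, and promote the frequent ones into a shared model. The parsing and the promotion threshold here are deliberately naive:

```python
# Toy sketch of query-driven modeling: count how often each expression is
# selected across ad-hoc queries, and promote frequent ones into a shared
# model. The regex parsing and threshold are deliberately simplistic.
import re
from collections import Counter

query_log = [
    "SELECT SUM(amount) AS revenue, region FROM orders GROUP BY region",
    "SELECT SUM(amount) AS revenue FROM orders WHERE region = 'EU'",
    "SELECT COUNT(*) AS n_orders, region FROM orders GROUP BY region",
]

expr_counts: Counter[str] = Counter()
for q in query_log:
    select_list = re.search(r"SELECT\s+(.*?)\s+FROM", q, re.I).group(1)
    for expr in select_list.split(","):
        expr_counts[expr.strip()] += 1

# Expressions used in more than one query get promoted to the shared model.
model = {expr: count for expr, count in expr_counts.items() if count > 1}
print("promoted to model:", model)
# {'SUM(amount) AS revenue': 2, 'region': 2}
```

A real system would parse the SQL properly and track joins, filters, and grain, but the feedback loop is the point: the model grows out of observed usage instead of being specified entirely up front.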

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. Truly leveraging and benefiting from streaming data is hard: the data stack is costly, difficult to use, and still has limitations. Materialize breaks down those barriers with a true cloud-native streaming database, not simply a database that connects to streaming systems. With a PostgreSQL-compatible interface, you can now work with real-time data using ANSI SQL, including the ability to perform multi-way complex joins, which support stream-to-stream, stream-to-table, table-to-table, and more, all in standard SQL. Go to dataengineeringpodcast.com/materialize today and sign up for early access to get started. If you like what you see and want to help make it better, they're hiring across all functions! Your host is Tobias Macey, and today I'm interviewing Chris Merrick about the Omni Analytics platform and how they are adding automatic data modeling to your business intelligence.

Interview

Introduction

How did you get involved in the area of data management?

Can you describe what Omni Analytics is and the story behind it?

What are the core goals that you are trying to achieve with building Omni?

Business intelligence has gone through many evolutions. What are the unique capabilities that Omni Analytics offers over other players in the market?

What are the technical and organizational anti-patterns that typically grow up around BI systems?

What are the elements that contribute to BI being such a difficult product to use effectively in an organization?

Can you describe how you have implemented the Omni platform?

How have the design/scope/goals of the product changed since you first started working on it?

What does the workflow for a team using Omni look like?

What are some of the developments in the broader ecosystem that have made your work possible?

What are some of the positive and negative inspirations that you have drawn from the experience that you and your team-mates have gained in previous businesses?

What are the most interesting, innovative, or unexpected ways that you have seen Omni used?

What are the most interesting, unexpected, or challenging lessons that you have learned while working on Omni?

When is Omni the wrong choice?

What do you have planned for the future of Omni?

Contact Info

LinkedIn @cmerrick on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show, then tell us about it! Email [email protected] with your story. To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

Links

Omni Analytics, Stitch, RJ Metrics, Looker

Podcast Episode

Singer, dbt

Podcast Episode

Teradata, Fivetran, Apache Arrow

Podcast Episode

DuckDB

Podcast Episode

BigQuery, Snowflake

Podcast Episode

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA. Sponsored By: Materialize

Looking for the simplest way to get the freshest data possible to your teams? Because let's face it: if real-time were easy, everyone would be using it. Look no further than Materialize, the streaming database you already know how to use.

Materialize’s PostgreSQL-compatible interface lets users leverage the tools they already use, with unsurpassed simplicity enabled by full ANSI SQL support. Delivered as a single platform with separation of storage and compute, strict serializability, active replication, horizontal scalability, and workload isolation, Materialize is now the fastest way to build products with streaming data, drastically reducing the time, expertise, cost, and maintenance traditionally associated with implementing real-time features.

Sign up now for early access to Materialize and get started with the power of streaming data with the same simplicity and low implementation cost as batch cloud data warehouses.

Go to materialize.com. Support Data Engineering Podcast

Data Modeling with Tableau

"Data Modeling with Tableau" provides a comprehensive guide to effectively utilizing Tableau Prep and Tableau Desktop for building elegant data models that drive organizational insights. You'll explore robust data modeling strategies and governance practices tailored to Tableau's diverse toolset, empowering you to make faster and more informed decisions based on data. What this Book will help me do Understand the fundamentals of data modeling in Tableau using Prep Builder and Desktop. Learn to optimize data sources for performance and better query capabilities. Implement secure and scalable governance strategies with Tableau Server and Cloud. Use advanced Tableau features like Ask Data and Explain Data to enable powerful analytics. Apply best practices for sharing and extending data models within your organization. Author(s) Kirk Munroe is an experienced data professional with a deep understanding of Tableau-driven analytics. With years of in-field expertise, Kirk now dedicates his career to helping businesses unlock their data's potential through effective Tableau solutions. His hands-on approach ensures this book is practical and approachable. Who is it for? This book is ideal for data analysts and business analysts aiming to enhance their skills in data modeling. It is also valuable for professionals such as data stewards, looking to implement secure and performant data strategies. If you seek to make enterprise data more accessible and actionable, this book is for you.

How to scale your data team, hosted by Tasman Analytics

Scaling data teams from zero is hard: there are no easy shortcuts, and it is hard to find clear examples to learn from. That’s why we are very excited to co-present the work we did at On Deck over the last year: starting in Summer 2021, we built a data team from scratch, relying heavily on dbt as the core data modelling environment. Come hear us talk about how we set up the team, prioritised the many different requirements from an ever-expanding team of stakeholders, and, after just a few months, succeeded in moving On Deck away from a no-code data architecture (with more than 400 SaaS tools) and towards a centralised data model in dbt. We think the lessons (and especially the pitfalls) are worth telling!

Coalesce 2023 is coming! Register for free at https://coalesce.getdbt.com/.

Pro DAX and Data Modeling in Power BI: Creating the Perfect Semantic Layer to Drive Your Dashboard Analytics

Develop powerful data models that bind data from disparate sources into a coherent whole. Then extend your data models using DAX, the query language that underpins Power BI, to create reusable measures that deliver finely-crafted custom calculations in your dashboards. This book starts off teaching you how to define and enhance the core structures of your data model to make it a true semantic layer that transforms complex data into familiar business terms. You'll learn how to create calculated columns to solve basic analytical challenges, then move up to mastering DAX measures to finely slice and dice your data. The book also shows how to handle temporal analysis in Power BI using a Date dimension, and how DAX Time Intelligence functions can simplify your analysis of data over time. Finally, the book shows how to extend DAX to filter and calculate datasets and to develop DAX table functions and variables for complex queries.

What You Will Learn:
- Create clear and efficient data models that support in-depth analytics
- Define core attributes such as data types and standardized formatting consistently throughout a data model
- Define cross-filtering settings to enhance the data model
- Make use of DAX to create calculated columns and custom tables
- Extend your data model with custom calculations and reusable measures using DAX
- Perform time-based analysis using a Date dimension and Time Intelligence functions

Who This Book Is For: Everyone from the CEO to the business intelligence developer, and from BI and data architects and analysts to power users and IT managers, can use this book to outshine the competition and create the data framework they need, along with interactive dashboards using Power BI.

We talked about:

- Nikola’s background
- Making the first steps towards a transition to BI and Analytics Engineering
- Learning the skills necessary to transition to Analytics Engineering
- The in-between period: from Marketing to Analytics Engineering
- Nikola’s current responsibilities
- Understanding what a Data Model is
- Tools needed to work as an Analytics Engineer
- The Analytics Engineering role over time
- The importance of dbt for Analytics Engineers
- Where can one learn about data modeling theory?
- Going from Ancient Greek and Latin to understanding data (just-in-time learning)
- The importance of having domain knowledge for analytics engineering
- Suggestions for those wishing to transition into analytics engineering
- The importance of having a mentor when transitioning
- Finding a mentor
- Helpful newsletters and blogs
- Finding Nikola online

Links:

Nikola's LinkedIn account: https://www.linkedin.com/in/nikola-maksimovic-40188183/

ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Learning Google Analytics

Why is Google Analytics 4 the most modern data model available for digital marketing analytics? Rather than simply reporting what has happened, GA4's new cloud integrations enable more data activation, linking online and offline data across all your streams to provide end-to-end marketing data. This practical book prepares you for the future of digital marketing by demonstrating how GA4 supports these additional cloud integrations. Author Mark Edmondson, Google Developer Expert for Google Analytics and Google Cloud, provides a concise yet comprehensive overview of GA4 and its cloud integrations. Data, business, and marketing analysts will learn major facets of GA4's powerful new analytics model, with topics including data architecture and strategy, and data ingestion, storage, and modeling. You'll explore common data activation use cases and get the guidance you need to implement them.

You'll learn:
- How Google Cloud integrates with GA4
- The potential use cases that GA4 integrations can enable
- Skills and resources needed to create GA4 integrations
- How much GA4 data capture is necessary to enable use cases
- The process of designing dataflows from strategy through data storage, modeling, and activation
- How to adapt the use cases to fit your business needs

Build scalable data products leveraging user stitching

User stitching can be cumbersome: third-party tracking, expiring cookies, browser privacy features, and users not being logged in all get in the way. These challenges require companies to rethink how they collect and use data, and how they want to establish a mutually beneficial relationship with their users. In this session, learn how to solve these problems with an architecture that allows you to implement and control first-party tracking with data creation and data modelling, enabling you to easily set up user stitching. Learn how this directly drives the development of smarter data products that leverage user identity resolution, leading to more user engagement and growth.
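As a minimal sketch of the stitching step itself (assumed field names, not the speakers' implementation): a login event links the first-party cookie ID to a known user ID, which lets events captured before login be attributed retroactively:

```python
# Minimal sketch of first-party user stitching: a login event maps the
# anonymous cookie ID to a known user ID, letting pre-login events be
# attributed retroactively. Field names are illustrative.

events = [
    {"cookie_id": "ck-1", "user_id": None,   "action": "view_pricing"},
    {"cookie_id": "ck-1", "user_id": "u-42", "action": "login"},
    {"cookie_id": "ck-2", "user_id": None,   "action": "view_docs"},  # never logs in
]

# Pass 1: build the stitching map from identified events.
cookie_to_user = {e["cookie_id"]: e["user_id"] for e in events if e["user_id"]}

# Pass 2: resolve every event, including those that happened before login.
for e in events:
    resolved = e["user_id"] or cookie_to_user.get(e["cookie_id"])
    print(e["action"], "->", resolved or "anonymous")
# view_pricing -> u-42   (stitched retroactively)
# login        -> u-42
# view_docs    -> anonymous
```

Owning the tracking first-party is what makes this possible: the cookie ID is yours rather than a third party's, so it survives long enough to be joined against login events in your own models.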

Check slides here: https://docs.google.com/presentation/d/17s6qEAXyt1dhHmvxvoibs0X88RxlDjGjnDojYNtGYeQ/edit?usp=sharing

Coalesce 2023 is coming! Register for free at https://coalesce.getdbt.com/.