talk-data.com

Topic: Business Intelligence (BI)

Tags: data_visualization, reporting, analytics

1211 tagged activities

Activity trend: peak of 111 activities per quarter, 2020-Q1 through 2026-Q1

Activities (1211 activities · Newest first)

Summary

Designing a data platform is a complex and iterative undertaking that requires balancing many conflicting needs. Designing a platform that relies on a data lake as its central architectural tenet adds further layers of difficulty. Srivatsan Sridharan has had the opportunity to design, build, and run data lake platforms for both Yelp and Robinhood, with many valuable lessons learned from each experience. In this episode he shares his insights and advice on how to approach such an undertaking in your own organization.

Announcements

- Hello and welcome to the Data Engineering Podcast, the show about modern data management.
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- This episode is brought to you by Acryl Data, the company behind DataHub, the leading developer-friendly data catalog for the modern data stack. Open Source DataHub is running in production at several companies like Peloton, Optum, Udemy, Zynga, and others. Acryl Data provides DataHub as an easy-to-consume SaaS product which has been adopted by several companies. Sign up for the SaaS product at dataengineeringpodcast.com/acryl
- RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their state-of-the-art reverse ETL pipelines enable you to send enriched data to any cloud tool. Sign up free… or just get the free t-shirt for being a listener of the Data Engineering Podcast at dataengineeringpodcast.com/rudder.
- Struggling with broken pipelines? Stale dashboards? Missing data? If this resonates with you, you’re not alone. Data engineers struggling with unreliable data need look no further than Monte Carlo, the leading end-to-end Data Observability Platform! Trusted by the data teams at Fox, JetBlue, and PagerDuty, Monte Carlo solves the costly problem of broken data pipelines. Monte Carlo monitors and alerts for data issues across your data warehouses, data lakes, dbt models, Airflow jobs, and business intelligence tools, reducing time to detection and resolution from weeks to just minutes. Monte Carlo also gives you a holistic picture of data health with automatic, end-to-end lineage from ingestion to the BI layer directly out of the box. Start trusting your data with Monte Carlo today! Visit dataengineeringpodcast.com/montecarlo to learn more.
- Your host is Tobias Macey and today I’m interviewing Srivatsan Sridharan about the technological, staffing, and design considerations for building a data platform.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you describe what your experience has been with designing and implementing data platforms?
- What are the elements that you have found to be common requirements across organizations and data characteristics?
- What are the architectural elements that require the most detailed consideration based on organizational needs and data requirements?
- How has the ecosystem for building maintainable and usable data lakes matured over the past few years?

- What are the elements that are still cumbersome or intractable?

- The streaming ecosystem has also gone t

Hjalmar Gislason is the Founder and CEO of GRID.is, a BI notebook company that has been making big waves in the field in the last few years. He tells us all about GRID.is and the rise of data notebooks. In this episode, you'll learn:

- [0:05:50] Hjalmar explains exactly what a data notebook is and how it compares to other tools.
- [0:11:52] Reasons to consider using a data notebook along with a range of other tools.
- [0:27:30] The 'living' quality of the models that can be created on GRID.
- [0:33:28] A special offer for AOF from Hjalmar and how to sign up immediately!

For full show notes, and the links mentioned, visit: https://bibrainz.com/podcast/85

Enjoyed the show? Please leave us a review on iTunes.

Summary

Many of the events, ideas, and objects that we try to represent through data have a high degree of connectivity in the real world. These connections are best represented as graphs, which allow efficient and accurate analysis of the relationships they encode. TigerGraph is a leading database that offers a highly scalable and performant native graph engine for powering graph analytics and machine learning. In this episode Jon Herke shares how TigerGraph customers are taking advantage of those capabilities to achieve meaningful discoveries in their fields, the utilities that it provides for modeling and managing your connected data, and some of his own experiences working with the platform before joining the company.
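As a hedged illustration of the kind of workflow described above, the sketch below uses the pyTigerGraph Python client to run a query against a TigerGraph instance. The host, graph name, credentials, and installed query name are all hypothetical placeholders rather than details from the episode, and a TigerGraph Cloud deployment may additionally require token-based authentication.

    # Minimal sketch: connecting to TigerGraph and running an installed query.
    # All connection details and the query name are hypothetical.
    from pyTigerGraph import TigerGraphConnection

    conn = TigerGraphConnection(
        host="https://example.i.tgcloud.io",  # placeholder host
        graphname="Social",                   # placeholder graph
        username="tigergraph",
        password="changeme",
    )

    # Run a query previously installed on the server, e.g. one that
    # returns friends-of-friends for a given person vertex.
    results = conn.runInstalledQuery("friends_of_friends", params={"person": "Alice"})
    print(results)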

Announcements

- Hello and welcome to the Data Engineering Podcast, the show about modern data management.
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- This episode is brought to you by Acryl Data, the company behind DataHub, the leading developer-friendly data catalog for the modern data stack. Open Source DataHub is running in production at several companies like Peloton, Optum, Udemy, Zynga, and others. Acryl Data provides DataHub as an easy-to-consume SaaS product which has been adopted by several companies. Sign up for the SaaS product at dataengineeringpodcast.com/acryl
- RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their state-of-the-art reverse ETL pipelines enable you to send enriched data to any cloud tool. Sign up free… or just get the free t-shirt for being a listener of the Data Engineering Podcast at dataengineeringpodcast.com/rudder.
- Struggling with broken pipelines? Stale dashboards? Missing data? If this resonates with you, you’re not alone. Data engineers struggling with unreliable data need look no further than Monte Carlo, the leading end-to-end Data Observability Platform! Trusted by the data teams at Fox, JetBlue, and PagerDuty, Monte Carlo solves the costly problem of broken data pipelines. Monte Carlo monitors and alerts for data issues across your data warehouses, data lakes, dbt models, Airflow jobs, and business intelligence tools, reducing time to detection and resolution from weeks to just minutes. Monte Carlo also gives you a holistic picture of data health with automatic, end-to-end lineage from ingestion to the BI layer directly out of the box. Start trusting your data with Monte Carlo today! Visit dataengineeringpodcast.com/montecarlo to learn more.
- Your host is Tobias Macey and today I’m interviewing Jon Herke about TigerGraph, a distributed native graph database.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you describe what TigerGraph is and the story behind it?
- What are some of the core use cases that you are focused on supporting?
- How has TigerGraph changed over the past 4 years since I spoke with Todd Blaschka at the Open Data Science Conference?
- How has the ecosystem of graph databases changed in usage and design in recent years?
- What are some of the persi

Amit Prakash is Co-founder and CTO at ThoughtSpot. He has a deep background in search, having previously led the AdSense engineering team at Google and served on the early Bing team at Microsoft. In this conversation with Tristan and Julia, Amit gets real about the promise of AI in data: which applications are being widely used today, and which are still a few years out? For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.  The Analytics Engineering Podcast is sponsored by dbt Labs.

Summary

The predominant pattern for data integration in the cloud has become extract, load, and then transform, or ELT. Matillion was an early innovator of that approach, and in this episode CTO Ed Thompson explains how they have evolved the platform to keep pace with the rapidly changing ecosystem. He describes how the platform is architected, the challenges related to selling cloud technologies into enterprise organizations, and how you can adopt Matillion for your own workflows to reduce the maintenance burden of data integration.

Announcements

- Hello and welcome to the Data Engineering Podcast, the show about modern data management.
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker, and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription.
- Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days or even weeks. By the time errors have made their way into production, it’s often too late and damage is done. Datafold built automated regression testing to help data and analytics engineers deal with data quality in their pull requests. Datafold shows how a change in SQL code affects your data, both on a statistical level and down to individual rows and values, before it gets merged to production. No more shipping and praying; you can now know exactly what will change in your database! Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.
- Struggling with broken pipelines? Stale dashboards? Missing data? If this resonates with you, you’re not alone. Data engineers struggling with unreliable data need look no further than Monte Carlo, the leading end-to-end Data Observability Platform! Trusted by the data teams at Fox, JetBlue, and PagerDuty, Monte Carlo solves the costly problem of broken data pipelines. Monte Carlo monitors and alerts for data issues across your data warehouses, data lakes, dbt models, Airflow jobs, and business intelligence tools, reducing time to detection and resolution from weeks to just minutes. Monte Carlo also gives you a holistic picture of data health with automatic, end-to-end lineage from ingestion to the BI layer directly out of the box. Start trusting your data with Monte Carlo today! Visit dataengineeringpodcast.com/montecarlo to learn more.
- Your host is Tobias Macey and today I’m interviewing Ed Thompson about Matillion, a cloud-native data integration platform for accelerating your time to analytics.

Interview

- Introduction
- How did you get involved in the area of data management?

Artificial Intelligence with Power BI

Discover how to enhance your data analysis with 'Artificial Intelligence with Power BI,' a resource designed to teach you how to leverage Power BI's AI capabilities. You will learn practical methods for enriching your analytics with forecasting, anomaly detection, and machine learning, equipping you to create intelligent, insightful BI reports.

What this Book will help me do

- Learn how to apply AI capabilities such as forecasting and anomaly detection to enrich your reports and drive actionable insights.
- Explore data preparation techniques optimized for AI, ensuring your datasets are structured for advanced analytics.
- Develop skills to integrate Azure Machine Learning and Cognitive Services into Power BI, expanding your analytical toolset.
- Understand how to build Q&A interfaces and integrate Natural Language Processing into your BI solutions.
- Gain expertise in training and deploying your own machine learning models to achieve tailored insights and predictive analytics.

Author(s)

Diepeveen is an experienced data analyst and Power BI expert with a passion for making advanced analytics accessible to professionals. With years of hands-on experience in the data analytics field, they deliver insights using intuitive, practical approaches through clear and engaging tutorials.

Who is it for?

This book is ideal for data analysts and BI developers who aim to expand their analytics capabilities with AI. Readers should already be familiar with Power BI and are looking for a resource to teach them how to incorporate predictive and advanced AI techniques into their reporting workflow. Whether you're seeking to gain a professional edge or enhance your organization's data storytelling and insights, this guide is perfect for you.

The Tableau Workshop

The Tableau Workshop offers a comprehensive, hands-on guide to mastering data visualization with Tableau. Through practical exercises and engaging examples, you will learn how to prepare, analyze, and visualize data to uncover valuable business insights. By completing this book, you will confidently understand the key concepts and tools needed to create impactful data-driven visual stories.

What this Book will help me do

- Master the use of Tableau Desktop and Tableau Prep for data visualization tasks.
- Gain the ability to prepare and process data for effective analysis.
- Learn to choose and utilize the most appropriate chart types for different scenarios.
- Develop the skills to create interactive dashboards that engage stakeholders.
- Understand how to perform calculations to extract deeper insights from data.

Author(s)

Sumit Gupta, Pinto, Shweta Savale, JC Gillet, and Cherven are experts in the field of data analytics and visualization. With diverse backgrounds in business intelligence and hands-on experience with industry tools like Tableau, they bring valuable insights to this book. Their collaborative effort offers practical, real-world knowledge tailored to help learners excel in Tableau and data visualization. With their passion for making technical concepts accessible, they guide readers step by step through their learning journey.

Who is it for?

This book is ideal for professionals, analysts, or students looking to delve into the world of data visualization with Tableau. Whether you're a complete beginner seeking foundational knowledge, or an intermediate user aiming to refine your skills, this book offers the practical insights you need. It's designed for those who want to master Tableau tools, explore meaningful data insights, and effectively communicate them through engaging dashboards and stories.

Summary

A huge amount of effort goes into modeling and shaping data to make it available for analytical purposes. This is often due to the need to simplify the final queries so that they are performant for visualization or limited exploration. In order to cut down the level of effort involved in making data usable, Matthew Halliday and his co-founders created Incorta as an end-to-end, in-memory analytical engine that removes barriers to insights on your data. In this episode he explains how the system works, the use cases that it empowers, and how you can start using it for your own analytics today.

Announcements

- Hello and welcome to the Data Engineering Podcast, the show about modern data management.
- When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
- Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker, and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3000 on an annual subscription.
- Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days or even weeks. By the time errors have made their way into production, it’s often too late and damage is done. Datafold built automated regression testing to help data and analytics engineers deal with data quality in their pull requests. Datafold shows how a change in SQL code affects your data, both on a statistical level and down to individual rows and values, before it gets merged to production. No more shipping and praying; you can now know exactly what will change in your database! Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.
- Struggling with broken pipelines? Stale dashboards? Missing data? If this resonates with you, you’re not alone. Data engineers struggling with unreliable data need look no further than Monte Carlo, the leading end-to-end Data Observability Platform! Trusted by the data teams at Fox, JetBlue, and PagerDuty, Monte Carlo solves the costly problem of broken data pipelines. Monte Carlo monitors and alerts for data issues across your data warehouses, data lakes, dbt models, Airflow jobs, and business intelligence tools, reducing time to detection and resolution from weeks to just minutes. Monte Carlo also gives you a holistic picture of data health with automatic, end-to-end lineage from ingestion to the BI layer directly out of the box. Start trusting your data with Monte Carlo today! Visit dataengineeringpodcast.com/montecarlo to learn more.
- Your host is Tobias Macey and today I’m interviewing Matthew Halliday about Incorta, an in-memory, unified data and analytics platform as a service.

Interview

- Introduction
- How did you g

Microsoft Power BI Performance Best Practices

"Microsoft Power BI Performance Best Practices" is a thorough guide to mastering efficiently operating Power BI solutions. This book walks you through optimizing every layer of a Power BI project, from data transformations to architecture, equipping you with the ability to create robust and scalable analytics solutions. What this Book will help me do Understand how to set realistic performance goals for Power BI projects and implement ongoing performance monitoring. Apply effective architectural and configuration strategies to improve Power BI solution efficiency. Learn practices for constructing and optimizing data models and implementing Row-Level Security effectively. Utilize tools like DAX Studio and VertiPaq Analyzer to detect and resolve common performance bottlenecks. Gain deep knowledge of Power BI Premium and techniques for handling large-scale data solutions using Azure. Author(s) Bhavik Merchant is a recognized expert in business intelligence and analytics solutions. With extensive experience in designing and implementing Power BI solutions across industries, he brings a pragmatic approach to solving performance issues in Power BI. Bhavik's writing style reflects his passion for teaching, ensuring readers gain practical knowledge they can directly apply to their work. Who is it for? This book is designed for data analysts, BI developers, and data professionals who have foundational knowledge of Power BI and aim to elevate their skills to construct high-performance analytics solutions. It is particularly suited to individuals seeking guidance on best practices and tools for optimizing Power BI applications.

Introducing Charticulator for Power BI: Design Vibrant and Customized Visual Representations of Data

Create stunning and complex visualizations using the amazing Charticulator custom visual in Power BI. Charticulator offers users immense power to generate visuals and graphics. To a beginner, there are myriad settings and options that can be combined in what feels like an unlimited number of combinations, earning it the unfair label of "the DAX of the charting world." This reputation is undeserved. This book is your start-to-finish guide to using Charticulator, a custom visualization software that Microsoft integrated into Power BI Desktop so that Power BI users can create incredibly powerful, customized charts and graphs. You will learn the concepts that underpin the software, journeying through every building block of chart design, enabling you to combine these parts to create spectacular visuals that represent the story of your data. Unlike other custom Power BI visuals, Charticulator runs in a separate application window within Power BI with its own interface and requires a different set of interactions and associated knowledge. This book covers the ins and outs of all of them.

What You Will Learn

- Generate inspirational and technically competent visuals with no programming or other specialist technical knowledge
- Create charts that are not restricted to conventional chart types such as bar, line, or pie
- Limit the use of diverse Power BI custom visuals to one Charticulator custom visual
- Alleviate frustrations with the limitations of default chart types in Power BI, such as being able to plot data on only one categorical axis
- Use a much richer set of options to compare different sets of data
- Re-use your favorite or most often used chart designs with Charticulator templates

Who This Book Is For

The average Power BI user. It assumes no prior knowledge on the part of the reader other than being able to open Power BI Desktop, import data, and create a simple Power BI visual. Readers' experiences may vary, from people attending a Power BI training course to those with varying skills and abilities, from SQL developers and advanced Excel users to people with limited data analysis experience and technical skills.

Getting Started with Elastic Stack 8.0

Discover how to harness the power of the Elastic Stack 8.0 to manage, analyze, and secure complex data environments. You will learn to combine components such as Elasticsearch, Kibana, Logstash, and more to build scalable and effective solutions for your organization. By focusing on hands-on implementations, this book ensures you can apply your knowledge to real-world use cases.

What this Book will help me do

- Set up and manage Elasticsearch clusters tailored to various architecture scenarios.
- Utilize Logstash and Elastic Agent to ingest and process diverse data sources efficiently.
- Create interactive dashboards and data models in Kibana, enabling business intelligence insights.
- Implement secure and effective search infrastructures for enterprise applications.
- Deploy Elastic SIEM to fortify your organization's security against modern cybersecurity threats.

Author(s)

Asjad Athick is a seasoned technologist and author with expertise in developing scalable data solutions. With years of experience working with the Elastic Stack, Asjad brings a pragmatic approach to teaching complex architectures. His dedication to explaining technical concepts in an accessible manner makes this book a valuable resource for learners.

Who is it for?

This book is ideal for developers seeking practical knowledge in search, observability, and security solutions using Elastic Stack. Solutions architects who aim to design scalable data platforms will also benefit greatly. Even tech leads or managers keen to understand the Elastic Stack's impact on their operations will find the insights valuable. No prior experience with Elastic Stack is needed.
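To make the ingest-and-search workflow concrete, here is a minimal sketch using the official Elasticsearch Python client against a local 8.x cluster. The index name, document fields, and endpoint are illustrative assumptions, not examples taken from the book.

    # Minimal sketch: index a document and search it back with the official
    # elasticsearch Python client (8.x API). Index name and fields are
    # illustrative.
    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    # Ingest a single document into a hypothetical "sales" index.
    es.index(index="sales", id="1", document={"region": "EMEA", "amount": 1250.0})
    es.indices.refresh(index="sales")  # make it visible to search immediately

    # Query it back with a simple match query.
    resp = es.search(index="sales", query={"match": {"region": "EMEA"}})
    for hit in resp["hits"]["hits"]:
        print(hit["_source"])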

For Danielle Crop, the Chief Data Officer of Albertsons, to draw distinctions between “digital” and “data” only limits the ability of an organization to create useful products. One of the reasons I asked Danielle on the show is her background as a CDO and former SVP of digital at AMEX, where she also managed product and design groups. My theory is that data leaders who have been exposed to the worlds of software product and UX design are prone to approach their data product work differently, so that’s what we dug into in this episode. It didn’t take long for Danielle to share how she pushes her data science team to collaborate with business product managers for a “cross-functional, collaborative” end result. This also means getting the team to understand what their models are personalizing, and how customers experience the data products they use. In short, for her, it is about getting the data team to focus on “outcomes” vs. “outputs.”

Scaling some of the data science and ML modeling work at Albertsons is a big challenge, and we talked about one of the big use cases she is trying to enable for customers, as well as one “real-life” non-digital experience that her team’s data science efforts are behind.

The big takeaway for me here was hearing how a CDO like Danielle is really putting customer experience and the company’s brand at the center of their data product work, as opposed to solely focusing on ML model development, dashboard/BI creation, and seeing data as a raw ingredient that lives in a vacuum, isolated from people.

In this episode, we cover:

- Danielle’s take on the “D” in CDO: is the distinction between “digital” and “data” even relevant, especially for a food and drug retailer? (01:25)
- The role of data product management and design in her org and how UX (i.e., shopper experience) is influenced by and considered in her team’s data science work (06:05)
- How Danielle’s team thinks about “customers,” particularly in the context of internal stakeholders vs. grocery shoppers (10:20)
- Danielle’s current and future plans for bringing her data team into stores to better understand shoppers and customers (11:11)
- How Danielle’s data team works with the digital shopper experience team (12:02)
- “Outputs” versus “outcomes” for product managers, data science teams, and data products (16:30)
- Building customer loyalty, in-store personalization, and long-term brand interaction with data science at Albertsons (20:40)
- How Danielle and her team at Albertsons measure the success of their data products (24:04)
- Finding the problems, building the solutions, and connecting the data to the non-technical side of the company (29:11)

Quotes from Today’s Episode

“Data always comes from somewhere, right? It always has a source. And in our modern world, most of that source is some sort of digital software. So, to distinguish your data from its source is not very smart as a data scientist. You need to understand your data very well, where it came from, how it was developed, and software is a massive source of data. [As a CDO], I think it’s not important to distinguish between [data and digital]. It is important to distinguish between roles and responsibilities, you need different skills for these different areas, but to create an artificial silo between them doesn’t make a whole lot of sense to me.” - Danielle (03:00)

“Product managers need to understand what the customer wants, what the business needs, and how to pass that along to data scientists; data scientists need to understand how that’s affecting business outcomes. That’s how I see this all working. And it depends on what type of models they’re customizing and building, right? Are they building personalization models that are going to be a digital asset? Are they building automation models that will go directly to some sort of operational activity in the store? What are they trying to solve?” - Danielle (06:30)

“In a company that sells products—groceries—to individuals, personalization is a huge opportunity. How do we make that experience, both in-digital and in-store, more relevant to the customer, more sticky and build loyalty with those customers? That’s the core problem, but underneath that is you got to build a lot of models that help personalize that experience. When you start talking about building a lot of different models, you need scale.”  - Danielle (9:24)

“[Customer interaction in the store] is a true big data problem, right, because you need to use the WiFi devices, et cetera, that you have in store that are pinging the devices at all times, and it’s a massive amount of data. Trying to weed through that and find the important signals that help us to actually drive that type of personalized experience is challenging. No one’s gotten there yet. I hope that we’ll be the first.” - Danielle (19:50)

“I can imagine a checkout clerk who doesn’t want to talk to the customer, despite a data-driven suggestion appearing on the clerk’s monitor as to how to personalize a given customer interaction. The recommendation suggested to the clerk may be accurate from a data science point of view, but if the clerk doesn’t actually act on it, then the data product didn’t provide any value. When I train people in my seminar, I try to get them thinking about that last mile. It may not be data science work, and maybe you have a big enough org where that clerk/customer experience is someone else’s responsibility, but being aware that this is a fault point and having a cross-team perspective is key.” - Brian @rhythmspice (24:50)

“We’re going through a moment in time in which trust in data is shaky. What I’d like people to understand and know on a broader philosophical level, is that in order to be able to understand data and use it to make decisions, you have to know its source. You have to understand its source. You have to understand the incentives around that source of data….you have to look at the data from the perspective of what it means and what the incentives were for creating it, and then analyze it, and then give an output. And fortunately, most statisticians, most data scientists, most people in most fields that I know, are incredibly motivated to be ethical and accurate in the information that they’re putting out.” - Danielle (34:15)

Gordon Wong is on a mission. A long-time business intelligence leader who has led data & analytics teams at HubSpot and FitBit, Wong believes BI teams aren’t data-driven enough. He says BI leaders need to think of themselves as small business owners and aggressively court and manage customers. Too many, he says, don’t have metrics to track customer engagement and usage. In short, BI teams need to eat their own dog food and build success metrics to guide their activities.

If you are a data or analytics leader, do you know the value your team contributes to the business? Do you have KPIs for business intelligence? Can you measure the impact of data and analytics endeavors in terms the business understands and respects? Too often BI and data leaders get caught up in technical details and fail to evaluate how their technical initiatives add value to the business. This wide-ranging interview with a BI veteran will shed light on how to run a successful BI shop.

Data is eating the world and every industry is impacted. In most modern businesses, customer and employee activities create a plethora of data points and information that can be analysed and interpreted to make better decisions for the business and its customers. Unfortunately, this sounds a lot easier than it is. Despite the huge mountains of data being created, many organisations struggle to get their business intelligence to serve them in the best way. This is not due to a shortage of reports and dashboards floating around; in many cases there are too many ways to get an answer to the same question. So, why are so many organisations lacking good BI and what should they do about it? I recently spoke to Jen Stirrup to get an answer to this question and many more relating to producing and consuming business intelligence effectively. Jen is the CEO & Founder of Data Relish, a global AI, Data Science and Business Intelligence Consultancy. She is a leading authority in AI and Business Intelligence Leadership and has been named one of the Top 50 Global Data Visionaries and Top 50 Women in Technology worldwide. In this episode of Leaders of Analytics, you will learn how to avoid data paralysis and discover how to create business intelligence that gives your organisation new superpowers.

Jen's website: https://jenstirrup.com/
Jen's LinkedIn profile: https://www.linkedin.com/in/jenstirrup/
Jen on Twitter: https://twitter.com/jenstirrup

Mastering Snowflake Solutions: Supporting Analytics and Data Sharing

Design for large-scale, high-performance queries using Snowflake’s query processing engine to empower data consumers with timely, comprehensive, and secure access to data. This book also helps you protect your most valuable data assets using built-in security features such as end-to-end encryption for data at rest and in transit. It demonstrates key features in Snowflake and shows how to exploit those features to deliver a personalized experience to your customers. It also shows how to ingest the high volumes of both structured and unstructured data that are needed for game-changing business intelligence analysis.

Mastering Snowflake Solutions starts with a refresher on Snowflake’s unique architecture before getting into the advanced concepts that make Snowflake the market-leading product it is today. Progressing through each chapter, you will learn how to leverage storage, query processing, cloning, data sharing, and continuous data protection features. This approach allows for greater operational agility in responding to the needs of modern enterprises, for example in supporting agile development techniques via database cloning (see the sketch after this description). The practical examples and in-depth background on theory in this book help you unleash the power of Snowflake in building a high-performance system with little to no administrative overhead. The result is a deep understanding of Snowflake that enables you to take full advantage of its architecture and deliver valuable analytics insight to your business.

What You Will Learn

- Optimize performance and costs associated with your use of the Snowflake data platform
- Enable data security to help in complying with consumer privacy regulations such as CCPA and GDPR
- Share data securely both inside your organization and with external partners
- Gain visibility into each interaction with your customers using continuous data feeds from Snowpipe
- Break down data silos to gain complete visibility into your business-critical processes
- Transform customer experience and product quality through real-time analytics

Who This Book Is For

Data engineers, scientists, and architects who have had some exposure to the Snowflake data platform or bring some experience from working with another relational database. This book is for those beginning to struggle with new challenges as their Snowflake environment begins to mature, becoming more complex with ever-increasing amounts of data, users, and requirements. New problems require a new approach, and this book aims to arm you with the practical knowledge required to take advantage of Snowflake’s unique architecture to get the results you need.
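As a rough sketch of the zero-copy cloning mentioned above, the snippet below uses the snowflake-connector-python package to clone a production database for development. The account, credentials, and object names are placeholders, and the SQL shown is standard Snowflake CLONE syntax rather than anything specific to the book.

    # Minimal sketch: zero-copy cloning with the Snowflake Python connector.
    # Account, credentials, and database names are placeholders.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="xy12345",   # placeholder account identifier
        user="DEV_USER",
        password="...",      # supply real credentials via a secrets manager
        warehouse="DEV_WH",
    )
    cur = conn.cursor()

    # CLONE creates a metadata-only copy that shares the underlying
    # micro-partitions, so it is fast and initially consumes no extra storage.
    cur.execute("CREATE DATABASE analytics_dev CLONE analytics_prod")
    cur.execute("SELECT COUNT(*) FROM analytics_dev.public.orders")
    print(cur.fetchone()[0])

    cur.close()
    conn.close()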

Analytics Optimization with Columnstore Indexes in Microsoft SQL Server: Optimizing OLAP Workloads

Meet the challenge of storing and accessing analytic data in SQL Server in a fast and performant manner. This book illustrates how columnstore indexes can provide an ideal solution for storing analytic data, leading to faster analytic queries and the ability to ask and answer business intelligence questions with alacrity. The book provides a complete walkthrough of columnstore indexing that encompasses an introduction, best practices, hands-on demonstrations, explanations of common mistakes, and a detailed architecture suitable for professionals of all skill levels. With little or no knowledge of columnstore indexing you can become proficient with columnstore indexes as used in SQL Server, and apply that knowledge in development, test, and production environments.

This book serves as a comprehensive guide to the use of columnstore indexes and provides definitive guidelines. You will learn when columnstore indexes should be used, and the performance gains that you can expect. You will also become familiar with best practices around architecture, implementation, and maintenance. Finally, you will know the limitations and common pitfalls to be aware of and avoid. As analytic data can become quite large, the expense to manage or migrate it can be high. This book shows that columnstore indexing represents an effective storage solution that saves time and money and improves performance for any application that uses it. You will see that columnstore indexes are an effective performance solution included in all versions of SQL Server, with no additional costs or licensing required.

What You Will Learn

- Implement columnstore indexes in SQL Server
- Know best practices for the use and maintenance of analytic data in SQL Server
- Use metadata to fully understand the size and shape of data stored in columnstore indexes
- Employ optimal ways to load, maintain, and delete data from large analytic tables
- Know how columnstore compression saves storage, memory, and time
- Understand when a columnstore index should be used instead of a rowstore index
- Be familiar with advanced features and analytics

Who This Book Is For

Database developers, administrators, and architects who are responsible for analytic data, especially those working with very large data sets who are looking for new ways to achieve high performance in their queries, and those with immediate or future challenges to analytic data and query performance who want a methodical and effective solution.
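To ground the idea, here is a small sketch that creates a clustered columnstore index from Python via pyodbc and runs an aggregation that benefits from it. The server, database, and table names are hypothetical, while the CREATE CLUSTERED COLUMNSTORE INDEX statement is standard SQL Server syntax.

    # Minimal sketch: building a clustered columnstore index over a fact
    # table and querying it. Connection details and table names are
    # hypothetical.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=localhost;DATABASE=AnalyticsDB;"
        "Trusted_Connection=yes;TrustServerCertificate=yes;"
    )
    cur = conn.cursor()

    # Convert the rowstore fact table into a clustered columnstore index.
    cur.execute("CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales")
    conn.commit()

    # A typical analytic aggregation that benefits from columnstore
    # compression and batch-mode execution.
    cur.execute("SELECT region, SUM(amount) AS total FROM dbo.FactSales GROUP BY region")
    for row in cur.fetchall():
        print(row.region, row.total)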

IBM DS8900F Architecture and Implementation: Updated for Release 9.2

This IBM® Redbooks® publication describes the concepts, architecture, and implementation of the IBM DS8900F family. The book provides reference information to assist readers who need to plan for, install, and configure DS8900F systems. This edition applies to DS8900F systems with IBM DS8000® Licensed Machine Code (LMC) 7.9.20 (bundle version 89.20.xx.x), referred to as Release 9.2. The DS8900F is exclusively an all-flash system, and it offers three classes:

- IBM DS8980F Analytic Class: offers the best performance for organizations that want to expand their workload possibilities to artificial intelligence (AI), business intelligence (BI), and machine learning (ML).
- IBM DS8950F Agility Class (all-flash): consolidates all your mission-critical workloads for IBM Z®, IBM LinuxONE, IBM Power Systems, and distributed environments under a single all-flash storage solution.
- IBM DS8910F Flexibility Class (all-flash): reduces complexity while addressing various workloads at the lowest DS8900F family entry cost.

The DS8900F architecture relies on powerful IBM POWER9™ processor-based servers that manage the cache to streamline disk input/output (I/O), maximizing performance and throughput. These capabilities are further enhanced by High-Performance Flash Enclosures (HPFE) Gen2. Like its predecessors, the DS8900F supports advanced disaster recovery (DR) solutions, business continuity solutions, and thin provisioning. The IBM DS8910F Rack-Mounted model 993 is described in IBM DS8910F Model 993 Rack-Mounted Storage System Release 9.1, REDP-5566.

Learn Power BI - Second Edition

Learn Power BI is a comprehensive guide to mastering Microsoft Power BI. With step-by-step instructions, this book equips you to analyze and visualize data effectively, delivering actionable business insights. Whether you're new to Power BI or seeking to deepen your knowledge, you'll find practical examples and hands-on exercises to enhance your skills.

What this Book will help me do

- Master the basics of using Microsoft Power BI for data analysis.
- Learn to clean and transform datasets effectively using Power Query.
- Build analytical models and perform calculations using DAX.
- Design professional-quality reports, dashboards, and visualizations.
- Understand governance and deploy Power BI in organizational environments.

Author(s)

Greg Deckler is a recognized expert in business intelligence and analytics, bringing years of practical experience in using Microsoft Power BI for data-driven decision-making. As an accomplished author, Greg's approachable writing style helps readers of all levels. In his book, he conveys complex concepts in a clear, structured, and user-friendly manner.

Who is it for?

This book is ideal for IT professionals, data analysts, and individuals interested in business intelligence using Power BI. Whether you're a beginner or transitioning from other tools, it guides you through the basics to advanced features. If you want to harness Power BI to create impactful reports or dashboards, this book is for you.

Since the dawn of digital analytics, commerce activities have been defined with eCommerce metrics and tracking. This approach was described and accepted as a de facto standard, as no better integration and commerce model was offered. With technology advancements and the evolution of purchasing behavior, commerce activities are becoming more dominant. It is time to reformat or redesign the data and start talking to businesses with a different narrative.

Actionable Insights with Amazon QuickSight

Discover the power of Amazon QuickSight with this comprehensive guide. Learn to create stunning data visualizations, integrate machine learning insights, and automate operations to optimize your data analytics workflows. This book offers practical guidance on utilizing QuickSight to develop insightful and interactive business intelligence solutions.

What this Book will help me do

- Understand the role of Amazon QuickSight within the AWS analytics ecosystem.
- Learn to configure data sources and develop visualizations effectively.
- Gain skills in adding interactivity to dashboards using custom controls and parameters.
- Incorporate machine learning capabilities into your dashboards, including forecasting and anomaly detection.
- Explore advanced features like QuickSight APIs and embedded multi-tenant analytics design.

Author(s)

Samatas is an AWS-certified big data solutions architect with years of experience in designing and implementing scalable analytics solutions. With a clear and practical approach, Samatas teaches how to effectively leverage Amazon QuickSight for efficient and insightful business intelligence applications. This expertise ensures readers will gain actionable skills.

Who is it for?

This book is ideal for business intelligence (BI) developers and data analysts looking to deepen their expertise in creating interactive dashboards using Amazon QuickSight. It is a perfect guide for professionals aiming to explore machine learning integration in BI solutions. Familiarity with basic data visualization concepts is recommended, but no prior experience with Amazon QuickSight is needed.
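For a flavor of the QuickSight APIs the blurb mentions, the sketch below lists dashboards with boto3. The AWS account ID and region are placeholders; the list_dashboards call is part of the standard boto3 QuickSight client.

    # Minimal sketch: enumerating QuickSight dashboards with boto3.
    # Account ID and region are placeholders.
    import boto3

    qs = boto3.client("quicksight", region_name="us-east-1")
    resp = qs.list_dashboards(AwsAccountId="123456789012")
    for summary in resp["DashboardSummaryList"]:
        print(summary["DashboardId"], summary["Name"])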