talk-data.com

Topic: Analytics

Tags: data_analysis, insights, metrics

4552 tagged activities

Activity Trend: peak of 398 activities per quarter (2020-Q1 to 2026-Q1)

Activities

4552 activities · Newest first

AI-Enabled Analytics for Business

We are entering the era of digital transformation, where human and artificial intelligence (AI) work hand in hand to achieve data-driven performance. Today, more than ever, businesses are expected to possess the talent, tools, processes, and capabilities to continuously analyze past business performance and events, and to gain forward-looking insight that drives business decisions and actions. AI-Enabled Analytics for Business is your roadmap to this essential business capability. To plan for the future rather than react to it when it arrives, we need to develop and deploy a toolbox of tools, techniques, and effective processes that reveal unbiased, forward-looking insights and help us understand significant patterns, relationships, and trends. This book promotes the clarity you need to make better decisions from insights about the future. Learn how advanced analytics ensures that your people have the right information at the right time to gain critical insights and performance opportunities; empower better, smarter decision making by implementing AI-enabled analytics decision support tools; uncover patterns and insights in data, and discover facts about your business that will unlock greater performance; and draw inspiration from practical examples and use cases showing how to move your business toward AI-enabled decision making. AI-Enabled Analytics for Business is a must-have practical resource for directors, officers, and executives across functional disciplines who seek increased business performance and valuation.

Data science and machine learning are integral parts of most large-scale product manufacturing processes and are used to understand customer needs, detect quality issues, automate repetitive tasks, and optimise supply chains. They are the invisible glue that helps us produce more things for less, and in a timely fashion. To learn more about this fascinating topic, I recently spoke to Ranga Ramesh, Senior Director of Quality Innovation and Transformation at Georgia-Pacific. Georgia-Pacific is one of the world’s largest manufacturers of consumer paper products and uses AI technologies throughout its manufacturing process. In this episode of Leaders of Analytics, we explore how computer vision and machine learning can be used to classify tissue paper softness and instantly detect quality issues that could otherwise render large volumes of product useless. Ranga’s work is featured as a case study in our recently published book, Demystifying AI for the Enterprise.

In this video, I recap the necessary ingredients for becoming a data scientist in 2022. I cover the data paths you can take and the technical data analytics skills you should learn, and explain how I think 2022 will be different.

(0:39) What Is A Data Scientist?
(1:31) 4 Ways To Break Into Data
(2:44) Data Scientist Requirements
(3:20) Proving Your Skills
(4:08) Finding An Opportunity
(5:12) What Should You Focus On Learning?
(7:05) How Will 2022 Be Different
(8:45) My Data Scientist Story

Rather watch this as a YouTube video? Check it out (and subscribe).

Want to jumpstart your data scientist journey in 2022? Check out The #21DaysToData Challenge: https://www.datacareerjumpstart.com/21daystodata

If you listen to this podcast regularly, please subscribe and leave a rating; it helps other people like you find the podcast, it supports the show, and it’s free!

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

Summary Applications of data have grown well beyond the venerable business intelligence dashboards that organizations have relied on for decades. Now data is being used to power consumer-facing services, influence organizational behaviors, and build sophisticated machine learning systems. Given this increased level of importance, it has become necessary for everyone in the business to treat data as a product, in the same way that software applications were treated in the early 2000s. In this episode Brian McMillan shares his work on the book "Building Data Products" and how he is working to educate business users and data professionals about the combination of technical, economic, and business considerations that must be blended for these projects to succeed.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

Today’s episode is sponsored by Prophecy.io, the low-code data engineering platform for the cloud. Prophecy provides an easy-to-use visual interface to design and deploy data pipelines on Apache Spark and Apache Airflow, so all data users can follow software engineering best practices – git, tests, and continuous deployment – with a simple-to-use visual designer. How does it work? You visually design the pipelines, and Prophecy generates clean Spark code with tests on git; then you visually schedule these pipelines on Airflow. You can observe your pipelines with built-in metadata search and column-level lineage. Finally, if you have existing workflows in AbInitio, Informatica, or other ETL formats that you want to move to the cloud, you can import them automatically into Prophecy, making them run productively on Spark. Create your free account today at dataengineeringpodcast.com/prophecy.

StreamSets DataOps Platform is the world’s first single platform for building smart data pipelines across hybrid and multi-cloud architectures. Build, run, monitor, and manage data pipelines confidently with an end-to-end data integration platform that’s built for constant change. Amp up your productivity with an easy-to-navigate interface and hundreds of pre-built connectors, and get pipelines and new hires up and running quickly with powerful, reusable components that work across batch and streaming. Once you’re up and running, your smart data pipelines are resilient to data drift: those ongoing and unexpected changes in schema, semantics, and infrastructure. Finally, a single pane of glass for operating and monitoring all your data pipelines gives you the full transparency and control you desire for your data operations. Get started building pipelines in minutes for free at dataengineeringpodcast.com/streamsets. The first 10 listeners of the podcast who subscribe to StreamSets’ Professional Tier receive 2 months free after their first month.

Your host is Tobias Macey, and today I’m interviewing Brian McMillan about building data products and his book to introduce the work of data analysts and engineers to non-programmers.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what motivated you to write a book about the work of building data products?

Who is your target audience?
What are the main goals that you are trying to achieve through the book?

What

Michael McNamara, Senior Principal at MasterCard, joins Mark, Ryan, and Cris to discuss the state of the American consumer and the impact of Omicron on spending. Full episode transcript

Questions or comments? Please email us at [email protected]. We would love to hear from you. To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

Data science and machine learning are continuing to evolve as core capabilities across many industries. But high-quality data science output is only half the story. As the data science profession matures from “back office support” to leading from the front, there is an increasing need for more integrated systems that plug into business operations. To get the most out of these capabilities, organisations must move beyond just building robust models, and establish operational processes that can produce, implement and maintain machine learning systems at scale. Enter MLOps. To understand the fundamentals and best practices of MLOps, I recently spoke to Shalini Kurapati, CEO of Clearbox AI. Clearbox AI is a data-centric MLOps company that enables trustworthy and human-centred AI. Their AI Control Room automatically produces synthetic data and insights to solve issues related to data quality, data access and sharing, and the privacy concerns that block AI adoption in companies. In this episode of Leaders of Analytics, we cover:

What MLOps is and why we need it to succeed with advanced data science solutions
How to get beyond the proof-of-concept-to-production gap and get models into operation
The importance of data-centric AI in building MLOps best practices
The most common AI pitfalls to avoid
How Human Centred Design principles can be used to build AI for good, and much more.

Check out Clearbox AI here: https://clearbox.ai/ Connect with Shalini here: https://www.linkedin.com/in/shalini-kurapati-phd-she-her-06516324/

With the Omicron wave upon us, it would be Pollyannaish to get overly enthused about the economy's prospects in the new year. But if the economy's performance last year is a guide, we should not be too pessimistic either. Despite being hit hard by the Delta wave of the virus, the economy grew like gangbusters in 2021. It will not grow as strongly in 2022, but inflation, which took off in recent months, will come back to earth. Having said this, how good a year the economy will have depends on the pandemic's path and how well policymakers respond.  Webinar Slides

Questions or Comments, please email us at [email protected]. We would love to hear from you.    To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus, OH), Moe Kiss (Canva), Michael Helbling (Search Discovery), J.D. Long

Mistakes happen. In healthy work environments, not only is that fact acknowledged, it's recognized as an opportunity to learn. That's something JD Long has been thinking about quite a bit over the past few years, and he joined the show for a chat about psychological safety: what it is, why it's important, and different techniques for engendering it. Michael trolled Tim almost immediately, which is: 1) ironic, and 2) slated to be addressed in a blameless post-mortem. For complete show notes, including links to items mentioned in this episode and a transcript of the show, visit the show page.

In this episode, I’ll cover 20 red flags to check before taking a job in the tech industry 🚩🚩🚩

Your job takes up half of your waking hours, so it’s important to understand what you’re getting into before accepting an offer. Going through this simple checklist could save you a lot of headaches down the road. Listen to hear some things to look out for.

Want me to answer one of your questions? Submit it on my Discord

Want some help finding a tech job? Schedule a 1:1 call with me.

SUBSCRIBE!

If you’re finding this podcast helpful and want to help others find it too, leave a rating and review

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

Summary Reverse ETL is a product category that evolved from the landscape of customer data platforms, with a number of companies offering their own implementation of it. While struggling with the work of automating data integration workflows with marketing, sales, and support tools, Brian Leonard accidentally discovered this need himself and turned it into the open source framework Grouparoo. In this episode he explains why he decided to turn these efforts into an open core business, how the platform is implemented, and the benefits of having an open source contender in the landscape of operational analytics products.
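Though the interview covers the details, a minimal sketch can make the category concrete: at its core, a reverse ETL sync reads records from the warehouse, diffs them against what was last synced, and pushes only the changes to a destination's API. The outline below is a generic illustration in Python, not Grouparoo's actual API; the table, field names, and push_to_destination helper are hypothetical.

```python
import sqlite3

def push_to_destination(record: dict) -> None:
    # Hypothetical stand-in for a SaaS API call (e.g., a CRM contact upsert).
    print(f"upserting {record['email']} -> destination")

def sync(warehouse_path: str, last_synced_state: dict) -> dict:
    """One reverse ETL pass: warehouse -> diff -> destination."""
    conn = sqlite3.connect(warehouse_path)
    rows = conn.execute(
        "SELECT email, lifetime_value, plan FROM customer_profiles"
    ).fetchall()
    conn.close()

    new_state = {}
    for email, ltv, plan in rows:
        record = {"email": email, "lifetime_value": ltv, "plan": plan}
        new_state[email] = record
        # Only push records that are new or changed since the last run.
        if last_synced_state.get(email) != record:
            push_to_destination(record)
    return new_state

# state = sync("warehouse.db", state)  # run on a schedule
```

Production systems layer scheduling, retries, rate limiting, and identity resolution on top of this loop, which is much of what frameworks in this category provide.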

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

StreamSets DataOps Platform is the world’s first single platform for building smart data pipelines across hybrid and multi-cloud architectures. Build, run, monitor, and manage data pipelines confidently with an end-to-end data integration platform that’s built for constant change. Amp up your productivity with an easy-to-navigate interface and hundreds of pre-built connectors, and get pipelines and new hires up and running quickly with powerful, reusable components that work across batch and streaming. Once you’re up and running, your smart data pipelines are resilient to data drift: those ongoing and unexpected changes in schema, semantics, and infrastructure. Finally, a single pane of glass for operating and monitoring all your data pipelines gives you the full transparency and control you desire for your data operations. Get started building pipelines in minutes for free at dataengineeringpodcast.com/streamsets. The first 10 listeners of the podcast who subscribe to StreamSets’ Professional Tier receive 2 months free after their first month.

Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets and code, Atlan enables teams to create a single source of truth for all their data assets and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker, and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3,000 on an annual subscription.

Your host is Tobias Macey, and today I’m interviewing Brian Leonard about Grouparoo, an open source framework for managing your reverse ETL pipelines.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what Grouparoo is and the story behind it?
What are the core requirements for building a reverse ETL system?

What are the additional capabilities that users of the system ask for as they get more advanced in their usage?

Who is your target user for Grouparoo, and how does that influence your priorities on feature development and UX design?
What are the benefits of building an open source core for a reverse ETL platform as compared to the other commercial options?
Can you describe the architecture and implementation of the Grouparoo project?

What are the additional systems that you have built to support the hosted offering?
How have the design and goals of the

podcast_episode
by Dante DeAntonio (Moody's Analytics), Cris deRitis, Mark Zandi (Moody's Analytics), Ryan Sweet

Mark, Ryan, and Cris welcome back Dante DeAntonio, Senior Economist at Moody's Analytics, to dissect the December U.S. employment report and the latest effects of the Omicron variant on the economy. They also discuss reasons why people are quitting in droves. Full episode transcript.

Questions or comments? Please email us at [email protected]. We would love to hear from you. To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

“Out of stock”. Three words with a great deal of significance for retailers and their customers. It is estimated that retail products are out of stock 8% of the time in physical stores, and more than 14% of the time in e-commerce stores, leading to frustration for retailers and customers alike. Retailers miss out on important revenue from the forgone sales. Customers leave unfulfilled and are less likely to return to the same retailer or recommend it to others in their network. Supply chains feel the ripples of the gaps between demand and supply. This is a trillion-dollar problem globally. The solution is not just about demand forecasting, but also about knowing what you have in stock, which is a huge challenge in itself. To understand how to solve this challenge, I recently spoke to Min Chen, co-founder and CEO of Wisy Inc. The company’s technology is focused on reducing retail stockouts and waste with artificial intelligence and data analytics. Min is a seasoned entrepreneur and an all-round interesting person. Having migrated from China to Panama at age 4, she now lives in Silicon Valley after moving Wisy from Panama to the US in 2020. In this episode of Leaders of Analytics, you will learn:

How AI can help solve a global, trillion-dollar supply chain problem
How to develop product-market fit for AI solutions
How to bootstrap a start-up in a difficult environment
Why Wisy decided to move the company from Panama to Silicon Valley

Welcome to 2022! 🎉 Thank you so much for listening! In this episode, I review 2021, discuss goals, and introduce a new challenge!

Check out The 21 Days To Data Challenge: https://www.datacareerjumpstart.com/Challenge

New Data Career Podcast episodes EVERY Monday morning

Here’s what I did in 2021:

Quit my job
Snow Data Science
Consulted for 15 businesses
Ran 50 miles, 60k elevation = 11 peaks
Ran a marathon
Sold a house, bought a house
Interned with the Utah Jazz
Graduated with a master's from Georgia Tech
Spent 20 days with a youth group in the Dominican Republic
Launched Data Career Jumpstart

Please subscribe to the podcast, and leave us a review! It means the world to me!

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

podcast_episode
by Val Kroll, Julie Hoyer, Tim Wilson (Analytics Power Hour - Columbus, OH), Josh Crowhurst, Moe Kiss (Canva), Michael Helbling (Search Discovery)

We did it! Another year in the books, and 2021 was a bit of a ride. As we do every year, on this episode we reflect a little bit on the podcast and then a lot on the industry: what the major themes of 2021 were, and what we think might be coming in 2022. Google Analytics 4, third-party cookies, remote work and Zoom meetings, and even the metaverse! Plus, of course, this is our annual excuse to get our executive producer, Josh Crowhurst, on a mic! For complete show notes, including links to items mentioned in this episode and a transcript of the show, visit the show page.

Data Mesh in Practice

The data mesh is poised to replace data lakes and data warehouses as the dominant architectural pattern in data and analytics. By promoting the concept of domain-focused data products that go beyond file sharing, data mesh helps you deal with data quality at scale by establishing true data ownership. This approach is so new, however, that many misconceptions and a general lack of practical experience for implementing data mesh are widespread. With this report, you'll learn how to successfully overcome challenges in the adoption process. By drawing on their experience building large-scale data infrastructure, designing data architectures, and contributing to data strategies of large and successful corporations, authors Max Schultze and Arif Wider have identified the most common pain points along the data mesh journey. You'll examine the foundations of the data mesh paradigm and gain both technical and organizational insights. This report is ideal for companies just starting to work with data, for organizations already in the process of transforming their data infrastructure landscape, as well as for advanced companies working on federated governance setups for a sustainable data-driven future. This report covers:

Data mesh principles and practical examples for getting started
Typical challenges and solutions you'll encounter when implementing a data mesh
Data mesh pillars, including domain ownership, data as a product, and infrastructure as a platform
How to move toward a decentralized data product and build a data infrastructure platform
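As a concrete flavor of the "data as a product" pillar, a domain team typically publishes a machine-readable contract (schema plus freshness guarantees) and enforces it before consumers see the data. The sketch below is a minimal illustration under those assumptions, not an example from the report; the dataset name, fields, and SLO values are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical contract a domain team might publish for its data product.
CONTRACT = {
    "dataset": "checkout.orders",          # domain-owned data product
    "schema": {"order_id": str, "amount": float, "placed_at": str},
    "max_staleness": timedelta(hours=24),  # freshness guarantee (SLO)
}

def validate(rows: list[dict], last_updated: datetime) -> list[str]:
    """Return a list of contract violations; empty means the product is healthy."""
    errors = []
    if datetime.now(timezone.utc) - last_updated > CONTRACT["max_staleness"]:
        errors.append("freshness SLO violated")
    for i, row in enumerate(rows):
        for field, typ in CONTRACT["schema"].items():
            if not isinstance(row.get(field), typ):
                errors.append(f"row {i}: {field} missing or not {typ.__name__}")
    return errors
```

The point is that ownership becomes enforceable: the producing domain, not a central team, runs checks like this before publishing.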

Optimizing Databricks Workloads

Unlock the full potential of Apache Spark on the Databricks platform with "Optimizing Databricks Workloads". This book equips you with must-know techniques to effectively configure, manage, and optimize big data processing pipelines. Dive into real-world scenarios and learn practical approaches to reduce costs and improve performance in your data engineering processes.

What this book will help you do:

Understand and apply optimization techniques for Databricks workloads.
Choose the right cluster configurations to maximize efficiency and minimize costs.
Leverage Delta Lake for performance-boosted data processing and optimization.
Develop skills for managing Spark DataFrames and core functionalities in Databricks.
Gain insights into real-world scenarios to effectively improve workload performance.

Author(s): Anirudh Kala and his co-authors are experienced practitioners in the fields of data engineering and analytics. With years of professional expertise in leveraging Apache Spark and Databricks, they bring real-world insight into performance optimization. Their approach blends practical instruction with actionable strategies, making this book an essential guide for data engineers aiming to excel in this domain.

Who is it for? This book is tailored for data engineers, data scientists, and cloud architects looking to elevate their skills in managing Databricks workloads. Ideal for readers with basic knowledge of Spark and Databricks, it helps them get hands-on with optimization techniques. If you are aiming to enhance your Spark-based data processing systems, this book offers the guidance you need.
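To give a taste of the kind of techniques such a book covers, here is a minimal PySpark sketch of two common Delta Lake optimizations: partitioning a table on a column that queries filter on, and compacting/co-locating files with OPTIMIZE ... ZORDER BY (a Databricks-specific Delta Lake command). The table name, columns, and paths are illustrative, not examples from the book.

```python
from pyspark.sql import SparkSession

# Assumes a Databricks (or Delta-enabled) Spark session.
spark = SparkSession.builder.getOrCreate()

# Partition on a low-cardinality column that queries filter on, so Spark
# can prune whole directories instead of scanning the full table.
(
    spark.read.format("delta").load("/mnt/raw/events")
    .write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("/mnt/curated/events")
)

# Compact small files and co-locate rows sharing a key, so selective
# queries touch fewer files (Databricks-specific SQL).
spark.sql("OPTIMIZE delta.`/mnt/curated/events` ZORDER BY (event_type)")
```

Choosing the partition and Z-order columns to match actual query predicates is usually where the cost and latency wins come from.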

Mark, Ryan, and Cris welcome back Joe Kennedy, senior principal economist at MITRE, to discuss the U.S. dollar, reserve currencies, and crypto. Recommended Read: Ending Poverty: Changing Behavior, Guaranteeing Income, and Transforming Government, by Joseph Kennedy.

Questions or comments? Please email us at [email protected]. We would love to hear from you. To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

Send us a text. Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next.

Abstract The Making Data Simple Podcast is hosted by Al Martin, VP, IBM Expert Services Delivery, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun. This week on Making Data Simple, we have Benn Stancil, Chief Analytics Officer and Founder at Mode. Benn is an accomplished data analyst with deep expertise in collaborative business intelligence and interactive data science. Benn is co-founder, President, and Chief Analytics Officer of Mode, an award-winning SaaS company that combines the best elements of analytics and business intelligence (ABI), data science (DS), and machine learning (ML) to empower data teams to answer impactful questions and collaborate on analysis across a range of business functions. Under Benn’s leadership, the Mode platform has evolved to enable data teams to explore, visualize, analyze, and share data in a powerful end-to-end workflow. Prior to founding Mode, Benn served in senior analytics positions at Microsoft and Yammer, and worked as a researcher for the International Economics Program at the Carnegie Endowment for International Peace. Benn also served as an Undergraduate Research Fellow at Wake Forest University, where he received his B.S. in Mathematics and Economics. Benn believes in fostering a shared sense of humility and gratitude.

Show Notes
1:22 – Benn’s history
7:09 – Tell us how you got to where you are today
9:14 – Tell us about Mode
12:08 – What is your definition of the Chief Analytics Officer?
21:53 – Why do we need another BI tool?
24:09 – What’s your secret sauce?
27:48 – Where did the name Mode come from?
28:41 – How do we use Mode?
31:08 – What is your go-to-market strategy?
32:38 – Any client references?
34:58 – “The missing piece in the modern data stack” – tell us about this

Email: [email protected] [email protected]
Twitter: benn stancil

Connect with the Team
Producer Kate Brown - LinkedIn.
Producer Steve Templeton - LinkedIn.
Host Al Martin - LinkedIn and Twitter.

Summary One of the perennial challenges of data analytics is having a consistent set of definitions, along with a flexible and performant API endpoint for querying them. In this episode Artyom Keydunov and Pavel Tiunov share their work on Cube.js and the various ways that it is being used in the open source community.
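To give a sense of what "a flexible and performant API endpoint" for shared definitions looks like in practice, here is a minimal sketch of querying a Cube deployment from Python. The /cubejs-api/v1/load route and JSON query shape follow Cube's documented REST API, but the deployment URL, auth token, and the orders cube with its measures and dimensions are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical deployment URL and token issued by the Cube deployment.
CUBE_URL = "https://analytics.example.com/cubejs-api/v1/load"
API_TOKEN = "<JWT for the Cube deployment>"

# The query references named measures/dimensions defined once in the cube,
# so every consumer shares the same definitions.
query = {
    "measures": ["orders.count"],
    "dimensions": ["orders.status"],
    "timeDimensions": [
        {"dimension": "orders.created_at", "granularity": "month"}
    ],
}

req = urllib.request.Request(
    CUBE_URL + "?query=" + urllib.parse.quote(json.dumps(query)),
    headers={"Authorization": API_TOKEN},
)
with urllib.request.urlopen(req) as resp:
    rows = json.load(resp)["data"]

for row in rows:
    print(row)  # one record per status/month combination
```

The same query shape can back a BI dashboard or an embedded analytics feature, which is the point of centralizing the metric definitions.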

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

Atlan is a collaborative workspace for data-driven teams, like GitHub for engineering or Figma for design teams. By acting as a virtual hub for data assets ranging from tables and dashboards to SQL snippets and code, Atlan enables teams to create a single source of truth for all their data assets and collaborate across the modern data stack through deep integrations with tools like Snowflake, Slack, Looker, and more. Go to dataengineeringpodcast.com/atlan today and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $3,000 on an annual subscription.

Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days. Datafold helps data teams gain visibility and confidence in the quality of their analytical data through data profiling, column-level lineage, and intelligent anomaly detection. Datafold also helps automate regression testing of ETL code with its Data Diff feature, which instantly shows how a change in ETL or BI code affects the produced data, both on a statistical level and down to individual rows and values. Datafold integrates with all major data warehouses as well as frameworks such as Airflow and dbt, and seamlessly plugs into CI workflows. Go to dataengineeringpodcast.com/datafold today to start a 30-day trial of Datafold.

Your host is Tobias Macey, and today I’m interviewing Artyom Keydunov and Pavel Tiunov about Cube.js, a framework for building analytics APIs to power your applications and BI dashboards.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what Cube is and the story behind it?
What are the main use cases and platform architectures that you are focused on?

Who are the target personas that will be using and managing Cube.js?

The name comes from the concept of an OLAP cube. Can you discuss the applications of OLAP cubes and their role in the current state of the data ecosystem?

How does the idea of an OLAP cube compare to the recent focus on a dedicated metrics layer?

What are the pieces of a data platform that might be replaced by Cube.js?
Can you describe the design and architecture of the Cube platform?

How has the focus and target use case for the Cube platform evolved since you first started working on it?

One of the perpetually hard problems in computer science is cache management. How have you approached that challenge in the pre-aggregation layer of the Cube framework? (A conceptual sketch of the rollup idea follows after these questions.)
What is your overarching design philosophy for the API of the Cube system?
Can you talk through the workflow of someone building a cube and querying it from a downstream system?

What do the iteration cycles look like as you go from initial proof of concept to a more sophisticated usage of Cube.js?
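For context on the pre-aggregation question above, the sketch below illustrates the basic rollup idea behind pre-aggregations: compute grouped aggregates once, then answer repeated queries from the compact rollup instead of rescanning raw rows. This is a conceptual Python illustration, not Cube's implementation; the data and names are made up.

```python
from collections import defaultdict

# Raw fact rows (illustrative).
orders = [
    {"status": "shipped", "month": "2022-01", "amount": 120.0},
    {"status": "shipped", "month": "2022-01", "amount": 80.0},
    {"status": "returned", "month": "2022-01", "amount": 35.0},
]

# Build the rollup once; this is what a pre-aggregation materializes.
rollup = defaultdict(lambda: {"count": 0, "total": 0.0})
for o in orders:
    key = (o["status"], o["month"])
    rollup[key]["count"] += 1
    rollup[key]["total"] += o["amount"]

# Serve repeated queries from the rollup instead of the raw rows.
def measure(status: str, month: str) -> dict:
    return rollup[(status, month)]

print(measure("shipped", "2022-01"))  # {'count': 2, 'total': 200.0}
```

The hard part, which the cache-management question gets at, is deciding when a stored rollup is still valid and when the underlying data has changed enough to rebuild it.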