talk-data.com

Topic: Analytics

Tags: data_analysis, insights, metrics

4552 tagged activities

Activity Trend: peak of 398 activities per quarter (2020-Q1 to 2026-Q1)

Activities

4552 activities · Newest first

Building Real-Time Analytics Systems

Gain deep insight into real-time analytics, including the features of these systems and the problems they solve. With this practical book, data engineers at organizations that use event-processing systems such as Kafka, Google Pub/Sub, and AWS Kinesis will learn how to analyze data streams in real time. The faster you derive insights, the quicker you can spot changes in your business and act accordingly. Author Mark Needham from StarTree provides an overview of the real-time analytics space and an understanding of what goes into building real-time applications. The book's second part offers a series of hands-on tutorials that show you how to combine multiple software products to build real-time analytics applications for an imaginary pizza delivery service.

You will:
Learn common architectures for real-time analytics
Discover how event processing differs from real-time analytics
Ingest event data from Apache Kafka into Apache Pinot
Combine event streams with OLTP data using Debezium and Kafka Streams
Write real-time queries against event data stored in Apache Pinot
Build a real-time dashboard and order tracking app
Learn how Uber, Stripe, and Just Eat use real-time analytics
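The book's tutorials pair Kafka with Apache Pinot for serving queries. As a flavor of what a real-time query looks like, here is a minimal sketch (not taken from the book) that uses the pinotdb Python driver against a hypothetical orders table, assuming a Pinot broker at localhost:8099:

```python
# Minimal sketch: query recent event data in Apache Pinot from Python.
# Assumes the pinotdb DB-API driver, a broker on localhost:8099, and a
# hypothetical `orders` table fed from a Kafka topic.
from pinotdb import connect

conn = connect(host="localhost", port=8099, path="/query/sql", scheme="http")
cursor = conn.cursor()

# Orders placed in the last five minutes, grouped by status.
cursor.execute(
    """
    SELECT status, COUNT(*) AS order_count
    FROM orders
    WHERE ts > ago('PT5M')
    GROUP BY status
    ORDER BY COUNT(*) DESC
    """
)
for status, order_count in cursor.fetchall():
    print(status, order_count)
```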

What skills should you learn when studying to be a Data Analyst?

Join me with data legend Luke Barousse to discuss where you should focus your time.

Is it Python? Is it SQL? Is it Excel? Is it Power BI?

Listen to find out 👀

Connect with Luke Barousse:

🤝 Connect on LinkedIn

▶️ Subscribe on YouTube

📊 Datanerd.tech

📩 Get my weekly email with helpful data career tips

📊 Come to my next free “How to Land Your First Data Job” training

🏫 Check out my 10-week data analytics bootcamp

Timestamps:

(03:42) - Analyzing 1.2M data jobs (DataNerd.tech)

(06:21) - The most important data skills

(12:13) - More senior skills

(22:52) - Data job titles

Connect with Avery:

📺 Subscribe on YouTube

🎙Listen to My Podcast

👔 Connect with me on LinkedIn

📸 Instagram

🎵 TikTok

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

Many organizations with data warehouses or data lakes are discovering that they are not prepared to scale AI workloads. A lack of processes and standards has led to a growing quagmire of unstructured, ungoverned data, best described as a data swamp. In this session, Paul Zikopoulos, Vice President of IBM Skills Vitality and Enablement, IBM, will introduce you to a new approach using an open source, data lakehouse strategy, which can help resolve the problems hindering your ability to put analytics and AI to work at scale. Then, Tarun Chopra, Vice President of Product Management for Data and AI, IBM, will dive into IBM watsonx.data, a fit-for-purpose data store to scale AI workloads, and how leading companies are deploying it within their data strategy.

Summary

Data systems are inherently complex and often require the integration of multiple technologies. Orchestrators are centralized utilities that control the execution and sequencing of interdependent operations, offering a single location for visibility and error handling so that data platform engineers can keep that complexity under control. In this episode Nick Schrock, creator of Dagster, shares his perspective on the state of data orchestration technology and its application, to help inform its implementation in your environment.
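To make the orchestration idea concrete, here is a minimal, hypothetical sketch using Dagster's software-defined assets (the asset names and logic are illustrative, not from the episode): the dependency between the two assets is declared once, and Dagster handles sequencing, visibility, and error handling in one place.

```python
# Illustrative sketch of two dependent Dagster assets; names and logic are made up.
from dagster import Definitions, asset


@asset
def raw_orders() -> list[dict]:
    # In a real pipeline this would pull from an API, queue, or database.
    return [{"id": 1, "amount": 42.0}, {"id": 2, "amount": 17.5}]


@asset
def daily_revenue(raw_orders: list[dict]) -> float:
    # Declares its dependency on raw_orders simply by naming it as a parameter.
    return sum(order["amount"] for order in raw_orders)


defs = Definitions(assets=[raw_orders, daily_revenue])
```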

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack

This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs into your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold

You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it’s real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize today to get 2 weeks free!

Your host is Tobias Macey and today I'm welcoming back Nick Schrock to talk about the state of the ecosystem for data orchestration.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by defining what data orchestration is and how it differs from other types of orchestration systems? (e.g. container orchestration, generalized workflow orchestration, etc.)
What are the misconceptions about the applications of/need for/cost to implement data orchestration?

How do those challenges of customer education change across roles/personas?

Because of the multi-faceted nature of data in an organization, how does that influence the capabilities and interfaces that are needed in an orchestration engine?
You have been working on Dagster for five years now. How have the requirements/adoption/application for orchestrators changed in that time?
One of the challenges for any orchestration engine is to balance the need for robust and extensible core capabilities with a rich suite of integrations to the broader data ecosystem. What are the factors that you have seen have the most influence in driving adoption of a given engine?
What are the most interesting, innovative, or unexpected ways that you have seen data orchestration implemented and/or used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Dagster?

podcast_episode
by Martin Fleming (Varicent), Cris deRitis, Mark Zandi (Moody's Analytics), Marisa DiNatale (Moody's Analytics)

Given the strongly contrasting narratives about the impact of artificial intelligence on the economy – from bright optimism that AI will significantly lift productivity growth and wealth to dark pessimism that it will lead to a dystopian rise in unemployment and cybercrime – we asked Martin Fleming to sort it out. And the former chief economist of IBM, current author, and Chief Revenue Scientist at Varicent does just that. And Mark does a bit of advertising along the way.

For more about Martin Fleming, click here
For more on Martin Fleming's book, Breakthrough: A Growth Revolution, click here
To participate in the weekly Survey of Business Confidence, click here

Follow Mark Zandi @MarkZandi, Cris deRitis @MiddleWayEcon, and Marisa DiNatale on LinkedIn for additional insight.

Questions or Comments, please email us at [email protected]. We would love to hear from you. To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

Datatopics is a podcast presented by Kevin Missoorten that talks about the fuzzy and misunderstood concepts in the world of data, analytics, and AI and gets to the bottom of things.

In this episode, together with expert guests Jeroen, Guillaume and our own Murilo, we dive deep into the fascinating world of Edge AI. “Traditional” AI models are typically accessed remotely and run centrally on a cloud instance, but what do you do when you need an immediate response, when you don’t want the data to be sent off the device, or when you don’t always have a connection? In that case you can deploy ‘smaller’ models on the device itself, which is called Edge AI. Edge AI can bring many benefits, but there are still some challenges as well. Tune in to DataTopics to hear our take on Edge AI, what business value it can bring, where it is today, and where we see it evolving next! Datatopics is brought to you by Dataroots. Music: The Gentlemen - DivKid. The thumbnail was generated by Midjourney.
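As a rough illustration of the "smaller model on the device" idea (the episode itself is tool-agnostic), here is a sketch that runs a placeholder TensorFlow Lite model locally with the tflite_runtime interpreter; the model file and input values are assumptions:

```python
# Illustrative Edge AI sketch: run a local TFLite model with no network round trip.
# "model.tflite" is a placeholder; any converted model would work.
import numpy as np
import tflite_runtime.interpreter as tflite

interpreter = tflite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Inference happens on the device: no connection needed, no data sent anywhere.
sample = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])
interpreter.set_tensor(input_details[0]["index"], sample)
interpreter.invoke()
prediction = interpreter.get_tensor(output_details[0]["index"])
print(prediction)
```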

How can we leverage data products to generate value? Join this insightful conversation with host Jason Foster and Deepak Jose, Global Head of One Demand Data & Analytics Solutions, Mars. In this thought-provoking episode, we explore the world of impactful data products, the power of having a business problem-first mindset, and practical strategies to implement them to generate substantial business value.

Join me and data interview expert Nick Singh to discuss how to ace your data science and analytics interviews, from preparation tips to tackling SQL technical questions, in this must-listen episode of the Data Career Podcast!

Connect with Nick Singh:

🎁 Free SQL Tutorial from Data Lemur: https://datalemur.com/sql-tutorial

🤝 Connect on LinkedIn

📕 Buy "Ace the Data Science Interview" book

🐵DataLemur

👔 Get a discount on my data interview prep course: https://www.datacareerjumpstart.com//interview

📩 Get my weekly email with helpful data career tips

📊 Come to my next free “How to Land Your First Data Job” training

🏫 Check out my 10-week data analytics bootcamp

Timestamps:

(16:10) - 📊 Best interview tips

(36:30) - 🔄 Mock interview practice

Connect with Avery:

📺 Subscribe on YouTube

🎙Listen to My Podcast

👔 Connect with me on LinkedIn

📸 Instagram

🎵 TikTok

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

MCA Microsoft Certified Associate Azure Data Engineer Study Guide

Prepare for the Azure Data Engineering certification—and an exciting new career in analytics—with this must-have study aid. In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, practical guide to preparing for the challenging Azure Data Engineer certification and for a new career in an exciting and growing field of tech. In the book, you’ll explore all the objectives covered on the DP-203 exam while learning the job roles and responsibilities of a newly minted Azure data engineer. From integrating, transforming, and consolidating data from various structured and unstructured data systems into a structure suitable for building analytics solutions, you’ll get up to speed quickly and efficiently with Sybex’s easy-to-use study aids and tools.

This Study Guide also offers:
Career-ready advice for anyone hoping to ace their first data engineering job interview and excel on their first day in the field
Indispensable tips and tricks to familiarize yourself with the DP-203 exam structure and help reduce test anxiety
Complimentary access to Sybex’s expansive online study tools, accessible across multiple devices, and offering hundreds of bonus practice questions, electronic flashcards, and a searchable, digital glossary of key terms

A one-of-a-kind study aid designed to help you get straight to the crucial material you need to succeed on the exam and on the job, the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203 belongs on the bookshelves of anyone hoping to increase their data analytics skills, advance their data engineering career with an in-demand certification, or make a career change into a popular new area of tech.

For the past few years, we've seen the importance of data literacy and why organizations must invest in a data-driven culture, mindset, and skillset. However, as generative AI tools like ChatGPT have risen to prominence in the past year, AI literacy has never been more important. But how do we begin to approach AI literacy? Is it an extension of data literacy, a complement, or a new paradigm altogether? How should you get started on your AI literacy ambitions?

Cindi Howson is the Chief Data Strategy Officer at ThoughtSpot and host of The Data Chief podcast. Cindi is a data analytics, AI, and BI thought leader and an expert with a flair for bridging business needs with technology. As Chief Data Strategy Officer at ThoughtSpot, she advises top clients on data strategy and best practices to become data-driven, speaks internationally on top trends such as AI ethics, and influences ThoughtSpot’s product strategy.

Cindi was previously a Gartner Research Vice President, the lead author for the data and analytics maturity model and analytics and BI Magic Quadrant, and a popular keynote speaker. She introduced new research in data and AI for good, NLP/BI Search, and augmented analytics, bringing both BI bake-offs and innovation panels to Gartner globally. She’s frequently quoted in MIT, Harvard Business Review, and Information Week. She is rated a top 12 influencer in big data and analytics by Analytics Insight, Onalytca, Solutions Review, and Humans of Data.

In the episode, Cindi and Adel discuss how generative AI accelerates an organization’s data literacy, how leaders can think beyond data literacy and start to think about AI literacy, the importance of responsible use of AI, how to best communicate the value of AI within your organization, what generative AI means for data teams, AI use-cases in the data space, the psychological barriers blocking AI adoption, and much more. 

Links Mentioned in the Show:
The Data Chief Podcast
ThoughtSpot Sage
BloombergGPT
Radar: Data & AI Literacy
Course: AI Ethics
Course: Generative AI Concepts
Course: Implementing AI Solutions in Business

Summary

Cloud data warehouses and the introduction of the ELT paradigm have led to the creation of multiple options for flexible data integration, with a roughly equal distribution of commercial and open source options. The challenge is that most of those options are complex to operate and exist in their own silo. The dlt project was created to eliminate that overhead and bring data integration under your full control as a library component of your overall data system. In this episode Adrian Brudaru explains how it works, the benefits that it provides over other data integration solutions, and how you can start building pipelines today.
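To give a sense of what "data integration as a library component" looks like in practice, here is a minimal, illustrative dlt pipeline; the resource, the duckdb destination, and the sample records are assumptions for the sketch rather than details from the episode:

```python
# Illustrative dlt pipeline: a resource yields rows, the pipeline loads them.
# Destination, dataset, and sample records are placeholders.
import dlt


@dlt.resource(name="events", write_disposition="append")
def events():
    # In practice this would yield rows from an API, database, or file.
    yield from [
        {"id": 1, "event": "page_view"},
        {"id": 2, "event": "signup"},
    ]


pipeline = dlt.pipeline(
    pipeline_name="example_events",
    destination="duckdb",
    dataset_name="raw",
)
load_info = pipeline.run(events())
print(load_info)
```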

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack

You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It’s the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it’s real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize today to get 2 weeks free!

This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs into your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold

Your host is Tobias Macey and today I'm interviewing Adrian Brudaru about dlt, an open source Python library for data loading.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what dlt is and the story behind it?

What is the problem you want to solve with dlt? Who is the target audience?

The obvious comparison is with systems like Singer/Meltano/Airbyte in the open source space, or Fivetran/Matillion/etc. in the commercial space. What are the complexities or limitations of those tools that leave an opening for dlt?
Can you describe how dlt is implemented? What are the benefits of building it in Python?
How have the design and goals of the project changed since you first started working on it?
How does that language choice influence the performance and scaling characteristics?
What problems do users solve with dlt?
What are the interfaces available for extending/customizing/integrating with dlt? Can you talk through the process of adding a new source/destination?
What is the workflow for someone building a pipeline with dlt?
How does the experience scale when supporting multiple connections?
Given the limited scope of extract and load, and the composable design of dlt it seems like a purpose built companion to dbt (down to th

podcast_episode
by Dante DeAntonio (Moody's Analytics), Cris deRitis, Mark Zandi (Moody's Analytics), Marisa DiNatale (Moody's Analytics)

The August jobs report couldn’t have been much better. Dante and Cris called it a good report, while Mark and Marisa thought it was a VERY good report. Either way, the report has soft landing written all over it. In keeping with the good cheer over the jobs numbers, the group recounted their favorite music. Some surprises there. For the full transcript, click here. Follow Mark Zandi @MarkZandi, Cris deRitis @MiddleWayEcon, and Marisa DiNatale on LinkedIn for additional insight.

Questions or Comments, please email us at [email protected]. We would love to hear from you. To stay informed and follow the insights of Moody's Analytics economists, visit Economic View.

Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

Python Data Analytics: With Pandas, NumPy, and Matplotlib

Explore the latest Python tools and techniques to help you tackle the world of data acquisition and analysis. You'll review scientific computing with NumPy, visualization with matplotlib, and machine learning with scikit-learn. This third edition is fully updated for the latest version of Python and its related libraries, and includes coverage of social media data analysis, image analysis with OpenCV, and deep learning libraries. Each chapter includes multiple examples demonstrating how to work with each library. At its heart lies the coverage of pandas, for high-performance, easy-to-use data structures and tools for data manipulation. Author Fabio Nelli expertly demonstrates using Python for data processing, management, and information retrieval. Later chapters apply what you've learned to handwriting recognition and extending graphical capabilities with the JavaScript D3 library. Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, Python Data Analytics, Third Edition is an invaluable reference with its examples of storing, accessing, and analyzing data.

What You'll Learn:
Understand the core concepts of data analysis and the Python ecosystem
Go in depth with pandas for reading, writing, and processing data
Use tools and techniques for data visualization and image analysis
Examine popular deep learning libraries Keras, Theano, TensorFlow, and PyTorch

Who This Book Is For:
Experienced Python developers who need to learn about Pythonic tools for data analysis
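As a small taste of the pandas-plus-Matplotlib workflow the book covers, here is a short, self-contained example with made-up sales figures:

```python
# Tiny pandas + Matplotlib example; the sales figures are invented for illustration.
import pandas as pd
import matplotlib.pyplot as plt

sales = pd.DataFrame(
    {
        "month": ["Jan", "Feb", "Mar", "Apr"],
        "revenue": [120.0, 135.5, 128.0, 150.25],
    }
)

print(sales.describe())  # quick summary statistics
sales.plot(x="month", y="revenue", kind="bar", legend=False)
plt.ylabel("Revenue")
plt.tight_layout()
plt.show()
```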

Join me in this week's episode as I share real success stories from participants in the Data Analytics Accelerator program. 🧑🏽‍🎓

These are REAL stories from REAL people like you, going through their journey. Some of their wins are big like data analyst job offers. Others are small and just include posting on LinkedIn or reaching out to a recruiter.

Featuring tips on resume and LinkedIn profile optimization that led to interviews and job offers, proving that small changes can make a big impact in your data career journey—tune in now!

📩 Get my weekly email with helpful data career tips

📊 Come to my next free “How to Land Your First Data Job” training

🏫 Check out my 10-week data analytics bootcamp

Timestamps:

(02:55) - 📊 The Job Market is HARD.

(04:58) - 💼 Story 1: Interviews

(07:33) - 👩‍💻 Story 2: Quick Wins

(09:25) - 💌 Story 3: Cold Messaging

(10:36) - 💰 Story 4: Senior Job Offer

(13:40) - 🚀 Story 5: Small Wins

(14:37) - 🎓 Story 6: Job Offer

Connect with Avery:

📺 Subscribe on YouTube

🎙Listen to My Podcast

👔 Connect with me on LinkedIn

📸 Instagram

🎵 TikTok

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

Every day, like invisible breadcrumbs, we leave trails of personal data scattered across the digital landscape. Each click, every search, every purchase - they all tell a story about us. But do we know where these breadcrumbs lead? Who's picking them up? And most importantly, what are they doing with them? In an era where data is documenting our lives across a host of platforms, understanding these trails and their implications is no longer a luxury but rather a necessity. It's about our privacy, our rights, and our well-being in an increasingly interconnected world. In this episode of Leaders of Analytics, John Thompson and I dive into his newly released book that should be on everyone's reading list - "Data for All". During our discussion, we delve into the eye-opening insights Thompson shares in his book, such as understanding the scope and consequences of companies manipulating and exploiting your data. We also explore the step-by-step guide he provides on how to navigate this changing landscape.

Serverless Machine Learning with Amazon Redshift ML

Serverless Machine Learning with Amazon Redshift ML provides a hands-on guide to using Amazon Redshift Serverless and Redshift ML for building and deploying machine learning models. Through SQL-focused examples and practical walkthroughs, you will learn efficient techniques for cloud data analytics and serverless machine learning.

What this book will help me do:
Grasp the workflow of building machine learning models with Redshift ML using SQL.
Learn to handle supervised learning tasks like classification and regression.
Apply unsupervised learning techniques, such as K-means clustering, in Redshift ML.
Develop time-series forecasting models within Amazon Redshift.
Understand how to operationalize machine learning in a serverless cloud architecture.

Author(s): Debu Panda, Phil Bates, Bhanu Pittampally, and Sumeet Joshi are seasoned professionals in cloud computing and machine learning technologies. They combine deep technical knowledge with teaching expertise to guide learners through mastering Amazon Redshift ML. Their collaborative approach ensures that the content is accessible, engaging, and practically applicable.

Who is it for? This book is perfect for data scientists, machine learning engineers, and database administrators using or intending to use Amazon Redshift. It's tailored for professionals with basic knowledge of machine learning and SQL who aim to enhance their efficiency and specialize in serverless machine learning within cloud architectures.
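To illustrate the SQL-first workflow the book teaches, here is a hedged sketch that issues a Redshift ML CREATE MODEL statement through the redshift_connector Python driver; the endpoint, credentials, table, columns, and S3 bucket are placeholders, not examples from the book:

```python
# Hedged sketch of Redshift ML's SQL-first workflow via redshift_connector.
# Endpoint, credentials, table, columns, and bucket below are placeholders.
import redshift_connector

conn = redshift_connector.connect(
    host="example-cluster.abc123.us-east-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="...",
)
conn.autocommit = True  # run the DDL outside of an explicit transaction
cursor = conn.cursor()

# Train a churn classifier directly in SQL; Redshift ML handles the model
# training behind the scenes.
cursor.execute("""
    CREATE MODEL customer_churn
    FROM (SELECT age, tenure_months, monthly_spend, churned FROM customer_activity)
    TARGET churned
    FUNCTION predict_churn
    IAM_ROLE default
    SETTINGS (S3_BUCKET 'my-redshift-ml-staging-bucket');
""")

# Training runs asynchronously; once SHOW MODEL customer_churn reports READY,
# the generated function can be called like any other SQL function:
PREDICTION_SQL = """
    SELECT customer_id, predict_churn(age, tenure_months, monthly_spend) AS will_churn
    FROM customer_activity;
"""
```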

Vector and Raster Data Unification Through H3 | M. Colic | Tech Lead Public Sector UK&I | Databricks

Milos Colic, Tech Lead Public Sector UK&I at Databricks, demonstrates how raster and vector geospatial data can be standardised into a unified domain.

This unification facilitates an easy plugin/plugout capability for all raster and vector layers. Databricks used these principles to design an easy, scalable and extensible Flood Risk for Physical Assets solution using H3 as a unification grid.

To learn more about H3 check out: https://docs.carto.com/data-and-analysis/analytics-toolbox-for-bigquery/sql-reference/h3
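As a tiny illustration of the unification idea, snapping both a vector point and a raster pixel centroid to the same H3 cell ID reduces the join between layers to a plain equality lookup. The sketch below assumes the h3 v3 Python API (geo_to_h3); the coordinates and resolution are arbitrary:

```python
# Illustrative sketch: index vector and raster samples on a common H3 grid.
# Assumes the h3 v3 Python API; coordinates and resolution are arbitrary.
import h3

resolution = 9

# A vector feature (e.g., a physical asset location) snapped to an H3 cell.
asset_cell = h3.geo_to_h3(51.5074, -0.1278, resolution)

# A raster pixel centroid (e.g., from a flood-depth grid) snapped the same way.
pixel_cell = h3.geo_to_h3(51.5075, -0.1279, resolution)

# Once both layers share cell IDs, combining them is a simple key lookup/join.
print(asset_cell, pixel_cell, asset_cell == pixel_cell)
```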

Improving Urban Mobility through Geospatial Analytics | Fawad A. Qureshi | Snowflake

Fawad A. Qureshi, Industry Field CTO at Snowflake, explores how Voi Technologies, a Swedish e-scooter sharing company, is revolutionizing urban mobility through the power of geospatial analytics.

He also discusses the challenges of urban transportation and how Voi is using data-driven insights to optimize scooter placement, improve safety, and create more efficient transportation networks. By harnessing the power of geospatial analytics, Voi is redefining urban mobility and improving the overall quality of life in cities. Join us to learn how data can transform the way we move around our cities.

To learn more about mobility check out: https://carto.com/solutions/mobility-planning

Why Do Some Retail Stores Perform Better Than Others? | T. Backlar & A. Outman | Echo Analytics

Join Thea Backlar, VP of Product at Echo Analytics, and Alexandre Outman, Product Manager at Echo Analytics, for a thought-provoking session to discover how mobility data can be used to solve the most perplexing and complex business insight conundrums.

Get an inside look at how a large European retailer solved the mystery of why some stores outperformed others.

To learn more about site selection check out: https://carto.com/solutions/site-selection