talk-data.com

Topic

Data Analytics

data_analysis statistics insights

760 tagged

Activity Trend

38 peak/qtr · 2020-Q1 to 2026-Q1

Activities

760 activities · Newest first

We’re improving DataFramed, and we need your help! We want to hear what you have to say about the show, and how we can make it more enjoyable for you—find out more here.

Imagine spending millions on data tools only to find you can’t trust the answers they provide. What if different teams define key metrics in different ways? Without a clear, unified approach, chaos reigns, and confidence erodes. What role do data governance and semantic layers play in helping you trust the AI tools you build and the insights you get from your data?

Sarah Levy is a seasoned executive with extensive experience in data science, artificial intelligence, and technology leadership. Currently serving as Co-Founder and CEO of Euno since January 2023, Sarah has previously held significant positions, including VP of Data Science and Data Analytics for Real Estate at Pagaya and CTO at Sight Diagnostics, where innovative advancements in blood testing were achieved. With a strong foundation in research and development from roles at Sight Diagnostics and Natural Intelligence, as well as a robust background in cyber security gained from tenure at the IDF, Sarah has consistently driven impactful decision-making and technological advancements throughout her career. Academic credentials include a Master's degree in Condensed Matter Physics from the Weizmann Institute of Science and a Bachelor's degree in Mathematics and Physics from The Hebrew University of Jerusalem.

In the episode, Richie and Sarah explore the challenges of data governance, the role of semantic layers in ensuring data trust, the emergence of analytics engineers, the integration of AI in data processes, and much more.

Links Mentioned in the Show:
Euno
Connect with Sarah
Course: Responsible AI Data Management
Related Episode: How Data Leaders Can Make Data Governance a Priority with Saurabh Gupta, Chief Strategy & Revenue Officer at The Modern Data Company
Rewatch sessions from RADAR: Forward Edition

New to DataCamp?
Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

DuckDB: Up and Running

DuckDB, an open source in-process database created for OLAP workloads, provides key advantages over more mainstream OLAP solutions: It's embeddable and optimized for analytics. It also integrates well with Python and is compatible with SQL, giving you the performance and flexibility of SQL right within your Python environment. This handy guide shows you how to get started with this versatile and powerful tool. Author Wei-Meng Lee takes developers and data professionals through DuckDB's primary features and functions, best practices, and practical examples of how you can use DuckDB for a variety of data analytics tasks. You'll also dive into specific topics, including how to import data into DuckDB, work with tables, perform exploratory data analysis, visualize data, perform spatial analysis, and use DuckDB with JSON files, Polars, and JupySQL.

With this book you will:
Understand the purpose of DuckDB and its main functions
Conduct data analytics tasks using DuckDB
Integrate DuckDB with pandas, Polars, and JupySQL
Use DuckDB to query your data
Perform spatial analytics using DuckDB's spatial extension
Work with a diverse range of data, including Parquet, CSV, and JSON

Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away! It’s not just about skills; find out what makes hiring managers say, “You’re the one we’ve been looking for.” Featuring hiring managers like Alex The Analyst, Megan McGuire, Jesse Morris, and Andrew Madson, the episode provides actionable tips and behind-the-scenes looks at what it takes to stand out in the data job market.

💌 Join 30k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter
🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training
👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa
👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com//interviewsimulator

Subscribe to Our Newsletter: Join the Data Career Jumpstart Newsletter
Prepare for Interviews: Try the Interview Simulator

⌚ TIMESTAMPS
00:25 Alex The Analyst: The Importance of Personality in Hiring
08:01 Megan McGuire: The Hiring Process from Start to Finish
17:06 Jesse Morris: Storytelling and Tenacity in Data Roles
23:21 Andrew Madson: The Value of Projects and Team Fit
26:27 Conclusion and Additional Resources

🔗 CONNECT WITH GUESTS
Alex Freberg: https://www.linkedin.com/in/alex-freberg
Megan McGuire: https://www.linkedin.com/in/megan-s-mcguire
Jesse Morris: https://www.linkedin.com/in/jessemorris1
Andrew Madson: https://www.linkedin.com/in/andrew-madson

🔗 CONNECT WITH AVERY
🎥 YouTube Channel 🤝 LinkedIn 📸 Instagram 🎵 TikTok 💻 Website

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

Takahiko Saito: Empowering Real-Time ML Inference and Training with GRIS

🌟 Session Overview 🌟

Session Name: Empowering Real-Time ML Inference and Training with GRIS: A Deep Dive into High Availability and Low Latency Data Solutions
Speaker: Takahiko Saito
Session Description: In the rapidly evolving landscape of machine learning (ML) and data processing, the need for real-time data delivery systems that offer high availability, low latency, and robust service level agreements (SLAs) has never been more critical. This session introduces GRIS (Generic Real-time Inference Service), a cutting-edge platform designed to meet these demands head-on, facilitating real-time ML inference and historical data processing for ML model training.

Attendees will gain insights into GRIS's capabilities, including its support for real-time data delivery for ML inference; for products requiring high availability, low latency, and strong SLA adherence; and for real-time product performance monitoring. We will explore how GRIS prioritizes use cases off the Netflix critical path, such as choosing, playback, and sign-up processes, while ensuring data delivery for critical real-time monitoring tasks like anomaly detection during product launches and live events.

The session will delve into the key design decisions and challenges faced during the MVP release of GRIS, highlighting its low latency, high availability gRPC API for inference, and the use of Granular Historical Dataset via Iceberg for training. We will discuss the MVP metrics, including feature groups, categories, and aggregation windows, and how these elements contribute to the platform's effectiveness in real-time data processing.
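GRIS's internals aren't public, so as a generic illustration only: the "aggregation window" idea mentioned above is typically a tumbling window, where raw events are bucketed by fixed time boundaries and a metric is aggregated per (feature, window). A toy sketch with an invented event shape:

```python
from collections import defaultdict

WINDOW_SECS = 60  # hypothetical 1-minute aggregation window

def aggregate(events):
    """Sum a metric per (feature, tumbling-window start).

    events: iterable of (timestamp_secs, feature_name, value) tuples.
    """
    buckets = defaultdict(float)
    for ts, feature, value in events:
        # Snap each event's timestamp down to its window boundary.
        window_start = ts - (ts % WINDOW_SECS)
        buckets[(feature, window_start)] += value
    return dict(buckets)

# Three play events: two land in the first window, one in the second.
events = [(5, "plays", 1.0), (30, "plays", 1.0), (65, "plays", 1.0)]
print(aggregate(events))  # {('plays', 0): 2.0, ('plays', 60): 1.0}
```

A streaming system serves these aggregates with low latency for inference while also persisting them (e.g., to Iceberg tables, as the session describes) for model training.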

Furthermore, we will cover the production readiness of GRIS, including streaming jobs, on-call alerts, and data quality measures. The session will provide a comprehensive overview of the MVP data quality framework for GRIS, including online and offline checks, and how these measures ensure the integrity and consistency of data processed by the platform.

Looking ahead, the roadmap for GRIS will be presented, outlining the journey from POC to GA, including the introduction of processor metrics, event-level transaction history, and the next batch of metrics for advanced aggregation types. We will also discuss the potential for a user-facing metrics definition API/DSL and how GRIS is poised to enable new use cases for teams across various domains.

This session is a must-attend for data scientists, ML engineers, and technology leaders looking to stay at the forefront of real-time data processing and ML model training. Whether you're interested in the technical underpinnings of GRIS or its application in real-world scenarios, this session will provide valuable insights into how high availability, low latency data solutions are shaping the future of ML and data analytics.

🚀 About Big Data and RPA 2024 🚀

Unlock the future of innovation and automation at Big Data & RPA Conference Europe 2024! 🌟 This unique event brings together the brightest minds in big data, machine learning, AI, and robotic process automation to explore cutting-edge solutions and trends shaping the tech landscape. Perfect for data engineers, analysts, RPA developers, and business leaders, the conference offers dual insights into the power of data-driven strategies and intelligent automation. 🚀 Gain practical knowledge on topics like hyperautomation, AI integration, advanced analytics, and workflow optimization while networking with global experts. Don’t miss this exclusive opportunity to expand your expertise and revolutionize your processes—all from the comfort of your home! 📊🤖✨

📅 Yearly Conferences: Curious about the evolution of big data and RPA? Check out our archive of past Big Data & RPA sessions. Watch the strategies and technologies evolve in our videos! 🚀

🔗 Find Other Years' Videos:
2023 Big Data Conference Europe: https://www.youtube.com/playlist?list=PLqYhGsQ9iSEpb_oyAsg67PhpbrkCC59_g
2022 Big Data Conference Europe Online: https://www.youtube.com/playlist?list=PLqYhGsQ9iSEryAOjmvdiaXTfjCg5j3HhT
2021 Big Data Conference Europe Online: https://www.youtube.com/playlist?list=PLqYhGsQ9iSEqHwbQoWEXEJALFLKVDRXiP

💡 Stay Connected & Updated 💡

Don’t miss out on any updates or upcoming event information from Big Data & RPA Conference Europe. Follow us on our social media channels and visit our website to stay in the loop!

🌐 Website: https://bigdataconference.eu/, https://rpaconference.eu/
👤 Facebook: https://www.facebook.com/bigdataconf, https://www.facebook.com/rpaeurope/
🐦 Twitter: @BigDataConfEU, @europe_rpa
🔗 LinkedIn: https://www.linkedin.com/company/73234449/, https://www.linkedin.com/company/75464753/
🎥 YouTube: http://www.youtube.com/@DATAMINERLT

AWS re:Invent 2024 - Innovations in AWS analytics: Data warehousing and SQL analytics (ANT349)

Join this session to learn about the newest innovations in data warehousing and SQL analytics with AWS analytics services. Amazon Redshift is the AI-powered, cloud-based data warehousing solution used by tens of thousands of AWS customers to modernize data analytics workloads and generate business insights with the best price performance. Learn more about the latest capabilities launched for Amazon Redshift to further drive quick decision-making with lower costs for your organization.

Learn more: AWS re:Invent: https://go.aws/reinvent. More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

About AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2024

Snowflake Data Engineering

A practical introduction to data engineering on the powerful Snowflake cloud data platform. Data engineers create the pipelines that ingest raw data, transform it, and funnel it to the analysts and professionals who need it. The Snowflake cloud data platform provides a suite of productivity-focused tools and features that simplify building and maintaining data pipelines. In Snowflake Data Engineering, Snowflake Data Superhero Maja Ferle shows you how to get started.

In Snowflake Data Engineering you will learn how to:
Ingest data into Snowflake from both cloud and local file systems
Transform data using functions, stored procedures, and SQL
Orchestrate data pipelines with streams and tasks, and monitor their execution
Use Snowpark to run Python code in your pipelines
Deploy Snowflake objects and code using continuous integration principles
Optimize performance and costs when ingesting data into Snowflake

Snowflake Data Engineering reveals how Snowflake makes it easy to work with unstructured data, set up continuous ingestion with Snowpipe, and keep your data safe and secure with best-in-class data governance features. Along the way, you’ll practice the most important data engineering tasks as you work through relevant hands-on examples. Throughout, author Maja Ferle shares design tips drawn from her years of experience to ensure your pipeline follows the best practices of software engineering, security, and data governance.

About the Technology
Pipelines that ingest and transform raw data are the lifeblood of business analytics, and data engineers rely on Snowflake to help them deliver those pipelines efficiently. Snowflake is a full-service cloud-based platform that handles everything from near-infinite storage and fast elastic compute to inbuilt AI/ML capabilities like vector search, text-to-SQL, and code generation. This book gives you what you need to create effective data pipelines on the Snowflake platform.
About the Book
Snowflake Data Engineering guides you skill-by-skill through accomplishing on-the-job data engineering tasks using Snowflake. You’ll start by building your first simple pipeline and then expand it by adding increasingly powerful features, including data governance and security, adding CI/CD into your pipelines, and even augmenting data with generative AI. You’ll be amazed how far you can go in just a few short chapters!

What's Inside
Ingest data from the cloud, APIs, or Snowflake Marketplace
Orchestrate data pipelines with streams and tasks
Optimize performance and cost

About the Reader
For software developers and data analysts. Readers should know the basics of SQL and the cloud.

About the Author
Maja Ferle is a Snowflake Subject Matter Expert and a Snowflake Data Superhero who holds the SnowPro Advanced Data Engineer and the SnowPro Advanced Data Analyst certifications.

Quotes
“An incredible guide for going from zero to production with Snowflake.” - Doyle Turner, Microsoft
“A must-have if you’re looking to excel in the field of data engineering.” - Isabella Renzetti, Data Analytics Consultant & Trainer
“Masterful! Unlocks the true potential of Snowflake for modern data engineers.” - Shankar Narayanan, Microsoft
“Valuable insights will enhance your data engineering skills and lead to cost-effective solutions. A must read!” - Frédéric L’Anglais, Maxa
“Comprehensive, up-to-date and packed with real-life code examples.” - Albert Nogués, Danone

AWS re:Invent 2024 - Scaling to new heights with Amazon Redshift multi-cluster architecture (ANT339)

AWS customers use Amazon Redshift to modernize their data analytics workloads and deliver insights for their businesses. Learn how to design your analytics system to scale with your business needs. Explore the different patterns of multi-cluster architectures and best practices to deploy them cost-effectively. Explore how GE Aerospace overcame challenges with its on-premises system by using a combination of architectural patterns to create an extensible design that met strict compliance and security requirements, achieved performance targets, and shared data across its enterprise.


#AWSreInvent #AWSreInvent2024

AI Engineering

Recent breakthroughs in AI have not only increased demand for AI products, they've also lowered the barriers to entry for those who want to build AI products. The model-as-a-service approach has transformed AI from an esoteric discipline into a powerful development tool that anyone can use. Everyone, including those with minimal or no prior AI experience, can now leverage AI models to build applications. In this book, author Chip Huyen discusses AI engineering: the process of building applications with readily available foundation models. The book starts with an overview of AI engineering, explaining how it differs from traditional ML engineering and discussing the new AI stack. The more AI is used, the more opportunities there are for catastrophic failures, and therefore, the more important evaluation becomes. This book discusses different approaches to evaluating open-ended models, including the rapidly growing AI-as-a-judge approach. AI application developers will discover how to navigate the AI landscape, including models, datasets, evaluation benchmarks, and the seemingly infinite number of use cases and application patterns. You'll learn a framework for developing an AI application, starting with simple techniques and progressing toward more sophisticated methods, and discover how to efficiently deploy these applications.

With this book you will:
Understand what AI engineering is and how it differs from traditional machine learning engineering
Learn the process for developing an AI application, the challenges at each step, and approaches to address them
Explore various model adaptation techniques, including prompt engineering, RAG, fine-tuning, agents, and dataset engineering, and understand how and why they work
Examine the bottlenecks for latency and cost when serving foundation models and learn how to overcome them
Choose the right model, dataset, evaluation benchmarks, and metrics for your needs

Chip Huyen works to accelerate data analytics on GPUs at Voltron Data. Previously, she was with Snorkel AI and NVIDIA, founded an AI infrastructure startup, and taught Machine Learning Systems Design at Stanford. She's the author of the book Designing Machine Learning Systems, an Amazon bestseller in AI. AI Engineering builds upon and is complementary to Designing Machine Learning Systems (O'Reilly).
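The AI-as-a-judge approach mentioned in the blurb can be shown in miniature: a "judge" model scores each candidate answer against a rubric, and the mean score becomes the evaluation metric. In this toy sketch the judge is a trivial stub standing in for a real LLM call; all names and data here are illustrative, not from the book:

```python
def stub_judge(question: str, answer: str) -> float:
    """Stand-in for an LLM judge: 1.0 if the answer names Paris, else 0.0.

    A real judge would be a strong model prompted with the question,
    the answer, and a scoring rubric.
    """
    return 1.0 if "paris" in answer.lower() else 0.0

def evaluate(pairs, judge) -> float:
    """Mean judge score over a list of (question, answer) pairs."""
    return sum(judge(q, a) for q, a in pairs) / len(pairs)

pairs = [
    ("What is the capital of France?", "Paris."),
    ("What is the capital of France?", "I think it's London."),
]
print(evaluate(pairs, stub_judge))  # 0.5
```

The design point is that the judge is pluggable: the same `evaluate` loop works whether the judge is a heuristic, a human rater, or another model, which is what makes the pattern attractive for open-ended outputs that exact-match metrics can't score.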

Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away! Steven Tran went from tech support to analytics pro in just three months, and he's spilling the tea on how he made it happen.

💌 Join 30k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter
🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training
👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa
👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com//interviewsimulator

⌚ TIMESTAMPS
00:37 Meet Steven Tran: From Tech Support to Data Analytics
02:30 Steven's Career Transformation Timeline
06:29 Financial and Career Growth
07:52 The Importance of Projects and Passion
16:57 The Importance of a Portfolio
18:34 Growing Your LinkedIn Presence
24:42 Interview Experiences and Job Success

🔗 CONNECT WITH STEVEN TRAN
Connect on LinkedIn: https://www.linkedin.com/in/stephentran96

🔗 CONNECT WITH AVERY
🎥 YouTube Channel 🤝 LinkedIn 📸 Instagram 🎵 TikTok 💻 Website

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!


We’re improving DataFramed, and we need your help! We want to hear what you have to say about the show, and how we can make it more enjoyable for you—find out more here.

Edge computing is poised to transform industries by bringing computation and data storage closer to the source of data generation. This shift unlocks new types of value creation with data & AI and allows for a privacy-first and deeply personalized use of AI on our devices. What will the edge computing transition look like? How do you ensure applications are edge-ready, and what is the role of AI in the transition?

Derek Collison is the founder and CEO at Synadia. He is an industry veteran, entrepreneur and pioneer in large-scale distributed systems and cloud computing. Derek founded Synadia Communications and Apcera, and has held executive positions at Google, VMware, and TIBCO Software. He is also an active angel investor and a technology futurist around Artificial Intelligence, Machine Learning, IoT and Cloud Computing. Justyna Bak is VP of Marketing at Synadia. Justyna is a versatile executive bridging Marketing, Sales and Product, a spark-plug for innovation at startups and Fortune 100 companies and a tech expert in Data Analytics and AI, AppDev and Networking. She is an astute influencer, panelist and presenter (Google, HBR) and a respected leader in Silicon Valley and Europe.

In the episode, Richie, Derek, and Justyna explore the transition from cloud to edge computing, the benefits of reduced latency, real-time decision-making in industries like manufacturing and retail, the role of AI at the edge, the future of edge-native applications, and much more.

Links Mentioned in the Show:
Synadia
Connect with Derek and Justyna
Course: Understanding Cloud Computing
Related Episode: The Database is the Operating System with Mike Stonebraker, CTO & Co-Founder at DBOS
Rewatch sessions from RADAR: Forward Edition

New to DataCamp?
Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away! Why pay to learn data skills when you could get paid to learn instead? Let’s explore the options and find what works for you.

💌 Join 30k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter
🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training
👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa
👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com//interviewsimulator

⌚ TIMESTAMPS
00:21 Learning for Free
01:13 Paid Learning
02:53 Getting Paid to Learn
04:58 Company-Sponsored Learning: Courses and Degrees

🔗 CONNECT WITH AVERY
🎥 YouTube Channel 🤝 LinkedIn 📸 Instagram 🎵 TikTok 💻 Website

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!


Power BI in Microsoft Fabric: unveiling the latest innovations | BRK202

Join us as we unveil the latest announcements for Power BI in Microsoft Fabric. Explore how Copilot's advanced AI capabilities are transforming data analytics by automating report creation, generating intelligent insights, and enhancing data visualization. Discover how Power BI seamlessly integrates to enable smooth data flow and enhance collaboration. We'll also share real-world success stories, demonstrating the profound impact of this innovation on data analytics.

Speakers: * Mohammad Ali * Kimberly Manis

Session Information: This is one of many sessions from the Microsoft Ignite 2024 event. View even more sessions on-demand and learn about Microsoft Ignite at https://ignite.microsoft.com

BRK202 | English (US) | Data

#MSIgnite

Boost productivity with Microsoft Fabric​ Copilot and AI | BRK197

Explore how AI is revolutionizing the data analytics landscape. Microsoft Fabric serves as the center of data gravity for both business and technical users. We'll delve into real-world scenarios showcasing the power of combining Microsoft Fabric with AI to enhance data insights, analyze customer interactions, and customize machine learning models. Join us to understand the impact of AI on data analytics and take the next steps to leverage Microsoft Fabric for your organization’s success.

Speakers: * Mohammad Ali * Nellie Gustafsson * Kelly Richardson-Lewis

Session Information: This is one of many sessions from the Microsoft Ignite 2024 event. View even more sessions on-demand and learn about Microsoft Ignite at https://ignite.microsoft.com

BRK197 | English (US) | Data

#MSIgnite

The data analytics job market seems more competitive than ever right now. In this episode, John David Ariansen talks about the state of hiring, his recent experience in the job market, what employers are looking for, and how you can stand out and set yourself up for success. You'll leave the show with a good understanding of the current market, and some practical and actionable tips you can use to land your next data role.

What You'll Learn:
What employers are looking for in today's analytics job market
What candidates can expect from applying, interviewing, and offers
How you can stand out and set yourself up for success

Register for free to be part of the next live session: https://bit.ly/3XB3A8b

About our guest: John David Ariansen is a Business Analyst, Professor, Podcast Host, and Career Advisor.
Follow John David on LinkedIn
How to Get an Analytics Job Podcast

Follow us on Socials: LinkedIn YouTube Instagram (Mavens of Data) Instagram (Maven Analytics) TikTok Facebook Medium X/Twitter

We’re improving DataFramed, and we need your help! We want to hear what you have to say about the show, and how we can make it more enjoyable for you—find out more here.

Integrating generative AI with robust databases is becoming essential. As organizations face a plethora of database options and AI tools, making informed decisions is crucial for enhancing customer experiences and operational efficiency. How do you ensure your AI systems are powered by high-quality data? And how can these choices impact your organization's success?

Gerrit Kazmaier is the VP and GM of Data Analytics at Google Cloud. Gerrit leads the development and design of Google Cloud’s data technology, which includes data warehousing and analytics. Gerrit’s mission is to build a unified data platform for all types of data processing as the foundation for the digital enterprise. Before joining Google, Gerrit served as President of the HANA & Analytics team at SAP in Germany and led the global Product, Solution & Engineering teams for Databases, Data Warehousing and Analytics. In 2015, Gerrit served as the Vice President of SAP Analytics Cloud in Vancouver, Canada.

In this episode, Richie and Gerrit explore the transformative role of AI in data tools, the evolution of dashboards, the integration of AI with existing workflows, the challenges and opportunities in SQL code generation, the importance of a unified data platform, leveraging unstructured data, and much more.

Links Mentioned in the Show:
Google Cloud
Connect with Gerrit
Thinking Fast and Slow by Daniel Kahneman
Course: Introduction to GCP
Related Episode: Not Only Vector Databases: Putting Databases at the Heart of AI, with Andi Gutmans, VP and GM of Databases at Google
Rewatch sessions from RADAR: Forward Edition

New to DataCamp?
Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

Help us become the #1 Data Podcast by leaving a rating & review! We are 67 reviews away! No fluff, no jargon; just the essentials to kick-start your data analyst career in 2025 with a strategy built for success.

💌 Join 30k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter
🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training
👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa
👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com//interviewsimulator

⌚ TIMESTAMPS
00:16 Understanding Different Data Roles
01:48 Essential Data Skills and Tools
04:36 Building Projects to Showcase Skills
08:13 Creating a Portfolio for Your Projects
09:06 Optimizing LinkedIn and Resume
10:46 Applying for Jobs and Networking
12:38 Preparing for Interviews
14:25 Conclusion and Final Tips

Join the Bootcamp: Data Career Jumpstart
Browse Data Jobs: Find a Data Job
Must-Learn Skills for Aspiring Analysts: Watch on YouTube
Find Free Datasets for Practice: Watch on YouTube
Stratascratch for SQL Practice: Visit Stratascratch
Prepare for Interviews: Interview Simulator

🔗 CONNECT WITH AVERY
🎥 YouTube Channel 🤝 LinkedIn 📸 Instagram 🎵 TikTok 💻 Website

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!


Is BI Too Big for Small Data?

This is a talk about how we thought we had Big Data, and we built everything planning for Big Data, but then it turns out we didn't have Big Data, and while that's nice and fun and seems more chill, it's actually ruining everything, and I am here asking you to please help us figure out what we are supposed to do now.

📓 Resources
Big Data is Dead: https://motherduck.com/blog/big-data-...
Small Data Manifesto: https://motherduck.com/blog/small-dat...
Is Excel Immortal?: https://benn.substack.com/p/is-excel-immortal
Small Data SF: https://www.smalldatasf.com/

➡️ Follow Us
LinkedIn: / motherduck
X/Twitter: / motherduck
Blog: https://motherduck.com/blog/


Mode founder David Wheeler challenges the data industry's obsession with "big data," arguing that most companies are actually working with "small data," and our tools are failing us. This talk deconstructs the common sales narrative for BI tools, exposing why the promise of finding game-changing insights through data exploration often falls flat. If you've ever built dashboards nobody uses or wondered why your analytics platform doesn't deliver on its promises, this is a must-watch reality check on the modern data stack.

We explore the standard BI demo, where an analyst uncovers a critical insight by drilling into event data. This story sells tools like Tableau and Power BI, but it rarely reflects reality, leading to a "revolving door of BI" as companies swap tools every few years. Discover why the narrative of the intrepid analyst finding a needle in the haystack only works in movies and how this disconnect creates a cycle of failed data initiatives and unused "trashboards."

The presentation traces our belief that "data is the new oil" back to the early 2010s, with examples from Target's predictive analytics and Facebook's growth hacking. However, these successes were built on truly massive datasets. For most businesses, analyzing small data results in noisy charts that offer vague "directional vibes" rather than clear, actionable insights. We contrast the promise of big data analytics with the practical challenges of small data interpretation.

Finally, learn actionable strategies for extracting real value from the data you actually have. We argue that BI tools should shift focus from data exploration to data interpretation, helping users understand what their charts actually mean. Learn why "doing things that don't scale," like manually analyzing individual customer journeys, can be more effective than complex models for small datasets. This talk offers a new perspective for data scientists, analysts, and developers looking for better data analysis techniques beyond the big data hype.

Big Data is Dead: Long Live Hot Data 🔥

Over the last decade, Big Data was everywhere. Let's set the record straight on what is and isn't Big Data. We have been consumed by a conversation about data volumes when we should focus more on the immediate task at hand: Simplifying our work.

Some of us may have Big Data, but our quest to derive insights from it is measured in small slices of work that fit on your laptop or in your hand. Easy data is here; let's make the most of it.

📓 Resources Big Data is Dead: https://motherduck.com/blog/big-data-is-dead/ Small Data Manifesto: https://motherduck.com/blog/small-data-manifesto/ Small Data SF: https://www.smalldatasf.com/

➡️ Follow Us LinkedIn: https://linkedin.com/company/motherduck X/Twitter: https://twitter.com/motherduck Blog: https://motherduck.com/blog/


Explore the "Small Data" movement, a counter-narrative to the prevailing big data conference hype. This talk challenges the assumption that data scale is the most important feature of every workload, defining big data as any dataset too large for a single machine. We'll unpack why this distinction is crucial for modern data engineering and analytics, setting the stage for a new perspective on data architecture.

Delve into the history of big data systems, starting with the non-linear hardware costs that plagued early data practitioners. Discover how Google's foundational papers on GFS, MapReduce, and Bigtable led to the creation of Hadoop, fundamentally changing how we scale data processing. We'll break down the "big data tax"—the inherent latency and system complexity overhead required for distributed systems to function, a critical concept for anyone evaluating data platforms.

Learn about the architectural cornerstone of the modern cloud data warehouse: the separation of storage and compute. This design, popularized by systems like Snowflake and Google BigQuery, allows storage to scale almost infinitely while compute resources are provisioned on-demand. Understand how this model paved the way for massive data lakes but also introduced new complexities and cost considerations that are often overlooked.

We examine the cracks appearing in the big data paradigm, especially for OLAP workloads. While systems like Snowflake are still dominant, the rise of powerful alternatives like DuckDB signals a shift. We reveal the hidden costs of big data analytics, exemplified by a petabyte-scale query costing nearly $6,000, and argue that for most use cases, it's too expensive to run computations over massive datasets.
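The petabyte-query figure above is easy to sanity-check with back-of-the-envelope arithmetic. A minimal sketch, assuming an illustrative on-demand rate of $6 per TB scanned (actual vendor pricing varies and is not quoted in the talk):

```python
# Back-of-the-envelope cost of a full-scan query under per-TB-scanned
# on-demand pricing. The $6/TB rate is an illustrative assumption.
def scan_cost_usd(data_tb: float, price_per_tb: float = 6.0) -> float:
    """Cost of a query that scans `data_tb` terabytes of data."""
    return data_tb * price_per_tb

# A full scan of 1 PB (~1,000 TB) lands in the ballpark the talk cites:
print(f"1 PB scan: ${scan_cost_usd(1000):,.0f}")   # 1 PB scan: $6,000

# If the "hot" working set is only 10 GB, the same rate yields pennies:
print(f"10 GB scan: ${scan_cost_usd(0.01):.2f}")   # 10 GB scan: $0.06
```

The gap between those two numbers is the whole argument: pricing scales with bytes scanned, so querying only the working set, rather than the full historical dataset, is where the savings live.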

The key to efficient data processing isn't your total data size, but the size of your "hot data" or working set. This talk argues that the revenge of the single node is here, as modern hardware can often handle the actual data queried without the overhead of the big data tax. This is a crucial optimization technique for reducing cost and improving performance in any data warehouse.

Discover the core principles for designing systems in a post-big data world. We'll show that since only 1 in 500 users run true big data queries, prioritizing simplicity over premature scaling is key. For low latency, process data close to the user with tools like DuckDB and SQLite. This local-first approach offers a compelling alternative to cloud-centric models, enabling faster, more cost-effective, and innovative data architectures.
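The local-first pattern described above can be sketched with Python's built-in sqlite3 module (DuckDB's Python API follows the same connect-and-query shape for analytical workloads); the `events` table and its rows here are hypothetical illustration, not from the talk:

```python
import sqlite3

# Local-first sketch: the working set lives in a single file (or in
# memory) next to the user, so queries run with no network round-trip
# and none of the coordination overhead of a distributed system.
conn = sqlite3.connect(":memory:")  # swap for "app.db" to persist
conn.execute("CREATE TABLE events (user_id INTEGER, action TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "signup"), (1, "purchase"), (2, "signup"), (3, "signup")],
)

# Typical small-data analytics: a full aggregation is effectively
# instant because the entire dataset fits on one node.
rows = conn.execute(
    "SELECT action, COUNT(*) FROM events GROUP BY action ORDER BY action"
).fetchall()
print(rows)  # [('purchase', 1), ('signup', 3)]
conn.close()
```

The same query against an embedded engine like this covers the "1 in 499 out of 500" cases the talk describes; only when the working set genuinely exceeds a single machine does the big data tax start to pay for itself.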