talk-data.com

Topic: DWH (Data Warehouse)
Tags: analytics · business_intelligence · data_storage
568 tagged activities

Activity Trend: peak of 35 activities per quarter, 2020-Q1 to 2026-Q1

Activities (568 · newest first)

Google Cloud Certified Professional Data Engineer Certification Guide

A guide to passing the GCP Professional Data Engineer exam on your first attempt and upgrading your data engineering skills on GCP.

Key Features
• Fully understand the certification exam content and objectives
• Consolidate your knowledge of all essential exam topics and key concepts
• Get realistic experience answering exam-style questions
• Develop practical skills for everyday use
• Purchase of this book unlocks access to web-based exam prep resources, including mock exams, flashcards, and exam tips

Book Description
The GCP Professional Data Engineer certification validates the fundamental knowledge required to perform data engineering tasks, use GCP services to enhance data engineering processes, and further your career in the data engineering/architecture field. This book is a best-in-class study guide that fully covers the exam objectives and helps you pass the exam on your first attempt. Complete with clear explanations, chapter review questions, realistic mock exams, and pragmatic solutions, it will help you master the core exam concepts and go into the exam with the skills and confidence to get the best result you can. With the help of relevant examples, you'll learn fundamental data engineering concepts such as data warehousing and data security. As you progress, you'll delve into the important exam domains, including data pipelining, data migration, and data processing. Unlike other study guides, this book explains the reasoning behind the correct answers in scenario-based questions, provides tips on the optimal use of each service, and gives you everything you need to pass the exam and enhance your prospects in the data engineering field.

What you will learn
• Create data solutions and pipelines in GCP
• Analyze and transform data into useful information
• Apply data engineering concepts to real scenarios
• Create secure, cost-effective, valuable GCP workloads
• Work in the GCP environment with industry best practices

Who this book is for
This book is for data engineers who want a reliable source for the key concepts and terms covered by this prestigious, highly sought-after cloud data engineering certification. It will help you improve your GCP data engineering skills and your chances of earning the certification. You should already be familiar with the Google Cloud Platform, having explored it (professionally or personally) for at least a year, and have some familiarity with basic data concepts, such as data types and basic SQL.

AWS re:Invent 2025 - Modernize your data warehouse by moving to Amazon Redshift (ANT317)

Are you spending too much time on data warehouse management tasks like hardware provisioning, software patching, and performance tuning, and not enough time building your applications and innovating with data? Tens of thousands of customers rely on AWS Analytics every day to run and scale analytics in seconds on all their data without managing data warehouse infrastructure. In this session, you'll learn best practices and proven strategies for modernizing your data warehouse, helping you build powerful analytics and machine learning applications that operate at scale while keeping costs low.


AWS re:Invent 2025 - What's new in Amazon Redshift and Amazon Athena (ANT206)

Learn how AWS is enhancing its SQL analytics offerings with new capabilities in Amazon Redshift and Amazon Athena. Discover how Redshift's AI-powered data warehousing capabilities are enabling customers to modernize their analytics workloads with enhanced performance and cost optimization. Explore Athena's latest features for interactively querying data directly in your Amazon S3 data lakes. This session showcases new features and real-world examples of how organizations are using these services to accelerate business insights while optimizing costs.
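
For context on the Athena side, queries run directly against data in S3 through a small API surface. A minimal boto3 sketch, with placeholder database, table, and output-location values:

```python
# Minimal sketch: interactively querying data in S3 with Athena via boto3.
# Database, table, and output-location values are placeholders.
import time
import boto3

athena = boto3.client("athena")

query = athena.start_query_execution(
    QueryString="SELECT status, count(*) AS hits FROM web_logs GROUP BY status",
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
qid = query["QueryExecutionId"]

# Poll until the query reaches a terminal state.
while True:
    state = athena.get_query_execution(QueryExecutionId=qid)["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    rows = athena.get_query_results(QueryExecutionId=qid)["ResultSet"]["Rows"]
    for row in rows[:5]:
        print(row)
```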


AWS re:Invent 2025 - Scaling Amazon Redshift with a multi-warehouse architecture (ANT318)

Enterprise analytics platforms are undergoing a major transformation, from centralized, overloaded data warehouses to federated, governed, GenAI-ready multi-warehouse architectures. In this session, you'll learn how to design your data warehouse architecture to scale with your business needs. We'll explore the end-to-end architectural evolution from a monolithic Redshift cluster to a modern multi-warehouse architecture, along with best practices for deploying it cost-effectively.
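
Multi-warehouse Redshift architectures are typically built on data sharing, where a producer warehouse exposes live tables to isolated consumer warehouses without copying data. A sketch of the producer-side SQL (all names, the endpoint, and the namespace GUIDs are placeholders), issued here through the redshift_connector Python driver:

```python
# Sketch of the producer side of Redshift data sharing, the mechanism that
# underpins multi-warehouse architectures. Names, endpoint, and the consumer
# namespace GUID are placeholders.
import redshift_connector

conn = redshift_connector.connect(
    host="producer.example.redshift.amazonaws.com",
    database="dev",
    user="admin",
    password="...",  # placeholder
)
cur = conn.cursor()

# Publish a schema from the producer warehouse as a live datashare.
cur.execute("CREATE DATASHARE sales_share")
cur.execute("ALTER DATASHARE sales_share ADD SCHEMA sales")
cur.execute("ALTER DATASHARE sales_share ADD ALL TABLES IN SCHEMA sales")
cur.execute("GRANT USAGE ON DATASHARE sales_share TO NAMESPACE '<consumer-guid>'")
conn.commit()

# On each consumer warehouse, the share then mounts as a database, queried
# with that warehouse's own isolated compute:
#   CREATE DATABASE sales FROM DATASHARE sales_share OF NAMESPACE '<producer-guid>';
```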


While refactoring our data warehouse from a legacy system with spaghetti-like model relationships into a medallion architecture, we implemented model labelling for improved FinOps. This allowed us to optimise queries, eliminate redundancies, and deliver faster results with greater team efficiency, providing more value to internal users.
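
The entry does not name its warehouse, but per-query labels are a common way to implement model-level FinOps attribution. A minimal sketch assuming a BigQuery-style warehouse, with hypothetical layer/model/team labels:

```python
# Illustrative sketch: attaching cost-attribution labels to warehouse queries,
# here with the google-cloud-bigquery client. The talk does not name its
# warehouse; the layer/model/team labels below are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.QueryJobConfig(
    labels={
        "layer": "silver",        # medallion layer the model belongs to
        "model": "orders_clean",  # the transformation model being run
        "team": "finance",        # owning team, for chargeback
    }
)

# Labels surface in billing exports and job metadata, so spend can be
# grouped by layer, model, or team instead of one opaque warehouse bill.
job = client.query("SELECT 1", job_config=job_config)
job.result()
```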

Learn how to accelerate and automate migrations with SnowConvert AI, featuring data ecosystem migration agents powered by Snowflake Cortex AI. SnowConvert AI is your free, automated solution designed to dramatically reduce the complexities, costs, and timelines associated with data warehouse and BI migrations. It intelligently analyzes your existing code, automating code conversion, data validation, and streamlining the entire migration process. Join us for an overview of the solution, migration best practices, and live demos.

The journey from startup to billion-dollar enterprise requires more than just a great product: it demands strategic alignment between sales and marketing. How do you identify your ideal customer profile when you're just starting out? What data signals help you find the twins of your successful early adopters? With AI now automating everything from competitive analysis to content creation, the traditional boundaries between departments are blurring. But what personality traits should you look for when building teams that can scale with your growth? And how do you ensure your data strategy supports rather than hinders your AI ambitions in this rapidly evolving landscape?

Denise Persson is CMO at Snowflake and has 20 years of technology marketing experience at high-growth companies. Prior to joining Snowflake, she served as CMO of Apigee, an API platform company that went public in 2015 and was acquired by Google in 2016. She began her career at collaboration software company Genesys, where she built and led a global marketing organization and helped steer the company through its expansion, IPO, and acquisition. Denise holds a BA in Business Administration and Economics from Stockholm University and an MBA from Georgetown University.

Chris Degnan is the former CRO at Snowflake and has over 15 years of enterprise technology sales experience. Before Snowflake, Chris served as AVP of the West at EMC, and prior to that as VP, Western Region at Aveksa, where he helped grow the business 250% year over year. Before Aveksa, Chris spent eight years at EMC, where he managed a team responsible for 175 select accounts. Earlier, he worked in enterprise sales at Informatica and Covalent Technologies (acquired by VMware). He holds a BA from the University of Delaware.

In the episode, Richie, Denise, and Chris explore the journey to a billion-dollar ARR, the importance of customer obsession, aligning sales and marketing, leveraging data for decision-making, the role of AI in scaling operations, and much more.

Links Mentioned in the Show:
• Snowflake
• Snowflake BUILD
• Connect with Denise and Chris
• Snowflake is FREE on DataCamp this week
• Related Episode: Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake
• Rewatch RADAR AI

New to DataCamp?
• Learn on the go using the DataCamp mobile app
• Empower your business with world-class data and AI skills with DataCamp for Business

Bilt's conversational data layer: How we connected data to LLMs with dbt

Bilt Rewards turned their dbt project into a natural-language interface. By connecting their semantic layer and underlying data warehouse to an LLM, business users and data analysts can ask real business questions and get trusted, creative insights. This session shows how they modeled their data for AI, how they kept accuracy intact, and how they increased data-driven conversations across the business.
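
Bilt's exact integration is not shown here; the following is a deliberately simplified sketch of the general pattern, assuming an OpenAI-style chat API and hard-coded stand-ins for the semantic layer's governed metric definitions (in reality these would come from the dbt semantic layer's metadata):

```python
# Hypothetical sketch of a "conversational data layer": the LLM only picks
# governed metrics/dimensions, and the SQL is compiled from those definitions
# rather than generated free-form, which is how accuracy stays intact.
import json
from openai import OpenAI

client = OpenAI()

# Stand-ins for governed semantic-layer definitions.
METRICS = {"total_points": "sum(points)", "active_members": "count(distinct member_id)"}
DIMENSIONS = ["signup_month", "city"]

def answer(question: str) -> str:
    prompt = (
        "Pick exactly one metric and any dimensions that answer the question. "
        f"Metrics: {list(METRICS)}. Dimensions: {DIMENSIONS}. "
        'Reply with JSON only: {"metric": "...", "dimensions": [...]}. '
        f"Question: {question}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    choice = json.loads(resp.choices[0].message.content)
    # Compile SQL from the governed definitions, not from the model's prose.
    select = [*choice["dimensions"], f'{METRICS[choice["metric"]]} AS {choice["metric"]}']
    group = f' GROUP BY {", ".join(choice["dimensions"])}' if choice["dimensions"] else ""
    return f'SELECT {", ".join(select)} FROM rewards{group}'

print(answer("Which cities earn the most points?"))
```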


DNB, Norway’s largest bank, began building a cloud-based self-service Data & AI Platform in 2017, delivering its first capabilities by 2018. Initially focused on ML and analytics, the platform expanded in 2021 to include traditional data warehouses and modern data products. Snowflake was officially launched in 2023 after a successful PoC and pilot.

In this talk, we’ll walk through our journey.

Where We Came From

• Discover how legacy data warehouse bottlenecks sparked a shift toward decentralised, self-service data capabilities.

Where We Are

• Learn how DNB enabled teams to own and operate their data products through:
  • Streamlined domain onboarding
  • "DevOps for data" and "SQL as code" practices
  • Automated services for historisation (PSA), sketched below
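
DNB's historisation service is not described in detail here; in general, a persistent staging area (PSA) appends every new or changed source row with a load timestamp so the full history of each record survives. A generic sketch of that pattern, with invented table and column names rather than DNB's actual implementation:

```python
# Sketch of one historisation batch into a persistent staging area (PSA):
# append rows that are new or whose attributes changed since the latest
# stored version. Table and column names are illustrative.
PSA_LOAD_SQL = """
INSERT INTO psa.customers (customer_id, name, address, loaded_at, row_hash)
SELECT s.customer_id, s.name, s.address, CURRENT_TIMESTAMP, s.row_hash
FROM staging.customers AS s
LEFT JOIN (
    SELECT customer_id, row_hash,
           ROW_NUMBER() OVER (PARTITION BY customer_id
                              ORDER BY loaded_at DESC) AS rn
    FROM psa.customers
) AS latest
  ON latest.customer_id = s.customer_id AND latest.rn = 1
WHERE latest.customer_id IS NULL      -- brand-new key
   OR latest.row_hash <> s.row_hash   -- changed attributes
"""

def load_psa(connection) -> None:
    """Run one batch. With "SQL as code", this statement lives in version
    control and is deployed like any other artifact."""
    with connection.cursor() as cur:
        cur.execute(PSA_LOAD_SQL)
    connection.commit()
```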

Where We’re Going

• Explore how DNB is evolving its data mesh with:
  • A hybrid model of decentralised and centralised data products
  • Generative AI, metadata automation, and development support
  • Enhanced tooling and services for data consumers

Summary

In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Nick Schrock, CTO and founder of Dagster Labs, to discuss Compass, a Slack-native, agentic analytics system designed to keep data teams connected with business stakeholders. Nick shares his journey from initial skepticism to embracing agentic AI as model and application advancements made it practical for governed workflows, and explores how Compass redefines the relationship between data teams and stakeholders by shifting analysts into steward roles, capturing and governing context, and integrating with Slack, where collaboration already happens. The conversation covers organizational observability through Compass's conversational system of record, cost-control strategies, and the implications of agentic collaboration for Conway's Law, as well as what's next for Compass and Nick's optimistic views on AI-accelerated software engineering.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed: flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI engineering, streaming: Prefect runs it all, from ingestion to activation, in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.

Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey, and today I'm interviewing Nick Schrock about building an AI analyst that keeps data teams in the loop.

Interview
• Introduction
• How did you get involved in the area of data management?
• Can you describe what Compass is and the story behind it?
• Context repository structure
• How to keep it relevant / avoid sprawl and duplication
• Providing guardrails
• How does a tool like Compass help provide feedback/insights back to the data teams?
• Preparing the data warehouse for effective introspection by the AI
• LLM selection
• Cost management
• Caching/materializing ad-hoc queries
• Why Slack and enterprise chat are important to B2B software
• How AI is changing stakeholder relationships
• How not to overpromise AI capabilities
• How does Compass relate to BI?
• How does Compass relate to Dagster and data infrastructure?
• What are the most interesting, innovative, or unexpected ways that you have seen Compass used?
• What are the most interesting, unexpected, or challenging lessons that you have learned while working on Compass?
• When is Compass the wrong choice?
• What do you have planned for the future of Compass?

Contact Info
• LinkedIn

Parting Question
• From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
• Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
• Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
• If you've learned something or tried out a project from the show, then tell us about it! Email [email protected] with your story.

Links
• Dagster
• Dagster Labs
• Dagster Plus
• Dagster Compass
• Chris Bergh DataOps Episode
• Rise of Medium Code blog post
• Context Engineering
• Data Steward
• Information Architecture
• Conway's Law
• Temporal durable execution framework

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA.

In a dynamic world where your data needs are evolving as quickly as the data warehousing solutions on the market, flexibility is key to unlocking the full potential of your data at the best possible cost-performance ratio. Gameloft, a leading mobile and console game developer that operates a petabyte-scale modern data architecture, completely migrated its data warehouse from Snowflake to Google BigQuery and got leaner, faster, and more flexible along the way.

ClickHouse is built for analytics at very large scale (observability, data warehousing, real-time processing) while guaranteeing ultra-low latency. In this demo, we will build a complete pipeline capable of ingesting millions of events per second (logs, metrics, or application data) and storing them efficiently in ClickHouse. You will learn how to model massive datasets, use materialized views to speed up aggregations, apply complex filters on the fly, and define indexes to optimize query performance.
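
As a concrete taste of the demo's ingredients, here is a minimal sketch using the clickhouse-driver Python client; the table, columns, and per-minute rollup are illustrative assumptions, not the presenter's actual schema:

```python
# Minimal sketch of the demo's building blocks with the clickhouse-driver
# Python client. Table, columns, and the rollup granularity are illustrative.
from datetime import datetime
from clickhouse_driver import Client

client = Client(host="localhost")

# Raw events land in a MergeTree table ordered for time-range scans.
client.execute("""
    CREATE TABLE IF NOT EXISTS events (
        ts      DateTime,
        service LowCardinality(String),
        level   LowCardinality(String),
        message String
    ) ENGINE = MergeTree
    ORDER BY (service, ts)
""")

# A materialized view maintains per-minute counts at insert time, so
# dashboards read a small rollup instead of scanning raw events.
client.execute("""
    CREATE MATERIALIZED VIEW IF NOT EXISTS events_per_minute
    ENGINE = SummingMergeTree
    ORDER BY (service, minute)
    AS SELECT service, toStartOfMinute(ts) AS minute, count() AS n
    FROM events
    GROUP BY service, minute
""")

# Batched inserts keep ingest throughput high.
client.execute(
    "INSERT INTO events (ts, service, level, message) VALUES",
    [(datetime.now(), "api", "INFO", "request served")],
)
```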


Unlocking dbt: Design and Deploy Transformations in Your Cloud Data Warehouse

Master the art of data transformation with the second edition of this trusted guide to dbt. Building on the foundation of the first edition, this updated volume offers a deeper, more comprehensive exploration of dbt's capabilities, whether you're new to the tool or looking to sharpen your skills. It dives into the latest features and techniques, equipping you with the tools to create scalable, maintainable, production-ready data transformation pipelines.

Unlocking dbt, Second Edition introduces key advancements, including the semantic layer, which allows you to define and manage metrics at scale, and dbt Mesh, which empowers organizations to orchestrate decentralized data workflows with confidence. You'll also explore more advanced testing capabilities, expanded CI/CD and deployment strategies, and enhancements in documentation, such as the newly introduced dbt Catalog. As in the first edition, you'll learn how to harness dbt's power to transform raw data into actionable insights while incorporating software engineering best practices like code reusability, version control, and automated testing. From configuring projects with the dbt Platform or open source dbt to mastering advanced transformations using SQL and Jinja, this book provides everything you need to tackle real-world challenges effectively.

What You Will Learn
• Understand dbt and its role in the modern data stack
• Set up projects using both the cloud-hosted dbt Platform and the open source project
• Connect dbt projects to cloud data warehouses
• Build scalable models in SQL and Python
• Configure development, testing, and production environments
• Capture reusable logic with Jinja macros
• Incorporate version control with your data transformation code
• Seamlessly connect your projects using dbt Mesh
• Build and manage a semantic layer using dbt
• Deploy dbt using CI/CD best practices

Who This Book Is For
Current and aspiring data professionals, including architects, developers, analysts, engineers, data scientists, and consultants who are beginning the journey of using dbt as part of their data pipeline's transformation layer. Readers should have a foundational knowledge of writing basic SQL statements, development best practices, and working with data in an analytical context such as a data warehouse.
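
As a small taste of the CI/CD material the book covers, dbt-core (1.5 and later) ships a programmatic runner, so deployments can invoke builds from Python instead of shelling out; a minimal sketch with an illustrative selector:

```python
# Minimal sketch: running dbt from Python with the programmatic runner in
# dbt-core 1.5+. The selector "staging+" is illustrative.
from dbt.cli.main import dbtRunner, dbtRunnerResult

dbt = dbtRunner()

# Equivalent to `dbt build --select staging+` on the command line.
res: dbtRunnerResult = dbt.invoke(["build", "--select", "staging+"])

if not res.success:
    raise SystemExit(f"dbt build failed: {res.exception}")

# Per-node results: model/test name and its final status.
for r in res.result:
    print(f"{r.node.name}: {r.status}")
```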

Minus Three Tier: Data Architecture Turned Upside Down

Every data architecture diagram out there makes it abundantly clear who's in charge: at the bottom sits the analyst, above that an API server, and at the very top the mighty data warehouse. This pattern is so ingrained that we never question its necessity, despite issues like slow response times, multi-level scaling problems, and massive cost.

But there is another way: separating storage from compute lets query processing move closer to people, leading to much snappier responses, natural scaling through client-side query processing, and much lower cost.

This talk discusses how modern data engineering paradigms such as storage decomposition, single-node query processing, and lakehouse formats enable a radical departure from the tired three-tier architecture. By inverting the architecture we can put users' needs first, relying on commoditised components like object storage to build fast, scalable, and cost-effective solutions.
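
The talk does not prescribe an engine, but single-node engines such as DuckDB illustrate the inverted architecture well: the analyst's machine queries lakehouse files on commodity object storage directly, with no warehouse tier in between. A sketch, with a hypothetical bucket path:

```python
# Sketch of the inverted ("minus three tier") architecture: the analyst's
# machine queries Parquet files on object storage directly, with no API
# server or central warehouse between them. The bucket path is hypothetical.
import duckdb

con = duckdb.connect()          # in-process, single-node engine
con.execute("INSTALL httpfs")   # S3/HTTP access for DuckDB
con.execute("LOAD httpfs")

# Credentials (if the bucket is private) would be configured via
# environment variables or CREATE SECRET; omitted here.
result = con.execute("""
    SELECT region, count(*) AS orders, sum(amount) AS revenue
    FROM read_parquet('s3://example-bucket/orders/*.parquet')
    GROUP BY region
    ORDER BY revenue DESC
""").fetchall()

for region, orders, revenue in result:
    print(region, orders, revenue)
```

Because compute runs on the client, every additional analyst brings their own CPU, which is the natural scaling the abstract describes.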

In this session, we’ll share our transformation journey from a traditional, centralised data warehouse to a modern data lakehouse architecture, powered by data mesh principles. We’ll explore the challenges we faced with legacy systems, the strategic decisions that led us to adopt a lakehouse model, and how data mesh enabled us to decentralise ownership, improve scalability, and enhance data governance.

Learn how to transform your data warehouse for AI/LLM readiness while making advanced analytics accessible to all team members, regardless of technical expertise. 

We'll share practical approaches to adapting data infrastructure and building user-friendly AI tools that lower the barrier to entry for sophisticated analysis. 

Key takeaways include implementation best practices, challenges encountered, and strategies for balancing technical requirements with user accessibility. Ideal for data teams looking to democratize AI-powered analytics in their organization.