This is the introductory course for developers jumping into dbt! We will dive into data modeling, sources, data tests, documentation, and deployment. As an instructor-led course, you'll have the chance to learn with peers, ask questions, and get live coaching and feedback.
After this course, you will be able to:
- Explain the foundational concepts of dbt
- Build data models and a DAG to visualize dependencies
- Configure tests and add documentation to your models
- Deploy your dbt project to refresh data models on a schedule
Prerequisites: Intermediate SQL knowledge
What to bring: You must bring your own laptop to complete the hands-on exercises. We will provide the sandbox environments for dbt and the data platform.
Duration: 4 hours
Fee: $400
Trainings and certifications are not offered separately and must be purchased with a Coalesce pass. Trainings and certifications are not available for Coalesce Online passes.
Get certified at Coalesce! Choose from two certification exams:
The dbt Analytics Engineering Certification Exam is designed to evaluate your ability to:
- Build, test, and maintain models to make data accessible to others
- Use dbt to apply engineering principles to analytics infrastructure
We recommend that you have at least SQL proficiency and 6+ months of experience working in dbt (self-hosted dbt or the dbt platform) before attempting the exam.
The dbt Architect Certification Exam assesses your ability to design secure, scalable dbt implementations, with a focus on:
- Environment orchestration
- Role-based access control
- Integrations with other tools
- Collaborative development workflows aligned with best practices
What to expect: Your purchase includes one attempt at one of the two in-person exams at Coalesce. You will let the proctor know which certification you are sitting for. Please arrive on time; this is a closed-door certification, and attendees will not be let in after the doors are closed.
What to bring: You will need to bring your own laptop to take the exam.
Duration: 2 hours
Fee: $100
Trainings and certifications are not offered separately and must be purchased with a Coalesce pass. Trainings and certifications are not available for Coalesce Online passes. If you no-show your certification, you will not be refunded.
Master AI-powered analytics in Snowflake. Learn how to leverage Snowflake's powerful AI SQL functions for advanced analytics on unstructured, time-series, and geospatial data. We'll also demonstrate how to use Semantic Views to bridge the gap between data and business understanding.
Learn how to efficiently scale and manage data engineering pipelines with Snowflake's latest native capabilities for transformations and orchestration with SQL, Python, and dbt Projects on Snowflake. Join us for new product and feature overviews, best practices, and live demos.
Data silos slow down insights, and moving data across systems can be costly and complex. With Microsoft Fabric Mirroring, you can now make your external data instantly available inside Fabric—without complex ETL pipelines or duplication overhead.
This session will introduce the core concepts of Fabric Mirroring, show how it works behind the scenes, and demonstrate how to quickly connect to popular sources like SQL, Snowflake, and even files such as Excel.
We’ll explore how Mirroring integrates with OneLake, the SQL Endpoint, and Power BI, giving you a single, consistent experience across your analytics stack.
Whether you’re a business analyst, data engineer, or architect, you’ll walk away understanding how Fabric Mirroring helps you simplify data access, reduce latency, and unlock more value from your existing investments.
In this session, discover how organizations extract actionable insights from text, documents, images, and audio, all within Snowflake Cortex AI. This session reveals practical techniques for building integrated multimodal analytics pipelines using Cortex AI SQL functions and Document AI. Learn how to orchestrate complex, multi-step data analysis across previously siloed data types, simply, with SQL.
A critical share of strategic data lives in production and mission-critical systems (such as IBM i, Oracle, SAP, SQL Server...). Extracting it without disrupting production is one of the major obstacles to modernization initiatives.
This demonstration will show how data replication makes it possible to:
• Stream data in real time with no impact on operations,
• Consolidate data in Snowflake, BigQuery, or data lakes for analytics and AI,
• Reduce integration costs and limit project risks.
A 30-minute session with a demo and time for Q&A.
With over 50,000 active users, discover how we transformed enterprise data interaction through Snowflake's Cortex Analyst API with SiemensGPT. Our plugin architecture, powered by the ReAct agent model, converts natural language into SQL queries and dynamic visualizations, orchestrating everything through a unified interface. Beyond productivity gains, this solution democratizes data access across Siemens, enabling employees at all levels to derive business insights through simple conversations.
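To make the pattern above concrete (this is a generic sketch, not the SiemensGPT or Cortex Analyst implementation), a ReAct-style text-to-SQL loop can be reduced to a few lines of Python. The `llm`, `run_sql`, and `render` callables are hypothetical stand-ins for the model endpoint, the warehouse query layer, and the visualization layer.

```python
# Illustrative ReAct-style text-to-SQL loop: a sketch of the pattern,
# not the SiemensGPT or Cortex Analyst implementation.
# The callables passed in (llm, run_sql, render) are hypothetical stand-ins.

def answer_question(question, schema_doc, llm, run_sql, render, max_steps=3):
    """Reason -> generate SQL -> execute -> observe, feeding errors back to the model."""
    history = []
    for _ in range(max_steps):
        prompt = (
            "Translate the business question into SQL.\n"
            f"Schema:\n{schema_doc}\n"
            f"Question: {question}\n"
            f"Earlier attempts and errors: {history}\n"
        )
        sql = llm(prompt)                # act: propose a query
        try:
            rows = run_sql(sql)          # observe: run it against the warehouse
        except Exception as err:
            history.append({"sql": sql, "error": str(err)})  # let the model retry
            continue
        return render(rows)              # e.g. a table or dynamic visualization
    raise RuntimeError("No valid query produced within the step budget")
```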
Master AI-powered analytics in Snowflake. Learn how to leverage Snowflake's powerful AI SQL functions for advanced analytics on unstructured data. We'll also demonstrate how to use Semantic Views to bridge the gap between data and business understanding.
In this talk, we present our Proof of Concept (PoC) for Cortex Analyst on Snowflake, enabling interactive queries on complex geospatial data enriched with sociodemographic, market, and infrastructure information. An AI-powered text-to-SQL interface translates natural language queries into SQL in real time, with results shown in tables and visualizations. All of this leverages Snowflake’s built-in security and governance features.
If your job search feels like tab-hell—applications everywhere, prep scattered, follow-ups forgotten—this episode is your reset. I walk you through three small but mighty AI agents you can build in an afternoon:
• Application Tracker Agent — paste a job link → extract company, title, pay, location → auto-log to Notion/Sheets → set a 7-day follow-up.
• Interview Prep Agent — feed the JD + your resume → get tailored behavioral questions, SQL/case drills, and a tight “Tell me about yourself.”
• Follow-Up Agent — generate a thank-you in your voice, log the interview date, and nudge you if you haven’t heard back.
You’ll learn the agent essentials—planning, memory, feedback loops—plus a copy-and-paste framework, example prompts, and quality checks so your agents save time instead of making noise. Chapters below. Show notes include my working templates, prompts, and affiliate tools I actually use (Riverside for recording, RSS.com for hosting, Sider for research). Rate the show if this helped—it means a lot.
Primary keywords: ai agents, job search, interview prep, application tracking, follow-up emails
Secondary keywords: Notion, Google Sheets, SQL interview, behavioral questions, automation, productivity, podseo, career tools
Links & Resources:
Recording Partner: Riverside → Sign up here (affiliate)
Host Your Podcast: RSS.com (affiliate)
Research Tools: Sider.ai (affiliate)
Join the Newsletter: Free Email Newsletter to receive practical AI tools weekly.
Join the Discussion (comments hub): https://mukundansankar.substack.com/notes
🔗 Connect with Me:
Website: Data & AI with Mukundan
Twitter/X: @sankarmukund475
LinkedIn: Mukundan Sankar
YouTube: Subscribe
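For a rough idea of what the Application Tracker Agent from the episode above could look like, here is a minimal Python sketch under stated assumptions: `extract_job_fields` is a hypothetical LLM-backed parser, and results are appended to a local CSV instead of Notion or Google Sheets.

```python
# Sketch of an "application tracker" agent. extract_job_fields is a
# hypothetical LLM helper; the log is a local CSV to keep this self-contained.
import csv
import os
from datetime import date, timedelta

def track_application(job_url, extract_job_fields, log_path="applications.csv"):
    """Extract key fields from a job link and log them with a 7-day follow-up date."""
    fields = extract_job_fields(job_url)   # expected keys: company, title, pay, location
    row = {
        "url": job_url,
        "company": fields.get("company", ""),
        "title": fields.get("title", ""),
        "pay": fields.get("pay", ""),
        "location": fields.get("location", ""),
        "follow_up_on": (date.today() + timedelta(days=7)).isoformat(),
    }
    write_header = not os.path.exists(log_path)        # add the header on first use
    with open(log_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row))
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```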
Ace the DP-300 Exam with this essential study companion, chock-full of insights and tips you cannot find online. This book will help you build a comprehensive understanding of Azure SQL systems and their role in supporting business solutions, and it will equip you with the mental models and technical knowledge needed to confidently answer exam questions. Structured to align with Microsoft’s published study guide, the book spans five major sections that correspond to the skills measured by the exam, covering topics vital to modern cloud operations, including HA/DR, security, compliance, performance, and scalability. You’ll also learn about the ways cloud operations have changed the focus of operating database systems from task execution to platform configuration—and how to configure your data platforms to meet this new reality. By the end of this book, you’ll be prepared to navigate exam scenarios with finesse, pass the exam with confidence, and advance in your career with a solid foundation of knowledge.
What You Will Learn:
- Maximize your ability to benefit from the online learning tools for Exam DP-300
- Gain depth and context for Azure SQL technical solutions relevant to Exam DP-300
- Boost your confidence in Azure SQL Database skills
- Extend your on-premises SQL Server skill set into the Azure SQL cloud
- Enhance your overall understanding of Azure SQL administration and operations
- Develop your Azure SQL skill set to increase your value as an employee or contractor
- Adopt a new mindset for cloud-based solutions versus on-premises solutions
Who This Book Is For:
Anyone planning to take the DP-300: Administering Microsoft Azure SQL Solutions exam, and those who wish to understand Azure SQL and how to successfully migrate and manage SQL solutions using all Azure SQL technologies.
Master the art of data transformation with the second edition of this trusted guide to dbt. Building on the foundation of the first edition, this updated volume offers a deeper, more comprehensive exploration of dbt’s capabilities—whether you're new to the tool or looking to sharpen your skills. It dives into the latest features and techniques, equipping you with the tools to create scalable, maintainable, and production-ready data transformation pipelines.
Unlocking dbt, Second Edition introduces key advancements, including the semantic layer, which allows you to define and manage metrics at scale, and dbt Mesh, empowering organizations to orchestrate decentralized data workflows with confidence. You’ll also explore more advanced testing capabilities, expanded CI/CD and deployment strategies, and enhancements in documentation—such as the newly introduced dbt Catalog. As in the first edition, you’ll learn how to harness dbt’s power to transform raw data into actionable insights, while incorporating software engineering best practices like code reusability, version control, and automated testing. From configuring projects with the dbt Platform or open source dbt to mastering advanced transformations using SQL and Jinja, this book provides everything you need to tackle real-world challenges effectively.
What You Will Learn:
- Understand dbt and its role in the modern data stack
- Set up projects using both the cloud-hosted dbt Platform and open source dbt
- Connect dbt projects to cloud data warehouses
- Build scalable models in SQL and Python
- Configure development, testing, and production environments
- Capture reusable logic with Jinja macros
- Incorporate version control with your data transformation code
- Seamlessly connect your projects using dbt Mesh
- Build and manage a semantic layer using dbt
- Deploy dbt using CI/CD best practices
Who This Book Is For:
Current and aspiring data professionals, including architects, developers, analysts, engineers, data scientists, and consultants who are beginning the journey of using dbt as part of their data pipeline’s transformation layer. Readers should have a foundational knowledge of writing basic SQL statements, development best practices, and working with data in an analytical context such as a data warehouse.
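As a small taste of the "models in SQL and Python" material the book covers, below is a minimal dbt Python model using dbt's documented `model(dbt, session)` interface. The `stg_orders` upstream model and `status` column are hypothetical, and the DataFrame flavor depends on your warehouse adapter (Snowpark-style filtering shown here).

```python
# models/marts/completed_orders.py
# Minimal dbt Python model sketch using dbt's `model(dbt, session)` interface.
# The `stg_orders` model name and `status` column are hypothetical.

def model(dbt, session):
    dbt.config(materialized="table")      # materialize like any other dbt model
    orders = dbt.ref("stg_orders")        # pull an upstream model in as a DataFrame
    # On Snowflake this is a Snowpark DataFrame; keep only completed orders.
    return orders.filter("status = 'completed'")
```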
Large‑language‑model agents are only as useful as the context and tools they can reach.
Anthropic’s Model Context Protocol (MCP) proposes a universal, bidirectional interface that turns every external system—SQL databases, Slack, Git, web browsers, even your local file‑system—into first‑class “context providers.”
In just 30 minutes we’ll step from high‑level buzzwords to hands‑on engineering details:
- How MCP’s JSON‑RPC message format, streaming channels, and version‑negotiation work under the hood.
- Why per‑tool sandboxing via isolated client processes hardens security (and what happens when an LLM tries rm ‑rf /).
- Techniques for hierarchical context retrieval that stretch a model’s effective window beyond token limits.
- Real‑world patterns for accessing multiple tools—Postgres, Slack, GitHub—and plugging MCP into GenAI applications.
Expect code snippets and lessons from early adoption.
You’ll leave ready to wire your own services into any MCP‑aware model and level‑up your GenAI applications—without the N×M integration nightmare.
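For a sense of what "under the hood" means here, the sketch below builds an MCP-style JSON-RPC 2.0 tool-call message in Python. The field layout follows the general `tools/call` pattern; treat the exact schema as an assumption and check the protocol spec for your target version.

```python
# Sketch of the JSON-RPC 2.0 framing used by MCP-style tool calls.
# Field names follow the general `tools/call` pattern; consult the protocol
# spec for the exact schema of the version you target.
import json

def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC request asking an MCP server to invoke one tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Example: ask a (hypothetical) SQL context provider to run a read-only query.
message = make_tool_call(1, "query_database", {"sql": "SELECT count(*) FROM orders"})
print(message)
```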
Are AI code generators delivering SQL that "looks right but works wrong" for your data engineering challenges? Is your AI generating brilliant-sounding but functionally flawed results?
The critical bottleneck isn't the AI's intelligence; it's the missing context.
In this talk, we will put things in context and reveal how providing AI with structured, deep understanding—from data semantics and lineage to user intent and external knowledge—is the true paradigm shift.
We'll explore how this context engineering powers the rise of dependable AI agents and leverages techniques like Retrieval-Augmented Generation (RAG) to move beyond mere text generation towards trustworthy, intelligent automation across all domains.
This limitation highlights a broader challenge across AI applications: the need for systems to possess a deep understanding of all relevant signals, ranging from environmental cues and user history to explicit intent, to achieve reliable and meaningful operation.
Join us for real-world, practical case studies directly from data engineers that demonstrate precisely how to unlock this transformative power and achieve truly reliable AI.
Data is one of the most valuable assets in any organisation, but accessing and analysing it has been limited to technical experts. Business users often rely on predefined dashboards and data teams to extract insights, creating bottlenecks and slowing decision-making.
This is changing with the rise of Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). These technologies are redefining how organisations interact with data, allowing users to ask complex questions in natural language and receive accurate, real-time insights without needing deep technical expertise.
In this session, I’ll explore how LLMs and RAG are driving true data democratisation by making analytics accessible to everyone, enabling real-time insights with AI-powered search and retrieval and overcoming traditional barriers like SQL, BI tool complexity, and rigid reporting structures.
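A minimal retrieval-augmented flow of the kind described above can be sketched in a few lines; `retrieve` and `llm` are hypothetical callables standing in for your search index and model endpoint.

```python
# Minimal retrieval-augmented answering sketch. `retrieve` and `llm` are
# hypothetical callables (a vector/keyword search over internal docs and
# an LLM completion endpoint).

def answer_with_rag(question, retrieve, llm, top_k=5):
    """Ground the model's answer in the most relevant internal documentation."""
    passages = retrieve(question, top_k=top_k)   # e.g. table docs, metric definitions
    context = "\n\n".join(passages)
    prompt = (
        "Answer the business question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)
```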
In this episode, we're joined by Sam Debruyn and Dorian Van den Heede, who reflect on their talks at SQL Bits 2025 and dive into the technical content they presented. Sam walks through how dbt integrates with Microsoft Fabric, explaining how it improves lakehouse and warehouse workflows by adding modularity, testing, and documentation to SQL development. He also touches on Fusion’s SQL optimization features and how it compares to tools like SQLMesh. Dorian shares his MLOps demo, which simulates beating football bookmakers using historical data, showing how to build a full pipeline with Azure ML, from feature engineering to model deployment. They discuss the role of Python modeling in dbt, orchestration with Azure ML, and the practical challenges of implementing MLOps in real-world scenarios. Toward the end, they explore how AI tools like Copilot are changing the way engineers learn and debug code, raising questions about explainability, skill development, and the future of junior roles in tech. It’s a rich conversation covering dbt, MLOps, Python, Azure ML, and the evolving role of AI in engineering.
As organizations increasingly adopt data lake architectures, analytics databases face significant integration challenges beyond simple data ingestion. This talk explores the complex technical hurdles encountered when building robust connections between analytics engines and modern data lake formats.
We'll examine critical implementation challenges, including the absence of native library support for formats like Delta Lake, which necessitates expansion into new programming languages such as Rust to achieve optimal performance. The session explores the complexities of managing stateful systems, addressing caching inconsistencies, and reconciling state across distributed environments.
A key focus will be on integrating with external catalogs while maintaining data consistency and performance - a challenge that requires careful architectural decisions around metadata management and query optimization. We'll explore how these technical constraints impact system design and the trade-offs involved in different implementation approaches.
Attendees will gain a practical understanding of the engineering complexity behind seamless data lake integration and actionable approaches to common implementation obstacles.
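One concrete example of the Rust-backed tooling alluded to above is the delta-rs project and its Python bindings (the `deltalake` package); a minimal read path might look like the sketch below, with a hypothetical table location.

```python
# Reading a Delta Lake table through the Rust-backed delta-rs bindings
# (`pip install deltalake`). The table path is a hypothetical placeholder.
from deltalake import DeltaTable

table = DeltaTable("/data/lake/events")   # hypothetical local table location
print(table.version())                    # current table version
print(table.files()[:5])                  # a few of the underlying Parquet files
df = table.to_pandas()                    # materialize for local analysis
print(df.head())
```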
The data engineer’s role is shifting in the AI era. With LLMs and agents as new consumers, the challenge moves from SQL and schemas to semantics, context engineering, and making databases LLM-friendly. This session explores how data engineers can design semantic layers, document relationships, and expose data through MCPs and AI interfaces. We’ll highlight new skills required, illustrate pipelines that combine offline and online LLM processing, and show how data can serve business users, developers, and AI agents alike.
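As an illustration of the kind of machine-readable context this session argues data engineers should curate, here is a toy semantic-layer entry; the structure and names are illustrative rather than any specific product's schema.

```python
# A toy semantic-layer entry: machine-readable context an LLM agent can use
# instead of guessing from raw column names. Structure and names are
# illustrative, not a specific product's schema.
import json

SEMANTIC_MODEL = {
    "entity": "orders",
    "description": "One row per customer order, including cancellations.",
    "grain": "order_id",
    "joins": [{"to": "customers", "on": "orders.customer_id = customers.id"}],
    "metrics": {
        "revenue": {
            "sql": "SUM(order_total)",
            "filters": ["status != 'cancelled'"],
            "description": "Gross revenue from non-cancelled orders.",
        }
    },
}

def context_for_llm(model=SEMANTIC_MODEL):
    """Serialize the semantic model so it can be placed in a prompt or served via MCP."""
    return json.dumps(model, indent=2)

print(context_for_llm())
```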
How do you move data from thousands of SQL databases to a data lake with no impact on OLTP? We'll explore the challenges we faced while migrating legacy batch data flows to an event-based architecture. A key challenge for our data engineers was the multi-tenant architecture of our backend, meaning that we had to handle the same SQL schema on over 15k databases. We'll present the journey employing Debezium, Azure Event Hub, Delta Live Tables, and the extra tooling we had to put in place.
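As a rough sketch of the per-tenant plumbing such a migration implies (not the team's actual configuration), registering one Debezium SQL Server connector through Kafka Connect's REST API might look like this; property names approximate the Debezium 2.x connector, and all hosts, credentials, and topics are placeholders.

```python
# Sketch of registering one Debezium SQL Server connector via Kafka Connect's
# REST API -- the kind of per-tenant config that has to be templated thousands
# of times in a multi-tenant setup. Property names approximate the Debezium 2.x
# SQL Server connector; hostnames, credentials, and topics are placeholders.
import json
import urllib.request

def register_connector(tenant_db, connect_url="http://connect:8083"):
    config = {
        "name": f"cdc-{tenant_db}",
        "config": {
            "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
            "database.hostname": "tenant-sql.internal",       # placeholder
            "database.port": "1433",
            "database.user": "cdc_reader",                    # placeholder
            "database.password": "<secret>",                  # placeholder
            "database.names": tenant_db,
            "topic.prefix": f"tenant.{tenant_db}",
            "table.include.list": "dbo.orders,dbo.customers", # placeholder tables
            "schema.history.internal.kafka.bootstrap.servers": "eventhubs-ns.servicebus.windows.net:9093",
            "schema.history.internal.kafka.topic": f"schema-history.{tenant_db}",
        },
    }
    req = urllib.request.Request(
        f"{connect_url}/connectors",
        data=json.dumps(config).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req).status
```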