talk-data.com

Topic

Azure

Microsoft Azure

cloud cloud_provider microsoft infrastructure

723 tagged

Activity Trend

278 peak/qtr (2020-Q1 to 2026-Q1)

Activities

723 activities · Newest first

Hands-On MySQL Administration

Geared to intermediate- to advanced-level DBAs and IT professionals looking to enhance their MySQL skills, this guide provides a comprehensive overview of how to manage and optimize MySQL databases. You'll learn how to create databases and implement backup and recovery, security configurations, high availability, scaling techniques, and performance tuning. Using practical techniques, tips, and real-world examples, authors Arunjith Aravindan and Jeyaram Ayyalusamy show you how to deploy and manage MySQL, Amazon RDS, Amazon Aurora, and Azure MySQL. By the end of the book, you'll have the knowledge and skills necessary to administer, manage, and optimize MySQL databases effectively.

Design and implement a scalable and reliable database infrastructure using MySQL 8 on premises and in the cloud
Install and configure software, manage user accounts, and optimize database performance
Use backup and recovery strategies, security measures, and high availability solutions
Apply best practices for database schema design, indexing strategies, and replication techniques
Implement advanced database features and techniques such as replication, clustering, load balancing, and high availability
Troubleshoot common issues and errors, using diagnostic tools and techniques to identify and resolve problems quickly and efficiently
Facilitate major MySQL upgrades, including MySQL 5.7 to MySQL 8
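
As a small, hedged illustration of the day-to-day administration tasks the book covers, the sketch below checks replication health on a MySQL 8 replica from Python. It is not taken from the book; it assumes the mysql-connector-python package is installed, and the host and credentials are hypothetical placeholders.

```python
# Minimal sketch: check replica health on MySQL 8 (not from the book).
# Assumes mysql-connector-python; host and credentials below are placeholders.
import mysql.connector

conn = mysql.connector.connect(
    host="replica.example.internal",  # hypothetical replica host
    user="admin",
    password="change-me",
)
cur = conn.cursor(dictionary=True)
cur.execute("SHOW REPLICA STATUS")  # MySQL 8.0.22+; older servers use SHOW SLAVE STATUS
row = cur.fetchone()
if row is None:
    print("This server is not configured as a replica")
else:
    print("IO thread running:", row["Replica_IO_Running"])
    print("SQL thread running:", row["Replica_SQL_Running"])
    print("Seconds behind source:", row["Seconds_Behind_Source"])
cur.close()
conn.close()
```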

Summary

Data lakehouse architectures have been gaining significant adoption. To accelerate adoption in the enterprise, Microsoft has created the Fabric platform, based on their OneLake architecture. In this episode Dipti Borkar shares her experiences working on the Fabric product team and explains the various use cases for the Fabric service.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.

Your host is Tobias Macey and today I'm interviewing Dipti Borkar about her work on Microsoft Fabric and performing analytics on data without…

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what Microsoft Fabric is and the story behind it?
Data lakes in various forms have been gaining significant popularity as a unified interface to an organization's analytics. What are the motivating factors that you see for that trend?
Microsoft has been investing heavily in open source in recent years, and the Fabric platform relies on several open components. What are the benefits of layering on top of existing technologies rather than building a fully custom solution?
What are the elements of Fabric that were engineered specifically for the service?
What are the most interesting/complicated integration challenges?
How has your prior experience with Ahana and Presto informed your current work at Microsoft?
AI plays a substantial role in the product. What are the benefits of embedding Copilot into the data engine?
What are the challenges in terms of safety and reliability?
What are the most interesting, innovative, or unexpected ways that you have seen the Fabric platform used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on data lakes generally, and Fabric specifically?
When is Fabric the wrong choice?
What do you have planned for the future of data lake analytics?

Contact Info

LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

Microsoft Fabric Ahana episode DB2 Distributed Spark Presto Azure Data MAD Landscape

Podcast Episode ML Podcast Episode

Tableau dbt Medallion Architecture Microsoft Onelake ORC Parquet Avro Delta Lake Iceberg

Podcast Episode

Hudi

Podcast Episode

Hadoop PowerBI

Podcast Episode

Velox Gluten Apache XTable GraphQL Formula 1 McLaren

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA. Sponsored By: Starburst.

This episode is brought to you by Starburst - an end-to-end data lakehouse platform for data engineers who are battling to build and scale high quality data pipelines on the data lake. Powered by Trino…

Announcing Databricks Clean Rooms with Live Demo. Presented by Matei Zaharia and Darshana Sivakumar

Speakers: Matei Zaharia, Original Creator of Apache Spark™ and MLflow; Chief Technologist, Databricks. Darshana Sivakumar, Staff Product Manager, Databricks.

Organizations are looking for ways to securely exchange their data and collaborate with external partners to foster data-driven innovations. In the past, organizations had limited data sharing solutions, relinquishing control over how their sensitive data was shared with partners and having little to no visibility into how their data was consumed. This created the risk of data misuse and data privacy breaches. Customers who tried using other clean room solutions have told us these solutions are limited and do not meet their needs, as they often require all parties to copy their data into the same platform, do not allow sophisticated analysis beyond basic SQL queries, and offer limited visibility or control over their data.

Organizations need an open, flexible, and privacy-safe way to collaborate on data, and Databricks Clean Rooms meets these critical needs.

See a demo of Databricks Clean Rooms, now in Public Preview on AWS + Azure

The Definitive Guide to KQL: Using Kusto Query Language for operations, defending, and threat hunting

Turn the avalanche of raw data from Azure Data Explorer, Azure Monitor, Microsoft Sentinel, and other Microsoft data platforms into actionable intelligence with KQL (Kusto Query Language). Experts in information security and analysis guide you through what it takes to automate your approach to risk assessment and remediation, speeding up detection time while reducing manual work using KQL. This accessible and practical guide, designed for a broad range of people with varying experience in KQL, will quickly make KQL second nature for information security. Solve real problems with Kusto Query Language and build your competitive advantage:

Learn the fundamentals of KQL: what it is and where it is used
Examine the anatomy of a KQL query
Understand why data summation and aggregation is important
See examples of data summation, including count, countif, and dcount
Learn the benefits of moving from raw data ingestion to a more automated approach for security operations
Unlock how to write efficient and effective queries
Work with advanced KQL operators, advanced data strings, and multivalued strings
Explore KQL for day-to-day admin tasks, performance, and troubleshooting
Use KQL across Azure, including app services and function apps
Delve into defending and threat hunting using KQL
Recognize indicators of compromise and anomaly detection
Learn to access and contribute to hunting queries via GitHub and workbooks via Microsoft Entra ID
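
As a loose illustration of the kind of summarization the book teaches (count, countif, dcount), the sketch below runs a KQL query from Python with the azure-kusto-data client. It is not an example from the book: the cluster URL, database, table, and column names are hypothetical placeholders, and Azure CLI sign-in is assumed for authentication.

```python
# Sketch only: run a KQL summarization against Azure Data Explorer from Python.
# Assumes the azure-kusto-data package and a prior `az login`; the cluster,
# database, table, and column names are placeholders for your own environment.
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

cluster = "https://mycluster.westeurope.kusto.windows.net"  # placeholder cluster
kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster)
client = KustoClient(kcsb)

# Anatomy of a simple query: source table, time filter, summarize, sort.
query = """
SigninLogs
| where TimeGenerated > ago(1d)
| summarize
    Attempts = count(),
    Failures = countif(ResultType != "0"),
    UniqueUsers = dcount(UserPrincipalName)
  by AppDisplayName
| order by Failures desc
"""

response = client.execute("SecurityLogs", query)  # database name is a placeholder
for row in response.primary_results[0]:
    print(row["AppDisplayName"], row["Attempts"], row["Failures"], row["UniqueUsers"])
```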

In this workshop, you’ll learn how to use Pulumi to manage infrastructure in Azure using general-purpose programming languages. We’ll walk you through building several sample architectures on Azure in a series of hands-on exercises. Topics include the Pulumi Programming Model, managing Azure resources with Pulumi's Azure Native provider, and Pulumi for Platform Teams.
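
For a sense of what such a Pulumi program looks like, here is a minimal sketch using the Azure Native provider in Python; the resource names and location are arbitrary examples, not the workshop's own exercises.

```python
# __main__.py - minimal Pulumi sketch with the Azure Native provider (Python).
# Assumes `pulumi` and `pulumi-azure-native` are installed and `az login` has run;
# names and location are arbitrary examples.
import pulumi
import pulumi_azure_native as azure_native

# A resource group to hold everything.
rg = azure_native.resources.ResourceGroup("workshop-rg", location="westeurope")

# A general-purpose v2 storage account inside the group.
storage = azure_native.storage.StorageAccount(
    "workshopsa",
    resource_group_name=rg.name,
    sku=azure_native.storage.SkuArgs(name="Standard_LRS"),
    kind="StorageV2",
)

# Export the generated account name so `pulumi up` prints it.
pulumi.export("storage_account_name", storage.name)
```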

Azure Data Engineer Associate Certification Guide - Second Edition

This book is your gateway to mastering the skills required for achieving the Azure Data Engineer Associate certification (DP-203). Whether you're new to the field or a seasoned professional, it comprehensively prepares you for the challenges of the exam. Learn to design and implement advanced data solutions, secure sensitive information, and optimize data processes effectively.

What this Book will help me do:

Understand and utilize Azure's data services such as Azure Synapse and Azure Databricks for data processing.
Master advanced data storage and management solutions, including designing partitions and lake architectures.
Learn to secure data with state-of-the-art tools like RBAC, encryption, and Azure Purview.
Develop and manage data pipelines and workflows using tools like Azure Data Factory (ADF) and Spark.
Prepare for and confidently pass the DP-203 certification exam with the included practical resources and guidance.

Author(s): The authors, Palmieri, Surendra Mettapalli, and Alex, bring a wealth of expertise in cloud and data engineering. With extensive industry experience, they've designed this guide to be both educational and practical, enabling learners to not only understand but also apply concepts in real-world scenarios. Their goal is to make complex topics approachable, supporting your journey to certification success.

Who is it for? This guide is perfect for aspiring and current data engineers aiming to achieve the Azure Data Engineer Associate certification (DP-203). It's particularly useful for professionals familiar with cloud services and basic data engineering concepts who want to delve deeper into Azure's offerings. Additionally, managers and learners preparing for roles involving Azure cloud data solutions will find the content invaluable for career advancement.
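
As a rough, hedged illustration of the partition-design topic above, the sketch below writes a partitioned Parquet dataset with PySpark, as you might do in an Azure Synapse or Databricks notebook; the ADLS paths and column names are made up and not taken from the book.

```python
# Illustrative sketch of partitioned lake storage (not from the book).
# The ADLS Gen2 paths and column names are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

sales = (
    spark.read.option("header", True)
    .csv("abfss://raw@mylake.dfs.core.windows.net/sales/")  # placeholder container/path
    .withColumn("order_date", F.to_date("order_date"))
    .withColumn("year", F.year("order_date"))
    .withColumn("month", F.month("order_date"))
)

# Partition by year/month so downstream queries that filter on date prune files.
(
    sales.write.mode("overwrite")
    .partitionBy("year", "month")
    .parquet("abfss://curated@mylake.dfs.core.windows.net/sales/")
)
```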

In this workshop, we will explore real-world examples of how organizations can leverage the power of Pulumi and Spot to optimize their Azure Kubernetes Service (AKS) workloads at launch, and why continuous optimization of your AKS infrastructure is critical to any successful platform engineering program. Learn best practices for managing AKS at scale and how Pulumi and Spot facilitate AKS usage at enterprise scale.

Join our experts to learn more about your identity choices with Google Cloud. We will deep-dive into workload identity federation and support for GCE, as well as workforce identity federation for Azure AD, Okta, and other IdPs; show how to secure your environments with access policies like Allow and Deny; and demonstrate how to seamlessly use policy intelligence tools for federated identities. We’ll show how you can unlock new use cases and catch fraud and abuse in your environment. You’ll also learn how L’Oreal uses these features toward their identity-first security platform.


In this session we will show how you can query, connect, and report on your data insights across clouds, including AWS and Azure, with BigQuery Omni and Looker. Reduce costly copying and customization and get answers quickly, so you can get back to work.
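
As a small sketch of the querying side of this workflow, the snippet below runs a query with the google-cloud-bigquery Python client; with BigQuery Omni the same client and SQL can target datasets whose data lives in AWS or Azure. The project, dataset, and table names are hypothetical, and Omni connection setup is assumed to be done in the console.

```python
# Sketch: run a query with the BigQuery Python client. The project, dataset,
# and table names are placeholders; with BigQuery Omni the referenced dataset
# can be backed by data stored in AWS or Azure.
from google.cloud import bigquery

client = bigquery.Client(project="my-analytics-project")  # placeholder project

sql = """
    SELECT region, COUNT(*) AS orders
    FROM `my-analytics-project.omni_azure_dataset.orders`
    GROUP BY region
    ORDER BY orders DESC
"""

for row in client.query(sql).result():
    print(row.region, row.orders)
```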


Artificial Intelligence with Microsoft Power BI

Advance your Power BI skills by adding AI to your repertoire at a practical level. With this practical book, business-oriented software engineers and developers will learn the terminologies, practices, and strategy necessary to successfully incorporate AI into their business intelligence estate. Jen Stirrup, CEO of AI and BI leadership consultancy Data Relish, and Thomas Weinandy, research economist at Upside, show you how to use data already available to your organization. Springboarding from the skills that you already possess, this book adds AI to your organization's technical capability and expertise with Microsoft Power BI. By using your conceptual knowledge of BI, you'll learn how to choose the right model for your AI work and identify its value and validity.

Use Power BI to build a good data model for AI
Demystify the AI terminology that you need to know
Identify AI project roles, responsibilities, and teams for AI
Use AI models, including supervised machine learning techniques
Develop and train models in Azure ML for consumption in Power BI
Improve your business AI maturity level with Power BI
Use the AI feedback loop to help you get started with the next project
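
As a loose, hedged sketch of the "develop and train models in Azure ML" step, the snippet below trains a small scikit-learn model and logs it with MLflow, a format Azure ML can register; the run name, metric, and data are made up, and exposing the model to Power BI is configured in the Azure ML and Power BI services rather than in this script.

```python
# Loose illustration only (not from the book): train a tiny scikit-learn model
# and log it with MLflow so it could later be registered in Azure ML. The run
# name, features, and metric are hypothetical.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1_000, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="churn-baseline"):
    model = LogisticRegression(max_iter=1_000).fit(X_train, y_train)
    mlflow.log_metric("test_accuracy", model.score(X_test, y_test))
    mlflow.sklearn.log_model(model, "model")
```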

Engineering Data Mesh in Azure Cloud

Discover how to implement a modern data mesh architecture using Microsoft Azure's Cloud Adoption Framework. In this book, you'll learn the strategies to decentralize data while maintaining strong governance, turning your current analytics struggles into scalable and streamlined processes. Unlock the potential of data mesh to achieve advanced and democratized analytics platforms.

What this Book will help me do:

Learn to decentralize data governance and integrate data domains effectively.
Master strategies for building and implementing data contracts suited to your organization's needs.
Explore how to design a landing zone for a data mesh using Azure's Cloud Adoption Framework.
Understand how to apply key architecture patterns for analytics, including AI and machine learning.
Gain the knowledge to scale analytics frameworks using modern cloud-based platforms.

Author(s): Deswandikar is a seasoned data architect with extensive experience in implementing cutting-edge data solutions in the cloud. With a passion for simplifying complex data strategies, the author brings real-world customer experiences into practical guidance. This book reflects a dedication to helping organizations achieve their data goals with clarity and effectiveness.

Who is it for? This book is ideal for chief data officers, data architects, and engineers seeking to transform data analytics frameworks to accommodate advanced workloads. Especially useful for professionals aiming to implement cloud-based data mesh solutions, it assumes familiarity with centralized data systems, data lakes, and data integration techniques. If modernizing your organization's data strategy appeals to you, this book is for you.

Send us a text Welcome to the cozy corner of the tech world where ones and zeros mingle with casual chit-chat. Datatopics Unplugged is your go-to spot for relaxed discussions around tech, news, data, and society. Dive into conversations that should flow as smoothly as your morning coffee (but don't), where industry insights meet laid-back banter. Whether you're a data aficionado or just someone curious about the digital age, pull up a chair, relax, and let's get into the heart of data, unplugged style! In this episode #42, titled "Unraveling the Fabric of Data: Microsoft's Ecosystem and Beyond," we're joined once again by the tech maestro and newly minted Microsoft MVP, Sam Debruyn. Sam brings to the table a bevy of updates from his recent accolades to the intricacies of Microsoft's data platforms and the world of SQL.

Biz Buzz: From Reddit's IPO to the performance versus utility debate in database selection, we dissect the big moves shaking up the business side of tech. Read about Reddit's IPO.
Microsoft's Fabric Unraveled: Get the lowdown on Microsoft's Fabric, the one-stop AI platform, as Sam Debruyn gives us a deep dive into its capabilities and integration with Azure Databricks and Power BI. Discover more about Fabric and dive into Sam's blog.
dbt Developments: Sam talks dbt and the exciting new SQL tool for data pipeline building with upcoming unit testing capabilities.
Polaris Project: Delving into Microsoft's internal storage projects, including insights on Polaris and its integration with Synapse SQL. Read the paper here.
AI Advances: From the release of Grok-1 and Apple's MM1 AI model to GPT-4's trillion parameters, we discuss the leaps in artificial intelligence.
Stability in Motion: After OpenAI's Sora, we look at Stability AI's new venture into motion with Stable Video. Check out Stable Video.
Benchmarking Debate: A critical look at performance benchmarks in database selection and the ongoing search for the 'best' database. Contemplate benchmarking perspectives.
Versioning Philosophy: Hot takes on semantic versioning and what stability really means in software development. Dive into Semantic Versioning.

Azure Data Factory by Example: Practical Implementation for Data Engineers

Data engineers who need to hit the ground running will use this book to build skills in Azure Data Factory v2 (ADF). The tutorial-first approach to ADF taken in this book gets you working from the first chapter, explaining key ideas naturally as you encounter them. From creating your first data factory to building complex, metadata-driven nested pipelines, the book guides you through essential concepts in Microsoft’s cloud-based ETL/ELT platform. It introduces components indispensable for the movement and transformation of data in the cloud. Then it demonstrates the tools necessary to orchestrate, monitor, and manage those components.

This edition, updated for 2024, includes the latest developments to the Azure Data Factory service:

Enhancements to existing pipeline activities such as Execute Pipeline, along with the introduction of new activities such as Script, and activities designed specifically to interact with Azure Synapse Analytics.
Improvements to flow control provided by activity deactivation and the Fail activity.
The introduction of reusable data flow components such as user-defined functions and flowlets.
Extensions to integration runtime capabilities including Managed VNet support.
The ability to trigger pipelines in response to custom events.
Tools for implementing boilerplate processes such as change data capture and metadata-driven data copying.

What You Will Learn

Create pipelines, activities, datasets, and linked services
Build reusable components using variables, parameters, and expressions
Move data into and around Azure services automatically
Transform data natively using ADF data flows and Power Query data wrangling
Master flow-of-control and triggers for tightly orchestrated pipeline execution
Publish and monitor pipelines easily and with confidence

Who This Book Is For

Data engineers and ETL developers taking their first steps in Azure Data Factory, SQL Server Integration Services users making the transition toward doing ETL in Microsoft’s Azure cloud, and SQL Server database administrators involved in data warehousing and ETL operations
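
As a hedged side note (not an example from the book), ADF pipelines can also be defined and run from Python with the azure-mgmt-datafactory SDK; the sketch below publishes and triggers a trivial one-activity pipeline. The subscription, resource group, and factory names are placeholders, and the data factory is assumed to exist already.

```python
# Sketch only: define and run a trivial ADF pipeline from Python with the
# azure-mgmt-datafactory SDK. Subscription, resource group, and factory names
# are placeholders; the factory must already exist.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import PipelineResource, WaitActivity

subscription_id = "00000000-0000-0000-0000-000000000000"  # placeholder
resource_group = "rg-adf-example"                          # placeholder
factory_name = "adf-example-factory"                       # placeholder

adf = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# A one-activity pipeline that simply waits 30 seconds.
pipeline = PipelineResource(
    activities=[WaitActivity(name="WaitThirtySeconds", wait_time_in_seconds=30)]
)
adf.pipelines.create_or_update(resource_group, factory_name, "DemoPipeline", pipeline)

# Trigger an on-demand run and print its id.
run = adf.pipelines.create_run(resource_group, factory_name, "DemoPipeline", parameters={})
print("Started pipeline run:", run.run_id)
```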

Join Amir Shengross in an exploration of cloud-native applications and how they are shaping the future of software development. Learn about the principles and practices of cloud-native architecture, and discover how cloud-native applications enable organizations to innovate faster, scale efficiently, and deliver superior user experiences.