talk-data.com

Topic: Terraform

Tags: infrastructure_as_code, cloud, devops

79 tagged activities

Activity Trend: peak of 13 activities per quarter, 2020-Q1 to 2026-Q1

Activities

79 activities · Newest first

Data Engineering with Azure Databricks

Master end-to-end data engineering on Azure Databricks. From data ingestion and Delta Lake to CI/CD and real-time streaming, build secure, scalable, and performant data solutions with Spark, Unity Catalog, and ML tools.

Key Features
• Build scalable data pipelines using Apache Spark and Delta Lake
• Automate workflows and manage data governance with Unity Catalog
• Learn real-time processing and structured streaming with practical use cases
• Implement CI/CD, DevOps, and security for production-ready data solutions
• Explore Databricks-native ML, AutoML, and Generative AI integration

Book Description
"Data Engineering with Azure Databricks" is your essential guide to building scalable, secure, and high-performing data pipelines using the powerful Databricks platform on Azure. Designed for data engineers, architects, and developers, this book demystifies the complexities of Spark-based workloads, Delta Lake, Unity Catalog, and real-time data processing.

Beginning with the foundational role of Azure Databricks in modern data engineering, you’ll explore how to set up robust environments, manage data ingestion with Auto Loader, optimize Spark performance, and orchestrate complex workflows using tools like Azure Data Factory and Airflow. The book offers deep dives into structured streaming, Delta Live Tables, and Delta Lake’s ACID features for data reliability and schema evolution. You’ll also learn how to manage security, compliance, and access controls using Unity Catalog, and gain insights into managing CI/CD pipelines with Azure DevOps and Terraform.

With a special focus on machine learning and generative AI, the final chapters guide you in automating model workflows, leveraging MLflow, and fine-tuning large language models on Databricks. Whether you're building a modern data lakehouse or operationalizing analytics at scale, this book provides the tools and insights you need.

What you will learn
• Set up a full-featured Azure Databricks environment
• Implement batch and streaming ingestion using Auto Loader
• Optimize Spark jobs with partitioning and caching
• Build real-time pipelines with structured streaming and DLT
• Manage data governance using Unity Catalog
• Orchestrate production workflows with jobs and ADF
• Apply CI/CD best practices with Azure DevOps and Git
• Secure data with RBAC, encryption, and compliance standards
• Use MLflow and Feature Store for ML pipelines
• Build generative AI applications in Databricks

Who this book is for
This book is for data engineers, solution architects, cloud professionals, and software engineers seeking to build robust and scalable data pipelines using Azure Databricks. Whether you're migrating legacy systems, implementing a modern lakehouse architecture, or optimizing data workflows for performance, this guide will help you leverage the full power of Databricks on Azure. A basic understanding of Python, Spark, and cloud infrastructure is recommended.
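
To make the Terraform angle concrete: the book's CI/CD chapters describe managing pipelines with Azure DevOps and Terraform, so here is a minimal sketch of provisioning an Azure Databricks workspace with the azurerm provider. The resource group, workspace name, and region are placeholders, not examples taken from the book.

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 4.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# Resource group and workspace names below are placeholders for this sketch.
resource "azurerm_resource_group" "data" {
  name     = "rg-databricks-demo"
  location = "westeurope"
}

# The Premium SKU is needed for features such as Unity Catalog integration.
resource "azurerm_databricks_workspace" "this" {
  name                = "dbw-demo"
  resource_group_name = azurerm_resource_group.data.name
  location            = azurerm_resource_group.data.location
  sku                 = "premium"
}
```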

In this talk we’ll learn Infrastructure-as-Code by automating the world’s most popular game: Minecraft. Using Packer, Terraform and GitHub Actions, we’ll build a server, configure Linux, provision cloud infrastructure and operate it through GitOps. Finally, we’ll demonstrate how to go beyond automating traditional cloud control planes—automating the Minecraft world itself by using Terraform to build and demolish structures like castles and pyramids before our very eyes!

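As a rough illustration of the "provision cloud infrastructure" step (the talk does not name a cloud provider), here is a minimal sketch assuming AWS and a server image baked by Packer; the AMI name pattern and instance type are assumptions for this sketch.

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "eu-west-1"
}

# Look up the most recent AMI produced by the Packer build.
# The "minecraft-server-*" name pattern is an assumption, not from the talk.
data "aws_ami" "minecraft" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["minecraft-server-*"]
  }
}

# A single game server instance built from that image.
resource "aws_instance" "minecraft" {
  ami           = data.aws_ami.minecraft.id
  instance_type = "t3.medium"

  tags = {
    Name = "minecraft-server"
  }
}
```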

The scale-up company Solynta focuses on hybrid potato breeding, which helps achieve improvements in yield, disease resistance, and climate adaptation. Scientific innovation is part of our core business. Plant selections are highly data-driven, involving, for example, drone observations and genetic data. Minimal time-to-production for new ideas is essential, which is facilitated by our custom AWS DevOps platform. This platform focuses on automation and accessible data storage.

In this talk, we introduce how computer vision (YOLO and SAM modelling) enables monitoring traits of plants in the field, and how we operate these models. This further entails:
• Our experience from training and evaluating models on drone images
• Trade-offs selecting AWS services, Terraform modules and Python packages for automation and robustness
• Our team setup that allows IT specialists and biologists to work together effectively

The talk will provide practical insights for both data scientists and DevOps engineers. The main takeaways are that object detection and segmentation from drone maps, at scale, are achievable for a small team. Furthermore, with the right approach, you can standardise a DevOps platform to let operations and developers work together.

As Europe’s top B2B used-goods auction platform, TBAuctions is entering the AI era. Roberto Bonilla, Lead Data Engineer, shows how Databricks, Azure, Terraform, MLflow and LangGraph unify to simplify complex AI workflows. Bas Lucieer, Head of Data, details the strategy and change management that bring a sales-driven organization along, ensuring adoption and lasting value. Together they show tech + strategy = marketplace edge.

Microsoft Fabric is transforming how organizations build unified data platforms for analytics, data science, and business intelligence. Until recently, deploying and managing Fabric resources required manual effort or ad hoc automation. That changed with the release of the Terraform provider for Microsoft Fabric last year, enabling teams to manage Fabric infrastructure as code. In this session, you'll learn how to get started using Terraform to provision and manage Microsoft Fabric components — including workspaces, pipelines, dataflows, and more — in a repeatable and scalable way. Aimed at data engineers, cloud architects, and DevOps professionals, we'll cover core Terraform concepts, walk through practical examples, and share best practices for integrating with Azure and CI/CD workflows. By the end of the session, you'll be equipped to bring automation, consistency, and governance to your Microsoft Fabric environments using Terraform.
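
A minimal sketch of what workspace provisioning with the Microsoft Fabric Terraform provider can look like; authentication setup and capacity assignment are omitted, and the names are placeholders rather than examples from the session.

```hcl
terraform {
  required_providers {
    fabric = {
      source = "microsoft/fabric"
    }
  }
}

# Authentication (service principal or Azure CLI) is configured out of band;
# see the provider documentation for the exact options.
provider "fabric" {}

# A Fabric workspace managed as code; the display name is a placeholder.
resource "fabric_workspace" "analytics" {
  display_name = "analytics-workspace"
  description  = "Workspace provisioned via Terraform"
}
```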

According to Wikipedia, Infrastructure as Code is the process of managing and provisioning computer data center resources through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. This also applies to resources such as reference data, connector plugins, connector configurations, and the stream processing jobs that clean up the data.

In this talk, we are going to discuss use cases based on the Network Rail Data Feeds, the scripts used to spin up the environment and cluster in Confluent Cloud, and the different components required for ingesting and processing the data.

This particular environment is used as a teaching tool for Event Stream Processing for Kafka Streams, ksqlDB, and Flink. Some examples of further processing and visualisation will also be provided.
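
For illustration, a minimal sketch of spinning up a Confluent Cloud environment and a Basic Kafka cluster with the confluentinc/confluent provider; the names, cloud, and region are placeholders and not the actual scripts from the talk.

```hcl
terraform {
  required_providers {
    confluent = {
      source  = "confluentinc/confluent"
      version = "~> 2.0"
    }
  }
}

# Cloud API credentials are expected via environment variables
# (CONFLUENT_CLOUD_API_KEY / CONFLUENT_CLOUD_API_SECRET) or provider arguments.
provider "confluent" {}

resource "confluent_environment" "rail" {
  display_name = "network-rail-demo"
}

# A Basic cluster keeps a teaching environment inexpensive; display name,
# cloud, and region here are placeholders for this sketch.
resource "confluent_kafka_cluster" "rail" {
  display_name = "network-rail-feeds"
  availability = "SINGLE_ZONE"
  cloud        = "AWS"
  region       = "eu-west-1"
  basic {}

  environment {
    id = confluent_environment.rail.id
  }
}
```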

The talk focuses on the practical implementation of GitOps in a hybrid infrastructure setup, designing Helm charts and provisioning infrastructure with Terraform. Target audience: DevOps engineers or platform engineers building internal developer platforms, especially those working with Kubernetes.
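
As a rough sketch of the Terraform side of such a setup, here is a helm_release deploying an internal chart through the Helm provider; the kubeconfig path, chart repository, chart name, and values are hypothetical, not details from the talk.

```hcl
terraform {
  required_providers {
    helm = {
      source  = "hashicorp/helm"
      version = "~> 2.0"
    }
  }
}

# Points at an existing cluster; the kubeconfig path is an assumption here.
provider "helm" {
  kubernetes {
    config_path = "~/.kube/config"
  }
}

# Installing an in-house chart from a chart repository; repository URL,
# chart name, namespace, and values are placeholders for this sketch.
resource "helm_release" "internal_platform" {
  name             = "developer-portal"
  repository       = "https://charts.example.com"
  chart            = "developer-portal"
  namespace        = "platform"
  create_namespace = true

  set {
    name  = "replicaCount"
    value = "2"
  }
}
```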