Git

Data Engineering with Azure Databricks

2026-04-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Xenia Ireton, Tonya Chernyshova, Dmitry Foshin, Dmitry Anoshin

AI/ML Airflow Analytics Azure ADF Azure DevOps CI/CD Cloud Computing Data Engineering Data Governance Data Lakehouse Databricks

Master end-to-end data engineering on Azure Databricks. From data ingestion and Delta Lake to CI/CD and real-time streaming, build secure, scalable, and performant data solutions with Spark, Unity Catalog, and ML tools. Key Features Build scalable data pipelines using Apache Spark and Delta Lake Automate workflows and manage data governance with Unity Catalog Learn real-time processing and structured streaming with practical use cases Implement CI/CD, DevOps, and security for production-ready data solutions Explore Databricks-native ML, AutoML, and Generative AI integration Book Description "Data Engineering with Azure Databricks" is your essential guide to building scalable, secure, and high-performing data pipelines using the powerful Databricks platform on Azure. Designed for data engineers, architects, and developers, this book demystifies the complexities of Spark-based workloads, Delta Lake, Unity Catalog, and real-time data processing. Beginning with the foundational role of Azure Databricks in modern data engineering, you’ll explore how to set up robust environments, manage data ingestion with Auto Loader, optimize Spark performance, and orchestrate complex workflows using tools like Azure Data Factory and Airflow. The book offers deep dives into structured streaming, Delta Live Tables, and Delta Lake’s ACID features for data reliability and schema evolution. You’ll also learn how to manage security, compliance, and access controls using Unity Catalog, and gain insights into managing CI/CD pipelines with Azure DevOps and Terraform. With a special focus on machine learning and generative AI, the final chapters guide you in automating model workflows, leveraging MLflow, and fine-tuning large language models on Databricks. Whether you're building a modern data lakehouse or operationalizing analytics at scale, this book provides the tools and insights you need. 
What you will learn Set up a full-featured Azure Databricks environment Implement batch and streaming ingestion using Auto Loader Optimize Spark jobs with partitioning and caching Build real-time pipelines with structured streaming and DLT Manage data governance using Unity Catalog Orchestrate production workflows with jobs and ADF Apply CI/CD best practices with Azure DevOps and Git Secure data with RBAC, encryption, and compliance standards Use MLflow and Feature Store for ML pipelines Build generative AI applications in Databricks Who this book is for This book is for data engineers, solution architects, cloud professionals, and software engineers seeking to build robust and scalable data pipelines using Azure Databricks. Whether you're migrating legacy systems, implementing a modern lakehouse architecture, or optimizing data workflows for performance, this guide will help you leverage the full power of Databricks on Azure. A basic understanding of Python, Spark, and cloud infrastructure is recommended.

Generative AI for Full-Stack Development: AI Empowered Accelerated Coding

2026-01-01 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Shantanu Baruah

AI/ML GenAI JavaScript MongoDB React data data-engineering nosql-databases

Gain cutting-edge skills in building a full-stack web application with AI assistance. This book will guide you in creating your own travel application using React and Node.js, with MongoDB as the database, while emphasizing the use of Gen AI platforms like Perplexity.ai and Claude for quicker development and more accurate debugging. The book’s step-by-step approach will help you bridge the gap between traditional web development methods and modern AI-assisted techniques, making it both accessible and insightful. It provides valuable lessons on professional web application development practices. By focusing on a practical example, the book offers hands-on experience that mirrors real-world scenarios, equipping you with relevant and in-demand skills that can be easily transferred to other projects. The book emphasizes the principles of responsive design, teaching you how to create web applications that adapt seamlessly to different screen sizes and devices. This includes using fluid grids, media queries, and optimizing layouts for usability across various platforms. You will also learn how to design, manage, and query databases using MongoDB, ensuring you can effectively handle data storage and retrieval in your applications. Most significantly, the book will introduce you to generative AI tools and prompt engineering techniques that can accelerate coding and debugging processes. This modern approach will streamline development workflows and enhance productivity. By the end of this book, you will not only have learned how to create a complete web application from backend to frontend, along with database management, but you will also have gained invaluable associated skills such as using IDEs, version control, and deploying applications efficiently and effectively with AI. 
What You Will Learn How to build a full-stack web application from scratch How to use generative AI tools to enhance coding efficiency and streamline the development process How to create user-friendly interfaces that enhance the overall experience of your web applications How to design, manage, and query databases using MongoDB Who This Book Is For Frontend developers, backend developers, and full-stack developers.

Mastering Snowflake DataOps with DataOps.live: An End-to-End Guide to Modern Data Management

2025-10-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ronald L. Steelman Jr.

Data Management DataOps dbt DevOps Snowflake data data-engineering

This practical, in-depth guide shows you how to build modern, sophisticated data processes using the Snowflake platform and DataOps.live, the only platform that enables seamless DataOps integration with Snowflake. Designed for data engineers, architects, and technical leaders, it bridges the gap between DataOps theory and real-world implementation, helping you take control of your data pipelines to deliver more efficient, automated solutions. You’ll explore the core principles of DataOps and how they differ from traditional DevOps, while gaining a solid foundation in the tools and technologies that power modern data management, including Git, DBT, and Snowflake. Through hands-on examples and detailed walkthroughs, you’ll learn how to implement your own DataOps strategy within Snowflake and maximize the power of DataOps.live to scale and refine your DataOps processes. Whether you're just starting with DataOps or looking to refine and scale your existing strategies, this book, complete with practical code examples and starter projects, provides the knowledge and tools you need to streamline data operations, integrate DataOps into your Snowflake infrastructure, and stay ahead of the curve in the rapidly evolving world of data management.
What You Will Learn Explore the fundamentals of DataOps, its differences from DevOps, and its significance in modern data management Understand Git’s role in DataOps and how to use it effectively Know why DBT is preferred for DataOps and how to apply it Set up and manage DataOps.live within the Snowflake ecosystem Apply advanced techniques to scale and evolve your DataOps strategy Who This Book Is For Snowflake practitioners, including data engineers, platform architects, and technical managers, who are ready to implement DataOps principles and streamline complex data workflows using DataOps.live.

Unlocking dbt: Design and Deploy Transformations in Your Cloud Data Warehouse

2025-09-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dustin Dorsey (Onix), Cameron Cyr

CI/CD Cloud Computing dbt DWH Modern Data Stack Python SQL data data-engineering data-warehouse storage-repositories

Master the art of data transformation with the second edition of this trusted guide to dbt. Building on the foundation of the first edition, this updated volume offers a deeper, more comprehensive exploration of dbt’s capabilities—whether you're new to the tool or looking to sharpen your skills. It dives into the latest features and techniques, equipping you with the tools to create scalable, maintainable, and production-ready data transformation pipelines. Unlocking dbt, Second Edition introduces key advancements, including the semantic layer, which allows you to define and manage metrics at scale, and dbt Mesh, empowering organizations to orchestrate decentralized data workflows with confidence. You’ll also explore more advanced testing capabilities, expanded CI/CD and deployment strategies, and enhancements in documentation—such as the newly introduced dbt Catalog. As in the first edition, you’ll learn how to harness dbt’s power to transform raw data into actionable insights, while incorporating software engineering best practices like code reusability, version control, and automated testing. From configuring projects with the dbt Platform or open source dbt to mastering advanced transformations using SQL and Jinja, this book provides everything you need to tackle real-world challenges effectively. 
What You Will Learn Understand dbt and its role in the modern data stack Set up projects using both the cloud-hosted dbt Platform and open source project Connect dbt projects to cloud data warehouses Build scalable models in SQL and Python Configure development, testing, and production environments Capture reusable logic with Jinja macros Incorporate version control with your data transformation code Seamlessly connect your projects using dbt Mesh Build and manage a semantic layer using dbt Deploy dbt using CI/CD best practices Who This Book Is For Current and aspiring data professionals, including architects, developers, analysts, engineers, data scientists, and consultants who are beginning the journey of using dbt as part of their data pipeline’s transformation layer. Readers should have a foundational knowledge of writing basic SQL statements, development best practices, and working with data in an analytical context such as a data warehouse.

Data Engineering for Cybersecurity

2025-08-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by James Bonifield

Ansible Cloud Computing Data Engineering ELK Kafka Linux Logstash PowerShell Redis Cyber Security Data Streaming data

Security teams rely on telemetry, the continuous stream of logs, events, metrics, and signals that reveal what’s happening across systems, endpoints, and cloud services. But that data doesn’t organize itself. It has to be collected, normalized, enriched, and secured before it becomes useful. That’s where data engineering comes in. In this hands-on guide, cybersecurity engineer James Bonifield teaches you how to design and build scalable, secure data pipelines using free, open source tools such as Filebeat, Logstash, Redis, Kafka, Elasticsearch, and more. You’ll learn how to collect telemetry from Windows, including Sysmon and PowerShell events, Linux files and syslog, and streaming data from network and security appliances. You’ll then transform it into structured formats, secure it in transit, and automate your deployments using Ansible. You’ll also learn how to: Encrypt and secure data in transit using TLS and SSH Centrally manage code and configuration files using Git Transform messy logs into structured events Enrich data with threat intelligence using Redis and Memcached Stream and centralize data at scale with Kafka Automate with Ansible for repeatable deployments Whether you’re building a pipeline on a tight budget or deploying an enterprise-scale system, this book shows you how to centralize your security data, support real-time detection, and lay the groundwork for incident response and long-term forensics.

Pro T-SQL 2019: Toward Speed, Scalability, and Standardization for SQL Server Developers

2020-02-12 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Elizabeth Noble

Data Management SQL data data-engineering

Design and write simple and efficient T-SQL code in SQL Server 2019 and beyond. Writing T-SQL that pulls back correct results can be challenging. This book provides the help you need in writing T-SQL that performs fast and is easy to maintain. You also will learn how to implement version control, testing, and deployment strategies. Hands-on examples show modern T-SQL practices and provide straightforward explanations. Attention is given to selecting the right data types and objects when designing T-SQL solutions. Author Elizabeth Noble teaches you how to improve your T-SQL performance through good design practices that benefit programmers and ultimately the users of the applications. You will know the common pitfalls of writing T-SQL and how to avoid those pitfalls going forward. What You Will Learn Choose correct data types and database objects when designing T-SQL Write T-SQL that searches data efficiently and uses hardware effectively Implement source control and testing methods to streamline the deployment process Design T-SQL that can be enhanced or modified with less effort Plan for long-term data management and storage Who This Book Is For Database developers who want to improve the efficiency of their applications, and developers who want to solve complex query and data problems more easily by writing T-SQL that performs well, brings back correct results, and is easy for other developers to understand and maintain

Sams Teach Yourself PHP, MySQL & JavaScript: All in One, 6th Edition

2017-10-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Julie C. Meloni

HTML JavaScript Linux MySQL React SQL data data-engineering relational-databases

In just a short time, you can learn how to use PHP, MySQL, and JavaScript together to create dynamic, interactive websites and applications using three leading web development technologies. No previous programming experience is required. Using a straightforward, step-by-step approach, each lesson in this book builds on the previous ones, enabling you to learn the essentials of full-stack web application development – from HTML, CSS, and JavaScript on the front end, to PHP scripting and MySQL databases on the server. Regardless of whether you run Linux, Windows, or MacOS, the book includes complete instructions to install all the software you need to set up a stable environment for learning, testing, and production. Step-by-step instructions carefully walk you through the most common web application development tasks. Practical, hands-on examples show you how to apply what you learn. Quizzes and exercises help you test your knowledge and stretch your skills. Learn how to: Build web pages with HTML5 and CSS Use JavaScript to build dynamic, interactive web pages Get PHP, MySQL, and JavaScript to work together to create modern, standards-compliant web applications Enhance interactivity with AJAX Leverage JavaScript libraries such as jQuery Work with cookies and user sessions Get user input with web-based forms Use basic SQL commands Interact with the MySQL database using PHP Write maintainable code and get started with version control Decide when frameworks such as Bootstrap, Foundation, React, Angular, and Laravel can be useful Create a web-based discussion forum or calendar Add a storefront and shopping cart to your site
Contents at a Glance
PART I Web Application Basics 1 Understanding How the Web Works 2 Structuring HTML and Using Cascading Style Sheets 3 Understanding the CSS Box Model and Positioning 4 Introducing JavaScript 5 Introducing PHP
PART II Getting Started with Dynamic Web Sites 6 Understanding Dynamic Web Sites and HTML5 Applications 7 JavaScript Fundamentals: Variables, Strings, and Arrays 8 JavaScript Fundamentals: Functions, Objects, and Flow Control 9 Understanding JavaScript Event Handling 10 The Basics of Using jQuery
PART III Taking Your Web Applications to the Next Level 11 AJAX: Getting Started with Remote Scripting 12 PHP Fundamentals: Variables, Strings, and Arrays 13 PHP Fundamentals: Functions, Objects, and Flow Control 14 Working with Cookies and User Sessions 15 Working with Web-Based Forms
PART IV Integrating a Database into Your Applications 16 Understanding the Database Design Process 17 Learning Basic SQL Commands 18 Interacting with MySQL Using PHP
PART V Getting Started with Application Development 19 Creating a Simple Discussion Forum 20 Creating an Online Storefront 21 Creating a Simple Calendar 22 Managing Web Applications
PART VI Appendixes A Installation QuickStart with XAMPP B Installing and Configuring MySQL C Installing and Configuring Apache D Installing and Configuring PHP

Essentials of Cloud Application Development on IBM Bluemix

2017-08-07 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Hala Aziz, Ahmed Azraq, Sally Fikry, Ben Smith, Mohamed El-Khouly, Ahmed S. Hassan

Agile/Scrum API Cloud Computing Computer Science Dashboard DevOps IBM JavaScript JSON data data-engineering

Abstract
This IBM® Redbooks® publication is based on the Presentations Guide of the course Essentials of Cloud Application Development on IBM Bluemix that was developed by the IBM Redbooks team in partnership with IBM Skills Academy Program. This course is designed to teach university students the basic skills that are required to develop, deploy, and test cloud-based applications that use the IBM Bluemix® cloud services. The primary target audience for this course is university students in undergraduate computer science and computer engineering programs with no previous experience working in cloud environments. However, anyone new to cloud computing can also benefit from this course.
After completing this course, you should be able to accomplish the following tasks: Define cloud computing Describe the factors that lead to the adoption of cloud computing Describe the choices that developers have when creating cloud applications Describe infrastructure as a service, platform as a service, and software as a service Describe IBM Bluemix and its architecture Identify the runtimes and services that IBM Bluemix offers Describe IBM Bluemix infrastructure types Create an application in IBM Bluemix Describe the IBM Bluemix dashboard, catalog, and documentation features Explain how the application route is used to test an application from the browser Create services in IBM Bluemix Describe how to bind services to an application in IBM Bluemix Describe the environment variables that are used with IBM Bluemix services Explain what IBM Bluemix organizations, domains, spaces, and users are Describe how to create an IBM SDK for Node.js application that runs on IBM Bluemix Explain how to manage your IBM Bluemix account with the Cloud Foundry CLI Describe how to set up and use the IBM Bluemix plug-in for Eclipse Describe the role of Node.js for server-side scripting Describe IBM Bluemix DevOps Services and the capabilities of IBM DevOps Services Identify the Web IDE features in IBM Bluemix DevOps Describe how to connect a Git repository client to a Bluemix DevOps Services project Explain the pipeline build and deploy processes that IBM Bluemix DevOps Services use Describe how IBM Bluemix DevOps Services integrate with the IBM Bluemix cloud Describe the agile planning tools in IBM Bluemix Describe the characteristics of REST APIs Explain the advantages of the JSON data format Describe an example of REST APIs using Watson Describe the main types of data services in IBM Bluemix Describe the benefits of IBM Cloudant® Explain how Cloudant databases and documents are accessed from IBM Bluemix Describe how to use REST APIs to interact with a Cloudant database Describe Bluemix mobile backend as a service (MBaaS) and the MBaaS architecture Describe the Push Notifications service Describe the App ID service Describe the Kinetise service Describe how to create Bluemix Mobile applications by using MobileFirst Services Starter Boilerplate
The workshop materials were created in June 2017. Therefore, all IBM Bluemix features that are described in this Presentations Guide and IBM Bluemix user interfaces that are used in the examples are current as of June 2017.

Essentials of Cloud Application Development on IBM Bluemix

2016-10-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Hala Aziz, Ahmed Azraq, Sally Fikry, Ben Smith, Mohamed El-Khouly

API Cloud Computing Computer Science DevOps IBM JavaScript JSON data data-engineering

Abstract
This IBM® Redbooks® publication is based on the Presentations Guide of the course "Essentials of Cloud Application Development on IBM Bluemix" that was developed by the IBM Redbooks team in partnership with IBM Middle East and Africa (MEA) University Program. This course is designed to teach university students the basic skills that are required to develop, deploy, and test cloud-based applications that use the IBM Bluemix® cloud services. The primary target audience for this course is university students in undergraduate computer science and computer engineering programs with no previous experience working in cloud environments. However, anyone new to cloud computing can benefit from this course.
After completing this course, you should be able to accomplish these tasks: Describe the factors that lead to the adoption of cloud computing. Describe infrastructure as a service, platform as a service, and software as a service. Define cloud computing. Describe IBM Bluemix. Describe the architecture of IBM Bluemix. Identify the runtimes and services that Bluemix offers. Explain how to get started with Bluemix. Describe Bluemix organizations, domains, spaces, and users. Create Bluemix applications. Use services in a Bluemix application. Set environmental variables that are used with Bluemix services. Deploy and run Bluemix applications. Describe how to create an IBM SDK for Node.js application that runs on Bluemix. Explain how to manage a Bluemix account with the Cloud Foundry CLI. Describe how to integrate workstation development platforms with Bluemix. Manage application code and assets with IBM Bluemix DevOps services. Work with the Git repository that is used by DevOps services. Describe the characteristics of REST APIs. Describe the use of JSON as the preferred data format for REST APIs. Identify the data services that are available on Bluemix. Describe the features in Bluemix for developing mobile applications. Create a MobileFirst Services Starter application on Bluemix. Send push notifications from Bluemix and receive them on the mobile device emulator.
The workshop materials were created in August 2016. Thus, all IBM Bluemix features discussed in this Presentations Guide and Bluemix user interfaces used in the examples are current as of August 2016. Note: This IBM Redbooks publication references exercises that are NOT included with this book. The exercises are only available to students attending the course.

Pro XAML with C#: Application Development Strategies

2015-07-07 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Buddy James, Lori Lalonde

CI/CD Microsoft XML data data-engineering storage-formats xaml

Pro XAML with C#: Application Development Strategies is your guide to real-world development practices on Microsoft’s XAML-based platforms, with examples in WPF, Windows 8.1, and Windows Phone 8.1. Learn how to properly plan and architect an application on one or more of these platforms for a robust, scalable solution. In Part I, authors Buddy James and Lori Lalonde introduce you to XAML and reveal proven techniques for developing successful line-of-business applications. You’ll also find out about some of the conflicting needs and interests that you might encounter as an enterprise XAML developer. Part II begins to lay the groundwork to help you properly architect your application, providing you with a deeper understanding of domain-driven design and the Model-View-ViewModel design pattern. You will also learn about proper exception handling and logging techniques, and how to cover your code with unit tests to reduce bugs and validate your design. Part III explores implementation and deployment details for each of Microsoft’s XAML UIs, along with advice on deploying and maintaining your application across different devices using version control repositories and continuous integration. Pro XAML with C# Application Development Strategies is for intermediate to experienced developers looking to improve their professional practice. Readers should have experience working with C# and at least one XAML-based technology (WPF, Silverlight, Windows Store, or Windows Phone).

Oracle SQL Developer Data Modeler for Database Design Mastery

2015-05-22 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Heli Helskyaho

Data Modelling DWH IBM Microsoft Oracle SQL SQL Server data data-engineering data-models

Design Databases with Oracle SQL Developer Data Modeler
In this practical guide, Oracle ACE Director Heli Helskyaho explains the process of database design using Oracle SQL Developer Data Modeler, the powerful, free tool that flawlessly supports Oracle and other database environments, including Microsoft SQL Server and IBM DB2. Oracle SQL Developer Data Modeler for Database Design Mastery covers requirement analysis, conceptual, logical, and physical design, data warehousing, reporting, and more. Create and deploy high-performance enterprise databases on any platform using the expert tips and best practices in this Oracle Press book. Configure Oracle SQL Developer Data Modeler Perform requirement analysis Translate requirements into a formal conceptual data model and process models Transform the conceptual (logical) model into a relational model Manage physical database design Generate data definition language (DDL) scripts to create database objects Design a data warehouse database Use Subversion for version control and to enable a multiuser environment Document an existing database Use the reporting tools in Oracle SQL Developer Data Modeler Compare designs and the database

Oracle PL/SQL Performance Tuning Tips & Techniques

2014-08-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Michael Rosenblum, Paul Dorsey

API Oracle SQL XML data data-engineering pl-sql pl/sql

Proven PL/SQL Optimization Solutions
In Oracle PL/SQL Performance Tuning Tips & Techniques, Oracle ACE authors with decades of experience building complex production systems for government, industry, and educational organizations present a hands-on approach to enabling optimal results from PL/SQL. The book begins by describing the discovery process required to pinpoint performance problems and then provides measurable and repeatable test cases. In-depth coverage of linking SQL and PL/SQL is followed by deep dives into essential Oracle Database performance tuning tools. Real-world examples and best practices are included throughout this Oracle Press guide. Follow a request-driven nine-step process to identify and address performance problems in web applications Use performance-related database tools, including data dictionary views, logging, tracing, PL/SQL Hierarchical Profiler, PL/Scope, and RUNSTATS Instrument code to pinpoint performance issues using call stack APIs, error stack APIs, and timing markers Embed PL/SQL in SQL and manage user-defined functions Embed SQL in PL/SQL using a set-based approach to handle large volumes of data Properly write and deploy data manipulation language triggers to avoid performance problems Work with advanced datatypes, including LOBs and XML Use caching techniques to avoid redundant operations Effectively use dynamic SQL to reduce the amount of code needed and streamline system management Manage version control and ensure that performance fixes are successfully deployed Code examples in the book are available for download.
