Cloud Computing

Architecting a Modern Data Warehouse for Large Enterprises: Build Multi-cloud Modern Distributed Data Warehouses with Azure and AWS

2023-12-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Abhishek Mishra , Anjani Kumar , Sanjeev Kumar (Tesa SE)

AWS Azure BI Big Data Data Governance Data Lake Data Lakehouse Delta DWH Pandas Cyber Security Data Streaming +4 more

Design and architect new-generation cloud-based data warehouses using Azure and AWS. This book provides an in-depth understanding of how to build modern cloud-native data warehouses, as well as their history and evolution. The book starts by covering foundational data warehouse concepts, and introduces modern features such as distributed processing, big data storage, data streaming, and processing data on the cloud. You will gain an understanding of the synergy, relevance, and usage of standard data warehousing practices in the modern world of distributed data processing. The authors walk you through the essential concepts of Data Mesh, Data Lake, Lakehouse, and Delta Lake. They also demonstrate the services and offerings available on Azure and AWS that deal with data orchestration, data democratization, data governance, data security, and business intelligence. After completing this book, you will be ready to design and architect enterprise-grade, cloud-based modern data warehouses using industry best practices and guidelines.
What You Will Learn: Understand the core concepts underlying modern data warehouses. Design and build cloud-native data warehouses. Gain a practical approach to architecting and building data warehouses on Azure and AWS. Implement modern data warehousing components such as Data Mesh, Data Lake, Delta Lake, and Lakehouse. Process data through pandas and evaluate your model’s performance using metrics such as F1-score, precision, and recall. Apply deep learning to supervised, semi-supervised, and unsupervised anomaly detection tasks for tabular datasets and time series applications.
Who This Book Is For: Experienced developers, cloud architects, and technology enthusiasts looking to build cloud-based modern data warehouses using Azure and AWS.

Data Exploration and Preparation with BigQuery

2023-11-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mike Kahn

Big Data BigQuery DWH GCP SQL data data-engineering google-bigquery

In "Data Exploration and Preparation with BigQuery," Michael Kahn provides a hands-on guide to understanding and utilizing Google's powerful data warehouse solution, BigQuery. This comprehensive book equips you with the skills needed to clean, transform, and analyze large datasets for actionable business insights.
What this book will help me do: Master the process of exploring and assessing the quality of datasets. Learn SQL for performing efficient and advanced data transformations in BigQuery. Optimize the performance of BigQuery queries for speed and cost-effectiveness. Discover best practices for setting up and managing BigQuery resources. Apply real-world case studies to analyze data and derive meaningful insights.
Author(s): Michael Kahn is an experienced data engineer and author specializing in big data solutions and technologies. With years of hands-on experience working with Google Cloud Platform and BigQuery, he has assisted organizations in optimizing their data pipelines for effective decision-making. His accessible writing style ensures complex topics become approachable, enabling readers of various skill levels to succeed.
Who is it for? This book is tailored for data analysts, data engineers, and data scientists who want to learn how to effectively use BigQuery for data exploration and preparation. Whether you're new to BigQuery or looking to deepen your expertise in working with large datasets, this book provides clear guidance and practical examples to achieve your goals.

Kafka Troubleshooting in Production: Stabilizing Kafka Clusters in the Cloud and On-premises

2023-11-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Elad Eldor

DataOps DevOps Kafka Data Streaming data data-engineering streaming-messaging

This book provides Kafka administrators, site reliability engineers, and DataOps and DevOps practitioners with a list of real production issues that can occur in Kafka clusters and how to solve them. The production issues covered are assembled into a comprehensive troubleshooting guide for those engineers who are responsible for the stability and performance of Kafka clusters in production, whether those clusters are deployed in the cloud or on-premises. This book teaches you how to detect and troubleshoot the issues, and eventually how to prevent them. Kafka stability is hard to achieve, especially in high throughput environments, and the purpose of this book is not only to make troubleshooting easier, but also to prevent production issues from occurring in the first place. The guidance in this book is drawn from the author's years of experience in helping clients and internal customers diagnose and resolve knotty production problems and stabilize their Kafka environments. The book is organized into recipe-style troubleshooting checklists that field engineers can easily follow when under pressure to fix an unstable cluster. This is the book you will want by your side when the stakes are high, and your job is on the line. 
What You Will Learn: Monitor and resolve production issues in your Kafka clusters. Provision Kafka clusters with the lowest costs and still handle the required loads. Perform root cause analyses of issues affecting your Kafka clusters. Know the ways in which your Kafka cluster can affect its consumers and producers. Prevent or minimize data loss and delays in data streaming. Forestall production issues through an understanding of common failure points. Create checklists for troubleshooting your Kafka clusters when problems occur.
Who This Book Is For: Site reliability engineers tasked with maintaining stability of Kafka clusters, Kafka administrators who troubleshoot production issues around Kafka, DevOps and DataOps experts who are involved with provisioning Kafka (whether on-premises or in the cloud), and developers of Kafka consumers and producers who wish to learn more about Kafka.

Cracking the Data Engineering Interview

2023-11-07 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Kedeisha Bryan , Taamir Ransome

CI/CD Data Engineering Data Modelling ETL/ELT Python Cyber Security SQL data data-engineering

"Cracking the Data Engineering Interview" is your essential guide to mastering the data engineering interview process. This book offers practical insights and techniques to build your resume, refine your skills in Python, SQL, data modeling, and ETL, and confidently tackle over 100 mock interview questions. Gain the knowledge and confidence to land your dream role in data engineering.
What this book will help me do: Craft a compelling data engineering portfolio to stand out to employers. Refresh and deepen understanding of essential topics like Python, SQL, and ETL. Master over 100 interview questions that cover both technical and behavioral aspects. Understand data engineering concepts such as data modeling, security, and CI/CD. Develop negotiation, networking, and personal branding skills crucial for job applications.
Author(s): Kedeisha Bryan and Taamir Ransome are seasoned authors with a wealth of experience in data engineering and professional development. Drawing from their extensive industry backgrounds, they provide actionable strategies for aspiring data engineers. Their approachable writing style and real-world insights make complex topics accessible to readers.
Who is it for? This book is ideal for aspiring data engineers looking to navigate the job application process effectively. Readers should be familiar with data engineering fundamentals, including Python, SQL, cloud data platforms, and ETL processes. It's tailored for professionals aiming to enhance their portfolios, tackle challenging interviews, and boost their chances of landing a data engineering role.

IBM TS7700 R5 DS8000 Object Store User's Guide

2023-11-07 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dave Brettell , Lourie Goodall , Rin Fujiwara

IBM data data-engineering

The IBM® TS7700 features a functional enhancement that allows the TS7700 to act as an object store for transparent cloud tiering with IBM DS8000®, DFSMShsm (HSM), and native DFSMSdss (DSS). This function can be used to move data sets directly from DS8000 to TS7700. This IBM Redpaper publication provides a functional overview of the features, provides client value information, and walks through the DFSMS, DS8000, and TS7700 setup steps.

Designing a Modern Application Data Stack

2023-10-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Adam Morton , Brad Culberson (Snowflake (Field CTO's office)) , Kevin McGinley

Analytics Snowflake data data-engineering

Today's massive datasets represent an unprecedented opportunity for organizations to build data-intensive applications. With this report, product leads, architects, and others who deal with applications and application development will explore why a cloud data platform is a great fit for data-intensive applications. You'll learn how to carefully consider scalability, data processing, and application distribution when making data app design decisions. Cloud data platforms are the modern infrastructure choice for data applications, as they offer improved scalability, elasticity, and cost efficiency. With a better understanding of data-intensive application architectures on cloud-based data platforms and the best practices outlined in this report, application teams can take full advantage of advances in data processing and app distribution to accelerate development, deployment, and adoption cycles.
With this insightful report, you will: Learn why a modern cloud data platform is essential for building data-intensive applications. Explore how scalability, data processing, and distribution models are key for today's data apps. Implement best practices to improve application scalability and simplify data processing for efficiency gains. Modernize application distribution plans to meet the needs of app providers and consumers.
About the authors: Adam Morton works with Intelligen Group, a Snowflake pure-play data and analytics consultancy. Kevin McGinley is technical director of the Snowflake customer acceleration team. Brad Culberson is a data platform architect specializing in data applications at Snowflake.

IBM Storage Virtualize, IBM Storage FlashSystem, and IBM SAN Volume Controller Security Feature Checklist - For IBM Storage Virtualize 8.5.3

2023-10-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by James Whitaker , Bill Scales , Barry Whyte

IBM Cyber Security data data-engineering ibm-system-storage ibm-system-storage-san-volume-controller

IBM® Storage Virtualize based storage systems are secure storage platforms that implement various security-related features, in terms of system-level access controls and data-level security features. This document outlines the available security features and options of IBM Storage Virtualize based storage systems. It is not intended as a "how to" or best practice document. Instead, it is a checklist of features that can be reviewed by a user security team to aid in the definition of a policy to be followed when implementing IBM FlashSystem®, IBM SAN Volume Controller, and IBM Storage Virtualize for Public Cloud. IBM Storage Virtualize features the following levels of security to protect against threats and to keep the attack surface as small as possible: The first line of defense is to offer strict verification features that stop unauthorized users from using login interfaces and gaining access to the system and its configuration. The second line of defense is to offer least privilege features that restrict the environment and limit any effect if a malicious actor does access the system configuration. The third line of defense is to run in a minimal, locked down, mode to prevent damage spreading to the kernel and rest of the operating system. The fourth line of defense is to protect the data at rest that is stored on the system from theft, loss, or corruption (malicious or accidental). The topics that are discussed in this paper can be broadly split into two categories: System security: This type of security encompasses the first three lines of defense that prevent unauthorized access to the system, protect the logical configuration of the storage system, and restrict what actions users can perform. It also ensures visibility and reporting of system level events that can be used by a Security Information and Event Management (SIEM) solution, such as IBM QRadar®. Data security: This type of security encompasses the fourth line of defense. 
It protects the data that is stored on the system against theft, loss, or attack. These data security features include Encryption of Data At Rest (EDAR) or IBM Safeguarded Copy (SGC). This document is correct as of IBM Storage Virtualize 8.5.3.

Amazon Redshift: The Definitive Guide

2023-10-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rajesh Francis , Rajiv Gupta , Milind Oke

AI/ML Analytics AWS DWH Redshift Cyber Security amazon-redshift data data-engineering relational-databases

Amazon Redshift powers analytic cloud data warehouses worldwide, from startups to some of the largest enterprise data warehouses available today. This practical guide thoroughly examines this managed service and demonstrates how you can use it to extract value from your data immediately, rather than go through the heavy lifting required to run a typical data warehouse. Analytic specialists Rajesh Francis, Rajiv Gupta, and Milind Oke detail Amazon Redshift's underlying mechanisms and options to help you explore out-of-the-box automation. Whether you're a data engineer who wants to learn the art of the possible or a DBA looking to take advantage of machine learning-based auto-tuning, this book helps you get the most value from Amazon Redshift. By understanding Amazon Redshift features, you'll achieve excellent analytic performance at the best price, with the least effort.
This book helps you: Build a cloud data strategy around Amazon Redshift as a foundational data warehouse. Get started with Amazon Redshift with simple-to-use data models and design best practices. Understand how and when to use Redshift Serverless and Redshift provisioned clusters. Take advantage of auto-tuning options inherent in Amazon Redshift and understand manual tuning options. Transform your data platform for predictive analytics using Redshift ML and break silos using data sharing. Learn best practices for security, monitoring, resilience, and disaster recovery. Leverage Amazon Redshift integration with other AWS services to unlock additional value.

Learning and Operating Presto

2023-09-21 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Tim Meehan , Ying Su , Angelica Lo Duca , Vivek Bharathan

BI DWH Hadoop IBM Presto Cyber Security SQL data data-engineering

The Presto community has mushroomed since its origins at Facebook in 2012. But ramping up this open source distributed SQL query engine can be challenging even for the most experienced engineers. With this practical book, data engineers and architects, platform engineers, cloud engineers, and software engineers will learn how to use Presto operations at your organization to derive insights on datasets wherever they reside. Authors Angelica Lo Duca, Tim Meehan, Vivek Bharathan, and Ying Su explain what Presto is, where it came from, and how it differs from other data warehousing solutions. You'll discover why Facebook, Uber, Alibaba Cloud, Hewlett Packard Enterprise, IBM, Intel, and many more use Presto and how you can quickly deploy Presto in production.
With this book, you will: Learn how to install and configure Presto. Use Presto with business intelligence tools. Understand how to connect Presto to a variety of data sources. Extend Presto for real-time business insight. Learn how to apply best practices and tuning. Get troubleshooting tips for logs, error messages, and more. Explore Presto's architectural concepts and usage patterns. Understand Presto security and administration.

IBM Storage as a Service Offering Guide

2023-08-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Hartmut Lonzer , Gucer Vasfi

Cloud Storage IBM data data-engineering

IBM® Storage as a Service (STaaS) extends your hybrid cloud experience with a new flexible consumption model enabled for both your on-premises and hybrid cloud infrastructure needs, giving you the agility, cash flow efficiency, and services of cloud storage with the flexibility to dynamically scale up or down and only pay for what you use beyond the minimal capacity. This IBM Redpaper provides a detailed introduction to the IBM STaaS service. The paper is targeted for data center managers and storage administrators.

IBM Power E1050: Technical Overview and Introduction

2023-08-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Guido Somers , Scott Vetter , Tsvetomir Spasov , Marc Gregorutti , Michael Malicdem , Stephen Lutz , Giuliano Anselmi

IBM Linux Marketing Cyber Security data data-engineering

This IBM® Redpaper publication is a comprehensive guide that covers the IBM Power E1050 server (9043-MRX) that uses the latest IBM Power10 processor-based technology and supports IBM AIX® and Linux operating systems (OSs). The goal of this paper is to provide a hardware architecture analysis and highlight the changes, new technologies, and major features that are being introduced in this system, such as: The latest IBM Power10 processor design, including the dual-chip module (DCM) packaging, which is available in various configurations from 12 - 24 cores per socket. Support of up to 16 TB of memory. Native Peripheral Component Interconnect Express (PCIe) 5th generation (Gen5) connectivity from the processor socket to deliver higher performance and bandwidth for connected adapters. Open Memory Interface (OMI) connected Differential Dual Inline Memory Module (DDIMM) memory cards delivering increased performance, resiliency, and security over industry-standard memory technologies, including transparent memory encryption. Enhanced internal storage performance with the use of native PCIe-connected Non-volatile Memory Express (NVMe) devices in up to 10 internal storage slots to deliver up to 64 TB of high-performance, low-latency storage in a single 4-socket system. Consumption-based pricing in the Power Private Cloud with Shared Utility Capacity commercial model to allow customers to consume resources more flexibly and efficiently, including AIX, Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise Server, and Red Hat OpenShift Container Platform workloads. This publication is for professionals who want to acquire a better understanding of IBM Power products. 
The intended audience includes IBM Power customers, sales and marketing professionals, technical support professionals, IBM Business Partners, and independent software vendors (ISVs). This paper expands the set of IBM Power documentation by providing a desktop reference that offers a detailed technical description of the Power E1050 Midrange server model. This paper does not replace the current marketing materials and configuration tools. It is intended as an extra source of information that, together with existing sources, can be used to enhance your knowledge of IBM server solutions.

IBM Power E1080 Technical Overview and Introduction

2023-08-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ivaylo B. Bozhinov , Scott Vetter , Dinil Das , Giuliano Anselmi , Manish Arora , Madison Lee , Bartlomiej Grabowski , Armin Röll , Turgut Genc

IBM Linux Marketing SAS data data-engineering

This IBM® Redpaper® publication provides a broad understanding of a new architecture of the IBM Power® E1080 (also known as the Power E1080) server that supports IBM AIX®, IBM i, and selected distributions of Linux operating systems. The objective of this paper is to introduce the Power E1080, the most powerful and scalable server of the IBM Power portfolio, and its offerings and relevant functions:
Designed to support up to four system nodes and up to 240 IBM Power10™ processor cores. The Power E1080 can be initially ordered with a single system node or two system nodes configuration, which provides up to 60 Power10 processor cores with a single node configuration or up to 120 Power10 processor cores with a two system nodes configuration. More support for a three or four system nodes configuration is to be added on December 10, 2021, which provides support for up to 240 Power10 processor cores with a full combined four system nodes server.
Designed to support up to 64 TB memory. The Power E1080 can be initially ordered with a total memory capacity of up to 8 TB. More support is to be added on December 10, 2021, to support up to 64 TB in a full combined four system nodes server.
Designed to support up to 32 Peripheral Component Interconnect® (PCIe) Gen 5 slots in a full combined four system nodes server and up to 192 PCIe Gen 3 slots with expansion I/O drawers. The Power E1080 initially supports a maximum of two system nodes; therefore, up to 16 PCIe Gen 5 slots, and up to 96 PCIe Gen 3 slots with expansion I/O drawers. More support is to be added on December 10, 2021, to support up to 192 PCIe Gen 3 slots with expansion I/O drawers.
Up to over 4,000 directly attached serial-attached SCSI (SAS) disks or solid-state drives (SSDs).
Up to 1,000 virtual machines (VMs) with logical partitions (LPARs) per system.
System control unit, providing redundant system master Flexible Service Processor (FSP).
Supports IBM Power System Private Cloud Solution with Dynamic Capacity.
This publication is for professionals who want to acquire a better understanding of Power servers. The intended audience includes the following roles: customers, sales and marketing professionals, technical support professionals, IBM Business Partners, and independent software vendors (ISVs). This paper does not replace the current marketing materials and configuration tools. It is intended as an extra source of information that, together with existing sources, can be used to enhance your knowledge of IBM server solutions.

Serverless Machine Learning with Amazon Redshift ML

2023-08-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Debu Panda , Bhanu Pittampally , Sumeet Joshi , Phil Bates

AI/ML Analytics Data Analytics Redshift SQL amazon-redshift data data-engineering relational-databases

Serverless Machine Learning with Amazon Redshift ML provides a hands-on guide to using Amazon Redshift Serverless and Redshift ML for building and deploying machine learning models. Through SQL-focused examples and practical walkthroughs, you will learn efficient techniques for cloud data analytics and serverless machine learning.
What this book will help me do: Grasp the workflow of building machine learning models with Redshift ML using SQL. Learn to handle supervised learning tasks like classification and regression. Apply unsupervised learning techniques, such as K-means clustering, in Redshift ML. Develop time-series forecasting models within Amazon Redshift. Understand how to operationalize machine learning in serverless cloud architecture.
Author(s): Debu Panda, Phil Bates, Bhanu Pittampally, and Sumeet Joshi are seasoned professionals in cloud computing and machine learning technologies. They combine deep technical knowledge with teaching expertise to guide learners through mastering Amazon Redshift ML. Their collaborative approach ensures that the content is accessible, engaging, and practically applicable.
Who is it for? This book is perfect for data scientists, machine learning engineers, and database administrators using or intending to use Amazon Redshift. It's tailored for professionals with basic knowledge of machine learning and SQL who aim to enhance their efficiency and specialize in serverless machine learning within cloud architectures.

High-Performance Data Architectures

2023-08-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Joe McKendrick , Ed Huang

Agile/Scrum Analytics Data Management SQL data data-engineering

By choosing the right database, you can maximize your business potential, improve performance, increase efficiency, and gain a competitive edge. This insightful report examines the benefits of using a simplified data architecture containing cloud-based HTAP (hybrid transactional and analytical processing) database capabilities. You'll learn how this data architecture can help data engineers and data decision makers focus on what matters most: growing your business. Authors Joe McKendrick and Ed Huang explain how cloud native infrastructure supports enterprise businesses and operations with a much more agile foundation. Just one layer up from the infrastructure, cloud-based databases are a crucial part of data management and analytics. Learn how distributed SQL databases containing HTAP capabilities provide more efficient and streamlined data processing to improve cost efficiency and expedite business operations and decision making.
This report helps you: Explore industry trends in database development. Learn the benefits of a simplified data architecture. Comb through the complex and crowded database choices on the market. Examine the process of selecting the right database for your business. Learn the latest database innovations for improving your company's efficiency and performance.

Introduction to Integration Suite Capabilities: Learn SAP API Management, Open Connectors, Integration Advisor and Trading Partner Management

2023-08-16 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jaspreet Bagga

API SAP Cyber Security data data-engineering

Discover the power of SAP Integration Suite's capabilities with this hands-on guide. Learn how this integration platform (iPaaS) can help you connect and automate your business processes with integrations, connectors, APIs, and best practices for a faster ROI. Over the course of this book, you will explore the powerful capabilities of SAP Integration Suite, including API Management, Open Connectors, Integration Advisor, Trading Partner Management, Migration Assessment, and Integration Assessment. With detailed explanations and real-world examples, this book is the perfect resource for anyone looking to unlock the full potential of SAP Integration Suite. With each chapter, you'll gain a greater understanding of why SAP Integration Suite can be the proverbial Swiss Army knife in your toolkit to design and develop enterprise integration scenarios, offering simplified integration, security, and governance for your applications. Author Jaspreet Bagga demonstrates how to create, publish, and monitor APIs with SAP API Management, and how to use its features to enhance your API lifecycle. He also provides a detailed walkthrough of how other capabilities of SAP Integration Suite can streamline your connectivity, design, development, and architecture methodology with a tool-based approach completely managed by SAP. Whether you are a developer, an architect, or a business user, this book will help you unlock the potential of SAP's Integration Suite platform and API Management, and accelerate your digital transformation.
What You Will Learn: Understand what APIs are, what they are used for, and why they are crucial for building effective and reliable applications. Gain an understanding of SAP Integration Suite's features and benefits. Study the SAP Integration assessment process, patterns, and much more. Explore tools and capabilities other than the Cloud Integration that address the full value chain of the enterprise integration components.
Who This Book Is For: Web developers and application leads who want to learn SAP API Management.

Oracle Global Data Services for Mission-critical Systems: Maximizing Performance and Reliability in Complex Enterprise Environments

2023-08-16 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mariami Kupatadze , Y V Ravi Kumar , Sambaiah Sammeta

Oracle data data-engineering oracle-database-solutions

New to Oracle Global Data Services? You’ve come to the right place. This book will show you how to leverage the power of Oracle GDS to ensure runtime load balancing, region affinity, replication lag tolerance-based workload routing, and inter-database service failover. In particular, you will see how to maximize the utilization of replication investments with Oracle GDS. The book starts by guiding you through the installation and configuration of GDS and provides details for each component in the GDS framework. Next, you’ll learn how to configure various components of Oracle GDS in standalone environments. Hands-on exercises that explore the advantages of GDS with different test cases utilizing Active Data Guard (ADG), Oracle GoldenGate (OGG), and Oracle Real Application Clusters (RAC) will help you put your learning in context. The book concludes with a demonstration of how to add Oracle GDS to OEM for monitoring and troubleshooting. You’ll also see how to monitor Oracle GDS in a centralized location using Oracle Enterprise Manager Cloud Control. After completing this book, you will understand the architecture, components, and implementation strategies of GDS using ADG and OGG in mission-critical environments.
What You Will Learn: Understand Oracle Global Data Services architecture and its various components. Install and configure Oracle Global Data Services. Use Global Data Services with Active Data Guard and Oracle GoldenGate. Monitor Global Data Services using Oracle Enterprise Manager Cloud Control. Troubleshoot issues in Global Data Services.
Who This Book Is For: Oracle database administrators, Oracle database architects, Oracle technical managers, Oracle application business analysts, and Oracle data engineers.

Graph-Powered Analytics and Machine Learning with TigerGraph

2023-07-24 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Phuc Kien Nguyen , Alexander Thomas , Victor Lee

AI/ML Analytics data data-engineering graph-databases tigergraph

With the rapid rise of graph databases, organizations are now implementing advanced analytics and machine learning solutions to help drive business outcomes. This practical guide shows data scientists, data engineers, architects, and business analysts how to get started with a graph database using TigerGraph, one of the leading graph database models available. You'll explore a three-stage approach to deriving value from connected data: connect, analyze, and learn. Victor Lee, Phuc Kien Nguyen, and Alexander Thomas present real use cases covering several contemporary business needs. By diving into hands-on exercises using TigerGraph Cloud, you'll quickly become proficient at designing and managing advanced analytics and machine learning solutions for your organization.
Use graph thinking to connect, analyze, and learn from data for advanced analytics and machine learning. Learn how graph analytics and machine learning can deliver key business insights and outcomes. Use five core categories of graph algorithms to drive advanced analytics and machine learning. Deliver a real-time 360-degree view of core business entities, including customer, product, service, supplier, and citizen. Discover insights from connected data through machine learning and advanced analytics.

A Practical Guide to SAP Integration Suite: SAP’s Cloud Middleware and Integration Solution

2023-06-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jaspreet Bagga

API SAP data data-engineering

This book covers the basics of SAP’s Integration Suite, including a broad overview of its capabilities, installation, and real-life examples to illustrate how it can be used to integrate, develop, administer, and monitor applications in the cloud. As you progress through the book, you will see how SAP Integration Suite works as an open, enterprise-grade platform that is a fully vendor-managed, multi-cloud offering that will help you expedite your SAP and third-party integration scenarios. The entire value chain is explored in detail, including usage of APIs and runtime control. Author Jaspreet Bagga demonstrates how SAP’s prebuilt integration packages facilitate quicker, more comprehensive integrations, and how they support a variety of integration patterns. You’ll learn how to leverage the platform to enable seamless cloud and on-premises applications connectivity, develop custom scenarios, mix master data, and blend business-to-business (B2B) and electronic data interchange (EDI) processes, including trading partner management. Also covered are business-to-government (B2G) scenarios, orchestrating data and pipelines, and mixing event-driven integration. Upon completing this book, you will have a thorough understanding of why SAP Integration Suite is the middleware of SAP’s integration strategy, and be able to effectively use it in your own integration scenarios.
What You Will Learn: Understand SAP Integration Suite and its core capabilities. Know how integration technologies, such as architecture and supplementary intelligent technologies, work within the SAP Integration Suite. Discover services for pre-packaged accelerators: SAP API Management, the Integration Advisor, and the SAP API Business Hub. Utilize integration features to link your on-premises or cloud-based systems. Understand the capabilities of the newly released Migration Assessment.
Who This Book Is For: Web developers and application leads who want to learn SAP Integration Suite.

Data Engineering with dbt

2023-06-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Roberto Zagni

Analytics Data Engineering dbt ETL/ELT Snowflake SQL data data-engineering

Data Engineering with dbt provides a comprehensive guide to building modern, reliable data platforms using dbt and SQL. You'll gain hands-on experience building automated ELT pipelines, using dbt Cloud with Snowflake, and embracing patterns for scalable and maintainable data solutions. What this Book will help me do Set up and manage a dbt Cloud environment and create reliable ELT pipelines. Integrate Snowflake with dbt to implement robust data engineering workflows. Transform raw data into analytics-ready data using dbt's features and SQL. Apply advanced dbt functionality such as macros and Jinja for efficient coding. Ensure data accuracy and platform reliability with built-in testing and monitoring. Author(s) Roberto Zagni is a seasoned data engineering professional with a wealth of experience in designing scalable data platforms. Through practical insights and real-world applications, Zagni demystifies complex data engineering practices. Their approachable teaching style makes technical concepts accessible and actionable. Who is it for? This book is perfect for data engineers, analysts, and analytics engineers looking to leverage dbt for data platform development. If you're a manager or decision maker interested in fostering efficient data workflows or a professional with basic SQL knowledge aiming to deepen your expertise, this resource will be invaluable.

Geospatial Data Analytics on AWS

2023-06-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jeff DeMuth , Janahan Gnanachandran , Scott Bateman

AI/ML Analytics Athena AWS Data Analytics Data Management Data Science GIS QuickSight Redshift S3 Amazon SageMaker +5 more

In "Geospatial Data Analytics on AWS," you will learn how to store, manage, and analyze geospatial data effectively using various AWS services. This book provides insight into building geospatial data lakes, leveraging AWS databases, and applying best practices to derive insights from spatial data in the cloud. What this Book will help me do Design and manage geospatial data lakes on AWS leveraging S3 and other storage solutions. Analyze geospatial data using AWS services such as Athena and Redshift. Utilize machine learning models for geospatial data processing and analytics using SageMaker. Visualize geospatial data through services like Amazon QuickSight and OpenStreetMap integration. Avoid common pitfalls when managing geospatial data in the cloud. Author(s) Scott Bateman, Janahan Gnanachandran, and Jeff DeMuth bring their extensive experience in cloud computing and geospatial analytics to this book. With backgrounds in cloud architecture, data science, and geospatial applications, they aim to make complex topics accessible. Their collaborative approach ensures readers can practically apply concepts to real-world challenges. Who is it for? This book is ideal for GIS and data professionals, including developers, analysts, and scientists. It suits readers with a basic understanding of geographical concepts but no prior AWS experience. If you're aiming to enhance your cloud-based geospatial data management and analytics skills, this is the guide for you.

talk-data.com
