talk-data.com

Event

O'Reilly Data Engineering Books

2001-10-19 – 2027-05-25 O'Reilly

Activities tracked

3377

Collection of O'Reilly books on Data Engineering.

Sessions & talks

SAP HANA Data Management and Performance on IBM Power Systems

This IBM® Redpaper Redbooks publication provides information and concepts about how to take advantage of SAP HANA and IBM Power Systems features to manage data and performance efficiently. The target audience of this book includes architects, IT specialists, and systems administrators who deploy SAP HANA and manage data and SAP system performance.

Modern Big Data Architectures

Provides an up-to-date analysis of big data and multi-agent systems. The term Big Data refers to cases where data sets are too large or too complex for traditional data-processing software. With the spread of new concepts such as Edge Computing or the Internet of Things, production, processing and consumption of this data become more and more distributed. As a result, applications increasingly require multiple agents that can work together. A multi-agent system (MAS) is a self-organized computer system that comprises multiple intelligent agents interacting to solve problems that are beyond the capacities of individual agents. Modern Big Data Architectures examines modern concepts and architecture for Big Data processing and analytics. This unique, up-to-date volume provides joint analysis of big data and multi-agent systems, with emphasis on distributed, intelligent processing of very large data sets. Each chapter contains practical examples and detailed solutions suitable for a wide variety of applications. The author, an internationally-recognized expert in Big Data and distributed Artificial Intelligence, demonstrates how base concepts such as agent, actor, and micro-service have reached a point of convergence, enabling next generation systems to be built by incorporating the best aspects of the field. This book: illustrates how data sets are produced and how they can be utilized in various areas of industry and science; explains how to apply common computational models and state-of-the-art architectures to process Big Data tasks; discusses current and emerging Big Data applications of Artificial Intelligence. Modern Big Data Architectures: A Multi-Agent Systems Perspective is a timely and important resource for data science professionals and students involved in Big Data analytics, machine learning, and artificial intelligence.

Open Source Data Pipelines for Intelligent Applications

For decades, businesses have used information about their customers to make critical decisions on what to stock in inventory, which items to recommend to customers, and when to run promotions. But the advent of big data early in this century changed the game considerably. The key to achieving a competitive advantage today is the ability to process and store ever-increasing amounts of information that affect those decisions. In this report, solutions specialists from Red Hat provide an architectural guide to help you navigate the modern data analytics ecosystem. You’ll learn how the industry has evolved and examine current approaches to storage. That includes a deep dive into the anatomy of a portable data platform architecture, along with several aspects of running data pipelines and intelligent applications with Kubernetes. Explore the history of open source data processing and the evolution of container scheduling Get a concise overview of intelligent applications Learn how to use storage with Kubernetes to produce effective intelligent applications Understand how to structure applications on Kubernetes in your platform architecture Delve into example pipeline architectures for deploying intelligent applications on Kubernetes

SAP HANA Platform Migration

This IBM® Redpaper publication provides SAP HANA platform migration information and details for successful planning for migration to IBM Power Systems servers. This publication addresses topics for sellers, IT architects, IT specialists, and anyone who wants to migrate and manage SAP workloads on Power Systems servers. Moreover, this guide provides documentation to transfer how-to skills to the technical teams, and it provides solution guidance to the sales team. This publication complements documentation that is available at IBM Knowledge Center, and it aligns with educational materials that are provided by IBM Systems.

IBM DS8000 and Transparent Cloud Tiering

This IBM® Redbooks® publication gives a broad understanding of storage clouds and the initial functionality that was introduced for mainframes to have Transparent Cloud Tiering. IBM DFSMS and the IBM DS8000 added functionality to provide elements of serverless data movement, and for IBM z/OS® to communicate with a storage cloud. The function is known as Transparent Cloud Tiering and is composed of the following key elements: A gateway in the DS8000, which allows the movement of data to and from Object Storage by using a network connection, with the option to encrypt data in the Cloud. DFSMShsm enhancements to support Migrate and Recall functions to and from the Object Storage. Other commands were enhanced to monitor and report on the new functionality. DFSMShsm uses the Web Enablement toolkit for z/OS to create and access the metadata for specific clouds, containers, and objects. DFSMSdss enhancements to provide some basic backup and restore functions to and from the cloud. The IBM TS7700 can also be set up to act as if it were cloud storage from the DS8000 perspective. This IBM Redbooks publication is divided into the following parts: Part 1 provides you with an introduction to clouds. Part 2 shows you how we set up the Transparent Cloud Tiering in a controlled laboratory and how the new functions work. We provide points to consider to help you set up your storage cloud and integrate it into your operational environment. Part 3 shows you how we used the new functionality to communicate with the cloud and to send data and retrieve data from it.

IBM TS7700 R5.0 Cloud Storage Tier Guide

Building on over 20 years of virtual tape experience, the TS7700 (TS7760, TS7770) now supports the ability to store virtual tape volumes in an object store. This IBM® Redpaper publication helps you set up and configure the cloud object storage support for IBM Cloud™ Object Storage (COS) or Amazon Simple Storage Service (Amazon S3). The TS7700 has supported offloading to physical tape for over two decades. Offloading to physical tape behind a TS7700 is used by hundreds of organizations around the world. By using the same hierarchical storage techniques, the TS7700 can also offload to object storage. Because object storage is cloud-based and accessible from different regions, the TS7700 Cloud Storage Tier support essentially allows the cloud to be an extension of the grid. In this IBM Redpaper publication, we provide a brief overview of cloud technology with an emphasis on Object Storage. Object Storage is used by a broad set of technologies, including those technologies that are exclusive to IBM Z®. The aim of this publication is to provide a basic understanding of cloud, Object Storage, and different ways it can be integrated into your environment. This Redpaper is intended for system architects and storage administrators with TS7700 experience who want to add the support of a Cloud Storage Tier to their TS7700 solution. Note: As of this writing, the TS7700C supports the ability to offload to on-premises cloud with IBM Cloud Object Storage and public cloud with Amazon S3.

MySQL 8 Query Performance Tuning: A Systematic Method for Improving Execution Speeds

Identify, analyze, and improve poorly performing queries that damage user experience and lead to lost revenue for your business. This book will help you make query tuning an integral part of your daily routine through a multi-step process that includes monitoring of execution times, identifying candidate queries for optimization, analyzing their current performance, and improving them to deliver results faster and with less overhead. Author Jesper Krogh systematically discusses each of these steps along with the data sources and the tools used to perform them. MySQL 8 Query Performance Tuning aims to help you improve query performance using a wide range of strategies. You will know how to analyze queries using both the traditional EXPLAIN command as well as the new EXPLAIN ANALYZE tool. You also will see how to use the Visual Explain feature to provide a visually-oriented view of an execution plan. Coverage of indexes includes indexing strategies and index statistics, and you will learn how histograms can be used to provide input on skewed data distributions that the optimizer can use to improve query performance. You will learn about locks, and how to investigate locking issues. And you will come away with an understanding of how the MySQL optimizer works, including the new hash join algorithm, and how to change the optimizer’s behavior when needed to deliver faster execution times. You will gain the tools and skills needed to delight application users and to squeeze the most value from corporate computing resources. 
What You Will Learn Monitor query performance to identify poor performers Choose queries to optimize that will provide the greatest gain Analyze queries using tools such as EXPLAIN ANALYZE and Visual Explain Improve slow queries through a wide range of strategies Properly deploy indexes and histograms to aid in creating fast execution plans Understand and analyze locks to resolve contention and increase throughput Who This Book Is For Database administrators and SQL developers who are familiar with MySQL and need to participate in query tuning. While some experience with MySQL is required, no prior knowledge of query performance tuning is needed.

PostgreSQL Configuration: Best Practices for Performance and Security

Obtain all the skills you need to configure and manage a PostgreSQL database. In this book you will begin by installing and configuring PostgreSQL on a server by focusing on system-level parameter settings before installation. You will also look at key post-installation steps to avoid issues in the future. The basic configuration of PostgreSQL is tuned for compatibility rather than performance. Keeping this in mind, you will fine-tune your PostgreSQL parameters based on your environment and application behavior. You will then get tips to improve database monitoring and maintenance followed by database security for handling sensitive data in PostgreSQL. Every system containing valuable data needs to be backed-up regularly. PostgreSQL follows a simple back-up procedure and provides fundamental approaches to back up your data. You will go through these approaches and choose the right one based on your environment. Running your application with limited resources can be tricky. To achieve this you will implement a pooling mechanism for your PostgreSQL instances to connect to other databases. Finally, you will take a look at some basic errors faced while working with PostgreSQL and learn to resolve them in the quickest manner. What You Will Learn Configure PostgreSQL for performance Monitor and maintain PostgreSQL instances Implement a backup strategy for your data Resolve errors faced while using PostgreSQL Who This Book Is For Readers with basic knowledge of PostgreSQL who wish to implement key solutions based on their environment.

IBM DS8900F Product Guide

Built on over 50 years of Enterprise storage expertise, the IBM® DS8000® series is the flagship of disk storage systems within the IBM System Storage™ portfolio. As of September 2019, the DS8900F is the latest addition and offers two new classes: DS8910F: Flexibility Class all-flash DS8950F: Agility Class all-flash The agility class is efficiently designed to consolidate all your mission-critical workloads for IBM Z®, IBM LinuxONE, IBM Power Systems, and distributed environments under a single all-flash storage solution. This IBM Redbooks® Product Guide gives an overview of the features and functions that are available with the IBM DS8900F models running microcode Release 9.0 (Bundle 89.0 / Licensed Machine Code 7.9.0.xxx).

IBM ZPDT Guide and Reference

This IBM® Redbooks® publication provides both introductory information and technical details about the IBM System z® Personal Development Tool (IBM zPDT®), which produces a small System z environment suitable for application development. zPDT is a PC Linux application. When zPDT is installed (on Linux), normal System z operating systems (such as IBM z/OS®) can be run on it. zPDT provides the basic System z architecture and emulated IBM 3390 disk drives, 3270 interfaces, OSA interfaces, and so on. The systems that are discussed in this document are complex. They have elements of Linux (for the underlying PC machine), IBM z/Architecture® (for the core zPDT elements), System z I/O functions (for emulated I/O devices), z/OS (the most common System z operating system), and various applications and subsystems under z/OS. The reader is assumed to be familiar with general concepts and terminology of System z hardware and software elements, and with basic PC Linux characteristics. This book provides the primary documentation for zPDT.

SQL Server 2019 Administration Inside Out

Conquer SQL Server 2019 administration–from the inside out Dive into SQL Server 2019 administration–and really put your SQL Server DBA expertise to work. This supremely organized reference packs hundreds of timesaving solutions, tips, and workarounds–all you need to plan, implement, manage, and secure SQL Server 2019 in any production environment: on-premises, cloud, or hybrid. Six experts thoroughly tour DBA capabilities available in SQL Server 2019 Database Engine, SQL Server Data Tools, SQL Server Management Studio, PowerShell, and Azure Portal. You’ll find extensive new coverage of Azure SQL, big data clusters, PolyBase, data protection, automation, and more. Discover how experts tackle today’s essential tasks–and challenge yourself to new levels of mastery. Explore SQL Server 2019’s toolset, including the improved SQL Server Management Studio, Azure Data Studio, and Configuration Manager Design, implement, manage, and govern on-premises, hybrid, or Azure database infrastructures Install and configure SQL Server on Windows and Linux Master modern maintenance and monitoring with extended events, Resource Governor, and the SQL Assessment API Automate tasks with maintenance plans, PowerShell, Policy-Based Management, and more Plan and manage data recovery, including hybrid backup/restore, Azure SQL Database recovery, and geo-replication Use availability groups for high availability and disaster recovery Protect data with Transparent Data Encryption, Always Encrypted, new Certificate Management capabilities, and other advances Optimize databases with SQL Server 2019’s advanced performance and indexing features Provision and operate Azure SQL Database and its managed instances Move SQL Server workloads to Azure: planning, testing, migration, and post-migration

IBM Power Systems Virtualization Operation Management for SAP Applications

Businesses are using IBM® Power Systems servers and Linux to consolidate multiple SAP workloads onto fewer systems, increasing infrastructure utilization; reliability, availability, and serviceability (RAS); and scalability, and reducing cost. This IBM Redpaper Redbooks publication describes key hardware and software components of an SAP solution stack. Furthermore, this book addresses non-functional items like RAS, security, and issue handling. Practical help for planning, implementation, configuration, installation, and monitoring of a solution stack are provided. This publication addresses topics for sellers, IT architects, IT specialists, and anyone who wants to implement and manage SAP workloads on IBM Power Systems servers. Moreover, this guide provides documentation to transfer how-to skills to the technical teams, and it provides solution guidance to the sales team. This publication complements documentation that is available at IBM Knowledge Center, and it aligns with educational materials that are provided by IBM Systems.

Implementing and Managing a High-performance Enterprise Infrastructure with Nutanix on IBM Power Systems

This IBM® Redbooks® publication describes how to implement and manage a hyperconverged private cloud solution by using theoretical knowledge, hands-on exercises, and documenting the findings by way of sample scenarios. This book also is a guide about how to implement and manage a high-performance enterprise infrastructure and private cloud platform for big data, artificial intelligence, and transactional and analytics workloads on IBM Power Systems. This book uses available documentation, hardware, and software resources to meet the following goals: Document the web-scale architecture that demonstrates the simple and agile nature of public clouds. Showcase the hyperconverged infrastructure to help cloud native applications mine cognitive analytics workloads. Conduct and document implementation case studies. Document guidelines to help provide an optimal system configuration, implementation, and management. This publication addresses topics for developers, IT architects, IT specialists, sellers, and anyone who wants to implement and manage a high-performance enterprise infrastructure and private cloud platform on IBM Power Systems. This book also provides documentation to transfer the how-to skills to the technical teams, and solution guidance to the sales team. This book complements any documentation that is available in IBM Knowledge Center, and aligns with the educational materials that are provided by the IBM Systems Software Education (SSE).

PostgreSQL 12 High Availability Cookbook - Third Edition

The 'PostgreSQL 12 High Availability Cookbook' is a comprehensive guide to setting up and maintaining highly available PostgreSQL clusters. This book provides practical recipes for designing a resilient database system that can handle outages and recover quickly without downtime. What this Book will help me do Learn how to configure replication tools to protect PostgreSQL data effectively. Understand and implement hardware strategies for ensuring optimal database performance. Master the techniques for reducing contention with connections using pooling strategies. Gain insights into using monitoring tools like Nagios and Grafana for PostgreSQL cluster management. Develop a robust strategy for version upgrades, backups, and failover. Author(s) Shaun Thomas is a seasoned database specialist with extensive experience managing PostgreSQL systems. As a PostgreSQL contributor and advocate, he brings a depth of practical knowledge to database reliability and automation. Shaun's engaging and clear writing style ensures that readers can apply the discussed techniques with confidence. Who is it for? This book is ideal for database administrators, IT professionals, and developers who maintain PostgreSQL systems and want to improve uptime or reliability. Familiarity with basic PostgreSQL concepts is recommended, but no specific knowledge of version 12 features is required. Readers aiming to build advanced high availability solutions will find this book invaluable. It's perfect for those aspiring to ensure their database systems are both resilient and adaptive.

Next-Generation Machine Learning with Spark: Covers XGBoost, LightGBM, Spark NLP, Distributed Deep Learning with Keras, and More

Access real-world documentation and examples for the Spark platform for building large-scale, enterprise-grade machine learning applications. The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry. Next-Generation Machine Learning with Spark provides a gentle introduction to Spark and Spark MLlib and advances to more powerful, third-party machine learning algorithms and libraries beyond what is available in the standard Spark MLlib library. By the end of this book, you will be able to apply your knowledge to real-world use cases through dozens of practical examples and insightful explanations. What You Will Learn Be introduced to machine learning, Spark, and Spark MLlib 2.4.x Achieve lightning-fast gradient boosting on Spark with the XGBoost4J-Spark and LightGBM libraries Detect anomalies with the Isolation Forest algorithm for Spark Use the Spark NLP and Stanford CoreNLP libraries that support multiple languages Optimize your ML workload with the Alluxio in-memory data accelerator for Spark Use GraphX and GraphFrames for Graph Analysis Perform image recognition using convolutional neural networks Utilize the Keras framework and distributed deep learning libraries with Spark Who This Book Is For Data scientists and machine learning engineers who want to take their knowledge to the next level and use Spark and more powerful, next-generation algorithms and libraries beyond what is available in the standard Spark MLlib library; also serves as a primer for aspiring data scientists and engineers who need an introduction to machine learning, Spark, and Spark MLlib.

Multicloud Storage as a Service using VRealize Automation and IBM Spectrum Storage

This document is intended to facilitate the deployment of the Multicloud Solution for Business Continuity and Storage as a Service by using IBM Spectrum Virtualize for Public Cloud on Amazon Web Services (AWS). To complete the tasks it describes, you must understand IBM FlashSystem 9100, IBM Spectrum Virtualize for Public Cloud, IBM Spectrum Connect, VMware vRealize Orchestrator, vRealize Automation, and AWS Cloud. The information in this document is distributed on an "as is" basis without any warranty that is either expressed or implied. Support assistance for the use of this material is limited to situations where IBM Storwize or IBM FlashSystem storage devices are supported and entitled and where the issues are specific to a blueprint implementation.

SAP on Azure Implementation Guide

SAP on Azure Implementation Guide is your essential companion for transitioning your SAP infrastructure to Microsoft Azure. The book takes a practical and detailed approach, providing step-by-step guidance to help you leverage Azure for migrating, scaling, and transforming your SAP solutions effectively. What this Book will help me do Understand and implement different SAP to Azure migration strategies, such as lift-and-shift and database transformations. Learn to ensure high availability and scalability for your SAP systems using Azure's capabilities. Gain insight into securing SAP workloads on Azure for compliance and safety. Achieve operational excellence by leveraging cloud-native features of Azure for SAP. Acquire the skills to optimize SAP infrastructure on Azure for enhanced business value. Author(s) Nick Morgan and Bartosz Jarkowski are experienced consultants with extensive knowledge of SAP systems and cloud implementations. With backgrounds in designing and deploying SAP on cloud platforms, they have a thorough understanding of transitioning business-critical applications to modern infrastructures. They bring a wealth of practical experience to this comprehensive guide. Who is it for? This book is ideal for SAP architects and IT professionals who are looking to migrate their SAP infrastructures to Azure. Whether you are moderately familiar with SAP or an experienced architect evaluating advanced migration strategies, you'll find the information in this guide precise and actionable to help you achieve your objectives.

IBM TS7700 Series DS8000 Object Store User's Guide Version 1.0

The IBM® TS7700 features a functional enhancement that allows for the TS7700 to act as an object store for transparent cloud tiering with IBM DS8000® (DS8K) and DFSMShsm (HSM). This function can be used to move datasets directly from DS8000 to TS7700. This IBM Redpaper™ publication describes the client value, and how DFSMShsm, DS8000, and TS7700 are set up to enable and use the function.

Achieving Hybrid Cloud Cyber Resiliency with IBM Spectrum Virtualize for Public Cloud

This document is intended to facilitate the approach of achieving the Cyber Resiliency solution for IBM® Spectrum Virtualize for Public Cloud. This solution is designed to protect the data on IBM Spectrum™ Virtualize storage in a hybrid multicloud environment by deploying cloud backup to Amazon S3 using the Transparent Cloud Tiering function.

Practical Oracle SQL: Mastering the Full Power of Oracle Database

Write powerful queries using as much of the feature-rich Oracle SQL language as possible, progressing beyond the simple queries of basic SQL as standardized in SQL-92. Both standard SQL and Oracle’s own extensions to the language have progressed far over the decades in terms of how much you can work with your data in a single, albeit sometimes complex, SQL statement. If you already know the basics of SQL, this book provides many examples of how to write even more advanced SQL to huge benefit in your applications, such as: Pivoting rows to columns and columns to rows Recursion in SQL with MODEL and WITH clauses Answering Top-N questions Forecasting with linear regressions Row pattern matching to group or distribute rows Using MATCH_RECOGNIZE as a row processing engine. The process of starting from simpler statements in SQL, and gradually working those statements stepwise into more complex statements that deliver powerful results, is covered in each example. By trying out the recipes and examples for yourself, you will put together the building blocks into powerful SQL statements that will make your application run circles around your competitors. What You Will Learn Take full advantage of advanced and modern features in Oracle SQL Recognize when modern SQL constructs can help create better applications Improve SQL query building skills through stepwise refinement Apply set-based thinking to process more data in fewer queries Make cross-row calculations with analytic functions Search for patterns across multiple rows using row pattern matching Break complex calculations into smaller steps with subquery factoring Who This Book Is For Oracle Database developers who already know some SQL, but rarely use features of the language beyond the SQL-92 standard. And it is for developers who would like to apply the more modern features of Oracle SQL, but don’t know where to start. 
The book also is for those who want to write increasingly complex queries in a stepwise and understandable manner. Experienced developers will use the book to develop more efficient queries using the advanced features of the Oracle SQL language.

Understanding Log Analytics at Scale

If enabled, logging captures almost every system process, event, or message in your software or hardware. But once you have all that data, what do you do with it? This report shows you how to use log analytics—the process of gathering, correlating, and analyzing that information—to drive critical business insights and outcomes. Drawing on real-world use cases, Matt Gillespie outlines the opportunities for log analytics and the challenges you may face—along with approaches for meeting them. Data architects and IT and infrastructure leads will learn the mechanics of log analytics and key architectural considerations for data storage. The report also offers nine key guideposts that will help you plan and design your own solutions to obtain the full value from your log data. Learn the current state of log analytics and common challenges See how log analytics is helping organizations achieve better business outcomes in areas such as cybersecurity, IT operations, and industrial automation Explore tools for log analytics, including Splunk, the Elastic stack, and Sumo Logic Understand the role storage plays in ensuring successful outcomes

Implementing a VersaStack Solution by Cisco and IBM with IBM FlashSystem 5030, Cisco UCS Mini, Hyper-V, and SQL Server

VersaStack, an IBM® and Cisco integrated infrastructure solution, combines computing, networking, and storage into a single integrated system. It combines the Cisco Unified Computing System (Cisco UCS) Integrated Infrastructure with IBM Spectrum Virtualize™, which includes IBM FlashSystem® storage offerings, for quick deployment and rapid time to value for the implementation of modern infrastructures. This IBM Redbooks® publication covers the preferred practices for implementing a VersaStack Solution with IBM FlashSystem 5030, Cisco UCS Mini, Hyper-V 2016, and Microsoft SQL Server. Cisco UCS Mini is optimized for branch and remote offices, point-of-sale locations, and smaller IT environments. It is the ideal solution for customers who need fewer servers but still want the comprehensive management capabilities provided by Cisco UCS Manager. The IBM FlashSystem 5030 delivers efficient, entry-level configurations that are designed to meet the needs of small and midsize businesses. Designed to provide organizations with the ability to consolidate and share data at an affordable price, the IBM FlashSystem 5030 offers advanced software capabilities such as clustering, IBM Easy Tier®, replication and snapshots that are found in more expensive systems. This book is intended for pre-sales and post-sales technical support professionals and storage administrators who are tasked with deploying a VersaStack solution with Hyper-V 2016 and Microsoft SQL Server.

Implementing VersaStack with Cisco ACI Multi-Pod and IBM HyperSwap for High Availability

The IBM HyperSwap® high availability (HA) function allows business continuity in a hardware failure, power failure, connectivity failure, or disasters, such as fire or flooding. It is available on the IBM SAN Volume Controller and IBM FlashSystem products. This IBM Redbooks publication covers the preferred practices for implementing Cisco VersaStack with IBM HyperSwap. The following are some of the topics covered in this book: Cisco Application Centric Infrastructure to showcase Cisco's ACI with Nexus 9Ks Cisco Fabric Interconnects and Unified Computing System (UCS) management capabilities Cisco Multilayer Director Switch (MDS) to showcase fabric channel connectivity Overall IBM HyperSwap solution architecture Differences between HyperSwap and Metro Mirroring, Volume Mirroring, and Stretch Cluster Multisite IBM SAN Volume Controller (SVC) deployment to showcase HyperSwap configuration and capabilities This book is intended for pre-sales and post-sales technical support professionals and storage administrators who are tasked with deploying a VersaStack solution with IBM HyperSwap.