talk-data.com talk-data.com

Topic

data-engineering

3395

tagged

Activity Trend

1 peak/qtr
2020-Q1 2026-Q1

Activities

3395 activities · Newest first

Block Storage Migration in Open Environments

Companies need to migrate data not only when technology needs to be replaced, but also for consolidation, load balancing, and disaster recovery (DR). Data migration is a critical operation, and this book explains the phases and steps to ensure a smooth migration. Topics range from planning and preparation to execution and validation. The book explains, from a generic standpoint, the appliance-based, storage-based, and host-based techniques that can be used to accomplish the migration. Each method is explained through practical migration scenarios and for various operating systems. This publication addresses the aspects of data migration efforts while focusing on fixed block storage systems in open environment with the IBM® FlashSystem 9100 as the target system. Therefore, the book also emphasizes various migration techniques using the Spectrum Virtualize built-in functions. This document targets storage administrators, storage network administrators, system designers, architects, and IT professionals who design, administer or plan data migrations in large data Centers. The aim is to ensure that you are aware of the current thinking, methods, and products that IBM can make available to you. These items are provided to ensure a data migration process that is as efficient and problem-free as possible. The material presented in this book was developed with versions of the referenced products as of February, 2020.

IBM Spectrum Protect Plus Practical Guidance for Deployment, Configuration, and Usage

IBM® Spectrum Protect Plus is a data protection solution that provides near-instant recovery, replication, retention, and reuse for virtual machines, databases, and applications in hybrid multicloud environments. IBM Knowledge Center for IBM Spectrum® Protect Plus provides extensive documentation for installation, deployment, and usage. In addition, IBM Spectrum Protect Plus Blueprint (https://ibm.biz/IBMSpectrumProtectPlusBlueprints) provides guidance about how to build and size an IBM Spectrum Protect Plus solution. The goal of this IBM Redpaper publication is to summarize and complement the available information by providing useful hints and tips based on the authors' practical experience in installing and supporting IBM Spectrum Protect Plus in actual customer environments. Over time, our aim is to compile a set of best practices that cover all aspects of the product, from planning and installation to tuning, maintenance, and troubleshooting.

Building an Anonymization Pipeline

How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner. Luk Arbuckle and Khaled El Emam from Privacy Analytics explore end-to-end solutions for anonymizing device and IoT data, based on collection models and use cases that address real business needs. These examples come from some of the most demanding data environments, such as healthcare, using approaches that have withstood the test of time. Create anonymization solutions diverse enough to cover a spectrum of use cases Match your solutions to the data you use, the people you share it with, and your analysis goals Build anonymization pipelines around various data collection models to cover different business needs Generate an anonymized version of original data or use an analytics platform to generate anonymized outputs Examine the ethical issues around the use of anonymized data

IBM z15 Technical Introduction

This IBM® Redbooks® publication introduces the latest member of the IBM Z® platform, the IBM z15™. It includes information about the Z environment and how it helps integrate data and transactions more securely. It also provides insight for faster and more accurate business decisions. The z15 is a state-of-the-art data and transaction system that delivers advanced capabilities, which are vital to any digital transformation. The z15 is designed for enhanced modularity, and occupies an industry-standard footprint. It is offered as a single air-cooled 19-inch frame called the z15 T02, or as a multi-frame (1 to 4 19-inch frames) called the z15 T01. Both z15 models excel at the following tasks: Using hybrid multicloud integration services Securing and protecting data with encryption everywhere Providing resilience with key to zero downtime Transforming a transactional platform into a data powerhouse Getting more out of the platform with IT Operational Analytics Accelerating digital transformation with agile service delivery Revolutionizing business processes Blending open source and IBM Z technologies This book explains how this system uses innovations and traditional Z strengths to satisfy growing demand for cloud, analytics, and open source technologies. With the z15 as the base, applications can run in a trusted, reliable, and secure environment that improves operations and lessens business risk.

Protecting Data Privacy Beyond the Trusted System of Record

To help you safeguard your sensitive data and provide ease of auditability and control, IBM introduced a new capability for IBM Z® called IBM Data Privacy Passports. It can help minimize the risk and impact of data loss and privacy breaches when collecting and storing sensitive data. Data Privacy Passports can manage how data is shared securely through a central control of user access. Data Privacy Passports can protect data wherever it goes. Security policies are kept and honored whenever the data is accessed. Future data access may be revoked remotely via Data Privacy Passports, long after data leaves the system of record, and sensitive data may even be made unusable simply by destroying its encryption key. Data Privacy Passports is designed to help reduce the time that is spent by staff to protect data and ensure privacy throughout its lifecycle via a central point of control. This IBM Redguide presents a business view of Data Privacy Passports, including how data privacy and protection concerns are addressed. We also explore how value is gained through various business model examples.

IBM Spectrum Scale CSI Driver for Container Persistent Storage

IBM® Spectrum Scale is a proven, scalable, high-performance data and file management solution. It provides world-class storage management with extreme scalability, flash accelerated performance, automatic policy-based storage that has tiers of flash through disk to tape. It also provides support for various protocols, such as NFS, SMB, Object, HDFS, and iSCSI. Containers can leverage the performance, information lifecycle management (ILM), scalability, and multisite data management to give the full flexibility on storage as they experience on the runtime. Container adoption is increasing in all industries, and they sprawl across multiple nodes on a cluster. The effective management of containers is necessary because their number will probably reach a far greater number than virtual machines today. Kubernetes is the standard container management platform currently being used. Data management is of ultimate importance, and often is forgotten because the first workloads containerized are ephemeral. For data management, many drivers with different specifications were available. A specification named Container Storage Interface (CSI) was created and is now adopted by all major Container Orchestrator Systems available. Although other container orchestration systems exist, Kubernetes became the standard framework for container management. It is a very flexible open source platform used as the base for most cloud providers and software companies' container orchestration systems. Red Hat OpenShift is one of the most reliable enterprise-grade container orchestration systems based on Kubernetes, designed and optimized to easily deploy web applications and services. OpenShift enables developers to focus on the code, while the platform takes care of all of the complex IT operations and processes. This IBM Redbooks® publication describes how the CSI Driver for IBM file storage enables IBM Spectrum® Scale to be used as persistent storage for stateful applications running in Kubernetes clusters. Through the Container Storage Interface Driver for IBM file storage, Kubernetes persistent volumes (PVs) can be provisioned from IBM Spectrum Scale. Therefore, the containers can be used with stateful microservices, such as database applications (MongoDB, PostgreSQL, and so on).

Cassandra: The Definitive Guide, 3rd Edition

Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This third edition—updated for Cassandra 4.0—provides the technical details and practical examples you need to put this database to work in a production environment. Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s nonrelational design, with special attention to data modeling. If you’re a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra’s speed and flexibility. Understand Cassandra’s distributed and decentralized structure Use the Cassandra Query Language (CQL) and cqlsh—the CQL shell Create a working data model and compare it with an equivalent relational model Develop sample applications using client drivers for languages including Java, Python, and Node.js Explore cluster topology and learn how nodes exchange data

IBM Storage for Red Hat OpenShift Blueprint Version 1 Release 4

IBM Storage for Red Hat OpenShift is a comprehensive container-ready solution that includes all the hardware & software components necessary to setup and/or expand your Red Hat OpenShift environment. This blueprint includes Red Hat OpenShift Container Platform and uses Container Storage Interface (CSI) standards. IBM Storage brings enterprise data services to containers. In this blueprint, learn how to: · Combine the benefits of IBM Systems with the performance of IBM Storage solutions so that you can deliver the right services to your clients today! · Build a 24 by 7 by 365 enterprise class private cloud with Red Hat OpenShift Container Platform utilizing new open source Container Storage interface (CSI) drivers · Leverage enterprise class services such as NVMe based flash performance, high data availability, and advanced container security IBM Storage for Red Hat OpenShift Container Platform is designed for your DevOps environment for on-premises deployment with easy-to-consume components built to perform and scale for your enterprise. Simplify your journey to cloud with pre-tested and validated blueprints engineered to enable rapid deployment and peace of mind as you move to a hybrid multicloud environment. You now have the capabilities.

SAP HANA Data Management and Performance on IBM Power Systems

This IBM® Redpaper Redbooks publication provides information and concepts about how to take advantage of SAP HANA and IBM Power Systems features to manage data and performance efficiently. The target audience of this book includes architects, IT specialists, and systems administrators who deploy SAP HANA and manage data and SAP system performance.

Modern Big Data Architectures

Provides an up-to-date analysis of big data and multi-agent systems The term Big Data refers to the cases, where data sets are too large or too complex for traditional data-processing software. With the spread of new concepts such as Edge Computing or the Internet of Things, production, processing and consumption of this data becomes more and more distributed. As a result, applications increasingly require multiple agents that can work together. A multi-agent system (MAS) is a self-organized computer system that comprises multiple intelligent agents interacting to solve problems that are beyond the capacities of individual agents. Modern Big Data Architectures examines modern concepts and architecture for Big Data processing and analytics. This unique, up-to-date volume provides joint analysis of big data and multi-agent systems, with emphasis on distributed, intelligent processing of very large data sets. Each chapter contains practical examples and detailed solutions suitable for a wide variety of applications. The author, an internationally-recognized expert in Big Data and distributed Artificial Intelligence, demonstrates how base concepts such as agent, actor, and micro-service have reached a point of convergence—enabling next generation systems to be built by incorporating the best aspects of the field. This book: Illustrates how data sets are produced and how they can be utilized in various areas of industry and science Explains how to apply common computational models and state-of-the-art architectures to process Big Data tasks Discusses current and emerging Big Data applications of Artificial Intelligence Modern Big Data Architectures: A Multi-Agent Systems Perspective is a timely and important resource for data science professionals and students involved in Big Data analytics, and machine and artificial learning.

Open Source Data Pipelines for Intelligent Applications

For decades, businesses have used information about their customers to make critical decisions on what to stock in inventory, which items to recommend to customers, and when to run promotions. But the advent of big data early in this century changed the game considerably. The key to achieving a competitive advantage today is the ability to process and store ever-increasing amounts of information that affect those decisions. In this report, solutions specialists from Red Hat provide an architectural guide to help you navigate the modern data analytics ecosystem. You’ll learn how the industry has evolved and examine current approaches to storage. That includes a deep dive into the anatomy of a portable data platform architecture, along with several aspects of running data pipelines and intelligent applications with Kubernetes. Explore the history of open source data processing and the evolution of container scheduling Get a concise overview of intelligent applications Learn how to use storage with Kubernetes to produce effective intelligent applications Understand how to structure applications on Kubernetes in your platform architecture Delve into example pipeline architectures for deploying intelligent applications on Kubernetes

SAP HANA Platform Migration

This IBM® Redpaper publication provides SAP HANA platform migration information and details for successful planning for migration to IBM Power Systems servers. This publication addresses topics for sellers, IT architects, IT specialists, and anyone who wants to migrate and manage SAP workloads on Power Systems servers. Moreover, this guide provides documentation to transfer how-to skills to the technical teams, and it provides solution guidance to the sales team. This publication complements documentation that is available at IBM Knowledge Center, and it aligns with educational materials that are provided by IBM Systems.

IBM DS8000 and Transparent Cloud Tiering

This IBM® Redbooks® publication gives a broad understanding of storage clouds and the initial functionality that was introduced for mainframes to have Transparent Cloud Tiering. IBM DFSMS and the IBM DS8000 added functionality to provide elements of serverless data movement, and for IBM z/OS® to communicate with a storage cloud. The function is known as Transparent Cloud Tiering and is composed of the following key elements: A gateway in the DS8000, which allows the movement of data to and from Object Storage by using a network connection, with the option to encrypt data in the Cloud. DFSMShsm enhancements to support Migrate and Recall functions to and from the Object Storage. Other commands were enhanced to monitor and report on the new functionality. DFSMShsm uses the Web Enablement toolkit for z/OS to create and access the metadata for specific clouds, containers, and objects. DFSMSdss enhancements to provide some basic backup and restore functions to and from the cloud. The IBM TS7700 can also be set up to act as if it were cloud storage from the DS8000 perspective. This IBM Redbooks publication is divided into the following parts: Part 1 provides you with an introduction to clouds. Part 2 shows you how we set up the Transparent Cloud Tiering in a controlled laboratory and how the new functions work. We provide points to consider to help you set up your storage cloud and integrate it into your operational environment. Part 3 shows you how we used the new functionality to communicate with the cloud and to send data and retrieve data from it..

IBM TS7700 R5.0 Cloud Storage Tier Guide

Building on over 20 years of virtual tape experience, the TS7700 (TS7760, TS7770) now supports the ability to store virtual tape volumes in an object store. This IBM® Redpaper publication helps you set up and configure the cloud object storage support for IBM Cloud™ Object Storage (COS) or Amazon Simple Storage Service (Amazon S3). The TS7700 supported off loading to physical tape for over two decades. Off loading to physical tape behind a TS7700 is used by hundreds of organizations around the world. By using the same hierarchical storage techniques, the TS7700 can also off load to object storage. Because object storage is cloud-based and accessible from different regions, the TS7700 Cloud Storage Tier support essentially allows the cloud to be an extension of the grid. In this IBM Redpaper publication, we provide a brief overview of cloud technology with an emphasis on Object Storage. Object Storage is used by a broad set of technologies, including those technologies that are exclusive to IBM Z®. The aim of this publication is to provide a basic understanding of cloud, Object Storage, and different ways it can be integrated into your environment. This Redpaper is intended for system architects and storage administrators with TS7700 experience who want to add the support of a Cloud Storage Tier to their TS7700 solution. Note: As of this writing, the TS7700C supports the ability to offload to on-premise cloud with IBM Cloud Object Storage and public cloud with Amazon S3.

MySQL 8 Query Performance Tuning: A Systematic Method for Improving Execution Speeds

Identify, analyze, and improve poorly performing queries that damage user experience and lead to lost revenue for your business. This book will help you make query tuning an integral part of your daily routine through a multi-step process that includes monitoring of execution times, identifying candidate queries for optimization, analyzing their current performance, and improving them to deliver results faster and with less overhead. Author Jesper Krogh systematically discusses each of these steps along with the data sources and the tools used to perform them. MySQL 8 Query Performance Tuning aims to help you improve query performance using a wide range of strategies. You will know how to analyze queries using both the traditional EXPLAIN command as well as the new EXPLAIN ANALYZE tool. You also will see how to use the Visual Explain feature to provide a visually-oriented view of an execution plan. Coverage of indexes includes indexing strategies and index statistics, and you will learn how histograms can be used to provide input on skewed data distributions that the optimizer can use to improve query performance. You will learn about locks, and how to investigate locking issues. And you will come away with an understanding of how the MySQL optimizer works, including the new hash join algorithm, and how to change the optimizer’s behavior when needed to deliver faster execution times. You will gain the tools and skills needed to delight application users and to squeeze the most value from corporate computing resources. What You Will Learn Monitor query performance to identify poor performers Choose queries to optimize that will provide the greatest gain Analyze queries using tools such as EXPLAIN ANALYZE and Visual Explain Improve slow queries through a wide range of strategies Properly deploy indexes and histograms to aid in creating fast execution plans Understand and analyze locks to resolve contention and increase throughput Who This Book Is For Database administrators and SQL developers who are familiar with MySQL and need to participate in query tuning. While some experience with MySQL is required, no prior knowledge of query performance tuning is needed.

PostgreSQL Configuration: Best Practices for Performance and Security

Obtain all the skills you need to configure and manage a PostgreSQL database. In this book you will begin by installing and configuring PostgreSQL on a server by focusing on system-level parameter settings before installation. You will also look at key post-installation steps to avoid issues in the future. The basic configuration of PostgreSQL is tuned for compatibility rather than performance. Keeping this in mind, you will fine-tune your PostgreSQL parameters based on your environment and application behavior. You will then get tips to improve database monitoring and maintenance followed by database security for handling sensitive data in PostgreSQL. Every system containing valuable data needs to be backed-up regularly. PostgreSQL follows a simple back-up procedure and provides fundamental approaches to back up your data. You will go through these approaches and choose the right one based on your environment. Running your application with limited resources can be tricky. To achieve this you will implement a pooling mechanism for your PostgreSQL instances to connect to other databases. Finally, you will take a look at some basic errors faced while working with PostgreSQL and learn to resolve them in the quickest manner. What You Will Learn Configure PostgreSQL for performance Monitor and maintain PostgreSQL instances Implement a backup strategy for your data Resolve errors faced while using PostgreSQL Who This Book Is For Readers with basic knowledge of PostgreSQL who wish to implement key solutions based on their environment.

IBM DS8900F Product Guide

Built on over 50 years of Enterprise storage expertise, the IBM® DS8000® series is the flagship of disk storage systems within the IBM System Storage™ portfolio. As of September 2019, the DS8900F is the latest addition and offers two new classes: DS8910F: Flexibility Class all-flash DS8950F: Agility Class all-flash The agility class is efficiently designed to consolidate all your mission-critical workloads for IBM Z®, IBM LinuxONE, IBM Power Systems, and distributed environments under a single all-flash storage solution. This IBM Redbooks® Product Guide gives an overview of the features and functions that are available with the IBM DS8900F models running microcode Release 9.0 (Bundle 89.0 / Licensed Machine Code 7.9.0.xxx).

IBM ZPDT Guide and Reference

This IBM® Redbooks® publication provides both introductory information and technical details about the IBM System z® Personal Development Tool (IBM zPDT®), which produces a small System z environment suitable for application development. zPDT is a PC Linux application. When zPDT is installed (on Linux), normal System z operating systems (such as IBM z/OS®) can be run on it. zPDT provides the basic System z architecture and emulated IBM 3390 disk drives, 3270 interfaces, OSA interfaces, and so on. The systems that are discussed in this document are complex. They have elements of Linux (for the underlying PC machine), IBM z/Architecture® (for the core zPDT elements), System z I/O functions (for emulated I/O devices), z/OS (the most common System z operating system), and various applications and subsystems under z/OS. The reader is assumed to be familiar with general concepts and terminology of System z hardware and software elements, and with basic PC Linux characteristics. This book provides the primary documentation for zPDT.

SQL Server 2019 Administration Inside Out

Conquer SQL Server 2019 administration–from the inside out Dive into SQL Server 2019 administration–and really put your SQL Server DBA expertise to work. This supremely organized reference packs hundreds of timesaving solutions, tips, and workarounds–all you need to plan, implement, manage, and secure SQL Server 2019 in any production environment: on-premises, cloud, or hybrid. Six experts thoroughly tour DBA capabilities available in SQL Server 2019 Database Engine, SQL Server Data Tools, SQL Server Management Studio, PowerShell, and Azure Portal. You’ll find extensive new coverage of Azure SQL, big data clusters, PolyBase, data protection, automation, and more. Discover how experts tackle today’s essential tasks–and challenge yourself to new levels of mastery. Explore SQL Server 2019’s toolset, including the improved SQL Server Management Studio, Azure Data Studio, and Configuration Manager Design, implement, manage, and govern on-premises, hybrid, or Azure database infrastructures Install and configure SQL Server on Windows and Linux Master modern maintenance and monitoring with extended events, Resource Governor, and the SQL Assessment API Automate tasks with maintenance plans, PowerShell, Policy-Based Management, and more Plan and manage data recovery, including hybrid backup/restore, Azure SQL Database recovery, and geo-replication Use availability groups for high availability and disaster recovery Protect data with Transparent Data Encryption, Always Encrypted, new Certificate Management capabilities, and other advances Optimize databases with SQL Server 2019’s advanced performance and indexing features Provision and operate Azure SQL Database and its managed instances Move SQL Server workloads to Azure: planning, testing, migration, and post-migration