talk-data.com

Event

O'Reilly Data Engineering Books

2001-10-19 – 2027-05-25 O'Reilly

Activities tracked

3377

Collection of O'Reilly books on Data Engineering.

Filtering by: data-engineering

Sessions & talks

Showing 601–625 of 3377 · Newest first

Optimize the Value of Your Data with Oracle and IBM Flash Storage Solutions

In this multicloud and cognitive era, information continues to grow rapidly. By 2025, IDC says, worldwide data will grow by 61% to 175 zettabytes, with as much of the data in data centers as in the cloud. IT environments with Oracle deployments will need to accommodate that data growth, including storing, copying, mirroring, and protecting the data. When IT budgets are constrained but data keeps growing, storage costs can consume more than their fair share of the IT budget. IBM®'s leading-edge portfolio of storage solutions and essential technologies can help organizations stay ahead of the information explosion. Designed with built-in efficiency, these solutions represent preferred practices that address the following main storage objectives for hybrid multicloud environments:

- Stop storing so much
- Store more with what you have
- Move Oracle and related data to balance performance and efficiency

IBM offers true enterprise-class storage support for Oracle deployments at a low total cost of ownership (TCO). With flash disk, tape, storage network hardware, a consolidated management console, software-defined storage solutions, and security software, IBM can provide Oracle customers the full spectrum of products to meet their availability, retention, security, and compliance requirements.

IBM AIX Enhancements and Modernization

This IBM® Redbooks publication is a comprehensive guide to the IBM AIX® operating system (OS): its layout capabilities, distinct features, and system installation and maintenance, including AIX security, trusted environment, and compliance integration, together with the benefits of IBM Power Virtualization Management (PowerVM®) and IBM Power Virtualization Center (IBM PowerVC), which includes cloud capabilities and automation types. The objective of this book is to introduce IBM AIX modernization features and integration with different environments:

- General AIX enhancements
- AIX Live Kernel Update, individually or using Network Installation Manager (NIM)
- AIX security features and integration
- AIX networking enhancements
- PowerVC integration and features for cloud environments
- AIX deployment using IBM Terraform and IBM Cloud Automation Manager
- AIX automation that uses configuration management tools
- PowerVM enhancements and features
- Latest disaster recovery (DR) solutions
- AIX Logical Volume Manager (LVM) and Enhanced Journaled File System (JFS2)
- AIX installation and maintenance techniques

Implementing IBM Spectrum Virtualize for Public Cloud Version 8.3

IBM® Spectrum Virtualize is a key member of the IBM Spectrum™ Storage portfolio. It is a highly flexible storage solution that enables rapid deployment of block storage services for new and traditional workloads, on-premises, off-premises, and in a combination of both. IBM Spectrum Virtualize™ for Public Cloud provides the IBM Spectrum Virtualize functionality in IBM Cloud™. This new capability provides a monthly license to deploy and use Spectrum Virtualize in IBM Cloud to enable hybrid cloud solutions, offering the ability to transfer data between on-premises private clouds or data centers and the public cloud. This IBM Redpaper™ publication gives a broad understanding of the IBM Spectrum Virtualize for Public Cloud architecture and provides planning and implementation details for the common use cases of this product. It helps storage and networking administrators plan, install, tailor, and configure the IBM Spectrum Virtualize for Public Cloud offering, and it provides a detailed description of troubleshooting tips. IBM Spectrum Virtualize is also available on AWS. For more information, see Implementation guide for IBM Spectrum Virtualize for Public Cloud on AWS, REDP-5534.

IBM Power Systems Infrastructure I/O for SAP Applications

This IBM® Redpaper publication describes practical experience running SAP workloads that take advantage of IBM Power Systems I/O capabilities. With IBM POWER® processor-based servers, you have the flexibility to seamlessly fit new applications and workloads into a single data center, and even consolidate them into a single server. This publication highlights all viable options and describes the pros and cons of each so that you can select the correct option for a specific data center. The target audiences of this book are architects, IT specialists, and systems administrators who deploy SAP workloads and spend much time and effort managing, provisioning, and monitoring SAP software systems and landscapes on IBM Power Systems servers.

MOS Study Guide for Microsoft Access Expert Exam MO-500

Advance your everyday proficiency with Access 2019. And earn the credential that proves it! Demonstrate your expertise with Microsoft Access! Designed to help you practice and prepare for Microsoft Office Specialist (MOS): Access 2019 certification, this official Study Guide delivers:

- In-depth preparation for each MOS objective
- Detailed procedures to help build the skills measured by the exam
- Hands-on tasks to practice what you've learned
- Practice files and sample solutions

Sharpen the skills measured by these objectives:

- Create and manage databases
- Build tables
- Create queries
- Create forms
- Create reports

About MOS: A Microsoft Office Specialist (MOS) certification validates your proficiency with Microsoft Office programs, demonstrating that you can meet globally recognized performance standards. Hands-on experience with the technology is required to successfully pass Microsoft Certification exams.

Geographical Modeling

The modeling of cities and territories has progressed greatly in the last 20 years, owing first to geographic information systems and then to the availability of large amounts of georeferenced data, both on the Internet and through the use of connected objects. In addition, the rise in the performance of computational methods for the simulation and exploration of dynamic models has facilitated this advancement. Geographical Modeling presents previously unpublished information on the main advances achieved by these new approaches. Each of the six chapters provides a bibliographic review and precisely describes the methods used, highlighting their advantages and discussing their interpretations; all are illustrated by many examples. The book also explains with clarity the theoretical foundations of geographical analysis, the delicate operations of model selection, and the applications of fractals and scaling laws. These applications include gaining knowledge of the morphology of cities and the organization of urban transport, and finding new methods of building and exploring simulation models and visualizations of data and results.

IBM FlashSystem 9200R Rack Solution Product Guide

The FlashSystem 9200 combines the performance of flash and end-to-end Non-Volatile Memory Express (NVMe) with the reliability and innovation of IBM® FlashCore technology, the ultra-low latency of Storage Class Memory (SCM), the rich features of IBM Spectrum® Virtualize and AI predictive storage management, and proactive support by Storage Insights. All of these features are included in a powerful, blazing-fast 2U enterprise-class all-flash array.

Building a Unified Data Infrastructure

The vast majority of businesses today already have a documented data strategy. But only a third of these forward-thinking companies have evolved into data-driven organizations or even begun to move toward a data culture. Most have yet to treat data as a business asset, much less use data and analytics to compete in the marketplace. What’s the solution? This insightful report demonstrates the importance of creating a holistic data infrastructure approach. You’ll learn how data virtualization (DV), master data management (MDM), and metadata-management capabilities can help your organization meet business objectives. Chief data officers, enterprise architects, analytics leaders, and line-of-business executives will understand the benefits of combining these capabilities into a unified data platform.

- Explore three separate business contexts that depend on data: operations, analytics, and governance
- Learn a pragmatic and holistic approach to building a unified data infrastructure
- Understand the critical capabilities of this approach, including the ability to work with existing technology
- Apply six best practices for combining data management capabilities

Streaming Integration

Data is being generated at an unrelenting pace, and data storage capacity can’t keep up. Enterprises must modernize the way they use and manage data by collecting, processing, and analyzing it in real time—in other words, streaming. This practical report explains everything organizations need to know to begin their streaming integration journey and make the most of their data. Authors Steve Wilkes and Alok Pareek detail the key attributes and components of an enterprise-grade streaming integration platform, along with stream processing and analysis techniques that will help companies reap immediate value from their data and solve their most pressing business challenges.

- Learn how to collect and handle large volumes of data at scale
- See how streams move data between threads, processes, servers, and data centers
- Get your data in the form you need and analyze it in real time
- Dive into the pros and cons of data targets such as databases, Hadoop, and cloud services for specific use cases
- Ensure your streaming integration infrastructure scales, is secure, works 24/7, and can handle failure
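The in-flight processing idea at the heart of streaming integration can be sketched with a toy producer/consumer pipeline. This is a minimal illustration using plain Python threads and a queue; the event shape and sensor names are invented for the example, and it is not the authors' platform API:

```python
import queue
import threading

def producer(q, events):
    # Push each event onto the stream as it is generated,
    # then send a sentinel to signal end of stream.
    for e in events:
        q.put(e)
    q.put(None)

def consumer(q, totals):
    # Process events as they arrive, instead of storing first
    # and analyzing later.
    while True:
        event = q.get()
        if event is None:
            break
        totals[event["sensor"]] = totals.get(event["sensor"], 0) + event["value"]

events = [{"sensor": "a", "value": 1},
          {"sensor": "b", "value": 2},
          {"sensor": "a", "value": 3}]
stream = queue.Queue()
totals = {}
t = threading.Thread(target=consumer, args=(stream, totals))
t.start()
producer(stream, events)
t.join()
# totals now holds running sums computed in flight: {"a": 4, "b": 2}
```

The same pattern scales up conceptually from threads to processes, servers, and data centers, which is the progression the report's second bullet describes.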

The Evolving Role of the Data Engineer

Companies working to become data driven often view data scientists as heroes, but that overlooks the vital role that data engineers play in the process. While data scientists focus on finding new insights from datasets, data engineers deal with preparation—obtaining, cleaning, and creating enhanced versions of the data an organization needs. In this report, Andy Oram examines how the role of data engineer has quickly evolved. DBAs, software engineers, developers, and students will explore the responsibilities of modern data engineers and the skills and tools necessary to do the job. You’ll learn how to deal with software engineering concepts such as rapid and continuous development, automation and orchestration, modularity, and traceability. Decision makers considering a move to the cloud will also benefit from the in-depth discussion this report provides. This report covers:

- Major tasks of data engineers today
- The different levels of structure in data and ways to maximize its value
- Capabilities of third-party cloud options
- Tools for ingestion, transfer, and enrichment
- Using containers and VMs to run the tools
- Software engineering development
- Automation and orchestration of data engineering
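The obtaining, cleaning, and enhancing work described above can be illustrated with a small sketch. The field names and cleaning rules here are hypothetical, not drawn from the report:

```python
raw = [
    {"name": " Alice ", "signup": "2020-01-05", "country": "us"},
    {"name": "Bob", "signup": "2020/01/06", "country": "US"},
    {"name": "", "signup": "2020-01-07", "country": "US"},  # unusable: no name
]

def clean(record):
    # Normalize whitespace, date separators, and country codes;
    # drop records that cannot be repaired.
    name = record["name"].strip()
    if not name:
        return None
    return {
        "name": name,
        "signup": record["signup"].replace("/", "-"),
        "country": record["country"].upper(),
    }

cleaned = [r for r in (clean(rec) for rec in raw) if r is not None]

# Enrichment: derive a field that downstream analysts need.
for r in cleaned:
    r["signup_year"] = int(r["signup"][:4])
```

Real pipelines apply the same shape of work at scale with dedicated ingestion and transformation tools, which is what the report's tooling chapters survey.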

IBM Spectrum Virtualize HyperSwap SAN Implementation and Design Best Practices

In this IBM® Redpaper publication, we outline IBM Spectrum® Virtualize HyperSwap® SAN implementation and design best practices for optimum resiliency of the SAN Volume Controller (SVC) cluster, and we explain why each recommendation is made. It provides IBM Spectrum Virtualize HyperSwap and Stretched Cluster configuration details. Note: In this book, for brevity, we use HyperSwap to refer to both HyperSwap and Stretched Cluster. The product documentation details the minimum requirements; however, it does not describe the design of the storage area network (SAN) in detail, nor does it describe the recommended way to implement those requirements on a SAN. This paper is SAN vendor-neutral wherever possible. Any mention of a specific SAN switch vendor, or of terms used by a specific switch vendor, is made only where relevant to a specific context, and does not imply an endorsement of that vendor. Note: Some of the figures in this document might not depict redundant fabrics or storage configurations. This was done for simplicity; any recommendations made for fabric design assume that there are two redundant fabrics.

IBM Spectrum LSF Suite: Installation Best Practices Guide

This IBM® Redpaper publication describes IBM Spectrum® LSF® Suite best practices for installation, application checks for workload management, and high availability configurations, by using theoretical knowledge and hands-on exercises. These findings are documented by way of sample scenarios. This publication addresses topics for sellers, IT architects, IT specialists, and anyone who wants to implement and manage a high-performing workload management solution with LSF. Moreover, this guide provides documentation to transfer how-to skills to the technical teams, and solution guidance to the sales team. This publication complements documentation that is available at IBM Knowledge Center, and aligns with educational materials that are provided by IBM Systems.

IBM DS8000 Encryption for data at rest, Transparent Cloud Tiering, and Endpoint Security (DS8000 Release 9.0)

IBM® experts recognize the need for data protection, both from hardware and software failures and from physical relocation of hardware, theft, and retasking of existing hardware. The IBM DS8000® supports encryption-capable hard disk drives (HDDs) and flash drives. These Full Disk Encryption (FDE) drive sets are used with key management services that are provided by IBM Security Key Lifecycle Manager software or Gemalto SafeNet KeySecure to allow encryption for data at rest. Use of encryption technology involves several considerations that are critical for you to understand to maintain the security and accessibility of encrypted data. Failure to follow the requirements that are described in this IBM Redpaper can result in an encryption deadlock. Starting with Release 8.5 code, the DS8000 also supports Transparent Cloud Tiering (TCT) data object encryption. With TCT encryption, data is encrypted before it is transmitted to the cloud. The data remains encrypted in cloud storage and is decrypted after it is transmitted back to the IBM DS8000. Starting with DS8000 Release 9.0, the DS8900F provides Fibre Channel Endpoint Security when communicating with an IBM z15™, which supports link authentication and the encryption of data that is in flight. For more information, see IBM Fibre Channel Endpoint Security for IBM DS8900F and IBM Z, SG24-8455. This edition focuses on IBM Security Key Lifecycle Manager Version 3.0.1.3 or later, which enables support for the Key Management Interoperability Protocol (KMIP) with the DS8000 Release 9.0 code or later, and on the updated DS GUI for encryption functions.

SAP HANA on IBM Power Systems Architectural Summary

This IBM® Redpaper publication delivers SAP HANA architectural concepts for successful implementation on IBM Power Systems servers. This publication addresses topics for sellers, IT architects, IT specialists, and anyone who wants to understand how to take advantage of running SAP HANA workloads on Power Systems servers. Moreover, this guide provides documentation to transfer how-to skills to the technical teams, and it provides solution guidance to the sales team. This publication complements documentation that is available at IBM Knowledge Center, and it aligns with educational materials that are provided by IBM Systems.

IBM Spectrum Scale Immutability Introduction, Configuration Guidance, and Use Cases

This IBM Redpaper™ publication introduces the IBM Spectrum Scale immutability function. It shows how to set it up and presents different ways of managing immutable and append-only files. This publication also provides guidance for implementing IT security aspects in an IBM Spectrum Scale cluster by addressing regulatory requirements. It also describes two typical use cases for managing immutable files: one involves applications that manage file immutability; the other presents a solution that automatically sets files to immutable within an IBM Spectrum Scale immutable fileset.

Block Storage Migration in Open Environments

Companies need to migrate data not only when technology needs to be replaced, but also for consolidation, load balancing, and disaster recovery (DR). Data migration is a critical operation, and this book explains the phases and steps to ensure a smooth migration. Topics range from planning and preparation to execution and validation. The book explains, from a generic standpoint, the appliance-based, storage-based, and host-based techniques that can be used to accomplish the migration. Each method is explained through practical migration scenarios and for various operating systems. This publication addresses the aspects of data migration efforts while focusing on fixed-block storage systems in open environments, with the IBM® FlashSystem 9100 as the target system. Therefore, the book also emphasizes various migration techniques using the Spectrum Virtualize built-in functions. This document targets storage administrators, storage network administrators, system designers, architects, and IT professionals who design, administer, or plan data migrations in large data centers. The aim is to ensure that you are aware of the current thinking, methods, and products that IBM can make available to you. These items are provided to ensure a data migration process that is as efficient and problem-free as possible. The material presented in this book was developed with versions of the referenced products as of February 2020.

IBM Spectrum Protect Plus Practical Guidance for Deployment, Configuration, and Usage

IBM® Spectrum Protect Plus is a data protection solution that provides near-instant recovery, replication, retention, and reuse for virtual machines, databases, and applications in hybrid multicloud environments. IBM Knowledge Center for IBM Spectrum® Protect Plus provides extensive documentation for installation, deployment, and usage. In addition, IBM Spectrum Protect Plus Blueprint (https://ibm.biz/IBMSpectrumProtectPlusBlueprints) provides guidance about how to build and size an IBM Spectrum Protect Plus solution. The goal of this IBM Redpaper publication is to summarize and complement the available information by providing useful hints and tips based on the authors' practical experience in installing and supporting IBM Spectrum Protect Plus in actual customer environments. Over time, our aim is to compile a set of best practices that cover all aspects of the product, from planning and installation to tuning, maintenance, and troubleshooting.

Building an Anonymization Pipeline

How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner. Luk Arbuckle and Khaled El Emam from Privacy Analytics explore end-to-end solutions for anonymizing device and IoT data, based on collection models and use cases that address real business needs. These examples come from some of the most demanding data environments, such as healthcare, using approaches that have withstood the test of time.

- Create anonymization solutions diverse enough to cover a spectrum of use cases
- Match your solutions to the data you use, the people you share it with, and your analysis goals
- Build anonymization pipelines around various data collection models to cover different business needs
- Generate an anonymized version of original data or use an analytics platform to generate anonymized outputs
- Examine the ethical issues around the use of anonymized data
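One common building block in such pipelines is generalization: coarsening quasi-identifiers like age and ZIP code so that records blend into a group. The following is a minimal sketch of that idea; the bucket widths, field names, and records are assumptions for illustration, not the authors' specific method:

```python
def generalize_age(age, width=10):
    # Replace an exact age with a decade-wide bucket.
    lo = (age // width) * width
    return f"{lo}-{lo + width - 1}"

def generalize_zip(zip_code, keep=3):
    # Truncate a ZIP code, masking the trailing digits.
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

records = [
    {"age": 34, "zip": "90210", "diagnosis": "flu"},
    {"age": 37, "zip": "90213", "diagnosis": "cold"},
]
anonymized = [
    {"age": generalize_age(r["age"]),
     "zip": generalize_zip(r["zip"]),
     "diagnosis": r["diagnosis"]}
    for r in records
]
# Both records now share the quasi-identifier ("30-39", "902**"),
# so neither can be singled out by age and ZIP alone.
```

Production anonymization also has to account for re-identification risk across the whole dataset and the context in which data is shared, which is the larger territory the book covers.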

IBM z15 Technical Introduction

This IBM® Redbooks® publication introduces the latest member of the IBM Z® platform, the IBM z15™. It includes information about the Z environment and how it helps integrate data and transactions more securely. It also provides insight for faster and more accurate business decisions. The z15 is a state-of-the-art data and transaction system that delivers advanced capabilities, which are vital to any digital transformation. The z15 is designed for enhanced modularity, and occupies an industry-standard footprint. It is offered as a single air-cooled 19-inch frame called the z15 T02, or as a multi-frame (1 to 4 19-inch frames) called the z15 T01. Both z15 models excel at the following tasks:

- Using hybrid multicloud integration services
- Securing and protecting data with encryption everywhere
- Providing resilience with the key to zero downtime
- Transforming a transactional platform into a data powerhouse
- Getting more out of the platform with IT Operational Analytics
- Accelerating digital transformation with agile service delivery
- Revolutionizing business processes
- Blending open source and IBM Z technologies

This book explains how this system uses innovations and traditional Z strengths to satisfy growing demand for cloud, analytics, and open source technologies. With the z15 as the base, applications can run in a trusted, reliable, and secure environment that improves operations and lessens business risk.

Protecting Data Privacy Beyond the Trusted System of Record

To help you safeguard your sensitive data and provide ease of auditability and control, IBM introduced a new capability for IBM Z® called IBM Data Privacy Passports. It can help minimize the risk and impact of data loss and privacy breaches when collecting and storing sensitive data. Data Privacy Passports can manage how data is shared securely through a central control of user access. Data Privacy Passports can protect data wherever it goes. Security policies are kept and honored whenever the data is accessed. Future data access may be revoked remotely via Data Privacy Passports, long after data leaves the system of record, and sensitive data may even be made unusable simply by destroying its encryption key. Data Privacy Passports is designed to help reduce the time that is spent by staff to protect data and ensure privacy throughout its lifecycle via a central point of control. This IBM Redguide presents a business view of Data Privacy Passports, including how data privacy and protection concerns are addressed. We also explore how value is gained through various business model examples.

IBM Spectrum Scale CSI Driver for Container Persistent Storage

IBM® Spectrum Scale is a proven, scalable, high-performance data and file management solution. It provides world-class storage management with extreme scalability, flash-accelerated performance, and automatic policy-based tiering from flash through disk to tape. It also provides support for various protocols, such as NFS, SMB, Object, HDFS, and iSCSI. Containers can leverage its performance, information lifecycle management (ILM), scalability, and multisite data management, gaining the same flexibility for storage that they have for the runtime. Container adoption is increasing in all industries, and containers sprawl across multiple nodes in a cluster. Effective management of containers is necessary because their number will probably far exceed that of virtual machines today. Kubernetes is the standard container management platform currently in use. Data management is of ultimate importance, yet it is often forgotten because the first workloads to be containerized are ephemeral. For data management, many drivers with different specifications were once available; a specification named Container Storage Interface (CSI) was created and is now adopted by all major container orchestration systems. Although other container orchestration systems exist, Kubernetes became the standard framework for container management. It is a very flexible open source platform used as the base for most cloud providers' and software companies' container orchestration systems. Red Hat OpenShift is one of the most reliable enterprise-grade container orchestration systems based on Kubernetes, designed and optimized to easily deploy web applications and services. OpenShift enables developers to focus on the code, while the platform takes care of the complex IT operations and processes.
This IBM Redbooks® publication describes how the CSI Driver for IBM file storage enables IBM Spectrum® Scale to be used as persistent storage for stateful applications running in Kubernetes clusters. Through the Container Storage Interface Driver for IBM file storage, Kubernetes persistent volumes (PVs) can be provisioned from IBM Spectrum Scale. Therefore, the containers can be used with stateful microservices, such as database applications (MongoDB, PostgreSQL, and so on).

Cassandra: The Definitive Guide, 3rd Edition

Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This third edition—updated for Cassandra 4.0—provides the technical details and practical examples you need to put this database to work in a production environment. Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s nonrelational design, with special attention to data modeling. If you’re a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra’s speed and flexibility.

- Understand Cassandra’s distributed and decentralized structure
- Use the Cassandra Query Language (CQL) and cqlsh—the CQL shell
- Create a working data model and compare it with an equivalent relational model
- Develop sample applications using client drivers for languages including Java, Python, and Node.js
- Explore cluster topology and learn how nodes exchange data
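Cassandra's decentralized structure can be illustrated with a toy token ring in Python. This is a simplified stand-in: the MD5 hash and node names below are assumptions for the sketch, not Cassandra's actual Murmur3 partitioner or its replication strategies:

```python
import hashlib

def token(key):
    # Toy partitioner: hash a string to a position on the token ring.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def replicas(partition_key, nodes, rf=2):
    # Walk the ring clockwise from the key's token and take the
    # next `rf` nodes as replicas, wrapping around the ring.
    ring = sorted(nodes, key=token)
    start = 0
    for i, node in enumerate(ring):
        if token(node) >= token(partition_key):
            start = i
            break
    return [ring[(start + i) % len(ring)] for i in range(rf)]

nodes = ["node-a", "node-b", "node-c"]
owners = replicas("user:42", nodes, rf=2)
# The same key always maps to the same rf nodes; no central
# coordinator decides placement, so any node can route a request.
```

This deterministic key-to-node mapping is why Cassandra can stay available across data centers: every node can compute where any row lives without consulting a master.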

IBM Storage for Red Hat OpenShift Blueprint Version 1 Release 4

IBM Storage for Red Hat OpenShift is a comprehensive container-ready solution that includes all the hardware and software components necessary to set up and/or expand your Red Hat OpenShift environment. This blueprint includes Red Hat OpenShift Container Platform and uses Container Storage Interface (CSI) standards. IBM Storage brings enterprise data services to containers. In this blueprint, learn how to:

- Combine the benefits of IBM Systems with the performance of IBM Storage solutions so that you can deliver the right services to your clients today
- Build a 24x7x365 enterprise-class private cloud with Red Hat OpenShift Container Platform using the new open source Container Storage Interface (CSI) drivers
- Leverage enterprise-class services such as NVMe-based flash performance, high data availability, and advanced container security

IBM Storage for Red Hat OpenShift Container Platform is designed for your DevOps environment for on-premises deployment, with easy-to-consume components built to perform and scale for your enterprise. Simplify your journey to cloud with pre-tested and validated blueprints engineered to enable rapid deployment and peace of mind as you move to a hybrid multicloud environment. You now have the capabilities.