data-engineering

IBM TS4500 R8 Tape Library Guide

2022-03-21 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jesus Eduardo Cervantes Rolon , Ole Asmussen , Larry Coyne , Albrecht Friess , Robert Beiderbeck , Khanh Ngo , Fabian Corona Villarreal , Hans-Günther Hörhammer

ELK IBM Cyber Security data

The IBM® TS4500 (TS4500) tape library is a next-generation tape solution that offers higher storage density and better integrated management than previous solutions. This IBM Redbooks® publication gives you a close-up view of the new IBM TS4500 tape library. In the TS4500, IBM delivers the density that today's and tomorrow's data growth requires. It has the cost-effectiveness and the manageability to grow with business data needs, while you preserve investments in IBM tape library products. Now, you can achieve a low per-terabyte cost and high density, with up to 13 PB of data (up to 39 PB compressed) in a single 10 square-foot library by using LTO Ultrium 9 cartridges or 11 PB with 3592 cartridges. The TS4500 offers the following benefits: Support of the IBM Linear Tape-Open (LTO) Ultrium 9 tape drive: Store up to 1.04 EB 2.5:1 compressed per library with IBM LTO 9 cartridges. High availability: Dual active accessors with integrated service bays reduce inactive service space by 40%. The Elastic Capacity option can be used to eliminate inactive service space. Flexibility to grow: The TS4500 library can grow from the right side and the left side of the first L frame because models can be placed in any active position. Increased capacity: The TS4500 can grow from a single L frame up to another 17 expansion frames with a capacity of over 23,000 cartridges. High-density (HD) generation 1 frames from the TS3500 library can be redeployed in a TS4500. Capacity on demand (CoD): CoD is supported through entry-level, intermediate, and base-capacity configurations. Advanced Library Management System (ALMS): ALMS supports dynamic storage management, which enables users to create and change logical libraries and configure any drive for any logical library. Support for IBM TS1160 while also supporting TS1155, TS1150, and TS1140 tape drive. The TS1160 gives organizations an easy way to deliver fast access to data, improve security, and provide long-term retention, all at a lower cost than disk solutions. The TS1160 offers high-performance, flexible data storage with support for data encryption. Also, this enhanced fifth-generation drive can help protect investments in tape automation by offering compatibility with existing automation. Store up to 1.05 EB 3:1 compressed per library with IBM 3592 cartridges Integrated TS7700 back-end Fibre Channel (FC) switches are available. Up to four library-managed encryption (LME) key paths per logical library are available. This book describes the TS4500 components, feature codes, specifications, supported tape drives, encryption, new integrated management console (IMC), command-line interface (CLI), and REST over SCSI (RoS) to obtain status information about library components. You learn how to accomplish the following tasks: Improve storage density with increased expansion frame capacity up to 2.4 times, and support 33% more tape drives per frame

Data Lakehouse in Action

2022-03-17 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Pradeep Menon

Analytics Azure Cloud Computing Data Analytics Data Governance Data Lakehouse Cyber Security data data-lake storage-repositories

"Data Lakehouse in Action" provides a comprehensive exploration of the Data Lakehouse architecture, a modern solution for scalable and effective large-scale analytics. This book guides you through understanding the principles and components of the architecture, and its implementation using cloud platforms like Azure. Learn the practical techniques for designing robust systems tailored to organizational needs and maturity. What this Book will help me do Understand the evolution and need for modern data architecture patterns like Data Lakehouse. Learn how to design systems for data ingestion, storage, processing, and serving in a Data Lakehouse. Develop best practices for data governance and security in the Data Lakehouse architecture. Discover various analytics workflows enabled by the Data Lakehouse, including real-time and batch approaches. Implement practical Data Lakehouse patterns on a cloud platform, and integrate them with macro-patterns such as Data Mesh. Author(s) Pradeep Menon is a seasoned data architect and engineer with extensive experience implementing data analytics solutions for leading companies. With a penchant for simplifying complex architectures, Pradeep has authored several technical publications and frequently shares his expertise at industry conferences. His hands-on approach and passion for teaching shine through in his practical guides. Who is it for? This book is ideal for data professionals including architects, engineers, and data strategists eager to enhance their knowledge in modern analytics platforms. If you have a basic understanding of data architecture and are curious about implementing systems governed by the Data Lakehouse paradigm, this book is for you. It bridges foundational concepts with advanced practices, making it suitable for learners aiming to contribute effectively to their organization's analytics efforts.

IBM Spectrum Virtualize, IBM FlashSystem, and IBM SAN Volume Controller Security Feature Checklist

2022-03-16 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by James Whitaker , Bill Scales , Barry Whyte

Cloud Computing IBM Cyber Security data

IBM Spectrum® Virtualize based storage systems are secure storage platforms that implement various security-related features, in terms of system-level access controls and data-level security features. This document outlines the available security features and options of IBM Spectrum Virtualize based storage systems. It is not intended as a "how to" or best practice document. Instead, it is a checklist of features that can be reviewed by a user security team to aid in the definition of a policy to be followed when implementing IBM FlashSystem®, IBM SAN Volume Controller, and IBM Spectrum Virtualize for Public Cloud. The topics that are discussed in this paper can be broadly split into two categories: System security This type of security encompasses the first three lines of defense that prevent unauthorized access to the system, protect the logical configuration of the storage system, and restrict what actions users can perform. It also ensures visibility and reporting of system level events that can be used by a Security Information and Event Management (SIEM) solution, such as IBM QRadar®. Data security This type of security encompasses the fourth line of defense. It protects the data that is stored on the system against theft, loss, or attack. These data security features include, but are not limited to, encryption of data at rest (EDAR) or IBM Safeguarded Copy (SGC). This document is correct as of IBM Spectrum Virtualize version 8.5.0.

Data Analysis with Python and PySpark

2022-03-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jonathan Rioux

AI/ML Analytics API Big Data Cloud Computing Data Science Hadoop Microsoft Pandas PySpark Python Spark +2 more

Think big about your data! PySpark brings the powerful Spark big data processing engine to the Python ecosystem, letting you seamlessly scale up your data tasks and create lightning-fast pipelines. In Data Analysis with Python and PySpark you will learn how to: Manage your data as it scales across multiple machines Scale up your data programs with full confidence Read and write data to and from a variety of sources and formats Deal with messy data with PySpark’s data manipulation functionality Discover new data sets and perform exploratory data analysis Build automated data pipelines that transform, summarize, and get insights from data Troubleshoot common PySpark errors Creating reliable long-running jobs Data Analysis with Python and PySpark is your guide to delivering successful Python-driven data projects. Packed with relevant examples and essential techniques, this practical book teaches you to build pipelines for reporting, machine learning, and other data-centric tasks. Quick exercises in every chapter help you practice what you’ve learned, and rapidly start implementing PySpark into your data systems. No previous knowledge of Spark is required. About the Technology The Spark data processing engine is an amazing analytics factory: raw data comes in, insight comes out. PySpark wraps Spark’s core engine with a Python-based API. It helps simplify Spark’s steep learning curve and makes this powerful tool available to anyone working in the Python data ecosystem. About the Book Data Analysis with Python and PySpark helps you solve the daily challenges of data science with PySpark. You’ll learn how to scale your processing capabilities across multiple machines while ingesting data from any source—whether that’s Hadoop clusters, cloud data storage, or local data files. Once you’ve covered the fundamentals, you’ll explore the full versatility of PySpark by building machine learning pipelines, and blending Python, pandas, and PySpark code. What's Inside Organizing your PySpark code Managing your data, no matter the size Scale up your data programs with full confidence Troubleshooting common data pipeline problems Creating reliable long-running jobs About the Reader Written for data scientists and data engineers comfortable with Python. About the Author As a ML director for a data-driven software company, Jonathan Rioux uses PySpark daily. He teaches the software to data scientists, engineers, and data-savvy business analysts. Quotes A clear and in-depth introduction for truly tackling big data with Python. - Gustavo Patino, Oakland University William Beaumont School of Medicine The perfect way to learn how to analyze and master huge datasets. - Gary Bake, Brambles Covers both basic and more advanced topics of PySpark, with a good balance between theory and hands-on. - Philippe Van Bergenl, P² Consulting For beginner to pro, a well-written book to help understand PySpark. - Raushan Kumar Jha, Microsoft

Multimedia Security, Volume 1

2022-03-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by William Puech

Cyber Security data data-security-privacy data security & privacy

Today, more than 80% of the data transmitted over networks and archived on our computers, tablets, cell phones or clouds is multimedia data - images, videos, audio, 3D data. The applications of this data range from video games to healthcare, and include computer-aided design, video surveillance and biometrics. It is becoming increasingly urgent to secure this data, not only during transmission and archiving, but also during its retrieval and use. Indeed, in today’s "all-digital" world, it is becoming ever-easier to copy data, view it unrightfully, steal it or falsify it. Multimedia Security 1 analyzes the issues of the authentication of multimedia data, code and the embedding of hidden data, both from the point of view of defense and attack. Regarding the embedding of hidden data, it also covers invisibility, color, tracing and 3D data, as well as the detection of hidden messages in an image by steganalysis.

Getting Started with CockroachDB

2022-03-11 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Kishen Das Kondabagilu Rajanna

Cloud Computing Cyber Security SQL cockroachdb data relational-databases

"Getting Started with CockroachDB" provides an in-depth introduction to CockroachDB, a modern, distributed SQL database designed for cloud-native applications. Through this guide, you'll learn how to deploy, manage, and optimize CockroachDB to build highly reliable, scalable database solutions tailored for demanding and distributed workloads. What this Book will help me do Understand the architecture and design principles of CockroachDB and its fault-tolerant model. Learn how to set up and manage CockroachDB clusters for high availability and automatic scaling. Discover the concepts of data distribution and geo-partitioning to achieve low-latency global interactions. Explore indexing mechanisms in CockroachDB to optimize query performance for fast data retrieval. Master operational strategies, security configuration, and troubleshooting techniques for database management. Author(s) Kishen Das Kondabagilu Rajanna is an experienced software developer and database expert with a deep interest in distributed architectures. With hands-on experience working with CockroachDB and other database technologies, Kishen is passionate about sharing actionable insights with readers. His approach focuses on equipping developers with practical skills to excel in building and managing scalable, efficient database services. Who is it for? This book is ideal for software developers, database administrators, and database engineers seeking to learn CockroachDB for building robust, scalable database systems. If you're new to CockroachDB but possess basic database knowledge, this guide will equip you with the practical skills to leverage CockroachDB's capabilities effectively.

IBM Spectrum Archive Enterprise Edition V1.3.2.2: Installation and Configuration Guide

2022-03-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Yasuhiro Yoshihara , Larry Coyne , Yuka Sasaki , Arnold Byron Lua , Hiroyuki Miyoshi , Khanh Ngo

IBM data

This IBM® Redbooks® publication helps you with the planning, installation, and configuration of the new IBM Spectrum® Archive Enterprise Edition (EE) Version 1.3.2.2 for the IBM TS4500, IBM TS3500, IBM TS4300, and IBM TS3310 tape libraries. IBM Spectrum Archive Enterprise Edition enables the use of the LTFS for the policy management of tape as a storage tier in an IBM Spectrum Scale based environment. It also helps encourage the use of tape as a critical tier in the storage environment. This edition of this publication is the tenth edition of IBM Spectrum Archive Installation and Configuration Guide. IBM Spectrum Archive EE can run any application that is designed for disk files on a physical tape media. IBM Spectrum Archive EE supports the IBM Linear Tape-Open (LTO) Ultrium 9, 8, 7, 6, and 5 tape drives. and the IBM TS1160, TS1155, TS1150, and TS1140 tape drives. IBM Spectrum Archive EE can play a major role in reducing the cost of storage for data that does not need the access performance of primary disk. The use of IBM Spectrum Archive EE to replace disks with physical tape in tier 2 and tier 3 storage can improve data access over other storage solutions because it improves efficiency and streamlines management for files on tape. IBM Spectrum Archive EE simplifies the use of tape by making it transparent to the user and manageable by the administrator under a single infrastructure. This publication is intended for anyone who wants to understand more about IBM Spectrum Archive EE planning and implementation. This book is suitable for IBM customers, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

Databases Illuminated, 4th Edition

2022-03-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Karen C. Davis , Catherine M. Ricardo , Susan D. Urban

data database-theory relational-databases

Databases Illuminated, Fourth Edition is designed to help students integrate theoretical material with practical knowledge, using an approach that applies theory to practical database implementation.

Data Mesh

2022-03-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Zhamak Dehghani (Nextdata)

AI/ML Analytics Big Data Data Governance Data Management data data-mesh database-architecture

We're at an inflection point in data, where our data management solutions no longer match the complexity of organizations, the proliferation of data sources, and the scope of our aspirations to get value from data with AI and analytics. In this practical book, author Zhamak Dehghani introduces data mesh, a decentralized sociotechnical paradigm drawn from modern distributed architecture that provides a new approach to sourcing, sharing, accessing, and managing analytical data at scale. Dehghani guides practitioners, architects, technical leaders, and decision makers on their journey from traditional big data architecture to a distributed and multidimensional approach to analytical data management. Data mesh treats data as a product, considers domains as a primary concern, applies platform thinking to create self-serve data infrastructure, and introduces a federated computational model of data governance. Get a complete introduction to data mesh principles and its constituents Design a data mesh architecture Guide a data mesh strategy and execution Navigate organizational design to a decentralized data ownership model Move beyond traditional data warehouses and lakes to a distributed data mesh

Cyber Resilient Infrastructure: Detect, Protect, and Mitigate Threats Against Brocade SAN FOS with IBM QRadar

2022-03-02 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by IBM Storage

IBM Fabric Python Cyber Security data

Enterprise networks are large and rely on numerous connected endpoints to ensure smooth operational efficiency. However, they also present a challenge from a security perspective. The focus of this Blueprint is to demonstrate an early threat detection against the network fabric that is powered by Brocade that uses IBM® QRadar®. It also protects the same if a cyberattack or an internal threat by rouge user within the organization occurs. The publication also describes how to configure the syslog that is forwarding on Brocade SAN FOS. Finally, it explains how the forwarded audit events are used for detecting the threat and runs the custom action to mitigate the threat. The focus of this publication is to proactively start a cyber resilience workflow from IBM QRadar to block an IP address when multiple failed logins on Brocade switch are detected. As part of early threat detection, a sample rule that us used by IBM QRadar is shown. A Python script that also is used as a response to block the user's IP address in the switch is provided. Customers are encouraged to create control path or data path use cases, customized IBM QRadar rules, and custom response scripts that are best-suited to their environment. The use cases, QRadar rules, and Python script that are presented here are templates only and cannot be used as-is in an environment.

Snowflake Access Control: Mastering the Features for Data Privacy and Regulatory Compliance

2022-03-02 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jessica Megan Larson

Cloud Computing DWH GDPR/CCPA Cyber Security Snowflake data

Understand the different access control paradigms available in the Snowflake Data Cloud and learn how to implement access control in support of data privacy and compliance with regulations such as GDPR, APPI, CCPA, and SOX. The information in this book will help you and your organization adhere to privacy requirements that are important to consumers and becoming codified in the law. You will learn to protect your valuable data from those who should not see it while making it accessible to the analysts whom you trust to mine the data and create business value for your organization. Snowflake is increasingly the choice for companies looking to move to a data warehousing solution, and security is an increasing concern due to recent high-profile attacks. This book shows how to use Snowflake's wide range of features that support access control, making it easier to protect data access from the data origination point all the way to the presentation and visualization layer.Reading this book helps you embrace the benefits of securing data and provide valuable support for data analysis while also protecting the rights and privacy of the consumers and customers with whom you do business. What You Will Learn Identify data that is sensitive and should be restricted Implement access control in the Snowflake Data Cloud Choose the right access control paradigm for your organization Comply with CCPA, GDPR, SOX, APPI, and similar privacy regulations Take advantage of recognized best practices for role-based access control Prevent upstream and downstream services from subverting your access control Benefit from access control features unique to the Snowflake Data Cloud Who This Book Is For Data engineers, database administrators, and engineering managers who wantto improve their access control model; those whose access control model is not meeting privacy and regulatory requirements; those new to Snowflake who want to benefit from access control features that are unique to the platform; technology leaders in organizations that have just gone public and are now required to conform to SOX reporting requirements

Mastering Snowflake Solutions: Supporting Analytics and Data Sharing

2022-02-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Adam Morton

Agile/Scrum Analytics BI GDPR/CCPA Cyber Security Snowflake data

Design for large-scale, high-performance queries using Snowflake’s query processing engine to empower data consumers with timely, comprehensive, and secure access to data. This book also helps you protect your most valuable data assets using built-in security features such as end-to-end encryption for data at rest and in transit. It demonstrates key features in Snowflake and shows how to exploit those features to deliver a personalized experience to your customers. It also shows how to ingest the high volumes of both structured and unstructured data that are needed for game-changing business intelligence analysis. Mastering Snowflake Solutions starts with a refresher on Snowflake’s unique architecture before getting into the advanced concepts that make Snowflake the market-leading product it is today. Progressing through each chapter, you will learn how to leverage storage, query processing, cloning, data sharing, and continuous data protection features. This approach allows for greater operational agility in responding to the needs of modern enterprises, for example in supporting agile development techniques via database cloning. The practical examples and in-depth background on theory in this book help you unleash the power of Snowflake in building a high-performance system with little to no administrative overhead. Your result from reading will be a deep understanding of Snowflake that enables taking full advantage of Snowflake’s architecture to deliver value analytics insight to your business. What You Will Learn Optimize performance and costs associated with your use of the Snowflake data platform Enable data security to help in complying with consumer privacy regulations such as CCPA and GDPR Share data securely both inside your organization and with external partners Gain visibility to each interaction with your customersusing continuous data feeds from Snowpipe Break down data silos to gain complete visibility your business-critical processes Transform customer experience and product quality through real-time analytics Who This Book Is for Data engineers, scientists, and architects who have had some exposure to the Snowflake data platform or bring some experience from working with another relational database. This book is for those beginning to struggle with new challenges as their Snowflake environment begins to mature, becoming more complex with ever increasing amounts of data, users, and requirements. New problems require a new approach and this book aims to arm you with the practical knowledge required to take advantage of Snowflake’s unique architecture to get the results you need.

Analytics Optimization with Columnstore Indexes in Microsoft SQL Server: Optimizing OLAP Workloads

2022-02-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Edward Pollack

Analytics BI Microsoft SQL SQL Server data microsoft-sql-server relational-databases

Meet the challenge of storing and accessing analytic data in SQL Server in a fast and performant manner. This book illustrates how columnstore indexes can provide an ideal solution for storing analytic data that leads to faster performing analytic queries and the ability to ask and answer business intelligence questions with alacrity. The book provides a complete walk through of columnstore indexing that encompasses an introduction, best practices, hands-on demonstrations, explanations of common mistakes, and presents a detailed architecture that is suitable for professionals of all skill levels. With little or no knowledge of columnstore indexing you can become proficient with columnstore indexes as used in SQL Server, and apply that knowledge in development, test, and production environments. This book serves as a comprehensive guide to the use of columnstore indexes and provides definitive guidelines. You will learn when columnstore indexes shouldbe used, and the performance gains that you can expect. You will also become familiar with best practices around architecture, implementation, and maintenance. Finally, you will know the limitations and common pitfalls to be aware of and avoid. As analytic data can become quite large, the expense to manage it or migrate it can be high. This book shows that columnstore indexing represents an effective storage solution that saves time, money, and improves performance for any applications that use it. You will see that columnstore indexes are an effective performance solution that is included in all versions of SQL Server, with no additional costs or licensing required. What You Will Learn Implement columnstore indexes in SQL Server Know best practices for the use and maintenance of analytic data in SQL Server Use metadata to fully understand the size and shape of data stored in columnstore indexes Employ optimal ways to load, maintain, and delete data from large analytic tables Know how columnstore compression saves storage, memory, and time Understand when a columnstore index should be used instead of a rowstore index Be familiar with advanced features and analytics Who This Book Is For Database developers, administrators, and architects who are responsible for analytic data, especially for those working with very large data sets who are looking for new ways to achieve high performance in their queries, and those with immediate or future challenges to analytic data and query performance who want a methodical and effective solution

What Is Distributed SQL?

2022-02-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Charles Custer , Paul Modderman , Jim Walker (Cockroach Labs)

Cloud Computing ELK NoSQL RDBMS Cyber Security SQL data nosql-databases

Globally available resources have become the status quo. They're accessible, distributed, and resilient. Our traditional SQL database options haven't kept up. Centralized SQL databases, even those with read replicas in the cloud, put all the transactional load on a central system. The further away that a transaction happens from the user, the more the user experience suffers. If the transactional data powering the application is greatly slowed down, fast-loading web pages mean nothing. In this report, Paul Modderman, Jim Walker, and Charles Custer explain how distributed SQL fits all applications and eliminates complex challenges like sharding from traditional RDBMS systems. You'll learn how distributed SQL databases can reach global scale without introducing the consistency trade-offs found in NoSQL solutions. These databases come to life through cloud computing, while legacy databases simply can't rise to meet the elastic and ubiquitous new paradigm. You'll learn: Key concepts driving this new technology, including the CAP theorem, the Raft consensus algorithm, multiversion concurrency control, and Google Spanner How distributed SQL databases meet enterprise requirements, including management, security, integration, and Everything as a Service (XaaS) The impact that distributed SQL has already made in the telecom, retail, and gaming industries Why serverless computing is an ideal fit for distributed SQL How distributed SQL can help you expand your company's strategic plan

Electronic Health Records with Epic and IBM FlashSystem 9500 Blueprint Version 2 Release 4

2022-02-23 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by IBM

IBM data

This information is intended to facilitate the deployment of IBM© FlashSystem© for the Epic Corporation electronic health record (EHR) solution by describing the requirements and specifications for configuring IBM FlashSystem 9500 and its parameters. This document also describes the required steps to configure the server that hosts the EHR application. To complete these tasks, you must be knowledgeable of IBM FlashSystem 9500 and Epic applications. This Blueprint provides the following information: A solutions architecture and the related solution configuration information for the following essential components of software and hardware: Detailed technical configuration steps for configuring IBM FlashSystem 9500 Server configuration details for Caché database and Epic applications

IBM DS8000 Easy Tier (Updated for DS8000 R9.0)

2022-02-23 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Peter Kimmel , Matthew Houzenga , Bertrand Dufrasne , Dennis Robertson

IBM data

This IBM® Redpaper™ publication describes the concepts and functions of IBM System Storage® Easy Tier®, and explains its practical use with the IBM DS8000® series and License Machine Code 7.9.0.xxx (also known as R9.0).. Easy Tier is designed to automate data placement throughout the storage system disks pool. It enables the system to (automatically and without disruption to applications) relocate data (at the extent level) across up to three drive tiers. The process is fully automated. Easy Tier also automatically rebalances extents among ranks within the same tier, removing workload skew between ranks, even within homogeneous and single-tier extent pools. Easy Tier supports a Manual Mode that enables you to relocate full volumes. Manual Mode also enables you to merge extent pools and offers a rank depopulation function. Easy Tier fully supports thin-provisioned Extent Space Efficient fixed block (FB) and count key data (CKD) volumes in Manual Mode and Automatic Mode. Easy Tier also supports extent pools with small extents (16 MiB extents for FB pools and 21 cylinders extents for CKD pools). Easy Tier also supports high-performance and high-capacity flash drives in the High-performance flash enclosure, and it enables additional user controls at the pool and volume levels. This paper is aimed at those professionals who want to understand the Easy Tier concept and its underlying design. It also provides guidance and practical illustrations for users who want to use the Easy Tier Manual Mode capabilities. Easy Tier includes additional capabilities to further enhance your storage performance automatically: Easy Tier Application, and Easy Tier Heat Map Transfer.

IBM DS8900F Architecture and Implementation: Updated for Release 9.2

2022-02-22 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Connie Riggins , Lisa Martinez , Bertrand Dufrasne , Mike Stenson , Jeff Cook , Sherri Brunson

AI/ML BI IBM data

This IBM® RedpaperRedbooks® publication describes the concepts, architecture, and implementation of the IBM DS8900F family. The WhitepaperRedpaperbook provides reference information to assist readers who need to plan for, install, and configure the DS8900F systems. This edition applies to DS8900F systems with IBM DS8000® Licensed Machine Code (LMC) 7.9.20 (bundle version 89.20.xx.x), referred to as Release 9.2. The DS8900F is an all-flash system exclusively, and it offers three classes: DS8980F: Analytic Class: The DS8980F Analytic Class offers best performance for organizations that want to expand their workload possibilities to artificial intelligence (AI), Business Intelligence (BI), and machine learning (ML). IBM DS8950F: Agility Class all-flash: The Agility Class consolidates all your mission-critical workloads for IBM Z®, IBM LinuxONE, IBM Power Systems, and distributed environments under a single all-flash storage solution.. IBM DS8910F: Flexibility Class all-flash: The Flexibility Class reduces complexity while addressing various workloads at the lowest DS8900F family entry cost. . TThe DS8900F architecture relies on powerful IBM POWER9™ processor-based servers that manage the cache to streamline disk input/output (I/O), which maximizes performance and throughput. These capabilities are further enhanced by High-Performance Flash Enclosures (HPFE) Gen2. Like its predecessors, the DS8900F supports advanced disaster recovery (DR) solutions, business continuity solutions, and thin provisioning. The IBM DS8910F Rack-Mounted model 993 is described in IBM DS8910F Model 993 Rack-Mounted Storage System Release 9.1, REDP-5566.

Highly Efficient Data Access with RoCE on IBM Elastic Storage Systems and IBM Spectrum Scale

2022-02-18 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Piyush Chaudhary , Gero Schmidt , Olaf Weiser

ELK IBM data

With Remote Direct Memory Access (RDMA), you can make a subset of a host's memory directly available to a remote host. RDMA is available on standard Ethernet-based networks by using the RDMA over Converged Ethernet (RoCE) interface. The RoCE network protocol is an industry-standard initiative by the InfiniBand Trade Association. This IBM® Redpaper publication describes how to set up RoCE to use within an IBM Spectrum® Scale cluster and IBM Elastic Storage® Systems (ESSs). This book is targeted at technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) who are responsible for delivering cost-effective storage solutions with IBM Spectrum Scale and IBM ESSs.

Kafka in Action

2022-02-13 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dave Klein , Dylan Scott , Viktor Gamov (Confluent)

Analytics ETL/ELT Java Kafka Data Streaming data streaming-messaging

Master the wicked-fast Apache Kafka streaming platform through hands-on examples and real-world projects. In Kafka in Action you will learn: Understanding Apache Kafka concepts Setting up and executing basic ETL tasks using Kafka Connect Using Kafka as part of a large data project team Performing administrative tasks Producing and consuming event streams Working with Kafka from Java applications Implementing Kafka as a message queue Kafka in Action is a fast-paced introduction to every aspect of working with Apache Kafka. Starting with an overview of Kafka's core concepts, you'll immediately learn how to set up and execute basic data movement tasks and how to produce and consume streams of events. Advancing quickly, you’ll soon be ready to use Kafka in your day-to-day workflow, and start digging into even more advanced Kafka topics. About the Technology Think of Apache Kafka as a high performance software bus that facilitates event streaming, logging, analytics, and other data pipeline tasks. With Kafka, you can easily build features like operational data monitoring and large-scale event processing into both large and small-scale applications. About the Book Kafka in Action introduces the core features of Kafka, along with relevant examples of how to use it in real applications. In it, you’ll explore the most common use cases such as logging and managing streaming data. When you’re done, you’ll be ready to handle both basic developer- and admin-based tasks in a Kafka-focused team. What's Inside Kafka as an event streaming platform Kafka producers and consumers from Java applications Kafka as part of a large data project About the Reader For intermediate Java developers or data engineers. No prior knowledge of Kafka required. About the Authors Dylan Scott is a software developer in the insurance industry. Viktor Gamov is a Kafka-focused developer advocate. At Confluent, Dave Klein helps developers, teams, and enterprises harness the power of event streaming with Apache Kafka. Quotes The authors have had many years of real-world experience using Kafka, and this book’s on-the-ground feel really sets it apart. - From the foreword by Jun Rao, Confluent Cofounder A surprisingly accessible introduction to a very complex technology. Developers will want to keep a copy close by. - Conor Redmond, InComm Payments A comprehensive and practical guide to Kafka and the ecosystem. - Sumant Tambe, Linkedin It quickly gave me insight into how Kafka works, and how to design and protect distributed message applications. - Gregor Rayman, Cloudfarms

PHP & MySQL: Novice to Ninja, 7th Edition

2022-02-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Tom Butler

MySQL SQL data relational-databases

PHP & MySQL: Novice to Ninja, 7th Edition is a hands-on guide to learning all the tools, principles, and techniques needed to build a professional web application using PHP & MySQL. Comprehensively updated to cover PHP 8 and modern best practice, this highly practical and fun book covers everything from installation through to creating a complete online content management system. Gain a thorough understanding of PHP syntax Master database design principles and SQL Write robust, maintainable, best practice code Build a working content management system (CMS) And much more!

talk-data.com

Activity Trend

Top Events

Top Speakers

IBM TS4500 R8 Tape Library Guide

Data Lakehouse in Action

IBM Spectrum Virtualize, IBM FlashSystem, and IBM SAN Volume Controller Security Feature Checklist

Data Analysis with Python and PySpark

Multimedia Security, Volume 1

Getting Started with CockroachDB

IBM Spectrum Archive Enterprise Edition V1.3.2.2: Installation and Configuration Guide

Databases Illuminated, 4th Edition

Data Mesh

Cyber Resilient Infrastructure: Detect, Protect, and Mitigate Threats Against Brocade SAN FOS with IBM QRadar

Snowflake Access Control: Mastering the Features for Data Privacy and Regulatory Compliance

Mastering Snowflake Solutions: Supporting Analytics and Data Sharing

Analytics Optimization with Columnstore Indexes in Microsoft SQL Server: Optimizing OLAP Workloads

What Is Distributed SQL?

Electronic Health Records with Epic and IBM FlashSystem 9500 Blueprint Version 2 Release 4

IBM DS8000 Easy Tier (Updated for DS8000 R9.0)

IBM DS8900F Architecture and Implementation: Updated for Release 9.2

Highly Efficient Data Access with RoCE on IBM Elastic Storage Systems and IBM Spectrum Scale

Kafka in Action

PHP & MySQL: Novice to Ninja, 7th Edition