talk-data.com

Topic

Linux

operating_system open_source unix_like

200 tagged

Activity Trend: peak of 20 per quarter, 2020-Q1 to 2026-Q1

Activities

Filtering by: O'Reilly Data Engineering Books

Implementing an IBM High-Performance Computing Solution on IBM Power System S822LC

This IBM® Redbooks® publication demonstrates and documents that IBM Power Systems™ high-performance computing and technical computing solutions deliver faster time to value with powerful solutions. Configurable into highly scalable Linux clusters, Power Systems offer extreme performance for demanding workloads such as genomics, finance, computational chemistry, oil and gas exploration, and high-performance data analytics. This book delivers a high-performance computing solution implemented on the IBM Power System S822LC. The solution delivers high application performance and throughput based on its built-for-big-data architecture that incorporates IBM POWER8® processors, tightly coupled Field Programmable Gate Arrays (FPGAs) and accelerators, and faster I/O by using Coherent Accelerator Processor Interface (CAPI). This solution is ideal for clients that need more processing power while simultaneously increasing workload density and reducing datacenter floor space requirements. The Power S822LC offers a modular design to scale from a single rack to hundreds, simplicity of ordering, and a strong innovation roadmap for graphics processing units (GPUs). This publication is targeted toward technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) responsible for delivering cost-effective high-performance computing (HPC) solutions that help uncover insights from their data so they can optimize business results, product development, and scientific discoveries.

iSCSI Implementation and Best Practices on IBM Storwize

This IBM® Redbooks® publication helps administrators and technical professionals understand Internet Small Computer System Interface (iSCSI) and how to implement it for use with IBM Storwize® storage systems. iSCSI can be used alone or with other technologies. This publication provides an overview of the iSCSI protocol and helps you understand how it is similar to and different from Fibre Channel (FC) technology. It helps you plan and design your network topology. It explains how to configure your IBM Storwize storage systems and the hosts (including IBM AIX®, Linux, VMware, and Microsoft Windows hosts) that interact with them. It also provides an overview of using IBM Storwize storage systems with OpenStack. This book describes iSCSI configuration for IBM Storwize and SAN Volume Controller storage systems at Version 7.6 or later. In addition to configuration, this publication provides information about performance and troubleshooting.

Getting Started with KVM for IBM z Systems

This IBM® Redbooks® publication gives a broad explanation of the kernel-based virtual machine (KVM) for IBM z Systems™ (KVM for IBM z Systems) and how it uses the architecture of IBM z Systems platforms. It focuses on the planning of the environment and provides installation and configuration definitions that are necessary to build and manage KVM for IBM z Systems. This publication is useful to IT architects and system administrators who plan for and install KVM for IBM z Systems. The reader is expected to have a good understanding of IBM z Systems hardware, KVM for IBM z Systems, Linux on z Systems, and virtualization concepts.

IBM PowerKVM: Configuration and Use

This IBM® Redpaper Redbooks® publication presents the IBM PowerKVM virtualization for scale-out Linux systems, including the new LC IBM Power Systems™. PowerKVM is open source server virtualization that is based on the IBM POWER8® processor technology. It includes the Linux open source technology of KVM virtualization, and it complements the performance, scalability, and security qualities of Linux. This book describes the concepts of PowerKVM and how you can deploy your virtual machines with the software stack included in the product. It helps you install and configure PowerKVM on your Power Systems server and provides guidance for managing the supported virtualization features by using the web interface and command-line interface (CLI). This information is for professionals who want to acquire a better understanding of PowerKVM virtualization technology to optimize Linux workload consolidation and use the POWER8 processor features. The intended audience also includes people in these roles:
• Clients
• Sales and marketing professionals
• Technical support professionals
• IBM Business Partners
• Independent software vendors
• Open source community
• IBM OpenPower partners
It does not replace the latest marketing materials and configuration tools. It is intended as an additional source of information that, along with existing sources, can be used to increase your knowledge of IBM virtualization solutions. Before you start reading, you must be familiar with the general concepts of kernel-based virtual machine (KVM), Linux, and IBM Power architecture.

Ceph Cookbook

Ceph Cookbook is a practical guide offering over 100 detailed recipes to help you effectively design, implement, and manage the Ceph software-defined storage system. Through step-by-step tutorials, readers will master critical tasks, from cluster setup to integration with cloud and virtualization platforms.
What this Book will help me do
• Gain hands-on skills to set up, manage, and maintain a Ceph cluster effectively.
• Learn to integrate Ceph with popular cloud solutions like OpenStack for optimal performance.
• Understand techniques for advanced troubleshooting, monitoring, and optimization of storage systems.
• Develop proficiency in creating scalable storage solutions for enterprise environments.
• Master best practices in utilizing Ceph's various storage paradigms and technologies.
Author(s)
Karan Singh is a seasoned technology professional with extensive experience in storage systems and cloud design. With years of experience working with Ceph and as an active participant in the open-source community, Karan brings practical insights and in-depth technical knowledge to his writing. His clear and approachable style helps demystify complex concepts for readers.
Who is it for?
This book is ideal for storage engineers, cloud administrators, and technical architects seeking to understand and deploy software-defined storage solutions. Whether you have foundational knowledge of Linux and storage technologies or are new to Ceph, this book will guide you. Professionals aiming to enhance their cloud infrastructure will find actionable steps and strategies here.

IBM Wave for z/VM Installation, Implementation, and Exploitation

IBM® Wave for z/VM® (IBM Wave) is a virtualization management solution for IBM z/VM and Linux on z Systems™. This virtualization management software provides a simplified and cost-effective way for companies to harness the consolidation capabilities of the IBM z™ Systems platform and its ability to host the workloads of tens of thousands of commodity servers. IBM Wave is a complete management solution for z Systems based virtual server farms. This IBM Redbooks® publication provides a guide to understanding IBM Wave by providing information about the IBM Wave architecture and how it fits into the cloud. This publication also provides a planning and design guide that is based on common scenarios. This publication also provides installation and configuration task information and how to manage and operate the environment. The intended audience for this publication is IT Architects who are responsible for planning their IBM Wave environments and IT Specialists who are responsible for implementing them.

Getting Started with KVM for IBM z Systems

This IBM® Redbooks® publication gives a broad explanation of the kernel-based virtual machine (KVM) for IBM z™ Systems and how it uses the architecture of IBM z Systems™. It focuses on the planning and design of the environment and provides installation and configuration definitions that are necessary to build and manage KVM for IBM z Systems. It also helps you plan, install, and configure IBM Cloud Manager with OpenStack for use with KVM for IBM z Systems in a cloud environment. This book is useful to IT architects and system administrators who plan for and install KVM for IBM z Systems. The reader is expected to have a good understanding of IBM z Systems hardware, KVM, Linux on z Systems, and cloud concepts.

Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem

Get Started Fast with Apache Hadoop® 2, YARN, and Today’s Hadoop Ecosystem
With Hadoop 2.x and YARN, Hadoop moves beyond MapReduce to become practical for virtually any type of data processing. Hadoop 2.x and the Data Lake concept represent a radical shift away from conventional approaches to data usage and storage. Hadoop 2.x installations offer unmatched scalability and breakthrough extensibility that supports new and existing Big Data analytics processing methods and models. Hadoop® 2 Quick-Start Guide is the first easy, accessible guide to Apache Hadoop 2.x, YARN, and the modern Hadoop ecosystem. Building on his unsurpassed experience teaching Hadoop and Big Data, author Douglas Eadline covers all the basics you need to know to install and use Hadoop 2 on personal computers or servers, and to navigate the powerful technologies that complement it. Eadline concisely introduces and explains every key Hadoop 2 concept, tool, and service, illustrating each with a simple “beginning-to-end” example and identifying trustworthy, up-to-date resources for learning more. This guide is ideal if you want to learn about Hadoop 2 without getting mired in technical details. Douglas Eadline will bring you up to speed quickly, whether you’re a user, admin, devops specialist, programmer, architect, analyst, or data scientist.
Coverage Includes
• Understanding what Hadoop 2 and YARN do, and how they improve on Hadoop 1 with MapReduce
• Understanding Hadoop-based Data Lakes versus RDBMS Data Warehouses
• Installing Hadoop 2 and core services on Linux machines, virtualized sandboxes, or clusters
• Exploring the Hadoop Distributed File System (HDFS)
• Understanding the essentials of MapReduce and YARN application programming
• Simplifying programming and data movement with Apache Pig, Hive, Sqoop, Flume, Oozie, and HBase
• Observing application progress, controlling jobs, and managing workflows
• Managing Hadoop efficiently with Apache Ambari, including recipes for HDFS to NFSv3 gateway, HDFS snapshots, and YARN configuration
• Learning basic Hadoop 2 troubleshooting, and installing Apache Hue and Apache Spark

IBM PowerVC Version 1.2.3: Introduction and Configuration

IBM® Power Virtualization Center (PowerVC™) is an advanced enterprise virtualization management offering for IBM® Power Systems™, which is based on the OpenStack framework. This IBM Redbooks® publication introduces PowerVC and helps you understand its functions, planning, installation, and setup. Starting with PowerVC version 1.2.2, the Express Edition offering is no longer available and the Standard Edition is the only offering. PowerVC supports both large and small deployments, either by managing IBM PowerVM® that is controlled with the Hardware Management Console (HMC) or by managing PowerKVM directly. PowerVC can manage IBM AIX®, IBM i, and Linux workloads that run on POWER® hardware, including IBM PurePower systems. PowerVC editions include the following features and benefits:
• Virtual Image capture, deployment, and management
• Policy-based Virtual Machine (VM) placement to improve use
• Management of real-time optimization and VM resilience to increase productivity
• VM Mobility with placement policies to reduce the burden on IT staff in a simple-to-install and easy-to-use graphical user interface (GUI)
• An open and extensible PowerVM management system that you can adapt as you need and that runs in parallel with existing infrastructure, preserving your investment
• A management system for existing PowerVM deployments
You will also find all the details about how we set up the lab environment that is used in this book. This book is for experienced users of IBM PowerVM and other virtualization solutions who want to understand and implement the next generation of enterprise virtualization management for Power Systems. Unless stated otherwise, the content of this book refers to versions 1.2.2 and 1.2.3 of IBM PowerVC.

Managing the Data Lake

Organizations across many industries have recently created fast-growing repositories to deal with an influx of new data from many sources and often in multiple formats. To manage these data lakes, companies have begun to leave the familiar confines of relational databases and data warehouses for Hadoop and various big data solutions. But adopting new technology alone won’t solve the problem. Based on interviews with several experts in data management, author Andy Oram provides an in-depth look at common issues you’re likely to encounter as you consider how to manage business data. You’ll explore five key topic areas, including:
• Acquisition and ingestion: how to solve these problems with a degree of automation.
• Metadata: how to keep track of when data came in and how it was formatted, and how to make it available at later stages of processing.
• Data preparation and cleaning: what you need to know before you prepare and clean your data, and what needs to be cleaned up and how.
• Organizing workflows: what you should do to combine your tasks (ingestion, cataloging, and data preparation) into an end-to-end workflow.
• Access control: how to address security and access controls at all stages of data handling.
Andy Oram, an editor at O’Reilly Media since 1992, currently specializes in programming. His work for O’Reilly includes the first books on Linux ever published commercially in the United States.

Performance Optimization and Tuning Techniques for IBM Power Systems Processors Including IBM POWER8

This IBM® Redbooks® publication focuses on gathering the correct technical information, and laying out simple guidance for optimizing code performance on IBM POWER8® processor-based systems that run the IBM AIX®, IBM i, or Linux operating systems. There is straightforward performance optimization that can be performed with a minimum of effort and without extensive previous experience or in-depth knowledge. The POWER8 processor contains many new and important performance features, such as support for eight hardware threads in each core and support for transactional memory. The POWER8 processor is a strict superset of the IBM POWER7+™ processor, and so all of the performance features of the POWER7+ processor, such as multiple page sizes, also appear in the POWER8 processor. Much of the technical information and guidance for optimizing performance on POWER8 processors that is presented in this guide also applies to POWER7+ and earlier processors, except where the guide explicitly indicates that a feature is new in the POWER8 processor. This guide strives to focus on optimizations that tend to be positive across a broad set of IBM POWER® processor chips and systems. Specific guidance is given for the POWER8 processor; however, the general guidance is applicable to the IBM POWER7+, IBM POWER7®, IBM POWER6®, IBM POWER5, and even to earlier processors. This guide is directed at personnel who are responsible for performing migration and implementation activities on POWER8 processor-based systems. This includes system administrators, system architects, network administrators, information architects, and database administrators (DBAs).

PostgreSQL Replication, Second Edition

The second edition of 'PostgreSQL Replication' by Hans-Jürgen Schönig is a comprehensive guide that empowers PostgreSQL database professionals to establish robust replication solutions. Through detailed explanations and expert techniques, you will learn how to enhance the security, scalability, and reliability of your PostgreSQL databases using modern replication methods.
What this Book will help me do
• Master Point-in-Time Recovery to safeguard data and perform database recoveries effectively.
• Implement both synchronous and asynchronous streaming replication to suit different operational needs.
• Optimize database performance and scalability using tools like pgpool and PgBouncer.
• Ensure database high availability and data security through Linux High Availability configurations.
• Solve replication-related challenges by leveraging advanced knowledge of the PostgreSQL transaction log.
Author(s)
Hans-Jürgen Schönig, a seasoned PostgreSQL specialist, has years of experience architecting and optimizing PostgreSQL database systems for businesses of all sizes. With a strong focus on practical implementation and a passion for teaching, his writing bridges the gap between theoretical concepts and hands-on solutions, making PostgreSQL topics accessible and actionable.
Who is it for?
This book is tailored for PostgreSQL administrators and professionals seeking to implement robust database replication. Whether you're familiar with basic database administration or looking to deepen your expertise, this book provides valuable insights into replication strategies. It's ideal for those aiming to boost database performance and enhance operational reliability through advanced PostgreSQL features.

Virtualizing Hadoop: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture

Plan and Implement Hadoop Virtualization for Maximum Performance, Scalability, and Business Agility
Enterprises running Hadoop must absorb rapid changes in big data ecosystems, frameworks, products, and workloads. Virtualized approaches can offer important advantages in speed, flexibility, and elasticity. Now, a world-class team of enterprise virtualization and big data experts guides you through the choices, considerations, and tradeoffs surrounding Hadoop virtualization. The authors help you decide whether to virtualize Hadoop, deploy Hadoop in the cloud, or integrate conventional and virtualized approaches in a blended solution. First, Virtualizing Hadoop reviews big data and Hadoop from the standpoint of the virtualization specialist. The authors demystify MapReduce, YARN, and HDFS and guide you through each stage of Hadoop data management. Next, they turn the tables, introducing big data experts to modern virtualization concepts and best practices. Finally, they bring Hadoop and virtualization together, guiding you through the decisions you’ll face in planning, deploying, provisioning, and managing virtualized Hadoop. From security to multitenancy to day-to-day management, you’ll find reliable answers for choosing your best Hadoop strategy and executing it.
Coverage includes the following:
• Reviewing the frameworks, products, distributions, use cases, and roles associated with Hadoop
• Understanding YARN resource management, HDFS storage, and I/O
• Designing data ingestion, movement, and organization for modern enterprise data platforms
• Defining SQL engine strategies to meet strict SLAs
• Considering security, data isolation, and scheduling for multitenant environments
• Deploying Hadoop as a service in the cloud
• Reviewing the essential concepts, capabilities, and terminology of virtualization
• Applying current best practices, guidelines, and key metrics for Hadoop virtualization
• Managing multiple Hadoop frameworks and products as one unified system
• Virtualizing master and worker nodes to maximize availability and performance
• Installing and configuring Linux for a Hadoop environment

Beginning Oracle Database 12c Administration: From Novice to Professional, Second Edition

Beginning Oracle Database 12c Administration is your entry point into a successful and satisfying career as an Oracle Database Administrator. The chapters of this book are logically organized into four parts closely tracking the way your database administration career will naturally evolve. Part 1 "Database Concepts" gives necessary background in relational database theory and Oracle Database concepts, Part 2 "Database Implementation" teaches how to implement an Oracle database correctly, Part 3 "Database Support" exposes you to the daily routine of a database administrator, and Part 4 "Database Tuning" introduces the fine art of performance tuning. Beginning Oracle Database 12c Administration provides information that you won't find in other books on Oracle Database. You'll discover not only technical information, but also guidance on work practices that are as vital to your success as are your technical skills. The author's favorite chapter is "The Big Picture and the Ten Deliverables." (It is the editor’s favorite chapter too!) If you take the lessons in that chapter to heart, you can quickly become a much better Oracle database administrator than you ever thought possible. You will grasp the key aspects of theory behind relational database management systems and learn how to: Install and configure an Oracle database, and ensure that it’s properly licensed; Execute common management tasks in a Linux environment; Defend against data loss by implementing sound backup and recovery practices; and Improve database and query performance.

Getting Started with MariaDB

Dive into the world of MariaDB with this comprehensive beginner's guide. From installation and configuration to advanced data handling, this book provides hands-on instructions on using MariaDB effectively. Tailored for newcomers, it ensures you can learn and apply database management in a practical way.
What this Book will help me do
• Install MariaDB on various platforms like Windows, Mac OS X, and Linux to start working with databases.
• Optimize MariaDB for better performance by utilizing the advanced features available in version 10.
• Secure your databases effectively, ensuring sensitive data is protected from unauthorized access.
• Learn techniques to analyze and retrieve data efficiently using operators and sorting mechanisms.
• Perform database maintenance to ensure MariaDB functions optimally in the long run.
Author(s)
Daniel Bartholomew has extensive experience with open-source databases and has been a key advocate for MariaDB. With years of hands-on practice, Daniel helps simplify complex topics, making learning straightforward for beginners while ensuring robust coverage of advanced capabilities.
Who is it for?
This book is an excellent choice for those new to databases and wishing to start with MariaDB. Whether you're aiming to learn database basics or looking to expand your technical skillset, this guide provides the foundational knowledge you need. For IT learners or aspiring database managers, it's a perfect first step into database systems. Previous database experience is not necessary.

Implementing an IBM InfoSphere BigInsights Cluster using Linux on Power

This IBM® Redbooks® publication demonstrates and documents how to implement and manage an IBM PowerLinux™ cluster for big data focusing on hardware management, operating systems provisioning, application provisioning, cluster readiness check, hardware, operating system, IBM InfoSphere® BigInsights™, IBM Platform Symphony®, IBM Spectrum™ Scale (formerly IBM GPFS™), applications monitoring, and performance tuning. This publication shows that IBM PowerLinux clustering solutions (hardware and software) deliver significant value to clients that need cost-effective, highly scalable, and robust solutions for big data and analytics workloads. This book documents and addresses topics on how to use IBM Platform Cluster Manager to manage PowerLinux big data clusters through IBM InfoSphere BigInsights, Spectrum Scale, and Platform Symphony. This book documents how to set up and manage a big data cluster on PowerLinux servers to customize application and programming solutions, and to tune applications to use IBM hardware architectures. This document uses the architectural technologies and the software solutions that are available from IBM to help solve challenging technical and business problems. This book is targeted at technical professionals (consultants, technical support staff, IT Architects, and IT Specialists) who are responsible for delivering cost-effective Linux on IBM Power Systems™ solutions that help uncover insights from clients' data so they can act to optimize business results, product development, and scientific discoveries.

IBM zPDT Guide and Reference: System z Personal Development Tool

This IBM® Redpaper Redbooks® publication provides both introductory information and technical details for the IBM System z® Personal Development Tool (IBM zPDT®), which produces a small System z environment suitable for application development. zPDT is a PC Linux application. When zPDT is installed (on Linux), normal System z operating systems (such as IBM z/OS®) may be run on it. zPDT provides the basic System z architecture and provides emulated IBM 3390 disk drives, 3270 interfaces, OSA interfaces, and so forth. This current document merges four separate previous Redbooks publications into this single book. The primary reason for this merger is to provide simpler zPDT documentation usage when viewing or searching the documentation onscreen. The systems that are discussed in this document are complex, with elements of Linux (for the underlying PC machine), IBM z/Architecture® (for the core zPDT elements), System z I/O functions (for emulated I/O devices), z/OS (the most common System z operating system), and various applications and subsystems under z/OS. We assume that the reader is familiar with general concepts and terminology of System z hardware and software elements, and with basic PC Linux characteristics. This book provides the primary documentation for zPDT and includes a basic system overview, installation, operation, z/OS distribution, and FAQs.

Oracle RMAN Database Duplication

RMAN is Oracle’s flagship backup and recovery tool, but did you know it’s also an effective database duplication tool? Oracle RMAN Database Duplication is a deep dive into RMAN’s duplication feature set, showing how RMAN can make it so much easier for you as a database administrator to satisfy the many requests from developers and testers for database copies and refreshes for use in their work. You’ll learn to make and refresh duplicate databases with a single command, and of course you can automate and schedule that command so that developers and testers are supplied with regular, known good databases without any manual intervention on your part. Fast and easy provisioning of databases for developers and testers is a driving force in the move to cloud computing and virtualization. RMAN’s robust database duplication feature set plays right into this growing need for ease of provisioning, enabling easy duplication of known-good databases on demand, across operating systems such as between Linux and Solaris, and even across storage environments such as when duplicating from a RAC/ASM environment to a single-node instance using regular file system storage. Oracle RMAN Database Duplication is your thorough guide to providing amazing business value to your organization by way of fast and easy provisioning of database duplicates in service of development and testing projects.

Learning Hadoop 2

Delve into the world of big data with 'Learning Hadoop 2', a comprehensive guide to leveraging the capabilities of Hadoop 2 for data processing and analysis. In this book, you will explore the tools and frameworks that integrate with Hadoop, discovering the best ways to design and deploy effective workflows for managing and analyzing large datasets.
What this Book will help me do
• Understand the fundamentals of the MapReduce framework and its applications.
• Utilize advanced tools such as Samza and Spark for real-time and iterative data processing.
• Manage large datasets with data mining techniques tailored for Hadoop environments.
• Deploy Hadoop applications across various infrastructures, including local clusters and cloud services.
• Create and orchestrate sophisticated data workflows and pipelines with Apache Pig and Oozie.
Author(s)
Gabriele Modena is an experienced developer and trained data specialist with a keen focus on distributed data processing frameworks. Having worked extensively with big data platforms, Gabriele brings practical insights and a hands-on perspective to technical subjects. His writing is concise and engaging, aiming to render complex concepts accessible.
Who is it for?
This book is ideal for system and application developers eager to learn practical implementations of the Hadoop framework. Readers should be familiar with the Unix/Linux command-line interface and Java programming. Prior experience with Hadoop will be advantageous, but not necessary.

Getting Started with IBM InfoSphere Optim Workload Replay for DB2

This IBM® Redbooks® publication will help you install, configure, and use IBM InfoSphere® Optim™ Workload Replay (InfoSphere Workload Replay), a web-based tool that lets you capture real production SQL workload data and then replay the workload data in a pre-production environment. With InfoSphere Workload Replay, you can set up and run realistic tests for enterprise database changes without the need to create a complex client and application infrastructure to mimic your production environment. The publication goes through the steps to install and configure the InfoSphere Workload Replay appliance and related database components for IBM DB2® for Linux, UNIX, and Windows and for DB2 for IBM z/OS®. The capture, replay, and reporting process, including user ID and roles management, is described in detail to quickly get you up and running. Ongoing operations, such as appliance health monitoring, starting and stopping the product, and backup and restore in your day-to-day management of the product, extensive troubleshooting information, and information about how to integrate InfoSphere Workload Replay with other InfoSphere products are covered in separate chapters.