talk-data.com

Topic: data-engineering (3377 tagged)

Activities

Filtering by: O'Reilly Data Engineering Books

IBM Tape Library Guide for Open Systems

Abstract This IBM® Redbooks® publication presents a general introduction to the latest IBM tape and tape library technologies. Featured tape technologies include the IBM LTO Ultrium and Enterprise 3592 tape drives, and their implementation in IBM tape libraries. This 15th edition includes information about the latest TS4300 Ultrium tape library, TS1155 Enterprise tape drive, and the IBM Linear Tape-Open (LTO) Ultrium 8 tape drive, along with technical information about each IBM tape product for open systems. It includes generalized sections about Small Computer System Interface (SCSI) and Fibre Channel connections, and multipath architecture configurations. This book also covers tools and techniques for library management. It is intended for anyone who wants to understand more about IBM tape products and their implementation. It is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists. If you do not have a background in computer tape storage products, you might need to read other sources of information. In the interest of being concise, topics that are generally understood are not covered in detail.

IBM Open Platform for DBaaS on IBM Power Systems

Abstract This IBM Redbooks publication describes how to implement an Open Platform for Database as a Service (DBaaS) in an IBM Power Systems environment for Linux, and demonstrates the open source tools, optimizations, and best-practice guidelines for it. Open Platform for DBaaS on Power Systems is an on-demand, secure, and scalable self-service database platform that automates provisioning and administration of databases to support new business applications and information insights. This publication addresses topics to help sellers, architects, brand specialists, distributors, resellers, and anyone offering a secure and scalable Open Platform for DBaaS on Power Systems solution with APIs that are consistent across heterogeneous open database types. An Open Platform for DBaaS on Power Systems solution can accelerate business success by providing an infrastructure and tools that use open source and OpenStack software, engineered to optimize hardware and software between workloads and resources so that you have a responsive and adaptive environment. Moreover, this publication provides documentation to transfer the how-to skills for cloud-oriented operational management of the Open Platform for DBaaS on Power Systems service and its underlying infrastructure to the technical teams. The mission of Open Platform for DBaaS on Power Systems is to provide scalable and reliable cloud database as a service provisioning functionality for both relational and non-relational database engines, and to continue to improve its fully featured and extensible open source framework. For example, Trove is a database as a service for OpenStack. It is designed to run entirely on OpenStack, with the goal of allowing users to quickly and easily use the features of a relational or non-relational database without the burden of handling complex administrative tasks. Cloud users and database administrators can provision and manage multiple database instances as needed. Initially, the service focuses on providing resource isolation at high performance while automating complex administrative tasks including deployment, configuration, patching, backups, restores, and monitoring. In the context of this publication, the monitoring tool implemented is Nagios Core, an open source monitoring tool. Hence, any reference to Nagios in this book means Nagios Core. Also note that the implementation of Open Platform for DBaaS on IBM Power Systems is based on open source solutions. This book is targeted toward sellers, architects, brand specialists, distributors, resellers, and anyone developing and implementing Open Platform for DBaaS on Power Systems solutions.

IBM Power System AC922 Introduction and Technical Overview

This IBM® Redpaper™ publication is a comprehensive guide that covers the IBM Power System AC922 server (8335-GTG and 8335-GTW models). The Power AC922 server is the next generation of the IBM Power processor-based systems, which are designed for deep learning and artificial intelligence (AI), high-performance analytics, and high-performance computing (HPC). This paper introduces the major innovative Power AC922 server features and their relevant functions: powerful IBM POWER9™ processors that offer 16 cores at 2.6 GHz with 3.09 GHz turbo performance or 20 cores at 2.0 GHz with 2.87 GHz turbo for the 8335-GTG; 18 cores at 2.98 GHz with 3.26 GHz turbo performance or 22 cores at 2.78 GHz with 3.07 GHz turbo for the 8335-GTW; IBM Coherent Accelerator Processor Interface (CAPI) 2.0, IBM OpenCAPI™, and second-generation NVIDIA NVLink technology for exceptional processor-to-accelerator intercommunication; and up to six dedicated NVIDIA Tesla V100 GPUs. This publication is for professionals who want to acquire a better understanding of IBM Power Systems™ products and is intended for the following audiences: clients, sales and marketing professionals, technical support professionals, IBM Business Partners, and independent software vendors (ISVs). This paper expands the set of IBM Power Systems documentation by providing a desktop reference that offers a detailed technical description of the Power AC922 server. This paper does not replace the current marketing materials and configuration tools. It is intended as an extra source of information that, together with existing sources, can be used to enhance your knowledge of IBM server solutions.

IBM Spectrum Archive Single Drive Edition and Library Edition: Installation and Configuration Guide

Abstract The IBM® Linear Tape File System™ (LTFS) is the first file system that works along with Linear Tape-Open (LTO) tape technology to set a new standard for ease of use and portability for open systems tape storage. In 2011, LTFS won an Engineering Emmy Award for Innovation from the Academy of Television Arts & Sciences. This IBM Redbooks® publication helps you install, tailor, and configure the IBM Spectrum™ Archive Single Drive Edition (SDE) and the IBM Spectrum Archive™ Library Edition (LE) products. LTFS is a file system that was originally implemented on dual-partition linear tape (IBM LTO Ultrium 5 tape drives (LTO-5) and IBM TS1140 tape drives). Now IBM Spectrum Archive SDE and LE support IBM LTO Ultrium 8, 7, 6, or 5 tape drives, and IBM TS1155, IBM TS1150, and IBM TS1140 tape drives. IBM Spectrum Archive LE supports the IBM TS4500 tape library, IBM TS3500 tape library, IBM TS3310 tape library, IBM TS3200 tape library express, IBM TS3100 tape library express, and IBM TS2900 tape autoloader express. IBM Spectrum Archive makes tape look and work like any removable media, such as a USB drive. Files and directories appear on the desktop as a directory listing. It is now simple to drag files to and from tape. Any application that is written to use disk files works with the same files on tape. IBM Spectrum Archive SDE supports stand-alone drives only. IBM Spectrum Archive LE supports tape libraries. IBM Spectrum Archive LE presents each cartridge in the library as a subdirectory in the LTFS file system. With IBM Spectrum Archive LE, you can list the contents and search all of the volumes in the library without mounting the volumes by using an in-memory index. This publication is intended for anyone who wants to understand more about IBM Linear Tape System products and their implementation. This book is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

IBM TS4500 R4 Tape Library Guide

Abstract The IBM® TS4500 (TS4500) tape library is a next-generation tape solution that offers higher storage density and more integrated management than previous solutions. This IBM Redbooks® publication gives you a close-up view of the new IBM TS4500 tape library. In the TS4500, IBM delivers the density that today's and tomorrow's data growth requires. It has the cost-effectiveness and the manageability to grow with business data needs, while you preserve existing investments in IBM tape library products. Now, you can achieve both a low cost per terabyte (TB) and a high TB density per square foot, because the TS4500 can store up to 8.25 petabytes (PB) of uncompressed data in a single frame library or scale up at 1.5 PB per square foot to over 263 PB, which is more than 4 times the capacity of the IBM TS3500 tape library. The TS4500 offers these benefits: High availability: Dual active accessors with integrated service bays reduce inactive service space by 40%. The Elastic Capacity option can be used to completely eliminate inactive service space. Flexibility to grow: The TS4500 library can grow from both the right side and the left side of the first L frame because models can be placed in any active position. Increased capacity: The TS4500 can grow from a single L frame up to an additional 17 expansion frames with a capacity of over 23,000 cartridges. High-density (HD) generation 1 frames from the existing TS3500 library can be redeployed in a TS4500. Capacity on demand (CoD): CoD is supported through entry-level, intermediate, and base-capacity configurations. Advanced Library Management System (ALMS): ALMS supports dynamic storage management, which enables users to create and change logical libraries and configure any drive for any logical library.
Support for the IBM TS1155, as well as the TS1150 and TS1140 tape drives: The TS1155 gives organizations an easy way to deliver fast access to data, improve security, and provide long-term retention, all at a lower cost than disk solutions. The TS1155 offers high-performance, flexible data storage with support for data encryption. Also, this enhanced fifth-generation drive can help protect investments in tape automation by offering compatibility with existing automation. The new TS1155 Tape Drive Model 55E delivers a 10 Gb Ethernet host attachment interface optimized for cloud-based and hyperscale environments. The TS1155 Tape Drive Model 55F delivers a native data rate of 360 MBps, the same load/ready, locate speeds, and access times as the TS1150, and includes dual-port 8 Gb Fibre Channel support. Support of the IBM Linear Tape-Open (LTO) Ultrium 8 tape drive: The LTO Ultrium 8 offering represents significant improvements in capacity, performance, and reliability over the previous generation, LTO Ultrium 7, while still protecting your investment in the previous technology. Support of LTO 8 Type M cartridge (M8): The LTO Program is introducing a new capability with LTO-8 drives: the ability to write 9 TB on a brand new LTO-7 cartridge instead of the 6 TB specified by the LTO-7 format. Such a cartridge is called an LTO-7 initialized LTO-8 Type M cartridge. Integrated TS7700 back-end Fibre Channel (FC) switches are available. Up to four library-managed encryption (LME) key paths per logical library are available. This book describes the TS4500 components, feature codes, specifications, supported tape drives, encryption, new integrated management console (IMC), and command-line interface (CLI). You learn how to accomplish several specific tasks: Improve storage density with increased expansion frame capacity up to 2.4 times and support 33% more tape drives per frame. Manage storage by using the ALMS feature.
Improve business continuity and disaster recovery with dual active accessor, automatic control path failover, and data path failover. Help ensure security and regulatory compliance with tape-drive encryption and Write Once Read Many (WORM) media. Support IBM LTO Ultrium 8, 7, 6, and 5, IBM TS1155, TS1150, and TS1140 tape drives. Provide a flexible upgrade path for users who want to expand their tape storage as their needs grow. Reduce the storage footprint and simplify cabling with 10 U of rack space on top of the library. This guide is for anyone who wants to understand more about the IBM TS4500 tape library. It is particularly suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

Implementing the IBM Storwize V5000 Gen2 (including the Storwize V5010, V5020, and V5030) with IBM Spectrum Virtualize V8.1

Abstract Organizations of all sizes face the challenge of managing massive volumes of increasingly valuable data. But storing this data can be costly, and extracting value from the data is becoming more difficult. IT organizations have limited resources but must stay responsive to dynamic environments and act quickly to consolidate, simplify, and optimize their IT infrastructures. The IBM® Storwize® V5000 Gen2 system provides a smarter solution that is affordable, easy to use, and self-optimizing, which enables organizations to overcome these storage challenges. The Storwize V5000 Gen2 delivers efficient, entry-level configurations that are designed to meet the needs of small and midsize businesses. Designed to provide organizations with the ability to consolidate and share data at an affordable price, the Storwize V5000 Gen2 offers advanced software capabilities that are found in more expensive systems. This IBM Redbooks® publication is intended for pre-sales and post-sales technical support professionals and storage administrators. It applies to the Storwize V5030, V5020, and V5010, and to IBM Spectrum Virtualize™ V8.1.

Agent-based Modeling of Tax Evasion

The only single-source guide to understanding, using, adapting, and designing state-of-the-art agent-based modelling of tax evasion. A computational method for simulating the behavior of individuals or groups and their effects on an entire system, agent-based modeling has proven itself to be a powerful new tool for detecting tax fraud. While interdisciplinary groups and individuals working in the tax domain have published numerous articles in diverse peer-reviewed journals and have presented their findings at international conferences, until Agent-based Modelling of Tax Evasion there was no authoritative, single-source guide to state-of-the-art agent-based tax evasion modeling techniques and technologies. Featuring contributions from distinguished experts in the field from around the globe, Agent-Based Modelling of Tax Evasion provides in-depth coverage of an array of field-tested agent-based tax evasion models. Models are presented in a unified format so as to enable readers to systematically work their way through the various modeling alternatives available to them. Three main components of each agent-based model are explored in accordance with the Overview, Design Concepts, and Details (ODD) protocol, each section of which contains several sub-elements that help to illustrate the model clearly and that assist readers in replicating the modeling results described.
Presents models in a unified and structured manner to provide a point of reference for readers interested in agent-based modelling of tax evasion. Explores the theoretical aspects and diversity of agent-based modeling through the example of tax evasion. Provides an overview of the characteristics of more than thirty agent-based tax evasion frameworks. Functions as a solid foundation for lectures and seminars on agent-based modelling of tax evasion. The only comprehensive treatment of agent-based tax evasion models and their applications, this book is an indispensable working resource for practitioners and tax evasion modelers both in the agent-based computational domain and using other methodologies. It is also an excellent pedagogical resource for teaching tax evasion modeling and/or agent-based modeling generally.

Beginning PostgreSQL on the Cloud: Simplifying Database as a Service on Cloud Platforms

Get started with PostgreSQL on the cloud and discover the advantages, disadvantages, and limitations of the cloud services from Amazon, Rackspace, Google, and Azure. Once you have chosen your cloud service, you will focus on securing it and developing a back-up strategy for your PostgreSQL instance as part of your long-term plan. Beginning PostgreSQL on the Cloud covers other essential topics such as setting up replication and high availability; encrypting your saved cloud data; creating a connection pooler for your database; and monitoring PostgreSQL on the cloud. The book concludes by showing you how to install and configure some of the tools that will help you get started with PostgreSQL on the cloud. This book shows you how database as a service enables you to spread your data across multiple data centers, ensuring that it is always accessible. You'll discover that this model does not expect you to install and maintain databases yourself because the database service provider does it for you. You no longer have to worry about the scalability and high availability of your database. What You Will Learn: Migrate PostgreSQL to the cloud. Choose the best configuration and specifications of cloud instances. Set up a backup strategy that enables point-in-time recovery. Use connection pooling and load balancing on cloud environments. Monitor database environments on the cloud. Who This Book Is For: Those who are looking to migrate to PostgreSQL on the cloud. It will also help database administrators set up a cloud environment in an optimized way and help them with their day-to-day tasks.

SQL Server 2017 Developer's Guide

"SQL Server 2017 Developer's Guide" provides a comprehensive approach to learning and utilizing the new features introduced in SQL Server 2017. From advanced Transact-SQL to integrating R and Python into your database projects, this book equips you with the knowledge to design and develop efficient database applications tailored to modern requirements. What this Book will help me do: Master new features in SQL Server 2017 to enhance database application development. Implement In-Memory OLTP and columnstore indexes for optimal performance. Utilize JSON support in SQL Server to integrate modern data formats. Leverage R and Python integration to apply advanced data analytics and machine learning. Learn Linux and container deployment options to expand SQL Server usage scenarios. Author(s): The authors of "SQL Server 2017 Developer's Guide" are industry veterans with extensive experience in database design, business intelligence, and advanced analytics. They bring a practical, hands-on writing style that helps developers apply theoretical concepts effectively. Their commitment to teaching is evident in the clear and detailed guidance provided throughout the book. Who is it for? This book is ideal for database developers and solution architects aiming to build robust database applications with SQL Server 2017. It's a valuable resource for business intelligence developers or analysts seeking to harness SQL Server 2017's advanced features. Some familiarity with SQL Server and T-SQL is recommended to fully leverage the insights provided by this book.

Cleaning Up the Data Lake with an Operational Data Hub

The data lake was once heralded as the answer to the flood of big data that arrived in a variety of structured and unstructured formats. But, due to the ease of integration and the lack of governance, data lakes in many companies have devolved into unusable data swamps. This short ebook shows you how to solve this problem using an Operational Data Hub (ODH) to collect, store, index, cleanse, harmonize, and master data of all shapes and formats. Gerhard Ungerer, CTO and co-founder of Random Bit LLC, explains how the ODH supports transactional integrity so that the hub can serve as an integration point for enterprise applications. You'll also learn how the ODH helps you leverage the investment in your data lake (or swamp), so that the data trapped there can finally be ingested, processed, and provisioned. With this ebook, you'll learn how an ODH: allows you to focus on categorizing data for easy and fast retrieval; provides flexible storage models, indexing support, query capabilities, security, and a governance framework; delivers flexible storage models, support for indexing, scripting, and automation, query capabilities, transactional integrity, and security; and includes a governance model to help you access, ingest, harmonize, materialize, provision, and consume data.

Gaining Data Agility with Multi-Model Databases

Most organizations realize that their future depends on the ability to quickly adapt to constant changes brought on by variable and complex environments. It's become increasingly clear that the core source behind these innovative solutions is data. Polyglot persistence refers to systems that provide many different types of data storage technologies to deal with this vast variability of data. Applications that need to access data from more than one store have to navigate an array of databases in a complex (and ultimately unsustainable) maze. One solution to this problem is readily available. In this ebook, consultant Joel Ruisi explains how a multi-model database enables you to take advantage of many different types of data models (and multiple schemas) in a single backend. With a multi-model database, companies can easily centralize, manage, and search all the data the IT system collects. The result is data agility: the ability to adapt to changing environments and serve users what they need when they need it. Through several detailed use cases, this ebook explains how multi-model databases enable you to: store and manage multiple heterogeneous data sources; consolidate your data by bringing everything in "as is"; invisibly extend model features from one model to another; take a hybrid approach to analytical and operational data; enhance user search experience, including big data search; conduct queries across data models; and offer SQL without relational constraints.

MarkLogic Cookbook

Learn how to get the most out of MarkLogic with recipes from people who understand this powerful multi-model database platform from the inside out. MarkLogic comes with a broad set of capabilities to help you quickly integrate data from silos, but it takes time to learn how to harness that power. In this three-part series, key members of the MarkLogic team, including engineers who built the database, provide targeted recipes to get you up to speed. In Part 1, you'll learn how to solve real-world problems with XQuery, the functional language for working with hierarchical data structures such as XML. Part 2 helps you solve common search-related problems with recipes that work with MarkLogic 9 as well as with older versions. With recipes in Part 3, you'll explore the multiple ways MarkLogic represents data. XQuery: Gain XQuery peak performance, and explore its use in maps, documents, document security, the task server, and administration. Search-related problems: Conduct document searches, score search results, understand how data is used, and search with the Optic API. MarkLogic and data: Work with input transformations, tokenization, template-driven extraction, and redaction.

IBM Storage Networking SAN512B-6 and SAN256B-6 Directors

This IBM® Redbooks® product guide describes the IBM Storage Networking SAN512B-6 (8961-F08) and SAN256B-6 (8961-F04) directors and the IBM b-type Gen 6 Extension Blade (FC 3892, 3893). Digital transformation is pushing mission-critical storage environments to the limit, with users expecting data to be accessible from anywhere, at any time, on any device. Faced with exponential data growth, the network must evolve to enable businesses to thrive in this new era. A new approach to storage networking is needed to enable databases, virtual servers, desktops, and critical applications, and to unlock the full capabilities of flash. By treating the network as a strategic part of a storage environment, organizations can maximize their productivity and efficiency even as they rapidly scale their environments. IBM Storage Networking SAN512B-6 and SAN256B-6 directors with Fabric Vision technology are modular building blocks that combine innovative hardware, software, and built-in instrumentation to ensure high levels of operational stability and redefine application performance. Fabric Vision technology enhances visibility into the health of storage environments, delivering greater control and insight to quickly identify problems and achieve critical service level agreements (SLAs). Breakthrough 32 Gbps performance shatters application performance barriers and provides support for more than 1 billion input/output operations per second (IOPS) for flash-based storage workloads while 128 Gbps UltraScale inter-chassis links enable simplified, high-bandwidth scalability between directors.

IBM PowerAI: Deep Learning Unleashed on IBM Power Systems Servers

Abstract This IBM® Redbooks® publication is a guide to the IBM PowerAI Deep Learning solution. This book provides an introduction to artificial intelligence (AI) and deep learning (DL), and covers IBM PowerAI and its components, deploying IBM PowerAI, guidelines for working with data and creating models, an introduction to IBM Spectrum™ Conductor Deep Learning Impact (DLI), and case scenarios. IBM PowerAI started as a package of software distributions of many of the major DL software frameworks for model training, such as TensorFlow, Caffe, Torch, Theano, and the associated libraries, such as CUDA Deep Neural Network (cuDNN). The IBM PowerAI software is optimized for performance by using the IBM Power Systems™ servers that are integrated with NVLink. The AI stack foundation starts with servers with accelerators. Graphical processing unit (GPU) accelerators are well-suited for the compute-intensive nature of DL training, and servers with the highest CPU to GPU bandwidth, such as IBM Power Systems servers, enable the high-performance data transfer that is required for larger and more complex DL models. This publication targets technical readers, including developers, IT specialists, systems architects, brand specialists, sales teams, and anyone looking for a guide to the IBM PowerAI Deep Learning architecture, framework configuration, application and workload configuration, and user infrastructure.

Implementing the IBM Storwize V7000 with IBM Spectrum Virtualize V8.1

Abstract Continuing its commitment to developing and delivering industry-leading storage technologies, IBM® introduces the IBM Storwize® V7000 solution powered by IBM Spectrum™ Virtualize. This innovative storage offering delivers essential storage efficiency technologies and exceptional ease of use and performance, all integrated into a compact, modular design that is offered at a competitive, midrange price. The IBM Storwize V7000 solution incorporates some of the top IBM technologies that are typically found only in enterprise-class storage systems, raising the standard for storage efficiency in midrange disk systems. This cutting-edge storage system extends the comprehensive storage portfolio from IBM and can help change the way organizations address the ongoing information explosion. This IBM Redbooks® publication introduces the features and functions of the IBM Storwize V7000 and IBM Spectrum Virtualize™ V8.1 system through several examples. This book is aimed at pre-sales and post-sales technical support and marketing and storage administrators. It helps you understand the architecture of the Storwize V7000, how to implement it, and how to take advantage of its industry-leading functions and features.

Redis 4.x Cookbook

Redis 4.x Cookbook offers practical solutions for developers and administrators to master Redis, a popular key-value database. This book contains over 80 step-by-step recipes covering topics like installation, replication, high availability, and troubleshooting, making it an indispensable resource for enhancing your Redis expertise. What this Book will help me do: Master the installation and configuration of a Redis instance for optimal setups. Learn how to use Redis data types effectively in various application scenarios. Implement replication and high availability to ensure reliability and scale. Gain skills to troubleshoot, benchmark, and fine-tune Redis deployments. Extend Redis functionalities with modules for custom needs. Author(s): The authors of Redis 4.x Cookbook are seasoned database administrators and developers with extensive expertise in Redis and distributed systems. Their practical experience shapes this book, offering proven insights and techniques. They are adept at conveying technical concepts in an engaging and clear manner. Who is it for? This book is ideal for developers, database administrators, and architects familiar with basic Redis concepts who want a comprehensive guide to address advanced Redis tasks. Readers seeking to implement, optimize, and troubleshoot Redis in production environments will find this resource invaluable.

Implementing the IBM System Storage SAN Volume Controller with IBM Spectrum Virtualize V8.1

Abstract This IBM® Redbooks publication is a detailed technical guide to the IBM System Storage® SAN Volume Controller, which is powered by IBM Spectrum™ Virtualize V8.1. IBM SAN Volume Controller is a virtualization appliance solution that maps virtualized volumes that are visible to hosts and applications to physical volumes on storage devices. Each server within the storage area network (SAN) has its own set of virtual storage addresses that are mapped to physical addresses. If the physical addresses change, the server continues running by using the same virtual addresses that it had before. Therefore, volumes or storage can be added or moved while the server is still running. The IBM virtualization technology improves the management of information at the "block" level in a network, which enables applications and servers to share storage devices on a network.

Spark: The Definitive Guide

Learn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You'll explore the basic operations and common functions of Spark's structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark's scalable machine-learning library. Get a gentle overview of big data and Spark. Learn about DataFrames, SQL, and Datasets (Spark's core APIs) through worked examples. Dive into Spark's low-level APIs, RDDs, and execution of SQL and DataFrames. Understand how Spark runs on a cluster. Debug, monitor, and tune Spark clusters and applications. Learn the power of Structured Streaming, Spark's stream-processing engine. Learn how you can apply MLlib to a variety of problems, including classification or recommendation.

Camel in Action, Second Edition

Camel in Action, Second Edition is the most complete Camel book on the market. Written by core developers of Camel and the authors of the highly acclaimed first edition, this book distills their experience and practical insights so that you can tackle integration tasks like a pro. About the Technology: Apache Camel is a Java framework that implements enterprise integration patterns (EIPs) and comes with over 200 adapters to third-party systems. A concise DSL lets you build integration logic into your app with just a few lines of Java or XML. By using Camel, you benefit from the testing and experience of a large and vibrant open source community. About the Book: Camel in Action, Second Edition is the definitive guide to the Camel framework. It starts with core concepts like sending, receiving, routing, and transforming data. It then goes in depth on many topics such as how to develop, debug, test, deal with errors, secure, scale, cluster, deploy, and monitor your Camel applications. The book also discusses how to run Camel with microservices, reactive systems, containers, and in the cloud. What's Inside: Coverage of all relevant EIPs. Camel microservices with Spring Boot. Camel on Docker and Kubernetes. Error handling, testing, security, clustering, monitoring, and deployment. Hundreds of examples in Java and XML. About the Reader: Readers should be familiar with Java. This book is accessible to beginners and invaluable to experts. About the Authors: Claus Ibsen is a senior principal engineer working for Red Hat specializing in cloud and integration. He has worked on Apache Camel for the last nine years where he heads the project. Claus lives in Denmark. Jonathan Anstey is an engineering manager at Red Hat and a core Camel contributor. He lives in Newfoundland, Canada. Quotes: "I highly recommend this book to anyone with even a passing interest in Apache Camel. Do take Camel for a ride...and don't get the hump!" (From the Foreword by James Strachan, Creator of Apache Camel) "Claus and Jon are great writers, relying on figures and diagrams where needed and presenting lots of code snippets and worked examples." (From the Foreword by Dr. Mark Little, Technical Director of JBoss) "The second edition of this all-time classic is an indispensable companion for your Apache Camel rides." (Gregor Zurowski, Apache Camel Committer) "The absolute best way to learn and use Camel - top to bottom, front to back, and all the way through. Camel is a fantastic tool - every Java coder should have a copy of this book." (Rick Wagner, Red Hat) "An excellent book and the definite reference for experienced engineers." (Yan Guo, EventBrite)

Mastering Apache Solr 7.x

"Mastering Apache Solr 7.x" is your practical guide to building, advancing, and optimizing enterprise search solutions using Solr 7. With this book, you will harness the robust features of Solr, implement efficient search capabilities, and tackle complex business intelligence problems to achieve unparalleled search performance. What this Book will help me do: Develop and implement efficient schemas using the Solr Schema API. Optimize enterprise search performance with advanced querying and scoring techniques. Implement fault-tolerant and distributed search systems using SolrCloud. Leverage Apache Tika for seamless data indexing and content extraction. Utilize programming languages like JavaScript, Python, and Ruby to integrate with Solr. Author(s): With years of experience in search technologies and deep expertise in Apache Solr, authors Nair, Mehta, and Dharmesh Vasoya bring together a wealth of knowledge in this book. Their collaborative insights equip readers to master advanced Solr features, sharing practical examples and real-world applications with a passion for clarity and efficiency. Who is it for? This book is ideal for software developers, data engineers, and database architects who aim to design and implement effective enterprise search systems. It is tailored for readers with prior experience in Apache Solr or Java programming, focusing on those eager to enhance their search solution expertise.