O'Reilly Data Engineering Books

Mastering DynamoDB

2014-08-25 O'Reilly Amazon

book

Tanmay Deshpande

data data-engineering nosql-databases DynamoDB AWS Cloud Computing

"Mastering DynamoDB" will guide you through the advanced usage and operational subtleties of DynamoDB, Amazon's fully managed NoSQL database solution. By mastering the topics in this book, readers will unlock the tools to efficiently build scalable and high-performance applications using DynamoDB, leveraging its unique architecture and features. What this Book will help me do Gain a deep understanding of DynamoDB's data model and how it supports scalable performance. Master advanced DynamoDB architecture concepts for reliability and data handling. Learn to integrate DynamoDB securely with other AWS services to build a comprehensive ecosystem. Use tools and libraries to automate processes like autoscaling, testing, and data backups for DynamoDB. Develop mobile and web applications using DynamoDB as a backend, ensuring high availability and speedy operations. Author(s) Tanmay Deshpande, the author of "Mastering DynamoDB", is an experienced cloud computing and database professional. With a strong background in AWS technologies and particular expertise in DynamoDB, Deshpande brings hands-on knowledge to the forefront. The author is dedicated to making complex concepts accessible and practical for learners, aiding in their professional growth. Who is it for? This book is ideal for developers and IT professionals who want to deepen their expertise in cloud-based NoSQL databases. Readers should ideally have intermediate experience with programming, AWS services, and an interest in enhancing their skills around scalable database systems. Those seeking practical insights for advanced database integration and application development will benefit the most. If you aim to build robust, high-performance applications, "Mastering DynamoDB" is for you.

ABCs of IBM z/OS System Programming Volume 1

2014-08-20 O'Reilly Amazon

book

Paul Rogers, Karan Singh

data data-engineering IBM Cyber Security Unix

The ABCs of IBM® z/OS® System Programming is a 13-volume collection that provides an introduction to the z/OS operating system and the hardware architecture. Whether you are a beginner or an experienced system programmer, the ABCs collection provides the information that you need to start your research into z/OS and related subjects. Whether you want to become more familiar with z/OS in your current environment, or you are evaluating platforms to consolidate your online business applications, the ABCs collection will serve as a powerful technical tool. Volume 1 provides an updated understanding of the software and IBM zSeries architecture, and explains how it is used together with the z/OS operating system. This includes the main components of z/OS needed to customize and install the z/OS operating system. This edition has been significantly updated and revised. The other volumes contain the following content: Volume 2: z/OS implementation and daily maintenance, defining subsystems, IBM Job Entry Subsystem 2 (JES2) and JES3, link pack area (LPA), LNKLST, authorized libraries, System Modification Program/Extended (SMP/E), IBM Language Environment® Volume 3: Introduction to Data Facility Storage Management Subsystem (DFSMS), data set basics, storage management hardware and software, catalogs, and DFSMS Transactional Virtual Storage Access Method (VSAM), or DFSMStvs Volume 4: z/OS Communications Server, Transmission Control Protocol/Internet Protocol (TCP/IP), and IBM Virtual Telecommunications Access Method (IBM VTAM®) Volume 5: Base and IBM Parallel Sysplex®, z/OS System Logger, Resource Recovery Services (RRS), Global Resource Serialization (GRS), z/OS system operations, z/OS Automatic Restart Manager (ARM), IBM Geographically Dispersed Parallel Sysplex™ (IBM GDPS®) Volume 6: Introduction to security, IBM Resource Access Control Facility (IBM RACF®), Digital certificates and public key infrastructure (PKI), Kerberos, cryptography and IBM eServer™ z990 integrated cryptography, zSeries firewall technologies, Lightweight Directory Access Protocol (LDAP), and Enterprise Identity Mapping (EIM) Volume 7: Printing in a z/OS environment, Infoprint Server, and Infoprint Central Volume 8: An introduction to z/OS problem diagnosis Volume 9: z/OS UNIX System Services Volume 10: Introduction to IBM z/Architecture®, zSeries processor design, zSeries connectivity, LPAR concepts, HCD, and IBM DS8000® Volume 11: Capacity planning, IBM Performance Management, z/OS Workload Manager (WLM), IBM Resource Management Facility (IBM RMF™), and IBM System Management Facility (SMF) Volume 12: WLM Volume 13: JES2 and JES3 System Display and Search Facility (SDSF)

Bitemporal Data

2014-08-19 O'Reilly Amazon

book

Tom Johnston

data data-engineering relational-databases Data Management IBM dimensional modeling

Bitemporal data has always been important. But it was not until 2011 that the ISO released a SQL standard that supported it. Currently, among major DBMS vendors, Oracle, IBM and Teradata now provide at least some bitemporal functionality in their flagship products. But to use these products effectively, someone in your IT organization needs to know more than how to code bitemporal SQL statements. Perhaps, in your organization, that person is you. To correctly interpret business requests for temporal data, to correctly specify requirements to your IT development staff, and to correctly design bitemporal databases and applications, someone in your enterprise needs a deep understanding of both the theory and the practice of managing bitemporal data. Someone also needs to understand what the future may bring in the way of additional temporal functionality, so their enterprise can plan for it. Perhaps, in your organization, that person is you. This is the book that will show the do-it-yourself IT professional how to design and build bitemporal databases and how to write bitemporal transactions and queries, and will show those who will direct the use of vendor-provided bitemporal DBMSs exactly what is going on "under the covers" of that software. 
Explains the business value of bitemporal data in terms of the information that can be provided by bitemporal tables and not by any other form of temporal data, including history tables, version tables, snapshot tables, or slowly-changing dimensions Provides an integrated account of the mathematics, logic, ontology and semantics of relational theory and relational databases, in terms of which current relational theory and practice can be seen as unnecessarily constrained to the management of nontemporal and incompletely temporal data Explains how bitemporal tables can provide the time-variance and nonvolatility hitherto lacking in Inmon historical data warehouses Explains how bitemporal dimensions can replace slowly-changing dimensions in Kimball star schemas, and why they should do so Describes several extensions to the current theory and practice of bitemporal data, including the use of episodes, "whenever" temporal transactions and queries, and future transaction time Points out a basic error in the ISO’s bitemporal SQL standard, and warns practitioners against the use of that faulty functionality. Recommends six extensions to the ISO standard which will increase the business value of bitemporal data Points towards a tritemporal future for bitemporal data, in which an Aristotelian ontology and a speech-act semantics support the direct management of the statements inscribed in the rows of relational tables, and add the ability to track the provenance of database content to existing bitemporal databases This book also provides the background needed to become a business ontologist, and explains why an IT data management person, deeply familiar with corporate databases, is best suited to play that role. Perhaps, in your organization, that person is you

IBM ProtecTIER Implementation and Best Practices Guide

2014-08-19 O'Reilly Amazon

book

Karen Orlando, Mara Miranda Bautista, Jose Roberto Mosqueda Mejia, Rosane Goldstein Langnor

data data-engineering IBM API

This IBM® Redbooks® publication provides best practice guidance for planning, installing, and configuring the IBM System Storage® TS7600 ProtecTIER® family of products. This guide provides the current best practices for using ProtecTIER software version physical general availability (pGA) 3.3 and the revolutionary and patented IBM HyperFactor® deduplication engine, along with other data storage efficiency techniques, such as compression and defragmentation. The System Storage TS7650G ProtecTIER Deduplication Gateway and the System Storage TS7620 ProtecTIER Deduplication Appliance Express are disk-based data storage systems that are configured for three available interfaces: The Virtual Tape Library (VTL) interface is the foundation of ProtecTIER and emulates traditional automated tape libraries. The OpenStorage (OST) application programming interface (API) can be integrated with Symantec NetBackup to provide backup-to-disk without having to emulate traditional tape libraries. The File System Interface (FSI) supports Common Internet File System (CIFS) and Network File System (NFS) as backup targets. When you build a ProtecTIER data deduplication environment, this guide helps your IT architects and solution designers plan for the best options and scenarios for data deduplication for their environments. This guide helps you optimize your deduplication ratio, and at the same time reduce the hardware, power and cooling, and management costs. This guide provides expertise that was gained from the IBM ProtecTIER Field Technical Sales Support (FTSS) group, development, and quality assurance (QA) teams.

Leaflet.js Essentials

2014-08-18 O'Reilly Amazon

book

Paul Crickard III

data data-engineering location-data geographic-information-system-gis web-mapping JavaScript

Leaflet.js Essentials is a practical guide designed to help web developers create engaging, mobile-friendly map applications using the Leaflet.js library. Through clear step-by-step tutorials, you will gain the skills to integrate interactive mapping features into your web projects. What this Book will help me do Build web maps integrating Tile Layers and Web Mapping Services. Develop interactive maps using Leaflet.js and JavaScript. Add GeoJSON data and create custom map markers. Create advanced visualizations such as heatmaps and choropleth maps. Enhance maps using third-party plugins and additional tools. Author(s) Paul Crickard III is an experienced software developer and an expert in geospatial technologies. With a passion for teaching complex topics in an accessible way, Paul has helped many developers integrate geospatial functionalities into their projects. His practical approach to Leaflet in this book equips developers with actionable knowledge. Who is it for? This book is ideal for web developers with a basic knowledge of JavaScript looking to enhance their projects with interactive maps. It suits both beginners to geospatial technologies and those familiar with mapping concepts. If you're aiming to create engaging maps or want to leverage Leaflet for app development, this is the right book for you.

IBM Tivoli Storage Manager as a Data Protection Solution

2014-08-15 O'Reilly Amazon

book

Pia Nymann, Mary Lovelace, Julien Sauvanet, Gerd Becker, Norbert Pott, Felipe Peres, Gokhan Yildirim, Mikael Lindstrom, Rosane Langnor

data data-engineering IBM ibm-tivoli

When you hear IBM® Tivoli® Storage Manager, the first thing that you typically think of is data backup. Tivoli Storage Manager is the premier storage management solution for mixed platform environments. Businesses face a tidal wave of information and data that seems to increase daily. The ability to successfully and efficiently manage information and data has become imperative. The Tivoli Storage Manager family of products helps businesses successfully gain better control and efficiently manage the information tidal wave through significant enhancements in multiple facets of data protection. Tivoli Storage Manager is a highly scalable and available data protection solution. It takes data protection scalability to the next level with a relational database, which is based on IBM DB2® technology. Greater availability is delivered through enhancements such as online, automated database reorganization. This IBM Redbooks® publication describes the evolving set of data-protection challenges and how capabilities in Tivoli Storage Manager can best be used to address those challenges. This book is more than merely a description of new and changed functions in Tivoli Storage Manager; it is a guide to use for your overall data protection solution.

Time and Relational Theory, 2nd Edition

2014-08-13 O'Reilly Amazon

book

Hugh Darwen, Nikos Lorentzos, C.J. Date

data data-engineering relational-databases SQL

Time and Relational Theory provides an in-depth description of temporal database systems, which provide special facilities for storing, querying, and updating historical and future data. Traditionally, database management systems provide little or no special support for temporal data at all. This situation is changing because: Cheap storage enables retention of large volumes of historical data in data warehouses Users are now faced with temporal data problems, and need solutions Temporal features have recently been incorporated into the SQL standard, and vendors have begun to add temporal support to their DBMS products Based on the groundbreaking text Temporal Data & the Relational Model (Morgan Kaufmann, 2002) and new research led by the authors, Time and Relational Theory is the only book to offer a complete overview of the functionality of a temporal DBMS. Expert authors Nikos Lorentzos, Hugh Darwen, and Chris Date describe an approach to temporal database management that is firmly rooted in classical relational theory and will stand the test of time. This book covers the SQL:2011 temporal extensions in depth and identifies and discusses the temporal functionality still missing from SQL. Understand how the relational model provides an ideal basis for taming the complexities of temporal databases Learn how to analyze and evaluate commercial temporal products with this timely and important information Be able to use sound principles in designing and using temporal databases Understand the temporal support recently added to SQL with coverage of the new SQL features in this unique, accurate, and authoritative reference Appreciate the benefits of a truly relational approach to the problem with this clear, user friendly presentation

ABCs of IBM z/OS System Programming Volume 6

2014-08-12 O'Reilly Amazon

book

Paul Rogers, Oerjan Lundgren, Bob McCormack, Rui Feio, Rita Pleus, Karan Singh

data data-engineering IBM Cyber Security Unix

The ABCs of IBM® z/OS® System Programming is an 11-volume collection that provides an introduction to the z/OS operating system and the hardware architecture. Whether you are a beginner or an experienced system programmer, the ABCs collection provides the information that you need to start your research into z/OS and related subjects. If you want to become more familiar with z/OS in your current environment or if you are evaluating platforms to consolidate your e-business applications, the ABCs collection can serve as a powerful technical tool. Following are the contents of the volumes: Volume 1: Introduction to z/OS and storage concepts, TSO/E, ISPF, JCL, SDSF, and z/OS delivery and installation Volume 2: z/OS implementation and daily maintenance, defining subsystems, JES2 and JES3, LPA, LNKLST, authorized libraries, IBM Language Environment®, and SMP/E Volume 3: Introduction to DFSMS, data set basics, storage management hardware and software, VSAM, System-managed storage, catalogs, and DFSMStvs Volume 4: Communication Server, TCP/IP, and IBM VTAM® Volume 5: Base and IBM Parallel Sysplex®, System Logger, Resource Recovery Services (RRS), global resource serialization (GRS), z/OS system operations, automatic restart management (ARM), and IBM Geographically Dispersed Parallel Sysplex™ (IBM GDPS®) Volume 6: Introduction to security, IBM RACF®, digital certificates and public key infrastructure (PKI), Kerberos, cryptography and IBM z9® integrated cryptography, Lightweight Directory Access Protocol (LDAP), and Enterprise Identity Mapping (EIM) Volume 7: Printing in a z/OS environment, Infoprint Server, and Infoprint Central Volume 8: An introduction to z/OS problem diagnosis Volume 9: z/OS UNIX System Services Volume 10: Introduction to IBM z/Architecture®, IBM System z® processor design, System z connectivity, logical partition (LPAR) concepts, hardware configuration definition (HCD), and Hardware Management Console (HMC) Volume 11: Capacity planning, performance management, Workload Manager (WLM), IBM Resource Measurement Facility™ (RMF™), and System Management Facilities (SMF)

GDPS Family: An Introduction to Concepts and Capabilities

2014-08-12 O'Reilly Amazon

book

Sim Schindel, David Clitherow

data data-engineering IBM

This IBM® Redbooks® publication presents an overview of the IBM Geographically Dispersed Parallel Sysplex™ (IBM GDPS®) family of offerings and the role they play in delivering a business IT resilience solution. The book begins with general concepts of business IT resilience and disaster recovery, along with issues related to high application availability, data integrity, and performance. These topics are considered within the framework of government regulation, increasing application and infrastructure complexity, and the competitive and rapidly changing modern business environment. Next, it describes the GDPS family of offerings with specific reference to how they can help you achieve your defined goals for disaster recovery and high availability. Also covered are the features that simplify and enhance data replication activities, the prerequisites for implementing each offering, and hints for planning for the future and immediate business requirements. Tables provide easy-to-use summaries and comparisons of the offerings, and the additional planning and implementation services available from IBM are explained. Finally, several practical client scenarios and requirements are described, along with the most suitable GDPS solution for each case. The introductory chapters of this publication are intended for a broad technical audience including IT System Architects, Availability Managers, Technical IT Managers, Operations Managers, System Programmers, and Disaster Recovery Planners. The subsequent chapters provide more technical details about the GDPS offerings, and each can be read in isolation for those readers who are interested. Therefore, if you do read all the chapters, be aware that some information is repeated.

MySQL Cookbook, 3rd Edition

2014-08-08 O'Reilly Amazon

book

Paul DuBois

data data-engineering relational-databases MySQL API Java

MySQL's popularity has brought a flood of questions about how to solve specific problems, and that's where this cookbook is essential. When you need quick solutions or techniques, this handy resource provides scores of short, focused pieces of code, hundreds of worked-out examples, and clear, concise explanations for programmers who don't have the time (or expertise) to solve MySQL problems from scratch. Ideal for beginners and professional database and web developers, this updated third edition covers powerful features in MySQL 5.6 (and some in 5.7). The book focuses on programming APIs in Python, PHP, Java, Perl, and Ruby. With more than 200 recipes, you'll learn how to: Use the mysql client and write MySQL-based programs Create, populate, and select data from tables Store, retrieve, and manipulate strings Work with dates and times Sort query results and generate summaries Use stored routines, triggers, and scheduled events Import, export, validate, and reformat data Perform transactions and work with statistics Process web input, and generate web content from query results Use MySQL-based web session management Provide security and server administration

Predictive Analytics Using Oracle Data Miner

2014-08-08 O'Reilly Amazon

book

Brendan Tierney

data data-engineering oracle-database-solutions Analytics BI Oracle

Build Next-Generation In-Database Predictive Analytics Applications with Oracle Data Miner “If you have an Oracle Database and want to leverage that data to discover new insights, make predictions, and generate actionable insights, this book is a must read for you! In Predictive Analytics Using Oracle Data Miner: Develop & Use Oracle Data Mining Models in Oracle Data Miner, SQL & PL/SQL, Brendan Tierney, Oracle ACE Director and data mining expert, guides you through the basic concepts of data mining and offers step-by-step instructions for solving data-driven problems using SQL Developer’s Oracle Data Mining extension. Brendan takes it full circle by showing you how to deploy advanced analytical methodologies and predictive models immediately into enterprise-wide production environments using the in-database SQL and PL/SQL functionality. Definitely a must read for any Oracle data professional!” --Charlie Berger, Senior Director Product Management, Oracle Data Mining and Advanced Analytics Perform in-database data mining to unlock hidden insights in data. Written by an Oracle ACE Director, Predictive Analytics Using Oracle Data Miner shows you how to use this powerful tool to create and deploy advanced data mining models. Covering topics for the data scientist, Oracle developer, and Oracle database administrator, this Oracle Press guide shows you how to get started with Oracle Data Miner and build Oracle Data Miner models using SQL and PL/SQL packages. You'll get best practices for integrating your Oracle Data Miner models into applications to automate the discovery and distribution of business intelligence predictions throughout the enterprise. 
Install and configure Oracle Data Miner for Oracle Database 11g Release 11.2 and Oracle Database 12c Create Oracle Data Miner projects and workflows Prepare data for data mining Develop data mining models using association rule analysis, classification, clustering, regression, and anomaly detection Use data dictionary views and prepare your data using in-database transformations Build and use data mining models using SQL and PL/SQL packages Migrate your Oracle Data Miner models, integrate them into dashboards and applications, and run them in parallel Build transient data mining models with the Predictive Queries feature in Oracle Database 12c

The Complete Guide to CICS Transaction Gateway Volume 1 Configuration and Administration

2014-08-08 O'Reilly Amazon

book

Leigh Compton, Rufus Credle, Richard Mercadante, Manuela Mandelli, Robert Jones, Sue Bayliss

data data-engineering IBM API Cyber Security

In this IBM® Redbooks® publication, you will gain an appreciation of the IBM CICS® Transaction Gateway (CICS TG) product suite, based on key criteria, such as capabilities, scalability, platform, CICS server support, application language support, and licensing model. Matching the requirements to available infrastructure and hardware choices requires an appreciation of the choices available. In this book, you will gain an understanding of those choices, and will be capable of choosing the appropriate CICS connection protocol, APIs for the applications, and security options. You will understand the services available to the application developer when using a chosen protocol. You will then learn about how to implement CICS TG solutions, taking advantage of the latest capabilities, such as IPIC connectivity, high availability, and Dynamic Server Selection. Specific scenarios illustrate the usage of CICS TG for IBM z/OS®, and CICS TG for Multiplatforms, with CICS Transaction Server for z/OS and IBM WebSphere® Application Server, including connections in CICS, configuring simple end-to-end connectivity (all platforms) with verification for remote and local mode applications, and adding security, XA support, and high availability.

Britain and the EU: In or Out?

2014-08-01 O'Reilly Amazon

book

Financial Times

data data-engineering data-security-privacy eu-general-data-protection-regulation-gdpr eu general data protection regulation (gdpr)

Britain has had an ambivalent attitude to the European Union ever since it joined 40 years ago. So what does prime minister David Cameron's promise to hold a referendum on whether the UK should stay in the union mean? What would a "Brexit" entail for Britain, Europe, and the world? These are the questions answered in Britain and the EU, an ebook of 10,000 words, compiled from news and comment published in the Financial Times, the global business newspaper which combines expert UK political coverage with unrivalled reporting on the European Union. The ebook's publication in April 2013 comes less than a year after the runaway success of the FT's first ebook, If Greece goes.... which looked at the consequences of Athens' feared expulsion from the eurozone.

Transportation Management with SAP TM 9.0

2014-08-01 O'Reilly Amazon

book

Jayant Daithankar, Tejkumar Pandit

data data-engineering SAP

"The implementation of a TMS solution is a highly complex and mission-critical project. If executed correctly, a good TMS can deliver a number of benefits to the organization in terms of optimization, greater efficiency, reduced errors and improved revenue through accurate invoicing. However, a number of projects fail to realize these benefits for a host of reasons, such as incorrect product selection, over-customization of the system and lack of detailed processes. The evaluation and selection of the right transportation management system is a critical step in the successful implementation of a TMS product, as well as in ensuring that the organization is able to realize the benefits expected from the system. Transportation Management with SAP TM 9 is a guide for CIOs/CXOs evaluating options for various transportation management solutions available in the market and helps in appropriate decision making before committing investment. A proven evaluation framework and guidance provided in the book can help decision makers with product selection and help to create a business case for management approval and design a future roadmap for the organization. The book provides a comprehensive understanding of what SAP transportation management is and is useful for teams involved in TM implementation and rollouts to ensure preparedness. The book explains end-to-end freight life cycle processes, functional system landscape, implementation challenges and post go-live precautions required to optimize investments in SAP TM. Transportation Management with SAP TM 9 also acts as a step-by-step implementation guide with details of the configuration required to set up a TM 9 system. This book also covers the upgrade of SAP TM 8 to SAP TM 9, which will be useful for existing clients who are on TM 8. 
Nonavailability of SAP TM skilled resources is a major challenge faced by organizations, and the book provides a detailed competency building plan along with skill set requirements to create a competent and trained workforce to manage transformation. The current book available in the market on SAP TM is based on the Version 6 release, which does not cover air freight processes. Our book covers end-to-end air freight configuration scenarios for logistic companies."

IBM System Storage N series Software Guide

2014-07-31 O'Reilly Amazon

book

Christian Fey, Michael Klimes, Steven Pemberton, Danny Yang, Roland Tretau, Tom Provost

data data-engineering IBM ibm-system-storage system-storage-n DWH

Corporate workgroups, distributed enterprises, and small to medium-sized companies are increasingly seeking to network and consolidate storage to improve availability, share information, reduce costs, and protect and secure information. These organizations require enterprise-class solutions capable of addressing immediate storage needs cost-effectively, while providing an upgrade path for future requirements. IBM® System Storage® N series storage systems and their software capabilities are designed to meet these requirements. IBM System Storage N series storage systems offer an excellent solution for a broad range of deployment scenarios. IBM System Storage N series storage systems function as a multiprotocol storage device that is designed to allow you to simultaneously serve both file and block-level data across a single network. These activities are demanding procedures that, for some solutions, require multiple, separately managed systems. The flexibility of IBM System Storage N series storage systems, however, allows them to address the storage needs of a wide range of organizations, including distributed enterprises and data centers for midrange enterprises. IBM System Storage N series storage systems also support sites with computer and data-intensive enterprise applications, such as database, data warehousing, workgroup collaboration, and messaging. This IBM Redbooks® publication explains the software features of the IBM System Storage N series storage systems. This book also covers topics such as installation, setup, and administration of those software features from the IBM System Storage N series storage systems and clients and provides example scenarios.

Implementing the IBM System Storage SAN Volume Controller V7.2

2014-07-28 O'Reilly Amazon

book

Sangam Racherla, Libor Miklas, Hartmut Lonzer, Matus Butora

data data-engineering IBM ibm-system-storage ibm-system-storage-san-volume-controller

This IBM® Redbooks® publication is a detailed technical guide to the IBM System Storage® SAN Volume Controller Version 7.2. SAN Volume Controller is a virtualization appliance solution, which maps virtualized volumes that are visible to hosts and applications to physical volumes on storage devices. Each server within the storage area network (SAN) has its own set of virtual storage addresses that are mapped to physical addresses. If the physical addresses change, the server continues running by using the same virtual addresses that it had before. Therefore, volumes or storage can be added or moved while the server is still running. The IBM virtualization technology improves the management of information at the “block” level in a network, which enables applications and servers to share storage devices on a network. This book is intended for readers who must implement the SAN Volume Controller at a 7.2 release level with minimal effort.

Unimodal and Multimodal Biometric Data Indexing

2014-07-28 O'Reilly Amazon

book

Somnath Dey, Debasis Samanta

data data-engineering data-models

This work covers biometric data indexing for large-scale identification systems, with a focus on different biometric data indexing methods. It provides state-of-the-art coverage of different biometric traits, together with the pros and cons of each. A discussion of different multimodal fusion strategies is also included.

Data Classification

2014-07-25 O'Reilly Amazon

book

Charu C. Aggarwal

data data-engineering search AI/ML

Research on the problem of classification tends to be fragmented across such areas as pattern recognition, database, data mining, and machine learning. Addressing the work of these different communities in a unified way, this book explores the underlying algorithms of classification as well as applications of classification in a variety of problem domains, including text, multimedia, social network, and biological data. It presents core methods in data classification, covers recent problem domains, and discusses advanced methods for enhancing the quality of the underlying classification results.

Scaling Apache Solr

2014-07-25 O'Reilly Amazon

book

Hrishikesh Vijay Karambelkar

data data-engineering search solr Cloud Computing

Become an expert in implementing high-performance, scalable search solutions with Apache Solr in 'Scaling Apache Solr'. This detailed guide teaches you how to architect and manage top-tier search functionalities tailored for different enterprise environments.
What this Book will help me do: Understand the Apache Solr ecosystem and its core functionality. Apply techniques for scaling and optimizing search for enterprise environments. Implement sharding, replication, and fault tolerance for robust searches. Integrate Solr with various systems and infrastructure to enhance capability. Optimize data indexing and retrieval for high-performance applications.
Author(s): Hrishikesh Vijay Karambelkar is an experienced software architect with extensive expertise in search technologies, including Solr and Lucene. He has worked on numerous enterprise applications where scalable and efficient search was critical. His writing is informed by his real-world implementations and is structured to provide practical knowledge to help readers tackle similar challenges.
Who is it for? This book is ideal for software developers, architects, and IT professionals who manage or create enterprise search solutions. It's suitable for readers with basic programming knowledge but no experience with Apache Solr. This detailed guide will also benefit those looking to improve performance and scalability in their applications using cutting-edge technology. If scalability, integration, and cloud search solutions are topics you want to master, this book is tailored for you.

SQL Server 2014 Development Essentials

2014-07-25 O'Reilly Amazon

book

Basit A. Masood-Al-Farooq

data data-engineering SQL Data Management Microsoft SQL Server

This book is your ultimate guide to mastering database development using Microsoft SQL Server 2014. By diving into this hands-on resource, you will explore the essentials of database design, implementation, and deployment to create robust solutions that meet modern enterprise needs.
What this Book will help me do: Gain a deep understanding of SQL Server 2014's new features and enhancements. Master database design principles for scalable and efficient solutions. Develop and optimize SQL queries for robust data retrieval and manipulation. Understand advanced database object topics and effective error handling. Learn performance optimization techniques for maintaining database efficiency.
Author(s): Basit A. Masood-Al-Farooq is a seasoned database professional with extensive experience in SQL Server development and administration. They have worked on numerous critical projects in enterprise data management and have a practical, results-driven approach to database solutions. As an author, they focus on equipping readers with actionable insights and techniques through clear explanations and real-world examples.
Who is it for? This book is ideal for database developers, administrators, and architects who work with Microsoft SQL Server and wish to expand their expertise in its 2014 version. Beginners to intermediate-level professionals will find it accessible and straightforward, while advanced users can discover new features and optimizations. It caters to anyone looking to design or optimize database solutions effectively. Whether you manage databases or are diving into database software development, this book will enhance your SQL Server 2014 skills.

Implementing the IBM Storwize V7000 V7.2

2014-07-21 O'Reilly Amazon

book

Sangam Racherla, Libor Miklas, Hartmut Lonzer, Matus Butora

data data-engineering IBM Marketing

Continuing its commitment to developing and delivering industry-leading storage technologies, IBM® introduces the IBM Storwize® V7000 solution, an innovative new storage offering that delivers essential storage efficiency technologies and exceptional ease of use and performance, all integrated into a compact, modular design that is offered at a competitive, midrange price. The IBM Storwize V7000 solution incorporates some of the top IBM technologies typically found only in enterprise-class storage systems, raising the standard for storage efficiency in midrange disk systems. This cutting-edge storage system extends the comprehensive storage portfolio from IBM and can help change the way organizations address the ongoing information explosion. This IBM Redbooks® publication introduces the features and functions of the IBM Storwize V7000 system through several examples. This book is aimed at pre- and post-sales technical support and marketing personnel and at storage administrators. It will help you understand the architecture of the Storwize V7000, how to implement it, and how to take advantage of its industry-leading functions and features.

Cloudera Administration Handbook

2014-07-18 O'Reilly Amazon

book

Rohit Menon

data data-engineering Hadoop cloudera Big Data HDFS

Discover how to effectively administer large Apache Hadoop clusters with the Cloudera Administration Handbook. This guide offers step-by-step instructions and practical examples, enabling you to confidently set up and manage Hadoop environments using Cloudera Manager and CDH5 tools. Through this book, administrators or aspiring experts can unlock the power of distributed computing and streamline cluster operations.
What this Book will help me do: Gain in-depth understanding of Apache Hadoop architecture and its operational framework. Master the setup, configuration, and management of Hadoop clusters using Cloudera tools. Implement robust security measures in your cluster including Kerberos authentication. Optimize for reliability with advanced HDFS features like High Availability and Federation. Streamline cluster management and address troubleshooting effectively using best practices.
Author(s): Rohit Menon is an experienced technologist specializing in distributed computing and data infrastructure. With a strong background in big data platforms and certifications in Hadoop administration, Menon has helped enterprises optimize their cluster deployments. Their instructional approach combines clarity, practical insights, and a hands-on focus.
Who is it for? This book is ideal for systems administrators, data engineers, and IT professionals keen on mastering Hadoop environments. It serves both beginners getting started with cluster setup and seasoned administrators seeking advanced configurations. If you're aiming to efficiently manage Hadoop clusters using Cloudera solutions, this guide provides the knowledge and tools you need.

PostgreSQL 9 High Availability Cookbook

2014-07-17 O'Reilly Amazon

book

Shaun Thomas

data data-engineering relational-databases postgresql Linux

"PostgreSQL 9 High Availability Cookbook" is a guide for PostgreSQL DBAs and developers looking to build a robust and highly available database ecosystem. Through over 100 tested recipes, it delves into vital topics like replication, clustering, and monitoring to ensure system reliability and uptime.
What this Book will help me do: Set up PostgreSQL replication to enhance data availability and reliability. Implement monitoring solutions to keep your database's performance and health in check. Learn to troubleshoot common database issues to reduce downtime. Configure connection pooling to optimize resource usage and ensure better scalability. Master techniques for clustering and partitioning large datasets to handle growing system needs.
Author(s): The author, Shaun Thomas, is a seasoned PostgreSQL administrator with extensive experience in database tuning, high availability solutions, and Linux system management. Shaun brings practical insights from his years of professional practice, aiming to make complex topics approachable.
Who is it for? This book caters to intermediate to advanced PostgreSQL administrators and developers. If you are seeking to enhance your database's performance, reliability, and resilience, this book is for you. With its practical recipe approach, it's a great fit for those who enjoy hands-on learning. Whether you're maintaining production systems or scaling for growth, this guide is your ally.

Performance Optimization and Tuning Techniques for IBM Processors, including IBM POWER8

2014-07-11 O'Reilly Amazon

book

Suresh Warrier, Peter Bergner, Wainer dos Santos Moschetta, Brian Hall, Berni Schiefer, Robert Enenkel, Pat Haugen, Philipp Oehler, Daniel Zabawa, Ryan Arnold, Michael R. Meissner, Alex Mericas, Brian F. Veale, Adhemerval Zanella

data data-engineering IBM Linux Superset

This IBM® Redbooks® publication focuses on gathering the correct technical information, and laying out simple guidance for optimizing code performance on IBM POWER8™ systems that run the AIX®, IBM i, or Linux operating systems. There is much straightforward performance optimization that can be performed with a minimum of effort and without extensive previous experience or in-depth knowledge. The POWER8 processor contains many new and important performance features, such as support for eight hardware threads in each core and support for transactional memory. POWER8 is a strict superset of IBM POWER7+™, and so all of the performance features of POWER7+, such as multiple page sizes, also appear in POWER8. Much of the technical information and guidance for optimizing performance on POWER8 presented in this guide also applies to POWER7+ and earlier processors, except where the guide explicitly indicates that a feature is new in POWER8. This guide strives to focus on optimizations that tend to be positive across a broad set of IBM POWER® processor chips and systems. Specific guidance is given for the POWER8 processor; however, the general guidance is applicable to the IBM POWER7+, IBM POWER7®, IBM POWER6®, IBM POWER5, and even to earlier processors. This guide is directed to personnel who are responsible for performing migration and implementation activities on IBM POWER8-based servers. This includes system administrators, system architects, network administrators, information architects, and database administrators (DBAs).

Understanding Big Data Scalability: Big Data Scalability Series, Part I

2014-07-11 O'Reilly Amazon

book

Cory Isaacson

data data-engineering nosql-databases Big Data Hadoop NoSQL

Get Started Scaling Your Database Infrastructure for High-Volume Big Data Applications
“Understanding Big Data Scalability presents the fundamentals of scaling databases from a single node to large clusters. It provides a practical explanation of what ‘Big Data’ systems are, and fundamental issues to consider when optimizing for performance and scalability. Cory draws on many years of experience to explain issues involved in working with data sets that can no longer be handled with single, monolithic relational databases.... His approach is particularly relevant now that relational data models are making a comeback via SQL interfaces to popular NoSQL databases and Hadoop distributions.... This book should be especially useful to database practitioners new to scaling databases beyond traditional single node deployments.” (Brian O’Krafka, software architect)
Understanding Big Data Scalability presents a solid foundation for scaling Big Data infrastructure and helps you address each crucial factor associated with optimizing performance in scalable and dynamic Big Data clusters. Database expert Cory Isaacson offers practical, actionable insights for every technical professional who must scale a database tier for high-volume applications. Focusing on today’s most common Big Data applications, he introduces proven ways to manage unprecedented data growth from widely diverse sources and to deliver real-time processing at levels that were inconceivable until recently. Isaacson explains why databases slow down, reviews each major technique for scaling database applications, and identifies the key rules of database scalability that every architect should follow. You’ll find insights and techniques proven with all types of database engines and environments, including SQL, NoSQL, and Hadoop. Two start-to-finish case studies walk you through planning and implementation, offering specific lessons for formulating your own scalability strategy.
Coverage includes:
Understanding the true causes of database performance degradation in today’s Big Data environments
Scaling smoothly to petabyte-class databases and beyond
Defining database clusters for maximum scalability and performance
Integrating NoSQL or columnar databases that aren’t “drop-in” replacements for RDBMSes
Scaling application components: solutions and options for each tier
Recognizing when to scale your data tier, a decision with enormous consequences for your application environment
Why data relationships may be even more important in non-relational databases
Why virtually every database scalability implementation still relies on sharding, and how to choose the best approach
How to set clear objectives for architecting high-performance Big Data implementations
The Big Data Scalability Series is a comprehensive, four-part series, containing information on many facets of database performance and scalability. Understanding Big Data Scalability is the first book in the series. Learn more and join the conversation about Big Data scalability at bigdatascalability.com.
