talk-data.com

Event

O'Reilly Data Engineering Books

2001-10-19 – 2027-05-25 O'Reilly

Activities tracked

615

Collection of O'Reilly books on Data Engineering.

Filtering by: Cyber Security

Sessions & talks

Showing 301–325 of 615 · Newest first

Apache Oozie Essentials

Apache Oozie Essentials serves as your guide to mastering Apache Oozie, a powerful workflow scheduler for Hadoop environments. Through lucid explanations and practical examples, you will learn how to create, schedule, and enhance workflows for data ingestion, processing, and machine learning tasks using Oozie.
What this Book will help me do: • Install and configure Apache Oozie in your Hadoop environment to start managing workflows. • Develop seamless workflows that integrate tools like Hive, Pig, and Sqoop to automate data operations. • Set up coordinators to handle timed and dependent job executions efficiently. • Deploy Spark jobs within your workflows for machine learning on large datasets. • Harness Oozie security features to improve your system's reliability and trustworthiness.
Author(s): Singh is a seasoned developer with a deep understanding of big data processing and Apache Oozie. Drawing on the author's practical experience, the book intersperses technical detail with real-world examples for an effective learning experience. The author's goal is to make Oozie accessible and useful to professionals.
Who is it for? This book is ideal for data engineers and Hadoop professionals looking to streamline their workflow management using Apache Oozie. Whether you're a novice to Oozie or aiming to implement complex data and ML pipelines, the book offers comprehensive guidance tailored to your needs.

Pro Couchbase Server, Second Edition

This new edition is a hands-on guide for developers and administrators who want to use the power and flexibility of Couchbase Server 4.0 in their applications. The second edition extends coverage of N1QL, the SQL-like query language for Couchbase. It also brings coverage of multiple new features, including the new generation of client SDKs, security and LDAP integration, secondary indexes, and multi-dimensional scaling. Pro Couchbase Server covers everything you need to develop Couchbase solutions and deploy them in production. The NoSQL movement has fundamentally changed the database world in recent years. Influenced by the growing needs of web-scale applications, NoSQL databases such as Couchbase Server provide new approaches to scalability, reliability, and performance. Never have document databases been so powerful and performant. With the power and flexibility of Couchbase Server, you can model your data however you want, and easily change the data model any time you want. Pro Couchbase Server shows what is possible and helps you take full advantage of Couchbase Server and all the performance and scalability that it offers. • Helps you design and develop a document database using Couchbase Server. • Covers the latest features such as the N1QL query language. • Gives you the tools to scale out your application as needed.

Data Lake Development with Big Data

In "Data Lake Development with Big Data," you will explore the fundamental principles and techniques for constructing and managing a Data Lake tailored for your organization's big data challenges. This book provides practical advice and architectural strategies for ingesting, managing, and analyzing large-scale data efficiently and effectively.
What this Book will help me do: • Learn how to architect a Data Lake from scratch tailored to your organizational needs. • Master techniques for ingesting data efficiently using real-time and batch processing frameworks. • Understand data governance, quality, and security considerations essential for scalable Data Lakes. • Discover strategies for enabling users to explore data within the Data Lake effectively. • Gain insights into integrating Data Lakes with Big Data analytic applications for high performance.
Author(s): Pasupuleti and Beulah Salome Purra bring their extensive expertise in big data and enterprise data management to this book. With years of hands-on experience designing and managing large-scale data architectures, their insights are rooted in practical knowledge and proven techniques.
Who is it for? This book is ideal for data architects and senior managers tasked with adapting or creating scalable data solutions in enterprise contexts. Readers should have foundational knowledge of master data management and be familiar with Big Data technologies to derive maximum value from the content presented.

Introducing and Implementing IBM FlashSystem V9000

Storage capacity and performance requirements are growing faster than ever before, and the costs of managing this growth are depleting more of the information technology (IT) budget. The IBM® FlashSystem™ V9000 is the premier, fully integrated, Tier 1, all-flash offering from IBM. It has changed the economics of today's data center by eliminating storage bottlenecks. Its software-defined storage features simplify data management, improve data security, and preserve your investments in storage. IBM FlashSystem® V9000 includes IBM FlashCore™ technology and advanced software-defined storage available in one solution in a compact 6U form factor. FlashSystem V9000 improves business application availability. It delivers greater resource utilization so you can get the most from your storage resources, and achieve a simpler, more scalable, and cost-efficient IT Infrastructure. This IBM Redbooks® publication provides information about IBM FlashSystem V9000 Software V7.5 and its new functionality. It describes the product architecture, software, hardware, and implementation, and provides hints and tips. It illustrates use cases and independent software vendor (ISV) scenarios that demonstrate real-world solutions, and also provides examples of the benefits gained by integrating the FlashSystem storage into business environments. Using IBM FlashSystem V9000 software version 7.5 functions, management tools, and interoperability combines the performance of FlashSystem architecture with the advanced functions of software-defined storage to deliver performance, efficiency, and functions that meet the needs of enterprise workloads that demand IBM MicroLatency® response time. This book offers FlashSystem V9000 scalability concepts and guidelines for planning, installing, and configuring, which can help environments scale up and out to add more flash capacity and expand virtualized systems. 
Port utilization methodologies are provided to help you maximize the full potential of IBM FlashSystem V9000 performance and low latency in your scalable environment. In addition, all of the functions that FlashSystem V9000 software version 7.5 brings are explained, including IBM HyperSwap® capability, increased IBM FlashCopy® bitmap space, Microsoft Windows offloaded data transfer (ODX), and direct 16 gigabits per second (Gbps) Fibre Channel host attach support. This book also describes support for VMware 6, which enhances and improves scalability in a VMware environment. This book is intended for pre-sales and post-sales technical support professionals, storage administrators, and anyone who wants to understand how to implement this exciting technology.

WHOIS Running the Internet: Protocol, Policy, and Privacy

Discusses the evolution of WHOIS and how policy changes will affect WHOIS' place in IT today and in the future. This book provides a comprehensive overview of WHOIS. The text begins with an introduction to WHOIS and an in-depth coverage of its forty-year history. Afterwards it examines how to use WHOIS and how WHOIS fits in the overall structure of the Domain Name System (DNS). Other technical topics covered include WHOIS query code and WHOIS server details. The book also discusses current policy developments and implementations, reviews critical policy documents, and explains how they will affect the future of the Internet and WHOIS. Additional resources and content updates will be provided through a supplementary website. • Includes an appendix with information on current and authoritative WHOIS services around the world • Provides illustrations of actual WHOIS records and screenshots of web-based WHOIS query interfaces with instructions for navigating them • Explains network dependencies and processes related to WHOIS utilizing flowcharts • Contains advanced coding for programmers
WHOIS Running the Internet: Protocol, Policy, and Privacy is written primarily for Internet developers, policy developers, industry professionals in law enforcement, digital forensic investigators, and intellectual property attorneys. Garth O. Bruen is an Internet policy and security researcher whose work has been published in the Wall Street Journal and the Washington Post. Since 2012 Garth Bruen has served as the North American At-Large Chair to the Internet Corporation for Assigned Names and Numbers (ICANN). In 2003 Bruen created KnujOn.com with his late father, Dr. Robert Bruen, to process and investigate Internet abuse complaints (SPAM) from consumers. Bruen has trained and advised law enforcement at the federal and local levels on malicious use of the Domain Name System in the way it relates to the WHOIS record system.
He has presented multiple times to the High Technology Crime Investigation Association (HTCIA) as well as other cybercrime venues including the Anti-Phishing Working Group (APWG) and the National Center for Justice and the Rule of Law at The University of Mississippi School of Law. Bruen also teaches at the Fisher College Criminal Justice School in Boston, where he develops new approaches to digital crime.

IBM Content Manager OnDemand Guide

This IBM® Redbooks® publication provides a practical guide to the design, installation, configuration, and maintenance of IBM Content Manager OnDemand Version 9.5. Content Manager OnDemand manages the high-volume storage and retrieval of electronic statements and provides efficient enterprise report management. Content Manager OnDemand transforms formatted computer output and printed reports, such as statements and invoices, into electronic information for easy report management. Content Manager OnDemand helps eliminate costly, high-volume print output by capturing, indexing, archiving, and presenting electronic information for improved customer service. This publication covers the key areas of Content Manager OnDemand, some of which might not be known to the Content Manager OnDemand community or are misunderstood. The book covers various topics, including basic information in administration, database structure, storage management, and security. In addition, the book covers data indexing, loading, conversion, and expiration. Other topics include user exits, performance, retention management, records management, and many more. Because many other resources are available that address subjects on different platforms, this publication is not intended as a comprehensive guide for Content Manager OnDemand. Rather, it is intended to complement the existing Content Manager OnDemand documentation and provide insight into the issues that might be encountered in the setup and use of Content Manager OnDemand. This book is intended for individuals who need to design, install, configure, and maintain Content Manager OnDemand.

IBM Software for SAP Solutions

SAP is a market leader in enterprise business application software. SAP solutions provide a rich set of composable application modules, and configurable functional capabilities that are expected from a comprehensive enterprise business application software suite. In most cases, companies that adopt SAP software remain heterogeneous enterprises running both SAP and non-SAP systems to support their business processes. Regardless of the specific scenario, in heterogeneous enterprises most SAP implementations must be integrated with a variety of non-SAP enterprise systems: • Portals • Messaging infrastructure • Business process management (BPM) tools • Enterprise Content Management (ECM) methods and tools • Business analytics (BA) and business intelligence (BI) technologies • Security • Systems of record • Systems of engagement
The tooling included with SAP software addresses many needs for creating SAP-centric environments. However, the classic approach to implementing SAP functionality generally leaves the business with a rigid solution that is difficult and expensive to change and enhance. When SAP software is used in a large, heterogeneous enterprise environment, SAP clients face the dilemma of selecting the correct set of tools and platforms to implement SAP functionality, and to integrate the SAP solutions with non-SAP systems. This IBM® Redbooks® publication explains the value of integrating IBM software with SAP solutions. It describes how to enhance and extend pre-built capabilities in SAP software with best-in-class IBM enterprise software, enabling clients to maximize return on investment (ROI) in their SAP investment and achieve a balanced enterprise architecture approach. This book describes IBM Reference Architecture for SAP, a prescriptive blueprint for using IBM software in SAP solutions. The reference architecture is focused on defining the use of IBM software with SAP, and is not intended to address the internal aspects of SAP components.
The chapters of this book provide a specific reference architecture for many of the architectural domains that are each important for a large enterprise to establish common strategy, efficiency, and balance. The majority of the most important architectural domain topics, such as integration, process optimization, master data management, mobile access, Enterprise Content Management, business intelligence, DevOps, security, systems monitoring, and so on, are covered in the book. However, there are several other architectural domains which are not included in the book. This is not to imply that these other architectural domains are not important or are less important, or that IBM does not offer a solution to address them. It is only reflective of time constraints, available resources, and the complexity of assembling a book on an extremely broad topic. Although more content could have been added, the authors feel confident that the scope of architectural material that has been included should provide organizations with a fantastic head start in defining their own enterprise reference architecture for many of the important architectural domains, and it is hoped that this book provides great value to those reading it. This IBM Redbooks publication is targeted to the following audiences: Client decision makers and solution architects leading enterprise transformation projects and wanting to gain further insight so that they can benefit from the integration of IBM software in large-scale SAP projects. IT architects and consultants integrating IBM technology with SAP solutions.

Reduce Risk and Improve Security on IBM Mainframes: Volume 2 Mainframe Communication and Networking Security

This IBM® Redbooks® publication documents the strength and value of the IBM security strategy with IBM z Systems hardware and software (referred to in this book by the previous product name, IBM System z®). In an age of increasing security consciousness and more dangerous and advanced persistent threats, System z provides the capabilities to address today’s business security challenges. This book explores how System z hardware is designed to provide integrity, process isolation, and cryptographic capability to help address security requirements. We highlight the features of IBM z/OS® and other operating systems that offer a variety of customizable security elements. We also describe z/OS and other operating systems and additional software that use the building blocks of System z hardware to meet business security needs. We explore these from the perspective of an enterprise security architect and how a modern mainframe must fit into an enterprise security architecture. This book is part of a three-volume series that focuses on guiding principles for optimized mainframe security configuration within a holistic enterprise security architecture. The intended audience includes enterprise security architects, planners, and managers who are interested in exploring how the security design and features of the System z platform, the z/OS operating system, and associated software address current issues, such as data encryption, authentication, authorization, network security, auditing, ease of security administration, and monitoring.

SAP in 24 Hours, Sams Teach Yourself, Fifth Edition

Thoroughly updated and expanded! Includes new coverage on HANA, the cloud, and using SAP’s applications! In just 24 sessions of one hour or less, you’ll get up and running with the latest SAP technologies, applications, and solutions. Using a straightforward, step-by-step approach, each lesson strengthens your understanding of SAP from both a business and technical perspective, helping you gain practical mastery from the ground up on topics such as security, governance, validations, release management, SLA, and legal issues. Step-by-step instructions carefully walk you through the most common questions, issues, and tasks. Quizzes and exercises help you build and test your knowledge. Notes present interesting pieces of information. Tips offer advice or teach an easier way to do something. Cautions advise you about potential problems and help you steer clear of disaster.
Learn how to… • Understand SAP terminology, concepts, and solutions • Install SAP on premises or in the cloud • Master SAP’s revamped user interface • Discover how and when to use in-memory HANA databases • Integrate SAP Software as a Service (SaaS) solutions such as Ariba, SuccessFactors, Fieldglass, and hybris • Find resources at SAP’s Service Marketplace, Developer Network, and Help Portal • Avoid pitfalls in SAP project implementation, migration, and upgrades • Discover how SAP fits with mobile devices, social media, big data, and the Internet of Things • Start or accelerate your career working with SAP technologies

Managing the Data Lake

Organizations across many industries have recently created fast-growing repositories to deal with an influx of new data from many sources and often in multiple formats. To manage these data lakes, companies have begun to leave the familiar confines of relational databases and data warehouses for Hadoop and various big data solutions. But adopting new technology alone won’t solve the problem. Based on interviews with several experts in data management, author Andy Oram provides an in-depth look at common issues you’re likely to encounter as you consider how to manage business data. You’ll explore five key topic areas, including: Acquisition and ingestion: how to solve these problems with a degree of automation. Metadata: how to keep track of when data came in and how it was formatted, and how to make it available at later stages of processing. Data preparation and cleaning: what you need to know before you prepare and clean your data, and what needs to be cleaned up and how. Organizing workflows: what you should do to combine your tasks—ingestion, cataloging, and data preparation—into an end-to-end workflow. Access control: how to address security and access controls at all stages of data handling. Andy Oram, an editor at O’Reilly Media since 1992, currently specializes in programming. His work for O'Reilly includes the first books on Linux ever published commercially in the United States.

Redis Essentials

Redis Essentials is your go-to guide for understanding and mastering Redis, the leading in-memory data structure store. In this book, you will explore the powerful features offered by Redis, such as real-time data processing, highly scalable architectures, and practical implementations for web applications. You'll complete the journey equipped to handle and optimize Redis for your development projects.
What this Book will help me do: • Design analytics applications with advanced data structures like Bitmaps and HyperLogLogs. • Scale your application infrastructure using Redis Sentinel, Twemproxy, and Redis Cluster. • Develop custom Redis commands and extend its functionality with the Lua scripting language. • Implement robust security measures for Redis, including SSL encryption and firewall rules. • Master the usage of Redis client libraries in PHP, Python, Node.js, and Ruby for seamless development.
Author(s): Maxwell Dayvson da Silva is an experienced software engineer and author with expertise in designing high-performance systems. With a strong focus on practical knowledge and hands-on solutions, Maxwell brings over a decade of experience using Redis to this book. His approachable teaching style ensures learners grasp complex topics easily while emphasizing their practical application to real-world challenges.
Who is it for? Redis Essentials is aimed at developers looking to enhance their system's performance and scalability using Redis. Whether you're moderately familiar with key-value stores or new to Redis, this book will provide the explanations and hands-on examples you need. Recommended for developers with experience in data architectures, the book bridges the gap between understanding Redis features and their real-world application. Start here to bring high-performance in-memory data solutions to your projects.

Oracle SOA Suite 12c Handbook

Master Oracle SOA Suite 12c
Design, implement, manage, and maintain a highly flexible service-oriented computing infrastructure across your enterprise using the detailed information in this Oracle Press guide. Written by an Oracle ACE director, Oracle SOA Suite 12c Handbook uses a start-to-finish case study to illustrate each concept and technique. Learn expert techniques for designing and implementing components, assembling composite applications, integrating Java, handling complex business logic, and maximizing code reuse. Runtime administration, governance, and security are covered in this practical resource. • Get started with the Oracle SOA Suite 12c development and runtime environment • Deploy and manage SOA composite applications • Expose SOAP/XML and REST/JSON through Oracle Service Bus • Establish interactions through adapters for Database, JMS, File/FTP, UMS, LDAP, and Coherence • Embed custom logic using Java and the Spring component • Perform fast data analysis in real time with Oracle Event Processor • Implement Event-Driven Architecture based on the Event Delivery Network (EDN) • Use Oracle Business Rules to encapsulate logic and automate decisions • Model complex processes using BPEL, BPMN, and human task components • Establish KPIs and evaluate performance using Oracle Business Activity Monitoring • Control traffic, audit system activity, and encrypt sensitive data

The Architecture of Privacy

Technology’s influence on privacy not only concerns consumers, political leaders, and advocacy groups, but also the software architects who design new products. In this practical guide, experts in data analytics, software engineering, security, and privacy policy describe how software teams can make privacy-protective features a core part of product functionality, rather than add them late in the development process. Ideal for software engineers new to privacy, this book helps you examine privacy-protective information management architectures and their foundational components—building blocks that you can combine in many ways. Policymakers, academics, students, and advocates unfamiliar with the technical terrain will learn how these tools can help drive policies to maximize privacy protection.

You: For Sale

Everything we do online, and increasingly in the real world, is tracked, logged, analyzed, and often packaged and sold on to the highest bidder. Every time you visit a website, use a credit card, drive on the freeway, or go past a CCTV camera, you are logged and tracked. Every day billions of people choose to share their details on social media, which are then sold to advertisers. The Edward Snowden revelations that governments, including those of the US and UK, have been snooping on their citizens have rocked the world. But nobody seems to realize that this has already been happening for years, with firms such as Google capturing everything you type into a browser and selling it to the highest bidder. Apps take information about where you go, and your contact book details, harvest them and sell them on, and people just click the EULA without caring. No one is revealing the dirty secret that is the tech firms harvesting customers’ personal data and selling it for vast profits, and people are totally unaware of the dangers. You: For Sale is for anyone who is concerned about what corporate and government invasion of privacy means now and down the road. The book sets the scene by spelling out exactly what most users of the Internet and smart phones are exposing themselves to via commonly used sites and apps such as Facebook and Google, and then tells you what you can do to protect yourself. The book also covers legal and government issues as well as future trends. With interviews of leading security experts, black market data traders, law enforcement and privacy groups, You: For Sale will help you view your personal data in a new light, and understand both its value and its danger.
• Provides a clear picture of how companies and governments harvest and use personal data every time someone logs on • Describes exactly what these firms do with the data once they have it, and what you can do to stop it • Learn about the dangers of unwittingly releasing private data to tech firms, including interviews with top security experts, black market data traders, law enforcement and privacy groups • Understand the legal information and future trends that make this one of the most important issues today

Introduction to JavaScript Object Notation

What is JavaScript Object Notation (JSON) and how can you put it to work? This concise guide helps busy IT professionals get up and running quickly with this popular data interchange format, and provides a deep understanding of how JSON works. Author Lindsay Bassett begins with an overview of JSON syntax, data types, formatting, and security concerns before exploring the many ways you can apply JSON today. From Web APIs and server-side language libraries to NoSQL databases and client-side frameworks, JSON has emerged as a viable alternative to XML for exchanging data between different platforms. If you have some programming experience and understand HTML and JavaScript, this is your book. • Learn why JSON syntax represents data in name-value pairs • Explore JSON data types, including object, string, number, and array • Find out how you can combat common security concerns • Learn how the JSON schema verifies that data is formatted correctly • Examine the relationship between browsers, web APIs, and JSON • Understand how web servers can both request and create data • Discover how jQuery and other client-side frameworks use JSON • Learn why the CouchDB NoSQL database uses JSON to store data

PostgreSQL Replication, Second Edition

The second edition of 'PostgreSQL Replication' by Hans-Jürgen Schönig is a comprehensive guide that empowers PostgreSQL database professionals to establish robust replication solutions. Through detailed explanations and expert techniques, you will learn how to enhance the security, scalability, and reliability of your PostgreSQL databases using modern replication methods.
What this Book will help me do: • Master Point-in-Time Recovery to safeguard data and perform database recoveries effectively. • Implement both synchronous and asynchronous streaming replication to suit different operational needs. • Optimize database performance and scalability using tools like pgpool and PgBouncer. • Ensure database high availability and data security through Linux High Availability configurations. • Solve replication-related challenges by leveraging advanced knowledge of the PostgreSQL transaction log.
Author(s): Hans-Jürgen Schönig, a seasoned PostgreSQL specialist, has years of experience architecting and optimizing PostgreSQL database systems for businesses of all sizes. With a strong focus on practical implementation and a passion for teaching, his writing bridges the gap between theoretical concepts and hands-on solutions, making PostgreSQL topics accessible and actionable.
Who is it for? This book is tailored for PostgreSQL administrators and professionals seeking to implement robust database replication. Whether you're familiar with basic database administration or looking to deepen your expertise, this book provides valuable insights into replication strategies. It's ideal for those aiming to boost database performance and enhance operational reliability through advanced PostgreSQL features.

Virtualizing Hadoop: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture

Plan and Implement Hadoop Virtualization for Maximum Performance, Scalability, and Business Agility
Enterprises running Hadoop must absorb rapid changes in big data ecosystems, frameworks, products, and workloads. Virtualized approaches can offer important advantages in speed, flexibility, and elasticity. Now, a world-class team of enterprise virtualization and big data experts guides you through the choices, considerations, and tradeoffs surrounding Hadoop virtualization. The authors help you decide whether to virtualize Hadoop, deploy Hadoop in the cloud, or integrate conventional and virtualized approaches in a blended solution. First, Virtualizing Hadoop reviews big data and Hadoop from the standpoint of the virtualization specialist. The authors demystify MapReduce, YARN, and HDFS and guide you through each stage of Hadoop data management. Next, they turn the tables, introducing big data experts to modern virtualization concepts and best practices. Finally, they bring Hadoop and virtualization together, guiding you through the decisions you’ll face in planning, deploying, provisioning, and managing virtualized Hadoop. From security to multitenancy to day-to-day management, you’ll find reliable answers for choosing your best Hadoop strategy and executing it.
Coverage includes the following: • Reviewing the frameworks, products, distributions, use cases, and roles associated with Hadoop • Understanding YARN resource management, HDFS storage, and I/O • Designing data ingestion, movement, and organization for modern enterprise data platforms • Defining SQL engine strategies to meet strict SLAs • Considering security, data isolation, and scheduling for multitenant environments • Deploying Hadoop as a service in the cloud • Reviewing the essential concepts, capabilities, and terminology of virtualization • Applying current best practices, guidelines, and key metrics for Hadoop virtualization • Managing multiple Hadoop frameworks and products as one unified system • Virtualizing master and worker nodes to maximize availability and performance • Installing and configuring Linux for a Hadoop environment

Accumulo

Get up to speed on Apache Accumulo, the flexible, high-performance key/value store created by the National Security Agency (NSA) and based on Google’s BigTable data storage system. Written by former NSA team members, this comprehensive tutorial and reference covers Accumulo architecture, application development, table design, and cell-level security. With clear information on system administration, performance tuning, and best practices, this book is ideal for developers seeking to write Accumulo applications, administrators charged with installing and maintaining Accumulo, and other professionals interested in what Accumulo has to offer. You will find everything you need to use this system fully.

Oracle Database 12c DBA Handbook

The definitive reference for every Oracle DBA, completely updated for Oracle Database 12c. Oracle Database 12c DBA Handbook is the quintessential tool for the DBA with an emphasis on the big picture, enabling administrators to achieve effective and efficient database management. Fully revised to cover every new feature and utility, this Oracle Press guide shows how to harness cloud capability, perform a new installation, upgrade from previous versions, configure hardware and software, handle backup and recovery, and provide failover capability. The newly revised material features high-level and practical content on cloud integration, storage management, performance tuning, information management, and the latest on a completely revised security program. • Shows how to administer a scalable, flexible Oracle enterprise database • Includes new chapters on cloud integration, new security capabilities, and other cutting-edge features • All code and examples available online

Hadoop Security

As more corporations turn to Hadoop to store and process their most valuable data, the risk of a potential breach of those systems increases exponentially. This practical book not only shows Hadoop administrators and security architects how to protect Hadoop data from unauthorized access, it also shows how to limit the ability of an attacker to corrupt or modify data in the event of a security breach. Authors Ben Spivey and Joey Echeverria provide in-depth information about the security features available in Hadoop, and organize them according to common computer security concepts. You’ll also get real-world examples that demonstrate how you can apply these concepts to your use cases. • Understand the challenges of securing distributed systems, particularly Hadoop • Use best practices for preparing Hadoop cluster hardware as securely as possible • Get an overview of the Kerberos network authentication protocol • Delve into authorization and accounting principles as they apply to Hadoop • Learn how to use mechanisms to protect data in a Hadoop cluster, both in transit and at rest • Integrate Hadoop data ingest into enterprise-wide security architecture • Ensure that security architecture reaches all the way to end-user access

IBM System Storage Solutions Handbook

The IBM® System Storage® Solutions Handbook helps you solve your current and future data storage business requirements to achieve enhanced storage efficiency by design to allow managed cost, capacity of growth, greater mobility, and stronger control over storage performance and management. It describes the current IBM storage products, including IBM FlashSystem™, disk, and tape, and virtualized solutions, such as IBM Storage Cloud, IBM SmartCloud® Virtual Storage Center, and IBM Spectrum™ Storage. This IBM Redbooks® publication provides overviews and pointers for information about the current IBM System Storage products, showing how IBM delivers the right mix of products for nearly every aspect of business continuance and business efficiency. IBM storage products can help you store, safeguard, retrieve, and share your data. The following topics are covered: Part 1 introduces IBM storage solutions. It provides overviews of the IBM storage solutions, including IBM Spectrum Storage™, IBM Storage Cloud, IBM SmartCloud Virtual Storage Center (VSC), and the IBM PureSystems® products. Part 2 describes the IBM disk and flash products that include IBM DS Series (entry-level, midrange, and enterprise offerings), IBM XIV® storage, IBM Storwize® products, and the IBM FlashSystem offerings. Part 3 is an overview of the IBM tape drives, IBM tape automation products, and IBM tape virtualization solutions and products. Part 4 describes storage networking infrastructure, switches and directors to form storage area network (SAN) solutions, and converged networks and data center networking. Part 5 describes the IBM storage software portfolio, including IBM SAN Volume Controller, IBM Tivoli® Storage Manager, Tivoli Storage Productivity Center, and IBM Security Key Lifecycle Manager. Part 6 describes the IBM z/OS® storage management software and tools. The appendixes provide information about the High Performance Storage System (HPSS) and recently withdrawn IBM storage products. 
This book is intended as a reference for basic and comprehensive information about the IBM Storage products portfolio. It provides a starting point for establishing your own enterprise storage environment.

Designing and Operating a Data Reservoir

Together, big data and analytics have tremendous potential to improve the way we use precious resources, to provide more personalized services, and to protect ourselves from unexpected and ill-intentioned activities. To fully use big data and analytics, an organization needs a system of insight. This is an ecosystem where individuals can locate and access data, and build visualizations and new analytical models that can be deployed into the IT systems to improve the operations of the organization. The data that is most valuable for analytics is also valuable in its own right and typically contains personal and private information about key people in the organization such as customers, employees, and suppliers. Although universal access to data is desirable, safeguards are necessary to protect people's privacy, prevent data leakage, and detect suspicious activity. The data reservoir is a reference architecture that balances the desire for easy access to data with information governance and security. The data reservoir reference architecture describes the technical capabilities necessary for a system of insight, while being independent of specific technologies. Being technology independent is important, because most organizations already have investments in data platforms that they want to incorporate in their solution. In addition, technology is continually improving, and the choice of technology is often dictated by the volume, variety, and velocity of the data being managed. A system of insight needs more than technology to succeed. The data reservoir reference architecture includes descriptions of governance and management processes and definitions to ensure that the human and business systems around the technology support a collaborative, self-service, and safe environment for data use.
The data reservoir reference architecture was first introduced in Governing and Managing Big Data for Analytics and Decision Makers, REDP-5120, which is available at: http://www.redbooks.ibm.com/redpieces/abstracts/redp5120.html. This IBM® Redbooks publication, Designing and Operating a Data Reservoir, builds on that material to provide more detail on the capabilities and internal workings of a data reservoir.

FileMaker Pro 14: The Missing Manual

You don’t need a technical background to build powerful databases with FileMaker Pro 14. This crystal-clear, objective guide shows you how to create a database that lets you do almost anything with your data so you can quickly achieve your goals. Whether you’re creating catalogs, managing inventory and billing, or planning a wedding, you’ll learn how to customize your database to run on a PC, Mac, web browser, or iOS device. The important stuff you need to know: • Dive into relational data. Solve problems quickly by connecting and combining data from different tables. • Create professional documents. Publish reports, charts, invoices, catalogs, and other documents with ease. • Access data anywhere. Use FileMaker Go on your iPad or iPhone, or share data on the Web. • Harness processing power. Use new calculation and scripting tools to crunch numbers, search text, and automate tasks. • Run your database on a secure server. Learn the high-level features of FileMaker Pro Advanced. • Keep your data safe. Set privileges and allow data sharing with FileMaker’s streamlined security features.

Apache Oozie

Get a solid grounding in Apache Oozie, the workflow scheduler system for managing Hadoop jobs. With this hands-on guide, two experienced Hadoop practitioners walk you through the intricacies of this powerful and flexible platform, with numerous examples and real-world use cases. Once you set up your Oozie server, you’ll dive into techniques for writing and coordinating workflows, and learn how to write complex data pipelines. Advanced topics show you how to handle shared libraries in Oozie, as well as how to implement and manage Oozie’s security capabilities. • Install and configure an Oozie server, and get an overview of basic concepts • Journey through the world of writing and configuring workflows • Learn how the Oozie coordinator schedules and executes workflows based on triggers • Understand how Oozie manages data dependencies • Use Oozie bundles to package several coordinator apps into a data pipeline • Learn about security features and shared library management • Implement custom extensions and write your own EL functions and actions • Debug workflows and manage Oozie’s operational details

IBM z13 Technical Guide

Digital business has been driving the transformation of underlying IT infrastructure to be more efficient, secure, adaptive, and integrated. Information Technology (IT) must be able to handle the explosive growth of mobile clients and employees. IT also must be able to use enormous amounts of data to provide deep and real-time insights to help achieve the greatest business impact. This IBM® Redbooks® publication addresses the new IBM mainframe, the IBM z13. The IBM z13 is the trusted enterprise platform for integrating data, transactions, and insight. A data-centric infrastructure must always be available with 99.999% or better availability, have flawless data integrity, and be secured from misuse. It needs to be an integrated infrastructure that can support new applications. It needs to have integrated capabilities that can provide new mobile capabilities with real-time analytics delivered by a secure cloud infrastructure. IBM z13 is designed with improved scalability, performance, security, resiliency, availability, and virtualization. The superscalar design allows the z13 to deliver a record level of capacity over the prior z Systems. In its maximum configuration, z13 is powered by up to 141 client characterizable microprocessors (cores) running at 5 GHz. This configuration can run more than 110,000 millions of instructions per second (MIPS) and support up to 10 TB of client memory. The IBM z13 Model NE1 is estimated to provide up to 40% more total system capacity than the IBM zEnterprise® EC12 (zEC12) Model HA1. This book provides information about the IBM z13 and its functions, features, and associated software support. Greater detail is offered in areas relevant to technical planning. It is intended for systems engineers, consultants, planners, and anyone who wants to understand the IBM z Systems functions and plan for their usage. It is not intended as an introduction to mainframes.
Readers are expected to be generally familiar with existing IBM z Systems technology and terminology.