talk-data.com

Event

O'Reilly Data Engineering Books

2001-10-19 – 2027-05-25 Oreilly

Activities tracked

3406

Collection of O'Reilly books on Data Engineering.

Sessions & talks

IBM Information Server: Integration and Governance for Emerging Data Warehouse Demands

This IBM® Redbooks® publication is intended for business leaders and IT architects who are responsible for building and extending their data warehouse and Business Intelligence infrastructure. It provides an overview of powerful new capabilities of Information Server in the areas of big data, statistical models, data governance and data quality. The book also provides key technical details that IT professionals can use in solution planning, design, and implementation.

Database Cloud Storage

Implement a Centralized Cloud Storage Infrastructure with Oracle Automatic Storage Management
Build and manage a scalable, highly available cloud storage solution. Filled with detailed examples and best practices, this Oracle Press guide explains how to set up a complete cloud-based storage system using Oracle Automatic Storage Management. Find out how to prepare hardware, build disk groups, efficiently allocate storage space, and handle security. Database Cloud Storage: The Essential Guide to Oracle Automatic Storage Management shows how to monitor your system, maximize throughput, and ensure consistency across servers and clusters.
- Set up and configure Oracle Automatic Storage Management
- Discover and manage disks and establish disk groups
- Create, clone, and administer Oracle databases
- Consolidate resources with Oracle Private Database Cloud
- Control access, encrypt files, and assign user privileges
- Integrate replication, file tagging, and automatic failover
- Employ pre-engineered private cloud database consolidation tools
- Check for data consistency and resync failed disks
Code examples in the book are available for download.

Learning SPARQL, 2nd Edition

Gain hands-on experience with SPARQL, the RDF query language that’s bringing new possibilities to semantic web, linked data, and big data projects. This updated and expanded edition shows you how to use SPARQL 1.1 with a variety of tools to retrieve, manipulate, and federate data from the public web as well as from private sources. Author Bob DuCharme has you writing simple queries right away before providing background on how SPARQL fits into RDF technologies. Using short examples that you can run yourself with open source software, you’ll learn how to update, add to, and delete data in RDF datasets.
- Get the big picture on RDF, linked data, and the semantic web
- Use SPARQL to find bad data and create new data from existing data
- Use datatype metadata and functions in your queries
- Learn techniques and tools to help your queries run more efficiently
- Use RDF Schemas and OWL ontologies to extend the power of your queries
- Discover the roles that SPARQL can play in your applications

Apache Sqoop Cookbook

Integrating data from multiple sources is essential in the age of big data, but it can be a challenging and time-consuming task. This handy cookbook provides dozens of ready-to-use recipes for using Apache Sqoop, the command-line interface application that optimizes data transfers between relational databases and Hadoop. Sqoop is both powerful and bewildering, but with this cookbook’s problem-solution-discussion format, you’ll quickly learn how to deploy and then apply Sqoop in your environment. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems.
- Transfer data from a single database table into your Hadoop ecosystem
- Keep table data and Hadoop in sync by importing data incrementally
- Import data from more than one database table
- Customize transferred data by calling various database functions
- Export generated, processed, or backed-up data from Hadoop to your database
- Run Sqoop within Oozie, Hadoop’s specialized workflow scheduler
- Load data into Hadoop’s data warehouse (Hive) or database (HBase)
- Handle installation, connection, and syntax issues common to specific database vendors

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition

Updated new edition of Ralph Kimball's groundbreaking book on dimensional modeling for data warehousing and business intelligence! The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more.
- Authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence
- Begins with fundamental design recommendations and progresses through increasingly complex scenarios
- Presents unique modeling techniques for business applications such as inventory management, procurement, invoicing, accounting, customer relationship management, big data analytics, and more
- Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, education, health care, insurance, e-commerce, and more
Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.

Disruptive Possibilities: How Big Data Changes Everything

Big data has more disruptive potential than any information technology developed in the past 40 years. As author Jeffrey Needham points out in this revealing book, big data can provide unprecedented visibility into the operational efficiency of enterprises and agencies. Disruptive Possibilities provides a historically informed overview through a wide range of topics, from the evolution of commodity supercomputing and the simplicity of big data technology, to the ways conventional clouds differ from Hadoop analytics clouds. This relentlessly innovative form of computing will soon become standard practice for organizations of any size attempting to derive insight from the tsunami of data engulfing them. Replacing legacy silos (whether they’re infrastructure, organizational, or vendor silos) with a platform-centric perspective is just one of the big stories of big data. To reap maximum value from the myriad forms of data, organizations and vendors will have to adopt highly collaborative habits and methodologies.

IBM System Storage Tape Library Guide for Open Systems

This IBM® Redbooks® publication presents a general introduction to Linear Tape-Open (LTO) technology and the implementation of corresponding IBM products. The IBM Enterprise 3592 Tape Drive also is described. This tenth edition includes information about the latest enhancements to the IBM Ultrium family of tape drives and tape libraries. In particular, it includes details of the latest IBM LTO Ultrium 6 tape drive technology and its implementation in IBM tape libraries. Information is included about the recently released, enhanced, higher-performance ProtecTIER servers and the features of the new version 3.2 server software. The new software also enables a new feature, the File System Interface (FSI). It also contains technical information about each IBM tape product for Open Systems. It includes generalized sections about Small Computer System Interface (SCSI) and Fibre Channel connections and multipath architecture configurations. This book also includes information about tools and techniques for library management. This edition includes details about Tape System Library Manager (TSLM). TSLM provides consolidation and simplification in large TS3500 Tape Library environments, including the IBM Shuttle Complex. This publication is intended for anyone who wants to understand more about IBM tape products and their implementation. This book is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists. If you do not have a background in computer tape storage products, you might need to reference other sources of information. In the interest of being concise, topics that are generally understood are not covered in detail.

IBM XIV Storage System Copy Services and Migration

This IBM® Redbooks® publication provides a practical understanding of the IBM XIV® Storage System copy and migration functions. The XIV Storage System has a rich set of copy functions suited for various data protection scenarios, which enables clients to enhance their business continuance, data migration, and online backup solutions. These functions allow point-in-time copies, known as snapshots and full volume copies, and also include remote copy capabilities in either synchronous or asynchronous mode. These functions are included in the XIV software and all their features are available at no additional charge. The various copy functions are reviewed in separate chapters, which include detailed information about usage, and also practical illustrations. This book also explains the XIV built-in migration capability, and presents migration alternatives based on the SAN Volume Controller (SVC). Finally, the book illustrates the use of IBM Tivoli® Storage Productivity Center for Replication to manage XIV Copy Services. This book is intended for anyone who needs a detailed and practical understanding of the XIV copy functions.

Implementing the IBM Storwize V7000 Unified

In this IBM® Redbooks® publication we introduce a new product, the IBM Storwize® V7000 Unified (V7000U). Storwize V7000 Unified is a virtualized storage system designed to consolidate block and file workloads into a single storage system. Advantages include simplicity of management, reduced cost, highly scalable capacity, performance, and high availability. Storwize V7000 Unified storage also offers improved efficiency and flexibility through built-in solid-state drive (SSD) optimization, thin provisioning, IBM Real-time Compression™, and nondisruptive migration of data from existing storage. The system can virtualize and reuse existing disk systems offering a greater potential return on investment. We suggest that you familiarize yourself with the following books to get the most from this publication: Implementing the IBM Storwize V7000 V6.3, SG24-7938

Server Time Protocol Implementation Guide

Server Time Protocol (STP) is a server-wide facility that is implemented in the Licensed Internal Code (LIC) of IBM® zEnterprise EC12 (zEC12), IBM zEnterprise 196 (z196), IBM zEnterprise 114 (z114), IBM System z10®, and IBM System z9®. It provides improved time synchronization in both sysplex and non-sysplex configurations. This IBM Redbooks® publication will help you configure a Mixed Coordinated Timing Network (CTN) or an STP-only CTN. It is intended for technical support personnel requiring information about installing and configuring a Coordinated Timing Network. Readers are expected to be familiar with IBM System z technology and terminology. For planning information, see our companion book, Server Time Protocol Planning Guide, SG24-7280. For information about how to recover your STP environment functionality, see the Server Time Protocol Recovery Guide, SG24-7380.

Big Data Imperatives: Enterprise 'Big Data' Warehouse, 'BI' Implementations and Analytics

Big Data Imperatives focuses on resolving the key questions on everyone's mind: Which data matters? Do you have enough data volume to justify the usage? How do you want to process this amount of data? How long do you really need to keep it active for your analysis, marketing, and BI applications? Big data is emerging from the realm of one-off projects to mainstream business adoption; however, the real value of big data is not in the overwhelming size of it, but more in its effective use. This book addresses the following big data characteristics:
- Very large, distributed aggregations of loosely structured data - often incomplete and inaccessible
- Petabytes/Exabytes of data
- Millions/billions of people providing/contributing to the context behind the data
- Flat schemas with few complex interrelationships
- Involves time-stamped events
- Made up of incomplete data
- Includes connections between data elements that must be probabilistically inferred
Big Data Imperatives explains 'what big data can do'. It can batch process millions and billions of records, both unstructured and structured, much faster and cheaper. Big data analytics provide a platform to merge all analysis, which enables data analysis to be more accurate, well-rounded, reliable, and focused on a specific business capability. Big Data Imperatives describes the complementary nature of traditional data warehouses and big-data analytics platforms and how they feed each other. This book aims to bring the big data and analytics realms together with a greater focus on architectures that leverage the scale and power of big data and the ability to integrate and apply analytics principles to data which earlier was not accessible. This book can also be used as a handbook for practitioners, helping them with methodology, technical architecture, analytics techniques, and best practices. At the same time, this book intends to hold the interest of those new to big data and analytics by giving them a deep insight into the realm of big data.
What you'll learn:
- Understanding the technology, implementation of big data platforms and their usage for analytics
- Big data architectures
- Big data design patterns
- Implementation best practices
Who this book is for: This book is designed for IT professionals, data warehousing and business intelligence professionals, data analysis professionals, architects, developers, and business users.

Pro Hibernate and MongoDB

Hibernate and MongoDB are a powerful combination of open source persistence and NoSQL technologies for today's Java-based enterprise and cloud application developers. Hibernate is the leading open source Java-based persistence, object relational management engine, recently repositioned as an object grid management engine. MongoDB is a growing, popular open source NoSQL framework, especially popular among cloud application and big data developers. With these two, enterprise and cloud developers have a "complete out of the box" solution. Pro Hibernate and MongoDB shows you how to use and integrate Hibernate and MongoDB. More specifically, this book guides you through the bootstrap; building transactions; handling queries and query entities; and mappings. Then, this book explores the principles and techniques for taking these application principles to the cloud, using the OpenShift Platform as a Service (PaaS) and more. In this book, you get two case studies: an enterprise application using Hibernate and MongoDB, and then a cloud application (OpenShift) migrated from the enterprise application case study. After reading or using this book, you come away with the experience from two case studies that give you possible frameworks or templates that you can apply to your own specific application or cloud application building context.
What you'll learn:
- How to use and integrate Hibernate and MongoDB to be your "complete out of the box" solution for database driven enterprise and cloud applications
- How to bootstrap; run in supported environments; do transactions; handle queries and query entities; and mappings
- How to build an enterprise application case study using Hibernate and MongoDB
- What are the principles and techniques for taking applications to the Cloud, using the OpenShift Platform as a Service (PaaS) and more
- How to build a cloud-based app or application (OpenShift)
Who this book is for: This book is for experienced Java, enterprise Java programmers who may have some experience with Hibernate and/or MongoDB.

Agent-Based Modeling and Simulation with Swarm

A thorough overview of multi-agent simulation and supporting tools, this book provides the methodology for a multi-agent-based modeling approach that integrates computational techniques such as artificial life, cellular automata, and bio-inspired optimization. It shows how this type of simulation is used to acquire an understanding of complex systems and artificial life. The author carefully explains how to construct a simulation program for various applications. Swarm-based software and source code are available on his website.

Oracle Data Guard 11gR2 Administration: Beginner's Guide

Dive into "Oracle Data Guard 11gR2 Administration: Beginner's Guide" and start mastering data protection and high availability for Oracle Databases. This guide breaks down the essentials of setting up and managing Oracle Data Guard configurations, equipping you with knowledge and skills through step-by-step examples.
What this Book will help me do: Learn to configure Oracle Data Guard and manage its essential components. Gain expertise in performing role transitions such as switchover and failover between databases. Use Data Guard Broker for streamlined management of your high availability setup. Understand best practices for patching and maintaining Oracle Data Guard environments. Integrate Data Guard with advanced Oracle features like RAC and RMAN for optimal performance.
Author(s): While the specific authors of "Oracle Data Guard 11gR2 Administration: Beginner's Guide" are not listed, it is written by experts in Oracle Database Administration with substantial experience working on high availability setups and disaster recovery solutions. The book distills their expertise into an accessible format.
Who is it for? This guide is perfect for Oracle Database Administrators looking to deepen their knowledge in setting up and managing an Oracle Data Guard environment. Whether you're just starting out or have some experience, it provides hands-on instructions and practical examples to elevate your skills. Its user-friendly approach appeals to tech-savvy professionals aiming to protect their data and ensure system availability effectively.

Oracle SOA BPEL Process Manager 11gR1 - A Hands-on Tutorial

Delve into the world of Oracle SOA BPEL Process Manager and master the skills required for designing, deploying, and managing business process applications. In this book, you will learn to implement and optimize SOA services and BPEL processes, enabling you to tackle real-world challenges with confidence. Gain hands-on experience through detailed examples and practical exercises.
What this Book will help me do: Understand and utilize the BPEL standard for defining business processes in a SOA context. Develop, configure, and test BPEL processes using Oracle SOA Suite and JDeveloper. Gain expertise in deploying, debugging, and troubleshooting BPEL processes effectively. Learn techniques for integrating BPEL with other SOA suite components like OSB and BAM. Explore advanced topics such as performance tuning and implementing high availability strategies.
Author(s): The authors of this hands-on guide are seasoned experts in service-oriented architecture (SOA) and integration technologies, bringing decades of industry experience to their teachings. They have trained and worked with numerous organizations to design and implement robust SOA solutions, particularly leveraging Oracle technology. Their approachable writing style makes complex technical concepts accessible for learners at all levels.
Who is it for? This book is designed for SOA developers, architects, and administrators aiming to master Oracle BPEL Process Manager 11gR1. Ideal for professionals with a basic understanding of SOA concepts looking to deepen their skills in BPEL. It's suitable for those interested in building business process applications or managing Oracle SOA solutions. Also, it's an excellent resource for enhancing practical expertise.

IBM System z Personal Development Tool: Volume 2 Installation and Basic Use

This IBM® Redbooks® publication introduces the IBM System z® Personal Development Tool (zPDT®), which runs on an underlying Linux system based on an Intel processor. zPDT provides a System z system on a PC capable of running current System z operating systems, including emulation of selected System z I/O devices and control units. It is intended as a development, demonstration, and learning platform and is not designed as a production system. This book, providing specific installation instructions, is the second of three volumes. The first volume describes the general concepts of zPDT and a syntax reference for zPDT commands and device managers. The third volume discusses more advanced topics that may not interest all zPDT users. The IBM order numbers for the three volumes are SG24-7721, SG24-7722, and SG24-7723. The systems discussed in these volumes are complex, with elements of Linux (for the underlying PC machine), IBM z/Architecture® (for the core zPDT elements), System z I/O functions (for emulated I/O devices), and IBM z/OS® (providing the System z application interface), and possibly with other System z operating systems. We assume the reader is familiar with the general concepts and terminology of System z hardware and software elements and with basic PC Linux characteristics.

IBM System z Personal Development Tool: Volume 3 Additional Topics

This IBM® Redbooks® publication introduces the IBM System z® Personal Development Tool (zPDT), which runs on an underlying Linux system based on an Intel processor. zPDT provides a System z system on a PC capable of running current System z operating systems, including emulation of selected System z I/O devices and control units. It is intended as a development, demonstration, and learning platform; it is not designed as a production system. This book, discussing more advanced topics, is the last of three volumes. The first volume introduces zPDT and provides reference material for zPDT commands and device managers. The second volume describes the installation of zPDT (including the underlying Linux, and a particular z/OS® distribution) and basic usage patterns. The third volume discusses more advanced topics that may not interest all zPDT users. The IBM order numbers for the three volumes are SG24-7721, SG24-7722, and SG24-7723. The systems discussed in these volumes are complex, with elements of Linux (for the underlying PC machine), z/Architecture® (for the core zPDT elements), System z I/O functions (for emulated I/O devices), and z/OS (providing the System z application interface), and possibly with other System z operating systems. We assume the reader is familiar with the general concepts and terminology of System z hardware and software elements and with basic PC Linux characteristics.

Implementing IBM SmartCloud Entry on IBM PureFlex System

Distributed computing has been transformed with the introduction of virtualization technology. This has driven a re-architecture of traditional data center workload placement. In 2012, IBM® announced IBM PureSystems™, a new offering based on preconfigured software, servers, and storage that form an expert integrated system. Expert integrated systems now combine traditional IT resources into a single optimized solution, with prepackaged components including servers, storage devices, networking equipment, and software. With this evolution of technology, we move from discrete, siloed, and underutilized IT resources to shared resource pools. This IBM Redbooks® publication can help you install, tailor, and configure IBM SmartCloud® Entry on the IBM PureFlex™ System offering. This book is intended for anyone who wants to learn more about cloud computing with IBM SmartCloud Entry and offerings based on IBM Flex System™ elements.

Oracle WebLogic Server 12c Advanced Administration Cookbook

Dive into advanced administration of Oracle WebLogic Server 12c with this cookbook-style guide. With 70 practical recipes, you'll gain the knowledge and skills to handle both basic and complex administrative tasks effectively. This book provides expert guidance to help you elevate your WebLogic server management proficiency to the next level.
What this Book will help me do: Master the installation and configuration of Oracle WebLogic Server 12c for efficient system setup. Configure high-availability administration servers to ensure consistent and reliable performance. Effectively create and manage different types of JDBC data sources and multi-data sources for database connectivity. Utilize WebLogic Diagnostic Framework to monitor performance and handle threshold notifications. Optimize server settings for enhanced stability and resilience to meet enterprise needs.
Author(s): Dalton Iwazaki is a seasoned professional with extensive experience in managing and deploying Oracle WebLogic Servers in various enterprise environments. With years of hands-on expertise, Dalton shares his practical insights and tested methods throughout his writing. His approachable style and thorough teaching make complex topics accessible to all technical readers.
Who is it for? This book is designed for system administrators, developers, and IT professionals who want to master the administration of Oracle WebLogic Server 12c. Readers should have basic knowledge of the Java platform and web server concepts. Whether you're a beginner looking to establish foundational skills or an experienced professional seeking advanced configurations, this cookbook is tailored for your learning.

Redis in Action

Redis in Action introduces Redis and walks you through examples that demonstrate how to use it effectively. You'll begin by getting Redis set up properly and then exploring the key-value model. Then, you'll dive into real use cases including simple caching, distributed ad targeting, and more. You'll learn how to scale Redis from small jobs to massive datasets. Experienced developers will appreciate chapters on clustering and internal scripting to make Redis easier to use.
About the Technology: When you need near-real-time access to a fast-moving data stream, key-value stores like Redis are the way to go. Redis expands on the key-value pattern by accepting a wide variety of data types, including hashes, strings, lists, and other structures. It provides lightning-fast operations on in-memory datasets, and also makes it easy to persist to disk on the fly. Plus, it's free and open source.
What's Inside:
- Redis from the ground up
- Preprocessing real-time data
- Managing in-memory datasets
- Pub/sub and configuration
- Persisting to disk
About the Reader: Written for developers familiar with database concepts. No prior exposure to Redis or other NoSQL databases required. Appropriate for systems administrators comfortable with programming.
About the Author: Dr. Josiah L. Carlson is a seasoned database professional and an active contributor to the Redis community.
Quotes:
- A great addition to the Redis ecosystem. - From the Foreword by Salvatore Sanfilippo, Creator of Redis
- The examples, taken from real-world use cases, are one of the major strengths of the book. - Filippo Pacini, SG Consulting
- From beginner to expert with real and comprehensive examples. - Felipe Gutierrez, VMware/Spring Source
- Excellent in-depth analysis ... insightful real-world examples. - Bobby Abraham, Integri LLC
- Pure gold! - Leo Cassarani, Unboxed Consulting

Internet and Surveillance

The Internet has been transformed in the past years from a system primarily oriented on information provision into a medium for communication and community-building. The notion of “Web 2.0”, social software, and social networking sites such as Facebook, Twitter and MySpace have emerged in this context. With such platforms comes the massive provision and storage of personal data that are systematically evaluated, marketed, and used for targeting users with advertising. In a world of global economic competition, economic crisis, and fear of terrorism after 9/11, both corporations and state institutions have a growing interest in accessing this personal data. Here, contributors explore this changing landscape by addressing topics such as commercial data collection by advertising, consumer sites and interactive media; self-disclosure in the social web; surveillance of file-sharers; privacy in the age of the internet; civil watch-surveillance on social networking sites; and networked interactive surveillance in transnational space. This book is a result of a research action launched by the intergovernmental network COST (European Cooperation in Science and Technology).

Microsoft SQL Server 2012 Administration: Real-World Skills for MCSA Certification and Beyond (Exams 70-461, 70-462, and 70-463)

Implement, maintain, and repair SQL Server 2012 databases
As the most significant update since 2008, Microsoft SQL Server 2012 boasts updates and new features that are critical to understand. Whether you manage and administer SQL Server 2012 or are planning to get your MCSA: SQL Server 2012 certification, this book is the perfect supplement to your learning and preparation. From understanding SQL Server's roles to implementing business intelligence and reporting, this practical book explores tasks and scenarios that a working SQL Server DBA faces regularly and shows you step by step how to handle them.
- Includes practice exams and coverage of exam objectives for those seeking MCSA: SQL Server 2012 certification
- Explores the skills you'll need on the job as a SQL Server 2012 DBA
- Discusses designing and implementing database solutions
- Walks you through administrating, maintaining, and securing SQL Server 2012
- Addresses implementing high availability and data distribution
- Includes bonus videos where the author walks you through some of the more difficult tasks expected of a DBA
Featuring hands-on exercises and real-world scenarios, this resource guides you through the essentials of implementing, maintaining, and repairing SQL Server 2012 databases. Note: The ebook version does not provide access to the companion files.

Structural Equation Modeling with Mplus

Modeled after Barbara Byrne’s other best-selling structural equation modeling (SEM) books, this practical guide reviews the basic concepts and applications of SEM using Mplus Versions 5 & 6. The author reviews SEM applications based on actual data taken from her own research. Using non-mathematical language, it is written for the novice SEM user. With each application chapter, the author "walks" the reader through all steps involved in testing the SEM model including:
- an explanation of the issues addressed
- illustrated and annotated testing of the hypothesized and post hoc models
- explanation and interpretation of all Mplus input and output files
- important caveats pertinent to the SEM application under study
- a description of the data and reference upon which the model was based
- the corresponding data and syntax files available under "Supplementary Material" below
The first two chapters introduce the fundamental concepts of SEM and important basics of the Mplus program. The remaining chapters focus on SEM applications and include a variety of SEM models presented within the context of three sections: Single-group analyses, Multiple-group analyses, and other important topics, the latter of which includes the multitrait-multimethod, latent growth curve, and multilevel models. Intended for researchers, practitioners, and students who use SEM and Mplus, this book is an ideal resource for graduate level courses on SEM taught in psychology, education, business, and other social and health sciences and/or as a supplement for courses on applied statistics, multivariate statistics, intermediate or advanced statistics, and/or research design. Appropriate for those with limited exposure to SEM or Mplus, a prerequisite of basic statistics through regression analysis is recommended.