talk-data.com

Topic: data-engineering (3395 tagged activities)
Activity trend: 2020-Q1 to 2026-Q1

Activities
3395 activities · Newest first

IBM Flex System and PureFlex System Network Implementation with Juniper Networks

To meet today's complex and ever-changing business demands, you need a solid foundation of server, storage, networking, and software resources that is simple to deploy and can quickly and automatically adapt to changing conditions. You also need access to, and the ability to take advantage of, broad expertise and proven best practices in systems management, applications, hardware maintenance, and more. IBM® PureFlex™ System, part of the IBM PureSystems™ family of expert integrated systems, combines advanced IBM hardware and software with patterns of expertise and integrates them into three optimized configurations that are simple to acquire and deploy, so you can achieve faster time to value. If you want a pre-configured, pre-integrated infrastructure with integrated management and cloud capabilities, factory-tuned by IBM as a hybrid x86 and Power solution, IBM PureFlex System is the answer. In this IBM Redbooks® publication, we use EX4500 core switches to demonstrate interoperability with the System Networking switches (the RackSwitch™ G8264 top-of-rack switch and the Flex System Fabric EN4093 10Gb scalable switch). We also describe a redundant environment using QFX3500 switches, running IBM Virtual Link Aggregation Group (vLAG) and Juniper Multi-Chassis Link Aggregation Group (MC-LAG).

IBM XIV Storage System Gen3: Architecture, Implementation, and Usage

This IBM® Redbooks® publication describes the concepts, architecture, and implementation of the IBM XIV® Storage System. The XIV Storage System is a scalable enterprise storage system based on a grid array of hardware components. It can attach to both Fibre Channel Protocol (FCP) and IP network Small Computer System Interface (iSCSI) capable hosts. This system is a good fit for clients who want to grow capacity without managing multiple tiers of storage. The XIV Storage System is suited to mixed or random-access workloads, including online transaction processing, video streaming, images, email, and emerging workload areas such as Web 2.0 and storage cloud. The focus of this edition is the XIV Gen3 hardware Release 3.2, running Version 11.2 of the XIV system software. With this version, the XIV Storage System offers up to five times the iSCSI throughput with new 10 GbE ports, a performance boost with new CPUs, and enhanced caching with optional solid-state drives (SSDs). The IBM XIV software Version 11.2 also offers support for Windows Server 2012, including space reclamation, and enables drive rebuild times as fast as 26 minutes for a fully utilized 2 TB hard disk drive under heavy load. In the first few chapters of this book, we describe many of the unique and powerful concepts that form the basis of the XIV Storage System logical and physical architecture. We explain how the system is designed to eliminate direct dependencies between the hardware elements and the software that governs the system. In subsequent chapters, we explain the planning and preparation tasks that are required to deploy the system in your environment. A step-by-step procedure describes how to configure and administer the system, with illustrations of how to perform those tasks by using the intuitive, yet powerful XIV Storage Manager GUI or the XIV command-line interface (XCLI).
We describe the performance characteristics of the XIV Storage System and present options that are available for alerting and monitoring, including an enhanced secure remote support capability. This book is intended for IT professionals who want an understanding of the XIV Storage System, and for readers who need detailed advice on how to configure and use the system.

Enterprise Data Workflows with Cascading

There is an easier way to build Hadoop applications. With this hands-on book, you'll learn how to use Cascading, the open source abstraction framework for Hadoop that lets you easily create and manage powerful enterprise-grade data processing applications, without having to learn the intricacies of MapReduce. Working with sample apps based on Java and other JVM languages, you'll quickly learn Cascading's streamlined approach to data processing, data filtering, and workflow optimization. This book demonstrates how this framework can help your business extract meaningful information from large amounts of distributed data.
- Start working on Cascading example projects right away
- Model and analyze unstructured data in any format, from any source
- Build and test applications with familiar constructs and reusable components
- Work with the Scalding and Cascalog Domain-Specific Languages
- Easily deploy applications to Hadoop, regardless of cluster location or data size
- Build workflows that integrate several big data frameworks and processes
- Explore common use cases for Cascading, including features and tools that support them
- Examine a case study that uses a dataset from the Open Data Initiative

Software Development on the SAP HANA Platform

Software Development on the SAP HANA Platform equips you with all the knowledge you need to master developing on this high-performance in-memory technology. From setup and installation to deploying fully functional HANA applications, this book guides you step by step. With hands-on chapters, you'll gain the analytical tools and data management proficiency needed to excel.
What this book will help me do:
- Set up an SAP HANA development environment from scratch.
- Successfully execute your first development project on SAP HANA.
- Utilize each type of view in SAP HANA effectively for data manipulation.
- Create users with appropriate authorizations for reporting purposes.
- Deploy reporting applications to end-user software seamlessly.
Author(s): Mark Walker is a seasoned expert in SAP HANA, with years of professional experience in enterprise software development and training. He brings a passion for teaching complex technologies in an approachable and practical way. Mark's hands-on approach ensures that readers not only learn but can confidently apply their new skills.
Who is it for? This book is designed for software developers and data professionals looking to expand their expertise with SAP HANA. It is ideal for those new to the platform or for professionals enhancing their analytical and data management skills. Whether you're starting from scratch or upgrading your capabilities, the lessons here will assist in reaching your SAP HANA proficiency goals.

IBM Information Server: Integration and Governance for Emerging Data Warehouse Demands

This IBM® Redbooks® publication is intended for business leaders and IT architects who are responsible for building and extending their data warehouse and Business Intelligence infrastructure. It provides an overview of powerful new capabilities of Information Server in the areas of big data, statistical models, data governance and data quality. The book also provides key technical details that IT professionals can use in solution planning, design, and implementation.

Database Cloud Storage

Implement a Centralized Cloud Storage Infrastructure with Oracle Automatic Storage Management
Build and manage a scalable, highly available cloud storage solution. Filled with detailed examples and best practices, this Oracle Press guide explains how to set up a complete cloud-based storage system using Oracle Automatic Storage Management. Find out how to prepare hardware, build disk groups, efficiently allocate storage space, and handle security. Database Cloud Storage: The Essential Guide to Oracle Automatic Storage Management shows how to monitor your system, maximize throughput, and ensure consistency across servers and clusters.
- Set up and configure Oracle Automatic Storage Management
- Discover and manage disks and establish disk groups
- Create, clone, and administer Oracle databases
- Consolidate resources with Oracle Private Database Cloud
- Control access, encrypt files, and assign user privileges
- Integrate replication, file tagging, and automatic failover
- Employ pre-engineered private cloud database consolidation tools
- Check for data consistency and resync failed disks
Code examples in the book are available for download.

Learning SPARQL, 2nd Edition

Gain hands-on experience with SPARQL, the RDF query language that's bringing new possibilities to semantic web, linked data, and big data projects. This updated and expanded edition shows you how to use SPARQL 1.1 with a variety of tools to retrieve, manipulate, and federate data from the public web as well as from private sources. Author Bob DuCharme has you writing simple queries right away before providing background on how SPARQL fits into RDF technologies. Using short examples that you can run yourself with open source software, you'll learn how to update, add to, and delete data in RDF datasets.
- Get the big picture on RDF, linked data, and the semantic web
- Use SPARQL to find bad data and create new data from existing data
- Use datatype metadata and functions in your queries
- Learn techniques and tools to help your queries run more efficiently
- Use RDF Schemas and OWL ontologies to extend the power of your queries
- Discover the roles that SPARQL can play in your applications
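To give a flavor of the kind of query the book starts with, here is a minimal SELECT sketch over a small FOAF dataset. The foaf: prefix is the standard FOAF vocabulary; the data the patterns match ("Alice" and her acquaintances) is assumed for illustration:

```sparql
# Find the names of everyone a person named "Alice" knows (hypothetical data).
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?friendName
WHERE {
  ?person foaf:name  "Alice" .
  ?person foaf:knows ?friend .
  ?friend foaf:name  ?friendName .
}
```

Each triple pattern narrows the match, and shared variables such as ?person and ?friend join the patterns together, which is the core mechanic the book builds on.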

Apache Sqoop Cookbook

Integrating data from multiple sources is essential in the age of big data, but it can be a challenging and time-consuming task. This handy cookbook provides dozens of ready-to-use recipes for using Apache Sqoop, the command-line interface application that optimizes data transfers between relational databases and Hadoop. Sqoop is both powerful and bewildering, but with this cookbook's problem-solution-discussion format, you'll quickly learn how to deploy and then apply Sqoop in your environment. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems.
- Transfer data from a single database table into your Hadoop ecosystem
- Keep table data and Hadoop in sync by importing data incrementally
- Import data from more than one database table
- Customize transferred data by calling various database functions
- Export generated, processed, or backed-up data from Hadoop to your database
- Run Sqoop within Oozie, Hadoop's specialized workflow scheduler
- Load data into Hadoop's data warehouse (Hive) or database (HBase)
- Handle installation, connection, and syntax issues common to specific database vendors
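The first two recipes above can be sketched as Sqoop invocations. The flags shown are standard Sqoop import options, but the host, database, table, and paths are hypothetical, and running them requires a working Sqoop and Hadoop installation:

```shell
# One-shot import of a single MySQL table into HDFS (hypothetical names).
sqoop import \
  --connect jdbc:mysql://db.example.com/shop \
  --username sqoop_user \
  --table orders \
  --target-dir /data/orders

# Incremental re-import: append only rows whose id exceeds the last value seen.
sqoop import \
  --connect jdbc:mysql://db.example.com/shop \
  --username sqoop_user \
  --table orders \
  --target-dir /data/orders \
  --incremental append \
  --check-column id \
  --last-value 0
```

The same shape adapts to other vendors by swapping the JDBC connection string, which is the pattern the cookbook's recipes follow.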

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition

Updated new edition of Ralph Kimball's groundbreaking book on dimensional modeling for data warehousing and business intelligence! The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more.
- Authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence
- Begins with fundamental design recommendations and progresses through increasingly complex scenarios
- Presents unique modeling techniques for business applications such as inventory management, procurement, invoicing, accounting, customer relationship management, big data analytics, and more
- Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, education, health care, insurance, e-commerce, and more
Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.
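The star schema at the heart of dimensional modeling, a central fact table joined to descriptive dimension tables, can be sketched with nothing more than SQLite. All table, column, and product names below are invented for illustration:

```python
import sqlite3

# Build a tiny star schema in memory: one fact table and two dimension tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT)")
cur.execute("CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT)")
cur.execute("""CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER,
    amount REAL)""")

cur.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                [(20130101, 2013, "Q1"), (20130401, 2013, "Q2")])
cur.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                [(20130101, 1, 3, 30.0), (20130101, 2, 1, 25.0), (20130401, 1, 2, 20.0)])

# A typical dimensional query: slice the additive facts by dimension attributes.
rows = cur.execute("""
    SELECT d.quarter, p.name, SUM(f.amount) AS revenue
    FROM fact_sales f
    JOIN dim_date d ON f.date_key = d.date_key
    JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY d.quarter, p.name
    ORDER BY d.quarter, p.name
""").fetchall()
print(rows)  # [('Q1', 'Gadget', 25.0), ('Q1', 'Widget', 30.0), ('Q2', 'Widget', 20.0)]
```

The facts stay narrow and numeric while the dimensions carry the descriptive attributes users filter and group by, which is what makes these schemas easy to understand and fast to query.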

Disruptive Possibilities: How Big Data Changes Everything

Big data has more disruptive potential than any information technology developed in the past 40 years. As author Jeffrey Needham points out in this revealing book, big data can provide unprecedented visibility into the operational efficiency of enterprises and agencies. Disruptive Possibilities provides an historically-informed overview through a wide range of topics, from the evolution of commodity supercomputing and the simplicity of big data technology, to the ways conventional clouds differ from Hadoop analytics clouds. This relentlessly innovative form of computing will soon become standard practice for organizations of any size attempting to derive insight from the tsunami of data engulfing them. Replacing legacy silos—whether they’re infrastructure, organizational, or vendor silos—with a platform-centric perspective is just one of the big stories of big data. To reap maximum value from the myriad forms of data, organizations and vendors will have to adopt highly collaborative habits and methodologies.

IBM System Storage Tape Library Guide for Open Systems

This IBM® Redbooks® publication presents a general introduction to Linear Tape-Open (LTO) technology and the implementation of corresponding IBM products. The IBM Enterprise 3592 Tape Drive is also described. This tenth edition includes information about the latest enhancements to the IBM Ultrium family of tape drives and tape libraries. In particular, it includes details of the latest IBM LTO Ultrium 6 tape drive technology and its implementation in IBM tape libraries. Information is included about the recently released, enhanced, higher-performance ProtecTIER servers and the features of the new version 3.2 server software, which also enables a new feature, the File System Interface (FSI). The book also contains technical information about each IBM tape product for Open Systems, including generalized sections about Small Computer System Interface (SCSI) and Fibre Channel connections and multipath architecture configurations, as well as information about tools and techniques for library management. This edition includes details about Tape System Library Manager (TSLM). TSLM provides consolidation and simplification in large TS3500 Tape Library environments, including the IBM Shuttle Complex. This publication is intended for anyone who wants to understand more about IBM tape products and their implementation. This book is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists. If you do not have a background in computer tape storage products, you might need to reference other sources of information. In the interest of being concise, topics that are generally understood are not covered in detail.

IBM XIV Storage System Copy Services and Migration

This IBM® Redbooks® publication provides a practical understanding of the IBM XIV® Storage System copy and migration functions. The XIV Storage System has a rich set of copy functions suited for various data protection scenarios, which enables clients to enhance their business continuance, data migration, and online backup solutions. These functions allow point-in-time copies, known as snapshots, and full volume copies, and also include remote copy capabilities in either synchronous or asynchronous mode. These functions are included in the XIV software, and all their features are available at no additional charge. The various copy functions are reviewed in separate chapters, which include detailed information about usage, as well as practical illustrations. This book also explains the XIV built-in migration capability, and presents migration alternatives based on the SAN Volume Controller (SVC). Finally, the book illustrates the use of IBM Tivoli® Storage Productivity Center for Replication to manage XIV Copy Services. This book is intended for anyone who needs a detailed and practical understanding of the XIV copy functions.

Implementing the IBM Storwize V7000 Unified

In this IBM® Redbooks® publication we introduce a new product, the IBM Storwize® V7000 Unified (V7000U). Storwize V7000 Unified is a virtualized storage system designed to consolidate block and file workloads into a single storage system. Advantages include simplicity of management, reduced cost, highly scalable capacity, performance, and high availability. Storwize V7000 Unified storage also offers improved efficiency and flexibility through built-in solid-state drive (SSD) optimization, thin provisioning, IBM Real-time Compression™, and nondisruptive migration of data from existing storage. The system can virtualize and reuse existing disk systems offering a greater potential return on investment. We suggest that you familiarize yourself with the following books to get the most from this publication: Implementing the IBM Storwize V7000 V6.3, SG24-7938

Server Time Protocol Implementation Guide

Server Time Protocol (STP) is a server-wide facility that is implemented in the Licensed Internal Code (LIC) of IBM® zEnterprise EC12 (zEC12), IBM zEnterprise 196 (z196), IBM zEnterprise 114 (z114), IBM System z10®, and IBM System z9®. It provides improved time synchronization in both sysplex and non-sysplex configurations. This IBM Redbooks® publication will help you configure a Mixed Coordinated Timing Network (CTN) or an STP-only CTN. It is intended for technical support personnel requiring information about:
- Installing and configuring a Coordinated Timing Network
Readers are expected to be familiar with IBM System z technology and terminology. For planning information, see our companion book, Server Time Protocol Planning Guide, SG24-7280. For information about how to recover your STP environment functionality, see the Server Time Protocol Recovery Guide, SG24-7380.

Big Data Imperatives: Enterprise 'Big Data' Warehouse, 'BI' Implementations and Analytics

Big Data Imperatives focuses on resolving the key questions on everyone's mind: Which data matters? Do you have enough data volume to justify its usage? How do you want to process this amount of data? How long do you really need to keep it active for your analysis, marketing, and BI applications? Big data is emerging from the realm of one-off projects to mainstream business adoption; however, the real value of big data is not in its overwhelming size, but in its effective use. This book addresses the following big data characteristics:
- Very large, distributed aggregations of loosely structured data, often incomplete and inaccessible
- Petabytes or exabytes of data
- Millions or billions of people providing or contributing to the context behind the data
- Flat schemas with few complex interrelationships
- Time-stamped events
- Incomplete data
- Connections between data elements that must be probabilistically inferred
Big Data Imperatives explains what big data can do: it can batch process millions and billions of records, both unstructured and structured, much faster and cheaper. Big data analytics provide a platform to merge all analysis, which enables data analysis to be more accurate, well-rounded, reliable, and focused on a specific business capability. Big Data Imperatives describes the complementary nature of traditional data warehouses and big data analytics platforms and how they feed each other. This book aims to bring the big data and analytics realms together, with a greater focus on architectures that leverage the scale and power of big data and on the ability to integrate and apply analytics principles to data that was previously inaccessible. The book can also be used as a handbook for practitioners, helping them with methodology, technical architecture, analytics techniques, and best practices.
At the same time, this book intends to hold the interest of those new to big data and analytics by giving them a deep insight into the realm of big data.
What you'll learn:
- The technology and implementation of big data platforms and their usage for analytics
- Big data architectures
- Big data design patterns
- Implementation best practices
Who this book is for: This book is designed for IT professionals; data warehousing and business intelligence professionals; data analysis professionals; architects; developers; and business users.

Pro Hibernate and MongoDB

Hibernate and MongoDB are a powerful combination of open source persistence and NoSQL technologies for today's Java-based enterprise and cloud application developers. Hibernate is the leading open source Java-based persistence and object-relational management engine, recently repositioned as an object grid management engine. MongoDB is a growing, popular open source NoSQL framework, especially popular among cloud application and big data developers. Together, they give enterprise and cloud developers a "complete out of the box" solution. Pro Hibernate and MongoDB shows you how to use and integrate Hibernate and MongoDB. More specifically, this book guides you through bootstrapping, building transactions, handling queries and query entities, and mappings. It then explores the principles and techniques for taking these applications to the cloud, using the OpenShift Platform as a Service (PaaS) and more. The book includes two case studies: an enterprise application using Hibernate and MongoDB, and a cloud application (OpenShip) migrated from the enterprise application case study. After reading this book, you will come away with two case studies that give you possible frameworks or templates that you can apply to your own application or cloud application building context.
What you'll learn:
- How to use and integrate Hibernate and MongoDB as your "complete out of the box" solution for database-driven enterprise and cloud applications
- How to bootstrap, run in supported environments, handle transactions, and work with queries, query entities, and mappings
- How to build an enterprise application case study using Hibernate and MongoDB
- The principles and techniques for taking applications to the cloud, using the OpenShift Platform as a Service (PaaS) and more
- How to build a cloud-based application (OpenShip)
Who this book is for: This book is for experienced Java and enterprise Java programmers who may have some experience with Hibernate and/or MongoDB.

Agent-Based Modeling and Simulation with Swarm

A thorough overview of multi-agent simulation and supporting tools, this book provides the methodology for a multi-agent-based modeling approach that integrates computational techniques such as artificial life, cellular automata, and bio-inspired optimization. It shows how this type of simulation is used to acquire an understanding of complex systems and artificial life. The author carefully explains how to construct a simulation program for various applications. Swarm-based software and source codes are available on his website.
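The agent-based pattern the book teaches, each agent carrying its own state and update rule while a scheduler steps every agent each tick, can be sketched in a few lines. This is a minimal Python sketch of that pattern (not Swarm itself, whose software is linked from the author's website), using invented random-walker agents:

```python
import random

class Walker:
    """A minimal agent: local state (a position) plus an update rule."""
    def __init__(self, position=0):
        self.position = position

    def step(self):
        # Each tick the agent moves one unit left or right at random.
        self.position += random.choice((-1, 1))

def run(agents, ticks):
    # The scheduler: advance every agent once per tick, Swarm-style.
    for _ in range(ticks):
        for agent in agents:
            agent.step()
    return [agent.position for agent in agents]

random.seed(0)  # make runs reproducible
positions = run([Walker() for _ in range(5)], ticks=100)
print(positions)
```

Even this toy shows the method's appeal: global behavior (the spread of positions) emerges from purely local rules, and richer models swap in more interesting agents and schedules without changing the overall structure.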

Oracle Data Guard 11gR2 Administration : Beginner's Guide

Dive into "Oracle Data Guard 11gR2 Administration: Beginner's Guide" and start mastering data protection and high availability for Oracle Databases. This guide breaks down the essentials of setting up and managing Oracle Data Guard configurations, equipping you with knowledge and skills through step-by-step examples.
What this book will help me do:
- Learn to configure Oracle Data Guard and manage its essential components.
- Gain expertise in performing role transitions such as switchover and failover between databases.
- Use Data Guard Broker for streamlined management of your high availability setup.
- Understand best practices for patching and maintaining Oracle Data Guard environments.
- Integrate Data Guard with advanced Oracle features like RAC and RMAN for optimal performance.
Author(s): While the specific authors of "Oracle Data Guard 11gR2 Administration: Beginner's Guide" are not listed, it is written by experts in Oracle Database Administration with substantial experience working on high availability setups and disaster recovery solutions. The book distills their expertise into an accessible format.
Who is it for? This guide is perfect for Oracle Database Administrators looking to deepen their knowledge in setting up and managing an Oracle Data Guard environment. Whether you're just starting out or have some experience, it provides hands-on instructions and practical examples to elevate your skills. Its user-friendly approach appeals to tech-savvy professionals aiming to protect their data and ensure system availability effectively.

Oracle SOA BPEL Process Manager 11gR1 - A Hands-on Tutorial

Delve into the world of Oracle SOA BPEL Process Manager and master the skills required for designing, deploying, and managing business process applications. In this book, you will learn to implement and optimize SOA services and BPEL processes, enabling you to tackle real-world challenges with confidence. Gain hands-on experience through detailed examples and practical exercises.
What this book will help me do:
- Understand and utilize the BPEL standard for defining business processes in a SOA context.
- Develop, configure, and test BPEL processes using Oracle SOA Suite and JDeveloper.
- Gain expertise in deploying, debugging, and troubleshooting BPEL processes effectively.
- Learn techniques for integrating BPEL with other SOA Suite components like OSB and BAM.
- Explore advanced topics such as performance tuning and implementing high availability strategies.
Author(s): The authors of this hands-on guide are seasoned experts in service-oriented architecture (SOA) and integration technologies, bringing decades of industry experience to their teachings. They have trained and worked with numerous organizations to design and implement robust SOA solutions, particularly leveraging Oracle technology. Their approachable writing style makes complex technical concepts accessible for learners at all levels.
Who is it for? This book is designed for SOA developers, architects, and administrators aiming to master Oracle BPEL Process Manager 11gR1. It is ideal for professionals with a basic understanding of SOA concepts looking to deepen their skills in BPEL. It's suitable for those interested in building business process applications or managing Oracle SOA solutions, and an excellent resource for enhancing practical expertise.

IBM System z Personal Development Tool: Volume 2 Installation and Basic Use

This IBM® Redbooks® publication introduces the IBM System z® Personal Development Tool (zPDT®), which runs on an underlying Linux system based on an Intel processor. zPDT provides a System z system on a PC capable of running current System z operating systems, including emulation of selected System z I/O devices and control units. It is intended as a development, demonstration, and learning platform and is not designed as a production system. This book, providing specific installation instructions, is the second of three volumes. The first volume describes the general concepts of zPDT and a syntax reference for zPDT commands and device managers. The third volume discusses more advanced topics that may not interest all zPDT users. The IBM order numbers for the three volumes are SG24-7721, SG24-7722, and SG24-7723. The systems discussed in these volumes are complex, with elements of Linux (for the underlying PC machine), IBM z/Architecture® (for the core zPDT elements), System z I/O functions (for emulated I/O devices), and IBM z/OS® (providing the System z application interface), and possibly with other System z operating systems. We assume the reader is familiar with the general concepts and terminology of System z hardware and software elements and with basic PC Linux characteristics.