talk-data.com

Topic

data-engineering

3395

tagged

Activities

IBM TS7700 Virtualization Engine with R3.2

This IBM® Redbooks® publication highlights IBM TS7700 Virtualization Engine Release 3.2 (IBM TS7700). IBM TS7700 is part of a family of IBM System Storage® Enterprise tape products. This book is intended for system architects who want to integrate their storage systems for smoother operation. The IBM TS7700 offers a modular, scalable, and high-performing architecture for mainframe tape virtualization for the IBM System z® environment. It integrates IBM 3592 tape cartridges, high-performance disks, and a new disk cache subsystem into a storage hierarchy. This storage hierarchy is managed by robust storage management firmware with extensive self-management capability. It includes the following advanced functions:
• Policy management to control physical volume pooling
• Cache management
• Redundant copies, including across a grid network
• Copy mode control
The IBM TS7700 Virtualization Engine offers enhanced statistical reporting. It also includes a standards-based Management Interface (MI) for IBM TS7700 management. The IBM TS7700 Release 3.2 continues the next generation of IBM TS7700 Virtualization Engine servers for System z tape:
• IBM TS7720 features encryption-capable, high-capacity cache using 3 terabyte (TB) serial-attached Small Computer System Interface (SAS) disk drives with Redundant Array of Independent Disks (RAID) 6, providing the ability to scale to very large capacities with the highest level of data protection.
• IBM TS7740 features encryption-capable 600 gigabyte (GB) SAS drives with RAID 6 protection.
Both models write data by policy to physical tape through attachment to high-capacity, high-performance IBM TS1140 and earlier IBM 3592 model tape drives installed in IBM TS3500 tape libraries. Physical tape support is optional on IBM TS7720. These Virtualization Engines are based on IBM POWER7® technology. They offer improved performance for most System z tape workloads compared to the first generation of IBM TS7700 Virtualization Engine servers.
IBM TS7700 Virtualization Engine Release 3.2 builds on the existing capabilities of the IBM TS7700 family. It also includes the following enhancements to the IBM TS7700 family:
• 25 GB logical volume sizes
• Options for attaching back-end physical tape to IBM TS7720 systems
• Up to eight repository partitions in a tape-attached IBM TS7720

IBM Software Defined Environment

This IBM® Redbooks® publication introduces the IBM Software Defined Environment (SDE) solution, which helps to optimize the entire computing infrastructure--compute, storage, and network resources--so that it can adapt to the type of work required. In today's environment, resources are assigned manually to workloads, but in an SDE that assignment happens automatically. In an SDE, workloads are dynamically assigned to IT resources based on application characteristics, best-available resources, and service level policies so that they deliver continuous, dynamic optimization and reconfiguration to address infrastructure issues. Underlying all of this are policy-based compliance checks and updates in a centrally managed environment. Readers get a broad introduction to the new architecture. Think integration, automation, and optimization. Those are enablers of cloud delivery and analytics. SDE can accelerate business success by matching workloads and resources so that you have a responsive, adaptive environment. With the IBM Software Defined Environment, infrastructure is fully programmable to rapidly deploy workloads on optimal resources and to instantly respond to changing business demands. This information is intended for IBM sales representatives, IBM software architects, IBM Systems Technology Group brand specialists, distributors, resellers, and anyone who is developing or implementing SDE.

Introduction to JavaScript Object Notation

What is JavaScript Object Notation (JSON) and how can you put it to work? This concise guide helps busy IT professionals get up and running quickly with this popular data interchange format, and provides a deep understanding of how JSON works. Author Lindsay Bassett begins with an overview of JSON syntax, data types, formatting, and security concerns before exploring the many ways you can apply JSON today. From Web APIs and server-side language libraries to NoSQL databases and client-side frameworks, JSON has emerged as a viable alternative to XML for exchanging data between different platforms. If you have some programming experience and understand HTML and JavaScript, this is your book.
• Learn why JSON syntax represents data in name-value pairs
• Explore JSON data types, including object, string, number, and array
• Find out how you can combat common security concerns
• Learn how the JSON schema verifies that data is formatted correctly
• Examine the relationship between browsers, web APIs, and JSON
• Understand how web servers can both request and create data
• Discover how jQuery and other client-side frameworks use JSON
• Learn why the CouchDB NoSQL database uses JSON to store data
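The name-value pairs and data types mentioned in this description can be illustrated with a minimal sketch using Python's standard `json` module. The record and its field names (`name`, `count`, `tags`, `active`, `note`) are hypothetical and are not taken from the book.

```python
import json

# Hypothetical JSON text: an object made of name-value pairs.
# The field names are illustrative assumptions only.
text = '{"name": "example", "count": 3, "tags": ["a", "b"], "active": true, "note": null}'

# Parsing maps JSON types onto Python types:
# object -> dict, array -> list, string -> str,
# number -> int or float, true/false -> bool, null -> None.
record = json.loads(text)

for name, value in record.items():
    print(f"{name}: {type(value).__name__}")

# Serializing back to JSON text; sort_keys gives a stable key order.
round_trip = json.dumps(record, sort_keys=True)
print(round_trip)
```

Parsing and then serializing again should preserve the data, which is part of what makes JSON convenient for exchanging records between platforms.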

Pro Couchbase Development: A NoSQL Platform for the Enterprise

Pro Couchbase Development: A NoSQL Platform for the Enterprise discusses programming for Couchbase using Java and scripting languages, querying and searching, handling migration, and integrating Couchbase with Hadoop, HDFS, and JSON. It also discusses migration from other NoSQL databases like MongoDB. This book is for big data developers who use the Couchbase NoSQL database or want to use Couchbase for their web applications, as well as for those migrating from other NoSQL databases like MongoDB and Cassandra. For example, one reason to migrate from Cassandra is that it is not based on the JSON document model, which supports a flexible schema without requiring you to define columns and supercolumns. The target audience is largely Java developers, but the book also supports PHP and Ruby developers who want to learn about Couchbase. The author supplies examples in Java, PHP, Ruby, and JavaScript. After reading and using this hands-on guide for developing with Couchbase, you'll be able to build complex enterprise, database, and cloud applications that leverage this powerful platform.

Learning NHibernate 4

Dive into the essentials of NHibernate 4 with this comprehensive guide. Designed for .NET developers, you will discover how to map domain models to databases effectively, perform various database operations, optimize performance, and apply powerful data access patterns using NHibernate.
What this Book will help me do
• Understand how to map domain entities to a database schema using NHibernate's mapping mechanisms.
• Efficiently configure NHibernate for your application using XML configuration files.
• Perform CRUD operations and craft data retrieval strategies with NHibernate.
• Optimize your database-oriented application in terms of performance and memory.
• Apply NHibernate in real-world projects, including interaction with legacy databases.
Author(s)
Suhas H. Chatekar is an experienced software engineer who specializes in .NET development and database integration using ORM tools like NHibernate. With a passion for creating clear and operational technical resources, his comprehensive expertise ensures that his works empower developers to achieve practical outcomes.
Who is it for?
This book is perfect for .NET developers who are new to ORM tools and want to skillfully integrate NHibernate into their projects. Readers who have tried ORM solutions before or used NHibernate but wish to delve deeper into its capabilities will find this book invaluable. If your goal is to model databases effectively and utilize NHibernate for real-world applications, this guide is for you. Beginners to intermediate-level developers will benefit greatly from the step-by-step approach and clear explanations.

Getting Started with Hazelcast, Second Edition

This book is your gateway to mastering Hazelcast, a powerful open-source distributed data grid platform. By using Hazelcast, you'll gain the tools to manage data at scale within your modern applications while improving performance and reliability.
What this Book will help me do
• Gain a comprehensive understanding of distributed data grids and Hazelcast's architecture.
• Master the configuration and deployment of Hazelcast clusters in various scenarios.
• Learn to design scalable and resilient systems using Hazelcast's in-memory features.
• Implement advanced messaging, querying, and processing using Hazelcast APIs.
• Enhance your applications with distributed caching and data sharing capabilities.
Author(s)
Matthew Johns is an experienced software engineer and author specializing in distributed systems and Java enterprise development. He has worked extensively in building scalable applications and is passionate about teaching others to leverage modern technologies. His practical approach to programming and clarity of instruction make complex topics accessible and actionable.
Who is it for?
This book is ideal for Java developers, software architects, and DevOps engineers seeking to enhance their skills in distributed systems. If you're looking to manage data at scale, improve application performance, and build resilient architectures, this book is for you. Whether new to distributed computing or experienced developers exploring Hazelcast, you'll find practical insights for your work. Readers should have basic Java knowledge to get the most out of this book.

IBM GDPS Family of Products: An Introduction to Concepts and Capabilities

This IBM® Redbooks® publication presents an overview of the IBM Geographically Dispersed Parallel Sysplex™ (IBM GDPS®) offerings and the roles they play in delivering a business IT resilience solution. The book begins with general concepts of business IT resilience and disaster recovery, along with issues related to high application availability, data integrity, and performance. These topics are considered within the framework of government regulation, increasing application and infrastructure complexity, and the competitive and rapidly changing modern business environment. Next, it describes the GDPS family of offerings with specific reference to how they can help you achieve your defined goals for disaster recovery and high availability. Also covered are the features that simplify and enhance data replication activities, the prerequisites for implementing each offering, and tips for planning for the future and immediate business requirements. Tables provide easy-to-use summaries and comparisons of the offerings, and the additional planning and implementation services available from IBM are explained. Then, several practical client scenarios and requirements are described, along with the most suitable GDPS solution for each case. The introductory chapters of this publication are intended for a broad technical audience, including IT System Architects, Availability Managers, Technical IT Managers, Operations Managers, System Programmers, and Disaster Recovery Planners. The subsequent chapters provide more technical details about the GDPS offerings, and each can be read independently for those readers who are interested in specific topics. Therefore, if you do read all the chapters, be aware that some information is intentionally repeated.

PostgreSQL Replication, Second Edition

The second edition of 'PostgreSQL Replication' by Hans-Jürgen Schönig is a comprehensive guide that empowers PostgreSQL database professionals to establish robust replication solutions. Through detailed explanations and expert techniques, you will learn how to enhance the security, scalability, and reliability of your PostgreSQL databases using modern replication methods.
What this Book will help me do
• Master Point-in-Time Recovery to safeguard data and perform database recoveries effectively.
• Implement both synchronous and asynchronous streaming replication to suit different operational needs.
• Optimize database performance and scalability using tools like pgpool and PgBouncer.
• Ensure database high availability and data security through Linux High Availability configurations.
• Solve replication-related challenges by leveraging advanced knowledge of the PostgreSQL transaction log.
Author(s)
Hans-Jürgen Schönig, a seasoned PostgreSQL specialist, has years of experience architecting and optimizing PostgreSQL database systems for businesses of all sizes. With a strong focus on practical implementation and a passion for teaching, his writing bridges the gap between theoretical concepts and hands-on solutions, making PostgreSQL topics accessible and actionable.
Who is it for?
This book is tailored for PostgreSQL administrators and professionals seeking to implement robust database replication. Whether you're familiar with basic database administration or looking to deepen your expertise, this book provides valuable insights into replication strategies. It's ideal for those aiming to boost database performance and enhance operational reliability through advanced PostgreSQL features.

Programming ArcGIS with Python Cookbook, Second Edition

Dive into 'Programming ArcGIS with Python Cookbook, Second Edition,' an essential guide for automating your ArcGIS for Desktop tasks with hands-on Python recipes. Through this book, you will understand how to effectively handle GIS data, automate geoprocessing tasks, and extend ArcGIS functionalities to streamline your workflows and boost your productivity.
What this Book will help me do
• Master the management of map documents, layer files, feature classes, and tables using Python.
• Automate common ArcGIS tasks such as map production, printing, and creating PDF map books programmatically.
• Learn to find and correct broken data links and make your datasets reliable.
• Develop custom geoprocessing tools and share them efficiently among your team or projects.
• Expand your knowledge by leveraging advanced practices such as Python scripting for ArcGIS Pro and REST API integration.
Author(s)
Eric Pimpler is an accomplished GIS professional and Python programmer with years of practical experience in geospatial science and technology. He specializes in teaching GIS automation using Python and aims to simplify complex concepts into approachable recipes for learners. Eric's writing is marked by clarity and a methodical approach, ensuring that readers can apply their new knowledge effectively.
Who is it for?
This book is aimed at GIS professionals, cartographers, or analysts who routinely work with ArcGIS and want to streamline their workflow. If you have foundational experience with ArcGIS and basic Python programming skills, this book will build upon them, offering practical recipes to extend your capabilities. It's perfect for those looking to enhance their efficiency and automate their GIS tasks. By the end of this book, readers will have skills valuable to GIS experts and data analysts alike.

Neo4j Graph Data Modelling

Neo4j Graph Data Modelling provides practical guidance in designing and implementing graph databases using Neo4j. This book walks you through modeling concepts, database evolution, and performance optimization. You'll learn how to model real-world domains, write Cypher queries, and adapt your database as requirements change.
What this Book will help me do
• Model data effectively using Neo4j to represent complex relationships.
• Translate real-world problems into graph database designs efficiently.
• Write optimized Cypher queries to retrieve and manipulate data.
• Improve database performance through thoughtful design practices.
• Adapt and evolve databases seamlessly as application needs change.
Author(s)
Mahesh K Lal is an experienced developer and database specialist with a deep understanding of graph data modeling. With a focus on practical and accessible instruction, Mahesh's work provides actionable insights into database design. Neo4j Graph Data Modelling reflects his years of hands-on experience with Neo4j.
Who is it for?
This book is designed for software developers and data professionals looking to explore graph databases. If you aim to effectively model real-world situations using Neo4j or optimize database queries, this guide is for you. Prior experience with databases is helpful but not mandatory.

Oracle GoldenGate 12c Implementer's Guide

Master Oracle GoldenGate 12c to manage high-volume data replication and integration in real time. This guide provides you with comprehensive knowledge and skills to optimize database processes through GoldenGate's capabilities.
What this Book will help me do
• Install and configure Oracle GoldenGate 12c within your environment effectively.
• Leverage GoldenGate's high-availability features for robust system setups.
• Optimize replication processes with advanced configuration and performance tuning techniques.
• Troubleshoot common GoldenGate issues to ensure seamless operations.
• Apply best practices for GoldenGate in enterprise database architectures.
Author(s)
John P Jeffries, the author of this guide, is a seasoned Oracle database consultant with over a decade of experience specializing in high-availability architectures and data replication. His mission is to make complex systems accessible through clear and detailed instructional writing.
Who is it for?
This book is designed for Oracle database administrators wanting to integrate GoldenGate into their architecture. It is also ideal for solution architects building robust systems and project managers overseeing database projects. A basic understanding of Oracle databases is assumed, but no prior knowledge of GoldenGate is required.

Spark Cookbook

Spark Cookbook is your practical guide to mastering Apache Spark, encompassing a comprehensive set of patterns and examples. Through its more than 60 recipes, you will gain actionable insights into using Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX effectively for your big data needs.
What this Book will help me do
• Understand how to install and configure Apache Spark in various environments.
• Build data pipelines and perform real-time analytics with Spark Streaming.
• Utilize Spark SQL for interactive data querying and reporting.
• Apply machine learning workflows using MLlib, including supervised and unsupervised models.
• Develop optimized big data solutions and integrate them into enterprise platforms.
Author(s)
Yadav, the author of Spark Cookbook, is an experienced data engineer and technical expert with deep insights into big data processing frameworks. Yadav has spent years working with Spark and its ecosystem, providing practical guidance to developers and data scientists alike. This book reflects their commitment to sharing actionable knowledge.
Who is it for?
This book is designed for data engineers, developers, and data scientists who work with big data systems and wish to utilize Apache Spark effectively. Whether you're looking to optimize existing Spark applications or explore its libraries for new use cases, this book will provide the guidance you need. A basic familiarity with big data concepts and programming in languages like Java or Python is recommended to make the most out of this book.

ElasticSearch Blueprints

Dive into search technology with "ElasticSearch Blueprints"! This is the perfect project-based guide to help you master Elasticsearch. You will learn how to build and design scalable, effective search solutions, improve search relevancy, manage data efficiently, perform analytics, and visualize your data in comprehensive ways.
What this Book will help me do
• Build and fine-tune scalable search engine features with Elasticsearch.
• Design and implement accurate ecommerce search solutions using filters.
• Analyze and visualize data with Elasticsearch's powerful data aggregation capabilities.
• Increase search relevancy and enhance user query assistance using analyzers.
• Incorporate enhanced data organization methods, including parent-child relationships.
Author(s)
Mohan is an experienced professional specializing in search technologies. With a strong technical background, they have engaged deeply with Elasticsearch, creating solutions that address practical challenges. Their approach focuses on making technical topics accessible, guiding readers step-by-step through projects.
Who is it for?
This book is tailored for data professionals, application developers, and enthusiasts eager to delve into search technologies. Whether you're beginning with Elasticsearch or aiming to refine your skills, this guide will advance your expertise. By working through practical cases, you'll gain confidence in using Elasticsearch effectively to meet diverse requirements.

Virtualizing Hadoop: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture

Plan and Implement Hadoop Virtualization for Maximum Performance, Scalability, and Business Agility
Enterprises running Hadoop must absorb rapid changes in big data ecosystems, frameworks, products, and workloads. Virtualized approaches can offer important advantages in speed, flexibility, and elasticity. Now, a world-class team of enterprise virtualization and big data experts guides you through the choices, considerations, and tradeoffs surrounding Hadoop virtualization. The authors help you decide whether to virtualize Hadoop, deploy Hadoop in the cloud, or integrate conventional and virtualized approaches in a blended solution. First, Virtualizing Hadoop reviews big data and Hadoop from the standpoint of the virtualization specialist. The authors demystify MapReduce, YARN, and HDFS and guide you through each stage of Hadoop data management. Next, they turn the tables, introducing big data experts to modern virtualization concepts and best practices. Finally, they bring Hadoop and virtualization together, guiding you through the decisions you’ll face in planning, deploying, provisioning, and managing virtualized Hadoop. From security to multitenancy to day-to-day management, you’ll find reliable answers for choosing your best Hadoop strategy and executing it.
Coverage includes the following:
• Reviewing the frameworks, products, distributions, use cases, and roles associated with Hadoop
• Understanding YARN resource management, HDFS storage, and I/O
• Designing data ingestion, movement, and organization for modern enterprise data platforms
• Defining SQL engine strategies to meet strict SLAs
• Considering security, data isolation, and scheduling for multitenant environments
• Deploying Hadoop as a service in the cloud
• Reviewing the essential concepts, capabilities, and terminology of virtualization
• Applying current best practices, guidelines, and key metrics for Hadoop virtualization
• Managing multiple Hadoop frameworks and products as one unified system
• Virtualizing master and worker nodes to maximize availability and performance
• Installing and configuring Linux for a Hadoop environment

Exposure-Response Modeling

This book explores a wide range of topics in exposure-response modeling, from traditional PKPD modeling to other areas in drug development and beyond. It incorporates numerous examples and software programs for implementing novel methods. The book emphasizes dose adjustment and treatment adaptation based on dynamic exposure-response models, illustrates how to apply causal inference to exposure-response modeling in pharmacometrics and epidemiology, and links exposure-response modeling to clinical decision making through model-based decision analysis.

Building web applications with Python and Neo4j

Expand your Python web development expertise by integrating Neo4j into your applications. Through this book, you'll journey from understanding Neo4j's fundamentals to building powerful Python-based applications using tools like Flask, Py2neo, and Django. Learn how to model, query, and update graph data effectively.
What this Book will help me do
• Gain an in-depth understanding of Neo4j installation, licensing, and tools.
• Master using Cypher for querying and modifying graph data models.
• Learn how to integrate Python with Neo4j effectively using Py2neo.
• Build RESTful services with Flask leveraging Neo4j for structured data.
• Create robust Django applications using graph-based data models with Neomodel.
Author(s)
Sumit Gupta is a seasoned Python developer with a strong background in graph database design and integration. He has extensive experience using Neo4j to create efficient, scalable applications for real-world problems. His hands-on approach combines practical examples with the depth of knowledge required to develop expertise.
Who is it for?
This book is ideal for Python developers with an interest in enhancing their applications through graph database technology. If you possess a moderate understanding of Python and wish to explore Neo4j for creating smarter, more interconnected data-driven solutions, this book is for you. You should be comfortable with basic programming concepts to fully benefit from this book.

Toad for Oracle Unleashed

Bert Scalzo and Dan Hotka have written the definitive, up-to-date guide to Version 12.x, Dell’s powerful new release of Toad for Oracle. Packed with step-by-step recipes, detailed screen shots, and hands-on exercises, Toad for Oracle Unleashed shows both developers and DBAs how to maximize their productivity. Drawing on their unsurpassed experience running Toad in production Oracle environments, Scalzo and Hotka thoroughly cover every area of Toad’s functionality. You’ll find practical insights into each of Toad’s most useful tools, from App Designer to Doc Generator, ER Diagrammer to Code Road Map. The authors offer proven solutions you can apply immediately to solve a wide variety of problems, from maintaining code integrity to automating performance and scalability testing.
Learn how to…
• Install and launch Toad, connect to a database, and explore Toad’s new features
• Customize Toad to optimize productivity in your environment
• Use the Editor Window to execute SQL and PL/SQL, and view, save, or convert data
• Browse your schema, and create and edit objects
• Quickly generate useful reports with FastReport and Report Manager
• Clarify your database’s tables and data with the powerful Entity Relationship Diagrammer (ERD) and HTML documentation generator
• Work more efficiently with PL/SQL using code templates, snippets, and shortcuts
• Automate actions and applications with Automation Designer
• Perform key DBA tasks including database health checks, tablespace management, database and schema comparisons, and object rebuilding
• Identify and optimize poorly performing SQL and applications
ON THE WEB: Download all examples and source code presented in this book from informit.com/title/9780134131856 as it becomes available.

Hadoop Application Architectures

Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through architectural considerations necessary to tie those components together into a complete tailored application, based on your particular use case.

Accumulo

Get up to speed on Apache Accumulo, the flexible, high-performance key/value store created by the National Security Agency (NSA) and based on Google’s BigTable data storage system. Written by former NSA team members, this comprehensive tutorial and reference covers Accumulo architecture, application development, table design, and cell-level security. With clear information on system administration, performance tuning, and best practices, this book is ideal for developers seeking to write Accumulo applications, administrators charged with installing and maintaining Accumulo, and other professionals interested in what Accumulo has to offer. You will find everything you need to use this system fully.