talk-data.com

Topic: data-engineering (3377 tagged)

Activities

Filtering by: O'Reilly Data Engineering Books

IBM GDPS Family of Products: An Introduction to Concepts and Capabilities

This IBM® Redbooks® publication presents an overview of the IBM Geographically Dispersed Parallel Sysplex™ (IBM GDPS®) offerings and the roles they play in delivering a business IT resilience solution. The book begins with general concepts of business IT resilience and disaster recovery, along with issues related to high application availability, data integrity, and performance. These topics are considered within the framework of government regulation, increasing application and infrastructure complexity, and the competitive and rapidly changing modern business environment. Next, it describes the GDPS family of offerings with specific reference to how they can help you achieve your defined goals for disaster recovery and high availability. Also covered are the features that simplify and enhance data replication activities, the prerequisites for implementing each offering, and tips for planning for the future and immediate business requirements. Tables provide easy-to-use summaries and comparisons of the offerings, and the additional planning and implementation services available from IBM are explained. Then, several practical client scenarios and requirements are described, along with the most suitable GDPS solution for each case. The introductory chapters of this publication are intended for a broad technical audience, including IT System Architects, Availability Managers, Technical IT Managers, Operations Managers, System Programmers, and Disaster Recovery Planners. The subsequent chapters provide more technical details about the GDPS offerings, and each can be read independently for those readers who are interested in specific topics. Therefore, if you do read all the chapters, be aware that some information is intentionally repeated.

PostgreSQL Replication, Second Edition

The second edition of 'PostgreSQL Replication' by Hans-Jürgen Schönig is a comprehensive guide that empowers PostgreSQL database professionals to establish robust replication solutions. Through detailed explanations and expert techniques, you will learn how to enhance the security, scalability, and reliability of your PostgreSQL databases using modern replication methods.

What this Book will help me do
• Master Point-in-Time Recovery to safeguard data and perform database recoveries effectively.
• Implement both synchronous and asynchronous streaming replication to suit different operational needs.
• Optimize database performance and scalability using tools like pgpool and PgBouncer.
• Ensure database high availability and data security through Linux High Availability configurations.
• Solve replication-related challenges by leveraging advanced knowledge of the PostgreSQL transaction log.

Author(s)
Hans-Jürgen Schönig, a seasoned PostgreSQL specialist, has years of experience architecting and optimizing PostgreSQL database systems for businesses of all sizes. With a strong focus on practical implementation and a passion for teaching, his writing bridges the gap between theoretical concepts and hands-on solutions, making PostgreSQL topics accessible and actionable.

Who is it for?
This book is tailored for PostgreSQL administrators and professionals seeking to implement robust database replication. Whether you're familiar with basic database administration or looking to deepen your expertise, this book provides valuable insights into replication strategies. It's ideal for those aiming to boost database performance and enhance operational reliability through advanced PostgreSQL features.

Programming ArcGIS with Python Cookbook, Second Edition

Dive into 'Programming ArcGIS with Python Cookbook, Second Edition,' an essential guide for automating your ArcGIS for Desktop tasks with hands-on Python recipes. Through this book, you will understand how to effectively handle GIS data, automate geoprocessing tasks, and extend ArcGIS functionalities to streamline your workflows and boost your productivity.

What this Book will help me do
• Master the management of map documents, layer files, feature classes, and tables using Python.
• Automate common ArcGIS tasks such as map production, printing, and creating PDF map books programmatically.
• Learn to find and correct broken data links and make your datasets reliable.
• Develop custom geoprocessing tools and share them efficiently among your team or projects.
• Expand your knowledge by leveraging advanced practices such as Python scripting for ArcGIS Pro and REST API integration.

Author(s)
Eric Pimpler is an accomplished GIS professional and Python programmer with years of practical experience in geospatial science and technology. He specializes in teaching GIS automation using Python and aims to simplify complex concepts into approachable recipes for learners. Eric's writing is marked by clarity and a methodical approach, ensuring that readers can apply their new knowledge effectively.

Who is it for?
This book is aimed at GIS professionals, cartographers, or analysts who routinely work with ArcGIS and want to streamline their workflow. If you have foundational experience with ArcGIS and basic Python programming skills, this book will build upon them, offering practical recipes to extend your capabilities. It's perfect for those looking to enhance their efficiency and automate their GIS tasks. By the end of this book, readers will have skills valuable to GIS experts and data analysts alike.

Neo4j Graph Data Modelling

Neo4j Graph Data Modelling provides practical guidance in designing and implementing graph databases using Neo4j. This book walks you through modeling concepts, database evolution, and performance optimization. You'll learn how to model real-world domains, write Cypher queries, and adapt your database as requirements change.

What this Book will help me do
• Model data effectively using Neo4j to represent complex relationships.
• Translate real-world problems into graph database designs efficiently.
• Write optimized Cypher queries to retrieve and manipulate data.
• Improve database performance through thoughtful design practices.
• Adapt and evolve databases seamlessly as application needs change.

Author(s)
Mahesh K Lal is an experienced developer and database specialist with a deep understanding of graph data modeling. With a focus on practical and accessible instruction, Mahesh's work provides actionable insights into database design. Neo4j Graph Data Modelling reflects his years of hands-on experience with Neo4j.

Who is it for?
This book is designed for software developers and data professionals looking to explore graph databases. If you aim to effectively model real-world situations using Neo4j or optimize database queries, this guide is for you. Prior experience with databases is helpful but not mandatory.

Oracle Goldengate 12c Implementers Guide

Master Oracle GoldenGate 12c to manage high-volume data replication and integration in real time. This guide provides you with comprehensive knowledge and skills to optimize database processes through GoldenGate's capabilities.

What this Book will help me do
• Install and configure Oracle GoldenGate 12c within your environment effectively.
• Leverage GoldenGate's high-availability features for robust system setups.
• Optimize replication processes with advanced configuration and performance tuning techniques.
• Troubleshoot common GoldenGate issues to ensure seamless operations.
• Apply best practices for GoldenGate in enterprise database architectures.

Author(s)
John P Jeffries, the author of this guide, is a seasoned Oracle database consultant with over a decade of experience specializing in high-availability architectures and data replication. His mission is to make complex systems accessible through clear and detailed instructional writing.

Who is it for?
This book is designed for Oracle database administrators wanting to integrate GoldenGate into their architecture. It is also ideal for solution architects building robust systems and project managers overseeing database projects. A basic understanding of Oracle databases is assumed, but no prior knowledge of GoldenGate is required.

Spark Cookbook

Spark Cookbook is your practical guide to mastering Apache Spark, encompassing a comprehensive set of patterns and examples. Through more than 60 recipes, you will gain actionable insights into using Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX effectively for your big data needs.

What this Book will help me do
• Understand how to install and configure Apache Spark in various environments.
• Build data pipelines and perform real-time analytics with Spark Streaming.
• Utilize Spark SQL for interactive data querying and reporting.
• Apply machine learning workflows using MLlib, including supervised and unsupervised models.
• Develop optimized big data solutions and integrate them into enterprise platforms.

Author(s)
Yadav, the author of Spark Cookbook, is an experienced data engineer and technical expert with deep insights into big data processing frameworks. Yadav has spent years working with Spark and its ecosystem, providing practical guidance to developers and data scientists alike. This book reflects their commitment to sharing actionable knowledge.

Who is it for?
This book is designed for data engineers, developers, and data scientists who work with big data systems and wish to utilize Apache Spark effectively. Whether you're looking to optimize existing Spark applications or explore its libraries for new use cases, this book will provide the guidance you need. A basic familiarity with big data concepts and programming in languages like Java or Python is recommended to make the most out of this book.

ElasticSearch Blueprints

Dive into search technology with "ElasticSearch Blueprints"! This is the perfect project-based guide to help you master Elasticsearch. You will learn how to build and design scalable, effective search solutions, improve search relevancy, manage data efficiently, perform analytics, and visualize your data in comprehensive ways.

What this Book will help me do
• Build and fine-tune scalable search engine features with Elasticsearch.
• Design and implement accurate ecommerce search solutions using filters.
• Analyze and visualize data with Elasticsearch's powerful data aggregation capabilities.
• Increase search relevancy and enhance user query assistance using analyzers.
• Incorporate enhanced data organization methods, including parent-child relationships.

Author(s)
Mohan is an experienced professional specializing in search technologies. With a strong technical background, they have engaged deeply with Elasticsearch, creating solutions that address practical challenges. Their approach focuses on making technical topics accessible, guiding readers step-by-step through projects.

Who is it for?
This book is tailored for data professionals, application developers, and enthusiasts eager to delve into search technologies. Whether you're beginning with Elasticsearch or aiming to refine your skills, this guide will advance your expertise. By working through practical cases, you'll gain confidence in using Elasticsearch effectively to meet diverse requirements.

Virtualizing Hadoop: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture

Plan and Implement Hadoop Virtualization for Maximum Performance, Scalability, and Business Agility

Enterprises running Hadoop must absorb rapid changes in big data ecosystems, frameworks, products, and workloads. Virtualized approaches can offer important advantages in speed, flexibility, and elasticity. Now, a world-class team of enterprise virtualization and big data experts guides you through the choices, considerations, and tradeoffs surrounding Hadoop virtualization. The authors help you decide whether to virtualize Hadoop, deploy Hadoop in the cloud, or integrate conventional and virtualized approaches in a blended solution.

First, Virtualizing Hadoop reviews big data and Hadoop from the standpoint of the virtualization specialist. The authors demystify MapReduce, YARN, and HDFS and guide you through each stage of Hadoop data management. Next, they turn the tables, introducing big data experts to modern virtualization concepts and best practices. Finally, they bring Hadoop and virtualization together, guiding you through the decisions you’ll face in planning, deploying, provisioning, and managing virtualized Hadoop. From security to multitenancy to day-to-day management, you’ll find reliable answers for choosing your best Hadoop strategy and executing it.

Coverage includes the following:
• Reviewing the frameworks, products, distributions, use cases, and roles associated with Hadoop
• Understanding YARN resource management, HDFS storage, and I/O
• Designing data ingestion, movement, and organization for modern enterprise data platforms
• Defining SQL engine strategies to meet strict SLAs
• Considering security, data isolation, and scheduling for multitenant environments
• Deploying Hadoop as a service in the cloud
• Reviewing the essential concepts, capabilities, and terminology of virtualization
• Applying current best practices, guidelines, and key metrics for Hadoop virtualization
• Managing multiple Hadoop frameworks and products as one unified system
• Virtualizing master and worker nodes to maximize availability and performance
• Installing and configuring Linux for a Hadoop environment

Exposure-Response Modeling

This book explores a wide range of topics in exposure-response modeling, from traditional PKPD modeling to other areas in drug development and beyond. It incorporates numerous examples and software programs for implementing novel methods. The book emphasizes dose adjustment and treatment adaptation based on dynamic exposure-response models, illustrates how to apply causal inference to exposure-response modeling in pharmacometrics and epidemiology, and links exposure-response modeling to clinical decision making through model-based decision analysis.

Building web applications with Python and Neo4j

Expand your Python web development expertise by integrating Neo4j into your applications. Through this book, you'll journey from understanding Neo4j's fundamentals to building powerful Python-based applications using tools like Flask, Py2neo, and Django. Learn how to model, query, and update graph data effectively.

What this Book will help me do
• Gain an in-depth understanding of Neo4j installation, licensing, and tools.
• Master using Cypher for querying and modifying graph data models.
• Learn how to integrate Python with Neo4j effectively using Py2neo.
• Build RESTful services with Flask leveraging Neo4j for structured data.
• Create robust Django applications using graph-based data models with Neomodel.

Author(s)
Sumit Gupta is a seasoned Python developer with a strong background in graph database design and integration. He has extensive experience using Neo4j to create efficient, scalable applications for real-world problems. His hands-on approach combines practical examples with the depth of knowledge required to develop expertise.

Who is it for?
This book is ideal for Python developers with an interest in enhancing their applications through graph database technology. If you possess a moderate understanding of Python and wish to explore Neo4j for creating smarter, more interconnected data-driven solutions, this book is for you. You should be comfortable with basic programming concepts to fully benefit from this book.

Toad for Oracle Unleashed

Bert Scalzo and Dan Hotka have written the definitive, up-to-date guide to Version 12.x, Dell’s powerful new release of Toad for Oracle. Packed with step-by-step recipes, detailed screen shots, and hands-on exercises, Toad for Oracle Unleashed shows both developers and DBAs how to maximize their productivity. Drawing on their unsurpassed experience running Toad in production Oracle environments, Scalzo and Hotka thoroughly cover every area of Toad’s functionality. You’ll find practical insights into each of Toad’s most useful tools, from App Designer to Doc Generator, ER Diagrammer to Code Road Map. The authors offer proven solutions you can apply immediately to solve a wide variety of problems, from maintaining code integrity to automating performance and scalability testing.

Learn how to…
• Install and launch Toad, connect to a database, and explore Toad’s new features
• Customize Toad to optimize productivity in your environment
• Use the Editor Window to execute SQL and PL/SQL, and view, save, or convert data
• Browse your schema, and create and edit objects
• Quickly generate useful reports with FastReport and Report Manager
• Clarify your database’s tables and data with the powerful Entity Relationship Diagrammer (ERD) and HTML documentation generator
• Work more efficiently with PL/SQL using code templates, snippets, and shortcuts
• Automate actions and applications with Automation Designer
• Perform key DBA tasks including database health checks, tablespace management, database and schema comparisons, and object rebuilding
• Identify and optimize poorly performing SQL and applications

ON THE WEB: Download all examples and source code presented in this book from informit.com/title/9780134131856 as it becomes available.

Hadoop Application Architectures

Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through architectural considerations necessary to tie those components together into a complete tailored application, based on your particular use case.

Accumulo

Get up to speed on Apache Accumulo, the flexible, high-performance key/value store created by the National Security Agency (NSA) and based on Google’s BigTable data storage system. Written by former NSA team members, this comprehensive tutorial and reference covers Accumulo architecture, application development, table design, and cell-level security. With clear information on system administration, performance tuning, and best practices, this book is ideal for developers seeking to write Accumulo applications, administrators charged with installing and maintaining Accumulo, and other professionals interested in what Accumulo has to offer. You will find everything you need to use this system fully.

Implementing IBM FlashSystem 840

Almost all technological components in the data center are getting faster: central processing units, networks, storage area networks (SANs), and memory. All of them have improved their speed by a minimum of 10X; some of them by 100X, for example, data networks. However, spinning disk performance has only increased by 1.2 times. IBM® FlashSystem™ 840 version 1.3 closes this gap. The FlashSystem 840 is optimized for the data center to enable organizations of all sizes to strategically harness the value of stored data. It provides flexible capacity and extreme performance for the most demanding applications, including virtualized or bare-metal online transaction processing (OLTP) and online analytical processing (OLAP) databases, virtual desktop infrastructures (VDI), technical computing applications, and cloud environments. The system accelerates response times with IBM MicroLatency® access times as low as 90 µs write latency and 135 µs read latency to enable faster decision making. The introduction of a low capacity 1 TB flash module allows the FlashSystem 840 to be configured in capacity points as low as 2 TB in protected RAID 5 mode. Coupled with 10 GB iSCSI, the FlashSystem is positioned to bring extreme performance to small and medium-sized businesses (SMB) and growth markets. Implementing the IBM FlashSystem® 840 provides value that goes beyond those benefits that are seen on disk-based arrays. These benefits include better user experience, server and application consolidation, development cycle reduction, application scalability, data center footprint savings, and improved price performance economics. This IBM Redbooks® publication discusses IBM FlashSystem 840 version 1.3. It provides in-depth knowledge of the product architecture, software and hardware, its implementation, and hints and tips. 
Also illustrated are use cases that show real-world solutions for tiering, flash-only, and preferred read, as well as examples of the benefits gained by integrating the FlashSystem storage into business environments. Also described are product integration scenarios running the IBM FlashSystem 840 with the IBM SAN Volume Controller and the IBM Storwize® family of products, such as the V7000, V5000, and V3700, as well as considerations when integrating with the IBM FlashSystem 840. Preferred practice guidance is provided for your FlashSystem environment with IBM 16 Gbps b-type products and features, focusing on Fibre Channel design. This book is intended for pre-sales and post-sales technical support professionals and storage administrators, and for anyone who wants to understand and learn how to implement this exciting technology.

Beginning Oracle Database 12c Administration: From Novice to Professional, Second Edition

Beginning Oracle Database 12c Administration is your entry point into a successful and satisfying career as an Oracle Database Administrator. The chapters of this book are logically organized into four parts closely tracking the way your database administration career will naturally evolve. Part 1 "Database Concepts" gives necessary background in relational database theory and Oracle Database concepts, Part 2 "Database Implementation" teaches how to implement an Oracle database correctly, Part 3 "Database Support" exposes you to the daily routine of a database administrator, and Part 4 "Database Tuning" introduces the fine art of performance tuning. Beginning Oracle Database 12c Administration provides information that you won't find in other books on Oracle Database. You'll discover not only technical information, but also guidance on work practices that are as vital to your success as are your technical skills. The author's favorite chapter is "The Big Picture and the Ten Deliverables." (It is the editor’s favorite chapter too!) If you take the lessons in that chapter to heart, you can quickly become a much better Oracle database administrator than you ever thought possible. You will grasp the key aspects of theory behind relational database management systems and learn how to: Install and configure an Oracle database, and ensure that it’s properly licensed; Execute common management tasks in a Linux environment; Defend against data loss by implementing sound backup and recovery practices; and Improve database and query performance.

Introducing and Implementing IBM FlashSystem V9000

IBM FlashSystem® V9000 is a comprehensive all-flash enterprise storage solution that delivers the full capabilities of IBM FlashCore™ technology plus a rich set of software-defined storage features. FlashSystem V9000 is available as one solution in a compact 6U form factor to achieve a simpler, more scalable, and cost-efficient IT infrastructure. FlashSystem V9000 improves business application availability and delivers greater resource utilization so you can get the most from your storage resources. Built with IBM Spectrum Virtualize™ functions, management tools, and interoperability, this product combines the performance of FlashSystem architecture with the advanced functions of software-defined storage, including IBM Real-time Compression™, dynamic tiering, thin provisioning, snapshots, cloning, replication, data copy services, and high-availability configurations. This IBM Redbooks® publication introduces clients to the IBM FlashSystem V9000. It provides in-depth knowledge of the product architecture, software and hardware, implementation, guidelines for scalability, and hints and tips. It illustrates use cases and ISV scenarios that demonstrate real-world solutions, as well as examples of the benefits gained by integrating the FlashSystem storage into business environments. Port utilization methodologies are provided to realize the full performance potential and low latency of IBM FlashSystem V9000 in your scaled environment. This book is intended for pre-sales and post-sales technical support professionals and storage administrators, and for anyone who wants to understand and learn how to implement this exciting technology.

Pro XAML with C#: Application Development Strategies

Pro XAML with C#: Application Development Strategies is your guide to real-world development practices on Microsoft’s XAML-based platforms, with examples in WPF, Windows 8.1, and Windows Phone 8.1. Learn how to properly plan and architect an application on one or more of these platforms for a robust, scalable solution. In Part I, authors Buddy James and Lori Lalonde introduce you to XAML and reveal proven techniques for developing successful line-of-business applications. You’ll also find out about some of the conflicting needs and interests that you might encounter as an enterprise XAML developer. Part II begins to lay the groundwork to help you properly architect your application, providing you with a deeper understanding of domain-driven design and the Model-View-ViewModel design pattern. You will also learn about proper exception handling and logging techniques, and how to cover your code with unit tests to reduce bugs and validate your design. Part III explores implementation and deployment details for each of Microsoft’s XAML UIs, along with advice on deploying and maintaining your application across different devices using version control repositories and continuous integration. Pro XAML with C# Application Development Strategies is for intermediate to experienced developers looking to improve their professional practice. Readers should have experience working with C# and at least one XAML-based technology (WPF, Silverlight, Windows Store, or Windows Phone).

Understanding Oracle APEX 5 Application Development, Second Edition

This new edition of Understanding Oracle APEX 5 Application Development shows APEX developers how to build practical, non-trivial web applications. The book introduces the world of APEX properties, explaining the functionality supported by each page component as well as the techniques developers use to achieve that functionality. The book is targeted at those who are new to APEX and just beginning to develop real projects for production deployment. Reading the book and working the examples will leave you in a good position to build good-looking, highly functional web applications. Topics include conditional formatting, user-customized reports, data entry forms, concurrency and lost updates, and updatable reports. Accompanying the book is a demo web application that illustrates each concept mentioned in the book. Specific attention is given in the book to the thought process involved in choosing and assembling APEX components and features to deliver a specific result. Understanding Oracle APEX 5 Application Development is the ideal book to take you from an understanding of the individual pieces of APEX to an understanding of how those pieces are assembled into polished applications.

• Teaches how to develop non-trivial APEX applications.
• Provides deep understanding of APEX functionality.
• Shows the techniques needed for customization.

Oracle Database 12c DBA Handbook

The definitive reference for every Oracle DBA, completely updated for Oracle Database 12c

Oracle Database 12c DBA Handbook is the quintessential tool for the DBA, with an emphasis on the big picture that enables administrators to achieve effective and efficient database management. Fully revised to cover every new feature and utility, this Oracle Press guide shows how to harness cloud capability, perform a new installation, upgrade from previous versions, configure hardware and software, handle backup and recovery, and provide failover capability. The newly revised material features high-level and practical content on cloud integration, storage management, performance tuning, information management, and the latest on a completely revised security program.

• Shows how to administer a scalable, flexible Oracle enterprise database
• Includes new chapters on cloud integration, new security capabilities, and other cutting-edge features
• All code and examples available online