talk-data.com

Topic

data-engineering

3377

tagged

Activities

Showing filtered results

Filtering by: O'Reilly Data Engineering Books

PostgreSQL 10 High Performance - Third Edition

PostgreSQL 10 High Performance provides you with all the tools to maximize the efficiency and reliability of your PostgreSQL 10 database. Written for database admins and architects, this book offers deep insights into optimizing queries, configuring hardware, and managing complex setups. By integrating these best practices, you'll ensure scalability and stability in your systems.
What this Book will help me do
Optimize PostgreSQL 10 queries for improved performance and efficiency.
Implement database monitoring systems to identify and resolve issues proactively.
Scale your database by implementing partitioning, replication, and caching strategies.
Understand PostgreSQL hardware compatibility and configuration for maximum throughput.
Learn how to design high-performance solutions tailored for large and demanding applications.
Author(s)
Enrico Pirozzi is a seasoned database professional with extensive experience in PostgreSQL management and optimization. Having worked on large-scale database infrastructures, Enrico shares his hands-on knowledge and practical advice for achieving high performance with PostgreSQL. His approachable style makes complex topics accessible to every reader.
Who is it for?
This book is intended for database administrators and system architects who are working with or planning to adopt PostgreSQL 10. Readers should have a foundational knowledge of SQL and some prior exposure to PostgreSQL. If you're aiming to design efficient, scalable database solutions while ensuring high availability, this book is for you.

Networking Design for HPC and AI on IBM Power Systems

This publication provides information about networking design for IBM® High Performance Computing (HPC) and AI for Power Systems™. This paper will help you understand the basic requirements when designing a solution, the components in an infrastructure for HPC and AI systems, the design of interconnect and data networks with use cases based on real-life scenarios, and the administration and out-of-band management networks. We cover all the necessary requirements, provide a good understanding of the technology, and include examples for small, medium, and large cluster environments. This paper is intended for IT architects, system designers, data center planners, and system administrators who must design or provide a solution for the infrastructure of an HPC cluster.

IBM Spectrum Scale Best Practices for Genomics Medicine Workloads

Advancing the science of medicine by targeting a disease more precisely with treatment specific to each patient relies on access to that patient's genomics information and the ability to process massive amounts of genomics data quickly. Although genomics data is becoming a critical source for precision medicine, it is expected to create an expanding data ecosystem. Therefore, hospitals, genome centers, medical research centers, and other clinical institutes need to explore new methods of storing, accessing, securing, managing, sharing, and analyzing significant amounts of data. Healthcare and life sciences organizations that are running data-intensive genomics workloads on an IT infrastructure that lacks scalability, flexibility, performance, management, and cognitive capabilities also need to modernize and transform their infrastructure to support current and future requirements. IBM® offers an integrated solution for genomics that is based on composable infrastructure. This solution enables administrators to build an IT environment in a way that disaggregates the underlying compute, storage, and network resources. Such a composable building block based solution for genomics addresses the most complex data management aspect and allows organizations to store, access, manage, and share huge volumes of genome sequencing data. IBM Spectrum™ Scale is software-defined storage that is used to manage storage and provide massive scale, a global namespace, and high-performance data access with many enterprise features. IBM Spectrum Scale™ is used in clustered environments, provides unified access to data via file protocols (POSIX, NFS, and SMB) and object protocols (Swift and S3), and supports analytic workloads via HDFS connectors. 
Deploying IBM Spectrum Scale and IBM Elastic Storage™ Server (IBM ESS) as a composable storage building block in a Genomics Next Generation Sequencing deployment offers key benefits of performance, scalability, analytics, and collaboration via multiple protocols. This IBM Redpaper™ publication describes a composable solution with detailed architecture definitions for storage, compute, and networking services for genomics next generation sequencing that enable solution architects to benefit from tried-and-tested deployments and to quickly plan and design an end-to-end infrastructure deployment. The preferred practices and fully tested recommendations described in this paper are derived from running the GATK Best Practices workflow from the Broad Institute. The scenarios provide all that is required, including ready-to-use configuration and tuning templates for the different building blocks (compute, network, and storage), that can enable simpler deployment and increase the level of assurance in performance for genomics workloads. The solution is designed to be elastic in nature, and the disaggregation of the building blocks allows IT administrators to easily and optimally configure the solution with maximum flexibility. The intended audience for this paper is technical decision makers, IT architects, deployment engineers, and administrators who are working in the healthcare domain and who are working on genomics-based workloads.

IBM Spectrum Scale Functionality to Support GDPR Requirements

The role of IT solutions is to enforce the correct handling of personal data using processes developed by the establishment. Each element of the solution stack must address the objectives as appropriate to the data that it handles. Typically, personal data exists either in the form of structured data (like databases) or unstructured data (like files, text, documents, and so on). This IBM Redbooks publication specifically deals with unstructured data and storage systems used to host unstructured data. For unstructured data storage in particular, some key attributes enable the overall solution to support compliance with the EU General Data Protection Regulation (GDPR). Because personal data subject to GDPR is commonly stored in an unstructured data format, a scale-out file system like IBM Spectrum Scale provides essential functions to support GDPR requirements. This paper highlights some of the key compliance requirements and explains how IBM Spectrum Scale helps to address them.

JavaScript and JSON Essentials - Second Edition

Dive into "JavaScript and JSON Essentials" to discover how JSON works as a cornerstone in modern web development. Through hands-on examples and practical guidance, this book equips you with the knowledge to effectively use JSON with JavaScript for creating responsive, scalable, and capable web applications.
What this Book will help me do
Master JSON structures and utilize them in web development workflows.
Integrate JSON data within Angular, Node.js, and other popular frameworks.
Implement real-time JSON features using tools like Kafka and Socket.io.
Understand BSON, GeoJSON, and JSON-LD formats for specialized applications.
Develop efficient JSON handling for distributed and scalable systems.
Author(s)
Joseph D'mello and Sai S Sriparasa are seasoned software developers and educators with extensive experience in JavaScript. Their expertise in web application development and JSON usage shines through in this book. They take a clear and engaging approach, ensuring that complex concepts are demystified and actionable.
Who is it for?
This book is best suited for web developers familiar with JavaScript who want to enhance their abilities to use JSON for building fast, data-driven web applications. Whether you're looking to strengthen your backend skills or learn tools like Angular and Kafka in conjunction with JSON, this book is made for you.

A Deep Dive into NoSQL Databases: The Use Cases and Applications

A Deep Dive into NoSQL Databases: The Use Cases and Applications, Volume 109, the latest release in the Advances in Computers series first published in 1960, presents detailed coverage of innovations in computer hardware, software, theory, design and applications. In addition, it provides contributors with a medium in which they can explore their subjects in greater depth and breadth. This update includes sections on NoSQL and NewSQL databases for big data analytics and distributed computing, NewSQL databases and scalable in-memory analytics, NoSQL web crawler application, NoSQL Security, a Comparative Study of different In-Memory (No/New)SQL Databases, NoSQL Hands On-4 NoSQLs, the Hadoop Ecosystem, and more.
Provides a very comprehensive, yet compact, book on the popular domain of NoSQL databases for IT professionals, practitioners and professors
Articulates and accentuates big data analytics and how it gets simplified and streamlined by NoSQL database systems
Sets a stimulating foundation with all the relevant details for NoSQL database researchers, developers and administrators

Consolidation Planning Workbook Practical Migration from x86 to IBM LinuxONE

IBM LinuxONE™ is a portfolio of hardware, software, and solutions for an enterprise-grade Linux environment. It is designed to run more transactions faster and with more security and reliability specifically for the open community. It fully embraces open source-based technology. This IBM® Redbooks® publication provides a technical sample workbook for IT organizations that are considering a migration from their x86 distributed servers to IBM LinuxONE. This book provides you with checklists for each facet of your migration to IBM LinuxONE. This IBM Redbooks workbook assists you by providing the following information:
Choosing workloads to migrate
Analysis of how to size workloads for migration
Financial benefits of a migration
Project definition
Planning checklists

Architecting Data Lakes, 2nd Edition

Many organizations today are succeeding with data lakes, not just as storage repositories but as places to organize, prepare, analyze, and secure a wide variety of data. Management and governance are critical for making your data lake work, yet hard to do without a roadmap. With this ebook, you’ll learn an approach that merges the flexibility of a data lake with the management and governance of a traditional data warehouse. Author Ben Sharma explains the steps necessary to deploy data lakes with robust, metadata-driven data management platforms. You’ll learn best practices for building, maintaining, and deriving value from a data lake in your production environment. Included is a detailed checklist to help you construct a data lake in a controlled yet flexible way. Managing and governing data in your lake cannot be an afterthought. This ebook explores how integrated data lake management solutions, such as the Zaloni Data Platform (ZDP), deliver necessary controls without making data lakes slow and inflexible. You’ll examine:
A reference architecture for a production-ready data lake
An overview of the data lake technology stack and deployment options
Key data lake attributes, including ingestion, storage, processing, and access
Why implementing management and governance is crucial for the success of your data lake
How to curate data lakes through data governance, acquisition, organization, preparation, and provisioning
Methods for providing secure self-service access for users across the enterprise
How to build a future-proof data lake tech stack that includes storage, processing, data management, and reference architecture
Emerging trends that will shape the future of data lakes

Enhancing the IBM Power Systems Platform with IBM Watson Services

Abstract
This IBM® Redbooks® publication provides an introduction to the IBM POWER® processor architecture. It describes the IBM POWER processor and IBM Power Systems™ servers, highlighting the advantages and benefits of IBM Power Systems servers, IBM AIX®, IBM i, and Linux on Power. This publication showcases typical business scenarios that are powered by Power Systems servers. It provides an introduction to the artificial intelligence (AI) capabilities that IBM Watson® services enable, and how these AI capabilities can be augmented in existing applications by using an agile approach to embed intelligence into every operational process. For each use case, the business benefits of adding Watson services are detailed. This publication gives an overview of each Watson service, and how each one is commonly used in real business scenarios. It gives an introduction to the Watson API explorer, which you can use to try the application programming interfaces (APIs) and their capabilities. The Watson services are positioned against the machine learning capabilities of IBM PowerAI. This publication includes a guide on how to set up a development environment on Power Systems servers, a sample code implementation of one of the business cases, and a description of preferred practices to move any application that you develop into production. This publication is intended for technical professionals who are interested in learning about or implementing IBM Watson services on AIX, IBM i, and Linux.

Implementing IBM FlashSystem V9000 AE3

Abstract
The success or failure of businesses often depends on how well organizations use their data assets for competitive advantage. Deeper insights from data require better information technology. As organizations modernize their IT infrastructure to boost innovation rather than limit it, they need a data storage system that can keep pace with several areas that affect your business:
Highly virtualized environments
Cloud computing
Mobile and social systems of engagement
In-depth, real-time analytics
Making the correct decision on storage investment is critical. Organizations must have enough storage performance and agility to innovate when they need to implement cloud-based IT services, deploy virtual desktop infrastructure, enhance fraud detection, and use new analytics capabilities. At the same time, future storage investments must lower IT infrastructure costs while helping organizations to derive the greatest possible value from their data assets. The IBM® FlashSystem V9000 is the premier, fully integrated, Tier 1, all-flash offering from IBM. It has changed the economics of today's data center by eliminating storage bottlenecks. Its software-defined storage features simplify data management, improve data security, and preserve your investments in storage. The IBM FlashSystem® V9000 SAS expansion enclosures provide new tiering options with read-intensive SSDs or nearline SAS HDDs. IBM FlashSystem V9000 includes IBM FlashCore® technology and advanced software-defined storage available in one solution in a compact 6U form factor. IBM FlashSystem V9000 improves business application availability. It delivers greater resource utilization so you can get the most from your storage resources, and achieve a simpler, more scalable, and cost-efficient IT infrastructure. This IBM Redbooks® publication provides information about IBM FlashSystem V9000 Software V8.1.
It describes the core product architecture, software, hardware, and implementation, and provides hints and tips. The underlying basic hardware and software architecture and features of the IBM FlashSystem V9000 AC3 control enclosure and of IBM Spectrum Virtualize 8.1 software are described in these publications:
Implementing IBM FlashSystem 900 Model AE3, SG24-8414
Implementing the IBM System Storage SAN Volume Controller V7.4, SG24-7933
Using IBM FlashSystem V9000 software functions, management tools, and interoperability combines the performance of IBM FlashSystem architecture with the advanced functions of software-defined storage to deliver performance, efficiency, and functions that meet the needs of enterprise workloads that demand IBM MicroLatency® response time. This book offers IBM FlashSystem V9000 scalability concepts and guidelines for planning, installing, and configuring, which can help environments scale up and out to add more flash capacity and expand virtualized systems. Port utilization methodologies are provided to help you maximize the full potential of IBM FlashSystem V9000 performance and low latency in your scalable environment. This book is intended for pre-sales and post-sales technical support professionals, storage administrators, and anyone who wants to understand how to implement this exciting technology.

IBM Spectrum Connect and IBM Storage Enabler for Containers: Practical Example with IBM FlashSystem A9000

This IBM® Redpaper™ publication provides an overview of containers and their framework. Container technology enables prepackaged and pre-configured software with the elements that are needed to run in any environment. Because they are meant to be portable, containers normally restrict applications from storing data on external storage. To overcome this limitation, IBM has developed a solution to provide persistent storage for containers on IBM storage systems, known as the IBM Storage Enabler for Containers. The Enabler tightly integrates with IBM Spectrum™ Connect (formerly IBM Spectrum Control™ Base Edition). IBM Storage Enabler for Containers v1.0 extends IBM Spectrum Connect to Kubernetes orchestrated container environments. The paper focuses on container implementation, management, and control by using IBM Spectrum Connect and the IBM Storage Enabler for Containers plug-in, with IBM FlashSystem® A9000 or A9000R.

IBM z14 Model ZR1 Technical Introduction

Abstract
This IBM® Redbooks® publication introduces the latest member of the IBM Z platform, the IBM z14 Model ZR1 (Machine Type 3907). It includes information about the Z environment and how it helps integrate data and transactions more securely, and provides insight for faster and more accurate business decisions. The z14 ZR1 is a state-of-the-art data and transaction system that delivers advanced capabilities, which are vital to any digital transformation. The z14 ZR1 is designed for enhanced modularity in an industry-standard footprint. This system excels at the following tasks:
Securing data with pervasive encryption
Transforming a transactional platform into a data powerhouse
Getting more out of the platform with IT Operational Analytics
Providing resilience towards zero downtime
Accelerating digital transformation with agile service delivery
Revolutionizing business processes
Mixing open source and Z technologies
This book explains how this system uses new innovations and traditional Z strengths to satisfy growing demand for cloud, analytics, and open source technologies. With the z14 ZR1 as the base, applications can run in a trusted, reliable, and secure environment that improves operations and lessens business risk.

IBM Z Connectivity Handbook

Abstract
This IBM Redbooks publication describes the connectivity options that are available for use within and beyond the data center for the IBM Z family of mainframes, which includes these systems:
IBM z14®
IBM z14 Model ZR1
IBM z13®
IBM z13s™
IBM zEnterprise® EC12 (zEC12)
IBM zEnterprise BC12 (zBC12)
This book highlights the hardware and software components, functions, typical uses, coexistence, and relative merits of these connectivity features. It helps readers understand the connectivity alternatives that are available when planning and designing their data center infrastructures. The changes to this edition are based on the IBM Z hardware announcement dated April 10, 2018. This book is intended for data center planners, IT professionals, systems engineers, and network planners who are involved in the planning of connectivity solutions for IBM mainframes.

PHP, MySQL, & JavaScript All-in-One For Dummies

Explore the engine that drives the internet
It takes a powerful suite of technologies to drive the most-visited websites in the world. PHP, MySQL, JavaScript, and other web-building languages serve as the foundation for application development and programming projects at all levels of the web. Dig into this all-in-one book to get a grasp on these in-demand skills, and figure out how to apply them to become a professional web builder. You’ll get valuable information from seven handy books covering the pieces of web programming, HTML5 & CSS3, JavaScript, PHP, MySQL, creating object-oriented programs, and using PHP frameworks. This book is ideal for the inexperienced programmer interested in adding these skills to their toolbox. New coders who've made it through an online course or boot camp will also find great value in how this book builds on what you already know.
Helps you grasp the technologies that power web applications
Covers PHP version 7.2
Includes coverage of the latest updates in web development
Perfect for developers to use to solve problems

Oracle SQL Revealed: Executing Business Logic in the Database Engine

Write queries using little-known, but powerful, SQL features implemented in Oracle's database engine. You will be able to take advantage of Oracle’s power in implementing business logic, thereby maximizing return from your company’s investment in Oracle Database products. Important features and aspects of SQL covered in this book include the model clause, row pattern matching, analytic and aggregate functions, and recursive subquery factoring, just to name a few. The focus is on implementing business logic in pure SQL, with a comparison of different approaches that can be used to write SELECT statements to return results that drive good decision making and competitive action in the marketplace. This book covers features that are often not well known, and sometimes not implemented in competing products. Chapters on query transformation and logical execution order provide a grasp of the big picture in which the individual SQL features described in the other chapters are executed. Also included are a discussion on when to use the procedural capabilities from PL/SQL, and a series of examples showing different mixes of SQL features being applied in common types of queries that you are likely to encounter.
What You Will Learn
Gain competitive advantage from Oracle SQL
Know when to step up to PL/SQL versus staying in SQL
Become familiar with query transformations and join mechanics
Apply the model clause and analytic functions to business intelligence queries
Make use of features that are specific to Oracle Database, such as row pattern matching
Understand the pros and cons of different SQL approaches to solving common query tasks
Traverse hierarchies using CONNECT BY and recursive subquery factoring
Who This Book Is For
Database programmers with some Oracle Database experience. The book is also for SQL developers who are moving to the Oracle Database platform or want to learn unique features of its query engine.
Both audiences will learn to apply the full power of Oracle’s own SQL dialect to commonly encountered types of business questions and query challenges.

ABCs of IBM z/OS System Programming Volume 2

Abstract
The ABCs of IBM® z/OS® System Programming is a 13-volume collection that provides an introduction to the z/OS operating system and the hardware architecture. Whether you are a beginner or an experienced system programmer, the ABCs collection provides the information that you need to start your research into z/OS and related subjects. If you want to become more familiar with z/OS in your current environment or if you are evaluating platforms to consolidate your e-business applications, the ABCs collection can serve as a powerful technical tool. This volume describes the basic system programming activities related to implementing and maintaining the z/OS installation and provides details about the modules that are used to manage jobs and data. It covers the following topics:
Overview of the parmlib definitions and the IPL process. The parameters and system data sets necessary to IPL and run a z/OS operating system are described, along with the main daily tasks for maximizing performance of the z/OS system.
Basic concepts related to subsystems and subsystem interface and how to use the subsystem services that are provided by IBM subsystems.
Job management in the z/OS system using the JES2 and JES3 job entry subsystems. It provides a detailed discussion about how JES2 and JES3 are used to receive jobs into the operating system, schedule them for processing by z/OS, and control their output processing.
The link pack area (LPA), LNKLST, authorized libraries, and the role of VLF and LLA components.
An overview of SMP/E for z/OS.
An overview of IBM Language Environment® architecture and descriptions of Language Environment’s full program model, callable services, storage management model, and debug information.
Other volumes in this series include the following content:
Volume 1: Introduction to z/OS and storage concepts, TSO/E, ISPF, JCL, SDSF, and z/OS delivery and installation
Volume 3: Introduction to DFSMS, data set basics, storage management, hardware and software, catalogs, and DFSMStvs
Volume 4: Communication Server, TCP/IP, and IBM VTAM®
Volume 5: Base and IBM Parallel Sysplex®, System Logger, Resource Recovery Services (RRS), global resource serialization (GRS), z/OS system operations, automatic restart management (ARM), IBM Geographically Dispersed Parallel Sysplex™ (IBM GDPS®)
Volume 6: Introduction to security, IBM RACF®, Digital certificates and PKI, Kerberos, cryptography and z990 integrated cryptography, zSeries firewall technologies, LDAP, and Enterprise Identity Mapping (EIM)
Volume 7: Printing in a z/OS environment, Infoprint Server, and Infoprint Central
Volume 8: An introduction to z/OS problem diagnosis
Volume 9: z/OS UNIX System Services
Volume 10: Introduction to IBM z/Architecture®, the IBM Z platform and IBM Z connectivity, LPAR concepts, HCD, and the DS Storage Solution
Volume 11: Capacity planning, performance management, WLM, IBM RMF™, and SMF
Volume 12: WLM
Volume 13: JES3, JES3 SDSF

Modern Big Data Processing with Hadoop

Delve into the world of big data with 'Modern Big Data Processing with Hadoop.' This comprehensive guide introduces you to the powerful capabilities of Apache Hadoop and its ecosystem to solve data processing and analytics challenges. By the end, you will have mastered the techniques necessary to architect innovative, scalable, and efficient big data solutions.
What this Book will help me do
Master the principles of building an enterprise-level big data strategy with Apache Hadoop.
Learn to integrate Hadoop with tools such as Apache Spark, Elasticsearch, and more for comprehensive solutions.
Set up and manage your big data architecture, including deployment on cloud platforms with Apache Ambari.
Develop real-time data pipelines and enterprise search solutions.
Leverage advanced visualization tools like Apache Superset to make sense of data insights.
Author(s)
R. Patil, Kumar, and Shindgikar are experienced big data professionals and accomplished authors. With years of hands-on experience in implementing and managing Apache Hadoop systems, they bring a depth of expertise to their writing. Their dedication lies in making complex technical concepts accessible while demonstrating real-world best practices.
Who is it for?
This book is designed for data professionals aiming to advance their expertise in big data solutions using Apache Hadoop. Ideal readers include engineers and project managers involved in data architecture and those aspiring to become big data architects. Some prior exposure to big data systems is helpful for getting the most from this book's insights and tutorials.

Seven NoSQL Databases in a Week

Learn the fundamentals of seven essential NoSQL databases in just one week with this book. Covering MongoDB, DynamoDB, Redis, Cassandra, Neo4j, InfluxDB, and HBase, you'll explore their functionalities and practical applications. Designed to give you a working understanding of NoSQL database types, this guide helps aspiring DBAs and developers comprehend and utilize modern data solutions.
What this Book will help me do
Master the fundamentals of MongoDB, including high-performance, high-availability, and scaling features.
Gain hands-on experience with Neo4j to perform database queries and integrate with Python and Java applications.
Learn efficient querying with Redis for storage and retrieval tasks.
Understand Cassandra's powerful solution for scalable and fault-tolerant systems.
Get well-versed with HBase for creating tables, and reading and writing data efficiently.
Author(s)
Sudarshan Kadambi and Xun (Brian) Wu bring a wealth of experience in database technologies. They have worked extensively in the software development and database management fields. With their practical and concise teaching approach, the authors make complex topics accessible for readers.
Who is it for?
This book is ideal for budding DBAs and developers looking to understand NoSQL databases. It is particularly useful for those transitioning from relational databases who want to learn about modern database technologies. Suitable for both beginners and those with some database knowledge, it aims to bridge skill gaps and expand the reader's technical expertise.

IBM FlashSystem V9000 AE3 and AC3 Performance

This IBM® Redpaper™ publication provides information about the best practices and performance capabilities when implementing a storage solution using IBM FlashSystem® V9000 9846-AC3 with IBM FlashSystem V9000 9846-AE3 storage enclosures. The results that are achieved and demonstrated are specific to the configuration used. However, they can be used as reference points for other configurations. There was no intention to demonstrate the best or the worst results, minimal latency, or maximum throughput. We tried to stay close to the configurations and workloads used by IBM clients and provide reference points for them.

PostGIS Cookbook - Second Edition

PostGIS Cookbook provides a thorough introduction to working with spatial data in the PostgreSQL environment using PostGIS. The book covers topics such as importing and exporting geographic data, analyzing vector and raster data, database optimization, and building GIS web applications. By the end, you'll be equipped to fully leverage PostGIS for spatial data projects.
What this Book will help me do
Efficiently import and export geographic data between PostGIS and other platforms.
Apply PostGIS functions for advanced vector data analysis and visualization.
Manipulate and optimize spatial data for better performance and robustness.
Integrate PostGIS with Python for spatial data scripting.
Develop GIS web applications leveraging PostGIS and Open Geospatial standards.
Author(s)
The authors of PostGIS Cookbook are experienced professionals and active contributors to the spatial database community. Vincent Mather, Pedro Wightman, Thomas Kraft, and their co-authors bring extensive software engineering and geo-computing expertise to the text. Their hands-on approach ensures practicality and relevance to current technologies.
Who is it for?
This book is ideal for developers and GIS professionals who want to enhance their spatial data handling skills using PostGIS. Whether you're a beginner to spatial databases or looking to extend your PostgreSQL knowledge, this book offers practical solutions and advanced techniques for spatial data management and analysis.