talk-data.com

Topic: data-engineering (3395 tagged)

Activity Trend (chart): quarterly activity, 2020-Q1 to 2026-Q1

Activities

3395 activities · Newest first

Graph Data Processing with Cypher

This comprehensive guide, "Graph Data Processing with Cypher," provides a clear and practical approach to mastering Cypher for querying Neo4j graph databases. Packed with real-world examples and detailed explanations, you'll learn how to model graph data, write and optimize Cypher queries, and leverage advanced features to extract meaningful insights from your data. What this Book will help me do Master the Cypher query language, from basics to advanced graph traversal techniques. Develop graph data models based on real-world business requirements and efficiently load data. Optimize Cypher queries for performance through query profiling and tuning techniques. Enhance Cypher's capabilities using APOC utilities for advanced data processing. Create impactful visualizations of graph data using tools like Neo4j Bloom. Author(s) Ravindranatha Anthapu has vast expertise in graph databases and years of professional experience working with Cypher and Neo4j. He brings a hands-on and accessible approach to teaching technical concepts, aiming to empower developers to effectively use graph databases. Through a passion for knowledge-sharing, Ravindranatha ensures readers feel both supported and challenged in their learning journey. Who is it for? This book is ideal for database administrators, developers, and architects, especially those who work with graph databases or want to transition into this domain. Beginners with basic Cypher knowledge and professionals aiming to advance their graph modeling and query optimization skills will find this resource invaluable. It is especially beneficial for individuals seeking to harness the full potential of Neo4j graph databases through Cypher.
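
As a small taste of the kind of query the book teaches, here is a minimal sketch that runs a basic Cypher pattern match from Python with the official neo4j driver. The connection URI, credentials, and the Person/Movie graph model are illustrative assumptions, not examples taken from the book.

```python
# Minimal Cypher example using the official neo4j Python driver.
# The URI, credentials, and data model below are placeholders.
from neo4j import GraphDatabase

URI = "bolt://localhost:7687"      # assumed local Neo4j instance
AUTH = ("neo4j", "your-password")  # assumed credentials

cypher = """
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.released >= $year
RETURN p.name AS actor, m.title AS movie
LIMIT 5
"""

with GraphDatabase.driver(URI, auth=AUTH) as driver:
    with driver.session() as session:
        for record in session.run(cypher, year=2000):
            print(record["actor"], "-", record["movie"])
```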

Oracle Autonomous Database in Enterprise Architecture

Explore the capabilities of Oracle Autonomous Database (ADB) to improve enterprise-level data management. Through this book, you will dive deep into deploying, managing, and securing ADBs using Oracle Cloud Infrastructure (OCI). Gain hands-on experience with high-availability setups, data migration methods, and advanced security measures to elevate your enterprise architecture. What this Book will help me do Understand the key considerations for planning, migrating, and maintaining Oracle Autonomous Databases. Learn to implement high-availability solutions using Autonomous Data Guard in ADB environments. Master the configuration of backup, restore, and disaster recovery strategies within OCI. Implement advanced security practices including encryption and IAM policy management. Gain proficiency in leveraging ADB features like APEX, SQL Developer Web, and REST APIs for rapid application development. Author(s) The authors Sharma, Krishnakumar KM, and Panda are experts in database systems, particularly in Oracle technologies. With years of hands-on experience implementing enterprise solutions and training professionals, they have pooled their knowledge to craft a resource-rich guide filled with practical advice. Who is it for? This book is ideal for cloud architects, database administrators, and implementation consultants seeking to leverage Oracle's Autonomous Database for enhanced automation, security, and scalability. It is well-suited for professionals with foundational knowledge of Linux, OCI, and databases. Aspiring cloud engineers and students aiming to understand modern database management will also benefit greatly.
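
For readers interested in the rapid application development angle, here is a hedged sketch of connecting to an Autonomous Database from Python with the python-oracledb driver. The credentials, TNS alias, and wallet paths are assumptions for illustration; real values come from your OCI console download.

```python
# Minimal connection sketch for Oracle Autonomous Database using python-oracledb.
# User, password, DSN, and wallet paths are placeholders.
import oracledb

conn = oracledb.connect(
    user="ADMIN",                         # assumed user
    password="your-password",             # assumed password
    dsn="mydb_high",                      # assumed TNS alias from tnsnames.ora
    config_dir="/opt/oracle/wallet",      # assumed wallet directory
    wallet_location="/opt/oracle/wallet",
    wallet_password="wallet-password",
)

cursor = conn.cursor()
cursor.execute("SELECT banner FROM v$version")
for (banner,) in cursor:
    print(banner)

conn.close()
```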

ISV IBM zPDT Guide and Reference

This IBM® Redbooks® publication provides both introductory information and technical details for ISV IBM Z® Program Development Tool (IBM zPDT®), which produces a small IBM zSystems environment that is suitable for application development. ISV zPDT is a personal computer (PC) Linux application. When ISV zPDT is installed on Linux, normal IBM zSystems operating systems (such as IBM z/OS®) may be run on it. ISV zPDT provides the basic IBM zSystems architecture and provides emulated IBM 3390 disk drives, 3270 interfaces, Open Systems Adapter (OSA) interfaces, and other items. The systems that are described in this publication are complex, with elements of Linux (for the underlying PC machine), IBM z/Architecture® (for the core zPDT elements), IBM zSystems I/O functions (for emulated I/O devices), z/OS (the most common IBM zSystems operating system), and various applications and subsystems under z/OS. We assume that the reader is familiar with general concepts and terminology of IBM zSystems hardware and software elements, and with basic PC Linux characteristics. This publication provides the primary documentation for ISV zPDT and corresponds to zPDT V1 R11, commonly known as GA11.

Cyber Resiliency with Splunk Enterprise and IBM FlashSystem Storage Safeguarded Copy with IBM Copy Services Manager

The focus of this document is to highlight early threat detection by using Splunk Enterprise and proactively start a cyber resilience workflow in response to a cyberattack or malicious user action. The workflow uses IBM® Copy Services Manager (CSM) as orchestration software to invoke the IBM FlashSystem® storage Safeguarded Copy function, which creates an immutable copy of the data in an air-gapped form on the same IBM FlashSystem Storage for isolation and eventual quick recovery. This document explains the steps that are required to enable and forward IBM FlashSystem audit logs and set up a Splunk forwarder configuration to forward local event logs to Splunk Enterprise. This document also describes how to create various alerts in Splunk Enterprise to determine a threat, and configure and invoke an appropriate response to the detected threat in Splunk Enterprise. This document explains the lab setup configuration steps that are involved in configuring various components like Splunk Enterprise, Splunk Enterprise config files for custom apps, IBM CSM, and IBM FlashSystem Storage. The last steps in the lab setup section demonstrate the automated Safeguarded Copy creation and validation steps. This document also describes brief steps for configuring various components and integrating them. This document demonstrates a use case for protecting a Microsoft SQL database (DB) volume that is created on IBM FlashSystem Storage. When a threat is detected on the Microsoft SQL DB volume, Safeguarded Copy starts on an IBM FlashSystem Storage volume. The Safeguarded Copy creates an immutable copy of the data, and the same data volume can be recovered or restored by using IBM CSM. This publication does not describe the installation procedures for Splunk Enterprise, Splunk Forwarder for IBM CSM, the Microsoft SQL Server, or the IBM FlashSystem Storage setup. It is assumed that the reader of the book has a basic understanding of system, Windows, DB, and storage administration, and has access to the required software and documentation that is used in this document.
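
The document itself relies on a Splunk forwarder for log delivery. As a rough, non-authoritative illustration of getting an event into Splunk Enterprise by another route, the sketch below posts a synthetic audit event to Splunk's HTTP Event Collector (HEC); the host, port, token, index, and event fields are assumptions, and this is not the forwarder configuration the paper describes.

```python
# Hypothetical example: send one synthetic audit event to Splunk's
# HTTP Event Collector (HEC). Host, port, token, and field names are placeholders.
import requests

HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # assumed
HEC_TOKEN = "00000000-0000-0000-0000-000000000000"                    # assumed

event = {
    "event": {
        "source_system": "FlashSystem-lab",
        "action": "volume_delete_attempt",
        "user": "suspicious_admin",
    },
    "sourcetype": "_json",
    "index": "main",
}

resp = requests.post(
    HEC_URL,
    headers={"Authorization": f"Splunk {HEC_TOKEN}"},
    json=event,
    verify=False,  # lab setup only; use proper TLS verification in production
    timeout=10,
)
resp.raise_for_status()
print("HEC response:", resp.json())
```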

The Cloud Data Lake

More organizations than ever understand the importance of data lake architectures for deriving value from their data. Building a robust, scalable, and performant data lake remains a complex proposition, however, with a buffet of tools and options that need to work together to provide a seamless end-to-end pipeline from data to insights. This book provides a concise yet comprehensive overview of the setup, management, and governance of a cloud data lake. Author Rukmani Gopalan, a product management leader and data enthusiast, guides data architects and engineers through the major aspects of working with a cloud data lake, from design considerations and best practices to data format optimizations, performance optimization, cost management, and governance. Learn the benefits of a cloud-based big data strategy for your organization Get guidance and best practices for designing performant and scalable data lakes Examine architecture and design choices, and data governance principles and strategies Build a data strategy that scales as your organizational and business needs increase Implement a scalable data lake in the cloud Use cloud-based advanced analytics to gain more value from your data
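
To make the data format optimization point concrete, here is a small sketch (not from the book) that writes a compressed, columnar Parquet file with pyarrow, a common first step when laying out data in a cloud data lake. The sample data and file path are assumptions.

```python
# Write a small, compressed, columnar Parquet file with pyarrow.
# Sample data and output path are placeholders.
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({
    "order_id": [1, 2, 3],
    "region": ["emea", "apac", "amer"],
    "amount": [120.5, 98.0, 342.75],
})

# Snappy compression is a common default for analytics workloads.
pq.write_table(table, "orders.parquet", compression="snappy")

# Read it back to confirm the schema round-trips.
print(pq.read_table("orders.parquet").schema)
```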

Using RDP with IBM FlashSystem to Debug Fibre Channel Optics Errors

The focus of this IBM® blueprint is to showcase the Read Diagnostic Parameters (RDP) feature of the Fibre Channel protocol (FCP). The data that is provided by RDP commands can simplify the process of managing and analyzing any issues on complex SAN fabrics. In this blueprint, we provide guidance to help users and administrators understand the meaning of RDP data and how to use it. The intent of this blueprint is to help a user understand what RDP is, what data RDP represents, and how to use that data to identify potential issues within the SAN fabric that is hosted by that Fibre Channel (FC) switch.

Optimized Inferencing and Integration with AI on IBM zSystems: Introduction, Methodology, and Use Cases

In today's fast-paced, ever-growing digital world, you face various new and complex business problems. To help resolve these problems, enterprises are embedding artificial intelligence (AI) into their mission-critical business processes and applications to help improve operations, optimize performance, personalize the user experience, and differentiate themselves from the competition. Furthermore, the use of AI on the IBM® zSystems platform, where your mission-critical transactions, data, and applications are installed, is a key aspect of modernizing business-critical applications while maintaining strict service-level agreements (SLAs) and security requirements. This colocation of data and AI empowers your enterprise to optimally and easily deploy and infuse AI capabilities into your enterprise workloads with the most recent and relevant data available in real time, which enables a more transparent, accurate, and dependable AI experience. This IBM Redpaper publication introduces and explains AI technologies and hardware optimizations, such as IBM zSystems Integrated Accelerator for AI, and demonstrates how to leverage certain capabilities and components to enable solutions in business-critical use cases, such as fraud detection and credit risk scoring on the platform. Real-time inferencing with AI models, a capability that is critical to certain industries and use cases such as fraud detection, now can be implemented with optimized performance thanks to innovations like IBM zSystems Integrated Accelerator for AI embedded in the Telum chip within IBM z16™. This publication also describes and demonstrates the implementation and integration of the two end-to-end solutions (fraud detection and credit risk), from developing and training the AI models to deploying the models in an IBM z/OS® V2R5 environment on IBM z16 hardware, and to integrating AI functions into an application, for example an IBM z/OS Customer Information Control System (IBM CICS®) application. We describe performance optimization recommendations and considerations when leveraging AI technology on the IBM zSystems platform, including optimizations for micro-batching in IBM Watson® Machine Learning for z/OS (WMLz). The benefits that are derived from the solutions also are described in detail, which includes how the open-source AI framework portability of the IBM zSystems platform enables model development and training to be done anywhere, including on IBM zSystems, and the ability to easily integrate to deploy on IBM zSystems for optimal inferencing. You can uncover insights at the transaction level while taking advantage of the speed, depth, and securability of the platform. This publication is intended for technical specialists, site reliability engineers, architects, system programmers, and systems engineers. Technologies that are covered include TensorFlow Serving, WMLz, IBM Cloud Pak® for Data (CP4D), IBM z/OS Container Extensions (zCX), IBM Customer Information Control System (IBM CICS), Open Neural Network Exchange (ONNX), and IBM Deep Learning Compiler (zDLC).
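
Since ONNX is among the covered technologies, the following minimal, framework-generic ONNX Runtime inference sketch shows the basic scoring pattern in Python. The model file name, feature count, and random input are assumptions, and this does not show the WMLz or Telum-accelerated deployment path the publication describes.

```python
# Generic ONNX Runtime inference sketch; model path and input shape are placeholders.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("fraud_model.onnx")  # assumed model file

input_name = session.get_inputs()[0].name
# Assume the model expects a batch of 1 with 16 features.
features = np.random.rand(1, 16).astype(np.float32)

outputs = session.run(None, {input_name: features})
print("Model output:", outputs[0])
```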

Pro SQL Server 2022 Administration: A Guide for the Modern DBA

Get your daily work done efficiently using this comprehensive guide for SQL Server DBAs that covers all that a practicing database administrator needs to know. Updated for SQL Server 2022, this edition includes coverage of new features, such as Ledger, which provides an immutable record of table history to protect you against malicious data tampering, and integration with cloud providers to support hybrid cloud scenarios. You'll also find new content on performance optimizations, such as query plan feedback, and security controls, such as new database roles, which are restructured for modern ways of working. Coverage also includes Query Store, installation on Linux, and the use of containerized SQL. Pro SQL Server 2022 Administration takes DBAs on a journey that begins with planning their SQL Server deployment and runs through installing and configuring the instance, administering and optimizing database objects, and ensuring that data is secure and highly available. Readers will learn how to perform advanced maintenance and tuning techniques, and discover SQL Server's hybrid cloud functionality. This book teaches you how to make the most of new SQL Server 2022 functionality, including integration for hybrid cloud scenarios. The book promotes best-practice installation, shows how to configure for scalability and high availability, and demonstrates the gamut of database-level maintenance tasks, such as index maintenance, database consistency checks, and table optimizations. What You Will Learn Integrate SQL Server with Azure for hybrid cloud scenarios Audit changes and prevent malicious data changes with SQL Server's Ledger Secure and encrypt data to protect against embarrassing data breaches Ensure 24 x 7 x 365 access through high availability and disaster recovery features in today's hybrid world Use Azure tooling, including Arc, to gain insight into and manage your SQL Server enterprise Install and configure SQL Server on Windows, Linux, and in containers Perform routine maintenance tasks, such as backups and database consistency checks Optimize performance and undertake troubleshooting in the Database Engine Who This Book Is For SQL Server DBAs who manage on-premises installations of SQL Server. This book is also useful for DBAs who wish to learn advanced features, such as integration with Azure, Query Store, Extended Events, and Policy-Based Management, or those who need to install SQL Server in a variety of environments.
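
As a flavor of the routine maintenance tasks the book covers, here is a hedged Python sketch that submits a full backup and a consistency check through pyodbc. The server name, credentials, database name, and backup path are assumptions, not values from the book.

```python
# Routine maintenance sketch: full backup plus DBCC CHECKDB via pyodbc.
# Connection string, database name, and backup path are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlserver.example.com;"
    "DATABASE=master;"
    "UID=sa;PWD=your-password;"
    "TrustServerCertificate=yes;",
    autocommit=True,  # BACKUP/DBCC cannot run inside a user transaction
)
cursor = conn.cursor()

cursor.execute(
    "BACKUP DATABASE [SalesDB] "
    "TO DISK = N'/var/opt/mssql/backup/SalesDB.bak' WITH COMPRESSION;"
)
while cursor.nextset():  # drain any informational result sets before continuing
    pass

cursor.execute("DBCC CHECKDB ([SalesDB]) WITH NO_INFOMSGS;")
print("Backup and consistency check submitted for SalesDB.")
conn.close()
```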

Streaming Video Strategies

Video is an essential tool for businesses and a key driver in consumer sales. But consumers expect the seamless viewing experiences they get on specialized streaming sites like Netflix and YouTube from every company, everywhere they watch. Building video that meets those expectations into your sites and apps means dealing with complex challenges. In this report, Carolyn Handler Miller and Frank Kane help you think through decisions about building video at your company, whether you're a founder considering the role of video in your app, a product manager or team lead overseeing video infrastructure, or a developer executing on user experience. You'll explore a solid framework for incorporating video into your websites and apps that considers your existing infrastructure so that you can deliver seamless, high-quality video experiences that drive real results. Four case studies then show how real companies have successfully built video experiences into their businesses' software architecture. This report helps you: Understand the changing role of video for businesses today Appreciate the unique challenges of building video Decide whether to design and build video infrastructure yourself or partner with a third-party expert

Unlocking the Value of Real-Time Analytics

Storing data and making it accessible for real-time analysis is a huge challenge for organizations today. In 2020 alone, 64.2 billion GB of data was created or replicated, and it continues to grow. With this report, data engineers, architects, and software engineers will learn how to do deep analysis and automate business decisions while keeping their analytical capabilities timely. Author Christopher Gardner takes you through current practices for extracting data for analysis and uncovers the opportunities and benefits of making that data extraction and analysis continuous. By the end of this report, you'll know how to use new and innovative tools against your data to make real-time decisions. And you'll understand how to examine the impact of real-time analytics on your business. Learn the four requirements of real-time analytics: latency, freshness, throughput, and concurrency Determine where delays between data collection and actionable analytics occur Understand the reasons for real-time analytics and identify the tools you need to reach a faster, more dynamic level Examine changes in data storage and software while learning methodologies for overcoming delays in existing database architecture Explore case studies that show how companies use columnar data, sharding, and bitmap indexing to store and analyze data Fast and fresh data can make the difference between a successful transaction and a missed opportunity. The report shows you how.
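
To illustrate two of those four requirements, latency and freshness, in the simplest possible terms, the toy sketch below measures how stale the newest event is by the time a query reads it. The in-memory "stream" and the five-second freshness target are invented for illustration.

```python
# Toy freshness check: how old is the newest event by the time we read it?
# The in-memory event list and the 5-second target are illustrative.
import time

FRESHNESS_TARGET_SECONDS = 5.0

# Pretend these events arrived from a pipeline, each tagged with its creation time.
events = [
    {"order_id": 1, "created_at": time.time() - 12.0},
    {"order_id": 2, "created_at": time.time() - 2.5},
]

query_started = time.time()
newest = max(events, key=lambda e: e["created_at"])
freshness_lag = query_started - newest["created_at"]

status = "OK" if freshness_lag <= FRESHNESS_TARGET_SECONDS else "stale"
print(f"Freshness lag: {freshness_lag:.1f}s ({status})")
```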

Offloading storage volumes from Safeguarded Copy to AWS S3 Object Storage with IBM FlashSystem Transparent Cloud Tiering

The focus of this IBM® Blueprint is to showcase a method to store volumes that are created by using Safeguarded Copy off premises to Amazon S3 object storage by using the IBM FlashSystem Transparent Cloud Tiering (TCT) feature. TCT enables volume data to be copied and transferred to object storage. The TCT feature supports creating connections to cloud service providers to store copies of volume data in private or public clouds. This feature is useful for organizations of all sizes when planning for disaster recovery operations or storing a copy of data as extra backup. TCT provides seamless integration between the storage system and public or private clouds for Safeguarded Copy volumes and non-Safeguarded Copy volumes.

IBM Elastic Storage System Introduction Guide

This IBM® Redpaper publication provides an overview of the IBM Elastic Storage® Server (IBM ESS) and IBM Elastic Storage System (also IBM ESS). These scalable, high-performance data and file management solutions are built on IBM Spectrum® Scale technology. Providing reliability, performance, and scalability, IBM ESS can be implemented for a range of diverse requirements. The latest IBM ESS 3500 is the most innovative system that provides investment protection to expand or build a new Global Data Platform and use current storage. The system allows enhanced, non-disruptive upgrades to grow from flash to hybrid or from hard disk drives (HDDs) to hybrid. IBM ESS can scale up or out with two different storage mediums in the environment, and it is ready for technologies like 200 Gb Ethernet or InfiniBand NDR-200 connectivity. This publication helps you to understand the solution and its architecture. It describes ordering the best solution for your environment, planning the installation and integration of the solution into your environment, and correctly maintaining your solution. The solution is created from a combination of physical and logical components: hardware, operating system, storage, network, and applications. Knowledge of the IBM Elastic Storage Server and IBM Elastic Storage System components is key for planning an environment. This paper is targeted toward technical professionals (consultants, technical support staff, IT architects, and IT specialists) who are responsible for delivering cost-effective cloud services and big data solutions. The content of this paper can help you to uncover insights among client data so that you can take appropriate actions to optimize business results, product development, and scientific discoveries.

Introducing RavenDB: The Database for Modern Data Persistence

Simplify your first steps with the RavenDB NoSQL Document Database. This book takes a task-oriented approach by showing common problems, potential solutions, brief explanations of how those solutions work, and the mechanisms used. Based on real-world examples, the recipes in this book will show you how to solve common problems with Raven Query Language and will highlight reasons why RavenDB is a great choice for fast prototyping solutions that can sustain increasing amounts of data as your application grows. Introducing RavenDB includes code and query examples that address real-life challenges you’ll encounter when using RavenDB, helping you learn the basics of the Raven Query Language more quickly and efficiently. In many cases, you’ll be able to copy and paste the examples into your own code, making only minor modifications to suit your application. RavenDB supports many advanced features, such full-text search, graph queries, and timeseries; recipes in the latter portion of the book will help you understand those advanced features and how they might be applied to your own code and applications. After reading this book, you will be able to employ RavenDB’s powerful features in your own projects. What You Will Learn Set up and start working with RavenDB Model your objects for persistence in a NoSQL document database Write basic and advanced queries in the Raven Query Language Index your data using map/reduce techniques Implement techniques leading to highly performant systems Efficiently aggregate data and query on those aggregations Who This Book Is For Developers accustomed to relational databases who are about to enter a world of NoSQL databases. The book is also for experienced programmers who have used other non-relational databases and want to learn RavenDB. It will also prove useful for developers who want to move away from using Object-Relational Modeling frameworks and start working with a persistence solution that can store object graphs directly.
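
As a very rough sketch of the store-and-load workflow, the snippet below uses the official ravendb Python client. The server URL, database name, document class, and document ID are assumptions, and the client API may differ between versions, so treat this as a hint rather than a reference example from the book.

```python
# Hedged sketch: store and load one document with the ravendb Python client.
# URL, database, class, and document ID are placeholders; API details may vary
# between client versions.
from ravendb import DocumentStore


class Product:
    def __init__(self, name=None, price_per_unit=0.0):
        self.name = name
        self.price_per_unit = price_per_unit


store = DocumentStore(urls=["http://localhost:8080"], database="Demo")
store.initialize()

with store.open_session() as session:
    session.store(Product(name="Coffee", price_per_unit=4.5), "products/1-A")
    session.save_changes()

with store.open_session() as session:
    product = session.load("products/1-A", Product)
    print(product.name, product.price_per_unit)

# Equivalent RQL you could run in RavenDB Studio:
#   from Products where price_per_unit < 10
```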

SQL Server 2022 Revealed: A Hybrid Data Platform Powered by Security, Performance, and Availability

Know how to use the new capabilities and cloud integrations in SQL Server 2022. This book covers the many innovative integrations with the Azure Cloud that make SQL Server 2022 the most cloud-connected edition ever. The book covers cutting-edge features such as the blockchain-based Ledger for creating a tamper-evident record of changes to data over time that you can rely on to be correct and reliable. You'll learn about built-in Query Intelligence capabilities to help you upgrade with confidence that your applications will perform at least as fast after the upgrade as before. In fact, you'll probably see an increase in performance from the upgrade, with no code changes needed. Also covered are innovations such as contained availability groups and data virtualization with S3 object storage. New cloud integrations covered in this book include Microsoft Azure Purview and the use of Azure SQL for high availability and disaster recovery. The book covers Azure Synapse Link with its built-in capabilities to take changes and put them into Synapse automatically. Anyone building their career around SQL Server will want this book for the valuable information it provides on building SQL skills from the edge to the cloud. What You Will Learn Know how to use all of the new capabilities and cloud integrations in SQL Server 2022 Connect to Azure for disaster recovery, near real-time analytics, and security Leverage the Ledger to create a tamper-evident record of data changes over time Upgrade from prior releases and achieve faster and more consistent performance with no code changes Access data and storage in different and new formats, such as Parquet and S3, without moving the data and using your existing T-SQL skills Explore new application scenarios using innovations with T-SQL in areas such as JSON and time series Who This Book Is For SQL Server professionals who want to upgrade their skills to the latest edition of SQL Server; those wishing to take advantage of new integrations with Microsoft Azure Purview (governance), Azure Synapse (analytics), and Azure SQL (HA and DR); and those in need of the increased performance and security offered by Query Intelligence and the new Ledger
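
To ground the Ledger discussion, here is a hedged sketch that creates an append-only ledger table from Python via pyodbc, following the documented SQL Server 2022 LEDGER = ON syntax. The connection details, database, and table definition are assumptions for illustration.

```python
# Create an append-only ledger table in SQL Server 2022 via pyodbc.
# Server, credentials, database, and table definition are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sqlserver.example.com;"
    "DATABASE=LedgerDemo;"
    "UID=sa;PWD=your-password;"
    "TrustServerCertificate=yes;",
    autocommit=True,
)
cursor = conn.cursor()

create_sql = """
CREATE TABLE dbo.AccountAudit
(
    AuditId    INT IDENTITY(1,1) NOT NULL,
    AccountId  INT               NOT NULL,
    Action     NVARCHAR(50)      NOT NULL,
    ChangedAt  DATETIME2         NOT NULL DEFAULT SYSUTCDATETIME()
)
WITH (LEDGER = ON (APPEND_ONLY = ON));
"""
cursor.execute(create_sql)

# Append-only ledger tables accept inserts but block updates and deletes.
cursor.execute(
    "INSERT INTO dbo.AccountAudit (AccountId, Action) VALUES (?, ?);",
    42, "balance_update",
)
print("Append-only ledger table created and one row inserted.")
conn.close()
```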

Architecting Solutions with SAP Business Technology Platform

Gain a comprehensive understanding of SAP Business Technology Platform (SAP BTP) and its role in the intelligent enterprise. This book provides you with the knowledge and skills to design and implement effective architectural solutions. You'll explore integration strategies, extensibility options, and data processing methods to innovate and enhance your organization's SAP ecosystem. What this Book will help me do Architect enterprise solutions with SAP BTP to address key integration challenges. Leverage SAP BTP tools for process automation and effective solution extensibility. Understand non-functional requirements such as operability and security. Drive innovation by integrating SAP's intelligent technologies into your designs. Utilize SAP BTP to derive actionable insights from business data for value generation. Author(s) Serdar Simsekler and Eric Du are experienced professionals in the field of SAP architecture and technology. They bring years of expertise in building enterprise solutions leveraging the latest SAP innovations. Their approachable writing style aims to connect technical concepts with practical enterprise applications, ensuring readers can directly apply the knowledge gained. Who is it for? This book is intended for technical architects, solution architects, and enterprise architects who are working with or intending to adopt SAP Business Technology Platform. It is ideal for those seeking to enhance their understanding of SAP's solution ecosystem and deliver innovative systems. A foundational knowledge of IT systems and basic cloud concepts is assumed, as is familiarity with the SAP framework.

Red Hat OpenShift Container Platform for IBM zCX

Application modernization is essential for continuous improvements to your business value. Modernizing your applications includes improvements to your software architecture, application infrastructure, development techniques, and business strategies, all of which allow you to gain increased business value from existing application code. IBM® z/OS® Container Extensions (IBM zCX) is a part of the IBM z/OS operating system. It makes it possible to run Linux on IBM Z® applications that are packaged as Docker container images on z/OS. Application developers can develop, and data centers can operate, popular open source packages, Linux applications, IBM software, and third-party software together with z/OS applications and data. This IBM Redbooks® publication presents the capabilities of IBM zCX along with several use cases that demonstrate Red Hat OpenShift Container Platform for IBM zCX and the application modernization benefits your business can realize.

Neural Search - From Prototype to Production with Jina

Dive into the world of modern search systems with 'Neural Search - From Prototype to Production with Jina.' This book introduces you to the fundamentals of neural search, exploring how machine learning revolutionizes information retrieval. You'll gain hands-on experience building versatile, scalable search engines using Jina, unraveling the complexities of AI-powered searches. What this Book will help me do Understand the basics of neural search compared to traditional search methods. Develop mastery of vector representation and its application in neural search. Learn to utilize Jina for constructing AI-powered search engines. Enhance your capabilities to handle multi-modal search systems like text, images, and audio. Acquire the skills to deploy and optimize deep learning-powered search systems effectively. Author(s) Bo Wang, Cristian Mitroi, Feng Wang, Shubham Saboo, and Susana Guzmán are experienced technologists and AI researchers passionate about simplifying complex subjects like neural search. With their expertise in Jina and deep learning, their collaborative approach ensures practical, reader-friendly content that empowers learners to excel in creating cutting-edge search systems. Who is it for? This book is perfect for machine learning, AI, or Python developers eager to advance their understanding of neural search. Whether you're building text, image, or other modality-based search systems, it caters to beginners with foundational knowledge and extends to professionals wanting to deepen their skills. Unlock the potential of Jina for your projects.
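
The core idea behind vector representation in neural search can be shown without any specific framework: embed items as vectors and rank them by cosine similarity to the query. The toy embeddings below are invented and stand in for real model output; this is not Jina's API.

```python
# Toy illustration of vector-based retrieval: rank documents by cosine
# similarity to a query vector. The embeddings are made up; in practice
# they would come from a trained model.
import numpy as np

doc_embeddings = {
    "shoes for running":      np.array([0.9, 0.1, 0.0]),
    "trail running sneakers": np.array([0.8, 0.2, 0.1]),
    "cast iron skillet":      np.array([0.0, 0.1, 0.9]),
}
query_embedding = np.array([0.85, 0.15, 0.05])  # pretend embedding of "running shoes"


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


ranked = sorted(
    doc_embeddings.items(),
    key=lambda item: cosine(query_embedding, item[1]),
    reverse=True,
)
for text, _ in ranked:
    print(text)
```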

Proactive Early Threat Detection and Securing SQL Database With IBM QRadar and IBM Spectrum Copy Data Management Using IBM FlashSystem Safeguarded Copy

This IBM® blueprint publication focuses on early threat detection within a database environment by using IBM QRadar®. It also highlights how to proactively start a cyber resilience workflow in response to a cyberattack or potential malicious user actions. The workflow that is presented here uses IBM Spectrum® Copy Data Management as orchestration software to start IBM FlashSystem® Safeguarded Copy functions. The Safeguarded Copy creates an immutable copy of the data in an air-gapped form on the same IBM FlashSystem for isolation and eventual quick recovery. This document describes how to enable and forward SQL database user activities to IBM QRadar. This document also describes how to create various rules to determine a threat, and configure and start a suitable response to the detected threat in IBM QRadar. Finally, this document outlines the steps that are involved in creating a Scheduled Job by using IBM Spectrum® Copy Data Management with various actions.

What is New in DFSMSrmm

DFSMSrmm is an IBM z/OS feature that provides a fully functioning tape management system for your removable media. In the last decade, many enhancements were made to DFSMSrmm. This IBM Redbooks publication is intended to help you configure and use the newer functions and features that are now available. Discussion of the new features is included along with use cases. Hints and tips for various common DFSMSrmm problems, along with useful configuration and reporting JCL, are also included. This publication is intended as a supplement to DFSMSrmm Primer, SG24-5983, which is still the recommended starting point for any users new to DFSMSrmm.

Trino: The Definitive Guide, 2nd Edition

Perform fast interactive analytics against different data sources using the Trino high-performance distributed SQL query engine. In the second edition of this practical guide, you'll learn how to conduct analytics on data where it lives, whether it's a data lake using Hive, a modern lakehouse with Iceberg or Delta Lake, a different system like Cassandra, Kafka, or SingleStore, or a relational database like PostgreSQL or Oracle. Analysts, software engineers, and production engineers learn how to manage, use, and even develop with Trino and make it a critical part of their data platform. Authors Matt Fuller, Manfred Moser, and Martin Traverso show you how a single Trino query can combine data from multiple sources to allow for analytics across your entire organization. Explore Trino's use cases, and learn about tools that help you connect to Trino for querying and processing huge amounts of data Learn Trino's internal workings, including how to connect to and query data sources with support for SQL statements, operators, functions, and more Deploy and secure Trino at scale, monitor workloads, tune queries, and connect more applications Learn how other organizations apply Trino successfully
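
As a quick, hedged illustration of querying Trino from Python, the sketch below uses the trino client's DB-API interface to run a federated-style query. The coordinator host, catalogs, schemas, and table names are assumptions for illustration.

```python
# Query Trino from Python using the official trino client (DB-API interface).
# Host, catalogs, schemas, and table names are placeholders.
import trino

conn = trino.dbapi.connect(
    host="trino.example.com",  # assumed coordinator
    port=8080,
    user="analyst",
    catalog="hive",            # assumed default catalog
    schema="default",
)
cursor = conn.cursor()

# A federated join across two catalogs is Trino's signature trick; the
# postgresql catalog and both tables are assumptions for illustration.
cursor.execute("""
    SELECT o.order_id, c.customer_name
    FROM hive.default.orders AS o
    JOIN postgresql.public.customers AS c
      ON o.customer_id = c.customer_id
    LIMIT 10
""")
for row in cursor.fetchall():
    print(row)
```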