talk-data.com

Topic

data-engineering

3377

tagged

Activities

Showing filtered results

Filtering by: O'Reilly Data Engineering Books

Geospatial Data Science Quick Start Guide

"Geospatial Data Science Quick Start Guide" provides a practical and effective introduction to leveraging geospatial data in data science. In this book, you will learn techniques for analyzing location-based data, building intelligent models, and performing geospatial operations for various applications.
What this Book will help me do
• Understand the principles and techniques for analyzing geospatial data.
• Set up Python tools to work effectively with location intelligence.
• Perform advanced spatial operations such as geocoding and proximity analysis.
• Develop systems such as geofencing and location-based recommendation engines.
• Obtain actionable insights by visualizing and processing spatial data effectively.
Author(s)
Abdishakur Hassan and Jayakrishnan Vijayaraghavan are experts in geospatial analysis. With extensive experience in applying data science to location intelligence, they bring a practical and hands-on approach to coding, teaching, and problem-solving. They are passionate about sharing their knowledge through their clear explanations and structured learning paths.
Who is it for?
This book is ideal for data scientists interested in integrating geospatial analysis into their models and workflows. It is also suitable for GIS developers looking to enhance existing systems with advanced data analysis capabilities. Readers should have experience with Python and a basic understanding of data science concepts. If location-based data intrigues you, this book is your guide.

Learning Elastic Stack 7.0 - Second Edition

"Learning Elastic Stack 7.0" introduces you to the tools and techniques of Elastic Stack, covering Elasticsearch, Logstash, Beats, and Kibana. With clear explanations and practical examples, this book helps you grasp the 7.0 version's new features and capabilities, empowering you to build and deploy robust, real-time data processing applications.
What this Book will help me do
• Gain the necessary skills to install and configure Elastic Stack for professional use.
• Master the data handling capabilities of Elasticsearch for distributed search and analytics.
• Develop expertise in creating data pipelines with Logstash and other ingestion tools.
• Learn to utilize Kibana to visualize and interpret complex datasets.
• Acquire knowledge of deploying Elastic Stack solutions both on-premise and in cloud environments.
Author(s)
Pranav Shukla and Sharath Kumar M N are experienced software engineers and data professionals with a profound knowledge of databases, distributed systems, and cloud architectures. They specialize in educating developers through structured guidance and proven methodologies related to data handling and visualization.
Who is it for?
This book is designed for software engineers, data analysts, and technical architects interested in learning the Elastic Stack tools from the ground up. Readers familiar with database concepts but new to Elastic Stack will find this book particularly helpful. Advanced users seeking to understand the updates in Elastic Stack 7.0 are also a complementary audience. If you wish to apply Elastic Stack to real-time data processing and analytics, this book provides a strong foundation.

Mastering SAP ABAP

Mastering SAP ABAP guides you through learning and applying the powerful SAP ABAP programming language. You will start with foundational concepts of programming within SAP environments and progress towards advanced topics such as UI development with SAPUI5 and optimizing ABAP code performance.
What this Book will help me do
• Master the ABAP programming language, from fundamental constructs to advanced techniques.
• Learn to design and implement efficient and maintainable SAP applications.
• Gain expertise in creating modern UIs for SAP systems using SAPUI5.
• Understand performance optimization techniques for SAP ABAP programs.
• Acquire skills to handle exceptions and perform robust testing in ABAP.
Author(s)
The authors, Paweł Grzełkowiak, Philipp Deth, Wojciech Ciesielski, and Wojciech Łuźwik, are seasoned SAP technologists with years of practical experience in development and consulting. Their dedication to clarity and usefulness is evident in this book, where they share their collective expertise.
Who is it for?
This book is for SAP developers, both budding and experienced, who want to increase their efficiency in ABAP programming. Prior exposure to programming concepts and a desire to understand SAP-specific technologies are required prerequisites. Whether you are delving deeper into your career as an SAP developer or are aiming to bring new technical solutions to your organization, this guide is ideal for you.

Obtaining Value from Big Data for Service Systems, Volume I, 2nd Edition

This volume will assist readers in fitting big data analysis into their service-based organizations. Volume I of this two-volume series focuses on the role of big data in service delivery systems. It discusses the definition of and orientation to big data, its applications in service delivery systems, how to obtain results that can affect or enhance service delivery, and how to build an effective big data organization. It will also help readers understand how to improve the use of big data to enhance their service-oriented organizations.

Electronic Health Records with Epic and IBM FlashSystem 9100 Blueprint Version 2 Release 1

This information is intended to facilitate the deployment of IBM® FlashSystem for the Epic Corporation electronic health record (EHR) solution by describing the requirements and specifications for configuring IBM FlashSystem® 9100 and its parameters. The document also describes the steps that are required to configure the server that hosts the EHR application. To complete the tasks, you must have a working knowledge of IBM FlashSystem 9100 and Epic applications. The information in this document is distributed on an "as is" basis, without any warranty that is either expressed or implied. Support assistance for the use of this material is limited to situations where IBM FlashSystem storage devices are supported and entitled and where the issues are not specific to a blueprint implementation.

Pro Oracle SQL Development: Best Practices for Writing Advanced Queries

Write SQL statements that are more powerful, simpler, and faster using Oracle SQL and its full range of features. This book provides a clearer way of thinking about SQL by building sets, and provides practical advice for using complex features while avoiding anti-patterns that lead to poor performance and wrong results. Relevant theories, real-world best practices, and style guidelines help you get the most out of Oracle SQL. Pro Oracle SQL Development is for anyone who already knows Oracle SQL and is ready to take their skills to the next level. Many developers, analysts, testers, and administrators use Oracle databases frequently, but their queries are limited because they do not have the knowledge, experience, or right environment to help them take full advantage of Oracle’s advanced features. This book will inspire you to achieve more with your Oracle SQL statements through tips for creating your own style for writing simple, yet powerful, SQL. It teaches you how to think about and solve performance problems in Oracle SQL, and covers advanced topics and shows you how to become an Oracle expert.
What You'll Learn
• Understand the power of Oracle SQL and where to apply it
• Create a database development environment that is simple, scalable, and conducive to learning
• Solve complex problems that were previously solved in a procedural language
• Write large Oracle SQL statements that are powerful, simple, and fast
• Apply coding styles to make your SQL statements more readable
• Tune large Oracle SQL statements to eliminate and avoid performance problems
Who This Book Is For
Developers, testers, analysts, and administrators who want to harness the full power of Oracle SQL to solve their problems as simply and as quickly as possible. For traditional database professionals the book offers new ways of thinking about the language they have used for so long. For modern full stack developers the book explains how a database can be much more than simply a place to store data.

Loss Models, 5th Edition

A guide that provides in-depth coverage of modeling techniques used throughout many branches of actuarial science, revised and updated
Now in its fifth edition, Loss Models: From Data to Decisions puts the focus on material tested in the Society of Actuaries (SOA) newly revised Exams STAM (Short-Term Actuarial Mathematics) and LTAM (Long-Term Actuarial Mathematics). Updated to reflect these exam changes, this vital resource offers actuaries, and those aspiring to the profession, a practical approach to the concepts and techniques needed to succeed in the profession. The techniques are also valuable for anyone who uses loss data to build models for assessing risks of any kind. Loss Models contains a wealth of examples that highlight the real-world applications of the concepts presented, and puts the emphasis on calculations and spreadsheet implementation. With a focus on the loss process, the book reviews the essential quantitative techniques such as random variables, basic distributional quantities, and the recursive method, and discusses techniques for classifying and creating distributions. Parametric, non-parametric, and Bayesian estimation methods are thoroughly covered. In addition, the authors offer practical advice for choosing an appropriate model.
This important text:
• Presents a revised and updated edition of the classic guide for actuaries that aligns with newly introduced Exams STAM and LTAM
• Contains a wealth of exercises taken from previous exams
• Includes fresh and additional content related to the material required by the Society of Actuaries (SOA) and the Canadian Institute of Actuaries (CIA)
• Offers a solutions manual available for further insight, and all the data sets and supplemental material are posted on a companion site
Written for students and aspiring actuaries who are preparing to take the SOA examinations, Loss Models offers an essential guide to the concepts and techniques of actuarial science.

IBM GDPS Family: An Introduction to Concepts and Capabilities

This IBM® Redbooks® publication presents an overview of the IBM Geographically Dispersed Parallel Sysplex™ (IBM GDPS®) offerings and the roles they play in delivering a business IT resilience solution. The book begins with general concepts of business IT resilience and disaster recovery, along with issues related to high application availability, data integrity, and performance. These topics are considered within the framework of government regulation, increasing application and infrastructure complexity, and the competitive and rapidly changing modern business environment. Next, it describes the GDPS family of offerings with specific reference to how they can help you achieve your defined goals for disaster recovery and high availability. Also covered are the features that simplify and enhance data replication activities, the prerequisites for implementing each offering, and tips for planning for the future and immediate business requirements. Tables provide easy-to-use summaries and comparisons of the offerings. The extra planning and implementation services available from IBM also are explained. Then, several practical client scenarios and requirements are described, along with the most suitable GDPS solution for each case. The introductory chapters of this publication are intended for a broad technical audience, including IT System Architects, Availability Managers, Technical IT Managers, Operations Managers, System Programmers, and Disaster Recovery Planners. The subsequent chapters provide more technical details about the GDPS offerings, and each can be read independently for those readers who are interested in specific topics. Therefore, if you read all of the chapters, be aware that some information is intentionally repeated.

Learn T-SQL Querying

Dive into the world of T-SQL with 'Learn T-SQL Querying,' a book designed to enhance your database querying skills and help you master Microsoft's SQL Server and Azure SQL Database. Through this guide, you'll explore best practices, learn advanced techniques for analyzing execution plans, and create efficient T-SQL queries.
What this Book will help me do
• Understand the fundamentals of query optimization to write performant T-SQL queries.
• Analyze query execution plans to identify and troubleshoot performance issues effectively.
• Utilize dynamic management views and functions to monitor and optimize query performance.
• Implement features like Query Store to streamline troubleshooting and maintain performance changes.
• Avoid common T-SQL anti-patterns and embrace best practices to ensure scalable query design.
Author(s)
Pedro Lopes and Pam Lahoud bring years of expertise in SQL Server and database systems. Pedro has extensive experience as a database engineer, where he specializes in query processing and optimization. Pam has a deep understanding of T-SQL development, focusing on practical solutions. Together, they provide in-depth insights and actionable advice.
Who is it for?
This book is perfect for database administrators, database developers, and data analysts at any level looking to improve their T-SQL expertise. Beginners will gain foundational skills in T-SQL querying, while experienced professionals will find advanced strategies for optimizing SQL Server performance. Readers aiming to master both practical querying and troubleshooting will benefit the most.

PostgreSQL 11 Administration Cookbook

Discover practical solutions for administering PostgreSQL 11 databases in "PostgreSQL 11 Administration Cookbook." This recipe-style book provides actionable, step-by-step guidance for efficiently managing PostgreSQL databases, leveraging its features, and optimizing performance. You'll gain comprehensive knowledge to troubleshoot, maintain, and enhance enterprise database systems.
What this Book will help me do
• Understand and implement robust database backup and recovery techniques.
• Improve the performance of PostgreSQL solutions through expert tuning and diagnostics.
• Master high availability and replication strategies for PostgreSQL 11.
• Use hands-on recipes to enhance PostgreSQL security and user management.
• Learn efficient database management techniques for production environments.
Author(s)
Simon Riggs, an experienced database architect, along with co-authors Gianni Ciolli and Sudheer Kumar Meesala, brings years of PostgreSQL expertise to this book. Their collaborative effort ensures a practical yet comprehensive approach to PostgreSQL 11. With rich industry experience, they provide readers with valuable insights to address real-world database challenges.
Who is it for?
The ideal readers are database administrators, architects, or developers working with PostgreSQL databases. This book is perfect for professionals seeking actionable solutions to PostgreSQL 11 challenges. Prior PostgreSQL knowledge will enhance the learning experience and practical application. If managing and optimizing databases is your goal, this book is tailored for you.

IBM High-Performance Computing Insights with IBM Power System AC922 Clustered Solution

This IBM® Redbooks® publication documents and addresses topics to set up a complete infrastructure environment and tune the applications to use an IBM POWER9™ hardware architecture with the technical computing software stack. This publication is driven by a CORAL project solution. It explores, tests, and documents how to implement an IBM High-Performance Computing (HPC) solution on a POWER9 processor-based system by using IBM technical innovations to help solve challenging scientific, technical, and business problems. This book documents the HPC clustering solution with InfiniBand on IBM Power Systems™ AC922 8335-GTH and 8335-GTX servers with NVIDIA Tesla V100 SXM2 graphics processing units (GPUs) with NVLink, software components, and the IBM Spectrum™ Scale parallel file system. This solution includes recommendations about the components that are used to provide a cohesive clustering environment that includes job scheduling, parallel application tools, scalable file systems, administration tools, and a high-speed interconnect. This book is divided into three parts: Part 1 focuses on the planners of the solution, Part 2 focuses on the administrators, and Part 3 focuses on the developers. This book targets technical professionals (consultants, technical support staff, IT architects, and IT specialists) who are responsible for delivering cost-effective HPC solutions that help uncover insights among clients' data so that they can act to optimize business results, product development, and scientific discoveries.

IBM zPDT Guide and Reference

This IBM® Redbooks® publication provides both introductory information and technical details about the IBM System z® Personal Development Tool (IBM zPDT®), which produces a small System z environment suitable for application development. zPDT is a PC Linux application. When zPDT is installed (on Linux), normal System z operating systems (such as IBM z/OS®) can be run on it. zPDT provides the basic System z architecture and emulated IBM 3390 disk drives, 3270 interfaces, OSA interfaces, and so on. The systems that are discussed in this document are complex. They have elements of Linux (for the underlying PC machine), IBM z/Architecture® (for the core zPDT elements), System z I/O functions (for emulated I/O devices), z/OS (the most common System z operating system), and various applications and subsystems under z/OS. The reader is assumed to be familiar with general concepts and terminology of System z hardware and software elements, and with basic PC Linux characteristics. This book provides the primary documentation for zPDT.

Data Architecture: A Primer for the Data Scientist, 2nd Edition

Over the past 5 years, the concept of big data has matured, data science has grown exponentially, and data architecture has become a standard part of organizational decision-making. Throughout all this change, the basic principles that shape the architecture of data have remained the same. There remains a need for people to take a look at the "bigger picture" and to understand where their data fit into the grand scheme of things. Data Architecture: A Primer for the Data Scientist, Second Edition addresses the larger architectural picture of how big data fits within the existing information infrastructure or data warehousing systems. This is an essential topic not only for data scientists, analysts, and managers but also for researchers and engineers who increasingly need to deal with large and complex sets of data. Until data are gathered and can be placed into an existing framework or architecture, they cannot be used to their full potential. Drawing upon years of practical experience and using numerous examples and case studies from across various industries, the authors seek to explain this larger picture into which big data fits, giving data scientists the necessary context for how pieces of the puzzle should fit together.
• New case studies include expanded coverage of textual management and analytics
• New chapters on visualization and big data
• Discussion of new visualizations of the end-state architecture

Elasticsearch 7.0 Cookbook - Fourth Edition

"Elasticsearch 7.0 Cookbook" is a practical guide to effectively using Elasticsearch, packed with over 100 recipes that cover everything from simple setup tasks to advanced query creation. Whether you're deploying Elasticsearch nodes or integrating with various technologies, this book will empower you to make the most out of Elasticsearch's robust search capabilities.
What this Book will help me do
• Understand how to efficiently deploy and manage Elasticsearch architectures within your enterprise.
• Learn to create and optimize queries for effective analytics and data retrieval.
• Explore advanced indexing and mapping techniques to enhance data searchability.
• Monitor and scale your Elasticsearch clusters to ensure optimal performance.
• Integrate Elasticsearch with programming languages and big data applications.
Author(s)
Alberto Paro, a seasoned Elasticsearch expert, brings years of experience in designing and implementing large-scale search and analytics solutions. His practical experience in guiding teams through complex Elasticsearch deployments is evident in his clear and solution-focused writing approach. Alberto's passion for technology drives his mission to make advanced technical topics accessible.
Who is it for?
This book is ideal for software engineers, data professionals, and Elasticsearch developers who are looking to expand their technical capabilities in search and data analytics. It is also suited for individuals in industries like e-commerce utilizing Elastic for insights. A basic understanding of Elasticsearch will allow readers to gain deeper value from this book.

Fifty Years of Data Management and Beyond

Every decade since the 1960s, researchers at companies like IBM, Amazon, and many others have introduced major new frameworks and techniques to handle rising data management problems. This concise ebook explains how these new systems helped data science evolve quickly, from hierarchical and relational databases to big data and cloud computing to streaming and graph data. Computer scientist Paco Nathan shows members of your data science team how major companies created each of these data management systems not just to deal with new data types but also to take full advantage of the opportunities the data presented. Their efforts over the years have propelled an entire industry.
This report covers the historical progression of data management topics including:
• Hierarchical databases: 1960s mainframe batch systems are still used in finance, healthcare, manufacturing, energy, and other industries.
• Relational databases: these enabled faster transactions, mathematical optimization, and budgeting guarantees for many businesses.
• Big data: this includes relatively cheap horizontal scale-out systems for collecting huge amounts of customer data.
• Cloud computing: large companies began managing reliable, scalable, cost-effective data centers; Amazon turned the concept into a business.
• Cluster schedulers: managing horizontal clusters was difficult before schedulers such as Apache Mesos appeared.
• Streaming data: data continuously generated by different sources requires responses in "real time," generally milliseconds.

SQL All-In-One For Dummies, 3rd Edition

The latest on SQL databases
SQL All-In-One For Dummies, 3rd Edition, is a one-stop shop for everything you need to know about SQL and SQL-based relational databases. Everyone from database administrators to application programmers and the people who manage them will find clear, concise explanations of the SQL language and its many powerful applications. With the ballooning amount of data out there, more and more businesses, large and small, are moving from spreadsheets to SQL databases like Access, Microsoft SQL Server, Oracle databases, MySQL, and PostgreSQL. This compendium of information covers designing, developing, and maintaining these databases.
• Cope with any issue that arises in SQL database creation and management
• Get current on the newest SQL updates and capabilities
• Reference information on querying SQL-based databases in the SQL language
• Understand relational databases and their importance to today’s organizations
SQL All-In-One For Dummies is a timely update to the popular reference for readers who want detailed information about SQL databases and queries.

IBM Spectrum Archive Enterprise Edition V1.2.6 Installation and Configuration Guide

Note: This is a republication of IBM Spectrum Archive Enterprise Edition V1.2.6: Installation and Configuration Guide with new book number SG24-8445 to keep the content available on the Internet along with the recent publication IBM Spectrum Archive Enterprise Edition V1.3.0: Installation and Configuration Guide, SG24-8333.
This IBM® Redbooks® publication helps you with the planning, installation, and configuration of the new IBM Spectrum™ Archive V1.2.6 for the IBM TS3310, IBM TS3500, IBM TS4300, and IBM TS4500 tape libraries. IBM Spectrum Archive™ EE enables the use of the LTFS for the policy management of tape as a storage tier in an IBM Spectrum Scale™ based environment. It helps encourage the use of tape as a critical tier in the storage environment. This is the sixth edition of IBM Spectrum Archive Installation and Configuration Guide. IBM Spectrum Archive EE can run any application that is designed for disk files on physical tape media. IBM Spectrum Archive EE supports the IBM Linear Tape-Open (LTO) Ultrium 8, 7, 6, and 5 tape drives in IBM TS3310, TS3500, TS4300, and TS4500 tape libraries. In addition, IBM TS1155, TS1150, and TS1140 tape drives are supported in TS3500 and TS4500 tape library configurations. IBM Spectrum Archive EE can play a major role in reducing the cost of storage for data that does not need the access performance of primary disk. The use of IBM Spectrum Archive EE to replace disks with physical tape in tier 2 and tier 3 storage can improve data access over other storage solutions because it improves efficiency and streamlines management for files on tape. IBM Spectrum Archive EE simplifies the use of tape by making it transparent to the user and manageable by the administrator under a single infrastructure. This publication is intended for anyone who wants to understand more about IBM Spectrum Archive EE planning and implementation. This book is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

IBM TS7700 Release 4.2 Guide

This IBM® Redbooks® publication covers IBM TS7700 R4.2. The IBM TS7700 is part of a family of IBM Enterprise tape products. This book is intended for system architects and storage administrators who want to integrate their storage systems for optimal operation. Building on over 20 years of virtual tape experience, the TS7760 now supports the ability to store virtual tape volumes in an object store. The TS7700 has supported off loading to physical tape for over two decades. Off loading to physical tape behind a TS7700 is utilized by hundreds of organizations around the world. Using the same hierarchical storage techniques, the TS7700 can also off load to object storage. Given object storage is cloud based and accessible from different regions, the TS7760 Cloud Storage Tier support essentially allows the cloud to be an extension of the grid. As of the release of this document, the TS7760C supports the ability to off load to IBM Cloud Object Storage as well as Amazon S3. To learn about the TS7760 cloud storage tier function, planning, implementation, best practices, and support see IBM Redpaper IBM TS7760 R4.2 Cloud Storage Tier Guide, redp-5514 at: http://www.redbooks.ibm.com/abstracts/redp5514.html The IBM TS7700 offers a modular, scalable, and high-performance architecture for mainframe tape virtualization for the IBM Z® environment. It is a fully integrated, tiered storage hierarchy of disk and tape. This storage hierarchy is managed by robust storage management microcode with extensive self-management capability. 
It includes the following advanced functions:
• Improved reliability and resiliency
• Reduction in the time that is needed for the backup and restore process
• Reduction of services downtime that is caused by physical tape drive and library outages
• Reduction in cost, time, and complexity by moving primary workloads to virtual tape
• More efficient procedures for managing daily backup and restore processing
• Infrastructure simplification through reduction of the number of physical tape libraries, drives, and media
TS7700 delivers the following new capabilities:
• TS7760C supports the ability to off load to IBM Cloud Object Storage as well as Amazon S3
• 8-way Grid Cloud consisting of any generation of TS7700
• Synchronous and asynchronous replication
• Tight integration with IBM Z and DFSMS policy management
• Optional Transparent Cloud Tiering
• Optional integration with physical tape
• Cumulative 16Gb FICON throughput up to 4.8GB/s 8 IBM Z hosts view up to 496 8 equivalent devices
• Grid access to all data independent of where it exists
The TS7760T writes data by policy to physical tape through attachment to high-capacity, high-performance IBM TS1150 and IBM TS1140 tape drives installed in an IBM TS4500 or TS3500 tape library. The TS7760 models are based on high-performance and redundant IBM POWER8® technology. They provide improved performance for most IBM Z tape workloads when compared to the previous generations of IBM TS7700.

IBM DS8880 Encryption for data at rest and Transparent Cloud Tiering (DS8000 Release 8.5)

Updated for Release 8.5.
IBM experts recognize the need for data protection, both from hardware or software failures and from physical relocation of hardware, theft, and retasking of existing hardware. The IBM DS8880 supports encryption-capable hard disk drives (HDDs) and flash drives. These Full Disk Encryption (FDE) drive sets are used with key management services that are provided by IBM Security Key Lifecycle Manager software or Gemalto SafeNet KeySecure to allow encryption for data at rest on a DS8880. Use of encryption technology involves several considerations that are critical for you to understand to maintain the security and accessibility of encrypted data. The IBM Security Key Lifecycle Manager software also supports Transparent Cloud Tiering (TCT) data object encryption, which is part of this publication. With TCT encryption, data is encrypted before it is transmitted to the Cloud. The data remains encrypted in cloud storage and is decrypted after it is transmitted back to the DS8000®. This IBM Redpaper™ publication contains information that can help storage administrators plan for disk and TCT data object encryption. It also explains how to install and manage the encrypted storage and how to comply with IBM requirements for using the IBM DS8000 encrypted disk storage system. This edition focuses on IBM Security Key Lifecycle Manager Version 3.0, which enables Key Management Interoperability Protocol (KMIP) support with the DS8000 Release 8.5 code or later and an updated GUI for encryption functions. The publication also discusses support for data at rest encryption with Gemalto SafeNet KeySecure Version 8.3.2.

IBM Storage Solutions for IBM Cloud Private Blueprint

IBM Storage Solutions for IBM Cloud™ Private delivers a blueprint for multicloud architecture. IBM, delivering solutions to help you win.
In this blueprint, learn how to:
• Combine the benefits of IBM Systems with the performance of IBM Storage solutions so that you can deliver the right services to your clients today.
• Deliver optimized private cloud services ahead of schedule and under budget with a complete IBM Cloud Private stack.
• Containerize applications and deliver the SLAs that your team needs to thrive and win.
• Implement IBM Cloud Private to deploy modern applications like blockchain and AI or modernize what you already have.
You now have the capabilities.
This edition applies to IBM Storage Solutions for IBM Cloud Private Version 1 Release 5.0.