talk-data.com

Event

O'Reilly Data Engineering Books

2001-10-19 – 2027-05-25 · O'Reilly

Activities tracked

3432

Collection of O'Reilly books on Data Engineering.

Sessions & talks

Showing 676–700 of 3432 · Newest first

MySQL 8 Query Performance Tuning: A Systematic Method for Improving Execution Speeds

Identify, analyze, and improve poorly performing queries that damage user experience and lead to lost revenue for your business. This book will help you make query tuning an integral part of your daily routine through a multi-step process that includes monitoring of execution times, identifying candidate queries for optimization, analyzing their current performance, and improving them to deliver results faster and with less overhead. Author Jesper Krogh systematically discusses each of these steps along with the data sources and the tools used to perform them. MySQL 8 Query Performance Tuning aims to help you improve query performance using a wide range of strategies. You will know how to analyze queries using both the traditional EXPLAIN command as well as the new EXPLAIN ANALYZE tool. You also will see how to use the Visual Explain feature to provide a visually-oriented view of an execution plan. Coverage of indexes includes indexing strategies and index statistics, and you will learn how histograms can be used to provide input on skewed data distributions that the optimizer can use to improve query performance. You will learn about locks, and how to investigate locking issues. And you will come away with an understanding of how the MySQL optimizer works, including the new hash join algorithm, and how to change the optimizer’s behavior when needed to deliver faster execution times. You will gain the tools and skills needed to delight application users and to squeeze the most value from corporate computing resources. 
What You Will Learn: Monitor query performance to identify poor performers; choose queries to optimize that will provide the greatest gain; analyze queries using tools such as EXPLAIN ANALYZE and Visual Explain; improve slow queries through a wide range of strategies; properly deploy indexes and histograms to aid in creating fast execution plans; and understand and analyze locks to resolve contention and increase throughput.
Who This Book Is For: Database administrators and SQL developers who are familiar with MySQL and need to participate in query tuning. While some experience with MySQL is required, no prior knowledge of query performance tuning is needed.

PostgreSQL Configuration: Best Practices for Performance and Security

Obtain all the skills you need to configure and manage a PostgreSQL database. In this book you will begin by installing and configuring PostgreSQL on a server, focusing on system-level parameter settings before installation. You will also look at key post-installation steps to avoid issues in the future. The basic configuration of PostgreSQL is tuned for compatibility rather than performance. Keeping this in mind, you will fine-tune your PostgreSQL parameters based on your environment and application behavior. You will then get tips to improve database monitoring and maintenance, followed by database security for handling sensitive data in PostgreSQL. Every system containing valuable data needs to be backed up regularly. PostgreSQL follows a simple backup procedure and provides fundamental approaches to back up your data. You will go through these approaches and choose the right one based on your environment. Running your application with limited resources can be tricky. To achieve this you will implement a pooling mechanism for your PostgreSQL instances to connect to other databases. Finally, you will take a look at some basic errors faced while working with PostgreSQL and learn to resolve them in the quickest manner.
What You Will Learn: Configure PostgreSQL for performance; monitor and maintain PostgreSQL instances; implement a backup strategy for your data; and resolve errors faced while using PostgreSQL.
Who This Book Is For: Readers with basic knowledge of PostgreSQL who wish to implement key solutions based on their environment.

IBM DS8900F Product Guide

Built on over 50 years of Enterprise storage expertise, the IBM® DS8000® series is the flagship of disk storage systems within the IBM System Storage™ portfolio. As of September 2019, the DS8900F is the latest addition and offers two new classes: the DS8910F (Flexibility Class all-flash) and the DS8950F (Agility Class all-flash). The Agility Class is efficiently designed to consolidate all your mission-critical workloads for IBM Z®, IBM LinuxONE, IBM Power Systems, and distributed environments under a single all-flash storage solution. This IBM Redbooks® Product Guide gives an overview of the features and functions that are available with the IBM DS8900F models running microcode Release 9.0 (Bundle 89.0 / Licensed Machine Code 7.9.0.xxx).

IBM ZPDT Guide and Reference

This IBM® Redbooks® publication provides both introductory information and technical details about the IBM System z® Personal Development Tool (IBM zPDT®), which produces a small System z environment suitable for application development. zPDT is a PC Linux application. When zPDT is installed (on Linux), normal System z operating systems (such as IBM z/OS®) can be run on it. zPDT provides the basic System z architecture and emulated IBM 3390 disk drives, 3270 interfaces, OSA interfaces, and so on. The systems that are discussed in this document are complex. They have elements of Linux (for the underlying PC machine), IBM z/Architecture® (for the core zPDT elements), System z I/O functions (for emulated I/O devices), z/OS (the most common System z operating system), and various applications and subsystems under z/OS. The reader is assumed to be familiar with general concepts and terminology of System z hardware and software elements, and with basic PC Linux characteristics. This book provides the primary documentation for zPDT.

SQL Server 2019 Administration Inside Out

Conquer SQL Server 2019 administration from the inside out. Dive into SQL Server 2019 administration and really put your SQL Server DBA expertise to work. This supremely organized reference packs hundreds of timesaving solutions, tips, and workarounds: all you need to plan, implement, manage, and secure SQL Server 2019 in any production environment, whether on-premises, cloud, or hybrid. Six experts thoroughly tour DBA capabilities available in SQL Server 2019 Database Engine, SQL Server Data Tools, SQL Server Management Studio, PowerShell, and Azure Portal. You’ll find extensive new coverage of Azure SQL, big data clusters, PolyBase, data protection, automation, and more. Discover how experts tackle today’s essential tasks, and challenge yourself to new levels of mastery.
Topics covered: Explore SQL Server 2019’s toolset, including the improved SQL Server Management Studio, Azure Data Studio, and Configuration Manager; design, implement, manage, and govern on-premises, hybrid, or Azure database infrastructures; install and configure SQL Server on Windows and Linux; master modern maintenance and monitoring with extended events, Resource Governor, and the SQL Assessment API; automate tasks with maintenance plans, PowerShell, Policy-Based Management, and more; plan and manage data recovery, including hybrid backup/restore, Azure SQL Database recovery, and geo-replication; use availability groups for high availability and disaster recovery; protect data with Transparent Data Encryption, Always Encrypted, new Certificate Management capabilities, and other advances; optimize databases with SQL Server 2019’s advanced performance and indexing features; provision and operate Azure SQL Database and its managed instances; and move SQL Server workloads to Azure, covering planning, testing, migration, and post-migration.

IBM Power Systems Virtualization Operation Management for SAP Applications

Businesses are using IBM® Power Systems servers and Linux to consolidate multiple SAP workloads onto fewer systems, increasing infrastructure utilization; reliability, availability, and serviceability (RAS); and scalability, and reducing cost. This IBM Redpaper Redbooks publication describes key hardware and software components of an SAP solution stack. Furthermore, this book addresses non-functional items like RAS, security, and issue handling. Practical help for planning, implementation, configuration, installation, and monitoring of a solution stack is provided. This publication addresses topics for sellers, IT architects, IT specialists, and anyone who wants to implement and manage SAP workloads on IBM Power Systems servers. Moreover, this guide provides documentation to transfer how-to skills to the technical teams, and it provides solution guidance to the sales team. This publication complements documentation that is available at IBM Knowledge Center, and it aligns with educational materials that are provided by IBM Systems.

Implementing and Managing a High-performance Enterprise Infrastructure with Nutanix on IBM Power Systems

This IBM® Redbooks® publication describes how to implement and manage a hyperconverged private cloud solution by using theoretical knowledge and hands-on exercises, and by documenting the findings by way of sample scenarios. This book also is a guide about how to implement and manage a high-performance enterprise infrastructure and private cloud platform for big data, artificial intelligence, and transactional and analytics workloads on IBM Power Systems. This book uses available documentation, hardware, and software resources to meet the following goals: document the web-scale architecture that demonstrates the simple and agile nature of public clouds; showcase the hyperconverged infrastructure to help cloud native applications mine cognitive analytics workloads; conduct and document implementation case studies; and document guidelines to help provide an optimal system configuration, implementation, and management. This publication addresses topics for developers, IT architects, IT specialists, sellers, and anyone who wants to implement and manage a high-performance enterprise infrastructure and private cloud platform on IBM Power Systems. This book also provides documentation to transfer the how-to skills to the technical teams, and solution guidance to the sales team. This book complements any documentation that is available in IBM Knowledge Center, and aligns with the educational materials that are provided by the IBM Systems Software Education (SSE).

Software Engineering at Google

Today, software engineers need to know not only how to program effectively but also how to develop proper engineering practices to make their codebase sustainable and healthy. This book emphasizes this difference between programming and software engineering. How can software engineers manage a living codebase that evolves and responds to changing requirements and demands over the length of its life? Based on their experience at Google, software engineers Titus Winters and Hyrum Wright, along with technical writer Tom Manshreck, present a candid and insightful look at how some of the world's leading practitioners construct and maintain software. This book covers Google's unique engineering culture, processes, and tools and how these aspects contribute to the effectiveness of an engineering organization. You'll explore three fundamental principles that software organizations should keep in mind when designing, architecting, writing, and maintaining code: how time affects the sustainability of software and how to make your code resilient over time; how scale affects the viability of software practices within an engineering organization; and what trade-offs a typical engineer needs to make when evaluating design and development decisions.

PostgreSQL 12 High Availability Cookbook - Third Edition

The 'PostgreSQL 12 High Availability Cookbook' is a comprehensive guide to setting up and maintaining highly available PostgreSQL clusters. This book provides practical recipes for designing a resilient database system that can handle outages and recover quickly without downtime.
What this book will help me do: Learn how to configure replication tools to protect PostgreSQL data effectively. Understand and implement hardware strategies for ensuring optimal database performance. Master the techniques for reducing contention with connections using pooling strategies. Gain insights into using monitoring tools like Nagios and Grafana for PostgreSQL cluster management. Develop a robust strategy for version upgrades, backups, and failover.
Author(s): Shaun Thomas is a seasoned database specialist with extensive experience managing PostgreSQL systems. As a PostgreSQL contributor and advocate, he brings a depth of practical knowledge to database reliability and automation. Shaun's engaging and clear writing style ensures that readers can apply the discussed techniques with confidence.
Who is it for? This book is ideal for database administrators, IT professionals, and developers who maintain PostgreSQL systems and want to improve uptime or reliability. Familiarity with basic PostgreSQL concepts is recommended, but no specific knowledge of version 12 features is required. Readers aiming to build advanced high availability solutions will find this book invaluable. It's perfect for those aspiring to ensure their database systems are both resilient and adaptive.

Next-Generation Machine Learning with Spark: Covers XGBoost, LightGBM, Spark NLP, Distributed Deep Learning with Keras, and More

Access real-world documentation and examples for the Spark platform for building large-scale, enterprise-grade machine learning applications. The past decade has seen an astonishing series of advances in machine learning. These breakthroughs are disrupting our everyday life and making an impact across every industry. Next-Generation Machine Learning with Spark provides a gentle introduction to Spark and Spark MLlib and advances to more powerful, third-party machine learning algorithms and libraries beyond what is available in the standard Spark MLlib library. By the end of this book, you will be able to apply your knowledge to real-world use cases through dozens of practical examples and insightful explanations.
What You Will Learn: Be introduced to machine learning, Spark, and Spark MLlib 2.4.x; achieve lightning-fast gradient boosting on Spark with the XGBoost4J-Spark and LightGBM libraries; detect anomalies with the Isolation Forest algorithm for Spark; use the Spark NLP and Stanford CoreNLP libraries that support multiple languages; optimize your ML workload with the Alluxio in-memory data accelerator for Spark; use GraphX and GraphFrames for graph analysis; perform image recognition using convolutional neural networks; and utilize the Keras framework and distributed deep learning libraries with Spark.
Who This Book Is For: Data scientists and machine learning engineers who want to take their knowledge to the next level and use Spark and more powerful, next-generation algorithms and libraries beyond what is available in the standard Spark MLlib library. It also serves as a primer for aspiring data scientists and engineers who need an introduction to machine learning, Spark, and Spark MLlib.

Multicloud Storage as a Service using vRealize Automation and IBM Spectrum Storage

This document is intended to facilitate the deployment of the Multicloud Solution for Business Continuity and Storage as a Service by using IBM Spectrum Virtualize for Public Cloud on Amazon Web Services (AWS). To complete the tasks it describes, you must understand IBM FlashSystem 9100, IBM Spectrum Virtualize for Public Cloud, IBM Spectrum Connect, VMware vRealize Orchestrator, vRealize Automation, and AWS Cloud. The information in this document is distributed on an "as is" basis without any warranty that is either expressed or implied. Support assistance for the use of this material is limited to situations where IBM Storwize or IBM FlashSystem storage devices are supported and entitled and where the issues are specific to a blueprint implementation.

SAP on Azure Implementation Guide

SAP on Azure Implementation Guide is your essential companion for transitioning your SAP infrastructure to Microsoft Azure. The book takes a practical and detailed approach, providing step-by-step guidance to help you leverage Azure for migrating, scaling, and transforming your SAP solutions effectively.
What this book will help me do: Understand and implement different SAP to Azure migration strategies, such as lift-and-shift and database transformations. Learn to ensure high availability and scalability for your SAP systems using Azure's capabilities. Gain insight into securing SAP workloads on Azure for compliance and safety. Achieve operational excellence by leveraging cloud-native features of Azure for SAP. Acquire the skills to optimize SAP infrastructure on Azure for enhanced business value.
Author(s): Nick Morgan and Bartosz Jarkowski are experienced consultants with extensive knowledge of SAP systems and cloud implementations. With backgrounds in designing and deploying SAP on cloud platforms, they have a thorough understanding of transitioning business-critical applications to modern infrastructures. They bring a wealth of practical experience to this comprehensive guide.
Who is it for? This book is ideal for SAP architects and IT professionals who are looking to migrate their SAP infrastructures to Azure. Whether you are moderately familiar with SAP or an experienced architect evaluating advanced migration strategies, you'll find the information in this guide precise and actionable to help you achieve your objectives.

IBM TS7700 Series DS8000 Object Store User's Guide Version 1.0

The IBM® TS7700 features a functional enhancement that allows for the TS7700 to act as an object store for transparent cloud tiering with IBM DS8000® (DS8K) and DFSMShsm (HSM). This function can be used to move datasets directly from DS8000 to TS7700. This IBM Redpaper™ publication describes the client value, and how DFSMShsm, DS8000, and TS7700 are set up to enable and use the function.

Achieving Hybrid Cloud Cyber Resiliency with IBM Spectrum Virtualize for Public Cloud

This document is intended to help you implement the Cyber Resiliency solution for IBM® Spectrum Virtualize for Public Cloud. This solution is designed to protect the data on IBM Spectrum™ Virtualize storage in a hybrid multicloud environment by deploying cloud backup to Amazon S3 using the Transparent Cloud Tiering function.

Practical Oracle SQL: Mastering the Full Power of Oracle Database

Write powerful queries using as much of the feature-rich Oracle SQL language as possible, progressing beyond the simple queries of basic SQL as standardized in SQL-92. Both standard SQL and Oracle’s own extensions to the language have progressed far over the decades in terms of how much you can work with your data in a single, albeit sometimes complex, SQL statement. If you already know the basics of SQL, this book provides many examples of how to write even more advanced SQL to huge benefit in your applications, such as: pivoting rows to columns and columns to rows; recursion in SQL with MODEL and WITH clauses; answering Top-N questions; forecasting with linear regressions; row pattern matching to group or distribute rows; and using MATCH_RECOGNIZE as a row processing engine. The process of starting from simpler statements in SQL, and gradually working those statements stepwise into more complex statements that deliver powerful results, is covered in each example. By trying out the recipes and examples for yourself, you will put together the building blocks into powerful SQL statements that will make your application run circles around your competitors.
What You Will Learn: Take full advantage of advanced and modern features in Oracle SQL; recognize when modern SQL constructs can help create better applications; improve SQL query building skills through stepwise refinement; apply set-based thinking to process more data in fewer queries; make cross-row calculations with analytic functions; search for patterns across multiple rows using row pattern matching; and break complex calculations into smaller steps with subquery factoring.
Who This Book Is For: Oracle Database developers who already know some SQL, but rarely use features of the language beyond the SQL-92 standard, and developers who would like to apply the more modern features of Oracle SQL, but don’t know where to start. The book also is for those who want to write increasingly complex queries in a stepwise and understandable manner. Experienced developers will use the book to develop more efficient queries using the advanced features of the Oracle SQL language.

Understanding Log Analytics at Scale

If enabled, logging captures almost every system process, event, or message in your software or hardware. But once you have all that data, what do you do with it? This report shows you how to use log analytics (the process of gathering, correlating, and analyzing that information) to drive critical business insights and outcomes. Drawing on real-world use cases, Matt Gillespie outlines the opportunities for log analytics and the challenges you may face, along with approaches for meeting them. Data architects and IT and infrastructure leads will learn the mechanics of log analytics and key architectural considerations for data storage. The report also offers nine key guideposts that will help you plan and design your own solutions to obtain the full value from your log data. In this report, you will: learn the current state of log analytics and common challenges; see how log analytics is helping organizations achieve better business outcomes in areas such as cybersecurity, IT operations, and industrial automation; explore tools for log analytics, including Splunk, the Elastic stack, and Sumo Logic; and understand the role storage plays in ensuring successful outcomes.

Implementing a VersaStack Solution by Cisco and IBM with IBM FlashSystem 5030, Cisco UCS Mini, Hyper-V, and SQL Server

VersaStack, an IBM® and Cisco integrated infrastructure solution, combines computing, networking, and storage into a single integrated system. It combines the Cisco Unified Computing System (Cisco UCS) Integrated Infrastructure with IBM Spectrum Virtualize™, which includes IBM FlashSystem® storage offerings, for quick deployment and rapid time to value for the implementation of modern infrastructures. This IBM Redbooks® publication covers the preferred practices for implementing a VersaStack Solution with IBM FlashSystem 5030, Cisco UCS Mini, Hyper-V 2016, and Microsoft SQL Server. Cisco UCS Mini is optimized for branch and remote offices, point-of-sale locations, and smaller IT environments. It is the ideal solution for customers who need fewer servers but still want the comprehensive management capabilities provided by Cisco UCS Manager. The IBM FlashSystem 5030 delivers efficient, entry-level configurations that are designed to meet the needs of small and midsize businesses. Designed to provide organizations with the ability to consolidate and share data at an affordable price, the IBM FlashSystem 5030 offers advanced software capabilities such as clustering, IBM Easy Tier®, replication and snapshots that are found in more expensive systems. This book is intended for pre-sales and post-sales technical support professionals and storage administrators who are tasked with deploying a VersaStack solution with Hyper-V 2016 and Microsoft SQL Server.

Implementing VersaStack with Cisco ACI Multi-Pod and IBM HyperSwap for High Availability

The IBM HyperSwap® high availability (HA) function allows business continuity in the event of a hardware failure, power failure, connectivity failure, or disaster, such as fire or flooding. It is available on the IBM SAN Volume Controller and IBM FlashSystem products. This IBM Redbooks publication covers the preferred practices for implementing Cisco VersaStack with IBM HyperSwap. The following are some of the topics covered in this book: Cisco Application Centric Infrastructure, to showcase Cisco's ACI with Nexus 9Ks; Cisco Fabric Interconnects and Unified Computing System (UCS) management capabilities; Cisco Multilayer Director Switch (MDS), to showcase fabric channel connectivity; the overall IBM HyperSwap solution architecture; differences between HyperSwap and Metro Mirroring, Volume Mirroring, and Stretch Cluster; and a multisite IBM SAN Volume Controller (SVC) deployment, to showcase HyperSwap configuration and capabilities. This book is intended for pre-sales and post-sales technical support professionals and storage administrators who are tasked with deploying a VersaStack solution with IBM HyperSwap.

Pro T-SQL 2019: Toward Speed, Scalability, and Standardization for SQL Server Developers

Design and write simple and efficient T-SQL code in SQL Server 2019 and beyond. Writing T-SQL that pulls back correct results can be challenging. This book provides the help you need in writing T-SQL that performs fast and is easy to maintain. You also will learn how to implement version control, testing, and deployment strategies. Hands-on examples show modern T-SQL practices and provide straightforward explanations. Attention is given to selecting the right data types and objects when designing T-SQL solutions. Author Elizabeth Noble teaches you how to improve your T-SQL performance through good design practices that benefit programmers and ultimately the users of the applications. You will know the common pitfalls of writing T-SQL and how to avoid those pitfalls going forward.
What You Will Learn: Choose correct data types and database objects when designing T-SQL; write T-SQL that searches data efficiently and uses hardware effectively; implement source control and testing methods to streamline the deployment process; design T-SQL that can be enhanced or modified with less effort; and plan for long-term data management and storage.
Who This Book Is For: Database developers who want to improve the efficiency of their applications, and developers who want to solve complex query and data problems more easily by writing T-SQL that performs well, brings back correct results, and is easy for other developers to understand and maintain.

DS8000 4-Site Replication with IBM Copy Services Manager

This IBM® Redpaper publication helps you design and implement a 4-site replication solution for IBM DS8000® environments. IBM Copy Services Manager is used to orchestrate the data replication and failover and failback mechanisms between the different sites. The IBM DS8000 Copy Services functions are the foundation of this 4-site replication solution. The four sites consist of two pairs of sites. Within each pair, the two sites are at metro distance, while the pairs connect at long distance over asynchronous links. The solution is based on a Multi-Target PPRC topology that consists of a Metro Mirror replication to the secondary site in the local region and a Global Mirror replication to the third site in the remote region. The fourth site is set up as a cascaded Global Copy, which is an asynchronous copy.

IBM Spectrum Archive Enterprise Edition V1.3.0.6: Installation and Configuration Guide

This IBM® Redbooks® publication helps you with the planning, installation, and configuration of the new IBM Spectrum® Archive v1.3.0.6 for the IBM TS4500, IBM TS3500, IBM TS4300, and IBM TS3310 tape libraries. IBM Spectrum Archive EE enables the use of the LTFS for the policy management of tape as a storage tier in an IBM Spectrum Scale based environment. It helps encourage the use of tape as a critical tier in the storage environment. This is the eighth edition of the IBM Spectrum Archive Installation and Configuration Guide. IBM Spectrum Archive EE can run any application that is designed for disk files on physical tape media. IBM Spectrum Archive EE supports the IBM Linear Tape-Open (LTO) Ultrium 8, 7, 6, and 5 tape drives in IBM TS3310, TS3500, TS4300, and TS4500 tape libraries. In addition, IBM TS1160, TS1155, TS1150, and TS1140 tape drives are supported in TS3500 and TS4500 tape library configurations. IBM Spectrum Archive EE can play a major role in reducing the cost of storage for data that does not need the access performance of primary disk. The use of IBM Spectrum Archive EE to replace disks with physical tape in tier 2 and tier 3 storage can improve data access over other storage solutions because it improves efficiency and streamlines management for files on tape. IBM Spectrum Archive EE simplifies the use of tape by making it transparent to the user and manageable by the administrator under a single infrastructure. This publication is intended for anyone who wants to understand more about IBM Spectrum Archive EE planning and implementation. This book is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

IBM Storage Solutions for SAP Applications Version 1.3

This paper is intended as an architecture and configuration guide to set up the IBM® System Storage® for the SAP HANA tailored data center integration (SAP HANA TDI) within a storage area network (SAN) environment. SAP HANA TDI allows the SAP customer to attach external storage to the SAP HANA server. The paper also describes the setup and configuration of SAP Landscape Management for SAP HANA systems on IBM infrastructure components: IBM Power Systems™ and IBM Storage based on IBM Spectrum™ Virtualize. This document is written for IT technical specialists and architects with advanced skill levels on SUSE Linux Enterprise Server (SLES) or Red Hat Enterprise Linux (RHEL) and IBM System Storage. This document provides the necessary information to select, verify, and connect IBM System Storage to the SAP HANA server through a Fibre Channel-based SAN. The recommendations in this Blueprint apply to single-node and scale-out configurations, and Intel and IBM Power based SAP HANA systems.