data

Beginning Power BI for Business Users

2023-10-31 · O'Reilly Data Science Books O'Reilly Amazon

book

by Paul D. Fuller

Analytics BI Data Analytics DAX Microsoft Power BI RDBMS business-intelligence data-science microsoft-power-platform power-bi

Discover the utility of your organization’s data with Microsoft Power BI

In Beginning Power BI for Business Users: Learning to Turn Data into Insights, accomplished data professional and business intelligence expert Paul Fuller delivers an intuitive and accessible handbook for professionals seeking to use Microsoft’s Power BI to access, analyze, understand, report, and act on the data available to their organizations. In the book, you’ll discover Power BI’s robust feature set, learn to ingest and model data, visualize and report on that data, and even use the DAX scripting language to unlock still more utility from Microsoft’s popular program. Beginning with general principles geared to readers with little or no experience with reporting or data analytics tools, the author walks you through how to manipulate common, publicly available data sources, including Excel files and relational databases.

You’ll also learn to:
Use the included and tested sample code to work through the helpful examples included by the author
Conduct data orchestration and visualization to better understand and gain insights from your data

An essential resource for business analysts and Excel power users reaching the limits of that program’s capabilities, Beginning Power BI for Business Users will also benefit data analysts who seek to prepare reports for their organizations using Microsoft’s flexible and intuitive software.

Data Engineering with AWS - Second Edition

2023-10-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Gareth Eagar

Analytics AWS AWS Glue Data Engineering Data Governance QuickSight Redshift S3 Cyber Security data-engineering

Learn data engineering and modern data pipeline design with AWS in this comprehensive guide! You will explore key AWS services like S3, Glue, Redshift, and QuickSight to ingest, transform, and analyze data, and you'll gain hands-on experience creating robust, scalable solutions.

What this Book will help me do
Understand and implement data ingestion and transformation processes using AWS tools.
Optimize data for analytics with advanced AWS-powered workflows.
Build end-to-end modern data pipelines leveraging cutting-edge AWS technologies.
Design data governance strategies using AWS services for security and compliance.
Visualize data and extract insights using Amazon QuickSight and other tools.

Author(s)
Gareth Eagar is a Senior Data Architect with over 25 years of experience in designing and implementing data solutions across various industries. He combines his deep technical expertise with a passion for teaching, aiming to make complex concepts approachable for learners at all levels.

Who is it for?
This book is intended for current or aspiring data engineers, data architects, and analysts seeking to leverage AWS for data engineering. It suits beginners with a basic understanding of data concepts who want to gain practical experience as well as intermediate professionals aiming to expand into AWS-based systems.

Learn PostgreSQL - Second Edition

2023-10-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Luca Ferrari (Bending Spoons) , Enrico Pirozzi

Cyber Security SQL data-engineering postgresql relational-databases

Learn PostgreSQL, a comprehensive guide to mastering PostgreSQL 16, takes readers on a journey from the fundamentals to advanced concepts, such as replication and database optimization. With hands-on exercises and practical examples, this book provides all you need to confidently use, manage, and build secure and scalable databases.

What this Book will help me do
Master the essentials of PostgreSQL 16, including advanced SQL features and performance tuning.
Understand database replication methods and manage a scalable architecture.
Enhance database security through roles, schemas, and strict privilege management.
Learn how to personalize your experience with custom extensions and functions.
Acquire practical skills in backup, restoration, and disaster recovery planning.

Author(s)
Luca Ferrari and Enrico Pirozzi are experienced database engineers and PostgreSQL enthusiasts with years of experience using and teaching PostgreSQL technology. They specialize in creating learning content that is practical and focused on real-world situations. Their writing emphasizes clarity and systematically equips readers with professional skills.

Who is it for?
This book is perfect for database professionals, software developers, and system administrators looking to develop their PostgreSQL expertise. Beginners with an interest in databases will also find this book highly approachable. It is ideal for readers seeking to improve their database scalability and robustness. If you aim to hone practical PostgreSQL skills, this guide is essential.

R Bioinformatics Cookbook - Second Edition

2023-10-31 · O'Reilly Data Science Books O'Reilly Amazon

book

by Dan MacLean

AI/ML Data Science DataViz bioinformatics data-science data-science-domains

R Bioinformatics Cookbook is your guide to leveraging the power of R for advanced bioinformatics tasks. This updated second edition uses a recipe-based method to teach data analysis, visualization, and machine learning tailored for biological datasets. You'll gain hands-on experience with popular tools like Bioconductor, ggplot2, and tidyverse to solve real-world genomics problems.

What this Book will help me do
Set up a reproducible bioinformatics analysis environment using R.
Clean, analyze, and visualize biological data with R's powerful packages.
Apply RNA-seq and ChIP-seq workflows to study genetic information effectively.
Incorporate machine learning techniques into bioinformatics pipelines using R.
Automate tasks and create professional-grade reports using functional programming and reporting tools.

Author(s)
The author, Dan MacLean, brings years of expertise in bioinformatics and computational biology. Known for clear explanations and practical approaches, they ensure the material is accessible yet challenging. With a strong focus on real-world applications, this book reflects their commitment to bridging bioinformatics and modern data science.

Who is it for?
This book is perfect for bioinformaticians, researchers, and data scientists with prior R experience. It's tailored for those looking to delve deeper into genomics, data visualization, and bioinformatics techniques. Intermediate knowledge of bioinformatics concepts and familiarity with R programming are assumed for readers to fully benefit from the content.

Procedural Programming with PostgreSQL PL/pgSQL: Design Complex Database-Centric Applications with PL/pgSQL

2023-10-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dinesh Kumar Chemuduru , Baji Shaik (Amazon RDS)

data-engineering postgresql relational-databases

Learn the fundamentals of PL/pgSQL, the programming language of PostgreSQL, which is the most robust open source relational database. This book provides practical insights into developing database code objects such as functions and procedures, with a focus on transaction management and on effectively handling strings, numbers, and arrays to achieve desired outcomes. The unique approach to handling triggers in PostgreSQL ensures that both functionality and performance are maintained without compromise. You'll gain proficiency in writing inline/anonymous server-side code within its limitations, along with learning essential debugging and profiling techniques. Additionally, the book delves into statistical analysis of PL/pgSQL code and offers valuable knowledge on managing exceptions while writing code blocks. Finally, you'll explore the installation and configuration of extensions to enhance the performance of stored procedures and functions.

What You'll Learn
Understand the PL/pgSQL concepts
Learn to debug, profile, and optimize PL/pgSQL code
Study linting PL/pgSQL code
Review transaction management within PL/pgSQL code
Work with developer-friendly features like operators, casts, and aggregators

Who Is This Book For
App developers, database migration consultants, and database administrators.

Machine Learning with Qlik Sense

2023-10-27 · O'Reilly Data Science Books O'Reilly Amazon

book

by Hannu Ranta

AI/ML Analytics Data Analytics Qlik analytics-platforms data-science qlik-sense

Machine Learning with Qlik Sense introduces practical applications of machine learning within the Qlik platform. Through this book, you will gain a thorough understanding of fundamental ML concepts, learn to apply these within Qlik Sense, and see how to use predictive analytics to solve real-world problems. The hands-on examples ensure you can translate learnings into actionable insights.

What this Book will help me do
Understand the key principles of machine learning and how to apply them using the Qlik platform.
Develop skills in data preprocessing and analysis to prepare datasets for machine learning models.
Learn to validate and interpret machine learning models and evaluate their performance.
Master advanced visualization techniques for presenting insights derived from data.
Apply newfound knowledge to practical business problems through real-world use-case examples.

Author(s)
Hannu Ranta is an expert in data analytics and has extensive experience utilizing the Qlik platform to derive actionable insights from data. With years of practical exposure and a focus on teaching, Hannu brings a clear and structured approach to using machine learning for analytics. His writing seeks to empower readers to achieve practical solutions using Qlik's powerful tools.

Who is it for?
This book is perfect for data analysts, data scientists, or anyone working in data analytics who wants to incorporate machine learning into their skill set. It is especially suited to those with a basic familiarity with Qlik tools or data analysis concepts. Beginners in machine learning can also benefit because the book starts from foundational concepts and builds step-by-step.

Designing a Modern Application Data Stack

2023-10-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Adam Morton , Brad Culberson (Snowflake (Field CTO's office)) , Kevin McGinley

Analytics Cloud Computing Snowflake data-engineering

Today's massive datasets represent an unprecedented opportunity for organizations to build data-intensive applications. With this report, product leads, architects, and others who deal with applications and application development will explore why a cloud data platform is a great fit for data-intensive applications. You'll learn how to carefully consider scalability, data processing, and application distribution when making data app design decisions. Cloud data platforms are the modern infrastructure choice for data applications, as they offer improved scalability, elasticity, and cost efficiency. With a better understanding of data-intensive application architectures on cloud-based data platforms and the best practices outlined in this report, application teams can take full advantage of advances in data processing and app distribution to accelerate development, deployment, and adoption cycles.

With this insightful report, you will:
Learn why a modern cloud data platform is essential for building data-intensive applications
Explore how scalability, data processing, and distribution models are key for today's data apps
Implement best practices to improve application scalability and simplify data processing for efficiency gains
Modernize application distribution plans to meet the needs of app providers and consumers

About the authors:
Adam Morton works with Intelligen Group, a Snowflake pure-play data and analytics consultancy. Kevin McGinley is technical director of the Snowflake customer acceleration team. Brad Culberson is a data platform architect specializing in data applications at Snowflake.

The Statistics and Machine Learning with R Workshop

2023-10-25 · O'Reilly Data Science Books O'Reilly Amazon

book

by Liu Peng

AI/ML Data Science data-science data-science-tools r

This book guides readers through the essentials of applied statistics and machine learning using the R programming language. By delving into robust data processing techniques, visualization, and statistical modeling with R, you will develop skills to effectively analyze data and design predictive models. Each chapter includes hands-on exercises to reinforce the concepts in a practical, intuitive way.

What this Book will help me do
Understand and apply key statistical concepts such as probability distributions and hypothesis testing to analyze data.
Master foundational mathematical principles like linear algebra and calculus relevant to data science and machine learning.
Develop proficiency in data manipulation and visualization using robust R libraries such as dplyr and ggplot2.
Build predictive models through practical exercises and learn advanced concepts like Bayesian statistics and linear regression.
Gain the practical knowledge needed to apply statistical and machine learning methodologies in real-world scenarios.

Author(s)
Liu Peng is an accomplished author with a strong academic and practical background in statistics and data science. Armed with extensive experience in applying R to real-world problems, he brings a blend of technical mastery and teaching expertise. His commitment is to transform complex concepts into accessible, enriching learning experiences for readers.

Who is it for?
This book is ideal for data scientists and analysts ranging from beginners to those at an intermediate level. It caters especially to those interested in practicing statistical modeling and learning R in depth. If you have basic familiarity with statistics and are looking to expand your data science capabilities using R, this book is well-suited for you.

Machine and Deep Learning Using MATLAB

2023-10-24 · O'Reilly Data Science Books O'Reilly Amazon

book

by Kamal I. M. Al-Malah

MATLAB data-science data-science-tools

MACHINE AND DEEP LEARNING

In-depth resource covering machine and deep learning methods using MATLAB tools and algorithms, providing insights and algorithmic decision-making processes

Machine and Deep Learning Using MATLAB introduces early career professionals to the power of MATLAB to explore machine and deep learning applications by explaining the relevant MATLAB tool or app and how it is used for a given method or a collection of methods. Its properties, in terms of input and output arguments, are explained, its limitations or applicability are indicated via accompanying text or a table, and a complete running example is shown with all needed MATLAB command prompt code. The text also presents the results, in the form of figures or tables, in parallel with the given MATLAB code, and the MATLAB code can later be used as a template for trying to solve new cases or datasets. Throughout, the text features worked examples in each chapter for self-study, with an accompanying website providing solutions and coding samples. Highlighted notes draw the attention of the user to critical points or issues.

Readers will also find information on:
Numeric data acquisition and analysis in the form of applying computational algorithms to predict the numeric data patterns (clustering or unsupervised learning)
Relationships between predictors and response variable (supervised), categorically sub-divided into classification (discrete response) and regression (continuous response)
Image acquisition and analysis in the form of applying a neural network, and estimating net accuracy, net loss, and/or RMSE for the successive training, validation, and testing steps
Retraining and creation for image labeling, object identification, regression classification, and text recognition

Machine and Deep Learning Using MATLAB is a useful and highly comprehensive resource on the subject for professionals, advanced students, and researchers who have some familiarity with MATLAB and are situated in engineering and scientific fields, who wish to gain mastery over the software and its numerous applications.

Cyber Resiliency with IBM Storage Sentinel and IBM Storage Safeguarded Copy

2023-10-23 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by David Green , Thomas Gerisch , Axel Westphal , Nezih Boyacioglu , Gerd Franke , Gucer Vasfi , Daniel Thompson , Guillaume Legmar , Christopher Vollmar , Markus Standau

AI/ML IBM Oracle SAP Cyber Security data-engineering

IBM Storage Sentinel is a cyber resiliency solution for SAP HANA, Oracle, and Epic healthcare systems, designed to help organizations enhance ransomware detection and incident recovery. IBM Storage Sentinel automates the creation of immutable backup copies of your data, then uses machine learning to detect signs of possible corruption and generate forensic reports that help you quickly diagnose and identify the source of the attack. Because IBM Storage Sentinel can intelligently isolate infected backups, your organization can identify the most recent verified and validated backup copies, greatly accelerating your time to recovery. This IBM Redbooks publication explains how to implement a cyber resiliency solution for SAP HANA, Oracle, and Epic healthcare systems using IBM Storage Sentinel and IBM Storage Safeguarded Copy. The target audience of this document is cyber security and storage specialists.

Delta Lake: Up and Running

2023-10-17 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dan Davis , Bennie Haelen

AI/ML Analytics Big Data Data Engineering Data Lake Data Lakehouse Data Management Data Quality Delta S3 Data Streaming data-engineering

With the surge in big data and AI, organizations can rapidly create data products. However, the effectiveness of their analytics and machine learning models depends on the data's quality. Delta Lake's open source format offers a robust lakehouse framework over platforms like Amazon S3, ADLS, and GCS. This practical book shows data engineers, data scientists, and data analysts how to get Delta Lake and its features up and running. The ultimate goal of building data pipelines and applications is to gain insights from data. You'll understand how your storage solution choice determines the robustness and performance of the data pipeline, from raw data to insights.

You'll learn how to:
Use modern data management and data engineering techniques
Understand how ACID transactions bring reliability to data lakes at scale
Run streaming and batch jobs against your data lake concurrently
Execute update, delete, and merge commands against your data lake
Use time travel to roll back and examine previous data versions
Build a streaming data quality pipeline following the medallion architecture

Architecting Data and Machine Learning Platforms

2023-10-12 · O'Reilly AI & ML Books O'Reilly Amazon

book

by Firat Tekiner (Google Cloud) , Valliappa Lakshmanan , Marco Tranquillin

AI/ML Analytics AWS Azure Cloud Computing Data Analytics Databricks GCP MLOps Snowflake Data Streaming ai-ml

All cloud architects need to know how to build data platforms that enable businesses to make data-driven decisions and deliver enterprise-wide intelligence in a fast and efficient way. This handbook shows you how to design, build, and modernize cloud native data and machine learning platforms using AWS, Azure, Google Cloud, and multicloud tools like Snowflake and Databricks. Authors Marco Tranquillin, Valliappa Lakshmanan, and Firat Tekiner cover the entire data lifecycle from ingestion to activation in a cloud environment using real-world enterprise architectures. You'll learn how to transform, secure, and modernize familiar solutions like data warehouses and data lakes, and you'll be able to leverage recent AI/ML patterns to get accurate and quicker insights to drive competitive advantage.

You'll learn how to:
Design a modern and secure cloud native or hybrid data analytics and machine learning platform
Accelerate data-led innovation by consolidating enterprise data in a governed, scalable, and resilient data platform
Democratize access to enterprise data and govern how business teams extract insights and build AI/ML capabilities
Enable your business to make decisions in real time using streaming pipelines
Build an MLOps platform to move to a predictive and prescriptive analytics approach

IBM Storage Virtualize, IBM Storage FlashSystem, and IBM SAN Volume Controller Security Feature Checklist - For IBM Storage Virtualize 8.5.3

2023-10-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by James Whitaker , Bill Scales , Barry Whyte

Cloud Computing IBM Cyber Security data-engineering ibm-system-storage ibm-system-storage-san-volume-controller

IBM® Storage Virtualize based storage systems are secure storage platforms that implement various security-related features, in terms of system-level access controls and data-level security features. This document outlines the available security features and options of IBM Storage Virtualize based storage systems. It is not intended as a "how to" or best practice document. Instead, it is a checklist of features that can be reviewed by a user security team to aid in the definition of a policy to be followed when implementing IBM FlashSystem®, IBM SAN Volume Controller, and IBM Storage Virtualize for Public Cloud.

IBM Storage Virtualize features the following levels of security to protect against threats and to keep the attack surface as small as possible:
The first line of defense is to offer strict verification features that stop unauthorized users from using login interfaces and gaining access to the system and its configuration.
The second line of defense is to offer least privilege features that restrict the environment and limit any effect if a malicious actor does access the system configuration.
The third line of defense is to run in a minimal, locked-down mode to prevent damage spreading to the kernel and the rest of the operating system.
The fourth line of defense is to protect the data at rest that is stored on the system from theft, loss, or corruption (malicious or accidental).

The topics that are discussed in this paper can be broadly split into two categories:
System security: This type of security encompasses the first three lines of defense that prevent unauthorized access to the system, protect the logical configuration of the storage system, and restrict what actions users can perform. It also ensures visibility and reporting of system level events that can be used by a Security Information and Event Management (SIEM) solution, such as IBM QRadar®.
Data security: This type of security encompasses the fourth line of defense. It protects the data that is stored on the system against theft, loss, or attack. These data security features include Encryption of Data At Rest (EDAR) or IBM Safeguarded Copy (SGC).

This document is correct as of IBM Storage Virtualize 8.5.3.

Hands-On Web Scraping with Python - Second Edition

2023-10-06 · O'Reilly Data Science Books O'Reilly Amazon

book

by Anish Chapagain

API Data Science Pandas Plotly Python Selenium data-science data-science-tasks web-scraping

In "Hands-On Web Scraping with Python," you'll learn how to harness the power of Python libraries to extract, process, and analyze data from the web. This book provides a practical, step-by-step guide for beginners and data enthusiasts alike.

What this Book will help me do
Master the use of Python libraries like requests, lxml, Scrapy, and Beautiful Soup for web scraping.
Develop advanced techniques for secure browsing and data extraction using APIs and Selenium.
Understand the principles behind regex and PDF data parsing for comprehensive scraping.
Analyze and visualize data using data science tools such as Pandas and Plotly.
Build a portfolio of real-world scraping projects to demonstrate your capabilities.

Author(s)
Anish Chapagain, the author of "Hands-On Web Scraping with Python," is an experienced programmer and instructor who specializes in Python and data-related technologies. With his vast experience in teaching individuals from diverse backgrounds, Anish approaches complex concepts with clarity and a hands-on methodology.

Who is it for?
This book is perfect for aspiring data scientists, Python beginners, and anyone who wants to delve into web scraping. Readers should have a basic understanding of how websites work but no prior coding experience is required. If you aim to develop scraping skills and understand data analysis, this book is the ideal starting point.

Amazon Redshift: The Definitive Guide

2023-10-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rajesh Francis , Rajiv Gupta , Milind Oke

AI/ML Analytics AWS Cloud Computing DWH Redshift Cyber Security amazon-redshift data-engineering relational-databases

Amazon Redshift powers analytic cloud data warehouses worldwide, from startups to some of the largest enterprise data warehouses available today. This practical guide thoroughly examines this managed service and demonstrates how you can use it to extract value from your data immediately, rather than go through the heavy lifting required to run a typical data warehouse. Analytic specialists Rajesh Francis, Rajiv Gupta, and Milind Oke detail Amazon Redshift's underlying mechanisms and options to help you explore out-of-the-box automation. Whether you're a data engineer who wants to learn the art of the possible or a DBA looking to take advantage of machine learning-based auto-tuning, this book helps you get the most value from Amazon Redshift. By understanding Amazon Redshift features, you'll achieve excellent analytic performance at the best price, with the least effort.

This book helps you:
Build a cloud data strategy around Amazon Redshift as a foundational data warehouse
Get started with Amazon Redshift with simple-to-use data models and design best practices
Understand how and when to use Redshift Serverless and Redshift provisioned clusters
Take advantage of auto-tuning options inherent in Amazon Redshift and understand manual tuning options
Transform your data platform for predictive analytics using Redshift ML and break silos using data sharing
Learn best practices for security, monitoring, resilience, and disaster recovery
Leverage Amazon Redshift integration with other AWS services to unlock additional value

Geospatial Analysis with SQL

2023-10-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Bonny P McClain

GIS SQL data-engineering geographic-information-system-gis geographic information system (gis) location-data

"Geospatial Analysis with SQL" is a practical guide that teaches you how to use SQL for geospatial data analysis. With direct, actionable guidance, you will learn to explore and analyze data using geospatial techniques without needing additional programming. This book equips you with the knowledge to solve location-based queries and perform advanced geospatial operations.

What this Book will help me do
Master the fundamentals of geospatial analysis and learn the importance of location-based data.
Develop skills in creating and manipulating spatial database objects in SQL.
Gain proficiency in using tools such as PostGIS and QGIS for geospatial data analysis.
Learn techniques to visualize spatial data effectively and communicate results.
Perform both single-layer and multi-layer spatial analysis for complex real-world scenarios.

Author(s)
Bonny P. McClain, the author of "Geospatial Analysis with SQL", brings extensive experience as a spatial data analyst and GIS expert. Bonny specializes in helping practitioners derive data-driven insights through geospatial techniques. With a passion for teaching, Bonny's goal is to make complex concepts accessible and practical for analysts and developers alike.

Who is it for?
This book is ideal for GIS analysts, data analysts, and data scientists who have a basic understanding of SQL and geospatial concepts and want to expand their analytical capabilities. Readers looking to perform professional-grade geospatial analysis using SQL will find this book especially valuable. It caters to professionals wishing to use their SQL skills to understand and work with spatial datasets effectively.

Grow Your Business with AI: A First Principles Approach for Scaling Artificial Intelligence in the Enterprise

2023-10-03 · O'Reilly Business Intelligence Books O'Reilly Amazon

book

by Francisco Javier Campos Zabala

AI/ML NLP ai-ml artificial-intelligence-ai artificial intelligence (ai)

Leverage the power of Artificial Intelligence (AI) to drive the growth and success of your organization. This book thoroughly explores the reasons why it is so hard to implement AI, and highlights the need to reconcile the motivations and goals of two very different groups of people, business-minded and technical-minded. Divided into four main parts (First Principles, The Why, The What, The How), you'll review case studies and examples from companies that have successfully implemented AI.

Part 1 provides a comprehensive overview of the First Principles approach and its basic conventions.
Part 2 provides an in-depth look at the current state of AI and why it is increasingly important to businesses of all sizes.
Part 3 delves into the key concepts and technologies of AI.
Part 4 shares practical guidance and actionable steps for businesses looking to implement AI.

Grow Your Business with AI is a must-read for anyone looking to understand and harness the power of AI for business growth and to stay ahead of the curve.

What You'll Learn
Review the key concepts and technologies of AI, including machine learning, natural language processing, and computer vision
Apply the benefits of AI, including increased efficiency, improved decision-making, and new revenue streams in different industries
Integrate AI into existing systems and processes

Who This Book Is For
Entrepreneurs, business leaders, and professionals looking to leverage the power of AI to drive growth and success for their organizations.

IBM SAN Volume Controller Best Practices and Performance Guidelines

2023-10-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Anil K Nayak , Jordan Fincher , Duane Bolland , David Green , Jon Herd , Chris Hoffmann , Marcelos Avalos , Sidney Varoni Junior , Sergey Kubin , Gucer Vasfi , Thales Noivo Ferreira , Barry Whyte , Jackson Shea , Antonio Rainero , Danilo Morelli Miyasiro

IBM data-engineering ibm-system-storage ibm-system-storage-san-volume-controller

This IBM® Redbooks® publication captures several of the preferred practices and describes the performance gains that can be achieved by implementing the IBM SAN Volume Controller powered by IBM Spectrum® Virtualize V8.4. These practices are based on field experience. This book highlights configuration guidelines and preferred practices for the storage area network (SAN) topology, clustered system, back-end storage, storage pools and managed disks, volumes, Remote Copy services, and hosts. Then, it provides performance guidelines for IBM SAN Volume Controller, back-end storage, and applications. It explains how you can optimize disk performance with the IBM System Storage Easy Tier® function. It also provides preferred practices for monitoring, maintaining, and troubleshooting IBM SAN Volume Controller. This book is intended for experienced storage, SAN, and IBM SAN Volume Controller administrators and technicians. Understanding this book requires advanced knowledge of the IBM SAN Volume Controller, IBM FlashSystem, and SAN environments.

IBM SAN Volume Controller Best Practices and Performance Guidelines for IBM Spectrum Virtualize Version 8.4.2

2023-10-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Carlton Beatty, Nils Olsson, Konrad Trojok, David Green, Hartmut Lonzer, Mandy Stevens, Uwe Schreiber, Renato Santos, Rene Oehme, Kendall Williams, Sergey Kubin, Nezih Boyacioglu, Gucer Vasfi, Jonathan Wilkie, Thales Noivo Ferreira, Antonio Rainero

IBM data-engineering

This IBM® Redbooks® publication captures several of the preferred practices and describes the performance gains that can be achieved by implementing the IBM SAN Volume Controller powered by IBM Spectrum® Virtualize Version 8.4.2. These practices are based on field experience. This book highlights configuration guidelines and preferred practices for the storage area network (SAN) topology, clustered system, back-end storage, storage pools and managed disks, volumes, Remote Copy services and hosts. It explains how you can optimize disk performance with the IBM System Storage Easy Tier® function. It also provides preferred practices for monitoring, maintaining, and troubleshooting. This book is intended for experienced storage, SAN, IBM FlashSystem®, IBM SAN Volume Controller, and IBM Storwize® administrators and technicians. Understanding this book requires advanced knowledge of these environments.

Practical Implementation of a Data Lake: Translating Customer Expectations into Tangible Technical Goals

2023-10-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Nayanjyoti Paul

AI/ML Data Lake Cyber Security data-engineering data-lake storage-repositories

This book explains how to implement a data lake strategy, covering the technical and business challenges architects commonly face. It also illustrates how and why client requirements should drive architectural decisions. Drawing upon a specific case from his own experience, author Nayanjyoti Paul begins with the consideration from which all subsequent decisions should flow: what does your customer need? He also describes the importance of identifying key stakeholders and the key points to focus on when starting a new project. Next, he takes you through the business and technical requirement-gathering process and shows how to translate customer expectations into tangible technical goals. From there, you’ll gain insight into the security model that will allow you to establish security and legal guardrails, as well as different aspects of security from the end user’s perspective. You’ll learn which organizational roles need to be onboarded into the data lake, their responsibilities, the services they need access to, and how the hierarchy of escalations should work. Subsequent chapters explore how to divide your data lake into zones, organize data for security and access, manage data sensitivity, and apply data obfuscation techniques. Audit and logging capabilities in the data lake are also covered before a deep dive into designing data lakes to handle multiple data kinds, file formats, and access patterns. The book concludes by focusing on production operationalization and solutions to implement a production setup. After completing this book, you will understand how to implement a data lake and the best practices to employ while doing so, and you will be armed with practical tips to solve business problems.
What You Will Learn
Understand the challenges associated with implementing a data lake
Explore the architectural patterns and processes used to design a new data lake
Design and implement data lake capabilities
Associate business requirements with technical deliverables to drive success
Who This Book Is For
Data Scientists and Architects, Machine Learning Engineers, and Software Engineers.

