
What Every Engineer Should Know About Data-Driven Analytics

2023-04-13 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Phillip A. Laplante , Satish Mahadevan Srinivasan

AI/ML Analytics Data Analytics business-intelligence data-science prescriptive-analytics

What Every Engineer Should Know About Data-Driven Analytics provides a comprehensive introduction to the machine learning theoretical concepts and approaches that are used in predictive data analytics through practical applications and case studies.

Building an Event-Driven Data Mesh

2023-04-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Adam Bellemare

Delta data-engineering data-mesh database-architecture event-driven-data-mesh

The exponential growth of data combined with the need to derive real-time business value is a critical issue today. An event-driven data mesh can power real-time operational and analytical workloads, all from a single set of data product streams. With practical real-world examples, this book shows you how to successfully design and build an event-driven data mesh. Building an Event-Driven Data Mesh provides: Practical tips for iteratively building your own event-driven data mesh, including hurdles you'll experience, possible solutions, and how to obtain real value as soon as possible Solutions to pitfalls you may encounter when moving your organization from monoliths to event-driven architectures A clear understanding of how events relate to systems and other events in the same stream and across streams A realistic look at event modeling options, such as fact, delta, and command type events, including how these choices will impact your data products Best practices for handling events at scale, privacy, and regulatory compliance Advice on asynchronous communication and handling eventual consistency

Principles of Data Fabric

2023-04-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Sonia Mezzetta

Analytics Data Governance Data Management DataOps Fabric data-engineering data-mesh database-architecture

In "Principles of Data Fabric," you will gain a comprehensive understanding of Data Fabric solutions and architectures. This book provides a clear picture of how to design, implement, and optimize Data Fabric solutions to tackle complex data challenges. By the end, you'll be equipped with the knowledge to unify and leverage your organizational data efficiently. What this Book will help me do Design and architect Data Fabric solutions tailored to specific organizational needs. Learn to integrate Data Fabric with DataOps and Data Mesh for holistic data management. Master the principles of Data Governance and Self-Service analytics within the Data Fabric. Implement best practices for distributed data management and regulatory compliance. Apply industry insights and frameworks to optimize Data Fabric deployment. Author(s) Sonia Mezzetta, the author of "Principles of Data Fabric," is an experienced data professional with a deep understanding of data management frameworks and architectures like Data Fabric, Data Mesh, and DataOps. With years of industry expertise, Sonia has helped organizations implement effective data strategies. Her writing combines technical know-how with an approachable style to enlighten and guide readers on their data journey. Who is it for? This book is ideal for data engineers, data architects, and business analysts who seek to understand and implement Data Fabric solutions. It will also appeal to senior data professionals like Chief Data Officers aiming to integrate Data Fabric into their enterprises. Novice to intermediate knowledge of data management would be beneficial for readers. The content provides clear pathways to achieve actionable results in data strategies.

All About Bioinformatics

2023-04-05 · O'Reilly Data Science Books O'Reilly Amazon

book

by Yasha Hasija

AI/ML bioinformatics data-science data-science-domains

All About Bioinformatics: From Beginner to Expert provides readers with an overview of the fundamentals and advances in the field of bioinformatics, as well as some future directions. Each chapter is didactically organized and includes introduction, applications, tools, and future directions to cover the topics thoroughly. The book covers both traditional topics such as biological databases, algorithms, genetic variations, static methods, and structural bioinformatics, as well as contemporary advanced topics such as high-throughput technologies, drug informatics, system and network biology, and machine learning. It is a valuable resource for researchers and graduate students who are interested to learn more about bioinformatics to apply in their research work. Presents a holistic learning experience, beginning with an introduction to bioinformatics to recent advancements in the field Discusses bioinformatics as a practice rather than in theory focusing on more application-oriented topics as high-throughput technologies, system and network biology, and workflow management systems Encompasses chapters on statistics and machine learning to assist readers in deciphering trends and patterns in biological data

Beginning Database Design Solutions, 2nd Edition

2023-04-04 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rod Stephens

Cloud Computing Data Management NoSQL data-engineering relational-databases

A concise introduction to database design concepts, methods, and techniques in and out of the cloud In the newly revised second edition of Beginning Database Design Solutions: Understanding and Implementing Database Design Concepts for the Cloud and Beyond, Second Edition, award-winning programming instructor and mathematician Rod Stephens delivers an easy-to-understand guide to designing and implementing databases both in and out of the cloud. Without assuming any prior database design knowledge, the author walks you through the steps you’ll need to take to understand, analyze, design, and build databases. In the book, you’ll find clear coverage of foundational database concepts along with hands-on examples that help you practice important techniques so you can apply them to your own database designs, as well as: Downloadable source code that illustrates the concepts discussed in the book Best practices for reliable, platform-agnostic database design Strategies for digital transformation driven by universally accessible database design An essential resource for database administrators, data management specialists, and database developers seeking expertise in relational, NoSQL, and hybrid database design both in and out of the cloud, Beginning Database Design Solutions is a hands-on guide ideal for students and practicing professionals alike.

Practical Business Analytics Using R and Python: Solve Business Problems Using a Data-driven Approach

2023-04-03 · O'Reilly Data Science Books O'Reilly Amazon

book

by Umesh R. Hodeghatta , Umesha Nayak

AI/ML Analytics Big Data Data Analytics NLP NumPy Pandas Python SQL data-science data-science-tools r

This book illustrates how data can be useful in solving business problems. It explores various analytics techniques for using data to discover hidden patterns and relationships, predict future outcomes, optimize efficiency and improve the performance of organizations. You’ll learn how to analyze data by applying concepts of statistics, probability theory, and linear algebra. In this new edition, both R and Python are used to demonstrate these analyses. Practical Business Analytics Using R and Python also features new chapters covering databases, SQL, Neural networks, Text Analytics, and Natural Language Processing. Part one begins with an introduction to analytics, the foundations required to perform data analytics, and explains different analytics terms and concepts such as databases and SQL, basic statistics, probability theory, and data exploration. Part two introduces predictive models using statistical machine learning and discusses concepts like regression, classification, and neural networks. Part three covers two of the most popular unsupervised learning techniques, clustering and association mining, as well as text mining and natural language processing (NLP). The book concludes with an overview of big data analytics, R and Python essentials for analytics including libraries such as pandas and NumPy. Upon completing this book, you will understand how to improve business outcomes by leveraging R and Python for data analytics. 
What You Will Learn Master the mathematical foundations required for business analytics Understand various analytics models and data mining techniques such as regression, supervised machine learning algorithms for modeling, unsupervised modeling techniques, and how to choose the correct algorithm for analysis in any given task Use R and Python to develop descriptive models, predictive models, and optimize models Interpret and recommend actions based on analytical model outcomes Who This Book Is For Software professionals and developers, managers, and executives who want to understand and learn the fundamentals of analytics using R and Python.

An Introduction to Economic Dynamics

2023-03-31 · O'Reilly Data Science Books O'Reilly Amazon

book

by Petri T. Piiroinen , Srinivas Raghavendra

MATLAB data-science data-science-tools

An Introduction to Economic Dynamics provides a framework for students to appreciate and understand the basic intuition behind economic models and to experiment with those models using simulation techniques in MATLAB.

Computational Statistical Methodologies and Modeling for Artificial Intelligence

2023-03-31 · O'Reilly Data Science Books O'Reilly Amazon

book

by Basant Agarwal , Priyanka Harjule , Vinita Tiwari , Azizur Rahman

AI/ML data-science data-science-tasks statistics

This book covers computational statistics-based approaches for Artificial Intelligence. The aim of this book is to provide comprehensive coverage of the fundamentals through the applications of the different kinds of mathematical modelling and statistical techniques and describing their applications in different Artificial Intelligence systems.

Data Driven Strategies

2023-03-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ruben Morales-Menendez , Wang Jianhong , Ricardo A. Ramirez-Mendoza

data-engineering data-models

Finding exciting and efficient ways to integrate data into control theory has been a problem of great interest. As most of the classical contributions in control strategy rely on model description, the issue of finding such a model from measured data, i.e., system identification, has become a mature research field.

Data Fabric and Data Mesh Approaches with AI: A Guide to AI-based Data Cataloging, Governance, Integration, Orchestration, and Consumption

2023-03-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Maryela Weihrauch , Eberhard Hechler , Yan (Catherine) Wu

AI/ML Cloud Computing Data Governance DataOps Fabric MLOps data-engineering data-mesh database-architecture

Understand modern data fabric and data mesh concepts using AI-based self-service data discovery and delivery capabilities, a range of intelligent data integration styles, and automated unified data governance—all designed to deliver "data as a product" within hybrid cloud landscapes. This book teaches you how to successfully deploy state-of-the-art data mesh solutions and gain a comprehensive overview on how a data fabric architecture uses artificial intelligence (AI) and machine learning (ML) for automated metadata management and self-service data discovery and consumption. You will learn how data fabric and data mesh relate to other concepts such as DataOps, MLOps, AIDevOps, and more. Many examples are included to demonstrate how to modernize the consumption of data to enable a shopping-for-data (data as a product) experience. By the end of this book, you will understand the data fabric concept and architecture as it relates to themes such as automated unified data governance and compliance, enterprise information architecture, AI and hybrid cloud landscapes, and intelligent cataloging and metadata management. 
What You Will Learn Discover best practices and methods to successfully implement a data fabric architecture and data mesh solution Understand key data fabric capabilities, e.g., self-service data discovery, intelligent data integration techniques, intelligent cataloging and metadata management, and trustworthy AI Recognize the importance of data fabric to accelerate digital transformation and democratize data access Dive into important data fabric topics, addressing current data fabric challenges Conceive data fabric and data mesh concepts holistically within an enterprise context Become acquainted with the business benefits of data fabric and data mesh Who This Book Is For Anyone who is interested in deploying modern data fabric architectures and data mesh solutions within an enterprise, including IT and business leaders, data governance and data office professionals, data stewards and engineers, data scientists, and information and data architects. Readers should have a basic understanding of enterprise information architecture.

Forecasting Time Series Data with Prophet - Second Edition

2023-03-31 · O'Reilly Data Science Books O'Reilly Amazon

book

by Greg Rafferty

AI/ML Data Science data-science data-science-tasks prophet statistics time-series

Discover how to effectively forecast time series data using Prophet, the versatile open-source tool developed by Meta. Whether you're a business analyst or a machine learning expert, this book provides comprehensive insights into creating, diagnosing, and refining forecasting models. By mastering Prophet, you'll be equipped to make accurate predictions that drive decisions. What this Book will help me do Master the core principles of using Prophet for time series forecasting. Ensure your forecasts are accurate and robust for better decision-making. Gain experience in handling real-world forecasting challenges, like seasonality and outliers. Learn how to fine-tune and optimize models using additional regressors. Understand productionalization of forecasting models to apply solutions at scale. Author(s) Greg Rafferty is a seasoned data scientist specializing in time series analysis and machine learning. With years of practical experience building forecasting models in industries ranging from finance to e-commerce, Greg is dedicated to teaching accessible and actionable approaches to data science. Through clear explanations and practical examples, he empowers readers to solve challenging forecasting problems with confidence. Who is it for? Ideal for data scientists, business analysts, machine learning engineers, and software developers seeking to enhance their forecasting skills with Prophet. Whether you're familiar with time series concepts or just starting to explore forecasting methods, this book helps you advance from fundamental understanding to practical application of state-of-the-art techniques for impactful results.

IBM Storage DS8900F Product Guide Release 9.3.2

2023-03-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Connie Riggins , Peter Kimmel , Jeff Cook

AI/ML BI IBM data-engineering

This IBM® Redbooks Product Guide provides an overview of the features and functions that are available with the IBM Storage DS8900F models that run microcode Release 9.3.2 (Bundle 89.32/Licensed Machine Code 7.9.32). As of February 2023, the DS8900F with DS8000 Release 9.3.2 is the latest addition. The DS8900F is an all-flash system exclusively, and it offers three classes: IBM DS8980F: Analytic Class: The DS8980F Analytic Class offers best performance for organizations that want to expand their workload possibilities to artificial intelligence (AI), Business Intelligence, and Machine Learning. IBM DS8950F: Agility Class: The agility class is efficiently designed to consolidate all your mission-critical workloads for IBM zSystems, IBM LinuxONE, IBM Power Systems, and distributed environments under a single all-flash storage solution. IBM DS8910F: Flexibility Class: The flexibility class delivers significant performance for midrange organizations that are looking to meet storage challenges with advanced functionality delivered as a single rack solution.

Business Models for Industry 4.0

2023-03-24 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Sebastian Saniuk , Sandra Grabowska

data-engineering data-models

This book provides comprehensive knowledge on the operating conditions and challenges of small and medium-sized enterprises operating in the era of industry 4.0 and proposes a business model 4.0 concept.

Sentient Strategy

2023-03-23 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Alan Weiss

data-engineering real-time-analytics streaming-messaging

Alan Weiss equips the reader to consider using this approach independently. These are new times -- new reality, a “no normal” -- hence, it’s ridiculous to use old approaches to strategy.

Azure SQL Hyperscale Revealed: High-performance Scalable Solutions for Critical Data Workloads

2023-03-21 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Daniel Scott-Raynsford , Zoran Barać

Azure Cloud Computing Microsoft SQL azure-sql-database data-engineering relational-databases

Take a deep dive into the Azure SQL Database Hyperscale Service Tier and discover a new form of cloud architecture from Microsoft that supports massive databases. The new horizontally scalable architecture, formerly code-named Socrates, allows you to decouple compute nodes from storage layers. This radically different approach dramatically increases the scalability of the service. This book shows you how to leverage Hyperscale to provide next-level scalability, high throughput, and fast performance from large databases in your environment. The book begins by showing how Hyperscale helps you eliminate many of the problems of traditional high-availability and disaster recovery architecture. You’ll learn how Hyperscale overcomes storage capacity limitations and issues with scale-up times and costs. With Hyperscale, your costs do not increase linearly with database size and you can manage more data than ever at a lower cost. The book teaches you how to deploy, configure, and monitor an Azure SQL Hyperscale database in a production environment. The book also covers migrating your current workloads from traditional architecture to Azure SQL Hyperscale. What You Will Learn Understand the advantages of Hyperscale over traditional architecture Deploy a Hyperscale database on the Azure cloud (interactively and with code) Configure the advanced features of the Hyperscale database tier Monitor and scale database performance to suit your needs Back up and restore your Azure SQL Hyperscale databases Implement disaster recovery and failover capability Compare performance of Hyperscale vs traditional architecture Migrate existing databases to the Hyperscale service tier Who This Book Is For SQL architects, data engineers, and DBAs who want the most efficient and cost-effective cloud technologies to run their critical data workloads, and those seeking rapid scalability and high performance and throughput while utilizing large databases

Bioinformatics Tools for Pharmaceutical Drug Product Development

2023-03-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by Vivek Chavda , Vasso Apostolopoulos , Krishnan Anand

AI/ML bioinformatics data-science data-science-domains

BIOINFORMATICS TOOLS FOR PHARMACEUTICAL DRUG PRODUCT DEVELOPMENT A timely book that details bioinformatics tools, artificial intelligence, machine learning, computational methods, protein interactions, peptide-based drug design, and omics technologies, for drug development in the pharmaceutical and medical sciences industries. The book contains 17 chapters categorized into 3 sections. The first section presents the latest information on bioinformatics tools, artificial intelligence, machine learning, computational methods, protein interactions, peptide-based drug design, and omics technologies. The following 2 sections include bioinformatics tools for the pharmaceutical sector and the healthcare sector. Bioinformatics brings a new era in research to accelerate drug target and vaccine design development, improving validation approaches as well as facilitating and identifying side effects and predicting drug resistance. As such, this will aid in more successful drug candidates from discovery to clinical trials to the market, and most importantly make it a more cost-effective process overall. Readers will find in this book: Applications of bioinformatics tools for pharmaceutical drug product development like process development, pre-clinical development, clinical development, commercialization of the product, etc.; The ever-expanding application of this novel technology and discusses some of the unique challenges associated with such an approach; The broad and deep background, as well as updates, on recent advances in both medicine and AI/ML that enable the application of these cutting-edge bioinformatics tools. Audience The book will be used by researchers and scientists in academia and industry including drug developers, computational biochemists, bioinformaticians, immunologists, pharmaceutical and medical sciences, as well as those in artificial intelligence and machine learning.

Introduction to IBM PowerVM

2023-03-13 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ivaylo B. Bozhinov , Prerna Upmanyu , Muhammad Mahmood , Vivek Shukla , Ahmed Mashhour , Ayman Mostafa , Turgut Genc

IBM Linux data-engineering

Virtualization plays an important role in resource efficiency by optimizing performance, reducing costs, and improving business continuity. IBM PowerVM® provides a secure and scalable server virtualization environment for IBM AIX®, IBM® i, and Linux applications. PowerVM is built on the advanced reliability, availability, and serviceability (RAS) features and leading performance of IBM Power servers. This IBM Redbooks® publication introduces PowerVM virtualization technologies on Power servers. This publication targets clients who are new to Power servers and introduces the available capabilities of the PowerVM platform. This publication includes the following chapters: Chapter 1, "IBM PowerVM overview" introduces PowerVM and provides a high-level overview of the capabilities and benefits of the platform. Chapter 2, "IBM PowerVM features in details" provides a more in-depth review of PowerVM capabilities for system administrators and architects to familiarize themselves with its features. Chapter 3, "Planning for IBM PowerVM" provides planning guidance about PowerVM to prepare for the implementation of the solution. Chapter 4, "Implementing IBM PowerVM" describes and details configuration steps to implement PowerVM, starting from implementing the Virtual I/O Server (VIOS) to storage and network I/O virtualization configurations. Chapter 5, "Managing the PowerVM environment" focuses on systems management, day-to-day operations, monitoring, and maintenance. Chapter 6, "Automation on IBM Power servers" explains available techniques, utilities, and benefits of modern automation solutions.

Proactive Early Threat Detection and Securing Oracle Database with IBM QRadar, IBM Security Guardium Database Protection, and IBM Copy Services Manager by using IBM FlashSystem Safeguarded Copy

2023-03-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Raninder Ravi Bhandari , Shashank Shingornikar

IBM Oracle Cyber Security data-engineering

This IBM® blueprint publication focuses on early threat detection within a database environment by using IBM Security® Guardium® Data Protection and IBM QRadar®. It also highlights how to proactively start a cyber resilience workflow in response to a cyberattack or potential malicious user actions. The workflow that is presented here uses IBM Copy Services Manager as orchestration software to start IBM FlashSystem® Safeguarded Copy functions. The Safeguarded Copy creates an immutable copy of the data in an air-gapped form on the same IBM FlashSystem for isolation and eventual quick recovery. This document describes how to enable and forward Oracle database user activities (by using IBM Security Guardium Data Protection) and IBM FlashSystem audit logs by using IBM FlashSystem to IBM QRadar. This document also describes how to create various rules to determine a threat, and configure and launch a suitable response to the detected threat in IBM QRadar. The document also outlines the steps that are involved to create a Scheduled Task by using IBM Copy Services Manager with various actions.

Scaling Machine Learning with Spark

2023-03-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Adi Polak (Treeverse)

AI/ML PyTorch Spark TensorFlow apache-spark data-engineering

Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals--allowing data and ML practitioners to collaborate and understand each other better. Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology. You will: Explore machine learning, including distributed computing concepts and terminology Manage the ML lifecycle with MLflow Ingest data and perform basic preprocessing with Spark Explore feature engineering, and use Spark to extract features Train a model with MLlib and build a pipeline to reproduce it Build a data system to combine the power of Spark with deep learning Get a step-by-step example of working with distributed TensorFlow Use PyTorch to scale machine learning and its internal architecture

SnowPro™ Core Certification Companion: Hands-on Preparation and Practice

2023-03-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Maja Ferle

Cloud Computing Cyber Security Snowflake data-engineering

This study companion helps you prepare for the SnowPro Core Certification exam. The author guides your studies so you will not have to tackle the exam by yourself. To help you track your progress, chapters in this book correspond to the exam domains as described on Snowflake’s website. Upon studying the material in this book, you will have solid knowledge that should give you the best shot possible at taking and passing the exam and earning the certification you deserve. Each chapter provides explanations, instructions, guidance, tips, and other information with the level of detail that you need to prepare for the exam. You will not waste your time with unneeded detail and advanced content which is out of scope of the exam. Focus is kept on reviewing the materials and helping you become familiar with the content of the exam that is recommended by Snowflake. This Book Helps You Review the domains that Snowflake specifically recommends you study in preparation for Exam COF-C02 Identify gaps in your knowledge that you can study and fill in to increase your chances of passing Exam COF-C02 Level up your knowledge even if not taking the exam, so you know the same material as someone who has taken the exam Learn how to set up a Snowflake account and configure access according to recommended security best practices Be capable of loading structured and unstructured data into Snowflake as well as unloading data from Snowflake Understand how to apply Snowflake data protection features such as cloning, time travel, and fail safe Review Snowflake’s data sharing capabilities, including data marketplace and data exchange Who This Book Is For Those who are planning to take the SnowPro Core Certification COF-C02 exam, and anyone who wishes to gain core expertise in implementing and migrating to the Snowflake Data Cloud
