data

IBM GDPS: An Introduction to Concepts and Capabilities

2024-03-21 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rosazila Musa , David Draper , Mairi Jane Lee , Marie-France Narbey , John Thompson (EY)

IBM data-engineering

This IBM Redbooks® publication presents an overview of the IBM Geographically Dispersed Parallel Sysplex® (IBM GDPS®) offerings and the roles they play in delivering a business IT resilience solution. The book begins with general concepts of business IT resilience and disaster recovery (DR), along with issues that are related to high application availability, data integrity, and performance. These topics are considered within the framework of government regulation, increasing application and infrastructure complexity, and the competitive and rapidly changing modern business environment. Next, it describes the GDPS family of offerings with specific reference to how they can help you achieve your defined goals for high availability and disaster recovery (HADR). Also covered are the features that simplify and enhance data replication activities, the prerequisites for implementing each offering, and tips for planning for the future and immediate business requirements. Tables provide easy-to-use summaries and comparisons of the offerings. The extra planning and implementation services available from IBM® also are explained. Then, several practical client scenarios and requirements are described, along with the most suitable GDPS solution for each case. The introductory chapters of this publication are intended for a broad technical audience, including IT System Architects, Availability Managers, Technical IT Managers, Operations Managers, System Programmers, and Disaster Recovery Planners. The subsequent chapters provide more technical details about the GDPS offerings, and each can be read independently for those readers who are interested in specific topics. Therefore, if you read all of the chapters, be aware that some information is intentionally repeated.

Introduction to the New Statistics, 2nd Edition

2024-03-21 · O'Reilly Data Science Books O'Reilly Amazon

book

by Robert Calin-Jageman , Geoff Cumming

data-science data-science-tasks statistics

This fully updated second edition is an essential introduction to inferential statistics. It is the first introductory statistics text to use an estimation approach with meta-analysis from the start and also to explain the new and exciting Open Science practices, which encourage replication and enhance the trustworthiness of research.

The Complete Developer

2024-03-19 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Martin Krause

API Docker GitHub JavaScript MongoDB NoSQL React TypeScript data-engineering nosql-databases

Whether you’ve been in the developer kitchen for decades or are just taking the plunge to do it yourself, The Complete Developer will show you how to build and implement every component of a modern stack—from scratch. You’ll go from a React-driven frontend to a fully fleshed-out backend with Mongoose, MongoDB, and a complete set of REST and GraphQL APIs, and back again through the whole Next.js stack. The book’s easy-to-follow, step-by-step recipes will teach you how to build a web server with Express.js, create custom API routes, deploy applications via self-contained microservices, and add a reactive, component-based UI. You’ll leverage command line tools and full-stack frameworks to build an application whose no-effort user management rides on GitHub logins. You’ll also learn how to: Work with modern JavaScript syntax, TypeScript, and the Next.js framework Simplify UI development with the React library Extend your application with REST and GraphQL APIs Manage your data with the MongoDB NoSQL database Use OAuth to simplify user management, authentication, and authorization Automate testing with Jest, test-driven development, stubs, mocks, and fakes Whether you’re an experienced software engineer or new to DIY web development, The Complete Developer will teach you to succeed with the modern full stack. After all, control matters. Covers: Docker, Express.js, JavaScript, Jest, MongoDB, Mongoose, Next.js, Node.js, OAuth, React, REST and GraphQL APIs, and TypeScript

Healthcare Big Data Analytics

2024-03-18 · O'Reilly Data Science Books O'Reilly Amazon

book

by Rutvij H. Jhaveri , Victor de Albuquerque , Akash Kumar Bhoi , Ranjit Panigrahi

Analytics Big Data Data Analytics Data Collection IoT data-science healthcare-analytics

This book highlights how optimized big data applications can be used for patient monitoring and clinical diagnosis. In fact, IoT-based applications are data-driven and mostly employ modern optimization techniques. The book also explores challenges, opportunities, and future research directions, discussing the stages of data collection and pre-processing, as well as the associated challenges and issues in data handling and setup.

Fundamentals of Organizational Behaviour

2024-03-03 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Chia-Yu Kou-Barrett

data-engineering database-theory relational-databases

Explore the practical implications and relevance of organizational behaviour with this concise textbook that successfully bridges the gap between theory and practice.

MATLAB Machine Learning Recipes: A Problem-Solution Approach

2024-03-01 · O'Reilly Data Science Books O'Reilly Amazon

book

by Michael Paluszek , Stephanie Thomas

AI/ML MATLAB data-science data-science-tools

Harness the power of MATLAB to resolve a wide range of machine learning challenges. This new and updated third edition provides examples of technologies critical to machine learning. Each example solves a real-world problem, and all code provided is executable. You can easily look up a particular problem and follow the steps in the solution. This book has something for everyone interested in machine learning. It also has material that will allow those with an interest in other technology areas to see how machine learning and MATLAB can help them solve problems in their areas of expertise. The chapter on data representation and MATLAB graphics includes new data types and additional graphics. Chapters on fuzzy logic, simple neural nets, and autonomous driving have new examples added. And there is a new chapter on spacecraft attitude determination using neural nets. Authors Michael Paluszek and Stephanie Thomas show how all of these technologies allow you to build sophisticated applications to solve problems with pattern recognition, autonomous driving, expert systems, and much more. What You Will Learn Write code for machine learning, adaptive control, and estimation using MATLAB Use MATLAB graphics and visualization tools for machine learning Become familiar with neural nets Build expert systems Understand adaptive control Gain knowledge of Kalman Filters Who This Book Is For Software engineers, control engineers, university faculty, undergraduate and graduate students, hobbyists.

Practical MongoDB Aggregations

2024-03-01 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Paul Done

Big Data IoT MongoDB Cyber Security data-engineering nosql-databases

Dive into the capabilities of the MongoDB aggregation framework with this official guide, "Practical MongoDB Aggregations". You'll learn how to design and optimize efficient aggregation pipelines for MongoDB 7.0, empowering you to handle complex data analysis and processing tasks directly within the database. What this Book will help me do Gain expertise in crafting advanced MongoDB aggregation pipelines for custom data workflows. Learn to perform time series analysis for financial datasets and IoT applications. Discover optimization techniques for working with sharded clusters and large datasets. Master array manipulation and other specific operations essential for MongoDB data models. Build pipelines that ensure data security and distribution while maintaining performance. Author(s) Paul Done, a recognized expert in MongoDB, brings his extensive experience in database technologies to this book. With years of practice in helping companies leverage MongoDB for big data solutions, Paul shares his deep knowledge in an accessible and logical manner. His approach to writing is hands-on, focusing on practical insights and clear explanations. Who is it for? This book is tailored for intermediate-level developers, database architects, data analysts, engineers, and scientists who use MongoDB. If you are familiar with MongoDB and looking to expand your understanding specifically around its aggregation capabilities, this guide is for you. Whether you're analyzing time series data or need to optimize pipelines for performance, you'll find actionable tips and examples here to suit your needs.

Building Interactive Dashboards in Microsoft 365 Excel

2024-02-29 · O'Reilly Data Science Books O'Reilly Amazon

book

by Michael Olafusi (Data analytics consulting company)

Analytics BI Dashboard Microsoft dashboards data-science data-science-tasks data-visualization

Microsoft 365 Excel introduces enhanced features that transform how business dashboards are built and maintained. This book guides you through creating dynamic, interactive dashboards that leverage these modern capabilities. From understanding the essential principles of effective dashboard design to mastering the latest tools like Power Query and dynamic array functions, you'll make the most of Excel's full potential. What this Book will help me do Understand the purpose and advantages of effective dashboards in business analytics. Use advanced Excel functions and tools such as Power Query and dynamic arrays to handle complex data workflows. Design visually engaging dashboards using charts and data visualizations that communicate key insights. Optimize dashboards for automation and real-time data updates, saving time and effort. Apply best practices and techniques for creating professional-grade Excel dashboards. Author(s) Michael Olafusi is a skilled data analyst and expert in Microsoft Excel, with years of experience leveraging Excel for business intelligence and analytics solutions. He enjoys teaching Excel users how to elevate their skills to create functional and visually impactful tools. Michael's approach combines clarity and practical advice, helping readers build proficiency and confidence. Who is it for? This book is perfect for Excel users who want to create professional dashboards for business decision support. It's especially useful for data analysts, financial analysts, business analysts, and those in similar roles. It requires a basic familiarity with Excel's interface and is ideal for those seeking to enhance their data presentation skills and automate repetitive reporting tasks.

Cracking the Data Science Interview

2024-02-29 · O'Reilly Data Science Books O'Reilly Amazon

book

by Aaren Stubberfield , Leondra R. Gonzalez

AI/ML Bash Data Science Git Python SQL data-science

"Cracking the Data Science Interview" is your ultimate resource for preparing for roles in the competitive field of data science. With this book, you'll explore essential topics such as Python, SQL, statistics, and machine learning, as well as learn practical skills for building portfolios and acing interviews. Follow its guidance and you'll be equipped to stand out in any data science interview. What this Book will help me do Confidently explain complex statistical and machine learning concepts. Develop models and deploy them while ensuring version control and efficiency. Learn and apply scripting skills in shell and Bash for productivity. Master Git workflows to handle collaborative coding in projects. Perfectly tailor portfolios and resumes to land data science opportunities. Author(s) Leondra R. Gonzalez, with years of data science and mentorship experience, co-authors this book with Aaren Stubberfield, a seasoned expert in technology and machine learning. Together, they integrate their expertise to provide practical advice for navigating the data science job market. Who is it for? If you're preparing for data science interviews, this book is for you. It's ideal for candidates with a foundational knowledge of Python, SQL, and statistics looking to refine and expand their technical and professional skills. Professionals transitioning into data science will also find it invaluable for building confidence and succeeding in this rewarding field.

Data Cleaning with Power BI

2024-02-29 · O'Reilly Data Science Books O'Reilly Amazon

book

by Gus Frazer

Analytics BI Data Quality Data Science DAX Microsoft Power BI business-intelligence data-science microsoft-power-platform power-bi

Delve into the powerful world of data cleaning with Microsoft Power BI in this detailed guide. You'll learn how to connect, transform, and optimize data from various sources, setting a strong foundation for insightful data-driven decisions. Equip yourself with the skills to master data quality, leverage DAX and Power Query, and produce actionable insights with improved efficiency. What this Book will help me do Master connecting to various data sources and importing data effectively into Power BI. Learn to use the Query Editor to clean and transform data efficiently. Understand how to use the M language to perform advanced data transformations. Gain expertise in creating optimized data models and handling relationships within Power BI. Explore insights-driven exploratory data analysis using Power BI's powerful tools. Author(s) Gus Frazer is an experienced data professional with a deep knowledge of business intelligence tools and analytics processes. With a strong background in data science and years of hands-on experience using Power BI, Frazer brings practical advice to help users improve their data preparation and analysis skills. Known for creating resources that are both comprehensive and approachable, Frazer is dedicated to empowering readers in their data journey. Who is it for? This book is ideal for data analysts, business intelligence professionals, and business analysts who work regularly with data. If you have a basic understanding of BI tools and concepts and are looking to deepen your skills, especially in Power BI, this book will guide you effectively. It will also help data scientists and other professionals interested in data cleaning to build a robust basis for data quality and analysis. Whether you're addressing common data challenges or seeking to enhance your BI capabilities, this guide is tailored to accommodate your needs.

Kibana 8.x – A Quick Start Guide to Data Analysis

2024-02-29 · O'Reilly Data Science Books O'Reilly Amazon

book

by Krishna Shah

Analytics Big Data Data Analytics ELK Kibana data-science data-science-tasks data-visualization

Kibana 8.x - A Quick Start Guide to Data Analysis is an essential resource for anyone wanting to harness the robust capabilities of Kibana to analyze, visualize, and make sense of their data. Through clear explanations and practical exercises, this guide breaks down topics like creating dashboards, exploring datasets, and configuring Kibana's powerful features. What this Book will help me do Understand Kibana's interface and functionalities to manage Elasticsearch data. Learn how to create intuitive visualizations and customize dashboards. Explore features such as data discovery and real-time updates for analytics. Optimize and query datasets using ESQL and detailed analytics techniques. Master the process of embedding dashboards and exporting insights. Author(s) Krishna Shah is an experienced data analytics professional with a deep understanding of the Elastic Stack, including Kibana and Elasticsearch. Having spent years working on big data projects, Shah is dedicated to helping technologists turn data into actionable insights. Her writing aims to simplify complex concepts into achievable learning milestones. Who is it for? This book is ideal for data analysts, data engineers, and anyone working extensively with Elasticsearch datasets. If you aim to gain hands-on experience with building interactive dashboards and visualizing data trends, this book is tailored for you. A foundational understanding of Elasticsearch would be beneficial but is not strictly required. Perfect for advancing decision-making with data insights.

Learn Microsoft Fabric

2024-02-29 · O'Reilly Data Science Books O'Reilly Amazon

book

by Arshad Ali , Bradley Schacht

AI/ML Analytics Data Analytics Data Science Microsoft Fabric Cyber Security Spark SQL analytics-platforms data-science microsoft-fabric

Dive into the wonders of Microsoft Fabric, the ultimate solution for mastering data analytics in the AI era. Through engaging real-world examples and hands-on scenarios, this book will equip you with all the tools to design, build, and maintain analytics systems for various use cases like lakehouses, data warehouses, real-time analytics, and data science. What this Book will help me do Understand and utilize the key components of Microsoft Fabric for modern analytics. Build scalable and efficient data analytics solutions with medallion architecture. Implement real-time analytics and machine learning models to derive actionable insights. Monitor and administer your analytics platform for high performance and security. Leverage AI-powered assistant Copilot to boost analytics productivity. Author(s) Arshad Ali and Bradley Schacht bring years of expertise in data analytics and system architecture to this book. Arshad is a seasoned professional specialized in AI-integrated analytics platforms, while Bradley has a proven track record in deploying enterprise data solutions. Together, they provide deep insights and practical knowledge with a structured and approachable teaching method. Who is it for? Ideal for data professionals such as data analysts, engineers, scientists, and AI/ML experts aiming to enhance their data analytics skills and master Microsoft Fabric. It's also suited for students and new entrants to the field looking to establish a firm foundation in analytics systems. Requires a basic understanding of SQL and Spark.

Learn T-SQL Querying - Second Edition

2024-02-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Pam Lahoud , Pedro Lopes

Azure Microsoft SQL SQL Server data-engineering

Troubleshoot query performance issues, identify anti-patterns in your code, and write efficient T-SQL queries with this guide for T-SQL developers Key Features A definitive guide to mastering the techniques of writing efficient T-SQL code Learn query optimization fundamentals, query analysis, and how query structure impacts performance Discover insightful solutions to detect, analyze, and tune query performance issues Purchase of the print or Kindle book includes a free PDF eBook Book Description Data professionals seeking to excel in Transact-SQL for Microsoft SQL Server and Azure SQL Database often lack comprehensive resources. Learn T-SQL Querying second edition focuses on indexing queries and crafting elegant T-SQL code, enabling data professionals to gain mastery in modern SQL Server versions (2022) and Azure SQL Database. The book covers new topics like logical statement processing flow, data access using indexes, and best practices for tuning T-SQL queries. Starting with query processing fundamentals, the book lays a foundation for writing performant T-SQL queries. You’ll explore the mechanics of the Query Optimizer and Query Execution Plans, learning to analyze execution plans for insights into current performance and scalability. Using dynamic management views (DMVs) and dynamic management functions (DMFs), you’ll build diagnostic queries. The book covers indexing and delves into SQL Server’s built-in tools to expedite resolution of T-SQL query performance and scalability issues. Hands-on examples will guide you to avoid UDF pitfalls and understand features like predicate SARGability, Query Store, and Query Tuning Assistant.
By the end of this book, you’ll have developed the ability to identify query performance bottlenecks, recognize anti-patterns, and avoid pitfalls. What you will learn Identify opportunities to write well-formed T-SQL statements Familiarize yourself with the Cardinality Estimator for query optimization Create efficient indexes for your existing workloads Implement best practices for T-SQL querying Explore Query Execution Dynamic Management Views Utilize the latest performance optimization features in SQL Server 2017, 2019, and 2022 Safeguard query performance during upgrades to newer versions of SQL Server Who this book is for This book is for database administrators, database developers, data analysts, data scientists and T-SQL practitioners who want to master the art of writing efficient T-SQL code and troubleshooting query performance issues through practical examples. A basic understanding of T-SQL syntax, writing queries in SQL Server, and using the SQL Server Management Studio tool will be helpful to get started.

Azure Data Factory Cookbook - Second Edition

2024-02-28 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Xenia Ireton , Tonya Chernyshova , Dmitry Foshin , Dmitry Anoshin

Analytics Azure ADF Cloud Computing Data Engineering Data Lake Databricks Delta DWH ETL/ELT Microsoft Fabric +4 more

This comprehensive guide to Azure Data Factory shows you how to create robust data pipelines and workflows to handle both cloud and on-premises data solutions. Through practical recipes, you will learn to build, manage, and optimize ETL, hybrid ETL, and ELT processes. The book offers detailed explanations to help you integrate technologies like Azure Synapse, Data Lake, and Databricks into your projects. What this Book will help me do Master building and managing data pipelines using Azure Data Factory's latest versions and features. Leverage Azure Synapse and Azure Data Lake for streamlined data integration and analytics workflows. Enhance your ETL/ELT solutions with Microsoft Fabric, Databricks, and Delta tables. Employ debugging tools and workflows in Azure Data Factory to identify and solve data processing issues efficiently. Implement industry-grade best practices for reliable and efficient data orchestration and integration pipelines. Author(s) Dmitry Foshin, Tonya Chernyshova, Dmitry Anoshin, and Xenia Ireton collectively bring years of expertise in data engineering and cloud-based solutions. They are recognized professionals in the Azure ecosystem, dedicated to sharing their knowledge through detailed and actionable content. Their collaborative approach ensures that this book provides practical insights for technical audiences. Who is it for? This book is ideal for data engineers, ETL developers, and professional architects who work with cloud and hybrid environments. If you're looking to upskill in Azure Data Factory or expand your knowledge into related technologies like Synapse Analytics or Databricks, this is for you. Readers should have a foundational understanding of data warehousing concepts to fully benefit from the material.

Big Data Computing

2024-02-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Bishwajeet Kumar Pandey , Tanvir Habib Sardar

Big Data Hadoop Hive NoSQL Data Streaming apache-hive data-engineering

This book primarily aims to provide an in-depth understanding of recent advances in big data computing technologies, methodologies, and applications along with introductory details of big data computing models such as Apache Hadoop, MapReduce, Hive, Pig, Mahout, in-memory storage systems, NoSQL databases, and big data streaming services.

IBM FlashSystem and VMware Implementation and Best Practices Guide

2024-02-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jordan Fincher , Duane Bolland , David Green , Nezih Boyacioglu , Gucer Vasfi , Ibrahim Alade Rufai , Leandro Torolho , Warren Hawkins

API IBM VMware data-engineering

This IBM® Redbooks® publication details the configuration and best practices for using the IBM FlashSystem® family of storage products within a VMware environment. The first version of this book was published in 2021 and specifically addressed IBM Spectrum® Virtualize Version 8.4 with VMware vSphere 7.0. This second version of this book includes all the enhancements that are available with IBM Spectrum Virtualize 8.5. Topics illustrate planning, configuring, operations, and preferred practices that include integration of IBM FlashSystem storage systems with the VMware vCloud suite of applications: VMware vSphere Web Client (vWC) vSphere Storage APIs - Storage Awareness (VASA) vSphere Storage APIs – Array Integration (VAAI) VMware Site Recovery Manager (SRM) VMware vSphere Metro Storage Cluster (vMSC) Embedded VASA Provider for VMware vSphere Virtual Volumes (vVols) This book is intended for presales consulting engineers, sales engineers, and IBM clients who want to deploy IBM FlashSystem storage systems in virtualized data centers that are based on VMware vSphere. Note: There is a newer version of this book: "IBM Storage Virtualize and VMware: Integrations, Implementation and Best Practices, SG24-8549". This book addresses IBM Storage Virtualize Version 8.6 with VMware vSphere 8. The new IBM Storage plugin for vSphere is covered in this book.

IBM TS7700 Release 5.3 Guide

2024-02-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Shinsuke Ueyama , Aderson Pacini , Dave Brettell , Yuki Asakura , Nao Takemura , Lourie Goodall , Alberto Barajas Ortiz , Erina Tatsumi , Nielson ’Nino’ de Carvalho , Chen Zhu , Larry Coyne , Taisei Takai , Tomoaki Ogino , Michael Scott , Kousei Kawamura , Derek Erdmann , Trinidad Armando Rangel Ruiz , Shinya Ohri , Nobuhiko Furuya , Joe Hew , Rin Fujiwara , Ramón A. Minjares Campos , Stefan Neff , Tony Makepeace , Takahiro Tsuda

Cloud Computing Cloud Storage IBM S3 Cyber Security data-engineering

This IBM Redbooks® publication covers IBM TS7700 R5.3. The IBM TS7700 is part of a family of IBM Enterprise tape products. This book is intended for system architects and storage administrators who want to integrate their storage systems for optimal operation. Building on over 25 years of experience, the R5.3 release includes many features that enable improved performance, usability, and security. Highlights include the IBM TS7700 Advanced Object Store, an all flash TS7770, grid resiliency enhancements, and Logical WORM retention. By using the same hierarchical storage techniques, the TS7700 (TS7770 and TS7760) can also off load to object storage. Because object storage is cloud-based and accessible from different regions, the TS7700 Cloud Storage Tier support essentially allows the cloud to be an extension of the grid. As of this writing, the TS7700C supports the ability to off load to IBM Cloud Object Storage, Amazon S3, and RSTOR. This publication explains features and concepts that are specific to the IBM TS7700 as of release R5.3. The R5.3 microcode level provides IBM TS7700 Cloud Storage Tier enhancements, IBM DS8000 Object Storage enhancements, Management Interface dual control security, and other smaller enhancements. The R5.3 microcode level can be installed on the IBM TS7770 and IBM TS7760 models only. TS7700 provides tape virtualization for the IBM Z® environment. Off loading to physical tape behind a TS7700 is used by hundreds of organizations around the world. 
New and existing capabilities of the TS7700 5.3 release include the following highlights: Support for IBM TS1160 Tape Drives and JE/JM media Eight-way Grid Cloud, which consists of up to three generations of TS7700 Synchronous and asynchronous replication of virtual tape and TCT objects Grid access to all logical volume and object data independent of where it resides An all flash TS7770 option for improved performance Full Advanced Object Store Grid Cloud support of DS8000 Transparent Cloud Tier Full AES256 encryption for data that is in-flight and at-rest Tight integration with IBM Z and DFSMS policy management DS8000 Object Store with AES256 in-flight encryption and compression Regulatory compliance through Logical WORM and LWORM Retention support Cloud Storage Tier support for archive, logical volume versions, and disaster recovery Optional integration with physical tape 16 Gb IBM FICON® throughput that exceeds 4 GBps per TS7700 cluster Grid Resiliency Support with Control Unit Initiated Reconfiguration (CUIR) support IBM Z hosts view up to 3,968 3490 devices per TS7700 grid TS7770 Cache On Demand feature that uses capacity-based licensing TS7770 support of SSD within the VED server The TS7700T writes data by policy to physical tape through attachment to high-capacity, high-performance IBM TS1160, IBM TS1150, and IBM TS1140 tape drives that are installed in an IBM TS4500 or TS3500 tape library. The TS7770 models are based on high-performance and redundant IBM Power9® technology. They provide improved performance for most IBM Z tape workloads when compared to the previous generations of IBM TS7700.

Graph Algorithms for Data Science

2024-02-26 · O'Reilly Data Science Books O'Reilly Amazon

book

by Tomaz Bratanic

AI/ML CSV Data Science NLP SQL data-science

Practical methods for analyzing your data with graphs, revealing hidden connections and new insights. Graphs are the natural way to represent and understand connected data. This book explores the most important algorithms and techniques for graphs in data science, with concrete advice on implementation and deployment. You don’t need any graph experience to start benefiting from this insightful guide. These powerful graph algorithms are explained in clear, jargon-free text and illustrations that make them easy to apply to your own projects. In Graph Algorithms for Data Science you will learn: Labeled-property graph modeling Constructing a graph from structured data such as CSV or SQL NLP techniques to construct a graph from unstructured data Cypher query language syntax to manipulate data and extract insights Social network analysis algorithms like PageRank and community detection How to translate graph structure to a ML model input with node embedding models Using graph features in node classification and link prediction workflows Graph Algorithms for Data Science is a hands-on guide to working with graph-based data in applications like machine learning, fraud detection, and business data analysis. It’s filled with fascinating and fun projects, demonstrating the ins-and-outs of graphs. You’ll gain practical skills by analyzing Twitter, building graphs with NLP techniques, and much more. About the Technology A graph, put simply, is a network of connected data. Graphs are an efficient way to identify and explore the significant relationships naturally occurring within a dataset. This book presents the most important algorithms for graph data science with examples from machine learning, business applications, natural language processing, and more. About the Book Graph Algorithms for Data Science shows you how to construct and analyze graphs from structured and unstructured data.
In it, you’ll learn to apply graph algorithms like PageRank, community detection/clustering, and knowledge graph models by putting each new algorithm to work in a hands-on data project. This cutting-edge book also demonstrates how you can create graphs that optimize input for AI models using node embedding. What's Inside Creating knowledge graphs Node classification and link prediction workflows NLP techniques for graph construction About the Reader For data scientists who know machine learning basics. Examples use the Cypher query language, which is explained in the book. About the Author Tomaž Bratanič works at the intersection of graphs and machine learning. Arturo Geigel was the technical editor for this book. Quotes Undoubtedly the quickest route to grasping the practical applications of graph algorithms. Enjoyable and informative, with real-world business context and practical problem-solving. - Roger Yu, Feedzai Brilliantly eases you into graph-based applications. - Sumit Pal, Independent Consultant I highly recommend this book to anyone involved in analyzing large network databases. - Ivan Herreros, talentsconnect Insightful and comprehensive. The author’s expertise is evident. Be prepared for a rewarding journey. - Michal Štefaňák, Volke

Mastering Microsoft Fabric: SAASification of Analytics

2024-02-21 · O'Reilly Data Science Books O'Reilly Amazon

book

by Debananda Ghosh

AI/ML Analytics AWS Azure ADF BI Cloud Computing Data Engineering Data Lakehouse Data Management Data Science DWH

Learn and explore the capabilities of Microsoft Fabric, the latest evolution in cloud analytics suites. This book helps you understand how users can leverage a Microsoft Office-like experience to perform data management and advanced analytics.

The book starts with an overview of the evolution of analytics from on-premises to cloud infrastructure as a service (IaaS), platform as a service (PaaS), and now software as a service (SaaS), and provides an introduction to Microsoft Fabric. You will learn how to provision Microsoft Fabric in your tenant, along with the key capabilities of SaaS analytics products and the advantages of using Fabric in the enterprise analytics platform. OneLake and Lakehouse for data engineering are discussed, as well as OneLake for data science. Author Ghosh teaches you about the data warehouse offerings inside Microsoft Fabric and the new data integration experience, which brings Azure Data Factory and the Power Query Editor of Power BI together in a single platform. Also demonstrated is Real-Time Analytics in Fabric, including capabilities such as Kusto query and database. You will understand how the new event stream feature integrates with OneLake and other computations. You will also learn how to configure the real-time alert capability without writing code and go through the Power BI experience in the Fabric workspace. Fabric pricing and licensing are also covered.

After reading this book, you will understand the capabilities of Microsoft Fabric and its integration with current and upcoming Azure OpenAI capabilities.

What You Will Learn

Build OneLake for all data like OneDrive for Microsoft Office
Leverage shortcuts for cross-cloud data virtualization in Azure and AWS
Understand upcoming OpenAI integration
Discover new event streaming and Kusto query inside Fabric real-time analytics
Utilize seamless tooling for machine learning and data science

Who This Book Is For

Citizen users and experts in the data engineering and data science fields, along with chief AI officers

Speed Metrics Guide: Choosing the Right Metrics to Use When Evaluating Websites

2024-02-20 · O'Reilly Data Science Books O'Reilly Amazon

book

by Matthew Edgar

data-science google-analytics web-analytics

Faster websites offer a better user experience and typically have higher conversion rates. It can be challenging to know where to invest to meaningfully improve a website's speed. Investing correctly to improve speed starts with understanding how to measure speed correctly and knowing how to use those measurements to identify the biggest opportunities.

Speed Metrics Guide helps marketers, SEOs, business leaders, designers, and everybody else involved in website performance select the right metrics to use to optimize their website's speed. Each chapter examines a specific metric and discusses what it measures, why the metric matters, and what tactics will help improve that metric.

What You'll Learn

The best metrics and tools to help you measure website speed, including Google's Core Web Vitals
How and when to best use each metric
Where each metric fits within the website loading process
How to use each metric to find different ways of improving website speed

Who This Book Is For

Non-technical audiences, including marketers, SEOs, designers, and UX professionals.

