O'Reilly Data Science Books

A Friendly Guide to Data Science: Everything You Should Know About the Hottest Field in Tech

2025-06-26 O'Reilly Amazon

book

Kelly P. Vincent

data data-science AI/ML Analytics Data Analytics Data Science

Unlock the world of data science—no coding required. Curious about data science but not sure where to start? This book is a beginner-friendly guide to what data science is and how people use it. It walks you through the essential topics—what data analysis involves, which skills are useful, and how terms like “data analytics” and “machine learning” connect—without getting too technical too fast. Data science isn’t just about crunching numbers, pulling data from a database, or running fancy algorithms. It’s about asking the right questions, understanding the process from start to finish, and knowing what’s possible (and what’s not). This book teaches you all of that, while also introducing important topics like ethics, privacy, and security—because working with data means thinking about people, too. Whether you're a student exploring new skills, a professional navigating data-driven decisions, or someone considering a career change, this book is your friendly gateway into the world of data science, one of today’s most exciting fields. No coding or programming experience? No problem. You'll build a solid foundation and gain the confidence to engage with data science concepts— just as AI and data become increasingly central to everyday life. What You Will Learn Grasp foundational statistics and how it matters in data analysis and data science Understand the data science project life cycle and how to manage a data science project Examine the ethics of working with data and its use in data analysis and data science Understand the foundations of data security and privacy Collect, store, prepare, visualize, and present data Identify the many types of machine learning and know how to gauge performance Prepare for and find a career in data science Who This Book is for A wide range of readers who are curious about data science and eager to build a strong foundation. 
Perfect for undergraduates in the early semesters of their data science degrees, as it assumes no prior programming or industry experience. Professionals will find particular value in the real-world insights shared through practitioner interviews. Business leaders can use it to better understand what data science can do for them and how their teams are applying it. And for career changers, this book offers a welcoming entry point into the field—helping them explore the landscape before committing to more intensive learning paths like degrees or boot camps.

Microsoft Power Platform Solution Architect Certification Companion: Mastering the PL-600 Certification

2025-06-18 O'Reilly Amazon

book

Loganathan K

data data-science business-intelligence microsoft-power-platform pl-600-microsoft-power-platform-solution-architect pl-600: microsoft power platform solution architect

This comprehensive guide book equips you with the knowledge and confidence needed to prep for the exam and thrive as a Power Platform Solution Architect. The book starts with a foundation for successful solution architecture, emphasizing essential skills such as requirements gathering, governance, and security. You will learn to navigate customer discovery, translate business needs into technical requirements, and design solutions that address both functional and non-functional needs. The second part of the book delves into the Microsoft Power Platform ecosystem, offering an in-depth look at its core components—Power Apps, Power Automate, Power BI, Microsoft Copilot, and Robotic Process Automation (RPA). Detailed insights into data modeling, security strategies, and AI integration will guide you in building scalable, secure solutions. Coverage of application life cycle management, which empowers solution architects to design, implement, and deploy Power Platform solutions effectively, is discussed next. You will then go through real-world scenarios, giving you a practical understanding of the challenges and considerations in managing Power Platform projects within a business context. The book concludes with strategies for continuous learning and resources for professional development, including practice questions to assess knowledge and readiness for the PL-600 exam. After reading the book, you will be ready to take the exam and become a successful Power Platform Solution Architect. 
What You Will Learn Understand the Solution Architect's role, responsibilities, and strategic approaches to successfully navigate projects Master the basics of Power Platform Solution Architecture Understand governance, security, and integration concepts in real-world scenarios Design and deploy effective business solutions using Power Platform components Gain the skills necessary to prep for the PL-600 certification exam Who This Book Is For Professionals pursuing Microsoft PL-600 Solution Architect certification and IT consultants and developers transitioning to solution architect roles

HBR's 10 Must Reads on Data Strategy (featuring "Democratizing Transformation" by Marco Iansiti and Satya Nadella)

2025-06-17 O'Reilly Amazon

book

Marco Iansiti , Satya Nadella , Harvard Business Review , Tsedal Neeley , Thomas H. Davenport

data data-science AI/ML Analytics

Data is your business. Have you unlocked its full potential? If you read nothing else on data strategy, read this book. We've combed through hundreds of Harvard Business Review articles and selected the most important ones to help you maximize your analytics capabilities; harness the power of data, algorithms, and AI; and gain competitive advantage in our hyperconnected world. This book will inspire you to: Reap the rewards of digital transformation Make better data-driven decisions Design breakout products that generate profitable insights Address vulnerabilities to cyberattacks and data breaches Reskill your workforce and build a culture of continuous learning Win with personalized customer experiences at scale This collection of articles includes "What's Your Data Strategy?," by Leandro DalleMule and Thomas H. Davenport; "Democratizing Transformation," by Marco Iansiti and Satya Nadella; "Why Companies Should Consolidate Tech Roles in the C-Suite," by Thomas H. Davenport, John Spens, and Saurabh Gupta; "Developing a Digital Mindset," by Tsedal Neeley and Paul Leonardi; "What Does It Actually Take to Build a Data-Driven Culture?," by Mai B. AlOwaish and Thomas C. Redman; "When Data Creates Competitive Advantage," by Andrei Hagiu and Julian Wright; "Building an Insights Engine," by Frank van den Driest, Stan Sthanunathan, and Keith Weed; "Personalization Done Right," by Mark Abraham and David C. Edelman; "Ensure High-Quality Data Powers Your AI," by Thomas C. Redman; "The Ethics of Managing People's Data," by Michael Segalla and Dominique Rouzies; "Where Data-Driven Decision-Making Can Go Wrong," by Michael Luca and Amy C. Edmondson; "Sizing Up Your Cyberrisks," by Thomas J. Parenty and Jack J. Domet; "A Better Way to Put Your Data to Work," by Veeral Desai, Tim Fountaine, and Kayvaun Rowshankish; and "Heavy Machinery Meets AI," by Vijay Govindarajan and Venkat Venkatraman. 
HBR's 10 Must Reads are definitive collections of classic ideas, practical advice, and essential thinking from the pages of Harvard Business Review. Exploring topics like disruptive innovation, emotional intelligence, and new technology in our ever-evolving world, these books empower any leader to make bold decisions and inspire others.

R Programming for Mass Spectrometry

2025-05-28 O'Reilly Amazon

book

Randall K. Julian

data data-science data-science-tools r AI/ML

A practical guide to reproducible and high impact mass spectrometry data analysis R Programming for Mass Spectrometry teaches a rigorous and detailed approach to analyzing mass spectrometry data using the R programming language. It emphasizes reproducible research practices and transparent data workflows and is designed for analytical chemists, biostatisticians, and data scientists working with mass spectrometry. Readers will find specific algorithms and reproducible examples that address common challenges in mass spectrometry alongside example code and outputs. Each chapter provides practical guidance on statistical summaries, spectral search, chromatographic data processing, and machine learning for mass spectrometry. Key topics include: Comprehensive data analysis using the Tidyverse in combination with Bioconductor, a widely used software project for the analysis of biological data Processing chromatographic peaks, peak detection, and quality control in mass spectrometry data Applying machine learning techniques, using Tidymodels for supervised and unsupervised learning, as well as for feature engineering and selection, providing modern approaches to data-driven insights Methods for producing reproducible, publication-ready reports and web pages using RMarkdown R Programming for Mass Spectrometry is an indispensable guide for researchers, instructors, and students. It provides modern tools and methodologies for comprehensive data analysis. With a companion website that includes code and example datasets, it serves as both a practical guide and a valuable resource for promoting reproducible research in mass spectrometry.

Data Without Labels

2025-05-26 O'Reilly Amazon

book

Vaibhav Verdhan

data data-science data-science-tools Pandas AI/ML Data Science

Discover all-practical implementations of the key algorithms and models for handling unlabeled data. Full of case studies demonstrating how to apply each technique to real-world problems. In Data Without Labels you’ll learn: Fundamental building blocks and concepts of machine learning and unsupervised learning Data cleaning for structured and unstructured data like text and images Clustering algorithms like K-means, hierarchical clustering, DBSCAN, Gaussian Mixture Models, and Spectral clustering Dimensionality reduction methods like Principal Component Analysis (PCA), SVD, Multidimensional scaling, and t-SNE Association rule algorithms like Apriori, ECLAT, and SPADE Unsupervised time series clustering, Gaussian Mixture models, and statistical methods Building neural networks such as GANs and autoencoders Working with Python tools and libraries like scikit-learn, NumPy, Pandas, Matplotlib, Seaborn, Keras, TensorFlow, and Flask How to interpret the results of unsupervised learning Choosing the right algorithm for your problem Deploying unsupervised learning to production Maintenance and refresh of an ML solution Data Without Labels introduces mathematical techniques, key algorithms, and Python implementations that will help you build machine learning models for unannotated data. You’ll discover hands-off and unsupervised machine learning approaches that can still untangle raw, real-world datasets and support sound strategic decisions for your business. Don’t get bogged down in theory: the book bridges the gap between complex math and practical Python implementations, covering end-to-end model development all the way through to production deployment. You’ll discover the business use cases for machine learning and unsupervised learning, and access insightful research papers to complete your knowledge. 
About the Technology Generative AI, predictive algorithms, fraud detection, and many other analysis tasks rely on cheap and plentiful unlabeled data. Machine learning on data without labels, or unsupervised learning, turns raw text, images, and numbers into insights about your customers, accurate computer vision, and high-quality datasets for training AI models. This book will show you how. About the Book Data Without Labels is a comprehensive guide to unsupervised learning, offering a deep dive into its mathematical foundations, algorithms, and practical applications. It presents practical examples from retail, aviation, and banking using fully annotated Python code. You’ll explore core techniques like clustering and dimensionality reduction along with advanced topics like autoencoders and GANs. As you go, you’ll learn where to apply unsupervised learning in business applications and discover how to develop your own machine learning models end-to-end. What's Inside Master unsupervised learning algorithms Real-world business applications Curate AI training datasets Explore autoencoders and GANs applications About the Reader Intended for data science professionals. Assumes knowledge of Python and basic machine learning. About the Author Vaibhav Verdhan is a seasoned data science professional with extensive experience working on data science projects in a large pharmaceutical company. Quotes An invaluable resource for anyone navigating the complexities of unsupervised learning. A must-have. - Ganna Pogrebna, The Alan Turing Institute Empowers the reader to unlock the hidden potential within their data. - Sonny Shergill, AstraZeneca A must-have for teams working with unstructured data. Cuts through the fog of theory. Explains the theory and delivers practical solutions. - Leonardo Gomes da Silva, onGRID Sports Technology The Bible for unsupervised learning! Full of real-world applications, clear explanations, and excellent Python implementations. - Gary Bake, Falconhurst Technologies

Applied Machine Learning for Data Science Practitioners

2025-04-29 O'Reilly Amazon

book

Vidya Subramanian

data ai-ml machine-learning AI/ML Data Science Python

A single-volume reference on data science techniques for evaluating and solving business problems using Applied Machine Learning (ML). Applied Machine Learning for Data Science Practitioners offers a practical, step-by-step guide to building end-to-end ML solutions for real-world business challenges, empowering data science practitioners to make informed decisions and select the right techniques for any use case. Unlike many data science books that focus on popular algorithms and coding, this book takes a holistic approach. It equips you with the knowledge to evaluate a range of techniques and algorithms. The book balances theoretical concepts with practical examples to illustrate key concepts, derive insights, and demonstrate applications. In addition to code snippets and reviewing output, the book provides guidance on interpreting results. This book is an essential resource if you are looking to elevate your understanding of ML and your technical capabilities, combining theoretical and practical coding examples. A basic understanding of using data to solve business problems, high school-level math and statistics, and basic Python coding skills are assumed. Written by a recognized data science expert, Applied Machine Learning for Data Science Practitioners covers essential topics, including: Data Science Fundamentals that provide you with an overview of core concepts, laying the foundation for understanding ML. Data Preparation covers the process of framing ML problems and preparing data and features for modeling. ML Problem Solving introduces you to a range of ML algorithms, including Regression, Classification, Ranking, Clustering, Patterns, Time Series, and Anomaly Detection. Model Optimization explores frameworks, decision trees, and ensemble methods to enhance performance and guide the selection of the most effective model. ML Ethics addresses ethical considerations, including fairness, accountability, transparency, and ethics. 
Model Deployment and Monitoring focuses on production deployment, performance monitoring, and adapting to model drift.

SAS For Dummies, 3rd Edition

2025-04-29 O'Reilly Amazon

book

Chris Hemedinger

data data-science analytics-platforms SAS AI/ML Analytics

Become data-savvy with the widely used data and AI software. Data and analytics are essential for any business, giving insight into what's working, what can be improved, and what else needs to be done. SAS software helps you make sure you're doing data right, with a host of data management, reporting, and analysis tools. SAS For Dummies teaches you the essentials, helping you navigate this statistical software and turn information into value. In this book, learn how to gather data, create reports, and analyze results. You'll also discover how SAS machine learning and AI can help deliver decisions based on data. Even if you're brand new to data and analytics, this easy-to-follow guide will turn you into an SAS power user. Become familiar with the most popular SAS applications, including SAS 9 and SAS Viya Connect to data, organize your information, and adopt sound data security practices Get a primer on working with data sets, variables, and statistical analysis Explore and analyze data through SAS programming and rich application interfaces Create and share graphs and interactive visualizations to deliver insights This is the perfect Dummies guide for new SAS users looking to improve their skills, in any industry and for any organization size.

Architecting Power BI Solutions in Microsoft Fabric

2025-04-25 O'Reilly Amazon

book

Nagaraj Venkatesan

data data-science business-intelligence microsoft-power-platform power-bi AI/ML

This book is a comprehensive guide to building sophisticated and robust Power BI solutions that solve common data problems effectively. Written with hands-on professionals in mind, it provides essential insights and practical advice to help you choose the right tools and approaches for any BI task. Readers will learn to create performant, secure, and innovative business intelligence systems. What this Book will help me do Identify the scenarios where each Power BI component fits best. Apply secure and performance-conscious design principles when building BI solutions. Leverage Microsoft Fabric and other advanced integrations to maximize Power BI's capabilities. Implement AI-powered features such as Copilot and predictive modeling in Power BI. Facilitate collaboration and governance using Power BI's advanced features. Author(s) Nagaraj Venkatesan has over 17 years of professional expertise in data platform technologies and business intelligence tools. Through a rich career in data solution architecture, he has mastered the art of designing efficient and reliable Power BI implementations. This book reflects his passion for empowering professionals to make the most of Power BI. Who is it for? If you are a solution architect, data engineer, or Power BI report developer looking to elevate your skills in designing optimized Power BI solutions, this book is for you. Business analysts and data scientists can also benefit immensely from the book's coverage of self-service BI and data science integration. Some familiarity with Power BI will enhance your learning experience, but newcomers eager to learn will also find it invaluable.

An Introduction to Self-Report Measurement

2025-04-21 O'Reilly Amazon

book

Michael G. Elasmar

data data-science data-science-tasks statistics AI/ML

This book covers the science of measuring the invisible building blocks of thought processes that are useful for understanding humans, including technology users, media consumers, and consumers of goods and services. It provides: An explanation of what self-report measurement entails for beginners; A clear set of assumptions needed in order for self-report measures to yield valuable information; A mindset that needs to be adopted when using self-report measurement in the contexts of surveys and experiments; Guidance for extracting opinion from social media text content and integrating AI; A roadmap for quantifying the errors associated with self-report measurement.

3D Data Science with Python

2025-04-10 O'Reilly Amazon

book

Florent Poux

software-development programming-languages Python AI/ML Data Science GenAI

Our physical world is grounded in three dimensions. To create technology that can reason about and interact with it, our data must be 3D too. This practical guide offers data scientists, engineers, and researchers a hands-on approach to working with 3D data using Python. From 3D reconstruction to 3D deep learning techniques, you'll learn how to extract valuable insights from massive datasets, including point clouds, voxels, 3D CAD models, meshes, images, and more. Dr. Florent Poux helps you leverage the potential of cutting-edge algorithms and spatial AI models to develop production-ready systems with a focus on automation. You'll get the 3D data science knowledge and code to: Understand core concepts and representations of 3D data Load, manipulate, analyze, and visualize 3D data using powerful Python libraries Apply advanced AI algorithms for 3D pattern recognition (supervised and unsupervised) Use 3D reconstruction techniques to generate 3D datasets Implement automated 3D modeling and generative AI workflows Explore practical applications in areas like computer vision/graphics, geospatial intelligence, scientific computing, robotics, and autonomous driving Build accurate digital environments that spatial AI solutions can leverage Florent Poux is an esteemed authority in the field of 3D data science who teaches and conducts research for top European universities. He's also head professor at the 3D Geodata Academy and innovation director for French Tech 120 companies.

Time Series Analysis with Spark

2025-03-28 O'Reilly Amazon

book

Yoni Ramaswami

data data-science data-science-tasks statistics time-series AI/ML

Time Series Analysis with Spark provides a practical introduction to leveraging Apache Spark and Databricks for time series analysis. You'll learn to prepare, model, and deploy robust and scalable time series solutions for real-world applications. From data preparation to advanced generative AI techniques, this guide prepares you to excel in big data analytics. What this Book will help me do Understand the core concepts and architectures of Apache Spark for time series analysis. Learn to clean, organize, and prepare time series data for big data environments. Gain expertise in choosing, building, and training various time series models tailored to specific projects. Master techniques to scale your models in production using Spark and Databricks. Explore the integration of advanced technologies such as generative AI to enhance predictions and derive insights. Author(s) Yoni Ramaswami, a Senior Solutions Architect at Databricks, has extensive experience in data engineering and AI solutions. With a focus on creating innovative big data and AI strategies across industries, Yoni authored this book to empower professionals to efficiently handle time series data. Yoni's approachable style ensures that both foundational concepts and advanced techniques are accessible to readers. Who is it for? This book is ideal for data engineers, machine learning engineers, data scientists, and analysts interested in enhancing their expertise in time series analysis using Apache Spark and Databricks. Whether you're new to time series or looking to refine your skills, you'll find both foundational insights and advanced practices explained clearly. A basic understanding of Spark is helpful but not required.

Time Series Forecasting Using Generative AI: Leveraging AI for Precision Forecasting

2025-03-24 O'Reilly Amazon

book

Banglore Vijay Kumar Vishwas , Sri Ram Macharla

data data-science data-science-tasks statistics time-series AI/ML

"Time Series Forecasting Using Generative AI introduces readers to Generative Artificial Intelligence (Gen AI) in time series analysis, offering an essential exploration of cutting-edge forecasting methodologies." The book covers a wide range of topics, starting with an overview of Generative AI, where readers gain insights into the history and fundamentals of Gen AI with a brief introduction to large language models. The subsequent chapter explains practical applications, guiding readers through the implementation of diverse neural network architectures for time series analysis such as Multi-Layer Perceptrons (MLP), WaveNet, Temporal Convolutional Network (TCN), Bidirectional Temporal Convolutional Network (BiTCN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM), Deep AutoRegressive(DeepAR), and Neural Basis Expansion Analysis(NBEATS) using modern tools. Building on this foundation, the book introduces the power of Transformer architecture, exploring its variants such as Vanilla Transformers, Inverted Transformer (iTransformer), DLinear, NLinear, and Patch Time Series Transformer (PatchTST). Finally, The book delves into foundation models such as Time-LLM, Chronos, TimeGPT, Moirai, and TimesFM enabling readers to implement sophisticated forecasting models tailored to their specific needs. This book empowers readers with the knowledge and skills needed to leverage Gen AI for accurate and efficient time series forecasting. By providing a detailed exploration of advanced forecasting models and methodologies, this book enables practitioners to make informed decisions and drive business growth through data-driven insights. ● Understand the core history and applications of Gen AI and its potential to revolutionize time series forecasting. ● Learn to implement different neural network architectures such as MLP, WaveNet, TCN, BiTCN, RNN, LSTM, DeepAR, and NBEATS for time series forecasting. 
● Discover the potential of Transformer architecture and its variants, such as Vanilla Transformers, iTransformer, DLinear, NLinear, and PatchTST, for time series forecasting. ● Explore complex foundation models like Time-LLM, Chronos, TimeGPT, Moirai, and TimesFM. ● Gain practical knowledge on how to apply Gen AI techniques to real-world time series forecasting challenges and make data-driven decisions. Who this book is for: Data Scientists, Machine Learning Engineers, Business Analysts, Statisticians, Economists, Financial Analysts, Operations Research Analysts, Data Analysts, Students.

Hands-On APIs for AI and Data Science

2025-03-05 O'Reilly Amazon

book

Ryan Day

data data-science AI/ML API Cloud Computing Data Science

Are you ready to grow your skills in AI and data science? A great place to start is learning to build and use APIs in real-world data and AI projects. API skills have become essential for AI and data science success, because they are used in a variety of ways in these fields. With this practical book, data scientists and software developers will gain hands-on experience developing and using APIs with the Python programming language and popular frameworks like FastAPI and Streamlit. As you complete the chapters in the book, you'll be creating portfolio projects that teach you how to: Design APIs that data scientists and AIs love Develop APIs using Python and FastAPI Deploy APIs using multiple cloud providers Create data science projects such as visualizations and models using APIs as a data source Access APIs using generative AI and LLMs

The Impact of Algorithmic Technologies on Healthcare

2025-02-11 O'Reilly Amazon

book

Pushkar Dubey , Parul Dubey , Mangala Madankar , Bui Thanh Hung

data data-science healthcare-analytics AI/ML Analytics Blockchain

The book explores the fundamental principles and transformative advancements in cutting-edge algorithmic technologies, detailing their application and impact on revolutionizing healthcare. This book provides an in-depth account of how technologies such as artificial intelligence (AI), machine learning (ML), and the Internet of Things (IoT) are reshaping healthcare, transitioning from traditional diagnostic and treatment approaches to data-driven solutions that improve predictive accuracy and patient outcomes. The text also addresses the challenges and considerations associated with adopting these technologies, including ethical implications, data security concerns, and the need for human-centered approaches in algorithmic medicine. After introducing digital twin technology and its potential to enhance healthcare delivery, the book examines the broader effects of digital technology on the healthcare system. Subsequent chapters explore topics such as innovations in medical imaging, predictive analytics for improved patient outcomes, and deep learning algorithms for brain tumor detection. Other topics include generative adversarial networks (GANs), convolutional neural networks (CNNs), smart wearables for remote patient monitoring, effective IoT solutions, telemedicine advancements, and blockchain security for healthcare systems. The integration of biometric systems driven by AI, securing cyber-physical systems in healthcare, and digitizing wellness through electronic health records (EHRs) and electronic medical records (EMRs) are also discussed. The book concludes with an extensive case study comparing the impacts of various healthcare applications, offering insights and encouraging further research and innovation in this dynamic field. 
Audience This book is suitable for academicians and professionals in health informatics, bioinformatics, biomedical science and engineering, artificial intelligence, as well as clinicians, IT specialists, and policymakers in healthcare.

Causal Inference for Data Science

2025-01-30 O'Reilly Amazon

book

Aleix Ruiz de Villa

data data-science AI/ML Data Science Python

When you know the cause of an event, you can affect its outcome. This accessible introduction to causal inference shows you how to determine causality and estimate effects using statistics and machine learning. A/B tests or randomized controlled trials are expensive and often unfeasible in a business environment. Causal Inference for Data Science reveals the techniques and methodologies you can use to identify causes from data, even when no experiment or test has been performed. In Causal Inference for Data Science you will learn how to: Model reality using causal graphs Estimate causal effects using statistical and machine learning techniques Determine when to use A/B tests, causal inference, and machine learning Explain and assess objectives, assumptions, risks, and limitations Determine if you have enough variables for your analysis It’s possible to predict events without knowing what causes them. Understanding causality allows you both to make data-driven predictions and to intervene to affect the outcomes. Causal Inference for Data Science shows you how to build data science tools that can identify the root cause of trends and events. You’ll learn how to interpret historical data, understand customer behaviors, and empower management to apply optimal decisions. About the Technology Why did you get a particular result? What would have led to a different outcome? These are the essential questions of causal inference. This powerful methodology improves your decisions by connecting cause and effect, even when you can’t run experiments, A/B tests, or expensive controlled trials. About the Book Causal Inference for Data Science introduces techniques to apply causal reasoning to ordinary business scenarios. And with this clearly written, practical guide, you won’t need advanced statistics or high-level math to put causal inference into practice! 
By applying a simple approach based on Directed Acyclic Graphs (DAGs), you’ll learn to assess advertising performance, pick productive health treatments, deliver effective product pricing, and more. What's Inside When to use A/B tests, causal inference, and ML Assess objectives, assumptions, risks, and limitations Apply causal inference to real business data About the Reader For data scientists, ML engineers, and statisticians. About the Author Aleix Ruiz de Villa Robert is a data scientist with a PhD in mathematical analysis from the Universitat Autònoma de Barcelona. Quotes With intuitive explanations, application-focused insights, and real-world examples, this book offers immense practical value. - Philipp Bach, Maintainer of the DoubleML libraries for Python and R An essential guide for navigating the complexities of real-world data analysis. - Adi Shavit, SWAPP A must-read! Demystifies causal inference with a blend of theory and practice. - Karan Gupta, SunPower Corporation Causal relationships can mask and distort results. This book provides a set of tools to extract insights correctly. - Peter V. Henstock, Harvard Extension

Analytics the Right Way

2025-01-22 O'Reilly Amazon

book

Joe Sutherland, Tim Wilson

data data-science business-intelligence AI/ML Analytics Computer Science

CLEAR AND CONCISE TECHNIQUES FOR USING ANALYTICS TO DELIVER BUSINESS IMPACT AT ANY ORGANIZATION Organizations have more data at their fingertips than ever, and their ability to put that data to productive use should be a key source of sustainable competitive advantage. Yet, business leaders looking to tap into a steady and manageable stream of “actionable insights” often, instead, get blasted with a deluge of dashboards, chart-filled slide decks, and opaque machine learning jargon that leaves them asking, “So what?” Analytics the Right Way is a guide for these leaders. It provides a clear and practical approach to putting analytics to productive use with a three-part framework that brings together the realities of the modern business environment with the deep truths underpinning statistics, computer science, machine learning, and artificial intelligence. The result: a pragmatic and actionable guide for delivering clarity, order, and business impact to an organization’s use of data and analytics. The book uses a combination of real-world examples from the authors’ direct experiences—working inside organizations, as external consultants, and as educators—mixed with vivid hypotheticals and illustrations—little green aliens, petty criminals with an affinity for ice cream, skydiving without parachutes, and more—to empower the reader to put foundational analytical and statistical concepts to effective use in a business context.

Statistical Quantitative Methods in Finance: From Theory to Quantitative Portfolio Management

2025-01-22 O'Reilly Amazon

book

Samit Ahlawat

data data-science data-science-tasks statistics AI/ML Data Science

Statistical quantitative methods are vital for financial valuation models and benchmarking machine learning models in finance. This book explores the theoretical foundations of statistical models, from ordinary least squares (OLS) to the generalized method of moments (GMM) used in econometrics. It enriches your understanding through practical examples drawn from applied finance, demonstrating the real-world applications of these concepts. Additionally, the book delves into non-linear methods and Bayesian approaches, which are becoming increasingly popular among practitioners thanks to advancements in computational resources. By mastering these topics, you will be equipped to build foundational models crucial for applied data science, a skill highly sought after by software engineering and asset management firms. The book also offers valuable insights into quantitative portfolio management, showcasing how traditional data science tools can be enhanced with machine learning models. These enhancements are illustrated through real-world examples from finance and econometrics, accompanied by Python code. This practical approach ensures that you can apply what you learn, gaining proficiency in the statsmodels library and becoming adept at designing, implementing, and calibrating your models. By understanding and applying these statistical models, you enhance your data science skills and effectively tackle financial challenges. 
What You Will Learn Understand the fundamentals of linear regression and its applications in financial data analysis and prediction Apply generalized linear models for handling various types of data distributions and enhancing model flexibility Gain insights into regime switching models to capture different market conditions and improve financial forecasting Benchmark machine learning models against traditional statistical methods to ensure robustness and reliability in financial applications Who This Book Is For Data scientists, machine learning engineers, finance professionals, and software engineers

Learning AI Tools in Tableau

2025-01-14 O'Reilly Amazon

book

Ann Jackson

data data-science data-science-tasks data-visualization Tableau AI/ML

As businesses increasingly rely on data to drive decisions, the role of advanced analytics and AI in enhancing data interpretation is becoming crucial. For professionals tasked with optimizing data analytics platforms like Tableau, staying ahead of the curve with the latest tools isn't just beneficial; it's essential. This insightful guide takes you through the integration of Tableau Pulse and Einstein Copilot, explaining their roles within the broader Tableau and Salesforce ecosystems. Author Ann Jackson, an esteemed analytics professional with deep expertise in Tableau, offers a step-by-step exploration of these tools, backed by real-world use cases that demonstrate their impact across various industries. By the end of this book, you will: Understand the functionalities of Tableau Pulse and Einstein Copilot and how to use them Learn to deploy Tableau Pulse effectively, ensuring it aligns with your business objectives Navigate discussions on AI's role within Tableau, enhancing your strategic conversations Visualize how Tableau Pulse operates through detailed images and scenarios Utilize Einstein Copilot in Tableau Desktop/Prep to streamline and enhance data analysis

Julia Quick Syntax Reference: A Pocket Guide for Data Science Programming

2025-01-03 O'Reilly Amazon

book

Antonello Lobianco

data data-science AI/ML API Data Science Python

Learn the Julia programming language as quickly as possible. This book is a must-have reference guide that presents the essential Julia syntax in a well-organized format, updated with the latest features of Julia’s APIs, libraries, and packages. This book provides an introduction that reveals basic Julia structures and syntax; discusses data types, control flow, functions, input/output, exceptions, metaprogramming, performance, and more. Additionally, you'll learn to interface Julia with other programming languages such as R for statistics or Python. At a more applied level, you will learn how to use Julia packages for data analysis, numerical optimization, symbolic computation, and machine learning, and how to present your results in dynamic documents. The Second Edition delves deeper into modules, environments, and parallelism in Julia. It covers random numbers, reproducibility in stochastic computations, and adds a section on probabilistic analysis. Finally, it provides forward-thinking introductions to AI and machine learning workflows using BetaML, including regression, classification, clustering, and more, with practical exercises and solutions for self-learners. What You Will Learn Work with Julia types and the different containers for rapid development Use vectorized, classical loop-based code, logical operators, and blocks Explore Julia functions: arguments, return values, polymorphism, parameters, anonymous functions, and broadcasts Build custom structures in Julia Use C/C++, Python or R libraries in Julia and embed Julia in other code. Optimize performance with GPU programming, profiling and more. Manage, prepare, analyse and visualise your data with DataFrames and Plots Implement complete ML workflows with BetaML, from data coding to model evaluation, and more. Who This Book Is For Experienced programmers who are new to Julia, as well as data scientists who want to improve their analysis or try out machine learning algorithms with Julia.

Essential Data Analytics, Data Science, and AI: A Practical Guide for a Data-Driven World

2024-12-18 O'Reilly Amazon

book

Maxine Attobrah

data data-science AI/ML Analytics Cloud Computing Data Analytics

In today’s world, understanding data analytics, data science, and artificial intelligence is not just an advantage but a necessity. This book is your thorough guide to learning these innovative fields, designed to make the learning practical and engaging. The book starts by introducing data analytics, data science, and artificial intelligence. It illustrates real-world applications and addresses the ethical considerations tied to AI. It also explores ways to obtain data for practice and real-world scenarios, including the concept of synthetic data. Next, it uncovers Extract, Transform, Load (ETL) processes and explains how to implement them using Python. Further, it covers artificial intelligence and the pivotal role played by machine learning models. It explains feature engineering, the distinction between algorithms and models, and how to harness their power to make predictions. Moving forward, it discusses how to assess machine learning models after their creation, with insights into various evaluation techniques. It emphasizes the crucial aspects of model deployment, including the pros and cons of on-device versus cloud-based solutions. It concludes with real-world examples and encourages embracing AI while dispelling fears and fostering an appreciation for the transformative potential of these technologies. Whether you’re a beginner or an experienced professional, this book offers valuable insights that will expand your horizons in the world of data and AI. What you will learn: What synthetic data and telemetry data are How to analyze data using tools like Python and Tableau What feature engineering is What the practical implications of artificial intelligence are Who this book is for: Data analysts, scientists, and engineers seeking to enhance their skills, explore advanced concepts, and stay up-to-date with ethics. 
Business leaders and decision-makers across industries who are interested in understanding the transformative potential and ethical implications of data analytics and AI in their organizations.

Exam Ref DP-100 Designing and Implementing a Data Science Solution on Azure

2024-12-06 O'Reilly Amazon

book

Dayne Sorvisto

it-operations cloud-computing cloud-platforms microsoft-azure microsoft-azure-certifications microsoft-azure-certifications-associate-tier

Prepare for Microsoft Exam DP-100 and demonstrate your real-world knowledge of managing data ingestion and preparation, model training and deployment, and machine learning solution monitoring with Python, Azure Machine Learning, and MLflow. Designed for professionals with data science experience, this Exam Ref focuses on the critical thinking and decision-making acumen needed for success at the Microsoft Certified: Azure Data Scientist Associate level. Focus on the expertise measured by these objectives: Design and prepare a machine learning solution Explore data and train models Prepare a model for deployment Deploy and retrain a model This Microsoft Exam Ref: Organizes its coverage by exam objectives Features strategic, what-if scenarios to challenge you Assumes you have experience in designing and creating a suitable working environment for data science workloads, training machine learning models, and managing, deploying, and monitoring scalable machine learning solutions About the Exam Exam DP-100 focuses on knowledge needed to design and prepare a machine learning solution, manage an Azure Machine Learning workspace, explore data and train models, create models by using the Azure Machine Learning designer, prepare a model for deployment, manage models in Azure Machine Learning, deploy and retrain a model, and apply machine learning operations (MLOps) practices. About Microsoft Certification Passing this exam fulfills your requirements for the Microsoft Certified: Azure Data Scientist Associate credential, demonstrating your expertise in applying data science and machine learning to implement and run machine learning workloads on Azure, including knowledge and experience using Azure Machine Learning and MLflow.

Just Enough Data Science and Machine Learning: Essential Tools and Techniques

2024-12-05 O'Reilly Amazon

book

Mark Levene, Martyn Harris

data data-science AI/ML Data Science DataViz Python

An accessible introduction to applied data science and machine learning, with minimal math and code required to master the foundational and technical aspects of data science. In Just Enough Data Science and Machine Learning, authors Mark Levene and Martyn Harris present a comprehensive and accessible introduction to data science. It allows the readers to develop an intuition behind the methods adopted in both data science and machine learning, which is the algorithmic component of data science involving the discovery of patterns from input data. This book looks at data science from an applied perspective, where emphasis is placed on the algorithmic aspects of data science and on the fundamental statistical concepts necessary to understand the subject. The book begins by exploring the nature of data science and its origins in basic statistics. The authors then guide readers through the essential steps of data science, starting with exploratory data analysis using visualisation tools. They explain the process of forming hypotheses, building statistical models, and utilising algorithmic methods to discover patterns in the data. Finally, the authors discuss general issues and preliminary concepts that are needed to understand machine learning, which is central to the discipline of data science. The book is packed with practical examples and real-world data sets throughout to reinforce the concepts. All examples are supported by Python code external to the reading material to keep the book timeless. Notable features of this book: Clear explanations of fundamental statistical notions and concepts Coverage of various types of data and techniques for analysis In-depth exploration of popular machine learning tools and methods Insight into specific data science topics, such as social networks and sentiment analysis Practical examples and case studies for real-world application Recommended further reading for deeper exploration of specific topics. ....

The Data Science Handbook, 2nd Edition

2024-12-05 O'Reilly Amazon

book

Field Cady

data data-science AI/ML Analytics Computer Science Data Science

Practical, accessible guide to becoming a data scientist, updated to include the latest advances in data science and related fields. Becoming a data scientist is hard. The job focuses on mathematical tools, but also demands fluency with software engineering, understanding of a business situation, and deep understanding of the data itself. This book provides a crash course in data science, combining all the necessary skills into a unified discipline. The focus of The Data Science Handbook is on practical applications and the ability to solve real problems, rather than theoretical formalisms that are rarely needed in practice. Among its key points are: An emphasis on software engineering and coding skills, which play a significant role in most real data science problems. Extensive sample code, detailed discussions of important libraries, and a solid grounding in core concepts from computer science (computer architecture, runtime complexity, and programming paradigms). A broad overview of important mathematical tools, including classical techniques in statistics, stochastic modeling, regression, numerical optimization, and more. Extensive tips about the practical realities of working as a data scientist, including understanding related job functions, project life cycles, and the varying roles of data science in an organization. Exactly the right amount of theory. A solid conceptual foundation is required for fitting the right model to a business problem, understanding a tool’s limitations, and reasoning about discoveries. Data science is a quickly evolving field, and this 2nd edition has been updated to reflect the latest developments, including the revolution in AI that has come from Large Language Models and the growth of ML Engineering as its own discipline. 
Much of data science has become a skillset that anybody can have, making this book not only for aspiring data scientists, but also for professionals in other fields who want to use analytics as a force multiplier in their organization.

Microsoft Power Apps Cookbook - Third Edition

2024-10-31 O'Reilly Amazon

book

Eickhel Mendoza

data data-science business-intelligence microsoft-power-platform microsoft-power-automate AI/ML

Microsoft Power Apps Cookbook is a comprehensive guide to harnessing the full potential of Microsoft Power Apps, a powerful low-code platform for building business applications. Packed with practical recipes, this book details how to develop scalable, efficient apps, automate workflows with RPA, and utilize new capabilities like AI-powered Microsoft Copilot and the Power Apps Component Framework. What this Book will help me do Create and deploy scalable canvas and model-driven apps using Microsoft Power Apps. Utilize AI-powered features like Copilot to speed up app creation and development. Implement robust data management strategies with Microsoft Dataverse. Extend app functionalities using the Power Apps Component Framework for custom components. Design and build secure external-facing websites with Microsoft Power Pages. Author(s) Eickhel Mendoza is an experienced Microsoft Power Platform developer and educator who has helped numerous organizations enhance their capabilities through low-code app development. Authoring from extensive hands-on experience, their teaching style bridges technical theory and practical application. Eickhel is passionate about empowering users to achieve more with modern app development tools. Who is it for? This book is ideal for information workers and developers looking to streamline their application development processes with Microsoft's low-code solutions. It is particularly targeted toward users with a foundational understanding of the Power Platform looking to deepen their knowledge. Readers will benefit most if they are eager to learn how to create innovative solutions efficiently. Traditional developers aiming to explore a new paradigm of rapid application development will also find it highly beneficial.

Computational Intelligence in Sustainable Computing and Optimization

2024-10-08 O'Reilly Amazon

book

Brindha K, Sudha Senthilkumar, Balamurugan Balusamy, Vinayakumar Ravi, Rajesh Kumar Dhanaraj

data data-science analytics-platforms Informatica AI/ML BI

Computational Intelligence in Sustainable Computing and Optimization: Trends and Applications focuses on developing and evolving advanced computational intelligence algorithms for the analysis of data involved in applications such as agriculture, biomedical systems, bioinformatics, business intelligence, economics, disaster management, e-learning, education management, financial management, and environmental policies. The book presents research in sustainable computing and optimization, combining methods from engineering, mathematics, artificial intelligence, and computer science to optimize environmental resources. Computational intelligence in the field of sustainable computing combines computer science and engineering in applications ranging from Internet of Things (IoT), information security systems, smart storage, cloud computing, and intelligent transport management to cognitive and bio-inspired computing and management science. In addition, data intelligence techniques play a critical role in sustainable computing. Recent advances in data management, data modeling, data analysis, and artificial intelligence are finding applications in energy networks and thus making our environment more sustainable. Presents computational intelligence-based data analysis for sustainable computing applications such as pattern recognition, biomedical imaging, sustainable cities, sustainable transport, sustainable agriculture, and sustainable financial management Develops research in sustainable computing and optimization, combining methods from engineering, mathematics, and computer science to optimize environmental resources Includes three foundational chapters dedicated to providing an overview of computational intelligence and optimization techniques and their applications for sustainable computing
