Data Science

Learn Data Science Using SAS Studio : From Clicks to Code

2026-01-01 · O'Reilly Data Science Books O'Reilly Amazon

book

by Engy Fouda

Analytics Marketing Python SAS analytics-platforms data data-science

Do you want to create data analysis reports without writing a line of code? This book introduces SAS Studio, a free, web-based data science product for educational and non-commercial purposes. The power of SAS Studio lies in its visual, point-and-click user interface, which generates SAS code. It is easier to learn SAS Studio than to learn R and Python to accomplish data cleaning, statistics, and visualization tasks. The book includes a case study analyzing the data required to predict the results of presidential elections in the state of Maine for 2016 and 2020. In addition to the presidential elections, the book provides real-life examples, including analyses of stock, oil, and gold prices, crime, marketing, and healthcare. You will see data science in action and how easily complicated tasks and visualizations can be performed in SAS Studio. You will learn, step by step, how to perform visualizations, including creating maps. In most cases, you will not need a line of code as you work with the SAS Studio graphical user interface. The book includes explanations of the code that SAS Studio generates automatically. You will learn how to edit this code to perform more advanced tasks.

What You Will Learn: Become familiar with the SAS Studio IDE. Create essential visualizations. Know the fundamental statistical analysis required in most data science and analytics reports. Clean the most common dataset problems. Learn linear and logistic regression for data prediction and analysis. Write programs in SAS. Analyze data and get insights from it for decision-making. Learn character, numeric, date, time, and datetime functions and typecasting.

Who This Book Is For: A general audience of people who are new to data science, students, and data analysts and scientists who are new to SAS. No prior programming or statistical knowledge is required.

Bioinformatics with Python Cookbook - Fourth Edition

2025-12-19 · O'Reilly Data Science Books O'Reilly Amazon

book

by Shane Brubaker

AI/ML Cloud Computing Python bioinformatics data data-science data-science-domains

Bioinformatics with Python Cookbook provides a practical, hands-on approach to solving computational biology challenges with Python, enabling readers to analyze sequencing data, leverage AI for bioinformatics applications, and design robust computational pipelines.

What this Book will help me do: Perform comprehensive sequence analysis using Python libraries for refined data interpretation. Configure and run bioinformatics workflows on cloud environments for scalable solutions. Apply advanced data science practices to analyze and visualize bioinformatics data. Explore the integration of AI tools in processing multimodal biological datasets. Understand and utilize bioinformatics databases for research and development.

Author(s): Shane Brubaker is an experienced computational biologist and software developer with a strong background in bioinformatics and Python programming. With years of experience in data analysis and software engineering, Shane has authored numerous solutions for real-world bioinformatics issues. He brings a practical, example-driven teaching approach, aimed at empowering readers to apply techniques effectively in their work.

Who is it for? This book is suitable for bioinformatics professionals, data scientists, and software engineers with moderate experience seeking to expand their computational biology knowledge. Readers should have a basic understanding of biology, programming, and cloud tools. By engaging with this book, learners can advance their skills in Python and bioinformatics to address complex biological data challenges effectively.

CompTIA Data+ Study Guide, 2nd Edition

2025-11-04 · O'Reilly Data Science Books O'Reilly Amazon

book

by Sharif Nijim , Mike Chapple

Data Governance Data Quality DataViz comptia-data comptia data+ data data-science

Prepare for the CompTIA Data+ exam, as well as a new career in data science, with this effective study guide.

In the newly revised second edition of CompTIA Data+ Study Guide: Exam DA0-002, veteran IT professionals Mike Chapple and Sharif Nijim provide a powerful, one-stop resource for anyone planning to pursue the CompTIA Data+ certification and go on to an exciting new career in data science. The authors walk you through the info you need to succeed on the exam and on your first day at a data science-focused job. Complete with two online practice tests, this book comprehensively covers every objective tested by the updated DA0-002 exam, including databases and data acquisition, data quality, data analysis and statistics, data visualization, and data governance.

You'll also find: efficient and comprehensive content, helping you get up-to-speed as quickly as possible; bite-size chapters that break down essential topics into manageable and accessible lessons; and complimentary access to Sybex' famous online learning environment, with practice questions, a complete glossary of common industry terminology, hundreds of flashcards, and more.

A practical and hands-on pathway to the CompTIA Data+ certification, as well as a new career in data science, the CompTIA Data+ Study Guide, Second Edition, offers the foundational knowledge, skills, and abilities you need to get started in an exciting and rewarding new career.

The Big Book of Data Science. Part I: Data Processing

2025-09-19 · O'Reilly Data Science Books O'Reilly Amazon

book

by Eugenia Robles , David Lopez

AI/ML Data Modelling GenAI data data-science

There are already excellent books on software programming for data processing and data transformation, for instance Wes McKinney’s. This book, reflecting on my own industrial and teaching experience, tries to flatten the steep learning curve newcomers to the field must climb before they are ready to tackle real data science and AI challenges. In this regard, this book differs from other books in that:

It assumes zero software programming knowledge. This instructional design is intentional given the book’s aim to open the practice of data science to anyone interested in data exploration and analysis irrespective of their previous background.

It follows an incremental approach to facilitate the assimilation of sometimes arcane software techniques for manipulating data.

It is practice-oriented to ensure readers can apply what they learn in their daily practice.

It illustrates how to use generative AI to help you become a more productive data scientist and AI engineer.

By reading and working on the labs included in this book, you will develop the software programming skills required to contribute successfully to the data understanding and data preparation stages of any data-related project. You will become proficient at manipulating and transforming datasets in industrial contexts and produce clean, reliable datasets that can drive accurate analysis and informed decision-making. Moreover, you will be prepared to develop and deploy dashboards and visualizations supporting the insights and conclusions in the deployment stage.

Data modelling and evaluation are not covered in this book. We are working on a second installment of the book series illustrating the application of statistical and machine learning techniques to derive data insights.

Statistics Every Programmer Needs

2025-07-29 · O'Reilly Data Science Books O'Reilly Amazon

book

by Gary Sutton

AI/ML Analytics BI Monte Carlo Python data data-science data-science-tasks statistics

Put statistics into practice with Python! Data-driven decisions rely on statistics. Statistics Every Programmer Needs introduces the statistical and quantitative methods that will help you go beyond “gut feeling” for tasks like predicting stock prices or assessing quality control, with examples using the rich tools of the Python ecosystem.

Statistics Every Programmer Needs will teach you how to: apply foundational and advanced statistical techniques; build predictive models and simulations; optimize decisions under constraints; interpret and validate results with statistical rigor; and implement quantitative methods using Python.

In this hands-on guide, stats expert Gary Sutton blends the theory behind these statistical techniques with practical Python-based applications, offering structured, reproducible, and defensible methods for tackling complex decisions. Well-annotated and reusable Python code listings illustrate each method, with examples you can follow to practice your new skills.

About the Technology: Whether you’re analyzing application performance metrics, creating relevant dashboards and reports, or immersing yourself in a numbers-heavy coding project, every programmer needs to know how to turn raw data into actionable insight. Statistics and quantitative analysis are the essential tools every programmer needs to clarify uncertainty, optimize outcomes, and make informed choices.

About the Book: Statistics Every Programmer Needs teaches you how to apply statistics to the everyday problems you’ll face as a software developer. Each chapter is a new tutorial. You’ll predict ultramarathon times using linear regression, forecast stock prices with time series models, analyze system reliability using Markov chains, and much more. The book emphasizes a balance between theory and hands-on Python implementation, with annotated code and real-world examples to ensure practical understanding and adaptability across industries.

What's Inside: probability basics and distributions; random variables; regression; decision trees and random forests; time series analysis; linear programming; Monte Carlo and Markov methods; and much more.

About the Reader: Examples are in Python.

About the Author: Gary Sutton is a business intelligence and analytics leader and the author of Statistics Slam Dunk: Statistical analysis with R on real NBA data.

Quotes:

A well-organized tour of the statistical, machine learning and optimization tools every data science programmer needs. - Peter Bruce, Author of Statistics for Data Science and Analytics

Turns statistics from a stumbling block into a superpower. Clear, relevant, and written with a coder’s mindset! - Mahima Bansod, LogicMonitor

Essential! Stats and modeling with an emphasis on real-world system design. - Anupam Samanta, Google

A great blend of theory and practice. - Ariel Andres, Scotia Global Asset Management

A Friendly Guide to Data Science: Everything You Should Know About the Hottest Field in Tech

2025-06-26 · O'Reilly Data Science Books O'Reilly Amazon

book

by Kelly P. Vincent

AI/ML Analytics Data Analytics Cyber Security data data-science

Unlock the world of data science, no coding required. Curious about data science but not sure where to start? This book is a beginner-friendly guide to what data science is and how people use it. It walks you through the essential topics (what data analysis involves, which skills are useful, and how terms like “data analytics” and “machine learning” connect) without getting too technical too fast. Data science isn’t just about crunching numbers, pulling data from a database, or running fancy algorithms. It’s about asking the right questions, understanding the process from start to finish, and knowing what’s possible (and what’s not). This book teaches you all of that, while also introducing important topics like ethics, privacy, and security, because working with data means thinking about people, too. Whether you're a student exploring new skills, a professional navigating data-driven decisions, or someone considering a career change, this book is your friendly gateway into the world of data science, one of today’s most exciting fields. No coding or programming experience? No problem. You'll build a solid foundation and gain the confidence to engage with data science concepts, just as AI and data become increasingly central to everyday life.

What You Will Learn: Grasp foundational statistics and how it matters in data analysis and data science; understand the data science project life cycle and how to manage a data science project; examine the ethics of working with data and its use in data analysis and data science; understand the foundations of data security and privacy; collect, store, prepare, visualize, and present data; identify the many types of machine learning and know how to gauge performance; prepare for and find a career in data science.

Who This Book is for: A wide range of readers who are curious about data science and eager to build a strong foundation. Perfect for undergraduates in the early semesters of their data science degrees, as it assumes no prior programming or industry experience. Professionals will find particular value in the real-world insights shared through practitioner interviews. Business leaders can use it to better understand what data science can do for them and how their teams are applying it. And for career changers, this book offers a welcoming entry point into the field, helping them explore the landscape before committing to more intensive learning paths like degrees or boot camps.

Handbook of Decision Analysis, 2nd Edition

2025-05-28 · O'Reilly Data Science Books O'Reilly Amazon

book

by Gregory S. Parnell , Steven N. Tani , Eric Specking , Eric R. Johnson , Terry A. Bresnick

Analytics Big Data Data Analytics Microsoft business-intelligence data data-science prescriptive-analytics

Qualitative and quantitative techniques to apply decision analysis to real-world decision problems, supported by sound mathematics, best practices, soft skills, and more.

With substantive illustrations based on the authors’ personal experiences throughout, Handbook of Decision Analysis describes the philosophy, knowledge, science, and art of decision analysis. Key insights from decision analysis applications and behavioral decision analysis research are presented, and numerous decision analysis textbooks, technical books, and research papers are referenced for comprehensive coverage. This book does not introduce new decision analysis mathematical theory, but rather ensures the reader can understand and use the most common mathematics and best practices, allowing them to apply rigorous decision analysis with confidence. The material is supported by examples and solution steps using Microsoft Excel and includes many challenging real-world problems. Given the increase in the availability of data due to the development of products that deliver huge amounts of data, and the development of data science techniques and academic programs, a new theme of this Second Edition is the use of decision analysis techniques with big data and data analytics.

Written by a team of highly qualified professionals and academics, Handbook of Decision Analysis includes information on: behavioral decision-making insights, decision framing opportunities, collaboration with stakeholders, information assessment, and decision analysis modeling techniques; principles of value creation through designing alternatives, clear value/risk tradeoffs, and decision implementation; qualitative and quantitative techniques for each key decision analysis task, as opposed to presenting one technique for all decisions; and stakeholder analysis, decision hierarchies, and influence diagrams to frame descriptive, predictive, and prescriptive analytics decision problems to ensure implementation success.

Handbook of Decision Analysis is a highly valuable textbook, reference, and/or refresher for students and decision professionals in business, management science, engineering, engineering management, operations management, mathematics, and statistics who want to increase the breadth and depth of their technical and soft skills for success when faced with a professional or personal decision.

Data Without Labels

2025-05-26 · O'Reilly Data Science Books O'Reilly Amazon

book

by Vaibhav Verdhan

AI/ML GenAI Keras Matplotlib NumPy Pandas Python Seaborn TensorFlow data data-science data-science-tools

Discover all-practical implementations of the key algorithms and models for handling unlabeled data. Full of case studies demonstrating how to apply each technique to real-world problems.

In Data Without Labels you’ll learn: fundamental building blocks and concepts of machine learning and unsupervised learning; data cleaning for structured and unstructured data like text and images; clustering algorithms like K-means, hierarchical clustering, DBSCAN, Gaussian Mixture Models, and Spectral clustering; dimensionality reduction methods like Principal Component Analysis (PCA), SVD, Multidimensional scaling, and t-SNE; association rule algorithms like Apriori, ECLAT, and SPADE; unsupervised time series clustering, Gaussian Mixture models, and statistical methods; building neural networks such as GANs and autoencoders; working with Python tools and libraries like scikit-learn, NumPy, pandas, Matplotlib, Seaborn, Keras, TensorFlow, and Flask; how to interpret the results of unsupervised learning; choosing the right algorithm for your problem; deploying unsupervised learning to production; and maintenance and refresh of an ML solution.

Data Without Labels introduces mathematical techniques, key algorithms, and Python implementations that will help you build machine learning models for unannotated data. You’ll discover hands-off and unsupervised machine learning approaches that can still untangle raw, real-world datasets and support sound strategic decisions for your business. Don’t get bogged down in theory: the book bridges the gap between complex math and practical Python implementations, covering end-to-end model development all the way through to production deployment. You’ll discover the business use cases for machine learning and unsupervised learning, and access insightful research papers to complete your knowledge.

About the Technology: Generative AI, predictive algorithms, fraud detection, and many other analysis tasks rely on cheap and plentiful unlabeled data. Machine learning on data without labels, or unsupervised learning, turns raw text, images, and numbers into insights about your customers, accurate computer vision, and high-quality datasets for training AI models. This book will show you how.

About the Book: Data Without Labels is a comprehensive guide to unsupervised learning, offering a deep dive into its mathematical foundations, algorithms, and practical applications. It presents practical examples from retail, aviation, and banking using fully annotated Python code. You’ll explore core techniques like clustering and dimensionality reduction along with advanced topics like autoencoders and GANs. As you go, you’ll learn where to apply unsupervised learning in business applications and discover how to develop your own machine learning models end-to-end.

What's Inside: master unsupervised learning algorithms; real-world business applications; curate AI training datasets; explore autoencoder and GAN applications.

About the Reader: Intended for data science professionals. Assumes knowledge of Python and basic machine learning.

About the Author: Vaibhav Verdhan is a seasoned data science professional with extensive experience working on data science projects in a large pharmaceutical company.

Quotes:

An invaluable resource for anyone navigating the complexities of unsupervised learning. A must-have. - Ganna Pogrebna, The Alan Turing Institute

Empowers the reader to unlock the hidden potential within their data. - Sonny Shergill, Astra Zeneca

A must-have for teams working with unstructured data. Cuts through the fog of theory and delivers practical solutions. - Leonardo Gomes da Silva, onGRID Sports Technology

The Bible for unsupervised learning! Full of real-world applications, clear explanations, and excellent Python implementations. - Gary Bake, Falconhurst Technologies

Applied Machine Learning for Data Science Practitioners

2025-04-29 · O'Reilly Data Science Books O'Reilly Amazon

book

by Vidya Subramanian

AI/ML Python ai-ml data machine-learning

A single-volume reference on data science techniques for evaluating and solving business problems using Applied Machine Learning (ML).

Applied Machine Learning for Data Science Practitioners offers a practical, step-by-step guide to building end-to-end ML solutions for real-world business challenges, empowering data science practitioners to make informed decisions and select the right techniques for any use case. Unlike many data science books that focus on popular algorithms and coding, this book takes a holistic approach. It equips you with the knowledge to evaluate a range of techniques and algorithms. The book balances theoretical concepts with practical examples to illustrate key concepts, derive insights, and demonstrate applications. In addition to code snippets and reviewing output, the book provides guidance on interpreting results. This book is an essential resource if you are looking to elevate your understanding of ML and your technical capabilities, combining theoretical and practical coding examples. A basic understanding of using data to solve business problems, high school-level math and statistics, and basic Python coding skills are assumed.

Written by a recognized data science expert, Applied Machine Learning for Data Science Practitioners covers essential topics, including: Data Science Fundamentals provides you with an overview of core concepts, laying the foundation for understanding ML. Data Preparation covers the process of framing ML problems and preparing data and features for modeling. ML Problem Solving introduces you to a range of ML algorithms, including Regression, Classification, Ranking, Clustering, Patterns, Time Series, and Anomaly Detection. Model Optimization explores frameworks, decision trees, and ensemble methods to enhance performance and guide the selection of the most effective model. ML Ethics addresses ethical considerations, including fairness, accountability, transparency, and ethics. Model Deployment and Monitoring focuses on production deployment, performance monitoring, and adapting to model drift.

Architecting Power BI Solutions in Microsoft Fabric

2025-04-25 · O'Reilly Data Science Books O'Reilly Amazon

book

by Nagaraj Venkatesan

AI/ML BI Microsoft Fabric Power BI business-intelligence data data-science microsoft-power-platform power-bi

This book is a comprehensive guide to building sophisticated and robust Power BI solutions that solve common data problems effectively. Written with hands-on professionals in mind, it provides essential insights and practical advice to help you choose the right tools and approaches for any BI task. Readers will learn to create performant, secure, and innovative business intelligence systems.

What this Book will help me do: Identify the scenarios where each Power BI component fits best. Apply secure and performance-conscious design principles when building BI solutions. Leverage Microsoft Fabric and other advanced integrations to maximize Power BI's capabilities. Implement AI-powered features such as Copilot and predictive modeling in Power BI. Facilitate collaboration and governance using Power BI's advanced features.

Author(s): Nagaraj Venkatesan has over 17 years of professional expertise in data platform technologies and business intelligence tools. Through a rich career in data solution architecture, he has mastered the art of designing efficient and reliable Power BI implementations. This book reflects his passion for empowering professionals to make the most of Power BI.

Who is it for? If you are a solution architect, data engineer, or Power BI report developer looking to elevate your skills in designing optimized Power BI solutions, this book is for you. Business analysts and data scientists can also benefit immensely from the book's coverage of self-service BI and data science integration. Some familiarity with Power BI will enhance your learning experience, but newcomers eager to learn will also find it invaluable.

3D Data Science with Python

2025-04-10 · O'Reilly Data Science Books O'Reilly Amazon

book

by Florent Poux

AI/ML GenAI Python programming-languages software-development

Our physical world is grounded in three dimensions. To create technology that can reason about and interact with it, our data must be 3D too. This practical guide offers data scientists, engineers, and researchers a hands-on approach to working with 3D data using Python. From 3D reconstruction to 3D deep learning techniques, you'll learn how to extract valuable insights from massive datasets, including point clouds, voxels, 3D CAD models, meshes, images, and more. Dr. Florent Poux helps you leverage the potential of cutting-edge algorithms and spatial AI models to develop production-ready systems with a focus on automation.

You'll get the 3D data science knowledge and code to: understand core concepts and representations of 3D data; load, manipulate, analyze, and visualize 3D data using powerful Python libraries; apply advanced AI algorithms for 3D pattern recognition (supervised and unsupervised); use 3D reconstruction techniques to generate 3D datasets; implement automated 3D modeling and generative AI workflows; explore practical applications in areas like computer vision/graphics, geospatial intelligence, scientific computing, robotics, and autonomous driving; and build accurate digital environments that spatial AI solutions can leverage.

Florent Poux is an esteemed authority in the field of 3D data science who teaches and conducts research for top European universities. He's also head professor at the 3D Geodata Academy and innovation director for French Tech 120 companies.

Data Insight Foundations: Step-by-Step Data Analysis with R

2025-03-31 · O'Reilly Data Science Books O'Reilly Amazon

book

by Nikita Tkachenko

Analytics Data Analytics Data Collection Data Management DataViz Git Python data data-science data-science-tools r

This book is an essential guide designed to equip you with the vital tools and knowledge needed to excel in data science. Master the end-to-end process of data collection, processing, validation, and imputation using R, and understand fundamental theories to achieve transparency with literate programming, renv, and Git, and much more. Each chapter is concise and focused, rendering complex topics accessible and easy to understand. Data Insight Foundations caters to a diverse audience, including web developers, mathematicians, data analysts, and economists, and its flexible structure enables you to explore chapters in sequence or navigate directly to the topics most relevant to you. While examples are primarily in R, a basic understanding of the language is advantageous but not essential. Many chapters, especially those focusing on theory, require no programming knowledge at all. Dive in and discover how to manipulate data, ensure reproducibility, conduct thorough literature reviews, collect data effectively, and present your findings with clarity.

What You Will Learn: Data Management: Master the end-to-end process of data collection, processing, validation, and imputation using R. Reproducible Research: Understand fundamental theories and achieve transparency with literate programming, renv, and Git. Academic Writing: Conduct scientific literature reviews and write structured papers and reports with Quarto. Survey Design: Design well-structured surveys and manage data collection effectively. Data Visualization: Understand data visualization theory and create well-designed and captivating graphics using ggplot2.

Who this Book is For: Career professionals such as research and data analysts transitioning from academia to a professional setting where production quality significantly impacts career progression. Some familiarity with data analytics processes and an interest in learning R or Python are ideal.

Hands-On APIs for AI and Data Science

2025-03-05 · O'Reilly Data Science Books O'Reilly Amazon

book

by Ryan Day

AI/ML API Cloud Computing GenAI LLM Python data data-science

Are you ready to grow your skills in AI and data science? A great place to start is learning to build and use APIs in real-world data and AI projects. API skills have become essential for AI and data science success, because they are used in a variety of ways in these fields. With this practical book, data scientists and software developers will gain hands-on experience developing and using APIs with the Python programming language and popular frameworks like FastAPI and Streamlit.

As you complete the chapters in the book, you'll be creating portfolio projects that teach you how to: design APIs that data scientists and AIs love; develop APIs using Python and FastAPI; deploy APIs using multiple cloud providers; create data science projects such as visualizations and models using APIs as a data source; and access APIs using generative AI and LLMs.

The Well-Grounded Data Analyst

2025-02-06 · O'Reilly Data Science Books O'Reilly Amazon

book

by David Asboth

Data Modelling IBM Python data data-science

Complete eight data science projects that lock in important real-world skills, along with a practical process you can use to learn any new technique quickly and efficiently. Data analysts need to be problem solvers, and The Well-Grounded Data Analyst will teach you how to solve the most common problems you'll face in industry. You'll explore eight scenarios that your class or bootcamp won’t have covered, so you can accomplish what your boss is asking for.

In The Well-Grounded Data Analyst you'll learn: high-value skills to tackle specific analytical problems; deconstructing problems for faster, practical solutions; data modeling, PDF data extraction, and categorical data manipulation; and handling vague metrics, deciphering inherited projects, and defining customer records.

The Well-Grounded Data Analyst is for junior and early-career data analysts looking to supplement their foundational data skills with real-world problem solving. As you explore each project, you'll also master a proven process for quickly learning new skills developed by author and Half Stack Data Science podcast host David Asboth. You'll learn how to determine a minimum viable answer for your stakeholders, identify and obtain the data you need to deliver, and reliably present and iterate on your findings. The book can be read cover-to-cover or opened to the chapter most relevant to your current challenges.

About the Technology: Real world data analysis is messy. Success requires tackling challenges like unreliable data sources, ambiguous requests, and incompatible formats, often with limited guidance. This book goes beyond the clean, structured examples you see in classrooms and bootcamps, offering a step-by-step framework you can use to confidently solve any data analysis problem like a pro.

About the Book: The Well-Grounded Data Analyst introduces you to eight scenarios that every data analyst is bound to face. You’ll practice author David Asboth’s results-oriented approach as you model data by identifying customer records, navigate poorly-defined metrics, extract data from PDFs, and much more! It also teaches you how to take over incomplete projects and create rapid prototypes with real data. Along the way, you’ll build an impressive portfolio of projects you can showcase at your next interview.

What's Inside: deconstructing problems; handling vague metrics; data modeling; categorical data manipulation.

About the Reader: For early-career data scientists.

About the Author: David Asboth is a data generalist, educator, and software architect. He co-hosts the Half Stack Data Science podcast.

Quotes:

Well reasoned and well written, with approaches to solve many sorts of data analysis problems. - Naomi Ceder, Fellow of the Python Software Foundation

An excellent resource for any aspiring data scientist! - Andrew R. Freed, IBM

David’s clear and repeatable framework will give you confidence to tackle open-ended stakeholder requests and reach an answer much faster! - Shaun McGirr, DevOn Software Services

A book version of shadowing a senior data analyst while they explain handling frequent data problems at work, including all the ugly gotchas. - Randy Au, Google

Causal Inference for Data Science

2025-01-30 · O'Reilly Data Science Books O'Reilly Amazon

book

by Aleix Ruiz de Villa

AI/ML Python data data-science

When you know the cause of an event, you can affect its outcome. This accessible introduction to causal inference shows you how to determine causality and estimate effects using statistics and machine learning. A/B tests or randomized controlled trials are expensive and often unfeasible in a business environment. Causal Inference for Data Science reveals the techniques and methodologies you can use to identify causes from data, even when no experiment or test has been performed. In Causal Inference for Data Science you will learn how to: Model reality using causal graphs Estimate causal effects using statistical and machine learning techniques Determine when to use A/B tests, causal inference, and machine learning Explain and assess objectives, assumptions, risks, and limitations Determine if you have enough variables for your analysis It’s possible to predict events without knowing what causes them. Understanding causality allows you both to make data-driven predictions and to intervene to affect the outcomes. Causal Inference for Data Science shows you how to build data science tools that can identify the root cause of trends and events. You’ll learn how to interpret historical data, understand customer behaviors, and empower management to apply optimal decisions. About the Technology Why did you get a particular result? What would have led to a different outcome? These are the essential questions of causal inference. This powerful methodology improves your decisions by connecting cause and effect, even when you can’t run experiments, A/B tests, or expensive controlled trials. About the Book Causal Inference for Data Science introduces techniques to apply causal reasoning to ordinary business scenarios. And with this clearly written, practical guide, you won’t need advanced statistics or high-level math to put causal inference into practice! 
By applying a simple approach based on Directed Acyclic Graphs (DAGs), you’ll learn to assess advertising performance, pick productive health treatments, deliver effective product pricing, and more. What's Inside When to use A/B tests, causal inference, and ML Assess objectives, assumptions, risks, and limitations Apply causal inference to real business data About the Reader For data scientists, ML engineers, and statisticians. About the Author Aleix Ruiz de Villa Robert is a data scientist with a PhD in mathematical analysis from the Universitat Autònoma de Barcelona. Quotes With intuitive explanations, application-focused insights, and real-world examples, this book offers immense practical value. - Philipp Bach, Maintainer of the DoubleML libraries for Python and R An essential guide for navigating the complexities of real-world data analysis. - Adi Shavit, SWAPP A must-read! Demystifies causal inference with a blend of theory and practice. - Karan Gupta, SunPower Corporation Causal relationships can mask and distort results. This book provides a set of tools to extract insights correctly. - Peter V. Henstock, Harvard Extension

Statistical Quantitative Methods in Finance: From Theory to Quantitative Portfolio Management

2025-01-22 · O'Reilly Data Science Books O'Reilly Amazon

book

by Samit Ahlawat

AI/ML Python data data-science data-science-tasks statistics

Statistical quantitative methods are vital for financial valuation models and benchmarking machine learning models in finance. This book explores the theoretical foundations of statistical models, from ordinary least squares (OLS) to the generalized method of moments (GMM) used in econometrics. It enriches your understanding through practical examples drawn from applied finance, demonstrating the real-world applications of these concepts. Additionally, the book delves into non-linear methods and Bayesian approaches, which are becoming increasingly popular among practitioners thanks to advancements in computational resources. By mastering these topics, you will be equipped to build foundational models crucial for applied data science, a skill highly sought after by software engineering and asset management firms. The book also offers valuable insights into quantitative portfolio management, showcasing how traditional data science tools can be enhanced with machine learning models. These enhancements are illustrated through real-world examples from finance and econometrics, accompanied by Python code. This practical approach ensures that you can apply what you learn, gaining proficiency in the statsmodels library and becoming adept at designing, implementing, and calibrating your models. By understanding and applying these statistical models, you enhance your data science skills and effectively tackle financial challenges. 
What You Will Learn Understand the fundamentals of linear regression and its applications in financial data analysis and prediction Apply generalized linear models for handling various types of data distributions and enhancing model flexibility Gain insights into regime switching models to capture different market conditions and improve financial forecasting Benchmark machine learning models against traditional statistical methods to ensure robustness and reliability in financial applications Who This Book Is For Data scientists, machine learning engineers, finance professionals, and software engineers

Julia Quick Syntax Reference: A Pocket Guide for Data Science Programming

2025-01-03 · O'Reilly Data Science Books O'Reilly Amazon

book

by Antonello Lobianco

AI/ML API Python data data-science

Learn the Julia programming language as quickly as possible. This book is a must-have reference guide that presents the essential Julia syntax in a well-organized format, updated with the latest features of Julia’s APIs, libraries, and packages. This book provides an introduction that reveals basic Julia structures and syntax; discusses data types, control flow, functions, input/output, exceptions, metaprogramming, performance, and more. Additionally, you'll learn to interface Julia with other programming languages such as R for statistics or Python. At a more applied level, you will learn how to use Julia packages for data analysis, numerical optimization, symbolic computation, and machine learning, and how to present your results in dynamic documents. The Second Edition delves deeper into modules, environments, and parallelism in Julia. It covers random numbers, reproducibility in stochastic computations, and adds a section on probabilistic analysis. Finally, it provides forward-thinking introductions to AI and machine learning workflows using BetaML, including regression, classification, clustering, and more, with practical exercises and solutions for self-learners. What You Will Learn Work with Julia types and the different containers for rapid development Use vectorized, classical loop-based code, logical operators, and blocks Explore Julia functions: arguments, return values, polymorphism, parameters, anonymous functions, and broadcasts Build custom structures in Julia Use C/C++, Python or R libraries in Julia and embed Julia in other code. Optimize performance with GPU programming, profiling and more. Manage, prepare, analyse and visualise your data with DataFrames and Plots Implement complete ML workflows with BetaML, from data coding to model evaluation, and more. Who This Book Is For Experienced programmers who are new to Julia, as well as data scientists who want to improve their analysis or try out machine learning algorithms with Julia.

Big Data, Data Mining and Data Science

2024-12-30 · O'Reilly Data Science Books O'Reilly Amazon

book

by George Dimitoglou, Hamid Arabnia, Leonidas Deligiannidis

Big Data data data-science

Through the application of cutting-edge techniques like Big Data, Data Mining, and Data Science, it is possible to extract insights from massive datasets. These methodologies are crucial in enabling informed decision-making and driving transformative advancements across many fields, industries, and domains. This book offers an overview of the latest tools, methods, and approaches while also highlighting their practical use through various applications and case studies.

Data Science for Decision Makers

2024-12-26 · O'Reilly Data Science Books O'Reilly Amazon

book

by Erik Herman

Data Collection data data-science

Data Science for Decision Makers is an essential guide for executives, managers, entrepreneurs, and anyone seeking to harness the power of data to drive business success. In today's fast-paced and increasingly digital world, the ability to make informed decisions based on data-driven insights is vital. This book serves as a bridge between the complex world of data science and the strategic decision-making process, providing readers with the knowledge and tools they need to leverage data effectively. With a clear focus on practical application, this book demystifies key concepts in data science, from data collection and analysis to predictive modeling and visualization. Via real-world examples, case studies, and actionable insights, readers will learn how to extract insights from data and translate them into actionable strategies that drive organizational growth. Written in a reader-friendly manner, this book caters to both novice and experienced professionals alike. Whether you're a seasoned executive looking to sharpen your strategic acumen or a manager seeking to enhance your team's data literacy, this essential reference provides the necessary foundation to navigate the complex landscape of data science with confidence.

Data Science Essentials For Dummies

2024-12-24 · O'Reilly Data Science Books O'Reilly Amazon

book

by Lillian Pierson

Data Collection data data-science

Feel confident navigating the fundamentals of data science. Data Science Essentials For Dummies is a quick reference on the core concepts of the exploding and in-demand data science field, which involves data collection and working on dataset cleaning, processing, and visualization. This direct and accessible resource helps you brush up on key topics and is right to the point, eliminating review material, wordy explanations, and fluff, so you get what you need, fast. Strengthen your understanding of data science basics. Review what you've already learned or pick up key skills. Effectively work with data and provide accessible materials to others. Jog your memory on the essentials as you work and get clear answers to your questions. Perfect for supplementing classroom learning, reviewing for a certification, or staying knowledgeable on the job, Data Science Essentials For Dummies is a reliable reference that's great to keep on hand as an everyday desk reference.
