O'Reilly Data Science Books

It's All Analytics!

2020-05-25 O'Reilly Amazon

book

Scott Burk, Gary D. Miner

data data-science business-intelligence prescriptive-analytics AI/ML Analytics

This book, the first in a series of three, provides a look at the foundations of artificial intelligence and analytics and why readers need an unbiased understanding of the subject.

Analytical Skills for AI and Data Science

2020-05-21 O'Reilly Amazon

book

Daniel Vaughan

data data-science AI/ML Analytics Data Science

While several market-leading companies have successfully transformed their business models by following data- and AI-driven paths, the vast majority have yet to reap the benefits. How can your business and analytics units gain a competitive advantage by capturing the full potential of this predictive revolution? This practical guide presents a battle-tested end-to-end method to help you translate business decisions into tractable prescriptive solutions using data and AI as fundamental inputs. Author Daniel Vaughan shows data scientists, analytics practitioners, and others interested in using AI to transform their businesses not only how to ask the right questions but also how to generate value using modern AI technologies and decision-making principles. You’ll explore several use cases common to many enterprises, complete with examples you can apply when working to solve your own issues.
- Break business decisions into stages that can be tackled using different skills from the analytical toolbox
- Identify and embrace uncertainty in decision making and protect against common human biases
- Customize optimal decisions to different customers using predictive and prescriptive methods and technologies
- Ask business questions that create high value through AI- and data-driven technologies

Practical Synthetic Data Generation

2020-05-19 O'Reilly Amazon

book

Khaled El Emam, Lucy Mosquera, Richard Hoptroff

data data-science data-science-tasks data-wrangling-preparation-cleaning data wrangling, preparation, cleaning AI/ML

Building and testing machine learning models requires access to large and diverse data. But where can you find usable datasets without running into privacy issues? This practical book introduces techniques for generating synthetic data (fake data generated from real data) so you can perform secondary analysis to do research, understand customer behaviors, develop new products, or generate new revenue. Data scientists will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps for generating synthetic data from real datasets. And business leaders will see how synthetic data can help accelerate time to a product or solution. This book describes:
- Steps for generating synthetic data using multivariate normal distributions
- Methods for distribution fitting covering different goodness-of-fit metrics
- How to replicate the simple structure of original data
- An approach for modeling data structure to consider complex relationships
- Multiple approaches and metrics you can use to assess data utility
- How analysis performed on real data can be replicated with synthetic data
- Privacy implications of synthetic data and methods to assess identity disclosure

Practical Statistics for Data Scientists, 2nd Edition

2020-05-11 O'Reilly Amazon

book

Andrew Bruce, Peter Bruce, Peter Gedeck

data data-science data-science-tasks statistics AI/ML Big Data

Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical guidance on applying statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what’s important and what’s not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format. With this book, you’ll learn:
- Why exploratory data analysis is a key preliminary step in data science
- How random sampling can reduce bias and yield a higher-quality dataset, even with big data
- How the principles of experimental design yield definitive answers to questions
- How to use regression to estimate outcomes and detect anomalies
- Key classification techniques for predicting which categories a record belongs to
- Statistical machine learning methods that "learn" from data
- Unsupervised learning methods for extracting meaning from unlabeled data

ML Ops: Operationalizing Data Science

2020-04-25 O'Reilly Amazon

book

Dev Kannabiran, Michael O'Connell, Dan Rope, Steven Hillion, Thomas Hill, David Sweenor

data data-science AI/ML Analytics Data Analytics Data Science

More than half of the analytics and machine learning (ML) models created by organizations today never make it into production. Instead, many of these ML models do nothing more than provide static insights in a slideshow. If they aren’t truly operational, these models can’t possibly do what you’ve trained them to do. This report introduces practical concepts to help data scientists and application engineers operationalize ML models to drive real business change. Through lessons based on numerous projects around the world, six experts in data analytics provide an applied four-step approach (Build, Manage, Deploy and Integrate, and Monitor) for creating ML-infused applications within your organization. You’ll learn how to:
- Fulfill data science value by reducing friction throughout ML pipelines and workflows
- Constantly refine ML models through retraining, periodic tuning, and even complete remodeling to ensure long-term accuracy
- Design the ML Ops lifecycle to ensure that people-facing models are unbiased, fair, and explainable
- Operationalize ML models not only for pipeline deployment but also for external business systems that are more complex and less standardized
- Put the four-step Build, Manage, Deploy and Integrate, and Monitor approach into action

Strategic Analytics: The Insights You Need from Harvard Business Review

2020-04-21 O'Reilly Amazon

book

Cassie Kozyrkov, Eric Siegel, Harvard Business Review, Thomas H. Davenport, Edward L. Glaeser

data data-science analytics-platforms AI/ML Analytics Blockchain

Is your company ready for the next wave of analytics? Data analytics offer the opportunity to predict the future, use advanced technologies, and gain valuable insights about your business. But unless you're staying on top of the latest developments, your company is wasting that potential--and your competitors will be gaining speed while you fall behind. Strategic Analytics: The Insights You Need from Harvard Business Review will provide you with today's essential thinking about what data analytics are capable of, what critical talents your company needs to reap their benefits, and how to adopt analytics throughout your organization--before it's too late. Business is changing. Will you adapt or be left behind? Get up to speed and deepen your understanding of the topics that are shaping your company's future with the Insights You Need from Harvard Business Review series. Featuring HBR's smartest thinking on fast-moving issues--blockchain, cybersecurity, AI, and more--each book provides the foundational introduction and practical case studies your organization needs to compete today and collects the best research, interviews, and analysis to get it ready for tomorrow. You can't afford to ignore how these issues will transform the landscape of business and society. The Insights You Need series will help you grasp these critical ideas--and prepare you and your company for the future.

Intelligence at the Edge

2020-02-28 O'Reilly Amazon

book

Michael Harvey

data data-science analytics-platforms SAS AI/ML Analytics

Explore powerful SAS analytics and the Internet of Things! The world that we live in is more connected than ever before. The Internet of Things (IoT) consists of mechanical and electronic devices connected to one another and to software through the internet. Businesses can use the IoT to quickly make intelligent decisions based on massive amounts of data gathered in real time from these connected devices. IoT increases productivity, lowers operating costs, and provides insights into how businesses can serve existing markets and expand into new ones. Intelligence at the Edge: Using SAS with the Internet of Things is for anyone who wants to learn more about the rapidly changing field of IoT. Current practitioners explain how to apply SAS software and analytics to derive business value from the Internet of Things. The cornerstone of this endeavor is SAS Event Stream Processing, which enables you to process and analyze continuously flowing events in real time. With step-by-step guidance and real-world scenarios, you will learn how to apply analytics to streaming data. Each chapter explores a different aspect of IoT, including the analytics life cycle, monitoring, deployment, geofencing, machine learning, artificial intelligence, condition-based maintenance, computer vision, and edge devices.

Pandas 1.x Cookbook - Second Edition

2020-02-27 O'Reilly Amazon

book

Theodore Petrou, Matthew Harrison

data data-science data-science-tools Pandas AI/ML Data Science

The 'Pandas 1.x Cookbook' offers a recipe-based guide for mastering the powerful Python library, pandas. You will gain practical knowledge for handling and manipulating data efficiently, from the fundamentals to advanced techniques. The book is an essential resource for exploring and analyzing datasets with pandas.

What this Book will help me do
- Understand and apply data exploration techniques in pandas.
- Use pandas to manipulate, aggregate, and clean datasets to extract meaningful insights.
- Combine pandas with Matplotlib and Seaborn to create effective visualizations.
- Perform time series analysis and transform datasets for machine learning.
- Implement workflows for handling large-scale data that exceeds your computer's memory.

Author(s)
Matthew Harrison and Theodore Petrou are highly experienced educators and practitioners in data science and Python programming. With their extensive expertise in using pandas, they provide insights through practical exercises and approachable narratives. Their aim is to make complex concepts accessible to learners of varying skill levels.

Who is it for?
This book is ideal for Python programmers, analysts, and data scientists seeking to expand their data handling and analysis capabilities. It caters to both beginners who are new to pandas and those looking to deepen their understanding of its advanced features. If your goal is to explore, clean, and analyze complex datasets efficiently, this book is tailored for you.

Advances in Data Science

2020-02-05 O'Reilly Amazon

book

Huiwen Wang, Gilbert Saporta, Rong Guan, Edwin Diday

data data-science AI/ML Data Science

Data science unifies statistics, data analysis and machine learning to achieve a better understanding of the masses of data which are produced today, and to improve prediction. Special kinds of data (symbolic, network, complex, compositional) are increasingly frequent in data science. These data require specific methodologies, but there is a lack of reference work in this field. Advances in Data Science fills this gap. It presents a collection of up-to-date contributions by eminent scholars following two international workshops held in Beijing and Paris. The 10 chapters are organized into four parts: Symbolic Data, Complex Data, Network Data and Clustering. They include fundamental contributions, as well as applications to several domains, including business and the social sciences.

Statistics and Probability with Applications for Engineers and Scientists Using MINITAB, R and JMP, 2nd Edition

2020-02-05 O'Reilly Amazon

book

Bhisham C. Gupta, Kalanka P. Jayalath, Irwin Guttman

data data-science data-science-tasks statistics AI/ML Big Data

Introduces basic concepts in probability and statistics to data science students, as well as engineers and scientists.

Aimed at undergraduate/graduate-level engineering and natural science students, this timely, fully updated edition of a popular book on statistics and probability shows how real-world problems can be solved using statistical concepts. It removes Excel exhibits and replaces them with R software throughout, and updates both MINITAB and JMP software instructions and content. A new chapter discussing data mining (including big data, classification, machine learning, and visualization) is featured. Another new chapter covers cluster analysis methodologies in hierarchical, nonhierarchical, and model-based clustering. The book also offers a chapter on Response Surfaces that previously appeared on the book’s companion website.

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP, Second Edition is broken into two parts. Part I covers topics such as: describing data graphically and numerically, elements of probability, discrete and continuous random variables and their probability distributions, distribution functions of random variables, sampling distributions, estimation of population parameters and hypothesis testing. Part II covers: elements of reliability theory, data mining, cluster analysis, analysis of categorical data, nonparametric tests, simple and multiple linear regression analysis, analysis of variance, factorial designs, response surfaces, and statistical quality control (SQC) including phase I and phase II control charts. The appendices contain statistical tables and charts and answers to selected problems.
- Features two new chapters, one on Data Mining and another on Cluster Analysis
- Now contains R exhibits including code, graphical display, and some results
- MINITAB and JMP have been updated to their latest versions
- Emphasizes the p-value approach and includes related practical interpretations
- Offers a more applied statistical focus, and features modified examples to better exhibit statistical concepts
- Supplemented with an Instructor's-only solutions manual on the book’s companion website

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP is an excellent text for graduate level data science students, and engineers and scientists. It is also an ideal introduction to applied statistics and probability for undergraduate students in engineering and the natural sciences.

The Data Science Workshop

2020-01-29 O'Reilly Amazon

book

Tiffany Ford, Anthony So, Pritesh Tiwari, Thomas Joseph, Andrew Worsley, Ivan Liu, Dr. Samuel Asare, Robert Thas John, Barbora Stetinova

data data-science AI/ML Data Science Pandas Python

The Data Science Workshop is designed for beginners looking to step into the rigorous yet rewarding world of data science. By leveraging a hands-on approach, this book demystifies key concepts and guides you gently into creating practical machine learning models with Python.

What this Book will help me do
- Understand supervised and unsupervised learning and their applications.
- Gain hands-on experience with Python libraries like scikit-learn and pandas for data manipulation.
- Learn practical use cases of machine learning techniques such as regression and clustering.
- Discover techniques to ensure robustness in machine learning with hyperparameter tuning and ensembling.
- Develop efficiency in feature engineering with automated tools to accelerate workflows.

Author(s)
Anthony So, Thomas Joseph, Robert Thas John, and Andrew Worsley are seasoned experts in data science and Python programming. Along with Dr. Samuel Asare, they bring decades of experience and practical knowledge to this book, delivering an engaging and approachable learning experience.

Who is it for?
This book is targeted toward individuals who are beginners in data science and are eager to acquire foundational knowledge and practical skills. It appeals to those who prefer a structured, hands-on approach to learning, possibly having some prior programming experience or interest in Python. Professionals aspiring to pivot into data-oriented roles or students aiming to strengthen their understanding of data science concepts will find this book particularly valuable. If you're looking to gain confidence in implementing data science projects and solving real-world problems, this text is for you.

Probability with R, 2nd Edition

2020-01-22 O'Reilly Amazon

book

Jane M. Horgan

data data-science data-science-tools r AI/ML Cloud Computing

Provides a comprehensive introduction to probability with an emphasis on computing-related applications.

This self-contained new and extended edition outlines a first course in probability applied to computer-related disciplines. As in the first edition, experimentation and simulation are favoured over mathematical proofs. The freely downloadable statistical programming language R is used throughout the text, not only as a tool for calculation and data analysis, but also to illustrate concepts of probability and to simulate distributions. The examples in Probability with R: An Introduction with Computer Science Applications, Second Edition cover a wide range of computer science applications, including: testing program performance; measuring response time and CPU time; estimating the reliability of components and systems; evaluating algorithms and queuing systems. Chapters cover: The R language; summarizing statistical data; graphical displays; the fundamentals of probability; reliability; discrete and continuous distributions; and more. This second edition includes:
- improved R code throughout the text, as well as new procedures, packages and interfaces;
- updated and additional examples, exercises and projects covering recent developments of computing;
- an introduction to bivariate discrete distributions together with the R functions used to handle large matrices of conditional probabilities, which are often needed in machine translation;
- an introduction to linear regression with particular emphasis on its application to machine learning using testing and training data;
- a new section on spam filtering using Bayes' theorem to develop the filters;
- an extended range of Poisson applications such as network failures, website hits, virus attacks and accessing the cloud;
- use of new allocation functions in R to deal with hash table collision, server overload and the general allocation problem.

The book is supplemented with a Wiley Book Companion Site featuring data and solutions to exercises within the book. Primarily addressed to students of computer science and related areas, Probability with R: An Introduction with Computer Science Applications, Second Edition is also an excellent text for students of engineering and the general sciences. Computing professionals who need to understand the relevance of probability in their areas of practice will find it useful.

Data Science Programming All-in-One For Dummies

2020-01-09 O'Reilly Amazon

book

John Paul Mueller, Luca Massaron

data data-science AI/ML Data Science Python

Your logical, linear guide to the fundamentals of data science programming.

Data science is exploding (in a good way), with a forecast of 1.7 megabytes of new information created every second for each human being on the planet by 2020 and 11.5 million job openings by 2026. It clearly pays dividends to be in the know. This friendly guide charts a path through the fundamentals of data science and then delves into the actual work: linear regression, logistic regression, machine learning, neural networks, recommender engines, and cross-validation of models. Data Science Programming All-In-One For Dummies is a compilation of the key data science, machine learning, and deep learning programming languages: Python and R. It helps you decide which programming languages are best for specific data science needs. It also gives you the guidelines to build your own projects to solve problems in real time.
- Get grounded: the ideal start for new data professionals
- What lies ahead: learn about specific areas that data is transforming
- Be meaningful: find out how to tell your data story
- See clearly: pick up the art of visualization

Whether you’re a beginning student or already mid-career, get your copy now and add even more meaning to your life, and everyone else’s!

Practical Data Science with R, Second Edition

2019-12-23 O'Reilly Amazon

book

John Mount, Nina Zumel

data data-science AI/ML BI Computer Science Data Science

Practical Data Science with R, Second Edition takes a practice-oriented approach to explaining basic principles in the ever expanding field of data science. You’ll jump right to real-world use cases as you apply the R programming language and statistical analysis techniques to carefully explained examples based in marketing, business intelligence, and decision support.

About the Technology
Evidence-based decisions are crucial to success. Applying the right data analysis techniques to your carefully curated business data helps you make accurate predictions, identify trends, and spot trouble in advance. The R data analysis platform provides the tools you need to tackle day-to-day data analysis and machine learning tasks efficiently and effectively.

About the Book
Practical Data Science with R, Second Edition is a task-based tutorial that leads readers through dozens of useful data analysis practices using the R language. By concentrating on the most important tasks you’ll face on the job, this friendly guide is comfortable both for business analysts and data scientists. Because data is only useful if it can be understood, you’ll also find fantastic tips for organizing and presenting data in tables, as well as snappy visualizations.

What's Inside
- Statistical analysis for business pros
- Effective data presentation
- The most useful R tools
- Interpreting complicated predictive models

About the Reader
You’ll need to be comfortable with basic statistics and have an introductory knowledge of R or another high-level programming language.

About the Authors
Nina Zumel and John Mount founded a San Francisco–based data science consulting firm. Both hold PhDs from Carnegie Mellon University and blog on statistics, probability, and computer science.

Quotes
Full of useful shared experience and practical advice. Highly recommended. - From the Foreword by Jeremy Howard and Rachel Thomas
Great examples and an informative walk-through of the data science process. - David Meza, NASA
Offers interesting perspectives that cover many aspects of practical data science; a good reference. - Pascal Barbedor, BL SET
R you ready to get data science done the right way? - Taylor Dolezal, Disney Studios

Big Data Analytics Methods

2019-12-16 O'Reilly Amazon

book

Peter Ghavami

data data-science AI/ML Analytics BI Big Data

Big Data Analytics Methods unveils secrets to advanced analytics techniques, including machine learning, random forest classifiers, predictive modeling, cluster analysis, natural language processing (NLP), Kalman filtering, and ensembles of models for optimal accuracy of analysis and prediction. More than 100 analytics techniques and methods provide big data professionals, business intelligence professionals and citizen data scientists insight on how to overcome challenges and avoid common pitfalls and traps in data analytics. The book offers solutions and tips on handling missing data, noisy and dirty data, error reduction and boosting signal to reduce noise. It discusses data visualization, prediction, optimization, artificial intelligence, regression analysis, the Cox hazard model and many analytics using case examples with applications in the healthcare, transportation, retail, telecommunication, consulting, manufacturing, energy and financial services industries. This book's state-of-the-art treatment of advanced data analytics methods and important best practices will help readers succeed in data analytics.

Practical DataOps: Delivering Agile Data Science at Scale

2019-12-09 O'Reilly Amazon

book

Harvinder Atwal

data data-science DataOps Agile/Scrum AI/ML Analytics

Gain a practical introduction to DataOps, a new discipline for delivering data science at scale inspired by practices at companies such as Facebook, Uber, LinkedIn, Twitter, and eBay. Organizations need more than the latest AI algorithms, hottest tools, and best people to turn data into insight-driven action and useful analytical data products. Processes and thinking employed to manage and use data in the 20th century are a bottleneck for working effectively with the variety of data and advanced analytical use cases that organizations have today. This book provides the approach and methods to ensure continuous rapid use of data to create analytical data products and steer decision making.

Practical DataOps shows you how to optimize the data supply chain from diverse raw data sources to the final data product, whether the goal is a machine learning model or other data-orientated output. The book provides an approach to eliminate wasted effort and improve collaboration between data producers, data consumers, and the rest of the organization through the adoption of lean thinking and agile software development principles. This book helps you to improve the speed and accuracy of analytical application development through data management and DevOps practices that securely expand data access, and rapidly increase the number of reproducible data products through automation, testing, and integration. The book also shows how to collect feedback and monitor performance to manage and continuously improve your processes and output.

What You Will Learn
- Develop a data strategy for your organization to help it reach its long-term goals
- Recognize and eliminate barriers to delivering data to users at scale
- Work on the right things for the right stakeholders through agile collaboration
- Create trust in data via rigorous testing and effective data management
- Build a culture of learning and continuous improvement through monitoring deployments and measuring outcomes
- Create cross-functional self-organizing teams focused on goals not reporting lines
- Build robust, trustworthy data pipelines in support of AI, machine learning, and other analytical data products

Who This Book Is For
Data science and advanced analytics experts, CIOs, CDOs (chief data officers), chief analytics officers, business analysts, business team leaders, and IT professionals (data engineers, developers, architects, and DBAs) supporting data teams who want to dramatically increase the value their organization derives from data. The book is ideal for data professionals who want to overcome challenges of long delivery time, poor data quality, high maintenance costs, and scaling difficulties in getting data science output and machine learning into customer-facing production.

The Decision Maker's Handbook to Data Science: A Guide for Non-Technical Executives, Managers, and Founders

2019-11-26 O'Reilly Amazon

book

Stylianos Kampakis

data data-science AI/ML Data Collection Data Science

Data science is expanding across industries at a rapid pace, and the companies first to adopt best practices will gain a significant advantage. To reap the benefits, decision makers need to have a confident understanding of data science and its application in their organization. It is easy for novices to the subject to feel paralyzed by intimidating buzzwords, but what many don’t realize is that data science is in fact quite multidisciplinary, useful in the hands of business analysts, communications strategists, designers, and more. With the second edition of The Decision Maker’s Handbook to Data Science, you will learn how to think like a veteran data scientist and approach solutions to business problems in an entirely new way.

Author Stylianos Kampakis provides you with the expertise and tools required to develop a solid data strategy that is continuously effective. Ethics and legal issues surrounding data collection and algorithmic bias are some common pitfalls that Kampakis helps you avoid, while guiding you on the path to build a thriving data science culture at your organization. This updated and revised second edition includes plenty of case studies, tools for project assessment, and expanded content for hiring and managing data scientists.

Data science is a language that everyone at a modern company should understand across departments. Friction in communication arises most often when management does not connect with what a data scientist is doing or how impactful data collection and storage can be for their organization. The Decision Maker’s Handbook to Data Science bridges this gap and readies you for both the present and future of your workplace in this engaging, comprehensive guide.

What You Will Learn
- Understand how data science can be used within your business.
- Recognize the differences between AI, machine learning, and statistics.
- Become skilled at thinking like a data scientist, without being one.
- Discover how to hire and manage data scientists.
- Comprehend how to build the right environment in order to make your organization data-driven.

Who This Book Is For
Startup founders, product managers, higher level managers, and any other non-technical decision makers who are thinking of implementing data science in their organization and hiring data scientists. A secondary audience includes people looking for a soft introduction to the subject of data science.

Reporting, Predictive Analytics, and Everything in Between

2019-11-25 O'Reilly Amazon

book

Brett Stupakevich, David Sweenor, Shane Swiderek

data data-science business-intelligence AI/ML Analytics BI

Business decisions today are tactical and strategic at the same time. How do you respond to a competitor’s price change? Or to specific technology changes? What new products, markets, or businesses should you pursue? Decisions like these are based on information from only one source: data. With this practical report, technical and non-technical leaders alike will explore the fundamental elements necessary to embark on a data analytics initiative. Is your company planning or contemplating a data analytics initiative? Authors Brett Stupakevich, David Sweenor, and Shane Swiderek from TIBCO guide you through several analytics options. IT leaders, product developers, analytics leaders, data analysts, data scientists, and business professionals will learn how to deploy analytic components in streaming and embedded systems using one of five platforms. You’ll examine:
- Analytics platforms including embedded BI, reporting, data exploration & discovery, streaming BI, and data science & machine learning
- The business problems each option solves and the capabilities and requirements of each
- How to identify the right analytics type for your particular use case
- Key considerations and the level of investment for each analytics platform

Business Analytics, Volume II

2019-11-08 O'Reilly Amazon

book

Dr. Amar Sahay

data data-science business-intelligence AI/ML Analytics BI

This business analytics (BA) text discusses models built on fact-based data that measure past business performance and guide an organization in visualizing and predicting future business performance and outcomes. It provides a comprehensive overview of analytics in general with an emphasis on predictive analytics. Given the booming interest in analytics and data science, this book is timely and informative. It brings many terms, tools, and methods of analytics together. The first three chapters provide an introduction to BA, the importance of analytics, and the types of BA (descriptive, predictive, and prescriptive), along with the tools and models. Business intelligence (BI) and a case on descriptive analytics are discussed. Additionally, the book discusses the most widely used predictive models, including regression analysis, forecasting, data mining, and an introduction to recent applications of predictive analytics: machine learning, neural networks, and artificial intelligence. The concluding chapter discusses the current state, job outlook, and certifications in analytics.

Data Mining for Business Analytics

2019-11-05 O'Reilly Amazon

book

Nitin R. Patel , Galit Shmueli , Peter C. Bruce , Peter Gedeck

data data-science data-science-tasks exploratory-data-analysis AI/ML Analytics

Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python presents an applied approach to data mining concepts and methods, using Python software for illustration. Readers will learn how to implement a variety of popular data mining algorithms in Python (free and open-source software) to tackle business problems and opportunities. This is the sixth version of this successful text, and the first using Python. It covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, recommender systems, clustering, text mining, and network analysis. It also includes: a new co-author, Peter Gedeck, who brings both experience teaching business analytics courses using Python and expertise in the application of machine learning methods to the drug-discovery process; a new section on ethical issues in data mining; updates and new material based on feedback from instructors teaching MBA, undergraduate, diploma, and executive courses, and from their students; more than a dozen case studies demonstrating applications for the data mining techniques described; end-of-chapter exercises that help readers gauge and expand their comprehension and competency of the material presented; and a companion website with more than two dozen data sets, and instructor materials including exercise solutions, PowerPoint slides, and case solutions. Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python is an ideal textbook for graduate and upper-undergraduate level courses in data mining, predictive analytics, and business analytics. This new edition is also an excellent reference for analysts, researchers, and practitioners working with quantitative methods in the fields of business, finance, marketing, computer science, and information technology.
“This book has by far the most comprehensive review of business analytics methods that I have ever seen, covering everything from classical approaches such as linear and logistic regression, through to modern methods like neural networks, bagging and boosting, and even much more business specific procedures such as social network analysis and text mining. If not the bible, it is at the least a definitive manual on the subject.” —Gareth M. James, University of Southern California and co-author (with Witten, Hastie and Tibshirani) of the best-selling book An Introduction to Statistical Learning, with Applications in R

Applications of Computational Intelligence in Data-Driven Trading

2019-10-29 O'Reilly Amazon

book

Cris Doloc

data data-science data-science-domains sector-specific-data-science AI/ML

“Life on earth is filled with many mysteries, but perhaps the most challenging of these is the nature of Intelligence.” – Prof. Terrence J. Sejnowski, Computational Neurobiologist. The main objective of this book is to create awareness about both the promises and the formidable challenges that the era of Data-Driven Decision-Making and Machine Learning is confronted with, and especially about how these new developments may influence the future of the financial industry. The subject of Financial Machine Learning has attracted a lot of interest recently, specifically because it represents one of the most challenging problem spaces for the applicability of Machine Learning. The author has used a novel approach to introduce the reader to this topic. The first half of the book is a readable and coherent introduction to two modern topics that are not generally considered together: the data-driven paradigm and Computational Intelligence. The second half of the book illustrates a set of Case Studies that are relevant today to quantitative trading practitioners who are dealing with problems such as trade execution optimization, price dynamics forecasting, portfolio management, market making, derivatives valuation, risk, and compliance. The main purpose of this book is pedagogical in nature, and it is specifically aimed at defining an adequate level of engineering and scientific clarity when it comes to the usage of the term “Artificial Intelligence,” especially as it relates to the financial industry. The message conveyed by this book is one of confidence in the possibilities offered by this new era of Data-Intensive Computation. This message is not grounded in the current hype surrounding the latest technologies, but in a deep analysis of their effectiveness and also in the author’s two decades of professional experience as a technologist, quant, and academic.

What Is Augmented Analytics?

2019-10-25 O'Reilly Amazon

book

Alice LaPlante

data data-science business-intelligence prescriptive-analytics AI/ML Analytics

As your business tries to make sense of today’s staggering amount of structured and unstructured data, traditional analytics will take you only so far. The key to success over the next few years will depend on augmented analytics, a method that embeds machine learning and natural language processing (NLP) in the process. This report explains how augmented analytics can help you uncover hidden insights, predict results, and even prescribe solutions. Author Alice LaPlante provides best practices for deploying augmented analytics, along with real-world case studies that show you how to take full advantage of this method. IT professionals, business managers, and CFOs will learn ways to democratize data use among business users and executives, using a self-service model. The future belongs to those who can get more from their data. This report shows you how. Get a primer on the key components and learn how they work together. Delve into the benefits of, and roadblocks to, adopting augmented analytics. Learn how companies use this method in marketing, sales, finance, and human resources. Examine case studies of companies including Accenture and Riverbed.

Practical Time Series Analysis

2019-10-16 O'Reilly Amazon

book

Aileen Nielsen

data data-science data-science-tasks statistics time-series AI/ML

Time series data analysis is increasingly important due to the massive production of such data through the internet of things, the digitalization of healthcare, and the rise of smart cities. As continuous monitoring and data collection become more common, the need for competent time series analysis with both statistical and machine learning techniques will increase. Covering innovations in time series data analysis and use cases from the real world, this practical guide will help you solve the most common data engineering and analysis challenges in time series, using both traditional statistical and modern machine learning techniques. Author Aileen Nielsen offers an accessible, well-rounded introduction to time series in both R and Python that will have data scientists, software engineers, and researchers up and running quickly. You’ll get the guidance you need to confidently: find and wrangle time series data; undertake exploratory time series data analysis; store temporal data; simulate time series data; generate and select features for a time series; measure error; forecast and classify time series with machine or deep learning; and evaluate accuracy and performance.

R Bioinformatics Cookbook

2019-10-11 O'Reilly Amazon

book

Dan MacLean

data data-science data-science-domains bioinformatics AI/ML DataViz

In the "R Bioinformatics Cookbook", you will explore the full potential of the R programming language and the Bioconductor ecosystem to overcome challenges in bioinformatics. By working through real-world examples, you will learn to handle biological data effectively and gain insights into genomics, RNA sequencing, and advanced data visualization. What this book will help me do: Develop skills to analyze RNA sequencing data using R and Bioconductor packages such as edgeR and DESeq. Learn to create professional-grade graphical representations of biological data using ggplot and other visualization tools. Understand how to perform genome-wide studies like variant calling and metagenomics analysis with R. Master the integration of external genomic databases with Ensembl for functional annotation. Explore machine learning applications in bioinformatics, including classification and clustering models. Author(s): Dr. Dan MacLean is an experienced bioinformatics researcher and R programmer. With a deep understanding of computational biology and visualization techniques, the author brings years of academic and practical expertise to help readers excel in bioinformatics. An approachable writing style ensures that complex topics are made accessible. Who is it for? This book is ideal for bioinformatics professionals and data analysts with an interest in applying R to biological data. It is particularly suited for those with a basic knowledge of R and bioinformatics looking to enhance their analysis skills. Researchers seeking to integrate genomics and computational methods into their workflows will find this book valuable. It's perfect for anyone aiming to tackle intermediate to advanced topics in biological data analysis.

Practical Data Science with Python 3: Synthesizing Actionable Insights from Data

2019-09-07 O'Reilly Amazon

book

Ervin Varga

data data-science AI/ML Big Data Cloud Computing Data Engineering

Gain insight into essential data science skills in a holistic manner using data engineering and associated scalable computational methods. This book covers the most popular Python 3 frameworks for both local and distributed (on-premises and cloud-based) processing. Along the way, you will be introduced to many popular open-source frameworks, such as SciPy, scikit-learn, Numba, and Apache Spark. The book is structured around examples, so you will grasp core concepts via case studies and Python 3 code. As data science projects get continuously larger and more complex, software engineering knowledge and experience are crucial to produce evolvable solutions. You'll see how to create maintainable software for data science and how to document data engineering practices. This book is a good starting point for people who want to gain practical skills to perform data science. All the code will be available in the form of IPython notebooks and Python 3 programs, which allow you to reproduce all analyses from the book and customize them for your own purpose. You'll also benefit from advanced topics like Machine Learning, Recommender Systems, and Security in Data Science. Practical Data Science with Python will empower you to analyze data, formulate proper questions, and produce actionable insights, three core stages in most data science endeavors. What You'll Learn: Play the role of a data scientist when completing increasingly challenging exercises using Python 3. Work with proven data science techniques and technologies. Review scalable software engineering practices to ramp up data analysis abilities in the realm of Big Data. Apply the theory of probability, statistical inference, and algebra to understand data science practices. Who This Book Is For: Anyone who would like to venture into the realm of data science using Python 3.

talk-data.com
