O'Reilly Data Science Books

R for Everyone: Advanced Analytics and Graphics, 2nd Edition

2017-06-14 O'Reilly Amazon

book

Jared P. Lander

data data-science data-science-tools r AI/ML Analytics

Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals. Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone, Second Edition, is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you'll need to accomplish 80 percent of modern data tasks. Lander's self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You'll download and install R; navigate and use the R environment; master basic program control, data import, manipulation, and visualization; and walk through several essential tests. Then, building on this foundation, you'll construct several complete models, both linear and nonlinear, and use some data mining techniques. After all this you'll make your code reproducible with LaTeX, RMarkdown, and Shiny. By the time you're done, you won't just know how to write R programs, you'll be ready to tackle the statistical problems you care about most.
Coverage includes Explore R, RStudio, and R packages Use R for math: variable types, vectors, calling functions, and more Exploit data structures, including data.frames, matrices, and lists Read many different types of data Create attractive, intuitive statistical graphics Write user-defined functions Control program flow with if, ifelse, and complex checks Improve program efficiency with group manipulations Combine and reshape multiple datasets Manipulate strings using R's facilities and regular expressions Create normal, binomial, and Poisson probability distributions Build linear, generalized linear, and nonlinear models Program basic statistics: mean, standard deviation, and t-tests Train machine learning models Assess the quality of models and variable selection Prevent overfitting and perform variable selection, using the Elastic Net and Bayesian methods Analyze univariate and multivariate time series data Group data via K-means and hierarchical clustering Prepare reports, slideshows, and web pages with knitr Display interactive data with RMarkdown and htmlwidgets Implement dashboards with Shiny Build reusable R packages with devtools and Rcpp

Business in Real-Time Using Azure IoT and Cortana Intelligence Suite: Driving Your Digital Transformation

2017-06-05 O'Reilly Amazon

book

Jeff Barnes, Bob Familiar

data data-science business-intelligence AI/ML Analytics Azure Resource Manager (ARM)

Learn how today’s businesses can transform themselves by leveraging real-time data and advanced machine learning analytics. This book provides prescriptive guidance for architects and developers on the design and development of modern Internet of Things (IoT) and Advanced Analytics solutions. In addition, Business in Real-Time Using Azure IoT and Cortana Intelligence Suite offers patterns and practices for those looking to engage their customers and partners through Software-as-a-Service solutions that work on any device. Whether you're working in Health & Life Sciences, Manufacturing, Retail, Smart Cities and Buildings or Process Control, there exists a common platform from which you can create your targeted vertical solutions. Business in Real-Time Using Azure IoT and Cortana Intelligence Suite uses a reference architecture as a road map. Building on Azure’s PaaS services, you'll see how a solution architecture unfolds that demonstrates a complete end-to-end IoT and Advanced Analytics scenario. What You'll Learn: Automate your software product life cycle using PowerShell, Azure Resource Manager Templates, and Visual Studio Team Services Implement smart devices using Node.JS and C# Use Azure Streaming Analytics to ingest millions of events Provide both "Hot" and "Cold" path outputs for real-time alerts, data transformations, and aggregation analytics Implement batch processing using Azure Data Factory Create a new form of Actionable Intelligence (AI) to drive mission critical business processes Provide rich Data Visualizations across a wide variety of mobile and web devices Who This Book is For: Solution Architects, Software Developers, Data Architects, Data Scientists, and CIO/CTA Technical Leadership Professionals

Practical Statistics for Data Scientists

2017-05-18 O'Reilly Amazon

book

Andrew Bruce, Peter Bruce

data data-science data-science-tasks statistics AI/ML Big Data

Statistical methods are a key part of data science, yet very few data scientists have any formal statistics training. Courses and books on basic statistics rarely cover the topic from a data science perspective. This practical guide explains how to apply various statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what's important and what's not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R programming language, and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format. With this book, you’ll learn: Why exploratory data analysis is a key preliminary step in data science How random sampling can reduce bias and yield a higher quality dataset, even with big data How the principles of experimental design yield definitive answers to questions How to use regression to estimate outcomes and detect anomalies Key classification techniques for predicting which categories a record belongs to Statistical machine learning methods that “learn” from data Unsupervised learning methods for extracting meaning from unlabeled data

Breaking Data Science Open

2017-05-15 O'Reilly Amazon

book

Christine Doig, Michele Chambers, Ian Stokes-Rees

data data-science AI/ML Analytics Data Science IBM

Over the past decade, data science has come out of the back office to become a force of change across the entire organization. At the forefront of this change is the open data science movement that advocates the use of open source tools in a powerful, connected ecosystem. This report explores how open data science can help your organization break free from the shackles of proprietary tools, embrace a more open and collaborative work style, and unleash new intelligent applications quickly. Authors Michele Chambers and Christine Doig explain how open source tools have helped bring about many facets of the data science evolution, including collaboration, self-service, and deployment. But you’ll discover that open data science is about more than tools; it’s about a new way of working as an organization. Learn how data science—particularly open data science—has become part of everyday business Understand how open data science engages people from other disciplines, not just statisticians Examine tools and practices that enable data science to be open across technical, operational, and organizational aspects Learn benefits of open data science, including rich resources, agility, transparency, and collective intelligence Explore case studies that demonstrate different ways to implement open data science Discover how open data science can help you break down department barriers and make bold market moves Michele Chambers, Chief Marketing Officer and VP Products at Continuum Analytics, is an entrepreneurial executive with over 25 years of industry experience. Prior to Continuum Analytics, Michele held executive leadership roles at several database and analytic companies, including Netezza, IBM, Revolution Analytics, MemSQL, and RapidMiner. Christine Doig is a senior data scientist at Continuum Analytics, where she's worked on several projects, including MEMEX, a DARPA-funded open data science project to help stop human trafficking. 
She has 5+ years of experience in analytics, operations research, and machine learning in a variety of industries.

Mastering Machine Learning with R - Second Edition

2017-04-24 O'Reilly Amazon

book

Vikram Dhillon, Miroslav Kopecky, Doug Ortiz, Cory Lesmeister

data data-science data-science-tools r AI/ML Cloud Computing

Dive into the world of advanced machine learning techniques with "Mastering Machine Learning with R, Second Edition." This comprehensive guide equips you with the skills to implement sophisticated algorithms and create powerful prediction models using R 3.x. You will explore topics such as supervised and unsupervised learning, decision trees, ensemble methods, and deep learning. What this Book will help me do Implement machine learning workflows using a variety of R packages like XGBOOST. Effectively use linear and logistic regression for statistical analysis and pattern recognition. Develop skills in advanced methods such as support vector machines and neural networks. Learn actionable techniques to create recommendation engines and perform text mining. Gain hands-on experience running R-based machine learning analyses on cloud platforms. Author(s) Cory Lesmeister, a seasoned data scientist, combines extensive hands-on experience and a passion for teaching to deliver technical concepts in a practical, engaging manner. With a strong background in statistical analysis and machine learning, they are dedicated to providing readers with actionable knowledge and step-by-step guidance. Who is it for? This book is ideal for data scientists, analysts, and machine learning practitioners aiming to deepen their expertise in R. Readers should have a fundamental understanding of machine learning concepts and a basic knowledge of R programming. If you're looking to master advanced learning methods and apply them effectively, this book is tailored for you.

Beginning Data Science in R: Data Analysis, Visualization, and Modelling for the Data Scientist

2017-03-09 O'Reilly Amazon

book

Thomas Mailund

data data-science AI/ML Analytics Big Data Data Science

Discover best practices for data analysis and software development in R and start on the path to becoming a fully-fledged data scientist. This book teaches you techniques for both data manipulation and visualization and shows you the best way for developing new software packages for R. Beginning Data Science in R details how data science is a combination of statistics, computational science, and machine learning. You'll see how to efficiently structure and mine data to extract useful patterns and build mathematical models. This requires computational methods and programming, and R is an ideal programming language for this. This book is based on a number of lecture notes for classes the author has taught on data science and statistical programming using the R programming language. Modern data analysis requires computational skills and usually a minimum of programming. What You Will Learn Perform data science and analytics using statistics and the R programming language Visualize and explore data, including working with large data sets found in big data Build an R package Test and check your code Practice version control Profile and optimize your code Who This Book Is For Those with some data science or analytics background, but not necessarily experience with the R programming language.

Data Science For Dummies, 2nd Edition

2017-03-06 O'Reilly Amazon

book

Jake Porway, Lillian Pierson

data data-science AI/ML Big Data Data Science DataViz

Your ticket to breaking into the field of data science! Jobs in data science are projected to outpace the number of people with data science skills—making those with the knowledge to fill a data science position a hot commodity in the coming years. Data Science For Dummies is the perfect starting point for IT professionals and students interested in making sense of an organization's massive data sets and applying their findings to real-world business scenarios. From uncovering rich data sources to managing large amounts of data within hardware and software limitations, ensuring consistency in reporting, merging various data sources, and beyond, you'll develop the know-how you need to effectively interpret data and tell a story that can be understood by anyone in your organization. Provides a background in data science fundamentals and preparing your data for analysis Details different data visualization techniques that can be used to showcase and summarize your data Explains both supervised and unsupervised machine learning, including regression, model validation, and clustering techniques Includes coverage of big data processing tools like MapReduce, Hadoop, Dremel, Storm, and Spark It's a big, big data world out there—let Data Science For Dummies help you harness its power and gain a competitive edge for your organization.

The Data Science Handbook

2017-02-28 O'Reilly Amazon

book

Field Cady

data data-science AI/ML Analytics Big Data Computer Science

A comprehensive overview of data science covering the analytics, programming, and business skills necessary to master the discipline Finding a good data scientist has been likened to hunting for a unicorn: the required combination of technical skills is simply very hard to find in one person. In addition, good data science is not just rote application of trainable skill sets; it requires the ability to think flexibly about all these areas and understand the connections between them. This book provides a crash course in data science, combining all the necessary skills into a unified discipline. Unlike many analytics books, computer science and software engineering are given extensive coverage since they play such a central role in the daily work of a data scientist. The author also describes classic machine learning algorithms, from their mathematical foundations to real-world applications. Visualization tools are reviewed, and their central importance in data science is highlighted. Classical statistics is addressed to help readers think critically about the interpretation of data and its common pitfalls. The clear communication of technical results, which is perhaps the most undertrained of data science skills, is given its own chapter, and all topics are explained in the context of solving real-world data problems. 
The book also features: • Extensive sample code and tutorials using Python™ along with its technical libraries • Core technologies of “Big Data,” including their strengths and limitations and how they can be used to solve real-world problems • Coverage of the practical realities of the tools, keeping theory to a minimum; however, when theory is presented, it is done in an intuitive way to encourage critical thinking and creativity • A wide variety of case studies from industry • Practical advice on the realities of being a data scientist today, including the overall workflow, where time is spent, the types of datasets worked on, and the skill sets needed The Data Science Handbook is an ideal resource for data analysis methodology and big data software tools. The book is appropriate for people who want to practice data science, but lack the required skill sets. This includes software professionals who need to better understand analytics and statisticians who need to understand software. Modern data science is a unified discipline, and it is presented as such. This book is also an appropriate reference for researchers and entry-level graduate students who need to learn real-world analytics and expand their skill set. FIELD CADY is the data scientist at the Allen Institute for Artificial Intelligence, where he develops tools that use machine learning to mine scientific literature. He has also worked at Google and several Big Data startups. He has a BS in physics and math from Stanford University, and an MS in computer science from Carnegie Mellon.

Scala: Guide for Data Science Professionals

2017-02-24 O'Reilly Amazon

book

Patrick R. Nicolas, Pascal Bugnion, Arun Manivannan

software-development programming-languages jvm-languages Scala AI/ML Analytics

Scala will be a valuable tool to have on hand during your data science journey for everything from data cleaning to cutting-edge machine learning About This Book Build data science and data engineering solutions with ease An in-depth look at each stage of the data analysis process — from reading and collecting data to distributed analytics Explore a broad variety of data processing, machine learning, and genetic algorithms through diagrams, mathematical formulations, and source code Who This Book Is For This learning path is perfect for those who are comfortable with Scala programming and now want to enter the field of data science. Some knowledge of statistics is expected. What You Will Learn Transfer and filter tabular data to extract features for machine learning Read, clean, transform, and write data to both SQL and NoSQL databases Create Scala web applications that couple with JavaScript libraries such as D3 to create compelling interactive visualizations Load data from HDFS and HIVE with ease Run streaming and graph analytics in Spark for exploratory analysis Bundle and scale up Spark jobs by deploying them into a variety of cluster managers Build dynamic workflows for scientific computing Leverage open source libraries to extract patterns from time series Master probabilistic models for sequential data In Detail Scala is especially good for analyzing large sets of data as the scale of the task doesn’t have any significant impact on performance. Scala’s powerful functional libraries can interact with databases and build scalable frameworks — resulting in the creation of robust data pipelines. The first module introduces you to Scala libraries to ingest, store, manipulate, process, and visualize data. Using real world examples, you will learn how to design scalable architecture to process and model data — starting from simple concurrency constructs and progressing to actor systems and Apache Spark. 
After this, you will also learn how to build interactive visualizations with web frameworks. Once you have become familiar with all the tasks involved in data science, you will explore data analytics with Scala in the second module. You’ll see how Scala can be used to make sense of data through easy to follow recipes. You will learn about Bokeh bindings for exploratory data analysis and quintessential machine learning algorithms with the Spark ML library. You’ll get a sufficient understanding of Spark streaming, machine learning for streaming data, and Spark GraphX. Armed with a firm understanding of data analysis, you will be ready to explore the most cutting-edge aspect of data science — machine learning. The final module teaches you the A to Z of machine learning with Scala. You’ll explore Scala for dependency injections and implicits, which are used to write machine learning algorithms. You’ll also explore machine learning topics such as clustering, dimensionality reduction, Naïve Bayes, Regression models, SVMs, neural networks, and more. This learning path combines some of the best that Packt has to offer into one complete, curated package. It includes content from the following Packt products: Scala for Data Science, Pascal Bugnion Scala Data Analysis Cookbook, Arun Manivannan Scala for Machine Learning, Patrick R. Nicolas Style and approach A complete package with all the information necessary to start building useful data engineering and data science solutions straight away. It contains a diverse set of recipes that cover the full spectrum of interesting data analysis tasks and will help you revolutionize your data analysis skills using Scala. Downloading the example code for this book. You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the code file.

MATLAB Machine Learning

2016-12-28 O'Reilly Amazon

book

Michael Paluszek, Stephanie Thomas

data data-science data-science-tools MATLAB AI/ML Big Data

This book is a comprehensive guide to machine learning with worked examples in MATLAB. It starts with an overview of the history of Artificial Intelligence and automatic control and how the field of machine learning grew from these. It provides descriptions of all major areas in machine learning. The book reviews commercially available packages for machine learning and shows how they fit into the field. The book then shows how MATLAB can be used to solve machine learning problems and how MATLAB graphics can enhance the programmer’s understanding of the results and help users of their software grasp the results. Machine Learning can be very mathematical. The mathematics for each area is introduced in a clear and concise form so that even casual readers can understand the math. Readers from all areas of engineering will see connections to what they know and will learn new technology. The book then provides complete solutions in MATLAB for several important problems in machine learning including face identification, autonomous driving, and data classification. Full source code is provided for all of the examples and applications in the book. What you'll learn: An overview of the field of machine learning Commercial and open source packages in MATLAB How to use MATLAB for programming and building machine learning applications MATLAB graphics for machine learning Practical real world examples in MATLAB for major applications of machine learning in big data Who is this book for: The primary audiences are engineers and engineering students wanting a comprehensive and practical introduction to machine learning.

Business Analytics Using R - A Practical Approach

2016-12-27 O'Reilly Amazon

book

Umesh R. Hodeghatta, Umesha Nayak

data data-science data-science-tools r AI/ML Analytics

Learn the fundamental aspects of the business statistics, data mining, and machine learning techniques required to understand the huge amount of data generated by your organization. This book explains practical business analytics through examples, covers the steps involved in using it correctly, and shows you the context in which a particular technique does not make sense. Further, Practical Business Analytics using R helps you understand specific issues faced by organizations and how the solutions to these issues can be facilitated by business analytics. This book will discuss and explore the following through examples and case studies: An introduction to R: data management and R functions The architecture, framework, and life cycle of a business analytics project Descriptive analytics using R: descriptive statistics and data cleaning Data mining: classification, association rules, and clustering Predictive analytics: simple regression, multiple regression, and logistic regression This book includes case studies on important business analytic techniques, such as classification, association, clustering, and regression. The R language is the statistical tool used to demonstrate the concepts throughout the book. What You Will Learn • Write R programs to handle data • Build analytical models and draw useful inferences from them • Discover the basic concepts of data mining and machine learning • Carry out predictive modeling • Define a business issue as an analytical problem Who This Book Is For Beginners who want to understand and learn the fundamentals of analytics using R. Students, managers, executives, strategy and planning professionals, software professionals, and BI/DW professionals.

Principles of Data Science

2016-12-16 O'Reilly Amazon

book

Sinan Ozdemir

data data-science AI/ML Analytics Data Science Python

If you've ever wondered how to bridge the gap between mathematics, programming, and actionable data insights, 'Principles of Data Science' is the guide for you. This book explores the full data science pipeline, providing you with tools and knowledge to transform raw data into impactful decisions. With practical lessons and hands-on tutorials, you'll master the essential skills of a data scientist. What this Book will help me do Understand and apply the five core steps of the data science process. Gain insight into data cleaning, visualization, and effective communication of results. Learn and implement foundational machine learning models using Python or R. Bridge gaps between mathematics, statistics, and programming to solve data-driven problems. Evaluate machine learning models using key metrics for better predictive capabilities. Author(s) The author, a seasoned data scientist with years of professional experience in analytics and software development, brings a rich perspective to the topic. Combining a strong foundation in mathematics with expertise in Python and R, they have worked on diverse real-world data projects. Their teaching philosophy emphasizes clarity and practical application, ensuring you not only gain knowledge but also know how to apply it effectively. Who is it for? This book is intended for individuals with a basic understanding of algebra and some programming experience in Python or R. It is perfect for programmers who wish to dive into the world of data science or for those with math skills looking to apply them practically. If you seek to turn raw data into valuable insights and predictions, this book is tailored for you.

Python Data Science Handbook

2016-11-21 O'Reilly Amazon

book

Jake VanderPlas

software-development programming-languages Python AI/ML Data Science Matplotlib

For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all—IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools. Working scientists and data crunchers familiar with reading and writing Python code will find this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python. With this handbook, you’ll learn how to use: IPython and Jupyter: provide computational environments for data scientists using Python NumPy: includes the ndarray for efficient storage and manipulation of dense data arrays in Python Pandas: features the DataFrame for efficient storage and manipulation of labeled/columnar data in Python Matplotlib: includes capabilities for a flexible range of data visualizations in Python Scikit-Learn: for efficient and clean Python implementations of the most important and established machine learning algorithms

Learning R Programming

2016-10-28 O'Reilly Amazon

book

Kun Ren

data data-science data-science-tools r AI/ML Data Science

This book provides a comprehensive introduction to R programming, a powerful tool for data science and statistics. Throughout the book, readers will explore programming constructs, data structures, and popular R packages, gaining the skills needed for practical applications and problem-solving. What this Book will help me do Understand R's foundational concepts like variables, data types, and functions. Learn how to use R for data analysis, visualization, and machine learning tasks. Develop advanced R skills such as meta-programming and performance optimization. Master object-oriented programming using R's S3, S4, and R6 systems. Gain confidence in utilizing R for creating web scraping scripts and interactive reports. Author(s) Kun Ren, an experienced software developer and educator, specializes in languages for data analysis, including R. With years of practical experience and teaching R programming, they bring clarity and depth to complex topics. Their approachable writing style ensures learners at any level can engage effectively. Who is it for? This book is ideal for professionals in data science, statistics, and related fields with basic programming skills looking to delve into R programming. It caters to beginners and those consolidating their knowledge of R, aiming to develop practical skills for data manipulation and analysis.

Practical Data Analysis - Second Edition

2016-09-30 O'Reilly Amazon

book

Hector Cuesta, Dr. Sampath Kumar

data data-science data-science-tasks exploratory-data-analysis AI/ML Big Data

Practical Data Analysis provides a hands-on guide to mastering essential data analysis techniques using tools like Pandas, MongoDB, and Apache Spark. With step-by-step instructions, you'll explore how to process diverse data types, apply machine learning methods, and uncover actionable insights that can drive innovative projects and business solutions. What this Book will help me do Master data acquisition, formatting, and visualization techniques to prepare your data for analysis. Understand and apply machine learning algorithms for tasks like classification and forecasting. Learn to analyze textual data, such as performing sentiment analysis and text classification. Effectively work with databases using tools like MongoDB and handle big data with Apache Spark. Develop data-driven applications using real-world examples like image similarity searches and social network graph analysis. Author(s) Hector Cuesta and Dr. Sampath Kumar are experienced data scientists and educators. They have considerable experience applying data analysis techniques in various domains and a passion for teaching these skills. Their practical approach to data analysis ensures an engaging learning experience for readers. Who is it for? This book is ideal for developers and data enthusiasts aiming to incorporate practical data analysis into their projects. It is perfectly suited for readers with basic programming, statistics, and linear algebra knowledge. Even if you're new to professional data analysis, you'll find the step-by-step examples approachable. This book guides you in transforming raw data into valuable insights.

Big Data Analytics with R

2016-07-29 O'Reilly Amazon

book

Simon Walkowiak

data data-science data-science-tools r AI/ML Analytics

Unlock the potential of big data analytics by mastering R programming with this comprehensive guide. This book takes you step-by-step through real-world scenarios where R's capabilities shine, providing you with practical skills to handle, process, and analyze large and complex datasets effectively. What this Book will help me do Understand the latest big data processing methods and how R can enhance their application. Set up and use big data platforms such as Hadoop and Spark in conjunction with R. Utilize R for practical big data problems, such as analyzing consumption and behavioral datasets. Integrate R with SQL and NoSQL databases to maximize its versatility in data management. Discover advanced machine learning implementations using R and Spark MLlib for predictive analytics. Author(s) Simon Walkowiak is an experienced data analyst and R programming expert with a passion for data engineering and machine learning. With a deep knowledge of big data platforms and extensive teaching experience, they bring a clear and approachable writing style to help learners excel. Who is it for? Ideal for data analysts, scientists, and engineers with fundamental data analysis knowledge looking to enhance their big data capabilities using R. If you aim to adapt R for large-scale data management and analysis workflows, this book is your ideal companion to bridge the gap.

R for Data Science Cookbook

2016-07-29 O'Reilly Amazon

book

Prabhanjan Narayanachar Tattar , Yu-Wei Chiu (David Chiu)

data data-science data-science-tools r AI/ML Data Science

The "R for Data Science Cookbook" is your comprehensive guide to tackling data problems using R. Focusing on practical applications, you will learn data manipulation, visualization, statistical inference, and machine learning with a hands-on approach using popular R packages. What this Book will help me do Master the use of R's functional programming features to streamline your analysis workflows. Extract, transform, and visualize data effectively using robust R packages like dplyr and ggplot2. Learn to create intuitive and professional visualizations and reports that communicate insights effectively. Implement key statistical modeling and machine learning techniques to solve real-world problems. Acquire expertise in data mining techniques, including clustering and association rule mining. Author(s) Yu-Wei Chiu, also known as David Chiu, is an experienced data scientist and educator. With a solid technical background in using R for data science, he combines theory with practical applications in his writing. David's approachable style and rich examples make complex topics accessible and engaging for learners. Who is it for? This book is perfect for individuals who already have a foundation in R and are looking to deepen their expertise in applying R to data science tasks. Ideal readers are analysts and statisticians eager to solve real-world problems using practical tools. If you're aspiring to work effectively with large data sets or want to learn versatile data analysis techniques, this book is designed for you. It bridges the gap between theoretical knowledge and actionable skills, making it invaluable for professionals and learners alike.

AI and Medicine

2016-07-15 O'Reilly Amazon

book

Mike Barlow

data data-science healthcare-analytics AI/ML Analytics Data Analytics

Data-driven techniques have improved decision-making processes for people in industries such as finance and real estate. Yet, despite promising solutions that data analytics and artificial intelligence (AI) and machine learning (ML) tools can bring to healthcare, the industry remains largely unconvinced. In this O’Reilly report, you’ll explore the potential of, and impediments to, widespread adoption of AI and ML in the medical field. You’ll also learn how extensive government regulation and resistance from the medical community have so far stymied full-scale acceptance of sophisticated data analytics in healthcare. Through interviews with several professionals working at the intersection of medicine and data science, author Mike Barlow examines five areas where the application of AI/ML strategies can spur a beneficial revolution in healthcare: identifying risks and interventions for healthcare management of entire populations; closing gaps in care by designing plans for individual patients; supporting customized self-care treatment plans and monitoring patient health in real time; optimizing healthcare processes through data analysis to improve care and reduce costs; and helping doctors and patients choose proper medications, dosages, and promising surgical options.

Mastering Python Data Analysis

2016-06-27 O'Reilly Amazon

book

Luiz Felipe Martins , Magnus Vilhelm Persson

data data-science AI/ML Pandas Python

Mastering Python Data Analysis provides a comprehensive roadmap for Python developers to enhance their data analysis skills to tackle real-world problems. This book delves into advanced statistical analysis, covering tools, models, and methods to transform raw data into valuable insights. What this Book will help me do Effectively handle and preprocess data using Python and Pandas. Explore statistical models to identify patterns and gain insights from data. Learn clustering approaches to detect data groupings and predict outcomes. Utilize Bayesian methods for quantifying causal relationships. Generate professional reports and visualizations with Python tools like Jupyter Notebook. Author(s) Magnus Vilhelm Persson is a seasoned software developer and data analyst with expertise in leveraging Python for sophisticated data analysis and machine learning tasks. Drawing from years of experience in the tech industry, Persson provides practical, real-world insights throughout the book. His approachable writing style ensures technical concepts are conveyed with clarity, making data analysis accessible to developers at varying skill levels. Who is it for? This book is ideal for intermediate Python developers seeking to elevate their data analysis skills. If you are familiar with Python libraries and have an interest in solving complex data problems, this guide will serve as a stepping stone to mastery. Advanced beginners with a curiosity for statistical methods and a desire to learn through practical examples will find this book invaluable. It is also perfect for professionals aiming to integrate Python-based statistical techniques into their workflow.

Advancing Procurement Analytics

2016-06-15 O'Reilly Amazon

book

Federico Castanedo

data data-science analytics-platforms AI/ML Analytics Data Analytics

One area where data analytics can have a profound effect is your company’s procurement process. Some organizations spend more than two thirds of their revenue buying goods and services, making procurement, out of all business activities, a key element in achieving cost reduction. This report examines how your company can significantly improve procurement analytics to solve business questions quickly and effectively. Author Federico Castanedo, Chief Data Scientist at WiseAthena.com, explains how a probabilistic, bottom-up approach can significantly increase the quality, speed, and scalability of your data preparation operations, whether you’re integrating datasets or cleaning and classifying them. You’ll learn how new solutions leverage automation and machine learning, including the Tamr platform, and help you take advantage of several data-driven actions for procurement, including compliance, price arbitrage, and spend recovery.

Python: Real-World Data Science

2016-06-10 O'Reilly Amazon

book

Sebastian Raschka , Martin Czygan , Robert Layton , Phuong Vo.T.H , Fabrizio Romano , Dusty Phillips

data data-science AI/ML Analytics Big Data Data Science

Unleash the power of Python and its robust data science capabilities About This Book Unleash the power of Python 3 objects Learn to use powerful Python libraries for effective data processing and analysis Harness the power of Python to analyze data and create insightful predictive models Unlock deeper insights into machine learning with this vital guide to cutting-edge predictive analytics Who This Book Is For Entry-level analysts who want to enter the data science world will find this course very useful for getting acquainted with Python's data science capabilities for real-world data analysis. What You Will Learn Install and set up Python Implement objects in Python by creating classes and defining methods Get acquainted with NumPy to use it with arrays and array-oriented computing in data analysis Create effective visualizations for presenting your data using Matplotlib Process and analyze data using the time series capabilities of pandas Interact with different kinds of database systems, such as file, disk format, Mongo, and Redis Apply data mining concepts to real-world problems Compute on big data, including real-time data from the Internet Explore how to use different machine learning models to ask different questions of your data In Detail The Python: Real-World Data Science course will take you on a journey to become an efficient data science practitioner by thoroughly understanding the key concepts of Python. This learning path is divided into four modules, and each module is a mini course in its own right; as you complete each one, you'll have gained key skills and be ready for the material in the next module. The course begins with getting your Python fundamentals nailed down. After getting familiar with Python core concepts, it's time to dive into the field of data science. In the second module, you'll learn how to perform data analysis using Python in a practical and example-driven way.
The third module will teach you how to design and develop data mining applications using a variety of datasets, starting with basic classification and affinity analysis and moving to more complex data types including text, images, and graphs. Machine learning and predictive analytics have become the most important approaches to uncover data gold mines. In the final module, we'll discuss the necessary details regarding machine learning concepts, offering intuitive yet informative explanations on how machine learning algorithms work, how to use them, and most importantly, how to avoid the common pitfalls. Style and approach This course includes all the resources that will help you jump into the data science field with Python and learn how to make sense of data. The aim is to create a smooth learning path that will teach you how to get started with powerful Python libraries and perform various data science techniques in depth.

The Evolution of Analytics

2016-05-15 O'Reilly Amazon

book

Patrick Hall , Wen Phan , Katie Whitson

data data-science analytics-platforms AI/ML Analytics

Machine learning is a hot topic in business. Even data-driven organizations that have spent years developing successful data analysis platforms, with many accurate statistical models in place, are now looking into this decades-old discipline. But how can companies turn hyped opportunities for machine learning into real business value? This report examines the growing momentum of machine learning in the analytics landscape, the challenges machine learning presents to businesses, and examples of how organizations are actively seeking to incorporate modern machine learning techniques into their production data infrastructures. Authors Patrick Hall, Wen Phan, and Katie Whitson look at two companies in depth—one in healthcare and one in finance—that are seeing the real impact of machine learning. Discover how machine learning can help your organization: Analyze and generate insights from large amounts of varied, messy, and unstructured data unfit for traditional statistical analysis Increase the predictive accuracy beyond what was previously possible Augment aging analytical processes and other decision-making tools

Introducing Data Science

2016-05-03 O'Reilly Amazon

book

Mohamed Ali , Davy Cielen , Arno Meysman

data data-science AI/ML Big Data Data Science DataViz

Introducing Data Science teaches you how to accomplish the fundamental tasks that occupy data scientists. Using the Python language and common Python libraries, you'll experience firsthand the challenges of dealing with data at scale and gain a solid foundation in data science. About the Technology Many companies need developers with data science skills to work on projects ranging from social media marketing to machine learning. Discovering what you need to learn to begin a career as a data scientist can seem bewildering. This book is designed to help you get started. About the Book Introducing Data Science explains vital data science concepts and teaches you how to accomplish the fundamental tasks that occupy data scientists. You'll explore data visualization, graph databases, the use of NoSQL, and the data science process. You'll use the Python language and common Python libraries as you experience firsthand the challenges of dealing with data at scale. Discover how Python allows you to gain insights from data sets so big that they need to be stored on multiple machines, or from data moving so quickly that no single machine can handle it. This book gives you hands-on experience with the most popular Python data science libraries, Scikit-learn and StatsModels. After reading this book, you'll have the solid foundation you need to start a career in data science. What's Inside Handling large data Introduction to machine learning Using Python to work with data Writing data science algorithms About the Reader This book assumes you're comfortable reading code in Python or a similar language, such as C, Ruby, or JavaScript. No prior experience with data science is required. About the Authors Davy Cielen, Arno D. B. Meysman, and Mohamed Ali are the founders and managing partners of Optimately and Maiton, where they focus on developing data science projects and solutions in various sectors. 
Quotes Read this book if you want to get a quick overview of data science, with lots of examples to get you started! - Alvin Raj, Oracle The map that will help you navigate the data science oceans. - Marius Butuc, Shopify Covers the processes involved in data science from end to end… A complete overview. - Heather Campbell, Kainos A must-read for anyone who wants to get into the data science world. - Hector Cuesta, Big Data Bootcamp

Learning Probabilistic Graphical Models in R

2016-04-29 O'Reilly Amazon

book

David Bellot

data data-science data-science-tools r AI/ML

Explore the fundamentals of probabilistic graphical models (PGM) with hands-on examples using R. This book helps you translate theoretical concepts into practical solutions, addressing complex problems with Bayesian and Markov networks. It's written to demystify PGMs, equipping you to create robust models for inference, learning, and prediction. What this Book will help me do Understand and implement probabilistic graphical models, including Bayesian and Markov networks, directly in R. Learn to use various R packages for performing inference and analyzing probabilistic models. Master the essentials of Bayesian methods, transitioning to advanced concepts with clear, step-by-step guidance. Familiarize yourself with methods like PCA and ICA for analyzing and reducing complex data dimensions. Develop practical skills to apply PGM techniques to machine learning challenges and real-world data problems. Author(s) The authors bring diverse expertise in probabilistic modeling, R programming, and applied machine learning. They are passionate educators and technical writers, focusing on breaking down complex theories into accessible knowledge. Their writing emphasizes practical demonstration, leveraging their industry and academic experiences. Who is it for? This book is designed for data scientists, engineers, and machine learning enthusiasts who wish to enhance their understanding of probabilistic graphical models. Whether you're curious about Bayesian methods or looking to apply PGM approaches to data-rich challenges, this guide is perfect for learners at an intermediate level, offering practical insights and real-world applications.

NumPy Essentials

2016-04-28 O'Reilly Amazon

book

Shane Holloway , Tanmay Dutta , Jaidev Deshpande , Leo (Liang-Huan) Chin

data data-science data-science-tools NumPy AI/ML API

NumPy Essentials is your guide to mastering NumPy, the powerful Python library for scientific computing. In this book, you'll discover how to manipulate arrays, perform mathematical operations, and create advanced models. With its clear examples and practical exercises, you'll build the skills needed to efficiently tackle analytical challenges. What this Book will help me do Learn to manipulate data efficiently with NumPy array objects and universal functions. Gain proficiency in solving linear algebra problems using NumPy's powerful modules. Master regression techniques and curve fitting for statistical modeling. Apply Fourier Transform and spectral analysis in solving real-world problems. Integrate and optimize Python code using Cython and the NumPy C API for higher performance. Author(s) Jaidev Deshpande, Leo (Liang-Huan) Chin, Tanmay Dutta, and Shane Holloway are seasoned developers passionate about Python and scientific computing. With experience across diverse projects, they bring practical insights and accessible explanations to their writing. Who is it for? This book is ideal for Python developers seeking to sharpen their numerical computing skills. Prior experience with Python is expected, as the content progresses quickly to advanced topics. Whether you're working in data analysis, scientific research, or machine learning, this book will provide valuable tools and insights.
