O'Reilly Data Science Books

Hands-On Data Analysis with Pandas

2019-07-26 O'Reilly Amazon

book

Stefanie Molin

data data-science data-science-tools Pandas AI/ML Analytics

Hands-On Data Analysis with Pandas provides an intensive dive into mastering the pandas library for data science and analysis using Python. Through a combination of conceptual explanations and practical demonstrations, readers will learn how to manipulate, visualize, and analyze data efficiently. What this Book will help me do Understand and apply the pandas library for efficient data manipulation. Learn to perform data wrangling tasks such as cleaning and reshaping datasets. Create effective visualizations using pandas and libraries like matplotlib and seaborn. Grasp the basics of machine learning and implement solutions with scikit-learn. Develop reusable data analysis scripts and modules in Python. Author(s) Stefanie Molin is a seasoned data scientist and software engineer with extensive experience in Python and data analytics. She specializes in leveraging the latest data science techniques to solve real-world problems. Her engaging and detailed writing draws from her practical expertise, aiming to make complex concepts accessible to all. Who is it for? This book is ideal for data analysts and aspiring data scientists who are at the beginning stages of their careers or looking to enhance their toolset with pandas and Python. It caters to Python developers eager to delve into data analysis workflows. Readers should have some programming knowledge to fully benefit from the examples and exercises.

Fundamentals of Programming in SAS

2019-07-25 O'Reilly Amazon

book

James Blum , Jonathan Duggins

data data-science analytics-platforms SAS

Unlock the essentials of SAS programming! Fundamentals of Programming in SAS: A Case Studies Approach gives a complete introduction to SAS programming. Perfect for students, novice SAS users, and programmers studying for their Base SAS certification, this book covers all the basics, including working with data, creating visualizations, data validation, and good programming practices. Experienced programmers know that real-world scenarios require practical solutions. Designed for use in the classroom and for self-guided learners, this book takes a novel approach to learning SAS programming by following a single case study throughout the text and circling back to previous concepts to reinforce material. Readers will benefit from the variety of exercises, including both multiple choice questions and in-depth case studies. Additional case studies are also provided online for extra practice. This approach mirrors the way good SAS programmers develop their skills, through hands-on work with an eye toward developing the knowledge necessary to tackle more difficult tasks. After reading this book, you will gain the skills and confidence to take on larger challenges with the power of SAS.

Data Science with Python and Dask

2019-07-18 O'Reilly Amazon

book

Jesse Daniel

data data-science data-science-tools dask AI/ML Analytics

Dask is a native parallel analytics tool designed to integrate seamlessly with the libraries you’re already using, including Pandas, NumPy, and Scikit-Learn. With Dask you can crunch and work with huge datasets, using the tools you already have. And Data Science with Python and Dask is your guide to using Dask for your data projects without changing the way you work! About the Technology An efficient data pipeline means everything for the success of a data science project. Dask is a flexible library for parallel computing in Python that makes it easy to build intuitive workflows for ingesting and analyzing large, distributed datasets. Dask provides dynamic task scheduling and parallel collections that extend the functionality of NumPy, Pandas, and Scikit-learn, enabling users to scale their code from a single laptop to a cluster of hundreds of machines with ease. About the Book Data Science with Python and Dask teaches you to build scalable projects that can handle massive datasets. After meeting the Dask framework, you’ll analyze data in the NYC Parking Ticket database and use DataFrames to streamline your process. Then, you’ll create machine learning models using Dask-ML, build interactive visualizations, and build clusters using AWS and Docker. What's Inside Working with large, structured and unstructured datasets Visualization with Seaborn and Datashader Implementing your own algorithms Building distributed apps with Dask Distributed Packaging and deploying Dask apps About the Reader For data scientists and developers with experience using Python and the PyData stack. About the Author Jesse Daniel is an experienced Python developer. He taught Python for Data Science at the University of Denver and leads a team of data scientists at a Denver-based media technology company.
Quotes The most comprehensive coverage of Dask to date, with real-world examples that made a difference in my daily work. - Al Krinker, United States Patent and Trademark Office An excellent alternative to PySpark for those who are not on a cloud platform. The author introduces Dask in a way that speaks directly to an analyst. - Jeremy Loscheider, Panera Bread A greatly paced introduction to Dask with real-world datasets. - George Thomas, R&D Architecture Manhattan Associates The ultimate resource to quickly get up and running with Dask and parallel processing in Python. - Gustavo Patino, Oakland University William Beaumont School of Medicine

Hands-On Web Scraping with Python

2019-07-15 O'Reilly Amazon

book

Anish Chapagain

data data-science data-science-tasks web-scraping API Python

This book, "Hands-On Web Scraping with Python", is your comprehensive guide to mastering web scraping techniques and tools. Harnessing the power of Python libraries like Scrapy, Beautiful Soup, and Selenium, you'll learn how to extract and analyze data from websites effectively and efficiently. What this Book will help me do Master the foundational concepts of web scraping using Python. Efficiently use libraries such as Scrapy, Beautiful Soup, and Selenium for data extraction. Handle advanced scenarios such as forms, logins, and dynamic content in scraping. Leverage XPath, CSS selectors, and Regex for precise data targeting and processing. Improve scraping reliability and manage challenges like cookies, API use, and web security. Author(s) Anish Chapagain is an accomplished Python programmer and an expert in web scraping methodologies. With years of experience in applying Python to solve practical data challenges, they bring a clear and insightful approach to teaching these skills. Readers appreciate their practical examples and ready-to-use guidance for real-world applications. Who is it for? This book is designed for Python developers and data enthusiasts eager to master web scraping. Whether you're a beginner looking to deep dive into new techniques or an analyst needing reliable data extraction methods, this book offers clear guidance. A basic understanding of Python is recommended to fully benefit from this text.

Data Science Strategy For Dummies

2019-07-11 O'Reilly Amazon

book

Ulrika Jägare

data data-science Analytics Big Data Data Science

All the answers to your data science questions Over half of all businesses are using data science to generate insights and value from big data. How are they doing it? Data Science Strategy For Dummies answers all your questions about how to build a data science capability from scratch, starting with the “what” and the “why” of data science and covering what it takes to lead and nurture a top-notch team of data scientists. With this book, you’ll learn how to incorporate data science as a strategic function into any business, large or small. Find solutions to your real-life challenges as you uncover the stories and value hidden within data. Learn exactly what data science is and why it’s important Adopt a data-driven mindset as the foundation to success Understand the processes and common roadblocks behind data science Keep your data science program focused on generating business value Nurture a top-quality data science team In non-technical language, Data Science Strategy For Dummies outlines new perspectives and strategies to effectively lead analytics and data science functions to create real value.

Bayesian Statistics the Fun Way

2019-07-09 O'Reilly Amazon

book

Will Kurt

data data-science data-science-tasks statistics bayesian-statistics

Probability and statistics are increasingly important in a huge range of professions. But many people use data in ways they don’t even understand, meaning they aren’t getting the most from it. Bayesian Statistics the Fun Way will change that. This book will give you a complete understanding of Bayesian statistics through simple explanations and un-boring examples. Find out the probability of UFOs landing in your garden, how likely Han Solo is to survive a flight through an asteroid belt, how to win an argument about conspiracy theories, and whether a burglary really was a burglary, to name a few examples. By using these off-the-beaten-track examples, the author actually makes learning statistics fun. And you’ll learn real skills, like how to: •Measure your own level of uncertainty in a conclusion or belief •Calculate Bayes' theorem and understand what it’s useful for •Find the posterior, likelihood, and prior to check the accuracy of your conclusions •Calculate distributions to see the range of your data •Compare hypotheses and draw reliable conclusions from them Next time you find yourself with a sheaf of survey results and no idea what to do with them, turn to Bayesian Statistics the Fun Way to get the most value from your data.

Associations and Correlations

2019-06-28 O'Reilly Amazon

book

Lee Baker

data data-science data-science-tasks statistics Analytics Data Analytics

"Associations and Correlations: Unearth the powerful insights buried in your data" is a comprehensive guide for understanding and utilizing associations and correlations in data analysis. This book walks you through methods of classifying data, selecting appropriate statistical tests, and interpreting results effectively. By the end, you'll have mastered how to reveal data insights clearly and reliably. What this Book will help me do Identify and prepare datasets suitable for analysis with confidence. Understand and apply the principles of associations and correlations in data analytics. Use statistical tests to uncover univariate and multivariate relationships. Classify and interpret data into qualitative and quantitative segments effectively. Develop visual representations of data relationships to communicate insights clearly. Author(s) Lee Baker is an experienced statistician and data scientist with a passion for education. With years of teaching and mentoring professionals in data analysis, Lee excels in breaking down complex statistical concepts into understandable insights. Lee's approachable style aims to empower learners to harness their data's full potential. Who is it for? This book is designed for budding data analysts and data scientists, targeting those starting their journey into data analytics. It serves well as an introduction to the fundamentals of associations and correlations, making it suitable for beginners. If you seek a foundational understanding or a recap of key concepts, this book is for you.

Probability and Statistics for Computer Scientists, 3rd Edition

2019-06-25 O'Reilly Amazon

book

Michael Baron

data data-science data-science-tasks statistics

Covers probability and statistical methods, simulation techniques, and modeling tools. This third edition of the textbook adds R, including code for the data analysis examples, and helps students solve problems, make optimal decisions, select stochastic models, compute probabilities and forecasts, and evaluate the performance of computer systems and networks.

R Cookbook, 2nd Edition

2019-06-25 O'Reilly Amazon

book

Paul Teetor , JD Long

data data-science data-science-tools r R

Perform data analysis with R quickly and efficiently with more than 275 practical recipes in this expanded second edition. The R language provides everything you need to do statistical work, but its structure can be difficult to master. These task-oriented recipes make you productive with R immediately. Solutions range from basic tasks to input and output, general statistics, graphics, and linear regression. Each recipe addresses a specific problem and includes a discussion that explains the solution and provides insight into how it works. If you’re a beginner, R Cookbook will help get you started. If you’re an intermediate user, this book will jog your memory and expand your horizons. You’ll get the job done faster and learn more about R in the process. Create vectors, handle variables, and perform basic functions Simplify data input and output Tackle data structures such as matrices, lists, factors, and data frames Work with probability, probability distributions, and random variables Calculate statistics and confidence intervals and perform statistical tests Create a variety of graphic displays Build statistical models with linear regressions and analysis of variance (ANOVA) Explore advanced statistical techniques, such as finding clusters in your data

The Care and Feeding of Data Scientists

2019-06-25 O'Reilly Amazon

book

Michelangelo D'Agostino , Katie Malone

data data-science data-science-as-a-profession Agile/Scrum Analytics Data Science

As a discipline, data science is relatively young, but the job of managing data scientists is younger still. Many people undertake this management position without the tools, mentorship, or role models they need to do it well. This report examines the steps necessary to build, manage, sustain, and retain a growing data science team. You’ll learn how data science management is similar to but distinct from other management types. Michelangelo D’Agostino, VP of Data Science and Engineering at ShopRunner, and Katie Malone, Director of Data Science at Civis Analytics, provide concrete tips for balancing and structuring a data science team, recruiting and interviewing the best candidates, and keeping them productive and happy once they're in place. In this report, you'll: Explore data scientist archetypes, such as operations and research, that fit your organization Devise a plan to recruit, interview, and hire members for your data science team Retain your hires by providing challenging work and learning opportunities Explore Agile and OKR methodology to determine how your team will work together Provide your team with a career ladder through guidance and mentorship

Digital Processing of Random Oscillations

2019-06-17 O'Reilly Amazon

book

Viacheslav Karmalita

data data-science data-science-tasks statistics stata ELK

This book deals with the autoregressive method for digital processing of random oscillations. The method is based on a one-to-one transformation of the numeric factors of the Yule series model to linear elastic system characteristics. This parametric approach made it possible to develop a formal procedure for processing experimental data to obtain estimates of the logarithmic decrement and natural frequency of random oscillations. A straightforward mathematical description of the procedure makes it possible to optimize the discretization of oscillation realizations so that the estimates are efficient. The derived analytical expressions for confidence intervals of the estimates enable a priori evaluation of their accuracy. Experimental validation of the method is also provided. Statistical applications for the analysis of mechanical systems arise from the fact that the loads experienced by machinery and various structures often cannot be described by deterministic vibration theory. Therefore, a sufficient description of real oscillatory processes (vibrations) calls for the use of random functions. In engineering practice, the linear vibration theory (modeling phenomena by ordinary linear differential equations) is generally used. This theory’s fundamental concepts, such as natural frequency, oscillation decrement, and resonance, account for its wide use in a variety of technical tasks. In technical applications two types of research tasks exist: direct and inverse. The former makes it possible to determine the stochastic characteristics of the system output X(t) resulting from a random process E(t) when the object model is considered known. The direct task makes it possible to evaluate the effect of an operational environment on the designed object and to predict its operation under various loads. The inverse task is aimed at evaluating the object model from known processes E(t) and X(t), i.e., finding the factors of the model (equations).
This task usually arises when testing prototypes to identify (or verify) their model experimentally. To characterize random processes, the notion of a "shaping dynamic system" is commonly used. This concept allows the observed process to be considered as the output of a hypothetical system whose input is stationary Gaussian ("white") noise. Therefore, the process may be exhaustively described in terms of the parameters of that system. In the case of random oscillations, the "shaping system" is an elastic system described by the second-order ordinary differential equation $\ddot{X}(t) + 2h\dot{X}(t) + \omega_0^2 X(t) = E(t)$, where $\omega_0 = 2\pi/T_0$ is the natural frequency, $T_0$ is the oscillation period, and $h$ is a damping factor. As a result, the process X(t) can be characterized in terms of the system parameters: the natural frequency and the logarithmic decrement of oscillations $\delta = hT_0$, as well as the process variance. These parameters are evaluated by processing experimental data based on frequency-domain or time-domain representations of the oscillations. It must be noted that the approach to evaluating these parameters did not change much during the last century. For instance, when the spectral density is used, the decrement is evaluated from bandwidth measurements at the half-power points of the observed oscillations. For a time-domain representation, evaluating the decrement requires measuring covariance values delayed by a time interval that is a multiple of $T_0$. Both estimation procedures are derived from a continuous description of the phenomena under study, so the accuracy of the estimates is linked directly to the adequacy of the discrete representation of random oscillations. This approach is similar to the concept of transforming differential equations into difference equations by approximating derivatives with the corresponding finite differences. The resulting discrete model, being an approximation, carries a methodological error that can be decreased but never eliminated.
To make such a representation more accurate, it is necessary to decrease the discretization interval and to increase the realization size, which raises the requirements for computing power. The spectral density and covariance function estimates constitute a non-parametric (non-formal) approach. In principle, any non-formal approach is a kind of art, i.e., the results depend on the performer’s skills. Because subjective factors interfere with spectral or covariance estimates of random signals, the accuracy of the results cannot be properly determined or justified. To avoid these difficulties, it is more advantageous to apply linear time-series models, which have well-developed procedures for parameter estimation. This book develops and presents a method for analyzing random oscillations using a parametric model that corresponds discretely (with no approximation error) to a linear elastic system. As a result, a one-to-one transformation of the model’s numerical factors to the logarithmic decrement and natural frequency of random oscillations is established. This made it possible to develop a formal procedure for processing experimental data to obtain estimates of $\delta$ and $\omega_0$. The proposed approach allows researchers to replace traditional subjective techniques with a formal processing procedure that provides efficient estimates with analytically defined statistical uncertainties.
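The blurb above does not reproduce the book's transformation, so the following is only a minimal sketch of how a one-to-one mapping between second-order autoregressive (Yule) coefficients and oscillator parameters can look. It assumes the standard pole mapping z = exp(s·Δt) for an underdamped system and an AR(2) model written as x[t] = a1·x[t-1] + a2·x[t-2] + e[t]. The function names, the sign convention for a1 and a2, and the sampling interval `dt` are illustrative assumptions, not taken from the book.

```python
import math


def ar2_from_oscillator(omega0, h, dt):
    """Map natural frequency omega0 and damping factor h to AR(2) coefficients.

    Assumes an underdamped system (h < omega0) sampled every dt seconds and
    the hypothetical convention x[t] = a1*x[t-1] + a2*x[t-2] + e[t].
    """
    omega_d = math.sqrt(omega0 ** 2 - h ** 2)  # damped angular frequency
    r = math.exp(-h * dt)                      # modulus of the discrete poles
    a1 = 2.0 * r * math.cos(omega_d * dt)      # sum of the two poles
    a2 = -r * r                                # minus the product of the poles
    return a1, a2


def oscillator_from_ar2(a1, a2, dt):
    """Invert the mapping: recover omega0 and the decrement delta = h * T0."""
    if not -1.0 < a2 < 0.0:
        raise ValueError("a2 must lie in (-1, 0) for a damped oscillation")
    r = math.sqrt(-a2)
    c = a1 / (2.0 * r)
    if abs(c) >= 1.0:
        raise ValueError("coefficients do not describe an oscillatory process")
    h = -math.log(r) / dt
    omega_d = math.acos(c) / dt                # assumes omega_d * dt < pi
    omega0 = math.hypot(omega_d, h)
    t0 = 2.0 * math.pi / omega0                # oscillation period T0
    return omega0, h * t0


if __name__ == "__main__":
    # Round trip for a 5 Hz oscillation with h = 1.5 sampled at 100 Hz.
    a1, a2 = ar2_from_oscillator(2.0 * math.pi * 5.0, 1.5, 0.01)
    print(oscillator_from_ar2(a1, a2, 0.01))
```

The inverse is unique only while the damped frequency stays below the Nyquist limit (ω_d·Δt < π), which is one reason the choice of discretization interval matters in this kind of parametric approach.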

Getting Started with Tableau 2019.2 - Second Edition

2019-06-14 O'Reilly Amazon

book

Tristan Guillevin

data data-science data-science-tasks data-visualization Tableau Analytics

"Getting Started with Tableau 2019.2" is your primer to mastering the latest version of Tableau, a leading tool for data visualization and analysis. Whether you're new to Tableau or looking to upgrade your skills, this book will guide you through both foundational and advanced features, enabling you to create impactful dashboards and visual analytics. What this Book will help me do Understand and utilize the latest features introduced in Tableau 2019.2, including natural language queries in Ask Data. Learn how to connect to diverse data sources, transform data by pivoting fields, and split columns effectively. Gain skills to design intuitive data visualizations and dashboards using various Tableau mark types and properties. Develop interactive and storytelling-based dashboards to communicate insights visually and effectively. Discover methods to securely share your analyses through Tableau Server, enhancing collaboration. Author(s) Tristan Guillevin is an experienced data visualization consultant and an expert in Tableau. Having helped several organizations adopt Tableau for business intelligence, he brings a practical and results-oriented approach to teaching. Tristan's philosophy is to make data accessible and actionable for everyone, no matter their technical background. Who is it for? This book is ideal for Tableau users and data professionals looking to enhance their skills on Tableau 2019.2. If you're passionate about uncovering insights from data but need the right tools to communicate and collaborate effectively, this book is for you. It's suited for those with some prior experience in Tableau but also offers introductory content for newcomers. Whether you're a business analyst, data enthusiast, or BI professional, this guide will build solid foundations and sharpen your Tableau expertise.

GARCH Models, 2nd Edition

2019-06-10 O'Reilly Amazon

book

Christian Francq , Jean-Michel Zakoian

data data-science data-science-tasks statistics

Provides a comprehensive and updated study of GARCH models and their applications in finance, covering new developments in the discipline This book provides a comprehensive and systematic approach to understanding GARCH time series models and their applications whilst presenting the most advanced results concerning the theory and practical aspects of GARCH. The probability structure of standard GARCH models is studied in detail as well as statistical inference such as identification, estimation, and tests. The book also provides new coverage of several extensions such as multivariate models, looks at financial applications, and explores the very validation of the models used. GARCH Models: Structure, Statistical Inference and Financial Applications, 2nd Edition features a new chapter on Parameter-Driven Volatility Models, which covers Stochastic Volatility Models and Markov Switching Volatility Models. A second new chapter titled Alternative Models for the Conditional Variance contains a section on Stochastic Recurrence Equations and additional material on EGARCH, Log-GARCH, GAS, MIDAS, and intraday volatility models, among others. The book is also updated with a more complete discussion of multivariate GARCH; a new section on Cholesky GARCH; a larger emphasis on the inference of multivariate GARCH models; a new set of corrected problems available online; and an up-to-date list of references. 
Features up-to-date coverage of the current research in the probability, statistics, and econometric theory of GARCH models Covers significant developments in the field, especially in multivariate models Contains completely renewed chapters with new topics and results Handles both theoretical and applied aspects Applies to researchers in different fields (time series, econometrics, finance) Includes numerous illustrations and applications to real financial series Presents a large collection of exercises with corrections Supplemented by a supporting website featuring R codes, Fortran programs, data sets and Problems with corrections GARCH Models, 2nd Edition is an authoritative, state-of-the-art reference that is ideal for graduate students, researchers, and practitioners in business and finance seeking to broaden their skills of understanding of econometric time series models.

Principles of Strategic Data Science

2019-06-03 O'Reilly Amazon

book

Peter Prevos

data data-science Data Science Python

"Principles of Strategic Data Science" is your go-to guide for creating measurable value from data through strategic use of tools and techniques. This book takes you through key theoretical foundations, practical tools, and the managerial perspective necessary to succeed in data science. What this Book will help me do Master the five-phase framework for strategic data science. Learn ways to effectively visualize data information. Explore the role and contributions of a data science manager. Gain clear insights into organizational benefits of data science. Understand the ethical and mathematical boundaries of data analysis. Author(s) Peter Prevos is an accomplished engineer and social scientist with extensive expertise in data science applications. He combines technical insights with social science management practices to design effective data strategies. Known for his clear teaching style, Peter helps professionals integrate theory with practical planning. Who is it for? This book is ideal for data scientists and analysts seeking to deepen their strategic understanding of data science. It's well-suited for intermediate professionals looking to gain insights into data-driven decision making. Readers should have basic programming knowledge in Python or R. Novice managers eager to harness data for organizational goals will also find it valuable.

Applied Supervised Learning with R

2019-05-31 O'Reilly Amazon

book

Jojo Moolayil , Karthik Ramasubramanian

data data-science data-science-tools r AI/ML Analytics

Applied Supervised Learning with R equips you with the essential knowledge and practical skills to leverage machine learning techniques for solving business problems using R. With this book, you'll gain hands-on experience in implementing various supervised learning models, assessing their performance, and selecting the best-suited method for your objectives. What this Book will help me do Gain expertise in identifying and framing business problems suitable for supervised learning. Acquire skills in data wrangling and visualization using R packages like dplyr and ggplot2. Master techniques for tuning hyperparameters to optimize machine learning models. Understand methods for feature selection and dimensionality reduction to enhance model performance. Learn how to deploy machine learning models to production environments, such as AWS Lambda. Author(s) Karthik Ramasubramanian and Jojo Moolayil are both seasoned data science practitioners and educators who bring a wealth of experience in machine learning and analytics. With a deep understanding of R and its applications in real-world scenarios, they offer practical insights and actionable examples to their readers. Their teaching style focuses on clarity and practical application. Who is it for? This book is ideal for data analysts, data scientists, and data engineers at a beginner to intermediate level who aim to master supervised machine learning with R. Readers should have basic knowledge of statistics, probabilities, and R programming. It is designed for those eager to apply machine learning techniques to real-world problems and improve their decision-making capabilities.

Hands-On Exploratory Data Analysis with R

2019-05-31 O'Reilly Amazon

book

Harish Garg , Radhika Datar

data data-science data-science-tools r Data Collection

Immerse yourself in 'Hands-On Exploratory Data Analysis with R,' a comprehensive guide designed to hone your skills in data analysis using the powerful R programming language. This book walks you through all essential aspects of exploratory data analysis, from data collection and cleaning to generating insights with statistical and graphical methods, setting you up for success with any dataset. What this Book will help me do Utilize powerful R packages to accelerate your data analysis workflow. Effectively import, clean, and prepare diverse datasets for analysis. Create informative and visually appealing data visualizations using ggplot2. Generate comprehensive and sharable reports with R Markdown and knitr. Handle multi-factor, optimization, and regression data challenges. Author(s) Radhika Datar and Harish Garg are experienced data analysts and educators specializing in using R for practical data analysis. They have developed this book to share their depth of expertise, offering a detailed yet approachable learning experience. Their combined experience in teaching and applying data analysis in real-world scenarios makes this book an invaluable resource for practitioners. Who is it for? This book is perfect for data enthusiasts looking to strengthen their foundational knowledge in exploratory data analysis. Data analysts, engineers, software developers, and product managers seeking to broaden their skillset in data interpretation and visualization will find this guide extremely beneficial. Whether you're a beginner or already possess basic understanding of data analysis, this book will provide actionable insights to improve your workflow.

Hands-On Time Series Analysis with R

2019-05-31 O'Reilly Amazon

book

Rami Krispin

data data-science data-science-tasks statistics time-series AI/ML

Dive into the intricacies of time series analysis and forecasting with R in this comprehensive guide. From foundational concepts to practical implementations, this book equips you with the tools and techniques to analyze, understand, and predict time-dependent data. What this Book will help me do Develop insights by visualizing time-series data and identifying patterns. Master statistical time-series concepts including autocorrelation and moving averages. Learn and implement forecasting models like ARIMA and exponential smoothing. Apply machine learning methodologies for advanced time-series predictions. Work with key R packages for cleaning, manipulating, and analyzing time-series data. Author(s) Rami Krispin is an accomplished statistician and R programmer with extensive experience in data analysis and time-series modeling. His hands-on approach in utilizing R packages and libraries brings clarity to complex time-series concepts. With a passion for teaching and simplifying intricate topics, Rami ensures readers both grasp the theories and apply them effectively. Who is it for? This book is ideal for data analysts, statisticians, and R developers interested in mastering time-series analysis for real-world applications. Designed for readers with a basic understanding of statistics and R programming, it offers a practical approach to learning effective forecasting and data visualization techniques. Professionals aiming to expand their skillset in predictive analytics will find it particularly beneficial.

Implementing CDISC Using SAS, 2nd Edition

2019-05-30 O'Reilly Amazon

book

Chris Holland , Jack Shostak

data data-science analytics-platforms SAS Data Modelling XML

For decades researchers and programmers have used SAS to analyze, summarize, and report clinical trial data. Now Chris Holland and Jack Shostak have updated their popular Implementing CDISC Using SAS, the first comprehensive book on applying clinical research data and metadata to the Clinical Data Interchange Standards Consortium (CDISC) standards. Implementing CDISC Using SAS: An End-to-End Guide, Revised Second Edition, is an all-inclusive guide on how to implement and analyze the Study Data Tabulation Model (SDTM) and the Analysis Data Model (ADaM) data and prepare clinical trial data for regulatory submission. Updated to reflect the 2017 FDA mandate for adherence to CDISC standards, this new edition covers creating and using metadata, developing conversion specifications, implementing and validating SDTM and ADaM data, determining solutions for legacy data conversions, and preparing data for regulatory submission. The book covers products such as Base SAS, SAS Clinical Data Integration, and the SAS Clinical Standards Toolkit, as well as JMP Clinical. Topics covered in this edition include an implementation of the Define-XML 2.0 standard, new SDTM domains, validation with Pinnacle 21 software, event narratives in JMP Clinical, SDTM and ADaM metadata spreadsheets, and of course new versions of SAS and JMP software. The second edition was revised to add the latest C-Codes from the most recent release as well as update the make_define macro that accompanies this book in order to add the capability to handle C-Codes. The metadata spreadsheets were updated accordingly. Any manager or user of clinical trial data in this day and age is likely to benefit from knowing how to either put data into a CDISC standard or analyze and find data once it is in a CDISC format. If you are one such person--a data manager, clinical and/or statistical programmer, biostatistician, or even a clinician--then this book is for you.

Practical Applications of Bayesian Reliability

2019-05-28 O'Reilly Amazon

book

Yan Liu , Athula I. Abeyratne

data data-science data-science-tasks statistics bayesian-statistics

Demonstrates how to solve reliability problems using practical applications of Bayesian models. This self-contained reference provides fundamental knowledge of Bayesian reliability and utilizes numerous examples to show how Bayesian models can solve real life reliability problems. It teaches engineers and scientists exactly what Bayesian analysis is, what its benefits are, and how they can apply the methods to solve their own problems. To help readers get started quickly, the book presents many Bayesian models that use JAGS and require fewer than 10 lines of commands. It also offers a number of short R scripts consisting of simple functions to help them become familiar with R coding. Practical Applications of Bayesian Reliability starts by introducing basic concepts of reliability engineering, including random variables, discrete and continuous probability distributions, hazard function, and censored data. Basic concepts of Bayesian statistics, models, reasons, and theory are presented in the following chapter. Coverage of Bayesian computation, Metropolis-Hastings algorithm, and Gibbs Sampling comes next. The book then goes on to teach the concepts of design capability and design for reliability; introduce Bayesian models for estimating system reliability; discuss Bayesian Hierarchical Models and their applications; present linear and logistic regression models from a Bayesian perspective; and more.
Provides a step-by-step approach for developing advanced reliability models to solve complex problems, and does not require in-depth understanding of statistical methodology. Educates managers on the potential of Bayesian reliability models and associated impact. Introduces commonly used predictive reliability models and advanced Bayesian models based on real life applications. Includes practical guidelines to construct Bayesian reliability models along with computer codes for all of the case studies. JAGS and R codes are provided on an accompanying website to enable practitioners to easily copy them and tailor them to their own applications. Practical Applications of Bayesian Reliability is a helpful book for industry practitioners such as reliability engineers, mechanical engineers, electrical engineers, product engineers, system engineers, and materials scientists whose work includes predicting design or product performance.

Data Analysis and Applications 1

2019-05-21 O'Reilly Amazon

book

Christos H. Skiadas , James R. Bozeman

data data-science data-science-tasks exploratory-data-analysis

This series of books collects a diverse array of work that provides the reader with theoretical and applied information on data analysis methods, models, and techniques, along with appropriate applications. Volume 1 begins with an introductory chapter by Gilbert Saporta, a leading expert in the field, who summarizes the developments in data analysis over the last 50 years. The book is then divided into three parts: Part 1 presents clustering and regression cases; Part 2 examines grouping and decomposition, GARCH and threshold models, structural equations, and SME modeling; and Part 3 presents symbolic data analysis, time series and multiple choice models, modeling in demography, and data mining.

Data Analysis and Applications 2

2019-05-21 O'Reilly Amazon

book

Christos H. Skiadas , James R. Bozeman

data data-science data-science-tasks statistics

This series of books collects a diverse array of work that provides the reader with theoretical and applied information on data analysis methods, models and techniques, along with appropriate applications. Volume 2 begins with an introductory chapter by Gilbert Saporta, a leading expert in the field, who summarizes the developments in data analysis over the last 50 years. The book is then divided into four parts: Part 1 examines (in)dependence relationships, innovation in the Nordic countries, dentistry journals, dependence among growth rates of GDP of V4 countries, emissions mitigation, and five-star ratings; Part 2 investigates access to credit for SMEs, gender-based impacts given Southern Europe’s economic crisis, and labor market transition probabilities; Part 3 looks at recruitment at university job-placement offices and the Program for International Student Assessment; and Part 4 examines discriminants, PageRank, and the political spectrum of Germany.

Statistics for Biomedical Engineers and Scientists

2019-05-18 O'Reilly Amazon

book

Andrew P. King , Robert Eckersley

data data-science data-science-tasks statistics MATLAB

Statistics for Biomedical Engineers and Scientists: How to Analyze and Visualize Data provides an intuitive understanding of the concepts of basic statistics, with a focus on solving biomedical problems. Readers will learn how to understand the fundamental concepts of descriptive and inferential statistics, analyze data and choose an appropriate hypothesis test to answer a given question, compute numerical statistical measures and perform hypothesis tests ‘by hand’, and visualize data and perform statistical analysis using MATLAB. Practical activities and exercises are provided, making this an ideal resource for students in biomedical engineering and the biomedical sciences who are in a course on basic statistics. Presents a practical guide on how to visualize and analyze statistical data. Provides numerous practical examples and exercises to illustrate the power of statistics in biomedical engineering applications. Gives an intuitive understanding of statistical tests. Covers practical skills by showing how to perform operations ‘by hand’ and by using MATLAB as a computational tool. Includes an online resource with downloadable materials for students and teachers.

Graph Algorithms

2019-05-16 O'Reilly Amazon

book

Mark Needham , Amy E. Hodler

data data-science AI/ML Analytics Neo4j Spark

Learn how graph algorithms can help you leverage relationships within your data to develop intelligent solutions and enhance your machine learning models. With this practical guide, developers and data scientists will discover how graph analytics deliver value, whether they’re used for building dynamic network models or forecasting real-world behavior. Mark Needham and Amy Hodler from Neo4j explain how graph algorithms describe complex structures and reveal difficult-to-find patterns, from finding vulnerabilities and bottlenecks to detecting communities and improving machine learning predictions. You’ll walk through hands-on examples that show you how to use graph algorithms in Apache Spark and Neo4j, two of the most common choices for graph analytics. Learn how graph analytics reveal more predictive elements in today’s data. Understand how popular graph algorithms work and how they’re applied. Use sample code and tips from more than 20 graph algorithm examples. Learn which algorithms to use for different types of questions. Explore examples with working code and sample datasets for Spark and Neo4j. Create an ML workflow for link prediction by combining Neo4j and Spark.

Statistics Essentials For Dummies

2019-05-14 O'Reilly Amazon

book

Deborah J. Rumsey

data data-science data-science-tasks statistics

Statistics Essentials For Dummies (9781119590309) was previously published as Statistics Essentials For Dummies (9780470618394). While this version features a new Dummies cover and design, the content is the same as the prior release and should not be considered a new or updated product. Statistics Essentials For Dummies not only provides students enrolled in Statistics I with an excellent high-level overview of key concepts, but it also serves as a reference or refresher for students in upper-level statistics courses. Free of review and ramp-up material, Statistics Essentials For Dummies sticks to the point, with content focused on key course topics only. It provides discrete explanations of essential concepts taught in a typical first semester college-level statistics course, from odds and error margins to confidence intervals and conclusions. This guide is also a perfect reference for parents who need to review critical statistics concepts as they help high school students with homework assignments, as well as for adult learners headed back into the classroom who just need a refresher of the core concepts. The Essentials For Dummies Series: Dummies is proud to present our new series, The Essentials For Dummies. Now students who are prepping for exams, preparing to study new material, or who just need a refresher can have a concise, easy-to-understand review guide that covers an entire course by concentrating solely on the most important concepts. From algebra and chemistry to grammar and Spanish, our expert authors focus on the skills students most need to succeed in a subject.

Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and The Cloud

2019-05-10 O'Reilly Amazon

book

Paul J. Deitel , Harvey M. Deitel

software-development programming-languages Python AI/ML Big Data Cloud Computing

This is the eBook of the printed book and may not include any media, website access codes, or print supplements that may come packaged with the bound book. For introductory-level Python programming and/or data-science courses. A groundbreaking, flexible approach to computer science and data science: The Deitels’ Introduction to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud offers a unique approach to teaching introductory Python programming, appropriate for both computer-science and data-science audiences. Providing the most current coverage of topics and applications, the book is paired with extensive traditional supplements as well as Jupyter Notebooks supplements. Real-world datasets and artificial-intelligence technologies allow students to work on projects making a difference in business, industry, government and academia. Hundreds of examples, exercises, projects (EEPs), and implementation case studies give students an engaging, challenging and entertaining introduction to Python programming and hands-on data science. Related content includes the video Python Fundamentals and the live courses Python Full Throttle with Paul Deitel: A One-Day, Fast-Paced, Code-Intensive Python Presentation and Python® Data Science Full Throttle with Paul Deitel: Introductory Artificial Intelligence (AI), Big Data and Cloud Case Studies. The book’s modular architecture enables instructors to conveniently adapt the text to a wide range of computer-science and data-science courses offered to audiences drawn from many majors. Computer-science instructors can integrate as many or as few data-science and artificial-intelligence topics as they’d like, and data-science instructors can integrate as much or as little Python as they’d like. The book aligns with the latest ACM/IEEE CS-and-related computing curriculum initiatives and with the Data Science Undergraduate Curriculum Proposal sponsored by the National Science Foundation.
