talk-data.com

Topic: data-science-tasks · 849 tagged

Activity Trend

[Chart: 1 peak/quarter, 2020-Q1 to 2026-Q1]

Activities

849 activities · Newest first

Time Series Databases: New Ways to Store and Access Data

Time series data is of growing importance, especially with the rapid expansion of the Internet of Things. This concise guide shows you effective ways to collect, persist, and access large-scale time series data for analysis. You’ll explore the theory behind time series databases and learn practical methods for implementing them. Authors Ted Dunning and Ellen Friedman provide a detailed examination of open source tools such as OpenTSDB and new modifications that greatly speed up data ingestion.

Create Web Charts with D3

Create Web Charts with D3 shows how to convert your data into eye-catching, innovative, animated, and highly interactive browser-based charts. This book is suitable for developers of all experience levels and needs: if you want power and control and need to create data visualizations beyond traditional charts, then D3 is the JavaScript library for you. By the end of the book, you will have a good knowledge of all the elements needed to manage data from every possible source, from high-end scientific instruments to Arduino boards, from PHP SQL database queries to simple HTML tables, and from MATLAB calculations to reports in Excel. This book contains content previously published in Beginning JavaScript Charts, and shows how to create all kinds of charts using the latest technologies available in browsers. Full of step-by-step examples, Create Web Charts with D3 introduces you gradually to all aspects of chart development, from the data source to the choice of which solution to apply. This book provides a number of tools that can be the starting point for any project requiring graphical representations of data, whether using commercial libraries or your own.

Visualization Analysis and Design

This book provides a systematic, comprehensive framework for thinking about visualization in terms of principles and design choices. It features a unified approach encompassing information visualization techniques for abstract data, scientific visualization techniques for spatial data, and visual analytics techniques for interweaving data transformation and analysis with interactive visual exploration. Suitable for both beginners and more experienced designers, the book does not assume any experience with programming, mathematics, human-computer interaction, or graphic design.

Statistical Graphics Procedures by Example

Sanjay Matange and Dan Heath's Statistical Graphics Procedures by Example: Effective Graphs Using SAS shows the innumerable capabilities of SAS Statistical Graphics (SG) procedures. The authors begin with a general discussion of the principles of effective graphics, ODS Graphics, and the SG procedures. They then move on to show examples of the procedures' many features. The book is designed so that you can easily flip through it, find the graph you need, and view the code right next to the example. Among the topics included are how to combine plot statements to create custom graphs; customizing graph axes, legends, and insets; advanced features, such as annotation and attribute maps; tips and tricks for creating the optimal graph for the intended usage; real-world examples from the health and life sciences domain; and ODS styles. The procedures in Statistical Graphics Procedures by Example are specifically designed for the creation of analytical graphs. That makes this book a must-read for analysts and statisticians in the health care, clinical trials, financial, and insurance industries. However, you will find that the examples here apply to all fields. This book is part of the SAS Press program.

Statistics: An Introduction Using R, 2nd Edition

"...I know of no better book of its kind..." (Journal of the Royal Statistical Society, Vol 169 (1), January 2006) A revised and updated edition of this bestselling introductory textbook to statistical analysis using the leading free software package R This new edition of a bestselling title offers a concise introduction to a broad array of statistical methods, at a level that is elementary enough to appeal to a wide range of disciplines. Step-by-step instructions help the non-statistician to fully understand the methodology. The book covers the full range of statistical techniques likely to be needed to analyse the data from research projects, including elementary material like t--tests and chi--squared tests, intermediate methods like regression and analysis of variance, and more advanced techniques like generalized linear modelling. Includes numerous worked examples and exercises within each chapter.

Text Mining and Analysis

Big data: It's unstructured, it's coming at you fast, and there's lots of it. In fact, the majority of big data is text-oriented, thanks to the proliferation of online sources such as blogs, emails, and social media.

However, having big data means little if you can't leverage it with analytics. Now you can explore the large volumes of unstructured text data that your organization has collected with Text Mining and Analysis: Practical Methods, Examples, and Case Studies Using SAS.

This hands-on guide to text analytics using SAS provides detailed, step-by-step instructions and explanations on how to mine your text data for valuable insight. Through its comprehensive approach, you'll learn not just how to analyze your data, but how to collect, cleanse, organize, categorize, explore, and interpret it as well. Text Mining and Analysis also features an extensive set of case studies, so you can see examples of how the applications work with real-world data from a variety of industries.

Text analytics enables you to gain insights about your customers' behaviors and sentiments. Leverage your organization's text data, and use those insights for making better business decisions with Text Mining and Analysis.

This book is part of the SAS Press program.
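
The book's workflows are built on SAS text analytics tools; purely as an illustrative sketch of the same first step (turning unstructured text into a term-weight matrix that can then be explored and categorized), here is a short Python example using scikit-learn. The documents are invented.

```python
# Sketch of a text mining first step: convert raw text into a TF-IDF term matrix.
# The book uses SAS; this scikit-learn example is illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Customer praised the fast delivery and friendly support.",
    "Delivery was late and the support team never replied.",
    "Great product, will order again from this store.",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(docs)          # sparse document-term matrix

# Show the highest-weighted term in each document.
terms = vectorizer.get_feature_names_out()
for i, row in enumerate(tfidf.toarray()):
    print(f"doc {i}: top term = {terms[row.argmax()]!r}")
```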

Correspondence Analysis: Theory, Practice and New Strategies

A comprehensive overview of the internationalisation of correspondence analysis. Correspondence Analysis: Theory, Practice and New Strategies examines the key issues of correspondence analysis, and discusses the new advances that have been made over the last 20 years. The main focus of this book is to provide a comprehensive discussion of some of the key technical and practical aspects of correspondence analysis, and to demonstrate how they may be put to use. Particular attention is given to the history and mathematical links of the developments made. These links include not just those major contributions made by researchers in Europe (which is where much of the attention surrounding correspondence analysis has focused) but also the important contributions made by researchers in other parts of the world. Key features include: a comprehensive international perspective on the key developments of correspondence analysis; discussion of correspondence analysis for nominal and ordinal categorical data; discussion of correspondence analysis of contingency tables with varying association structures (symmetric and non-symmetric relationships between two or more categorical variables); and extensive treatment of many of the members of the correspondence analysis family for two-way, three-way and multiple contingency tables. Correspondence Analysis offers a comprehensive and detailed overview of this topic which will be of value to academics, postgraduate students and researchers wanting a better understanding of correspondence analysis. Readers interested in the historical development, internationalisation and diverse applicability of correspondence analysis will also find much to enjoy in this book.
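
As a rough numerical sketch of what simple correspondence analysis computes (not code from the book), row and column coordinates can be obtained from a singular value decomposition of the standardized residuals of a two-way contingency table:

```python
# Simple correspondence analysis of a two-way contingency table via SVD.
# Illustrative sketch only; the counts below are invented.
import numpy as np

N = np.array([[20, 10,  5],
              [ 5, 15, 10],
              [10,  5, 20]], dtype=float)

P = N / N.sum()                    # correspondence matrix
r = P.sum(axis=1)                  # row masses
c = P.sum(axis=0)                  # column masses

# Standardized residuals and their SVD.
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

# Principal coordinates of rows and columns (first two dimensions).
row_coords = (U * sv) / np.sqrt(r)[:, None]
col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
print("row coordinates:\n", row_coords[:, :2])
print("column coordinates:\n", col_coords[:, :2])
print("total inertia:", (sv ** 2).sum())
```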

Introduction to Mixed Modelling: Beyond Regression and Analysis of Variance, 2nd Edition

Mixed modelling is very useful, and easier than you think! Mixed modelling is now well established as a powerful approach to statistical data analysis. It is based on the recognition of random-effect terms in statistical models, leading to inferences and estimates that have much wider applicability and are more realistic than those otherwise obtained. Introduction to Mixed Modelling leads the reader into mixed modelling as a natural extension of two more familiar methods, regression analysis and analysis of variance. It provides practical guidance combined with a clear explanation of the underlying concepts. Like the first edition, this new edition shows diverse applications of mixed models, provides guidance on the identification of random-effect terms, and explains how to obtain and interpret best linear unbiased predictors (BLUPs). It also introduces several important new topics, including the following: use of the software SAS, in addition to GenStat and R; meta-analysis and the multiple testing problem; and the Bayesian interpretation of mixed models. Including numerous practical exercises with solutions, this book provides an ideal introduction to mixed modelling for final year undergraduate students, postgraduate students and professional researchers. It will appeal to readers from a wide range of scientific disciplines including statistics, biology, bioinformatics, medicine, agriculture, engineering, economics, archaeology and geography. Praise for the first edition: "One of the main strengths of the text is the bridge it provides between traditional analysis of variance and regression models and the more recently developed class of mixed models...Each chapter is well-motivated by at least one carefully chosen example...demonstrating the broad applicability of mixed models in many different disciplines...most readers will likely learn something new, and those previously unfamiliar with mixed models will obtain a solid foundation on this topic."— Kerrie Nelson, University of South Carolina, in American Statistician, 2007
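
The book works in GenStat, R, and SAS; as a loose Python analogue of fitting a model with a random-effect term and extracting BLUP-style random effects, one could use statsmodels' MixedLM. The dataset and column names below are invented for the sketch.

```python
# Sketch of a mixed model with a random intercept per group using statsmodels.
# The book uses GenStat, R, and SAS; this Python analogue is illustrative only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_groups, n_per = 8, 12
group = np.repeat(np.arange(n_groups), n_per)
x = rng.normal(size=n_groups * n_per)
group_effect = rng.normal(scale=2.0, size=n_groups)[group]   # random intercepts
y = 1.0 + 0.5 * x + group_effect + rng.normal(size=len(x))

df = pd.DataFrame({"y": y, "x": x, "group": group})

model = smf.mixedlm("y ~ x", data=df, groups=df["group"])
result = model.fit()
print(result.summary())
print("Predicted random effects (BLUP-like):")
print(result.random_effects)
```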

Fixed Effects Regression Methods for Longitudinal Data Using SAS

Fixed Effects Regression Methods for Longitudinal Data Using SAS, written by Paul Allison, is an invaluable resource for all researchers interested in adding fixed effects regression methods to their tool kit of statistical techniques. First introduced by economists, fixed effects methods are gaining widespread use throughout the social sciences. Designed to eliminate major biases from regression models with multiple observations (usually longitudinal) for each subject (usually a person), fixed effects methods essentially offer control for all stable characteristics of the subjects, even characteristics that are difficult or impossible to measure. This straightforward and thorough text shows you how to estimate fixed effects models with several SAS procedures that are appropriate for different kinds of outcome variables. The theoretical background of each model is explained, and the models are then illustrated with detailed examples using real data. The book contains thorough discussions of the following uses of SAS procedures: PROC GLM for estimating fixed effects linear models for quantitative outcomes, PROC LOGISTIC for estimating fixed effects logistic regression models, PROC PHREG for estimating fixed effects Cox regression models for repeated event data, PROC GENMOD for estimating fixed effects Poisson regression models for count data, and PROC CALIS for estimating fixed effects structural equation models. To gain the most benefit from this book, readers should be familiar with multiple linear regression, have practical experience using multiple regression on real data, and be comfortable interpreting the output from a regression analysis. An understanding of logistic regression and Poisson regression is a plus. Some experience with SAS is helpful, but not required. This book is part of the SAS Press program.
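
The book demonstrates SAS procedures; as a small illustration of the core idea behind a fixed effects linear model (demeaning each subject's repeated observations so that stable subject characteristics drop out), here is a numpy sketch with simulated data. None of this is code from the book.

```python
# Within (demeaning) transformation for a fixed effects linear model.
# The book uses SAS procedures; this numpy sketch of the underlying idea is
# illustrative only, with simulated data.
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_waves = 200, 4
subject = np.repeat(np.arange(n_subjects), n_waves)

alpha = rng.normal(scale=3.0, size=n_subjects)[subject]   # stable subject traits
x = 0.5 * alpha + rng.normal(size=len(subject))           # x correlated with traits
y = 2.0 * x + alpha + rng.normal(size=len(subject))       # true slope = 2.0

def demean_by(values, ids):
    """Subtract each subject's mean from that subject's observations."""
    means = np.bincount(ids, weights=values) / np.bincount(ids)
    return values - means[ids]

x_w, y_w = demean_by(x, subject), demean_by(y, subject)
beta_fe = (x_w @ y_w) / (x_w @ x_w)          # within (fixed effects) estimator

xc, yc = x - x.mean(), y - y.mean()
beta_ols = (xc @ yc) / (xc @ xc)             # pooled OLS, biased by the traits
print(f"fixed effects estimate: {beta_fe:.2f}  (pooled OLS: {beta_ols:.2f})")
```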

Doing Bayesian Data Analysis, 2nd Edition

Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan, Second Edition provides an accessible approach for conducting Bayesian data analysis, as material is explained clearly with concrete examples. Included are step-by-step instructions on how to carry out Bayesian data analyses in the popular and free software R and WinBUGS, as well as new programs in JAGS and Stan. The new programs are designed to be much easier to use than the scripts in the first edition. In particular, there are now compact high-level scripts that make it easy to run the programs on your own data sets. The book is divided into three parts and begins with the basics: models, probability, Bayes’ rule, and the R programming language. The discussion then moves to the fundamentals applied to inferring a binomial probability, before concluding with chapters on the generalized linear model. Topics include metric-predicted variable on one or two groups; metric-predicted variable with one metric predictor; metric-predicted variable with multiple metric predictors; metric-predicted variable with one nominal predictor; and metric-predicted variable with multiple nominal predictors. The exercises found in the text have explicit purposes and guidelines for accomplishment. This book is intended for first-year graduate students or advanced undergraduates in statistics, data analysis, psychology, cognitive science, social sciences, clinical sciences, and consumer sciences in business. Features include: an accessible treatment of the essential concepts of probability and random sampling; examples using the R programming language and JAGS software; comprehensive coverage of all scenarios addressed by non-Bayesian textbooks (t-tests, analysis of variance (ANOVA) and comparisons in ANOVA, multiple regression, and chi-square contingency table analysis); coverage of experiment planning; R and JAGS computer programming code on the website; exercises with explicit purposes and guidelines for accomplishment; and step-by-step instructions on how to conduct Bayesian data analyses in the popular and free software R and WinBUGS.
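
As a tiny, hedged illustration of the book's first applied problem (inferring a binomial probability), here is the conjugate Beta-Binomial update written directly in Python rather than in R, JAGS, or Stan; the numbers are made up.

```python
# Inferring a binomial probability with a conjugate Beta prior.
# Illustrative sketch only; the book does this with R, JAGS, and Stan.
from scipy import stats

heads, flips = 14, 20           # made-up data
a_prior, b_prior = 1.0, 1.0     # uniform Beta(1, 1) prior

posterior = stats.beta(a_prior + heads, b_prior + flips - heads)
lo, hi = posterior.interval(0.95)
print(f"posterior mean of theta: {posterior.mean():.3f}")
print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
```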

Introduction to Imaging from Scattered Fields

Obtain the best estimate of a strongly scattering object from limited scattered field data. Introduction to Imaging from Scattered Fields presents an overview of the challenging problem of determining information about an object from measurements of the field scattered from that object. It covers widely used approaches to recover information about the objects and examines the assumptions made a priori about the object and the consequences of recovering object information from limited numbers of noisy measurements of the scattered fields. The book explores the strengths and weaknesses of using inverse methods for weak scattering. These methods, including Fourier-based signal and image processing techniques, allow more straightforward inverse algorithms to be exploited based on a simple mapping of scattered field data. The authors also discuss their recent approach based on a nonlinear filtering step in the inverse algorithm. They illustrate how to use this algorithm through numerous two-dimensional electromagnetic scattering examples. MATLAB® code is provided to help readers quickly apply the approach to a wide variety of inverse scattering problems. In later chapters of the book, the authors focus on important and often forgotten overarching constraints associated with exploiting inverse scattering algorithms. They explain how the number of degrees of freedom associated with any given scattering experiment can be found and how this allows one to specify a minimum number of data that should be measured. They also describe how the prior discrete Fourier transform (PDFT) algorithm helps in estimating the properties of an object from scattered field measurements. The PDFT restores stability and improves estimates of the object even with severely limited data (provided it is sufficient to meet a criterion based on the number of degrees of freedom). Suitable for graduate students and researchers working on medical, geophysical, defense, and industrial inspection inverse problems, this self-contained book provides the necessary details for readers to design improved experiments and process measured data more effectively. It shows how to obtain the best estimate of a strongly scattering object from limited scattered field data.

Experimental Design

This concise and innovative book gives a complete presentation of the design and analysis of experiments in approximately one half the space of competing books. With only the modest prerequisite of a basic (non-calculus) statistics course, this text is appropriate for the widest possible audience. Two procedures are generally used to analyze experimental design data—analysis of variance (ANOVA) and regression analysis. Because ANOVA is more intuitive, this book devotes most of its first three chapters to showing how to use ANOVA to analyze balanced (equal sample size) experimental design data. The text first discusses regression analysis at the end of Chapter 2, where regression is used to analyze data that cannot be analyzed by ANOVA: unbalanced (unequal sample size) data from two-way factorials and data from incomplete block designs. Regression is then used again in Chapter 4 to analyze data resulting from two-level fractional factorial and block confounding experiments.
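
As a rough Python illustration of the ANOVA step the book's early chapters rely on (the blurb does not tie the book to any one software package), a balanced one-way design can be analyzed like this; the response values are invented.

```python
# One-way ANOVA on a balanced design (equal sample sizes per treatment).
# Illustrative sketch with made-up data.
from scipy import stats

treatment_1 = [23.1, 24.5, 22.8, 23.9, 24.2]
treatment_2 = [26.0, 25.4, 26.8, 25.9, 26.3]
treatment_3 = [22.0, 21.5, 22.9, 21.8, 22.4]

f_stat, p_value = stats.f_oneway(treatment_1, treatment_2, treatment_3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
```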

Simulation Technologies in Networking and Communications

Simulation is a widely used mechanism for validating the theoretical models of networking and communication systems. Although the claims made based on simulations are considered to be reliable, how reliable they really are is best determined with real-world implementation trials. Simulation Technologies in Networking and Communications: Selecting the Best Tool for the Test considers superefficient Monte Carlo simulations, describes how to simulate and evaluate multicast routing algorithms, covers simulation tools for cloud computing and broadband passive optical networks, reports on recent developments in simulation tools for WSNs, and examines modeling and simulation of vehicular networks. The book compiles expert perspectives about the simulation of various networking and communications technologies. These experts review and evaluate popular simulation modeling tools and recommend the best tools for your specific tests. They also explain how to determine when theoretical modeling would be preferred over simulation.

Probability and Stochastic Processes

A comprehensive and accessible presentation of probability and stochastic processes with emphasis on key theoretical concepts and real-world applications With a sophisticated approach, Probability and Stochastic Processes successfully balances theory and applications in a pedagogical and accessible format. The book's primary focus is on key theoretical notions in probability to provide a foundation for understanding concepts and examples related to stochastic processes. Organized into two main sections, the book begins by developing probability theory with topical coverage on probability measure; random variables; integration theory; product spaces, conditional distribution, and conditional expectations; and limit theorems. The second part explores stochastic processes and related concepts including the Poisson process, renewal processes, Markov chains, semi-Markov processes, martingales, and Brownian motion. Featuring a logical combination of traditional and complex theories as well as practices, Probability and Stochastic Processes also includes: multiple examples from disciplines such as business, mathematical finance, and engineering; chapter-by-chapter exercises and examples to allow readers to test their comprehension of the presented material; and a rigorous treatment of all probability and stochastic processes concepts. An appropriate textbook for probability and stochastic processes courses at the upper-undergraduate and graduate level in mathematics, business, and electrical engineering, Probability and Stochastic Processes is also an ideal reference for researchers and practitioners in the fields of mathematics, engineering, and finance.
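
As a small illustrative sketch (not from the book) of one of the stochastic processes covered, the following Python code simulates a discrete-time Markov chain and compares the long-run fraction of time spent in each state with the stationary distribution computed from the transition matrix.

```python
# Simulate a 3-state discrete-time Markov chain and compare the long-run
# fraction of time in each state with the stationary distribution.
# Illustrative sketch only; transition probabilities are invented.
import numpy as np

P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

rng = np.random.default_rng(42)
n_steps = 200_000
state = 0
counts = np.zeros(3)
for _ in range(n_steps):
    state = rng.choice(3, p=P[state])
    counts[state] += 1

# Stationary distribution: left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi = pi / pi.sum()

print("empirical :", counts / n_steps)
print("stationary:", pi)
```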

An Introduction to Probability and Statistical Inference, 2nd Edition

An Introduction to Probability and Statistical Inference, Second Edition, guides you through probability models and statistical methods and helps you to think critically about various concepts. Written by award-winning author George Roussas, this book introduces readers with no prior knowledge in probability or statistics to a thinking process to help them obtain the best solution to a posed question or situation. It provides a plethora of examples for each topic discussed, giving the reader more experience in applying statistical methods to different situations. This text contains an enhanced number of exercises and graphical illustrations where appropriate to motivate the reader and demonstrate the applicability of probability and statistical inference in a great variety of human activities. Reorganized material is included in the statistical portion of the book to ensure continuity and enhance understanding. Each section includes relevant proofs where appropriate, followed by exercises with useful clues to their solutions. Furthermore, there are brief answers to even-numbered exercises at the back of the book and detailed solutions to all exercises are available to instructors in an Answers Manual. This text will appeal to advanced undergraduate and graduate students, as well as researchers and practitioners in engineering, business, social sciences or agriculture. Key features: content, examples, an enhanced number of exercises, and graphical illustrations where appropriate to motivate the reader and demonstrate the applicability of probability and statistical inference in a great variety of human activities; reorganized material in the statistical portion of the book to ensure continuity and enhance understanding; a relatively rigorous, yet accessible and always within the prescribed prerequisites, mathematical discussion of probability theory and statistical inference important to students in a broad variety of disciplines; relevant proofs where appropriate in each section, followed by exercises with useful clues to their solutions; and brief answers to even-numbered exercises at the back of the book, with detailed solutions to all exercises available to instructors in an Answers Manual.

The Synoptic Problem and Statistics

This book lays the foundations for a new area of interdisciplinary research that uses statistical techniques to investigate the synoptic problem in New Testament studies, which concerns the relationships between the Gospels of Matthew, Mark, and Luke. There are potential applications of the techniques to study other sets of similar documents. The book presents core statistical material on the use of hidden Markov models to analyze binary time series. The binary time series data sets and R code used are available on the author's website.
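
As a minimal, hedged illustration of the core technique the blurb describes (a hidden Markov model for a binary time series), here is a self-contained forward-algorithm likelihood computation in Python. The book's own code and data are in R and are available on the author's website; the parameters and observations below are invented.

```python
# Forward algorithm for a 2-state hidden Markov model with Bernoulli emissions,
# applied to a binary sequence. Illustrative sketch only; the book's analyses
# use R and the author's own data.
import numpy as np

# Model parameters (invented for the example).
init = np.array([0.6, 0.4])              # initial state probabilities
trans = np.array([[0.9, 0.1],            # state transition matrix
                  [0.2, 0.8]])
emit_p1 = np.array([0.2, 0.7])           # P(observation = 1 | state)

obs = np.array([0, 1, 1, 0, 1, 1, 1, 0, 0, 1])   # binary time series

def log_likelihood(obs, init, trans, emit_p1):
    """Scaled forward recursion; returns log P(obs | model)."""
    emit = np.where(obs[:, None] == 1, emit_p1, 1.0 - emit_p1)  # T x 2
    alpha = init * emit[0]
    loglik = np.log(alpha.sum())
    alpha /= alpha.sum()
    for t in range(1, len(obs)):
        alpha = (alpha @ trans) * emit[t]
        loglik += np.log(alpha.sum())
        alpha /= alpha.sum()
    return loglik

print("log-likelihood:", log_likelihood(obs, init, trans, emit_p1))
```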

Mathematical Statistics for Applied Econometrics

An introductory econometrics text, Mathematical Statistics for Applied Econometrics covers the basics of statistical inference in support of a subsequent course on classical econometrics. The book shows students how mathematical statistics concepts form the basis of econometric formulations. It also helps them think about statistics as more than a toolbox of techniques. Using computer systems to simplify computation, the text explores the unifying themes involved in quantifying sample information to make inferences. After developing the necessary probability theory, it presents the concepts of estimation, such as convergence, point estimators, confidence intervals, and hypothesis tests. The text then shifts from a general development of mathematical statistics to focus on applications particularly popular in economics. It delves into matrix analysis, linear models, and nonlinear econometric techniques. Avoiding a cookbook approach to econometrics, this textbook develops students’ theoretical understanding of statistical tools and econometric applications so that they understand the reasons for the results. It provides them with the foundation for further econometric studies.
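
As a small, hedged illustration of the estimation concepts the text develops (a point estimator, its sampling variability, and a confidence interval), here is a quick Python sketch with simulated data; it is not drawn from the book.

```python
# Point estimate and t-based confidence interval for a mean, plus a simulation
# check of coverage. Illustrative sketch only; data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
true_mean = 10.0

def ci_mean(sample, level=0.95):
    """t-based confidence interval for the population mean."""
    n = len(sample)
    m, se = sample.mean(), sample.std(ddof=1) / np.sqrt(n)
    t_crit = stats.t.ppf(0.5 + level / 2, df=n - 1)
    return m - t_crit * se, m + t_crit * se

# Coverage check: how often does the 95% interval contain the true mean?
hits = 0
n_reps = 2_000
for _ in range(n_reps):
    sample = rng.normal(loc=true_mean, scale=3.0, size=25)
    lo, hi = ci_mean(sample)
    hits += (lo <= true_mean <= hi)

print(f"empirical coverage of nominal 95% interval: {hits / n_reps:.3f}")
```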

Think Stats, 2nd Edition

If you know how to program, you have the skills to turn data into knowledge, using tools of probability and statistics. This concise introduction shows you how to perform statistical analysis computationally, rather than mathematically, with programs written in Python. By working with a single case study throughout this thoroughly revised book, you’ll learn the entire process of exploratory data analysis—from collecting data and generating statistics to identifying patterns and testing hypotheses. You’ll explore distributions, rules of probability, visualization, and many other tools and concepts. New chapters on regression, time series analysis, survival analysis, and analytic methods will enrich your discoveries. You will learn to: develop an understanding of probability and statistics by writing and testing code; run experiments to test statistical behavior, such as generating samples from several distributions; use simulations to understand concepts that are hard to grasp mathematically; import data from most sources with Python, rather than rely on data that’s cleaned and formatted for statistics tools; and use statistical inference to answer questions about real-world data.
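
In the spirit of the book's computational approach, here is a short Python sketch (not taken from the book) that generates samples from two distributions and uses a permutation test, rather than a closed-form formula, to judge whether the observed difference in means is surprising.

```python
# Computational statistics in the spirit of Think Stats: simulate data and use a
# permutation test instead of a closed-form formula. Not code from the book.
import numpy as np

rng = np.random.default_rng(3)
group_a = rng.normal(loc=5.0, scale=1.0, size=100)
group_b = rng.normal(loc=5.3, scale=1.0, size=100)

observed = group_b.mean() - group_a.mean()
pooled = np.concatenate([group_a, group_b])

# Permutation test: shuffle group labels and recompute the difference in means.
n_perm = 10_000
count = 0
for _ in range(n_perm):
    rng.shuffle(pooled)
    diff = pooled[:100].mean() - pooled[100:].mean()
    count += abs(diff) >= abs(observed)

print(f"observed difference: {observed:.3f}")
print(f"two-sided permutation p-value: {count / n_perm:.4f}")
```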

Nonparametric Statistical Methods Using R

This book covers traditional nonparametric methods and rank-based analyses, including estimation and inference for models ranging from simple location models to general linear and nonlinear models for uncorrelated and correlated responses. The authors emphasize applications and statistical computation. They illustrate the methods with many real and simulated data examples using R, including the packages Rfit and npsm, which are available on CRAN. Each chapter includes exercises, making the book suitable for an undergraduate or graduate course.
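
The book's examples use R and the Rfit and npsm packages; as a loose Python analogue of one classic rank-based procedure it covers, here is a Wilcoxon-Mann-Whitney test via SciPy, with invented data.

```python
# Rank-based two-sample comparison (Wilcoxon-Mann-Whitney test).
# The book works in R with Rfit and npsm; this SciPy call is an analogue only.
from scipy import stats

control = [12.1, 13.4, 11.8, 12.9, 14.0, 12.5, 13.1]
treated = [14.2, 15.1, 13.8, 14.9, 15.6, 14.4, 13.9]

u_stat, p_value = stats.mannwhitneyu(control, treated, alternative="two-sided")
print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```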

Presenting Data: How to Communicate Your Message Effectively

A clear, easy-to-read guide to presenting your message using statistical data. Poor presentation of data is everywhere; basic principles are forgotten or ignored. As a result, audiences are presented with confusing tables and charts that do not make immediate sense. This book is intended to be read by all who present data in any form. The author, a chartered statistician who has run many courses on the subject of data presentation, presents numerous examples alongside an explanation of how improvements can be made and basic principles to adopt. He advocates following four key 'C' words in all messages: Clear, Concise, Correct and Consistent. Following the principles in the book will lead to clearer, simpler and easier-to-understand messages which can then be assimilated faster. Anyone from student to researcher, journalist to policy adviser, charity worker to government statistician, will benefit from reading this book. More importantly, it will also benefit the recipients of the presented data. 'Ed Swires-Hennessy, a recognised expert in the presentation of statistics, explains and clearly describes a set of "principles" of clear and objective statistical communication. This book should be required reading for all those who present statistics.' Richard Laux, UK Statistics Authority 'I think this is a fantastic book and hope everyone who presents data or statistics makes time to read it first.' David Marder, Chief Media Adviser, Office for National Statistics, UK 'Ed's book makes his tried-and-tested material widely available to anyone concerned with understanding and presenting data. It is full of interesting insights, is highly practical and packed with sensible suggestions and nice ideas that you immediately want to try out.' Dr Shirley Coleman, Principal Statistician, Industrial Statistics Research Unit, School of Mathematics and Statistics, Newcastle University, UK