talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 · O'Reilly

Activities tracked

794

Collection of O'Reilly books on Data Science.

Filtering by: data-science-tasks

Sessions & talks

Showing 676–700 of 794 · Newest first

Smoothing Splines

With many real-world examples, this book shows how to apply the powerful methods of smoothing splines in practice. It covers basic smoothing spline models as well as more advanced models, such as spline smoothing with correlated random errors. It also presents methods for model selection and inference. The author makes the advanced smoothing spline methodology based on reproducing kernel Hilbert space (RKHS) accessible to practitioners and students by keeping theory to a minimum. R is used throughout to implement the methods, with code available for download on the book's web page.

Data Mining: Concepts and Techniques, 3rd Edition

Data Mining: Concepts and Techniques presents the concepts and techniques for processing gathered data or information for use in various applications. Specifically, it explains data mining and the tools used to discover knowledge from collected data, a process also referred to as knowledge discovery from data (KDD). It focuses on the feasibility, usefulness, effectiveness, and scalability of techniques for large data sets. After describing data mining, this edition explains the methods of knowing, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Then, the methods involved in mining frequent patterns, associations, and correlations for large data sets are described. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss outlier detection and the trends, applications, and research frontiers in data mining. This book is intended for Computer Science students, application developers, business professionals, and researchers who seek information on data mining.
• Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects
• Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields
• Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data

Using Microsoft® Visio® 2010

Using Microsoft Visio 2010 is a customized, media-rich learning experience designed to help new users master Microsoft Visio 2010 quickly, and get the most out of it, fast! It starts with a concise, friendly, straight-to-the-point guide to Microsoft Visio 2010. This exceptional book is fully integrated with an unprecedented collection of online learning resources: online video, screencasts, podcasts, and additional web content, all designed to reinforce key concepts and help users achieve real mastery. The book and online content work together to teach everything mainstream Microsoft Visio 2010 users need to know. This practical, approachable coverage guides readers through all facets of working with Visio 2010, and adds coverage of new features and capabilities including Visio's brand-new Ribbon interface, Live Preview, Auto size, align and adjust, Containers, status bar navigation tools, and data graphics legends. The 2010 edition also includes coverage of Visio Services, process management tools, new diagram types, and much more.
• Practical, approachable coverage that completely flattens the Microsoft Visio 2010 learning curve
• Tightly integrated with online video, screencast tutorials, podcasts, and more: the total learning experience for new Microsoft Visio 2010 users
• Companion website offers supplemental media including video, screencast tutorials, podcasts, and much more

Numeric Data Services and Sources for the General Reference Librarian

The proliferation of online access to social science statistical and numeric data sources, such as the U.S. Census Bureau’s American Fact Finder, has led to an increased interest in supporting these sources in academic libraries. Many large libraries have been able to devote staff to data services for years, and recently smaller academic libraries have recognized the need to provide numeric data services and support. This guidebook serves as a primer to developing and supporting social science statistical and numerical data sources in the academic library. It provides strategies for the establishment of data services and offers short descriptions of the essential sources of free and commercial social science statistical and numeric data. Finally, it discusses the future of numeric data services, including the integration of statistics and data into library instruction and the use of Web 2.0 tools to visualize data.
• Written for a general reference audience with little knowledge of data services and sources who would like to incorporate support into their general reference practice
• Combines information on establishing data services with an introduction to available statistical and numeric data sources
• Provides insight into the integration of statistics and data into library instruction and the social science research process

SAP BusinessObjects Dashboards 4.0 Cookbook

Discover how to create compelling, interactive dashboards efficiently using SAP BusinessObjects Dashboards 4.0 (formerly Xcelsius). In this comprehensive cookbook, learn step-by-step recipes that guide you through practical tasks leveraging the tool's advanced features like dynamic visibility and live data connections.
What this Book will help me do:
• Master the use of Dashboard Design's spreadsheet and data-driven components to efficiently create professional dashboards.
• Learn various data visualization techniques to present data clearly and effectively within your dashboards.
• Implement dynamic interactivity and control logic using features like Dynamic Visibility to enhance dashboard usability.
• Understand how to connect dashboards to live data sources for up-to-date and real-time business insights.
• Explore and utilize additional Dashboard Design features and add-ons to customize and extend the capabilities of your dashboards.
Author(s): The author(s) of this book are seasoned professionals in SAP BusinessObjects Dashboards and enterprise dashboard design. With years of experience in teaching and applying these tools in practical settings, they aim to demystify complex dashboarding concepts and empower professionals to build impactful visualizations.
Who is it for? This book is targeted at developers and business professionals who wish to sharpen their skills in dashboard creation using SAP BusinessObjects Dashboards 4.0. Readers should have a basic understanding of dashboards and some experience with Excel to benefit from this content. It's ideal for those new to SAP Dashboard Design seeking a comprehensive and practical guide.

Statistics in Education and Psychology

Statistics in Education and Psychology aims to develop a coherent, logical and comprehensive outlook towards statistics. The subject involves a wide range of observations, measurements, tools, techniques and data analysis. This book covers diverse topics like measures of central tendency, measures of variability, the correlation method, normal probability curve (NPC), significance of difference of means, analysis of variance, non-parametric chi-square, standard score and T-score.

Statistics For Dummies®, 2nd Edition

The fun and easy way to get down to business with statistics. Stymied by statistics? No fear: this friendly guide offers clear, practical explanations of statistical ideas, techniques, formulas, and calculations, with lots of examples that show you how these concepts apply to your everyday life. Statistics For Dummies shows you how to interpret and critique graphs and charts, determine the odds with probability, guesstimate with confidence using confidence intervals, set up and carry out a hypothesis test, compute statistical formulas, and more.
• Tracks to a typical first-semester statistics course
• Updated examples resonate with today's students
• Explanations mirror teaching methods and classroom protocol
Packed with practical advice and real-world problems, Statistics For Dummies gives you everything you need to analyze and interpret data for improved classroom or on-the-job performance.

Statistical Analysis: Microsoft® Excel 2010, Video Enhanced Edition

Statistical Analysis: Microsoft Excel 2010. “Excel has become the standard platform for quantitative analysis. Carlberg has become a world-class guide for Excel users wanting to do quantitative analysis. The combination makes Statistical Analysis: Microsoft Excel 2010 a must-have addition to the library of those who want to get the job done and done right.” (Gene V Glass, Regents’ Professor Emeritus, Arizona State University) Use Excel 2010’s statistical tools to transform your data into knowledge. Use Excel 2010’s powerful statistical tools to gain a deeper understanding of your data. Top Excel guru Conrad Carlberg shows how to use Excel 2010 to perform the core statistical tasks every business professional, student, and researcher should master. Using real-world examples, Carlberg helps you choose the right technique for each problem and get the most out of Excel’s statistical features, including its new consistency functions. Along the way, you discover the most effective ways to use correlation and regression and analysis of variance and covariance. You see how to use Excel to test statistical hypotheses using the normal, binomial, t, and F distributions. Becoming an expert with Excel statistics has never been easier! You’ll find crystal-clear instructions, insider insights, and complete step-by-step projects, all complemented by an extensive set of web-based resources.
• Master Excel’s most useful descriptive and inferential statistical tools
• Tell the truth with statistics, and recognize when others don’t
• Accurately summarize sets of values
• View how values cluster and disperse
• Infer a population’s characteristics from a sample’s frequency distribution
• Explore correlation and regression to learn how variables move in tandem
• Understand Excel’s new consistency functions
• Test differences between two means using z tests, t tests, and Excel’s
• Use ANOVA and ANCOVA to test differences between more than two means
• Explore statistical power by manipulating mean differences, standard errors, directionality, and alpha

Data Analysis: What Can Be Learned From the Past 50 Years

This book explores the many provocative questions concerning the fundamentals of data analysis. It is based on the time-tested experience of one of the gurus of the subject matter. Why should one study data analysis? How should it be taught? What techniques work best, and for whom? How valid are the results? How much data should be tested? Which machine languages should be used, if used at all? Emphasis on apprenticeship (through hands-on case studies) and anecdotes (through real-life applications) are the tools that Peter J. Huber uses in this volume. Concern with specific statistical techniques is not of immediate value; rather, questions of strategy – when to use which technique – are employed. Central to the discussion is an understanding of the significance of massive (or robust) data sets, the implementation of languages, and the use of models. Each is sprinkled with an ample number of examples and case studies. Personal practices, various pitfalls, and existing controversies are presented when applicable. The book serves as an excellent philosophical and historical companion to any present-day text in data analysis, robust statistics, data mining, statistical learning, or computational statistics.

Microsoft® Visio® 2010: Step by Step

Experience learning made easy, and quickly teach yourself how to create professional-looking business and technical diagrams with Visio 2010. With Step by Step, you set the pace, building and practicing the skills you need, just when you need them!
• Build a variety of charts and diagrams with Visio templates
• Draw organization charts, floor plans, flowcharts, and more
• Apply color, text, and themes to your Visio diagrams
• Use Visio shapes to link to, store, and visualize data
• Collaborate on diagrams with Microsoft SharePoint 2010
• Create custom diagrams with your own shapes and templates
Your Step by Step digital content includes:
• All the book's practice files, ready to download and put to work.
• Fully searchable online edition of this book, with unlimited access on the Web. Free online account required.

Statistical Methods for Fuzzy Data

Statistical data are not always precise numbers, or vectors, or categories. Real data are frequently what is called fuzzy. Examples where this fuzziness is obvious are quality of life data, environmental, biological, medical, sociological and economics data. The results of measurements can also be best described by using fuzzy numbers and fuzzy vectors, respectively. Statistical analysis methods have to be adapted for the analysis of fuzzy data. In this book, the foundations of the description of fuzzy data are explained, including methods on how to obtain the characterizing function of fuzzy measurement results. Furthermore, statistical methods are then generalized to the analysis of fuzzy data and fuzzy a-priori information.
Key Features:
• Provides basic methods for the mathematical description of fuzzy data, as well as statistical methods that can be used to analyze fuzzy data.
• Describes methods of increasing importance with applications in areas such as environmental statistics and social science.
• Complements the theory with exercises and solutions and is illustrated throughout with diagrams and examples.
• Explores areas such as the quantitative description of data uncertainty and the mathematical description of fuzzy data.
This work is aimed at statisticians working with fuzzy logic, engineering statisticians, finance researchers, and environmental statisticians. It is written for readers who are familiar with elementary stochastic models and basic statistical methods.

Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, Third Edition

The leading introductory book on data mining, fully updated and revised! When Berry and Linoff wrote the first edition of Data Mining Techniques in the late 1990s, data mining was just starting to move out of the lab and into the office; it has since grown to become an indispensable tool of modern business. This new edition, more than 50% new and revised, is a significant update from the previous one, and shows you how to harness the newest data mining methods and techniques to solve common business problems. The duo of unparalleled authors share invaluable advice for improving response rates to direct marketing campaigns, identifying new customer segments, and estimating credit risk. In addition, they cover more advanced topics such as preparing data for analysis and creating the necessary infrastructure for data mining at your company.
• Features significant updates since the previous edition and updates you on best practices for using data mining methods and techniques for solving common business problems
• Covers a new data mining technique in every chapter along with clear, concise explanations on how to apply each technique immediately
• Touches on core data mining techniques, including decision trees, neural networks, collaborative filtering, association rules, link analysis, survival analysis, and more
• Provides best practices for performing data mining using simple tools such as Excel
Data Mining Techniques, Third Edition covers a new data mining technique with each successive chapter and then demonstrates how you can apply that technique for improved marketing, sales, and customer support to get immediate results.

Cluster Analysis, 5th Edition

Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demonstrate the application of the theory, and figures are used extensively to illustrate graphical techniques. The book is comprehensive yet relatively non-mathematical, focusing on the practical aspects of cluster analysis.
Key Features:
• Presents a comprehensive guide to clustering techniques, with focus on the practical aspects of cluster analysis.
• Provides a thorough revision of the fourth edition, including new developments in clustering longitudinal data and examples from bioinformatics and gene studies.
• Updates the chapter on mixture models to include recent developments and presents a new chapter on mixture modeling for structured data.
Practitioners and researchers working in cluster analysis and data analysis will benefit from this book.

Mining the Social Web

Popular social networks such as Facebook and Twitter generate a tremendous amount of valuable data on topics and use patterns. Who's talking to whom? What are they talking about? How often are they talking? This concise and practical book shows you how to answer these questions and more by harvesting and analyzing data using social web APIs, Python, and pragmatic storage technologies such as Redis, CouchDB, and NetworkX. With Mining the Social Web, intermediate to advanced programmers will learn how to harvest and analyze social data in a way that lends itself to hacking as well as more industrial-strength analysis. Algorithms are designed with robustness and efficiency in mind so that the approaches scale well on an ordinary piece of commodity hardware. The book is highly readable from cover to cover as content progressively grows in complexity, but also lends itself to being read in an ad-hoc fashion.
• Use easily adaptable scripts to access popular social network APIs including Twitter, OpenSocial, and Facebook
• Learn approaches for slicing and dicing social data that's been harvested from social web APIs as well as other common formats such as email and markup formats
• Harvest data from other sources such as Freebase and other sites to enrich your analytic capabilities with additional context
• Visualize and analyze data in interactive ways with tools built upon rich UI JavaScript toolkits
• Get a concise and straightforward synopsis of some practical technologies from the semantic web landscape that you can incorporate into your analysis
This book is still in progress, but you can get going on this technology through our Rough Cuts edition, which lets you read the manuscript as it's being written, either online or via PDF.

21 Recipes for Mining Twitter

Millions of public Twitter streams harbor a wealth of data, and once you mine them, you can gain some valuable insights. This short and concise book offers a collection of recipes to help you extract nuggets of Twitter information using easy-to-learn Python tools. Each recipe offers a discussion of how and why the solution works, so you can quickly adapt it to fit your particular needs. The recipes include techniques to:
• Use OAuth to access Twitter data
• Create and analyze graphs of retweet relationships
• Use the streaming API to harvest tweets in real time
• Harvest and analyze friends and followers
• Discover friendship cliques
• Summarize webpages from short URLs
This book is a perfect companion to O’Reilly's Mining the Social Web.

Practical Applications of Data Mining

Practical Applications of Data Mining emphasizes both theory and applications of data mining algorithms. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. Each of these techniques is explored with a theoretical introduction and its effectiveness is demonstrated with various chapter examples. This book will help any database and IT professional understand how to apply data mining techniques to real-world problems.

Following an introduction to data mining principles, Practical Applications of Data Mining introduces association rules to describe the generation of rules as the first step in data mining. It covers classification and clustering methods to show how data can be classified to retrieve information from data. Statistical functions and rough set theory are discussed to demonstrate how statistical and rough set formulas can be used for data analytics and knowledge discovery. Neural networks, an important branch of computational intelligence, are introduced and explored in the text to investigate the role of neural network algorithms in data analytics.

Entity Resolution and Information Quality

Entity Resolution and Information Quality presents topics and definitions, and clarifies confusing terminologies regarding entity resolution and information quality. It takes a very wide view of IQ, including its six-domain framework and the skills formed by the International Association for Information and Data Quality (IAIDQ). The book includes chapters that cover the principles of entity resolution and the principles of Information Quality, in addition to their concepts and terminology. It also discusses the Fellegi-Sunter theory of record linkage, the Stanford Entity Resolution Framework, and the Algebraic Model for Entity Resolution, which are the major theoretical models that support Entity Resolution. In relation to this, the book briefly discusses entity-based data integration (EBDI) and its model, which serve as an extension of the Algebraic Model for Entity Resolution. There is also an explanation of how three commercial ER systems operate and a description of the non-commercial open-source system known as OYSTER. The book concludes by discussing trends in entity resolution research and practice. Students taking IT courses and IT professionals will find this book invaluable.
• First authoritative reference explaining entity resolution and how to use it effectively
• Provides practical system design advice to help you get a competitive advantage
• Includes a companion site with synthetic customer data for applicatory exercises, and access to a Java-based Entity Resolution program

Doing Bayesian Data Analysis

There is an explosion of interest in Bayesian statistics, primarily because recently created computational methods have finally made Bayesian analysis tractable and accessible to a wide audience. Doing Bayesian Data Analysis, A Tutorial Introduction with R and BUGS, is for first-year graduate students or advanced undergraduates and provides an accessible approach, as all mathematics is explained intuitively and with concrete examples. It assumes only algebra and ‘rusty’ calculus. Unlike other textbooks, this book begins with the basics, including essential concepts of probability and random sampling. The book gradually climbs all the way to advanced hierarchical modeling methods for realistic data. The text provides complete examples with the R programming language and BUGS software (both freeware), and begins with basic programming examples, working up gradually to complete programs for complex analyses and presentation graphics. These templates can be easily adapted for a large variety of students and their own research needs. The textbook bridges the students from their undergraduate training into modern Bayesian methods.
• Accessible, including the basics of essential concepts of probability and random sampling
• Examples with R programming language and BUGS software
• Comprehensive coverage of all scenarios addressed by non-Bayesian textbooks: t-tests, analysis of variance (ANOVA) and comparisons in ANOVA, multiple regression, and chi-square (contingency table analysis)
• Coverage of experiment planning
• R and BUGS computer programming code on website
• Exercises have explicit purposes and guidelines for accomplishment

Performance Dashboards: Measuring, Monitoring, and Managing Your Business, 2nd Edition

Tips, techniques, and trends on harnessing dashboard technology to optimize business performance. In Performance Dashboards, Second Edition, author Wayne Eckerson explains what dashboards are, where they can be used, and why they are important to measuring and managing performance. As Director of Research for The Data Warehousing Institute, a worldwide association of business intelligence professionals, Eckerson interviewed dozens of organizations that have built various types of performance dashboards in different industries and lines of business. Their practical insights explore how you can effectively turbo-charge performance management initiatives with dashboard technology.
• Includes all-new case studies, industry research, new chapters on "Architecting Performance Dashboards" and "Launching and Managing the Project" and updated information on designing KPIs, designing dashboard displays, integrating dashboards, and types of dashboards
• Provides a solid foundation for understanding performance dashboards, business intelligence, and performance management
• Addresses the next generation of performance dashboards, such as Mashboards and Visual Discovery tools, including new techniques for designing dashboards and developing key performance indicators
• Offers guidance on how to incorporate predictive analytics, what-if modeling, collaboration, and advanced visualization techniques
This updated book, which is 75% rewritten, provides a foundation for understanding performance dashboards, business intelligence, and performance management to optimize performance and accelerate results.

Statistical Programming with SAS/IML Software

SAS/IML software is a powerful tool for data analysts because it enables implementation of statistical algorithms that are not available in any SAS procedure. Rick Wicklin's Statistical Programming with SAS/IML Software is the first book to provide a comprehensive description of the software and how to use it. He presents tips and techniques that enable you to use the IML procedure and the SAS/IML Studio application efficiently. In addition to providing a comprehensive introduction to the software, the book also shows how to create and modify statistical graphs, call SAS procedures and R functions from a SAS/IML program, and implement such modern statistical techniques as simulations and bootstrap methods in the SAS/IML language. Written for data analysts working in all industries, graduate students, and consultants, Statistical Programming with SAS/IML Software includes numerous code snippets and more than 100 graphs.

This book is part of the SAS Press program.

Excel® Dashboards & Reports

The go-to resource for how to use Excel dashboards and reports to better conceptualize data. Many Excel books do an adequate job of discussing the individual functions and tools that can be used to create an "Excel Report." What they don't offer is the most effective ways to present and report data. Offering a comprehensive review of a wide array of technical and analytical concepts, Excel Dashboards and Reports helps Excel users go from reporting data with simple tables full of dull numbers, to presenting key information through the use of high-impact, meaningful reports and dashboards that will wow management both visually and substantively.
• Details how to analyze large amounts of data and report the results in a meaningful, eye-catching visualization
• Describes how to use different perspectives to achieve better visibility into data, as well as how to slice data into various views on the fly
• Shows how to automate redundant reporting and analyses
Part technical manual, part analytical guidebook, Excel Dashboards and Reports is the latest addition to the Mr. Spreadsheet's Bookshelf series and is the leading resource for learning to create dashboard reports in an easy-to-use format that's both visually attractive and effective.

Analysis of Financial Time Series, Third Edition

This book provides a broad, mature, and systematic introduction to current financial econometric models and their applications to modeling and prediction of financial time series data. It utilizes real-world examples and real financial data throughout the book to apply the models and methods described. The author begins with basic characteristics of financial time series data before covering three main topics:
• Analysis and application of univariate financial time series
• The return series of multiple assets
• Bayesian inference in finance methods
Key features of the new edition include additional coverage of modern-day topics such as arbitrage, pair trading, realized volatility, and credit risk modeling; a smooth transition from S-Plus to R; and expanded empirical financial data sets. The overall objective of the book is to provide some knowledge of financial time series, introduce some statistical tools useful for analyzing these series and gain experience in financial applications of various econometric methods.

GARCH Models

This book provides a comprehensive and systematic approach to understanding GARCH time series models and their applications whilst presenting the most advanced results concerning the theory and practical aspects of GARCH. The probability structure of standard GARCH models is studied in detail as well as statistical inference such as identification, estimation and tests. The book also provides coverage of several extensions such as asymmetric and multivariate models and looks at financial applications.
Key features:
• Provides up-to-date coverage of the current research in the probability, statistics and econometric theory of GARCH models.
• Numerous illustrations and applications to real financial series are provided.
• Supporting website featuring R codes, Fortran programs and data sets.
• Presents a large collection of problems and exercises.
This authoritative, state-of-the-art reference is ideal for graduate students, researchers and practitioners in business and finance seeking to broaden their understanding of econometric time series models.