talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 O'Reilly

Activities tracked

2118

Collection of O'Reilly books on Data Science.

Sessions & talks

Showing 1851–1875 of 2118 · Newest first

CBAP®/CCBA™ Certified Business Analysis, Study Guide

A must-have resource for anyone preparing for version 2.0 of the CBAP exam. As organizations look to streamline their production models, the need for qualified and certified business analysts is growing. The Certified Business Analysis Professional (CBAP) certification is the only certification for this growing field and this study guide is an essential step towards preparation for the CBAP exam. With this resource, you'll benefit from coverage of both the CBAP and the CCBA (Certification of Competency in Business Analysis) exams. Each chapter covers the business analysis standards and best practices and includes a list of exam topics covered, followed by in-depth discussions of those objectives. Real-world, hands-on scenarios help take the learning process a step further. Covers Version 2 of the Business Analysis Body of Knowledge (BABOK). Offers invaluable preparation for both the CBAP and CCBA exams. Includes a list of exam topics and presents detailed discussions of each objective. Features real-world scenarios, best practices, key terms, and a wide range of helpful topics that will prepare you for taking the exams. Shares practice exam questions, topic summaries, and exam tips and tricks, all aimed at providing a solid foundation for achieving exam success. This valuable study guide provides you with the preparation you need to confidently take the CBAP and CCBA exams.

SharePoint® 2010 Business Intelligence 24-Hour Trainer

This unique book-and-video package introduces SharePoint® 2010 Business Intelligence. Learn to build and deliver SharePoint BI applications. Written by a team of leading SharePoint and Business Intelligence (BI) experts, this unique book-and-video package shows you how to successfully build and deliver BI applications using SharePoint 2010. Assuming no previous SharePoint experience, the authors deliver a clear explanation of what SharePoint will do for your BI and information management capabilities. Each lesson in the book is reinforced with a helpful tutorial on the video and covers topics such as interactive reporting with Excel, document sharing for collaborative reporting, and controlling data sources. As you learn best practices for configuring and securing SharePoint 2010 BI applications and for planning and implementing your SharePoint BI project plan, you'll be well on your way to gaining a solid foundation for understanding and working with SharePoint 2010 and BI. Provides an invaluable training book-and-video package that takes you through building and delivering BI applications using SharePoint 2010. Features an accompanying video tutorial for each lesson covered in the book, along with a "Try It" section at the end of each lesson. Covers interactive reporting with Excel. Covers PowerPivot advanced analytics. Details report and document sharing for collaborative reporting. Shows how to use SharePoint lists and libraries as data source repositories in your BI projects. Explains how to control data sources, reports, and business intelligence content with permissions and workflow approvals. With this unique book-and-video combo, you'll be well on your way to successfully building and delivering BI applications using SharePoint 2010.

Statistics For Dummies®, 2nd Edition

The fun and easy way to get down to business with statistics. Stymied by statistics? No fear: this friendly guide offers clear, practical explanations of statistical ideas, techniques, formulas, and calculations, with lots of examples that show you how these concepts apply to your everyday life. Statistics For Dummies shows you how to interpret and critique graphs and charts, determine the odds with probability, guesstimate with confidence using confidence intervals, set up and carry out a hypothesis test, compute statistical formulas, and more. Tracks to a typical first-semester statistics course. Updated examples resonate with today's students. Explanations mirror teaching methods and classroom protocol. Packed with practical advice and real-world problems, Statistics For Dummies gives you everything you need to analyze and interpret data for improved classroom or on-the-job performance.

Engineering Circuit Analysis: International Student Version, Tenth Edition

Maintaining its accessible approach to circuit analysis, the tenth edition includes even more features to engage and motivate engineers. Exciting chapter openers and accompanying photos are included to enhance visual learning. The book introduces figures with color-coding to significantly improve comprehension. New problems and expanded application examples in PSPICE, MATLAB, and LabView are included. New quizzes are also added to help engineers reinforce the key concepts.

Statistical Analysis: Microsoft® Excel 2010, Video Enhanced Edition

“Excel has become the standard platform for quantitative analysis. Carlberg has become a world-class guide for Excel users wanting to do quantitative analysis. The combination makes Statistical Analysis: Microsoft Excel 2010 a must-have addition to the library of those who want to get the job done and done right.” (Gene V Glass, Regents’ Professor Emeritus, Arizona State University)

Use Excel 2010’s powerful statistical tools to gain a deeper understanding of your data and transform it into knowledge. Top Excel guru Conrad Carlberg shows how to use Excel 2010 to perform the core statistical tasks every business professional, student, and researcher should master. Using real-world examples, Carlberg helps you choose the right technique for each problem and get the most out of Excel’s statistical features, including its new consistency functions. Along the way, you discover the most effective ways to use correlation and regression and analysis of variance and covariance. You see how to use Excel to test statistical hypotheses using the normal, binomial, t and F distributions. Becoming an expert with Excel statistics has never been easier! You’ll find crystal-clear instructions, insider insights, and complete step-by-step projects, all complemented by an extensive set of web-based resources.

• Master Excel’s most useful descriptive and inferential statistical tools
• Tell the truth with statistics, and recognize when others don’t
• Accurately summarize sets of values
• View how values cluster and disperse
• Infer a population’s characteristics from a sample’s frequency distribution
• Explore correlation and regression to learn how variables move in tandem
• Understand Excel’s new consistency functions
• Test differences between two means using z tests, t tests, and Excel’s
• Use ANOVA and ANCOVA to test differences between more than two means
• Explore statistical power by manipulating mean differences, standard errors, directionality, and alpha

Data Analysis: What Can Be Learned From the Past 50 Years

This book explores the many provocative questions concerning the fundamentals of data analysis. It is based on the time-tested experience of one of the gurus of the subject matter. Why should one study data analysis? How should it be taught? What techniques work best, and for whom? How valid are the results? How much data should be tested? Which machine languages should be used, if used at all? Apprenticeship (through hands-on case studies) and anecdotes (through real-life applications) are the tools that Peter J. Huber uses in this volume. Specific statistical techniques are not the immediate concern; rather, the book addresses questions of strategy: when to use which technique. Central to the discussion is an understanding of the significance of massive (or robust) data sets, the implementation of languages, and the use of models. Each is illustrated with ample examples and case studies. Personal practices, various pitfalls, and existing controversies are presented when applicable. The book serves as an excellent philosophical and historical companion to any present-day text in data analysis, robust statistics, data mining, statistical learning, or computational statistics.

Microsoft® Visio® 2010: Step by Step

Experience learning made easy—and quickly teach yourself how to create professional-looking business and technical diagrams with Visio 2010. With Step by Step, you set the pace—building and practicing the skills you need, just when you need them! Build a variety of charts and diagrams with Visio templates Draw organization charts, floor plans, flowcharts, and more Apply color, text, and themes to your Visio diagrams Use Visio shapes to link to, store, and visualize data Collaborate on diagrams with Microsoft SharePoint 2010 Create custom diagrams with your own shapes and templates Your Step by Step digital content includes: All the book's practice files—ready to download and put to work. Fully searchable online edition of this book—with unlimited access on the Web. Free online account required.

What Is Data Science?

We've all heard it: according to Hal Varian, statistics is the next sexy job. Five years ago, in What is Web 2.0, Tim O'Reilly said that "data is the next Intel Inside." But what does that statement mean? Why do we suddenly care about statistics and about data? This report examines the many sides of data science -- the technologies, the companies and the unique skill sets. The web is full of "data-driven apps." Almost any e-commerce application is a data-driven application. There's a database behind a web front end, and middleware that talks to a number of other databases and data services (credit card processing companies, banks, and so on). But merely using data isn't really what we mean by "data science." A data application acquires its value from the data itself, and creates more data as a result. It's not just an application with data; it's a data product. Data science enables the creation of data products.

The Power of Your Past

There's nothing wrong with “living in the now”—except that it's only part of our story. If we underuse or misuse our past, we're losing a tremendous source of wisdom and self-knowledge. The problem isn't the past itself, it's that we don't use it well. John Schuster exposes the many ways we ignore, distort, or become captive to our pasts and explains how we can tap into this underutilized treasure trove. He shows how to systematically recall key images and experiences that have influenced us, for good or ill, and reclaim the positive experiences—deepen our understanding of their impact and use them to guide our going forward. The negative experiences must be recast—reinterpreted so that they no longer lessen our possibilities but rather serve to expand our understanding of who we are and what we can be. Schuster's enlightening and entertaining stories as well as simple and compelling techniques will enable you to make your past sing and play and work for you.

Statistical Methods for Fuzzy Data

Statistical data are not always precise numbers, or vectors, or categories. Real data are frequently what is called fuzzy. Examples where this fuzziness is obvious are quality of life data and environmental, biological, medical, sociological, and economics data. The results of measurements, too, can be best described by using fuzzy numbers and fuzzy vectors. Statistical analysis methods have to be adapted for the analysis of fuzzy data. In this book, the foundations of the description of fuzzy data are explained, including methods for obtaining the characterizing function of fuzzy measurement results. Statistical methods are then generalized to the analysis of fuzzy data and fuzzy a-priori information. Key Features: Provides basic methods for the mathematical description of fuzzy data, as well as statistical methods that can be used to analyze fuzzy data. Describes methods of increasing importance with applications in areas such as environmental statistics and social science. Complements the theory with exercises and solutions and is illustrated throughout with diagrams and examples. Explores areas such as quantitative description of data uncertainty and mathematical description of fuzzy data. This work is aimed at statisticians working with fuzzy logic, engineering statisticians, finance researchers, and environmental statisticians. It is written for readers who are familiar with elementary stochastic models and basic statistical methods.

Business Research Methods

Business Research Methods provides students with the knowledge, understanding, and skills necessary to complete a business research project. The reader is taken step by step through a range of contemporary research methods, while numerous worked examples and real-life case studies bring to life the realities of undertaking such research. Emphasis on data analysis is the key strength of this book. The book uses the latest software packages, MS Excel (2007), SPSS 17, and Minitab 15, to carry out statistical data analysis. The complexity of multivariate analysis is also handled with the help of these three packages.

R Cookbook

With more than 200 practical recipes, this book helps you perform data analysis with R quickly and efficiently. The R language provides everything you need to do statistical work, but its structure can be difficult to master. This collection of concise, task-oriented recipes makes you productive with R immediately, with solutions ranging from basic tasks to input and output, general statistics, graphics, and linear regression. Each recipe addresses a specific problem, with a discussion that explains the solution and offers insight into how it works. If you’re a beginner, R Cookbook will help get you started. If you’re an experienced data programmer, it will jog your memory and expand your horizons. You’ll get the job done faster and learn more about R in the process. Create vectors, handle variables, and perform other basic functions Input and output data Tackle data structures such as matrices, lists, factors, and data frames Work with probability, probability distributions, and random variables Calculate statistics and confidence intervals, and perform statistical tests Create a variety of graphic displays Build statistical models with linear regressions and analysis of variance (ANOVA) Explore advanced statistical techniques, such as finding clusters in your data "Wonderfully readable, R Cookbook serves not only as a solutions manual of sorts, but as a truly enjoyable way to explore the R language—one practical example at a time."—Jeffrey Ryan, software consultant and R package author

Is America Getting What it Pays For?

This Element is an excerpt from Overhauling America's Healthcare Machine: Stop the Bleeding and Save Trillions (9780132173254) by Douglas A. Perednia. Available in print and digital formats. We’re paying more for healthcare than anyone else in the world. Are we really getting what we’re paying for? To justify the status quo, politicians, insurers, and the media say many stupid things--as when they remind us we have the “best healthcare in the world.” The implication: We’re getting what we’re paying for, and the high price is simply the cost of being #1. But is this really true? Most of these pronouncements are consistently, suspiciously vague....

Data Mashups in R

How do you use R to import, manage, visualize, and analyze real-world data? With this short, hands-on tutorial, you learn how to collect online data, massage it into a reasonable form, and work with it using R facilities to interact with web servers, parse HTML and XML, and more. Rather than use canned sample data, you'll plot and analyze current home foreclosure auctions in Philadelphia. This practical mashup exercise shows you how to access spatial data in several formats locally and over the Web to produce a map of home foreclosures. It's an excellent way to explore how the R environment works with R packages and performs statistical analysis. Parse messy data from public foreclosure auction postings Plot the data using R's PBSmapping package Import US Census data to add context to foreclosure data Use R's lattice and latticeExtra packages for data visualization Create multidimensional correlation graphs with the pairs() scatterplot matrix package

Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, Third Edition

The leading introductory book on data mining, fully updated and revised! When Berry and Linoff wrote the first edition of Data Mining Techniques in the late 1990s, data mining was just starting to move out of the lab and into the office and has since grown to become an indispensable tool of modern business. This new edition—more than 50% new and revised—is a significant update from the previous one, and shows you how to harness the newest data mining methods and techniques to solve common business problems. The duo of unparalleled authors share invaluable advice for improving response rates to direct marketing campaigns, identifying new customer segments, and estimating credit risk. In addition, they cover more advanced topics such as preparing data for analysis and creating the necessary infrastructure for data mining at your company. Features significant updates since the previous edition and updates you on best practices for using data mining methods and techniques for solving common business problems Covers a new data mining technique in every chapter along with clear, concise explanations on how to apply each technique immediately Touches on core data mining techniques, including decision trees, neural networks, collaborative filtering, association rules, link analysis, survival analysis, and more Provides best practices for performing data mining using simple tools such as Excel Data Mining Techniques, Third Edition covers a new data mining technique with each successive chapter and then demonstrates how you can apply that technique for improved marketing, sales, and customer support to get immediate results.

25 Recipes for Getting Started with R

R is a powerful tool for statistics and graphics, but getting started with this language can be frustrating. This short, concise book provides beginners with a selection of how-to recipes to solve simple problems with R. Each solution gives you just what you need to know to use R for basic statistics, graphics, and regression. You'll find recipes on reading data files, creating data frames, computing basic statistics, testing means and correlations, creating a scatter plot, performing simple linear regression, and many more. These solutions were selected from O'Reilly's R Cookbook, which contains more than 200 recipes for R that you'll find useful once you move beyond the basics.

Cluster Analysis, 5th Edition

Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demonstrate the application of the theory, and figures are used extensively to illustrate graphical techniques. The book is comprehensive yet relatively non-mathematical, focusing on the practical aspects of cluster analysis. Key Features: Presents a comprehensive guide to clustering techniques, with focus on the practical aspects of cluster analysis. Provides a thorough revision of the fourth edition, including new developments in clustering longitudinal data and examples from bioinformatics and gene studies Updates the chapter on mixture models to include recent developments and presents a new chapter on mixture modeling for structured data. Practitioners and researchers working in cluster analysis and data analysis will benefit from this book.

BIRT: A Field Guide, Third Edition

More than seven million people have downloaded BIRT (Business Intelligence and Reporting Tools) from the Eclipse web site, and more than one million developers are estimated to be using BIRT. Built on the open source Eclipse platform, BIRT is a powerful report development system that provides an end-to-end solution, from creating and deploying reports to integrating report capabilities in enterprise applications. The first in a two-book series about this exciting technology, BIRT: A Field Guide to Reporting, Third Edition, is the authoritative guide to using BIRT Report Designer, the graphical tool that enables users of all levels to build reports, from simple to complex, without programming. This book is an essential resource for users who want to create presentation-quality reports quickly. The extensive examples, step-by-step instructions, and abundant illustrations help new users develop report design skills. Power users can find the information they need to make the most of the product’s rich set of features to build sophisticated and compelling reports.
Readers of this book learn how to Design effective corporate reports that convey complex business information using images, charts, tables, and cross tabs Build reports using data from multiple sources, including databases, spreadsheets, web services, and XML documents Enliven reports with interactive features, such as hyperlinks, tooltips, and highlighting Create reports using a consistent style, and, drawing on templates and libraries of reusable elements, collaborate with other report designers Localize reports for an international audience The third edition, newly revised for BIRT 2.6, adds updated examples, contains close to 1,000 new and replacement screenshots, and covers all the new and improved product features, including Result-set sharing to create dashboard-style reports Data collation conforming to local conventions Using cube data in charts, new chart types, and functionality Displaying bidirectional text, used in right-to-left languages Numerous enhancements to cross tabs, page management, and report layout

Mining the Social Web

Popular social networks such as Facebook and Twitter generate a tremendous amount of valuable data on topics and use patterns. Who's talking to whom? What are they talking about? How often are they talking? This concise and practical book shows you how to answer these questions and more by harvesting and analyzing data using social web APIs, Python, and pragmatic storage technologies such as Redis, CouchDB, and NetworkX. With Mining the Social Web, intermediate to advanced programmers will learn how to harvest and analyze social data in a way that lends itself to hacking as well as more industrial-strength analysis. Algorithms are designed with robustness and efficiency in mind so that the approaches scale well on an ordinary piece of commodity hardware. The book is highly readable from cover to cover as content progressively grows in complexity, but also lends itself to being read in an ad-hoc fashion. Use easily adaptable scripts to access popular social network APIs including Twitter, OpenSocial, and Facebook. Learn approaches for slicing and dicing social data that's been harvested from social web APIs as well as other common formats such as email and markup formats. Harvest data from other sources such as Freebase and other sites to enrich your analytic capabilities with additional context. Visualize and analyze data in interactive ways with tools built upon rich UI JavaScript toolkits. Get a concise and straightforward synopsis of some practical technologies from the semantic web landscape that you can incorporate into your analysis. This book is still in progress, but you can get going on this technology through our Rough Cuts edition, which lets you read the manuscript as it's being written, either online or via PDF.

21 Recipes for Mining Twitter

Millions of public Twitter streams harbor a wealth of data, and once you mine them, you can gain some valuable insights. This short and concise book offers a collection of recipes to help you extract nuggets of Twitter information using easy-to-learn Python tools. Each recipe offers a discussion of how and why the solution works, so you can quickly adapt it to fit your particular needs. The recipes include techniques to: Use OAuth to access Twitter data Create and analyze graphs of retweet relationships Use the streaming API to harvest tweets in realtime Harvest and analyze friends and followers Discover friendship cliques Summarize webpages from short URLs This book is a perfect companion to O’Reilly's Mining the Social Web.

Practical Applications of Data Mining

Practical Applications of Data Mining emphasizes both theory and applications of data mining algorithms. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. Each of these techniques is explored with a theoretical introduction and its effectiveness is demonstrated with various chapter examples. This book will help any database and IT professional understand how to apply data mining techniques to real-world problems.

Following an introduction to data mining principles, Practical Applications of Data Mining introduces association rules to describe the generation of rules as the first step in data mining. It covers classification and clustering methods to show how data can be classified to retrieve information from data. Statistical functions and rough set theory are discussed to demonstrate how statistical and rough set formulas can be used for data analytics and knowledge discovery. Neural networks, an important branch of computational intelligence, are introduced and explored in the text to investigate the role of neural network algorithms in data analytics.

Entity Resolution and Information Quality

Entity Resolution and Information Quality presents topics and definitions, and clarifies confusing terminologies regarding entity resolution and information quality. It takes a very wide view of IQ, including its six-domain framework and the skills formed by the International Association for Information and Data Quality (IAIDQ). The book includes chapters that cover the principles of entity resolution and the principles of Information Quality, in addition to their concepts and terminology. It also discusses the Fellegi-Sunter theory of record linkage, the Stanford Entity Resolution Framework, and the Algebraic Model for Entity Resolution, which are the major theoretical models that support Entity Resolution. In relation to this, the book briefly discusses entity-based data integration (EBDI) and its model, which serve as an extension of the Algebraic Model for Entity Resolution. There is also an explanation of how the three commercial ER systems operate and a description of the non-commercial open-source system known as OYSTER. The book concludes by discussing trends in entity resolution research and practice. Students taking IT courses and IT professionals will find this book invaluable. First authoritative reference explaining entity resolution and how to use it effectively. Provides practical system design advice to help you get a competitive advantage. Includes a companion site with synthetic customer data for applicatory exercises, and access to a Java-based Entity Resolution program.

MATLAB®: An Introduction with Applications, Fourth Edition

MATLAB: An Introduction with Applications 4th Edition walks readers through the ins and outs of this powerful software for technical computing. The first chapter describes basic features of the program and shows how to use it in simple arithmetic operations with scalars. The next two chapters focus on the topic of arrays (the basis of MATLAB), while the remaining text covers a wide range of other applications. MATLAB: An Introduction with Applications 4th Edition is presented gradually and in great detail, generously illustrated through computer screen shots and step-by-step tutorials, and applied in problems in mathematics, science, and engineering.

Computational Intelligence and Pattern Analysis in Biological Informatics

An invaluable tool in bioinformatics, this unique volume provides both theoretical and experimental results, and describes basic principles of computational intelligence and pattern analysis while deepening the reader's understanding of the ways in which these principles can be used for analyzing biological data in an efficient manner. This book synthesizes current research in the integration of computational intelligence and pattern analysis techniques, either individually or in a hybridized manner. The purpose is to analyze biological data and enable extraction of more meaningful information and insight from it. Biological data for analysis include sequence data, secondary and tertiary structure data, and microarray data. These data types are complex and advanced methods are required, including the use of domain-specific knowledge for reducing search space, dealing with uncertainty, partial truth and imprecision, efficient linear and/or sub-linear scalability, incremental approaches to knowledge discovery, and increased level and intelligence of interactivity with human experts and decision makers. Chapters are authored by leading researchers in CI in biological informatics. Covers highly relevant topics: rational drug design; analysis of microRNAs and their involvement in human diseases. Supplementary material included: program code and relevant data sets correspond to chapters. Note: The ebook version does not provide access to the companion files.