exploratory-data-analysis

Service-Oriented Distributed Knowledge Discovery

2012-10-05 · O'Reilly Data Science Books O'Reilly Amazon

book

by Paolo Trunfio , Domenico Talia

AI/ML Analytics Data Analytics data data-science data-science-tasks

A new approach to distributed large-scale data mining, service-oriented knowledge discovery extracts useful knowledge from often unmanageable volumes of data by exploiting data mining and machine learning distributed models and techniques in service-oriented infrastructures. Service-Oriented Distributed Knowledge Discovery presents techniques, algorithms, and systems based on the service-oriented paradigm. It explains how to design services for data analytics, describes real systems for implementing distributed knowledge discovery applications, and explores mobile data mining models.

Practical Data Mining

2011-12-19 · O'Reilly Data Science Books O'Reilly Amazon

book

by Jr., F. Hancock

data data-science data-science-tasks

Intended for those who need a practical guide to proven and up-to-date data mining techniques and processes, this book covers specific problem genres. With chapters that focus on application specifics, it allows readers to go to material relevant to their problem domain. Each section starts with a chapter-length roadmap for the given problem domain. This includes a checklist/decision-tree, which allows the reader to customize a data mining solution for their problem space. The roadmap discusses the technical components of solutions.

Spectral Feature Selection for Data Mining

2011-12-14 · O'Reilly Data Science Books O'Reilly Amazon

book

by Huan Liu , Zheng Alan Zhao

data data-science data-science-tasks

Spectral Feature Selection for Data Mining introduces a novel feature selection technique that establishes a general platform for studying existing feature selection algorithms and developing new algorithms for emerging problems in real-world applications. This technique represents a unified framework for supervised, unsupervised, and semisupervise

Data Mining: Concepts and Techniques, 3rd Edition

2011-06-09 · O'Reilly Data Science Books O'Reilly Amazon

book

by Jian Pei , Jiawei Han , Micheline Kamber

Computer Science RDBMS data data-science data-science-tasks

Data Mining: Concepts and Techniques provides the concepts and techniques in processing gathered data or information, which will be used in various applications. Specifically, it explains data mining and the tools used in discovering knowledge from the collected data. Data mining is also referred to as knowledge discovery from data (KDD). The book focuses on the feasibility, usefulness, effectiveness, and scalability of techniques for large data sets. After describing data mining, this edition explains the methods of knowing, preprocessing, processing, and warehousing data. It then presents information about data warehouses, online analytical processing (OLAP), and data cube technology. Then, the methods involved in mining frequent patterns, associations, and correlations for large data sets are described. The book details the methods for data classification and introduces the concepts and methods for data clustering. The remaining chapters discuss outlier detection and the trends, applications, and research frontiers in data mining. This book is intended for Computer Science students, application developers, business professionals, and researchers who seek information on data mining. Presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects. Addresses advanced topics such as mining object-relational databases, spatial databases, multimedia databases, time-series databases, text databases, the World Wide Web, and applications in several fields. Provides a comprehensive, practical look at the concepts and techniques you need to get the most out of your data.

Data Analysis: What Can Be Learned From the Past 50 Years

2011-04-19 · O'Reilly Data Science Books O'Reilly Amazon

book

by Peter J. Huber

data data-science data-science-tasks

This book explores the many provocative questions concerning the fundamentals of data analysis. It is based on the time-tested experience of one of the gurus of the subject matter. Why should one study data analysis? How should it be taught? What techniques work best, and for whom? How valid are the results? How much data should be tested? Which machine languages should be used, if used at all? Emphasis on apprenticeship (through hands-on case studies) and anecdotes (through real-life applications) are the tools that Peter J. Huber uses in this volume. Concern with specific statistical techniques is not of immediate value; rather, questions of strategy – when to use which technique – are employed. Central to the discussion is an understanding of the significance of massive (or robust) data sets, the implementation of languages, and the use of models. Each is sprinkled with an ample number of examples and case studies. Personal practices, various pitfalls, and existing controversies are presented when applicable. The book serves as an excellent philosophical and historical companion to any present-day text in data analysis, robust statistics, data mining, statistical learning, or computational statistics.

Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, Third Edition

2011-03-01 · O'Reilly Data Science Books O'Reilly Amazon

book

by Gordon S. Linoff , Michael J. A. Berry

Marketing data data-science data-science-tasks

The leading introductory book on data mining, fully updated and revised! When Berry and Linoff wrote the first edition of Data Mining Techniques in the late 1990s, data mining was just starting to move out of the lab and into the office and has since grown to become an indispensable tool of modern business. This new edition, more than 50% new and revised, is a significant update from the previous one, and shows you how to harness the newest data mining methods and techniques to solve common business problems. The duo of unparalleled authors share invaluable advice for improving response rates to direct marketing campaigns, identifying new customer segments, and estimating credit risk. In addition, they cover more advanced topics such as preparing data for analysis and creating the necessary infrastructure for data mining at your company. Features significant updates since the previous edition and updates you on best practices for using data mining methods and techniques for solving common business problems. Covers a new data mining technique in every chapter along with clear, concise explanations on how to apply each technique immediately. Touches on core data mining techniques, including decision trees, neural networks, collaborative filtering, association rules, link analysis, survival analysis, and more. Provides best practices for performing data mining using simple tools such as Excel. Data Mining Techniques, Third Edition covers a new data mining technique with each successive chapter and then demonstrates how you can apply that technique for improved marketing, sales, and customer support to get immediate results.

Cluster Analysis, 5th Edition

2011-02-21 · O'Reilly Data Science Books O'Reilly Amazon

book

by Sabine Landau , Brian S. Everitt , Morven Leese , Daniel Stahl

data data-science data-science-tasks

Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real-life examples are used throughout to demonstrate the application of the theory, and figures are used extensively to illustrate graphical techniques. The book is comprehensive yet relatively non-mathematical, focusing on the practical aspects of cluster analysis. Key Features: Presents a comprehensive guide to clustering techniques, with focus on the practical aspects of cluster analysis. Provides a thorough revision of the fourth edition, including new developments in clustering longitudinal data and examples from bioinformatics and gene studies. Updates the chapter on mixture models to include recent developments and presents a new chapter on mixture modeling for structured data. Practitioners and researchers working in cluster analysis and data analysis will benefit from this book.

Practical Applications of Data Mining

2011-01-20 · O'Reilly Data Science Books O'Reilly Amazon

book

by Sang C. Suh

Analytics Data Analytics data data-science data-science-tasks

Practical Applications of Data Mining emphasizes both theory and applications of data mining algorithms. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. Each of these techniques is explored with a theoretical introduction and its effectiveness is demonstrated with various chapter examples. This book will help any database and IT professional understand how to apply data mining techniques to real-world problems.

Following an introduction to data mining principles, Practical Applications of Data Mining introduces association rules to describe the generation of rules as the first step in data mining. It covers classification and clustering methods to show how data can be classified to retrieve information from data. Statistical functions and rough set theory are discussed to demonstrate how statistical and rough set formulas can be used for data analytics and knowledge discovery. Neural networks are an important branch of computational intelligence. They are introduced and explored in the text to investigate the role of neural network algorithms in data analytics.

Knowledge Discovery from Data Streams

2010-05-25 · O'Reilly Data Science Books O'Reilly Amazon

book

by Joao Gama

Data Streaming data data-science data-science-tasks

Exploring how to extract knowledge structures from evolving and time-changing data, this book presents a coherent overview of state-of-the-art research in learning from data streams. It covers the fundamentals that are imperative to understanding data streams and describes important applications, such as TCP/IP traffic, GPS data, sensor networks, and customer click streams. It also explores advanced areas, such as ubiquitous data stream mining; addresses several challenges of data mining in the future, when stream mining will be at the core of many applications; and includes pseudo-code of more than 30 streaming-like algorithms.

Smart Data: Enterprise Performance Optimization Strategy

2010-03-15 · O'Reilly Data Science Books O'Reilly Amazon

book

by James A. George , James A. Rodger

data data-science data-science-tasks

The authors advocate attention to smart data strategy as an organizing element of enterprise performance optimization. They believe that "smart data" as a corporate priority could revolutionize government or commercial enterprise performance much like "six sigma" or "total quality" as organizing paradigms have done in the past. This revolution has not yet taken place because data historically resides in the province of the information resources organization. Solutions that render data smart are articulated in "technoid" terms versus the language of the board room. While books such as Adaptive Information by Pollock and Hodgson ably describe the current state of the art, their necessarily technical tone is not conducive to corporate or agency wide qualitative change.

Head First Data Analysis

2009-07-28 · O'Reilly Data Science Books O'Reilly Amazon

book

by Michael Milton

Data Quality Marketing data data-science data-science-tasks

Today, interpreting data is a critical decision-making factor for businesses and organizations. If your job requires you to manage and analyze all kinds of data, turn to Head First Data Analysis, where you'll quickly learn how to collect and organize data, sort the distractions from the truth, find meaningful patterns, draw conclusions, predict the future, and present your findings to others. Whether you're a product developer researching the market viability of a new product or service, a marketing manager gauging or predicting the effectiveness of a campaign, a salesperson who needs data to support product presentations, or a lone entrepreneur responsible for all of these data-intensive functions and more, the unique approach in Head First Data Analysis is by far the most efficient way to learn what you need to know to convert raw data into a vital business tool. You'll learn how to: determine which data sources to use for collecting information; assess data quality and distinguish signal from noise; build basic data models to illuminate patterns, and assimilate new information into the models; cope with ambiguous information; design experiments to test hypotheses and draw conclusions; use segmentation to organize your data within discrete market groups; visualize data distributions to reveal new relationships and persuade others; predict the future with sampling and probability models; clean your data to make it useful; and communicate the results of your analysis to your audience. Using the latest research in cognitive science and learning theory to craft a multi-sensory learning experience, Head First Data Analysis uses a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.

Data Mining Techniques in Grid Computing Environments

2008-12-30 · O'Reilly Data Science Books O'Reilly Amazon

book

by Werner Dubitzky

data data-science data-science-tasks

Based around eleven international real-life case studies and including contributions from leading experts in the field, this groundbreaking book explores the need for the grid-enabling of data mining applications and provides a comprehensive study of the technology, techniques and management skills necessary to create them. This book provides a simultaneous design blueprint, user guide, and research agenda for current and future developments and will appeal to a broad audience, from developers and users of data mining and grid technology to advanced undergraduate and postgraduate students interested in this field.

Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining

2006-11-28 · O'Reilly Data Science Books O'Reilly Amazon

book

by Glenn J. Myatt

DataViz data data-science data-science-tasks

A practical, step-by-step approach to making sense out of data. Making Sense of Data educates readers on the steps and issues that need to be considered in order to successfully complete a data analysis or data mining project. The author provides clear explanations that guide the reader to make timely and accurate decisions from data in almost every field of study. A step-by-step approach aids professionals in carefully analyzing data and implementing results, leading to the development of smarter business decisions. With a comprehensive collection of methods from both data analysis and data mining disciplines, this book successfully describes the issues that need to be considered, the steps that need to be taken, and appropriately treats technical topics to accomplish effective decision making from data. Readers are given a solid foundation in the procedures associated with complex data analysis or data mining projects and are provided with concrete discussions of the most universal tasks and technical solutions related to the analysis of data, including: problem definitions, data preparation, data visualization, data mining, statistics, grouping methods, predictive modeling, and deployment issues and applications. Throughout the book, the author examines why these multiple approaches are needed and how these methods will solve different problems. Processes, along with methods, are carefully and meticulously outlined for use in any data analysis or data mining project. From summarizing and interpreting data, to identifying non-trivial facts, patterns, and relationships in the data, to making predictions from the data, Making Sense of Data addresses the many issues that need to be considered as well as the steps that need to be taken to master data analysis and mining.

Performance Gap Analysis

2006-03-01 · O'Reilly Data Science Books O'Reilly Amazon

book

by Maren Franklin

data data-science data-science-tasks

Proposing any performance or training solutions requires rigorous analysis based on data, not speculation. Conducting a front-end analysis—a process for determining why a perceived performance gap exists and how to close the gap—enables workplace learning and performance professionals to find successful solutions. This Infoline describes how to carry out the two distinct analysis processes that go into a front-end analysis: a gap (or performance) analysis and root cause analysis. The first process determines if a performance problem exists. The second process identifies the true root cause of the issue. Helpful sidebars explain the analysis sequence, when to conduct a gap analysis, how to define performance gaps without bias, and factors that influence performance. In addition, the job aid provides a checklist of questions for a training needs analysis.

Baseball Hacks

2006-01-31 · O'Reilly Data Science Books O'Reilly Amazon

book

by Joseph Adler

Analytics data data-science data-science-tasks

Baseball Hacks isn't your typical baseball book--it's a book about how to watch, research, and understand baseball. It's an instruction manual for the free baseball databases. It's a cookbook for baseball research. Every part of this book is designed to teach baseball fans how to do something. In short, it's a how-to book--one that will increase your enjoyment and knowledge of the game. So much of the way baseball is played today hinges upon interpreting statistical data. Players are acquired based on their performance in statistical categories that ownership deems most important. Managers make in-game decisions based not on instincts, but on probability - how a particular batter might fare against left-handed pitching, for instance. The goal of this unique book is to show fans all the baseball-related stuff that they can do for free (or close to free). Just as open source projects have made great software freely available, collaborative projects such as Retrosheet and Baseball DataBank have made great data freely available. You can use these data sources to research your favorite players, win your fantasy league, or appreciate the game of baseball even more than you do now. Baseball Hacks shows how easy it is to get data, process it, and use it to truly understand baseball. The book lists a number of sources for current and historical baseball data, and explains how to load it into a database for analysis. It then introduces several powerful statistical tools for understanding data and forecasting results. For the uninitiated baseball fan, author Joseph Adler walks readers through the core statistical categories for hitters (batting average, on-base percentage, etc.), pitchers (earned run average, strikeout-to-walk ratio, etc.), and fielders (putouts, errors, etc.). He then extrapolates upon these numbers to examine more advanced data groups like career averages, team stats, season-by-season comparisons, and more.
Whether you're a mathematician, scientist, or season-ticket holder to your favorite team, Baseball Hacks is sure to have something for you. Advance praise for Baseball Hacks: "Baseball Hacks is the best book ever written for understanding and practicing baseball analytics. A must-read for baseball professionals and enthusiasts alike." -- Ari Kaplan, database consultant to the Montreal Expos, San Diego Padres, and Baltimore Orioles "The game was born in the 19th century, but the passion for its analysis continues to grow into the 21st. In Baseball Hacks, Joe Adler not only demonstrates that the latest data-mining technologies have useful application to the study of baseball statistics, he also teaches the reader how to do the analysis himself, arming the dedicated baseball fan with tools to take his understanding of the game to a higher level." -- Mark E. Johnson, Ph.D., Founder, SportMetrika, Inc. and Baseball Analyst for the 2004 St. Louis Cardinals

Enhance Your Business Applications: Simple Integration of Advanced Data Mining Functions

2002-12-24 · O'Reilly Data Science Books O'Reilly Amazon

book

by Corinne Baragoin

IBM RDBMS data data-science data-science-tasks

Today data mining is no longer thought of as a set of stand-alone techniques, far from the business applications, and used only by data mining specialists or statisticians. Integrating data mining with mainstream applications is becoming an important issue for e-business applications. To support this move to applications, data mining is now an extension of the relational databases that database administrators or IT developers use. They use data mining as they would use any other standard relational function that they manipulate. This IBM Redbooks publication positions the new DB2 data mining functions: Part 1 of this book helps business analysts and implementers to understand and position these new DB2 data mining functions. Part 2 provides examples for implementers on how to easily and quickly integrate the data mining functions in business applications to enhance them. And part 3 helps database administrators and IT developers to configure these functions once to prepare them for use and integration in any application. Please note that the additional material referenced in the text is not available from IBM.

talk-data.com
