talk-data.com

Topic: data (5765 tagged)

Activity Trend: 3 peak/qtr, 2020-Q1 to 2026-Q1

Activities

5765 activities · Newest first

Business Intelligence

This book is about using business intelligence as a management information system for supporting managerial decision making. It concentrates primarily on practical business issues and demonstrates how to apply data warehousing and data analytics to support business decision making. This book progresses through a logical sequence, starting with data model infrastructure, then data preparation, followed by data analysis, integration, knowledge discovery, and finally the actual use of discovered knowledge. All examples are based on the most recent achievements in business intelligence. Finally, this book outlines a methodology that takes into account the complexity of developing applications in an integrated business intelligence environment. This book is written for managers, business consultants, and undergraduate and postgraduate students in business administration.

Data Mashups in R

How do you use R to import, manage, visualize, and analyze real-world data? With this short, hands-on tutorial, you learn how to collect online data, massage it into a reasonable form, and work with it using R facilities to interact with web servers, parse HTML and XML, and more. Rather than use canned sample data, you'll plot and analyze current home foreclosure auctions in Philadelphia. This practical mashup exercise shows you how to access spatial data in several formats locally and over the Web to produce a map of home foreclosures. It's an excellent way to explore how the R environment works with R packages and performs statistical analysis. Parse messy data from public foreclosure auction postings. Plot the data using R's PBSmapping package. Import US Census data to add context to foreclosure data. Use R's lattice and latticeExtra packages for data visualization. Create multidimensional correlation graphs with the pairs() scatterplot matrix function.

Advanced Kalman Filtering, Least-Squares and Modeling: A Practical Handbook

This book provides a complete explanation of estimation theory and application, modeling approaches, and model evaluation. Each topic starts with a clear explanation of the theory (often including historical context), followed by application issues that should be considered in the design. Different implementations designed to address specific problems are presented, and numerous examples of varying complexity are used to demonstrate the concepts. This book is intended primarily as a handbook for engineers who must design practical systems. Its primary goal is to explain all important aspects of Kalman filtering and least-squares theory and application. Discussion of estimator design and model development is emphasized so that the reader may develop an estimator that meets all application requirements and is robust to modeling assumptions. Since it is sometimes difficult to a priori determine the best model structure, use of exploratory data analysis to define model structure is discussed. Methods for deciding on the "best" model are also presented. A second goal is to present little known extensions of least squares estimation or Kalman filtering that provide guidance on model structure and parameters, or make the estimator more robust to changes in real-world behavior. A third goal is discussion of implementation issues that make the estimator more accurate or efficient, or that make it flexible so that model alternatives can be easily compared. The fourth goal is to provide the designer/analyst with guidance in evaluating estimator performance and in determining/correcting problems. The final goal is to provide a subroutine library that simplifies implementation, and flexible general purpose high-level drivers that allow both easy analysis of alternative models and access to extensions of the basic filtering.

Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management, Third Edition

The leading introductory book on data mining, fully updated and revised! When Berry and Linoff wrote the first edition of Data Mining Techniques in the late 1990s, data mining was just starting to move out of the lab and into the office and has since grown to become an indispensable tool of modern business. This new edition, more than 50% new and revised, is a significant update from the previous one, and shows you how to harness the newest data mining methods and techniques to solve common business problems. The duo of unparalleled authors share invaluable advice for improving response rates to direct marketing campaigns, identifying new customer segments, and estimating credit risk. In addition, they cover more advanced topics such as preparing data for analysis and creating the necessary infrastructure for data mining at your company. Features significant updates since the previous edition and updates you on best practices for using data mining methods and techniques for solving common business problems. Covers a new data mining technique in every chapter along with clear, concise explanations on how to apply each technique immediately. Touches on core data mining techniques, including decision trees, neural networks, collaborative filtering, association rules, link analysis, survival analysis, and more. Provides best practices for performing data mining using simple tools such as Excel. Data Mining Techniques, Third Edition covers a new data mining technique with each successive chapter and then demonstrates how you can apply that technique for improved marketing, sales, and customer support to get immediate results.

25 Recipes for Getting Started with R

R is a powerful tool for statistics and graphics, but getting started with this language can be frustrating. This short, concise book provides beginners with a selection of how-to recipes to solve simple problems with R. Each solution gives you just what you need to know to use R for basic statistics, graphics, and regression. You'll find recipes on reading data files, creating data frames, computing basic statistics, testing means and correlations, creating a scatter plot, performing simple linear regression, and many more. These solutions were selected from O'Reilly's R Cookbook, which contains more than 200 recipes for R that you'll find useful once you move beyond the basics.

Data Source Handbook

If you're a developer looking to supplement your own data tools and services, this concise ebook covers the most useful sources of public data available today. You'll find useful information on APIs that offer broad coverage, tie their data to the outside world, and are either accessible online or feature downloadable bulk data. You'll also find code and helpful links. This guide organizes APIs by the subjects they cover, such as websites, people, or places, so you can quickly locate the best resources for augmenting the data you handle in your own service. Categories include: website tools such as WHOIS, bit.ly, and Compete; services that use email addresses as search terms, including Github; finding information from just a name, with APIs such as WhitePages; services, such as Klout, for locating people with Facebook and Twitter accounts; search APIs, including BOSS and Wikipedia; geographical data sources, including SimpleGeo and U.S. Census; company information APIs, such as CrunchBase and ZoomInfo; APIs that list IP addresses, such as MaxMind; and services that list books, films, music, and products.

IBM System Storage DS Storage Manager Copy Services Guide

The purpose of this IBM® Redbooks® publication is to provide customers with guidance and recommendations for how and when to use the IBM System Storage® Copy Services premium features. The topics discussed in this publication apply to the IBM System Storage DS® models DS3000, DS4000®, and DS5000 running the firmware v7.70, and IBM System Storage DS Storage Manager v10.70. Customers in today’s IT world are finding a major need to ensure a good archive of their data and a requirement to create these archives with minimal interruptions. The IBM Midrange System Storage helps to fulfill these requirements by offering three copy services premium features: IBM FlashCopy®, VolumeCopy, and Enhanced Remote Mirroring (ERM). This publication specifically addresses the copy services premium features and can be used in conjunction with the following IBM DS System Storage books: IBM System Storage DS4000 and Storage Manager V10.30, SG24-7010; IBM System Storage DS3000: Introduction and Implementation Guide, SG24-7065; IBM System Storage DS3500: Introduction and Implementation Guide, SG24-7914; IBM Midrange System Storage Hardware Guide, SG24-7676; and IBM Midrange System Storage Implementation and Best Practices Guide, SG24-6363.

Scaling MongoDB

Create a MongoDB cluster that will grow to meet the needs of your application. With this short and concise book, you'll get guidelines for setting up and using clusters to store a large volume of data, and learn how to access the data efficiently. In the process, you'll understand how to make your application work with a distributed database system. Scaling MongoDB will help you: set up a MongoDB cluster through sharding; work with a cluster to query and update data; operate, monitor, and back up your cluster; and plan your application to deal with outages. By following the advice in this book, you'll be well on your way to building and running an efficient, predictable distributed system using MongoDB.

Cluster Analysis, 5th Edition

Cluster analysis comprises a range of methods for classifying multivariate data into subgroups. By organizing multivariate data into such subgroups, clustering can help reveal the characteristics of any structure or patterns present. These techniques have proven useful in a wide range of areas such as medicine, psychology, market research and bioinformatics. This fifth edition of the highly successful Cluster Analysis includes coverage of the latest developments in the field and a new chapter dealing with finite mixture models for structured data. Real life examples are used throughout to demonstrate the application of the theory, and figures are used extensively to illustrate graphical techniques. The book is comprehensive yet relatively non-mathematical, focusing on the practical aspects of cluster analysis. Key Features: Presents a comprehensive guide to clustering techniques, with focus on the practical aspects of cluster analysis. Provides a thorough revision of the fourth edition, including new developments in clustering longitudinal data and examples from bioinformatics and gene studies. Updates the chapter on mixture models to include recent developments and presents a new chapter on mixture modeling for structured data. Practitioners and researchers working in cluster analysis and data analysis will benefit from this book.

Database Modeling and Design, 5th Edition

Database Modeling and Design, Fifth Edition, focuses on techniques for database design in relational database systems. This extensively revised fifth edition features clear explanations, lots of terrific examples and an illustrative case, and practical advice, with design rules that are applicable to any SQL-based system. The common examples are based on real-life experiences and have been thoroughly class-tested. This book is immediately useful to anyone tasked with the creation of data models for the integration of large-scale enterprise data. It is ideal for a stand-alone data management course focused on logical database design, or as a supplement to an introductory database management text. In-depth detail and plenty of real-world, practical examples throughout. Loaded with design rules and illustrative case studies that are applicable to any SQL, UML, or XML-based system. Immediately useful to anyone tasked with the creation of data models for the integration of large-scale enterprise data.

BIRT: A Field Guide, Third Edition

More than seven million people have downloaded BIRT (Business Intelligence and Reporting Tools) from the Eclipse web site, and more than one million developers are estimated to be using BIRT. Built on the open source Eclipse platform, BIRT is a powerful report development system that provides an end-to-end solution, from creating and deploying reports to integrating report capabilities in enterprise applications. The first in a two-book series about this exciting technology, BIRT: A Field Guide to Reporting, Third Edition, is the authoritative guide to using BIRT Report Designer, the graphical tool that enables users of all levels to build reports, from simple to complex, without programming. This book is an essential resource for users who want to create presentation-quality reports quickly. The extensive examples, step-by-step instructions, and abundant illustrations help new users develop report design skills. Power users can find the information they need to make the most of the product’s rich set of features to build sophisticated and compelling reports.
Readers of this book learn how to: design effective corporate reports that convey complex business information using images, charts, tables, and cross tabs; build reports using data from multiple sources, including databases, spreadsheets, web services, and XML documents; enliven reports with interactive features, such as hyperlinks, tooltips, and highlighting; create reports using a consistent style, and, drawing on templates and libraries of reusable elements, collaborate with other report designers; and localize reports for an international audience. The third edition, newly revised for BIRT 2.6, adds updated examples, contains close to 1,000 new and replacement screenshots, and covers all the new and improved product features, including: result-set sharing to create dashboard-style reports; data collation conforming to local conventions; using cube data in charts, new chart types, and functionality; displaying bidirectional text, used in right-to-left languages; and numerous enhancements to cross tabs, page management, and report layout.

Automated Physical Database Design and Tuning

Due to the increasing complexity in application workloads and query engines, database administrators are turning to automated tuning tools that systematically explore the space of physical design alternatives. A critical element of such tuning is physical database design since the choice of physical structures has a significant impact on the performance of the database system. Automated Physical Database Design and Tuning presents a detailed overview of the fundamental ideas and algorithms for automatically recommending changes to the physical design of a database system. The first part of the book introduces the necessary technical background. The author explains SQL, the space of execution plans for answering SQL queries, query optimization, how the choice of access paths (e.g., indexes) is crucial to performance, and the complexity of the physical design problem. The second part extensively discusses automated physical design techniques, covering fundamental research ideas in the last 15 years that have resulted in a new generation of tuning tools. The text focuses on the search space of alternatives, the necessity of a cost model to compare such alternatives, different mechanisms to traverse and enumerate the search space, and practical aspects in real-world tuning tools. In the third part, the author explores new advances in automated physical design. He applies previous approaches to other physical structures, such as materialized views, partitioning, and multidimensional clustering. He also analyzes workload models for new types of applications, generalizes the optimizing function of current physical design tools to cope with other application scenarios, and examines open-ended challenges in physical database design. This book offers valuable insights on well-established principles and cutting-edge research results in automated physical design. 
It helps readers gain a deeper understanding of how automated tuning tools work in database installations as well as the challenges and opportunities involved in designing next-generation tuning tools.

Developing High Quality Data Models

Developing High Quality Data Models provides an introduction to the key principles of data modeling. It explains the purpose of data models in both developing an Enterprise Architecture and in supporting Information Quality; common problems in data model development; and how to develop high quality data models, in particular conceptual, integration, and enterprise data models. The book is organized into four parts. Part 1 provides an overview of data models and data modeling including the basics of data model notation; types and uses of data models; and the place of data models in enterprise architecture. Part 2 introduces some general principles for data models, including principles for developing ontologically based data models; and applications of the principles for attributes, relationship types, and entity types. Part 3 presents an ontological framework for developing consistent data models. Part 4 provides the full data model that has been in development throughout the book. The model was created using Jotne EPM Technology's EDMVisualExpress data modeling tool. This book was designed for all types of modelers: from those who understand data modeling basics but are just starting to learn about data modeling in practice, through to experienced data modelers seeking to expand their knowledge and skills and solve some of the more challenging problems of data modeling. Uses a number of common data model patterns to explain how to develop data models over a wide scope in a way that is consistent and of high quality. Offers generic data model templates that are reusable in many applications and are fundamental for developing more specific templates. Develops ideas for creating consistent approaches to high quality data models.

Mining the Social Web

Popular social networks such as Facebook and Twitter generate a tremendous amount of valuable data on topics and use patterns. Who's talking to whom? What are they talking about? How often are they talking? This concise and practical book shows you how to answer these questions and more by harvesting and analyzing data using social web APIs, Python, and pragmatic storage technologies such as Redis, CouchDB, and NetworkX. With Mining the Social Web, intermediate to advanced programmers will learn how to harvest and analyze social data in a way that lends itself to hacking as well as more industrial-strength analysis. Algorithms are designed with robustness and efficiency in mind so that the approaches scale well on an ordinary piece of commodity hardware. The book is highly readable from cover to cover as content progressively grows in complexity, but also lends itself to being read in an ad-hoc fashion. Use easily adaptable scripts to access popular social network APIs including Twitter, OpenSocial, and Facebook. Learn approaches for slicing and dicing social data that's been harvested from social web APIs as well as other common formats such as email and markup formats. Harvest data from other sources such as Freebase and other sites to enrich your analytic capabilities with additional context. Visualize and analyze data in interactive ways with tools built upon rich UI JavaScript toolkits. Get a concise and straightforward synopsis of some practical technologies from the semantic web landscape that you can incorporate into your analysis. This book is still in progress, but you can get going on this technology through our Rough Cuts edition, which lets you read the manuscript as it's being written, either online or via PDF.

21 Recipes for Mining Twitter

Millions of public Twitter streams harbor a wealth of data, and once you mine them, you can gain some valuable insights. This short and concise book offers a collection of recipes to help you extract nuggets of Twitter information using easy-to-learn Python tools. Each recipe offers a discussion of how and why the solution works, so you can quickly adapt it to fit your particular needs. The recipes include techniques to: use OAuth to access Twitter data; create and analyze graphs of retweet relationships; use the streaming API to harvest tweets in realtime; harvest and analyze friends and followers; discover friendship cliques; and summarize webpages from short URLs. This book is a perfect companion to O’Reilly's Mining the Social Web.

Extremely pureXML in DB2 10 for z/OS

The DB2® pureXML® feature offers sophisticated capabilities to store, process and manage XML data in its native hierarchical format. By integrating XML data intact into a relational database structure, users can take full advantage of DB2's relational data management features. In this IBM® Redbooks® publication, we document the steps for the implementation of a simple but meaningful XML application scenario. We have chosen to provide samples in COBOL and Java™ language. The purpose is to provide an easy path to follow to integrate the XML data type for the traditional DB2 for z/OS® user. We also add considerations for the data administrator and suggest best practices for ease of use and better performance.

Practical Applications of Data Mining

Practical Applications of Data Mining emphasizes both theory and applications of data mining algorithms. Various topics of data mining techniques are identified and described throughout, including clustering, association rules, rough set theory, probability theory, neural networks, classification, and fuzzy logic. Each of these techniques is explored with a theoretical introduction and its effectiveness is demonstrated with various chapter examples. This book will help any database and IT professional understand how to apply data mining techniques to real-world problems.

Following an introduction to data mining principles, Practical Applications of Data Mining introduces association rules to describe the generation of rules as the first step in data mining. It covers classification and clustering methods to show how data can be classified to retrieve information from data. Statistical functions and rough set theory are discussed to demonstrate how statistical and rough set formulas can be used for data analytics and knowledge discovery. Neural networks, an important branch of computational intelligence, are introduced and explored in the text to investigate the role of neural network algorithms in data analytics.

Mastering XPages: A Step-by-Step Guide to XPages Application Development and the XSP Language

The first complete, practical guide to XPages development, direct from members of the XPages development team at IBM Lotus. Martin Donnelly, Mark Wallace, and Tony McGuckin have written the definitive programmer's guide to utilizing this breakthrough technology. Packed with tips, tricks, and best practices from IBM's own XPages developers, Mastering XPages brings together all the information developers need to become experts, whether or not they are experienced with Notes/Domino development. The authors start from the very beginning, helping developers steadily build their expertise through practical code examples and clear, complete explanations. Readers will work through scores of real-world XPages examples, learning cutting-edge XPages and XSP language skills and gaining deep insight into the entire development process. Drawing on their own experience working directly with XPages users and customers, the authors illuminate both the technology and how it can be applied to solving real business problems. Martin Donnelly previously led a software startup that developed and distributed small business accounting software. Donnelly holds a Commerce degree from University College Cork and an M.S. in Computer Science from Boston University. Mark Wallace has worked at IBM for 15 years on many projects as a technical architect and application developer. Tony McGuckin participates in the Lotus OneUI Web Application and iWidget Adoption Workgroup. He holds a bachelor's degree in Software Engineering from the University of Ulster.

Entity Resolution and Information Quality

Entity Resolution and Information Quality presents topics and definitions, and clarifies confusing terminologies regarding entity resolution and information quality. It takes a very wide view of IQ, including its six-domain framework and the skills formed by the International Association for Information and Data Quality (IAIDQ). The book includes chapters that cover the principles of entity resolution and the principles of Information Quality, in addition to their concepts and terminology. It also discusses the Fellegi-Sunter theory of record linkage, the Stanford Entity Resolution Framework, and the Algebraic Model for Entity Resolution, which are the major theoretical models that support Entity Resolution. In relation to this, the book briefly discusses entity-based data integration (EBDI) and its model, which serve as an extension of the Algebraic Model for Entity Resolution. There is also an explanation of how the three commercial ER systems operate and a description of the non-commercial open-source system known as OYSTER. The book concludes by discussing trends in entity resolution research and practice. Students taking IT courses and IT professionals will find this book invaluable. First authoritative reference explaining entity resolution and how to use it effectively. Provides practical system design advice to help you get a competitive advantage. Includes a companion site with synthetic customer data for applicatory exercises, and access to a Java-based Entity Resolution program.

Tivoli Integration Scenarios

This IBM® Redbooks® publication provides a broad view of how Tivoli® system management products work together in several common scenarios. You must achieve seamless integration for operations personnel to work with the solution. This integration is necessary to ensure that the product can be used easily by the users. Product integration contains multiple dimensions, such as security, navigation, data and task integrations. Within the context of the scenarios in this book, you see examples of these integrations. The scenarios implemented in this book are largely based on the input from the integration team, and several clients using IBM products. We based these scenarios on common real-life examples that IT operations often have to deal with. Of course, these scenarios are only a small subset of the possible integration scenarios that can be accomplished by the Tivoli products, but they were chosen to be representative of the integration possibilities using the Tivoli products. We discuss these implementations and benefits that are realized by these integrations, and also provide sample scenarios of how these integrations work. This book is a reference guide for IT architects and IT specialists working on integrating Tivoli products in real-life environments.