talk-data.com

Topic: data (5765 tagged)

Activity Trend: peak of 3 per quarter, 2020-Q1 to 2026-Q1

Activities

5765 activities · Newest first

Big Data Glossary

To help you navigate the large number of new data tools available, this guide describes 60 of the most recent innovations, from NoSQL databases and MapReduce approaches to machine learning and visualization tools. Descriptions are based on first-hand experience with these tools in a production environment. This handy glossary also includes a chapter of key terms that help define many of these tool categories:

- NoSQL Databases: Document-oriented databases using a key/value interface rather than SQL
- MapReduce: Tools that support distributed computing on large datasets
- Storage: Technologies for storing data in a distributed way
- Servers: Ways to rent computing power on remote machines
- Processing: Tools for extracting valuable information from large datasets
- Natural Language Processing: Methods for extracting information from human-created text
- Machine Learning: Tools that automatically perform data analyses, based on results of a one-off analysis
- Visualization: Applications that present meaningful data graphically
- Acquisition: Techniques for cleaning up messy public data sources
- Serialization: Methods to convert data structure or object state into a storable format
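As a small illustration of the serialization category described above, the following sketch converts an in-memory structure to a storable text format and back. It is not taken from the book: the choice of JSON, the `record` contents, and the helper names `serialize` and `deserialize` are all assumptions made for this example.

```python
import json

# Hypothetical record, invented purely for illustration.
record = {
    "tool": "example-store",
    "category": "NoSQL Databases",
    "tags": ["key/value", "document"],
}


def serialize(obj):
    """Convert an in-memory structure into a storable text form (JSON here)."""
    return json.dumps(obj, sort_keys=True)


def deserialize(text):
    """Rebuild the in-memory structure from its stored text form."""
    return json.loads(text)


stored = serialize(record)      # a plain string that can be written to disk
restored = deserialize(stored)  # an equal copy of the original structure
```

JSON is only one possible storable format; binary formats follow the same round-trip idea of writing state out and reading it back.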

Scaling BPM Adoption from Project to Program with IBM Business Process Manager

Your first Business Process Management (BPM) project is a crucial first step on your BPM journey. It is important to begin this journey with a philosophy of change that will enable you to avoid common pitfalls that lead to failed BPM projects, and ultimately, poor BPM adoption. This IBM® Redbooks® publication describes the methodology and best practices that lead to a successful project and how to use that success to scale to enterprise-wide BPM adoption. The intended audience for this book includes all people who participate in the discovery, planning, delivery, deployment, and continuous improvement activities for a business process. These roles include process owners, process participants and subject matter experts (SMEs) from the operational business as well as technologists responsible for delivery including BPM analysts, BPM solution architects, BPM administrators, and BPM developers.

HBase: The Definitive Guide

If you're looking for a scalable storage solution to accommodate a virtually endless amount of data, this book shows you how Apache HBase can fulfill your needs. As the open source implementation of Google's BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. Many IT executives are asking pointed questions about HBase. This book provides meaningful answers, whether you’re evaluating this non-relational database or planning to put it into practice right away.

- Discover how tight integration with Hadoop makes scalability with HBase easier
- Distribute large datasets across an inexpensive cluster of commodity servers
- Access HBase with native Java clients, or with gateway servers providing REST, Avro, or Thrift APIs
- Get details on HBase’s architecture, including the storage format, write-ahead log, background processes, and more
- Integrate HBase with Hadoop's MapReduce framework for massively parallelized data processing jobs
- Learn how to tune clusters, design schemas, copy tables, import bulk data, decommission nodes, and many other tasks

Professional NoSQL

A hands-on guide to leveraging NoSQL databases

NoSQL databases are an efficient and powerful tool for storing and manipulating vast quantities of data. Most NoSQL databases scale well as data grows. In addition, they are often malleable and flexible enough to accommodate semi-structured and sparse data sets. This comprehensive hands-on guide presents fundamental concepts and practical solutions for getting you ready to use NoSQL databases. Expert author Shashank Tiwari begins with a helpful introduction on the subject of NoSQL, explains its characteristics and typical uses, and looks at where it fits in the application stack. Unique insights help you choose which NoSQL solutions are best for solving your specific data storage needs.

Professional NoSQL:

- Demystifies the concepts that relate to NoSQL databases, including column-family oriented stores, key/value databases, and document databases.
- Delves into installing and configuring a number of NoSQL products and the Hadoop family of products.
- Explains ways of storing, accessing, and querying data in NoSQL databases through examples that use MongoDB, HBase, Cassandra, Redis, CouchDB, Google App Engine Datastore and more.
- Looks at architecture and internals.
- Provides guidelines for optimal usage, performance tuning, and scalable configurations.
- Presents a number of tools and utilities relating to NoSQL, distributed platforms, and scalable processing, including Hive, Pig, RRDtool, Nagios, and more.

Statistics and Probability with Applications for Engineers and Scientists, Preliminary Edition

All statistical concepts are supported by a large number of examples using data encountered in real life situations; and the text illustrates how the statistical packages MINITAB®, Microsoft Excel®, and JMP® may be used to aid in the analysis of various data sets. The text also covers an appropriate and understandable level of the design of experiments. This includes randomized block designs, one and two-way designs, Latin square designs, factorial designs, response surface designs, and others. This text is suitable for a one- or two-semester calculus-based undergraduate statistics course for engineers and scientists, and the presentation of material gives instructors flexibility to pick and choose topics for their particular courses.

Microsoft® BizTalk® Server 2010 Unleashed

The most complete, practical guide to BizTalk Server 2009: an all-new book focused on delivering real, start-to-finish enterprise solutions

- Pragmatic coverage of every crucial step of BizTalk development: architecture, design, infrastructure, deployment, lifecycle management, and more
- Fully up to date with the R2 release of BizTalk Server 2009
- Not a revision of previous BizTalk Server Unleashed books, but completely rewritten from the ground up

Microsoft BizTalk Server 2009 Unleashed is the definitive, pragmatic guide to Microsoft's latest and most powerful version of BizTalk Server. In this book, a team of world-class BizTalk Server 2009 experts bring together the deep practical insights .NET developers need to solve real business problems with BizTalk Server 2009 in any enterprise environment. Drawing on their immense BizTalk experience, the authors present best practices for the entire development lifecycle, from planning and architecture through deployment, and beyond. Writing at just the right level of technical detail for experienced .NET developers now starting out with BizTalk, they cover these and many other crucial issues:

- Architecting and designing effective, high-value BizTalk solutions
- Working with BizTalk schemas, maps, orchestrations, pipelines, pipeline components, and adapters
- Implementing business rules with the Microsoft Business Rules Framework
- Creating highly-available, high-performance BizTalk environments
- Monitoring business activity
- Collaborating effectively among BizTalk developers and users
- Using BizTalk's leading-edge RFID capabilities

Note: This is a 100% new book, NOT an update to Microsoft BizTalk Server 2004 Unleashed.

Practical Biomedical Signal Analysis Using MATLAB

Practical Biomedical Signal Analysis Using MATLAB presents a coherent treatment of various signal processing methods and applications. The book not only covers the current techniques of biomedical signal processing, but it also offers guidance on which methods are appropriate for a given task and different types of data. The first several chapters o

SAN Storage Performance Management Using Tivoli Storage Productivity Center

IBM Tivoli® Storage Productivity Center is an ideal tool for performing storage management reporting, because it uses industry standards for cross vendor compliance, and it can provide reports based on views from all application servers, all Fibre Channel fabric devices, and storage subsystems from different vendors, both physical and virtual. This IBM® Redbooks® publication is intended for experienced storage managers who want to provide detailed performance reports to satisfy their business requirements. The focus of this book is to use the reports provided by Tivoli Storage Productivity Center for performance management. We do address basic storage architecture in order to set a level playing field for understanding of the terminology that we are using throughout this book. Although this book has been created to cover storage performance management, just as important in the larger picture of Enterprise-wide management are both Asset Management and Capacity Management. Tivoli Storage Productivity Center is an excellent tool to provide all of these reporting and management requirements.

Database Design Using Entity-Relationship Diagrams, 2nd Edition

Essential to database design, entity-relationship (ER) diagrams are known for their usefulness in mapping out clear database designs. They are also well-known for being difficult to master. With Database Design Using Entity-Relationship Diagrams, Second Edition, database designers, developers, and students preparing to enter the field can quickly learn the ins and outs of ER diagramming. Building on the success of the bestselling first edition, this accessible text includes a new chapter on the relational model and functional dependencies. It also includes expanded chapters on Enhanced Entity Relationship (EER) diagrams and reverse mapping. It uses cutting-edge case studies and examples to help readers master database development basics and defines ER and EER diagramming in terms of requirements (end user requests) and specifications (designer feedback to those requests).

- Describes a step-by-step approach for producing an ER diagram and developing a relational database from it
- Contains exercises, examples, case studies, bibliographies, and summaries in each chapter
- Details the rules for mapping ER diagrams to relational databases
- Explains how to reverse engineer a relational database back to an entity-relationship model
- Includes grammar for the ER diagrams that can be presented back to the user

The updated exercises and chapter summaries provide the real-world understanding needed to develop ER and EER diagrams, map them to relational databases, and test the resulting relational database. Complete with a wealth of additional exercises and examples throughout, this edition should be a basic component of any database course. Its comprehensive nature and easy-to-navigate structure make it a resource that students and professionals will turn to throughout their careers.

MariaDB Crash Course

This first-to-market tutorial covers everything beginners need to succeed with MariaDB, the open, community-based branch of MySQL. Master trainer Ben Forta introduces all the essentials through a series of quick, easy-to-follow, hands-on lessons. Instead of belaboring database theory and relational design, Forta focuses on teaching solutions for the majority of SQL users who simply want to interact with data. Forta covers all this, and more:

- Using the MariaDB toolset
- Retrieving and sorting data
- Filtering data using comparisons, wildcards, and full text searching
- Analyzing data with aggregate functions
- Performing insert, update, and delete operations
- Joining relational tables using inner, outer, and self joins
- Combining queries using unions
- Using views
- Creating and modifying tables, and accessing table schemas
- Working with stored procedures, cursors, and other advanced database features
- Managing databases, users, and security privileges

This book was reviewed and is supported by MariaDB's developers, Monty Program AB. It contains a foreword by project founder Monty Widenius, primary developer of the original version of MySQL.

Mathematical Statistics with Resampling and R

This book bridges the latest software applications with the benefits of modern resampling techniques

Resampling helps students understand the meaning of sampling distributions, sampling variability, P-values, hypothesis tests, and confidence intervals. This groundbreaking book shows how to apply modern resampling techniques to mathematical statistics. Extensively class-tested to ensure an accessible presentation, Mathematical Statistics with Resampling and R utilizes the powerful and flexible computer language R to underscore the significance and benefits of modern resampling techniques. The book begins by introducing permutation tests and bootstrap methods, motivating classical inference methods. Striking a balance between theory, computing, and applications, the authors explore additional topics such as:

- Exploratory data analysis
- Calculation of sampling distributions
- The Central Limit Theorem
- Monte Carlo sampling
- Maximum likelihood estimation and properties of estimators
- Confidence intervals and hypothesis tests
- Regression
- Bayesian methods

Throughout the book, case studies on diverse subjects such as flight delays, birth weights of babies, and telephone company repair times illustrate the relevance of the real-world applications of the discussed material. Key definitions and theorems of important probability distributions are collected at the end of the book, and a related website is also available, featuring additional material including data sets, R scripts, and helpful teaching hints. Mathematical Statistics with Resampling and R is an excellent book for courses on mathematical statistics at the upper-undergraduate and graduate levels. It also serves as a valuable reference for applied statisticians working in the areas of business, economics, biostatistics, and public health who utilize resampling methods in their everyday work.
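To make the two resampling ideas named in this description more concrete, here is a minimal sketch of a percentile bootstrap confidence interval and a two-sample permutation test. It is not taken from the book, which uses R. The Python standard library is used here instead, and the function names, default parameters, and the `delays_a` and `delays_b` values are invented for illustration only.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `sample`."""
    rng = random.Random(seed)
    n = len(sample)
    # Resample with replacement, recording the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def permutation_test(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        diff = abs(
            statistics.fmean(pooled[: len(a)]) - statistics.fmean(pooled[len(a):])
        )
        if diff >= observed:
            extreme += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up delay values in minutes, for illustration only.
delays_a = [12, 5, 30, 8, 21, 14, 9, 17]
delays_b = [25, 31, 18, 40, 22, 35, 28, 19]

ci_low, ci_high = bootstrap_mean_ci(delays_a)
p_value = permutation_test(delays_a, delays_b)
```

Both functions take a `seed` so that repeated runs give the same result; in practice the number of resamples is a trade-off between precision and run time.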

Security Functions of IBM DB2 10 for z/OS

IBM® DB2® 9 and 10 for z/OS® have added functions in the areas of security, regulatory compliance, and audit capability that provide solutions for the most compelling requirements. DB2 10 enhances the DB2 9 role-based security with additional administrative and other finer-grained authorities and privileges. This authority granularity helps separate administration and data access that provide only the minimum appropriate authority. The authority profiles provide better separation of duties while limiting or eliminating blanket authority over all aspects of a table and its data. In addition, DB2 10 provides a set of criteria for auditing for the possible abuse and overlapping of authorities within a system. In DB2 10, improvements to security and regulatory compliance focus on data retention and protecting sensitive data from privileged users and administrators. Improvements also help to separate security administration from database administration. DB2 10 also lets administrators enable security on a particular column or particular row in the database complementing the privilege model. This IBM Redbooks® publication provides a detailed description of DB2 10 security functions from the implementation and usage point of view. It is intended to be used by database, audit, and security administrators.

Practical Oracle Security

This is the only practical, hands-on guide available to database administrators to secure their Oracle databases. This book will help the DBA to assess their current level of risk as well as their existing security posture. It will then provide practical, applicable knowledge to appropriately secure the Oracle database. The only practical, hands-on guide for securing your Oracle database published by independent experts. Your Oracle database does not exist in a vacuum, so this book shows you how to securely integrate your database into your enterprise.

Cloud and Virtual Data Storage Networking

Written by noted author, blogger, industry analyst, and IT veteran, Greg Schulz, this book covers data storage networks for cloud and virtual environments, from a hardware, software, services, and best practices perspective. Filled with real-world insights, blueprints, and best practices, this vendor- and technology-neutral text provides the tools to achieve efficient, optimized, flexible, scalable, and resilient data storage networking infrastructures. Coverage includes public and private cloud, virtualization, and traditional IT environments.

IBM BladeCenter Virtual Fabric Solutions

The deployment of server virtualization technologies in data centers requires significant efforts in providing sufficient network I/O bandwidth to satisfy the demand of virtualized applications and services. For example, every virtualized system can host several dozen network applications and services, and each of these services requires certain bandwidth (or speed) to function properly. Furthermore, because of different network traffic patterns relevant to different service types, these traffic flows may interfere with each other, leading to serious network problems including the inability of the service to perform its functions. The IBM® Virtual Fabric solution for IBM BladeCenter addresses these issues. The solution is based on the IBM BladeCenter H chassis with a 10-Gb Converged Enhanced Ethernet infrastructure built on 10-Gb Ethernet switch modules in the chassis and the Emulex or Broadcom Virtual Fabric Adapters in each blade server.

Digital Signal Processing Using MATLAB for Students and Researchers

Quickly Engages in Applying Algorithmic Techniques to Solve Practical Signal Processing Problems

With its active, hands-on learning approach, this text enables readers to master the underlying principles of digital signal processing and its many applications in industries such as digital television, mobile and broadband communications, and medical/scientific devices. Carefully developed MATLAB® examples throughout the text illustrate the mathematical concepts and use of digital signal processing algorithms. Readers will develop a deeper understanding of how to apply the algorithms by manipulating the codes in the examples to see their effect. Moreover, plenty of exercises help to put knowledge into practice solving real-world signal processing challenges. Following an introductory chapter, the text explores:

- Sampled signals and digital processing
- Random signals
- Representing signals and systems
- Temporal and spatial signal processing
- Frequency analysis of signals
- Discrete-time filters and recursive filters

Each chapter begins with chapter objectives and an introduction. A summary at the end of each chapter ensures that one has mastered all the key concepts and techniques before progressing in the text. Lastly, appendices listing selected web resources, research papers, and related textbooks enable the investigation of individual topics in greater depth. Upon completion of this text, readers will understand how to apply key algorithmic techniques to address practical signal processing problems as well as develop their own signal processing algorithms. Moreover, the text provides a solid foundation for evaluating and applying new digital signal processing techniques as they are developed.

Oracle Database 11g Oracle Real Application Clusters Handbook, 2nd Edition

Master Oracle Real Application Clusters

Maintain a dynamic enterprise computing infrastructure with expert instruction from an Oracle ACE. Oracle Database 11g Oracle Real Application Clusters Handbook, Second Edition has been fully revised and updated to cover the latest tools and features. Find out how to prepare your hardware, deploy Oracle Real Application Clusters, optimize data integrity, and integrate seamless failover protection. Troubleshooting, performance tuning, and application development are also discussed in this comprehensive Oracle Press guide.

- Install and configure Oracle Real Application Clusters
- Configure and manage diskgroups using Oracle Automatic Storage Management
- Work with services, voting disks, and Oracle Clusterware Repository
- Look under the hood of the Cache Fusion and Global Resource Directory operations in Oracle Real Application Clusters
- Explore the internal workings of backup and recovery in Oracle Real Application Clusters
- Employ workload balancing and the Transparent Application Failover feature of an Oracle database
- Get complete coverage of Stretch Clusters, also known as Metro Clusters
- Troubleshoot Oracle Clusterware using the most advanced diagnostics available
- Develop custom Oracle Real Application Clusters applications

Oracle Database 11g Performance Tuning Recipes: A Problem-Solution Approach

Performance problems are rarely "problems" per se. They are more often "crises" during which you're pressured for results by a manager standing outside your cubicle while your phone rings with queries from the help desk. You won't have the time for a leisurely perusal of the manuals, nor to lean back and read a book on theory. What you need in that situation is a book of solutions, and solutions are precisely what Oracle Database 11g Performance Tuning Recipes delivers. Oracle Database 11g Performance Tuning Recipes is a ready reference for database administrators in need of immediate help with performance issues relating to Oracle Database. The book takes an example-based approach, wherein each chapter covers a specific problem domain. Within each chapter are "recipes," showing by example how to perform common tasks in that chapter's domain. Solutions in the recipes are backed by clear explanations of background and theory from the author team. Whatever the task, if it's performance-related, you'll probably find a recipe and a solution in this book.

- Provides proven solutions to real-life Oracle performance problems
- Offers relevant background and theory to support each solution
- Written by a team of experienced database administrators successful in their careers

What you'll learn

- Optimize the use of memory and storage
- Monitor performance and troubleshoot problems
- Identify and improve poorly-performing SQL statements
- Adjust the most important optimizer parameters to your advantage
- Create indexes that get used and make a positive impact upon performance
- Automate and stabilize using key features such as SQL Tuning Advisor and SQL Plan Baselines

Who this book is for

Oracle Database 11g Performance Tuning Recipes is aimed squarely at Oracle Database administrators. The book especially appeals to those administrators desiring to have at their side a ready-to-go set of solutions to common database performance problems.

SAS Statistics by Example

In SAS Statistics by Example, Ron Cody offers up a cookbook approach for doing statistics with SAS. Structured specifically around the most commonly used statistical tasks or techniques--for example, comparing two means, ANOVA, and regression--this book provides an easy-to-follow, how-to approach to statistical analysis not found in other books.

For each statistical task, Cody includes heavily annotated examples using ODS Statistical Graphics procedures such as SGPLOT, SGSCATTER, and SGPANEL that show how SAS can produce the required statistics. Also, you will learn how to test the assumptions for all relevant statistical tests. Major topics featured include descriptive statistics, one- and two-sample tests, ANOVA, correlation, linear and multiple regression, analysis of categorical data, logistic regression, nonparametric techniques, and power and sample size.

This is not a book that teaches statistics. Rather, SAS Statistics by Example is perfect for intermediate to advanced statistical programmers who know their statistics and want to use SAS to do their analyses.

This book is part of the SAS Press program.