talk-data.com

Topic: data (5765 tagged)

Activities

5765 activities · Newest first

Practical Cassandra: A Developer’s Approach

Build and Deploy Massively Scalable, Super-fast Data Management Applications with Apache Cassandra

Practical Cassandra is the first hands-on developer’s guide to building Cassandra systems and applications that deliver breakthrough speed, scalability, reliability, and performance. Fully up to date, it reflects the latest versions of Cassandra, including Cassandra Query Language (CQL), which dramatically lowers the learning curve for Cassandra developers.

Pioneering Cassandra developers and DataStax MVPs Russell Bradberry and Eric Lubow walk you through every step of building a real production application that can store enormous amounts of structured, semi-structured, and unstructured data. Drawing on their exceptional expertise, Bradberry and Lubow share practical insights into issues ranging from querying to deployment, management, maintenance, monitoring, and troubleshooting. The authors cover key issues, from architecture to migration, and guide you through crucial decisions about configuration and data modeling. They provide tested sample code, detailed explanations of how Cassandra works “under the covers,” and new case studies from three cutting-edge users: Ooyala, Hailo, and eBay.

Coverage includes:

• Understanding Cassandra’s approach, architecture, key concepts, and primary use cases, and why it’s so blazingly fast
• Getting Cassandra up and running on single nodes and large clusters
• Applying the new design patterns, philosophies, and features that make Cassandra such a powerful data store
• Leveraging CQL to simplify your transition from SQL-based RDBMSes
• Deploying and provisioning through the cloud or on bare-metal hardware
• Choosing the right configuration options for each type of workload
• Tweaking Cassandra to get maximum performance from your hardware, OS, and JVM
• Mastering Cassandra’s essential tools for maintenance and monitoring
• Efficiently solving the most common problems with Cassandra deployment, operation, and application development

Predictive Modeling with SAS Enterprise Miner, 2nd Edition

Learn the theory behind and methods for predictive modeling using SAS Enterprise Miner.

Learn how to produce predictive models and prepare presentation-quality graphics in record time with Predictive Modeling with SAS Enterprise Miner: Practical Solutions for Business Applications, Second Edition.

If you are a graduate student, researcher, or statistician interested in predictive modeling; a data mining expert who wants to learn SAS Enterprise Miner; or a business analyst looking for an introduction to predictive modeling using SAS Enterprise Miner, you'll be able to develop predictive models quickly and effectively using the theory and examples presented in this book.

Author Kattamuri Sarma offers the theory behind, programming steps for, and examples of predictive modeling with SAS Enterprise Miner, along with exercises at the end of each chapter. You'll gain a comprehensive awareness of how to find solutions for your business needs. This second edition features expanded coverage of the SAS Enterprise Miner nodes, now including File Import, Time Series, Variable Clustering, Cluster, Interactive Binning, Principal Components, AutoNeural, DMNeural, Dmine Regression, Gradient Boosting, Ensemble, and Text Mining.

Develop predictive models quickly, learn how to test numerous models and compare the results, gain an in-depth understanding of predictive models and multivariate methods, and discover how to do in-depth analysis. Do it all with Predictive Modeling with SAS Enterprise Miner.

This book is part of the SAS Press program.

Programming Elastic MapReduce

Although you don’t need a large computing infrastructure to process massive amounts of data with Apache Hadoop, it can still be difficult to get started. This practical guide shows you how to quickly launch data analysis projects in the cloud by using Amazon Elastic MapReduce (EMR), the hosted Hadoop framework in Amazon Web Services (AWS). Authors Kevin Schmidt and Christopher Phillips demonstrate best practices for using EMR and various AWS and Apache technologies by walking you through the construction of a sample MapReduce log analysis application. Using code samples and example configurations, you’ll learn how to assemble the building blocks necessary to solve your biggest data analysis problems.

• Get an overview of the AWS and Apache software tools used in large-scale data analysis
• Go through the process of executing a Job Flow with a simple log analyzer
• Discover useful MapReduce patterns for filtering and analyzing data sets
• Use Apache Hive and Pig instead of Java to build a MapReduce Job Flow
• Learn the basics for using Amazon EMR to run machine learning algorithms
• Develop a project cost model for using Amazon EMR and other AWS tools

R for Everyone: Advanced Analytics and Graphics

Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals

Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you’ll need to accomplish 80 percent of modern data tasks.

Lander’s self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You’ll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you’ll construct several complete models, both linear and nonlinear, and use some data mining techniques. By the time you’re done, you won’t just know how to write R programs, you’ll be ready to tackle the statistical problems you care about most.

COVERAGE INCLUDES

• Exploring R, RStudio, and R packages
• Using R for math: variable types, vectors, calling functions, and more
• Exploiting data structures, including data.frames, matrices, and lists
• Creating attractive, intuitive statistical graphics
• Writing user-defined functions
• Controlling program flow with if, ifelse, and complex checks
• Improving program efficiency with group manipulations
• Combining and reshaping multiple datasets
• Manipulating strings using R’s facilities and regular expressions
• Creating normal, binomial, and Poisson probability distributions
• Programming basic statistics: mean, standard deviation, and t-tests
• Building linear, generalized linear, and nonlinear models
• Assessing the quality of models and variable selection
• Preventing overfitting, using the Elastic Net and Bayesian methods
• Analyzing univariate and multivariate time series data
• Grouping data via K-means and hierarchical clustering
• Preparing reports, slideshows, and web pages with knitr
• Building reusable R packages with devtools and Rcpp
• Getting involved with the R global community

The Definitive Guide to Warehousing: Managing the Storage and Handling of Materials and Products in the Supply Chain

This is the most authoritative and complete guide to planning, implementing, measuring, and optimizing world-class supply chain warehousing processes. Straight from the Council of Supply Chain Management Professionals (CSCMP), it explains each warehousing option, basic warehousing storage and handling operations, strategic planning, and the effects of warehousing design and service decisions on total logistics costs and customer service. This reference introduces crucial concepts including product handling, labor management, warehouse support, and extended value chain processes; facility ownership, planning, and strategy decisions; materials handling; warehouse management systems; Auto-ID, AGVs, and much more. Step by step, The Definitive Guide to Warehousing helps you optimize all facets of warehousing, one of the most pivotal areas of supply chain management.

Coverage includes:

• Basic warehousing management concepts and their essential role in demand fulfillment
• Key elements, processes, and interactions in warehousing operations management
• Principles and strategies for effectively planning and managing warehouse operations
• Principles and strategies for designing materials handling operations in warehousing facilities
• Critical roles of technology in managing warehouse operations and product flows
• Best practices for assessing the performance of warehousing operations using standard metrics and frameworks

IBM XIV Storage System Architecture and Implementation

This IBM® Redbooks® publication describes the concepts, architecture, and implementation of the IBM XIV® Storage System. The XIV Storage System is a scalable enterprise storage system that is based on a grid array of hardware components. It can attach to both Fibre Channel Protocol (FCP) and IP network Small Computer System Interface (iSCSI) capable hosts. This system is a good fit for clients who want to be able to grow capacity without managing multiple tiers of storage. The XIV Storage System is suited for mixed or random access workloads, including online transaction processing, video streaming, images, email, and emerging workload areas, such as Web 2.0 and storage cloud.

The focus of this edition is on the XIV Gen3 hardware Release 3.4, running Version 11.4 of the XIV system software. With this version, the XIV Storage System offers 4 TB drives and enhanced caching with optional 800 GB flash cache devices (solid-state drives (SSDs)) per module. With IBM XIV software Version 11.4, XIV Gen3 supports encryption for all capacity points. This version also scales out XIV snapshot management with the new Hyper-Scale Consistency, which coordinates concurrent snapshots of volumes that are spread across multiple XIV systems and belong to one application.

In the first few chapters of this book, we describe many of the unique and powerful concepts that form the basis of the XIV Storage System logical and physical architecture. We explain how the system is designed to eliminate direct dependencies between the hardware elements and the software that governs the system. In subsequent chapters, we explain the planning and preparation tasks that are required to deploy the system in your environment by using the intuitive, yet powerful XIV Storage Manager GUI or the XIV command-line interface (XCLI). We describe the performance characteristics of the XIV Storage System and present options that are available for alerting and monitoring, including an enhanced secure remote support capability.

This book is intended for IT professionals who want an understanding of the XIV Storage System. It also targets readers who need detailed advice on how to configure and use the system.

Big Data Application Architecture Q&A: A Problem - Solution Approach

Big Data Application Architecture Pattern Recipes provides an insight into heterogeneous infrastructures, databases, and visualization and analytics tools used for realizing the architectures of big data solutions. Its problem-solution approach helps in selecting the right architecture to solve the problem at hand. In the process of reading through these problems, you will learn to harness the power of new big data opportunities which various enterprises use to attain real-time profits.

Big Data Application Architecture Pattern Recipes answers one of the most critical questions of this time: 'How do you select the best end-to-end architecture to solve your big data problem?' The book deals with various mission-critical problems encountered by solution architects, consultants, and software architects while dealing with the myriad options available for implementing a typical solution, trying to extract insight from huge volumes of data in real time and across multiple relational and non-relational data types for clients from industries like retail, telecommunication, banking, and insurance. The patterns in this book provide the strong architectural foundation required to launch your next big data application.

The architectures for realizing these opportunities are based on relatively less expensive and heterogeneous infrastructures compared to the traditional monolithic and hugely expensive options that exist currently. This book describes and evaluates the benefits of heterogeneity, which brings with it multiple options for solving the same problem, evaluation of trade-offs, and validation of 'fitness-for-purpose' of the solution.

What you'll learn

• Major considerations in building a big data solution
• Big data application architecture problems for specific industries
• What are the components one needs to build an end-to-end big data solution?
• Does one really need a real-time big data solution or an off-line analytics batch solution?
• What are the operations and support architectures for a big data solution?
• What are the scalability considerations and options for a Hadoop installation?

Who this book is for

• CIOs, CTOs, enterprise architects, and software architects
• Consultants, solution architects, and information management (IM) analysts who want to architect a big data solution for their enterprise

Handbook of Graph Theory, 2nd Edition

With 34 new contributors, this best-selling handbook provides comprehensive coverage of the main topics in pure and applied graph theory. This second edition incorporates 14 new sections. Each chapter includes lists of essential definitions and facts, accompanied by examples, tables, remarks, and, in some cases, conjectures and open problems.

SAS 9.4 Language Reference, Second Edition

Provides conceptual information for the Base SAS language. Major topics include SAS keywords and naming conventions, SAS variables and expressions, error processing and debugging, SAS data sets and files, creating and customizing output, DATA step concepts and DATA step processing, reading raw data, and creating and managing SAS libraries.

Understanding Uncertainty, Revised Edition

Praise for the First Edition

"...a reference for everyone who is interested in knowing and handling uncertainty." (Journal of Applied Statistics)

The critically acclaimed First Edition of Understanding Uncertainty provided a study of uncertainty addressed to scholars in all fields, showing that uncertainty could be measured by probability, and that probability obeyed three basic rules that enabled uncertainty to be handled sensibly in everyday life. These ideas were extended to embrace the scientific method and to show how decisions, containing an uncertain element, could be rationally made.

Featuring new material, the Revised Edition remains the go-to guide for uncertainty and decision making, providing further applications at an accessible level including:

• A critical study of transitivity, a basic concept in probability
• A discussion of how the failure of the financial sector to use the proper approach to uncertainty may have contributed to the recent recession
• A consideration of betting, showing that a bookmaker's odds are not expressions of probability
• Applications of the book's thesis to statistics
• A demonstration that some techniques currently popular in statistics, like significance tests, may be unsound, even seriously misleading, because they violate the rules of probability

Understanding Uncertainty, Revised Edition is ideal for students studying probability or statistics and for anyone interested in one of the most fascinating and vibrant fields of study in contemporary science and mathematics.

Image Statistics in Visual Computing

To achieve the complex task of interpreting what we see, our brains rely on statistical regularities and patterns in visual data. Knowledge of these regularities can also be considerably useful in visual computing disciplines, such as computer vision, computer graphics, and image processing. The field of natural image statistics studies the regularities to exploit their potential and better understand human vision. Image Statistics in Visual Computing includes numerous color figures throughout. The authors keep the material accessible, providing mathematical definitions where appropriate to help readers understand the transforms that highlight statistical regularities present in images. The book also describes patterns that arise once the images are transformed and gives examples of applications that have successfully used statistical regularities. Numerous references enable readers to easily look up more information about a specific concept or application. A supporting website also offers additional information, including descriptions of various image databases suitable for statistics. Collecting state-of-the-art, interdisciplinary knowledge in one source, this book explores the relation of natural image statistics to human vision and shows how natural image statistics can be applied to visual computing. It encourages readers in both academic and industrial settings to develop novel insights and applications in all disciplines that relate to visual computing.

Oracle Database 12c New Features

Maximize the New and Improved Features of Oracle Database 12c

Written by Master Principal Database Expert, Oracle, and Oracle ACE Robert G. Freeman, this Oracle Press guide describes the myriad new and enhanced capabilities available in the latest Oracle Database release. Inside, you’ll find everything you need to know to get up and running quickly on Oracle Database 12c. Supported by running commentary from world-renowned Oracle expert Tom Kyte, and with additional contributions by Oracle experts Eric Yen and Scott Black, Oracle Database 12c New Features offers detailed coverage of:

• Installing Oracle Database 12c
• Architectural changes, such as Oracle Multitenant
• The most current information on upgrading and migrating to Oracle Database 12c
• The pre-upgrade information tool and parallel processing for database upgrades
• Oracle Real Application Clusters new features, such as Oracle Flex Cluster, Oracle Flex Automatic Storage Management, and Oracle Automatic Storage Management Cluster File System
• Oracle RMAN enhancements, including cross-platform backup and recovery
• Oracle Data Guard improvements, such as Fast Sync, and Oracle Active Data Guard new features, such as Far Sync
• SQL, PL/SQL, DML, and DDL new features
• Improvements to partitioning manageability, performance, and availability
• Advanced business intelligence and data warehousing capabilities
• Security enhancements, including privileges analysis, data redaction, and new administrative-level privileges
• Manageability, performance, and optimization improvements

IBM z/OS DFSMShsm Primer

DFSMShsm provides space, availability, and tape mount management functions in a storage device hierarchy for both system-managed and non-system-managed storage environments. DFSMShsm allows you to automate your storage management tasks, improving productivity by managing storage devices effectively. This IBM® Redbooks® publication provides technical storage specialists and storage administrators with the basic DFSMShsm knowledge for implementing and customizing DFSMShsm at the IBM z/OS® V1.13 level. Hints and tips about daily operation, monitoring, and tuning are included, as are Sysplex environment considerations. If you are implementing DFSMShsm for the first time, you will find valuable information for exploiting the DFSMShsm functions. If you are experienced, you can use this publication as an update on the latest DFSMShsm functions; it also shows how to use those functions in an existing DFSMShsm installation.

IMS Integration and Connectivity Across the Enterprise

This IBM® Redbooks® publication gives a broad understanding of IBM IMS™ integration and connectivity solutions to access applications and data stores across your enterprise architecture. If you are an application developer, architect, systems integrator, or systems programmer, this book contains important information pertaining to your responsibility to continue to include the proven performance, data integrity, and workload distribution available from IMS in selected projects across your enterprise.

This book updates and adds to the information in the following IBM Redbooks publications:

• IMS e-business Connectors: A Guide to IMS Connectivity, SG24-6514
• IMS Connectivity in an On Demand Environment: A Practical Guide to IMS Connectivity, SG24-6794
• Powering SOA Solutions with IMS, SG24-7662
• IBM IMS Version 12 Technical Overview, SG24-7972
• IMS 12: The IMS Catalog, REDP-4812
• Rethink Your Mainframe Applications: Reasons and Approaches for Extension, Transformation, and Growth, REDP-4938

Please note that the additional material referenced in the text is not available from IBM.

Web Cartography

Web mapping technologies continue to evolve at an incredible pace. Technology is but one facet of web map creation, however. Map design, aesthetics, and user interactivity are equally important for effective map communication. From interactivity to graphical user interface design, from symbolization choices to animation, and from layout to typeface and color selection, Web Cartography offers the first comprehensive overview and guide for designing beautiful and effective web maps for a variety of devices. Written for those with a basic understanding of mapmaking, but who may not have an in-depth knowledge of web design, this book explains how to create effective interaction, animation, and layouts for maps in online and mobile platforms. Concept-driven, this reference emphasizes cartographic principles for web and mobile map design over specific software techniques. It focuses on key design concepts that will remain true regardless of software technologies used. The book is supplemented with a website providing links to stellar web maps, video tutorials and lectures, do-it-yourself labs, map critique exercises, and links to others’ tutorials. Clear and concise, the book provides a nontechnical, approachable guide to map design for the web. It provides best practices for map communication, based on spatial data visualization and graphic design theory. By carefully avoiding overly technical jargon, it provides a solid launching pad from which students, practitioners, and innovators can begin to design aesthetically pleasing and intuitive web maps.

BizTalk 2013 Recipes: A Problem-Solution Approach, Second Edition

BizTalk 2013 Recipes provides ready-made solutions for BizTalk Server 2013 developers. The recipes in the book save you the effort of developing your own solutions to common problems that have been solved many times over. The solutions demonstrate sound practice, the result of hard-earned wisdom by those who have gone before. Presented in a step-by-step format with clear code examples and explanations, the solutions in BizTalk 2013 Recipes help you take advantage of new features and deeper capabilities in BizTalk Server 2013. You'll learn to integrate your solutions with the cloud, configure BizTalk on Azure, work with electronic data interchange (EDI), and deploy the growing range of adapters for integrating with the different systems and technologies that you will encounter. You'll find recipes covering all the core areas: schemas, maps, orchestrations, messaging, and more. BizTalk Server 2013 is Microsoft's market-leading platform for orchestrating process flow across disparate applications. BizTalk 2013 Recipes is your key to unlocking the full power of that platform.

What you'll learn

• Automate business processes across different systems in your enterprise.
• Build, test, and deploy complex maps and schemas.
• Implement the business rules engine (BRE).
• Develop business activity monitoring (BAM) solutions.
• Manage electronic data interchange (EDI) with trading partners.
• Monitor and troubleshoot automated processes.
• Deploy BizTalk to Azure and build cloud-based solutions.

Who this book is for

BizTalk 2013 Recipes is aimed at developers working in Microsoft BizTalk Server 2013. Experienced BizTalk developers will find great value in the information around new functionality in the 2013 release, such as that for cloud-based integrations. Those brand new to BizTalk will appreciate the clear examples of core functionality that help them understand how best to design and deploy BizTalk Server solutions.

Nonparametric Statistics for Social and Behavioral Sciences

Incorporating a hands-on pedagogical approach, this text presents the concepts, principles, and methods used in performing many nonparametric procedures. It also demonstrates practical applications of the most common nonparametric procedures using IBM's SPSS software. The text is the only current nonparametric book written specifically for students in the behavioral and social sciences. With examples of real-life research problems, it emphasizes sound research designs, appropriate statistical analyses, and accurate interpretations of results.