talk-data.com

Topic: data (5765 tagged)

Activity Trend: peak of 3 per quarter, 2020-Q1 to 2026-Q1

Activities

5765 activities · Newest first

TIBCO Spotfire: A Comprehensive Primer - Second Edition

Explore the possibilities of TIBCO Spotfire with this comprehensive guide. You'll start with fundamental data visualization principles and progress to creating powerful, professional-grade analytics dashboards and applications. By following this book, you'll master both basic usage and advanced features such as predictive and spatial analytics.
What this book will help me do
Understand the fundamentals of TIBCO Spotfire and its various interfaces, including web and desktop clients.
Utilize Spotfire's range of visualization tools to effectively analyze and present data.
Develop robust analytics dashboards and applications tailored for enterprise needs.
Implement advanced features like predictive analytics and location-based data representations.
Learn strategies for deploying and administering Spotfire in a scalable, enterprise-oriented environment.
Author(s)
The authors, Berridge and Phillips, bring years of experience in business intelligence and data analytics. Their practical knowledge and real-world perspective shape the book into a practical resource for learning Spotfire. Their approach ensures that concepts are clearly explained with relatable examples, improving accessibility for all readers.
Who is it for?
This book is intended for business intelligence professionals, data analysts, and developers who aim to enhance their analytics skills using TIBCO Spotfire. It is suitable for beginners, as no prior experience with Spotfire or advanced analytics is required. Readers looking to develop enterprise-grade visualization and analytical solutions will find it valuable.

Nonparametric Statistical Process Control

A unique approach to understanding the foundations of statistical quality control with a focus on the latest developments in nonparametric control charting methodologies Statistical Process Control (SPC) methods have a long and successful history and have revolutionized many facets of industrial production around the world. This book addresses recent developments in statistical process control bringing the modern use of computers and simulations along with theory within the reach of both the researchers and practitioners. The emphasis is on the burgeoning field of nonparametric SPC (NSPC) and the many new methodologies developed by researchers worldwide that are revolutionizing SPC. Over the last several years research in SPC, particularly on control charts, has seen phenomenal growth. Control charts are no longer confined to manufacturing and are now applied for process control and monitoring in a wide array of applications, from education, to environmental monitoring, to disease mapping, to crime prevention. This book addresses quality control methodology, especially control charts, from a statistician’s viewpoint, striking a careful balance between theory and practice. Although the focus is on the newer nonparametric control charts, the reader is first introduced to the main classes of the parametric control charts and the associated theory, so that the proper foundational background can be laid. 
Reviews basic SPC theory and terminology, the different types of control charts, control chart design, sample size, sampling frequency, control limits, and more
Focuses on the distribution-free (nonparametric) charts for the cases in which the underlying process distribution is unknown
Provides guidance on control chart selection, choosing control limits and other quality related matters, along with all relevant formulas and tables
Uses computer simulations and graphics to illustrate concepts and explore the latest research in SPC
Offering a uniquely balanced presentation of both theory and practice, Nonparametric Methods for Statistical Quality Control is a vital resource for students, interested practitioners, researchers, and anyone with an appropriate background in statistics interested in learning about the foundations of SPC and latest developments in NSPC.

Fifty Years of Data Management and Beyond

Every decade since the 1960s, researchers at companies like IBM, Amazon, and many others have introduced major new frameworks and techniques to handle rising data management problems. This concise ebook explains how these new systems helped data science evolve quickly—from hierarchical and relational databases to big data and cloud computing to streaming and graph data. Computer scientist Paco Nathan shows members of your data science team how major companies created each of these data management systems not just to deal with new data types but also to take full advantage of the opportunities the data presented. Their efforts over the years have propelled an entire industry. This report covers the historical progression of data management topics including: Hierarchical databases—1960s mainframe batch systems are still used in finance, healthcare, manufacturing, energy, and other industries. Relational databases—these enabled faster transactions, mathematical optimization, and budgeting guarantees for many businesses. Big data—this includes relatively cheap horizontal scale-out systems for collecting huge amounts of customer data. Cloud computing—large companies began managing reliable, scalable, cost-effective data centers; Amazon turned the concept into a business. Cluster schedulers—managing horizontal clusters was difficult before schedulers such as Apache Mesos appeared. Streaming data—data continuously generated by different sources requires responses in "real time"—generally milliseconds.

Data Science and Engineering at Enterprise Scale

As enterprise-scale data science sharpens its focus on data-driven decision making and machine learning, new tools have emerged to help facilitate these processes. This practical ebook shows data scientists and enterprise developers how the notebook interface, Apache Spark, and other collaboration tools are particularly well suited to bridge the communication gap between their teams. Through a series of real-world examples, author Jerome Nilmeier demonstrates how to generate a model that enables data scientists and developers to share ideas and project code. You’ll learn how data scientists can approach real-world business problems with Spark and how developers can then implement the solution in a production environment.
Dive deep into data science technologies, including Spark, TensorFlow, and the Jupyter Notebook
Learn how Spark and Python notebooks enable data scientists and developers to work together
Explore how the notebook environment works with Spark SQL for structured data
Use notebooks and Spark as a launchpad to pursue supervised, unsupervised, and deep learning data models
Learn additional Spark functionality, including graph analysis and streaming
Explore the use of analytics in the production environment, particularly when creating data pipelines and deploying code

R Quick Syntax Reference: A Pocket Guide to the Language, APIs and Library

This handy reference book detailing the intricacies of R updates the popular first edition by adding R version 3.4 and 3.5 features. Starting with the basic structure of R, the book takes you on a journey through the terminology used in R and the syntax required to make R work. You will find looking up the correct form for an expression quick and easy. Some of the new material includes information on RStudio, S4 syntax, working with character strings, and an example using the Twitter API. With a copy of the R Quick Syntax Reference in hand, you will find that you are able to use the multitude of functions available in R and are even able to write your own functions to explore and analyze data.
What You Will Learn
Discover the modes and classes of R objects and how to use them
Use both packaged and user-created functions in R
Import/export data and create new data objects in R
Create descriptive functions and manipulate objects in R
Take advantage of flow control and conditional statements
Work with packages such as base, stats, and graphics
Who This Book Is For
Those with programming experience, either new to R, or those with at least some exposure to R but who are new to the latest version.

SQL All-In-One For Dummies, 3rd Edition

The latest on SQL databases
SQL All-In-One For Dummies, 3rd Edition, is a one-stop shop for everything you need to know about SQL and SQL-based relational databases. Everyone from database administrators to application programmers and the people who manage them will find clear, concise explanations of the SQL language and its many powerful applications. With the ballooning amount of data out there, more and more businesses, large and small, are moving from spreadsheets to SQL databases like Access, Microsoft SQL Server, Oracle databases, MySQL, and PostgreSQL. This compendium of information covers designing, developing, and maintaining these databases.
Cope with any issue that arises in SQL database creation and management
Get current on the newest SQL updates and capabilities
Reference information on querying SQL-based databases in the SQL language
Understand relational databases and their importance to today’s organizations
SQL All-In-One For Dummies is a timely update to the popular reference for readers who want detailed information about SQL databases and queries.

Statistics Workbook For Dummies with Online Practice, 2nd Edition

Practice your way to a higher statistics score
The adage that "practice makes perfect" is never truer than with math problems. Statistics Workbook For Dummies with Online Practice provides succinct content reviews for every topic, with plenty of examples and practice problems for each concept, in the book and online. Every lesson begins with a concept review, followed by a few example problems and plenty of practice problems. There's a step-by-step solution for every problem, with tips and tricks to help with comprehension and retention. New for this edition, free online practice quizzes for each chapter provide extra opportunities to test your knowledge and understanding.
Get FREE access to chapter quizzes in an online test bank
Work along with each chapter or use the test bank for final exam review
Discover which statistical measures are most meaningful
Scoring high in your Statistics class has never been easier!

IBM Spectrum Archive Enterprise Edition V1.2.6 Installation and Configuration Guide

Note: This is a republication of IBM Spectrum Archive Enterprise Edition V1.2.6: Installation and Configuration Guide with new book number SG24-8445 to keep the content available on the Internet along with the recent publication IBM Spectrum Archive Enterprise Edition V1.3.0: Installation and Configuration Guide, SG24-8333. This IBM® Redbooks® publication helps you with the planning, installation, and configuration of the new IBM Spectrum™ Archive V1.2.6 for the IBM TS3310, IBM TS3500, IBM TS4300, and IBM TS4500 tape libraries. IBM Spectrum Archive™ EE enables the use of the LTFS for the policy management of tape as a storage tier in an IBM Spectrum Scale™ based environment. It helps encourage the use of tape as a critical tier in the storage environment. This is the sixth edition of IBM Spectrum Archive Installation and Configuration Guide. IBM Spectrum Archive EE can run any application that is designed for disk files on a physical tape media. IBM Spectrum Archive EE supports the IBM Linear Tape-Open (LTO) Ultrium 8, 7, 6, and 5 tape drives in IBM TS3310, TS3500, TS4300, and TS4500 tape libraries. In addition, IBM TS1155, TS1150, and TS1140 tape drives are supported in TS3500 and TS4500 tape library configurations. IBM Spectrum Archive EE can play a major role in reducing the cost of storage for data that does not need the access performance of primary disk. The use of IBM Spectrum Archive EE to replace disks with physical tape in tier 2 and tier 3 storage can improve data access over other storage solutions because it improves efficiency and streamlines management for files on tape. IBM Spectrum Archive EE simplifies the use of tape by making it transparent to the user and manageable by the administrator under a single infrastructure. This publication is intended for anyone who wants to understand more about IBM Spectrum Archive EE planning and implementation. 
This book is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

IBM TS7700 Release 4.2 Guide

This IBM® Redbooks® publication covers IBM TS7700 R4.2. The IBM TS7700 is part of a family of IBM Enterprise tape products. This book is intended for system architects and storage administrators who want to integrate their storage systems for optimal operation. Building on over 20 years of virtual tape experience, the TS7760 now supports the ability to store virtual tape volumes in an object store. The TS7700 has supported off loading to physical tape for over two decades. Off loading to physical tape behind a TS7700 is utilized by hundreds of organizations around the world. Using the same hierarchical storage techniques, the TS7700 can also off load to object storage. Given object storage is cloud based and accessible from different regions, the TS7760 Cloud Storage Tier support essentially allows the cloud to be an extension of the grid. As of the release of this document, the TS7760C supports the ability to off load to IBM Cloud Object Storage as well as Amazon S3. To learn about the TS7760 cloud storage tier function, planning, implementation, best practices, and support see IBM Redpaper IBM TS7760 R4.2 Cloud Storage Tier Guide, redp-5514 at: http://www.redbooks.ibm.com/abstracts/redp5514.html The IBM TS7700 offers a modular, scalable, and high-performance architecture for mainframe tape virtualization for the IBM Z® environment. It is a fully integrated, tiered storage hierarchy of disk and tape. This storage hierarchy is managed by robust storage management microcode with extensive self-management capability. 
It includes the following advanced functions:
Improved reliability and resiliency
Reduction in the time that is needed for the backup and restore process
Reduction of services downtime that is caused by physical tape drive and library outages
Reduction in cost, time, and complexity by moving primary workloads to virtual tape
More efficient procedures for managing daily backup and restore processing
Infrastructure simplification through reduction of the number of physical tape libraries, drives, and media
TS7700 delivers the following new capabilities:
TS7760C supports the ability to off load to IBM Cloud Object Storage as well as Amazon S3
8-way Grid Cloud consisting of any generation of TS7700
Synchronous and asynchronous replication
Tight integration with IBM Z and DFSMS policy management
Optional Transparent Cloud Tiering
Optional integration with physical tape
Cumulative 16Gb FICON throughput up to 4.8GB/s × 8
IBM Z hosts view up to 496 × 8 equivalent devices
Grid access to all data independent of where it exists
The TS7760T writes data by policy to physical tape through attachment to high-capacity, high-performance IBM TS1150 and IBM TS1140 tape drives installed in an IBM TS4500 or TS3500 tape library. The TS7760 models are based on high-performance and redundant IBM POWER8® technology. They provide improved performance for most IBM Z tape workloads when compared to the previous generations of IBM TS7700.

Learn RStudio IDE: Quick, Effective, and Productive Data Science

Discover how to use the popular RStudio IDE as a professional tool that includes code refactoring support, debugging, and Git version control integration. This book gives you a tour of RStudio and shows you how it helps you do exploratory data analysis; build data visualizations with ggplot; and create custom R packages and web-based interactive visualizations with Shiny. In addition, you will cover common data analysis tasks including importing data from diverse sources such as SAS files, CSV files, and JSON. You will map out the features in RStudio so that you will be able to customize RStudio to fit your own style of coding. Finally, you will see how to save a ton of time by adopting best practices and using packages to extend RStudio. Learn RStudio IDE is a quick, no-nonsense tutorial of RStudio that will give you a head start to develop the insights you need in your data science projects.
What You Will Learn
Quickly, effectively, and productively use RStudio IDE for building data science applications
Install RStudio and program your first Hello World application
Adopt the RStudio workflow
Make your code reusable using RStudio
Use RStudio and Shiny for data visualization projects
Debug your code with RStudio
Import CSV, SPSS, SAS, JSON, and other data
Who This Book Is For
Programmers who want to start doing data science, but don’t know what tools to focus on to get up to speed quickly.

IBM DS8880 Encryption for data at rest and Transparent Cloud Tiering (DS8000 Release 8.5)

Updated for Release 8.5. IBM experts recognize the need for data protection, from hardware or software failures and from physical relocation of hardware, theft, and retasking of existing hardware. The IBM DS8880 supports encryption-capable hard disk drives (HDDs) and flash drives. These Full Disk Encryption (FDE) drive sets are used with key management services that are provided by IBM Security Key Lifecycle Manager software or Gemalto SafeNet KeySecure to allow encryption for data at rest on a DS8880. Use of encryption technology involves several considerations that are critical for you to understand to maintain the security and accessibility of encrypted data. The IBM Security Key Lifecycle Manager software also supports Transparent Cloud Tiering (TCT) data object encryption, which this publication also covers. With TCT encryption, data is encrypted before it is transmitted to the cloud. The data remains encrypted in cloud storage and is decrypted after it is transmitted back to the DS8000®. This IBM Redpaper™ publication contains information that can help storage administrators plan for disk and TCT data object encryption. It also explains how to install and manage the encrypted storage and how to comply with IBM requirements for using the IBM DS8000 encrypted disk storage system. This edition focuses on IBM Security Key Lifecycle Manager Version 3.0, which enables support for the Key Management Interoperability Protocol (KMIP) with the DS8000 Release 8.5 code or later, and the updated GUI for encryption functions. The publication also discusses support for data at rest encryption with Gemalto SafeNet KeySecure Version 8.3.2.

IBM Storage Solutions for IBM Cloud Private Blueprint

IBM Storage Solutions for IBM Cloud™ Private delivers a blueprint for multicloud architecture. IBM, delivering solutions to help you win. In this blueprint, learn how to: Combine the benefits of IBM Systems with the performance of IBM Storage solutions so that you can deliver the right services to your clients today. Deliver optimized private cloud services ahead of schedule and under budget with a complete IBM Cloud Private stack. Containerize applications and deliver the SLAs that your team needs to thrive and win. Implement IBM Cloud Private to deploy modern applications like blockchain and AI or modernize what you already have. You now have the capabilities. This edition applies to IBM Storage Solutions for IBM Cloud Private Version 1 Release 5.0.

IBM Hyper-Scale Manager for IBM Spectrum Accelerate Family: IBM XIV, IBM FlashSystem A9000 and A9000R, and IBM Spectrum Accelerate

This IBM® Redbooks® publication describes storage management functions and their configuration and use with the IBM Hyper-Scale Manager management graphical user interface (GUI) for IBM XIV® Gen3, IBM FlashSystem® A9000 and A9000R, and IBM Spectrum™ Accelerate software. The web-based GUI provides a revolutionary object-centered interface design that is aimed toward ease of use together with enhanced efficiency for storage administrators. The first chapter describes general features of the GUI and installation of the IBM Hyper-Scale Manager server. Subsequent chapters illustrate some typical GUI actions, among many other possibilities, to manage and configure the storage systems, to define security roles, and to set up multitenancy. For most of the GUI-based actions that are illustrated in this book, the corresponding XIV Storage System command-line interface (XCLI) commands are also shown. This edition applies to IBM Hyper-Scale Manager V5.4. IBM Hyper-Scale Manager based GUI information regarding host attachment and replication is covered in IBM FlashSystem A9000, IBM FlashSystem A9000R, and IBM XIV Storage System: Host Attachment and Interoperability, SG24-8368 and IBM FlashSystem A9000 and A9000R Business Continuity Solutions, REDP-5401. See also IBM HyperSwap and Multi-site HA/DR for IBM FlashSystem A9000 and A9000R, REDP-5434.

Implementing IBM FlashSystem 900 Model AE3

Today's global organizations depend on being able to unlock business insights from massive volumes of data. Now, with IBM® FlashSystem 900 Model AE3 that is powered by IBM FlashCore® technology, they can make faster decisions that are based on real-time insights. They also can unleash the power of the most demanding applications, including online transaction processing (OLTP) and analytics databases, virtual desktop infrastructures (VDIs), technical computing applications, and cloud environments. This IBM Redbooks® publication introduces clients to the IBM FlashSystem® 900 Model AE3. It provides in-depth knowledge of the product architecture, software and hardware, implementation, and hints and tips. Also presented are use cases that show real-world solutions for tiering, flash-only, and preferred-read. Examples of the benefits that are gained by integrating the FlashSystem storage into business environments also are described. This book is intended for pre-sales and post-sales technical support professionals and storage administrators, and anyone who wants to understand how to implement this new and exciting technology.

Data Science for Business and Decision Making

Data Science for Business and Decision Making covers both statistics and operations research while most competing textbooks focus on one or the other. As a result, the book more clearly defines the principles of business analytics for those who want to apply quantitative methods in their work. Its emphasis reflects the importance of regression, optimization and simulation for practitioners of business analytics. Each chapter uses a didactic format that is followed by exercises and answers. Freely-accessible datasets enable students and professionals to work with Excel, Stata Statistical Software®, and IBM SPSS Statistics Software®.
Combines statistics and operations research modeling to teach the principles of business analytics
Written for students who want to apply statistics, optimization and multivariate modeling to gain competitive advantages in business
Shows how powerful software packages, such as SPSS and Stata, can create graphical and numerical outputs

Stream Processing with Apache Flink

Get started with Apache Flink, the open source framework that powers some of the world’s largest stream processing applications. With this practical book, you’ll explore the fundamental concepts of parallel stream processing and discover how this technology differs from traditional batch data processing. Longtime Apache Flink committers Fabian Hueske and Vasia Kalavri show you how to implement scalable streaming applications with Flink’s DataStream API and continuously run and maintain these applications in operational environments. Stream processing is ideal for many use cases, including low-latency ETL, streaming analytics, and real-time dashboards as well as fraud detection, anomaly detection, and alerting. You can process continuous data of any kind, including user interactions, financial transactions, and IoT data, as soon as you generate them.
Learn concepts and challenges of distributed stateful stream processing
Explore Flink’s system architecture, including its event-time processing mode and fault-tolerance model
Understand the fundamentals and building blocks of the DataStream API, including its time-based and stateful operators
Read data from and write data to external systems with exactly-once consistency
Deploy and configure Flink clusters
Operate continuously running streaming applications

Data Science Using Python and R

Learn data science by doing data science! Data Science Using Python and R will get you plugged into the world’s two most widespread open-source platforms for data science: Python and R. Data science is hot. Bloomberg called data scientist “the hottest job in America.” Python and R are the top two open-source data science tools in the world. In Data Science Using Python and R, you will learn step-by-step how to produce hands-on solutions to real-world business problems, using state-of-the-art techniques. Data Science Using Python and R is written for the general reader with no previous analytics or programming experience. An entire chapter is dedicated to learning the basics of Python and R. Then, each chapter presents step-by-step instructions and walkthroughs for solving data science problems using Python and R. Those with analytics experience will appreciate having a one-stop shop for learning how to do data science using Python and R. Topics covered include data preparation, exploratory data analysis, preparing to model the data, decision trees, model evaluation, misclassification costs, naïve Bayes classification, neural networks, clustering, regression modeling, dimension reduction, and association rules mining. Further, exciting new topics such as random forests and general linear models are also included. The book emphasizes data-driven error costs to enhance profitability, which avoids the common pitfalls that may cost a company millions of dollars. Data Science Using Python and R provides exercises at the end of every chapter, totaling over 500 exercises in the book. Readers will therefore have plenty of opportunity to test their newfound data science skills and expertise. In the Hands-on Analysis exercises, readers are challenged to solve interesting business problems using real-world data sets.

Social-Behavioral Modeling for Complex Systems

This volume describes frontiers in social-behavioral modeling for contexts as diverse as national security, health, and on-line social gaming. Recent scientific and technological advances have created exciting opportunities for such improvements. However, the book also identifies crucial scientific, ethical, and cultural challenges to be met if social-behavioral modeling is to achieve its potential. Doing so will require new methods, data sources, and technology. The volume discusses these, including those needed to achieve and maintain high standards of ethics and privacy. The result should be a new generation of modeling that will advance science and, separately, aid decision-making on major social and security-related subjects despite the myriad uncertainties and complexities of social phenomena. Intended to be relatively comprehensive in scope, the volume balances theory-driven, data-driven, and hybrid approaches. The latter may be rapidly iterative, as when artificial-intelligence methods are coupled with theory-driven insights to build models that are sound, comprehensible and usable in new situations. With the intent of being a milestone document that sketches a research agenda for the next decade, the volume draws on the wisdom, ideas and suggestions of many noted researchers who draw in turn from anthropology, communications, complexity science, computer science, defense planning, economics, engineering, health systems, medicine, neuroscience, physics, political science, psychology, public policy and sociology. In brief, the volume discusses:
Cutting-edge challenges and opportunities in modeling for social and behavioral science
Special requirements for achieving high standards of privacy and ethics
New approaches for developing theory while exploiting both empirical and computational data
Issues of reproducibility, communication, explanation, and validation
Special requirements for models intended to inform decision making about complex social systems

Getting Started with Linux on Z Encryption for Data At-Rest

This IBM® Redbooks® publication provides a general explanation of data protection through encryption and IBM Z® pervasive encryption with a focus on Linux on IBM Z encryption for data at-rest. It also describes how the various hardware and software components interact in a Linux on Z encryption environment. In addition, this book concentrates on planning and preparing the environment. It offers implementation, configuration, and operational examples that can be used in Linux on Z volume encryption environments. This publication is intended for IT architects, system administrators, and security administrators who plan for, deploy, and manage security on the Z platform. The reader is expected to have a basic understanding of IBM Z security concepts.

IBM FlashCore Module Cryptographic Erase

IBM® FlashCore Modules (FCMs) are storage devices that are available in 4.8 TB, 9.6 TB, and 19.2 TB capacities. They use a 2.5-inch drive form factor and store data on second-generation 3D triple-level cell (TLC) flash memory. This paper describes the cryptographic erasure of data that is stored on these devices when used in an IBM FlashSystem® 9100 (9846-AF7, 9846-AF8, 9848-AF7, 9848-AF8, 9848-UF7, and 9848-UF8), or IBM Storwize® V5100 (2077-424, 2077-A4F, 2078-424, and 2078-A4F).