talk-data.com

Topic: data · 5765 tagged activities

Activity Trend: 2020-Q1 to 2026-Q1 (peak 3/qtr)

Activities

5765 activities · Newest first

Practical Synthetic Data Generation

Building and testing machine learning models requires access to large and diverse data. But where can you find usable datasets without running into privacy issues? This practical book introduces techniques for generating synthetic data (fake data generated from real data) so you can perform secondary analysis to do research, understand customer behaviors, develop new products, or generate new revenue. Data scientists will learn how synthetic data generation provides a way to make such data broadly available for secondary purposes while addressing many privacy concerns. Analysts will learn the principles and steps for generating synthetic data from real datasets. And business leaders will see how synthetic data can help accelerate time to a product or solution. This book describes:

- Steps for generating synthetic data using multivariate normal distributions
- Methods for distribution fitting covering different goodness-of-fit metrics
- How to replicate the simple structure of original data
- An approach for modeling data structure to consider complex relationships
- Multiple approaches and metrics you can use to assess data utility
- How analysis performed on real data can be replicated with synthetic data
- Privacy implications of synthetic data and methods to assess identity disclosure
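
The first item above, generating synthetic data from a fitted multivariate normal distribution, can be illustrated with a short generic sketch. This is not code from the book; the data, dimensions, and utility check are placeholders:

```python
# Minimal sketch: fit a multivariate normal to numeric "real" data and sample
# synthetic rows with the same means and covariance structure.
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for a real dataset: 1,000 rows of correlated numeric features.
real = rng.multivariate_normal(
    mean=[50.0, 30.0, 12.0],
    cov=[[25.0, 8.0, 2.0],
         [8.0, 16.0, 3.0],
         [2.0, 3.0, 9.0]],
    size=1000,
)

# Step 1: estimate the distribution parameters from the real data.
mu = real.mean(axis=0)
sigma = np.cov(real, rowvar=False)

# Step 2: sample synthetic records from the fitted multivariate normal.
synthetic = rng.multivariate_normal(mean=mu, cov=sigma, size=1000)

# Step 3: a crude utility check - compare means and pairwise correlations.
print("max mean difference:", np.abs(real.mean(axis=0) - synthetic.mean(axis=0)).max())
print("max corr difference:", np.abs(np.corrcoef(real, rowvar=False)
                                     - np.corrcoef(synthetic, rowvar=False)).max())
```

A real pipeline would add per-variable distribution fitting, handling of categorical fields, and the utility and identity-disclosure assessments the book describes.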

Securing Data on Threat Detection Using IBM Spectrum Scale and IBM QRadar: An Enhanced Cyber Resiliency Solution

Having appropriate storage for hosting business-critical data and advanced Security Information and Event Management (SIEM) software for deep inspection, detection, and prioritization of threats has become a necessity for any business. This IBM® Redpaper publication explains how the storage features of IBM Spectrum® Scale, when combined with the log analysis, deep inspection, and detection of threats that are provided by IBM QRadar®, help reduce the impact of incidents on business data. Such integration provides an excellent platform for hosting unstructured business data that is subject to regulatory compliance requirements. This paper describes how IBM Spectrum Scale File Audit Logging can be integrated with IBM QRadar. Using IBM QRadar, an administrator can monitor, inspect, detect, and derive insights for identifying potential threats to the data that is stored on IBM Spectrum Scale. When threats are identified, you can quickly act on them to mitigate or reduce the impact of incidents. We further demonstrate how threat detection by IBM QRadar can proactively trigger data snapshots or a cyber resiliency workflow in IBM Spectrum Scale to protect the data during a threat. This paper is intended for chief technology officers, solution engineers, security architects, and systems administrators.

Optimize the Value of Your Data with Oracle and IBM Flash Storage Solutions

In this multicloud and cognitive era, information continues to grow rapidly. By 2025, IDC says worldwide data will grow by 61% to 175 zettabytes, with as much of the data in data centers as in the cloud. IT environments with Oracle deployments will need to accommodate that data growth, including storing, copying, mirroring, and protecting the data. When IT budgets are constrained but data keeps growing, storage costs can consume more than their fair share of the IT budget. The leading-edge portfolio of storage solutions and essential technologies of IBM® can help organizations stay ahead of the information explosion. Designed with built-in efficiency, these solutions represent preferred practices that address the following main storage objectives for hybrid multicloud environments:

- Stop storing so much
- Store more with what you have
- Move Oracle and related data to balance performance and efficiency

IBM offers true enterprise-class storage support for Oracle deployments at a low total cost of ownership (TCO). With flash disk, tape, storage network hardware, a consolidated management console, software-defined storage solutions, and security software, IBM can provide Oracle customers the full spectrum of products to meet their availability, retention, security, and compliance requirements.

IBM AIX Enhancements and Modernization

This IBM® Redbooks publication is a comprehensive guide to the IBM AIX® operating system (OS): its layout capabilities, distinct features, system installation, and maintenance, including AIX security, the trusted environment, and compliance integration, together with the benefits of IBM Power Virtualization Management (PowerVM®) and IBM Power Virtualization Center (IBM PowerVC), including cloud capabilities and automation types. The objective of this book is to introduce IBM AIX modernization features and integration with different environments:

- General AIX enhancements
- AIX Live Kernel Update, individually or using Network Installation Manager (NIM)
- AIX security features and integration
- AIX networking enhancements
- PowerVC integration and features for cloud environments
- AIX deployment using IBM Terraform and IBM Cloud Automation Manager
- AIX automation that uses configuration management tools
- PowerVM enhancements and features
- Latest disaster recovery (DR) solutions
- AIX Logical Volume Manager (LVM) and Enhanced Journaled File System (JFS2)
- AIX installation and maintenance techniques

Applied Numerical Methods Using MATLAB, 2nd Edition

This new edition provides an updated approach for students, engineers, and researchers to apply numerical methods to solving problems using MATLAB®. This accessible book makes use of MATLAB® software to teach the fundamental concepts for applying numerical methods to solve practical engineering and/or science problems. It presents programs in a complete form so that readers can run them instantly with no programming skill, allowing them to focus on understanding the mathematical manipulation process and making interpretations of the results. Applied Numerical Methods Using MATLAB®, Second Edition begins with an introduction to MATLAB usage and computational errors, covering everything from input/output of data, to various kinds of computing errors, and on to parameter sharing and passing, and more. The system of linear equations is covered next, followed by a chapter on interpolation by Lagrange polynomial. The next sections look at interpolation and curve fitting, nonlinear equations, numerical differentiation/integration, ordinary differential equations, and optimization. Numerous methods, such as Simpson, Euler, Heun, Runge-Kutta, Golden Search, and Nelder-Mead, are covered in those chapters. The eighth chapter covers matrices, eigenvalues, and eigenvectors. The book finishes with a complete overview of differential equations. The book:

- Provides examples and problems of solving electronic circuits and neural networks
- Includes new sections on adaptive filters, recursive least-squares estimation, Bairstow's method for a polynomial equation, and more
- Explains Mixed Integer Linear Programming (MILP) and DOA (Direction of Arrival) estimation with eigenvectors
- Is aimed at students who do not like and/or do not have time to derive and prove mathematical results

Applied Numerical Methods Using MATLAB®, Second Edition is an excellent text for students who wish to develop their problem-solving capability without being involved in details about the MATLAB codes. It will also be useful to those who want to delve deeper into understanding underlying algorithms and equations.
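
As a rough illustration of the kind of ODE solvers the blurb lists (Euler, Runge-Kutta), here is a minimal sketch in Python rather than the book's MATLAB; the test equation and step size are arbitrary and the code is not from the book:

```python
# Compare forward Euler and classical 4th-order Runge-Kutta on dy/dt = -2y,
# y(0) = 1, whose exact solution is y(t) = exp(-2t).
import math

def euler_step(f, t, y, h):
    return y + h * f(t, y)

def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h * k1 / 2)
    k3 = f(t + h / 2, y + h * k2 / 2)
    k4 = f(t + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

f = lambda t, y: -2.0 * y
h, steps = 0.1, 10
y_euler = y_rk4 = 1.0
for i in range(steps):
    t = i * h
    y_euler = euler_step(f, t, y_euler, h)
    y_rk4 = rk4_step(f, t, y_rk4, h)

exact = math.exp(-2.0 * steps * h)
# The Euler error is several orders of magnitude larger than the RK4 error.
print(f"Euler error: {abs(y_euler - exact):.2e}")
print(f"RK4 error:   {abs(y_rk4 - exact):.2e}")
```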

Forensic Analytics, 2nd Edition

Become the forensic analytics expert in your organization using effective and efficient data analysis tests to find anomalies, biases, and potential fraud, in this updated new edition. Forensic Analytics reviews the methods and techniques that forensic accountants can use to detect intentional and unintentional errors, fraud, and biases. This updated second edition shows accountants and auditors how analyzing their corporate or public sector data can highlight transactions, balances, or subsets of transactions or balances in need of attention. These tests are made up of a set of initial high-level overview tests followed by a series of more focused tests. These focused tests use a variety of quantitative methods including Benford’s Law, outlier detection, the detection of duplicates, a comparison to benchmarks, time-series methods, risk-scoring, and sometimes simply statistical logic. The tests in the new edition include the newly developed vector variation score that quantifies the change in an array of data from one period to the next. The goals of the tests are to produce either a small sample of suspicious transactions, a small set of transaction groups, or a risk score related to individual transactions or a group of items. The new edition includes over two hundred figures. Each chapter, where applicable, includes one or more cases showing how the tests under discussion could have detected the fraud or anomalies. The new edition also includes two chapters, each describing a multi-million-dollar fraud scheme and the insights that can be learned from those examples. These interesting real-world examples help to make the text accessible and understandable for accounting professionals and accounting students without rigorous backgrounds in mathematics and statistics. Emphasizing practical applications, the new edition shows how to use either Excel or Access to run these analytics tests. The book also has some coverage of using Minitab, IDEA, R, and Tableau to run forensic-focused tests. The use of SAS and Power BI rounds out the software coverage. The software screenshots use the latest versions of the software available at the time of writing. This authoritative book:

- Describes the use of statistically based techniques, including Benford’s Law, descriptive statistics, and the vector variation score, to detect errors and anomalies
- Shows how to run most of the tests in Access and Excel, and other data analysis software packages for a small sample of the tests
- Applies the tests under review in each chapter to the same purchasing card data from a government entity
- Includes interesting case studies throughout that are linked to the tests being reviewed
- Includes two comprehensive case studies where data analytics could have detected the frauds before they reached multi-million-dollar levels
- Includes a continually updated companion website with the data sets used in the chapters, the queries used in the chapters, extra coverage of some topics or cases, end-of-chapter questions, and end-of-chapter cases

Written by a prominent educator and researcher in forensic accounting and auditing, the new edition of Forensic Analytics: Methods and Techniques for Forensic Accounting Investigations is an essential resource for forensic accountants, auditors, comptrollers, fraud investigators, and graduate students.
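
Benford’s Law, one of the quantitative methods named above, is easy to sketch generically. The following Python snippet is not from the book (which works in Excel, Access, and other packages), and the "transaction amounts" are random placeholders:

```python
# Compare the first-digit distribution of a set of amounts against
# Benford's Law, P(d) = log10(1 + 1/d).
import math
import random
from collections import Counter

random.seed(0)
# Placeholder "transaction amounts"; log-uniform data roughly follows Benford.
amounts = [10 ** random.uniform(1, 5) for _ in range(5000)]

first_digits = [int(str(a)[0]) for a in amounts]  # all amounts are >= 10
observed = Counter(first_digits)
n = len(first_digits)

print("digit  observed  benford")
for d in range(1, 10):
    expected = math.log10(1 + 1 / d)
    print(f"{d:5d}  {observed[d] / n:8.3f}  {expected:7.3f}")
```

In practice the interesting output is the set of digits whose observed proportions deviate most from the expected Benford proportions, which flags subsets of transactions for closer review.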

Innovative Tableau

Level up with Tableau to build eye-catching, easy-to-interpret data visualizations. In this follow-up guide to Practical Tableau, author Ryan Sleeper takes you through a collection of unique tips and tutorials for using this popular software. Beginning to advanced Tableau users will learn how to go beyond Show Me to make better charts and learn dozens of tricks to improve both the author and user experience. Featuring many approaches he developed himself, Ryan shows you how to create charts that empower Tableau users to explore, understand, and derive value from their data. He also shares many of his favorite tricks that enabled him to become a Tableau Zen Master, Tableau Public Visualization of the Year author, and Tableau Global Iron Viz Champion.

- Learn what’s new in Tableau since Practical Tableau was released
- Examine unique new charts (timelines, custom gauges, and leapfrog charts) plus innovations to traditional charts such as highlight tables, scatter plots, and maps
- Get tips that can help make a Tableau developer’s life easier
- Understand what developers can do to make users’ lives easier

Implementing IBM Spectrum Virtualize for Public Cloud Version 8.3

IBM® Spectrum Virtualize is a key member of the IBM Spectrum™ Storage portfolio. It is a highly flexible storage solution that enables rapid deployment of block storage services for new and traditional workloads, on-premises, off-premises, and in a combination of both. IBM Spectrum Virtualize™ for Public Cloud provides the IBM Spectrum Virtualize functionality in IBM Cloud™. This new capability provides a monthly license to deploy and use Spectrum Virtualize in IBM Cloud to enable hybrid cloud solutions, offering the ability to transfer data between on-premises private clouds or data centers and the public cloud. This IBM Redpaper™ publication gives a broad understanding of the IBM Spectrum Virtualize for Public Cloud architecture and provides planning and implementation details for the common use cases of this product. This publication helps storage and networking administrators plan, install, tailor, and configure the IBM Spectrum Virtualize for Public Cloud offering. It also provides a detailed description of troubleshooting tips. IBM Spectrum Virtualize is also available on AWS. For more information, see Implementation guide for IBM Spectrum Virtualize for Public Cloud on AWS, REDP-5534.

Practical Statistics for Data Scientists, 2nd Edition

Statistical methods are a key part of data science, yet few data scientists have formal statistical training. Courses and books on basic statistics rarely cover the topic from a data science perspective. The second edition of this popular guide adds comprehensive examples in Python, provides practical guidance on applying statistical methods to data science, tells you how to avoid their misuse, and gives you advice on what’s important and what’s not. Many data science resources incorporate statistical methods but lack a deeper statistical perspective. If you’re familiar with the R or Python programming languages and have some exposure to statistics, this quick reference bridges the gap in an accessible, readable format. With this book, you’ll learn:

- Why exploratory data analysis is a key preliminary step in data science
- How random sampling can reduce bias and yield a higher-quality dataset, even with big data
- How the principles of experimental design yield definitive answers to questions
- How to use regression to estimate outcomes and detect anomalies
- Key classification techniques for predicting which categories a record belongs to
- Statistical machine learning methods that "learn" from data
- Unsupervised learning methods for extracting meaning from unlabeled data
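
Two of the ideas listed above, random sampling and resampling-based inference, can be sketched in a few lines of Python. This is a generic illustration on placeholder data, not an excerpt from the book:

```python
# Use simple random sampling and the bootstrap to estimate a population mean
# and its uncertainty; the "population" here is synthetic skewed data.
import numpy as np

rng = np.random.default_rng(7)
population = rng.lognormal(mean=3.0, sigma=0.8, size=100_000)

# Simple random sample instead of, say, taking the first N records (which can
# encode hidden ordering bias).
sample = rng.choice(population, size=500, replace=False)

# Bootstrap: resample the sample with replacement to estimate the sampling
# distribution of the mean.
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(2000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])

print(f"sample mean: {sample.mean():.2f}")
print(f"95% bootstrap CI: ({lo:.2f}, {hi:.2f})")
print(f"true population mean: {population.mean():.2f}")
```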

IBM Power Systems Infrastructure I/O for SAP Applications

This IBM® Redpaper publication describes practical experiences of running SAP workloads that take advantage of IBM Power Systems I/O capabilities. With IBM POWER® processor-based servers, you have the flexibility to seamlessly fit new applications and workloads into a single data center, and even consolidate them into a single server. The paper highlights all viable options and describes the pros and cons of each one so that you can select the correct option for a specific data center. The target audiences of this book are architects, IT specialists, and systems administrators deploying SAP workloads, who spend much time and effort managing, provisioning, and monitoring SAP software systems and landscapes on IBM Power Systems servers.

MOS Study Guide for Microsoft Access Expert Exam MO-500

Advance your everyday proficiency with Access 2019. And earn the credential that proves it! Demonstrate your expertise with Microsoft Access! Designed to help you practice and prepare for Microsoft Office Specialist (MOS): Access 2019 certification, this official Study Guide delivers:

- In-depth preparation for each MOS objective
- Detailed procedures to help build the skills measured by the exam
- Hands-on tasks to practice what you've learned
- Practice files and sample solutions

Sharpen the skills measured by these objectives:

- Create and manage databases
- Build tables
- Create queries
- Create forms
- Create reports

About MOS: A Microsoft Office Specialist (MOS) certification validates your proficiency with Microsoft Office programs, demonstrating that you can meet globally recognized performance standards. Hands-on experience with the technology is required to successfully pass Microsoft Certification exams.

Geographical Modeling

The modeling of cities and territories has progressed greatly in the last 20 years. This is due firstly to geographic information systems, and secondly to the availability of large amounts of georeferenced data, both on the Internet and through the use of connected objects. In addition, the rise in performance of computational methods for the simulation and exploration of dynamic models has facilitated advancement. Geographical Modeling presents previously unpublished information on the main advances achieved by these new approaches. Each of the six chapters provides a bibliographic review and precisely describes the methods used, highlighting their advantages and discussing their interpretations. All are illustrated by many examples. The book also explains with clarity the theoretical foundations of geographical analysis, the delicate operations of model selection, and the applications of fractals and scaling laws. These applications include gaining knowledge of the morphology of cities and the organization of urban transport, and finding new methods of building and exploring simulation models and visualizations of data and results.
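
The scaling laws mentioned here are often illustrated with the rank-size rule for city populations. The following Python sketch is a generic example, not taken from the book, and the population figures are placeholders:

```python
# Fit a rank-size (Zipf-style) scaling law: log(population) ~ a - b * log(rank).
# An exponent b near 1 corresponds to the classic rank-size rule.
import numpy as np

populations = np.array([8_900_000, 3_700_000, 2_700_000, 2_300_000, 1_600_000,
                        1_500_000, 1_400_000, 1_300_000, 1_000_000, 950_000])
populations = np.sort(populations)[::-1]          # largest city first
ranks = np.arange(1, populations.size + 1)

slope, intercept = np.polyfit(np.log(ranks), np.log(populations), deg=1)
print(f"estimated scaling exponent: {-slope:.2f}")
```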

IBM FlashSystem 9200R Rack Solution Product Guide

The FlashSystem 9200 combines the performance of flash and end-to-end Non-Volatile Memory Express (NVMe) with the reliability and innovation of IBM® FlashCore technology, the ultra-low latency of Storage Class Memory (SCM), the rich features of IBM Spectrum® Virtualize and AI-driven predictive storage management, and proactive support by Storage Insights. All of these features are included in a powerful, blazing-fast 2U enterprise-class all-flash storage array.

Building a Unified Data Infrastructure

The vast majority of businesses today already have a documented data strategy. But only a third of these forward-thinking companies have evolved into data-driven organizations or even begun to move toward a data culture. Most have yet to treat data as a business asset, much less use data and analytics to compete in the marketplace. What’s the solution? This insightful report demonstrates the importance of creating a holistic data infrastructure approach. You’ll learn how data virtualization (DV), master data management (MDM), and metadata-management capabilities can help your organization meet business objectives. Chief data officers, enterprise architects, analytics leaders, and line-of-business executives will understand the benefits of combining these capabilities into a unified data platform.

- Explore three separate business contexts that depend on data: operations, analytics, and governance
- Learn a pragmatic and holistic approach to building a unified data infrastructure
- Understand the critical capabilities of this approach, including the ability to work with existing technology
- Apply six best practices for combining data management capabilities

ML Ops: Operationalizing Data Science

More than half of the analytics and machine learning (ML) models created by organizations today never make it into production. Instead, many of these ML models do nothing more than provide static insights in a slideshow. If they aren’t truly operational, these models can’t possibly do what you’ve trained them to do. This report introduces practical concepts to help data scientists and application engineers operationalize ML models to drive real business change. Through lessons based on numerous projects around the world, six experts in data analytics provide an applied four-step approach (Build, Manage, Deploy and Integrate, and Monitor) for creating ML-infused applications within your organization. You’ll learn how to:

- Fulfill data science value by reducing friction throughout ML pipelines and workflows
- Constantly refine ML models through retraining, periodic tuning, and even complete remodeling to ensure long-term accuracy
- Design the ML Ops lifecycle to ensure that people-facing models are unbiased, fair, and explainable
- Operationalize ML models not only for pipeline deployment but also for external business systems that are more complex and less standardized
- Put the four-step Build, Manage, Deploy and Integrate, and Monitor approach into action
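
The monitor-and-retrain idea implied by the list above can be sketched generically. The following Python example is illustrative only; the drift simulation, accuracy floor, and model choice are assumptions, not the report's recommendations:

```python
# A toy "monitor and retrain" loop in the spirit of Build / Manage /
# Deploy and Integrate / Monitor.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_batch(n, drift=0.0):
    """Synthetic scoring batch; `drift` shifts the feature distribution."""
    X = rng.normal(loc=drift, scale=1.0, size=(n, 3))
    y = (X.sum(axis=1) > drift * 3).astype(int)
    return X, y

# Build: initial training.
X_train, y_train = make_batch(2000)
model = LogisticRegression().fit(X_train, y_train)

# Monitor: score incoming batches and retrain when accuracy degrades.
ACCURACY_FLOOR = 0.9
for month, drift in enumerate([0.0, 0.2, 1.5], start=1):
    X_new, y_new = make_batch(500, drift=drift)
    acc = accuracy_score(y_new, model.predict(X_new))
    print(f"month {month}: accuracy {acc:.2f}")
    if acc < ACCURACY_FLOOR:
        print("  accuracy below floor - retraining on recent data")
        model = LogisticRegression().fit(X_new, y_new)
```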

Streaming Integration

Data is being generated at an unrelenting pace, and data storage capacity can’t keep up. Enterprises must modernize the way they use and manage data by collecting, processing, and analyzing it in real time: in other words, streaming. This practical report explains everything organizations need to know to begin their streaming integration journey and make the most of their data. Authors Steve Wilkes and Alok Pareek detail the key attributes and components of an enterprise-grade streaming integration platform, along with stream processing and analysis techniques that will help companies reap immediate value from their data and solve their most pressing business challenges.

- Learn how to collect and handle large volumes of data at scale
- See how streams move data between threads, processes, servers, and data centers
- Get your data in the form you need and analyze it in real time
- Dive into the pros and cons of data targets such as databases, Hadoop, and cloud services for specific use cases
- Ensure your streaming integration infrastructure scales, is secure, works 24/7, and can handle failure
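
The second bullet, streams moving data between threads, can be illustrated with a minimal in-process example. This Python sketch is a generic producer/consumer pattern, not code from the report:

```python
# Move events between threads through a bounded queue, the simplest form of an
# in-memory stream; event contents are placeholders.
import queue
import threading
import time

events = queue.Queue(maxsize=100)  # bounded buffer provides backpressure
SENTINEL = None

def producer():
    for i in range(10):
        events.put({"id": i, "value": i * i, "ts": time.time()})
    events.put(SENTINEL)  # signal end of stream

def consumer():
    while True:
        event = events.get()
        if event is SENTINEL:
            break
        # In-flight processing: enrich, filter, or aggregate before delivery.
        print(f"processed event {event['id']} -> value {event['value']}")

t_prod = threading.Thread(target=producer)
t_cons = threading.Thread(target=consumer)
t_prod.start(); t_cons.start()
t_prod.join(); t_cons.join()
```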

The Evolving Role of the Data Engineer

Companies working to become data driven often view data scientists as heroes, but that overlooks the vital role that data engineers play in the process. While data scientists focus on finding new insights from datasets, data engineers deal with preparation: obtaining, cleaning, and creating enhanced versions of the data an organization needs. In this report, Andy Oram examines how the role of data engineer has quickly evolved. DBAs, software engineers, developers, and students will explore the responsibilities of modern data engineers and the skills and tools necessary to do the job. You’ll learn how to deal with software engineering concepts such as rapid and continuous development, automation and orchestration, modularity, and traceability. Decision makers considering a move to the cloud will also benefit from the in-depth discussion this report provides. This report covers:

- Major tasks of data engineers today
- The different levels of structure in data and ways to maximize its value
- Capabilities of third-party cloud options
- Tools for ingestion, transfer, and enrichment
- Using containers and VMs to run the tools
- Software engineering development
- Automation and orchestration of data engineering
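
The ingestion and enrichment tasks listed above often reduce to small, repeatable transforms. The following Python sketch is a generic illustration; the file contents, column names, and rules are placeholders, not examples from the report:

```python
# A tiny obtain-clean-enrich step of the kind a data engineer might automate.
import csv
import io

raw_csv = io.StringIO(
    "order_id,amount,country\n"
    "1001, 19.99 ,us\n"
    "1002,,de\n"
    "1003,42.50,US\n"
)

cleaned = []
for row in csv.DictReader(raw_csv):
    amount = row["amount"].strip()
    if not amount:                                   # drop records missing a required field
        continue
    cleaned.append({
        "order_id": int(row["order_id"]),
        "amount": float(amount),                     # normalize types
        "country": row["country"].strip().upper(),   # normalize case
        "is_high_value": float(amount) > 40,         # enrich with a derived flag
    })

print(cleaned)
```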

The Value of AI-Powered Business Intelligence

Artificial intelligence can yield powerful results when applied to business intelligence. Whether it’s pattern recognition in words, numbers, and big datasets or optimizing processes and expediting outcomes, AI is becoming a critical business component. In this report, Michael Norris from IBM explains how to drive AI adoption in your company. What does it mean to infuse AI into BI? It means business users can discover actionable, easy-to-understand insights on their own, independently from IT, even while remaining within the organization’s secure and governed IT architecture. Explore how AI in BI helps you to "get to the why" when analyzing and optimizing the insights you discover. Learn how AI-infused business intelligence:

- Enables line-of-business users to easily discover data-driven insights without requiring specialized data science expertise
- Allows users to ask questions in plain language with intuitive exploration tools to gain deeper insight into their data
- Provides recommended visualizations and dashboards to present compelling, concise, and explainable data
- Prepares datasets for analysis to free up IT analysts and line-of-business users