talk-data.com

Topic: data (5765 tagged)

Activity Trend: 2020-Q1 to 2026-Q1, peak of 3 per quarter

Activities

5765 activities · Newest first

SQL Server Big Data Clusters: Early First Edition Based on Release Candidate 1

Get a head start on learning one of SQL Server 2019’s latest and most impactful features, Big Data Clusters, which combines large volumes of non-relational data for analysis with data stored relationally inside a SQL Server database. This book provides a first look at Big Data Clusters based upon SQL Server 2019 Release Candidate 1. Start now and get a jump on your competition in learning this important new feature. Big Data Clusters is a feature set covering data virtualization, distributed computing, and relational databases, and it provides a complete AI platform across the entire cluster environment. This book shows you how to deploy, manage, and use Big Data Clusters. For example, you will learn how to combine data stored on the HDFS file system with data stored inside the SQL Server instances that make up the Big Data Cluster. Filled with clear examples and use cases, this book provides everything necessary to get started working with Big Data Clusters in SQL Server 2019 using Release Candidate 1. You will learn about the architectural foundations, which are made up of Kubernetes, Spark, HDFS, and SQL Server on Linux. You are then shown how to configure and deploy Big Data Clusters in on-premises environments or in the cloud. Next, you are taught about querying. You will learn to write queries in Transact-SQL, taking advantage of skills you have honed for years, and with those queries you will be able to examine and analyze data from a wide variety of sources such as Apache Spark. Through the theoretical foundation provided in this book and easy-to-follow example scripts and notebooks, you will be ready to use and unveil the full potential of SQL Server 2019: combining different types of data spread across widely disparate sources into a single view that is useful for business intelligence and machine learning analysis.
What You Will Learn: Install, manage, and troubleshoot Big Data Clusters in cloud or on-premises environments. Analyze large volumes of data directly from SQL Server and/or Apache Spark. Manage data stored in HDFS from SQL Server as if it were relational data. Implement advanced analytics solutions through machine learning and AI. Expose different data sources as a single logical source using data virtualization.
Who This Book Is For: Data engineers, data scientists, data architects, and database administrators who want to employ data virtualization and big data analytics in their environment.

The Decision Maker's Handbook to Data Science: A Guide for Non-Technical Executives, Managers, and Founders

Data science is expanding across industries at a rapid pace, and the companies first to adopt best practices will gain a significant advantage. To reap the benefits, decision makers need to have a confident understanding of data science and its application in their organization. It is easy for novices to the subject to feel paralyzed by intimidating buzzwords, but what many don’t realize is that data science is in fact quite multidisciplinary: useful in the hands of business analysts, communications strategists, designers, and more. With the second edition of The Decision Maker’s Handbook to Data Science, you will learn how to think like a veteran data scientist and approach solutions to business problems in an entirely new way. Author Stylianos Kampakis provides you with the expertise and tools required to develop a solid data strategy that is continuously effective. Ethics and legal issues surrounding data collection and algorithmic bias are some common pitfalls that Kampakis helps you avoid, while guiding you on the path to build a thriving data science culture at your organization. This updated and revised second edition includes plenty of case studies, tools for project assessment, and expanded content for hiring and managing data scientists. Data science is a language that everyone at a modern company should understand across departments. Friction in communication arises most often when management does not connect with what a data scientist is doing or how impactful data collection and storage can be for their organization. The Decision Maker’s Handbook to Data Science bridges this gap and readies you for both the present and future of your workplace in this engaging, comprehensive guide.
What You Will Learn: Understand how data science can be used within your business. Recognize the differences between AI, machine learning, and statistics. Become skilled at thinking like a data scientist, without being one. Discover how to hire and manage data scientists. Comprehend how to build the right environment in order to make your organization data-driven.
Who This Book Is For: Startup founders, product managers, higher-level managers, and any other non-technical decision makers who are thinking of implementing data science in their organization and hiring data scientists. A secondary audience includes people looking for a soft introduction to the subject of data science.

Reporting, Predictive Analytics, and Everything in Between

Business decisions today are tactical and strategic at the same time. How do you respond to a competitor’s price change? Or to specific technology changes? What new products, markets, or businesses should you pursue? Decisions like these are based on information from only one source: data. With this practical report, technical and non-technical leaders alike will explore the fundamental elements necessary to embark on a data analytics initiative. Is your company planning or contemplating a data analytics initiative? Authors Brett Stupakevich, David Sweenor, and Shane Swiderek from TIBCO guide you through several analytics options. IT leaders, product developers, analytics leaders, data analysts, data scientists, and business professionals will learn how to deploy analytic components in streaming and embedded systems using one of five platforms. You’ll examine: analytics platforms including embedded BI, reporting, data exploration & discovery, streaming BI, and data science & machine learning; the business problems each option solves and the capabilities and requirements of each; how to identify the right analytics type for your particular use case; and key considerations and the level of investment for each analytics platform.

Avoiding Data Pitfalls

Avoid data blunders and create truly useful visualizations. Avoiding Data Pitfalls is a reputation-saving handbook for those who work with data, designed to help you avoid the all-too-common blunders that occur in data analysis, visualization, and presentation. Plenty of data tools exist, along with plenty of books that tell you how to use them, but unless you truly understand how to work with data, each of these tools can ultimately mislead and cause costly mistakes. This book walks you step by step through the full data visualization process, from calculation and analysis through accurate, useful presentation. Common blunders are explored in depth to show you how they arise, how they have become so common, and how you can avoid them from the outset. Then and only then can you take advantage of the wealth of tools that are out there: in the hands of someone who knows what they're doing, the right tools can cut down on the time, labor, and myriad decisions that go into each and every data presentation. Workers in almost every industry are now commonly expected to effectively analyze and present data, even with little or no formal training. There are many pitfalls (some might say chasms) in the process, and no one wants to be the source of a data error that costs money or even lives. This book provides a full walk-through of the process to help you ensure a truly useful result. With it, you will: delve into the "data-reality gap" that grows with our dependence on data; learn how the right tools can streamline the visualization process; avoid common mistakes in data analysis, visualization, and presentation; and create and present clear, accurate, effective data visualizations. To err is human, but in today's data-driven world, the stakes can be high and the mistakes costly. Don't rely on "catching" mistakes; avoid them from the outset with the expert instruction in Avoiding Data Pitfalls.

IBM z14 Model ZR1 Configuration Setup

This IBM® Redbooks® publication helps you install, configure, and maintain the IBM z14® Model ZR1 (Machine Type 3907). The z14 ZR1 offers new functions that require a comprehensive understanding of the available configuration options. This book presents configuration setup scenarios and describes implementation examples in detail. This publication is intended for systems engineers, hardware planners, and anyone who needs to understand IBM Z® configuration and implementation. Readers should be generally familiar with current IBM Z technology and terminology. For more information about the functions of the z14 Model ZR1, see IBM z14 Model ZR1 Technical Introduction, SG24-8550, and IBM z14 Model ZR1 Technical Guide, SG24-8651.

Building Big Data Applications

Building Big Data Applications helps data managers and their organizations make the most of unstructured data with an existing data warehouse. It provides readers with what they need to know to make sense of how Big Data fits into the world of Data Warehousing. Readers will learn about infrastructure options and integration and come away with a solid understanding of how to leverage various architectures for integration. The book includes a wide range of use cases that will help data managers visualize reference architectures in the context of specific industries (healthcare, big oil, transportation, software, etc.). The book: explores various ways to leverage Big Data by effectively integrating it into the data warehouse; includes real-world case studies that clearly demonstrate Big Data technologies; and provides insights on how to optimize current data warehouse infrastructure and integrate newer infrastructure that matches data processing workloads and requirements.

IBM DS8000 SafeGuarded Copy

This IBM® Redpaper™ publication explains the IBM DS8000 Safeguarded Copy functionality. With Safeguarded Copy, organizations can improve their cyber resiliency by frequently creating protected point-in-time backups of their critical data, with minimum impact and effective resource utilization. The paper introduces Safeguarded Copy and discusses the need for logical corruption protection (LCP) and information about regulatory requirements. It presents the general concepts of LCP, and then explores various use cases for recovery. The paper is intended for IT security architects, who plan and design an organization's cyber security strategy, as well as the infrastructure technical specialists who implement it.

T-SQL Window Functions: For data analysis and beyond, 2nd Edition

Use window functions to write simpler, better, more efficient T-SQL queries. Most T-SQL developers recognize the value of window functions for data analysis calculations. But they can do far more, and recent optimizations make them even more powerful. In T-SQL Window Functions, renowned T-SQL expert Itzik Ben-Gan introduces breakthrough techniques for using them to handle many common T-SQL querying tasks with unprecedented elegance and power. Using extensive code examples, he guides you through window aggregate, ranking, distribution, offset, and ordered set functions. You'll find a detailed section on optimization, plus an extensive collection of business solutions, including novel techniques available in no other book.
Microsoft MVP Itzik Ben-Gan shows how to:
• Use window functions to improve queries you previously built with predicates
• Master essential SQL windowing concepts, and efficiently design window functions
• Effectively utilize partitioning, ordering, and framing
• Gain practical in-depth insight into window aggregate, ranking, offset, and statistical functions
• Understand how the SQL standard supports ordered set functions, and find working solutions for functions not yet available in the language
• Preview advanced Row Pattern Recognition (RPR) data analysis techniques
• Optimize window functions in SQL Server and Azure SQL Database, making the most of indexing, parallelism, and more
• Discover a full library of window function solutions for common business problems
About This Book
• For developers, DBAs, data analysts, data scientists, BI professionals, and power users familiar with T-SQL queries
• Addresses any edition of the SQL Server 2019 database engine or later, as well as Azure SQL Database
Get all code samples at: MicrosoftPressStore.com/TSQLWindowFunctions/downloads
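As a rough illustration of the partitioning, ordering, and framing concepts this blurb mentions, the sketch below runs a windowed query from Python. It is not taken from the book: it uses SQLite (through Python's standard sqlite3 module, assuming SQLite 3.25 or later) rather than SQL Server, and the sales table, its columns, and the running_totals helper are hypothetical names invented for the example. The OVER clause syntax shown is standard SQL, but T-SQL behavior and optimization may differ.

```python
import sqlite3

# Hypothetical sales data, invented for illustration only.
rows = [
    ("east", 1, 100),
    ("east", 2, 150),
    ("east", 3, 50),
    ("west", 1, 200),
    ("west", 2, 25),
]

QUERY = """
SELECT region,
       day,
       amount,
       -- PARTITION BY restarts the calculation for each region,
       -- ORDER BY fixes the row sequence inside the partition,
       -- ROWS ... frames the rows that feed the aggregate.
       SUM(amount) OVER (
           PARTITION BY region
           ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total,
       -- A ranking function: position of each sale within its region.
       ROW_NUMBER() OVER (
           PARTITION BY region
           ORDER BY amount DESC
       ) AS rank_in_region
FROM sales
ORDER BY region, day
"""


def running_totals(data):
    """Load the rows into an in-memory table and run the windowed query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", data)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


result = running_totals(rows)
for row in result:
    print(row)
```

Each output row keeps its own detail (region, day, amount) while also carrying a per-region running total and rank, which is the main difference from a GROUP BY aggregate that collapses rows.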

Advanced Statistics with Applications in R

Advanced Statistics with Applications in R fills the gap between several excellent theoretical statistics textbooks and many applied statistics books where teaching reduces to using existing packages. This book looks at what is under the hood. Many statistics issues, including the recent crisis with p-values, are caused by misunderstanding of statistical concepts due to the poor theoretical background of practitioners and applied statisticians. This book is the product of forty years of experience teaching probability and statistics and their applications for solving real-life problems. There are more than 442 examples in the book: basically every probability or statistics concept is illustrated with an example accompanied by R code. Many examples, such as Who said π? What team is better? The fall of the Roman empire, James Bond chase problem, Black Friday shopping, Free fall equation: Aristotle or Galilei, and many others are intriguing. These examples cover biostatistics, finance, physics and engineering, text and image analysis, epidemiology, spatial statistics, sociology, etc. Advanced Statistics with Applications in R teaches students to use theory for solving real-life problems through computations: there are about 500 R codes and 100 datasets. These data can be freely downloaded from the author's website dartmouth.edu/~eugened. This book is suitable as a text for senior undergraduate students majoring in statistics or data science, or for graduate students. Many researchers who apply statistics on a regular basis will find explanations of many fundamental concepts from the theoretical perspective, illustrated by concrete real-world applications.

Managing Data Science

Discover how to successfully manage data science projects and build high-performing teams with 'Managing Data Science.' This book provides actionable insights on handling the entire data science workflow, from conception to production, and addresses common challenges with practical strategies.
What this Book will help me do: Understand the fundamentals of building scalable and efficient data science pipelines. Acquire techniques to manage every stage of data science projects effectively, from prototype to production. Learn proven strategies for assembling, cultivating, and sustaining a skilled data science team. Explore the latest tools, methodologies, and best practices in ModelOps and DevOps for data science. Gain insights into troubleshooting and optimizing data science workflows to achieve organizational goals.
Author(s): Dubovikov is a seasoned expert in data science and project management, bringing years of hands-on experience to both domains. With a passion for leveraging data to drive business success, Dubovikov guides readers through building sustainable practices and effective teams in the growing field of data science.
Who is it for? This book is perfect for data science professionals, project managers, and business leaders seeking practical guidance to reap the benefits of data-driven decision-making. Designed for readers with a foundational understanding of data science, it helps bridge the gap between technical expertise and managerial efficiency.

Monitoring and Managing the IBM Elastic Storage Server Using the GUI

The IBM® Elastic Storage Server GUI provides an easy way to configure and monitor various features that are available with the IBM ESS system. It is a web application that runs on common web browsers, such as Chrome, Firefox, and Edge. The ESS GUI uses JavaScript and Ajax technologies to enable smooth and desktop-like interfacing. This IBM Redpaper publication provides a broad understanding of the architecture and features of the ESS GUI. It includes information about how to install and configure the GUI and in-depth information about the use of the GUI options. The primary audience for this paper includes experienced and new users of the ESS system.

Business Analytics, Volume II

This business analytics (BA) text discusses models that use fact-based data to measure past business performance and to guide an organization in visualizing and predicting future business performance and outcomes. It provides a comprehensive overview of analytics in general with an emphasis on predictive analytics. Given the booming interest in analytics and data science, this book is timely and informative. It brings many terms, tools, and methods of analytics together. The first three chapters provide an introduction to BA, the importance of analytics, and the types of BA (descriptive, predictive, and prescriptive), along with the tools and models. Business intelligence (BI) and a case on descriptive analytics are discussed. Additionally, the book discusses the most widely used predictive models, including regression analysis, forecasting, and data mining, and introduces recent applications of predictive analytics: machine learning, neural networks, and artificial intelligence. The concluding chapter discusses the current state, job outlook, and certifications in analytics.

Implementing the IBM Storwize V7000 with IBM Spectrum Virtualize V8.2.1

Continuing its commitment to developing and delivering industry-leading storage technologies, IBM® introduces the IBM Storwize® V7000 solution powered by IBM Spectrum™ Virtualize. This innovative storage offering delivers essential storage efficiency technologies and exceptional ease of use and performance, all integrated into a compact, modular design that is offered at a competitive, midrange price. The IBM Storwize V7000 solution incorporates some of the top IBM technologies that are typically found only in enterprise-class storage systems, which raises the standard for storage efficiency in midrange disk systems. This cutting-edge storage system extends the comprehensive storage portfolio from IBM and can help change the way organizations address the ongoing information explosion. This IBM Redbooks® publication introduces the features and functions of the IBM Storwize V7000 and IBM Spectrum Virtualize™ V8.2.1 system through several examples. This book is aimed at pre-sales and post-sales technical support and marketing and storage administrators. It helps you understand the architecture of the Storwize V7000, how to implement it, and how to take advantage of its industry-leading functions and features.

SAP Landscape Management 3.0 and IBM Power Systems Servers

This IBM® Redpaper publication is part of a series of technical documentation to help the enablement of SAP on Linux for IBM Power Systems servers and IBM System Storage™ servers. This book describes how, by using SAP Landscape Management (SAP LaMa) 3.0 software, clients gain full visibility and control over their SAP and non-SAP systems, including the underlying physical, virtual, and cloud infrastructures. With SAP LaMa, you can automate repetitive tasks to manage critical applications across complex, hybrid IT landscapes. This publication helps you to better control IT costs and increase business agility, for example, by freeing staff to focus on more strategic work rather than manual, error-prone tasks. The target audiences of this book are architects, IT specialists, and systems administrators deploying SAP LaMa 3.0 who often spend much time and effort managing and provisioning SAP software systems and landscapes.

A Guide to JES3 to JES2 Migration

This IBM® Redbooks® publication provides information to help clients that have JES3 and want to migrate to JES2. It provides a comprehensive list of the differences between the two job entry subsystems and provides information to help you determine the migration effort and actions. This book considers the features of JES2 as available on releases of IBM z/OS® V2R3 and V2R4. It should be used with JES3 to JES2 Migration Considerations, SG24-8083. This publication is divided into three parts: Part 1, "Planning to migrate from JES3 to JES2", gives you information to make the decision and plan your migration. Part 2, "Use case study", provides a use case study that is based on an actual customer experience in a successful migration. Part 3, "Appendixes", provides an appendix with sample tools that can help the migration process and exploitation of some of the new JES2 functions. This book is aimed at operations personnel, system programmers, and application developers.

Electronic Health Records with Epic and IBM FlashSystem 9100 Blueprint Version 2 Release 2

This information is intended to facilitate the deployment of IBM® FlashSystem for the Epic Corporation electronic health record (EHR) solution by describing the requirements and specifications for configuring IBM FlashSystem® 9100 and its parameters. The document also describes the steps that are required to configure the server that hosts the EHR application. To complete the tasks, you must have a working knowledge of IBM FlashSystem 9100 and Epic applications. The information in this document is distributed on an "as is" basis, without any warranty that is either expressed or implied. Support assistance for the use of this material is limited to situations where IBM FlashSystem storage devices are supported and entitled and where the issues are not specific to a blueprint implementation.

Data Mining for Business Analytics

Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python presents an applied approach to data mining concepts and methods, using Python software for illustration. Readers will learn how to implement a variety of popular data mining algorithms in Python (free and open-source software) to tackle business problems and opportunities. This is the sixth version of this successful text, and the first using Python. It covers both statistical and machine learning algorithms for prediction, classification, visualization, dimension reduction, recommender systems, clustering, text mining and network analysis. It also includes: a new co-author, Peter Gedeck, who brings both experience teaching business analytics courses using Python, and expertise in the application of machine learning methods to the drug-discovery process; a new section on ethical issues in data mining; updates and new material based on feedback from instructors teaching MBA, undergraduate, diploma and executive courses, and from their students; more than a dozen case studies demonstrating applications for the data mining techniques described; end-of-chapter exercises that help readers gauge and expand their comprehension and competency of the material presented; and a companion website with more than two dozen data sets, and instructor materials including exercise solutions, PowerPoint slides, and case solutions. Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python is an ideal textbook for graduate and upper-undergraduate level courses in data mining, predictive analytics, and business analytics. This new edition is also an excellent reference for analysts, researchers, and practitioners working with quantitative methods in the fields of business, finance, marketing, computer science, and information technology.
“This book has by far the most comprehensive review of business analytics methods that I have ever seen, covering everything from classical approaches such as linear and logistic regression, through to modern methods like neural networks, bagging and boosting, and even much more business specific procedures such as social network analysis and text mining. If not the bible, it is at the least a definitive manual on the subject.” —Gareth M. James, University of Southern California and co-author (with Witten, Hastie and Tibshirani) of the best-selling book An Introduction to Statistical Learning, with Applications in R

Data Privacy and GDPR Handbook

The definitive guide for ensuring data privacy and GDPR compliance. Privacy regulation is increasingly rigorous around the world and has become a serious concern for senior management of companies regardless of industry, size, scope, and geographic area. The General Data Protection Regulation (GDPR) imposes complex, elaborate, and stringent requirements for any organization or individuals conducting business in the European Union (EU) and the European Economic Area (EEA), while also addressing the export of personal data outside of the EU and EEA. This recently enacted law allows the imposition of fines of up to 4% of global annual revenue for privacy and data protection violations. Despite the massive potential for steep fines and regulatory penalties, there is a distressing lack of awareness of the GDPR within the business community. A recent survey conducted in the UK suggests that only 40% of firms are even aware of the new law and their responsibilities to maintain compliance. The Data Privacy and GDPR Handbook helps organizations strictly adhere to data privacy laws in the EU, the USA, and other jurisdictions around the world. This authoritative and comprehensive guide includes the history and foundation of data privacy, the framework for ensuring data privacy across major global jurisdictions, a detailed framework for complying with the GDPR, and perspectives on the future of data collection and privacy practices.
With this handbook, you can: comply with the latest data privacy regulations in the EU, EEA, US, and others; avoid hefty fines, damage to your reputation, and losing your customers; keep pace with the latest privacy policies, guidelines, and legislation; and understand the framework necessary to ensure data privacy today and gain insights on future privacy practices. The Data Privacy and GDPR Handbook is an indispensable resource for Chief Data Officers, Chief Technology Officers, legal counsel, C-Level Executives, regulators and legislators, data privacy consultants, compliance officers, and audit managers.

Clustering Methodology for Symbolic Data

Covers everything readers need to know about clustering methodology for symbolic data, including new methods and headings, while providing a focus on multi-valued list data, interval data, and histogram data. This book presents all of the latest developments in the field of clustering methodology for symbolic data, paying special attention to classification methodology for multi-valued list, interval-valued, and histogram-valued data, along with numerous worked examples. The book also offers an expansive discussion of data management techniques, showing how to break a large, complex dataset into more manageable datasets ready for analysis. Filled with examples, tables, figures, and case studies, Clustering Methodology for Symbolic Data begins by offering chapters on data management, distance measures, general clustering techniques, partitioning, divisive clustering, and agglomerative and pyramid clustering. The book: provides new classification methodologies for histogram-valued data reaching across many fields in data science; demonstrates how to manage a large complex dataset into manageable datasets ready for analysis; features very large contemporary datasets such as multi-valued list data, interval-valued data, and histogram-valued data; considers classification models by dynamical clustering; and features a supporting website hosting relevant data sets. Clustering Methodology for Symbolic Data will appeal to practitioners of symbolic data analysis, such as statisticians and economists within the public sectors. It will also be of interest to postgraduate students of, and researchers within, web mining, text mining and bioengineering.