talk-data.com

Topic

Analytics

data_analysis insights metrics

395

tagged

Activities

Showing filtered results

Filtering by: O'Reilly Data Engineering Books

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition

Updated new edition of Ralph Kimball's groundbreaking book on dimensional modeling for data warehousing and business intelligence! The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more.

- Authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence
- Begins with fundamental design recommendations and progresses through increasingly complex scenarios
- Presents unique modeling techniques for business applications such as inventory management, procurement, invoicing, accounting, customer relationship management, big data analytics, and more
- Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, education, health care, insurance, e-commerce, and more

Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.
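For readers unfamiliar with the star schema pattern the blurb mentions, the following is a minimal sketch, not taken from the book. It assumes a hypothetical retail model with one fact table (`fact_sales`) and two dimension tables (`dim_date`, `dim_product`); all table names, column names, and sample rows are invented for illustration. It uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3

# Hypothetical star schema: a central fact table whose foreign keys
# point at two small descriptive dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    sales_amount REAL
);
INSERT INTO dim_date VALUES (1, '2024-01'), (2, '2024-02');
INSERT INTO dim_product VALUES (10, 'Grocery'), (20, 'Hardware');
INSERT INTO fact_sales VALUES (1, 10, 5.0), (1, 20, 12.5), (2, 10, 7.5);
""")


def sales_by_month_and_category(connection):
    """Join the fact table to both dimensions and sum the measure."""
    query = """
    SELECT d.month, p.category, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY d.month, p.category
    ORDER BY d.month, p.category
    """
    return connection.execute(query).fetchall()


rows = sales_by_month_and_category(conn)
for row in rows:
    print(row)
```

The query shape is typical of this kind of model: filters and groupings come from dimension attributes, while the numeric measure comes from the fact table.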

Disruptive Possibilities: How Big Data Changes Everything

Big data has more disruptive potential than any information technology developed in the past 40 years. As author Jeffrey Needham points out in this revealing book, big data can provide unprecedented visibility into the operational efficiency of enterprises and agencies. Disruptive Possibilities provides an historically-informed overview through a wide range of topics, from the evolution of commodity supercomputing and the simplicity of big data technology, to the ways conventional clouds differ from Hadoop analytics clouds. This relentlessly innovative form of computing will soon become standard practice for organizations of any size attempting to derive insight from the tsunami of data engulfing them. Replacing legacy silos—whether they’re infrastructure, organizational, or vendor silos—with a platform-centric perspective is just one of the big stories of big data. To reap maximum value from the myriad forms of data, organizations and vendors will have to adopt highly collaborative habits and methodologies.

Big Data Imperatives: Enterprise 'Big Data' Warehouse, 'BI' Implementations and Analytics

Big Data Imperatives focuses on resolving the key questions on everyone's mind: Which data matters? Do you have enough data volume to justify the usage? How do you want to process this amount of data? How long do you really need to keep it active for your analysis, marketing, and BI applications? Big data is emerging from the realm of one-off projects to mainstream business adoption; however, the real value of big data is not in the overwhelming size of it, but more in its effective use.

This book addresses the following big data characteristics:
- Very large, distributed aggregations of loosely structured data - often incomplete and inaccessible
- Petabytes/Exabytes of data
- Millions/billions of people providing/contributing to the context behind the data
- Flat schemas with few complex interrelationships
- Involves time-stamped events
- Made up of incomplete data
- Includes connections between data elements that must be probabilistically inferred

Big Data Imperatives explains 'what big data can do'. It can batch process millions and billions of records both unstructured and structured much faster and cheaper. Big data analytics provide a platform to merge all analysis which enables data analysis to be more accurate, well-rounded, reliable and focused on a specific business capability. Big Data Imperatives describes the complementary nature of traditional data warehouses and big-data analytics platforms and how they feed each other. This book aims to bring the big data and analytics realms together with a greater focus on architectures that leverage the scale and power of big data and the ability to integrate and apply analytics principles to data which earlier was not accessible. This book can also be used as a handbook for practitioners, helping them with methodology, technical architecture, analytics techniques and best practices. At the same time, this book intends to hold the interest of those new to big data and analytics by giving them a deep insight into the realm of big data.

What you'll learn
- Understanding the technology, implementation of big data platforms and their usage for analytics
- Big data architectures
- Big data design patterns
- Implementation best practices

Who this book is for
This book is designed for IT professionals, data warehousing, business intelligence professionals, data analysis professionals, architects, developers and business users.

Real-Time Big Data Analytics: Emerging Architecture

Five or six years ago, analysts working with big datasets made queries and got the results back overnight. The data world was revolutionized a few years ago when Hadoop and other tools made it possible to get the results from queries in minutes. But the revolution continues. Analysts now demand sub-second, near real-time query results. Fortunately, we have the tools to deliver them. This report examines tools and technologies that are driving real-time big data analytics.

Using R to Unlock the Value of Big Data: Big Data Analytics with Oracle R Enterprise and Oracle R Connector for Hadoop

The Oracle Press Guide to Big Data Analytics using R

Cowritten by members of the Big Data team at Oracle, this Oracle Press book focuses on analyzing data with R while making it scalable using Oracle’s R technologies. Using R to Unlock the Value of Big Data provides an introduction to open source R and describes issues with traditional R and database interaction. The book then offers in-depth coverage of Oracle’s strategic R offerings: Oracle R Enterprise, Oracle R Distribution, ROracle, and Oracle R Connector for Hadoop. You can practice your new skills using the end-of-chapter exercises.

Implementing IBM InfoSphere BigInsights on IBM System x

As world activities become more integrated, the rate of data growth has been increasing exponentially. And as a result of this data explosion, current data management methods can become inadequate. People are using the term big data (sometimes referred to as Big Data) to describe this latest industry trend. IBM® is preparing the next generation of technology to meet these data management challenges. To provide the capability of incorporating big data sources and analytics of these sources, IBM developed a stream-computing product that is based on the open source computing framework Apache Hadoop. Each product in the framework provides unique capabilities to the data management environment, and further enhances the value of your data warehouse investment. In this IBM Redbooks® publication, we describe the need for big data in an organization. We then introduce IBM InfoSphere® BigInsights™ and explain how it differs from standard Hadoop. BigInsights provides a packaged Hadoop distribution, a greatly simplified installation of Hadoop and corresponding open source tools for application development, data movement, and cluster management. BigInsights also brings more options for data security, and as a component of the IBM big data platform, it provides potential integration points with the other components of the platform. A new chapter has been added to this edition. Chapter 11 describes IBM Platform Symphony®, which is a new scheduling product that works with IBM Insights, bringing low-latency scheduling and multi-tenancy to IBM InfoSphere BigInsights. The book is designed for clients, consultants, and other technical professionals.

Advanced Case Management with IBM Case Manager

Organizations face case management challenges that require insight, responsiveness, and collaboration. IBM® Case Manager, Version 5.1.1, is an advanced case management product that unites information, process, and people to provide the 360-degree view of case information and achieve optimized outcomes. With IBM Case Manager, knowledge workers can extract critical case information through integrated business rules, collaboration, and analytics. This easy access to information enhances decision making ability and leads to more successful case outcomes. IBM Case Manager also helps capture industry best practices in frameworks and templates to empower business users and accelerate return on investment. This IBM Redbooks® publication introduces the case management concept. It includes the reason for and benefits of case management, and why it is different from the traditional business process management or content management. In addition, this book addresses how you can design and build a case management solution with IBM Case Manager, and integrate that solution with external products and components. This book is intended to provide IT architects and IT specialists with the high-level concepts of case management and the capabilities of IBM Case Manager. In addition, it serves as a practical guide for IT professionals who are responsible for designing, building, and deploying IBM Case Manager solutions.

IBM Real-time Compression Appliance Version 4.1

Continuing its commitment to developing and delivering industry-leading storage technologies, IBM is introducing the IBM Real-time Compression Appliance for NAS, an innovative new storage offering that delivers essential storage efficiency technologies, combined with exceptional ease of use and performance. In an era when the amount of information, particularly in unstructured files, is exploding, but budgets for storing that information are stagnant, IBM Real-time Compression technology offers a powerful tool for better information management, protection and access. IBM Real-time Compression can help slow the growth of storage acquisition, reducing storage costs while simplifying both operations and management. It also enables organizations to keep more data available for use rather than storing it offsite or on tape that is more difficult to access, so they can support improved analytics and decision-making. IBM Real-time Compression Appliance provides online storage optimization through real-time data compression, delivering dramatic cost reduction without performance degradation. This IBM Redbooks publication is for system administrators and IT architects. It describes the enhancements made in version 4.1 of the Real-time Compression Appliance as compared to previous releases. This book is a companion to the publication Introduction to IBM Real-time Compression Appliances, SG24-7953.

Extending z/OS System Management Functions with IBM zAware

This IBM® Redbooks® publication explains the capabilities of the IBM System z® Advanced Workload Analysis Reporter (IBM zAware), and shows how you can use it as an integral part of your existing System z management tools. IBM zAware is an integrated, self-learning, analytics solution for IBM z/OS® that helps identify unusual system behavior in near real time. It is designed to help IT personnel improve problem determination so they can restore service quickly and improve overall availability. The book gives you a conceptual description of the IBM zAware appliance. It will help you to understand how it fits into the family of IBM mainframe system management tools that include Runtime Diagnostics, Predictive Failure Analysis (PFA), IBM Health Checker for z/OS, and z/OS Management Facility (z/OSMF). You are provided with the information you need to get IBM zAware up and running so you can start to benefit from its capabilities immediately. You will learn how to manage an IBM zAware environment, and see how other products can use the IBM zAware Application Programming Interface to extract information from IBM zAware for their own use. The target audience includes system programmers, system operators, configuration planners, and system automation analysts.

IBM Platform Computing Integration Solutions

This IBM® Redbooks® publication describes the integration of IBM Platform Symphony® with IBM BigInsights™. It includes IBM Platform LSF® implementation scenarios that use IBM System x® technologies. This IBM Redbooks publication is written for consultants, technical support staff, IT architects, and IT specialists who are responsible for providing solutions and support for IBM Platform Computing solutions. This book explains how the IBM Platform Computing solutions and the IBM System x platform can help to solve customer challenges and to maximize systems throughput, capacity, and management. It examines the tools, utilities, documentation, and other resources that are available to help technical teams provide solutions and support for IBM Platform Computing solutions in a System x environment. In addition, this book includes a well-defined and documented deployment model within a System x environment. It provides a planned foundation for provisioning and building large scale parallel high-performance computing (HPC) applications, cluster management, analytics workloads, and grid applications.

MongoDB Applied Design Patterns

Whether you’re building a social media site or an internal-use enterprise application, this hands-on guide shows you the connection between MongoDB and the business problems it’s designed to solve. You’ll learn how to apply MongoDB design patterns to several challenging domains, such as ecommerce, content management, and online gaming. Using Python and JavaScript code examples, you’ll discover how MongoDB lets you scale your data model while simplifying the development process. Many businesses launch NoSQL databases without understanding the techniques for using their features most effectively. This book demonstrates the benefits of document embedding, polymorphic schemas, and other MongoDB patterns for tackling specific big data use cases, including:
- Operational intelligence: Perform real-time analytics of business data
- Ecommerce: Use MongoDB as a product catalog master or inventory management system
- Content management: Learn methods for storing content nodes, binary assets, and discussions
- Online advertising networks: Apply techniques for frequency capping ad impressions, and keyword targeting and bidding
- Social networking: Learn how to store a complex social graph, modeled after Google+
- Online gaming: Provide concurrent access to character and world data for a multiplayer role-playing game

Hadoop Beginner's Guide

Hadoop Beginner's Guide introduces you to the essential concepts and practical applications of Apache Hadoop, one of the leading frameworks for big data processing. You will learn how to set up and use Hadoop to store, manage, and analyze vast amounts of data efficiently. With clear examples and step-by-step instructions, this book is the perfect starting point for beginners.

What this Book will help me do
- Understand the trends leading to the adoption of Hadoop and determine when to use it effectively in your projects.
- Build and configure Hadoop clusters tailored to your specific needs, enabling efficient data processing.
- Develop and execute applications on Hadoop using Java and Ruby, with practical examples provided.
- Leverage Amazon AWS and Elastic MapReduce to deploy Hadoop on the cloud and manage hosted environments.
- Integrate Hadoop with relational databases using tools like Hive and Sqoop for effective data transfer and querying.

Author(s)
The author of Hadoop Beginner's Guide is an experienced data engineer with a focus on big data technologies. They have extensive experience deploying Hadoop in various industries and are passionate about making complex systems accessible to newcomers. Their approach combines technical depth with an understanding of the needs of learners, ensuring clarity and relevance throughout the book.

Who is it for?
This book is designed for professionals who are new to big data processing and want to learn Apache Hadoop from scratch. It is ideal for system administrators, data analysts, and developers with basic programming knowledge in Java or Ruby looking to get started with Hadoop. If you have an interest in leveraging Hadoop for scalable data management and analytics, this book is for you. By the end, you'll gain the confidence and skills to utilize Hadoop effectively in your projects.

ElasticSearch Server

ElasticSearch Server is an excellent resource for mastering the ElasticSearch open-source search engine. This book takes you through practical steps to implement, configure, and optimize search capabilities, suitable for various data sets and applications, making faster and more accurate search outcomes accessible.

What this Book will help me do
- Understand the core concepts of ElasticSearch, including data indexing, dynamic mapping, and search analysis.
- Develop practical skills in writing queries and filters to retrieve precise and relevant results.
- Learn to set up and efficiently manage ElasticSearch clusters for scalability and real-time performance.
- Implement advanced ElasticSearch functions like autocompletion, faceting, and geo-search.
- Utilize optimization techniques for cluster monitoring, health-checks, and tuning for reliable performance.

Author(s)
The authors of ElasticSearch Server are industry professionals with extensive experience in search technologies and system architecture. They have contributed to multiple tools and publications in the field of data search and analytics. Their writing aims to distill complex technical concepts into practical knowledge, making it valuable for readers from all backgrounds.

Who is it for?
This book is perfect for developers, system architects, and IT professionals seeking a robust and scalable search solution for their projects. Whether you're new to ElasticSearch or looking to deepen your expertise, this book will serve as a practical guide to implement ElasticSearch effectively. The only prerequisites are a basic understanding of databases and general query concepts, so prior search server knowledge is not required.

IBM Platform Computing Solutions

This IBM® Platform Computing Solutions Redbooks® publication is the first book to describe each of the available offerings that are part of the IBM portfolio of Cloud, analytics, and High Performance Computing (HPC) solutions for our clients. This IBM Redbooks publication delivers descriptions of the available offerings from IBM Platform Computing that address challenges for our clients in each industry. We include a few implementation and testing scenarios with selected solutions. This publication helps strengthen the position of IBM Platform Computing solutions with a well-defined and documented deployment model within an IBM System x® environment. This deployment model offers clients a planned foundation for dynamic cloud infrastructure, provisioning, large-scale parallel HPC application development, cluster management, and grid applications. This IBM publication is targeted to IT specialists, IT architects, support personnel, and clients. This book is intended for anyone who wants information about how IBM Platform Computing solutions use IBM to provide a wide array of client solutions.

MapReduce Design Patterns

Until now, design patterns for the MapReduce framework have been scattered among various research papers, blogs, and books. This handy guide brings together a unique collection of valuable MapReduce patterns that will save you time and effort regardless of the domain, language, or development framework you’re using. Each pattern is explained in context, with pitfalls and caveats clearly identified to help you avoid common design mistakes when modeling your big data architecture. This book also provides a complete overview of MapReduce that explains its origins and implementations, and why design patterns are so important. All code examples are written for Hadoop.
- Summarization patterns: get a top-level view by summarizing and grouping data
- Filtering patterns: view data subsets such as records generated from one user
- Data organization patterns: reorganize data to work with other systems, or to make MapReduce analysis easier
- Join patterns: analyze different datasets together to discover interesting relationships
- Metapatterns: piece together several patterns to solve multi-stage problems, or to perform several analytics in the same job
- Input and output patterns: customize the way you use Hadoop to load or store data

"A clear exposition of MapReduce programs for common data processing patterns - this book is indispensable for anyone using Hadoop." --Tom White, author of Hadoop: The Definitive Guide
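To give a feel for the summarization pattern named above, here is a minimal sketch in plain Python. It is not Hadoop code and not taken from the book (whose examples are written for Hadoop); the function names `map_phase`, `shuffle`, and `reduce_phase` and the sample records are assumptions made for illustration. It only imitates the map, group-by-key, and reduce steps in a single process.

```python
from collections import defaultdict


def map_phase(records):
    """Emit (key, (count, value)) pairs, one per input record."""
    for user, value in records:
        yield user, (1, value)


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Summarize each key's values into a (count, total) pair."""
    result = {}
    for key, values in groups.items():
        count = sum(c for c, _ in values)
        total = sum(t for _, t in values)
        result[key] = (count, total)
    return result


# Hypothetical input: (user, amount) records.
records = [("alice", 3), ("bob", 5), ("alice", 4)]
summary = reduce_phase(shuffle(map_phase(records)))
print(summary)
```

A filtering pattern would follow the same skeleton, with a mapper that emits only the records matching a predicate and no reducer work beyond passing them through.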

HBase in Action

HBase in Action has all the knowledge you need to design, build, and run applications using HBase. First, it introduces you to the fundamentals of distributed systems and large scale data handling. Then, you'll explore real-world applications and code samples with just enough theory to understand the practical techniques. You'll see how to build applications with HBase and take advantage of the MapReduce processing framework. And along the way you'll learn patterns and best practices.

About the Technology
HBase is a NoSQL storage system designed for fast, random access to large volumes of data. It runs on commodity hardware and scales smoothly from modest datasets to billions of rows and millions of columns.

About the Book
HBase in Action is an experience-driven guide that shows you how to design, build, and run applications using HBase. First, it introduces you to the fundamentals of handling big data. Then, you'll explore HBase with the help of real applications and code samples and with just enough theory to back up the practical techniques. You'll take advantage of the MapReduce processing framework and benefit from seeing HBase best practices in action.

What's Inside
- When and how to use HBase
- Practical examples
- Design patterns for scalable data systems
- Deployment, integration, and design

About the Reader
Written for developers and architects familiar with data storage and processing. No prior knowledge of HBase, Hadoop, or MapReduce is required.

About the Authors
Nick Dimiduk is a Data Architect with experience in social media analytics, digital marketing, and GIS. Amandeep Khurana is a Solutions Architect focused on building HBase-driven solutions.

Quotes
Timely, practical ... explains in plain language how to use HBase. - From the Foreword by Michael Stack, Chair of the Apache HBase Project Management Committee
A difficult topic lucidly explained. - John Griffin, coauthor of "Hibernate Search in Action"
Amusing tongue-in-cheek style that doesn’t detract from the substance. - Charles Pyle, APS Healthcare
Learn how to think the HBase way. - Gianluca Righetto, Menttis

Getting Started with Storm

Even as big data is turning the world upside down, the next phase of the revolution is already taking shape: real-time data analysis. This hands-on guide introduces you to Storm, a distributed, JVM-based system for processing streaming data. Through simple tutorials, sample Java code, and a complete real-world scenario, you’ll learn how to build fast, fault-tolerant solutions that process results as soon as the data arrives. Discover how easy it is to set up Storm clusters for solving various problems, including continuous data computation, distributed remote procedure calls, and data stream processing.
- Learn how to program Storm components: spouts for data input and bolts for data transformation
- Discover how data is exchanged between spouts and bolts in a Storm topology
- Make spouts fault-tolerant with several commonly used design strategies
- Explore bolts: their life cycle, strategies for design, and ways to implement them
- Scale your solution by defining each component’s level of parallelism
- Study a real-time web analytics system built with Node.js, a Redis server, and a Storm topology
- Write spouts and bolts with non-JVM languages such as Python, Ruby, and JavaScript
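The spout-and-bolt idea described above can be illustrated with a small single-process sketch. This is not the Storm API and not code from the book: the generator functions `sentence_spout`, `split_bolt`, and `count_bolt`, and the `run_topology` helper, are hypothetical names chosen only to show a source feeding a chain of transformations, with results emitted as each item arrives.

```python
from collections import Counter


def sentence_spout(sentences):
    """Stand-in for a spout: the source that feeds items into the stream."""
    for sentence in sentences:
        yield sentence


def split_bolt(stream):
    """Stand-in for a bolt: transform each sentence into individual words."""
    for sentence in stream:
        for word in sentence.lower().split():
            yield word


def count_bolt(stream):
    """Stand-in for a stateful bolt: emit a running count per word."""
    counts = Counter()
    for word in stream:
        counts[word] += 1
        yield word, counts[word]


def run_topology(sentences):
    """Wire spout -> split -> count and keep the latest count per word."""
    latest = {}
    for word, count in count_bolt(split_bolt(sentence_spout(sentences))):
        latest[word] = count
    return latest


totals = run_topology(["storm processes streams", "streams of tuples"])
print(totals)
```

A real Storm topology distributes these components across a cluster and adds acknowledgement and parallelism settings, none of which this sketch attempts to model.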

Complete Analytics with IBM DB2 Query Management Facility: Accelerating Well-Informed Decisions Across the Enterprise

There is enormous pressure today for businesses across all industries to cut costs, enhance business performance, and deliver greater value with fewer resources. To take business analytics to the next level and drive tangible improvements to the bottom line, it is important to manage not only the volume of data, but the speed with which actionable findings can be drawn from a wide variety of disparate sources. The findings must be easily communicated to those responsible for making both strategic and tactical decisions. At the same time, strained IT budgets require that the solution be self-service for everyone from DBAs to business users, and easily deployed to thin, browser-based clients. Business analytics hosted in the Query Management Facility™ (QMF™) on DB2® and System z® allow you to tackle these challenges in a practical way, using new features and functions that are easily deployed across the enterprise and easily consumed by business users who do not have prior IT experience. This IBM® Redbooks® publication provides step-by-step instructions on using these new features:
- Access to data that resides in any JDBC-compliant data source
- OLAP access through XMLA
- 150+ new analytical functions
- Graphical query interfaces and graphical reports
- Graphical, interactive dashboards
- Ability to integrate QMF functions with third-party applications
- Support for the IBM DB2 Analytics Accelerator
- A new QMF Classic perspective in QMF for Workstation
- Ability to start QMF for TSO as a DB2 for z/OS stored procedure
- New metadata capabilities, including ER diagrams and capability to federate data into a single virtual source

Optimizing DB2 Queries with IBM DB2 Analytics Accelerator for z/OS

The IBM® DB2® Analytics Accelerator Version 2.1 for IBM z/OS® (also called DB2 Analytics Accelerator or Query Accelerator in this book and in DB2 for z/OS documentation) is a marriage of the IBM System z® Quality of Service and Netezza® technology to accelerate complex queries in a DB2 for z/OS highly secure and available environment. Superior performance and scalability with rapid appliance deployment provide an ideal solution for complex analysis. This IBM Redbooks® publication provides technical decision-makers with a broad understanding of the IBM DB2 Analytics Accelerator architecture and its exploitation by documenting the steps for the installation of this solution in an existing DB2 10 for z/OS environment. In this book we define a business analytics scenario, evaluate the potential benefits of the DB2 Analytics Accelerator appliance, describe the installation and integration steps with the DB2 environment, evaluate performance, and show the advantages to existing business intelligence processes. Please note that the additional material referenced in the text is not available from IBM.

IBM Real Time Compression Appliance Application Integration Guide

Continuing its commitment to developing and delivering industry-leading storage technologies, IBM® is introducing the IBM Real-time Compression Appliances for NAS, an innovative new storage offering that delivers essential storage efficiency technologies, combined with exceptional ease of use and performance. In an era when the amount of information, particularly in unstructured files, is exploding, but budgets for storing that information are stagnant, IBM Real-time Compression technology offers a powerful tool for better information management, protection, and access. IBM Real-time Compression can help slow the growth of storage acquisition, reducing storage costs while simplifying both operations and management. It also enables organizations to keep more data available for use rather than storing it offsite or on harder-to-access tape, so they can support improved analytics and decision making. IBM Real-time Compression Appliances provide online storage optimization through real-time data compression, delivering dramatic cost reduction without performance degradation. This IBM Redbooks® publication is an easy-to-follow guide that describes how to design solutions successfully using IBM Real-time Compression Appliances (IBM RTCAs). It explains best practices for RTCA solution design, application integration, and practical RTCA use cases. This is a companion book to Introduction to IBM Real-time Compression Appliances, SG24-7953.