talk-data.com

Topic

DWH

Data Warehouse

analytics business_intelligence data_storage

568

tagged

Activity Trend: peak of 35 activities per quarter, 2020-Q1 to 2026-Q1

Activities

568 activities · Newest first

Apache Sqoop Cookbook

Integrating data from multiple sources is essential in the age of big data, but it can be a challenging and time-consuming task. This handy cookbook provides dozens of ready-to-use recipes for using Apache Sqoop, the command-line interface application that optimizes data transfers between relational databases and Hadoop. Sqoop is both powerful and bewildering, but with this cookbook’s problem-solution-discussion format, you’ll quickly learn how to deploy and then apply Sqoop in your environment. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems. Transfer data from a single database table into your Hadoop ecosystem Keep table data and Hadoop in sync by importing data incrementally Import data from more than one database table Customize transferred data by calling various database functions Export generated, processed, or backed-up data from Hadoop to your database Run Sqoop within Oozie, Hadoop’s specialized workflow scheduler Load data into Hadoop’s data warehouse (Hive) or database (HBase) Handle installation, connection, and syntax issues common to specific database vendors
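The incremental-import idea mentioned in the blurb above (keeping a target in sync by fetching only rows added since the last run) can be sketched in a few lines. This is a minimal illustration that uses Python's built-in `sqlite3` module as a stand-in source database; it is not Sqoop's implementation, and the `cities` table, the `incremental_import` helper, and the sample rows are all hypothetical.

```python
import sqlite3


def incremental_import(conn, table, check_column, last_value):
    """Fetch rows whose check column exceeds last_value.

    Returns the new rows and the updated high-water mark.
    Table and column names are assumed to be trusted identifiers.
    """
    cur = conn.execute(
        f"SELECT * FROM {table} WHERE {check_column} > ? ORDER BY {check_column}",
        (last_value,),
    )
    rows = cur.fetchall()
    col_index = [d[0] for d in cur.description].index(check_column)
    new_last = rows[-1][col_index] if rows else last_value
    return rows, new_last


# Hypothetical source table standing in for a relational database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany(
    "INSERT INTO cities VALUES (?, ?)", [(1, "Palo Alto"), (2, "Brno")]
)

# First run starts from 0 and picks up every existing row.
first_batch, mark = incremental_import(source, "cities", "id", 0)

# A row arrives later; the second run transfers only that row.
source.execute("INSERT INTO cities VALUES (3, 'Sunnyvale')")
second_batch, mark = incremental_import(source, "cities", "id", mark)
```

Storing the high-water mark between runs is what keeps each transfer small. This sketch only covers append-only tables; rows that are updated in place would need a different check, such as a last-modified timestamp.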

The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition

Updated new edition of Ralph Kimball's groundbreaking book on dimensional modeling for data warehousing and business intelligence! The first edition of Ralph Kimball's The Data Warehouse Toolkit introduced the industry to dimensional modeling, and now his books are considered the most authoritative guides in this space. This new third edition is a complete library of updated dimensional modeling techniques, the most comprehensive collection ever. It covers new and enhanced star schema dimensional modeling patterns, adds two new chapters on ETL techniques, includes new and expanded business matrices for 12 case studies, and more. Authored by Ralph Kimball and Margy Ross, known worldwide as educators, consultants, and influential thought leaders in data warehousing and business intelligence Begins with fundamental design recommendations and progresses through increasingly complex scenarios Presents unique modeling techniques for business applications such as inventory management, procurement, invoicing, accounting, customer relationship management, big data analytics, and more Draws real-world case studies from a variety of industries, including retail sales, financial services, telecommunications, education, health care, insurance, e-commerce, and more Design dimensional databases that are easy to understand and provide fast query response with The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling, 3rd Edition.
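For readers unfamiliar with the star schema pattern named above, here is a minimal sketch: one fact table of measurements joined to descriptive dimension tables. It uses Python's built-in `sqlite3` module, and every table name, column name, and row is a hypothetical example rather than material from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables hold descriptive attributes.
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY, calendar_date TEXT, month TEXT
    );
    CREATE TABLE dim_product (
        product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT
    );
    -- The fact table holds measurements plus a key into each dimension.
    CREATE TABLE fact_sales (
        date_key INTEGER REFERENCES dim_date(date_key),
        product_key INTEGER REFERENCES dim_product(product_key),
        quantity INTEGER,
        sales_amount REAL
    );
    INSERT INTO dim_date VALUES
        (20240101, '2024-01-01', '2024-01'),
        (20240201, '2024-02-01', '2024-02');
    INSERT INTO dim_product VALUES
        (1, 'Widget', 'Hardware'),
        (2, 'Manual', 'Books');
    INSERT INTO fact_sales VALUES
        (20240101, 1, 2, 20.0),
        (20240101, 2, 1, 5.0),
        (20240201, 1, 3, 30.0);
    """
)

# A typical dimensional query: aggregate facts, grouped by dimension attributes.
sales_by_category = conn.execute(
    """
    SELECT p.category, d.month, SUM(f.sales_amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
    """
).fetchall()
```

Every analytical question follows the same shape: join the fact table to the dimensions it needs, filter or group on dimension attributes, and aggregate the measures.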

Big Data Imperatives: Enterprise 'Big Data' Warehouse, 'BI' Implementations and Analytics

Big Data Imperatives focuses on resolving the key questions on everyone's mind: Which data matters? Do you have enough data volume to justify the usage? How do you want to process this amount of data? How long do you really need to keep it active for your analysis, marketing, and BI applications? Big data is emerging from the realm of one-off projects to mainstream business adoption; however, the real value of big data is not in the overwhelming size of it, but more in its effective use. This book addresses the following big data characteristics: very large, distributed aggregations of loosely structured data, often incomplete and inaccessible; petabytes/exabytes of data; millions/billions of people providing/contributing to the context behind the data; flat schemas with few complex interrelationships; time-stamped events; incomplete data; and connections between data elements that must be probabilistically inferred. Big Data Imperatives explains 'what big data can do'. It can batch process millions and billions of records, both unstructured and structured, much faster and cheaper. Big data analytics provide a platform to merge all analysis, which enables data analysis to be more accurate, well-rounded, reliable and focused on a specific business capability. Big Data Imperatives describes the complementary nature of traditional data warehouses and big-data analytics platforms and how they feed each other. This book aims to bring the big data and analytics realms together with a greater focus on architectures that leverage the scale and power of big data and the ability to integrate and apply analytics principles to data which earlier was not accessible. This book can also be used as a handbook for practitioners, helping them with methodology, technical architecture, analytics techniques and best practices.
At the same time, this book intends to hold the interest of those new to big data and analytics by giving them a deep insight into the realm of big data. What you'll learn Understanding the technology, implementation of big data platforms and their usage for analytics Big data architectures Big data design patterns Implementation best practices Who this book is for This book is designed for IT professionals, data warehousing, business intelligence professionals, data analysis professionals, architects, developers and business users.

Implementing IBM InfoSphere BigInsights on IBM System x

As world activities become more integrated, the rate of data growth has been increasing exponentially. And as a result of this data explosion, current data management methods can become inadequate. People are using the term big data (sometimes referred to as Big Data) to describe this latest industry trend. IBM® is preparing the next generation of technology to meet these data management challenges. To provide the capability of incorporating big data sources and analytics of these sources, IBM developed a stream-computing product that is based on the open source computing framework Apache Hadoop. Each product in the framework provides unique capabilities to the data management environment, and further enhances the value of your data warehouse investment. In this IBM Redbooks® publication, we describe the need for big data in an organization. We then introduce IBM InfoSphere® BigInsights™ and explain how it differs from standard Hadoop. BigInsights provides a packaged Hadoop distribution, a greatly simplified installation of Hadoop and corresponding open source tools for application development, data movement, and cluster management. BigInsights also brings more options for data security, and as a component of the IBM big data platform, it provides potential integration points with the other components of the platform. A new chapter has been added to this edition. Chapter 11 describes IBM Platform Symphony®, which is a new scheduling product that works with IBM Insights, bringing low-latency scheduling and multi-tenancy to IBM InfoSphere BigInsights. The book is designed for clients, consultants, and other technical professionals.

Data Warehousing in the Age of Big Data

Data Warehousing in the Age of Big Data will help you and your organization make the most of unstructured data with your existing data warehouse. As Big Data continues to revolutionize how we use data, it doesn't have to create more confusion. Expert author Krish Krishnan helps you make sense of how Big Data fits into the world of data warehousing in clear and concise detail. The book is presented in three distinct parts. Part 1 discusses Big Data, its technologies and use cases from early adopters. Part 2 addresses data warehousing, its shortcomings, and new architecture options, workloads, and integration techniques for Big Data and the data warehouse. Part 3 deals with data governance, data visualization, information life-cycle management, data scientists, and implementing a Big Data–ready data warehouse. Extensive appendixes include case studies from vendor implementations and a special segment on how we can build a healthcare information factory. Ultimately, this book will help you navigate through the complex layers of Big Data and data warehousing while providing you information on how to effectively think about using all these technologies and the architectures to design the next-generation data warehouse. Learn how to leverage Big Data by effectively integrating it into your data warehouse. Includes real-world examples and use cases that clearly demonstrate Hadoop, NoSQL, HBase, Hive, and other Big Data technologies. Understand how to optimize and tune your current data warehouse infrastructure and integrate newer infrastructure matching data processing workloads and requirements.

Managing Data in Motion

Managing Data in Motion describes techniques that have been developed for significantly reducing the complexity of managing system interfaces and enabling scalable architectures. Author April Reeve brings over two decades of experience to present a vendor-neutral approach to moving data between computing environments and systems. Readers will learn the techniques, technologies, and best practices for managing the passage of data between computer systems and integrating disparate data together in an enterprise environment. The average enterprise's computing environment is comprised of hundreds to thousands of computer systems that have been built, purchased, and acquired over time. The data from these various systems needs to be integrated for reporting and analysis, shared for business transaction processing, and converted from one format to another when old systems are replaced and new systems are acquired. The management of the "data in motion" in organizations is rapidly becoming one of the biggest concerns for business and IT management. Data warehousing and conversion, real-time data integration, and cloud and "big data" applications are just a few of the challenges facing organizations and businesses today. Managing Data in Motion tackles these and other topics in a style easily understood by business and IT managers as well as programmers and architects. Presents a vendor-neutral overview of the different technologies and techniques for moving data between computer systems, including the emerging solutions for unstructured as well as structured data types. Explains, in non-technical terms, the architecture and components required to perform data integration. Describes how to reduce the complexity of managing system interfaces and enable a scalable data architecture that can handle the dimensions of "Big Data".

Training Kit (Exam 70-463): Implementing a Data Warehouse with Microsoft SQL Server 2012

Ace your preparation for Microsoft® Certification Exam 70-463 with this 2-in-1 Training Kit from Microsoft Press®. Work at your own pace through a series of lessons and practical exercises, and then assess your skills with online practice tests, featuring multiple, customizable testing options. Maximize your performance on the exam by learning how to: design and implement a data warehouse; develop and enhance SQL Server Integration Services packages; manage and maintain SQL Server Integration Services packages; build data quality solutions; and implement custom code in SQL Server Integration Services packages.

Using Open Source Platforms for Business Intelligence

Open Source BI solutions have many advantages over traditional proprietary software, from offering lower initial costs to more flexible support and integration options; but, until now, there has been no comprehensive guide to the complete offerings of the OS BI market. Writing for IT managers and business analysts without bias toward any BI suite, industry insider Lyndsay Wise covers the benefits and challenges of all available open source BI systems and tools, enabling readers to identify the solutions and technologies that best meet their business needs. Wise compares and contrasts types of OS BI and proprietary tools on the market, including Pentaho, Jaspersoft, RapidMiner, SpagoBI, BIRT, and many more. Real-world case studies and project templates clarify the steps involved in implementing open source BI, saving new users the time and trouble of developing their own solutions from scratch. For business managers who are hard pressed to identify the best BI solutions and software for their companies, this book provides a practical guide to evaluating the ROI of open source versus traditional BI deployments. The only book to provide complete coverage of all open source BI systems and tools specifically for business managers, without bias toward any OS BI suite. A practical, step-by-step guide to implementing OS BI solutions that maximize ROI. Comprehensive coverage of all open source systems and tools, including architectures, data integration, support, optimization, data mining, data warehousing, and interoperability. Case studies and project templates enable readers to evaluate the benefits and tradeoffs of all OS BI options without having to spend time developing their own solutions from scratch.

Business Intelligence Applied: Implementing an Effective Information and Communications Technology Infrastructure

Expert guidance for building an information communication and technology infrastructure that provides best in business intelligence Enterprise performance management (EPM) technology has been rapidly advancing, especially in the areas of predictive analysis and cloud-based solutions. Business intelligence caught on as a concept in the business world as the business strategy application of data warehousing in the early 2000s. With the recent surge in interest in data analytics and big data, it has seen a renewed level of interest as the ability of a business to find the valuable data in a timely—and competitive—fashion. Business Intelligence Applied reveals essential information for building an optimal and effective information and communication technology (ICT) infrastructure. Defines ICT infrastructure Examines best practices for documenting business change and for documenting technology recommendations Includes examples and cases from Europe and Asia Written for business intelligence staff, CIOs, CTOs, and technology managers With examples and cases from Europe and Asia, Business Intelligence Applied expertly covers business intelligence, a hot topic in business today as a key element to business and data analytics.

IBM System Storage N series Software Guide

Corporate workgroups, distributed enterprises, and small to medium-sized companies are increasingly seeking to network and consolidate storage to improve availability, share information, reduce costs, and protect and secure information. These organizations require enterprise-class solutions capable of addressing immediate storage needs cost-effectively, while providing an upgrade path for future requirements. IBM® System Storage® N series storage systems and their software capabilities are designed to meet these requirements. IBM System Storage N series storage systems offer an excellent solution for a broad range of deployment scenarios. IBM System Storage N series storage systems function as a multiprotocol storage device that is designed to allow you to simultaneously serve both file and block-level data across a single network. These activities are demanding procedures that, for some solutions, require multiple, separately managed systems. The flexibility of IBM System Storage N series storage systems, however, allows them to address the storage needs of a wide range of organizations, including distributed enterprises and data centers for midrange enterprises. IBM System Storage N series storage systems also support sites with computer and data-intensive enterprise applications, such as database, data warehousing, workgroup collaboration, and messaging. This IBM® Redbooks® publication explains the software features of the IBM System Storage N series storage systems. This book also covers topics such as installation, setup, and administration of those software features from the IBM System Storage N series storage systems and clients and provides example scenarios.

IBM Cognos Dynamic Cubes

IBM® Cognos® Business Intelligence (BI) provides a proven enterprise BI platform with an open data strategy, providing customers with the ability to leverage data from any source, package it into a business model, and make it available to consumers in various interfaces that are tailored to the task. IBM Cognos Dynamic Cubes complements the existing Cognos BI capabilities and continues the tradition of an open data model. It focuses on extending the scalability of the IBM Cognos platform to enable speed-of-thought analytics over terabytes of enterprise data, without having to invest in a new data warehouse appliance. This capability adds a new level of query intelligence so you can unleash the power of your enterprise data warehouse. This IBM Redbooks® publication addresses IBM Cognos Business Intelligence V10.2 and specifically, the IBM Cognos Dynamic Cubes capabilities. This book can help you in the following ways: Understand core features of the Dynamic Cubes capabilities of IBM Cognos BI V10.2 Learn by example with practical scenarios using the IBM Cognos samples

Professional Microsoft SQL Server 2012 Analysis Services with MDX and DAX

Understand Microsoft's dramatically updated new release of its premier toolset for business intelligence. The first major update to Microsoft's state-of-the-art, complex toolset for business intelligence (BI) in years is now available, and what better way to master it than with this detailed book from key members of the product's development team? If you're a database or data warehouse developer, this is the expert resource you need to build full-scale, multi-dimensional database applications using Microsoft's new SQL Server 2012 Analysis Services and related tools. Discover how to solve real-world BI problems by leveraging a slew of powerful new Analysis Services features and capabilities. These include the new DAX language, which is a more user-friendly version of MDX; PowerPivot, a new tool for performing simplified analysis of data; BISM, Microsoft's new Business Intelligence Semantic Model; and much more. Serves as an authoritative guide to Microsoft's new SQL Server 2012 Analysis Services BI product and is written by key members of the Microsoft Analysis Services product development team. Covers SQL Server 2012 Analysis Services, a major new release with a host of powerful new features and capabilities. Topics include using the new DAX language, a simplified, more user-friendly version of MDX; PowerPivot, a new tool for performing simplified analysis of data; BISM, Microsoft's new Business Intelligence Semantic Model; and a new, yet-to-be-named BI reporting tool. Explores real-world scenarios to help developers build comprehensive solutions. Get thoroughly up to speed on this powerful new BI toolset with the timely and authoritative Professional Microsoft SQL Server 2012 Analysis Services with MDX and DAX.

Solving Operational Business Intelligence with InfoSphere Warehouse Advanced Edition

IBM® InfoSphere® Warehouse is the IBM flagship data warehouse platform for departmental data marts and enterprise data warehouses. It offers leading architecture, performance, backup, and recovery tools that help improve efficiency and reduce time to market through increased understanding of current data assets, while simplifying the daily operations of managing complex warehouse deployments. InfoSphere Warehouse Advanced Enterprise Edition delivers an enhanced set of database performance, management, and design tools. These tools assist companies in maintaining and increasing value from their warehouses, while helping to reduce the total cost of maintaining these complex environments. In this IBM Redbooks® publication we explain how you can build a business intelligence system with InfoSphere Warehouse Advanced Enterprise to manage and support daily business operations for an enterprise, to generate more income with lower cost. We describe the foundation of the business analytics, the Data Warehouse features and functions, and the solutions that can deliver immediate analytics solutions and help you drive better business outcomes. We show you how to use the advanced analytics of InfoSphere Warehouse Advanced Enterprise Edition and integrated tools for data modeling, mining, text analytics, and identifying and meeting the data latency requirements. We describe how the performance and storage optimization features can make building and managing a large data warehouse more affordable, and how they can help significantly reduce the cost of ownership. We also cover data lifecycle management and the key features of IBM Cognos® Business Intelligence. This book is intended for data warehouse professionals who are interested in gaining in-depth knowledge about the operational business intelligence solution for a data warehouse that the IBM InfoSphere Warehouse Advanced Enterprise Edition offers.

Programming Hive

Need to move a relational database application to Hadoop? This comprehensive guide introduces you to Apache Hive, Hadoop’s data warehouse infrastructure. You’ll quickly learn how to use Hive’s SQL dialect—HiveQL—to summarize, query, and analyze large datasets stored in Hadoop’s distributed filesystem. This example-driven guide shows you how to set up and configure Hive in your environment, provides a detailed overview of Hadoop and MapReduce, and demonstrates how Hive works within the Hadoop ecosystem. You’ll also find real-world case studies that describe how companies have used Hive to solve unique problems involving petabytes of data. Use Hive to create, alter, and drop databases, tables, views, functions, and indexes Customize data formats and storage options, from files to external databases Load and extract data from tables—and use queries, grouping, filtering, joining, and other conventional query methods Gain best practices for creating user defined functions (UDFs) Learn Hive patterns you should use and anti-patterns you should avoid Integrate Hive with other data processing programs Use storage handlers for NoSQL databases and other datastores Learn the pros and cons of running Hive on Amazon’s Elastic MapReduce

Enterprise Analytics: Optimize Performance, Process, and Decisions Through Big Data

The Definitive Guide to Enterprise-Level Analytics Strategy, Technology, Implementation, and Management Organizations are capturing exponentially larger amounts of data than ever, and now they have to figure out what to do with it. Using analytics, you can harness this data, discover hidden patterns, and use this knowledge to act meaningfully for competitive advantage. Suddenly, you can go beyond understanding “how, when, and where” events have occurred, to understand why – and use this knowledge to reshape the future. Now, analytics pioneer Tom Davenport and the world-renowned experts at the International Institute for Analytics (IIA) have brought together the latest techniques, best practices, and research on analytics in a single primer for maximizing the value of enterprise data. Enterprise Analytics is today’s definitive guide to analytics strategy, planning, organization, implementation, and usage. It covers everything from building better analytics organizations to gathering data; implementing predictive analytics to linking analysis with organizational performance. The authors offer specific insights for optimizing supply chains, online services, marketing, fraud detection, and many other business functions. They support their powerful techniques with many real-world examples, including chapter-length case studies from healthcare, retail, and financial services. Enterprise Analytics will be an invaluable resource for every business and technical professional who wants to make better data-driven decisions: operations, supply chain, and product managers; product, financial, and marketing analysts; CIOs and other IT leaders; data, web, and data warehouse specialists, and many others.

Pro SQL Server 2012 BI Solutions

Business intelligence projects do not need to cost multi-millions of dollars or take months or even years to complete! Using rapid application development (RAD) techniques along with Microsoft SQL Server 2012, this book guides database administrators, SQL programmers, and report specialists in creating practical, cost-effective business intelligence solutions for their companies and departments. Pro SQL Server 2012 BI Solutions provides practical examples of cost-effective business intelligence projects. Readers will be guided through several complete projects that build a foundation for real-world solutions. Even with limited experience using Microsoft's SQL Server, Integration Server, Analysis Server, and Reporting Server, you can leverage your existing knowledge of SQL programming and database design to provide users with the business intelligence reports they need. Provides recipes for multiple business intelligence scenarios Progresses from simple to advanced projects using several examples Shows Microsoft SQL Server technology used to complete real-world business intelligence projects What you'll learn How to plan and implement cost-effective business intelligence projects How to create practical data warehouse databases How to extract, transform, and load data with Integration Services How to develop OLAP cubes and dimensions on Analysis Server How to create Reporting Server reports using both SQL and MDX How to apply performance-tuning techniques to get the most from your solutions Who this book is for Pro SQL Server 2012 BI Solutions is aimed at database administrators, SQL programmers, and report developers who create business intelligence solutions for midsized businesses and departments.

SQL Server 2012 Integration Services Design Patterns

SQL Server 2012 Integration Services Design Patterns is a book of recipes for SQL Server Integration Services (SSIS). Design patterns in the book show how to solve common problems encountered when developing data integration solutions. Because you do not have to build the code from scratch each time, using design patterns improves your efficiency as an SSIS developer. In SSIS Design Patterns, we take you through several of these snippets in detail, providing the technical details of the resolution. SQL Server 2012 Integration Services Design Patterns does not focus on the problems to be solved; instead, the book delves into why particular problems should be solved in certain ways. You'll learn more about SSIS as a result, and you'll learn by practical example. Where appropriate, SQL Server 2012 Integration Services Design Patterns provides examples of alternative patterns and discusses when and where they should be used. Highlights of the book include sections on ETL Instrumentation, SSIS Frameworks, and Dependency Services. Takes you through solutions to several common data integration challenges Demonstrates new features in SQL Server 2012 Integration Services Teaches SSIS using practical examples What you'll learn Load data from flat file formats Explore patterns for executing SSIS packages Discover a pattern for loading XML data Migrate SSIS packages through your application lifecycle without editing connections Take advantage of SSIS 2012 Dependency Services Build an SSIS Framework to support your application needs Who this book is for SQL Server 2012 Integration Services Design Patterns is for the data integration developer who is ready to take their SQL Server Integration Services (SSIS) skills to a more efficient level. It's for the developer interested in locating a previously-tested solution quickly. 
SQL Server 2012 Integration Services Design Patterns is a great book for ETL (extract, transform, and load) specialists and those seeking practical uses for new features in SQL Server 2012 Integration Services. It's an excellent choice for business intelligence and data warehouse developers.

Data Virtualization for Business Intelligence Systems

Data virtualization can help you accomplish your goals with more flexibility and agility. Learn what it is and how and why it should be used with Data Virtualization for Business Intelligence Systems. In this book, expert author Rick van der Lans explains how data virtualization servers work, what techniques to use to optimize access to various data sources and how these products can be applied in different projects. You’ll learn the difference between this new form of data integration and older forms, such as ETL and replication, and gain a clear understanding of how data virtualization really works. Data Virtualization for Business Intelligence Systems outlines the advantages and disadvantages of data virtualization and illustrates how data virtualization should be applied in data warehouse environments. You’ll come away with a comprehensive understanding of how data virtualization will make data warehouse environments more flexible and how it makes developing operational BI applications easier. Van der Lans also describes the relationship between data virtualization and related topics, such as master data management, governance, and information management, so you come away with a big-picture understanding as well as all the practical know-how you need to virtualize your data. First independent book on data virtualization that explains in a product-independent way how data virtualization technology works. Illustrates concepts using examples developed with commercially available products. Shows you how to solve common data integration challenges such as data quality, system interference, and overall performance by following practical guidelines on using data virtualization. Apply data virtualization right away with three chapters full of practical implementation guidance. Understand the big picture of data virtualization and its relationship with data governance and information management.

Business Intelligence Cookbook: A Project Lifecycle Approach Using Oracle Technology

Business Intelligence Cookbook: A Project Lifecycle Approach Using Oracle Technology is an expert guide for enhancing your data warehousing and business intelligence skills, specifically targeted at Oracle Database 11g. With over 80 advanced step-by-step recipes, this book walks you through creating, optimizing, and managing actionable business intelligence solutions. What this Book will help me do Understand practical project management approaches specific to business intelligence and data warehousing. Learn to effectively estimate efforts for DW/BI projects using structured methodologies. Model data using Oracle Database and Oracle SQL Data Modeler while aligning it with business requirements. Discover best practices for transitioning BI solutions from development to deployment. Master techniques to secure organizational data as a critical asset. Author(s) John Heaton is an experienced IT professional specializing in data warehousing and business intelligence projects. With extensive knowledge of Oracle technologies, John brings deep technical insights along with a clear writing style, making complex concepts accessible to readers. He is passionate about helping professionals in the IT industry enhance their skills. Who is it for? This book is ideal for IT professionals, data warehouse developers, and project managers working with Oracle Database who seek to advance their expertise in business intelligence. If you have foundational knowledge of DW/BI concepts and want to professionally manage complete lifecycle projects leveraging Oracle tools, this guide is tailored for you.

Principles of Data Integration

Principles of Data Integration is the first comprehensive textbook of data integration, covering theoretical principles and implementation issues as well as current challenges raised by the semantic web and cloud computing. The book offers a range of data integration solutions enabling you to focus on what is most relevant to the problem at hand. Readers will also learn how to build their own algorithms and implement their own data integration application. Written by three of the most respected experts in the field, this book provides an extensive introduction to the theory and concepts underlying today's data integration techniques, with detailed instruction for their application using concrete examples throughout to explain the concepts. This text is an ideal resource for database practitioners in industry, including data warehouse engineers, database system designers, data architects/enterprise architects, database researchers, statisticians, and data analysts; students in data analytics and knowledge discovery; and other data professionals working at the R&D and implementation levels. Offers a range of data integration solutions enabling you to focus on what is most relevant to the problem at hand. Enables you to build your own algorithms and implement your own data integration applications.