talk-data.com

Topic

Data Quality

data_management data_cleansing data_validation

537 tagged

Activity Trend: 82 peak/qtr, 2020-Q1 to 2026-Q1

Activities

537 activities · Newest first

What's New in SQL Server 2012

SQL Server 2012 introduces a wealth of new features and enhancements that database professionals need to master to stay ahead in the ever-evolving industry. This book provides a practical guide to upgrading your knowledge with the latest advancements, from T-SQL improvements to new Business Intelligence tools and cloud capabilities.

What this Book will help me do

* Understand and install the core and advanced features of SQL Server 2012 effectively.
* Implement new SQL Server Management Studio features for enhanced administration.
* Utilize Business Intelligence Semantic Models for insightful data analysis.
* Execute data cleansing projects using Data Quality Services (DQS).
* Simulate real-world database loads using Distributed Replay for testing purposes.

Author(s)

The author is an experienced database administrator and SQL Server expert with a career spanning over two decades. With hands-on experience in implementation, administration, and optimization of complex SQL Server environments, the author brings a wealth of practical knowledge to this book. Their approach is to provide concise, actionable insights tailored to the needs of IT professionals.

Who is it for?

This book is tailored for database administrators, developers, and BI professionals familiar with SQL Server 2008 R2 seeking to efficiently upgrade to SQL Server 2012. If you aim to quickly adopt and utilize the new features and improvements in SQL Server 2012, this book provides the clear and focused learning path you need.

Microsoft® SQL Server® 2012 Integration Services

Build and manage data integration solutions with expert guidance from the Microsoft SQL Server Integration Services (SSIS) team. See best practices in action and dive deep into the SSIS engine, SSISDB catalog, and security features. Using the developer enhancements in SQL Server 2012 and the flexible SSIS toolset, you’ll handle complex data integration scenarios more efficiently and acquire the skills you need to build comprehensive solutions.

Discover how to:

* Use SSIS to extract, transform, and load data from multiple data sources
* Apply best practices to optimize package and project configuration and deployment
* Manage security settings in the SSISDB catalog and control package access
* Work with SSIS data quality features to profile, cleanse, and increase reliability
* Monitor, troubleshoot, and tune SSIS solutions with advanced features such as detailed views and data taps
* Load data incrementally to capture an easily consumable stream of insert, update, and delete activity

Data Virtualization for Business Intelligence Systems

Data virtualization can help you accomplish your goals with more flexibility and agility. Learn what it is and how and why it should be used with Data Virtualization for Business Intelligence Systems. In this book, expert author Rick van der Lans explains how data virtualization servers work, what techniques to use to optimize access to various data sources, and how these products can be applied in different projects. You’ll learn the difference between this new form of data integration and older forms, such as ETL and replication, and gain a clear understanding of how data virtualization really works.

Data Virtualization for Business Intelligence Systems outlines the advantages and disadvantages of data virtualization and illustrates how data virtualization should be applied in data warehouse environments. You’ll come away with a comprehensive understanding of how data virtualization will make data warehouse environments more flexible and how it makes developing operational BI applications easier. Van der Lans also describes the relationship between data virtualization and related topics, such as master data management, governance, and information management, so you come away with a big-picture understanding as well as all the practical know-how you need to virtualize your data.

* First independent book on data virtualization that explains in a product-independent way how data virtualization technology works.
* Illustrates concepts using examples developed with commercially available products.
* Shows you how to solve common data integration challenges such as data quality, system interference, and overall performance by following practical guidelines on using data virtualization.
* Apply data virtualization right away with three chapters full of practical implementation guidance.
* Understand the big picture of data virtualization and its relationship with data governance and information management.

Fundamentals of Database Management Systems, Second Edition

Gillenson's new edition of Fundamentals of Database Management Systems provides concise coverage of the fundamental topics necessary for a deep understanding of the basics. In this edition, there is more emphasis on a practical approach, with new "Your Turn" boxes and much more coverage in a separate supplement on how to implement databases with Access. In every chapter, the author covers concepts first, then shows how they're implemented in continuing case(s). "Your Turn" boxes appear several times throughout the chapter to apply concepts to projects, and "Concepts in Action" boxes contain examples of concepts used in practice. This pedagogy is easily demonstrable, and the text also includes more hands-on exercises and projects and a standard diagramming style for the data modeling diagrams. Furthermore, revised and updated content and organization includes more coverage of database control issues, earlier coverage of SQL, and new coverage of data quality issues.

Oracle Hyperion Financial Management Tips And Techniques

Master Oracle Hyperion Financial Management

Consolidate financial data and maintain a scalable compliance framework with expert instruction from an Oracle ACE. Oracle Hyperion Financial Management Tips & Techniques provides advanced, time-saving procedures not documented in user manuals or help files. Find out how to configure Oracle Hyperion Financial Management, import and reconcile data, deliver dynamic business reports, and automate administrative tasks. Strategies for supporting, testing, and tuning your application are also covered in this comprehensive Oracle Press guide.

* Establish objectives and develop an effective rollout plan
* Set up and customize Oracle Hyperion Financial Management
* Create rules using VBScript and the Calculation Manager feature of Oracle Hyperion Foundation Services
* Load, test, and reconcile your data with Oracle Data Integrator and Oracle Hyperion Financial Data Quality Management
* Design, update, and distribute Web-based business reports
* Integrate content from Microsoft Excel, Word, and PowerPoint using SmartView
* Work with the Lifecycle Management feature of Oracle Hyperion Foundation Services
* Identify and resolve performance, design, and capacity problems

Data Architecture

Data Architecture: From Zen to Reality explains the principles underlying data architecture, how data evolves with organizations, and the challenges organizations face in structuring and managing their data. Using a holistic approach to the field of data architecture, the book describes proven methods and technologies to solve the complex issues dealing with data. It covers the various applied areas of data, including data modelling and data model management, data quality, data governance, enterprise information management, database design, data warehousing, and warehouse design. This text is a core resource for anyone customizing or aligning data management systems, taking the Zen-like idea of data architecture to an attainable reality. The book presents fundamental concepts of enterprise architecture with definitions and real-world applications and scenarios. It teaches data managers and planners about the challenges of building a data architecture roadmap, structuring the right team, and building a long term set of solutions. It includes the detail needed to illustrate how the fundamental principles are used in current business practice. The book is divided into five sections, one of which addresses the software-application development process, defining tools, techniques, and methods that ensure repeatable results. Data Architecture is intended for people in business management involved with corporate data issues and information technology decisions, ranging from data architects to IT consultants, IT auditors, and data administrators. It is also an ideal reference tool for those in a higher-level education process involved in data or information technology management. 

Entity Resolution and Information Quality

Entity Resolution and Information Quality presents topics and definitions, and clarifies confusing terminologies regarding entity resolution and information quality. It takes a very wide view of IQ, including its six-domain framework and the skills formed by the International Association for Information and Data Quality (IAIDQ). The book includes chapters that cover the principles of entity resolution and the principles of Information Quality, in addition to their concepts and terminology. It also discusses the Fellegi-Sunter theory of record linkage, the Stanford Entity Resolution Framework, and the Algebraic Model for Entity Resolution, which are the major theoretical models that support Entity Resolution. In relation to this, the book briefly discusses entity-based data integration (EBDI) and its model, which serve as an extension of the Algebraic Model for Entity Resolution. There is also an explanation of how the three commercial ER systems operate and a description of the non-commercial open-source system known as OYSTER. The book concludes by discussing trends in entity resolution research and practice. Students taking IT courses and IT professionals will find this book invaluable.

* First authoritative reference explaining entity resolution and how to use it effectively
* Provides practical system design advice to help you get a competitive advantage
* Includes a companion site with synthetic customer data for applicatory exercises, and access to a Java-based Entity Resolution program

DW 2.0: The Architecture for the Next Generation of Data Warehousing

DW 2.0: The Architecture for the Next Generation of Data Warehousing is the first book on the new generation of data warehouse architecture, DW 2.0, by the father of the data warehouse. The book describes the future of data warehousing that is technologically possible today, at both an architectural level and technology level. The perspective of the book is from the top down: looking at the overall architecture and then delving into the issues underlying the components. This allows people who are building or using a data warehouse to see what lies ahead and determine what new technology to buy, how to plan extensions to the data warehouse, what can be salvaged from the current system, and how to justify the expense at the most practical level. This book gives experienced data warehouse professionals everything they need in order to implement the new generation DW 2.0. It is designed for professionals in the IT organization, including data architects, DBAs, systems design and development professionals, as well as data warehouse and knowledge management professionals.

* First book on the new generation of data warehouse architecture, DW 2.0
* Written by the "father of the data warehouse", Bill Inmon, a columnist and newsletter editor of The Bill Inmon Channel on the Business Intelligence Network
* Long overdue comprehensive coverage of the implementation of technology and tools that enable the new generation of the DW: metadata, temporal data, ETL, unstructured data, and data quality control

Star Schema The Complete Reference

The definitive guide to dimensional design for your data warehouse

Learn the best practices of dimensional design. Star Schema: The Complete Reference offers in-depth coverage of design principles and their underlying rationales. Organized around design concepts and illustrated with detailed examples, this is a step-by-step guidebook for beginners and a comprehensive resource for experts. This all-inclusive volume begins with dimensional design fundamentals and shows how they fit into diverse data warehouse architectures, including those of W.H. Inmon and Ralph Kimball. The book progresses through a series of advanced techniques that help you address real-world complexity, maximize performance, and adapt to the requirements of BI and ETL software products. You are furnished with design tasks and deliverables that can be incorporated into any project, regardless of architecture or methodology.

* Master the fundamentals of star schema design and slow change processing
* Identify situations that call for multiple stars or cubes
* Ensure compatibility across subject areas as your data warehouse grows
* Accommodate repeating attributes, recursive hierarchies, and poor data quality
* Support conflicting requirements for historic data
* Handle variation within a business process and correlation of disparate activities
* Boost performance using derived schemas and aggregates
* Learn when it's appropriate to adjust designs for BI and ETL tools

Head First Data Analysis

Today, interpreting data is a critical decision-making factor for businesses and organizations. If your job requires you to manage and analyze all kinds of data, turn to Head First Data Analysis, where you'll quickly learn how to collect and organize data, sort the distractions from the truth, find meaningful patterns, draw conclusions, predict the future, and present your findings to others. Whether you're a product developer researching the market viability of a new product or service, a marketing manager gauging or predicting the effectiveness of a campaign, a salesperson who needs data to support product presentations, or a lone entrepreneur responsible for all of these data-intensive functions and more, the unique approach in Head First Data Analysis is by far the most efficient way to learn what you need to know to convert raw data into a vital business tool.

You'll learn how to:

* Determine which data sources to use for collecting information
* Assess data quality and distinguish signal from noise
* Build basic data models to illuminate patterns, and assimilate new information into the models
* Cope with ambiguous information
* Design experiments to test hypotheses and draw conclusions
* Use segmentation to organize your data within discrete market groups
* Visualize data distributions to reveal new relationships and persuade others
* Predict the future with sampling and probability models
* Clean your data to make it useful
* Communicate the results of your analysis to your audience

Using the latest research in cognitive science and learning theory to craft a multi-sensory learning experience, Head First Data Analysis uses a visually rich format designed for the way your brain works, not a text-heavy approach that puts you to sleep.

Tapping into Unstructured Data: Integrating Unstructured Data and Textual Analytics into Business Intelligence

“The authors, the best minds on the topic, are breaking new ground. They show how every organization can realize the benefits of a system that can search and present complex ideas or data from what has been a mostly untapped source of raw data.” --Randy Chalfant, CTO, Sun Microsystems

The Definitive Guide to Unstructured Data Management and Analysis--From the World’s Leading Information Management Expert

A wealth of invaluable information exists in unstructured textual form, but organizations have found it difficult or impossible to access and utilize it. This is changing rapidly: new approaches finally make it possible to glean useful knowledge from virtually any collection of unstructured data. William H. Inmon--the father of data warehousing--and Anthony Nesavich introduce the next data revolution: unstructured data management. Inmon and Nesavich cover all you need to know to make unstructured data work for your organization. You’ll learn how to bring it into your existing structured data environment, leverage existing analytical infrastructure, and implement textual analytic processing technologies to solve new problems and uncover new opportunities.

Inmon and Nesavich introduce breakthrough techniques covered in no other book--including the powerful role of textual integration, new ways to integrate textual data into data warehouses, and new SQL techniques for reading and analyzing text. They also present five chapter-length, real-world case studies--demonstrating unstructured data at work in medical research, insurance, chemical manufacturing, contracting, and beyond. This book will be indispensable to every business and technical professional trying to make sense of a large body of unstructured text: managers, database designers, data modelers, DBAs, researchers, and end users alike.
Coverage includes

* What unstructured data is, and how it differs from structured data
* First generation technology for handling unstructured data, from search engines to ECM--and its limitations
* Integrating text so it can be analyzed with a common, colloquial vocabulary: integration engines, ontologies, glossaries, and taxonomies
* Processing semistructured data: uncovering patterns, words, identifiers, and conflicts
* Novel processing opportunities that arise when text is freed from context
* Architecture and unstructured data: Data Warehousing 2.0
* Building unstructured relational databases and linking them to structured data
* Visualizations and Self-Organizing Maps (SOMs), including Compudigm and Raptor solutions
* Capturing knowledge from spreadsheet data and email
* Implementing and managing metadata: data models, data quality, and more

William H. Inmon is founder, president, and CTO of Inmon Data Systems. He is the father of the data warehouse concept, the corporate information factory, and the government information factory. Inmon has written 47 books on data warehouse, database, and information technology management; as well as more than 750 articles for trade journals such as Data Management Review, Byte, Datamation, and ComputerWorld. His b-eye-network.com newsletter currently reaches 55,000 people.

Anthony Nesavich worked at Inmon Data Systems, where he developed multiple reports that successfully query unstructured data.
Preface
1 Unstructured Textual Data in the Organization
2 The Environments of Structured Data and Unstructured Data
3 First Generation Textual Analytics
4 Integrating Unstructured Text into the Structured Environment
5 Semistructured Data
6 Architecture and Textual Analytics
7 The Unstructured Database
8 Analyzing a Combination of Unstructured Data and Structured Data
9 Analyzing Text Through Visualization
10 Spreadsheets and Email
11 Metadata in Unstructured Data
12 A Methodology for Textual Analytics
13 Merging Unstructured Databases into the Data Warehouse
14 Using SQL to Analyze Text
15 Case Study--Textual Analytics in Medical Research
16 Case Study--A Database for Harmful Chemicals
17 Case Study--Managing Contracts Through an Unstructured Database
18 Case Study--Creating a Corporate Taxonomy (Glossary)
19 Case Study--Insurance Claims
Glossary
Index

Siebel 7.8 with IBM DB2 UDB V8.2 Handbook

This IBM Redbooks publication delivers details about DB2 UDB V8.2 on Siebel 7.8. It outlines the partnership between Siebel Systems and IBM and the benefits of using DB2 UDB to support the Siebel Enterprise. The most commonly used components of the Siebel Enterprise and the DB2 UDB architecture are described. We provide the planning considerations for running DB2 UDB in a Siebel environment, followed by step-by-step installation and configuration details. We then describe methods to populate and maintain data in Siebel tables, including data archival techniques, and how to ensure data integrity and data quality. The database administration, monitoring, and tuning tools provided by DB2 UDB and the operating systems are discussed, along with their usage. The book also provides in-depth discussion of high availability and disaster recovery options and the setup procedure for a Siebel/DB2 UDB environment. Finally, the book provides information about the components of Siebel Analytics and where these components fit in the overall scheme with Siebel Enterprise.

The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data

* Cowritten by Ralph Kimball, the world's leading data warehousing authority, whose previous books have sold more than 150,000 copies
* Delivers real-world solutions for the most time- and labor-intensive portion of data warehousing: data staging, or the extract, transform, load (ETL) process
* Delineates best practices for extracting data from scattered sources, removing redundant and inaccurate data, transforming the remaining data into correctly formatted data structures, and then loading the end product into the data warehouse
* Offers proven time-saving ETL techniques, comprehensive guidance on building dimensional structures, and crucial advice on ensuring data quality

Business Intelligence

Business Intelligence describes the basic architectural components of a business intelligence environment, ranging from traditional topics such as business process modeling and data modeling to more modern topics such as business rule systems, data profiling, information compliance and data quality, data warehousing, and data mining. This book progresses through a logical sequence, starting with data model infrastructure, then data preparation, followed by data analysis, integration, knowledge discovery, and finally the actual use of discovered knowledge. The book contains a quick reference guide for business intelligence terminology. Business Intelligence is part of Morgan Kaufmann's Savvy Manager's Guide series.

* Provides clear explanations without technical jargon, followed by in-depth descriptions.
* Articulates the business value of new technology, while providing relevant introductory technical background.
* Contains a handy quick-reference to technologies and terminologies.
* Guides managers through developing, administering, or simply understanding business intelligence technology.
* Bridges the business-technical gap.
* Is Web enhanced. Companion sites to the book and series provide value-added information, links, discussions, and more.

Data Warehousing And Business Intelligence For e-Commerce

You go online to buy a digital camera. Soon, you realize you've bought a more expensive camera than intended, along with extra batteries, charger, and graphics software, all at the prompting of the retailer. Happy with your purchases? The retailer certainly is, and if you are too, you both can be said to be the beneficiaries of "customer intimacy" achieved through the transformation of data collected during this visit or stored from previous visits into real business intelligence that can be exercised in real time.

Data Warehousing and Business Intelligence for e-Commerce is a practical exploration of the technological innovations through which traditional data warehousing is brought to bear on this and other less modest e-commerce applications, such as those at work in B2B, G2C, B2G, and B2E models. The authors examine the core technologies and commercial products in use today, providing a nuts-and-bolts understanding of how you can deploy customer and product data in ways that meet the unique requirements of the online marketplace, particularly if you are part of a brick-and-mortar company with specific online aspirations. In so doing, they build a powerful case for investment in and aggressive development of these approaches, which are likely to separate winners from losers as e-commerce grows and matures.

* Includes the latest from successful data warehousing consultants whose work has encouraged the field's new focus on e-commerce.
* Presents information that is written for both consultants and practitioners in companies of all sizes.
* Emphasizes the special needs and opportunities of traditional brick-and-mortar businesses that are going online or participating in B2B supply chains or e-marketplaces.
* Explains how long-standing assumptions about data warehousing have to be rethought in light of emerging business models that depend on customer intimacy.
* Provides advice on maintaining data quality and integrity in environments marked by extensive customer self-input.
* Advocates careful planning that will help both old economy and new economy companies develop long-lived and successful e-commerce strategies.
* Focuses on data warehousing for emerging e-commerce areas such as e-government and B2E environments.

This presentation is a roadmap for running a successful Data Science and AI team. It advocates for a focus on experimental agility, and what is needed to achieve that from different angles. We'll explore the importance of teamwork and a culture of continuous learning. We'll talk about the importance of testability, reproducibility, and learning from mistakes. Also, we'll discuss sharing results and insights within the team to foster collective learning. Lastly, we'll circle back to data quality, emphasizing its crucial role in model performance.