Data Management

Data Integration Blueprint and Modeling: Techniques for a Scalable and Sustainable Architecture

2010-12-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Anthony David Giordano

Analytics BI Data Governance DWH IBM data data-engineering data-warehouse storage-repositories

Making Data Integration Work: How to Systematically Reduce Cost, Improve Quality, and Enhance Effectiveness
Today’s enterprises are investing massive resources in data integration. Many possess thousands of point-to-point data integration applications that are costly, undocumented, and difficult to maintain. Data integration now accounts for a major part of the expense and risk of typical data warehousing and business intelligence projects--and, as businesses increasingly rely on analytics, the need for a data integration blueprint is greater than ever. This book presents the solution: a clear, consistent approach to defining, designing, and building data integration components to reduce cost, simplify management, enhance quality, and improve effectiveness. Leading IBM data management expert Tony Giordano brings together best practices for architecture, design, and methodology, and shows how to do the disciplined work of getting data integration right. Mr. Giordano begins with an overview of the “patterns” of data integration, showing how to build blueprints that smoothly handle both operational and analytic data integration. Next, he walks through the entire project lifecycle, explaining each phase, activity, task, and deliverable through a complete case study. Finally, he shows how to integrate data integration with other information management disciplines, from data governance to metadata. The book’s appendices bring together key principles, detailed models, and a complete data integration glossary.
Coverage includes: implementing repeatable, efficient, and well-documented processes for integrating data; lowering costs and improving quality by eliminating unnecessary or duplicative data integrations; managing the high levels of complexity associated with integrating business and technical data; using intuitive graphical design techniques for more effective process and data integration modeling; and building end-to-end data integration applications that bring together many complex data sources.

Managing Time in Relational Databases

2010-08-19 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Randall Weis , Tom Johnston

RDBMS data data-engineering relational-databases

Managing Time in Relational Databases: How to Design, Update and Query Temporal Data introduces basic concepts that will enable businesses to develop their own framework for managing temporal data. It discusses the management of uni-temporal and bi-temporal data in relational databases, so that they can be seamlessly accessed together with current data; the encapsulation of temporal data structures and processes; ways to implement temporal data management as an enterprise solution; and the internalization of pipeline datasets. The book is organized into three parts. Part 1 traces the history of temporal data management and presents a taxonomy of bi-temporal data management methods. Part 2 provides an introduction to Asserted Versioning, covering the origins of Asserted Versioning; core concepts of Asserted Versioning; the schema common to all asserted version tables, as well as the various diagrams and notations used in the rest of the book; and how the basic scenario works when the target of that activity is an asserted version table. Part 3 deals with designing, maintaining, and querying asserted version databases. It discusses the design of Asserted Versioning databases; temporal transactions; deferred assertions and other pipeline datasets; Allen relationships; and optimizing Asserted Versioning databases. The book integrates an enterprise-wide viewpoint with a strong conceptual model of temporal data management, allowing for realistic implementation in database application development. It provides a practical guide to the different possible methods for time-oriented databases, with techniques for using existing functionality to solve real-world problems within an enterprise data architecture environment. Written by IT professionals for IT professionals, this book employs a heavily example-driven approach that reinforces learning by showing the results of putting the techniques discussed into practice.

Microsoft® Access® 2010 Inside Out

2010-08-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by John Viescas and Jeff Conrad

Microsoft VBA data data-engineering database-management-tools microsoft-access

You're beyond the basics, so dive right in and really put your database skills to work! This supremely organized reference is packed with hundreds of timesaving solutions, troubleshooting tips, and workarounds. It's all muscle and no fluff. Discover how the experts tackle Access 2010 -- and challenge yourself to new levels of mastery!
Master essential data management and design techniques; import and link to data from spreadsheets, databases, text files, and other sources; use action queries to quickly insert, update, or delete entire sets of data; create custom forms to capture and display data; design reports to calculate, summarize, and highlight critical data--and learn advanced techniques; automate your application with macros and Visual Basic for Applications (VBA); use Access Services to extend your database application to the Web; try out the sample client and web database applications in both 32-bit and 64-bit versions.
A Note Regarding the CD or DVD
The print version of this book ships with a CD or DVD. The sample client and web database applications are provided in both 32-bit and 64-bit versions. Note that while we provide as much of the media content as we are able via free download, we are sometimes limited by licensing restrictions. For customers who purchase an ebook version of this title, instructions for downloading the CD files can be found in the ebook.

Database Modeling and Design, 4th Edition

2010-08-05 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by H.V. Jagadish , Tom Nadeau , Toby J. Teorey , Sam Lightstone (IBM)

BI Data Modelling DWH SQL data data-engineering data-models

Database Modeling and Design, Fourth Edition, the extensively revised edition of the classic logical database design reference, explains how you can model and design your database application in consideration of new technology or new business needs. It is an ideal text for a stand-alone data management course focused on logical database design, or a supplement to an introductory text for introductory database management. This book features clear explanations, lots of terrific examples and an illustrative case, and practical advice, with design rules that are applicable to any SQL-based system. The common examples are based on real-life experiences and have been thoroughly class-tested. The text takes a detailed look at the Unified Modeling Language (UML-2) as well as the entity-relationship (ER) approach for data requirements specification and conceptual modeling - complemented with examples for both approaches. It also discusses the use of data modeling concepts in logical database design; the transformation of the conceptual model to the relational model and to SQL syntax; the fundamentals of database normalization through the fifth normal form; and the major issues in business intelligence such as data warehousing, OLAP for decision support systems, and data mining. There are examples for how to use the most popular CASE tools to handle complex data modeling problems, along with exercises that test understanding of all material, plus solutions for many exercises. Lecture notes and a solutions manual are also available. 
This edition will appeal to professional data modelers and database design professionals, including database application designers, and database administrators (DBAs); new/novice data management professionals, such as those working on object-oriented database design; and students in second courses in database focusing on design.
+ a detailed look at the Unified Modeling Language (UML-2) as well as the entity-relationship (ER) approach for data requirements specification and conceptual modeling--with examples throughout the book in both approaches;
+ the details and examples of how to use data modeling concepts in logical database design, and the transformation of the conceptual model to the relational model and to SQL syntax;
+ the fundamentals of database normalization through the fifth normal form;
+ practical coverage of the major issues in business intelligence--data warehousing, OLAP for decision support systems, and data mining;
+ examples for how to use the most popular CASE tools to handle complex data modeling problems;
+ exercises that test understanding of all material, plus solutions for many exercises.

Data Model Patterns: A Metadata Map

2010-07-20 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by David C. Hay

Data Modelling DWH data data-engineering metadata

Data Model Patterns: A Metadata Map not only presents a conceptual model of a metadata repository but also demonstrates a true enterprise data model of the information technology industry itself. It provides a step-by-step description of the model and is organized so that different readers can benefit from different parts. It offers a view of the world being addressed by all the techniques, methods, and tools of the information processing industry (for example, object-oriented design, CASE, business process re-engineering, etc.) and presents several concepts that need to be addressed by such tools. This book is pertinent because companies and government agencies, realizing that the data they use represent a significant corporate resource, recognize the need to integrate data that has traditionally only been available from disparate sources. An important component of this integration is management of the "metadata" that describe, catalogue, and provide access to the various forms of underlying business data. The "metadata repository" is essential to keep track of the various physical components of these systems and their semantics. The book is ideal for data management professionals, data modeling and design professionals, and data warehouse and database repository designers. It is a comprehensive work based on the Zachman Framework for information architecture, encompassing the Business Owner's, Architect's, and Designer's views for all columns (data, activities, locations, people, timing, and motivation). It also presents many concepts that such tools do not currently address but should.

Statistical Programming in SAS®

2010-07-01 · O'Reilly Data Science Books O'Reilly Amazon

book

by A. John Bailer

SAS data data-science data-science-tasks statistics

In Statistical Programming in SAS, author A. John Bailer integrates SAS tools with interesting statistical applications and uses SAS 9.2 as a platform to introduce programming ideas for statistical analysis, data management, and data display and simulation. Written using a reader-friendly and narrative style, the book includes extensive examples and case studies to present a well-structured introduction to programming issues. This book has two parts. The first part addresses the nuts and bolts of programming, including fostering good programming habits, getting external data sets into SAS to construct an analysis data set, generating basic descriptive statistical summaries, producing customized tables, generating more attractive output, and producing high-quality graphical displays. The second part emphasizes programming in the context of a DATA step, in macros, and in SAS/IML software. Examples of statistical methods and concepts not always encountered in basic statistics courses (for example, bootstrapping, randomization tests, and jittering) are used to illustrate programming ideas. This book provides extensive illustrations of the new ODS Statistical Graphics procedures in SAS, a description of the new ODS Graphics Editor, and a brief introduction to some of the capabilities of SAS/IML Studio, such as producing dynamically linked data displays and invoking R from SAS.

Oracle Coherence 3.5

2010-03-08 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Aleksandar Seovic

Java Oracle data data-engineering oracle-database-solutions

If you're aiming to build high-performance applications that can scale seamlessly to accommodate hundreds or thousands of users, Oracle Coherence 3.5 may be just the tool you need. This book serves as a comprehensive guide to utilizing Oracle's Coherence data grid technology, emphasizing how to design and develop robust, scalable, and responsive applications.
What this Book will help me do
Design effective domain objects optimized for Oracle Coherence to enhance scalability and performance. Implement distributed caching to efficiently manage data across your application's architecture. Leverage real-time event processing with Coherence to provide up-to-date information to users. Integrate Coherence with other persistence solutions like JDBC and Hibernate for versatile, robust data management. Utilize parallel processing capabilities within the Coherence grid to achieve superior application performance.
Author(s)
Aleksandar Seovic, the author of this book, has substantial experience in developing enterprise-scale applications and a deep expertise in utilizing Oracle Coherence. They are dedicated to writing technical material that is both approachable and comprehensive for developers and architects. Their industry insights ensure that this book addresses real-world challenges effectively.
Who is it for?
This book is ideal for architects and developers designing Internet and enterprise applications requiring high scalability and responsiveness. Readers should have a solid understanding of Java and familiarity with Domain-Driven Design (DDD) principles. If you aim to optimize application performance through technologies like Oracle Coherence, you'll find valuable insights here.

Random Data: Analysis and Measurement Procedures, Fourth Edition

2010-02-08 · O'Reilly Data Science Books O'Reilly Amazon

book

by JULIUS S. BENDAT , ALLAN G. PIERSOL

data data-science data-science-tasks statistics

A timely update of the classic book on the theory and application of random data analysis
First published in 1971, Random Data served as an authoritative book on the analysis of experimental physical data for engineering and scientific applications. This Fourth Edition features coverage of new developments in random data management and analysis procedures that are applicable to a broad range of applied fields, from the aerospace and automotive industries to oceanographic and biomedical research. This new edition continues to maintain a balance of classic theory and novel techniques. The authors expand on the treatment of random data analysis theory, including derivations of key relationships in probability and random process theory. The book remains unique in its practical treatment of nonstationary data analysis and nonlinear system analysis, presenting the latest techniques on modern data acquisition, storage, conversion, and qualification of random data prior to its digital analysis.
The Fourth Edition also includes: a new chapter on frequency domain techniques to model and identify nonlinear systems from measured input/output random data; new material on the analysis of multiple-input/single-output linear models; the latest recommended methods for data acquisition and processing of random data; important mathematical formulas to design experiments and evaluate results of random data analysis and measurement procedures; and answers to the problems in each chapter.
Comprehensive and self-contained, Random Data, Fourth Edition is an indispensable book for courses on random data analysis theory and applications at the upper-undergraduate and graduate level. It is also an insightful reference for engineers and scientists who use statistical methods to investigate and solve problems with dynamic data.

DB2® pureXML® Cookbook: Master the Power of the IBM® Hybrid Data Server

2009-08-10 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Pav Kumar-Chatterjee , Matthias Nicola (Snowflake)

Data Modelling IBM Java Linux SQL Unix XML data data-engineering ibm-db2 relational-databases

DB2 pureXML Cookbook: Master the Power of the IBM Hybrid Data Server
Hands-On Solutions and Best Practices for Developing and Managing XML Database Applications with DB2
More and more database developers and DBAs are being asked to develop applications and manage databases that involve XML data. Many are utilizing the highly praised DB2 pureXML technology from IBM. In the DB2 pureXML Cookbook, two leading experts from IBM offer the practical solutions and proven code samples that database professionals need to build better XML solutions faster. Organized by task, this book is packed with more than 700 easy-to-adapt “recipe-style” examples covering the entire application lifecycle, from planning and design through coding, optimization, and troubleshooting. This extraordinary library of recipes includes more than 250 XQuery and SQL/XML queries. With the authors’ hands-on guidance, you’ll learn how to combine pureXML “ingredients” to efficiently perform virtually any XML data management task, from the simplest to the most advanced.
Coverage includes: pureXML in DB2 9 for z/OS and DB2 9.1, 9.5, and 9.7 for Linux, UNIX, and Windows; best practices for designing XML data, applications, and storage objects; importing, exporting, loading, replicating, and federating XML data; querying XML data, from start to finish: XPath and XQuery data model and languages, SQL/XML, stored procedures, UDFs, and much more; avoiding common errors and inefficient XML queries; converting relational data to XML and vice versa; updating and transforming XML documents; defining and working with XML indexes; monitoring and optimizing the performance of XML queries and other operations; using XML Schemas to constrain and validate XML documents; and XML application development, including code samples for Java, .NET, C, COBOL, PL/1, PHP, and Perl.

Release 2.0: Issue 11

2009-03-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jimmy Guterman

Big Data data data-engineering google-bigquery

Big Data: when the size and performance requirements for data management become significant design and decision factors for implementing a data management and analysis system. For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration.

Microsoft® SQL Server™ 2008 Integration Services Unleashed

2009-01-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Kirk Haselden

C#/.NET Master Data Management Microsoft SQL SQL Server data data-engineering microsoft-sql-server relational-databases

Microsoft SQL Server Integration Services is Microsoft’s powerful platform for building enterprise-level data integration and data transformation solutions. It’s a powerful product, but it’s also complex and can be confusing if you don’t have a clear map for the journey. Microsoft SQL Server 2008 Integration Services Unleashed will be the only book you’ll need to harness the power that Integration Services provides. Through clear, concise explanations and samples, you’ll gain a clear understanding of working in the Integration Services environment, including how to set up stock components, how to use the various designer features, and how to gain practical knowledge on configuring, deploying, securing, and managing packages. Sample packages are provided to reinforce the discussion and quickly help you gain hands-on experience, and more complex topics such as Data Flow Task internals and tuning, advanced transformations, and writing custom components are all illustrated in easy-to-understand graphics. In addition, there are several custom tasks and transformations and two useful utilities with full source code available for you to use and study, including an ADO.NET destination, a text file encryption task, and a data profiling transform.
Detailed information on: using the powerful Integration Services tools to create solutions without the need to write lines of code; creating packages programmatically or developing custom tasks via the Integration Services object; building robust packages to solve common requirements; securing packages for different environments; using often overlooked or unknown platform features; setting up all the stock components, including data flow components, tasks, Foreach enumerators, connection managers, and log providers; writing robust and useful custom tasks; building packages that seamlessly deploy to other environments; writing custom data flow adapters and transforms; using script tasks and components; easily modifying configurations for multiple packages simultaneously; writing a Task UI that looks just like the stock tasks; tapping into the power of Integration Services for accessing heterogeneous data sources; using expressions to make packages more responsive to the environment; and migrating your DTS packages with no stress.
Kirk Haselden is the Group Program Manager for the Microsoft Master Data Management product forthcoming in the next wave of Office SharePoint Services and owns the long-term strategy, vision, planning, and development of that product. Kirk has been with Microsoft for 12 years in various groups including Hardware, eHome, Connected Home, SQL Server, and Office Business Platform. He was the development manager for Integration Services and the primary designer for the runtime, as well as many of the tasks. He has written a number of articles for SQL Server Magazine, speaks regularly at industry events, writes profusely on his personal and MSDN blog, and holds 35 patents or patents pending.
Category: Microsoft SQL Server. Covers: Microsoft SQL Server 2008 Integration Services. User Level: Intermediate-Advanced. $59.99 US / $71.99 CAN / £38.99 Net UK

The Manga Guide to Databases

2009-01-28 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mana Takahashi , Trend-Pro Co., Ltd. , Shoko Azuma

SQL data data-engineering relational-databases

Want to learn about databases without the tedium? With its unique combination of Japanese-style comics and serious educational content, The Manga Guide to Databases is just the book for you. Princess Ruruna is stressed out. With the king and queen away, she has to manage the Kingdom of Kod's humongous fruit-selling empire. Overseas departments, scads of inventory, conflicting prices, and so many customers! It's all such a confusing mess. But a mysterious book and a helpful fairy promise to solve her organizational problems with the practical magic of databases. In The Manga Guide to Databases, Tico the fairy teaches the Princess how to simplify her data management. We follow along as they design a relational database, understand the entity-relationship model, perform basic database operations, and delve into more advanced topics. Once the Princess is familiar with transactions and basic SQL statements, she can keep her data timely and accurate for the entire kingdom. Finally, Tico explains ways to make the database more efficient and secure, and they discuss methods for concurrency and replication. Examples and exercises (with answer keys) help you learn, and an appendix of frequently used SQL statements gives the tools you need to create and maintain full-featured databases. (Of course, it wouldn't be a royal kingdom without some drama, so read on to find out who gets the girl: the arrogant prince or the humble servant.) This EduManga book is a translation of a bestselling series in Japan, co-published with Ohmsha, Ltd., of Tokyo, Japan.

The Data Model Resource Book, Volume 3: Universal Patterns for Data Modeling

2009-01-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Len Silverston , Paul Agnew

BI Data Modelling DWH XML data data-engineering data-models

This third volume of the best-selling "Data Model Resource Book" series revolutionizes the data modeling discipline by answering the question "How can you save significant time while improving the quality of any type of data modeling effort?" In contrast to the first two volumes, this new volume focuses on the fundamental, underlying patterns that affect over 50 percent of most data modeling efforts. These patterns can be used to considerably reduce modeling time and cost, to jump-start data modeling efforts, as standards and guidelines to increase data model consistency and quality, and as an objective source against which an enterprise can evaluate data models.
Praise for The Data Model Resource Book, Volume 3
"Len and Paul look beneath the superficial issues of data modeling and have produced a work that is a must for every serious designer and manager of an IT project."
"The Data Model Resource Book, Volume 3: Universal Patterns for Data Modeling is a great source for reusable patterns you can use to save a tremendous amount of time, effort, and cost on any data modeling effort. Len Silverston and Paul Agnew have provided an indispensable reference of very high-quality patterns for the most foundational types of data model structures. This book represents a revolutionary leap in moving the data modeling profession forward." -- Ron Powell, Cofounder and Editorial Director of the Business Intelligence Network
"After we model a Customer, Product, or Order, there is still more about each of these that remains to be captured, such as roles they play, classifications in which they belong, or states in which they change. The Data Model Resource Book, Volume 3: Universal Patterns for Data Modeling clearly illustrates these common structures. Len Silverston and Paul Agnew have created a valuable addition to our field, allowing us to improve the consistency and quality of our models by leveraging the many common structures within this text." -- Steve Hoberman, Best-Selling Author of Data Modeling Made Simple
"The large national health insurance company I work at has actively used these data patterns and the Universal Data Models (UDM), ahead of this book, through Len Silverston’s UDM Jump Start engagement. The patterns have found their way into the core of our Enterprise Information Model, our data warehouse designs, and progressively into key business function databases. We are getting to reuse the patterns across projects and are reaping benefits in understanding, flexibility, and time-to-market. Thanks so much." -- David Chasteen, Enterprise Information Architect
"Reusing proven data modeling design patterns means exactly that. Data models become stable, but remain very flexible to accommodate changes. We have had the fortune of having Len and Paul share the patterns that are described in this book via our engagements with Universal Data Models, LLC. These data modeling design patterns have helped us to focus on the essential business issues because we have leveraged these reusable building blocks for many of the standard design problems. These design patterns have also helped us to evaluate the quality of data models for their intended purpose. Many times there are a lot of enhancements required. Too often the very specialized business-oriented data model is also implemented physically. This may have significant drawbacks to flexibility. I’m looking forward to increasing the data modeling design pattern competence within Nokia with the help of this book." -- Teemu Mattelmaki, Chief Information Architect, Nokia
"Once again, Len Silverston, this time together with Paul Agnew, has made a valuable contribution to the body of knowledge about data models, and the act of building sound data models. As a professional data modeler, and teacher of data modeling for almost three decades, I have always been aware that I had developed some familiar mental "patterns" which I acquired very early in my data modeling experience. When teaching data modeling, we use relatively simple workshops, but they are carefully designed so the students will see and acquire a lot of these basic "patterns" -- templates that they will recognize and can use to interpret different subject matter into data model form quickly and easily. I’ve always used these patterns in the course of facilitating data modeling sessions; I was able to recognize "Ah, this is just like . . .," and quickly apply a pattern that I’d seen before. But, in all this time, I’ve never sat down and clearly categorized and documented what each of these "patterns" actually was in such a way that they could be easily and clearly communicated to others; Len and Paul have done exactly that. As in the other Data Model Resource Books, the thinking and writing is extraordinarily clear and understandable. I personally would have been very proud to have authored this book, and I sincerely applaud Len and Paul for another great contribution to the art and science of data modeling. It will be of great value to any data modeler." -- William G. Smith, President, William G. Smith & Associates, www.williamgsmith.com
"Len Silverston and Paul Agnew’s book, Universal Patterns for Data Modeling, is essential reading for anyone undertaking commercial data modeling. With this latest volume that compiles and insightfully describes fundamental, universal data patterns, The Data Model Resource Book series represents the most important contribution to the data modeling discipline in the last decade." -- Dr. Graeme Simsion, Author of Data Modeling Essentials and Data Modeling Theory and Practice
"Volume 3 of this trilogy is a most welcome addition to Len Silverston’s two previous books in this area. Guidance has existed for some time for those who desire to use pattern-based analysis to jump-start their data modeling efforts. Guidance exists for those who want to use generalized and industry-specific data constructs to leverage their efforts. What has been missing is guidance to those of us needing guidance to complete the roughly one-third of data models that are not generalized or industry-specific. This is where the magic of individual organizational strategies must manifest itself, and Len and Paul have done so clearly and articulately in a manner that complements the first two volumes of The Data Model Resource Book. By adding this book to Volumes 1 and 2 you will be gaining access to some of the most integrated data modeling guidance available on the planet." -- Dr. Peter Aiken, Author of XML in Data Management and data management industry leader VCU/Data Blueprint

Microsoft® SQL Server® 2008 For Dummies®

2008-09-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mike Chapple

Microsoft SQL SQL Server SSIS SSRS data data-engineering microsoft-sql-server relational-databases

If you’re a database administrator, you know Microsoft SQL Server 2008 is revolutionizing database development. Get up to speed on SQL Server 2008, impress your boss, and improve your company’s data management: read Microsoft SQL Server 2008 For Dummies! SQL Server 2008 lets you build powerful databases and create database queries that give your organization the information it needs to excel. Microsoft SQL Server 2008 For Dummies helps you build the skills you need to set up, administer, and troubleshoot SQL Server 2008.
You’ll be able to: develop and maintain a SQL Server system; design databases with integrity and efficiency; turn data into information with SQL Server Reporting Services; organize query results, summarizing data with aggregate functions and formatting output; import large quantities of data with SSIS; keep your server running smoothly; protect data from prying eyes; develop and implement a disaster recovery plan; improve performance with database snapshots; and automate SQL Server 2008 administration.
Microsoft SQL Server 2008 For Dummies is a great first step toward becoming a SQL Server 2008 pro!

Statistics in a Nutshell

2008-07-25 · O'Reilly Data Science Books O'Reilly Amazon

book

by Sarah Boslaugh, Paul Andrew Watters

data data-science data-science-tasks statistics

Need to learn statistics as part of your job, or want some help passing a statistics course? Statistics in a Nutshell is a clear and concise introduction and reference that's perfect for anyone with no previous background in the subject. This book gives you a solid understanding of statistics without being too simple, yet without the numbing complexity of most college texts. You get a firm grasp of the fundamentals and a hands-on understanding of how to apply them before moving on to the more advanced material that follows. Each chapter presents you with easy-to-follow descriptions illustrated by graphics, formulas, and plenty of solved examples. Before you know it, you'll learn to apply statistical reasoning and statistical techniques, from basic concepts of probability and hypothesis testing to multivariate analysis. Organized into four distinct sections, Statistics in a Nutshell offers you:
Introductory material: different ways to think about statistics; basic concepts of measurement and probability theory; data management for statistical analysis; research design and experimental design; how to critique statistics presented by others
Basic inferential statistics: basic concepts of inferential statistics; the concept of correlation, when it is and is not an appropriate measure of association; dichotomous and categorical data; the distinction between parametric and nonparametric statistics
Advanced inferential techniques: the General Linear Model; Analysis of Variance (ANOVA) and MANOVA; multiple linear regression
Specialized techniques: business and quality improvement statistics; medical and public health statistics; educational and psychological statistics
Unlike many introductory books on the subject, Statistics in a Nutshell doesn't omit important material in an effort to dumb it down. And this book is far more practical than most college texts, which tend to over-emphasize calculation without teaching you when and how to apply different statistical tests.
With Statistics in a Nutshell, you learn how to perform most common statistical analyses, and understand statistical techniques presented in research articles. If you need to know how to use a wide range of statistical techniques without getting in over your head, this is the book you want.

Tapping into Unstructured Data: Integrating Unstructured Data and Textual Analytics into Business Intelligence

2007-12-11 · O'Reilly Data Science Books O'Reilly Amazon

book

by William H. Inmon, Anthony Nesavich

Analytics BI Data Quality DWH RDBMS SQL business-intelligence data data-science

“The authors, the best minds on the topic, are breaking new ground. They show how every organization can realize the benefits of a system that can search and present complex ideas or data from what has been a mostly untapped source of raw data.” --Randy Chalfant, CTO, Sun Microsystems The Definitive Guide to Unstructured Data Management and Analysis--From the World’s Leading Information Management Expert A wealth of invaluable information exists in unstructured textual form, but organizations have found it difficult or impossible to access and utilize it. This is changing rapidly: new approaches finally make it possible to glean useful knowledge from virtually any collection of unstructured data. William H. Inmon--the father of data warehousing--and Anthony Nesavich introduce the next data revolution: unstructured data management. Inmon and Nesavich cover all you need to know to make unstructured data work for your organization. You’ll learn how to bring it into your existing structured data environment, leverage existing analytical infrastructure, and implement textual analytic processing technologies to solve new problems and uncover new opportunities. Inmon and Nesavich introduce breakthrough techniques covered in no other book--including the powerful role of textual integration, new ways to integrate textual data into data warehouses, and new SQL techniques for reading and analyzing text. They also present five chapter-length, real-world case studies--demonstrating unstructured data at work in medical research, insurance, chemical manufacturing, contracting, and beyond. This book will be indispensable to every business and technical professional trying to make sense of a large body of unstructured text: managers, database designers, data modelers, DBAs, researchers, and end users alike. 
Coverage includes:
What unstructured data is, and how it differs from structured data
First generation technology for handling unstructured data, from search engines to ECM--and its limitations
Integrating text so it can be analyzed with a common, colloquial vocabulary: integration engines, ontologies, glossaries, and taxonomies
Processing semistructured data: uncovering patterns, words, identifiers, and conflicts
Novel processing opportunities that arise when text is freed from context
Architecture and unstructured data: Data Warehousing 2.0
Building unstructured relational databases and linking them to structured data
Visualizations and Self-Organizing Maps (SOMs), including Compudigm and Raptor solutions
Capturing knowledge from spreadsheet data and email
Implementing and managing metadata: data models, data quality, and more
William H. Inmon is founder, president, and CTO of Inmon Data Systems. He is the father of the data warehouse concept, the corporate information factory, and the government information factory. Inmon has written 47 books on data warehouse, database, and information technology management, as well as more than 750 articles for trade journals such as Data Management Review, Byte, Datamation, and ComputerWorld. His b-eye-network.com newsletter currently reaches 55,000 people. Anthony Nesavich worked at Inmon Data Systems, where he developed multiple reports that successfully query unstructured data.
Preface
1 Unstructured Textual Data in the Organization
2 The Environments of Structured Data and Unstructured Data
3 First Generation Textual Analytics
4 Integrating Unstructured Text into the Structured Environment
5 Semistructured Data
6 Architecture and Textual Analytics
7 The Unstructured Database
8 Analyzing a Combination of Unstructured Data and Structured Data
9 Analyzing Text Through Visualization
10 Spreadsheets and Email
11 Metadata in Unstructured Data
12 A Methodology for Textual Analytics
13 Merging Unstructured Databases into the Data Warehouse
14 Using SQL to Analyze Text
15 Case Study--Textual Analytics in Medical Research
16 Case Study--A Database for Harmful Chemicals
17 Case Study--Managing Contracts Through an Unstructured Database
18 Case Study--Creating a Corporate Taxonomy (Glossary)
19 Case Study--Insurance Claims
Glossary
Index

Oracle Automatic Storage Management: Under-the-Hood & Practical Deployment Guide

2007-11-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Murali Vallath, Nitin Vengurlekar, Rich Long

API Oracle RDBMS data data-engineering oracle-database-solutions

Streamline data management and provisioning. Build and manage a scalable storage infrastructure with Oracle Automatic Storage Management (Oracle ASM) using the detailed information contained in this exclusive Oracle Press resource. Written by a team of database experts, Oracle Automatic Storage Management: Under-the-Hood & Practical Deployment Guide explains how to build and maintain a dynamic, highly available Oracle database storage environment. Inside, you'll learn how to configure storage for Oracle ASM, build disk groups, use data striping and mirroring, and optimize performance. You'll also learn how to ensure consistency across server and storage platforms, maximize data redundancy, and administer Oracle ASM from the command line.
Manage Oracle ASM instances and configure Oracle RDBMS instances to leverage Oracle ASM
Define, discover, and manage disk storage under Oracle ASM
Create external, normal-redundancy, and high-redundancy disk groups
Add and remove Oracle ASM storage without affecting RDBMS instance availability
Learn how Oracle ASM provides even I/O distribution
Work with Oracle ASM directories, files, templates, and aliases
Improve storage performance and integrity using the ASMLIB API
Simplify system administration with the Oracle ASM command line interface
Understand key internal Oracle ASM structures and algorithms

Pro Oracle Spatial for Oracle Database 11g

2007-10-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Albert Godfrind, Ravi Kothuri, Euro Beinat

Oracle data data-engineering oracle-11g oracle-database-solutions

Pro Oracle Spatial for Oracle Database 11g shows how to take advantage of Oracle Database's built-in feature set for working with location-based data. Authors Ravi Kothuri and Albert Godfrind address the special nature of spatial data and its role in professional and consumer applications. They also detail issues in spatial data management, such as modeling, storing, accessing, and analyzing spatial data, as well as the Oracle Spatial solution and the integration of spatial data into enterprise databases. In addition, they cover how spatial information is used to understand business and support decisions, to manage customer relations, and to better serve private and corporate users. When you read Pro Oracle Spatial for Oracle Database 11g, you're learning from the very best. Ravi Kothuri is a key member of Oracle's Spatial development team. Albert Godfrind consults widely with Oracle clients on the implementation of Oracle Spatial, develops training courses, and presents frequently at conferences. Together they have crafted a technically sound and authoritative fountain of information on working with spatial data in the Oracle database.

Microsoft® SQL Server™ 2005: Database Essentials Step by Step

2006-09-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Solid Quality Learning

Microsoft Cyber Security SQL data data-engineering microsoft-sql-server relational-databases

SQL Server 2005 is Microsoft’s next-generation data management and analysis solution that delivers enhanced scalability, availability, and security features to enterprise data and analytical applications while making them easier to create, deploy, and manage. Now you can teach yourself how to design, build, test, deploy, and maintain SQL Server databases—one step at a time. With STEP BY STEP, you work at your own pace through hands-on, learn-by-doing exercises. Instead of merely focusing on describing new features, this book shows new database programmers and administrators how to use specific features within typical business scenarios. Each chapter puts you to work, providing a highly practical learning experience that demonstrates how to build database solutions to solve common business problems.

Microsoft® SQL Server™ 2005: Applied Techniques Step by Step

2006-06-21 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Solid Quality Learning

Microsoft Cyber Security SQL data data-engineering microsoft-sql-server relational-databases

SQL Server 2005 is Microsoft’s next-generation data management and analysis solution that delivers increased security, scalability, and availability to enterprise data and analytical applications while making them easier to create, deploy, and manage. This book shows readers with fundamental SQL Server skills, as well as new-to-topic but experienced database developers, techniques to design, build, test, deploy, and maintain better SQL Server databases. The format is a hands-on, sequential, developer’s companion, providing beyond-the-basics guidance. This book is not a feature-driven reference manual, but a highly practical learning experience demonstrating how to build database solutions to solve business problems.

