talk-data.com

Topic

data-engineering

Activities

3395 activities · Newest first

MongoDB and Python

Learn how to leverage MongoDB with your Python applications, using the hands-on recipes in this book. You get complete code samples for tasks such as making fast geo queries for location-based apps, efficiently indexing your user documents for social-graph lookups, and many other scenarios. This guide explains the basics of the document-oriented database and shows you how to set up a Python environment with it. Learn how to read and write to MongoDB, apply idiomatic MongoDB and Python patterns, and use the database with several popular Python web frameworks. You’ll discover how to model your data, write effective queries, and avoid concurrency problems such as race conditions and deadlocks. The recipes will help you:
* Read, write, count, and sort documents in a MongoDB collection
* Learn how to use the rich MongoDB query language
* Maintain data integrity in replicated/distributed MongoDB environments
* Use embedding to efficiently model your data without joins
* Code defensively to avoid KeyErrors and other bugs
* Apply atomic operations to update game scores, billing systems, and more with the fast accounting pattern
* Use MongoDB with the Pylons 1.x, Django, and Pyramid web frameworks
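The point about coding defensively against key errors can be illustrated with a minimal sketch. It is not taken from the book: the documents below are plain Python dicts shaped like hypothetical query results, and `total_score` is an illustrative helper name. Because documents in a schemaless collection may lack a field, reading with `dict.get` and a default avoids a `KeyError`.

```python
# Hypothetical user documents, shaped like what a document database
# query might return. The second one has no "score" field.
docs = [
    {"name": "ada", "score": 10},
    {"name": "bob"},
]


def total_score(documents):
    """Sum the "score" field, treating a missing field as 0."""
    # doc["score"] would raise KeyError for the second document;
    # doc.get("score", 0) falls back to a default instead.
    return sum(doc.get("score", 0) for doc in documents)


print(total_score(docs))  # prints 10
```

Choosing the default is a design decision: 0 suits a sum, while other fields may call for `None` and an explicit check.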

Big Data Glossary

To help you navigate the large number of new data tools available, this guide describes 60 of the most recent innovations, from NoSQL databases and MapReduce approaches to machine learning and visualization tools. Descriptions are based on first-hand experience with these tools in a production environment. This handy glossary also includes a chapter of key terms that help define many of these tool categories:
* NoSQL Databases: Document-oriented databases using a key/value interface rather than SQL
* MapReduce: Tools that support distributed computing on large datasets
* Storage: Technologies for storing data in a distributed way
* Servers: Ways to rent computing power on remote machines
* Processing: Tools for extracting valuable information from large datasets
* Natural Language Processing: Methods for extracting information from human-created text
* Machine Learning: Tools that automatically perform data analyses, based on results of a one-off analysis
* Visualization: Applications that present meaningful data graphically
* Acquisition: Techniques for cleaning up messy public data sources
* Serialization: Methods to convert data structure or object state into a storable format
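As a small illustration of the serialization entry, the sketch below converts an in-memory data structure to a storable string and back using Python's standard `json` module. The `record` contents are made up for the example, and JSON is only one of many serialization formats.

```python
import json

# A made-up in-memory record; the field names are illustrative only.
record = {"tool": "example", "tags": ["nosql", "storage"], "nodes": 3}

# Serialize: data structure -> text that can be written to disk or sent over a network.
encoded = json.dumps(record, sort_keys=True)

# Deserialize: text -> an equivalent data structure.
decoded = json.loads(encoded)

print(decoded == record)  # prints True
```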

Scaling BPM Adoption from Project to Program with IBM Business Process Manager

Your first Business Process Management (BPM) project is a crucial first step on your BPM journey. It is important to begin this journey with a philosophy of change that will enable you to avoid common pitfalls that lead to failed BPM projects, and ultimately, poor BPM adoption. This IBM® Redbooks® publication describes the methodology and best practices that lead to a successful project and how to use that success to scale to enterprise-wide BPM adoption. The intended audience for this book includes all people who participate in the discovery, planning, delivery, deployment, and continuous improvement activities for a business process. These roles include process owners, process participants and subject matter experts (SMEs) from the operational business as well as technologists responsible for delivery including BPM analysts, BPM solution architects, BPM administrators, and BPM developers.

HBase: The Definitive Guide

If you're looking for a scalable storage solution to accommodate a virtually endless amount of data, this book shows you how Apache HBase can fulfill your needs. As the open source implementation of Google's BigTable architecture, HBase scales to billions of rows and millions of columns, while ensuring that write and read performance remain constant. Many IT executives are asking pointed questions about HBase. This book provides meaningful answers, whether you’re evaluating this non-relational database or planning to put it into practice right away.
* Discover how tight integration with Hadoop makes scalability with HBase easier
* Distribute large datasets across an inexpensive cluster of commodity servers
* Access HBase with native Java clients, or with gateway servers providing REST, Avro, or Thrift APIs
* Get details on HBase’s architecture, including the storage format, write-ahead log, background processes, and more
* Integrate HBase with Hadoop's MapReduce framework for massively parallelized data processing jobs
* Learn how to tune clusters, design schemas, copy tables, import bulk data, decommission nodes, and many other tasks

Professional NoSQL

A hands-on guide to leveraging NoSQL databases. NoSQL databases are an efficient and powerful tool for storing and manipulating vast quantities of data. Most NoSQL databases scale well as data grows. In addition, they are often malleable and flexible enough to accommodate semi-structured and sparse data sets. This comprehensive hands-on guide presents fundamental concepts and practical solutions for getting you ready to use NoSQL databases. Expert author Shashank Tiwari begins with a helpful introduction on the subject of NoSQL, explains its characteristics and typical uses, and looks at where it fits in the application stack. Unique insights help you choose which NoSQL solutions are best for solving your specific data storage needs. Professional NoSQL:
* Demystifies the concepts that relate to NoSQL databases, including column-family oriented stores, key/value databases, and document databases.
* Delves into installing and configuring a number of NoSQL products and the Hadoop family of products.
* Explains ways of storing, accessing, and querying data in NoSQL databases through examples that use MongoDB, HBase, Cassandra, Redis, CouchDB, Google App Engine Datastore and more.
* Looks at architecture and internals.
* Provides guidelines for optimal usage, performance tuning, and scalable configurations.
* Presents a number of tools and utilities relating to NoSQL, distributed platforms, and scalable processing, including Hive, Pig, RRDtool, Nagios, and more.

Microsoft® BizTalk® Server 2010 Unleashed

* The most complete, practical guide to BizTalk Server 2009: an all-new book focused on delivering real, start-to-finish enterprise solutions
* Pragmatic coverage of every crucial step of BizTalk development: architecture, design, infrastructure, deployment, lifecycle management, and more
* Fully up to date with the R2 release of BizTalk Server 2009
* Not a revision of previous BizTalk Server Unleashed books, but completely rewritten from the ground up
Microsoft BizTalk Server 2009 Unleashed is the definitive, pragmatic guide to Microsoft's latest and most powerful version of BizTalk Server. In this book, a team of world-class BizTalk Server 2009 experts bring together the deep practical insights .NET developers need to solve real business problems with BizTalk Server 2009 in any enterprise environment. Drawing on their immense BizTalk experience, the authors present best practices for the entire development lifecycle, from planning and architecture through deployment, and beyond. Writing at just the right level of technical detail for experienced .NET developers now starting out with BizTalk, they cover these and many other crucial issues:
* Architecting and designing effective, high-value BizTalk solutions
* Working with BizTalk schemas, maps, orchestrations, pipelines, pipeline components, and adapters
* Implementing business rules with the Microsoft Business Rules Framework
* Creating highly-available, high-performance BizTalk environments
* Monitoring business activity
* Collaborating effectively among BizTalk developers and users
* Using BizTalk's leading-edge RFID capabilities
Note: This is a 100% new book, NOT an update to Microsoft BizTalk Server 2004 Unleashed.

SAN Storage Performance Management Using Tivoli Storage Productivity Center

IBM Tivoli® Storage Productivity Center is an ideal tool for performing storage management reporting, because it uses industry standards for cross vendor compliance, and it can provide reports based on views from all application servers, all Fibre Channel fabric devices, and storage subsystems from different vendors, both physical and virtual. This IBM® Redbooks® publication is intended for experienced storage managers who want to provide detailed performance reports to satisfy their business requirements. The focus of this book is to use the reports provided by Tivoli Storage Productivity Center for performance management. We do address basic storage architecture in order to set a level playing field for understanding of the terminology that we are using throughout this book. Although this book has been created to cover storage performance management, just as important in the larger picture of Enterprise-wide management are both Asset Management and Capacity Management. Tivoli Storage Productivity Center is an excellent tool to provide all of these reporting and management requirements.

Database Design Using Entity-Relationship Diagrams, 2nd Edition

Essential to database design, entity-relationship (ER) diagrams are known for their usefulness in mapping out clear database designs. They are also well-known for being difficult to master. With Database Design Using Entity-Relationship Diagrams, Second Edition, database designers, developers, and students preparing to enter the field can quickly learn the ins and outs of ER diagramming. Building on the success of the bestselling first edition, this accessible text includes a new chapter on the relational model and functional dependencies. It also includes expanded chapters on Enhanced Entity Relationship (EER) diagrams and reverse mapping. It uses cutting-edge case studies and examples to help readers master database development basics and defines ER and EER diagramming in terms of requirements (end user requests) and specifications (designer feedback to those requests).
* Describes a step-by-step approach for producing an ER diagram and developing a relational database from it
* Contains exercises, examples, case studies, bibliographies, and summaries in each chapter
* Details the rules for mapping ER diagrams to relational databases
* Explains how to reverse engineer a relational database back to an entity-relationship model
* Includes grammar for the ER diagrams that can be presented back to the user
The updated exercises and chapter summaries provide the real-world understanding needed to develop ER and EER diagrams, map them to relational databases, and test the resulting relational database. Complete with a wealth of additional exercises and examples throughout, this edition should be a basic component of any database course. Its comprehensive nature and easy-to-navigate structure make it a resource that students and professionals will turn to throughout their careers.

MariaDB Crash Course

This first-to-market tutorial covers everything beginners need to succeed with MariaDB, the open, community-based branch of MySQL. Master trainer Ben Forta introduces all the essentials through a series of quick, easy-to-follow, hands-on lessons. Instead of belaboring database theory and relational design, Forta focuses on teaching solutions for the majority of SQL users who simply want to interact with data. Forta covers all this, and more:
* Using the MariaDB toolset
* Retrieving and sorting data
* Filtering data using comparisons, wildcards, and full text searching
* Analyzing data with aggregate functions
* Performing insert, update, and delete operations
* Joining relational tables using inner, outer, and self joins
* Combining queries using unions
* Using views
* Creating and modifying tables, and accessing table schemas
* Working with stored procedures, cursors, and other advanced database features
* Managing databases, users, and security privileges
This book was reviewed and is supported by MariaDB's developers, Monty Program AB. It contains a foreword by project founder Monty Widenius, primary developer of the original version of MySQL.
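To make the join and aggregate topics above concrete, here is a minimal sketch. It is not from the book and does not connect to MariaDB: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in, and the `customers` and `orders` tables are invented for illustration. The SELECT statements use plain standard SQL of the kind these topics cover.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a MariaDB server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
INSERT INTO orders VALUES (1, 1, 20.0), (2, 1, 5.0);
""")

# Inner join with an aggregate: only customers that have orders appear.
inner = conn.execute("""
SELECT c.name, SUM(o.total)
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name
""").fetchall()

# Left outer join: customers without orders are kept, with a count of 0.
outer = conn.execute("""
SELECT c.name, COUNT(o.id)
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name
""").fetchall()

print(inner)  # [('Ada', 25.0)]
print(outer)  # [('Ada', 2), ('Bob', 0)]
```

The contrast between the two results is the practical difference between the join types: Bob has no orders, so he drops out of the inner join but remains in the outer join.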

Security Functions of IBM DB2 10 for z/OS

IBM® DB2® 9 and 10 for z/OS® have added functions in the areas of security, regulatory compliance, and audit capability that provide solutions for the most compelling requirements. DB2 10 enhances the DB2 9 role-based security with additional administrative and other finer-grained authorities and privileges. This granularity helps separate administration from data access and grant only the minimum appropriate authority. The authority profiles provide better separation of duties while limiting or eliminating blanket authority over all aspects of a table and its data. In addition, DB2 10 provides a set of criteria for auditing for the possible abuse and overlapping of authorities within a system. In DB2 10, improvements to security and regulatory compliance focus on data retention and protecting sensitive data from privileged users and administrators. Improvements also help to separate security administration from database administration. DB2 10 also lets administrators enable security on a particular column or particular row in the database, complementing the privilege model. This IBM Redbooks® publication provides a detailed description of DB2 10 security functions from the implementation and usage point of view. It is intended to be used by database, audit, and security administrators.

Practical Oracle Security

This is the only practical, hands-on guide available to database administrators to secure their Oracle databases. This book will help the DBA to assess their current level of risk as well as their existing security posture. It will then provide practical, applicable knowledge to appropriately secure the Oracle database.
* The only practical, hands-on guide for securing your Oracle database published by independent experts.
* Your Oracle database does not exist in a vacuum, so this book shows you how to securely integrate your database into your enterprise.

Cloud and Virtual Data Storage Networking

Written by noted author, blogger, industry analyst, and IT veteran, Greg Schulz, this book covers data storage networks for cloud and virtual environments, from a hardware, software, services, and best practices perspective. Filled with real-world insights, blueprints, and best practices, this vendor- and technology-neutral text provides the tools to achieve efficient, optimized, flexible, scalable, and resilient data storage networking infrastructures. Coverage includes public and private cloud, virtualization, and traditional IT environments.

IBM BladeCenter Virtual Fabric Solutions

The deployment of server virtualization technologies in data centers requires significant efforts in providing sufficient network I/O bandwidth to satisfy the demand of virtualized applications and services. For example, every virtualized system can host several dozen network applications and services, and each of these services requires certain bandwidth (or speed) to function properly. Furthermore, because of different network traffic patterns relevant to different service types, these traffic flows may interfere with each other, leading to serious network problems including the inability of the service to perform its functions. The IBM® Virtual Fabric solution for IBM BladeCenter addresses these issues. The solution is based on the IBM BladeCenter H chassis with a 10-Gb Converged Enhanced Ethernet infrastructure built on 10-Gb Ethernet switch modules in the chassis and the Emulex or Broadcom Virtual Fabric Adapters in each blade server.

Oracle Database 11g Oracle Real Application Clusters Handbook, 2nd Edition

Master Oracle Real Application Clusters
Maintain a dynamic enterprise computing infrastructure with expert instruction from an Oracle ACE. Oracle Database 11g Oracle Real Application Clusters Handbook, Second Edition has been fully revised and updated to cover the latest tools and features. Find out how to prepare your hardware, deploy Oracle Real Application Clusters, optimize data integrity, and integrate seamless failover protection. Troubleshooting, performance tuning, and application development are also discussed in this comprehensive Oracle Press guide.
* Install and configure Oracle Real Application Clusters
* Configure and manage diskgroups using Oracle Automatic Storage Management
* Work with services, voting disks, and Oracle Clusterware Repository
* Look under the hood of the Cache Fusion and Global Resource Directory operations in Oracle Real Application Clusters
* Explore the internal workings of backup and recovery in Oracle Real Application Clusters
* Employ workload balancing and the Transparent Application Failover feature of an Oracle database
* Get complete coverage of Stretch Clusters, also known as Metro Clusters
* Troubleshoot Oracle Clusterware using the most advanced diagnostics available
* Develop custom Oracle Real Application Clusters applications

Oracle Database 11g Performance Tuning Recipes: A Problem-Solution Approach

Performance problems are rarely "problems" per se. They are more often "crises" during which you're pressured for results by a manager standing outside your cubicle while your phone rings with queries from the help desk. You won't have the time for a leisurely perusal of the manuals, nor to lean back and read a book on theory. What you need in that situation is a book of solutions, and solutions are precisely what Oracle Database 11g Performance Tuning Recipes delivers. Oracle Database 11g Performance Tuning Recipes is a ready reference for database administrators in need of immediate help with performance issues relating to Oracle Database. The book takes an example-based approach, wherein each chapter covers a specific problem domain. Within each chapter are "recipes," showing by example how to perform common tasks in that chapter's domain. Solutions in the recipes are backed by clear explanations of background and theory from the author team. Whatever the task, if it's performance-related, you'll probably find a recipe and a solution in this book.
* Provides proven solutions to real-life Oracle performance problems
* Offers relevant background and theory to support each solution
* Written by a team of experienced database administrators successful in their careers
What you'll learn
* Optimize the use of memory and storage
* Monitor performance and troubleshoot problems
* Identify and improve poorly-performing SQL statements
* Adjust the most important optimizer parameters to your advantage
* Create indexes that get used and make a positive impact upon performance
* Automate and stabilize using key features such as SQL Tuning Advisor and SQL Plan Baselines
Who this book is for
Oracle Database 11g Performance Tuning Recipes is aimed squarely at Oracle Database administrators. The book especially appeals to those administrators desiring to have at their side a ready-to-go set of solutions to common database performance problems.

IBM z/OS Mainframe Security and Audit Management Using the IBM Security zSecure Suite

Every organization has a core set of mission-critical data that must be protected. Security lapses and failures are not simply disruptions—they can be catastrophic events, and the consequences can be felt across the entire organization. As a result, security administrators face serious challenges in protecting the company’s sensitive data. IT staff are challenged to provide detailed audit and controls documentation at a time when they are already facing increasing demands on their time, due to events such as mergers, reorganizations, and other changes. Many organizations do not have enough experienced mainframe security administrators to meet these objectives, and expanding employee skillsets with low-level mainframe security technologies can be time-consuming. The IBM® Security zSecure suite consists of multiple components designed to help you administer your mainframe security server, monitor for threats, audit usage and configurations, and enforce policy compliance. Administration, provisioning, and management components can significantly reduce administration, contributing to improved productivity, faster response time, and reduced training time needed for new administrators. This IBM Redbooks® publication is a valuable resource for security officers, administrators, and architects who wish to better understand their mainframe security solutions.

Oracle E-Business Suite 12 Financials Cookbook

Delve into the comprehensive world of Oracle E-Business Suite 12 Financials with this practical cookbook. This book equips you with 50+ expert recipes, designed to simplify your daily interactions with EBS financials. Learn step-by-step how to navigate core financial modules, and gain a thorough understanding of the integration points between them.
What this Book will help me do
* Navigate through core Oracle E-Business Suite 12 Financials modules and understand their key functionalities.
* Streamline financial operations by implementing recipes that mirror real-world business scenarios.
* Acquire expertise in handling items, suppliers, invoices, assets, customers, and ledgers.
* Master the end-to-end processes from purchasing to payments, and from sales to receivables.
* Produce accurate financial reports by managing sub-ledger transactions and period closures effectively.
Author(s)
Yemi Onigbode, a seasoned expert in Oracle E-Business Suite, brings years of experience to this comprehensive guide. With a background in financial systems and implementation, Yemi's approach in this book is practical and insightful. His focus on scenario-based learning ensures that readers gain actionable skills applicable to their daily work.
Who is it for?
This book is ideal for Oracle EBS Financials specialists, business analysts, functional consultants, and system accountants aiming to expand their understanding of EBS financial modules. It is equally beneficial for novice professionals seeking to acquaint themselves with the functionalities of EBS financials in real-world business contexts.

IBM System Storage TS7650, TS7650G, and TS7610

This IBM® Redbooks® publication describes the IBM solution for data deduplication, the IBM System Storage® TS7650G ProtecTIER® Deduplication Gateway, and the IBM System Storage TS7650 ProtecTIER Deduplication Appliance. This solution consists of IBM System Storage ProtecTIER Enterprise Edition V2.3 software and the IBM System Storage TS7650G Deduplication Gateway (3958-DD1 and DD3) hardware, as well as the IBM System Storage TS7650 Deduplication Appliance (3958-AP1). They are designed to address the disk-based data protection needs of enterprise data centers. We introduce data deduplication and IP replication and describe in detail the components that make up IBM System Storage TS7600 with ProtecTIER. We provide extensive planning and sizing guidance that enables you to determine your requirements and the correct configuration for your environment. We then guide you through the basic setup steps on the system and on the host, and describe all operational tasks that might be required during normal day-to-day operation or when upgrading the TS7650G and TS7650. This publication is intended for system programmers, storage administrators, hardware and software planners, and other IT personnel involved in planning, implementing, and using the IBM deduplication solution, as well as anyone seeking detailed technical information about the IBM System Storage TS7600 with ProtecTIER.
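For readers new to the idea, the general concept of data deduplication can be sketched in a few lines. This is a generic content-hash illustration only, not a description of how ProtecTIER works internally: the chunk size, the use of SHA-256, and the `dedupe` and `restore` helper names are all assumptions made for the example.

```python
import hashlib


def dedupe(chunks):
    """Store each distinct chunk once and keep an ordered list of references."""
    store = {}  # digest -> chunk bytes, one copy per distinct chunk
    refs = []   # digests in original order, enough to rebuild the stream
    for chunk in chunks:
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep only the first copy
        refs.append(digest)
    return store, refs


def restore(store, refs):
    """Rebuild the original byte stream from the stored chunks."""
    return b"".join(store[digest] for digest in refs)


# Three chunks, two of them identical: only two are physically stored.
chunks = [b"aaaa", b"bbbb", b"aaaa"]
store, refs = dedupe(chunks)
print(len(chunks), len(store))  # prints 3 2
print(restore(store, refs) == b"".join(chunks))  # prints True
```

Backup workloads tend to repeat large amounts of unchanged data between runs, which is why storing repeated content once can reduce the disk capacity needed.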

IBM zEnterprise System Technical Introduction

Recently, the IT industry has seen an explosion in applications, architectures, and platforms. With the generalized availability of the internet and the appearance of commodity hardware and software, several patterns have emerged that have gained center stage. Workloads have changed. Many applications, including mission-critical ones, are deployed in heterogeneous infrastructures. System z® design has adapted to this change. IBM® has a holistic approach to System z design, which includes hardware, software, and procedures. It takes into account a wide range of factors, including compatibility and investment protection, which ensures a tighter fit with IT requirements of an enterprise. This IBM Redbooks® publication introduces the revolutionary scalable IBM zEnterprise System, which consists of the IBM zEnterprise 196 (z196) or the IBM zEnterprise 114 (z114), the IBM zEnterprise BladeCenter® Extension (zBX), and the IBM zEnterprise Unified Resource Manager. IBM is taking a bold step by integrating heterogeneous platforms under the proven System z hardware management capabilities, while extending System z qualities of service to those platforms. The z196 and z114 are general-purpose servers that are equally at ease with compute-intensive workloads and with I/O-intensive workloads. The integration of heterogeneous platforms is based on IBM BladeCenter technology, allowing improvements in price and performance for key workloads, while enabling a new range of heterogeneous platform solutions. The z196 and z114 are at the core of the enhanced System z platforms, which are designed to deliver technologies that businesses need today along with a foundation to drive future business growth. The changes to this edition are based on the System z hardware announcement dated July 12, 2011. This book provides basic information about z196, z114, zBX, and Unified Resource Manager capabilities, hardware functions and features, and associated software support. 
It is intended for IT managers, architects, consultants, and anyone else who wants to understand the elements of the zEnterprise System. For this introduction to the zEnterprise System, readers are not expected to be familiar with current IBM System z technology and terminology.

Expert Oracle Exadata

Throughout history, advances in technology have come in spurts. A single great idea can often spur rapid change as the idea takes hold and is propagated, often in totally unexpected directions. Exadata embodies such a change in how we think about and manage relational databases. The key change lies in the concept of offloading SQL processing to the storage layer. That concept is a huge win, and its implementation in the form of Exadata is truly a game changer. Expert Oracle Exadata will give you a look under the covers at how the combination of hardware and software that comprises Exadata actually works. Authors Kerry Osborne, Randy Johnson, and Tanel Pöder share their real-world experience, gained through multiple Exadata implementations, with the goal of opening up the internals of the Exadata platform. This book is intended for readers who want to understand what makes the platform tick and for whom how it does what it does is as important as what it does. By being exposed to the features that are unique to Exadata, you will gain an understanding of the mechanics that will allow you to fully benefit from the advantages that the platform provides.
* Will change the way you think about managing SQL performance and processing
* Provides a roadmap to laying out the Exadata platform to best support your existing systems
* Dives deeply into the internals, removing the "black box" mystique and showing how Exadata actually works
What you'll learn
* Configure Exadata from the ground up
* Optimize for mixed OLTP/DW workloads
* Migrate large data sets from existing systems
* Connect Exadata to external systems
* Support consolidation strategies using the Resource Manager
* Configure high-availability features of Exadata, including real application clusters (RAC) and automatic storage management (ASM)
* Apply tuning strategies utilizing the unique features of Exadata
Who this book is for
Expert Oracle Exadata is for database administrators and developers who want to understand what makes Exadata unique so that they can take advantage of all the platform has to offer. It is also for anyone who needs to plan and execute migrations of systems to the Exadata platform. Finally, the book will be invaluable to those who support and maintain such systems.