talk-data.com

Topic

RDBMS

Relational Database Management System (RDBMS)

databases sql data_storage

134 tagged

Activity Trend: peak of 5 per quarter, 2020-Q1 to 2026-Q1

Activities

Showing results filtered by: O'Reilly Data Engineering Books

RDF Database Systems

RDF Database Systems is a cutting-edge guide that distills everything you need to know to effectively use or design an RDF database. This book starts with the basics of linked open data and covers the most recent research, practice, and technologies to help you leverage semantic technology. With an approach that combines technical detail with theoretical background, this book shows how to design and develop semantic web applications, data models, indexing and query processing solutions.
• Understand the Semantic Web, RDF, RDFS, SPARQL, and OWL within the context of relational database management and NoSQL systems
• Learn about the prevailing RDF triples solutions for both relational and non-relational databases, including column family, document, graph, and NoSQL
• Implement systems using RDF data with helpful guidelines and various storage solutions for RDF
• Process SPARQL queries with detailed explanations of query optimization, query plans, caching, and more
• Evaluate which approaches and systems to use when developing Semantic Web applications with a helpful description of commercial and open-source systems

Learning Neo4j

Dive into the exciting world of graph databases with "Learning Neo4j". This book introduces you to the Neo4j graph database system, showing how graph theory can unlock new ways of organizing and querying complex datasets. Through practical examples, you will explore Neo4j's capabilities and learn to implement real-world applications using graph data models.
What this Book will help me do
• Understand the fundamentals of graph theory and how it relates to databases.
• Install and set up the Neo4j graph database on local and cloud platforms.
• Model complex data for use in Neo4j and import various datasets into it.
• Implement real-world use cases, such as recommendation systems and social networks.
• Explore visualization tools and resources for enhancing graph database applications.
Author(s)
The author, Van Bruggen, is a seasoned expert in data systems with extensive hands-on experience with Neo4j. Drawing on that real-world experience, they provide practical guidance that connects theoretical concepts to practical use. Van Bruggen's accessible writing style makes navigating the complexities of graph databases achievable and rewarding for learners.
Who is it for?
This book is ideal for IT professionals, database administrators, and data analysts looking to harness the power of graph databases. Readers should have a basic understanding of relational databases and data modeling concepts. Whether you're starting with Neo4j or seeking to deepen your knowledge, this book provides the guidance you need. It is particularly well suited to anyone interested in implementing graph data solutions in real-world scenarios.

Bitemporal Data

Bitemporal data has always been important. But it was not until 2011 that the ISO released a SQL standard that supported it. Among major DBMS vendors, Oracle, IBM and Teradata now provide at least some bitemporal functionality in their flagship products. But to use these products effectively, someone in your IT organization needs to know more than how to code bitemporal SQL statements. Perhaps, in your organization, that person is you. To correctly interpret business requests for temporal data, to correctly specify requirements to your IT development staff, and to correctly design bitemporal databases and applications, someone in your enterprise needs a deep understanding of both the theory and the practice of managing bitemporal data. Someone also needs to understand what the future may bring in the way of additional temporal functionality, so their enterprise can plan for it. Perhaps, in your organization, that person is you. This is the book that will show the do-it-yourself IT professional how to design and build bitemporal databases and how to write bitemporal transactions and queries, and will show those who will direct the use of vendor-provided bitemporal DBMSs exactly what is going on "under the covers" of that software.
• Explains the business value of bitemporal data in terms of the information that can be provided by bitemporal tables and not by any other form of temporal data, including history tables, version tables, snapshot tables, or slowly-changing dimensions
• Provides an integrated account of the mathematics, logic, ontology and semantics of relational theory and relational databases, in terms of which current relational theory and practice can be seen as unnecessarily constrained to the management of nontemporal and incompletely temporal data
• Explains how bitemporal tables can provide the time-variance and nonvolatility hitherto lacking in Inmon historical data warehouses
• Explains how bitemporal dimensions can replace slowly-changing dimensions in Kimball star schemas, and why they should do so
• Describes several extensions to the current theory and practice of bitemporal data, including the use of episodes, "whenever" temporal transactions and queries, and future transaction time
• Points out a basic error in the ISO’s bitemporal SQL standard, and warns practitioners against the use of that faulty functionality
• Recommends six extensions to the ISO standard which will increase the business value of bitemporal data
• Points towards a tritemporal future for bitemporal data, in which an Aristotelian ontology and a speech-act semantics support the direct management of the statements inscribed in the rows of relational tables, and add the ability to track the provenance of database content to existing bitemporal databases
This book also provides the background needed to become a business ontologist, and explains why an IT data management person, deeply familiar with corporate databases, is best suited to play that role. Perhaps, in your organization, that person is you.
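For readers unfamiliar with the term, the two time axes that make a table "bitemporal" can be illustrated with a minimal sketch. It is not taken from the book: the `Row` layout, the `city_as_of` helper, and the sample data are all hypothetical, and half-open periods with `date.max` as an open end are simply one common convention. Each row carries a valid-time period (when the fact held in the real world) and a transaction-time period (when the database held that belief).

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

FOREVER = date.max  # illustrative convention for an open-ended period


@dataclass(frozen=True)
class Row:
    customer: str
    city: str
    valid_from: date  # when the fact became true in the real world
    valid_to: date    # exclusive end of valid time
    tx_from: date     # when the database recorded this row
    tx_to: date       # exclusive; FOREVER means "still the current belief"


def city_as_of(rows: List[Row], customer: str,
               valid_at: date, known_at: date) -> Optional[str]:
    """Return the city that, as the database stood on `known_at`,
    was believed to hold for `customer` on `valid_at`."""
    for r in rows:
        if (r.customer == customer
                and r.valid_from <= valid_at < r.valid_to
                and r.tx_from <= known_at < r.tx_to):
            return r.city
    return None


rows = [
    # Recorded 2020-01-10: Ada lives in Paris from 2020-01-01 onward.
    Row("Ada", "Paris", date(2020, 1, 1), FOREVER,
        date(2020, 1, 10), date(2020, 6, 1)),
    # Correction recorded 2020-06-01: she actually moved to Rome on 2020-03-01.
    # The old row is closed in transaction time, not deleted.
    Row("Ada", "Paris", date(2020, 1, 1), date(2020, 3, 1),
        date(2020, 6, 1), FOREVER),
    Row("Ada", "Rome", date(2020, 3, 1), FOREVER,
        date(2020, 6, 1), FOREVER),
]

# What did we believe in May about April? The uncorrected answer.
before_fix = city_as_of(rows, "Ada", date(2020, 4, 1), date(2020, 5, 1))
# What do we believe in July about April? The corrected answer.
after_fix = city_as_of(rows, "Ada", date(2020, 4, 1), date(2020, 7, 1))
```

Here `before_fix` is `"Paris"` and `after_fix` is `"Rome"`: the same real-world date yields different answers depending on when the question is asked of the database, which a history or version table with a single time axis cannot express.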

Understanding Big Data Scalability: Big Data Scalability Series, Part I

Get Started Scaling Your Database Infrastructure for High-Volume Big Data Applications
“Understanding Big Data Scalability presents the fundamentals of scaling databases from a single node to large clusters. It provides a practical explanation of what ‘Big Data’ systems are, and fundamental issues to consider when optimizing for performance and scalability. Cory draws on many years of experience to explain issues involved in working with data sets that can no longer be handled with single, monolithic relational databases.... His approach is particularly relevant now that relational data models are making a comeback via SQL interfaces to popular NoSQL databases and Hadoop distributions.... This book should be especially useful to database practitioners new to scaling databases beyond traditional single node deployments.” - Brian O’Krafka, software architect
Understanding Big Data Scalability presents a solid foundation for scaling Big Data infrastructure and helps you address each crucial factor associated with optimizing performance in scalable and dynamic Big Data clusters. Database expert Cory Isaacson offers practical, actionable insights for every technical professional who must scale a database tier for high-volume applications. Focusing on today’s most common Big Data applications, he introduces proven ways to manage unprecedented data growth from widely diverse sources and to deliver real-time processing at levels that were inconceivable until recently. Isaacson explains why databases slow down, reviews each major technique for scaling database applications, and identifies the key rules of database scalability that every architect should follow. You’ll find insights and techniques proven with all types of database engines and environments, including SQL, NoSQL, and Hadoop. Two start-to-finish case studies walk you through planning and implementation, offering specific lessons for formulating your own scalability strategy.
Coverage includes
• Understanding the true causes of database performance degradation in today’s Big Data environments
• Scaling smoothly to petabyte-class databases and beyond
• Defining database clusters for maximum scalability and performance
• Integrating NoSQL or columnar databases that aren’t “drop-in” replacements for RDBMSes
• Scaling application components: solutions and options for each tier
• Recognizing when to scale your data tier, a decision with enormous consequences for your application environment
• Why data relationships may be even more important in non-relational databases
• Why virtually every database scalability implementation still relies on sharding, and how to choose the best approach
• How to set clear objectives for architecting high-performance Big Data implementations
The Big Data Scalability Series is a comprehensive, four-part series, containing information on many facets of database performance and scalability. Understanding Big Data Scalability is the first book in the series. Learn more and join the conversation about Big Data scalability at bigdatascalability.com.
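The blurb mentions sharding without saying which scheme the book recommends. As a rough illustration only, one common approach is hash-based routing, sketched below; the `shard_for` function and the key format are hypothetical and not drawn from the book.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a record key to a shard index using a stable hash.

    A cryptographic digest is used rather than Python's built-in hash(),
    because hash() on strings is randomized per process and would route
    the same key differently after a restart.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce it
    # to the range [0, num_shards).
    return int.from_bytes(digest[:8], "big") % num_shards


# Every lookup for the same key lands on the same shard.
shard = shard_for("customer-42", 4)
```

A known drawback of plain modulo routing is that changing `num_shards` remaps most keys, which is why range-based or consistent-hashing schemes are often considered instead.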

Just Hibernate

If you’re looking for a short, sweet, and simple introduction (or reintroduction) to Hibernate, this is the book you want. Through clear real-world examples, you’ll learn Hibernate and object-relational mapping from the ground up, starting with the basics. Then you’ll dive into the framework’s moving parts to understand how they work in action. Storing Java objects in relational databases is usually a challenging and complex task for any Java developer, experienced or not. This book, like others in the Just series, delivers a concise, example-driven tutorial for Java beginners. You’ll gain enough knowledge and confidence to start working on real-world projects with Hibernate.
• Compare how JDBC and Hibernate work with object persistence
• Learn how annotations are used to create Hibernate applications
• Understand how to persist and retrieve Java data structures
• Focus on the fundamentals of associations and their mappings
• Delve into advanced concepts such as caching, inheritance, and types
• Walk through the Hibernate Query Language API, with examples
• Develop Java Persistence API applications, using Hibernate as the provider
• Work hands-on with code snippets to understand the technology

Learning Cypher

"Learning Cypher" provides an in-depth look into Cypher, the declarative query language for Neo4j, the powerful graph database. Whether you're transitioning from relational databases or exploring graph technology for the first time, this book offers practical guidance to help you write efficient, clear, and reusable queries.
What this Book will help me do
• Master the Cypher declarative query syntax for graph databases.
• Write optimized Cypher queries for better application performance.
• Transform relational database data to graph database structures with ease.
• Understand the nuances of transitioning from SQL to graph paradigms.
• Learn the common pitfalls in Neo4j programming and how to avoid them.
Author(s)
Onofrio Panzarino is an experienced database developer specializing in graph technologies and Neo4j. He is passionate about teaching industry best practices to developers and has extensive experience in designing graph-based solutions. Through his approachable style, he empowers readers to excel in using graph databases effectively.
Who is it for?
This book is for database developers, data analysts, and software engineers who want to expand their knowledge into graph databases. If you work with large-scale connected data or are transitioning from SQL to a graph model, this book is ideal for you. Prior experience with any database query language will be helpful. The book is also suitable for students and professionals looking to integrate graph technology into their projects.

SQL Server Analysis Services 2012 Cube Development Cookbook

The 'SQL Server Analysis Services 2012 Cube Development Cookbook' equips you with the practical skills needed to design and implement OLAP cubes using both multidimensional and tabular models. By working through hands-on recipes, you'll quickly gain the expertise to create robust Business Intelligence solutions with SQL Server Analysis Services.
What this Book will help me do
• Build and enhance multidimensional and tabular models for effective data analysis.
• Implement key OLAP features like dimensions, cubes, actions, and aggregations.
• Scale and optimize your Business Intelligence solutions for enterprise-level performance.
• Utilize MDX and DAX languages proficiently to query and manipulate data.
• Develop skills in securing, monitoring, and troubleshooting SQL Server Analysis Services.
Author(s)
Dewald is an experienced business intelligence professional with years of hands-on expertise with SQL Server Analysis Services. His approach to teaching focuses on practical application and equipping his audience with tools to be successful in deploying and maintaining BI solutions.
Who is it for?
This book is perfect for BI and ETL developers who work with SQL Server Analysis Services to build OLAP cubes. It assumes familiarity with relational database management systems, Excel, and SQL. If you're looking to deepen your understanding of Microsoft's BI stack and improve your practical skills in SSAS, this book is for you.

Oracle® 12c For Dummies®

Demystifying the power of the Oracle 12c database
The Oracle database is the industry-leading relational database management system (RDBMS), used by organizations from small companies to the world's largest enterprises for their most critical business and analytical processing. Oracle 12c includes industry-leading enhancements to enable cloud computing and empowers users to manage both Big Data and traditional data structures faster and cheaper than ever before. Oracle 12c For Dummies is the perfect guide for a novice database administrator or an Oracle DBA who is new to Oracle 12c. The book covers what you need to know about Oracle 12c architecture, software tools, and how to successfully manage Oracle databases in the real world.
• Highlights the important features of Oracle 12c
• Explains how to create, populate, protect, tune, and troubleshoot a new Oracle database
• Covers advanced Oracle 12c technologies including Oracle Multitenant (the “pluggable database” concept) as well as several other key changes in this release
Make the most of Oracle 12c's improved efficiency, stronger security, and simplified management capabilities with Oracle 12c For Dummies.

Oracle Big Data Handbook

Transform Big Data into Insight
"In this book, some of Oracle's best engineers and architects explain how you can make use of big data. They'll tell you how you can integrate your existing Oracle solutions with big data systems, using each where appropriate and moving data between them as needed." -- Doug Cutting, co-creator of Apache Hadoop
Cowritten by members of Oracle's big data team, Oracle Big Data Handbook provides complete coverage of Oracle's comprehensive, integrated set of products for acquiring, organizing, analyzing, and leveraging unstructured data. The book discusses the strategies and technologies essential for a successful big data implementation, including Apache Hadoop, Oracle Big Data Appliance, Oracle Big Data Connectors, Oracle NoSQL Database, Oracle Endeca, Oracle Advanced Analytics, and Oracle's open source R offerings. Best practices for migrating from legacy systems and integrating existing data warehousing and analytics solutions into an enterprise big data infrastructure are also included in this Oracle Press guide.
• Understand the value of a comprehensive big data strategy
• Maximize the distributed processing power of the Apache Hadoop platform
• Discover the advantages of using Oracle Big Data Appliance as an engineered system for Hadoop and Oracle NoSQL Database
• Configure, deploy, and monitor Hadoop and Oracle NoSQL Database using Oracle Big Data Appliance
• Integrate your existing data warehousing and analytics infrastructure into a big data architecture
• Share data among Hadoop and relational databases using Oracle Big Data Connectors
• Understand how Oracle NoSQL Database integrates into the Oracle Big Data architecture
• Deliver faster time to value using in-database analytics
• Analyze data with Oracle Advanced Analytics (Oracle R Enterprise and Oracle Data Mining), Oracle R Distribution, ROracle, and Oracle R Connector for Hadoop
• Analyze disparate data with Oracle Endeca Information Discovery
• Plan and implement a big data governance strategy and develop an architecture and roadmap

Making Sense of NoSQL

Making Sense of NoSQL clearly and concisely explains the concepts, features, benefits, potential, and limitations of NoSQL technologies. Using examples and use cases, illustrations, and plain, jargon-free writing, this guide shows how you can effectively assemble a NoSQL solution to replace or augment the traditional RDBMS you have now.
About the Book
If you want to understand and perhaps start using the new data storage and analysis technologies that go beyond the SQL database model, this book is for you. Written in plain language suitable for technical managers and developers, and using many examples, use cases, and illustrations, this book explains the concepts, features, benefits, potential, and limitations of NoSQL. Making Sense of NoSQL starts by comparing familiar database concepts to the new NoSQL patterns that augment or replace them. Then, you'll explore case studies on big data, search, reliability, and business agility that apply these new patterns to today's business problems. You'll see how NoSQL systems can leverage the resources of modern cloud computing and multiple-CPU data centers. The final chapters show you how to choose the right NoSQL technologies for your own needs.
What's Inside
• NoSQL data architecture patterns
• NoSQL for big data
• Search, high availability, and security
• Choosing an architecture
About the Reader
Managers and developers will welcome this lucid overview of the potential and capabilities of NoSQL technologies.
About the Authors
Dan McCreary and Ann Kelly lead an independent training and consultancy firm focused on NoSQL solutions and are cofounders of the NoSQL Now! Conference.
Quotes
Easily digestible, practical advice for technical managers, architects, and developers. - From the Foreword by Tony Shaw, CEO of DATAVERSITY
Cuts through the jargon and gives you the information you need to know. - Craig Smith, Unbound DNA
A concise yet thorough description of the many facets of NoSQL, from big data to search. - John Guthrie, Pivotal
Brings common sense to the world of NoSQL. - Ignacio Lopez Vellon, Atos Worldgrid
Get ahead of your peers ... fast-track to NoSQL now! - Ian Stirk, Stirk Consultancy, Ltd

SQL For Dummies, 8th Edition

Uncover the secrets of SQL and start building better relational databases today! This fun and friendly guide will help you demystify database management systems so you can create more powerful databases and access information with ease. Updated for the latest SQL functionality, SQL For Dummies, 8th Edition covers the core SQL language and shows you how to use SQL to structure a DBMS, implement a database design, secure your data, and retrieve information when you need it.
• Includes new enhancements of SQL:2011, including temporal data functionality which allows you to set valid times for transactions to occur and helps prevent database corruption
• Covers creating, accessing, manipulating, maintaining, and storing information in relational database management systems like Access, Oracle, SQL Server, and MySQL
• Provides tips for keeping your data safe from theft, accidental or malicious corruption, or loss due to equipment failures and advice on eliminating errors in your work
Don't be daunted by database development anymore - get SQL For Dummies, 8th Edition, and you'll be on your way to SQL stardom.

Apache Sqoop Cookbook

Integrating data from multiple sources is essential in the age of big data, but it can be a challenging and time-consuming task. This handy cookbook provides dozens of ready-to-use recipes for using Apache Sqoop, the command-line interface application that optimizes data transfers between relational databases and Hadoop. Sqoop is both powerful and bewildering, but with this cookbook’s problem-solution-discussion format, you’ll quickly learn how to deploy and then apply Sqoop in your environment. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems.
• Transfer data from a single database table into your Hadoop ecosystem
• Keep table data and Hadoop in sync by importing data incrementally
• Import data from more than one database table
• Customize transferred data by calling various database functions
• Export generated, processed, or backed-up data from Hadoop to your database
• Run Sqoop within Oozie, Hadoop’s specialized workflow scheduler
• Load data into Hadoop’s data warehouse (Hive) or database (HBase)
• Handle installation, connection, and syntax issues common to specific database vendors

MySQL, 5th Edition

MySQL, Fifth Edition by Paul DuBois
The definitive guide to using, programming and administering MySQL 5.5 and MySQL 5.6
MySQL provides a comprehensive guide to effectively using and administering the MySQL database management system (DBMS). Author Paul DuBois describes everything from the basics of getting information into a database and formulating queries, to using MySQL with PHP or Perl to generate dynamic web pages, to writing your own programs that access MySQL databases, to administering MySQL servers. The book also includes a comprehensive reference section providing detailed information on MySQL’s structure, language, syntax, and APIs. The fifth edition of this bestselling book has been meticulously revised and updated to thoroughly cover the latest features and capabilities of MySQL 5.5, as well as to add new coverage of features introduced with MySQL 5.6. MySQL is an open source relational database management system (RDBMS) that has experienced a phenomenal growth in popularity and use. Known for its speed and ease of use, MySQL has proven itself to be particularly well-suited for developing database-backed websites and applications. MySQL runs on anything from modest hardware all the way up to enterprise servers, and its performance rivals any database system put up against it. Paul DuBois’ MySQL, Fifth Edition, is the definitive guide to fully exploiting all the power and versatility of MySQL 5.5 and MySQL 5.6.
Contents at a Glance
Part I: General MySQL Use
• Chapter 1 Getting Started with MySQL
• Chapter 2 Using SQL to Manage Data
• Chapter 3 Data Types
• Chapter 4 Views and Stored Programs
• Chapter 5 Query Optimization
Part II: Using MySQL Programming Interfaces
• Chapter 6 Introduction to MySQL Programming
• Chapter 7 Writing MySQL Programs Using C
• Chapter 8 Writing MySQL Programs Using Perl DBI
• Chapter 9 Writing MySQL Programs Using PHP
Part III: MySQL Administration
• Chapter 10 Introduction to MySQL Administration
• Chapter 11 The MySQL Data Directory
• Chapter 12 General MySQL Administration
• Chapter 13 Security and Access Control
• Chapter 14 Database Maintenance, Backups, and Replication
Part IV: Appendixes
• Appendix A Software Required to Use This Book
• Appendix B Data Type Reference
• Appendix C Operator and Function Reference
• Appendix D System, Status, and User Variable Reference
• Appendix E SQL Syntax Reference
• Appendix F MySQL Program Reference
Online Appendixes
• Appendix G C API Reference
• Appendix H Perl DBI API Reference
• Appendix I PHP API Reference

Hadoop Beginner's Guide

Hadoop Beginner's Guide introduces you to the essential concepts and practical applications of Apache Hadoop, one of the leading frameworks for big data processing. You will learn how to set up and use Hadoop to store, manage, and analyze vast amounts of data efficiently. With clear examples and step-by-step instructions, this book is the perfect starting point for beginners.
What this Book will help me do
• Understand the trends leading to the adoption of Hadoop and determine when to use it effectively in your projects.
• Build and configure Hadoop clusters tailored to your specific needs, enabling efficient data processing.
• Develop and execute applications on Hadoop using Java and Ruby, with practical examples provided.
• Leverage Amazon AWS and Elastic MapReduce to deploy Hadoop on the cloud and manage hosted environments.
• Integrate Hadoop with relational databases using tools like Hive and Sqoop for effective data transfer and querying.
Author(s)
The author of Hadoop Beginner's Guide is an experienced data engineer with a focus on big data technologies. They have extensive experience deploying Hadoop in various industries and are passionate about making complex systems accessible to newcomers. Their approach combines technical depth with an understanding of the needs of learners, ensuring clarity and relevance throughout the book.
Who is it for?
This book is designed for professionals who are new to big data processing and want to learn Apache Hadoop from scratch. It is ideal for system administrators, data analysts, and developers with basic programming knowledge in Java or Ruby looking to get started with Hadoop. If you have an interest in leveraging Hadoop for scalable data management and analytics, this book is for you. By the end, you'll gain the confidence and skills to utilize Hadoop effectively in your projects.

Resilience and Reliability on AWS

Cloud services are just as susceptible to network outages as any other platform. This concise book shows you how to prepare for potentially devastating interruptions by building your own resilient and reliable applications in the public cloud. Guided by engineers from 9apps, an independent provider of Amazon Web Services and Eucalyptus cloud solutions, you’ll learn how to combine AWS with open source tools such as PostgreSQL, MongoDB, and Redis. This isn’t a book on theory. With detailed examples, sample scripts, and solid advice, software engineers with operations experience will learn specific techniques that 9apps routinely uses in its cloud infrastructures.
• Build cloud applications with the "rip, mix, and burn" approach
• Get a crash course on Amazon Web Services
• Learn the top ten tips for surviving outages in the cloud
• Use Elasticsearch to build a dependable NoSQL data store
• Combine AWS and PostgreSQL to build an RDBMS that scales well
• Create a highly available document database with MongoDB Replica Set and SimpleDB
• Augment Redis with AWS to provide backup/restore, failover, and monitoring capabilities
• Work with CloudFront and Route 53 to safeguard global content delivery

NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence

The need to handle increasingly larger data volumes is one factor driving the adoption of a new class of nonrelational “NoSQL” databases. Advocates of NoSQL databases claim they can be used to build systems that are more performant, scale better, and are easier to program. NoSQL Distilled is a concise but thorough introduction to this rapidly emerging technology. Pramod J. Sadalage and Martin Fowler explain how NoSQL databases work and the ways that they may be a superior alternative to a traditional RDBMS. The authors provide a fast-paced guide to the concepts you need to know in order to evaluate whether NoSQL databases are right for your needs and, if so, which technologies you should explore further. The first part of the book concentrates on core concepts, including schemaless data models, aggregates, new distribution models, the CAP theorem, and map-reduce. In the second part, the authors explore architectural and design issues associated with implementing NoSQL. They also present realistic use cases that demonstrate NoSQL databases at work and feature representative examples using Riak, MongoDB, Cassandra, and Neo4j. In addition, by drawing on Pramod Sadalage’s pioneering work, NoSQL Distilled shows how to implement evolutionary design with schema migration: an essential technique for applying NoSQL databases. The book concludes by describing how NoSQL is ushering in a new age of Polyglot Persistence, where multiple data-storage worlds coexist, and architects can choose the technology best optimized for each type of data access.

Pro SQL Server 2012 Reporting Services, Third Edition

Pro SQL Server 2012 Reporting Services opens the door to delivering customizable, web-enabled reports across your business at reasonable cost. Reporting Services is Microsoft's enterprise-level reporting platform. It is included with many editions of SQL Server, and is something you'll want to take advantage of if you're running SQL Server as your database engine. Reporting Services provides a full set of tools with which to create and deploy reports. Create interactive reports for business users. Define reporting models from which business users can generate their own ad hoc reports. Pull data from relational databases, from XML, and from other sources. Present that data to users in tabular and graphical forms, and more. Reporting Services experts Brian McDonald, Rodney Landrum, and Shawn McGehee show how to do all this and much more in this third edition of their longstanding book on the topic.
• Provides best practices for using Reporting Services
• Covers the very latest in new features for SQL Server 2012
• Your key to delivering business intelligence across the enterprise

FileMaker® 12 In Depth

Do more in less time! FileMaker 12 In Depth is the most comprehensive, coherent, and practical guide to creating professional-quality solutions with the newest versions of FileMaker! Drawing on his unsurpassed real-world experience as a FileMaker user, consultant, and developer, Jesse Feiler helps you gain practical mastery of today’s newest, most advanced FileMaker tools and features.
• Use themes to build solutions for FileMaker Pro on Windows and OS X, FileMaker Go on iOS, and Instant Web Publishing
• Get the most out of new container field technology
• Quickly become a FileMaker 12 power user
• Make the most of FileMaker fields, tables, layouts, and parts
• Iteratively design reliable, high-performance FileMaker relational databases
• Work with relationships, including self-joins and cross-product relationships
• Write calculation formulas and use functions
• Use event-driven scripts to make databases more interactive
• Build clear and usable reports, publish them, and incorporate them into workflows
• Secure applications with user accounts, privileges, file-level access, network security, and authentication
• Use FileMaker’s Web Viewer to access live web-based data
• Convert systems from older versions of FileMaker, and troubleshoot successfully
• Share, exchange, export, and publish data via SQL and XML
• Instantly publish databases on the web, and use advanced Custom Web Publishing techniques
• Trigger automated behaviors whenever specific events occur
• Extend FileMaker’s functionality with plug-ins
• Set up, configure, tune, and secure FileMaker Server
All In Depth books offer
• Comprehensive coverage with detailed solutions
• Troubleshooting help for tough problems you can’t fix on your own
• Outstanding authors recognized worldwide for their expertise and teaching style
Learning, reference, problem-solving... the only FileMaker 12 book you need!

Hadoop: The Definitive Guide, 3rd Edition

Ready to unlock the power of your data? With this comprehensive guide, you’ll learn how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. This book is ideal for programmers looking to analyze datasets of any size, and for administrators who want to set up and run Hadoop clusters. You’ll find illuminating case studies that demonstrate how Hadoop is used to solve specific problems. This third edition covers recent changes to Hadoop, including material on the new MapReduce API, as well as MapReduce 2 and its more flexible execution model (YARN).
• Store large datasets with the Hadoop Distributed File System (HDFS)
• Run distributed computations with MapReduce
• Use Hadoop’s data and I/O building blocks for compression, data integrity, serialization (including Avro), and persistence
• Discover common pitfalls and advanced features for writing real-world MapReduce programs
• Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
• Load data from relational databases into HDFS, using Sqoop
• Perform large-scale data processing with the Pig query language
• Analyze datasets with Hive, Hadoop’s data warehousing system
• Take advantage of HBase for structured and semi-structured data, and ZooKeeper for building distributed systems

Seven Databases in Seven Weeks

Data is getting bigger and more complex by the day, and so are the choices in handling that data. As a modern application developer you need to understand the emerging field of data management, both RDBMS and NoSQL. Seven Databases in Seven Weeks takes you on a tour of some of the hottest open source databases today: Redis, Neo4J, CouchDB, MongoDB, HBase, Riak and Postgres. In the tradition of Bruce A. Tate's Seven Languages in Seven Weeks, this book goes beyond your basic tutorial to explore the essential concepts at the core of each technology. With each database, you'll tackle a real-world data problem that highlights the concepts and features that make it shine. You'll explore the five data models employed by these databases (relational, key/value, columnar, document and graph) and which kinds of problems are best suited to each. You'll learn how MongoDB and CouchDB are strikingly different, and discover the Dynamo heritage at the heart of Riak. Make your applications faster with Redis and more connected with Neo4J. Use MapReduce to solve Big Data problems. Build clusters of servers using scalable services like Amazon's Elastic Compute Cloud (EC2). Discover the CAP theorem and its implications for your distributed data. Understand the tradeoffs between consistency and availability, and when you can use them to your advantage. Use multiple databases in concert to create a platform that's more than the sum of its parts, or find one that meets all your needs at once. Seven Databases in Seven Weeks will take you on a deep dive into each of the databases, their strengths and weaknesses, and how to choose the ones that fit your needs.
What You Need: To get the most out of this book you'll have to follow along, and that means you'll need a *nix shell (Mac OSX or Linux preferred, Windows users will need Cygwin), and Java 6 (or greater) and Ruby 1.8.7 (or greater). Each chapter will list the downloads required for that database.