Cassandra

Cassandra 3.x High Availability - Second Edition

2016-08-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Robbie Strickland

DevOps data data-engineering nosql-databases

Cassandra 3.x High Availability is an in-depth guide to mastering the high availability features of Apache Cassandra. This book takes you through its architecture, implementing solutions to achieve zero downtime, and configuring clusters for fault tolerance and scalability. With practical examples and tips, it is a go-to resource for designing robust Cassandra-powered systems. What this Book will help me do Understand the architecture of Apache Cassandra and its high availability mechanisms. Master replication and tunable consistency levels for optimal data distribution. Learn to scale out your Cassandra deployments with multiple data centers. Acquire skills in creating efficient and scalable data models for fault-tolerant systems. Prevent system failures by avoiding anti-patterns and managing graceful failover scenarios. Author(s) Robbie Strickland has extensive experience working as a developer and architect with distributed database systems. Specializing in Apache Cassandra, Strickland focuses on designing systems with high availability, scalability, and fault tolerance. Their practical teaching style ensures readers gain actionable knowledge to build robust database solutions. Who is it for? This book is ideal for developers and DevOps engineers familiar with Cassandra basics who wish to deepen their expertise. If your goal is to build highly available and fault-tolerant systems, this book will guide you step by step. It suits professionals managing data-intensive applications and looking to optimize their database strategy using Cassandra.

Sams Teach Yourself Apache Spark™ in 24 Hours

2016-08-17 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jeffrey Aven

AI/ML API Big Data Cloud Computing Data Engineering Kafka NoSQL Python Scala Spark SQL Data Streaming

Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems and is one of the most active open source big data projects to date. In just 24 lessons of one hour or less, Sams Teach Yourself Apache Spark in 24 Hours helps you build practical Big Data solutions that leverage Spark’s amazing speed, scalability, simplicity, and versatility. This book’s straightforward, step-by-step approach shows you how to deploy, program, optimize, manage, integrate, and extend Spark–now, and for years to come. You’ll discover how to create powerful solutions encompassing cloud computing, real-time stream processing, machine learning, and more. Every lesson builds on what you’ve already learned, giving you a rock-solid foundation for real-world success. Whether you are a data analyst, data engineer, data scientist, or data steward, learning Spark will help you to advance your career or embark on a new career in the booming area of Big Data.
Learn how to
• Discover what Apache Spark does and how it fits into the Big Data landscape
• Deploy and run Spark locally or in the cloud
• Interact with Spark from the shell
• Make the most of the Spark Cluster Architecture
• Develop Spark applications with Scala and functional Python
• Program with the Spark API, including transformations and actions
• Apply practical data engineering/analysis approaches designed for Spark
• Use Resilient Distributed Datasets (RDDs) for caching, persistence, and output
• Optimize Spark solution performance
• Use Spark with SQL (via Spark SQL) and with NoSQL (via Cassandra)
• Leverage cutting-edge functional programming techniques
• Extend Spark with streaming, R, and Sparkling Water
• Start building Spark-based machine learning and graph-processing applications
• Explore advanced messaging technologies, including Kafka
• Preview and prepare for Spark’s next generation of innovations
Instructions walk you through common questions, issues, and tasks; Q-and-As, Quizzes, and Exercises build and test your knowledge; "Did You Know?" tips offer insider advice and shortcuts; and "Watch Out!" alerts help you avoid pitfalls. By the time you're finished, you'll be comfortable using Apache Spark to solve a wide spectrum of Big Data problems.

Cassandra: The Definitive Guide, 2nd Edition

2016-07-12 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Eben Hewitt , Jeff Carpenter

Cloud Computing Data Modelling Docker ELK Hadoop Java JavaScript Python Spark data data-engineering nosql-databases

Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This expanded second edition—updated for Cassandra 3.0—provides the technical details and practical examples you need to put this database to work in a production environment. Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s non-relational design, with special attention to data modeling. If you’re a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra’s speed and flexibility. Understand Cassandra’s distributed and decentralized structure Use the Cassandra Query Language (CQL) and cqlsh—the CQL shell Create a working data model and compare it with an equivalent relational model Develop sample applications using client drivers for languages including Java, Python, and Node.js Explore cluster topology and learn how nodes exchange data Maintain a high level of performance in your cluster Deploy Cassandra on site, in the Cloud, or with Docker Integrate Cassandra with Spark, Hadoop, Elasticsearch, Solr, and Lucene

Pro Spark Streaming: The Zen of Real-Time Analytics Using Apache Spark

2016-06-13 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Zubair Nabi

AI/ML Analytics AWS Lambda BI Big Data ETL/ELT Apache HBase Hive IoT Kafka Redis Spark

Learn the right cutting-edge skills and knowledge to leverage Spark Streaming to implement a wide array of real-time, streaming applications. This book walks you through end-to-end real-time application development using real-world applications, data, and code. Taking an application-first approach, each chapter introduces use cases from a specific industry and uses publicly available datasets from that domain to unravel the intricacies of production-grade design and implementation. The domains covered in Pro Spark Streaming include social media, the sharing economy, finance, online advertising, telecommunication, and IoT. In the last few years, Spark has become synonymous with big data processing. DStreams enhance the underlying Spark processing engine to support streaming analysis with a novel micro-batch processing model. Pro Spark Streaming by Zubair Nabi will enable you to become a specialist of latency sensitive applications by leveraging the key features of DStreams, micro-batch processing, and functional programming. To this end, the book includes ready-to-deploy examples and actual code. Pro Spark Streaming will act as the bible of Spark Streaming. 
What You'll Learn Discover Spark Streaming application development and best practices Work with the low-level details of discretized streams Optimize production-grade deployments of Spark Streaming via configuration recipes and instrumentation using Graphite, collectd, and Nagios Ingest data from disparate sources including MQTT, Flume, Kafka, Twitter, and a custom HTTP receiver Integrate and couple with HBase, Cassandra, and Redis Take advantage of design patterns for side-effects and maintaining state across the Spark Streaming micro-batch model Implement real-time and scalable ETL using data frames, SparkSQL, Hive, and SparkR Use streaming machine learning, predictive analytics, and recommendations Mesh batch processing with stream processing via the Lambda architecture Who This Book Is For Data scientists, big data experts, BI analysts, and data architects.

Big Data Analytics with Spark: A Practitioner’s Guide to Using Spark for Large-Scale Data Processing, Machine Learning, and Graph Analytics, and High-Velocity Data Stream Processing

2016-01-02 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Mohammed Guller

AI/ML Analytics Avro BI Big Data Data Analytics ETL/ELT Apache HBase HDFS Kafka Parquet Scala

This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics with Spark shows you how to use Spark and leverage its easy-to-use features to increase your productivity. You learn to perform fast data analysis using its in-memory caching and advanced execution engine, employ in-memory computing capabilities for building high-performance machine learning and low-latency interactive analytics applications, and much more. Moreover, the book shows you how to use Spark as a single integrated platform for a variety of data processing tasks, including ETL pipelines, BI, live data stream processing, graph analytics, and machine learning. The book also includes a chapter on Scala, the hottest functional programming language, and the language that underlies Spark. You’ll learn the basics of functional programming in Scala, so that you can write Spark applications in it. What's more, Big Data Analytics with Spark provides an introduction to other big data technologies that are commonly used along with Spark, such as HDFS, Avro, Parquet, Kafka, Cassandra, HBase, Mesos, and so on. It also provides an introduction to machine learning and graph concepts. So the book is self-sufficient; all the technologies that you need to know to use Spark are covered. The only thing that you are expected to have is some programming knowledge in any language.

Apache Cassandra Essentials

2015-11-20 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Nitin Padalia

data data-engineering nosql-databases

"Apache Cassandra Essentials" is your guide to understanding and mastering the core concepts of Apache Cassandra. Whether you're setting up your first Cassandra cluster or optimizing performance, this book provides actionable steps and insights to help you design highly responsive database architectures. What this Book will help me do Set up and configure a Cassandra cluster for optimal performance. Design schemas in Cassandra using CQL for evenly distributed data. Employ tools to monitor and maintain Cassandra clusters effectively. Debug queries to improve database query performance. Tune Cassandra to adapt to specific operational environments. Author(s) Nitin Padalia, the author, is an experienced database engineer with a deep understanding of distributed systems. With years of experience working with Apache Cassandra and similar technologies, he has dedicated his efforts to simplifying complex concepts for developers. His clear and straightforward writing helps readers build expertise efficiently. Who is it for? This book is perfect for developers who are already familiar with Cassandra and want a deeper understanding of its architecture and functionality. If you're interested in diving into the non-relational aspects of Cassandra or need guidance on database optimization, you'll find this book invaluable. It's designed for those ready to advance their skills and maximize the potential of their Cassandra deployments.

Cassandra Design Patterns - Second Edition

2015-11-04 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rajanarayanan Thottuvaikkatumana

Analytics Big Data NoSQL RDBMS data data-engineering nosql-databases

Cassandra Design Patterns is your guide to harnessing the full potential of Apache Cassandra's distributed database capabilities through advanced design practices. Whether you're migrating from an RDBMS or implementing scalable storage for big data, this book provides clear strategies, practical examples, and real-world use cases demonstrating effective design patterns. What this Book will help me do Learn to integrate Cassandra with existing RDBMS solutions, enabling hybrid data architecture. Understand and implement key design patterns for distributed, scalable databases. Master the transition from RDBMS or cache systems to Cassandra with minimal disruption. Dive into time-series and temporal data patterns unique to Cassandra's strengths. Apply learned design patterns directly to real-world big data scenarios for analytics. Author(s) Rajanarayanan Thottuvaikkatumana, the author of Cassandra Design Patterns, is an expert in distributed systems and holds extensive experience in designing and implementing big data solutions. His hands-on approach to Cassandra is evident throughout the book as he bridges theoretical knowledge with practical applications. Rajanarayanan's approachable writing style aims to make complex concepts accessible. Who is it for? This book is ideal for big data developers and system architects who are familiar with the basics of Cassandra and are looking to deepen their understanding of design patterns for robust applications. Readers should have experience with relational databases and desire to migrate or integrate these concepts with NoSQL systems. Whether you're building solutions for data scalability, high availability, or analytics, Cassandra Design Patterns positions itself as an essential resource.

Pro MongoDB™ Development

2015-09-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Deepak Vohra

Hadoop Hive Java JavaScript JSON MongoDB NoSQL Oracle RDBMS data data-engineering nosql-databases

Pro MongoDB Development is a critical reference for anyone using MongoDB, a NoSQL database based on the BSON (binary JSON) document model. The book explores many aspects of implementing MongoDB in web applications, whether you are using Java, PHP, Ruby, or JavaScript. Noted expert Deepak Vohra walks you through accessing MongoDB databases with all these languages and working with various other technologies and databases. Vohra discusses using Java EE frameworks Kundera and Spring Data with MongoDB. You learn the nuts and bolts of migrating data from other NoSQL databases (Apache Cassandra and Couchbase) and from relational databases (Oracle Database). And, because NoSQL databases are commonly used with the Hadoop ecosystem, the book also covers using MongoDB with Apache Hive. Each chapter includes details about the software you need and hands-on examples of working with MongoDB and these technologies, so you know exactly what to do, whatever your MongoDB implementation requires.

Pro Couchbase Development: A NoSQL Platform for the Enterprise

2015-08-05 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Deepak Vohra

Big Data Cloud Computing Hadoop HDFS Java JavaScript JSON MongoDB NoSQL couchbase data data-engineering

Pro Couchbase Development: A NoSQL Platform for the Enterprise discusses programming for Couchbase using Java and scripting languages, querying and searching, handling migration, and integrating Couchbase with Hadoop, HDFS, and JSON. This book is for big data developers who use the Couchbase NoSQL database or want to use Couchbase for their web applications, as well as for those migrating from other NoSQL databases like MongoDB and Cassandra. One reason to migrate from Cassandra, for example, is that Cassandra is not based on the JSON document model, which supports a flexible schema without requiring you to define columns and supercolumns. The target audience is largely Java developers, but the book also supports PHP and Ruby developers who want to learn about Couchbase. The author supplies examples in Java, PHP, Ruby, and JavaScript. After reading and using this hands-on guide for developing with Couchbase, you'll be able to build complex enterprise, database, and cloud applications that leverage this powerful platform.

IBM Software Defined Infrastructure for Big Data Analytics Workloads

2015-06-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Marcelo Correia Lima , Dino Quintero , Maciej Olejniczak , Daniel de Souza Casali , Istvan Gabor Szabo , Nilton Carlos dos Santos , Tiago Rodrigues de Mello

Analytics Big Data Cloud Computing Data Analytics Hadoop IBM MongoDB Spark data data-engineering ibm-power-systems

This IBM® Redbooks® publication documents how IBM Platform Computing, with its IBM Platform Symphony® MapReduce framework, IBM Spectrum Scale (based upon IBM GPFS™), IBM Platform LSF®, and the Advanced Service Controller for Platform Symphony work together as an infrastructure to manage not just Hadoop-related offerings, but many popular industry offerings, such as Apache Spark, Storm, MongoDB, Cassandra, and so on. It describes the different ways to run Hadoop in a big data environment, and demonstrates how IBM Platform Computing solutions, such as Platform Symphony and Platform LSF with its MapReduce Accelerator, can improve performance and agility when running Hadoop on distributed workload managers offered by IBM. This information is for technical professionals (consultants, technical support staff, IT architects, and IT specialists) who are responsible for delivering cost-effective cloud services and big data solutions on IBM Power Systems™ to help uncover insights in clients' data so they can optimize product development and business results.

Big Data

2015-04-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by James Warren , Nathan Marz

AI/ML Analytics AWS Lambda Big Data Hadoop NoSQL data data-engineering

Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.
About the Technology About the Book Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with a lot of data, whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size, or speed. Fortunately, scale and simplicity are not mutually exclusive. Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases.
What's Inside Introduction to big data systems Real-time processing of web-scale data Tools like Hadoop, Cassandra, and Storm Extensions to traditional database skills
About the Reader This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.
About the Authors Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.
Quotes
Transcends individual tools or platforms. Required reading for anyone working with big data systems. - Jonathan Esterhazy, Groupon
A comprehensive, example-driven tour of the Lambda Architecture with its originator as your guide. - Mark Fisher, Pivotal
Contains wisdom that can only be gathered after tackling many big data projects. A must-read. - Pere Ferrera Bertran, Datasalt
The de facto guide to streamlining your data pipeline in batch and near-real time. - Alex Holmes, Author of "Hadoop in Practice"

NoSQL for Mere Mortals®

2015-04-16 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dan Sullivan

MongoDB Neo4j NoSQL RDBMS Redis SQL data data-engineering nosql-databases

NoSQL was developed to overcome the limitations of relational databases in the largest Web applications at companies such as Google, Yahoo and Facebook. As it is applied more widely, developers are finding that it can simplify scalability while requiring far less coding and management overhead. However, NoSQL requires fundamentally different approaches to database design and modeling, and many conventional relational techniques lead to suboptimal results. NoSQL for Mere Mortals is an easy, practical guide to succeeding with NoSQL in your environment. Following the classic, best-selling format pioneered in SQL Queries for Mere Mortals, enterprise database expert Dan Sullivan guides you step-by-step through choosing technologies, designing high-performance databases, and planning for long-term maintenance. Sullivan introduces each type of NoSQL database, shows how to install and manage them, and demonstrates how to leverage their features while avoiding common mistakes that lead to poor performance and unmet requirements. He uses four popular NoSQL databases as reference models: MongoDB, a document database; Cassandra, a column family data store; Redis, a key-value database; and Neo4j, a graph database. You'll find explanations of each database's structure and capabilities, practical guidelines for choosing amongst them, and expert guidance on designing databases with them. Packed with examples, NoSQL for Mere Mortals is today's best way to master NoSQL—whether you're a DBA, developer, user, or student.

Mastering Apache Cassandra - Second Edition

2015-03-26 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Nishant Neeraj

Big Data NoSQL data data-engineering nosql-databases

Mastering Apache Cassandra - Second Edition is your comprehensive guide to understanding and utilizing the power of Cassandra, an efficient and scalable NoSQL database. Throughout this book, you will learn how to design, deploy, and manage Cassandra databases effectively, tailored to your application's needs. What this Book will help me do Understand the architecture of Apache Cassandra and how it ensures scalability and reliability. Learn to build, configure, and deploy a Cassandra database cluster for high performance. Develop skills in monitoring and tuning Cassandra clusters for optimal operation. Gain expertise in managing clusters through scaling, node repair, and backup strategies. Integrate Apache Cassandra with other tools and your application seamlessly. Author(s) Nishant Neeraj is an experienced software developer and database engineer with a focus on delivering high-performance solutions. They have extensive hands-on experience with NoSQL databases, especially Apache Cassandra, and bring their practical insights and in-depth technical knowledge to this book to help readers tackle real-world challenges. Who is it for? This book is ideal for intermediate developers aiming to enhance their expertise in NoSQL databases. If you have a foundational understanding of database concepts and want to bring your skills to a professional level by mastering Apache Cassandra for modern applications, this book is perfect for you. It provides actionable insights and guidance suitable for professionals tackling high concurrency and big data challenges. Whether you are a developer, database administrator, or architect, this book provides a targeted deep dive into Cassandra.

Field Guide to Hadoop

2015-03-02 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Marshall Presser , Kevin Sitto

Avro Big Data Chef Cloud Computing Data Management Docker Hadoop Apache HBase HDFS Hive JSON MongoDB

If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections. You’ll quickly understand how Hadoop’s projects, subprojects, and related technologies work together. Each chapter introduces a different topic, such as core technologies or data transfer, and explains why certain components may or may not be useful for particular needs. When it comes to data, Hadoop is a whole new ballgame, but with this handy reference, you’ll have a good grasp of the playing field. Topics include: Core technologies: Hadoop Distributed File System (HDFS), MapReduce, YARN, and Spark. Database and data management: Cassandra, HBase, MongoDB, and Hive. Serialization: Avro, JSON, and Parquet. Management and monitoring: Puppet, Chef, ZooKeeper, and Oozie. Analytic helpers: Pig, Mahout, and MLlib. Data transfer: Sqoop, Flume, distcp, and Storm. Security, access control, auditing: Sentry, Kerberos, and Knox. Cloud computing and virtualization: Serengeti, Docker, and Whirr.

Learning Apache Cassandra

2015-02-25 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Matthew Brown

API MySQL NoSQL SQL data data-engineering nosql-databases postgresql

Learning Apache Cassandra is your comprehensive guide to mastering one of the most popular distributed databases for building scalable, fault-tolerant data layers. Through step-by-step examples and clear explanations, this book will help you understand Cassandra's architecture and how to use its features to design efficient applications. What this Book will help me do Successfully install and set up Apache Cassandra in your environment. Develop highly scalable data models for various application scenarios. Implement efficient query designs using Cassandra's specialized APIs. Maintain data consistency and handle concurrent updates in distributed systems. Apply best practices for securing Cassandra deployments and managing distributed data. Author(s) Matthew Brown is an experienced software developer with a focus on database systems and distributed architectures. With years of hands-on experience working with SQL and NoSQL databases, they bring practical insights and clear instructions to their readers. Their writing aims to demystify complex topics and provide practical learning paths. Who is it for? This book is intended for software developers and database administrators looking to expand their knowledge of distributed databases. If you are familiar with SQL databases like MySQL or PostgreSQL and want to transition to Cassandra, this guide will help you. No prior experience with distributed databases is assumed. By following this book, you'll quickly become proficient in using Cassandra for your distributed application needs.

NoSQL For Dummies

2015-02-24 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Adam Fowler

Analytics Big Data Data Analytics Hadoop MongoDB Neo4j NoSQL RDBMS data data-engineering nosql-databases

Get up to speed on the nuances of NoSQL databases and what they mean for your organization. This easy-to-read guide to NoSQL databases provides the type of no-nonsense overview and analysis that you need to learn, including what NoSQL is and which database is right for you. Featuring specific evaluation criteria for NoSQL databases, along with a look into the pros and cons of the most popular options, NoSQL For Dummies provides the fastest and easiest way to dive into the details of this incredible technology. You'll gain an understanding of how to use NoSQL databases for mission-critical enterprise architectures and projects, and real-world examples reinforce the primary points to create an action-oriented resource for IT pros. If you're planning a big data project or platform, you probably already know you need to select a NoSQL database to complete your architecture. But with options flooding the market and updates and add-ons coming at a rapid pace, determining what you require now, and in the future, can be a tall task. This is where NoSQL For Dummies comes in! Learn the basic tenets of NoSQL databases and why they have come to the forefront as data has outpaced the capabilities of relational databases. Discover major players among NoSQL databases, including Cassandra, MongoDB, MarkLogic, Neo4J, and others. Get an in-depth look at the benefits and disadvantages of the wide variety of NoSQL database options. Explore the needs of your organization as they relate to the capabilities of specific NoSQL databases. Big data and Hadoop get all the attention, but when it comes down to it, NoSQL databases are the engines that power many big data analytics initiatives. With NoSQL For Dummies, you'll go beyond relational databases to ramp up your enterprise's data architecture in no time.

Cassandra High Availability

2014-12-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Robbie Strickland

DevOps data data-engineering nosql-databases

This book, "Cassandra High Availability", equips you with the knowledge and practical skills to harness Apache Cassandra's capabilities for building resilient, scalable, and highly available systems. Suitable for developers or DevOps engineers with foundational knowledge of Cassandra, this resource takes you deeper into advanced topics necessary for maintaining robust distributed systems. What this Book will help me do Understand and utilize Cassandra's replication protocols and consistency levels to balance performance and reliability. Configure and manage multi-data-center setups in Cassandra for failover and geographic redundancy. Implement techniques to efficiently scale your Cassandra cluster with no downtime. Learn how to design high-availability data models optimized for performance and resilience. Identify and avoid common anti-patterns in Cassandra to maintain system efficiency and reliability. Author(s) Robbie Strickland, the author of "Cassandra High Availability", is an experienced data engineer with a deep understanding of distributed systems and database technologies. Strickland has worked extensively with Apache Cassandra in designing and optimizing scalable infrastructures. They bring a hands-on and detailed approach to explaining complex topics, making them accessible to both developers and system operators. Who is it for? This book is tailored for developers and DevOps engineers who have foundational knowledge of Apache Cassandra and are aiming to deepen their expertise. If your goal is to design, manage, and optimize high-availability distributed systems, this book provides practical strategies and technical insights for mastering Cassandra's capabilities. Ideal for those seeking to build fault-tolerant, scalable infrastructures.

Beginning Apache Cassandra Development

2014-12-17 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Vivek Mishra

Java JavaScript JSON NoSQL Python SQL data data-engineering nosql-databases

Beginning Apache Cassandra Development introduces you to one of the most robust and best-performing NoSQL database platforms on the planet. Apache Cassandra is a distributed, wide-column NoSQL database. It is specifically designed to manage large amounts of data across many commodity servers without there being any single point of failure. This design approach makes Apache Cassandra a robust and easy-to-implement platform when high availability is needed. Apache Cassandra can be used by developers in Java, PHP, Python, and JavaScript, the primary and most commonly used languages. In Beginning Apache Cassandra Development, author and Cassandra expert Vivek Mishra takes you through using Apache Cassandra from each of these primary languages. Mishra also covers the Cassandra Query Language (CQL), the Apache Cassandra analog to SQL. You'll learn to develop applications sourcing data from Cassandra, query that data, and deliver it at speed to your application's users. Cassandra is one of the leading NoSQL databases, meaning you get unparalleled throughput and performance without the sort of processing overhead that comes with traditional proprietary databases. Beginning Apache Cassandra Development will therefore help you create applications that generate search results quickly, stand up to high levels of demand, scale as your user base grows, ensure operational simplicity, and, not least, provide delightful user experiences.

Practical Cassandra: A Developer’s Approach

2013-12-19 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Eric Lubow , Russell Bradberry

Cloud Computing Data Management Data Modelling SQL data data-engineering nosql-databases

Build and Deploy Massively Scalable, Super-fast Data Management Applications with Apache Cassandra

Practical Cassandra is the first hands-on developer’s guide to building Cassandra systems and applications that deliver breakthrough speed, scalability, reliability, and performance. Fully up to date, it reflects the latest versions of Cassandra, including Cassandra Query Language (CQL), which dramatically lowers the learning curve for Cassandra developers.

Pioneering Cassandra developers and DataStax MVPs Russell Bradberry and Eric Lubow walk you through every step of building a real production application that can store enormous amounts of structured, semi-structured, and unstructured data. Drawing on their exceptional expertise, Bradberry and Lubow share practical insights into issues ranging from querying to deployment, management, maintenance, monitoring, and troubleshooting. The authors cover key issues, from architecture to migration, and guide you through crucial decisions about configuration and data modeling. They provide tested sample code, detailed explanations of how Cassandra works “under the covers,” and new case studies from three cutting-edge users: Ooyala, Hailo, and eBay.

Coverage includes

Understanding Cassandra’s approach, architecture, key concepts, and primary use cases, and why it’s so blazingly fast
Getting Cassandra up and running on single nodes and large clusters
Applying the new design patterns, philosophies, and features that make Cassandra such a powerful data store
Leveraging CQL to simplify your transition from SQL-based RDBMSes
Deploying and provisioning through the cloud or on bare-metal hardware
Choosing the right configuration options for each type of workload
Tweaking Cassandra to get maximum performance from your hardware, OS, and JVM
Mastering Cassandra’s essential tools for maintenance and monitoring
Efficiently solving the most common problems with Cassandra deployment, operation, and application development

NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence

2012-08-08 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Pramod Sadalage , Martin Fowler (Thoughtworks)

MongoDB Neo4j NoSQL RDBMS data data-engineering nosql-databases

The need to handle increasingly larger data volumes is one factor driving the adoption of a new class of nonrelational “NoSQL” databases. Advocates of NoSQL databases claim they can be used to build systems that are more performant, scale better, and are easier to program. NoSQL Distilled is a concise but thorough introduction to this rapidly emerging technology. Pramod J. Sadalage and Martin Fowler explain how NoSQL databases work and the ways that they may be a superior alternative to a traditional RDBMS. The authors provide a fast-paced guide to the concepts you need to know in order to evaluate whether NoSQL databases are right for your needs and, if so, which technologies you should explore further. The first part of the book concentrates on core concepts, including schemaless data models, aggregates, new distribution models, the CAP theorem, and map-reduce. In the second part, the authors explore architectural and design issues associated with implementing NoSQL. They also present realistic use cases that demonstrate NoSQL databases at work and feature representative examples using Riak, MongoDB, Cassandra, and Neo4j. In addition, by drawing on Pramod Sadalage’s pioneering work, NoSQL Distilled shows how to implement evolutionary design with schema migration: an essential technique for applying NoSQL databases. The book concludes by describing how NoSQL is ushering in a new age of Polyglot Persistence, where multiple data-storage worlds coexist, and architects can choose the technology best optimized for each type of data access.

talk-data.com
