ELK

Agile Data Science 2.0

2017-06-13 · O'Reilly Data Science Books O'Reilly Amazon

book

by Russell Jurney

Agile/Scrum Airflow Analytics Data Science JavaScript Kafka MongoDB Python Scikit-learn Spark data data-science

Data science teams looking to turn research into useful analytics applications require not only the right tools, but also the right approach if they’re to succeed. With the revised second edition of this hands-on guide, up-and-coming data scientists will learn how to use the Agile Data Science development methodology to build data applications with Python, Apache Spark, Kafka, and other tools. Author Russell Jurney demonstrates how to compose a data platform for building, deploying, and refining analytics applications with Apache Kafka, MongoDB, Elasticsearch, d3.js, scikit-learn, and Apache Airflow. You’ll learn an iterative approach that lets you quickly change the kind of analysis you’re doing, depending on what the data is telling you. Publish data science work as a web application, and effect meaningful change in your organization. Build value from your data in a series of agile sprints, using the data-value pyramid. Extract features for statistical models from a single dataset. Visualize data with charts, and expose different aspects through interactive reports. Use historical data to predict the future via classification and regression. Translate predictions into actions. Get feedback from users after each sprint to keep your project on track.

Mastering Elastic Stack

2017-02-28 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ravi Kumar Gupta, Yuvraj Gupta

Analytics Data Analytics Kibana Logstash Cyber Security data data-engineering elastic-stack-elk-stack elastic stack (elk stack) elasticsearch search

Mastering Elastic Stack is your complete guide to advancing your data analytics expertise using the ELK Stack. With detailed coverage of Elasticsearch, Logstash, Kibana, Beats, and X-Pack, this book equips you with the skills to process and analyze any type of data efficiently. Through practical examples and real-world scenarios, you'll gain the ability to build end-to-end pipelines and create insightful dashboards. What this Book will help me do: Build and manage log pipelines using Logstash, Beats, and Elasticsearch for real-time analytics. Develop advanced Kibana dashboards to visualize and interpret complex datasets. Efficiently utilize X-Pack features for alerting, monitoring, and security in the Elastic Stack. Master plugin customization and deployment for a tailored Elastic Stack environment. Apply Elastic Stack solutions to real-world cases for centralized logging and actionable insights. Author(s): The authors, Ravi Kumar Gupta and Yuvraj Gupta, are experienced technologists who have spent years working at the forefront of data processing and analytics. They are well-versed in Elasticsearch, Logstash, Kibana, and the Elastic ecosystem, having worked extensively in enterprise environments where these tools have transformed operations. Their passion for teaching and thorough understanding of the tools culminate in this comprehensive resource. Who is it for? The ideal reader is a developer already familiar with Elasticsearch, Logstash, and Kibana who wants to deepen their understanding of the stack. If you're involved in creating scalable data pipelines, analyzing complex datasets, or looking to implement centralized logging solutions in your work, this book is an excellent resource. It bridges the gap from intermediate to expert knowledge, allowing you to use the Elastic Stack effectively in various scenarios. Whether you are moving beyond the beginner level or extending an existing skill set, this book meets your needs.

Mastering Elasticsearch 5.x - Third Edition

2017-02-21 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Bharvi Dixit

Analytics Big Data data data-engineering elasticsearch search

This comprehensive guide dives deep into the functionalities of Elasticsearch 5, the widely-used search and analytics engine. Leveraging the power of Apache Lucene, this book will help you understand advanced concepts like querying, indexing, and cluster management to build efficient and scalable search solutions. What this Book will help me do: Master advanced features of Elasticsearch such as text scoring, sharding, and aggregation. Understand how to handle big data efficiently using Elasticsearch's architecture. Learn practical implementation techniques for Elasticsearch features through hands-on examples. Develop custom plugins for Elasticsearch to tailor its functionalities to specific needs. Scale and optimize Elasticsearch clusters for high performance in production environments. Author(s): Bharvi Dixit is an experienced software engineer and a recognized expert in implementing Elasticsearch solutions. With a strong background in distributed systems and database management, Bharvi's writing is informed by real-world experience and a focus on practical applications. Who is it for? This book is ideal for developers and data engineers with existing experience in Elasticsearch who wish to deepen their knowledge. It serves as a valuable resource for professionals tasked with creating scalable search applications. A working understanding of Elasticsearch basics and query DSL is recommended to fully benefit from this guide.

Learning Kibana 5.0

2017-02-15 · O'Reilly Data Science Books O'Reilly Amazon

book

by Bahaaldine Azarmi

Analytics Dashboard Data Analytics DataViz Kibana Logstash data data-science data-science-tasks data-visualization

Learning Kibana 5.0 is your gateway to mastering the art of data visualization using the powerful features of the Kibana platform. This book guides you through the process of creating stunning interactive dashboards and making data-driven insights accessible with real-time visualizations. Whether you're new to the Elastic stack or seeking to refine your expertise, this book equips you to harness Kibana's full potential. What this Book will help me do: Build robust, real-time dashboards in Kibana to visualize complex datasets efficiently. Leverage Timelion to perform time-series data analysis and create metrics-based dashboards. Explore advanced analytics using the Graph plugin to uncover relationships and correlations in data. Learn how to create and deploy custom plugins to tailor Kibana to specific project needs. Understand how to use the Elastic stack to monitor, analyze, and optimize various types of data flows. Author(s): Bahaaldine Azarmi is a seasoned expert in the Elastic stack, known for his dedication to making complex technical topics approachable and practical. With years of experience in data analytics and software development, Bahaaldine shares not only his technical expertise but also his passion for helping professionals achieve their goals through clear, actionable guidance. His writing emphasizes hands-on learning and practical application. Who is it for? This book is perfect for developers, data visualization engineers, and data scientists who aim to hone their skills in data visualization and interactive dashboard development. It assumes a basic understanding of Elasticsearch and Logstash to maximize its practicality. If you aim to advance your career by learning how to optimize data architecture and solve real-world problems using the Elastic stack, this book is ideal for you.

Elasticsearch 5.x Cookbook - Third Edition

2017-02-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Alberto Paro

Analytics Big Data Java JSON data data-engineering elasticsearch search

Elasticsearch 5.x Cookbook is a comprehensive guide that teaches you how to leverage the full power of Elasticsearch for high-performance search and analytics. Through step-by-step recipes, you'll explore deployment, query building, plugin integration, and advanced analytics, ensuring you can manage and scale Elasticsearch like a pro. What this Book will help me do: Understand and deploy complex Elasticsearch cluster topologies for optimal performance. Create tailored mappings to gain finer control over data indexing and retrieval. Design and execute advanced queries and analytics using Elasticsearch capabilities. Integrate Elasticsearch with popular programming languages and big data platforms. Monitor and improve Elasticsearch cluster health using best practices and tools. Author(s): Alberto Paro is a seasoned software engineer and data scientist with extensive experience in distributed systems and search technologies. Having worked on numerous search-related projects, he brings practical, real-world insights to his writing. Alberto is passionate about teaching and simplifying complex concepts, making this book both approachable and expertly detailed. Who is it for? This book is ideal for developers or data engineers seeking to utilize Elasticsearch for advanced search and analytics tasks. If you have some prior knowledge of JSON and programming concepts, particularly Java, you will benefit most from this material. Whether you're looking to integrate Elasticsearch into your systems or to optimize its usage, this book caters to your needs.

HBase High Performance Cookbook

2017-01-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Ruchir Choudhry

Big Data Cloud Computing Data Management Hadoop Apache HBase NoSQL data data-engineering nosql-databases

"HBase High Performance Cookbook" is your guide to mastering the optimization, scaling, and tuning of HBase systems. Covering everything from configuring HBase clusters to designing scalable table structures and performance tuning, this comprehensive book provides practical advice and strategies for leveraging HBase's full potential. By following this book's recipes, you'll supercharge your HBase expertise. What this Book will help me do Understand how to configure HBase for optimal performance, improving your data system's efficiency. Learn to design table structures to maximize scalability and functionality in HBase. Gain skills in performing CRUD operations and using advanced features like MapReduce within HBase. Discover practices for integrating HBase with other technologies such as ElasticSearch. Master the steps involved in setting up and optimizing HBase in cloud environments for enhanced performance. Author(s) Ruchir Choudhry is a seasoned data management professional with extensive experience in distributed database systems. He possesses deep expertise in HBase, Hadoop, and other big data technologies. His practical and engaging writing style aims to demystify complex technical topics, making them accessible to developers and architects alike. Who is it for? This book is tailored for developers and system architects looking to deepen their understanding of HBase. Whether you are experienced with other NoSQL databases or are new to HBase, this book provides extensive practical knowledge. Ideal for professionals working in big data applications or those eager to optimize and scale their database systems effectively.

Mastering RethinkDB

2016-12-16 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Shahid Shaikh

Docker JavaScript data data-engineering nosql-databases rethinkdb

Mastering RethinkDB offers a comprehensive guide to using the open-source, scalable database RethinkDB for real-time application development. Throughout this book, you'll gain practical knowledge of query management with ReQL, build dynamic web apps, and perform advanced database administration tasks. What this Book will help me do: Gain expertise in managing and configuring RethinkDB clusters for optimal performance in real-time applications. Develop robust web applications using RethinkDB and integrate them seamlessly with Node.js. Leverage advanced querying features of ReQL, including geospatial and time-series queries. Enhance RethinkDB's capabilities with integration techniques for third-party tools like Elasticsearch. Master deployment practices using platforms such as Docker and PaaS for production-grade applications. Author(s): Shahid Shaikh, an expert in database technologies and real-time system design, brings years of hands-on experience working with open-source databases like RethinkDB. Known for writing practical technical books, Shaikh emphasizes real-world applications and clarity to help both novice and seasoned developers excel. Who is it for? This book is ideal for developers who are building real-time applications and want to adopt RethinkDB for their solutions. Readers should have a basic understanding of RethinkDB and Node.js to get the most benefit. It's particularly suited for programmers looking to deepen their database administration skills and enhance their real-time data handling expertise.

Beginning Elastic Stack

2016-12-09 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Vishal Sharma

Kibana Logstash data data-engineering elastic-stack-elk-stack elastic stack (elk stack) elasticsearch search

Learn how to install, configure, and implement the Elastic Stack (Elasticsearch, Logstash, and Kibana), an invaluable tool for anyone deploying a centralized log management solution for servers and apps. You will see how to use and configure Elastic Stack independently and alongside Puppet. Each chapter includes real-world examples and practical troubleshooting tips, enabling you to get up and running with Elastic Stack in record time. Fully customizable and easy to use, Elastic Stack enables you to stay on top of your servers all the time and resolve problems for your clients as fast as possible. It is supported by Puppet and available with various plugins. Get started with Beginning Elastic Stack today and see why many consider Elastic Stack the best option for server log management. What You Will Learn: Install and configure Logstash. Use Logstash with Elasticsearch and Kibana. Use Logstash with Puppet and Foreman. Centralize data processing. Who This Book Is For: Anyone working on multiple servers who needs to search their logs using a web interface. It is ideal for server administrators who have just started their job and need to look after multiple servers efficiently.

Fast Data Architectures for Streaming Applications

2016-10-15 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dean Wampler

Big Data IoT Data Streaming data data-engineering streaming-architecture streaming-messaging

Why have stream-oriented data systems become so popular, when batch-oriented systems have served big data needs for many years? In this report, author Dean Wampler examines the rise of streaming systems for handling time-sensitive problems, such as detecting fraudulent financial activity as it happens. You’ll explore the characteristics of fast data architectures, along with several open source tools for implementing them. Batch-mode processing isn’t going away, but exclusive use of these systems is now a competitive disadvantage. You’ll learn that, while fast data architectures are much harder to build, they represent the state of the art for dealing with mountains of data that require immediate attention. Learn step-by-step how a basic fast data architecture works. Understand why event logs are the core abstraction for streaming architectures, while message queues are the core integration tool. Use methods for analyzing infinite data sets, where you don’t have all the data and never will. Take a tour of open source streaming engines, and discover which ones work best for different use cases. Get recommendations for making real-world streaming systems responsive, resilient, elastic, and message driven. Explore an example streaming application for the IoT: telemetry ingestion and anomaly detection for home automation systems.

Hadoop: Data Processing and Modelling

2016-08-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Sandeep Karanth, Tanmay Deshpande, Garry Turkington

AI/ML Big Data DWH Hadoop HDFS Hive Java RDBMS Spark SQL data data-engineering

Unlock the power of your data with the Hadoop 2.X ecosystem and its data warehousing techniques across large data sets. About This Book: Conquer the mountain of data using Hadoop 2.X tools. The authors succeed in creating a context for Hadoop and its ecosystem. Hands-on examples and recipes give the bigger picture and help you master Hadoop 2.X data processing platforms. Overcome challenging data processing problems using this exhaustive course with Hadoop 2.X. Who This Book Is For: This course is for Java developers who know scripting and want a career shift to the Hadoop and Big Data segment of the IT industry. Whether you are a novice in Hadoop or an expert, this book will take you to the most advanced level in Hadoop 2.X. What You Will Learn: Best practices for setup and configuration of Hadoop clusters, tailoring the system to the problem at hand. Integration with relational databases, using Hive for SQL queries and Sqoop for data transfer. Installing and maintaining a Hadoop 2.X cluster and its ecosystem. Advanced data analysis using Hive, Pig, and MapReduce programs. Machine learning principles with libraries such as Mahout, and batch and stream data processing using Apache Spark. Understand the changes involved in the move from Hadoop 1.0 to Hadoop 2.0. Dive into YARN and Storm, and use YARN to integrate Storm with Hadoop. Deploy Hadoop on Amazon Elastic MapReduce, discover HDFS replacements, and learn about HDFS Federation. In Detail: As Marc Andreessen has said, "Data is eating the world," and this can be witnessed today in the age of Big Data: businesses are producing data in huge volumes every day, and this rising tide of data needs to be organized and analyzed in a more secure way. With proper and effective use of Hadoop, you can build new and improved models, and based on them you will be able to make the right decisions.
The first module, Hadoop Beginner's Guide, walks you through understanding Hadoop with very detailed instructions on how to use it. Commands are explained in sections called "What just happened" for more clarity and understanding. The second module, Hadoop Real World Solutions Cookbook, 2nd edition, is an essential tutorial for effectively implementing a big data warehouse in your business, with detailed practices on the latest technologies such as YARN and Spark. Big data has become a key basis of competition and of new waves of productivity growth. Hence, once you are familiar with the basics and have implemented the end-to-end big data use cases, you will start exploring the third module, Mastering Hadoop. If you need to take your Hadoop skill set to the next level after you nail the basics and the advanced concepts, this course is indispensable. When you finish it, you will be able to tackle real-world scenarios and become a big data expert, using the tools and knowledge from the various step-by-step tutorials and recipes. Style and approach: This course covers everything from the basic concepts of Hadoop through the advanced mechanisms you need to master to become a big data expert. The goal is to help you learn the essentials through step-by-step tutorials and then move on to recipes with various real-world solutions. It covers the important aspects of Hadoop, from system design and configuration to machine learning principles with various libraries, in chapters illustrated with code fragments and schematic diagrams. This is a compendious course for exploring Hadoop from the basics to the most advanced techniques available in Hadoop 2.X.

IBM TS4500 R3 Tape Library Guide

2016-08-20 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Larry Coyne, Michael Engelbrecht, Simon Browne

IBM Cyber Security data data-engineering

The IBM® TS4500 tape library is a next-generation tape solution that offers higher storage density and integrated management than previous solutions. This IBM Redbooks® publication gives you a close-up view of the new IBM TS4500 tape library. In the TS4500, IBM delivers the density that today's and tomorrow's data growth require, with the cost-effectiveness and the manageability to grow with business data needs, while you preserve existing investments in IBM tape library products. Now, you can achieve both a low cost per terabyte (TB) and a high TB density per square foot because the TS4500 can store up to 5.5 petabytes (PBs) of data in a single 10-square foot library frame, which is up to 3.4 times more capacity than the IBM TS3500 tape library. The TS4500 offers these benefits: High availability dual active accessors with integrated service bays to reduce inactive service space by 40%. The Elastic Capacity option can be used to completely eliminate inactive service space. Flexibility to grow: The TS4500 library can grow from both the right side and the left side of the first L frame because models can be placed in any active position. Increased capacity: The TS4500 can grow from a single L frame up to an additional 17 expansion frames with a capacity of over 23,000 cartridges. High-density (HD) generation 1 frames from the existing TS3500 library can be redeployed in a TS4500. Capacity on demand (CoD): CoD is supported through entry-level, intermediate, and base-capacity configurations. Advanced Library Management System (ALMS): ALMS supports dynamic storage management, which enables users to create and change logical libraries and configure any drive for any logical library. Support for the IBM TS1150 tape drive: The TS1150 gives organizations an easy way to deliver fast access to data, improve security, and provide long-term retention, all at a lower cost than disk solutions. 
The TS1150 offers high-performance, flexible data storage with support for data encryption. Also, this fifth-generation drive can help protect investments in tape automation by offering compatibility with existing automation. Support of the IBM Linear Tape-Open (LTO) Ultrium 7 tape drive: The LTO Ultrium 7 offering represents significant improvements in capacity, performance, and reliability over the previous generation, LTO Ultrium 6, while they still protect your investment in the previous technology. Integrated TS7700 back-end Fibre Channel (FC) switches are available. Up to four library-managed encryption (LME) key paths per logical library are available. This book describes the TS4500 components, feature codes, specifications, supported tape drives, encryption, new integrated management console (IMC), and command-line interface (CLI). You learn how to accomplish several specific tasks: Improve storage density with increased expansion frame capacity up to 2.4 times and support 33% more tape drives per frame. Manage storage by using the ALMS feature. Improve business continuity and disaster recovery with dual active accessor, automatic control path failover, and data path failover. Help ensure security and regulatory compliance with tape-drive encryption and Write Once Read Many (WORM) media. Support IBM LTO Ultrium 7, 6, and 5, IBM TS1150, and TS1140 tape drives. Provide a flexible upgrade path for users who want to expand their tape storage as their needs grow. Reduce the storage footprint and simplify cabling with 10 U of rack space on top of the library. This guide is for anyone who wants to understand more about the IBM TS4500 tape library. It is particularly suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

IBM Tape Library Guide for Open Systems

2016-08-11 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Larry Coyne, Michael Engelbrecht, Simon Browne

IBM data data-engineering

This IBM® Redbooks® publication presents a general introduction to the latest IBM tape and tape library technologies. Featured tape technologies include the IBM LTO Ultrium and Enterprise 3592 tape drives, and their implementation in IBM tape libraries. This 13th edition includes information about the latest enhancements to the IBM TS4500 enterprise tape library. In particular, it includes details about the latest TS4500 High Availability feature and its elastic capacity option. This book also provides details about the new TS7650G IBM ProtecTIER® gateway model DD6, contains technical information about each IBM tape product for open systems, and includes generalized sections about Small Computer System Interface (SCSI) and Fibre Channel connections and multipath architecture configurations. This book also covers tools and techniques for library management. It is intended for anyone who wants to understand more about IBM tape products and their implementation. It is suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists. If you do not have a background in computer tape storage products, you might need to read other sources of information. In the interest of being concise, topics that are generally understood are not covered in detail.

Monitoring Elasticsearch

2016-07-27 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Dan Noble

API Kibana data data-engineering elasticsearch search

"Monitoring Elasticsearch" focuses on teaching readers how to manage and monitor the health and performance of Elasticsearch clusters. Through practical steps and real-world examples, this book ensures that users can diagnose, resolve, and prevent common issues to optimize system reliability and performance. What this Book will help me do Obtain a clear understanding of Elasticsearch monitoring tools and their features. Learn how to diagnose and troubleshoot common Elasticsearch performance issues. Master the use of Elasticsearch APIs for monitoring and analysis. Explore the best practices for effectively maintaining cluster reliability. Understand the features of tools like Kibana, Marvel, and BigDesk for Elasticsearch monitoring. Author(s) The authors of "Monitoring Elasticsearch" are experts in distributed systems and database management, with extensive experience in Elasticsearch deployment and monitoring. They bring their practical knowledge, teaching readers clear and actionable techniques. Their approachable style makes complex systems accessible, helping professionals and aficionados alike. Who is it for? This book is ideal for developers and system administrators who work with Elasticsearch, regardless of their industry. Whether you're new to Elasticsearch or aiming to deepen your expertise, you will find practical solutions and helpful tools. The content suits a range of experiences, from beginners curious about cluster monitoring to experts needing solutions for specific issues. If you use Elasticsearch or plan to, this book is for you.

Cassandra: The Definitive Guide, 2nd Edition

2016-07-12 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Eben Hewitt, Jeff Carpenter

Cassandra Cloud Computing Data Modelling Docker Hadoop Java JavaScript Python Spark data data-engineering nosql-databases

Imagine what you could do if scalability wasn't a problem. With this hands-on guide, you’ll learn how the Cassandra database management system handles hundreds of terabytes of data while remaining highly available across multiple data centers. This expanded second edition, updated for Cassandra 3.0, provides the technical details and practical examples you need to put this database to work in a production environment. Authors Jeff Carpenter and Eben Hewitt demonstrate the advantages of Cassandra’s non-relational design, with special attention to data modeling. If you’re a developer, DBA, or application architect looking to solve a database scaling issue or future-proof your application, this guide helps you harness Cassandra’s speed and flexibility. Understand Cassandra’s distributed and decentralized structure. Use the Cassandra Query Language (CQL) and cqlsh, the CQL shell. Create a working data model and compare it with an equivalent relational model. Develop sample applications using client drivers for languages including Java, Python, and Node.js. Explore cluster topology and learn how nodes exchange data. Maintain a high level of performance in your cluster. Deploy Cassandra on site, in the Cloud, or with Docker. Integrate Cassandra with Spark, Hadoop, Elasticsearch, Solr, and Lucene.

Relevant Search

2016-06-20 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Doug Turnbull, John Berryman

Analytics data data-engineering search

Relevant Search demystifies relevance work. Using Elasticsearch, it teaches you how to return engaging search results to your users, helping you understand and leverage the internals of Lucene-based search engines. About the Technology: Users are accustomed to and expect instant, relevant search results. To achieve this, you must master the search engine. Yet for many developers, relevance ranking is mysterious or confusing. About the Book: Relevant Search demystifies the subject and shows you that a search engine is a programmable relevance framework. You'll learn how to apply Elasticsearch or Solr to your business's unique ranking problems. The book demonstrates how to program relevance and how to incorporate secondary data sources, taxonomies, text analytics, and personalization. In practice, a relevance framework requires softer skills as well, such as collaborating with stakeholders to discover the right relevance requirements for your business. By the end, you'll be able to achieve a virtuous cycle of provable, measurable relevance improvements over a search product's lifetime. What's Inside: Techniques for debugging relevance; applying search engine features to real problems; using the user interface to guide searchers; a systematic approach to relevance; a business culture focused on improving search. About the Reader: For developers trying to build smarter search with Elasticsearch or Solr. About the Authors: Doug Turnbull is lead relevance consultant at OpenSource Connections, where he frequently speaks and blogs. John Berryman is a data engineer at Eventbrite, where he specializes in recommendations and search. Quotes: "One of the best and most engaging technical books I’ve ever read." - From the Foreword by Trey Grainger, Author of "Solr in Action". "Will help you solve real-world search relevance problems for Lucene-based search engines." - Dimitrios Kouzis-Loukas, Bloomberg L.P. "An inspiring book revealing the essence and mechanics of relevant search." - Ursin Stauss, Swiss Post. "Arms you with invaluable knowledge to temper the relevancy of search results and harness the powerful features provided by modern search engines." - Russ Cam, Elastic.

Mastering Redis

2016-05-31 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Jeremy Nelson (Insight)

MongoDB NoSQL Redis data data-engineering nosql-databases

"Mastering Redis" is your comprehensive guide to truly leveraging the power of the Redis data structure server. This hands-on resource offers detailed insights into scaling data with Redis clusters, optimizing memory, scripting with Lua, and integrating Redis with other NoSQL technologies to create robust, efficient applications. What this Book will help me do Select and utilize the appropriate Redis data structure to solve specific use cases efficiently. Implement Lua scripts on Redis for complex workflows and custom functionality. Optimize Redis configurations to achieve efficient memory usage and server performance. Integrate Redis with other NoSQL databases, such as MongoDB and Elasticsearch, for enhanced capabilities. Set up Redis Clusters and use Redis Sentinel for distributed and highly available setups. Author(s) Vidyasagar N V and None Nelson bring a wealth of expertise in software development and distributed systems to this book. Vidyasagar has extensive hands-on experience with Redis, enabling him to provide practical insights and best practices. Nelson complements this with deep knowledge of database optimization, making their combined perspective invaluable for anyone diving deep into Redis. Who is it for? This book is aimed at software developers who have an understanding of Redis basics and want to advance their proficiency. It is also targeted at developers aiming to implement Redis in production efficiently. By reading this book, readers will deepen their Redis skills and learn how to integrate it with other technologies to develop scalable, high-performance applications.

Elasticsearch Server - Third Edition

2016-02-29 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Rafal Kuc, Marek Rogozinski

data data-engineering elasticsearch search

Master the art of efficient search solutions with the insights and techniques provided in 'Elasticsearch Server - Third Edition'. This comprehensive guide covers everything from the basics of indexing and querying to advanced topics like aggregation and scaling, ensuring you can build robust search infrastructures tailored to your project's needs.

What this Book will help me do
Gain practical expertise in configuring Elasticsearch indices and retrieving data efficiently.
Learn to craft complex queries using the Elasticsearch query domain-specific language (DSL).
Understand and implement advanced search features for enhanced functionality.
Master the aggregation framework to derive valuable insights from your data.
Equip yourself with the skills to monitor and optimize your Elasticsearch cluster for performance and scalability.

Author(s)
Marek Rogozinski and Rafal Kuc are seasoned experts in search technologies and have extensive experience working with Elasticsearch and related domains. With years of technical experience and a passion for teaching through clear, hands-on examples, they aim to make mastering Elasticsearch accessible and practical for tech professionals and enthusiasts alike.

Who is it for?
This book is aimed at software developers and IT professionals who are eager to build or strengthen their expertise in Elasticsearch. Whether you're new to search infrastructure or looking to refine your skills, this book is tailored for beginner to intermediate levels. If your goal is to deploy scalable search solutions or understand how to analyze large datasets effectively, this book is for you.

Elasticsearch Essentials

2016-01-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Bharvi Dixit

Analytics API Big Data Data Analytics Data Modelling data data-engineering elasticsearch search

"Elasticsearch Essentials" provides a comprehensive introduction to Elasticsearch, the powerful search and analytics engine. This book delivers a fast-paced, practical guide to harnessing Elasticsearch for creating scalable search and analytics applications.

What this Book will help me do
Learn to effectively use Elasticsearch REST APIs for search and analytics.
Understand and design schema and mappings with best practices.
Master data modeling concepts for efficient data queries.
Develop skills to create and manage Elasticsearch clusters in production.
Learn techniques for ensuring high availability and handling large datasets.

Author(s)
Bharvi Dixit is a seasoned developer and expert in search technologies with hands-on experience in Elasticsearch and other search solutions. With extensive knowledge in data analytics and large-scale systems, Bharvi ensures readers gain practical skills and insights through well-structured examples and explanations.

Who is it for?
This book is perfect for developers looking to enhance their skills in building search and analytics solutions with Elasticsearch. It's particularly suited for those familiar with search technologies like Apache Lucene or Solr but new to Elasticsearch. Beginners to intermediate learners in big data and analytics will find the structured approach beneficial. It's ideal for professionals aspiring to develop advanced search implementations with modern tools.

Scalable Big Data Architecture: A Practitioner’s Guide to Choosing Relevant Big Data Architecture

2016-01-06 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Bahaaldine Azarmi

AI/ML Analytics API Big Data Hadoop Kafka Logstash NoSQL Spark SQL data data-engineering

This book highlights the different types of data architecture and illustrates the many possibilities hidden behind the term "Big Data", from the usage of NoSQL databases to the deployment of stream analytics architecture, machine learning, and governance. Scalable Big Data Architecture covers real-world, concrete industry use cases that leverage complex distributed applications, which involve web applications, RESTful APIs, and a high throughput of large amounts of data stored in highly scalable NoSQL data stores such as Couchbase and Elasticsearch. This book demonstrates how data processing can be done at scale, from the usage of NoSQL datastores to the combination of Big Data distributions. When the data processing is too complex and involves different processing topologies, such as long-running jobs, stream processing, correlation of multiple data sources, and machine learning, it’s often necessary to delegate the load to Hadoop or Spark and use the NoSQL store to serve processed data in real time. This book shows you how to choose a relevant combination of big data technologies available within the Hadoop ecosystem. It focuses on processing long jobs, architecture, stream data patterns, log analysis, and real-time analytics. Every pattern is illustrated with practical examples, which use open source projects such as Logstash, Spark, and Kafka. Traditional data infrastructures are built for digesting and rendering data synthesis and analytics from large amounts of data. This book helps you to understand why you should consider using machine learning algorithms early on in the project, before being overwhelmed by constraints imposed by dealing with the high throughput of Big Data. Scalable Big Data Architecture is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a Big Data project and which tools to integrate into that pattern.

Elasticsearch Indexing

2015-12-30 · O'Reilly Data Engineering Books O'Reilly Amazon

book

by Huseyin Akdogan

data data-engineering elasticsearch search

Elasticsearch Indexing focuses on empowering developers to create optimized and user-friendly search experiences using Elasticsearch. By learning how to configure indices and mapping strategies, and leveraging analyzers effectively, you will gain proficiency in delivering fast and relevant search results tailored to modern user expectations.

What this Book will help me do
Understand how Elasticsearch stores data and how it reduces costs
Develop advanced mapping strategies to improve index performance
Utilize Elasticsearch analyzers for efficient search query processing
Optimize Elasticsearch clusters for scalability and stability
Perform strategic indexing to minimize resource usage while maximizing functionality

Author(s)
Huseyin Akdogan is a seasoned software developer specializing in search technologies and scalability. With his deep expertise and practical insights, he brings a metric-driven approach to optimizing Elasticsearch. His book reflects his dedication to making technical concepts accessible and actionable for developers.

Who is it for?
This book is ideal for developers looking to gain expertise in Elasticsearch. It caters to individuals with a foundational understanding of search systems who wish to optimize their indexing and search result delivery. If you are focused on improving user search experiences tailored to scalable needs, this book is perfect for you.
