talk-data.com

Topic: ELK (Elasticsearch/ELK Stack)
Tags: search_engine, log_analysis, elk_stack
168 tagged activities

Activity Trend: peak of 10 activities per quarter, 2020-Q1 through 2026-Q1

Activities: 168 activities · Newest first

Learning ELK Stack

Dive into the ELK stack (Elasticsearch, Logstash, and Kibana) with this comprehensive guide. Designed to help you set up, configure, and utilize the stack to its fullest, this book provides you with the skills to manage data with precision, enrich logs, and create meaningful analytics. Develop an entire data pipeline and cultivate powerful visual insights from your data.

What this Book will help me do
- Install and configure Elasticsearch, Logstash, and Kibana to establish a robust ELK stack setup.
- Understand the role of each component in the stack and master the basics of log analysis.
- Create custom Logstash plugins to handle non-standard data processing requirements.
- Develop interactive and insightful data visualizations and dashboards using Kibana.
- Implement a complete data pipeline and gain expertise in data indexing, searching, and reporting.

Author(s)
Chhajed brings depth of technical understanding and practical experience to the exploration of the ELK Stack. With a strong background in open-source technologies and data analytics, Chhajed has worked extensively with ELK stack implementations in real-world scenarios. Through this guide, the author offers clarity, detailed examples, and actionable insights for professionals seeking to improve their data systems.

Who is it for?
This book is targeted towards software developers, data analysts, and DevOps engineers seeking to harness the potential of the ELK stack for data analysis and logging. It is most suitable for intermediate-level professionals with basic knowledge of Unix or programming. If your aim is to gain insights and build metrics from diverse data formats utilizing open-source technologies, this book is crafted for you.
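
As an illustration of the final stage of such a pipeline, the minimal sketch below indexes a parsed log event into Elasticsearch over its HTTP API with Python's requests library; the index name, field names, and localhost endpoint are assumptions for the example, not taken from the book.

# Minimal sketch: index one parsed log event into Elasticsearch over HTTP.
# Assumes an Elasticsearch node on localhost:9200 and a hypothetical
# "weblogs-2024.01.01" index; adjust names for your own pipeline.
import requests

event = {
    "@timestamp": "2024-01-01T12:00:00Z",
    "client_ip": "203.0.113.7",
    "verb": "GET",
    "request": "/index.html",
    "response": 200,
}

resp = requests.post(
    "http://localhost:9200/weblogs-2024.01.01/_doc",
    json=event,
    timeout=10,
)
resp.raise_for_status()
print(resp.json()["_id"])  # Elasticsearch returns the generated document id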

Learning Couchbase

Embark on your journey to mastering Couchbase with this comprehensive guide designed for learners of all levels. By exploring the fundamentals of NoSQL databases and diving into Couchbase's functionality, you'll gain the skills to design, manage, and scale modern applications effectively. Learn practical solutions and techniques to leverage Couchbase as a powerful backend system.

What this Book will help me do
- Understand the core concepts of NoSQL databases and configure a Couchbase database system from scratch.
- Design efficient document data schemas and use Couchbase SDKs for high-performance application development.
- Explore the integration of Couchbase with Elasticsearch to implement robust full-text search capabilities.
- Master advanced Couchbase features like XDCR for disaster recovery and N1QL for SQL-like application queries.
- Develop and scale a real-world e-commerce application using Couchbase as the backend database system.

Author(s)
Henry Potsangbam is an experienced software developer and database specialist with a focus on scalable NoSQL solutions. He has worked extensively with Couchbase in developing real-world applications and is passionate about teaching others the intricacies of database systems. Henry's writing style makes advanced concepts accessible and practical for readers of all levels.

Who is it for?
This book is crafted for developers, database administrators, and IT professionals who want to learn NoSQL database basics and Couchbase's capabilities. Beginners with no prior experience in NoSQL will find step-by-step guidance, and experienced developers can expand their skill set to include Couchbase. A familiarity with Java programming will be helpful but is not mandatory.
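
To give a feel for the SQL-like N1QL queries mentioned above, here is a minimal sketch that sends a statement to Couchbase's Query service REST endpoint (default port 8093) using Python's requests library; the bucket name, credentials, and host are hypothetical, and the book itself works through the Couchbase SDKs in more depth.

# Minimal sketch: run a N1QL statement against Couchbase's Query REST service.
# Assumes a local cluster with the Query service on its default port 8093,
# a hypothetical "travel" bucket, and placeholder credentials.
import requests

statement = "SELECT name, country FROM `travel` WHERE type = 'airline' LIMIT 5"

resp = requests.post(
    "http://localhost:8093/query/service",
    json={"statement": statement},
    auth=("Administrator", "password"),  # placeholder credentials
    timeout=10,
)
resp.raise_for_status()
for row in resp.json().get("results", []):
    print(row)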

Elasticsearch in Action

Elasticsearch in Action teaches you how to build scalable search applications using Elasticsearch. You'll ramp up fast, with an informative overview and an engaging introductory example. Within the first few chapters, you'll pick up the core concepts you need to implement basic searches and efficient indexing. With the fundamentals well in hand, you'll go on to gain an organized view of how to optimize your design. Perfect for developers and administrators building and managing search-oriented applications.

About the Technology
Modern search seems like magic: you type a few words and the search engine appears to know what you want. With the Elasticsearch real-time search and analytics engine, you can give your users this magical experience without having to do complex low-level programming or understand advanced data science algorithms. You just install it, tweak it, and get on with your work.

About the Book
Elasticsearch in Action teaches you how to write applications that deliver professional quality search. As you read, you'll learn to add basic search features to any application, enhance search results with predictive analysis and relevancy ranking, and use saved data from prior searches to give users a custom experience. This practical book focuses on Elasticsearch's REST API via HTTP. Code snippets are written mostly in bash using cURL, so they're easily translatable to other languages.

What's Inside
- What is a great search application?
- Building scalable search solutions
- Using Elasticsearch with any language
- Configuration and tuning

About the Reader
This book is for developers and administrators building and managing search-oriented applications.

About the Authors
Radu Gheorghe is a search consultant and software engineer. Matthew Lee Hinman develops highly available, cloud-based systems. Roy Russo is a specialist in predictive analytics.

Quotes
"To understand how a modern search infrastructure works is a daunting task. Radu, Matt, and Roy make it an engaging, hands-on experience." - Sen Xu, Twitter Inc.
"An indispensable guide to the challenges of search of semi-structured data." - Artur Nowak, Evidence Prime
"The best resource for a complex topic. Highly recommended." - Daniel Beck, juris GmbH
"Took me from confused to confident in a week." - Alan McCann, Givsum.com
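
Since the book's examples target the REST API over HTTP, a Python translation of a basic full-text match query might look like the sketch below; the "books" index and "title" field are illustrative assumptions, not examples taken from the text.

# Minimal sketch: a basic full-text "match" query over Elasticsearch's REST API.
# Assumes a node on localhost:9200 and a hypothetical "books" index with a
# "title" field; equivalent to a one-line cURL request from the shell.
import requests

query = {"query": {"match": {"title": "elasticsearch in action"}}}

resp = requests.get("http://localhost:9200/books/_search", json=query, timeout=10)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))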

Kibana Essentials

Dive into "Kibana Essentials" and discover how to efficiently analyze data and create visually engaging visualizations and dashboards with Kibana. Whether you are new to Kibana or looking to enhance your skills, this book provides practical guidance to help you apply Kibana features to real-world scenarios. By the end, you'll have the skills to create and apply dashboards that run on Elasticsearch. What this Book will help me do Understand the core features and setup process of Kibana on both Windows and Ubuntu platforms. Master the Discover, Visualize, Dashboard, and Settings functionalities in Kibana. Utilize Elasticsearch's search capabilities to analyze data in Kibana. Create, customize, and share stunning visualizations and dashboards for various use cases. Gain advanced knowledge to tweak Kibana settings for optimized workflows. Author(s) None Gupta is an experienced author and data professional who has worked extensively with Kibana and Elasticsearch technologies. With a passion for simplifying complex concepts, None specializes in breaking down technical topics into digestible, actionable steps. Their practical approach ensures that learners can confidently apply knowledge immediately after reading. Who is it for? This book is for professionals or enthusiasts aiming to delve into data visualization using Kibana. Whether starting from scratch or familiar with similar tools, readers will find the foundational to advanced techniques invaluable. It's especially suited for those who want a practical, hands-on approach to mastering Kibana.

ElasticSearch Blueprints

Dive into search technology with "ElasticSearch Blueprints"! This is the perfect project-based guide to help you master Elasticsearch. You will learn how to build and design scalable, effective search solutions, improve search relevancy, manage data efficiently, perform analytics, and visualize your data in comprehensive ways.

What this Book will help me do
- Build and fine-tune scalable search engine features with Elasticsearch.
- Design and implement accurate ecommerce search solutions using filters.
- Analyze and visualize data with Elasticsearch's powerful data aggregation capabilities.
- Increase search relevancy and enhance user query assistance using analyzers.
- Incorporate enhanced data organization methods, including parent-child relationships.

Author(s)
Mohan is an experienced professional specializing in search technologies. With a strong technical background, they have engaged deeply with Elasticsearch, creating solutions that address practical challenges. Their approach focuses on making technical topics accessible, guiding readers step-by-step through projects.

Who is it for?
This book is tailored for data professionals, application developers, and enthusiasts eager to delve into search technologies. Whether you're beginning with Elasticsearch or aiming to refine your skills, this guide will advance your expertise. By working through practical cases, you'll gain confidence in using Elasticsearch effectively to meet diverse requirements.
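
For a flavor of the filtered ecommerce searches described above, here is a hedged sketch (the "products" index and its fields are invented for the example) that combines a full-text match with filter clauses and a terms aggregation for faceting.

# Minimal sketch: an ecommerce-style search combining full-text matching,
# filter clauses, and a terms aggregation for faceting. The "products" index
# and its fields are hypothetical.
import requests

body = {
    "query": {
        "bool": {
            "must": [{"match": {"name": "running shoes"}}],
            "filter": [
                {"term": {"in_stock": True}},
                {"range": {"price": {"lte": 100}}},
            ],
        }
    },
    "aggs": {"by_brand": {"terms": {"field": "brand"}}},
}

resp = requests.get("http://localhost:9200/products/_search", json=body, timeout=10)
resp.raise_for_status()
result = resp.json()
print([h["_source"]["name"] for h in result["hits"]["hits"]])
print([b["key"] for b in result["aggregations"]["by_brand"]["buckets"]])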

Statistical Learning with Sparsity

Discover New Methods for Dealing with High-Dimensional Data

A sparse statistical model has only a small number of nonzero parameters or weights; therefore, it is much easier to estimate and interpret than a dense model. Statistical Learning with Sparsity: The Lasso and Generalizations presents methods that exploit sparsity to help recover the underlying signal in a set of data. Top experts in this rapidly evolving field, the authors describe the lasso for linear regression and a simple coordinate descent algorithm for its computation. They discuss the application of ℓ1 penalties to generalized linear models and support vector machines, cover generalized penalties such as the elastic net and group lasso, and review numerical methods for optimization. They also present statistical inference methods for fitted (lasso) models, including the bootstrap, Bayesian methods, and recently developed approaches. In addition, the book examines matrix decomposition, sparse multivariate analysis, graphical models, and compressed sensing. It concludes with a survey of theoretical results for the lasso. In this age of big data, the number of features measured on a person or object can be large and might be larger than the number of observations. This book shows how the sparsity assumption allows us to tackle these problems and extract useful and reproducible patterns from big datasets. Data analysts, computer scientists, and theorists will appreciate this thorough and up-to-date treatment of sparse statistical modeling.
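
As a reference point, the lasso estimate described above solves an ℓ1-penalized least-squares problem. In a standard formulation consistent with the book's setting (N observations with predictors x_i, responses y_i, intercept β0, coefficient vector β of length p, and tuning parameter λ):

\hat{\beta} = \arg\min_{\beta_0,\,\beta} \left\{ \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^{\top}\beta \right)^2 + \lambda \lVert \beta \rVert_1 \right\},
\qquad \lVert \beta \rVert_1 = \sum_{j=1}^{p} \lvert \beta_j \rvert .

Larger values of λ shrink more coefficients exactly to zero, which is the sparsity the book exploits.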

Apache Solr Search Patterns

Master advanced Apache Solr techniques in this professional guide. This book dives deeply into deploying and optimizing Solr-powered search engines and explores high-performance techniques. Learn to leverage your data with accessible, comprehensive, and practical insights.

What this Book will help me do
- Learn to customize Solr's query scorer to provide tailored search results.
- Understand the internals of Solr, including indexing and query facilities, for better optimization.
- Implement scalable and reliable search clusters using SolrCloud.
- Explore the use of Solr for spatial, e-commerce, and advertising searches.
- Combine Solr with front-end technologies like AJAX and advanced tagging with FSTs.

Author(s)
Jayant Kumar, an experienced developer and search solutions architect, specializes in leveraging Apache Solr. With years of practical experience, he brings unique insights into scaling search platforms. His commitment to imparting clear, actionable knowledge is reflected in this focused resource.

Who is it for?
This book is ideal for software developers and architects embedded in the Solr ecosystem looking to enhance their expertise. If you are seeking to develop advanced and scalable solutions, master Solr's core capabilities, or improve your analytics and graph-generating skills, this book will support your goals.
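
As a rough illustration of the faceted queries the book optimizes (the "products" core and field names are assumptions, not from the book), the sketch below hits Solr's standard /select handler from Python.

# Minimal sketch: a faceted query against Solr's standard /select handler.
# Assumes Solr on localhost:8983 with a hypothetical "products" core.
import requests

params = {
    "q": "name:laptop",          # main query
    "fq": "in_stock:true",       # filter query, cached separately from q
    "facet": "true",
    "facet.field": "brand",      # facet counts by brand
    "rows": 5,
    "wt": "json",
}

resp = requests.get("http://localhost:8983/solr/products/select", params=params, timeout=10)
resp.raise_for_status()
data = resp.json()
print(data["response"]["numFound"], "matches")
print(data["facet_counts"]["facet_fields"]["brand"])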

Storm Applied

Storm Applied is a practical guide to using Apache Storm for the real-world tasks associated with processing and analyzing real-time data streams. This immediately useful book starts by building a solid foundation of Storm essentials so that you learn how to think about designing Storm solutions the right way from day one. But it quickly dives into real-world case studies that will bring the novice up to speed with productionizing Storm.

About the Technology
It's hard to make sense out of data when it's coming at you fast. Like Hadoop, Storm processes large amounts of data but it does it reliably and in real time, guaranteeing that every message will be processed. Storm allows you to scale with your data as it grows, making it an excellent platform to solve your big data problems.

About the Book
Storm Applied is an example-driven guide to processing and analyzing real-time data streams. This immediately useful book starts by teaching you how to design Storm solutions the right way. Then, it quickly dives into real-world case studies that show you how to scale a high-throughput stream processor, ensure smooth operation within a production cluster, and more. Along the way, you'll learn to use Trident for stateful stream processing, along with other tools from the Storm ecosystem.

What's Inside
- Mapping real problems to Storm components
- Performance tuning and scaling
- Practical troubleshooting and debugging
- Exactly-once processing with Trident

About the Reader
This book moves through the basics quickly. While prior experience with Storm is not assumed, some experience with big data and real-time systems is helpful.

About the Authors
Sean Allen, Matthew Jankowski, and Peter Pathirana lead the development team for a high-volume, search-intensive commercial web application at TheLadders.

Quotes
"Will no doubt become the definitive practitioner’s guide for Storm users." - From the Foreword by Andrew Montalenti
"The book’s practical approach to Storm will save you a lot of hassle and a lot of time." - Tanguy Leroux, Elasticsearch
"Great introduction to distributed computing with lots of real-world examples." - Shay Elkin, Tangent Logic
"Go beyond the MapReduce way of thinking to solve big data problems." - Muthusamy Manigandan, OzoneMedia

Mastering Elasticsearch - Second Edition

Delve deeper into Elasticsearch in "Mastering Elasticsearch - Second Edition" to gain comprehensive insights into advanced querying, data indexing, and internal workings of Elasticsearch servers. With this book, you'll enhance your ability to implement powerful search solutions and optimize performance with confidence.

What this Book will help me do
- Build advanced querying skills to utilize the Elasticsearch Query DSL effectively.
- Gain hands-on understanding of optimal data indexing for your Elasticsearch applications.
- Learn to improve user search experiences by tailoring Elasticsearch functionalities.
- Master Elasticsearch performance tuning and server optimization techniques.
- Develop custom Elasticsearch plugins to expand its core capabilities.

Author(s)
Marek Rogozinski, a seasoned Elasticsearch developer, brings years of professional expertise to this comprehensive guide. With a focus on practical and actionable knowledge, Marek has crafted this edition for users eager to deepen their Elasticsearch proficiency. His hands-on approach ensures you can apply the lessons directly and effectively.

Who is it for?
Ideal readers are those experienced with Elasticsearch, familiar with Query DSL and indexing techniques, and looking to expand their technical capabilities. Whether you're an Elasticsearch administrator, developer, or enthusiast, this book will enable you to master advanced topics and achieve your goals in search technology.

Apache Flume: Distributed Log Collection for Hadoop - Second Edition

"Apache Flume: Distributed Log Collection for Hadoop - Second Edition" is your hands-on guide to learning how to use Apache Flume to reliably collect and move logs and data streams into your Hadoop ecosystem. Through practical examples and real-world scenarios, this book will help you master the setup, configuration, and optimization of Flume for various data ingestion use cases. What this Book will help me do Understand the key concepts and architecture behind Apache Flume to build reliable and scalable data ingestion systems. Set up Flume agents to collect and transfer data into the Hadoop File System (HDFS) or other storage solutions effectively. Learn stream data processing techniques, such as filtering, transforming, and enriching data during transit to improve data usability. Integrate Flume with other tools like Elasticsearch and Solr to enhance analytics and search capabilities. Implement monitoring and troubleshooting workflows to maintain healthy and optimized Flume data pipelines. Author(s) Steven Hoffman, a seasoned software developer and data engineer, brings years of practical experience working with big data technologies to this book. He has a strong background in distributed systems and big data solutions, having implemented enterprise-scale analytics projects. Through clear and approachable writing, he aims to empower readers to successfully deploy reliable data pipelines using Apache Flume. Who is it for? This book is written for Hadoop developers, data engineers, and IT professionals who seek to build robust pipelines for streaming data into Hadoop environments. It is ideal for readers who have a basic understanding of Hadoop and HDFS but are new to Apache Flume. If you are looking to enhance your analytics capabilities by efficiently ingesting, routing, and processing streaming data, this book is for you. Beginners as well as experienced engineers looking to dive deeper into Flume will find it insightful.

ElasticSearch Cookbook - Second Edition

The "ElasticSearch Cookbook - Second Edition" is a hands-on guide featuring over 130 advanced recipes to help you harness the power of ElasticSearch, a leading search and analytics engine. Through insightful examples and practical guidance, you'll learn to implement efficient search solutions, optimize queries, and manage ElasticSearch clusters effectively. What this Book will help me do Design and configure ElasticSearch topologies optimized for your specific deployment needs. Develop and utilize custom mappings to optimize your data indexes. Execute advanced queries and filters to refine and retrieve search results effectively. Set up and monitor ElasticSearch clusters for optimal performance. Extend ElasticSearch capabilities through plugin development and integrations using Java and Python. Author(s) Alberto Paro is a technology expert with years of experience working with ElasticSearch, Big Data solutions, and scalable cloud architecture. He has authored multiple books and technical articles on ElasticSearch, leveraging his extensive knowledge to provide practical insights. His approachable and detail-oriented style makes complex concepts accessible to technical professionals. Who is it for? This book is best suited for software developers and IT professionals looking to use ElasticSearch in their projects. Readers should be familiar with JSON, as well as basic programming skills in Java. It is ideal for those who have an understanding of search applications and want to deepen their expertise. Whether you're integrating ElasticSearch into a web application or optimizing your system's search capabilities, this book will provide the skills and knowledge you need.

Elasticsearch: The Definitive Guide

Whether you need full-text search or real-time analytics of structured data—or both—the Elasticsearch distributed search engine is an ideal way to put your data to work. This practical guide not only shows you how to search, analyze, and explore data with Elasticsearch, but also helps you deal with the complexities of human language, geolocation, and relationships.

Using Flume

How can you get your data from frontend servers to Hadoop in near real time? With this complete reference guide, you’ll learn Flume’s rich set of features for collecting, aggregating, and writing large amounts of streaming data to the Hadoop Distributed File System (HDFS), Apache HBase, SolrCloud, Elasticsearch, and other systems. Using Flume shows operations engineers how to configure, deploy, and monitor a Flume cluster, and teaches developers how to write Flume plugins and custom components for their specific use-cases. You’ll learn about Flume’s design and implementation, as well as various features that make it highly scalable, flexible, and reliable. Code examples and exercises are available on GitHub.

- Learn how Flume provides a steady rate of flow by acting as a buffer between data producers and consumers
- Dive into key Flume components, including sources that accept data and sinks that write and deliver it
- Write custom plugins to customize the way Flume receives, modifies, formats, and writes data
- Explore APIs for sending data to Flume agents from your own applications
- Plan and deploy Flume in a scalable and flexible way—and monitor your cluster once it’s running
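
One way applications push events to a Flume agent is through an HTTP source with the default JSON handler; the sketch below assumes such a source has already been configured on a hypothetical port 44444 (the agent configuration itself is not shown, and the port is an assumption for the example).

# Minimal sketch: send events to a Flume agent's HTTP source (JSONHandler).
# Assumes an agent with an HTTP source listening on a hypothetical port 44444;
# the default handler expects a JSON array of events with "headers" and "body".
import requests

events = [
    {"headers": {"host": "web01", "facility": "app"}, "body": "user login ok"},
    {"headers": {"host": "web01", "facility": "app"}, "body": "cart checkout"},
]

resp = requests.post("http://localhost:44444", json=events, timeout=10)
resp.raise_for_status()  # a 200 response means the events were accepted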

Programming Elastic MapReduce

Although you don’t need a large computing infrastructure to process massive amounts of data with Apache Hadoop, it can still be difficult to get started. This practical guide shows you how to quickly launch data analysis projects in the cloud by using Amazon Elastic MapReduce (EMR), the hosted Hadoop framework in Amazon Web Services (AWS). Authors Kevin Schmidt and Christopher Phillips demonstrate best practices for using EMR and various AWS and Apache technologies by walking you through the construction of a sample MapReduce log analysis application. Using code samples and example configurations, you’ll learn how to assemble the building blocks necessary to solve your biggest data analysis problems.

- Get an overview of the AWS and Apache software tools used in large-scale data analysis
- Go through the process of executing a Job Flow with a simple log analyzer
- Discover useful MapReduce patterns for filtering and analyzing data sets
- Use Apache Hive and Pig instead of Java to build a MapReduce Job Flow
- Learn the basics for using Amazon EMR to run machine learning algorithms
- Develop a project cost model for using Amazon EMR and other AWS tools
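
As a rough sketch of launching a Job Flow programmatically (using today's boto3 SDK rather than the tooling the book walks through; the cluster name, S3 log bucket, release label, and step script are all placeholders):

# Rough sketch: launch an EMR cluster ("Job Flow") with one Hive step via boto3.
# All names, the log bucket, and the script location are placeholders; IAM
# roles, release label, and pricing should be checked against your own account.
import boto3

emr = boto3.client("emr", region_name="us-east-1")

response = emr.run_job_flow(
    Name="log-analysis-example",
    LogUri="s3://my-example-bucket/emr-logs/",      # placeholder bucket
    ReleaseLabel="emr-6.15.0",                      # placeholder release
    Applications=[{"Name": "Hive"}],
    Instances={
        "MasterInstanceType": "m5.xlarge",
        "SlaveInstanceType": "m5.xlarge",
        "InstanceCount": 3,
        "KeepJobFlowAliveWhenNoSteps": False,
    },
    Steps=[{
        "Name": "hive-log-analyzer",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["hive-script", "--run-hive-script",
                     "--args", "-f", "s3://my-example-bucket/scripts/analyze.q"],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])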

R for Everyone: Advanced Analytics and Graphics

Statistical Computation for Programmers, Scientists, Quants, Excel Users, and Other Professionals

Using the open source R language, you can build powerful statistical models to answer many of your most challenging questions. R has traditionally been difficult for non-statisticians to learn, and most R books assume far too much knowledge to be of help. R for Everyone is the solution. Drawing on his unsurpassed experience teaching new users, professional data scientist Jared P. Lander has written the perfect tutorial for anyone new to statistical programming and modeling. Organized to make learning easy and intuitive, this guide focuses on the 20 percent of R functionality you’ll need to accomplish 80 percent of modern data tasks. Lander’s self-contained chapters start with the absolute basics, offering extensive hands-on practice and sample code. You’ll download and install R; navigate and use the R environment; master basic program control, data import, and manipulation; and walk through several essential tests. Then, building on this foundation, you’ll construct several complete models, both linear and nonlinear, and use some data mining techniques. By the time you’re done, you won’t just know how to write R programs, you’ll be ready to tackle the statistical problems you care about most.

COVERAGE INCLUDES
• Exploring R, RStudio, and R packages
• Using R for math: variable types, vectors, calling functions, and more
• Exploiting data structures, including data.frames, matrices, and lists
• Creating attractive, intuitive statistical graphics
• Writing user-defined functions
• Controlling program flow with if, ifelse, and complex checks
• Improving program efficiency with group manipulations
• Combining and reshaping multiple datasets
• Manipulating strings using R’s facilities and regular expressions
• Creating normal, binomial, and Poisson probability distributions
• Programming basic statistics: mean, standard deviation, and t-tests
• Building linear, generalized linear, and nonlinear models
• Assessing the quality of models and variable selection
• Preventing overfitting, using the Elastic Net and Bayesian methods
• Analyzing univariate and multivariate time series data
• Grouping data via K-means and hierarchical clustering
• Preparing reports, slideshows, and web pages with knitr
• Building reusable R packages with devtools and Rcpp
• Getting involved with the R global community

Advanced Tuning for JD Edwards EnterpriseOne Implementations

Best Practices for JD Edwards EnterpriseOne Tuning and Optimization

Achieve peak performance from your ERP platform while minimizing downtime and lowering TCO. Advanced Tuning for JD Edwards EnterpriseOne Implementations shows how to plan and adopt a structured, top-to-bottom maintenance methodology. Uncover and eliminate bottlenecks, maximize efficiency at every component layer, troubleshoot databases and web servers, automate system testing, and handle mobile issues. This Oracle Press guide offers complete coverage of the latest cloud, clustering, load balancing, and virtualization solutions.

- Understand the components of a structured tuning plan
- Establish benchmarks and implement key industry practices
- Perform changes and accurately measure system-wide impact
- Diagnose and repair HTTP, web application, and Java issues
- Troubleshoot Oracle Database connections and transactions
- Streamline Oracle’s JD Edwards EnterpriseOne kernel and JDENeT processes
- Configure, test, and manage virtual machines and servers
- Work with Oracle Exadata Database Machine and Oracle Exalogic Elastic Cloud

Hadoop Beginner's Guide

Hadoop Beginner's Guide introduces you to the essential concepts and practical applications of Apache Hadoop, one of the leading frameworks for big data processing. You will learn how to set up and use Hadoop to store, manage, and analyze vast amounts of data efficiently. With clear examples and step-by-step instructions, this book is the perfect starting point for beginners.

What this Book will help me do
- Understand the trends leading to the adoption of Hadoop and determine when to use it effectively in your projects.
- Build and configure Hadoop clusters tailored to your specific needs, enabling efficient data processing.
- Develop and execute applications on Hadoop using Java and Ruby, with practical examples provided.
- Leverage Amazon AWS and Elastic MapReduce to deploy Hadoop on the cloud and manage hosted environments.
- Integrate Hadoop with relational databases using tools like Hive and Sqoop for effective data transfer and querying.

Author(s)
The author of Hadoop Beginner's Guide is an experienced data engineer with a focus on big data technologies. They have extensive experience deploying Hadoop in various industries and are passionate about making complex systems accessible to newcomers. Their approach combines technical depth with an understanding of the needs of learners, ensuring clarity and relevance throughout the book.

Who is it for?
This book is designed for professionals who are new to big data processing and want to learn Apache Hadoop from scratch. It is ideal for system administrators, data analysts, and developers with basic programming knowledge in Java or Ruby looking to get started with Hadoop. If you have an interest in leveraging Hadoop for scalable data management and analytics, this book is for you. By the end, you'll gain the confidence and skills to utilize Hadoop effectively in your projects.
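
Because the book's application examples go beyond Java, here is an independent sketch of the classic word-count job expressed as a Hadoop Streaming mapper/reducer pair in Python; the file name and invocation are illustrative, not taken from the book.

# Minimal sketch: word count as a Hadoop Streaming mapper/reducer pair.
# Run the same file in the mapper and reducer slots of the hadoop-streaming
# jar (illustrative setup); Hadoop sorts mapper output by key between stages.
import sys

def mapper():
    for line in sys.stdin:
        for word in line.split():
            print(f"{word.lower()}\t1")

def reducer():
    current, count = None, 0
    for line in sys.stdin:
        word, _, value = line.rstrip("\n").partition("\t")
        if word != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = word, 0
        count += int(value)
    if current is not None:
        print(f"{current}\t{count}")

if __name__ == "__main__":
    # e.g. `python wordcount.py map` as the mapper, `python wordcount.py reduce` as the reducer
    mapper() if sys.argv[1:] == ["map"] else reducer()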

ElasticSearch Server

ElasticSearch Server is an excellent resource for mastering the ElasticSearch open-source search engine. This book takes you through practical steps to implement, configure, and optimize search capabilities, suitable for various data sets and applications, making faster and more accurate search outcomes accessible.

What this Book will help me do
- Understand the core concepts of ElasticSearch, including data indexing, dynamic mapping, and search analysis.
- Develop practical skills in writing queries and filters to retrieve precise and relevant results.
- Learn to set up and efficiently manage ElasticSearch clusters for scalability and real-time performance.
- Implement advanced ElasticSearch functions like autocompletion, faceting, and geo-search.
- Utilize optimization techniques for cluster monitoring, health-checks, and tuning for reliable performance.

Author(s)
The authors of ElasticSearch Server are industry professionals with extensive experience in search technologies and system architecture. They have contributed to multiple tools and publications in the field of data search and analytics. Their writing aims to distill complex technical concepts into practical knowledge, making it valuable for readers from all backgrounds.

Who is it for?
This book is perfect for developers, system architects, and IT professionals seeking a robust and scalable search solution for their projects. Whether you're new to ElasticSearch or looking to deepen your expertise, this book will serve as a practical guide to implement ElasticSearch effectively. The only prerequisites are a basic understanding of databases and general query concepts, so prior search server knowledge is not required.

Resilience and Reliability on AWS

Cloud services are just as susceptible to network outages as any other platform. This concise book shows you how to prepare for potentially devastating interruptions by building your own resilient and reliable applications in the public cloud. Guided by engineers from 9apps—an independent provider of Amazon Web Services and Eucalyptus cloud solutions—you’ll learn how to combine AWS with open source tools such as PostgreSQL, MongoDB, and Redis. This isn’t a book on theory. With detailed examples, sample scripts, and solid advice, software engineers with operations experience will learn specific techniques that 9apps routinely uses in its cloud infrastructures.

- Build cloud applications with the "rip, mix, and burn" approach
- Get a crash course on Amazon Web Services
- Learn the top ten tips for surviving outages in the cloud
- Use Elasticsearch to build a dependable NoSQL data store
- Combine AWS and PostgreSQL to build an RDBMS that scales well
- Create a highly available document database with MongoDB Replica Set and SimpleDB
- Augment Redis with AWS to provide backup/restore, failover, and monitoring capabilities
- Work with CloudFront and Route 53 to safeguard global content delivery

Programming Hive

Need to move a relational database application to Hadoop? This comprehensive guide introduces you to Apache Hive, Hadoop’s data warehouse infrastructure. You’ll quickly learn how to use Hive’s SQL dialect—HiveQL—to summarize, query, and analyze large datasets stored in Hadoop’s distributed filesystem. This example-driven guide shows you how to set up and configure Hive in your environment, provides a detailed overview of Hadoop and MapReduce, and demonstrates how Hive works within the Hadoop ecosystem. You’ll also find real-world case studies that describe how companies have used Hive to solve unique problems involving petabytes of data.

- Use Hive to create, alter, and drop databases, tables, views, functions, and indexes
- Customize data formats and storage options, from files to external databases
- Load and extract data from tables—and use queries, grouping, filtering, joining, and other conventional query methods
- Gain best practices for creating user defined functions (UDFs)
- Learn Hive patterns you should use and anti-patterns you should avoid
- Integrate Hive with other data processing programs
- Use storage handlers for NoSQL databases and other datastores
- Learn the pros and cons of running Hive on Amazon’s Elastic MapReduce
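
To show the flavor of HiveQL described above, here is a small sketch that defines an external table over delimited log files and runs a grouped query, driven from Python through the Hive CLI's -e flag; the table name, HDFS path, and columns are invented for the example.

# Minimal sketch: create an external Hive table over delimited log files and
# run a grouped HiveQL query via the `hive -e` command-line flag. Table name,
# HDFS path, and columns are hypothetical.
import subprocess

hiveql = """
CREATE EXTERNAL TABLE IF NOT EXISTS web_logs (
  ts STRING,
  client_ip STRING,
  status INT,
  bytes BIGINT
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'
LOCATION '/data/web_logs';

SELECT status, COUNT(*) AS hits
FROM web_logs
GROUP BY status
ORDER BY hits DESC;
"""

subprocess.run(["hive", "-e", hiveql], check=True)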