talk-data.com

Topic

Hadoop

Apache Hadoop

big_data distributed_computing data_processing

165 tagged

Activity Trend: 3 peak/qtr, 2020-Q1 to 2026-Q1

Activities

Filtered by: O'Reilly Data Engineering Books

Lucene in Action, Second Edition

When Lucene first appeared, this superfast search engine was nothing short of amazing. Today, Lucene still delivers. Its high-performance, easy-to-use API, features like numeric fields, payloads, near-real-time search, and huge increases in indexing and searching speed make it the leading search tool. And with clear writing, reusable examples, and unmatched advice, Lucene in Action, Second Edition is still the definitive guide to effectively integrating search into your applications.

This totally revised book shows you how to index your documents, including formats such as MS Word, PDF, HTML, and XML. It introduces you to searching, sorting, and filtering, and covers the numerous improvements to Lucene since the first edition. Source code is for Lucene 3.0.1.

What's Inside
Performing hot backups
Using numeric fields
Tuning for indexing or searching speed
Boosting matches with payloads
Creating reusable analyzers
Adding concurrency with threads
Four new case studies
Much more!

About the Authors
Michael McCandless is a Lucene PMC member and committer with more than a decade of experience building search engines. Erik Hatcher and Otis Gospodnetić are the authors of the first edition of Lucene in Action and long-time contributors to Lucene, Solr, Mahout, and other Lucene-based projects.

Quotes
"... brings you up to speed." - Doug Cutting, Founder of Lucene, Nutch, and Hadoop
"This new edition has it all." - Chad Davis, Blackdog Software, Author of Struts 2 in Action
"Very readable, full of expert tips." - Rick Wagner, Acxiom Corp.
"Elegant, and easy to read - just like Lucene itself." - Shai Erera, IBM Haifa Research Labs
"For a Lucene developer, it's required reading." - Stuart Caborn, Thoughtworks

How to Evaluate the Job You’ve Been Offered

This Element is an excerpt from Rebound: A Proven Plan for Starting Over After Job Loss (ISBN: 9780137021147) by Martha I. Finney. Available in print and digital formats. Now that they’ve offered a job, should you take it? Analyze prospective employers rationally and make decisions you won’t regret! Setting aside money for just a moment, so much more goes into deciding whether a potential employer is right for you. You need to know whether the company is a good fit, a reasonably logical choice on your professional progression--not just an invitation to be unemployed again....

Trends Are an Investor’s Best Friend

This Element is an excerpt from The ETF Trend Following Playbook: Profiting from Trends in Bull or Bear Markets with Exchange Traded Funds (ISBN: 9780137029013) by Tom Lydon. Available in print and digital formats. Simple calculations that spot powerful market trends early, so there’s time to cash in on them! Of all the things you can teach yourself to become a better investor, the best is to learn how to identify trends. You probably do it now, to a degree. But by the time news of a trend spreads to the point where it’s cocktail-party fodder, the bulk of the profits have been made. Instead, you need to learn to spot trends as early as possible, to enjoy the longest ride possible.

Hadoop: The Definitive Guide

Hadoop: The Definitive Guide helps you harness the power of your data. Ideal for processing large datasets, the Apache Hadoop framework is an open source implementation of the MapReduce algorithm on which Google built its empire. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems: programmers will find details for analyzing large datasets, and administrators will learn how to set up and run Hadoop clusters.

Complete with case studies that illustrate how Hadoop solves specific problems, this book helps you:
Use the Hadoop Distributed File System (HDFS) for storing large datasets, and run distributed computations over those datasets using MapReduce
Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence
Discover common pitfalls and advanced features for writing real-world MapReduce programs
Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
Use Pig, a high-level query language for large-scale data processing
Take advantage of HBase, Hadoop's database for structured and semi-structured data
Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems

If you have lots of data -- whether it's gigabytes or petabytes -- Hadoop is the perfect solution. Hadoop: The Definitive Guide is the most thorough book available on the subject.

"Now you have the opportunity to learn about Hadoop from a master, not only of the technology, but also of common sense and plain talk." -- Doug Cutting, Hadoop Founder, Yahoo!