Manish Kumar

Activities

2

talks

author

Frequent Collaborators

Chanchal Singh 2

Talks & appearances

Mastering Hadoop 3

2019-02-28 · O'Reilly Data Engineering Books

book

with Timothy Wong , Chanchal Singh , Manish Kumar

data data-engineering Hadoop Flink Big Data Data Engineering

"Mastering Hadoop 3" is your in-depth guide to understanding and mastering the advanced features of the Hadoop ecosystem. With a focus on distributed computing and data processing, this book covers essential tools such as YARN, MapReduce, and Apache Spark to help you build scalable, efficient data pipelines. What this Book will help me do Gain a comprehensive understanding of Hadoop Distributed File System (HDFS) and YARN for effective resource management. Master data processing with MapReduce and learn to integrate with real-time processing engines like Spark and Flink. Develop and secure enterprise-grade Hadoop-based data pipelines by implementing robust security and governance measures. Explore techniques for batch data processing, data modeling, and designing applications tailored for Hadoop environments. Understand best practices for optimizing and troubleshooting Hadoop clusters for enhanced performance and reliability. Author(s) The authors, including None Wong, None Singh, and None Kumar, bring together years of experience in big data engineering, distributed systems, and enterprise application development. They aim to provide a clear pathway to mastering Hadoop ecosystem tools. Who is it for? This book is ideal for budding big data professionals who have some familiarity with Java and basic Hadoop concepts and wish to elevate their expertise. If you're a Hadoop career practitioner keen to expand your understanding of the ecosystem's advanced capabilities or a professional looking to implement Hadoop in organizational workflows, this book is well-suited for you.

Building Data Streaming Applications with Apache Kafka

2017-08-18 · O'Reilly Data Engineering Books

book

with Manisha Sethi , Anshul Joshi , Chanchal Singh , Manish Kumar

data data-engineering streaming-messaging Kafka Data Engineering Java

Learn how to design and build efficient real-time streaming applications using Apache Kafka, a leading distributed streaming platform. This book provides comprehensive guidance on setting up Kafka clusters, developing producers and consumers, and integrating with frameworks like Spark, Storm, and Heron. By the end, you'll have the skills needed to create enterprise-grade data streaming solutions.

What this Book will help me do

Grasp the core concepts and components of Apache Kafka and its ecosystem.
Develop robust Kafka producers and consumers to process real-time data streams.
Design and implement streaming applications using Spark, Storm, and Heron.
Plan Kafka deployments with a focus on scalability, capacity, and fault tolerance.
Ensure secure data streaming with best practices for securing Apache Kafka.

Author(s)

The authors, Chanchal Singh and Manish Kumar, bring years of expertise in data engineering and distributed systems. Having worked extensively with streaming technologies like Apache Kafka, they aim to share their in-depth knowledge through practical examples and real-world scenarios. Their approach to teaching focuses on making complex concepts easily understandable.

Who is it for?

This book is ideal for software developers and data engineers who are eager to learn Apache Kafka for building streaming applications. Some experience with programming, particularly Java, will help readers get the most out of the material. If you are working on data-processing systems or looking to enhance your skills in real-time data handling, this book caters to your needs.