Brian O’Neill

Activities

2 talks

Podcast host Designing for Analytics

Frequent Collaborators

Nadiem von Heydebrand, Mindfuel (2)
Zalak Trivedi, Sigma Computing (2)


Talks & appearances

102 activities · Newest first


Machine Learning with Spark - Second Edition

2017-04-28 · O'Reilly Data Engineering Books

book

with Rajdeep Dua, Brian O’Neill (Designing for Analytics), Manpreet Singh Ghotra, Stephen Boesch, Nick Pentreath

data data-engineering apache-spark AI/ML Big Data Python

Dive into the world of distributed machine learning with Apache Spark, a powerful framework for handling, processing, and analyzing big data. This book takes you through implementing popular machine learning algorithms using Spark ML, covering end-to-end workflows such as data preparation, model building, predictive analysis, and text processing.

What this Book will help me do

Learn to implement scalable machine learning solutions using Spark ML.
Develop the skills to set up and configure Apache Spark environments.
Master the application of machine learning techniques like clustering, classification, and regression with Spark.
Efficiently handle and process large-scale datasets using Spark tools.
Put Spark's capabilities to work in building real-world distributed data processing solutions.

Author(s)

Rajdeep Dua and Manpreet Singh Ghotra bring a wealth of experience in big data and machine learning to this book. They have been involved in building scalable data systems and implementing machine learning solutions in various industry scenarios. Their approach is hands-on and focused on teaching practical, actionable knowledge.

Who is it for?

This book is for data enthusiasts, data engineers, and machine learning practitioners who are familiar with Python and Scala and eager to apply machine learning concepts in distributed environments. It is aimed at professionals looking to develop their skills in building scalable data systems and implementing advanced machine learning workflows in Spark.

Storm Blueprints: Patterns for Distributed Real-time Computation

2014-03-26 · O'Reilly Data Engineering Books

book

data data-engineering streaming-messaging storm

"Storm Blueprints: Patterns for Distributed Real-time Computation" takes you on a hands-on journey into understanding and implementing distributed real-time processing with Apache Storm. Through real-world examples and projects, you'll gain a sound understanding of the fundamentals and learn to design systems capable of resilient, scalable, and fast computation.

What this Book will help me do

Understand the essentials of Apache Storm and its architecture.
Learn to deploy and manage Storm in different modes, including distributed clusters.
Discover design patterns for real-time data flow in distributed systems.
Master the implementation of fault tolerance and continuous availability in processing.
Analyze system performance insights through practical integrations and use cases.

Author(s)

The author(s) of 'Storm Blueprints' bring extensive experience in distributed systems engineering and real-time computations. Their passion for sharing knowledge is evident in this approachable yet comprehensive book. With years of practical experience, they offer insights and proven techniques to empower readers to build practical distributed systems.

Who is it for?

This book is designed for software engineers and developers working on data pipelines and real-time processing systems. Beginners to Storm will find it an excellent introduction, while those with experience will appreciate the advanced design patterns and use cases. If you aim to leverage Storm effectively in distributed architectures, this guide is tailored for you.
