talk-data.com

Topic: TensorFlow

Tags: machine_learning, deep_learning, neural_networks

8 tagged activities

Activity Trend: peak of 10 activities per quarter, 2020-Q1 to 2026-Q1

Activities

Showing filtered results

Filtering by: O'Reilly Data Engineering Books

Scaling Machine Learning with Spark

Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals, allowing data and ML practitioners to collaborate and understand each other better. Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology. You will:

- Explore machine learning, including distributed computing concepts and terminology
- Manage the ML lifecycle with MLflow
- Ingest data and perform basic preprocessing with Spark
- Explore feature engineering, and use Spark to extract features
- Train a model with MLlib and build a pipeline to reproduce it
- Build a data system to combine the power of Spark with deep learning
- Get a step-by-step example of working with distributed TensorFlow
- Use PyTorch to scale machine learning and understand its internal architecture
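As a taste of the MLflow-managed MLlib workflow the list above mentions, here is a minimal sketch (not taken from the book); the input path, column names, and hyperparameter value are hypothetical placeholders:

```python
import mlflow
import mlflow.spark
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.regression import LinearRegression
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mllib-mlflow-sketch").getOrCreate()
# Hypothetical input with feature columns f1, f2 and a numeric label.
df = spark.read.parquet("features.parquet")

assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
lr = LinearRegression(featuresCol="features", labelCol="label", regParam=0.1)
pipeline = Pipeline(stages=[assembler, lr])

with mlflow.start_run():
    mlflow.log_param("regParam", 0.1)       # record the hyperparameter
    model = pipeline.fit(df)                # distributed fit on the cluster
    mlflow.spark.log_model(model, "model")  # log the fitted pipeline as an artifact
```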

Optimized Inferencing and Integration with AI on IBM zSystems: Introduction, Methodology, and Use Cases

In today's fast-paced, ever-growing digital world, you face various new and complex business problems. To help resolve these problems, enterprises are embedding artificial intelligence (AI) into their mission-critical business processes and applications to help improve operations, optimize performance, personalize the user experience, and differentiate themselves from the competition. Furthermore, the use of AI on the IBM® zSystems platform, where your mission-critical transactions, data, and applications are installed, is a key aspect of modernizing business-critical applications while maintaining strict service-level agreements (SLAs) and security requirements. This colocation of data and AI empowers your enterprise to optimally and easily deploy and infuse AI capabilities into your enterprise workloads with the most recent and relevant data available in real time, which enables a more transparent, accurate, and dependable AI experience.

This IBM Redpaper publication introduces and explains AI technologies and hardware optimizations, such as the IBM zSystems Integrated Accelerator for AI, and demonstrates how to leverage certain capabilities and components to enable solutions in business-critical use cases, such as fraud detection and credit risk scoring, on the platform. Real-time inferencing with AI models, a capability that is critical to certain industries and use cases such as fraud detection, can now be implemented with optimized performance thanks to innovations like the IBM zSystems Integrated Accelerator for AI embedded in the Telum chip within IBM z16™.

This publication also describes and demonstrates the implementation and integration of the two end-to-end solutions (fraud detection and credit risk), from developing and training the AI models, to deploying the models in an IBM z/OS® V2R5 environment on IBM z16 hardware, to integrating AI functions into an application, for example an IBM z/OS Customer Information Control System (IBM CICS®) application. We describe performance optimization recommendations and considerations for leveraging AI technology on the IBM zSystems platform, including optimizations for micro-batching in IBM Watson® Machine Learning for z/OS (WMLz).

The benefits that are derived from the solutions are also described in detail, including how the open-source AI framework portability of the IBM zSystems platform enables model development and training to be done anywhere, including on IBM zSystems, and the ability to easily integrate and deploy on IBM zSystems for optimal inferencing. You can uncover insights at the transaction level while taking advantage of the speed, depth, and securability of the platform.

This publication is intended for technical specialists, site reliability engineers, architects, system programmers, and systems engineers. Technologies that are covered include TensorFlow Serving, WMLz, IBM Cloud Pak® for Data (CP4D), IBM z/OS Container Extensions (zCX), IBM Customer Information Control System (IBM CICS), Open Neural Network Exchange (ONNX), and IBM Deep Learning Compiler (zDLC).
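For a concrete sense of the real-time inferencing pattern described above, the sketch below calls a TensorFlow Serving REST endpoint (such as one hosted in zCX) with one transaction's features; the host, port, model name, and feature vector are hypothetical:

```python
import json
import urllib.request

# Hypothetical endpoint for a fraud-detection model served by TensorFlow Serving.
url = "http://zcx-host:8501/v1/models/fraud_model:predict"
payload = {"instances": [[0.12, 0.80, 1.0, 0.05]]}  # one transaction's features

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    predictions = json.load(resp)["predictions"]
print(predictions)  # e.g. a fraud score per instance
```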

Distributed Data Systems with Azure Databricks

In 'Distributed Data Systems with Azure Databricks', you will explore the capabilities of Microsoft Azure Databricks as a platform for building and managing big data pipelines. Learn how to process, transform, and analyze data at scale while developing expertise in training distributed machine learning models and integrating them into enterprise workflows.

What this Book will help me do:

- Design and implement Extract, Transform, Load (ETL) pipelines using Azure Databricks.
- Conduct distributed training of machine learning models using TensorFlow and Horovod (see the sketch after this description).
- Integrate Azure Databricks with Azure Data Factory for optimized data pipeline orchestration.
- Utilize Delta Engine for efficient querying and analysis of data within Delta Lake.
- Employ Databricks Structured Streaming to manage real-time production-grade data flows.

Author(s): Alan Bernardo Palacio is an experienced data engineer and cloud computing specialist, with extensive knowledge of the Microsoft Azure platform. With years of practical application of Databricks in enterprise settings, Palacio provides clear, actionable insights through relatable examples. He brings a passion for innovative solutions to the field of big data automation.

Who is it for? This book is ideal for data engineers, machine learning engineers, and software developers looking to master Azure Databricks for large-scale data processing and analysis. Readers should have basic familiarity with cloud platforms, an understanding of data pipelines, and a foundational grasp of Python and machine learning concepts. It is perfect for those wanting to create scalable and manageable data workflows.
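A minimal sketch, assuming the Databricks ML runtime, of the Horovod + TensorFlow training pattern mentioned above; the model, data, and process count are placeholders, not the book's code:

```python
import horovod.tensorflow.keras as hvd
import tensorflow as tf

def train():
    hvd.init()  # one Horovod process per worker slot
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    # Scale the learning rate by the number of workers, a common Horovod convention.
    opt = hvd.DistributedOptimizer(tf.keras.optimizers.Adam(0.001 * hvd.size()))
    model.compile(optimizer=opt, loss="mse")
    x = tf.random.normal((256, 4))
    y = tf.random.normal((256, 1))
    model.fit(
        x, y, epochs=1,
        verbose=1 if hvd.rank() == 0 else 0,  # log from one worker only
        callbacks=[hvd.callbacks.BroadcastGlobalVariablesCallback(0)],
    )

# On Databricks, this is typically launched across the cluster with HorovodRunner:
# from sparkdl import HorovodRunner
# HorovodRunner(np=2).run(train)
```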

Deep Learning for Search

Deep Learning for Search teaches you how to improve the effectiveness of your search by implementing neural network-based techniques. By the time you're finished with the book, you'll be ready to build amazing search engines that deliver the results your users need and that get better as time goes on!

About the Technology: Deep learning handles the toughest search challenges, including imprecise search terms, badly indexed data, and retrieving images with minimal metadata. And with modern tools like DL4J and TensorFlow, you can apply powerful DL techniques without a deep background in data science or natural language processing (NLP). This book will show you how.

About the Book: Deep Learning for Search teaches you to improve your search results with neural networks. You’ll review how DL relates to search basics like indexing and ranking. Then, you’ll walk through in-depth examples to upgrade your search with DL techniques using Apache Lucene and Deeplearning4j. As the book progresses, you’ll explore advanced topics like searching through images, translating user queries, and designing search engines that improve as they learn!

What's Inside:

- Accurate and relevant rankings
- Searching across languages
- Content-based image search
- Search with recommendations

About the Reader: For developers comfortable with Java or a similar language and search basics. No experience with deep learning or NLP needed.

About the Author: Tommaso Teofili is a software engineer with a passion for open source and machine learning. As a member of the Apache Software Foundation, he contributes to a number of open source projects, ranging from information retrieval (such as Lucene and Solr) to natural language processing and machine translation (including OpenNLP, Joshua, and UIMA). He currently works at Adobe, developing search and indexing infrastructure components, and researching the areas of natural language processing, information retrieval, and deep learning. He has presented search and machine learning talks at conferences including BerlinBuzzwords, International Conference on Computational Science, ApacheCon, EclipseCon, and others. You can find him on Twitter at @tteofili.

Quotes:

"A practical approach that shows you the state of the art in using neural networks, AI, and deep learning in the development of search engines." - From the Foreword by Chris Mattmann, NASA JPL

"A thorough and thoughtful synthesis of traditional search and the latest advancements in deep learning." - Greg Zanotti, Marquette Partners

"A well-laid-out deep dive into the latest technologies that will take your search engine to the next level." - Andrew Wyllie, Thynk Health

"Hands-on exercises teach you how to master deep learning for search-based products." - Antonio Magnaghi, System1
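The book's examples are in Java with Lucene and Deeplearning4j; purely as an illustration of the core idea it builds on (ranking documents by similarity between neural embeddings of the query and of each document), here is a toy Python sketch with made-up vectors:

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity: 1.0 means identical direction, 0.0 means unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings that a trained neural model might produce.
query_vec = np.array([0.1, 0.7, 0.2])
doc_vecs = {
    "doc1": np.array([0.1, 0.6, 0.3]),
    "doc2": np.array([0.9, 0.1, 0.0]),
}

ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
print(ranked)  # documents ordered by semantic similarity to the query
```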

Data Science and Engineering at Enterprise Scale

As enterprise-scale data science sharpens its focus on data-driven decision making and machine learning, new tools have emerged to help facilitate these processes. This practical ebook shows data scientists and enterprise developers how the notebook interface, Apache Spark, and other collaboration tools are particularly well suited to bridge the communication gap between their teams. Through a series of real-world examples, author Jerome Nilmeier demonstrates how to generate a model that enables data scientists and developers to share ideas and project code. You’ll learn how data scientists can approach real-world business problems with Spark and how developers can then implement the solution in a production environment.

- Dive deep into data science technologies, including Spark, TensorFlow, and the Jupyter Notebook
- Learn how Spark and Python notebooks enable data scientists and developers to work together
- Explore how the notebook environment works with Spark SQL for structured data (see the sketch after this list)
- Use notebooks and Spark as a launchpad to pursue supervised, unsupervised, and deep learning data models
- Learn additional Spark functionality, including graph analysis and streaming
- Explore the use of analytics in the production environment, particularly when creating data pipelines and deploying code
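A minimal sketch, assuming a PySpark-enabled notebook, of the Spark SQL workflow the bullet list refers to; the input file, view name, and columns are hypothetical:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("notebook-sketch").getOrCreate()

# Hypothetical structured data; registering it as a view exposes it to SQL.
df = spark.read.json("events.json")
df.createOrReplaceTempView("events")

daily = spark.sql("""
    SELECT date, COUNT(*) AS n_events
    FROM events
    GROUP BY date
    ORDER BY date
""")
daily.show()  # in a notebook, both teams can inspect the same result inline
```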

Hands-On Deep Learning with Apache Spark

"Hands-On Deep Learning with Apache Spark" is an essential resource for mastering distributed deep learning frameworks and applications on Apache Spark. Through practical examples and guided tutorials, this book teaches you to deploy scalable deep learning solutions for handling complex data challenges efficiently. What this Book will help me do Understand how to set up Apache Spark for deep learning workflows. Gain practical insight into implementing neural networks, including CNNs and RNNs, on distributed platforms. Learn to train and optimize models using popular frameworks like TensorFlow and Keras. Develop expertise in analyzing large datasets with textual and image-based deep learning methods. Acquire skills to deploy trained models for real-world applications in distributed environments. Author(s) None Iozzia is an accomplished software engineer and data scientist with a strong background in distributed computing and machine learning. With years of experience working with Apache Spark and deep learning technologies, None brings a wealth of practical knowledge to the table. Their passion for providing clear, hands-on guidance makes this book an approachable and valuable resource for learners of all levels. Who is it for? This book is aimed at Scala developers, data scientists, and data analysts who are looking to extend their skill set to include distributed deep learning on Apache Spark. It's ideally suited for readers familiar with machine learning basics and those with prior exposure to Apache Spark workflows. If you aim to create scalable machine learning solutions that handle complex data, this book offers precisely what you need.

Apache Spark Deep Learning Cookbook

Embark on a journey to master distributed deep learning with the "Apache Spark Deep Learning Cookbook". Designed specifically for leveraging the capabilities of Apache Spark, TensorFlow, and Keras, this book offers over 80 problem-solving recipes to efficiently train and deploy state-of-the-art neural networks, addressing real-world AI challenges.

What this Book will help me do:

- Set up and configure a working Apache Spark environment optimized for deep learning tasks.
- Implement distributed training practices for deep learning models using TensorFlow and Keras.
- Develop and test neural networks such as CNNs and RNNs targeting specific big data problems.
- Apply Spark's built-in libraries and integrations for enhanced NLP and computer vision applications.
- Effectively manage and preprocess large datasets using Spark DataFrames for machine learning tasks (see the sketch after this description).

Author(s): Authors Ahmed Sherif and Amrith Ravindra bring years of experience in deep learning, Apache Spark use cases, and hands-on practical training. Their collective expertise has contributed to designing this cookbook's approach, focusing on clarity and usability for readers tackling challenging machine learning scenarios.

Who is it for? This book is ideal for IT professionals, data scientists, and software developers with a foundational understanding of machine learning concepts and Apache Spark framework capabilities. If you aim to scale deep learning and integrate efficient computing with Spark's power, this guide is for you. Familiarity with Python will help maximize the book's potential.
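A minimal sketch, not taken from the cookbook, of DataFrame-based preprocessing ahead of model training; the file path, columns, and transformations are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("preprocess-sketch").getOrCreate()

# Hypothetical raw data with an "amount" column.
df = spark.read.csv("transactions.csv", header=True, inferSchema=True)

clean = (
    df.dropna(subset=["amount"])                    # drop rows missing the value
      .filter(F.col("amount") > 0)                  # keep only valid amounts
      .withColumn("amount_log", F.log1p("amount"))  # compress a skewed distribution
)
train, test = clean.randomSplit([0.8, 0.2], seed=42)
train.write.mode("overwrite").parquet("train.parquet")
```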

IBM PowerAI: Deep Learning Unleashed on IBM Power Systems Servers

Abstract

This IBM® Redbooks® publication is a guide to the IBM PowerAI Deep Learning solution. It provides an introduction to artificial intelligence (AI) and deep learning (DL), IBM PowerAI and its components, deployment of IBM PowerAI, guidelines for working with data and creating models, an introduction to IBM Spectrum™ Conductor Deep Learning Impact (DLI), and case scenarios.

IBM PowerAI started as a package of software distributions of many of the major DL frameworks for model training, such as TensorFlow, Caffe, Torch, and Theano, and the associated libraries, such as the CUDA Deep Neural Network library (cuDNN). The IBM PowerAI software is optimized for performance by using IBM Power Systems™ servers that are integrated with NVLink. The AI stack foundation starts with servers with accelerators. Graphics processing unit (GPU) accelerators are well suited to the compute-intensive nature of DL training, and servers with the highest CPU-to-GPU bandwidth, such as IBM Power Systems servers, enable the high-performance data transfer that is required for larger and more complex DL models.

This publication targets technical readers, including developers, IT specialists, systems architects, brand specialists, sales teams, and anyone looking for a guide to understanding the IBM PowerAI Deep Learning architecture, framework configuration, application and workload configuration, and user infrastructure.
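As a tiny sanity check of the GPU-accelerated stack described above (written against the modern TensorFlow 2 API rather than the TF 1.x-era builds PowerAI originally shipped):

```python
import tensorflow as tf

# List the GPU accelerators visible to TensorFlow on this server.
gpus = tf.config.list_physical_devices("GPU")
print(f"{len(gpus)} GPU(s) visible to TensorFlow")
for gpu in gpus:
    print(gpu.name)  # e.g. /physical_device:GPU:0
```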