talk-data.com

Topic

API

Application Programming Interface (API)

integration software_development data_exchange

232 tagged

Activity Trend: peak of 65 per quarter (2020-Q1 to 2026-Q1)

Activities

Filtering by: O'Reilly Data Engineering Books
IBM FlashSystem and VMware Implementation and Best Practices Guide

This IBM® Redbooks® publication details the configuration and best practices for using IBM's FlashSystem family of storage products within a VMware environment. This book was published in 2021 and specifically addresses Spectrum Virtualize Version 8.4 with VMware vSphere Version 7.0. Topics illustrate planning, configuring, operations, and preferred practices that include integration of FlashSystem storage systems with the VMware vCloud suite of applications:

- vSphere Web Client (VWC)
- vStorage APIs for Storage Awareness (VASA)
- vStorage APIs for Array Integration (VAAI)
- Site Recovery Manager (SRM)
- vSphere Metro Storage Cluster (vMSC)

This book is intended for presales consulting engineers, sales engineers, and IBM clients who want to deploy IBM FlashSystem® storage systems in virtualized data centers that are based on VMware vSphere.

Introducing .NET for Apache Spark: Distributed Processing for Massive Datasets

Get started using Apache Spark via C# or F# and the .NET for Apache Spark bindings. This book is an introduction to both Apache Spark and the .NET bindings. Readers new to Apache Spark will get up to speed quickly using Spark for data processing tasks performed against large and very large datasets. You will learn how to combine your knowledge of .NET with Apache Spark to bring massive computing power to bear by distributed processing of extremely large datasets across multiple servers.

This book covers how to get a local instance of Apache Spark running on your developer machine and shows you how to create your first .NET program that uses the Microsoft .NET bindings for Apache Spark. Techniques shown in the book allow you to use Apache Spark to distribute your data processing tasks over multiple compute nodes. You will learn to process data using both batch mode and streaming mode so you can make the right choice depending on whether you are processing an existing dataset or are working against new records in micro-batches as they arrive. The goal of the book is to leave you comfortable in bringing the power of Apache Spark to your favorite .NET language.

What You Will Learn:
- Install and configure Spark .NET on Windows, Linux, and macOS
- Write Apache Spark programs in C# and F# using the .NET bindings
- Access and invoke the Apache Spark APIs from .NET with the same high performance as Python, Scala, and R
- Encapsulate functionality in user-defined functions
- Transform and aggregate large datasets
- Execute SQL queries against files through Apache Hive
- Distribute processing of large datasets across multiple servers
- Create your own batch, streaming, and machine learning programs

Who This Book Is For: .NET developers who want to perform big data processing without having to migrate to Python, Scala, or R; and Apache Spark developers who want to run natively on .NET and take advantage of the C# and F# ecosystems.

R2DBC Revealed: Reactive Relational Database Connectivity for Java and JVM Programmers

Understand the newest trend in database programming for developers working in Java, Kotlin, Clojure, and other JVM-based languages. This book introduces Reactive Relational Database Connectivity (R2DBC), a modern way of connecting to and querying relational databases from Java and other JVM languages.

The book begins by helping you understand not only what reactive programming is, but why it is necessary. Then, building on those fundamentals, the book takes you into the world of databases and the newly released Reactive Relational Database Connectivity (R2DBC) specification. Examples in the book are worked using the freely available MariaDB database along with MariaDB’s vendor implementation of the R2DBC service-provider interface (SPI). Following along with the examples and the provided example code helps prepare you to work with any of the growing number of R2DBC implementations for popular enterprise databases such as Oracle Database and SQL Server. You’ll be well prepared for what is becoming the future of database access from Java and other languages built on the JVM.

What You Will Learn:
- Understand why R2DBC was created and how it utilizes the Reactive Streams API
- Understand the components of the R2DBC service-provider interface
- Create and manage reactive database connections and connection pools using an R2DBC client
- Programmatically execute queries on a relational database using an R2DBC client
- Effectively utilize transactions using an R2DBC client
- Build relational database-driven applications that are event-driven and non-blocking

Who This Book Is For: Software developers building solutions using JVM languages and the JVM ecosystem, and developers who need an introduction to the R2DBC specification and reactive programming with relational databases and want to understand what Reactive Relational Database Connectivity is and why it came about. This book includes practical examples of using the R2DBC specification with Java and MariaDB that will provide developers with the knowledge they need to create their own solutions.

Effortless App Development with Oracle Visual Builder

In "Effortless App Development with Oracle Visual Builder," you will explore how to quickly design, develop, and deploy robust web and mobile applications using Oracle Visual Builder's intuitive drag-and-drop features. This book equips you with the know-how to simplify application development tasks, making it perfect for professionals looking to boost productivity.

What this Book will help me do:
- Master the core architecture and features of Oracle Visual Builder to develop real-world applications effectively.
- Learn to create, manage, and leverage business objects and connect to various SaaS APIs within your applications.
- Build scalable and secure web and mobile applications using practical examples and clear implementation guidelines.
- Discover best practices for application lifecycle management, debugging, and troubleshooting VB applications.
- Extend Oracle and non-Oracle SaaS applications through hands-on knowledge tailored to real-world scenarios.

Author(s): Jain is an experienced developer and technical writer specializing in Oracle Visual Builder and cloud-based application development. With years of hands-on experience building and deploying cloud applications, they bring expertise and a practical approach to education. Their engaging writing style focuses on enabling readers to learn and apply new skills confidently.

Who is it for? This book is perfectly suited for developers, UI designers, and IT professionals who want to master Oracle Visual Builder for developing web and mobile applications. If you already have experience with technologies like JavaScript, UI frameworks, and REST APIs, and seek to create intuitive applications using a simplified interface, this book is for you. Whether you're in the early stages of learning VB or looking to refine your skills, this book serves as a valuable guide.

Modernizing Applications with IBM CICS

IBM® CICS® is a mixed-language application server that runs on IBM Z®. Over the 50 years since CICS was introduced in 1969, enterprises have used the qualities of service (QoSs) that CICS provides to create high-throughput, secure transactional applications that have powered their business. As the IT landscape has evolved, so has CICS, allowing these applications to integrate with new platforms and still provide value to the rest of the business. Because of this capability, many businesses still rely on CICS to power their core applications.

This IBM Redpaper publication focuses on modernizing these CICS applications, allowing them to integrate with cloud-native applications. This modernization can be achieved by constructing application programming interfaces (APIs) that allow new cloud-native applications to connect to your existing assets, by rewriting parts of your application in newer languages and hosting them back on CICS, or by using CICS capabilities to extend your applications to provide new capabilities and functions. The paper takes a traditional example application and shows you how it works. Then, the paper extends the example, rewrites portions of its functions, and enables its APIs. It also explains how CICS applications can use continuous integration (CI) and continuous delivery (CD) to deliver, test, and deploy code into CICS easily and with quality.

Applied Data Science Using PySpark: Learn the End-to-End Predictive Model-Building Cycle

Discover the capabilities of PySpark and its application in the realm of data science. This comprehensive guide with hand-picked examples of daily use cases will walk you through the end-to-end predictive model-building cycle with the latest techniques and tricks of the trade.

Applied Data Science Using PySpark is divided into six sections. In section 1, you start with the basics of PySpark, focusing on data manipulation. We make you comfortable with the language and then build upon it to introduce you to the mathematical functions available off the shelf. In section 2, you will dive into the art of variable selection, where we demonstrate various selection techniques available in PySpark. In section 3, we take you on a journey through machine learning algorithms, implementations, and fine-tuning techniques. We will also talk about different validation metrics and how to use them for picking the best models. Sections 4 and 5 go through machine learning pipelines and various methods available to operationalize the model and serve it through Docker/an API. In the final section, you will cover reusable objects for easy experimentation and learn some tricks that can help you optimize your programs and machine learning pipelines.

By the end of this book, you will have seen the flexibility and advantages of PySpark in data science applications. This book is recommended to those who want to unleash the power of parallel computing by simultaneously working with big datasets.

What You Will Learn:
- Build an end-to-end predictive model
- Implement multiple variable selection techniques
- Operationalize models
- Master multiple algorithms and implementations

Who This Book Is For: Data scientists and machine learning and deep learning engineers who want to learn and use PySpark for real-time analysis of streaming data.

Custom Fiori Applications in SAP HANA: Design, Develop, and Deploy Fiori Applications for the Enterprise

Get started building custom Fiori applications for your enterprise. This book teaches you how to design, build, and deploy enterprise-ready, custom Fiori applications in SAP HANA. Tips and tricks collected from projects using Fiori applications (built consuming OData models and REST APIs) and integrating third-party JS libraries are presented. Also included are examples using Fiori templates from different tools such as the SAP Web IDE and the new Visual Studio Code extensions.

This book explains the 5 design principles that all Fiori applications are built upon: Role-based, Responsive, Coherent, Simple, and Delightful. The book expands on consuming OData services and REST APIs internal and external to SAP HANA. The Fiori application exercise demonstrates the use of the MVC pattern, JavaScript modularization, reuse of SAP UI5 controls, debugging, and the tools required for a complete scenario. The book closes with an exercise showcasing a finished single-page application with multiple views and layouts, navigation between the views, and deployment of the application to AWS.

This book is simple enough for entry-level developers getting started in web frameworks but also highlights integration points from the data models being consumed from the application, and shows how the application communicates with back-end services, resulting in a complete front-end custom Fiori application.

What You Will Learn:
- Know the 5 Fiori design principles
- Understand how to consume OData and REST API models
- Apply the MVC pattern using XML views and the SAP UI5 controls along with controller behavior in JavaScript
- Debug and deploy the application

Who This Book Is For: Web developers and application leads who have some experience in JavaScript frameworks and web development and understand web protocol communication.

Practical Azure SQL Database for Modern Developers: Building Applications in the Microsoft Cloud

Here is the expert-level, insider guidance you need on using Azure SQL Database as your back-end data store. This book highlights best practices in everything ranging from full-stack projects to mobile applications to critical, back-end APIs. The book provides instruction on accessing your data from any language and platform. And you learn how to push processing-intensive work into the database engine to be near the data and avoid undue networking traffic. Azure SQL is explained from a developer's point of view, helping you master its feature set and create applications that perform well and delight users.

Core to the book is showing you how Azure SQL Database provides relational and post-relational support so that any workload can be managed with easy accessibility from any platform and any language. You will learn about features ranging from lock-free tables to columnstore indexes, and about support for data formats ranging from JSON and key-values to the nodes and edges in the graph database paradigm. Reading this book prepares you to deal with almost all data management challenges, allowing you to create lean and specialized solutions having the elasticity and scalability that are needed in the modern world.

What You Will Learn:
- Master Azure SQL Database in your development projects from design to the CI/CD pipeline
- Access your data from any programming language and platform
- Combine key-value, JSON, and relational data in the same database
- Push data-intensive compute work into the database for improved efficiency
- Delight your customers by detecting and improving poorly performing queries
- Enhance performance through features such as columnstore indexes and lock-free tables
- Build confidence in your mastery of Azure SQL Database's feature set

Who This Book Is For: Developers of applications and APIs that benefit from cloud database support, developers who wish to master their tools (including Azure SQL Database), and those who want their applications to be known for speedy performance and the elegance of their code.

Privacy Optimization Meets Pandemic Tracking

Can smartphone apps help track the spread of the novel coronavirus, privately and securely? In this report, Rob Pegoraro weighs the issue of whether mobile apps can help trace and then slow the spread of COVID-19 or will end up as just another episode of botched government procurement and application of technology.

Apple and Google have recently devised a system to track COVID-19 infections anonymously using Bluetooth with iOS and Android smartphones. This development points a spotlight on a needed debate about balancing privacy and collecting useful data. Do privacy-optimizing techniques, such as federated learning and differential privacy, offer useful alternatives to building centralized databases that may later invite abuse? This report takes a close look at this subject and then provides recommendations for software developers, public health authorities, and elected officials who want to build on the Apple-Google API.

- Understand the scope of the problem, including how contact tracing can help slow and stop outbreaks
- Take a closer look at Apple and Google’s proposed remedy
- Learn how other countries including Singapore, India, France, and Australia have traced the spread of COVID-19
- Examine the risk factors for adopting and using a decentralized system like the Apple-Google app

Microservices in SAP HANA XSA: A Guide to REST APIs Using Node.js

Build enterprise-grade microservices in the SAP HANA Advanced Model (XSA). This book explains building scalable APIs in XSA and the benefits of building microservices with SAP HANA XSA. It covers the Cloud Foundry (CF) architecture and how SAP HANA XSA follows the model. It begins with the details of the different architectural layers of applications hosted in XSA (specifically, microservices). Everything you need to know is presented, including analyzing requests, modularization, database ingestion, building JSON responses, and scaling your microservices.

You will learn to use development tools such as the SAP Web IDE, Postman, and the SAP HANA Cockpit for XSA, including debugging examples on SAP HANA XSA with code snippets showing how microservices can be developed, debugged, scaled, and deployed on SAP HANA XSA. Microservices are divided into security and authentication, request handling, modularization of Node.js, and interaction with the SAP HANA database containers and response formatting. An end-to-end scenario is presented of a Node.js REST API that uses HTTP methods, concluding with deploying an SAP HANA XSA project to a production environment.

This book is simple enough to help you implement a Node.js module in order to understand the development of microservices, and complex enough for architects to design their next business-ready solution integrating UAA security, application modularization, and an end-to-end REST API on SAP HANA XSA.

What You Will Learn:
- Know the definition and architecture of Cloud Foundry and its application on SAP HANA XSA
- Understand REST principles and different HTTP methods
- Explore microservices (Node.js) development
- Interact with the database from Node (executing SQL statements and stored procedures)

Who This Book Is For: Architects designing business-ready solutions that integrate UAA security, application modularization, and an end-to-end REST API on SAP HANA XSA.

Learning Spark, 2nd Edition

Data is bigger, arrives faster, and comes in a variety of formats, and it all needs to be processed at scale for analytics or machine learning. But how can you process such varied workloads efficiently? Enter Apache Spark. Updated to include Spark 3.0, this second edition shows data engineers and data scientists why structure and unification in Spark matters. Specifically, this book explains how to perform simple and complex data analytics and employ machine learning algorithms. Through step-by-step walk-throughs, code snippets, and notebooks, you’ll be able to:

- Learn Python, SQL, Scala, or Java high-level Structured APIs
- Understand Spark operations and SQL Engine
- Inspect, tune, and debug Spark operations with Spark configurations and Spark UI
- Connect to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, or Kafka
- Perform analytics on batch and streaming data using Structured Streaming
- Build reliable data pipelines with open source Delta Lake and Spark
- Develop machine learning pipelines with MLlib and productionize models using MLflow

Spark in Action, Second Edition

The Spark distributed data processing platform provides an easy-to-implement tool for ingesting, streaming, and processing data from any source. In Spark in Action, Second Edition, you’ll learn to take advantage of Spark’s core features and incredible processing speed, with applications including real-time computation, delayed evaluation, and machine learning. Spark skills are a hot commodity in enterprises worldwide, and with Spark’s powerful and flexible Java APIs, you can reap all the benefits without first learning Scala or Hadoop.

About the Technology: Analyzing enterprise data starts by reading, filtering, and merging files and streams from many sources. The Spark data processing engine handles this varied volume like a champ, delivering speeds 100 times faster than Hadoop systems. Thanks to SQL support, an intuitive interface, and a straightforward multilanguage API, you can use Spark without learning a complex new ecosystem.

About the Book: Spark in Action, Second Edition, teaches you to create end-to-end analytics applications. In this entirely new book, you’ll learn from interesting Java-based examples, including a complete data pipeline for processing NASA satellite data. And you’ll discover Java, Python, and Scala code samples hosted on GitHub that you can explore and adapt, plus appendixes that give you a cheat sheet for installing tools and understanding Spark-specific terms.

What's Inside:
- Writing Spark applications in Java
- Spark application architecture
- Ingestion through files, databases, streaming, and Elasticsearch
- Querying distributed datasets with Spark SQL

About the Reader: This book does not assume previous experience with Spark, Scala, or Hadoop.

About the Author: Jean-Georges Perrin is an experienced data and software architect. He is France’s first IBM Champion and has been honored for 12 consecutive years.

Quotes:
- "This book reveals the tools and secrets you need to drive innovation in your company or community." (Rob Thomas, IBM)
- "An indispensable, well-paced, and in-depth guide. A must-have for anyone into big data and real-time stream processing." (Anupam Sengupta, GuardHat Inc.)
- "This book will help spark a love affair with distributed processing." (Conor Redmond, InComm Product Control)
- "Currently the best book on the subject!" (Markus Breuer, Materna IPS)

SQL Server 2019 Administration Inside Out

Conquer SQL Server 2019 administration from the inside out.

Dive into SQL Server 2019 administration and really put your SQL Server DBA expertise to work. This supremely organized reference packs hundreds of timesaving solutions, tips, and workarounds: all you need to plan, implement, manage, and secure SQL Server 2019 in any production environment, whether on-premises, cloud, or hybrid. Six experts thoroughly tour DBA capabilities available in SQL Server 2019 Database Engine, SQL Server Data Tools, SQL Server Management Studio, PowerShell, and Azure Portal. You’ll find extensive new coverage of Azure SQL, big data clusters, PolyBase, data protection, automation, and more. Discover how experts tackle today’s essential tasks, and challenge yourself to new levels of mastery.

- Explore SQL Server 2019’s toolset, including the improved SQL Server Management Studio, Azure Data Studio, and Configuration Manager
- Design, implement, manage, and govern on-premises, hybrid, or Azure database infrastructures
- Install and configure SQL Server on Windows and Linux
- Master modern maintenance and monitoring with extended events, Resource Governor, and the SQL Assessment API
- Automate tasks with maintenance plans, PowerShell, Policy-Based Management, and more
- Plan and manage data recovery, including hybrid backup/restore, Azure SQL Database recovery, and geo-replication
- Use availability groups for high availability and disaster recovery
- Protect data with Transparent Data Encryption, Always Encrypted, new Certificate Management capabilities, and other advances
- Optimize databases with SQL Server 2019’s advanced performance and indexing features
- Provision and operate Azure SQL Database and its managed instances
- Move SQL Server workloads to Azure: planning, testing, migration, and post-migration

Stream Processing with Apache Spark

Before you can build analytics tools to gain quick insights, you first need to know how to process data in real time. With this practical guide, developers familiar with Apache Spark will learn how to put this in-memory framework to use for streaming data. You’ll discover how Spark enables you to write streaming jobs in almost the same way you write batch jobs.

Authors Gerard Maas and François Garillot help you explore the theoretical underpinnings of Apache Spark. This comprehensive guide features two sections that compare and contrast the streaming APIs Spark now supports: the original Spark Streaming library and the newer Structured Streaming API.

- Learn fundamental stream processing concepts and examine different streaming architectures
- Explore Structured Streaming through practical examples; learn different aspects of stream processing in detail
- Create and operate streaming jobs and applications with Spark Streaming; integrate Spark Streaming with other Spark APIs
- Learn advanced Spark Streaming techniques, including approximation algorithms and machine learning algorithms
- Compare Apache Spark to other stream processing projects, including Apache Storm, Apache Flink, and Apache Kafka Streams

Stream Processing with Apache Flink

Get started with Apache Flink, the open source framework that powers some of the world’s largest stream processing applications. With this practical book, you’ll explore the fundamental concepts of parallel stream processing and discover how this technology differs from traditional batch data processing.

Longtime Apache Flink committers Fabian Hueske and Vasia Kalavri show you how to implement scalable streaming applications with Flink’s DataStream API and continuously run and maintain these applications in operational environments. Stream processing is ideal for many use cases, including low-latency ETL, streaming analytics, and real-time dashboards as well as fraud detection, anomaly detection, and alerting. You can process continuous data of any kind, including user interactions, financial transactions, and IoT data, as soon as you generate them.

- Learn concepts and challenges of distributed stateful stream processing
- Explore Flink’s system architecture, including its event-time processing mode and fault-tolerance model
- Understand the fundamentals and building blocks of the DataStream API, including its time-based and stateful operators
- Read data from and write data to external systems with exactly-once consistency
- Deploy and configure Flink clusters
- Operate continuously running streaming applications

Integration of IBM Aspera Sync with IBM Spectrum Scale: Protecting and Sharing Files Globally

Economic globalization requires data to be available globally. With most data stored in file systems, solutions to make this data globally available become more important. Files that are in file systems can be protected or shared by replicating these files to another file system that is in a remote location. The remote location might be just around the corner or in a different country. Therefore, the techniques that are used to protect and share files must account for long distances and slow and unreliable wide area network (WAN) connections.

IBM® Spectrum Scale is a scalable clustered file system that can be used to store all kinds of unstructured data. It provides open data access by way of Network File System (NFS); Server Message Block (SMB); POSIX; Object Storage APIs, such as S3 and OpenStack Swift; and the Hadoop Distributed File System (HDFS) for accessing and sharing data. The IBM Aspera® file transfer solution (IBM Aspera Sync) provides predictable and reliable data transfer across large distances for small and large files. The combination of both can be used for global sharing and protection of data.

This IBM Redpaper™ publication describes how IBM Aspera Sync can be used to protect and share data that is stored in IBM Spectrum™ Scale file systems across large distances of several hundred to thousands of miles. We also explain the integration of IBM Aspera Sync with IBM Spectrum Scale™ and differentiate it from solutions that are built into IBM Spectrum Scale for protection and sharing. We also describe different use cases for IBM Aspera Sync with IBM Spectrum Scale.

Walmart and the CICS Asynchronous API: An Adoption Experience

This IBM® Redbooks® publication discusses practical uses of the IBM CICS asynchronous API capability. It describes the methodology, design, and thought process used by a large client, Walmart, and the considerations of the choices made. The Redbooks publication provides real-life examples and application patterns that benefit from the performance and scalability offered by the new API.

The book discusses the homegrown methodology used by Walmart before the API was available and compares it with the design using the new API. A discussion of the process used to migrate older applications to begin using the new API is included so the reader will understand the ease of implementing the new API. A description of real-world usage patterns describes the current production application Walmart has deployed as well as other patterns to give the reader a sense of what's possible when applying creative thinking with technology improvements. Finally, a section is included on the areas to be considered as you begin to plan and implement asynchronous API capabilities.

This book should be read by:
- Enterprise Architects searching for faster ways to service strategic applications across the enterprise.
- Solution Architects who want to better understand implementation possibilities for improved response times and better performance for CICS applications.
- CICS programmers looking to modernize and provide improved response times.

Apache Spark Quick Start Guide

Dive into the world of scalable data processing with the "Apache Spark Quick Start Guide." This book offers a foundational introduction to Spark, empowering readers to harness its capabilities for big data processing. With clear explanations and hands-on examples, you'll learn to implement Spark applications that handle complex data tasks efficiently.

What this Book will help me do:
- Understand and implement Spark's RDDs and DataFrame APIs to process large datasets effectively.
- Set up a local development environment for Spark-based projects.
- Develop skills to debug and optimize slow-performing Spark applications.
- Harness built-in modules of Spark for SQL, streaming, and machine learning applications.
- Adopt best practices and optimization techniques for high-performance Spark applications.

Author(s): Shrey Mehrotra is a seasoned software developer with expertise in big data technologies, particularly Apache Spark. With years of hands-on industry experience, Shrey focuses on making complex technical concepts accessible to all. Through his writing, he aims to share clear, practical guidance for developers of all levels.

Who is it for? This guide is perfect for big data enthusiasts and professionals looking to learn Apache Spark's capabilities from scratch. It's aimed at data engineers interested in optimizing application performance and data scientists wanting to integrate machine learning with Spark. A basic familiarity with Scala, Python, or Java is recommended.

Java XML and JSON: Document Processing for Java SE

Use this guide to master the XML metalanguage and JSON data format along with significant Java APIs for parsing and creating XML and JSON documents from the Java language. New in this edition is coverage of Jackson (a JSON processor for Java) and Oracle’s own Java API for JSON processing (JSON-P), which is a JSON processing API for Java EE that also can be used with Java SE. This new edition of Java XML and JSON also expands coverage of DOM and XSLT to include additional API content and useful examples.

All examples in this book have been tested under Java 11. In some cases, source code has been simplified to use Java 11’s var language feature. The first six chapters focus on XML along with the SAX, DOM, StAX, XPath, and XSLT APIs. The remaining six chapters focus on JSON along with the mJson, Gson, JsonPath, Jackson, and JSON-P APIs. Each chapter ends with select exercises designed to challenge your grasp of the chapter's content. An appendix provides the answers to these exercises.

What You'll Learn:
- Master the XML language
- Create, validate, parse, and transform XML documents
- Apply Java’s SAX, DOM, StAX, XPath, and XSLT APIs
- Master the JSON format for serializing and transmitting data
- Code against third-party APIs such as Jackson, mJson, Gson, and JsonPath
- Master Oracle’s JSON-P API in a Java SE context

Who This Book Is For: Intermediate and advanced Java programmers who are developing applications that must access data stored in XML or JSON documents. The book also targets developers wanting to understand the XML language and JSON data format.

Practical Apache Spark: Using the Scala API

Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark with the help of practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka with examples. You’ll follow a learn-to-do-by-yourself approach to learning: learn the concepts, practice the code snippets in Scala, and complete the assignments given to get an overall exposure. On completion, you’ll have knowledge of the functional programming aspects of Scala, and hands-on expertise in various Spark components. You’ll also become familiar with machine learning algorithms with real-time usage.

What You Will Learn:
- Discover the functional programming features of Scala
- Understand the complete architecture of Spark and its components
- Integrate Apache Spark with Hive and Kafka
- Use Spark SQL, DataFrames, and Datasets to process data using traditional SQL queries
- Work with different machine learning concepts and libraries using Spark's MLlib packages

Who This Book Is For: Developers and professionals who deal with batch and stream data processing.