talk-data.com

Topic

DevOps

software_development it_operations continuous_delivery

216

tagged

Activity Trend: peak of 25 activities per quarter, 2020-Q1 to 2026-Q1

Activities

216 activities · Newest first

Spark in Action

Spark in Action teaches you the theory and skills you need to effectively handle batch and streaming data using Spark. Fully updated for Spark 2.0. About the Technology Big data systems distribute datasets across clusters of machines, making it a challenge to efficiently query, stream, and interpret them. Spark can help. It is a processing system designed specifically for distributed data. It provides easy-to-use interfaces, along with the performance you need for production-quality analytics and machine learning. Spark 2 also adds improved programming APIs, better performance, and countless other upgrades. About the Book Spark in Action teaches you the theory and skills you need to effectively handle batch and streaming data using Spark. You'll get comfortable with the Spark CLI as you work through a few introductory examples. Then, you'll start programming Spark using its core APIs. Along the way, you'll work with structured data using Spark SQL, process near-real-time streaming data, apply machine learning algorithms, and munge graph data using Spark GraphX. For a zero-effort startup, you can download the preconfigured virtual machine ready for you to try the book's code. What's Inside Updated for Spark 2.0 Real-life case studies Spark DevOps with Docker Examples in Scala, and online in Java and Python About the Reader Written for experienced programmers with some background in big data or machine learning. About the Authors Petar Zečević and Marko Bonaći are seasoned developers heavily involved in the Spark community. Quotes Dig in and get your hands dirty with one of the hottest data processing engines today. A great guide. - Jonathan Sharley, Pandora Media Must-have! Speed up your learning of Spark as a distributed computing framework. - Robert Ormandi, Yahoo! An easy-to-follow, step-by-step guide. - Gaurav Bhardwaj, 3Pillar Global An ambitiously comprehensive overview of Spark and its diverse ecosystem. - Jonathan Miller, Optensity

Essentials of Cloud Application Development on IBM Bluemix

Abstract This IBM® Redbooks® publication is based on the Presentations Guide of the course "Essentials of Cloud Application Development on IBM Bluemix" that was developed by the IBM Redbooks team in partnership with IBM Middle East and Africa (MEA) University Program. This course is designed to teach university students the basic skills that are required to develop, deploy, and test cloud-based applications that use the IBM Bluemix® cloud services. The primary target audience for this course is university students in undergraduate computer science and computer engineering programs with no previous experience working in cloud environments. However, anyone new to cloud computing can benefit from this course. After completing this course, you should be able to accomplish these tasks: Describe the factors that lead to the adoption of cloud computing. Describe infrastructure as a service, platform as a service, and software as a service. Define cloud computing. Describe IBM Bluemix. Describe the architecture of IBM Bluemix. Identify the runtimes and services that Bluemix offers. Explain how to get started with Bluemix. Describe Bluemix organizations, domains, spaces, and users. Create Bluemix applications. Use services in a Bluemix application. Set environment variables that are used with Bluemix services. Deploy and run Bluemix applications. Describe how to create an IBM SDK for Node.js application that runs on Bluemix. Explain how to manage a Bluemix account with the Cloud Foundry CLI. Describe how to integrate workstation development platforms with Bluemix. Manage application code and assets with IBM Bluemix DevOps services. Work with the Git repository that is used by DevOps services. Describe the characteristics of REST APIs. Describe the use of JSON as the preferred data format for REST APIs. Identify the data services that are available on Bluemix. Describe the features in Bluemix for developing mobile applications. Create a MobileFirst Services Starter application on Bluemix. Send push notifications from Bluemix and receive them on the mobile device emulator. The workshop materials were created in August 2016. Thus, all IBM Bluemix features discussed in this Presentations Guide and Bluemix user interfaces used in the examples are current as of August 2016. Note: This IBM Redbooks publication references exercises that are NOT included with this book. The exercises are only available to students attending the course.

Cassandra 3.x High Availability - Second Edition

Cassandra 3.x High Availability is an in-depth guide to mastering the high availability features of Apache Cassandra. This book takes you through its architecture, implementing solutions to achieve zero downtime, and configuring clusters for fault tolerance and scalability. With practical examples and tips, it is a go-to resource for designing robust Cassandra-powered systems. What this Book will help me do Understand the architecture of Apache Cassandra and its high availability mechanisms. Master replication and tunable consistency levels for optimal data distribution. Learn to scale out your Cassandra deployments with multiple data centers. Acquire skills in creating efficient and scalable data models for fault-tolerant systems. Prevent system failures by avoiding anti-patterns and managing graceful failover scenarios. Author(s) Strickland has extensive experience working as a developer and architect with distributed database systems. Specializing in Apache Cassandra, Strickland focuses on designing systems with high availability, scalability, and fault tolerance. Their practical teaching style ensures readers gain actionable knowledge to build robust database solutions. Who is it for? This book is ideal for developers and DevOps engineers familiar with Cassandra basics who wish to deepen their expertise. If your goal is to build highly available and fault-tolerant systems, this book will guide you step by step. It suits professionals managing data-intensive applications and looking to optimize their database strategy using Cassandra.

Site Reliability Engineering

The overwhelming majority of a software system's lifespan is spent in use, not in design or implementation. So, why does conventional wisdom insist that software engineers focus primarily on the design and development of large-scale computing systems? In this collection of essays and articles, key members of Google's Site Reliability Team explain how and why their commitment to the entire lifecycle has enabled the company to successfully build, deploy, monitor, and maintain some of the largest software systems in the world. You'll learn the principles and practices that enable Google engineers to make systems more scalable, reliable, and efficient—lessons directly applicable to your organization. This book is divided into four sections: Introduction—Learn what site reliability engineering is and why it differs from conventional IT industry practices Principles—Examine the patterns, behaviors, and areas of concern that influence the work of a site reliability engineer (SRE) Practices—Understand the theory and practice of an SRE's day-to-day work: building and operating large distributed computing systems Management—Explore Google's best practices for training, communication, and meetings that your organization can use

IT Modernization using Catalogic ECX Copy Data Management and IBM Spectrum Storage

Data is the currency of the new economy, and organizations are increasingly tasked with finding better ways to protect, recover, access, share, and use data. Traditional storage technologies are being stretched to the breaking point. This challenge is not because of storage hardware performance, but because management tools and techniques have not kept pace with new requirements. Primary data growth rates of 35% to 50% annually only amplify the problem. Organizations of all sizes find themselves needing to modernize their IT processes to enable critical new use cases such as storage self-service, Development and Operations (DevOps), and integration of data centers with the Cloud. They are equally challenged with improving management efficiencies for long established IT processes such as data protection, disaster recovery, reporting, and business analytics. Access to copies of data is the one common feature of all these use cases. However, the slow, manual processes common to IT organizations, including a heavy reliance on labor-intensive scripting and disparate tool sets, are no longer able to deliver the speed and agility required in today's fast-paced world. Copy Data Management (CDM) is an IT modernization technology that focuses on using existing data in a manner that is efficient, automated, scalable, and easy to use, delivering the data access that is urgently needed to meet the new use cases. Catalogic ECX, with IBM® storage, provides in-place copy data management that modernizes IT processes, enables key use cases, and does it all within existing infrastructure. This IBM Redbooks® publication shows how Catalogic Software and IBM have partnered together to create an integrated solution that addresses today's IT environment.

Ten Signs of Data Science Maturity

How well prepared is your organization to innovate, using data science? In this report, two leading data scientists at the consulting firm Booz Allen Hamilton describe ten characteristics of a mature data science capability. After spending years helping clients such as the US government and commercial organizations worldwide build innovative data science capabilities, Peter Guerra and Dr. Kirk Borne identified these characteristics to help you measure your company’s competence in this area. This report provides a detailed discussion of each of the 10 signs of data science maturity, which—among many other things—encourage you to: Give members of your organization access to all your available data Use Agile and leverage "DataOps"—DevOps for data product development Help your data science team sharpen its skills through open or internal competitions Personify data science as a way of doing things, and not a thing to do

Learning ELK Stack

Dive into the ELK stack (Elasticsearch, Logstash, and Kibana) with this comprehensive guide. Designed to help you set up, configure, and utilize the stack to its fullest, this book provides you with the skills to manage data with precision, enrich logs, and create meaningful analytics. Develop an entire data pipeline and cultivate powerful visual insights from your data. What this Book will help me do Install and configure Elasticsearch, Logstash, and Kibana to establish a robust ELK stack setup. Understand the role of each component in the stack and master the basics of log analysis. Create custom Logstash plugins to handle non-standard data processing requirements. Develop interactive and insightful data visualizations and dashboards using Kibana. Implement a complete data pipeline and gain expertise in data indexing, searching, and reporting. Author(s) Chhajed brings depth of technical understanding and practical experience to the exploration of the ELK Stack. With a strong background in open-source technologies and data analytics, Chhajed has worked extensively with ELK stack implementations in real-world scenarios. Through this guide, the author offers clarity, detailed examples, and actionable insights for professionals seeking to improve their data systems. Who is it for? This book is targeted towards software developers, data analysts, and DevOps engineers seeking to harness the potential of the ELK stack for data analysis and logging. It is most suitable for intermediate-level professionals with basic knowledge of Unix or programming. If your aim is to gain insights and build metrics from diverse data formats utilizing open-source technologies, this book is crafted for you.

Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem

Get Started Fast with Apache Hadoop® 2, YARN, and Today’s Hadoop Ecosystem With Hadoop 2.x and YARN, Hadoop moves beyond MapReduce to become practical for virtually any type of data processing. Hadoop 2.x and the Data Lake concept represent a radical shift away from conventional approaches to data usage and storage. Hadoop 2.x installations offer unmatched scalability and breakthrough extensibility that supports new and existing Big Data analytics processing methods and models. Hadoop® 2 Quick-Start Guide is the first easy, accessible guide to Apache Hadoop 2.x, YARN, and the modern Hadoop ecosystem. Building on his unsurpassed experience teaching Hadoop and Big Data, author Douglas Eadline covers all the basics you need to know to install and use Hadoop 2 on personal computers or servers, and to navigate the powerful technologies that complement it. Eadline concisely introduces and explains every key Hadoop 2 concept, tool, and service, illustrating each with a simple “beginning-to-end” example and identifying trustworthy, up-to-date resources for learning more. This guide is ideal if you want to learn about Hadoop 2 without getting mired in technical details. Douglas Eadline will bring you up to speed quickly, whether you’re a user, admin, devops specialist, programmer, architect, analyst, or data scientist.
Coverage Includes Understanding what Hadoop 2 and YARN do, and how they improve on Hadoop 1 with MapReduce Understanding Hadoop-based Data Lakes versus RDBMS Data Warehouses Installing Hadoop 2 and core services on Linux machines, virtualized sandboxes, or clusters Exploring the Hadoop Distributed File System (HDFS) Understanding the essentials of MapReduce and YARN application programming Simplifying programming and data movement with Apache Pig, Hive, Sqoop, Flume, Oozie, and HBase Observing application progress, controlling jobs, and managing workflows Managing Hadoop efficiently with Apache Ambari–including recipes for HDFS to NFSv3 gateway, HDFS snapshots, and YARN configuration Learning basic Hadoop 2 troubleshooting, and installing Apache Hue and Apache Spark

IBM Software for SAP Solutions

SAP is a market leader in enterprise business application software. SAP solutions provide a rich set of composable application modules, and configurable functional capabilities that are expected from a comprehensive enterprise business application software suite. In most cases, companies that adopt SAP software remain heterogeneous enterprises running both SAP and non-SAP systems to support their business processes. Regardless of the specific scenario, in heterogeneous enterprises most SAP implementations must be integrated with a variety of non-SAP enterprise systems: Portals Messaging infrastructure Business process management (BPM) tools Enterprise Content Management (ECM) methods and tools Business analytics (BA) and business intelligence (BI) technologies Security Systems of record Systems of engagement The tooling included with SAP software addresses many needs for creating SAP-centric environments. However, the classic approach to implementing SAP functionality generally leaves the business with a rigid solution that is difficult and expensive to change and enhance. When SAP software is used in a large, heterogeneous enterprise environment, SAP clients face the dilemma of selecting the correct set of tools and platforms to implement SAP functionality, and to integrate the SAP solutions with non-SAP systems. This IBM® Redbooks® publication explains the value of integrating IBM software with SAP solutions. It describes how to enhance and extend pre-built capabilities in SAP software with best-in-class IBM enterprise software, enabling clients to maximize return on investment (ROI) in their SAP investment and achieve a balanced enterprise architecture approach. This book describes IBM Reference Architecture for SAP, a prescriptive blueprint for using IBM software in SAP solutions. The reference architecture is focused on defining the use of IBM software with SAP, and is not intended to address the internal aspects of SAP components. 
The chapters of this book provide a specific reference architecture for many of the architectural domains that are each important for a large enterprise to establish common strategy, efficiency, and balance. The majority of the most important architectural domain topics, such as integration, process optimization, master data management, mobile access, Enterprise Content Management, business intelligence, DevOps, security, systems monitoring, and so on, are covered in the book. However, there are several other architectural domains which are not included in the book. This is not to imply that these other architectural domains are not important or are less important, or that IBM does not offer a solution to address them. It is only reflective of time constraints, available resources, and the complexity of assembling a book on an extremely broad topic. Although more content could have been added, the authors feel confident that the scope of architectural material that has been included should provide organizations with a fantastic head start in defining their own enterprise reference architecture for many of the important architectural domains, and it is hoped that this book provides great value to those reading it. This IBM Redbooks publication is targeted to the following audiences: Client decision makers and solution architects leading enterprise transformation projects and wanting to gain further insight so that they can benefit from the integration of IBM software in large-scale SAP projects. IT architects and consultants integrating IBM technology with SAP solutions.

Getting Started with Hazelcast, Second Edition

This book is your gateway to mastering Hazelcast, a powerful open-source distributed data grid platform. By using Hazelcast, you'll gain the tools to manage data at scale within your modern applications while improving performance and reliability. What this Book will help me do Gain a comprehensive understanding of distributed data grids and Hazelcast's architecture. Master the configuration and deployment of Hazelcast clusters in various scenarios. Learn to design scalable and resilient systems using Hazelcast's in-memory features. Implement advanced messaging, querying, and processing using Hazelcast APIs. Enhance your applications with distributed caching and data sharing capabilities. Author(s) Matthew Johns is an experienced software engineer and author specializing in distributed systems and Java enterprise development. He has worked extensively in building scalable applications and is passionate about teaching others to leverage modern technologies. His practical approach to programming and clarity of instruction make complex topics accessible and actionable. Who is it for? This book is ideal for Java developers, software architects, and DevOps engineers seeking to enhance their skills in distributed systems. If you're looking to manage data at scale, improve application performance, and build resilient architectures, this book is for you. Whether new to distributed computing or experienced developers exploring Hazelcast, you'll find practical insights for your work. Readers should have basic Java knowledge to get the most out of this book.

Cassandra High Availability

This book, "Cassandra High Availability", equips you with the knowledge and practical skills to harness Apache Cassandra's capabilities for building resilient, scalable, and highly-available systems. Suitable for developers or DevOps engineers with foundational knowledge of Cassandra, this resource takes you deeper into advanced topics necessary for maintaining robust distributed systems. What this Book will help me do Understand and utilize Cassandra's replication protocols and consistency levels to balance performance and reliability. Configure and manage multi-data-center setups in Cassandra for failover and geographic redundancy. Implement techniques to efficiently scale your Cassandra cluster with no downtime. Learn how to design high-availability data models optimized for performance and resilience. Identify and avoid common anti-patterns in Cassandra to maintain system efficiency and reliability. Author(s) Strickland, the author of "Cassandra High Availability", is an experienced data engineer with a deep understanding of distributed systems and database technologies. Strickland has worked extensively with Apache Cassandra in designing and optimizing scalable infrastructures. They bring a hands-on and detailed approach to explaining complex topics, making them accessible to both developers and system operators. Who is it for? This book is tailored for developers and DevOps engineers who have foundational knowledge of Apache Cassandra and are aiming to deepen their expertise. If your goal is to design, manage, and optimize high-availability distributed systems, this book provides practical strategies and technical insights for mastering Cassandra's capabilities. Ideal for those seeking to build fault-tolerant, scalable infrastructures.

IBM Software for SAP Solutions

SAP is a market leader in enterprise business application software. SAP solutions provide a rich set of composable application modules, and configurable functional capabilities that are expected from a comprehensive enterprise business application software suite. In most cases, companies that adopt SAP software remain heterogeneous enterprises running both SAP and non-SAP systems to support their business processes. Regardless of the specific scenario, in heterogeneous enterprises most SAP implementations must be integrated with a variety of non-SAP enterprise systems: Portals Messaging infrastructure Business process management (BPM) tools Enterprise Content Management (ECM) methods and tools Business analytics (BA) and business intelligence (BI) technologies Security Systems of record Systems of engagement When SAP software is used in a large, heterogeneous enterprise environment, SAP clients face the dilemma of selecting the correct set of tools and platforms to implement SAP functionality, and to integrate the SAP solutions with non-SAP systems. This IBM® Redbooks® publication explains the value of integrating IBM software with SAP solutions. It describes how to enhance and extend pre-built capabilities in SAP software with best-in-class IBM enterprise software, enabling clients to maximize return on investment (ROI) in their SAP investment and achieve a balanced enterprise architecture approach. This book describes IBM Reference Architecture for SAP, a prescriptive blueprint for using IBM software in SAP solutions. The reference architecture is focused on defining the use of IBM software with SAP, and is not intended to address the internal aspects of SAP components. The chapters of this book provide a specific reference architecture for many of the architectural domains that are each important for a large enterprise to establish common strategy, efficiency, and balance. 
The majority of the most important architectural domain topics, such as integration, process optimization, master data management, mobile access, Enterprise Content Management, business intelligence, DevOps, security, systems monitoring, and so on, are covered in the book. However, there are several other architectural domains which are not included in the book. This is not to imply that these other architectural domains are not important or are less important, or that IBM does not offer a solution to address them. It is only reflective of time constraints, available resources, and the complexity of assembling a book on an extremely broad topic. Although more content could have been added, the authors feel confident that the scope of architectural material that has been included should provide organizations with a fantastic head start in defining their own enterprise reference architecture for many of the important architectural domains, and it is hoped that this book provides great value to those reading it. This IBM Redbooks publication is targeted to the following audiences: Client decision makers and solution architects leading enterprise transformation projects and wanting to gain further insight so that they can benefit from the integration of IBM software in large-scale SAP projects. IT architects and consultants integrating IBM technology with SAP solutions.

Agentic DevOps with GitHub Copilot

Our very own (not so secret) agent, Martin Woodward, takes us through the latest developments in GitHub Copilot with a deep dive into all the announcements from the keynote. You will not only learn how to get started with all the latest and greatest AI-enhanced development features across VS Code and GitHub, but you will also learn how to take the best advantage of them in your day-to-day development work.

Create your blueprint for a successful AI transformation

Gartner predicts over 40% of agentic AI projects will fail by 2027, despite AI transformation being a top industry priority. The cause is a classic "last-mile problem": AI agents require step-by-step instructions, but key workflows are undocumented. This session demonstrates how to create a roadmap for your AI transformation by developing living blueprints of your processes or architecture with tools like the Lucid Suite, Microsoft Teams, and Azure DevOps.

Retaining software developers is a significant challenge for teams. According to the Infragistics Reveal Survey, 37.5% of respondents expected difficulty in finding developers in 2023. To retain talent and keep DevOps engineers happy, we need to know how to make them unhappy. Join me as I discuss antipatterns in management, development, testing, and monitoring that can stop you from retaining awesome software engineers.

Outline

I’ll cover:

  • Alert volume evaluation, and how alert bombardment leads to burnout and alert fatigue. I’ll also cover best practices for on-call rotation and BYOD usage to stop engineer burnout even when they’re not on call.
  • SLO and metric comparison across teams, and how comparing team metrics rather than improving metrics such as DORA over time for a single team breeds animosity and demoralises engineers.
  • Code reviews with jerkish or unhelpful comments, and the difference between radical candour through constructive feedback and pulling people down.
  • Tool overload, and how selecting a common toolbox reduces the need for context switching.
  • Flaky or poor testing, and how it builds mistrust and apathy in platform quality.
  • Constant work items and a lack of learning time, and how a lack of training opportunities and space to grow leaves engineers feeling stuck.
  • Lack of support for conference attendance and speaking, and how community connections help engineers grow and learn.