talk-data.com

Topic: IBM
Tags: technology, cloud, ai
1631 activities tagged

Activity Trend: 26 peak/qtr (2020-Q1 to 2026-Q1)

Activities

1631 activities · Newest first

Summary

Modern applications and data platforms aspire to process events and data in real time, at scale, and with low latency. Apache Flink is a true stream processing engine with an impressive set of capabilities for stateful computation at scale. In this episode Fabian Hueske, one of the original authors, explains how Flink is architected, how it is being used to power some of the world’s largest businesses, where it sits in the landscape of stream processing tools, and how you can start using it today.
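The keyed, stateful computation that the summary alludes to can be sketched in plain Python (a hypothetical toy, not Flink's actual API; the function name and event shape here are invented for illustration):

```python
from collections import defaultdict

def keyed_running_count(events):
    """Illustrative keyed stateful computation: maintain one counter
    per key across an unbounded stream of (key, value) events."""
    state = defaultdict(int)  # per-key state, as a stream engine would keep it
    for key, _value in events:
        state[key] += 1
        yield key, state[key]

# Each arriving event updates only its key's state; history is never reprocessed.
stream = [("a", 1), ("b", 7), ("a", 3)]
print(list(keyed_running_count(stream)))  # [('a', 1), ('b', 1), ('a', 2)]
```

In a real engine this per-key state would be partitioned across workers and checkpointed for fault tolerance; the sketch only shows the programming model.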

Preamble

Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200Gbit private networking, scalable shared block storage, and a 40Gbit public network, you’ve got everything you need to run a fast, reliable, and bullet-proof data platform. If you need global distribution, they’ve got that covered too with world-wide datacenters, including new ones in Toronto and Mumbai. Go to dataengineeringpodcast.com/linode today to get a $20 credit and launch a new server in under a minute. Go to dataengineeringpodcast.com to subscribe to the show, sign up for the mailing list, read the show notes, and get in touch. Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat. Your host is Tobias Macey, and today I’m interviewing Fabian Hueske, co-author of the upcoming O’Reilly book Stream Processing With Apache Flink, about his work on Apache Flink, the stateful streaming engine.

Interview

Introduction
How did you get involved in the area of data management?
Can you start by describing what Flink is and how the project got started?
What are some of the primary ways that Flink is used?
How does Flink compare to other streaming engines such as Spark, Kafka, Pulsar, and Storm?

What are some use cases that Flink is uniquely qualified to handle?

Where does Flink fit into the current data landscape?
How is Flink architected?

How has that architecture evolved? Are there any aspects of the current design that you would do differently if you started over today?

How does scaling work in a Flink deployment?

What are the scaling limits?
What are some of the failure modes that users should be aware of?

How is the statefulness of a cluster managed?

What are the mechanisms for managing conflicts?
What are the limiting factors for the volume of state that can be practically handled in a cluster and for a given purpose?
Can state be shared across processes or tasks within a Flink cluster?

What are the comparative challenges of working with bounded vs unbounded streams of data?
How do you handle out of order events in Flink, especially as the delay for a given event increases?
For someone who is using Flink in their environment, what are the primary means of interacting with and developing on top of it?
What are some of the most challenging or complicated aspects of building and maintaining Flink?
What are some of the most interesting or unexpected ways that you have seen Flink used?
What are some of the improvements or new features that are planned for the future of Flink?
What are some features or use cases that you are explicitly not planning to support?
For people who participate in the training sessions that you offer through Data Artisans, what are some of the concepts that they are challenged by?

What do they find most interesting or exciting?

Contact Info

LinkedIn
@fhueske on Twitter
fhueske on GitHub

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links

Flink
Data Artisans
IBM DB2
Technische Universität Berlin
Hadoop
Relational Database
Google Cloud Dataflow
Spark
Cascading
Java
RocksDB
Flink Checkpoints
Flink Savepoints
Kafka
Pulsar
Storm
Scala
LINQ (Language INtegrated Query)
SQL
Backpressure

IBM Power Systems E870C and E880C Technical Overview and Introduction

This IBM® Redpaper™ publication is a comprehensive guide that covers the IBM Power® System E870C (9080-MME) and IBM Power System E880C (9080-MHE) servers that support IBM AIX®, IBM i, and Linux operating systems. The objective of this paper is to introduce the major innovative Power E870C and Power E880C offerings and their relevant functions.

The new Power E870C and Power E880C servers with OpenStack-based cloud management and open source automation enable clients to accelerate the transformation of their IT infrastructure for cloud while providing tremendous flexibility during the transition. In addition, the Power E870C and Power E880C models provide clients with increased security, high availability, rapid scalability, and simplified maintenance and management, all while enabling business growth and dramatically reducing costs.

The systems management capability of the Power E870C and Power E880C servers speeds up and simplifies cloud deployment by providing fast and automated VM deployments, prebuilt image templates, and self-service capabilities, all with an intuitive interface. Enterprise servers provide the highest levels of reliability, availability, flexibility, and performance to bring you a world-class enterprise private and hybrid cloud infrastructure. Through enterprise-class security, efficient built-in virtualization that drives industry-leading workload density, and dynamic resource allocation and management, the server consistently delivers the highest levels of service across hundreds of virtual workloads on a single system. The Power E870C and Power E880C servers include the cloud management software and services to assist with clients' move to the cloud, both private and hybrid.
The following capabilities are included:
Private cloud management with IBM Cloud PowerVC Manager
Cloud-based HMC Apps as a service
Open source cloud automation and configuration tooling for AIX
Hybrid cloud support
Hybrid infrastructure management tools
Securely connect system of record workloads and data to cloud native applications
IBM Cloud Starter Pack
Flexible capacity on demand
Power to Cloud Services

This paper expands the current set of IBM Power Systems™ documentation by providing a desktop reference that offers a detailed technical description of the Power E870C and Power E880C systems. This paper does not replace the latest marketing materials and configuration tools. It is intended as another source of information that, together with existing sources, can be used to enhance your knowledge of IBM server solutions.

IBM DS8880 Thin Provisioning (Updated for Release 8.5)

Ever-increasing storage demands have a negative effect on an organization's IT budget and complicate the overall storage infrastructure and management. Companies are looking at ways to use their storage resources more efficiently. Thin provisioning can help by reducing the amount of unused storage that is typically allocated to applications or users. Now available for the IBM® DS8880 for Fixed Block (FB) and Count Key Data (CKD) volumes, thin provisioning defers the allocation of actual space on the storage system until the data is actually written to disk.

This IBM Redpaper™ publication provides an overall understanding of how thin provisioning works on the IBM DS8880. It also provides insights into the functional design and its implementation on the DS8880, and includes illustrations for the configuration of thin-provisioned volumes from the DS GUI or the DS CLI. This edition applies to DS8880 Release 8.5 or later.
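The allocate-on-write idea behind thin provisioning can be illustrated with a toy model (the class name, extent size, and interface here are invented for illustration and do not reflect the DS8880's actual internals):

```python
class ThinVolume:
    """Toy model of thin provisioning: the volume advertises a large
    virtual capacity, but backing extents are only allocated on first write."""

    def __init__(self, virtual_blocks, extent_size=16):
        self.virtual_blocks = virtual_blocks  # capacity promised to the host
        self.extent_size = extent_size        # allocation granularity in blocks
        self.allocated = set()                # extents backed by real storage

    def write(self, block):
        extent = block // self.extent_size
        self.allocated.add(extent)            # allocate-on-write

    def physical_blocks_used(self):
        return len(self.allocated) * self.extent_size

# A 1,000,000-block volume consumes real space only for the extents touched.
vol = ThinVolume(virtual_blocks=1_000_000)
vol.write(0)
vol.write(5)          # same extent as block 0, nothing new allocated
vol.write(999_999)    # a second extent
print(vol.physical_blocks_used())  # 32
```

The point of the sketch is the gap between advertised and consumed capacity: three writes back only two extents of a million-block volume.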

Send us a text Making Data Simple host Al Martin has a chance to discuss all things data with Laura Ellis, also known as Little Miss Data. Laura is an analytics architect for IBM Cloud as well as a frequent blogger. Together, they talk about how critical it is to understand your data in order to create specific calls to action, and what it means to build a data democracy.

Show Notes
00:00 - Follow @IBMAnalyticsSupport on Twitter.
00:22 - Check out our YouTube channel. We're posting full episodes weekly.
00:24 - Connect with Al Martin on LinkedIn and Twitter.
01:20 - Check out littlemissdata.com.
01:22 - Connect with Laura Ellis on Twitter, Instagram, and LinkedIn.
02:20 - Curious to know more about analytics architecture? Check out this IBM article on the topic.
03:52 - Check out the Little Miss Data article Al referenced here.
04:45 - Learn more about Data Democracy here in Laura's blog post.
05:31 - Understand more about the importance of data for your business in this article.
09:11 - Find out more about the challenges of being a data scientist here.
12:45 - Working with good quality data is crucial. Check out this article for more details.
16:12 - Simple data can provide the most effective returns. Learn more here.
21:15 - Choosing the right, supportive environment for your data science journey will make sure you don't get burnt out. This article examines your options.
21:35 - Data is a fundamental step when working with AI. But do you know the difference between data analytics, AI and machine learning? This Forbes article walks you through it.
22:42 - Need to brush up on what a data dashboard is? Learn more here.

Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

IBM z14 Model ZR1 Technical Introduction

Abstract This IBM® Redbooks® publication introduces the latest member of the IBM Z platform, the IBM z14 Model ZR1 (Machine Type 3907). It includes information about the Z environment and how it helps integrate data and transactions more securely, and provides insight for faster and more accurate business decisions. The z14 ZR1 is a state-of-the-art data and transaction system that delivers advanced capabilities, which are vital to any digital transformation. The z14 ZR1 is designed for enhanced modularity in an industry-standard footprint. This system excels at the following tasks:
Securing data with pervasive encryption
Transforming a transactional platform into a data powerhouse
Getting more out of the platform with IT Operational Analytics
Providing resilience towards zero downtime
Accelerating digital transformation with agile service delivery
Revolutionizing business processes
Mixing open source and Z technologies

This book explains how this system uses new innovations and traditional Z strengths to satisfy growing demand for cloud, analytics, and open source technologies. With the z14 ZR1 as the base, applications can run in a trusted, reliable, and secure environment that improves operations and lessens business risk.

IBM z14 Technical Introduction

Abstract This IBM® Redbooks® publication introduces the latest IBM z platform, the IBM z14™. It includes information about the Z environment and how it helps integrate data and transactions more securely, and can provide insight for faster and more accurate business decisions. The z14 is a state-of-the-art data and transaction system that delivers advanced capabilities, which are vital to the digital era and the trust economy. This system includes the following functionality:
Securing data with pervasive encryption
Transforming a transactional platform into a data powerhouse
Getting more out of the platform with IT Operational Analytics
Providing resilience towards zero downtime
Accelerating digital transformation with agile service delivery
Revolutionizing business processes
Blending open source and Z technologies

This book explains how this system uses both new innovations and traditional Z strengths to satisfy growing demand for cloud, analytics, and mobile applications. With the z14 as the base, applications can run in a trusted, reliable, and secure environment that both improves operations and lessens business risk.

Send us a text On this week's Making Data Simple Podcast, we are joined by Steve Moore, IBM senior content designer and story strategist, along with Aishwarya Srinivasan, IBM data scientist and deep learning researcher. Aishwarya discusses what it means to be a unicorn in an industry and why it is useful to pair textbook learning from schooling with hands-on industry experience. Aishwarya, Steve, and Al explore how to use reinforcement learning to make better data decisions. Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

Send us a text Interested in learning what it takes to operate a start-up? On this episode of Making Data Simple, host Al Martin sits down with Simon Lightstone, IBM offering manager, to discuss what it took to get his startup off the ground. Simon offers tips to those facing a similar experience and describes how the decision to pursue a dream ultimately affected his career. Want to be featured as a guest on Making Data Simple? Reach out to us at [email protected] and tell us why you should be next. The Making Data Simple Podcast is hosted by Al Martin, WW VP Technical Sales, IBM, where we explore trending technologies, business innovation, and leadership ... while keeping it simple & fun.

IBM Spectrum Scale Security

Storage systems must provide reliable and convenient data access to all authorized users while simultaneously preventing threats coming from outside or even inside the enterprise. Security threats come in many forms: unauthorized access to data, data tampering, denial of service, and obtaining privileged access to systems. According to the Storage Network Industry Association (SNIA), data security in the context of storage systems is responsible for safeguarding the data against theft, prevention of unauthorized disclosure of data, prevention of data tampering, and accidental corruption. This process ensures accountability, authenticity, business continuity, and regulatory compliance. Security for storage systems can be classified as follows:
Data storage (data at rest, which includes data durability and immutability)
Access to data
Movement of data (data in flight)
Management of data

IBM® Spectrum Scale is a software-defined storage system for high performance, large-scale workloads on-premises or in the cloud. IBM Spectrum™ Scale addresses all four aspects of security by securing data at rest (protecting data at rest with snapshots, backups, and immutability features) and securing data in flight (providing secure management of data, and secure access to data by using authentication and authorization across multiple supported access protocols). These protocols include POSIX, NFS, SMB, Hadoop, and Object (REST). For automated data management, it is equipped with powerful information lifecycle management (ILM) tools that can help administer unstructured data by providing the correct security for the correct data.
This IBM Redpaper™ publication details the various aspects of security in IBM Spectrum Scale™, including the following items:
Security of data in transit
Security of data at rest
Authentication
Authorization
Hadoop security
Immutability
Secure administration
Audit logging
Security for transparent cloud tiering (TCT)
Security for OpenStack drivers

Unless stated otherwise, the functions that are mentioned in this paper are available in IBM Spectrum Scale V4.2.1 or later releases.

IBM FlashSystem V9000 AC3 with Flash Enclosure Model AE3 Product Guide

This IBM Redbooks® Product Guide describes IBM FlashSystem® V9000, which is a comprehensive all-flash enterprise storage solution that delivers the full capabilities of IBM FlashCore® technology. In addition, it provides a rich set of software-defined storage features, including IBM Real-time Compression™, data reduction, dynamic tiering, thin provisioning, snapshots, cloning, replication, data copy services, and IBM HyperSwap® for high availability. Scale-out and scale-up configurations can now add a hot spare node to further enhance availability.

With the release of FlashSystem V9000 Software V8.1, extra functions and features are available, including support for the new and more powerful FlashSystem V9000 storage enclosure Model AE3. Software features added include GUI enhancements, a new dashboard, support assistance, and data deduplication. AE3 capacities include Small (3.6 TB), Medium (8.5 TB), and Large (18 TB) IBM MicroLatency® modules for between 14.4 TB and 180 TB usable capacity (TBu), with inline hardware compression increasing the capacity up to 219 TB effective capacity (TBe). New SAS-based small form factor (SFF) and large form factor (LFF) expansion enclosures provide a mixture of nearline hard disk drives (HDDs) and flash MDisks in a pool that can be used for IBM Easy Tier®. The new IBM FlashSystem V9000 SFF expansion enclosure Model 92F offers new tiering options with low-cost solid-state drives (SSD flash drives) and nearline HDDs. Up to 784 SAS expansion drives are supported per FlashSystem V9000 controller pair (node pair), providing up to 480 drives with expansion Model 24F and up to 240 drives with expansion Model 12F. FlashSystem V9000 Software version 8.1 replaces version 7.8, and is available to all IBM FlashSystem V9000 customers with current warranty or software maintenance agreements.

SAP HANA and ESS: A Winning Combination

SAP HANA on IBM® POWER® is an established HANA solution with which customers can run HANA-based analytic and business applications on a flexible IBM Power based infrastructure. IT assets, such as servers, storage, skills, and operation procedures, can easily be used and reused instead of enforcing more investment into dedicated SAP HANA-only appliances. In this scenario, IBM Spectrum™ Scale as the underlying block storage and file system adds further benefits to this solution stack to take advantage of scale effects, higher availability, simplification, and performance. With the IBM Elastic Storage™ Server (ESS) based on IBM Spectrum Scale™, RAID capabilities are added to the file system. By using the intelligent internal logic of the IBM Spectrum Scale RAID code, reasonable performance and significant disk failure recovery improvements are achieved.

This IBM Redpaper™ publication focuses on the benefits and advantages of implementing a HANA solution on top of the IBM Spectrum Scale storage file system. This paper is intended to help experienced administrators and IT specialists plan and set up an IBM Spectrum Scale cluster and configure an ESS for SAP HANA workloads. It provides important tips and preferred practices about how to manage IBM Spectrum Scale's availability and performance. If you are familiar with ESS, IBM Spectrum Scale, and IBM Spectrum Scale RAID, and you need only the pertinent documentation about how to configure an IBM Spectrum Scale cluster with an ESS for SAP HANA, see Chapter 5, "IBM Spectrum Scale customization for HANA" on page 25. Before reading this IBM Redpaper publication, you should be familiar with the basic concepts of IBM Spectrum Scale and IBM Spectrum Scale RAID. This IBM Redpaper publication can be helpful for architects and specialists who are planning an SAP HANA on POWER deployment with the IBM Spectrum Scale file system.
For more information about planning considerations for Power, see the SAP HANA on Power Planning Guide.

Getting Started with IBM zHyperLink for z/OS

With the pressures to drive transaction processing 24/7 because of online banking and other business demands, IBM® zHyperLink on the IBM DS8880 is making it easy to accelerate transaction processing for the mainframe. This IBM Redpaper™ publication helps you to understand the concepts, business perspectives, and reference architecture of installing, tailoring, and configuring zHyperLink in your own environment.

In this podcast @DanDeGrazia from @IBM spoke with @Vishaltx from @AnalyticsWeek to discuss where the chief data scientist role meets open source. He sheds light on some of the big opportunities in open source and how businesses could work together to achieve progress in data science. Dan also shared the importance of smooth communication for success as a data scientist.

TIMELINE:
0:29 Dan's journey.
9:40 Dan's role in IBM.
11:26 Tips on staying consistent while creating a database.
16:23 Chief data scientist and open source put together.
20:28 The state of open source when it comes to data.
23:50 Evaluating the market to understand business requirements.
29:19 Future of data and the open-source market.
33:23 Exciting opportunities in data.
37:06 Data scientist's role in integrating business and data.
49:41 Ingredients of a successful data scientist.
53:04 Data science and trust issues.
59:35 Human element behind data.
1:01:20 Dan's success mantra.
1:06:52 Key takeaways.

Dan's Recommended Reads:
The Five Temptations of a CEO, Anniversary Edition: A Leadership Fable by Patrick Lencioni https://amzn.to/2Jcm5do
What Every BODY is Saying: An Ex-FBI Agent's Guide to Speed-Reading People by Joe Navarro, Marvin Karlins https://amzn.to/2J1RXxO

Podcast Link: https://futureofdata.org/where-chief-data-scientist-open-source-meets-dandegrazia-futureofdata-podcast/

Dan's BIO: Dan has almost 30 years of experience working with large data sets. Starting with the unusual work of analyzing potential jury pools in the 1980s, Dan also did some of the first PC-based voter registration analytics in the Chicago area, including putting the first complete list of registered voters on a PC (as hard as that is to imagine today, a 50-megabyte hard drive on DOS systems was staggering). Interested in almost anything new and technical, he worked at The Chicago Board of Trade. He taught himself BASIC to write algorithms while working as an arbitrager in financial futures. After the military, Dan moved to San Francisco. He worked with several small companies and startups designing and implementing some of the first PC-based fax systems (who cares now!), enterprise accounting software, and early middleware connections using the early 3GL/4GL languages. Always pursuing the technical edge cases, Dan worked for InfoBright, a column-store database startup, in the US and EMEA; at Lingotek, an In-Q-Tel funded company working in large data set translations; and at big data analytics companies like Datameer, before his current position as Chief Data Scientist for Open Source in the IBM Channels organization. Dan's current just-for-fun project is working to create an app that will record and analyze bird songs and provide the user with information on the bird and the specifics of the current song.

About #Podcast:

FutureOfData podcast is a conversation starter to bring leaders, influencers, and lead practitioners to discuss their journey to create the data-driven future.

Want to sponsor? Email us @ [email protected]

Keywords:

#FutureOfData #DataAnalytics #Leadership #Podcast #BigData #Strategy

Introduction to IBM Common Data Provider for z Systems

IBM Common Data Provider for z Systems collects, filters, and formats IT operational data in near real-time and provides that data to target analytics solutions. IBM Common Data Provider for z Systems enables authorized IT operations teams using a single web-based interface to specify the IT operational data to be gathered and how it needs to be handled. This data is provided to both on- and off-platform analytic solutions, in a consistent, consumable format for analysis. This Redpaper discusses the value of IBM Common Data Provider for z Systems, provides a high-level reference architecture for IBM Common Data Provider for z Systems, and introduces key components of the architecture. It shows how IBM Common Data Provider for z Systems provides operational data to various analytic solutions. The publication provides high-level integration guidance, preferred practices, tips on planning for IBM Common Data Provider for z Systems, and example integration scenarios.
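The collect, filter, and format flow described above can be sketched as a tiny pipeline (a minimal illustration with hypothetical record shapes and function names, not the product's actual interface):

```python
def collect(records):
    # Stand-in for a gatherer that reads operational records from a source.
    yield from records

def filter_records(records, wanted_type):
    # Keep only the operational data the team asked for.
    return (r for r in records if r["type"] == wanted_type)

def format_records(records):
    # Normalize into the consistent, consumable shape a consumer expects.
    return [{"type": r["type"], "msg": r["msg"].strip()} for r in records]

# Raw operational records flow through collect -> filter -> format
# before being handed to an analytics target.
raw = [{"type": "syslog", "msg": " disk full "},
       {"type": "smf", "msg": "job ended"}]
out = format_records(filter_records(collect(raw), "syslog"))
print(out)  # [{'type': 'syslog', 'msg': 'disk full'}]
```

The staged design matters because each stage can be configured independently: what to gather, what to keep, and what shape the downstream analytics solution consumes.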

IBM Software-Defined Storage Guide

Today, new business models in the marketplace coexist with traditional ones and their well-established IT architectures. They generate new business needs and new IT requirements that can only be satisfied by new service models and new technological approaches. These changes are reshaping traditional IT concepts. Cloud in its three main variants (Public, Hybrid, and Private) represents the major and most viable answer to those IT requirements, and software-defined infrastructure (SDI) is its major technological enabler. IBM® technology, with its rich and complete set of storage hardware and software products, supports SDI both in an open standard framework and in other vendors' environments. With their extensive knowledge of the topic and the experience gained in partnership with clients, IBM services can deliver solutions to customers.

This IBM Redpaper™ publication focuses on software-defined storage (SDS) and IBM Storage Systems product offerings for software-defined environments (SDEs). It also provides use case examples across various industries that cover different client needs, proposed solutions, and results. This paper can help you to understand current organizational capabilities and challenges, and to identify specific business objectives to be achieved by implementing an SDS solution in your enterprise.

In this episode, Wayne Eckerson and Jen Underwood explore a new era of analytics. Data volumes and complexity have exceeded the limits of current manual drag-and-drop analytics solutions. Data moves at the speed of light while speed-to-insight lags farther and farther behind. It is time to explore intelligent, next generation, machine-powered analytics to retain your competitive edge. It is time to combine the best of the human mind and machine.

Underwood is an analytics expert and founder of Impact Analytic. She is a former product manager at Microsoft who spearheaded the design and development of the reinvigorated version of Power BI, which has since become a market-leading BI tool. Underwood is an IBM Analytics Insider, SAS contributor, former Tableau Zen Master, Top 10 Women Influencer, and active analytics community member. She is keenly interested in the intersection of data visualization and data science and writes and speaks persuasively about these topics.

IBM Db2 11.1 Certification Guide

Delve into the IBM Db2 11.1 Certification Guide to comprehensively prepare for the IBM C2090-600 exam and master database programming and administration tasks in Db2 environments. Across its insightful chapters, this guide provides practical steps, expert guidance, and over 150 practice questions aimed at ensuring your success.

What this Book will help me do
Master Db2 server management, including configuration and maintenance tasks, to ensure optimized performance.
Implement advanced features such as BLU Acceleration and Db2 pureScale to enhance database functionality.
Gain expertise in security protocols, including data encryption and integrity enforcement, for secure database environments.
Troubleshoot common Db2 issues using advanced diagnostic tools like db2pd and dsmtop, improving efficiency and uptime.
Develop skills in creating and altering database objects, enabling robust database design and management.

Author(s)
The authors, Collins and Saraswatipura, are seasoned database professionals with vast experience in administering and optimizing Db2 environments. Their expertise in guiding students and professionals shines through in the accessible language and practical approach of the book. They bring a blend of theoretical and hands-on insights to ensure learners not only understand but also apply the knowledge effectively.

Who is it for?
This book is ideal for database administrators, architects, and application developers who are pursuing certification in Db2. It caters to readers with a basic understanding of Db2 who are seeking to advance their skills. Whether you're aiming for professional growth or practical expertise, this guide serves your goals by covering certification essentials while enriching your practical knowledge.

Hortonworks Data Platform with IBM Spectrum Scale: Reference Guide for Building an Integrated Solution

This IBM® Redpaper™ publication provides guidance on building an enterprise-grade data lake by using IBM Spectrum™ Scale and Hortonworks Data Platform for performing in-place Hadoop or Spark-based analytics. It covers the benefits of the integrated solution and gives guidance about the types of deployment models and considerations during the implementation of these models. Hortonworks Data Platform (HDP) is a leading Hadoop and Spark distribution. HDP addresses the complete needs of data-at-rest, powers real-time customer applications, and delivers robust analytics that accelerate decision making and innovation. IBM Spectrum Scale™ is flexible and scalable software-defined file storage for analytics workloads. Enterprises around the globe have deployed IBM Spectrum Scale to form large data lakes and content repositories to perform high-performance computing (HPC) and analytics workloads. It can scale both performance and capacity without bottlenecks.

Security on IBM z/VSE

Abstract One of a firm’s most valuable resources is its data: client lists, accounting data, employee information, and so on. This critical data must be securely managed and controlled, and simultaneously made available to those users authorized to see it. The IBM® z/VSE® system features extensive capabilities to simultaneously share the firm’s data among multiple users and protect it. Threats to this data come from various sources. Insider threats and malicious hackers are not only difficult to detect and prevent, they might be using resources without the business being aware. This IBM Redbooks® publication was written to assist z/VSE support and security personnel in providing the enterprise with a safe, secure, and manageable environment. This book provides an overview of the security that is provided by z/VSE and the processes for the implementation and configuration of z/VSE security components, including the Basic Security Manager (BSM), IBM CICS® security, TCP/IP security, single sign-on using LDAP, and connector security.