talk-data.com

Topic: data-engineering (3395 tagged)

Activity Trend: 2020-Q1 to 2026-Q1 (1 peak/qtr)

Activities

3395 activities · Newest first

Database-Driven Web Development: Learn to Operate at a Professional Level with PERL and MySQL

This book will teach you the essential knowledge required to be a successful and productive web developer with the ability to produce cutting-edge websites utilizing a database. This updated edition starts with the fundamentals of web development before delving into Perl and MySQL concepts such as script and database modelling, script-driven database interactions, content generation from a database, and information delivery from the server to the browser and vice versa. The only skills required to get the most from this book are basic knowledge of how the Internet works and a novice skill level with Perl and MySQL. The rest is intuitively presented code that most people can quickly and easily understand and employ. An extensive selection of practical, fully functional programming constructs in six different programming languages will give you the knowledge and tools required to create eye-catching, capable, and functionally impressive database-driven websites. Author Thomas Valentine has taken the concepts presented in the first edition of this book to new heights, offering in-depth discussions of each area of functionality required to develop fully formed database-driven web applications. He has expanded on the examples presented in the first edition and has included some very interesting and useful programming techniques for your consideration. Upon completing this book, you’ll have gained the benefit of the author’s decades' worth of experience and will be able to apply your new knowledge and skills to your own projects. What You Will Learn Install, configure, and use a trio of software packages (Apache Web Server, MySQL Database Server, and Perl Scripting Server) Create an effective web development workstation with databases in mind Use the PERL scripting language and MySQL databases effectively Maximize the Apache Web Server Who This Book Is For Those who already know web development basics and web developers who want to master database-driven web development.
The skills required to understand the concepts put forth in this book are a working knowledge of PERL and basic MySQL.

The Unrealized Opportunities with Real-Time Data

The amount of data generated from various processes and platforms has increased exponentially in the past decade, and the challenge of filtering useful data out of streams of raw data has become even greater. Meanwhile, deriving useful insights from that data has become even more important. In this incisive report, Federico Castanedo examines the challenges companies face when acting on data at rest as well as the benefits you unlock when acting on data as it's generated. Data engineers, enterprise architects, CTOs, and CIOs will explore the tools, processes, and mindset your company needs to process streaming data in real time. Learn how to make quick data-driven decisions to gain an edge on competitors. This report helps you: Explore gaps in today's real-time data architectures, including the limitations of real-time analytics to act on data immediately Examine use cases that can't be served efficiently with real-time analytics Understand how stream processing engines work with real-time data Learn how distributed data processing architectures, stream processing, streaming analytics, and event-based architectures relate to real-time data Understand how to transition from traditional batch processing environments to stream processing Federico Castanedo is an academic director and adjunct professor at IE University in Spain. A data science and AI leader, he has extensive experience in academia, industry, and startups.

Learning and Operating Presto

The Presto community has mushroomed since its origins at Facebook in 2012. But ramping up this open source distributed SQL query engine can be challenging even for the most experienced engineers. With this practical book, data engineers and architects, platform engineers, cloud engineers, and software engineers will learn how to use Presto operations at your organization to derive insights on datasets wherever they reside. Authors Angelica Lo Duca, Tim Meehan, Vivek Bharathan, and Ying Su explain what Presto is, where it came from, and how it differs from other data warehousing solutions. You'll discover why Facebook, Uber, Alibaba Cloud, Hewlett Packard Enterprise, IBM, Intel, and many more use Presto and how you can quickly deploy Presto in production. With this book, you will: Learn how to install and configure Presto Use Presto with business intelligence tools Understand how to connect Presto to a variety of data sources Extend Presto for real-time business insight Learn how to apply best practices and tuning Get troubleshooting tips for logs, error messages, and more Explore Presto's architectural concepts and usage patterns Understand Presto security and administration

Kafka Connect

Used by more than 80% of Fortune 100 companies, Apache Kafka has become the de facto event streaming platform. Kafka Connect is a key component of Kafka that lets you flow data between your existing systems and Kafka to process data in real time. With this practical guide, authors Mickael Maison and Kate Stanley show data engineers, site reliability engineers, and application developers how to build data pipelines between Kafka clusters and a variety of data sources and sinks. Kafka Connect allows you to quickly adopt Kafka by tapping into existing data and enabling many advanced use cases. No matter where you are in your event streaming journey, Kafka Connect is the ideal tool for building a modern data pipeline. Learn Kafka Connect's capabilities, main concepts, and terminology Design data and event streaming pipelines that use Kafka Connect Configure and operate Kafka Connect environments at scale Deploy secured and highly available Kafka Connect clusters Build sink and source connectors and single message transforms and converters

Building Real-Time Analytics Systems

Gain deep insight into real-time analytics, including the features of these systems and the problems they solve. With this practical book, data engineers at organizations that use event-processing systems such as Kafka, Google Pub/Sub, and AWS Kinesis will learn how to analyze data streams in real time. The faster you derive insights, the quicker you can spot changes in your business and act accordingly. Author Mark Needham from StarTree provides an overview of the real-time analytics space and an understanding of what goes into building real-time applications. The book's second part offers a series of hands-on tutorials that show you how to combine multiple software products to build real-time analytics applications for an imaginary pizza delivery service. You will: Learn common architectures for real-time analytics Discover how event processing differs from real-time analytics Ingest event data from Apache Kafka into Apache Pinot Combine event streams with OLTP data using Debezium and Kafka Streams Write real-time queries against event data stored in Apache Pinot Build a real-time dashboard and order tracking app Learn how Uber, Stripe, and Just Eat use real-time analytics

Practical MongoDB Aggregations

Practical MongoDB Aggregations serves as the definitive guide to mastering aggregation pipelines within MongoDB 7.0. Officially endorsed by MongoDB, Inc., this book provides streamlined strategies and practical examples to help you achieve complex data manipulation and analytical tasks, ultimately enhancing your database operation proficiency. What this Book will help me do Understand the architecture of the MongoDB aggregation framework to build scalable pipelines. Design and implement optimized aggregation pipelines for high performance. Learn practical techniques for processing large datasets efficiently using sharding. Apply data processing directly within MongoDB to minimize external workflows. Master handling arrays and securing data through well-designed pipelines. Author(s) Paul Done is an experienced software engineer with in-depth expertise in MongoDB and database systems. With years of professional experience managing and optimizing databases, Paul draws from real-world scenarios to devise effective strategies for learning MongoDB's advanced features. His approachable and instructional writing style empowers developers, engineers, and analysts to reach their full potential. Who is it for? This book is perfect for developers, database architects, and data engineers who have a foundational understanding of MongoDB and are looking to deepen their practical skills in using aggregation pipelines. Professionals who want to perform efficient data processing and gain insights into MongoDB's advanced features will find this guide invaluable. If you wish to streamline analytical tasks, optimize performance, and work efficiently with MongoDB's latest functionalities, this book is tailored for you.

Leveling Up with SQL: Advanced Techniques for Transforming Data into Insights

Learn to write SQL queries to select and analyze data, and improve your ability to manipulate data. This book will help you take your existing skills to the next level. Author Mark Simon kicks things off with a quick review of basic SQL knowledge, followed by a demonstration of how efficient SQL databases are designed and how to extract just the right data from them. You’ll then learn about each individual table’s structure and how to work with the relationships between tables. As you progress through the book, you will learn more sophisticated techniques such as using common table expressions and subqueries, analyzing your data using aggregate and windowing functions, and how to save queries in the form of views and other methods. This book employs an accessible approach to work through a realistic sample, enabling you to learn concepts as they arise to improve parts of the database or to work with the data itself. After completing this book, you will have a more thorough understanding of database structure and how to use advanced techniques to extract, manage, and analyze data. What You Will Learn Gain a stronger understanding of database design principles, especially individual tables Understand the relationships between tables Utilize techniques such as views, subqueries, common table expressions, and windowing functions Who This Book Is For SQL database users who want to improve their knowledge and techniques.
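
As a rough illustration of two of the techniques named in this description, a common table expression and a windowing function, here is a minimal sketch. It is not taken from the book: the `sales` table, its columns, and the sample rows are hypothetical, and it assumes Python's built-in `sqlite3` module backed by SQLite 3.25 or later, which is the version that added window functions.

```python
import sqlite3

# Hypothetical sample data; the table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 20), ("bob", 5), ("cy", 50)],
)

query = """
WITH totals AS (                      -- common table expression
    SELECT customer, SUM(amount) AS total
    FROM sales
    GROUP BY customer
)
SELECT customer,
       total,
       RANK() OVER (ORDER BY total DESC) AS rnk   -- windowing function
FROM totals
ORDER BY rnk, customer
"""

# Each row is (customer, total, rank by total, highest first).
rows = conn.execute(query).fetchall()
print(rows)
```

The CTE names the per-customer totals once so the outer query can rank them without repeating the aggregation; the same result could be written with a subquery in the FROM clause.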

IBM Storage as a Service Offering Guide

IBM® Storage as a Service (STaaS) extends your hybrid cloud experience with a new flexible consumption model enabled for both your on-premises and hybrid cloud infrastructure needs, giving you the agility, cash flow efficiency, and services of cloud storage with the flexibility to dynamically scale up or down and only pay for what you use beyond the minimal capacity. This IBM Redpaper provides a detailed introduction to the IBM STaaS service. The paper is targeted for data center managers and storage administrators.

IBM Power E1050: Technical Overview and Introduction

This IBM® Redpaper publication is a comprehensive guide that covers the IBM Power E1050 server (9043-MRX) that uses the latest IBM Power10 processor-based technology and supports IBM AIX® and Linux operating systems (OSs). The goal of this paper is to provide a hardware architecture analysis and highlight the changes, new technologies, and major features that are being introduced in this system, such as: The latest IBM Power10 processor design, including the dual-chip module (DCM) packaging, which is available in various configurations from 12 - 24 cores per socket. Support of up to 16 TB of memory. Native Peripheral Component Interconnect Express (PCIe) 5th generation (Gen5) connectivity from the processor socket to deliver higher performance and bandwidth for connected adapters. Open Memory Interface (OMI) connected Differential Dual Inline Memory Module (DDIMM) memory cards delivering increased performance, resiliency, and security over industry-standard memory technologies, including transparent memory encryption. Enhanced internal storage performance with the use of native PCIe-connected Non-volatile Memory Express (NVMe) devices in up to 10 internal storage slots to deliver up to 64 TB of high-performance, low-latency storage in a single 4-socket system. Consumption-based pricing in the Power Private Cloud with Shared Utility Capacity commercial model to allow customers to consume resources more flexibly and efficiently, including AIX, Red Hat Enterprise Linux (RHEL), SUSE Linux Enterprise Server, and Red Hat OpenShift Container Platform workloads. This publication is for professionals who want to acquire a better understanding of IBM Power products. 
The intended audience includes: IBM Power customers Sales and marketing professionals Technical support professionals IBM Business Partners Independent software vendors (ISVs) This paper expands the set of IBM Power documentation by providing a desktop reference that offers a detailed technical description of the Power E1050 Midrange server model. This paper does not replace the current marketing materials and configuration tools. It is intended as an extra source of information that, together with existing sources, can be used to enhance your knowledge of IBM server solutions.

IBM Power E1080 Technical Overview and Introduction

This IBM® Redpaper® publication provides a broad understanding of a new architecture of the IBM Power® E1080 (also known as the Power E1080) server that supports IBM AIX®, IBM i, and selected distributions of Linux operating systems. The objective of this paper is to introduce the Power E1080, the most powerful and scalable server of the IBM Power portfolio, and its offerings and relevant functions: Designed to support up to four system nodes and up to 240 IBM Power10™ processor cores The Power E1080 can be initially ordered with a single system node or two system nodes configuration, which provides up to 60 Power10 processor cores with a single node configuration or up to 120 Power10 processor cores with a two system nodes configuration. More support for a three or four system nodes configuration is to be added on December 10, 2021, which provides support for up to 240 Power10 processor cores with a full combined four system nodes server. Designed to support up to 64 TB memory The Power E1080 can be initially ordered with the total memory RAM capacity up to 8 TB. More support is to be added on December 10, 2021 to support up to 64 TB in a full combined four system nodes server. Designed to support up to 32 Peripheral Component Interconnect® (PCIe) Gen 5 slots in a full combined four system nodes server and up to 192 PCIe Gen 3 slots with expansion I/O drawers The Power E1080 initially supports a maximum of two system nodes; therefore, up to 16 PCIe Gen 5 slots, and up to 96 PCIe Gen 3 slots with expansion I/O drawer. More support is to be added on December 10, 2021, to support up to 192 PCIe Gen 3 slots with expansion I/O drawers.
Up to over 4,000 directly attached serial-attached SCSI (SAS) disks or solid-state drives (SSDs) Up to 1,000 virtual machines (VMs) with logical partitions (LPARs) per system System control unit, providing redundant system master Flexible Service Processor (FSP) Supports IBM Power System Private Cloud Solution with Dynamic Capacity This publication is for professionals who want to acquire a better understanding of Power servers. The intended audience includes the following roles: Customers Sales and marketing professionals Technical support professionals IBM Business Partners Independent software vendors (ISVs) This paper does not replace the current marketing materials and configuration tools. It is intended as an extra source of information that, together with existing sources, can be used to enhance your knowledge of IBM server solutions.

Serverless Machine Learning with Amazon Redshift ML

Serverless Machine Learning with Amazon Redshift ML provides a hands-on guide to using Amazon Redshift Serverless and Redshift ML for building and deploying machine learning models. Through SQL-focused examples and practical walkthroughs, you will learn efficient techniques for cloud data analytics and serverless machine learning. What this Book will help me do Grasp the workflow of building machine learning models with Redshift ML using SQL. Learn to handle supervised learning tasks like classification and regression. Apply unsupervised learning techniques, such as K-means clustering, in Redshift ML. Develop time-series forecasting models within Amazon Redshift. Understand how to operationalize machine learning in serverless cloud architecture. Author(s) Debu Panda, Phil Bates, Bhanu Pittampally, and Sumeet Joshi are seasoned professionals in cloud computing and machine learning technologies. They combine deep technical knowledge with teaching expertise to guide learners through mastering Amazon Redshift ML. Their collaborative approach ensures that the content is accessible, engaging, and practically applicable. Who is it for? This book is perfect for data scientists, machine learning engineers, and database administrators using or intending to use Amazon Redshift. It's tailored for professionals with basic knowledge of machine learning and SQL who aim to enhance their efficiency and specialize in serverless machine learning within cloud architectures.

Building a Fast Universal Data Access Platform

Your company relies on data to succeed—data that traditionally comes from a business's transactional processes, pulled from the transaction systems through an extract-transform-load (ETL) process into a warehouse for reporting purposes. But this data flow is no longer sufficient given the growth of the internet of things (IOT), web commerce, and cybersecurity. How can your company keep up with today's increasing magnitude of data and insights? Organizations that can no longer rely on data generated by business processes are looking outside their workflow for information on customer behavior, retail patterns, and industry trends. In this report, author Christopher Gardner examines the challenges of building a framework that provides universal access to data. You will: Learn the advantages and challenges of universal data access, including data diversity, data volume, and the speed of analytic operations Discover how to build a framework for data diversity and universal access Learn common methods for improving database and performance SLAs Examine the organizational requirements that a fast universal data access platform must meet Explore a case study that demonstrates how components work together to form a multiaccess, high-volume, high-performance interface About the author: Christopher Gardner is the campus Tableau application administrator at the University of Michigan, controlling security, updates, and performance maintenance.

High-Performance Data Architectures

By choosing the right database, you can maximize your business potential, improve performance, increase efficiency, and gain a competitive edge. This insightful report examines the benefits of using a simplified data architecture containing cloud-based HTAP (hybrid transactional and analytical processing) database capabilities. You'll learn how this data architecture can help data engineers and data decision makers focus on what matters most: growing your business. Authors Joe McKendrick and Ed Huang explain how cloud native infrastructure supports enterprise businesses and operations with a much more agile foundation. Just one layer up from the infrastructure, cloud-based databases are a crucial part of data management and analytics. Learn how distributed SQL databases containing HTAP capabilities provide more efficient and streamlined data processing to improve cost efficiency and expedite business operations and decision making. This report helps you: Explore industry trends in database development Learn the benefits of a simplified data architecture Comb through the complex and crowded database choices on the market Examine the process of selecting the right database for your business Learn the latest database innovations for improving your company's efficiency and performance

Introduction to Integration Suite Capabilities: Learn SAP API Management, Open Connectors, Integration Advisor and Trading Partner Management

Discover the power of SAP Integration Suite's capabilities with this hands-on guide. Learn how this integration platform (iPaaS) can help you connect and automate your business processes with integrations, connectors, APIs, and best practices for a faster ROI. Over the course of this book, you will explore the powerful capabilities of SAP Integration Suite, including API Management, Open Connectors, Integration Advisor, Trading Partner Management, Migration Assessment, and Integration Assessment. With detailed explanations and real-world examples, this book is the perfect resource for anyone looking to unlock the full potential of SAP Integration Suite. With each chapter, you'll gain a greater understanding of why SAP Integration Suite can be the proverbial Swiss Army knife in your toolkit to design and develop enterprise integration scenarios, offering simplified integration, security, and governance for your applications. Author Jaspreet Bagga demonstrates how to create, publish, and monitor APIs with SAP API Management, and how to use its features to enhance your API lifecycle. He also provides a detailed walkthrough of how other capabilities of SAP Integration Suite can streamline your connectivity, design, development, and architecture methodology with a tool-based approach completely managed by SAP. Whether you are a developer, an architect, or a business user, this book will help you unlock the potential of SAP's Integration Suite platform, API Management, and accelerate your digital transformation.
What You Will Learn Understand what APIs are, what they are used for, and why they are crucial for building effective and reliable applications Gain an understanding of SAP Integration Suite's features and benefits Study SAP Integration assessment process, patterns, and much more Explore tools and capabilities other than the Cloud Integration that address the full value chain of the enterprise integration components Who This Book Is For Web developers and application leads who want to learn SAP API Management.

Oracle Global Data Services for Mission-critical Systems: Maximizing Performance and Reliability in Complex Enterprise Environments

New to Oracle Global Data Services? You’ve come to the right place. This book will show you how to leverage the power of Oracle GDS to ensure runtime load balancing, region affinity, replication lag tolerance-based workload routing, and inter-database service failover. In particular, you will see how to maximize the utilization of replication investments with Oracle GDS. The book starts by guiding you through the installation and configuration of GDS and provides details for each component in the GDS framework. Next, you’ll learn how to configure various components of Oracle GDS in standalone environments. Hands-on exercises that explore the advantages of GDS with different test cases utilizing Active Data Guard (ADG), Oracle GoldenGate (OGG), and Oracle Real Application Clusters (RAC) will help you put your learning in context. The book concludes with a demonstration of how to add Oracle GDS to OEM for monitoring and troubleshooting. You’ll also see how to monitor Oracle GDS in a centralized location using Oracle Enterprise Manager Cloud Control. After completing this book, you will understand the architecture, components, and implementation strategies of GDS using ADG and OGG in mission-critical environments. What You Will Learn Understand Oracle Global Data Services architecture and its various components Install and configure Oracle Global Data Services Use Global Data Services with Active Data Guard and Oracle GoldenGate Monitor Global Data Services using Oracle Enterprise Manager Cloud Control Troubleshoot issues in Global Data Services Who This Book Is For Oracle database administrators, Oracle database architects, Oracle technical managers, Oracle application business analysts, and Oracle data engineers.

Graph-Powered Analytics and Machine Learning with TigerGraph

With the rapid rise of graph databases, organizations are now implementing advanced analytics and machine learning solutions to help drive business outcomes. This practical guide shows data scientists, data engineers, architects, and business analysts how to get started with a graph database using TigerGraph, one of the leading graph database models available. You'll explore a three-stage approach to deriving value from connected data: connect, analyze, and learn. Victor Lee, Phuc Kien Nguyen, and Alexander Thomas present real use cases covering several contemporary business needs. By diving into hands-on exercises using TigerGraph Cloud, you'll quickly become proficient at designing and managing advanced analytics and machine learning solutions for your organization. Use graph thinking to connect, analyze, and learn from data for advanced analytics and machine learning Learn how graph analytics and machine learning can deliver key business insights and outcomes Use five core categories of graph algorithms to drive advanced analytics and machine learning Deliver a real-time 360-degree view of core business entities, including customer, product, service, supplier, and citizen Discover insights from connected data through machine learning and advanced analytics

Managing Electronic Health Records with Epic, IBM Storage FlashSystem, and IBM Storage Sentinel

The information in this blueprint is intended to facilitate the deployment of IBM® Storage FlashSystem® for the Epic Corporation electronic health record (EHR) solution. It describes the requirements and specifications for configuring IBM Storage FlashSystem and its parameters. To complete these tasks, you must have a working knowledge of IBM Storage FlashSystem and Epic applications. Also, this document describes the steps that are required to configure IBM Storage Sentinel for cyber resiliency. To complete these tasks, you must have a working knowledge of IBM Storage Copy Data Manager.

A Practical Guide to SAP Integration Suite: SAP’s Cloud Middleware and Integration Solution

This book covers the basics of SAP’s Integration Suite, including a broad overview of its capabilities, installation, and real-life examples to illustrate how it can be used to integrate, develop, administer, and monitor applications in the cloud. As you progress through the book, you will see how SAP Integration Suite works as an open, enterprise-grade platform that is a fully vendor-managed, multi-cloud offering that will help you expedite your SAP and third-party integration scenarios. The entire value chain is explored in detail, including usage of APIs and runtime control. Author Jaspreet Bagga demonstrates how SAP’s prebuilt integration packages facilitate quicker, more comprehensive integrations, and how they support a variety of integration patterns. You’ll learn how to leverage the platform to enable seamless cloud and on-premises applications connectivity, develop custom scenarios, mix master data, blend business-to-business (B2B) and electronic data interchange (EDI) processes, including trading partner management. Also covered are business-to-government (B2G) scenarios, orchestrating data and pipelines, and mixing event-driven integration. Upon completing this book, you will have a thorough understanding of why SAP Integration Suite is the middleware of SAP’s integration strategy, and be able to effectively use it in your own integration scenarios. What You Will Learn Understand SAP Integration Suite and its core capabilities Know how integration technologies, such as architecture and supplementary intelligent technologies, work within the SAP Integration Suite Discover services for pre-packaged accelerators: SAP API Management, the Integration Advisor, and the SAP API Business Hub Utilize integration features to link your on-premises or cloud-based systems Understand the capabilities of the newly released Migration Assessment Who This Book Is For Web developers and application leads who want to learn SAP Integration Suite.